Apply retention policies failing

We have six months’ worth of packages which are not being cleared down, as the ‘Apply retention policies’ task is failing with a C# exception:

Object reference not set to an instance of an object.
System.NullReferenceException
at Octopus.Core.Packages.Retention.ReleasePackageReferenceEvaluator.<>c.<GetPackageVersionsUsedInReleases>b__0_1(<>f__AnonymousType613 pair) in ReleasePackageReferenceEvaluator.cs:line 35
at System.Linq.Enumerable.SelectManyIterator[TSource,TCollection,TResult](IEnumerable1 source, Func2 collectionSelector, Func3 resultSelector)+MoveNext()
at System.Linq.Enumerable.WhereSelectEnumerableIterator2.MoveNext()
at System.Linq.Enumerable.WhereSelectEnumerableIterator2.MoveNext()
at System.Linq.Enumerable.WhereSelectEnumerableIterator2.MoveNext()
at System.Linq.Enumerable.WhereSelectEnumerableIterator2.ToList()
at Octopus.Core.Packages.Retention.ReleasePackageReferenceEvaluator.GetPackageVersionsUsedInReleases(IOctopusQueryExecutor queryExecutor, SpaceId spaceId, ITaskLog taskLog, IDocumentStore2 projectsDocumentStore) in ReleasePackageReferenceEvaluator.cs:line 32
at Octopus.Server.Orchestration.ServerTasks.ApplyRetentionPolicies.ApplyRetentionPoliciesTaskController.<>c__DisplayClass12_0.<Execute in ApplyRetentionPoliciesTaskController.cs:line 92
at Octopus.Server.Infrastructure.Orchestration.UnitsOfWork.UnitOfWorkExecutor.<>c__DisplayClass9_07.<Execute in UnitOfWorkExecutor.cs:line 237
at Octopus.Core.Infrastructure.UnitsOfWork.UnitOfWorkExtensionMethods.DoAsync(IUnitOfWork unitOfWork, Func1 action, CancellationToken cancellationToken, String name) in UnitOfWorkExtensionMethods.cs:line 75
at Octopus.Core.Infrastructure.UnitsOfWork.UnitOfWorkExtensionMethods.DoAsync(IUnitOfWork unitOfWork, Func1 action, CancellationToken cancellationToken, String name) in UnitOfWorkExtensionMethods.cs:line 75
at Octopus.Server.Infrastructure.Orchestration.UnitsOfWork.UnitOfWorkExecutor.Execute[T1,T2,T3,T4,T5,T6,T7](Func9 action, CancellationToken cancellationToken, String name) in UnitOfWorkExecutor.cs:line 240
at Octopus.Server.Orchestration.ServerTasks.ApplyRetentionPolicies.ApplyRetentionPoliciesTaskController.Execute(ITaskLog taskLog, CancellationToken cancellationToken) in ApplyRetentionPoliciesTaskController.cs:line 98
at Octopus.Server.Orchestration.ServerTasks.RunningTask.<>c__DisplayClass31_1.<WorkerTask in RunningTask.cs:line 171
at Octopus.Core.Infrastructure.UnitsOfWork.UnitOfWorkExtensionMethods.DoAsync(IUnitOfWork unitOfWork, Func1 action, CancellationToken cancellationToken, String name) in UnitOfWorkExtensionMethods.cs:line 75
at Octopus.Core.Infrastructure.UnitsOfWork.UnitOfWorkExtensionMethods.DoAsync(IUnitOfWork unitOfWork, Func1 action, CancellationToken cancellationToken, String name) in UnitOfWorkExtensionMethods.cs:line 75
at Octopus.Server.Orchestration.ServerTasks.RunningTask.WorkerTask(CancellationToken cancellationToken) in RunningTask.cs:line 204

We’ve upgraded the server to the latest version, just in case it’s a known bug, but it is still failing.

Any suggestions please?

Thanks

Simon

Hey @simon.george,

Thanks for reaching out, sorry to hear you’re running into issues.

Would you be able to send over the full retention run task log, please?

I can provide a secure link to upload to, in order to avoid posting information publicly.
Feel free to upload any logs or files relating to the issue here: Support Files.

With the full log we can hopefully learn more about the issue and direct you further in troubleshooting efforts.

Kind Regards,
Adam

@adam.hollow I’ve added the log

Hi Simon,

Thanks for following up and uploading the retention log file. I’ve been digging into this one, and I have an educated guess as to what the issue might be. I’ve found references to this identical error occurring in the retention task a while back. It was caused by cloning a project that contained runbooks: on the clone, Octopus incorrectly set the retention policy of the newly cloned runbook to null, and that null then tripped up the retention task in this way.

If this sounds possibly relevant, could you run the following SQL query against your Octopus database?

SELECT [JSON] FROM [dbo].[Runbook]
WHERE [JSON] LIKE '%"RunRetentionPolicy":null%';

If you get anything returned from the above query, you can reset this null value to the default value via the below UPDATE:

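-- Only touch runbooks whose retention policy is null; without the WHERE clause every runbook would be reset
UPDATE [dbo].[Runbook]
SET [JSON] = JSON_MODIFY([JSON], '$.RunRetentionPolicy', JSON_QUERY('{"QuantityToKeep": 100, "ShouldKeepForever": false}'))
WHERE [JSON] LIKE '%"RunRetentionPolicy":null%';
GO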
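To make the intended effect of that update concrete, here is a minimal Python sketch of the same transformation applied to a single runbook JSON document. It is an illustration only, not something to run against the database: the function name and the sample document are hypothetical, and the default policy values are simply the ones from the UPDATE statement above.

```python
import json

# Default policy copied from the UPDATE statement in this thread.
DEFAULT_RUN_RETENTION_POLICY = {"QuantityToKeep": 100, "ShouldKeepForever": False}


def reset_null_run_retention(runbook_json: str) -> str:
    """Return the runbook JSON with a null RunRetentionPolicy replaced by the default.

    Hypothetical helper for illustration; documents whose policy is already
    set, or that lack the key entirely, are returned with the same content.
    """
    doc = json.loads(runbook_json)
    # Only act when the key is present and explicitly null, which is the
    # case the SELECT ... LIKE '%"RunRetentionPolicy":null%' query looks for.
    if "RunRetentionPolicy" in doc and doc["RunRetentionPolicy"] is None:
        doc["RunRetentionPolicy"] = dict(DEFAULT_RUN_RETENTION_POLICY)
    return json.dumps(doc)


# Hypothetical sample document, not taken from a real Octopus database.
sample = '{"Name": "example-runbook", "RunRetentionPolicy": null}'
print(reset_null_run_retention(sample))
```

Note that the SQL filter is a plain substring match on the stored text, so it only finds rows where the JSON is written exactly as `"RunRetentionPolicy":null`, with no whitespace after the colon.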

If this isn’t applicable to your scenario, could you also let me know which version you upgraded from?

Best regards,

Kenny

I’ve run the first SQL statement; it doesn’t return any results.

We upgraded from whatever was the latest version in December 2021; I can’t remember the exact version.

Hi Simon,

Thanks for following up and ruling that out as a possibility. We’re hoping a system integrity check might illuminate something - would you be willing to run that (Configuration > Diagnostics > System Integrity Check in your web portal), and if it reports any issues upload the resulting log at this link?

If everything passes in the check, it would be best if we could obtain a copy of your Octopus database backup so we can resolve this issue faster. Could you please upload a compressed/zipped backup of your database to the same link above and let us know when the upload is complete, so we can take a look? We don’t need your master key as part of the backup; without it we can’t decrypt the sensitive data you store in Octopus, so it stays secure. We also scrub any personally identifiable information from the database before we start working on it.

I appreciate your assistance so far, and my apologies for the inconvenience this issue has caused you.

Best regards,

Kenny

Hi Kenny, system integrity shows no issues.

I’ve backed up the DB, it’s 29GB uncompressed. What’s the best way to get it to you?

Simon

Hi @simon.george,

Thanks for getting back to us. If you could compress the backup in a ZIP file and upload it to our secure support file hosting here: Support files - Simon.George, that would be awesome!

Hopefully, we can resolve this issue quickly once we’ve acquired the backup.
Should you have any questions or concerns, please let me know and I’ll do my best to put you at ease.

Kind Regards,
Adam

@adam.hollow I’m ready to upload the database (16 GB zip), but the link has expired…

Hi Simon,

Sorry about that, you should be able to use the following link to upload your backup: Support files - simon.george

Best,
Patrick

I’ve uploaded a zip, this contains the backup spanned over multiple files.

A colleague pointed out that we have another integrity issue - one or two of the (empty) environments can’t be deleted.

Hi @simon.george,

Thank you for uploading your database backup for us. We will work today to reproduce your issue in-house and I will keep you posted with any updates.

Can you provide some more information on your issue of deleting environments? Do you receive an error in the UI and if so could you send me a screenshot? If you can let me know which environments you are having this issue with I will add that to our repro.

I look forward to hearing back and please let me know if you have any other questions for us.

Thanks!
Dan

We cannot delete the environment ‘staging16’

We simply get ‘Object reference not set to an instance of an object.’ when we try and do this

Do you have any feedback on my original problem now that you have my database?

Hi @simon.george,

Apologies for the delays. I need to get some input from our engineering team, however, the original log files you uploaded have expired so we no longer have access to them. When you get a moment would you be able to upload a new log from your retention policy task as well as a system diagnostic report? We do still have your database backup available so you won’t need to upload that again.

I’ve generated a new secure upload link in case the previous one expires before you have a chance to upload the new files. Please let me know once you’ve uploaded the files and I’ll pass the information along to our engineering team. Let me know if you have any other questions for us in the meantime.

Thanks!
Dan

I’ve uploaded both of those


Hey @simon.george,

Just jumping in for Dan, who is part of our US-based team and is currently off shift. Thank you for those logs; we will get them sent to the engineers.

We will be in contact when we have any new information for you. Please reach out in the meantime if you need anything further.

Kind Regards,

Clare

Hi Simon,

We greatly appreciate your patience and assistance in troubleshooting these issues. I’m jumping back in for Dan and Clare as they’re both currently offline, and we got some helpful input from the engineers. :slight_smile:

Regarding the error you’re seeing when attempting to delete your ‘staging16’ environment (Object reference not set to an instance of an object): looking at the logs, we’re confident it’s related to this bug recently fixed in 2022.2.3181, so I’m hopeful upgrading will fix this.

Regarding the issue running retention policies: it looks like some releases have empty JSON, which seems to be causing the failure. We’re not sure at this point why some releases have empty JSON, but deleting them should get you unstuck. You can find these releases via the following SQL query:

SELECT * FROM [dbo].[Release] WHERE [JSON] = '';

Then you can delete these releases via the UI or the API.
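For the API route, a rough Python sketch of building the delete requests is shown here. It rests on assumptions that should be checked against your own server’s API documentation: that releases are deleted with an HTTP DELETE to `/api/{spaceId}/releases/{releaseId}`, and that the API key is sent in the `X-Octopus-ApiKey` header. The server URL, space ID, release IDs and API key below are all placeholders, and the helper name is hypothetical. Whether the API accepts a delete for a release with empty JSON is not confirmed in this thread.

```python
import urllib.request


def build_delete_release_request(
    server_url: str, space_id: str, release_id: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for one release.

    Assumed route: DELETE {server}/api/{spaceId}/releases/{releaseId}
    Assumed auth:  X-Octopus-ApiKey header
    """
    url = f"{server_url.rstrip('/')}/api/{space_id}/releases/{release_id}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"X-Octopus-ApiKey": api_key},
    )


if __name__ == "__main__":
    # Placeholder values; substitute the Ids returned by the SQL query.
    release_ids = ["Releases-123", "Releases-456"]
    for release_id in release_ids:
        request = build_delete_release_request(
            "https://octopus.example.com", "Spaces-1", release_id, "API-PLACEHOLDER"
        )
        # Dry run: print what would be sent instead of calling the server.
        print(request.get_method(), request.full_url)
        # To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to review the exact URLs first, which seems prudent given that the affected rows are already in an unexpected state.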

Please let me know how you go or if you have any questions, and we’ll keep you updated with any further updates.

Best regards,

Kenny

Thank you for the diagnosis. I’ve identified those releases; they aren’t available in the UI, so I’m going to try the API now.
