Delete Release from UI Hung, DB CPU Pressure

Hi,

We accidentally created too many artifacts of config files in our releases over a period of 3 or 4 days. That problem is fixed, but when I tried to delete one of those releases via the UI, the delete never finished. After an hour I closed the UI window. But the DB CPU is still running high (~55%) and using the CLI to delete the same release timed out.

This is pretty urgent because the CPU does not appear to be coming down. I suspect some kind of loop, or perhaps deleting a release via the UI doesn’t time out.

I need to be able to find out if the first attempt to delete the release via the UI is still running and putting CPU pressure on the DB, and if so, safely stop it. Then I need to be able to delete the dozen or so releases that still have this large number of artifacts.
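One way to check whether a statement from the first delete is still executing is to list the active requests on the SQL Server instance. The sketch below is only an illustration, not an Octopus-provided tool. It assumes the Octopus database runs on SQL Server, that you already have a DB-API connection whose driver uses `?` placeholders (pyodbc is one such driver), and that the login has VIEW SERVER STATE permission. The function name `long_running_requests` and the database name `Octopus` are hypothetical.

```python
# Sketch only: list requests that have been running for a while against one database.
# Assumes a DB-API connection to SQL Server with "?" parameter placeholders.

LONG_RUNNING_SQL = """
SELECT r.session_id, r.status, r.command, r.cpu_time,
       r.total_elapsed_time, t.text AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.database_id = DB_ID(?)
  AND r.total_elapsed_time >= ?
ORDER BY r.cpu_time DESC
"""


def long_running_requests(conn, database, min_elapsed_ms=60000):
    """Return active requests on `database` older than `min_elapsed_ms`, as dicts."""
    cur = conn.cursor()
    cur.execute(LONG_RUNNING_SQL, (database, min_elapsed_ms))
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


# Hypothetical usage (connection details are yours to supply):
#   import pyodbc
#   conn = pyodbc.connect("<your connection string>")
#   for row in long_running_requests(conn, "Octopus"):
#       print(row["session_id"], row["cpu_time"], row["sql_text"][:80])
```

If a session found this way does belong to the abandoned delete, it can be ended with the T-SQL `KILL` command, but SQL Server then rolls back the open transaction, which takes time and CPU of its own.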

I found the server log with the error from the attempt to delete the release from the UI -

June 18th 2018 19:09:17 Error
An unexpected error occurred while attempting to retrieve and execute a task: A transport-level error has occurred when receiving results from the server. (provider: Session Provider, error: 19 - Physical connection is not usable)
System.Data.SqlClient.SqlException (0x80131904): A transport-level error has occurred when receiving results from the server. (provider: Session Provider, error: 19 - Physical connection is not usable)
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest)
at System.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
at System.Data.SqlClient.SqlInternalConnection.BeginSqlTransaction(IsolationLevel iso, String transactionName, Boolean shouldReconnect)
at System.Data.SqlClient.SqlConnection.BeginTransaction(IsolationLevel iso, String transactionName)
at System.Data.SqlClient.SqlConnection.BeginDbTransaction(IsolationLevel isolationLevel)
at Nevermore.RelationalTransaction..ctor(RelationalTransactionRegistry registry, RetriableOperation retriableOperation, IsolationLevel isolationLevel, ISqlCommandFactory sqlCommandFactory, JsonSerializerSettings jsonSerializerSettings, RelationalMappings mappings, IKeyAllocator keyAllocator, IRelatedDocumentStore relatedDocumentStore, String name)
at Nevermore.RelationalStore.BeginTransaction(RetriableOperation retriableOperation, String name)
at Octopus.Core.Model.Clustering.OctopusClusterService.GetCurrentNode()
at Octopus.Server.Orchestration.TaskQueue.TaskQueueLimiter.GetTasksThisNodeCanFairlyExecute()
at Octopus.Server.Orchestration.TaskQueue.TaskQueue.PollForMoreWorkOnBackgroundThread(Object state)
ClientConnectionId:fd82b338-abdb-442a-86ee-87a95c7bb8d4
Error Number:-1,State:0,Class:20

June 18th 2018 19:45:27 Error
An error occurred when checking for canceled tasks
System.Data.SqlClient.SqlException (0x80131904): A transport-level error has occurred when receiving results from the server. (provider: Session Provider, error: 19 - Physical connection is not usable)
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest)
at System.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
at System.Data.SqlClient.SqlInternalConnection.BeginSqlTransaction(IsolationLevel iso, String transactionName, Boolean shouldReconnect)
at System.Data.SqlClient.SqlConnection.BeginTransaction(IsolationLevel iso, String transactionName)
at System.Data.SqlClient.SqlConnection.BeginDbTransaction(IsolationLevel isolationLevel)
at Nevermore.RelationalTransaction..ctor(RelationalTransactionRegistry registry, RetriableOperation retriableOperation, IsolationLevel isolationLevel, ISqlCommandFactory sqlCommandFactory, JsonSerializerSettings jsonSerializerSettings, RelationalMappings mappings, IKeyAllocator keyAllocator, IRelatedDocumentStore relatedDocumentStore, String name)
at Nevermore.RelationalStore.BeginTransaction(RetriableOperation retriableOperation, String name)
at Octopus.Server.Orchestration.ServerTasks.RelationalTaskRunner.GetOwnedExecutingAndCancellingTasks()
at Octopus.Server.Orchestration.ServerTasks.RelationalTaskRunner.CheckForCancellationOrPause()
at Octopus.Server.Orchestration.ServerTasks.RelationalTaskRunner.RunCancellationCheck()
ClientConnectionId:bb003ddb-69a4-47ef-8738-3fdeaf782f47
Error Number:-1,State:0,Class:20

OK, the original delete release from the UI completed after 2+ hours. I attempted to delete another release via the CLI and it timed out after 10 minutes -

Finding project: ShipStation
Finding channels: Production Channel
Finding releases for project…
Deleting version 18165.1.1

System.TimeoutException: Timeout getting response, client timeout is set to 00:10:00.
at Octopus.Client.OctopusAsyncClient.d__51`1.MoveNext() in Z:\buildAgent\workDir\52d4a5804c7de8e\source\Octopus.Client\OctopusAsyncClient.cs:line 569
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Octopus.Cli.Commands.DeleteReleasesCommand.d__20.MoveNext() in Z:\buildAgent\workDir\52d4a5804c7de8e\source\Octo\Commands\DeleteReleasesCommand.cs:line 81
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Octopus.Cli.Commands.ApiCommand.d__31.MoveNext() in Z:\buildAgent\workDir\52d4a5804c7de8e\source\Octo\Commands\ApiCommand.cs:line 124
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Octopus.Cli.Program.Run(String[] args) in Z:\buildAgent\workDir\52d4a5804c7de8e\source\Octo\Program.cs:line 48
Exit code: -3
PS /Users/bob.hardister>

Hi Bob,

Thanks for getting in touch! I’m sorry for the late reply; this one slipped through the cracks yesterday.

It looks like you’re running Octopus Server 2018.6.3. We shipped 2018.6.4 this week which helped another customer with problems deleting releases. https://github.com/OctopusDeploy/Issues/issues/4653

If you can upgrade to the latest version of Octopus Server (after following all the normal backup procedures), hopefully this will relieve the load on the database CPU.

That being said, you mentioned creating a lot of artifacts. This fix was focused on the relationship between Releases and Deployment Processes and Variable Sets. There is a chance this fix will help in your scenario…

Also, how many artifacts are you talking about? Artifacts are very lightweight, so I’m surprised you’re seeing a timeout there.

Hope that helps!
Mike

Hi Mike,

That’s good to hear. I’ll do the upgrade and see if that works.

I do have a follow-up question: I see in the API that I can delete artifacts. How is a deployment affected if I delete the artifacts associated with it? Does that cause any problems? Or, would the artifacts just no longer show up when you bring up the deployment in the UI?

Thanks!

Bob

Hi Bob,

Thanks for keeping in touch! That’s a good idea actually. Artifacts are just an attachment to a deployment. If you delete an artifact, the file is deleted, and it simply doesn’t appear any more.

Hope that helps!
Mike

Hi Michael,

Upgrading Octopus to the latest release did not significantly address the problem. I did write a script which deleted all the artifacts for a given release in a project. Once I was able to reduce the artifact bloat, things got real snappy. Everything is working and performing well again.
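The script itself is not included in this thread. As a rough sketch of that kind of cleanup, and not the actual script, the code below walks the deployments of one release and deletes the artifacts attached to each. It assumes the REST endpoints `/api/releases/{id}/deployments`, `/api/artifacts?regarding={deploymentId}` and `DELETE /api/artifacts/{id}`, the `X-Octopus-ApiKey` header, and paged responses with `Items` and a `Page.Next` link given as a server-relative path; check these against your server version before relying on them. The function names, the server URL and the IDs in the usage comment are hypothetical.

```python
import json
import urllib.request


def make_transport(server_url, api_key):
    """Return a function that performs one authenticated API call."""
    base = server_url.rstrip("/")

    def call(method, path):
        req = urllib.request.Request(base + path, method=method)
        req.add_header("X-Octopus-ApiKey", api_key)
        with urllib.request.urlopen(req) as resp:
            body = resp.read()
        return json.loads(body) if body else {}

    return call


def iter_items(call, path):
    """Yield every item of a paged collection, following 'Page.Next' links."""
    while path:
        page = call("GET", path)
        yield from page.get("Items", [])
        path = page.get("Links", {}).get("Page.Next")


def delete_release_artifacts(call, release_id):
    """Delete all artifacts of all deployments of one release; return the count."""
    deleted = 0
    for deployment in iter_items(call, f"/api/releases/{release_id}/deployments"):
        # Collect the full list first so deleting does not shift the paging.
        artifacts = list(iter_items(call, f"/api/artifacts?regarding={deployment['Id']}"))
        for artifact in artifacts:
            call("DELETE", f"/api/artifacts/{artifact['Id']}")
            deleted += 1
    return deleted


# Hypothetical usage:
#   call = make_transport("https://octopus.example.com", "API-XXXXXXXX")
#   print(delete_release_artifacts(call, "Releases-123"))
```

Deleting artifacts one request at a time keeps each call small, which sidesteps the single long-running delete that timed out, at the cost of many round trips.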

Bob

Hi Bob,

Thanks for keeping in touch! I’m glad to hear things are back to normal for you. We would like to investigate further so nobody else has this experience.

Can you help with some estimates?

Approximately how many releases were affected?
Approximately how many deployments per affected release?
Approximately how many artifacts per deployment?

Thanks!
Mike

Hi Michael,

About a dozen or so releases with around 8 deployments each. About 140,000 artifacts between those deployments. Accidentally, a recurse switch was enabled for a couple of days causing an entire repo of config files to be created as artifacts for each deployment.
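For a sense of scale per deployment, here is a back-of-the-envelope calculation. It assumes "a dozen or so" means 12 releases and that the 140,000 figure is the total across all affected deployments; both are readings of the estimates above, not measured values.

```python
# Rough estimate from the figures quoted above (all approximate).
releases = 12                  # "about a dozen or so"
deployments_per_release = 8    # "around 8 deployments each"
total_artifacts = 140_000      # assumed to be the total across those deployments

deployments = releases * deployments_per_release
artifacts_per_deployment = total_artifacts / deployments

print(deployments)                      # 96
print(round(artifacts_per_deployment))  # 1458
```

So each affected deployment carried on the order of 1,500 artifacts.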

Thanks for keeping in touch, Bob!

I’ve added a card for us to investigate why this would happen. Your report is the first time we’ve heard of anything similar, but we have some features coming down the pipe over the next year which will probably end up generating more and more artifacts. We probably have some greedy/lazy code that assumes there will only ever be a few artifacts per deployment, and that code will need to be fixed.

All the best, and please do get back in touch if we can help with anything else.
Mike