Problem with queued builds if SQL Server connection is lost

For some reason our SQL Server instance was restarted while a deployment was taking place, which is a bad thing to happen. I don’t particularly expect the deployment to complete in that situation; however, in our case we could not cancel the task after it failed. We saw the following on our deployment page, and we could not cancel the task without a server reboot.

A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified)
System.Data.SqlClient.SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified)
   at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
   at System.Data.SqlClient.SqlConnection.TryOpenInner(TaskCompletionSource`1 retry)
   at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry)
   at System.Data.SqlClient.SqlConnection.Open()
   at Octopus.Shared.TransientFaultHandling.RetryPolicy.<>c__DisplayClass26_0.<ExecuteAction>b__0() in Y:\work\refs\tags\3.1.5\source\Octopus.Shared\TransientFaultHandling\RetryPolicy.cs:line 172
   at Octopus.Shared.TransientFaultHandling.RetryPolicy.ExecuteAction[TResult](Func`1 func) in Y:\work\refs\tags\3.1.5\source\Octopus.Shared\TransientFaultHandling\RetryPolicy.cs:line 215
   at Octopus.Core.RelationalStorage.RelationalStore.BeginTransaction(IsolationLevel isolationLevel) in Y:\work\refs\tags\3.1.5\source\Octopus.Core\RelationalStorage\RelationalStore.cs:line 62
   at Octopus.Server.Orchestration.Deploy.DeploymentPlanService.Persist(DeploymentPlan plan) in Y:\work\refs\tags\3.1.5\source\Octopus.Server\Orchestration\Deploy\DeploymentPlanService.cs:line 42
   at Octopus.Server.Orchestration.Deploy.DeploymentTaskController.Execute() in Y:\work\refs\tags\3.1.5\source\Octopus.Server\Orchestration\Deploy\DeploymentTaskController.cs:line 65
   at Octopus.Shared.Tasks.RunningTask.RunMainThread() in Y:\work\refs\tags\3.1.5\source\Octopus.Shared\Tasks\RunningTask.cs:line 80
ClientConnectionId:00000000-0000-0000-0000-000000000000
Octopus.Server version 3.1.5 (3.1.5+Branch.master.Sha.a3fb854d900077b8b028687f3a4ca01c59e84f56)
00:39:49 Error
Unable to mark task ServerTasks-9181 as complete: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified)
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified)
System.Data.SqlClient.SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified)
   at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
   at System.Data.SqlClient.SqlConnection.TryOpenInner(TaskCompletionSource`1 retry)
   at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry)
   at System.Data.SqlClient.SqlConnection.Open()
   at Octopus.Shared.TransientFaultHandling.RetryPolicy.<>c__DisplayClass26_0.<ExecuteAction>b__0() in Y:\work\refs\tags\3.1.5\source\Octopus.Shared\TransientFaultHandling\RetryPolicy.cs:line 172
   at Octopus.Shared.TransientFaultHandling.RetryPolicy.ExecuteAction[TResult](Func`1 func) in Y:\work\refs\tags\3.1.5\source\Octopus.Shared\TransientFaultHandling\RetryPolicy.cs:line 215
   at Octopus.Core.RelationalStorage.RelationalStore.BeginTransaction(IsolationLevel isolationLevel) in Y:\work\refs\tags\3.1.5\source\Octopus.Core\RelationalStorage\RelationalStore.cs:line 62
   at Octopus.Server.Orchestration.RelationalTaskRunner.TaskComplete(String taskId, Exception error) in Y:\work\refs\tags\3.1.5\source\Octopus.Server\Orchestration\RelationalTaskRunner.cs:line 91
   at Octopus.Shared.Tasks.RunningTask.CompleteTask(Exception error) in Y:\work\refs\tags\3.1.5\source\Octopus.Shared\Tasks\RunningTask.cs:line 159
ClientConnectionId:00000000-0000-0000-0000-000000000000
Octopus.Server version 3.1.5 (3.1.5+Branch.master.Sha.a3fb854d900077b8b028687f3a4ca01c59e84f56)

Hi Brent,

Thanks for getting in touch. The short answer is to restart your Octopus Server. On startup it will find any tasks that were left in limbo and cancel them.

You can see in that stack trace that we actually try really hard to mark the task as completed, but eventually give up.
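As a rough illustration of that "retry, then give up" pattern (the `RetryPolicy.ExecuteAction` frames in the trace), here is a minimal sketch in Python. It is not the Octopus code: the names `TransientError` and `execute_with_retry`, the attempt count, and the delay are all assumptions made up for the example.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a dropped database connection."""


def execute_with_retry(action, max_attempts=3, delay_seconds=0.0, sleep=time.sleep):
    """Call action(); retry on TransientError and re-raise after the last attempt.

    max_attempts and delay_seconds are illustrative values, not Octopus settings.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except TransientError as error:
            last_error = error
            # Wait before the next attempt, but not after the final one.
            if attempt < max_attempts:
                sleep(delay_seconds)
    # Every attempt failed: give up and surface the last error to the caller.
    raise last_error
```

If the database is still unreachable when the last attempt fails, the "mark as complete" call raises and the task record is never updated, which matches the task being left in limbo until the server is restarted.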

I’m going to talk to the team and see what we think about allowing a task to be manually cancelled once we are back in communication with the database.

Hope that helps.
Mike