Random SQL Connection Errors from Octopus Server

We are using SQL Server 2017 (RTM-CU20) (KB4541283) - 14.0.3294.2

Octopus server version 2019.6.3

We are seemingly randomly getting the following errors. The same SQL Server is used by a few other apps with no issues at all, and we can't see any errors on the SQL Server side at the times these are reported either.

I found a few issues online relating to MARS and have tried disabling it in the connection string, without any effect.
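For anyone wanting to confirm from the SQL Server side that MARS is really off, here is a minimal sketch. It assumes the login has VIEW SERVER STATE permission and relies on `sys.dm_exec_connections` reporting `net_transport = 'Session'` for the logical sessions that MARS opens. If the connection string change took effect, the Octopus Server process should not appear in the results.

```sql
-- Sketch only: lists connections that are MARS logical sessions.
-- Assumes VIEW SERVER STATE permission on the instance.
SELECT c.session_id
    ,c.parent_connection_id   -- the physical connection the MARS session rides on
    ,c.net_transport
    ,s.program_name
    ,s.host_name
FROM sys.dm_exec_connections c
INNER JOIN sys.dm_exec_sessions s ON s.session_id = c.session_id
WHERE c.net_transport = 'Session'
ORDER BY c.session_id;
```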

The SQL queries we get the error on are not consistent: sometimes SELECT, sometimes UPDATE, etc.

We've also tried moving the Octopus app server to a different VM on our network to rule out network issues, and we still have the same problem.

Octopus Server returned an error: Exception occurred while executing a reader for SELECT [Id] from [IdsInUse] WITH (NOLOCK) WHERE ([Id] IN (@ids_0, @ids_1)) AND ([SpaceId] = @space_id OR [SpaceId] is null)
SQL Error 0 - The connection is broken and recovery is not possible. The connection is marked by the server as unrecoverable. No attempt was made to restore the connection.

Octopus.Server.Schedules.RunOnAScheduleAdapter`1[Octopus.Server.Schedules.CheckTentacleHealth] System.Exception: Exception occurred while executing a reader for SELECT MAX([DataVersion]) AS [Latest],
COUNT(*) AS [Count],
'DeploymentEnvironment' AS [TableName],
'Spaces-1' AS [PartitionId]
FROM dbo.[DeploymentEnvironment]
WHERE ((([SpaceId] = 'Spaces-1')))
AND ([SpaceId] = 'Spaces-1')
ORDER BY [Count]
---> System.Data.SqlClient.SqlException: The connection is broken and recovery is not possible. The connection is marked by the server as unrecoverable. No attempt was made to restore the connection.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)

Last one, this time with the full stack trace:

Exception occurred while executing a reader for SELECT MAX([DataVersion]) AS [Latest], MIN_ACTIVE_ROWVERSION() as [MinActive], COUNT_BIG(*) AS [Count] FROM dbo.[Artifact] WHERE SpaceId = @spaceId
SELECT [DataVersion] FROM dbo.[Artifact] WHERE SpaceId = @spaceId AND [DataVersion] >= MIN_ACTIVE_ROWVERSION() ORDER BY [DataVersion] desc
System.Exception: Exception occurred while executing a reader for SELECT MAX([DataVersion]) AS [Latest], MIN_ACTIVE_ROWVERSION() as [MinActive], COUNT_BIG(*) AS [Count] FROM dbo.[Artifact] WHERE SpaceId = @spaceId
SELECT [DataVersion] FROM dbo.[Artifact] WHERE SpaceId = @spaceId AND [DataVersion] >= MIN_ACTIVE_ROWVERSION() ORDER BY [DataVersion] desc
---> System.Data.SqlClient.SqlException: Cannot continue the execution because the session is in the kill state.
A severe error occurred on the current command. The results, if any, should be discarded.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
at System.Data.SqlClient.SqlDataReader.get_MetaData()
at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted)
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)
at Nevermore.Transient.IDbCommandExtensions.<>c__DisplayClass5_0.b__0()
at Nevermore.Transient.RetryPolicy.ExecuteAction[TResult](Func`1 func)
at Nevermore.Transient.IDbCommandExtensions.ExecuteReaderWithRetry(IDbCommand command, RetryPolicy commandRetryPolicy, RetryPolicy connectionRetryPolicy, String operationName)
--- End of inner exception stack trace ---
at Nevermore.Transient.IDbCommandExtensions.ExecuteReaderWithRetry(IDbCommand command, RetryPolicy commandRetryPolicy, RetryPolicy connectionRetryPolicy, String operationName)
at Nevermore.RelationalTransaction.ExecuteReader(String query, CommandParameterValues args, Action`1 readerCallback)
at Octopus.Server.Caching.DataVersion.TableDataVersionTagProvider`1.GetDataVersionTag(IOctopusRelationalTransaction transaction, String spaceId)
at Octopus.Server.Web.Api.Rules.DataVersionCacheRule.Before(ISpecialRuleContext context)
at Octopus.Server.Web.Infrastructure.Api.Responder`1.ExecuteRegisteredRules[TRule](Action`2 ruleCallback)
at Octopus.Server.Web.Infrastructure.Api.Responder`1.Respond(TDescriptor options, NancyContext context)
at System.Dynamic.UpdateDelegates.UpdateAndExecute3[T0,T1,T2,TRet](CallSite site, T0 arg0, T1 arg1, T2 arg2)
at Octopus.Server.Web.Infrastructure.OctopusNancyModule.<>c__DisplayClass14_0.<get_Routes>b__1(Object o, CancellationToken x)
at Nancy.Routing.Route`1.d__7.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Nancy.Routing.DefaultRouteInvoker.d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Nancy.Routing.DefaultRequestDispatcher.d__5.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Nancy.NancyEngine.d__22.MoveNext()

Hi @joel_dickson,

Thanks for getting in touch!

The only other instances of this error that I can find are linked to the Azure deployment steps and seem to be related to timeouts. This is the first time I have seen it between the server and the database, though.

Are these errors appearing within the UI when they occur or are they just appearing in the log?
And when they do occur, assuming it is the result of some interaction in the UI, is there any delay before the error appears or is it an immediate failure? I’m wondering if it is timeout related.

If it does seem like it could be timeout related, it may be worth checking the index fragmentation level. This query should return that information.

SELECT OBJECT_NAME(ips.object_id) AS table_name
 ,i.name AS index_name
 ,ips.index_id
 ,ips.index_type_desc
 ,ips.avg_fragmentation_in_percent
 ,ips.avg_page_space_used_in_percent
 ,ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') ips
INNER JOIN sys.indexes i ON (ips.object_id = i.object_id)
 AND (ips.index_id = i.index_id)
ORDER BY ips.avg_fragmentation_in_percent DESC
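
If the full list is noisy, a narrower variant along these lines may be easier to read. This is only a sketch: the 1000-page floor and the 5% / 30% thresholds are commonly cited rules of thumb for when fragmentation starts to matter, not values specific to Octopus, and the `suggested_action` column is just an illustrative label.

```sql
-- Sketch only: the page-count floor and the 5% / 30% thresholds are
-- assumed rules of thumb; adjust them to suit the database.
SELECT OBJECT_NAME(ips.object_id) AS table_name
    ,i.name AS index_name
    ,ips.avg_fragmentation_in_percent
    ,ips.page_count
    ,CASE
        WHEN ips.avg_fragmentation_in_percent > 30 THEN 'REBUILD'
        WHEN ips.avg_fragmentation_in_percent > 5 THEN 'REORGANIZE'
        ELSE 'OK'
     END AS suggested_action
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') ips
INNER JOIN sys.indexes i ON ips.object_id = i.object_id
    AND ips.index_id = i.index_id
WHERE ips.page_count > 1000   -- small indexes rarely benefit from maintenance
    AND ips.index_id > 0      -- skip heaps, which have no index to rebuild
ORDER BY ips.avg_fragmentation_in_percent DESC;
```

Indexes flagged by a query like this would typically be handled with `ALTER INDEX ... REORGANIZE` or `ALTER INDEX ... REBUILD`, ideally outside busy deployment windows.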

Regards,
Paul

They come up in the UI (sometimes as a red box when the dashboard loads), in the logs, and also in the TeamCity logs when we are making API calls.

Will get back to you with the fragmentation level.

Thanks,
Joel

