Infrequent failures of octo.exe create-release

We got a couple of failures (very infrequent, say, once a week) of the following nature:

:upload
Executing task ':upload' (up-to-date check took 0.0 secs) due to:
  Task has not declared any outputs.
Starting process 'command 'nuget''. Working directory: C:\Jenkins\jobs\iRePORT-Dev\workspace\build Command: nuget push Nete.Ireport.12.0-Dev2769.nupkg -ApiKey ******** -Source http://***/octopus/nuget/packages
Successfully started process 'command 'nuget''
Pushing Nete.Ireport 12.0-Dev2769 to 'http://***/octopus/nuget/packages'...
Your package was pushed.
Starting process 'command 'octo''. Working directory: C:\Jenkins\jobs\iRePORT-Dev\workspace Command: octo create-release --project=iRePORT --version=12.0-Dev2769 --packageversion=12.0-Dev2769 --server=http://***/octopus/ --apiKey=******** --releaseNotes=Build iRePORT-Dev #2769 (http://jenkins-ireport/job/iRePORT-Dev/2769)
Successfully started process 'command 'octo''
Octopus Deploy Command Line Tool, version 3.2.0.97

Handshaking with Octopus server: http://***/octopus/
Handshake successful. Octopus version: 3.2.1; API version: 3.0.0
Authenticated as: DK <***> 
Finding project: iRePORT
Finding deployment process for project: iRePORT
Finding release template...
Using version number provided on command-line.
Release plan for release:    12.0-Dev2769
Steps: 
  #   Name                           Version         Source                              
  --- ------------------------------ --------------- ------------------------------------
  1   Configure and Deploy Package   12.0-Dev2769    User specified                      

Creating release...
Error from Octopus server (HTTP 500): Octopus Server returned an error: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
Server exception: 
System.Data.SqlClient.SqlException (0x80131904): Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
   at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
   at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
   at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
   at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
   at System.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest)
   at System.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
   at System.Data.SqlClient.SqlInternalTransaction.Commit()
   at System.Data.SqlClient.SqlTransaction.Commit()
   at Octopus.Server.Web.Infrastructure.Api.CreateResponseDescriptor`2.Responder.Execute() in Y:\work\refs\tags\3.2.1\source\Octopus.Server\Web\Infrastructure\Api\CreateResponseDescriptor.cs:line 91
   at Octopus.Server.Web.Infrastructure.Api.Responder`1.Respond(TDescriptor options, NancyContext context) in Y:\work\refs\tags\3.2.1\source\Octopus.Server\Web\Infrastructure\Api\Responder.cs:line 162
   at System.Dynamic.UpdateDelegates.UpdateAndExecute3[T0,T1,T2,TRet](CallSite site, T0 arg0, T1 arg1, T2 arg2)
   at CallSite.Target(Closure , CallSite , Object , Object , NancyContext )
   at Octopus.Server.Web.Api.OctopusRestApiModule.<>c__DisplayClass0_0.<.ctor>b__0(Object o) in Y:\work\refs\tags\3.2.1\source\Octopus.Server\Web\Api\OctopusRestApiModule.cs:line 46
   at CallSite.Target(Closure , CallSite , Func`2 , Object )
   at Nancy.Routing.Route.<>c__DisplayClass4.<Wrap>b__3(Object parameters, CancellationToken context)
ClientConnectionId:1b98f4ff-d4e3-40eb-bff8-830c7f6ba632
Error Number:-2,State:0,Class:11
-----------------------

Exit code: -7
:upload FAILED
:upload (Thread[main,5,main]) completed. Took 23.613 secs.

As you can see, create-release fails pretty fast (in under 24 seconds), so I guess it must not be related to the default 10-minute remote query timeout in SQL Server (https://msdn.microsoft.com/en-us/library/ms189040(v=sql.120).aspx).

Hi Dmitriy,

Thanks for getting in touch. This TimeoutException is being thrown after calling SqlTransaction.Commit(), after all of the INSERT/UPDATE work has completed. For many SqlCommands we have increased the CommandTimeout to 60 seconds, but we don’t control the timeout for the SqlTransaction.Commit() operation, so I’d guess it is using the default value of 30 seconds. It’s this client-side timeout that causes the TimeoutException to be thrown by the ADO.NET client, not a server-side configuration. Unfortunately, I can’t explain what is happening in your specific situation from these logs.
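As a rough illustration (this is not Octopus source code): ADO.NET exposes a configurable per-command timeout, but SqlTransaction has no equivalent property, so Commit() runs with whatever the driver does internally:

```csharp
using System;
using System.Data.SqlClient;

class TimeoutDemo
{
    static void Main()
    {
        // Per-command timeout: defaults to 30 seconds and can be raised
        // (the reply above mentions Octopus bumps it to 60 for many commands).
        using (var cmd = new SqlCommand())
        {
            Console.WriteLine(cmd.CommandTimeout); // prints 30
            cmd.CommandTimeout = 60;
        }

        // SqlTransaction, by contrast, exposes no timeout property at all,
        // so there is nothing to tune before calling transaction.Commit().
    }
}
```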

Is your Octopus Server or Octopus Database shared with other applications? Does this problem happen at a regular time? Perhaps there is some resource contention causing the problem?

Sorry I couldn’t offer better news, but there is nothing obviously wrong/broken on our side. If you find something different please don’t hesitate to get in touch again.

Hope that helps!
Michael

Thanks for the update. Answering your questions:

Is your Octopus Server or Octopus Database shared with other applications?

Our Octopus Server runs on our Dev server, which is shared with other tools and our application.

Does this problem happen at a regular time? Perhaps there is some resource contention causing the problem?

I could not see a pattern. We have had only 2 failures so far, I believe.

I’ll keep an eye on this. For now, it’s a mere nuisance because rebuilds make this issue go away.

Hi Dmitriy,

Thanks for getting back to me with those details. I’ll be interested to know if you find a solution to make this more reliable in your situation - the vast majority of TimeoutExceptions have an environmental root cause. In some cases we’ve been able to make Octopus more robust against transient failures, but in cases like this it just isn’t safe to retry the operation, especially because the transaction may have actually committed successfully on the server.

All the best!
Mike

No, I have not been able to resolve this. We still have infrequent random failures: just had one more of these.

Other than provisioning a new server, we don’t have many options: we need all of the following software running in our Dev environment:

  • IIS with our application
  • Piwik
  • Jenkins
  • Octopus Deploy Server

BTW, if you develop Octopus in C#, here is a way to disable the default timeout. We do it in our app because the default of 30 seconds is a bit short for our workload:

public AppDbContext() : base(DbConnectionFactory.CreateConnection(), true)
{
    // Disable the default SQL Server provider's 30 sec. timeout.
    Database.CommandTimeout = 0;
    // Logs SQL in Database.SqlQuery() and Database.ExecuteSqlCommand(), but not in ExecuteStoredProc()
    Database.Log = (msg =>
    {
        if (!string.IsNullOrWhiteSpace(msg))
        {
            Log.Debug(msg.Trim('\r', '\n'));
        }
    });
}

Hi Dmitriy,

Thanks for keeping me in the loop. I’ve raised a GitHub Issue to investigate this further.

Hope that helps!
Mike