Octopus hanging when starting Windows service

When we deploy a release, Octopus often ends up hanging while waiting for services to start or stop.

One example was when we tried to deploy a simple service whose (almost) sole responsibility is to start another service and redirect the standard output into a log. The service started after 4 seconds, but Octopus Deploy kept hanging on the step ‘Starting the Runner service’.

The last entries in the Octopus server task log were:
[“P”,“ServerTasks-8241/9d78fd7580324a8c82d42e025728ef3d/44bdcd05562a40f1bdbf5c227be5ac72/d419aaecac10414a8bf04c83f572f353/d86f50fd46bc42728941872505e74cbf”,“2015-01-26T13:56:14.1615804+00:00”,“Finished”,“ProcedureCallOrchestrator-_g-AbF3e+XBig@SQ-PC-2A5967BB”,"",“100”]
[“L”,“ServerTasks-8241/9d78fd7580324a8c82d42e025728ef3d/44bdcd05562a40f1bdbf5c227be5ac72/d419aaecac10414a8bf04c83f572f353/f18151daed0448f397ed542385b463e9”,“2015-01-26T13:56:14.3646638+00:00”,“Verbose”,“Anonymous@SQ-PCC-2A5967BB”,“Running PowerShell script: D:\Octopus\Applications\.SQ-PC-2A5967BB\Octopus.Tentacle\2.6.0.778_1\Scripts\Octopus.Features.WindowsService_BeforePostDeploy.ps1”,null]
[“L”,“ServerTasks-8241/9d78fd7580324a8c82d42e025728ef3d/44bdcd05562a40f1bdbf5c227be5ac72/d419aaecac10414a8bf04c83f572f353/f18151daed0448f397ed542385b463e9”,“2015-01-26T13:56:26.4088856+00:00”,“Info”,“Anonymous@SQ-PC-2A5967BB”,“Starting the Runner service\r\n”,null]

while the corresponding Tentacle log entries were:
2015-01-26 14:56:32.4387 INFO WARNING: Waiting for service ‘Runner (Runner )’ to finish starting…
2015-01-26 14:56:33.4854 INFO WARNING: Waiting for service ‘Runner (Runner )’ to finish starting…
Service started

It seems as if the client realises the service has started, but the server never responds.
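
When lining up the two logs, note that the server task log timestamps carry an explicit +00:00 offset, while the Tentacle log lines carry no offset and read one hour later, which suggests local time. Below is a minimal Python sketch for putting both on the same clock. It assumes the Tentacle machine was at UTC+1 (inferred from the one-hour difference, not stated in the logs), and the function names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Server task-log timestamps look like 2015-01-26T13:56:26.4088856+00:00.
# They have 7 fractional digits, so we trim to microseconds ourselves.
SERVER_TS = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)([+-])(\d{2}):(\d{2})"
)


def parse_server_timestamp(text):
    """Parse a server task-log timestamp into an offset-aware datetime."""
    match = SERVER_TS.fullmatch(text)
    if match is None:
        raise ValueError("unexpected server timestamp: " + text)
    base, fraction, sign, hours, minutes = match.groups()
    microseconds = int(fraction[:6].ljust(6, "0"))
    offset = timedelta(hours=int(hours), minutes=int(minutes))
    if sign == "-":
        offset = -offset
    naive = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return naive.replace(microsecond=microseconds, tzinfo=timezone(offset))


def parse_tentacle_timestamp(text, utc_offset_hours):
    """Parse a Tentacle log timestamp, which has no offset of its own.

    utc_offset_hours is an ASSUMPTION supplied by the caller.
    """
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    return naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))


# Last server entry ("Starting the Runner service") and first Tentacle
# "Waiting for service" entry, copied from the logs above.
server_time = parse_server_timestamp("2015-01-26T13:56:26.4088856+00:00")
tentacle_time = parse_tentacle_timestamp("2015-01-26 14:56:32.4387", 1)  # assumed UTC+1

gap = tentacle_time - server_time
print("Tentacle entry is", gap.total_seconds(), "seconds after the server entry")
```

Under that assumption the first Tentacle "Waiting for service" line comes about six seconds after the last server entry, so the Tentacle kept making progress after the server log went quiet.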

We are running Octopus server and Tentacle version 2.6.0.778.
The Tentacle is a listening one.

We also currently have several machines disabled, each running an older version of the Tentacle.
We can’t seem to open the diagnostics page, even when the dashboard shows no Tentacles hanging.

Restarting Octopus Deploy fixed the diagnostics page, though deployments still lead to things hanging.

The diagnostics page shows 3 errors of the following kind:
Unable to load actor from C:\Octopus\OctopusServer\Actors\DeploymentOrchestrator-AqI-Aa5MA2jcVQ.pfa; attempting to discard.
System.Security.Cryptography.CryptographicException: Padding is invalid and cannot be removed.
at System.Security.Cryptography.CapiSymmetricAlgorithm.DepadBlock(Byte[] block, Int32 offset, Int32 count)
at System.Security.Cryptography.CapiSymmetricAlgorithm.TransformFinalBlock(Byte[] inputBuffer, Int32 inputOffset, Int32 inputCount)
at System.Security.Cryptography.CryptoStream.FlushFinalBlock()
at System.Security.Cryptography.CryptoStream.Dispose(Boolean disposing)
at System.IO.Stream.Close()
at Octopus.Shared.Bcl.IO.StreamDisposalChain.Dispose(Boolean disposing) in y:\work\refs\heads\release\source\Octopus.Shared\Bcl\IO\StreamDisposalChain.cs:line 24
at System.IO.Stream.Close()
at System.IO.StreamReader.Dispose(Boolean disposing)
at System.IO.TextReader.Dispose()
at Pipefish.Persistence.Filesystem.ActorStateFile.Deserialize() in y:\work\3cbe05672d69a231\source\Pipefish\Persistence\Filesystem\ActorStateFile.cs:line 67
at Pipefish.Persistence.Filesystem.ActorStateFile…ctor(String path, IStorageStreamTransform sst) in y:\work\3cbe05672d69a231\source\Pipefish\Persistence\Filesystem\ActorStateFile.cs:line 23
at Pipefish.Persistence.Filesystem.DirectoryActorStorage.TryLoadActor(String pfaFile) in y:\work\3cbe05672d69a231\source\Pipefish\Persistence\Filesystem\DirectoryActorStorage.cs:line 45

Hi Jakob,

Sorry for the delay in a response. This is a known issue that we are currently working on, and are hoping to have a new release out by next week.
Most customers reporting this say that it is ‘random’: sometimes the deployment works and sometimes it hangs. So running the deployment again might be successful.
There is no other real workaround.

Vanessa

We have updated the server to Octopus Deploy version 2.6.1.796 and the Tentacles correspondingly.

The error still occurs randomly. The client says the service has started, but the server never seems to receive that message.

Hi Jakob,

Could you try the following to force a reset of all activities and actors:

cd \Program Files\Octopus Deploy\Octopus
octopus.server service --stop
octopus.server reset-activities
octopus.server service --start

It’s not something we suggest running often as it will cancel everything in progress.
Let me know how the server behaves after.

Vanessa

Thank you for the quick answer.

The first time we tried after running your suggested commands, we failed in the ‘acquire packages’ step.

The server had the following error:
[“L”,“ServerTasks-9026/af67ed98e10c46e1892128d920fae226/6cf5323e3ede41eda59755b901e7dd85/ddca7877fdce4c3997c55a175f86b70d”,“2015-02-05T08:26:38.0614975+00:00”,“Fatal”,“FileSender-Mg-AbklF82YDw@SQ-OLDDEV-A6932307”,“Upload of file C:\Octopus\OctopusServer\Repository\Packages\Runner\Runner0.1.58.nupkg with hash 0cba3f406d183287a743b24d0f09023cfcfb5433 to SQ-PC-2A5967BB failed”,“The actor FileReceiver-ATM-AbklGCbx6A@SQ-PC-2A5967BB cannot handle failure Pipefish.Messages.Supervision.StartedEvent\r\nSystem.InvalidOperationException: The actor FileReceiver-ATM-AbklGCbx6A@SQ-PC-2A5967BB cannot handle failure Pipefish.Messages.Supervision.StartedEvent\r\n at Pipefish.Actor.OnHandleFailedTyped[TBody](Message deliveryFailure, Message failedMessage, Error error) in y:\work\3cbe05672d69a231\source\Pipefish\Actor.cs:line 169\r\nOctopus.Server version 2.6.1.796”]

The client correspondingly had this error:
2015-02-05 09:26:37.3249 ERROR Error in FileReceiver-ATA-AbklGB_K+A@SQ-PC-2A5967BB while receiving b1b0f6e5-ad10-11e4-a385-00155d057901
System.InvalidOperationException: The actor FileReceiver-ATA-AbklGB_K+A@SQ-PC-2A5967BB cannot handle failure Octopus.Shared.FileTransfer.SendStreamRequest
at Pipefish.Actor.OnHandleFailedTyped[TBody](Message deliveryFailure, Message failedMessage, Error error) in y:\work\3cbe05672d69a231\source\Pipefish\Actor.cs:line 169

We previously moved the Octopus server from the machine ‘OLDDEV’ to a new one. Hopefully the fact that the server activities have LoggerActorIds containing the old machine name doesn’t lead to any errors.

Retrying the deployment still led to the random hanging mentioned previously: the client log showed the service had started, but the server didn’t seem to get the message.

Hi Jakob,

We have just (about) dropped 2.6.2. It includes some extra logging so we can try to track this down.
Let me know when you have the ability to upgrade, and I will give you instructions to capture some logs for us.

Vanessa

We have tried reinstalling the tentacle on 2 of our machines.

One of them had some trouble uninstalling, as it failed to remove the tentacle service, so it had to be removed manually.

After the manual tentacle reinstall we have not had the same issues as previously.

Previously we updated the Tentacles by pushing Tentacle updates via the Octopus server; perhaps an issue exists there.

Hi Jakob,

Thanks for letting us know. Please notify me if you notice this behavior starting again.

Vanessa