When we deploy a release, Octopus often hangs while waiting for services to start or stop.
An example was when we tried to deploy a simple service whose (almost) sole responsibility is to start another service and redirect its standard output to a log. The service started after 4 seconds, but Octopus Deploy kept hanging on the step ‘Starting the Runner service’.
The last entries in the Octopus server task log were:
[“L”,“ServerTasks-8241/9d78fd7580324a8c82d42e025728ef3d/44bdcd05562a40f1bdbf5c227be5ac72/d419aaecac10414a8bf04c83f572f353/f18151daed0448f397ed542385b463e9”,“2015-01-26T13:56:14.3646638+00:00”,“Verbose”,“Anonymous@SQ-PCC-2A5967BB”,“Running PowerShell script: D:\Octopus\Applications\.SQ-PC-2A5967BB\Octopus.Tentacle\188.8.131.528_1\Scripts\Octopus.Features.WindowsService_BeforePostDeploy.ps1”,null]
[“L”,“ServerTasks-8241/9d78fd7580324a8c82d42e025728ef3d/44bdcd05562a40f1bdbf5c227be5ac72/d419aaecac10414a8bf04c83f572f353/f18151daed0448f397ed542385b463e9”,“2015-01-26T13:56:26.4088856+00:00”,“Info”,“Anonymous@SQ-PC-2A5967BB”,“Starting the Runner service\r\n”,null]
while the corresponding Tentacle entries were:
2015-01-26 14:56:32.4387 INFO WARNING: Waiting for service ‘Runner (Runner )’ to finish starting…
2015-01-26 14:56:33.4854 INFO WARNING: Waiting for service ‘Runner (Runner )’ to finish starting…
It seems as if the client realises the service has started, but the server never responds.
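When comparing the two logs, note that the server entries carry a UTC timestamp (+00:00) while the Tentacle entries appear to be in local time. Below is a minimal Python sketch for putting both on the same clock. It assumes the raw server entries are JSON arrays with straight quotes, and that the Tentacle machine is one hour ahead of UTC (inferred from 13:56 versus 14:56 in the excerpts above, not confirmed). The function names and the `utc_offset_hours` parameter are made up for illustration.

```python
import json
import re
from datetime import datetime, timedelta, timezone

# Server entry from above, with straight quotes (assumed raw form).
SERVER_LINE = r'["L","ServerTasks-8241/9d78fd7580324a8c82d42e025728ef3d/44bdcd05562a40f1bdbf5c227be5ac72/d419aaecac10414a8bf04c83f572f353/f18151daed0448f397ed542385b463e9","2015-01-26T13:56:26.4088856+00:00","Info","Anonymous@SQ-PC-2A5967BB","Starting the Runner service\r\n",null]'

# Tentacle entry from above, copied as posted.
TENTACLE_LINE = "2015-01-26 14:56:32.4387 INFO WARNING: Waiting for service ‘Runner (Runner )’ to finish starting…"


def parse_server_entry(line):
    """Parse one server task log entry (a 7-element JSON array)."""
    _kind, _task_path, stamp, level, source, message, _detail = json.loads(line)
    # The timestamp has 7 fractional digits; %f accepts at most 6.
    stamp = re.sub(r"(\.\d{6})\d+", r"\1", stamp)
    when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f%z")
    return {"time": when, "level": level, "source": source,
            "message": message.strip()}


def parse_tentacle_entry(line, utc_offset_hours=1):
    """Parse one Tentacle log line; the UTC offset is an assumption."""
    date_part, time_part, level, message = line.split(" ", 3)
    local = datetime.strptime(date_part + " " + time_part,
                              "%Y-%m-%d %H:%M:%S.%f")
    tz = timezone(timedelta(hours=utc_offset_hours))
    return {"time": local.replace(tzinfo=tz), "level": level,
            "message": message}


if __name__ == "__main__":
    server = parse_server_entry(SERVER_LINE)
    tentacle = parse_tentacle_entry(TENTACLE_LINE)
    gap = (tentacle["time"] - server["time"]).total_seconds()
    print("Tentacle entry is %.1f s after the server entry" % gap)
```

Under those assumptions, the first Tentacle ‘Waiting for service’ entry comes roughly six seconds after the server's ‘Starting the Runner service’ entry.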
We are running Octopus server and tentacle version 184.108.40.2068
The Tentacle is a listening Tentacle.
We also currently have several disabled machines, each running an older version of the Tentacle.
We can’t seem to open the diagnostics page, even when the dashboard shows no Tentacles hanging.
Restarting Octopus Deploy fixed the diagnostics page, though deployments still lead to things hanging.
The diagnostics page shows 3 errors of the following kind:
Unable to load actor from C:\Octopus\OctopusServer\Actors\DeploymentOrchestrator-AqI-Aa5MA2jcVQ.pfa; attempting to discard.
System.Security.Cryptography.CryptographicException: Padding is invalid and cannot be removed.
at System.Security.Cryptography.CapiSymmetricAlgorithm.DepadBlock(Byte[] block, Int32 offset, Int32 count)
at System.Security.Cryptography.CapiSymmetricAlgorithm.TransformFinalBlock(Byte[] inputBuffer, Int32 inputOffset, Int32 inputCount)
at System.Security.Cryptography.CryptoStream.Dispose(Boolean disposing)
at Octopus.Shared.Bcl.IO.StreamDisposalChain.Dispose(Boolean disposing) in y:\work\refs\heads\release\source\Octopus.Shared\Bcl\IO\StreamDisposalChain.cs:line 24
at System.IO.StreamReader.Dispose(Boolean disposing)
at Pipefish.Persistence.Filesystem.ActorStateFile.Deserialize() in y:\work\3cbe05672d69a231\source\Pipefish\Persistence\Filesystem\ActorStateFile.cs:line 67
at Pipefish.Persistence.Filesystem.ActorStateFile..ctor(String path, IStorageStreamTransform sst) in y:\work\3cbe05672d69a231\source\Pipefish\Persistence\Filesystem\ActorStateFile.cs:line 23
at Pipefish.Persistence.Filesystem.DirectoryActorStorage.TryLoadActor(String pfaFile) in y:\work\3cbe05672d69a231\source\Pipefish\Persistence\Filesystem\DirectoryActorStorage.cs:line 45
Sorry for the delay in a response. This is a known issue that we are currently working on, and are hoping to have a new release out by next week.
Most customers reporting this state that it is ‘random’ and sometimes will work and sometimes will hang. So running the deployment again might be successful.
There is no other real workaround.
We have updated the server to Octopus Deploy version 220.127.116.116 and the Tentacles correspondingly.
The error still occurs randomly. The client says the service has started, but the server never seems to receive that message.
Could you try the following to force a reset of all activities and actors:
cd \Program Files\Octopus Deploy\Octopus
octopus.server service --stop
octopus.server service --start
It’s not something we suggest running often as it will cancel everything in progress.
Let me know how the server behaves after.
Thank you for the quick answer.
The first time we tried after running your suggested commands, the deployment failed in the ‘acquire packages’ step.
The server had the following error:
[“L”,“ServerTasks-9026/af67ed98e10c46e1892128d920fae226/6cf5323e3ede41eda59755b901e7dd85/ddca7877fdce4c3997c55a175f86b70d”,“2015-02-05T08:26:38.0614975+00:00”,“Fatal”,“FileSender-Mg-AbklF82YDw@SQ-OLDDEV-A6932307”,“Upload of file C:\Octopus\OctopusServer\Repository\Packages\Runner\Runner0.1.58.nupkg with hash 0cba3f406d183287a743b24d0f09023cfcfb5433 to SQ-PC-2A5967BB failed”,“The actor FileReceiver-ATM-AbklGCbx6A@SQ-PC-2A5967BB cannot handle failure Pipefish.Messages.Supervision.StartedEvent\r\nSystem.InvalidOperationException: The actor FileReceiver-ATM-AbklGCbx6A@SQ-PC-2A5967BB cannot handle failure Pipefish.Messages.Supervision.StartedEvent\r\n at Pipefish.Actor.OnHandleFailedTyped[TBody](Message deliveryFailure, Message failedMessage, Error error) in y:\work\3cbe05672d69a231\source\Pipefish\Actor.cs:line 169\r\nOctopus.Server version 18.104.22.1686”]
The client correspondingly had this error:
2015-02-05 09:26:37.3249 ERROR Error in FileReceiver-ATA-AbklGB_K+A@SQ-PC-2A5967BB while receiving b1b0f6e5-ad10-11e4-a385-00155d057901
System.InvalidOperationException: The actor FileReceiver-ATA-AbklGB_K+A@SQ-PC-2A5967BB cannot handle failure Octopus.Shared.FileTransfer.SendStreamRequest
at Pipefish.Actor.OnHandleFailedTyped[TBody](Message deliveryFailure, Message failedMessage, Error error) in y:\work\3cbe05672d69a231\source\Pipefish\Actor.cs:line 169
We previously moved the Octopus server from the machine ‘OLDDEV’ to a new one, but hopefully the fact that the server activities have LoggerActorIds containing the old machine name doesn’t lead to any errors.
Retrying the deployment again led to the random hanging mentioned previously: the client log showed the service had started, but the server didn’t seem to get the message.
We have just (about) dropped 2.6.2. It includes some extra logging so we can try to track this down.
Let me know when you have the ability to upgrade, and I will give you instructions to capture some logs for us.
We have tried reinstalling the Tentacle on 2 of our machines.
One of them had some trouble uninstalling, as it failed to remove the Tentacle service, so the service had to be removed manually.
After the manual Tentacle reinstall we have not had the same issues as previously.
Previously we updated by pushing Tentacle updates via the Octopus server, so perhaps an issue exists there.
Thanks for letting us know. Please notify me if you notice this behavior starting again.