Octopus Polling Tentacles deployment time is much slower?

Hi,

We have a new environment that we are deploying to which uses polling tentacles, and we have noticed that deployment times are much, much slower, to the point of annoyance.

Both the deployment server and the tentacles are idling away and don’t seem overloaded in any way.

Strangely, the command times in the task log appear to be OK, but we have large gaps between:

  • Running “X” on “Y”
  • Beginning deployment…

We are seeing gaps of around 30-60s before the “Beginning deployment” line, and each part of the process in the task log is taking much longer compared with listening tentacles. Strangely, the task log is very slow to report that the variables have been substituted and the config transforms applied, yet the timings in the task log show that those steps completed in a second or less. What would cause the task log to update so slowly? Is this due to the polling tentacles?

Also, the package upload speeds seem slower. Is this a polling tentacle issue, or is it more likely to be related to our network?

Ideally we would like to stick with polling tentacles, but unfortunately so far we are seeing major performance problems.

Cheers,
Michael

This may have something to do with it. We are seeing these messages in the tentacle logs mixed in with the normal tentacle deployment logs:

2014-09-17 12:54:22.9433 ERROR Error posting to: "Octopus Server"
System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond X.X.X.X:10943
at System.Net.Sockets.TcpClient.Connect(String hostname, Int32 port)
at Pipefish.Transport.SecureTcp.Client.SecureTcpClient.Send(SecureTcpRequest request) in y:\work\3cbe05672d69a231\source\Pipefish.Transport.SecureTcp\Client\SecureTcpClient.cs:line 39
at Pipefish.Transport.SecureTcp.MessageExchange.Client.ClientWorker.PerformExchange() in y:\work\3cbe05672d69a231\source\Pipefish.Transport.SecureTcp\MessageExchange\Client\ClientWorker.cs:line 312
at Pipefish.Transport.SecureTcp.MessageExchange.Client.ClientWorker.Run() in y:\work\3cbe05672d69a231\source\Pipefish.Transport.SecureTcp\MessageExchange\Client\ClientWorker.cs:line 175

Is this the tentacle failing to poll the server, and therefore causing the delays we are seeing?

Presumably we will have to look at this on our side, but it seems odd that it works intermittently…
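
In the meantime we have been running a quick connectivity probe from one of the tentacle boxes to see whether the polling port is reachable at all, and how long a connect takes. It is just a rough sketch; the server name is a placeholder and 10943 is the port from the error above:

import socket
import time

SERVER = "our-octopus-server"   # placeholder for the Octopus Server host
PORT = 10943                    # polling port, as seen in the error above

def probe(host, port, attempts=10, timeout=10):
    # Repeatedly attempt a plain TCP connect and report how long each try takes.
    for i in range(attempts):
        start = time.time()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                print("attempt %d: connected in %.2fs" % (i + 1, time.time() - start))
        except OSError as exc:
            print("attempt %d: failed after %.2fs (%s)" % (i + 1, time.time() - start, exc))
        time.sleep(5)

probe(SERVER, PORT)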

We’ve been looking at this, and it appears that we have fixed the problem. In our Environments tab we had a lot of developer machines in test environments that were no longer connected or valid, which was causing the errors in the Octopus Server log and the tentacle logs. Were the polling tentacles slowed down by contention on the server as it kept checking for connections to those machines?

We have a mix of listening and polling tentacles, and most of the disabled machines were listening connections, with a small number (3-7?) being polling.

Since disabling these machines, polling tentacle deployments are now cranked up to 11; the speed is much better.

Questions:

  • How did we manage to kill polling tentacle deployments with all the broken machines? Listening tentacle deployments were still fast.
  • Were we just creating a long queue of work for the octopus server to get through?
  • Can we get more feedback on this type of connection issue in the portal, and on what effect it is having on our server?
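
For reference, this is roughly how we went looking for the stale machines afterwards. It is only a quick sketch against the REST API; the /api/machines/all endpoint, the X-Octopus-ApiKey header and the IsDisabled/Status field names are what we see on our install, so treat them as assumptions:

import json
import urllib.request

OCTOPUS_URL = "http://our-octopus-server"   # placeholder
API_KEY = "API-XXXXXXXXXXXXXXXX"            # placeholder API key

def get(path):
    # GET a JSON resource from the Octopus REST API.
    req = urllib.request.Request(OCTOPUS_URL + path,
                                 headers={"X-Octopus-ApiKey": API_KEY})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Endpoint and field names are assumptions from our install; other versions may differ.
for machine in get("/api/machines/all"):
    print("%-30s disabled=%s status=%s" % (machine["Name"],
                                           machine.get("IsDisabled"),
                                           machine.get("Status")))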

Cheers,
Michael

Hi Michael,

Thanks for getting in touch! You’ve pretty much hit the nail on the head with this one. Octopus doesn’t handle offline machines well at all.
It creates tasks to check their status, which is something we know about and plan to change once we figure out how.
This clogs up the task queue and definitely would have caused problems for a polling machine.
I would suggest putting a policy in place to disable machines that are no longer connected or valid.

Sorry that it caused you so many troubles.
Vanessa

Thanks for the info.

I suppose one option would be to run separate queues for different categories of work, e.g. Deployment and Maintenance, but that sounds like it could cause headaches of its own.
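
Just to illustrate what I mean (a toy sketch only, nothing to do with how Octopus actually works internally): with a queue per category of work, a backlog of health checks against dead machines would only ever back up its own queue.

import queue
import threading

deploy_q = queue.Queue()        # deployments
maintenance_q = queue.Queue()   # health checks and other housekeeping

def worker(name, q):
    # Drain one category of work; a backlog here never blocks the other queue.
    while True:
        task = q.get()
        print("[%s] %s" % (name, task))
        q.task_done()

threading.Thread(target=worker, args=("deploy", deploy_q), daemon=True).start()
threading.Thread(target=worker, args=("maintenance", maintenance_q), daemon=True).start()

for n in range(50):
    maintenance_q.put("health check machine-%d" % n)   # pile of checks against dead machines
deploy_q.put("deploy release 1.2.3")                   # still runs straight away
deploy_q.join()
maintenance_q.join()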

But I suppose this is a non-issue for most people as we shouldn’t have had so many broken tentacles… :slight_smile:

Cheers,
Michael

We are now on a go-slow again…

To add to this issue we are now seeing the following in the server logs:

Unhandled error when processing request from client
System.IO.IOException: Authentication failed because the remote party has closed the transport stream.
   at System.Net.Security.SslState.StartReadFrame(Byte[] buffer, Int32 readBytes, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ForceAuthentication(Boolean receiveFirst, Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ProcessAuthentication(LazyAsyncResult lazyResult)
   at Pipefish.Transport.SecureTcp.Server.SecureTcpServer.ExecuteRequest(TcpClient client) in y:\work\3cbe05672d69a231\source\Pipefish.Transport.SecureTcp\Server\SecureTcpServer.cs:line 111

We can’t even begin to diagnose which server is at fault, as there is no server name / IP in the logs…

To add to this, when we restart the Octopus Server, we see messages such as:

Warning: Rejecting connection: the client at X.X.X.X:PORT provided a certificate with thumbprint XXXXXXXXXXXXXX, which is associated with MACHINE but not configured for distribution

There seems to be a different set of polling tentacles in this list every time we restart the server, which is making it hard to debug. These tentacles have the correct thumbprint configured on the server, and deployments seem to work OK, just very slowly!

The health checks for these servers all pass OK. We have had some tentacles fail health checks when triggered manually, even though the connectivity tab reports that all is OK with the tentacle. After restarting the offending tentacle, it seemed to pass the manual health check.
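
To try to work out which machines the rejection warnings refer to, we have started cross-checking the thumbprints in the warnings against what is registered in Octopus. Another rough sketch; the server log path and the API endpoint/field names are assumptions based on our install:

import json
import re
import urllib.request

OCTOPUS_URL = "http://our-octopus-server"            # placeholder
API_KEY = "API-XXXXXXXXXXXXXXXX"                     # placeholder API key
SERVER_LOG = r"C:\Octopus\Logs\OctopusServer.txt"    # assumed server log path

def get(path):
    # GET a JSON resource from the Octopus REST API.
    req = urllib.request.Request(OCTOPUS_URL + path,
                                 headers={"X-Octopus-ApiKey": API_KEY})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Thumbprint -> machine name as registered in Octopus (field names are assumptions).
registered = {m.get("Thumbprint"): m["Name"] for m in get("/api/machines/all")}

# Pull thumbprints out of the "Rejecting connection" warnings and check each one
# against the registered machines.
pattern = re.compile(r"provided a certificate with thumbprint (\w+)")
with open(SERVER_LOG, encoding="utf-8", errors="ignore") as log:
    for line in log:
        match = pattern.search(line)
        if match:
            thumb = match.group(1)
            print("%s -> %s" % (thumb, registered.get(thumb, "NOT REGISTERED")))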

These links seem similar to our issue, but neither has a resolution:

http://help.octopusdeploy.com/discussions/problems/19302-release-gets-stuck-when-uploading-packages-to-polling-tentacle
http://help.octopusdeploy.com/discussions/problems/23622-polling-tentacles-dont-respond

Cheers,
Michael

To add to this, we have seen the Octopus Server crash twice within 3 days. This is the error we get in the event log:

Faulting application name: Octopus.Server.exe, version: 2.5.8.447, time stamp: 0x53fd1df5
Faulting module name: clr.dll, version: 4.0.30319.18033, time stamp: 0x50b5a6ba
Exception code: 0xc00000fd
Fault offset: 0x000000000000513b
Faulting process id: 0x350
Faulting application start time: 0x01cfd71af6b91dad
Faulting application path: C:\Program Files\Octopus Deploy\Octopus\Octopus.Server.exe
Faulting module path: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\clr.dll
Report Id: 9e47c297-4324-11e4-93f8-005056a65e4f
Faulting package full name: 
Faulting package-relative application ID:

So far we are attributing the crashes to the errors we are seeing, which eventually cause the server to fall over completely. These errors have only really appeared since we introduced environments with polling tentacles.

We are still getting the same errors mentioned in the previous posts, but it appears that once the deployment server is congested the tentacles start to fail, hence the slowness.

Cheers,
Michael

Hi Michael,

We are going to need to see the full server logs. If it is a default-location install, they can be found under c:\Octopus\Logs.
Feel free to either mark this thread as private or email them through to support at octopusdeploy dot com.

Vanessa