Large package is causing deployment failure

Running Octopus Server 3.3.6

We have a project that we have been deploying for about a year now, with some environments using listening tentacles and some using polling tentacles. We are adding a new, very large package to deploy (about 350MB). The deployment seems to work fine for internal servers using listening or polling tentacles, but on external servers (where we use only polling tentacles), the deployment hangs for about 12 minutes attempting to acquire the large package, then fails on that attempt. It then attempts 5 retries, each of which fails after about 2 minutes. At that point the tentacle is also unreachable. The only way I have found to get it back up and running is to remote into the machine and manually restart the tentacle service. I have created a new project that does nothing but deploy the large package, and it always fails in the same way: 12 minutes of trying to deploy the package, followed by several quicker retry failures, after which the tentacle is unusable until it is restarted.

I do know that our Octopus server is not very quick at serving up the packages. It may take about 25 minutes for a package of that size to download, but I expected that Octopus would be able to handle that.

I am really not sure what we are running up against here. Is this some sort of timeout issue, or is there some sort of package size limitation? I am having a tough time narrowing down what the problem could be here.

Here is a copy of the Acquiring Packages step in the failure case:

Uploading package xxx
Beginning streaming transfer of xxx.1.11.8621.nupkg-c750c99f-6fbc-4130-adae-d0f52a546477
No response was received from the endpoint within the allowed time.
Halibut.HalibutClientException: No response was received from the endpoint within the allowed time.
Server stack trace: 
   at Halibut.ServiceModel.HalibutProxy.EnsureNotError(ResponseMessage responseMessage) in y:\work\7ab39c94136bc5c6\source\Halibut\ServiceModel\HalibutProxy.cs:line 86
   at Halibut.ServiceModel.HalibutProxy.Invoke(IMessage msg) in y:\work\7ab39c94136bc5c6\source\Halibut\ServiceModel\HalibutProxy.cs:line 37
Exception rethrown at [0]: 
   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
   at Octopus.Shared.Contracts.IFileTransferService.UploadFile(String remotePath, DataStream upload)
   at Octopus.Worker.Tentacles.TentacleRemoteEndpointFacade.UploadFile(String fileName, DataStream package) in Y:\work\refs\tags\3.3.6\source\Octopus.Worker\Tentacles\TentacleRemoteEndpointFacade.cs:line 78
Octopus.Server version 3.3.6 (3.3.6+Branch.master.Sha.ef711d651c56c0d0097bfecf797bb02a0abb1d00)
File upload failed. Retry attempt 1 of 5...
Beginning streaming transfer of xxx.1.11.8621.nupkg-c750c99f-6fbc-4130-adae-d0f52a546477
A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
Server exception: 
System.TimeoutException: A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
Halibut.HalibutClientException: A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
Server exception: 
System.TimeoutException: A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
Server stack trace: 
   at Halibut.ServiceModel.HalibutProxy.EnsureNotError(ResponseMessage responseMessage) in y:\work\7ab39c94136bc5c6\source\Halibut\ServiceModel\HalibutProxy.cs:line 86
   at Halibut.ServiceModel.HalibutProxy.Invoke(IMessage msg) in y:\work\7ab39c94136bc5c6\source\Halibut\ServiceModel\HalibutProxy.cs:line 37
Exception rethrown at [0]: 
   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
   at Octopus.Shared.Contracts.IFileTransferService.UploadFile(String remotePath, DataStream upload)
   at Octopus.Worker.Tentacles.TentacleRemoteEndpointFacade.UploadFile(String fileName, DataStream package) in Y:\work\refs\tags\3.3.6\source\Octopus.Worker\Tentacles\TentacleRemoteEndpointFacade.cs:line 78
Octopus.Server version 3.3.6 (3.3.6+Branch.master.Sha.ef711d651c56c0d0097bfecf797bb02a0abb1d00)

Any assistance would be appreciated.
Eben

Hi Eben,

I’m sorry you’re experiencing this issue.

The first thing that I think may be worth trying is increasing some internal timeout limits in our communications library (Halibut).

Could you try adding the config lines below to the appSettings section of Octopus.Server.exe.config, located by default at C:\Program Files\Octopus Deploy\Octopus:

<add key="Halibut.PollingRequestMaximumMessageProcessingTimeout" value="01:00:00"/>
<add key="Halibut.PollingRequestQueueTimeout" value="01:00:00"/>

This increases your polling timeouts to 1 hour. You can tweak the actual timeouts above as you prefer.
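
For picking a value other than 1 hour, a rough sketch like the following may help. It takes an observed transfer time (for example, the roughly 25 minutes mentioned earlier in this thread) and applies a safety margin. The function name and the default margin of 2x are illustrative assumptions, not anything Octopus or Halibut prescribes.

```python
from datetime import timedelta


def suggested_timeout(expected_transfer: timedelta, safety_factor: float = 2.0) -> str:
    """Return an hh:mm:ss string covering the expected transfer plus a margin.

    Hypothetical helper: the 2x default margin is an arbitrary starting point.
    """
    total_seconds = int(expected_transfer.total_seconds() * safety_factor)
    if total_seconds >= 24 * 3600:
        # Values of a day or more fall outside the hh:mm:ss form used in this thread.
        raise ValueError("timeout of 24 hours or more is not representable as hh:mm:ss")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# A package that takes about 25 minutes to transfer, with a 2x margin:
print(suggested_timeout(timedelta(minutes=25)))
```

The resulting string can be used as the `value` of the timeout keys shown above.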

It would also be interesting to see the results if you ran Tentacle Ping on these external machines.
In your case you would run Tentacle Pong on the server and Tentacle Ping from your external polling Tentacles.

Please let me know if the timeout configuration changes ease the problem, and if you do run Tentacle Ping please post the results.

I hope this helps,
Michael

I tested out the config file change, and that did resolve the issue. If I am understanding this correctly, though, that would mean that the polling tentacle would not time out for an hour no matter what is happening (i.e., it would take an hour to fail even if it did not respond at all).

Ideally, I would like to keep the unresponsive tentacle timeout down low, but raise the timeout for a tentacle that has responded and is just processing for a long time. I believe I accomplished that by setting the PollingRequestMaximumMessageProcessingTimeout to 1 hour, and removing the PollingRequestQueueTimeout setting, but I just wanted to verify that this is the correct approach.

As long as that seems like an okay approach to run with, I am happy with that solution for our problem.
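
The distinction between the two timeouts, as understood in this thread, can be sketched as a toy model. This is only an illustration based on the error messages and the description above, not Halibut's actual implementation; the function name and return strings are made up for the example. The 2-minute queue timeout is the value that appears in the log earlier in the thread.

```python
from datetime import timedelta

QUEUE_TIMEOUT = timedelta(minutes=2)      # "allowed time (00:02:00)" from the log above
PROCESSING_TIMEOUT = timedelta(hours=1)   # the raised value discussed in this thread


def request_outcome(collect_delay, processing_time,
                    queue_timeout=QUEUE_TIMEOUT,
                    processing_timeout=PROCESSING_TIMEOUT):
    """Toy model: classify a request sent to a polling tentacle.

    collect_delay   -- how long the tentacle took to pick up the request,
                       or None if it never did (unresponsive tentacle)
    processing_time -- how long the tentacle needed once it had the request
    """
    if collect_delay is None or collect_delay > queue_timeout:
        return "queue timeout"        # tentacle never collected the request
    if processing_time > processing_timeout:
        return "processing timeout"   # collected, but took too long to finish
    return "ok"


# Responsive tentacle, slow 25-minute package transfer: succeeds.
print(request_outcome(timedelta(seconds=5), timedelta(minutes=25)))
# Unresponsive tentacle: fails after the short queue timeout, not after an hour.
print(request_outcome(None, timedelta(0)))
```

Under this reading, raising only the processing timeout keeps failures for an unresponsive tentacle quick while allowing long transfers to complete.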

Thanks,
Eben

Eben,

That is correct. I agree with your approach.

One thing to keep in mind: those settings are overwritten when you upgrade the Octopus Server, so you will need to re-add them afterwards. I know this isn’t ideal, and it’s on our radar to resolve.
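
Until that is resolved, re-adding the settings after an upgrade could be scripted. The sketch below is one possible approach using Python's standard `xml.etree.ElementTree`; the function name is hypothetical, and it assumes the config file has a plain `<configuration>` root with an `<appSettings>` child, as described earlier in this thread. ElementTree discards XML comments when it rewrites a file, so take a backup of the config first.

```python
import xml.etree.ElementTree as ET

# Setting kept in this thread; adjust the value to suit your transfer times.
HALIBUT_SETTINGS = {
    "Halibut.PollingRequestMaximumMessageProcessingTimeout": "01:00:00",
}


def ensure_app_settings(config_path: str, settings: dict) -> bool:
    """Add or update <add key="..." value="..."/> entries under <appSettings>.

    Returns True if the file was modified, False if it was already up to date.
    """
    tree = ET.parse(config_path)
    root = tree.getroot()
    app_settings = root.find("appSettings")
    if app_settings is None:
        app_settings = ET.SubElement(root, "appSettings")

    changed = False
    for key, value in settings.items():
        # Look for an existing entry with the same key.
        existing = next(
            (node for node in app_settings.findall("add") if node.get("key") == key),
            None,
        )
        if existing is None:
            ET.SubElement(app_settings, "add", {"key": key, "value": value})
            changed = True
        elif existing.get("value") != value:
            existing.set("value", value)
            changed = True

    if changed:
        tree.write(config_path, encoding="utf-8", xml_declaration=True)
    return changed


# Example call (path is the default location mentioned earlier in the thread):
# ensure_app_settings(r"C:\Program Files\Octopus Deploy\Octopus\Octopus.Server.exe.config",
#                     HALIBUT_SETTINGS)
```

Running it twice is harmless: the second run finds the key already present and leaves the file untouched.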

Don’t hesitate to contact us if there’s anything else we can do to help.

Happy Deployments!

I will keep that in mind. Thanks for the help!

Even after changing the timeout, I am getting the same error:
Task ID: ServerTasks-101
Task status: Failed
Task queued: Saturday, August 27, 2016 12:55 AM
Task started: Saturday, August 27, 2016 12:55 AM
Task duration: 1 hour
Server version: 3.4.1+Branch.master.Sha.e8b55c0651d9edede9cac656dace196e18004a0d

                | == Failed: Deploy GVNet TEST release 30.1.15 to WIN-9E2CL1BVC1A ==

00:55:23 Verbose | Guided failure is not enabled for this task
01:55:23 Fatal | The deployment failed because one or more steps failed. Please see the deployment log for details.
|
| == Failed: Acquire packages ==
00:55:23 Info | Acquiring packages
00:55:23 Info | Making a list of packages to download
00:55:23 Verbose | Checking package cache for package Arthama.Web.GlobalvaluesWeb version 30.1.15
00:55:23 Verbose | Package Arthama.Web.GlobalvaluesWeb version 30.1.15 was found in cache. No need to download. Using file: D:\Octopus\OctopusServer\PackageCache\feeds-teamcity-feed\Arthama.Web.GlobalvaluesWeb.30.1.15_83573494B152554B9C518BA120E5E3D8.nupkg
00:55:23 Verbose | SHA1 hash of package Arthama.Web.GlobalvaluesWeb is: 218db4a20266600cd17b2d9213c22a2fde753663
01:55:23 Fatal | The step failed: Activity failed with error ‘A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (01:00:00), so the request timed out.
| Server exception:
| System.TimeoutException: A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (01:00:00), so the request timed out.’.
01:55:23 Verbose | Acquire Packages completed
|
| Failed: WIN-9E2CL1BVC1A
01:55:23 Fatal | A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (01:00:00), so the request timed out.
| Server exception:
| System.TimeoutException: A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (01:00:00), so the request timed out.
|
| Running: Upload package Arthama.Web.GlobalvaluesWeb version 30.1.15
|
| Canceled: Step 1: GVNet TEST Deploy Package
01:55:23 Verbose | Step “GVNet TEST Deploy Package” runs only when all previous steps succeeded; skipping
|

Hi Aruloli,

Is it possible your package is taking longer than 1 hour to download?

Have you tried increasing the timeout further?

After restarting the server, it works fine now.

Thanks
Aruloli Rajaram