Tentacle loses connection with server mid-deploy

We have a few Tentacles deployed in a client’s environment to receive / install some software we create. For ease-of-use we’ve configured them in Listening mode (about 10% of our Deployment Targets use this mode).
Quite often deployments to this particular client fail (not necessarily on the same step).
Previously, a failed deployment would simply hang, but since our recent update to build 13113 it times out instead.

I have to say that troubleshooting is quite difficult: we don’t have direct access to the client system, and there isn’t a facility in OD to retrieve client logs. (Although after building a troubleshooting Project to simply read the logs into the deployment log, nothing shows up there anyway.)

From the server point-of-view, it seems like the client just never picks up the instructions for the next step?
At the moment the client’s antivirus/security solution is a suspect, but I’m told the Octopus directory has been excluded from its actions, so I’m not sure where to look now.

Can you suggest anything? We have build 13113 and the relevant Tentacles are all up-to-date. The problem appears to affect at least 2, probably all 4 Tentacles at this particular client. They’re using Azure VMs, and Tentacle health checks show all green.

Here’s the log from the currently failing step (this is after successfully running some PowerShell script steps, as well as one to install and configure an IIS site):


  Info     |       A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
           |       Server exception:
           |       System.TimeoutException: A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
  Info     |       Retry (attempt 1)
  Info     |       A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
           |       Server exception:
           |       System.TimeoutException: A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
  Info     |       Wait for 15 seconds
  Retry    |       2023-08-01T18:27:50.8058521+00:00
  Info     |       Retry (attempt 2)
  Info     |       A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
           |       Server exception:
           |       System.TimeoutException: A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
  Info     |       Wait for 15 seconds
  Retry    |       2023-08-01T18:30:05.8169776+00:00
  Info     |       Retry (attempt 3)
  Verbose  |       A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
           |       Server exception:
           |       System.TimeoutException: A request was sent to a polling endpoint, but the polling endpoint did not collect the request within the allowed time (00:02:00), so the request timed out.
           |       Halibut.HalibutClientException
           |       at Octopus.Server.Orchestration.Targets.Tentacles.Observability.HalibutTentacleRpcTimerProxy.Invoke(MethodInfo targetMethod, Object[] args) in HalibutTentacleRPCTimerProxy.cs:line 59
           |       at generatedProxy_3.StartScript(StartScriptCommand )
           |       at Octopus.Server.Orchestration.Targets.Tentacles.TentacleRemoteEndpointFacade.ExecuteCommand(StartScriptCommand command, ITaskLog taskLog, CancellationToken cancellationToken) in TentacleRemoteEndpointFacade.cs:line 63
           |       at Octopus.Server.Orchestration.Targets.Tentacles.Observability.ErrorLoggingRemoteEndpointFacadeDecorator.ExecuteCommand(StartScriptCommand command, ITaskLog taskLog, CancellationToken cancellationToken) in ErrorLoggingRemoteEndpointFacadeDecorator.cs:line 72
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.ActionExecution.Immediate.ExecutionTargets.TentacleExecutionTarget.Execute(ScriptCollection bootstrapperScripts, IReadOnlyList`1 bootstrapperArguments, IReadOnlyList`1 files, Nullable`1 forceIsolationLevel, Boolean raw, ITaskLog taskLog, String isolationMutexName, CancellationToken cancellationToken, Nullable`1 isolationMutexTimeout) in TentacleExecutionTarget.cs:line 70
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.ActionExecution.Immediate.ImmediateExecutor.ExecuteRawScript(ScriptCollection scripts, IReadOnlyList`1 files, Boolean isRaw, ITaskLog taskLog, CancellationToken cancellationToken, Nullable`1 isolationMutexTimeout, ExecutionIsolation isolationLevel) in ImmediateExecutor.cs:line 118
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.ActionExecution.Immediate.ImmediateExecutor.ExecuteCalamari(CalamariFlavour calamariFlavour, String calamariCommand, IReadOnlyList`1 calamariArguments, IReadOnlyList`1 files, IReadOnlyList`1 deploymentTools, VariableCollection extraVariables, TargetManifest targetManifest, CalamariPlatformConstraint calamariPlatformConstraint, Nullable`1 isolationMutexTimeout, String isolationMutexName, ITaskLog taskLog, CancellationToken cancellationToken) in ImmediateExecutor.cs:line 176
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.ActionExecution.CommandBuilders.CalamariCommandBuilder.Execute(ITaskLog taskLog, CancellationToken cancellationToken) in CalamariCommandBuilder.cs:line 163
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.Steps.Script.ScriptActionHandler.ExecuteWithDefaultScriptHandler(IActionHandlerContext context, ITaskLog taskLog, CancellationToken cancellationToken) in ScriptActionHandler.cs:line 64
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.Steps.Script.ScriptActionHandler.Execute(IActionHandlerContext context, ITaskLog taskLog, CancellationToken cancellationToken) in ScriptActionHandler.cs:line 45
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.ActionDispatch.NewActionDispatcher.ExecuteAction(ActionCommand command, DeploymentTarget inContextOf, TargetManifest targetManifest, ITaskLog taskLog, IActionHandler handler, IActionHandlerContext handlerContext, CancellationToken ct) in NewActionDispatcher.cs:line 331
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.ActionDispatch.NewActionDispatcher.<>c__DisplayClass20_0.<Execute in NewActionDispatcher.cs:line 298
           |       at Octopus.Server.Orchestration.ServerTasks.ActionRetry.Execute(Func`2 callback, ITaskLog taskLog, TargetManifest targetManifest, IActionAndTargetScopedVariables variablesSnapshot, PlannedAction action, CancellationToken cancellationToken) in ActionRetry.cs:line 51
           |       at Octopus.Server.Orchestration.ServerTasks.ActionRetry.Execute(Func`2 callback, ITaskLog taskLog, TargetManifest targetManifest, IActionAndTargetScopedVariables variablesSnapshot, PlannedAction action, CancellationToken cancellationToken) in ActionRetry.cs:line 38
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.ActionDispatch.NewActionDispatcher.<>c__DisplayClass20_0.<Execute in NewActionDispatcher.cs:line 305
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.Guidance.ExecuteWithoutGuidance(Func`2 callback, String actionName, Boolean actionIsRequiredToRun, ITaskLog taskLog, CancellationToken cancellationToken) in Guidance.cs:line 144
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.Guidance.Execute(Func`2 callback, String actionName, Boolean actionIsRequiredToRun, ITaskLog taskLog, Action callbackOnExclude, CancellationToken cancellationToken) in Guidance.cs:line 74
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.ActionDispatch.NewActionDispatcher.Execute(ActionCommand command, DeploymentTarget inContextOf, IExecutor executor, TargetManifest targetManifest, Maybe`1 stagedPackagePath, IEnumerable`1 packageAcquisitionInformation, IActionAndTargetScopedVariables variablesSnapshot, ITaskLog taskLog, IActionHandler handler, Action guidanceExcludeCallback, CancellationToken cancellationToken) in NewActionDispatcher.cs:line 311
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.ActionDispatch.NewActionDispatcher.ExecuteOnDeploymentTarget(ActionCommand command, DeploymentTarget deploymentTarget, Action guidanceExcludeCallback, ITaskLog taskLogForTarget, ITaskLog taskLogRoot, IActionHandler handler, CancellationToken cancellationToken) in NewActionDispatcher.cs:line 169
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.ActionDispatch.NewActionDispatcher.Dispatch(ActionCommand command, DeploymentTarget deploymentTarget, ITaskLog taskLogForTarget, ITaskLog taskLogRoot, IActionHandlerResolver actionHandlerResolver, Action guidanceExcludeCallback, CancellationToken cancellationToken) in NewActionDispatcher.cs:line 125
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.PlannedStepControllers.ProcessStepController.<>c__DisplayClass8_2.<TryExecuteActionAndInitLoggingContext in ProcessStepController.cs:line 297
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.Guidance.ExecuteWithoutGuidance(Func`2 callback, String actionName, Boolean actionIsRequiredToRun, ITaskLog taskLog, CancellationToken cancellationToken) in Guidance.cs:line 144
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.Guidance.Execute(Func`2 callback, String actionName, Boolean actionIsRequiredToRun, ITaskLog taskLog, Action callbackOnExclude, CancellationToken cancellationToken) in Guidance.cs:line 74
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.PlannedStepControllers.ProcessStepController.<>c__DisplayClass8_1.<TryExecuteActionAndInitLoggingContext in ProcessStepController.cs:line 303
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.TransientErrorDetectionExecutor.Execute(Func`2 action, ExecutionPlan plan, ITaskLog taskLog, CancellationToken cancellationToken, DeploymentTarget deploymentTarget) in TransientErrorDetectionExecutor.cs:line 50
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.TransientErrorDetectionExecutor.Execute(Func`2 action, ExecutionPlan plan, ITaskLog taskLog, CancellationToken cancellationToken, DeploymentTarget deploymentTarget) in TransientErrorDetectionExecutor.cs:line 50
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.PlannedStepControllers.ProcessStepController.<>c__DisplayClass8_0.<TryExecuteActionAndInitLoggingContext in ProcessStepController.cs:line 308
           |       at Octopus.Server.Infrastructure.Orchestration.UnitsOfWork.UnitOfWorkExecutor.<>c__DisplayClass7_0`4.<Execute in UnitOfWorkExecutor.cs:line 115
           |       at Octopus.Core.Infrastructure.UnitsOfWork.UnitOfWorkExtensionMethods.Do(IUnitOfWork unitOfWork, Func`1 action, CancellationToken cancellationToken, String name) in UnitOfWorkExtensionMethods.cs:line 92
           |       at Octopus.Core.Infrastructure.UnitsOfWork.UnitOfWorkExtensionMethods.Do(IUnitOfWork unitOfWork, Func`1 action, CancellationToken cancellationToken, String name) in UnitOfWorkExtensionMethods.cs:line 92
           |       at Octopus.Server.Infrastructure.Orchestration.UnitsOfWork.UnitOfWorkExecutor.Execute[T1,T2,T3,T4](Func`6 action, CancellationToken cancellationToken, String name) in UnitOfWorkExecutor.cs:line 118
           |       at Octopus.Server.Orchestration.ServerTasks.Deploy.PlannedStepControllers.ProcessStepController.TryExecuteActionAndInitLoggingContext(ExecutionPlan plan, ExecutionPlanner planner, PlannedStep step, DeploymentTarget targetContext, PlannedAction action, ITaskLog taskLogForTarget, ITaskLog taskLogRoot, CancellationToken cancellationToken) in ProcessStepController.cs:line 329

Good morning @Chris_P_Pleasant,

Thank you for contacting Octopus Support. I am sorry to hear that one of your targets is timing out when picking up the deployment from the server.

This could be down to AV blocking the files from getting onto the Tentacle, but if that were the case we would not usually see timeouts. When AV is suspected, we tend to have customers temporarily disable the AV agent on the affected machine. It's not always clear whether the correct folder has been whitelisted, so disabling AV altogether for a short time on that machine rules it out completely.

Do you have quite a lot of deployments going to this target at similar times? If so, that could explain the timeouts: a Tentacle that receives a request from the Octopus Server while it is already processing one will queue the new request, because by default Tentacles run deployments one at a time. By the time it gets to a given request, that request may already have timed out on the Octopus Server, as the server has not received feedback that the Tentacle has started the task.
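To make the queueing effect concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not of how Octopus Server or Tentacle are implemented. It assumes every request arrives at the same moment, that the target runs them strictly one at a time, and that a request which is not started within the pickup limit is dropped. The function name and the task durations are made up for illustration; the 120-second limit matches the 00:02:00 shown in your log.

```python
from typing import List


def requests_that_time_out(task_durations: List[float], pickup_timeout: float) -> List[int]:
    """Return the indexes of queued requests that are not started in time.

    Toy model: all requests arrive at t=0 and run one after another.
    """
    timed_out = []
    start_time = 0.0  # when the next queued request could begin
    for index, duration in enumerate(task_durations):
        if start_time > pickup_timeout:
            # Assumption: a request that has timed out is dropped and never runs.
            timed_out.append(index)
            continue
        start_time += duration
    return timed_out


# Three 90-second deployments queued at once against a 120-second pickup limit.
print(requests_that_time_out([90, 90, 90], pickup_timeout=120))  # prints [2]
```

In this model the third request cannot start until 180 seconds have passed, so it exceeds the limit even though nothing is wrong with the network or the target.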

If you have a fair few deployments running to this Tentacle, it may be worth looking at our OctopusBypassDeploymentMutex variable, which you can get some more information about here. Setting that on this project will make the Tentacle run tasks in parallel, which should alleviate that issue.

I will say, though, that this will impact the RAM and CPU of the target, as it is processing more tasks simultaneously. If they are low-spec targets (e.g. 4GB RAM, 2 CPUs), it might slow the Tentacle down and you may start seeing I/O errors on deployments.

Also, IIS deployment tasks will not run in parallel even if you set the variable. This is an IIS limitation, so there is nothing Octopus can do about that.

The other thing you can try, if you are running a self-hosted Octopus Server (not our Cloud version), would be to install TentaclePing on your Octopus Server and ping each client to see if there are any dropouts. It could be a flaky network if most of those client machines are having the issue.
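If installing a tool is not an option, a rough stand-in for that kind of check is a small script that opens a plain TCP connection repeatedly and records which attempts fail. The Python sketch below does only that. It is not TentaclePing and does not speak the Tentacle protocol, so a successful connection shows that the port is reachable, not that the Tentacle is healthy. The function name `probe` and the host name in the usage comment are hypothetical; 10933 is the default port for a Listening Tentacle.

```python
import socket
import time
from typing import List


def probe(host: str, port: int, attempts: int = 10,
          timeout: float = 5.0, delay: float = 1.0) -> List[bool]:
    """Open a TCP connection `attempts` times; True means the connect succeeded."""
    results = []
    for attempt in range(attempts):
        try:
            # create_connection resolves the host and connects, or raises OSError.
            with socket.create_connection((host, port), timeout=timeout):
                results.append(True)
        except OSError:
            # Covers refused connections, timeouts and DNS failures.
            results.append(False)
        if attempt < attempts - 1:
            time.sleep(delay)
    return results


# Example usage (hypothetical host name):
#   outcome = probe("tentacle.example.com", 10933, attempts=30)
#   print(f"{outcome.count(False)} of {len(outcome)} attempts failed")
```

Run it from the machine that initiates the connection. For a Listening Tentacle that is the Octopus Server connecting to the target; for a Polling Tentacle it is the target connecting to the server's polling port (10943 by default), so the script would need to run on the client machine instead.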

The other thing to mention is our step retry feature, which was created for this exact type of scenario. You can set it on the step, and the step will simply be retried if it fails.

Does that information help at all?

Kind Regards,
Clare
