Waiting on scripts in tasks to finish

Hi,
A team has reported an issue that is doubling the execution time of their deployments.

For example, in one of the steps shown below, which runs on an on-prem server, the step waits for multiple other deployments to finish before it can start. The step below, which uploads a package, can't start until all the other server tasks are finished.


ServerTasks-3336012.log.txt (369.0 KB)

The variable ‘OctopusBypassDeploymentMutex’ is already enabled in all our projects.
Also, the max parallelism in our projects is set to 20. I believe this issue is unrelated to the max parallelism variable, since that variable only controls the maximum number of parallel runs within the same project.

I have attached the log from this deployment.
Octopus version is: 2022.4.8505.
The diagnostics system check shows no issues.

As it is an on-prem server, I think it may be an AV scanning issue, which we have seen before, or the deployment target may be overutilized, but I am not sure. I have raised this ticket to see what you think.

Kind Regards,
Micheál Power

Hey @mikepower79,

Thanks for reaching out and for the great question.

When it comes to parallelism, certain restrictions remain in place regardless of the value the max parallelism variable is set to.

I’m sure you’ve seen our docs relating to this, but I want to specifically call out the below statement:

OctopusBypassDeploymentMutex must be set at the project variable stage. It will allow for multiple processes to run at once on the target. Having said that, deployments of the same project to the same environment (and, if applicable, the same tenant) are not able to be run in parallel even when using this variable.

This means that it could be a case that through some combination of tenant/project/environment for the deployments, there may be something holding up the tasks which is causing this behaviour.
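To make that rule concrete, here is a minimal Python sketch that groups a list of deployments by project, environment, and tenant and reports any group with more than one entry, since per the statement above those cannot run in parallel. The field names, project names, and task IDs are hypothetical and purely for illustration; this is not an Octopus API.

```python
from collections import defaultdict


def find_serialized_groups(deployments):
    """Return deployments that share the same (project, environment, tenant).

    Per the documentation statement quoted above, such deployments cannot
    run in parallel, even with OctopusBypassDeploymentMutex set.
    """
    groups = defaultdict(list)
    for d in deployments:
        # tenant is optional: untenanted deployments use None
        key = (d["project"], d["environment"], d.get("tenant"))
        groups[key].append(d["task_id"])
    # Only groups with more than one deployment would be serialized
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical sample data, not taken from any real task log
queued = [
    {"task_id": "ServerTasks-1", "project": "Web", "environment": "Prod", "tenant": "TenantA"},
    {"task_id": "ServerTasks-2", "project": "Web", "environment": "Prod", "tenant": "TenantB"},
    {"task_id": "ServerTasks-3", "project": "Web", "environment": "Prod", "tenant": "TenantA"},
    {"task_id": "ServerTasks-4", "project": "Api", "environment": "Prod", "tenant": "TenantA"},
]

conflicts = find_serialized_groups(queued)
for key, ids in conflicts.items():
    print(key, ids)
```

In this sample, only the two "Web / Prod / TenantA" deployments land in the same group; the others differ in project or tenant.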

My best recommendation for getting to the bottom of this is to follow the daisy chain of tasks that are specified in the task log as “waiting for”, to see if any of these are attempting to use a resource that cannot be used in parallel per the information above.
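If the log is large, a small script can help with following that chain. The sketch below is only a rough aid: it assumes that the relevant lines contain the word "waiting" together with task IDs in the ServerTasks-NNN form, and the sample lines are made up (modelled on this thread's title), so the exact wording in your log may differ.

```python
import re
from collections import Counter

# Task IDs in the form seen in this thread, e.g. ServerTasks-3336012
TASK_ID = re.compile(r"ServerTasks-\d+")


def blocking_tasks(log_text):
    """Count the task IDs mentioned on lines that contain 'waiting'."""
    counts = Counter()
    for line in log_text.splitlines():
        # Assumption: blocked steps log a line containing the word 'waiting'
        if "waiting" in line.lower():
            counts.update(TASK_ID.findall(line))
    return counts


# Made-up sample lines; replace with the contents of the real task log
sample_log = """\
Waiting on scripts in tasks ServerTasks-101 and ServerTasks-102 to finish
Uploading package
Waiting on scripts in tasks ServerTasks-101 to finish
"""

blockers = blocking_tasks(sample_log)
for task_id, count in blockers.most_common():
    print(task_id, count)
```

The tasks that show up most often are a reasonable place to start looking, and the same script can then be run against each of their logs in turn.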

I hope this helps explain what’s happening further! If you’ve had a look through the deployments mentioned in the task log and can’t see anything that would cause this behaviour that lines up with our documentation, then please let me know and we’ll do our best to investigate further.

Kind Regards,
Adam

Hi @adam.hollow,
The issue is intermittent and does not occur every time, which is strange.
It also appears in the Acquire Packages step, where there are no scripts running.

Kind Regards,
Micheál Power

Hi @mikepower79,

The issue being intermittent could make sense as other deployments may be happening at certain times causing the issues and then not at other times, allowing deployments to run smoothly.

The acquire packages step, depending on acquisition location, still requires that the deployment target runs a process, which could conflict depending on the parameters I mentioned before.

The task logs mentioned in the screenshot you posted should provide further insight as to what is causing the blockage to happen at particular times and I believe would be the best data to look over for troubleshooting further.

Kind Regards,
Adam

Hi @adam.hollow,
We are now seeing this issue very frequently and need immediate assistance with it.
Because of it, deployments that are expected to complete within 15 minutes are taking hours.
Let me explain our current setup to give a better understanding of the issue.
We have Tentacles for the same environment that are tagged with multiple tenants.
If we trigger multiple projects in parallel for each tenant in the same environment (the same Tentacles are used across multiple projects), we get this issue. It does not occur on all Tentacles, though.
Our release trigger setup has been the same for a long time, and this specific issue only started a few days ago.
We can say that it is not a script-side issue, since it also occurs in the default Acquire Packages step.
We could not find any useful information in the task logs either.
