Deployment stuck taking lock on package

Chris_Tamlyn · 17 June 2020 07:45

Hi Octopus Team

We regularly run into an issue with deployments where the process gets stuck when deploying a package:
Deploying package: /home/docker/.octopus/OctopusServer/Files/Package@S1.0.431@3785128AA7D4204A948FEEF4DF9BF550.nupkg
June 17th 2020 01:07:35 Warning Lock file existed but was not readable, and has existed for longer than lock timeout. Taking lock.
June 17th 2020 01:09:35 Warning Lock file existed but was not readable, and has existed for longer than lock timeout. Taking lock.

These warnings seem to repeat indefinitely until we kill the process.

What is Octopus doing at this step and what causes this issue?
Is there a way to set up a maximum time? We use a burstable instance on AWS EC2, and the process is CPU-heavy and consumes all the credits if we don’t spot that it’s running.

donny.bell · 17 June 2020 14:18

Hi Chris,

Thank you for contacting Octopus Support. I’m sorry that you are having this issue.

A few questions for you:

Would you mind telling me what version of Octopus you are running?
Does this happen with every deployment for this project?
Could you provide an example raw task log that had the issue?

I look forward to hearing back from you.

Regards,
Donny

Chris_Tamlyn · 17 June 2020 14:39

Version: 2019.9.8

Not every run, but something like 1 in 3. An example below:

deploy-log.txt (30.4 KB)

donny.bell · 17 June 2020 20:08

Hi Chris,

Thank you for the quick response.

After taking a look at the log, I noticed:

00:05:38 Verbose | Another process is using the deployment journal

We’ve seen this issue before where Octopus was running into a permissions issue in the /tmp directory. I suggest double-checking that a different process is not interfering in this case.
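One way to check for an interfering process is to look for anything that currently has the lock file open. The following is a rough sketch only, assuming a Linux target with a /proc filesystem and a `readlink` command; the function name `lock_holders` is hypothetical. It only finds processes that hold the file open at that moment, so a process that rewrites the lock file and closes it again will not show up.

```shell
#!/bin/sh
# Sketch: list PIDs that currently have a given file open.
# Assumes Linux /proc and readlink; lock_holders is a hypothetical name.

# $1 = path of the lock file, $2 = proc root (defaults to /proc).
lock_holders() {
    lock="$1"
    proc_root="${2:-/proc}"
    for fd in "$proc_root"/[0-9]*/fd/*; do
        # Each entry under fd/ is a symlink to the file the process has open.
        [ "$(readlink "$fd" 2>/dev/null)" = "$lock" ] || continue
        pid="${fd#"$proc_root"/}"   # strip the proc root, leaving PID/fd/N
        echo "${pid%%/*}"           # keep only the PID
    done | sort -u
}
```

For example, `lock_holders /tmp/Octopus.Calamari.DeploymentJournal.lck` would print one PID per line. Run it as root or as the user that owns the processes, since the fd directories of other users are not readable.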

If that doesn’t turn anything up, let me know and we’ll dig deeper.

Regards,
Donny

Chris_Tamlyn · 22 June 2020 08:23

It looks like Octopus tasks are getting orphaned and keeping the file locked. When a deploy got stuck in the loop, I:

Cancelled the task in the UI, so that there were no running processes on that deployment target
Connected to the machine over SSH

I could see that the .lck file existed in the /tmp directory. Trying to forcibly remove it failed, so I checked the running processes. There was a mono task running deploy-package, which was surprising, as the UI said all tasks were cancelled. I manually killed it, and I could then remove the lock file and allow deployments to resume.

However, I don’t know how to automate this set of actions. Taking the lock seems to have no timeout and just goes on forever.

/tmp $ ls
Octopus.Calamari.DeploymentJournal.lck
/tmp $ rm -f Octopus.Calamari.DeploymentJournal.lck
/tmp $ ls
Octopus.Calamari.DeploymentJournal.lck
/tmp $ ps -a
PID USER TIME COMMAND
1 root 0:00 bash /entry.sh /usr/sbin/sshd -D -f /etc/ssh/sshd_config
8 root 0:00 /usr/sbin/sshd -D -f /etc/ssh/sshd_config
8105 root 0:00 sshd: docker [priv]
8107 docker 0:00 sshd: docker@pts/0
8108 docker 0:00 -sh
8378 docker 0:00 sh -c export TentacleHome="$HOME/.octopus/OctopusServer" export TentacleApplications="$HOME/.octo
8379 docker 0:00 /bin/bash /home/docker/.octopus/OctopusServer/Work/20200622080247-199323-3314/command.sh -variabl
8381 docker 0:10 mono /home/docker/.octopus/OctopusServer/Calamari/7.1.9/Calamari.exe deploy-package -package /hom
8438 docker 0:00 ps -a
/tmp $ kill 8381
/tmp $ ps -a
PID USER TIME COMMAND
1 root 0:00 bash /entry.sh /usr/sbin/sshd -D -f /etc/ssh/sshd_config
8 root 0:00 /usr/sbin/sshd -D -f /etc/ssh/sshd_config
8105 root 0:00 sshd: docker [priv]
8107 docker 0:00 sshd: docker@pts/0
8108 docker 0:00 -sh
8441 docker 0:00 ps -a
/tmp $ rm -f Octopus.Calamari.DeploymentJournal.lck
/tmp $ ls
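
The manual steps in the transcript above could be scripted roughly as follows. This is a sketch only, not an Octopus feature: it assumes a Linux /proc filesystem, takes the lock file path and the `Calamari.exe deploy-package` pattern from the transcript, and uses hypothetical function names (`calamari_pids`, `cleanup_orphans`). It cannot tell an orphaned process from a live deployment, so it should only run when nothing is expected to be deploying to the target.

```shell
#!/bin/sh
# Sketch: automate the manual cleanup shown in the transcript above.
# Assumes Linux /proc. Function names are hypothetical.

LOCK_FILE="${LOCK_FILE:-/tmp/Octopus.Calamari.DeploymentJournal.lck}"

# Print the PID of every process whose command line contains
# "Calamari.exe deploy-package". $1 = proc root (defaults to /proc).
calamari_pids() {
    proc_root="${1:-/proc}"
    for f in "$proc_root"/[0-9]*/cmdline; do
        [ -r "$f" ] || continue
        # Arguments in cmdline are NUL-separated; turn them into spaces.
        if tr '\0' ' ' < "$f" | grep -q 'Calamari\.exe deploy-package'; then
            basename "$(dirname "$f")"
        fi
    done
}

# Kill leftover Calamari processes, then remove the lock file.
# Only call this when no deployment should be running on the target.
cleanup_orphans() {
    for pid in $(calamari_pids); do
        echo "Killing orphaned Calamari process $pid"
        kill "$pid" 2>/dev/null || true
    done
    sleep 2  # give the processes a moment to exit
    rm -f "$LOCK_FILE"
}
```

Calling `cleanup_orphans` after cancelling a stuck task (or from a scheduled job during a window with no deployments) would repeat the kill-then-remove sequence above. It does not add a timeout to the lock itself.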

donny.bell · 22 June 2020 14:25

Hi Chris,

Thank you for the additional information.

It might be worth trying the self-contained Calamari, rather than the mono version, to see if there is any difference in behavior.

More info here

Let me know if using the self-contained version turns anything up.

Regards,
Donny

Chris_Tamlyn · 23 June 2020 09:39

Do you support Alpine Linux for the self-contained Calamari?

donny.bell · 23 June 2020 15:28

Hi Chris,

You can find a list of the Linux distros we automatically test against here: https://octopus.com/docs/infrastructure/deployment-targets/linux#supported-distributions

With regard to Alpine Linux, your options would be to use an Alpine image with .NET Core already on it or to install the prerequisites for .NET Core:
https://hub.docker.com/_/microsoft-dotnet-core-runtime/

Let me know if you have any more questions.

Regards,

system · 24 July 2020 15:28

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.