We have multiple workers configured with SSH connections, and we migrated most of our deployments to run on a Docker worker. Since then we have been experiencing disk space overflow: the Work folder is not being cleaned up after completed tasks. We decided to move our connection method from SSH to Tentacle. Later, we noticed that Tentacle was logging a message on the machine indicating a lack of permissions; it turned out that the data in the Work folder was owned by the root user. In the Octopus logs we can see that Docker mounts the appropriate files, the container is launched, and package extraction is performed, but the container runs as the root user. Consequently, unnecessary files remain on the machine that Octopus cannot delete. All of our deployments run as a non-root user. What can we do?
Some logs from the Octopus Server deployment process:
Great to hear from you, sorry to hear the execution containers running as root are causing your disks to fill up!
I’ll dig into this and confirm exactly what’s going on. Just to check, are you using our worker-tools image?
If you’d like to send through the full task logs, that should provide all the info needed to repro this. I’ve set up a secure link to upload them through our Upload Portal here; let me know if there are any issues with it!
Looking forward to getting this sorted, feel free to reach out with any questions!
We use our own Docker image, built as described in the Octopus documentation. Yesterday I ran your worker-tools image locally to check /etc/passwd, and no other user is provided. I assume that even when the Dockerfile contains a USER definition, there will still be an issue with mismatched UIDs: the host user and the user inside the container may have different UIDs.
Sending the complete logs will be a problem due to company security policy, but the issue can be reproduced.
Just jumping in for Finnian as he is currently off shift as part of our Australian based team.
Just to let you know, worker package retention runs periodically and is a system task on the machine, not a task performed by Octopus as part of a retention policy.
There is a forum post here which explains how it works, so I would take a look at that; I popped some log snippets in there too about what you can see and how it removes the packages from a worker.
At the moment, the only way to bulk-remove packages from a worker is to run a script that removes any files over X days old in the worker folder; you will need to account for each worker instance you have on that machine. You can then run that script either through Octopus as a runbook, or have Windows run it from the Task Scheduler so it runs every few hours or on start-up.
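As a rough illustration of such a script on a Linux worker (a sketch only: the folder path and age threshold below are placeholders, not Octopus defaults), something like this could be run per worker instance:

```shell
#!/usr/bin/env sh
# Hypothetical cleanup sketch: remove anything older than a given number of
# days from a worker's Work folder. Path and threshold are examples only.
cleanup_work_dir() {
    work_dir="$1"      # placeholder path, e.g. the instance's Work folder
    max_age_days="$2"  # e.g. 7
    # -mindepth 1 keeps the Work folder itself; -mtime +N matches entries
    # last modified more than N days ago.
    find "$work_dir" -mindepth 1 -mtime +"$max_age_days" -delete
}
```

One call per worker instance on the machine, scheduled from an Octopus runbook or a cron/Task Scheduler job.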
Having SSH workers or Tentacle ones won't make a difference, I am afraid, so the script option is the only one at the moment.
Let me know if you have any questions surrounding this at all as I would be happy to help.
Whilst I know that does not solve the issue of the container files filling up the drive, I thought I would update you on how package retention works on workers. The removal of the files is done on:
C:\Octopus\[Nameofworker]\Files\ (if on a windows machine)
So it does not affect the Work folder you mentioned, but hopefully that alleviates any concerns you had about package retention not running on workers.
At first, I thought it was a problem with the “Files” folder, but the issue lies with the “Work” folder. Running the process within the container context changes the owner of the files. This is a built-in behaviour in Octopus. In effect, it violates the principle of least privilege, which states that users should be granted only the minimal set of privileges necessary to perform their tasks.
Just an update from my investigation before the weekend, I haven’t quite reproduced it just yet but will be looking into it further.
I noticed the Docker Security docs mention that the Linux kernel capabilities include the ability to change a file's owner, so that could explain what's going on. Could you please confirm how you gave the non-root user permission to use the Docker socket?
One primary risk with running Docker containers is that the default set of capabilities and mounts given to a container may provide incomplete isolation, either independently, or when used in combination with kernel vulnerabilities.
I’ll keep you posted with any further updates or questions, feel free to reach out with any of your own!
On each Linux machine, a user named octopus is created and added to the docker group.
I believe there is an issue with running a Docker container through Octopus. The problem is simply the lack of parameterization in the “docker run” command, specifically regarding the “--user” argument.
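For illustration, the kind of parameterization meant here would look like the following (a sketch, not how Octopus actually invokes Docker; the mount path is hypothetical):

```shell
# Sketch only: pass the invoking account's UID:GID to "docker run" so files
# written to the mounted Work folder are owned by that account, not root.
uid_gid="$(id -u):$(id -g)"
# Printed rather than executed here; the mount path is an example.
echo docker run --rm --user "$uid_gid" \
    -v /path/to/Work:/work octopusdeploy/worker-tools:ubuntu.22.04
```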
Jumping in for Finnian as he is off shift at the moment and something rang a bell from your previous comment.
You did actually put a request in for this ages ago, which resulted in us creating this GitHub issue and subsequently fixing it to allow non-root users to run our Docker images.
But I recalled our lead engineer recently saying something about a non-root user and Docker, and I found the forum post where he mentioned what the issue was. Whilst this is not directly related to the Work folder, I do believe it is similar to the issue you are seeing, just on a different folder:
The problem stems from the fact that we don't add the octopus user to the subuid/subgid files - it's currently on the backlog, and the team will get this fixed up at the earliest opportunity.
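For reference, an entry of the kind the engineer is describing looks like the following (the range values are illustrative, and the exact fix Octopus ships may differ):

```shell
# Illustrative only: subordinate ID ranges for the "octopus" user, as they
# would appear in /etc/subuid and /etc/subgid:
#   octopus:100000:65536
# Such ranges can be added with usermod (requires root):
#   usermod --add-subuids 100000-165535 --add-subgids 100000-165535 octopus
```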
You mentioned:
Later, we noticed that Tentacle was logging a message on the machine indicating a lack of permissions; it turned out that the data in the Work folder was owned by the root user
So it might be the case that we need to add the octopus user to the Work folder too, alongside the subuid/subgid files?
I will add this to the discussion with Finnian and the engineers and see if that is the issue; if so, we can hopefully get it fixed for you and other users. I have not heard of other users having this issue, but it might be that they just have not noticed the Work folder not being cleaned out properly.
We will be in touch when we have an update for you,
Kind Regards,
Clare
Cheers for confirming that, that's the same process I've done on my end. However, I did notice the Docker docs indicate these users still have root-level permissions, so I thought I'd double check: Linux post-installation steps for Docker Engine | Docker Docs
The docker group grants root-level privileges to the user.
I’ve finally managed to reproduce the root-owned files that don’t get cleared, by adding a Package Reference to a Script step that’s configured to be extracted:
And using the following Dockerfile, I was able to get the files created in the container owned by the non-root user that I had added to the docker group, and I've confirmed they get cleaned up afterwards:
FROM octopusdeploy/worker-tools:ubuntu.22.04
# Add a new user "finnocto" with user id 1001
RUN useradd -u 1001 finnocto
# Change to non-root privilege
USER finnocto
I’ll keep digging into this and see if there’s an issue or improvement to be made here, as it’s not ideal having drives fill up, but hope that helps in the meantime!
Feel free to reach out with any questions or if I can clarify anything at all!
I tested your solution, and it works, but only under certain conditions: the user inside the container and the user on the Linux host must have the same UID. Otherwise, the deployment encounters permission issues with files.
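One way to satisfy that condition is to make the UID a build argument and pass the host user's UID at build time. This is a sketch building on the Dockerfile above; the ARG name and image tag are assumptions, not Octopus conventions:

```shell
# Sketch: build the worker image with a UID matching the host user, assuming
# the Dockerfile above is extended with a build argument:
#   FROM octopusdeploy/worker-tools:ubuntu.22.04
#   ARG UID=1001
#   RUN useradd -u "${UID}" finnocto
#   USER finnocto
host_uid="$(id -u)"
# Printed rather than executed here; the image tag is an example.
echo docker build --build-arg UID="$host_uid" -t worker-tools-nonroot .
```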