We’re trying to deploy a relatively small project to a large number (2000+) of deployment targets. The project is a collection of files, rather than an IIS application or anything else too complex.
So far, Octopus’s built-in tools for creating deployments have worked poorly for releases of this project. It’s almost impossible to guarantee that all 2000+ nodes are available and healthy, and by default a failure on one machine fails the whole deployment.
We experimented with guided failure mode, but this isn’t really practical either when a deployment contains 2000+ machines.
We’ve also seen that Octopus seems to treat each deployment as a single task, so despite having Octopus set up in HA mode with 4 server nodes and a task cap of 10 per server, Octopus was only ever running 1 “task” to deploy our project.
It’s probably worth mentioning that we’ve tried different variations of rolling deployment windows, max parallelism, etc., to try to mitigate some of these problems, but generally this hasn’t improved the situation.
We’ve started testing out using the API to create multiple deployments for this project, each one targeting a subset of the deployment targets, for example: instead of 1 deployment of 2000 machines, we create 5 deployments of 400 machines. Our hope was that when these deployments were created through the API, Octopus would recognise the multiple large deployments and load balance them across our 4 nodes.
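To make the splitting concrete, here is a minimal sketch of the batching step only. It assumes the deployment request body accepts `ReleaseId`, `EnvironmentId` and `SpecificMachineIds` fields, and that each payload would then be POSTed to `/api/deployments` with an `X-Octopus-ApiKey` header; those names should be checked against the API documentation for your server version. The helper names and the IDs (`Releases-123`, `Environments-1`, `Machines-N`) are hypothetical placeholders, and no HTTP call is made here.

```python
from typing import Dict, List


def split_into_batches(machine_ids: List[str], batch_size: int) -> List[List[str]]:
    """Split the full list of target IDs into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        machine_ids[start:start + batch_size]
        for start in range(0, len(machine_ids), batch_size)
    ]


def build_deployment_payloads(
    release_id: str,
    environment_id: str,
    machine_ids: List[str],
    batch_size: int,
) -> List[Dict[str, object]]:
    """Build one deployment request body per batch of targets.

    The field names below are an assumption about the deployment
    resource; verify them before sending anything to a real server.
    """
    return [
        {
            "ReleaseId": release_id,
            "EnvironmentId": environment_id,
            "SpecificMachineIds": batch,
        }
        for batch in split_into_batches(machine_ids, batch_size)
    ]


if __name__ == "__main__":
    # Hypothetical IDs standing in for 2000 real deployment targets.
    machines = [f"Machines-{n}" for n in range(1, 2001)]
    payloads = build_deployment_payloads(
        "Releases-123", "Environments-1", machines, batch_size=400
    )
    # 2000 targets in batches of 400 gives 5 payloads of 400 targets each.
    print(len(payloads), len(payloads[0]["SpecificMachineIds"]))
```

Keeping the batching separate from the HTTP call makes it easy to change the batch size (for example 40 deployments of 50 machines) without touching the code that submits them.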
While the deployments get created successfully, only 1 deployment for this project runs, while the others are queued. Once again, only 1 of the 10 task slots on 1 node is in use. I can, however, see that deployments for other projects seem to “skip” this queue and start deploying straight away.
How can we configure Octopus in such a way that multiple deployments of the same project to different targets aren’t stuck in their own queue? Ideally I’d like to be able to split this into many small deployments across all 4 nodes.
Additionally, is there any way to configure Octopus so that it ignores failures and continues the deployment, without using guided failure mode?
We’re running Octopus Server 2018.4.0, with Tentacles on 3.19.