We have a large project with a lengthy deployment process that we run against multiple servers. We’ve been able to speed up our deployments significantly by using the “run in parallel” option on many of our steps.
The basic structure of the deployment process is to run all of the uninstall steps in parallel, wait for those to finish, then run all of the install steps in parallel. On an individual server level that is important because as part of the uninstall we run some cleanup tasks that have to finish before any installs start. Across multiple servers however it doesn’t matter - the install on Server A could happily start while the uninstall on Server B is still going.
We’re finding that some features don’t quite work the way we’d like them to for a parallel deployment across multiple servers. Can anyone suggest any patterns or settings we can use to improve these?
Most of our environments have multiple redundant servers. Currently, if one server fails during the uninstall, the whole deployment stops until someone investigates and rectifies the issue. Also, if one server is slower on the uninstall, the others stop and wait for it to catch up before starting the install. Unfortunately, that means our whole environment can be down for an extended period of time even though most of the servers could have continued deploying and been back online much earlier. We’ve tried running separate deployments to each machine, but Octopus queues them up instead of running them in parallel (presumably because we’re deploying the same project to the same environment). It would be great if there was an option we could set to tell Octopus to run the deployment to all of the machines in an environment in parallel but treat each machine separately for the purposes of handling the “run in parallel” vs. “wait for the previous step” settings.
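To make the cost of the shared wait concrete, here is a rough timing sketch. It is only an illustrative model and does not call Octopus or reflect how it works internally; the server names and durations are made-up assumptions. It compares the current behaviour (every server waits for the slowest uninstall) with the behaviour we would like (each server moves on as soon as its own uninstall is done).

```python
# Illustrative timing model only; it does not use any Octopus API.
from typing import Dict, Tuple

# Hypothetical per-server durations in minutes: (uninstall, install).
Durations = Dict[str, Tuple[int, int]]


def finish_times_global_barrier(durations: Durations) -> Dict[str, int]:
    """Every server waits for the slowest uninstall before any install starts."""
    barrier = max(uninstall for uninstall, _ in durations.values())
    return {name: barrier + install for name, (_, install) in durations.items()}


def finish_times_per_server(durations: Durations) -> Dict[str, int]:
    """Each server starts its install as soon as its own uninstall finishes."""
    return {
        name: uninstall + install
        for name, (uninstall, install) in durations.items()
    }


# Made-up example: ServerB has a slow uninstall.
example = {"ServerA": (5, 10), "ServerB": (20, 10), "ServerC": (5, 10)}

print(finish_times_global_barrier(example))
print(finish_times_per_server(example))
```

With these assumed numbers, the shared wait keeps all three servers offline for 30 minutes, whereas per-server handling would bring ServerA and ServerC back after 15 minutes and only ServerB would take the full 30.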
We find the “guided failure” mode very useful and use it for all our deploys. Unfortunately, the “failure guidance” dialogs currently don’t provide any information about what went wrong. We can click through to see the details in the log, but if you have a number of failure guidance dialogs from different steps on different machines at the same time, it can be very confusing… especially if more are popping up while you are trying to investigate! It would be great if the dialogs contained some basic information, like the machine and the action that failed, without needing to look at the log.
Thanks for getting in touch!
Deployments failing on some of the targets
Unfortunately, there is nothing in Octopus that would let you continue in case of a failure on one of the targets. The only workaround I can think of is to split your environment into two, one for active targets and one for redundant targets, and then deploy to them separately. This is similar to how you would implement Blue-green deployments (http://docs.octopusdeploy.com/display/OD/Blue-green+deployments). You could also have a look at Rolling deployments and see whether that would provide you with a better solution (http://docs.octopusdeploy.com/display/OD/Rolling+deployments).
Slow parallel deployments
What you could do here is convert all your install and uninstall steps into child steps (http://docs.octopusdeploy.com/display/OD/Rolling+deployments#Rollingdeployments-Childsteps) of a single top-level step. This way, all child steps will be executed sequentially on a given target, but the top-level step will still be executed in parallel on all targets.
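The execution order described above can be pictured with a small sketch. This is a plain Python illustration of "sequential per target, parallel across targets", not Octopus code; the function names and the placeholder work inside each step are assumptions made for the example.

```python
# Illustrative model of the scheduling only; not an Octopus API.
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Sequence


def deploy_target(target: str, child_steps: Sequence[str]) -> List[str]:
    """Run the child steps one after another on a single target."""
    completed = []
    for step in child_steps:
        # Placeholder for the real work of the step on this target.
        completed.append(f"{target}:{step}")
    return completed


def deploy_all(targets: Sequence[str], child_steps: Sequence[str]) -> Dict[str, List[str]]:
    """Run the top-level step on all targets in parallel, one worker per target."""
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = {t: pool.submit(deploy_target, t, child_steps) for t in targets}
        return {t: f.result() for t, f in futures.items()}


results = deploy_all(["ServerA", "ServerB"], ["uninstall", "install"])
print(results)
```

In this model a slow uninstall on one target delays only that target's install, because no target waits for another between child steps.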
Thank you for your suggestions. I will pass them to the team.
Please let me know if this doesn’t solve your problem.
Thank you for the suggestions! We’re always on the lookout for ways to better leverage the features of Octopus.
We have considered the option of converting everything into child actions, however that would prevent us from parallelising tasks within the “install” and “uninstall” phases that we actually want to run in parallel (e.g. uninstalling different applications).
What we really need is the ability to use rolling deployments at the project level rather than the step level. E.g. If we had 4 servers and a window size of 2, we’d like Octopus to run through all of the steps in the project on those first 2 servers (let’s call them Group A), then move onto the other 2 (Group B) once Group A is complete. Alternatively, if we had the same parallelism controls on actions that we have on steps (“wait for the previous step” / “run in parallel with the previous step”) then we could convert everything into child actions as you’ve suggested.
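As a rough illustration of the project-level rolling behaviour we have in mind, the sketch below splits the targets into groups of the window size and runs every step on one group before moving on to the next. It is only a model of the ordering we would like; the function names, server names, and step names are hypothetical, and nothing here is an existing Octopus feature or API.

```python
# Illustrative model of the requested behaviour; not an existing Octopus feature.
from typing import List, Sequence, Tuple


def rolling_groups(targets: Sequence[str], window_size: int) -> List[List[str]]:
    """Split the targets into consecutive groups of at most window_size."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    return [
        list(targets[i:i + window_size])
        for i in range(0, len(targets), window_size)
    ]


def rolling_plan(
    targets: Sequence[str], window_size: int, steps: Sequence[str]
) -> List[Tuple[Tuple[str, ...], str]]:
    """List (group, step) pairs in order: all steps on one group, then the next."""
    plan = []
    for group in rolling_groups(targets, window_size):
        for step in steps:
            plan.append((tuple(group), step))
    return plan


servers = ["Server1", "Server2", "Server3", "Server4"]
for group, step in rolling_plan(servers, 2, ["uninstall", "install"]):
    print(group, step)
```

With 4 servers and a window size of 2, Group A (Server1 and Server2) runs both the uninstall and the install before Group B (Server3 and Server4) starts anything.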
The best option that we’ve come up with so far is to fire off two deployments to the same environment, one with only the Group A targets selected and the other with only the Group B targets selected. Octopus queues the second deployment, so it won’t start until all of our Group A servers are back online. It’s similar to the Blue/Green deployment configuration but doesn’t require us to duplicate the environment (and scope all of the variables twice!), and it also takes care of waiting for the first group to finish before starting the second group.
It still feels like a bit of a hack, though.
Thanks for your reply. Unfortunately, Octopus doesn’t support this scenario at the moment. Would you mind describing your ideal solution on UserVoice (https://octopusdeploy.uservoice.com)?