Load-balanced scenario: getting Tentacles to communicate?

When deploying to our load-balanced servers, one or more nodes must be removed from the cluster before deployment and returned afterwards, as is the typical scenario. So when deploying to, say, node1, node1 is removed from the cluster (by calling a ps1 script, for example), the deployment proceeds, and the remaining steps are completed. Once the deployment on node1 is done (by done I mean all steps completed successfully), node1 must be returned to the cluster and the next node(s) must be removed from it.

In our case, in the live environment, we stagger the deployment across the nodes. For example, we deploy to node1, that box is returned to the cluster and node2 is removed. Node2 will remain off the cluster for n hours, usually 24. Once the customer is happy, we then deploy onto node2 and test, and again, when the customer is happy, node2 is rejoined to the cluster. Job done.
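To make the ordering concrete, here is a rough sketch of that sequence as a list of steps. It is only an illustration: the function name `staggered_rollout` and the action names are made up for this post and are not Octopus features, and extending the pattern beyond two nodes is my own assumption.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (action, node); action names are hypothetical labels


def staggered_rollout(nodes: List[str]) -> List[Step]:
    """Return the ordered steps for a staggered rollout across `nodes`."""
    if not nodes:
        return []
    # The first node leaves the cluster before anything is deployed.
    steps: List[Step] = [("remove_from_cluster", nodes[0])]
    for i, node in enumerate(nodes):
        if i > 0:
            # The node sits off the cluster (usually ~24 hours) until sign-off.
            steps.append(("await_customer_approval", node))
        steps.append(("deploy", node))
        if i + 1 < len(nodes):
            # The swap: these two steps need to happen back to back.
            steps.append(("return_to_cluster", node))
            steps.append(("remove_from_cluster", nodes[i + 1]))
        else:
            # Last node: test, get sign-off, then rejoin.
            steps.append(("test", node))
            steps.append(("await_customer_approval", node))
            steps.append(("return_to_cluster", node))
    return steps


if __name__ == "__main__":
    for action, node in staggered_rollout(["node1", "node2"]):
        print(f"{action}: {node}")
```

The part that matters for my question is the pair of adjacent `return_to_cluster` / `remove_from_cluster` steps, which involve two different machines.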

Manual intervention and email are required in our case, and Octopus can handle that, which is fantastic. However, the last piece of the puzzle is getting node1 to ‘signal’ (maybe via a step) that it is done with its deployment so that node2 (or many other nodes for that matter - let’s think big!) can do something (like execute a ps1 script or some code).

I appreciate that we can have multiple machines in an environment, and I initially thought of using steps to achieve this: the first step would remove the machine from the cluster and the last step would add it back. Manual steps strategically placed would mean node2 could remain off the cluster until someone completed the deployment. I’m not convinced that would work. My concern is that once node1 is completed and Octopus starts to deploy onto node2, there could be a considerable delay while the package is uploaded to node2, meaning both nodes are active at the same time. Toggling node1 on and node2 off needs to be rapid - 2 seconds or less.

Having each node in a separate environment may be possible; however, that would mean someone sitting there waiting for one deployment to complete before starting the next, which again causes a delay and introduces human error (like walking off for a coffee or a poo).

Any suggestions?


OK, I believe I’ve solved my specific scenario by using roles.