Hi all,
I’ve recently started using Octopus Deploy in an AWS EC2 environment and have some questions for the community about how to approach certain issues. To contribute, I’ve also listed the approach I’m currently using (or plan to use) for each issue, although I make no promises that any of them is the best approach. If you have a better approach that you would like to share, or advice regarding one of my approaches, I would very much appreciate hearing your feedback.
Automatic Tentacle Registration on Auto Scale:
I basically used the script at http://octopusdeploy.com/blog/auto-provision-ec2-instances-with-tentacle-installed. The script gets passed to the newly launched instance by way of “user data” that is specified in the auto-scale group’s launch configuration. I made some minor changes so that the name used in the registration is the EC2 instance ID and not the hostname. I also added an automatic download/install of the octopustools.zip file, and some functionality so that the environment/role/project associated with the instance is pulled from AWS instance Tags. That way the same script can be used for all systems, with only the instance Tags changing.
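To make the tag-driven registration concrete, here is a minimal sketch in Python (my actual script is PowerShell, so treat this as an illustration of the logic only). The tag names `OctopusEnvironment` and `OctopusRoles` are hypothetical, and the `register-with` flags follow the ones used in the linked blog script, so check them against your Tentacle version.

```python
def build_register_args(instance_id, tags, server_url, api_key):
    """Build the Tentacle 'register-with' argument list from EC2 instance tags.

    'tags' is a plain dict of tag name -> value. The tag names used here
    (OctopusEnvironment, OctopusRoles) are hypothetical examples.
    """
    environment = tags["OctopusEnvironment"]
    # Roles are stored as one comma-separated tag value, e.g. "web, api"
    roles = [r.strip() for r in tags["OctopusRoles"].split(",") if r.strip()]

    args = [
        "register-with",
        "--instance", "Tentacle",
        "--server", server_url,
        "--apiKey", api_key,
        "--name", instance_id,  # register under the EC2 instance ID, not the hostname
        "--environment", environment,
    ]
    for role in roles:
        args += ["--role", role]
    args += ["--comms-style", "TentaclePassive", "--force", "--console"]
    return args
```

The returned list would be passed to Tentacle.exe on the instance. Because only the tags differ between systems, the same user data script works for every launch configuration.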
Automatic Tentacle Deregistering on Scale-down/Termination:
Most of the approaches I saw suggested using a clean-up PowerShell script that runs prior to a deployment and removes any orphaned Tentacles. I don’t think this is a good approach, because a temporary period of network downtime between Octopus and a Tentacle could cause Octopus to remove the Tentacle even though the associated instance is still in AWS and receiving traffic. The affected instance would then stop receiving deployments and its code would become stale. My thought was to leverage the notification functionality built into auto-scale groups. When a termination or scale-down event occurs, the notification would be recorded in an SQS queue. A PowerShell script, run periodically from Task Scheduler on the Octopus server, would then review this queue, deregister the affected instances, and remove the entries from the queue. My concern with this approach is that there may still be edge cases where AWS and Octopus don’t agree on what the environment looks like.
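A rough sketch of the queue-processing logic, in Python rather than PowerShell. It assumes the notification reaches SQS through an SNS topic, so the Auto Scaling message arrives as a JSON string under `Message`, with `Event` and `EC2InstanceId` fields. The `deregister` callable is a hypothetical stand-in for whatever call removes the machine from Octopus.

```python
import json

TERMINATE_EVENT = "autoscaling:EC2_INSTANCE_TERMINATE"


def terminated_instance_id(sqs_body):
    """Return the instance ID if the SQS body is a termination notification, else None."""
    try:
        envelope = json.loads(sqs_body)
        # SNS wraps the Auto Scaling notification as a JSON string under "Message"
        inner = envelope.get("Message", envelope)
        message = json.loads(inner) if isinstance(inner, str) else inner
    except (ValueError, TypeError, AttributeError):
        return None
    if not isinstance(message, dict) or message.get("Event") != TERMINATE_EVENT:
        return None
    return message.get("EC2InstanceId")


def process_notifications(bodies, deregister):
    """Deregister each terminated instance and return the IDs that were handled.

    'deregister' is a hypothetical callable taking an instance ID and returning
    True on success. Only handled messages should be deleted from the queue.
    """
    handled = []
    for body in bodies:
        instance_id = terminated_instance_id(body)
        if instance_id and deregister(instance_id):
            handled.append(instance_id)
    return handled
```

Deleting a queue entry only after a successful deregistration means a failed call is retried on the next scheduled run, which should narrow (though not eliminate) the window where AWS and Octopus disagree.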
Automatic Deploy of Software on Auto Registered Tentacles:
The same “user data” PowerShell script that performs the Tentacle registration also initiates a deploy of the latest software. This idea was mostly taken from the script posted at http://www.codeproject.com/Articles/719801/AWS-Deployment-With-Octopus-Deploy. One of my concerns with this approach is that a rapid scale-up event could be slow, as the deployments happen serially, one instance at a time. An alternate method might be to remove the instance-specific deploy and simply do a full deploy to Tentacles that don’t have the latest code. The issue with this is that it could be more fragile, as a single unreachable Tentacle causes the full deployment to fail. Does anyone have an approach that addresses both of these?
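One possible middle ground, sketched in Python under heavy assumptions: decide up front which Tentacles need the release, and leave unreachable ones out so they cannot fail the whole deployment. The machine records here are hypothetical dicts with `name`, `version`, and `reachable` keys; how you would obtain that data from Octopus is not shown.

```python
def machines_needing_deploy(machines, latest_version):
    """Split out-of-date machines into deploy targets and skipped (unreachable) ones.

    Each machine is a hypothetical dict: {"name": str, "version": str, "reachable": bool}.
    Machines already on latest_version are ignored entirely.
    """
    targets, skipped = [], []
    for machine in machines:
        if machine["version"] == latest_version:
            continue  # already up to date
        if machine["reachable"]:
            targets.append(machine["name"])
        else:
            skipped.append(machine["name"])  # revisit on a later run
    return targets, skipped
```

The `targets` list could then feed a single deployment restricted to those machines, which avoids the one-at-a-time cost, while `skipped` is retried later.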
Blue/Green Testing:
I’m interested to hear how people are doing this in large AWS environments with Octopus Deploy. Initially I was thinking that I could have two stacks and switch between them with DNS, but this can be very expensive if the environment is large and you operate in the 2N mode for very long. One approach that I was bouncing around was to do the following:
- One auto-scale group for Green and one for Blue, each with its own load balancer
- By default the non-active group is configured with min/max instances set to 1
- When a production deploy is initiated from Octopus, a PowerShell script runs that does the following:
  - changes min for the non-active group to the current number of servers in the active group plus some fudge factor
  - sets max to whatever the max of the active group is
  - waits for all instances to launch
  - performs the DNS switch
  - waits for manual review
  - changes min/max of the new non-active group to 1
  - waits for instances to terminate/deregister
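The sizing part of the steps above can be sketched as follows (Python, illustration only). The function name and the `fudge` default are my own placeholders. One detail the list glosses over: min must not exceed max, so the sketch clamps it.

```python
# Size of a group while it is the non-active colour: (min, max)
IDLE_LIMITS = (1, 1)


def scale_up_limits(active_instance_count, active_max, fudge=1):
    """Compute (min, max) for the non-active group before switching to it.

    min = instances currently running in the active group + fudge factor
    max = max of the active group
    min is clamped to max, since a group cannot have min greater than max.
    """
    new_max = max(active_max, 1)
    new_min = min(active_instance_count + fudge, new_max)
    return new_min, new_max
```

For example, with 6 running instances, an active max of 10, and a fudge factor of 2, this gives `(8, 10)`. The values would then be applied to the auto-scale group (for instance as its `MinSize` and `MaxSize`), and `IDLE_LIMITS` applied to the old group after the manual review.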
One thing that isn’t clear to me is how I would go about initiating the PowerShell script above only during a full site deployment, and not during the single-host deploy that happens at Tentacle registration. It also seems like I will need to have a blue/green tag associated with the instances and an associated deployment that is specific to that role (i.e. webserver-green, webserver-blue). This all seems very messy. Does anyone have thoughts on how to bring this piece together in a modern development environment?