Scaling Octopus Deploy

chris_little · 13 November 2014 15:15

We had a rough deployment last night, and I’m looking for suggestions on how to make our Octopus server better able to support large-scale deployments.

Our Octopus server is a VM running Windows 2012 with 2 CPUs and 6GB of RAM. We’re running 745 tentacles and I’m guessing that 675 are in the production environment.

The deployment last night involved around 500 of the production tentacles. The problems started when the deployment to the largest group, about 400 tentacles, started outputting more log messages than usual: about 80 to 100 extra messages per tentacle. This brought Octopus to a halt. The server was unresponsive, and we eventually rebooted it after waiting a couple of hours to see if it was just a backlog issue. After rebooting, I disabled the anti-virus, which helped performance. There was a backlog of 100,000 log messages to process, and we had to cancel about 20 tasks that had been running for 2 hours without any messages from the tentacles.

After all that we were able to rerun the failed deployments one at a time. What usually takes a couple of hours took about seven. Needless to say my operations people aren’t happy.

The log message processing seems to be the key. I saw another thread where the server had 8 CPUs. Is log message processing CPU bound?

Do I need to set up some exclusions for the anti-virus?

Do I need the suggested feature to cap the number of concurrent installs? Do I need some other feature?

We’re running 2.5.8.447. Would upgrading to 2.5.12.666 help?

Paul_Stovell · 13 November 2014 15:47

Hi Chris,

Thanks for getting in touch and I’m really sorry for the bad deployment experience.

Processing log messages can certainly be a performance issue. Usually, each time a line of log output is written (either by Tentacle or by your PowerShell or other scripts), a message is created. That message is serialized to disk, then read from disk again, then sent over the wire, then deserialized again, then appended to a file, then eventually assembled into the hierarchy that you see in the UI.

In 2.5.9, we did some work to improve this - if you have PowerShell scripts that produce a lot of log output, we now buffer these so that multiple lines are sent at once. Specifically:

So upgrading to the latest version could certainly help to deal with the increase in log messages. Excluding the activity logs directory from anti-virus is also a good idea if you are able to.
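
To illustrate why buffering helps, here is a minimal Python sketch of the general idea of sending several log lines per message. It is not Octopus or Tentacle code: the class name `BatchingLogBuffer`, the `send` callback and the batch size of 50 are all hypothetical, chosen only for illustration, and a real implementation would likely also flush on a timer.

```python
from typing import Callable, List


class BatchingLogBuffer:
    """Collects log lines and forwards them in batches instead of one by one."""

    def __init__(self, send: Callable[[List[str]], None], max_lines: int = 50) -> None:
        if max_lines < 1:
            raise ValueError("max_lines must be at least 1")
        self._send = send            # hypothetical transport: one call = one message
        self._max_lines = max_lines  # assumed batch size, not an Octopus setting
        self._pending: List[str] = []

    def write(self, line: str) -> None:
        """Queue a line and send a batch once enough lines have accumulated."""
        self._pending.append(line)
        if len(self._pending) >= self._max_lines:
            self.flush()

    def flush(self) -> None:
        """Send whatever is queued as a single message, if anything is queued."""
        if self._pending:
            self._send(self._pending)
            self._pending = []


def count_messages(total_lines: int, max_lines: int) -> int:
    """Return how many messages are sent for total_lines lines of output."""
    sent: List[List[str]] = []
    buffer = BatchingLogBuffer(sent.append, max_lines=max_lines)
    for i in range(total_lines):
        buffer.write(f"line {i}")
    buffer.flush()  # send any partial batch left at the end
    return len(sent)
```

In this sketch, 100 lines of script output with a batch size of 50 produce 2 messages rather than 100, so each per-message step (serialize, write, send, deserialize) runs far less often.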

Paul

chris_little · 13 November 2014 16:04

Hi Paul,

Thank you for the response. It was PowerShell scripts sending the bulk of the messages. I will see about getting our server upgraded and getting an exclusion added to our anti-virus.

Chris