Octopus interface very slow after migrating server, Home folder and database

mattjeanes23 · 16 October 2018 12:46

Hi there,

We’ve recently moved our Octopus instance to a new, plenty powerful server, its database to a plenty powerful SQL Server, and its Home folder to a decent network drive, but we’re seeing huge slowdowns when browsing the interface. The slowest things seem to be logging in (the “checking your credentials” screen, using Active Directory, now takes 3-4 seconds and used to be near instant) and loading the details for a particular release (sometimes taking 5 seconds; again, it used to be near instant).

As far as I’m aware the only things stored on the Home folder are the task logs, internal packages and artifacts (e.g. project icons) so even if that was on a really slow network drive I’m not sure that it could cause this kind of slowdown when doing things completely unrelated to them e.g. just authenticating over AD.

SQL Server slowness could definitely cause what I’m seeing - but the server we’re using powers all our production sites and is by no means slow. Running a fairly complex query ‘SELECT * FROM dbo.Release_WithDeploymentProcess’ takes about a second or ‘SELECT * FROM dbo.NuGetPackage’ takes less than a second.

The only remaining thing I can think of is the server change itself, but CPU, memory and network show no signs of hitting their maximum. We do have a lot of network traffic going through the server, but it’s connected to the host via a 10Gbit link and to the SQL Server by a 100Mbit link, and even when the network traffic is quiet it is still slow. Server CPU averages below 10% of total capacity and Octopus right now is running at 0.5% CPU usage. Total memory usage is 33% and Octopus is using ~475MB.

Sometimes the interface is very quick, though only for brief periods. I cannot find any correlation yet as to why that may be, but for the vast majority of the time since we migrated it has been very slow.

Happy to provide any logs or try things out to speed it up. Would really appreciate any help.

Thanks,
Matt.

Shane_Gill · 21 October 2018 23:20

Hi Matt,

Thanks for getting in touch. When you migrated did you update versions or just change hardware?

The Home folder contains server logs, task logs, artifacts and packages. A slow disk would usually affect deployment-related things, so I would agree that slowness authenticating would be unrelated. Very slow writes to the server logs might cause issues, though.

If your SQL is quick and your hardware mostly unused, that does make it a bit of a puzzle. I think our best bet is to capture some traces while you are using the interface to see where the slowness is. There are some instructions here: https://octopus.com/docs/support/record-a-problem-with-your-browser. If you are already familiar with performance traces in Chrome, please just send a HAR file. Your Octopus Server logs would also be a big help; they are located in the “Logs” subfolder of your Home folder. To protect your privacy, please send them to support@octopus.com, referencing this issue.

We will have a look and hopefully figure out why it has started to get slow.

Cheers,
Shane

mattjeanes23 · 22 October 2018 09:29

Hi Shane,

We upgraded versions just before the migration, as the guidance stated that migrating an Octopus installation must be to the exact same version. It was only 2 patch versions I believe, but I can’t recall the exact versions - we generally keep our installation fairly up to date. Maybe the logs will tell you this?

I’ve sent a 7z containing 2 HAR files from Chrome and the Octopus Logs directory. Funnily enough while trying to capture the HARs it was fast for a few minutes so I couldn’t capture one, but a few minutes later it was back to slow.

The only thing I can think of is that the network drive being used for the Home folder is blocking when trying to retrieve e.g. the project logo or write to a log or something, but hopefully the logs reveal some info. I’m happy to move the SQL database / home folder if you feel that would be a good test. Moving the installation is also possible but I’d need a little bit of time to get that done.

Worth noting the Octopus installation will be registered under Moneybarn or ‘Money Barn’ as it says in the license, as that’s the company I work for and my work email will be matthew.jeanes(at)moneybarn(dot)com

Thanks a lot for your help.

Shane_Gill · 22 October 2018 23:34

Hi Matt,

Thanks for sending your logs.

My initial impression is that every query is taking 500ms longer than it should. In the HAR that you sent there is a ServerStatus request which takes 16ms. This hits a cache and responds quickly, so it looks like the Octopus Server itself is pretty responsive. The slow requests are the ones not hitting the cache; a good example is DeploymentProcess taking 530ms, where I would expect closer to 30ms. I think this helps explain why the UI sometimes seems fast: all the requests are hitting the cache.
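For anyone wanting to do the same triage on their own capture, here is a minimal Python sketch. It assumes the standard HAR layout, where each item in `log.entries` has a total `time` in milliseconds and a `request.url`. The function names and the 200ms threshold are illustrative choices, not part of any Octopus tooling.

```python
from __future__ import annotations

import json
from typing import Any


def load_har(path: str) -> dict[str, Any]:
    """Read a HAR file exported from the browser (HAR is plain JSON)."""
    # utf-8-sig tolerates a leading byte-order mark if one is present.
    with open(path, encoding="utf-8-sig") as fh:
        return json.load(fh)


def slow_requests(har: dict[str, Any], threshold_ms: float = 200.0) -> list[tuple[float, str]]:
    """Return (time_ms, url) pairs for entries slower than threshold_ms, slowest first."""
    entries = har.get("log", {}).get("entries", [])
    slow = [
        (float(entry["time"]), entry["request"]["url"])
        for entry in entries
        if float(entry.get("time", 0)) > threshold_ms
    ]
    return sorted(slow, reverse=True)
```

Calling `slow_requests(load_har("capture.har"))` (with a hypothetical file name) lists the requests worth looking at first; cached responses such as ServerStatus should fall well under the threshold.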

The server logs are very revealing: it looks like all of your queries are taking longer than 500ms. DeploymentProcess, for example:

2018-10-22 01:27:22.9800   3088     13  INFO  Reader took 517ms (0ms until the first record) in transaction '<unknown>': SELECT TOP 1 *
FROM dbo.[DeploymentProcess]
WHERE (IsFrozen = 0)
AND (OwnerId = @projectid)
ORDER BY [Id]
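Entries like the one above can be pulled out of a server log with a short script. This is only a sketch in Python: it assumes every timing line follows the same shape as the excerpt (timestamp first, then `Reader took Nms ... : <SQL>`), and it keeps only the first line of each SQL statement. The function name and the default threshold are made up for illustration.

```python
import re
from typing import Iterable, Iterator, Tuple

# Matches lines shaped like the excerpt above: a timestamp, then
# "Reader took <N>ms ...: <first line of the SQL statement>".
READER_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+.*?"
    r"Reader took (\d+)ms .*?: (.*)$"
)


def slow_readers(lines: Iterable[str], threshold_ms: int = 100) -> Iterator[Tuple[str, int, str]]:
    """Yield (timestamp, duration_ms, sql_first_line) for readers at or above threshold_ms."""
    for line in lines:
        match = READER_LINE.match(line.rstrip("\r\n"))
        if match is None:
            continue  # continuation lines of the SQL text, or unrelated log entries
        duration_ms = int(match.group(2))
        if duration_ms >= threshold_ms:
            yield match.group(1), duration_ms, match.group(3)
```

Passing an open log file to `slow_readers` gives a quick picture of whether the delay is confined to a few heavy queries or, as here, applies to nearly everything.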

The log is full of queries that I would not expect to take longer than a few milliseconds. A good test to confirm would be to run the following query against the Octopus database from the Octopus Server:

SELECT *
FROM dbo.[MachinePolicy]
ORDER BY [Id]

You should get a small result set very quickly (<1ms if the database is local). What kind of latency is there between your Octopus Server and SQL Server?
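If running the query by hand is awkward, a rough first check of the network path is to time plain TCP connections from the Octopus Server to the SQL Server. The Python sketch below uses only the standard library. It assumes SQL Server is listening on the default port 1433, and it measures connection setup only (including name resolution), not query execution, so treat the numbers as an indication rather than a diagnosis. The function name is hypothetical.

```python
import socket
import time
from typing import List


def tcp_connect_times_ms(host: str, port: int = 1433, samples: int = 5,
                         timeout: float = 3.0) -> List[float]:
    """Time `samples` TCP connects to host:port and return each duration in milliseconds."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns once the TCP handshake has completed.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.perf_counter() - start) * 1000.0
        results.append(elapsed_ms)
    return results
```

Running this from both the Octopus Server and another machine, against the same SQL Server host name, makes it easier to see whether the delay is specific to one network path.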

Hope this helps steer you in the right direction.

Cheers,
Shane

mattjeanes23 · 23 October 2018 10:56

I think it has steered us in the right direction!

We found that the connectivity between our SQL Server and Octopus Server was sometimes significantly (~500ms) slower than it should be. It seems to correlate with deployments. Directly querying the DB from the Octopus Server during these times was giving 500ms+ response times, while running the same query from my dev PC against the SQL Server took only a couple of ms.

We’ve got a lot of network contention around that server, with network usage sometimes going above 1Gbps while most of our network is only 1Gbit/s. I’ve therefore moved the database onto the previous server, which is on the same VMware host as the Octopus Deploy server. That gives a theoretical 10Gbit/s connection between the VMs, as they are both using the vmxnet3 adapter.

So far so good, but I will continue to monitor over the next couple of days to see if that resolves our issue. I’ll come back to let you know whether it did so we can get this closed off - and to help anyone else who may stumble across a similar problem.

mattjeanes23 · 31 October 2018 12:58

Hi again, that did indeed seem to solve it for us. Thanks for your help - the logs are very useful!

Shane_Gill · 31 October 2018 23:07

Excellent, glad to hear things have improved for you.