I am using Octopus Server and the Client API, version 220.127.116.110. I am trying to use the Client API to gather statistics about usage and other data points to show trends and other interesting data. My staging server has a much smaller dataset, and the results, while stale, come back in a timely fashion. When I point the same code at the production server, which has much more data, I still receive WebException timeouts even though I have attempted to chunk the TaskResource data by dates.
Currently I have the following stats from 6 months of actual operation:
Total # of users: 497
Total # of projects: 205 (growing about 2-8 different projects per day)
Total # of releases to date: 4003 (growing about 10-20 per day)
Total # of deployments to date: 5141 (growing 20-30 per day) <-- this is where the communication with the production server starts to break down and the timeouts occur.
I am wondering if there is an unknown or undocumented limitation in how the data backend communicates with the Client API? I have noticed that the API works very quickly with result sets of fewer than 3000 items, but starts to time out once we reach 5000 items or more.
Could there be a more efficient way to chunk the requests? I can't access RavenDB through port 10931, so my only other means of access is the Client API.
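For what it's worth, the chunking I am attempting looks roughly like the sketch below. This is not Octopus Client API code (the real client is .NET); it is only the date-windowing logic, with all names my own, that splits a large range into small windows so each request stays under the size where timeouts begin.

```python
from datetime import date, timedelta

def date_windows(start, end, days=7):
    """Yield (window_start, window_end) pairs covering [start, end)
    in chunks of at most `days` days, so that each API request
    returns a small result set instead of thousands of items."""
    cursor = start
    step = timedelta(days=days)
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end

# Example: split a 30-day range into 7-day windows; each window would
# then back one TaskResource query instead of a single huge one.
windows = list(date_windows(date(2015, 1, 1), date(2015, 1, 31), days=7))
```

Smaller windows mean more round trips, but each one completes before the server times out, which is the trade-off I am trying to tune.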
Using the Client API raises CPU utilization on the Octopus Server by 15-50% above idle. If other users are actively using the Octopus UI, the server's CPU utilization climbs even higher and can cause timeouts. I verified this by taking a backup of the production data and then using the Client API against the restored data on a staging server that was otherwise sitting idle.
I retrieved just TaskResource objects through the Client API, with the criteria Name="Deploy" and a start date and end date 30 days apart. The retrieval took the better part of an hour, with CPU utilization on the Octopus Server ranging from 15-35% and network throughput of 300Kbps - 1.1Mbps.
Thanks for getting in touch! I am going to answer both of your questions here as they are mostly related.
As I am sure you are aware, we have announced our decision to move from Raven to SQL. This will not only make your life easier for things like running custom queries of the data for reporting purposes; one of the biggest changes you will see is in load and CPU utilization. We have a strong, educated opinion that these issues will diminish dramatically.
So I guess what I am saying is: we are aware of this, and we are making a change to fix it. There is nothing that can be done in the meantime; we are working hard to make the change happen as fast as possible.
Regarding your question about seeing which users are actively using the system (or idle) and consuming server resources: there is currently no way to do this, and no real plans to add such a feature. Our end goal is that you should not need such a function.
To help resolve the timeout while trying to get a total number of deployments, I have two suggestions. One is to run a daily process that gets deployments by date and aggregates them into a running total. The other is to iterate through the projects, count how many deployments each one has, and sum the totals.
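The first suggestion can be sketched as follows. This is illustrative logic only, not Octopus Client API code: `record_day` stands in for a one-day query's result, and the in-memory dictionary stands in for whatever store (a file or small database) the daily process would persist between runs.

```python
from datetime import date

# Per-day deployment counts, keyed by date. A real daily job would
# persist this between runs instead of keeping it in memory.
daily_counts = {}

def record_day(day, deployments):
    """Store the deployment count fetched for a single day.
    `deployments` stands in for the (hypothetical) list a one-day
    Client API query would return; re-running a day overwrites it."""
    daily_counts[day] = len(deployments)

def total_deployments():
    """Aggregate the stored per-day counts into a grand total, so no
    single request ever has to page through thousands of tasks."""
    return sum(daily_counts.values())

# Simulate three daily runs, each with a small per-day result set.
record_day(date(2015, 6, 1), ["deploy-a", "deploy-b"])
record_day(date(2015, 6, 2), ["deploy-c"])
record_day(date(2015, 6, 3), ["deploy-d", "deploy-e", "deploy-f"])
```

Because each day's query is tiny, the expensive work is spread out over time, and the total is always available without a large request against the production server.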
I know I am not really providing you with an answer or immediate help; moving to SQL is our answer, and we do believe it will help.