Deployment Metrics

Hi,

Our deployment has a lot of nuget steps (typically over 100) some of which deploy to multiple machines.
In an attempt to speed up our deployments we want to dig into actual deployments to try and find steps which take a long time to deploy.

We have recently upgraded to 3.13.0 and can see that on the task log each step/machine has a time to execute (attached)
However, this seems to include the time where the task is not actually running, but waiting on other tasks which effects the metrics i am trying to gather.
Whilst i agree that the UI is correct in terms of how long that block was running for (in this example it was nearly 5 minutes due to a 4 minute wait) is there a way to gather a metric on exactly how long each deployment step is taking (essentially how long it takes to run the deploy.ps1

Many thanks in advance for any assistance you can give!

Chris

Hi Chris,

Thanks for getting in touch! Do you know about the full deployment logs? They give a bit more granular permissions.
If you look under RAW logs as the option - it should assist in timing for each part of the step.
Here is a doc image which shows where that toggle is: https://i.octopus.com/docs/images/3048147/3278094.png

Please let me know if that is what you were looking for.
Vanessa

Hi Vanessa,

I am aware of the full deployment logs however, this makes it very difficult to try and analyse quite how long each part of our deployment process takes to identify where we can make improvements.

Hi Chris,

We directly parse the logs to show the times you do see on the overview page, there is no other source of data im afraid.

Vanessa

Hi Vanessa,

Thanks for your feedback.

I don’t supose you will share your parsing code with us :wink:

Hi Chris,

That’s cool what you’re trying to do :slight_smile:

The parsing code isn’t something easily shared, there’s quite a lot to it as it’s used to get all the information about the deployment. When we were implementing the timing display UI feature we factored in total running time, what you’re trying to determine is the core work of the step that’s a bit more parsing effort.

I checked the API response it may have enough detail for what you want to calculate. But I suggest you double check for your deployments, if you go to the API route api/tasks/ServerTasks-<yourtasknumber>/details will return you a Task and it will have ActivityLogs, you’ll see the LogElements array and it will have the entries, look for the script related ones like Deploying package or Making sure a Website or Executing script on and you can extract the OccurredAt times for those entries, and then subtract that time from the next step that starts after it, or if it’s the last step then the time of The deployment completed successfully.

If you determine the API call data doesn’t have enough detail and you really want to parse the server logs, don’t use the raw ones returned form the UI as they are already processed to a human readable format. In the Server\Tasks\ logging folder where Octopus runs, pick the files for the deployments you do want to analyse.

The structure of each line is a comma delimited set of details.

["ServerTasks-2310_ABCDEF/STEP_LEVEL1_GUID/SECTION_GUID/ANOTHER_SECTION_GUID","INF","2017-05-25T13:11:52.6685530+00:00","Executing script on TEST","",0]

You’ll want to make use of “INF” or “VBS” only (and not “PLN” entries as they are created up front on creation - well before deployment happens). Again you’re looking for the lines that represent the real start and end of processing effort you want to measure.

Hope this helps.

Regards,
Nick

Hi Nick,

I realise this has been a while, but i finally got round to looking at this…I thought i would share what i came up with following your help!

I’ve attached a linqpad script (renamed to .txt from .linq) with some C# hitting the octopus api and recursively finding what i think would be an action completed (something without any children & having log entries) it then summarises those in a report and outputs it.

It specifically looks for when Octopus logs “Cannot start this task yet…” and captures the start of the task, and then the date/time of the log directly after the last “Cannot start this task yet” message to treat that as the actual start.
We therefore get 3 duration, an overall (the one Octopus shows), a waiting time due to parallel limits and an actual script execution time which is exactly what i needed :slight_smile:

I have also included a task level total time of the overall duration (as reported by Octopus), the total time of tasks including waits but not taking into account a failure guidance or manual interaction step and finally a total for all the actual deployment time.

OctopusDeployDurationReport-Upload.txt (3 KB)

Hi Chris,

Nick is currently busy doing other things so I am taking over this case.
This is pretty cool what you did, thank you for sharing it with us.
Actually it is so cool that I think it would be great if you create an idea in https://octopusdeploy.uservoice.com/ to see who else would like to have this built into Octopus, what do you think ?

Cheers
John

Hey John,

Yeah i originally wanted this in Octopus, so well worth it.

If i had any votes left…i would create it myself :wink:

There you go Chris https://octopusdeploy.uservoice.com/forums/170787-general/suggestions/31246756-show-stats-on-how-long-a-step-script-takes-to-run, tell your friends to vote :slight_smile:

Cheers
John