I would like to create a step template (or templates) that restricts a specific environment to one deploy at a time. A hacked approach might go something like:

1. A step that runs a custom script checking all deployment statuses in a particular environment (and most likely a project group) and sets an output variable.
2. A Manual Intervention step that conditionally runs based on the value of the output variable.
3. All steps pertaining to the deployment in question.
4. A step that runs a custom script checking for the earliest deploy timestamp in the target environment/project group, assigns the deployment to a fake user (because the Manual Intervention step will have paused it), and triggers the next deploy.

If I could manually set the status of a deployment to the same one that gets used for the Manual Intervention step (hopefully also freeing up the worker), then steps 1 and 2 could be combined, and the last step would no longer need to assign deployments. Thoughts?
I asked about queuing deployments a few months ago and it seems like it’s not directly supported, so I’m trying to come up with a feasible work around.
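A minimal sketch of what the step-1 status check could look like, querying the Octopus REST `/api/tasks` endpoint. The server URL, API key, environment ID, and the exact task argument key for the environment are all hypothetical placeholders here, not confirmed API details:

```python
import json
import urllib.request

OCTOPUS_URL = "https://octopus.example.com"  # hypothetical server URL
API_KEY = "API-XXXXXXXX"                     # hypothetical API key placeholder
ENVIRONMENT_ID = "Environments-42"           # hypothetical environment ID

def has_active_deployment(tasks, environment_id):
    """True if any task is an unfinished deployment into the environment.

    The "Name", "IsCompleted", and "Arguments" field names follow the
    general shape of the Octopus task resource; the exact argument key
    for the environment is an assumption.
    """
    return any(
        task.get("Name") == "Deploy"
        and not task.get("IsCompleted", False)
        and task.get("Arguments", {}).get("EnvironmentId") == environment_id
        for task in tasks
    )

def fetch_deploy_tasks():
    """Query the Octopus tasks endpoint for running/queued deployments."""
    request = urllib.request.Request(
        f"{OCTOPUS_URL}/api/tasks?name=Deploy&states=Executing,Queued",
        headers={"X-Octopus-ApiKey": API_KEY},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["Items"]

# Inside an Octopus "Run a Script" step, the result would then feed the
# output variable that the Manual Intervention step's run condition checks:
#   busy = has_active_deployment(fetch_deploy_tasks(), ENVIRONMENT_ID)
```

The decision logic is kept in its own function so it can be exercised without a live Octopus server.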
Hi Clayton, that is an interesting question. The short answer to your request is no. I spoke to the engineer who implemented that logic, and that state cannot be modified before or after the fact.
There might be an alternative. If you don’t mind, could you provide me your scenario (or scenarios) that led you to want to do that in Octopus? Knowing that, I might be able to provide a different work around.
Thanks for getting back right away. We currently have 12 services that make up the backend of one of our products and we deploy to our CI environment (first in the Octopus lifecycle) and run tests whenever their default branches are updated. We are ultimately hoping to adopt a continuous delivery model, but one of the issues we have run into is that our automated testing fails when there are multiple deployments occurring at the same time.
What I would like to do is limit only our CI environment to a single deployment at a time, while still allowing parallel deployments elsewhere, because I use the Deploy a Release step template in a global deploy for staging and production releases. This would allow system testing to run after any deployment without the possibility of apps getting replaced in the middle of a test run (or the need for a human to manually run the tests). The simple solution is to write a script that checks current deployments, but then a worker would be tied up for the length of several deployments rather than just one; hence the idea of using the Manual Intervention step.
Hmm, I can see your conundrum. In Octopus we treat each deployment as a separate event; it doesn’t really know about other projects. One thought I had was leveraging the Run a Script step and the API to find out whether there are active deployments for your specific projects. The script would wait until all projects are completed before kicking off your tests. For example, in this article I was doing something similar, but waiting for all deployments to a specific machine to finish: How can I force a deployment to wait in Octopus Deploy until another deployment finishes?
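The wait-until-quiet script described above could be sketched as a polling loop. The project IDs and task field names are hypothetical placeholders, and the fetch function stands in for an API call like the one in the linked article:

```python
import time

WATCHED_PROJECTS = {"Projects-101", "Projects-102"}  # hypothetical IDs for the backend services

def unfinished(tasks, watched_projects):
    """Tasks that are still running for any of the watched projects.

    "IsCompleted" and the "Arguments"/"ProjectId" keys are assumed field
    names based on the general shape of Octopus task resources.
    """
    return [
        task for task in tasks
        if not task.get("IsCompleted", False)
        and task.get("Arguments", {}).get("ProjectId") in watched_projects
    ]

def wait_for_quiet(fetch, watched_projects, poll_seconds=30, timeout_seconds=1800):
    """Poll until no watched project has an active deployment, or time out.

    `fetch` is any callable returning the current list of deployment tasks
    (e.g. a request to the Octopus tasks endpoint).
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if not unfinished(fetch(), watched_projects):
            return True
        time.sleep(poll_seconds)
    return False
```

Passing the fetcher in as a callable keeps the loop testable and leaves the actual API call (and authentication) to the surrounding script step.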
That approach has its own pros and cons. Putting that step in the right spot could be tricky, and I could see instances where you would need that step multiple times in the same project. Another question that comes to mind is how often are N of the 12 services being built and deployed at the same time? If it is once every couple of days, would it be easier to include retry logic in the script that kicks off the tests? Or would the effort be much greater?
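The retry idea could be as small as a wrapper like the following, where `action` is a hypothetical callable that kicks off the test suite and reports whether it ran against a quiet environment:

```python
import time

def run_with_retry(action, attempts=3, delay_seconds=60):
    """Call `action` until it reports success or attempts run out.

    `action` is whatever triggers the test run; it should return True when
    the tests ran cleanly and False when a concurrent deployment forced a
    retry. Both the callable and its contract are illustrative assumptions.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return True
        if attempt < attempts:
            time.sleep(delay_seconds)
    return False
```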
Finally, how long do those tests run? Should they always run after each build? The reason I ask is that I worked on another application a few years ago that typically had between 20 and 40 builds a day. A test suite which took 20-30 minutes to finish caused a lot of grumbling from the developers. Our compromise was to run the test suite at set intervals throughout the day: at 9:00 during our stand-up, over lunch, and at the end of the day.
I hope that helps trigger something that can drive towards a solution!
The only problem with the other approach is that it ties up a worker for the duration of more than one deployment; otherwise it would address what I’m trying to do.
This happens multiple times a day sometimes, and other times not for several days in a row (as software development ebbs and flows). We have ~10 devs who all work on these services; the services are independent in responsibility within the system, but dependent in that they make up one product’s backend.
The major issue I see with this approach is that it doesn’t address the issue so much as cover it up. There would also be the possibility of another service getting deployed during the second test run and returning equally inaccurate results.
Our tests run in less than 10 minutes (in most cases less than 3 minutes) for our backend services. Our client teams’ UI tests run for ~30 minutes, but they are unaffected by the 12 services in question (I assume that 20-30 minutes implies UI tests or a very large test suite). Because we are pushing for continuous delivery and a built-in quality philosophy, we currently consider the test runs required for every deployment, so that a failure blocks our development pipeline with the knowledge of what blocked it. Set intervals would add triaging time whenever an issue came up after several merges.
I appreciate your time looking into this on my behalf. In the end, our goal is to remove human interaction from this point in our process until the deployment to our production environment (so long as Octopus didn’t report a failure of some kind along the way). If there isn’t any current support or a better workaround than the idea I proposed, I think I will just move forward with that for now. If this or a queuing system of some kind might be supported in the future, my team (and likely all the teams here) would be very interested in the timeline on that.
Access to the status of the Manual Intervention step (or another new pending status) and the ability to free up a worker seem like the easiest addition on your end. That being said, I imagine some awareness of deployments within a Project Group in Octopus would have several benefits for a variety of Octopus use cases.
I’ll have to look into the mutex option. We currently use the same environments across a few project groups, but that would be easy enough to separate.
On the “freeing the worker” concern, I may be incorrect about the actual interaction between the worker and parallel deployments; I believe Octopus defaults to 10 simultaneous tasks, and those probably run on fewer than 10 machines now that you mention it. I just want to avoid backing up deployments, as we share the Octopus instance with our whole organization.