Urgent (10hrs) Maintenance Mode is enabled making the cluster read-only for most users. All Octopus Server nodes in this cluster are draining tasks

resolved
reliability
external
(Pm Swetz) #1

We have been in maintenance mode for over 10 hours now and it is having a major impact on our development, QA and product ownership teams as we cannot actively build, test, deploy or sell our product. This has crippled us for the time being and we would very much like an explanation as to

  1. why this has taken so long
  2. when we can expect to be operational again

This is affecting active development teams approaching 50 people in three countries. The costs overall are in the tens of thousands of dollars and growing.

(Justin Walsh) #2

Hi @PMSwetz,

Thanks for reaching out, I’ve reconfigured your instance to take it out of maintenance mode.

It looks like this stemmed from an attempt to migrate your instance to our new cloud platform during your maintenance window earlier today, which failed and left your machine in maintenance mode. Our systems generally account for this, but for some reason, it did not.

The team and I will dig into this to see why this failed. In the future, if you see this outside your maintenance window (as configured on Octopus.com), please let us know straight away, and we’ll get it fixed up for you.

Apologies for the inconvenience caused by this, and we’ll add some additional safeguards to our process to ensure that it doesn’t happen again.

(Pm Swetz) #3

Thank you, please keep me informed on the findings as this has created quite a bit of concern over here.