Kubernetes deployment doesn't delete failed deployment

reliability
(Luke) #1

When a Kubernetes deployment fails, although Octopus reports the failure, it does not clean up the pods that are still stuck in “Pending”. The only way I can find to clear them up is to eventually make a successful deployment, so I can easily end up with 20 or 30 dead pods that I don’t need.

This might be by design so that the failed deployment can be debugged, but is it possible to do this automatically on a failed deployment?

(Michael Richardson) #3

Hi Luke,

You’re correct, this was by design for exactly the reason you mention: to assist in diagnosis.

We could potentially add a configuration option which would delete any resources on failure.

What I would suggest in the meantime is adding a custom label as part of that step, and configuring your own Run a kubectl CLI script step to run if the previous step fails. It could then simply execute:

kubectl delete deployment -l #{YourLabelVariable}

The label value should probably include #{Octopus.Deployment.Id} to make it unique to the current deployment.
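Putting those pieces together, the cleanup step could look something like the sketch below. The label key octopus-deployment and the variable name are assumptions for illustration; the commands need a live cluster and the Octopus variable substitution to actually run.

```shell
# Sketch of the suggested cleanup, assuming the deployment step applied a
# label such as:
#
#   metadata:
#     labels:
#       octopus-deployment: "#{Octopus.Deployment.Id}"
#
# In a "Run a kubectl CLI script" step, configured to run only when the
# previous step fails, delete everything carrying that label:
kubectl delete deployment -l octopus-deployment=#{Octopus.Deployment.Id}

# If the failed deployment also created services or other resources with
# the same label, they can be removed with the same selector:
kubectl delete service -l octopus-deployment=#{Octopus.Deployment.Id}
```

Because the label includes the deployment ID, the selector only matches resources from the failed deployment, leaving earlier successful releases untouched.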

You could even configure the cleanup step to only run in certain environments. For example, you may want it to run in development, where deployments are more likely to fail, but not in production, where you may want to investigate any failures.

I hope that helps. Please reach out if that doesn’t make sense, or if we can be of any assistance implementing it.

(Luke) #4

That makes sense. Thanks.

(Tina) closed #5