No package has been staged

Hello,

We have a deployment that uses the Run an AWS CLI Script step, but lately that step has been failing quite often with the following error -

No package has been staged

We are using guided failure mode, and retrying the failed step does not work (the same error is displayed). The only way around it is to cancel the deployment and start again.

It’s almost like something is deleting a package from the worker (Hosted Ubuntu). The steps all run in a Docker container, so I can’t be sure whether the missing package is the Docker image or the NuGet package containing the files to work on.

The logs don’t seem to tell much of a story, but there may be something of more use in another spot, so I’m happy to share the full logs via PM if that would help.

Is this a known issue, or is something perhaps misconfigured?

Thanks,
David

                    |   == Running: Step 10: Terraform - Ingress ==
                    |   
                    |     == Running: Worker on behalf of eks-test ==
15:34:55   Verbose  |       Octopus Server version: 2021.2.7428+Branch.release-2021.2.Sha.d771d6437f879f789be3f86d8c7d4ffa53eb3867
15:34:55   Verbose  |       Environment Information:
                    |       IsRunningInContainer: True
                    |       OperatingSystem: Linux 5.4.0-1047-azure #49~18.04.1-Ubuntu SMP Thu Apr 22 21:28:54 UTC 2021
                    |       OsBitVersion: x64
                    |       Is64BitProcess: True
                    |       CurrentUser: root
                    |       MachineName: octopus-i009472-59d687874-v4md6
                    |       ProcessorCount: 3
                    |       CurrentDirectory: /Octopus
                    |       TempDirectory: /tmp/
                    |       HostProcessName: Octopus.Server
                    |       PID: 1
15:34:55   Info     |       Leasing UbuntuDefault dynamic worker...
15:34:57   Info     |       Obtained worker lease successfully.
15:34:57   Verbose  |       Leased worker 21-09-06-1041-fotjo from pool Hosted Ubuntu (lease Leases-36155670).
15:34:57   Verbose  |       Executing Terraform - Ingress (type Run an AWS CLI Script) on DynamicWorker 21-09-06-1041-fotjo
15:34:57   Warning  |       Using the AWS tools bundled with Octopus is not recommended. Learn more about AWS Tools at https://g.octopushq.com/AWSTools.
15:34:57   Error    |       No package has been staged
15:38:08   Info     |       Guidance received: Retry
15:38:08   Info     |       Retry attempt #1 of 10
15:38:08   Warning  |       Using the AWS tools bundled with Octopus is not recommended. Learn more about AWS Tools at https://g.octopushq.com/AWSTools.
15:38:08   Error    |       No package has been staged

Hi @dgard1981,

Thanks for reaching out!

I’ll start digging into this on my end to see if I can spot anything, as we aren’t currently aware of any known issues around this.

The full logs will definitely help build a clearer picture of what’s going on. If you could share those, along with the deployment process JSON for this project, that would be great. I’ve created a secure link for you to upload any files here.

Feel free to let me know if you have any questions, and I’ll reach out if I need any more info.

Best Regards,
Finnian

Hi Finnian,

I’ve uploaded the logs from a deployment that suffered this issue, as well as the deployment process for the project.

I’m reasonably sure that the issue has occurred on more than just the “Terraform - Ingress” step, but I can’t find any logs for those at this time.

Please let me know if you need anything further from me.

Thanks,
David

Hi David,

You mention that this has been happening quite often recently? Am I correct in assuming that there are times it runs successfully?
If so, would you be able to also send us the deployment log from a successful run?

Regards,
Paul

Hi Paul,

As far as I remember, on every occasion the deployment has worked when I re-run it. However, retrying the failed step on the failed deployment (guided failure) has not worked on any occasion.

I’ve uploaded logs for a successful deployment of the same release for which you already have the failed logs.

And to clarify, by “quite often” I mean I’m seeing it once or twice a day; it’s not like every other deployment is suffering. So subjectively it seems quite often to me, but probably isn’t in the grand scheme of things.

Thanks,
David

Thanks for that, we’ll take a look into this.

It does make some sense that retrying the step via Guided Failure would continue to fail, as this doesn’t re-trigger the Acquire Packages step, whereas running the entire deployment again will push the packages back out.

It just isn’t clear how the package is going missing in the first place.

Hi David,

We’ve been investigating this with the engineers and so far haven’t been able to pin down any specific cause.
The one thing we did notice is that in the failing task log, the worker gets leased again in step 10 before it fails. We are wondering if the worker being leased again is causing it to misplace the package and would like to review other failed and successful deployments to see if this is a pattern.

Are you happy for us to log in to your cloud instance and view these deployment logs?

Regards,
Paul

Hi Paul,

Yep, happy for you to take a look around. I’ll drop you a PM with the link to the project that seems to suffer from this the most.

Thanks,
David


Hi David,

We’ve narrowed down a reproducible cause of this problem and logged it as a GitHub Issue. It turns out that one of our internal scheduled deployments was also encountering this issue, which helped us track down a reproduction scenario.

Unfortunately, the deployment process JSON you provided earlier has been purged from our systems (uploads are removed after 31 days), so I can’t say with 100% certainty whether this root cause and workaround match your exact situation, but from everything I’ve been able to gather from the error logs above, I’m hoping it’s a match.

We’ve documented a workaround on the Issue which should help you avoid this problem until we can get the fix shipped. TL;DR: add an explicit Full Health Check step after all your dynamic infrastructure provisioning steps, which will re-trigger package acquisition and ensure the package is staged for newly added targets.
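
In case it helps, here’s a rough sketch of how that step might look in the deployment process JSON. I’m reconstructing this from memory rather than from your project, so treat the property names and values below as assumptions to verify against a Health Check step created through the UI (the `//` annotations are just notes and wouldn’t appear in real JSON):

```json
{
  "Name": "Full Health Check",
  "Actions": [
    {
      // Built-in health check action type; verify against a UI-created step on your instance
      "ActionType": "Octopus.HealthCheck",
      "Name": "Full Health Check",
      "Properties": {
        // Assumed property names/values; compare with the JSON exported from your own project
        "Octopus.Action.HealthCheck.Type": "FullHealthCheck",
        "Octopus.Action.HealthCheck.IncludeMachinesInDeployment": "IncludeCheckedMachines"
      }
    }
  ]
}
```

The important part is where the step sits: it needs to come after the Terraform steps that provision new targets and before any step that relies on the staged package, so acquisition runs again for the newly added targets.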

Hope that helps, and let us know how you go!

Best Regards,
Matt.
