Linux tentacle error - unable to read data from the transport connection

jason.jhuboo · 30 October 2019 16:56

Please see attached error log.

My setup is 1 server (Windows Server 2016) with 2 workers (Linux: Debian with Calamari installed).

I’ve noticed that a number of my deployments are failing with:

Activity devownct on a Worker failed with error 'An error occurred when sending a request to ‘https://octopusworker1:10933/’, before the request could begin: Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

These errors usually happen before a deployment starts, but sometimes during one.

I am deploying multiple projects at the same time (spinning up a microservices environment), running bash scripts to deploy approximately 40 Docker containers to a Kubernetes cluster. The deployments run in this order:

1 container - wait until complete,
1 container - wait until complete,
1 container - wait until complete,
15 containers - wait until complete,
20 containers - wait until complete.

This may be a coincidence, but the first three containers seem to deploy. Does this error indicate a resource-starved worker machine?

octoerror.txt (11.7 KB)

Justin_Walsh · 31 October 2019 20:14

Hi @jason.jhuboo!

Thanks for getting in touch. Yes, this definitely looks like some external factor is interfering with your deployment process. We typically see this either when a network security mechanism (a firewall, for example) is interfering with the connection on either end, or when the target machine is so overloaded that it fails to respond to the connection request. Since it does work sometimes, I’m leaning towards the latter.
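To help tell those two causes apart, a rough sketch like the one below could be run from the Octopus Server machine while deployments are in flight. It only checks that a plain TCP connection to the worker's listening port can be opened, and it does not perform the TLS handshake the real Tentacle connection uses, so treat a success as a necessary sign rather than proof. The host name and port are taken from the error message above; the function names `probe` and `probe_repeatedly` are made up for this illustration.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Attempt one TCP connection; return (succeeded, seconds_elapsed)."""
    start = time.monotonic()
    try:
        # create_connection resolves the name and connects, raising OSError
        # (including timeouts and refused connections) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        return False, time.monotonic() - start


def probe_repeatedly(host, port, attempts=10, delay=1.0, timeout=5.0):
    """Run several probes in a row, pausing `delay` seconds between them."""
    results = []
    for i in range(attempts):
        results.append(probe(host, port, timeout))
        if i < attempts - 1:
            time.sleep(delay)
    return results


# Example use (host and port as they appear in the error message):
#   for ok, elapsed in probe_repeatedly("octopusworker1", 10933):
#       print("ok" if ok else "FAILED", round(elapsed, 3))
```

If the probes fail or slow down only while the larger batches are deploying, that points towards an overloaded worker. Failures while the worker is idle point more towards something on the network path.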

Assuming that no network security is blocking the connections, the first place I would look is the resource utilization on your workers while a deployment is attempted, to see whether they’re getting hammered by the requests. If so, you may need to increase the resources available to these machines.
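As a minimal sketch of that check, assuming the workers are Linux machines with Python 3 available, something like the following could be run on a worker during a deployment. It reads the load average and `/proc/meminfo`; the helper names are hypothetical and this is not an Octopus feature. Tools such as `top` or `vmstat` would give the same information.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def memory_used_fraction(meminfo):
    """Fraction of memory in use, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    return (total - meminfo["MemAvailable"]) / total


def sample():
    """Take one reading of CPU load and memory pressure (Linux only)."""
    load1, _, _ = os.getloadavg()
    with open("/proc/meminfo") as f:
        mem = parse_meminfo(f.read())
    return {
        "load1": load1,
        # A 1-minute load well above 1.0 per CPU suggests the machine is saturated.
        "load_per_cpu": load1 / (os.cpu_count() or 1),
        "mem_used": memory_used_fraction(mem),
    }


# Example use: call sample() every few seconds while the 15- and
# 20-container batches are deploying, and compare with an idle reading.
```

If the load per CPU or memory use climbs sharply during the larger batches and the connection errors appear around the same time, that would support the overloaded-worker explanation.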

I hope this helps, and please don’t hesitate to reach out if you have any further questions.