Registration of polling tentacles behind Amazon elastic load balancer

Garry · 28 March 2014 00:57

I’ve encountered an issue with using Octopus polling Tentacles on EC2 instances behind an Amazon load balancer. The issue doesn’t exist when those same instances are run without the load balancer.

On each instance, there is a startup batch script that installs Tentacle, registers with the Octopus Server, and pulls the latest code for its particular code environment. This all works perfectly and reliably on instances that we spin up on their own.

When that same server image is attached to a load balancer, the polling Tentacle will register its name with the Octopus Server, but that is where things stop. If I check the Connectivity tab on Octopus Server for that Tentacle with hostname IP-AC1F1289, I see “4 messages have expired while waiting for delivery since the last attempted connection.”

If I check in the Server diagnostics tab, I get a bunch of messages like “Rejecting connection: the client at xx.xx.x.xx:49217 provided a certificate with thumbprint 74E3730259098313F1F1FCB3CC5DAFA15F727ED5, which is associated with IP-AC1F1289 (machines-130), but not configured for distribution.” The IP address making the connection has the correct hostname and the correct thumbprint.

On both instances, either standalone or behind the ELB, the security groups are the same - allowing only HTTP and HTTPS inbound, and anything outbound. The Octopus Server is using a non-standard port, but the config settings that we have in place work perfectly for instances outside of the ELB to register and deploy code automatically.

Any pointers as to what might be causing this would be greatly appreciated.

Paul_Stovell · 28 March 2014 01:03

Hi Garry,

What happens if you RDP to the machine behind the load balancer, and browse to the Octopus server’s listening port directly? (E.g., https://your-octopus:10943)
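If a browser is not handy on the instance, roughly the same reachability check can be sketched in Python. This is only a minimal sketch, assuming Python is installed on the machine; `can_connect` is a hypothetical helper name, and the host and port are the placeholders from the URL above. It only confirms that a TCP connection can be opened; it says nothing about certificates.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `can_connect("your-octopus", 10943)` from the instance behind the load balancer and from a standalone instance would show whether the two can reach the listening port at all.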

In C:\Octopus\Logs on the Tentacle you’ll find some log files - could you attach them?

Paul

Garry · 28 March 2014 10:08

Hi Paul,

Thanks for getting back to me so quickly. When I remote desktop into one of the instances behind the load balancer and try to bring up the https://your-octopus:10943 page in IE, it shows me the “there is a problem with this website’s security certificate” message, which I accept by clicking Continue, and then it shows me a 404.

However, just to compare, I tried that with an instance that is not behind the ELB and the same thing happens: cert error, then 404. That server has no issue receiving commands and promoted code, though.

I’ve attached the logs from one of the instances behind the ELB that is having issues (all I’ve changed in it is the IP address). In the documentation I read that port 10933 on the Tentacle need not be opened for polling Tentacles, but is it possible that the firewall rules should open that as well?

OctopusTentacle.txt (1 MB)

Paul_Stovell · 28 March 2014 10:15

Hi Garry,

This seems to be the issue:

We aborted the connection because the remote host was not authenticated. This happens when the remote host presents a different certificate from the one we expected

Basically, the thumbprint of the certificate in your Octopus server is different to the one that the Tentacle is configured to trust.

In Octopus server, go to the Configuration → Certificates tab, and look at the Octopus->Tentacle thumbprint.

On your polling Tentacle, open the Tentacle Manager, and look at the thumbprints shown.

If there’s a difference then this will be the cause. Is it possible your automatic provisioning script is telling the Tentacle to trust an Octopus server with a different thumbprint to what you are seeing?
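The comparison can also be scripted. The following is a minimal sketch, assuming Python is available on the Tentacle machine; the function names are hypothetical, and whether the Octopus listening port will complete a handshake with this client is an assumption. A Windows-style thumbprint is the SHA-1 hash of the DER-encoded certificate, so it can be computed from whatever certificate the remote end presents.

```python
import hashlib
import ssl


def thumbprint_from_der(der_bytes: bytes) -> str:
    """SHA-1 of a DER-encoded certificate as uppercase hex (Windows thumbprint style)."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


def normalize(thumbprint: str) -> str:
    """Drop spaces, colons, and invisible characters; uppercase the rest."""
    return "".join(ch for ch in thumbprint if ch.isalnum()).upper()


def presented_thumbprint(host: str, port: int) -> str:
    """Fetch the certificate presented at host:port (no validation) and hash it."""
    pem = ssl.get_server_certificate((host, port))
    return thumbprint_from_der(ssl.PEM_cert_to_DER_cert(pem))


def thumbprints_match(expected: str, actual: str) -> bool:
    """Compare two thumbprints, ignoring case and separators."""
    return normalize(expected) == normalize(actual)
```

For example, `thumbprints_match(trusted, presented_thumbprint("your-octopus", 10943))` would compare the thumbprint the Tentacle is configured to trust (`trusted`, copied from Tentacle Manager) with the one actually presented on the placeholder host and port used earlier in this thread.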

Paul

Garry · 28 March 2014 11:02

Hi Paul,

The automated script that runs on an instance’s startup doesn’t have the server’s thumbprint. Instead, it connects with the API Key and, I assume, is given the most up-to-date Certificate from the server directly during that handshake.

I’ve attached the script that we run on instance startup if you want to have a quick look.

Garry.

GoGoGadgetTentacle.bat (1 KB)

Garry · 1 April 2014 15:39

Hi Paul,

I was wondering if you had a chance to look into this any further? As far as I can tell everything has been configured correctly, and the Octopus Server even begins to deploy to the instances. Not sure what might be preventing a full deployment.

Paul_Stovell · 1 April 2014 16:47

Hi Garry,

Just looked at your script and it seems to make sense; I can’t see why it wouldn’t work. Perhaps you could add me on Skype (paulstovell) and we could do a screen sharing session to try and work out why it’s not working. If Skype doesn’t work for you, let me know another screen sharing tool.

Paul

Paul_Stovell · 1 April 2014 18:02

Hi Garry,

Could you try shutting down the Octopus server, then starting it up again, and then provisioning the EC2 machine?

Paul

Paul_Stovell · 1 April 2014 18:04

Also, is your Octopus server behind a load balancer, or a reverse proxy, or anything else that could be caching requests?

If you remote to your Tentacle, then browse to:

http://your-octopus/api/certificates
http://your-octopus/api/certificates/certificate-global

Do you see any difference in the thumbprint values?

Paul

j.cuddy · 24 November 2014 15:55

Was this ever resolved? I believe I found the cause of the issue, but need some assistance implementing the workaround. I posted a question here: http://help.octopusdeploy.com/discussions/problems/27218-accessing-tentacle-behind-amazon-load-balancer-elb

Basically, I think what’s happening is that you can’t simply add a listener on the ELB for a port using SSL. The ELB expects you to configure the private key, public key, and optionally the certificate chain for that port when you set it up. If you don’t, the ELB can’t decrypt the traffic between the server and the Tentacle and will not forward the request. I can’t find the private key that’s generated for Octopus (and I don’t think the thumbprint alone would work), so I can’t really test this theory.

dMb · 20 January 2015 17:26

Hi

I have this same issue. Did you find a fix? Seems to me that you need to be able to export the private and public key for the Octopus server certificate and apply them to the ELB.

Any updates here?

Thanks

j.cuddy · 21 January 2015 13:35

@dMb What I ended up doing was spinning up a dedicated Octopus instance within my VPC. Octopus allows up to three installs per license, so I figured this would be the easiest way to handle my issue. Now the instance isn’t exposed to the internet and each of my servers has an open connection to it. The only drawback is having to re-create all your projects if you already have another internal server.

dMb · 21 January 2015 14:18

@Jason

Thank you, and that’s what we ended up doing. I believe the fix is to somehow export the private and public key from the Octopus X.509 cert and use them on the AWS ELB. I just cannot find a way to export the cert, and it does not appear in the local Windows certificate store.

Damian_Maclennan · 29 January 2015 11:15

You should be able to get a private address for the machines as well though and connect to that. You definitely don’t want to go through the load balancer (it’s just not going to work).

So you should have one address that is routed through your load balancer for the public, and another, private address that you would communicate with. The second one is how you’d connect to the Tentacle.

Does that make sense?

dMb · 29 January 2015 11:32

Yes that makes sense.

So do you need the list of private IPs, or can you just use 10.201.0.0/16? We want to scan any system in our VPC.

Damian_Maclennan · 29 January 2015 11:45

You’d want the private IP and register the tentacle with that address.
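One way to recover a private IP from a hostname such as IP-AC1F1289 (the machine name quoted earlier in this thread) can be sketched as below. This assumes the hostname follows the EC2 Windows convention of encoding the private IPv4 address as eight hex digits after `IP-`, which the thread does not confirm; `private_ip_from_hostname` is a hypothetical helper.

```python
import ipaddress


def private_ip_from_hostname(hostname: str) -> str:
    """Decode an 'IP-XXXXXXXX' style hostname into dotted IPv4 notation.

    Assumption: the eight hex digits are the instance's private IPv4 address.
    """
    prefix, _, hex_part = hostname.partition("-")
    if prefix.upper() != "IP" or len(hex_part) != 8:
        raise ValueError(f"unexpected hostname format: {hostname!r}")
    # int(..., 16) raises ValueError if the suffix is not valid hex.
    return str(ipaddress.IPv4Address(int(hex_part, 16)))
```

The instance's own network settings or the EC2 console remain the authoritative source for the address; the decoding above is only a convenience when the naming convention holds.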

dMb · 29 January 2015 11:50

Yep, got that, but just making sure that AWS will allow me to scan all private IPs in 10.201.0.0/16 from a private IP in that same network.
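Whether a given private address actually falls inside that range can be checked with a short sketch like the following. It assumes Python is available and uses the 10.201.0.0/16 block mentioned above; `in_network` is a hypothetical helper, and the check says nothing about what AWS security groups or network ACLs will permit.

```python
import ipaddress


def in_network(address: str, cidr: str) -> bool:
    """Return True if the address belongs to the given CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)
```

For instance, `in_network("10.201.4.20", "10.201.0.0/16")` is true, while an address in 10.202.x.x is outside the block.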

Damian_Maclennan · 29 January 2015 12:05

I’m not sure I understand. Why do you need to scan all the IPs?