Connectivity between Octopus Server and Tentacle

reliability
(Andy) #1
  1. The tentacle is being installed as part of the bootstrapping of an EC2 instance. Just wanted to mention that we are paying for the licenses and really like the product.
  2. When the provisioning is complete, I can see the tentacle by its name under the “Deployment Targets” section.
  3. However, the tentacle has a red icon. Upon clicking on “Connectivity” under the tentacle section, I see this error message:
    Socket communication error with connection to https://cldxxxxxx01.workgroup:10933/ System.Net.Sockets.SocketException (0x80004005): No such host is known
  4. I then checked the tentacle machine by following this link along with these steps:
    * https://localhost:10933 works fine.
    * Everything seems fine on the Tentacle EC2 instance (netstat -a -n -o shows that it is listening on 10933).
  5. When I access https://tentacleIP:10933 from the server, it works fine.
  6. When I manually force the health check from Octopus Server, I get the following error messages:
    Check deployment target: Testing-Andy

An error occurred when sending a request to 'https://xxxxxx:10933/', before the request could begin: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. An existing connection was forcibly closed by the remote host

Summary - One or more machines were not available. Please see the output Log for details.
7. I am using Windows Server 2016, and TLS 1.1 and 1.2 are both enabled.
8. When I delete the tentacle from ODS and then try to re-add it manually by supplying additional parameters, I get the following error message:
Opening a new connection
July 3rd 2019 14:47:08  Info   Connection established
July 3rd 2019 14:47:08  Info   Performing TLS handshake
July 3rd 2019 14:47:08  Error  The remote host at https://xxxx:10933/ reset the connection, this may mean that the expected listening service does not trust the thumbprint XXXXXXXXXXXXXXXXXXXXX or was shut down.
9. This doesn't seem to be a trust issue, since I have matched the thumbprints of both the OD and the Tentacle against each other (from the Tentacle Manager as well).
10. My AWS VPC flow logs show bidirectional HTTPS traffic between my ODS and the tentacle EC2 instance.
11. My ODS and tentacle EC2 instance are both in the same network, so network connectivity doesn't seem to be an issue at all.
12. I have been trying to troubleshoot this for quite some time now. Any suggestions/pointers will be greatly appreciated!
Thanks in advance!

(Henrik Andersson) #3

Hi Andy,

Thanks for getting in touch and I’m sorry to hear you are having these issues registering your Tentacle with your Octopus server.

First thing that comes to mind is that it could be a DNS issue.

You say that you can connect to https://TentacleIP:10933 successfully from your Octopus server but can you try connecting to https://cldxxxxxx01.workgroup:10933 from your Octopus server and see if that also works successfully?

If you update the Tentacle in your Octopus server to use the TentacleIP instead of FQDN, does the health check complete successfully?
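
One way to separate the two cases is to check, from the Octopus server, whether the Tentacle's name resolves at all. The following is only a minimal Python sketch and not an Octopus tool: `resolve_host` is a hypothetical helper name, and it assumes Python is available on the server.

```python
import socket
from typing import List


def resolve_host(hostname: str) -> List[str]:
    """Return the sorted unique IP addresses a name resolves to, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Name resolution failed; this roughly corresponds to the
        # "No such host is known" error reported by the server.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_host` with the FQDN from the error message and getting an empty list back would point at name resolution rather than at the Tentacle itself.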

Thank you and best regards,
Henrik

(Andy) #4

Thank you for your reply Henrik.

  1. I manually tried to add the deployment target in ODS using the FQDN of the tentacle machine and saw the “No such host is known” error message.

  2. I believe that if this were a DNS-related issue, I should have been able to add it manually using just https://tentacleIP:10933? It did not work for some reason. Just thinking out loud :)

  3. Just checked the logs on tentacle machine once again and this is what I see:

    2019-07-03 10:38:45.1674 5440 1 INFO ==== RegisterMachineCommand ====
    2019-07-03 10:38:45.1830 5440 1 INFO CommandLine: C:\Program Files\Octopus Deploy\Tentacle\Tentacle.exe register-with --instance Tentacle --server https://xxxxoctopusserverxxxx --name tentaclemachinename --apiKey ******** --force --console --comms-style TentaclePassive --publicHostName something --environment=SIT --role testrole --tenant sometenant --tenanted-deployment-participation TenantedOrUntenanted --environment=SIT
    2019-07-03 10:38:45.7299 5440 1 INFO Registering the tentacle with the server at https://xxxxxoctpus-serverxxxxxxxxx/
    2019-07-03 10:38:45.9486 5440 1 INFO Detected automation environment: NoneOrUnknown
    2019-07-03 10:38:48.0111 5440 9 INFO Machine registered successfully
    2019-07-03 10:38:48.0111 5440 1 WARN These changes require a restart of the Tentacle.
    2019-07-03 11:38:51.2778 2384 8 INFO listen://[::]:10933/ 8 Accepted TCP client: [::ffff:10.129.25.155]:61833
    2019-07-03 11:38:51.4966 2384 9 INFO listen://[::]:10933/ 9 Unhandled error when handling request from client: [::ffff:10.129.25.155]:61833
    System.IO.IOException: Authentication failed because the remote party has closed the transport stream.
    at System.Net.Security.SslState.InternalEndProcessAuthentication(LazyAsyncResult lazyResult)
    at System.Net.Security.SslState.EndProcessAuthentication(IAsyncResult result)
    at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
    --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Halibut.Transport.SecureListener.d__18.MoveNext() in Z:\buildAgent\workDir\fe2b45bbd4978f75\source\Halibut\Transport\SecureListener.cs:line 172

Why is the remote party closing the connection as can be seen in the above logs?

(Rob Livermore) #5

Without details on the network topology or how DNS is set up, it is hard to troubleshoot. All you know is that a TCP connection to the listening tentacle is being opened and something along the route is closing it.

Suggestion

Use PowerShell 5.1's Test-NetConnection to collect a little more diagnostic information about the networking issue.

Example to run on the Server against the Tentacle:
Test-NetConnection tentacle.mycompany.com -Port 10933 -InformationLevel Detailed
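
If PowerShell is not an option, a comparable TCP-only check can be sketched in Python. This is an illustration under assumptions, not an equivalent of everything Test-NetConnection reports: `can_connect` is a hypothetical helper name, and it only tells you whether the port accepts a TCP connection, not whether the TLS handshake succeeds.

```python
import socket


def can_connect(host: str, port: int = 10933, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers name resolution failures, refused connections, and timeouts alike.
        return False
```

Running it once with the host name and once with the IP address shows whether the failure depends on how the Tentacle is addressed.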

Transport Layer Security

In your logs you are using TLS/HTTPS, which can be tricky to get right, as it adds extra work on top of the networking. From the logs above, the DNS names probably do not match the subject of the X.509 Server Authentication certificate. Get the certificate from the Octopus Server. The best way I know to diagnose this is to follow Chris Duck's blog post on how to test encrypted sockets.
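
As a rough illustration of testing an encrypted socket, the Python sketch below completes a TLS handshake and reports the thumbprint of whatever certificate the remote end presents, so it can be compared by eye with the one shown in Octopus. The helper names `thumbprint` and `fetch_thumbprint` are hypothetical, and the sketch assumes that the displayed thumbprint is the usual SHA-1 hash of the DER-encoded certificate. It deliberately skips chain and hostname verification, so it says nothing about whether the names match the certificate subject.

```python
import hashlib
import socket
import ssl


def thumbprint(der_cert: bytes) -> str:
    """SHA-1 hash of a DER-encoded certificate, as uppercase hex."""
    return hashlib.sha1(der_cert).hexdigest().upper()


def fetch_thumbprint(host: str, port: int = 10933, timeout: float = 5.0) -> str:
    """Complete a TLS handshake with host:port and return the presented certificate's thumbprint."""
    context = ssl.create_default_context()
    # Verification is switched off on purpose: the goal is only to see which
    # certificate is presented, not to validate it.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return thumbprint(tls.getpeercert(binary_form=True))
```

If `fetch_thumbprint` raises a connection reset during the handshake, the problem sits below the certificate comparison.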

(Andy) #7

Finally I was able to resolve the issue with some great advice from @henrik!
For those who bump into the same problem:

  1. All the above error messages, except the one mentioned in item 3 of the original post, are more or less misleading.
  2. The error message that actually needs troubleshooting is “No such host is known”.
  3. The reason is what you would have assumed by now. Yes, it’s the name resolution. I did not know that Octopus needs to resolve the “ComputerName” value to the IP of the tentacle when it is supplied through cTentacleAgent’s PublicHostNameConfiguration setting.
  4. What further confirmed the above were the following points:
  • I was able to add the tentacle manually from both the OD and the tentacle.
  • I added a hosts file entry on the Octopus Server like this: 10.130.65.34 CLD-D-CXXXX30, and the health check passed this time.
  • I made an entry in the DNS (which should have been done automatically, but I will figure that out later) and the health check passed.
  5. On a slightly unrelated note - TentaclePing/Pong is a great utility which helps you figure out network issues and gives you insight into the network communication.
  6. In my case, AWS VPC flow logs helped me a lot; otherwise I would have turned to Wireshark.
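
For anyone checking a hosts file entry like the one above: each line is an IP address followed by one or more names, and `#` starts a comment. The Python sketch below looks a name up in hosts-file text; `lookup_in_hosts` is a hypothetical helper name, and the case-insensitive match is an assumption based on how host names are normally compared.

```python
from typing import Optional


def lookup_in_hosts(hosts_text: str, name: str) -> Optional[str]:
    """Return the IP mapped to `name` in hosts-file text, or None if absent."""
    wanted = name.lower()
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split into the address and its names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and wanted in (f.lower() for f in fields[1:]):
            return fields[0]
    return None
```

On Windows the text would typically be read from C:\Windows\System32\drivers\etc\hosts on the Octopus Server.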