Kubernetes Cluster Target Not Healthy

manjunatha.karekal · 6 December 2022 12:26

Hi,
I have added all the required information. Also, telnet from the Octopus Server to the Kubernetes cluster API endpoint succeeds.

Still, the health check is failing in Octopus. Can you please help? The verbose log does not show anything useful for debugging this issue.

Attached are the logs from the Octopus console.

Please help with this issue. Let me know if more information is required.

dane.falvo · 6 December 2022 14:00

Hi @manjunatha.karekal.

I’m sorry to hear that you are having trouble with your Kubernetes cluster running as a Deployment Target.

Where is your Kubernetes cluster located? (Which cloud environment?)

I’m wondering if you have investigated your WAF logs to see if anything is currently being blocked as suggested by our troubleshooting guide?

Let me know how you go.

Regards,

Dane.

manjunatha.karekal · 6 December 2022 14:04

@dane.falvo
Thanks for responding. I am using EKS (AWS) as the Kubernetes cluster service.
There is no WAF involved, as the Kubernetes cluster runs in the same subnet as my Octopus Server. There is no firewall between Octopus and the EKS cluster.

I am also able to telnet to the EKS endpoint successfully from the Octopus Server. That rules out any network issues.

adam.hollow · 6 December 2022 15:57

Hi @manjunatha.karekal,

Thanks for getting back to us!

Is it possible that the two different connection endpoints are running different security ciphers?

Would you be able to confirm that both the Octopus Server and the Kubernetes Cluster host are running the same TLS versions?
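One way to check this is to look at what the cluster endpoint actually negotiates with a client on the Octopus Server or worker. The Python sketch below is an illustration only, not an Octopus tool: `probe_tls` is a hypothetical helper name, and the host you pass to it would be your own EKS API endpoint. EKS API servers present a certificate signed by a cluster-specific CA, so the sketch allows skipping verification (`verify=False`) purely to inspect the handshake; that setting should not be used for real traffic. The result reflects the Python/OpenSSL client on that machine, so treat it as a hint about what the endpoint accepts rather than proof of what Octopus or kubectl negotiates.

```python
import socket
import ssl


def probe_tls(host, port=443, timeout=5.0, verify=True):
    """Open a TLS connection and return (tls_version, cipher_name).

    Hypothetical diagnostic helper; raises OSError (including
    ssl.SSLError) if the TCP connection or the handshake fails.
    """
    context = ssl.create_default_context()
    if not verify:
        # Diagnostic use only: accept any certificate so the handshake
        # can be inspected even when the CA is not in the trust store.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE

    # Step 1: plain TCP connect (this is all that telnet proves).
    with socket.create_connection((host, port), timeout=timeout) as raw:
        # Step 2: TLS handshake on top of the TCP connection.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            version = tls.version()      # e.g. a string such as "TLSv1.2"
            cipher_name = tls.cipher()[0]
            return version, cipher_name
```

Calling `probe_tls("<your-eks-endpoint>", 443, verify=False)` from the machine that runs the health check returns the negotiated protocol version and cipher name. A successful telnet with a failing handshake here would point at TLS rather than basic connectivity.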

Kind Regards,
Adam

manjunatha.karekal · 6 December 2022 18:02

@adam.hollow

Hi Adam,
Can't we enable debug logs in Octopus to check exactly what is happening? Can you please suggest how?

dane.falvo · 7 December 2022 07:41

Hi @manjunatha.karekal,

You can definitely increase the log level, which may improve visibility. There is a guide here.

As for security ciphers and testing which ciphers your server allows, we have a guide for that specifically, but the only part of the document relevant to your situation is the initial section on Octopus Server ciphers.

In an attempt to narrow down where the issue lies, I would start my troubleshooting by double-checking the Worker performing the Health Check.

Can you tell me what you have configured in this section and can you try running the health check “inside of a container, on a worker” as I have provided in the example?

By setting this Docker image, we should be confident that the right tools are installed on the worker to communicate with the EKS cluster. You will need to select an appropriate worker pool in order for it to work.

If the health check still doesn’t work after setting that container image, I would explore the other end of the scenario by diving into the logs from EKS. The lower portion of this page contains some basic troubleshooting steps, which I would start with to rule out connectivity issues.

Let me know how you go.

Regards,

manjunatha.karekal · 8 December 2022 13:53

Hi @dane.falvo
Thanks for responding. Yes, I will definitely try the Docker container approach. I would also like to share the following findings.

I am getting the logs below when Octopus tries to connect to the EKS cluster via the worker server.
Please note that I have followed the guidelines to install the right versions of kubectl and aws-iam-authenticator.
Below are the screenshots of the error.

Error-screenshots.docx (1.2 MB)

It seems Octopus is trying to connect to the cluster as “octouser”. Why is that?

manjunatha.karekal · 8 December 2022 15:48

Hi @dane.falvo

In brief:

Below is my architecture:

Octopus Deploy (Hosted on Windows) ==> Octopus Worker (RHEL8) ==> Kubernetes Cluster (Amazon EKS)

Octopus version ==> V2022.3 (Build 10437)
kubectl version ==> 1.23.6 (recommended by Octopus)
aws-iam-authenticator version ==> 0.5.3 (recommended by Octopus)

I have created a Kubernetes deployment target, which so far is not healthy. My aim is to make it healthy.

Below is the configuration of the Kubernetes Deployment Target:

Authentication Type ==> AWS Account
Execute using AWS Service role for an EC2 Instance ==> yes
Health Check Container Image ==> Runs directly on a worker

When I test the target, I get the errors shown in the attached screenshots.

Note: I have also tried using the worker-tools Ubuntu image, but the health check still fails.

Please note: I have separately checked communication between the worker server and the EKS cluster, and it works fine. I am able to communicate with the EKS cluster, get pods, etc. from the worker server. However, when Octopus tries to communicate with the cluster via the worker server, the communication fails.

Please find attached the logs for multiple scenarios.
Error-screenshots.docx (2.0 MB)

manjunatha.karekal · 9 December 2022 10:26

Hi,
My issue was resolved after I assigned the proper roles to the worker node's EC2 instance.

Thanks for your help.
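For anyone debugging a similar setup, a quick check is to see which IAM identity the worker actually resolves when it relies on the EC2 instance role. The Python sketch below is an assumption-laden illustration, not something taken from the Octopus docs: the function names are hypothetical, and it assumes the AWS CLI is installed on the worker. It wraps the `aws sts get-caller-identity` command and extracts the role name from an assumed-role ARN. It should be run as the same OS user the Octopus worker service runs as, since an interactive shell may pick up different credentials.

```python
import json
import subprocess


def parse_caller_identity(raw_json):
    """Extract the caller ARN from `aws sts get-caller-identity` JSON output."""
    return json.loads(raw_json)["Arn"]


def role_name_from_arn(arn):
    """Return the role name for an assumed-role ARN, or None otherwise.

    Assumed-role ARNs have the form:
    arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>
    """
    resource = arn.split(":", 5)[5]
    parts = resource.split("/")
    if parts[0] == "assumed-role" and len(parts) >= 2:
        return parts[1]
    return None


def current_identity():
    """Ask the AWS CLI which identity the current credentials resolve to."""
    result = subprocess.run(
        ["aws", "sts", "get-caller-identity", "--output", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_caller_identity(result.stdout)
```

Running `role_name_from_arn(current_identity())` on the worker shows the instance role in use, or `None` if the credentials belong to something other than an assumed role. If the role differs from the one you expect, or is not one the cluster has been told to trust, that would be consistent with manual kubectl calls working while the Octopus health check fails.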

dane.falvo · 9 December 2022 13:11

Excellent news. Thanks for the update.

Dane
