We have a deployment target that displays as ‘Unhealthy’ even though it operates normally and runs through deploys without issue. When I run a health check, it succeeds and the target is marked as ‘Healthy’. However, when I refresh the page a few seconds after the health check finishes, it shows ‘Unhealthy’ again.
The ‘Events’ page seems to indicate that it’s having issues resolving the target’s OS/architecture information; while it successfully updates the information in the health check, it immediately reverts all of it to unknown/null a few seconds after the health check finishes.
I’ve copied the Events information below, removing our target’s actual name. I don’t see anything that stands out in the ‘Recent Communication Logs’, but I only have access to the last couple hundred lines that display in the Octo UI’s Connectivity page, not the full communication logs (I may be able to request them from our ops team if required). While refreshing the page a few times, I think I saw a Halibut error related to ‘tried to read past the end of the stream’, but I haven’t been able to reproduce it, so I can’t provide the full error message.
Tentacle version: 7.0.33
Calamari version: 25.3.3

DeploymentTarget (target name) was modified
Thursday, July 27, 2023 2:02:51 PM
system
Established with: Unknown
User agent: Server
Category: Document modified
before:
"OperatingSystem": "Unknown",
"ShellName": "Unknown",
"ShellVersion": "Unknown",
"Architecture": "Unknown",
"IsRunningInContainer": null
after:
"OperatingSystem": "Ubuntu 18.04.6 LTS (bionic)",
"ShellName": "Bash",
"ShellVersion": "4.4.20(1)-release",
"Architecture": "x86_64",
"IsRunningInContainer": false

DeploymentTarget (target name) became healthy
Thursday, July 27, 2023 2:02:51 PM
system
Established with: Unknown
User agent: Server
Category: Machine found healthy

DeploymentTarget (target name) was modified
Thursday, July 27, 2023 2:02:58 PM
system
Established with: Unknown
User agent: Server
Category: Document modified
before:
"OperatingSystem": "Ubuntu 18.04.6 LTS (bionic)",
"ShellName": "Bash",
"ShellVersion": "4.4.20(1)-release",
"Architecture": "x86_64",
"IsRunningInContainer": false
after:
"OperatingSystem": "Unknown",
"ShellName": "Unknown",
"ShellVersion": "Unknown",
"Architecture": "Unknown",
"IsRunningInContainer": null

DeploymentTarget (target name) became unhealthy
Thursday, July 27, 2023 2:02:58 PM
system
Established with: Unknown
User agent: Server
Category: Machine found to be unhealthy
The remote script failed with exit code 1
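
To make the pattern in the events above easier to see, here is a small Python sketch that checks how quickly the target flipped back and which fields were reset. It is only an illustration: the values are copied by hand from the Events excerpt, the helper names (`seconds_between`, `reverted_fields`) are my own, and it does not call Octopus or Tentacle in any way.

```python
from datetime import datetime

# Timestamps of the "became healthy" and "became unhealthy" events,
# copied by hand from the Events excerpt.
HEALTHY_AT = "Thursday, July 27, 2023 2:02:51 PM"
UNHEALTHY_AT = "Thursday, July 27, 2023 2:02:58 PM"

# "after" values of the first "Document modified" event (health check result).
AFTER_HEALTH_CHECK = {
    "OperatingSystem": "Ubuntu 18.04.6 LTS (bionic)",
    "ShellName": "Bash",
    "ShellVersion": "4.4.20(1)-release",
    "Architecture": "x86_64",
    "IsRunningInContainer": False,
}

# "after" values of the second "Document modified" event (the revert).
AFTER_REVERT = {
    "OperatingSystem": "Unknown",
    "ShellName": "Unknown",
    "ShellVersion": "Unknown",
    "Architecture": "Unknown",
    "IsRunningInContainer": None,
}

# Format of the timestamps as they appear in the Events page excerpt.
TIMESTAMP_FORMAT = "%A, %B %d, %Y %I:%M:%S %p"
UNSET_VALUES = ("Unknown", None)


def seconds_between(first, second):
    """Return the number of seconds from timestamp `first` to `second`."""
    start = datetime.strptime(first, TIMESTAMP_FORMAT)
    end = datetime.strptime(second, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


def reverted_fields(before, after):
    """Return the fields that held a real value and were reset to Unknown/None."""
    return [
        name
        for name, value in before.items()
        if value not in UNSET_VALUES and after.get(name) in UNSET_VALUES
    ]


gap = seconds_between(HEALTHY_AT, UNHEALTHY_AT)
lost = reverted_fields(AFTER_HEALTH_CHECK, AFTER_REVERT)
print(f"Target went unhealthy {gap:.0f}s after being marked healthy")
print("Fields reset:", ", ".join(lost))
```

With the values from the excerpt, every field populated by the health check is reset within the same few seconds that the target is marked unhealthy, which matches what I see when refreshing the page.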