Octopus Server returned an error: An error occured reading the log file

I did not see a resolution on this issue in the linked issue.

So I am opening a new case for this.
We are receiving this error multiple times per day in our TC logs. A retry usually fixes the issue. Octopus Deploy is successful, but the return to TC is a failure. How do we look at resolving this issue as it causes issues for our developers. We are using AWS FSx as our shared HA storage.

14:58:55     Octopus Server returned an error: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550
14:58:55     Input string '3426fb7933452ddb8c9ed' is not a valid number. Path '', line 430, position 2761.
14:58:55     Error from Octopus Server (HTTP 500 InternalServerError)
14:58:55     Exit code: -7
14:58:55     octo exit code: 249

Hi @paul.benoit,

Thanks for reaching out, Iā€™d be happy to re-open the investigation into this and hopefully get to the bottom of whatā€™s going on!

Would be willing to share the build logs from TeamCity as well as the Octopus Task logs for the deployment that show the error? That should give the info we need to start investigating, such as the versions used as well as the ServerTask Id that seems to have this ā€˜invalid numberā€™.

Are you able to locate the ServerTask file thatā€™s giving the error in AWS FSx and does it include the invalid number in itā€™s name? e.g. servertasks-1234_3426fb7933452ddb8c9ed.txt?

Iā€™d also like to confirm if you using Windows File Server or Lustre for FSx?

You should be able to upload files securely here, but please let us know if there are any issues with it or you have any questions at all!

Best Regards,

I uploaded the TC log and Octopus log (from FSx)

We are using Windowā€™s File Server.

Hi @paul.benoit,

Thanks for sending those through, confirming we have received them ok!

I can see that the error is occurring after the Tentacle Retention kicks in, however it looks to be a slightly different error in this log, indicating that the Path was empty:

Success: Apply retention policy on Tentacles
Octopus Server returned an error: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550.
Unexpected character encountered while parsing value: c. Path '', line 4218, position 3745.

Could you please also send through the TC logs for the error Input string 'XX' is not a valid number? Iā€™m curious if this error occurs at the same location in the logs and with similar timings.

The task logs indicate that it is waiting for another script to finish, which takes around 20 mins, however the log message times donā€™t seem to quite line up with the TC logs. Iā€™d just like to double check that you sent through the correct file?

Releasing package lock for XXX
Acquiring isolation mutex RunningScript with NoIsolation in ServerTasks-7730526
Waiting for the script in task [ServerTasks-7730525](~/app#/tasks/ServerTasks-7730525) to finish as that script requires that no other Octopus scripts are executing on this target at the same time. 

Iā€™ll keep digging into this and keep you posted with any updates, feel free to reach out with any questions!

Best Regards,

I expanded all the TC logs (even on some new jobs that are failing again) and I donā€™t see anything related to Input string 'XX' is not a valid number.

I sent the script that the Octopus job (which TC kicked off) mentioned in its own release. Maybe it was waiting on some other script with the name mentioned. You will notice the 7730526 script mentioned 7730525 and 7730524 also. I will attach those.

Hi @paul.benoit,

Cheers, confirming I have received those additional Task logs uploaded.

Just to confirm, you are only seeing the Unexpected character encountered while parsing value: c. Path '' error in your TC logs?

Do you recall where that other error in your initial message came from?

14:58:55     Octopus Server returned an error: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550
14:58:55     Input string '3426fb7933452ddb8c9ed' is not a valid number. Path '', line 430, position 2761.
14:58:55     Error from Octopus Server (HTTP 500 InternalServerError)
14:58:55     Exit code: -7
14:58:55     octo exit code: 249

Iā€™ve reached out to the devs for some advice and will keep you posted with any updates or suggestions from them!

Best Regards,

I am actually realizing I see many different error messages that describe the same error. I have asked our devs if it seems related to our application repositories and they they have said that is not the case.
I also canā€™t find a relation to this anywhere else in the logs, but I can post the full TC logs for all the following errors if that helps.

Here are the snippets (7 of them):

14:58:25           Delta for package RateServer v23004.14-dev-ss157 successfully uploaded and applied.
14:58:55     Octopus Server returned an error: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550
14:58:55     Input string '3426fb7933452ddb8c9ed' is not a valid number. Path '', line 430, position 2761.
14:58:55     Error from Octopus Server (HTTP 500 InternalServerError)
14:58:55     Exit code: -7
14:58:55     octo exit code: 249
14:58:55   Unable to create or deploy release. Please check the build log for details on the error.
13:59:24       All packages have been acquired
14:01:06     Octopus Server returned an error: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550
14:01:06     Unexpected character encountered while parsing value: d. Path '', line 2790, position 3447.
14:01:06     Error from Octopus Server (HTTP 500 InternalServerError)
14:01:06     Exit code: -7
14:01:06     octo exit code: -7
14:01:06   Unable to create or deploy release. Please check the build log for details on the error.
19:15:37           Keeping this deployment and the previous 3 successful deployments
19:15:37           Removing directory 'D:\Octopus\Applications\ss145\DynDev\RateServer\23002.1-dev-ss145'
19:15:37           Removing package file 'D:\Octopus\Files\RateServer@S23002.1-dev-ss145@8946CF35302F7E4D8BA7D7D813C8091A.nupkg'
19:16:18     Octopus Server returned an error: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550
19:16:18     JsonToken EndArray is not valid for closing JsonType None. Path '', line 2879, position 856.
19:16:18     Error from Octopus Server (HTTP 500 InternalServerError)
19:16:18     Exit code: -7
19:16:18     octo exit code: -7
19:16:18   Unable to create or deploy release. Please check the build log for details on the error.
17:15:14           Keeping this deployment and the previous 3 successful deployments
17:15:14           Did not find any deployments to clean up
17:15:24     Octopus Server returned an error: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550
17:15:24     Unexpected character encountered while parsing value: c. Path '', line 4218, position 3745.
17:15:24     Error from Octopus Server (HTTP 500 InternalServerError)
17:15:24     Exit code: -7
17:15:24     octo exit code: 249
17:15:24   Unable to create or deploy release. Please check the build log for details on the error.
15:44:22     Success: Acquire packages
15:46:14     Octopus Server returned an error: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550
15:46:14     Unexpected character encountered while parsing number: ". Path '', line 1705, position 3257.
15:46:14     Error from Octopus Server (HTTP 500 InternalServerError)
15:46:14     Exit code: -7
15:46:14     octo exit code: 249
15:46:14   Unable to create or deploy release. Please check the build log for details on the error.
19:15:37           Removing directory 'D:\Octopus\Applications\ss145\DynDev\RateServer\23002.1-dev-ss145'
19:15:37           Removing package file 'D:\Octopus\Files\RateServer@S23002.1-dev-ss145@8946CF35302F7E4D8BA7D7D813C8091A.nupkg'
19:16:18     Octopus Server returned an error: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550
19:16:18     JsonToken EndArray is not valid for closing JsonType None. Path '', line 2879, position 856.
19:16:18     Error from Octopus Server (HTTP 500 InternalServerError)
19:16:18     Exit code: -7
19:16:18     octo exit code: -7
19:16:18   Unable to create or deploy release. Please check the build log for details on the error.
7:15:14           Keeping this deployment and the previous 3 successful deployments
17:15:14           Did not find any deployments to clean up
17:15:24     Octopus Server returned an error: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550
17:15:24     Unexpected character encountered while parsing value: c. Path '', line 4218, position 3745.
17:15:24     Error from Octopus Server (HTTP 500 InternalServerError)
17:15:24     Exit code: -7
17:15:24     octo exit code: 249
17:15:24   Unable to create or deploy release. Please check the build log for details on the error.

Hey @paul.benoit,

Just jumping in for Finnian as he is currently off shift as part of our AUS based team.

Thank you for providing those bits of logs, it looks like there are different issues each time with the parsing of the value Path ' '.

Finnian did link another ticket we had (internal so I cannot show you) which seems to match what you are seeing.

They were getting the error:

System.Exception: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550. ---> Newtonsoft.Json.JsonReaderException: Unexpected character encountered while parsing value: b. Path '', line 181, position 46.

Their resource share was also hosted on an AWS Fsx instance, accessed via SMB.

Are you able to confirm if you are running de-duplication on your Fsx instance? If you are can you try turning that off and seeing if that works for you? The other customer had theirs turned off so this did not work for them but I wanted to rule that out in your senario.

For some context, our engineers worked on this issue in 2017 and found:

It appears to stem from the writing of the log files, and an apparent race condition. Octopus writes to log file as the messages come back from the deployment agent, and the Server OS will send the writes to the storage device like:

  • Block 1000-1020 ā€œ123 Blah blah blahā€
  • Block 1021-1030 ā€œ456 Blah blah blahā€

The issue we were seeing is that the cache on the storage device fails to write the first block first, so Blocks 1021-1030 get written to their correct spot before 1000-1020 are written. In most cases, this works itself out, blocks 1000 - 1020 get written eventually, just out of sync.

Theyā€™re fairly confident that this error can be somewhat ignored, itā€™s just a network issue or device cache issue that pops up rarely.

Unfortunately that issue on the ticket I mentioned was never fixed, and the only option was to look at changing the shared filesystem backend to something like DFS on the tasklog directory.

Let me know if you have de-duplication turned on and if so, are you able to turn this off and test? If not, this looks to be an issue with AWS Fsx that is very rare and unfortunately it looks like we have been unable to solve it as it only happens on rare occasions.

Ill catch up with Finnian to see if he has any other ideas but it might be that you just cannot use AWS Fsx for your shared storage in this instance.

Kind Regards,
Clare

Thank you for all that insight. I had assumed it might be some kind of race condition. Re-trying enough times have always fixed it, but devs who have not seen it before always waste time trying to debug.

We have not enabled de-duplication, but Iā€™ll try to find out how to log in to see if AWS might have done that by default. I do have multi-AZ enabled.

1 Like

Hi @paul.benoit,

Thanks heaps for sending those additional logs through, that confirms that it is progressing through the package lock release section and is most likely a race condition as itā€™s not failing at the same place every time.

Sorry for not requesting earlier but are you able to please send through the OctopusServer.txt log file? That same upload link should still be working but reach out if there are any issues with it.

Iā€™m keen to get to the bottom of this and get a fix in place (or at least a workaround) rather than hopefully needing to migrate away from AWS FSx, Iā€™ll keep digging into this with the devs and keep you posted with any updates!

Best Regards,

1 Like

I am uploading the Octopus log and TC log that correlate. Octopus job did pass (as usual), but just sent a failure back to TC.

Snippet of TC failure:

    Octopus Server returned an error: An error occured reading the log file. This does not affect the deployment. This usually indicates a problem with the storage interface if using a SAN, or the network if using NAS or DFS. See https://github.com/OctopusDeploy/Issues/issues/3550
14:57:47     Unexpected character encountered while parsing number: T. Path '', line 1533, position 2608.
14:57:47     Error from Octopus Server (HTTP 500 InternalServerError)
14:57:47     Exit code: -7
14:57:47     octo exit code: 249
1 Like

@finnian.dempsey

Is there way to reduce logging or turn it off for certain tenants in Octopus? We are noticing this does not happen in production. So if it is a race condition that was stated earlier, that could help in our dev/test environments.

Hi Paul,

Sorry for the delay with this, I can confirm the engineers are looking into it however I donā€™t have any updates just yet.

Iā€™m fairly certain that adding --logLevel=fatal as a command line argument to your TeamCity Deploy a Release step, Octopus will only log fatal events and so I believe the TC build should be able to finish successfully but I havenā€™t tested this to confirm. Iā€™ve asked the devs about whether this will work (or if there are better ways to do it) and will keep you posted as soon as I hear back from them.

Best Regards,

Hi Paul,

Just an update, the engineers have confirmed that wonā€™t reduce the log level where the error is coming from and so it wonā€™t help.

Iā€™ll keep you posted if I have any other suggestions or workarounds!

Best Regards,

Hi @paul.benoit,

Just an update with a potential workaround from the engineers you could use while we investigate.

A new version of our CLI is coming out soon which doesnā€™t pull task logs back as continuously as the old CLI does, so it might allow for you to not have this race condition.

This new CLI will need to be installed on the TeamCity Agent and a script step (such as Powershell) configured to call the CLI commands to deploy the release on the target Octopus Server, then disabling the Deploy a Release step.

  1. Install the new Octopus CLI on the TeamCity agent (e.g. choco install octopus-cli) - GitHub - OctopusDeploy/cli: | Public | Next Generation of the Octopus CLI
  2. Configure system ENV variables for the API key and Octopus URL -GitHub - OctopusDeploy/cli: | Public | Next Generation of the Octopus CLI
$env:OCTOPUS_URL="http://localhost:8050" 
$env:OCTOPUS_API_KEY="API-XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
  1. Configure a Script step to Run the following command, which deploys a release and waits it to complete - cli/examples.md at main Ā· OctopusDeploy/cli Ā· GitHub
octopus release deploy \
  --project 'New Awesome Project' \
  --version '0.0.4' \
  --environment 'test' \
  --space 'Default' \
  --no-prompt \
  --output-format basic \
  | octopus task wait
  1. Disable the ā€œDeploy a Releaseā€ step and test the build

Let me know how you get on or if you have any questions at all!

Best Regards,

Thank you, what is the new CLI version?
I still see version 9.1.7 on the choco page. Is that the one that allows the new parameter of wait?

This is how it looks now. We should just add the | octopus task wait command?

    octo create-release --project="%OctopusProjectName%" `
                        --channel="%OctoChannel%" `
                        --version=$versionNbr `
                        --packageversion=%build.number% `
                        --releasenotes="%dep.dde_engine_build.teamcity.build.branch% https://github.com/xxxx/%system.GitRepoName%/commit/%dep.dde_engine_build.build.vcs.number%, Configs: %dep.DevOps_DeploymentsForDynamicDev.AssemblyInfoVer%, Tools: %dep.Artifacts_Tools.AssemblyInfoVer%" `
                        --server="%env.OctopusURL%" `
                        --apiKey="%env.OctopusAPIKey%" `
                        --ignoreexisting `
                        --deployto="$deployto" `
                        --tenant="%env.tenant%"

Hi @paul.benoit,

Iā€™m stepping in for Finnian, as he is offline for the day, but Iā€™m happy to help!

It looks like Finnian recommended trying our new CLI (written in Go rather than C#, and not related to the existing CLI):

This CLI has a slightly different syntax than before, so as Finnian mentioned, youā€™d need to update this in your process:

octopus release deploy \
  --project 'New Awesome Project' \
  --version '0.0.4' \
  --environment 'test' \
  --space 'Default' \
  --no-prompt \
  --output-format basic \
  | octopus task wait

You can read more about the new CLI here, which includes some examples of the syntax as well:

https://octopus.com/blog/building-octopus-cli-vnext

I hope this helps, but feel free to reach back out if you have any more questions.

Best regards,

Britton

1 Like

Ah, I understand. Thank you.

I will run this by the team to see if it works for deployments.

1 Like

Hey, I am trying this option to see if log size will help with this issue:
I am changing to the ā€œfatalā€ option as minlevel for now.

https://www.octopus.com/docs/support/log-files#Logfiles-Changingloglevels