Upgrade Octopus Deploy from 2019.1.5 to LTS release 2019.3.5 failed

Hi,

I just went through a process to upgrade our hosted Octopus Deploy from 2019.1.5 to 2019.3.5 but it appears to have failed on the database schema update.

Looking over the logs, this appears to be what started it:

2019-06-24 10:01:51.5041   2484      9 ERROR  This Octopus Server node called 'EC2AMAZ-B7KKH5A' failed to heartbeat and has been demoted to Follower in the cluster.

System.Exception: Error while executing SQL command in transaction 'OctopusClusterService.UpdateLastSeen': Invalid column name 'LastSeen'.
Invalid column name 'Rank'.
Invalid column name 'IsOnline'.
The command being executed was:
UPDATE dbo.[OctopusServerNode] SET [Name] = @Name, [LastSeen] = @LastSeen, [Rank] = @Rank, [MaxConcurrentTasks] = @MaxConcurrentTasks, [IsInMaintenanceMode] = @IsInMaintenanceMode, [IsOnline] = @IsOnline, [JSON] = @JSON WHERE [Id] = @Id ---> System.Data.SqlClient.SqlException: Invalid column name 'LastSeen'.
Invalid column name 'Rank'.
Invalid column name 'IsOnline'.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption)
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
at Nevermore.Transient.IDbCommandExtensions.<>c__DisplayClass2_0.<ExecuteNonQueryWithRetry>b__0()
at Nevermore.Transient.RetryPolicy.ExecuteAction[TResult](Func`1 func)
at Nevermore.RelationalTransaction.Update[TDocument](TDocument instance, String tableHint, Nullable`1 commandTimeoutSeconds)
--- End of inner exception stack trace ---
at Nevermore.RelationalTransaction.Update[TDocument](TDocument instance, String tableHint, Nullable`1 commandTimeoutSeconds)
at Octopus.Core.Model.Clustering.OctopusClusterService.UpdateLastSeen()
at Octopus.Core.Model.Clustering.OctopusClusterService.TryHeartbeat()
2019-06-24 10:01:51.5197 2484 9 ERROR Failed to record the demotion of the Octopus Server node called 'EC2AMAZ-B7KKH5A'. Another node should elect itself as the leader if this node appears to be offline for long enough. If a new leader is not elected try stopping this node and starting it again.
System.Exception: Error while executing SQL command in transaction 'OctopusClusterService:TryRecordDemotionInDatabase': Invalid column name 'LastSeen'.
Invalid column name 'Rank'.
Invalid column name 'IsOnline'.
The command being executed was:
UPDATE dbo.[OctopusServerNode] SET [Name] = @Name, [LastSeen] = @LastSeen, [Rank] = @Rank, [MaxConcurrentTasks] = @MaxConcurrentTasks, [IsInMaintenanceMode] = @IsInMaintenanceMode, [IsOnline] = @IsOnline, [JSON] = @JSON WHERE [Id] = @Id ---> System.Data.SqlClient.SqlException: Invalid column name 'LastSeen'.
Invalid column name 'Rank'.
Invalid column name 'IsOnline'.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption)
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
at Nevermore.Transient.IDbCommandExtensions.<>c__DisplayClass2_0.<ExecuteNonQueryWithRetry>b__0()
at Nevermore.Transient.RetryPolicy.ExecuteAction[TResult](Func`1 func)
at Nevermore.RelationalTransaction.Update[TDocument](TDocument instance, String tableHint, Nullable`1 commandTimeoutSeconds)
--- End of inner exception stack trace ---
at Nevermore.RelationalTransaction.Update[TDocument](TDocument instance, String tableHint, Nullable`1 commandTimeoutSeconds)
at Octopus.Core.Model.Clustering.OctopusClusterService.TryRecordDemotionInDatabase()

I can upload the entire OctopusServer log if required. For now I’ve rolled back the OD servers to 2019.1.5 and pointed the OD servers to an RDS instance that was snapshotted before the change.

Are you able to assist in getting my production database in a state where it will work with 2019.3.5?

Thanks
Anthony

Hi Anthony,

This very much looks like an Octopus Server instance was still running while the upgrade was in progress. Octopus HA can’t (currently) support nodes of differing versions: you need to take the entire cluster offline and upgrade all of the nodes before adding them back into the cluster.

If you have active older nodes while running an upgrade you will run into problems as you have seen.
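
As a rough illustration of how the log above points to a schema mismatch, here is a minimal Python sketch that pulls the rejected column names out of the error text and compares them with the columns the failing UPDATE tries to set. The helper names (`invalid_columns`, `set_clause_columns`) are made up for this example, and the sample string is trimmed from the log in this thread. The output only shows which columns SQL Server rejected, which would be consistent with an older node writing to an already-upgraded schema; it does not prove that on its own.

```python
import re


def invalid_columns(log_text):
    """Return column names reported as invalid, in first-seen order, without duplicates."""
    # Accept straight or typographic quotes, since pasted logs often contain either.
    pattern = r"Invalid column name ['\u2018]([^'\u2019]+)['\u2019]"
    return list(dict.fromkeys(re.findall(pattern, log_text)))


def set_clause_columns(update_sql):
    """Return the bracketed column names assigned in an UPDATE ... SET clause."""
    set_clause = update_sql.split(" SET ", 1)[1].split(" WHERE ", 1)[0]
    return re.findall(r"\[(\w+)\]\s*=", set_clause)


# Sample text trimmed from the log posted in this thread.
sample = (
    "Invalid column name 'LastSeen'.\n"
    "Invalid column name 'Rank'.\n"
    "Invalid column name 'IsOnline'.\n"
    "UPDATE dbo.[OctopusServerNode] SET [Name] = @Name, [LastSeen] = @LastSeen, "
    "[Rank] = @Rank, [MaxConcurrentTasks] = @MaxConcurrentTasks, "
    "[IsInMaintenanceMode] = @IsInMaintenanceMode, [IsOnline] = @IsOnline, "
    "[JSON] = @JSON WHERE [Id] = @Id"
)

rejected = invalid_columns(sample)
not_rejected = [c for c in set_clause_columns(sample) if c not in rejected]
print("rejected by SQL Server:", rejected)
print("not reported as invalid:", not_rejected)
```

The same two helpers could be pointed at a full OctopusServer log file to spot this pattern quickly.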

We understand that this is not ideal. Unfortunately, because upgrades can (and do) change the SQL schema and data, it is very difficult to make them backwards and forwards compatible. It is definitely something that we would like to achieve; we just haven’t had the resources available yet to tackle this challenge.

Please retry the upgrade as follows: first take all nodes offline, then upgrade the first node and confirm that everything has gone OK, then upgrade each of the other nodes before bringing them back online.
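
The ordering described above can be sketched as a small pre-flight check in Python. This is only an illustration under stated assumptions: the `Node` class and `upgrade_plan` function are hypothetical, the node states are typed in by hand rather than read from Octopus, and `NODE-2` is a made-up second node name. The sketch refuses to produce a plan while any node is still running, then lists the steps in the order given in the reply.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str       # node name as shown in the cluster
    running: bool   # True if the Octopus Server service is still up
    version: str    # currently installed version


def upgrade_plan(nodes, target_version):
    """Return ordered steps, refusing to plan while any node is still running."""
    still_running = [n.name for n in nodes if n.running]
    if still_running:
        raise RuntimeError("Stop these nodes first: " + ", ".join(still_running))

    pending = [n for n in nodes if n.version != target_version]
    steps = []
    for index, node in enumerate(pending):
        steps.append(f"upgrade {node.name} to {target_version}")
        if index == 0:
            # Check the first node before touching the rest, as the reply advises.
            steps.append(f"confirm the upgrade of {node.name} succeeded")
    steps.append("bring all nodes back online")
    return steps


# Hypothetical two-node cluster; NODE-2 is an invented name.
cluster = [
    Node("EC2AMAZ-B7KKH5A", running=False, version="2019.1.5"),
    Node("NODE-2", running=False, version="2019.1.5"),
]
for step in upgrade_plan(cluster, "2019.3.5"):
    print(step)
```

Raising an error when a node is still running is a deliberate choice here: a node left online during the upgrade is exactly the situation that produced the heartbeat errors earlier in the thread.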

Apologies if we’ve caused any confusion. While writing this response I noticed that we have given bad advice on our Upgrades page, which I will update to match this response.

If you have any questions please let me know.

Regards,
Alex

Hi Alex,

Thanks for that. I re-did the upgrade and went to the latest LTS and it looks like it’s gone successfully.

I am having one issue at the moment, and I’m not sure if it’s due to the upgrade. When a particular set of users attempt to deploy a release, they are presented with a “Missing Resource” error when trying to select the “Development” environment.


Hovering over the error shows:
image

Permissions-wise, the account certainly has access to the Development environment.


Permissions_export_2019_07_04__03_11_41_UTC.csv (2.0 KB)

If I hit the API for Environments-2, it shows that it exists and is named “Development”:

{
  "Id": "Environments-2",
  "Name": "Development",
  "Description": "",
  "SortOrder": 1,
  "UseGuidedFailure": false,
  "AllowDynamicInfrastructure": false,
  "SpaceId": "Spaces-1",
  "ExtensionSettings": [],
  "Links": {
    "Self": "/api/Spaces-1/environments/Environments-2",
    "Machines": "/api/Spaces-1/environments/Environments-2/machines{?skip,take,partialName,roles,isDisabled,healthStatuses,commStyles,tenantIds,tenantTags}",
    "SinglyScopedVariableDetails": "/api/Spaces-1/environments/Environments-2/singlyScopedVariableDetails",
    "Metadata": "/api/Spaces-1/environments/Environments-2/metadata"
  }
}
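
For anyone wanting to repeat this check, a minimal Python sketch of the request might look like the following. The path comes from the `Self` link in the response above and `X-Octopus-ApiKey` is the header the Octopus REST API uses for API keys; the server URL, the API key value, and the helper names are placeholders invented for this example.

```python
import json
import urllib.request


def environment_request(server_url, space_id, environment_id, api_key):
    """Build (but do not send) a GET request for a single environment resource."""
    url = f"{server_url.rstrip('/')}/api/{space_id}/environments/{environment_id}"
    return urllib.request.Request(url, headers={"X-Octopus-ApiKey": api_key})


def environment_name(response_body):
    """Extract the Name field from the JSON body of an environment resource."""
    return json.loads(response_body)["Name"]


# Placeholder server URL and API key; substitute real values before sending.
request = environment_request(
    "https://octopus.example.com", "Spaces-1", "Environments-2", "API-XXXXXXXX"
)
print(request.full_url)

# To send the request against a real server:
# with urllib.request.urlopen(request) as response:
#     print(environment_name(response.read()))
```

Running the request with an API key belonging to one of the affected users, rather than an administrator, may help show whether the resource is visible under that user's permissions.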

This was 100% working in its current state before I upgraded to the latest LTS last night.

Any ideas?

Thanks
Anthony

Hi Anthony,

Glad to hear that you were able to complete the upgrade, and sorry to hear that you have immediately run into an issue.

This is a known problem, and we are working on it with urgency. I’m hoping we should have a fix out the door in the next day or two.

If you have any questions, please let me know.

Regards
Alex
