SQL Execution Timeout Expired After upgrade

Hi Octopus support,

After upgrading to version 2023.2 (build 12209) some weeks ago
and now we experience that with one runbook its generate the error given below.

Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. Current transactions: Transaction ‘GetEvents.GetEvents|80007b9d-0002-fb00-b63f-84710c7967bb|T1027’ Open with 2 commands started at 2023-07-06T12:19:15 (60.66 seconds ago) 2023-07-06T12:19:15 SELECT COUNT(*) FROM [dbo].[Event] WHERE (((([SpaceId] in (‘Spaces-1’))) OR (([SpaceId] is null)))) AND ([Id] in (SELECT er.EventId from EventRelatedDocument er where er.RelatedDocumentId = @regarding0)) 2023-07-06T12:19:15 SELECT * FROM ( SELECT Id,RelatedDocumentIds,ProjectId,EnvironmentId,Category,UserId,Username,Occurred,Message,TenantId,AutoId,DataVersion,UserAgent,SpaceId,JSONBlob,ChangeDetails,IpAddress,JSON, ROW_NUMBER() OVER (ORDER BY [Occurred] DESC, [AutoId] DESC) AS RowNum FROM [dbo].[Event] WHERE (((([SpaceId] in (‘Spaces-1’))) OR (([SpaceId] is null)))) AND ([Id] in (SELECT er.EventId from EventRelatedDocument er where er.RelatedDocumentId = @regarding0)) ) ALIAS_GENERATED_1 WHERE ([RowNum] >= @_minrow) AND ([RowNum] <= @_maxrow) ORDER BY [RowNum] SQL Error -2 - Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. The wait operation timed out.

We have run System Integrity Check and no errors there.
Any advice?

Hi @j.keulartz,

Thank you for contacting Octopus Support. I’m sorry to hear you are having timeout issues when trying to run a particular Runbook.

Are you able to generate a System Diagnostics Report and share that with us via this secure upload link?

Let us know once you have uploaded that and we’ll have a look to see what the issue is.

Best Regards,
Donny

Hi Octopus support,

I have just uploaded the files

Hi @j.keulartz,

Thank you for getting back to me.

I don’t see anything in particular that jumps out other than the timeouts seem to occurring throughout the logs.

Is this Runbook the only “thing” that seems to be generating this error? If so, are you able to try cloning the Runbook and see if the clone behaves in the same way?

Let me know regarding the above at your earliest convenience.

Best Regards,
Donny

Hi Donny

I just spoke to end user.

It appears that when viewing the snapshots of runbooks history, sometimes the screen immediately jumps. But also sometimes needs more than 1 min or shows the previously delivered error (time out)

I have included a movie (mp4) that shows a long period of waiting

<img width=“449” height=“349” style=“width:4.677in;height:3.6354in” id=“Afbeelding_x0020_3” src=“//octopus-help-prod.s3.dualstack.us-west-2.amazonaws.com/original/3X/9/9/99bb7cd4fde609c1cd292bee2daea56f7864afbb.png” alt="Afbeelding met tekst, schermopname, Lettertype, Webpagina

Automatisch gegenereerde beschrijving">

Hi @j.keulartz,

Thank you for letting me know. Unfortunately, the video didn’t come through. However I suspect this may be due to many repeated runs of the same Runbook Snapshots. This is the type of behavior I would expect if the Runbook Retention Policy is set to Keep All.

Can you confirm the current Retention Policy setting for the Runbook in question?

Example:

Let me know at your earliest convenience.

Best Regards,
Donny

Hi @j.keulartz
Just stepping in for Donny who is offline at present as part of the UK team.

I came across this issue recently and we do have a GitHub issue for this which is still open:

Reading through the issue it seems to be a failing on SQL server due to poor execution plans. There is a workaround for on-prem servers which is simply to clear the execution plan cache:

DBCC FREEPROCCACHE

If you get a chance to run this on your SQL server it may resolve the timeouts.

Let us know how you get on.

Kind regards,
Paraic

Hi Goodmorning Paraic,

Hi @j.keulartz,

Thank you for getting back to us.

As you are running Octopus Server 2023.2.12209, I don’t believe we have any specific performance updates relating to the timeout issue you are experiencing when compared to 2023.2.12998.

Were you able to find the Runbook Retention Policy? If so, what is the current setting?

Let us know at your earliest convenience.

Best Regards,
Donny

Hi Donny,

we’ve played around with the retention policies from 20 to 2, but this doesn’t seem to help.

We wonder how we can see in the database how many retentions there should be?

do you have the location (table name) or a script for us to request it? Your help is appreciated

Hi @j.keulartz,

Thank you for getting back to me.

Are you able to run 2 SQL queries against your Octopus SQL DB so we can get a better idea of what may be causing this issue?

Index Fragmentation Report:

SELECT S.name as 'Schema',
T.name as 'Table',
I.name as 'Index',
DDIPS.avg_fragmentation_in_percent,
DDIPS.page_count
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS DDIPS
INNER JOIN sys.tables T on T.object_id = DDIPS.object_id
INNER JOIN sys.schemas S on T.schema_id = S.schema_id
INNER JOIN sys.indexes I ON I.object_id = DDIPS.object_id
AND DDIPS.index_id = I.index_id
WHERE DDIPS.database_id = DB_ID()
and I.name is not null
AND DDIPS.avg_fragmentation_in_percent > 0
ORDER BY DDIPS.avg_fragmentation_in_percent desc

Table Size Report:

-- Gets the size of all the tables
SELECT 
    t.NAME AS TableName,
    p.rows AS RowCounts,
    SUM(a.total_pages) * 8 AS TotalSpaceKB, 
    CAST(ROUND(((SUM(a.total_pages) * 8) / 1024.00), 2) AS NUMERIC(36, 2)) AS TotalSpaceMB,
    SUM(a.used_pages) * 8 AS UsedSpaceKB, 
    CAST(ROUND(((SUM(a.used_pages) * 8) / 1024.00), 2) AS NUMERIC(36, 2)) AS UsedSpaceMB, 
    (SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB,
    CAST(ROUND(((SUM(a.total_pages) - SUM(a.used_pages)) * 8) / 1024.00, 2) AS NUMERIC(36, 2)) AS UnusedSpaceMB
FROM 
    sys.tables t
INNER JOIN      
    sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN 
    sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN 
    sys.allocation_units a ON p.partition_id = a.container_id
LEFT OUTER JOIN 
    sys.schemas s ON t.schema_id = s.schema_id
WHERE 
    t.NAME NOT LIKE 'dt%' 
    AND t.is_ms_shipped = 0
    AND i.OBJECT_ID > 255 
GROUP BY 
    t.Name, s.Name, p.Rows
ORDER BY 
    2 desc

Once you have run those, please export the results in .CSV format, then upload them using the secure upload link.

Let me know once you have upload and we’ll have a look at the results.

Best Regards,
Donny

Hi Donny

Have you received the files for further investigation?

Hi @j.keulartz,

Thank you for getting back to me and for sharing the results.

Upon reviewing the information, the fragmentation of the SQL DB indexes does appear high and would likely see a noticeable benefit from running defragmentation and hopefully fix the timeout issue.

Below is a script you can run against your Octopus SQL DB in order to rebuild the indexes. I recommend running this at a time where users won’t be impacted as rebuilding indexes will affect SQL DB performance while it is running:

DECLARE @indexName NVARCHAR(MAX), @schemaName NVARCHAR(MAX), @tableName NVARCHAR(MAX), @current SMALLINT, @avgFragmentationPercentage FLOAT, @start DATETIMEOFFSET, @globalStart DATETIMEOFFSET

SET @globalStart = SYSDATETIMEOFFSET()

PRINT 'Performing maintenance on SQL indexes'
DECLARE @indexFragmentation TABLE (
                                      [TableName] sysname,
                                      [SchemaName] sysname,
                                      [IndexName] sysname,
                                      [AvgFragmentationPercentage] FLOAT
                                  )

PRINT 'Querying index fragmentation'
INSERT INTO @indexFragmentation ([TableName], [SchemaName], [IndexName], [AvgFragmentationPercentage])
SELECT
    tbl.name as [TableName],
    SCHEMA_NAME (tbl.schema_id) as [SchemaName],
    idx.Name as [IndexName],
    pst.avg_fragmentation_in_percent as [AvgFragmentationPercentage]
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL , 'SAMPLED') as pst
         INNER JOIN sys.tables as tbl ON pst.object_id = tbl.object_id
         INNER JOIN sys.indexes idx ON pst.object_id = idx.object_id AND pst.index_id = idx.index_id
WHERE pst.index_id != 0
  AND pst.alloc_unit_type_desc IN ( N'IN_ROW_DATA', N'ROW_OVERFLOW_DATA');


SET @current = 1
DECLARE indexName_cursor CURSOR
    FOR
    SELECT [IndexName]
    FROM @indexFragmentation
    WHERE [AvgFragmentationPercentage] > 10 AND [AvgFragmentationPercentage] <= 30
    ORDER BY AvgFragmentationPercentage DESC;

OPEN indexName_cursor;
PRINT 'Reorganizing ' + LTRIM(STR(@@CURSOR_ROWS)) + ' fragmented indexes'
FETCH NEXT FROM indexName_cursor INTO @indexName;
WHILE @@fetch_status = 0
    BEGIN
        SELECT
                @schemaName = [SchemaName],
                @tableName = [TableName],
                @avgFragmentationPercentage = [AvgFragmentationPercentage]
        FROM @indexFragmentation
        WHERE [IndexName] = @indexName

        PRINT 'Reorganizing index ' + @indexName + ' (' + LTRIM(STR(@avgFragmentationPercentage)) + '%) on table ' + @schemaName + '.' + @tableName + ' (' + LTRIM(STR(@current)) + '/' + LTRIM(STR(@@CURSOR_ROWS)) + ')';
        SET @start = SYSDATETIMEOFFSET()
        EXEC('ALTER INDEX [' + @indexName + '] ON ['+ @schemaName +'].[' + @tableName + '] REORGANIZE;');

        PRINT 'Reorganizing index ' + @indexName + ' took ' + CONVERT(varchar(40), DATEDIFF(second, @start, SYSDATETIMEOFFSET())) + ' seconds';

        SET @current = @current + 1
        FETCH NEXT FROM indexName_cursor INTO @indexName;
    END;
CLOSE indexName_cursor;
DEALLOCATE indexName_cursor;

SET @current = 1

DECLARE indexName_cursor CURSOR
    FOR
    SELECT [IndexName]
    FROM @indexFragmentation
    WHERE [AvgFragmentationPercentage] > 30
    ORDER BY AvgFragmentationPercentage DESC;

OPEN indexName_cursor;
PRINT 'Rebuilding ' + LTRIM(STR(@@CURSOR_ROWS)) + ' heavily fragmented indexes'
FETCH NEXT FROM indexName_cursor INTO @indexName;
WHILE @@fetch_status = 0
    BEGIN
        SELECT
                @schemaName = [SchemaName],
                @tableName = [TableName],
                @avgFragmentationPercentage = [AvgFragmentationPercentage]
        FROM @indexFragmentation
        WHERE [IndexName] = @indexName

        PRINT 'Rebuilding index ' + @indexName + ' (' + LTRIM(STR(@avgFragmentationPercentage)) + '%) on table ' + @schemaName + '.' + @tableName + ' (' + LTRIM(STR(@current)) + '/' + LTRIM(STR(@@CURSOR_ROWS)) + ')';

        SET @start = SYSDATETIMEOFFSET()
        EXEC('ALTER INDEX [' + @indexName + '] ON ['+ @schemaName +'].[' + @tableName + '] REBUILD WITH (FILLFACTOR = 80, SORT_IN_TEMPDB = ON, STATISTICS_NORECOMPUTE = OFF);');

        PRINT 'Rebuilding index ' + @indexName + ' took ' + CONVERT(varchar(40), DATEDIFF(second, @start, SYSDATETIMEOFFSET())) + ' seconds';

        SET @current = @current + 1
        FETCH NEXT FROM indexName_cursor INTO @indexName;
    END;
CLOSE indexName_cursor;
DEALLOCATE indexName_cursor;

PRINT 'Finished performing maintenance on SQL indexes. The whole process took ' + CONVERT(varchar(40), DATEDIFF(second, @globalStart, SYSDATETIMEOFFSET())) + ' seconds'

Let me know if you have any questions prior to running the script. Once the script has run, please confirm if you are able to access the Runbook snapshots for the Runbook in question without timing out.

Best Regards,
Donny

Hi Donny,

I can confirm that script had the desired result.

Thank you for your support
You can close topic

2 Likes

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.