504 When adding Azure Web App as deployment target

Hello,

I am following the steps here https://octopus.com/docs/infrastructure/azure/web-app-targets to add an existing Azure App Service as a deployment target. I have an Azure service principal account and existing app services. The version of Octopus Deploy is 2018.5.1.

After specifying the account, the page attempts to load a list of Azure Web Apps. This request times out with a 504, and the Azure Web Apps list remains empty.

A screenshot of the page is attached. I searched through the logs but couldn’t find anything relevant. Any information on this would help.

Thanks!

Aidan

Hi,

Thanks for reaching out. The formatting of that error is definitely a problem, and I’ve raised an issue that you can follow on GitHub.

As for what’s actually causing the error, the usual suspects are data size or, more often, proxies. Could I just check: were you able to test the account without issue when you created it? This will help us understand whether a proxy might be in the way.

Other than that, how many web apps are in the subscription? We’ve tested with some reasonable numbers, but some customers have a lot more than we do.

We do also get reports of timeout issues like this occasionally where it seems to be a transient issue in a particular region. Are you seeing this issue consistently?

Regards
Shannon

We have used this account with the previous method of app service deployments without issue. Also, I can hit other API endpoints like /api/accounts/{id}/resourceGroups without issue, but /api/accounts/{id}/websites gives a Gateway Timeout.
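A minimal sketch of that comparison, timing both endpoints with Invoke-RestMethod. The server URL, API key, and account ID below are placeholders, and it assumes an Octopus API key passed in the X-Octopus-ApiKey header:

```powershell
# Placeholder values (hypothetical): substitute your own server, API key and account ID.
$octopusUrl = "https://your-octopus-server"
$apiKey     = "API-XXXXXXXXXXXXXXXX"
$accountId  = "your-azure-account-id"

$headers = @{ "X-Octopus-ApiKey" = $apiKey }

foreach ($resource in "resourceGroups", "websites") {
    $uri = "$octopusUrl/api/accounts/$accountId/$resource"

    # Time each request so the two endpoints can be compared directly.
    $timer = [System.Diagnostics.Stopwatch]::StartNew()
    try {
        Invoke-RestMethod -Uri $uri -Headers $headers -Method Get | Out-Null
        $outcome = "OK"
    }
    catch {
        # A gateway timeout ends up here; record the message instead of stopping.
        $outcome = $_.Exception.Message
    }
    $timer.Stop()

    Write-Host ("{0,-15} {1,8:N1}s  {2}" -f $resource, $timer.Elapsed.TotalSeconds, $outcome)
}
```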

We have a little over 100 app services in this subscription.

Thanks,
Robert

Hello!

Thank you for the response. Is there any update on the issue loading Web Apps?

The test connection works for that account, and it is in use elsewhere in Octopus. As Robert said we have a bit over 100 app services in that subscription. The issue is consistent.

Thanks again,

Aidan

Hi,

We have tried to reproduce this but as yet we have been unsuccessful. I still suspect it is something network related, but I cannot explain why the resource group list works and the web app list does not. Could you confirm whether there are any proxies or other network appliances between Octopus and Azure?

Could I also get you to run the following script on the Octopus server, as the same account the Octopus server is running as, and using the details from the Service Principal?

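# Fill in the details from the Service Principal account.
$subscription = ""   # Azure subscription ID
$tenant = ""         # Azure AD tenant ID
$appId = ""          # application ID of the Service Principal
$key = ""            # password/key of the Service Principal

# Build a credential object from the application ID and key.
$securePassword = ConvertTo-SecureString $key -AsPlainText -Force
$creds = New-Object System.Management.Automation.PSCredential ($appId, $securePassword)

# Log in to Azure as the Service Principal.
Login-AzureRmAccount -Credential $creds -TenantId $tenant -SubscriptionId $subscription -ServicePrincipal

# List every resource group, then the web apps inside each one.
Foreach ($resourceGroup in Get-AzureRmResourceGroup) {
    Write-Host $resourceGroup.ResourceGroupName

    # One call to Azure per resource group.
    $webAppsInGroup = Get-AzureRmWebApp -ResourceGroupName $resourceGroup.ResourceGroupName

    Foreach($webApp in $webAppsInGroup) {
        Write-Host "    " $webApp.Name
    }
}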

Internally, Octopus Server uses the C# libraries from Microsoft to list the web apps; this is the PowerShell equivalent. The loop over the resource groups may look a little odd, but we have to do this because of App Service Environments (ASEs): when you have multiple ASEs, the same Web App name can exist in more than one, so we need the group/web app name pair to get a unique representation of each Web App.

Regards
Shannon

Sorry for the delay in response. Our Octopus is sitting in AWS on a direct connect with our datacenter, I believe.

I just ran the script as the local system user (which is what Octopus is running as). This succeeded. It takes about 4 minutes to run. However, if I comment out the part that gets the Web Apps and just get the resource groups, it takes about 2 seconds. So there is definitely a significant slowdown when getting the web apps list.

I just tried this again with a modification of the script where I removed the ResourceGroupName parameter from the Get-AzureRmWebApp call. This reduced the time to 2+ minutes. I’m thinking the limiting factor might be the number of Resource Groups. We have 61 resource groups in all.
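The modified version looks roughly like the sketch below. It assumes the session is already logged in with Login-AzureRmAccount as in the original script, and that each returned site exposes a ResourceGroup property; the stopwatch is only there to measure the call:

```powershell
# Assumes Login-AzureRmAccount has already been run with the Service Principal.
$timer = [System.Diagnostics.Stopwatch]::StartNew()

# One call for the whole subscription, with no -ResourceGroupName filter.
$webApps = @(Get-AzureRmWebApp)

$timer.Stop()

# Print group/name pairs, since a web app name alone may not be unique.
$sorted = $webApps | Sort-Object ResourceGroup, Name
foreach ($webApp in $sorted) {
    Write-Host "$($webApp.ResourceGroup)/$($webApp.Name)"
}

Write-Host ("Listed {0} web apps in {1:N0} seconds" -f $webApps.Count, $timer.Elapsed.TotalSeconds)
```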

Can I just check, when you say that Octopus is in AWS on a direct connect with your datacenter, does that mean any internet traffic from Octopus to Azure is also being routed via your datacenter? It’s possible this could be adding a large amount of the delay you are seeing.

We have also discovered while investigating this that the Azure tooling from Microsoft has been updated to handle ASEs since our original implementation. The Site data returned to us now has the ResourceGroup included with it, so we are looking to change our calls to not use that outer resource group loop. We’re working on this update now and hope to have it available soon. We have an issue that you can follow on GitHub for updates.

This change should greatly reduce the number of calls being made, and may in itself fix the issue you are seeing. Other than that you may be able to get better response times if Octopus traffic to Azure could be routed directly out of AWS.

Even from my local machine that call takes a couple minutes because it has to loop through all the resource groups. Is there any kind of workaround for this, as we can’t deploy Azure Web Apps if we can’t create the deployment target?

Hi Robert,

The only thing I could think of that may help at the moment is to change the permissions on the Service Principal so it only has access to a smaller number of resource groups, if that is practical for you?
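As a rough sketch of that change using the AzureRM role assignment cmdlets: the role name and resource group names are placeholders, it assumes the Service Principal currently holds a subscription-wide assignment, and it needs to be run by someone allowed to manage role assignments:

```powershell
# Hypothetical values: adjust the role and the groups Octopus actually deploys to.
$subscription = ""                      # Azure subscription ID
$appId        = ""                      # application ID of the Service Principal
$role         = "Contributor"           # assumed role; use whatever is assigned today
$groupsToKeep = @("example-rg-1", "example-rg-2")

# Grant access on just the resource groups that are needed first...
foreach ($group in $groupsToKeep) {
    New-AzureRmRoleAssignment -ServicePrincipalName $appId `
        -RoleDefinitionName $role -ResourceGroupName $group
}

# ...then remove the subscription-wide assignment so the other groups are no longer listed.
Remove-AzureRmRoleAssignment -ServicePrincipalName $appId `
    -RoleDefinitionName $role -Scope "/subscriptions/$subscription"
```

Granting the narrower assignments before removing the broad one avoids a window where the account has no access at all.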

Rest assured we are working on this issue as I type. We have the code updated and there is a considerable performance increase, as you’ve seen in making the calls. We’ve hit a minor complication with the way we handle slots, so as soon as we make sure we don’t break customers who are using that feature then we’ll be shipping this fix.

Timing will depend on how we can work around this other complication. It may require a database migration to change the shape of some data, in which case we’ll have to wait until 2018.7 in a couple of weeks. If we don’t need to change the data, it can go out in a patch and we should have it shipped early next week.

Hope that helps and apologies for the inconvenience.

Thanks for the update. Changing the permissions on the Principal used is a good idea.

Hi Shannon,

Just wanted to check if this is still on track for 2018.7.

Thanks,
Robert

Hi Robert,

The work for this has been completed but is going through testing at the moment. We still hope to release this as part of the imminent 2018.7 release. There is a slight chance that it may not make it in time. If that happens, we will release it as part of 2018.7.1 not long after.

Regards,
Shaun

Thanks for your work on this. Upgrading to 2018.7.2 fixed this issue.

Thanks,
Robert
