Terraform plugin crashes when running big plan/apply commands

(The below is a copy of our chat in Slack, so my teammates can keep track of this conversation in this public forum. Let’s continue our chat here :slight_smile: )

Hi there team :slightly_smiling_face:,

We are trying to move to Terraform to manage resources that are common across all projects (environments, targets, tenants, library variable sets, etc).

Octopus Version: 2020.5.8
Terraform provider Version: 0.7.40

In my first attempt to update our biggest library variable set, I keep getting the message below when running a rather big t plan or t apply from my local machine.

To give you some context, I’m trying to update some variables in a variable set and the current state has ~380 resources counting the Library variable set, the variables and all the resources they depend on for scoping.

Stack trace from the terraform-provider-octopusdeploy_v0.7.40 plugin:
panic: interface conversion: error is *url.Error, not *octopusdeploy.APIError
goroutine 1887 [running]:
github.com/OctopusDeploy/terraform-provider-octopusdeploy/octopusdeploy.resourceVariableRead(0x1bbeaa0, 0xc00055b020, 0xc000590b80, 0x1925b00, 0xc00012ac80, 0xc000bffe80, 0xc0007798f0, 0x100df58)
	github.com/OctopusDeploy/terraform-provider-octopusdeploy/octopusdeploy/resource_variable.go:116 +0x9a8
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).read(0xc000486540, 0x1bbea20, 0xc000ae9340, 0xc000590b80, 0x1925b00, 0xc00012ac80, 0x0, 0x0, 0x0)
	github.com/hashicorp/terraform-plugin-sdk/v2@v2.6.1/helper/schema/resource.go:347 +0x17c
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).RefreshWithoutUpgrade(0xc000486540, 0x1bbea20, 0xc000ae9340, 0xc0001c62a0, 0x1925b00, 0xc00012ac80, 0xc00012f110, 0x0, 0x0, 0x0)
	github.com/hashicorp/terraform-plugin-sdk/v2@v2.6.1/helper/schema/resource.go:624 +0x1cb
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ReadResource(0xc00000cd40, 0x1bbea20, 0xc000ae9340, 0xc000ae9380, 0xc000ae9340, 0x1a1b840, 0x1a527c0)
	github.com/hashicorp/terraform-plugin-sdk/v2@v2.6.1/helper/schema/grpc_provider.go:575 +0x43b
github.com/hashicorp/terraform-plugin-go/tfprotov5/server.(*server).ReadResource(0xc00060d260, 0x1bbea20, 0xc000ae9340, 0xc00055a900, 0xc00060d260, 0xc000811c50, 0xc000906ba0)
	github.com/hashicorp/terraform-plugin-go@v0.3.0/tfprotov5/server/server.go:298 +0x105
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ReadResource_Handler(0x1a527c0, 0xc00060d260, 0x1bbeae0, 0xc000811c50, 0xc00055a8a0, 0x0, 0x1bbeae0, 0xc000811c50, 0xc000868200, 0x1f1)
	github.com/hashicorp/terraform-plugin-go@v0.3.0/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:344 +0x214
google.golang.org/grpc.(*Server).processUnaryRPC(0xc000453500, 0x1bc8400, 0xc000682600, 0xc000864000, 0xc000502ba0, 0x20d7d70, 0x0, 0x0, 0x0)
	google.golang.org/grpc@v1.38.0/server.go:1286 +0x522
google.golang.org/grpc.(*Server).handleStream(0xc000453500, 0x1bc8400, 0xc000682600, 0xc000864000, 0x0)
	google.golang.org/grpc@v1.38.0/server.go:1609 +0xd05
google.golang.org/grpc.(*Server).serveStreams.func1.2(0xc000400140, 0xc000453500, 0x1bc8400, 0xc000682600, 0xc000864000)
	google.golang.org/grpc@v1.38.0/server.go:934 +0xa5
created by google.golang.org/grpc.(*Server).serveStreams.func1
	google.golang.org/grpc@v1.38.0/server.go:932 +0x1fd
Error: The terraform-provider-octopusdeploy_v0.7.40 plugin crashed!
This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.

My wild, inexperienced guess is that this is happening because of the number of resources Terraform has to refresh before doing anything. I reached this conclusion after seeing that:

  • Running t plan -refresh=false doesn’t crash at all
  • Running smaller plans works fine every time.

While it’s true that this is our largest variable set, we are worried that this solution won’t scale once we start including the rest of the resources that are common to all projects.

I went ahead and ran a t apply with the Debug flag turned ON. I’m sending the output by email in a couple of minutes.

Looking forward to hearing your thoughts about this :slight_smile:

The Powershell God

Hi @dalmiro.granas!

Thanks for sending this through - I’ll circle around with the team that maintains the TF provider, and see if we can get some answers for you ASAP. As they’re based in Australia, it will likely be tonight or tomorrow before I have an update for you.

I’ll be in touch as soon as I have some news for you!

Thanks for submitting this issue! Here’s where I’m starting:

error is *url.Error

This error (above) is a conversion error, but it has to do with an underlying URL value — perhaps in your HCL or elsewhere. This occurs when the provider is attempting to read a variable in your configuration. I’d start there. In the meantime, I’ll be adding more guards to the provider code to try and catch this earlier in the processing.

Do you have an example of the HCL that the provider is crashing on? I suspect the answer lies there.