Terragrunt
Created: 2018-12-02 11:16:39 -0800 Modified: 2019-03-12 09:00:08 -0700
- What it is: Terragrunt is a wrapper around Terraform that makes it easier to work with multiple modules and manage remote state. I decided to use it instead of plain Terraform when I realized how annoying it would be to manage multiple “phases” of a deployment by myself. To elaborate: I wanted to be able to deploy some of my infrastructure (e.g. the database, a VPC, and ECR repositories) without deploying the ECS services themselves, because the services relied on Docker images that wouldn’t exist yet. Thus, my general plan was to do something like this:
- Have CI set up ECR via Terraform
- Have CI build a Docker image and push it to ECR
- Have CI set up ECS via Terraform
- As you can see above, there are two steps involving Terraform, hence the “phases” of deployment. This was further complicated by other moving parts like a Verdaccio repository, but it’s not worth detailing that here.
- Terragrunt helps here because I don’t have to manage all of the inputs, outputs, providers, and remote state for each of these by myself.
- Installation (reference):
- wget https://github.com/gruntwork-io/terragrunt/releases/download/v0.17.3/terragrunt_linux_amd64
- Replace the version above with the newest one from the releases page.
- chmod +x ./terragrunt_linux_amd64
- sudo mv ./terragrunt_linux_amd64 /usr/local/bin/terragrunt
- Example:
- Gruntwork has a single end-to-end example, split across two repos, that shows how you might set up, say, a staging and a live environment with Consul, MySQL, and a web server behind an ALB.
- Specifying variables (reference): the terragrunt configuration block itself doesn’t have the same interpolation syntax for variables as Terraform, so you read values from environment variables instead, e.g. “${get_env("foo", "default_value_for_foo")}”. A default is required with this syntax until this issue is fixed. As a workaround in the meantime, you can specify invalid defaults so that a missing variable causes an error instead of being silently defaulted. For example, S3 buckets can’t start with an exclamation point and AWS regions can’t be a random string.
Keep in mind that get_env looks up the name exactly as written and does not assume a “TF_VAR_” prefix, so if you do “get_env("TF_VAR_foo", "default")”, then you can set this variable on Linux via “export TF_VAR_foo=bar”.
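A minimal sketch of the invalid-default workaround, assuming the Terragrunt v0.17-era layout where the settings live in a terragrunt block inside terraform.tfvars. The environment variable AWS_REGION, the state key, and the placeholder defaults are assumptions for illustration, not the original configuration:

```hcl
# terraform.tfvars (sketch; names and defaults are assumptions)
terragrunt = {
  remote_state {
    backend = "s3"

    config {
      # A real bucket name can never start with "!", so an unset
      # environment variable produces an error instead of a wrong bucket.
      bucket = "${get_env("TFSTATE_S3_BUCKET_NAME", "!unset-bucket-name")}"
      key    = "${path_relative_to_include()}/terraform.tfstate"

      # Not a real AWS region, for the same reason.
      region  = "${get_env("AWS_REGION", "unset-region")}"
      encrypt = true
    }
  }
}
```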
- Working with relative paths (reference): for AWS Lambda, I had to have a zip file of the code to put on Lambda. Before I was using Terragrunt, I had something like this:
…but that referred to a directory that was above the Terragrunt root directory, which means that it wouldn’t get copied to Terragrunt’s temporary directory. There are a couple of solutions for this:
- Copy your ZIP file into the Terraform folder and then specify it like this:
- Point at a git URL and set up SSH so that no relative path is needed (reference).
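A rough sketch of the first option, assuming Terraform 0.11 syntax. The resource name, handler, runtime, and the lambda_role_arn variable are all hypothetical; the only point is that the ZIP is referenced through path.module rather than through a path that climbs above the module:

```hcl
# Hypothetical input; any IAM role ARN for the function would do.
variable "lambda_role_arn" {}

resource "aws_lambda_function" "example" {
  function_name = "example"
  role          = "${var.lambda_role_arn}"
  handler       = "index.handler"
  runtime       = "nodejs8.10"

  # A path such as "../../build/lambda.zip" points outside the folder that
  # Terragrunt copies to its temporary directory, so it stops resolving.
  # With the ZIP copied next to this .tf file, path.module finds it.
  filename         = "${path.module}/lambda.zip"
  source_code_hash = "${base64sha256(file("${path.module}/lambda.zip"))}"
}
```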
- Sharing values of variables: as far as I can tell, every individual module still needs to declare each variable it uses in a variables.tf file, i.e. there’s no way to share the declaration itself. However, you can share the values of variables, e.g. if you wanted a certain EC2 instance size for all production services. You do this using required_var_files. This is what root_folder/terraform.tfvars may look like:
Then, in a child folder, make sure “SOME_VARIABLE” is declared:
variable "SOME_VARIABLE" {}
…and it will automatically be given “some default value here” as a value.
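A sketch of how the two files might fit together, assuming the Terragrunt v0.17-era terraform.tfvars layout. The extra_arguments label, the command list, and the child folder name are assumptions; SOME_VARIABLE and its value come from the text above:

```hcl
# root_folder/terraform.tfvars (sketch)
terragrunt = {
  terraform {
    extra_arguments "shared_values" {
      commands = ["plan", "apply", "destroy", "refresh", "import"]

      # Passed to Terraform as -var-file for the commands listed above
      required_var_files = [
        "${get_parent_tfvars_dir()}/terraform.tfvars"
      ]
    }
  }
}

SOME_VARIABLE = "some default value here"

# root_folder/child/terraform.tfvars (sketch; "child" is a hypothetical folder)
terragrunt = {
  include {
    # Pulls in the root configuration, including the extra_arguments block
    path = "${find_in_parent_folders()}"
  }
}
```

The child only inherits the shared values if it includes the root configuration, which is why the include block is shown here.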
Gotchas
- Always use absolute paths rather than relative paths (reference). This is because Terragrunt copies everything to a temporary directory, so relative paths would be broken.
Using variables from a dependency (reference)
This deserved its own section. I couldn’t find anything in the examples that shows how this works, so I was grateful to find this GitHub issue. From what I can tell, Terragrunt has no documentation for this because it’s actually just a Terraform feature (terraform_remote_state).
Scenario: my real scenario is relatively common: you’ve got some shared components like a VPC, subnets, a database, etc., and you need to use output variables from those common components in your application-specific components. However, just to make this really simple to understand, let’s distill that into just two parts:
- You have one Terraform folder that will make a VPC.
- You have another Terraform folder that will make a security group in that VPC. This obviously depends on the VPC.
This is still part of the core scenario; your VPC may be shared by your entire application, but until you get to application-specific folders, you probably don’t want to start creating security groups.
Here’s how all of this ended up looking:
Overview
From the folder that creates the VPC, create outputs.tf with an output for the VPC’s ID (vpc_id).
From the folder that uses the VPC (application_foundation/main.tf below), configure your terraform_remote_state to point at the VPC folder’s remote state.
Then you can access your vpc_id via "${data.terraform_remote_state.common_foundation.vpc_id}".
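The steps above can be sketched as follows, assuming Terraform 0.11 syntax and an S3 backend. The resource names (aws_vpc.main, aws_security_group.app), the AWS_REGION variable, and the state key are assumptions for illustration; TFSTATE_S3_BUCKET_NAME is the variable named elsewhere in these notes:

```hcl
# common_foundation/outputs.tf
output "vpc_id" {
  value = "${aws_vpc.main.id}" # "main" is a hypothetical resource name
}

# application_foundation/main.tf
data "terraform_remote_state" "common_foundation" {
  backend = "s3"

  config {
    bucket = "${var.TFSTATE_S3_BUCKET_NAME}"
    key    = "common_foundation/terraform.tfstate" # assumed state key
    region = "${var.AWS_REGION}"                   # hypothetical variable
  }
}

# The security group reads the VPC ID from the other folder's state.
resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = "${data.terraform_remote_state.common_foundation.vpc_id}"
}
```

On Terraform 0.12 and later, remote state values are read through the outputs attribute instead (data.terraform_remote_state.common_foundation.outputs.vpc_id), and config is assigned with an equals sign.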
Folder structure
└───prod
*setup_remote_backend is only included for the sake of completeness (this example does require a remote backend due to how it’s configured below), but it’s not all that pertinent.
File contents
prod/terraform.tfvars
prod/variables.tf
common_foundation/main.tf
common_foundation/outputs.tf
common_foundation/terraform.tfvars
common_foundation/vpc.tf
setup_remote_backend/main.tf
setup_remote_backend/terraform.tfvars
application_foundation/main.tf
application_foundation/terraform.tfvars
application_foundation/variables.tf
Improvements on using dependencies’ variables
Eventually, “get_output” should be a function that Terragrunt provides, but it won’t arrive until at least when HCL2 lands in Terraform v0.12 (reference). When that happens, it shouldn’t be necessary to explicitly state the “dependencies” block in application_foundation’s terraform.tfvars. It may also simplify the syntax a bit.
I have to declare TFSTATE_S3_BUCKET_NAME in the variables.tf of every single sub-folder that I make. I don’t know if there’s a way to only have to specify it once.
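For reference, the explicit “dependencies” block mentioned above might look like this in the Terragrunt v0.17-era layout. This is a sketch: the relative path assumes common_foundation and application_foundation are sibling folders, which may not match the real layout:

```hcl
# application_foundation/terraform.tfvars (sketch)
terragrunt = {
  include {
    path = "${find_in_parent_folders()}"
  }

  # Tells the *-all commands (e.g. apply-all, destroy-all) to handle
  # common_foundation before this folder. It only controls ordering;
  # the outputs themselves still come from terraform_remote_state.
  dependencies {
    paths = ["../common_foundation"]
  }
}
```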
Troubleshooting
Terragrunt asks to create the S3 backend instead of automatically creating it (reference)
I.e. you get a message like this:
[terragrunt] Remote state S3 <name redacted> does not exist or you don’t have permissions to access it. Would you like Terragrunt to create it? (y/n)
For CI runs, you may not want it to ask you. There are two solutions:
- Run Terragrunt with “--terragrunt-non-interactive”. If you’re going to do this, it may make sense to have a separate folder that only sets up the S3 state and does absolutely nothing else, so that you’re not accidentally confirming actions like “apply” that you may not want to confirm.
- Create the S3 bucket yourself.
Note that if you get this error when you’re just trying to do something like “terragrunt output”, then it likely means that you haven’t run Terragrunt in your root directory.
destroy-all errors when running multiple times
I got an error like this when I tried doing destroy-all:
Error: Error applying plan:
1 error(s) occurred:
- module.alb.output.load_balancer_id: element: element() may not be used with an empty list in:
${element(concat(aws_lb.application.*.id, aws_lb.application_no_logs.*.id), 0)}
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
It was coming from a single folder (common_foundation in my case). I checked my remote tfstate and didn’t see any resources being tracked, so I think this is just a transient problem that I can ignore.