Terragrunt, a tool to reduce Terraform code redundancy

Posted on 27 January 2022, updated on 6 February 2024.

Organizing the code in a Terraform project is a complex issue. A perfect general solution to this problem doesn’t exist. With my project team, we came across Terragrunt, a Terraform wrapper promising to reduce code redundancy. This article summarizes what we learned so far.

The issue with the Terraform code

My project team is currently facing this issue: as our project scope grew, we found that we were spending a lot of time copying and replicating blocks of code. This makes the Terraform codebase difficult to maintain and increases the risk of small but time-consuming bugs, for instance when we forget to update the value of a variable. We thus decided to take some time to think of a better code organization.

The application

In this article, we will consider a simple example infrastructure composed of several APIs (API A, API B, etc.) running in containers deployed on Azure App Services. Each API also has a Keyvault to store the secrets injected in the containers as environment variables, and an API in an Azure API Management to route HTTP traffic to the App Service (the API Management is common to all APIs as it is quite expensive). This infrastructure is deployed in 3 distinct environments:

A production environment: production
Two non-production environments: dev and staging

Below is an illustration of what an environment of the example infrastructure looks like.

[Figure: infrastructure scheme of one environment]

💡 If you are not familiar with Azure resources, don’t panic! Most of the concepts and examples are common to all cloud providers.

The Terraform code

Because it is very painful to work on a big Terraform codebase, the Terraform code is split into parts: each API has its own Terraform code in the API repository, and the common resources (a Resource Group, the API Management, a Container Registry, a database, etc.) are described as code in a separate repository.

In order to prevent code repetition, a Terraform module (let’s call it simple-api) contains the code to deploy an App Service, a Keyvault, and an API in the API Management. The code for this module is hosted in a distinct repository to allow the versioning of the module. I will assume that we can find this module at github.com/foo/simple-api.
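To make the module interface concrete, here is a minimal sketch of what the input variables of simple-api might look like. Only the three variable names come from the module call used in this article; the file name, the types, and the descriptions are assumptions made for illustration.

```hcl
# variables.tf in the simple-api module (hypothetical sketch)
# The variable names match the inputs used in this article;
# the types and descriptions are assumptions.

variable "api_name" {
  type        = string
  description = "Name of the API, identical in every environment."
}

variable "resource_group_name" {
  type        = string
  description = "Resource Group in which the API resources are created."
}

variable "cors_urls" {
  type        = list(string)
  description = "Origins allowed by the CORS policy of the API."
}
```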

Here is an example of the structure of an API repository:

.
├── code/
│   └── contains the application code
├── terraform/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── providers.tf
│   │   └── versions.tf
│   ├── staging/
│   │   ├── main.tf
│   │   ├── providers.tf
│   │   └── versions.tf
│   └── production/
│       ├── main.tf
│       ├── providers.tf
│       └── versions.tf
├── README.md
└── ...

In every environment, there are 3 different Terraform files in this example:

versions.tf is used to pin the versions of the Terraform binary and of the providers. It is similar to:

# terraform/dev/versions.tf
terraform { 
  required_version = "~> 1.1.0"
 
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 2.90"
    }
  }

  backend "azurerm" {}
}

providers.tf is used to configure the Azure provider:

# terraform/dev/providers.tf
provider "azurerm" { 
  features {}
  skip_provider_registration = true
}

main.tf is a call to the simple-api module:

# terraform/dev/main.tf
module "simple-api" {
  source = "git@github.com:foo/simple-api.git?ref=v0.1.0"

  # Passing input values to the module variables
  cors_urls           = ["http://localhost:3000"]
  api_name            = "my-api"
  resource_group_name = "my-rg-dev"
  # ...
}

The issue

As the APIs composing the application can be quite different (different language, CORS policy, OAuth 2.0 authentication, etc.), the simple-api module requires a relatively large number of values to be provided at every module call to support all these use cases. Some values such as cors_urls are specific to a given environment, but others like api_name are not and are therefore repeated in the Terraform configuration of every environment.
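To illustrate the repetition, here is a sketch of what the staging configuration might look like before the migration. It follows the pattern of the dev file above; the empty cors_urls list and the my-rg-staging name are assumptions consistent with the rest of this article.

```hcl
# terraform/staging/main.tf (sketch)
module "simple-api" {
  source = "git@github.com:foo/simple-api.git?ref=v0.1.0"

  # Environment-specific values
  cors_urls           = []
  resource_group_name = "my-rg-staging"

  # Identical in dev, staging, and production: repeated in every main.tf
  api_name = "my-api"
  # ...
}
```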

This is where Terragrunt comes in handy, as it provides several functionalities to keep Terraform code DRY (Don’t Repeat Yourself).

A bit of theory about Terragrunt

What is Terragrunt?

Terragrunt is a thin wrapper around Terraform, maintained by Gruntwork, that helps manage remote Terraform states and Terraform modules. It aims at reducing code repetition. Moreover, it is very easy to use: you just have to install Terragrunt and replace terraform with terragrunt in all Terraform CLI commands (terragrunt apply, terragrunt plan...).

Let’s see what we can do with Terragrunt to improve our Terraform codebase!

Decouple the logic of the Terraform code from its implementation

The main advantage of Terragrunt is that it allows decoupling the logic of your Terraform code (which lies in your Terraform modules) from its implementation (which lies in the configuration of the different environments calling multiple Terraform modules). Terragrunt can thus be considered a tool to orchestrate your Terraform modules.

In concrete terms, we will replace the traditional *.tf files in the configuration of our API with Terragrunt *.hcl configuration files. By doing so, we will be able to define the input values passed when calling the simple-api module anywhere in our repository. Values factorization is straightforward in this configuration!

That’s it for the theoretical part, let’s migrate the Terraform code for API A to Terragrunt.

Migration of one API to Terragrunt

As mentioned in the previous part, we will have no *.tf files in our configuration after the migration. The Terraform code will be used only for the logic in the modules, decoupled from the configuration, which uses only *.hcl files.

The target repository structure after the migration is:

.
├── code/
│   └── contains the application code
├── aic/
│   ├── dev/
│   │   ├── terragrunt.hcl
│   │   └── env.hcl
│   ├── staging/
│   │   ├── terragrunt.hcl
│   │   └── env.hcl
│   ├── production/
│   │   ├── terragrunt.hcl
│   │   └── env.hcl
│   └── terragrunt.hcl
├── README.md
└── ...

Let’s go through the steps to achieve this goal!

Factorize `versions.tf` and `providers.tf`

The first step of the migration is to get rid of the files versions.tf and providers.tf. As these files are common to all environments, we will migrate their content to the file terragrunt.hcl at the root of the aic folder, which replaces the terraform folder.

The root terragrunt.hcl will then look like this:

# aic/terragrunt.hcl

generate "versions" {
  path = "versions.tf"
 
  if_exists = "overwrite_terragrunt"
 
  contents = <<EOF
terraform {
  required_version = "~> 1.1.0"
 
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 2.90"
    }
  }

  backend "azurerm" {}
}

EOF
}

generate "provider" {
  path = "providers.tf"
 
  if_exists = "overwrite_terragrunt"
 
  contents = <<EOF
provider "azurerm" { 
  features {}
  skip_provider_registration = true
}
 
EOF
}

These two blocks tell Terragrunt to generate the two files versions.tf and providers.tf before applying the Terraform code. But as the Terragrunt configuration is always applied from a leaf terragrunt.hcl, we need to tell Terragrunt to import the root terragrunt.hcl file. Let’s add in every leaf terragrunt.hcl file the following block to do so.

# aic/dev/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

Migrate the `main.tf` file to Terragrunt

In order to migrate the main.tf file, we must know how to make a call to a module in Terragrunt. The good news is that it is very easy: add the following block to the leaf terragrunt.hcl.

terraform {
	source  = "github.com/foo/simple-api.git?ref=v0.1.0"
}

As you can see, it is very similar to calling a remote module directly with Terraform.

What about the input values passed to the module on call?

That’s right: we are currently not passing any values for the module input variables. But don’t panic, Terragrunt makes child’s play of passing input values to a Terraform module. All you have to do is add to a Terragrunt configuration file a block similar to:

inputs = {
	my_first_value = "foo"
	my_second_value = "bar"
}

What’s very convenient is that you can define an inputs block in several Terragrunt configuration files: when a file includes another one, Terragrunt merges their inputs before calling Terraform commands for you. This behavior makes it easy to factorize variables!
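As a minimal sketch of this merge, assume a parent file and a child file that includes it. The file paths are hypothetical and the value names are the placeholder ones used above. By default, Terragrunt merges the two maps key by key, and a value defined in the child takes precedence over the one from the parent.

```hcl
# parent/terragrunt.hcl (hypothetical path)
inputs = {
  my_first_value  = "foo"
  my_second_value = "bar"
}
```

```hcl
# parent/child/terragrunt.hcl (hypothetical path)
include "root" {
  path = find_in_parent_folders()
}

inputs = {
  my_second_value = "baz" # overrides "bar" from the parent
}

# Inputs passed to Terraform when running from parent/child:
#   my_first_value  = "foo"
#   my_second_value = "baz"
```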

Let’s dispatch the input values for our module in terragrunt.hcl files:

cors_urls is specific to the environment: we will place it in the leaf Terragrunt config file (aic/dev/terragrunt.hcl)
api_name is common to all environments: we will place it in the root Terragrunt config file (aic/terragrunt.hcl)
resource_group_name depends only on an environment identifier (its name in this case): we will use the file aic/dev/env.hcl to store this identifier as a local value that can be used to build the value of resource_group_name

The 3 files used to define the dev environment are now:

Root terragrunt.hcl:

# aic/terragrunt.hcl

# Read the variables defined in "env.hcl" file:
locals {
  env_vars = read_terragrunt_config("env.hcl")
}

generate "versions" {
  path = "versions.tf"
 
  if_exists = "overwrite_terragrunt"
 
  contents = <<EOF
terraform { 
  required_version = "~> 1.1.0"
 
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 2.90"
    }
  }

  backend "azurerm" {}
}

EOF
}

generate "provider" {
  path = "providers.tf"
 
  if_exists = "overwrite_terragrunt"
 
  contents = <<EOF
provider "azurerm" { 
  features {}
  skip_provider_registration = true
}
 
EOF
}

inputs = {
	api_name            = "my-api"
	resource_group_name = "my-rg-${local.env_vars.locals.environment}"
}

Leaf terragrunt.hcl:

# aic/dev/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

terraform {
	source  = "github.com/foo/simple-api.git?ref=v0.1.0"
}

inputs = {
	cors_urls = ["http://localhost:3000"]
}

env.hcl:

# aic/dev/env.hcl
locals {
  environment = "dev"
}
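Put together, these three files give Terraform the same values as the original main.tf. The plain Terraform block below is only a conceptual equivalent for the dev environment, shown to make the result of the merge visible. Terragrunt does not generate such a module block: it passes the inputs to Terraform as TF_VAR_ environment variables.

```hcl
# Conceptual equivalent of aic/dev after the merge (illustration only)
module "simple-api" {
  source = "github.com/foo/simple-api.git?ref=v0.1.0"

  api_name            = "my-api"                  # root terragrunt.hcl
  resource_group_name = "my-rg-dev"               # root terragrunt.hcl, using env.hcl
  cors_urls           = ["http://localhost:3000"] # leaf terragrunt.hcl
}
```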

Define the staging environment

We have now completed the migration of an environment from Terraform to Terragrunt. To see the benefits of this migration, let’s define the staging environment:

Create the aic/staging directory

Create the aic/staging/env.hcl file:

# aic/staging/env.hcl
locals {
  environment = "staging"
}

Create the aic/staging/terragrunt.hcl leaf config file:

# aic/staging/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

terraform {
	source  = "github.com/foo/simple-api.git?ref=v0.1.0"
}

inputs = {
	cors_urls = []
}

Terragrunt allows us to automatically reuse the input values in the root terragrunt.hcl without redefining them. We thus need to define only 2 values, instead of 4 before the migration. But Terragrunt is even more powerful if we define all the APIs of our example application in the same repository.

Bonus: Merge all APIs of the application in a single repository and use regions

Merging infrastructure repositories of all our APIs

Terragrunt allowed us to reduce code redundancy in a single repository. But the factorization is even better if we declare the configurations for the whole application in a single repository, using the same process as in the previous part to factorize variables efficiently. The repository structure might look like the following.

.
├── code/
│   └── contains the application code
├── iac/
│   ├── dev/
│   │   ├── network/
│   │   │   └── terragrunt.hcl
│   │   ├── database/
│   │   │   └── terragrunt.hcl
│   │   ├── api-a/
│   │   │   └── terragrunt.hcl
│   │   ├── api-b/
│   │   │   └── terragrunt.hcl
│   │   ├── ...
│   │   ├── terragrunt.hcl
│   │   └── env.hcl
│   ├── staging/
│   │   ├── network/
│   │   │   └── terragrunt.hcl
│   │   ├── database/
│   │   │   └── terragrunt.hcl
│   │   ├── api-a/
│   │   │   └── terragrunt.hcl
│   │   ├── api-b/
│   │   │   └── terragrunt.hcl
│   │   ├── ...
│   │   ├── terragrunt.hcl
│   │   └── env.hcl
│   ├── production/
│   │   ├── network/
│   │   │   └── terragrunt.hcl
│   │   ├── database/
│   │   │   └── terragrunt.hcl
│   │   ├── api-a/
│   │   │   └── terragrunt.hcl
│   │   ├── api-b/
│   │   │   └── terragrunt.hcl
│   │   ├── ...
│   │   ├── terragrunt.hcl
│   │   └── env.hcl
│   └── terragrunt.hcl
├── README.md
└── ...

Rework the repository structure

The first step is to alter the repository structure to add more directory layers (the more layers, the better the variable factorization). For instance, it is often useful to add a layer that separates production environments from non-production environments, in order to factorize the inputs concerning the size and SKUs of the resources. If we use two different Azure subscriptions (one for the production environment and one for the non-production environments), the subscription ID can also be defined at this level.
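As a sketch of such an extra layer, assume a hypothetical non-prod directory that contains a subscription.hcl file, with the dev and staging directories moved below it. The directory name, the file name, the app_service_sku input, and the placeholder subscription ID are all assumptions for illustration; they do not come from the simple-api module.

```hcl
# iac/non-prod/subscription.hcl (hypothetical file)
locals {
  subscription_id = "00000000-0000-0000-0000-000000000000" # placeholder, not a real ID
  app_service_sku = "B1"                                   # hypothetical module input
}
```

```hcl
# iac/terragrunt.hcl (excerpt, sketch)
locals {
  # Both files are searched for in the parent folders of the leaf being applied
  subscription_vars = read_terragrunt_config(find_in_parent_folders("subscription.hcl"))
  env_vars          = read_terragrunt_config(find_in_parent_folders("env.hcl"))
}

generate "provider" {
  path      = "providers.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "azurerm" {
  features {}
  skip_provider_registration = true
  subscription_id            = "${local.subscription_vars.locals.subscription_id}"
}
EOF
}

inputs = {
  resource_group_name = "my-rg-${local.env_vars.locals.environment}"
  app_service_sku     = local.subscription_vars.locals.app_service_sku
}
```

With this layout, a production directory would hold its own subscription.hcl, so the subscription ID and the SKU are each defined once per layer instead of once per environment.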

At this point, we might think that the migration to Terragrunt complicated our code. But each configuration file is often very short and there is as little code redundancy as possible, which makes the code easier to maintain.

In this article, we discovered Terragrunt and how it can help us reduce redundancy in Terraform code. We discussed its basic principles and the possibilities it provides to factorize variables. This article is a summary of what we have learned about Terragrunt so far, and the next step for my project team is to develop a Proof of Concept of the migration towards Terragrunt and measure the improvement with some previously defined KPIs. We will keep you updated on our experimentation with Terragrunt in future articles on Padok’s blog.

We used basic features of Terragrunt in this article, but Terragrunt has a couple of other use cases where it turns out to be helpful. Feel free to have a look at the different use cases of Terragrunt. Gruntwork also provides a very useful demo repository using Terragrunt and a testing framework called Terratest that seems promising.