Posted on 27 January 2022, updated on 6 February 2024.
Organizing the code in a Terraform project is a complex issue. A perfect general solution to this problem doesn’t exist. With my project team, we came across Terragrunt, a Terraform wrapper promising to reduce code redundancy. This article summarizes what we learned so far.
The issue with the Terraform code
My project team is currently facing this issue: as our project scope grew, we found ourselves spending a lot of time copying and replicating blocks of code. This makes the Terraform codebase difficult to maintain and increases the risk of small but time-consuming bugs, for example when we forget to update the value of a variable. We thus decided to take some time to think of a better code organization.
The application
In this article, we will consider a simple example infrastructure composed of several APIs (API A, API B, etc.) running in containers deployed on Azure App Services. Each API will also have a Keyvault to store the secrets injected in the containers as environment variables and an API in an Azure API Management to route HTTP traffic to the App Service (the API Management will be common to all APIs as it is quite expensive). This infrastructure is deployed in 3 distinct environments:
- A production environment: production
- Two non-production environments: dev and staging
Below is an illustration of what an environment of the example infrastructure looks like.
💡 If you are not familiar with Azure resources, don’t panic! Most of the concepts and examples are common to all cloud providers.
The Terraform code
Because it is very painful to work on a big Terraform codebase, the Terraform code is split into parts: each API has its own Terraform code in the API repository and the common resources (A Resource Group, the API Management, a Container Registry, a database, etc.) are described as code in a separate repository.
In order to prevent code repetition, a Terraform module (let’s call it simple-api) contains the code to deploy an App Service, a Keyvault, and an API in the API Management. The code for this module is hosted in a distinct repository to allow the versioning of the module. I will assume that we can find this module at github.com/foo/simple-api.
Here is an example of the structure of an API repository:
.
├── code/
│ └── contains the application code
├── terraform/
│ ├── dev/
│ │ ├── main.tf
│ │ ├── providers.tf
│ │ └── versions.tf
│ ├── staging/
│ │ ├── main.tf
│ │ ├── providers.tf
│ │ └── versions.tf
│ └── production/
│ ├── main.tf
│ ├── providers.tf
│ └── versions.tf
├── README.md
└── ...
In every environment, there are 3 different Terraform files in this example:
- versions.tf is used to pin the Terraform binary and provider versions. It is similar to:
# terraform/dev/versions.tf
terraform {
  required_version = "~> 1.1.0"
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 2.90"
    }
  }
  backend "azurerm" {}
}
- providers.tf is used to configure the Azure provider:
# terraform/dev/providers.tf
provider "azurerm" {
  features {}
  skip_provider_registration = true
}
- main.tf is a call to the simple-api module:
# terraform/dev/main.tf
module "simple-api" {
  source = "git@github.com:foo/simple-api.git?ref=v0.1.0"
  # Passing input values to the module variables
  cors_urls           = ["http://localhost:3000"]
  api_name            = "my-api"
  resource_group_name = "my-rg-dev"
  # ...
}
The issue
As the APIs composing the application can be quite different (different language, CORS policy, OAuth 2.0 authentication, etc.), the simple-api module requires a relatively large number of values to be provided at every module call to support all these use cases. Some values such as cors_urls are specific to a given environment, but others like api_name are not and are therefore repeated in the Terraform configuration of every environment.
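To make the repetition concrete, here is a sketch of what the production call might look like next to the dev one. The production CORS URL and resource group name are hypothetical values chosen for illustration; only api_name and the module source come from the example.

```hcl
# terraform/production/main.tf (illustrative sketch)
module "simple-api" {
  source = "git@github.com:foo/simple-api.git?ref=v0.1.0"

  # Environment-specific values: these legitimately differ from dev
  cors_urls           = ["https://app.example.com"] # hypothetical URL
  resource_group_name = "my-rg-production"          # assumed naming, following "my-rg-dev"

  # Repeated value: identical in dev, staging and production
  api_name = "my-api"
  # ...
}
```

Every value in the last group has to be kept in sync by hand across the three environment folders.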
This is where Terragrunt comes in handy, as it provides several functionalities to keep Terraform code DRY (Don’t Repeat Yourself).
A bit of theory about Terragrunt
What is Terragrunt?
Terragrunt is a thin wrapper around Terraform, maintained by Gruntwork, that helps manage remote Terraform states and Terraform modules. It aims at reducing code repetition. Moreover, it is very easy to use: you just have to install Terragrunt and replace terraform with terragrunt in all Terraform CLI commands (terragrunt apply, terragrunt plan, ...).
Let’s see what we can do with Terragrunt to improve our Terraform codebase!
Decouple the logic of the Terraform code from its implementation
The main advantage of Terragrunt is that it allows decoupling the logic of your Terraform code (which lies in your Terraform modules) from its implementation (which lies in the configuration of the different environments calling multiple Terraform modules). Terragrunt can thus be considered a tool to orchestrate your Terraform modules.
In concrete terms, we will replace the traditional *.tf files in the configuration of our API with Terragrunt *.hcl configuration files. By doing so, we will be able to define the input values passed when calling the simple-api module anywhere in our repository. Factorizing values is straightforward in this configuration!
That’s it for the theoretical part, let’s migrate the Terraform code for API A to Terragrunt.
Migration of one API to Terragrunt
As mentioned in the previous part, we will have no *.tf files in our configuration after the migration. The Terraform code will be used only for the logic in the modules, decoupled from the configuration, which uses only *.hcl files.
The target repository structure after the migration is:
.
├── code/
│ └── contains the application code
├── aic/
│ ├── dev/
│ │ ├── terragrunt.hcl
│ │ └── env.hcl
│ ├── staging/
│ │ ├── terragrunt.hcl
│ │ └── env.hcl
│ ├── production/
│ │ ├── terragrunt.hcl
│ │ └── env.hcl
│ └── terragrunt.hcl
├── README.md
└── ...
Let’s go through the steps to achieve this goal!
Factorize versions.tf and providers.tf
The first step of the migration is to get rid of the files versions.tf and providers.tf. As these files are common to all environments, we will migrate their content to the file terragrunt.hcl at the root of the aic folder (which replaces the former terraform folder).
The root terragrunt.hcl will then look like this:
# aic/terragrunt.hcl
generate "versions" {
path = "versions.tf"
if_exists = "overwrite_terragrunt"
contents = <<EOF
terraform {
required_version = "~> 1.1.0"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 2.90"
}
}
backend "azurerm" {}
}
EOF
}
generate "provider" {
path = "providers.tf"
if_exists = "overwrite_terragrunt"
contents = <<EOF
provider "azurerm" {
features {}
skip_provider_registration = true
}
EOF
}
These two blocks tell Terragrunt to generate the two files versions.tf and providers.tf before applying the Terraform code. But as the Terragrunt configuration is always applied from a leaf terragrunt.hcl, we need to tell Terragrunt to import the root terragrunt.hcl file. To do so, let’s add the following block in every leaf terragrunt.hcl file.
# aic/dev/terragrunt.hcl
include "root" {
path = find_in_parent_folders()
}
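The empty backend "azurerm" {} block generated above still needs its settings when Terraform is initialized. One way to supply them, sketched here under assumptions, is a remote_state block in the root terragrunt.hcl: when it has no generate attribute, Terragrunt passes the config map to terraform init as backend configuration. The resource group, storage account, and container names below are hypothetical placeholders.

```hcl
# aic/terragrunt.hcl (sketch, all names below are placeholders)
remote_state {
  backend = "azurerm"

  config = {
    resource_group_name  = "my-tfstate-rg"  # hypothetical
    storage_account_name = "mytfstatestore" # hypothetical
    container_name       = "tfstate"        # hypothetical

    # One state file per environment: path_relative_to_include() returns
    # the leaf's path relative to this root file, e.g. "dev" or "staging".
    key = "${path_relative_to_include()}/terraform.tfstate"
  }
}
```

Defining the state key from the folder path means a new environment gets its own state file without any extra configuration.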
Migrate the main.tf file to Terragrunt
In order to migrate the main.tf file, we must know how to make a call to a module in Terragrunt. The good news is that it is very easy: add the following block to the leaf terragrunt.hcl.
terraform {
source = "github.com/foo/simple-api.git?ref=v0.1.0"
}
As you can see, it is very similar to calling a remote module directly with Terraform.
What about the input values passed to the module on call?
That’s right, we are currently passing no values corresponding to the module input variables. But don’t panic, Terragrunt makes child’s play of passing input values to a Terraform module. All you have to do is add a block similar to the following to a Terragrunt configuration file:
inputs = {
my_first_value = "foo"
my_second_value = "bar"
}
What’s very convenient is that you can define an inputs block in different Terragrunt configuration files (for instance a root file and the leaf file that includes it), and Terragrunt will merge them for you before calling Terraform commands. This behavior makes it easy to factorize variables!
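For these inputs to reach Terraform, the module must declare matching variables: Terragrunt exposes each entry of inputs to Terraform as a TF_VAR_<name> environment variable. Here is a minimal sketch of what the declarations in simple-api could look like, assuming the variable names used in this article. The types, descriptions, and default value are guesses, not the module's real code.

```hcl
# variables.tf in the simple-api module (hypothetical excerpt)
variable "api_name" {
  type        = string
  description = "Name of the API, shared by all environments"
}

variable "resource_group_name" {
  type        = string
  description = "Resource group in which the API resources are created"
}

variable "cors_urls" {
  type        = list(string)
  description = "Origins allowed by the CORS policy of the API"
  default     = [] # assumed default: no origin allowed
}
```

An input whose name matches no declared variable is simply not used by the module, so a typo in a key of inputs can go unnoticed.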
Let’s dispatch the input values for our module across the terragrunt.hcl files:
- cors_urls is specific to the environment: we will place it in the leaf Terragrunt config file (aic/dev/terragrunt.hcl)
- api_name is common to all environments: we will place it in the root Terragrunt config file (aic/terragrunt.hcl)
- resource_group_name depends only on an environment identifier (its name in this case). We will use the file aic/dev/env.hcl to store this identifier as a local value that can be used to create the value of resource_group_name
The 3 files used to define the dev environment are now:
- Root terragrunt.hcl:
# aic/terragrunt.hcl
# Read the variables defined in "env.hcl" file:
locals {
  env_vars = read_terragrunt_config("env.hcl")
}
generate "versions" {
  path      = "versions.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
terraform {
  required_version = "~> 1.1.0"
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 2.90"
    }
  }
  backend "azurerm" {}
}
EOF
}
generate "provider" {
  path      = "providers.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "azurerm" {
  features {}
  skip_provider_registration = true
}
EOF
}
inputs = {
  api_name            = "my-api"
  resource_group_name = "my-rg-${local.env_vars.locals.environment}"
}
- Leaf terragrunt.hcl:
# aic/dev/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}
terraform {
  source = "github.com/foo/simple-api.git?ref=v0.1.0"
}
inputs = {
  cors_urls = ["http://localhost:3000"]
}
- env.hcl:
# aic/dev/env.hcl
locals {
  environment = "dev"
}
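As a possible variant, the lookup of env.hcl in the root file can be made explicit about which directory it targets. The sketch below relies on get_terragrunt_dir(), which returns the directory of the terragrunt.hcl being run (the leaf), even when it is called from an included root file. Whether the shorter relative path behaves identically may depend on the Terragrunt version, so treat this as an option rather than a required change.

```hcl
# aic/terragrunt.hcl (alternative locals and inputs, sketch)
locals {
  # Resolves to aic/dev/env.hcl when Terragrunt runs from aic/dev
  env_vars    = read_terragrunt_config("${get_terragrunt_dir()}/env.hcl")
  environment = local.env_vars.locals.environment
}

inputs = {
  api_name            = "my-api"
  resource_group_name = "my-rg-${local.environment}"
}
```

The extra environment local is only there to shorten the expressions that use it.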
Define the staging environment
We have now completed the migration of an environment from Terraform to Terragrunt. To see the benefits of this migration, let’s define the staging environment:
- Create the aic/staging directory
- Create the aic/staging/env.hcl file:
# aic/staging/env.hcl
locals {
  environment = "staging"
}
- Create the aic/staging/terragrunt.hcl leaf config file:
# aic/staging/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}
terraform {
  source = "github.com/foo/simple-api.git?ref=v0.1.0"
}
inputs = {
  cors_urls = []
}
Terragrunt allows us to automatically reuse the input values in the root terragrunt.hcl without redefining them. We thus need to define only 2 values, instead of 4 before the migration. But Terragrunt is even more powerful if we define all the APIs of our example application in the same repository.
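As an illustration, the values Terraform ends up receiving for staging, once Terragrunt has merged the root inputs with the leaf inputs, would be equivalent to the following. This is not a file to create, only the merged result written out, and it assumes the root and leaf files of this example.

```hcl
# Effective inputs for aic/staging (merged result, for illustration only)
inputs = {
  # From the root aic/terragrunt.hcl
  api_name            = "my-api"
  resource_group_name = "my-rg-staging" # built from environment = "staging" in env.hcl

  # From the leaf aic/staging/terragrunt.hcl
  cors_urls = []
}
```

If the same key were defined at both levels, the value from the leaf file would take precedence over the one from the root file.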
Bonus: Merge all APIs of the application in a single repository and use regions
Merging infrastructure repositories of all our APIs
Terragrunt allowed us to reduce code redundancy in a single repository. But the factorization is even better if we declare the configurations for the whole application in a single repository. Using the same process as in the previous part to factorize variables efficiently, the repository structure might look like the following.
.
├── code/
│ └── contains the application code
├── iac/
│ ├── dev/
│ │ ├── network/
│ │ │ └── terragrunt.hcl
│ │ ├── database/
│ │ │ └── terragrunt.hcl
│ │ ├── api-a/
│ │ │ └── terragrunt.hcl
│ │ ├── api-b/
│ │ │ └── terragrunt.hcl
│ │ ├── ...
│ │ ├── terragrunt.hcl
│ │ └── env.hcl
│ ├── staging/
│ │ ├── network/
│ │ │ └── terragrunt.hcl
│ │ ├── database/
│ │ │ └── terragrunt.hcl
│ │ ├── api-a/
│ │ │ └── terragrunt.hcl
│ │ ├── api-b/
│ │ │ └── terragrunt.hcl
│ │ ├── ...
│ │ ├── terragrunt.hcl
│ │ └── env.hcl
│ ├── production/
│ │ ├── network/
│ │ │ └── terragrunt.hcl
│ │ ├── database/
│ │ │ └── terragrunt.hcl
│ │ ├── api-a/
│ │ │ └── terragrunt.hcl
│ │ ├── api-b/
│ │ │ └── terragrunt.hcl
│ │ ├── ...
│ │ ├── terragrunt.hcl
│ │ └── env.hcl
│ └── terragrunt.hcl
├── README.md
└── ...
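With every component in one repository, a leaf configuration can also consume the outputs of another one through Terragrunt's dependency block. Here is a minimal sketch, assuming the database module exposes an output named connection_string and that simple-api accepts a database_connection_string variable (both names are hypothetical).

```hcl
# iac/dev/api-a/terragrunt.hcl (excerpt, sketch)

# Read the outputs of the database component deployed from ../database
dependency "database" {
  config_path = "../database"
}

inputs = {
  # Hypothetical variable and output names
  database_connection_string = dependency.database.outputs.connection_string
}
```

Terragrunt reads the outputs from the state of the database component, so that component has to be applied before api-a.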
Rework the repository structure
The first step is to alter the repository structure to add more directory layers (the more layers, the better the variable factorization). For instance, it is often interesting to add a layer to separate production environments from non-production environments, in order to factorize the inputs concerning the size and SKUs of the resources. If we use two different Azure subscriptions (one for the production environment and one for the non-production environments), the subscription ID can also be defined at this level.
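Such an extra layer could be sketched as follows, assuming a non-production folder inserted above dev and staging. The file name subscription.hcl, the input names, and all values are hypothetical; read_terragrunt_config and find_in_parent_folders are the same functions used earlier in this article.

```hcl
# iac/non-production/subscription.hcl (hypothetical file and values)
locals {
  subscription_id = "00000000-0000-0000-0000-000000000000" # placeholder
  app_service_sku = "B1"                                    # assumed small SKU for non-production
}

# iac/terragrunt.hcl (excerpt of the root file, sketch)
locals {
  # Walks up from the leaf folder until it finds subscription.hcl
  subscription_vars = read_terragrunt_config(find_in_parent_folders("subscription.hcl"))
}

inputs = {
  subscription_id = local.subscription_vars.locals.subscription_id
  app_service_sku = local.subscription_vars.locals.app_service_sku
}
```

A matching iac/production/subscription.hcl would then hold the production subscription ID and larger SKUs, with no change to the leaf files.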
At this point, we might think that the migration to Terragrunt complicated our code. But each configuration file is often very short and there is as little code redundancy as possible, which makes the code easier to maintain.
In this article, we discovered Terragrunt and how it can help us reduce code redundancy in Terraform code. We discussed its basic principles and the possibilities it provides to factorize variables. This article is a summary of what we learned about Terragrunt so far, and the next step for my project team is to develop a Proof of Concept of the migration towards Terragrunt and measure the improvement with some previously defined KPIs. We will keep you updated on our experimentation with Terragrunt in future articles on Padok’s blog.
We used basic features of Terragrunt in this article, but Terragrunt has a couple of other use cases where it turns out to be helpful. Feel free to have a look at the different use cases of Terragrunt. Gruntwork also provides a very useful demo repository using Terragrunt and a testing framework called Terratest that seems promising.