Some infrastructure projects are split into different layers or repositories.
There are several ways to use resources from other layers in your project.
This standard applies to Terraform infrastructure. To go further, feel free to read our Terragrunt standard about layers.
Prefer using a data source when you have to refer to a resource that is not present in your current code. This increases code consistency and makes sure a missing resource fails at the plan step and not only at apply.
In Terraform, most resources have a matching data source to refer to an existing object in another layer.
Multiple layers infrastructure
.
├── application
│   ├── app.tf
│   └── _settings.tf
└── core
    ├── core.tf
    └── _settings.tf
core/main.tf
# Create a resource group
resource "azurerm_resource_group" "example" {
  name     = "example"
  location = "westeurope" # Azure region names contain no hyphen
}
application/main.tf
# Retrieve the resource group created in the core layer
data "azurerm_resource_group" "example" {
  # Only the name is needed for the lookup; location is exported by the data source
  name = "example"
}

resource "azurerm_app_service" "example" {
  name                = "example"
  resource_group_name = data.azurerm_resource_group.example.name
  location            = data.azurerm_resource_group.example.location
  ...
}
By doing that, Terraform checks the existence of the referenced resource at the plan step and stops if it does not exist. Moreover, your code adapts if the values of the remote resource are updated.
Another way to refer to a resource from another layer is to use the terraform_remote_state data source.
This links different Terraform states together: you retrieve the information from a state that reflects the remote infrastructure as of its last apply.
data "terraform_remote_state" "example" {
  backend = "azurerm"
  config = {
    storage_account_name = "example"
    container_name       = "tfstate"
    key                  = "prod.terraform.tfstate"
  }
}

resource "azurerm_virtual_network" "example" {
  name                = "example"
  resource_group_name = data.terraform_remote_state.example.outputs.resource_group_name
  location            = data.terraform_remote_state.example.outputs.location
}
This solution works, but it does not create a strong dependency between layer A and layer B. If you change a parameter in layer A, you will have to re-apply both layers, so you have to keep track of your layer dependencies yourself.
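For the remote state lookup above to resolve, the layer that owns the resource group has to expose the values as root-level outputs, since terraform_remote_state only reads root module outputs. A minimal sketch, assuming the outputs live in a hypothetical core/outputs.tf file and reuse the names read in the example (resource_group_name and location):

```hcl
# core/outputs.tf (hypothetical file name)
# Values exposed to other layers through the core layer's state.
output "resource_group_name" {
  description = "Name of the resource group created by the core layer"
  value       = azurerm_resource_group.example.name
}

output "location" {
  description = "Azure region of the resource group created by the core layer"
  value       = azurerm_resource_group.example.location
}
```

These outputs are only refreshed in the state when the core layer is applied, which is why the consuming layer sees the infrastructure as of the last apply.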
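The last way to refer to a resource from a different layer is to hard-code the string value you want, and pray for it to be valid.
In a small infrastructure it may seem like a good idea, but it is not: unlike a data source, a hard-coded value is not checked at the plan step and does not adapt when the remote resource changes.
Do not do that.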
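To make the pattern to avoid concrete, here is a sketch of what hard-coded references look like. The address_space value is an illustrative assumption, not taken from the examples above:

```hcl
# What NOT to do: values owned by another layer are copied by hand.
resource "azurerm_virtual_network" "example" {
  name          = "example"
  address_space = ["10.0.0.0/16"] # illustrative value

  # Nothing checks at plan time that this resource group exists,
  # and nothing updates these strings if the core layer changes.
  resource_group_name = "example"
  location            = "westeurope"
}
```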
When your layer refers to multiple data sources used by your different modules, you may want to declare them in dedicated files. In that case, two patterns stand out for your code architecture:
- a single data.tf file
- WYSIWYG files named after the purpose of the data
Neither of these two patterns is better than the other; choose the one best suited to your needs.
A single data.tf file
A data source is like a variable: it is a necessary input to use your module or create your resource. Therefore, all data sources from a layer should be gathered in a single data.tf file (as is done for variables in variables.tf).
This approach has the following advantages:
- no data block sits next to your resources to pollute the code and hurt readability
- you only have to look in the data.tf file to check whether a data source is already there
Moreover, this does not affect the comprehension of your code: if your data source is well named, you easily understand what it represents and do not need to dive into the data block.
.
├── core.tf
├── data.tf
└── _settings.tf
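As an illustration of this layout, a data.tf file might look like the sketch below. It assumes the layer reads a resource group and a virtual network that were both created elsewhere under the name "example"; these names are placeholders:

```hcl
# data.tf
# Every external object this layer depends on is declared here.
data "azurerm_resource_group" "example" {
  name = "example"
}

data "azurerm_virtual_network" "example" {
  name                = "example"
  resource_group_name = data.azurerm_resource_group.example.name
}
```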
As the opposite of the solution above, following our WYSIWYG convention means creating files named after the purpose of your data.
That way, you know where to look for existing data simply by its purpose and not by its code origin. The data is considered part of your infrastructure code and not an input.
.
├── core.tf
├── rg.tf
├── virtual_network.tf
└── _settings.tf
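With this layout, a data source sits in the file that matches its purpose, next to the resources that use it. A sketch, assuming the virtual network named "example" is created in another layer and that the subnet name and address range are placeholders:

```hcl
# rg.tf
data "azurerm_resource_group" "example" {
  name = "example"
}

# virtual_network.tf
# The existing network is looked up here because it is network-related code.
data "azurerm_virtual_network" "example" {
  name                = "example"
  resource_group_name = data.azurerm_resource_group.example.name
}

resource "azurerm_subnet" "example" {
  name                 = "example"       # placeholder
  address_prefixes     = ["10.0.1.0/24"] # placeholder
  resource_group_name  = data.azurerm_resource_group.example.name
  virtual_network_name = data.azurerm_virtual_network.example.name
}
```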
Using data sources in a Terraform module is not recommended because it can create circular dependencies between layers and force you to run targeted applies.
Prefer declaring variables in your module, then call your data sources where the module is applied.
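A minimal sketch of this recommendation, assuming a hypothetical local module at ../modules/app that takes the resource group name and location as inputs:

```hcl
# modules/app/variables.tf (hypothetical module)
# The module declares what it needs and performs no lookup itself.
variable "resource_group_name" {
  type        = string
  description = "Resource group in which the app is deployed"
}

variable "location" {
  type        = string
  description = "Azure region in which the app is deployed"
}

# application/app.tf
# The layer performs the lookup and passes the values to the module.
data "azurerm_resource_group" "example" {
  name = "example"
}

module "app" {
  source = "../modules/app"

  resource_group_name = data.azurerm_resource_group.example.name
  location            = data.azurerm_resource_group.example.location
}
```

Keeping the lookup in the layer means the module stays reusable with any resource group, and the dependency on the other layer is visible where the module is called.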