Architectural Overview

Components

The server

The server is a REST server which exposes the API consumed by the Web UI. It has the following responsibilities:

Listening for Git webhook events

Other features will be implemented once the Web UI is in development.

The repository Controller

The repository controller is a Kubernetes Controller which is only used to register TerraformRepository resources.

The layer Controller

The layer controller is a Kubernetes Controller which continuously monitors declared TerraformLayer resources. It regularly creates TerraformRun resources which run a terraform plan for each of your layers to check whether drift has been introduced. If so, it can create a TerraformRun that runs a terraform apply.

It is also responsible for running your Terraform plan and apply if there is a new commit on your layer.

The run Controller

The run controller is a Kubernetes Controller which continuously monitors declared TerraformRun resources.

It is responsible for running the terraform plan and terraform apply commands by creating runner pods. It handles failure and retries of the runner pods.

It also generates Leases to make sure that no two terraform commands run concurrently on the same layer.

The Redis instance

The Redis instance is used to store the binary generated by terraform plan before running the apply. We also store information about the plan/apply output so that it can be displayed in the resources' statuses.

Implementation

The operator has been bootstrapped using the operator-sdk.

The CLI used to start the different components is implemented using cobra.

The TerraformLayer Controller

The status of a TerraformLayer follows the conditions convention defined by the Kubernetes community.

3 conditions are defined for a layer:

IsPlanArtifactUpToDate. This condition is used for drift detection. It is evaluated by comparing the timestamp of the last terraform plan that ran with the current date. The timestamp of the last plan is "stored" using an annotation.
IsApplyUpToDate. This condition is used to check if an apply needs to run after the last plan. It is evaluated by comparing a checksum of the last planned binary with a checksum of the last applied binary, both stored in the annotations.
IsLastRelevantCommitPlanned. This condition is used to check if a new commit has been made to the layer and needs to be applied. It is evaluated by comparing the commit used for the last plan, the last commit which introduced changes to the layer, and the last commit made to the same branch of the repository. Those commits are "stored" as annotations.

Info

We use annotations to store information because we do not want to rely too heavily on the uptime of the Redis instance.

With those 3 conditions, we defined 3 states:

Idle. This is the state of a layer if no runner needs to be started.
PlanNeeded. This is the state of a layer if burrito needs to start a plan runner.
ApplyNeeded. This is the state of a layer if burrito needs to start an apply runner.
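The mapping from the three conditions to the three states could look like the sketch below. The precedence (a needed plan wins over a needed apply) is an assumption inferred from the descriptions above; the function and type names are hypothetical.

```go
package main

import "fmt"

// LayerState is a hypothetical type naming the three layer states.
type LayerState string

const (
	Idle        LayerState = "Idle"
	PlanNeeded  LayerState = "PlanNeeded"
	ApplyNeeded LayerState = "ApplyNeeded"
)

// layerState derives a state from the three layer conditions.
// Assumed precedence: a plan is needed when the plan artifact is outdated
// or the last relevant commit has not been planned; otherwise an apply is
// needed when the last plan has not been applied; otherwise the layer is idle.
func layerState(planUpToDate, applyUpToDate, lastCommitPlanned bool) LayerState {
	switch {
	case !planUpToDate || !lastCommitPlanned:
		return PlanNeeded
	case !applyUpToDate:
		return ApplyNeeded
	default:
		return Idle
	}
}

func main() {
	fmt.Println(layerState(true, true, true))
	fmt.Println(layerState(false, true, true))
	fmt.Println(layerState(true, false, true))
}
```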

Info

If you use the dry remediation strategy and an apply is needed, the layer stays in the ApplyNeeded state as long as it does not need to enter the PlanNeeded state.

The TerraformRun Controller

The status of a TerraformRun is defined using the same conditions standards.

5 conditions are defined for a run:

HasStatus. This condition is used to check if a TerraformRun has already been reconciled by the controller.
HasReachedRetryLimit. Used to check if a TerraformRun has reached the maximum number of retries.
HasSucceeded. Used to check if a TerraformRun has already succeeded (runner pod exited successfully).
IsRunning. Used to check if a TerraformRun is currently running by checking the current phase of its associated pod.
IsInfailureGracePeriod. This condition is used to check if a Terraform workflow has already failed. If so, we use an exponential backoff strategy before restarting a runner on the given layer.

With those 5 conditions, we defined 6 states:

Initial. This is the state of a run when it has just been created and has launched its first runner pod.
Running. This is the state of a run if a runner pod is currently running.
FailureGracePeriod. This is the state of a run if a plan or apply runner has failed.
Retrying. This is an intermediate state of a run if a runner pod has failed and is being restarted (not in failure grace period anymore).
Succeeded. This is one of the two final states a run can have. It means that the runner pod has exited successfully.
Failed. This is the other final state a run can have. It means that the run has failed multiple times and has reached the maximum number of retries.
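One possible way to derive the six states from the five conditions is sketched below. The order in which conditions are checked is an assumption made for this example, and the type and function names are hypothetical; the real controller may evaluate them differently.

```go
package main

import "fmt"

// RunState is a hypothetical type naming the six run states.
type RunState string

const (
	Initial            RunState = "Initial"
	Running            RunState = "Running"
	FailureGracePeriod RunState = "FailureGracePeriod"
	Retrying           RunState = "Retrying"
	Succeeded          RunState = "Succeeded"
	Failed             RunState = "Failed"
)

// runConditions groups the five conditions described above.
type runConditions struct {
	HasStatus              bool
	HasReachedRetryLimit   bool
	HasSucceeded           bool
	IsRunning              bool
	IsInFailureGracePeriod bool
}

// runState maps conditions to a state. The precedence is an assumption.
func runState(c runConditions) RunState {
	switch {
	case !c.HasStatus:
		return Initial // never reconciled yet
	case c.HasSucceeded:
		return Succeeded
	case c.IsRunning:
		return Running
	case c.HasReachedRetryLimit:
		return Failed
	case c.IsInFailureGracePeriod:
		return FailureGracePeriod
	default:
		return Retrying // failed, grace period over, retries remaining
	}
}

func main() {
	fmt.Println(runState(runConditions{}))
	fmt.Println(runState(runConditions{HasStatus: true, IsRunning: true}))
	fmt.Println(runState(runConditions{HasStatus: true, IsInFailureGracePeriod: true}))
}
```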

The TerraformRun controller also creates and deletes the Kubernetes leases to avoid concurrent use of Terraform on the same layer.

Info

N.B.: We use lease objects in order to not have to rely on the Redis instance for layer locking.

The runners

The runner image implementation relies heavily on Go libraries provided by HashiCorp, such as tfexec and hc-install, which allow us to dynamically download and use any version of the Terraform binary. Thus, we support any existing version of Terraform.

The runners also support any existing version of Terragrunt.

The runner is responsible for updating the annotations of the layer it is associated with, in order to store information about which commit was planned or applied, and when.