Architectural Overview¶
Components¶
The server¶
The server is a REST server which exposes the API consumed by the Web UI. It has the following responsibilities:
- listening for Git webhook events
Other features will be implemented once the Web UI is in development.
The repository Controller¶
The repository controller is a Kubernetes controller that is only used to register TerraformRepository resources.
The layer Controller¶
The layer controller is a Kubernetes controller that continuously monitors declared TerraformLayer resources.
It regularly creates TerraformRun resources that run a terraform plan for each of your layers to check whether drift has been introduced. If so, it can create a TerraformRun that runs a terraform apply.
It is also responsible for running terraform plan and apply when there is a new commit on your layer.
The run Controller¶
The run controller is a Kubernetes controller that continuously monitors declared TerraformRun resources.
It is responsible for running the terraform plan and terraform apply commands by creating runner pods. It handles failures and retries of the runner pods.
It also generates Leases to make sure that no two terraform commands are launched on the same layer at the same time.
The Redis instance¶
The Redis instance is used to store the binary generated by terraform plan before running the apply. We also store information about the plan/apply output so that it can be printed in the resources' statuses.
Implementation¶
The operator has been bootstrapped using the operator-sdk.
The CLI used to start the different components is implemented using cobra.
The TerraformLayer Controller¶
The status of a TerraformLayer is defined using the conditions standard defined by the Kubernetes community.
3 conditions are defined for a layer:
- IsPlanArtifactUpToDate. This condition is used for drift detection. It is evaluated by comparing the timestamp of the last terraform plan that ran with the current date. The timestamp of the last plan is "stored" using an annotation.
- IsApplyUpToDate. This condition is used to check whether an apply needs to run after the last plan. It is evaluated by comparing the checksum of the last planned binary with the checksum of the last applied binary, both stored in the annotations.
- IsLastRelevantCommitPlanned. This condition is used to check whether a new commit has been made to the layer and needs to be applied. It is evaluated by comparing the commit used for the last plan, the last commit that introduced changes to the layer, and the last commit made to the same branch of the repository. Those commits are "stored" as annotations.
Info
We use annotations to store information because we do not want to rely too heavily on the uptime of the Redis instance.
With those 3 conditions, we defined 3 states:
- Idle. This is the state of a layer if no runner needs to be started.
- PlanNeeded. This is the state of a layer if burrito needs to start a plan runner.
- ApplyNeeded. This is the state of a layer if burrito needs to start an apply runner.
Info
If you use the dry remediation strategy and an apply is needed, the layer will stay in the ApplyNeeded state as long as it does not need to enter the PlanNeeded state.
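One way to picture how the three conditions map to the three layer states is the sketch below. The type names, the `layerState` function, and in particular the precedence (a needed plan wins over a needed apply) are assumptions chosen to match the description above; the actual controller logic may differ.

```go
package main

import "fmt"

// LayerState is one of the three layer states described in the text.
type LayerState string

const (
	Idle        LayerState = "Idle"
	PlanNeeded  LayerState = "PlanNeeded"
	ApplyNeeded LayerState = "ApplyNeeded"
)

// LayerConditions holds the evaluated value of each condition.
type LayerConditions struct {
	IsPlanArtifactUpToDate      bool
	IsApplyUpToDate             bool
	IsLastRelevantCommitPlanned bool
}

// layerState derives a state from the conditions.
// Assumption: needing a plan takes precedence over needing an apply.
func layerState(c LayerConditions) LayerState {
	switch {
	case !c.IsPlanArtifactUpToDate || !c.IsLastRelevantCommitPlanned:
		return PlanNeeded
	case !c.IsApplyUpToDate:
		return ApplyNeeded
	default:
		return Idle
	}
}

func main() {
	// The plan is fresh and covers the latest commit, but it was never applied.
	c := LayerConditions{
		IsPlanArtifactUpToDate:      true,
		IsApplyUpToDate:             false,
		IsLastRelevantCommitPlanned: true,
	}
	fmt.Println(layerState(c)) // ApplyNeeded
}
```

With this ordering, a layer in ApplyNeeded leaves that state only when an apply runs or when a new plan becomes necessary.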
The TerraformRun Controller¶
The status of a TerraformRun is also defined using the conditions standard defined by the community.
5 conditions are defined for a run:
- HasStatus. This condition is used to check if a TerraformRun has already been reconciled by the controller.
- HasReachedRetryLimit. Used to check if a TerraformRun has reached the maximum number of retries.
- HasSucceeded. Used to check if a TerraformRun has already succeeded (runner pod exited successfully).
- IsRunning. Used to check if a TerraformRun is currently running by checking the current phase of its associated pod.
- IsInfailureGracePeriod. This condition is used to check if a Terraform workflow has already failed. If so, we use an exponential backoff strategy before restarting a runner on the given layer.
With those 5 conditions, we defined 6 states:
- Initial. This is the state of a run when it has just been created and has launched its first runner pod.
- Running. This is the state of a run if a runner pod is currently running.
- FailureGracePeriod. This is the state of a run if a plan or apply runner has failed.
- Retrying. This is an intermediate state of a run if a runner pod has failed and is being restarted (not in failure grace period anymore).
- Succeeded. This is one of the two final states a run can have. It means that the runner pod has exited successfully.
- Failed. This is the other final state a run can have. It means that the run has failed multiple times and has reached the maximum number of retries.
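The six run states can be derived from the five conditions with a simple precedence chain, sketched below. The evaluation order shown here (for example, checking the retry limit before the grace period) is an assumption made for illustration, as are the type and function names; the source only lists the conditions and the states.

```go
package main

import "fmt"

// RunState is one of the six run states described in the text.
type RunState string

const (
	Initial            RunState = "Initial"
	Running            RunState = "Running"
	FailureGracePeriod RunState = "FailureGracePeriod"
	Retrying           RunState = "Retrying"
	Succeeded          RunState = "Succeeded"
	Failed             RunState = "Failed"
)

// RunConditions holds the evaluated value of each condition.
// Field names follow the condition names used in the text.
type RunConditions struct {
	HasStatus              bool
	HasReachedRetryLimit   bool
	HasSucceeded           bool
	IsRunning              bool
	IsInfailureGracePeriod bool
}

// runState derives a state from the conditions.
// Assumption: the cases are checked in the order written below.
func runState(c RunConditions) RunState {
	switch {
	case !c.HasStatus:
		return Initial
	case c.HasSucceeded:
		return Succeeded
	case c.IsRunning:
		return Running
	case c.HasReachedRetryLimit:
		return Failed
	case c.IsInfailureGracePeriod:
		return FailureGracePeriod
	default:
		return Retrying
	}
}

func main() {
	// A reconciled run whose runner failed and is still backing off.
	c := RunConditions{HasStatus: true, IsInfailureGracePeriod: true}
	fmt.Println(runState(c)) // FailureGracePeriod
}
```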
The TerraformRun controller also creates and deletes the Kubernetes leases to avoid concurrent use of Terraform on the same layer.
Info
N.B.: We use lease objects in order to not have to rely on the Redis instance for layer locking.
The runners¶
The runner image implementation relies heavily on Go libraries provided by HashiCorp, such as tfexec and hc-install, which allow us to dynamically download and use any version of the Terraform binary.
Thus, we support any existing version of Terraform.
The runners also support any existing version of Terragrunt.
The runner is responsible for updating the annotations of the layer it is associated with, to record which commit was planned or applied and when.