
How to manage Terraform state?

Written by: Stephane Jourdan

May 26, 2020


Editor's note: This post originally appeared on CloudSkiff.com. CloudSkiff joined Snyk in October 2021.

How to manage Terraform state? And by the way, what is a TFState file? How does it make Terraform code different from other configuration management tools and what are the best practices around it?


Knowing how to manage Terraform state is key. First of all, what exactly is a TFState?

The TFState file is what sets Terraform apart from other systems. You can spin up and launch infrastructure with other configuration management tools like Chef, SaltStack, and Ansible, but the biggest difference with Terraform lies in this state.

You can think of your TFState file as a big JSON document describing the reality of your infrastructure. It works together with the Terraform code, which declares the so-called “desired state” you want to achieve.

This desired state is declarative: you declare in your code that you want a specific resource with a specific configuration, and when you apply this code, Terraform “talks” to your cloud provider’s API and creates those resources. Once it’s done, it records what was actually deployed on the cloud provider’s side in this JSON file.

At the end of the day, you have three parties that need to agree:

  • your code

  • the reality on your cloud provider’s account

  • the state file

The state file mirrors the last successful apply of your code. It works with any kind of resource: you can use it with your cloud provider if you’re interested in launching infrastructure resources, but it works just as well with any other kind of provider.

Let’s say that you deploy Helm charts using Terraform. If you do so, your deployments are recorded in a state file as well. It’s the same with GitHub. If you use the GitHub provider to define users, and the same user also has access to a project on Google Cloud and an IAM identity on AWS, everything Terraform manages for this specific user ends up linked together in the Terraform state file.
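As an illustration, a Helm release managed by Terraform might look like the minimal sketch below. The release name and namespace are hypothetical, the chart is just one I picked as an example, and the Helm provider is assumed to be configured elsewhere with access to your cluster. Once applied, this release is tracked in the state file like any other resource.

```hcl
# Hypothetical example: install a chart through the Helm provider.
# Assumes a configured "helm" provider (cluster credentials are not shown here).
resource "helm_release" "ingress" {
  name             = "demo-ingress" # hypothetical release name
  repository       = "https://kubernetes.github.io/ingress-nginx"
  chart            = "ingress-nginx"
  namespace        = "ingress" # hypothetical namespace
  create_namespace = true
}
```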

Consider it the reference for the reality of your existing infrastructure, one you can refer to from inside your code.
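One way to refer to a state from inside your code is the terraform_remote_state data source, which reads the root-module outputs recorded in another configuration’s state. The sketch below is only an illustration: the path ../network/terraform.tfstate and the subnet_id output are hypothetical, and it assumes that the other configuration actually declares such an output.

```hcl
# Read the outputs stored in another configuration's state file.
# The path and the output name below are hypothetical.
data "terraform_remote_state" "network" {
  backend = "local"

  config = {
    path = "../network/terraform.tfstate"
  }
}

# Expose a value taken from that other state.
output "shared_subnet_id" {
  value = data.terraform_remote_state.network.outputs.subnet_id
}
```

Only values that the other configuration publishes as outputs are visible this way, so you choose what to share by declaring outputs there.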

In short, a state file records the reality of your deployment on your cloud provider, as it resulted from the intention declared in your Terraform code.

How do you deal with it and where do you store your Terraform State?

By default, you have no choice to make. The first time you run Terraform in your git repo and apply even a small piece of Terraform code, it creates the TFState file (terraform.tfstate) locally, right in the directory you ran it from, typically the root of your repo. You need to store that file somewhere. It’s tempting to simply push it to GitHub, which is what a lot of people do, but that is not a best practice: the state can contain secrets, and you probably do not want to push secrets to GitHub.

But you still need to store your state file, because if you lose it, you lose what Terraform considers the “existing infrastructure”. If you drop it and run an apply again, Terraform will try to create all of the infrastructure again as “new”, and you probably don’t want that.

The best practice is to store it somewhere shared, usually an Amazon S3 bucket or the equivalent object storage on Azure or Google Cloud. Take this code as an example:

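# Demo VM. var.instance_type is assumed to be declared elsewhere in the configuration.
resource "aws_instance" "vm" {
  ami           = data.aws_ami.amazon-linux.id
  instance_type = var.instance_type

  tags = {
    # The workspace name is interpolated so each workspace gets its own VM name.
    Name      = "DEMO VM DESTROY ME - ${terraform.workspace}"
    Terraform = "true"
  }
}

# Look up the most recent matching Amazon Linux AMI owned by Amazon.
data "aws_ami" "amazon-linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn-ami-hvm-*"]
  }
}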

This is very simple code that deploys a single VM. It’s for demonstration purposes: I use it to play with and test my Terraform workspaces, so don’t take it as serious work. To share its state, you declare a backend.

You configure it in the terraform block, which is like saying: “for Terraform, I want my backend to be S3, and the bucket for S3 needs to be this one.”

You state where you want your state file to be. It’s as simple as that. After you re-run terraform init to pick up the new backend, Terraform stores the state in your S3 bucket instead of in the local file.

And from then on, each time you work on this code, Terraform uses that remote state:

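terraform {
  backend "s3" {
    # Bucket that holds the shared state, and the object key (path) inside it.
    # The region is not set here, so it must come from the AWS_REGION or
    # AWS_DEFAULT_REGION environment variable.
    bucket = "cs-tfstates-demo-sj-frankfurt-1"
    key    = "tfstates/terraform.tfstate"
  }
}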

Last tip: don’t forget to add state locking backed by a database as well (with the S3 backend, this is typically a DynamoDB table). That way, two people can’t run an apply or another destructive action on the same state at the same time while the lock is held.
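With the S3 backend, locking can be sketched roughly as follows. This is a minimal example, not the exact setup from the demo: the table name terraform-locks and the region are assumptions of mine, and depending on your Terraform version other locking options may be available. The one requirement I am sure of is that the DynamoDB table needs a string partition key named LockID.

```hcl
# Backend with locking: Terraform takes a lock in the DynamoDB table
# before any operation that writes to the state.
terraform {
  backend "s3" {
    bucket         = "cs-tfstates-demo-sj-frankfurt-1"
    key            = "tfstates/terraform.tfstate"
    region         = "eu-central-1"    # assumption: set this to your bucket's region
    dynamodb_table = "terraform-locks" # hypothetical table name
  }
}

# The lock table itself. It has to exist before you run terraform init
# against the backend above, so it is usually created in a separate configuration.
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks" # must match dynamodb_table above
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

If someone else already holds the lock, Terraform refuses to start instead of letting two applies write to the same state.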

Securing your Terraform code

Snyk IaC secures your Terraform configs (and Kubernetes, CloudFormation, and ARM templates!) as you code, with guided fixes so you can merge and move on. You can test as you write, monitor for changes in your git repositories, and automate testing in your build pipelines before deployment. Getting started with a Free plan takes minutes, while a breach from an IaC misconfiguration can cause damage to last a lifetime.