Steven's Knowledge

State Management

Terraform state - what it is, why remote backends with locking are non-negotiable for teams, and how to safely operate on state

The state file is Terraform's mapping from your HCL to real cloud resources. Without it, Terraform has no way to know that aws_s3_bucket.hello in your code corresponds to bucket my-bucket-abc123 in AWS.

What's in the State File

After apply, Terraform writes terraform.tfstate — a JSON file containing:

  • Every managed resource and its computed attributes (IDs, ARNs, IPs).
  • A serial number incremented on each change.
  • Output values.
  • The provider each resource belongs to (recorded as a provider address).

{
  "version": 4,
  "terraform_version": "1.7.5",
  "serial": 42,
  "resources": [
    {
      "type": "aws_s3_bucket",
      "name": "hello",
      "instances": [
        {
          "attributes": {
            "id": "my-first-terraform-bucket-a4f9",
            "arn": "arn:aws:s3:::my-first-terraform-bucket-a4f9",
            ...
          }
        }
      ]
    }
  ]
}

State files contain sensitive values in plain text — DB passwords, API keys, instance IPs. Treat them like secrets. Never commit terraform.tfstate to git.
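
One point that is easy to miss: marking a value as sensitive does not keep it out of state. The sketch below uses a hypothetical variable and output named db_password (not from the original config) to illustrate this. The sensitive flag redacts the value in plan and apply output, but the root-module output is still written to terraform.tfstate in plain text.

```hcl
# Hypothetical names, for illustration only.
variable "db_password" {
  type      = string
  sensitive = true # redacted in CLI output, not encrypted in state
}

output "db_password" {
  value     = var.db_password
  sensitive = true # required because the value derives from a sensitive variable
}
```

After an apply, terraform output shows the value as <sensitive>, while the state file itself still holds the real string. That is why access to the state storage needs to be restricted like access to any other secret store.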

Why You Need a Remote Backend

The default local backend writes terraform.tfstate next to your code. That's fine for one person learning. The moment two people share a project, you have problems:

| Problem | What happens |
| --- | --- |
| No locking | Two apply runs at once race and corrupt state |
| No shared source of truth | Each laptop has a different state file |
| No encryption | Plaintext secrets sit on disk |
| No history | Hard to recover from a bad apply |

A remote backend stores state in a shared location and (for the good ones) locks it during apply.

S3 + DynamoDB Backend (AWS)

The classic pattern: S3 for storage, DynamoDB for the lock.

# backend.tf
terraform {
  backend "s3" {
    bucket         = "myproject-terraform-state"
    key            = "production/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-lock"
    encrypt        = true
  }
}

The bucket and lock table must exist before terraform init — chicken-and-egg, so bootstrap them once by hand or with a tiny separate Terraform config:

resource "aws_s3_bucket" "state" {
  bucket = "myproject-terraform-state"
}

resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id
  versioning_configuration { status = "Enabled" }      # keep history
}

resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

resource "aws_s3_bucket_public_access_block" "state" {
  bucket                  = aws_s3_bucket.state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_dynamodb_table" "lock" {
  name         = "terraform-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}

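S3 native locking removes the DynamoDB requirement: set use_lockfile = true in the backend block. It arrived as an experimental option in Terraform 1.10 and became generally available in 1.11, which also deprecated the DynamoDB locking arguments. New projects on those versions can skip the DynamoDB table.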
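
A minimal sketch of the same backend using S3 native locking, assuming a Terraform version that supports use_lockfile (the required_version constraint below is an assumption, so adjust it to what your team actually runs). The bucket and key are reused from the earlier example.

```hcl
# backend.tf
terraform {
  # Assumption: S3 native locking is available from Terraform 1.10 onward.
  required_version = ">= 1.10"

  backend "s3" {
    bucket       = "myproject-terraform-state"
    key          = "production/terraform.tfstate"
    region       = "us-east-1"
    encrypt      = true
    use_lockfile = true # lock with a .tflock object next to the state file
  }
}
```

With this setup the lock lives in the bucket as an object alongside the state, so the role running Terraform needs permission to create and delete that object in addition to reading and writing the state itself.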

Other Backend Options

| Backend | Locking | Best for |
| --- | --- | --- |
| s3 | DynamoDB or native S3 lock | AWS shops |
| gcs | Built-in | GCP shops |
| azurerm | Blob lease | Azure shops |
| remote (Terraform Cloud / HCP) | Built-in | Teams wanting a managed UI + run pipeline |
| http | Depends on the server (GitLab-managed state supports it) | GitLab pipelines, self-hosted |

State Locking in Practice

When you run terraform plan or apply, Terraform acquires the lock first:

Acquiring state lock. This may take a few moments...

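If someone else holds the lock, Terraform fails immediately by default, with an error that shows the lock ID and who holds it. Pass -lock-timeout=5m (or another duration) to keep retrying instead. Never use -lock=false to push past a held lock: two concurrent writers can corrupt state. If a lock is genuinely stale (for example, CI was killed mid-apply), release it explicitly: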

terraform force-unlock <LOCK_ID>      # only when you're certain no one else is applying

Inspecting State

Read-only commands — safe to run anytime:

terraform state list                                  # all resource addresses
terraform state show aws_s3_bucket.hello              # attributes of one resource
terraform output -json                                # output values

Mutating State

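When real infrastructure has drifted from state, or when you refactor HCL, you sometimes need to edit state directly. These commands rewrite the shared state for everyone, so take a backup first (terraform state pull > backup.tfstate) and review the next plan carefully.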

Renaming a resource (no API call)

You renamed aws_instance.web to aws_instance.app in HCL. Without help, Terraform plans to destroy web and create app. Use mv instead:

terraform state mv aws_instance.web aws_instance.app
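
Since Terraform 1.1 the same rename can also be declared in configuration with a moved block, which makes it reviewable in a PR and applies it for every teammate on their next plan. A minimal sketch follows; the ami_id variable and the instance arguments are placeholders, not values from the original config.

```hcl
# Hypothetical input, for illustration only.
variable "ami_id" {
  type = string
}

# The resource under its new name.
resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = "t3.micro"
}

# Tells Terraform the existing object tracked as aws_instance.web
# is now aws_instance.app, so nothing is destroyed or recreated.
moved {
  from = aws_instance.web
  to   = aws_instance.app
}
```

The plan should then report the resource as moved rather than as one destroy plus one create. The moved block can be deleted once every state that used the old address has been applied.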

Removing a resource from state (without destroying it)

terraform state rm aws_s3_bucket.hello                # forgets the bucket; bucket still exists
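
Terraform 1.7 and later offer a declarative equivalent, the removed block. As a sketch: delete the resource "aws_s3_bucket" "hello" block from your configuration and add the following in its place.

```hcl
# Stop managing the bucket without deleting it in AWS.
removed {
  from = aws_s3_bucket.hello

  lifecycle {
    destroy = false # forget the resource; leave the real bucket alone
  }
}
```

The next plan should show the bucket being dropped from state but not destroyed, which gives reviewers a chance to catch a mistake before it is applied.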

Importing an existing resource

You created a bucket by hand, now you want Terraform to manage it:

# main.tf
resource "aws_s3_bucket" "imported" {
  bucket = "existing-bucket-name"
}

terraform import aws_s3_bucket.imported existing-bucket-name
terraform plan                                       # should now be a no-op or near-no-op

Terraform 1.5 and later also support import blocks in HCL, so imports become reviewable in PRs and run as part of a normal plan and apply:

import {
  to = aws_s3_bucket.imported
  id = "existing-bucket-name"
}

Refreshing state

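terraform plan refreshes state in memory before computing the diff, but it does not save the result. To update the stored state to match real infrastructure without changing any resources, run a refresh-only apply (the standalone terraform refresh command is deprecated):

terraform apply -refresh-only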

State per Environment

Two patterns, both fine:

1. Directory per environment (more explicit, recommended for production)

infrastructure/
├── modules/                  # shared module code
├── staging/
│   ├── backend.tf           # key = "staging/terraform.tfstate"
│   └── main.tf
└── production/
    ├── backend.tf           # key = "production/terraform.tfstate"
    └── main.tf

2. Workspaces (one config directory, multiple state files)

terraform workspace new staging
terraform workspace new production
terraform workspace select production

terraform.workspace is then available in HCL. Workspaces are quick for ephemeral envs (preview environments, feature branches), but for staging/prod separation most teams prefer directory-per-environment because the blast radius of a wrong select is real.
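
As a rough illustration of using terraform.workspace, the sketch below varies a name and a size per workspace. The bucket name, the size map, and the local names are all hypothetical.

```hcl
locals {
  # Hypothetical per-workspace settings.
  instance_type_by_workspace = {
    staging    = "t3.small"
    production = "t3.large"
  }

  # Fall back to a small size for ephemeral workspaces not listed above.
  instance_type = lookup(local.instance_type_by_workspace, terraform.workspace, "t3.micro")
}

resource "aws_s3_bucket" "assets" {
  # Workspace name in the bucket name keeps environments from colliding.
  bucket = "myproject-assets-${terraform.workspace}"
}
```

With the s3 backend, each non-default workspace gets its own state object under an env:/ prefix in the same bucket, so all environments share one backend configuration. That convenience is also the reason a wrong workspace select is risky.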

What's Next

You can now run Terraform safely across a team. Next: stop copy-pasting resource blocks across projects → Modules.
