Terraform Lesson 42 – Terraform Testing | Dataplexa
Section IV · Lesson 42

Terraform Testing

Infrastructure code without tests is infrastructure that only works until someone changes it. A module that returns correct output values today may silently break after a refactor. A security configuration that passes tfsec today may have a logic error that lets the wrong traffic through. Testing gives you the confidence to refactor, upgrade, and deploy — knowing that the important properties are verified automatically. This lesson covers the full Terraform testing toolkit.

This lesson covers

The infrastructure testing pyramid → terraform validate → The native terraform test framework → Writing .tftest.hcl files → Mocking providers → Terratest for integration testing → Contract tests with preconditions and postconditions → Testing in CI/CD

The Infrastructure Testing Pyramid

Application code has unit tests, integration tests, and end-to-end tests — fast at the bottom, slow at the top, most coverage at the bottom. Infrastructure testing follows the same pyramid. Most tests should be fast and cheap. A few should deploy real infrastructure and verify it.

Level | Tool | What it verifies | Speed / Cost
--- | --- | --- | ---
Static analysis | terraform validate, tfsec, checkov | Syntax, references, security misconfigs | Seconds / Free
Unit tests | terraform test with mocks | Module logic, outputs, variable validation | Seconds / Free (no cloud resources)
Contract tests | precondition / postcondition | Input constraints, output guarantees | Seconds at plan time / Free
Integration tests | terraform test or Terratest | Real infrastructure behaviour end-to-end | Minutes / Cloud costs
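The bottom layer of the pyramid is cheap enough to run on every commit. A typical invocation sequence — a sketch assuming tfsec and checkov are installed separately, since both are third-party tools — looks like this:

```shell
# Static analysis — seconds, no credentials, no state
terraform init -backend=false   # install providers without touching any backend
terraform fmt -check            # fail if files are not canonically formatted
terraform validate              # catch syntax errors and broken references
tfsec .                         # scan HCL for security misconfigurations
checkov -d .                    # broader policy checks across multiple frameworks
```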

The Analogy

Testing a Terraform module is like testing a vending machine. Static analysis checks the wiring diagram for obvious errors without plugging it in. Unit tests check the logic — does pressing B4 dispense a drink? — using a simulated coin slot. Contract tests check that the machine refuses invalid coins. Integration tests plug the machine into a real power outlet, insert a real coin, and verify that a real drink comes out. You need all four, but you run the wiring check a hundred times more often than the full power-on test.

The Native terraform test Framework

Terraform 1.6 introduced a native test framework. Test files use the .tftest.hcl extension and live alongside the module they test. Each file contains one or more run blocks — each run block applies the module with specific inputs and asserts conditions on the outputs.

New terms:

  • run block — a single test scenario. Specifies the module variables to use, the command to run (plan or apply), and the assertions to check. Multiple run blocks can share state — each run builds on the previous one.
  • assert block — a condition that must be true after the run completes. Uses standard Terraform expressions. If the condition is false, the test fails with the error_message.
  • mock_provider — a block that replaces a real provider with a mock. Plan and apply run without making any cloud API calls; computed attributes are populated with mock values. Essential for fast unit tests that do not create real resources.
  • terraform test — the command that discovers and runs all .tftest.hcl files in the current directory and the tests/ subdirectory. Destroys any resources created during apply runs after each test file completes.
# Module structure: modules/vpc/
# tests/vpc_basic.tftest.hcl — unit test for the VPC module

# Provider mock — no real AWS calls, no credentials needed, runs in seconds
mock_provider "aws" {
  mock_resource "aws_vpc" {
    defaults = {
      id         = "vpc-mock-12345"
      arn        = "arn:aws:ec2:us-east-1:123456789012:vpc/vpc-mock-12345"
      cidr_block = "10.0.0.0/16"
    }
  }
  mock_resource "aws_subnet" {
    defaults = {
      id                = "subnet-mock-67890"
      availability_zone = "us-east-1a"
    }
  }
  mock_data "aws_availability_zones" {
    defaults = {
      names = ["us-east-1a", "us-east-1b", "us-east-1c"]
    }
  }
}

# Test 1: Basic VPC creation with default inputs
run "creates_vpc_with_defaults" {
  command = apply  # Mocked provider — apply runs in memory, still no real resources

  variables {
    name = "test-vpc"  # Only required variable — rest use defaults
  }

  # Assert outputs are correct
  assert {
    condition     = output.vpc_id != ""
    error_message = "vpc_id output must not be empty"
  }

  assert {
    condition     = output.vpc_cidr_block == "10.0.0.0/16"
    error_message = "vpc_cidr_block should default to 10.0.0.0/16"
  }

  assert {
    condition     = length(output.public_subnet_ids) == 2
    error_message = "Should create 2 public subnets by default"
  }

  assert {
    condition     = length(output.private_subnet_ids) == 2
    error_message = "Should create 2 private subnets by default"
  }
}

# Test 2: Custom CIDR inputs
run "accepts_custom_cidr" {
  command = apply

  variables {
    name                 = "custom-cidr-vpc"
    vpc_cidr             = "172.16.0.0/16"
    public_subnet_cidrs  = ["172.16.1.0/24", "172.16.2.0/24", "172.16.3.0/24"]
    private_subnet_cidrs = ["172.16.10.0/24", "172.16.11.0/24", "172.16.12.0/24"]
  }

  assert {
    condition     = output.vpc_cidr_block == "172.16.0.0/16"
    error_message = "vpc_cidr_block should reflect the custom CIDR"
  }

  assert {
    condition     = length(output.public_subnet_ids) == 3
    error_message = "Should create 3 public subnets when 3 CIDRs provided"
  }
}

# Test 3: Validation — invalid name should be caught
run "rejects_invalid_name" {
  command = plan

  variables {
    name = "INVALID NAME WITH SPACES"  # Violates the name validation rule
  }

  # Expect this run to fail — the validation should reject the input
  expect_failures = [var.name]
}
# Run the tests
terraform test

# Output:
# tests/vpc_basic.tftest.hcl... in progress
#   run "creates_vpc_with_defaults"... pass
#   run "accepts_custom_cidr"... pass
#   run "rejects_invalid_name"... pass
# tests/vpc_basic.tftest.hcl... tearing down
# tests/vpc_basic.tftest.hcl... pass
#
# Success! 3 passed, 0 failed.
# Elapsed: 0.8s — fast because mock_provider makes no API calls

# Run a specific test file
terraform test -filter=tests/vpc_basic.tftest.hcl

# Run in verbose mode — see each assertion result
terraform test -verbose
$ terraform test -verbose

tests/vpc_basic.tftest.hcl... in progress

  run "creates_vpc_with_defaults"
    PASS: assert[0]: vpc_id output must not be empty
    PASS: assert[1]: vpc_cidr_block should default to 10.0.0.0/16
    PASS: assert[2]: Should create 2 public subnets by default
    PASS: assert[3]: Should create 2 private subnets by default
  run "creates_vpc_with_defaults"... pass

  run "accepts_custom_cidr"
    PASS: assert[0]: vpc_cidr_block should reflect the custom CIDR
    PASS: assert[1]: Should create 3 public subnets when 3 CIDRs provided
  run "accepts_custom_cidr"... pass

  run "rejects_invalid_name"
    PASS: var.name validation correctly rejected "INVALID NAME WITH SPACES"
  run "rejects_invalid_name"... pass

tests/vpc_basic.tftest.hcl... tearing down
tests/vpc_basic.tftest.hcl... pass

Success! 3 passed, 0 failed.
Elapsed: 0.812s

What just happened?

  • Three tests ran in under a second with no cloud credentials. The mock_provider replaced all AWS API calls with in-memory responses. The module logic — variable defaults, output calculations, validation rules — was tested completely without creating a single real resource. This is what makes unit tests practical to run on every commit.
  • expect_failures tests the negative case. expect_failures = [var.name] tells the framework that this run should fail validation for the name variable. If the validation works correctly, the test passes. If the validation is missing or broken, the run succeeds when it should fail — and the test framework catches that.
  • Terraform test cleans up all apply runs automatically. For tests that use command = apply against a real provider, the framework runs terraform destroy after each test file completes. Resources never linger after the test suite finishes.
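
For the negative test to pass, the module must actually declare a validation rule on var.name. One plausible shape — hypothetical, since the lesson does not show the module's variable definitions — is:

```hcl
variable "name" {
  type        = string
  description = "Name prefix applied to VPC resources"

  validation {
    # Lowercase letters, digits, and hyphens only — spaces and uppercase fail
    condition     = can(regex("^[a-z0-9-]+$", var.name))
    error_message = "name must contain only lowercase letters, digits, and hyphens."
  }
}
```

expect_failures = [var.name] passes exactly when this condition rejects the input.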

Integration Tests with terraform test and Real Providers

For integration tests that need to verify real infrastructure behaviour — a load balancer actually routing traffic, a security group actually blocking connections — use command = apply without a mock provider. The framework creates real resources, runs assertions against them, and destroys everything at the end.

# tests/vpc_integration.tftest.hcl
# Integration test — creates real AWS resources
# Run in a dedicated test AWS account — never in production

provider "aws" {
  region = "us-east-1"
  # Credentials from environment — AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
  # Use a dedicated test account with limited permissions
}

# Run 1: Create the VPC — command = apply creates real AWS resources
run "create_vpc" {
  command = apply  # Real AWS API calls — creates real VPC, subnets, IGW

  variables {
    name     = "tf-test-vpc-${formatdate("YYYYMMDDHHmmss", timestamp())}"
    vpc_cidr = "10.99.0.0/16"  # Use a dedicated test CIDR range
  }

  # Assert the VPC ID is a valid AWS format
  assert {
    condition     = can(regex("^vpc-[a-f0-9]+$", output.vpc_id))
    error_message = "vpc_id must be a valid AWS VPC ID format"
  }

  # Assert subnets were created with IDs
  assert {
    condition     = length(output.public_subnet_ids) > 0
    error_message = "At least one public subnet must be created"
  }

  # Assert outputs reference the same VPC (cross-output consistency check)
  assert {
    condition     = output.vpc_cidr_block == "10.99.0.0/16"
    error_message = "vpc_cidr_block output must match the input cidr"
  }
}

# Run 2: Verify DNS resolution works in the VPC (uses state from run 1)
run "verify_dns" {
  command = plan  # Just check configuration — no new resources

  variables {
    name                 = "tf-test-vpc-dns"
    vpc_cidr             = "10.99.0.0/16"
    enable_dns_hostnames = true
  }

  assert {
    condition     = output.vpc_id != ""
    error_message = "VPC must exist before testing DNS configuration"
  }
}

# After ALL runs complete: terraform destroy runs automatically
# All test resources are cleaned up — no manual cleanup needed
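
Terratest, listed in the lesson overview, is the older Go-based alternative for integration testing. It predates the native framework and remains useful when assertions need real programming logic — retries, HTTP calls against deployed endpoints, SSH checks. A minimal sketch (module path and output names are hypothetical; requires the terraform binary and credentials for a dedicated test account):

```go
// vpc_test.go — run with: go test -timeout 30m
package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestVpcModule(t *testing.T) {
	opts := &terraform.Options{
		TerraformDir: "../modules/vpc", // hypothetical path to the module under test
		Vars: map[string]interface{}{
			"name":     "terratest-vpc",
			"vpc_cidr": "10.99.0.0/16",
		},
	}

	// Deferred destroy runs even when assertions fail — no lingering resources
	defer terraform.Destroy(t, opts)
	terraform.InitAndApply(t, opts)

	// Assert on real outputs, just like the assert blocks in .tftest.hcl
	vpcID := terraform.Output(t, opts, "vpc_id")
	assert.Regexp(t, "^vpc-", vpcID)
}
```

Unlike terraform test, cleanup here is explicit — forget the deferred Destroy and a failed run leaves resources behind.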

Contract Tests with precondition and postcondition

Preconditions and postconditions are contract tests embedded directly in the module code. They run at plan or apply time and verify invariants — properties that must always be true. Unlike external tests, they run on every plan and apply automatically — not just during dedicated test runs.

# precondition — checked at plan time, before the resource is created
# postcondition — checked after apply, verifying the actual created resource

resource "aws_db_instance" "main" {
  identifier        = "prod-${var.environment}"
  engine            = "postgres"
  engine_version    = var.engine_version
  instance_class    = var.instance_class
  multi_az          = var.multi_az
  allocated_storage = var.storage_gb
  storage_encrypted = var.storage_encrypted

  lifecycle {
    # Precondition: enforce contract on inputs before anything is created
    precondition {
      # Prod must be Multi-AZ — no exceptions, enforced at plan time
      condition     = var.environment != "prod" || var.multi_az == true
      error_message = "Production databases must have multi_az = true. Set multi_az = true in your module call."
    }

    precondition {
      # Storage must be encrypted — no unencrypted prod databases
      condition     = var.storage_encrypted == true
      error_message = "storage_encrypted must be true. Unencrypted databases are not permitted in this module."
    }

    precondition {
      # Production storage must be at least 100GB
      condition     = var.environment != "prod" || var.storage_gb >= 100
      error_message = "Production databases must have at least 100GB storage. Set storage_gb >= 100."
    }

    # Postcondition: verify the resource was actually created correctly
    postcondition {
      # After apply, verify AWS actually enabled encryption
      condition     = self.storage_encrypted == true
      error_message = "Database was created without encryption — AWS may have overridden the setting."
    }

    postcondition {
      # Verify the engine version AWS created matches what was requested
      condition     = startswith(self.engine_version, split(".", var.engine_version)[0])
      error_message = "AWS created a different major engine version than requested."
    }
  }
}

# output precondition — outputs support precondition blocks (not postconditions);
# use one to verify the output meets its contract
output "db_endpoint" {
  value       = aws_db_instance.main.endpoint
  description = "Database connection endpoint"

  precondition {
    # Output must not be empty — if it is, something went wrong in apply
    condition     = aws_db_instance.main.endpoint != ""
    error_message = "Database endpoint is empty — the database may not have been created successfully."
  }
}

Testing in CI/CD

# .github/workflows/terraform-test.yml
# Run unit tests (fast) on every PR, integration tests (slow) on merge to main

name: Terraform Tests

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

jobs:
  # Fast tests — run on every PR, no cloud credentials needed
  unit-tests:
    name: Unit Tests (mocked)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with: { terraform_version: "1.6.3" }

      - name: Run unit tests for all modules
        run: |
          for module_dir in modules/*/; do
            echo "Testing $module_dir"
            cd "$module_dir"
            terraform init -backend=false  # No backend for unit tests
            terraform test -filter="tests/*unit*.tftest.hcl"
            cd -
          done

  # Slow tests — run only on merge to main, needs real AWS credentials
  integration-tests:
    name: Integration Tests (real AWS)
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    environment: test  # Dedicated test environment with separate AWS account

    permissions:
      id-token: write
      contents: read

    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials (test account)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.TEST_ACCOUNT_ROLE_ARN }}  # Test account only
          aws-region: us-east-1

      - uses: hashicorp/setup-terraform@v3
        with: { terraform_version: "1.6.3" }

      - name: Run integration tests for VPC module
        working-directory: modules/vpc
        run: |
          terraform init
          terraform test -filter="tests/*integration*.tftest.hcl"
          # terraform test automatically destroys created resources after each test file

Common Testing Mistakes

Only testing the happy path

Tests that only verify the module works with valid inputs miss the most important test cases — what happens with invalid inputs. Validation rules that are never tested drift over time: someone changes the regex, the error message becomes wrong, or the condition logic inverts. Always include expect_failures test runs that verify invalid inputs are correctly rejected. If your module has 5 validation rules, you should have at least 5 negative test cases.

Running integration tests against production or staging environments

Integration tests that use command = apply create and destroy real cloud resources. Running them in a production or staging account risks naming conflicts with real infrastructure, accidental destruction of shared resources, and cost spikes when a test failure leaves resources uncleaned. Always use a dedicated, isolated test account with no real workloads — one that exists solely to have test resources created and destroyed in it.

Skipping mock_provider for module unit tests

Module tests without mock_provider require real cloud credentials and create real resources on every run — too slow and expensive to run on every pull request. A module that takes 5 minutes and $0.10 of cloud resources to test gets skipped by developers in a hurry. Tests only provide value if they actually run. Use mock_provider for any test that checks module logic rather than real cloud infrastructure behaviour.

What to test vs what to trust

You do not need to test that aws_s3_bucket creates an S3 bucket — HashiCorp tests that. You need to test that your module creates the bucket with the right name format, the right tags, the right encryption configuration, and the right access settings. Test your logic — the variable defaults, the output calculations, the conditional resource creation, the validation rules. Trust the provider to correctly communicate with AWS. The boundary is: if you wrote it, test it. If HashiCorp wrote it, trust the provider tests.
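
Applied to the S3 example, a logic-focused unit test asserts on naming and tagging conventions rather than on S3 itself. A sketch, with hypothetical output names:

```hcl
run "bucket_follows_conventions" {
  command = apply  # with a mock_provider "aws" block, this stays in-memory

  variables {
    name        = "logs"
    environment = "dev"
  }

  assert {
    condition     = startswith(output.bucket_name, "dev-logs")
    error_message = "Bucket name must follow the <environment>-<name> convention."
  }

  assert {
    condition     = output.tags["ManagedBy"] == "terraform"
    error_message = "Every bucket must carry the ManagedBy = terraform tag."
  }
}
```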

Practice Questions

1. Which terraform test construct replaces a real provider with an in-memory mock — allowing module logic to be tested without any cloud credentials or API calls?



2. Which run block argument tells the terraform test framework that this run should fail validation — and the test passes only if the failure occurs?



3. What is the difference between a precondition and a postcondition in a Terraform resource lifecycle block?



Quiz

1. How does terraform test handle cleanup of real cloud resources created during integration test runs?


2. What should Terraform module tests verify, and what should they trust the provider to handle?


3. What is the key advantage of using preconditions and postconditions over external test files for verifying module contracts?


Up Next · Lesson 43

Performance Optimization

Tests written. Lesson 43 covers making Terraform fast — why large configurations slow down, the parallelism model, targeted applies, splitting configurations for faster feedback, provider caching, and the techniques that take a 20-minute plan down to 2 minutes without compromising correctness.