Terraform Lesson 10 – Beginner Best Practices | Dataplexa
Section I · Lesson 10

Beginner Best Practices

Nine lessons of Terraform are behind you. Before you move into Section II — variables, state, backends, and everything that makes Terraform work at scale — this lesson locks in the habits that separate engineers who write Terraform confidently from those who cause incidents with it.

This lesson covers

Project structure that scales → Naming conventions → State hygiene → The habits that prevent production incidents → A complete reference configuration applying every practice together

Why Best Practices Matter Now

Most Terraform problems are not caused by wrong syntax. They are caused by habits formed early — a state file committed to Git, credentials hardcoded in a provider block, a single main.tf with 600 lines that nobody dares touch, a project with no .gitignore.

These problems compound. A project that starts messy gets messier. The practices in this lesson are not optional polish — they are the foundation that makes everything in Section II and beyond actually manageable.

The Analogy

A Terraform project is like a kitchen. You can cook in a messy kitchen for a while — you know where things are, it works for you. Bring in a second chef and the chaos doubles. Bring in a third and nobody can work. Best practices are mise en place — everything in its place before you start cooking. It costs a few minutes upfront and saves hours of confusion later.

Practice 1 — Project Structure

Every Terraform project should start with the same file structure. Not because Terraform requires it — it does not — but because every engineer who joins the project will immediately know where to look for anything.

Here is the structure every project should have from day one. Run these commands to create it:

mkdir my-project
cd my-project

# Core configuration files — one clear responsibility each
touch versions.tf    # Terraform version + provider requirements
touch variables.tf   # Input variable declarations
touch main.tf        # Resource blocks
touch outputs.tf     # Output value declarations
touch locals.tf      # Local values and computed expressions

# Environment-specific variable values — never committed if they contain secrets
touch terraform.tfvars

# Git hygiene — created immediately, before the first commit
touch .gitignore

Add the following to .gitignore — this file must exist before your first git add:

# Provider binaries — downloaded by terraform init, never committed
.terraform/

# State files — contain sensitive resource data, use remote state for teams
*.tfstate
*.tfstate.backup

# Saved plan files — binary format, regenerate as needed
*.tfplan

# Variable files that may contain secrets — commit only non-sensitive ones explicitly
*.auto.tfvars

# macOS noise
.DS_Store

# Editor directories
.idea/
.vscode/

What just happened?

  • The .gitignore must be created before the first commit. Once a file is tracked by Git, adding it to .gitignore does not remove it from history. A state file committed accidentally can expose sensitive resource data — connection strings, IPs, resource IDs — to anyone with repository access. Create .gitignore first, always.
  • locals.tf is a new file in this structure. Locals are computed values — transformations of variables or combinations of resource attributes — that you reference multiple times across your configuration. They avoid repetition and make complex expressions readable. Covered in depth in Section II.
  • *.auto.tfvars is gitignored but terraform.tfvars is not. Files ending in .auto.tfvars are typically used for environment-specific overrides that may contain secrets. The base terraform.tfvars file is committed when it contains only non-sensitive defaults — region, environment name, instance sizes.
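
A terraform.tfvars that follows this rule contains only non-sensitive values — here a sketch using the variables this lesson declares in Practice 2:

```hcl
# terraform.tfvars — non-sensitive defaults only, safe to commit
# Anything secret belongs in a gitignored *.auto.tfvars or a secrets manager
project_name  = "acme"
environment   = "dev"
region        = "us-east-1"
instance_type = "t2.micro"
```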

Practice 2 — Naming Conventions

Consistent naming makes a Terraform project self-documenting. Six months from now, a resource named aws_instance.x tells you nothing. A resource named aws_instance.api_server tells you exactly what it is.

We are writing a variables.tf and locals.tf that demonstrate every naming rule. These files will be reused in the complete reference configuration at the end of the lesson.

New terms:

  • locals block — defines local values that are computed once and referenced by name throughout the configuration. Unlike variables, locals cannot be overridden from outside — they are internal to the configuration. Use them to avoid repeating the same expression in multiple places.
  • local.name — the syntax for referencing a local value. If you declare locals { name_prefix = "acme" }, you reference it as local.name_prefix — note singular local, not locals.
  • lower_snake_case — the standard naming convention for all Terraform identifiers: variable names, resource names, output names, local names. No camelCase, no PascalCase, no hyphens in identifier names. Hyphens appear inside string values — like tag values — not in HCL identifiers.

Add this to variables.tf:

# Every variable gets a description — no exceptions
# Descriptions appear in terraform plan output and generated documentation

variable "project_name" {
  description = "Short name for the project — used in all resource names and tags"
  type        = string
  default     = "acme"
}

variable "environment" {
  description = "Deployment environment — controls sizing and redundancy"
  type        = string
  default     = "dev"

  # Validation catches bad values before they reach the AWS API
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "environment must be dev, staging, or prod."
  }
}

variable "region" {
  description = "AWS region for all resources in this configuration"
  type        = string
  default     = "us-east-1"
}

variable "instance_type" {
  description = "EC2 instance size — use t2.micro for dev, t3.medium for prod"
  type        = string
  default     = "t2.micro"
}
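
The reference configuration at the end of this lesson also iterates over a var.instances map with for_each, so variables.tf needs that declaration too — a sketch whose default keys match the primary/secondary roles used there:

```hcl
variable "instances" {
  description = "Map of instance role name to EC2 instance type — one instance per entry"
  type        = map(string)
  default = {
    primary   = "t2.micro"
    secondary = "t2.micro"
  }
}
```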

Now create locals.tf. Locals centralise repeated expressions so you never duplicate them:

locals {
  # name_prefix is used in every resource name and tag
  # Centralising it here means changing the prefix updates everything at once
  name_prefix = "${var.project_name}-${var.environment}"

  # Common tags applied to every resource
  # Merge these with resource-specific tags using merge(local.common_tags, {...})
  common_tags = {
    Project     = var.project_name
    Environment = var.environment
    ManagedBy   = "Terraform"
    Region      = var.region
  }

  # Environment-specific sizing — one place to control dev vs prod differences
  # Reference: local.is_production, local.min_capacity etc.
  is_production = var.environment == "prod"
  min_capacity  = local.is_production ? 2 : 1
  max_capacity  = local.is_production ? 10 : 2
}
$ terraform console

> local.name_prefix
"acme-dev"

> local.common_tags
{
  "Environment" = "dev"
  "ManagedBy"   = "Terraform"
  "Project"     = "acme"
  "Region"      = "us-east-1"
}

> local.is_production
false

> local.min_capacity
1

> local.max_capacity
2

What just happened?

  • terraform console is an interactive REPL for evaluating expressions. Type any Terraform expression and it evaluates it against your current variable values and state. It is the fastest way to test locals, string interpolations, and function calls without running a plan. Exit with Ctrl+D or type exit.
  • local.name_prefix resolves to "acme-dev". This string is now used as a prefix in every resource name in main.tf. If the project name or environment changes, every resource name updates automatically — no find-and-replace across multiple files.
  • local.is_production uses a conditional expression. var.environment == "prod" evaluates to true or false. The ternary syntax condition ? true_value : false_value then drives min_capacity and max_capacity. In dev the minimum is 1, in prod it is 2. One variable change — environment = "prod" — flips the entire sizing logic.
  • common_tags will be merged with resource-specific tags using the merge() function. In main.tf you write tags = merge(local.common_tags, { Name = "..." }). The resource-specific tag is merged in — if there is a key conflict the second map wins. This ensures every resource has the standard tags without repeating them in every block.
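
The last-map-wins rule of merge() is easy to confirm in terraform console — a quick sketch with throwaway maps:

```hcl
> merge({ a = "first", b = "first" }, { a = "second" })
{
  "a" = "second"
  "b" = "first"
}
```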

Practice 3 — versions.tf Every Time

A versions.tf file with pinned versions is not optional for any project that will run in more than one place — your machine, a teammate's machine, CI. Without it, different machines download different provider versions and the same configuration behaves differently.

Add this to versions.tf:

terraform {
  # Minimum CLI version — anyone running an older version gets a clear error
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"

      # ~> 5.0 means any 5.x — allows patch and minor updates, blocks major version jumps
      # After running terraform init, commit .terraform.lock.hcl to lock the exact version
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.region

  # default_tags applies these tags to every resource this provider creates
  # No need to repeat them in individual resource blocks
  default_tags {
    tags = local.common_tags
  }
}
$ terraform init

Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 5.0"...
- Installing hashicorp/aws v5.31.0...

Terraform has created a lock file .terraform.lock.hcl

$ git add .terraform.lock.hcl
$ git commit -m "chore: pin AWS provider to v5.31.0"

What just happened?

  • The lock file is committed immediately after init. Committing .terraform.lock.hcl right after the first init is the correct sequence — do not let other files accumulate before the lock file is in Git. Your CI pipeline will use the same provider version from this point forward.
  • default_tags uses local.common_tags. The tags block in the provider references the local value defined in locals.tf. This means updating common_tags in one place updates the default tags on every resource in the project — without touching a single resource block.
  • The commit message follows a convention. The chore: prefix is part of the Conventional Commits specification — a widely adopted standard for commit message formatting. Infrastructure changes typically use feat: for new resources, fix: for corrections, chore: for maintenance like provider upgrades. Consistent commit messages make the Git log readable as a changelog.

Practice 4 — State Hygiene

The state file is the most sensitive file Terraform produces. It contains resource IDs, IP addresses, database connection strings, and in some cases plaintext passwords. It must never be committed to Git and must never live only on one engineer's laptop.

For any project shared by more than one person, state lives in a remote backend. The standard AWS setup is an S3 bucket plus a DynamoDB table for state locking. We are writing the backend configuration that every shared project needs.

New terms:

  • backend block — inside the terraform block, configures where the state file is stored. With no backend block, state is local. With an S3 backend, state is stored in a bucket and accessible to any machine with the right IAM permissions.
  • bucket — the S3 bucket name where the state file lives. This bucket must exist before Terraform can use it as a backend — Terraform does not create the bucket itself. Create it manually or with a separate bootstrap Terraform configuration.
  • key — the path within the S3 bucket where this project's state file is stored. Using project-name/environment/terraform.tfstate as the key pattern keeps state files organised when multiple projects share one bucket.
  • dynamodb_table — the DynamoDB table used for state locking. When one process runs terraform apply, it writes a lock to this table. Any other process trying to apply at the same time sees the lock and waits. This prevents two concurrent applies from corrupting the state file.
  • encrypt = true — enables server-side encryption for the state file in S3. Encrypts the file at rest using the bucket's default encryption key. Always set this to true — there is no reason not to.

Update versions.tf to add the backend block inside the terraform block:

terraform {
  required_version = ">= 1.5.0"

  # Remote backend — state stored in S3, locked with DynamoDB
  # The S3 bucket and DynamoDB table must be created before terraform init
  backend "s3" {
    bucket         = "acme-terraform-state"        # S3 bucket holding all project state files
    key            = "acme/dev/terraform.tfstate"  # Path within the bucket for this project
    region         = "us-east-1"                   # Region where the bucket lives
    encrypt        = true                          # Encrypt state at rest — always true
    dynamodb_table = "terraform-state-lock"        # DynamoDB table for state locking
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.region

  default_tags {
    tags = local.common_tags
  }
}
$ terraform init

Initializing the backend...

Successfully configured the backend "s3"! Terraform will automatically
use this backend unless the backend configuration changes.

Initializing provider plugins...
- Installing hashicorp/aws v5.31.0...

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan"
to see any changes that are required for your current configuration.

What just happened?

  • Terraform migrated state to S3. On the first init after adding a backend, Terraform detects that local state exists and offers to copy it to the remote backend. After migration, the local terraform.tfstate file is empty — the real state lives in S3. Every subsequent apply reads from and writes to S3 automatically.
  • The key path acme/dev/terraform.tfstate organises multiple environments. If you have a staging environment, its state lives at acme/staging/terraform.tfstate in the same bucket. Production at acme/prod/terraform.tfstate. One bucket, many projects, perfectly separated state files.
  • DynamoDB locking is now active. When you run terraform apply, Terraform acquires a lock in the DynamoDB table before modifying state. Any concurrent apply on another machine or in CI sees the lock and waits. The lock is released when the apply finishes. If a process crashes mid-apply and the lock is not released, use terraform force-unlock LOCK_ID to clear it manually.
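
The bucket and table themselves can come from a small bootstrap configuration, applied once with local state before any project points at them — a sketch using the names from the backend block above:

```hcl
# bootstrap/main.tf — creates the state bucket and lock table

resource "aws_s3_bucket" "state" {
  bucket = "acme-terraform-state"
}

# Versioning lets you recover earlier revisions of a state file
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id
  versioning_configuration {
    status = "Enabled"
  }
}

# The S3 backend's locking requires a table with a string hash key named LockID
resource "aws_dynamodb_table" "lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```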

Practice 5 — The Habits That Prevent Incidents

These are not configuration rules — they are behavioural habits. Every production incident caused by Terraform traces back to at least one of these being skipped.

Always run plan before apply — read every line

Every +, every ~, every -, every -/+. The summary line — "3 to add, 1 to change, 0 to destroy" — is not enough. The detail shows which specific arguments are changing and whether any change forces a replacement. Find every # forces replacement annotation in the output before confirming.

Use terraform plan -out=tfplan, then terraform apply tfplan

A plain terraform apply re-plans internally. If infrastructure changed between your review and the apply — a teammate made a change, a scheduled job updated a resource — you are applying a plan you never saw. Save the plan to a file and apply from that file. The plan you reviewed is exactly the plan that executes.

Apply to staging before production — every time

A change that looks correct in the plan can still behave unexpectedly in reality — permissions not quite right, a resource quota exceeded, a dependency not fully in place. Staging catches these at a cost of minutes. Production catches them at a cost of hours and an incident report.

Run terraform plan after every apply — verify zero drift

After a successful apply, run plan again. The output should say "No changes. Your infrastructure matches the configuration." If it does not — something did not apply cleanly, or something changed between the apply and the follow-up plan. Find out why immediately rather than in three weeks when something breaks.

Never store secrets in .tf files or terraform.tfvars

Passwords, API keys, and tokens passed as Terraform variables end up in the state file in plaintext regardless of how they were supplied. Use AWS Secrets Manager, HashiCorp Vault, or environment variables for secrets. Reference them in Terraform via data sources — not by hardcoding them in variable defaults or tfvars files.
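
As a hedged sketch — assuming a secret named acme/dev/db-password already exists in AWS Secrets Manager — the data-source pattern looks like this:

```hcl
# Read the secret at plan time — the value never appears in .tf files or tfvars
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "acme/dev/db-password"  # hypothetical secret name
}

# Reference it where needed:
#   password = data.aws_secretsmanager_secret_version.db_password.secret_string
# Note: the resolved value is still recorded in state — which is one more reason
# remote, encrypted state with restricted access is non-negotiable
```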

terraform destroy after every practice session

Every resource left running costs money. A t2.micro running for a month is roughly $8. An RDS instance is $15–50 depending on size. A NAT gateway is $32 regardless of traffic. Always destroy resources you are not actively using. Set a reminder. Check the AWS billing console weekly when learning.

The Complete Reference Configuration

Every practice in this lesson applied to one complete, working configuration. This is the starting template for any new AWS project. Copy it, fill in your bucket name and DynamoDB table, run terraform init, and you are working at a professional standard from line one.

This is what main.tf looks like in a project that follows every practice — locals for naming, merge() for tags, for_each for instances, lifecycle blocks for safety:

# main.tf — Complete reference configuration applying all best practices

# VPC — the network container for all resources
# prevent_destroy protects it from accidental removal in a plan
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true  # Required for public DNS hostnames on instances

  # merge() combines common_tags with this resource's specific Name tag
  # common_tags come from locals.tf — no need to repeat Project, Environment, ManagedBy here
  tags = merge(local.common_tags, {
    Name = "${local.name_prefix}-vpc"
  })

  lifecycle {
    prevent_destroy = true  # Block accidental VPC destruction — remove this line intentionally to destroy
  }
}

# Internet gateway — enables inbound and outbound internet access for the VPC
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id  # Implicit dependency — VPC is created first

  tags = merge(local.common_tags, {
    Name = "${local.name_prefix}-igw"
  })
}

# Public subnet — instances here get public IPs automatically
resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = "${var.region}a"  # Pin to first AZ in the region
  map_public_ip_on_launch = true               # Every instance in this subnet gets a public IP

  tags = merge(local.common_tags, {
    Name = "${local.name_prefix}-subnet-public"
    Tier = "public"
  })
}

# Security group — virtual firewall for the web tier
resource "aws_security_group" "web" {
  name   = "${local.name_prefix}-web-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "Allow all outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"          # -1 means all protocols
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = merge(local.common_tags, {
    Name = "${local.name_prefix}-web-sg"
  })
}

# EC2 instances — one per entry in var.instances map
# for_each gives each instance a stable key — safe to add or remove entries
resource "aws_instance" "web" {
  for_each = var.instances

  ami                    = "ami-0c55b159cbfafe1f0"  # Amazon Linux 2 — us-east-1 only
  instance_type          = each.value                 # Instance type from the map value
  subnet_id              = aws_subnet.public.id
  vpc_security_group_ids = [aws_security_group.web.id]

  # Wait for the internet gateway before creating — it has no direct attribute reference
  # but must exist for the instance to have internet connectivity at launch
  depends_on = [aws_internet_gateway.main]

  tags = merge(local.common_tags, {
    Name = "${local.name_prefix}-web-${each.key}"  # e.g. acme-dev-web-primary
    Role = each.key
  })

  lifecycle {
    create_before_destroy = true   # Create replacement before destroying original — zero downtime
    ignore_changes        = [ami]  # AMI managed externally by patching pipeline — do not revert
  }

  timeouts {
    create = "20m"  # Default is 10m — extend for larger instance types in busy regions
    update = "20m"
    delete = "10m"
  }
}

And the matching outputs.tf:

# outputs.tf — surface the information consumers of this configuration need

output "vpc_id" {
  description = "VPC ID — referenced by other configurations that deploy into this network"
  value       = aws_vpc.main.id
}

output "public_subnet_id" {
  description = "Public subnet ID — pass to any module that needs to launch instances here"
  value       = aws_subnet.public.id
}

output "web_security_group_id" {
  description = "Web security group ID — attach to any instance that needs HTTP/HTTPS access"
  value       = aws_security_group.web.id
}

# for expression builds a map: { "primary" => "i-0abc...", "secondary" => "i-0def..." }
# Much more useful than a list — consumers look up by role name, not by index
output "instance_ids" {
  description = "EC2 instance IDs keyed by role"
  value       = { for k, v in aws_instance.web : k => v.id }
}

output "instance_public_ips" {
  description = "Public IPs of all web instances keyed by role"
  value       = { for k, v in aws_instance.web : k => v.public_ip }
}
$ terraform fmt && terraform validate && terraform plan -out=tfplan

main.tf
locals.tf

Success! The configuration is valid.

Terraform will perform the following actions:

  # aws_vpc.main will be created
  + resource "aws_vpc" "main" {
      + cidr_block           = "10.0.0.0/16"
      + enable_dns_hostnames = true
      + tags = {
          + "Environment" = "dev"
          + "ManagedBy"   = "Terraform"
          + "Name"        = "acme-dev-vpc"
          + "Project"     = "acme"
          + "Region"      = "us-east-1"
        }
    }

  # aws_internet_gateway.main will be created
  + resource "aws_internet_gateway" "main" { ... }

  # aws_subnet.public will be created
  + resource "aws_subnet" "public" { ... }

  # aws_security_group.web will be created
  + resource "aws_security_group" "web" { ... }

  # aws_instance.web["primary"] will be created
  + resource "aws_instance" "web" {
      + ami           = "ami-0c55b159cbfafe1f0"
      + instance_type = "t2.micro"
      + tags = {
          + "Environment" = "dev"
          + "ManagedBy"   = "Terraform"
          + "Name"        = "acme-dev-web-primary"
          + "Project"     = "acme"
          + "Region"      = "us-east-1"
          + "Role"        = "primary"
        }
    }

Plan: 6 to add, 0 to change, 0 to destroy.
Saved the plan to: tfplan

$ terraform apply tfplan

Apply complete! Resources: 6 added, 0 changed, 0 destroyed.

Outputs:

instance_ids = {
  "primary"   = "i-0aaa111bbb222"
  "secondary" = "i-0ccc333ddd444"
}
instance_public_ips = {
  "primary"   = "54.211.89.100"
  "secondary" = "54.211.89.102"
}
public_subnet_id      = "subnet-0abc123"
vpc_id                = "vpc-0def456"
web_security_group_id = "sg-0ghi789"

$ terraform plan

No changes. Your infrastructure matches the configuration.

What just happened?

  • The full workflow ran in one command chain. terraform fmt && terraform validate && terraform plan -out=tfplan — format, validate, plan, save. This is the local workflow from Lesson 5 compressed into a single line. Both fmt and validate must succeed before plan runs — the && operator chains them so the next command only runs if the previous one succeeds.
  • Every resource has five common tags automatically. Environment, ManagedBy, Name, Project, Region — all applied by the combination of default_tags in the provider and merge(local.common_tags, {...}) in each resource. The VPC plan output shows all five tags without a single tag being hardcoded in a resource block.
  • The plan was saved and applied directly. terraform apply tfplan executed exactly the plan that was reviewed — no re-planning, no risk of drift between review and apply. After apply, a follow-up terraform plan confirmed zero changes — the infrastructure matches the configuration exactly.
  • Instance names are self-documenting. acme-dev-web-primary tells you the project, the environment, the tier, and the role — in that order. In an AWS console with hundreds of resources, this naming pattern makes it immediately clear what every resource is and what it belongs to.

Common Mistakes

Creating the .gitignore after the first commit

Once a file is tracked in Git, adding it to .gitignore stops future changes from being tracked — but the file remains in Git history. A state file committed even once exposes every resource ID, IP address, and plaintext password that was in it at commit time. Create .gitignore before the first git add, without exception.

Putting everything in main.tf

A single main.tf with variables, providers, resources, and outputs all mixed together is fine at 50 lines. At 200 lines it is difficult. At 500 lines it is a maintenance problem nobody wants to inherit. Separate concerns from day one — it costs nothing to have four files instead of one.

Skipping descriptions on variables and outputs

A variable named az_count with no description is a mystery in six months. Terraform uses descriptions in the interactive prompt when a required variable has no value, in generated module documentation, and in IDE tooltips. Write them for every variable and every output — one sentence is enough.

The checklist before every apply

1. Run terraform fmt — clean output means all files were already formatted.
2. Run terraform validate — "Success! The configuration is valid." means no syntax errors.
3. Run terraform plan -out=tfplan — read every line, find every -/+, confirm no unintended replacements.
4. Run terraform apply tfplan.
5. Run terraform plan again — "No changes." means you are done.

This five-step sequence is the entire local workflow.

Practice Questions

1. You define a locals block with a value called name_prefix. What is the correct syntax to reference it inside a resource block?



2. After terraform init, which file must be committed to Git to ensure all teammates use the same provider version?



3. Which built-in Terraform function combines two maps together — used to apply common_tags alongside a resource-specific Name tag?



Quiz

1. What is the difference between a local value and a variable in Terraform?


2. What does configuring an S3 backend with a DynamoDB table give you that local state cannot?


3. Why is terraform apply tfplan safer than a plain terraform apply?


Up Next · Section II — Lesson 11

Variables

Section I is complete. Section II starts with variables — and goes far deeper than the basics you have already seen. Types, validation, sensitive values, complex objects, and the full variable precedence chain that determines which value wins when five different sources supply the same variable.