Terraform Lesson 7 – Configuration Files | Dataplexa
Section I · Lesson 7

Configuration Files

In Lesson 6 you created four files and ran your first apply. But why four files? What goes in each one? What happens if you put everything in one file? This lesson answers all of that — and builds the file structure habits that keep projects maintainable as they grow.

This lesson covers

How Terraform reads files → The standard file structure → What belongs in each file → HCL syntax in depth → terraform.tfvars for environment-specific values

How Terraform Reads Your Files

When you run any Terraform command in a directory, Terraform reads every file ending in .tf in that directory and merges them into a single configuration in memory. It does not care how many files there are or what they are named. One file, ten files — the result is identical.

This means file structure is entirely a human convention. Terraform does not enforce it. But conventions matter — a project where everything is dumped into one 800-line main.tf is a project nobody wants to work in after month three.

Terraform does not read subdirectories. If you create a modules/ subfolder, Terraform ignores it unless you explicitly reference it with a module block — covered in Section III. Everything that runs together lives in the same directory.
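When you do want Terraform to pick up code from a subdirectory, you point at it explicitly with a module block. A minimal sketch — the modules/network path and its argument are hypothetical placeholders; module composition is covered properly in Section III:

```hcl
# Hypothetical example — Terraform reads modules/network/ only
# because this block references it explicitly.
module "network" {
  source = "./modules/network"

  # Arguments here become input variables inside the module.
  environment = "dev"
}
```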

The Standard File Structure

The industry has converged on a standard starting structure for Terraform projects. Every file has a clear responsibility. When a new engineer joins the project, they know exactly where to look for anything.

  • versions.tf — Terraform version constraint and required providers. Commit to Git: yes.
  • main.tf — Resource declarations, the actual infrastructure. Commit to Git: yes.
  • variables.tf — Variable declarations with descriptions, types, and defaults. Commit to Git: yes.
  • outputs.tf — Output value declarations, what to print after apply. Commit to Git: yes.
  • terraform.tfvars — Actual variable values for the current environment. Commit to Git: depends — never if it contains secrets.
  • .terraform.lock.hcl — Provider version lock file, generated by init. Commit to Git: yes, always.
  • .terraform/ — Downloaded provider binaries, generated by init. Commit to Git: no — add it to .gitignore.

versions.tf — The Project Contract

The versions.tf file is the contract your project makes with the tools that run it. It declares the minimum Terraform version required and the exact providers needed. This file protects the project from silently breaking when someone runs it with a newer or older version of Terraform or a different provider version.

We are about to write a versions.tf file that pins both the Terraform CLI version and two providers — AWS and a random provider for generating unique identifiers. Read the version constraint syntax carefully — it appears everywhere in Terraform.

New terms:

  • terraform block — a special block that configures Terraform itself rather than declaring infrastructure. Unlike resource blocks, it does not belong to any provider. It controls the Terraform CLI version requirement and which providers are needed.
  • required_version — a constraint on which Terraform CLI versions are acceptable. If someone runs the project with a version that does not satisfy this constraint, Terraform immediately exits with an error before doing anything else.
  • >= 1.5.0 — means "version 1.5.0 or higher". Other operators: ~> 1.5 allows any 1.x release from 1.5 upward but never 2.0, while ~> 1.5.0 allows only patch releases 1.5.x; = 1.5.0 means exactly that version; != 1.5.0 means anything except that version.
  • hashicorp/random provider — a provider that generates random values — strings, integers, UUIDs, pet names. It makes no API calls to any cloud. It is useful for generating unique resource names where a globally unique suffix is needed, like an S3 bucket name.
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.5"
    }
  }
}

provider "aws" {
  region = var.region
}
$ terraform init

Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 5.0"...
- Finding hashicorp/random versions matching "~> 3.5"...
- Installing hashicorp/aws v5.31.0...
- Installing hashicorp/random v3.6.0...

Terraform has been successfully initialized!

What just happened?

  • Two providers were downloaded. The hashicorp/aws provider at version 5.31.0 — the newest version satisfying ~> 5.0. And hashicorp/random at version 3.6.0. Both are now in the .terraform/ folder and both versions are locked in .terraform.lock.hcl.
  • The random provider requires no credentials. Unlike the AWS provider, which needs an IAM identity, the random provider generates values locally — it never makes a network call. It also takes no provider-level configuration, so declaring a provider "random" block is unnecessary.
  • The version constraint ~> 5.0 means any 5.x. The lock file now pins v5.31.0, and a plain terraform init keeps using that version. If HashiCorp releases 5.99.0 tomorrow, this project picks it up only when someone runs terraform init -upgrade. A 6.0.0 release is never selected — protecting against breaking changes in a major release.
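The constraint operators from the terms list can also be combined in a single string, separated by commas. A few illustrative forms — the version numbers themselves are arbitrary examples:

```hcl
terraform {
  # Exactly one version — brittle, rarely what you want:
  # required_version = "= 1.5.7"

  # Any 1.x at or above 1.5.0, excluding 2.0 — a common choice:
  required_version = ">= 1.5.0, < 2.0.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
      # "~> 5.31" allows 5.31, 5.32, ... but never 6.0.
      # "~> 5.31.0" would allow only patch releases: 5.31.x.
      version = "~> 5.31"
    }
  }
}
```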

variables.tf — Inputs That Make Code Reusable

The variables.tf file declares every input parameter the configuration accepts. It never contains the actual values — only the declaration of what variables exist, what type they are, what they are for, and optionally a default if none is provided.

We are writing a variables.tf that declares four variables covering region, environment, instance type, and a list of allowed environments. The last one demonstrates Terraform's type system — something most beginners skip over and regret later.

New terms:

  • type constraint — tells Terraform what kind of value a variable accepts. The primitive types are string, number, and bool. Complex types include list(string), map(string), and object({}). If you pass the wrong type, Terraform rejects it before planning.
  • default — the value used when no value is provided at runtime. A variable with no default is required — Terraform will prompt for it interactively or fail if none is supplied.
  • validation block — an optional block inside a variable that defines a rule the value must satisfy. If the condition evaluates to false, Terraform shows the error message and stops. This catches bad inputs before they reach the cloud provider API.
  • condition — a boolean expression inside a validation block. It can call Terraform's built-in functions: contains() checks whether a value exists in a list, and var.environment refers to the variable being validated.
variable "region" {
  description = "AWS region to deploy into"
  type        = string
  default     = "us-east-1"
}

variable "environment" {
  description = "Deployment environment — must be dev, staging, or prod"
  type        = string
  default     = "dev"

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "environment must be one of: dev, staging, prod."
  }
}

variable "instance_type" {
  description = "EC2 instance size"
  type        = string
  default     = "t2.micro"
}

variable "allowed_ports" {
  description = "List of inbound ports to allow on the security group"
  type        = list(number)
  default     = [80, 443]
}
$ terraform apply -var="environment=production"

╷
│ Error: Invalid value for input variable
│
│   on variables.tf line 8, in variable "environment":
│    8:   validation {
│
│ The given value is not valid for variable "environment":
│ environment must be one of: dev, staging, prod.
╵

What just happened?

  • The validation block caught a bad value before the plan ran. The value production was passed in but the validation rule only allows dev, staging, or prod. Terraform rejected it immediately with the custom error message — no API call was made, no resources were touched.
  • contains() is a built-in Terraform function. It checks whether a given value exists inside a list. contains(["dev", "staging", "prod"], var.environment) returns true if the value is in the list, false if not. When the condition is false, the error_message is shown.
  • list(number) is a typed list. The allowed_ports variable accepts a list where every item must be a number. If someone passes ["80", "443"] — strings instead of numbers — Terraform will coerce them to numbers automatically. If they pass ["eighty", "https"], it will fail the type check.
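Beyond list(number), the same type system handles maps and structured objects. A hypothetical sketch — these variable names and shapes are illustrative, not part of this lesson's project:

```hcl
# Hypothetical examples of complex type constraints.
variable "common_tags" {
  description = "Tags applied to every resource"
  type        = map(string)
  default = {
    Team     = "platform"
    CostCode = "cc-1234"
  }
}

variable "instance_settings" {
  description = "Structured instance configuration"
  type = object({
    size   = string
    count  = number
    public = bool
  })
  default = {
    size   = "t2.micro"
    count  = 2
    public = false
  }
}
```

Passing a value missing one of the object's attributes, or with the wrong type inside it, fails the type check before planning — the same early rejection the validation block gave you, but enforced structurally.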

main.tf — The Infrastructure

main.tf is where your resource blocks live. It references variables declared in variables.tf and produces attributes that outputs.tf can expose. It is the heart of the configuration.

We are going to write a main.tf that creates an S3 bucket with a globally unique name — generated using the random provider — and a security group with dynamic port rules driven by the allowed_ports variable. This shows how resources wire together across files.

New terms:

  • random_id resource — generates a random byte string and exposes it in multiple encodings. The hex attribute gives a hexadecimal string like a3f2b1c4. The byte_length argument controls how many random bytes are generated — 4 bytes gives an 8-character hex string.
  • aws_s3_bucket — creates an Amazon S3 object storage bucket. The bucket name must be globally unique across all AWS accounts worldwide — not just yours. Using a random suffix guarantees uniqueness.
  • aws_security_group — a virtual firewall for EC2 instances and other AWS resources. Defines inbound (ingress) and outbound (egress) traffic rules at the network level.
  • dynamic block — generates repeated nested blocks from a list or map. Instead of writing one ingress block per port, a dynamic block iterates over the allowed_ports list and generates one ingress rule per port automatically.
  • content block — the body inside a dynamic block that defines what each generated block looks like. each.value refers to the current item from the list being iterated.
resource "random_id" "suffix" {
  byte_length = 4
}

resource "aws_s3_bucket" "app_data" {
  bucket = "acme-app-data-${var.environment}-${random_id.suffix.hex}"

  tags = {
    Environment = var.environment
    ManagedBy   = "Terraform"
  }
}

resource "aws_security_group" "web" {
  name        = "web-sg-${var.environment}"
  description = "Security group for web servers in ${var.environment}"

  dynamic "ingress" {
    for_each = var.allowed_ports
    content {
      from_port   = ingress.value
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Environment = var.environment
    ManagedBy   = "Terraform"
  }
}
$ terraform plan

Terraform will perform the following actions:

  # random_id.suffix will be created
  + resource "random_id" "suffix" {
      + byte_length = 4
      + hex         = (known after apply)
    }

  # aws_s3_bucket.app_data will be created
  + resource "aws_s3_bucket" "app_data" {
      + bucket = (known after apply)
      + tags   = {
          + "Environment" = "dev"
          + "ManagedBy"   = "Terraform"
        }
    }

  # aws_security_group.web will be created
  + resource "aws_security_group" "web" {
      + name = "web-sg-dev"
      + ingress = [
          + {
              + cidr_blocks = ["0.0.0.0/0"]
              + from_port   = 80
              + protocol    = "tcp"
              + to_port     = 80
            },
          + {
              + cidr_blocks = ["0.0.0.0/0"]
              + from_port   = 443
              + protocol    = "tcp"
              + to_port     = 443
            },
        ]
    }

Plan: 3 to add, 0 to change, 0 to destroy.

What just happened?

  • random_id.suffix.hex is (known after apply). The random ID does not exist yet — it will be generated at apply time. Because the S3 bucket name depends on it, the bucket name is also unknown at plan time. This is normal and expected — Terraform shows (known after apply) for any value that depends on a resource not yet created.
  • The dynamic block expanded into two ingress rules. The allowed_ports variable defaults to [80, 443]. Terraform iterated over this list and generated one ingress block for port 80 and one for port 443. If the list had five ports, five rules would appear. The infrastructure adapts to the variable — no code changes needed.
  • The egress rule uses protocol "-1". In AWS security groups, protocol -1 means all protocols. Combined with port range 0-0, this rule allows all outbound traffic — the standard default for most workloads. Restricting egress is covered in the security lessons in Section III.
  • Three resources will be created in the right order. Terraform builds the dependency graph automatically: aws_s3_bucket.app_data waits for random_id.suffix because its name references random_id.suffix.hex, while aws_security_group.web depends only on variables, so it can be created immediately, in parallel with the others.
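A dynamic block can iterate over a map as well as a list, in which case the iterator exposes both .key and .value — useful when each port deserves a human-readable description. A hypothetical variation on the security group above:

```hcl
# Hypothetical variation: drive the rules from a map so every
# generated ingress block carries a description.
variable "ingress_rules" {
  description = "Map of rule name to port"
  type        = map(number)
  default = {
    "http"  = 80
    "https" = 443
  }
}

resource "aws_security_group" "web_described" {
  name = "web-sg-described"

  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      description = ingress.key   # "http" or "https"
      from_port   = ingress.value
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }
}
```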

outputs.tf — What to Surface After Apply

outputs.tf declares what information Terraform should print after an apply and store in state for other configurations to consume. Think of outputs as the return values of your configuration.

We are writing an outputs.tf that surfaces the bucket name — which was only known after apply — and the security group ID. The sensitive argument is introduced here because some outputs must never appear in logs. For that last example, assume a db_password variable has also been declared in variables.tf, alongside the four from earlier.

New terms:

  • sensitive = true — marks an output as sensitive. Terraform redacts its value in terminal output, replacing it with (sensitive value). The value is still stored in the state file — this flag only controls display. Use it for passwords, tokens, private keys, or any value you do not want appearing in CI logs.
  • depends_on — an optional meta-argument available on any resource or output that forces an explicit dependency Terraform cannot infer from attribute references alone. Rarely needed but important to know when implicit dependency detection is insufficient.
output "bucket_name" {
  description = "Name of the S3 bucket — generated with random suffix"
  value       = aws_s3_bucket.app_data.bucket
}

output "bucket_arn" {
  description = "ARN of the S3 bucket — used when granting IAM permissions"
  value       = aws_s3_bucket.app_data.arn
}

output "security_group_id" {
  description = "ID of the web security group — attach to EC2 instances"
  value       = aws_security_group.web.id
}

output "db_password" {
  description = "Database password — redacted from logs"
  value       = var.db_password
  sensitive   = true
}
$ terraform apply

Apply complete! Resources: 3 added, 0 changed, 0 destroyed.

Outputs:

bucket_arn         = "arn:aws:s3:::acme-app-data-dev-a3f2b1c4"
bucket_name        = "acme-app-data-dev-a3f2b1c4"
db_password        = (sensitive value)
security_group_id  = "sg-0abc123def456789"

What just happened?

  • The bucket name was resolved after apply. At plan time it showed (known after apply). After the random ID was generated and the bucket created, the actual name — acme-app-data-dev-a3f2b1c4 — appeared in the outputs. The a3f2b1c4 suffix is the hex value of the random ID.
  • db_password printed as (sensitive value). Even though the value exists in the state file, Terraform redacted it from terminal output. This prevents credentials from appearing in CI/CD logs where they might be captured by log aggregation tools. To read a sensitive output explicitly, run terraform output -json db_password.
  • The ARN is immediately usable. An ARN — Amazon Resource Name — is the globally unique identifier for any AWS resource. The bucket ARN is what you paste into IAM policies when granting access to this specific bucket. Surfacing it as an output means other Terraform configurations can consume it without hardcoding it.
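That last point — other configurations consuming outputs — works through the terraform_remote_state data source. A minimal sketch assuming this project's state sits in a local file; the path is a hypothetical placeholder:

```hcl
# In a *separate* Terraform configuration.
data "terraform_remote_state" "app" {
  backend = "local"
  config = {
    path = "../app-project/terraform.tfstate"   # hypothetical path
  }
}

# Every output declared in outputs.tf is readable under .outputs:
output "consumed_bucket_arn" {
  value = data.terraform_remote_state.app.outputs.bucket_arn
}
```

With a shared backend such as S3, the backend and config arguments change but the .outputs access pattern stays identical.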

terraform.tfvars — Environment-Specific Values

variables.tf declares variables. terraform.tfvars provides their values. Terraform automatically loads any file named terraform.tfvars in the current directory. This is how the same configuration runs differently in dev, staging, and production — different .tfvars files, same .tf code.

Here is a practical setup — three separate variable files, one per environment. We will write all three so you can see how the same configuration produces different infrastructure by swapping the values file.

New terms:

  • terraform.tfvars — automatically loaded by Terraform without any flags. Contains variable assignments in the simple format variable_name = value. No variable keyword, no quotes around the name.
  • *.auto.tfvars — any file ending in .auto.tfvars is also loaded automatically. Useful for splitting variable values across multiple files.
  • -var-file flag — explicitly loads a named variable file. terraform apply -var-file="prod.tfvars". Used when you want to select which environment values to use at runtime rather than by renaming files.
environment   = "dev"
region        = "us-east-1"
instance_type = "t2.micro"
allowed_ports = [80, 443, 8080]

Save the above as dev.tfvars. Now write the production values file — note how the instance type is larger and port 8080 is removed:

environment   = "prod"
region        = "us-east-1"
instance_type = "t3.medium"
allowed_ports = [80, 443]

Save as prod.tfvars. Now run the same configuration with each — the abbreviated plans below also assume main.tf contains an aws_instance resource that references var.instance_type, which is how that variable takes effect:

terraform apply -var-file="dev.tfvars"

terraform apply -var-file="prod.tfvars"
$ terraform plan -var-file="dev.tfvars"

  + aws_instance.web {
      + instance_type = "t2.micro"
      + tags = { "Environment" = "dev" }
    }
  + aws_security_group.web {
      + name    = "web-sg-dev"
      + ingress = [port 80, port 443, port 8080]
    }

$ terraform plan -var-file="prod.tfvars"

  + aws_instance.web {
      + instance_type = "t3.medium"
      + tags = { "Environment" = "prod" }
    }
  + aws_security_group.web {
      + name    = "web-sg-prod"
      + ingress = [port 80, port 443]
    }

What just happened?

  • The same four .tf files produced completely different infrastructure. Dev got a t2.micro with three security group rules. Prod got a t3.medium with two. Zero code changes — only the values file changed.
  • The security group name reflects the environment. Because the name is "web-sg-${var.environment}", it automatically became web-sg-dev or web-sg-prod. String interpolation makes names self-documenting without any extra logic.
  • This is the foundation of multi-environment infrastructure. One codebase. Multiple environments. Environment-specific behaviour controlled entirely through variable values. This pattern scales to dozens of environments without duplicating a single resource block.
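The *.auto.tfvars mechanism mentioned earlier pairs well with this pattern: values shared by every environment can live in one automatically loaded file, while the per-environment file supplies only what differs. A hypothetical common.auto.tfvars:

```hcl
# common.auto.tfvars — loaded automatically on every run,
# regardless of which -var-file is passed. Hypothetical example.
region        = "us-east-1"
instance_type = "t2.micro"
```

Because -var-file assignments are processed after *.auto.tfvars files, prod.tfvars setting instance_type = "t3.medium" overrides the shared default for that run.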

Common Mistakes

Putting actual values in variables.tf instead of declarations

variables.tf declares what variables exist. It should never contain the actual production values — those go in terraform.tfvars or are passed via -var-file. When you put real environment values in variables.tf as defaults, the same file gets committed to Git and every environment uses the same value unless someone actively overrides it. This defeats the purpose of variables entirely.

Committing terraform.tfvars when it contains secrets

It is tempting to commit terraform.tfvars to Git for convenience. For non-sensitive values like region and instance type, this is fine. For any variable that contains a password, API key, or token — never commit it. Use a secrets manager, environment variables, or a tool like HashiCorp Vault. A committed secret in Git history is compromised forever — even after deletion.
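One safe alternative for the db_password case: Terraform reads any environment variable named TF_VAR_<variable_name> as the value for that variable, so the secret never touches a file on disk. A sketch — the value shown is a throwaway placeholder:

```shell
# Terraform maps TF_VAR_db_password onto var.db_password.
# Placeholder value — in practice, fetch it from a secrets manager.
export TF_VAR_db_password='example-not-a-real-secret'
terraform apply -var-file="prod.tfvars"
```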

Skipping the description argument on variables and outputs

The description argument is optional but skipping it is a professional mistake. In six months, a variable called az_count with no description is a mystery. Terraform uses descriptions in generated documentation and in the interactive prompt when a variable has no default and no value is provided. Write descriptions for every variable and every output — always.

When main.tf gets too large — split by resource type

When main.tf grows past 150–200 lines, split it into purpose-specific files — networking.tf for VPCs and subnets, compute.tf for EC2 instances, storage.tf for S3 and RDS. Terraform reads all of them together — nothing changes functionally. The project simply becomes much easier to navigate. There is no rule about when to split — use your judgement, but do it before the file becomes painful, not after.
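A project split this way might look like the sketch below — the exact file names are a convention, not a Terraform requirement:

```
project/
├── versions.tf      # version and provider constraints
├── variables.tf     # all variable declarations
├── networking.tf    # VPC, subnets, security groups
├── compute.tf       # EC2 instances, autoscaling
├── storage.tf       # S3 buckets, RDS databases
└── outputs.tf       # all output declarations
```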

Practice Questions

1. Which file contains variable declarations — their names, types, descriptions, and defaults?



2. Which file does Terraform automatically load to get the actual values for declared variables?



3. Which argument on an output block prevents its value from appearing in terminal output and CI logs?



Quiz

1. What does a dynamic block do in a Terraform resource?


2. A variable has a validation block. What happens when an invalid value is passed?


3. You manage dev, staging, and prod environments with the same Terraform configuration. How do you apply to each environment separately?


Up Next · Lesson 8

Providers

You have used the AWS provider and barely scratched it. Lesson 8 goes deep — multiple provider configurations, aliasing, cross-region resources, and what happens when a provider version breaks your configuration.