Terraform Lesson 26 – Modules Introduction | Dataplexa
Section III · Lesson 26

Introduction to Modules

Every Terraform configuration is already a module — the root module. But when the same infrastructure pattern appears in multiple places — a VPC, a standardised ECS service, a hardened S3 bucket — copying resource blocks is the wrong answer. Modules are how you package infrastructure patterns so they can be reused, versioned, and maintained in one place.

This lesson covers

What modules are and why they exist → Module structure → Building a reusable S3 bucket module → Calling the module from a root configuration → Module inputs and outputs → Local vs registry modules

What Modules Are

A module is a directory containing Terraform configuration files. That is the entire definition. The root module is the directory you run terraform apply from. A child module is any directory that the root module (or another module) calls using a module block.

Modules serve three purposes: encapsulation — hiding implementation detail behind a clean interface; reuse — one module called from many places instead of duplicated blocks; and standardisation — enforcing that every S3 bucket in the organisation has versioning enabled, that every security group follows naming convention, that every VPC has the same routing pattern.

The Analogy

A Terraform module is like a function in code. You define it once — inputs as parameters, resources as the function body, outputs as return values. You call it many times with different inputs and get different results. The callers do not need to know how the function works internally — they just need to know what to pass in and what comes back.

Concept    In a function                       In a module
Inputs     Function parameters                 variable blocks in the module
Logic      Function body                       resource and data blocks in the module
Outputs    Return value                        output blocks in the module
Calling    result = myFunction(arg1, arg2)     module "name" { source = "..." var1 = val1 }
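In Terraform syntax, the analogy looks like this. A sketch only — the module name, bucket name, and output are hypothetical here; a real version of this module is built later in the lesson:

```hcl
# Calling the "function": the module block supplies the arguments
module "uploads" {
  source      = "./modules/s3-bucket"   # where the "function body" lives
  bucket_name = "acme-uploads"          # an input — like a function parameter
}

# Reading the "return value": module outputs
output "uploads_arn" {
  value = module.uploads.bucket_arn     # an output — like a return value
}
```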

Module Structure

A module is just a directory with .tf files. But a well-structured module follows the same file conventions as a root configuration — with one addition: a README.md that documents what the module does, what inputs it accepts, and what outputs it produces.

# Standard module directory structure
modules/
└── s3-bucket/              # Module name — describes what it creates
    ├── README.md           # Required: what it does, inputs, outputs, example usage
    ├── variables.tf        # Module inputs — the module's public interface
    ├── main.tf             # Resources the module creates
    ├── outputs.tf          # Module outputs — what it exposes to callers
    └── versions.tf         # Required provider version constraints — no backend block

# Root configuration that calls the module
my-project/
├── versions.tf             # Backend + providers
├── variables.tf
├── main.tf                 # Calls the module(s) here
├── outputs.tf
└── modules/
    └── s3-bucket/          # Local module source

The key rule: modules must never contain a backend block. A module is called by a root configuration that already has its own backend, and Terraform only allows backend configuration in the root module — terraform init fails if a child module declares one. The backend lives only in the root configuration.

Building a Reusable S3 Bucket Module

We will build a module that creates a fully-configured S3 bucket — with versioning, encryption, and public access blocking enabled by default. Every time the organisation needs an S3 bucket, they call this module. The security defaults are baked in — they cannot be forgotten.

Create the module directory:

mkdir -p terraform-lesson-26/modules/s3-bucket
mkdir -p terraform-lesson-26/root
cd terraform-lesson-26

New terms:

  • module variables.tf — declares the module's inputs. These are the arguments the caller must (or may) pass when calling the module. Required variables have no default. Optional variables have sensible defaults that encode best practices. The types and validation rules here are the module's contract with its callers.
  • module outputs.tf — declares what the module exposes. Callers access these as module.NAME.OUTPUT_NAME. Expose the IDs and ARNs that callers are most likely to need — bucket ID, ARN, regional domain name. Do not expose internal implementation details.
  • module versions.tf — declares required_providers so the module documents what it needs. The module does not configure providers itself — it inherits the provider instances configured by the root configuration. No provider block is needed in the module unless it requires an explicitly aliased provider.

Add this to modules/s3-bucket/versions.tf:

terraform {
  # Minimum Terraform version this module requires
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"  # Minimum provider version — no upper bound in modules
      # Modules use >= not ~> for required_providers
      # The root configuration controls the actual provider version
      # An upper bound in the module would conflict with the root's constraint
    }
  }

  # No backend block — modules never have backends
}

Add this to modules/s3-bucket/variables.tf:

# Module inputs — the public interface of the s3-bucket module

variable "bucket_name" {
  description = "Globally unique name for the S3 bucket — max 63 characters, lowercase, hyphens only"
  type        = string

  validation {
    # S3 bucket naming rules: 3-63 chars, lowercase, numbers, hyphens
    condition     = can(regex("^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$", var.bucket_name))
    error_message = "bucket_name must be 3-63 characters, start and end with a letter or number, and contain only lowercase letters, numbers, and hyphens."
  }
}

variable "enable_versioning" {
  description = "Enable S3 versioning — recommended for all buckets, required for state buckets"
  type        = bool
  default     = true  # On by default — callers must explicitly opt out
}

variable "enable_encryption" {
  description = "Enable server-side encryption using AES-256"
  type        = bool
  default     = true  # On by default — security best practice
}

variable "block_public_access" {
  description = "Block all public access to the bucket — disable only for static website hosting"
  type        = bool
  default     = true  # On by default — most buckets should never be public
}

variable "lifecycle_rules" {
  description = "S3 lifecycle rules for automatic object transitions and expiration"
  type = list(object({
    id      = string        # Unique ID for this rule
    enabled = bool          # Whether the rule is active
    prefix  = optional(string, "")  # Only apply to objects with this key prefix
    transitions = optional(list(object({
      days          = number  # Days after creation before transition
      storage_class = string  # Target storage class: STANDARD_IA, GLACIER, etc.
    })), [])
    expiration_days = optional(number, null)  # Days after creation to delete the object
  }))
  default = []  # No lifecycle rules by default — callers add them as needed
}

variable "tags" {
  description = "Additional tags to apply to all resources — merged with module defaults"
  type        = map(string)
  default     = {}
}

variable "force_destroy" {
  description = "Allow bucket deletion even if it contains objects — DANGER: set true only for dev buckets"
  type        = bool
  default     = false  # Safe default — prod buckets must not be force-destroyed
}

Add this to modules/s3-bucket/main.tf:

# The S3 bucket and all its security configuration
# Callers get versioning, encryption, and public access blocking by default
# without having to know about the individual sub-resources

resource "aws_s3_bucket" "this" {
  bucket        = var.bucket_name
  force_destroy = var.force_destroy  # Controlled by caller — default is safe (false)

  tags = merge(var.tags, {
    # Module always adds this tag — callers cannot override it
    # This marks the bucket as created by the module, not manually
    ManagedBy = "terraform-s3-bucket-module"
  })

  # Note: prevent_destroy cannot be driven by a variable — lifecycle
  # meta-arguments only accept literal values, so a module cannot toggle
  # it per caller. Deletion safety comes from force_destroy instead: with
  # the default of false, AWS refuses to delete a non-empty bucket.
}

# Versioning — enabled by default, can be disabled via variable
resource "aws_s3_bucket_versioning" "this" {
  bucket = aws_s3_bucket.this.id  # Implicit dependency on the bucket

  versioning_configuration {
    # Enabled = versioning on. Suspended = versioning paused but history kept.
    status = var.enable_versioning ? "Enabled" : "Suspended"
  }
}

# Server-side encryption — enabled by default
resource "aws_s3_bucket_server_side_encryption_configuration" "this" {
  # Only create this resource when encryption is enabled
  count  = var.enable_encryption ? 1 : 0
  bucket = aws_s3_bucket.this.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"  # S3-managed keys — no extra cost or key management
    }

    # Bucket keys reduce KMS API calls when using SSE-KMS. With AES256
    # (S3-managed keys) the setting has no effect, but it is harmless and
    # future-proofs a later switch to KMS encryption.
    bucket_key_enabled = true
  }
}

# Public access block — enabled by default — most buckets should never be public
resource "aws_s3_bucket_public_access_block" "this" {
  count  = var.block_public_access ? 1 : 0
  bucket = aws_s3_bucket.this.id

  block_public_acls       = true  # Block any attempt to set public ACLs
  block_public_policy     = true  # Block any public bucket policy
  ignore_public_acls      = true  # Ignore existing public ACLs
  restrict_public_buckets = true  # If a public policy exists, limit access to the bucket owner and AWS services
}

# Lifecycle rules — optional, driven by the lifecycle_rules variable
resource "aws_s3_bucket_lifecycle_configuration" "this" {
  # Only create if there are lifecycle rules to apply
  count  = length(var.lifecycle_rules) > 0 ? 1 : 0
  bucket = aws_s3_bucket.this.id

  # Versioning must be configured before lifecycle rules can be applied to versions
  depends_on = [aws_s3_bucket_versioning.this]

  # Dynamic block generates one rule per entry in var.lifecycle_rules
  dynamic "rule" {
    for_each = var.lifecycle_rules
    iterator = lc  # Rename the iterator — the default name would be the block label, "rule"

    content {
      id     = lc.value.id
      status = lc.value.enabled ? "Enabled" : "Disabled"

      # Apply rule only to objects matching the prefix — empty string = all objects
      filter {
        prefix = lc.value.prefix
      }

      # Dynamic block for transitions — zero or more per rule
      dynamic "transition" {
        for_each = lc.value.transitions
        iterator = t

        content {
          days          = t.value.days
          storage_class = t.value.storage_class
        }
      }

      # Optional expiration — only include the block when expiration_days is set
      dynamic "expiration" {
        for_each = lc.value.expiration_days != null ? [lc.value.expiration_days] : []
        iterator = exp

        content {
          days = exp.value  # The list contains one element — the number of days
        }
      }
    }
  }
}

Add this to modules/s3-bucket/outputs.tf:

# Module outputs — what callers can reference after creating the bucket

output "bucket_id" {
  description = "The name of the S3 bucket — same as the ID, useful for resource references"
  value       = aws_s3_bucket.this.id
}

output "bucket_arn" {
  description = "The ARN of the S3 bucket — use in IAM policies and resource-based policies"
  value       = aws_s3_bucket.this.arn
}

output "bucket_regional_domain_name" {
  description = "Regional domain name for the bucket — use for CloudFront origins and pre-signed URLs"
  value       = aws_s3_bucket.this.bucket_regional_domain_name
}

output "bucket_hosted_zone_id" {
  description = "The Route 53 hosted zone ID for the bucket's region — use for alias DNS records"
  value       = aws_s3_bucket.this.hosted_zone_id
}

output "versioning_enabled" {
  description = "Whether versioning is currently enabled on the bucket"
  value       = var.enable_versioning
}

Calling the Module from a Root Configuration

With the module built, create the root configuration that calls it. A root configuration can call the same module multiple times with different inputs — one call for each bucket it needs.

New terms:

  • module block — declares a call to a child module. The source argument is required — it tells Terraform where the module code lives. Arguments beyond source and version are passed as input variables to the module.
  • local module source — a relative path starting with ./ or ../. Points to a directory on the local filesystem. Used during development and for project-specific modules that will never be published to a registry. The version argument is not supported for local paths — the code on disk is always used as-is.
  • module.NAME.OUTPUT_NAME — the syntax for accessing a module's output from the root configuration. module.app_data.bucket_arn accesses the bucket_arn output from the module block named app_data.
  • terraform init with modules — when a configuration calls modules, terraform init installs them under .terraform/modules/ and records them in a module manifest. Remote modules are downloaded and cached there; local modules are referenced in place through their relative path rather than copied. You must run terraform init again whenever you add a new module call or change a module's source.

Add this to root/versions.tf:

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = {
      Project   = "lesson26"
      ManagedBy = "Terraform"
    }
  }
}

Add this to root/main.tf:

# Data source for current account ID — used in bucket names
data "aws_caller_identity" "current" {}

# First call — application data bucket with default security settings
module "app_data" {
  source = "../modules/s3-bucket"  # Relative path to the local module

  bucket_name = "lesson26-app-data-${data.aws_caller_identity.current.account_id}"

  # Override defaults where the application needs different behaviour
  enable_versioning = true   # Explicitly true — makes intent clear
  force_destroy     = false  # Production bucket — never force-destroy

  # Lifecycle rules — transition to cheaper storage after 90 days
  lifecycle_rules = [
    {
      id      = "transition-to-ia"
      enabled = true
      transitions = [
        { days = 90,  storage_class = "STANDARD_IA" },  # Infrequent access after 90 days
        { days = 365, storage_class = "GLACIER"     }   # Archive after 1 year
      ]
    }
  ]

  tags = {
    Environment = "prod"
    Component   = "app-data"
  }
}

# Second call — log bucket with shorter retention
# Same module, different inputs — no code duplication
module "access_logs" {
  source = "../modules/s3-bucket"

  bucket_name   = "lesson26-access-logs-${data.aws_caller_identity.current.account_id}"
  force_destroy = false

  # Log buckets: versioning off (logs are not versioned), encryption on
  enable_versioning = false
  enable_encryption = true

  # Expire log files after 90 days — compliance retention period
  lifecycle_rules = [
    {
      id              = "expire-old-logs"
      enabled         = true
      expiration_days = 90  # Delete log files after 90 days
    }
  ]

  tags = {
    Environment = "prod"
    Component   = "access-logs"
  }
}

# Third call — dev scratch bucket (less restrictive)
module "dev_scratch" {
  source = "../modules/s3-bucket"

  bucket_name   = "lesson26-dev-scratch-${data.aws_caller_identity.current.account_id}"
  force_destroy = true  # Dev bucket — OK to force-destroy for cleanup

  enable_versioning = false  # No versioning needed for temporary dev work

  tags = {
    Environment = "dev"
    Component   = "scratch"
  }
}

Add this to root/outputs.tf:

# Access module outputs using module.NAME.OUTPUT_NAME syntax

output "app_data_bucket_arn" {
  description = "ARN of the application data bucket"
  value       = module.app_data.bucket_arn  # module.MODULE_NAME.OUTPUT_NAME
}

output "access_logs_bucket_id" {
  description = "Name of the access logs bucket"
  value       = module.access_logs.bucket_id
}

output "dev_scratch_bucket_id" {
  description = "Name of the dev scratch bucket"
  value       = module.dev_scratch.bucket_id
}
cd root

$ terraform init

Initializing modules...
- app_data in ../modules/s3-bucket
- access_logs in ../modules/s3-bucket
- dev_scratch in ../modules/s3-bucket

Initializing provider plugins...
Installing hashicorp/aws v5.31.0...

$ terraform apply

Plan: 14 to add, 0 to change, 0 to destroy.
# 14 resources: each module call creates multiple sub-resources
# 3 modules × (1 bucket + 1 versioning + 1 encryption + 1 public_access_block) = 12
# plus one lifecycle configuration each for app_data and access_logs = 14

Terraform will perform the following actions:

  # module.app_data.aws_s3_bucket.this will be created
  + resource "aws_s3_bucket" "this" {
      + bucket = "lesson26-app-data-123456789012"
    }

  # module.app_data.aws_s3_bucket_lifecycle_configuration.this[0] will be created
  + resource "aws_s3_bucket_lifecycle_configuration" "this" { ... }

  # module.access_logs.aws_s3_bucket.this will be created
  + resource "aws_s3_bucket" "this" {
      + bucket = "lesson26-access-logs-123456789012"
    }

  # module.dev_scratch.aws_s3_bucket.this will be created
  + resource "aws_s3_bucket" "this" {
      + bucket = "lesson26-dev-scratch-123456789012"
    }

  Enter a value: yes

Apply complete! Resources: 14 added, 0 changed, 0 destroyed.

Outputs:
app_data_bucket_arn    = "arn:aws:s3:::lesson26-app-data-123456789012"
access_logs_bucket_id  = "lesson26-access-logs-123456789012"
dev_scratch_bucket_id  = "lesson26-dev-scratch-123456789012"

What just happened?

  • Three module calls created fourteen real AWS resources from one configuration file. The module encapsulated all the sub-resources — versioning, encryption, public access block, lifecycle — behind a simple interface. The caller in main.tf passes five arguments and gets a fully configured, hardened S3 bucket. The security defaults cannot be forgotten because they are baked into the module.
  • Module resource addresses include the module path. In the plan output, resources are addressed as module.app_data.aws_s3_bucket.this — not just aws_s3_bucket.this. The module name is part of the address. This means terraform state list shows all three module instances as separate groups, and terraform state show module.app_data.aws_s3_bucket.this shows just that one bucket.
  • Outputs from the module are accessed with module.NAME.OUTPUT_NAME. The root configuration's outputs.tf uses module.app_data.bucket_arn to expose the ARN that the module computed internally. The root configuration never needs to know the bucket's ARN directly — it delegates that knowledge to the module.
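The module-prefixed addresses are visible directly in state. The listing below is an illustrative transcript — the addresses follow from this lesson's configuration, but ordering and exact contents depend on your run:

```
$ terraform state list
data.aws_caller_identity.current
module.access_logs.aws_s3_bucket.this
module.access_logs.aws_s3_bucket_lifecycle_configuration.this[0]
module.access_logs.aws_s3_bucket_public_access_block.this[0]
module.access_logs.aws_s3_bucket_server_side_encryption_configuration.this[0]
module.access_logs.aws_s3_bucket_versioning.this
module.app_data.aws_s3_bucket.this
module.dev_scratch.aws_s3_bucket.this
...
```

Note the [0] suffixes — they appear on every resource the module creates with count, such as the encryption and public access block sub-resources.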

Local Modules vs Registry Modules

The module we built uses a local source — a relative path to a directory on disk. Terraform also supports modules from the public Terraform Registry, private registries, Git repositories, and more. The source argument determines where Terraform fetches the module from.

# Different module source types

# Local path — relative directory on disk (this lesson)
module "s3" {
  source = "../modules/s3-bucket"
  # No version argument — local modules do not have versions
}

# Public Terraform Registry — the most widely used modules
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"  # registry.terraform.io format
  version = "~> 5.0"                          # Version required for registry modules
  # Arguments specific to this module:
  cidr = "10.0.0.0/16"
}

# GitHub — pin to a specific commit or tag for reproducibility
module "vpc" {
  source = "github.com/terraform-aws-modules/terraform-aws-vpc?ref=v5.1.0"
  # ref= pins to a tag, branch, or commit hash
}

# Private registry — for internal modules in HCP Terraform or Terraform Enterprise
module "vpc" {
  source  = "app.terraform.io/acme/vpc/aws"
  version = "~> 2.0"
}

# Git with SSH
module "vpc" {
  source = "git::ssh://git@github.com/acme/terraform-modules.git//modules/vpc?ref=v2.0.0"
  # // separates the repo URL from the subdirectory within the repo
}

Which source type should you use?

  • Local paths for project-specific modules that only this project needs and will never be shared with other projects. Fast to iterate — no publishing step required. Zero versioning overhead.
  • Terraform Registry for well-maintained community modules — the terraform-aws-modules organisation has production-grade modules for VPC, EKS, RDS, and more that are used by thousands of organisations. Always pin the version with version = "~> X.Y".
  • Git with a pinned ref for internal modules that are shared across multiple projects but published to an internal Git server rather than a public registry. The ?ref= pin is critical — without it, every terraform init could fetch a different version.

Common Mistakes

Adding a backend block to a module

A module with a terraform { backend "s3" { ... } } block will cause an error when the root configuration initialises — Terraform does not support multiple backends. Modules declare required_providers but never a backend. State management is always the root configuration's responsibility.
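As an illustration, this is the anti-pattern — a backend block inside a child module's versions.tf (the exact error wording varies by Terraform version):

```hcl
# WRONG — inside modules/s3-bucket/versions.tf
terraform {
  required_version = ">= 1.5.0"

  backend "s3" {                # terraform init rejects this:
    bucket = "my-state-bucket"  # backend configuration belongs only
    key    = "state.tfstate"    # in the root module
    region = "us-east-1"
  }
}
```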

Forgetting terraform init after adding a module call

Every new module block — including new calls to an already-initialised local module — requires running terraform init before plan or apply. Terraform must download and cache the module source. If you skip this step, Terraform errors with "Module not installed" or "Module source has changed."
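The failure mode looks roughly like this — an illustrative transcript, with the exact wording and line numbers depending on your Terraform version and configuration:

```
$ terraform plan
Error: Module not installed

  on main.tf line 24:
  24: module "dev_scratch" {

This module is not yet installed. Run "terraform init" to install all
modules required by this configuration.
```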

Using ~> in a module's required_providers

Using version = "~> 5.0" in a module's required_providers block imposes an upper bound (< 6.0) that the root configuration cannot override. If the root pins ~> 5.31, the constraints still intersect — but the moment the root upgrades to ~> 6.0, the module's upper bound makes the combined constraint unsatisfiable, even if the module would work perfectly with 6.x. Module provider requirements should use >= with a minimum version only — never ~>. Leave the upper bound to the root configuration, which is the only place that knows which provider version is actually deployed.

When to extract into a module

Extract into a module when: the same group of resources is needed in more than one place; when the group has a stable, meaningful interface (clear inputs and outputs); or when you want to enforce standards that callers cannot override. Do not extract into a module just to organise a single project — file splitting (networking.tf, compute.tf, storage.tf) is simpler and achieves the same organisational goal without the overhead of module inputs and outputs.

Practice Questions

1. You have a module block named "app_data" that exposes an output called "bucket_arn". What expression accesses this output from the root configuration?



2. You add a new module block to main.tf that was not there before. Before running terraform plan, what command must you run?



3. Should a module ever contain a terraform backend block?



Quiz

1. What are the three building blocks of a Terraform module's public interface?


2. A module needs at least AWS provider version 5.0. What is the correct version constraint to use in the module's required_providers block?


3. A module named "vpc" contains resource "aws_vpc" "main". What is the full Terraform state address of this resource?


Up Next · Lesson 27

Module Inputs, Outputs, and Composition

One module is a building block. Multiple modules composed together are an architecture. Lesson 27 covers module composition patterns — passing outputs from one module as inputs to another, module dependency ordering, and building a layered infrastructure from reusable components.