Terraform Lesson 20 – Lifecycle Rules | Dataplexa
Section II · Lesson 20

Lifecycle Rules

The lifecycle block is where you override Terraform's default behaviour for individual resources. You have used prevent_destroy and create_before_destroy. This lesson covers all six lifecycle arguments — including replace_triggered_by, precondition, and postcondition — with the production scenarios that justify each one.

This lesson covers

All six lifecycle arguments → create_before_destroy for zero downtime → replace_triggered_by for cascading replacements → precondition and postcondition for contract-based infrastructure → When each argument belongs in production

The Six Lifecycle Arguments

The lifecycle block sits inside a resource block and accepts six arguments. Each one changes a specific aspect of how Terraform manages that resource across its entire lifetime.

  • create_before_destroy — creates the replacement before destroying the original. Use it for resources that must not have downtime gaps.
  • prevent_destroy — blocks any plan that would destroy the resource. Use it for databases, VPCs, anything catastrophic to lose.
  • ignore_changes — ignores drift on specific attributes. Use it for attributes managed by external systems.
  • replace_triggered_by — forces a replacement when a referenced resource changes. Use it for resources that must be recreated when a dependency changes.
  • precondition — validates inputs before the resource is created. Use it to enforce contracts and catch configuration errors early.
  • postcondition — validates outputs after the resource is created. Use it to verify the resource was created with the expected properties.

Setting Up

Create a project that demonstrates all six lifecycle arguments against real AWS resources. We will build a realistic web tier — EC2 instance, security group, and S3 bucket — and apply different lifecycle rules to each.

mkdir terraform-lesson-20
cd terraform-lesson-20
touch versions.tf variables.tf main.tf outputs.tf .gitignore

Add this to versions.tf:

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.region
}

Add this to variables.tf:

variable "region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "environment" {
  description = "Deployment environment"
  type        = string
  default     = "dev"

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "environment must be dev, staging, or prod."
  }
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t2.micro"
}

variable "ami_id" {
  description = "AMI ID for the EC2 instance — must be valid for the target region"
  type        = string
  default     = "ami-0c55b159cbfafe1f0"  # Amazon Linux 2 — us-east-1 only
}

variable "allowed_cidr_blocks" {
  description = "CIDR blocks allowed inbound on port 80 — must be valid CIDR notation"
  type        = list(string)
  default     = ["0.0.0.0/0"]
}

Run terraform init and continue building main.tf below.

create_before_destroy — Zero Downtime Replacements

By default, when Terraform must replace a resource, the old one is destroyed first and the new one is created afterwards. For a web server behind a load balancer, this creates a gap: the old server is gone and the new one is not ready yet. Traffic fails during this window.

create_before_destroy = true reverses this. The replacement is created and confirmed healthy first. Only then is the original destroyed. For resources attached to load balancers or DNS, this eliminates the downtime window entirely.

New terms:

  • create_before_destroy = true — inverts the default destroy-then-create sequence for this resource. Terraform creates the new resource first, waits for it to be ready, then destroys the original. Terraform also propagates this behaviour automatically to everything the resource depends on, because a dependency cannot be destroyed first while the old copy of this resource still references it.
  • name conflict during replacement — when create_before_destroy is enabled, both the old and new resource exist simultaneously during the transition. If the resource has a globally unique name — like an S3 bucket or an IAM role — you must use a generated or random name. You cannot create two resources with the same name simultaneously.
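One way to sidestep the name conflict is to let the provider generate the name. A sketch using name_prefix on an IAM role — the role itself is illustrative and not part of this lesson's project:

```hcl
resource "aws_iam_role" "app" {
  # name_prefix lets AWS append a unique suffix, so the old and new role
  # can coexist while create_before_destroy swaps them
  name_prefix = "app-role-"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })

  lifecycle {
    create_before_destroy = true
  }
}
```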

Add this to main.tf:

# Data source — look up the latest Amazon Linux 2 AMI dynamically
# This replaces the hardcoded AMI ID variable with a dynamic lookup
data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]  # Only trust AMIs published by Amazon

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]  # Amazon Linux 2 HVM x86_64
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]  # Required for all modern instance types
  }
}

# Security group — name_prefix avoids a name conflict during replacement
resource "aws_security_group" "web" {
  name_prefix = "web-sg-${var.environment}-"
  description = "Web tier security group for ${var.environment}"

  ingress {
    description = "HTTP inbound"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = var.allowed_cidr_blocks  # Driven by variable — validated below
  }

  egress {
    description = "All outbound"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name        = "web-sg-${var.environment}"
    Environment = var.environment
    ManagedBy   = "Terraform"
  }

  lifecycle {
    create_before_destroy = true  # New SG created before old is destroyed
    # Security group names must be unique per VPC — name_prefix generates a
    # fresh unique name, so old and new can coexist during the replacement
  }
}

# EC2 instance — the main reason create_before_destroy matters
# During replacement, new instance must be running before old is terminated
resource "aws_instance" "web" {
  ami                    = data.aws_ami.amazon_linux_2.id  # Dynamic — from data source
  instance_type          = var.instance_type
  vpc_security_group_ids = [aws_security_group.web.id]

  tags = {
    Name        = "web-${var.environment}"
    Environment = var.environment
    ManagedBy   = "Terraform"
    AmiName     = data.aws_ami.amazon_linux_2.name  # Track which AMI version is running
  }

  lifecycle {
    create_before_destroy = true   # New instance created before old is terminated — zero downtime
    ignore_changes        = [ami]  # AMI managed by patching pipeline — do not replace on AMI updates
  }
}
$ terraform apply

data.aws_ami.amazon_linux_2: Reading...
data.aws_ami.amazon_linux_2: Read complete [id=ami-0c55b159cbfafe1f0]

Plan: 2 to add, 0 to change, 0 to destroy.

  Enter a value: yes

aws_security_group.web: Creating...
aws_security_group.web: Creation complete [id=sg-0abc123]

aws_instance.web: Creating...
aws_instance.web: Creation complete [id=i-0abc123def456789]

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

# Now simulate a forced replacement — change the instance type
# Note: recent AWS provider versions update instance_type in place via a
# stop/modify/start cycle; if your plan shows ~ instead of -/+, force the
# replacement with: terraform apply -replace=aws_instance.web

$ terraform apply -var="instance_type=t3.micro"

  # aws_instance.web must be replaced
-/+ resource "aws_instance" "web" {
      ~ instance_type = "t2.micro" -> "t3.micro"  # forces replacement
    }

  # Because create_before_destroy = true:
  # 1. New t3.micro instance created first
  # 2. Old t2.micro instance destroyed after

Plan: 1 to add, 0 to change, 1 to destroy.

What just happened?

  • The instance type change was planned as a replacement. Changing instance_type can never happen on a running instance: it requires at minimum a stop/modify/start cycle, and depending on your provider version Terraform may perform that as an in-place update instead, in which case adding -replace=aws_instance.web to the apply forces the replacement. A replace shows as -/+ in the plan. Without create_before_destroy, the old instance is terminated first, creating a traffic gap. With it, the new t3.micro is running before the t2.micro is terminated.
  • ignore_changes = [ami] prevents constant replacements. Because we use a dynamic AMI data source, the AMI ID changes whenever Amazon releases a new version. Without ignore_changes, every plan after an AMI release would show a replace. The ignore_changes rule means the instance keeps running on its current AMI until an intentional rebuild — not every time Amazon patches something.
  • The security group also has create_before_destroy. If something changes the security group — description, rules — and it triggers a replace, the new SG is created before the old one is detached from the instance. This keeps the instance protected throughout the transition.

replace_triggered_by — Cascading Replacements

replace_triggered_by forces a resource to be replaced whenever a specified resource or attribute changes — even when the resource's own arguments have not changed. This is the tool for cascading replacements that Terraform cannot infer automatically.

Scenario: You have an EC2 instance and a launch configuration. The instance was created from the launch configuration. When the launch configuration changes — new AMI, new user data — the instance should be rebuilt. But the instance's arguments do not directly reference the launch configuration's content, only its ID. Without replace_triggered_by, Terraform does not know the instance needs replacing.

New terms:

  • replace_triggered_by — a list of resource references or attribute references. When any item in the list changes, this resource is added to the plan as a replace even if nothing else about it changed. Accepts full resource references (aws_security_group.web) or specific attribute references (aws_security_group.web.id).
  • aws_launch_template — a versioned EC2 launch configuration. Defines AMI, instance type, user data, security groups, and other instance settings. When you update a launch template, a new version is created — the old version is unchanged. EC2 Auto Scaling groups can reference the latest version automatically, but standalone EC2 instances need an explicit trigger to rebuild.
  • latest_version attribute — on an aws_launch_template resource, this attribute tracks the most recent version number. Using it in replace_triggered_by means any update to the template — which increments latest_version — triggers a replacement of the dependent instance.

Add this to main.tf:

# Launch template — defines the configuration for EC2 instances
# Each update creates a new version — latest_version increments
resource "aws_launch_template" "web" {
  name          = "web-lt-${var.environment}"
  image_id      = data.aws_ami.amazon_linux_2.id  # AMI from data source
  instance_type = var.instance_type

  # User data script runs when the instance first boots
  # Changes to user data require a new instance — the old one ran the old script
  user_data = base64encode(<<-EOT
    #!/bin/bash
    yum update -y
    yum install -y httpd
    systemctl start httpd
    systemctl enable httpd
    echo "<h1>Web Server - ${var.environment}</h1>" > /var/www/html/index.html
  EOT
  )

  tag_specifications {
    resource_type = "instance"

    tags = {
      Name        = "web-lt-${var.environment}"
      Environment = var.environment
      ManagedBy   = "Terraform"
    }
  }
}

# EC2 instance launched from the template
# When the launch template changes, this instance must be replaced
# The instance's own arguments don't reference template content — only its ID
resource "aws_instance" "web_from_template" {
  ami           = data.aws_ami.amazon_linux_2.id  # Must match template AMI
  instance_type = var.instance_type

  launch_template {
    id      = aws_launch_template.web.id             # Reference to the template
    version = aws_launch_template.web.latest_version # Always use the latest version
  }

  tags = {
    Name        = "web-template-${var.environment}"
    Environment = var.environment
    ManagedBy   = "Terraform"
  }

  lifecycle {
    create_before_destroy = true  # Zero downtime during forced replacement

    # When the launch template's latest_version changes — meaning the template was
    # updated — this instance must be replaced to run with the new template version.
    # Without this, the old instance keeps running on the old template indefinitely.
    replace_triggered_by = [
      aws_launch_template.web.latest_version
    ]
  }
}
$ terraform apply  # Initial deploy
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

# Now update the launch template — change the user data
# Edit main.tf: change the echo line to a new message
# This creates a new template version without changing the instance's own arguments

$ terraform plan

  # aws_launch_template.web will be updated in-place
  ~ resource "aws_launch_template" "web" {
      ~ latest_version = 1 -> (known after apply)  # Will become 2
      ~ user_data      = "..." -> "..."             # New script content
    }

  # aws_instance.web_from_template must be replaced
  # (replace_triggered_by aws_launch_template.web.latest_version)
-/+ resource "aws_instance" "web_from_template" {
      # Instance arguments unchanged — but template version incremented
      # replace_triggered_by forces this replacement
    }

Plan: 1 to add, 1 to change, 1 to destroy.

# The template updates in-place (version 1 -> 2)
# The instance is replaced (new instance runs version 2 user data)

What just happened?

  • The instance replacement was triggered by a change the instance block does not own. The user_data changed inside the launch template. The instance block has no user_data argument — it delegates that to the template. Without replace_triggered_by, Terraform would update the template and leave the old instance running with the old script. With it, the new template version triggers an automatic instance replacement.
  • latest_version is the correct trigger — not the template ID. The template ID never changes — it is set at creation and stays constant. latest_version increments on every update. Using the ID as a trigger would never fire. Using latest_version fires on every template update, which is exactly the desired behaviour.
  • create_before_destroy ensures zero downtime during the triggered replacement. The combination of replace_triggered_by and create_before_destroy means: template updates trigger a new instance, and that new instance is healthy before the old one is terminated. Template-driven rolling replacements with no downtime.

precondition — Catch Errors Before Apply

A precondition checks that a condition is true before Terraform creates or modifies a resource. If the condition is false, Terraform aborts the plan with a clear error message — before any infrastructure is touched. This is contract-based infrastructure: define what must be true for a resource to be safely created, and Terraform enforces the contract.

New terms:

  • precondition block — nested inside a lifecycle block. Contains a condition expression that must evaluate to true and an error_message that prints if it is false. Multiple precondition blocks are allowed on the same resource — all must pass.
  • self reference — inside postcondition blocks, self refers to the current resource: self.instance_type accesses the instance_type of the resource the lifecycle block belongs to. self is not available inside precondition blocks, which are evaluated before the resource; preconditions refer to variables, data sources, and other resources instead.
  • contains() in preconditions — checks that a value is in a list of allowed values. More flexible than variable validation because it can check against computed values — the result of data sources or other resource attributes — not just the raw variable value.

Add this S3 bucket with preconditions to main.tf:

# S3 bucket with preconditions — catches configuration errors before apply
resource "aws_s3_bucket" "app_data" {
  bucket = "lesson20-app-data-${var.environment}-${data.aws_caller_identity.current.account_id}"

  tags = {
    Name        = "app-data-${var.environment}"
    Environment = var.environment
    ManagedBy   = "Terraform"
  }

  lifecycle {
    prevent_destroy = true  # Never allow this bucket to be destroyed via plan

    precondition {
      # Ensure the bucket name stays under the 63-character S3 limit
      # Length of the full bucket name must not exceed 63 characters
      condition     = length("lesson20-app-data-${var.environment}-${data.aws_caller_identity.current.account_id}") <= 63
      error_message = "Bucket name exceeds 63 characters. Shorten the name prefix or environment name."
    }

    precondition {
      # Ensure we are not accidentally deploying to a production-scale environment
      # without the correct instance sizing — enforce a configuration contract
      condition     = var.environment != "prod" || var.instance_type != "t2.micro"
      error_message = "Production environment must not use t2.micro. Set instance_type to t3.medium or larger."
    }
  }
}

# Data source for account ID — used in precondition above
data "aws_caller_identity" "current" {}
# Test the environment/instance_type precondition
$ terraform plan -var="environment=prod" -var="instance_type=t2.micro"

Planning...
data.aws_caller_identity.current: Reading...

╷
│ Error: Resource precondition failed
│
│   on main.tf line 47, in resource "aws_s3_bucket" "app_data":
│   47:       condition = var.environment != "prod" || var.instance_type != "t2.micro"
│     ├────────────────
│     │ var.environment is "prod"
│     │ var.instance_type is "t2.micro"
│
│ Production environment must not use t2.micro.
│ Set instance_type to t3.medium or larger.
╵

# No infrastructure was touched — precondition fired at plan time
# Correct the configuration and plan succeeds:
$ terraform plan -var="environment=prod" -var="instance_type=t3.medium"
Plan: 1 to add, 0 to change, 0 to destroy.

What just happened?

  • The precondition fired at plan time — before any API calls. Terraform evaluated both preconditions during the planning phase. When the environment/instance_type combination failed, Terraform printed the exact condition that failed, the values that caused the failure, and the custom error message — then stopped. No AWS API was called, no resource was created or modified.
  • Preconditions can enforce cross-variable contracts. Variable validation in Lesson 11 can only check a single variable's value in isolation — cross-variable references in validation blocks arrived later, in Terraform 1.9. Preconditions can check relationships between variables: var.environment != "prod" || var.instance_type != "t2.micro" is a constraint that requires knowing both variables simultaneously. Under the >= 1.5.0 constraint this project pins, that cannot be expressed in a variable validation block.
  • The error message is actionable. "Set instance_type to t3.medium or larger" tells the engineer exactly what to change. Good error messages in preconditions turn configuration mistakes into self-service fixes rather than debugging sessions.
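Preconditions can also check computed values such as data source results, which variable validation cannot see at all. A sketch against the AMI data source defined earlier — the resource name is illustrative:

```hcl
resource "aws_instance" "arch_checked" {
  ami           = data.aws_ami.amazon_linux_2.id
  instance_type = var.instance_type

  lifecycle {
    precondition {
      # The condition inspects the resolved data source, not a raw variable
      condition     = data.aws_ami.amazon_linux_2.architecture == "x86_64"
      error_message = "The resolved AMI is not x86_64 and will not boot on this instance type."
    }
  }
}
```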

postcondition — Verify After Creation

A postcondition runs after a resource is created or updated. It validates that the resource was provisioned with the expected properties. If the condition fails, Terraform marks the apply as failed — even though the resource was created — and forces correction on the next apply.

Scenario: You create an EC2 instance and want to verify that AWS assigned it a public IP — because your configuration assumes public accessibility but AWS might not assign one depending on subnet settings. A postcondition catches this immediately rather than letting the team discover the problem hours later when the service does not respond.

# EC2 instance with postconditions — verifies the instance was created as expected
resource "aws_instance" "verified_web" {
  ami                         = data.aws_ami.amazon_linux_2.id
  instance_type               = var.instance_type
  associate_public_ip_address = true  # We need a public IP for this instance

  tags = {
    Name        = "verified-web-${var.environment}"
    Environment = var.environment
    ManagedBy   = "Terraform"
  }

  lifecycle {
    create_before_destroy = true

    postcondition {
      # Verify the instance was assigned a public IP after creation
      # If the subnet has map_public_ip_on_launch=false, this may not happen
      # The postcondition catches this immediately rather than silently
      condition     = self.public_ip != ""
      error_message = "Instance was not assigned a public IP. Check subnet settings — map_public_ip_on_launch may be false."
    }

    postcondition {
      # Defensive check: confirm the instance type recorded in state matches
      # the request, guarding against an override elsewhere in the configuration
      condition     = self.instance_type == var.instance_type
      error_message = "Provisioned instance type does not match var.instance_type. Check for overrides elsewhere in the configuration."
    }
  }
}
$ terraform apply

aws_instance.verified_web: Creating...
aws_instance.verified_web: Creation complete [id=i-0abc123def456789]

# Postcondition check runs after creation
# Checking: self.public_ip != ""

# If the subnet doesn't assign public IPs:
╷
│ Error: Resource postcondition failed
│
│   on main.tf line 88, in resource "aws_instance" "verified_web":
│   88:       condition = self.public_ip != ""
│     ├────────────────
│     │ self.public_ip is ""
│
│ Instance was not assigned a public IP. Check subnet settings —
│ map_public_ip_on_launch may be false.
╵

# The instance WAS created — it exists in AWS and in state
# But the apply is marked as failed
# Fix the subnet settings and apply again — or set subnet_id to place the
# instance in a subnet that assigns public IPs by default

# If postconditions pass:
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

What just happened?

  • The postcondition ran after the instance was created. Unlike preconditions which run before creation, postconditions have access to the real resource attributes assigned by AWS — self.public_ip, self.instance_type, self.id. The check happens after the creation API call completes and Terraform reads back the actual state.
  • The instance exists in state even though the apply failed. The creation succeeded — the instance is running in AWS. The postcondition failure marks the apply as failed but does not roll back the creation. On the next apply — after fixing the subnet to assign public IPs — the postcondition will pass and the apply will succeed. The instance is not recreated.
  • self refers to the current resource's attributes after creation. self.public_ip reads the public_ip attribute from the newly created resource. self is valid in only a few contexts: lifecycle postcondition blocks, plus provisioner and connection blocks. Everywhere else, including preconditions, you must use a full resource address.

A Complete Lifecycle Block

Here is a production-grade RDS database resource combining all the lifecycle arguments that matter for a database — the resource type where lifecycle rules have the highest stakes.
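The RDS example references var.db_password, which was never declared in the variables.tf from Setting Up, and if you parameterise the storage settings the same way, two more variables are useful. Minimal declarations — the names and defaults here are this lesson's assumptions, not requirements:

```hcl
variable "db_password" {
  description = "Master password for the RDS instance — supply via TF_VAR_db_password"
  type        = string
  sensitive   = true  # Redacted from plan and apply output
}

variable "db_allocated_storage" {
  description = "RDS allocated storage in GB"
  type        = number
  default     = 20
}

variable "db_storage_encrypted" {
  description = "Whether RDS storage is encrypted — keep true everywhere"
  type        = bool
  default     = true
}
```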

# Production RDS instance — lifecycle rules tuned for zero-risk database management
resource "aws_db_instance" "primary" {
  identifier        = "appdb-${var.environment}"
  engine            = "postgres"
  engine_version    = "15.3"
  instance_class    = "db.t3.micro"              # Demo sizing — use a larger class in prod
  allocated_storage = var.db_allocated_storage   # Variable-driven so the precondition below can check it
  storage_encrypted = var.db_storage_encrypted   # Precondition below enforces true in prod

  db_name  = "appdb"
  username = "dbadmin"
  password = var.db_password  # Sensitive — supplied via TF_VAR_db_password

  skip_final_snapshot       = var.environment != "prod"        # Create a final snapshot in prod only
  final_snapshot_identifier = "appdb-${var.environment}-final" # Required whenever a final snapshot is taken

  tags = {
    Name        = "appdb-${var.environment}"
    Environment = var.environment
    ManagedBy   = "Terraform"
  }

  lifecycle {
    prevent_destroy = true  # A destroyed database is a catastrophic data loss event

    create_before_destroy = true  # Minimise downtime once a replacement is deliberately allowed

    # Auto Scaling may adjust allocated_storage — do not revert it
    # Password rotations happen outside Terraform — do not revert them
    ignore_changes = [
      allocated_storage,  # Managed by RDS storage autoscaling
      password            # Managed by secrets rotation — do not overwrite with old value
    ]

    precondition {
      # Production databases must use storage encryption — never allow unencrypted prod data
      # self is not available in preconditions, so check the variable the argument is built from
      condition     = var.environment != "prod" || var.db_storage_encrypted
      error_message = "Production RDS instances must have storage_encrypted = true."
    }

    precondition {
      # Minimum storage for production — 100GB floor to prevent future scaling issues
      condition     = var.environment != "prod" || var.db_allocated_storage >= 100
      error_message = "Production RDS must have at least 100GB allocated storage."
    }

    postcondition {
      # Verify the database endpoint was assigned — a sign the instance is available
      condition     = self.endpoint != ""
      error_message = "RDS instance was created but no endpoint was assigned. Check VPC and subnet group settings."
    }
  }
}

What just happened?

  • prevent_destroy + create_before_destroy work together, with one caveat. prevent_destroy rejects any plan that would destroy the database, and that includes planned replacements: when a forced replacement is genuinely necessary — an engine version upgrade requiring a new instance — you must first remove prevent_destroy from the configuration. Once it is lifted, create_before_destroy ensures the old database stays available until the replacement is ready. Together they make the database as resilient as possible to both accidents and intentional changes.
  • ignore_changes protects against two external systems. RDS Storage Autoscaling silently increases allocated_storage when the disk fills up. A secrets rotation system updates the password on a schedule. Without ignore_changes, the next plan would revert both — shrinking the storage back and overwriting the current password with the old Terraform value. Both are silent disasters.
  • Two preconditions catch production misconfiguration before the 10-minute RDS creation wait. If someone accidentally sets storage_encrypted = false for a production database, the precondition catches it at plan time — before the 10-minute wait for RDS provisioning, before any cost is incurred, before any risk of unencrypted data touching the disk.

Common Mistakes

Using create_before_destroy with a hardcoded unique name

When a resource uses create_before_destroy and has a globally unique name — like an S3 bucket or an IAM role — both the old and new resource exist simultaneously during replacement. If the name is hardcoded, the creation fails because the name is already taken. Use name_prefix where the provider supports it, a random suffix, or a name that incorporates an identifier that changes with each replacement cycle.
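A sketch of the random-suffix approach using the hashicorp/random provider — the bucket and resource names are illustrative:

```hcl
resource "random_id" "bucket" {
  byte_length = 4  # Yields 8 hex characters

  # Changing a keeper forces a new ID, and therefore a new bucket name,
  # on the next replacement cycle
  keepers = {
    environment = var.environment
  }
}

resource "aws_s3_bucket" "logs" {
  bucket = "app-logs-${var.environment}-${random_id.bucket.hex}"

  lifecycle {
    create_before_destroy = true  # Old and new buckets never share a name
  }
}
```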

Using ignore_changes = all

Telling Terraform to ignore all attribute changes on a resource means Terraform stops being the source of truth for that resource — it will never plan any updates regardless of how far the resource drifts. Only ever ignore specific attributes that are genuinely managed by an external system. ignore_changes = all is almost never the right answer.
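The difference in practice, sketched on a hypothetical instance — the tag key is illustrative:

```hcl
resource "aws_instance" "example" {
  ami           = data.aws_ami.amazon_linux_2.id
  instance_type = var.instance_type

  lifecycle {
    # Wrong: Terraform would never plan another update for this resource,
    # no matter how far it drifts
    # ignore_changes = all

    # Right: ignore only the attributes an external system genuinely owns
    ignore_changes = [
      tags["LastPatched"]  # Written by a patching pipeline, not by Terraform
    ]
  }
}
```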

Writing preconditions that always pass

A precondition that only checks the dev environment but not prod is worse than no precondition — it gives false confidence. Write preconditions that fire in the environment where the risk is highest. The correct pattern is: var.environment != "prod" || PRODUCTION_SAFETY_CONDITION — bypass for non-prod, enforce for prod.
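The pattern in lifecycle form — var.multi_az is a hypothetical variable used only for illustration:

```hcl
lifecycle {
  precondition {
    # Bypassed outside prod (left side is true), enforced in prod
    condition     = var.environment != "prod" || var.multi_az == true
    error_message = "Production databases must set multi_az = true."
  }
}
```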

The lifecycle decision framework

For each resource, ask four questions. Will a replacement cause downtime? Add create_before_destroy. Could an accidental plan destroy something catastrophic? Add prevent_destroy. Is any attribute managed by something other than Terraform? Add it to ignore_changes. Should this resource rebuild when something it depends on changes even though its own arguments are unchanged? Add replace_triggered_by. Then write preconditions for the conditions that must be true before creation and postconditions for what must be true after. A complete lifecycle block answers all four questions.
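The four questions can be answered in a single skeleton block — the ignored tag key is a placeholder:

```hcl
lifecycle {
  create_before_destroy = true                      # 1. A replacement would cause downtime
  prevent_destroy       = true                      # 2. Accidental destruction is catastrophic
  ignore_changes        = [tags["ExternallySet"]]   # 3. This attribute is owned by another system
  replace_triggered_by  = [aws_launch_template.web] # 4. Rebuild when the dependency changes
}
```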

Practice Questions

1. Which lifecycle argument forces a resource to be replaced when a different resource changes — even when the resource's own arguments are unchanged?



2. Inside a postcondition block, which keyword references the current resource's attributes after it has been created?



3. You want Terraform to abort the plan with a clear error if someone tries to deploy a t2.micro EC2 instance to production. Which lifecycle argument implements this check?



Quiz

1. Describe exactly what happens during a resource replacement when create_before_destroy = true.


2. What is the key difference between a precondition and a postcondition?


3. Why should allocated_storage be in ignore_changes on an RDS instance that uses storage autoscaling?


Up Next · Lesson 21

Dependencies

Terraform builds its graph from attribute references — but not every dependency can be expressed that way. Lesson 21 covers implicit and explicit dependencies, dependency cycles, and how to architect configurations that Terraform can parallelise safely.