Terraform Course
Terraform with AWS
The AWS provider is the most widely used Terraform provider in the world. This lesson goes beyond the basics — covering the full authentication credential chain, multi-account deployments with assume_role, multi-region infrastructure with provider aliases, default_tags, and the key resource patterns every AWS Terraform engineer encounters in production.
This lesson covers
AWS provider authentication chain → Multi-account with assume_role → Multi-region with provider aliases → default_tags and tags_all → IAM role and policy patterns → Security group rule conflicts → KMS key with policy → ARN patterns and cross-resource references
AWS Provider Authentication Chain
The AWS provider authenticates using the same credential chain as the AWS CLI and SDK. It checks each source in order and uses the first one that provides valid credentials. Understanding the chain prevents the most common authentication mistakes.
New terms:
- credential chain — the ordered list of sources the AWS provider checks for credentials. First valid source wins. If no source provides credentials, the provider fails with an authentication error.
- instance metadata / IMDS — when Terraform runs on AWS compute with an attached role (an EC2 instance with an instance profile, or an ECS task or CodeBuild project with a task/service role), the SDK fetches credentials from the instance metadata or container credentials endpoint automatically. No configuration needed — it happens transparently.
- named profile — a set of credentials stored in ~/.aws/credentials under a name. profile = "acme-prod" in the provider block selects a specific profile for local development.
# AWS credential chain — checked in this order, first valid source wins
# 1. Static credentials in provider block — NEVER use in production
provider "aws" {
region = "us-east-1"
access_key = "AKIAIOSFODNN7EXAMPLE" # Hardcoded — ends up in Git — never do this
secret_key = "wJalrXUtnFEMI/K7MDENG" # Same — dangerous
}
# 2. Environment variables — recommended for CI/CD pipelines
# export AWS_ACCESS_KEY_ID="AKIA..."
# export AWS_SECRET_ACCESS_KEY="..."
# export AWS_DEFAULT_REGION="us-east-1"
provider "aws" {
region = "us-east-1" # Provider reads credentials from environment automatically
}
# 3. Named profile — recommended for local development
provider "aws" {
region = "us-east-1"
profile = "acme-dev" # Uses [acme-dev] section from ~/.aws/credentials
}
# 4. Instance metadata — for AWS-hosted CI/CD (CodeBuild, EC2, ECS)
# No credentials configuration needed — SDK fetches from IMDS automatically
provider "aws" {
region = "us-east-1" # Credentials come from the attached instance profile
}
# Best practice for CI/CD — environment variables, clean provider block
provider "aws" {
region = var.region # Region from variable — credentials from environment
default_tags {
tags = {
Project = var.project
Environment = var.environment
ManagedBy = "Terraform"
}
}
}
Multi-Account Deployments with assume_role
Most organisations structure AWS into multiple accounts — one per environment (dev, staging, prod) or one per team. Terraform in CI/CD runs from a central tools account and assumes a role in each target account. This keeps prod credentials completely out of the dev pipeline.
New terms:
- assume_role — IAM mechanism where one identity temporarily adopts the permissions of another role. The CI/CD runner assumes a role in the target account and all API calls use those credentials. The runner's own credentials never need cross-account permissions.
- session_name — a label that appears in CloudTrail for every API call made in the assumed role session. Use terraform-prod-deployment so auditors can trace exactly which Terraform run made each change.
- external_id — a shared secret required to assume the role. Prevents confused deputy attacks — only callers that know this value can assume the role, even if they know the role ARN.
variable "environment" {
type = string
default = "dev"
}
locals {
# Account IDs by environment — single source of truth in locals
account_ids = {
dev = "111111111111"
staging = "222222222222"
prod = "333333333333"
}
target_account_id = local.account_ids[var.environment] # Select by environment
}
provider "aws" {
region = "us-east-1"
# Assume the deployment role in the target account
assume_role {
role_arn = "arn:aws:iam::${local.target_account_id}:role/terraform-deployment-role"
# Session name appears in CloudTrail — meaningful for audit trails
session_name = "terraform-${var.environment}-deployment"
# External ID — prevents confused deputy attacks
external_id = "terraform-deployment-${var.environment}"
# Optional: extend session for large deployments that may exceed 1 hour
duration = "2h"
}
default_tags {
tags = {
Environment = var.environment
ManagedBy = "Terraform"
}
}
}
# Usage: terraform apply -var="environment=prod"
# Provider assumes the prod account role automatically — no separate credentials needed
$ terraform apply -var="environment=prod"
# Provider assumes the prod role before any API call
# CloudTrail in the prod account shows:
#   eventName:   AssumeRole
#   sessionName: terraform-prod-deployment
#   roleArn:     arn:aws:iam::333333333333:role/terraform-deployment-role
# All subsequent API calls carry the assumed role session identity
# The CI/CD runner's own credentials never appear in prod CloudTrail
# A compromise of the dev pipeline credentials has zero prod access

aws_s3_bucket.app: Creating... [in account 333333333333 — prod]
aws_s3_bucket.app: Creation complete

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
What just happened?
- The CI/CD runner's identity was never used directly in the prod account. The runner authenticated with the tools account, assumed terraform-deployment-role in prod, and all API calls used the assumed role session. The prod role's trust policy trusts the tools account — prod IAM never needs to trust individual runner credentials.
- session_name creates an audit trail. Every CloudTrail entry in prod shows terraform-prod-deployment as the caller. When something changes in prod unexpectedly, the audit trail identifies exactly which Terraform run made it — not just "some CI/CD job."
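For the assume_role call to succeed, each target account must define the role with a trust policy that names the tools account and requires the external ID. A minimal sketch — the tools account ID 999999999999 is an assumption, and this would live in each target account's own configuration:

```terraform
# Defined in each TARGET account (dev/staging/prod) — hypothetical tools account ID
resource "aws_iam_role" "terraform_deployment" {
  name = "terraform-deployment-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = "arn:aws:iam::999999999999:root" } # Trust the tools account
      Action    = "sts:AssumeRole"
      Condition = {
        # Must match the external_id the provider sends for this environment
        StringEquals = { "sts:ExternalId" = "terraform-deployment-prod" }
      }
    }]
  })

  max_session_duration = 7200 # Allow the 2h sessions the provider requests
}
```

Note that max_session_duration must be at least as long as the duration requested in the provider's assume_role block, or the AssumeRole call fails.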
Multi-Region with Provider Aliases
A single Terraform configuration can deploy to multiple AWS regions simultaneously using provider aliases. Each alias is a separate provider instance configured for a different region. Resources specify which provider instance to use with the provider meta-argument.
New terms:
- provider alias — a named instance of a provider with different configuration. Declared with alias = "NAME" in the provider block. Resources without a provider argument use the default unaliased provider.
- provider meta-argument — the provider argument on a resource block. Syntax: provider = aws.ALIAS_NAME. Specifies which provider instance manages this resource.
# Default provider — us-east-1 (primary region)
provider "aws" {
region = "us-east-1"
}
# Aliased provider — eu-west-1 (disaster recovery region)
provider "aws" {
alias = "eu_west_1" # Name this provider instance
region = "eu-west-1"
}
# Primary S3 bucket — uses default provider (us-east-1), no provider argument needed
resource "aws_s3_bucket" "primary" {
bucket = "acme-app-data-primary-123456789012"
}
# DR S3 bucket — must use the aliased provider to deploy to eu-west-1
resource "aws_s3_bucket" "dr" {
bucket = "acme-app-data-dr-123456789012"
provider = aws.eu_west_1 # Deploys to eu-west-1 via the aliased provider
}
# ACM certificate for CloudFront — MUST be in us-east-1 regardless of where CloudFront serves
# This is an AWS requirement — CloudFront reads certificates only from us-east-1
resource "aws_acm_certificate" "cloudfront" {
domain_name = "app.acme.com"
validation_method = "DNS"
# No provider argument — default us-east-1 is correct for CloudFront certificates
}
# ACM certificate for EU load balancer — must be in same region as the ALB
resource "aws_acm_certificate" "eu_alb" {
domain_name = "eu.app.acme.com"
validation_method = "DNS"
provider = aws.eu_west_1 # Certificate must be in same region as ALB
}
default_tags and tags_all
default_tags in the provider block applies tags to every AWS resource the provider manages — without any merge() call in individual resource blocks. Understanding how default tags interact with resource-level tags prevents surprises.
provider "aws" {
region = "us-east-1"
default_tags {
tags = {
ManagedBy = "Terraform" # Applied to every resource this provider manages
Project = var.project
Environment = var.environment
Team = "platform"
}
}
}
# Resource only needs its own specific tags — default_tags adds the rest automatically
resource "aws_instance" "web" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t2.micro"
tags = {
Name = "web-server-${var.environment}" # Resource-specific
Component = "web-tier" # Resource-specific
# ManagedBy, Project, Environment, Team — added automatically by default_tags
}
# Final tags on the instance after merge:
# Name = "web-server-dev" ← from resource tags
# Component = "web-tier" ← from resource tags
# ManagedBy = "Terraform" ← from default_tags
# Project = "acme" ← from default_tags
# Environment = "dev" ← from default_tags
# Team = "platform" ← from default_tags
}
# tags_all — the complete merged set including default_tags
# Use this in outputs and references that need the full tag set
output "instance_tags" {
# tags shows only resource-level tags — NOT the default_tags
# tags_all shows the complete merged set — use this for compliance checks
value = aws_instance.web.tags_all
}
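When a resource tag uses the same key as a default tag, the resource-level value wins in the merged result. A sketch — the bucket name is a hypothetical example, and note that older AWS provider releases reported perpetual plan diffs on duplicated keys, a rough edge that recent v5 releases resolved:

```terraform
provider "aws" {
  region = "us-east-1"
  default_tags {
    tags = {
      Team = "platform" # Provider-wide default
    }
  }
}

resource "aws_s3_bucket" "analytics" {
  bucket = "acme-analytics-example" # Hypothetical bucket name
  tags = {
    Team = "data-eng" # Overrides the default — tags_all shows Team = "data-eng"
  }
}
```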
Key AWS Resource Patterns
The most common AWS resource patterns that appear in almost every real configuration — with the subtle decisions that prevent the bugs teams spend hours debugging.
# ── IAM ROLE + POLICY ATTACHMENT + INSTANCE PROFILE ─────────────────────────
resource "aws_iam_role" "app" {
name = "app-role-${var.environment}"
# jsonencode() is cleaner than heredoc JSON — Terraform formats it consistently
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Principal = { Service = "ec2.amazonaws.com" }
Action = "sts:AssumeRole"
}]
})
}
# Attach AWS managed policies by ARN — for permissions that AWS maintains
resource "aws_iam_role_policy_attachment" "ssm" {
role = aws_iam_role.app.name # Implicit dependency on the role
policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}
# Inline policy for resource-specific custom permissions
resource "aws_iam_role_policy" "s3_access" {
name = "app-s3-access"
role = aws_iam_role.app.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = ["s3:GetObject", "s3:PutObject"]
Resource = "${aws_s3_bucket.app_data.arn}/*" # Cross-resource reference in ARN
}]
})
}
# Instance profile — required to attach the role to an EC2 instance
resource "aws_iam_instance_profile" "app" {
name = "app-profile-${var.environment}"
role = aws_iam_role.app.name # Implicit dependency on the role
}
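The instance profile is then attached to an EC2 instance through iam_instance_profile — which takes the profile's name, not the role's. A sketch reusing the example AMI from earlier in the lesson:

```terraform
resource "aws_instance" "app" {
  ami                  = "ami-0c55b159cbfafe1f0" # Same example AMI as the tagging section
  instance_type        = "t3.micro"
  iam_instance_profile = aws_iam_instance_profile.app.name # Profile name, not the role name

  tags = {
    Name = "app-server-${var.environment}"
  }
}
```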
# ── SECURITY GROUPS — THE INLINE VS SEPARATE RULE CHOICE ──────────────────────
# Use EITHER inline ingress/egress blocks OR separate aws_security_group_rule resources
# Mixing both on the same security group causes constant rule conflicts
# Option A: Inline rules — simpler for static, self-contained rule sets
resource "aws_security_group" "web_inline" {
name = "web-sg-${var.environment}"
vpc_id = aws_vpc.main.id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # HTTPS from anywhere
description = "HTTPS inbound"
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
description = "All outbound"
}
lifecycle {
create_before_destroy = true # Zero-downtime SG replacement
}
}
# Option B: Separate rule resources — required when rules reference other SGs
resource "aws_security_group" "app" {
name = "app-sg-${var.environment}"
vpc_id = aws_vpc.main.id
# No inline ingress/egress — rules are managed as separate resources below
}
resource "aws_security_group_rule" "app_from_web" {
type = "ingress"
from_port = 8080
to_port = 8080
protocol = "tcp"
security_group_id = aws_security_group.app.id # The SG this rule belongs to
source_security_group_id = aws_security_group.web_inline.id # Source: the web SG
description = "Allow traffic from web tier on 8080"
}
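AWS provider v5 also added aws_vpc_security_group_ingress_rule and aws_vpc_security_group_egress_rule, which give every rule its own AWS-side ID and sidestep the aggregate-rule conflict entirely. A sketch of Option B rewritten with the newer resources:

```terraform
resource "aws_vpc_security_group_ingress_rule" "app_from_web" {
  security_group_id            = aws_security_group.app.id
  referenced_security_group_id = aws_security_group.web_inline.id # Source: the web SG
  from_port                    = 8080
  to_port                      = 8080
  ip_protocol                  = "tcp"
  description                  = "Allow traffic from web tier on 8080"
}

resource "aws_vpc_security_group_egress_rule" "app_all_out" {
  security_group_id = aws_security_group.app.id
  cidr_ipv4         = "0.0.0.0/0"
  ip_protocol       = "-1" # All protocols — from_port/to_port omitted
}
```

The same mixing rule applies: on any one security group, use inline blocks, aws_security_group_rule, or the newer per-rule resources — never a combination.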
# ── KMS KEY WITH KEY POLICY ──────────────────────────────────────────────────
resource "aws_kms_key" "app" {
description = "App encryption key — ${var.environment}"
deletion_window_in_days = 10 # 7-30 day window before permanent deletion
enable_key_rotation = true # Rotate key material annually — security best practice
# Key policy controls who can use and administer the key
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "Enable root account full access"
Effect = "Allow"
# Root account access is required — without it, you can lock yourself out of the key
Principal = { AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root" }
Action = "kms:*"
Resource = "*" # Resource * in a key policy means this key itself
},
{
Sid = "Allow app role to use the key for encryption and decryption"
Effect = "Allow"
Principal = { AWS = aws_iam_role.app.arn }
Action = ["kms:Decrypt", "kms:GenerateDataKey", "kms:DescribeKey"]
Resource = "*"
}
]
})
}
resource "aws_kms_alias" "app" {
name = "alias/app-${var.environment}" # Human-readable alias for the key
target_key_id = aws_kms_key.app.key_id # Link alias to the key
}
data "aws_caller_identity" "current" {} # Current account ID — used in key policy ARN
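A typical consumer of the key is S3 default encryption. A sketch, assuming the aws_s3_bucket.app_data resource referenced in the IAM policy above exists:

```terraform
resource "aws_s3_bucket_server_side_encryption_configuration" "app_data" {
  bucket = aws_s3_bucket.app_data.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.app.arn # Reference the key's ARN directly
    }
  }
}
```

The app role from the key policy can now read and write encrypted objects — the kms:Decrypt and kms:GenerateDataKey grants are exactly what S3 needs on its behalf.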
What just happened?
- Security group conflict prevention. The choice between inline rules and separate aws_security_group_rule resources is not aesthetic — it is functional. Mixing both on the same security group causes Terraform and AWS to fight on every apply. Inline rules are correct for self-contained rule sets. Separate rule resources are required when rules reference other security groups as sources.
- KMS key policy must include root account access. Without the root account statement in the key policy, the only way to administer the key is through the key policy itself. If you create a key whose policy grants no administration access, the key becomes unmanageable — recovering it requires contacting AWS Support. Always include the root account statement.
- tags vs tags_all. The tags attribute on a resource shows only the tags set directly on that resource block. tags_all shows the complete merged set including default_tags. Compliance checks and outputs that need the full tag set must reference tags_all.
ARN Patterns and Cross-Resource References
# ARN format: arn:partition:service:region:account-id:resource
# S3: arn:aws:s3:::my-bucket (no region, no account — globally namespaced)
# EC2: arn:aws:ec2:us-east-1:123:instance/i-0abc
# IAM: arn:aws:iam::123:role/my-role (no region — IAM is global)
# KMS: arn:aws:kms:us-east-1:123:key/abc-def
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}
locals {
account_id = data.aws_caller_identity.current.account_id
region = data.aws_region.current.name
# Construct resource ARNs from components
s3_arn = "arn:aws:s3:::${aws_s3_bucket.app.bucket}"
s3_objects_arn = "arn:aws:s3:::${aws_s3_bucket.app.bucket}/*"
}
# Cross-resource IAM policy — most common ARN reference pattern
resource "aws_iam_role_policy" "full_example" {
name = "full-access-policy"
role = aws_iam_role.app.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = ["s3:ListBucket"]
Resource = aws_s3_bucket.app.arn # Bucket ARN — for ListBucket action
},
{
Effect = "Allow"
Action = ["s3:GetObject", "s3:PutObject"]
Resource = "${aws_s3_bucket.app.arn}/*" # Objects ARN — for object actions
},
{
Effect = "Allow"
Action = ["kms:Decrypt", "kms:GenerateDataKey"]
Resource = aws_kms_key.app.arn # KMS key ARN for decryption
}
]
})
}
# Data source for existing resources — look up ARNs without hardcoding
data "aws_iam_policy" "ssm_managed" {
name = "AmazonSSMManagedInstanceCore" # Look up by name — gets the full ARN
}
resource "aws_iam_role_policy_attachment" "ssm_existing" {
role = aws_iam_role.app.name
policy_arn = data.aws_iam_policy.ssm_managed.arn # ARN from data source — not hardcoded
}
Common AWS Provider Mistakes
Mixing inline and separate security group rules
Putting ingress blocks inside aws_security_group and also creating aws_security_group_rule resources for the same SG causes constant rule conflicts — each apply has Terraform and AWS fighting over the same rules. Choose one approach per security group and never mix them.
Referencing tags instead of tags_all in compliance checks
When default_tags is set on the provider, the resource's tags attribute only contains tags set directly on the resource block. Policy checks, OPA rules, and outputs that check for required tags must reference tags_all — otherwise the default tags are invisible to the check and it incorrectly reports missing tags.
Not specifying region in the provider block
If the provider block has no region and AWS_REGION / AWS_DEFAULT_REGION is not set, the provider falls back to the region configured for the active profile in ~/.aws/config — or fails with a missing-region error if there is none. Either way, the region actually used may not be the one you intended, and deploying to the wrong region is a subtle production incident. Always specify the region explicitly in the provider block — either as a literal or a variable. Never rely on environment variables being set correctly in all contexts.
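One way to make the region both explicit and guarded is a variable with no default plus a validation block — a sketch, where the allowed-region list is an assumption:

```terraform
variable "region" {
  type        = string
  description = "AWS region to deploy into — no default, must be set explicitly"

  validation {
    condition     = contains(["us-east-1", "eu-west-1"], var.region)
    error_message = "Region must be one of the approved deployment regions."
  }
}

provider "aws" {
  region = var.region # Explicit — never inherited from the environment
}
```

With no default, terraform plan fails fast if the region is not supplied, instead of quietly picking one up from the environment.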
Read the AWS provider CHANGELOG before major version upgrades
The AWS provider is one of the fastest-moving Terraform providers. The v4 to v5 upgrade changed how S3 bucket configuration is split across sub-resources, renamed several arguments, and changed defaults. Running terraform init -upgrade without reading the CHANGELOG first has caused production outages. Always check github.com/hashicorp/terraform-provider-aws/blob/main/CHANGELOG.md before any major provider upgrade.
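Pinning the provider with a version constraint keeps terraform init -upgrade from silently jumping a major version:

```terraform
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # Any 5.x release — moving to 6.0 requires an explicit edit
    }
  }
}
```

With this constraint in place, a major upgrade becomes a deliberate, reviewable change to the constraint itself — the moment you read the CHANGELOG.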
Practice Questions
1. You have a default aws provider and one with alias = "eu_west_1". What argument on a resource block makes it deploy to eu-west-1?
2. When default_tags is configured on the provider, which resource attribute contains the complete merged tag set including default_tags?
3. In an assume_role block, which argument creates a meaningful audit trail in CloudTrail showing which Terraform run made changes?
Quiz
1. What is the correct way to provide AWS credentials to Terraform in a CI/CD pipeline?
2. A CI/CD runner in a central tools account needs to deploy to dev, staging, and prod AWS accounts. How should Terraform be configured?
3. Why must you not mix inline ingress/egress blocks with aws_security_group_rule resources on the same security group?
Up Next · Lesson 32
Terraform with Azure
AWS mastered. Lesson 32 moves to the Azure provider — the azurerm authentication model, resource groups as a first-class concept, service principals and managed identities, and the key resource patterns that differ most from AWS.