Terraform Lesson 4 – Terraform Architecture | Dataplexa

Section I · Lesson 4

Terraform Architecture

You have been running Terraform commands without knowing what is actually happening inside. This lesson opens the black box — and once you see the architecture, every Terraform behaviour you will ever encounter makes immediate sense.

This lesson covers

The three core components → How the dependency graph works → What the state file actually contains → How providers fit into the architecture

The Three Core Components

Terraform is not one monolithic program. It is three distinct components working together. Understanding each one separately is what makes the whole system click.

Component	What it is	Where it lives
Terraform Core	The CLI binary — reads config, builds the graph, runs the plan	Your local machine or CI server
Providers	Plugins that translate HCL into real API calls	.terraform/ folder, downloaded on init
State	A JSON record of every resource Terraform manages	Local file or remote backend (S3, GCS, etc.)

Terraform Core

Terraform Core is the binary you install — the terraform command itself. It is written in Go and distributed as a single compiled executable. It has no dependencies. No runtime, no virtual machine, no framework to install.

When you run terraform plan or terraform apply, Core does four things in sequence: it reads all your .tf files into memory, reads the current state file, builds a dependency graph of all resources, and calculates the diff between desired state and current state. Everything else, including every actual API call, is delegated to providers.

Providers in Detail

Providers are separate plugin executables, usually written in Go, that Terraform Core launches and talks to over a local RPC (gRPC) connection. When you run terraform init, Core downloads the provider binaries from the Terraform Registry and stores them in the hidden .terraform/providers/ directory.

Each provider exposes two things to Core: resource types (things you can create, like aws_instance) and data sources (things you can read, like looking up an existing AMI ID). Core knows nothing about AWS or Azure directly — it only knows the interface. The provider knows everything about the platform.
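The difference between the two is easiest to see side by side. The sketch below is illustrative only: the resource names, the AMI name pattern, and the instance type are assumptions, not values from this course.

```hcl
# Data source: reads something that already exists. Nothing is created.
# The name filter is an assumed pattern for Amazon Linux 2023 images.
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }
}

# Resource type: something Terraform creates and then manages.
resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id # value looked up by the data source
  instance_type = "t3.micro"
}
```

Both blocks are served by the same AWS provider. Core only sees that one block is read-only and the other is managed.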

Provider versions are pinned in a lock file — .terraform.lock.hcl — so your team always uses the same provider version regardless of when they run terraform init. This lock file belongs in Git.

New terms:

required_providers block: declares which providers your configuration needs and which versions are acceptable. Without a version constraint, terraform init selects the newest available release of each provider whenever no lock file is present, which can break a configuration when a provider ships breaking changes.
source: the registry address of the provider in the format namespace/provider-name. HashiCorp-maintained providers use the hashicorp namespace, as in hashicorp/aws. Providers published by other organizations or individuals use the publisher's namespace, as in cloudflare/cloudflare.
version constraint: a rule for which provider versions are acceptable. ~> 5.0 means any version from 5.0 up to but not including 6.0. This protects you from major breaking changes while still receiving minor and patch updates.
.terraform.lock.hcl: the provider lock file. Records the exact version and checksums of every provider downloaded. Commit this to Git so all teammates and CI pipelines use identical provider versions.
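The ~> operator is easy to misread, so here is a short sketch of how a few constraint styles compare. The specific version numbers are illustrative assumptions. Only one version line would be active in a real file.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # >= 5.0 and < 6.0: minor and patch updates allowed

      # Alternatives (use one, not all):
      # version = "~> 5.31.0"       # >= 5.31.0 and < 5.32.0: patch updates only
      # version = ">= 5.0, < 5.40"  # explicit range
      # version = "= 5.31.0"        # exactly one version
    }
  }
}
```

The more digits you give ~>, the narrower the range: only the rightmost component is allowed to increase.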

# versions.tf — Declare providers and acceptable version ranges

terraform {
  required_version = ">= 1.5.0"   # Minimum Terraform CLI version

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"          # Any 5.x version, not 6.0+
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.0"          # Published by Cloudflare, not HashiCorp; note the different namespace
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

provider "azurerm" {
  features {}
}

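# Declared here so the reference below resolves; supply the value
# through a .tfvars file or the TF_VAR_cloudflare_api_token variable.
variable "cloudflare_api_token" {
  type      = string
  sensitive = true   # Redacts the token in plan and apply output
}

provider "cloudflare" {
  api_token = var.cloudflare_api_token
}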

$ terraform init

Initializing the backend...
Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 5.0"...
- Finding hashicorp/azurerm versions matching "~> 3.0"...
- Finding cloudflare/cloudflare versions matching "~> 4.0"...
- Installing hashicorp/aws v5.31.0...
- Installing hashicorp/azurerm v3.85.0...
- Installing cloudflare/cloudflare v4.20.0...

Terraform has created a lock file .terraform.lock.hcl to record the provider
selections made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

What just happened?

Three providers were downloaded in one init. AWS, Azure, and Cloudflare — three completely different platforms — all initialised with a single command. From this point, your configuration can manage resources on all three simultaneously.
Exact versions were selected and locked. Terraform found the newest version within each constraint and locked it in .terraform.lock.hcl. Anyone on your team who runs terraform init now gets the exact same provider versions — v5.31.0, v3.85.0, v4.20.0 — regardless of when they run it.
The .terraform/ folder was created. This folder contains the downloaded provider binaries. It should be listed in .gitignore, since terraform init recreates it on demand. Only the lock file goes into Git, not the binaries themselves.
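For orientation, an entry in .terraform.lock.hcl has roughly the shape sketched below. This is an abbreviated illustration, not real output: the version is taken from the init run above, and the checksum entries are deliberately elided. Terraform generates this file, so you never write it by hand.

```hcl
# This file is maintained automatically by "terraform init".

provider "registry.terraform.io/hashicorp/aws" {
  version     = "5.31.0"  # exact version that was selected
  constraints = "~> 5.0"  # constraint from required_providers
  hashes = [
    # checksum entries ("h1:..." and "zh:..." strings) elided here
  ]
}
```

On later runs, terraform init reads the version field and installs exactly that release, verifying the download against the recorded hashes.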

The Dependency Graph

When Terraform reads your configuration, it builds a directed acyclic graph — a DAG — of all resources and their dependencies. This graph determines two critical things: the order resources are created, and which resources can be created in parallel.

Dependencies are usually implicit. When one resource references an attribute of another (for example, an aws_instance whose subnet_id is set to aws_subnet.main.id), Terraform sees the reference and knows the subnet must exist before the instance. You rarely need to specify order by hand; the depends_on argument exists for the cases where a dependency is not visible in any reference. You write references and Terraform works out the graph.

Consider a VPC, two subnets inside it, and an EC2 instance that depends on both subnets. The VPC is created first. Both subnets are created in parallel since neither depends on the other. The EC2 instance waits for both.
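One configuration that would produce this graph is sketched below. All names, CIDR ranges, and the ami_id variable are hypothetical. The depends_on line is there only so the instance waits for the second subnet too.

```hcl
# Hypothetical input: the AMI to launch, supplied by the caller.
variable "ami_id" {
  type = string
}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# Both subnets reference aws_vpc.main.id, so each depends on the VPC
# but not on the other. Terraform is free to create them in parallel.
resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
}

resource "aws_subnet" "private" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.2.0/24"
}

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.public.id # implicit dependency via reference

  depends_on = [aws_subnet.private] # explicit dependency, no reference needed
}
```

Running terraform graph in a directory containing this file prints the resulting graph in DOT format, which is a handy way to check that Terraform inferred the edges you expected.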

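This is why Terraform is usually faster than a script that makes API calls one after another. Resources with no dependencies between them run concurrently, up to a limit of 10 operations at a time by default (adjustable with the -parallelism flag on plan and apply). An infrastructure with 50 resources can therefore take closer to the duration of its longest dependency chain than to 50 sequential API calls.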

The State File in Detail

The state file is a JSON document that records the last known state of every resource Terraform manages, including every attribute, every ID, and every dependency relationship. It is Terraform's record of what it believes exists in the real world, and the baseline every plan is compared against.

Here is a simplified excerpt of what a state file entry looks like for the EC2 instance from Lesson 1:

{
  "version": 4,
  "terraform_version": "1.6.0",
  "resources": [
    {
      "mode": "managed",
      "type": "aws_instance",
      "name": "my_server",
      "provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
      "instances": [
        {
          "schema_version": 1,
          "attributes": {
            "id":            "i-0abc123def456789",
            "ami":           "ami-0c55b159cbfafe1f0",
            "instance_type": "t2.micro",
            "public_ip":     "54.211.89.132",
            "private_ip":    "172.31.45.67",
            "tags": {
              "Environment": "dev",
              "ManagedBy":   "Terraform",
              "Name":        "MyFirstServer"
            }
          }
        }
      ]
    }
  ]
}

Why this matters

Every attribute is stored — not just what you wrote. The state file contains public_ip, private_ip, and dozens of other attributes that AWS assigned after creation. These were not in your configuration — Terraform fetched and stored them so you can reference them in other resources.
The resource ID is the anchor. The id field — i-0abc123def456789 — is how Terraform identifies this resource on every future operation. When you run terraform plan again, Terraform uses this ID to query AWS for the current state of this specific instance.
Never edit this file manually. The state file is managed exclusively by Terraform. Manual edits almost always corrupt it. If you need to manipulate state, use terraform state commands — covered in Section II.
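Because computed attributes such as public_ip are recorded in state, the rest of your configuration can refer to them by name. A minimal sketch, assuming the aws_instance.my_server resource from the excerpt above; the output name is hypothetical.

```hcl
# Exposes an attribute that AWS assigned after creation.
# It never appeared in the configuration; Terraform recorded it in state.
output "server_public_ip" {
  description = "Public IP that AWS assigned to the instance."
  value       = aws_instance.my_server.public_ip
}

# To inspect stored attributes without opening the state file, run:
#   terraform state list
#   terraform state show aws_instance.my_server
```

After an apply, terraform output server_public_ip prints the stored value.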

Common Mistakes

Not committing the lock file to Git

The .terraform.lock.hcl file belongs in your repository. Without it, different teammates and your CI pipeline may silently use different provider versions. A configuration that works locally with AWS provider v5.31.0 may behave differently in CI running v5.40.0. Commit the lock file. Always.

Committing the .terraform/ folder to Git

The .terraform/ folder contains downloaded provider binaries — sometimes hundreds of megabytes. It is regenerated by terraform init and must never go into Git. Add it to your .gitignore from the very start of every project.

Using an unpinned provider version in production

Without a version constraint in a required_providers block, terraform init selects the newest available provider release whenever no lock file is present. A new major provider release with breaking changes can then break your configuration the next time someone runs init on a clean checkout without the lock file, or runs terraform init -upgrade. Always pin provider versions in production configurations.

The .gitignore every Terraform project needs

Three entries, no exceptions: .terraform/ to exclude provider binaries, *.tfstate to exclude local state files from accidentally being committed, and *.tfstate.backup to exclude state backups. The lock file — .terraform.lock.hcl — is the one Terraform file that must always be committed.

Practice Questions

1. What is the name of the main Terraform binary that reads configuration, builds the dependency graph, and calculates the execution plan?

2. What is the name of the file that locks provider versions so all team members use the same provider on every init?

3. What does Terraform build from your configuration to determine the correct order and parallelism for resource creation?

Up Next · Lesson 5

Terraform Workflow

You know the four commands. Lesson 5 shows you what a real team's workflow looks like when those commands are wired into a proper process — and where most teams get it wrong.
