Terraform Lesson 33 – Terraform with GCP | Dataplexa
Section III · Lesson 33

Terraform with GCP

Google Cloud introduces a third resource model — project-based scoping, service accounts for identity, IAM bindings as first-class resources, and a dual-provider setup where new features land in google-beta before graduating to google. This lesson covers GCP from authentication through the resource patterns you will use in every real GCP deployment.

This lesson covers

How GCP differs from AWS and Azure → GCP provider authentication → Enabling APIs — the required first step → GCP networking — global VPCs → GCP IAM bindings vs members → Key GCP resource patterns → The google-beta provider

How GCP Differs from AWS and Azure

Concept         | AWS              | Azure                                | GCP
----------------|------------------|--------------------------------------|----------------------------------
Resource scope  | Account          | Subscription + Resource Group        | Project
CI/CD identity  | IAM Role         | Service Principal / Managed Identity | Service Account
Private network | VPC (regional)   | VNet (regional)                      | VPC (global — spans all regions)
API enablement  | Always available | Always available                     | Must explicitly enable per project
Beta features   | Single provider  | Single provider                      | Separate google-beta provider

GCP Provider Authentication

GCP authentication centres on service accounts — identities created for applications and automation. Unlike AWS IAM roles, which are assumed temporarily, GCP service accounts are long-lived identities. The recommended approach for CI/CD is Workload Identity Federation — keyless, with no credentials to rotate.

New terms:

  • Service Account — a GCP identity for non-human access. Represented as an email: NAME@PROJECT_ID.iam.gserviceaccount.com. Can have JSON key files (for external use) or be impersonated via Workload Identity.
  • Application Default Credentials (ADC) — GCP's credential discovery mechanism. Checks environment variables, well-known file locations, and metadata server in order. gcloud auth application-default login sets up ADC for local development.
  • Workload Identity Federation — allows external identities (GitHub Actions, GitLab CI) to impersonate a GCP service account without any long-lived key files. The recommended approach for CI/CD — keyless, automatic rotation, no credentials to store.
  • GOOGLE_APPLICATION_CREDENTIALS — environment variable pointing to a service account key JSON file. Simpler than Workload Identity but requires managing and rotating the key file.
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"  # GA features — use for all production resources
    }
    google-beta = {
      source  = "hashicorp/google-beta"
      version = "~> 5.0"  # Beta features — pin to same version as google provider
    }
  }
}

# ── Method 1: Application Default Credentials — local development ─────────────
provider "google" {
  project = var.project_id  # Required — all resources go in this project by default
  region  = var.region      # Default region — can be overridden per resource
  zone    = var.zone        # Default zone — can be overridden per resource
  # No credentials — uses ADC from `gcloud auth application-default login`
}

# ── Method 2: Service account key file ───────────────────────────────────────
provider "google" {
  project     = var.project_id
  region      = var.region
  credentials = file("${path.module}/sa-key.json")  # JSON key file path
  # Better: set GOOGLE_APPLICATION_CREDENTIALS environment variable instead
  # Then omit credentials from the provider block entirely
}

# ── Method 3: Workload Identity — GitHub Actions, no key files ───────────────
# Configure in GitHub Actions before running Terraform:
# - uses: google-github-actions/auth@v2
#   with:
#     workload_identity_provider: projects/PROJECT_NUM/locations/global/workloadIdentityPools/POOL/providers/PROVIDER
#     service_account: terraform@PROJECT_ID.iam.gserviceaccount.com
provider "google" {
  project = var.project_id
  region  = var.region
  # No credentials — Workload Identity exchanges the GitHub OIDC token automatically
}
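
The GCP side of Workload Identity Federation can itself be managed with Terraform. A minimal sketch — the pool and provider IDs, the ORG/REPO placeholder, and the google_service_account.terraform reference are illustrative, not names used elsewhere in this lesson:

```hcl
# Identity pool that external identities federate into
resource "google_iam_workload_identity_pool" "github" {
  project                   = var.project_id
  workload_identity_pool_id = "github-pool"
}

# OIDC provider trusting tokens issued by GitHub Actions
resource "google_iam_workload_identity_pool_provider" "github" {
  project                            = var.project_id
  workload_identity_pool_id          = google_iam_workload_identity_pool.github.workload_identity_pool_id
  workload_identity_pool_provider_id = "github"

  attribute_mapping = {
    "google.subject"       = "assertion.sub"
    "attribute.repository" = "assertion.repository"
  }
  attribute_condition = "assertion.repository == \"ORG/REPO\""  # Trust only this repo

  oidc {
    issuer_uri = "https://token.actions.githubusercontent.com"
  }
}

# Allow tokens from that repository to impersonate the Terraform service account
resource "google_service_account_iam_member" "github_wif" {
  service_account_id = google_service_account.terraform.name  # Assumed to exist elsewhere
  role               = "roles/iam.workloadIdentityUser"
  member             = "principalSet://iam.googleapis.com/${google_iam_workload_identity_pool.github.name}/attribute.repository/ORG/REPO"
}
```

The `workload_identity_provider` value in the GitHub Actions step then comes from the `name` attribute of the pool provider resource.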

Enabling APIs — The Required First Step

GCP APIs must be explicitly enabled for each project before Terraform can create resources of that type. Forgetting this causes cryptic "API not enabled" errors mid-apply. The google_project_service resource enables APIs, and all resources of that type should depend on it.

New terms:

  • google_project_service — enables a GCP API for a project. Each API must be enabled before Terraform can create resources that use it. AWS has no equivalent — its services are always available. Azure's closest analogue is resource provider registration, which the azurerm provider registers automatically by default.
  • disable_on_destroy = false — prevents disabling the API when the Terraform resource is destroyed. Without this, deleting the google_project_service resource would disable the API — destroying all GCP resources in the project that depend on it. Always set this to false.
variable "project_id" {
  description = "GCP project ID — all resources are created in this project"
  type        = string
}

variable "region" {
  type    = string
  default = "us-central1"
}

variable "zone" {
  type    = string
  default = "us-central1-a"
}

variable "environment" {
  description = "Environment name used in resource names — e.g. dev, staging, prod"
  type        = string
  default     = "dev"
}

# Enable APIs before creating any resources that depend on them
# disable_on_destroy = false prevents catastrophic cascade deletion

resource "google_project_service" "compute" {
  project            = var.project_id
  service            = "compute.googleapis.com"  # Compute Engine: VMs, VPCs, load balancers
  disable_on_destroy = false  # Never disable on destroy — it would delete all Compute resources
}

resource "google_project_service" "container" {
  project            = var.project_id
  service            = "container.googleapis.com"  # GKE — Kubernetes clusters
  disable_on_destroy = false
}

resource "google_project_service" "sqladmin" {
  project            = var.project_id
  service            = "sqladmin.googleapis.com"  # Cloud SQL — managed databases
  disable_on_destroy = false
}

resource "google_project_service" "secretmanager" {
  project            = var.project_id
  service            = "secretmanager.googleapis.com"
  disable_on_destroy = false
}

resource "google_project_service" "iam" {
  project            = var.project_id
  service            = "iam.googleapis.com"  # Required to create service accounts
  disable_on_destroy = false
}

GCP Networking — Global VPCs

GCP VPCs are global — a single VPC spans all regions automatically. Subnets are regional and live inside the global VPC. This means one VPC can have subnets in multiple regions without any peering configuration — a fundamental difference from AWS and Azure where a VPC or VNet is regional.

# GCP VPC — global, not regional (unlike AWS VPCs which are regional)
resource "google_compute_network" "main" {
  name                    = "vpc-${var.project_id}"
  project                 = var.project_id
  auto_create_subnetworks = false  # Custom mode — create subnets explicitly
  # auto_create_subnetworks = true creates a subnet in every region automatically
  # Avoid in production — creates subnets you never intended to use

  depends_on = [google_project_service.compute]  # Compute API must be enabled first
}

# Subnet — regional, inside the global VPC
resource "google_compute_subnetwork" "app" {
  name          = "subnet-app-${var.region}"
  project       = var.project_id
  region        = var.region                       # Subnet is region-specific
  network       = google_compute_network.main.id   # Belongs to the global VPC
  ip_cidr_range = "10.0.1.0/24"

  # Private Google Access — allows VMs without public IPs to reach Google APIs
  private_ip_google_access = true  # Enable on all subnets — best practice

  # Secondary IP ranges — required for GKE pods and services
  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = "192.168.1.0/24"  # GKE pod IP range
  }

  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = "192.168.2.0/24"  # GKE service IP range
  }
}

# Firewall rules — attached to the VPC network and applied via network tags,
# not attached to individual instances like AWS security groups
resource "google_compute_firewall" "allow_https" {
  name    = "allow-https"
  project = var.project_id
  network = google_compute_network.main.name  # Apply to this VPC

  allow {
    protocol = "tcp"
    ports    = ["443"]
  }

  # Target: instances tagged with "web-server" — GCP uses network tags for targeting
  target_tags   = ["web-server"]  # Only instances with this tag are affected
  source_ranges = ["0.0.0.0/0"]  # Restrict in production

  direction = "INGRESS"
  priority  = 1000  # Lower number = higher priority
}

GCP IAM — Bindings vs Members

GCP IAM works differently from AWS and Azure. Instead of policy documents or role assignments, GCP uses bindings — a combination of who (member), what permissions (role), and on what (resource). There are three resource types for GCP IAM, each with critically different behaviour.

New terms:

  • iam_member — adds a single member-role pair without affecting others. Non-authoritative — Terraform does not remove anyone else from the role. Use when Terraform adds access without owning the full role membership.
  • iam_binding — sets all members for a given role. Authoritative for that role — removes any members for that role not in the list. Use only when Terraform owns all access for a specific role on a specific resource.
  • iam_policy — sets the entire IAM policy. Authoritative for all roles — overwrites everything. Dangerous — only for new resources where Terraform controls all access.
  • GCP member format — always prefixed with identity type: serviceAccount:email, user:email, group:email.
# Service account for the application workload
resource "google_service_account" "app" {
  account_id   = "sa-app-${var.environment}"  # Becomes part of the email address
  display_name = "Application Service Account"
  project      = var.project_id
  description  = "Service account for the app-${var.environment} workload"

  depends_on = [google_project_service.iam]  # IAM API must be enabled first
}

# iam_member — additive, does not remove existing members for this role
resource "google_storage_bucket_iam_member" "app_storage" {
  bucket = google_storage_bucket.app_data.name
  role   = "roles/storage.objectAdmin"  # Predefined role: full CRUD on objects

  # member format: serviceAccount:EMAIL — prefix identifies the identity type
  member = "serviceAccount:${google_service_account.app.email}"
}

# Secret referenced by the next block — minimal definition so the example is complete
resource "google_secret_manager_secret" "db_password" {
  secret_id = "db-password-${var.environment}"
  project   = var.project_id

  replication {
    auto {}
  }

  depends_on = [google_project_service.secretmanager]
}

# iam_member for Secret Manager access
resource "google_secret_manager_secret_iam_member" "app_secret" {
  secret_id = google_secret_manager_secret.db_password.secret_id
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${google_service_account.app.email}"
  project   = var.project_id
}

# iam_binding — authoritative for this role — removes anyone not in the members list
# Use carefully: if you add a user outside Terraform, the next apply removes them
resource "google_project_iam_binding" "viewer_binding" {
  project = var.project_id
  role    = "roles/viewer"

  members = [
    "user:alice@acme.com",
    "user:bob@acme.com",
    # All viewers must be listed here — anyone not listed is removed on next apply
  ]
}

What just happened?

  • iam_member is additive — iam_binding is authoritative for its role. The google_storage_bucket_iam_member adds the app service account without removing any other members. The google_project_iam_binding for the viewer role owns all viewers — if Alice or Bob are removed from the list, the next apply removes their access. Use iam_binding only when you intend Terraform to be the single source of truth for that role on that resource.
  • GCP member format always includes the identity type prefix. serviceAccount:email, user:email, group:email — the prefix is required. A member string without the prefix is invalid and causes an error. This is different from AWS ARNs where the identity type is embedded in the ARN structure.

Key GCP Resource Patterns

# ── CLOUD STORAGE BUCKET ─────────────────────────────────────────────────────

resource "google_storage_bucket" "app_data" {
  name          = "${var.project_id}-app-data"  # Globally unique — project ID as prefix
  project       = var.project_id
  location      = "US"   # Multi-region: US, EU, ASIA — or single region: us-central1
  force_destroy = false  # Prevent deletion if bucket contains objects

  # Uniform bucket-level access — disables per-object ACLs, IAM only
  # Required for modern GCP security — eliminates legacy ACL confusion
  uniform_bucket_level_access = true

  versioning {
    enabled = true  # Keep all object versions for recovery and audit
  }

  lifecycle_rule {
    condition {
      age = 90  # Objects older than 90 days
    }
    action {
      type          = "SetStorageClass"
      storage_class = "NEARLINE"  # Cheaper storage for infrequently accessed data
    }
  }

  # No google_project_service dependency needed — the Cloud Storage API
  # (storage.googleapis.com) is typically enabled by default in new projects
}

# ── CLOUD SQL POSTGRESQL ──────────────────────────────────────────────────────

resource "google_sql_database_instance" "main" {
  name             = "db-${var.environment}-main"
  project          = var.project_id
  database_version = "POSTGRES_15"
  region           = var.region

  settings {
    tier = "db-f1-micro"  # Machine type — db-f1-micro, db-n1-standard-1, etc.

    # REGIONAL = high availability with synchronous replica (for production)
    # ZONAL = single zone (for development — less cost)
    availability_type = var.environment == "prod" ? "REGIONAL" : "ZONAL"

    backup_configuration {
      enabled                        = true
      point_in_time_recovery_enabled = true  # PITR — restore to any point in time
      start_time                     = "02:00"
      transaction_log_retention_days = 7
    }

    ip_configuration {
      ipv4_enabled    = false  # No public IP — private connectivity only
      private_network = google_compute_network.main.id
    }
  }

  # GCP equivalent of prevent_destroy — must be set to false before
  # terraform destroy can delete the instance
  deletion_protection = true

  depends_on = [
    google_project_service.sqladmin,
    google_service_networking_connection.private  # VPC peering for private IP
  ]
}

# VPC peering required for Cloud SQL private IP connectivity
resource "google_service_networking_connection" "private" {
  network                 = google_compute_network.main.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.private_range.name]
}

# Reserved IP range for Google-managed services (Cloud SQL, Memorystore)
resource "google_compute_global_address" "private_range" {
  name          = "private-service-range"
  project       = var.project_id
  purpose       = "VPC_PEERING"   # Purpose: VPC peering for managed services
  address_type  = "INTERNAL"
  prefix_length = 16              # /16 — 65,536 addresses for managed services
  network       = google_compute_network.main.id
}

The google-beta Provider

GCP releases new features in a beta API first, then promotes them to GA. The google-beta provider contains all GA features plus beta-only resources and arguments. When a feature reaches GA, you migrate the resource to the google provider.

# google-beta provider — same authentication as google, different source
provider "google-beta" {
  project = var.project_id
  region  = var.region
  # Same credentials as google provider — ADC or GOOGLE_APPLICATION_CREDENTIALS
}

# GKE cluster — uses google-beta for beta networking arguments
resource "google_container_cluster" "primary" {
  provider = google-beta  # Required when using beta-only arguments

  name     = "gke-${var.environment}-main"
  project  = var.project_id
  location = var.region  # Regional cluster — HA across zones in the region

  remove_default_node_pool = true  # Best practice: manage node pools separately
  initial_node_count       = 1     # Required even when removing default pool

  network    = google_compute_network.main.name
  subnetwork = google_compute_subnetwork.app.name

  # Private cluster — worker nodes have no public IP addresses
  private_cluster_config {
    enable_private_nodes    = true   # No public IPs on worker nodes
    enable_private_endpoint = false  # Control plane is accessible (with CIDR restriction)
    master_ipv4_cidr_block  = "172.16.0.0/28"  # Control plane IP range
  }

  # Use secondary ranges from the subnet for pods and services
  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"      # Matches range_name in google_compute_subnetwork
    services_secondary_range_name = "services"
  }

  depends_on = [
    google_project_service.container,
    google_compute_subnetwork.app,
  ]
}

# Separate node pool — managed independently from the cluster
resource "google_container_node_pool" "primary" {
  name     = "primary-pool"
  project  = var.project_id
  cluster  = google_container_cluster.primary.name
  location = var.region

  initial_node_count = 2  # Per zone at creation — in a regional cluster, 2 × number of zones
  # Avoid node_count here: with autoscaling enabled it causes perpetual plan drift
  # because the autoscaler changes the count outside Terraform

  node_config {
    machine_type    = "e2-standard-2"  # 2 vCPU, 8GB RAM
    disk_size_gb    = 50
    disk_type       = "pd-ssd"         # SSD for better IOPS

    service_account = google_service_account.app.email
    oauth_scopes    = ["https://www.googleapis.com/auth/cloud-platform"]
  }

  management {
    auto_repair  = true  # Automatically repair unhealthy nodes
    auto_upgrade = true  # Automatically upgrade node version with cluster
  }

  autoscaling {
    min_node_count = 1   # Minimum nodes per zone
    max_node_count = 10  # Maximum nodes per zone
  }
}

Common GCP Provider Mistakes

Not enabling APIs before creating resources

The most common GCP Terraform error: Error 403: API [compute.googleapis.com] not enabled on project PROJECT_ID. Always add google_project_service resources for every API you use and add depends_on to resources that need the API to be enabled first. This is the number one thing that breaks GCP Terraform configurations for engineers coming from AWS or Azure.
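
The one-resource-per-API pattern shown earlier can also be collapsed into a single for_each resource — a sketch, with an illustrative required_apis local:

```hcl
locals {
  required_apis = [
    "compute.googleapis.com",
    "container.googleapis.com",
    "sqladmin.googleapis.com",
  ]
}

resource "google_project_service" "required" {
  for_each           = toset(local.required_apis)
  project            = var.project_id
  service            = each.value
  disable_on_destroy = false  # Same safety rule as the individual resources
}

# Downstream resources can then depend on the whole set at once:
#   depends_on = [google_project_service.required]
```

Depending on the resource address without an each.key covers every API in the list, so one depends_on line protects against all "API not enabled" errors.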

Confusing iam_binding and iam_member

Using iam_binding when you meant iam_member accidentally revokes access for principals added outside Terraform — because iam_binding removes anyone not in its list on the next apply. Using iam_member when you need authoritative control means Terraform cannot enforce who has access. Understand the difference before using either.

Setting auto_create_subnetworks = true on production VPCs

Auto mode VPCs create subnets in every GCP region automatically — including regions you never intend to use, with CIDR ranges you cannot change later without deleting the VPC. Always set auto_create_subnetworks = false and create subnets explicitly for any VPC that will be used in production.

GCP resource naming — use project ID as prefix

Many GCP resources require globally unique names — Cloud Storage buckets, Cloud SQL instances, GKE cluster names. Since project IDs are already globally unique in GCP, prefixing with the project ID is the standard pattern: ${var.project_id}-resource-name. This guarantees uniqueness without a random suffix and makes the resource's ownership immediately obvious from its name.
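
As a sketch of this convention — the name_prefix local and the reports bucket are illustrative, not part of the lesson's earlier configuration:

```hcl
locals {
  name_prefix = "${var.project_id}-${var.environment}"  # Globally unique base
}

resource "google_storage_bucket" "reports" {
  name                        = "${local.name_prefix}-reports"  # e.g. acme-data-prod-reports
  project                     = var.project_id
  location                    = var.region
  uniform_bucket_level_access = true
}
```

Centralising the prefix in a local means every resource name changes consistently if the convention ever does.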

Practice Questions

1. Which resource enables a GCP API for a project, and which argument prevents catastrophic cascade deletion of all resources of that type?



2. What is the key difference between google_storage_bucket_iam_member and google_storage_bucket_iam_binding?



3. How does GCP VPC networking differ from AWS VPC networking in terms of regional scope?



Quiz

1. What is the recommended keyless authentication method for GitHub Actions deploying to GCP?


2. How do you prevent accidental deletion of a Cloud SQL instance with Terraform?


3. A GCP resource argument you need is only available in beta. What must you do in Terraform?


Up Next · Lesson 34

Terraform with Kubernetes

Three clouds done. Lesson 34 covers the Kubernetes provider — managing cluster resources declaratively with Terraform, when to use the Kubernetes provider vs Helm vs kubectl, deploying manifests, and the patterns for managing cluster infrastructure alongside application workloads.