Docker Course
Docker with CI/CD
A team's deployment process was a seven-step manual checklist. Build the image. Tag it. Push it. SSH into the server. Pull it. Stop the old container. Start the new one. Each step was done by a human. Each step could be done wrong. One Friday afternoon, step four was skipped — the engineer pushed to the wrong server. The new image was running in staging. Production was serving a three-week-old build. The discrepancy wasn't noticed until Monday morning when a customer called about a feature that "disappeared." A CI/CD pipeline doesn't make mistakes on step four. It doesn't have a step four.
This lesson builds a complete Docker CI/CD pipeline from scratch — the kind used by teams shipping to production dozens of times per day. Every stage is shown with working configuration for GitHub Actions: build, test, vulnerability scan, push to registry, and deploy. One git push triggers the entire sequence. No SSH. No checklists. No step four.
The Assembly Line Analogy
A car factory doesn't build one car at a time, start to finish, with one worker doing every step. It runs an assembly line — each station does one specific job, checks that the job was done correctly, and only passes the car to the next station if the check passes. A broken weld stops the car at station three — it never reaches station seven where it would be driven off the lot. A Docker CI/CD pipeline is the assembly line for software: each stage (build, test, scan, push, deploy) is a station with a quality gate. A failing test stops the pipeline at stage two. A critical CVE stops it at stage three. Nothing broken reaches production — not because a human caught it, but because the line stopped itself.
The Pipeline Stages
Every stage — what it does and what it gates
Stage 1 — Build
The build stage uses docker buildx with registry cache — so the CI runner reuses cached layers from the previous pipeline run instead of downloading everything from scratch on every commit. The image is tagged with the git commit SHA for full traceability, exactly as covered in Lesson 31.
# .github/workflows/ci.yml — GitHub Actions pipeline
name: Docker CI/CD
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
env:
IMAGE_NAME: acmecorp/payment-api
REGISTRY: docker.io
jobs:
build:
name: Build Image
runs-on: ubuntu-latest
steps:
- name: Checkout source
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
# Buildx enables BuildKit, multi-platform builds, and registry cache.
- name: Log in to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
# Credentials stored as GitHub Actions secrets — never in the workflow file.
- name: Build production image
uses: docker/build-push-action@v5
with:
context: .
target: production
# Build the production stage of the multi-stage Dockerfile.
push: false
# Do not push yet — tests and scan must pass first.
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}
cache-from: type=registry,ref=${{ env.IMAGE_NAME }}:cache
cache-to: type=registry,ref=${{ env.IMAGE_NAME }}:cache,mode=max
# Registry cache — warm layers pulled from previous run.
# Lesson 40: first run 94s, subsequent code-only runs 8s.
outputs: type=docker,dest=/tmp/image.tar
# Save the image to a tarball so subsequent jobs can load it
# without rebuilding — each job gets a fresh runner environment.
- name: Upload image tarball
uses: actions/upload-artifact@v4
with:
name: docker-image
path: /tmp/image.tar
retention-days: 1
Stage 2 — Test
Tests run inside the container that was just built — not on the runner's host environment. This guarantees the tests are exercising exactly the same runtime that will go to production. The test stage loads the image from the tarball uploaded by the build job, starts a Postgres container as a dependency, and runs the test suite. A single test failure exits the job with a non-zero code — the pipeline stops and the push job never runs.
test:
name: Run Tests
runs-on: ubuntu-latest
needs: build
# Only runs after the build job succeeds.
services:
postgres:
image: postgres:15-alpine
env:
POSTGRES_DB: payment_test
POSTGRES_USER: payment_user
POSTGRES_PASSWORD: test_password
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
# GitHub Actions starts this container before the job steps run.
# --health-cmd waits until Postgres is ready before proceeding.
steps:
- name: Download image tarball
uses: actions/download-artifact@v4
with:
name: docker-image
path: /tmp
- name: Load image into Docker
run: docker load --input /tmp/image.tar
# Loads the exact image built in the previous job — no rebuild.
- name: Run test suite
run: |
docker run --rm \
--network ${{ job.services.postgres.network }} \
-e NODE_ENV=test \
-e DB_HOST=postgres \
-e DB_PORT=5432 \
-e DB_NAME=payment_test \
-e DB_USER=payment_user \
-e DB_PASSWORD=test_password \
${{ env.IMAGE_NAME }}:${{ github.sha }} \
npm test
# --rm → remove container after tests complete
# --network → join the same network as the Postgres service container
# npm test → overrides the image's CMD for this one run
# Non-zero exit from npm test → job fails → push job never runs.
Stage 3 — Vulnerability Scan
The scan stage runs Trivy against the built image before it reaches the registry. A critical CVE fails the pipeline — the image is never pushed and never deployed. This is the gate that prevents the scenario from Lesson 32: a compromised image with a known vulnerability sitting in the registry for months, exploited long before anyone notices.
scan:
name: Vulnerability Scan
runs-on: ubuntu-latest
needs: build
# Runs in parallel with the test job — both depend on build.
# Pipeline only proceeds to push when BOTH test and scan pass.
steps:
- name: Download image tarball
uses: actions/download-artifact@v4
with:
name: docker-image
path: /tmp
- name: Load image into Docker
run: docker load --input /tmp/image.tar
- name: Run Trivy vulnerability scan
uses: aquasecurity/trivy-action@master
with:
image-ref: ${{ env.IMAGE_NAME }}:${{ github.sha }}
format: table
exit-code: 1
# exit-code: 1 → fail the job if vulnerabilities are found.
# Without this, Trivy reports findings but the job still passes.
ignore-unfixed: true
# ignore-unfixed: true → only fail on CVEs that have a fix available.
# No point blocking a deploy for a vulnerability with no patch yet.
severity: CRITICAL,HIGH
# Only fail on CRITICAL and HIGH — LOW and MEDIUM are reported but don't block.
      - name: Generate SARIF report
        uses: aquasecurity/trivy-action@master
        if: always()
        # Second Trivy pass over the same image, this time in SARIF format.
        # The gating step above uses table output and never writes a file —
        # without this pass there is no trivy-results.sarif to upload.
        # Default exit-code 0: this step only reports; the step above blocks.
        with:
          image-ref: ${{ env.IMAGE_NAME }}:${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
      - name: Upload scan results
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        # Upload results even if the scan failed — visible in GitHub Security tab.
        with:
          sarif_file: trivy-results.sarif
# Trivy output when a critical CVE is found:
acmecorp/payment-api:a3f2c8d (alpine 3.18.4)

Library     Vulnerability    Severity    Installed    Fixed       Title
openssl     CVE-2023-5363    CRITICAL    3.1.3-r0     3.1.4-r0    OpenSSL AES-SIV
libcrypto   CVE-2023-4807    HIGH        3.1.3-r0     3.1.4-r0    OpenSSL POLY1305

2 vulnerabilities found. Exit code: 1

# GitHub Actions output:
✗ scan (Vulnerability Scan) — FAILED
  Error: Process completed with exit code 1.

# Push job status:
⊘ push — SKIPPED (dependency scan failed)
⊘ deploy — SKIPPED (dependency push skipped)

# The broken image never reached the registry.
# Fix: update the Dockerfile base image to node:18-alpine3.19 → rebuilds with patched OpenSSL.
# Pipeline re-runs → scan passes → push proceeds.
What just happened?
Trivy found a critical OpenSSL vulnerability in the Alpine base image and exited with code 1. GitHub Actions treated the non-zero exit as a job failure. The push job, which declared needs: [test, scan], was automatically skipped — the image was never written to the registry. The deploy job, which depends on push, was also skipped. The entire downstream pipeline stopped at stage three. The fix is a one-line Dockerfile change to a newer Alpine tag that includes the patched OpenSSL — exactly the pattern described in Lesson 32.
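That one-line change, sketched against a hypothetical multi-stage Dockerfile (the stage name matches the `target: production` used in the build job; the exact layout is an assumption, since the Dockerfile itself isn't shown in this lesson):

```dockerfile
# Before — Alpine 3.18 ships the vulnerable OpenSSL 3.1.3-r0:
# FROM node:18-alpine3.18 AS production

# After — Alpine 3.19 ships the patched OpenSSL 3.1.4-r0:
FROM node:18-alpine3.19 AS production
```

Committing this change triggers the pipeline again from the top; the rebuilt image carries the patched library, the scan passes, and the push proceeds.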
Stage 4 — Push
The push stage only runs when both test and scan have passed. It applies the full tagging strategy from Lesson 31 — the commit SHA for traceability, the branch name for human readability, and latest for convenience — then pushes all three tags to the registry. The same image layers are pushed once; three tags point to them.
push:
name: Push to Registry
runs-on: ubuntu-latest
needs: [test, scan]
# Only runs when BOTH test AND scan pass.
# If either fails, this job is skipped automatically.
if: github.ref == 'refs/heads/main'
# Only push on merges to main — not on pull requests or feature branches.
# Feature branches build and test but do not pollute the registry.
steps:
- name: Download image tarball
uses: actions/download-artifact@v4
with:
name: docker-image
path: /tmp
- name: Load image into Docker
run: docker load --input /tmp/image.tar
- name: Log in to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Tag and push image
run: |
SHA=${{ github.sha }}
SHORT_SHA=${SHA::7}
BRANCH=$(echo ${{ github.ref_name }} | sed 's/\//-/g')
# Tag with short SHA — primary immutable reference (Lesson 31):
docker tag ${{ env.IMAGE_NAME }}:${{ github.sha }} \
${{ env.IMAGE_NAME }}:${SHORT_SHA}
# Tag with branch-SHA combination:
docker tag ${{ env.IMAGE_NAME }}:${{ github.sha }} \
${{ env.IMAGE_NAME }}:${BRANCH}-${SHORT_SHA}
# Tag as latest — mutable pointer to current main release:
docker tag ${{ env.IMAGE_NAME }}:${{ github.sha }} \
${{ env.IMAGE_NAME }}:latest
# Push all tags — same layers, pushed once:
docker push ${{ env.IMAGE_NAME }}:${SHORT_SHA}
docker push ${{ env.IMAGE_NAME }}:${BRANCH}-${SHORT_SHA}
docker push ${{ env.IMAGE_NAME }}:latest
echo "Pushed: ${{ env.IMAGE_NAME }}:${SHORT_SHA}"
echo "Deploy with: docker pull ${{ env.IMAGE_NAME }}:${SHORT_SHA}"
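The tag strings that script computes can be verified locally with plain shell — the SHA and branch name below are made-up values for illustration, standing in for what GitHub Actions would supply:

```shell
# Hypothetical commit SHA and branch name, standing in for the CI values:
SHA="a3f2c8d9e1b4f6a7c8d9e1b4f6a7c8d9e1b4f6a7"
SHORT_SHA=${SHA:0:7}

# Branch names may contain slashes (feature/fix-refunds), but a slash is
# not valid inside a Docker tag — sed rewrites each slash to a dash:
BRANCH=$(echo "feature/fix-refunds" | sed 's/\//-/g')

echo "${SHORT_SHA}"              # → a3f2c8d
echo "${BRANCH}-${SHORT_SHA}"    # → feature-fix-refunds-a3f2c8d
```

Note that `${SHA:0:7}` and the workflow's `${SHA::7}` are equivalent bash substring expansions — both take the first seven characters.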
Stage 5 — Deploy
The deploy stage connects to the production server over SSH — using a key stored as a GitHub Actions secret — pulls the verified image, and restarts the service. No human touches the server. No checklist. The server only needs Docker installed; it never needs git, Node.js, or any build tooling. This is the golden rule from Lesson 36 — the production server only runs docker pull and docker run.
deploy:
name: Deploy to Production
runs-on: ubuntu-latest
needs: push
# Only runs after the image is confirmed in the registry.
if: github.ref == 'refs/heads/main'
environment: production
# GitHub Environments allow required reviewers and deployment protection rules.
# A human approval gate can be added here for regulated environments.
steps:
- name: Deploy to production server
uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.PROD_HOST }}
          username: ${{ secrets.PROD_USER }}
          key: ${{ secrets.PROD_SSH_KEY }}
          # SSH key stored as a GitHub Actions secret — never in the repo.
          envs: GITHUB_SHA
          # Pass GITHUB_SHA through to the remote shell — without `envs`,
          # runner environment variables are not visible in the SSH script.
script: |
SHORT_SHA=${GITHUB_SHA::7}
IMAGE="${{ env.IMAGE_NAME }}:${SHORT_SHA}"
# Pull the specific SHA tag — not latest.
# Using the SHA guarantees we deploy exactly what passed CI.
docker pull ${IMAGE}
# Update the service — zero-downtime with --no-deps:
GIT_SHA=${SHORT_SHA} docker compose \
-f /opt/acmecorp/docker-compose.yml \
-f /opt/acmecorp/docker-compose.prod.yml \
up -d \
--no-deps \
--force-recreate \
payment-api
# Verify the new container is healthy before finishing:
sleep 10
docker ps --filter name=payment-api \
--format "{{.Status}}" | grep -q "healthy"
echo "Deployed: ${IMAGE}"
env:
GITHUB_SHA: ${{ github.sha }}
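The fixed `sleep 10` is the weakest part of that script — a slow-starting container can still be mid-startup when the grep runs. A retry loop is sturdier. Below is a generic sketch of that pattern; `wait_for` is a hypothetical helper, and `true` stands in for the real `docker ps … | grep -q healthy` check so the snippet runs anywhere:

```shell
# Retry a check until it passes or attempts run out.
# $1 = max attempts, $2 = seconds between attempts, rest = command to run.
wait_for() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for i in $(seq 1 "$attempts"); do
    if "$@"; then
      return 0          # check passed — container is healthy
    fi
    sleep "$delay"
  done
  return 1              # gave up — deploy step should fail here
}

# In the deploy script this would be something like:
#   wait_for 12 5 sh -c \
#     'docker ps --filter name=payment-api --format "{{.Status}}" | grep -q healthy'
wait_for 3 1 true && echo "healthy"
```

A non-zero return from `wait_for` fails the SSH step, which fails the deploy job — so a container that never reaches healthy shows up as a red pipeline, not a silent bad deploy.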
# Full pipeline run — merge to main branch:
✓ build  (42s) — image built with warm registry cache
✓ test   (38s) — all 147 tests passed, Postgres dependency healthy
✓ scan   (12s) — 0 CRITICAL, 0 HIGH vulnerabilities
✓ push    (8s) — 3 tags pushed (a3f2c8d, main-a3f2c8d, latest)
✓ deploy (14s) — production server updated, container healthy

Total pipeline time: 114 seconds from git push to deployed production.

# Pipeline run — pull request on feature branch:
✓ build (44s)
✓ test  (41s)
✓ scan  (11s)
⊘ push — SKIPPED (not on main branch)
⊘ deploy — SKIPPED (not on main branch)
# Feature branch: validated but not deployed. Registry stays clean.

# Pipeline run — test failure:
✓ build (43s)
✗ test  (22s) — 3 tests failed: PaymentService.processRefund
⊘ scan — SKIPPED
⊘ push — SKIPPED
⊘ deploy — SKIPPED
# Nothing broken reached the registry. Production unchanged.
What just happened?
A git push to main triggered a five-stage pipeline. The image was built in 42 seconds using the registry cache. Tests ran inside the production image against a real Postgres instance. Trivy scanned for vulnerabilities. All three gates passed — the image was tagged with the commit SHA and pushed to the registry. The production server pulled the exact SHA that passed CI and restarted the service. Total time from push to deployed: 114 seconds. No human involvement after the git push. No checklists. No step four.
Secrets in CI — Where They Live
A CI pipeline touches several secrets: registry credentials to push images, SSH keys to reach the production server, and any application secrets needed to run tests. None of these belong in the workflow file or the repository. They all live in GitHub Actions Secrets — encrypted at rest, injected as environment variables at runtime, and never printed in logs.
# Secrets to configure in GitHub → Settings → Secrets and Variables → Actions:
# DOCKERHUB_USERNAME → Docker Hub username for docker login
# DOCKERHUB_TOKEN → Docker Hub access token (not your password)
# Generate at: hub.docker.com → Account Settings → Security
# PROD_HOST → Production server IP or hostname
# PROD_USER → SSH username on the production server
# PROD_SSH_KEY → Private SSH key (the server has the public key in authorized_keys)
# Generate a dedicated deploy SSH key — not your personal key:
ssh-keygen -t ed25519 -C "github-actions-deploy" -f ~/.ssh/deploy_key -N ""
# Copy the PUBLIC key to the server:
ssh-copy-id -i ~/.ssh/deploy_key.pub user@prod-server
# Store the PRIVATE key as the PROD_SSH_KEY secret in GitHub.
# Test credentials — used only in the test job, never in production:
# DB_PASSWORD in the test job uses a hardcoded test_password directly in the workflow.
# Test databases contain no real data — no secret needed.
A Complete Pipeline Scenario
The scenario: you've just joined a team that deploys manually from a checklist, and you introduce the complete pipeline described in this lesson. Here is the workflow file in its entirety — everything above, assembled into a single deployable ci.yml.
# .github/workflows/ci.yml — complete pipeline
name: Docker CI/CD
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
env:
IMAGE_NAME: acmecorp/payment-api
jobs:
build:
name: Build Image
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: docker/setup-buildx-action@v3
- uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- uses: docker/build-push-action@v5
with:
context: .
target: production
push: false
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}
cache-from: type=registry,ref=${{ env.IMAGE_NAME }}:cache
cache-to: type=registry,ref=${{ env.IMAGE_NAME }}:cache,mode=max
outputs: type=docker,dest=/tmp/image.tar
- uses: actions/upload-artifact@v4
with:
name: docker-image
path: /tmp/image.tar
test:
name: Run Tests
runs-on: ubuntu-latest
needs: build
services:
postgres:
image: postgres:15-alpine
env:
POSTGRES_DB: payment_test
POSTGRES_USER: payment_user
POSTGRES_PASSWORD: test_password
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- uses: actions/download-artifact@v4
with: { name: docker-image, path: /tmp }
- run: docker load --input /tmp/image.tar
- run: |
docker run --rm \
--network ${{ job.services.postgres.network }} \
-e NODE_ENV=test \
-e DB_HOST=postgres \
-e DB_PASSWORD=test_password \
${{ env.IMAGE_NAME }}:${{ github.sha }} \
npm test
scan:
name: Vulnerability Scan
runs-on: ubuntu-latest
needs: build
steps:
- uses: actions/download-artifact@v4
with: { name: docker-image, path: /tmp }
- run: docker load --input /tmp/image.tar
- uses: aquasecurity/trivy-action@master
with:
image-ref: ${{ env.IMAGE_NAME }}:${{ github.sha }}
exit-code: 1
ignore-unfixed: true
severity: CRITICAL,HIGH
push:
name: Push to Registry
runs-on: ubuntu-latest
needs: [test, scan]
if: github.ref == 'refs/heads/main'
steps:
- uses: actions/download-artifact@v4
with: { name: docker-image, path: /tmp }
- run: docker load --input /tmp/image.tar
- uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- run: |
SHORT_SHA=${GITHUB_SHA::7}
docker tag ${{ env.IMAGE_NAME }}:${{ github.sha }} \
${{ env.IMAGE_NAME }}:${SHORT_SHA}
docker tag ${{ env.IMAGE_NAME }}:${{ github.sha }} \
${{ env.IMAGE_NAME }}:latest
docker push ${{ env.IMAGE_NAME }}:${SHORT_SHA}
docker push ${{ env.IMAGE_NAME }}:latest
env:
GITHUB_SHA: ${{ github.sha }}
deploy:
name: Deploy to Production
runs-on: ubuntu-latest
needs: push
if: github.ref == 'refs/heads/main'
environment: production
steps:
- uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.PROD_HOST }}
          username: ${{ secrets.PROD_USER }}
          key: ${{ secrets.PROD_SSH_KEY }}
          envs: GITHUB_SHA
          # Without `envs`, GITHUB_SHA is not visible in the remote script.
script: |
SHORT_SHA=${GITHUB_SHA::7}
docker pull ${{ env.IMAGE_NAME }}:${SHORT_SHA}
GIT_SHA=${SHORT_SHA} docker compose \
-f /opt/acmecorp/docker-compose.yml \
-f /opt/acmecorp/docker-compose.prod.yml \
up -d --no-deps --force-recreate payment-api
env:
GITHUB_SHA: ${{ github.sha }}
Teacher's Note
This pipeline is GitHub Actions — but the structure is identical in GitLab CI, CircleCI, and Jenkins. The stages are the same, the Docker commands are the same, and the secrets management approach is the same. The syntax differs; the discipline doesn't. If your team uses a different CI platform, translate the stage structure directly — build, test, scan, push, deploy — and the same quality gates apply. Start with the build and test stages first. Add the scan gate second. The push and deploy stages are the last pieces — a pipeline that validates without deploying is still enormously more valuable than a manual checklist.
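As a rough illustration of that translation, here is the same stage skeleton in GitLab CI syntax. This is a sketch under assumptions, not a drop-in file: job names are invented, script bodies are abbreviated, and the deploy stage is elided. `$CI_REGISTRY_IMAGE` and `$CI_COMMIT_SHA` are GitLab's predefined variables, playing the roles of `env.IMAGE_NAME` and `github.sha` above.

```yaml
# .gitlab-ci.yml — same five-stage structure, GitLab syntax (sketch only)
stages: [build, test, scan, push, deploy]

build-image:
  stage: build
  script:
    - docker build --target production -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .

run-tests:
  stage: test
  script:
    - docker run --rm $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA npm test

scan-image:
  stage: scan
  script:
    - trivy image --exit-code 1 --ignore-unfixed --severity CRITICAL,HIGH
        $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

push-image:
  stage: push
  rules:
    - if: $CI_COMMIT_BRANCH == "main"   # GitLab's equivalent of the `if:` gate
  script:
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
# deploy stage: same pattern — gated on main, pulls the SHA tag on the server.
```

The quality gates carry over unchanged: a stage only runs when every stage before it succeeds, and the `rules:` condition keeps feature branches out of the registry.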
Practice Questions
1. In a GitHub Actions workflow, which keyword in a job definition specifies which other jobs must complete successfully before this job runs?
2. When running Trivy in a CI pipeline, which configuration option causes the pipeline job to fail — rather than just report — when a vulnerability is found?
3. To persist the Docker layer cache between CI pipeline runs on stateless runners — so cached layers are available on the next run — which cache type is used in the --cache-from and --cache-to flags?
Quiz
1. A CI pipeline runs tests with npm test directly on the GitHub Actions runner — not inside the Docker container. What is the risk of this approach?
2. A developer opens a pull request from a feature branch. The pipeline runs build, test, and scan — all pass. The push job is skipped. Why?
3. The deploy job pulls acmecorp/payment-api:a3f2c8d rather than acmecorp/payment-api:latest. Why is the SHA tag used instead of latest?
Up Next · Lesson 44
Docker on AWS
Pipeline automated — now the cloud question: how does Docker run on AWS? ECS, ECR, Fargate, and App Runner each solve a different version of the same problem. This lesson maps the Docker concepts you already know to the AWS services that implement them at scale.