Docker Course
Docker with CI/CD
A team's deployment process was a seven-step manual checklist. Build the image. Tag it. Push it. SSH into the server. Pull it. Stop the old container. Start the new one. Each step was done by a human. Each step could be done wrong. One Friday afternoon, step four was skipped — the engineer pushed to the wrong server. The new image was running in staging. Production was serving a three-week-old build. The discrepancy wasn't noticed until Monday morning when a customer called about a feature that "disappeared." A CI/CD pipeline doesn't make mistakes on step four. It doesn't have a step four.
This lesson builds a complete Docker CI/CD pipeline from scratch — the kind used by teams shipping to production dozens of times per day. Every stage is shown with working configuration for GitHub Actions: build, test, vulnerability scan, push to registry, and deploy. One git push triggers the entire sequence. No SSH. No checklists. No step four.
The Assembly Line Analogy
A car factory doesn't build one car at a time, start to finish, with one worker doing every step. It runs an assembly line — each station does one specific job, checks that the job was done correctly, and only passes the car to the next station if the check passes. A broken weld stops the car at station three — it never reaches station seven where it would be driven off the lot. A Docker CI/CD pipeline is the assembly line for software: each stage (build, test, scan, push, deploy) is a station with a quality gate. A failing test stops the pipeline at stage two. A critical CVE stops it at stage three. Nothing broken reaches production — not because a human caught it, but because the line stopped itself.
The Pipeline Stages
Every stage — what it does and what it gates
Stage 1 — Build
The build stage uses docker buildx with registry cache — so the CI runner reuses cached layers from the previous pipeline run instead of downloading everything from scratch on every commit. The image is tagged with the git commit SHA for full traceability, exactly as covered in Lesson 31.
# .github/workflows/ci.yml — GitHub Actions pipeline
name: Docker CI/CD
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
env:
IMAGE_NAME: acmecorp/payment-api
REGISTRY: docker.io
jobs:
build:
name: Build Image
runs-on: ubuntu-latest
steps:
- name: Checkout source
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
# Buildx enables BuildKit, multi-platform builds, and registry cache.
- name: Log in to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
# Credentials stored as GitHub Actions secrets — never in the workflow file.
- name: Build production image
uses: docker/build-push-action@v5
with:
context: .
target: production
# Build the production stage of the multi-stage Dockerfile.
push: false
# Do not push yet — tests and scan must pass first.
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}
cache-from: type=registry,ref=${{ env.IMAGE_NAME }}:cache
cache-to: type=registry,ref=${{ env.IMAGE_NAME }}:cache,mode=max
# Registry cache — warm layers pulled from previous run.
# Lesson 40: first run 94s, subsequent code-only runs 8s.
outputs: type=docker,dest=/tmp/image.tar
# Save the image to a tarball so subsequent jobs can load it
# without rebuilding — each job gets a fresh runner environment.
- name: Upload image tarball
uses: actions/upload-artifact@v4
with:
name: docker-image
path: /tmp/image.tar
retention-days: 1
Stage 2 — Test
Tests run inside the container that was just built — not on the runner's host environment. This guarantees the tests are exercising exactly the same runtime that will go to production. The test stage loads the image from the tarball uploaded by the build job, starts a Postgres container as a dependency, and runs the test suite. A single test failure exits the job with a non-zero code — the pipeline stops and the push job never runs.
test:
name: Run Tests
runs-on: ubuntu-latest
needs: build
# Only runs after the build job succeeds.
services:
postgres:
image: postgres:15-alpine
env:
POSTGRES_DB: payment_test
POSTGRES_USER: payment_user
POSTGRES_PASSWORD: test_password
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
# GitHub Actions starts this container before the job steps run.
# --health-cmd waits until Postgres is ready before proceeding.
steps:
- name: Download image tarball
uses: actions/download-artifact@v4
with:
name: docker-image
path: /tmp
- name: Load image into Docker
run: docker load --input /tmp/image.tar
# Loads the exact image built in the previous job — no rebuild.
- name: Run test suite
run: |
docker run --rm \
--network ${{ job.services.postgres.network }} \
-e NODE_ENV=test \
-e DB_HOST=postgres \
-e DB_PORT=5432 \
-e DB_NAME=payment_test \
-e DB_USER=payment_user \
-e DB_PASSWORD=test_password \
${{ env.IMAGE_NAME }}:${{ github.sha }} \
npm test
# --rm → remove container after tests complete
# --network → join the same network as the Postgres service container
# npm test → overrides the image's CMD for this one run
# Non-zero exit from npm test → job fails → push job never runs.
Stage 3 — Vulnerability Scan
The scan stage runs Trivy against the built image before it reaches the registry. A critical CVE fails the pipeline — the image is never pushed and never deployed. This is the gate that prevents the scenario from Lesson 32: a compromised image with a known vulnerability sitting in the registry for months, exploited long before anyone notices.
scan:
name: Vulnerability Scan
runs-on: ubuntu-latest
needs: build
# Runs in parallel with the test job — both depend on build.
# Pipeline only proceeds to push when BOTH test and scan pass.
steps:
- name: Download image tarball
uses: actions/download-artifact@v4
with:
name: docker-image
path: /tmp
- name: Load image into Docker
run: docker load --input /tmp/image.tar
- name: Run Trivy vulnerability scan
uses: aquasecurity/trivy-action@master
with:
image-ref: ${{ env.IMAGE_NAME }}:${{ github.sha }}
format: table
exit-code: 1
# exit-code: 1 → fail the job if vulnerabilities are found.
# Without this, Trivy reports findings but the job still passes.
ignore-unfixed: true
# ignore-unfixed: true → only fail on CVEs that have a fix available.
# No point blocking a deploy for a vulnerability with no patch yet.
severity: CRITICAL,HIGH
# Only fail on CRITICAL and HIGH — LOW and MEDIUM are reported but don't block.
      - name: Generate SARIF report
        uses: aquasecurity/trivy-action@master
        if: always()
        # Second Trivy pass over the same image, this time in SARIF format.
        # The gating step above uses table output and never writes a file —
        # without this pass there is no trivy-results.sarif to upload.
        # Default exit-code 0: this step only reports; the step above blocks.
        with:
          image-ref: ${{ env.IMAGE_NAME }}:${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
      - name: Upload scan results
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        # Upload results even if the scan failed — visible in GitHub Security tab.
        with:
          sarif_file: trivy-results.sarif
# Trivy output when a critical CVE is found:
acmecorp/payment-api:a3f2c8d (alpine 3.18.4)

Library     Vulnerability    Severity    Installed    Fixed       Title
openssl     CVE-2023-5363    CRITICAL    3.1.3-r0     3.1.4-r0    OpenSSL AES-SIV
libcrypto   CVE-2023-4807    HIGH        3.1.3-r0     3.1.4-r0    OpenSSL POLY1305

2 vulnerabilities found. Exit code: 1

# GitHub Actions output:
✗ scan (Vulnerability Scan) — FAILED
  Error: Process completed with exit code 1.

# Push job status:
⊘ push — SKIPPED (dependency scan failed)
⊘ deploy — SKIPPED (dependency push skipped)

# The broken image never reached the registry.
# Fix: update the Dockerfile base image to node:18-alpine3.19 → rebuilds with patched OpenSSL.
# Pipeline re-runs → scan passes → push proceeds.
What just happened?
Trivy found a critical OpenSSL vulnerability in the Alpine base image and exited with code 1. GitHub Actions treated the non-zero exit as a job failure. The push job, which declared needs: [test, scan], was automatically skipped — the image was never written to the registry. The deploy job, which depends on push, was also skipped. The entire downstream pipeline stopped at stage three. The fix is a one-line Dockerfile change to a newer Alpine tag that includes the patched OpenSSL — exactly the pattern described in Lesson 32.
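That one-line change, sketched against a hypothetical multi-stage Dockerfile (the stage name matches the `target: production` used in the build job; the exact layout is an assumption, since the Dockerfile itself isn't shown in this lesson):

```dockerfile
# Before — Alpine 3.18 ships the vulnerable OpenSSL 3.1.3-r0:
# FROM node:18-alpine3.18 AS production

# After — Alpine 3.19 ships the patched OpenSSL 3.1.4-r0:
FROM node:18-alpine3.19 AS production
```

Committing this change triggers the pipeline again from the top; the rebuilt image carries the patched library, the scan passes, and the push proceeds.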
Stage 4 — Push
The push stage only runs when both test and scan have passed. It applies the full tagging strategy from Lesson 31 — the commit SHA for traceability, the branch name for human readability, and latest for convenience — then pushes all three tags to the registry. The same image layers are pushed once; three tags point to them.
push:
name: Push to Registry
runs-on: ubuntu-latest
needs: [test, scan]
# Only runs when BOTH test AND scan pass.
# If either fails, this job is skipped automatically.
if: github.ref == 'refs/heads/main'
# Only push on merges to main — not on pull requests or feature branches.
# Feature branches build and test but do not pollute the registry.
steps:
- name: Download image tarball
uses: actions/download-artifact@v4
with:
name: docker-image
path: /tmp
- name: Load image into Docker
run: docker load --input /tmp/image.tar
- name: Log in to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Tag and push image
run: |
SHA=${{ github.sha }}
SHORT_SHA=${SHA::7}
BRANCH=$(echo ${{ github.ref_name }} | sed 's/\//-/g')
# Tag with short SHA — primary immutable reference (Lesson 31):
docker tag ${{ env.IMAGE_NAME }}:${{ github.sha }} \
${{ env.IMAGE_NAME }}:${SHORT_SHA}
# Tag with branch-SHA combination:
docker tag ${{ env.IMAGE_NAME }}:${{ github.sha }} \
${{ env.IMAGE_NAME }}:${BRANCH}-${SHORT_SHA}
# Tag as latest — mutable pointer to current main release:
docker tag ${{ env.IMAGE_NAME }}:${{ github.sha }} \
${{ env.IMAGE_NAME }}:latest
# Push all tags — same layers, pushed once:
docker push ${{ env.IMAGE_NAME }}:${SHORT_SHA}
docker push ${{ env.IMAGE_NAME }}:${BRANCH}-${SHORT_SHA}
docker push ${{ env.IMAGE_NAME }}:latest
echo "Pushed: ${{ env.IMAGE_NAME }}:${SHORT_SHA}"
echo "Deploy with: docker pull ${{ env.IMAGE_NAME }}:${SHORT_SHA}"
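The tag strings that script computes can be verified locally with plain shell — the SHA and branch name below are made-up values for illustration, standing in for what GitHub Actions would supply:

```shell
# Hypothetical commit SHA and branch name, standing in for the CI values:
SHA="a3f2c8d9e1b4f6a7c8d9e1b4f6a7c8d9e1b4f6a7"
SHORT_SHA=${SHA:0:7}

# Branch names may contain slashes (feature/fix-refunds), but a slash is
# not valid inside a Docker tag — sed rewrites each slash to a dash:
BRANCH=$(echo "feature/fix-refunds" | sed 's/\//-/g')

echo "${SHORT_SHA}"              # → a3f2c8d
echo "${BRANCH}-${SHORT_SHA}"    # → feature-fix-refunds-a3f2c8d
```

Note that `${SHA:0:7}` and the workflow's `${SHA::7}` are equivalent bash substring expansions — both take the first seven characters.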
Stage 5 — Deploy
The deploy stage connects to the production server over SSH — using a key stored as a GitHub Actions secret — pulls the verified image, and restarts the service. No human touches the server. No checklist. The server only needs Docker installed; it never needs git, Node.js, or any build tooling. This is the golden rule from Lesson 36 — the production server only runs docker pull and docker run.
deploy:
name: Deploy to Production
runs-on: ubuntu-latest
needs: push
# Only runs after the image is confirmed in the registry.
if: github.ref == 'refs/heads/main'
environment: production
# GitHub Environments allow required reviewers and deployment protection rules.
# A human approval gate can be added here for regulated environments.
steps:
- name: Deploy to production server
uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.PROD_HOST }}
          username: ${{ secrets.PROD_USER }}
          key: ${{ secrets.PROD_SSH_KEY }}
          # SSH key stored as a GitHub Actions secret — never in the repo.
          envs: GITHUB_SHA
          # Pass GITHUB_SHA through to the remote shell — without `envs`,
          # runner environment variables are not visible in the SSH script.
script: |
SHORT_SHA=${GITHUB_SHA::7}
IMAGE="${{ env.IMAGE_NAME }}:${SHORT_SHA}"
# Pull the specific SHA tag — not latest.
# Using the SHA guarantees we deploy exactly what passed CI.
docker pull ${IMAGE}
# Update the service — zero-downtime with --no-deps:
GIT_SHA=${SHORT_SHA} docker compose \
-f /opt/acmecorp/docker-compose.yml \
-f /opt/acmecorp/docker-compose.prod.yml \
up -d \
--no-deps \
--force-recreate \
payment-api
# Verify the new container is healthy before finishing:
sleep 10
docker ps --filter name=payment-api \
--format "{{.Status}}" | grep -q "healthy"
echo "Deployed: ${IMAGE}"
env:
GITHUB_SHA: ${{ github.sha }}
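The fixed `sleep 10` is the weakest part of that script — a slow-starting container can still be mid-startup when the grep runs. A retry loop is sturdier. Below is a generic sketch of that pattern; `wait_for` is a hypothetical helper, and `true` stands in for the real `docker ps … | grep -q healthy` check so the snippet runs anywhere:

```shell
# Retry a check until it passes or attempts run out.
# $1 = max attempts, $2 = seconds between attempts, rest = command to run.
wait_for() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for i in $(seq 1 "$attempts"); do
    if "$@"; then
      return 0          # check passed — container is healthy
    fi
    sleep "$delay"
  done
  return 1              # gave up — deploy step should fail here
}

# In the deploy script this would be something like:
#   wait_for 12 5 sh -c \
#     'docker ps --filter name=payment-api --format "{{.Status}}" | grep -q healthy'
wait_for 3 1 true && echo "healthy"
```

A non-zero return from `wait_for` fails the SSH step, which fails the deploy job — so a container that never reaches healthy shows up as a red pipeline, not a silent bad deploy.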
# Full pipeline run — merge to main branch:
✓ build  (42s) — image built with warm registry cache
✓ test   (38s) — all 147 tests passed, Postgres dependency healthy
✓ scan   (12s) — 0 CRITICAL, 0 HIGH vulnerabilities
✓ push    (8s) — 3 tags pushed (a3f2c8d, main-a3f2c8d, latest)
✓ deploy (14s) — production server updated, container healthy

Total pipeline time: 114 seconds from git push to deployed production.

# Pipeline run — pull request on feature branch:
✓ build (44s)
✓ test  (41s)
✓ scan  (11s)
⊘ push — SKIPPED (not on main branch)
⊘ deploy — SKIPPED (not on main branch)
# Feature branch: validated but not deployed. Registry stays clean.

# Pipeline run — test failure:
✓ build (43s)
✗ test  (22s) — 3 tests failed: PaymentService.processRefund
⊘ scan — SKIPPED
⊘ push — SKIPPED
⊘ deploy — SKIPPED
# Nothing broken reached the registry. Production unchanged.
What just happened?
A git push to main triggered a five-stage pipeline. The image was built in 42 seconds using the registry cache. Tests ran inside the production image against a real Postgres instance. Trivy scanned for vulnerabilities. All three gates passed — the image was tagged with the commit SHA and pushed to the registry. The production server pulled the exact SHA that passed CI and restarted the service. Total time from push to deployed: 114 seconds. No human involvement after the git push. No checklists. No step four.
Secrets in CI — Where They Live
A CI pipeline touches several secrets: registry credentials to push images, SSH keys to reach the production server, and any application secrets needed to run tests. None of these belong in the workflow file or the repository. They all live in GitHub Actions Secrets — encrypted at rest, injected as environment variables at runtime, and never printed in logs.
# Secrets to configure in GitHub → Settings → Secrets and Variables → Actions:
# DOCKERHUB_USERNAME → Docker Hub username for docker login
# DOCKERHUB_TOKEN → Docker Hub access token (not your password)
# Generate at: hub.docker.com → Account Settings → Security
# PROD_HOST → Production server IP or hostname
# PROD_USER → SSH username on the production server
# PROD_SSH_KEY → Private SSH key (the server has the public key in authorized_keys)
# Generate a dedicated deploy SSH key — not your personal key:
ssh-keygen -t ed25519 -C "github-actions-deploy" -f ~/.ssh/deploy_key -N ""
# Copy the PUBLIC key to the server:
ssh-copy-id -i ~/.ssh/deploy_key.pub user@prod-server
# Store the PRIVATE key as the PROD_SSH_KEY secret in GitHub.
# Test credentials — used only in the test job, never in production:
# DB_PASSWORD in the test job uses a hardcoded test_password directly in the workflow.
# Test databases contain no real data — no secret needed.
A Complete Pipeline Scenario
The scenario: you've just joined a team that deploys manually from a checklist, and you introduce the complete pipeline described in this lesson. Here is the workflow file in its entirety — everything above, assembled into a single deployable ci.yml.
# .github/workflows/ci.yml — complete pipeline
name: Docker CI/CD
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
env:
IMAGE_NAME: acmecorp/payment-api
jobs:
build:
name: Build Image
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: docker/setup-buildx-action@v3
- uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- uses: docker/build-push-action@v5
with:
context: .
target: production
push: false
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}
cache-from: type=registry,ref=${{ env.IMAGE_NAME }}:cache
cache-to: type=registry,ref=${{ env.IMAGE_NAME }}:cache,mode=max
outputs: type=docker,dest=/tmp/image.tar
- uses: actions/upload-artifact@v4
with:
name: docker-image
path: /tmp/image.tar
test:
name: Run Tests
runs-on: ubuntu-latest
needs: build
services:
postgres:
image: postgres:15-alpine
env:
POSTGRES_DB: payment_test
POSTGRES_USER: payment_user
POSTGRES_PASSWORD: test_password
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- uses: actions/download-artifact@v4
with: { name: docker-image, path: /tmp }
- run: docker load --input /tmp/image.tar
- run: |
docker run --rm \
--network ${{ job.services.postgres.network }} \
-e NODE_ENV=test \
-e DB_HOST=postgres \
-e DB_PASSWORD=test_password \
${{ env.IMAGE_NAME }}:${{ github.sha }} \
npm test
scan:
name: Vulnerability Scan
runs-on: ubuntu-latest
needs: build
steps:
- uses: actions/download-artifact@v4
with: { name: docker-image, path: /tmp }
- run: docker load --input /tmp/image.tar
- uses: aquasecurity/trivy-action@master
with:
image-ref: ${{ env.IMAGE_NAME }}:${{ github.sha }}
exit-code: 1
ignore-unfixed: true
severity: CRITICAL,HIGH
push:
name: Push to Registry
runs-on: ubuntu-latest
needs: [test, scan]
if: github.ref == 'refs/heads/main'
steps:
- uses: actions/download-artifact@v4
with: { name: docker-image, path: /tmp }
- run: docker load --input /tmp/image.tar
- uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- run: |
SHORT_SHA=${GITHUB_SHA::7}
docker tag ${{ env.IMAGE_NAME }}:${{ github.sha }} \
${{ env.IMAGE_NAME }}:${SHORT_SHA}
docker tag ${{ env.IMAGE_NAME }}:${{ github.sha }} \
${{ env.IMAGE_NAME }}:latest
docker push ${{ env.IMAGE_NAME }}:${SHORT_SHA}
docker push ${{ env.IMAGE_NAME }}:latest
env:
GITHUB_SHA: ${{ github.sha }}
deploy:
name: Deploy to Production
runs-on: ubuntu-latest
needs: push
if: github.ref == 'refs/heads/main'
environment: production
steps:
- uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.PROD_HOST }}
          username: ${{ secrets.PROD_USER }}
          key: ${{ secrets.PROD_SSH_KEY }}
          envs: GITHUB_SHA
          # Without `envs`, GITHUB_SHA is not visible in the remote script.
script: |
SHORT_SHA=${GITHUB_SHA::7}
docker pull ${{ env.IMAGE_NAME }}:${SHORT_SHA}
GIT_SHA=${SHORT_SHA} docker compose \
-f /opt/acmecorp/docker-compose.yml \
-f /opt/acmecorp/docker-compose.prod.yml \
up -d --no-deps --force-recreate payment-api
env:
GITHUB_SHA: ${{ github.sha }}
Teacher's Note
This pipeline is GitHub Actions — but the structure is identical in GitLab CI, CircleCI, and Jenkins. The stages are the same, the Docker commands are the same, and the secrets management approach is the same. The syntax differs; the discipline doesn't. If your team uses a different CI platform, translate the stage structure directly — build, test, scan, push, deploy — and the same quality gates apply. Start with the build and test stages first. Add the scan gate second. The push and deploy stages are the last pieces — a pipeline that validates without deploying is still enormously more valuable than a manual checklist.
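As a rough illustration of that translation, here is the same stage skeleton in GitLab CI syntax. This is a sketch under assumptions, not a drop-in file: job names are invented, script bodies are abbreviated, and the deploy stage is elided. `$CI_REGISTRY_IMAGE` and `$CI_COMMIT_SHA` are GitLab's predefined variables, playing the roles of `env.IMAGE_NAME` and `github.sha` above.

```yaml
# .gitlab-ci.yml — same five-stage structure, GitLab syntax (sketch only)
stages: [build, test, scan, push, deploy]

build-image:
  stage: build
  script:
    - docker build --target production -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .

run-tests:
  stage: test
  script:
    - docker run --rm $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA npm test

scan-image:
  stage: scan
  script:
    - trivy image --exit-code 1 --ignore-unfixed --severity CRITICAL,HIGH
        $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

push-image:
  stage: push
  rules:
    - if: $CI_COMMIT_BRANCH == "main"   # GitLab's equivalent of the `if:` gate
  script:
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
# deploy stage: same pattern — gated on main, pulls the SHA tag on the server.
```

The quality gates carry over unchanged: a stage only runs when every stage before it succeeds, and the `rules:` condition keeps feature branches out of the registry.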
Practice Questions
1. In a GitHub Actions workflow, which keyword in a job definition specifies which other jobs must complete successfully before this job runs?
2. When running Trivy in a CI pipeline, which configuration option causes the pipeline job to fail — rather than just report — when a vulnerability is found?
3. To persist the Docker layer cache between CI pipeline runs on stateless runners — so cached layers are available on the next run — which cache type is used in the --cache-from and --cache-to flags?
Quiz
1. A CI pipeline runs tests with npm test directly on the GitHub Actions runner — not inside the Docker container. What is the risk of this approach?
2. A developer opens a pull request from a feature branch. The pipeline runs build, test, and scan — all pass. The push job is skipped. Why?
3. The deploy job pulls acmecorp/payment-api:a3f2c8d rather than acmecorp/payment-api:latest. Why is the SHA tag used instead of latest?
Up Next · Lesson 44
Docker on AWS
Pipeline automated — now the cloud question: how does Docker run on AWS? ECS, ECR, Fargate, and App Runner each solve a different version of the same problem. This lesson maps the Docker concepts you already know to the AWS services that implement them at scale.