Docker Lesson 31 – Image Versioning & Tagging | Dataplexa

Section III · Lesson 31

Image Versioning & Tagging

Production is down at 11pm. You need to roll back immediately. You type docker pull acmecorp/payment-api:latest — and get the broken version. Nobody knows which tag was running before. The deployment script only ever used latest. The rollback takes 45 minutes of archaeology instead of 30 seconds. This lesson is how you make sure that never happens.

A tagging strategy isn't bureaucracy — it's the difference between a 30-second rollback and a very bad night. The right tags give you traceability, reproducibility, and the confidence to deploy on a Friday afternoon.

The Problem with latest

latest is a moving pointer. It is simply the default tag Docker applies when you build or push without specifying one, so every untagged push silently moves it to the new image. The image you tested yesterday is no longer what latest points to today. The tag carries no history and no rollback path, and nothing in it tells you what code is actually inside the image.

The "Today's Special" Menu Analogy

Using latest in production is like a restaurant that only has one menu item called "Today's Special" — no name, no description, just "Today's Special." You ordered it last Tuesday and it was great. You order it again Thursday — completely different dish. You try to re-order what you had Tuesday — impossible, that item no longer exists on any menu. Semver tags are like named dishes. v1.2.3 is always v1.2.3 — today, next year, forever. You can always re-order it.
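One way to keep the moving pointer out of production is to have the deploy script reject it before anything is pulled. The sketch below is a minimal illustration, not part of Docker: classify_ref is a hypothetical helper name, and it only catches :latest (explicit or implied), not other mutable tags.

```shell
# Hypothetical helper: classify an image reference as "pinned" or "floating".
# "floating" means the reference resolves through the moving :latest pointer.
classify_ref() (
  ref="$1"
  # Drop the registry host and namespace, so a port such as :5000 is not read as a tag.
  last="${ref##*/}"
  case "$ref" in
    *@sha256:*) echo "pinned" ;;          # digest references never move
    *)
      case "$last" in
        *:latest) echo "floating" ;;      # explicit :latest
        *:*)      echo "pinned" ;;        # any other explicit tag
        *)        echo "floating" ;;      # no tag: Docker assumes :latest
      esac ;;
  esac
)

classify_ref "acmecorp/payment-api:latest"   # prints: floating
classify_ref "acmecorp/payment-api"          # prints: floating
classify_ref "acmecorp/payment-api:v2.4.1"   # prints: pinned
```

A deploy script could call this first and exit when the answer is "floating", which turns the 11pm surprise into a failed pipeline step.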

Tag Strategy 1 — Semantic Versioning

Semantic versioning is the most readable strategy — MAJOR.MINOR.PATCH. A breaking change bumps MAJOR. New backwards-compatible features bump MINOR. Bug fixes bump PATCH. Combined with a latest tag that always points to the current release, this gives you both a rollback path and a convenient "give me the newest" pointer.
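The bump rules can be written down in a few lines of shell. This is a sketch under simple assumptions: bump_version is a hypothetical helper, versions look like v2.4.1 or 2.4.1, and pre-release or build suffixes are not handled.

```shell
# bump_version LEVEL VERSION, e.g. bump_version minor v2.4.1
bump_version() (
  level="$1"
  version="${2#v}"          # accept both v2.4.1 and 2.4.1
  major="${version%%.*}"    # text before the first dot
  rest="${version#*.}"      # text after the first dot
  minor="${rest%%.*}"
  patch="${rest#*.}"
  case "$level" in
    major) major=$((major + 1)); minor=0; patch=0 ;;   # breaking change
    minor) minor=$((minor + 1)); patch=0 ;;            # backwards-compatible feature
    patch) patch=$((patch + 1)) ;;                     # bug fix
    *) echo "usage: bump_version major|minor|patch VERSION" >&2; exit 1 ;;
  esac
  echo "v${major}.${minor}.${patch}"
)

bump_version patch v2.4.1   # prints: v2.4.2
bump_version minor v2.4.1   # prints: v2.5.0
bump_version major v2.4.1   # prints: v3.0.0
```

Note that a higher-level bump resets everything below it to zero, which is what keeps version numbers ordered.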

# Build and tag with semantic version
docker build -t acmecorp/payment-api:v2.4.1 .

# Also tag as latest — latest always points to the current release
docker tag acmecorp/payment-api:v2.4.1 acmecorp/payment-api:latest

# Also tag the minor version — so :v2.4 always points to the latest patch
docker tag acmecorp/payment-api:v2.4.1 acmecorp/payment-api:v2.4

# Push all three tags — same layers, three pointers
docker push acmecorp/payment-api:v2.4.1
docker push acmecorp/payment-api:v2.4
docker push acmecorp/payment-api:latest

The push refers to repository [docker.io/acmecorp/payment-api]
3a7f2c9e1b4d: Pushed
8b1c4e7a9d2f: Pushed
a3b7c9d1e5f2: Layer already exists
v2.4.1: digest: sha256:9c1e3a5b7d...
v2.4: digest: sha256:9c1e3a5b7d...   ← same digest as v2.4.1
latest: digest: sha256:9c1e3a5b7d... ← same digest: three tags, one image

What just happened?

All three tags share the same digest — sha256:9c1e3a5b7d... — confirming they point to the exact same image. Only two layers actually uploaded because the rest were already in the registry from a previous push. Three tags, one set of stored layers. Now you have v2.4.1 for exact pinning, v2.4 for "latest patch of the 2.4 minor version", and latest for "current production release." Three different use cases, all served by the same underlying image.
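The three-pointer scheme above can be derived from the full version alone, so nobody has to type the minor tag by hand. A minimal sketch, assuming a hypothetical helper named tags_for_release and a version of the form vMAJOR.MINOR.PATCH; the loop only prints the push commands rather than running them.

```shell
# tags_for_release VERSION: print every tag a release should carry, one per line.
tags_for_release() (
  version="$1"                # e.g. v2.4.1
  minor_tag="${version%.*}"   # strip the final .PATCH, giving v2.4
  printf '%s\n' "$version" "$minor_tag" "latest"
)

# Print (rather than run) the push commands for one release.
for tag in $(tags_for_release v2.4.1); do
  echo "docker push acmecorp/payment-api:${tag}"
done
# prints:
# docker push acmecorp/payment-api:v2.4.1
# docker push acmecorp/payment-api:v2.4
# docker push acmecorp/payment-api:latest
```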

Tag Strategy 2 — Git Commit SHA

Semantic versions require a human to decide what changed and bump the version manually. In a fast-moving team pushing to CI dozens of times a day, that's friction. The git commit SHA strategy tags every image build with the exact git commit that produced it — no human decision required, fully automatic, and completely traceable.

If production is broken, you check which commit SHA the running container was built from, git log to that commit, see exactly what changed, and roll back by deploying the previous SHA's image. The entire trail from deployment back to source code is one lookup.

# In a CI/CD pipeline — use the git commit SHA as the tag
GIT_SHA=$(git rev-parse --short HEAD)    # short SHA, e.g. a3f2c8d
GIT_SHA_FULL=$(git rev-parse HEAD)       # full 40-character SHA, e.g. a3f2c8d91e44b7e1a4c52f... (shortened here)

# Build and tag with the short SHA
docker build -t acmecorp/payment-api:${GIT_SHA} .

# Also tag as latest for convenience
docker tag acmecorp/payment-api:${GIT_SHA} acmecorp/payment-api:latest

docker push acmecorp/payment-api:${GIT_SHA}
docker push acmecorp/payment-api:latest

echo "Image built: acmecorp/payment-api:${GIT_SHA}"

Image built: acmecorp/payment-api:a3f2c8d

# Production deployment: the tag tells you exactly what code is running
docker run -d \
  --name payment-api \
  acmecorp/payment-api:a3f2c8d

# Something breaks: find the previous SHA
git log --oneline -5
# a3f2c8d  feat: add payment retry logic  ← current (broken)
# 7b1e9c4  fix: correct tax calculation    ← previous (known good)
# 2d5f8a1  refactor: extract payment utils

# Rollback: remove the broken container, then start the previous image
docker stop payment-api
docker rm payment-api    # frees the name; otherwise docker run --name payment-api fails
docker run -d --name payment-api acmecorp/payment-api:7b1e9c4

What just happened?

Every image in the registry is tied to a specific git commit. When the retry-logic commit broke production, the rollback was: find the previous commit SHA from git log, remove the broken container, and start the previous SHA's image. Total time: seconds to a couple of minutes, depending on whether that image is already cached on the host. No guessing which version was running before. No archaeology. No "does anyone remember what we deployed last week?" The SHA is the version, and the version stays in the registry for as long as your retention policy keeps it.
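The "find the previous SHA" step can be scripted instead of read off the screen. The sketch below assumes a hypothetical helper named previous_sha, and it assumes every commit on the branch was built and pushed, so the line after the current commit in git log --oneline is the last deployed image. The sample log is the one from this lesson.

```shell
# previous_sha CURRENT_SHA: read `git log --oneline` output on stdin and
# print the SHA on the line that follows CURRENT_SHA.
previous_sha() {
  awk -v current="$1" '
    found { print $1; exit }        # first line after the match
    $1 == current { found = 1 }     # remember that the current commit was seen
  '
}

previous_sha a3f2c8d <<'EOF'
a3f2c8d feat: add payment retry logic
7b1e9c4 fix: correct tax calculation
2d5f8a1 refactor: extract payment utils
EOF
# prints: 7b1e9c4
```

In a real repository the input would come from a pipe, for example git log --oneline -5 | previous_sha a3f2c8d. If the current SHA is the oldest line in the input, nothing is printed, which a rollback script should treat as an error.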

Tag Strategy 3 — Environment Tags

Environment tags signal intent — which image is currently deployed where. A CI pipeline promotes an image through environments by retagging it at each stage. The image doesn't change. Only its tag does.

# Build with both a SHA tag and an environment tag
GIT_SHA=$(git rev-parse --short HEAD)

docker build -t acmecorp/payment-api:${GIT_SHA} .
docker tag acmecorp/payment-api:${GIT_SHA} acmecorp/payment-api:staging
docker push acmecorp/payment-api:${GIT_SHA}
docker push acmecorp/payment-api:staging
# staging server pulls :staging — always gets the latest staging build

# After QA signs off — promote to production by retagging
docker pull acmecorp/payment-api:${GIT_SHA}
docker tag acmecorp/payment-api:${GIT_SHA} acmecorp/payment-api:production
docker push acmecorp/payment-api:production
# production server pulls :production — image hasn't changed, only the pointer has

# The registry now has three meaningful tags:
# :a3f2c8d   → immutable — the exact build
# :staging   → mutable — currently in staging
# :production → mutable — currently in production

Environment promotion — one image, moving tag pointers

CI builds (docker build) → :a3f2c8d, :staging
↓ QA passes
Promote (docker tag + push) → :a3f2c8d, :staging, :production

The SHA tag is immutable by convention: the pipeline writes it once and never moves it. The environment tags are mutable pointers that move as the image is promoted. Both are needed: SHA for traceability, environment tags for deployment targeting.
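The promotion step is small enough to wrap in a function so that every environment is promoted the same way. This is a sketch, not a finished tool: promote and the DOCKER variable are hypothetical names, and the function defaults to a dry run that prints the commands. Setting DOCKER=docker would run them for real.

```shell
# DOCKER is a hypothetical override; the default only prints each command.
DOCKER="${DOCKER:-echo docker}"

# promote IMAGE SHA ENVIRONMENT: move the environment tag onto an existing SHA-tagged image.
promote() (
  image="$1"; sha="$2"; env="$3"
  # Chain with && so a failed pull or tag never leads to a push.
  $DOCKER pull "${image}:${sha}" &&
  $DOCKER tag "${image}:${sha}" "${image}:${env}" &&
  $DOCKER push "${image}:${env}"
)

promote acmecorp/payment-api a3f2c8d production
# dry-run output:
# docker pull acmecorp/payment-api:a3f2c8d
# docker tag acmecorp/payment-api:a3f2c8d acmecorp/payment-api:production
# docker push acmecorp/payment-api:production
```

Because the function never rebuilds, the image that reaches production is byte-for-byte the one QA tested.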

Immutable Tags with Digest References

Even a semver tag like v2.4.1 is technically mutable — someone with push access can overwrite it with a different image. For truly immutable production deployments, reference images by their digest.

# Get the digest of an image after pushing
docker push acmecorp/payment-api:v2.4.1
# Output includes: digest: sha256:9c1e3a5b7d2f8e4a6c0b2d4f6a8c0e2f...

# Reference by digest in production deployments — immutable forever
docker run -d \
  acmecorp/payment-api@sha256:9c1e3a5b7d2f8e4a6c0b2d4f6a8c0e2f4a6b8c0d2e4f6a8b0c2d4e6f8a0b2c4
# This EXACT image — not whatever :v2.4.1 points to today

# In a Compose file
services:
  payment-api:
    image: acmecorp/payment-api@sha256:9c1e3a5b7d2f8e4a6c0b2d4f...
    # This service will always run this exact image, forever
    # Even if someone overwrites the :v2.4.1 tag, this digest reference is unaffected

What just happened?

The digest reference @sha256:9c1e3a5b7d... is a cryptographic fingerprint of the image content. It is the SHA-256 hash of the image manifest, which in turn lists the digest of the image config and of every layer, so if even one byte changes anywhere, the digest changes. No human can assign it: it is derived from the content itself, and the registry reports it when the image is pushed. Using a digest in a production deployment guarantees that nobody, not even someone with registry admin access, can silently swap the image under you by retagging. This is the gold standard for production pinning in regulated industries and security-conscious environments.
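A digest is only useful if it is copied in full, so a deploy script can check its shape before using it. The sketch below assumes two hypothetical helpers, is_valid_digest and pinned_ref. The digest in the demo is the SHA-256 of empty input, used purely as a well-formed placeholder and not as a real image digest.

```shell
# is_valid_digest DIGEST: succeed only for "sha256:" plus exactly 64 lowercase hex characters.
is_valid_digest() (
  digest="$1"
  hex="${digest#sha256:}"
  [ "$hex" != "$digest" ] || exit 1    # "sha256:" prefix missing
  [ "${#hex}" -eq 64 ] || exit 1       # SHA-256 is 32 bytes, i.e. 64 hex characters
  case "$hex" in
    *[!0-9a-f]*) exit 1 ;;             # a non-hex character was found
  esac
)

# pinned_ref IMAGE DIGEST: print IMAGE@DIGEST, or fail if the digest is malformed.
pinned_ref() (
  image="$1"; digest="$2"
  if is_valid_digest "$digest"; then
    echo "${image}@${digest}"
  else
    echo "refusing to pin: malformed or truncated digest" >&2
    exit 1
  fi
)

# Placeholder digest (SHA-256 of empty input), not a real image.
pinned_ref acmecorp/payment-api \
  "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
```

For an image that has already been pushed or pulled, docker inspect --format '{{index .RepoDigests 0}}' followed by the image name prints the full name@sha256 reference, which avoids copying the digest by hand.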

Putting It Together — A Complete Tagging Pipeline

The scenario: You're a platform engineer setting up a CI/CD pipeline for a team of fifteen developers. You need a tagging strategy that gives every build a traceable identity, supports environment promotion, and enables instant rollback. Here's the pipeline script that runs on every merge to main.

#!/bin/bash
# ci-build-and-push.sh: runs on every merge to main
set -euo pipefail   # abort on the first failed command or unset variable

IMAGE_NAME="acmecorp/payment-api"
GIT_SHA=$(git rev-parse --short HEAD)         # e.g. a3f2c8d
GIT_BRANCH=$(git rev-parse --abbrev-ref HEAD) # e.g. main
BUILD_DATE=$(date -u +"%Y-%m-%d")             # e.g. 2024-01-15

# Build once, tag multiple times
docker build \
  --label "git.commit=${GIT_SHA}" \
  --label "build.date=${BUILD_DATE}" \
  --label "git.branch=${GIT_BRANCH}" \
  -t "${IMAGE_NAME}:${GIT_SHA}" .

# Apply all tags pointing to the same image
docker tag "${IMAGE_NAME}:${GIT_SHA}" "${IMAGE_NAME}:latest"
docker tag "${IMAGE_NAME}:${GIT_SHA}" "${IMAGE_NAME}:${GIT_BRANCH}-${GIT_SHA}"
# e.g. acmecorp/payment-api:main-a3f2c8d

# Push all tags
docker push "${IMAGE_NAME}:${GIT_SHA}"
docker push "${IMAGE_NAME}:latest"
docker push "${IMAGE_NAME}:${GIT_BRANCH}-${GIT_SHA}"

echo "Built and pushed: ${IMAGE_NAME}:${GIT_SHA}"
echo "Deploy with: docker run ${IMAGE_NAME}:${GIT_SHA}"

[+] Building 3.1s (9/9) FINISHED          ← fast build — most layers cached
 => CACHED [1/5] FROM node:18-alpine      ← base image cached
 => CACHED [2/5] WORKDIR /app             ← cached
 => CACHED [3/5] COPY package*.json ./    ← cached — no dep changes
 => CACHED [4/5] RUN npm install          ← cached — no dep changes
 => [5/5] COPY . .                        ← only this layer rebuilt

The push refers to repository [docker.io/acmecorp/payment-api]
3a7f2c9e1b4d: Pushed                      ← only the changed layer
8b1c4e7a9d2f: Layer already exists
a3f2c8d91e44: Layer already exists
a3f2c8d: digest: sha256:9c1e3a5b7d...

Built and pushed: acmecorp/payment-api:a3f2c8d
Deploy with: docker run acmecorp/payment-api:a3f2c8d

What just happened?

The pipeline built the image in 3.1 seconds — the layer cache means only the changed source code layer actually rebuilt. Three tags were applied pointing to the same image layers. Only one layer actually pushed — the others already existed in the registry. The --label flags baked metadata into the image — commit SHA, branch, build date — all readable via docker inspect. Any engineer can now pull any image, inspect its labels, and know exactly what git commit it was built from and when.
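The branch-plus-SHA tag works for main, but a branch such as feature/retry-logic contains a character that is not allowed in a tag, and a CI checkout in detached-HEAD state reports the branch as HEAD. A minimal sketch of a guard, assuming a hypothetical helper named sanitize_tag and a made-up branch name:

```shell
# sanitize_tag NAME: map an arbitrary string (e.g. a branch name) to a valid Docker tag.
# Step 1: replace anything outside letters, digits, "_", "." and "-" with "-".
# Step 2: strip leading "." and "-", which a tag may not start with.
# Step 3: cut to the 128-character limit for tags.
sanitize_tag() {
  printf '%s' "$1" |
    tr -c 'A-Za-z0-9_.-' '-' |
    sed 's/^[.-]*//' |
    cut -c1-128
}

GIT_BRANCH="feature/retry-logic"   # hypothetical branch name
GIT_SHA="a3f2c8d"
echo "acmecorp/payment-api:$(sanitize_tag "$GIT_BRANCH")-${GIT_SHA}"
# prints: acmecorp/payment-api:feature-retry-logic-a3f2c8d
```

The labels baked in by the pipeline can be read back later with docker inspect --format '{{ index .Config.Labels "git.commit" }}' followed by the image name, so the original branch name is still recoverable even when the tag had to be sanitized.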

Teacher's Note

Pick one strategy and be consistent. A registry with images tagged v1.2.3, a3f2c8d, latest, dev, test, new, and final2 is worse than using latest for everything — at least latest is predictably wrong. One tag format, enforced in the pipeline, used everywhere.
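Enforcing one format in the pipeline can be as simple as a pattern check that runs before the push. The sketch below assumes a hypothetical helper named tag_format that recognises the two immutable formats from this lesson (vMAJOR.MINOR.PATCH and a 7 to 40 character hex SHA) and reports everything else as unknown.

```shell
# tag_format TAG: print "semver", "git-sha" or "unknown".
tag_format() {
  if printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'; then
    echo "semver"
  elif printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{7,40}$'; then
    echo "git-sha"
  else
    echo "unknown"
  fi
}

# Classify the mixed bag of tags from the note above.
for tag in v1.2.3 a3f2c8d latest dev test new final2; do
  printf '%-8s %s\n' "$tag" "$(tag_format "$tag")"
done
```

A pipeline that has settled on SHA tags could then refuse to push any immutable tag for which tag_format does not print git-sha, while still allowing an explicit short list of pointer tags such as staging and production.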

Practice Questions

1. In a CI pipeline script, which shell command returns the short git commit SHA of the current commit?

2. To reference a Docker image in a way that is completely immutable, even if someone overwrites the tag in the registry, what do you reference it by?

3. Which docker build flag bakes metadata such as a git commit SHA and build date into the image at build time, readable later via docker inspect?

Up Next · Lesson 32

Docker Security Basics

Tagging done — now let's talk about the security issues hiding in the average Docker setup, and the changes that take minutes to make but dramatically reduce your attack surface.
