CI/CD Course
Artifact Management
An artifact is the versioned, immutable output of a build — the packaged form of an application that is stored, tested, and deployed. It might be a Docker image, a compiled binary, a JAR file, a zip archive, or a compiled frontend bundle. Once built, an artifact does not change. It is stored in an artifact registry — a versioned storage system that the pipeline uses to pass build outputs between stages and to retrieve specific versions for deployment. Artifact management is the practice of storing, versioning, promoting, and eventually retiring these outputs in a controlled, traceable way.
Artifacts vs Source Code — A Fundamental Distinction
Source code is what developers write. An artifact is what gets deployed. These are not the same thing, and conflating them creates one of the most common pipeline antipatterns: rebuilding from source at every stage of the pipeline. If staging and production each run their own build from the same commit, they are not necessarily running the same code — environment differences, tool version drift, or a dependency resolution change between builds can produce subtly different outputs.
The correct model, introduced as "build once, deploy many" in Lesson 12, relies entirely on artifacts. The source is built once. The resulting artifact is stored with a version identifier. Every subsequent stage — test, staging, production — retrieves that exact artifact from the registry and deploys it. What the testing team validated is byte-for-byte identical to what reaches production. The artifact registry is the mechanism that makes this possible.
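One way to make "byte-for-byte identical" concrete is to compare content digests. The sketch below is only an illustration: the byte strings are hypothetical stand-ins for real artifact files, and `artifact_digest` is a helper name invented here, not part of any registry API.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Return the SHA-256 digest of an artifact's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical artifact contents; in practice these would be read from a file
# or pulled from the registry.
built = b"app-binary-v1"
pulled_for_staging = b"app-binary-v1"        # the stored artifact, retrieved later
rebuilt_for_production = b"app-binary-v1 "   # a rebuild that differs by one byte

same_artifact = artifact_digest(built) == artifact_digest(pulled_for_staging)
rebuild_matches = artifact_digest(built) == artifact_digest(rebuilt_for_production)

print(same_artifact)    # True
print(rebuild_matches)  # False
```

Retrieving the stored artifact always yields the same digest. A rebuild only matches if every input to the build happened to be identical, which is exactly what the pipeline cannot guarantee.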
The Sealed Container Analogy
A pharmaceutical manufacturer does not produce a batch of medicine, test one vial, then produce a fresh batch for distribution. They produce one batch, seal it, test from that sealed batch, and distribute the same sealed containers that were tested. If the seal is broken or the container is swapped, the test results are meaningless. An artifact is the sealed container. The registry is the warehouse. Deploying from source at each stage is like opening the container and remanufacturing it at every step — the tests no longer apply.
Artifact Registries — Storage, Access, and Trust
An artifact registry is a dedicated storage and distribution service for build outputs. It is not a general file store — it is purpose-built for versioned, typed artifacts with access control, metadata, and often built-in vulnerability scanning. The pipeline pushes artifacts to the registry after a successful build, and pulls from the registry at deployment time.
Common Artifact Registries by Type
Versioning and Immutability — The Two Rules of Artifact Storage
Every artifact must have a unique, meaningful version identifier. Without one, there is no reliable way to know what is running in any environment, roll back to a previous version, or audit what changed between deployments. The two most common versioning schemes in CI/CD are semantic versioning (e.g. v1.4.2) and commit SHA tagging (e.g. app:a3f9c12). In practice, many pipelines use both — a human-readable semantic version for releases and a commit SHA for traceability.
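As a rough sketch of using both schemes together, the helper below returns the tags a build might push: the commit SHA on every build, plus a semantic version when the build is a release. The function name `image_tags` and the two patterns are assumptions made for illustration, and the patterns are deliberately simplified rather than full implementations of the semantic versioning specification.

```python
import re
from typing import List, Optional

# Simplified patterns for illustration only.
SEMVER = re.compile(r"^v\d+\.\d+\.\d+(-[0-9A-Za-z.]+)?$")  # e.g. v1.4.2, v1.4.2-rc1
COMMIT_SHA = re.compile(r"^[0-9a-f]{7,40}$")               # short or full Git SHA


def image_tags(image: str, commit_sha: str, version: Optional[str] = None) -> List[str]:
    """Return the tags to push for one build: SHA always, version for releases."""
    if not COMMIT_SHA.match(commit_sha):
        raise ValueError(f"not a commit SHA: {commit_sha!r}")
    tags = [f"{image}:{commit_sha}"]
    if version is not None:
        if not SEMVER.match(version):
            raise ValueError(f"not a semantic version: {version!r}")
        tags.append(f"{image}:{version}")
    return tags


print(image_tags("app", "a3f9c12", "v1.4.2"))  # ['app:a3f9c12', 'app:v1.4.2']
print(image_tags("app", "a3f9c12"))            # ['app:a3f9c12']
```

Validating the inputs means a floating name such as latest cannot slip in where a version is expected.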
Immutability is the other non-negotiable property. Once an artifact is stored under a version tag, that tag must never be overwritten with a different artifact. A Docker image tagged v1.4.2 must always refer to exactly the same image layers. If a tag can be silently reassigned, the entire chain of evidence from build to deployment collapses: you can no longer say with certainty what was tested or what is running. Many registries can enforce this through a tag-immutability setting (Amazon ECR and Harbor both provide one), but it is often not the default; on Docker Hub, it requires deliberate configuration.
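The rule itself is small enough to sketch in a few lines. The class below is a toy in-memory model, not the behaviour of any particular registry product; `ImmutableRegistry`, `TagAlreadyExists`, and the digest strings are all hypothetical names chosen for the example.

```python
class TagAlreadyExists(Exception):
    """Raised when a push would reassign an existing tag to different content."""


class ImmutableRegistry:
    """Toy registry that maps each tag to exactly one content digest, forever."""

    def __init__(self) -> None:
        self._tags = {}  # tag -> content digest

    def push(self, tag: str, digest: str) -> None:
        existing = self._tags.get(tag)
        if existing is not None and existing != digest:
            raise TagAlreadyExists(f"{tag} already points at {existing}")
        self._tags[tag] = digest  # a new tag, or an idempotent re-push

    def resolve(self, tag: str) -> str:
        return self._tags[tag]


registry = ImmutableRegistry()
registry.push("app:v1.4.2", "sha256:aaa")
registry.push("app:v1.4.2", "sha256:aaa")      # same content: allowed
try:
    registry.push("app:v1.4.2", "sha256:bbb")  # different content: rejected
    overwritten = True
except TagAlreadyExists:
    overwritten = False

print(overwritten)                     # False
print(registry.resolve("app:v1.4.2"))  # sha256:aaa
```

Allowing an identical re-push keeps pipeline retries harmless while still rejecting any attempt to change what the tag means.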
Publishing and Using an Artifact — GitHub Actions
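jobs:
  build-and-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # Lets GITHUB_TOKEN push to ghcr.io
    steps:
      - uses: actions/checkout@v4
      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          push: true
          # First tag: commit SHA, always unique.
          # Second tag: floating pointer to the latest main build.
          # Comments stay outside the block below; inside it they would become part of the tag.
          tags: |
            ghcr.io/${{ github.repository }}:${{ github.sha }}
            ghcr.io/${{ github.repository }}:latest
  deploy-staging:
    needs: build-and-push   # Only runs after the build job succeeds
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: read    # Lets GITHUB_TOKEN pull from ghcr.io
    steps:
      - uses: actions/checkout@v4   # Makes ./scripts/deploy.sh available on the runner
      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Deploy specific artifact to staging
        run: |
          # Pull the exact image that was just built, by SHA, not by 'latest'
          docker pull ghcr.io/${{ github.repository }}:${{ github.sha }}
          # Deploy that image to staging
          ./scripts/deploy.sh staging ${{ github.sha }}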
What just happened?
The build job produced a Docker image tagged with both the commit SHA and latest, then pushed it to the registry. The deploy job referenced the image by its SHA — not latest — so there is no ambiguity about exactly which artifact is being deployed to staging. Traceability from commit to running container is complete.
Artifact Promotion Through Environments
Artifact promotion is the act of moving a specific, tested artifact from one environment to the next — from staging to production, for example — without rebuilding it. The artifact registry makes promotion possible: because the artifact is stored with a stable identifier, any environment can retrieve and deploy it at any time.
Promotion is often gated by manual approval in production pipelines. A human reviews the staging deployment, confirms the behaviour is correct, and approves the promotion. The pipeline then pulls the same artifact — by the same SHA or version tag — and deploys it to production. Nothing is rebuilt. Nothing is recompiled. The artifact that passed testing is the artifact that reaches users.
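A minimal sketch of an approval-gated promotion, assuming a registry that can attach an extra tag to an existing artifact. `Registry`, `add_tag`, and `promote` are hypothetical names for this example, and the digest is made up. The point to notice is that promotion only adds a pointer to existing bytes and never produces new ones.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Toy registry: each tag points at the digest of a stored artifact."""
    tags: dict = field(default_factory=dict)  # tag -> digest

    def add_tag(self, source_tag: str, new_tag: str) -> str:
        """Point new_tag at the digest that source_tag already refers to."""
        digest = self.tags[source_tag]
        self.tags[new_tag] = digest
        return digest


def promote(registry: Registry, sha_tag: str, release_tag: str, approved: bool) -> str:
    """Promote an already-built artifact; refuses to run without approval."""
    if not approved:
        raise PermissionError("promotion requires manual approval")
    return registry.add_tag(sha_tag, release_tag)


registry = Registry({"app:a3f9c12": "sha256:1111"})
digest = promote(registry, "app:a3f9c12", "app:v1.4.2", approved=True)

print(digest)                                                      # sha256:1111
print(registry.tags["app:v1.4.2"] == registry.tags["app:a3f9c12"])  # True
```

Because the release tag resolves to the same digest as the SHA tag, whatever was verified under one name is what gets deployed under the other.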
Artifact Promotion Flow
1. Build: the image is built as app:a3f9c12 and pushed to the registry. The pipeline stores the SHA as a workflow output for downstream jobs.
2. Test: the pipeline pulls app:a3f9c12 from the registry. Unit, integration, and smoke tests run against it. All pass. The artifact is now a verified candidate.
3. Staging: app:a3f9c12 is deployed to staging. QA and product sign off. The artifact is promoted and additionally tagged app:v1.4.2-rc1 to indicate release candidate status.
4. Production: the pipeline pulls app:a3f9c12, the same bytes that were built, tested, and verified on staging, and deploys it to production. It is additionally tagged app:v1.4.2 for the release record.
Retention Policies and Registry Cleanup
A pipeline that builds on every commit will accumulate artifacts rapidly. A busy team might produce dozens of Docker images per day. Without a retention policy, storage costs grow unbounded and the registry becomes difficult to navigate. Most registries support automated cleanup rules — delete untagged images after 7 days, keep the last 10 tagged versions per repository, retain any artifact tagged as a release indefinitely.
Retention decisions have a rollback implication. If a production incident requires rolling back to a version deployed three months ago, that artifact must still exist in the registry. Retention policies should be designed to keep all production-deployed artifacts for at least as long as the rollback window the team has committed to — typically 30 to 90 days minimum, longer for regulated industries.
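The interaction between cleanup rules and the rollback window can be sketched as a selection function. This is an illustrative model rather than any registry's actual policy engine: the `Artifact` fields and `select_for_deletion` are invented for the example, and the push time is used as a stand-in for the deployment time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Artifact:
    tag: str
    pushed_at: datetime
    is_release: bool = False
    deployed_to_production: bool = False


def select_for_deletion(artifacts, now, keep_latest=10, rollback_window_days=90):
    """Return the tags a cleanup run may delete, newest first."""
    newest_first = sorted(artifacts, key=lambda a: a.pushed_at, reverse=True)
    keep_recent = {a.tag for a in newest_first[:keep_latest]}
    cutoff = now - timedelta(days=rollback_window_days)
    deletable = []
    for artifact in newest_first:
        if artifact.is_release or artifact.tag in keep_recent:
            continue  # releases and the most recent builds are always kept
        if artifact.deployed_to_production and artifact.pushed_at >= cutoff:
            continue  # still inside the rollback window
        deletable.append(artifact.tag)
    return deletable


now = datetime(2024, 6, 1)
artifacts = [
    Artifact("app:v1.4.2", now - timedelta(days=200), is_release=True),
    Artifact("app:1111111", now - timedelta(days=45), deployed_to_production=True),
    Artifact("app:2222222", now - timedelta(days=120), deployed_to_production=True),
    Artifact("app:3333333", now - timedelta(days=30)),
    Artifact("app:4444444", now - timedelta(days=1)),
]
deletable = select_for_deletion(artifacts, now, keep_latest=1, rollback_window_days=90)
print(deletable)  # ['app:3333333', 'app:2222222']
```

The build deployed 45 days ago survives even though it is neither a release nor recent, so a rollback to it remains possible.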
Warning: Mutable Tags Make Deployments Unauditable and Rollbacks Unreliable
Using a mutable tag like latest as the sole deployment reference means you can never be certain what is actually running in any environment. If latest is reassigned to a new image between the time staging was deployed and the time production is deployed, staging and production are running different code — with no visible indication of the discrepancy. Always deploy by commit SHA or an immutable version tag. Use latest only as a convenience pointer for human reference, never as a deployment target in your pipeline scripts.
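The drift described in this warning can be reproduced with a toy model in which the registry is a plain dictionary from tag to digest. All tags and digests here are hypothetical.

```python
# Toy registry: tag -> digest of the image the tag currently points at.
registry = {
    "app:a3f9c12": "sha256:aaa",
    "app:latest": "sha256:aaa",
}

# Staging is deployed.
staging_by_latest = registry["app:latest"]
staging_by_sha = registry["app:a3f9c12"]

# A new build lands before production is deployed and moves the floating tag.
registry["app:b71e0d4"] = "sha256:bbb"
registry["app:latest"] = "sha256:bbb"

# Production is deployed an hour later.
production_by_latest = registry["app:latest"]
production_by_sha = registry["app:a3f9c12"]

print(staging_by_latest == production_by_latest)  # False: environments diverged
print(staging_by_sha == production_by_sha)        # True: same artifact in both
```

Nothing in the deployment command changed between the two runs, which is why this kind of discrepancy is so easy to miss.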
Key Takeaways from This Lesson
Deploy by commit SHA or an immutable version tag, never by latest. Mutable floating tags leave no guarantee about what is actually running in any environment; SHA-based references are unambiguous and traceable.
Teacher's Note
Tag every production deployment with both the commit SHA and the semantic version — the SHA gives you traceability back to the exact code, and the version gives you something a human can read in an incident at 2am.
Practice Questions
Answer in your own words — then check against the expected answer.
1. What is the property that prevents an artifact version tag from being overwritten with a different artifact — ensuring that a given version always refers to exactly the same build output?
2. What is the term for moving a tested, verified artifact from one environment to the next — retrieving it from the registry by its stable identifier rather than rebuilding it from source?
3. Instead of using the mutable latest tag as a deployment reference, what unique identifier — generated automatically by Git for every commit — should pipeline scripts use to reference a specific artifact unambiguously?
Lesson Quiz
1. A pipeline deploys to staging using the latest tag, then deploys to production using the same latest tag an hour later. A new build ran between the two deployments and reassigned the tag. What is the result?
2. A production incident occurs and the team needs to roll back to the version deployed 45 days ago. What must be true for this rollback to be possible?
3. A team rebuilds their application from source at the test stage, the staging stage, and the production stage, all from the same Git commit. What is the core problem with this approach?
Up Next · Lesson 15
Automated Testing
The artifact is built and stored — now it needs to be verified. Automated testing is the stage that turns a green build into a deployable release, and a failing test into an early warning before anything reaches users.