CI/CD Lesson 14 – Artifact Management | Dataplexa

Section II · Lesson 14

Artifact Management

In this lesson

What an Artifact Is · Artifact Registries · Versioning & Immutability · Artifact Promotion · Retention & Cleanup

An artifact is the versioned, immutable output of a build — the packaged form of an application that is stored, tested, and deployed. It might be a Docker image, a compiled binary, a JAR file, a zip archive, or a compiled frontend bundle. Once built, an artifact does not change. It is stored in an artifact registry — a versioned storage system that the pipeline uses to pass build outputs between stages and to retrieve specific versions for deployment. Artifact management is the practice of storing, versioning, promoting, and eventually retiring these outputs in a controlled, traceable way.

Artifacts vs Source Code — A Fundamental Distinction

Source code is what developers write. An artifact is what gets deployed. These are not the same thing, and conflating them creates one of the most common pipeline antipatterns: rebuilding from source at every stage of the pipeline. If staging and production each run their own build from the same commit, they are not necessarily running the same code — environment differences, tool version drift, or a dependency resolution change between builds can produce subtly different outputs.

The correct model, introduced as "build once, deploy many" in Lesson 12, relies entirely on artifacts. The source is built once. The resulting artifact is stored with a version identifier. Every subsequent stage — test, staging, production — retrieves that exact artifact from the registry and deploys it. What the testing team validated is byte-for-byte identical to what reaches production. The artifact registry is the mechanism that makes this possible.
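One way to make "byte-for-byte identical" concrete is to compare content digests. The Python sketch below is illustrative only: `artifact_digest` and `same_artifact` are hypothetical helper names, and the byte strings stand in for real build outputs. It assumes two builds of the same commit differ by an embedded timestamp, which is one common source of drift.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Return a content digest in the 'sha256:<hex>' form used by container registries."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def same_artifact(tested: bytes, deployed: bytes) -> bool:
    """True only if both byte strings are identical, judged by their digests."""
    return artifact_digest(tested) == artifact_digest(deployed)


# Hypothetical build outputs: same commit, built twice, differing by a timestamp.
staging_build = b"app a3f9c12 built at 10:00:00"
production_build = b"app a3f9c12 built at 10:05:00"

# Rebuilding per stage: what was tested is not what ships.
rebuilt_matches = same_artifact(staging_build, production_build)

# Build once, deploy many: both stages read the same stored bytes.
stored_artifact = staging_build
promoted_matches = same_artifact(stored_artifact, stored_artifact)

print(rebuilt_matches, promoted_matches)  # False True
```

A registry performs the equivalent check for you when a deployment references an image by digest rather than by a name that can move.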

The Sealed Container Analogy

A pharmaceutical manufacturer does not produce a batch of medicine, test one vial, then produce a fresh batch for distribution. They produce one batch, seal it, test from that sealed batch, and distribute the same sealed containers that were tested. If the seal is broken or the container is swapped, the test results are meaningless. An artifact is the sealed container. The registry is the warehouse. Deploying from source at each stage is like opening the container and remanufacturing it at every step — the tests no longer apply.

Artifact Registries — Storage, Access, and Trust

An artifact registry is a dedicated storage and distribution service for build outputs. It is not a general file store — it is purpose-built for versioned, typed artifacts with access control, metadata, and often built-in vulnerability scanning. The pipeline pushes artifacts to the registry after a successful build, and pulls from the registry at deployment time.

Common Artifact Registries by Type

| Artifact Type | Registry | Notes |
| --- | --- | --- |
| Docker images | Docker Hub, AWS ECR, GHCR | Most common artifact type in modern pipelines. Tagged by version and/or commit SHA. |
| Java JARs / WARs | Nexus, JFrog Artifactory | Maven-compatible format. Also used for internal library distribution across teams. |
| npm packages | npm registry, GitHub Packages | Used for shared internal libraries as well as public open-source packages. |
| Generic files | GitHub Actions Artifacts, S3 | Short-lived pipeline outputs passed between jobs. Not for long-term storage or deployment. |
| Helm charts | OCI registries, ChartMuseum | Kubernetes deployment packages. Often stored alongside Docker images in the same OCI-compatible registry. |

Versioning and Immutability — The Two Rules of Artifact Storage

Every artifact must have a unique, meaningful version identifier. Without one, there is no reliable way to know what is running in any environment, roll back to a previous version, or audit what changed between deployments. The two most common versioning schemes in CI/CD are semantic versioning (e.g. v1.4.2) and commit SHA tagging (e.g. app:a3f9c12). In practice, many pipelines use both — a human-readable semantic version for releases and a commit SHA for traceability.
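The dual-tag scheme can be sketched as a small helper that always emits the commit SHA tag and adds a release version when one is supplied. This is a minimal sketch, not a standard tool: `artifact_tags` and both patterns are assumptions made for illustration, and the release pattern covers only a `v`-prefixed MAJOR.MINOR.PATCH with an optional pre-release suffix such as `-rc1`.

```python
import re
from typing import List, Optional

# Assumed formats: 'v' + MAJOR.MINOR.PATCH (+ optional pre-release), and a
# 7 to 40 character lowercase hex commit SHA.
RELEASE_PATTERN = re.compile(r"v\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?")
SHA_PATTERN = re.compile(r"[0-9a-f]{7,40}")


def artifact_tags(image: str, commit_sha: str, release: Optional[str] = None) -> List[str]:
    """Build the tag list for one artifact: always the SHA, plus a release version if given."""
    if not SHA_PATTERN.fullmatch(commit_sha):
        raise ValueError(f"not a commit SHA: {commit_sha!r}")
    tags = [f"{image}:{commit_sha}"]
    if release is not None:
        if not RELEASE_PATTERN.fullmatch(release):
            raise ValueError(f"not a release version: {release!r}")
        tags.append(f"{image}:{release}")
    return tags


print(artifact_tags("app", "a3f9c12"))            # ['app:a3f9c12']
print(artifact_tags("app", "a3f9c12", "v1.4.2"))  # ['app:a3f9c12', 'app:v1.4.2']
```

Rejecting anything that is not a SHA at this point keeps floating names such as `latest` from slipping in as the primary identifier.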

Immutability is the other non-negotiable property. Once an artifact is stored under a version tag, that tag must never be overwritten with a different artifact. A Docker image tagged v1.4.2 must always refer to exactly the same image layers. If a tag can be silently reassigned, the entire chain of evidence from build to deployment collapses: you can no longer say with certainty what was tested or what is running. Registries do not always enforce this by default. AWS ECR, for example, offers tag immutability as a per-repository setting that must be switched on, and on Docker Hub it likewise requires deliberate configuration.
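The rule a registry applies when immutability is enforced can be sketched with a toy in-memory model. `ImmutableRegistry` and `TagAlreadyExists` are hypothetical names, and the digests are placeholders; a real registry implements this check server-side.

```python
from typing import Dict


class TagAlreadyExists(Exception):
    """Raised when a push would point an existing tag at different content."""


class ImmutableRegistry:
    """Toy in-memory registry mapping tags to content digests."""

    def __init__(self) -> None:
        self._tags: Dict[str, str] = {}

    def push(self, tag: str, digest: str) -> None:
        existing = self._tags.get(tag)
        if existing is not None and existing != digest:
            raise TagAlreadyExists(f"{tag} already points at {existing}")
        # First push, or a repeat push of identical content: both are safe.
        self._tags[tag] = digest

    def resolve(self, tag: str) -> str:
        return self._tags[tag]


registry = ImmutableRegistry()
registry.push("app:v1.4.2", "sha256:aaa")
registry.push("app:v1.4.2", "sha256:aaa")      # same content: accepted
try:
    registry.push("app:v1.4.2", "sha256:bbb")  # different content: rejected
    overwritten = True
except TagAlreadyExists:
    overwritten = False

print(overwritten, registry.resolve("app:v1.4.2"))  # False sha256:aaa
```

Accepting a repeat push of identical content keeps pipeline retries harmless while still refusing any change to what the tag means.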

Publishing and Using an Artifact — GitHub Actions

jobs:
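  build-and-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read               # Needed by actions/checkout
      packages: write              # Lets GITHUB_TOKEN push to GHCR
    steps:
      - uses: actions/checkout@v4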

      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

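      # Two tags are pushed for the same image:
      #   commit SHA: always unique, used for deployment
      #   latest: floating tag that follows the most recent main build
      # Comments stay outside the block scalar; inside it, '#' text would become part of the tag.
      # Image names must be lowercase, so lowercase the repository name first if it contains capitals.
      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: |
            ghcr.io/${{ github.repository }}:${{ github.sha }}
            ghcr.io/${{ github.repository }}:latest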

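  deploy-staging:
    needs: build-and-push          # Only runs after the build job succeeds
    runs-on: ubuntu-latest
    permissions:
      contents: read               # Needed to check out scripts/deploy.sh
      packages: read               # Lets GITHUB_TOKEN pull from GHCR
    steps:
      - uses: actions/checkout@v4  # Each job starts on a fresh runner, so fetch the deploy script

      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Deploy specific artifact to staging
        run: |
          # Pull the exact image that was just built, by SHA and not by 'latest'
          docker pull ghcr.io/${{ github.repository }}:${{ github.sha }}
          # Deploy that image to staging
          ./scripts/deploy.sh staging ${{ github.sha }}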

What just happened?

The build job produced a Docker image tagged with both the commit SHA and latest, then pushed it to the registry. The deploy job referenced the image by its SHA — not latest — so there is no ambiguity about exactly which artifact is being deployed to staging. Traceability from commit to running container is complete.

Artifact Promotion Through Environments

Artifact promotion is the act of moving a specific, tested artifact from one environment to the next — from staging to production, for example — without rebuilding it. The artifact registry makes promotion possible: because the artifact is stored with a stable identifier, any environment can retrieve and deploy it at any time.

Promotion is often gated by manual approval in production pipelines. A human reviews the staging deployment, confirms the behaviour is correct, and approves the promotion. The pipeline then pulls the same artifact — by the same SHA or version tag — and deploys it to production. Nothing is rebuilt. Nothing is recompiled. The artifact that passed testing is the artifact that reaches users.

Artifact Promotion Flow

Build

Source code compiled and packaged. Artifact tagged app:a3f9c12 and pushed to the registry. Pipeline stores the SHA as a workflow output for downstream jobs.

Test

The test job pulls app:a3f9c12 from the registry. Unit, integration, and smoke tests run against it. All pass. The artifact is now a verified candidate.

Staging

app:a3f9c12 deployed to staging. QA and product sign off. The artifact is promoted — tagged additionally as app:v1.4.2-rc1 to indicate release candidate status.

Production

Manual approval granted. The pipeline pulls app:a3f9c12 — the same bytes that were built, tested, and verified on staging — and deploys to production. Additionally tagged app:v1.4.2 for the release record.
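The flow above can be sketched as a small state tracker in which the deployed reference never changes and promotion only adds tags. `Candidate`, `STAGES`, and `promote` are hypothetical names for illustration; a real pipeline would express the same gates through job dependencies and approval rules.

```python
from typing import List

STAGES = ["build", "test", "staging", "production"]


class Candidate:
    """Tracks one artifact, identified by its SHA tag, as it moves through the stages."""

    def __init__(self, image: str, commit_sha: str) -> None:
        self.image = image
        self.reference = f"{image}:{commit_sha}"  # fixed at build time
        self.completed: List[str] = []
        self.extra_tags: List[str] = []

    def promote(self, stage: str, approved: bool = True, tag: str = "") -> str:
        """Record that the artifact reached `stage` and return the reference to deploy."""
        if len(self.completed) == len(STAGES):
            raise ValueError("artifact has already reached production")
        expected = STAGES[len(self.completed)]
        if stage != expected:
            raise ValueError(f"cannot promote to {stage!r}; next stage is {expected!r}")
        if stage == "production" and not approved:
            raise PermissionError("production promotion requires manual approval")
        self.completed.append(stage)
        if tag:
            # An additional tag for the same artifact; nothing is rebuilt.
            self.extra_tags.append(f"{self.image}:{tag}")
        return self.reference


candidate = Candidate("app", "a3f9c12")
deployed = [
    candidate.promote("build"),
    candidate.promote("test"),
    candidate.promote("staging", tag="v1.4.2-rc1"),
    candidate.promote("production", approved=True, tag="v1.4.2"),
]

print(set(deployed))         # {'app:a3f9c12'}
print(candidate.extra_tags)  # ['app:v1.4.2-rc1', 'app:v1.4.2']
```

Every stage receives the same reference, so skipping a stage or deploying to production without approval fails before anything is pulled.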

Retention Policies and Registry Cleanup

A pipeline that builds on every commit will accumulate artifacts rapidly. A busy team might produce dozens of Docker images per day. Without a retention policy, storage costs grow without bound and the registry becomes difficult to navigate. Most registries support automated cleanup rules, for example: delete untagged images after 7 days, keep the last 10 tagged versions per repository, and retain any artifact tagged as a release indefinitely.

Retention decisions have a rollback implication. If a production incident requires rolling back to a version deployed three months ago, that artifact must still exist in the registry. Retention policies should be designed to keep all production-deployed artifacts for at least as long as the rollback window the team has committed to — typically 30 to 90 days minimum, longer for regulated industries.
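The cleanup rules and the rollback constraint can be combined in one selection function. The sketch below is a simplified model under stated assumptions: the `Artifact` fields, the `select_for_deletion` name, and the default thresholds (7 days, 10 versions, a 90-day rollback window) are illustrative, and real registries express such rules in their own policy formats.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Artifact:
    digest: str
    pushed_at: datetime
    tags: Tuple[str, ...] = ()
    is_release: bool = False
    last_in_production: Optional[datetime] = None


def select_for_deletion(
    artifacts: List[Artifact],
    now: datetime,
    untagged_max_age: timedelta = timedelta(days=7),
    keep_tagged: int = 10,
    rollback_window: timedelta = timedelta(days=90),
) -> List[str]:
    """Return the digests that the retention rules allow to be deleted."""
    tagged_newest_first = sorted(
        (a for a in artifacts if a.tags), key=lambda a: a.pushed_at, reverse=True
    )
    recent = {a.digest for a in tagged_newest_first[:keep_tagged]}

    doomed = []
    for a in artifacts:
        if a.is_release:
            continue  # releases are retained indefinitely
        in_window = (
            a.last_in_production is not None
            and now - a.last_in_production <= rollback_window
        )
        if in_window:
            continue  # still a possible rollback target
        if not a.tags:
            if now - a.pushed_at > untagged_max_age:
                doomed.append(a.digest)
        elif a.digest not in recent:
            doomed.append(a.digest)
    return doomed


now = datetime(2024, 6, 1)
artifacts = [
    Artifact("sha256:old-untagged", now - timedelta(days=30)),
    Artifact("sha256:new-untagged", now - timedelta(days=2)),
    Artifact("sha256:old-tagged", now - timedelta(days=200), tags=("a1b2c3d",)),
    Artifact("sha256:rollback", now - timedelta(days=120), tags=("b2c3d4e",),
             last_in_production=now - timedelta(days=40)),
    Artifact("sha256:release", now - timedelta(days=400), tags=("v1.0.0",), is_release=True),
    Artifact("sha256:current", now - timedelta(days=1), tags=("c3d4e5f",)),
]

print(select_for_deletion(artifacts, now, keep_tagged=2))
# ['sha256:old-untagged', 'sha256:old-tagged']
```

The rollback check runs before the age and count rules, so an artifact that was recently in production survives even when it is old by every other measure.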

Warning: Mutable Tags Make Deployments Unauditable and Rollbacks Unreliable

Using a mutable tag like latest as the sole deployment reference means you can never be certain what is actually running in any environment. If latest is reassigned to a new image between the time staging was deployed and the time production is deployed, staging and production are running different code — with no visible indication of the discrepancy. Always deploy by commit SHA or an immutable version tag. Use latest only as a convenience pointer for human reference, never as a deployment target in your pipeline scripts.
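A pipeline can guard against this by refusing to deploy any reference that is not pinned. The check below is a rough sketch: `is_pinned` and the `MUTABLE_TAGS` list are assumptions for illustration, the `acme` image names are made up, and the check only inspects the reference string. It cannot tell whether the registry actually prevents a tag from being reassigned.

```python
# Hypothetical list of tag names a team treats as floating pointers.
MUTABLE_TAGS = {"latest", "stable", "main"}


def is_pinned(reference: str) -> bool:
    """Return True if an image reference names a fixed artifact."""
    if "@sha256:" in reference:
        return True  # digest references are content-addressed
    _, colon, tag = reference.rpartition(":")
    if not colon or "/" in tag:
        # No tag at all, or the only colon belongs to a registry port:
        # Docker would fall back to 'latest'.
        return False
    return tag not in MUTABLE_TAGS


for ref in [
    "ghcr.io/acme/app:a3f9c12",
    "ghcr.io/acme/app:latest",
    "localhost:5000/app",
]:
    print(ref, is_pinned(ref))
```

Running a check like this before the deploy step turns an accidental floating reference into a failed job instead of a silent mismatch between environments.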

Key Takeaways from This Lesson

✓ An artifact is built once and promoted, never rebuilt. The same binary or image that passes testing is the one deployed to production. Rebuilding at each stage breaks the chain of evidence.

✓ Artifacts must be versioned and immutable. A version tag must always refer to the same artifact. Overwriting a tag with a different artifact makes rollback, auditing, and incident response unreliable.

✓ Deploy by commit SHA, not by latest. Mutable floating tags leave no guarantee about what is actually running in any environment. SHA-based references are unambiguous and traceable.

✓ Artifact promotion is the mechanism behind environment progression. A tested artifact moves from staging to production by being pulled from the registry under its stable identifier, not by being rebuilt.

✓ Retention policies must account for rollback windows. Any artifact that has been deployed to production must remain available in the registry for at least as long as the team needs to be able to roll back to it.

Teacher's Note

Tag every production deployment with both the commit SHA and the semantic version — the SHA gives you traceability back to the exact code, and the version gives you something a human can read in an incident at 2am.

Practice Questions

Answer in your own words — then check against the expected answer.

1. What is the property that prevents an artifact version tag from being overwritten with a different artifact — ensuring that a given version always refers to exactly the same build output?

2. What is the term for moving a tested, verified artifact from one environment to the next — retrieving it from the registry by its stable identifier rather than rebuilding it from source?

3. Instead of using the mutable latest tag as a deployment reference, what unique identifier — generated automatically by Git for every commit — should pipeline scripts use to reference a specific artifact unambiguously?

Up Next · Lesson 15

Automated Testing

The artifact is built and stored — now it needs to be verified. Automated testing is the stage that turns a green build into a deployable release, and a failing test into an early warning before anything reaches users.
