Docker Lesson 14 – Docker Image Layers | Dataplexa

Section II · Lesson 14

Docker Image Layers

You've seen the cache saving build time across the last three lessons. Now let's understand exactly why it works — and how to write Dockerfiles that exploit it to the maximum. The difference between a 25-second build and a 2-second build is almost always layer order.

Layer caching is one of Docker's most powerful features — and one of the most misunderstood. Engineers who understand it write fast, efficient builds. Engineers who don't spend their careers waiting for npm install to run on every single code change.

Every Instruction Creates a Layer

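Instructions that change the filesystem (RUN, COPY, ADD) each create a new read-only layer on top of the previous one. FROM does not create a layer of its own; it brings in the base image's existing layers as the starting point. Instructions that only change configuration (ENV, EXPOSE, CMD) record metadata and add zero bytes to the image. WORKDIR usually belongs to that group too, although it adds a small layer when it has to create a directory that does not exist yet.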

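Each layer is identified by a SHA256 digest. To decide whether a step can be reused, Docker compares two things against the previous build: the instruction itself and, for COPY and ADD, a checksum of the files being copied. For RUN, only the command text is compared; Docker does not look at what the command would download or produce. If nothing changed, Docker serves the layer from cache instead of re-executing the instruction. This is the layer cache.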
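
The idea can be sketched with a toy model in shell. This is not Docker's real algorithm: the layer_key function and its three inputs are assumptions made for illustration, and the script assumes sha256sum is installed (it ships with GNU coreutils).

```shell
#!/bin/sh
# Toy model of a build-cache key (NOT Docker's real algorithm):
# key = sha256(parent key + instruction text + checksum of copied content)
layer_key() {
  # $1 = parent layer key, $2 = instruction text, $3 = content checksum (may be empty)
  printf '%s\n%s\n%s\n' "$1" "$2" "$3" | sha256sum | cut -d' ' -f1
}

base=$(layer_key "" "FROM node:18-alpine" "")

# Checksums standing in for two versions of the copied source files
v1=$(printf 'console.log("v1")' | sha256sum | cut -d' ' -f1)
v2=$(printf 'console.log("v2")' | sha256sum | cut -d' ' -f1)

first=$(layer_key "$base" "COPY . ." "$v1")    # first build
again=$(layer_key "$base" "COPY . ." "$v1")    # rebuild, nothing changed
edited=$(layer_key "$base" "COPY . ." "$v2")   # rebuild after editing a file

[ "$first" = "$again" ]   && echo "unchanged inputs: same key, cache hit"
[ "$first" != "$edited" ] && echo "edited file: new key, cache miss"
```

Same instruction plus same content gives the same key, so the step can be reused. Changing either input gives a different key.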

The Photocopier Analogy

Imagine photocopying a 10-page document, then changing only page 7 and copying it again. A smart photocopier remembers that pages 1–6 are identical and re-copies only pages 7–10. Docker's layer cache works much the same way. If you change a line of source code, only the layer that copies that code and the layers after it are re-executed. Everything before it (base OS, runtime, installed packages) comes straight from the cache.

Cache Invalidation — The Critical Rule

The cache works top to bottom. The moment Docker finds a layer whose hash has changed — either the instruction changed or the files it operates on changed — it invalidates that layer and every layer below it. All subsequent instructions re-run from scratch, regardless of whether they changed.

This single rule explains every caching decision in a well-written Dockerfile. Put the things that change least at the top. Put the things that change most at the bottom.
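
The cascade can be simulated with a small shell sketch. It is a toy model, not Docker's implementation: every function name here is hypothetical, the [src=...] suffix is only a stand-in for the contents of the copied files, and sha256sum is assumed to be available.

```shell
#!/bin/sh
# Toy model: each step's key is derived from the previous step's key,
# so one changed step changes the key of every step after it.
step_key() {
  # $1 = previous key, $2 = instruction plus a stand-in for the content it reads
  printf '%s|%s' "$1" "$2" | sha256sum | cut -d' ' -f1
}

# Print the space-separated keys for a list of steps passed as arguments.
build_keys() {
  key=""
  keys=""
  for step in "$@"; do
    key=$(step_key "$key" "$step")
    keys="$keys $key"
  done
  echo "$keys"
}

# Two orderings of the same build; $1 stands in for the source code version.
bad_order() {
  build_keys "FROM node:18-alpine" "COPY . . [src=$1]" \
    "RUN npm install" 'CMD ["node", "server.js"]'
}
good_order() {
  build_keys "FROM node:18-alpine" "COPY package*.json ./ [deps unchanged]" \
    "RUN npm install" "COPY . . [src=$1]" 'CMD ["node", "server.js"]'
}

# Count the keys in the rebuild ($2) that did not exist in the first build ($1).
count_changed() {
  n=0
  for k in $2; do
    case " $1 " in
      *" $k "*) ;;            # key seen before: cache hit
      *) n=$((n + 1)) ;;      # new key: step is rebuilt
    esac
  done
  echo "$n"
}

echo "bad order:  $(count_changed "$(bad_order v1)" "$(bad_order v2)") steps rebuilt"
echo "good order: $(count_changed "$(good_order v1)" "$(good_order v2)") steps rebuilt"
```

With the code copied early, three steps get new keys after an edit, including npm install. With the code copied late, only the last two do.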

Cache invalidation cascades downward

Bad order: a code change busts everything

FROM node:18-alpine            ✓ cached
COPY . .                       ✗ CHANGED
RUN npm install                ✗ re-runs
CMD ["node", "server.js"]      ✗ rebuilt

Every code change re-runs npm install

Good order: only the code layer is rebuilt

FROM node:18-alpine            ✓ cached
COPY package*.json ./          ✓ cached
RUN npm install                ✓ cached
COPY . .                       ✗ CHANGED
CMD ["node", "server.js"]      ✗ rebuilt

npm install stays cached as long as package*.json is unchanged
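
For reference, here is the good order written out as a complete Dockerfile, generated from a shell heredoc so it can be pasted into a terminal. The instructions mirror the build output shown later in this lesson; the scratch directory and the demo_dir variable are assumptions for the sketch.

```shell
#!/bin/sh
# Write the "good order" Dockerfile into a scratch directory.
demo_dir=$(mktemp -d)

cat > "$demo_dir/Dockerfile" <<'EOF'
FROM node:18-alpine
WORKDIR /app

# Dependency manifests first: this layer changes only when dependencies change
COPY package*.json ./
RUN npm install --omit=dev

# Source code last: this is the layer a normal code edit invalidates
COPY . .
CMD ["node", "server.js"]
EOF

# Sanity check: list the COPY and RUN steps with their line numbers.
# The manifests must be copied before npm install, the full source after it.
grep -n -E '^(COPY|RUN)' "$demo_dir/Dockerfile"
```

In a real project the Dockerfile would sit in the project root next to package.json rather than in a temporary directory.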

Seeing the Cache in Action

The scenario: You're a backend developer actively working on the payment API from Lesson 12. You edit a single line in server.js and rebuild. With the correct Dockerfile order, the rebuild should take under 3 seconds. With the wrong order, it would take 20+ seconds every single time.

# First build — everything runs fresh, populates the cache
docker build -t payment-api:v1.3.0 .

# Edit server.js — change one line of code
# Now rebuild — watch which layers are CACHED vs re-run
docker build -t payment-api:v1.3.1 .

First build:
[+] Building 21.4s (9/9) FINISHED
 => [1/5] FROM node:18-alpine                                           3.8s
 => [2/5] WORKDIR /app                                                  0.0s
 => [3/5] COPY package*.json ./                                         0.1s
 => [4/5] RUN npm install --omit=dev                                   14.9s
 => [5/5] COPY . .                                                      0.3s

After editing server.js — second build:
[+] Building 2.1s (9/9) FINISHED
 => CACHED [1/5] FROM node:18-alpine                                    0.0s
 => CACHED [2/5] WORKDIR /app                                           0.0s
 => CACHED [3/5] COPY package*.json ./                                  0.0s
 => CACHED [4/5] RUN npm install --omit=dev                             0.0s
 => [5/5] COPY . .                                                      0.3s

What just happened?

The second build took 2.1 seconds, down from 21.4 on the first. Four out of five steps were served directly from cache. The only step that re-ran was COPY . . because that's the first layer that includes the changed server.js. The npm install layer, the slow one, stayed cached because neither package.json nor package-lock.json changed. That is roughly 10x faster development feedback. Every developer working on this codebase benefits every time they rebuild, all day, every day.

Reducing Image Size — Combining RUN Instructions

Each RUN instruction creates a new layer. If you install a package in one RUN instruction and then delete temporary files in a separate RUN instruction, the deleted files are gone from the final filesystem — but the original layer that contained them still exists in the image and still takes up disk space.

The fix is to chain related commands into a single RUN instruction using &&. Everything in one RUN happens in a single layer — install, use, and clean up all at once, and only the final state of that layer is stored.

FROM ubuntu:22.04

# Bad: three separate layers, so the cleanup layer doesn't reduce size.
# A cached "apt-get update" layer can also leave later installs with a stale package index.
RUN apt-get update
RUN apt-get install -y curl wget build-essential
RUN rm -rf /var/lib/apt/lists/*

# Good (use this instead of the three lines above): single layer,
# cleanup happens before the layer is committed
RUN apt-get update && \
    apt-get install -y curl wget build-essential && \
    rm -rf /var/lib/apt/lists/*
# The && chains commands: if any command fails, the whole RUN fails (fail fast)
# The backslash \ continues the instruction onto the next line for readability
# rm -rf /var/lib/apt/lists/* removes the package index files that apt-get update downloaded (typically tens of MB)

Bad approach — image history:
IMAGE          CREATED BY                                    SIZE
a1b2c3d4e5f6   RUN rm -rf /var/lib/apt/lists/*              0B
<missing>      RUN apt-get install -y curl wget...          148MB
<missing>      RUN apt-get update                           28MB
<missing>      FROM ubuntu:22.04                            77MB
Total: 253MB (cleanup layer adds 0B, but the earlier layers keep every byte they were built with)

Good approach — image history:
IMAGE          CREATED BY                                    SIZE
b9c8d7e6f5a4   RUN apt-get update && apt-get install...     118MB
<missing>      FROM ubuntu:22.04                            77MB
Total: 195MB (cleanup happened inside the same layer — 58MB saved)

What just happened?

The bad approach produced three layers. The cleanup RUN created a new layer with 0 bytes, but the layers underneath still hold the files it deleted and still count toward the total image size. Layers are immutable: you can't shrink a previous layer by deleting files in a later one. The good approach combines everything into one layer. The cleanup happens before the layer is committed to the image, so the downloaded package lists never make it into any layer. The result in this example: a 58 MB smaller image, two fewer layers, and identical functionality.
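
Why a later delete cannot shrink an earlier layer can be shown with a toy model that uses plain directories as layers. This is a sketch, not how Docker stores layers on disk. The file names are made up, and the only real-world detail borrowed is that the OCI image format records a deletion as an empty marker file with a .wh. prefix.

```shell
#!/bin/sh
# Toy model of two image layers as plain directories.
layers=$(mktemp -d)
mkdir "$layers/layer1" "$layers/layer2"

# Layer 1: a stand-in for the package lists written by apt-get update (1 MiB of zeros)
head -c 1048576 /dev/zero > "$layers/layer1/lists.bin"

# Layer 2: a later "RUN rm" cannot modify layer 1. It only records an empty
# whiteout marker that hides the file in the merged filesystem.
: > "$layers/layer2/.wh.lists.bin"

layer_bytes() {
  # Sum the sizes of all regular files in one layer directory
  find "$1" -type f -exec cat {} + | wc -c | tr -d ' '
}

l1=$(layer_bytes "$layers/layer1")
l2=$(layer_bytes "$layers/layer2")
echo "layer 1: $l1 bytes"
echo "layer 2: $l2 bytes"
echo "image:   $((l1 + l2)) bytes (the deleted file is still counted)"
```

The second layer is empty, yet the image total still includes the full megabyte from the first layer. That is the same effect as the 0B cleanup row in the history above.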

The .dockerignore Effect on Layers

A frequently missed connection: the .dockerignore file directly affects layer caching. The COPY . . instruction computes a hash of all the files it copies. If any of those files change — including log files, temporary files, or IDE configs — the hash changes and the layer cache is busted.

A thorough .dockerignore keeps the build context lean and stable, which means the COPY . . layer only invalidates when actual source code changes — not when your editor creates a .DS_Store file or your test runner writes a log.
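
A starter .dockerignore for a Node.js project might look like the sketch below. The entries are assumptions based on the files mentioned in this lesson, so adjust them to your own project. The script writes into a scratch directory only so that it is safe to run anywhere; the real file belongs next to the Dockerfile.

```shell
#!/bin/sh
# Write a starter .dockerignore into a scratch directory.
ctx=$(mktemp -d)

cat > "$ctx/.dockerignore" <<'EOF'
# Dependencies are reinstalled inside the image by npm install
node_modules
# Version-control history is not needed in the image
.git
# Logs and OS or editor files change often and would bust the COPY . . cache
**/*.log
logs
**/.DS_Store
.vscode
.idea
EOF

cat "$ctx/.dockerignore"
```

The **/ prefix matches at any depth in the build context. A bare pattern such as .DS_Store would only match at the top level.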

# Check the size of your build context before and after .dockerignore
docker build --no-cache -t payment-api:v1.3.0 . 2>&1 | grep -E "transferring context|Sending build context"
# BuildKit prints:           "transferring context: X.XXMB"
# The legacy builder prints: "Sending build context to Docker daemon  X.XXkB"
# A build context over 10 MB is a sign that .dockerignore needs work

Without .dockerignore:
[+] Building 18.9s (9/9) FINISHED
 => [internal] load build context                                       2.3s
 => => transferring context: 87.42MB                                    2.1s

With .dockerignore (excluding node_modules, .git, logs):
[+] Building 3.2s (9/9) FINISHED
 => [internal] load build context                                       0.1s
 => => transferring context: 342.8kB                                    0.0s

What just happened?

Without .dockerignore, Docker transferred 87 MB to the daemon before even starting: node_modules, .git history, and log files all packaged up and sent. The legacy builder repeats that transfer on every single build regardless of what changed. BuildKit sends only changed files on later builds, but it still has to scan the whole directory, and a change to any of those files still busts the COPY . . layer. With .dockerignore properly configured, the context drops to 342 KB, roughly 250x smaller, and transfers in milliseconds. Across hundreds of builds per day on a CI server, this compounds into hours of saved time.

Teacher's Note

Two habits that make every Dockerfile better: always chain related RUN commands with && and always clean up in the same layer that made the mess. In the example above, these alone cut the image from 253 MB to 195 MB, about 23%.

Practice Questions

1. When Docker detects a change in one Dockerfile layer, what happens to all the layers that come after it in the build?

2. To chain multiple shell commands into a single RUN layer so that cleanup happens before the layer is committed, you join them with which operator?

3. To keep the npm install layer cached across code changes, which file must be copied into the image before the source code?

Up Next · Lesson 15

Docker Volumes

Your containers are ephemeral — everything inside them disappears on deletion. Volumes are how you keep data alive across the container lifecycle.
