Docker Course
Docker Best Practices
A colleague's Docker image is 1.2 GB and takes 14 minutes to build. Yours is 87 MB and takes 90 seconds. Same application. The difference isn't the code — it's whether you know these practices or not.
Section II covered every major Docker concept — Dockerfiles, images, volumes, networking, environment variables. This lesson distils all of that into a practical set of habits. Think of it as the checklist you run through before shipping any Docker image to production.
Practice 1 — Use Small Base Images
Your base image choice is the single biggest factor in final image size. The full node:18 image is based on Debian and weighs ~950 MB. node:18-slim is a trimmed Debian — ~240 MB. node:18-alpine is based on Alpine Linux — ~127 MB. Same Node version, same npm, three completely different sizes.
Bad Dockerfile
FROM node:18
# Full Debian image — ~950 MB before you add anything
# Ships with gcc, make, python, and hundreds of
# packages your app will never use
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
CMD ["node", "server.js"]
# Final image: ~1.1 GB
Optimised Dockerfile
FROM node:18-alpine
# Alpine image — ~127 MB base
# Minimal OS with only what's needed
# musl libc instead of glibc — small tradeoffs
# for most Node apps, zero impact
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev
COPY . .
CMD ["node", "server.js"]
# Final image: ~167 MB
Alpine caveat: Some npm packages with native addons — particularly those shipping binaries prebuilt against glibc — won't install or compile cleanly against Alpine's musl libc. If you hit build errors with Alpine, try node:18-slim as a middle ground: it's still much smaller than the full image and uses standard glibc.
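If you do need to compile native addons on Alpine rather than fall back to slim, a common pattern is to install the build toolchain temporarily and remove it in the same layer so the final image stays small. A sketch (package names are the usual node-gyp prerequisites, but check what your specific dependencies need):

```dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
# Native addons need a compiler toolchain that Alpine doesn't ship.
# --virtual names the group of packages so one command removes them all,
# keeping install, build, and cleanup in a single layer.
RUN apk add --no-cache --virtual .build-deps python3 make g++ \
    && npm install --omit=dev \
    && apk del .build-deps
COPY . .
CMD ["node", "server.js"]
```

Because the toolchain is added and deleted inside one RUN instruction, it never persists in any layer of the final image.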
Practice 2 — Order Layers for Cache Efficiency
This is the practice that has the most impact on developer experience. Copy dependency manifests first, install dependencies, then copy source code. This way a source code change only busts the last layer — the slow dependency install stays cached.
Cache-Busting Order
FROM node:18-alpine
WORKDIR /app
COPY . .
# Copies everything including source code first
# Any code change busts the cache here
RUN npm install --omit=dev
# npm install re-runs on EVERY code change
# 14-second build, every single time
CMD ["node", "server.js"]
Cache-Friendly Order
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
# Only copies dependency files
RUN npm install --omit=dev
# Cached unless package.json changes
# Only runs fresh when deps actually change
COPY . .
# Source code copied last — changes here
# only bust the final layer
CMD ["node", "server.js"]
Practice 3 — Never Run as Root
By default, processes inside a container run as root. If an attacker exploits your application and gains code execution, they immediately have root access inside the container. Depending on the Docker configuration, this can translate to privileges on the host machine.
The fix is one Dockerfile instruction — create a non-root user and switch to it before the CMD.
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev
COPY . .
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
# addgroup -S → create a system group called appgroup
# adduser -S → create a system user called appuser in that group
# -S means system account — no password, no home directory, no login shell
RUN chown -R appuser:appgroup /app
# Transfer ownership of the /app directory to the new user
# The app process needs to read its own files
USER appuser
# Switch to the non-root user for all subsequent instructions
# The CMD now runs as appuser, not root
EXPOSE 3000
CMD ["node", "server.js"]
# Verify the running process is not root
docker exec payment-api whoami
# → appuser
docker exec payment-api id
# → uid=100(appuser) gid=101(appgroup) groups=101(appgroup)
# uid 100 — not uid 0 (root)
# This container process has minimal privileges
What just happened?
The container process is now running as appuser with uid 100 — not uid 0 (root). If someone exploits the application and gets code execution, they land as appuser with no special privileges — they can't write outside /app, can't install packages, can't read system files, can't interact with Docker directly. One Dockerfile instruction just dramatically reduced the blast radius of any future vulnerability in this application.
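One refinement worth knowing: a separate RUN chown -R copies every file it touches into a new layer, roughly doubling that portion of the image. Since Docker 17.09, COPY accepts a --chown flag that sets ownership at copy time instead. A sketch of the same Dockerfile using that approach (the user must exist before --chown can reference it by name):

```dockerfile
FROM node:18-alpine
# Create the non-root user first so COPY --chown can resolve the name
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app
COPY --chown=appuser:appgroup package*.json ./
RUN npm install --omit=dev
COPY --chown=appuser:appgroup . .
USER appuser
EXPOSE 3000
CMD ["node", "server.js"]
```

Same security outcome as the chown -R version, without the duplicated layer.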
Practice 4 — Use a Thorough .dockerignore
Every file that isn't excluded by .dockerignore gets sent to the Docker daemon as part of the build context on every build. A bloated build context slows every build and risks accidentally baking secrets or unnecessary files into the image.
# .dockerignore — the production-ready template
# Note: comments must be on their own line. A trailing "# comment"
# after a pattern becomes part of the pattern and silently breaks it.

# never copy — npm install runs inside the image
node_modules

# git history has no place in a production image
.git
.gitignore

# meta files — not needed inside the container
.dockerignore
Dockerfile
docker-compose*.yml

# documentation and log files
*.md
*.log

# CRITICAL — never bake secrets into the image
# .env.* catches .env.local, .env.development, .env.test etc.
.env
.env.*

# test coverage reports (Istanbul/nyc)
coverage/
.nyc_output/

# compiled output if building outside Docker
dist/

# OS metadata (macOS, Windows)
.DS_Store
Thumbs.db

# test files have no place in a production image
tests/
__tests__/
*.test.js
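To verify the .dockerignore is actually being applied, one handy trick is a throwaway Dockerfile that copies the entire build context and lists it: anything in the output was sent to the daemon. (The file name context-check.Dockerfile is illustrative.)

```dockerfile
# context-check.Dockerfile — throwaway image that reveals the build context
FROM busybox
COPY . /context
CMD ["find", "/context", "-type", "f"]
```

Build and run it with docker build -f context-check.Dockerfile -t context-check . followed by docker run --rm context-check. If node_modules or .env appears in the listing, your .dockerignore isn't doing its job.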
Practice 5 — Pin Specific Versions
Using latest or unpinned tags for base images is one of the most common causes of "it worked yesterday" failures. An upstream maintainer pushes a new version overnight, your CI/CD pipeline pulls it at 3am, and your builds break in ways that are hard to diagnose because nothing in your code changed.
# Bad — unpinned, will change without warning
FROM node:latest
FROM node:18
FROM postgres:15
# Good — pinned to a specific patch version
FROM node:18.19.0-alpine3.19
FROM postgres:15.4-alpine3.18
# Best for critical production images — pin by digest (immutable)
FROM node:18.19.0-alpine3.19@sha256:8d6421d663a9fe62e6be4f16661e9e9f4f3abfcb92d6b45b6a8f7d2b9c3e1a05
# The @sha256 digest pins to an exact image build — cannot change even if
# the maintainer pushes a new image under the same tag
Practice 6 — One Process Per Container
Each container should run a single process — one concern per container. Don't run your web server, database, and background worker in the same container. This principle keeps containers observable (one set of logs, one health check), replaceable (scale just the component that needs scaling), and maintainable (update one thing at a time).
Bad — Everything in One Container
- Node.js API + PostgreSQL + Redis all in one image
- Must restart everything to update one component
- Can't scale API independently from database
- Logs from three processes mixed together
- If any process crashes, restart the whole container
Good — One Process Per Container
- API container, DB container, Redis container — separate
- Update the API without touching the database
- Scale only the API when traffic spikes
- Clear, isolated logs per service
- Each container has one health check, one restart policy
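In practice this split is usually expressed with Docker Compose, which the next lesson introduces. As a preview, the "good" layout above maps to one service per container, roughly like this (image tags illustrative):

```yaml
services:
  api:
    build: .          # the Node.js API, built from this project's Dockerfile
    ports:
      - "3000:3000"
    depends_on:
      - db
      - cache
  db:
    image: postgres:15.4-alpine3.18
  cache:
    image: redis:7.2-alpine
```

Each service gets its own logs, health check, restart policy, and scaling knob — exactly the properties the list above describes.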
Practice 7 — Use HEALTHCHECK
A container can be running — docker ps shows Up — while the application inside it is completely broken. The process is alive but the web server is stuck in an infinite loop, the database connection failed at startup, or the app is returning 500 errors on every request. Docker doesn't know.
The HEALTHCHECK instruction tells Docker how to test whether the application is actually healthy. Docker runs the command periodically and updates the container's health status — healthy, unhealthy, or starting.
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev
COPY . .
HEALTHCHECK --interval=30s --timeout=10s --start-period=15s --retries=3 \
  CMD wget -q --spider http://localhost:3000/health || exit 1
# --interval=30s → run the check every 30 seconds
# --timeout=10s → fail the check if it takes more than 10 seconds
# --start-period=15s → failures during the first 15 seconds don't count
#   toward the retry limit, giving the app time to start
# --retries=3 → mark unhealthy only after 3 consecutive failures
# wget -q --spider → the actual check — hits /health without downloading the body
#   (Alpine ships BusyBox wget, which lacks GNU-only flags
#   like --no-verbose and --tries)
# || exit 1 → normalise any failure to exit code 1, which Docker reads as unhealthy
EXPOSE 3000
CMD ["node", "server.js"]
CONTAINER ID   IMAGE                STATUS                   PORTS
a3f2c8d91e44   payment-api:v1.2.0   Up 2 minutes (healthy)   0.0.0.0:3000->3000/tcp

# After the health endpoint starts failing:
CONTAINER ID   IMAGE                STATUS                     PORTS
a3f2c8d91e44   payment-api:v1.2.0   Up 5 minutes (unhealthy)   0.0.0.0:3000->3000/tcp
What just happened?
The STATUS column now shows (healthy) or (unhealthy) — not just Up. Container orchestration systems like Docker Swarm and Kubernetes use this health status to make routing decisions. An unhealthy container stops receiving traffic and gets restarted automatically. Without a HEALTHCHECK, a broken container stays in the load balancer rotation and serves errors to users indefinitely. With one, it's removed and replaced without any human intervention.
Best Practices — The Complete Checklist
Before shipping any Docker image, run through this list:
- Small base image (alpine, or slim as a fallback)
- Cache-friendly layer order: dependency manifests before source code
- Non-root USER
- Thorough .dockerignore (especially .env and node_modules)
- Pinned base image versions (patch version, or digest for critical images)
- One process per container
- HEALTHCHECK defined
Teacher's Note
You won't apply every practice on every image immediately — but the non-root user and layer order are the two that pay back the most for the least effort. Start with those two on your next Dockerfile.
Practice Questions
1. The Dockerfile instruction that switches the process running inside the container from root to a non-privileged user is called what?
2. The Dockerfile instruction that tells Docker how to periodically test whether the application inside a container is actually working correctly is called what?
3. Between node:18, node:18-slim, and node:18-alpine, which base image produces the smallest final Docker image?
Quiz
1. A container is running and docker ps shows Up 10 minutes — but users are getting errors. Why is HEALTHCHECK important in this scenario?
2. A security review flags that a production container is running as root. Why is this a significant risk?
3. A Node.js Dockerfile currently runs npm install after COPY . . — every code change triggers a full reinstall. The correct fix is:
Up Next · Lesson 23
Docker Compose Introduction
Section II complete. Section III begins — and the first thing you'll do is stop running containers one by one and start defining your entire application stack in a single file.