Docker Course
Docker Compose YAML Deep Dive
A junior developer once showed me a Compose file with depends_on: db and wondered why the API kept crashing on startup every morning. The database was running. The API was starting. But every night at 3am, when the health check script cycled the containers, the API came up before Postgres finished initialising — and crashed silently. One line in the Compose file would have fixed it. That line is in this lesson.
Lesson 23 gave you a working Compose file. This lesson makes it bulletproof. Every key that matters in the real world — with the reasoning behind it, not just the syntax.
depends_on — The Most Misunderstood Key
This is the one everyone gets wrong the first time. depends_on does not wait for a service to be ready. It waits for the container to start. Those are two completely different things.
PostgreSQL takes 5–10 seconds to initialise after its container starts. The container is running. The process is running. But Postgres isn't accepting connections yet. If your API starts during that window, it tries to connect, gets "connection refused", and either crashes or logs a flood of errors that take ten minutes to debug.
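The failure mode looks like this in YAML — a minimal sketch of the broken short-form dependency (service names are illustrative):

```yaml
# BROKEN: short-form depends_on waits for the db CONTAINER to start,
# not for Postgres to be ready
services:
  api:
    build: .
    depends_on:
      - db  # api starts as soon as the db container exists;
            # Postgres may still be initialising inside it
  db:
    image: postgres:15-alpine
```

This syntax isn't wrong for fast-starting dependencies, but for anything with a real initialisation phase it only guarantees ordering, not readiness.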
The Bakery Analogy
Simple depends_on is like a customer arriving at a bakery the moment the baker unlocks the front door — the shop is open, but the bread isn't out of the oven yet. The customer wants bread. There is no bread. condition: service_healthy is like the baker putting a sign in the window that only flips to "Open" once the bread is actually ready. The customer waits at the door until the sign flips — then walks in and gets exactly what they came for.
```yaml
services:
  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_PASSWORD: secret123
      POSTGRES_DB: orders
    volumes:
      - postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres -d orders"]
      # pg_isready is Postgres's own readiness tool
      # Returns exit code 0 only when the database is accepting connections
      interval: 5s       # check every 5 seconds
      timeout: 5s        # fail if no response in 5 seconds
      retries: 5         # mark unhealthy after 5 consecutive failures
      start_period: 10s  # give Postgres 10 seconds to boot before we start checking

  api:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgresql://postgres:secret123@db:5432/orders
    depends_on:
      db:
        condition: service_healthy
        # THIS is the line that fixes the 3am crash
        # api will not start until pg_isready returns success
        # Without this condition, depends_on only waits for the container to exist

volumes:
  postgres-data:
```
```
# With condition: service_healthy
[+] Running 2/2
 ✔ Container myapp-db-1   Healthy  8.3s
 ✔ Container myapp-api-1  Started  8.6s
# db reached Healthy at 8.3s — api started 0.3s later. Clean.

# Without the condition
[+] Running 2/2
 ✔ Container myapp-db-1   Started  0.8s
 ✔ Container myapp-api-1  Started  1.1s
# api started 0.3s after the db CONTAINER — Postgres not ready
# API logs: "connection refused" — the bug nobody wants to debug at 3am
```
What just happened?
With condition: service_healthy, the db container sat in a starting state while Compose polled pg_isready every 5 seconds. At 8.3 seconds Postgres was ready, the health check passed, the status flipped to Healthy — and only then did Compose allow the API to start. The API connected on the first attempt. Zero errors. This one change makes the startup order deterministic, no matter how fast or slow your host machine is.
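You can watch the health state transition yourself with docker inspect. A quick sketch, assuming the container name myapp-db-1 from the output above:

```shell
# Show the current health state: "starting", "healthy", or "unhealthy"
docker inspect --format '{{.State.Health.Status}}' myapp-db-1

# Show the recent health-check attempts (exit codes, output, timestamps)
docker inspect --format '{{json .State.Health.Log}}' myapp-db-1
```

Run the first command repeatedly during startup and you'll see it flip from "starting" to "healthy" at the moment Compose releases the dependent service.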
The Full Service — Every Key That Matters
Here's a single service definition with every key you'll ever use — annotated with the real-world reasoning behind each one, not just what it does.
```yaml
services:
  order-api:
    # ── Image ──────────────────────────────────────────────────────────────
    build:
      context: .                   # build from Dockerfile in current directory
      dockerfile: Dockerfile.prod  # use a specific Dockerfile (not just "Dockerfile")
    # image: order-api:v1.2.0      # OR use a pre-built image — pick one or the other
    container_name: order-api      # give it a fixed name — useful for scripts and logs
                                   # without this: projectname-order-api-1

    # ── Ports ──────────────────────────────────────────────────────────────
    ports:
      - "3000:3000"            # public API port — expose to host
      - "127.0.0.1:9229:9229"  # Node.js debug port — localhost only, never public

    # ── Config ─────────────────────────────────────────────────────────────
    environment:
      NODE_ENV: production
      PORT: 3000
    env_file:
      - .env  # load secrets from .env — never hardcode passwords here

    # ── Storage ────────────────────────────────────────────────────────────
    volumes:
      - ./logs:/app/logs          # bind mount — write logs to host so they survive restarts
      - app-uploads:/app/uploads  # named volume — user uploads persist across container replacements

    # ── Networking ─────────────────────────────────────────────────────────
    networks:
      - backend-net
      - frontend-net

    # ── Startup ────────────────────────────────────────────────────────────
    depends_on:
      db:
        condition: service_healthy  # wait for real readiness, not just container start
      redis:
        condition: service_started  # redis is fast — container start is fine here
    restart: unless-stopped  # survive crashes and host reboots — manual stop is respected

    # ── Health ─────────────────────────────────────────────────────────────
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 15s  # give the app 15s to boot before checking

    # ── Resources ──────────────────────────────────────────────────────────
    deploy:
      resources:
        limits:
          cpus: "0.5"   # this container gets max half a CPU core
          memory: 512M  # hard limit — if exceeded, the process is killed
        reservations:
          memory: 256M  # guaranteed minimum — Docker won't starve this container
```
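A file this long deserves a syntax check before you run it. docker compose config parses the file, resolves variables, and prints the fully merged configuration; it exits non-zero on an error, which makes it useful in CI:

```shell
docker compose config           # print the resolved, merged configuration
docker compose config --quiet   # validate only: silent on success, error on failure
```

This catches indentation mistakes and typo'd keys in seconds, before a container ever starts.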
volumes — Named, External, and Why the Difference Matters
A team once ran docker compose down -v on a production server because they thought it just stopped the containers. It deleted the database volume. Three weeks of production data gone in under a second. Understanding who owns a volume — Compose or Docker directly — is the difference between a recoverable situation and a very bad day.
```yaml
services:
  db:
    image: postgres:15-alpine
    volumes:
      - postgres-data:/var/lib/postgresql/data         # named volume — Compose owns this
      - ./init-scripts:/docker-entrypoint-initdb.d:ro  # bind mount — init SQL on first run

  api:
    image: order-api:v1.2.0
    volumes:
      - app-uploads:/app/uploads  # named volume — survives container replacement
      - ./src:/app/src            # bind mount — live code reloading in development

volumes:
  postgres-data:
    # empty = Docker defaults. Compose creates it, Compose can delete it with down -v.
  app-uploads:
    driver: local  # explicit but identical to the default
  production-backups:
    external: true            # THIS VOLUME WAS NOT CREATED BY COMPOSE
    name: nightly-backup-vol  # the actual Docker volume name
    # external: true means Compose will NEVER create or delete this volume
    # If it doesn't exist when you run docker compose up — the stack fails to start
    # This is intentional: you don't want Compose accidentally creating an empty backup volume
```
The Critical Distinction
docker compose down removes containers and the network — volumes survive. docker compose down -v also removes all named volumes declared in the Compose file. If your production database volume is declared in the Compose file, down -v deletes it permanently. Use external: true for any volume that holds data you cannot afford to lose — Compose will never touch it, regardless of which flags you use.
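Because Compose refuses to create an external volume, you create it once with the Docker CLI before the first docker compose up. A sketch, using the nightly-backup-vol name from the example above:

```shell
# Create the volume directly in Docker; Compose will reference it but never manage it
docker volume create nightly-backup-vol

# Verify it exists before starting the stack
docker volume inspect nightly-backup-vol

docker compose up -d  # the external volume now resolves and the stack starts
```

From this point on, no Compose flag — not down -v, not anything else — will touch that volume.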
networks — Segmentation That Actually Protects You
Most developers put everything on one network and call it a day. That works — until someone accidentally exposes the database to the wrong service, or a compromised container reaches the database directly. Network segmentation is five extra lines of YAML that add a meaningful security layer.
```yaml
services:
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    networks:
      - frontend-net  # receives traffic from the internet
      - backend-net   # proxies requests to the api

  api:
    build: .
    networks:
      - backend-net  # only on the internal network — nginx can reach it, internet cannot

  db:
    image: postgres:15-alpine
    networks:
      - backend-net  # deepest layer — only api can reach it

networks:
  frontend-net:
    driver: bridge  # standard bridge — can reach the internet
  backend-net:
    driver: bridge
    internal: true  # no outbound internet from this network
                    # api and db cannot call external URLs
                    # a compromised api cannot exfiltrate data to an external server
```
Network segmentation — who can reach whom

- nginx: exposes ports 80/443 and sits on BOTH networks. Only nginx touches the public network; nothing else does. It proxies requests to the api.
- api: internal only — unreachable from the internet.
- db: deepest layer, on a network marked internal: true — no outbound internet, invisible to the outside world.

nginx bridges both networks — it lives in both. api and db only live in backend-net. The internet can only reach nginx.
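You can verify the isolation from inside a running stack. A hedged sketch using the service names above — note that wget and nc must actually exist in your images (BusyBox versions ship with Alpine-based images; your api image may need them installed):

```shell
# From nginx (on frontend-net): outbound internet works
docker compose exec nginx wget -q --spider https://example.com && echo "nginx: internet OK"

# From api (internal backend-net only): outbound traffic is blocked
docker compose exec api wget -q --spider --timeout=5 https://example.com \
  || echo "api: no internet (expected)"

# api can still reach db over the internal network
docker compose exec api nc -z db 5432 && echo "api -> db: OK"
```

The second command failing is the success case — it proves a compromised api could not phone home.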
Profiles — One File, Multiple Contexts
Every project eventually accumulates tools that developers need but production doesn't — a database admin UI, an email catcher, a mock payment gateway. The temptation is to maintain a separate docker-compose.dev.yml. The problem is that the two files drift apart, contradict each other, and after six months nobody trusts either one.
Profiles solve this elegantly — one file, services tagged with their context, activated on demand.
```yaml
services:
  db:
    image: postgres:15-alpine  # no profile → always starts
    environment:
      POSTGRES_PASSWORD: secret123

  api:
    build: .  # no profile → always starts
    ports:
      - "3000:3000"
    depends_on:
      db:
        condition: service_healthy

  pgadmin:
    image: dpage/pgadmin4  # database GUI — dev only
    profiles:
      - dev  # only starts when --profile dev is passed
    ports:
      - "5050:80"
    environment:
      PGADMIN_DEFAULT_EMAIL: dev@local.dev
      PGADMIN_DEFAULT_PASSWORD: dev

  mailhog:
    image: mailhog/mailhog  # catches outgoing emails — dev only
    profiles:
      - dev
    ports:
      - "8025:8025"  # web UI to inspect caught emails

  mock-payment:
    image: stripe/stripe-mock  # fake payment API — dev and testing
    profiles:
      - dev
      - test  # services can belong to multiple profiles
    ports:
      - "12111:12111"
```
```shell
docker compose up -d                  # production: db and api only
docker compose --profile dev up -d    # development: everything
docker compose --profile test up -d   # testing: db, api, mock-payment only
```
```
# docker compose up -d (no profile)
[+] Running 2/2
 ✔ Container myapp-db-1   Started
 ✔ Container myapp-api-1  Started

# docker compose --profile dev up -d
[+] Running 5/5
 ✔ Container myapp-db-1            Started
 ✔ Container myapp-api-1           Started
 ✔ Container myapp-pgadmin-1       Started
 ✔ Container myapp-mailhog-1       Started
 ✔ Container myapp-mock-payment-1  Started
```
What just happened?
Without --profile dev, two containers. With it, five — the full developer environment including the database GUI, email catcher, and fake payment API. The mock-payment service appears in both the dev and test profiles — a service can belong to as many profiles as makes sense. One file. Three different runtime contexts. The production deployment never accidentally starts pgadmin or mailhog because they simply don't activate without an explicit flag.
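You don't have to type --profile on every command. Compose also reads the COMPOSE_PROFILES environment variable, which is handy in a developer's shell profile or a .env file:

```shell
# Equivalent to: docker compose --profile dev up -d
COMPOSE_PROFILES=dev docker compose up -d

# Multiple profiles are comma-separated
COMPOSE_PROFILES=dev,test docker compose up -d
```

Set it once in your local environment and plain docker compose up behaves like dev mode on your machine, while CI and production, which never set the variable, keep getting the lean two-container stack.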
Teacher's Note
condition: service_healthy on depends_on is the single most impactful line you can add to any Compose file with a database. I've seen it fix bugs that teams spent days chasing. Add it to every dependency that takes time to initialise.
Practice Questions
1. To make Compose wait until a service's HEALTHCHECK passes before starting a dependent service, which condition value do you set under depends_on?
2. To tell Compose to use a pre-existing volume that it must never create or delete regardless of which flags are passed, you set which key under the volume definition?
3. The Compose feature that lets you define services only activated in specific contexts — like dev tools that shouldn't run in production — is called what?
Quiz
1. A developer adds depends_on: db to their API service but the API still crashes on startup with connection refused. The root cause is:
2. A production database volume is declared with external: true in the Compose file. A developer runs docker compose down -v. What happens to the volume?
3. A Compose network is configured with internal: true. What restriction does this place on containers attached to it?
Up Next · Lesson 25
Multi-Container Applications
YAML mastered — now let's build a complete real-world five-service application from scratch, with everything wired together correctly the first time.