Docker Course
Docker Compose Volumes
A team deployed a new version of their app with docker compose up --build on a Friday afternoon. Monday morning, every user's uploaded profile photo was gone. The code was fine. The database was fine. But nobody had declared the uploads directory as a volume — and docker compose up --build recreated the container, wiping the writable layer. This lesson makes sure that never happens on your watch.
Compose volumes work the same as Docker volumes — but Compose adds lifecycle management, sharing between services, and backup patterns that manual docker volume commands don't give you out of the box. There's more here than just repeating -v in a YAML file.
How Compose Manages Volume Lifecycle
When you declare a named volume in a Compose file, Compose takes ownership of it. It creates it on first docker compose up. It leaves it alone on docker compose down. It deletes it only when you explicitly ask with docker compose down -v. This is deliberate — stopping an application should never silently delete data.
The volume name Compose uses is prefixed with the project name — just like containers and networks. A volume declared as postgres-data in a project named order-api becomes order-api_postgres-data in Docker's volume list. This scoping prevents two Compose projects from accidentally sharing the same volume.
# See all volumes Compose created for your project
docker volume ls | grep order-api
# Full lifecycle — what happens to volumes at each stage
docker compose up -d # creates order-api_postgres-data if it doesn't exist
docker compose down # stops containers, removes network — volume UNTOUCHED
docker compose up -d # postgres finds existing data — no re-initialisation
docker compose down -v # WARNING: deletes order-api_postgres-data permanently
DRIVER    VOLUME NAME
local     order-api_postgres-data
local     order-api_redis-data
local     order-api_app-uploads

# After docker compose down (no -v flag):
DRIVER    VOLUME NAME
local     order-api_postgres-data   ← still here
local     order-api_redis-data      ← still here
local     order-api_app-uploads     ← still here

# After docker compose down -v:
# (empty — all three deleted permanently)
What just happened?
docker compose down left all three volumes intact — data is always safe from a plain shutdown. The second docker compose up would find existing data in the volumes and pick up exactly where it left off. docker compose down -v deleted everything permanently — there's no undo, no recycle bin, no recovery. That flag is a deliberate nuclear option for resetting a development environment. Never run it on a machine that holds data you need.
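The project-name prefix also means a volume normally belongs to exactly one project. When you deliberately want a volume to outlive a project, or to be shared across projects, you can declare it as external. A minimal sketch, assuming a pre-created volume named shared-uploads (a hypothetical name, not from this lesson's stack):

```yaml
services:
  order-api:
    build: .
    volumes:
      - shared-uploads:/app/uploads   # mounts the pre-existing volume

volumes:
  shared-uploads:
    external: true   # Compose will NOT create, prefix, or delete this volume
    # it must already exist: docker volume create shared-uploads
```

With external: true, even docker compose down -v leaves the volume alone, because Compose never owned it.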
Sharing a Volume Between Services
Multiple services in the same Compose stack can mount the same named volume simultaneously. This is how you build shared file stores — a web server and a thumbnail generator both accessing the same uploads directory, an API and a worker both reading from the same queue directory.
The scenario: Your order platform has an API that accepts file uploads and a worker service that processes those files asynchronously — resizing images, extracting metadata, generating thumbnails. Both need access to the same files. The correct architecture is one shared volume, not two separate volumes with a sync mechanism between them.
services:
order-api:
build: .
ports:
- "3000:3000"
volumes:
- app-uploads:/app/uploads # API writes uploaded files here
- app-logs:/app/logs # API writes logs here
image-worker:
build: ./worker
volumes:
- app-uploads:/worker/uploads # worker reads from the SAME volume
# mounted at a different path — /worker/uploads instead of /app/uploads
# but it's the same underlying storage — same files
log-aggregator:
image: fluent/fluentd:v1.16-1
volumes:
- app-logs:/fluentd/log:ro # log aggregator reads API logs READ-ONLY
# :ro prevents log-aggregator from accidentally writing or deleting logs
volumes:
app-uploads: # one volume, two services, different mount paths
app-logs: # one volume, API writes, aggregator reads-only
# Verify both services see the same files
docker compose exec order-api ls /app/uploads
invoice-001.pdf  receipt-002.jpg  order-003.png

docker compose exec image-worker ls /worker/uploads
invoice-001.pdf  receipt-002.jpg  order-003.png
# Identical — same volume, same files, different mount paths

# API writes a new file
docker compose exec order-api touch /app/uploads/new-order-004.pdf

# Worker sees it immediately
docker compose exec image-worker ls /worker/uploads
invoice-001.pdf  receipt-002.jpg  order-003.png  new-order-004.pdf
What just happened?
Both containers are mounting the same underlying volume — order-api_app-uploads — but at different paths inside each container. When the API writes new-order-004.pdf to /app/uploads, the worker sees it immediately at /worker/uploads because they're pointing at the same directory on disk. No copying, no syncing, no message passing — just a shared filesystem. The :ro flag on the log-aggregator's mount means it can read logs but cannot write or delete them, protecting the API's log output from accidental modification.
Shared volume — one storage, multiple mounts

  order-api ──── /app/uploads (rw) ────┐
                                       ├── named volume: app-uploads
  image-worker ─ /worker/uploads (rw) ─┘

Same volume, different container paths. A file written by order-api appears instantly in image-worker. No sync needed.
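The short mount strings above can also be written in Compose's long mount syntax, which names each field and makes the read-only flag explicit instead of a suffix. A hedged sketch of the log-aggregator mount rewritten this way:

```yaml
services:
  log-aggregator:
    image: fluent/fluentd:v1.16-1
    volumes:
      - type: volume          # long syntax: every field is named
        source: app-logs      # the named volume declared under top-level volumes:
        target: /fluentd/log  # mount path inside the container
        read_only: true       # same effect as the :ro suffix

volumes:
  app-logs:
```

The long form is more verbose but self-documenting, which pays off once a service has more than a couple of mounts.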
Seeding a Volume on First Run
A common need — populate a volume with initial data the first time the stack starts. PostgreSQL handles this natively through its /docker-entrypoint-initdb.d/ convention: any .sql or .sh files mounted there are executed on first initialisation. Many other official images follow the same pattern.
services:
db:
image: postgres:15-alpine
environment:
POSTGRES_USER: orders_user
POSTGRES_PASSWORD: secret123
POSTGRES_DB: orders
volumes:
- postgres-data:/var/lib/postgresql/data
- ./db/init.sql:/docker-entrypoint-initdb.d/01-init.sql:ro
- ./db/seed.sql:/docker-entrypoint-initdb.d/02-seed.sql:ro
# Files in /docker-entrypoint-initdb.d/ run in alphabetical order
# ONLY on first initialisation — when postgres-data volume is empty
# Once the volume has data, these files are ignored entirely
# :ro — read-only, Postgres reads the files but cannot modify them
volumes:
postgres-data:
# First run — volume is empty, init scripts execute
db-1 | /docker-entrypoint.sh: running /docker-entrypoint-initdb.d/01-init.sql
db-1 | CREATE TABLE
db-1 | CREATE INDEX
db-1 | /docker-entrypoint.sh: running /docker-entrypoint-initdb.d/02-seed.sql
db-1 | INSERT 0 50
db-1 | PostgreSQL init process complete; ready for start up.

# Second run — volume already has data, init scripts silently skipped
db-1 | PostgreSQL Database directory appears to contain a database; Skipping initialization
db-1 | database system is ready to accept connections
What just happened?
On the first run, PostgreSQL found an empty data volume and executed both SQL files in alphabetical order — creating tables, indexes, then seeding 50 rows of test data. On every subsequent run, it found existing data and skipped initialisation entirely. This is the cleanest way to give new developers a working local database with real test data — just docker compose up and the schema and seed data are there automatically. The bind-mounted SQL files stay on the host — update them, run docker compose down -v && docker compose up, and the database resets to the new schema.
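Note that docker compose down -v is a blunt instrument: it deletes every named volume in the file, uploads and logs included. If you only want to reset the database, a narrower sequence deletes just the one volume. A sketch, assuming the order-api project name used throughout this lesson:

```shell
# reset_db_volume: stop the stack, delete ONLY the postgres volume,
# then start again so the scripts in /docker-entrypoint-initdb.d/ re-run.
# Assumes docker and docker compose are installed and the stack exists.
reset_db_volume() {
  docker compose down                        # containers and network go; volumes stay
  docker volume rm order-api_postgres-data   # targeted delete; uploads/logs survive
  docker compose up -d                       # empty volume -> postgres re-initialises
}
```

Run it as reset_db_volume from the project directory; the other services' volumes come through untouched.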
Backing Up and Restoring a Compose Volume
Volumes are managed by Docker — their contents live inside Docker's own storage area, not in your project directory, so you can't just copy them around like ordinary files. The correct backup pattern runs a temporary container that mounts the volume and tars its contents to your host.
# Backup a Compose volume to a tar file on the host
docker run --rm \
-v order-api_postgres-data:/data:ro \
-v $(pwd)/backups:/backup \
alpine \
tar czf /backup/postgres-backup-$(date +%Y%m%d).tar.gz -C /data .
# --rm → remove the temporary container when done
# -v order-api_postgres-data → mount the volume to back up (read-only)
# -v $(pwd)/backups:/backup → mount local backups directory
# alpine → tiny image with tar available
# tar czf ... → compress the volume contents into a timestamped archive
# Restore from a backup into a fresh volume
docker run --rm \
-v order-api_postgres-data:/data \
-v $(pwd)/backups:/backup:ro \
alpine \
tar xzf /backup/postgres-backup-20240115.tar.gz -C /data
# Backup output (no output = success with tar)
ls -lh backups/
-rw-r--r--  1 dev  staff   14M Jan 15 09:23 postgres-backup-20240115.tar.gz

# To restore — stop postgres first, restore, then restart
docker compose stop db
docker run --rm \
  -v order-api_postgres-data:/data \
  -v $(pwd)/backups:/backup:ro \
  alpine tar xzf /backup/postgres-backup-20240115.tar.gz -C /data
docker compose start db
What just happened?
A temporary Alpine container mounted the Postgres volume read-only and the local backups/ directory, compressed the entire volume contents into a 14 MB timestamped archive, then deleted itself. The backup is a standard .tar.gz file on your host — copy it anywhere, restore it anywhere. The restore reverses the process: stop the database so nothing is writing to the volume, restore the archive, restart. This pattern works for any volume — uploads, Redis snapshots, application state — not just Postgres.
Stop the Container Before Restoring
Never restore into a volume while the database is running and writing to it. You'll end up with a mix of old and new data that corrupts the database files. Always docker compose stop db before restoring, and docker compose start db after. For production restores, take the entire application offline first.
Teacher's Note
The temporary Alpine container backup pattern is one of those Docker tricks every engineer should have memorised — it works for any volume, any data, any OS, and you'll reach for it more often than you expect.
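Once memorised, the pattern is also easy to wrap in a small script so you never retype the docker run incantation. A sketch that parameterises the volume name (backup_volume and archive_name are hypothetical helper names, and docker must be installed for the docker run step to work):

```shell
# archive_name VOLUME: build a dated backup path, e.g. backups/my-vol-20240115.tar.gz
archive_name() {
  echo "backups/$1-$(date +%Y%m%d).tar.gz"
}

# backup_volume VOLUME: tar the volume's contents into ./backups on the host,
# using the same temporary-Alpine-container pattern shown above
backup_volume() {
  volume="$1"
  archive="$(archive_name "$volume")"
  mkdir -p backups
  docker run --rm \
    -v "${volume}:/data:ro" \
    -v "$(pwd)/backups:/backup" \
    alpine \
    tar czf "/backup/$(basename "$archive")" -C /data .
  echo "Wrote $archive"
}

# Usage: backup_volume order-api_postgres-data
```

Because the volume is mounted :ro, the backup can never corrupt live data, and the dated filename keeps successive backups from overwriting each other.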
Practice Questions
1. The command that stops all Compose services AND permanently deletes all named volumes declared in the Compose file is what?
2. To mount a volume into a container so the container can read its contents but cannot write or delete anything, which suffix do you add to the volume mount path?
3. To automatically run SQL scripts when a PostgreSQL container initialises for the first time, you mount them into which directory inside the container?
Quiz
1. A volume declared as postgres-data in a Compose file for the order-api project appears in docker volume ls as what — and why does Compose name it this way?
2. An API writes uploaded files to /app/uploads. A worker needs to process those same files. The correct Compose pattern is:
3. The correct way to back up a Docker volume to a file on the host machine is:
Up Next · Lesson 28
Docker Registry Concepts
You've been pulling images since Lesson 1 — now let's understand exactly how registries work, the difference between public and private, and how images travel from your build machine to production.