Docker Lesson 15 – Docker Volumes | Dataplexa

Section II · Lesson 15

Docker Volumes

A developer runs a PostgreSQL container for three weeks, carefully building up a local database. Then they run docker rm to clean up — and every table, every row, every migration is gone forever. This lesson is how you make sure that never happens to you.

Containers are ephemeral by design. The moment a container is deleted, its writable layer — and everything stored in it — is permanently gone. For stateless applications this is a feature. For databases, uploaded files, and any data that needs to outlive a container, it's a disaster waiting to happen. Docker Volumes are the solution.

The Problem Volumes Solve

The External Hard Drive Analogy

A Docker volume is like an external hard drive plugged into a container. The container can be deleted and recreated from scratch, but the hard drive stays. Plug the same drive into a new container and all the data is right there, untouched. The container is temporary. The volume persists until you explicitly remove it. This separation is what makes stateful workloads (databases, file stores, caches) safe to run in containers.

Without a volume, data lives in the container's writable layer. Delete the container, lose the data. With a volume, data lives in a special directory managed by the Docker Daemon on the host filesystem — completely outside the container. The container mounts the volume at a specified path and reads and writes through it. When the container is gone, the volume remains.
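The difference can be checked with a throwaway container. The sketch below is one way to do it, assuming the public alpine image can be pulled; the function name survives_rm, the container name demo, and the volume name demo-data are made up for illustration.

```shell
# Sketch: does a file survive `docker rm`?
# $1 is an optional volume spec such as demo-data:/data (hypothetical name).
# With an empty $1 the file lives only in the container's writable layer.
survives_rm() {
  # ${1:+...} adds the -v flag only when a volume spec was passed
  docker run --name demo ${1:+-v "$1"} alpine \
    sh -c 'mkdir -p /data && echo hello > /data/marker'
  # Remove the container; a named volume is left untouched
  docker rm demo >/dev/null
  # A brand-new container tries to read the file back
  docker run --rm ${1:+-v "$1"} alpine cat /data/marker
}
```

Calling survives_rm "" should end with cat reporting that /data/marker does not exist, while survives_rm demo-data:/data should print hello. Remove the demo volume afterwards with docker volume rm demo-data.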

Before and After Volumes

Without a Volume

Data lives in the container's writable layer
docker rm = data permanently deleted
Recreating or replacing the container loses all stored state (a plain restart does not)
No way to share data between containers
No way to back up data independently
Database containers are dangerous to delete

With a Volume

Data lives in a Docker-managed directory on the host
docker rm leaves the volume intact
Container restarts and replacements are safe
Multiple containers can mount the same volume
Volumes can be backed up and restored independently
Database containers can be safely deleted and recreated
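Independent backup and restore can be sketched with a temporary container that mounts the volume next to a host directory and runs tar. This is a common pattern rather than a built-in Docker command; the function names and the archive file name are hypothetical, and a database container should be stopped first so the files on disk are consistent.

```shell
# Sketch: archive a volume's contents into a tarball in a host directory.
# Args: volume name, absolute host directory, archive file name.
backup_volume() {
  docker run --rm \
    -v "$1":/source:ro \
    -v "$2":/backup \
    alpine tar czf "/backup/$3" -C /source .
}

# Sketch: unpack a tarball made by backup_volume into a volume.
# Docker creates the volume automatically if it does not exist yet.
restore_volume() {
  docker run --rm \
    -v "$1":/target \
    -v "$2":/backup:ro \
    alpine tar xzf "/backup/$3" -C /target
}
```

For example, backup_volume postgres-data "$PWD" postgres-data.tar.gz writes the archive into the current directory, and restore_volume postgres-data-copy "$PWD" postgres-data.tar.gz loads it into a second volume.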

Creating and Using Volumes

The scenario: You're a backend developer setting up a local PostgreSQL database for development. You need the data to survive container restarts and deletions — so you can blow away the container when something goes wrong and bring it back without losing your schema and test data.

# Create a named volume explicitly
docker volume create postgres-data
# Docker creates a managed directory on the host to store this volume's data
# The name postgres-data is how you reference it in docker run commands

# List all volumes on your machine
docker volume ls

# Inspect a volume — see where Docker is actually storing the data on the host
docker volume inspect postgres-data

postgres-data

DRIVER    VOLUME NAME
local     postgres-data

[
  {
    "Name": "postgres-data",
    "Driver": "local",
    "Mountpoint": "/var/lib/docker/volumes/postgres-data/_data",
    "Scope": "local",
    "CreatedAt": "2024-01-15T09:23:11Z"
  }
]

What just happened?

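Docker created a named volume and confirmed it by printing the name postgres-data. The docker volume inspect output reveals the Mountpoint, the path where Docker stores this volume's data: /var/lib/docker/volumes/postgres-data/_data. On a Linux host this is a real directory on the host filesystem; with Docker Desktop on macOS or Windows it lives inside the Linux VM that Docker runs, so you won't find it directly on your machine. This directory is managed entirely by the Docker daemon. Avoid editing files there by hand and interact with volumes through containers instead. Driver: local is the built-in default driver, which by default stores the data on the Docker host's own filesystem. Larger setups may back volumes with remote storage such as NFS shares or cloud block storage, often through volume driver plugins.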
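When only one field of the inspect output is needed, for example in a script, the --format flag of docker volume inspect accepts a Go template. A minimal sketch, where the helper name volume_field is hypothetical:

```shell
# Sketch: print a single field from `docker volume inspect`.
# Args: volume name, field name as it appears in the JSON (e.g. Mountpoint).
volume_field() {
  docker volume inspect --format "{{ .$2 }}" "$1"
}
```

volume_field postgres-data Mountpoint prints just the path, and volume_field postgres-data Driver prints the driver name.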

# Run PostgreSQL with the volume mounted
docker run -d \
  --name db-container \
  -e POSTGRES_PASSWORD=secret123 \
  -e POSTGRES_DB=orders \
  -p 5432:5432 \
  -v postgres-data:/var/lib/postgresql/data \
  postgres:15-alpine
# -v postgres-data:/var/lib/postgresql/data
#    left side  → the named volume (postgres-data)
#    right side → the path inside the container where PostgreSQL stores its data
# Docker mounts the volume at that path — all database writes go to the volume

c9d4e8f2a1b3c7e5d9f0a2b4c6e8f0a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8

PostgreSQL Database directory appears to contain a database; Skipping initialization
LOG:  starting PostgreSQL 15.4 on x86_64-pc-linux-musl
LOG:  listening on IPv4 address "0.0.0.0", port 5432
LOG:  database system is ready to accept connections

What just happened?

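The -v postgres-data:/var/lib/postgresql/data flag mounted the named volume at the exact path PostgreSQL uses to store its data files. Every write PostgreSQL makes (creating tables, inserting rows, running migrations) goes to the volume, not the container's writable layer. The long hexadecimal string is the new container's ID, printed by docker run -d. The log lines are what docker logs db-container shows once the volume already holds a database: on the very first run against an empty volume PostgreSQL initializes a new database instead, and "PostgreSQL Database directory appears to contain a database; Skipping initialization" appears on subsequent runs, confirming that the data survived the container being stopped and recreated.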
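The same mount can also be written with the more explicit --mount flag, and docker inspect can confirm what a container actually has mounted. The sketch below wraps both in helper functions whose names are hypothetical; the image, environment variables, and paths are the ones used in this lesson.

```shell
# Sketch: same PostgreSQL container, using --mount instead of -v.
# Args: container name, volume name.
run_postgres_with_mount() {
  docker run -d \
    --name "$1" \
    -e POSTGRES_PASSWORD=secret123 \
    -e POSTGRES_DB=orders \
    -p 5432:5432 \
    --mount "type=volume,source=$2,target=/var/lib/postgresql/data" \
    postgres:15-alpine
}

# Sketch: print a container's mounts as JSON (type, name, destination, ...).
show_mounts() {
  docker inspect --format '{{ json .Mounts }}' "$1"
}
```

After run_postgres_with_mount db-container postgres-data, show_mounts db-container should list a mount of type volume named postgres-data with destination /var/lib/postgresql/data.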

Proving Data Survives Container Deletion

The real test of a volume isn't creating it — it's proving the data survives a container being destroyed.

# Delete the container entirely
docker rm -f db-container
# The container is gone — but the volume still exists

# Confirm the volume survived
docker volume ls

# Recreate the container — mount the same volume
docker run -d \
  --name db-container \
  -e POSTGRES_PASSWORD=secret123 \
  -e POSTGRES_DB=orders \
  -p 5432:5432 \
  -v postgres-data:/var/lib/postgresql/data \
  postgres:15-alpine

# Check the logs — PostgreSQL recognises the existing data
docker logs db-container

db-container

DRIVER    VOLUME NAME
local     postgres-data

c2d3e4f5a6b7c8d9e0f1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d3

PostgreSQL Database directory appears to contain a database; Skipping initialization
LOG:  starting PostgreSQL 15.4 on x86_64-pc-linux-musl
LOG:  database system is ready to accept connections

What just happened?

The container was force-deleted — completely gone. The volume survived, confirmed by docker volume ls still showing postgres-data. A brand new container was created from scratch and mounted the same volume. PostgreSQL's log says "Skipping initialization" — it found existing data in the volume and connected to it rather than creating a fresh empty database. Every table, every row, every index created before the container was deleted is still there. The volume outlived the container. This is volumes working exactly as designed.
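The log line is indirect evidence. A more direct check is to write a row before deleting the container and read it back afterwards. The sketch below assumes the container is named db-container, that the image's default superuser postgres is in use (no POSTGRES_USER was set), and that a table called volume_check is free to use; the helper names and the table name are hypothetical.

```shell
# Sketch: run one SQL statement inside the running container.
pg() {
  docker exec db-container psql -U postgres -d orders -tA -c "$1"
}

# Block until PostgreSQL inside the container accepts connections.
wait_for_pg() {
  until docker exec db-container pg_isready -U postgres >/dev/null 2>&1; do
    sleep 1
  done
}

# Run before `docker rm -f db-container`: leave a marker row behind.
seed_marker() {
  pg "CREATE TABLE IF NOT EXISTS volume_check (note text)"
  pg "INSERT INTO volume_check VALUES ('written before docker rm')"
}

# Run after recreating the container: count the marker rows.
check_marker() {
  wait_for_pg
  pg "SELECT count(*) FROM volume_check"
}
```

If the volume did its job, check_marker reports at least one row in the recreated container. Without the -v flag the table would not exist and psql would report an error instead.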

Anonymous Volumes vs Named Volumes

Docker supports two kinds of volumes, and knowing the difference saves a lot of confusion.

Named vs anonymous volumes

Named volume — -v postgres-data:/var/lib/postgresql/data

You give the volume an explicit name. It persists until you explicitly delete it with docker volume rm. You can reference it by name to mount it in any container. This is the correct choice for any data you care about — databases, uploads, caches.

Anonymous volume — -v /var/lib/postgresql/data

Docker generates a random 64-character hexadecimal string as the volume name. The volume normally persists after the container is deleted (unless the container was started with --rm or removed with docker rm -v), but it is hard to find and reference again: you'd have to go through the volume list and work out which random name holds your data. Avoid anonymous volumes for anything you need to find again.
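If an anonymous volume does end up holding data you need, the container's mount list is the easiest place to recover its name while the container still exists. The sketch below is one way to do that; the helper names are hypothetical, and the name check is only a heuristic based on anonymous volume names being 64 hexadecimal characters.

```shell
# Sketch: list "volume-name destination" pairs for a container's mounts.
container_volumes() {
  docker inspect \
    --format '{{ range .Mounts }}{{ .Name }} {{ .Destination }}{{ println }}{{ end }}' "$1"
}

# Heuristic: does this volume name look auto-generated (64 hex characters)?
is_anonymous_name() {
  case "$1" in
    *[!0-9a-f]*) return 1 ;;   # contains a non-hex character
  esac
  [ "${#1}" -eq 64 ]
}
```

container_volumes db-container shows which volume sits behind each path, and is_anonymous_name can be applied to each name from docker volume ls -q to spot the auto-generated ones.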

Managing Volumes

docker volume ls                    # list all volumes
docker volume inspect postgres-data # detailed info: mountpoint, driver, labels
docker volume rm postgres-data      # delete a specific volume; its data is gone permanently
docker volume prune                 # delete unused volumes (not referenced by any container, running or stopped)
# WARNING: docker volume prune is destructive. On Docker 23.0 and later it removes
# only unused anonymous volumes by default; add --all to include unused named volumes.

DRIVER    VOLUME NAME
local     postgres-data
local     redis-cache
local     app-uploads

Deleted Volumes:
redis-cache
app-uploads

Total reclaimed space: 2.14GB

What just happened?

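docker volume ls showed three volumes. The prune removed the two that no container was using, redis-cache and app-uploads, and reclaimed 2.14 GB of disk space. The postgres-data volume survived because db-container still references it. Prune only removes volumes that no container uses, and a stopped container counts as a user just as a running one does, so it won't delete a volume out from under an existing container. Because redis-cache and app-uploads are named volumes, this output corresponds to an older Docker release or to docker volume prune --all; from Docker 23.0 on, a plain docker volume prune removes only anonymous volumes.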

docker volume prune Is Permanent

docker volume prune asks for confirmation before deleting, but the -f (--force) flag skips the prompt, and deleted volume data cannot be recovered. Always run docker volume ls first and verify you're not about to delete something important. In production, never run prune without knowing exactly what volumes are on the machine.
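Two habits reduce the risk: previewing which volumes are unused, and labelling volumes that must never be pruned. The sketch below uses the dangling filter of docker volume ls and the label filters of docker volume prune; the label key keep and the function names are hypothetical, and on Docker 23.0 or later the prune call would also need --all to consider named volumes.

```shell
# Sketch: create a volume carrying a label that marks it as protected.
create_protected_volume() {
  docker volume create --label keep=true "$1"
}

# Preview: volumes that no container (running or stopped) references.
list_unused_volumes() {
  docker volume ls --filter dangling=true
}

# Prune only unused volumes that do NOT carry the "keep" label.
# Docker still asks for confirmation because -f is not passed.
prune_unprotected() {
  docker volume prune --filter 'label!=keep'
}
```

A typical sequence would be list_unused_volumes, a careful read of the result, and only then prune_unprotected.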

Teacher's Note

Always use named volumes — never anonymous ones. If you can't name it, you can't find it. If you can't find it, you can't back it up or share it between containers.

Practice Questions

1. Which command explicitly creates a named Docker volume before you run a container?

2. To mount a named volume into a container when running docker run, which flag do you use?

3. Which command removes Docker volumes that are not used by any container?

Up Next · Lesson 16

Bind Mounts vs Volumes

Volumes aren't the only way to get data into a container — bind mounts map a specific directory from your host directly into the container. Knowing when to use which one is a decision you'll make on every project.
