Docker Course
Resource Limits
A fintech startup ran twelve microservices on a single production host. One night a bug introduced an infinite loop in the notification service — it pegged one CPU core at 100% and started allocating memory in a tight loop. Within four minutes, the host was unresponsive. All twelve services — including payments — went down simultaneously. The bug was a three-line fix. The outage lasted two hours. Without resource limits, one container's problem is every container's problem.
By default, Docker places no ceiling on how much CPU or memory a container can consume — a single container can take everything the host has. Resource limits are the enforcement layer that turns that default into a guarantee: each container gets its share, none can starve the others, and a runaway process hits a wall instead of becoming a production incident.
Unlimited vs Limited Containers
Default — no limits set
- One container can consume 100% of host CPU
- A memory leak can exhaust all host RAM
- Linux OOM killer terminates random processes — including the Docker Daemon
- Every other container on the host suffers
- No visibility into what each container is actually consuming
- Impossible to capacity plan or right-size the host
With resource limits enforced
- Each container has a guaranteed CPU ceiling
- Memory exhaustion kills only the offending container
- OOM kill is contained — the Daemon and other containers are unaffected
- Other services continue running normally during an incident
- docker stats gives meaningful utilisation percentages
- Predictable resource usage enables accurate host sizing
Memory Limits
Memory is the more critical limit to set. When a process leaks memory or allocates without bound, the Linux kernel's Out-Of-Memory (OOM) killer activates and starts terminating processes to reclaim memory. Without a container-level limit, the OOM killer picks victims across the entire host — it may kill the Docker Daemon itself. With a memory limit, the kernel kills only the process inside the offending container and Docker restarts it according to the restart policy. The blast radius stays contained.
docker run -d \
--name payment-api \
--memory 512m \
--memory-swap 512m \
-p 3000:3000 \
payment-api:v1.2.0
# --memory 512m → hard limit: container cannot exceed 512 MB of RAM
# if it tries, the Linux OOM killer terminates the process
# Docker then restarts the container per its restart policy
# --memory-swap 512m → set swap equal to --memory to disable swap entirely
# swap == memory means 0 MB of swap is available
# prevents the process from silently spilling onto disk
# and masking a memory problem for hours before it crashes
# Confirm limits are applied:
docker inspect payment-api | grep -A4 '"Memory"'
    "Memory": 536870912,
    "MemorySwap": 536870912,
# 536870912 bytes = 512 MB — limit is confirmed.
# Simulate OOM — allocate beyond the limit from inside the container:
docker exec payment-api sh -c "tail /dev/zero"
# tail buffers its (endless) input in RAM, so memory use grows until the limit
# Docker detects the OOM kill and restarts the container:
docker ps
CONTAINER ID   NAME          STATUS
a1b2c3d4e5f6   payment-api   Up 3 seconds (restarted 1 time)
# The container restarted. The host is unaffected. Other containers kept running.
What just happened?
The container exceeded its 512 MB memory limit. Linux OOM-killed the process inside the container. Docker's restart policy brought the container back within seconds. Every other container on the host continued running without interruption. Without the memory limit, that same leak would have consumed all available host RAM, triggered a host-wide OOM event, and potentially taken down every service — including the Docker Daemon itself.
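The byte values that docker inspect reports can be cross-checked with a little arithmetic. The sketch below uses hypothetical helpers (parse_mem, fmt_mem are not part of Docker) that mirror Docker's size suffixes, assuming the binary units (1m = 1024 × 1024 bytes) that the CLI uses:

```python
# Hypothetical helpers mirroring Docker's size suffixes (b, k, m, g).
# Not part of the Docker CLI — just arithmetic to sanity-check inspect output.
def parse_mem(size: str) -> int:
    """Convert a Docker-style size string like '512m' to bytes."""
    units = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}
    s = size.strip().lower()
    if s[-1] in units:
        return int(float(s[:-1]) * units[s[-1]])
    return int(s)

def fmt_mem(n: int) -> str:
    """Render a byte count alongside its MiB equivalent."""
    return f"{n} bytes = {n / 1024**2:.0f} MiB"

print(parse_mem("512m"))   # 536870912 — matches the "Memory" field above
print(fmt_mem(536870912))  # 536870912 bytes = 512 MiB
```

The same conversion explains the earlier inspect output: 536870912 bytes is exactly 512 MiB, confirming the flag was applied.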
CPU Limits
CPU limits work differently from memory limits — exceeding a CPU limit does not kill the process. Instead, the Linux kernel throttles it: the container is still running, but its CPU access is capped. A container at its CPU ceiling simply slows down rather than crashes. This is the correct behaviour for a busy service. The two flags you need are --cpus for the ceiling and --cpu-shares for relative priority when the host is under load.
docker run -d \
--name payment-api \
--cpus 1.5 \
--cpu-shares 512 \
-p 3000:3000 \
payment-api:v1.2.0
# --cpus 1.5 → hard ceiling: the container can use at most 1.5 CPU cores
# on a 4-core host, this means 37.5% of total CPU capacity
# a spinning infinite loop hits this ceiling and is throttled —
# it cannot consume more, but other containers are unaffected
# --cpu-shares 512 → relative weight when the host is under contention
# default is 1024 — setting 512 gives this container half the
# CPU time of a default-weight container when all containers
# are competing simultaneously
# has no effect when the host has idle CPU capacity
# Confirm CPU limits:
docker inspect payment-api | grep -E '"NanoCpus"|"CpuShares"'
    "NanoCpus": 1500000000,
    "CpuShares": 512,
# NanoCpus: 1,500,000,000 = 1.5 CPU cores. Limit confirmed.
# Simulate a CPU spike — run two tight loops inside the container
# (each loop is single-threaded, so one alone can use at most one core):
docker exec -d payment-api sh -c "while true; do :; done"
docker exec -d payment-api sh -c "while true; do :; done"
# Observe that the container is capped at ~1.5 cores:
docker stats payment-api --no-stream
CONTAINER ID   NAME          CPU %    MEM USAGE / LIMIT   MEM %
a1b2c3d4e5f6   payment-api   149.8%   210MiB / 512MiB     41.0%
# CPU is pegged at ~150% (1.5 cores out of 4) — throttled exactly at the limit.
# The other 2.5 cores remain fully available to all other containers on the host.
What just happened?
The infinite loop tried to consume every available CPU cycle on the host. The --cpus 1.5 limit throttled it at exactly 1.5 cores — the container kept running, slowed by throttling, while the remaining 2.5 cores stayed completely available to every other container. The runaway process was fully contained without killing anything or causing any interruption to other services.
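The --cpu-shares weighting is proportional, so the split under contention can be computed directly. A quick sketch (cpu_split is an illustrative helper, not a Docker API) assuming every container is demanding CPU simultaneously:

```python
# Illustrative arithmetic for --cpu-shares: under full contention, CPU time
# is divided proportionally by weight. The default weight is 1024.
def cpu_split(shares: dict[str, int], total_cores: float) -> dict[str, float]:
    """Cores each container receives when all of them want CPU at once."""
    total = sum(shares.values())
    return {name: round(total_cores * w / total, 2)
            for name, w in shares.items()}

# payment-api was started with --cpu-shares 512; a sibling uses the default
print(cpu_split({"payment-api": 512, "worker": 1024}, total_cores=4))
# → {'payment-api': 1.33, 'worker': 2.67}
```

Note that the --cpus ceiling still applies on top of this: even if shares entitle payment-api to more, it can never exceed 1.5 cores, and shares have no effect at all while the host has idle capacity.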
Setting Limits in Docker Compose
Running docker run with flags works for single containers, but in practice most multi-service deployments use Docker Compose. Resource limits belong in the Compose file — declared once, applied consistently across every deployment.
version: "3.8"
services:
api:
image: payment-api:v1.2.0
deploy:
resources:
limits:
cpus: "1.5"
# Hard ceiling — container is throttled if it exceeds this
memory: 512M
# Hard ceiling — OOM kill if exceeded
reservations:
cpus: "0.5"
# Guaranteed minimum CPU — scheduler will not starve this service
memory: 256M
# Guaranteed minimum RAM — Docker will not schedule this container
# on a host that cannot provide at least this much free memory
ports:
- "3000:3000"
db:
image: postgres:15-alpine
deploy:
resources:
limits:
cpus: "2.0"
memory: 1G
reservations:
cpus: "1.0"
memory: 512M
volumes:
- pgdata:/var/lib/postgresql/data
redis:
image: redis:7-alpine
deploy:
resources:
limits:
cpus: "0.5"
memory: 128M
reservations:
cpus: "0.1"
memory: 64M
volumes:
pgdata:
docker compose up -d
[+] Running 3/3
 ✔ Container redis         Started
 ✔ Container postgres-db   Started
 ✔ Container payment-api   Started
# Verify limits are active across all containers:
docker stats --no-stream
CONTAINER ID   NAME          CPU %   MEM USAGE / LIMIT   MEM %
a1b2c3d4e5f6   payment-api   2.4%    198MiB / 512MiB     38.7%
b2c3d4e5f6a7   postgres-db   0.8%    312MiB / 1024MiB    30.5%
c3d4e5f6a7b8   redis         0.1%    8MiB / 128MiB       6.3%
# Each container shows its enforced memory ceiling in the LIMIT column.
# Total allocated: 1664 MiB maximum — fits cleanly on a 4 GB host.
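The "fits on the host" claim is worth checking mechanically when a Compose file grows. A minimal sketch (to_mib is a hypothetical helper, and the limits dict mirrors the Compose file above) that sums declared memory ceilings:

```python
# Hypothetical capacity check: sum the memory limits declared in a Compose
# file and compare against host RAM. Suffixes follow Compose conventions.
UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}

def to_mib(size: str) -> float:
    """Convert a Compose memory string like '512M' or '1G' to MiB."""
    s = size.strip().lower()
    return int(s[:-1]) * UNITS[s[-1]] / 1024**2

# Limits copied from the Compose file above
limits = {"api": "512M", "db": "1G", "redis": "128M"}
total = sum(to_mib(v) for v in limits.values())
print(f"{total:.0f} MiB")  # 1664 MiB — leaves headroom on a 4 GB host
```

If the sum of limits exceeds host RAM, the configuration is overcommitted: any two containers peaking together could still trigger a host-wide memory shortage.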
limits vs reservations — what each does
limits.memory
Hard ceiling. Exceeding it triggers OOM kill. The container is terminated and restarted per its restart policy.
limits.cpus
Hard ceiling. Exceeding it causes throttling — the process slows but does not crash or restart.
reservations.memory
Soft guarantee. Docker will not schedule this container on a host that cannot provide at least this much free memory.
reservations.cpus
Soft guarantee. The scheduler prioritises this container's CPU access under contention — it will not be starved.
Monitoring with docker stats
docker stats is the built-in real-time view of resource consumption across all running containers. Without limits, the LIMIT column shows the total host memory — useless for spotting problems. With limits, it shows each container's individual ceiling and how close it is to hitting it. It's the first tool to reach for during an incident.
# Live stream of all running containers (refreshes every second):
docker stats
# Single snapshot — useful in scripts and CI checks:
docker stats --no-stream
# Filter to a specific container:
docker stats payment-api --no-stream
# Custom output format — machine-readable for monitoring pipelines:
docker stats --no-stream \
--format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}"
# Key signals to watch:
# MEM % approaching 80-90% → container close to OOM kill threshold
# CPU % sustained at limit → throttling is occurring — service is degraded
# NET I/O spike → unexpected traffic or a data leak
# BLOCK I/O spike → unexpected disk activity — check logs or tmp growth
CONTAINER ID   NAME          CPU %   MEM USAGE / LIMIT   MEM %   NET I/O          BLOCK I/O
a1b2c3d4e5f6   payment-api   3.2%    201MiB / 512MiB     39.3%   14.5MB / 8.2MB   0B / 0B
b2c3d4e5f6a7   postgres-db   1.1%    318MiB / 1024MiB    31.1%   2.1MB / 1.4MB    142MB / 88MB
c3d4e5f6a7b8   redis         0.1%    8MiB / 128MiB       6.3%    980kB / 720kB    0B / 0B
# MEM % is the most critical column.
# A container at 85% of its memory limit is one traffic burst away from OOM kill.
# Without limits, the LIMIT column shows total host RAM — percentages are meaningless.
# With limits, every percentage is actionable.
What just happened?
docker stats shows each container's real consumption against its enforced limit. Without limits set, the LIMIT column would display total host RAM — making percentages meaningless and problems invisible until it's too late. With limits, a container at 85% memory is an early warning. At 95%, it's a pager alert. The number is only actionable because there's a ceiling to measure against.
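Those warning thresholds can be automated on top of the machine-readable format shown earlier. A sketch of such a check, assuming it is fed the output of docker stats --no-stream --format "{{.Name}}\t{{.MemPerc}}" (the check function and the 80/95 thresholds are illustrative, not a Docker feature):

```python
# Sketch of a monitoring check over docker stats output.
# Assumed input: one "name<TAB>mem-percent" line per container, e.g. the
# result of: docker stats --no-stream --format "{{.Name}}\t{{.MemPerc}}"
WARN, PAGE = 80.0, 95.0  # illustrative thresholds

def check(stats_output: str) -> list[tuple[str, str]]:
    """Flag containers approaching their memory ceiling."""
    alerts = []
    for line in stats_output.strip().splitlines():
        name, mem_perc = line.split("\t")
        pct = float(mem_perc.rstrip("%"))
        if pct >= PAGE:
            alerts.append((name, "page"))
        elif pct >= WARN:
            alerts.append((name, "warn"))
    return alerts

sample = "payment-api\t86.3%\npostgres-db\t31.1%\nredis\t96.0%"
print(check(sample))  # [('payment-api', 'warn'), ('redis', 'page')]
```

A check like this only works because limits are set: the percentages it compares against are measured relative to each container's enforced ceiling, not the host total.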
How to Choose Your Limits
Setting limits too low causes OOM kills under normal load. Setting them too high defeats the purpose of having limits at all. The correct approach: run without limits first, measure under realistic load with docker stats, then set limits based on observed peak consumption with headroom built in.
# Step 1 — run with no limits and observe peak consumption under realistic load
docker stats --no-stream --format \
"{{.Name}}: CPU={{.CPUPerc}} MEM={{.MemUsage}}"
# Example peaks observed during a load test:
# payment-api: CPU=62% MEM=287MiB / host-total
# postgres-db: CPU=48% MEM=611MiB / host-total
# redis: CPU=4% MEM=22MiB / host-total
# Step 2 — set limits at observed peak + ~50% headroom
# payment-api peak RAM: 287 MiB → limit: 430M
# postgres-db peak RAM: 611 MiB → limit: 900M
# redis peak RAM: 22 MiB → limit: 64M
# Step 3 — re-run the load test with limits active and confirm:
# (a) No OOM kills occur during normal peak traffic
# (b) A runaway process does hit the ceiling without affecting other containers
# (c) docker stats shows sensible percentages — not 95%+ under normal load
# Rule of thumb:
# memory limit = observed peak × 1.5
# cpu limit = observed peak cores × 1.5 (round to nearest 0.25)
# reservation = observed average (not peak) × 0.8
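The rule of thumb above is simple enough to encode. A sketch (suggest is an illustrative helper; the input figures below reuse the payment-api load-test numbers, with the average values assumed for illustration):

```python
# Sketch of the sizing rule of thumb:
#   limit       = observed peak × 1.5 (CPU rounded to nearest 0.25 core)
#   reservation = observed average × 0.8
def suggest(peak_mib: float, avg_mib: float,
            peak_cores: float, avg_cores: float) -> dict:
    return {
        "memory_limit_mib":       round(peak_mib * 1.5),
        "memory_reservation_mib": round(avg_mib * 0.8),
        "cpu_limit":              round(peak_cores * 1.5 * 4) / 4,
        "cpu_reservation":        round(avg_cores * 0.8 * 4) / 4,
    }

# payment-api peaked at 287 MiB / 0.62 cores in the load test;
# the 180 MiB / 0.3 core averages are assumed for this example
print(suggest(peak_mib=287, avg_mib=180, peak_cores=0.62, avg_cores=0.3))
```

The memory limit this produces (430 MiB) matches the payment-api figure in Step 2; treat every output as a starting point to validate in Step 3, not a final answer.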
The Resource Limits Checklist
Before any container goes to production
- Set a --memory or limits.memory value
- Set --memory-swap equal to --memory to prevent silent disk spill
- Set --cpus or limits.cpus to prevent runaway processes from starving the host
- Set restart: unless-stopped in Compose so OOM-killed containers recover automatically
Teacher's Note
Set memory limits first — that is the one that prevents a single container from taking down the entire host. CPU limits are important but the failure mode (throttling) is far less catastrophic than OOM. Start with broad limits based on your best estimate, load test, then tighten. Never leave a production container running without a memory limit. It's the container equivalent of running without a circuit breaker.
Practice Questions
1. To disable swap for a container — preventing it from silently spilling excess memory onto disk — which flag must be set to the same value as --memory?
2. When a container exceeds its --cpus limit, the process is not killed. What happens to it instead?
3. Which Docker CLI command provides a real-time view of CPU and memory consumption across all running containers, showing each container's usage against its enforced limit?
Quiz
1. A host runs eight containers with no resource limits. One container develops a memory leak. What is the worst-case outcome?
2. In a Docker Compose file, what is the practical difference between limits and reservations under deploy.resources?
3. A team is deploying a new service and needs to set its memory limit. They have never run it in production before. What is the correct process?
Up Next · Lesson 35
Logging & Monitoring
Resource limits contain the blast radius of a failure — but you still need to know when that failure happened and why. Logging and monitoring are how you find out: what a container was doing before it crashed, what's degraded right now, and what pattern predicts the next incident.