Kubernetes Lesson 27 – Kubernetes Volumes | Dataplexa
Core Kubernetes Concepts · Lesson 27

Kubernetes Volumes

Every file written inside a container disappears the moment the container restarts. Kubernetes Volumes solve this — but there are a dozen volume types serving very different purposes, and picking the wrong one is one of the most common architecture mistakes teams make when they first try to run stateful workloads in Kubernetes.

Why Container Storage Is Ephemeral by Default

A container's filesystem is a thin writable layer on top of its image. Every file you write during the container's lifetime lives in that layer. When the container crashes and the kubelet restarts it, a brand new writable layer is created from the image — clean slate. Everything written to the previous layer is gone. Permanently.

This is a feature, not a bug — it's what makes containers reproducible and stateless. But it creates a real problem the moment you need anything to survive a restart: application logs, uploaded files, database data, shared state between containers in the same Pod. That's where Volumes come in.

Volume vs PersistentVolume — the key distinction

A Volume (this lesson) is tied to the lifecycle of a Pod — it exists as long as the Pod exists. When the Pod is deleted, the volume is gone too (unless it's backed by persistent storage). A PersistentVolume (Lessons 28–29) exists independently of any Pod and survives Pod deletion, rescheduling, and node failures. This lesson covers Volumes; the next two cover the persistent storage system.

How Volumes Work in a Pod

A Volume is declared in the Pod spec under spec.volumes and then mounted into one or more containers using spec.containers[].volumeMounts. The volume declaration says "what kind of storage and where it comes from." The volumeMount says "which container gets it and at which path inside that container."
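As a minimal sketch of that two-part wiring (the Pod name, volume name, and busybox command here are illustrative, not part of this lesson's scenario):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo              # illustrative name
spec:
  volumes:                       # 1. Declare the storage at the Pod level:
    - name: shared-data          #    "what kind of storage and where it comes from"
      emptyDir: {}
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "echo hello > /data/greeting && sleep 3600"]
      volumeMounts:              # 2. Mount it into a container:
        - name: shared-data      #    must match the volume name declared above
          mountPath: /data       #    "which container gets it, at which path"
```

The same `volumes` entry can be referenced by any number of `volumeMounts` across the Pod's containers, each at its own path.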

Multiple containers in the same Pod can mount the same volume — which is the primary way containers in a Pod share data. This is the sidecar pattern: a main application container writes files to a shared volume, and a log-shipper sidecar reads those same files and forwards them to a centralised logging system.

The Volume Types You'll Actually Use

| Type | What it provides | Survives container restart? | Survives Pod deletion? |
|---|---|---|---|
| emptyDir | Empty directory on the node, shared between containers in a Pod | ✅ Yes | ❌ No — deleted with Pod |
| hostPath | A path on the host node's filesystem mounted into the container | ✅ Yes | ⚠️ Only if Pod stays on same node |
| configMap | ConfigMap keys exposed as files (covered in Lesson 19) | ✅ Yes | ❌ No (but ConfigMap object persists) |
| secret | Secret keys exposed as files in tmpfs (covered in Lesson 20) | ✅ Yes | ❌ No (but Secret object persists) |
| persistentVolumeClaim | Claims durable storage from a PersistentVolume (Lessons 28–29) | ✅ Yes | ✅ Yes — data outlives the Pod |
| projected | Combines multiple sources (Secrets, ConfigMaps, service account tokens) into a single directory | ✅ Yes | ❌ No |

emptyDir: Scratch Space and Sidecar Sharing

emptyDir is the simplest volume type. When a Pod is created, Kubernetes creates an empty directory on the node. The directory is mounted into the container at the specified path. All containers in the Pod that mount the same emptyDir volume share the same directory and can read and write each other's files in real time.

Data in an emptyDir survives container crashes — if the main container crashes and restarts, the emptyDir is still there. But if the entire Pod is deleted or rescheduled to a different node, the emptyDir is gone.

The scenario: You're a DevOps engineer at a media company. The video transcoding service processes uploaded videos and writes segments to a temporary directory. A second sidecar container picks up those segments and streams them to an S3-compatible object store. Neither container should know about the other's internal workings — they communicate purely through a shared filesystem. Here's the multi-container Pod with a shared emptyDir.

apiVersion: v1
kind: Pod
metadata:
  name: video-transcoder
  namespace: production
  labels:
    app: video-transcoder
spec:
  volumes:
    - name: scratch-space          # Declare the volume — referenced by both containers below
      emptyDir:                    # emptyDir: empty directory, created when Pod starts
        medium: ""                 # medium: "" = node disk (default). "Memory" = tmpfs (RAM-backed)
        sizeLimit: "2Gi"           # Optional: cap the emptyDir to 2 GiB — prevent runaway disk usage
                                   # If exceeded, the Pod is evicted

  containers:
    - name: transcoder             # Main container: transcodes video, writes segments to shared dir
      image: company/transcoder:3.1.0
      volumeMounts:
        - name: scratch-space      # Must match the volume name declared above
          mountPath: /workspace    # Path inside this container where the volume is mounted
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
        limits:
          cpu: "2000m"
          memory: "2Gi"

    - name: s3-uploader            # Sidecar container: reads segments from shared dir, uploads to S3
      image: company/s3-uploader:1.4.0
      volumeMounts:
        - name: scratch-space      # Same volume name — this container sees the same directory
          mountPath: /upload       # Different mountPath inside this container — same underlying data
          readOnly: false          # Both containers can read and write to this directory
      env:
        - name: WATCH_DIR
          value: "/upload"         # This container watches /upload — which is /workspace on the transcoder
      resources:
        requests:
          cpu: "100m"
          memory: "128Mi"
        limits:
          cpu: "300m"
          memory: "256Mi"
$ kubectl apply -f video-transcoder-pod.yaml
pod/video-transcoder created

$ kubectl get pods -n production
NAME                READY   STATUS    RESTARTS   AGE
video-transcoder    2/2     Running   0          12s

$ kubectl exec -it video-transcoder -c transcoder -n production -- ls /workspace
segment_001.ts
segment_002.ts
segment_003.ts

$ kubectl exec -it video-transcoder -c s3-uploader -n production -- ls /upload
segment_001.ts
segment_002.ts
segment_003.ts

What just happened?

READY: 2/2 — Both containers in the Pod are running. The READY column shows total ready containers / total containers in the Pod. A multi-container Pod isn't considered ready until all containers pass their readiness probes.

Same files, different paths — The transcoder writes to /workspace. The uploader reads from /upload. Different container paths, same underlying scratch-space volume. Kubernetes mounts the same directory into both containers. This is the sidecar pattern in its purest form — loose coupling through a shared filesystem.

kubectl exec -c containerName — In a multi-container Pod, specify which container to exec into with -c. Without it, kubectl defaults to the first container in the Pod spec (or to the one named in the kubectl.kubernetes.io/default-container annotation, if set) and prints a warning. If you get "Error from server: container not found," the name you passed with -c doesn't match any container in the Pod — check the spelling against the Pod spec.

sizeLimit on emptyDir — The sizeLimit: 2Gi is essential in production. Without it, a buggy container can fill the node's disk entirely — crashing other Pods on the same node. Always set a sizeLimit on emptyDir volumes that could receive unbounded write traffic.

emptyDir with medium: Memory

Setting medium: Memory on an emptyDir creates a tmpfs volume — backed by RAM, not disk. Reads and writes are dramatically faster. It's used for performance-sensitive scratch space, inter-process communication, and anything that should never touch disk (secrets-adjacent data).

volumes:
  - name: fast-scratch
    emptyDir:
      medium: Memory             # RAM-backed tmpfs — reads/writes at memory speed
      sizeLimit: "256Mi"         # CRITICAL: counts against the container's memory limit
                                 # If you set memory limit to 512Mi and use 256Mi for tmpfs,
                                 # your app only has 256Mi for its heap — plan accordingly
                                 # Also: data is lost on node restart, not just Pod restart

⚠️ Memory tmpfs counts against your memory limit. If your container has a 512Mi memory limit and mounts a 256Mi tmpfs emptyDir, the tmpfs counts toward the container's cgroup memory usage. Your application heap is now effectively capped at ~256Mi before OOMKill. This surprises engineers who thought the volume was "separate" from the container's memory budget.
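To make that budget explicit, here is how the two numbers sit side by side in one spec (the container name and image are placeholders):

```yaml
containers:
  - name: app                      # placeholder name
    image: company/app:1.0.0       # placeholder image
    resources:
      limits:
        memory: "512Mi"            # cgroup limit: heap + tmpfs writes, combined
    volumeMounts:
      - name: fast-scratch
        mountPath: /scratch
volumes:
  - name: fast-scratch
    emptyDir:
      medium: Memory
      sizeLimit: "256Mi"           # every byte written to /scratch counts against
                                   # the 512Mi above — ~256Mi left for the app itself
```

If the application's real working set plus tmpfs usage can approach the limit, either raise the memory limit or shrink the tmpfs before deploying.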

hostPath: Mounting the Node Filesystem

hostPath mounts a path from the host node's filesystem directly into the container. The data lives on the physical node's disk — not in a distributed storage system.

The scenario: You're deploying a log collection DaemonSet — a Pod that runs on every node and ships the Docker/containerd log files to a centralised logging backend. The log files live at /var/log/containers on the host. The log-shipper container needs read access to that directory.

apiVersion: apps/v1
kind: DaemonSet                     # DaemonSet: runs one Pod per node — ideal for node-level agents
metadata:
  name: log-shipper
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: log-shipper
  template:
    metadata:
      labels:
        app: log-shipper
    spec:
      tolerations:                  # Allow scheduling on control-plane nodes too (optional)
        - effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
      volumes:
        - name: container-logs
          hostPath:                 # hostPath: mount a path from the host node's filesystem
            path: /var/log/containers  # Absolute path on the HOST node
            type: DirectoryOrCreate    # type: what to do if path doesn't exist
                                       # DirectoryOrCreate: create dir if missing
                                       # Directory: must already exist (fails if not)
                                       # File: must be a regular file
                                       # FileOrCreate: create file if missing
                                       # Socket: must be a UNIX socket
        - name: host-docker-sock
          hostPath:
            path: /var/run/docker.sock  # Docker socket — needed by some log agents
            type: Socket

      containers:
        - name: log-shipper
          image: fluent/fluent-bit:2.2
          volumeMounts:
            - name: container-logs
              mountPath: /var/log/containers   # Same path inside container
              readOnly: true                   # read-only: log-shipper only reads, never writes
            - name: host-docker-sock
              mountPath: /var/run/docker.sock  # Mount the socket declared above — a volume
              readOnly: true                   # declared but never mounted does nothing
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "200m"
              memory: "128Mi"
$ kubectl apply -f log-shipper-daemonset.yaml
daemonset.apps/log-shipper created

$ kubectl get pods -n monitoring -o wide
NAME                READY   STATUS    RESTARTS   AGE   NODE
log-shipper-2xkpj   1/1     Running   0          8s    node-eu-west-1a
log-shipper-7rvqn   1/1     Running   0          8s    node-eu-west-1b
log-shipper-m4czl   1/1     Running   0          8s    node-eu-west-1c

$ kubectl exec -it log-shipper-2xkpj -n monitoring -- ls /var/log/containers | head -5
checkout-api_production_checkout-api-abc123.log
payment-api_production_payment-api-def456.log
auth-service_production_auth-service-ghi789.log

What just happened?

DaemonSet + hostPath is the canonical logging pattern — A DaemonSet runs exactly one Pod per node. Each log-shipper Pod mounts the container log directory of its own node via hostPath. The shipper sees log files from every container running on that node. Three nodes = three shipper Pods, each handling their node's logs in parallel.

hostPath security warning — hostPath is a privilege escalation risk. A container with a hostPath mount can read (and write, if not readOnly) arbitrary host filesystem paths. Never use hostPath for general application workloads — only for system-level agents (log shippers, monitoring agents, CNI plugins) that genuinely need host filesystem access. In production clusters, PodSecurity admission or OPA/Gatekeeper policies typically restrict which workloads can use hostPath.

type: DirectoryOrCreate — Without a type field, Kubernetes performs no pre-mount checks, and what happens when the path is missing depends on the container runtime — failures surface late and are hard to diagnose. DirectoryOrCreate gracefully creates the directory if it's missing. Always set the type on hostPath volumes — the implicit behaviour is confusing.
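For contrast, here is a sketch of the stricter types in use — the paths are examples, not taken from the DaemonSet above:

```yaml
volumes:
  - name: node-timezone
    hostPath:
      path: /etc/localtime        # a single file on the host — container clock matches the node
      type: File                  # fails fast at mount time if the file is missing
  - name: agent-state
    hostPath:
      path: /var/lib/my-agent     # hypothetical state directory for a node agent
      type: DirectoryOrCreate     # created with mode 0755, kubelet ownership, if absent
```

Prefer the strict types (Directory, File, Socket) when the path must already exist — a Pod stuck in ContainerCreating with a clear event is easier to debug than an agent silently reading an empty auto-created directory.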

projected: Combining Multiple Sources

The projected volume type combines multiple volume sources — Secrets, ConfigMaps, service account tokens, and the downward API — into a single directory inside the container. Instead of mounting four separate volumes at four different paths, you get one directory with all the files organised together.

The scenario: Your payment service needs a TLS certificate, an API key Secret, its ConfigMap configuration, and a projected service account token — all accessible from a single /config directory so the application can use a single base path in its configuration.

volumes:
  - name: combined-config
    projected:                        # projected: merge multiple sources into one directory
      sources:
        - secret:
            name: payment-tls         # TLS secret — keys become files: tls.crt, tls.key
            items:
              - key: tls.crt
                path: certs/tls.crt   # Write tls.crt as /config/certs/tls.crt inside container
                mode: 0444            # File permissions: owner/group/other read-only
              - key: tls.key
                path: certs/tls.key
                mode: 0400            # Private key: owner read-only

        - configMap:
            name: payment-config      # ConfigMap keys become files in the projected directory
            items:
              - key: app.properties
                path: app.properties  # Written as /config/app.properties

        - serviceAccountToken:        # Project a short-lived service account token
            path: token               # Written as /config/token
            expirationSeconds: 3600   # Token auto-rotates — kubelet refreshes it before expiry
            audience: payment-gateway # Audience claim — for OIDC-based auth to external services

containers:
  - name: payment-processor           # Container spec, abridged — image and resources omitted
    volumeMounts:
      - name: combined-config
        mountPath: /config            # All projected sources appear under /config/
        readOnly: true                # Entire projected volume is read-only
$ kubectl exec -it payment-processor-8c4f7d-j9pkx -n production -- find /config -type f
/config/certs/tls.crt
/config/certs/tls.key
/config/app.properties
/config/token

$ kubectl exec -it payment-processor-8c4f7d-j9pkx -n production -- ls -la /config/certs/
-r-------- 1 root root 1704 Mar 10 09:44 tls.key
-r--r--r-- 1 root root 2048 Mar 10 09:44 tls.crt

What just happened?

Single mount point, multiple sources — Instead of four separate volumeMounts at different paths, the application reads everything from /config. The projected volume organised the files into subdirectories automatically based on the path fields. This is cleaner in large applications with many config files.

serviceAccountToken with expirationSeconds — The old-style service account token (stored permanently in a Secret) is being replaced by projected tokens that auto-rotate. Setting expirationSeconds: 3600 means the kubelet fetches a new token every hour. The token at /config/token is always fresh — applications that re-read it on each request get a valid token without any restart or manual rotation.

Per-file mode permissions — The mode: 0400 on the private key and mode: 0444 on the certificate match what TLS libraries expect. Many TLS implementations refuse to use private keys with world-readable permissions, logging an error like "permissions too open." Setting mode in the projected volume avoids needing a custom init container to chmod the files.

Volume Architecture: How They All Fit Together

Here's the full picture of how volumes, containers, and the underlying storage relate in a multi-container Pod:

Pod Volume Architecture

POD: video-transcoder
├─ Container: transcoder
│     scratch-space → mounted at /workspace
│     app-config    → mounted at /config/app.properties
│     tls-secret    → mounted at /tls/ (readOnly)
└─ Container: s3-uploader (sidecar)
      scratch-space → mounted at /upload
      (no config or TLS mounts needed)

VOLUMES (Pod-level, shared across containers)
  scratch-space — emptyDir, 2Gi limit → backed by node disk
  app-config    — configMap           → data stored in etcd, served via the API server
  tls-secret    — secret              → delivered to the container as tmpfs (RAM)

Teacher's Note: Volumes vs PersistentVolumes — know when you've outgrown Volumes

Everything in this lesson — emptyDir, hostPath, configMap, secret, projected — shares one property: they are tied to the Pod lifecycle. When the Pod goes, the non-persistent data goes with it. For ephemeral scratch space, inter-container communication, configuration injection, and certificate delivery, these volume types are exactly right.

The moment you need data to outlive a Pod — database files, user uploads, audit logs, anything that must survive Pod deletion, rescheduling, or node failure — you need PersistentVolumes and PersistentVolumeClaims. The next two lessons cover exactly that system in full.

A common mistake: developers use hostPath to persist database data because it survives container restarts. It does — until the Pod gets rescheduled to a different node. Then the database starts fresh on the new node with zero data. This mistake has caused data loss in real production systems. If data must survive a Pod move, use a PVC.
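As a preview of where the next two lessons go, the fix looks like this in the Pod spec — the claim name postgres-data is hypothetical, and the PVC object itself must already exist:

```yaml
volumes:
  - name: db-data
    persistentVolumeClaim:
      claimName: postgres-data        # the claim, not the Pod, owns the storage
containers:
  - name: postgres
    image: postgres:16
    volumeMounts:
      - name: db-data
        mountPath: /var/lib/postgresql/data   # follows the Pod to whichever node it lands on
```

From the Pod's point of view this is just another volume type; the durability comes from the PersistentVolume bound to the claim, which Lessons 28–29 cover in full.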

Practice Questions

1. Which volume type creates an empty directory when a Pod starts, is shared between all containers in that Pod, survives container crashes but is deleted when the Pod is deleted?



2. A Pod has two containers named transcoder and s3-uploader. What flag do you add to kubectl exec to specify which container to exec into?



3. A DaemonSet needs to read the container log files at /var/log/containers on every node in the cluster. Which volume type mounts a path from the host node's filesystem into the container?



Quiz

1. A container writes files to an emptyDir volume. The container crashes and the kubelet restarts it. Then the entire Pod is deleted. What happens to the files?


2. A container has a memory limit of 512Mi. It mounts an emptyDir with medium: Memory and sizeLimit: 256Mi. What is the consequence?


3. A developer stores database files using a hostPath volume to survive container restarts. The node running the Pod is drained for maintenance and the Pod is rescheduled. What happens to the database data?


Up Next · Lesson 28

Persistent Volumes

Storage that outlives Pods — how Kubernetes abstracts cloud disks, NFS shares, and SAN storage into a unified API that lets developers claim durable storage without knowing the underlying infrastructure.