Kubernetes Lesson 52 – Kubernetes Logging | Dataplexa
Advanced Workloads & Operations · Lesson 52

Kubernetes Logging

Container logs are ephemeral: after a container restart, kubectl logs shows only the current run's output, and once a Pod is deleted its logs disappear entirely. In production you need logs centralised, retained, and searchable. This lesson covers the Kubernetes logging architecture, node-level log collection with Fluent Bit, and shipping to Elasticsearch and CloudWatch.

The Kubernetes Logging Architecture

Kubernetes has no built-in centralised logging; it deliberately leaves this to the operator. What it does provide is a consistent interface: every container writes to stdout/stderr, and the container runtime (containerd) captures this output and writes it to log files on the node under /var/log/pods/ (with symlinks in /var/log/containers/). A log-shipper DaemonSet reads these files and forwards them to a centralised backend.
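On the node, containerd stores each captured line in the CRI log format: an RFC3339 timestamp, the stream name, a partial/full tag, then the raw message. A minimal sketch of parsing one such line (the sample line is illustrative, not taken from a real node):

```python
# Parse one line of the CRI log format that containerd writes under
# /var/log/pods/ -- "<RFC3339 timestamp> <stream> <tag> <message>".

def parse_cri_line(line: str) -> dict:
    timestamp, stream, tag, message = line.rstrip("\n").split(" ", 3)
    return {
        "timestamp": timestamp,   # RFC3339 with nanosecond precision
        "stream": stream,         # "stdout" or "stderr"
        "partial": tag == "P",    # "P" = partial line, "F" = full line
        "message": message,       # the application's original output
    }

line = '2025-03-10T14:32:11.847123456Z stdout F {"level":"info","event":"payment_processed"}'
record = parse_cri_line(line)
print(record["stream"], record["partial"])   # stdout False
```

This is exactly the format Fluent Bit's `cri` multiline parser (used later in this lesson) handles for you.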

Kubernetes Logging Pipeline

Application → writes to stdout / stderr
    ↓
containerd → writes to /var/log/pods/ on the node; rotates files
    ↓
Fluent Bit DaemonSet → tails logs, parses, enriches with Pod metadata, forwards
    ↓
Backend → Elasticsearch, CloudWatch, Loki, Splunk, Datadog

kubectl logs reads directly from the node log file. Once the Pod is deleted, the file is gone — only the centralised backend retains history.

Using kubectl logs Effectively

Before setting up centralised logging, know what kubectl logs can and can't do — it's still the fastest tool for live debugging.

kubectl logs payment-api-7d9f4-xkp2m -n payments
kubectl logs payment-api-7d9f4-xkp2m -n payments --previous    # Crashed container's last logs
kubectl logs -f payment-api-7d9f4-xkp2m -n payments            # Follow live (tail -f)

# Multi-container Pod -- specify the container
kubectl logs payment-api-7d9f4-xkp2m -n payments -c payment-api
kubectl logs payment-api-7d9f4-xkp2m -n payments -c istio-proxy

# Logs from ALL Pods matching a label selector (--prefix labels each line)
kubectl logs -l app=payment-api -n payments --all-containers=true --prefix

# Limit output volume
kubectl logs payment-api-7d9f4-xkp2m -n payments --tail=100
kubectl logs payment-api-7d9f4-xkp2m -n payments --since=15m
kubectl logs payment-api-7d9f4-xkp2m -n payments --since-time="2025-03-10T14:00:00Z"
$ kubectl logs payment-api-7d9f4-xkp2m -n payments --tail=3
{"timestamp":"2025-03-10T14:32:09Z","level":"info","event":"payment_processed","duration_ms":38}
{"timestamp":"2025-03-10T14:32:10Z","level":"info","event":"payment_processed","duration_ms":41}
{"timestamp":"2025-03-10T14:32:11Z","level":"error","event":"payment_failed","reason":"insufficient_funds"}

$ kubectl logs payment-api-7d9f4-xkp2m -n payments --previous | tail -3
{"timestamp":"2025-03-10T14:31:55Z","level":"fatal","event":"startup_failed","reason":"DATABASE_URL not set"}
# --previous shows the crash log from the container run before this one ✓

$ kubectl logs -l app=payment-api -n payments --all-containers=true --prefix --tail=2
[pod/payment-api-7d9f4-xkp2m/payment-api] {"level":"info","event":"payment_processed"}
[pod/payment-api-7d9f4-rvqn2/payment-api] {"level":"info","event":"payment_processed"}

Structured Logging: JSON Over Plain Text

Plain text logs are hard to search at scale. Structured JSON logs let your backend index individual fields — finding all payment failures for a specific user in the last hour becomes a sub-second query instead of a grep across 50 GB of text.

# Bad: plain text -- hard to parse and search at scale
2025-03-10 14:32:11 ERROR Payment failed for user 12345: insufficient funds

# To find all failures for user 12345 in the last hour you must grep:
$ kubectl logs -l app=payment-api -n payments --since=1h | grep "user 12345" | grep ERROR
# This works on one cluster. At 50 GB/day across 20 services it takes minutes.

# Good: structured JSON -- every field is indexable
{"timestamp":"2025-03-10T14:32:11Z","level":"error","service":"payment-api",
 "version":"3.1.0","event":"payment_failed","user_id":"12345",
 "amount_cents":5000,"currency":"USD","reason":"insufficient_funds",
 "trace_id":"abc-123-def","duration_ms":42}

# In Elasticsearch the same lookup takes <100ms because user_id is an indexed field ✓
# Python structured logging with structlog
import structlog
log = structlog.get_logger()

log.error("payment_failed",
    user_id=user_id,
    amount_cents=amount,
    reason="insufficient_funds",
    duration_ms=duration_ms,
    trace_id=trace_id)

# Go structured logging with zerolog (github.com/rs/zerolog/log)
log.Error().
    Str("event", "payment_failed").
    Str("user_id", userID).
    Int("amount_cents", amount).
    Str("trace_id", traceID).
    Msg("Payment processing failed")
# What one structured log line looks like pretty-printed in your terminal
# (json.tool expects a single JSON document, so limit to one line):
$ kubectl logs payment-api-7d9f4-xkp2m -n payments --tail=1 | python3 -m json.tool
{
    "timestamp": "2025-03-10T14:32:11Z",
    "level": "error",
    "service": "payment-api",
    "event": "payment_failed",
    "user_id": "12345",
    "amount_cents": 5000,
    "reason": "insufficient_funds",
    "trace_id": "abc-123-def",
    "duration_ms": 42
}
# Every field is now independently searchable in Elasticsearch or CloudWatch Logs Insights

Why structured logging matters at scale

Indexing changes everything — With JSON logs, Elasticsearch automatically indexes every key as a searchable field. Finding all payment failures for user 12345 in the last hour is a sub-second Kibana query. Dashboards become trivial: a percentile aggregation on duration_ms grouped by service gives you a P99 latency chart with no custom parsing.
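For illustration, the "all payment failures for user 12345 in the last hour" lookup can be written as an Elasticsearch bool query. The index pattern follows the Fluent Bit config later in this lesson, and the term filters assume these fields are mapped as keyword (with dynamic mappings you may need the .keyword subfield):

```
GET kubernetes-logs-*/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term":  { "level":   "error" } },
        { "term":  { "event":   "payment_failed" } },
        { "term":  { "user_id": "12345" } },
        { "range": { "timestamp": { "gte": "now-1h" } } }
      ]
    }
  }
}
```

Every clause hits an index, so the query cost is nearly independent of total log volume.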

The trace_id field — Including a distributed trace ID in every log line lets you correlate logs with traces across services. A user reports an error at 14:32 — find the trace ID from your tracing system, search all service logs by that ID, and see the full request journey. This is the foundation of distributed debugging in a microservices cluster.

Deploying Fluent Bit as a DaemonSet

Fluent Bit is the lightweight sibling of Fluentd, written in C and typically using an order of magnitude less memory, which makes it the standard choice for node-level log collection. It runs as a DaemonSet (one Pod per node), tails the container log files, enriches each line with Kubernetes metadata (Pod name, namespace, labels), and forwards to one or more outputs simultaneously.

The scenario: You want to ship all container logs from your EKS cluster to both Elasticsearch (search and dashboards) and CloudWatch (retention compliance). Fluent Bit handles both outputs from a single DaemonSet.

helm repo add fluent https://fluent.github.io/helm-charts
helm install fluent-bit fluent/fluent-bit \
  --namespace logging \
  --create-namespace \
  -f fluent-bit-values.yaml
$ helm repo add fluent https://fluent.github.io/helm-charts && helm repo update
"fluent" has been added to your repositories
Update Complete.

$ helm install fluent-bit fluent/fluent-bit --namespace logging --create-namespace \
  -f fluent-bit-values.yaml
NAME: fluent-bit
STATUS: deployed  REVISION: 1

$ kubectl get daemonset fluent-bit -n logging
NAME         DESIRED   CURRENT   READY   NODE SELECTOR
fluent-bit   3         3         3       <none>   # One Pod per node ✓

$ kubectl get pods -n logging
NAME                    READY   STATUS
fluent-bit-xkp2m        1/1     Running   # node 1
fluent-bit-7rvqn        1/1     Running   # node 2
fluent-bit-m4czl        1/1     Running   # node 3
# fluent-bit-values.yaml
config:
  inputs: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        multiline.parser  docker, cri         # Handle both Docker and containerd formats
        Tag               kube.*
        Refresh_Interval  5
        Mem_Buf_Limit     50MB                # Backpressure limit -- drop old logs before OOM
        Skip_Long_Lines   On                  # Drop lines over 32KB (runaway loggers)

  filters: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Merge_Log           On               # Parse JSON logs -- merge fields into the record
        Keep_Log            Off              # Drop raw 'log' field after merging
        Annotations         Off              # Skip annotations (too noisy)
        Labels              On               # Include Pod labels for backend filtering

    [FILTER]
        Name   record_modifier
        Match  kube.*
        Record cluster production-us-east-1  # Add cluster name to every log line

  outputs: |
    [OUTPUT]
        Name            es
        Match           kube.*
        Host            elasticsearch.logging.svc.cluster.local
        Port            9200
        Logstash_Format On                   # Daily indices: kubernetes-logs-2025.03.10
        Logstash_Prefix kubernetes-logs
        Retry_Limit     5

    [OUTPUT]
        Name              cloudwatch_logs
        Match             kube.*
        region            us-east-1
        log_group_name    /eks/production/containers
        log_stream_prefix pod/
        auto_create_group On

tolerations:                                 # Run on ALL nodes including tainted ones
  - operator: Exists
$ kubectl get daemonset fluent-bit -n logging
NAME         DESIRED   CURRENT   READY   UP-TO-DATE   NODE SELECTOR
fluent-bit   3         3         3       3            <none>   # One Pod per node ✓

$ kubectl logs -l app.kubernetes.io/name=fluent-bit -n logging --tail=20
[2025/03/10 14:32:11] [info] [output:es:es.0] payments/payment-api-7d9f4/payment-api: OK (200)
[2025/03/10 14:32:11] [info] [output:cloudwatch_logs] 127 events flushed
[2025/03/10 14:32:12] [ warn] [engine] failed to flush chunk, retry in 8 seconds: task_id=0, input=tail.0 > output=es.0

# What a log line looks like after Kubernetes enrichment:
{
  "timestamp": "2025-03-10T14:32:11.847Z",
  "level": "error",
  "event": "payment_failed",
  "user_id": "12345",
  "duration_ms": 42,
  "kubernetes": {
    "pod_name": "payment-api-7d9f4-xkp2m",
    "namespace_name": "payments",
    "container_name": "payment-api",
    "pod_ip": "192.168.2.15",
    "labels": {
      "app": "payment-api",
      "version": "3.1.0"
    },
    "node_name": "ip-10-0-2-44.us-east-1.compute.internal"
  },
  "cluster": "production-us-east-1"
}

What just happened?

The Kubernetes filter enriches every log line — Fluent Bit calls the Kubernetes API to look up the Pod metadata for each log line (using the filename pattern to extract namespace, Pod name, and container name). It appends the full Kubernetes context — Pod IP, node name, labels, namespace — to every event. This means you can filter logs in Elasticsearch or CloudWatch by any label your Pods carry, without the application needing to log that metadata itself.

Merge_Log parses JSON automatically — With Merge_Log: On, if the log line is valid JSON (as our structured logs are), Fluent Bit flattens those fields into the top-level record alongside the Kubernetes metadata. The backend receives a flat, fully indexed document — not a nested JSON-in-a-string that requires parsing.

tolerations: Exists — The DaemonSet needs to run on every node, including ones with taints (GPU nodes, spot nodes, control plane nodes). The blanket operator: Exists toleration matches any taint, ensuring no node's logs are missed.

Log Routing and Filtering

Not every log needs to go to every backend. Fluent Bit's tag-and-match system lets you route different namespaces or applications to different outputs — keeping high-volume debug logs out of expensive backends, or sending security-relevant logs to a dedicated SIEM.

# Route payments namespace to a dedicated high-retention Elasticsearch index
# and kube-system to a separate cheaper storage

[FILTER]
    Name    rewrite_tag
    Match   kube.*
    Rule    $kubernetes['namespace_name']  ^payments$     payments-logs.$TAG  false
    Rule    $kubernetes['namespace_name']  ^kube-system$  system-logs.$TAG    false
    # Regexes are anchored so "payments" doesn't match e.g. "payments-staging".
    # Everything else keeps the kube.* tag and goes to the default output.
    # (system-logs.* needs its own [OUTPUT] -- omitted here for brevity.)

[OUTPUT]
    Name            es
    Match           payments-logs.*           # Only payment namespace logs
    Host            elasticsearch.logging.svc
    Index           payments
    # Separate index: longer retention, dedicated ILM policy

[OUTPUT]
    Name            es
    Match           kube.*                    # All other logs
    Host            elasticsearch.logging.svc
    Index           kubernetes-logs
    # Default index: shorter retention

# Drop noisy logs you don't need
[FILTER]
    Name    grep
    Match   kube.*
    Exclude $kubernetes['container_name'] fluent-bit   # Don't ship Fluent Bit's own output
$ kubectl apply -f fluent-bit-routing-config.yaml
configmap/fluent-bit configured

# Verify routing is working -- payments logs go to dedicated index
$ curl -s "http://elasticsearch:9200/payments-logs-*/_count" | jq .count
18432   ← payment namespace logs in dedicated index ✓

$ curl -s "http://elasticsearch:9200/kubernetes-logs-*/_count" | jq .count
284719   ← all other namespace logs in default index ✓

# CloudWatch -- verify log groups exist
$ aws logs describe-log-groups --log-group-name-prefix /eks/production
{
    "logGroups": [{
        "logGroupName": "/eks/production/containers",
        "retentionInDays": 30,
        "storedBytes": 1073741824
    }]
}

Teacher's Note: Log volume, cost, and retention strategy

Log costs surprise teams that haven't thought about volume. A 20-node cluster with moderately verbose apps can generate 100–500 GB of logs per day. At typical Elasticsearch or CloudWatch pricing, that's $300–$1500/month in storage alone before query costs. Three things to do immediately:

Set log levels correctly in production — If your payment API logs at DEBUG in production, you're generating 10–100x more volume than you need. Enforce INFO or WARN in production via an environment variable, configurable without a code change.
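A minimal sketch of the environment-variable pattern with stdlib logging (LOG_LEVEL is an assumed variable name; set it in the Pod spec via env:):

```python
# Pick the log level from the environment so production can run at
# INFO/WARN without a code change or image rebuild.
import logging
import os

level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
level = getattr(logging, level_name, logging.INFO)   # fall back to INFO on typos

log = logging.getLogger("payment-api")
log.setLevel(level)

log.debug("noisy diagnostic detail")   # suppressed unless LOG_LEVEL=DEBUG
log.info("payment_processed")          # emitted at the default INFO level
```

Structured loggers like structlog and zerolog support the same pattern; the key point is that the level is configuration, not code.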

Use Index Lifecycle Management (ILM) — In Elasticsearch, configure ILM to move indices to warm storage after 7 days and delete them after 30 (or 90 for compliance). CloudWatch Log Groups have a configurable retention period — set it to 30 or 90 days, never "never expire."
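As a sketch, an ILM policy matching that schedule looks like the following (policy name and thresholds are illustrative), with the CloudWatch CLI equivalent for the log group used earlier:

```
PUT _ilm/policy/kubernetes-logs-policy
{
  "policy": {
    "phases": {
      "hot":    { "actions": { "rollover": { "max_age": "1d", "max_primary_shard_size": "50gb" } } },
      "warm":   { "min_age": "7d",  "actions": { "shrink": { "number_of_shards": 1 } } },
      "delete": { "min_age": "30d", "actions": { "delete": {} } }
    }
  }
}

# CloudWatch equivalent: set retention on the log group
aws logs put-retention-policy \
  --log-group-name /eks/production/containers \
  --retention-in-days 30
```

The rollover action requires the indices to be created through an alias or data stream; the daily Logstash_Format indices from the Fluent Bit config age out on timestamp alone.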

Consider Grafana Loki for cost reduction — Loki indexes only labels (namespace, app, pod name), not the log content itself. It stores compressed log chunks in object storage (S3). Query performance for label-based filtering is fast; full-text search is slower. For teams that query logs by service and time rather than arbitrary text patterns, Loki can be 5–10x cheaper than Elasticsearch at scale.
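For comparison, here is what the same kinds of queries look like in Loki's LogQL (label names assume the Pod labels shown earlier in this lesson): labels in braces are indexed, everything after the first pipe is scanned.

```
# Select by indexed labels, then filter text in the matching chunks
{namespace="payments", app="payment-api"} |= "payment_failed"

# Parse the JSON at query time and filter on a field -- no full-text index needed
{namespace="payments"} | json | level="error" | user_id="12345"
```

This is why Loki suits label-and-time query patterns: the cheap part (label selection) is indexed, and the expensive part (text scanning) only runs over the chunks that selection returns.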

Practice Questions

1. Fluent Bit is deployed using which Kubernetes workload type — ensuring exactly one log collector Pod runs on every node in the cluster?



2. A container crashed and restarted. Which kubectl logs flag retrieves the output from the container before the crash?



3. In the Fluent Bit Kubernetes filter, which setting automatically parses JSON log lines and flattens their fields into the top-level record for indexing?



Quiz

1. A Pod crashed and was deleted by Kubernetes. A developer tries kubectl logs payment-api-7d9f4-xkp2m but gets "not found." Why, and how do you find the logs?


2. Why is structured JSON logging preferred over plain text in a Kubernetes cluster?


3. Your cluster is generating 300 GB of logs per day at significant cost. What are three strategies to reduce log volume and storage cost?


Up Next · Lesson 53

Kubernetes Monitoring

Logs tell you what happened. Metrics tell you how your system is performing right now and over time. This lesson covers the Prometheus and Grafana stack, kube-state-metrics, custom metrics, alerting with Alertmanager, and the four golden signals every production service should track.