Kubernetes Lesson 45 – Kubernetes Security Best Practices | Dataplexa


Kubernetes Security Best Practices

Security in Kubernetes is not a single setting; it is a layered posture built from interlocking controls. This lesson consolidates everything covered in Section III into an actionable nine-item checklist that a security team can use to audit a cluster, and a platform team can use to harden one from scratch.

The Security Threat Model

Before hardening anything, it helps to know what you're defending against. In a Kubernetes environment, the most common security incidents fall into four categories:

Container Escape

A compromised container breaks out to the host. Mitigated by: non-root, read-only filesystem, dropped capabilities, seccomp, no hostPID/hostNetwork.

Lateral Movement

Attacker pivots from one compromised service to others. Mitigated by: Network Policies (default-deny), minimal Service Account permissions, no overly broad RBAC.

Credential Theft

API tokens, Secrets, or cloud credentials exfiltrated. Mitigated by: external secrets manager, etcd encryption, RBAC on Secrets, short-lived tokens.

Supply Chain

Malicious or vulnerable images deployed. Mitigated by: private registry policy, image scanning in CI, pinned digests, admission policy blocking unapproved registries.

The Production Security Checklist

Where a control was covered earlier in the course, its heading names the lesson that covers it in depth. Use this as an audit checklist: work through each control and mark it done once it is implemented and verified.

1. RBAC — Minimum Necessary Access Lessons 37–39

No wildcard verbs or resources in Roles. No cluster-admin bindings for application workloads. ServiceAccounts dedicated per workload with automountServiceAccountToken: false where no API access is needed. Audit cluster-admin bindings quarterly.

kubectl get clusterrolebindings -o json | \
  jq '.items[] | select(.roleRef.name=="cluster-admin") | .metadata.name'
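The RBAC pattern in item 1 could look roughly like the sketch below, which writes a manifest to a local file for review. It assumes two workloads: one that never calls the API (token automount disabled) and one that only needs to read a single ConfigMap. The names payments, static-site-sa, importer-sa, importer-config-reader, and importer-config are hypothetical placeholders, not values from this course.

```shell
# Sketch only: every namespace and object name here is a hypothetical placeholder.
cat > rbac-minimal.yaml <<'EOF'
# Workload with no API access: do not mount a token at all
apiVersion: v1
kind: ServiceAccount
metadata:
  name: static-site-sa
  namespace: payments
automountServiceAccountToken: false
---
# Workload that needs to read exactly one ConfigMap
apiVersion: v1
kind: ServiceAccount
metadata:
  name: importer-sa
  namespace: payments
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: importer-config-reader
  namespace: payments
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["importer-config"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: importer-config-reader
  namespace: payments
subjects:
  - kind: ServiceAccount
    name: importer-sa
    namespace: payments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: importer-config-reader
EOF

# After reviewing the file, apply it to a cluster you control:
#   kubectl apply -f rbac-minimal.yaml
```

The Role names one resource type, one object, and one verb, so there is nothing for a wildcard audit to flag.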

2. Pod Security — Hardened Security Contexts Lesson 40

Every container sets allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, and capabilities.drop: [ALL]. runAsNonRoot: true and seccompProfile.type: RuntimeDefault are set at either the Pod or the container level.
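As a minimal sketch of those settings in one place, the snippet below writes a Pod manifest with the five controls applied. The Pod name, image reference, and the /tmp scratch volume are assumptions made for illustration; a real workload may need different writable paths.

```shell
# Sketch only: Pod name and image are hypothetical.
cat > hardened-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hardened-example
spec:
  securityContext:
    runAsNonRoot: true            # kubelet refuses to start containers as UID 0
    seccompProfile:
      type: RuntimeDefault        # container runtime's default syscall filter
  containers:
    - name: app
      image: registry.example.com/app:1.0.0
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: tmp
          mountPath: /tmp         # writable scratch space, since the root filesystem is read-only
  volumes:
    - name: tmp
      emptyDir: {}
EOF

# Validate against a cluster without creating anything:
#   kubectl apply --dry-run=server -f hardened-pod.yaml
```

The emptyDir mount is the usual companion to readOnlyRootFilesystem, because many applications still expect somewhere to write temporary files.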

3. Pod Security Admission — Cluster-Wide Enforcement Lesson 41

All application namespaces carry at least the label pod-security.kubernetes.io/enforce=baseline. Production namespaces with sensitive data use enforce=restricted. System namespaces (kube-system) use enforce=privileged.

kubectl get namespaces -o json | \
  jq '.items[] | select(.metadata.labels["pod-security.kubernetes.io/enforce"] == null) | .metadata.name'
# Lists namespaces with no PSA enforcement — these need attention
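For a namespace flagged by that query, the labels could be declared as in the sketch below. The namespace name payments is a hypothetical example; the warn and audit labels are optional additions that surface violations without blocking Pods.

```shell
# Sketch only: "payments" is a hypothetical namespace name.
cat > payments-namespace.yaml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
EOF

# For a namespace that already exists, preview which running Pods would
# violate the level before enforcing it:
#   kubectl label --dry-run=server --overwrite ns payments \
#     pod-security.kubernetes.io/enforce=restricted
# Then apply the label for real by dropping --dry-run=server.
```

Running the server-side dry run first is a low-risk way to find workloads that need a hardened security context before enforcement starts rejecting them.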

4. Network Policies — Default Deny Lesson 36

Every application namespace has a default-deny-all NetworkPolicy. Explicit ingress/egress rules open only the ports and peers each workload needs. DNS egress (port 53 UDP+TCP) explicitly allowed.
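A minimal sketch of that baseline is shown below: one policy that denies all traffic for every Pod in the namespace, and one that re-opens DNS egress. It assumes the namespace is called payments (hypothetical) and that cluster DNS runs in kube-system; clusters using a node-local DNS cache may need a different egress peer.

```shell
# Sketch only: namespace name is hypothetical; DNS is assumed to live in kube-system.
cat > default-deny.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}          # empty selector = every Pod in the namespace
  policyTypes:
    - Ingress
    - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
EOF

# Apply after review:
#   kubectl apply -f default-deny.yaml
```

NetworkPolicies are additive, so each workload then gets its own narrow allow rules layered on top of these two.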

5. Secrets Management — No Plaintext in etcd Lesson 42

etcd encryption at rest enabled (KMS provider preferred). External Secrets Operator syncing from AWS Secrets Manager or HashiCorp Vault. No secret values committed to Git. Pre-commit hooks scanning for credentials.
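To make the encryption-at-rest item concrete, here is a hedged sketch of an API server EncryptionConfiguration. It uses the local aescbc provider as a stand-in because it needs no external service; the checklist's preferred option is a KMS provider, which replaces the aescbc entry. The file name and key name are assumptions.

```shell
# Sketch only: aescbc with a locally generated key stands in for a KMS provider.
# Generate a random 32-byte key and base64-encode it.
KEY="$(head -c 32 /dev/urandom | base64)"

cat > encryption-config.yaml <<EOF
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      # First provider encrypts new writes
      - aescbc:
          keys:
            - name: key1
              secret: ${KEY}
      # Fallback so Secrets written before encryption was enabled stay readable
      - identity: {}
EOF

# The file contains key material, so restrict access to it
chmod 600 encryption-config.yaml

# The kube-apiserver reads this file via its --encryption-provider-config flag.
```

Existing Secrets stay unencrypted until they are rewritten, so enabling the configuration is normally followed by re-saving every Secret.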

6. TLS Everywhere Lesson 44

All external endpoints behind HTTPS Ingress with valid TLS certificates. cert-manager handling automated renewal. No HTTP-only public endpoints. TLS 1.2 minimum enforced at Ingress controller.
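One way the HTTPS and renewal items fit together is sketched below: an Ingress with a tls block, annotated so cert-manager issues and renews the certificate. The host app.example.com, the ClusterIssuer name letsencrypt-prod, the nginx ingress class, and the Service name web are all hypothetical and would need to match what exists in a given cluster.

```shell
# Sketch only: host, issuer, ingress class, and Service names are hypothetical.
cat > web-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
  namespace: payments
  annotations:
    # Tells cert-manager which ClusterIssuer should sign the certificate
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - app.example.com
      secretName: web-tls      # cert-manager stores the issued certificate here
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
EOF

# Apply after review:
#   kubectl apply -f web-ingress.yaml
```

The minimum TLS version is set on the Ingress controller itself rather than on each Ingress, so it is not shown here.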

7. Image Security — Trusted Supply Chain Lessons 41, 43

All images pulled from an internal registry. Images scanned for CVEs in CI before promotion. No :latest tags in production — use immutable digests or pinned semantic version tags. Kyverno/OPA enforcing registry policy.

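kubectl get pods -A -o json | \
  jq '.items[].spec.containers[].image | select(test(":latest$") or test("(^|/)[^/:@]+$"))'
# Lists container images that use :latest or have neither a tag nor a digest
# (the second pattern checks only the final path component, so a registry port such as :5000 is not mistaken for a tag)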
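The distinction between digest-pinned, tagged, :latest, and untagged references can also be checked without jq. The function below is a small sketch using plain shell pattern matching; the name classify_image and the sample image references are hypothetical.

```shell
# Sketch only: classify_image is a hypothetical helper, not part of kubectl.
# Prints one of: digest, latest, tagged, untagged
classify_image() {
  ref="$1"
  # Keep only the final path component so a registry port (host:5000/...)
  # is not mistaken for a tag.
  last="${ref##*/}"
  case "$last" in
    *@sha256:*) echo "digest" ;;    # immutable, pinned by content hash
    *:latest)   echo "latest" ;;    # mutable tag, avoid in production
    *:*)        echo "tagged" ;;    # pinned only if the registry forbids tag overwrites
    *)          echo "untagged" ;;  # resolves to :latest implicitly
  esac
}

# Example usage with made-up references
for img in nginx registry.example.com:5000/team/app registry.example.com/team/app:1.4.2; do
  printf '%s\t%s\n' "$img" "$(classify_image "$img")"
done

# Against a live cluster, image references could be fed in with:
#   kubectl get pods -A -o jsonpath='{.items[*].spec.containers[*].image}'
```

Only the digest case is truly immutable; a version tag is as trustworthy as the registry's tag-overwrite policy.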

8. API Server Access — No Broad Public Exposure

API server not accessible from the public internet (use a VPN or bastion). kubeconfig files not shared — individual user certificates or OIDC-based authentication. API server audit logging enabled and shipped off-cluster.
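For the audit logging item, a starting-point policy might look like the sketch below. The choice of rules is an assumption made for illustration: Secrets are logged at Metadata level so their contents never reach the log, RBAC changes are logged in full, and everything else falls back to Metadata.

```shell
# Sketch only: the rule selection is illustrative, not a recommended baseline for every cluster.
cat > audit-policy.yaml <<'EOF'
apiVersion: audit.k8s.io/v1
kind: Policy
# Skip the initial RequestReceived stage to reduce log volume
omitStages:
  - RequestReceived
rules:
  # Record who touched Secrets, but never the Secret payload
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Capture full request and response bodies for RBAC changes
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  # Everything else: metadata only
  - level: Metadata
EOF

# The kube-apiserver loads the policy via --audit-policy-file and writes
# events to the path given by --audit-log-path; a log shipper then moves
# them off-cluster.
```

Rules are evaluated in order and the first match wins, which is why the Secrets rule comes before the catch-all.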

9. Node Security — Minimal Attack Surface

Nodes run only the kubelet and container runtime, with no unnecessary software. Node OS patched on a defined schedule. No SSH keys on nodes (use SSM Session Manager or equivalent). Pods avoid hostPort and hostNetwork unless strictly necessary, so workloads are not exposed directly on node interfaces.

Scoring Your Cluster: The Quick Audit

Run these four commands to get an immediate picture of the most impactful security gaps in any cluster:

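# 1. Find Pods that do not enforce runAsNonRoot (these may be running as root)
# any() evaluates all containers once, so each Pod is listed at most once
kubectl get pods -A -o json | jq -r '
  .items[] |
  select(
    (.spec.securityContext.runAsNonRoot != true) and
    any(.spec.containers[]; .securityContext.runAsNonRoot != true)
  ) |
  "\(.metadata.namespace)/\(.metadata.name)"'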

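# 2. Find Pods with at least one container that has no resource limits
# any() avoids printing the same Pod once per offending container
kubectl get pods -A -o json | jq -r '
  .items[] |
  select(any(.spec.containers[]; .resources.limits == null)) |
  "\(.metadata.namespace)/\(.metadata.name)"'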

# 3. List every subject (User, Group, or ServiceAccount) bound to cluster-admin
kubectl get clusterrolebindings -o json | jq -r '
  .items[] |
  select(.roleRef.name == "cluster-admin") |
  "CRB: \(.metadata.name) → \([.subjects[]? | "\(.kind)/\(.name)"] | join(", "))"'

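# 4. Find namespaces with no default-deny NetworkPolicy
# The namespace list is passed in separately so that namespaces with no
# NetworkPolicy at all are reported too. A policy only counts as default-deny
# if it selects all Pods, covers both directions, and has no allow rules.
kubectl get networkpolicy -A -o json | jq -r \
  --argjson ns "$(kubectl get namespaces -o json)" '
  [.items[] | select(
    .spec.podSelector == {} and
    ((.spec.policyTypes // []) | sort) == ["Egress","Ingress"] and
    ((.spec.ingress // []) | length) == 0 and
    ((.spec.egress // []) | length) == 0
  ) | .metadata.namespace] as $secured |
  $ns.items[].metadata.name |
  select(. as $n | $secured | index($n) | not)'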

# Command 1 — Pods running as root:
kube-system/coredns
monitoring/prometheus-server        ← review: port 9090 is unprivileged (above 1023), so root should not be needed; fix or document the exception
payments/legacy-importer            ← needs fixing

# Command 2 — Pods with no resource limits:
payments/legacy-importer
staging/test-runner

# Command 3 — cluster-admin bindings:
CRB: cluster-admin → Group/system:masters  ← built-in, expected
CRB: ci-pipeline-admin → ServiceAccount/ci-pipeline-sa  ← INVESTIGATE: CI should not be cluster-admin

# Command 4 — namespaces without default-deny NetworkPolicy:
staging
analytics
# These namespaces have no default-deny policy — lateral movement is possible

Teacher's Note: Security is a journey, not a destination

No cluster starts fully hardened. The practical approach is to prioritise by impact: start with items 1 and 4 (RBAC and default-deny network policies) — these have the highest impact on blast radius reduction. Then add PSA enforcement (item 3) and secrets management (item 5). TLS (item 6) and image security (item 7) follow once the core controls are stable.

Run the quick audit every quarter and after any major infrastructure change. To automate posture monitoring over time, consider tools like kube-bench (automated CIS Kubernetes Benchmark checks), Kubescape (NSA/CISA framework compliance), and Trivy Operator (continuous in-cluster image vulnerability scanning).

Practice Questions

1. Which Kubernetes resource, when set to default-deny-all in a namespace, prevents lateral movement between compromised workloads even if RBAC controls are bypassed?

2. A Pod running as root with privileged: true is particularly dangerous because it enables which category of attack from the threat model above?

3. Which open-source tool runs automated CIS Kubernetes Benchmark checks against a cluster's configuration?
