Kubernetes Lesson 47 – Taints And Tolerations | Dataplexa
Advanced Workloads & Operations · Lesson 47

Taints and Tolerations

Taints let node operators repel Pods from a node — marking it as restricted to specific workloads. Tolerations are the matching declaration on a Pod that says "I accept this taint." Together they implement dedicated node pools, GPU isolation, and spot instance handling without needing complex scheduling rules.

How Taints and Tolerations Work

A taint has three parts: a key, a value, and an effect. The effect determines what happens to Pods that don't tolerate the taint:

NoSchedule
  New Pods without a matching toleration will not be scheduled here. Existing Pods are unaffected.
  Use for: dedicated node pools — GPU nodes, high-memory nodes.

PreferNoSchedule
  The scheduler tries to avoid this node but will use it if no other option exists.
  Use for: soft preference — "use these nodes last."

NoExecute
  New Pods won't be scheduled AND existing Pods without a matching toleration are evicted.
  Use for: node failure handling (not-ready/unreachable), spot instance reclaim, node maintenance.

Creating Taints and Tolerations

The scenario: You have a pool of GPU nodes reserved exclusively for ML workloads. No other Pods should land on these expensive nodes. You taint the GPU nodes and add tolerations only to ML Pods.

# Add a taint to a node: key=value:Effect
kubectl taint node gpu-node-1 dedicated=gpu:NoSchedule
kubectl taint node gpu-node-2 dedicated=gpu:NoSchedule
# All new Pods without a matching toleration will be rejected from these nodes

# Remove a taint (append '-' to remove)
kubectl taint node gpu-node-1 dedicated=gpu:NoSchedule-

# View node taints
kubectl describe node gpu-node-1 | grep Taints
# Taints: dedicated=gpu:NoSchedule
The ML Deployment then declares a matching toleration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-training
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ml-training
  template:
    metadata:
      labels:
        app: ml-training
    spec:
      tolerations:
        - key: "dedicated"
          operator: "Equal"         # Equal: value must match exactly
          value: "gpu"
          effect: "NoSchedule"      # Must match the taint's effect
          # This Pod can be scheduled on nodes with taint dedicated=gpu:NoSchedule

      nodeSelector:
        accelerator: nvidia-tesla-t4   # Also constrain to GPU-labelled nodes
        # Toleration alone doesn't ATTRACT a Pod to a node — it just allows it
        # Combine with nodeSelector or affinity to both allow AND attract
      containers:
        - name: trainer
          image: registry.company.com/ml-trainer:1.0.0
          resources:
            limits:
              nvidia.com/gpu: 2

$ kubectl get pods -o wide
NAME                  READY   NODE
ml-training-xyz-1     1/1     gpu-node-1   ← tolerated the taint ✓
ml-training-xyz-2     1/1     gpu-node-2   ✓

# Other Pods without the toleration stay off GPU nodes:
$ kubectl get pods -o wide -n default
NAME              READY   NODE
web-app-abc-1     1/1     worker-node-3   ← never lands on gpu-node-* ✓
web-app-abc-2     1/1     worker-node-4   ✓

What just happened?

Tolerations allow, they don't attract — A toleration says "I can run on nodes with this taint." It does not say "schedule me there preferentially." Without a nodeSelector or affinity rule, a Pod with a GPU toleration might still land on a CPU node if the scheduler scores it higher. Always pair a toleration with a nodeSelector or node affinity to both allow and attract.
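
The "attract" half can also be written as node affinity instead of nodeSelector. A minimal sketch of the same constraint, placed under template.spec, assuming the accelerator label from the example above:

```yaml
# Node affinity form of the same "attract" rule (illustrative sketch;
# the accelerator label and value mirror the example Deployment)
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:   # hard requirement
      nodeSelectorTerms:
        - matchExpressions:
            - key: accelerator
              operator: In
              values:
                - nvidia-tesla-t4
```

Affinity is more verbose than nodeSelector but supports set-based operators (In, NotIn, Exists) and soft preferences, covered in the next lesson.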

The operator field — Equal (the default) requires key, value, and effect to all match the taint. Exists requires only the key and effect to match; the value is ignored. This is useful for tolerating all taints with a given key regardless of value (e.g., tolerate any node.kubernetes.io/not-ready taint).
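
For instance, the ML toleration above rewritten with Exists would accept any value of the dedicated key (a sketch):

```yaml
tolerations:
  - key: "dedicated"
    operator: "Exists"    # matches dedicated=gpu, dedicated=ssd, any value
    effect: "NoSchedule"  # no value field: it is ignored with Exists
```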

Built-in Taints: Automatic Node Conditions

Kubernetes automatically taints nodes when they enter certain conditions. Understanding these is essential — they explain why Pods sometimes get evicted unexpectedly.

Taint                                 When applied                       Effect
node.kubernetes.io/not-ready          Node fails readiness check         NoExecute
node.kubernetes.io/unreachable        Node controller loses contact      NoExecute
node.kubernetes.io/memory-pressure    Node is low on memory              NoSchedule
node.kubernetes.io/disk-pressure      Node is low on disk space          NoSchedule
node.kubernetes.io/unschedulable      kubectl cordon was run             NoSchedule

# Tolerate temporary node issues — keep Pods running longer before eviction
tolerations:
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 300    # Stay on the node for 5 minutes before being evicted
                              # Default without this: 300s (added automatically by the DefaultTolerationSeconds admission plugin)
                              # Lower this for latency-sensitive services that need fast failover

  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 300    # Same — give the node time to recover before evicting
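
A related pattern: a toleration with an empty key and operator Exists matches every taint, which is how node-level agents (typically DaemonSets such as log collectors or monitoring agents) keep running on otherwise-restricted nodes. A sketch:

```yaml
# Tolerate all taints (use sparingly; typical for node monitoring agents)
tolerations:
  - operator: "Exists"    # empty key + Exists matches every taint
```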

Spot Instance Pattern

Spot/preemptible instances are cheap but can be reclaimed by the cloud provider at any time. A common pattern: taint spot nodes so only workloads that can tolerate interruption run there, while critical services stay on on-demand nodes.

# Node group configuration (in eksctl or cloud-init):
# Spot nodes are automatically tainted by the cloud provider or node group config:
# kubectl taint node spot-node-1 spot=true:NoSchedule

# Batch job — tolerates spot interruption
tolerations:
  - key: "spot"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
# If the spot node is reclaimed, the Job creates a replacement Pod on another available node

# Critical API — no spot toleration, stays on on-demand
# (no tolerations entry needed — it simply won't be scheduled on tainted spot nodes)
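
Putting the pattern together, a batch Job can pair the spot toleration with a nodeSelector so it is both allowed onto and attracted to spot capacity. A sketch — the Job name, the lifecycle: spot label, and the image are hypothetical:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report            # hypothetical name
spec:
  backoffLimit: 4                 # retry if a spot reclaim kills the Pod
  template:
    spec:
      restartPolicy: OnFailure
      tolerations:
        - key: "spot"
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
      nodeSelector:
        lifecycle: spot           # hypothetical label applied to spot nodes
      containers:
        - name: report
          image: registry.company.com/nightly-report:1.0.0   # hypothetical
```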

Teacher's Note: Taints vs Network Policies — different isolation

Taints control where a Pod runs (node placement). Network Policies control what a Pod can communicate with (network access). They are complementary, not alternatives. A GPU node taint keeps CPU workloads off GPU nodes. A Network Policy keeps those GPU workloads from making outbound calls to the internet. Both are needed in a production multi-tenant cluster.

Practice Questions

1. Which taint effect evicts existing Pods from a node (in addition to blocking new ones) if they don't have a matching toleration?



2. A toleration needs to match any taint with key node.kubernetes.io/not-ready regardless of its value. Which operator should be used?



3. In a NoExecute toleration, which field controls how many seconds a Pod stays on the node before being evicted when the taint is applied?



Quiz

1. You add a toleration for dedicated=gpu:NoSchedule to your ML Pod but it keeps landing on CPU nodes. Why, and how do you fix it?


2. You run kubectl cordon node-1 before maintenance. Which built-in taint and effect does this apply, and why do you still need kubectl drain to move existing Pods off the node?


3. You want batch jobs to run on cheap spot instances but your payment API must never land on a spot node. How do taints enable this?


Up Next · Lesson 48

Node & Pod Affinity

Affinity rules give you expressive, flexible control over Pod placement — scheduling onto nodes with specific properties, keeping related Pods together, or spreading Pods across availability zones. Both hard requirements and soft preferences are supported.