Kubernetes Course
Master Node Components
In Lesson 4 we saw the full architecture from above. Now we zoom right in on the control plane — the brain of your cluster. We're going deep on how each component actually works, what happens when something goes wrong, and what you need to know to keep it healthy in production.
Quick Recap From Lesson 4
The control plane is the decision-making layer. It runs on dedicated machines separate from your application workloads. It has four components: the API Server, etcd, the Scheduler, and the Controller Manager. In managed Kubernetes (EKS, GKE, AKS) the cloud provider runs these for you. In self-managed clusters, you own them entirely.
Why the Control Plane Deserves Your Full Attention
Here is the hard truth about the control plane: if it goes down completely, your running applications keep working — but you lose the ability to do anything about them. No deployments. No scaling. No recovering from a crashed node. Your cluster becomes read-only from an operations standpoint.
That's why in production you always run the control plane in a highly available configuration — usually three control plane nodes, sometimes five. Odd numbers matter here because of how etcd elects a leader, which we'll cover shortly.
API Server — What Actually Happens Inside
We know the API Server is the front door. But every single request — from your kubectl command to the Scheduler writing an assignment — goes through four distinct stages before anything touches etcd: authentication (who are you?), authorization (are you allowed to do this?), admission control (should the request be modified or rejected?), and finally validation of the object before it is persisted.
One thing worth burning into your memory: the API Server is stateless. It holds no data of its own — all state lives in etcd. This is why you can run multiple API Servers in a highly available setup. They're all just proxies to etcd. If one dies, another picks up immediately with zero loss.
etcd — The Heart of Your Cluster
etcd is a distributed key-value store that uses an algorithm called Raft to stay consistent across multiple nodes. You don't need to understand Raft deeply right now — here's the essential version: in a cluster of three etcd nodes, one is elected leader. All writes go to the leader. The leader replicates those writes to at least one other node before confirming success.
This is why you run an odd number of etcd members — the cluster needs a majority (called a quorum) to function. With 3 nodes, the quorum is 2 — you can lose one and keep going. With 5 nodes, the quorum is 3 — you can lose two. An even number buys you nothing: 4 nodes still need a quorum of 3, so they tolerate only one failure, exactly like 3 nodes.
| etcd cluster size | Quorum needed | Nodes that can fail | Recommended for |
|---|---|---|---|
| 1 node | 1 | 0 | Dev / learning only |
| 3 nodes | 2 | 1 | ✓ Standard production |
| 5 nodes | 3 | 2 | Large critical clusters |
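The quorum arithmetic behind the table reduces to one formula — a strict majority of members. Here's a minimal sketch in Python (illustrative only, not part of etcd):

```python
# Quorum is a strict majority of members; fault tolerance is how
# many members can fail while a majority is still reachable.

def quorum(members: int) -> int:
    return members // 2 + 1

def fault_tolerance(members: int) -> int:
    return members - quorum(members)

for size in (1, 3, 4, 5):
    print(f"{size} members: quorum={quorum(size)}, "
          f"can lose {fault_tolerance(size)}")
```

Note what the output shows for 4 members: quorum is 3, so a 4-node cluster tolerates only one failure — the same as 3 nodes, with more hardware and more replication traffic.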
Losing etcd without a backup means losing your entire cluster's state. Every Deployment, every ConfigMap, every Secret, every Service — all gone. The worker nodes will keep their running containers alive for a short while, but you will have no way to manage them and no record of what should be running.
Back up etcd regularly. In Lesson 57 we cover exactly how to do this with a single command using etcdctl snapshot save. If you remember nothing else from this lesson, remember this.
Scheduler — How It Actually Picks a Node
The Scheduler doesn't pick nodes randomly. Every unscheduled Pod goes through a deliberate two-phase decision process — first eliminate the nodes that can't work, then rank the ones that can. The phases are called Filtering and Scoring.
Filtering — eliminate every node that simply cannot run this Pod: not enough free CPU or memory, a taint the Pod doesn't tolerate, a node selector that doesn't match.
Scoring — from the surviving nodes, score each one: favor nodes with the most free resources left over, nodes that already have the container image cached, and nodes that spread the workload evenly.
Say you deploy a Pod requesting 500m CPU and 256Mi memory, and your cluster has 3 nodes (illustrative numbers): Node A has 200m CPU free, Node B has 1 CPU and 512Mi free, and Node C has 3 CPU and 8Gi free.
Node A is filtered out — not enough CPU. Node B survives filtering but scores low. Node C wins because it has the most free resources, giving the Pod room to grow without crowding other workloads.
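The filter-then-score process can be sketched in a few lines. This is an illustrative toy, not the real kube-scheduler: the node numbers are made up to match the example above, and the scoring function (sum of leftover CPU and memory) is a simplification of the real scoring plugins:

```python
# Toy model of the Scheduler's two phases. All node data is invented.
pod = {"cpu_m": 500, "mem_mi": 256}

nodes = [
    {"name": "node-a", "free_cpu_m": 200,  "free_mem_mi": 4096},
    {"name": "node-b", "free_cpu_m": 1000, "free_mem_mi": 512},
    {"name": "node-c", "free_cpu_m": 3000, "free_mem_mi": 8192},
]

# Phase 1 — Filtering: drop every node that cannot satisfy the request.
feasible = [
    n for n in nodes
    if n["free_cpu_m"] >= pod["cpu_m"] and n["free_mem_mi"] >= pod["mem_mi"]
]

# Phase 2 — Scoring: rank survivors by the resources left over after
# placing the Pod (a stand-in for the real "least allocated" plugin).
def score(n: dict) -> int:
    return (n["free_cpu_m"] - pod["cpu_m"]) + (n["free_mem_mi"] - pod["mem_mi"])

winner = max(feasible, key=score)
print(winner["name"])  # node-c
```

node-a never reaches scoring — it fails the CPU filter — and node-c beats node-b on leftover resources, exactly the outcome described above.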
After scoring, the Scheduler writes the node assignment into the Pod's spec in etcd — and its job is done. It doesn't watch whether the Pod actually started. That's the kubelet's job.
Controller Manager — The Cluster's Immune System
The Controller Manager runs many individual controllers at once, each in its own background loop. Every controller does the same three things over and over: observe the actual state of the cluster, compare it against the desired state stored in etcd, and act to close any gap between the two.
The controllers you'll deal with most often as a Kubernetes engineer are the Deployment controller, the ReplicaSet controller, the Node controller, and the Job controller.
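The observe-compare-act loop can be sketched as a toy replica controller. This is illustrative only — real controllers watch the API Server and use far more machinery — but the shape of the logic is the same:

```python
# Toy reconciliation loop: observe actual state, compare it with the
# desired state, and act until the two match. All names are invented.

def reconcile(desired: int, running: list[str]) -> list[str]:
    """One pass of the loop for a ReplicaSet-like controller."""
    running = list(running)                 # observe (copy of actual state)
    while len(running) < desired:           # too few Pods: create replacements
        running.append(f"pod-{len(running) + 1}")
    while len(running) > desired:           # too many Pods: delete extras
        running.pop()
    return running

# A Pod just crashed: actual state (2 Pods) drifts from desired state (3).
pods = reconcile(3, ["pod-1", "pod-2"])
print(pods)  # ['pod-1', 'pod-2', 'pod-3']
```

The key property is that the loop is idempotent: run it again when nothing has drifted and it changes nothing, which is why controllers can safely repeat it forever.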
What Breaks When Each Component Fails
This is the practical knowledge that separates engineers who genuinely understand Kubernetes from those who just know the names. If you know exactly what breaks when something fails, you know exactly what to fix first.
| Component fails | What stops working | What keeps working |
|---|---|---|
| API Server | kubectl stops working. No new deployments, scaling, or config changes possible. | All running containers keep running. kubelet keeps health-checking them. |
| etcd | API Server can't read or write. Effectively brings the whole control plane down. | Already-running containers keep running temporarily. |
| Scheduler | New Pods stay in Pending — nobody assigns them to nodes. | All currently running Pods are completely unaffected. |
| Controller Manager | Self-healing stops. Crashed Pods won't be replaced. Dead nodes' workloads stay where they are. | Healthy running Pods continue normally. |
Checking the Control Plane With kubectl
The scenario: You're a DevOps engineer and something feels off — new Pods are getting stuck in Pending. Before you panic, you check the health of every control plane component. These are the exact commands you'd run first.
```bash
# Check all nodes — control plane nodes show the role "control-plane"
kubectl get nodes

# Check the health of the control plane components
# Queries the health endpoint of each one
# (deprecated since v1.19, but still handy on older clusters)
kubectl get componentstatuses

# Control plane components run as Pods in the kube-system namespace
# This shows you the API Server, etcd, Scheduler, and Controller Manager
kubectl get pods -n kube-system

# Read the logs of a specific control plane component
# Replace the Pod name with what you see from the command above
kubectl logs kube-scheduler-control-plane -n kube-system
```
```
NAME             STATUS   ROLES           AGE   VERSION
control-plane    Ready    control-plane   22d   v1.28.0
worker-node-01   Ready    <none>          22d   v1.28.0
worker-node-02   Ready    <none>          22d   v1.28.0

NAME                 STATUS    MESSAGE   ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   ok

NAMESPACE     NAME                                    READY   STATUS    RESTARTS   AGE
kube-system   etcd-control-plane                      1/1     Running   0          22d
kube-system   kube-apiserver-control-plane            1/1     Running   0          22d
kube-system   kube-controller-manager-control-plane   1/1     Running   0          22d
kube-system   kube-scheduler-control-plane            1/1     Running   0          22d
kube-system   coredns-5d78c9869d-4xvzk                1/1     Running   0          22d
kube-system   coredns-5d78c9869d-7kzpt                1/1     Running   0          22d
```
kubectl get nodes — The ROLES column tells you what each machine is. A node with control-plane is a master node. Nodes with no role listed are worker nodes ready to run your application Pods.
kubectl get componentstatuses — This directly checks whether the Scheduler, Controller Manager, and etcd are alive and responding. A Healthy: ok for all three means your control plane is fine — the Pending Pod problem lives somewhere else. An Unknown or Error here is your first lead. (The command is deprecated in recent Kubernetes versions; checking the kube-system Pods, as below, is the modern equivalent.)
kube-system namespace — On kubeadm clusters, all control plane components run as static Pods here. You can read their logs and describe them just like any other Pod. RESTARTS: 0 across all of them is healthy. A high restart count on etcd or the API Server means something is seriously wrong and needs immediate investigation.
On a kubeadm cluster, the control plane components themselves are just containers — they run as static Pods in the kube-system namespace. The kubelet on the control plane node starts them from YAML files sitting in /etc/kubernetes/manifests/. You can inspect their logs, describe them, and even restart one by temporarily moving its YAML file out of that folder.
In managed clusters like EKS, GKE or AKS you never see any of this — the cloud provider hides the control plane completely and just gives you the API Server endpoint. The trade-off is real: less operational overhead, but less visibility and control. Neither is universally better. It depends on your team's size and needs.
Practice Questions
Write from memory — don't scroll back.
1. What is the minimum number of etcd nodes recommended for a production Kubernetes cluster that needs to survive the loss of one node?
2. The Scheduler uses a two-phase process to pick a node for a Pod. What are the two phases called?
3. On a kubeadm-provisioned cluster, control plane components run as static Pods in which namespace?
Knowledge Check
Pick the best answer.
1. The Scheduler crashes on a single-control-plane cluster. You deploy a new app. What happens?
2. Every request passes through three security stages in the API Server. Which answer correctly names and describes all three?
3. The Controller Manager goes down. A Pod crashes 10 minutes later. What happens?
Up Next · Lesson 6
Worker Node Components
We flip to the other side of the cluster — the kubelet, kube-proxy, and container runtime in depth. These are the components that actually run your containers and keep them alive every second of the day.