Kubernetes Course
Kubernetes YAML Basics
Every Kubernetes object you'll ever create starts with a YAML file — it's the universal language of the cluster, and once you understand its structure, everything else clicks into place.
Why Kubernetes Uses YAML
Before we write a single manifest, let's talk about why Kubernetes settled on YAML. You could have a dashboard, a GUI, a drag-and-drop interface — and Kubernetes does have dashboards. But YAML won out because it has something those tools don't: it's a file. You can commit it to Git. You can code-review it. You can diff it, roll it back, replicate it across environments, and version it like any other piece of software.
This is the idea behind Infrastructure as Code — your infrastructure isn't a set of manual steps someone performed last Tuesday. It's a file in a repo that anyone on the team can read, run, and reproduce. YAML is just the format Kubernetes chose to express that idea.
🌱 The Git commit is the source of truth. A healthy Kubernetes team doesn't SSH into clusters and run manual commands in production. They update YAML files, open a pull request, get a review, and merge. The cluster reflects the repo. If the cluster ever drifts from the repo, the repo wins.
YAML: The Basics You Actually Need
You don't need to be a YAML expert — you need to understand about five concepts and you're set. Let's go through each one fast.
1. Key-value pairs. YAML is mostly key: value. Simple as that. The key and value are separated by a colon and a space.
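2. Indentation = structure. YAML uses spaces (never tabs) to show nesting. Indenting a line further than the one above it (two spaces by convention) means "this is a child of the item above." Get this wrong and one of two things happens: the file fails to parse, or it parses into a different structure than you intended and Kubernetes rejects the manifest because the fields are in the wrong place.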
3. Lists start with a dash. A - means "this is an item in a list." You'll see this constantly for containers, ports, volumes, and environment variables.
4. Strings don't need quotes (usually). name: payment-service is fine unquoted. But if the value contains a colon followed by a space, starts with a special character such as # or *, or would otherwise be read as a number or boolean (1.0, true), wrap it in quotes.
5. Comments start with #. Kubernetes ignores them completely. Use them liberally — your future self at 2am will thank you.
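The five concepts above can be seen together in one small file. This is a minimal sketch of generic YAML, not a Kubernetes manifest; every key and value in it is made up for illustration.

```yaml
# Illustrative only: generic YAML, not a Kubernetes manifest.
service: payment-service      # 1. key: value, separated by a colon and a space
owner:                        # 2. nesting is shown by indenting with spaces
  team: payments              #    "team" is a child of "owner"
  oncall: alice
ports:                        # 3. each "- " starts one item of a list
  - 8080
  - 8443
version: "1.0"                # 4. quoted, otherwise it would parse as the number 1.0
motto: "fast: and safe"       #    quoted because the value contains ": "
# 5. a full-line comment; the parser discards it
```

If you remove the two spaces in front of team and oncall, they stop being children of owner and become top-level keys, which is the kind of silent structural change that indentation mistakes cause.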
The Four Fields Every Kubernetes Manifest Has
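No matter what Kubernetes object you're creating (a Pod, a Deployment, a Service, a ConfigMap), the manifest starts with the same three top-level fields: apiVersion, kind, and metadata. Most objects then add a fourth, spec, which holds the desired state; a few configuration-style objects such as ConfigMap and Secret carry a data field in its place. Learn these four and you'll be able to read almost any manifest you encounter.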
| Field | What it means | Example |
|---|---|---|
| apiVersion | Which Kubernetes API group and version handles this object | apps/v1, v1, networking.k8s.io/v1 |
| kind | The type of Kubernetes object you're creating | Pod, Deployment, Service, ConfigMap |
| metadata | Data that identifies the object: name, namespace, labels | name: payment-service, labels: ... |
| spec | The desired state — what you actually want this object to do or be | containers, replicas, ports, volumes... |
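One caveat is worth seeing in practice: ConfigMap, which appears in the table above, has no spec block and carries its payload under data. The sketch below is a minimal ConfigMap; the object name and the keys under data are hypothetical.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: catalog-config        # hypothetical name
data:                         # ConfigMaps hold key-value data here; there is no spec block
  LOG_LEVEL: "info"
  CATALOG_PAGE_SIZE: "50"     # values must be strings, hence the quotes
```

The first three fields are identical in shape to every other manifest, which is why you can still read an object like this at a glance.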
The Anatomy of a Kubernetes Manifest
Let's look at the most fundamental object in Kubernetes — a Pod — and break down every line of its manifest together.
The scenario: You're a junior DevOps engineer who just joined an e-commerce company. Your tech lead asks you to write your first Pod manifest for a new product catalog service. You need to run a single container, expose port 8080, and label it so the team can find it later. This is that manifest — and we're going to read every line together.
```yaml
apiVersion: v1                # v1 is the core API group: used for Pods, Services, ConfigMaps
kind: Pod                     # We're creating a Pod, the smallest deployable unit in Kubernetes
metadata:                     # Metadata block: describes the object itself (not what it does)
  name: catalog-pod           # Every object needs a unique name within its namespace
  namespace: default          # Which namespace this Pod lives in (default if not specified)
  labels:                     # Labels: key-value tags used to identify and group objects
    app: catalog              # app=catalog, used by Services and Deployments to find this Pod
    tier: backend             # tier=backend, useful for filtering with kubectl get pods -l tier=backend
    version: "1.0"            # quoted: label values must be strings, and a bare 1.0 parses as a number
spec:                         # Spec block: the desired state, what we want this Pod to contain
  containers:                 # containers is a list (note the dash below); a Pod can have multiple
    - name: catalog-app       # Name of this specific container within the Pod
      image: nginx:1.25       # Container image to run; always pin a version tag, never use :latest in prod
      ports:                  # List of ports this container exposes
        - containerPort: 8080 # The port the catalog app listens on (nginx is a stand-in and listens on 80 by default)
      resources:              # Resource block: tells Kubernetes how much CPU/memory this container needs
        requests:             # requests: the minimum guaranteed allocation
          cpu: "100m"         # 100 millicores = 0.1 CPU; Kubernetes schedules based on this
          memory: "128Mi"     # 128 mebibytes of memory minimum
        limits:               # limits: the maximum the container can consume before being throttled/OOM-killed
          cpu: "250m"         # 250 millicores = 0.25 CPU ceiling
          memory: "256Mi"     # 256Mi memory ceiling; if exceeded, container is killed and restarted
```
```
$ kubectl apply -f catalog-pod.yaml
pod/catalog-pod created

$ kubectl get pods
NAME          READY   STATUS    RESTARTS   AGE
catalog-pod   1/1     Running   0          12s
```
What just happened?
apiVersion: v1 — Kubernetes has dozens of API groups. v1 is the original "core" group containing the most fundamental objects: Pods, Services, ConfigMaps, Secrets, Namespaces. When you see apps/v1 (which you will in the next lesson on Deployments), that means the object lives in the apps API group version 1.
metadata.name — This is the unique identifier within the namespace. If you try to apply a manifest with a name that already exists, Kubernetes will update the existing object rather than create a new one. That's the "declarative" model — you declare state, Kubernetes reconciles.
metadata.labels — Labels are not just decorative. They're how Kubernetes objects find each other. Services use label selectors to route traffic to Pods. Deployments use them to manage which Pods they own. Without labels, your objects are islands.
spec.containers: The containers field is a list, which is why the first item starts with -. A Pod can run multiple containers (the sidecar pattern is the best-known use of this), but the most common case is one container per Pod.
image: nginx:1.25 — Always pin your image tag. :latest means Kubernetes will pull whatever the registry considers "latest" at pull time — this has caused production outages when a surprise breaking change was tagged latest by the upstream maintainer.
The output — READY 1/1 means 1 out of 1 containers in this Pod is ready. STATUS: Running means the container process started successfully. RESTARTS: 0 means nothing has crashed yet — a clean start.
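To make the "containers is a list" point concrete, here is a minimal sketch of a Pod with two list items under containers. The Pod name, the second container, and its command are hypothetical and only illustrate the shape of a multi-container Pod.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: catalog-with-sidecar     # hypothetical name
  labels:
    app: catalog
spec:
  containers:
    - name: catalog-app          # first list item: the main container
      image: nginx:1.25
      ports:
        - containerPort: 8080
    - name: log-forwarder        # second list item: a hypothetical sidecar
      image: busybox:1.36
      command: ["sh", "-c", "while true; do echo forwarding logs; sleep 30; done"]
```

With a manifest like this, the READY column would count both containers, so a healthy Pod would be expected to report 2/2 rather than 1/1.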
The apiVersion Cheat Sheet
One of the most common rookie mistakes is putting the wrong apiVersion. Here's the map you'll use constantly:
| Object | apiVersion | Why this group? |
|---|---|---|
| Pod | v1 | Core Kubernetes object, original API |
| Deployment | apps/v1 | Workload management lives in apps group |
| ReplicaSet | apps/v1 | Workload management, same group as Deployment |
| StatefulSet | apps/v1 | Stateful workloads, same group |
| Service | v1 | Core networking object |
| ConfigMap | v1 | Core configuration object |
| Secret | v1 | Core secrets object |
| Ingress | networking.k8s.io/v1 | Networking extension group |
| HorizontalPodAutoscaler | autoscaling/v2 | Autoscaling extension group |
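As a quick illustration of how a row in this table turns into a manifest header, the sketch below shows a minimal Deployment using apps/v1. The names and replica count are hypothetical, and the next lesson covers Deployments properly; the point here is only the apiVersion and kind lines.

```yaml
apiVersion: apps/v1            # group "apps", version "v1"
kind: Deployment
metadata:
  name: catalog-deployment     # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: catalog             # must match the Pod template labels below
  template:                    # the Pod template: it has no apiVersion or kind of its own
    metadata:
      labels:
        app: catalog
    spec:
      containers:
        - name: catalog-app
          image: nginx:1.25
```

Core objects such as Pod and Service write just v1 because the core group has no name; every other group appears as a prefix before the slash.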
Multi-Resource YAML: The --- Separator
Production YAML files rarely define just one object. You can put multiple Kubernetes resources in a single file using --- as a separator. This is extremely common — you'll see teams keep a Deployment and its Service in one file so they travel together.
The scenario: You're a DevOps engineer at a SaaS company. Your team just built a new auth microservice and needs to deploy it to the cluster. The backend developer hands you a container image and says "make it accessible internally on port 3000." You write a single YAML file that creates both the Pod and the Service to expose it — because they're related and should be deployed together.
```yaml
apiVersion: v1                 # First document: a Pod for the auth service
kind: Pod
metadata:
  name: auth-pod               # Name this Pod auth-pod
  labels:
    app: auth                  # Label app=auth; the Service below will use this to find the Pod
spec:
  containers:
    - name: auth-container
      image: company/auth-service:2.1.0   # Always use a versioned tag from your internal registry
      ports:
        - containerPort: 3000  # App listens on 3000 inside the container
---                            # Three dashes = document separator; starts the next Kubernetes object
apiVersion: v1                 # Second document: a Service to expose the Pod
kind: Service
metadata:
  name: auth-service           # Name this Service auth-service
spec:
  selector:
    app: auth                  # Match Pods with label app=auth; this is how the Service finds our Pod above
  ports:
    - port: 3000               # Port the Service exposes to the rest of the cluster
      targetPort: 3000         # Port on the Pod to forward traffic to (matches containerPort above)
  type: ClusterIP              # ClusterIP = internal-only, not reachable from outside the cluster
```
```
$ kubectl apply -f auth-service.yaml
pod/auth-pod created
service/auth-service created

$ kubectl get pods,services
NAME           READY   STATUS    RESTARTS   AGE
pod/auth-pod   1/1     Running   0          8s

NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/auth-service   ClusterIP   10.96.137.201   <none>        3000/TCP   8s
```
What just happened?
The --- separator: YAML supports multiple "documents" in one file, separated by ---. When you run kubectl apply on this file, kubectl reads both documents and submits them to the API server in order, creating both objects. One command, two objects.
Service selector — The selector: app: auth on the Service is the link between the Service and the Pod. Kubernetes continuously watches for Pods with matching labels and routes traffic to them. This is the entire routing model — labels are the glue.
kubectl get pods,services — You can query multiple resource types in one command by comma-separating them. Useful for getting a full picture of a component's health fast.
CLUSTER-IP: 10.96.137.201 — Kubernetes assigned the Service a stable virtual IP address. Any other pod in the cluster can reach the auth service at this IP on port 3000 — or by using the DNS name auth-service.default.svc.cluster.local. We'll cover DNS in detail in Lesson 35.
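In the manifest above, port and targetPort happen to be the same number, which can hide the fact that they are independent. The sketch below is a hypothetical variant of the Service (the name auth-service-http and port 80 are assumptions for illustration) where callers use one port and the Pod listens on another.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: auth-service-http      # hypothetical name, a variant of the Service above
spec:
  type: ClusterIP
  selector:
    app: auth                  # same selector: still targets Pods labeled app=auth
  ports:
    - port: 80                 # callers inside the cluster connect to port 80
      targetPort: 3000         # traffic is forwarded to port 3000 on the Pod
```

Only targetPort has to agree with what the container actually listens on; port is whatever you want the rest of the cluster to dial.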
Validating and Dry-Running Your YAML
Before you apply a manifest to a production cluster, two kubectl commands will save you from embarrassing yourself in front of your team.
The scenario: You're an SRE preparing a deploy for a critical payment processing service. Your company's change management policy requires you to validate all manifests before they hit production. You've written the YAML, but you want to catch any errors before the change window tonight — without actually touching the cluster.
```shell
kubectl apply -f payment-pod.yaml --dry-run=client
# --dry-run=client: kubectl builds the request on your machine and never submits it, so nothing in the cluster changes
# (kubectl may still read from the API server, for example to fetch the schema)
# Useful for catching YAML syntax errors and obvious misconfigurations before deploy

kubectl apply -f payment-pod.yaml --dry-run=server
# --dry-run=server: sends the manifest to the API server but tells it "don't actually create this"
# The server validates against the full schema and admission webhooks, so it catches more than client mode

kubectl apply -f payment-pod.yaml --dry-run=server -o yaml
# -o yaml: prints the full manifest as Kubernetes would see it, including default values injected
# This is how you see what fields Kubernetes adds automatically (like imagePullPolicy: IfNotPresent)

kubectl diff -f payment-pod.yaml
# diff: shows what WOULD change if you applied this file to what's currently running
# Indispensable before updates: lets you see the diff before committing to a change
```
```
$ kubectl apply -f payment-pod.yaml --dry-run=client
pod/payment-pod configured (dry run)

$ kubectl apply -f payment-pod.yaml --dry-run=server
pod/payment-pod configured (server dry run)

$ kubectl diff -f payment-pod.yaml
diff -u -N /tmp/LIVE-123456/v1.Pod..default.payment-pod /tmp/MERGED-123456/v1.Pod..default.payment-pod
--- /tmp/LIVE-123456/v1.Pod..default.payment-pod
+++ /tmp/MERGED-123456/v1.Pod..default.payment-pod
@@ -7,7 +7,7 @@
   containers:
   - image: company/payment:1.4.1
-    resources:
-      limits:
-        cpu: 250m
+    resources:
+      limits:
+        cpu: 500m
```
What just happened?
--dry-run=client vs --dry-run=server: client mode checks the manifest without submitting it, which catches typos, missing required fields, and bad indentation. server mode sends the manifest to the real API server and runs it through the full validation pipeline, including admission controllers. It will catch things client mode misses, like a namespace that doesn't exist or a policy enforced by an admission webhook.
kubectl diff output — Lines starting with - are what's live in the cluster right now. Lines starting with + are what your YAML file would change it to. In this case, you can see a CPU limit is being increased from 250m to 500m. Reviewing this before every production apply is just good practice.
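For a sense of what these checks are meant to catch, here is a deliberately broken sketch of a Pod manifest. The container name is hypothetical, and the exact error wording you get depends on your kubectl and cluster versions, so none is shown here.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: payment-pod
spec:
  contianers:                  # deliberate typo: should be "containers"
    - name: payment            # hypothetical container name
      image: company/payment:1.4.1
```

The file is perfectly valid YAML, so a YAML linter alone would pass it. It is schema validation that should flag the unknown field and the missing required containers list, which is why a dry run belongs in the workflow before the change window.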
The YAML Structure at a Glance
Here's how all the pieces of a Kubernetes manifest relate to each other:
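One way to picture that structure is the outline below. It is a sketch with placeholder values in angle brackets, not a manifest you can apply as written.

```yaml
apiVersion: <group/version>    # which API handles this object, e.g. v1 or apps/v1
kind: <ObjectType>             # what you are creating, e.g. Pod or Service
metadata:                      # who the object is
  name: <object-name>
  namespace: <namespace>
  labels:
    <key>: <value>
spec:                          # what you want: the desired state
  <fields specific to the kind>
```

Everything above spec looks the same from one object type to the next; what goes inside spec is where a Pod, a Service, and a Deployment differ.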
Teacher's Note: spec vs status — the most important thing in this lesson
The entire Kubernetes reconciliation model comes down to two fields: spec (what you want) and status (what currently exists). You write spec. Kubernetes writes status. The control plane loops forever comparing the two, and when they don't match, it does work to close the gap. This loop is called the reconciliation loop and it's why Kubernetes is self-healing. You never touch status in your YAML — it's automatically populated by Kubernetes and you query it with kubectl describe.
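To see both halves side by side, you can ask for the live object as YAML with kubectl get pod catalog-pod -o yaml. The excerpt below is an abridged, illustrative sketch of what that might return for the Pod from earlier in this lesson; the IP address is made up and many fields are omitted.

```yaml
# Abridged, illustrative output of: kubectl get pod catalog-pod -o yaml
apiVersion: v1
kind: Pod
metadata:
  name: catalog-pod
  namespace: default
spec:                          # written by you
  containers:
    - name: catalog-app
      image: nginx:1.25
status:                        # written by Kubernetes; never put this in your manifest
  phase: Running
  podIP: 10.244.1.17           # hypothetical address
  containerStatuses:
    - name: catalog-app
      ready: true
      restartCount: 0
```

The ready and restartCount values under status are the source of the READY and RESTARTS columns you saw in kubectl get pods.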
Practice Questions
1. In a Kubernetes manifest, which top-level field contains the desired state of the object — things like how many replicas, which containers to run, and which ports to expose?
2. What three-character separator do you use in a YAML file to define multiple Kubernetes objects in a single file?
3. What is the correct apiVersion for a Kubernetes Deployment manifest?
Quiz
1. What does kubectl apply -f manifest.yaml --dry-run=client do?
2. A Kubernetes Service uses which metadata field on a Pod to determine which Pods to route traffic to?
3. Which top-level field in a Kubernetes manifest is automatically written by Kubernetes to reflect the current observed state, and should never be manually specified in your YAML files?
Up Next · Lesson 15
First Kubernetes Deployment
Everything clicks — we deploy a real multi-replica application end to end, from manifest to live traffic.