Kubernetes Course
Mini Project: Production-Ready Cluster
This is the capstone of the course. You will deploy the dataplexa payment platform on EKS from scratch, applying the concepts from the earlier sections: namespaces, RBAC, Network Policies, TLS, Secrets management, Helm, HPA, Cluster Autoscaling, logging, monitoring, and a GitOps delivery pipeline. The manifests follow the production hardening patterns from those lessons; treat the account IDs, ARNs, and hostnames as placeholders for your own values.
What You Will Build
Architecture: dataplexa Payment Platform on EKS
Infrastructure
EKS 1.30, 3 AZs, Karpenter, AWS LB Controller, EBS CSI, External Secrets Operator, Fluent Bit, kube-prometheus-stack
Security
Pod Security Admission (restricted), RBAC least-privilege, Network Policies default-deny, IRSA, TLS + cert-manager, etcd encryption
Workloads
payment-api (Deployment + HPA), PostgreSQL (StatefulSet), Ingress (ALB + TLS), PodDisruptionBudgets, Velero backup
Step 1 — Bootstrap the EKS Cluster
# Create the cluster with eksctl
eksctl create cluster -f cluster.yaml # (full config from Lesson 59)
# Register the Helm repositories used below (Karpenter is pulled directly from its OCI registry)
helm repo add eks https://aws.github.io/eks-charts
helm repo add jetstack https://charts.jetstack.io
helm repo add external-secrets https://charts.external-secrets.io
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
# Install cluster-wide platform components via Helm
# 1. Karpenter node autoscaler
helm install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace karpenter --create-namespace \
  --set settings.clusterName=production \
  --set settings.interruptionQueue=production-karpenter
# 2. AWS Load Balancer Controller
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  --namespace kube-system \
  --set clusterName=production \
  --set serviceAccount.name=aws-load-balancer-controller
# 3. cert-manager (TLS automation)
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --set crds.enabled=true # on chart versions older than v1.15 use installCRDs=true
# 4. External Secrets Operator (secrets from AWS SM)
helm install external-secrets external-secrets/external-secrets \
  --namespace external-secrets --create-namespace
# 5. Monitoring stack
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace \
  --set prometheus.prometheusSpec.retention=30d \
  --set alertmanager.alertmanagerSpec.replicas=2
# 6. Fluent Bit log shipper
helm install fluent-bit fluent/fluent-bit \
  --namespace logging --create-namespace \
  -f fluent-bit-values.yaml
$ kubectl get pods -A | grep -v Running
# All pods Running -- platform bootstrapped ✓
$ kubectl get nodes
NAME                            STATUS   VERSION
ip-10-0-1-12.compute.internal   Ready    v1.30.2   ← 3 nodes, 1 per AZ
ip-10-0-2-44.compute.internal   Ready    v1.30.2
ip-10-0-3-78.compute.internal   Ready    v1.30.2
$ helm list -A
NAME                           NAMESPACE          STATUS
karpenter                      karpenter          deployed
aws-load-balancer-controller   kube-system        deployed
cert-manager                   cert-manager       deployed
external-secrets               external-secrets   deployed
kube-prometheus-stack          monitoring         deployed
fluent-bit                     logging            deployed   ← all platform charts deployed ✓
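Installing the Karpenter chart does not provision any nodes on its own: Karpenter only acts once a NodePool and an EC2NodeClass exist. The following is a minimal sketch assuming the Karpenter v1 APIs. The IAM role name `KarpenterNodeRole-production`, the `karpenter.sh/discovery: production` tags, and the CPU limit are illustrative assumptions, not values taken from this lesson.

```yaml
# Sketch only: role name, discovery tags, and limits are assumptions
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
  - alias: al2023@latest
  role: KarpenterNodeRole-production      # hypothetical node IAM role
  subnetSelectorTerms:
  - tags:
      karpenter.sh/discovery: production  # assumed subnet tag
  securityGroupSelectorTerms:
  - tags:
      karpenter.sh/discovery: production  # assumed security group tag
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot", "on-demand"]
      - key: kubernetes.io/arch
        operator: In
        values: ["amd64"]
  limits:
    cpu: "100"                            # illustrative ceiling on total provisioned CPU
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
```

Naming the NodePool `default` matches the `default-abc123` NodeClaim shown in the final validation output, since Karpenter prefixes NodeClaim names with the NodePool name.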
Step 2 — Namespace, RBAC, and Security Baseline
# namespace.yaml -- payments namespace with all security controls
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    # Pod Security Admission: enforce restricted profile (Lesson 41)
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: "v1.30"
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
    team: payments
    environment: production
---
# RBAC: developers can view but never read Secrets (Lesson 37-38)
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-developers-view
  namespace: payments
subjects:
- kind: Group
  name: company:payments-team
  apiGroup: rbac.authorization.k8s.io
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view # Built-in: read-only, deliberately excludes Secrets
---
# Network Policy: default deny all ingress and egress (Lesson 36)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}
  policyTypes: ["Ingress", "Egress"]
---
# Network Policy: allow DNS (without this, nothing works)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: payments
spec:
  podSelector: {}
  policyTypes: ["Egress"]
  egress:
  - ports:
    - port: 53
      protocol: UDP
    - port: 53
      protocol: TCP
$ kubectl apply -f namespace.yaml
namespace/payments created
rolebinding.rbac.authorization.k8s.io/payments-developers-view created
networkpolicy.networking.k8s.io/default-deny-all created
networkpolicy.networking.k8s.io/allow-dns-egress created
$ kubectl get namespace payments -o jsonpath='{.metadata.labels}' | python3 -m json.tool
{
"pod-security.kubernetes.io/enforce": "restricted",
"pod-security.kubernetes.io/warn": "restricted",
"pod-security.kubernetes.io/audit": "restricted",
"team": "payments",
"environment": "production"
} ← PSA restricted enforced ✓
$ kubectl get networkpolicy -n payments
NAME POD-SELECTOR AGE
default-deny-all <none> 5s ← blocks all traffic by default ✓
allow-dns-egress <none> 5s ← DNS permitted ✓
Step 3 — Secrets via External Secrets Operator
# secrets.yaml -- all secrets sourced from AWS Secrets Manager (Lesson 42)
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: eso-sa
            namespace: external-secrets
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: postgres-credentials-sync
  namespace: payments
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: postgres-credentials
    creationPolicy: Owner
  data:
  - secretKey: POSTGRES_PASSWORD
    remoteRef:
      key: production/payments/postgres
      property: password
  - secretKey: DATABASE_URL
    remoteRef:
      key: production/payments/postgres
      property: database_url
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: payment-gateway-sync
  namespace: payments
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: payment-gateway-credentials
    creationPolicy: Owner
  data:
  - secretKey: GATEWAY_API_KEY
    remoteRef:
      key: production/payments/gateway
      property: api_key
  - secretKey: GATEWAY_WEBHOOK_SECRET
    remoteRef:
      key: production/payments/gateway
      property: webhook_secret
$ kubectl apply -f secrets.yaml
clustersecretstore.external-secrets.io/aws-secrets-manager created
externalsecret.external-secrets.io/postgres-credentials-sync created
externalsecret.external-secrets.io/payment-gateway-sync created
$ kubectl get externalsecret -n payments
NAME                        STORE                 REFRESH   STATUS   READY
postgres-credentials-sync   aws-secrets-manager   1h        Valid    True ✓
payment-gateway-sync        aws-secrets-manager   1h        Valid    True ✓
$ kubectl get secrets -n payments
NAME                          TYPE     DATA
postgres-credentials          Opaque   2   ← created by ESO, sourced from AWS SM ✓
payment-gateway-credentials   Opaque   2 ✓
# No secret values in Git -- only ExternalSecret paths ✓
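The ClusterSecretStore above authenticates through a ServiceAccount named eso-sa in the external-secrets namespace, which this lesson does not define. If it was not created earlier in the course, a minimal sketch could look like the following. The IAM role name is hypothetical, and the role is assumed to allow at least secretsmanager:GetSecretValue on the production/payments/* secrets.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: eso-sa
  namespace: external-secrets
  annotations:
    # Hypothetical role name; scope its policy to the secrets this cluster needs
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/eso-secrets-reader
```

Keeping this identity separate from the workload ServiceAccounts means the payment-api Pods never need direct IAM access to Secrets Manager.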
Step 4 — PostgreSQL StatefulSet
# postgres.yaml -- production PostgreSQL StatefulSet (Lesson 30)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: postgres-sa
  namespace: payments
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/postgres-role
automountServiceAccountToken: false
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: payments
spec:
  clusterIP: None # Headless: per-Pod DNS for StatefulSet
  selector:
    app: postgres
  ports:
  - port: 5432
    name: postgres
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: payments
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      serviceAccountName: postgres-sa
      securityContext:
        runAsNonRoot: true
        runAsUser: 70 # postgres user in the Alpine image (the Debian image uses 999)
        fsGroup: 70
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: postgres
        image: postgres:15-alpine
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: false # Postgres needs to write its data dir
          capabilities:
            drop: ["ALL"]
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-credentials
              key: POSTGRES_PASSWORD
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
          limits:
            cpu: 1000m
            memory: 2Gi
        readinessProbe:
          exec:
            command: ["pg_isready", "-U", "postgres"]
          initialDelaySeconds: 10
          periodSeconds: 5
  volumeClaimTemplates:
  - metadata:
      name: postgres-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: gp3-encrypted
      resources:
        requests:
          storage: 50Gi
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: postgres-pdb
  namespace: payments
spec:
  # Never evict the only postgres Pod during node drain.
  # Trade-off: a drain of that node blocks until the Pod is moved or deleted manually.
  maxUnavailable: 0
  selector:
    matchLabels:
      app: postgres
$ kubectl apply -f postgres.yaml
serviceaccount/postgres-sa created
service/postgres created
statefulset.apps/postgres created
poddisruptionbudget.policy/postgres-pdb created
$ kubectl get pods -n payments -w
postgres-0   0/1   ContainerCreating
postgres-0   1/1   Running   ← StatefulSet Pod Ready ✓
$ kubectl get pvc -n payments
NAME                       STATUS   VOLUME       CAPACITY   STORAGECLASS
postgres-data-postgres-0   Bound    pvc-a1b2c3   50Gi       gp3-encrypted ✓
# Verify DNS name works. Run the lookup from an existing Pod: a bare `kubectl run` Pod
# has no securityContext, so the restricted PSA profile on this namespace would reject it.
$ kubectl exec -n payments postgres-0 -- \
    nslookup postgres-0.postgres.payments.svc.cluster.local
Address 1: 192.168.2.15 ← per-Pod DNS confirmed ✓
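The volumeClaimTemplates above reference a StorageClass named gp3-encrypted, which must exist before the PVC can bind. If it was not created in an earlier lesson, a minimal sketch using the EBS CSI driver could look like this. The reclaim policy and binding mode are assumptions chosen for a database volume, not settings confirmed by this lesson.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"                        # encrypt volumes at rest
volumeBindingMode: WaitForFirstConsumer    # create the volume in the AZ where the Pod is scheduled
reclaimPolicy: Retain                      # assumed: keep the EBS volume if the PVC is deleted
allowVolumeExpansion: true
```

WaitForFirstConsumer matters in a multi-AZ cluster because an EBS volume can only attach to nodes in its own Availability Zone.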
Step 5 — Payment API Deployment
# payment-api.yaml -- fully hardened Deployment with HPA and network access
apiVersion: v1
kind: ServiceAccount
metadata:
  name: payment-api-sa
  namespace: payments
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/payment-api-role
automountServiceAccountToken: true # Needed for IRSA token projection
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-api
  namespace: payments
  annotations:
    secret.reloader.stakater.com/reload: "postgres-credentials,payment-gateway-credentials"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-api
  template:
    metadata:
      labels:
        app: payment-api
        version: "3.1.0"
    spec:
      serviceAccountName: payment-api-sa
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: payment-api
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: payment-api
        image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/payment-api:3.1.0
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop: ["ALL"]
        ports:
        - containerPort: 8080
          name: http
        - containerPort: 9090
          name: metrics
        envFrom:
        - secretRef:
            name: postgres-credentials
        - secretRef:
            name: payment-gateway-credentials
        env:
        - name: LOG_LEVEL
          value: "warn"
        - name: PORT
          value: "8080"
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: 1000m
            memory: 512Mi
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
      volumes:
      - name: tmp
        emptyDir: {}
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: payment-api
  namespace: payments
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0
    scaleDown:
      stabilizationWindowSeconds: 300
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payment-api-pdb
  namespace: payments
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: payment-api
---
# Allow payment-api to reach postgres and external gateways
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-payment-api-egress
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-api
  policyTypes: ["Egress"]
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: postgres
    ports:
    - port: 5432
  - to:
    # External HTTPS to payment gateways. An ipBlock is required here:
    # namespaceSelector only matches Pods inside the cluster, never external IPs.
    - ipBlock:
        cidr: 0.0.0.0/0
    ports:
    - port: 443
$ kubectl apply -f payment-api.yaml
serviceaccount/payment-api-sa created
deployment.apps/payment-api created
horizontalpodautoscaler.autoscaling/payment-api created
poddisruptionbudget.policy/payment-api-pdb created
networkpolicy.networking.k8s.io/allow-payment-api-egress created
$ kubectl get pods -n payments -o wide
NAME                      READY   NODE                            ZONE
payment-api-7d9f4-xkp2m   1/1     ip-10-0-1-12.compute.internal   us-east-1a
payment-api-7d9f4-rvqn2   1/1     ip-10-0-2-44.compute.internal   us-east-1b
payment-api-7d9f4-m4czl   1/1     ip-10-0-3-78.compute.internal   us-east-1c
# One Pod per AZ -- topologySpreadConstraints working ✓
$ kubectl get hpa payment-api -n payments
NAME          REFERENCE                TARGETS   MINPODS   MAXPODS   REPLICAS
payment-api   Deployment/payment-api   38%/60%   3         20        3 ✓
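One gap is worth calling out: the egress rule above lets payment-api send traffic to postgres on 5432, but default-deny-all from Step 2 also denies ingress on the postgres Pods, and a connection needs both sides allowed. A minimal sketch of the matching ingress policy follows. The policy name is hypothetical and it is not part of the manifests listed in this lesson.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-payment-api-to-postgres   # hypothetical name
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: postgres                     # policy applies to the database Pods
  policyTypes: ["Ingress"]
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: payment-api              # only the API Pods in this namespace may connect
    ports:
    - port: 5432
      protocol: TCP
```

If this policy is added, the Network Policies check in the final validation would list it alongside the four policies shown there.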
Step 6 — TLS, Ingress, and Service
# ingress.yaml -- Service, NetworkPolicy, and ALB Ingress with TLS (ACM certificate) and WAF
apiVersion: v1
kind: Service
metadata:
  name: payment-api
  namespace: payments
  labels:
    app: payment-api # Matched by the ServiceMonitor selector in Step 7
spec:
  selector:
    app: payment-api
  ports:
  - name: http
    port: 80
    targetPort: 8080
  - name: metrics
    port: 9090
    targetPort: 9090
---
# Allow ingress from the ALB to reach payment-api Pods.
# With target-type: ip the ALB connects directly to Pod IPs from its own ENIs in the VPC subnets,
# so the source is an IP range, not the controller Pods in kube-system.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-payment-api
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-api
  policyTypes: ["Ingress"]
  ingress:
  - from:
    - ipBlock:
        cidr: 10.0.0.0/16 # Assumed VPC CIDR (the node names suggest 10.0.x.x); narrow to the ALB subnets
    ports:
    - port: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: payment-api
  namespace: payments
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443},{"HTTP":80}]'
    alb.ingress.kubernetes.io/ssl-redirect: "443"
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:123456789012:certificate/abc-def
    alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-TLS13-1-2-2021-06
    alb.ingress.kubernetes.io/wafv2-acl-arn: arn:aws:wafv2:us-east-1:123456789012:regional/webacl/production-waf/xyz
    alb.ingress.kubernetes.io/group.name: production
    alb.ingress.kubernetes.io/healthcheck-path: /health
spec:
  ingressClassName: alb # Replaces the deprecated kubernetes.io/ingress.class annotation
  rules:
  - host: api.dataplexa.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: payment-api
            port:
              number: 80
$ kubectl apply -f ingress.yaml
service/payment-api created
networkpolicy.networking.k8s.io/allow-ingress-to-payment-api created
ingress.networking.k8s.io/payment-api created
$ kubectl get ingress payment-api -n payments
NAME CLASS HOSTS ADDRESS PORTS
payment-api alb api.dataplexa.com k8s-prod-payment-abc.us-east-1.elb.amazonaws.com 80, 443 ✓
# End-to-end test
$ curl -s https://api.dataplexa.com/health
{"status":"ok","version":"3.1.0","db":"connected"} ✓
# Confirm TLS is active
$ curl -vI https://api.dataplexa.com/health 2>&1 | grep -E "SSL|TLS|issuer|subject"
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* Server certificate: CN=api.dataplexa.com
* issuer: CN=Amazon RSA 2048 M03 ← ACM certificate ✓
Step 7 — Monitoring and Alerting
# monitoring.yaml -- ServiceMonitor and PrometheusRule
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: payment-api
  namespace: payments
  labels:
    release: kube-prometheus-stack
spec:
  selector:
    matchLabels:
      app: payment-api
  endpoints:
  - port: metrics
    path: /metrics
    interval: 30s
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: payment-api-alerts
  namespace: payments
  labels:
    release: kube-prometheus-stack
spec:
  groups:
  - name: payment-api
    rules:
    - alert: PaymentErrorRateTooHigh
      expr: |
        (sum(rate(payments_total{status="failed"}[5m]))
        / sum(rate(payments_total[5m]))) * 100 > 1
      for: 2m
      labels:
        severity: critical
        team: payments
      annotations:
        summary: "Payment error rate above 1%"
        description: "Error rate {{ $value | humanize }}% -- SLO breach"
        runbook_url: "https://wiki.dataplexa.com/runbooks/payment-errors"
    - alert: PodCrashLooping
      expr: increase(kube_pod_container_status_restarts_total{namespace="payments"}[1h]) > 3
      for: 0m
      labels:
        severity: warning
      annotations:
        summary: "Pod {{ $labels.pod }} crash-looping in payments"
$ kubectl apply -f monitoring.yaml
servicemonitor.monitoring.coreos.com/payment-api created
prometheusrule.monitoring.coreos.com/payment-api-alerts created
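Prometheus runs in the monitoring namespace, and the only ingress that Step 6 opens on the payment-api Pods is port 8080. Under default-deny-all, scrapes of the metrics port would therefore be dropped unless they are allowed explicitly. A minimal sketch of such a policy is shown here. The policy name is hypothetical and it is not one of the manifests listed in this lesson.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape         # hypothetical name
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-api
  policyTypes: ["Ingress"]
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring   # label set automatically on every namespace
    ports:
    - port: 9090                        # the container's metrics port
      protocol: TCP
```

If the target check that follows reports "down" rather than "up", a missing ingress rule like this one is a likely first suspect.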
# Verify Prometheus is scraping the payment-api
$ kubectl port-forward svc/kube-prometheus-stack-prometheus 9090:9090 -n monitoring &
$ curl -s 'http://localhost:9090/api/v1/targets' | jq '.data.activeTargets[] | select(.labels.job=="payment-api") | .health'
"up" ← Prometheus scraping payment-api metrics ✓
# Verify alerts are loaded
$ curl -s 'http://localhost:9090/api/v1/rules' | jq '.data.groups[] | select(.name=="payment-api") | .rules[].name'
"PaymentErrorRateTooHigh"
"PodCrashLooping" ✓
# Current SLO status:
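# -G with --data-urlencode sends the PromQL as an encoded query string; unencoded {} and [] would trigger curl's URL globbing
$ curl -sG 'http://localhost:9090/api/v1/query' --data-urlencode 'query=sum(rate(payments_total{status="failed"}[5m]))/sum(rate(payments_total[5m]))*100' | jq '.data.result[0].value[1]'
"0.26" ← 0.26% error rate -- SLO is 1%, we are green ✓
Step 8 — Backup Schedule with Velero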
# Install Velero and configure nightly backups (Lesson 57)
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.9.0 \
  --bucket dataplexa-velero-backups \
  --backup-location-config region=us-east-1 \
  --snapshot-location-config region=us-east-1 \
  --secret-file ./velero-credentials
# Schedule: nightly backup of the payments namespace
velero schedule create payments-nightly \
  --schedule="0 2 * * *" \
  --include-namespaces payments \
  --ttl 720h
# Test restore process NOW before you need it (quarterly drill)
velero backup create payments-dr-drill --include-namespaces payments --wait
velero restore create payments-dr-test \
  --from-backup payments-dr-drill \
  --namespace-mappings payments:payments-restored \
  --wait
kubectl get pods -n payments-restored # Verify workloads restored
kubectl delete namespace payments-restored # Clean up drill
$ velero schedule create payments-nightly --schedule="0 2 * * *" \
    --include-namespaces payments --ttl 720h
Schedule "payments-nightly" created successfully.
$ velero backup create payments-dr-drill --include-namespaces payments --wait
Backup completed with status: Completed ✓
$ velero restore create payments-dr-test \
    --from-backup payments-dr-drill \
    --namespace-mappings payments:payments-restored --wait
Restore completed with status: Completed
Errors: 0 Warnings: 0 ✓
$ kubectl get pods -n payments-restored
NAME                      READY   STATUS
payment-api-7d9f4-xkp2m   1/1     Running   ← restored from backup ✓
postgres-0                1/1     Running ✓
$ kubectl delete namespace payments-restored
namespace "payments-restored" deleted ← drill complete, cluster clean ✓
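The `velero schedule create` command has a declarative equivalent, which may suit a setup where cluster state is delivered from Git. The sketch below mirrors the flags used above and assumes Velero is installed in its default `velero` namespace.

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: payments-nightly
  namespace: velero          # assumed: Velero's default install namespace
spec:
  schedule: "0 2 * * *"      # 02:00 every day
  template:
    includedNamespaces:
    - payments
    ttl: 720h0m0s            # keep each backup for 30 days
```

Applying this manifest and running the CLI command should not be combined, since both create a Schedule with the same name.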
Step 9 — Final Production Validation
# Run the complete production readiness checklist
echo "=== 1. All Pods Running ==="
kubectl get pods -n payments
# Expected: payment-api x3 Running, postgres-0 Running
echo "=== 2. PSA Enforcement Active ==="
kubectl get namespace payments -o jsonpath='{.metadata.labels.pod-security\.kubernetes\.io/enforce}'
# Expected: restricted
echo "=== 3. No Secrets in Git ==="
kubectl get externalsecret -n payments
# Expected: all Ready=True
echo "=== 4. HPA Active ==="
kubectl get hpa -n payments
# Expected: TARGETS shows current CPU%, REPLICAS=3
echo "=== 5. Network Policies ==="
kubectl get networkpolicy -n payments
# Expected: default-deny-all, allow-dns-egress, allow-payment-api-egress, allow-ingress-to-payment-api
echo "=== 6. TLS Working ==="
curl -sI https://api.dataplexa.com/health | grep HTTP
# Expected: HTTP/2 200
echo "=== 7. Metrics Being Scraped ==="
kubectl get servicemonitor payment-api -n payments
# Expected: AGE shows monitor is active
echo "=== 8. Backups Scheduled ==="
velero schedule get
# Expected: payments-nightly scheduled
echo "=== 9. Cluster Autoscaling Ready ==="
kubectl get nodeclaims
# Expected: Karpenter NodeClaims showing running nodes
echo "=== 10. Logs Shipping ==="
kubectl get pods -n logging
# Expected: fluent-bit DaemonSet pods all Running
=== 1. All Pods Running ===
NAME                      READY   STATUS
payment-api-7d9f4-xkp2m   1/1     Running ✓
payment-api-7d9f4-rvqn2   1/1     Running ✓
payment-api-7d9f4-m4czl   1/1     Running ✓
postgres-0                1/1     Running ✓
=== 2. PSA Enforcement Active ===
restricted ✓
=== 3. No Secrets in Git ===
NAME                        READY   STATUS
postgres-credentials-sync   True    SecretSynced ✓
payment-gateway-sync        True    SecretSynced ✓
=== 4. HPA Active ===
NAME          TARGETS   MINPODS   MAXPODS   REPLICAS
payment-api   38%/60%   3         20        3 ✓
=== 5. Network Policies ===
NAME                           POD-SELECTOR
default-deny-all               <none> ✓
allow-dns-egress               <none> ✓
allow-payment-api-egress       app=payment-api ✓
allow-ingress-to-payment-api   app=payment-api ✓
=== 6. TLS Working ===
HTTP/2 200 ✓
=== 7. Metrics Being Scraped ===
NAME          AGE
payment-api   12m ✓
=== 8. Backups Scheduled ===
NAME               STATUS    SCHEDULE    LAST BACKUP
payments-nightly   Enabled   0 2 * * *   2h ago ✓
=== 9. Cluster Autoscaling Ready ===
NAME             TYPE        CAPACITY   NODE
default-abc123   m5.xlarge   spot       ip-10-0-1-12 ✓
=== 10. Logs Shipping ===
NAME               READY   STATUS
fluent-bit-xkp2m   1/1     Running ✓
fluent-bit-7rvqn   1/1     Running ✓
fluent-bit-m4czl   1/1     Running ✓
ALL CHECKS PASSED ✅ -- dataplexa payment platform is production-ready
Course Complete — What You Have Built
You have deployed a production-grade Kubernetes platform from scratch. Here is everything active in your cluster right now:
Security Controls
✅ PSA restricted on all app namespaces
✅ RBAC least-privilege per workload
✅ Network Policies default-deny
✅ IRSA — no AWS keys in cluster
✅ TLS on all external endpoints
✅ Secrets from AWS SM, not Git
Reliability Controls
✅ HPA scales 3→20 replicas on CPU
✅ Karpenter adds/removes nodes
✅ topologySpreadConstraints: 1 per AZ
✅ PodDisruptionBudgets protect all services
✅ Velero nightly backups + DR tested
✅ Prometheus alerts on SLO breach
Teacher's Note: What comes next
You now have the foundations to run Kubernetes in production. The next areas to explore as you deepen your expertise are service meshes (Istio or Linkerd) for mutual TLS between services and traffic management, FinOps tooling (Kubecost or OpenCost) for understanding and optimising your cluster spend, and multi-cluster management for running workloads across regions or cloud providers.
The best way to solidify what you have learned is to rebuild this project from scratch without referencing the lessons — just the Kubernetes documentation. If you can produce a running, secure, observable cluster from a blank EKS console, you understand Kubernetes at a production level.
The official certifications worth pursuing next: CKA (Certified Kubernetes Administrator) for operations depth, CKAD (Certified Kubernetes Application Developer) for workload patterns, and CKS (Certified Kubernetes Security Specialist) for the full security control set. Everything in this course maps directly onto those exam objectives.
Final Quiz — Put It All Together
1. You have 3 replicas of the payment-api. An entire AWS Availability Zone fails. What Kubernetes feature ensures the other two replicas are still serving traffic in the remaining zones?
2. A new engineer asks: "The postgres-credentials Secret exists in the cluster but I can't find it in the Git repo. Where is the password stored?" What is the correct answer?
3. A new deployment of the payment-api is applied. Pods show Running and Ready, but the ALB health checks are failing and curl returns 502. What is the first command to run and why?
Course Complete
Kubernetes for Production
60 lessons · Lessons 14–60 · dataplexa.com