// Kubernetes Guide · Intermediate–Advanced

Kubernetes Complete Guide: Pods, Deployments, RBAC & Production Patterns

📅 Updated April 2026 ⏱ 15 min read 🏷 Kubernetes · DevOps · SRE · Containers
👨‍💻
master.devops
Practising DevOps Engineer with deep hands-on experience in Kubernetes, AWS, CI/CD, and SRE. Every guide is written from real production work.
// Table of Contents
  1. What is Kubernetes and Why Does It Exist?
  2. Kubernetes Architecture
  3. Core Workload Objects
  4. Networking — Services and Ingress
  5. Real YAML Examples
  6. Health Probes — The Most Misunderstood Feature
  7. Security and RBAC
  8. Autoscaling with HPA
  9. Debugging Kubernetes in Production
  10. Interview Q&A

Kubernetes (K8s) is the single most important skill in modern DevOps and SRE. I have used it daily in production — managing clusters on AWS EKS and Azure AKS, responding to incidents, tuning HPA policies, and designing multi-AZ high-availability architectures. This guide is written from that real-world experience, not from reading documentation.

Originally built by Google based on their internal system Borg, Kubernetes was open-sourced in 2014 and has since become the industry standard for running containerised applications at scale. If you are preparing for a Senior DevOps, Platform Engineer, or SRE interview, a deep, practical understanding of Kubernetes is non-negotiable.

What is Kubernetes and Why Does It Exist?

Before Kubernetes, running Docker containers at scale exposed a fundamental problem: Docker solved packaging, but not orchestration. What happens when a container crashes? How do you spread containers across 50 servers? How do you update 100 running containers without downtime? How do you handle a traffic spike at 2am?

Kubernetes answers all of these questions. It is an orchestration platform — a system that manages containers across a cluster of machines. You declare what you want (3 replicas of my API, always running, with 512MB RAM each) and Kubernetes makes it happen and maintains it — even when servers fail, containers crash, or traffic doubles.

Core philosophy: Kubernetes is a desired-state system. You describe the desired state in YAML. Kubernetes continuously reconciles actual state toward desired state. This reconciliation loop is the foundation of everything in the platform.
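As a minimal sketch of that desired-state declaration (the name my-api and the image are illustrative placeholders, not from a specific cluster):

```yaml
# desired-state.yaml: "3 replicas of my API, 512Mi RAM each"
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 3                 # desired state: always 3 Pods
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: api
          image: registry.example.com/my-api:1.0
          resources:
            requests:
              memory: "512Mi"   # scheduler reserves this per Pod
```

Delete one of these Pods and the reconciliation loop immediately creates a replacement, because actual state (2 Pods) no longer matches desired state (3).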

Kubernetes Architecture

A Kubernetes cluster has two types of machines: the control plane and worker nodes. Explaining this architecture is the first question in most K8s interviews.

Control Plane Components

The control plane makes the global decisions for the cluster. Its main components: the kube-apiserver (the front door; every kubectl command and every controller talks to it), etcd (the distributed key-value store holding all cluster state), the kube-scheduler (decides which node each new Pod lands on), and the kube-controller-manager (runs the reconciliation loops behind Deployments, ReplicaSets, Nodes, and more).

Worker Node Components

Worker nodes run the actual workloads. Each node runs the kubelet (the agent that starts containers and reports Pod status back to the API server), kube-proxy (programs iptables/IPVS rules so Service IPs route to Pods), and a container runtime such as containerd or CRI-O that pulls images and runs the containers.

Core Workload Objects

Pod — The Atomic Unit

A Pod is the smallest deployable unit. It contains one or more containers that share a network namespace (they talk to each other via localhost) and storage volumes. In practice, most Pods run a single container. The sidecar pattern (Istio's Envoy proxy, Vault's secret injector) is the main case for multi-container Pods.

Pods are ephemeral. They are not self-healing. If a Pod dies, it stays dead unless a controller (like a Deployment) creates a replacement. Never create bare Pods in production — always use a Deployment or StatefulSet.
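To illustrate the sidecar pattern mentioned above, here is a minimal two-container Pod sketch. The names (app, log-shipper) and the shared volume are illustrative, not a specific vendor's injector configuration; in production this Pod spec would live inside a Deployment template, never as a bare Pod:

```yaml
# sidecar-pod.yaml (illustrative sketch)
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0
      ports:
        - containerPort: 8080
      volumeMounts:
        - name: app-logs          # app writes logs here...
          mountPath: /var/log/app
    - name: log-shipper           # ...sidecar reads them from the same volume
      image: fluent/fluent-bit:2.2
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
  volumes:
    - name: app-logs              # shared scratch volume, dies with the Pod
      emptyDir: {}
```

Both containers also share one network namespace, so the sidecar could equally reach the app on localhost:8080.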

Deployment — For Stateless Applications

A Deployment manages a ReplicaSet (a set of identical Pods). You declare the number of replicas and the container image, and the Deployment controller ensures that number of Pods is always running. It also handles rolling updates — replacing Pods gradually so there is no downtime — and rollbacks when a new version fails.

StatefulSet — For Stateful Applications

StatefulSets are like Deployments but with three additional guarantees: stable Pod names (pod-0, pod-1, pod-2 — never random), stable per-Pod PersistentVolumeClaims that survive rescheduling, and ordered start/stop (pod-0 starts before pod-1, pod-1 stops before pod-0). Use StatefulSets for PostgreSQL, MySQL, Kafka, Redis Cluster, Cassandra, Elasticsearch — any workload where instance identity matters.
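A minimal StatefulSet sketch showing all three guarantees (names are illustrative; it assumes a headless Service called postgres-headless exists for the stable per-Pod DNS):

```yaml
# statefulset-sketch.yaml (illustrative)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless   # required: gives each Pod a stable DNS name
  replicas: 3                      # pods: postgres-0, postgres-1, postgres-2
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:            # one stable PVC per Pod, survives rescheduling
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```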

DaemonSet — One Pod Per Node

DaemonSets ensure exactly one Pod runs on every node (or a subset). Used for: log collectors (Fluentd, Filebeat), monitoring agents (Prometheus Node Exporter), network plugins (Calico CNI), and security agents (Falco). When a new node joins the cluster, the DaemonSet Pod is automatically scheduled on it.
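A short DaemonSet sketch for a node-exporter-style monitoring agent (illustrative names; the toleration is what lets it also land on tainted control-plane nodes):

```yaml
# daemonset-sketch.yaml (illustrative)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      tolerations:                  # run on control-plane nodes too
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      containers:
        - name: node-exporter
          image: prom/node-exporter:v1.7.0
          ports:
            - containerPort: 9100
```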

Networking — Services and Ingress

Kubernetes networking is the most common area where engineers get stuck. There are four distinct communication problems to understand: container-to-container (containers in the same Pod share a network namespace and talk over localhost), Pod-to-Pod (every Pod gets its own routable IP on a flat, NAT-free cluster network), Pod-to-Service (a Service puts one stable virtual IP and DNS name in front of an ever-changing set of Pods), and external-to-cluster (NodePort, LoadBalancer, and Ingress bring outside traffic in).

Service Types

ClusterIP (the default) exposes a Service on an internal virtual IP, reachable only inside the cluster. NodePort additionally opens the same static port (range 30000–32767) on every node. LoadBalancer builds on NodePort and provisions a cloud load balancer (AWS ELB, Azure LB) in front of it. ExternalName is a DNS-level CNAME alias pointing at a service outside the cluster.
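A minimal ClusterIP Service sketch; the selector and ports here assume the api-server Deployment used in the YAML examples in this guide:

```yaml
# service.yaml (illustrative)
apiVersion: v1
kind: Service
metadata:
  name: api-service
  namespace: production
spec:
  type: ClusterIP          # default: internal virtual IP only
  selector:
    app: api-server        # must match the Deployment's Pod labels
  ports:
    - port: 80             # the Service's stable port
      targetPort: 8080     # the containerPort it forwards to
```

A selector typo here is a classic outage: the Service exists, DNS resolves, but the endpoint list is empty and every request fails.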

Real YAML Examples

Production Deployment

# production-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  namespace: production
  labels:
    app: api-server
    version: "2.1.0"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0        # never drop below 100% capacity
      maxSurge: 1              # create 1 extra pod during update
  template:
    metadata:
      labels:
        app: api-server
        version: "2.1.0"
    spec:
      serviceAccountName: api-sa
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001        # NEVER run as root
      containers:
        - name: api
          image: registry.company.com/api:2.1.0
          ports:
            - containerPort: 8080
          resources:
            requests:          # scheduler uses this for placement
              cpu: "100m"
              memory: "256Mi"
            limits:            # OOMKilled if memory limit exceeded
              cpu: "500m"
              memory: "512Mi"
          env:
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:  # never hardcode secrets
                  name: db-secret
                  key: password
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 30   # give app time to start
            periodSeconds: 10
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "5"]   # drain connections gracefully
      terminationGracePeriodSeconds: 30

Ingress with TLS

# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/rate-limit: "100"   # 100 req/s per IP
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.company.com
      secretName: api-tls-cert   # cert-manager manages this
  rules:
    - host: api.company.com
      http:
        paths:
          - path: /api/v1
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
          - path: /health
            pathType: Exact
            backend:
              service:
                name: api-service
                port:
                  number: 80

HPA with Custom Metrics

# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 3
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # scale up when avg CPU > 60%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 min before scaling down
      policies:
        - type: Percent
          value: 10                     # scale down max 10% at a time
          periodSeconds: 60

Health Probes — The Most Misunderstood Feature

Health probe misconfiguration is the single most common cause of production incidents I have investigated. Get these wrong and you will have either CrashLoopBackOff (liveness probe too aggressive) or traffic routed to broken Pods (readiness probe missing).

Liveness probe failure → Kubernetes kills and restarts the container. Use for detecting deadlocks, infinite loops, or processes that are stuck but still "running".
Readiness probe failure → Pod is removed from Service endpoints. No traffic is routed to it. Container is NOT restarted. Use for: app warming up, database not yet connected, cache not yet populated.
Critical rule: Never put an external dependency check in a liveness probe. If your liveness probe calls your database and the database has a 30-second outage, Kubernetes will restart every pod in your fleet simultaneously. Use readiness for external dependency checks.
Startup probe tip: For Java apps with long JVM warm-up, set failureThreshold: 30 and periodSeconds: 10 on the startup probe. This gives 300 seconds (5 minutes) for startup before liveness kicks in. Without this, JVM apps frequently enter CrashLoopBackOff on first deploy.
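The startup-probe tip above, written out as a container-spec fragment (the /health/live path and port follow the deployment example earlier in this guide):

```yaml
# container spec excerpt: startup probe shields liveness during JVM warm-up
startupProbe:
  httpGet:
    path: /health/live
    port: 8080
  failureThreshold: 30   # 30 attempts...
  periodSeconds: 10      # ...x 10s = up to 300s to start
livenessProbe:           # only begins once the startup probe has succeeded
  httpGet:
    path: /health/live
    port: 8080
  periodSeconds: 10
```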

Security and RBAC

Role-Based Access Control (RBAC) is the primary security mechanism inside a Kubernetes cluster. The principle of least privilege applies everywhere: a Pod should only have access to the resources it actually needs, and nothing more.

# Create a role that can only read pods and logs
kubectl create role pod-reader \
  --verb=get,list,watch \
  --resource=pods,pods/log \
  -n production

# Bind it to a service account
kubectl create rolebinding api-pod-reader \
  --role=pod-reader \
  --serviceaccount=production:api-sa \
  -n production

# Verify — always test what you set up
kubectl auth can-i get pods \
  --as=system:serviceaccount:production:api-sa -n production
# Output: yes
kubectl auth can-i delete pods \
  --as=system:serviceaccount:production:api-sa -n production
# Output: no
Kubernetes Secrets are NOT encrypted by default. They are base64-encoded, which anyone with etcd access can trivially decode. For production, use HashiCorp Vault with the Agent Injector, Sealed Secrets (Bitnami), or External Secrets Operator with AWS Secrets Manager or Azure Key Vault. Enable etcd encryption at rest as a baseline.
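To see how thin the base64 "protection" is, decoding is a one-liner. The value below is a stand-in, not pulled from a real cluster, and the kubectl fetch is shown only as a comment:

```shell
# With cluster access you would fetch a Secret value via:
#   kubectl get secret db-secret -n production -o jsonpath='{.data.password}'
# Decoding it needs no key at all:
echo 'cGFzc3dvcmQ=' | base64 -d   # prints: password
```

Anyone who can read the Secret object, or the etcd file on disk, has the plaintext.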

Debugging Kubernetes in Production

This is the section that separates senior engineers from juniors in interviews. When a pod is broken in production, you need a systematic, fast debugging approach — not random kubectl commands.

# Step 1: Get pod status — look for Error, CrashLoopBackOff, Pending, OOMKilled
kubectl get pods -n production -o wide

# Step 2: Describe — the Events section is the most useful part
kubectl describe pod api-7d4b-xyz -n production
# Look for: ImagePullBackOff, resource limits, scheduling failures

# Step 3: Logs — use --previous for the crashed container's last logs
kubectl logs api-7d4b-xyz -n production --previous
kubectl logs api-7d4b-xyz -n production -c api --tail=100

# Step 4: Events — sorted by time, cluster-wide view
kubectl get events --sort-by='.lastTimestamp' -n production

# Step 5: Resource usage — check for OOMKilled (memory limit exceeded)
kubectl top pods -n production --sort-by=memory

# Step 6: Exec into a running container for live debugging
kubectl exec -it api-7d4b-xyz -n production -- sh

# Step 7: Check if the Service has healthy endpoints
kubectl get endpoints api-service -n production
# Empty endpoints = readiness probe failing or selector mismatch

# Step 8: Port-forward for local testing without Ingress
kubectl port-forward svc/api-service 8080:80 -n production

Interview Q&A

Q1: What is CrashLoopBackOff and how do you debug it?
CrashLoopBackOff means a container is crashing immediately after start and Kubernetes is restarting it with exponential backoff (10s, 20s, 40s up to 5 minutes). Common causes: application startup error, missing environment variable or ConfigMap key, liveness probe firing before app is ready, OOMKilled (memory limit too low), wrong ENTRYPOINT command. Debug with kubectl logs --previous to see the last crash output, and kubectl describe pod to see the exit code. Exit code 137 = OOMKilled, exit code 1 = application error.
Q2: What is the difference between Deployment and StatefulSet?
Deployments are for stateless applications — Pods are interchangeable, get random names, can share a single PVC or use no PVC. StatefulSets are for stateful applications needing stable identity: ordered names (pod-0, pod-1), per-Pod stable PVCs that survive rescheduling, and ordered startup/shutdown. Use StatefulSet for: databases (Postgres, MySQL), Kafka, Redis Cluster, Elasticsearch. Never use StatefulSet for stateless apps — it makes rolling updates unnecessarily sequential.
Q3: Liveness vs Readiness — what happens when each fails?
Liveness failure: kubelet kills the container and restarts it. Readiness failure: the Pod is removed from the Service endpoint list — no traffic, no restart. A Pod can be "live" but "not ready" (still warming up). The most common mistake: aggressive liveness probe on slow-starting app causes CrashLoopBackOff. The second most common: no readiness probe, so traffic hits Pods during startup before the app is ready to serve.
Q4: How does the Kubernetes scheduler decide where to place a Pod?
Two phases: Filtering eliminates nodes that cannot run the Pod — insufficient CPU/memory requests, taint not tolerated, node selector mismatch, affinity rule violation, node not Ready. Scoring ranks remaining nodes by least-allocated resources, pod topology spread score, image locality (node already has the image cached). The highest-scoring node wins. If no node passes filtering, the Pod stays Pending indefinitely. Check kubectl describe pod events for "FailedScheduling" to see why.
Q5: How do you ensure zero-downtime deployments?
Set maxUnavailable: 0 and maxSurge: 1 in the RollingUpdate strategy. Configure a proper readiness probe so the new Pod only joins the Service endpoints when it is actually ready to serve traffic. Set a preStop hook (sleep 5) to allow in-flight requests to complete before the container is terminated. Set terminationGracePeriodSeconds long enough for graceful shutdown. Test with a canary deployment first using a second Deployment with 1 replica pointing to the new image.
Q6: What is a PodDisruptionBudget and when do you need one?
A PodDisruptionBudget (PDB) limits how many Pods of a Deployment can be simultaneously unavailable during voluntary disruptions — node drains during maintenance, cluster upgrades, or node auto-scaling scale-down events. Without a PDB, a node drain could evict all replicas of a Deployment at once, causing downtime. Example: minAvailable: 2 ensures at least 2 Pods are always running. Set PDBs for every production Deployment with more than 1 replica.
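The minAvailable: 2 example from the answer above, as a manifest (it pairs with the api-server Deployment from the YAML examples section):

```yaml
# pdb.yaml (illustrative)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
  namespace: production
spec:
  minAvailable: 2          # voluntary evictions may never drop us below 2 Pods
  selector:
    matchLabels:
      app: api-server      # must match the Deployment's Pod labels
```

With this in place, a kubectl drain on a node hosting the third-to-last replica will block until a replacement Pod is ready elsewhere.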