Kubernetes rolling updates and zero-downtime deployments

A rolling update replaces pods one at a time, but the defaults do not prevent dropped connections. Without the right readiness probe, preStop hook, and terminationGracePeriodSeconds, users see 502s every time you deploy. This guide walks through each setting that matters and ends with a complete, production-ready Deployment manifest.

What you will have at the end

A Deployment that rolls out new versions without dropping a single connection. The configuration covers strategy tuning, readiness gating, the endpoint-removal race condition, and a PodDisruptionBudget that protects against voluntary disruptions.

Prerequisites

A cluster you can deploy to, kubectl configured against it, and a stateless HTTP service already packaged as a Deployment. The manifests use apps/v1 and policy/v1 resources, so any reasonably current Kubernetes version works.

RollingUpdate vs. Recreate

Kubernetes supports two strategy types:

RollingUpdate (default) creates pods in a new ReplicaSet while scaling down the old one. At every point during the rollout, a mix of old and new pods serves traffic. This is the only strategy that can deliver zero downtime.

Recreate kills all existing pods before creating new ones. The service is completely unavailable between teardown and readiness. Use it only when two versions of the application cannot coexist safely (a database migration that breaks backward compatibility, for example).
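
If two versions truly cannot coexist, Recreate is a one-line strategy change; everything else in this article applies again once you switch back to RollingUpdate:

spec:
  strategy:
    type: Recreate   # terminate every old pod before creating any new one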

For stateless HTTP services, RollingUpdate is the correct choice. The rest of this article assumes it. If you are wiring this Deployment to a CI/CD pipeline, pair the rollout settings below with the GitHub Actions workflow that builds and deploys container images so the cluster handles pod replacement gracefully while the pipeline handles everything upstream.

Tune maxUnavailable and maxSurge

Two fields control how fast the rollout proceeds:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never reduce capacity below desired count
      maxSurge: 1          # create one extra pod at a time

maxUnavailable sets the maximum number of pods that may be unavailable during the update; it accepts an integer or a percentage (percentages round down). maxSurge sets how many extra pods may exist above the desired count (percentages round up). Kubernetes forbids setting both to 0.

The defaults are 25% for both. With 8 replicas that means up to 2 pods can be unavailable and 2 extra pods can exist simultaneously. That is aggressive enough to cause visible capacity dips.

For user-facing services, set maxUnavailable: 0. This guarantees that full desired capacity is always available. No old pod is removed until a new one is Ready.

Pattern | maxUnavailable | maxSurge | Trade-off
Safest, slowest | 0 | 1 | Full capacity always; one pod at a time
Safe, faster | 0 | 25% | Full capacity; multiple surge pods accelerate rollout
Balanced (default) | 25% | 25% | Allows brief capacity dip; good for non-critical services

Keep in mind that maxSurge requires extra node capacity. Setting it to 100% doubles the pod count temporarily.
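
As a sketch of the "safe, faster" row above, a larger Deployment can keep maxUnavailable at 0 and let the surge scale with the replica count, assuming the nodes have headroom for the extra pods:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never drop below desired capacity
      maxSurge: 25%       # with 8 replicas, up to 2 surge pods (rounds up)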

Add a stability buffer with minReadySeconds

spec:
  minReadySeconds: 10

A pod only counts as Available after it has been continuously Ready for minReadySeconds. This catches applications that pass the readiness probe once but crash seconds later. The default is 0, which means the rollout advances the instant the readiness probe first succeeds.

Gate traffic with readiness probes

Without a readiness probe, Kubernetes routes traffic to a pod the moment its container starts. During a rolling update, that means requests reach an application that has not finished booting.

readinessProbe:
  httpGet:
    path: /healthz/ready
    port: 8080
  initialDelaySeconds: 5      # match your app's cold-start time
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 3
  successThreshold: 1

The readiness probe controls whether a pod is included in Service EndpointSlices. During a rolling update, a new pod's readiness probe must pass before Kubernetes considers it Available and before it terminates old pods (when maxUnavailable is 0).

For applications with slow or unpredictable boot times (JVM warm-up, model loading), add a startup probe instead of inflating initialDelaySeconds. The startup probe blocks readiness and liveness checks until boot is complete.
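
A minimal sketch, assuming the same /healthz/ready endpoint is a sensible boot check: this startup probe allows up to 150 seconds of boot time (30 failures x 5 seconds) before the kubelet restarts the container, and readiness and liveness checks only begin once it has succeeded.

startupProbe:
  httpGet:
    path: /healthz/ready   # assumed boot-check endpoint; use whatever your app exposes
    port: 8080
  periodSeconds: 5
  failureThreshold: 30     # 30 x 5s = up to 150s of boot time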

If you are running behind a cloud load balancer (AWS ALB/NLB, GCP GCLB), consider Pod Readiness Gates. The ALB controller injects a custom readiness condition that only becomes True when the target group registers the pod as Healthy. Without it, the rollout may terminate old pods before the load balancer has even started sending traffic to the new ones.
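
With the AWS Load Balancer Controller, for example, readiness gate injection is usually switched on per namespace with a label, after which the controller adds a target-health condition to the pods it manages; treat the exact label as an assumption and check your controller version's documentation:

apiVersion: v1
kind: Namespace
metadata:
  name: myapp
  labels:
    elbv2.k8s.aws/pod-readiness-gate-inject: enabled   # assumed injection label for the AWS LB Controller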

If you are picking probes for a brand-new Deployment, see Kubernetes liveness, readiness, and startup probes: configuration guide, which walks through the decision tree and suggests starter values per runtime. For a deeper dive into probe types, mechanisms, and timing pitfalls when something is misbehaving, see how to configure Kubernetes health probes.

Solve the endpoint-removal race condition

This is the single most common cause of 502 errors during deployments. When Kubernetes decides to terminate a pod, two independent processes start at the same time:

  1. The kubelet executes the preStop hook, then sends SIGTERM.
  2. The endpoint controller removes the pod from EndpointSlices, triggering kube-proxy on every node to update iptables/IPVS rules.

These two tracks have no synchronization. The kube-proxy minSyncPeriod defaults to 1 second, but actual propagation across all nodes can take several seconds. During that window, kube-proxy on some nodes still routes new connections to the terminating pod. If the application has already stopped accepting connections, those requests fail.
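
You can watch the removal happen during a rollout. Assuming the Service fronting this Deployment is named myapp, the terminating pod's address drops out of the EndpointSlice shortly after termination begins rather than instantly:

kubectl get endpointslices -l kubernetes.io/service-name=myapp -w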

The fix is a preStop sleep that holds the pod alive long enough for endpoint removal to propagate everywhere:

spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: app
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 15"]

During the 15-second sleep, the pod still accepts connections while all kube-proxies catch up. After the sleep completes, Kubernetes sends SIGTERM and the application starts its shutdown sequence.

How long should the preStop sleep be?

Environment | Recommended sleep | Why
Simple on-prem cluster | 5 to 10 seconds | Fast iptables propagation
Cloud with external load balancer (ALB/NLB) | 35+ seconds | Cloud LBs take 15 to 30 seconds to deregister targets
Service mesh (Istio, Linkerd) | 5 to 10 seconds + mesh drain | The sidecar proxy has its own drain cycle

Budget terminationGracePeriodSeconds correctly

The timer starts when the preStop hook begins executing, not after it finishes. Your budget must be:

terminationGracePeriodSeconds >= preStop_sleep + application_shutdown_time + buffer

With a 15-second preStop sleep and an application that needs 10 seconds to drain connections:

terminationGracePeriodSeconds: 30   # 15 + 10 + 5 buffer

If the timer expires before the application exits, Kubernetes sends SIGKILL and drops all in-flight requests.

For complete coverage of SIGTERM handling, connection draining in Go/Node.js/Java/Python, and testing your shutdown, see Kubernetes graceful shutdown.

Protect against voluntary disruptions with PodDisruptionBudgets

A rolling update is not the only event that terminates pods. kubectl drain, cluster autoscaler scale-down, and cloud provider node pool upgrades are voluntary disruptions that can evict pods independently of the Deployment controller. Without a PodDisruptionBudget, a kubectl drain can evict all your pods at once.
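
A typical drain, the kind a managed node pool upgrade runs on your behalf, looks like this (the node name is illustrative):

kubectl drain ip-10-0-4-17.ec2.internal --ignore-daemonsets --delete-emptydir-data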

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  maxUnavailable: 1
  unhealthyPodEvictionPolicy: AlwaysAllow
  selector:
    matchLabels:
      app: myapp

maxUnavailable: 1 allows voluntary disruptions to remove one pod at a time. For higher replica counts, maxUnavailable: 25% is a reasonable starting point.

The unhealthyPodEvictionPolicy: AlwaysAllow field (Kubernetes 1.26+) prevents PDBs from blocking node drains when pods are already crash-looping. Without it, a broken pod that is never Ready counts toward the disruption budget, causing kubectl drain to hang indefinitely.

Do not set maxUnavailable: 0 unless you run a quorum-based system that truly cannot lose a single member. It blocks all voluntary evictions, including routine cluster maintenance.

PDBs and rolling updates are independent. PDBs do not control how the Deployment controller rolls out pods. They protect against external disruptions that happen to coincide with a rollout.
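
To confirm the budget is active, check the ALLOWED DISRUPTIONS column; if it reads 0, voluntary evictions, and therefore drains, will block:

kubectl get pdb myapp-pdb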

Blue/green and canary as alternatives

Rolling updates work well for most stateless services, but two patterns offer additional control when you need it. Blue-green runs the new version alongside the old in a parallel Deployment and switches traffic atomically with a Service-selector patch, giving you instant rollback for the cost of double compute. Canary ships the new version to a small percentage of traffic first (using replica ratios for coarse splits, or NGINX Ingress canary annotations for precise weights) and increases the share as the metrics stay healthy.
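
As a sketch of the blue-green cutover (the track label is illustrative), the switch is a single patch of the Service selector from the old track to the new one:

kubectl patch service myapp -p '{"spec":{"selector":{"app":"myapp","track":"green"}}}'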

For a full walkthrough of both, including a decision matrix, native kubectl examples, and the step up to Argo Rollouts for analysis-driven automatic rollback, see Kubernetes blue-green and canary deployment strategies.

Complete production-ready configuration

This manifest combines every setting discussed above. Inline comments explain each field.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 4
  revisionHistoryLimit: 5              # keep 5 old ReplicaSets for rollback
  progressDeadlineSeconds: 300         # mark rollout failed after 5 minutes of no progress
  minReadySeconds: 10                  # pod must be Ready 10s before counting as Available
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0                # never reduce capacity below 4
      maxSurge: 1                      # add one pod at a time
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      terminationGracePeriodSeconds: 60  # preStop 15s + app drain 30s + 15s buffer
      containers:
      - name: app
        image: registry.internal/myapp:v2.4.1
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz/ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 2
          failureThreshold: 3
          successThreshold: 1
        livenessProbe:
          httpGet:
            path: /healthz/live
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 2
          failureThreshold: 3
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            memory: "256Mi"            # no CPU limit; avoid unnecessary throttling
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 15"]
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: myapp
              topologyKey: kubernetes.io/hostname
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  maxUnavailable: 1
  unhealthyPodEvictionPolicy: AlwaysAllow
  selector:
    matchLabels:
      app: myapp

Verify the result

After applying the manifest, confirm that the rollout completes without errors:

kubectl rollout status deployment/myapp --timeout=5m

Expected output:

deployment "myapp" successfully rolled out

During the rollout, watch pod transitions in a second terminal:

kubectl get pods -l app=myapp -w

You should see new pods reach 1/1 Running before old pods enter Terminating. At no point should the number of Running, Ready pods drop below your desired replica count.

To verify that zero connections were dropped, run a load test against the Service during a rollout. Tools like hey, wrk, or k6 work well. A successful zero-downtime deployment produces 0 non-2xx responses across the full rollout window.
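
A minimal sketch with hey, using an illustrative URL and image tag: run the load in one terminal, trigger the rollout in another, and check the non-2xx count in hey's summary when it finishes.

# terminal 1: 2 minutes of sustained load, 20 concurrent workers
hey -z 2m -c 20 https://myapp.example.com/

# terminal 2: trigger the rolling update
kubectl set image deployment/myapp app=registry.internal/myapp:v2.4.2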

Common troubleshooting

Rollout hangs and no new pods reach Ready. The readiness probe is failing on new pods. Check kubectl describe pod <new-pod> for probe failure events. The most common cause is a wrong path or port in the readiness probe, or an initialDelaySeconds that is shorter than the application's actual boot time.

502 errors during deploy despite readiness probes passing. The endpoint-removal race condition. Either the preStop hook is missing, or its sleep is too short for your environment. Add or increase the preStop sleep. Behind a cloud load balancer, 35+ seconds is typical.

kubectl drain blocks indefinitely. A PDB is preventing eviction. Either maxUnavailable: 0 is set (allowing zero disruptions), or unhealthy pods are consuming the budget. Set unhealthyPodEvictionPolicy: AlwaysAllow (Kubernetes 1.26+) to unblock drains on crash-looping pods.

ProgressDeadlineExceeded condition appears. The rollout failed to make progress within progressDeadlineSeconds. Kubernetes does not auto-rollback. Roll back manually with kubectl rollout undo deployment/myapp and investigate why new pods are not becoming Available. For the full rollback workflow, including how to target a specific revision and what kubectl rollout undo does and does not actually restore, see Kubernetes deployment rollback with kubectl rollout undo.
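
The short version of that workflow, with the target revision number illustrative:

kubectl rollout history deployment/myapp                  # list recorded revisions
kubectl rollout undo deployment/myapp                     # back to the previous revision
kubectl rollout undo deployment/myapp --to-revision=3     # or jump to a specific one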

Surge pods stuck in Pending. The cluster does not have enough node capacity for the extra pod. Either reduce maxSurge, add nodes, or review your resource requests to see if they are overprovisioned.

