Kubernetes PodDisruptionBudgets: protecting availability during maintenance

A PodDisruptionBudget tells Kubernetes how many pods of your application can be offline simultaneously during voluntary maintenance. Without one, a single kubectl drain can evict every replica at once. This guide covers choosing between minAvailable and maxUnavailable, handling unhealthy pods, and avoiding the misconfigurations that block cluster upgrades.

What you will have at the end

A PodDisruptionBudget (PDB) applied to your workload that prevents kubectl drain, Cluster Autoscaler scale-down, and cloud provider node upgrades from evicting too many pods at once. You will know which field to pick for your workload type and how to avoid the misconfigurations that deadlock drain operations.

Prerequisites

  • A Kubernetes cluster on 1.21 or later (PDBs use the policy/v1 API; unhealthyPodEvictionPolicy needs 1.26+)
  • kubectl configured with permission to create PodDisruptionBudgets in the target namespace
  • A Deployment or StatefulSet running at least two replicas

Voluntary vs. involuntary disruptions

PDBs only protect against voluntary disruptions. That distinction is worth understanding before writing any YAML.

Voluntary disruptions are actions where an operator or controller deliberately removes pods:

  • kubectl drain for node maintenance
  • Cluster Autoscaler or Karpenter consolidating underutilized nodes
  • Cloud provider node pool upgrades (AKS, GKE, EKS)
  • Manual kubectl delete pod (though this bypasses PDBs because it skips the Eviction API)

Involuntary disruptions are unplanned failures that Kubernetes cannot control: hardware failures, kernel panics, VM disappearances, spot instance interruptions, out-of-memory kills. PDBs have no authority over these. A node crash that takes three pods with it will not ask permission first.

The subtle part: involuntary disruptions count against the budget. If a node failure already reduced your healthy pods below desiredHealthy, a concurrent kubectl drain on a different node will be blocked until replacements are running.

Create a PDB: minAvailable vs. maxUnavailable

A PDB spec requires exactly one of two mutually exclusive fields:

Field            Meaning                                                   Rounding (percentage)
minAvailable     Minimum pods that must remain available after eviction    Rounds up (conservative: protects more pods)
maxUnavailable   Maximum pods that can be unavailable after eviction       Rounds up (permissive: allows more disruptions)

Both accept an integer or a percentage string like "25%".
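
Both fields round a percentage up to the next whole pod. A quick sketch with hypothetical replica counts:

```shell
# Percentage budgets round UP to the next whole pod.
# Hypothetical example: 7 replicas with minAvailable: "25%".
replicas=7
percent=25

# ceil(replicas * percent / 100) with integer arithmetic
min_available=$(( (replicas * percent + 99) / 100 ))
echo "minAvailable 25% of $replicas replicas -> $min_available pods must stay available"
```

25% of 7 is 1.75, which rounds up to 2, so up to 5 of the 7 pods may be disrupted.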

Step 1: choose your field

For stateless services (web APIs, workers, microservices), use maxUnavailable. The official Kubernetes docs recommend it because it scales naturally with replica count changes. Statsig documented switching from minAvailable to maxUnavailable after discovering that services with fewer than five pods were stuck at disruptionsAllowed: 0 permanently.

For quorum-based stateful systems (etcd, ZooKeeper, Consul), use minAvailable set to the quorum size. A 3-node etcd cluster needs minAvailable: 2 (quorum = floor(3/2) + 1 = 2). The required number of healthy members is fixed regardless of total replicas.
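
The quorum formula can be sketched for a few common ensemble sizes:

```shell
# Quorum for an n-member ensemble: floor(n/2) + 1.
# minAvailable should equal this number for quorum-based systems.
for n in 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  echo "${n}-node ensemble -> minAvailable: ${quorum}"
done
```

Note that a 3-node and a 4-node ensemble tolerate the same single failure, which is why odd member counts are the norm.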

Step 2: write the PDB manifest

Stateless API with maxUnavailable:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api-pdb
  namespace: production
spec:
  maxUnavailable: 1                         # one pod at a time
  unhealthyPodEvictionPolicy: AlwaysAllow   # prevents CrashLoopBackOff deadlock (k8s 1.26+)
  selector:
    matchLabels:
      app: web-api

Quorum-based StatefulSet with minAvailable:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: zookeeper-pdb
  namespace: production
spec:
  minAvailable: 2                            # quorum for a 3-node ensemble
  unhealthyPodEvictionPolicy: IfHealthyBudget # conservative for stateful workloads
  selector:
    matchLabels:
      app: zookeeper

Step 3: apply and verify

kubectl apply -f pdb.yaml

kubectl get pdb -n production
# NAME            MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
# web-api-pdb     N/A             1                 1                     5s

The ALLOWED DISRUPTIONS column is the number the Eviction API checks before approving or rejecting an eviction request. If it reads 0, no voluntary disruption can proceed.

To inspect the full status:

kubectl get pdb web-api-pdb -n production -o jsonpath='{.status}' | jq .
# {
#   "currentHealthy": 3,
#   "desiredHealthy": 2,
#   "disruptionsAllowed": 1,
#   "expectedPods": 3
# }
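
Behind that status is simple arithmetic. The controller subtracts desiredHealthy from currentHealthy and clamps at zero; a sketch using the values from the status above:

```shell
# disruptionsAllowed = currentHealthy - desiredHealthy, clamped at zero.
# Values mirror the example status output.
current_healthy=3
desired_healthy=2

allowed=$(( current_healthy - desired_healthy ))
if [ "$allowed" -lt 0 ]; then allowed=0; fi
echo "disruptionsAllowed: $allowed"
```

This is also why an involuntary failure consumes budget: losing one pod drops currentHealthy to 2, the result becomes 0, and voluntary evictions are rejected until a replacement is Ready.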

Handle unhealthy pods with unhealthyPodEvictionPolicy

Before Kubernetes 1.26, a pod stuck in CrashLoopBackOff was still protected by its PDB. The pod was broken and not serving traffic, but evicting it would drop the count below desiredHealthy. The result: kubectl drain hangs indefinitely waiting for a pod that will never become healthy.

Two policies exist:

  • IfHealthyBudget (default): unhealthy pods can only be evicted if the overall application is not disrupted (currentHealthy >= desiredHealthy). Conservative, but risks deadlock when all pods are unhealthy.
  • AlwaysAllow: unhealthy pods (pods that are Running but whose Ready condition is not True) can always be evicted, regardless of the budget. The Kubernetes docs now recommend this for most workloads.

Set AlwaysAllow unless you run a stateful system where even a partially broken pod contributes to data availability.

How PDBs interact with kubectl drain

When you run kubectl drain, the following happens step by step:

  1. The node is cordoned (marked Unschedulable).
  2. For each pod, kubectl sends an eviction request to the Kubernetes Eviction API.
  3. The Eviction API checks the PDB. If evicting the pod would violate the budget, it returns HTTP 429 (Too Many Requests).
  4. kubectl drain retries rejected requests until success or timeout.
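
Step 2 above is just a POST of an Eviction object to the pod's eviction subresource. A sketch (the pod name web-api-abc123 is hypothetical):

```shell
# An eviction request is an Eviction object POSTed to the pod subresource.
# Pod name and namespace here are hypothetical.
eviction='{
  "apiVersion": "policy/v1",
  "kind": "Eviction",
  "metadata": {"name": "web-api-abc123", "namespace": "production"}
}'
echo "$eviction"

# Against a live cluster you could submit it with kubectl's raw API access:
#   echo "$eviction" | kubectl create --raw /api/v1/namespaces/production/pods/web-api-abc123/eviction -f -
# An HTTP 429 response means the eviction would violate a PDB.
```

This is why kubectl delete pod bypasses PDBs: it issues a DELETE on the pod object itself and never touches this subresource.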

If ALLOWED DISRUPTIONS is 0, drain blocks indefinitely. Cloud providers enforce their own timeouts: AKS times out after 1 hour with UpgradeFailed / PodDrainFailure, GKE force-drains after 1 hour, and EKS fails the upgrade after 50 minutes.

Before initiating a cluster upgrade, check for blocked PDBs:

kubectl get pdb --all-namespaces -o wide | grep ' 0 '
# Any row showing ALLOWED DISRUPTIONS = 0 will block the upgrade
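
The grep above can match a stray 0 in other columns. A slightly more robust sketch filters on the ALLOWED DISRUPTIONS column with awk, shown here against a hypothetical sample of the command's output:

```shell
# Hypothetical sample of `kubectl get pdb --all-namespaces` output
sample='NAMESPACE    NAME          MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
production   web-api-pdb   N/A             1                 1                     5d
production   zk-pdb        2               N/A               0                     90d'

# ALLOWED DISRUPTIONS is the second-to-last column; list PDBs where it is 0
blocked=$(echo "$sample" | awk 'NR > 1 && $(NF-1) == 0 { print $2 }')
echo "PDBs blocking upgrades: $blocked"
```

Against a live cluster, pipe the real command instead: kubectl get pdb --all-namespaces | awk 'NR > 1 && $(NF-1) == 0'.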

Escape hatches for stuck drains

# Bypass PDBs entirely (use with caution)
kubectl drain <node> --ignore-daemonsets --disable-eviction
# --disable-eviction (k8s 1.18+) forces direct deletion instead of eviction

# Drain with timeout so it does not block indefinitely
kubectl drain <node> --ignore-daemonsets --timeout=300s

--disable-eviction skips PDB checks completely. Use it only when you understand the availability impact and other remediation options are exhausted. For the full cordon-drain-uncordon workflow, the flags every drain command needs, and how managed Kubernetes services differ in their drain timeouts, see Kubernetes node drain and cordon: safe maintenance without downtime.

How PDBs interact with Cluster Autoscaler

The Cluster Autoscaler respects PDBs during scale-down. Before marking a node for termination, it checks whether evicting its pods would violate any PDB. If the answer is yes, the node is marked "not removable" and scale-down is skipped.

Common configurations that block scale-down:

  • maxUnavailable: 0 on any PDB matching a pod on the underutilized node
  • minAvailable equal to the current replica count, producing disruptionsAllowed: 0
  • A single-replica Deployment with any restrictive PDB

If Cluster Autoscaler is not scaling down and you suspect PDB interference, check for PDBs at zero:

kubectl get pdb --all-namespaces -o wide
# Look for ALLOWED DISRUPTIONS = 0 on the workloads running on the stuck node

For Karpenter users: Karpenter's voluntary disruption methods (consolidation, drift, expiration) also respect PDBs. If any PDB on any pod on a node is blocking, Karpenter will not consolidate that node. Note that Karpenter NodePool Disruption Budgets are a separate, complementary system that rate-limits node-level disruptions, not pod-level availability.

PDBs and rolling updates: separate layers

A common misconception: PDBs do not constrain Deployment rolling updates. The official docs state it clearly: "workload resources (such as Deployment and StatefulSet) are not limited by PodDisruptionBudgets when doing rolling updates."

The Deployment controller's .spec.strategy.rollingUpdate.maxUnavailable and maxSurge govern rollout behavior. PDBs govern voluntary evictions from external operations (drain, autoscaler). They are separate, complementary layers:

Deployment strategy → controls rolling update behavior (new version rollout)
PDB                 → controls voluntary eviction behavior (drain, autoscaler)

A good pairing for a production service:

# Deployment: zero-downtime rollout
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # spin up 1 new pod before removing an old one
      maxUnavailable: 0     # never reduce below desired count during rollout
---
# PDB: safe infrastructure maintenance
apiVersion: policy/v1
kind: PodDisruptionBudget
spec:
  maxUnavailable: 1          # allow 1 pod to be drained at a time
  unhealthyPodEvictionPolicy: AlwaysAllow
  selector:
    matchLabels:
      app: myapp

If a kubectl drain and a rolling update happen simultaneously, the combined unavailability from both is checked against the PDB. The PDB does not prevent the rolling update itself, but it does prevent drain from making the situation worse.

Common mistakes

PDB on a single-replica Deployment

A PDB with minAvailable: 1 (or maxUnavailable: 0) on a single-replica Deployment produces disruptionsAllowed: 0 permanently. Drain blocks forever, Cluster Autoscaler cannot scale down, Karpenter cannot consolidate. Either run 2+ replicas or skip the PDB entirely for that workload.

maxUnavailable: 0 or minAvailable: 100%

Both configurations block all voluntary evictions. Cluster upgrades on AKS, GKE, and EKS will time out and fail. Use this only if you have coordinated out-of-band maintenance procedures and understand that the cluster cannot self-heal node pools.

Overlapping selectors across multiple PDBs

If two PDBs select the same pod, the Eviction API returns HTTP 500 instead of 429. Drain fails in an unexpected way. Each PDB should cover a unique set of pods.

Empty selector in policy/v1

In policy/v1beta1 (removed in Kubernetes 1.25), an empty selector {} matched zero pods. In policy/v1, an empty selector matches every pod in the namespace. Migrating a PDB manifest without updating the selector can unintentionally lock the entire namespace.

Ignoring the HPA interaction

The Horizontal Pod Autoscaler does not consult PDBs when scaling down replicas. HPA can reduce the replica count below the PDB's minAvailable, which sets disruptionsAllowed to 0 and blocks any concurrent drain operation until HPA scales back up. Monitor ALLOWED DISRUPTIONS after HPA scale-down events.
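
One mitigation worth considering (a sketch, not a guarantee): a percentage-based maxUnavailable tracks whatever replica count HPA currently maintains, and because percentages round up it still permits at least one disruption even at low replica counts:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api-pdb
  namespace: production
spec:
  maxUnavailable: "25%"   # rounds up, so even 1 replica yields 1 allowed disruption
  unhealthyPodEvictionPolicy: AlwaysAllow
  selector:
    matchLabels:
      app: web-api
```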

Verify the result

After applying your PDB, confirm it works:

# Check current state
kubectl get pdb -n production -o wide
# ALLOWED DISRUPTIONS should be >= 1

# Simulate a drain on a non-critical node (cordon first, review, then drain)
kubectl cordon <node-name>
kubectl drain <node-name> --ignore-daemonsets --dry-run=server
# The server-side dry run exercises the Eviction API, so it shows which pods
# would be evicted and whether a PDB would block them

# If satisfied, proceed with the real drain
kubectl drain <node-name> --ignore-daemonsets

After the drain completes, verify that the evicted pods were rescheduled and the service remained available throughout.

When to escalate

If drain remains stuck after reviewing PDB configuration, collect the following before asking for help:

  • kubectl get pdb --all-namespaces -o wide (full PDB status)
  • kubectl describe pdb <name> -n <namespace> (events and conditions)
  • kubectl get events --field-selector reason=EvictionBlocked (eviction-specific events)
  • Kubernetes version (kubectl version)
  • Cloud provider and managed service tier (AKS, GKE, EKS)
  • Whether unhealthyPodEvictionPolicy is set and to which value
  • Number of replicas and their Ready status (kubectl get pods -l app=<label> -o wide)

