What Pending actually means
The pod lifecycle has five phases: Pending, Running, Succeeded, Failed, Unknown. Pending means the API server accepted the pod but no container has started yet. In almost every case, the reason is that the kube-scheduler could not find a node that passes all its filters.
The scheduler runs a two-phase process on every scheduling cycle. First, a filtering phase applies predicates (PodFitsResources, NodeSelector, TaintToleration, VolumeBinding, and others) to find nodes that could run the pod. If zero nodes pass, the pod stays Pending. Second, a scoring phase ranks the surviving nodes. The scheduler picks the highest-scoring node and binds the pod to it.
A pod that stays Pending for more than a few seconds has hit a filter that no node currently satisfies. The Events section of kubectl describe pod tells you exactly which filter.
Reading the FailedScheduling event
This is the single most important diagnostic step. Run it first, always:
kubectl describe pod <pod-name> -n <namespace>
Scroll to the Events section at the bottom. A pod stuck in Pending generates a Warning FailedScheduling event from default-scheduler. The message follows a structured format:
Warning FailedScheduling 2m default-scheduler 0/5 nodes are available:
2 Insufficient cpu,
1 node(s) had taint {node-role.kubernetes.io/control-plane: }, that the pod didn't tolerate,
2 node(s) didn't match Pod's node affinity/selector.
preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling.
Each line names a scheduler predicate that failed and how many nodes it rejected. A single node can fail multiple predicates, so the per-predicate counts can sum to more than the node total. When preemption: ... Preemption is not helpful for scheduling appears, the constraint is not resource-based (it is a taint, affinity, or volume issue) and evicting lower-priority pods would not help.
To find all pending pods across the cluster:
kubectl get pods -A --field-selector=status.phase=Pending
To filter events specifically:
kubectl get events -n <namespace> --field-selector=reason=FailedScheduling --sort-by='.lastTimestamp'
Insufficient cluster resources
The most common cause. The message says Insufficient cpu, Insufficient memory, or Insufficient ephemeral-storage.
The scheduler uses resource requests, not limits, to decide whether a pod fits on a node. Limits are enforced later by the kubelet (CPU throttling, OOM kills). A cluster where containers have high requests but low actual usage will look "full" to the scheduler even though real utilization is low. For a deeper explanation of how requests and limits interact, see Kubernetes resource requests and limits.
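As a concrete sketch (values illustrative), a container spec that looks modest to the scheduler but can burst at runtime:

```yaml
resources:
  requests:         # the scheduler places the pod using these values
    cpu: "250m"
    memory: "256Mi"
  limits:           # enforced later by the kubelet (CPU throttling, OOM kills)
    cpu: "1"
    memory: "512Mi"
```

If requests are set far above observed usage, nodes fill up on paper long before they fill up in practice.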
The scheduler also compares against allocatable capacity, not raw node capacity:
Allocatable = Capacity - kube-reserved - system-reserved - eviction-threshold
A 4-CPU / 16 GiB node typically has around 3.7 CPU / 13 GiB allocatable after system reservations.
Diagnosing resource exhaustion
Check node allocated resources:
kubectl describe node <node-name>
Find the Allocated resources section:
Allocated resources:
Resource Requests Limits
-------- -------- ------
cpu 3800m (95%) 0 (0%)
memory 12Gi (78%) 16Gi (100%)
The Requests percentage is relative to allocatable. When it is near 100%, no new pod requesting that resource can schedule.
Check what the pod requests:
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].resources.requests}'
Scan all nodes at once:
kubectl describe nodes | grep -A 5 "Allocated resources"
kubectl top nodes # requires metrics-server
Fixes
| Situation | Fix |
|---|---|
| Pod requests are oversized | Lower resources.requests to match actual usage |
| All nodes genuinely full | Add nodes or upgrade to a larger instance type |
| Stale deployments consuming capacity | Delete unused workloads |
| Cloud environment | Enable Cluster Autoscaler |
If the Cluster Autoscaler is already installed but the pod is still Pending, check its events:
kubectl get events -n kube-system | grep cluster-autoscaler
A NotTriggerScaleUp event from the autoscaler means the constraint is not purely resource-based. A taint or affinity mismatch would also prevent a freshly-scaled node from satisfying the pod.
Verification: after adjusting requests or adding nodes, confirm the pod transitions from Pending to Running:
kubectl get pod <pod-name> -n <namespace> -w
Taint and toleration mismatches
A taint on a node repels pods that lack a matching toleration. Three effects exist:
- NoSchedule: hard block on new pods; existing pods stay.
- PreferNoSchedule: soft preference; the scheduler avoids the node but may override.
- NoExecute: hard block on new pods, and existing pods without the toleration get evicted.
The FailedScheduling message names the taint key:
0/3 nodes are available:
1 node(s) had taint {dedicated: gpu}, that the pod didn't tolerate,
1 node(s) had taint {node-role.kubernetes.io/control-plane: }, that the pod didn't tolerate,
1 Insufficient cpu.
That control-plane taint is the most common gotcha. Single-node clusters built with kubeadm leave every regular pod Pending because the sole node carries node-role.kubernetes.io/control-plane:NoSchedule. (Single-node kind and minikube clusters remove this taint by default, but other lab setups often do not.)
Diagnosing taint mismatches
View node taints:
kubectl describe nodes | grep -A 2 Taints
View pod tolerations:
kubectl describe pod <pod-name> | grep -A 10 Tolerations
Compare the two. Every NoSchedule and NoExecute taint on a node must have a matching toleration in the pod spec for scheduling to succeed.
Common built-in taints
| Taint | Meaning |
|---|---|
| node-role.kubernetes.io/control-plane:NoSchedule | Control-plane node |
| node.kubernetes.io/unschedulable:NoSchedule | Node cordoned via kubectl cordon |
| node.kubernetes.io/not-ready:NoExecute | Node not ready |
| node.kubernetes.io/memory-pressure:NoSchedule | Memory pressure condition |
| node.kubernetes.io/disk-pressure:NoSchedule | Disk pressure condition |
Fixes
Add a matching toleration to the pod spec:
tolerations:
- key: "dedicated"
operator: "Equal"
value: "gpu"
effect: "NoSchedule"
Or remove an accidental taint from the node (the trailing - removes it):
kubectl taint node <node-name> dedicated=gpu:NoSchedule-
For a cordoned node:
kubectl uncordon <node-name>
Verification: kubectl describe pod <pod-name> shows a Scheduled event with the assigned node name.
Node selector and affinity mismatches
A FailedScheduling message containing "didn't match Pod's node affinity/selector" points here.
nodeSelector is the simplest form: a map of label key-value pairs that a node must carry. All keys are AND'd. If a pod sets nodeSelector: {disktype: ssd, region: us-east-1}, a node must have both labels.
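A minimal pod spec using that example (name and image illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web              # illustrative name
spec:
  nodeSelector:          # the node must carry BOTH labels
    disktype: ssd
    region: us-east-1
  containers:
    - name: app
      image: nginx
```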
Node affinity extends this with richer expressions:
- requiredDuringSchedulingIgnoredDuringExecution: hard constraint, identical to nodeSelector but with operators (In, NotIn, Exists, DoesNotExist, Gt, Lt). If unsatisfiable, the pod stays Pending.
- preferredDuringSchedulingIgnoredDuringExecution: soft constraint with a weight. The scheduler tries but will not block.
When both nodeSelector and nodeAffinity are set, both must be satisfied.
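A sketch combining a hard and a soft rule (labels and values illustrative):

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:    # hard: Pending if unmet
      nodeSelectorTerms:
        - matchExpressions:
            - key: disktype
              operator: In
              values: ["ssd"]
    preferredDuringSchedulingIgnoredDuringExecution:   # soft: scored, never blocks
      - weight: 50
        preference:
          matchExpressions:
            - key: region
              operator: In
              values: ["us-east-1"]
```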
Inter-pod anti-affinity
Pod anti-affinity with requiredDuringScheduling is a frequent Pending trap. Five Kafka replicas with anti-affinity on kubernetes.io/hostname requires five distinct nodes. If only four exist, the fifth replica stays Pending permanently.
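The rule behind that trap might look like this (the app: kafka label is illustrative):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: kafka                       # illustrative label
        topologyKey: kubernetes.io/hostname  # at most one matching pod per node
```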
The event message:
0/4 nodes are available: 4 node(s) didn't match pod anti-affinity rules.
preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.
Diagnosing affinity issues
kubectl get pod <pod-name> -o jsonpath='{.spec.affinity}' | jq
kubectl get pod <pod-name> -o jsonpath='{.spec.nodeSelector}' | jq
kubectl get nodes --show-labels
Cross-reference each required rule against the available node labels.
Fixes
| Problem | Fix |
|---|---|
| nodeSelector label missing from nodes | Add it: kubectl label node <node> disktype=ssd |
| Label key typo in pod spec | Fix the pod spec |
| Required affinity unsatisfiable | Switch to preferredDuringScheduling or add nodes with matching labels |
| Anti-affinity needs more nodes than exist | Add nodes, or downgrade to preferredDuringScheduling |
Verification: confirm the pod transitions to Running and landed on a node with the expected labels.
Unbound PersistentVolumeClaims
When a pod references a PVC, the scheduler's VolumeBinding predicate blocks scheduling until the PVC is bound. The event message is unambiguous:
0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims.
A PVC stays unbound when no PV matches its request (capacity, access mode, or StorageClass mismatch), the named StorageClass does not exist, or the dynamic provisioner is down. For a full explanation of PVs, PVCs, and StorageClasses, see Kubernetes PersistentVolumes and PersistentVolumeClaims.
One specific gotcha: zone conflicts. If a StorageClass uses volumeBindingMode: Immediate (the default), the PVC binds to a PV immediately upon creation, potentially in a different availability zone than where the pod lands. The scheduler then fails with "node(s) had volume node affinity conflict". Switching to volumeBindingMode: WaitForFirstConsumer solves this by deferring binding until the pod is being scheduled.
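A sketch of such a StorageClass (name and provisioner are illustrative; substitute your cluster's CSI driver):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topology-aware-ssd                 # illustrative name
provisioner: ebs.csi.aws.com               # example CSI driver
volumeBindingMode: WaitForFirstConsumer    # defer binding until the pod schedules
```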
Diagnosing PVC issues
kubectl get pvc -n <namespace>
kubectl describe pvc <pvc-name> -n <namespace>
kubectl get pv
kubectl get storageclass
In the PVC events, look for:
- "storageclass.storage.k8s.io \"standard\" not found" (missing StorageClass)
- "no persistent volumes available for this claim" (no matching PV)
- "waiting for first consumer to be created before binding" (normal for WaitForFirstConsumer)
- "ProvisioningFailed" (CSI driver or provisioner pod issue)
If the provisioner is the problem:
kubectl get pods -n kube-system | grep -E "provisioner|csi"
kubectl logs -n kube-system <provisioner-pod>
Fixes
| Cause | Fix |
|---|---|
| No matching PV | Create one manually or ensure dynamic provisioning is configured |
| StorageClass not found | Create the StorageClass or fix the PVC's storageClassName |
| Provisioner pod down | Restart the CSI driver DaemonSet or Deployment |
| Zone mismatch | Switch to volumeBindingMode: WaitForFirstConsumer |
| Access mode mismatch | Align the PVC's accessModes with what the StorageClass supports |
Verification: kubectl get pvc -n <namespace> shows Bound status; the pod transitions to Running.
ResourceQuota limits
ResourceQuota behaves differently from every other cause on this list. It blocks pod creation at admission time, before the scheduler even sees the pod. For a full guide on setting up quotas in shared clusters, see multi-tenancy namespace isolation. The API server returns a 403 Forbidden:
Error from server (Forbidden): pods "my-pod-7d6fb" is forbidden:
exceeded quota: namespace-quota,
requested: requests.memory=700Mi,
used: requests.memory=600Mi,
limited: requests.memory=1Gi
The pod never enters Pending because it is never created. Instead, the Deployment or ReplicaSet controller logs a FailedCreate condition. You will not find a FailedScheduling event on a pod; you need to check the controller:
kubectl describe deployment <deployment-name> -n <namespace>
Look for ReplicaFailure: True / FailedCreate in the Conditions section, or:
kubectl get events -n <namespace> | grep -i forbidden
Diagnosing quota issues
kubectl get resourcequota -n <namespace>
kubectl describe resourcequota -n <namespace>
The output shows used vs. hard limits:
Resource Used Hard
-------- ---- ----
requests.cpu 1900m 2000m
requests.memory 1.9Gi 2Gi
pods 18 20
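A quota producing limits like these could be defined as follows (values mirror the output above; the name matches the Forbidden error shown earlier):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: namespace-quota
spec:
  hard:
    requests.cpu: "2000m"    # total CPU requests allowed in the namespace
    requests.memory: 2Gi     # total memory requests allowed
    pods: "20"               # maximum pod count
```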
When a quota tracks requests, every pod in that namespace must specify resources.requests. A pod without them fails admission. A LimitRange can inject default requests automatically.
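A LimitRange that injects such defaults might look like this (values illustrative):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-requests     # illustrative name
spec:
  limits:
    - type: Container
      defaultRequest:        # injected when a container omits resources.requests
        cpu: "100m"
        memory: "128Mi"
      default:               # injected when a container omits resources.limits
        cpu: "500m"
        memory: "256Mi"
```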
Fixes
| Situation | Fix |
|---|---|
| Quota genuinely exhausted | Delete unused workloads or increase quota limits |
| Pods missing resource specs | Add resources.requests to containers, or set a LimitRange with defaults |
| Quota too conservative | Review actual namespace usage and revise |
Verification: kubectl get events -n <namespace> stops showing Forbidden errors; replicas reach the desired count.
Less common causes
A few situations that cause Pending less frequently but are worth ruling out.
Cordoned nodes. kubectl get nodes shows SchedulingDisabled in the STATUS column. Fix: kubectl uncordon <node-name>.
hostPort binding. Using hostPort in a pod spec limits scheduling to one pod per node per port. If every node already runs a pod on that port, new pods stay Pending. Fix: use a Service instead of hostPort.
Pod priority and preemption. A lower-priority pod can be evicted to make room for a higher-priority pod. After eviction, the lower-priority pod returns to Pending. The events will reference preemption.
CNI plugin not ready. If the container network interface is not initialized on a node, it reports NetworkReady=false. Pods skip that node. Fix: check the CNI DaemonSet (kubectl get pods -n kube-system | grep -E "flannel|calico|weave|cilium").
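As a sketch of the hostPort fix above, exposing pods through a Service instead (names and ports illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web                # illustrative name
spec:
  selector:
    app: web               # must match the pod labels
  ports:
    - port: 80             # cluster-internal port
      targetPort: 8080     # container port; no hostPort needed
```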
When to escalate
If you have worked through every cause above and the pod is still stuck, collect this information before reaching out for help:
- Full output of kubectl describe pod <pod-name> -n <namespace>
- Output of kubectl describe nodes (all nodes)
- Output of kubectl get events -n <namespace> --sort-by='.lastTimestamp'
- The pod spec YAML: kubectl get pod <pod-name> -o yaml
- PVC status if the pod uses volumes: kubectl get pvc -n <namespace>
- ResourceQuota status: kubectl describe resourcequota -n <namespace>
- Kubernetes version: kubectl version
- Cluster Autoscaler logs if running: kubectl logs -n kube-system -l app=cluster-autoscaler --tail=100
This covers every detail a platform team or support engineer needs to diagnose the scheduling failure without asking follow-up questions.
How to prevent recurrence
- Set resource requests based on observed usage, not guesswork. Overprovisioned requests are the leading cause of unnecessary Pending pods.
- Use volumeBindingMode: WaitForFirstConsumer on StorageClasses in multi-zone clusters.
- Run kubectl describe nodes | grep -A 5 "Allocated resources" periodically, or feed Prometheus node-exporter metrics into dashboards, to track request headroom before pods start queueing.
- Review ResourceQuotas quarterly against actual namespace usage.
- When using pod anti-affinity with required, verify that the node count supports the replica count before scaling up.