What ContainerCreating actually means
ContainerCreating is a kubelet-reported waiting state, not an official Kubernetes pod phase. The actual pod phase is Pending. What the status tells you: the pod has been scheduled to a node and the kubelet is working on prerequisite tasks before the container process can run.
Those tasks include:
- Pulling the container image (if not already cached on the node)
- Creating the pod's network namespace via the CNI plugin
- Attaching and mounting PersistentVolumes
- Injecting ConfigMaps, Secrets, and projected volumes into the container filesystem
- Running init containers to completion
If any of these stall, the pod stays in ContainerCreating indefinitely. There is no built-in timeout that moves it to Failed. The kubelet keeps retrying, so you need to intervene.
How it differs from related states. ImagePullBackOff means the image pull specifically failed and the kubelet is backing off. CreateContainerConfigError means a ConfigMap or Secret reference is invalid and Kubernetes caught it at configuration time. Init:N/M means init containers are still running. If you see ContainerCreating, the image was either already pulled or has not been attempted yet, and the problem lies elsewhere.
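The distinction is visible in the raw pod status. A trimmed sketch of what the API reports for a pod in this state (container name is a placeholder; the waiting reason is the string kubectl surfaces in its STATUS column):

```yaml
# Trimmed pod status for a pod stuck in ContainerCreating.
# .status.phase stays Pending; ContainerCreating is only the
# container's waiting reason, not a pod phase.
status:
  phase: Pending
  containerStatuses:
  - name: my-app
    ready: false
    started: false
    state:
      waiting:
        reason: ContainerCreating
```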
Reading kubectl describe pod events
The Events section of kubectl describe pod is the single most useful diagnostic tool. Events expire from the API server after one hour by default, so check them promptly.
kubectl describe pod <pod-name> -n <namespace>
Scroll to the Events section at the bottom. A healthy pod shows this progression:
Normal Scheduled Successfully assigned default/my-pod to node-3
Normal Pulling Pulling image "registry.internal/my-app:2.4.1"
Normal Pulled Successfully pulled image
Normal Created Created container my-app
Normal Started Started container my-app
A pod stuck in ContainerCreating halts before Created and typically shows Warning events. The first Warning event points to the root cause.
| Warning reason | Likely root cause | Section in this article |
|---|---|---|
| FailedMount | Volume cannot be mounted | Volume mount failures |
| FailedAttachVolume | Cloud disk cannot attach | Volume mount failures |
| FailedCreatePodSandBox | CNI plugin error | CNI plugin issues |
| Failed with "secret not found" | Missing Secret | Missing ConfigMaps or Secrets |
| Failed with "configmap not found" | Missing ConfigMap | Missing ConfigMaps or Secrets |
If there are no events at all, the kubelet itself may be having trouble. Check kubelet logs on the node where the pod is scheduled (the Node: field in the describe output shows which node):
# SSH to the node, then:
journalctl -u kubelet --since "15 minutes ago" | grep -i "sandbox\|cni\|volume\|mount"
For pod-specific events sorted by time:
kubectl get events -n <namespace> \
--field-selector involvedObject.name=<pod-name> \
--sort-by='.lastTimestamp'
Volume mount failures
Volume problems are the most common reason a pod gets stuck in ContainerCreating. Four distinct failure modes exist.
PVC not bound
A pod referencing a PersistentVolumeClaim that has STATUS: Pending cannot start. The kubelet waits for the volume to become available.
kubectl get pvc -n <namespace>
# Look for STATUS = Pending
kubectl describe pvc <pvc-name> -n <namespace>
# Events section shows why binding failed
Why a PVC stays Pending:
- No matching PersistentVolume exists. The cluster has no PV matching the PVC's StorageClass, access mode, or capacity request. Check what exists: kubectl get pv.
- StorageClass not found. The storageClassName in the PVC references a non-existent class. Verify: kubectl get storageclass.
- Access mode mismatch. Block storage (AWS EBS, GCE Persistent Disk, Azure Disk) typically only supports ReadWriteOnce. A PVC requesting ReadWriteMany will not bind to these.
- WaitForFirstConsumer binding mode. StorageClasses with volumeBindingMode: WaitForFirstConsumer delay provisioning until a pod is actually scheduled. The PVC shows Pending until the scheduler picks a node. This is expected behavior, not a bug. If the pod itself is not schedulable (a separate issue), the PVC stays Pending.
For a deeper understanding of PV/PVC lifecycle phases and binding mechanics, see Kubernetes PersistentVolumes and PersistentVolumeClaims.
Fixes:
- Create a matching PV manually, or verify that the StorageClass provisioner pod is running in kube-system.
- Correct the storageClassName to an existing class.
- Switch to an access mode compatible with the underlying storage type.
- For WaitForFirstConsumer, debug the scheduling problem first (see Pod stuck in Pending).
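As a sketch of the first fix, a manually created PV/PVC pair (hypothetical names and sizes; a hostPath PV is only suitable for single-node or test clusters). The claim binds only when storageClassName, access mode, and requested capacity all line up:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-pv
spec:
  capacity:
    storage: 10Gi
  accessModes: ["ReadWriteOnce"]
  storageClassName: manual        # must match the PVC below
  hostPath:
    path: /mnt/data               # test clusters only
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes: ["ReadWriteOnce"]  # must be satisfiable by the PV
  storageClassName: manual
  resources:
    requests:
      storage: 10Gi               # must not exceed the PV's capacity
```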
You will know it worked when: kubectl get pvc -n <namespace> shows STATUS: Bound and the pod transitions out of ContainerCreating.
FailedAttachVolume and Multi-Attach errors
The attach/detach controller emits FailedAttachVolume when a cloud disk cannot be attached to the node. The most common cause: a ReadWriteOnce volume is still recorded as attached to a different node.
Typical event:
Warning FailedAttachVolume Multi-Attach error for volume "pvc-abc123":
Volume is already exclusively attached to one node and can't be attached to another
This happens after a pod is rescheduled to a new node (scaling event, node failure, rolling update) but the previous attachment was not cleaned up.
Diagnosis:
# Check VolumeAttachment objects
kubectl get volumeattachment
# Find stale attachments for the volume
kubectl get volumeattachment -o json | \
jq '.items[] | select(.spec.source.persistentVolumeName=="pvc-abc123") | {name: .metadata.name, node: .spec.nodeName, deletionTimestamp: .metadata.deletionTimestamp}'
Fixes:
- If the old pod is still terminating, wait for it to fully stop: kubectl get pods -n <namespace> -o wide.
- If a stale VolumeAttachment object exists for a node that is gone, delete it: kubectl delete volumeattachment <name>.
- For Deployments with ReadWriteOnce volumes, change strategy.type to Recreate instead of RollingUpdate. Rolling updates try to bring up a new pod before the old one is fully terminated, which triggers Multi-Attach on RWO volumes.
- If you need concurrent access, switch to RWX-capable storage (NFS, AWS EFS, Azure Files).
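A minimal sketch of the Recreate fix (Deployment name, image, and claim name are examples):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  strategy:
    type: Recreate   # old pod terminates and detaches its RWO volume before the new pod starts
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: registry.internal/my-app:2.4.1
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: data-pvc    # hypothetical RWO claim
```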
You will know it worked when: the FailedAttachVolume events stop and the pod transitions to Running.
CSI driver not running
If the CSI controller or node plugin pods are down, all volume operations fail.
kubectl get pods -n kube-system | grep csi
kubectl get csidrivers
If the CSI node plugin is missing on the target node, restart the DaemonSet:
kubectl rollout restart daemonset <csi-node-plugin> -n kube-system
Large volumes with fsGroup
When spec.securityContext.fsGroup is set, the kubelet recursively changes file ownership on every file in the volume during mount. For volumes with millions of files, this can take minutes and cause a mount timeout.
Kubernetes 1.20 introduced fsGroupChangePolicy (GA in 1.23). Set it to OnRootMismatch to skip recursive ownership changes when the root directory already has the correct group:
spec:
securityContext:
fsGroup: 2000
fsGroupChangePolicy: "OnRootMismatch"
Missing ConfigMaps or Secrets
When a pod references a ConfigMap or Secret that does not exist in the same namespace, the behavior depends on how the reference is made.
Missing Secret (volume mount): The pod stays in ContainerCreating. Events show:
Warning FailedMount MountVolume.SetUp failed for volume "secrets":
secret "db-credentials" not found
Missing ConfigMap (env or envFrom): The pod typically shows CreateContainerConfigError in kubectl get pods, not ContainerCreating. Events show:
Warning Failed Error: configmap "app-config" not found
The difference: Kubernetes validates ConfigMap references at container configuration time (earlier in the process), while Secret volume mounts are resolved by the kubelet at mount time (slightly later).
Diagnosis:
# What does the pod reference?
kubectl describe pod <pod-name> -n <namespace>
# Check the Volumes, Env, and EnvFrom sections
# Does the resource exist in the same namespace?
kubectl get configmap -n <namespace>
kubectl get secret -n <namespace>
Key rules: a pod can only reference ConfigMaps and Secrets in the same namespace. A reference to a specific key that does not exist inside the ConfigMap also blocks startup, unless the reference is marked optional: true.
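Per-key references can be marked optional too. A sketch (hypothetical ConfigMap and key names) of an env var that no longer blocks startup when the key or ConfigMap is absent:

```yaml
env:
- name: FEATURE_FLAG
  valueFrom:
    configMapKeyRef:
      name: app-config
      key: feature-flag
      optional: true   # pod starts even if the key or ConfigMap is missing; the var is simply unset
```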
Fixes:
- Create the missing resource:
kubectl create configmap app-config \
--from-file=config.yaml -n <namespace>
kubectl create secret generic db-credentials \
--from-literal=password=changeme-in-production -n <namespace>
- Mark the reference as optional if the config is not strictly required:
volumes:
- name: config
configMap:
name: app-config
optional: true
- After creating the resource, the kubelet retries mounts and container setup on its own, but the retry interval can mean a delay of a couple of minutes. To force an immediate retry, restart the workload:
kubectl rollout restart deployment <deployment-name> -n <namespace>
You will know it worked when: kubectl describe pod no longer shows FailedMount or Failed events referencing the missing resource, and the pod transitions to Running.
CNI plugin issues
The CNI (Container Network Interface) plugin assigns an IP address and configures network routing when the pod sandbox is created. If it fails, the sandbox cannot be established and the pod stays in ContainerCreating.
Identifying CNI failures. The describe output shows:
Warning FailedCreatePodSandBox Failed to create pod sandbox: rpc error:
code = Unknown desc = failed to setup network for sandbox "abc123": ...
CNI-specific error messages vary by plugin:
- Calico: plugin type="calico" failed (add): error getting ClusterInformation: Unauthorized means the calico-node DaemonSet is not running or has RBAC issues.
- AWS VPC CNI: failed to assign an IP address to container means the subnet has no free IP addresses. Check IPAMD logs: kubectl logs -n kube-system -l k8s-app=aws-node -c aws-node.
- Generic: network plugin is not ready: cni config uninitialized means no CNI configuration exists in /etc/cni/net.d/ on the node. The CNI DaemonSet was never deployed or is not scheduled on that node.
Diagnosis:
# Check CNI DaemonSet pods
kubectl get pods -n kube-system -l k8s-app=calico-node # Calico
kubectl get pods -n kube-system -l app=aws-node # AWS VPC CNI
kubectl get pods -n kube-system -l k8s-app=cilium # Cilium
# Check CNI pod logs
kubectl logs -n kube-system <cni-pod-name>
# Check node status (NotReady often indicates networking problems)
kubectl describe node <node-name> | grep -A5 Conditions
Fixes by scenario:
CNI DaemonSet pod crashing or not ready:
kubectl rollout restart daemonset calico-node -n kube-system
IP address pool exhausted (AWS VPC CNI):
- Add more nodes to distribute IP demand.
- Attach additional subnets to the node group.
- Enable prefix delegation to assign /28 prefixes per ENI instead of individual IPs, increasing density from roughly 30 to 110 pods per node on m5.large instances.
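Prefix delegation is toggled through an environment variable on the aws-node DaemonSet. A sketch of the relevant container env fragment (requires VPC CNI v1.9+ and Nitro-based instance types; check the AWS VPC CNI documentation for your version before enabling):

```yaml
# Fragment of the aws-node DaemonSet container spec (kube-system namespace)
env:
- name: ENABLE_PREFIX_DELEGATION
  value: "true"
```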
Node has no CNI configuration:
A newly joined node may take 30–60 seconds for the CNI DaemonSet pod to start and write its config. If the node stays in this state longer, check that the DaemonSet's nodeSelector and tolerations allow it to run on that node.
You will know it worked when: kubectl describe pod no longer shows FailedCreatePodSandBox events and the pod gets an IP address (visible in kubectl get pods -o wide).
Init containers blocking
Init containers run sequentially before main containers start. Each must exit with code 0 before the next begins. A pod where init containers are still running shows Init:N/M in kubectl get pods, not ContainerCreating.
This distinction matters. If you see Init:0/2, the problem is the init container, not the main container setup. If you see ContainerCreating, all init containers completed but something else (volume, secret, CNI) is blocking.
| kubectl STATUS | Meaning |
|---|---|
| Init:N/M | N of M init containers completed; waiting for more |
| Init:Error | An init container exited non-zero |
| Init:CrashLoopBackOff | An init container is failing repeatedly with backoff |
| PodInitializing | All init containers done; main containers starting |
| ContainerCreating | Main containers being set up (init containers already succeeded) |
Diagnosing stuck init containers:
# See init container states and exit codes
kubectl describe pod <pod-name> -n <namespace>
# Look at the "Init Containers:" section
# Get init container logs
kubectl logs <pod-name> -c <init-container-name> -n <namespace>
# If the container already terminated:
kubectl logs <pod-name> -c <init-container-name> -n <namespace> --previous
Common causes:
- Waiting for a dependency that never becomes ready. The init container runs a wait-for-it loop checking a database or external service. If the dependency is down or DNS is broken, the loop runs forever. Debug DNS: kubectl run -it --rm dnstest --image=busybox:1.36 --restart=Never -- nslookup <service-name>.<namespace>.svc.cluster.local.
- Init container image pull failure. Shows as ImagePullBackOff on the init container specifically. Check the Init Containers section in describe output and see ImagePullBackOff troubleshooting.
- Script exits with non-zero code. Check logs: kubectl logs <pod> -c <init-container-name>. Add set -x to shell-based init scripts for verbose tracing.
- Resource limits too tight. CPU throttling or OOM causes the init container to be killed or run too slowly. Increase resources.limits.
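For the first cause, a bounded wait loop surfaces the failure instead of hanging in Init:0/1 forever. A sketch (hypothetical service name and port; assumes the busybox image's nc supports -z):

```yaml
initContainers:
- name: wait-for-db
  image: busybox:1.36
  command:
  - sh
  - -c
  - |
    # Give up after ~2 minutes so the pod shows Init:Error
    # instead of waiting indefinitely on a dead dependency
    i=0
    until nc -z db.default.svc.cluster.local 5432; do
      i=$((i + 1))
      if [ "$i" -ge 60 ]; then echo "db unreachable, giving up" >&2; exit 1; fi
      sleep 2
    done
```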
You will know it worked when: kubectl get pods shows the status progressing from Init:N/M through PodInitializing to Running.
Diagnostic decision tree
Pod stuck in ContainerCreating
|
+-- kubectl describe pod --> Events section
| |
| +-- FailedAttachVolume / FailedMount
| | +-- PVC Pending? --> StorageClass, provisioner, access mode
| | +-- Multi-Attach error? --> Stale VolumeAttachment, Recreate strategy
| | +-- CSI error? --> CSI driver health
| |
| +-- Failed: secret/configmap not found --> Create resource, restart pod
| |
| +-- FailedCreatePodSandBox --> CNI plugin health, IP exhaustion
| |
| +-- No events --> Check kubelet logs (journalctl -u kubelet)
|
+-- kubectl get pods shows Init:N/M --> Not ContainerCreating
+-- kubectl logs <pod> -c <init-container-name>
When to escalate
If the cause does not match any of the above, or if fixes do not resolve the issue, collect this information before asking for help:
- Full output of kubectl describe pod <pod-name> -n <namespace>
- Pod events: kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name> --sort-by='.lastTimestamp'
- PVC status (if volumes are involved): kubectl describe pvc <pvc-name> -n <namespace>
- VolumeAttachment objects: kubectl get volumeattachment -o yaml
- CNI pod logs (if sandbox creation failed): kubectl logs -n kube-system <cni-pod-name>
- Kubelet logs from the target node: journalctl -u kubelet --since "30 minutes ago"
- Kubernetes version: kubectl version
- The pod or Deployment manifest (sanitized of secrets)
How to prevent recurrence
- Validate resources before deploying. Run kubectl get configmaps,secrets -n <namespace> before kubectl apply. CI pipelines can automate this check.
- Use Recreate strategy for Deployments with RWO volumes. RollingUpdate with a single-replica Deployment using a ReadWriteOnce volume will always trigger Multi-Attach errors.
- Monitor PVC binding state. With kube-state-metrics, kube_persistentvolumeclaim_status_phase{phase="Pending"} catches unbound PVCs before pods reference them.
- Set fsGroupChangePolicy: OnRootMismatch on workloads that mount large volumes with fsGroup. This avoids recursive ownership changes on every pod restart.
- Keep CNI DaemonSets healthy. Alert on kube_daemonset_status_number_unavailable{daemonset=~".*cni.*|calico-node|aws-node|cilium"} to catch CNI pod failures before they block new pods.
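The PVC-binding check can be wired into an alert. A hedged sketch of a Prometheus rule (assumes kube-state-metrics is scraped; alert name, duration, and severity are examples to adapt):

```yaml
groups:
- name: pvc-binding
  rules:
  - alert: PersistentVolumeClaimPending
    expr: kube_persistentvolumeclaim_status_phase{phase="Pending"} == 1
    for: 15m   # tolerate brief WaitForFirstConsumer delays
    labels:
      severity: warning
    annotations:
      summary: "PVC {{ $labels.persistentvolumeclaim }} in {{ $labels.namespace }} has been Pending for 15 minutes"
```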