What ImagePullBackOff actually means
ErrImagePull and ImagePullBackOff are two stages of the same failure. ErrImagePull appears the first time the kubelet fails to pull a container image from a registry. If the pull keeps failing, Kubernetes enters an exponential backoff loop and the status changes to ImagePullBackOff.
The backoff timing:
| Retry cycle | Approximate wait before next attempt |
|---|---|
| 1 | 10 seconds |
| 2 | 20 seconds |
| 3 | 40 seconds |
| 4 | 80 seconds |
| 5+ | 300 seconds (capped at 5 minutes) |
The pod is not killed during backoff. It sits idle, waiting for the next retry. If the root cause is transient (a brief registry outage, a rate-limit window resetting), the pod self-heals. If the root cause is permanent (a typo, a missing secret), the pod stays in ImagePullBackOff until you fix it.
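To watch the cycle live, stream status changes; the pod alternates between ErrImagePull (an attempt just failed) and ImagePullBackOff (waiting out the backoff):
kubectl get pods -n <namespace> -w   # -w streams status updates as retries happen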
When the kubelet pulls images
The imagePullPolicy on each container spec controls pull behavior:
- Always: pulls on every container start. Default when no tag is specified or when the tag is :latest. The kubelet compares digests and skips redundant layer downloads if the cached image matches.
- IfNotPresent: pulls only when the image is not already cached on the node. Default for tags other than :latest.
- Never: never pulls. The container fails to start (the status is ErrImageNeverPull, not ImagePullBackOff) if the image is not already on the node.
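A minimal container spec that sets the policy explicitly (the image reference here is illustrative):
containers:
- name: app
  image: registry.internal/myapp:v2.1
  imagePullPolicy: IfNotPresent   # skip the registry when the node cache already has this tag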
Kubernetes v1.33 alpha note. The KubeletEnsureSecretPulledImages feature gate (disabled by default) adds credential verification for cached images. Before v1.33, a pod using imagePullPolicy: IfNotPresent could access a cached private image without valid credentials. With this gate enabled, that same pod will get ImagePullBackOff unless it has the correct imagePullSecrets. If you see new pull failures after upgrading to v1.33 with this gate on, check whether affected pods are missing pull secrets they never needed before.
Diagnosing the root cause
The Events section of kubectl describe pod is the single most useful diagnostic surface for image pull failures. The exact error string tells you the cause category.
kubectl get pods -n <namespace> # find the pod in ErrImagePull or ImagePullBackOff
kubectl describe pod <pod-name> -n <namespace> # scroll to Events
A typical Events output:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulling 3m kubelet Pulling image "registry.internal/myapp:v2.1"
Warning Failed 3m kubelet Failed to pull image "registry.internal/myapp:v2.1": not found
Warning Failed 3m kubelet Error: ErrImagePull
Warning BackOff 2m (x4 over 3m) kubelet Back-off pulling image "registry.internal/myapp:v2.1"
Error message reference
| Error text in Events | What it means |
|---|---|
| not found / manifest unknown | The image tag does not exist in the registry |
| repository does not exist / no such host | Wrong registry hostname or image path, or a DNS failure |
| unauthorized / 401 / pull access denied | Missing or incorrect credentials for a private registry |
| 403 Forbidden | Credentials are valid but lack permission for this image |
| toomanyrequests / 429 Too Many Requests | Registry rate limit hit (typically Docker Hub) |
| i/o timeout / connection refused | Network connectivity issue between the node and the registry |
| x509: certificate signed by unknown authority | TLS certificate problem (self-signed or expired cert) |
For broader event queries across namespaces:
kubectl get events -n <namespace> --field-selector type=Warning
kubectl get events --all-namespaces --field-selector reason=BackOff
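To find every pod currently stuck on a pull, a jsonpath query over container statuses works cluster-wide (a sketch; the waiting reason is empty for healthy pods, so grep does the filtering):
kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.status.containerStatuses[*].state.waiting.reason}{"\n"}{end}' \
  | grep -E 'ErrImagePull|ImagePullBackOff'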
Wrong image name or tag
The most common cause. A typo in the image name, tag, or registry hostname produces a manifest unknown or not found error.
How to verify
Pull the image from a local workstation first:
docker pull registry.internal/myapp:v2.1
If this fails locally too (while logged in to the registry), the image reference itself is wrong.
Common mistakes:
- Tag typo: nginx:lates instead of nginx:latest
- Missing tag: omitting the tag defaults to :latest, which many private registries do not publish
- Wrong version: myapp:v3 when only v2.1 was pushed
- Swapped path segments: myapp/myorg:v1 instead of myorg/myapp:v1
- Deleted tag: a CI/CD pipeline cleaned up old tags after deployment
To list available tags in a registry:
# Docker Hub
curl -s https://registry.hub.docker.com/v2/repositories/myorg/myapp/tags/ | jq '.results[].name'
# Any OCI registry (with crane, from google/go-containerregistry)
crane ls registry.internal/myorg/myapp
Fixing it
Edit the parent resource (Deployment, StatefulSet, DaemonSet), not the pod directly:
kubectl set image deployment/my-deployment app=registry.internal/myapp:v2.2 -n <namespace>
For immutability, pin images by digest instead of tag:
spec:
containers:
- name: app
image: registry.internal/myapp@sha256:45b23dee08af5e43a7fea6c4cf9c25ccf269ee113168c19722f87876677c5cb2
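To resolve a tag you trust into its digest, crane (mentioned above) does it in one call:
crane digest registry.internal/myapp:v2.1
# prints sha256:45b23dee08af5e43a7fea6c4cf9c25ccf269ee113168c19722f87876677c5cb2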
You will know it worked when kubectl get pods -n <namespace> shows the pod in Running status and kubectl describe pod shows a Pulled event with the correct image reference.
Private registry authentication (imagePullSecrets)
When a pod tries to pull from a private registry without credentials, the registry returns 401 Unauthorized, or the misleading message repository does not exist or may require 'docker login'. The kubelet pulls images without credentials by default; you must configure imagePullSecrets explicitly.
Step 1: create the secret
kubectl create secret docker-registry regcred \
--docker-server=registry.internal \
--docker-username=deploy-bot \
--docker-password=<token> \
-n my-namespace
For Docker Hub, use https://index.docker.io/v1/ as the server value.
Secrets are namespace-scoped. The secret must exist in the same namespace as the pod.
Step 2: verify the secret
kubectl get secret regcred -n my-namespace \
--output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode
The decoded JSON should contain your registry hostname as a key inside the auths object.
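For the regcred secret created above, the decoded payload looks roughly like this (auth is the base64 of username:password):
{
  "auths": {
    "registry.internal": {
      "username": "deploy-bot",
      "password": "<token>",
      "auth": "<base64 of deploy-bot:token>"
    }
  }
}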
Step 3: reference the secret in the pod spec
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-deployment
namespace: my-namespace
spec:
template:
spec:
imagePullSecrets:
- name: regcred
containers:
- name: app
image: registry.internal/myapp:v2.1
Attaching to a ServiceAccount
Instead of adding imagePullSecrets to every pod spec, attach the secret to the default ServiceAccount. Every new pod in the namespace that does not specify a different ServiceAccount inherits the pull secret automatically:
kubectl patch serviceaccount default \
-p '{"imagePullSecrets": [{"name": "regcred"}]}' \
-n my-namespace
This only affects new pods. Existing pods need a restart.
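To force that restart on an existing Deployment:
kubectl rollout restart deployment/my-deployment -n my-namespace   # recreate pods so they inherit the pull secret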
Cloud registries with expiring credentials
AWS ECR tokens expire every 12 hours. GCR service account JSON keys are long-lived but a security risk. Azure ACR service principal credentials expire too. Static imagePullSecrets break on all of these.
The production solution: kubelet credential providers (GA since Kubernetes 1.26). The kubelet calls an external plugin binary at pull time to obtain fresh credentials. No CronJobs refreshing secrets, no stale tokens. Cloud providers maintain the plugins:
- AWS: ecr-credential-provider (cloud-provider-aws)
- GCP: Workload Identity / gcp-auth-webhook
- Azure: acr-credential-provider
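Wiring a provider up means pointing the kubelet at both the plugin binary and a CredentialProviderConfig. A minimal sketch for the ECR plugin; binary paths and match patterns vary by environment:
# kubelet flags:
#   --image-credential-provider-config=/etc/kubernetes/credential-provider-config.yaml
#   --image-credential-provider-bin-dir=/usr/local/bin
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
  - name: ecr-credential-provider              # must match the binary name in the bin dir
    matchImages:
      - "*.dkr.ecr.*.amazonaws.com"            # only invoked for ECR image references
    defaultCacheDuration: "12h"                # align with the ECR token lifetime
    apiVersion: credentialprovider.kubelet.k8s.io/v1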
You will know it worked when kubectl describe pod shows a Pulled event instead of Failed, and kubectl get pods shows Running.
Docker Hub rate limits
Docker Hub applies pull rate limits based on account type. As of April 2025:
| Account type | Pull limit |
|---|---|
| Unauthenticated (anonymous) | 100 pulls per 6 hours, per source IP |
| Authenticated (free Personal) | 200 pulls per 6 hours, per account |
| Pro / Team / Business | Unlimited |
The critical detail: unauthenticated limits apply per source IPv4 address. In a managed Kubernetes cluster where all nodes share a NAT gateway, the entire cluster competes for 100 pulls from a single IP. A cluster running autoscaling workloads can exhaust this in minutes.
When the limit is hit, kubectl describe pod Events show:
toomanyrequests: You have reached your pull rate limit.
Diagnosis
Check remaining quota from a node (or any machine sharing the cluster's outbound IP):
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/nginx:pull" | jq -r .token)
curl -I -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/library/nginx/manifests/latest 2>/dev/null | grep ratelimit
The ratelimit-remaining header shows how many pulls you have left.
Solutions
Authenticate pulls. Even a free Docker Hub account doubles your quota and decouples it from your IP address. Create a personal access token at Docker Hub and add it as an imagePullSecret:
kubectl create secret docker-registry dockerhub-creds \
--docker-server=https://index.docker.io/v1/ \
--docker-username=your-dockerhub-user \
--docker-password=<personal-access-token> \
-n <namespace>
kubectl patch serviceaccount default \
-p '{"imagePullSecrets": [{"name": "dockerhub-creds"}]}' \
-n <namespace>
Use a pull-through cache. Set up a registry mirror (Harbor, Nexus, or a plain registry:2 with proxy cache) that fronts Docker Hub. Configure containerd on all nodes to use the mirror via /etc/containerd/certs.d/docker.io/hosts.toml:
server = "https://registry-1.docker.io"
[host."https://registry-mirror.internal"]
capabilities = ["pull", "resolve"]
Containerd picks up hosts.toml changes dynamically. No restart required.
Set imagePullPolicy: IfNotPresent for stable tagged images. This avoids re-pulls when a node already has the image cached:
imagePullPolicy: IfNotPresent
You will know it worked when the toomanyrequests error disappears from kubectl describe pod Events and the pod transitions to Running.
Node-level troubleshooting with crictl
When kubectl describe pod does not give you enough detail, go to the node. crictl talks directly to the container runtime (containerd or CRI-O) over the CRI socket, bypassing the Kubernetes API.
Connect to the node
kubectl get pod <pod-name> -n <namespace> -o wide # find the node name
ssh <node>
For containerd (default since Kubernetes 1.24):
sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock images
Or set the endpoint persistently in /etc/crictl.yaml:
runtime-endpoint: unix:///var/run/containerd/containerd.sock
Key commands
# List cached images
sudo crictl images | grep myapp
# Test a pull directly (fastest way to isolate auth/network issues)
sudo crictl pull registry.internal/myapp:v2.1
# Test with credentials
sudo crictl pull --creds deploy-bot:<token> registry.internal/myapp:v2.1
# Verbose output for debugging
sudo crictl --debug pull registry.internal/myapp:v2.1
# Check disk space (image pulls fail when the node is full)
df -h /var/lib/containerd
# Prune unused images if disk is full
sudo crictl rmi --prune
containerd vs Docker: a common pitfall
Kubernetes 1.24 removed the dockershim, so all modern clusters run containerd or CRI-O. Even when Docker is installed on a node, the Docker CLI no longer talks to the container runtime Kubernetes uses.
The practical consequence: credentials stored in /root/.docker/config.json are not used by containerd for CRI image pulls. If you migrated from a Docker-based cluster and your pulls stopped working, this is likely the cause. Use imagePullSecrets (through the Kubernetes API) or configure node-level registry credentials in containerd's configuration.
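If you genuinely need node-level credentials with containerd 1.x, the deprecated CRI registry config still works, though imagePullSecrets remain the better-supported path (a sketch; requires a containerd restart):
# /etc/containerd/config.toml (deprecated in containerd, but still honored in 1.x)
[plugins."io.containerd.grpc.v1.cri".registry.configs."registry.internal".auth]
  username = "deploy-bot"
  password = "<token>"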
Network and TLS issues
DNS and firewall problems produce no such host, i/o timeout, or connection refused in the Events section.
From the node:
nslookup registry.internal # DNS resolution
curl -v https://registry.internal/v2/ # TCP + TLS connectivity
For self-signed registry certificates (x509: certificate signed by unknown authority), distribute the CA certificate to nodes and configure containerd:
# /etc/containerd/certs.d/registry.internal/hosts.toml
[host."https://registry.internal"]
capabilities = ["pull", "resolve"]
ca = "/etc/containerd/certs.d/registry.internal/ca.crt"
For clusters behind a corporate proxy, set the proxy environment variables in the containerd service unit or kubelet environment:
HTTPS_PROXY=http://proxy.internal:3128
NO_PROXY=10.0.0.0/8,192.168.0.0/16,.cluster.local
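On systemd hosts, a drop-in unit for containerd is the usual place for these variables (paths and proxy address are illustrative):
# /etc/systemd/system/containerd.service.d/http-proxy.conf
[Service]
Environment="HTTPS_PROXY=http://proxy.internal:3128"
Environment="NO_PROXY=10.0.0.0/8,192.168.0.0/16,.cluster.local"
Then reload and restart: sudo systemctl daemon-reload && sudo systemctl restart containerd.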
You will know it worked when curl -v https://registry.internal/v2/ returns HTTP 200 (or 401 for auth-required registries) and the pod's pull succeeds on the next retry.
When to escalate
If you have worked through the causes above and the pod is still stuck, collect the following before asking for help:
- Full output of kubectl describe pod <pod-name> -n <namespace> (especially Events)
- The exact image reference from the pod spec (kubectl get pod <pod> -o yaml | grep image:)
- Whether the image can be pulled from the node directly (sudo crictl pull <image>)
- Kubernetes version (kubectl version)
- Container runtime and version (sudo crictl version)
- Node conditions (kubectl describe node <node> | grep -A5 Conditions)
- Disk space on the node (df -h /var/lib/containerd)
- Whether a corporate proxy, firewall, or admission webhook (OPA Gatekeeper, Kyverno) might be interfering
How to prevent recurrence
- Pin images by digest in production workloads to avoid tag-drift surprises.
- Attach imagePullSecrets to ServiceAccounts rather than individual pod specs so new deployments inherit credentials automatically.
- Use kubelet credential providers instead of static secrets for cloud registries with expiring tokens.
- Run a pull-through cache in front of Docker Hub to avoid rate-limit dependency on an external service.
- Monitor for ImagePullBackOff events cluster-wide. A query like kubectl get events --all-namespaces --field-selector reason=BackOff in a recurring check catches problems early.
- After fixing the root cause, restart the affected pod or Deployment (kubectl rollout restart deployment/<name>) to clear the backoff timer immediately rather than waiting up to 5 minutes for the next automatic retry.
If your pod started successfully but now keeps restarting, that is a different problem. See CrashLoopBackOff: why your Kubernetes pod keeps restarting for diagnosis steps on container crash loops. If the container starts but does not pass health checks, see how to configure Kubernetes health probes.