Why kubectl exec fails on minimal images
Distroless images (gcr.io/distroless, Chainguard Images, scratch-based builds) ship only the application binary and its runtime dependencies. No /bin/sh. No curl. No ps. That is the point: a smaller attack surface, fewer CVEs to triage, images measured in megabytes rather than hundreds of megabytes.
The tradeoff surfaces the moment something goes wrong:
kubectl exec -it myapp-pod -- /bin/sh
# OCI runtime exec failed: exec failed: unable to start container process:
# exec: "/bin/sh": stat /bin/sh: no such file or directory
No shell means no exec. You cannot attach a debugger, inspect network connections, or read config files inside the running container. Before Kubernetes 1.25, the workaround was rebuilding the image with a debug variant or redeploying with a sidecar. Both require downtime or at least a rollout.
kubectl debug removes that constraint. It injects a temporary container (an ephemeral container) into the running pod, sharing its network and optionally its process namespace, without restarting anything.
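Under the hood, kubectl debug patches the pod's ephemeralContainers list through the pods/ephemeralcontainers subresource. The resulting entry looks roughly like this (an illustrative fragment; the container name and target are examples, kubectl generates the actual name):

```yaml
# Illustrative fragment of a pod after kubectl debug has patched it
spec:
  ephemeralContainers:
  - name: debugger-abc12               # auto-generated by kubectl
    image: nicolaka/netshoot
    stdin: true
    tty: true
    targetContainerName: payment-service   # set by --target
```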
Prerequisites
- Kubernetes 1.25 or later. Ephemeral containers reached GA in 1.25; the EphemeralContainers feature gate is locked on from 1.25 onwards and has since been removed entirely. On 1.23 and 1.24 the feature is beta and enabled by default, but can be disabled. Clusters older than 1.23 require explicit feature gate activation.
- A kubectl version matching or newer than your cluster version.
- RBAC permission to update pods/ephemeralcontainers. The default admin ClusterRole does not include this subresource; see RBAC setup below.
- A container runtime that supports the TARGET PID namespace mode (containerd 1.5+ with runc). Without this, --target still works, but you only see the debug container's own processes.
Inject a debug container with --target
This is the most common pattern: attach a fully equipped debug container to a running pod and share the target container's process namespace.
Step 1: identify the pod and container name
kubectl get pods -n production -l app=payment-service
# NAME READY STATUS RESTARTS AGE
# payment-service-6b7f8d9c4-xt2kp 1/1 Running 0 4h
kubectl get pod payment-service-6b7f8d9c4-xt2kp -n production \
-o jsonpath='{.spec.containers[*].name}'
# payment-service
Step 2: launch the ephemeral container
kubectl debug -it payment-service-6b7f8d9c4-xt2kp \
-n production \
--image=nicolaka/netshoot \
--target=payment-service \
--profile=general
The --target=payment-service flag tells the kubelet to join the ephemeral container to the target's PID namespace via CRI's TARGET namespace mode. Without it, ps inside the debug container shows only the debug container's own process tree.
The --profile=general flag sets a security context with SYS_PTRACE capability, which you need for tools like strace. On Kubernetes versions before 1.36, the default profile is legacy; from 1.36 onwards, general becomes the default. Be explicit in scripts and runbooks to avoid behaviour changes on cluster upgrades.
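As a rough sketch, general applies a security context along these lines to the debug container (illustrative; the exact fields are defined by kubectl's profile implementation, not reproduced here):

```yaml
# Approximate effect of --profile=general on the ephemeral container
securityContext:
  capabilities:
    add: ["SYS_PTRACE"]   # lets strace/gdb attach to the target's processes
```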
Step 3: verify process namespace sharing
Inside the debug container:
ps aux
# PID USER COMMAND
# 1 app /usr/bin/payment-service --port=8080
# 48 root /bin/zsh
If you only see your own shell process, the container runtime does not support TARGET namespace mode. You can still debug networking and DNS, but process inspection requires --copy-to with --share-processes instead (see below).
Step 4: inspect the distroless filesystem
The debug container has its own filesystem, but you can reach the target container's root through /proc:
ls /proc/1/root/
# app etc lib tmp usr
cat /proc/1/root/app/config.yaml
PID 1 in the shared namespace is the application process. /proc/1/root is a symlink to that process's mount namespace root, which is the distroless container's filesystem.
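The /proc mechanics can be tried on any Linux machine: every process exposes its mount-namespace root at /proc/<pid>/root. For your own shell, that is simply the regular filesystem root, which you can confirm by comparing inodes:

```shell
# /proc/<pid>/root resolves to that process's mount-namespace root.
# For the current process, it is our own root filesystem:
ls /proc/self/root/etc > /dev/null

# Both paths name the same file; matching inodes confirm it:
a=$(stat -c %i /etc/passwd)
b=$(stat -c %i /proc/self/root/etc/passwd)
[ "$a" = "$b" ] && echo "same file"
```

Inside the debug container, the only difference is that PID 1 belongs to a different mount namespace, so /proc/1/root crosses into the distroless filesystem.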
Permission caveat: if the target container runs as a non-root user (e.g., UID 65532 for Chainguard images), you may need --profile=sysadmin or su to the matching UID to read /proc/1/root.
Step 5: run diagnostic commands
# Open network connections
ss -tlnp
# Test an HTTP endpoint from inside the pod network
curl localhost:8080/healthz
# DNS resolution
dig payment-api.production.svc.cluster.local
# Live packet capture on port 8080
tcpdump -i eth0 -n -s 0 port 8080
Step 6: exit and clean up
exit
The ephemeral container stops when you exit, but it cannot be removed from the pod spec. Each kubectl debug invocation adds another entry. The entries disappear when the pod itself is deleted (e.g., during a rollout). This is cosmetic, not operational: stopped ephemeral containers consume no CPU or memory.
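After a couple of debug sessions, the accumulated entries remain visible under the pod's status; it might contain something like this (an illustrative fragment, with kubectl-generated example names):

```yaml
status:
  ephemeralContainerStatuses:
  - name: debugger-abc12
    state:
      terminated:          # stopped, consuming no resources
        exitCode: 0
  - name: debugger-def34
    state:
      running:
        startedAt: "2024-01-15T10:42:00Z"
```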
Debug a copy of the pod with --copy-to
Ephemeral containers cannot change the application's entrypoint, replace its image, or keep a crashing container alive. For those scenarios, kubectl debug --copy-to creates a new pod that is a modified clone of the original.
When to use --copy-to
- The pod is stuck in CrashLoopBackOff and crashes before you can attach.
- You need to replace the distroless image with a full image (--set-image).
- You need to override the entrypoint to keep the container alive.
- You want a clean pod without accumulated ephemeral container entries.
Override the entrypoint to debug a crashing pod
kubectl debug payment-service-6b7f8d9c4-xt2kp \
-n production \
-it \
--copy-to=payment-debug \
--container=payment-service \
--set-image=payment-service=ubuntu:22.04 \
-- bash
This creates a new pod named payment-debug where the payment-service container runs ubuntu:22.04 with bash as its entrypoint instead of the original binary. Labels are stripped by default so the copy does not receive Service traffic. Liveness, readiness, and startup probes are also stripped.
Inside the copy, you can inspect environment variables, mounted secrets, and filesystem state without the application crashing immediately.
Add a debug sidecar with shared processes
kubectl debug payment-service-6b7f8d9c4-xt2kp \
-n production \
-it \
--image=nicolaka/netshoot \
--copy-to=payment-debug \
--share-processes
The --share-processes flag sets shareProcessNamespace: true on the pod copy, so all containers share one PID namespace. This is an alternative to --target when the runtime does not support TARGET namespace mode.
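On the copied pod, that corresponds to spec fields along these lines (an illustrative fragment; container names and the application image are examples):

```yaml
# Pod copy created by --copy-to with --share-processes (fragment)
spec:
  shareProcessNamespace: true      # one PID namespace for all containers
  containers:
  - name: payment-service
    image: registry.example.com/payment-service:1.4.2   # hypothetical original image
  - name: debugger                 # added by kubectl debug
    image: nicolaka/netshoot
```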
Clean up the copy
The copy is a standalone pod, not managed by any controller:
kubectl delete pod payment-debug -n production --now
Debug a node with kubectl debug node/
When the problem is on the node itself (kubelet issues, kernel logs, containerd state), kubectl debug node/ creates a privileged pod scheduled on that node with the host filesystem mounted at /host.
kubectl debug node/worker-node-3 -it --image=ubuntu --profile=sysadmin
The sysadmin profile grants a privileged security context, which you need for chroot /host. Without it, the pod starts but you cannot access all host resources.
Inside the debug pod:
chroot /host
# Kubelet logs
journalctl -u kubelet --since "1 hour ago"
# List running containers via containerd
crictl ps
# Inspect a specific container's logs
crictl logs <container-id>
# Check kernel ring buffer for OOM messages
dmesg | grep -i "oom\|kill"
The node debug pod shares the host's IPC, network, and PID namespaces. It can see all node processes, all network interfaces, and all mounted filesystems.
Clean up is manual. Node debug pods are not deleted automatically:
kubectl delete pod node-debugger-worker-node-3-pdx84 --now
RBAC permissions for ephemeral containers
The default admin ClusterRole does not grant permission to use ephemeral containers. You need to explicitly allow update on the pods/ephemeralcontainers subresource. For a deeper understanding of Kubernetes RBAC objects, see the RBAC guide.
A minimal Role for on-call debugging:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: oncall-debug
  namespace: production
rules:
- apiGroups: [""]
  resources: ["pods", "events"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["pods/exec", "pods/portforward"]
  verbs: ["create"]
# Ephemeral containers
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["update"]
For --copy-to (creates a new pod), add create and delete on pods.
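That addition could look like the following rule, appended to the rules list above (a sketch; consider scoping it further, e.g. to a dedicated debug namespace, if your policy allows):

```yaml
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["create", "delete"]
```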
Bind the role to a group, not individual users:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: oncall-debug
  namespace: production
subjects:
- kind: Group
  name: oncall-platform
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: oncall-debug
  apiGroup: rbac.authorization.k8s.io
Choosing a debug image
The image you inject determines what tools are available. Pick the smallest image that covers your scenario.
| Scenario | Image | Why |
|---|---|---|
| Network issues, DNS, packet capture | nicolaka/netshoot | 50+ network tools: tcpdump, dig, curl, ss, nmap, iperf3, grpcurl |
| Quick shell, file inspection | busybox:1.37 | ~5 MB, classic Unix utilities |
| Need to install specific tools | ubuntu:24.04 | Full package manager (apt) |
| Replacing a distroless image in --copy-to | ubuntu:24.04 | Shell, package manager, familiar filesystem layout |
nicolaka/netshoot is the de facto standard for Kubernetes network debugging. It is Alpine-based (~50 MB) and includes tcpdump, tshark, curl, dig, ss, netstat, iptables, nsenter, strace, grpcurl, and dozens more.
For busybox, pin the version. busybox:latest has had compatibility changes between releases.
Limitations and caveats
- Ephemeral containers are permanent within the pod's lifetime. Each kubectl debug call adds a container entry that stays until the pod is deleted. Stopped containers do not consume resources, but the entries accumulate in the pod spec.
- Static pods do not support ephemeral containers. If your target is a static pod (e.g., kube-apiserver on kubeadm clusters), use --copy-to or kubectl debug node/ instead.
- Ephemeral containers cannot have resource requests or limits, ports, liveness probes, or readiness probes. They are deliberately limited to prevent interference with the pod's scheduling and lifecycle.
- --target depends on the CRI. containerd 1.5+ with runc supports TARGET PID namespace mode. Older runtimes may silently fall back to an isolated PID namespace.
- Security risk. Anyone with pods/ephemeralcontainers update permission can inject a container into any pod they can see. The raw PATCH endpoint allows setting arbitrary security contexts. Restrict this permission tightly and enable audit logging for the subresource.
- Policy tools must validate ephemeralContainers. Kyverno (1.5.3+) and Gatekeeper validate ephemeral container specs. Older versions of these tools only checked containers and initContainers, allowing ephemeral containers to bypass policies.
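To see why the security warning matters: nothing in the API itself stops a caller with this permission from sending a raw patch that requests an arbitrary security context. A request body along these lines (illustrative) would inject a privileged container into the pod unless admission policy rejects it:

```json
{
  "spec": {
    "ephemeralContainers": [
      {
        "name": "debugger",
        "image": "busybox",
        "stdin": true,
        "tty": true,
        "securityContext": { "privileged": true }
      }
    ]
  }
}
```

This is exactly the class of request that Pod Security admission or your policy engine needs to validate, which is why the ephemeralContainers field must be covered.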
Common troubleshooting
"error: ephemeralcontainers are disabled for this cluster" means the EphemeralContainers feature gate is not enabled. This only happens on clusters older than Kubernetes 1.23 (where the feature was alpha and off by default). Upgrade to 1.25+ or enable the feature gate on both kube-apiserver and kubelet.
"forbidden: User cannot patch resource pods/ephemeralcontainers" means RBAC is missing the pods/ephemeralcontainers update verb. See RBAC setup.
ps shows only one process inside the debug container means the container runtime does not support TARGET PID namespace mode, or you forgot the --target flag. Verify your runtime version (containerd 1.5+ needed) or use --copy-to with --share-processes as an alternative.
Node debug pod cannot chroot /host means you did not pass --profile=sysadmin. The default profile does not grant privileged access.