Table of contents
- Before you start
- Why etcd backups are a security concern
- etcdctl vs etcdutl: which tool for what
- Taking a snapshot
- Verifying a snapshot
- Automating backups with a CronJob
- Restore: single-node control plane
- Restore: multi-node HA cluster
- The revision bump problem
- Protecting snapshot files
- Testing your backups
- Common pitfalls
- When to escalate
Before you start
This guide targets self-managed Kubernetes clusters deployed with kubeadm. If you run EKS, GKE, or AKS, the cloud provider manages etcd on your behalf and you cannot access it directly. Your backup responsibility on managed clusters shifts to Kubernetes objects (via Velero) and persistent volume data.
Prerequisites:
- SSH access to every control plane node
- etcdctl and etcdutl binaries matching your running etcd version (check with etcdctl version)
- TLS certificates for the etcd endpoint. On kubeadm clusters, these live at:
  - CA: /etc/kubernetes/pki/etcd/ca.crt
  - Cert: /etc/kubernetes/pki/etcd/server.crt
  - Key: /etc/kubernetes/pki/etcd/server.key
- Confirm certificate paths from the static pod manifest: grep file /etc/kubernetes/manifests/etcd.yaml
- Valid etcd certificates. If etcdctl endpoint health reports x509 expiry errors, renew the kubeadm-managed PKI first; a snapshot taken with expired client certs will fail before it ever writes to disk.
- Off-cluster storage destination for snapshots (S3, GCS, NFS)
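Before the first snapshot, a quick pre-flight check catches version mismatches and expired certificates early. This is a minimal sketch assuming the kubeadm default paths listed above:

# Confirm both tools are present and match the running etcd version
etcdctl version
etcdutl version

# Confirm the kubeadm certificate files exist and the server certificate has not expired
sudo ls -l /etc/kubernetes/pki/etcd/ca.crt \
           /etc/kubernetes/pki/etcd/server.crt \
           /etc/kubernetes/pki/etcd/server.key
sudo openssl x509 -in /etc/kubernetes/pki/etcd/server.crt -noout -enddate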
Why etcd backups are a security concern
etcd stores the entire cluster state in binary protobuf. That includes every Kubernetes Secret. Secrets are base64-encoded by default, not encrypted. An etcd snapshot is therefore a plaintext dump of every database password, API key, TLS certificate, and service account token in the cluster.
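You can see this for yourself on a non-production cluster. The Secret name and value below are hypothetical; the point is that the raw value comes straight back out of the snapshot file with nothing more than strings, unless encryption at rest is enabled:

# Create a throwaway Secret, take a snapshot as shown later in this guide, then:
kubectl create secret generic demo-creds --from-literal=password=SuperSecret123
strings /opt/backup/etcd-<timestamp>.db | grep SuperSecret123
# The plaintext value is readable directly from the "backup" file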
This is why etcd backup files need the same security classification as the most sensitive secrets in your cluster. It is also why this article lives in the security subcategory rather than a generic operations guide. For a deeper comparison of tools that keep secrets out of etcd entirely, see Kubernetes secrets management: Sealed Secrets, ESO, and Vault compared.
etcdctl vs etcdutl: which tool for what
This distinction trips people up because many tutorials still show outdated commands.
| Tool | Purpose | Needs running etcd? |
|---|---|---|
| etcdctl | Day-to-day operations: key/value management, member list, health checks, snapshot save | Yes |
| etcdutl | Offline administration: snapshot restore, snapshot status, defrag, data migration | No |
etcdctl snapshot restore was deprecated in etcd v3.5 and removed in etcd v3.6. Always use etcdutl snapshot restore. The same applies to etcdctl snapshot status, which is also removed in v3.6. Use etcdutl snapshot status instead.
Every restore command in this article uses etcdutl. If you find a tutorial that tells you to run etcdctl snapshot restore, it is outdated.
Taking a snapshot
A snapshot captures every Kubernetes API object at a point in time: Pods, Deployments, Services, ConfigMaps, Secrets, RBAC rules, CRDs, and cluster membership metadata. It does not include persistent volume data, container images, or system logs.
# Create a timestamped snapshot (kubeadm cert paths)
sudo etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /opt/backup/etcd-$(date +%Y%m%d-%H%M%S).db
The command connects to the running etcd instance and writes a consistent snapshot to disk. Even on large clusters, this completes in under 10 minutes.
Verifying a snapshot
Every snapshot should be verified immediately after creation. A backup you discover is corrupt during a disaster is not a backup.
etcdutl snapshot status /opt/backup/etcd-20260409-140000.db --write-out=table
Expected output includes a hash, revision number, total keys, and total size. A non-zero key count and a valid hash confirm the snapshot is healthy.
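In automation, the exit code is enough to turn a quiet corruption into a loud failure. A minimal sketch, assuming etcdutl returns non-zero when it cannot read the snapshot file:

# Fail loudly if the snapshot cannot be read
SNAP=/opt/backup/etcd-20260409-140000.db
if ! etcdutl snapshot status "${SNAP}" --write-out=table; then
  echo "snapshot verification failed for ${SNAP}" >&2
  exit 1
fi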
Automating backups with a CronJob
Manual snapshots are a starting point. Automated, recurring snapshots stored off-cluster are what keep you alive when things go wrong.
Recommended intervals:
| Cluster type | Interval |
|---|---|
| Production, high change rate | Every 1 to 4 hours |
| Production, normal | Every 6 hours |
| Staging or development | Daily |
This CronJob runs on a control plane node, takes a snapshot every 6 hours, verifies it, and cleans up files older than 7 days:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 */6 * * *" # every 6 hours
  jobTemplate:
    spec:
      template:
        spec:
          hostNetwork: true # required to reach etcd on 127.0.0.1:2379
          tolerations:
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
            effect: NoSchedule
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          containers:
          - name: etcd-backup
            image: bitnami/etcd:3.5 # pin to your cluster's etcd version
            command:
            - /bin/sh
            - -c
            - |
              TIMESTAMP=$(date +%Y%m%d-%H%M%S)
              BACKUP_FILE=/backup/etcd-${TIMESTAMP}.db
              ETCDCTL_API=3 etcdctl snapshot save ${BACKUP_FILE} \
                --endpoints=https://127.0.0.1:2379 \
                --cacert=/etc/kubernetes/pki/etcd/ca.crt \
                --cert=/etc/kubernetes/pki/etcd/server.crt \
                --key=/etc/kubernetes/pki/etcd/server.key
              etcdutl snapshot status ${BACKUP_FILE} --write-out=table
              find /backup -name "etcd-*.db" -mtime +7 -delete
            volumeMounts:
            - name: etcd-certs
              mountPath: /etc/kubernetes/pki/etcd
              readOnly: true
            - name: backup-storage
              mountPath: /backup
          volumes:
          - name: etcd-certs
            hostPath:
              path: /etc/kubernetes/pki/etcd
          - name: backup-storage
            persistentVolumeClaim:
              claimName: etcd-backup-pvc # point to off-cluster storage
          restartPolicy: OnFailure
Store snapshots off-cluster. Backups stored only on the cluster nodes are lost when the cluster is lost. Push snapshots to S3 (with SSE-KMS), GCS, Azure Blob, or an NFS server external to the cluster. Encrypt before or during upload: the snapshot contains every Secret in the cluster.
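One way to do this is a small post-backup step on the control plane node. This sketch assumes the AWS CLI is installed and that the bucket name and KMS key alias (both hypothetical) already exist:

# Push the newest snapshot to S3 with server-side KMS encryption
LATEST=$(ls -t /opt/backup/etcd-*.db | head -n 1)
aws s3 cp "${LATEST}" s3://example-etcd-backups/ \
  --sse aws:kms \
  --sse-kms-key-id alias/etcd-backups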
The adfinis/kubernetes-etcd-backup operator is an alternative if you prefer a dedicated controller over a raw CronJob.
Restore: single-node control plane
This procedure applies to kubeadm clusters with a single control plane node. The goal: replace the current etcd data directory with the snapshot's contents and restart the control plane.
Step 1: stop the API server and etcd.
Move the static pod manifests out of kubelet's watch directory. kubelet will stop the containers within 15 to 30 seconds.
sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sudo mv /etc/kubernetes/manifests/etcd.yaml /tmp/
# Verify both containers stopped
sudo crictl ps | grep -E 'etcd|apiserver'
You should see no output. If containers are still running, wait and check again.
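If you prefer not to poll by hand, a short loop around the same crictl check does the waiting for you (a sketch, not required):

# Wait until both control plane containers have stopped
while sudo crictl ps | grep -Eq 'etcd|apiserver'; do
  echo "waiting for etcd and kube-apiserver containers to stop..."
  sleep 5
done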
Step 2: back up the existing data directory.
sudo mv /var/lib/etcd /var/lib/etcd.bak
Keep this until you confirm the restore succeeded.
Step 3: restore the snapshot.
sudo etcdutl snapshot restore /opt/backup/etcd-20260409-140000.db \
--data-dir /var/lib/etcd \
--bump-revision 1000000000 \
--mark-compacted
The --bump-revision and --mark-compacted flags are explained in the revision bump problem section. Always include them for Kubernetes cluster restores.
Step 4: fix file ownership.
sudo chown -R etcd:etcd /var/lib/etcd
etcd must be able to read and write the restored data directory. On installations where etcd runs as the dedicated etcd user, wrong ownership prevents it from starting.
Step 5: bring the static pods back.
sudo mv /tmp/etcd.yaml /etc/kubernetes/manifests/
sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/
Wait 30 to 60 seconds for kubelet to recreate the containers.
Step 6: restart kubelet and verify.
sudo systemctl restart kubelet
# Verify cluster health
kubectl get nodes
kubectl get pods --all-namespaces
# Verify etcd health
sudo etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
endpoint health
You will know the restore succeeded when kubectl get nodes shows all nodes as Ready and endpoint health reports the endpoint as healthy.
Safer alternative: instead of overwriting /var/lib/etcd, restore to a new directory (e.g., /var/lib/etcd-restored) and update the volumes[etcd-data].hostPath.path in /tmp/etcd.yaml before moving it back. This way the original data directory survives if the restore fails.
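A sketch of that approach, assuming the kubeadm default volume name etcd-data in the manifest:

# Restore into a fresh directory instead of overwriting /var/lib/etcd
sudo etcdutl snapshot restore /opt/backup/etcd-20260409-140000.db \
  --data-dir /var/lib/etcd-restored \
  --bump-revision 1000000000 \
  --mark-compacted
sudo chown -R etcd:etcd /var/lib/etcd-restored

# In /tmp/etcd.yaml, point the hostPath volume at the new directory before moving it back:
#   volumes:
#   - hostPath:
#       path: /var/lib/etcd-restored   # was /var/lib/etcd
#     name: etcd-data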
Restore: multi-node HA cluster
This is the most complex scenario. A multi-node restore creates a new logical cluster with new member IDs and a new cluster ID. All members must be restored from the same snapshot before any of them are started.
The sequence is non-negotiable: stop all, restore all, start all.
Starting one restored member before the others are ready breaks Raft quorum. The cluster will not become healthy.
Step 1: stop etcd and the API server on all nodes.
Run on every control plane node:
sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sudo mv /etc/kubernetes/manifests/etcd.yaml /tmp/
sudo crictl ps | grep -E 'etcd|apiserver' # confirm stopped
Step 2: distribute the snapshot.
Copy the same snapshot file to all control plane nodes. Do not take separate snapshots on each node.
scp /opt/backup/etcd-20260409-140000.db cp2.internal:/opt/backup/
scp /opt/backup/etcd-20260409-140000.db cp3.internal:/opt/backup/
Step 3: restore on each node with node-specific parameters.
Each node uses the same snapshot and the same --initial-cluster-token, but its own --name and --initial-advertise-peer-urls:
# On cp1 (10.0.1.10):
sudo etcdutl snapshot restore /opt/backup/etcd-20260409-140000.db \
--name m1 \
--data-dir /var/lib/etcd \
--initial-cluster m1=https://10.0.1.10:2380,m2=https://10.0.1.11:2380,m3=https://10.0.1.12:2380 \
--initial-cluster-token etcd-cluster-restored \
--initial-advertise-peer-urls https://10.0.1.10:2380 \
--bump-revision 1000000000 \
--mark-compacted
# On cp2 (10.0.1.11):
sudo etcdutl snapshot restore /opt/backup/etcd-20260409-140000.db \
--name m2 \
--data-dir /var/lib/etcd \
--initial-cluster m1=https://10.0.1.10:2380,m2=https://10.0.1.11:2380,m3=https://10.0.1.12:2380 \
--initial-cluster-token etcd-cluster-restored \
--initial-advertise-peer-urls https://10.0.1.11:2380 \
--bump-revision 1000000000 \
--mark-compacted
# On cp3 (10.0.1.12):
sudo etcdutl snapshot restore /opt/backup/etcd-20260409-140000.db \
--name m3 \
--data-dir /var/lib/etcd \
--initial-cluster m1=https://10.0.1.10:2380,m2=https://10.0.1.11:2380,m3=https://10.0.1.12:2380 \
--initial-cluster-token etcd-cluster-restored \
--initial-advertise-peer-urls https://10.0.1.12:2380 \
--bump-revision 1000000000 \
--mark-compacted
The --initial-cluster-token must be different from the token the old cluster used. This prevents restored members from accidentally communicating with old instances that might still be running somewhere.
Step 4: fix ownership on all nodes.
sudo chown -R etcd:etcd /var/lib/etcd
Step 5: start etcd first, then the API servers.
Restore the etcd manifest on all nodes. Wait for etcd to form quorum (check logs with sudo crictl logs <etcd-container-id>). Then restore the API server manifests:
# On all nodes:
sudo mv /tmp/etcd.yaml /etc/kubernetes/manifests/
# Wait for etcd quorum, then:
sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/
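A simple way to wait for quorum before moving the API server manifests back is to loop on the local health check (a sketch using the same certificates as earlier; endpoint health only succeeds once the member can serve requests, which requires quorum):

until sudo etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health; do
  echo "waiting for etcd quorum..."
  sleep 5
done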
Step 6: restart kubelet and verify.
sudo systemctl restart kubelet
kubectl get nodes
kubectl get pods --all-namespaces
sudo etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
member list
You will know it worked when member list shows all three members and kubectl get nodes reports all nodes Ready.
The revision bump problem
etcd revision is a monotonically increasing integer. Kubernetes controllers and informers use it as the resourceVersion to track ordering and cache freshness. After a restore, the revision rolls back to the snapshot's value, which is lower than what controllers last observed.
The result: stale watch caches that believe they are up-to-date, and errors like etcdserver: mvcc: required revision has been compacted in the API server logs. The etcd maintainers added two flags specifically to address this:
- --bump-revision 1000000000 adds 1 billion to the restored revision, pushing it higher than any revision controllers have seen.
- --mark-compacted marks all old revisions as compacted. This forces every open watch to terminate and reinitialize from the new state, flushing all informer caches.
The value 1,000,000,000 is a community convention based on typical cluster revision rates. It is not formally specified in the etcd docs, but it appears consistently across production restore procedures.
Both flags are available on etcdutl snapshot restore in etcd v3.5 and later. Always include them when restoring etcd for a Kubernetes cluster.
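If you want to confirm the bump took effect, the current revision is visible in the response header of endpoint status. A sketch assuming jq is installed (the exact JSON field casing can vary between etcd releases):

sudo etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint status --write-out=json | jq '.[0].Status.header.revision'
# Should be at least 1,000,000,000 higher than the revision reported by etcdutl snapshot status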
Protecting snapshot files
An etcd snapshot is a complete copy of all cluster state. Unless you have configured encryption at rest via an EncryptionConfiguration file passed to the kube-apiserver (AES-CBC, AES-GCM, or a KMS provider), every Secret is readable in the snapshot file.
Practical rules:
- Store on encrypted storage (S3 with SSE-KMS, an encrypted filesystem, or both).
- Restrict access with IAM policies, file permissions, or firewall rules.
- Never store backup files only on the cluster nodes.
- If your cluster uses a KMS provider for encryption at rest, you need access to the KMS key at restore time. Losing that key makes every snapshot unrestorable.
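A minimal local hardening step before a snapshot leaves the node, assuming gpg is available (the symmetric passphrase here is illustrative; use your own key management):

# Tighten permissions and produce an encrypted copy; ship the .gpg file, not the plaintext .db
sudo chmod 600 /opt/backup/etcd-20260409-140000.db
sudo gpg --symmetric --cipher-algo AES256 /opt/backup/etcd-20260409-140000.db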
For background on why etcd stores secrets unencrypted by default and how workload identity sidesteps the problem entirely, see Kubernetes workload identity.
Testing your backups
A backup that has never been restored is a backup you cannot trust.
Minimum testing cadence:
- After every backup: verify with etcdutl snapshot status. A non-zero key count and valid hash confirm integrity.
- Quarterly: full restore to a staging or throwaway cluster from a real production snapshot. Walk through the entire procedure, including the health checks at the end.
Restore verification checklist:
- Snapshot restore completes without errors
- kubectl get nodes shows all nodes Ready
- kubectl get pods --all-namespaces shows pods in expected state
- etcdctl endpoint health reports healthy
- Application-level health checks pass
- RBAC rules and secrets are intact (spot-check a known Secret)
Monitor CronJob failures with Prometheus alerts or by watching Kubernetes Job events. A silently failing backup CronJob is worse than no automation at all, because it gives false confidence.
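Even before alerting is wired up, a manual spot-check is quick. Job names are prefixed with the CronJob name, so recent runs are easy to list (a sketch):

kubectl -n kube-system get cronjob etcd-backup
kubectl -n kube-system get jobs --sort-by=.metadata.creationTimestamp | grep etcd-backup | tail -n 5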
Common pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| Using etcdctl snapshot restore | Fails on etcd v3.6, deprecated in v3.5 | Use etcdutl snapshot restore |
| Not stopping API server before restore | Data corruption | Stop static pods first |
| Storing backups only on-cluster | Backup lost with the cluster | Off-cluster storage (S3, NFS, GCS) |
| Skipping snapshot verification | Corrupt backup discovered during a disaster | etcdutl snapshot status after every backup |
| Omitting --bump-revision / --mark-compacted | Stale watch caches, controller errors | Always include for Kubernetes restores |
| Starting one HA member before restoring all | Raft quorum not achieved | Stop all, restore all, start all |
| Using different snapshots on different nodes | Split-brain, inconsistent state | Same snapshot distributed to all nodes |
| Wrong file ownership after restore | etcd fails to start | chown -R etcd:etcd /var/lib/etcd |
| Skipping kubelet and control plane restart | Stale in-memory state | Restart kubelet, delete scheduler/controller-manager pods if needed |
| Storing unencrypted snapshots | Every Secret exposed in plaintext | Encrypted storage + restricted access |
When to escalate
If the restore procedure does not resolve the issue, collect the following before asking for help:
- etcd version (etcdctl version)
- Kubernetes version (kubectl version)
- The exact error output from the restore command
- etcd container logs (sudo crictl logs <etcd-container-id>)
- API server logs (sudo crictl logs <apiserver-container-id>)
- Whether the cluster uses encryption at rest and which KMS provider
- Whether this is a single-node or multi-node cluster
- The output of etcdctl member list and etcdctl endpoint status --write-out=table
- Whether you used --bump-revision and --mark-compacted