Kubernetes etcd: backup, restore, and disaster recovery

etcd holds every object in your Kubernetes cluster: Deployments, Secrets, RBAC rules, CRDs. Losing it means losing the cluster. This guide covers snapshot creation with etcdctl, automated backups via a CronJob, single-node restore, multi-node HA restore, and the revision-bump flags that prevent controller cache corruption after recovery.

Before you start

This guide targets self-managed Kubernetes clusters deployed with kubeadm. If you run EKS, GKE, or AKS, the cloud provider manages etcd on your behalf and you cannot access it directly. Your backup responsibility on managed clusters shifts to Kubernetes objects (via Velero) and persistent volume data.

Prerequisites:

  • SSH access to every control plane node
  • etcdctl and etcdutl binaries matching your running etcd version (check with etcdctl version)
  • TLS certificates for the etcd endpoint. On kubeadm clusters, these live at:
    • CA: /etc/kubernetes/pki/etcd/ca.crt
    • Cert: /etc/kubernetes/pki/etcd/server.crt
    • Key: /etc/kubernetes/pki/etcd/server.key
  • Confirm certificate paths from the static pod manifest: grep file /etc/kubernetes/manifests/etcd.yaml
  • Valid etcd certificates. If etcdctl endpoint health reports x509 expiry errors, renew the kubeadm-managed PKI first; a snapshot taken with expired client certs will fail before it ever writes to disk. The health check after this list shows the full command.
  • Off-cluster storage destination for snapshots (S3, GCS, NFS)
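
A quick way to confirm connectivity and certificate validity before taking any snapshots; a minimal check using the kubeadm certificate paths listed above:

# Confirm etcdctl can reach etcd and the client certs are still valid
sudo etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health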

Why etcd backups are a security concern

etcd stores the entire cluster state in binary protobuf. That includes every Kubernetes Secret. Secrets are base64-encoded by default, not encrypted. An etcd snapshot is therefore a plaintext dump of every database password, API key, TLS certificate, and service account token in the cluster.
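
To make this concrete: secret material can be pulled out of a snapshot file with ordinary tools. A minimal illustration, where the snapshot path and the Secret name db-credentials are hypothetical placeholders:

# Raw Secret bytes are visible inside a snapshot file unless encryption at rest is enabled
strings /opt/backup/etcd.db | grep -A2 '/registry/secrets/default/db-credentials'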

This is why etcd backup files need the same security classification as the most sensitive secrets in your cluster. It is also why this article lives in the security subcategory rather than a generic operations guide. For a deeper comparison of tools that keep secrets out of etcd entirely, see Kubernetes secrets management: Sealed Secrets, ESO, and Vault compared.

etcdctl vs etcdutl: which tool for what

This distinction trips people up because many tutorials still show outdated commands.

Tool | Purpose | Needs running etcd?
etcdctl | Day-to-day operations: key/value management, member list, health checks, snapshot save | Yes
etcdutl | Offline administration: snapshot restore, snapshot status, defrag, data migration | No

etcdctl snapshot restore was deprecated in etcd v3.5 and removed in etcd v3.6. Always use etcdutl snapshot restore. The same applies to etcdctl snapshot status, which is also removed in v3.6. Use etcdutl snapshot status instead.

Every restore command in this article uses etcdutl. If you find a tutorial that tells you to run etcdctl snapshot restore, it is outdated.
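
If you are unsure which versions you have on a node, both tools report themselves; etcdutl ships alongside etcdctl in the etcd release tarball:

# Both should report a version matching the running etcd server
etcdctl version
etcdutl version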

Taking a snapshot

A snapshot captures every Kubernetes API object at a point in time: Pods, Deployments, Services, ConfigMaps, Secrets, RBAC rules, CRDs, and cluster membership metadata. It does not include persistent volume data, container images, or system logs.

# Create a timestamped snapshot (kubeadm cert paths)
sudo etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /opt/backup/etcd-$(date +%Y%m%d-%H%M%S).db

The command connects to the running etcd instance and writes a consistent snapshot to disk. Even on large clusters, this completes in under 10 minutes.

Verifying a snapshot

Every snapshot should be verified immediately after creation. A backup you discover is corrupt during a disaster is not a backup.

etcdutl snapshot status /opt/backup/etcd-20260409-140000.db --write-out=table

Expected output includes a hash, revision number, total keys, and total size. A non-zero key count and a valid hash confirm the snapshot is healthy.

Automating backups with a CronJob

Manual snapshots are a starting point. Automated, recurring snapshots stored off-cluster are what keep you alive when things go wrong.

Recommended intervals:

Cluster type | Interval
Production, high change rate | Every 1 to 4 hours
Production, normal | Every 6 hours
Staging or development | Daily

This CronJob runs on a control plane node, takes a snapshot every 6 hours, verifies it, and cleans up files older than 7 days:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 */6 * * *"   # every 6 hours
  jobTemplate:
    spec:
      template:
        spec:
          hostNetwork: true  # required to reach etcd on 127.0.0.1:2379
          tolerations:
            - key: node-role.kubernetes.io/control-plane
              operator: Exists
              effect: NoSchedule
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          containers:
            - name: etcd-backup
              image: bitnami/etcd:3.5  # pin to your cluster's etcd version
              command:
                - /bin/sh
                - -c
                - |
                  TIMESTAMP=$(date +%Y%m%d-%H%M%S)
                  BACKUP_FILE=/backup/etcd-${TIMESTAMP}.db
                  ETCDCTL_API=3 etcdctl snapshot save ${BACKUP_FILE} \
                    --endpoints=https://127.0.0.1:2379 \
                    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
                    --cert=/etc/kubernetes/pki/etcd/server.crt \
                    --key=/etc/kubernetes/pki/etcd/server.key
                  etcdutl snapshot status ${BACKUP_FILE} --write-out=table
                  find /backup -name "etcd-*.db" -mtime +7 -delete
              volumeMounts:
                - name: etcd-certs
                  mountPath: /etc/kubernetes/pki/etcd
                  readOnly: true
                - name: backup-storage
                  mountPath: /backup
          volumes:
            - name: etcd-certs
              hostPath:
                path: /etc/kubernetes/pki/etcd
            - name: backup-storage
              persistentVolumeClaim:
                claimName: etcd-backup-pvc  # point to off-cluster storage
          restartPolicy: OnFailure

Store snapshots off-cluster. Backups stored only on the cluster nodes are lost when the cluster is lost. Push snapshots to S3 (with SSE-KMS), GCS, Azure Blob, or an NFS server external to the cluster. Encrypt before or during upload: the snapshot contains every Secret in the cluster.
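
One way to push each snapshot off-cluster right after it is taken; a sketch using the AWS CLI, where the bucket name and KMS key alias are placeholders:

# Upload with server-side encryption under a customer-managed KMS key
aws s3 cp /opt/backup/etcd-20260409-140000.db \
  s3://example-etcd-backups/$(hostname)/ \
  --sse aws:kms \
  --sse-kms-key-id alias/etcd-backups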

The adfinis/kubernetes-etcd-backup operator is an alternative if you prefer a dedicated controller over a raw CronJob.

Restore: single-node control plane

This procedure applies to kubeadm clusters with a single control plane node. The goal: replace the current etcd data directory with the snapshot's contents and restart the control plane.

Step 1: stop the API server and etcd.

Move the static pod manifests out of kubelet's watch directory. kubelet will stop the containers within 15 to 30 seconds.

sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sudo mv /etc/kubernetes/manifests/etcd.yaml /tmp/

# Verify both containers stopped
sudo crictl ps | grep -E 'etcd|apiserver'

You should see no output. If containers are still running, wait and check again.

Step 2: back up the existing data directory.

sudo mv /var/lib/etcd /var/lib/etcd.bak

Keep this until you confirm the restore succeeded.

Step 3: restore the snapshot.

sudo etcdutl snapshot restore /opt/backup/etcd-20260409-140000.db \
  --data-dir /var/lib/etcd \
  --bump-revision 1000000000 \
  --mark-compacted

The --bump-revision and --mark-compacted flags are explained in the revision bump problem section. Always include them for Kubernetes cluster restores.

Step 4: fix file ownership.

sudo chown -R etcd:etcd /var/lib/etcd

etcd must be able to read and write the restored data directory. On installs where etcd runs as a dedicated etcd user, wrong ownership prevents it from starting; if your etcd static pod runs as root (the kubeadm default), the directory stays root-owned and you can skip the chown.

Step 5: bring the static pods back.

sudo mv /tmp/etcd.yaml /etc/kubernetes/manifests/
sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/

Wait 30 to 60 seconds for kubelet to recreate the containers.

Step 6: restart kubelet and verify.

sudo systemctl restart kubelet

# Verify cluster health
kubectl get nodes
kubectl get pods --all-namespaces

# Verify etcd health
sudo etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health

You will know the restore succeeded when kubectl get nodes shows all nodes as Ready and etcdctl endpoint health reports the endpoint as healthy.

Safer alternative: instead of overwriting /var/lib/etcd, restore to a new directory (e.g., /var/lib/etcd-restored) and update the volumes[etcd-data].hostPath.path in /tmp/etcd.yaml before moving it back. This way the original data directory survives if the restore fails.
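
A sketch of that alternative, assuming /var/lib/etcd-restored as the new directory; adjust the sed expression if your manifest formats the hostPath differently:

# Restore into a fresh directory instead of overwriting /var/lib/etcd
sudo etcdutl snapshot restore /opt/backup/etcd-20260409-140000.db \
  --data-dir /var/lib/etcd-restored \
  --bump-revision 1000000000 \
  --mark-compacted

# Point the etcd-data hostPath at the new directory before moving the manifest back
sudo sed -i 's|path: /var/lib/etcd$|path: /var/lib/etcd-restored|' /tmp/etcd.yaml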

Restore: multi-node HA cluster

This is the most complex scenario. A multi-node restore creates a new logical cluster with new member IDs and a new cluster ID. All members must be restored from the same snapshot before any of them are started.

The sequence is non-negotiable: stop all, restore all, start all.

Starting one restored member before the others are ready breaks Raft quorum. The cluster will not become healthy.

Step 1: stop etcd and the API server on all nodes.

Run on every control plane node:

sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sudo mv /etc/kubernetes/manifests/etcd.yaml /tmp/
sudo crictl ps | grep -E 'etcd|apiserver'  # confirm stopped

Step 2: distribute the snapshot.

Copy the same snapshot file to all control plane nodes. Do not take separate snapshots on each node.

scp /opt/backup/etcd-20260409-140000.db cp2.internal:/opt/backup/
scp /opt/backup/etcd-20260409-140000.db cp3.internal:/opt/backup/
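
Before restoring, it is worth confirming every node holds a byte-identical copy; for example:

# The sha256 hashes must match on all three nodes
sha256sum /opt/backup/etcd-20260409-140000.db
ssh cp2.internal sha256sum /opt/backup/etcd-20260409-140000.db
ssh cp3.internal sha256sum /opt/backup/etcd-20260409-140000.db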

Step 3: restore on each node with node-specific parameters.

Each node uses the same snapshot and the same --initial-cluster-token, but its own --name and --initial-advertise-peer-urls:

# On cp1 (10.0.1.10):
sudo etcdutl snapshot restore /opt/backup/etcd-20260409-140000.db \
  --name m1 \
  --data-dir /var/lib/etcd \
  --initial-cluster m1=https://10.0.1.10:2380,m2=https://10.0.1.11:2380,m3=https://10.0.1.12:2380 \
  --initial-cluster-token etcd-cluster-restored \
  --initial-advertise-peer-urls https://10.0.1.10:2380 \
  --bump-revision 1000000000 \
  --mark-compacted

# On cp2 (10.0.1.11):
sudo etcdutl snapshot restore /opt/backup/etcd-20260409-140000.db \
  --name m2 \
  --data-dir /var/lib/etcd \
  --initial-cluster m1=https://10.0.1.10:2380,m2=https://10.0.1.11:2380,m3=https://10.0.1.12:2380 \
  --initial-cluster-token etcd-cluster-restored \
  --initial-advertise-peer-urls https://10.0.1.11:2380 \
  --bump-revision 1000000000 \
  --mark-compacted

# On cp3 (10.0.1.12):
sudo etcdutl snapshot restore /opt/backup/etcd-20260409-140000.db \
  --name m3 \
  --data-dir /var/lib/etcd \
  --initial-cluster m1=https://10.0.1.10:2380,m2=https://10.0.1.11:2380,m3=https://10.0.1.12:2380 \
  --initial-cluster-token etcd-cluster-restored \
  --initial-advertise-peer-urls https://10.0.1.12:2380 \
  --bump-revision 1000000000 \
  --mark-compacted

The --initial-cluster-token must be different from the token the old cluster used. This prevents restored members from accidentally communicating with old instances that might still be running somewhere.

Step 4: fix ownership on all nodes.

sudo chown -R etcd:etcd /var/lib/etcd

Step 5: start etcd first, then the API servers.

Restore the etcd manifest on all nodes. Wait for etcd to form quorum (check logs with sudo crictl logs <etcd-container-id>). Then restore the API server manifests:

# On all nodes:
sudo mv /tmp/etcd.yaml /etc/kubernetes/manifests/

# Wait for etcd quorum, then:
sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/
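
The quorum check referenced above can be done by querying all three members at once; a sketch using the example addresses from Step 3:

# A healthy cluster shows one leader and no errors across all endpoints
sudo etcdctl \
  --endpoints=https://10.0.1.10:2379,https://10.0.1.11:2379,https://10.0.1.12:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint status --write-out=table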

Step 6: restart kubelet and verify.

sudo systemctl restart kubelet
kubectl get nodes
kubectl get pods --all-namespaces

sudo etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list

You will know it worked when member list shows all three members and kubectl get nodes reports all nodes Ready.

The revision bump problem

etcd revision is a monotonically increasing integer. Kubernetes controllers and informers use it as the resourceVersion to track ordering and cache freshness. After a restore, the revision rolls back to the snapshot's value, which is lower than what controllers last observed.

The result: stale watch caches that believe they are up-to-date, and errors like etcdserver: mvcc: required revision has been compacted in the API server logs. The etcd maintainers added two flags specifically to address this:

  • --bump-revision 1000000000 adds 1 billion to the restored revision, pushing it higher than any revision controllers have seen.
  • --mark-compacted marks all old revisions as compacted. This forces every open watch to terminate and reinitialize from the new state, flushing all informer caches.

The value 1,000,000,000 is a community convention based on typical cluster revision rates. It is not formally specified in the etcd docs, but it appears consistently across production restore procedures.

Both flags are available on etcdutl snapshot restore in etcd v3.5 and later. Always include them when restoring etcd for a Kubernetes cluster.
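
If you want to confirm the bump took effect once the restored cluster is back up, any read returns the current revision in its response header. A minimal sketch using the kubeadm cert paths; the key queried here is just an example:

# The reported revision should exceed 1,000,000,000 after a bumped restore
sudo etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry/namespaces/default -w json | grep -o '"revision":[0-9]*'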

Protecting snapshot files

An etcd snapshot is a complete copy of all cluster state. Unless you have configured encryption at rest via an EncryptionConfiguration file passed to the kube-apiserver with --encryption-provider-config (AES-CBC, AES-GCM, or a KMS provider), every Secret is readable in the snapshot file.

Practical rules:

  • Store on encrypted storage (S3 with SSE-KMS, an encrypted filesystem, or both); a client-side encryption sketch follows this list.
  • Restrict access with IAM policies, file permissions, or firewall rules.
  • Never store backup files only on the cluster nodes.
  • If your cluster uses a KMS provider for encryption at rest, you need access to the KMS key at restore time. Losing that key makes every snapshot unrestorable.
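
A minimal client-side encryption sketch before the file leaves the node; gpg symmetric mode is used here purely as an example, and any vetted tool (age, KMS envelope encryption) works just as well:

# Encrypt the snapshot with a passphrase before uploading, then lock down permissions
gpg --symmetric --cipher-algo AES256 \
  --output /opt/backup/etcd-20260409-140000.db.gpg \
  /opt/backup/etcd-20260409-140000.db
chmod 600 /opt/backup/etcd-20260409-140000.db.gpg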

For background on why etcd stores secrets unencrypted by default and how workload identity sidesteps the problem entirely, see Kubernetes workload identity.

Testing your backups

A backup that has never been restored is a backup you cannot trust.

Minimum testing cadence:

  • After every backup: verify with etcdutl snapshot status. A non-zero key count and valid hash confirm integrity.
  • Quarterly: full restore to a staging or throwaway cluster from a real production snapshot. Walk through the entire procedure, including the health checks at the end.

Restore verification checklist:

  1. Snapshot restore completes without errors
  2. kubectl get nodes shows all nodes Ready
  3. kubectl get pods --all-namespaces shows pods in expected state
  4. etcdctl endpoint health reports healthy
  5. Application-level health checks pass
  6. RBAC rules and secrets are intact (spot-check a known Secret)

Monitor CronJob failures with Prometheus alerts or by watching Kubernetes Job events. A silently failing backup CronJob is worse than no automation at all, because it gives false confidence.
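
A quick manual check that complements alerting; .status.lastSuccessfulTime is populated by the CronJob controller on reasonably recent Kubernetes versions:

# If lastSuccessfulTime is older than the schedule interval, the backup is silently failing
kubectl get cronjob etcd-backup -n kube-system \
  -o jsonpath='last schedule: {.status.lastScheduleTime}{"\n"}last success: {.status.lastSuccessfulTime}{"\n"}'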

Common pitfalls

Pitfall | Consequence | Prevention
Using etcdctl snapshot restore | Fails on etcd v3.6, deprecated in v3.5 | Use etcdutl snapshot restore
Not stopping the API server before restore | Data corruption | Stop the static pods first
Storing backups only on-cluster | Backup lost with the cluster | Off-cluster storage (S3, NFS, GCS)
Skipping snapshot verification | Corrupt backup discovered during a disaster | etcdutl snapshot status after every backup
Omitting --bump-revision / --mark-compacted | Stale watch caches, controller errors | Always include for Kubernetes restores
Starting one HA member before restoring all | Raft quorum not achieved | Stop all, restore all, start all
Using different snapshots on different nodes | Split-brain, inconsistent state | Same snapshot distributed to all nodes
Wrong file ownership after restore | etcd fails to start | Match /var/lib/etcd ownership to the user etcd runs as
Skipping kubelet and control plane restart | Stale in-memory state | Restart kubelet; delete scheduler/controller-manager pods if needed
Storing unencrypted snapshots | Every Secret exposed in plaintext | Encrypted storage + restricted access

When to escalate

If the restore procedure does not resolve the issue, collect the following before asking for help:

  • etcd version (etcdctl version)
  • Kubernetes version (kubectl version)
  • The exact error output from the restore command
  • etcd container logs (sudo crictl logs <etcd-container-id>)
  • API server logs (sudo crictl logs <apiserver-container-id>)
  • Whether the cluster uses encryption at rest and which KMS provider
  • Whether this is a single-node or multi-node cluster
  • The output of etcdctl member list and etcdctl endpoint status --write-out=table
  • Whether you used --bump-revision and --mark-compacted

