How PVs, PVCs, and StorageClasses fit together
Kubernetes storage has three actors, each with a distinct role.
A PersistentVolume (PV) is a cluster-level resource that represents a piece of actual storage: a cloud disk, an NFS export, a local SSD. It exists independently of any pod. An administrator creates it, or a provisioner creates it automatically.
A PersistentVolumeClaim (PVC) is a namespace-scoped request for storage. A developer writes a PVC saying "I need 10Gi of ReadWriteOnce storage." Kubernetes finds a PV that satisfies the claim and binds them 1-to-1. The binding is exclusive: one PVC gets exactly one PV, and that PV cannot serve another claim while bound.
A StorageClass defines how storage is provisioned. It names a provisioner (the CSI driver or in-tree plugin that talks to the storage backend), sets parameters like disk type or IOPS tier, and declares policies for reclaim and volume expansion. When a PVC references a StorageClass, the provisioner creates a PV on demand. No pre-created PV needed.
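A minimal StorageClass manifest makes the three roles concrete. This is a sketch, assuming the AWS EBS CSI driver; the class name, disk type, and policy values are illustrative, not defaults:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                       # hypothetical class name
provisioner: ebs.csi.aws.com           # CSI driver that talks to the backend
parameters:
  type: gp3                            # backend-specific parameter (EBS disk type)
reclaimPolicy: Retain                  # what happens to the PV when its PVC is deleted
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true             # permits growing PVCs later
```

A PVC that names fast-ssd in its storageClassName triggers this provisioner; no PV has to exist in advance.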
```yaml
# A PVC requesting 10Gi from the "standard" StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-postgres-0
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard  # references a StorageClass object
  resources:
    requests:
      storage: 10Gi
```
In practice, most managed clusters (EKS, GKE, AKS) ship with a default StorageClass already configured. When you create a PVC without specifying storageClassName, it uses the default. The provisioner creates the PV behind the scenes. This is dynamic provisioning and it is the common path.
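To check which class is the default on a given cluster, list the StorageClasses; the default is flagged in the output (the names and provisioner shown here are illustrative):

```bash
kubectl get storageclass
# NAME                 PROVISIONER       RECLAIMPOLICY   VOLUMEBINDINGMODE
# standard (default)   ebs.csi.aws.com   Delete          WaitForFirstConsumer
```

The "(default)" marker corresponds to the storageclass.kubernetes.io/is-default-class annotation on the StorageClass object.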
The PV lifecycle: Available, Bound, Released, Failed
A PV moves through four phases:
Available. The PV exists and is not bound to any claim. It is free for a matching PVC to claim.
Bound. A PVC has claimed the PV. Data can be written and read. The PV stays bound for as long as the PVC exists.
Released. The PVC has been deleted, but the PV still exists and holds data. The PV is not available for a new claim yet. What happens next depends on the reclaim policy.
Failed. Automatic reclamation failed. Manual intervention is needed.
The trap is in Released. A Released PV is stuck. Even if you create a new PVC with an identical spec, Kubernetes will not rebind it to the Released PV. The old claimRef still points to the deleted PVC. An administrator must either remove the claimRef field manually to return the PV to Available, or delete the PV entirely and let dynamic provisioning create a fresh one.
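One common way to clear the stale claimRef is a patch that nulls the field, returning the PV to Available (the PV name here reuses the article's example and is a placeholder; verify the volume's data before reusing it):

```bash
# Return a Released PV to Available by clearing its stale claimRef
kubectl patch pv pvc-7a8b3c4d --type merge -p '{"spec":{"claimRef":null}}'
```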
Static provisioning vs. dynamic provisioning
Static provisioning is the manual path. An administrator pre-creates PV objects that map to existing storage (an NFS share, a pre-formatted EBS volume). PVCs match against these PVs by storageClassName, access mode, and capacity. The PV's capacity must be >= the PVC's request. If no PV matches, the PVC stays Pending indefinitely.
```yaml
# Static PV pointing to an existing NFS share
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-archive
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  storageClassName: manual  # just a label for matching
  nfs:
    server: nfs-prod.internal
    path: /exports/archive
```
The storageClassName: manual here is not a real StorageClass object. It is a label that both the PV and PVC share so that Kubernetes matches them. No provisioner runs.
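A claim that binds to the nfs-archive PV above would repeat the same label and a compatible access mode and size. A sketch (the PVC name is hypothetical):

```yaml
# PVC that matches the static nfs-archive PV by label, mode, and capacity
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: archive-reader
  namespace: production
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: manual  # same label as the PV; no provisioner runs
  resources:
    requests:
      storage: 100Gi        # must be <= the PV's capacity
```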
Dynamic provisioning is the automated path. The PVC references a StorageClass; the StorageClass's provisioner creates the PV on demand. This is how most production clusters operate. The advantage is obvious: no administrator pre-creating disks. The risk is less obvious: the default reclaim policy for dynamically provisioned volumes is Delete, meaning data is destroyed when the PVC is deleted.
WaitForFirstConsumer vs. Immediate binding
StorageClasses have a volumeBindingMode that controls when the PV is created.
Immediate (the default) creates the PV as soon as the PVC appears. The problem: for topology-constrained storage (EBS volumes, GCE Persistent Disks), the PV might be created in the wrong availability zone. When the scheduler later places the pod on a node in a different zone, the volume cannot attach, and the pod is stuck.
WaitForFirstConsumer delays PV creation until a pod referencing the PVC is scheduled. The provisioner sees which node the pod landed on and creates the volume in the correct zone. For any topology-aware storage backend, this is the correct setting.
Access modes: node vs. pod
PVs declare which access modes they support. PVCs request one. Matching only succeeds if the PV supports the requested mode.
| Mode | Short name | Meaning |
|---|---|---|
| ReadWriteOnce | RWO | Mounted as read-write by a single node |
| ReadOnlyMany | ROX | Mounted as read-only by many nodes |
| ReadWriteMany | RWX | Mounted as read-write by many nodes |
| ReadWriteOncePod | RWOP | Mounted as read-write by a single pod (GA since v1.29) |
The most common misconception: ReadWriteOnce does not mean one pod. It means one node. Multiple pods on the same node can read from and write to the same RWO volume simultaneously. The Kubernetes blog post introducing RWOP states this explicitly: "The ReadWriteOnce access mode restricts volume access to a single node, which means it is possible for multiple pods on the same node to read from and write to the same volume."
This matters for databases, queues, and any workload that expects exactly one writer. If two replicas land on the same node and share an RWO volume, both can write. Data corruption follows.
ReadWriteOncePod was added to fix this. RWOP enforces single-pod access at the scheduler level. If a second pod tries to mount the same RWOP volume, the scheduler rejects it. Current Kubernetes documentation recommends RWOP over RWO for single-writer production workloads. RWOP requires a CSI driver; in-tree volume plugins do not support it.
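Because access modes on an existing PVC are immutable, adopting RWOP means creating a new claim. The only change from the earlier PVC example is the access mode (assuming the backing CSI driver supports RWOP):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-postgres-0
  namespace: production
spec:
  accessModes:
    - ReadWriteOncePod  # the scheduler rejects any second pod mounting this volume
  storageClassName: standard
  resources:
    requests:
      storage: 10Gi
```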
Reclaim policies: Retain, Delete, and the deprecated Recycle
The reclaim policy determines what happens to a PV when its PVC is deleted.
Delete removes both the PV object and the underlying storage asset (the cloud disk, the dynamically provisioned volume). This is the default for dynamically provisioned volumes. The data is gone.
Retain keeps the PV and its data. The PV moves to Released. An administrator must manually decide what to do: recover data, clean up, or delete. This is the safe default for anything that holds data you care about.
Recycle (deprecated) ran rm -rf /thevolume/* and made the PV available again. It was unreliable and insecure. Use dynamic provisioning instead.
The dangerous default: if you create a PVC against a StorageClass and accept defaults, the reclaim policy is Delete. Delete the PVC, lose the data. For production databases and stateful workloads, patch the PV's reclaim policy to Retain before you need it:
```bash
kubectl patch pv pvc-7a8b3c4d -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```
StatefulSet PVC retention policy
StatefulSets have historically preserved PVCs even when the StatefulSet was deleted or scaled down. Since Kubernetes 1.23 (alpha), 1.27 (beta), and 1.32 (GA), spec.persistentVolumeClaimRetentionPolicy gives explicit control:
```yaml
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain  # PVCs survive StatefulSet deletion
    whenScaled: Delete   # PVCs are removed when scaling down
```
Both fields default to Retain, preserving the historical behavior.
Volume expansion
PVC expansion has been stable since Kubernetes 1.24. To grow a PVC, edit spec.resources.requests.storage to a larger value:
```bash
kubectl patch pvc data-postgres-0 -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'
```
Three conditions must be true for this to work:
- The StorageClass must have allowVolumeExpansion: true. Without it, the API rejects the request.
- The CSI driver must support the EXPAND_VOLUME capability. Check your driver's documentation.
- You are growing, not shrinking. Shrinking a PVC is universally unsupported. The API rejects it.
Most CSI drivers handle online expansion: the filesystem grows while the pod keeps running. Some older drivers require the pod to be deleted and recreated for the resize to take effect. If a resize fails (backend quota exhausted, driver error), Kubernetes 1.23+ allows user-initiated retry by patching the PVC again.
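To watch an in-flight resize, inspect the PVC's status. The conditions field reports progress (condition types such as Resizing and FileSystemResizePending), and status.capacity shows the size the pod can actually use, which lags spec.resources.requests until the resize completes:

```bash
# Inspect resize progress on the PVC from the example above
kubectl get pvc data-postgres-0 -o jsonpath='{.status.conditions}'

# The capacity usable by the pod, as opposed to the requested size
kubectl get pvc data-postgres-0 -o jsonpath='{.status.capacity.storage}'
```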
What PersistentVolumes are not
PVs are not backups. A PV keeps data across pod restarts, but it does not protect against accidental deletion, corruption, or availability zone failure. Cloud disk snapshots, Velero, or application-level backups are still needed.
PVs are not shared filesystems by default. Most cloud block storage (EBS, Persistent Disk, Azure Disk) supports only RWO. For multi-pod read-write access across nodes, you need a storage backend that supports RWX: NFS, CephFS, Amazon EFS, Azure Files, or similar.
PVCs are not portable across clusters. A PVC references a PV in one cluster. Migrating stateful workloads between clusters requires data migration tools (Velero, storage-level replication, or rsync).
StorageClasses are not storage pools. A StorageClass is a template for provisioning, not a pre-allocated pool of capacity. Capacity limits come from the backend (cloud account quotas, physical disk size), not from the StorageClass object.
Where to go next
- To understand how resource requests and limits affect StatefulSet pod scheduling (and why a database pod with an attached PV might stay Pending), see Kubernetes resource requests and limits
- For networking concepts that complement storage when designing stateful services, see Kubernetes Services explained