Kubernetes StorageClass and dynamic volume provisioning on AWS, GCP, and Azure

Dynamic volume provisioning lets Kubernetes create cloud storage on demand when a pod needs it. Instead of manually pre-creating disks, you define a StorageClass that tells the cluster which CSI driver to call and what parameters to pass. This guide covers the correct CSI driver, StorageClass configuration, and WaitForFirstConsumer binding for EKS (AWS), GKE (GCP), and AKS (Azure).

What you will have at the end

A production-ready StorageClass configured for your cloud provider, with the correct CSI driver installed, WaitForFirstConsumer binding enabled, volume expansion permitted, and a PVC that dynamically provisions a cloud disk when a pod requests it.

Prerequisites

  • A running Kubernetes cluster on one of the three major clouds: Amazon EKS, Google GKE, or Azure AKS
  • kubectl configured and authenticated against the cluster
  • For EKS: aws CLI and eksctl installed; IAM permissions to create roles and install add-ons
  • For GKE: gcloud CLI authenticated
  • For AKS: az CLI authenticated
  • Familiarity with how PVs, PVCs, and StorageClasses relate to each other. A StorageClass is the provisioning template; a PVC is the request; a PV is the actual volume that gets created. Dynamic provisioning connects them automatically.

StorageClass anatomy

A StorageClass has six fields that matter for day-to-day operations:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # mark as cluster default
provisioner: <csi-driver-name>              # which driver creates the volume
reclaimPolicy: Delete                       # Delete or Retain
allowVolumeExpansion: true                  # grow PVCs after creation (shrink is never supported)
volumeBindingMode: WaitForFirstConsumer     # delay provisioning until a pod is scheduled
parameters:                                 # driver-specific: disk type, IOPS, encryption
  type: gp3

The provisioner field determines everything. Each cloud has its own CSI driver with a specific driver name. Use the wrong name and the PVC stays Pending forever.

| Cloud | CSI driver name | What it provisions |
| --- | --- | --- |
| AWS EKS | ebs.csi.aws.com | EBS volumes (gp2, gp3, io1, io2) |
| AWS EKS Auto Mode | ebs.csi.eks.amazonaws.com | EBS volumes (managed by EKS Auto Mode) |
| GCP GKE | pd.csi.storage.gke.io | Persistent Disks (pd-balanced, pd-ssd, Hyperdisk) |
| Azure AKS | disk.csi.azure.com | Azure Managed Disks (Standard SSD, Premium SSD, Ultra) |

The reclaimPolicy deserves a careful decision. Delete (the default) destroys the underlying cloud disk when the PVC is deleted. For stateful production workloads like databases, set it to Retain. You can patch an existing PV's reclaim policy after creation, but the StorageClass sets the default for newly provisioned volumes.
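
To flip an already-provisioned PV to Retain without touching the StorageClass (get the PV name from kubectl get pv):

kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'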

AWS EKS: install the EBS CSI driver and create a gp3 StorageClass

EKS does not include the EBS CSI driver by default. Without it, PVCs referencing ebs.csi.aws.com fail with errors like "failed to provision volume with StorageClass". On EKS 1.30 and later, no StorageClass is marked as the cluster default either, so you must configure both the driver and the StorageClass.

Step 1: create the IAM role

The EBS CSI controller needs AWS permissions to create, attach, and delete EBS volumes. Create a service account role with the AmazonEBSCSIDriverPolicy managed policy:

eksctl create iamserviceaccount \
  --name ebs-csi-controller-sa \
  --namespace kube-system \
  --cluster production-cluster \
  --role-name AmazonEKS_EBS_CSI_DriverRole \
  --role-only \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --approve

Both EKS Pod Identities and IRSA (IAM Roles for Service Accounts) work. Pod Identities is the recommended auth method for new clusters.
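
If you use Pod Identities instead, associate the role with the service account directly. Note that the role's trust policy must then allow pods.eks.amazonaws.com rather than the cluster's OIDC provider; the account ID and names below follow the example above:

aws eks create-pod-identity-association \
  --cluster-name production-cluster \
  --namespace kube-system \
  --service-account ebs-csi-controller-sa \
  --role-arn arn:aws:iam::111122223333:role/AmazonEKS_EBS_CSI_DriverRole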

Step 2: install the EBS CSI add-on

aws eks create-addon \
  --cluster-name production-cluster \
  --addon-name aws-ebs-csi-driver \
  --service-account-role-arn arn:aws:iam::111122223333:role/AmazonEKS_EBS_CSI_DriverRole

Wait for the add-on status to reach ACTIVE:

aws eks describe-addon --cluster-name production-cluster --addon-name aws-ebs-csi-driver \
  --query 'addon.status' --output text

Expected output: ACTIVE
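
You can also confirm the driver pods are running:

kubectl get pods -n kube-system | grep ebs-csi

Expect ebs-csi-controller and ebs-csi-node pods in the Running state.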

EBS volumes require EC2 nodes. Fargate pods cannot mount EBS volumes.

Step 3: create the StorageClass

gp3 is the correct choice for new deployments: 20% cheaper per GB than gp2, with a baseline of 3,000 IOPS and 125 MiB/s throughput at any volume size. gp2 ties IOPS to volume size (3 IOPS/GiB), so small volumes get poor performance.

# storageclass-ebs-gp3.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete                       # use Retain for production databases
allowVolumeExpansion: true
parameters:
  type: gp3                                 # gp3 baseline: 3,000 IOPS, 125 MiB/s
  iops: "3000"
  throughput: "125"                          # MiB/s; scale up to 1,000 for gp3
  encrypted: "true"                          # EBS encryption at rest
  csi.storage.k8s.io/fstype: ext4

Apply it:

kubectl apply -f storageclass-ebs-gp3.yaml

EKS Auto Mode note: If your cluster runs EKS Auto Mode, the standard EBS CSI add-on is incompatible. EKS Auto Mode uses its own provisioner: ebs.csi.eks.amazonaws.com. Replace the provisioner field accordingly. No manual add-on installation is needed in Auto Mode.
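
A minimal Auto Mode class looks like the gp3 class above with the provisioner swapped. Auto Mode supports a subset of parameters; the two shown here are documented, but verify anything beyond them against the EKS Auto Mode docs:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3-auto
provisioner: ebs.csi.eks.amazonaws.com   # EKS Auto Mode's own provisioner
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  type: gp3
  encrypted: "true"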

Step 4: create a PVC and a test pod

# test-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-test
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-gp3
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: ebs-test-pod
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "echo 'volume works' > /data/test.txt && sleep 3600"]
      volumeMounts:
        - mountPath: /data
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: ebs-test

Apply the manifest:

kubectl apply -f test-pvc.yaml

The PVC stays in Pending until the pod is scheduled (that is normal with WaitForFirstConsumer). Once the pod runs:

kubectl get pvc ebs-test

Expected output:

NAME       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ebs-test   Bound    pvc-a1b2c3d4-5678-90ab-cdef-111122223333   5Gi        RWO            ebs-gp3        45s
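
To confirm the mount actually works, read back the file the container wrote:

kubectl exec ebs-test-pod -- cat /data/test.txt

Expected output: volume works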

GCP GKE: use the Persistent Disk CSI driver

GKE Autopilot clusters ship with the PD CSI driver enabled and two default StorageClasses: standard-rwo (pd-balanced) and premium-rwo (pd-ssd), both using WaitForFirstConsumer. On GKE Standard clusters, verify the CSI driver is enabled; older clusters may still use the in-tree kubernetes.io/gce-pd provisioner.

The provisioner name is pd.csi.storage.gke.io.
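
On a Standard cluster, you can check whether the driver is enabled, and enable it if not, with gcloud (cluster name and location below are placeholders):

gcloud container clusters describe my-cluster --location europe-west4 \
  --format="value(addonsConfig.gcePersistentDiskCsiDriverConfig.enabled)"

gcloud container clusters update my-cluster --location europe-west4 \
  --update-addons=GcePersistentDiskCsiDriver=ENABLED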

Create a custom StorageClass

If the built-in classes do not fit (you need pd-ssd as default, or Hyperdisk, or regional PDs), create a custom StorageClass:

# storageclass-gke-ssd.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gke-ssd
provisioner: pd.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  type: pd-ssd                              # options: pd-balanced, pd-standard, pd-ssd, pd-extreme

For regional persistent disks that replicate data across two zones, add replication-type and constrain the zones with allowedTopologies:

# storageclass-gke-regional.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gke-regional
provisioner: pd.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  type: pd-balanced
  replication-type: regional-pd
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.kubernetes.io/zone
        values:
          - europe-west4-a
          - europe-west4-b

GKE also supports Hyperdisk types (hyperdisk-balanced, hyperdisk-throughput, hyperdisk-extreme, hyperdisk-ml) through the same pd.csi.storage.gke.io provisioner. Availability is region-dependent.
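
A Hyperdisk Balanced class with explicitly provisioned performance might look like this; the provisioned-iops-on-create and provisioned-throughput-on-create values are illustrative, so check the GKE documentation for the ranges available in your region:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hyperdisk-balanced
provisioner: pd.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  type: hyperdisk-balanced
  provisioned-throughput-on-create: "250Mi"   # illustrative value
  provisioned-iops-on-create: "7000"          # illustrative value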

Azure AKS: use the built-in disk CSI driver

AKS installs the Azure Disk CSI driver (disk.csi.azure.com) and several StorageClasses automatically. The default StorageClass (managed-csi) uses StandardSSD_LRS with WaitForFirstConsumer.

For multi-zone AKS clusters deployed on Kubernetes 1.29+, the built-in StorageClasses automatically use Zone-Redundant Storage (StandardSSD_ZRS, Premium_ZRS). ZRS provides cross-zone replication, which improves resilience but increases cost.
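
To see exactly which SKU and binding mode a built-in class uses on your cluster, inspect it directly:

kubectl describe storageclass managed-csi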

Create a custom StorageClass

To control the SKU, caching, or to pin LRS for cost optimization:

# storageclass-aks-premium.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: premium-lrs
provisioner: disk.csi.azure.com
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  skuName: Premium_LRS                      # options: Standard_LRS, Premium_LRS, StandardSSD_LRS,
                                            #          PremiumV2_LRS, UltraSSD_LRS, *_ZRS variants
  cachingMode: ReadOnly                      # None, ReadOnly, ReadWrite
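
Apply it:

kubectl apply -f storageclass-aks-premium.yaml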

Why WaitForFirstConsumer matters even on single-zone clusters

A common misconception is that WaitForFirstConsumer only matters for multi-zone clusters. It does not. There are three reasons to use it everywhere.

Zone safety for future expansion. A single-zone cluster that grows into a second zone will break every Immediate-provisioned PVC. The volumes already exist in zone A; the scheduler might place pods on nodes in zone B. The result is volume node affinity conflict errors and pods stuck in Pending. With WaitForFirstConsumer, volumes are provisioned in the pod's zone from the start.

No orphaned volumes. Immediate provisions a disk the moment the PVC is created, even if no pod ever uses it. Orphaned cloud disks cost money. WaitForFirstConsumer only provisions when a pod actually needs the volume.

It is the default everywhere. All modern managed-cloud StorageClasses use it: EKS 1.30+, AKS managed-csi, GKE standard-rwo. The official aws-ebs-csi-driver example StorageClass uses WaitForFirstConsumer unconditionally.

One operational side effect: a PVC with WaitForFirstConsumer stays in Pending until a pod references it. This is expected behavior, not a provisioning failure. Do not set spec.nodeName directly on the pod when using WaitForFirstConsumer; this bypasses the scheduler and leaves the PVC permanently stuck. Use nodeSelector or node affinity instead.
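
If you need to pin a pod to specific nodes while keeping the scheduler (and therefore volume binding) in the loop, use a nodeSelector. A sketch reusing the ebs-test PVC from earlier; the zone label value is illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: pinned-test-pod
spec:
  nodeSelector:
    topology.kubernetes.io/zone: us-east-1a   # scheduler still runs, so binding works
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      volumeMounts:
        - mountPath: /data
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: ebs-test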

ReadWriteMany and NFS: not automatic

Block storage (EBS, Azure Disk, GCP PD) supports ReadWriteOnce only: the volume attaches to a single node at a time. If you need multiple pods on different nodes writing to the same volume (ReadWriteMany), you need a file-based storage backend.

Having an NFS server on your network is not enough. Kubernetes does not dynamically provision NFS volumes without an explicit CSI driver. The NFS CSI driver (nfs.csi.k8s.io) must be installed separately, typically via Helm.
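
A typical install, assuming the Helm repository layout documented in the csi-driver-nfs project README:

helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs --namespace kube-system

A matching StorageClass then points at your server and export (both values below are placeholders):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs
provisioner: nfs.csi.k8s.io
parameters:
  server: nfs.example.internal              # placeholder NFS server
  share: /exports                           # placeholder exported path
reclaimPolicy: Delete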

Cloud-managed file services also require their own drivers:

| Service | CSI driver | Pre-installed? |
| --- | --- | --- |
| AWS EFS | efs.csi.aws.com | No, requires aws-efs-csi-driver add-on |
| Azure Files | file.csi.azure.com | Yes, pre-installed on AKS |
| GCP Filestore | filestore.csi.storage.gke.io | No, requires explicit enablement |

Azure Files is the only cloud file service with a pre-installed CSI driver. AKS ships azurefile-csi and azurefile-csi-premium StorageClasses out of the box.
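
On AKS, requesting shared storage is then just an access mode and class choice; a minimal sketch:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany                         # multiple nodes may mount read-write
  storageClassName: azurefile-csi
  resources:
    requests:
      storage: 10Gi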

Verify the result

After applying your StorageClass, confirm it exists and check the default annotation:

kubectl get storageclass

Expected output (EKS example):

NAME              PROVISIONER       RECLAIMPOLICY   VOLUMEBINDINGMODE       ALLOWVOLUMEEXPANSION   AGE
ebs-gp3 (default) ebs.csi.aws.com   Delete          WaitForFirstConsumer    true                   2m

Create a PVC and a pod (use the test manifests from the EKS section above, adjusting storageClassName). Confirm the PVC reaches Bound and the pod starts:

kubectl get pvc
kubectl get pod
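
Because the classes above set allowVolumeExpansion: true, you can also grow a bound PVC in place by raising its request (the target size here is arbitrary):

kubectl patch pvc ebs-test -p '{"spec":{"resources":{"requests":{"storage":"10Gi"}}}}'

Most CSI drivers resize online; kubectl get pvc shows the new capacity once the expansion completes.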

If the PVC stays in Pending after the pod is scheduled, check events:

kubectl describe pvc <pvc-name>

The events section tells you exactly what went wrong: missing CSI driver, IAM permission error, or zone conflict.

Common troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| PVC stuck in Pending, no pod exists | WaitForFirstConsumer working as designed | Create a pod that references the PVC |
| PVC stuck in Pending with pod running | spec.nodeName set directly on the pod | Use nodeSelector or node affinity instead |
| UnauthorizedOperation on volume creation | CSI driver lacks IAM/RBAC permissions | Attach AmazonEBSCSIDriverPolicy (EKS) or verify workload identity (GKE/AKS) |
| volume node affinity conflict | Volume provisioned in wrong zone | Switch StorageClass to WaitForFirstConsumer |
| Multi-Attach error | Block storage used with ReadWriteMany | Block storage is RWO only; use EFS, Azure Files, or NFS for RWX |
| EBS CSI error on EKS Auto Mode | Using ebs.csi.aws.com provisioner | Switch to ebs.csi.eks.amazonaws.com |
| PVC created, volume not cleaned up on delete | reclaimPolicy: Retain in effect | Manual PV and cloud disk cleanup required |
