[Kubernetes] Wait… Why Does Kubernetes Still Ask Me to Create a PersistentVolume? - The Storage Confusion Most Kubernetes Learners Hit

Understanding Kubernetes Storage Management

Introduction

When I first learned Kubernetes storage, my mental model was very simple.

In real clusters, developers usually work with a StorageClass that Kubernetes administrators have already configured. So the workflow I saw everywhere looked like this:

flowchart TB
  A["User creates a PersistentVolumeClaim (PVC)"] --> B["Kubernetes calls the CSI provisioner"]
  B --> C["Storage system creates a new disk"]
  C --> D["Kubernetes creates a PersistentVolume (PV) automatically"]
  D --> E["PVC binds to the PV"]

In other words: PersistentVolumeClaim is enough.

For example, a developer might create a claim like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  storageClassName: fast-storage
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

In most production clusters, this is exactly what happens. You never see PV manifests. You never create them manually.

The CSI driver handles everything. So naturally I assumed:

“Kubernetes storage = just create PVC.”

But then I started studying CKAD materials. And suddenly I kept seeing PersistentVolume manifests like this:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /data

And that immediately triggered a question:

Why are we creating PVs manually?

If PVCs can already trigger PV creation automatically, why does Kubernetes even support this?

That question leads directly to the concept of static provisioning.

The Missing Piece: Not All Storage Can Be Created Automatically

Dynamic provisioning works great when Kubernetes can ask the storage system to create new volumes.

For example:

  • AWS EBS
  • GCP Persistent Disk
  • Azure Disk
  • Ceph
  • Modern CSI-based storage systems

In these systems, when a PVC appears, Kubernetes can ask the storage backend:

“Please create a new disk.”

But sometimes, the storage already exists. Examples include:

  • A shared NFS directory
  • A pre-existing dataset
  • A local disk already mounted on a node
  • Legacy infrastructure outside Kubernetes
  • Data that must not be recreated

In those cases, Kubernetes cannot create the storage. It can only reference existing storage. This is where static provisioning comes in.

Static Provisioning: Registering Existing Storage in Kubernetes

Imagine your company already has an NFS server. There is an important directory:

/exports/finance-data

It already contains:

  • Accounting archives
  • Compliance reports
  • Historical data

You want a Kubernetes Pod to mount that exact directory. Kubernetes cannot dynamically create it. Instead, the Kubernetes administrator registers that storage as a PV.

Here’s what that looks like:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: finance-nfs-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  nfs:
    server: 10.0.0.50
    path: /exports/finance-data

Important details:

  • The storage already exists.
  • Kubernetes is not creating anything.
  • The PV simply describes how to access the storage.

Now a developer can create a PVC to request access to this storage:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: finance-pvc
spec:
  storageClassName: ""
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 100Gi

After creating this PVC, Kubernetes will attempt to bind it to a matching PV.
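If the match succeeds, Kubernetes records the binding on the PVC object itself: spec.volumeName is filled in with the chosen PV, and the status reports the Bound phase. A trimmed sketch of what the bound claim would look like (for example via kubectl get pvc finance-pvc -o yaml, with values assumed from the example above):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: finance-pvc
spec:
  storageClassName: ""
  volumeName: finance-nfs-pv     # filled in by the control plane at bind time
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
status:
  phase: Bound                   # the claim is now bound to a PV
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 100Gi               # capacity of the bound PV, not just the request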

How Kubernetes Matches PVC With an Existing PV

When a PVC appears, Kubernetes looks for a compatible PV. Several conditions must be met for a successful match.

1. AccessMode Compatibility

If the PVC requests:

accessModes:
- ReadWriteMany

then the PV must support ReadWriteMany.

2. Capacity Requirement

If the PVC requests:

resources:
  requests:
    storage: 100Gi

then the PV must provide at least 100Gi.

3. StorageClass

The StorageClass must match.

If the PVC uses:

storageClassName: ""

then the PV must also have:

storageClassName: ""

This disables dynamic provisioning and ensures Kubernetes only considers existing PVs. Note that an empty string is not the same as omitting the field: if storageClassName is left unset, Kubernetes may fall back to the cluster's default StorageClass and provision a new volume dynamically.

When these conditions are satisfied, Kubernetes can successfully bind the PVC to the appropriate PV.

But What If Multiple PVs Match?

Another question arises here. Imagine the cluster contains two PVs:

  • pv-a: 100Gi, ReadWriteMany
  • pv-b: 100Gi, ReadWriteMany

Now, if a PVC requesting those properties appears, which PV will Kubernetes choose?

The answer is: Any matching PV.

Kubernetes does not guarantee deterministic selection. This means that matching access mode and capacity alone is not enough to determine which PV will be chosen.
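For concreteness, the two interchangeable PVs might look like this (the NFS server and export paths are made up for illustration):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-a
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteMany
  storageClassName: ""
  nfs:
    server: 10.0.0.51
    path: /exports/share-a
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-b
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteMany
  storageClassName: ""
  nfs:
    server: 10.0.0.51
    path: /exports/share-b

A PVC that only asks for ReadWriteMany and 100Gi could end up bound to either one, which is exactly why the next section matters.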

How to Ensure a PVC Binds to a Specific PV

If you need deterministic binding, Kubernetes provides two mechanisms.

1. Specifying the volumeName in the PersistentVolumeClaim

The first method involves specifying the volumeName in the PVC definition, like so:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: finance-pvc
spec:
  volumeName: finance-nfs-pv   # <-- This ensures binding to the specific PV
  storageClassName: ""
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 100Gi

Now Kubernetes will bind the PVC only to that PV.

2. Using Label Selectors in the PersistentVolumeClaim

Another method involves using labels as selectors. For example, here’s a PV with a specific label:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: finance-nfs-pv
  labels:
    department: finance
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteMany
  storageClassName: ""
  nfs:
    server: 10.0.0.50
    path: /exports/finance-data

Then, the corresponding PVC can use a selector to match that label:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: finance-pvc
spec:
  storageClassName: ""
  selector:
    matchLabels:
      department: finance
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 100Gi

With a label selector in place, the PVC will only bind to PVs that carry the specified labels.

Dynamic Provisioning: The Workflow Most Clusters Use

In most production clusters today, a StorageClass exists, which simplifies the storage provisioning process.

Here’s an example StorageClass configuration:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage
provisioner: ebs.csi.aws.com

Now, when a PVC appears, like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  storageClassName: fast-storage
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

The system automatically:

  1. Calls the CSI driver
  2. Creates the storage
  3. Creates a PV object
  4. Binds the PV to the PVC

No administrator involvement is required.
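To complete the picture, the PVC is then consumed by a Pod that references it by name. A minimal sketch (the Pod, container image, and mount path here are illustrative, not part of the original workflow):

apiVersion: v1
kind: Pod
metadata:
  name: data-app
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html   # where the volume appears inside the container
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-pvc                # the PVC defined above

This is the same pattern regardless of whether the PV behind the claim was provisioned dynamically or registered statically.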

The Mental Model That Finally Made Sense

After exploring both workflows, the model becomes simple.

A PVC is always the request for storage.

A PV is always the resource that provides storage.

Dynamic provisioning means the PV is created automatically.

Static provisioning means the PV represents storage that already exists.

Most clusters rely heavily on dynamic provisioning.

But static provisioning remains essential whenever Kubernetes must use storage that already exists outside the cluster.

Once you understand that distinction, Kubernetes storage becomes much easier to reason about.


Enjoyed this article? Support my work with a coffee ☕ on Ko-fi.