Kubernetes was initially designed to support primarily stateless applications using ephemeral storage. However, it is now possible to use Kubernetes to build and manage stateful applications using persistent storage.
Kubernetes offers the following components for persistent storage:
- Kubernetes Persistent Volumes (PVs): a PV is a storage resource made available to a Kubernetes cluster. A PV can be provisioned statically by a cluster administrator or dynamically by Kubernetes. Like a regular volume, a PV is implemented by a volume plugin, but its lifecycle is independent of any individual pod using the PV.
- PersistentVolumeClaim (PVC): a request for storage made by a user. A PVC specifies the desired size and access mode, and a control loop looks for a matching PV that can fulfill these requirements. If a match exists, the control loop binds the PV and PVC together, and the user can then mount the claim in a pod. If not, Kubernetes can dynamically provision a PV that meets the requirements, provided a suitable storage class is configured.
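As a minimal sketch of these two objects, the Python snippet below builds a statically provisioned PV and a PVC that could bind to it, then prints both as JSON, which kubectl accepts alongside YAML. The names `example-pv` and `example-pvc`, the `manual` class name, the 5Gi size, and the `hostPath` location are illustrative assumptions rather than values from this article, and `hostPath` is only suitable for single-node test clusters.

```python
import json

# Hypothetical names, sizes, and paths, chosen only for illustration.
pv = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "example-pv"},
    "spec": {
        "capacity": {"storage": "5Gi"},
        "accessModes": ["ReadWriteOnce"],
        "persistentVolumeReclaimPolicy": "Retain",
        "storageClassName": "manual",
        # hostPath is for single-node testing only.
        "hostPath": {"path": "/mnt/example-data"},
    },
}

# The claim asks for the same class, access mode, and size, so it can bind to the PV.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "example-pvc", "namespace": "default"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "manual",
        "resources": {"requests": {"storage": "5Gi"}},
    },
}

manifest = {"apiVersion": "v1", "kind": "List", "items": [pv, pvc]}
print(json.dumps(manifest, indent=2))
```

Saved to a file, the output could be applied with `kubectl apply -f <file>`. Creating the PV requires cluster-level permissions, so in practice an administrator would typically create it separately from the claim.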
Common Use Cases for Persistent Volumes
In the early days of containerization, containers were typically stateless. However, as Kubernetes architecture matured and container-based storage solutions were introduced, containers started to be used for stateful applications as well.
There are many benefits to running stateful applications in containers, including fast startup, high availability, and self-healing. It is also easier to store, maintain, and back up the data the application creates or uses. Because persistent storage keeps the data state consistent, you can use Kubernetes for complex applications and not only for 12-factor web applications.
The most common use case for Kubernetes persistent volumes is databases. Applications that use databases must have constant access to their data. In a Kubernetes environment, this can be achieved by storing the database's files on a PV and mounting that PV in the pod that runs the database. PVs can back common databases like MySQL, Cassandra, and Microsoft SQL Server.
A Typical Process for Running Kubernetes Persistent Volumes
Here is the general process for deploying a database backed by a persistent volume:
- Create pods to run the application, with proper configuration and environment variables
- Create a persistent volume to hold the database's data, or configure storage classes to allow the cluster to create PVs on demand
- Attach persistent volumes to pods via persistent volume claims
- Confirm that applications running in the pods can access the database
Initially, run the first pods manually and confirm that they connect to your persistent volumes correctly. Each new replica of the pod should mount a persistent volume for its database data. Once everything works, you can confidently scale your stateful pods to additional machines, ensuring that each of them receives a PV for stateful operations. If a pod that is managed by a controller fails, Kubernetes automatically runs a new one and attaches the PV.
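The attach step in this process can be sketched as a pod that mounts a claim. The snippet below is an illustration under stated assumptions: the pod name `example-db`, the claim `example-pvc`, and the Secret `example-db-secret` are hypothetical and assumed to exist already in the namespace. The `mysql:8.0` image and its `/var/lib/mysql` data directory are used only as a familiar example.

```python
import json

CLAIM_NAME = "example-pvc"  # hypothetical PVC, assumed to exist in the namespace

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-db", "labels": {"app": "example-db"}},
    "spec": {
        "containers": [
            {
                "name": "mysql",
                "image": "mysql:8.0",
                "env": [
                    {
                        # Read the password from a (hypothetical) Secret
                        # instead of hard-coding it in the manifest.
                        "name": "MYSQL_ROOT_PASSWORD",
                        "valueFrom": {
                            "secretKeyRef": {
                                "name": "example-db-secret",
                                "key": "root-password",
                            }
                        },
                    }
                ],
                # Mount the volume at the directory where MySQL keeps its data.
                "volumeMounts": [{"name": "data", "mountPath": "/var/lib/mysql"}],
            }
        ],
        # The pod refers to the claim by name, never to a PV directly.
        "volumes": [
            {"name": "data", "persistentVolumeClaim": {"claimName": CLAIM_NAME}}
        ],
    },
}

print(json.dumps(pod, indent=2))
```

Because the pod names only the claim, the same manifest works whether the underlying PV was created statically or provisioned on demand.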
6 Kubernetes Persistent Volume Tips
Here are key tips for configuring a PV, as recommended by the Kubernetes documentation:
Configuration Best Practices for PVs
1. Prefer dynamic over static provisioning
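Static provisioning can result in management overhead and inefficient scaling. You can avoid this by using dynamic provisioning, where the cluster creates PVs on demand from a storage class. You still need to define an appropriate reclaim policy in each storage class. With a Delete policy, for example, the underlying storage is removed when its claim is deleted, which minimizes storage costs once pods and their claims are gone.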
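A minimal sketch of dynamic provisioning, assuming a cluster with a CSI driver installed: the StorageClass below sets a reclaim policy, and the claim references the class by name. The provisioner `csi.example.com` is a placeholder and the names `fast-ssd-delete` and `orders-db-data` are hypothetical, so substitute the values that apply to your cluster.

```python
import json

# Placeholder provisioner: substitute the driver name your cluster actually uses.
PROVISIONER = "csi.example.com"

storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "fast-ssd-delete"},  # hypothetical, descriptive name
    "provisioner": PROVISIONER,
    # Delete removes the backing storage when the claim is deleted;
    # Retain would keep it for manual cleanup.
    "reclaimPolicy": "Delete",
    # Delay provisioning until a pod that uses the claim is scheduled.
    "volumeBindingMode": "WaitForFirstConsumer",
    "allowVolumeExpansion": True,
}

claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "orders-db-data"},  # hypothetical claim name
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": storage_class["metadata"]["name"],
        "resources": {"requests": {"storage": "20Gi"}},
    },
}

bundle = {"apiVersion": "v1", "kind": "List", "items": [storage_class, claim]}
print(json.dumps(bundle, indent=2))
```

Choosing `Delete` trades safety for cost: data is gone once the claim is removed, so `Retain` may be the better fit for data that must outlive the application.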
2. Plan to use the right size of nodes
Each node size supports a maximum number of attached disks. Different node sizes also offer different types and amounts of local storage. This limitation means you need to plan to deploy the right node size for your application’s expected demands.
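As a rough planning aid, the arithmetic can be sketched in a few lines of Python. The function name `min_nodes_for_volumes` and the figures used are hypothetical; the real per-node volume limit depends on your cloud provider, node size, and storage driver.

```python
import math


def min_nodes_for_volumes(total_volumes: int, max_volumes_per_node: int) -> int:
    """Return the fewest nodes that can attach `total_volumes` volumes.

    This only accounts for the attachment limit; CPU, memory, and
    scheduling constraints may require more nodes.
    """
    if total_volumes < 0:
        raise ValueError("total_volumes must not be negative")
    if max_volumes_per_node <= 0:
        raise ValueError("max_volumes_per_node must be positive")
    return math.ceil(total_volumes / max_volumes_per_node)


# Example with made-up numbers: 20 volumes, at most 8 attachable per node.
print(min_nodes_for_volumes(20, 8))  # 20 / 8 = 2.5, rounded up to 3
```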
3. Account for the independent lifecycle of PVs
PVs are independent of any particular container or pod, while PVCs belong to the user (and namespace) that requests them. Each of these components has its own lifecycle. Here are best practices that can help you ensure that PVs and PVCs are properly utilized:
- PVCs: always include them in the application's configuration, alongside the pods that use them.
- PVs: never include them in the application's configuration. Doing so tightly couples the application to a specific volume, and the user deploying it may not have permission to create PVs.
- StorageClass: let users specify a storage class for each PVC. A PVC that names no class falls back to the cluster's default StorageClass; if the cluster has no default and no matching PV exists, the claim stays unbound.
- Descriptive names: always give StorageClasses descriptive names.
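One way to keep the storage class configurable is to generate the claim from a small template. The helper below is a sketch, and `build_pvc` and its parameters are hypothetical names. It sets `storageClassName` only when the caller supplies one; leaving the field unset lets the cluster's default StorageClass apply, if one is configured.

```python
import json
from typing import Optional


def build_pvc(name: str, size: str, storage_class: Optional[str] = None) -> dict:
    """Build a PVC manifest, omitting storageClassName unless one is given."""
    spec = {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        # An explicit class was requested, so pin the claim to it.
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


# Uses the cluster default class (if any):
print(json.dumps(build_pvc("app-data", "10Gi"), indent=2))
# Pins the claim to a specific, descriptively named class:
print(json.dumps(build_pvc("app-data", "10Gi", "fast-ssd-delete"), indent=2))
```

If claims built this way stay unbound for a long time, that may indicate the cluster has no default class or no dynamic provisioning support.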
3 Security Practices for PVs
You should always harden the configuration of your storage system and Kubernetes cluster. Storage volumes may include sensitive information like credentials, private information, or trade secrets. Hardening helps ensure your volumes are visible or accessible only to the assigned pod.
Here are common security practices that can help you harden storage and cluster configurations:
1. Never allow privileged pods
A privileged pod can give a threat a path to the host and to storage that was never assigned to it. You must prevent unauthorized applications from running privileged pods. Always prefer standard, unprivileged containers that cannot mount arbitrary host storage, and apply this restriction to all types of users, root or otherwise. You should also use an admission control to prevent user applications from creating privileged containers. Pod security policies served this purpose in older clusters, but they were removed in Kubernetes 1.25 in favor of Pod Security Admission.
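Before manifests reach the cluster, a simple check can flag privileged containers. The function below is a sketch of such a check, and `privileged_containers` is a hypothetical helper name. It inspects only the `securityContext.privileged` field of containers and init containers in a pod manifest, so it complements rather than replaces cluster-side admission control.

```python
def privileged_containers(pod: dict) -> list:
    """Return the names of containers in a pod manifest that request privileged mode."""
    spec = pod.get("spec") or {}
    containers = list(spec.get("containers") or []) + list(
        spec.get("initContainers") or []
    )
    flagged = []
    for container in containers:
        security_context = container.get("securityContext") or {}
        if security_context.get("privileged") is True:
            flagged.append(container.get("name", "<unnamed>"))
    return flagged


# Hypothetical manifest used only to exercise the check.
example_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example"},
    "spec": {
        "containers": [
            {"name": "app", "image": "example/app:1.0"},
            {
                "name": "debug",
                "image": "example/debug:1.0",
                "securityContext": {"privileged": True},
            },
        ]
    },
}

print(privileged_containers(example_pod))  # ['debug']
```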
2. Limit application users
Always limit application users to specific namespaces and grant them no cluster-level permissions. Only cluster administrators should manage PVs: application users should not be able to create, assign, manage, or destroy PVs. Likewise, the details of PVs should be visible only to cluster administrators, not to the users whose claims are bound to them.
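With Kubernetes RBAC, this separation can be sketched as a namespaced Role that grants access to PVCs only. PVs are cluster-scoped, so a Role, which applies within a single namespace, cannot grant access to them. The namespace `team-a`, the role name `pvc-user`, and the user `jane` are hypothetical.

```python
import json

NAMESPACE = "team-a"  # hypothetical namespace for the application team

role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "pvc-user", "namespace": NAMESPACE},
    "rules": [
        {
            # Core API group; only claims are listed, not persistentvolumes.
            "apiGroups": [""],
            "resources": ["persistentvolumeclaims"],
            "verbs": ["get", "list", "watch", "create", "delete"],
        }
    ],
}

role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "pvc-user-binding", "namespace": NAMESPACE},
    "subjects": [
        {"kind": "User", "name": "jane", "apiGroup": "rbac.authorization.k8s.io"}
    ],
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "Role",
        "name": role["metadata"]["name"],
    },
}

rbac = {"apiVersion": "v1", "kind": "List", "items": [role, role_binding]}
print(json.dumps(rbac, indent=2))
```

The sketch assumes the user holds no other bindings; any ClusterRoleBinding that covers `persistentvolumes` would still expose PVs to that user.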
3. Use network policies
Network policies can help you prevent pods from accessing the storage network directly, so that a pod cannot gather information about the storage system or attempt to reach it. You should deny access to the storage network either with rules on the host firewall or with network policies applied on a per-namespace basis.
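A per-namespace policy of this kind might look like the sketch below. NetworkPolicy rules are allow-lists, so the policy permits egress everywhere except the storage network. The CIDR `10.20.0.0/16`, the namespace `team-a`, and the policy name are hypothetical, and the policy only takes effect if the cluster's network plugin enforces NetworkPolicy.

```python
import ipaddress
import json

STORAGE_CIDR = "10.20.0.0/16"  # hypothetical storage network range
ALLOWED_CIDR = "0.0.0.0/0"

# The excluded range must fall inside the allowed range for the rule to be valid.
assert ipaddress.ip_network(STORAGE_CIDR).subnet_of(
    ipaddress.ip_network(ALLOWED_CIDR)
)

policy = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "deny-storage-network", "namespace": "team-a"},
    "spec": {
        "podSelector": {},  # an empty selector matches every pod in the namespace
        "policyTypes": ["Egress"],
        "egress": [
            {
                "to": [
                    {"ipBlock": {"cidr": ALLOWED_CIDR, "except": [STORAGE_CIDR]}}
                ]
            }
        ],
    },
}

print(json.dumps(policy, indent=2))
```

Volumes are typically mounted by the node rather than over the pod network, so a rule like this is not expected to interfere with normal PVC mounts, but it is worth verifying against your storage driver.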
In this article I covered the basics of Kubernetes persistent storage and presented 6 best practices that can help you work with PVs more effectively and securely:
- Prefer dynamic over static provisioning of persistent volumes
- Plan node size and capacity to support your persistent volumes
- Take into account that PVs have an independent lifecycle
- Prevent security issues by disallowing privileged pods
- Limit application users to protect sensitive data
- Use network policies to prevent unauthorized access to PVs
I hope this will be useful as you build your first persistent applications in Kubernetes.