Kubernetes CSI in Action: Explained with Features and Use Cases

Kubernetes volume plugins have long been the way for third-party storage providers to support block or file storage systems by extending the Kubernetes volume interface, and they are “in-tree” in nature.

In this post, we will dig into the Kubernetes Container Storage Interface (CSI). We will install the CSI Hostpath driver locally just to get an idea of how it works, understand its components, and see what really happens during the PVC/PV/Pod lifecycle. We will also look at some cool features that will help you implement multiple persistent-volume-related use cases.

Introduction

The Container Storage Interface (CSI) is a standard for exposing arbitrary block and file storage systems to containerized workloads on Container Orchestration Systems (COs) like Kubernetes, Mesos, Docker, and Cloud Foundry. Using CSI, third-party storage providers can write and deploy plugins exposing new storage systems in Kubernetes without ever having to touch the core Kubernetes code. This means that a single CSI plugin implemented by a storage vendor is expected to work with all COs.

One of the key differentiators for Kubernetes has been its powerful volume plugin system, which enables many different types of storage systems to:

  1. Automatically create storage when required.

  2. Make storage available to containers wherever they’re scheduled.

  3. Automatically delete the storage when no longer needed.

Adding support for new storage systems to Kubernetes, however, has been challenging.

The implementation of the Container Storage Interface (CSI) makes installing new volume plugins as easy as deploying a pod. With CSI having reached GA in Kubernetes 1.13, users can depend on the feature and its API without fear of backwards-incompatible changes causing regressions in the future.


Note: We will discuss only dynamic provisioning in this article. Pre-provisioned volumes and Flex volumes are out of scope.

Why CSI?

Kubernetes volume plugins are currently “in-tree”, meaning they’re linked, compiled, built, and shipped with the core Kubernetes binaries. Adding support for a new storage system to Kubernetes (a volume plugin) requires checking code into the core Kubernetes repository.

The existing Flex Volume plugin attempted to address this pain by exposing an exec-based API for external volume plugins. Although it enables third-party storage vendors to write drivers out-of-tree, deploying the third-party driver files requires access to the root filesystem of node and master machines.

In addition to being difficult to deploy, Flex did not address the pain of plugin dependencies: volume plugins tend to have many external requirements (on mount and filesystem tools, for example). These dependencies are assumed to be available on the underlying host OS, which is often not the case (and installing them requires access to the root filesystem of the node machines).

CSI addresses all of these issues by enabling storage plugins to be developed out-of-tree, containerized, deployed via standard Kubernetes primitives, and consumed through the Kubernetes storage primitives users know and love (PersistentVolumeClaims, PersistentVolumes, StorageClasses).

The goal of CSI is to establish a standardized mechanism for Container Orchestration Systems (COs) to expose arbitrary storage systems to their containerized workloads.

Deploy the Driver Plugin

The driver plugin comprises the CSI sidecars and the vendor-shipped implementation of the CSI services, which we will talk about shortly. Kubernetes users interested in how to deploy or manage an existing CSI driver on Kubernetes should look at the documentation provided by the author of that CSI driver.

We will deploy the hostpath driver plugin for our blog.

Pre-requisites:

  • A Kubernetes cluster (not Minikube or MicroK8s): for managed Kubernetes like AWS, tag the instances with KubernetesCluster=<cluster-name>, otherwise kubelet fails on nodes started with --cloud-provider=aws (or --cloud-provider=kubernetes for Kubernetes installed via kubeadm)

  • Kubernetes version 1.13 or later

  • Access to a terminal with kubectl installed

Deploying HostPath Driver Plugin:

  1. Clone the HostPath driver plugin repo locally, or just copy the deploy and examples folders from the root path
     

  2. Check out the master branch (if not already on it)

  3. The hostpath driver comprises manifests for the following sidecars (in ./deploy/master/hostpath/):
    - csi-hostpath-attacher.yaml
    - csi-hostpath-provisioner.yaml
    - csi-hostpath-snapshotter.yaml
    - csi-hostpath-plugin.yaml:
    This one deploys 2 containers, a node-driver-registrar and the hostpath-plugin

  4. The driver also includes a separate Service for each component, and the deployment files use StatefulSets for the containers

  5. It also deploys ClusterRoleBindings and RBAC rules for each component, maintained in a separate repo

  6. Each component (sidecar) is managed in a separate repository

  7. The /deploy/util/ directory contains a shell script which handles the complete deployment process

  8. After copying the folder or cloning the repo, just run:

$ deploy/kubernetes-latest/deploy-hostpath.sh

  9. The script applies each of the manifests above; the output shows the objects being created

  10. Once the driver is deployed, we can check that all the components are up:
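
For example (exact pod names depend on the driver version; the ones below follow the manifests listed earlier):

```sh
kubectl get pods
# expect csi-hostpath-attacher-0, csi-hostpath-provisioner-0,
# csi-hostpath-snapshotter-0 and csi-hostpathplugin-0 in Running state
```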

CSI API-Resources:

There are resources from the core API group and storage.k8s.io, as well as resources created by the CRDs under snapshot.storage.k8s.io and csi.storage.k8s.io.
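
We can list them with something like:

```sh
kubectl api-resources | grep storage.k8s.io
kubectl get crd
```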

CSI Sidecars

Kubernetes CSI Sidecar Containers are a set of standard containers that aim to simplify the development and deployment of CSI Drivers on Kubernetes. These containers contain common logic to watch the Kubernetes API, trigger appropriate operations against the “CSI volume driver” container, and update the Kubernetes API as appropriate.

The containers are intended to be bundled with third-party CSI driver containers and deployed together as pods.

The Kubernetes development team maintains the following Kubernetes CSI Sidecar Containers:


Note: Only one container, “csi-hostpath-plugin”, contains Hostpath-specific code. All the others are common CSI sidecar containers from quay.io/k8scsi. These containers communicate with the CSI driver over gRPC through a socket in the shared socket-dir emptyDir volume.
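
A rough sketch of how this wiring looks in a plugin pod spec (illustrative only; the actual csi-hostpath-plugin.yaml in the repo may differ, and many deployments use a hostPath directory under /var/lib/kubelet/plugins/ instead of an emptyDir so that kubelet can also reach the socket):

```yaml
kind: Pod
apiVersion: v1
metadata:
  name: csi-hostpathplugin          # illustrative name
spec:
  containers:
    - name: node-driver-registrar
      image: quay.io/k8scsi/csi-node-driver-registrar
      args: ["--csi-address=/csi/csi.sock"]
      volumeMounts:
        - name: socket-dir
          mountPath: /csi
    - name: hostpath
      image: quay.io/k8scsi/hostpathplugin
      args: ["--endpoint=unix:///csi/csi.sock"]
      volumeMounts:
        - name: socket-dir
          mountPath: /csi
  volumes:
    - name: socket-dir              # shared volume holding the gRPC socket
      emptyDir: {}
```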

  1. External Provisioner:
    It is a sidecar container that watches Kubernetes PersistentVolumeClaim objects and triggers CSI CreateVolume and DeleteVolume operations against a driver endpoint.
    The CSI external-provisioner also supports the Snapshot DataSource. If a snapshot CRD is specified as a data source on a PVC object, the sidecar container fetches the information about the snapshot from the SnapshotContent object and populates the data source field, indicating to the storage system that the new volume should be populated using the specified snapshot.

  2. External Attacher:
    It is a sidecar container that watches Kubernetes VolumeAttachment objects and triggers CSI ControllerPublish and ControllerUnpublish operations against a driver endpoint.

  3. Node-Driver Registrar:
    It is a sidecar container that registers the CSI driver with kubelet and adds the driver’s custom NodeId to a label on the Kubernetes Node API object. It does this by communicating with the Identity service on the CSI driver and calling the CSI GetNodeId operation. It registers the CSI driver with kubelet using the kubelet plugin registration mechanism.

  4. External Snapshotter:
    It is a sidecar container that watches the Kubernetes API server for VolumeSnapshot and VolumeSnapshotContent CRD objects. The creation of a new VolumeSnapshot object referencing a SnapshotClass CRD object corresponding to this driver causes the sidecar container to provision a new snapshot. When a new snapshot is successfully provisioned, the sidecar container creates a Kubernetes VolumeSnapshotContent object to represent it.

  5. Cluster-driver Registrar:
    The CSI cluster-driver-registrar is a sidecar container that registers a CSI driver with a Kubernetes cluster by creating a CSIDriver object, which enables the driver to customize how Kubernetes interacts with it.

Developing a CSI Driver

The first step to creating a CSI driver is writing an application implementing the gRPC services described in the CSI specification.

At a minimum, CSI drivers must implement the following CSI services:

  • CSI Identity service: Enables Kubernetes components and CSI containers to identify the driver

  • CSI Node service: Required methods enable callers to make a volume available at a specified path.

All CSI services may be implemented in the same CSI driver application. The CSI driver application should be containerized to make it easy to deploy on Kubernetes. Once containerized, the CSI driver can be paired with CSI Sidecar Containers and deployed in node and/or controller mode as appropriate.

Capabilities

If the custom driver supports additional features, CSI “capabilities” can be used to advertise the optional methods/services it supports. A capability is essentially a list of all the optional features the driver supports.

Note: Refer to the linked documentation for a detailed explanation of developing a CSI driver.

Try out provisioning the PV:

1. We need to create a storage class with:

volumeBindingMode: WaitForFirstConsumer
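
A minimal StorageClass sketch (the actual manifest ships in the repo's ./examples directory; the name csi-hostpath-sc and the provisioner hostpath.csi.k8s.io are the values the hostpath example typically uses, so adjust them if your copy differs):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-hostpath-sc                    # assumed name, referenced by the PVC below
provisioner: hostpath.csi.k8s.io           # the CSI driver name registered by the plugin
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer    # or Immediate, see the note after step 2
```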

2. We need a PVC that a sample Pod will use to consume the PV
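
A minimal PVC sketch referencing that storage class (name and size are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-pvc                        # assumed name, used by the Pod in step 3
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-hostpath-sc    # the StorageClass from step 1
```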

If the storage class uses volumeBindingMode: Immediate, the PV is created and bound to the PVC as soon as the PVC is created; with WaitForFirstConsumer (as above), provisioning and binding are delayed until a Pod consuming the PVC is scheduled.

3. The Pod to consume the PV
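
A minimal Pod sketch that mounts the PVC (image, names and mount path are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-csi-app
spec:
  containers:
    - name: my-frontend
      image: busybox
      command: ["sleep", "1000000"]
      volumeMounts:
        - mountPath: /data
          name: my-csi-volume
  volumes:
    - name: my-csi-volume
      persistentVolumeClaim:
        claimName: csi-pvc             # the PVC from step 2
```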

Once the Pod is scheduled, the PVC gets bound to the PV from the above step and the volume is mounted into the Pod

The actual manifests are found in the ./examples directory of the repo and can be deployed using the kubectl create or apply commands

Validate the deployed components:
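
For example (object names follow the sketches above):

```sh
kubectl get sc,pvc,pv
kubectl get pods   # the driver pods and the application pod should all be Running
```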

Brief on how it works:

  • csi-provisioner issues a CreateVolume request over the CSI socket; the hostpath-plugin serves the CreateVolume call and returns the newly created volume

  • csi-provisioner creates the PV and updates the PVC to Bound, and the VolumeAttachment object is created by the controller-manager

  • csi-attacher, which watches for VolumeAttachments, submits a ControllerPublishVolume RPC call to the hostpath-plugin; the hostpath-plugin handles ControllerPublishVolume and calls its hostpath AttachVolume logic, and csi-attacher updates the VolumeAttachment status

  • All this time kubelet waits for the volume to be attached and then submits NodeStageVolume (format the volume and mount it on the node at the staging dir) to the csi-node hostpath-plugin

  • The csi-node hostpath-plugin gets the NodeStageVolume call, mounts the volume to `/var/lib/kubelet/plugins/kubernetes.io/csi/pv/<pv-name>/globalmount`, and responds to kubelet

  • kubelet calls NodePublishVolume (mount the volume to the pod’s dir)

  • The csi-node hostpath-plugin performs NodePublishVolume and mounts the volume to `/var/lib/kubelet/pods/<pod-uuid>/volumes/kubernetes.io~csi/<pvc-name>/mount`

    Finally, kubelet starts the containers of the pod with the provisioned volume.

Let’s confirm that the Hostpath CSI driver works:

The Hostpath driver is configured to create new volumes under /tmp inside the hostpath container that is specified in the plugin DaemonSet. This path persists as long as the DaemonSet pod is up and running.

A file written in a properly mounted Hostpath volume inside an application should show up inside the Hostpath container.

1. First, create a file from the application pod as shown:
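
Something like the following, assuming the application pod is named my-csi-app and mounts the volume at /data as in the sketch above:

```sh
kubectl exec -it my-csi-app -- /bin/sh -c "touch /data/hello-world"
```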

2. Next, Exec into the Hostpath container and verify that the file shows up there:
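
Assuming the plugin pod is named csi-hostpathplugin-0 and its driver container is called hostpath (verify with kubectl get pods and kubectl describe pod):

```sh
kubectl exec -it csi-hostpathplugin-0 -c hostpath -- /bin/sh -c "find / -name hello-world"
```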

Note: An additional way to ensure the driver is working properly is to inspect the VolumeAttachment API object that represents the attached volume
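
For example:

```sh
kubectl get volumeattachment
kubectl describe volumeattachment
```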

Support for Snapshot

Volume snapshotting is a feature introduced for Kubernetes persistent volumes in v1.12.

It is an alpha feature; you need to enable the VolumeSnapshotDataSource feature gate in Kubernetes.

This feature is a great way to keep a snapshot (a VolumeSnapshot object) of the important data in your PVs.
It revolves around 3 objects: VolumeSnapshot, VolumeSnapshotContent and VolumeSnapshotClass.

The way they work together is similar to StorageClass, PVC and PV. If you have to create a snapshot of a volume provisioned by CSI, you create a VolumeSnapshot object specifying the source PVC and the snapshot class, and the CSI snapshotter container will create a VolumeSnapshotContent.

Let’s try out with an example:

We can create a VolumeSnapshot object, which will cause a VolumeSnapshotContent object to be created.
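
A minimal VolumeSnapshot sketch using the v1alpha1 API (the snapshot class name csi-hostpath-snapclass is an assumption based on the hostpath examples; csi-pvc is the PVC from earlier):

```yaml
apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshot
metadata:
  name: new-snapshot-demo
spec:
  snapshotClassName: csi-hostpath-snapclass   # VolumeSnapshotClass deployed with the driver
  source:
    name: csi-pvc                             # the PVC to snapshot
    kind: PersistentVolumeClaim
```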

The VolumeSnapshotContent is then created; we can check both objects with:
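
For example (the exact output depends on your cluster):

```sh
kubectl get volumesnapshot
kubectl get volumesnapshotcontent
```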

Restore from the snapshot:

The DataSource field in the PVC can accept a source of kind: VolumeSnapshot, which will create a new PV from that volume snapshot when a Pod is bound to this PVC.
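
A restore-PVC sketch (the names follow the earlier assumptions):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: hpvc-restore
spec:
  storageClassName: csi-hostpath-sc
  dataSource:
    name: new-snapshot-demo              # the VolumeSnapshot created above
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```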

This will create a PV with the same contents as the volume snapshot, and it can be attached to any pod using a PVC.

And we’re done! A new pod can be created and attached to the restored PV using the PVC.

The Hostpath driver, like many other CSI drivers, also implements Raw Block Volumes. You can try them out from the Block Volume Example.

I may talk about Raw Block Volume support and its use cases with the volume snapshot feature of CSI in another blog.

Conclusion

We learnt what CSI is, its components, and why to use CSI instead of ‘in-tree’ volume plugins, and we deployed the sample CSI HostPath driver on a K8s cluster. We created a PV of Filesystem mode using the csi-provisioner and developed an understanding of how it works. We also created a snapshot of the PV and restored it to a completely new PV with the same data, which opens up a whole new set of use cases to work on.

About the Author


Ayush is a Full-Stack R&D Developer at Velotio. He is currently working on a cloud-native product for Kubernetes using Go, Kubernetes operators, CRDs, and controllers. He has also worked on building VMware vSphere security and compliance products, enterprise-grade web applications, and chatbots. He likes reading fiction novels, playing console games, and travelling. He is also a big fan of F.R.I.E.N.D.S and Breaking Bad.