Exploring Upgrade Strategies for StatefulSets in Kubernetes

Introduction

In the age of continuous delivery and agility, software is deployed tens of times per day, and sometimes per hour, using container orchestration platforms. A seamless upgrade mechanism is therefore a critical aspect of any technology adoption, and Kubernetes is no exception.

Kubernetes provides a variety of controllers that define how pods are set up and deployed within the cluster. These controllers can group pods together according to their runtime needs and can be used to define pod replication and pod startup ordering. Each controller manages its pods (the smallest deployable unit in Kubernetes), so you don’t need to create, manage, and delete the pods yourself. Kubernetes has several types of controllers, such as:

  1. Deployment

  2. StatefulSet

  3. DaemonSet

  4. Job

  5. ReplicaSet

Each controller represents an application pattern. For example, a Deployment represents the stateless application pattern, in which you don’t store the state of your application. A StatefulSet represents the stateful application pattern, in which you store data, as with databases and message queues. In this blog, we will focus on the StatefulSet controller and its update feature.

StatefulSet

A StatefulSet is a Kubernetes controller that deploys applications according to a specified rule set and is aimed at persistent, stateful applications. Its pods are deployed in an ordered and graceful manner. StatefulSets are generally used with distributed applications that require each node to have persistent state and that need the ability to configure an arbitrary number of nodes. Each StatefulSet pod has a unique identity comprised of an ordinal, a stable network identity, and stable storage. The identity sticks to the pod, regardless of which node it’s scheduled on. For more details, see the Kubernetes documentation on StatefulSets.

Update Strategies for StatefulSets

There are a couple of different strategies available for upgrades - Blue/Green and Rolling updates. Let's review them in detail:

Blue-Green Deployment

Blue-green deployment is one of the commonly used update strategies. In this strategy, there are two identical environments of your application: the Blue environment, which runs the current deployment, and the Green environment, which runs the new deployment to which we want to upgrade. The approach is simple:

  1. Switch the load balancer to route traffic to the Green environment.

  2. Delete the Blue environment once the Green environment is verified. 

Disadvantages of Blue-Green deployment:

  1. All current transactions and sessions will be lost, due to the physical switch from one machine serving the traffic to another.

  2. Implementing blue-green deployment becomes complex with a database, especially if the database schema changes across versions.

  3. Blue-green deployment needs extra cloud setup or hardware, which increases the overall cost.

Rolling update strategy

After Blue-Green deployment, let's take a look at rolling updates and how they work.

  1. As the name suggests, this strategy replaces the currently running instances of the application with new instances, one by one.

  2. Health checks play an important role in this strategy: old instances of the application are removed only if the new instances are healthy. As a result, the deployment is heterogeneous while it moves from the old version of the application to the new one.

  3. The benefit of this strategy is its incremental approach: the update rolls out gradually, and verification happens in parallel as traffic to the new version increases.

  4. A rolling update doesn’t need extra hardware or cloud setup, which makes it a cost-effective upgrade technique.

StatefulSet upgrade strategies

With a basic understanding of upgrade strategies, let's explore the update strategies available for StatefulSets in Kubernetes. StatefulSets are used for workloads such as databases, where the state of the application is a crucial part of the deployment. We will take Cassandra as the example to learn about the StatefulSet upgrade feature, and we will use gce-pd storage to store the data. StatefulSets (since Kubernetes 1.7) use an update strategy to configure and disable automated rolling updates for the containers, labels, resource requests/limits, and annotations of their pods. The update strategy is configured using the updateStrategy field.

The updateStrategy field accepts one of the following values:

  1. OnDelete

  2. RollingUpdate

OnDelete update strategy

OnDelete prevents the controller from automatically updating its pods. You need to delete each pod manually for the changes to take effect. It is a manual update process for the StatefulSet application, and this is the main difference between the OnDelete and RollingUpdate strategies. The OnDelete strategy is useful when you need to perform some action or verification after each pod is updated. For example, after updating a single Cassandra pod, you might need to check that the updated pod has rejoined the Cassandra cluster correctly.

Let’s first create a StatefulSet. We will take a simple example of Cassandra and deploy it using the StatefulSet controller. Persistent storage is the key point of the StatefulSet controller; you can read more about storage classes in the Kubernetes documentation.

For the purpose of this blog, we will use the Google Kubernetes Engine.

  • First, define the storage class as follows:
  • Then create the Storage class using kubectl:

$ kubectl create -f storage_class.yaml

  • Here is the YAML file for the Cassandra service and the StatefulSet.
  • Let's create the StatefulSet now.
$ kubectl create -f cassandra.yaml
  • After creating the Cassandra StatefulSet, check the running pods; you should see something like this:
$ kubectl get pods
NAME          READY   STATUS    RESTARTS   AGE
cassandra-0   1/1     Running   0          2m
cassandra-1   1/1     Running   0          2m
cassandra-2   1/1     Running   0          2m

  • Check that the Cassandra cluster has formed correctly using the following command:
$ kubectl exec -it cassandra-0 -- nodetool status
Datacenter: DC1-K8Demo
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving

    Address          Load        Tokens  Owns   Host ID                               Rack
UN  192.168.4.193    101.15 KiB  32      72.0%  abd9f52d-85ef-44ee-863c-e1b174cd9412  Rack1-K8Demo
UN  192.168.199.67   187.81 KiB  32      72.8%  c40e89e4-44fe-4fc2-9e8a-863b6a74c90c  Rack1-K8Demo
UN  192.168.187.196  131.42 KiB  32      55.2%  c235505c-eec5-43bc-a4d9-350858814fe5  Rack1-K8Demo

  • Before updating, let’s describe the running pod. Look for the Image field in the output of the following command:
$ kubectl describe pod cassandra-0
  • The Image field will show gcr.io/google-samples/cassandra:v12. The newer image might contain a new Cassandra version or database schema changes, so before upgrading such a crucial component, it’s always safest to back up the data. Now, let’s patch the Cassandra StatefulSet with the image to which we want to update:
$ kubectl patch statefulset cassandra --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"gcr.io/google-samples/cassandra:v13"}]'

You will see the output `statefulset.apps "cassandra" patched`, but with this strategy the controller won’t update the running pods automatically. You need to delete each pod and wait until a pod with the new configuration comes up. Let’s try deleting the cassandra-0 pod.

$ kubectl delete pod cassandra-0
  • Wait until cassandra-0 comes back up in the Running state, and then check that it is running the intended image, i.e. gcr.io/google-samples/cassandra:v13. Now cassandra-0 is running the new image, while cassandra-1 and cassandra-2 are still running the old one. With this strategy, you need to delete those pods as well for the new image to take effect.
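The OnDelete behaviour described above can be pictured with a toy in-memory model. This is only a sketch in Python for illustration: the class `OnDeleteStatefulSet` and its methods are hypothetical names, not part of Kubernetes or any client library, and the model ignores scheduling, readiness, and storage.

```python
class OnDeleteStatefulSet:
    """Toy in-memory model of a StatefulSet that uses the OnDelete strategy."""

    def __init__(self, name, replicas, image):
        self.name = name
        self.template_image = image
        # Every pod starts out running the image from the pod template.
        self.pods = {f"{name}-{ordinal}": image for ordinal in range(replicas)}

    def patch_image(self, image):
        # Patching changes only the pod template; running pods are left alone.
        self.template_image = image

    def delete_pod(self, pod):
        # The controller recreates the pod under the same name,
        # this time from the current pod template.
        if pod not in self.pods:
            raise KeyError(pod)
        self.pods[pod] = self.template_image


sts = OnDeleteStatefulSet("cassandra", 3, "gcr.io/google-samples/cassandra:v12")
sts.patch_image("gcr.io/google-samples/cassandra:v13")
sts.delete_pod("cassandra-0")
for pod, image in sts.pods.items():
    print(pod, image)
```

In this model, only cassandra-0 ends up on the v13 image after the single delete; cassandra-1 and cassandra-2 stay on v12 until they are deleted too, which mirrors the manual, pod-by-pod nature of OnDelete.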

RollingUpdate update strategy

RollingUpdate is an automated update process in which the controller deletes and then recreates each of its pods, one at a time. While updating, the controller makes sure that an updated pod is running and ready before it updates that pod’s predecessor. The pods in the StatefulSet are updated in reverse ordinal order, the same as the pod termination order, i.e. from the largest ordinal to the smallest.
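The ordering and readiness gating described above can be sketched in a few lines of Python. This is a simplified model, not the controller's actual code: `rolling_update` and the `is_ready` callback are hypothetical names introduced here, and a pod that never becomes ready is modelled simply as the point where the rollout stops.

```python
def rolling_update(name, replicas, is_ready):
    """Return the pods touched by a rolling update, in the order they are updated.

    Pods are visited from the largest ordinal down to 0. The rollout only
    moves on to the next (lower) ordinal once the pod that was just
    updated reports ready.
    """
    touched = []
    for ordinal in range(replicas - 1, -1, -1):
        pod = f"{name}-{ordinal}"
        touched.append(pod)
        if not is_ready(pod):
            # The updated pod is not ready, so its predecessor is not updated.
            break
    return touched


# Every pod becomes ready: all three are updated, highest ordinal first.
print(rolling_update("cassandra", 3, lambda pod: True))

# cassandra-1 never becomes ready: cassandra-0 is left untouched.
print(rolling_update("cassandra", 3, lambda pod: pod != "cassandra-1"))
```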

For the rolling update strategy, we will create the Cassandra StatefulSet with the .spec.updateStrategy.type field set to RollingUpdate.

  • To try the rolling update feature, we can patch the existing statefulset with the updated image.
$ kubectl patch statefulset cassandra --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"gcr.io/google-samples/cassandra:v13"}]'
  • Once you execute the above command, monitor the output of the following command:
$ kubectl get pods -w

If a pod fails during the update process, the controller restores it to its current version: pods that have already received the update are restored to the updated version, and pods that have not yet received the update are restored to the previous version.

Partitioning a RollingUpdate (Staging an Update)

The updateStrategy contains one more field, for partitioning the RollingUpdate. If a partition is specified, all pods with an ordinal greater than or equal to the partition will be updated, and pods with an ordinal less than the partition will not be. If a pod with an ordinal less than the partition is deleted, it is recreated with the old definition/version. Partitioning is useful when you want to stage an update, roll out a canary, or perform a phased rollout.
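The partition rule is easy to check with a small Python sketch. The helper name `pods_selected_by_partition` is hypothetical and exists only for this illustration; it simply applies the "ordinal greater than or equal to the partition" rule and lists the selected pods in update order.

```python
def pods_selected_by_partition(name, replicas, partition):
    """List the pods a partitioned RollingUpdate will update.

    Only pods whose ordinal is greater than or equal to the partition are
    selected; they are listed from the largest ordinal to the smallest.
    """
    return [
        f"{name}-{ordinal}"
        for ordinal in range(replicas - 1, -1, -1)
        if ordinal >= partition
    ]


for partition in (3, 2, 1, 0):
    print(partition, pods_selected_by_partition("cassandra", 3, partition))
```

With three replicas, a partition of 3 selects no pods at all, a partition of 2 selects only cassandra-2, and a partition of 0 selects every pod.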

You can set the partition through the .spec.updateStrategy.rollingUpdate.partition field:

$ kubectl patch statefulset cassandra -p '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":2}}}}'

In the above command, we set the partition value to 2. With our three replicas, this means that whenever we update the Cassandra StatefulSet, only the cassandra-2 pod is updated. Let’s try patching the existing StatefulSet with an updated image.

$ kubectl patch statefulset cassandra --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"gcr.io/google-samples/cassandra:v14"}]'

After patching, watch the output of the following command:

$ kubectl get pods -w

You can keep decrementing the partition value, and each time the pods whose ordinal is now greater than or equal to the partition will pick up the applied patch. For example, if you patch the StatefulSet with partition=0, all the pods of the Cassandra StatefulSet will be updated with the provided upgrade configuration.
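A phased rollout of this kind boils down to issuing the same partition patch with a decreasing value. The Python sketch below only generates the command strings and runs nothing against a cluster; `staged_partition_commands` is a hypothetical helper name, and in practice you would verify the application between steps before lowering the partition again.

```python
import json


def staged_partition_commands(name, replicas):
    """Build one `kubectl patch` command per rollout stage.

    The partition starts at the highest ordinal and is lowered by one per
    stage, so each stage lets one more pod pick up the new pod template.
    """
    commands = []
    for partition in range(replicas - 1, -1, -1):
        body = json.dumps(
            {
                "spec": {
                    "updateStrategy": {
                        "type": "RollingUpdate",
                        "rollingUpdate": {"partition": partition},
                    }
                }
            },
            separators=(",", ":"),  # compact JSON, no extra whitespace
        )
        commands.append(f"kubectl patch statefulset {name} -p '{body}'")
    return commands


for command in staged_partition_commands("cassandra", 3):
    print(command)
```

For a three-replica StatefulSet this prints three commands, with the partition going 2, 1, 0.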

Verifying if the upgrade was successful

Verifying the upgrade is an important step in concluding it. This step differs from application to application. Since we have taken Cassandra as the example in this blog, we will verify that the cluster of Cassandra nodes has formed properly.

Use the `nodetool status` command to verify the cluster. After upgrading all the pods, you might also want to run some post-processing, such as migrating the schema, if your upgrade requires it.
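If you want to script this check, a minimal Python sketch could scan the `nodetool status` text for nodes that are not in the UN (Up/Normal) state. It assumes output shaped like the sample earlier in this blog, where each node line starts with a two-letter status/state code followed by the address; the exact column layout may differ between Cassandra versions, and `check_nodetool_status` is a hypothetical helper name.

```python
# Two-letter codes: first letter is Up/Down, second is Normal/Leaving/Joining/Moving.
NODE_CODES = {status + state for status in "UD" for state in "NLJM"}

SAMPLE = """\
Datacenter: DC1-K8Demo
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving

Address Load Tokens Owns Host ID Rack
UN 192.168.4.193 101.15 KiB 32 72.0% abd9f52d-85ef-44ee-863c-e1b174cd9412 Rack1-K8Demo
UN 192.168.199.67 187.81 KiB 32 72.8% c40e89e4-44fe-4fc2-9e8a-863b6a74c90c Rack1-K8Demo
UN 192.168.187.196 131.42 KiB 32 55.2% c235505c-eec5-43bc-a4d9-350858814fe5 Rack1-K8Demo
"""


def check_nodetool_status(output):
    """Return (number of node lines seen, addresses of nodes that are not UN)."""
    total = 0
    not_up_normal = []
    for line in output.splitlines():
        fields = line.split()
        # Node lines start with a status/state code followed by the address.
        if len(fields) >= 2 and fields[0] in NODE_CODES:
            total += 1
            if fields[0] != "UN":
                not_up_normal.append(fields[1])
    return total, not_up_normal


print(check_nodetool_status(SAMPLE))
```

The node count is returned alongside the list so that an empty or unparseable output (zero nodes seen) is not mistaken for a healthy cluster.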

Depending on the upgrade strategy, you can verify your application in the following ways.

  1. With the OnDelete update strategy, you can update the pods one by one and check the application status after each, to make sure the upgrade is working fine.

  2. With the RollingUpdate strategy, you can check the application status once all the running pods of your application have been upgraded.

For an application like Cassandra, the OnDelete strategy is preferable to RollingUpdate. In a rolling update, we saw that the Cassandra pods are updated one by one, from the highest ordinal index to the lowest. The Cassandra cluster might go into a failed state after, say, two pods have been updated, and unlike with OnDelete, you cannot stop to recover it at that point. You have to try to recover Cassandra once the complete upgrade is done, i.e. once all the pods have been upgraded to the provided image. If you have to use a rolling update, try partitioning it.

Conclusion

In this blog, we went through the Kubernetes controllers, focusing mainly on StatefulSets. We learnt about the differences between the blue-green deployment and rolling update strategies, then worked through the Cassandra StatefulSet example and successfully upgraded it using the OnDelete and RollingUpdate strategies. Do let us know in the comments section below if you have any questions or additional thoughts.


About The Author


Ajay is a Cloud & Virtualization specialist. He has a strong understanding of VMWare Virtualization Platform, Amazon Web Services & Google Cloud Platform. Lately, he has been working in the world of Docker & Kubernetes. Ajay has helped several customers in adopting Kubernetes and has built tooling and automation solutions around it. He is also a huge fan of FRIENDS and Game Of Thrones!