Setting up an ML stack involves many moving parts: tooling, data analysis, and model training across the ML pipeline. It is even harder to set up the same stack in multi-cloud environments. This is where Kubeflow comes into the picture: it makes ML pipelines easy to develop, deploy, and manage.
In this article, we are going to learn how to install Kubeflow on Kubernetes (GKE), train an ML model on Kubernetes, and publish the results. This introductory guide will be helpful for anyone who wants to understand how to use Kubernetes to run an ML pipeline in a simple, portable, and scalable way.
Kubeflow Installation on GKE
You can install Kubeflow onto any Kubernetes cluster, regardless of which cloud it runs in, but the cluster needs to meet the following minimum requirements:
4 CPUs
50 GB storage
12 GB memory
Kubernetes version 1.14 or above is recommended.
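If you are starting from scratch on GKE, a cluster meeting these requirements can be created with a single gcloud command. This is a minimal sketch; the cluster name, machine type, node count, and zone are illustrative choices, not requirements (n1-standard-4 provides 4 vCPUs and 15 GB of memory per node):

gcloud container clusters create kubeflow-demo \
    --machine-type=n1-standard-4 \
    --num-nodes=2 \
    --zone=us-central1-a
gcloud container clusters get-credentials kubeflow-demo --zone=us-central1-a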
You need to download kfctl from the Kubeflow releases page and untar the file: tar -xvf kfctl_v1.0.2_<platform>.tar.gz -C /home/velotio/kubeflow
Building the deployment from the kfctl_k8s_istio.v1.0.2.yaml configuration downloads that file and generates a kustomize folder. If you want to expose the UI with a LoadBalancer, edit $KF_DIR/kustomize/istio-install/base/istio-noauth.yaml and change the istio-ingressgateway service type from NodePort to LoadBalancer.
Now, you can install Kubeflow using the following commands:
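The following is a minimal sketch of the build-and-apply flow, assuming the standard Kubeflow 1.0 config URI and a deployment directory under /home/velotio (adjust KF_NAME and BASE_DIR to your environment):

export KF_NAME=kubeflow
export BASE_DIR=/home/velotio
export KF_DIR=${BASE_DIR}/${KF_NAME}
export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_k8s_istio.v1.0.2.yaml"
mkdir -p ${KF_DIR} && cd ${KF_DIR}
# Download the config and generate the kustomize folder
# (edit istio-noauth.yaml after this step if you want a LoadBalancer).
kfctl build -V -f ${CONFIG_URI}
# Deploy Kubeflow onto the cluster in your current kubectl context.
export CONFIG_FILE=${KF_DIR}/kfctl_k8s_istio.v1.0.2.yaml
kfctl apply -V -f ${CONFIG_FILE}

You can watch the components come up with kubectl -n kubeflow get pods.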
Developing your ML application consists of several stages:
Gathering and analyzing the data
Researching the right model for the type of data collected
Training and testing the model
Tuning the model
Deploying the model
These stages apply to almost any ML problem you’re trying to solve, but where does Kubeflow fit into this model?
Kubeflow provides its own pipelines to solve this problem. A Kubeflow pipeline consists of the ML workflow description, the different stages of the workflow, and how they combine in the form of a graph.
Kubeflow gives you the ability to run your ML pipeline on any hardware, be it your laptop, a cloud, or a multi-cloud environment. Wherever you can run Kubernetes, you can run your ML pipeline.
Training your ML Model on Kubeflow
Once you’ve deployed Kubeflow in the first step, you should be able to access the Kubeflow UI (the central dashboard).
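If you kept the default NodePort gateway instead of switching to a LoadBalancer, one common way to reach the UI is to port-forward the Istio ingress gateway (assuming the default istio-system namespace) and then open http://localhost:8080 in your browser:

kubectl port-forward -n istio-system svc/istio-ingressgateway 8080:80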
The first step is to upload your pipeline. However, to do that, you need to prepare your pipeline in the first place. We are going to use a financial time series dataset to train our model. You can find the example code in the Kubeflow examples repository on GitHub.
Once the training image is ready in our GCR repo, we can start our training job on Kubernetes. Have a look at the TFJob resource in CPU/tfjob1.yaml and update the image and bucket references.
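For orientation, a Kubeflow 1.0 TFJob manifest has roughly this shape. This is a minimal sketch only; the job name, image path, and bucket argument are placeholders, not the actual contents of CPU/tfjob1.yaml:

apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: trainingjob1
  namespace: kubeflow
spec:
  tfReplicaSpecs:
    Worker:
      replicas: 1
      restartPolicy: OnFailure
      template:
        spec:
          containers:
            - name: tensorflow   # the TFJob operator expects this container name
              image: gcr.io/<your-project>/<training-image>:<tag>
              args: ["--bucket=<your-gcs-bucket>"]

After updating the real manifest, submit it with kubectl apply -f CPU/tfjob1.yaml and check its progress with kubectl -n kubeflow describe tfjob <job-name>.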
Kubeflow Pipelines needs our pipeline compiled into its domain-specific language (DSL). We can compile our Python 3 file with a tool called dsl-compile, which ships with the Kubeflow Pipelines Python SDK. So, first, install that SDK:
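The SDK is published on PyPI as kfp, so a plain pip install is enough:

pip3 install kfp --upgrade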
Next, inspect ml_pipeline.py and update it with the path of the CPU image that you built in the previous steps. Then, compile the pipeline into the DSL, using:
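dsl-compile takes the pipeline source and an output path; the output filename here simply matches the file referenced in the next step:

dsl-compile --py ml_pipeline.py --output ml_pipeline.py.tar.gz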
Now, a file ml_pipeline.py.tar.gz is generated, which we can upload to the Kubeflow Pipelines UI.
Once the pipeline is uploaded, you can see the stages in a graph-like format.
Next, we can click on the pipeline and create a run. For each run, you need to specify the parameters that you want to use. While the pipeline is running, you can inspect the logs of each step from the UI.
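The same logs are also available from the command line, since each pipeline step runs as a pod in the kubeflow namespace (the pod name below is a placeholder):

kubectl -n kubeflow get pods
kubectl -n kubeflow logs <step-pod-name>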
Running a Jupyter Notebook in your ML Pipeline
You can also define your pipeline interactively from a Jupyter notebook:
1. Navigate to the Notebook Servers page through the Kubeflow UI.
2. Select the namespace and click on “New Server.”
3. Give the server a name and provide the Docker image for the TensorFlow version on which you want to train your model. I took the TensorFlow 1.15 image.
4. Once a notebook server is available, click on “connect” to connect to the server.
5. This will open up a new window and a Jupyter terminal.
6. Input the following command: pip install -U kfp.
7. Now that you have a notebook, you can replace the environment variables like WORKING_DIR, PROJECT_NAME, and GITHUB_TOKEN. Once you do that, you can run the notebook step-by-step (one cell at a time) by pressing Shift+Enter, or run the whole notebook via Cell > Run All in the menu. A minimal example of what such an in-notebook pipeline definition looks like is sketched below.
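To make the interactive flow concrete, here is a minimal, self-contained sketch using the kfp SDK API of the Kubeflow 1.0 era. The pipeline, image, and echo step are illustrative stand-ins, not the financial time series example:

import kfp
import kfp.dsl as dsl

@dsl.pipeline(name='hello-pipeline', description='A one-step illustrative pipeline.')
def hello_pipeline(message='hello'):
    # A single containerized step that just echoes the pipeline parameter.
    dsl.ContainerOp(
        name='echo',
        image='alpine:3.11',
        command=['sh', '-c'],
        arguments=['echo "%s"' % message],
    )

# Compile to the archive format that the Pipelines UI accepts...
kfp.compiler.Compiler().compile(hello_pipeline, 'hello_pipeline.tar.gz')

# ...or submit a run directly; inside a Kubeflow notebook server,
# kfp.Client() resolves the in-cluster Pipelines endpoint automatically.
client = kfp.Client()
client.create_run_from_pipeline_func(hello_pipeline, arguments={'message': 'hi'})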
Conclusion
The ML world has its own challenges: environments are tightly coupled, and the tools you need to build an ML stack are hard to set up and configure. This becomes harder still in production environments, where you have to be extremely careful not to break the components that are already in place.
Kubeflow makes getting started with ML highly accessible. You can run your ML workflows anywhere you can run Kubernetes. Kubeflow makes it possible to run your ML stack across multi-cloud environments, which lets ML engineers train their models at scale on top of Kubernetes.