Infrastructure

Mesosphere DC/OS Masterclass : Tips and tricks to make life easier

DC/OS is an open-source distributed operating system for the data center, built on the Apache Mesos distributed systems kernel. As a distributed system, it runs on a cluster of master nodes and private/public agent nodes, where each node also has a host operating system that manages the underlying machine.

This blog gives an overview of DC/OS, its CLI and its various kinds of APIs, and shares some lesser-known tips and commands for using DC/OS effectively.

Real Time Analytics for IoT Data using Mosquitto, AWS Kinesis and InfluxDB

The Internet of Things (IoT) is maturing rapidly and finding applications across various industries. Many of the everyday devices we use are becoming smart devices, which are essentially IoT devices. These devices capture various parameters in and around their environment, generating a huge amount of data. This data needs to be collected, processed, stored and analyzed in order to get actionable insights from it. To do so, we need to build a data pipeline.

In this blog we will build such a pipeline using Mosquitto, Kinesis, InfluxDB and Grafana. We will discuss each component of the pipeline and the steps to build it.

Extending Kubernetes APIs with Custom Resource Definitions (CRDs)

Custom Resource Definitions (CRDs) are a powerful feature introduced in Kubernetes 1.7 that enables users to add their own custom objects to a Kubernetes cluster and use them like any other native Kubernetes object. In this blog post, we will see how to add a custom resource to a Kubernetes cluster using the command line as well as the Golang client library, thus also learning how to interact with a Kubernetes cluster programmatically.

As per this article, Custom Resource Definitions are part of a wider effort to refine and enhance Kubernetes as an extensible application platform, factoring everything but the bare essentials out of “core” Kubernetes in favour of modular and maintainable extensibility mechanisms.

In this blog we learn how to write your own CRDs through a hands-on demonstration.
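
As a rough illustration of what "adding a custom object" involves, here is a minimal sketch that builds a CRD manifest as a plain Python dict and prints it as JSON. It is not taken from the linked post: the group `example.com`, the kind `Widget`, the `size` field and the helper name `build_crd` are all hypothetical, and the manifest targets the current `apiextensions.k8s.io/v1` API rather than the beta API that shipped with Kubernetes 1.7.

```python
import json


def build_crd(group: str, kind: str, plural: str, version: str = "v1") -> dict:
    """Return a CustomResourceDefinition manifest as a plain dict.

    All argument values used below are hypothetical placeholders.
    """
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        # Kubernetes requires the CRD name to be "<plural>.<group>".
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "scope": "Namespaced",
            "names": {"kind": kind, "plural": plural, "singular": kind.lower()},
            "versions": [
                {
                    "name": version,
                    "served": True,   # this version is exposed by the API server
                    "storage": True,  # exactly one version must be the storage version
                    "schema": {
                        "openAPIV3Schema": {
                            "type": "object",
                            "properties": {
                                "spec": {
                                    "type": "object",
                                    # Hypothetical field, for illustration only.
                                    "properties": {"size": {"type": "integer"}},
                                }
                            },
                        }
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_crd("example.com", "Widget", "widgets"), indent=2))
```

Saving the printed JSON to a file and passing it to `kubectl apply -f` is one way such a definition could be registered, assuming a cluster that serves the `apiextensions.k8s.io/v1` API.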

Jenkins X - a Cloud-Native approach to CI/CD

Jenkins X is a project which rethinks how developers should interact with CI/CD in the cloud with a focus on making development teams productive through automation, tooling and DevOps best practices.

In this blog, we explore Jenkins X, understand how it differs from Jenkins and how to go about building and deploying our first application using it.

A Beginner's Guide to Edge Computing

There is a recent shift in the architecture of how data is stored and how computation is performed. Edge computing is one such shift, in which the data or the compute is decentralized and moved to the nodes nearest the user, which can be either smartphones or local region servers.

In this blog we will delve into what Edge Computing really is and its various types, and see how it is implemented and managed in the real world.

Deploy serverless, event-driven Python applications using Zappa

Zappa makes it super easy to build and deploy serverless, event-driven Python applications (including, but not limited to, WSGI web apps) on AWS Lambda + API Gateway. Think of it as "serverless" web hosting for your Python apps. That means infinite scaling, zero downtime, zero maintenance - and at a fraction of the cost of your current deployments!

This blog acts as a simple tutorial for deploying a sample Django/Python application using Zappa.

Demystifying High Availability in Kubernetes using Kubeadm

Kubernetes allows deployment and management of container-based applications at scale. One of the main advantages of Kubernetes is the greater reliability and stability it brings to container-based distributed applications through dynamic scheduling of containers. But how do you make sure Kubernetes itself stays up when a component or its master node goes down?

In this blog we look at the steps to make your Kubernetes cluster highly available and fault tolerant.

Exploring Upgrade Strategies for Stateful Sets in Kubernetes

In the age of continuous delivery and agility, where software is deployed tens of times per day, and sometimes per hour, using container orchestration platforms, a seamless upgrade mechanism becomes a critical aspect of any technology adoption. In this blog we explore the various upgrade strategies available for StatefulSets in Kubernetes, with Cassandra as the database.

Lessons learnt while building an ETL pipeline for MongoDB & Amazon Redshift using Apache Airflow

Recently, the author was involved in building a custom ETL (Extract-Transform-Load) pipeline using Apache Airflow, which included extracting data from MongoDB collections and loading it into Amazon Redshift tables.

Each ETL pipeline comes with specific business requirements around data processing that are hard to meet using off-the-shelf ETL solutions. This is why a majority of ETL solutions are built manually, from scratch. In this blog, I am going to talk about the lessons I learnt while building an optimized, efficient, near real-time and fault-tolerant custom ETL solution using Apache Airflow, which involved moving data from MongoDB to Redshift.

Machine Learning for your Infrastructure: Anomaly Detection with Elastic + X-Pack

We need a practical and scalable approach to understand the cause-effect relationship between data sources and events across a complex infrastructure of VMs, containers, networks, micro-services, regions, etc. Machine learning is particularly useful for problems where we need to identify “what changed”, since machine learning algorithms can analyze existing data to understand its patterns, making it easier to recognize the cause. This is known as unsupervised learning, where the algorithm learns from experience and identifies similar patterns when they come along again.

Let's see how you can set up Elastic + X-Pack to enable anomaly detection for your infrastructure & applications.

A practical guide to deploying multi-tier applications on Google Container Engine (GKE)

In this blog, we look at how to deploy, scale & delete a multi-tier (Flask/Python and MySQL) application in Google Container Engine.

Elasticsearch 101: Fundamentals & Core Components

Elasticsearch is currently one of the most popular ways to implement free-text search in your application. This blog post is an introduction to Elasticsearch, including its components and data types. It covers some of the basic but important concepts: Clusters, different types of Nodes, Documents, Mappings, Indices, and Shards.