A Practical Guide to HashiCorp Consul - Part 1

This is part 1 of a 2-part series, A Practical Guide to HashiCorp Consul. This part focuses on understanding the problems that Consul solves and how it solves them. The second part focuses on a practical application of Consul in a real-life example and will be published next week. Let’s get started.

How about setting up a discoverable, configurable, and secure service mesh using a single tool?

What if we told you this tool is platform-agnostic and cloud-ready?

And that it comes as a single binary download?

All this is true. The tool we are talking about is HashiCorp Consul.

Consul provides service discovery, health checks, load balancing, a service graph, identity enforcement via TLS, and distributed service configuration management.

Let’s learn about Consul in detail below and see how it solves these complex challenges and makes the life of a distributed system operator easier.

Contents:

  1. Introduction

  2. Monolithic vs. Service Oriented Architecture (SOA)

  3. Service Discovery in monolithic, challenges in distributed and Consul’s solution

  4. Configuration Management in monolith, challenges in distributed and Consul’s solution

  5. Network Segmentation in monolithic, challenges in distributed and Consul’s solution

  6. Basic Architecture of Consul

  7. Getting Started with Consul

  8. How is Consul different from Zookeeper, doozerd, and etcd?

  9. Open Source Tools around HashiCorp Consul

  10. Conclusion

  11. References

Introduction

Microservices and other distributed systems can enable faster, simpler software development. But there is a trade-off: greater operational complexity around inter-service communication, configuration management, and network segmentation.

Monolithic Application (representational) - with different subsystems A, B, C and D

Distributed Application (representational) - with different services A, B, C and D

HashiCorp Consul is an open source tool that solves these new complexities by providing service discovery, health checks, load balancing, a service graph, mutual TLS identity enforcement, and a configuration key-value store. These features make Consul an ideal control plane for a service mesh.

HashiCorp Consul supports Service Discovery, Service Configuration, and Service Segmentation

HashiCorp announced Consul in April 2014, and it has since gained good community acceptance.

This guide discusses some of these crucial problems and explores the solutions HashiCorp Consul provides to tackle them.

Let’s run through the topics that we are going to cover in this guide. The topics are written to be self-contained, so you can jump directly to a specific topic if you want to.

Brief Background on Monolithic vs. Service-oriented Architectures (SOA)

Looking at traditional architectures of application delivery, what we find is a classic monolith. When we talk about a monolith, we mean a single application deployment.

Even though it is a single application, it typically has multiple sub-components.

One of the examples that HashiCorp’s CTO Armon Dadgar gave in his introductory video for Consul was about delivering a desktop banking application. It has a discrete set of sub-components - for example, authentication (say subsystem A), account management (subsystem B), fund transfer (subsystem C), and foreign exchange (subsystem D).

Now, although these are independent functions - system A authentication vs system C fund transfer - we deploy them as a single, monolithic app.

Over the last few years, we have seen a trend away from this kind of architecture. There are several reasons for this shift.

The challenge with a monolith is this: suppose there is a bug in one of the subsystems, system A, related to authentication.

Representational bug in Subsystem A in our monolithic application

We can’t just fix it in system A and update it in production.

Representational bug fix in Subsystem A in our monolithic application

We have to update system A and redeploy the whole application, which means redeploying subsystems B, C, and D as well.

Bug fix in one subsystem results in redeployment of the whole monolithic application

This whole redeployment is not ideal. Instead, we would like to deploy individual services.

The same monolithic app can be delivered as a set of individual, discrete services.

Dividing monolithic application into individual services

So, if there is a bug fix in one of our services:

Representational bug in one of the services, in this case Service A of our SOA application

and we fix that bug:

Representational bug fix in Service A of our SOA application

We can do the redeployment of that service without coordinating the deployment with other services. What we are essentially talking about is one form of microservices.

Bug fix will result in redeployment of only Service A within our whole application

This gives a big boost to our development agility. We don’t need to coordinate our development efforts across different development teams or even systems. We have the freedom to develop and deploy independently: one service might ship weekly and another quarterly. This is a big advantage for the development teams.

But, there is no such thing as a free lunch.

The development efficiency we have gained introduces its own set of operational challenges. Let’s look at some of those.

Service discovery in a monolith, its challenges in a distributed system, and Consul's solution

Monolithic applications

Assume two services in a single application want to talk to one another. One way is to expose a method, make it public, and allow other services to call it. Because a monolithic application is a single app, the services expose public functions, and communication simply means function calls across services.

Subsystems talk to each other via function call within our monolithic application

As this is a function call within a process, it happens in memory. Thus, it's fast, and we need not worry about how our data is moved or whether it is secure.

Distributed Systems

In the distributed world, service A is no longer delivered as part of the same application as service B. So, how does service A find service B if it wants to talk to B?

Service A tries to find Service B to establish communication

Service A might not even be on the same machine as service B, so there is a network in play. It is not as fast: the latency is measured on the order of milliseconds, compared to the nanoseconds of a simple function call.

Challenges

As we already know by now, two services on a distributed system have to discover one another to interact. One of the traditional ways of solving this is by using load balancers.

A load balancer sits between services to allow them to talk to each other

Load balancers would sit in front of each service with a static IP known to all other services.

A load balancer between two services allows two way traffic

This makes it possible to add multiple instances of the same service behind the load balancer, which directs the traffic accordingly. The load balancer IP is static and hard-coded within all other services, so services can skip discovery.

Load balancers allow communication between multiple instances of same service

The challenge now is to maintain a set of load balancers for each individual service. And we can safely assume there was originally a load balancer for the whole application as well. The cost and effort of maintaining these load balancers have increased.

With load balancers in front of the services, each one is a single point of failure. Even when we have multiple instances of a service behind the load balancer, if the load balancer is down, our service is down, no matter how many instances of that service are running.

Load balancers also increase the latency of inter-service communication. If service A wishes to talk to service B, the request from A first has to go to the load balancer of service B and only then reaches B. The response from B has to go through the same drill.

Maintaining an entry for each service instance on an application-wide load balancer

And by nature, load balancers are manually managed in most cases. If we add another instance of service, it will not be readily available. We will need to register that service into the load balancer to make it accessible to the world. This would mean manual effort and time.

Consul’s Solutions

Consul’s solution to the service discovery problem in distributed systems is a central service registry.

Consul maintains a central registry which contains the entry for all the upstream services. When a service instance starts, it is registered on the central registry. The registry is populated with all the upstream instances of the service.

Consul’s Service Registry helps Service A find Service B and establish communication

When service A wants to talk to service B, it discovers B by querying the registry for the upstream service instances of B. So, instead of talking to a load balancer, the service can talk directly to the desired destination service instance.

Consul also provides health checks on these service instances. If one of the service instances, or the service itself, is unhealthy or fails its health check, the registry knows about it and avoids returning that address. The work that the load balancer would do is handled by the registry in this case.

Also, if there are multiple instances of the same service, Consul spreads the traffic randomly across them, leveling the load among the instances.

Consul thus handles failure detection and load distribution across multiple instances of a service without the need to deploy a centralized load balancer.

The traditional problem of slow, manually managed load balancers is taken care of here. Consul manages the registry programmatically: it is updated whenever a new service registers itself and becomes available to receive traffic.

This helps with scaling the services with ease.
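To make the registry idea concrete, here is a minimal in-memory sketch in Python. It is a toy model and not Consul's implementation: the class name ServiceRegistry, its methods, and the addresses are all hypothetical, chosen only to illustrate registration, health-aware lookup, and random selection among healthy instances.

```python
import random


class ServiceRegistry:
    """Toy in-memory registry: service name -> {address: healthy?}."""

    def __init__(self, rng=None):
        self._instances = {}
        self._rng = rng or random.Random()

    def register(self, service, address):
        # A new instance registers itself when it starts; it begins healthy.
        self._instances.setdefault(service, {})[address] = True

    def deregister(self, service, address):
        self._instances.get(service, {}).pop(address, None)

    def report_health(self, service, address, healthy):
        # Record the result of a health check for one known instance.
        if address in self._instances.get(service, {}):
            self._instances[service][address] = healthy

    def healthy_instances(self, service):
        # Unhealthy instances are never returned to callers.
        known = self._instances.get(service, {})
        return sorted(addr for addr, ok in known.items() if ok)

    def discover(self, service):
        # Pick one healthy instance at random to spread the load.
        candidates = self.healthy_instances(service)
        if not candidates:
            raise LookupError(f"no healthy instance of {service!r}")
        return self._rng.choice(candidates)


registry = ServiceRegistry()
registry.register("service-b", "10.0.0.11:8080")  # hypothetical addresses
registry.register("service-b", "10.0.0.12:8080")
registry.report_health("service-b", "10.0.0.11:8080", healthy=False)
print(registry.discover("service-b"))  # only the healthy instance can be chosen
```

Service A never needs a hard-coded load balancer address here; it only needs to know the name of the service it wants to reach.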

Configuration Management in a monolith, its challenges in a distributed environment, and Consul's solution

Monolithic Applications

When we look at the configuration of a monolithic application, it tends to be somewhere along the lines of a giant YAML, XML, or JSON file. That configuration is supposed to configure the entire application.

Single configuration file shared across different parts of our monolithic application

Given a single file, all the subsystems in our monolithic application consume the configuration from the same file, which gives all our subsystems or services a consistent view.

If we wish to change the state of the application with a configuration update, the change is easily available to all the subsystems. The new configuration is consumed simultaneously by all the components of our application.

Distributed Systems

Unlike a monolith, distributed services do not have a common view of the configuration. The configuration is now distributed, and every individual service needs to be configured separately.

A copy of application configuration is distributed across different services

Challenges in Distributed Systems

  • Configuration has to be spread across different services. Maintaining consistency between the configuration on different services after each update is a challenge.

  • Moreover, the challenge grows when we expect the configuration to be updated dynamically.

Consul’s Solutions

Consul’s solution for configuration management in a distributed environment is a central key-value store.

Consul’s KV store allows seamless configuration mapping on each service

Consul solves this challenge in a unique way. Instead of splitting the configuration into pieces spread across the distributed services, it pushes the whole configuration to all the services and configures them dynamically on the distributed system.

Let’s take the example of a state change in the configuration. The changed state is pushed to all the services in real time, so the configuration stays consistent across all of them.
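The effect described above can be sketched with a toy central store that notifies every subscribed service on each write. This is only a model of the end result, not of how Consul delivers changes; the names CentralKVStore, Service, and the key app/maintenance_mode are hypothetical.

```python
class CentralKVStore:
    """Toy central key-value store that pushes changes to subscribers."""

    def __init__(self):
        self._data = {}
        self._watchers = []

    def watch(self, callback):
        # Each service subscribes once and receives the current state first.
        self._watchers.append(callback)
        for key, value in self._data.items():
            callback(key, value)

    def put(self, key, value):
        # Store the value centrally, then push it to every subscriber.
        self._data[key] = value
        for callback in self._watchers:
            callback(key, value)


class Service:
    """A service that keeps a local copy of the shared configuration."""

    def __init__(self, name, store):
        self.name = name
        self.config = {}
        store.watch(self._on_change)

    def _on_change(self, key, value):
        self.config[key] = value


store = CentralKVStore()
store.put("app/maintenance_mode", "false")  # hypothetical configuration key
service_a = Service("A", store)
service_b = Service("B", store)
store.put("app/maintenance_mode", "true")  # one write, seen by both services
print(service_a.config == service_b.config)
```

A single write to the central store is enough; no service has to be reconfigured individually.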

Network segmentation in a monolith, its challenges in distributed systems, and Consul's solutions

Monolithic Applications

When we look at our classic monolithic architecture, the network is typically divided into three different zones.

The first zone in our network is publicly accessible. It carries the traffic coming to our application via the internet and reaching our load balancers.

The second zone carries the traffic from our load balancers to our application. It is mostly an internal network zone without direct public access.

The third zone is the closed network zone, primarily designated for data. This is considered to be an isolated zone.

Different network zones in typical application

Only the load balancers zone can reach into the application zone and only the application zone can reach into the data zone. It is a straightforward zoning system, simple to implement and manage.

Distributed Systems

The pattern changes drastically for distributed services.

Complex pattern of network traffic and routes across different services

There are multiple services within our application network zone itself. Each of these services talks to the others within this network, making for a complicated traffic pattern.

Challenges

  • The primary challenge is that the traffic does not follow any sequential flow, unlike in a monolithic architecture, where the flow was defined from the load balancers to the application and from the application to the data.

  • Depending on the access pattern we want to support, the traffic might come from different endpoints and reach different services.

Client essentially talks to each service within the application directly or indirectly

  • Having multiple services and the ability to support multiple endpoints allows us to deploy multiple service consumers and providers.

SOA demands control over trusted and untrusted sources of traffic

  • Controlling the flow of traffic and segmenting the network into groups or chunks becomes a bigger issue. It is also vital to have strict rules that guide how we partition the network based on who should be allowed to talk to whom.

Consul’s Solutions

Consul’s solution to the overall network segmentation challenge in distributed systems is to implement a service graph and mutual TLS.

Service-level policy enforcement to define traffic pattern and segmentation using Consul

Consul solves the problem of network segmentation by centrally managing the definition around who can talk to whom. Consul has a dedicated feature for this called Consul Connect.

Consul Connect enrolls the inter-service communication policies that we desire and implements them as part of the service graph. So, a policy might say service A can talk to service B, but B cannot talk to C, for example.

The bigger benefit of this is that it is not IP restricted; rather, it works at the service level. This makes it scalable. The policy is enforced on all instances of a service, and there is no hard-bound firewall rule specific to a service’s IP, which makes us independent of the scale of our distributed network.
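A minimal sketch of such a service-level policy follows. It is a toy model rather than Consul Connect's implementation: the ServiceGraph class, the deny-by-default choice, and the IP addresses are assumptions made for illustration. The point it shows is that the decision is keyed on service names, so adding instances requires no new rules.

```python
class ServiceGraph:
    """Toy service-level access policy: (source, destination) -> allowed?"""

    def __init__(self, default_allow=False):
        self._rules = {}
        self._default_allow = default_allow  # assumed default: deny

    def allow(self, source, destination):
        self._rules[(source, destination)] = True

    def deny(self, source, destination):
        self._rules[(source, destination)] = False

    def is_allowed(self, source, destination):
        # The decision depends only on service names, never on instance IPs.
        return self._rules.get((source, destination), self._default_allow)


graph = ServiceGraph()
graph.allow("service-a", "service-b")
graph.deny("service-b", "service-c")

# Any number of instances of service-a share the same rule.
instances_of_a = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]  # hypothetical IPs
decisions = {ip: graph.is_allowed("service-a", "service-b") for ip in instances_of_a}
print(decisions)
```

Compare this with an IP-based firewall, where each of the three instances would need its own rule.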

Consul Connect also handles service identity using the popular TLS protocol. It distributes the TLS certificate associated with a service.

These certificates help services securely identify each other. TLS also helps with secure communication between the services. This makes for a trusted network implementation.

Consul enforces TLS using an agent-based proxy attached to each service instance. This proxy acts as a sidecar. Using a proxy in this case spares us from making any changes to the code of the original service.

This brings the higher-level benefit of enforcing encryption of data in transit. Moreover, it assists with fulfilling the compliance requirements of laws around privacy and user identity.

Basic Architecture of Consul

Consul is a distributed and highly available system.

Consul is shipped as a single binary download for all popular platforms. The executable can run as a client as well as server.

Each node that provides services to Consul runs a Consul agent. Each of these agents talks to one or more Consul servers.

Basic Architecture of Consul

The Consul agent is responsible for health-checking the services on the node as well as the node itself. It is not responsible for service discovery or maintaining key/value data.

Consul servers are where data is stored and replicated.

Consul can run with a single server, but HashiCorp recommends running a set of 3 to 5 servers to avoid failures. As all the data is stored on the Consul server side, a failure with a single server could cause data loss.

In a multi-server cluster, the servers elect a leader among themselves. HashiCorp also recommends having a cluster of servers per datacenter.
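The 3-to-5 recommendation can be checked with a little arithmetic. The sketch below assumes a majority-based quorum, the rule used by consensus protocols such as Raft; the function names are hypothetical.

```python
def quorum_size(servers: int) -> int:
    """Smallest majority of a cluster with the given number of servers."""
    if servers < 1:
        raise ValueError("a cluster needs at least one server")
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can fail while a majority is still reachable."""
    return servers - quorum_size(servers)


# Columns: servers, quorum size, tolerated failures.
for n in (1, 3, 5):
    print(n, quorum_size(n), failure_tolerance(n))
```

Under this assumption a single server tolerates no failures, three servers tolerate one, and five tolerate two.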

During the discovery process, any service in search of another service can query the Consul servers or even the Consul agents. The Consul agents forward the queries to the Consul servers automatically.

Consul Agent sits on a node and talks to other agents on the network synchronizing all service-level information

If the query is cross-datacenter, the queries are forwarded by the Consul server to the remote Consul servers. The results from remote Consul servers are returned to the original Consul server.

Getting Started with Consul

This section is dedicated to closely looking at Consul as a tool, with some hands-on experience.

Download and Install

As discussed above, Consul ships as a single binary that can be downloaded from HashiCorp’s website or from the releases section of Consul’s GitHub repo.

The single binary can run as a Consul server or as a Consul client agent.

You can download Consul from the Download Consul page.

Various download options for Consul on different operating systems

We will download Consul on the command line using the link from the download page, unzip the downloaded zip file, and add the binary to PATH.

Use Consul

Once you unzip the compressed file and put the binary on your PATH, you can start the agent in development mode by running consul agent -dev.

Consul Members

While the agent is running, you can check for all the members in Consul’s network by running consul members in another terminal.

Given we only have one node running, it is treated as a server by default. You can designate an agent as a server by supplying -server as a command line parameter or server as a configuration parameter in Consul’s config.

The output of consul members is based on the gossip protocol and is eventually consistent.

Consul HTTP API

For a strongly consistent view of Consul’s agent network, we can use the HTTP API that Consul provides out of the box.
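As a hedged illustration, the Python sketch below builds the URL of the catalog nodes endpoint (/v1/catalog/nodes) and shows how it could be fetched with the standard library. The base address assumes a local development agent on port 8500, and the helper names are hypothetical. The fetch function is defined but not called, since it needs a running agent.

```python
import json
from urllib.request import urlopen

CONSUL_HTTP_ADDR = "http://localhost:8500"  # assumed local dev agent


def catalog_nodes_url(base=CONSUL_HTTP_ADDR):
    # /v1/catalog/nodes lists the nodes known to the catalog.
    return f"{base.rstrip('/')}/v1/catalog/nodes"


def fetch_catalog_nodes(base=CONSUL_HTTP_ADDR, timeout=5):
    # Requires a running agent; not called here so the sketch has no side effects.
    with urlopen(catalog_nodes_url(base), timeout=timeout) as response:
        return json.load(response)


print(catalog_nodes_url())
```

With an agent running, calling fetch_catalog_nodes() should return the decoded JSON list of nodes.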

Consul DNS Interface

Consul also provides a DNS interface to query nodes. It serves DNS on port 8600 by default. That port is configurable.

Registering a service on Consul can be achieved either by writing a service definition or by sending a request over an appropriate HTTP API.

Consul Service Definition

A service definition is one of the popular ways of registering a service. Let’s take a look at one such service definition.

To host our service definitions, we will add a configuration directory, conventionally named consul.d - the ‘.d’ indicates that the directory holds a set of configuration files, instead of a single config file named consul.

Write the service definition for a fictitious Django web application running on port 80 on localhost.
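A sketch of what such a definition could look like, generated from Python so the JSON is guaranteed to be well-formed. The service name web and port 80 come from the text; the tag django, the file name web.json, and the helper function are assumptions for illustration. The file is written under a temporary directory here rather than a real consul.d.

```python
import json
import tempfile
from pathlib import Path

# The name and port follow the text; the tag is a hypothetical example.
service_definition = {
    "service": {
        "name": "web",
        "tags": ["django"],
        "port": 80,
    }
}


def write_service_definition(config_dir, definition, filename="web.json"):
    # Create the configuration directory if needed and write one JSON file into it.
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / filename
    path.write_text(json.dumps(definition, indent=2) + "\n")
    return path


# Written under a temporary directory here; a real setup would use consul.d.
demo_dir = Path(tempfile.mkdtemp()) / "consul.d"
written = write_service_definition(demo_dir, service_definition)
print(written.read_text())
```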

To make our Consul agent aware of this service definition, we can supply the configuration directory to it with the -config-dir option.

The relevant information in the log is the sync statements related to the “web” service. The Consul agent has accepted our config and synced it across all nodes, in this case one node.

Consul DNS Service Query

We can query the service with DNS, as we did with the node.

We can also query DNS for service records, which give us more information about service specifics like port and node.

You can also use the tag that we supplied in the service definition to query for instances with a specific tag.
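The names used in these DNS lookups follow a fixed pattern, which the small Python helpers below construct. The helper names, the node name my-node, and the tag django are hypothetical; the default domain consul is assumed.

```python
def node_dns_name(node, domain="consul"):
    # Node lookups use <node>.node.<domain>.
    return f"{node}.node.{domain}"


def service_dns_name(service, tag=None, domain="consul"):
    # Service lookups use [<tag>.]<service>.service.<domain>.
    prefix = f"{tag}." if tag else ""
    return f"{prefix}{service}.service.{domain}"


print(node_dns_name("my-node"))               # hypothetical node name
print(service_dns_name("web"))
print(service_dns_name("web", tag="django"))  # hypothetical tag
```

Any of these names can then be handed to a DNS client pointed at the agent's DNS port, requesting SRV records when the port is needed as well.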

Consul Service Catalog Over HTTP API

The service can similarly be queried using the HTTP API.

We can also filter the services based on health checks in the HTTP API.
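As a sketch, the two kinds of query map onto the catalog and health endpoints of the HTTP API. The helpers below only build the URLs; the base address assumes a local development agent, and the function names are hypothetical.

```python
from urllib.parse import quote, urlencode

CONSUL_HTTP_ADDR = "http://localhost:8500"  # assumed local dev agent


def catalog_service_url(service, base=CONSUL_HTTP_ADDR):
    # Every registered instance of the service, regardless of health.
    return f"{base}/v1/catalog/service/{quote(service, safe='')}"


def health_service_url(service, passing_only=False, base=CONSUL_HTTP_ADDR):
    # With passing=true only instances whose checks pass are returned.
    url = f"{base}/v1/health/service/{quote(service, safe='')}"
    params = {"passing": "true"} if passing_only else {}
    return f"{url}?{urlencode(params)}" if params else url


print(catalog_service_url("web"))
print(health_service_url("web", passing_only=True))
```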

Update Consul Service Definition

If you wish to update the service definition on a running Consul agent, it is very simple.

There are three ways to achieve this: send a SIGHUP signal to the process, run consul reload (which internally sends SIGHUP on the node), or call the HTTP API dedicated to service definition updates, which internally reloads the agent configuration.

The first option sends SIGHUP to the agent process, which has process ID 21289 in this example. The second reloads Consul, which reports "Configuration reload triggered". Either way, the reload should then show up in your Consul log.

Consul Web UI

Consul provides a beautiful web user interface out-of-the-box. You can access it on port 8500.

In this case at http://localhost:8500. Let’s look at some of the screens.

The home page of the Consul UI is the services page, with all the relevant information related to the Consul agent and the web service check.

Exploring defined services on Consul Web UI

Going into further details on a given service, we get a service dashboard with all the nodes and their health for that service.

Exploring node-level information for each service on Consul Web UI

On each individual node, we can look at the health-checks, services, and sessions.

Exploring node-specific health-check information, services information, and sessions information on Consul Web UI

Overall, Consul Web UI is really impressive and a great companion for the command line tools that Consul provides.

How is Consul Different From Zookeeper, doozerd, and etcd?

Consul has first-class support for service discovery, health checks, key-value storage, and multiple datacenters.

Zookeeper, doozerd, and etcd are primarily based on a key-value store mechanism. To achieve anything beyond key-value storage, they need additional tools, libraries, and custom development around them.

All these tools, including Consul, use server nodes that require a quorum of nodes to operate and are strongly consistent.

More or less, they all have similar semantics for key/value store management.

These semantics are attractive for building service discovery systems. Consul has out-of-the-box support for service discovery, which the other systems lack.

A service discovery system also requires a way to perform health checks, as it is important to check a service’s health before allowing others to discover it. Some systems use heartbeats with periodic updates and a TTL. The work for these health checks grows with scale and requires fixed infrastructure. The failure detection window is at least as long as the TTL.
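The cost of TTL-based heartbeats can be seen in a toy model. The sketch below is not taken from any of these systems: the TTLRegistry class, the 30-second TTL, and the timestamps are assumptions, and time is passed in explicitly to keep the example deterministic.

```python
class TTLRegistry:
    """Toy heartbeat registry: an entry stays alive for `ttl` seconds after each beat."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._last_beat = {}

    def heartbeat(self, service, now):
        # Record the time of the most recent heartbeat.
        self._last_beat[service] = now

    def is_alive(self, service, now):
        # A service counts as alive until its TTL expires without a new beat.
        last = self._last_beat.get(service)
        return last is not None and now - last < self.ttl


ttl_registry = TTLRegistry(ttl=30)
ttl_registry.heartbeat("web", now=0)  # suppose the service crashes right after this beat
print(ttl_registry.is_alive("web", now=29))  # still reported alive
print(ttl_registry.is_alive("web", now=30))  # the failure becomes visible only now
```

In this model, a service that crashes just after a heartbeat keeps being handed out to callers for almost the whole TTL.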

Unlike Zookeeper, Consul has client agents sitting on each node in the cluster, talking to each other in a gossip pool. This allows the clients to be thin, gives better health-checking ability, reduces client-side complexity, and solves debugging challenges.

Also, Consul provides native HTTP and DNS interfaces to perform system-wide, node-wide, or service-wide operations. Other systems need these to be developed around the exposed primitives.

Consul’s website gives a good commentary on comparisons between Consul and other tools.

Open Source Tools Around HashiCorp Consul

HashiCorp and the community have built several tools around Consul.

These Consul tools are created and managed by the dedicated engineers at HashiCorp:

Consul Template (3.3k stars) - Generic template rendering and notifications with Consul. Template rendering, notifier, and supervisor for @hashicorp Consul and Vault data. It provides a convenient way to populate values from Consul into the file system using the consul-template daemon.

Envconsul (1.2k stars) - Read and set environmental variables for processes from Consul. Envconsul provides a convenient way to launch a subprocess with environment variables populated from HashiCorp Consul and Vault.

Consul Replicate (360 stars) - Consul cross-DC KV replication daemon. This project provides a convenient way to replicate values from one Consul datacenter to another using the consul-replicate daemon.

Consul Migrate - Data migration tool to handle Consul upgrades to 0.5.1+.

The community around Consul has also built several tools to help with registering services and managing service configuration. I would like to mention some of the popular and well-maintained ones -

Confd (5.9k stars) - Manage local application configuration files using templates and data from etcd or consul.

Fabio (5.4k stars) - Fabio is a fast, modern, zero-conf load balancing HTTP(S) and TCP router for deploying applications managed by consul. Register your services in consul, provide a health check and fabio will start routing traffic to them. No configuration required.

Registrator (3.9k stars) - Service registry bridge for Docker with pluggable adapters. Registrator automatically registers and deregisters services for any Docker container by inspecting containers as they come online.

Hashi-UI (871 stars) - A modern user interface for HashiCorp Consul & Nomad.

Git2consul (594 stars) - Mirrors the contents of a git repository into Consul KVs. git2consul takes one or many git repositories and mirrors them into Consul KVs. The goal is for organizations of any size to use git as the backing store, audit trail, and access control mechanism for configuration changes and Consul as the delivery mechanism.

Spring-cloud-consul (503 stars) - This project provides Consul integrations for Spring Boot apps through autoconfiguration and binding to the Spring Environment and other Spring programming model idioms. With a few simple annotations, you can quickly enable and configure the common patterns inside your application and build large distributed systems with Consul based components.

Crypt (453 stars) - Store and retrieve encrypted configs from etcd or consul.

Mesos-Consul (344 stars) - Mesos to Consul bridge for service discovery. Mesos-consul automatically registers/deregisters services run as Mesos tasks.

Consul-cli (228 stars) - Command line interface to Consul HTTP API.

Conclusion

Distributed systems are not easy to build and set up. Maintaining them and keeping them running is another piece of work altogether. HashiCorp Consul makes life easier for engineers facing such challenges.

As we went through different aspects of Consul, we learnt how straightforward it becomes to develop and deploy applications with a distributed or microservices architecture.

Ease of use, excellent documentation, robust production-ready code, and community backing make adopting HashiCorp Consul and introducing it into our technology stack fairly easy.

We hope it was an informative ride on the journey of Consul. Our journey has not yet ended; this was just the first half. We will meet you again in the second part of this article, which walks through a practical example close to real-life applications.

Let us know what more you would like to hear from us, or if you have any questions around the topic; we will be more than happy to answer them.

Reference


About the Author

Pranav is a Technical Lead at Velotio. He is a full-stack developer with extensive product development experience and specializes in deploying high-performance, scalable applications. He primarily leads the backend development on Python/Django and Ruby/Rails and frontend development on ReactJS and jQuery. He has keen interest in hunting new technologies, analytics, and solving puzzles.