In my previous post, I talked about why I used to have a love/hate relationship with microservices. If you’ve read the post, you’ll remember that Kubernetes helped me solve a number of challenges I was having with microservices. I outlined choice, complexity, scaling, and deployments as challenges that Kubernetes (along with Docker and Helm) helped solve.

Now I have a new challenge, and this time it’s with Kubernetes itself. Don’t get me wrong, I still love Kubernetes. But it wasn’t until I started running load tests and simulating scale that I noticed the gaps. For all of its great features, Kubernetes leaves a few areas largely unaddressed. Fortunately, there’s an open source project that fills those gaps well: Istio.

Connect, secure, control, observe

Istio is a joint project launched by IBM, Google, and Lyft to connect, secure, control, and observe services, particularly in a Kubernetes environment. Istio installs on top of your existing Kubernetes clusters with a single kubectl command. It takes advantage of a sidecar model, which means that every Kubernetes pod you deploy with sidecar injection enabled gets an Envoy proxy container that handles all the traffic in and out of that pod. In addition, Istio installs top-level components in your cluster that help manage the various capabilities that I’ll outline in this post.
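To make the sidecar model a bit more concrete, here is a minimal sketch of how automatic injection is commonly switched on: you label a namespace, and Istio adds the Envoy container to pods created in it. The namespace name `my-app` is a placeholder I made up for illustration, and the sketch assumes the automatic sidecar injector is installed in your cluster.

```yaml
# Hypothetical namespace; the label is the only part Istio looks at.
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  labels:
    istio-injection: enabled   # ask the sidecar injector to add Envoy to new pods here
```

Pods that were already running before the label was applied keep running without a sidecar until they are recreated.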

There are three major ways that Istio complements Kubernetes:

  • Service introspection
  • Application management at scale
  • Hybrid deployments

Service introspection

Istio offers a number of tools to better enable developers to delve into their running containers. Let’s take a look at a few of them:

Metrics capabilities

When you install Istio, you also automatically install two open source utilities for collecting and displaying application metrics. The first is Prometheus, an open source tool for collecting and storing metrics. A great advantage of Prometheus is that it’s a part of the [Cloud Native Computing Foundation (CNCF)](https://www.cncf.io), which means it’s being improved iteratively by the community for use with Kubernetes and container-based projects.

The second tool is Grafana. I’d bet that every monitoring or devops engineer has heard of this tool, as it’s one of the most popular open platforms for displaying metrics. Although Grafana is not itself a CNCF-hosted project, it is widely used alongside CNCF projects such as Prometheus.

Together, Prometheus and Grafana enable Istio to effectively collect and display metrics for your running application. The Istio dashboard comes with the following views:

  • Mesh Summary: HTTP/gRPC and TCP workload information.
  • Individual Services: Request- and response-time metrics for each service configured with Istio.
  • Individual Workloads: Similar to the service view, but shows metrics for the whole workload rather than an individual service.

Istio service dashboard

Service graph

Microservices can become far more complex when you start to scale. Everything might seem simple when you have a small number of microservices, but as a product grows, so does the number of services. With additional load, you’ll need to scale out the number of instances, and this can quickly lead to a confusing mesh of interdependent services.

To help you visualize the “spaghetti,” Istio includes a service graph capability so you can see exactly which services are running and the requests they make to one another.

Istio hybrid architecture diagram

You can also use a more robust service like Kiali, which shows you not only the service graph, but also the request traffic, success rate, latency, and more. This kind of service can be really helpful for identifying bottlenecks in your application.

Your choice of solutions

This is just the tip of the iceberg: Istio works with all kinds of metrics solutions. For example, if you prefer to use Datadog, it has integrations that work with Istio. There are many monitoring solutions out there, and with Istio you have the freedom to choose the one you prefer.

Application management at scale

When moving to scale, microservices can become unruly to manage, even with Docker and Kubernetes driving your workloads. Let’s take a look at some of the management capabilities that come with Istio.

Traffic management

Istio enables developers to manage how traffic flows within their cluster. This is an important part of iterative application development, and it makes for a better overall experience for users. For example, when companies like Uber or Facebook create a new release, they do staged rollouts. Instead of pushing all users to the new version, they’ll route some users to the new version and iron out the bugs before rolling it out to more users. Eventually, all users will be hitting the new version and the old version will be deprecated.

Istio makes it easy to route traffic because it takes advantage of Kubernetes resources for configuration. This decouples the burden of traffic management from the application code, enabling operations engineers to focus only on operations.

Let’s say, for example, that I have two versions of a microservice: service_v1 and service_v2. When rolling out service_v2 for the very first time, I only want 10% of my users to hit it. Here’s a simplified manifest file that I could deploy to Kubernetes with Istio to make this happen:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
...
  http:
  - route:
    - destination:
        host: service
        subset: v1
      weight: 90
    - destination:
        host: service
        subset: v2
      weight: 10

With the weights set to 90 and 10, Istio routes 10% of traffic to v2 while the remaining 90% continues to hit v1.
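The `v1` and `v2` subsets referenced in that manifest are not defined by the VirtualService itself. As a rough sketch, assuming the pods of each version carry a `version` label (my assumption, not something shown in the manifest above), a DestinationRule along these lines could define them:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: service            # illustrative name, chosen to match the host above
spec:
  host: service            # same host the VirtualService routes to
  subsets:
  - name: v1
    labels:
      version: v1          # assumed pod label for the old version
  - name: v2
    labels:
      version: v2          # assumed pod label for the new version
```

Shifting more users to v2 is then only a matter of editing the two weights in the VirtualService; the application code and the DestinationRule stay untouched.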

Fault injection

Although it might seem counter-productive to intentionally implement faults in your application, it’s vital to ensure that when failures do occur, your services do not respond unpredictably. To better enable this paradigm, Istio enables you to do protocol-specific fault injection at the network level.

There are two key types of failures that Istio can inject:

  • Delays: A delay in responding to a request, often caused by overloaded services or abnormally high network traffic.
  • Aborts: A connection failure or a faulty response; for example, a 400 or 404 error code.

Here’s an example of how you can inject a fault so that 10% of requests get a 5-second delay:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - fault:
      delay:
        percent: 10
        fixedDelay: 5s
    route:
    - destination:
        host: ratings
        subset: v1

Note that this example is straight from the Istio docs!
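For the other failure type, aborts, the manifest looks very similar. The following is my own sketch rather than something taken from the docs: it reuses the `ratings` service from the delay example and returns an HTTP 400 for roughly 10% of requests. It assumes an Istio version that supports the `percentage` field on aborts.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - fault:
      abort:
        percentage:
          value: 10.0      # share of requests to abort, in percent
        httpStatus: 400    # status code returned instead of calling the service
    route:
    - destination:
        host: ratings
        subset: v1
```

Because the fault lives in the routing configuration, removing it again is a config change only; the service itself is never modified.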

Additional features

There are many more Istio features that help you manage your applications when moving to scale. For example:

  • Rate limiting
  • Access control
  • Canary releases
  • Security with mTLS
  • Service discovery
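To give a feel for one of these, here is a hedged sketch of turning on strict mutual TLS for a single namespace. It assumes a newer Istio release that ships the `security.istio.io` API (older releases configured this differently), and the namespace `my-app` is a placeholder of mine.

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: my-app        # hypothetical namespace
spec:
  mtls:
    mode: STRICT           # sidecars in this namespace only accept mTLS traffic
```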

Hybrid deployments

One of Istio’s most valuable features is support for creating a service mesh that spans multiple environments or clusters. When businesses are transforming to a cloud-native architecture, they often continue to host services on-prem. The migration is generally a slow process rather than a “big bang.” As services get moved to the cloud, they still need to access legacy endpoints.

For example, with Istio service mesh capabilities, you can host an application that has its individual microservices running on both a local Kubernetes cluster and on a cluster on IBM Cloud Kubernetes Service. Your application continues to work as expected even though the individual services live in different environments.

Istio supports managing all of these workloads together by creating a central control plane across a hybrid environment. I’ll quickly summarize the technology behind it: Istio helps set up authentication so that your Istio-enabled environments are able to communicate with one another. Because all traffic flows through the Istio ingress and egress gateways, your microservices can reach one another without any extra application code for authentication. Your microservices continue to work as-is because they do not need to be aware that they are communicating with another cluster.
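A full multi-cluster setup is beyond what fits in a snippet, but the simpler case mentioned earlier, a meshed service that still has to call a legacy endpoint, can be sketched with a ServiceEntry. The hostname `billing.legacy.example.com` and the resource name are invented for illustration, and this shows only one way such an endpoint might be registered with the mesh.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: legacy-billing             # hypothetical name
spec:
  hosts:
  - billing.legacy.example.com     # hypothetical on-prem endpoint
  location: MESH_EXTERNAL          # the endpoint lives outside the mesh
  ports:
  - number: 443
    name: tls
    protocol: TLS
  resolution: DNS                  # resolve the host through DNS
```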

To see an example and to try out Istio hybrid cloud deployments yourself, check out Using Istio across private and public clusters, an IBM Developer code pattern that uses the architecture outlined in this diagram:

Architecture diagram for Istio code pattern

What’s next

Kubernetes and Istio make for a powerful combination. Kubernetes tackles multiple fundamental issues around microservices, and Istio provides the tools for introspection, management, and hybrid connectivity. I’ve outlined a few of Istio’s awesome capabilities in this post, but there are many others that I haven’t touched on. To learn more about Istio, be sure to check out these great resources:

I hope you found this introduction to Istio useful. In my next post, I’ll dive into a new open source project created by engineers from Google, IBM, and other organizations. Knative will help you build and serve workloads on Kubernetes with a focus on serverless.