Traffic Shifting

There are a couple of terms and resources we have to talk about to describe how traffic shifting works and how we can route the traffic:

VirtualService
- Configures routing rules for each service.
DestinationRule
- Configures how to reach the target endpoint and is applied after the routing decision has been made.

We will talk about multiple resources in Istio. With the VirtualService, we can configure routing rules for each service, where we can also define traffic splitting, failure or delay injection, and traffic mirroring.

The DestinationRule resources contain any rules applied after making routing decisions. They configure how to reach the target service. Here is where we could configure circuit breakers or TLS settings.

We’ll also talk about Gateways, which configure an Envoy proxy running as a load balancer at the edge of our mesh. The Gateway resource controls which ports and protocols Envoy listens on and which certificates it presents.

For example: Istio gets deployed with an ingress gateway that exposes Kubernetes services to the outside world.
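
As a rough sketch, a Gateway bound to the ingress gateway could look like the following. The resource name, the external host, and the certificate secret are placeholders made up for illustration. The selector assumes the ingress gateway pods carry the label istio: ingressgateway, as in a default Istio installation, and the apiVersion may need adjusting to the installed Istio version.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: example-gateway              # hypothetical name
spec:
  selector:
    istio: ingressgateway            # assumed label on the ingress gateway pods
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE                   # standard one-way TLS
      credentialName: example-cert   # hypothetical secret holding the certificate
    hosts:
    - "app.example.com"              # hypothetical external host
```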

Let’s consider how the call is made in Kubernetes from one service to another. We have a Kubernetes service called Service A, a corresponding deployment, and the Pods created as part of the deployment. We have the same for Service B, a Kubernetes service and a deployment.

The important thing here is the labels set on the Pods and the matching label selector on the Kubernetes service. The selector is what tells Kubernetes which pods belong to the logical service.

Behind serviceb.example.cluster.local, the Kubernetes service maintains a collection of endpoints: the IP addresses of all pods whose labels match the selector set on the Kubernetes service.

The Kubernetes service automatically maintains the list of endpoints/IP addresses. It watches if any pods with those labels are created or deleted and updates the list accordingly.
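
To make the label matching concrete, here is a minimal sketch of what the Kubernetes service for Service B might look like. The service name, namespace, and port numbers are assumptions chosen to line up with the host name used in this section. Only the app: svcB selector comes from the text.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: serviceb          # assumed name
  namespace: example      # assumed namespace
spec:
  selector:
    app: svcB             # pods with this label become endpoints of the service
  ports:
  - name: http
    port: 80              # assumed service port
    targetPort: 8080      # assumed container port
```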

In step 1, we make the call. The Envoy proxy intercepts the call and routes to serviceb.example.cluster.local. The service routes the call to one of the Pod endpoints in its collection labeled with app: svcB.

When we only have one version of Service B, everything works smoothly. However, what happens when we have multiple versions of Service B? The solution lies in an Istio concept known as subsets, which leverage Kubernetes labels.

Assume we have Service B v1 and Service B v2 in the cluster. We’d represent that with two separate deployments. The v1 deployment would have a label called version: v1, and the v2 deployment would have a label called version: v2.

However, we’d only have a single Kubernetes service with the same label as before, app: svcB. If we requested Service B now, the call would end up on one of the pods labeled with app: svcB, which could be a pod labeled v1 or a pod labeled v2.
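
The v1 deployment could be sketched as follows. The v2 deployment would be identical except for the name, the version label, and the image. The deployment name, image, and port are hypothetical. Only the two labels come from the text.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: svcb-v1                  # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: svcB
      version: v1
  template:
    metadata:
      labels:
        app: svcB                # matched by the Kubernetes service selector
        version: v1              # distinguishes this version from v2
    spec:
      containers:
      - name: svcb
        image: example/svcb:v1   # hypothetical image
        ports:
        - containerPort: 8080    # assumed container port
```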

We could create multiple Kubernetes services, one for svcB-v1 and one for svcB-v2. However, that’s not practical because, from SvcA, we’d have to be explicit about which service we want to call. We always want to call the same Kubernetes service and decide on the version using configuration.

In the DestinationRule resources, we can define subsets to describe services or subsets of services. For instance, in this case, we define two subsets: v1 and v2, along with their corresponding labels. Defining subsets is the initial step toward routing traffic and describing our services.
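
A DestinationRule that declares these two subsets might look like the sketch below. The resource name is a placeholder, the host reuses the name from this section, and the apiVersion may need adjusting to the installed Istio version.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: svcb                     # hypothetical name
spec:
  host: svcB.example.cluster.local
  subsets:
  - name: v1
    labels:
      version: v1                # selects pods labeled version: v1
  - name: v2
    labels:
      version: v2                # selects pods labeled version: v2
```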

Then, we can use the VirtualService resource to determine how we want to split the traffic.

For example: We could create two identical destinations for Service B but add subsets and weights to each for splitting the traffic.

This configuration tells Envoy to split the traffic destined for Service B so that 70% of the traffic goes to the subset called v1 and 30% to the subset called v2.
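
A 70/30 split of that shape could be written roughly as follows. This is a sketch: the resource name is a placeholder, and it assumes a DestinationRule defining the v1 and v2 subsets already exists for the same host.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: svcb                     # hypothetical name
spec:
  hosts:
  - svcB.example.cluster.local
  http:
  - route:
    - destination:
        host: svcB.example.cluster.local
        subset: v1
      weight: 70                 # 70% of requests go to subset v1
    - destination:
        host: svcB.example.cluster.local
        subset: v2
      weight: 30                 # 30% of requests go to subset v2
```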

Assume we created the DestinationRule and VirtualService resources that define the two subsets, v1 and v2, and split the traffic 70/30.

Service A does not need any changes. We still make a call to serviceb.example.cluster.local. However, we now have the DestinationRule with the subsets and the routing rule with the traffic split defined. Envoy will route 70% of all traffic to subset v1 and 30% to subset v2.

Suppose the weighted selection picks subset v1. The labels assigned to subset v1, specifically version: v1, determine which pods to select. The label version: v1 is applied in addition to the app: svcB label, so only pods carrying both labels take part in load balancing for this call. The call then reaches one of those pods through the Envoy proxy running next to Service B.

Similarly, when the 30% case happens, the v2 labels get applied, and the call ends on one of the pods running the v2 version of Service B.

Destination Rule

To recap, DestinationRule defines policies that are applied to traffic for certain services. In the previous examples, we had a destination rule for Service B where we declared two subsets. Subsets are a way to configure different versions of the service. They use labels to select pods that are part of the subset.

The other things we can configure in a destination rule are:

Load balancer settings,
Connection pool settings,
Outlier detection, and
TLS settings.
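
As a small illustration of the first and last items, a traffic policy could set the load-balancing algorithm and the TLS mode like this. The resource name and the chosen values are assumptions for the example, not recommendations.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: svcb-policy              # hypothetical name
spec:
  host: svcB.example.cluster.local
  trafficPolicy:
    loadBalancer:
      simple: ROUND_ROBIN        # example algorithm
    tls:
      mode: ISTIO_MUTUAL         # mutual TLS using Istio-issued certificates
```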

By adjusting the connection pool settings, we can control the volume of traffic directed toward the service. These settings enable us to establish circuit breakers for TCP or HTTP connections and set the maximum number of connections or requests, the maximum number of retries, and the connection timeout.
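
A connection pool configuration might look like the sketch below. All numbers are illustrative placeholders and should be tuned to the actual service. The resource name is hypothetical.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: svcb-pool                  # hypothetical name
spec:
  host: svcB.example.cluster.local
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100        # cap on connections to the service
        connectTimeout: 30ms       # TCP connection timeout
      http:
        http1MaxPendingRequests: 10   # requests allowed to queue for a connection
        http2MaxRequests: 100         # cap on active requests
        maxRequestsPerConnection: 10
        maxRetries: 3                 # cap on outstanding retries
```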

With outlier detection, we can control how unhealthy hosts get evicted from the load-balancing pool. Outlier detection is a circuit breaker implementation that tracks the status of each host. If a host starts failing, for example by returning HTTP 5xx status codes, timing out, or refusing connections, it is temporarily removed from the load-balancing pool so that no requests are routed to it.

For example: If there are more than five consecutive 500 errors from a host, consider it unhealthy and eject it from the load balancing pool.

Then, we need to decide how long to eject it. The issue might be only temporary, and the host might recover later. We may eject it for 5 minutes or another specified time.

We could also configure the maximum number of hosts we want to eject. Let’s say we only want to eject 50% of the total number of hosts and always leave a minimum of 50% running, regardless of whether they are failing.

Another setting is minHealthPercent. With this setting, outlier detection stays enabled only as long as at least a certain percentage of hosts are healthy. When the percentage of healthy hosts falls under this threshold, outlier detection is disabled, and load balancing happens across all hosts (healthy or unhealthy).
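
Putting the outlier detection settings together, a configuration along these lines would express the example above. The resource name, the interval, and the minHealthPercent value are assumptions picked for illustration. Also note that baseEjectionTime is a base duration: the actual ejection time grows with the number of times a host has been ejected.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: svcb-outlier               # hypothetical name
spec:
  host: svcB.example.cluster.local
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 5      # eject after five consecutive 5xx errors
      interval: 10s                # assumed time between ejection sweeps
      baseEjectionTime: 5m         # base ejection duration
      maxEjectionPercent: 50       # never eject more than half of the hosts
      minHealthPercent: 40         # assumed threshold below which detection is disabled
```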

Virtual Service

With the VirtualService resource, we can configure how to route traffic. We’ve already seen how to split traffic based on the percentage. Inside this resource, we can also define routing rules and matching criteria for traffic. It works by specifying how to match specific traffic by some request properties and where we want the traffic to be routed.

We can redirect traffic to another service, rewrite the URIs and authority headers, or mirror traffic to a different service.
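
As one possible illustration, the spec excerpt below rewrites a URI prefix and mirrors the traffic to another subset. The /legacy and /api paths are made-up examples, and mirroring to v2 is just one way to use the feature.

```yaml
hosts:
  - svcB.example.cluster.local
http:
- match:
  - uri:
      prefix: /legacy              # hypothetical path
  rewrite:
    uri: /api                      # hypothetical replacement for the matched prefix
  route:
  - destination:
      host: svcB.example.cluster.local
      subset: v1
  mirror:                          # send a copy of each request to v2
    host: svcB.example.cluster.local
    subset: v2
  mirrorPercentage:
    value: 100.0                   # mirror all matched requests
```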

We can also configure timeouts and retries, and inject faults through our VirtualService to test our services.

For example: We could inject 5-second delays for 40% of the traffic going to Service B. Or we could have 20% of the requests to Service B fail with an HTTP 404 response. These features are helpful in testing service resiliency.
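
The spec excerpt below sketches both faults from the example in a single route. The delay and abort blocks are independent, so either one can be used on its own. Routing to the v1 subset is an assumption made to keep the excerpt complete.

```yaml
hosts:
  - svcB.example.cluster.local
http:
- fault:
    delay:
      percentage:
        value: 40                  # delay 40% of requests
      fixedDelay: 5s
    abort:
      percentage:
        value: 20                  # abort 20% of requests
      httpStatus: 404
  route:
  - destination:
      host: svcB.example.cluster.local
      subset: v1                   # assumed subset
```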

Another feature is configuring Cross-Origin Resource Sharing (CORS) policies for a service. We could set the origins, methods, or other CORS settings for services.
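
A CORS policy could be attached to a route roughly like this. The origin, methods, header, and max age are placeholder values for illustration.

```yaml
hosts:
  - svcB.example.cluster.local
http:
- corsPolicy:
    allowOrigins:
    - exact: https://example.com   # hypothetical allowed origin
    allowMethods:
    - GET
    - POST
    allowHeaders:
    - content-type
    maxAge: 24h                    # how long browsers may cache the preflight result
  route:
  - destination:
      host: svcB.example.cluster.local
```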

Finally, we can manipulate both request and response headers. We can add, set, or remove specific headers on requests and responses.
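
Header manipulation might be sketched as follows. All header names and values here are hypothetical and only show where the set, add, and remove operations go.

```yaml
hosts:
  - svcB.example.cluster.local
http:
- headers:
    request:
      set:
        x-request-source: mesh     # hypothetical header, overwritten if present
      remove:
      - x-internal-token           # hypothetical header stripped from requests
    response:
      add:
        x-served-by: svcB          # hypothetical header appended to responses
  route:
  - destination:
      host: svcB.example.cluster.local
```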

Here’s an example of splitting traffic based on the user-agent. If the user-agent header contains the word Firefox, the traffic goes to the v1 subset of Service B. Otherwise, if it’s any other user-agent, it will go to v2.

Below is how we can write that configuration. First, we’d match the user-agent header. We could use a regular expression or an exact string to match the value. Then, we define the destination and the subset. In our case, the v1 subset of Service B is used.

If the user-agent header doesn’t match, we route to subset v2. Multiple match conditions can be combined using AND/OR semantics, which are explained below.

hosts:
  - svcB.example.cluster.local
http:
- match:
  - headers:
      user-agent:
        regex: ".*Firefox.*"
  route:
  - destination:
      host: svcB.example.cluster.local
      subset: v1
- route:
  - destination:
      host: svcB.example.cluster.local
      subset: v2

In this example, the VirtualService checks the value of the x-debug header. If the value matches the string dev AND the request URI starts with /api/debug, we’ll route to the v1 subset.

hosts:
  - svcB.example.cluster.local
http:
- match:
  - headers:
      x-debug:
        exact: dev
    uri:
      prefix: /api/debug
  route:
  - destination:
      host: svcB.example.cluster.local
      subset: v1
- route:
  - destination:
      host: svcB.example.cluster.local
      subset: v2

If we wanted to implement the OR semantics and route to v1 when either the header matches that value or the URI prefix is /api/debug, we would add another match block. In the previous example, there was a single match block. Now we have two of them: one for headers and another for the URI.

hosts:
  - svcB.example.cluster.local
http:
- match:
  - headers:
      x-debug:
        exact: dev
  - uri:
      prefix: /api/debug
  route:
  - destination:
      host: svcB.example.cluster.local
      subset: v1
- route:
  - destination:
      host: svcB.example.cluster.local
      subset: v2

Here’s an example of how to set up retries and timeouts. We use a percentage-based traffic split and assign a 5-second timeout to the route. The timeout field belongs to the HTTP route rather than to an individual destination, so it covers both subsets; giving v1 and v2 different timeouts would require separate routes. Additionally, we establish a retry policy for all requests to Service B, specifying three retries with a 2-second timeout per try and only retrying if there’s a connection failure.

hosts:
  - svcB.example.cluster.local
http:
- route:
  - destination:
      host: svcB.example.cluster.local
      subset: v1
    weight: 30
  - destination:
      host: svcB.example.cluster.local
      subset: v2
    weight: 70
  # timeout is set per HTTP route, not per destination
  timeout: 5s
  retries:
    attempts: 3
    perTryTimeout: 2s
    retryOn: connect-failure

Next

In the next lab, we will show how to use Istio’s traffic management features to upgrade the customers service with zero downtime.