
Red Hat OpenShift on IBM Z: Tune your network performance with RFS

In modern cloud platforms like Red Hat OpenShift that run many different (micro)services, network performance is critical. Various tunings have shown great potential for increasing the network performance of OpenShift on IBM Z, but they still leave room for improvement. In many scenarios, the next generation of Receive Packet Steering (RPS), namely Receive Flow Steering (RFS), further helps to reduce network latency and increase throughput. This tutorial shows you how to set up RFS in your cluster, and offers guidance on when you should use it.

Prerequisites

To complete this tutorial, you need OpenShift Container Platform (OCP) version 4.2 or later, installed and running.

Estimated time

It should take you about an hour to complete this tutorial.

What is RFS?

Receive Flow Steering (RFS) improves upon Receive Packet Steering (RPS) by further reducing network latency. RFS builds on RPS and improves the efficiency of packet processing by increasing the CPU cache hit rate. To do so, it steers each packet to the CPU where the application consuming that flow is running, and it also takes queue length into account when choosing that CPU, so that cache hits are more likely to occur. As a result, the CPU cache is invalidated less often and fewer cycles are spent rebuilding it, which can reduce packet processing time.
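The difference between the two steering strategies can be illustrated with a small toy model. This is only a sketch of the idea, not the kernel's implementation: the names flow_hash, rps_pick_cpu and RfsTable are hypothetical, and the queue-length check mentioned above is left out for brevity.

```python
# Toy model of CPU selection; an illustration, not the kernel's actual code.
import zlib

NUM_CPUS = 4  # assumed number of CPUs available for packet processing


def flow_hash(flow):
    """Stable hash of a (src_ip, src_port, dst_ip, dst_port) tuple."""
    return zlib.crc32(repr(flow).encode())


def rps_pick_cpu(flow):
    """RPS: the target CPU depends only on the flow hash."""
    return flow_hash(flow) % NUM_CPUS


class RfsTable:
    """RFS: remember on which CPU the consuming application last ran."""

    def __init__(self, entries=8192):
        self.entries = entries
        self.table = {}  # table slot -> CPU number

    def record_consumer(self, flow, cpu):
        # Conceptually happens when the application reads from the socket.
        self.table[flow_hash(flow) % self.entries] = cpu

    def pick_cpu(self, flow):
        # Fall back to plain RPS while the flow has no entry yet.
        slot = flow_hash(flow) % self.entries
        return self.table.get(slot, rps_pick_cpu(flow))


flow = ("10.0.0.1", 40000, "10.0.0.2", 8080)
rfs = RfsTable()
rfs.record_consumer(flow, cpu=3)
print("RPS picks CPU", rps_pick_cpu(flow), "/ RFS picks CPU", rfs.pick_cpu(flow))
```

With RPS the chosen CPU is arbitrary with respect to the application, while the RFS lookup returns the CPU that already holds the application's data in its cache.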

When should you use RFS?

RFS can improve both latency and throughput, especially for network-intensive workloads. Like RPS, RFS distributes the processing of incoming network packets from one CPU to multiple CPUs. To achieve this, the work of the ksoftirqd kernel threads is spread over several CPUs. If the load of a ksoftirqd thread is close to 100% of a CPU, RFS can improve performance by distributing processing over several CPU cores. With RPS, the latency of workloads running on OCP 4.6 can be improved by a factor of 2.1 and throughput by a factor of 2.4 (depending on the scenario, and in a controlled measurement environment). In terms of performance, RFS is at least on par with RPS, and in some cases can improve performance by another 7-21%.

However, it’s important to note that the use of both RPS and RFS can lead to higher CPU demand, which can influence CPU-intensive workloads.
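One rough way to judge whether receive processing is concentrated on a single CPU is to look at the NET_RX row of /proc/softirqs on a node. The sketch below parses that file's text; the helper names and the sample numbers are made up for illustration, and the NET_RX share is only a proxy for the ksoftirqd load discussed above, not the measurement used in this tutorial.

```python
def net_rx_per_cpu(softirqs_text):
    """Return {cpu_name: NET_RX count} parsed from /proc/softirqs content."""
    lines = softirqs_text.strip().splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        if name.strip() == "NET_RX":
            return dict(zip(cpus, (int(v) for v in rest.split())))
    return {}


def busiest_share(counts):
    """Fraction of all NET_RX softirqs handled by the busiest CPU."""
    total = sum(counts.values())
    return max(counts.values()) / total if total else 0.0


# Invented sample; on a node, read the real data with open("/proc/softirqs").read()
SAMPLE = """\
                    CPU0       CPU1       CPU2       CPU3
          HI:          0          0          0          0
      NET_TX:         12          3          1          2
      NET_RX:     900000       5000       3000       2000
"""

counts = net_rx_per_cpu(SAMPLE)
print(counts, "busiest CPU share: %.2f" % busiest_share(counts))
```

A share close to 1.0 suggests that one CPU handles almost all receive work, which is the situation where steering is most likely to help.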

How to set up RFS

RFS is built on top of RPS, so RFS can be configured for worker nodes and infrastructure nodes using a Machine Config Object that is applied by OpenShift’s Machine Config Operator. You can either use the pre-built YAML file below (enable-rfs.yaml in Step 5) and move forward to Step 3, or set up the Machine Config Object manually starting with Step 1.

Step 1. Define and encode udev rule and sysctl entry

To enable RFS, you need to adjust two attributes:

  1. Set the number of actively competing connections that are expected (the global socket flow table, net.core.rps_sock_flow_entries)
  2. Set the number of entries that are expected in the per-queue flow table (rps_flow_cnt of each receive queue)

The first is configured by a sysctl entry, while the second is set for each receive queue by a udev rule.

The udev entry should look like this:

# turn on Receive Flow Steering (RFS) for all network interfaces
SUBSYSTEM=="net", ACTION=="add", RUN{program}+="/bin/bash -c 'for x in /sys/$DEVPATH/queues/rx-*; do echo 8192 > $x/rps_flow_cnt;  done'"

And the sysctl entry for configuring the global socket flow table should look like this:

# define sock flow entries for Receive Flow Steering (RFS)
net.core.rps_sock_flow_entries=8192

Step 2. Create the Machine Config Object in YAML syntax

The Machine Config Operator expects file contents as URL-encoded (percent-encoded) data URLs inside a specific YAML structure. The skeleton of enable-rfs.yaml should look like this:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 50-enable-rfs
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=US-ASCII,####UDEV RULE HERE
        filesystem: root
        mode: 0644
        path: /etc/udev/rules.d/70-persistent-net.rules
      - contents:
          source: data:text/plain;charset=US-ASCII,####SYSCTL ENTRY HERE
        filesystem: root
        mode: 0644
        path: /etc/sysctl.d/95-enable-rps.conf

You need to replace ####UDEV RULE HERE and ####SYSCTL ENTRY HERE with the corresponding udev rule and sysctl entry, each in URL-encoded (percent-encoded) form. But before you do that, you need to determine the number of sock flow entries. As it turns out, the value 8192 is suitable for medium-to-heavy networking loads.

Here’s how you create the udev rule:

echo 'SUBSYSTEM=="net", ACTION=="add", RUN{program}+="/bin/bash -c '\''for x in /sys/$DEVPATH/queues/rx-*; do echo 8192 > $x/rps_flow_cnt;  done'\''"' | python3 -c "import sys, urllib.parse; print(urllib.parse.quote(''.join(sys.stdin.readlines())))"

And here’s how you create the sysctl entry:

echo 'net.core.rps_sock_flow_entries=8192' | python3 -c "import sys, urllib.parse; print(urllib.parse.quote(''.join(sys.stdin.readlines())))"
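The same encoding can also be done in a small Python script, which avoids nested shell quoting. This is a minimal sketch; the helper name to_data_url is hypothetical, and the prefix is simply the one used in the skeleton above.

```python
from urllib.parse import quote, unquote

# Prefix taken from the "source:" fields of the skeleton
PREFIX = "data:text/plain;charset=US-ASCII,"


def to_data_url(text):
    """Percent-encode file content and prepend the data URL prefix."""
    return PREFIX + quote(text)


udev_rule = (
    'SUBSYSTEM=="net", ACTION=="add", RUN{program}+="/bin/bash -c '
    "'for x in /sys/$DEVPATH/queues/rx-*; do echo 8192 > $x/rps_flow_cnt;  done'\"\n"
)
sysctl_entry = "net.core.rps_sock_flow_entries=8192\n"

print(to_data_url(udev_rule))
print(to_data_url(sysctl_entry))

# Decoding the payload again must give back the original text
assert unquote(to_data_url(udev_rule)[len(PREFIX):]) == udev_rule
```

The two printed strings can be pasted directly as the source values of the skeleton.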

Step 3. Apply the Machine Config Object in OpenShift

To apply the Machine Config Object in OpenShift, execute the following:

oc create -f enable-rfs.yaml

Then wait until the Machine Config Operator has applied the Machine Config Object to the worker nodes one by one. To review whether RFS has been applied, see optional Step 4 below.
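Once a node has been updated, the two values can be read back from the node's /proc and /sys file systems (for example, from a debug shell on the node). The following sketch collects them; the function name read_rfs_settings and its root parameter are assumptions made for illustration and testing.

```python
import glob
import os


def read_rfs_settings(root="/"):
    """Return (rps_sock_flow_entries, {queue file: rps_flow_cnt}) found under root."""
    sock_path = os.path.join(root, "proc/sys/net/core/rps_sock_flow_entries")
    with open(sock_path) as f:
        sock_flow_entries = int(f.read())

    per_queue = {}
    pattern = os.path.join(root, "sys/class/net/*/queues/rx-*/rps_flow_cnt")
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            per_queue[os.path.relpath(path, root)] = int(f.read())
    return sock_flow_entries, per_queue


# On a node: print(read_rfs_settings())
```

If the Machine Config Object from this tutorial is active, both the global value and every per-queue value should be 8192.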

Step 4. (Optional) Watch OCP activate RFS

The following command shows you the list of Machine Config Objects that have been applied to your OCP cluster. As soon as you switch on RFS (as described in Step 3), the entry 50-enable-rfs should appear in the list:

oc get mc
NAME            GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master       fc2e69c4408d898b24760eea9e889f0673369e67   3.1.0             4d2h
00-worker       fc2e69c4408d898b24760eea9e889f0673369e67   3.1.0             4d2h
50-enable-rfs   fc2e69c4408d898b24760eea9e889f0673369e67   3.1.0             a few seconds

The following command can be used to monitor the progress of RFS activation:

watch oc get mcp
NAME     CONFIG              UPDATED   UPDATING   DEGRADED
master   rendered-master-…   True      False      False
worker   rendered-worker-…   False     True       False

When the rollout has finished, UPDATED shows True and UPDATING shows False.
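The same check can be scripted by inspecting the JSON form of the machine config pool. The sketch below assumes that the pool reports conditions of type Updated, Updating and Degraded, mirroring the columns shown above; the function names are hypothetical, and fetch_pool needs a logged-in oc client.

```python
import json
import subprocess


def pool_status(pool):
    """Map each condition type of a MachineConfigPool to True/False."""
    conditions = pool.get("status", {}).get("conditions", [])
    return {c["type"]: c["status"] == "True" for c in conditions}


def rollout_finished(pool):
    """True when the pool is updated, not updating, and not degraded."""
    s = pool_status(pool)
    return (s.get("Updated", False)
            and not s.get("Updating", False)
            and not s.get("Degraded", False))


def fetch_pool(name="worker"):
    """Fetch the pool from the cluster via the oc client."""
    result = subprocess.run(["oc", "get", "mcp", name, "-o", "json"],
                            check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


# Invented sample that corresponds to the "worker" row shown above
sample = {"status": {"conditions": [
    {"type": "Updated", "status": "False"},
    {"type": "Updating", "status": "True"},
    {"type": "Degraded", "status": "False"},
]}}
print("finished:", rollout_finished(sample))
```

On a real cluster, rollout_finished(fetch_pool("worker")) could be polled in a loop until it returns True.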

Step 5. (Optional) Unapply RFS Machine Config Object

The RFS Machine Config Object can be removed as follows:

oc delete mc 50-enable-rfs

And here is enable-rfs.yaml, a ready-to-use configuration that can be copied to the cluster and applied as is:

Example Machine Config Object: enable-rfs.yaml

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 50-enable-rfs
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=US-ASCII,%23%20turn%20on%20Receive%20Flow%20Steering%20%28RFS%29%20for%20all%20network%20interfaces%0ASUBSYSTEM%3D%3D%22net%22%2C%20ACTION%3D%3D%22add%22%2C%20RUN%7Bprogram%7D%2B%3D%22/bin/bash%20-c%20%27for%20x%20in%20/sys/%24DEVPATH/queues/rx-%2A%3B%20do%20echo%208192%20%3E%20%24x/rps_flow_cnt%3B%20%20done%27%22%0A
        filesystem: root
        mode: 0644
        path: /etc/udev/rules.d/70-persistent-net.rules
      - contents:
          source: data:text/plain;charset=US-ASCII,%23%20define%20sock%20flow%20entries%20for%20Receive%20Flow%20Steering%20%28RFS%29%0Anet.core.rps_sock_flow_entries%3D8192%0A
        filesystem: root
        mode: 0644
        path: /etc/sysctl.d/95-enable-rps.conf
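Before applying a Machine Config Object, it can be useful to decode its data URLs and check what will actually be written to the nodes. The following is a minimal sketch that works on the object loaded as a Python dictionary (for example, from the JSON output of oc get mc 50-enable-rfs -o json); the function name decode_files is hypothetical, and the sample contains only a shortened sysctl file.

```python
from urllib.parse import unquote


def decode_files(machine_config):
    """Return {path: decoded content} for all files in a MachineConfig dict."""
    files = machine_config["spec"]["config"]["storage"]["files"]
    decoded = {}
    for entry in files:
        # Everything after the first comma of the data URL is the payload
        _, _, payload = entry["contents"]["source"].partition(",")
        decoded[entry["path"]] = unquote(payload)
    return decoded


# Shortened sample with a single file entry
sample_mc = {"spec": {"config": {"storage": {"files": [{
    "contents": {"source": "data:text/plain;charset=US-ASCII,"
                           "net.core.rps_sock_flow_entries%3D8192%0A"},
    "path": "/etc/sysctl.d/95-enable-rps.conf",
}]}}}}

for path, content in decode_files(sample_mc).items():
    print(path)
    print(content)
```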

Summary

The RFS network tuning option can help improve latency and throughput in highly demanding networking scenarios. In this tutorial, you learned how to configure RFS on OCP, and how to enable and disable this option. The tutorial also provided a ready-to-use configuration with enable-rfs.yaml, showed you how to create your own configurations, and provided details to help you understand the basic mechanisms.
