Simple performance lock analysis tool (splat) for Linux
Use Linux splat to understand and troubleshoot performance issues caused by process locks on your Linux system
This tutorial introduces you to the Linux® simple performance lock analysis tool (Linux splat) for IBM® Power Systems™. Linux splat provides a system-wide view of the different locks that are acquired, released, or actively held by different processes across a Linux system. The locks refer to the synchronization mechanisms used when multiple tasks or processes are running simultaneously. This tool can be very helpful in understanding and troubleshooting performance problems relevant to different types of locks being used across a Linux system.
When the IBM Power® server family introduced support for Linux, there was a lack of tools such as AIX curt and AIX splat on Linux distributions such as Red Hat Enterprise Linux (RHEL), SUSE Linux Enterprise Server (SLES), Ubuntu, and so on.
We developed Linux splat and Linux CPU Utilization Reporting Tool (Linux curt), which we’ll cover in a tutorial coming soon, in response to requests from developers, customers, and administrators who were used to having performance tools such as AIX splat and AIX curt available for advanced troubleshooting of lock and CPU-related performance problems.
Linux splat and Linux curt are also useful when migrating workloads to Linux because they help with troubleshooting performance bottlenecks around CPU or locks, particularly for database loads, which tend to be CPU intensive, and locks to handle synchronization among database threads. Lack of such tools on Linux made troubleshooting harder in this area.
The current iteration of Linux splat generates reports on user-level mutexes but might soon include read/write locks and condition variable reports. It uses the capabilities of the Linux perf facility to record lock-related events and process the perf format trace file generated to produce a per-function, per-task, and system-wide lock statistics report (explained later in this tutorial with examples).
Note: Even though we developed Linux splat in response to specific IBM Power Systems user requests, it will run on any Linux architecture.
Using Linux splat requires a basic understanding of the Linux perf utility and working experience of an operating system such as, UNIX®. Elementary knowledge of Python and shell scripting might be helpful but not mandatory.
For example, on a Red Hat Enterprise Linux (RHEL) system, the following versions of software are needed:
- RHEL 7.5
- RPM packages
- Linux perf utility
You can install the latest versions using the
yum install perf yum install python
Installing Linux splat
The code is publicly available on GitHub at https://github.com/open-power-sdk/splat. You can download it directly as a ZIP file.
Then, extract the package using the following command:
Or, you can clone the repository using the following command as shown in Figure 1:
git clone https://github.com/open-power-sdk/splat.git
Setting up Linux splat
Perform the following steps to set up Linux splat:
Enable user access
Provide read/write access to the /sys/kernel/debug/tracing folder using the following command:
sudo mount -o remount,mode=755 /sys/kernel/debug
Enable any user to collect system-wide events using the following command:
echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid
Run the mount.sh script to perform the steps above.
Install the debuginfo package for glibc.
For example, from the test setup used for this example, the rpm packages were installed (as root user) by running the following commands:
yum localinstall glibc-debuginfo-2.17-222.el7.ppc64le.rpm
yum localinstall glibc-debuginfo-common-2.17-222.el7.ppc64le.rpm
As a root user, enable the software-defined tracepoints on the system so that the perf utility can log them when the associated code paths are run during event recording.
Running Linux splat
Linux splat can be run in many ways:
The easiest method is to record and report in a single step.
./splat.py --record all <command>
where command is the command to be run including arguments. For example: ./splat.py –record all /usr/bin/ls -l /home
The events recorded will be written to perf.data and also be processed to generate a detailed report.
Process an existing perf.data file.
perf script -s ./splat.py
Or, just run ./splat.py, which will launch the
Process a renamed perf trace file. By default, splat and perf tools look for a perf.data file.
Linux splat usage examples
Linux splat uses the Linux perf command to record events into the perf.data file. This file is processed by splat to generate a report that shows:
- A count of lock acquired attempts, acquisitions, failures, and releases
- Average, minimum, and maximum wait and hold times of all locks
The following statistics are recorded per task and a system-wide summary across all tasks is reported:
- Task : Task ID performing the lock operation
- Function : Function in which the lock operation is taking place
- Lock : Lock ID
- Attempted : Count of lock acquisitions made
- Failed : Count of failed attempts to lock acquisitions
- Acquired : Count of successful lock acquisitions
- Released : Count of locks released
- Wait : The (minimum/maximum/average) wait time for a lock
- Held : The (minimum/maximum/average) hold time for a lock
The following sections demonstrate two examples for using Linux splat. The simple test case shows its basic functionality, while the test case with multiple threads competing for locks shows lock acquisitions attempts (success and failures), wait/hold times, and releases per task as well as anoverall summary.
A simple test case
The splat command is run as shown below. This records all the events related to the probe for a duration of 20 seconds (as specified in the command argument:
sleep 20). A perf format perf.data trace is generated and is processed by splat to create a report as shown in the following output. The list of statistics (as described above) is displayed for all the locks per task.
./splat.py --record all sleep 20
Sample splat command output
Task [ 37569] Function "start_thread+0x104" lock attempted failed acquired released 100200d0 5 2 3 0 wait wait avg wait min wait max 0.009284 0.003095 0.000970 0.007344 held held avg held min held max 0.000000 0.000000 0.000000 0.000000 Function "critSection+0xe0" lock attempted failed acquired released 100200d0 0 0 0 3 wait wait avg wait min wait max 0.000000 0.000000 0.000000 0.000000 held held avg held min held max 0.332536 0.110845 0.100884 0.122726 Function "critSection+0x2c" lock attempted failed acquired released 10020090 1 0 1 0 wait wait avg wait min wait max 0.002226 0.002226 0.002226 0.002226 held held avg held min held max 0.000000 0.000000 0.000000 0.000000 Function "critSection+0x70" lock attempted failed acquired released 10020090 0 0 0 1 wait wait avg wait min wait max 0.000000 0.000000 0.000000 0.000000 held held avg held min held max 0.006928 0.006928 0.006928 0.006928 Task [ 37570] Function "start_thread+0x104" lock attempted failed acquired released 100200d0 6479 6476 3 0 wait wait avg wait min wait max 0.003192 0.001064 0.001048 0.001096 held held avg held min held max 0.000000 0.000000 0.000000 0.000000 Function "critSection+0xe0" lock attempted failed acquired released 100200d0 0 0 0 3 wait wait avg wait min wait max 0.000000 0.000000 0.000000 0.000000 held held avg held min held max 1000.124214 333.374738 0.031022 1000.043584 Function "critSection+0x2c" lock attempted failed acquired released 10020090 1 0 1 0 wait wait avg wait min wait max 0.002312 0.002312 0.002312 0.002312 held held avg held min held max 0.000000 0.000000 0.000000 0.000000 Function "critSection+0x70" lock attempted failed acquired released 10020090 0 0 0 1 wait wait avg wait min wait max 0.000000 0.000000 0.000000 0.000000 held held avg held min held max 0.001238 0.001238 0.001238 0.001238 Task [ ALL] lock acquired failed 10024c810c0 7 0 waited minWait maxWait avgWait 0.009 0.001 0.002 0.001 held minHeld maxHeld avgHeld 0.438 0.001 0.419 0.063 lock acquired failed 100200d0 6 6478 waited minWait maxWait avgWait 0.012 0.001 0.007 0.002 held minHeld maxHeld avgHeld 1000.457 0.031 1000.044 166.743 lock acquired failed 3fff800124f8 4 0 waited minWait maxWait avgWait 0.005 0.001 0.002 0.001 held minHeld maxHeld avgHeld 0.015 0.001 0.011 0.004 lock acquired failed 3fff80012818 4 0 waited minWait maxWait avgWait 0.005 0.001 0.002 0.001 held minHeld maxHeld avgHeld 0.011 0.001 0.006 0.003 lock acquired failed 3fffae030908 2 0 waited minWait maxWait avgWait 0.003 0.001 0.001 0.001 held minHeld maxHeld avgHeld 0.020 0.001 0.019 0.010 lock acquired failed 10020090 2 0 waited minWait maxWait avgWait 0.005 0.002 0.002 0.002 held minHeld maxHeld avgHeld 0.008 0.001 0.007 0.004
Test case with threads competing for locks
The command output also illustrates a slightly more complicated scenario because it has a few threads simultaneously looping and contending for a shared lock. In this scenario, the threads try to compete for lock acquisition and then get into a loop where they use condition variables to wait and signal the lock. The task procedure runs in a loop and the overall process is active for about 10 seconds before terminating.
In the command output, there are two threads (37569 and 37570) attempting, acquiring, and releasing a lock (100200d0) multiple times (as this happens in a loop). You can also see the failed attempts for the lock. The total hold and wait times for acquiring the lock is high, which also explains why there is a high value for the lock attempted statistics. These fields can be observed in the individual task reports and collation of minimum, maximum, and average values can be seen in the command output.
We believe that Linux splat is a very helpful tool for developers, administrators, and clients too allow them to effectively troubleshoot lock-related performance problems they might encounter when migrating applications from AIX to Linux systems.
Now that Linux splat is also available publicly on GitHub, we encourage you to explore and utilize it. We would be very grateful to hear your feedback, comments, suggestions, enhancement ideas, and any other issues you may find.
Linux curt, which we mentioned earlier, is another interesting performance tool that is available on GitHub. We plan to cover Linux curt in a future tutorial. Until then, review the README to learn more.