Ravikumar Bangoria | Updated December 28, 2018 - Published February 1, 2017
Perf is a powerful performance analysis tool. It can be seen as a combination of two components: a userspace tool and kernel infrastructure. The userspace tool is included in the Linux™ kernel repository under the tools/perf/ path. Perf provides several functions through different subcommands, all built on the infrastructure that the kernel provides. You can list these subcommands by running the perf --help command. With this tool, you can instrument CPU performance counters, tracepoints, kprobes, and uprobes (dynamic tracing).
In this article, we focus on the annotate feature that is provided by the userspace Perf tool. The following section describes the annotate feature in general. Later, annotation across architectures is described, which lets you record profile information on, say, IBM® PowerPC®, and then report and annotate it on your notebook or another x86 system.
You might have already installed the Perf tool on your system. If not, you can install it using the yum install perf command on Fedora/RHEL or the apt-get install linux-tools-common command on Ubuntu.
yum install perf
apt-get install linux-tools-common
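Before going further, it can help to confirm that the tool is actually on your PATH. The following is a minimal sketch in POSIX shell; have_tool is a hypothetical helper name introduced here for illustration and is not part of Perf.

```shell
# Hypothetical helper: succeeds when the named program is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool perf; then
    perf --version
else
    echo "perf not found; install it with your distribution's package manager" >&2
fi
```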
Perf annotate offers the ability to map recorded profile information to the actual functions and instructions in the object code. You can use the code browsing capability to follow the code execution alongside profiling information. It allows you to browse code through the text-based user interface (TUI) of the perf report, perf top, and perf annotate commands.
The perf record command records the cycles event by default; use the perf list command to list all possible events supported on your system. You may have to use the sudo command for many of these commands.
$ perf record -a
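Whether sudo is needed for system-wide recording usually depends on the kernel's perf_event_paranoid setting. The sketch below assumes the setting is exposed at /proc/sys/kernel/perf_event_paranoid and that a value of 1 or higher blocks CPU-wide event access for unprivileged users, as the kernel sysctl documentation describes. The function name needs_root_for_system_wide is hypothetical.

```shell
# Hypothetical helper: succeeds (exit 0) when unprivileged users are blocked
# from CPU-wide profiling, i.e. when perf_event_paranoid is 1 or higher.
# Returns non-zero when access is allowed or the setting cannot be read.
# The optional argument only exists so the path can be overridden.
needs_root_for_system_wide() {
    paranoid_file=${1:-/proc/sys/kernel/perf_event_paranoid}
    [ -r "$paranoid_file" ] || return 1   # setting not readable: cannot tell
    level=$(cat "$paranoid_file")
    [ "$level" -ge 1 ]
}

if needs_root_for_system_wide; then
    echo "system-wide recording (perf record -a) likely needs sudo"
fi
```

This is only a hint: a user who has been granted the relevant capabilities can record system-wide even when the value is high.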
You can view the result by using the perf report command.
$ perf report
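When the TUI is not convenient, for example in a script or over a slow connection, perf report can print a plain-text report with the --stdio option. The sketch below profiles a single command rather than the whole system. It uses the -o (output file) option of perf record and the -i (input file) option of perf report. The PERF variable and the profile_cmd function are hypothetical names introduced for this example.

```shell
# PERF lets you point at a different perf binary; it defaults to "perf".
PERF=${PERF:-perf}

# Hypothetical helper: profile_cmd <data-file> <command> [args...]
# Records the given command into <data-file>, then prints a text report.
# The report step runs only if recording succeeded.
profile_cmd() {
    data_file=$1
    shift
    "$PERF" record -o "$data_file" -- "$@" &&
        "$PERF" report -i "$data_file" --stdio
}
```

For instance, profile_cmd /tmp/perf.sleep.data sleep 1 would record the sleep 1 command and then print its report to standard output.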
Pressing ‘a’ on any symbol, for example snooze_loop(), displays assembly instructions of that function with the source code. If you do not see the source, the debuginfo package for kernel/userspace-binary might be missing and needs to be installed.
Numbers on the left side of the bar indicate the percentage of total samples that are recorded against that particular instruction. For example, 40% of the samples of snooze_loop() were recorded on the beq 90 instruction. Perf also shows these numbers in different colors based on how hot the instruction is.
Branch instructions display an arrow to the branch target. Pressing Enter on the branch instruction jumps to that target location.
Similarly, a right arrow is displayed for call instructions, and a left arrow is displayed for return instructions. Pressing Enter on the call instruction displays disassembled output of the target function. Pressing Enter on the return instruction gets you back to the caller’s disassembled output. Also, you can press ‘q’ to go back one step.
Select the bl arch_local_irq_restore+0x8 line and press Enter. You will see disassembly of arch_local_irq_restore().
Different options to change or manipulate annotate output are available in help. Press ‘h’ to open help.
Press the ‘s’ key to toggle between displaying and hiding the source code. Some of the examples above do not show the source because they were captured with the toggle set to hide it. Similarly, press the ‘o’ key to display the actual objdump output. Press the ‘J’ key to display a number before each instruction that is the target of a branch instruction. The number indicates how many branch instructions target that particular instruction.
You can also use the perf annotate command to annotate a symbol.
$ perf annotate smp_call_function_single
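The perf annotate command also accepts --stdio, which is handy when you want the annotated disassembly of several symbols in one text dump. The following is a sketch under that assumption. annotate_symbols and the PERF variable are hypothetical names, and the symbol names you pass in must exist in your own perf.data.

```shell
# PERF lets you point at a different perf binary; it defaults to "perf".
PERF=${PERF:-perf}

# Hypothetical helper: annotate_symbols <data-file> <symbol> [symbol...]
# Prints a header line and the annotated disassembly for each symbol.
annotate_symbols() {
    data_file=$1
    shift
    for sym in "$@"; do
        printf '=== %s ===\n' "$sym"
        "$PERF" annotate -i "$data_file" --stdio "$sym"
    done
}
```

A call such as annotate_symbols perf.data smp_call_function_single snooze_loop would print both functions one after the other.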
You can also annotate using the perf top command. Run the perf top command and press ‘a’ on any particular symbol that you want to annotate. The perf top display also updates its data dynamically at a fixed interval.
Perf also supports annotation across architectures from kernel v4.10-rc1 onwards. That is, you can record on, say, PowerPC, and annotate on an x86 system. For example:
1.1 Record on PowerPC by running the following commands.
$ perf record -a    # Generate perf.data
$ perf archive      # Generate perf.data.tar.bz2
Copy perf.data, perf.data.tar.bz2, and vmlinux with the debug information (on a Fedora system, /usr/lib/debug/lib/modules//vmlinux) to the target x86 system (your notebook, for instance). In the following example, the file names are tagged with the text powerpc (for example, perf.data.powerpc).
1.2 Report/annotate on the x86 system.
$ yum install binutils-powerpc64le-linux-gnu.x86_64    # Install cross-tools
$ tar xvf perf.data.powerpc.tar.bz2 -C ~/.debug
$ perf report -i perf.data.powerpc --vmlinux vmlinux.powerpc
Annotate any symbol by pressing ‘a’ on it.
Cross-architecture annotation is enabled only for kernel symbols. Also, you must use the --source option to annotate with source.
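Because the cross-architecture workflow depends on several files being copied over, a quick check before running perf report can save a confusing error. The sketch below only tests that the files exist; check_cross_inputs is a hypothetical helper name, and the file names in the usage note are the ones used in this example.

```shell
# Hypothetical helper: check_cross_inputs <file> [file...]
# Prints one line per missing file and fails if any file is missing.
check_cross_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}
```

For the example above, you might run check_cross_inputs perf.data.powerpc perf.data.powerpc.tar.bz2 vmlinux.powerpc and continue with perf report only if it succeeds.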