Perf is a powerful performance analysis tool. It can be seen as a combination of two components: a userspace tool and kernel infrastructure. The userspace tool is included in the Linux™ kernel repository under the tools/perf/ path. Perf provides several functions through subcommands that build on the infrastructure provided by the kernel. You can list these subcommands by running the perf --help command. With this tool, you can instrument CPU performance counters, tracepoints, kprobes, and uprobes (dynamic tracing).
In this article, we focus on the annotate feature that is provided by the userspace Perf tool. The following section describes the annotate feature in general. Later, annotation across architectures is described, which enables you to record profile information on, say, IBM® PowerPC®, and then report and annotate it on your notebook or x86 system.
Prerequisites
You might have already installed the Perf tool on your system. If not, you can install it by using the yum install perf command on Fedora/RHEL or the apt-get install linux-tools-common command on Ubuntu.
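To confirm that the tool is available before you continue, a quick check along the following lines can help. This is a minimal sketch: have_perf is a hypothetical helper name chosen for this example and is not part of perf.

```shell
#!/bin/sh
# have_perf is a hypothetical helper name, not part of perf itself.
# It succeeds when a command named perf can be found by the shell.
have_perf() {
    command -v perf >/dev/null 2>&1
}

if have_perf; then
    perf --version    # print the version of the installed userspace tool
else
    echo "perf not found; install it with your distribution's package manager"
fi
```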
Annotate with perf
Perf annotate offers the ability to map recorded profile information to the actual functions and instructions in the object code. You can use the code browsing capability to follow the code execution alongside profiling information. You browse the code through the text-based user interface (TUI) of the perf report, perf top, and perf annotate commands.
Record
The perf record command records the cycles event by default; use the perf list command to list all possible events supported on your system. You may have to use the sudo command for many of these commands.
$ perf record -a
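A system-wide recording started with -a keeps running until you interrupt it, so it can be convenient to bound it by giving perf a command to profile, such as sleep. The following is a minimal sketch under that assumption: record_for is a hypothetical helper name, and the event and duration you pass to it are your own choices.

```shell
#!/bin/sh
# record_for is a hypothetical helper, not a perf subcommand.
# It records the given event system-wide (-a) for a fixed number of
# seconds by letting perf profile a sleep command, and writes the
# samples to perf.data in the current directory.
record_for() {
    event=$1      # an event name as shown by 'perf list', for example cycles
    seconds=$2    # how long to record, in seconds
    perf record -a -e "$event" -o perf.data -- sleep "$seconds"
}
```

After sourcing the snippet, record_for cycles 5 would record the cycles event across all CPUs for about five seconds. As noted above, you may need sudo for system-wide recording.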
Report
You can view the result by using the perf report command.
$ perf report
Figure 1. Perf report
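If you prefer plain text over the TUI, for example to save the report to a file, perf report can also print to standard output. The sketch below assumes a perf.data file in the current directory; top_symbols is a hypothetical helper name, and the default of 30 lines is an arbitrary choice.

```shell
#!/bin/sh
# top_symbols is a hypothetical helper, not a perf subcommand.
# It prints a plain-text report (--stdio) for the given data file,
# sorted by symbol, and keeps only the first N lines of the output.
top_symbols() {
    data=${1:-perf.data}    # input file, defaults to perf.data
    lines=${2:-30}          # number of output lines to keep
    perf report --stdio -i "$data" --sort symbol | head -n "$lines"
}
```

Note that the first lines of the plain-text report are header comments, so you may want to keep more lines than the number of symbols you are interested in.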
Annotate
Pressing 'a' on any symbol, for example snooze_loop(), displays assembly instructions of that function with the source code. If you do not see the source, the debuginfo package for kernel/userspace-binary might be missing and needs to be installed.
Figure 2. Annotate particular function
Numbers on the left side of the bar indicate the percentage of the function's total samples that were recorded against that particular instruction. For example, 40% of the samples for snooze_loop() were recorded on the beq 90 instruction. Perf also shows these numbers in different colors based on how hot the instruction is.
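The arithmetic behind such a percentage can be sketched as follows. The sample counts used here (400 and 1000) are made up for illustration and are not taken from the figure, and insn_percent is a hypothetical helper name.

```shell
#!/bin/sh
# insn_percent is a hypothetical helper that reproduces the arithmetic:
#   percentage = samples on the instruction / samples in the function * 100
insn_percent() {
    awk -v hits="$1" -v total="$2" 'BEGIN { printf "%.2f\n", hits * 100 / total }'
}

# 400 of 1000 samples landed on one instruction
insn_percent 400 1000    # prints 40.00
```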
Branch instructions display an arrow to the branch target. Pressing Enter on the branch instruction jumps to that target location.
Similarly, a right arrow is displayed for call instructions, and a left arrow is displayed for return instructions. Pressing Enter on the call instruction displays disassembled output of the target function. Pressing Enter on the return instruction gets you back to the caller's disassembled output. Also, you can press 'q' to go back one step.
Figure 3. Annotate call instruction
Select the bl arch_local_irq_restore+0x8 line and press Enter. You will see disassembly of arch_local_irq_restore().
Figure 4. Jump to target function
Annotate help
Different options to change or manipulate annotate output are available in help. Press 'h' to open help.
Figure 5. Annotate help
Press the 's' key to toggle between displaying and hiding the source code. Some of the examples above do not show the source because they were captured with the toggle set to hide it. Similarly, press the 'o' key to display the actual objdump output. Press the 'J' key to display a number before each instruction that is the target of a branch. The number indicates how many branch instructions target that particular instruction.
Annotate with perf annotate
You can also use the perf annotate command to annotate a symbol.
$ perf annotate smp_call_function_single
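For a non-interactive variant, perf annotate can print the annotated disassembly to standard output instead of opening the TUI. The following sketch assumes a perf.data file in the current directory; annotate_symbol is a hypothetical helper name.

```shell
#!/bin/sh
# annotate_symbol is a hypothetical helper, not a perf subcommand.
# It prints the annotated disassembly of one symbol as plain text
# (--stdio), reading samples from the given data file.
annotate_symbol() {
    symbol=$1               # symbol to annotate, for example smp_call_function_single
    data=${2:-perf.data}    # input file, defaults to perf.data
    perf annotate --stdio -i "$data" "$symbol"
}
```

After sourcing the snippet, annotate_symbol smp_call_function_single would print the same kind of per-instruction percentages that the TUI shows.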
Live annotate with perf top
You can also annotate by using the perf top command. Run the perf top command and press 'a' on any symbol that you want to annotate. Unlike perf report, perf top works on live data and refreshes the display at a fixed interval.
Cross-arch annotate
Perf also supports annotation across architectures from kernel v4.10-rc1 onwards. That is, you can record on, say, PowerPC, and annotate the result on an x86 system. For example:
1.1 Record on PowerPC by running the following commands.
$ perf record -a    # Generate perf.data
$ perf archive      # Generate perf.data.tar.bz2
Copy perf.data, perf.data.tar.bz2, and vmlinux with the debug information (on a Fedora system, /usr/lib/debug/lib/modules//vmlinux) to the target x86 system (your notebook, for instance). In the following example, these files carry the suffix powerpc.
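One way to prepare the files for copying is to collect them in a staging directory and add the suffix at the same time. This is only a sketch: stage_with_suffix is a hypothetical helper, and appending .powerpc to each file name is an assumed naming scheme, so adjust it to whatever names you use in the later steps.

```shell
#!/bin/sh
# stage_with_suffix is a hypothetical helper, not part of perf.
# It copies each given file into a staging directory and appends
# ".<suffix>" to its name, for example perf.data -> perf.data.powerpc.
# The naming scheme is an assumption made for illustration.
stage_with_suffix() {
    suffix=$1
    dest=$2
    shift 2
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        cp "$f" "$dest/$(basename "$f").$suffix" || return 1
    done
}
```

For instance, stage_with_suffix powerpc ./to-x86 perf.data perf.data.tar.bz2 followed by the path to your vmlinux would fill ./to-x86, which you can then transfer to the x86 system with scp or a similar tool.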