Tutorial

Analyzing performance with perf annotate

Use cross-architecture annotation to analyze recorded profiles on different Linux architectures

By

Ravikumar Bangoria

Introduction

Perf is a powerful performance analysis tool. It can be seen as a combination of two components: a userspace tool and kernel infrastructure. The userspace tool is included in the Linux™ kernel repository under the tools/perf/ path. Perf provides its functions through subcommands that use the infrastructure provided by the kernel. You can list these subcommands by running the perf --help command. With this tool, you can instrument CPU performance counters, tracepoints, kprobes, and uprobes (dynamic tracing).

In this article, we focus on the annotate feature that is provided by the userspace Perf tool. The following section describes the annotate feature in general. Later sections describe annotation across architectures, which lets you record profile information on, say, IBM® PowerPC®, and then report and annotate it on your notebook or x86 system.

Prerequisites

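You might have already installed the Perf tool on your system. If not, you can install it using the yum install perf command on Fedora/RHEL or the apt-get install linux-tools-common command on Ubuntu. On Ubuntu, the perf command might additionally prompt you to install the linux-tools package that matches your running kernel.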
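Before continuing, it can help to confirm that perf is actually available on your PATH. The following is a minimal sketch; the function name perf_status is only illustrative and is not part of perf.

```shell
# Report whether the perf binary is available on this system.
# perf_status is an illustrative helper name, not part of perf itself.
perf_status() {
    if command -v perf >/dev/null 2>&1; then
        # 'perf --version' prints the version of the installed tool
        echo "perf found: $(perf --version 2>/dev/null)"
    else
        echo "perf not found; install it with your distribution's package manager"
    fi
}

perf_status
```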

Annotate with perf

Perf annotate offers the ability to map recorded profile information to the actual functions and instructions in the object code. You can use the code browsing capability to follow the code execution alongside profiling information. You can browse the code in the text-based user interface (TUI) of the perf report, perf top, and perf annotate commands.

Record

The perf record command records the cycles event by default; use the perf list command to list all possible events supported on your system. You may have to use the sudo command for many of these commands.

 $ perf record -a
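A system-wide recording with -a runs until you interrupt it. As a hedged sketch, the wrappers below show two common variations: recording for a fixed duration, and recording a single command with an explicitly chosen event. The function names and the PERF override variable are assumptions made for this illustration; only the perf record options themselves (-a, -e, -o) come from perf.

```shell
# Illustrative wrappers around 'perf record'; the function names are ours.
# PERF is an optional override for the perf binary (an assumption of this
# sketch); when it is unset, plain 'perf' is used.

# Sample all CPUs (-a) for a fixed number of seconds.
# 'sleep N' only serves as a timer: perf stops when the command exits.
record_system_wide() {
    seconds="${1:-5}"
    "${PERF:-perf}" record -a -o perf.data -- sleep "$seconds"
}

# Profile a single command instead of the whole system,
# sampling an explicitly chosen event (-e) rather than the default.
record_command() {
    event="$1"
    shift
    "${PERF:-perf}" record -e "$event" -o perf.data -- "$@"
}

# Example usage (might need sudo, as noted above):
#   record_system_wide 10
#   record_command cycles ls -R /usr
```

Both wrappers write to perf.data in the current directory, which is also the file that perf report reads by default.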

Report

You can view the result by using the perf report command.

 $ perf report 
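If you prefer plain text output, for example to save the report to a file, perf report can also print to standard output. The sketch below assumes a perf.data file in the current directory; report_text and the PERF override variable are illustrative names, not part of perf.

```shell
# report_text is an illustrative helper name, not a perf subcommand.
# Usage: report_text [PERF_DATA_FILE]
report_text() {
    data="${1:-perf.data}"
    # --stdio prints the report as text instead of opening the TUI;
    # --sort symbol groups the samples by symbol.
    "${PERF:-perf}" report -i "$data" --stdio --sort symbol
}

# Example usage:
#   report_text > report.txt
```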

Figure 1. Perf report

Annotate

Pressing 'a' on any symbol, for example snooze_loop(), displays assembly instructions of that function with the source code. If you do not see the source, the debuginfo package for kernel/userspace-binary might be missing and needs to be installed.

Figure 2. Annotate particular function

Numbers on the left side of the bar indicate the percentage of total samples that are recorded against that particular instruction. For example, 40% of the samples of snooze_loop() were recorded on the beq 90 instruction. Perf also shows these numbers in different colors based on how hot the instruction is.

Branch instructions display an arrow to the branch target. Pressing Enter on the branch instruction jumps to that target location.

Similarly, a right arrow is displayed for call instructions, and a left arrow is displayed for return instructions. Pressing Enter on the call instruction displays disassembled output of the target function. Pressing Enter on the return instruction gets you back to the caller's disassembled output. Also, you can press 'q' to go back one step.

Figure 3. Annotate call instruction

Select the bl arch_local_irq_restore+0x8 line and press Enter. You will see disassembly of arch_local_irq_restore().

Figure 4. Jump to target function

Annotate help

Different options to change or manipulate annotate output are available in help. Press 'h' to open help.

Figure 5. Annotate help

Press the 's' key to toggle between displaying and hiding the source code. Some of the examples above do not show the source; they were captured with the toggle set to hide it. Similarly, press the 'o' key to display the actual objdump output. Press the 'J' key to display a number before each instruction that is the target of a branch. The number indicates how many branch instructions target that particular instruction.

Annotate with perf annotate

You can also use the perf annotate command to annotate a symbol.

 $ perf annotate smp_call_function_single 
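The same annotation can be printed as plain text rather than browsed in the TUI. This is a minimal sketch that assumes a perf.data file containing samples for the requested symbol; annotate_text and the PERF override variable are illustrative names.

```shell
# annotate_text is an illustrative helper name, not a perf subcommand.
# Usage: annotate_text SYMBOL [PERF_DATA_FILE]
annotate_text() {
    symbol="$1"
    data="${2:-perf.data}"
    # --stdio prints the annotated disassembly to the terminal and exits,
    # instead of opening the interactive TUI.
    "${PERF:-perf}" annotate -i "$data" --stdio "$symbol"
}

# Example usage:
#   annotate_text smp_call_function_single | less
```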

Live annotate with perf top

You can also annotate using the perf top command. Run the perf top command and press 'a' on the symbol that you want to annotate. Because perf top profiles the live system, the annotated data is updated dynamically at a fixed interval.

Cross-arch annotate

Perf also supports annotation across architectures from kernel v4.10-rc1 onwards. That is, you can record on, say, PowerPC, and annotate on an x86 system. For example:

1.1 Record on PowerPC by running the following commands.


        $ perf record -a      #Generate perf.data
        $ perf archive        #Generate perf.data.tar.bz2

Copy perf.data, perf.data.tar.bz2, and vmlinux with the debug information (on a Fedora system, /usr/lib/debug/lib/modules//vmlinux) to the target x86 system (your notebook, for instance). In the following example, the names of these files are suffixed with powerpc.
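The renaming can be done while collecting the files for transfer. The helper below is a hypothetical sketch: stage_profile and its arguments are names invented for this illustration. It copies the three files into one directory and appends the suffix, so that the resulting names match those used in the next step. You can then copy that directory to the x86 system with any file transfer tool.

```shell
# stage_profile is a hypothetical helper, not part of perf.
# Usage: stage_profile SUFFIX DEST_DIR VMLINUX_PATH
# Run it in the directory that contains perf.data and perf.data.tar.bz2.
stage_profile() {
    suffix="$1"
    dest="$2"
    vmlinux="$3"
    mkdir -p "$dest" || return 1
    # Append the suffix so files from different machines do not collide.
    cp perf.data         "$dest/perf.data.$suffix" &&
    cp perf.data.tar.bz2 "$dest/perf.data.$suffix.tar.bz2" &&
    cp "$vmlinux"        "$dest/vmlinux.$suffix"
}

# Example usage (the vmlinux path is a placeholder):
#   stage_profile powerpc ./profile-powerpc /path/to/vmlinux
```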

1.2 Report/annotate on the x86 system.

 
        $ yum install binutils-powerpc64le-linux-gnu.x86_64       #Install cross-tools
        $ mkdir -p ~/.debug                                       #Create the target directory if it is missing
        $ tar xvf perf.data.powerpc.tar.bz2 -C ~/.debug
        $ perf report -i perf.data.powerpc --vmlinux vmlinux.powerpc

Annotate any symbol by pressing 'a' on it.

Cross-architecture annotation is enabled only for kernel symbols. Also, you must use the --source option to annotate with source.
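As a hedged sketch, a kernel symbol from the PowerPC profile could also be annotated directly from the command line with the --source option. The file names follow the example above; cross_annotate and the PERF override variable are illustrative names, and the exact behavior may vary between perf versions.

```shell
# cross_annotate is an illustrative helper name, not a perf subcommand.
# Usage: cross_annotate KERNEL_SYMBOL
cross_annotate() {
    symbol="$1"
    # -i selects the profile recorded on PowerPC, --vmlinux points to the
    # matching kernel image with debug information, and --source asks perf
    # to interleave source code with the disassembly.
    "${PERF:-perf}" annotate -i perf.data.powerpc \
        --vmlinux vmlinux.powerpc --source "$symbol"
}

# Example usage:
#   cross_annotate snooze_loop
```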