Linux interfaces for IBM POWER9 nest counters on IBM PowerVM


System performance analysis is essential for understanding and tuning computer systems so that they run efficiently, and customer systems often exhibit performance issues that are difficult to diagnose. The 24×7 feature in IBM® POWER9™ processor-based servers provides the facility to continuously collect a large number of hardware performance metrics efficiently and accurately. This tutorial introduces the nest hardware performance monitoring counters and their Linux perf interface in an IBM PowerVM® partition. It also introduces the Performance Co-Pilot (PCP) tool and shows how to use it to collect the POWER9 nest performance monitoring counter data through the hv_24x7 perf interface.

Socket-level resource information using POWER9 nest counters

IBM POWER9 processors implement nest Performance Monitoring Units (PMUs), which enable measurement of socket-level resource utilization. Each nest PMU has dedicated performance monitoring counters and hardware events. Unlike traditional processor PMU events, nest PMU events focus on data that goes off-core. IBM POWER9 processors also implement accumulation logic in hardware, which periodically updates memory with event counter data from the nest units.

Linux perf interface for nest counter data collection

IBM POWER9 hardware gathers the socket-level performance data and stores the data in buffers that are allocated and managed by the hypervisor. The interface that is used to harvest this data is known as hv_24x7. The hv_24x7 interface collects the performance data from hypervisor memory through a hypervisor call (HCALL). Because the hv_24x7 interface is integrated with the Linux perf tool, a user or an application can obtain the nest counter data through perf commands.

To get the list of supported hv_24x7 events, run the perf list command.

Figure 1. hv_24x7 events listed with perf list command

Prerequisites for performance data collection

Unless a logical partition (LPAR) is configured to own all system resources, rights to collect system-wide data must be explicitly granted. This restriction prevents LPARs from obtaining data about other partitions running on the system.

To enable requests to collect information across all system resources, in the Hardware Management Console (HMC), click General Properties.

Figure 2. General Properties in the HMC menu

On the General page, click Advanced and select the Enable Performance Information Collection check box.

Figure 3. General page with the Enable Performance Information Collection option

Click Save to save the changes.

If the Enable Performance Information Collection check box is not selected in the HMC, the perf tool displays a not supported message instead of the performance counter data when you monitor the nest counters.

To monitor an event, run the following perf command:

perf stat -e hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=7/ -e hv_24x7/PM_PHB0_CYC,chip=7/ -C 0 -I 1000 sleep 100

Figure 4. Example perf stat output of the hv_24x7 events
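
When several events or chips need to be monitored, the perf stat command line above can be assembled programmatically. The following Python sketch builds the same argument list. The helper name build_perf_stat_cmd and its parameters are hypothetical, and the sketch only constructs the command; it does not run perf or check the event names against perf list.

```python
import shlex


def build_perf_stat_cmd(events, chip, cpu=0, interval_ms=1000, duration_s=100):
    """Build a perf stat argument list for hv_24x7 events on one chip.

    Hypothetical helper that mirrors the command shown in this tutorial.
    """
    cmd = ["perf", "stat"]
    for event in events:
        # Each hv_24x7 event is qualified with the chip it is read from.
        cmd += ["-e", f"hv_24x7/{event},chip={chip}/"]
    # -C selects the CPU and -I sets the print interval in milliseconds.
    cmd += ["-C", str(cpu), "-I", str(interval_ms), "sleep", str(duration_s)]
    return cmd


cmd = build_perf_stat_cmd(
    ["PM_MCS01_128B_RD_DISP_PORT01", "PM_PHB0_CYC"], chip=7
)
print(shlex.join(cmd))
```

The printed line matches the perf stat command shown earlier, so the list can be passed to a process launcher such as subprocess.run on a partition where the events are supported.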

Pre-scale values

Nest counters are programmed with a pre-scale count because the nest PMU counters are narrow. To get the final count, multiply the counter value by the pre-scale value of the unit, as listed in the following table. For example, with the PB pre-scale of 256, the power bus frequency is 2.39 GHz.

Table 1. Pre-scale values for nest units

Unit      Scaling factor
PB        256
MCS       256
PBCQ      256
Alinks    4096
Xlinks    4096
CAPP      256
NTL       256
NPCQ      256
ATS       256
XTS       256
NX        256
MCD       256
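
As a worked example of applying the table, the following Python sketch keeps the pre-scale values in a dictionary and multiplies a raw counter value by the factor for its unit. The names PRESCALE and scale_count are hypothetical, and the raw value is illustrative only (it has the same magnitude as the PM_PB_CYC rate in the pmval sample output later in this tutorial).

```python
# Pre-scale values copied from Table 1.
PRESCALE = {
    "PB": 256, "MCS": 256, "PBCQ": 256, "Alinks": 4096, "Xlinks": 4096,
    "CAPP": 256, "NTL": 256, "NPCQ": 256, "ATS": 256, "XTS": 256,
    "NX": 256, "MCD": 256,
}


def scale_count(unit, raw_count):
    """Return the final count for a raw nest counter value of the given unit."""
    return raw_count * PRESCALE[unit]


# Illustrative raw value: 9,370,000 pre-scaled PB cycles counted in one second.
print(scale_count("PB", 9_370_000))  # 2398720000
```

The result, roughly 2.4 billion cycles per second, is in line with the power bus frequency quoted above.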

Performance Co-Pilot (PCP)

Nest performance monitoring data consists of socket-level resource utilization metrics, and collecting it requires root or admin-level privileges. If you want some trusted users to gather the nest performance data, PCP is a possible alternative.

PCP is a framework for system performance monitoring and analysis that:

  • Uses a distributed architecture to collect, monitor, and analyze system metrics.
  • Has live monitoring and statistical prediction features.
  • Provides APIs for languages such as C, Python, and Perl.

PCP setup and usage

You can install PCP and other required tools using the pcp, libpcp-devel, pcp-pmda-perfevent, pcp-system-tools, and python3-pcp packages.

Performance Metrics Collector Daemon (pmcd)

The pmcd daemon collects performance metrics on a system. There must be an instance of pmcd running on a system to collect performance metrics.

In the pmcd configuration file (/etc/pcp/pmcd/pmcd.conf), you can perform the following tasks:

  • Configure agents to collect specific events.
  • Configure access control lists for hosts and users to allow or restrict operations such as store and fetch.

Run the following command to enable the pmcd service:

# systemctl enable pmcd

Run the following command to start the pmcd service:

# systemctl start pmcd

Run the following command to check the status of the pmcd service:

# systemctl status pmcd

When the pmcd daemon is not running, the pcp command fails, and the following error is displayed:

# pcp
pcp-summary: Cannot connect to PMCD on host "local:": Connection refused

To collect the hv_24x7 events, you need the perfevent Performance Metric Domain Agent (PMDA).

To install the perfevent PMDA, complete the following steps:

  1. Change the directory to /var/lib/pcp/pmdas/perfevent.
  2. Run the Install script.

    # ./Install
    PMCD should communicate with the perfevent daemon via a pipe or a socket? [pipe]
    Updating the Performance Metrics Name Space (PMNS) ...
    Terminate PMDA if already installed ...
    Updating the PMCD control file, and notifying PMCD ...
    Check perfevent metrics have appeared ... 1285 metrics and 1027 values
  3. Restart the pmcd service.

    # systemctl restart pmcd
  4. View the installed PMDAs by using the pcp command.

    # pcp
    Performance Co-Pilot configuration on linux-l9tb:
    platform: Linux linux-l9tb 4.12.14-32-default #1 SMP Thu Feb 14 12:19:57 UTC 2019 (e1536c0) ppc64le
    hardware: 64 cpus, 2 disks, 2 nodes, 15111MB RAM
    timezone: EDT+4
    services: pmcd
      pmcd: Version 4.3.0-1, 8 agents
      pmda: root pmcd proc xfs linux mmv kvm[4] jbd2 perfevent

View information about the perf hv_24x7 events

To collect the perf-related data, you can run the pmprobe and pmval PCP commands.

The pmprobe command lists the performance metrics that are available through PCP.

# pmprobe | grep perfevent

The configuration file, /var/lib/pcp/pmdas/perfevent/perfevent.conf, contains details about the hv_24x7 events that are supported. The following hv_24x7 events are supported:

  • hv_24x7.PM_PB_CYC chip:1
  • hv_24x7.PM_MBA0_CLK_CYC chip:0

To see the values of the supported hv_24x7 events, run the pmval command. The output of the pmval command displays the values of the specified performance metric.

# pmval perfevent.hwcounters.hv_24x7.PM_PB_CYC.value
# pmval perfevent.hwcounters.hv_24x7.PM_MBA0_CLK_CYC.value
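
The metric names passed to pmval above appear to follow a fixed pattern: the event name from the supported-events list, wrapped in perfevent.hwcounters. and .value. The following Python sketch makes that mapping explicit. Both helper names are hypothetical, the naming pattern is inferred only from the two examples in this tutorial, and the "event chip:N" entry format is taken from the list above rather than from the perfevent.conf syntax itself.

```python
def parse_event_entry(entry):
    """Split an entry such as 'hv_24x7.PM_PB_CYC chip:1' into (event, chip).

    Assumes the 'event chip:N' form used in the list in this tutorial.
    """
    event, chip_field = entry.split()
    return event, int(chip_field.split(":", 1)[1])


def pcp_metric_name(event):
    """Return the PCP metric name that pmval expects for a perfevent event."""
    return f"perfevent.hwcounters.{event}.value"


for entry in ("hv_24x7.PM_PB_CYC chip:1", "hv_24x7.PM_MBA0_CLK_CYC chip:0"):
    event, chip = parse_event_entry(entry)
    print(pcp_metric_name(event), "chip", chip)
```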

Refer to the following example output of the hv_24x7.PM_PB_CYC event:

# pmval perfevent.hwcounters.hv_24x7.PM_PB_CYC.value

metric:    perfevent.hwcounters.hv_24x7.PM_PB_CYC.value
host:      linux-l9tb
semantics: cumulative counter (converting to rate)
units:     count (converting to count / sec)
samples:   all

cpu0                  cpu1                  cpu2                  cpu3
9.370E+06             9.370E+06             9.370E+06             9.370E+06

Refer to the following example output of the hv_24x7.PM_MBA0_CLK_CYC event:

# pmval perfevent.hwcounters.hv_24x7.PM_MBA0_CLK_CYC.value

metric:    perfevent.hwcounters.hv_24x7.PM_MBA0_CLK_CYC.value
host:      linux-l9tb
semantics: cumulative counter (converting to rate)
units:     count (converting to count / sec)
samples:   all

cpu0                  cpu1                  cpu2                  cpu3
0.0                   0.0                   0.0                   0.0

When you try to read an unsupported event, the message “No values available” is displayed in the output of the command.

For example:

# pmval perfevent.hwcounters.hv_24x7.CPM_CS_FROM_L2_L3_A_LDATA.value

metric:    perfevent.hwcounters.hv_24x7.CPM_CS_FROM_L2_L3_A_LDATA.value
host:      linux-l9tb
semantics: cumulative counter (converting to rate)
units:     count (converting to count / sec)
samples:   all
No values available
No values available

To obtain the hv_24x7 values by using PCP Python APIs, complete the following procedure:

  1. Import the required PCP API classes:

    from pcp import pmapi
    import cpmapi as c_api
  2. Create the fetchgroup object:

    fg = pmapi.fetchgroup(c_api.PM_CONTEXT_HOST, 'local:')
  3. Add the required event:

    PM_PB_CYC = fg.extend_indom('perfevent.hwcounters.hv_24x7.PM_PB_CYC.value')
  4. Get the samples:

    fg.fetch()
  5. Because the values of this event are cumulative, fetch again:

    fg.fetch()
  6. Print the values. To get the final count, multiply each value by the pre-scale value of the unit (256 for PB; refer to Table 1).

    for icode, iname, value in PM_PB_CYC():
        print(" %s: %7.2f" % (iname, value()), end='')

Example output of the above script:

cpu0: 7806515.00 cpu1: 7806515.00 cpu2: 7806515.00 cpu3: 7806515.00
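
The multiplication by the pre-scale value in step 6 can also be done in code. The following Python sketch applies it to samples shaped like the tuples that the loop above iterates over (instance code, instance name, and a callable that returns the value). The function name scaled_samples is hypothetical, and stand-in data is used so that the sketch runs without PCP installed.

```python
def scaled_samples(samples, prescale=256):
    """Yield (instance name, scaled value) pairs.

    `samples` is expected to be a sequence of
    (instance code, instance name, callable returning the value) tuples.
    """
    for _icode, iname, value in samples:
        yield iname, value() * prescale


# Stand-in data so the sketch runs without PCP; with PCP, pass PM_PB_CYC().
fake = [(0, "cpu0", lambda: 7806515.0), (1, "cpu1", lambda: 7806515.0)]
for iname, total in scaled_samples(fake):
    print("%s: %.2f" % (iname, total))
```

With the stand-in data, each instance prints 1998467840.00, which is the sample value above multiplied by the PB pre-scale value of 256.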