Overview

Skill Level: Intermediate

This recipe describes how to configure 16G huge-pages on a PowerVM system running the Linux operating system. It was authored by Aneesh KK Veetil and Kalpana Shetty. Special thanks to Aneesh for his technical guidance on setting up 16GB huge-pages.

Ingredients

Two major setup tasks must be completed before the Linux partition can use 16G huge-pages at the operating system level.

First, configure the huge-page pool at the Power system level. Second, configure the number of huge-pages at the Linux partition level.

System preparation

The environment used for this recipe is a Power9 system with RHEL and SLES Linux distributions. However, the steps listed apply to any PowerVM system, such as Power8 or Power9.

  • Power 9, Zeppelin Hardware
  • RHEL 8.0 and SLES 15 SP1 Linux distributions


Step-by-step

  1. Set up the huge-page memory pool

    Set up huge-page memory pool through an ASMI web interface of the Power System by completing the following steps:

    1.  Power off the Power system (whole CEC needs to be powered off).
    2.  Log in to the ASMI web interface by using the default user ID and password.
    3.  On the ASMI main page, expand “Performance Setup”.


    4. Select “System Memory Page Setup” and, on the right-hand side, specify the number of huge-pages to configure.


    5. Save the settings.


    6. Power on the system.


    Note: If you have a full-resource system that is not managed by an HMC, that is, a non-LPAR based system, skip “Configure huge-page at Linux partition” and continue with the section “Configuring huge-pages at the Operating System level”.

  2. Configure huge-page at Linux partition

    1. Log in to the HMC where the Managed System is hosted.

    2. Select the Linux partition that needs to be configured with huge-pages.

    3. Edit the partition profile and go to the “Memory” Tab.

     


    In this example, 10 huge-pages are available for configuration at the partition level. Specify the following values:

    “Minimum pages”: 4

    “Desired pages”: 4

    “Maximum pages”: 4


    4. Save the changes and activate the partition.
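The profile values above can be sanity-checked before they are entered in the HMC. The following is a minimal sketch, not an HMC tool: the function name check_profile_pages is hypothetical, and the rule minimum <= desired <= maximum <= pool size is an assumption based on how partition profile ranges usually work.

```shell
# Hypothetical helper (not part of any HMC tooling).
# Arguments: pool size, minimum, desired, maximum (all counts of 16 GB pages).
# Assumed rule: minimum <= desired <= maximum <= pool size.
check_profile_pages() (
    pool=$1
    min=$2
    desired=$3
    max=$4
    if [ "$min" -le "$desired" ] && [ "$desired" -le "$max" ] && [ "$max" -le "$pool" ]; then
        # Each page is 16 GB, so the desired count fixes the reserved memory.
        echo "ok: $desired pages = $((desired * 16)) GB reserved for the partition"
    else
        echo "invalid: need min <= desired <= max <= pool ($pool)"
        return 1
    fi
)

# Values used in this recipe: pool of 10 pages, 4/4/4 in the profile.
check_profile_pages 10 4 4 4
```

With the values used in this recipe, the partition reserves 4 x 16 GB = 64 GB of memory as huge-pages.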


  3. Configuring huge-pages at the Operating System level

    Edit the GRUB configuration to set the huge-page size and the number of huge-pages at boot time. This step is mandatory, because the huge-pages are configured during boot.

     

    • Boot the operating system (RHEL 8.0 was used at the time of writing).


    • Configure the huge-page size and the number of huge-pages as boot parameters.

    Edit the grub file “/etc/default/grub” and add the huge-page size and the requested number of pages to “GRUB_CMDLINE_LINUX”:

    hugepagesz=16G hugepages=4

    For example: GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rhel_XXXX/root rd.lvm.lv=rhel_XXXX/swap biosdevname=0 hugepagesz=16G hugepages=4"
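The edit above can also be scripted. The sketch below is one possible way to do it, assuming GNU sed (for the -i option) and a GRUB_CMDLINE_LINUX line that uses double quotes; the function name add_hugepage_params is hypothetical. It leaves the file alone if a hugepagesz= setting is already present and keeps a .bak copy otherwise.

```shell
# Hypothetical helper: append 16G huge-page parameters to GRUB_CMDLINE_LINUX.
# Arguments: path to the grub defaults file, number of huge-pages to request.
# Assumes GNU sed and a double-quoted GRUB_CMDLINE_LINUX="..." line.
add_hugepage_params() (
    grub_file=$1
    pages=$2
    # Do nothing if a huge-page size is already configured.
    if grep -q '^GRUB_CMDLINE_LINUX=.*hugepagesz=' "$grub_file"; then
        echo "hugepagesz already set in $grub_file; leaving it unchanged"
        return 0
    fi
    # Keep a backup, then insert the parameters before the closing quote.
    cp "$grub_file" "$grub_file.bak"
    sed -i "s/^\(GRUB_CMDLINE_LINUX=\"[^\"]*\)\"/\1 hugepagesz=16G hugepages=$pages\"/" "$grub_file"
)

# Example (run as root on the partition):
#   add_hugepage_params /etc/default/grub 4
```

Try the function on a copy of the file first and inspect the result before touching “/etc/default/grub”.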

    Regenerate the GRUB configuration:

    [root@XXXX ~]# grub2-mkconfig -o /boot/grub2/grub.cfg

    Generating grub configuration file …

    Generating boot entries from BLS files…

    Done

     

    • Reboot the partition.

    [root@XXXX ~]# reboot

  4. Verify the configured huge-page

    At the operating system level, there are several ways to verify that the huge-pages are configured properly.

    • Check the kernel command line to verify that the boot parameters include the 16G huge-page settings.

    [root@XXXX ~]# dmesg | grep huge

    [ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.18.0-64.el8.ppc64le root=/dev/mapper/rhel_XXXX-root ro crashkernel=auto rd.lvm.lv=rhel_xxxxx-lp1/root rd.lvm.lv=rhel_XXXX/swap biosdevname=0 hugepagesz=16G hugepages=4
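The same check can be made without scanning dmesg by reading the parameters from a command-line string such as the contents of /proc/cmdline. This is a small sketch; the function name cmdline_param is hypothetical, and the sample string is a shortened version of the command line shown above.

```shell
# Hypothetical helper: print the value of a key=value boot parameter.
# Arguments: parameter name, kernel command-line string.
cmdline_param() (
    key=$1
    cmdline=$2
    set -f    # disable glob expansion while the string is split into words
    for word in $cmdline; do
        case $word in
            "$key"=*) printf '%s\n' "${word#*=}" ;;
        esac
    done
)

# Shortened sample of the command line from this recipe.
sample='ro crashkernel=auto biosdevname=0 hugepagesz=16G hugepages=4'
echo "hugepagesz=$(cmdline_param hugepagesz "$sample") hugepages=$(cmdline_param hugepages "$sample")"

# On a running partition:
#   cmdline_param hugepagesz "$(cat /proc/cmdline)"
```

Because the match requires the “=” directly after the name, asking for hugepages does not pick up the hugepagesz entry.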

     

    • Check dmesg for the huge-page memory address log.

    If you have configured the huge-page pool and requested a number of huge-pages as a boot parameter, dmesg logs the expected messages.

    ## With a huge-page pool of, say, 10 pages configured, the OS reports 10 huge-page memory address entries, as shown below:

     linux-zzyi:~ # dmesg | grep -i "Huge page"

    [ 0.000000] Huge page(16GB) memory: addr = 0xF9C00000000 size = 0x400000000 pages = 1

    [ 0.000000] Huge page(16GB) memory: addr = 0xFA000000000 size = 0x400000000 pages = 1

    [ 0.000000] Huge page(16GB) memory: addr = 0xFA400000000 size = 0x400000000 pages = 1

    [ 0.000000] Huge page(16GB) memory: addr = 0xFA800000000 size = 0x400000000 pages = 1

    [ 0.000000] Huge page(16GB) memory: addr = 0xFAC00000000 size = 0x400000000 pages = 1

    [ 0.000000] Huge page(16GB) memory: addr = 0xFB000000000 size = 0x400000000 pages = 1

    [ 0.000000] Huge page(16GB) memory: addr = 0xFB400000000 size = 0x400000000 pages = 1

    [ 0.000000] Huge page(16GB) memory: addr = 0xFB800000000 size = 0x400000000 pages = 1

    [ 0.000000] Huge page(16GB) memory: addr = 0xFBC00000000 size = 0x400000000 pages = 1

    [ 0.000000] Huge page(16GB) memory: addr = 0xFC000000000 size = 0x400000000 pages = 1

     

    linux-zzyi:~ # dmesg | grep "Huge page" | wc -l

    10

     
    ## check dmesg for HugeTLB

    linux-zzyi:~ # dmesg | grep HugeTLB

    [ 20.429376] HugeTLB registered 16 GB page size, pre-allocated 4 pages

    [ 20.429395] HugeTLB registered 16 MB page size, pre-allocated 0 pages
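The two dmesg checks above can be combined into one summary. The following sketch assumes the exact message formats shown in this recipe (other kernel versions may word the HugeTLB line differently); the function name hugepage_boot_summary is hypothetical.

```shell
# Hypothetical helper: summarise 16G huge-page boot messages read from stdin.
# Assumes the "Huge page(16GB) memory:" and "HugeTLB registered 16 GB page size"
# message formats shown in this recipe.
hugepage_boot_summary() {
    awk '
        /Huge page\(16GB\) memory:/ { pool++ }
        /HugeTLB registered 16 GB page size/ {
            # The page count is the word that follows "pre-allocated".
            for (i = 1; i < NF; i++)
                if ($i == "pre-allocated") prealloc = $(i + 1)
        }
        END { printf "pool=%d preallocated=%d\n", pool, prealloc }
    '
}

# On a running partition:
#   dmesg | hugepage_boot_summary
```

For the output shown above, the summary would report a pool of 10 pages and 4 pre-allocated pages.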

     
    ## check nr_hugepages

    linux-zzyi:~ # cat /sys/kernel/mm/hugepages/hugepages-16777216kB/nr_hugepages

    4
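To see every huge-page size at once, the sysfs directories can be walked in a loop. This is a minimal sketch: the function name hugepage_report is hypothetical, and it relies only on the nr_hugepages and free_hugepages files that sit next to each other in each hugepages-<size>kB directory.

```shell
# Hypothetical helper: report total and free pages for each huge-page size.
# Argument: base directory (defaults to /sys/kernel/mm/hugepages).
hugepage_report() (
    base=${1:-/sys/kernel/mm/hugepages}
    for dir in "$base"/hugepages-*kB; do
        [ -d "$dir" ] || continue
        # Directory names look like hugepages-16777216kB; extract the number.
        size_kb=${dir##*hugepages-}
        size_kb=${size_kb%kB}
        nr=$(cat "$dir/nr_hugepages")
        free=$(cat "$dir/free_hugepages")
        echo "${size_kb}kB: total=$nr free=$free reserved_mb=$((size_kb * nr / 1024))"
    done
)

# On a running partition:
#   hugepage_report
```

With 4 pages of 16777216 kB each, the reserved memory works out to 65536 MB, that is, 64 GB.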

  5. Run libhugetlbfs tests to verify 16G huge-page access

    • Mount the configured 16G huge-pages

    linux-zzyi:~ # mkdir /mnt/16G

    linux-zzyi:~ # mount -o pagesize=16G -t hugetlbfs none /mnt/16G

    • Check for the huge-page mount

    linux-zzyi:~ # mount | grep huge

    cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)

    hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)

    none on /mnt/16G type hugetlbfs (rw,relatime,pagesize=16G)
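The mount check can be narrowed to hugetlbfs mounts that really use the 16G page size. The sketch below parses mount output in the format shown above; the function name hugetlbfs_mounts_16g is hypothetical, and mount points containing spaces are not handled.

```shell
# Hypothetical helper: print mount points of hugetlbfs mounts with pagesize=16G.
# Reads `mount` output ("<dev> on <dir> type <fstype> (<options>)") from stdin.
hugetlbfs_mounts_16g() {
    awk '$4 == "type" && $5 == "hugetlbfs" && $6 ~ /[(,]pagesize=16G[,)]/ { print $3 }'
}

# On a running partition:
#   mount | hugetlbfs_mounts_16g
```

For the output shown above, only /mnt/16G is printed, because the /dev/hugepages mount does not list a 16G page size.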

    • Build and run the tests to verify that the 16G huge-page tests pass

    git clone https://github.com/libhugetlbfs/libhugetlbfs.git

    cd libhugetlbfs

    Run make:

    [root@XXXX libhugetlbfs]# make

    VERSION

    version update: 2.21

    version string: 2.21

    CC64 obj64/elflink.o

    AS64 obj64/sys-elf64lppc.o

    CC64 obj64/elf64lppc.o

    CC64 obj64/hugeutils.o

    CC64 obj64/version.o

    version.c:3:19: warning: ‘libhugetlbfs_version’ defined but not used [-Wunused-const-variable=]

    static const char libhugetlbfs_version[] = "VERSION: "VERSION;

    ^~~~~~~~~~~~~~~~~~~~

    CC64 obj64/init.o

    CC64 obj64/morecore.o

    CC64 obj64/debug.o

    ….

    ….

    Run make check:

    [root@XXXXX libhugetlbfs]# make check

    zero_filesize_segment (16G: 64): PASS

    test_root (16G: 64): PASS

    meminfo_nohuge (16G: 64): PASS

    gethugepagesize (16G: 64): PASS

    gethugepagesizes (16G: 64): PASS

    HUGETLB_VERBOSE=1 empty_mounts (16G: 64): PASS

    HUGETLB_VERBOSE=1 large_mounts (16G: 64): PASS

    find_path (16G: 64): PASS

    unlinked_fd (16G: 64): PASS

    readback (16G: 64): PASS

    truncate (16G: 64): PASS

    shared (16G: 64): PASS

    mprotect (16G: 64): PASS

    mlock (16G: 64): PASS

    misalign (16G: 64): PASS

    fallocate_basic.sh (16G: 64): PASS

    fallocate_align.sh (16G: 64): PASS

    ptrace-write-hugepage (16G: 64): PASS

    icache-hygiene (16G: 64): PASS

    ….
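When the test output is long, it can help to list only the 16G tests that did not pass. This is a small sketch that assumes the “name (16G: 64): RESULT” line format shown above; the function name non_passing_16g_tests is hypothetical.

```shell
# Hypothetical helper: print 16G test lines whose result is not PASS.
# Reads test output from stdin; assumes lines like "readback (16G: 64): PASS".
non_passing_16g_tests() {
    awk '/\(16G: 64\):/ && $0 !~ /PASS[ \t]*$/'
}

# From the libhugetlbfs source directory:
#   make check 2>&1 | non_passing_16g_tests
```

Empty output means that every 16G test line in the input ended in PASS.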

     

  6. Troubleshooting

    Scenario 1:

    The huge-page pool is not configured, but a kernel boot parameter requests huge-pages.

    For example, “hugepagesz=16G hugepages=4” is passed as a boot parameter. In this case, no huge-pages are pre-allocated at boot time.

    Verify this from the dmesg log.

    ## No huge-page pool configured, so OS will return 0

    linux-zzyi:~ # dmesg | grep "Huge page" | wc -l

    0

    ## Huge-pages are requested during boot, but because the pool is not configured, 0 pages are pre-allocated

    linux-zzyi:~ # dmesg | grep HugeTLB

    [ 4.210982] HugeTLB registered 16 GB page size, pre-allocated 0 pages

    [ 4.210987] HugeTLB registered 16 MB page size, pre-allocated 0 pages

     

    Scenario 2:

    16GB huge-pages are always pre-allocated at boot time. If they were not pre-allocated, an attempt to write a value to nr_hugepages after the OS has booted fails with the error “…write error: Invalid argument”.

    linux-zzyi:~ # echo 1 > /sys/kernel/mm/hugepages/hugepages-16777216kB/nr_hugepages

    -bash: echo: write error: Invalid argument
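The two scenarios can be told apart from the counts seen in dmesg. The sketch below is only a rough decision aid built from the cases described in this recipe; the function name diagnose_hugepages and its messages are hypothetical.

```shell
# Hypothetical helper: suggest a likely cause from two counts taken from dmesg.
# Arguments: number of "Huge page" pool entries, number of pre-allocated pages.
diagnose_hugepages() (
    pool=$1
    prealloc=$2
    if [ "$pool" -eq 0 ]; then
        # Scenario 1: boot parameters alone cannot create the pool.
        echo "no huge-page pool: configure it in ASMI and in the partition profile"
    elif [ "$prealloc" -eq 0 ]; then
        # Pool exists, so the boot parameters are the next thing to check.
        echo "pool present but nothing pre-allocated: check hugepagesz=16G hugepages=N on the kernel command line, then reboot"
    else
        echo "ok: $prealloc pages pre-allocated"
    fi
)

# Counts from Scenario 1 in this recipe.
diagnose_hugepages 0 0
```

In both failing cases the fix requires a reboot, because nr_hugepages for the 16G size cannot be changed on a running system.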

     
