IBM AIX optimized system boot and dynamic reconfiguration
Reduced boot time and DLPAR time with LMT improvements
Two key concerns for system administrators during system maintenance are how long it takes to apply system patches or updates that require a reboot, and how quickly system resources can be reconfigured without disrupting existing workloads.
Boot time is an important component of system performance because users must wait for the boot operation to complete before they can use the device. It is the time a device takes to become ready to operate after the power has been turned on. Slow boot times can make system owners reluctant to apply patches or updates that require a reboot.
Dynamic logical partitioning (DLPAR) is the capability of a logical partition (LPAR) to be reconfigured dynamically, without having to shut down the operating system that runs in the LPAR. DLPAR enables memory, CPU capacity, and I/O interfaces to be moved non-disruptively between LPARs within the same server. This support has been available in IBM AIX since AIX 5L. System owners expect DLPAR operations to have minimal impact on the currently running workloads.
This blog talks about the AIX 7.3 system boot and DLPAR optimizations.
AIX 7.3 comes with an optimized boot phase that boots much faster than a similar configuration on earlier AIX releases. AIX 7.3 also significantly optimizes CPU and memory dynamic LPAR operations. Both improvements were achieved by redesigning the Lightweight Memory Trace (LMT) infrastructure.
LMT is a critical reliability, availability, and serviceability (RAS) function on AIX, and it is ON by default. To speed up the boot phase, the LMT buffer allocation, which occurs early in boot, was redesigned and optimized. In AIX 7.3, LMT allocates only enough buffer space during boot to capture the boot-time traces. After the boot, the LMT buffers are resized in the background without holding up the boot process, thereby significantly improving boot times.
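The deferred-allocation idea described above can be illustrated with a small Python sketch. This is only an analogy under stated assumptions, not the AIX kernel implementation: the `TraceBuffer` class, the `boot` function, and the buffer sizes are all hypothetical names and values chosen for illustration.

```python
import threading


class TraceBuffer:
    """Toy stand-in for a trace buffer (hypothetical, not the AIX LMT structure)."""

    def __init__(self, boot_size):
        # Allocate only what is needed to capture boot-time traces.
        self._lock = threading.Lock()
        self._data = bytearray(boot_size)

    @property
    def size(self):
        with self._lock:
            return len(self._data)

    def resize(self, full_size):
        # Do the potentially slow allocation without holding the lock.
        bigger = bytearray(full_size)
        with self._lock:
            if full_size < len(self._data):
                raise ValueError("this sketch only grows the buffer")
            # Keep the traces captured so far, then swap in the larger buffer.
            bigger[: len(self._data)] = self._data
            self._data = bigger


def boot(boot_size, full_size):
    """Return as soon as the small buffer exists; grow it in the background."""
    buf = TraceBuffer(boot_size)
    resizer = threading.Thread(target=buf.resize, args=(full_size,), daemon=True)
    resizer.start()
    return buf, resizer


if __name__ == "__main__":
    buf, resizer = boot(boot_size=4 * 1024, full_size=1024 * 1024)
    resizer.join()  # a real boot path would not wait here
    print(buf.size)
```

The point of the pattern is that the caller of `boot` gets control back after the small allocation, while the expensive allocation happens off the critical path.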
|Boot time till login prompt: Power9 (with AIX 7.2 TL5)|Boot time till login prompt: Power10 (with AIX 7.2 TL5)|Boot time till login prompt: Power10 (with AIX 7.3)|Boot time reduction: AIX optimization effect|Boot time reduction: AIX + Power10 effect|
The table above captures the percentage reduction in AIX boot time on a large-memory system with 48 cores in simultaneous multithreading (SMT) mode 8. AIX 7.3 is supported on IBM Power8 and later processors. IBM Power10 is the latest Power processor at the time of writing, so the data was captured on it for comparison. On average, we observed a reduction of more than 50% in AIX boot time on IBM Power10 compared to IBM Power9.
LMT buffer management was also optimized for DLPAR operations. The LMT buffers, which are allocated per CPU, may sometimes need to be resized during CPU or memory DLPAR operations to keep the total LMT buffer size under predefined system limits. These resize operations were optimized, which resulted in a significant reduction in the time spent on DLPAR operations.
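To see why adding CPUs can force a resize, consider a minimal Python sketch of per-CPU sizing under a total cap. The function name, the preferred per-CPU size, and the total limit are hypothetical values chosen for illustration. The actual AIX limits and sizing policy are not described here, and the sketch assumes one buffer per logical CPU.

```python
MIB = 1024 * 1024


def per_cpu_buffer_size(cpu_count, preferred_per_cpu, total_limit):
    """Largest per-CPU size that keeps cpu_count buffers within total_limit."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return min(preferred_per_cpu, total_limit // cpu_count)


if __name__ == "__main__":
    # Hypothetical numbers: 2 MiB preferred per CPU, 512 MiB total cap.
    # 24 cores in SMT 8 give 192 logical CPUs; 48 cores give 384.
    for cpus in (192, 384):
        size = per_cpu_buffer_size(cpus, 2 * MIB, 512 * MIB)
        print(cpus, "logical CPUs ->", size, "bytes per CPU")
```

With these assumed numbers, 192 logical CPUs fit at the preferred size, but growing to 384 pushes the total past the cap, so every existing buffer has to shrink. That is the kind of resize work a CPU ADD operation can trigger.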
CPU DLPAR completion time, Power9 versus Power10

|Memory size|DLPAR operation|Completion time on Power9 (AIX 7.2 TL5), in seconds|Completion time on Power10 (AIX 7.3), in seconds|Performance improvement on Power10|
|---|---|---|---|---|
|512 GB|ADD 24 Core|191|17|91%|
|512 GB|REM 24 Core|33|14|57%|
|1 TB|ADD 24 Core|360|25|93%|
|1 TB|REM 24 Core|70|21|70%|
|1.5 TB|ADD 24 Core|420|35|91%|
|1.5 TB|REM 24 Core|81|24|70%|
|1.5 TB|ADD 24 Core|262|35|86%|
|1.5 TB|REM 24 Core|44|19|56%|
|2.5 TB|ADD 24 Core|53|42|20%|
|2.5 TB|REM 24 Core|30|16|46%|
This table shows the time spent on the DLPAR process for adding and removing 24 cores with different memory sizes. The LPAR originally had 48 cores running in the default SMT 8 mode. The REM operation removes 24 cores and the ADD operation adds back those removed cores.
As the table above shows, both the ADD and REM paths improve significantly. On this setup, the scaling issue appears only up to 2 TB of memory, and the new design reduces it significantly.
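The improvement column can be reproduced from the completion times with a short Python sketch. The timings below are copied from the table, with memory sizes labeled as they appear there. The function name is illustrative, and the use of truncation rather than rounding is an inference from the reported figures, not something the measurements state.

```python
# (memory size, operation, Power9 seconds, Power10 seconds, reported %)
ROWS = [
    ("512 GB", "ADD", 191, 17, 91),
    ("512 GB", "REM", 33, 14, 57),
    ("1 TB", "ADD", 360, 25, 93),
    ("1 TB", "REM", 70, 21, 70),
    ("1.5 TB", "ADD", 420, 35, 91),
    ("1.5 TB", "REM", 81, 24, 70),
    ("1.5 TB", "ADD", 262, 35, 86),
    ("1.5 TB", "REM", 44, 19, 56),
    ("2.5 TB", "ADD", 53, 42, 20),
    ("2.5 TB", "REM", 30, 16, 46),
]


def improvement_percent(before_s, after_s):
    """Reduction in completion time, as a percentage of the original time."""
    return 100 * (before_s - after_s) / before_s


if __name__ == "__main__":
    for mem, op, p9, p10, reported in ROWS:
        pct = improvement_percent(p9, p10)
        print(f"{mem:>7} {op}: {p9:>3}s -> {p10:>2}s  {pct:5.1f}% (reported {reported}%)")
```

Every reported percentage matches the computed value truncated to a whole number.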
These optimizations are part of continuous and committed efforts from IBM AIX to better serve its customers. Reducing the time spent on boot and reconfiguration can provide a better administrative experience and is usually welcomed by the AIX system administrators.