While most PowerVC deployments use a single instance to manage the infrastructure, some environments need a mechanism to ensure that PowerVC can be recovered after a disaster. Though PowerVC does not provide full-fledged high availability (for example, active-active) at this point, this blog describes a strategy that can be employed to enable recovery. You might want to review the PowerVC Knowledge Center topic “Backing up IBM Power® Virtualization Center data” before you proceed further.

Taking PowerVC backup the default way

PowerVC provides the “powervc-backup” and “powervc-restore” commands for backup and restore respectively. The “powervc-backup” command can be used to take periodic backups, which can later be used to restore another instance of PowerVC (using “powervc-restore”). For this strategy, the system administrator has to maintain two instances of PowerVC, both at exactly the same version and with the same connectivity to resources.

PowerVC1 is referred to as the primary/active instance; PowerVC2 is referred to as the secondary/passive/standby instance. Once the active PowerVC (PowerVC1) is fully functional and managing the infrastructure, periodic backups can be taken from it.

The “powervc-backup” command takes a backup of all the configuration and data from an instance of PowerVC and creates an archive file from it. The archive file captures a snapshot of PowerVC at that particular moment in time. This archive file has to be provided as input to the “powervc-restore” command at the time of restore on PowerVC2.
The “powervc-backup” command can be run without any arguments, in which case it uses the default values.

Please note that this command stops all PowerVC services before taking the backup. This is necessary because PowerVC services use multiple databases, and a coherent snapshot of them can be captured only when no write operations are running against them. The “powervc-backup” command does not change or disrupt the PowerVC configuration in any manner and is safe to use. By default, each backup is created in a timestamped directory under /var/opt/ibm/powervc/backups/ on the local system. The command also accepts an option to write the backup to a directory of your choice.

PowerVC does not have an option to run this command periodically or at certain intervals. However, the system administrators can easily create a cron job to do this. The backups taken can also be stored at an external location (for example, network file system) so that they are not lost if the system that hosts the primary PowerVC goes down.
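The periodic backup and off-host copy described above could be scripted along the following lines. This is a minimal sketch, not an IBM-provided tool: the mount point /mnt/powervc-backups is hypothetical, and the --noprompt flag (used here so that the command does not wait for interactive confirmation under cron) is an assumption that should be checked against the help output of “powervc-backup” on your release.

```python
import shutil
import subprocess
import sys
from pathlib import Path

BACKUP_ROOT = Path("/var/opt/ibm/powervc/backups")  # default location named in the text
EXTERNAL_ROOT = Path("/mnt/powervc-backups")        # hypothetical NFS mount point


def newest_backup_dir(backup_root: Path) -> Path:
    """Return the most recently modified subdirectory of backup_root."""
    candidates = [p for p in backup_root.iterdir() if p.is_dir()]
    if not candidates:
        raise FileNotFoundError(f"no backup directories under {backup_root}")
    return max(candidates, key=lambda p: p.stat().st_mtime)


def copy_off_host(backup_dir: Path, external_root: Path) -> Path:
    """Copy one backup directory to the external location, keeping its name."""
    destination = external_root / backup_dir.name
    shutil.copytree(backup_dir, destination, dirs_exist_ok=True)
    return destination


def main() -> int:
    """Take a default (services stopped) backup, then copy it off the host."""
    # Assumption: --noprompt suppresses the confirmation prompt.
    result = subprocess.run(["powervc-backup", "--noprompt"])
    if result.returncode != 0:
        print("powervc-backup failed; nothing was copied", file=sys.stderr)
        return result.returncode
    copied = copy_off_host(newest_backup_dir(BACKUP_ROOT), EXTERNAL_ROOT)
    print(f"backup copied to {copied}")
    return 0
```

A cron entry would then invoke a small wrapper that calls main() at the chosen interval, ideally during a maintenance window, because the default backup stops PowerVC services while it runs.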
While services are stopped during “powervc-backup” or “powervc-restore”, PowerVC is in maintenance mode, as shown in the image below:

Running “powervc-backup” without stopping services

While the default behavior is to stop services before taking a backup, the “powervc-backup” command provides an argument named --active that “attempts” to take a backup without stopping the services. The primary limitation of this option is that the backup might not succeed if any operation running against PowerVC is performing database operations internally. Thus, do not be surprised if you see this command fail; it merely means that the database backup could not be taken because one or more operations were in progress. If the command fails with the --active option, it does not generate any backup archive file at all.

The example above shows a case where the backup command has failed. In many cases, the command has to be run several times before a backup is created successfully. Many PowerVC customers run it at off-peak hours, when fewer operations are in progress, so that backups can be taken with this option without stopping PowerVC services.
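Because an --active backup may need several attempts, the retries can be automated. The sketch below is one possible approach under stated assumptions: it assumes “powervc-backup” returns a non-zero exit status when the backup fails, and that a --noprompt flag is available to avoid interactive confirmation; the attempt count and delay are illustrative values, not recommendations from PowerVC.

```python
import subprocess
import time
from typing import Callable, Sequence

# --active is described in the text; --noprompt is an assumption.
ACTIVE_BACKUP_CMD = ["powervc-backup", "--active", "--noprompt"]


def run_command(cmd: Sequence[str]) -> int:
    """Run cmd and return its exit status."""
    return subprocess.run(list(cmd)).returncode


def backup_with_retries(
    attempts: int = 5,
    delay_seconds: float = 300.0,
    run: Callable[[Sequence[str]], int] = run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try an active backup up to `attempts` times; return True on the first success."""
    for attempt in range(1, attempts + 1):
        # Assumption: exit status 0 means the archive was created.
        if run(ACTIVE_BACKUP_CMD) == 0:
            return True
        if attempt < attempts:
            sleep(delay_seconds)  # give in-flight operations time to finish
    return False
```

The run and sleep parameters exist only so that the loop can be exercised without a PowerVC installation. If every attempt fails, a scheduled job could fall back to the default backup, which stops services.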

Seen below is an example of a successful PowerVC backup using the --active option:

Restoring a backup on PowerVC

The backup archive created by the above operation can be restored on PowerVC2 (the passive/standby PowerVC). The important thing to note is that when “powervc-restore” is run on PowerVC2, all resources such as VMs and compute nodes that were originally managed by PowerVC1 are automatically moved to PowerVC2, and all references to them are removed from PowerVC1. Note that compute nodes here refers to NovaLink hosts, and VMs refers to VMs deployed on NovaLink hosts. The HMC, HMC-managed hosts, and HMC-managed VMs will still be visible on PowerVC1.

Running powervc-restore on the same PowerVC instance

These backups can also be used when PowerVC1 itself ends up in a corrupt state (for example, if the configuration files are accidentally damaged beyond recovery). In such a case, the “powervc-restore” command can be run on the same system where PowerVC1 is installed. The system administrator can restore PowerVC to a previous state by providing the appropriate backup archive file as input.

Running “powervc-restore” command on another PowerVC instance

To run the “powervc-restore” command on another PowerVC instance (for example, on PowerVC2, which is a passive/standby PowerVC), the backup taken from PowerVC1 has to be made accessible to PowerVC2. The restore operation stops all services running on PowerVC2, restores the backup, which consists of configuration and database files, and then starts the services. This process unmanages and removes all compute nodes and virtual machines from PowerVC1 (if PowerVC1 is still active) and adds them to PowerVC2, so that PowerVC2 becomes the active, primary management node for all the resources that were previously managed by PowerVC1.
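Before starting a restore on the standby, it can be worth confirming that the copied archive is intact, since the restore stops services on PowerVC2. The sketch below checks that the archive is a readable tar file and then builds the restore command. The archive name powervc_backup.tar.gz and the --targetdir option (pointing at the directory that holds the archive) are assumptions here; confirm both against the help output of “powervc-restore” on your release.

```python
import tarfile
from pathlib import Path
from typing import List

ARCHIVE_NAME = "powervc_backup.tar.gz"  # assumed archive file name


def archive_is_readable(archive: Path) -> bool:
    """Return True if the file is a tar archive whose member list can be read."""
    try:
        with tarfile.open(archive) as tar:
            tar.getmembers()
        return True
    except (tarfile.TarError, OSError, EOFError):
        return False


def restore_command(backup_dir: Path) -> List[str]:
    """Build the restore command for a directory that holds the backup archive."""
    archive = backup_dir / ARCHIVE_NAME
    if not archive_is_readable(archive):
        raise ValueError(f"{archive} is missing or not a readable tar archive")
    # Assumption: --targetdir names the directory containing the archive.
    return ["powervc-restore", "--targetdir", str(backup_dir)]
```

The returned list can be passed to subprocess.run on PowerVC2 once the administrator has decided to fail over. The check only confirms that the archive can be read; it does not validate that its contents match the PowerVC version installed on the standby.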

The message below is displayed on PowerVC1 for the managed remote nodes when the backup is restored on PowerVC2.

There could be cases where PowerVC1 has crashed completely. In that case PowerVC2 is restored successfully, but the restore process cannot clean up the references to the managed resources in PowerVC1 (as it is inaccessible). When PowerVC1 eventually comes back up, the PowerVC administrator has to log in to the PowerVC1 GUI and clean up the compute resources, such as hosts and VMs, that are in an error state.
In the image below, you can see that the NovaLink host is displayed in an unknown state in PowerVC1 after the host system comes back up:

When you manually remove the NovaLink host from PowerVC1 after the host has been restored on PowerVC2, PowerVC offers an option to remove the PowerVC software from the NovaLink host (as shown in the picture below). Do not select this checkbox; otherwise, the NovaLink host might get corrupted.

Conclusion

Depending on the PowerVC deployment in your environment, PowerVC backup and restore might be a suitable way to manage the availability of your PowerVC management controller.

If you have any questions about this topic, please comment below. Watch this space for more information about troubleshooting your environment. In the meantime, don’t forget to follow us on LinkedIn, Facebook, and Twitter.

Authors:
Ankit Arora (aarora06@in.ibm.com)
Divya K Konoor (dikonoor@in.ibm.com)
