Installation toolkit versions earlier than 5.0.2 required all nodes and component services to be online and healthy before the upgrade started. Furthermore, the upgrade process was not designed to be run again if the fresh upgrade failed.

Starting with the IBM Spectrum Scale 5.0.2 release, the installation toolkit supports an upgrade rerun if the fresh upgrade fails.

    Do an upgrade rerun if the fresh upgrade fails for any of the following reasons:

  • A component stopped during the upgrade process and became unhealthy
  • A network issue occurred during the upgrade
  • A Chef process hung
  • Mixed mode packages
  • The performance monitoring services configuration failed during the initial upgrade process

The 5.0.2 installation toolkit upgrade introduces the ability to resume a previously failed upgrade.

You can do a rerun by using the regular upgrade command (./spectrumscale upgrade run). The installation toolkit automatically identifies whether this command is being run for a fresh upgrade or for a rerun. During a rerun, the installation toolkit performs the upgrade and ignores the fatal errors that you received during the fresh upgrade.
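As a rough illustration of this workflow, the following shell sketch wraps the upgrade command so that a failure is reported together with a reminder that the same command performs the rerun. The TOOLKIT variable, the upgrade_or_report function name, and the assumption that the toolkit exits with a nonzero status on failure are illustrative and are not taken from the product documentation.

```shell
# Path to the toolkit executable. The default is an assumption; adjust it
# to the installer directory of your release.
TOOLKIT="${TOOLKIT:-./spectrumscale}"

# Hypothetical helper: run the upgrade and report the outcome.
# The same command serves for a fresh upgrade and for a rerun; the toolkit
# itself decides which of the two it is performing.
upgrade_or_report() {
    if "$TOOLKIT" upgrade run; then
        echo "upgrade completed"
    else
        exit_code=$?
        echo "upgrade failed (exit code $exit_code); fix the cause, then run the same command again" >&2
        return "$exit_code"
    fi
}
```

After sourcing the sketch from the installer directory, call upgrade_or_report. If it reports a failure, correct the underlying problem and call it again to start the rerun.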

    During a rerun, the installation toolkit behaves as follows:

  • The installation toolkit tries to do the upgrade even if some components become unhealthy.
  • The installation toolkit displays warnings instead of fatal messages if some components are not healthy on a node.
  • The installation toolkit does not try to stop services that are already stopped.
  • The installation toolkit does not try to start services that are already running.
  • The installation toolkit tries to start services that are stopped.
  • The installation toolkit does not try to start or stop the services on a node or component that is kept offline by using the ./spectrumscale upgrade config offline -N command.
  • The installation toolkit does not try to start or stop the services on a node or component that is excluded by using the ./spectrumscale upgrade config exclude -N command.
  • The installation toolkit displays a warning message for every unhealthy component.
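The offline and exclude settings named in this list are applied per node before the rerun. A minimal sketch, assuming a hypothetical helper named mark_node and a TOOLKIT variable that points at the toolkit executable; the node name is supplied by the caller and nothing here is specific to a real cluster.

```shell
# Path to the toolkit executable. The default is an assumption.
TOOLKIT="${TOOLKIT:-./spectrumscale}"

# Hypothetical helper: mark one node as offline or excluded for the upgrade.
#   $1  mode, either "offline" or "exclude"
#   $2  node name as known to the installation toolkit
mark_node() {
    mode=$1
    node=$2
    case "$mode" in
        offline|exclude)
            "$TOOLKIT" upgrade config "$mode" -N "$node"
            ;;
        *)
            echo "usage: mark_node offline|exclude <node>" >&2
            return 2
            ;;
    esac
}
```

Restricting the mode to the two values that the text mentions keeps a typing mistake from being passed through to the toolkit as an unknown subcommand.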

Upgrade rerun is implemented for all components that the installation toolkit supports:

    GPFS:

Upgrade resume enables you to run the upgrade even if GPFS becomes unhealthy on some of the nodes in a cluster during the initial upgrade. The upgrade resume precheck displays appropriate warning messages to indicate unhealthy GPFS states.

    NFS:

Upgrade resume enables you to run the upgrade even if NFS becomes unhealthy on some of the nodes in a cluster during the initial upgrade. The upgrade resume precheck displays appropriate warning messages to indicate unhealthy NFS states.

    SMB:

Upgrade resume enables you to run the upgrade even if SMB becomes unhealthy on some of the nodes in a cluster during the initial upgrade. The upgrade resume precheck displays appropriate warning messages to indicate unhealthy SMB states.

    Object:

Upgrade resume enables you to run the upgrade even if Object becomes unhealthy on some of the nodes in a cluster during the initial upgrade. The upgrade resume precheck displays appropriate warning messages to indicate unhealthy Object states.

    Zimon:

Upgrade resume enables you to run the upgrade even if Zimon becomes unhealthy on some of the nodes in a cluster during the initial upgrade. The upgrade resume precheck displays appropriate warning messages to indicate unhealthy Zimon states.
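Because the precheck reports unhealthy component states as warnings, it can help to review only those lines before starting the rerun. The sketch below runs the precheck and prints the warning lines. The TOOLKIT variable and the show_precheck_warnings name are hypothetical, and the filter assumes that warning lines contain the string WARN, which may not match the output of your release.

```shell
# Path to the toolkit executable. The default is an assumption.
TOOLKIT="${TOOLKIT:-./spectrumscale}"

# Hypothetical helper: run the upgrade precheck and print only warning lines.
# Assumption: warning lines in the toolkit output contain the string "WARN".
show_precheck_warnings() {
    precheck_status=0
    # Capture stdout and stderr together, remembering the precheck exit status.
    output=$("$TOOLKIT" upgrade precheck 2>&1) || precheck_status=$?
    # grep exits with 1 when nothing matches; that is not an error here.
    printf '%s\n' "$output" | grep 'WARN' || true
    return "$precheck_status"
}
```

The function returns the exit status of the precheck itself, so a caller can still distinguish a failed precheck from one that only produced warnings.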

    Some examples where the upgrade rerun procedure is used:

  1. A conflicting package is installed and the toolkit upgrade fails.
    Remove the conflicting package, install the correct package, and run ./spectrumscale upgrade run again. The rerun tolerates any nodes that are not in a healthy state and resumes the upgrade from the beginning.

  2. A Chef process hangs and the upgrade fails.
    Follow the Chef cleanup procedure described in the IBM Knowledge Center (KC) troubleshooting documentation, and then run ./spectrumscale upgrade run again. The rerun tolerates any nodes that are not in a healthy state and resumes the upgrade from the beginning.

  3. The upgrade fails because mmperfmon config add (Zimon) is run when performance monitoring is not set up on a manually created cluster.
    Disable performance monitoring by using ./spectrumscale config perfmon -r off, and then run ./spectrumscale upgrade run again. The rerun tolerates any nodes that are not in a healthy state and resumes the upgrade.
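The third example can be sketched as a small shell helper that chains the two commands named in the text. The TOOLKIT variable and the rerun_without_perfmon name are hypothetical; the toolkit subcommands are the ones quoted in the example.

```shell
# Path to the toolkit executable. The default is an assumption.
TOOLKIT="${TOOLKIT:-./spectrumscale}"

# Hypothetical helper for the third example: turn off performance monitoring
# in the toolkit configuration, then start the upgrade rerun.
# The rerun is started only if the configuration change succeeds.
rerun_without_perfmon() {
    "$TOOLKIT" config perfmon -r off && "$TOOLKIT" upgrade run
}
```

Chaining with && means that a failed configuration change stops the sequence, so the rerun does not start with performance monitoring still enabled.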
