Introduction

IBM Spectrum Scale is a flexible software-defined storage solution that can be deployed as high-performance file storage or as a cost-optimized, large-scale content repository. IBM Spectrum Scale, previously known as IBM General Parallel File System (GPFS), is designed to scale performance and capacity with no bottlenecks. It is a cluster file system that provides concurrent access to file systems from multiple nodes. The storage provided by these nodes can be direct attached, network attached, SAN attached, or a combination of these methods. Beyond common data access, Spectrum Scale provides many features, including data replication, policy-based storage management, and space-efficient file snapshot and clone operations.

How the Spectrum Scale OpenStack Cinder driver works

The Spectrum Scale OpenStack Cinder driver, named ‘gpfs.py’, enables the use of Spectrum Scale as a storage backend for provisioning volumes for OpenStack instances. With the Spectrum Scale driver, instances do not actually access a storage device at the block level. Instead, volume backing files are created in a Spectrum Scale file system and mapped to instances, where they emulate block devices.

Spectrum Scale must be installed and a cluster must be created on the storage nodes in the OpenStack environment. A file system must also be created and mounted on these nodes before configuring the cinder service to use Spectrum Scale storage. For more details, refer to the Spectrum Scale product documentation.
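
As a rough sketch only, the cluster and file system setup might look like the following (the node file, NSD stanza file, file system name gpfs0, and mount point /gpfs/fs1 are placeholders; see the Spectrum Scale product documentation for the full procedure):

# Create the cluster from a node descriptor file, accept licenses, and start GPFS
$ mmcrcluster -N nodes.list -C openstack-cluster
$ mmchlicense server --accept -N all
$ mmstartup -a

# Create NSDs and a file system from a stanza file, then mount it on all nodes
$ mmcrnsd -F nsd.stanza
$ mmcrfs gpfs0 -F nsd.stanza -T /gpfs/fs1
$ mmmount gpfs0 -a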

Optionally, the Image service can be configured to store glance images in a Spectrum Scale file system. When a Block Storage volume is created from an image, and both the image data and the volume data reside in the same Spectrum Scale file system, the data from the image file is moved efficiently to the volume file using a copy-on-write optimization strategy.
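
For reference, a minimal sketch of the relevant glance-api.conf settings when storing images in the Spectrum Scale file system might look like this (the path is a placeholder):

[glance_store]
stores = file,http
default_store = file
# Point the filesystem store at a directory inside the GPFS file system
filesystem_store_datadir = /gpfs/fs1/images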

Supported operations

  • Create, delete, attach, and detach volumes.
  • Create and delete volume snapshots.
  • Create a volume from a snapshot.
  • Create cloned volumes.
  • Extend a volume.
  • Migrate a volume.
  • Retype a volume.
  • Create and delete consistency groups.
  • Create and delete consistency group snapshots.
  • Copy an image to a volume.
  • Copy a volume to an image.
  • Backup and restore volumes.
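
As an illustration, a few of these operations can be exercised with the OpenStack CLI (the volume and snapshot names and sizes below are placeholders):

$ openstack volume create --size 10 vol1
$ openstack volume snapshot create --volume vol1 snap1
$ openstack volume create --snapshot snap1 --size 10 vol2   # volume from snapshot
$ openstack volume create --source vol1 --size 10 vol3      # cloned volume
$ openstack volume set --size 20 vol1                       # extend a volume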

Driver configurations

The Spectrum Scale volume driver supports three modes of deployment.

Mode 1 – Pervasive Spectrum Scale Client

In this mode, Spectrum Scale runs on the compute nodes as well as on the cinder node, so the Spectrum Scale file system is available to both the Compute and Block Storage services as a local file system.

To use the Spectrum Scale driver in this deployment mode, set volume_driver in cinder.conf as follows:

volume_driver = cinder.volume.drivers.ibm.gpfs.GPFSDriver

The Spectrum Scale driver supports the following configuration options in this deployment mode:

  • gpfs_mount_point_base – Specifies the path of the GPFS directory where Block Storage volume and snapshot files are stored.
  • gpfs_sparse_volumes – Specifies that volumes are created as sparse files which initially consume no space. If set to False, the volume is created as a fully allocated file, in which case, creation may take a significantly longer time.
  • gpfs_storage_pool – Specifies the storage pool that volumes are assigned to. By default, the system storage pool is used.
  • gpfs_max_clone_depth – Specifies an upper limit on the number of indirections required to reach a specific block due to snapshots or clones. A lengthy chain of copy-on-write snapshots or clones can have a negative impact on performance, but improves space utilization. 0 indicates unlimited clone depth.
  • gpfs_images_dir – Specifies the path of the Image service repository in GPFS. Leave undefined if not storing images in GPFS.
  • gpfs_images_share_mode – Specifies the type of image copy to be used. Set this when the Image service repository also uses GPFS so that image files can be transferred efficiently from the Image service to the Block Storage service. There are two valid values: “copy” specifies that a full copy of the image is made; “copy_on_write” specifies that a copy-on-write optimization strategy is used and unmodified blocks of the image file are shared efficiently.

 

Note that the gpfs_images_share_mode flag is only valid if the Image service is configured to use Spectrum Scale with the gpfs_images_dir flag. When the value of this flag is copy_on_write, the paths specified by the gpfs_mount_point_base and gpfs_images_dir flags must both reside in the same GPFS file system and in the same GPFS fileset.
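
Putting these options together, a Mode 1 backend section in cinder.conf might look like the following sketch (the [gpfs] backend name, paths, and values are illustrative placeholders; the backend must also be listed in enabled_backends):

[DEFAULT]
enabled_backends = gpfs

[gpfs]
volume_driver = cinder.volume.drivers.ibm.gpfs.GPFSDriver
volume_backend_name = GPFS
gpfs_mount_point_base = /gpfs/fs1/volumes
gpfs_sparse_volumes = True
gpfs_storage_pool = system
gpfs_max_clone_depth = 8
# Only needed when glance images are also stored in the same GPFS file system
gpfs_images_dir = /gpfs/fs1/images
gpfs_images_share_mode = copy_on_write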

 

Mode 2 – Remote Spectrum Scale Driver with Local Compute Access

In this mode, Spectrum Scale runs on the compute nodes but not on the Block Storage node; the Spectrum Scale file system is available to the Compute service as a local file system, whereas the Block Storage service accesses Spectrum Scale remotely. In this case, the cinder-volume service running the Spectrum Scale driver accesses the storage system over SSH and creates volume backing files to make them available on the compute nodes. This mode is typically deployed when the cinder and glance services are running inside a Linux container. The container host should have the Spectrum Scale client running, and the GPFS file system mount path should be bind mounted into the Linux containers.

Note that the user IDs present in the containers should match those on the host machines; for example, the containers running the cinder and glance services should be privileged containers.
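
As a rough illustration only (the image name and paths are hypothetical), a privileged container with the GPFS mount path bind mounted might be started like this:

$ docker run --privileged \
    -v /gpfs/fs1:/gpfs/fs1 \
    -v /etc/cinder:/etc/cinder \
    --name cinder-volume my-cinder-volume-image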

To use the Spectrum Scale driver in this deployment mode, set volume_driver in cinder.conf as follows:

volume_driver = cinder.volume.drivers.ibm.gpfs.GPFSRemoteDriver

The Spectrum Scale driver supports the following configuration options in this deployment mode:

  • gpfs_mount_point_base – Specifies the path of the GPFS directory where Block Storage volume and snapshot files are stored.
  • gpfs_sparse_volumes – Specifies that volumes are created as sparse files which initially consume no space. If set to False, the volume is created as a fully allocated file, in which case, creation may take a significantly longer time.
  • gpfs_storage_pool – Specifies the storage pool that volumes are assigned to. By default, the system storage pool is used.
  • gpfs_max_clone_depth – Specifies an upper limit on the number of indirections required to reach a specific block due to snapshots or clones. A lengthy chain of copy-on-write snapshots or clones can have a negative impact on performance, but improves space utilization. 0 indicates unlimited clone depth.
  • gpfs_images_dir – Specifies the path of the Image service repository in GPFS. Leave undefined if not storing images in GPFS.
  • gpfs_images_share_mode – Specifies the type of image copy to be used. Set this when the Image service repository also uses GPFS so that image files can be transferred efficiently from the Image service to the Block Storage service. There are two valid values: “copy” specifies that a full copy of the image is made; “copy_on_write” specifies that a copy-on-write optimization strategy is used and unmodified blocks of the image file are shared efficiently.
  • gpfs_hosts – Comma-separated list of IP addresses or host names of the GPFS nodes.
  • gpfs_user_login – Username for the GPFS nodes.
  • gpfs_user_password – Password for the GPFS node user.
  • gpfs_private_key – Filename of the private key to use for SSH authentication.
  • gpfs_ssh_port – SSH port to use.
  • gpfs_hosts_key_file – File containing the SSH host keys for the GPFS nodes with which the driver communicates.
  • gpfs_strict_host_key_policy – Option to enable strict host key checking when connecting to the GPFS nodes.

As in the earlier mode, the gpfs_images_share_mode flag is only valid if the Image service is configured to use Spectrum Scale with the gpfs_images_dir flag. When the value of this flag is copy_on_write, the paths specified by the gpfs_mount_point_base and gpfs_images_dir flags must both reside in the same GPFS file system and in the same GPFS fileset.
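
A Mode 2 backend section in cinder.conf might then look like the following sketch (host names, credentials, and paths are illustrative placeholders):

[gpfs_remote]
volume_driver = cinder.volume.drivers.ibm.gpfs.GPFSRemoteDriver
volume_backend_name = GPFS_REMOTE
gpfs_mount_point_base = /gpfs/fs1/volumes
gpfs_hosts = gpfsnode1.example.com,gpfsnode2.example.com
gpfs_user_login = root
# Use either a password or a private key for the SSH connection
gpfs_private_key = /etc/cinder/ssh/id_rsa
gpfs_ssh_port = 22
gpfs_hosts_key_file = /etc/cinder/ssh/known_hosts
gpfs_strict_host_key_policy = False
gpfs_images_dir = /gpfs/fs1/images
gpfs_images_share_mode = copy_on_write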

Mode 3 – Remote Spectrum Scale Access

In this mode, neither the Compute nodes nor the Block Storage node runs the Spectrum Scale software, and neither has direct access to the Spectrum Scale file system as a local file system. In this case, an NFS export is created on the volume path and made available on the cinder node and on the compute nodes.

Optionally, to use the copy-on-write optimization to create bootable volumes from glance images, the glance images path must also be exported and mounted on the nodes where the glance and cinder services are running. The cinder and glance services then access the GPFS file system through NFS.
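
For illustration, a kernel NFS export and client mount might be set up along these lines (Spectrum Scale CES/NFS can be used instead; host names and paths are placeholders):

# /etc/exports on a GPFS node that exports the volume and image paths
/gpfs/fs1/volumes  *(rw,sync,no_root_squash)
/gpfs/fs1/images   *(rw,sync,no_root_squash)

# On the cinder, glance, and compute nodes
$ mount -t nfs gpfsnode.example.com:/gpfs/fs1/volumes /mnt/gpfs-volumes
$ mount -t nfs gpfsnode.example.com:/gpfs/fs1/images  /mnt/gpfs-images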

To use the Spectrum Scale driver in this deployment mode, set volume_driver in cinder.conf as follows:

volume_driver = cinder.volume.drivers.ibm.gpfs.GPFSNFSDriver

The Spectrum Scale driver supports the following configuration options in this deployment mode:

  • gpfs_mount_point_base – Specifies the path of the GPFS directory where Block Storage volume and snapshot files are stored.
  • gpfs_sparse_volumes – Specifies that volumes are created as sparse files which initially consume no space. If set to False, the volume is created as a fully allocated file, in which case, creation may take a significantly longer time.
  • gpfs_storage_pool – Specifies the storage pool that volumes are assigned to. By default, the system storage pool is used.
  • gpfs_max_clone_depth – Specifies an upper limit on the number of indirections required to reach a specific block due to snapshots or clones. A lengthy chain of copy-on-write snapshots or clones can have a negative impact on performance, but improves space utilization. 0 indicates unlimited clone depth.
  • gpfs_images_dir – Specifies the path of the Image service repository in GPFS. Leave undefined if not storing images in GPFS.
  • gpfs_images_share_mode – Specifies the type of image copy to be used. Set this when the Image service repository also uses GPFS so that image files can be transferred efficiently from the Image service to the Block Storage service. There are two valid values: “copy” specifies that a full copy of the image is made; “copy_on_write” specifies that a copy-on-write optimization strategy is used and unmodified blocks of the image file are shared efficiently.
  • nas_host – IP address or host name of the NAS system.
  • nas_login – User name to connect to the NAS system.
  • nas_password – Password to connect to the NAS system.
  • nas_private_key – Filename of the private key to use for SSH authentication.
  • nas_ssh_port – SSH port to use to connect to the NAS system.
  • nfs_mount_point_base – Base directory containing mount points for NFS shares.
  • nfs_shares_config – File with the list of available NFS shares.

 

Additionally, all the options of the base NFS driver are applicable to GPFSNFSDriver. The list above covers the basic configuration options that are needed to initialize the driver.

Note that the gpfs_images_share_mode flag is only valid if the Image service is configured to use Spectrum Scale with the gpfs_images_dir flag. When the value of this flag is copy_on_write, the paths specified by the gpfs_mount_point_base and gpfs_images_dir flags must both reside in the same GPFS file system and in the same GPFS fileset.
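
A Mode 3 backend section in cinder.conf, together with an NFS shares file, might look like the following sketch (host names, credentials, and paths are illustrative placeholders):

[gpfs_nfs]
volume_driver = cinder.volume.drivers.ibm.gpfs.GPFSNFSDriver
volume_backend_name = GPFS_NFS
nas_host = gpfsnode.example.com
nas_login = root
nas_private_key = /etc/cinder/ssh/id_rsa
nas_ssh_port = 22
nfs_shares_config = /etc/cinder/nfs_shares
nfs_mount_point_base = /mnt/cinder-nfs
gpfs_mount_point_base = /gpfs/fs1/volumes
gpfs_images_dir = /gpfs/fs1/images
gpfs_images_share_mode = copy_on_write

# /etc/cinder/nfs_shares – one NFS export per line
gpfsnode.example.com:/gpfs/fs1/volumes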

Volume creation options

It is possible to specify additional volume configuration options on a per-volume basis by specifying volume metadata. The volume is created using the specified options. Changing the metadata after the volume is created has no effect. The following volume creation options are supported by the GPFS volume driver:

  • fstype – Specifies whether to create a file system or a swap area on the new volume. If fstype=swap is specified, the mkswap command is used to create a swap area. Otherwise the mkfs command is passed the specified file system type, for example ext3, ext4, or ntfs.
  • fslabel – Sets the file system label for the file system specified by the fstype option. This value is only used if fstype is specified.
  • data_pool_name – Specifies the GPFS storage pool to which the volume is to be assigned. Note: The GPFS storage pool must already have been created.
  • replicas – Specifies how many copies of the volume file to create. Valid values are 1, 2, and, for Spectrum Scale V3.5.0.7 and later, 3. This value cannot be greater than the value of the MaxDataReplicas attribute of the file system.
  • dio – Enables or disables the Direct I/O caching policy for the volume file. Valid values are yes and no.
  • write_affinity_depth – Specifies the allocation policy to be used for the volume file. Note: This option only works if allow-write-affinity is set for the GPFS data pool.
  • block_group_factor – Specifies how many blocks are laid out sequentially in the volume file to behave as a single large block. Note: This option only works if allow-write-affinity is set for the GPFS data pool.
  • write_affinity_failure_group – Specifies the range of nodes (in a GPFS shared-nothing architecture) where replicas of blocks in the volume file are to be written. See the Spectrum Scale documentation for more details about this option.

The following example shows the creation of a 50 GB volume with an ext4 file system labeled newfs and direct I/O enabled:

$ openstack volume create --property fstype=ext4 --property fslabel=newfs --property dio=yes --size 50 VOLUME

Note that if the metadata of the volume is changed later, the changes are not reflected in the backend. The user must manually change the volume attributes on the Spectrum Scale file system to match the metadata.

For more information, please refer to https://www.redbooks.ibm.com/Abstracts/redp5331.html.
This IBM Redpaper publication describes the benefits and best-practice recommendations for using IBM Spectrum Scale in OpenStack environments.
