Overview

Skill Level: Beginner

Customers sometimes find themselves needing to move data between COS and File or Block storage. This recipe gives you a few examples of how to complete that task easily.

Ingredients

There are a number of tools that can be used to move data between IBM Cloud Object Storage (COS) and File or Block storage. This recipe serves as a summary of some of the tools available to you. For this exercise, I assume you have allocated an IBM Cloud Object Storage bucket and a VM with local disk, File (NFS), or Block storage as the destination for the data.

Common terms in this recipe:

  • mdms-migration-bucket - This is the name of the bucket in IBM COS.
  • /mnt/data - This is the directory on the server that is backed by File storage, Block storage, or a local file system.
    • In a Windows environment, this would most likely be an additional drive (e.g. d:\).

For the examples in this list, we will be using access key and secret key credentials to connect to COS. For PaaS-provisioned COS buckets, you will need to use HMAC credentials.
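
HMAC credentials are created by enabling the HMAC option on the service credential (for example, by adding {"HMAC": true} as an inline configuration parameter). The credential JSON then includes a cos_hmac_keys section similar to the following (the values shown are placeholders):

  "cos_hmac_keys": {
      "access_key_id": "<access key>",
      "secret_access_key": "<secret key>"
  }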

Tools to review:

  1. rclone
  2. s3fs
  3. aws cli
  4. s3cmd
  5. cyberduck

There are, of course, many other tools available; this is not intended to be an exhaustive list. Additionally, there are a number of SDKs that can be used when building custom applications.

Step-by-step

  1. rclone

    The rclone tool is a command line utility that works with a large number of backend storage systems. This includes Object Storage environments such as IBM COS, AWS S3, and Microsoft Azure Blob storage. A configuration wizard can be launched using the command: rclone config. Configuring rclone to work with IBM Cloud Object Storage can be done using these instructions or these or these (see the Configure rclone for COS section in this last link).
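
    As a minimal sketch, the resulting entry in the rclone configuration file (typically ~/.config/rclone/rclone.conf) might look like the following; the remote name (cos) and the us-south endpoint are assumptions, so substitute your own HMAC keys and endpoint:

    [cos]
    type = s3
    provider = IBMCOS
    access_key_id = <access key>
    secret_access_key = <secret key>
    endpoint = s3.us-south.cloud-object-storage.appdomain.cloud

    The remote name is what appears as <config_name> in the commands below.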

    Once configured, moving data between COS and the file system works much like a standard copy command. For example, to copy all data from a bucket to the local directory (backed by File, Block, or local drives), you could issue this command:

    rclone copy <config_name>:mdms-migration-bucket /mnt/data/

    Depending on the amount of data and the speed of the connection, copying data can take anywhere from a few minutes to a few days. There are several switches that can be used to manage the number of parallel transfers. If your system has enough CPU, memory, and network bandwidth, you can try increasing the number of parallel transfers using --transfers int, where int is the number of transfers to run.
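
    For example, to run 16 transfers in parallel (the value is only illustrative; tune it for your environment):

    rclone copy --transfers 16 <config_name>:mdms-migration-bucket /mnt/data/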

    You can also run rclone in the other direction if you need to move data from local/File/Block storage to COS. Simply reverse the parameters, like so:

    rclone copy /mnt/data/ <config_name>:mdms-migration-bucket

    Likewise, when uploading, there are several command line switches that can be used to manage the transfer, including the chunk size for large files uploaded as multipart.
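
    As a sketch, an upload with a larger multipart chunk size might look like this (the 64M value is illustrative):

    rclone copy --s3-chunk-size 64M /mnt/data/ <config_name>:mdms-migration-bucket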

    In both cases, rclone will compare the objects in the bucket against the files on the file system. Objects that are the same on both sides will not be recopied, which makes restarting the process very easy: after a short comparison pass, the copy resumes with the objects not already transferred.

    If you would also like items removed that have been deleted on one side or the other, you should have a look at the sync command rather than copy.
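
    For example, to make the bucket an exact mirror of the local directory (note that sync deletes destination objects that no longer exist on the source, so consider running with --dry-run first):

    rclone sync /mnt/data/ <config_name>:mdms-migration-bucket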

    Overall, rclone proves to be a very powerful tool for moving files/objects between different types of storage systems for those comfortable with a command line interface. It supports a number of platforms, including Linux, Mac, and Windows.

  2. s3fs

    The s3fs library is built on the FUSE API for operating systems in the Linux and BSD space, including macOS. Using s3fs, you can mount a COS bucket as a file system, similar to how you can mount File storage via NFS. To the user and applications on the system, the mounted bucket looks like any other mounted directory. Under the covers, file system operations are translated into COS API requests.

    Configuration
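
    As a minimal sketch, assuming HMAC credentials and a us-south public endpoint (substitute your own keys, endpoint, bucket, and mount point), configuration consists of storing the keys in a passwd file and then mounting the bucket:

    echo "<access key>:<secret key>" > ~/.passwd-s3fs
    chmod 600 ~/.passwd-s3fs
    s3fs mdms-migration-bucket /mnt/bucket -o passwd_file=~/.passwd-s3fs -o url=https://s3.us-south.cloud-object-storage.appdomain.cloud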

    Using s3fs, the COS bucket is really no different from a local file system on your machine. Any existing tools, libraries, scripts, etc. that you have in your toolbox will see the bucket as a standard file system. Copying data from the bucket to File, Block, or local storage is simply a matter of using the normal command line tools. For example, if your bucket is mounted at /mnt/bucket, you could copy the objects using:

    cp -r /mnt/bucket/* /mnt/data/

    That’s all there is to it.

    Depending on how you intend to use the data in the bucket, you might consider just leaving the objects in the bucket and accessing them via s3fs.

    While a little more complicated to install and configure than other options, the power of mounting a bucket as a remote file system could be worth the effort in many cases.

  3. aws cli

    The aws cli is the official command line interface for AWS. It is also compatible with the IBM COS S3 API. The official IBM Documentation will walk you through configuring and using the aws cli with COS. Once configured, moving objects from COS to File, Block, or local storage is similar to the other options we have seen. For example:

    aws --endpoint-url {endpoint} s3 cp --recursive s3://mdms-migration-bucket/ /mnt/data
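
    If you have not yet configured the CLI, one way to store the HMAC credentials is with aws configure set; as a minimal sketch (the keys are placeholders, and {endpoint} above would be a COS endpoint such as https://s3.us-south.cloud-object-storage.appdomain.cloud):

    aws configure set aws_access_key_id <access key>
    aws configure set aws_secret_access_key <secret key>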

    As with most tools, there are several options that can be used to control the number of parallel threads, size of chunks when uploading, etc.
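
    For example, the CLI's S3 transfer settings can be tuned through its configuration; the values below are illustrative:

    aws configure set default.s3.max_concurrent_requests 20
    aws configure set default.s3.multipart_chunksize 64MB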

    While similar to some of the other options, the aws cli is very well known in the industry and especially useful to teams that are already familiar with using it in their deployments.

  4. s3cmd

    The s3cmd tool is another command line alternative for moving data into and out of IBM COS. s3cmd provides a wizard that can be used to set up the appropriate configuration for accessing COS. Launch the wizard using the command

    s3cmd --configure

    See this documentation for assistance with the configuration process. Once configured, moving data between buckets and File, Block, or local storage is very similar to the other command line tools. To copy all the data from the bucket to the local directory, use this command:

    s3cmd get --recursive s3://mdms-migration-bucket/ /mnt/data/
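
    To move data in the other direction, from the local directory up to the bucket, use s3cmd put in the same way:

    s3cmd put --recursive /mnt/data/ s3://mdms-migration-bucket/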

    The s3cmd usage page will give you all of the details on the various command line options that are available to you with this tool.

  5. cyberduck

    For those that are looking for a GUI experience, Cyberduck is the tool you need. With this tool, you can connect to a bucket and browse it like you would in Windows Explorer or Mac Finder. These instructions will have you up and running in no time. You can browse your bucket in the Cyberduck UI, and you can download and sync between the bucket and local file systems using the GUI.

    If you are looking to mount your bucket on Windows or Mac, similar to s3fs, you can have a look at Mountain Duck from the team at Iterate GmbH.
