Classify images with Watson Machine Learning Accelerator

IBM Watson Machine Learning Accelerator is a software solution that bundles Watson Machine Learning Community Edition, IBM Spectrum Conductor, and IBM Spectrum Conductor Deep Learning Impact, with IBM support for the whole stack, including the open source deep learning frameworks. It provides an end-to-end deep learning platform for data scientists that covers the complete lifecycle: installation and configuration; data ingest and preparation; building, optimizing, and distributing the training model; and moving the model into production. Watson Machine Learning Accelerator is most useful when you expand your deep learning environment to include multiple compute nodes. A free evaluation is also available; see the Prerequisites for more information.

In this tutorial, you perform a basic computer vision image classification example using the Deep Learning Impact function within Watson Machine Learning Accelerator. The example classifies images of clothes and dresses according to whether the item is shown on its own or worn by a person. You can, of course, substitute your own data.

Learning objectives

After completing this tutorial, you will:

  • Have a feel for the deep learning workflow
  • Know how to classify images with Watson Machine Learning Accelerator
  • Know how to build a model using Watson Machine Learning Accelerator
  • Be more familiar with the IBM Power Systems server ecosystem

Estimated time

  • The end-to-end tutorial takes approximately 3 hours. This includes about 50 minutes of model training, plus installation, configuration, and driving the model through the GUI.

Prerequisites

The tutorial requires access to a GPU-accelerated IBM Power Systems server, model AC922 or S822LC. If you do not want to acquire a server, the PowerAI Developer Portal lists several other options for accessing Power Systems servers.

Steps

Step 1. Download, install, and configure the IBM Watson Machine Learning Accelerator Evaluation

Step 2. Configure OS user

  1. At the OS level, as root, on all nodes, create an OS group and user for the OS execution user:

     groupadd egoadmin
     useradd -g egoadmin -m egoadmin
    
  2. The GID and UID of the created user and group must be the same on all nodes.
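One way to meet this requirement is to pass explicit IDs when you create the group and user, and then compare the results across nodes. The sketch below assumes that you have collected one "host uid gid" line per node, for example by running id -u egoadmin and id -g egoadmin on each node. The helper name ids_consistent and the ID value 30001 are illustrative assumptions, not values from the product documentation.

```shell
# Example (run as root on every node; 30001 is an arbitrary ID that is
# assumed to be unused on all nodes):
#   groupadd -g 30001 egoadmin
#   useradd -u 30001 -g egoadmin -m egoadmin

# Hypothetical helper: reads "host uid gid" lines on stdin and succeeds
# only if there is at least one line and every line has the same uid and gid.
ids_consistent() {
    awk '
        NR == 1 { uid = $2; gid = $3 }
        $2 != uid || $3 != gid { bad = 1 }
        END { exit (NR == 0 || bad) ? 1 : 0 }
    '
}
```

For example, printf 'node1 30001 30001\nnode2 30001 30001\n' | ids_consistent returns success, while any line with a different UID or GID makes it fail.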

Step 3. Configure the resource groups

  1. Refer to https://github.com/IBM/wmla-assets/blob/master/runbook/WMLA_installation_configuration.md, Steps 1.1, 1.2, and 1.7.

  2. Name the resource group ImageRg.

  3. Refer to https://github.com/IBM/wmla-assets/blob/master/runbook/WMLA_installation_configuration.md, Steps 1.9 to 1.13.

Step 4. Create Spark instance group

  1. Select Workload->Instance Groups.

  2. Click New.

  3. Select Templates.

  4. Select dli-sig-template-2-3-3.

  5. Enter the following three values:

    [Screenshot of the three values to enter]

  6. Click Configuration and modify the Spark parameters, including:

    a. Set the JAVA_HOME variable to your host Java path.

    b. Set the SPARK_EGO_SLOTS_REQUIRED_TIMEOUT variable to 86400.

    c. Set the SPARK_EGO_RECLAIM_GRACE_PERIOD variable to 120.

  7. Scroll down and select the ImageRg resource group that you created previously for the Spark executors (GPU slots). Do not change any other configuration options.

  8. Click Create and Deploy Instance Group.

  9. Click Continue.
  10. Watch as your instance group gets deployed.
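Step 6a above asks for your host Java path. If you are unsure what it is, the following sketch shows one common way to derive it from the java executable on the PATH. It assumes a conventional layout where the executable lives in <java home>/bin/java and that GNU readlink is available; the helper name java_home_from_binary is hypothetical. On some installations the result is a jre subdirectory, so treat the output as a starting point and confirm it before using it.

```shell
# Hypothetical helper: given the path to a java executable, resolve any
# symlinks and print the directory two levels up (<home>/bin/java -> <home>).
java_home_from_binary() {
    java_bin=$(readlink -f "$1") || return 1
    dirname "$(dirname "$java_bin")"
}

# Example: java_home_from_binary "$(command -v java)"
```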

Step 5. Download the instrumented VGG-19 model for TensorFlow

Download all of the files in the https://us-south.git.cloud.ibm.com/ibmconductor-deep-learning-impact/dli-1.2.3-tensorflow-samples/tree/master/tensorflow-1.13.1/vgg19 directory.

Step 6. Download the pre-trained weights

Use the following code to download the pre-trained weights from TensorFlow. More information can be found in the GitHub repo.

mkdir <pretrained weight directory>
cd <pretrained weight directory>
wget http://download.tensorflow.org/models/vgg_19_2016_08_28.tar.gz
tar -zxvf vgg_19_2016_08_28.tar.gz

Change the owner and group of the weight file so that Watson Machine Learning Accelerator can read it.

chown -R egoadmin:egoadmin vgg_19.ckpt
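To confirm that the ownership change took effect, you can compare the numeric owner and group of the file with those of the egoadmin user. This is a minimal sketch that assumes GNU stat (the -c option), as shipped with Linux distributions; the helper name owned_by_ids is hypothetical.

```shell
# Hypothetical helper: succeed if FILE ($1) is owned by the numeric
# "uid:gid" pair given as $2. Relies on GNU stat's -c format option.
owned_by_ids() {
    [ "$(stat -c '%u:%g' "$1")" = "$2" ]
}

# Example:
#   owned_by_ids vgg_19.ckpt "$(id -u egoadmin):$(id -g egoadmin)" \
#       && echo "ownership OK"
```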

Step 7. Download the data sets

For this tutorial, we’re going to use a tool called googliser, which searches Google Images. It is a simple shell script with no prerequisites.

Use the following commands to run googliser and create four data sets in their own directories.

  • dresses_with_model
  • dresses_without_model
  • clothes_with_model
  • clothes_without_model

$ git clone https://github.com/teracow/googliser

$ cd googliser

$ ./googliser.sh --phrase "dresses with model" --title "dresses_with_model" --upper-size 200000 --lower-size 2000 --failures 0 -n 400 -N
 googliser.sh - 2018-07-26 PID:[43878]

 -> processing query: "dresses with model"
 -> searching Google:       10/10 result groups downloaded.      522 results!
 -> acquiring images:      400/400 downloaded and      115/     522 failed. (22%)

 -> All done!

$ ./googliser.sh --phrase "dresses only" --title "dresses_without_model" --upper-size 200000 --lower-size 2000 --failures 0 -n 400 -N
 googliser.sh - 2018-07-26 PID:[86968]

 -> processing query: "dresses only"
 -> searching Google:       10/10 result groups downloaded.      536 results!
 -> acquiring images:      400/400 downloaded and      122/     536 failed. (23%)

 -> All done!

$ ./googliser.sh --phrase "clothes with model" --title "clothes_with_model" --upper-size 200000 --lower-size 2000 --failures 0 -n 400 -N
 googliser.sh - 2018-07-26 PID:[14331]

 -> processing query: "clothes with model"
 -> searching Google:       10/10 result groups downloaded.      615 results!
 -> acquiring images:      400/400 downloaded and      194/     615 failed. (33%)

 -> All done!

$ ./googliser.sh --phrase "clothes only" --title "clothes_without_model" --upper-size 200000 --lower-size 2000 --failures 0 -n 400 -N
 googliser.sh - 2018-07-26 PID:[40210]

 -> processing query: "clothes only"
 -> searching Google:       10/10 result groups downloaded.      630 results!
 -> acquiring images:      400/400 downloaded and      112/     630 failed.  (34%)

 -> All done!
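The four runs above differ only in the search phrase and the title, so they can also be generated from a small table. The sketch below only prints the command lines, using exactly the options shown above, so that you can review them before running anything. The function names googliser_cmd and print_all_googliser_cmds are hypothetical.

```shell
# Hypothetical helper: print the googliser command line for one search
# phrase ($1) and output title ($2), with the options used in this tutorial.
googliser_cmd() {
    printf './googliser.sh --phrase "%s" --title "%s" --upper-size 200000 --lower-size 2000 --failures 0 -n 400 -N\n' "$1" "$2"
}

# One "phrase|title" pair per line, taken from the four runs above.
print_all_googliser_cmds() {
    while IFS='|' read -r phrase title; do
        googliser_cmd "$phrase" "$title"
    done <<'EOF'
dresses with model|dresses_with_model
dresses only|dresses_without_model
clothes with model|clothes_with_model
clothes only|clothes_without_model
EOF
}
```

After reviewing the output, you could pipe it to sh from inside the googliser directory to run the four downloads in sequence.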

Next, create a parent directory named train with two subdirectories, images_with_model and images_without_model, and move the downloaded images into them.

mkdir -p train/images_with_model
mv dresses_with_model/* train/images_with_model
mv clothes_with_model/* train/images_with_model

mkdir -p train/images_without_model
mv dresses_without_model/* train/images_without_model
mv clothes_without_model/* train/images_without_model
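Before loading the data, it can be worth checking that both class directories contain a similar number of files, because failed downloads or overwritten file names would leave one class smaller than expected. The following is a minimal sketch; the helper name count_images is hypothetical, and it counts every regular file without checking that it is a valid image.

```shell
# Hypothetical helper: print "<count> <dir>" for each directory given,
# counting regular files directly inside it (not in subdirectories).
count_images() {
    for dir in "$@"; do
        n=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "$dir"
    done
}

# Example: count_images path/to/images_with_model path/to/images_without_model
```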

Step 8. Load data into Watson Machine Learning Accelerator

Associate the images with Watson Machine Learning Accelerator by creating a new data set.

  1. In the Datasets tab, select New.

  2. Click Images for Object Classification. In the dialog box, provide a unique name (for example, ‘CodePatternDS’), select TFRecords for ‘Dataset stores images in’, and select the folder that contains the images obtained in the previous step. Fill in the remaining fields as shown in the screenshot below. When you’re ready, click Create.

    [Screenshots of the dataset dialog box]

With your data in Watson Machine Learning Accelerator, you can begin the next step, building a model.

Step 9. Build the model

  1. Select the Models tab and click New.

  2. Select Add Location.

  3. Select TensorFlow as the Framework.

  4. Select TensorFlow-VGG19 for your new model, and click Next.

  5. Ensure that the Training engine is set to singlenode and that the data set points to the one you just created.

    Note: Set the Base learning rate to 0.001 because larger values might lead to exploding gradients.

The model is now ready to be trained.

Step 10. Run Training

  1. Back at the Models tab, select Train to view the models you can train, then select the model you created in the previous step.

  2. Use the pre-trained weight file that you downloaded in Step 6 by specifying its directory. Make sure that the weight file has a .ckpt extension. Click Start Training.
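A quick way to check the extension requirement before you start training is to look for .ckpt files in the weight directory. This is a minimal sketch; the helper name has_ckpt_files is hypothetical, and it checks only the file name, not the file contents.

```shell
# Hypothetical helper: succeed if the directory ($1) directly contains at
# least one regular file whose name ends in .ckpt.
has_ckpt_files() {
    [ -n "$(find "$1" -maxdepth 1 -type f -name '*.ckpt' | head -n 1)" ]
}

# Example: has_ckpt_files "<pretrained weight directory>" || echo "no .ckpt file found"
```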

Step 11. Inspect the training run

  1. From the Train submenu of the Models tab, select the model that is training by clicking the link.

  2. Navigate from the Overview panel to the Training panel, and click the most recent link. You can watch as the results roll in.

Step 12. Create an inference model

From the Training view, click Create Inference Model.

This creates a new model in the Models tab. You can view it by going to the Inference submenu.

Step 13. Test it out

  1. Go back to the Models tab, select the new inference model, and click Test. At the new Testing overview screen, select New Test.

  2. Download the inference test images to your local disk.

  3. Unzip Inference_images.zip and use the Browse option to load 6 images. Click Start Test.

  4. Wait for the test state to change from RUNNING to FINISHED.

  5. Click the link to view the results of the test.

As you can see, the images are available as a thumbnail preview along with their classified label and probability.

Summary

We hope that you have enjoyed reading this tutorial. Happy hacking and good luck on creating your next model with Watson Machine Learning Accelerator.

Kelvin Lui
Jeff Karmiol
Dustin Vanstee