IBM Watson Machine Learning Accelerator is a software solution that bundles IBM PowerAI, IBM Spectrum Conductor, IBM Spectrum Conductor Deep Learning Impact, and support from IBM for the whole stack, including the open source deep learning frameworks. Watson Machine Learning Accelerator provides an end-to-end deep learning platform for data scientists. This includes complete lifecycle management, from installation and configuration, to data ingest and preparation, to building, optimizing, and distributing the training model, to moving the model into production. Watson Machine Learning Accelerator excels when you expand your deep learning environment to include multiple compute nodes. There’s even a free evaluation available. See the prerequisites from our first introduction tutorial, Classify images with Watson Machine Learning Accelerator.

Learning objectives

This is the second tutorial in the IBM Watson Machine Learning Accelerator education series:

  • Tasks:
    • Customize a notebook package to include Anaconda and sparkmagic.
    • Install the custom notebook package on Watson Machine Learning Accelerator.
    • Create a Spark instance group for the notebook.
    • Start the notebook server and upload a notebook to train a Keras model.
    • Connect to a Hadoop cluster from a notebook and execute a Spark MLlib model.

Estimated time

It should take you about two hours to complete this tutorial, which includes roughly 30 minutes for model training, plus the time needed for installation, configuration, and working with the model through the GUI.

Prerequisites

The tutorial requires access to a GPU-accelerated IBM Power® Systems server, model AC922 or S822LC. In addition to acquiring a server, there are multiple options for accessing Power Systems servers listed on the IBM PowerAI developer portal.

We’ll use an existing customized notebook package that includes scripts to install PowerAI 1.5.4.1, the version that comes with Watson Machine Learning Accelerator 1.1.2.

Task 1: Customize a notebook package to include Anaconda and other packages

  1. Download PowerAI-1.5.4.1-Notebook-Base.tar.gz from this github repo. Note: A newer version of the package may be available, such as 1.5.4.2. If so, use it in the instructions below, replacing the 1.5.4.1 version number with the latest one.
  2. Create a custom working directory and extract the PowerAI-1.5.4.1-Notebook-Base.tar.gz package.

     mkdir custom
     tar -C custom -xzvf PowerAI-1.5.4.1-Notebook-Base.tar.gz
    
  3. Download the Anaconda installer.

    cd custom/package
    wget https://repo.continuum.io/archive/Anaconda3-5.2.0-Linux-ppc64le.sh
    cd ..
    
  4. (Optional) Edit the build.version file to include today’s date. This example shows the edits using the vi editor.

     vi build.version
      enter insert mode (a) and edit the build date and number, for example
    
     Build Date: "Jan 17 2019"
     Build Number: 20190117
    
      press Esc and save the update (:wq)
    
  5. (Optional) Add additional pip packages of interest for your custom notebook by updating the added_packs file.

     vi scripts/added_packs
      enter insert mode (a), scroll to the bottom, and add additional packages (one per line). The current list of included packages is shown below.
    
       theano
       keras
       pandas
       sparkmagic
    
      press Esc and save the update (:wq)
    
  6. Tar the package in the custom directory created earlier.

     tar -czvf PowerAI-1.5.4.1-Notebook.tar.gz .
    

This notebook package can now be installed on Watson Machine Learning Accelerator.

Task 2: Install a custom notebook on Watson Machine Learning Accelerator

  1. Open the Spark Notebook Management panel by using the Spectrum Conductor management console.

    (Screenshot: cluster management console)

  2. Add a new notebook.

    (Screenshot: Spark notebook management console)

  3. Fill in the details for the PowerAI-1.5.4.1-Notebook.tar.gz notebook and click Add.

    (Screenshots: Add notebook console window, continued)

    • Give it a name like PowerAI and a version like 1.5.4.1 (to match the version of the notebook TAR file).

    • Use the Browse button to find the PowerAI-1.5.4.1-Notebook.tar.gz file you created in Task 1.

    • Check the box for Enable collaboration for the notebook.

    • Fill in the Start, Stop, and Monitor commands with:

        ./scripts/start_jupyter.sh
        ./scripts/stop_jupyter.sh
        ./scripts/jobMonitor.sh
      
    • Specify the number of seconds for the Longest update interval for a job monitor. We used 180 seconds.

  4. Click Add to begin the notebook upload. The upload time varies based on your network speed.

After the notebook has been added, you can configure it for use in a new or existing Spark Instance Group (SIG). In the next task, we show how to create a new SIG.

Task 3: Create a Spark Instance Group (SIG) for the notebook

  1. Create a new SIG and include the added notebook.

    (Screenshot: Spark instance group window)

    Provide the SIG name, the Spark deployment directory, and the execution user. The deployment directory is typically associated with the execution user (demouser in this case).

  2. Check the box for the new custom notebook, PowerAI 1.5.4.1.

    (Screenshots: New Spark instance group window, continued)

  3. Update the GPU resource group for the Spark Executors (GPU Slots).

    (Screenshot: Resource groups and plans window)

  4. Click Create and Deploy Instance Group. The SIG is created and deployed.

  5. After the deployment completes, start the SIG by clicking Start.

    (Screenshot: Notebook window)

Task 4: Create the notebook server for users and upload a notebook to train a Keras model

  1. After the SIG is started, go to the Notebook tab and click Create Notebooks for Users.

    (Screenshot: PowerAI 1.5.4.1-Notebook window)

  2. Select the users for the notebook server.

    (Screenshot: screen showing the My Notebooks button)

  3. After the notebook has been created, refresh the screen to see My Notebooks. Clicking this shows the list of notebook servers created for this SIG.

    (Screenshot: sign-in screen)

  4. Select the PowerAI 1.5.4.1 notebook to bring up the notebook server URL.

  5. Sign on to the notebook server.

    (Screenshot: notebook selection window)

  6. Download the tf_keras_fashion_mnist.ipynb notebook and upload it to the notebook server by clicking Upload. You have to click Upload again after specifying the notebook file.

    (Screenshot: cell execution)

  7. Select the notebook and begin executing the cells. The Keras model is defined in cell [13] and is trained in cell [15].


The test of the model shows an accuracy of more than 86 percent after being trained for five epochs.
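
The full model lives in the tf_keras_fashion_mnist.ipynb notebook itself; as a rough sketch only (not the notebook's exact code, and with layer sizes that are assumptions), a comparable tf.keras Fashion MNIST classifier trained for five epochs looks like this:

    # Minimal sketch of a Fashion MNIST classifier in tf.keras.
    # Illustrative only; tf_keras_fashion_mnist.ipynb is the
    # authoritative version, and the layer sizes are assumptions.
    import tensorflow as tf

    # Load Fashion MNIST (60,000 training and 10,000 test images).
    (train_images, train_labels), (test_images, test_labels) = \
        tf.keras.datasets.fashion_mnist.load_data()

    # Scale pixel values from the 0-255 range down to 0-1.
    train_images, test_images = train_images / 255.0, test_images / 255.0

    # A small fully connected network: flatten each 28x28 image,
    # apply one hidden layer, and end in a 10-way softmax.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    # Train for five epochs, then check accuracy on the test set.
    model.fit(train_images, train_labels, epochs=5)
    test_loss, test_acc = model.evaluate(test_images, test_labels)
    print('Test accuracy:', test_acc)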

Task 5: Connect to a Hadoop cluster from a notebook and execute a Spark MLlib model

This next section explains how to use the notebook to connect to a Hadoop data lake that has an Apache Livy service deployed. The following image shows the Hadoop integration.

(Figure: Hadoop integration)

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface. It supports long-running Spark sessions and multi-tenancy. To install it on your Hadoop cluster, see your Hadoop vendor's documentation, such as this guide from Hortonworks. To get the Spark MLlib notebook to connect and run, make the following two changes on the Hortonworks HDP cluster (a quick REST check of the Livy endpoint follows the list).

  1. Disable the Livy CSRF check by setting livy.server.csrf_protection.enabled=false in the HDP Spark2 configuration. Stop and Start all services to pick up the change.
  2. Install the numpy package via pip.

     yum -y install python-pip
     pip install numpy
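
Before running the notebook, you can sanity-check that the Livy service is reachable by exercising its REST interface directly. The following is a minimal sketch using Python's requests library; the host name is a placeholder, and 8998 is Livy's default port:

    # Sanity-check a Livy endpoint over its REST API.
    # 'your-hadoop-host' is a placeholder; 8998 is Livy's default port.
    import json
    import requests

    livy_url = 'http://your-hadoop-host:8998'
    headers = {'Content-Type': 'application/json'}

    # Ask Livy to start a new PySpark session.
    resp = requests.post(livy_url + '/sessions',
                         data=json.dumps({'kind': 'pyspark'}),
                         headers=headers)
    session = resp.json()
    print('Created session', session['id'], 'in state', session['state'])

    # List sessions to confirm the service is responding.
    print(requests.get(livy_url + '/sessions', headers=headers).json())

    # Clean up the session we just created.
    requests.delete('{0}/sessions/{1}'.format(livy_url, session['id']),
                    headers=headers)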

Sparkmagic is a set of tools for working interactively with remote Spark clusters through Livy from within a Jupyter Notebook. It is installed through pip and enabled in the notebook by running a Jupyter command.
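
For example, a first notebook cell that loads the sparkmagic extension and registers a session against the Livy endpoint might look like the following sketch; the session name and URL are placeholders:

    # Run inside a Jupyter notebook cell: load sparkmagic's IPython
    # magics, then register a remote PySpark session with Livy.
    # 'mysession' and the URL are placeholders for your environment.
    %load_ext sparkmagic.magics
    %spark add -s mysession -l python -u http://your-hadoop-host:8998

    # Code in subsequent %%spark cells now runs on the remote cluster.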

Sign on to the notebook server, import the hadoop_livy2_spark_mllib_test.ipynb notebook provided with this tutorial, and execute it.

  • Notebook cell [1] verifies that the sparkmagic module can be loaded.
  • Notebook cell [2] verifies that the Spark session can be created. Edit the URL to point to your Hadoop host and port for the Livy service.
  • Notebook cell [3] downloads the data and puts it in the HDFS /tmp directory.
  • Notebook cell [4] runs a Spark MLlib k-means clustering model.
  • Notebook cell [5] cleans up the Spark session running on the Livy service. It is important to clean up the session and associated Hadoop cluster resources.
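
As a rough illustration of what such a run looks like, here is a sketch of a k-means job submitted through sparkmagic's %%spark cell magic using the DataFrame-based MLlib API; the HDFS path, schema, and cluster count are assumptions, not the notebook's exact code:

    %%spark
    # Runs remotely on the Hadoop cluster through the Livy session.
    # The HDFS path and column layout are illustrative placeholders.
    from pyspark.ml.clustering import KMeans
    from pyspark.ml.feature import VectorAssembler

    # Read a numeric CSV previously copied into the HDFS /tmp directory.
    df = spark.read.csv('/tmp/sample_data.csv', header=True, inferSchema=True)

    # MLlib estimators expect a single vector column named 'features'.
    assembler = VectorAssembler(inputCols=df.columns, outputCol='features')
    features = assembler.transform(df)

    # Fit a k-means model with three clusters and print the centers.
    model = KMeans(k=3, seed=1).fit(features)
    for center in model.clusterCenters():
        print(center)

However you adapt the model code, always run the final cleanup cell (or %spark cleanup) so the session and its resources on the Hadoop cluster are released.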

(Screenshot: running the notebook)

Conclusion

You have now learned how to customize and install a notebook package on Watson Machine Learning Accelerator, use it to run a notebook that trains a Keras model, and run a notebook that connects to a Hadoop data lake and executes a Spark MLlib model.