Deep learning models using Watson Studio Neural Network Modeler and Experiments

Learning objectives

Deep learning is an efficient technique to solve complex problems, and the “science” part of data science is all about experimenting with different settings and comparing results. Using Watson Studio, you can easily architect a neural network using a friendly GUI, download the model as code in your favorite framework, and create experiments to compare different hyperparameter optimization settings.

In this tutorial, we’ll build a model that detects signature fraud by building a deep neural network. You will learn how to use Watson Studio’s Neural Network Modeler to quickly prototype a neural network architecture and test it. You will also learn how to download code generated from Neural Network Modeler and modify it to plug in and work with Watson Studio’s Experiments Hyperparameter Optimization.

The dataset contains images of signatures, some genuine and some simulated (fraudulent). The original source of the dataset is the ICFHR 2010 Signature Verification Competition [1]. The images are resized to 32×32 pixels and stored as NumPy arrays in pickled format.
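The internal layout of the provided pickle files is not described here, so the following is only a sketch of how image arrays can be written to and read back from a pickle. The `(images, labels)` tuple layout and the synthetic data are assumptions made for illustration.

```python
import io
import pickle

import numpy as np

# Synthetic stand-in for the dataset: 10 grayscale 32x32 images with binary labels.
# The (images, labels) tuple layout is an assumption, not the documented file format.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(10, 32, 32), dtype=np.uint8)
labels = rng.integers(0, 2, size=10)

# Serialize to an in-memory buffer instead of a real file.
buffer = io.BytesIO()
pickle.dump((images, labels), buffer)

# Read it back the same way a script would read a .pickle file opened in "rb" mode.
buffer.seek(0)
loaded_images, loaded_labels = pickle.load(buffer)
print(loaded_images.shape, loaded_labels.shape)
```

Inspecting the real files the same way (loading one and printing the type and shape of what comes back) is a quick way to confirm their actual structure.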


Estimated time

It takes approximately 1 hour to read and follow the steps in this tutorial.


1. Upload the dataset to IBM Cloud Object Storage

Before we start building our neural network, we’ll need to upload files containing the data to our Object Storage instance on the cloud. To do that, unzip the assets folder you downloaded as a prerequisite, and make sure you can locate 3 different data files: training_data.pickle, validation_data.pickle and test_data.pickle.

Go to your dashboard on IBM Cloud and click your Cloud Object Storage instance under Services.


Select Create Bucket to store the data. This step will make it easier to find the data when working with Watson Studio’s Neural Network Modeler.


The bucket’s name must be globally unique to IBM Cloud Object Storage. It is suggested to use your name and some sort of identifier for the project. Also, make sure that Cross Region Resiliency is selected, then click Create bucket.


Start adding files to your newly created bucket. You can do that by clicking the Upload button in the top right corner of the page and selecting the Files option from the drop-down menu.


Select the Standard Upload option and click the Select Files button.


Choose the three files named training_data.pickle, validation_data.pickle and test_data.pickle from the unzipped assets folder on your local disk. You should see a dialog asking you to confirm the file selections. Click Upload to proceed with the upload process.


Once the upload process is done, you should see the page updated and displaying the files you just uploaded.


2. Build a neural network using Watson Studio Neural Network Modeler

Select Create a project in Watson Studio.


Select the Standard option on the following page.


Name your project and associate a Cloud Object Storage instance. If you followed the previous step, your Object Storage instance should be detected and selectable from the dropdown.


You’re now ready to work with Watson Studio.


Create a Modeler Flow. You can find this option under Add to project in the top right of the project page.


Type a name for your model, select Neural Network Modeler, then click Create.


Once the previous step is successful, you’ll be presented with the Modeler Canvas. This is where you’ll build your neural network, which will be represented in graphical form instead of code. On the left of the screen you’ll find a sidebar named Palette, which contains all the available neural network components. The idea is to drag and drop nodes representing the different layers of a neural network and connect them to create a Flow.


First, we need to provide data for our neural network. To do that, select Image Data from the Input section in the Neural Network Modeler Palette.


Drag and drop the Image Data node on to the canvas, then double-click it to modify its properties. Notice that this will trigger another sidebar on the right.


To define the data source, create a new connection to your Cloud Object Storage (COS) instance by clicking Create a New Connection under the Data section, or select a connection if one already exists. Choose the bucket that contains your data assets (refer to Step 1). Choose training_data.pickle as the Training data file, test_data.pickle as the Test data file and validation_data.pickle as the Validation data file.


Now close the Data section and switch to the Settings section in the same right-side panel. Adjust all settings as described here and as shown in the screenshot below:

  • Set Image height to 32
  • Set Image width to 32
  • Set Channels to 1 since the images are grayscale
  • Set Tensor dimensionality to channels_last
  • Set Classes to 2 since we are trying to classify signature images into 2 classes, genuine and fraud
  • Set Data format as Python Pickle
  • Set Epochs to 100, this is how many times the neural network will iterate over the whole training set to adjust its weights and improve accuracy
  • Set Batch size to 16, this is how many images pass through the neural network at a time

Once you have all these settings in place, click Close to save them and to close the right sidebar.
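To make these settings concrete, here is a minimal NumPy sketch of what they imply for the shape of the data. The number of images (100) and the label encoding (0 for genuine, 1 for fraud) are assumptions chosen for illustration, not properties of the provided dataset.

```python
import math

import numpy as np

# Hypothetical batch of 100 grayscale images, 32x32 pixels each.
num_images = 100
images = np.zeros((num_images, 32, 32), dtype=np.float32)

# channels_last places the channel axis at the end: (batch, height, width, channels).
images = images.reshape(num_images, 32, 32, 1)

# Two classes encoded as one-hot vectors (0 = genuine, 1 = fraud is an assumed encoding).
labels = np.array([0, 1] * (num_images // 2))
one_hot = np.eye(2, dtype=np.float32)[labels]

# With a batch size of 16, one epoch needs ceil(100 / 16) weight updates.
batch_size = 16
steps_per_epoch = math.ceil(num_images / batch_size)
print(images.shape, one_hot.shape, steps_per_epoch)
```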


Now let’s start building the neural network. The first layer we will add is a 2D Convolutional Layer. Select Conv 2D node from the Convolution section in the left sidebar and drag and drop it on to the canvas.

Note: This is a sample architecture, please feel free to try different or more advanced ones


Connect the two nodes, then double-click the Conv 2D node to edit its properties. In the right sidebar, change the settings to the following:

  • Set Number of filters to 32, this is the number of feature maps we want to detect in a given image
  • Set Kernel row to 3, this is the height (number of rows) of the filter (think of a window) that will slide across an image and perform feature detection
  • Set Kernel col to 3, this is the width (number of columns) of the filter
  • Set Stride row to 1, this is the number of rows by which the filter moves at each vertical step
  • Set Stride col to 1, this is the number of columns by which the filter moves at each horizontal step
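The sliding-window idea behind these settings can be sketched in a few lines of NumPy. This is an illustration of a single filter on a single-channel image, assuming no padding; the padding that Neural Network Modeler applies by default is not stated here, and the function name `conv2d_valid` is made up for this example.

```python
import numpy as np

def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` without padding and sum the elementwise products."""
    k_rows, k_cols = kernel.shape
    out_rows = (image.shape[0] - k_rows) // stride + 1
    out_cols = (image.shape[1] - k_cols) // stride + 1
    output = np.zeros((out_rows, out_cols))
    for r in range(out_rows):
        for c in range(out_cols):
            window = image[r * stride:r * stride + k_rows,
                           c * stride:c * stride + k_cols]
            output[r, c] = np.sum(window * kernel)
    return output

# A 32x32 image filtered with a 3x3 kernel at stride 1.
image = np.ones((32, 32))
kernel = np.ones((3, 3))
feature_map = conv2d_valid(image, kernel, stride=1)
print(feature_map.shape)
```

Under the no-padding assumption, each 3×3 filter turns a 32×32 input into a 30×30 feature map, and a layer with 32 filters produces 32 such maps.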


Continue editing the Conv2D node properties:

  • Set Weight LR multiplier to 10, this is a value multiplied by the learning rate (which we will define later in the Neural Network hyperparameters). This is introduced to modify the learning rate value for each layer separately
  • Set Weight decay multiplier to 1, this is a value multiplied by the decay (which we will define later in the Neural Network hyperparameters). This is introduced to modify the weight decay rate for each layer separately
  • Set Bias LR multiplier to 10
  • Set Bias decay multiplier to 1

We only edited the required parameters here. There are other optional parameters that have default settings, such as Initialization, which sets the initial values of the weights. You can set an initial Bias value and choose whether it’s trainable or not. You can also choose a regularization method to minimize overfitting and improve model generalization. Regularization penalizes large weights and favors small ones, which yield a lower-complexity model that tends to generalize better.

Once you have all these settings in place, click Close to save them and to close the right sidebar.


Next, we’ll add the third node, which is an activation layer. We’ll choose ReLU (Rectified Linear Unit) as the activation function in our architecture. ReLU gives good results generally and is widely used in Convolutional Neural Networks.

Drag and drop the ReLU node, which you can find under the Activation section in the left sidebar.
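For reference, ReLU itself is a very small function: it passes positive values through unchanged and replaces negative values with zero. A minimal NumPy sketch:

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: keep positive values, replace negatives with zero."""
    return np.maximum(x, 0)

activations = relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0]))
print(activations)
```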


Then we’ll add another convolutional layer. Drag and drop a Conv 2D node from the Convolution section in the left sidebar. Make sure you connect the nodes after dropping them on to the canvas.


Double click on the second Conv2D node to trigger the right sidebar so we can edit its properties. Change the settings to the following:

  • Set Number of filters to 64
  • Set Kernel row to 3
  • Set Kernel col to 3
  • Set Stride row to 1
  • Set Stride col to 1


Continue editing the Conv2D node properties:

  • Set Weight LR multiplier to 1
  • Set Weight decay multiplier to 1
  • Set Bias LR multiplier to 1
  • Set Bias decay multiplier to 1

Once you have all these settings in place, click Close to save them and to close the right sidebar.


Add another Activation layer by dragging and dropping a ReLU node from the Activation section in the left sidebar.


Now, we’ll add a Max Pooling layer, with the purpose of down-sampling or dimensionality reduction of the features extracted from the previous convolutional layer. This is achieved through taking the maximum value within specific regions (windows) that will slide across the previous layer’s output. This step helps aggregate many low-level features, extracting only the most dominant ones thus reducing the amount of data to be processed.

Drag and drop a Pool 2D node from the Convolution section in the left sidebar.


Double-click Pool 2D node to edit its properties. Change the settings to the following:

  • Set Kernel row to 2
  • Set Kernel col to 2
  • Set Stride row to 1
  • Set Stride col to 1

Once you have all these settings in place, click Close to save them and to close the right sidebar.
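The pooling operation described above can be sketched as follows. This is an illustrative NumPy version for a single feature map without padding; `max_pool2d` is a made-up helper name, not part of the generated code.

```python
import numpy as np

def max_pool2d(feature_map, size=2, stride=1):
    """Take the maximum of each size x size window, moving `stride` cells at a time."""
    out_rows = (feature_map.shape[0] - size) // stride + 1
    out_cols = (feature_map.shape[1] - size) // stride + 1
    pooled = np.zeros((out_rows, out_cols), dtype=feature_map.dtype)
    for r in range(out_rows):
        for c in range(out_cols):
            window = feature_map[r * stride:r * stride + size,
                                 c * stride:c * stride + size]
            pooled[r, c] = window.max()
    return pooled

feature_map = np.arange(16).reshape(4, 4)
pooled_stride_1 = max_pool2d(feature_map, size=2, stride=1)  # 3x3 result
pooled_stride_2 = max_pool2d(feature_map, size=2, stride=2)  # 2x2 result
print(pooled_stride_1)
print(pooled_stride_2)
```

Note that with a stride of 1, as configured in this tutorial, each dimension shrinks by only one cell; a stride of 2 would halve it.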


Next, we’ll add a Dropout layer. This layer’s purpose is to help reduce overfitting, mainly by dropping or ignoring some neurons in the network randomly.

Drag and drop a Dropout node from the Core section in the left sidebar.


Double click the Dropout node and change its Probability to 0.25, then click Close.
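As a rough illustration of what a probability of 0.25 means, the sketch below implements "inverted" dropout, a common formulation in which surviving activations are rescaled so their expected value is unchanged. Whether the Modeler uses exactly this formulation is an assumption.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Inverted dropout: zero a random fraction `rate` of values and rescale the rest."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(seed=0)
activations = np.ones(10_000)
dropped = dropout(activations, rate=0.25, rng=rng)

# Roughly a quarter of the values are zeroed; the mean stays close to 1.
fraction_zeroed = float(np.mean(dropped == 0))
print(round(fraction_zeroed, 2), round(float(dropped.mean()), 2))
```

Dropout is only applied during training; at prediction time all neurons are kept.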


Now let’s move on to the fully connected layers. To do that, we need to flatten the output we have so far into a 1D vector.

Drag and drop a Flatten node from Core section.


Drag and drop a Dense node from Core section in the left sidebar.


Double click the Dense node and change the number of nodes in the settings to 128 and click Close.


Add another Dropout node. Change the Probability of this layer to 0.5 and click Close.


Add a final Dense node. This will represent the output classes, two in this case. Double click the node and change the number of nodes in the settings to 2 and click Close.
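To see how the layers fit together, the following sketch walks through the architecture and tracks the output shape and the number of trainable weights at each step. It assumes no padding in the convolution and pooling layers, which is not confirmed for Neural Network Modeler, so treat the numbers as indicative.

```python
def output_size(size, kernel, stride):
    """Output length along one dimension for an unpadded sliding window."""
    return (size - kernel) // stride + 1

height, width, channels = 32, 32, 1
params = 0

# Conv 2D: 32 filters, 3x3 kernel, stride 1 (weights plus one bias per filter).
params += (3 * 3 * channels + 1) * 32
height, width, channels = output_size(height, 3, 1), output_size(width, 3, 1), 32

# Conv 2D: 64 filters, 3x3 kernel, stride 1.
params += (3 * 3 * channels + 1) * 64
height, width, channels = output_size(height, 3, 1), output_size(width, 3, 1), 64

# Pool 2D: 2x2 window, stride 1 (no trainable weights).
height, width = output_size(height, 2, 1), output_size(width, 2, 1)

# Flatten, then Dense with 128 nodes, then Dense with 2 nodes.
flattened = height * width * channels
params += (flattened + 1) * 128
params += (128 + 1) * 2

print(flattened, params)
```

Under these assumptions almost all of the weights sit in the first Dense layer, partly because pooling with a stride of 1 barely shrinks the feature maps.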


Now, let’s add an activation layer at the end of our architecture. We’ll use Softmax here, which is commonly used in the last layer of a classification network. It converts the outputs of the final layer into values in the range [0, 1] that sum to 1, so each value can be read as the predicted probability of the corresponding class.

Drag and drop a Softmax node to the canvas and connect it to previous nodes.
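A minimal NumPy sketch of Softmax for a single two-class output follows. The input values are made up for illustration.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into positive values that sum to 1."""
    shifted = logits - np.max(logits)  # subtracting the max avoids overflow in exp
    exponentials = np.exp(shifted)
    return exponentials / np.sum(exponentials)

# Hypothetical raw outputs of the final Dense layer for one image.
probabilities = softmax(np.array([2.0, 0.5]))
print(probabilities, probabilities.sum())
```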


Next, we’ll need to add a means to calculate the performance of the model. This is represented in the form of a cost function, which calculates the error of the model’s predicted outputs in comparison to the actual labels in the dataset. Our goal is to minimize the loss as much as possible. One of the functions that can be used for calculating the loss for a classification model is Cross-Entropy.

Drag and drop a Cross-Entropy node from the Loss section in the left sidebar.
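As an illustration, categorical cross-entropy for a single example can be written as below. The predicted probabilities are made-up values; the point is that a confident wrong prediction is penalized far more than a confident correct one.

```python
import numpy as np

def cross_entropy(predicted, target, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted class probabilities."""
    predicted = np.clip(predicted, eps, 1.0)  # avoid log(0)
    return float(-np.sum(target * np.log(predicted)))

target = np.array([1.0, 0.0])  # one-hot label for the first class
loss_when_right = cross_entropy(np.array([0.9, 0.1]), target)
loss_when_wrong = cross_entropy(np.array([0.1, 0.9]), target)
print(loss_when_right, loss_when_wrong)
```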


We’ll add another node to calculate the accuracy of our model’s predictions. Drag and drop the Accuracy node from the Metrics section in the left sidebar.


Connect both the Cross-Entropy and the Accuracy nodes to the Softmax node. Both will perform calculations on the model’s output.


Finally, we’ll add an optimization algorithm, which defines how the model will fine-tune its parameters to minimize the loss. There are many optimization algorithms; in this example, we will use the Adam optimizer. It works well on a wide range of problems and often converges faster than plain stochastic gradient descent.

Drag and drop the Adam node from Optimizer section in the left sidebar.


Double click on the Adam node and change its settings to the following:

  • Set the Learning rate to 0.001.
  • Set the Decay to 0. Decay reduces the learning rate over the training iterations, making it smaller as the model nears convergence. With the Adam optimizer, we don’t need to set this value because Adam already adapts its step sizes during training, i.e. the effective step is not fixed as it is in other optimization algorithms such as SGD.
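To give a feel for what a non-zero decay does, the sketch below uses the time-based schedule found in older Keras optimizers, where the rate is divided by `1 + decay * iteration`. Whether Neural Network Modeler applies exactly this schedule is an assumption.

```python
def decayed_learning_rate(initial_rate, decay, iteration):
    """Time-based decay: the rate shrinks as the iteration count grows."""
    return initial_rate / (1.0 + decay * iteration)

# With decay set to 0 the learning rate never changes.
unchanged = decayed_learning_rate(0.001, decay=0.0, iteration=1000)

# With a decay of 0.1 the rate has halved after 10 iterations.
halved = decayed_learning_rate(0.001, decay=0.1, iteration=10)
print(unchanged, halved)
```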

The reason behind the use of other parameters (Beta_1 and Beta_2) is that the Adam algorithm updates exponential moving averages of the gradient and its square, where these parameters control the exponential decay rates of these moving averages. The moving averages themselves are estimates of the 1st moment (the mean) and the 2nd raw moment (the uncentered variance) of the gradient [2].

The parameters Beta_1 and Beta_2 are already set to the default values 0.9 and 0.999, which are the values suggested in the Adam paper [2] and work well for many problems, including computer vision.
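The update rule described above can be sketched for a single scalar parameter, following the algorithm in [2]. This is an illustration rather than the code the Modeler generates; the function name `adam_step` and the toy objective are made up for the example.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter at time step t (starting from 1)."""
    m = beta_1 * m + (1 - beta_1) * grad       # moving average of the gradient
    v = beta_2 * v + (1 - beta_2) * grad ** 2  # moving average of the squared gradient
    m_hat = m / (1 - beta_1 ** t)              # bias-corrected 1st moment estimate
    v_hat = v / (1 - beta_2 ** t)              # bias-corrected 2nd raw moment estimate
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective f(x) = x^2 with gradient 2x, starting from x = 1.0.
x, m, v = 1.0, 0.0, 0.0
first_x, _, _ = adam_step(x, 2 * x, m, v, t=1)

for t in range(1, 201):
    x, m, v = adam_step(x, 2 * x, m, v, t)
print(first_x, x)
```

On the very first step the bias correction makes the update size almost exactly equal to the learning rate, regardless of the gradient's magnitude.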


3. Publishing a model from Neural Network Modeler and training it using Experiments

Now that we have the full architecture of our Neural Network, let’s start training it and see how it performs on our dataset. You can do that directly from Watson Studio’s Neural Network Modeler’s interface.

In the top toolbar, select the Publish training definition tab and click it.


Name your model so you can identify it later. You will need to have a Watson Machine Learning Service (from IBM Cloud Catalog) associated to your project. If you don’t have one associated, you’ll be prompted to do that on the fly. In the given prompt, click the Settings link.


You will be redirected to the project settings page. Here you can manage all services and settings related to your current project. Scroll down to the Associated Services section, click Add service and select Watson from the dropdown.


You’ll then be presented with all the available Watson services. Choose Machine Learning and click Add.


If you don’t already have a running Machine Learning service instance, choose the New option to create one on the fly.


If you followed the prerequisites and already have a Machine Learning service instance, choose the Existing option. Select your service from the dropdown list.


Now you’ll be redirected back to your project settings page. Click Assets on the top bar to return to your main dashboard on Watson Studio. Scroll down to the Modeler flows section and choose the flow we have been working on, according to the name you gave it.


Once your flow loads, click again on the Publish training definition tab in the top toolbar.


Make sure you named your Model, and that the Machine Learning service is detected and selected, then click Publish.


Once the training definition is published, you’ll find a notification at the top of the screen with a link to train the model in Experiments. Click on it.


Now, we’ll start creating a new Experiment. Start by giving it a name, then select the Machine Learning service. Lastly, we need to define the source of our data and where we will store the training results.

Choose the bucket containing your dataset by clicking Select.


Since you have been following along, you should have an existing connection to Object Storage. Select it from the dropdown list.


Choose the bucket containing the dataset by choosing the Existing radio button and selecting the name of the bucket where you stored your data in Step 1 of this guide. Then click Select at the bottom of the page.


You’ll be redirected back to the new Experiment details page to choose where to store your training results. Click Select.


As in the previous step, choose Existing connection and select your Object Storage connection from the dropdown list. Then select a bucket to store the results by creating a new bucket with a name that is globally unique to IBM Cloud Object Storage. Finally, click Select at the bottom of the page.

Note: It’s advisable to have two different buckets for storing your datasets and for storing the training results


Now, back on the new Experiment details page, click the Add training definition link on the right side of the page.


Since we already published a training definition from Neural Network Modeler, we will select the Existing training definition option.


Select the training definition from the dropdown list according to the name you gave it previously. Then click on Select at the bottom of the page.


Let’s select the hardware option that will be used to train the model. If you’re on a free-tier (Lite) account, you’ll have access to the first option in the Compute Plan dropdown list.


For the Hyperparameter Optimization Method, select None for now. We will get back to that in the next step, when we create a training definition using Keras code instead of Neural Network Modeler. Finally, click Select at the bottom of the page.


You’ll be redirected to the new Experiment details page for the final time. Take a look at all the options and make sure everything is set, then click Create and run at the bottom of the page.


Training will now start, and you’ll be presented with a view that gives you details about the training process and a means to monitor your model.


Once your model is trained, it will be listed in the Completed section at the bottom of the page, with details about how it performed.


If you’re satisfied with your model, you can put it into production and start using it to score new images. The first step is to click the three vertical dots under the Actions tab and select Save Model. You can find your saved model later in the Models section in the main dashboard of your Watson Studio project. From there you can deploy it as a REST API.


4. For Experts: Code your own model, train it in Experiments and leverage Hyperparameter Optimization

If you would like maximum control over your architectures, models and hyperparameters, you have the option to import your own code files into Experiments and train your model there.

Watson Studio Experiments provides the following benefits:

  • Leverage powerful hardware on the cloud without any hardware-software configuration setup needed. Watson Studio provides you with Deep Learning as a Service (DLaaS).

  • Run multiple experiments in parallel

  • Use hyperparameter optimization by giving Watson Studio a range of values for different hyperparameters. Watson Studio will create multiple variants of your model using different hyperparameter settings, package each variant into a container, and deploy it on a Kubernetes cluster. All the model variants run in parallel, and Kubernetes monitors the performance of each in real time. At the end of the training process, you can review the performance of all the different runs along with the hyperparameters used for each.

  • Train with Distributed Deep Learning for compute intensive training workloads and exponentially accelerate your training by splitting the workload on multiple servers.

Note: This functionality is not presented in this guide, but you can read more about it in Watson Studio’s documentation

Let’s start working with code.

You can write your model from scratch using supported deep learning frameworks such as TensorFlow, Keras, PyTorch and Caffe. Another option, which we will use here, is to let Watson Studio’s Neural Network Modeler automatically generate code for you in your favorite framework and then modify it further.

Open the Neural Network Flow we created earlier and, in the top toolbar, click the Download tab and select a framework from the dropdown list. In this guide, we will be using Keras Code.


The code will be downloaded to your local machine as a zip archive containing two files.

Note: the names of the zip folder and files contained inside may differ


The code responsible for the model can be found in the file, so open that in your favorite code editor.

We’ll make some modifications to the code to prepare it for Watson Studio’s Experiments HPO.

First, we’ll import some libraries. In the imports section of your code, add the following lines:

import json
import os
from os import environ
from emetrics import EMetrics

The emetrics import refers to a Python script with the same name. It’s available in the assets folder provided with this guide and downloaded in the prerequisites. EMetrics is responsible for writing the model’s results to the logs, formatted the way the Watson Studio interface expects.

Next, add the following block of code, just beneath the imports section:

# Set up working directories for data, model and logs.
model_filename = "SignatureFraud.h5"

# Write the trained model under RESULT_DIR when that variable is set;
# otherwise use a local "model" folder.
if environ.get('RESULT_DIR') is not None:
    output_model_folder = os.path.join(os.environ["RESULT_DIR"], "model")
    output_model_path = os.path.join(output_model_folder, model_filename)
else:
    output_model_folder = "model"
    output_model_path = os.path.join("model", model_filename)

os.makedirs(output_model_folder, exist_ok=True)

# Set up HPO.
# Each hyperparameter is read from config.json when present; otherwise a default is used.

config_file = "config.json"

if os.path.exists(config_file):
    with open(config_file, 'r') as f:
        json_obj = json.load(f)
    if "initial_learning_rate" in json_obj:
        learning_rate = json_obj["initial_learning_rate"]
    else:
        learning_rate = 0.001000
    if "batch_size" in json_obj:
        batch_size = json_obj["batch_size"]
    else:
        batch_size = 16
    if "num_epochs" in json_obj:
        num_epochs = json_obj["num_epochs"]
    else:
        num_epochs = 100
    if "decay" in json_obj:
        decay = json_obj["decay"]
    else:
        decay = 0.100000
    if "beta_1" in json_obj:
        beta_1 = json_obj["beta_1"]
    else:
        beta_1 = 0.900000
    if "beta_2" in json_obj:
        beta_2 = json_obj["beta_2"]
    else:
        beta_2 = 0.999000
else:
    learning_rate = 0.001000
    batch_size = 16
    num_epochs = 100
    decay = 0.100000
    beta_1 = 0.900000
    beta_2 = 0.999000

def getCurrentSubID():
    # SUBID identifies the current HPO run; it is absent outside of HPO.
    if "SUBID" in os.environ:
        return os.environ["SUBID"]
    else:
        return None

class HPOMetrics(keras.callbacks.Callback):
    def __init__(self):
        super().__init__()
        # Assumes the emetrics helper provides EMetrics.open(); check emetrics.py in the assets folder.
        self.emetrics = EMetrics.open(getCurrentSubID())

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        train_results = {}
        test_results = {}

        # Keras prefixes validation metrics with "val_"; everything else is a training metric.
        for key, value in logs.items():
            if 'val_' in key:
                test_results.update({key: value})
            else:
                train_results.update({key: value})

        #print('EPOCH ' + str(epoch))
        self.emetrics.record("train", epoch, train_results)
        self.emetrics.record(EMetrics.TEST_GROUP, epoch, test_results)

    def close(self):
        # Assumes the emetrics helper provides a close() method to flush the log.
        self.emetrics.close()

# Perform data pre-processing
defined_metrics = []
defined_loss = []

The main function of this block of code is to define a destination folder to store the model’s results, save the trained model and write the logs. It’s also responsible for grabbing data provided from Watson Studio’s Experiment HPO interface. The hyperparameters provided in the interface are stored in a config_file, from which we extract the hyperparameter values and store them in variables that will be accessed by the model later.
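The read-with-defaults pattern used above can also be written more compactly. The sketch below is an alternative formulation, not the code from the assets folder: the helper name `load_hyperparameters` is made up, and the flat key/value layout of config.json is an assumption based on the keys the code reads.

```python
import json
import os
import tempfile

DEFAULTS = {"initial_learning_rate": 0.001, "batch_size": 16, "num_epochs": 100}

def load_hyperparameters(config_path, defaults=DEFAULTS):
    """Return the defaults, overridden by any keys present in the config file."""
    params = dict(defaults)
    if os.path.exists(config_path):
        with open(config_path, "r") as f:
            params.update(json.load(f))
    return params

# Simulate one HPO run that overrides only the learning rate and the batch size.
with tempfile.TemporaryDirectory() as workdir:
    config_path = os.path.join(workdir, "config.json")
    with open(config_path, "w") as f:
        json.dump({"initial_learning_rate": 0.0003, "batch_size": 8}, f)
    params = load_hyperparameters(config_path)
    fallback = load_hyperparameters(os.path.join(workdir, "missing.json"))

print(params)
print(fallback)
```

Either form works; what matters is that a missing file or a missing key falls back to a sensible default so the same script also runs outside of HPO.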

Important Note: Please make sure that you have no conflicting lines of code after this point redefining the hyperparameter variables or assigning them to new values, as this will cause HPO to not work properly

Find this block of code:

model_inputs = [ImageData_1]
model_outputs = [Softmax_12]
model = Model(inputs=model_inputs, outputs=model_outputs)

and replace all following code with this block:

# Starting Hyperparameter Optimization
hpo = HPOMetrics()

# Define optimizer
optim = Adam(lr=learning_rate, beta_1=beta_1, beta_2=beta_2, decay=decay)

# Perform training and other misc. final steps
model.compile(loss=defined_loss, optimizer=optim, metrics=defined_metrics)
if len(model_outputs) > 1:
    train_y = [train_y] * len(model_outputs)
    if len(val_x) > 0: val_y = [val_y] * len(model_outputs)
    if len(test_x) > 0: test_y = [test_y] * len(model_outputs)

# Writing metrics
log_dir = os.environ.get("LOG_DIR")
sub_id_dir = os.environ.get("SUBID")
static_path_train = os.path.join("logs", "tb", "train")
static_path_test = os.path.join("logs", "tb", "test")
if log_dir is not None and sub_id_dir is not None:
    tb_directory_train = os.path.join(log_dir, sub_id_dir, static_path_train)
    tb_directory_test = os.path.join(log_dir, sub_id_dir, static_path_test)
else:
    tb_directory_train = static_path_train
    tb_directory_test = static_path_test

tensorboard_train = TensorBoard(log_dir=tb_directory_train)
tensorboard_test = TensorBoard(log_dir=tb_directory_test)

# Assumes the generated code defines train_x alongside train_y, val_x and test_x.
if (len(val_x) > 0):
    history = model.fit(
        train_x,
        train_y,
        batch_size=batch_size,
        epochs=num_epochs,
        validation_data=(val_x, val_y),
        callbacks=[tensorboard_train, tensorboard_test, hpo])
else:
    history = model.fit(
        train_x,
        train_y,
        batch_size=batch_size,
        epochs=num_epochs,
        callbacks=[tensorboard_train, tensorboard_test, hpo])

# Close the metrics log once training has finished.
hpo.close()

#print("Training history:" + str(history.history))

if (len(test_x) > 0):
    test_scores = model.evaluate(test_x, test_y, verbose=1)
    print('Test loss:', test_scores[0])
    print('Test accuracy:', test_scores[1])

# Save the trained model to the path computed earlier.
model.save(output_model_path)
print("Model saved in file: %s" % output_model_path)

This should be the end of your code.

What we have just introduced is a means of plugging in the emetrics helper module into our model, so it can read the model’s training history and write it to logs and store it. It will also write a val_dict.json file that will contain the accuracy at each step of the model training as a means to measure its performance. This val_dict.json file will be used by Watson Studio’s interface to display the performance of different training runs with different hyperparameters so you can compare them.
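The core of the callback is the way it splits the per-epoch metrics that Keras reports into a training group and a validation group. That step can be sketched on its own, without Keras or the emetrics helper. The metric values below are made up, and the sketch uses `startswith("val_")` where the callback checks for `'val_'` anywhere in the key.

```python
def split_logs(logs):
    """Separate Keras epoch logs into training metrics and validation metrics."""
    train_results, test_results = {}, {}
    for key, value in logs.items():
        if key.startswith("val_"):
            test_results[key] = value
        else:
            train_results[key] = value
    return train_results, test_results

# Hypothetical logs dictionary as passed to on_epoch_end by Keras.
epoch_logs = {"loss": 0.41, "acc": 0.83, "val_loss": 0.52, "val_acc": 0.78}
train_results, test_results = split_logs(epoch_logs)
print(train_results)
print(test_results)
```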

If you want to check the full code in for reference, you can find it here:

Now, to start uploading your code to Watson Studio Experiments, you’ll need to compress it into a zipped folder. You’ll need the following files inside the folder:

  • The recently modified
  • provided in the assets for this guide
  • training_data.pickle – provided
  • validation_data.pickle – provided
  • test_data.pickle – provided

Once you have all files in place in a zipped folder, we can move on now to Watson Studio to start the training process.

In Watson Studio’s main dashboard, scroll down to the Experiments section and click New experiment.


Name your experiment, make sure that a Machine Learning service is selected and choose the buckets for your dataset and the training results as we did previously in Step 3.


Click Add training definition on the right side of the screen.


Choose New training definition option, give it a name and drag and drop the zip folder we prepared earlier with our code files and dataset.


Choose the framework to be used for training. In our case, we wrote our code for Keras, which runs on top of TensorFlow, so we will choose the TensorFlow option from the dropdown list.

Write the following line of code as the execution command, it’s similar to how you would execute the code from your terminal locally: python3 keras-code/

Choose the Compute plan, then for Hyperparameter optimization method choose RBFOpt, choose 100 for the Number of optimizer steps, choose accuracy for the Objective and choose maximize in the Maximize or minimize field. Finally, click Add hyperparameter.


Let’s add our first hyperparameter. Note: make sure the name matches exactly the key that the code reads from config.json; you can refer to the block of code where we extracted the hyperparameter values. Type in initial_learning_rate as the name of the hyperparameter. Choose the Distinct value(s) radio button, as we will be adding different values here. Finally, type in the values you want the model to pick from during training. Here I used the following values: 0.0001,0.0003,0.001. When you’re done, click Add and Create another at the bottom of the screen.


On to the next hyperparameter, type batch_size in the Name field. This time we will add a range of values with predefined steps, so choose the Range radio button. Type 8 for the Lower bound and 32 for the Upper bound. Choose Step for the method of Traverse and choose 8 for the Step value. Finally, click Add and Create another.


Let’s add the last hyperparameter, type num_epochs in the Name field, choose the Distinct value(s) radio button and type in 100,200 in the Value(s) field. Finally click Add at the bottom of the page. If you want to add other hyperparameters (like decay, beta_1, beta_2, dropout_1, dropout_2) you can click Add and Create another as we did previously.
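To get a sense of the size of the search space defined by these three hyperparameters, the sketch below enumerates every combination. It assumes the upper bound of the batch size range is inclusive, which is not confirmed here. Note also that an optimizer such as RBFOpt chooses which configurations to try rather than necessarily enumerating all of them.

```python
from itertools import product

learning_rates = [0.0001, 0.0003, 0.001]   # distinct values
batch_sizes = list(range(8, 32 + 1, 8))    # range 8..32 in steps of 8 (bound assumed inclusive)
epoch_counts = [100, 200]                  # distinct values

search_space = [
    {"initial_learning_rate": lr, "batch_size": bs, "num_epochs": epochs}
    for lr, bs, epochs in product(learning_rates, batch_sizes, epoch_counts)
]
print(len(search_space))
print(search_space[0])
```

Each dictionary has the same keys that the training script looks up in config.json.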


Back to the training definition details screen, take a look at all settings and make sure you provided all needed details, then click Create at the bottom of the page.

You can always add more training definitions, and all these will run in parallel.


You’ll be redirected back to the new Experiment details page, confirm that all settings are in place and click Create and run at the bottom of the page.


Watson Studio will create a separate training run for each set of hyperparameter values it tries. The runs execute in parallel and their progress is shown in real time. Once training is completed, all runs will be listed with details on how they performed. By heading to Compare Runs in the top bar, you can get an overview of how each model performed and which hyperparameters were used for it.


You can also view a graph of models’ history and performance across all training iterations.



If you’re satisfied with a certain model’s performance, you can save it as shown previously so it is ready for deployment and scoring.


In this tutorial, you learned about the powerful deep learning tools available to you on Watson Studio. You learned how to quickly prototype a neural network architecture using Neural Network Modeler. You learned how to publish the Neural Network Flow and train it using Experiments. You also learned how to import your own code, train it, optimize it and monitor its performance using Experiments and HPO.


  1. Marcus Liwicki, Muhammad Imran Malik, Linda Alewijnse, Elisa van den Heuvel, Bryan Found. “ICFHR2012 Competition on Automatic Forensic Signature Verification (4NsigComp 2012)”, Proc. 13th Int. Conference on Frontiers in Handwriting Recognition, 2012.

  2. Diederik P. Kingma, Jimmy Ba. “Adam: A Method for Stochastic Optimization”, 3rd International Conference for Learning Representations, San Diego, 2015.

  3. IBM Watson Studio Documentation on Deep Learning

  4. IBM Watson Studio: Coding guidelines for deep learning programs