Multiclass image classification of yoga postures using Watson Studio and Deep Learning as a Service

Computer vision is increasingly widely used, and in many scenarios a machine must classify images by class to support a decision-making process. In this tutorial, we demonstrate how to preprocess images to remove unnecessary information so that the model learns the relevant features more effectively, which can improve accuracy. The tutorial discusses a few ways to preprocess the images before they are ingested into the model-building process: resizing the images, creating pixel arrays of the images and pickling the data, and removing background noise from the images.

This tutorial is meant to be used with the Create a predictive system for image classification using Deep Learning as a Service code pattern, so that you can experiment with the different preprocessing steps and see their impact on the outcome, which is the prediction accuracy of the image classifier.


To follow this tutorial, you should be familiar with Python, computer vision, deep learning, and the IBM Cloud environment and services.

Estimated time

It should take approximately 45 – 60 minutes to complete the tutorial.


Resize the images

Resizing is an important step in which the images are standardized to a specific shape, typically 224 x 224 pixels. Images come in different sizes, and giving every image the same dimensions helps the model learn better and avoids extra padding. Pretrained models also require that images be resized to the expected input size before they are ingested. The following code shows how to resize the images by using Pillow.

Resize the image using Pillow
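The original snippet is not reproduced in this text, so the following is only a minimal sketch of the resizing step with Pillow. The function names `resize_image` and `resize_folder`, the folder layout, and the choice of Lanczos resampling are assumptions made for illustration; the 224 x 224 target comes from the description above.

```python
from pathlib import Path

from PIL import Image

TARGET_SIZE = (224, 224)  # (width, height), taken from the 224 x 224 shape described above


def resize_image(src_path, dst_path, size=TARGET_SIZE):
    """Open one image, convert it to RGB, resize it, and save the result."""
    with Image.open(src_path) as img:
        # convert("RGB") drops alpha channels and palettes so every output has 3 channels
        resized = img.convert("RGB").resize(size, Image.LANCZOS)
    resized.save(dst_path)
    return resized.size


def resize_folder(src_dir, dst_dir, size=TARGET_SIZE):
    """Resize every .jpg/.jpeg/.png file in src_dir and write it to dst_dir (hypothetical layout)."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src_dir.iterdir()):
        if path.suffix.lower() in {".jpg", ".jpeg", ".png"}:
            resize_image(path, dst_dir / path.name, size)
            written.append(path.name)
    return written
```

Note that a plain resize like this stretches images whose aspect ratio is not square. If that distorts the postures too much, cropping or padding to a square first is a possible alternative.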

Pickle the data

The primary reason to pickle the image data is to convert the images from .jpeg or .png format into a pixel array that has inputs and targets defined. This array of numbers helps any model (machine learning, deep learning, or pretrained) learn the features and patterns of each class, which can improve the model’s accuracy. In the following steps, we create a pixel array and write it into a pickle file.

  1. Define the input.

    Filename and class

  2. Create a pickle file from a pixel array of images.

    Create a pickle file from a pixel array of images

  3. Convert the raw image data into a pixel array.

    Import and print

  4. Merge input and target variables into a pickle file, then create a new pickle file and write data to it.

    Merging input and entering data

From the previous images, you can see how to define the input and target parameters using a text file, convert the raw image data into a pixel array, and then dump them into a pickle file that will be consumed by deep learning models. This step is mandatory for deep learning frameworks like TensorFlow or Keras and for creating deep learning experiments using hyperparameter optimization in Watson Machine Learning. Repeat this process three times to create the training, testing, and validation pickle files. Remember that all images must be the same size before the data can be pickled, which is why the images must be resized first.
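The steps above can be sketched roughly as follows. This is not the code pattern's own implementation: the label-file format (one `<filename> <class>` pair per line with an integer class), the dictionary keys `inputs` and `targets`, and all function names are assumptions chosen for illustration.

```python
import pickle
from pathlib import Path

import numpy as np
from PIL import Image


def read_labels(label_file):
    """Parse lines of the assumed form '<filename> <class>' into (filename, class) pairs."""
    pairs = []
    for line in Path(label_file).read_text().splitlines():
        if line.strip():
            name, label = line.rsplit(maxsplit=1)
            pairs.append((name, int(label)))
    return pairs


def build_arrays(image_dir, pairs):
    """Load each image as an RGB pixel array and collect inputs and targets."""
    inputs, targets = [], []
    for name, label in pairs:
        with Image.open(Path(image_dir) / name) as img:
            inputs.append(np.asarray(img.convert("RGB"), dtype=np.uint8))
        targets.append(label)
    # np.stack raises ValueError if the images do not all have the same shape,
    # which is why the resizing step has to come first.
    return np.stack(inputs), np.array(targets)


def write_pickle(image_dir, label_file, out_file):
    """Merge inputs and targets into one dictionary and dump it to a pickle file."""
    x, y = build_arrays(image_dir, read_labels(label_file))
    with open(out_file, "wb") as fh:
        pickle.dump({"inputs": x, "targets": y}, fh)
    return x.shape, y.shape
```

Calling `write_pickle` once each for the training, testing, and validation folders (with their own label files) would produce the three pickle files mentioned above.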

Create test data JSON

This step is required if you need to provide the test data in JSON format to get predictions from the model. Watson Machine Learning requires that the input data be in JSON format for real-time scoring. The following code snippet reads the pickle file, splits it into inputs and targets, then writes it to a JSON file.

Creating a JSON file for testing the WML model
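As a rough sketch of this step, the following reads a pickle file, separates inputs from targets, and writes the inputs to a JSON file. It assumes the pickle holds a dictionary with `inputs` and `targets` keys, and the top-level `values` key in the payload is also an assumption; check the scoring payload format that your Watson Machine Learning deployment expects before using it.

```python
import json
import pickle

import numpy as np


def pickle_to_json(pickle_file, json_file):
    """Write the pickled inputs to a JSON payload and return the targets for later comparison."""
    # Only unpickle files you created yourself; pickle can execute arbitrary code on load.
    with open(pickle_file, "rb") as fh:
        data = pickle.load(fh)

    inputs = np.asarray(data["inputs"])    # assumed key name
    targets = np.asarray(data["targets"])  # assumed key name

    # NumPy arrays are not JSON serializable, so convert them to nested lists first.
    payload = {"values": inputs.tolist()}  # "values" is an assumed payload key
    with open(json_file, "w") as fh:
        json.dump(payload, fh)

    return targets.tolist()
```

The returned targets are kept out of the JSON file so that they can be compared with the predictions that come back from the scoring call.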

Remove background noise

This step aims to improve accuracy by removing the background of an image. An image consists of many features, and if the goal is to identify a person or an object, we can try to remove all features except the one in question. The following code samples show how we try to isolate the person holding the posture and remove most of the other features that are not relevant.

Install and import libraries

Smooth, blur, and blend images

Yoga poses

Other methods for removing the background include:

  • Watershed method
  • GrabCut method
  • Background Subtractor

More resources are available to explore if you want to learn more about background removal.


In this tutorial, we discussed a few steps that are part of image preprocessing. These steps prepare the images for model building and can also improve the model’s accuracy. You should experiment with these parameters and add new ones if required to get the desired output. This is not an exhaustive list, but it will help you get started.

For the deep learning implementation of image classification, use this tutorial together with the Create a predictive system for image classification using Deep Learning as a Service code pattern. Explore the previous steps and see how the performance of the model improves when the images are preprocessed.