Introduction to computer vision using IBM Visual Insights

This article is part of the Getting started with IBM Visual Insights learning path.


IBM Visual Insights provides robust, end-to-end workflow support for deep learning models related to computer vision. This enterprise-grade software provides a complete ecosystem to label raw data sets and to train, create, and deploy deep learning-based models. IBM Visual Insights is designed to give subject matter experts who have no background in deep learning technologies the ability to train models for AI applications. It can help train highly accurate models to classify images and detect objects in images and videos.

Introduction to IBM Visual Insights video

IBM Visual Insights is built on open source frameworks for modeling and managing containers to deliver a highly available framework, providing application lifecycle support, centralized management and monitoring, and support from IBM.

Learning objectives

In the first article of this learning path, Introduction to computer vision, I gave an overview of computer vision and how you might use it in your apps.

Now, I’ll explain how IBM Visual Insights provides an easy-to-use UI to help you create a custom model, deploy it to provide an API endpoint, and use it in an application. I’ll discuss how you can:

  • Create a data set
  • Assign categories for image classification
  • Label objects for object detection
  • Train a model
  • Test the model
  • Use it in an app

Access to IBM Visual Insights

Look at the IBM Visual Insights site to learn more about access to free trials (when available). You can also download a 90-day trial of IBM Visual Insights.

Create a data set

First, you’ll want to create a data set. A data set allows you to train a custom model based on your own sets of images. To create a new data set, from the Data sets tab, click Create a new data set and give it a name.


After you have an empty data set, the UI lets you add images, videos, and ZIP files of images. This is easily done with drag-and-drop or the file chooser.

Assign categories for image classification

If you are creating a model for image classification, you must assign categories to your images for training.

Use + Add category to create a category, then use Assign category to assign a category to the selected images. You can do this in large batches by importing a ZIP file of images for one category, assigning the category to all of the new “uncategorized” images, and then repeating the process with the next ZIP file and next category.
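The batch workflow above starts with one ZIP file per category, so it helps to package your images that way before uploading. The following sketch (standard-library Python; the folder layout is illustrative, not something IBM Visual Insights requires) bundles each per-category image folder into its own ZIP file:

```python
import zipfile
from pathlib import Path

def zip_per_category(image_root, out_dir):
    """Create one ZIP file per category folder under image_root.

    Assumes an illustrative layout of image_root/<category>/<image files>.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    zip_paths = []
    for category_dir in sorted(Path(image_root).iterdir()):
        if not category_dir.is_dir():
            continue
        zip_path = out_dir / f"{category_dir.name}.zip"
        with zipfile.ZipFile(zip_path, "w") as zf:
            for image in sorted(category_dir.glob("*.jpg")):
                # Store only the file name so the archive stays flat
                zf.write(image, arcname=image.name)
        zip_paths.append(zip_path)
    return zip_paths
```

You would then upload each resulting ZIP file through the UI and assign its category to the newly added "uncategorized" images.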


After you have uploaded all of your images and assigned categories, you can start training your model by clicking Train model.

Label objects for object detection

If you are creating a model for object detection, you must label objects in your images.

On a data set page, you can select an image or video and label the objects in it. If you are using a video, there is a Capture frame button. After you capture frames, they appear in an image carousel. Then, to label objects in an image, draw a box or polygon around each object.


When you have finished labeling the images, go back to the data set page and start the training process by clicking Train model.

Train a model

After you have prepared your data set as described previously, training a custom model based on your data set is almost as easy as the push of a button. There are a few more options, such as choosing a base model or customizing the number of iterations, but usually, you can sit back and watch as the chart shows you how the model is improving as the iterations run. You can wait until the training is complete or choose to stop the training as soon as the loss values stop decreasing.
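The stop-early decision described above can be expressed as a simple rule: watch the recent loss values and stop when they have effectively stopped decreasing. The following sketch is a generic illustration of that plateau check (the function name and thresholds are illustrative, not part of the IBM Visual Insights API):

```python
def loss_has_plateaued(losses, window=5, min_improvement=1e-3):
    """Return True when the loss has stopped decreasing.

    Compares the newest loss in the window against the oldest; if the
    improvement is below min_improvement, training has plateaued.
    The defaults here are illustrative.
    """
    if len(losses) < window:
        return False  # not enough history yet
    recent = losses[-window:]
    return (recent[0] - recent[-1]) < min_improvement
```

With a steadily decreasing loss history the check returns False, and once the last few values are flat it returns True, which is exactly the moment you might click the stop-training button.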

After the training has stopped, IBM Visual Insights presents you with a summary. The following image shows the training results for an image classification model; this example shows a model that is not very accurate. The charts show overall accuracy and also break it down by category to help you see where you might need to improve your data set to refine your model.
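The per-category breakdown is easy to reproduce offline if you keep your own test labels and predictions. This generic sketch (the function and data shapes are illustrative, not an IBM Visual Insights API) computes overall and per-category accuracy:

```python
from collections import defaultdict

def accuracy_report(samples):
    """Compute overall and per-category accuracy.

    samples is a list of (true_category, predicted_category) pairs;
    the structure is illustrative.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for truth, predicted in samples:
        totals[truth] += 1
        if truth == predicted:
            correct[truth] += 1
    per_category = {c: correct[c] / totals[c] for c in totals}
    overall = sum(correct.values()) / sum(totals.values())
    return overall, per_category
```

A category with a much lower score than the overall accuracy is a good candidate for more (or better-labeled) training images.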


You can now deploy the model and test further with additional images or use it in an app.

Test the model

After you have a model deployed, you can do more testing using the IBM Visual Insights UI. In the following image classification model test, you import an image file to test, and the results are shown. In addition to the basic results showing the category and confidence, a heatmap overlay shows which part of the image influenced the results the most.


The heatmap indicator can be helpful when you need to refine your data set. In the previous example, the boat on a river was recognized as a watercraft picture, but this model also tends to infer watercraft anytime there is water in the picture.
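When you move from UI testing to code, you typically want the top category and its confidence from the JSON response. The exact response schema depends on your deployment, so treat the `classified` mapping below as an assumption (modeled on the `classified` key used in the Python sample later in this article) and verify it against your own endpoint:

```python
def top_category(response):
    """Pick the most confident category from a classification response.

    Assumes a response shape like {"classified": {"watercraft": 0.98, ...}};
    verify the schema against your own deployed endpoint.
    """
    scores = response.get("classified", {})
    if not scores:
        return None, 0.0
    category = max(scores, key=scores.get)
    return category, scores[category]
```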

Use your custom model in an app


The IBM Visual Insights API is invoked from your application code to integrate inferencing with your own custom models. In addition, the API provides a programmatic interface for many of the tasks that are also available in the UI, such as data set creation, model training, and deployment. The REST API is easy to use from any programming language.

Python code sample

The following Python code snippet (taken from the Object tracking in video with OpenCV and Deep Learning code pattern) shows the POST call used to send a JPEG file to the API endpoint URL. The JSON results are then formatted and printed.

import json
import requests

# POWERAI_VISION_WEB_API_URL holds the deployed model's API endpoint URL
s = requests.Session()

def detect_objects(filename):
    with open(filename, 'rb') as f:
        # WARNING! verify=False is here to allow an untrusted cert!
        r = s.post(POWERAI_VISION_WEB_API_URL,
                   files={'files': (filename, f)},
                   verify=False)
    return r.status_code, json.loads(r.text)

rc, jsonresp = detect_objects('frames/frame_00100.jpg')

print("rc = %d" % rc)
print("jsonresp: %s" % jsonresp)
if 'classified' in jsonresp:
    print("Got back %d objects" % len(jsonresp['classified']))
print(json.dumps(jsonresp, indent=2))
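Beyond counting the returned objects, an application usually filters them by confidence before acting on them. The following helper assumes a detection response where `classified` is a list of objects, each with `label` and `confidence` fields (an assumption based on this code pattern; check your endpoint's actual schema):

```python
def confident_objects(jsonresp, threshold=0.8):
    """Return detected objects at or above a confidence threshold.

    Assumes each entry in jsonresp["classified"] is a dict with at least
    "label" and "confidence" keys; adjust to your endpoint's schema.
    """
    objects = jsonresp.get("classified", [])
    return [o for o in objects if o.get("confidence", 0.0) >= threshold]
```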

Node.js code sample

The Locate and count items with object detection code pattern uses a Node.js server. The following code snippet shows how it pipes an image from its web UI input (this could be mobile camera input or a file chooser) through a Node.js server to the API endpoint URL and sends the results back to the web UI.

app.post('/uploadpic', function(req, result) {
  if (!poweraiVisionWebApiUrl) {
    result.send({data: JSON.stringify({error: MISSING_ENV})});
  } else {
    req.pipe(request.post({
      url: poweraiVisionWebApiUrl,
      agentOptions: {
        rejectUnauthorized: false,  // Allow an untrusted cert
      }}, function(err, resp, body) {
      if (err) {
        result.send({data: JSON.stringify({error: err})});
      } else {
        result.send({data: body});
      }
    }));
  }
});

This article is the second part of the Getting started with IBM Visual Insights learning path. Continue with the learning path to grow your skills with hands-on tutorials and example code.

In the rest of this learning path, we’ll use IBM Visual Insights and integrate computer vision into various applications, such as:

  • Image classification integration into an iOS app
  • Object detection integration into a Node.js app
  • Object detection and tracking in videos with a Python Jupyter Notebook
  • Validation techniques for continuous model testing

These apps include example data sets, but they also provide open source code and can be tailored to work with a custom model that you train using your own data set. Feel free to use them to learn, leverage, and innovate. To continue, read the next tutorial, Build and deploy an IBM Visual Insights model and use it in an iOS app.

Note: To leverage IBM Visual Insights software capabilities, you must have the horsepower of an enterprise-grade P9 processor. An alternative, cost-effective approach for academia and other non-data scientists is to deploy IBM Visual Insights on Raptor Computing Systems' (RCS) Talos II Development System.