Humans have an uncanny ability to process visual information. With one quick glance at a photo, we can understand what is in the scene and act accordingly. Giving computers this ability has proved challenging, but the potential uses are wide-ranging, from tagging your Facebook friends in a photo to recognizing cancer from a photo of a skin condition. Image recognition has been an active area of research and development for many years, and recent advances in deep learning have improved image recognition and classification to the point that many neural network models with state-of-the-art performance are now available. If your application needs to recognize characteristics of an image, you can take one of these models, train it on your own data, and deploy it to serve your application.
To show this process in practice, this IBM Code developer pattern uses deep learning to train an image classification model. Specifically, we use data from the art collection at The Metropolitan Museum of Art, with metadata from Google BigQuery. We use the Inception model implemented in TensorFlow and run the training on a Kubernetes cluster. We save the trained model and load it later to perform inference. To use the model, we provide a picture of a painting as input, and the model returns the likely culture, for instance Italian Florentine art. You can adapt the data by choosing other attributes by which to classify the art collection, such as artist or time period. You can also choose an entirely different source of data, a different category for classification, or a different way of creating the labels, and you can swap in other models, such as VGG, ResNet, AlexNet, or MobileNet.
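As an illustration of the inference step, the following minimal sketch shows how a classifier's raw output scores might be turned into a ranked list of likely cultures. It does not call TensorFlow; the label names and score values are hypothetical, and the `softmax` and `top_k` helpers are written here for illustration rather than taken from the pattern's code.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical label set and scores, for illustration only.
labels = ["Italian, Florentine", "French", "Dutch", "Japanese"]
logits = [3.1, 1.2, 0.4, -0.8]

predictions = top_k(logits, labels, k=2)
for label, prob in predictions:
    print(f"{label}: {prob:.2f}")
```

In a real deployment the scores would come from the final layer of the trained model, and the label list would be the one saved alongside it at training time.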
Depending on the compute resources available, you can choose the number of images to train on, the number of classes to use, and more. To show the full working process, we select a small set of images and a small number of classes so that the training completes within a reasonable amount of time. With a large data set, the training may take days or weeks.
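One way to cut a data set down to a small number of classes and images is sketched below. This is an assumption about how such a selection could be done, not the pattern's actual procedure: it keeps the most frequent labels and caps the number of images per label. The function name `select_subset` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def select_subset(samples, num_classes, max_per_class):
    """Keep the num_classes most frequent labels, capped at max_per_class each.

    samples is a list of (image_id, label) pairs; input order is preserved.
    """
    counts = Counter(label for _, label in samples)
    keep = {label for label, _ in counts.most_common(num_classes)}
    taken = defaultdict(int)
    subset = []
    for image_id, label in samples:
        if label in keep and taken[label] < max_per_class:
            subset.append((image_id, label))
            taken[label] += 1
    return subset


# Hypothetical (image_id, label) pairs, for illustration only.
samples = [
    (1, "French"), (2, "French"), (3, "French"),
    (4, "Dutch"), (5, "Dutch"),
    (6, "Japanese"),
]

subset = select_subset(samples, num_classes=2, max_per_class=2)
print(subset)
```

Raising `num_classes` and `max_per_class` trades longer training time for broader coverage of the collection.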
When you have completed this pattern, you will better understand how to:
- Collect and process data for deep learning in TensorFlow
- Configure and deploy TensorFlow to run on a Kubernetes cluster
- Train an advanced image classification neural network
- Use TensorBoard to visualize and understand the training process
The pattern works through the following steps:
- Inspect the available attributes in the Google BigQuery database for the Met art collection.
- Create the labeled data set using the attribute selected.
- Select a model for image classification from the set of available public models and deploy it to IBM Cloud.
- Run the training on Kubernetes, optionally using GPU if available.
- Save the trained model and logs.
- Visualize the training with TensorBoard.
- Load the trained model in Kubernetes and run inference on a new painting to see the classification.
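The step of creating a labeled data set from a selected attribute can be sketched as follows. This is a minimal illustration under stated assumptions: the metadata rows are plain dictionaries, and the field names `culture` and `image_url`, the example URLs, and the helper `build_labeled_set` are all hypothetical rather than the actual BigQuery schema or the pattern's code.

```python
def build_labeled_set(rows, attribute):
    """Turn metadata rows into (image_url, class_index) pairs plus the class list.

    Rows that lack the chosen attribute or an image URL are skipped.
    """
    usable = [r for r in rows if r.get(attribute) and r.get("image_url")]
    classes = sorted({r[attribute] for r in usable})
    index = {name: i for i, name in enumerate(classes)}
    examples = [(r["image_url"], index[r[attribute]]) for r in usable]
    return examples, classes


# Hypothetical metadata rows, for illustration only.
rows = [
    {"object_id": 1, "culture": "French", "image_url": "https://example.com/1.jpg"},
    {"object_id": 2, "culture": "", "image_url": "https://example.com/2.jpg"},
    {"object_id": 3, "culture": "Dutch", "image_url": "https://example.com/3.jpg"},
    {"object_id": 4, "culture": "French", "image_url": None},
]

examples, classes = build_labeled_set(rows, "culture")
print(classes)
print(examples)
```

Passing a different attribute name, such as a time period field, would produce a different set of classes from the same rows, which is how the classification target can be changed without touching the images.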