In this code pattern, work through the process of analyzing an image data set using a pre-trained convolutional neural network (VGG16) and extracting feature vectors for each image in a Jupyter Notebook.
Machine learning algorithms provide many useful tools for solving real-world problems, and image recognition is one of the domains where they have had great success. By using computational power to identify images and compare them to other images, machines can perform tasks that a few years ago only humans could do.
Engineers and data scientists who work with image recognition can run into challenges that limit what they can do with machine learning algorithms. The biggest limitation is the time and computational power required to build and train the layers of a deep neural network, which image recognition depends on. Although large data sets and complex algorithms can run on many common hardware configurations, the time required to create neural layers, train a model, and extract features can be prohibitively high.
Using a Graphics Processing Unit (GPU) to perform many computations in parallel revolutionized computer graphics, and the discovery that the same GPUs can also accelerate machine learning tasks has had a similar effect on artificial intelligence. In this code pattern, we demonstrate a common task for content-based image retrieval (CBIR) and compare its performance on a CPU with its performance on a GPU.
In this code pattern, we guide you through the process of analyzing an image data set using a pre-trained convolutional neural network (VGG16) and extracting feature vectors for each image in a Jupyter Notebook. This is a computationally expensive process that takes 300 times longer on a CPU than on a GPU. We'll use the GPU environment on IBM® Watson™ Studio or on your local machine to accelerate feature extraction. After the analysis, we'll demonstrate reverse image search, one of the popular applications of image analysis. Reverse image search is a CBIR query technique in which you give the CBIR system a sample image to base its search on. In information retrieval terms, the sample image is the search query, so a reverse image search typically involves no search terms.
When you have completed this code pattern, you will understand how to:
- Use GPU acceleration in Watson Studio or locally to improve performance of feature extraction.
- Download a VGG16 pre-trained model using Keras.
- Perform feature extraction. Here, we remove the last layer (the softmax classification layer) so that the output model has only 22 layers (counting the input layer) and its last layer is fc2 (Dense), a fully connected layer.
- Get feature vectors for all of the images, then reduce their dimensionality using principal component analysis (PCA).
- Use the cosine distance between PCA features to find the 5 images closest to the query image and return them as thumbnails.
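The last two steps can be sketched with NumPy alone. The example below is a minimal illustration under stated assumptions, not the notebook's own code: PCA is implemented directly through a singular value decomposition, the feature matrix is random stand-in data rather than real VGG16 output, and the function names `pca_fit_transform`, `cosine_distances`, and `closest_images`, as well as the choice of 10 components, are hypothetical.

```python
import numpy as np


def pca_fit_transform(features, n_components):
    """Project row vectors onto their top principal components (via SVD)."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _u, _s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, mean, components


def cosine_distances(query, vectors):
    """Cosine distance (1 - cosine similarity) from query to each row of vectors."""
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return 1.0 - v @ q


def closest_images(query_idx, reduced, k=5):
    """Indices of the k rows closest to the query row, excluding the query itself."""
    dists = cosine_distances(reduced[query_idx], reduced)
    order = np.argsort(dists)
    return [int(i) for i in order if i != query_idx][:k]


# Stand-in data (assumption): 50 "images" with 4096-dimensional feature vectors.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 4096))

reduced, mean, components = pca_fit_transform(features, n_components=10)
print(reduced.shape)               # (50, 10)
print(closest_images(0, reduced))  # indices of the 5 nearest neighbours of image 0
```

A new query image that is not part of the data set would be projected with the same `mean` and `components` before its distances are computed, so that it lands in the same reduced space as the indexed images.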
- The user interacts with the Jupyter Notebook to import and use Python modules.
- The 101_Object_Categories images from Caltech-101 are imported for the image search.
- The Keras VGG16 model is imported, with weights pre-trained on ImageNet.
- The user can perform feature extraction using a GPU for increased performance.
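The flow above could look roughly like the following sketch. It assumes a recent TensorFlow 2.x installation with its bundled Keras API. The helper names `list_image_paths`, `build_feature_extractor`, and `extract_features` are hypothetical and are not taken from the notebook. TensorFlow is imported inside the functions, so the file-listing helper also works on a machine without it.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def list_image_paths(root):
    """Recursively collect image file paths under root, sorted for reproducibility."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(IMAGE_EXTENSIONS):
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def build_feature_extractor():
    """Load VGG16 with ImageNet weights and truncate it at the fc2 layer."""
    import tensorflow as tf
    from tensorflow.keras.applications.vgg16 import VGG16
    from tensorflow.keras.models import Model

    # An empty list here means TensorFlow will run on the CPU.
    print("GPUs visible to TensorFlow:", tf.config.list_physical_devices("GPU"))

    # The ImageNet weights are downloaded on first use.
    model = VGG16(weights="imagenet", include_top=True)
    # Drop the softmax classifier by ending the model at the fc2 layer.
    return Model(inputs=model.input, outputs=model.get_layer("fc2").output)


def extract_features(extractor, image_path):
    """Return the 4096-dimensional fc2 activation for one image."""
    import numpy as np
    from tensorflow.keras.applications.vgg16 import preprocess_input
    from tensorflow.keras.utils import img_to_array, load_img

    img = load_img(image_path, target_size=(224, 224))  # VGG16 input size
    batch = np.expand_dims(img_to_array(img), axis=0)   # shape (1, 224, 224, 3)
    return extractor.predict(preprocess_input(batch), verbose=0)[0]
```

With TensorFlow installed, calling `build_feature_extractor()` once and then `extract_features(extractor, path)` for each path returned by `list_image_paths` on the data set folder would yield one 4096-element vector per image. TensorFlow places the computation on a visible GPU by default, so the same code serves for both the CPU and GPU runs.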
Find detailed instructions in the README file. Those steps explain how to:
- Clone the repository.
- Create a notebook in Watson Studio or locally.
- Run the notebook.