From days to minutes: that is how much time a data scientist can expect to save by using IBM PowerAI Vision.
One day, I was browsing our internal user groups and came across one called Deep Learning. The group has a forum for sharing information and experiences on deep learning. I opened a thread entitled “Use cases for Caffe”. One post asked about experiences with Caffe and about approaches to network structure design other than trial and error. I suggested that the poster try transfer learning using Inception V3. Transfer learning is well documented, and there are numerous examples showing how to get started. In this case, I suggested using https://github.com/fbarilla/transfer-learning/blob/master/samples/Classifying-House-And-Pool-Images.ipynb, which was written by Franck Barillaud, IBM Senior Technical Staff Member.
After a few days, the poster came back with his experience. He was impressed by how simple it was to use transfer learning. He said it was simple enough for anyone, even without any experience in Python or TensorFlow. However, he was only able to reach a final accuracy of 81%, compared to 93% with a much simpler neural network that he had written using Caffe.
OK, no problem, let’s try something else.
I asked the poster for permission to try his dataset with PowerAI Vision, a tool included in IBM PowerAI (https://developer.ibm.com/linuxonpower/deep-learning-powerai/). PowerAI Vision is a deep learning development platform that an application developer with limited knowledge of deep learning can use to train and deploy deep learning models targeted at computer vision for their application needs. The results were unexpectedly amazing!
Not only did it take minutes to set up, but we also got a higher accuracy of 94.5% with a loss of 0.12. That compares with 93% accuracy on Caffe and 81% on Inception V3 using transfer learning.
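For readers less familiar with these two figures, here is a minimal sketch of how accuracy and loss are commonly computed for a classifier. It assumes the loss is a mean cross-entropy, which is a common choice but is not confirmed for PowerAI Vision here. The function names and the toy values are made up for illustration and are not the poster's data.

```python
import math


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def cross_entropy_loss(probabilities, actual):
    """Mean negative log-probability assigned to the true class.

    Assumption: the reported loss is a cross-entropy; the tool may
    compute it differently.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for probs, label in zip(probabilities, actual):
        total += -math.log(max(probs[label], eps))
    return total / len(actual)


# Toy values for illustration only (hypothetical, four classes).
toy_probs = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.25, 0.25, 0.40, 0.10],
    [0.70, 0.10, 0.10, 0.10],
]
toy_labels = [0, 1, 2, 3]

# The predicted class is the index with the highest probability.
toy_predictions = [max(range(len(p)), key=p.__getitem__) for p in toy_probs]

print("accuracy:", accuracy(toy_predictions, toy_labels))
print("loss:", cross_entropy_loss(toy_probs, toy_labels))
```

A lower loss means the model puts more probability on the correct class, so the two numbers usually improve together but measure different things.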
The complete process took about 30 minutes:
- Dataset download and import: 15 minutes
- PowerAI Vision Task Definition: 5 minutes
- Training time: 10 minutes
Figure 1 shows graphs of the training loss and accuracy.
This clearly shows the benefit of using PowerAI Vision for image classification. Model setup is a matter of point and click, which is a good proof point that a user does not necessarily need deep data science skills, or days spent writing a neural network, to get decent results.
A complete description of the image recognition use case
The poster described his project as follows:
Inspect images of photoresist openings after they have been exposed and developed. The central opening (the bright core) measures approximately 20 microns in diameter. The outer disk measures approximately 130 microns:
Figure 2: Photoresist image
Classify each image into one of 4 classes:
- No defect
- Presence of a dark spot
- Presence of a bright spot
- Presence of a scratch
Each image represents a hole in a photoresist film. The goal is to make sure that the photoresist film is clear of defects (dark spots, bright spots or scratches) in the area between the 20-micron central opening and the 130-micron outer disk.
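The area of interest described here is a ring between two circles. As a rough sketch of how one might isolate that ring with NumPy, the snippet below builds a boolean mask. It assumes the opening is centered at a known pixel and that the image scale in microns per pixel is known; neither is stated by the poster, and `annulus_mask`, `center` and `um_per_pixel` are hypothetical names. This is not something PowerAI Vision requires, only an illustration of the geometry.

```python
import numpy as np


def annulus_mask(shape, center, inner_diameter_um, outer_diameter_um, um_per_pixel):
    """Boolean mask that is True between the inner and outer circles.

    shape: (rows, cols) of the image.
    center: (row, col) of the opening, assumed known.
    um_per_pixel: image scale, assumed known.
    """
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    # Distance of every pixel from the center, converted to microns.
    dist_um = np.hypot(rows - center[0], cols - center[1]) * um_per_pixel
    return (dist_um > inner_diameter_um / 2) & (dist_um <= outer_diameter_um / 2)


# Hypothetical 200 x 200 image at 1 micron per pixel, opening at the center.
mask = annulus_mask((200, 200), (100, 100), 20.0, 130.0, 1.0)
print("pixels in the inspected ring:", int(mask.sum()))
```

Pixels inside the bright core and outside the outer disk are excluded, so only the film area that matters for the inspection is kept.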
Figure 3 shows examples of images showing defects:
Figure 3: Photoresist images with defects
The potential cost savings from image recognition are huge, and it can be used in any industry that requires visual inspection. Adding image recognition can reduce human error and improve the quality of outcomes in industries such as manufacturing, healthcare, oil and gas, finance, and telecommunications. And with IBM PowerAI Vision, the requirement to hire experienced data scientists to develop image recognition applications may be a thing of the past.
To find out more about PowerAI and this technology preview, visit IBM PowerAI Vision Technology Previews.
Special thanks to Sebastien Gilbert of IBM for his technical contributions.