As part of the IBM Watson & Cloud Developer Advocacy team, we are always challenged to push ourselves and the boundaries of what is possible. We are also always looking for new ways to show the developer community how to use the technologies and services IBM has to offer, often in collaboration with developers and startups. In this blog post, we present a cognitive demo built in collaboration with Realm.io, the startup behind the open source Realm Mobile Database.

Through this example, built in collaboration with the Realm team, you'll learn a different pattern for building a cognitive mobile application. The Scanner app lets users take a picture of an object and, with the help of the Watson Visual Recognition service, get a pretty good idea of what the object is. In addition to classifying the image, the app also performs face detection and text recognition.

First, let’s review the architecture.

Architecture diagram of the Watson Realm mobile app

We built the app for iOS using Swift, Apple's open source language. We manage the data on the client using Realm and synchronize it between the mobile app and a Node.js application using the recently released Realm Object Server. From Node.js, we invoke the Watson Visual Recognition APIs, and the results are returned to the mobile app for display.

What makes Realm interesting in this example is a new paradigm in app development: objects are the new API. Using the Realm Object Server, the app developer doesn't need to worry about REST API calls to the back-end service; data synchronization is handled behind the scenes by the Realm APIs. All the app developer needs to worry about is the UI/UX and the data.

The demo consists of four components:

iOS Application

We built the app using Swift 3. It updates the Realm object with image data and updates the object's state once a picture is taken or chosen from the phone's photo library. The application then listens for changes on the Realm object's properties: once the result is returned from the back end through the Realm Mobile Platform, the app displays it. As you can see in the sample project, the application makes no REST API calls whatsoever against the back-end service, nor does it poll for results, as we historically had to do.

In the iOS application, we create a Realm Object called Scan as shown below:

import RealmSwift

// Synced model object shared between the iOS app and the server.
class Scan: Object {
    dynamic var scanId = ""                   // unique identifier for the scan
    dynamic var status = ""                   // processing state, updated by app and server
    dynamic var textScanResult: String?       // text recognition result from Watson
    dynamic var classificationResult: String? // image classification result from Watson
    dynamic var faceDetectionResult: String?  // face detection result from Watson
    dynamic var imageData: Data?              // raw bytes of the captured image
}

This object is automatically instantiated on the local device and then synchronized with the Realm Object Server. The application can now listen for changes in the properties status, textScanResult, classificationResult, and faceDetectionResult, and update the UI once the data is available. Listening for changes in an object's properties in an iOS application can be achieved by leveraging a Cocoa runtime capability called Key-Value Observation (KVO), which allows observers to be registered to watch for changes in the properties of objects and data structures. With the combination of the Realm framework for Swift and KVO, we can build a reactive iOS application that focuses primarily on the UI/UX and the data, while leaving the rest to the Realm framework and the Realm Mobile Platform.
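
To make this concrete, here is a minimal sketch of what that observation could look like; the ScanResultViewController and its UI wiring are hypothetical illustrations, not code from the sample project:

import UIKit
import RealmSwift

class ScanResultViewController: UIViewController {
    private let observedKeyPaths = ["status", "textScanResult",
                                    "classificationResult", "faceDetectionResult"]
    var scan: Scan?

    func startObserving(_ scan: Scan) {
        self.scan = scan
        // Realm's dynamic properties are KVO-compliant, so standard Cocoa
        // observation works once the object is managed by a Realm.
        for keyPath in observedKeyPaths {
            scan.addObserver(self, forKeyPath: keyPath, options: [.new], context: nil)
        }
    }

    override func observeValue(forKeyPath keyPath: String?, of object: Any?,
                               change: [NSKeyValueChangeKey: Any]?,
                               context: UnsafeMutableRawPointer?) {
        // Fired whenever the Realm Object Server syncs an update from the back end.
        DispatchQueue.main.async {
            // Refresh labels here, e.g. from self.scan?.classificationResult.
        }
    }

    deinit {
        for keyPath in observedKeyPaths {
            scan?.removeObserver(self, forKeyPath: keyPath)
        }
    }
}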

For a step-by-step tutorial on integrating the Realm framework into a Swift-based mobile application and setting up the connection to the Realm Object Server, take a look at the following tutorial.

Realm Mobile Platform

The Realm Mobile Platform provides real-time data sync in an event-driven fashion, liberating application developers from writing networking code. We leverage it to sync the image the phone took to the server application, and to sync the results of the analysis back from the server. For details on installing and setting up the Realm Mobile Platform, take a look at the following link.
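
As a rough sketch of how little networking code this leaves in the app, the following hypothetical function logs in to a Realm Object Server and writes a Scan object, which the sync engine then pushes to the server automatically. The server URL, credentials, and Realm path below are placeholders, not values from the sample project:

import Foundation
import RealmSwift

func uploadScan(imageData: Data) {
    // Placeholder server and synced-Realm URLs.
    let serverURL = URL(string: "http://127.0.0.1:9080")!
    let realmURL = URL(string: "realm://127.0.0.1:9080/~/scanner")!
    let credentials = SyncCredentials.usernamePassword(username: "demo", password: "password")

    SyncUser.logIn(with: credentials, server: serverURL) { user, error in
        guard let user = user else { return }

        // Point a Realm configuration at the synced Realm on the server.
        var config = Realm.Configuration()
        config.syncConfiguration = SyncConfiguration(user: user, realmURL: realmURL)

        let realm = try! Realm(configuration: config)
        try! realm.write {
            let scan = Scan()
            scan.scanId = UUID().uuidString
            scan.status = "Uploading"
            scan.imageData = imageData // bytes from the camera or photo library
            realm.add(scan)
        }
        // No REST call: the Realm Object Server syncs the new Scan from here.
    }
}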

Node.js Application

The server-side app, written in Node.js, listens for changes on the Realm object's properties using the Realm listener API, available with the Professional Edition of the Realm Mobile Platform. The server-side app uses the Realm event framework, a mechanism that allows a server application to react to changes in Realm objects, letting developers create triggers and execute code when an object's properties change. In the Scanner application, once the image data is available on the Realm object, it is sent to the Watson Visual Recognition service for classification, face detection, and text recognition. As soon as the result is available, the Realm object is updated with it, and the update is synced back automatically to the mobile application through the Realm Mobile Platform.

We used the Watson SDK for Node.js to call the Watson Visual Recognition service for image classification, face detection, and text recognition.

To use the Visual Recognition API from your Node.js application, you can instantiate the service object as shown below:

// Instantiate the Visual Recognition client with the API key
// from your Bluemix service credentials.
var VisualRecognition = require('watson-developer-cloud/visual-recognition/v3');
var visual_recognition = new VisualRecognition({
    api_key: API_KEY,
    version_date: '2016-05-20'
});

Next, we initialize Realm, set up the listener, and call the Visual Recognition API once the image data is available.

var Realm = require('realm');
...
// Regular expression for the synced Realm paths to watch
var NOTIFIER_PATH = ".*/scanner";
...
var change_notification_callback = function(change_event) {
    let realm = change_event.realm;
    let changes = change_event.changes.Scan;
    let scanIndexes = changes.insertions; // indexes of newly inserted Scan objects
    ...
    // Run Watson text recognition, classification, and face detection on the
    // image data, then write the results back to the Scan object
    visual_recognition.recognizeText(params, function(err, res) {
    ...
    });
    ...
    visual_recognition.classify(params, function(err, res) {
    ...
    });
    ...
    visual_recognition.detectFaces(params, function(err, res) {
    ...
    });
}
...
//Create the admin user
var admin_user = Realm.Sync.User.adminUser(REALM_ADMIN_TOKEN);

//Callback on Realm changes
Realm.Sync.addListener(SERVER_URL, admin_user, NOTIFIER_PATH, 'change', change_notification_callback);

console.log('Listening for Realm changes across: ' + NOTIFIER_PATH);

The code snippet above is skeleton code showing how the Realm listener and the Watson APIs are used together in the server application. You will find the full sample at the GitHub link given earlier.

Watson Visual Recognition Service

The Watson Visual Recognition service enables developers to build image recognition applications based on their business needs, and provides a platform for creating custom classifiers to understand the content of images. For the purposes of the demo, we use the default classifier provided by the platform to classify objects, detect faces, and recognize any text in the image.
For details on creating a custom classifier, check out this video. For this demo, you only need to create a Bluemix account and a Visual Recognition service instance, then grab the API key from the service and apply it to the Node.js server application.

You can find an in-depth tutorial on building and running the sample application in the Scanner tutorial.
