An introduction to AI in Node.js

Recent advances in artificial intelligence (AI) have transformed many services and are expected to become pervasive in computing systems. Many tasks that previously required human interaction or expertise can now be captured and automated in machine learning or deep learning models. In this tutorial, you’ll get an overview of using AI in your Node.js applications by using TensorFlow.js.

To work through this learning path, you must obtain the deep learning model that performs the task that your application requires. Many open source pre-trained models are available for you to use. You can find several on the IBM Model Asset eXchange, or you can train your own model.

For developers, one consideration is how you deploy the deep learning models to be used by applications. There are two ways these models can be deployed:

  1. As a web service that can be accessed through an API call in the application
  2. Embedded in the application itself

Note that there are other ways to integrate AI into Node.js, such as TorchJS or ONNX.js, and they each offer different levels of support. In this tutorial, we use TensorFlow.js because it offers strong support on both the client and server. Because we are interested in Node.js, we focus on how to embed models in Node applications.

Why AI in JavaScript?

Although data scientists tend to prefer Python for AI development, JavaScript does offer several advantages on both the client and server:

  • The large community of JavaScript developers can help bring AI into use at a large scale.
  • The smaller footprint and fast start time of Node.js can be an advantage when deployed in containers and IoT devices.
  • AI models process voice, written text, and images. When the models are served in the cloud, the data must be sent to a remote server, which raises data privacy concerns. Being able to run the model locally on the client with JavaScript helps alleviate this concern.
  • Running a model locally on the client can help make browser apps more interactive.

What is TensorFlow.js?

TensorFlow.js is an open source software library for JavaScript developers to create and use machine learning or deep learning models directly in the browser or a Node.js application. TensorFlow is the broader open source software that includes support for different programming languages such as Python and different platforms such as server, mobile, and IoT.

With TensorFlow.js, you can:

  • Create models easily and train them from scratch.
  • Reuse a model that has been pre-trained. For Node.js specifically, a model can be written in Python to use the distributed training capability on huge data sets. Then, the trained model can be loaded and used in a Node.js application.
  • Use the GPU for faster processing.

Prerequisites

To follow this tutorial, you need:

  • Node.js and npm installed
  • A code editor, such as Visual Studio Code

Estimated time

It should take you approximately 30 minutes to complete the tutorial.

Steps

In this tutorial, we perform the following steps to show how you can get started with AI in Node.js:

  1. Set up a TensorFlow.js Node project
  2. Run a packaged model
  3. Run a TensorFlow.js web model

Set up a TensorFlow.js Node project

Before jumping into using TensorFlow.js, first make sure that your environment is ready for development. If you don’t already have Node.js installed, you can follow these instructions to get a general environment set up.

For a code editor, we recommend using Visual Studio Code, as mentioned in the Node.js tutorial. However, feel free to use any editor you like.

Create a project

Now, make a Node.js project to use for the rest of this tutorial.

mkdir tfjs-project
cd tfjs-project
npm init -y
npm install @tensorflow/tfjs-node

This initializes a new Node.js project and installs the CPU version of the TensorFlow.js package for Node.js. However, if you have a Linux or Windows machine with a CUDA-enabled NVIDIA GPU, you can install the GPU version instead.

npm install @tensorflow/tfjs-node-gpu

Note: For the GPU version, you also need the CUDA Toolkit and cuDNN SDK installed. You can find the required versions at this GitHub TensorFlow link.

In this tutorial, we assume that you have installed the CPU version @tensorflow/tfjs-node.

Run a packaged model

Before diving into tfjs and its APIs, there are some resources that provide popular pre-trained models and simplified APIs to help you get started quickly. The TensorFlow.js team at Google provides several pre-trained models in the tfjs-models repo. These pre-trained models are accessible in NPM packages with easy-to-use APIs. The IBM Center for Open-source Data & AI Technologies (CODAIT) team also provides some pre-trained models in the max-tfjs-models repo.

In this tutorial, we use the Object Detection (COCO-SSD) package @tensorflow-models/coco-ssd. You’ll use its APIs to identify multiple objects in a single image.

Use the COCO-SSD package

  1. Install the @tensorflow-models/coco-ssd package in the tfjs-project directory you previously created.

     npm install @tensorflow-models/coco-ssd
    
  2. Program the following tasks in Node.js.

    1. Use pre-packaged APIs to load the pre-trained model COCO-SSD.
    2. Use tfjs-node APIs to decode the image into a tf.Tensor3D.
    3. Pass the image tf.Tensor3D to the loaded model for inference.
    4. Print out the predictions.
  3. Ensure that you are inside the project folder, then copy and paste the following code into a file named index.js.

     const cocoSsd = require('@tensorflow-models/coco-ssd');
     const tf = require('@tensorflow/tfjs-node');
     const fs = require('fs').promises;
    
     // Load the Coco SSD model and image.
     Promise.all([cocoSsd.load(), fs.readFile('image1.jpg')])
     .then((results) => {
       // First result is the COCO-SSD model object.
       const model = results[0];
       // Second result is image buffer.
       const imgTensor = tf.node.decodeImage(new Uint8Array(results[1]), 3);
       // Call detect() to run inference.
       return model.detect(imgTensor);
     })
     .then((predictions) => {
       console.log(JSON.stringify(predictions, null, 2));
     });
    
  4. To run the object detection model, you need an image as input. For a test image, download this image into the project folder. You can run the program by using the following command.

node .

You have successfully used an Object Detection model to recognize the objects inside a picture. Now, take a look at the code so you’ll better understand the pieces.

Explaining the code

Using the pre-packaged APIs is simple. You can look at the detailed documentation for the COCO-SSD NPM package, and you can also get the complete application here (run-prepacked-model.js). First, we use require() to get the COCO-SSD module.

const cocoSsd = require('@tensorflow-models/coco-ssd');

We also need an image decoder, which is included in the tfjs-node module. See the details in the documents.

const tf = require('@tensorflow/tfjs-node');

Finally, we need the fs module to load the image file from the file system. Because most of the COCO-SSD APIs return Promise objects instead of using callbacks, we also use the promise-based APIs in the fs module.

const fs = require('fs').promises;

After loading all of the necessary modules, you can load the COCO-SSD pre-trained model and image file at the same time.

Promise.all([cocoSsd.load(), fs.readFile('image1.jpg')])

cocoSsd.load() loads the pre-trained model, and fs.readFile() loads the image. Both of them return Promise objects, and the results are returned in the then() callback function. The first object is the loaded model instance, and the second object is the image content as a Buffer.

Next, we use the image decoder APIs provided by tfjs-node to decode the raw image data into a tf.Tensor3D object. Tensors are n-dimensional arrays that act as TensorFlow’s fundamental data structure for passing around and manipulating data.

const imgTensor = tf.node.decodeImage(new Uint8Array(results[1]), 3);

The tf.Tensor3D object can then be passed to the loaded model’s detect() method for inference.

return model.detect(imgTensor);

The returned object is also a Promise, and it resolves to the predictions. We print the predictions to the console.

console.log(JSON.stringify(predictions, null, 2));

Other pre-trained models

There are many pre-packaged TensorFlow.js modules, and they all provide similar APIs to load the pre-trained model and run the inference. Usually, they also provide the data pre-processing APIs to convert the raw data into the proper data format. Then, you pass the processed data to their predict functions.

Run a TensorFlow.js web model

In the previous section, you ran a TensorFlow.js model packaged as an NPM module with a simple API. The module takes care of the entire model lifecycle. It loads the model and performs the processing on the inputs and outputs. This makes it easy to use, allowing you to perform machine learning tasks with minimal knowledge of the model.

However, you might want to work with a model that has not been packaged into a module. In this case, you would need to load the model and process the data yourself. In this section, you learn how to work with a TensorFlow.js model. You load a model and pre-process the input data to the tensor format required by the model. You might also have to post-process the model output to a format that is more understandable.

The model you use in this section is the same model that is packaged in the COCO-SSD NPM module you ran in the previous section. The code presented here is a modified version of the NPM module code.

TensorFlow.js models

TensorFlow.js provides support for several model types:

  • tf.LayersModel: This is created when using the TensorFlow.js Layers API to build a model. It is also produced when converting a Keras model with the tensorflowjs_converter tool. A LayersModel can be used for training and inferencing.

  • tf.GraphModel: This is created when converting a TensorFlow SavedModel with the tensorflowjs_converter tool. A GraphModel can be used for inferencing, but not training.

  • tf.node.TFSavedModel: In Node.js, TensorFlow.js provides native support for the TensorFlow SavedModel. You can load and run a SavedModel in the Node.js environment without conversion. Currently, SavedModels in Node.js are only for inferencing and not training.

The type of model you are working with determines the API used to load and run the model.

Loading a model

Whether it is a GraphModel or a LayersModel, a model is defined by several files: a model.json file containing the data flow graph, and shards of binary weight files. To use a model, you call the appropriate API and provide the URL or path to the model.json file.

You can also load models from TensorFlow Hub by providing the model’s TensorFlow Hub URL and setting the fromTFHub option to true.

For this exercise, you load the COCO-SSD model from the version hosted on TensorFlow Hub. Because it is a converted TensorFlow SavedModel, you must load it with the GraphModel API.

Now, let’s write some code to load the model.

  1. Create a new file called run-tfjs-model.js in the tfjs-project you created earlier.
  2. Add the following code to this file.

     const tf = require('@tensorflow/tfjs-node');
    
     const modelUrl = 'https://tfhub.dev/tensorflow/tfjs-model/ssdlite_mobilenet_v2/1/default/1';
    
     let model;
    
     // load COCO-SSD graph model from TensorFlow Hub
     const loadModel = async function () {
       console.log(`loading model from ${modelUrl}`);
    
       model = await tf.loadGraphModel(modelUrl, {fromTFHub: true});
    
       return model;
     }
    
     // run
     loadModel().then(model => {
       console.log(model);
     })
    
  3. Run the app.

     node run-tfjs-model.js
    

After running the app, you can see some information about the loaded TensorFlow.js model in the console log.

Input pre-processing

To run an inference on the model, you must provide an image as input. Because the model expects a four-dimensional Tensor of pixel values of an image, the image must be processed into the appropriate shaped Tensor before it can be passed to the model.

We’ll add some code to convert an image using the tf.node.decodeImage API. Then, we’ll increase the three-dimensional Tensor to four dimensions with tf.expandDims.

  1. Add the following pre-processing code to the run-tfjs-model.js file.

     const fs = require('fs');
    
     // convert image to Tensor
     const processInput = function (imagePath) {
       console.log(`preprocessing image ${imagePath}`);
    
       const image = fs.readFileSync(imagePath);
       const buf = Buffer.from(image);
       const uint8array = new Uint8Array(buf);
    
       return tf.node.decodeImage(uint8array, 3).expandDims();
     }
    
  2. Update the run code to allow passing in the path to an image file.

     // run
     if (process.argv.length < 3) {
       console.log('please pass an image to process. ex:');
       console.log('  node run-tfjs-model.js /path/to/image.jpg');
     } else {
       // e.g., /path/to/image.jpg
       let imagePath = process.argv[2];
    
       loadModel().then(model => {
         const inputTensor = processInput(imagePath);
         inputTensor.print();
       })
     }
    
  3. Run the app.

     node run-tfjs-model.js image1.jpg
    

When you run the code, the model is loaded, the image is pre-processed, and the resulting image tensor is displayed.

Running a model

When running an inference on a model, there are many options depending on the model type. All model APIs provide a predict function (that is, tf.GraphModel.predict, tf.LayersModel.predict, and tf.node.TFSavedModel.predict). The predict function accepts the input tensors and an optional prediction configuration.

There is also an execute function, but it is supported only for the GraphModel and SavedModel (for example, tf.GraphModel.execute and tf.node.TFSavedModel.execute).

The execute function accepts the input tensors and optional output node names. The output node names allow requesting intermediate tensors.

To run the COCO-SSD model and get a prediction, pass the image tensor to tf.GraphModel.executeAsync. This performs like the execute function but in an async fashion. When the model contains control flow ops, you must use executeAsync to avoid runtime errors.

  1. Add the following inference code to the run-tfjs-model.js file.

     // run prediction with the provided input Tensor
     const runModel = function (inputTensor) {
       console.log('running model');
    
       return model.executeAsync(inputTensor);
     }
    
  2. Update the run code to pass the image tensor to the model and get a prediction.

     // run
     if (process.argv.length < 3) {
       console.log('please pass an image to process. ex:');
       console.log('  node run-tfjs-model.js /path/to/image.jpg');
     } else {
       // e.g., /path/to/image.jpg
       let imagePath = process.argv[2];
    
       loadModel().then(model => {
         const inputTensor = processInput(imagePath);
         return runModel(inputTensor);
       }).then(prediction => {
         console.log(prediction);
       })
     }
    
  3. Run the app.

     node run-tfjs-model.js image1.jpg
    

With these changes, the image tensor is sent to the model for inferencing and the prediction is displayed.

Output post-processing

The prediction returned by the COCO-SSD model is an array with two tensors. The first tensor contains each class’s score for each bounding box found. The second tensor contains the coordinates of each bounding box found. These tensors need some processing to be represented in a more meaningful format. The processing computes bounding boxes, scores, and labels from the prediction.

  1. Add a helper function to extract the class and score. Note the use of dataSync() for the input tensor. This is how we get the tensor values into a TypedArray that we can use with regular JavaScript. The scores constant ends up being a flattened Float32Array containing every possible class’s score for each bounding box. The goal here is to go through this array and for each bounding box, determine which class has the highest score.

     // determine the classes and max scores from the prediction
     const extractClassesAndMaxScores = function (predictionScores) {
       console.log('calculating classes & max scores');
    
       const scores = predictionScores.dataSync();
       const numBoxesFound = predictionScores.shape[1];
       const numClassesFound = predictionScores.shape[2];
    
       const maxScores = [];
       const classes = [];
    
       // for each bounding box returned
       for (let i = 0; i < numBoxesFound; i++) {
         let maxScore = -1;
         let classIndex = -1;
    
         // find the class with the highest score
         for (let j = 0; j < numClassesFound; j++) {
           if (scores[i * numClassesFound + j] > maxScore) {
             maxScore = scores[i * numClassesFound + j];
             classIndex = j;
           }
         }
    
         maxScores[i] = maxScore;
         classes[i] = classIndex;
       }
    
       return [maxScores, classes];
     }
    
  2. Add a helper function to perform the non-maximum suppression (NMS) of bounding boxes. This is a technique to ensure that a particular object is identified only once.

     const maxNumBoxes = 5;
    
     // perform non maximum suppression of bounding boxes
     const calculateNMS = function (outputBoxes, maxScores) {
       console.log('calculating box indexes');
    
       const boxes = tf.tensor2d(outputBoxes.dataSync(), [outputBoxes.shape[1], outputBoxes.shape[3]]);
       const indexTensor = tf.image.nonMaxSuppression(boxes, maxScores, maxNumBoxes, 0.5, 0.5);
    
       return indexTensor.dataSync();
     }
    
  3. Add a helper function to build the JSON object from the boxes, scores, and classes. The code makes reference to a labels.js file. This file contains a mapping of the object labels to their index values/IDs returned by the model. Get the labels.js file and add this file to your project directory.

     const labels = require('./labels.js');
    
     let height = 1;
     let width = 1;
    
     // create JSON object with bounding boxes and label
     const createJSONresponse = function (boxes, scores, indexes, classes) {
       console.log('create JSON output');
    
       const count = indexes.length;
       const objects = [];
    
       for (let i = 0; i < count; i++) {
         const bbox = [];
    
         for (let j = 0; j < 4; j++) {
           bbox[j] = boxes[indexes[i] * 4 + j];
         }
    
         const minY = bbox[0] * height;
         const minX = bbox[1] * width;
         const maxY = bbox[2] * height;
         const maxX = bbox[3] * width;
    
         objects.push({
           bbox: [minX, minY, maxX, maxY],
           label: labels[classes[indexes[i]]],
           score: scores[indexes[i]]
         });
       }
    
       return objects;
     }
    
  4. Add the following output processing code.

     // process the model output into a friendly JSON format
     const processOutput = function (prediction) {
       console.log('processOutput');
    
       const [maxScores, classes] = extractClassesAndMaxScores(prediction[0]);
       const indexes = calculateNMS(prediction[1], maxScores);
    
       return createJSONresponse(prediction[1].dataSync(), maxScores, indexes, classes);
     }
    
  5. Update the run code to process the prediction.

     // run
     if (process.argv.length < 3) {
       console.log('please pass an image to process. ex:');
       console.log('  node run-tfjs-model.js /path/to/image.jpg');
     } else {
       // e.g., /path/to/image.jpg
       let imagePath = process.argv[2];
    
       loadModel().then(model => {
         const inputTensor = processInput(imagePath);
         height = inputTensor.shape[1];
         width = inputTensor.shape[2];
         return runModel(inputTensor);
       }).then(prediction => {
         const output = processOutput(prediction);
         console.log(output);
       })
     }
    
  6. Run the app.

     node run-tfjs-model.js image1.jpg
    

Congratulations! You now have a functioning TensorFlow.js application using Node.js. The input to the app is the path of an image file. The output is a JSON object containing each object detected, its score, and its location in the image. The following code shows an example.

[
  {
    bbox: [
      35.42379140853882,
      148.18407735228539,
      223.39171171188354,
      314.645888954401
    ],
    label: 'person',
    score: 0.9464929103851318
  },
  {
    bbox: [
      337.3498320579529,
      152.96796306967735,
      205.84774017333984,
      321.53918293118477
    ],
    label: 'person',
    score: 0.9131661653518677
  },
  {
    bbox: [
      181.7181944847107,
      44.92521911859512,
      212.15811967849731,
      423.26776707172394
    ],
    label: 'person',
    score: 0.7019169926643372
  }
]

Additional enhancements (Optional)

If you want to further enhance the application, you can try to draw the bounding boxes and labels on the image. Many approaches and tools exist to aid with this task. One tool is the @codait/max-vis module. Given an image and a JSON-formatted prediction, it tries to create a new version of the input image with the prediction drawn on the image.

  1. Install the max-vis tool (for example, npm install @codait/max-vis).
  2. Add the following max-vis annotation code to the app.

     const maxvis = require('@codait/max-vis');
     const path = require('path');
    
     const annotateImage = function (prediction, imagePath) {
       console.log(`annotating prediction result(s)`);
    
       maxvis.annotate(prediction, imagePath)
         .then(annotatedImageBuffer => {
           const f = path.join(path.parse(imagePath).dir, `${path.parse(imagePath).name}-annotate.png`);
    
           fs.writeFile(f, annotatedImageBuffer, (err) => {
             if (err) {
               console.error(err);
             } else {
               console.log(`annotated image saved as ${f}\r\n`);
             }
           });
         })
     }
    
  3. Update the run code to annotate the JSON response.

     // run
     if (process.argv.length < 3) {
       console.log('please pass an image to process. ex:');
       console.log('   node run-tfjs-model.js /path/to/image.jpg');
     } else {
       // e.g., /path/to/image.jpg
       let imagePath = process.argv[2];
    
       loadModel().then(model => {
         const inputTensor = processInput(imagePath);
         height = inputTensor.shape[1];
         width = inputTensor.shape[2];
         return runModel(inputTensor);
       }).then(prediction => {
         const jsonOutput = processOutput(prediction);
         console.log(jsonOutput);
         annotateImage(jsonOutput, imagePath);
       })
     }
    

Running with this additional code results in a new annotated image. The new image is created and saved in the same directory as the source image.

(Annotated image: the source image with labeled bounding boxes drawn around each detected object)

You can get the complete Node.js application (run-tfjs-model.js).

Conclusion

In this tutorial, you learned how JavaScript can be used as a tool for AI development with TensorFlow.js. You saw how you can bootstrap the AI capabilities in a Node.js app using two methods:

  1. Using a pre-packaged TensorFlow.js module with a simple API
  2. Using the TensorFlow.js API to load the model directly, and performing the pre-processing and post-processing steps needed to get the output you want

In this example, we used an object detection model to build Node apps that can identify objects and their locations in an image. In the next part of the series, we take a deeper dive into the concepts surrounding deep learning and walk through building models from scratch, all in JavaScript.
