Recent advances in artificial intelligence (AI) have transformed many services and are expected to be pervasive in computing systems in the near future. Many tasks that previously required human interaction or expertise can now be captured and automated in machine learning models or deep learning models. In this tutorial, you’ll get an overview of using AI in your Node.js applications by using TensorFlow.js.
To work through this learning path, you must obtain the deep learning model that performs the task that your application requires. Many open source pre-trained models are available for you to use. You can find several on the IBM Model Asset eXchange, or you can train your own model.
For developers, one consideration is how you deploy the deep learning models to be used by applications. There are two ways these models can be deployed:
- As a web service that can be accessed through an API call in the application
- Embedded in the application itself
Note that there are other ways to integrate AI into Node.js such as TorchJS or ONNX.js, and they each offer different levels of support. In this tutorial, we focus on TensorFlow.js because it offers greater support on both the client and server. Because we are interested in Node.js, we focus on how to embed models in Node applications.
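To make the two deployment options concrete, here is a minimal sketch (the endpoint and function names are hypothetical, for illustration only): a web-service deployment sends the data over HTTP to a remote scoring API, while an embedded deployment calls the model in-process with no network hop.

```javascript
// Option 1: web service -- the app POSTs data to a remote scoring API.
// (hypothetical endpoint; shown as request options only, not executed)
const buildScoringRequest = function (host, imageBytes) {
  return {
    host: host,                 // e.g., a model server you deploy
    path: '/predict',
    method: 'POST',
    headers: {
      'Content-Type': 'application/octet-stream',
      'Content-Length': imageBytes.length
    }
  };
};

// Option 2: embedded -- the model runs in-process.
// (stand-in function; in this tutorial the real call is model.detect())
const embeddedPredict = function (imageBytes) {
  return [{ label: 'example', score: 0.5, bytes: imageBytes.length }];
};

const fakeImage = new Uint8Array([1, 2, 3, 4]);
console.log(buildScoringRequest('model-server.example.com', fakeImage));
console.log(embeddedPredict(fakeImage));
```

The trade-off: the web service keeps the model off the client but adds latency and sends data to a remote server; embedding avoids both at the cost of shipping the model with the application.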
Why AI in JavaScript?
Although data scientists tend to prefer Python for AI development, JavaScript does offer several advantages on both the client and server:
- The large community of JavaScript developers makes it possible to apply AI at scale.
- The smaller footprint and fast start time of Node.js can be an advantage when deployed in containers and IoT devices.
- AI models process voice, written text, and images, and when the models are served in the cloud, the data must be sent to a remote server. Data privacy has become a significant concern recently, so being able to run the model locally on the client with JavaScript can help to alleviate this concern.
- Running a model locally on the client can help make browser apps more interactive.
What is TensorFlow.js?
TensorFlow.js is an open source software library for JavaScript developers to create and use machine learning or deep learning models directly in the browser or a Node.js application. TensorFlow is the broader open source software that includes support for different programming languages such as Python and different platforms such as server, mobile, and IoT.
With TensorFlow.js, you can:
- Create models easily and train them from scratch.
- Reuse a model that has been pre-trained. For Node.js specifically, a model can be written in Python to use the distributed training capability on huge data sets. Then, the trained model can be loaded and used in a Node.js application.
- Use the GPU for faster processing.
Prerequisites
To follow this tutorial, you need:
- Basic knowledge of Node.js
- Familiarity with AI and machine learning concepts
A workstation running an up-to-date version of Linux®, MacOS, or Windows® with:
- Node.js installed
- Visual Studio Code installed
- Python environment
- Xcode installed (MacOS only); you can install the command-line tools with xcode-select --install
Estimated time
It should take you approximately 30 minutes to complete the tutorial.
Steps
In this tutorial, we will perform the following to show how you can get started with AI in Node.js:
Set up a TensorFlow.js Node project
Before jumping into using TensorFlow.js, first make sure that your environment is ready for development. If you don’t already have Node.js installed, you can follow these instructions to get a general environment set up.
For a code editor, we recommend using Visual Studio Code, as mentioned in the Node.js tutorial. However, feel free to use any editor you like.
Create a project
Now, make a Node.js project to use for the rest of this tutorial.
mkdir tfjs-project
cd tfjs-project
npm init -y
npm install @tensorflow/tfjs-node
This initializes a new Node.js project and installs the CPU version of the TensorFlow.js package for Node.js. However, if you have a Linux or Windows machine with a CUDA-enabled NVIDIA GPU, you can install the GPU version instead.
npm install @tensorflow/tfjs-node-gpu
Note: For the GPU version, you also need the CUDA Toolkit and cuDNN SDK installed. You can find the required versions at this GitHub TensorFlow link.
In this tutorial, we assume that you have installed the CPU version, @tensorflow/tfjs-node.
Run a packaged model
Before diving into tfjs and its APIs, there are some resources that provide popular pre-trained models and simplified APIs to help you get started quickly. The TensorFlow.js team at Google provides several pre-trained models in the tfjs-models repo. These pre-trained models are accessible as NPM packages with easy-to-use APIs. The IBM Center for Open-source Data & AI Technologies (CODAIT) team also provides some pre-trained models in the max-tfjs-models repo.
In this tutorial, we use the Object Detection (COCO-SSD) package @tensorflow-models/coco-ssd. You’ll use its APIs to identify multiple objects in a single image.
Install the @tensorflow-models/coco-ssd package in the tfjs-project directory you previously created.

npm install @tensorflow-models/coco-ssd
Program the following tasks in Node.js.
- Use pre-packaged APIs to load the pre-trained model COCO-SSD.
- Use tfjs-node APIs to decode the image into a tf.Tensor3D.
- Pass the image tf.Tensor3D to the loaded model for inference.
- Print out the predictions.
Ensure that you are inside the project folder, then copy and paste the following code into a file named index.js.

const cocoSsd = require('@tensorflow-models/coco-ssd');
const tf = require('@tensorflow/tfjs-node');
const fs = require('fs').promises;

// Load the COCO-SSD model and the image.
Promise.all([cocoSsd.load(), fs.readFile('image1.jpg')])
  .then((results) => {
    // First result is the COCO-SSD model object.
    const model = results[0];
    // Second result is the image buffer.
    const imgTensor = tf.node.decodeImage(new Uint8Array(results[1]), 3);
    // Call detect() to run inference.
    return model.detect(imgTensor);
  })
  .then((predictions) => {
    console.log(JSON.stringify(predictions, null, 2));
  });
To run the object detection model, you need an image as input. For a test image, download this image into the project folder. You can run the program by using the following command.
node .
You have successfully used an object detection model to recognize the objects inside a picture. Now, take a look at the code so you’ll better understand the pieces.
Explaining the code
Using the pre-packaged APIs is simple. You can look at the detailed documentation for the COCO-SSD NPM package, and you can also get the complete application here (run-prepacked-model.js). First, we use require() to get the COCO-SSD module.
const cocoSsd = require('@tensorflow-models/coco-ssd');
We also need an image decoder, which is included in the tfjs-node module. See the details in the documentation.
const tf = require('@tensorflow/tfjs-node');
Finally, we need the fs module to load the image file from the file system. Because most of the COCO-SSD APIs return Promise objects instead of using callbacks, we also use the promise-based APIs of the fs module.

const fs = require('fs').promises;
After loading all of the necessary modules, you can load the COCO-SSD pre-trained model and image file at the same time.
Promise.all([cocoSsd.load(), fs.readFile('image1.jpg')])
cocoSsd.load() loads the pre-trained model, and fs.readFile() loads the image. Both of them return Promise objects, and the results are returned in the then() callback function. The first result is the loaded model instance, and the second result is the image content as a Buffer.
Next, we use the image decoder APIs provided by tfjs-node to decode the raw image data into a tf.Tensor3D object. Tensors are n-dimensional arrays that act as TensorFlow’s fundamental data structure for passing around and manipulating data.
const imgTensor = tf.node.decodeImage(new Uint8Array(results[1]), 3);
The tf.Tensor3D can then be passed to the detect() method of the loaded model for inference.
return model.detect(imgTensor);
The returned object is also a Promise, and its fulfilled value is the prediction. We print the prediction to the console.
console.log(JSON.stringify(predictions, null, 2));
Other pre-trained models
There are many pre-packaged TensorFlow.js modules, and they all provide similar APIs to load the pre-trained model and run the inference. Usually, they also provide the data pre-processing APIs to convert the raw data into the proper data format. Then, you pass the processed data to their predict functions.
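The common shape of these packages can be sketched with a hypothetical module (mocked here; it is not a real NPM package): a load() that resolves to a model object, and a predict-style method that accepts the pre-processed input.

```javascript
// A mocked stand-in for a pre-packaged model module. Real packages
// (coco-ssd, mobilenet, etc.) follow this same load()/predict shape,
// but fetch weights and run real inference under the hood.
const hypotheticalModel = {
  load: function () {
    // real modules download and initialize the model here
    return Promise.resolve({
      predict: function (input) {
        // the mock just echoes the input size with a fixed label
        return Promise.resolve([{ label: 'example', score: 0.42, size: input.length }]);
      }
    });
  }
};

hypotheticalModel.load()
  .then(model => model.predict(new Uint8Array([1, 2, 3])))
  .then(predictions => console.log(predictions));
```

Swapping one pre-packaged model for another usually means changing only the require() line, the pre-processing call, and the name of the predict-style method.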
Run a TensorFlow.js web model
In the previous section, you ran a TensorFlow.js model packaged as an NPM module with a simple API. The module takes care of the entire model lifecycle. It loads the model and performs the processing on the inputs and outputs. This makes it easy to use, allowing you to perform machine learning tasks with minimal knowledge of the model.
However, you might want to work with a model that has not been packaged into a module. In this case, you would need to load the model and process the data yourself. In this section, you learn how to work with a TensorFlow.js model. You load a model and pre-process the input data to the tensor format required by the model. You might also have to post-process the model output to a format that is more understandable.
The model you use in this section is the same model that is packaged in the COCO-SSD NPM module you ran in the previous section. The code presented here is a modified version of the NPM module code.
TensorFlow.js models
TensorFlow.js provides support for several model types:
- tf.LayersModel: This is created when using the TensorFlow.js Layers API to build a model. It is also produced when converting a Keras model with the tensorflowjs_converter tool. A LayersModel can be used for training and inferencing.
- tf.GraphModel: This is created when converting a TensorFlow SavedModel with the tensorflowjs_converter tool. A GraphModel can be used for inferencing, but not training.
- tf.node.TFSavedModel: In Node.js, TensorFlow.js provides native support for the TensorFlow SavedModel. You can load and run a SavedModel in the Node.js environment without conversion. Currently, SavedModels in Node.js can be used only for inferencing, not training.
The type of model you are working with determines the API used to load and run the model.
Loading a model
Several files make up the definition of a model, whether it is a GraphModel or a LayersModel: a model.json file containing the data flow graph, along with shards of binary weight files. To use a model, you must call the appropriate API and provide the URL or path to the model.json file.
You can also load models from TensorFlow Hub by providing the model’s TensorFlow Hub URL and including the option fromTFHub set to true.
- For tf.LayersModel, use loadLayersModel to load the model.
- For tf.GraphModel, use loadGraphModel.
- For a SavedModel, use tf.node.loadSavedModel and provide the path to the SavedModel directory.
For the exercise here, you load the COCO-SSD model from the version hosted on TensorFlow Hub. Because it is a converted TensorFlow SavedModel, you need to load it with the GraphModel API.
Now, let’s write some code to load the model.
- Create a new file called run-tfjs-model.js in the tfjs-project directory you created earlier. Add the following code to this file.

const tf = require('@tensorflow/tfjs-node');

const modelUrl = 'https://tfhub.dev/tensorflow/tfjs-model/ssdlite_mobilenet_v2/1/default/1';

let model;

// load COCO-SSD graph model from TensorFlow Hub
const loadModel = async function () {
  console.log(`loading model from ${modelUrl}`);
  model = await tf.loadGraphModel(modelUrl, {fromTFHub: true});
  return model;
}

// run
loadModel().then(model => {
  console.log(model);
})
Run the app.
node run-tfjs-model.js
After running the app, you can see some information about the loaded TensorFlow.js model in the console log.
Input pre-processing
To run an inference on the model, you must provide an image as input. Because the model expects a four-dimensional Tensor of pixel values of an image, the image must be processed into the appropriate shaped Tensor before it can be passed to the model.
We’ll add some code to convert an image using the tf.node.decodeImage API. Then, we’ll expand the three-dimensional tensor to four dimensions with tf.expandDims.
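Before wiring this up, it can help to see what adding a batch dimension means. The following plain-JavaScript sketch uses nested arrays as a stand-in for tensors and a small helper (getShape, defined here for illustration) to show a 3D image shape gaining a leading batch dimension, which is what expandDims() does conceptually:

```javascript
// compute the shape of a nested array (stand-in for a tensor's .shape)
const getShape = function (arr) {
  const shape = [];
  let current = arr;
  while (Array.isArray(current)) {
    shape.push(current.length);
    current = current[0];
  }
  return shape;
};

// a tiny 2x2 "image" with 3 channels per pixel: shape [2, 2, 3]
const image = [
  [[0, 0, 0], [255, 255, 255]],
  [[10, 20, 30], [40, 50, 60]]
];

// wrapping in one more array turns shape [2, 2, 3] into the
// batch-of-one shape [1, 2, 2, 3] that the model expects
const batched = [image];

console.log(getShape(image));   // [ 2, 2, 3 ]
console.log(getShape(batched)); // [ 1, 2, 2, 3 ]
```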
Add the following pre-processing code to the run-tfjs-model.js file.

const fs = require('fs');

// convert image to Tensor
const processInput = function (imagePath) {
  console.log(`preprocessing image ${imagePath}`);
  const image = fs.readFileSync(imagePath);
  const buf = Buffer.from(image);
  const uint8array = new Uint8Array(buf);
  return tf.node.decodeImage(uint8array, 3).expandDims();
}
Update the run code to allow passing in the path to an image file.
// run
if (process.argv.length < 3) {
  console.log('please pass an image to process. ex:');
  console.log('  node run-tfjs-model.js /path/to/image.jpg');
} else {
  // e.g., /path/to/image.jpg
  let imagePath = process.argv[2];

  loadModel().then(model => {
    const inputTensor = processInput(imagePath);
    inputTensor.print();
  })
}
Run the app.
node run-tfjs-model.js image1.jpg
When you run the code, the model is loaded, the image is pre-processed, and the resulting image tensor is displayed.
Running a model
When running an inference on a model, there are many options depending on the model type. All model APIs provide a predict function (that is, tf.GraphModel.predict, tf.LayersModel.predict, and tf.node.TFSavedModel.predict). The predict function accepts the input tensors and an optional prediction configuration.
There is also an execute function, but it is supported only by the GraphModel and SavedModel (that is, tf.GraphModel.execute and tf.node.TFSavedModel.execute).
The execute function accepts the input tensors and optional output node names. The output node names allow requesting intermediate tensors.
To run the COCO-SSD model and get a prediction, pass the image tensor to tf.GraphModel.executeAsync. This behaves like the execute function, but in an asynchronous fashion. When the model contains control flow ops, you must use executeAsync to avoid runtime errors.
Add the following inference code to the run-tfjs-model.js file.

// run prediction with the provided input Tensor
const runModel = function (inputTensor) {
  console.log('running model');
  return model.executeAsync(inputTensor);
}
Update the run code to pass the image tensor to the model and get a prediction.

// run
if (process.argv.length < 3) {
  console.log('please pass an image to process. ex:');
  console.log('  node run-tfjs-model.js /path/to/image.jpg');
} else {
  // e.g., /path/to/image.jpg
  let imagePath = process.argv[2];

  loadModel().then(model => {
    const inputTensor = processInput(imagePath);
    return runModel(inputTensor);
  }).then(prediction => {
    console.log(prediction);
  })
}
Run the app.
node run-tfjs-model.js image1.jpg
With these changes, the image tensor is sent to the model for inferencing and the prediction is displayed.
Output post-processing
The prediction returned by the COCO-SSD model is an array with two tensors. The first tensor contains each class’s score for each bounding box found. The second tensor contains the coordinates of each bounding box found. These tensors need some processing to be represented in a more meaningful format. The processing computes bounding boxes, scores, and labels from the prediction.
Add a helper function to extract the class and score. Note the use of dataSync() on the input tensor. This is how we get the tensor values into a TypedArray that we can use with regular JavaScript. The scores constant ends up being a flattened Float32Array containing every possible class’s score for each bounding box. The goal here is to go through this array and, for each bounding box, determine which class has the highest score.

// determine the classes and max scores from the prediction
const extractClassesAndMaxScores = function (predictionScores) {
  console.log('calculating classes & max scores');

  const scores = predictionScores.dataSync();
  const numBoxesFound = predictionScores.shape[1];
  const numClassesFound = predictionScores.shape[2];

  const maxScores = [];
  const classes = [];

  // for each bounding box returned
  for (let i = 0; i < numBoxesFound; i++) {
    let maxScore = -1;
    let classIndex = -1;

    // find the class with the highest score
    for (let j = 0; j < numClassesFound; j++) {
      if (scores[i * numClassesFound + j] > maxScore) {
        maxScore = scores[i * numClassesFound + j];
        classIndex = j;
      }
    }

    maxScores[i] = maxScore;
    classes[i] = classIndex;
  }

  return [maxScores, classes];
}
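To see the flattened indexing in isolation, here is the same argmax logic run on a tiny hypothetical score array (2 boxes x 3 classes), independent of TensorFlow.js:

```javascript
// hypothetical flattened scores for 2 boxes x 3 classes, laid out the
// same way as the Float32Array from dataSync():
// [box0class0, box0class1, box0class2, box1class0, ...]
const scores = new Float32Array([0.1, 0.7, 0.2, 0.05, 0.15, 0.8]);
const numBoxes = 2;
const numClasses = 3;

const maxScores = [];
const classes = [];

// for each box, scan its contiguous run of class scores for the max
for (let i = 0; i < numBoxes; i++) {
  let maxScore = -1;
  let classIndex = -1;
  for (let j = 0; j < numClasses; j++) {
    if (scores[i * numClasses + j] > maxScore) {
      maxScore = scores[i * numClasses + j];
      classIndex = j;
    }
  }
  maxScores[i] = maxScore;
  classes[i] = classIndex;
}

console.log(classes); // [ 1, 2 ] -- best class index per box
console.log(maxScores);
```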
Add a helper function to perform the non-maximum suppression (NMS) of bounding boxes. This is a technique to ensure that a particular object is identified only once.
const maxNumBoxes = 5;

// perform non maximum suppression of bounding boxes
const calculateNMS = function (outputBoxes, maxScores) {
  console.log('calculating box indexes');

  const boxes = tf.tensor2d(outputBoxes.dataSync(), [outputBoxes.shape[1], outputBoxes.shape[3]]);
  const indexTensor = tf.image.nonMaxSuppression(boxes, maxScores, maxNumBoxes, 0.5, 0.5);

  return indexTensor.dataSync();
}
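The tf.image.nonMaxSuppression call does the heavy lifting here. To illustrate the technique itself, the sketch below is a simplified plain-JavaScript version (greedy selection with an intersection-over-union threshold; the boxes and scores are hypothetical [minY, minX, maxY, maxX] values):

```javascript
// intersection-over-union of two boxes given as [minY, minX, maxY, maxX]
const iou = function (a, b) {
  const yMin = Math.max(a[0], b[0]);
  const xMin = Math.max(a[1], b[1]);
  const yMax = Math.min(a[2], b[2]);
  const xMax = Math.min(a[3], b[3]);
  const inter = Math.max(0, yMax - yMin) * Math.max(0, xMax - xMin);
  const areaA = (a[2] - a[0]) * (a[3] - a[1]);
  const areaB = (b[2] - b[0]) * (b[3] - b[1]);
  return inter / (areaA + areaB - inter);
};

// greedy NMS: repeatedly keep the highest-scoring box and drop any
// remaining box that overlaps it by more than iouThreshold
const simpleNMS = function (boxes, scores, iouThreshold) {
  const order = scores.map((s, i) => i).sort((i, j) => scores[j] - scores[i]);
  const keep = [];
  for (const i of order) {
    if (keep.every(k => iou(boxes[i], boxes[k]) <= iouThreshold)) {
      keep.push(i);
    }
  }
  return keep;
};

// two overlapping detections of the same object plus one separate one
const boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]];
const scores = [0.9, 0.8, 0.7];
console.log(simpleNMS(boxes, scores, 0.5)); // [ 0, 2 ] -- box 1 suppressed
```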
Add a helper function to build the JSON object from the boxes, scores, and classes. The code makes reference to a labels.js file, which contains a mapping of the object labels to the index values/IDs returned by the model. Get the labels.js file and add it to your project directory.

const labels = require('./labels.js');

let height = 1;
let width = 1;

// create JSON object with bounding boxes and label
const createJSONresponse = function (boxes, scores, indexes, classes) {
  console.log('create JSON output');

  const count = indexes.length;
  const objects = [];

  for (let i = 0; i < count; i++) {
    const bbox = [];

    for (let j = 0; j < 4; j++) {
      bbox[j] = boxes[indexes[i] * 4 + j];
    }

    const minY = bbox[0] * height;
    const minX = bbox[1] * width;
    const maxY = bbox[2] * height;
    const maxX = bbox[3] * width;

    objects.push({
      bbox: [minX, minY, maxX, maxY],
      label: labels[classes[indexes[i]]],
      score: scores[indexes[i]]
    });
  }

  return objects;
}
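The scaling in this function converts the model’s normalized [minY, minX, maxY, maxX] corners into pixel-space [minX, minY, maxX, maxY] boxes. Here is a standalone sketch of that conversion with a hypothetical box and image size:

```javascript
// a hypothetical normalized box from the model: [minY, minX, maxY, maxX]
const normalizedBox = [0.25, 0.5, 0.75, 1.0];

// hypothetical image dimensions in pixels
const imgHeight = 400;
const imgWidth = 600;

// scale to pixels and reorder corners to [minX, minY, maxX, maxY]
const toPixelBox = function (bbox, h, w) {
  const minY = bbox[0] * h;
  const minX = bbox[1] * w;
  const maxY = bbox[2] * h;
  const maxX = bbox[3] * w;
  return [minX, minY, maxX, maxY];
};

console.log(toPixelBox(normalizedBox, imgHeight, imgWidth)); // [ 300, 100, 600, 300 ]
```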
Add the following output processing code.
// process the model output into a friendly JSON format
const processOutput = function (prediction) {
  console.log('processOutput');

  const [maxScores, classes] = extractClassesAndMaxScores(prediction[0]);
  const indexes = calculateNMS(prediction[1], maxScores);

  return createJSONresponse(prediction[1].dataSync(), maxScores, indexes, classes);
}
Update the run code to process the prediction.

// run
if (process.argv.length < 3) {
  console.log('please pass an image to process. ex:');
  console.log('  node run-tfjs-model.js /path/to/image.jpg');
} else {
  // e.g., /path/to/image.jpg
  let imagePath = process.argv[2];

  loadModel().then(model => {
    const inputTensor = processInput(imagePath);
    height = inputTensor.shape[1];
    width = inputTensor.shape[2];
    return runModel(inputTensor);
  }).then(prediction => {
    const output = processOutput(prediction);
    console.log(output);
  })
}
Run the app.
node run-tfjs-model.js image1.jpg
Congratulations! You now have a functioning TensorFlow.js application using Node.js. The input to the app is the path of an image file. The output is a JSON object containing each object detected, its score, and its location in the image. The following code shows an example.
[
{
bbox: [
35.42379140853882,
148.18407735228539,
223.39171171188354,
314.645888954401
],
label: 'person',
score: 0.9464929103851318
},
{
bbox: [
337.3498320579529,
152.96796306967735,
205.84774017333984,
321.53918293118477
],
label: 'person',
score: 0.9131661653518677
},
{
bbox: [
181.7181944847107,
44.92521911859512,
212.15811967849731,
423.26776707172394
],
label: 'person',
score: 0.7019169926643372
}
]
Additional enhancements (Optional)
If you want to further enhance the application, you can try to draw the bounding boxes and labels on the image. Many approaches and tools exist to aid with this task. One tool is the @codait/max-vis module. Given an image and a JSON-formatted prediction, it tries to create a new version of the input image with the prediction drawn on it.
- Install the max-vis tool (for example, npm install @codait/max-vis).
- Add the following max-vis annotation code to the app.

const maxvis = require('@codait/max-vis');
const path = require('path');

const annotateImage = function (prediction, imagePath) {
  console.log(`annotating prediction result(s)`);

  maxvis.annotate(prediction, imagePath)
    .then(annotatedImageBuffer => {
      const f = path.join(path.parse(imagePath).dir, `${path.parse(imagePath).name}-annotate.png`);

      fs.writeFile(f, annotatedImageBuffer, (err) => {
        if (err) {
          console.error(err);
        } else {
          console.log(`annotated image saved as ${f}\r\n`);
        }
      });
    })
}
Update the run code to annotate the JSON response.

// run
if (process.argv.length < 3) {
  console.log('please pass an image to process. ex:');
  console.log('  node run-tfjs-model.js /path/to/image.jpg');
} else {
  // e.g., /path/to/image.jpg
  let imagePath = process.argv[2];

  loadModel().then(model => {
    const inputTensor = processInput(imagePath);
    height = inputTensor.shape[1];
    width = inputTensor.shape[2];
    return runModel(inputTensor);
  }).then(prediction => {
    const jsonOutput = processOutput(prediction);
    console.log(jsonOutput);
    annotateImage(jsonOutput, imagePath);
  })
}
Running with this additional code results in a new annotated image. The new image is created and saved in the same directory as the source image.
You can get the complete Node.js application (run-tfjs-model.js).
Conclusion
In this tutorial, you learned how JavaScript can be used as a tool for AI development with TensorFlow.js. You saw how you can bootstrap the AI capabilities in a Node.js app using two methods:
- Using a pre-packaged TensorFlow.js module with a simple API
- Using the TensorFlow.js API to load the model directly, and performing the pre-processing and post-processing steps needed to get the output you want.
In this example, we used an object detection model and were able to make Node.js apps that could identify objects and their locations in an image. In the next part of the series, we take a deeper dive into the concepts surrounding deep learning, and we walk through building models from scratch. All in JavaScript.