Coding a deep learning model using TensorFlow.js

In the previous tutorial “An introduction to AI in Node.js“, we explained two basic approaches for embedding a deep learning model in your Node.js application. In this tutorial, we go a step further and show you how to build and train a simple deep learning model from scratch. Unlike the previous tutorial, this one requires a more in-depth understanding of how deep learning models work to get the most benefit.

We start with the programming concepts for deep learning and cover two different programming APIs: the high-level Layers API and the low-level Core API. You’ll code a simple model to classify clothing items, train it with a small data set, and evaluate the model’s accuracy. Then, to illustrate a common practice in deep learning, you’ll take your trained model and apply transfer learning to teach the model to classify new items. We also describe how to take a pre-trained model from other sources such as Python and convert it to a format that can be used in JavaScript.

Why code models in JavaScript

So far, we have seen that the actual deep learning model can be hidden in an npm package, loaded from a binary format, or served through a REST API. In these cases, we are simply running an inference on the model, and we don’t care how the model was implemented.

Many models have been implemented in Python because Python is a popular choice among data scientists and has the broadest library support. However, the widespread adoption of deep learning in all types of applications has attracted developers from different programming language backgrounds. Additionally, practices in implementing models have become better understood and widely available, enabling more developers to build their own model that better fits their application.

Fortunately, TensorFlow is designed to support different language bindings, in particular, Python, C, R, JavaScript, and Java™. Because each language provides its own set of advantages, developers have their reasons for their choice of programming language. Therefore, it’s important to enable developers to stay with their familiar programming environments instead of requiring them to learn a new language.

Why high-level APIs

Deep learning follows the same maturity trend as other technologies. In the early days, the implementation of a technology tends to be done with low-level constructs, but as the technology matures, common patterns emerge and are captured as high-level constructs so that the implementation can be quicker and easier. For deep learning, these patterns include the kinds of layers that are commonly used in neural networks together with their activation functions, the practical choices for optimization, and the metrics to monitor how well the model is performing. The code implementing these patterns is then packaged into high-level programming APIs so that it can be reused easily.

Coding a model in a high-level API lets you be more productive by focusing on the high-level design and avoiding the nuts and bolts of the low-level implementation. The code is much shorter and easier to read and maintain.

Prerequisites

To follow this tutorial, you need:

Estimated time

It should take you approximately 40 minutes to complete the tutorial.

Steps

  1. Programming concepts in TensorFlow.js
  2. Convert a pre-trained model to TensorFlow.js
  3. Build a deep learning model with TensorFlow.js
  4. Transfer learning with TensorFlow.js

Programming concepts in TensorFlow.js

Conceptually, a neural network consists of many layers of weights together with computations, which are represented as nodes and edges in a graph. Programming platforms support the implementation of these graphs in different ways. In earlier approaches, you would work at the low level by explicitly allocating tensors and coding the individual computations on the tensors. As the technology evolved rapidly over the past few years, common patterns emerged, and these patterns are built into higher-level programming abstractions where entire layers of neural networks are available as APIs in the platform. Models can be easily built by stacking these layers together, greatly simplifying the implementation.

In Python, Keras is the most popular API for this layer approach while the earlier TensorFlow Python library targets the low-level approach. In JavaScript, TensorFlow.js supports both programming styles with its low-level and high-level APIs. We introduce the key concepts in these two programming styles.

Complete API documentation is available on the TensorFlow.js API website.

High-level API: Layers

The Layers API imitates the Keras programming style in Python, adapted to JavaScript syntax. The two APIs are closely similar, but not identical one-to-one. The main programming abstractions here are the model and the layers. You create a model object, which represents your deep learning model, and add any number of layers to it to implement your model architecture.

There are two ways to construct your model:

  1. tf.sequential(): The simplest way to build the model, by arranging the layers in linear order with one layer feeding the next. The tensors between the layers are allocated automatically, so you only need to manage the input tensor that feeds the first layer.
  2. tf.model(): The layers can be arranged in an arbitrary acyclic graph. You do need to manage all of the tensors between the layers and you connect the input to each layer through the apply() method.

There are many types of layers that can be used to implement your model architecture.

  • Basic layers: Commonly used functions
  • Activation layers: Various activation functions typically placed at the end of a major layer
  • Convolution layers: Various versions of the convolution function
  • Merge layers: Common matrix operations
  • Normalization layers: Normalize the activation output to a mean value of 0 and standard deviation of 1
  • Pooling layers: Pool values by average or max
  • Recurrent layers: Various layers for recurrent networks
  • Wrapper layers: Apply some transformation on top of another layer
  • Input layers: Manage the input to the sequential layers
  • Padding layers: Pad the border of an image with some values, typically zero
  • Noise layers: Dropout function for regularization during training to avoid overfitting
  • Mask layers: Skip input timesteps that match a mask value, for example, padding
  • Regularization layers: To avoid overfitting

It’s also important to initialize the tensors for weights, biases, and kernels to proper values so that the model converges during training.

After the model architecture has been defined with the appropriate layers, you must specify three parameters that are required for training. This is done with the compile method in tf.LayersModel:

  • Optimizer
  • Loss function
  • Metric

These parameters can be specified as a convenient string name, such as “accuracy”, or as an object created with the low-level Core API (described below).

To train the Layers model, the tf.LayersModel API provides two methods:

  • fit runs the training for a fixed number of iterations.
  • fitDataset runs the training on input as provided by the Dataset object.

For more details, visit the TensorFlow.js page on layers.

Composing a sequential model

Now you have a clear picture of the TensorFlow.js APIs. The following list summarizes how to compose a sequential model.

  1. Define neural network layers, including:
    • Layer type
    • Number of neural nodes
    • Activation function
    • Initializer
  2. Choose a proper optimizer to train your model, like sgd or adam.
  3. Decide the loss function to minimize the distance between the model’s output and labels.
  4. Select a list of metrics that you want to monitor while training the models.
  5. Call compile() with the optimizer, loss function, and metrics.
  6. Start the model training by fitting the data set to the model.
  7. Monitor the training progress and evaluate the trained model.

When the training is complete, you can use the save API of a sequential model to store the trained model to the file system.

await model.save('file:///path/to/my-model');

Later on, you can use the tf.loadLayersModel API to load the model from the file system.

const model = await tf.loadLayersModel('file:///path/to/my-model/model.json');

Low-level API: Core

At the low level, a deep learning model is a directed graph in which the nodes represent the operations and the edges represent the data flowing through the graph. Mirroring this concept, the low-level programming constructs consist of:

  1. Data objects tf.tensor are multi-dimensional arrays of various data types. To support typical usage, the API lets you create tensors of various shapes, which are filled with various data patterns.

  2. Operations include linear algebra and machine learning computations on input tensors, producing a new tf.tensor as a result. Common activation functions are supported, such as sigmoid, ReLU, and leaky ReLU. The API also includes some special-purpose operations for audio and image processing.

More information on tensors and operations can be found on the Tensors and operations page.

For training at the low level, the Core API provides optimizers, loss functions, and gradient computation.

Training with the selected optimizer is done through the Optimizer.minimize() method.

Convert a pre-trained model to TensorFlow.js

Although there are many open source pre-trained models for TensorFlow.js, more models are trained and available in TensorFlow and Keras Python formats. For those models, conversion is necessary before they can be used for inference with TensorFlow.js.

Conversion is possible for TensorFlow SavedModel and Keras models. However, if the model includes operations that are not supported by TensorFlow.js, the conversion will fail. See the complete list of TensorFlow.js supported operations.

In this section, we show an example of converting a Keras HDF5 model to a TensorFlow.js GraphModel format. Other formats and conversion scenarios can be found at the tensorflow/tfjs GitHub repo.

Prerequisites

The tensorflowjs_converter utility can be installed using the tensorflowjs Python package. tensorflowjs requires (and installs) specific versions of TensorFlow and Keras. To ensure compatibility and avoid disrupting your existing Python environment, it’s recommended that you install tensorflowjs using Python 3.6.8 in a virtual environment or Docker.

You can skip this step if your system already meets the following requirements.

  • Python 3.6.8: Use the pyenv instructions to install or manage an additional Python runtime.
  • Virtualenv: Set up using your operating system-specific instructions or use these virtualenv instructions.

As an example, we convert a pre-trained Fashion-MNIST Keras HDF5 model to the TensorFlow.js GraphModel format.

  1. Create a folder as the workspace and two subfolders for conversion.

     mkdir tfjs_converter
     cd tfjs_converter
     mkdir kerasmodel graphmodel
    
  2. Download the fashion_mnist.h5 model to the kerasmodel folder.

     wget -P kerasmodel/ https://dax.cdn.appdomain.cloud/dax-fashion-mnist/1.0.2/pre-trained-models/fashion_mnist.h5
    

Conversion

Pick one of the following options.

Option 1: Install tensorflowjs_converter in a Virtualenv

  1. Install an additional Python 3.6.8 runtime using pyenv.

     pyenv install 3.6.8
    
  2. Install the tensorflowjs Python package that contains the tensorflowjs_converter in a Virtualenv with the Python 3.6.8 runtime.

     virtualenv -p $(pyenv root)/versions/3.6.8/bin/python --no-site-packages venv
     source venv/bin/activate
     pip install tensorflowjs
    
  3. Convert from Keras HDF5 to GraphModel.

     tensorflowjs_converter \
     --input_format=keras \
     --output_format=tfjs_graph_model \
     ./kerasmodel/fashion_mnist.h5 \
     ./graphmodel
    

Option 2: Convert using a pre-built Docker image

Convert inside a Docker container.

```
docker run --rm -v ${PWD}:/root/model \
tedhtchang/tensorflowjs_converter \
--input_format=keras \
--output_format=tfjs_graph_model \
/root/model/kerasmodel/fashion_mnist.h5 \
/root/model/graphmodel
```

Verify conversion

Verify that the graphmodel folder contains a model.json file and sharded binaries, for example, group1-shard1of1.bin.

Finding the saved_model_tags and signature_name

Some common errors might occur during conversion. You might need to specify additional options using the --saved_model_tags and --signature_name options.

The following error applies only to TensorFlow SavedModel conversion, when the MetaGraphDef tags are not the default serve.

RuntimeError: MetaGraphDef associated with tags 'serve' could not be found in SavedModel. To inspect available tag-sets in the SavedModel, please use the SavedModel CLI: `saved_model_cli`

Use the saved_model_cli command to list the possible values for --saved_model_tags. The command is available as part of the Python TensorFlow package installation.

saved_model_cli show --dir <model folder>

Another conversion error might occur if the TFHub module Signature or the saved_model Signature is not default or serving_default, respectively.

ValueError: Signature 'default' is missing from meta graph.

To list the possible Signatures tagged by <tag name>, run the following command.

saved_model_cli show --dir <model folder> --tag_set <tag name>

Build a deep learning model with TensorFlow.js

In this section of the tutorial, you learn how to build a deep learning model using the TensorFlow.js Layers API. We go over the following steps in the model-building flow: load the data, define the model, train the model, and test the model.

Gathering, preparing, and creating a data set is beyond the scope of this tutorial. Instead, you use a ready-to-use data set from the IBM Data Asset eXchange (DAX). DAX provides a curated list of free and open data sets.

The data set you’ll work with is the Fashion-MNIST data set found in DAX. Fashion-MNIST contains pixel data and labels for over 60,000 images of 10 different clothing items. With this data, you’ll build and train a model to identify the clothing items.

Load the data

The Fashion-MNIST data set includes two CSV files (a training set and a test set). The first column of the CSV files represents the label for an item. The remaining (784) columns represent the pixel values (0-255) for the image (28×28).

While the data set contains pictures of 10 different clothing items, you’ll be working with only half the data set (five clothing items). This allows you in the next section to take the model you built here and do some transfer learning with the second half of the data set (without having to create your own new data set for transfer learning).

TensorFlow.js provides a Data API to load and parse data. CSV data can be loaded using tf.data.csv.

  1. Start a new Node.js project (for example, tfjs-tutorial) and install the tfjs-node package:

     $ mkdir tfjs-tutorial
     $ cd tfjs-tutorial
     $ npm init -y
     $ npm install @tensorflow/tfjs-node
    
  2. Download and extract the Fashion-MNIST data set. It should contain two CSV files (fashion-mnist_train.csv and fashion-mnist_test.csv).

  3. Create and open a build-model.js file in the tfjs-tutorial project using VS Code or your favorite IDE.

  4. Add the following code to the build-model.js file and update the trainDataUrl and testDataUrl to the proper path of the extracted data files. Here, we are simply initializing the environment values. Note that there are 10 labels, but we are only using five classes.

     // TensorFlow.js for Node.js
     const tf = require('@tensorflow/tfjs-node');
    
     // Fashion-MNIST training & test data
     const trainDataUrl = 'file://./fashion-mnist/fashion-mnist_train.csv';
     const testDataUrl = 'file://./fashion-mnist/fashion-mnist_test.csv';
    
     // mapping of Fashion-MNIST labels (i.e., T-shirt=0, Trouser=1, etc.)
     const labels = [
       'T-shirt/top',
       'Trouser',
       'Pullover',
       'Dress',
       'Coat',
       'Sandal',
       'Shirt',
       'Sneaker',
       'Bag',
       'Ankle boot'
     ];
    
     // Build, train a model with a subset of the data
     const numOfClasses = 5;
    
     const imageWidth = 28;
     const imageHeight = 28;
     const imageChannels = 1;
    
     const batchSize = 100;
     const epochsValue = 5;
    
  5. Add to the build-model.js file, the code to load the data set and normalize the pixel values (0-255) between 0 and 1. This is a common practice because the math functions in the library typically operate on floating point tensors. The transform function converts the label representation to one-hot vectors, which are also commonly used for categorical classification. We then select the set of images that belong to the classes to be used for this exercise, and group them into batches for training.

     // load and normalize data
     const loadData = function (dataUrl, batches=batchSize) {
       // normalize data values between 0-1
       const normalize = ({xs, ys}) => {
         return {
             xs: Object.values(xs).map(x => x / 255),
             ys: ys.label
         };
       };
    
       // transform input array (xs) to 3D tensor
       // binarize output label (ys)
       const transform = ({xs, ys}) => {
         // array of zeros
         const zeros = (new Array(numOfClasses)).fill(0);
    
         return {
             xs: tf.tensor(xs, [imageWidth, imageHeight, imageChannels]),
             ys: tf.tensor1d(zeros.map((z, i) => {
                 return i === ys ? 1 : 0;
             }))
         };
       };
    
       // load, normalize, transform, batch
       return tf.data
         .csv(dataUrl, {columnConfigs: {label: {isLabel: true}}})
         .map(normalize)
         .filter(f => f.ys < numOfClasses) // only use a subset of the data
         .map(transform)
         .batch(batches);
     };
    
     // run
     const run = async function () {
       const trainData = loadData(trainDataUrl);
    
       const arr = await trainData.take(1).toArray();
       arr[0].ys.print();
       arr[0].xs.print();
     };
    
     run();
    
  6. Run the app.

     $ node build-model.js
    

When the code is run, the training data is loaded, normalized, and turned into tensors. The label values (ys) and the normalized pixel values (xs) are displayed for the first set of images.

Build the model

The model architecture to build depends on your use case and the type of data you are working with. For the image input in this example, convolutional neural networks (CNNs) have been shown to be effective at extracting useful features from images so that the model can learn. Our model architecture consists of two 2D convolution layers, each followed by a max pooling layer. These make up the four hidden layers of the model. Then, we flatten the tensors to feed into a dense layer for the final classification. You build the layers using the TensorFlow.js Layers API.

Note that the first (input) layer conv2d requires an inputShape to indicate the shape of the input that the model receives. The final (output) layer dense includes a units parameter to indicate the dimension of the output. Note also that there is no declaration for the tensors and shape between the layers because this is all done automatically by the Layers API. The conv2d layer requires a number of parameters specific to convolution, such as kernel size and shape. The activation function is built into this layer. In the final layer dense, the softmax activation function generates the classification probabilities of the image.

After we define the layers, the model must be compiled with the optimizer and loss functions to configure and prepare the model for training and evaluation. You can see how the code to implement the model is fairly compact and easy to understand by using the high-level API.

  1. Add the following code to the build-model.js file.

     // Define the model architecture
     const buildModel = function () {
       const model = tf.sequential();
    
       // add the model layers
       model.add(tf.layers.conv2d({
         inputShape: [imageWidth, imageHeight, imageChannels],
         filters: 8,
         kernelSize: 5,
         padding: 'same',
         activation: 'relu'
       }));
       model.add(tf.layers.maxPooling2d({
         poolSize: 2,
         strides: 2
       }));
       model.add(tf.layers.conv2d({
         filters: 16,
         kernelSize: 5,
         padding: 'same',
         activation: 'relu'
       }));
       model.add(tf.layers.maxPooling2d({
         poolSize: 3,
         strides: 3
       }));
       model.add(tf.layers.flatten());
       model.add(tf.layers.dense({
         units: numOfClasses,
         activation: 'softmax'
       }));
    
       // compile the model
       model.compile({
         optimizer: 'adam',
         loss: 'categoricalCrossentropy',
         metrics: ['accuracy']
       });
    
       return model;
     }
    
  2. Update the run code.

     // run
     const run = async function () {
       const trainData = loadData(trainDataUrl);
       const model = buildModel();
       model.summary();
     };
    
  3. Run the app.

     $ node build-model.js
    

This change builds a model and displays a summary of the model architecture.

Train the model

Training is required before the model can be used. To train the model, it must be fitted with the training data set.

  1. Update the build-model.js file with the training code.

     // train the model against the training data
     const trainModel = async function (model, trainingData, epochs=epochsValue) {
       const options = {
         epochs: epochs,
         verbose: 0,
         callbacks: {
           onEpochBegin: async (epoch, logs) => {
             console.log(`Epoch ${epoch + 1} of ${epochs} ...`)
           },
           onEpochEnd: async (epoch, logs) => {
             console.log(`  train-set loss: ${logs.loss.toFixed(4)}`)
             console.log(`  train-set accuracy: ${logs.acc.toFixed(4)}`)
           }
         }
       };
    
       return await model.fitDataset(trainingData, options);
     };
    
  2. Update the run code.

     // run
     const run = async function () {
       const trainData = loadData(trainDataUrl);
       const model = buildModel();
       model.summary();
       const info = await trainModel(model, trainData);
       console.log(info);
     };
    
  3. Run the app. Note that training might take several minutes.

     $ node build-model.js
    

When this code is run, the model goes through training using the training data and iterates for the number of epochs defined. Each iteration displays the loss and accuracy values. You should see the accuracy improving after each epoch. You can experiment with the model architecture by adding some more pairs of conv2d and maxPooling2d layers to see whether the accuracy improves further.

Evaluate the model

After training, the model can be evaluated with the test data set, which it has not yet seen. The test data set should be processed in the same manner as the training data set. We rerun the training and then run the test data set through the model after training has completed.

  1. Edit the build-model.js file and add the evaluation code.

     // verify the model against the test data
     const evaluateModel = async function (model, testingData) {
       const result = await model.evaluateDataset(testingData);
       const testLoss = result[0].dataSync()[0];
       const testAcc = result[1].dataSync()[0];
    
       console.log(`  test-set loss: ${testLoss.toFixed(4)}`);
       console.log(`  test-set accuracy: ${testAcc.toFixed(4)}`);
     };
    
  2. Update the run code.

     // run
     const run = async function () {
       const trainData = loadData(trainDataUrl);
       const testData = loadData(testDataUrl);
    
       const model = buildModel();
       model.summary();
    
       const info = await trainModel(model, trainData);
       console.log(info);
    
       console.log('Evaluating model...');
       await evaluateModel(model, testData);
     };
    
  3. Run the app.

     $ node build-model.js
    

This code evaluates the model using the test data set and displays the loss and accuracy values for the model. Typically, the accuracy on the test data is slightly lower than the training data. If it is significantly lower, then the model is not performing well and overfitting can be one of the causes.

Save the model

At this point, you can start using your model for predictions. To do this, you must first save your model so that you can later load and run it in a browser environment or a separate Node.js application. It can also be used for transfer learning. Models are saved with the tf.LayersModel.save method.

  1. Update the run code and the saveModelPath to the preferred path to where you want to save the model.

     // run
     const run = async function () {
       const trainData = loadData(trainDataUrl);
       const testData = loadData(testDataUrl);
       const saveModelPath = 'file://./fashion-mnist-tfjs';
    
       const model = buildModel();
       model.summary();
    
       const info = await trainModel(model, trainData);
       console.log(info);
    
       console.log('Evaluating model...');
       await evaluateModel(model, testData);
    
       console.log('Saving model...');
       await model.save(saveModelPath);
     };
    
  2. Run the app.

     $ node build-model.js
    

The model is saved in a fashion-mnist-tfjs folder in the current working directory. The saved content includes the topology (model.json) and weights (weights.bin) of the model.

You can find the complete Node.js application to build, train, and save a model in the /src/build-model.js file.

Note: Feel free to experiment with different model architectures and try to improve or create a better performing model. You can also increase the number of epochs during training, or try different activation functions and optimizers.

Run the model

Your saved model can now be loaded and used to make predictions on images of clothing items.

Remember, the Fashion-MNIST data set we used was derived from 28×28 grayscale images. Any image that you want to run through the model must be converted to a 28×28 grayscale image. We use the jimp library to help with the necessary image manipulation.

  1. Install the jimp library into your Node.js project.

     $ npm install --save jimp
    
  2. Create and open a test-model.js file in the project.

  3. Add the initialization code to the test-model.js file.

     // TensorFlow.js for Node.js
     const tf = require('@tensorflow/tfjs-node');
    
     // mapping of Fashion-MNIST labels
     const labels = [
       'T-shirt/top',
       'Trouser',
       'Pullover',
       'Dress',
       'Coat',
       'Sandal',
       'Shirt',
       'Sneaker',
       'Bag',
       'Ankle boot'
     ];
    
     const imageWidth = 28;
     const imageHeight = 28;
     const imageChannels = 1;
    
  4. Add the code that converts an image into the format needed to the test-model.js file.

     const Jimp = require('jimp');
    
     // Convert image to array of normalized pixel values
     const toPixelData = async function (imgPath) {
       const pixeldata = [];
       const image = await Jimp.read(imgPath);
       await image
           .resize(imageWidth, imageHeight)
           .greyscale()
           .invert()
           .scan(0, 0, imageWidth, imageHeight, (x, y, idx) => {
             let v = image.bitmap.data[idx + 0];
             pixeldata.push(v / 255);
           });
    
       return pixeldata;
     };
    
  5. Update the test-model.js file to add the code to perform a prediction.

     const runPrediction = function (model, imagepath) {
       return toPixelData(imagepath).then(pixeldata => {
         const imageTensor = tf.tensor(pixeldata, [imageWidth, imageHeight, imageChannels]);
         const inputTensor = imageTensor.expandDims();
         const prediction = model.predict(inputTensor);
         const scores = prediction.arraySync()[0];
    
         const maxScore = prediction.max().arraySync();
         const maxScoreIndex = scores.indexOf(maxScore);
    
         const labelScores = {};
    
         scores.forEach((s, i) => {
             labelScores[labels[i]] = parseFloat(s.toFixed(4));
         });
    
         return {
             prediction: `${labels[maxScoreIndex]} (${parseInt(maxScore * 100)}%)`,
             scores: labelScores
         };
       });
     };
    
  6. Add the code to run the app and update the modelUrl to the correct path for the saved model.

     // run
     const run = async function () {
       if (process.argv.length < 3) {
         console.log('please pass an image to process. ex:');
         console.log('  node test-model.js /path/to/image.jpg');
       } else {
         // e.g., /path/to/image.jpg
         const imgPath = process.argv[2];
    
         const modelUrl = 'file://./fashion-mnist-tfjs/model.json';
    
         console.log('Loading model...');
         const model = await tf.loadLayersModel(modelUrl);
         model.summary();
    
         console.log('Running prediction...');
         const prediction = await runPrediction(model, imgPath);
         console.log(prediction);
       }
     };
    
     run();
    
  7. Run the app, passing in the full path to an image.

     $ node test-model.js dress-red.jpg
    

When you run this code, the image is processed and a prediction is made. The prediction output is a JSON object that contains the prediction and the score for each available label. For example:

{
  "prediction": "Dress (62%)",
  "scores": {
    "T-shirt/top": 0.0018,
    "Trouser": 0.0106,
    "Pullover": 0.0133,
    "Dress": 0.6229,
    "Coat": 0.3427
  }
}

You can find the complete Node.js application to load and test the model in the /src/test-model.js file.

Transfer learning with TensorFlow.js

Let’s practice a commonly used technique to leverage already trained models for our own specific use cases. Modern, state-of-the-art models typically have millions of parameters and can take inordinate amounts of time to fully train. Transfer learning shortcuts a lot of this training work by taking a model trained on one task and repurposing it for a second related task. We do this by replacing the final layers of the pre-trained model with new layers and then training them with new data. A major advantage of this technique is that much less training data is needed to train an effective model for new classes.

Remember that for this approach to be effective, model features learned from the first task should be general, that is, features should be similar between both the first and second tasks.

In the previous section, we created a model trained on five classes from the Fashion MNIST data set. Let’s try making a classifier for the other five classes using transfer learning. However, this time we use only a fraction of the training data set for each class. We should be able to train a classifier much faster.

Setup

To get started, copy the build-model.js file contents into another file where you’ll adapt it for transfer learning.

cp build-model.js transfer-learn.js

Make data loading adjustments

You can keep the loading almost the same, with only a few adjustments.

  1. Alter the filter function in the return statement of loadData. We want the other part of the data set that we didn’t train on previously, so let’s change the expression to only get data where the label is greater than or equal to the cutoff given by labels.length - numOfClasses.

     return tf.data
       .csv(dataUrl, {columnConfigs: {label: {isLabel: true}}})
       .map(normalize)
       .filter(f => f.ys >= (labels.length - numOfClasses)) // Note the change here.
       .map(transform)
       .batch(batchSize);
    
  2. Alter the transform function in loadData to accurately map the final five labels to one-hot vectors. For example, ‘Sandal’ has a label number of 5, so we subtract the number of classes (which is 5 in our case) from it to get the ‘hot’ index in the one-hot encoding array (that is, [1, 0, 0, 0, 0]).

     const transform = ({xs, ys}) => {
       const zeros = (new Array(numOfClasses)).fill(0);
    
       return {
         xs: tf.tensor(xs, [imageWidth, imageHeight, imageChannels]),
         ys: tf.tensor1d(zeros.map((z, i) => {
           // Note the change from ys to (ys - numOfClasses)
           return i === (ys - numOfClasses) ? 1 : 0;
         }))
       };
     };
    
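To sanity-check the filter cutoff and label remapping from the two steps above, here is a plain JavaScript sketch of the same arithmetic that runs without TensorFlow.js. The `labels` array lists the full Fashion MNIST label set in order.

```javascript
// Plain JavaScript sketch of the filter cutoff and one-hot remapping
// used above; no tensors involved.
const labels = [
  'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
  'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'
];
const numOfClasses = 5;
const cutoff = labels.length - numOfClasses; // 5

// The filter keeps only the second five classes (label >= 5).
const kept = labels.filter((_, ys) => ys >= cutoff);
console.log(kept); // [ 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot' ]

// The transform shifts labels 5..9 down to one-hot indices 0..4.
const oneHot = ys =>
  new Array(numOfClasses).fill(0).map((z, i) => (i === ys - numOfClasses ? 1 : 0));

console.log(oneHot(5)); // 'Sandal'     -> [ 1, 0, 0, 0, 0 ]
console.log(oneHot(9)); // 'Ankle boot' -> [ 0, 0, 0, 0, 1 ]
```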

Change the model building function

Because we are no longer building a model from scratch and are going to rely on the model built from before, let’s change the buildModel function. This time, buildModel requires an argument for the base model (that is, the pre-trained model from before). We use that as a base for the new model.

const buildModel = function (baseModel) {

  // Remove the last layer of the base model. This is the softmax
  // classification layer used for classifying the first five classes
  // of Fashion-MNIST. This leaves us with the 'Flatten' layer as the
  // new final layer.
  baseModel.layers.pop();

  // Freeze the weights in the base model layers (feature layers) so they
  // don't change when we train the new model.
  for (const layer of baseModel.layers) {
    layer.trainable = false;
  }

  // Create a new sequential model starting from the layers of the
  // previous model.
  const model = tf.sequential({
    layers: baseModel.layers
  });

  // Add a new softmax dense layer. This layer will have the trainable
  // parameters for classifying our new classes.
  model.add(tf.layers.dense({
    units: numOfClasses,
    activation: 'softmax',
    name: 'topSoftmax'
  }));

  model.compile({
    optimizer: 'adam',
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy']
  });

  return model;
}

Update the run code

Update the run code to load the pre-trained model from before and use it for the new model. We are also going to train on only a subset of the data set. In this case, we are only training on 10% of the available training images from the new set of classes (approximately 600 images per class compared to the original 6000).

const run = async function () {

  const trainData = loadData(trainDataUrl);
  const testData = loadData(testDataUrl);

  // Determine how many batches to take for reduced training set.
  const amount = Math.floor(3000 / batchSize);
  const trainDataSubset = trainData.take(amount);

  const baseModelUrl = 'file://./fashion-mnist-tfjs/model.json';
  const saveModelPath = 'file://./fashion-mnist-tfjs-transfer';

  const baseModel = await tf.loadLayersModel(baseModelUrl);
  const newModel = buildModel(baseModel);
  newModel.summary();

  const info = await trainModel(newModel, trainDataSubset);
  console.log(info);

  console.log('Evaluating model...');
  await evaluateModel(newModel, testData);

  console.log('Saving model...');
  await newModel.save(saveModelPath);
}

run();
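The reduced-training-set arithmetic in `run` can be checked with a quick sketch. The figure of 3000 comes from the five remaining classes at roughly 600 images each; the `batchSize` value of 100 below is an assumption for illustration, so substitute whatever your `build-model.js` defines.

```javascript
// Sketch of the reduced-training-set arithmetic from run(). The batchSize
// of 100 is hypothetical; match it to your own script's value.
const imagesPerClass = 600;  // ~10% of the original 6000 per class
const numOfClasses = 5;
const batchSize = 100;       // assumption for illustration

const subsetSize = imagesPerClass * numOfClasses;  // 3000 images
const amount = Math.floor(subsetSize / batchSize); // batches to take

console.log(`Taking ${amount} batches of ${batchSize} (${subsetSize} images)`);
```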

Train the model

Run the app from the command line to start training the new model with transfer learning.

$ node transfer-learn.js

You should see the training complete much more quickly, with test accuracy similar to what we got when training a model from scratch on the first five classes. This is because we are not only training on less data, but also updating only the last layer; the weights in all earlier layers remain unchanged.
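Why so little of the network trains can be illustrated with a plain JavaScript sketch of parameter counting. The layer names and parameter counts below are hypothetical stand-ins, not the actual model's values; the point is that freezing every layer except the new softmax head leaves only that head's parameters trainable.

```javascript
// Hypothetical layer list mirroring the freezing done in buildModel().
// Parameter counts are made up for illustration (e.g., a dense head
// mapping 1568 flattened features to 5 classes: 1568 * 5 + 5 = 7845).
const layers = [
  { name: 'conv2d',     params: 160,  trainable: false }, // frozen feature layer
  { name: 'maxPooling', params: 0,    trainable: false },
  { name: 'flatten',    params: 0,    trainable: false },
  { name: 'topSoftmax', params: 7845, trainable: true }   // new classification head
];

const trainableParams = layers
  .filter(l => l.trainable)
  .reduce((sum, l) => sum + l.params, 0);
const totalParams = layers.reduce((sum, l) => sum + l.params, 0);

console.log(`Training ${trainableParams} of ${totalParams} parameters`);
```

In a real script, `model.summary()` prints the same trainable versus non-trainable breakdown for the actual model.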

You can find the complete Node.js application to load the pre-trained model and perform transfer learning in the /src/transfer-learn.js file.

Test the new model

Let’s try out the new model on an image of an item representing one of the new classes.

  1. Alter the run code of the test-model.js script to use the new model file.

     const modelUrl = 'file://./fashion-mnist-tfjs-transfer/model.json';
    
  2. Comment out or remove the first five classes in the labels array so that the softmax output indices have the correct mapping. It should look something like the following code.

     const labels = [
       'Sandal',
       'Shirt',
       'Sneaker',
       'Bag',
       'Ankle boot'
     ];
    
  3. Run the app, passing in the full path to an image.

     $ node test-model.js shirt-blue.jpg
    

Note that the new model now recognizes only the five new classes that it was trained on during transfer learning, not the original five classes. What we have done is reuse the weights that were trained on the features of the first five classes, then replace and retrain the last layer to classify the second five classes. As an experiment, you can try reconfiguring the last layer to recognize all 10 classes.

In this exercise, we divided up the small data set to illustrate the programming concept, so the accuracy might not be very high. In practice, you would want to have a sufficiently large data set, and it should include the features you might expect in the transfer learning data.

Conclusion

In this tutorial, we took a deeper dive into programming a deep learning model in JavaScript. We started with the high-level concepts and covered the available choices in programming APIs. We then went through two programming exercises that represent common practices in building and working with your model. We also described how to convert a model implemented in Python into the format for JavaScript.

Building a deep learning model does require considerable expertise in the field. Fortunately, the rapid adoption of deep learning has brought this expertise to the wider community, enabling many more developers to build their own model rather than relying on the pre-defined models. If you are interested in gaining the expertise in this field, there are many sources available for learning, including a learning path on the IBM Developer site.

In the next part of the series, we will look at how to run your JavaScript AI application, with consideration for performance and IoT devices.

va barbosa
Ton Ngo
Paul Van Eck
Yi-Hong Wang
Ted Chang