Back in March we introduced a major evolution of PowerAI with version 1.6.0. It included a complete transition to conda packaging, updated versions of the most popular deep learning and machine learning frameworks, and a simplified and unified package dependency control.

Now PowerAI is evolving and expanding again. The PowerAI team is excited to be joining the Watson Machine Learning family and is proud to introduce Watson Machine Learning Community Edition 1.6.1.

So what’s new in 1.6.1 (besides the name)? There is quite a bit! Let’s walk through it!

Platform Support

GPU support and CPU support

WML CE 1.6.1 includes the latest cutting-edge GPU support. The GPU support packages include NVIDIA’s CUDA 10.1 update 1, NCCL 2.4.7, and cuDNN 7.5.1. New packages include a technical preview of NVIDIA’s TensorRT and DALI (0.9), a package for data pipelining and data transformation on the GPU. There is more on TensorRT below, and there will be more about DALI in a later blog post. Stay tuned!

In the past we’ve exclusively provided GPU-enabled packages built against NVIDIA’s latest CUDA versions. The move to unify and simplify our AI efforts under the Watson brand has also expanded our scope. This means that although there will still be a heavy focus on GPU-enabled packages, WML CE 1.6.1 now also includes TensorFlow, Caffe, and XGBoost packages built for CPU-only servers. These CPU packages have no dependency on CUDA and can be used on servers without GPUs.

Inferencing and prediction advancements

A few important technologies have emerged in the industry that help with the production inferencing part of the AI life cycle. As mentioned above, the TensorRT technical preview package is included. TensorRT is another excellent piece of software from NVIDIA that can optimize trained models by replacing certain compatible subgraphs with graphs optimized for the GPU. Not only is the TensorRT package included for use, but the TensorRT features in the TensorFlow 1.14 package and the PyTorch 1.1.0 package have also been enabled. Want to learn more? Check back to developerWorks soon for a blog on TensorRT.

Another new addition to the WML CE package set that is aimed at production inferencing is TensorFlow Serving 1.14. TensorFlow Serving is an inferencing server used for putting trained models into production. The TensorFlow Serving project also includes TensorRT features, which have been enabled in our package. It’s now easier than ever to get a TensorRT-enabled TensorFlow Serving installation ready to use. To learn more, check back to developerWorks soon for a blog on TensorFlow Serving.
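TensorFlow Serving exposes a REST API with a fixed URL and JSON layout. As a rough illustration (the host, port, and model name below are placeholder assumptions, with 8501 being the server’s default REST port), a predict request can be assembled with nothing but the Python standard library:

```python
import json

def predict_request(host, model_name, instances, version=None):
    """Build the URL and JSON body for a TensorFlow Serving REST
    'predict' call. The host and model name are caller-supplied
    placeholders; 8501 is TensorFlow Serving's default REST port."""
    version_part = f"/versions/{version}" if version is not None else ""
    url = f"http://{host}:8501/v1/models/{model_name}{version_part}:predict"
    body = json.dumps({"instances": instances})
    return url, body

# Example: a request for a hypothetical model named "mnist"
url, body = predict_request("localhost", "mnist", [[0.1, 0.2, 0.3]])
```

The returned URL and body can then be sent with any HTTP client (for example, curl or Python’s urllib) to a running TensorFlow Serving instance.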

Support for 64-bit x86 Linux

As we join the Watson Machine Learning family, WML CE has become fully multi-platform (with the exception of SnapML). We now provide a fully populated channel for the 64-bit x86 Linux architecture in addition to the linux-ppc64le packages that are already included. With the addition of packages for these 64-bit x86 based platforms, WML CE can be run just about anywhere!

The Frameworks

The included machine learning and deep learning frameworks have been updated to the latest completed upstream versions. We work in each community, helping to build and test these frameworks so we have the utmost confidence in their stability by the time they are tagged and released upstream. This continual development process is a natural way to work with these open source communities and allows us to rapidly update our channel with the best versions possible.

TensorFlow ecosystem updates

The TensorFlow package included in WML CE 1.6.1 has been updated to version 1.14. Please review the release notes for this TensorFlow version. One additional feature that is not mentioned in the release notes is the debut of native Automatic Mixed Precision (AMP) support. This feature, once enabled in a model, will attempt to automatically take advantage of lower precision hardware such as the Tensor Cores found in NVIDIA’s V100 GPUs. It’s important to note that this implementation of AMP differs from NVIDIA’s initial implementation. The main difference for users is the method for enabling AMP in TensorFlow models. To enable AMP in the official TensorFlow 1.14 version, the following lines of Python can be added to the TensorFlow model code:

config = tf.ConfigProto()
config.graph_options.rewrite_options.auto_mixed_precision = 1
session = tf.Session(config=config)

Another important note is that this is likely to be the last release in the 1.x series for TensorFlow. While TensorFlow 1.x has been an astonishingly successful project, the community is hard at work with TensorFlow 2.0. Since this will be yet another large evolution of the framework, WML CE 1.6.1 includes a beta version of TensorFlow 2.0 for early adopters and advanced porting efforts. Refer to the IBM Knowledge Center for instructions on using the tensorflow2 package: Getting started with TensorFlow.

TensorFlow Estimator (1.14), TensorBoard (1.14), and TensorFlow Probability (0.7) have been updated. TensorFlow Serving, as mentioned previously, is included at version 1.14 as well. Both TensorFlow and TensorFlow Serving are built with the TensorRT features enabled.


The WML CE team has witnessed a bit of confusion around Keras and TensorFlow. The good news is that there are choices; the bad news is that the support story can be a little unclear. To put it succinctly, there are two popular implementations of the Keras API. The “upstream” or reference implementation is written by KerasTeam and can support different compute framework back-ends, of which TensorFlow is currently the most common. This version of Keras is installed separately from TensorFlow itself.

The TensorFlow community has also implemented the Keras API within TensorFlow. This implementation is exposed as tf.keras. The TensorFlow implementation lags the KerasTeam implementation by version, but gains optimizations and multi-GPU support via the TensorFlow DistributionStrategies API.

In order to offer choice, WML CE 1.6.1 includes a package for KerasTeam Keras for the first time. TensorFlow’s native tf.keras also remains available to use. Note that the KerasTeam Keras package is currently not compatible with the TensorFlow 2.0 beta.
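To make the distinction concrete, here is a small stdlib-only sketch (not a WML CE API) that probes which of the two Keras implementations is importable in the current environment. It assumes the standard package names: keras for KerasTeam Keras, and tensorflow for the built-in tf.keras:

```python
import importlib.util

# The same Keras API comes from two different packages; which one you
# import determines which implementation you get:
#   import keras                    # standalone KerasTeam Keras
#   from tensorflow import keras    # TensorFlow's built-in tf.keras

def installed_keras_flavors():
    """Report which of the two Keras implementations can be imported
    in the current environment, without actually importing them."""
    return {
        "keras-team": importlib.util.find_spec("keras") is not None,
        "tf.keras": importlib.util.find_spec("tensorflow") is not None,
    }
```

In an environment built with the powerai metapackage, both would typically report True; the choice then comes down to which import style your model code uses.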

Caffe ecosystem updates

The original Caffe project has not seen a version bump upstream. However, WML CE 1.6.1 has included the latest round of bug fixes to the 1.0 release of IBM optimized Caffe. As mentioned above, WML CE 1.6.1 includes a Caffe package that does not require CUDA and is targeted at systems without GPUs.

PyTorch ecosystem updates

PyTorch has been updated to version 1.1.0. You can find extensive notes for this release on GitHub. Beyond the many new features and bug fixes listed in the release notes, there are some additional noteworthy updates. TensorRT support has been enabled in the PyTorch package. The ONNX package has been updated to 1.5.0. The excellent CUDA-enabled linear algebra package, Magma, has been updated to version 2.5.0.

Additions include torchtext 0.3.1 and pytext 0.1.5. These packages work along with PyTorch to provide functions for processing natural language and analyzing sentiment. The PyTorch APEX (0.1.1) package has also been added, which brings Automatic Mixed Precision support to PyTorch as well.

IBM Distributed Deep Learning (DDL) and Large Model Support (LMS) updates

The DDL (1.4) and corresponding IBM Spectrum MPI (10.03) packages have been updated. DDL now integrates with NCCL 2.4.7 and has seen efficiency and performance improvements for large-scale installations. Bug fixes and general improvements are also included.

The TensorFlow Large Model Support (TFLMS) package has been updated to version 2.0.1 and now lives exclusively in a separate package. The built-in version of TFLMS (version 1) has been deprecated and removed, so all TFLMS code must be migrated to version 2.x. The latest version of TFLMS includes updates to work with TensorFlow’s new Keras-resident optimizer_v2, as well as initial support for recurrent models that contain while loops. The TFLMS 2.0.1 package is also now available on 64-bit x86 for the first time.

PyTorch LMS is now fully supported and remains fully built in. Use the --lms flag to enable LMS in PyTorch.

Lightning Speed Machine Learning!

SnapML improved

In 1.6.0, we introduced CPU implementations of the Decision Tree and Random Forest algorithms to pai4sk. For 1.6.1, we have enhanced these algorithms with histogram features that make them even faster than they were before. The histogram features are enabled with a new parameter in the pai4sk_DecisionTreeClassifier and pai4sk_RandomForestClassifier APIs.
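To give a feel for why histogram features speed up tree building, here is an illustrative NumPy sketch of the core idea (this is not the pai4sk implementation, and the bin count is an arbitrary example): instead of evaluating a candidate split at every distinct feature value, the values are first bucketed into a small fixed number of bins, so each tree node scans roughly n_bins thresholds rather than n_samples:

```python
import numpy as np

# Illustrative histogram binning for one feature column.
values = np.random.rand(10_000)
n_bins = 32  # arbitrary example; real implementations make this tunable

# Quantile-based bin edges, then a bin index for every sample.
edges = np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))
binned = np.searchsorted(edges[1:-1], values)

# A split search now only considers ~n_bins candidate thresholds
# (the bin edges) instead of ~10,000 distinct values.
assert binned.max() < n_bins
```

The speedup comes from this reduction in candidate thresholds, at the cost of slightly coarser split points.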

The 1.6.0 release also brought in the GPU DataFrame library cuDF and the GPU-accelerated machine learning library cuML, both from the RAPIDS community. In WML CE 1.6.1, those packages have been updated to 0.7.2 and 0.7.0, respectively, bringing along lots of new features and implemented algorithms. Strings are now supported in cuDF, courtesy of the newly included nvstrings library. Other new features in cuDF include multi-index support, bitwise binary ops, and cumulative operations for a series. Linear regression, ridge regression, and stochastic gradient descent are a few of the new algorithms added to cuML.


If there has been a single most common request for a package, it would certainly be XGBoost. We’ve listened and included GPU and CPU versions of XGBoost 0.82 for the first time in the WML CE channel. XGBoost is a very popular scalable, portable, and distributed gradient boosting library.

Metapackages and updates

In the 1.6.0 release, a metapackage mechanism was introduced that allows for convenient installation of all of the packages in the release (the powerai metapackage), or of a particular package from the release (the powerai-release metapackage). Let’s revisit these for the 1.6.1 release to illustrate how these packages can be used to control package versions and corresponding releases of WML CE. Note that despite the product name change, the powerai and powerai-release metapackages have not changed.

To install all included packages at 1.6.0 levels, the command is:

conda install powerai=1.6.0

To install all included packages at 1.6.1 levels, the command is:

conda install powerai=1.6.1

The powerai-release metapackage can be used to install packages from particular releases. For example, installing the TensorFlow package from the 1.6.1 release, the command is:

conda install tensorflow-gpu powerai-release=1.6.1

These packages also help when upgrading between releases. The complete set can be upgraded by running:

conda update --prune powerai

or by installing the new version explicitly:

conda install powerai=1.6.1

If the current environment was installed using the powerai-release metapackage to specify a custom package set, it can be upgraded to the next release by upgrading the metapackage, which pulls the corresponding packages up to the latest release. Accomplish this by running:

conda update --prune powerai-release

or by installing the new version explicitly:

conda install powerai-release=1.6.1

A new metapackage called powerai-cpu has been added in version 1.6.1. This metapackage works the same way as the powerai metapackage, but installs the available CPU-only versions of the included packages. The powerai metapackage continues to install GPU-focused packages built against CUDA.

As you can see, the metapackages included in the WML CE channel are flexible and powerful and give you control over the installed package sets. Keep in mind that upgrading CUDA-based packages to 1.6.1 will also upgrade CUDA. The 1.6.0 and 1.6.1 releases can be installed in separate conda environments, each with its own CUDA version, if you are not ready to move everything yet.

Docker image updates

As usual for each release, the images published to Docker Hub have been updated. Items worth mentioning for this Docker image release include a few new tag categories: cpu and architecture. Available options for the cpu category include all-cpu, tensorflow-cpu, caffe-cpu, and xgboost-cpu (ppc64le only), with the framework-specific ones including only those frameworks. Available options for the architecture tag category include ppc64le and x86_64; omitting the architecture tag will auto-detect the correct architecture, since the images are now fully enabled with a multi-architecture manifest.

There is also now a separate image for TensorFlow Serving, which is new for this release.

As you can see, we’ve evolved in more than just in name. The WML CE team continues to deliver heaps of new functions and integrated capabilities. For more information, please visit our Knowledge Center, or our releases page. WML CE 1.6.1 can be installed right now from our channel!

2 comments on “Introducing WML CE 1.6.1, PowerAI evolves again!”

  1. Hi,
    I’m Rizki. I would like to know: is it possible to combine NVIDIA TensorRT and LMS? And if so, could you please explain how to combine both.

    • Neural network training suffers from GPU memory limitations more frequently than inference does. The reason for this is the need to keep tensors produced in the forward phase in memory until they are consumed during the backward phase. Inference models don’t run the backward phase. In my testing of TensorFlow LMS, I have only seen LMS needed for inferencing at the very highest resolutions a model can achieve.

      It should be possible to train a model using LMS and then export it / import it into TensorRT for inference. TensorRT does not have LMS support added in, so the model must be able to run its inferencing path within the GPU memory bounds.
