Frequently asked questions

Find answers to some of the most frequently asked questions about deep learning and AI development with the Watson Machine Learning family of products.

What is the Watson Machine Learning family of products?

Watson Machine Learning Community Edition (WML CE), formerly PowerAI, is a free, enterprise-grade software distribution that combines popular open source deep learning frameworks, efficient AI development tools, and accelerated IBM® Power Systems™ servers to take your deep learning projects to the next level.

WML CE greatly reduces the time, effort, and difficulty of getting a deep learning environment operational and performing optimally.

Watson Machine Learning Accelerator (WML Accelerator), formerly PowerAI Enterprise, includes all of the open source deep learning frameworks, libraries, and tools in WML CE, along with IBM Spectrum Conductor and IBM Spectrum Conductor Deep Learning Impact. Together they give data scientists everything they need to build a distributed deep learning environment in hours rather than days or weeks, and to manage it easily as the environment grows.

What is the current release and where can I get it?

WML CE (formerly PowerAI) 1.6.1 became generally available on June 14, 2019. See the Releases page for more information about WML CE 1.6.1 and where to get it.

Watson Machine Learning Accelerator 1.2.1 became generally available on June 28, 2019. There are several ways to get WML Accelerator 1.2.1:

  • Install a trial version of WML Accelerator to give it a try. If you don’t already have one, you’ll need to register for an IBMid to access the evaluation.
  • Order it from your IBM representative or authorized Business Partner.

See the WML Accelerator releases page for more information.

I have access to a Power server but it’s not equipped with GPUs. Can I test drive WML CE on it?

Currently, you cannot run WML CE without access to GPUs and the associated NVIDIA libraries. WML CE is optimized to leverage the unique capabilities of IBM Power Systems accelerated servers:

  • IBM Power System AC922 with NVIDIA Tesla V100 GPUs
  • IBM Power System S822LC with NVIDIA Tesla P100 GPUs

Note: WML CE release 1.6.1 includes support for accelerated x86 architecture servers.

Are any other major frameworks planned?

We are continuously evaluating additional frameworks as part of our participation in the rapidly evolving deep learning ecosystem. As part of this evaluation, it is immensely helpful to understand specific client requirements and the relevant opportunity details. Please share details of these requirements directly with the offering team (soutter@us.ibm.com).

What is the support scenario for the Watson Machine Learning product family?

For a fee, IBM offers formal support for WML CE components as long as their versions are consistent with the release configuration. If you choose to use a different version of any of the components, no formal support will be available. However, in keeping with industry norms, specific questions can be posted on the Cognitive Systems space in IBM Developer Answers: https://developer.ibm.com/answers/topics/powerai/. This forum is monitored by the IBM technical team, and technical support is provided on a best-effort basis.

WML Accelerator comes with IBM Level 1-3 support.

Can WML run on x86 platforms?

Yes, both WML Accelerator and WML CE can run on x86-based servers with NVIDIA V100 and P100 GPUs.

WML Accelerator V1.2 supports an x86-based cluster, which requires all servers within the cluster to be running x86. As a result, you can run WML Accelerator on either a cluster of Power servers or a cluster of x86 servers.

What POWER9 firmware level is required for WML?

Get the latest version of firmware for POWER9 from Fix Central.

Is WML CE available on a public cloud?

In partnership with Nimbix, the WML CE on IBM Cloud service provides users with access to IBM® Power Systems™ with NVIDIA® GPUs running the PowerAI software. There are three different plans to choose from:

  • Small: Provides one WML CE cloud instance with 1 GPU
  • Medium: Provides one WML CE cloud instance with 2 GPUs
  • Large: Provides one or more WML CE cloud instances with 4 GPUs each

What is Large Model Support?

IBM Caffe with Large Model Support (LMS) loads the neural model and data set in system memory and caches activity to GPU memory, allowing models and training batch size to scale significantly beyond what was previously possible.

You can enable LMS by adding -lms <size in KB>, for example -lms 1000. Any memory chunk larger than 1000 KB is then kept in CPU memory and fetched to GPU memory only when needed for computation. Passing a very large value, such as -lms 10000000000, effectively disables the feature, while a small value makes LMS more aggressive. Use the value to tune the performance trade-off.
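The threshold rule can be illustrated with a small Python sketch. This is only a model of the behavior described in this answer, not IBM Caffe code: the function names and the chunk sizes are hypothetical, chosen to show how the -lms value shifts memory between CPU and GPU.

```python
# Illustrative model of the -lms threshold rule described in this answer.
# Not IBM Caffe code: function names and chunk sizes are hypothetical.

def placement(chunk_kb, lms_threshold_kb):
    """Return where a memory chunk would reside under the stated rule."""
    # Chunks larger than the threshold stay in CPU (system) memory and are
    # fetched to GPU memory only when needed for computation.
    return "cpu" if chunk_kb > lms_threshold_kb else "gpu"

def summarize(chunks_kb, lms_threshold_kb):
    """Total KB resident on each side for a list of chunk sizes."""
    totals = {"cpu": 0, "gpu": 0}
    for size in chunks_kb:
        totals[placement(size, lms_threshold_kb)] += size
    return totals

chunks = [200, 800, 1500, 4000, 250000]  # made-up chunk sizes in KB

print(summarize(chunks, 1000))         # moderate threshold
print(summarize(chunks, 10000000000))  # huge threshold: LMS effectively off
print(summarize(chunks, 100))          # small threshold: aggressive LMS
```

With the moderate threshold only the three largest chunks stay in CPU memory. With the very large value everything stays on the GPU, which matches the "effectively disabled" case, and with the small value everything is kept in CPU memory until needed.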

LMS uses system memory and GPU memory to support more complex and higher resolution data.

TensorFlow Large Model Support (TLMS) provides an approach to training large models, batch sizes, and data sizes that cannot fit into GPU memory. It achieves this by automatically moving tensor data between the GPU and system memory. For more information on how to enable TensorFlow Large Model Support, see the README. Note that if you’re using TLMS with WML CE and need additional information, you should check the WML CE README.

PyTorch Large Model Support (LMS) is a feature provided in PowerAI PyTorch that allows the successful training of deep learning models that would otherwise exhaust GPU memory and abort with “out of memory” errors. LMS manages this oversubscription of GPU memory by temporarily swapping tensors to host memory when they are not needed.

See the “Getting started with PyTorch” topic in the IBM Knowledge Center for more information.

What is Distributed Deep Learning?

Distributed Deep Learning (DDL) is an MPI-based communication library that is specifically optimized for deep learning training. An application integrated with DDL becomes an MPI application, which allows the use of the ddlrun command to invoke the job in parallel across a cluster of systems. DDL understands multi-tier network environments and uses different libraries (e.g. NCCL) and algorithms to get the best performance in multi-node, multi-GPU environments.
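As a rough sketch of what a ddlrun launch might look like, the Python snippet below assembles a command line without executing it. The host names, the script name train.py, and its arguments are hypothetical. The -H host-list option is an assumption based on common WML CE examples, so check the ddlrun help output on your installation before relying on it.

```python
import shlex

def build_ddlrun_command(hosts, script, script_args=()):
    """Assemble an argument list for launching a training script via ddlrun.

    Assumes ddlrun accepts a comma-separated host list after -H; verify
    against the ddlrun documentation for your release.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    return ["ddlrun", "-H", ",".join(hosts), "python", script, *script_args]

# Hypothetical two-node cluster and training script.
cmd = build_ddlrun_command(["host1", "host2"], "train.py", ["--epochs", "10"])
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a system where WML CE and DDL are installed. Building the list first keeps host and argument handling separate from process launching.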

Check out this performance proof-point that shows how DDL maximized research productivity by training on more images at the same time with TensorFlow 1.4.0 running on a cluster of IBM Power System AC922 servers with NVIDIA Tesla V100 GPUs connected via NVLink 2.0: Distributed Deep Learning: IBM POWER9™ with NVIDIA Tesla V100 results in 2.3X more data processed on TensorFlow versus tested x86 systems.

What is PowerAI Vision?

PowerAI Vision can help provide robust end-to-end workflow support for deep learning models related to computer vision. This enterprise-grade software provides a complete ecosystem to label raw data sets for training, creating, and deploying deep learning-based models. PowerAI Vision is designed to empower subject matter experts with no skills in deep learning technologies to train models for AI applications. It can help train highly accurate models to classify images and detect objects in images and videos.

PowerAI Vision is built on open source frameworks for modeling and managing containers to deliver a highly available framework, providing application lifecycle support, centralized management and monitoring, and support from IBM.

PowerAI Vision 1.1.4 is available now. See the PowerAI Vision page for more information.

How can I access PowerAI Vision?

IBM PowerAI Vision is licensed per Virtual Server. When you install it, a software license metric (SLM) tag file is created to track usage with the IBM License Metric Tool. See the “License Management in IBM License Metric Tool” topic in the IBM Knowledge Center for more information.

In addition you can:

How does IBM PowerAI Vision provide value?

IBM PowerAI Vision is designed to provide an end-to-end deep learning platform for subject matter experts (non-data scientists), application developers, and data scientists. It offers several features and optimizations that can help accelerate tasks related to data labeling, training, and deployment, such as:

  • User interface-driven interaction to configure and manage lifecycles of data sets and models
  • A differentiated capability where trained deep learning models automatically detect objects from videos
  • Preconfigured deep learning models specialized to classify and detect objects
  • Preconfigured hyper-parameters optimized to classify and detect objects
  • Training visualization and runtime monitoring of accuracy
  • Integrated inference service to deploy models in production
  • Scalable architecture designed to run deep learning, high-performance analytics, and other long-running services and frameworks on shared resources

Can IBM PowerAI Vision be used solely as a data labeling tool?

Yes, the labeled data can be exported and used as a training set in your ecosystem.