By Vinay Rao | Published November 16, 2017
Artificial IntelligenceData ScienceDeep LearningMachine Learning
In the context of machine learning, tensor refers to the multidimensional array used in the mathematical models that describe neural networks. In other words, a tensor is usually a higher-dimension generalization of a matrix or a vector.
Through a simple notation that uses a rank to show the number of dimensions, tensors allow the representation of complex n-dimensional vectors and hyper-shapes as n-dimensional arrays. Tensors have two properties: a datatype and a shape.
TensorFlow is an open source deep learning framework that was released in late 2015 under the Apache 2.0 license. Since then, it has become one of the most widely adopted deep learning frameworks in the world (going by the number of GitHub projects based on it.).
TensorFlow traces its origins from Google DistBelief, a proprietary production deep learning system developed by the Google Brain project. Google designed TensorFlow from the ground up for distributed processing and to run optimally on Google’s custom application-specific integrated circuit (ASIC) called the Tensor Processing Unit (TPU) in its production data centers. This design makes TensorFlow efficient for deep learning applications.
The framework can run on the CPU, GPU, or TPU on servers, desktops, and mobile devices. Developers can deploy TensorFlow on multiple operating systems and platforms either locally or in the cloud. Many developers consider TensorFlow to have better support for distributed processing and greater flexibility and performance for commercial applications than similar deep learning frameworks such as Torch and Theano, which are also capable of hardware acceleration and widely used in academia.
Deep learning neural networks typically consist of many layers. They transfer data or perform operations between layers using multidimensional arrays. A tensor flows between the layers of a neural network—thus, the name TensorFlow.
The main programming language for TensorFlow is Python. C++, the Java® language, and the Go application programming interface (API) are also available without stability promises, as are many third-party bindings for C#, Haskell, Julia, Rust, Ruby, Scala, R, and even PHP. Google recently announced a mobile-optimized TensorFlow-Lite library to run TensorFlow applications on Android.
This tutorial provides an overview of the TensorFlow system, including the framework’s benefits, supported platforms, installation considerations, and supported languages and bindings.
TensorFlow offers developers many benefits:
This section looks at the applications that TensorFlow is good at. Obviously, because Google was using its proprietary version of TensorFlow for text and voice search, language translation, and image search applications, the major strengths of TensorFlow are in classification and inference. For example, Google implemented RankBrain, the engine that ranks Google search results, in TensorFlow.
TensorFlow can be used to improve speech recognition and speech synthesis in differentiating multiple voices or filtering speech in high-ambient-noise environments, mimicking voice patterns for more natural-sounding text to speech. In addition, it handles sentence structure in different languages to produce better translations. It can also be used for image and video recognition as well as classification of objects, landmarks, people, sentiments, or activities. This has resulted in major improvements in image and video search.
Because of its flexible, extensible, and modular design, TensorFlow doesn’t limit developers to specific models or applications. Developers have used TensorFlow to implement not only machine learning and deep learning algorithms but also statistical and general computational models. For more information about applications and contributed models, see TensorFlow in Use.
Various platforms that support Python development environments can support TensorFlow. However, to access a supported GPU, TensorFlow depends on other software such as the NVIDIA CUDA toolkit and cuDNN. Prebuilt Python binaries for TensorFlow version 1.3 (current at the time of publication) are available for the operating systems listed in the following table.
Note: GPU support on Ubuntu or Windows requires CUDA Toolkit 8.0 and cuDNN 6 or later and a GPU card compatible with the toolkit version and CUDA Compute Capability 3.0 or later. GPU support on macOS beyond TensorFlow version 1.2 is no longer available.
For details, refer to Installing TensorFlow.
The official build process uses the Bazel build system to build TensorFlow from source on Ubuntu and macOS. The Windows build using Bazel for Windows or CMake for Windows is highly experimental. For more information, see Installing TensorFlow from Sources.
IBM optimized PowerAI for deep learning on the S822LC high-performance computing (HPC) system by using NVIDIA NVLink interconnects between two POWER8 processors and four NVIDIA Tesla P100 GPU cards. Developers can build TensorFlow on IBM Power Systems running OpenPOWER Linux. For more information, see Deep Learning on OpenPOWER: Building TensorFlow on OpenPOWER Linux Systems.
Many community- or vendor-supported build procedures are available, as well.
To support TensorFlow on a wider variety of processor and nonprocessor architectures, Google has introduced a new abstract interface for vendors to implement new hardware back ends for Accelerated Linear Algebra (XLA), a domain-specific compiler for linear algebra that optimizes TensorFlow computations.
Currently, because XLA is still experimental, TensorFlow is supported, tested, and built for x64 and ARM64 CPU architectures. On CPU architectures, TensorFlow accelerates linear algebra by using the vector processing extensions.
Intel CPU-centric HPC architectures such as the Intel Xeon and Xeon Phi families accelerate linear algebra by using Intel Math Kernel Library for Deep Neural Networks primitives. Intel also provides prebuilt, optimized distributions of Python with optimized linear algebra libraries.
Other vendors, such as Synopsys and CEVA, use mapping and profiler software to translate a TensorFlow graph and generate optimized code to run on their platforms. Developers need to port, profile, and tune the resulting code when using this approach.
TensorFlow supports specific NVIDIA GPUs compatible with the related version of the CUDA toolkit that meets specific performance criteria. OpenCL support is a roadmap item, although some community efforts have run TensorFlow on OpenCL 1.2-compatible GPUs such as AMD.
According to Google, TPU-based graphs perform 15 – 30 times better than on CPU or GPU and are extremely energy-efficient. Google designed TPU as an external accelerator that fits into a serial ATA hard disk slot and connects to the host by PCI Express Gen3 x16, which allows high-bandwidth throughput.
Google TPUs are matrix processors rather than vector processors and use the fact that neural networks do not need high-precision math but rather massively parallel, low-precision integer math. Not surprisingly, the matrix processor (MXU) architecture has 65,536, 8-bit integer multipliers and pushes data in waves through a systolic array architecture, much like blood through a heart.
This design is a form of complex instruction set computing (CISC) architecture that, although single-threaded, allows a single high-level instruction to trigger multiple low-level operations on the MXU, which potentially can perform 128,000 instructions per cycle without needing to access memory.
As a result, a TPU sees massive performance gains and energy efficiency compared with GPU arrays or multiple instruction, multiple data CPU HPC clusters. The TPU massively reduces training time for deep learning neural networks over other architectures by evaluating every ready-to-execute node in a TensorFlow graph in each cycle.
In general, TensorFlow runs on any platform that supports a 64-bit Python development environment. This environment is sufficient to train and test most simple examples and tutorials. However, most experts agree that for research or professional development, an HPC platform is strongly recommended.
Because deep learning is quite computationally intensive, a fast, multicore CPU with vector extensions and one or more high-end CUDA-capable GPU cards is the norm for a deep learning environment. Most experts also recommend having significant CPU and GPU RAM because memory-transfer operations are energy expensive and detrimental to performance.
There are two modes to consider in the performance of deep learning networks:
Virtual machines (VMs) for deep learning are currently best suited to CPU-centric hardware where many cores are available. Because the host operating system controls the physical GPU, GPU acceleration is complex to implement on VMs. Two main methods exist:
Running TensorFlow in a Docker container or Kubernetes cluster has many advantages. TensorFlow can distribute a graph as execution tasks to clusters of TensorFlow servers that are mapped to container clusters. The added advantage of using Docker is that TensorFlow servers can access physical GPU cores (devices) and assign them specific tasks.
Developers can also deploy TensorFlow in a Kubernetes cluster on PowerAI OpenPOWER servers by installing a community-built Docker image, as described in “TensorFlow Training with Kubernetes on OpenPower Servers using PowerAI.”
TensorFlow has several options for cloud-based installation:
Although Google implemented the TensorFlow core in C++, its main programming language is Python, and that API is the most complete, robust, and easiest to use. For more information, see the Python API documentation. The Python API also has the most extensive documentation and extensibility options as well as widespread community support.
Other than Python, TensorFlow supports APIs for the following languages without stability promises:
Google has defined a foreign function interface (FFI) to support other language bindings. This interface exposes TensorFlow C++ core functions with a C API. The FFI is new and might not be in use by existing third-party bindings.
A survey of GitHub reveals that there are community- or vendor-developed third-party TensorFlow bindings for the following languages: C#, Haskell, Julia, Node.js, PHP, R, Ruby, Rust, and Scala.
There is now a new, optimized TensorFlow-Lite Android library to run TensorFlow applications. For more information, see What’s New in Android: O Developer Preview 2 & More.
Keras layers and models are fully compatible with pure-TensorFlow tensors. As a result, Keras makes a great model definition add-on for TensorFlow. Developers can even use Keras alongside other TensorFlow libraries. For details, see Keras as a simplified interface to TensorFlow: tutorial.
TensorFlow is just one of the many open source software libraries for machine learning. But, it has become one of the most widely adopted deep learning frameworks going by the number of GitHub projects based on it. In this tutorial, you got an overview of TensorFlow, learned which platforms support it, and looked at installation considerations.
If you’re ready to see some samples using TensorFlow, take a look at the Classify art using TensorFlow code pattern, which shows how to pull data and labels from The Metropolitan Museum of Art to train an image-classification system.
February 19, 2019
Artificial IntelligenceData Science+
January 30, 2019
January 23, 2019
Back to top