by Vinay Rao | Published January 18, 2018
Artificial intelligenceData scienceDeep learningMachine learningPython
PyTorch is an open source Python package released under the modified Berkeley Software Distribution license. Facebook, the Idiap Research Institute, New York University (NYU), and NEC Labs America hold the copyrights for PyTorch. Although Python is the language of choice for data science, PyTorch is a relative newcomer to the deep learning arena.
In this article, you can find an overview of the PyTorch system and learn more about the following topics:
Neural network algorithms typically compute peaks or troughs of a loss function, with most using a gradient descent function to do so. In Torch, PyTorch’s predecessor, the Torch Autograd package, contributed by Twitter, computes the gradient functions. Torch Autograd is based on Python Autograd.
Python packages such as Autograd and Chainer both use a technique known as tape-based auto-differentiation to calculate gradients. As such, those packages heavily influenced and inspired the design of PyTorch.
PyTorch for research The primary use case for PyTorch is research. Facebook uses PyTorch for innovative research and switches to Caffe2 for production. A new format called Open Neural Network Exchange allows users to convert models between PyTorch and Caffe2 and reduces the lag time between research and production.
Tape-based auto-differentiation works just like a tape recorder in that it records operations performed, and then replays them to compute gradients—a method also known as reverse-mode auto-differentiation . PyTorch Autograd has one of the fastest implementations of this function.
Using this feature, PyTorch users can tweak their neural networks in an arbitrary manner without overhead or lag penalties. As a result, unlike in most well-known frameworks, PyTorch users can dynamically build graphs, with the framework’s speed and flexibility facilitating the research and development of new deep learning algorithms.
Some of this performance comes from the modular design used in the PyTorch core. PyTorch implements most of the tensor and neural network back ends for CPU and graphical processing unit (GPU) as separate and lean C-based modules, with integrated math acceleration libraries to boost speed.
PyTorch integrates seamlessly with Python and uses the Imperative coding style by design. Moreover, the design keeps the extensibility that made its Lua-based predecessor popular. Users can program by using C/C++ with an extension application programming interface (API) based on the C Foreign Function Interface (cFFI) for Python.
Before diving into what PyTorch can do and the benefits it brings, let’s get a bit of background.
Torch is a modular, open source library for machine learning and scientific computing. Researchers at NYU first developed Torch for academic research. The library’s use of the LuaJIT compiler improves performance, and C-based NVIDIA CUDA extensions enable Torch to take advantage of GPU acceleration.
Many developers use Torch as a GPU-capable NumPy alternative; others use it to develop deep learning algorithms. Torch has gained prominence because of its use by Facebook and Twitter. The Google DeepMind artificial intelligence (AI) project started out using Torch, and then switched to TensorFlow.
Lua is a lightweight scripting language that supports multiple programming models; it has its origins in application extensibility. Lua is compact and written in C, which makes it able to run on constrained embedded platforms. The Pontifical Catholic University of Rio de Janeiro in Brazil first introduced Lua in 1993.
LuaJIT is a Just-in-time (JIT) compiler with platform-specific optimizations to improve the performance of Lua. It also extends and enhances the C application programming interface (API) of standard Lua.
Two PyTorch variants exist. Originally, Hugh Perkins developed pytorch as a Python wrapper for the LuaJIT-based Torch framework. The PyTorch variant this article discusses, however, is a completely new development. Unlike the older variant, PyTorch no longer uses the Lua language and LuaJIT. Instead, it is a native Python package.
PyTorch redesigns and implements Torch in Python while sharing the same core C libraries for the back-end code. PyTorch developers tuned this back-end code to run Python efficiently. They also kept the GPU-based hardware acceleration as well as the extensibility features that made Lua-based Torch popular with researchers.
PyTorch has many benefits. Let’s look at some of the major items.
Most deep learning frameworks that use computational graphs generate and analyze graphs before runtime. In contrast, PyTorch builds graphs during runtime by using reverse-mode auto-differentiation. Therefore, arbitrary changes to a model do not add runtime lag or overhead to rebuild the model.
PyTorch has one of the fastest implementations of reverse-mode auto-differentiation. Apart from being easier to debug, dynamic graphs allow PyTorch to handle variable-length inputs and outputs, which is especially useful in natural language processing (NLP) for text and speech.
Rather than using a single back end, PyTorch uses separate back ends for CPU and GPU and for distinct functional features. For example, the tensor back end for CPU is TH, while the tensor back end for GPU is THC. Similarly, the neural network back ends are THNN and THCUNN for CPU and GPU, respectively. Individual back ends result in lean code tightly focused for a specific task running on a specific class of processor with high memory efficiency. Use of separate back ends makes it easier to deploy PyTorch on constrained systems, such as those used in embedded applications.
Although it is a derivative of Torch, PyTorch is a native Python package by design. It does not function as a Python language binding but rather as an integral part of Python. PyTorch builds all its functionality as Python classes. Hence, PyTorch code can seamlessly integrate with Python functions and other Python packages.
Because a direct change to the program state triggers computation, code execution isn’t deferred and produces simple code, avoiding a lot of asynchronous executions that could cloud how the code executes. When manipulating data structures, this style is intuitive and easy to debug.
Users can program using C/C++ by using an extension API based on cFFI for Python and compiled for CPU or with CUDA for GPU operation. This feature allows the extension of PyTorch for new and experimental uses and makes it attractive for research use. For example, the PyTorch audio extension allows the loading of audio files.
Torch and PyTorch share the same back-end code, and there’s often a lot of confusion between Lua-based Torch and PyTorch in the literature. As a result, it’s difficult to distinguish between the two unless you look at the timeline.
For example, the Google DeepMind AI project used Torch before switching to TensorFlow. Because the switch happened before the advent of PyTorch, one cannot consider it an example of a PyTorch application. Twitter was a Torch contributor and now uses TensorFlow and PyTorch to fine-tune its ranking algorithms on timelines.
Both Torch and PyTorch have seen heavy use at Facebook to research NLP for text and speech. Facebook has released many open source PyTorch projects, including ones related to:
The PyTorch Torchvision package gives users access to model architectures and pre-trained models of popular image classification models such as AlexNet, VGG, and ResNet.
Because of its flexible, extensible, modular design, PyTorch doesn’t limit you to specific models or applications. You can use PyTorch as a NumPy replacement or to implement machine learning and deep learning algorithms. For more information about applications and contributed models, see the PyTorch Examples page at https://github.com/pytorch/examples.
PyTorch requires a 64-bit processor, operating system, and Python distribution to work correctly. To access a supported GPU, PyTorch depends on other software such as CUDA. At the time of publication, the latest PyTorch version was 0.2.0_4. Table 1 shows the availability of prebuilt PyTorch binaries and GPU support for this version.
For more information, please see the README file at https://github.com/tylergenter/pytorch/blob/master/README_Windows.md.
PyTorch builds use CMake for build management. To build PyTorch from source and ensure a high-quality Basic Linear Algebra Subprograms (BLAS) library (the Intel® Math Kernel Library, or MKL), the package maintainers recommend using the Anaconda Python distribution. This distribution also ensures a controlled compiler version regardless of the operating system you use.
For details, refer to https://github.com/pytorch/pytorch#from-source.
PyTorch doesn’t officially support PPC64le architectures such as IBM® POWER8®, but you can use a Linux build recipe to build PyTorch from source. These builds are still experimental and don’t pass all tests, especially with CUDA enabled. In addition, the math kernel and linear algebra libraries aren’t fully optimized.
Instead of a single back end, PyTorch uses several separate back-end modules for CPU and GPU and for different tasks. This approach means that each module is lean, focused, and highly optimized for the task. The modules rely heavily on underlying math and linear algebra libraries (MKL and BLAS) for their performance.
Because Anaconda Python tunes MKL for Intel architectures, PyTorch performs best on Intel CPUs. Intel supplies the MKL-Deep Neural Network primitives for its CPU-centric high-performance computing (HPC) architectures, such as the Intel Xeon® and Xeon Phi™ families of processors.
PyTorch could also use OpenBLAS. On IBM PowerPC® machines, IBM makes available the Math Acceleration Subsystem (MASS) for Linux. You can build OpenBLAS with MASS support. IBM also recently announced the integration of Anaconda on its HPC platform based on the IBM Power Systems™ S822LC platform. Therefore, PyTorch would potentially perform well on IBM systems, as well.
PyTorch supports only NVIDIA GPU cards. On the GPU, PyTorch uses NVIDIA CUDA Deep Neural Network (CuDNN) library, a GPU-accelerated library meant for deep learning algorithms. Open Computing Language (OpenCL) support is not on the PyTorch road map, although the Lua-based Torch had limited support for the language. There are no community efforts to port to OpenCL, either.
PyTorch is still trailing behind on the CUDA development curve. PyTorch uses CUDA version 8, while many other deep learning libraries already use CUDA version 9. Earlier PyTorch releases are based on CUDA 7 and 7.5. As such, PyTorch users cannot take advantage of the latest NVIDIA graphics cards. Building from source with CUDA 9 has so far produced mixed results and inconsistent behaviors.
In general, PyTorch runs on any platform that supports a 64-bit Python development environment. That’s enough to train and test most simple examples and tutorials. However, most experts agree that for research or professional development, an HPC platform is strongly recommended.
Deep learning algorithms are compute intensive and need at least a fast, multicore CPU with vector extensions. In addition, one or more high-end CUDA-capable GPUs is the norm for a deep learning environment.
Processes in PyTorch communicate with each other by using buffers in shared memory, and so allocated memory must be adequate for this purpose. Hence, most experts also recommend having large CPU and GPU RAM because memory transfers are expensive in terms of both performance and energy usage. A larger RAM avoids these operations.
Virtual machines (VMs) for deep learning are currently best suited to CPU-centric hardware with multiple cores. Because the host operating system controls the physical GPU, GPU acceleration is challenging to implement on VMs. That said, two main methods exist for doing so:
Running PyTorch in a Docker container or Kubernetes cluster has many advantages. The PyTorch source includes a Dockerfile that in turn includes CUDA support. Docker Hub also hosts a prebuilt runtime Docker image. The main advantage of using Docker is that PyTorch models can access and run on physical GPU cores (devices). In addition, NVIDIA has a tool for Docker Engine called nvidia-docker that can run a PyTorch Docker image on the GPU.
Finally, PyTorch offers several cloud installation options:
conda install pytorch torchvision -c soumith
If you would like to find out how to install PyTorch on Power systems, refer to this how-to guide.
Get the Code »
Back to top