Deep learning frameworks – PyTorch and Caffe2 with ONNX support on IBM AIX
Learn to install PyTorch or Caffe2 on AIX with sample use cases
Artificial intelligence (AI) is changing every industry and business function, resulting in increased interest in AI, its subdomains, and related fields such as machine learning and data science. IBM® AIX® has continuously adapted to the changing needs of its customers. As an effort in that direction, we are providing, for experimentation purposes, the source of the PyTorch and Caffe2 open source machine learning libraries for AIX.
PyTorch takes the modular, production-oriented capabilities from Caffe2 and ONNX and combines them with PyTorch's existing flexible, research-focused design to provide a fast and seamless path from research prototyping to production deployment for a broad range of AI projects. A stable version, 1.0.1, of PyTorch with all these features has been provided for experimentation on AIX.
This tutorial discusses how to build or install PyTorch and Caffe2 on AIX 7.2 and use them for different machine learning (ML) and deep learning (DL) use cases. It also discusses a method to convert available ONNX models in the little endian (LE) format to the big endian (BE) format to run on AIX systems.
AIX Toolbox for Linux Applications contains a collection of open source and GNU software built for IBM AIX systems. The prerequisites can be installed using the yum installation tool, and any dependent Python packages can be installed using the pip command. The python3, gcc, and pip packages need to be installed before building Protobuf, ONNX, PyTorch, or Caffe2. Refer to Configuring YUM and creating local repositories on IBM AIX for more information.
Install PyTorch and Caffe2 with ONNX
Perform the following steps to install PyTorch or Caffe2 with ONNX:
Set the following compiler or linker environment variables to build in the 64-bit mode:
#export PATH=$PATH:/opt/freeware/bin
#export CXXFLAGS="-mvsx -maix64"
#export CFLAGS="-mvsx -maix64"
#export CXX='g++ -L/opt/freeware/lib/pthread/ppc64 -lstdc++ -pthread'
#export LDFLAGS='-latomic -lpthread -Wl,-bbigtoc'
#export CC="gcc -L/opt/freeware/lib/pthread/ppc64 -pthread"
#export OBJECT_MODE=64
Build the protocol buffers:
#yum install binutils libtool autoconf automake
#git clone https://github.com/aixoss/protobuf.git -baix_build
(This is Protobuf version 3.6.0, as used with PyTorch 1.0.1, with fixes to build on AIX.)
#cd protobuf; git submodule update --init --recursive
#./autogen.sh
#./configure
#make install
Build and install ONNX:
#pip install numpy
#git clone https://github.com/aixoss/onnx.git -baix_build
(ONNX 1.3.0 with AIX changes)
#cd onnx; git submodule update --init --recursive
#python setup.py install
Build and install PyTorch and Caffe2:
#bash
#git clone https://github.com/aixoss/pytorch.git -bv1.0.1
#yum install lapack lapack-devel openblas
#pip install setuptools six future PyYAML numpy protobuf
#cd pytorch; source ./aix_setup.sh
#USE_DISTRIBUTED=OFF python setup.py install
Sample RNN use case in Caffe2
A sample Caffe2 recurrent neural network (RNN), where the network learns and maintains a memory over time while showing gradual improvement, is available at caffe2/python/examples/char_rnn.py in the PyTorch repository. This script not only learns English grammar and spelling, but also picks up the nuances of structure and prose in any text source it is given.
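Before training, char_rnn.py derives a character vocabulary from the input text; the "Input has 62 characters" line in the sample run counts these distinct characters. A minimal sketch of that preprocessing idea follows; the function names are illustrative, not the script's actual internals:

```python
def build_vocab(text):
    """Map each distinct character in the text to an integer index."""
    chars = sorted(set(text))
    char_to_idx = {ch: i for i, ch in enumerate(chars)}
    return chars, char_to_idx

def encode(text, char_to_idx):
    """Turn the text into the integer sequence an RNN trains on."""
    return [char_to_idx[ch] for ch in text]

sample = "to be or not to be"
chars, char_to_idx = build_vocab(sample)
print("Input has %d characters." % len(chars))  # distinct characters, like the script's log line
encoded = encode(sample, char_to_idx)
```

With the full shakespeare.txt corpus, this vocabulary-building step yields the 62 distinct characters reported in the output below.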
You need to enter the following commands to run the use case:
#export LIBPATH=/opt/freeware/lib64/python3.7/site-packages/torch/lib:/opt/freeware/lib/pthread/ppc64
#cd caffe2/python/examples
#wget https://caffe2.ai/static/datasets/shakespeare.txt
#python char_rnn.py --train_data shakespeare.txt
# python char_rnn.py --train_data shakespeare.txt
WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
Input has 62 characters. Total input size: 99993
DEBUG:char_rnn:Start training
DEBUG:char_rnn:Training model
Characters Per Second: 2917
Iterations Per Second: 116
---------- Iteration 500 ----------
Khest h the' the pins wind the thot no ghemgrtnt enlsswe hlear now An monbamthe, and Othy anschake dod ind hot uchene woe peerC ht unl wind I'thee wame bybun et ane toXy suINend ans thesd tho :sy unC piFe mnve al sak ufe Qhe one th moI dralmepend, py thef the wit, th wine there I mandinge XusX mwave tothent, the sth wee bor ndse; f forlHfblr wule wits heand noun bee, And hn the the to: ml: wy thy to wotd w, woTly mnoll therft wot the. FSRe goft. Oine the wine whepyonq M.bachy, ?nd male nout ma
DEBUG:char_rnn:Loss since last report: 70.67837668657303
DEBUG:char_rnn:Smooth loss: 90.0106901566045
Characters Per Second: 2768
Iterations Per Second: 110
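The "Smooth loss" in the log is an exponentially smoothed running average of the per-iteration loss, which makes the downward training trend easier to read than the noisy raw values. The sketch below shows the idea; the smoothing factor is illustrative and not necessarily the one the script uses:

```python
def smooth_loss(losses, alpha=0.999):
    """Exponential moving average of a loss series: each new value
    contributes (1 - alpha), so the readout changes slowly and smoothly."""
    s = losses[0]
    for l in losses[1:]:
        s = alpha * s + (1 - alpha) * l
    return s

# A constant series smooths to itself; a noisy one is damped toward its trend.
print(smooth_loss([90.0, 80.0, 70.0]))
```

This is why the smooth loss (90.01 above) lags behind the loss since the last report (70.68): recent improvement is deliberately averaged in slowly.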
Use case for transferring a model from PyTorch to Caffe2 using ONNX
This example demonstrates how to use analytics to predict credit card default using PyTorch and Caffe2. It generates a predictive model for credit card default using PyTorch, saves the model in ONNX, and uses Caffe2 to load the saved ONNX model for online scoring.
#export LIBPATH=/opt/freeware/lib64/python3.7/site-packages/torch/lib:/opt/freeware/lib/pthread/ppc64
#git clone https://github.com/aixoss/onnx-example.git
#cd onnx-example/pytorch
#python Train.py    (Uses the credit card default data to generate the ONNX model in PyTorch)
[1, 20] loss: 0.682
[1, 40] loss: 0.575
....
[50, 120] loss: 0.079
[50, 140] loss: 0.086
Correct Prediction: 138 Total Samples 140
#cd ../caffe2
#python load_model.py    (Uses Caffe2 to load the ONNX model for inferencing)
WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
Correct: 663 Total: 700
Convert a little endian ONNX model to a big endian model
ONNX is an open format to represent deep learning models, created with the intention of interoperability between different DL frameworks. Basically, a user can create or train a model in one framework and deploy it in a different framework for inferencing. Internally, ONNX models are represented in the Protobuf format, which makes them framework-independent and improves interoperability. But some elements in the model file are saved in raw format, which breaks interoperability across systems with different endianness. That is, ONNX models generated on a little endian system will not work on a big endian system. To address this issue, we have developed a tool that converts an ONNX model from little endian to big endian and vice versa. This can help AIX customers (big endian) bring models trained on Linux systems (little endian) to AIX for inferencing. We have tested the tool with well-known DL models available in the ONNX model zoo.
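The endianness problem can be seen with a single float32 value: raw bytes written on a little endian system read back as garbage when interpreted big endian, and swapping the bytes restores the value. The stdlib-only sketch below illustrates the idea behind the conversion tool; it is not the tool's code:

```python
import struct

value = 1.5
le_bytes = struct.pack("<f", value)       # raw little-endian bytes, as saved on Linux
wrong = struct.unpack(">f", le_bytes)[0]  # naive big-endian read: not 1.5
swapped = le_bytes[::-1]                  # swap the 4 bytes of this single float
fixed = struct.unpack(">f", swapped)[0]   # big-endian read now recovers 1.5
```

For a whole tensor the swap must be applied per element at the element's width (4 bytes for float32, 8 for int64, and so on), which is what a conversion tool has to do for each raw data field in the model; with NumPy, `ndarray.byteswap()` performs the same per-element swap.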
The code for the tool is available at https://github.com/aixoss/onnx-convertor-le-be.git
The code needs to be compiled as per the Readme file, and the tool can then be used to convert the endianness of a model. To test some of the models from the ONNX model zoo that come with sample data in the Protobuf format, the sample data also needs its endianness converted. A tool for this is available in the same Git repository, with instructions on how to use it.
git clone https://github.com/aixoss/onnx-convertor-le-be.git
To compile the tool for model endianness conversion:
/usr/bin/g++ -DONNX_NAMESPACE=onnx -DONNX_API= -maix64 -L/opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/8.1.0/pthread -lstdc++ -pthread -I /usr/local/include -mvsx -maix64 -O3 -DNDEBUG -fPIC -std=gnu++11 onnx.pb.cc onnx_conv.cc -I. -L. -lprotobuf -o onnx_conv
To compile the tool for sample data endianness conversion:
/usr/bin/g++ -DONNX_NAMESPACE=onnx -DONNX_API= -maix64 -L/opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/8.1.0/pthread -lstdc++ -pthread -I /usr/local/include -mvsx -maix64 -O3 -DNDEBUG -fPIC -std=gnu++11 onnx.pb.cc tensor_conv.cc -I. -L. -lprotobuf -o tensor_conv
To convert a LE ONNX model (available at https://github.com/onnx/models) to BE format:
./onnx_conv ./model.onnx ./model.onnx.be
To convert the sample data to the big endian format:
./tensor_conv ./test_data_set_0/input_0.pb ./test_data_set_0/input_0.pb.be
./tensor_conv ./test_data_set_0/output_0.pb ./test_data_set_0/output_0.pb.be
After converting the model and sample data, the model is ready for use on AIX. A sample inference code is given at https://github.com/onnx/models#others in the ONNX model zoo repository.
This tutorial explains how to build and install PyTorch and Caffe2 on AIX, and also discusses many of the other packages (such as Protobuf, ONNX, and other Python packages) that this ecosystem needs on AIX. The use cases provide helpful examples for using these frameworks on AIX.