Introduction

Machine learning is a branch of artificial intelligence that helps enterprises discover hidden insights in large amounts of data and make predictions. Data scientists write machine learning algorithms to understand data trends and provide predictions that go beyond simple analysis. Python is a popular programming language that is used extensively to write machine learning algorithms because of its simplicity and applicability. Many Python packages help data scientists with data analysis, data visualization, data preprocessing, feature extraction, model building, training, evaluation, and deployment of machine learning models.

This tutorial describes the installation and configuration of a Python-based ecosystem of machine learning packages on IBM® AIX®. AIX users can use these packages to efficiently perform data mining, data analysis, scientific computing, data plotting, and other machine learning tasks. These Python machine learning packages include NumPy, Pandas, Scikit-learn, SciPy, and Matplotlib.

Because all these packages are Python based, a recent version of Python must be installed on the AIX system. You can install Python with YUM or directly from the AIX toolbox. This tutorial uses Python3, but the same steps should work for Python2 as well. You need python3-3.7.1-1 or a later version of Python from the AIX toolbox to run these machine learning packages.
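
The version requirement above can be checked from Python itself. The following is a minimal sketch, not part of the original procedure; the helper name meets_minimum is hypothetical, and the check compares only the interpreter version, not the RPM release number.

```python
import sys

# Minimum interpreter version named in the tutorial (python3-3.7.1-1).
MIN_VERSION = (3, 7, 1)


def meets_minimum(version=None, minimum=MIN_VERSION):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info[:3]
    return tuple(version) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {current}: {status}")
```

Run the script with python3 after installation; a "too old" result suggests that a newer python3 package should be installed from the AIX toolbox.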

In this tutorial, we use a Python package management tool called pip to install these machine learning packages on AIX. These packages are compiled as part of pip installation because binary versions of these packages for AIX are not available on the Python Package Index (PyPI) repository.

Python installation through YUM

YUM is the easiest way to install any open source RPM package because you don’t need any prior knowledge of packages and their dependencies. To install YUM on AIX, download the yum.sh script from the AIX toolbox repository to the AIX system and run it as the root user.

As part of YUM installation, Python2 will be installed by default. After setting up and installing YUM, update all the packages to the latest level using the yum update command.

Prerequisites to install machine learning packages

The following open source packages are required to build and install the machine learning packages on AIX using pip. Install these packages from the AIX toolbox using YUM.

  • blas
  • freetype2-devel
  • gcc
  • gcc-c++
  • gcc-gfortran
  • lapack
  • libpng-devel
  • python3
  • python3-devel
  • xz
  • zeromq
  • zeromq-devel

Use YUM to install all these packages.

# yum install gcc xz python3 gcc-c++ gcc-gfortran freetype2-devel libpng-devel zeromq zeromq-devel lapack blas python3-devel

Figure 1. Output of YUM installation of open source packages

After this command completes successfully, Python3 and the other dependent packages that are required to install the Python machine learning packages are available on your machine.
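
One way to confirm that the build tools are usable is to look for them on PATH. The following is a minimal sketch under stated assumptions: the executable names in REQUIRED_TOOLS (for example gfortran for the gcc-gfortran package) are assumptions, and the helper missing_tools is hypothetical. If /opt/freeware/bin is not yet on PATH, the tools are reported as missing even when the RPMs are installed.

```python
import shutil

# Commands assumed to be provided by some of the RPMs listed above; the exact
# executable names (for example "gfortran") are an assumption.
REQUIRED_TOOLS = ["gcc", "g++", "gfortran", "python3", "xz"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required build tools were found.")
```

The check covers only commands; libraries such as blas and lapack have no executable to look for and are not verified here.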

Miscellaneous settings

The following additional settings are required for successful installation of machine learning packages.

  • Increase resource limits. The process resource limits must be raised for packages such as Matplotlib and NumPy to install successfully. Use the ulimit command to increase the stack and data limits. You can specify a particular value for each limit, or use unlimited if you are not sure which value to choose.

    You can use the following commands to increase the resource limits to unlimited:

    # ulimit -d unlimited
    # ulimit -s unlimited

  • Specify space requirements. The /tmp and /opt file systems should each have at least 1 GB of free space to install these packages without errors. The following commands add 1 GB to each file system:

    # chfs -a size=+1G /tmp
    # chfs -a size=+1G /opt
    
  • Set the path. The binaries from the packages are installed in /opt/freeware/bin, so add this path to the PATH environment variable.

    # export PATH=$PATH:/opt/freeware/bin
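
The three settings above can be inspected from Python before starting the pip builds. The following is a minimal sketch, assuming the standard resource module is available in the Python build in use; the helper names describe_limit, free_gib, and on_path are hypothetical. It only reports the current values and does not change them.

```python
import os
import resource
import shutil

GIB = 1024 ** 3


def describe_limit(value):
    """Render a soft limit returned by getrlimit as text."""
    return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"


def free_gib(path):
    """Return the free space of the file system containing `path`, in GiB."""
    return shutil.disk_usage(path).free / GIB


def on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries of PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    return directory in path_value.split(os.pathsep)


if __name__ == "__main__":
    for name, limit in (("data", resource.RLIMIT_DATA), ("stack", resource.RLIMIT_STACK)):
        soft, _hard = resource.getrlimit(limit)
        print(f"{name} soft limit: {describe_limit(soft)}")
    for path in ("/tmp", "/opt"):
        if os.path.isdir(path):
            print(f"{path}: {free_gib(path):.2f} GiB free")
        else:
            print(f"{path}: not found")
    print("/opt/freeware/bin on PATH:", on_path("/opt/freeware/bin"))
```

Run it in the same shell session in which ulimit and export were issued, because both settings apply only to that shell and its child processes.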
    

Installing machine learning packages

You can use pip to install the Python machine learning packages. In Python3, pip is bundled with Python. First, install NumPy using the following command:

# python3 -m pip install numpy

Figure 2. Output of pip installation of NumPy package

After installing NumPy successfully, install the remaining packages using the following command:

# python3 -m pip install pandas scipy scikit-learn matplotlib flask

Figure 3. Output of pip installation of Pandas, SciPy, Scikit-learn, Matplotlib, and Flask packages

Although only a few packages are listed here, other machine learning packages can be installed in the same way using the python3 -m pip install <package_name> command. Installation typically takes 5 to 10 minutes for a few packages because C and C++ source code is compiled while the packages are being installed. pip builds the packages on a single core, so the builds do not run in parallel.
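
After the pip commands finish, a quick import test shows whether each package was built and installed correctly. The following is a minimal sketch; the helper module_version is hypothetical. Note that the import name can differ from the pip name: scikit-learn is imported as sklearn.

```python
import importlib

# Import names differ from pip names in one case: scikit-learn imports as "sklearn".
MODULES = ["numpy", "pandas", "scipy", "sklearn", "matplotlib", "flask"]


def module_version(name):
    """Return the module's __version__, "unknown" if it has none, or None if the import fails."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    for name in MODULES:
        version = module_version(name)
        print(f"{name}: {version if version is not None else 'NOT INSTALLED'}")
```

A package reported as NOT INSTALLED usually means that its build failed; rerunning the pip command for that package alone makes the compiler output easier to read.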

Install and configure Jupyter Notebook on AIX

Jupyter Notebook is a web application in which users can write code for data analysis, statistical analysis, and data visualization, and run machine learning algorithms. AIX users can use Jupyter Notebook to import existing models or write their own data analysis and machine learning models.

Perform the following steps to install and configure the Jupyter server on AIX:

  1. Run the following command to install the Jupyter package:

    # python3 -m pip install jupyter
    

    Figure 4. Output of pip installation of Jupyter package

  2. If you don’t already have a Jupyter folder, or if your Jupyter folder doesn’t contain a notebook configuration file, run the following command:

    # jupyter notebook --generate-config
    

    The config file is saved in the Jupyter folder, for example: /.jupyter/jupyter_notebook_config.py. The file contains various user-settable parameters that your Jupyter server can use.

    Figure 5. Generating Jupyter notebook config file

  3. You can set a Jupyter password for secured web access. This is an optional step.

    # jupyter notebook password
    

    The password is written to a file in hashed form.

    Figure 6. Generating Jupyter notebook password

  4. Modify the following parameters in the file generated in step 2 (jupyter_notebook_config.py). Open the file using any editor.

    # vi /.jupyter/jupyter_notebook_config.py
    

    Uncomment the following lines by removing the # at the beginning of each line, and set the values as shown:

    c.NotebookApp.ip = '0.0.0.0'
    c.NotebookApp.open_browser = False
    c.NotebookApp.port = 8888
    

    8888 is the default port. To use a different port, edit this parameter in the file.

  5. Start the Jupyter server using the following command:

    # jupyter notebook
    

    Running this command as the root user is not recommended. However, you can override this restriction using the --allow-root option.

    Figure 7. Starting Jupyter server

    For more command-line options for starting your Jupyter server, refer to https://jupyter-notebook.readthedocs.io/en/latest/config.html.

  6. Access the Jupyter server

    1. Open http://<HOSTNAME>:8888 in a browser and enter the password set in step 3 to log in to the notebook.

    2. To access your Jupyter notebooks over a secure protocol (HTTPS), use a self-signed certificate by following the steps in https://jupyter-notebook.readthedocs.io/en/stable/public_server.html

    3. Write your own notebooks, or import existing notebooks (files ending with .ipynb) as a starting point for your own models.
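
If the browser cannot open the notebook page, it helps to first confirm that the server port accepts connections. The following is a minimal sketch; the helper server_reachable is hypothetical, and localhost is a placeholder for the AIX host name. It checks only that a TCP connection to the port succeeds, not that the listener is actually a Jupyter server.

```python
import socket


def server_reachable(host, port=8888, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "localhost" is a placeholder; replace it with the AIX host name.
    host = "localhost"
    state = "reachable" if server_reachable(host) else "not reachable"
    print(f"Jupyter server at {host}:8888 is {state}")
```

If the port is reachable from the AIX system itself but not from a remote workstation, a firewall between the two or the c.NotebookApp.ip setting is a likely cause.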

Summary

Python machine learning tools are very popular among data scientists, and many AIX users are interested in having these packages on AIX to write their AI applications. This tutorial provides an easy way to install all these machine learning packages with the help of AIX toolbox open source packages.