
Working with Jupyter Notebook and JupyterHub on IBM Power Systems

Introduction

As described on the Project Jupyter website, Jupyter Notebook is an open-source web application that allows you to create and share documents containing live code, equations, visualizations, and explanatory text. You can use Jupyter Notebook for numerical simulation, statistical modeling, machine learning, and much more.

Jupyter Notebook has become popular among students, researchers, and data scientists who use it to create and share documents, conduct data analysis and visualization, learn or teach programming, and so on. A notebook consists of code and markdown cells where users can write code, equations, or markup text. Users can also visualize data or results in the form of graphs or images within Jupyter Notebook. Though originally designed for the Python, Julia, and R languages, it now supports more than 100 programming and scripting languages.

Jupyter Notebook is designed as a single-user application. JupyterHub is a multi-user version of Jupyter Notebook which spawns and manages multiple instances of the single-user Jupyter Notebook server. It is useful when a group of participants wants to work with the same notebook, for example, when offering a hands-on demo to a group of people where each participant wants to experiment in their own workspace.

In this tutorial, we explain the steps to install and configure Jupyter Notebook and JupyterHub on an IBM® Power Systems™ server. We also show how to add support for new programming languages such as C, or for bash scripting, in Jupyter Notebook or JupyterHub.

Note:

  • The installation and configuration of Jupyter Notebook and JupyterHub were tested on an IBM Power® System AC922 server running the RHEL 7.6 (ppc64le) operating system with Anaconda3-2019.07 and Python version 3.7.
  • JupyterHub requires at least Python version 3.5 (see “References” for details).

Jupyter Notebook installation

Anaconda is an open source distribution of scientific computing packages for the Python and R programming languages, and it includes Jupyter Notebook. After successful installation of Anaconda, Jupyter Notebook is available at the Anaconda installation path. Perform the following steps to install the Anaconda3 package.

  1. Download Anaconda distribution – the Linux installer for IBM POWER8 and IBM POWER9.

    Figure 1. Anaconda distribution for POWER8 and POWER9 (ppc64le) systems

  2. Run the Anaconda installation script.

    # ./Anaconda3-2019.07-Linux-ppc64le.sh

    By default, the Anaconda package is installed in the user’s home directory. With the -p <prefix> option, you can specify a custom installation location. Use the -h option to see additional installation options.

  3. Add the Anaconda installation location to the standard path by updating the PATH environment variable.

    # export PATH=/<path>/<to>/<Anaconda>/<installation>/anaconda/bin:$PATH

Alternatively, Jupyter Notebook can be installed using the Python package installer, pip.
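After updating PATH, it can be useful to confirm that the jupyter command actually resolves from the new location. The following is a minimal sketch using only the Python standard library; the helper name find_on_path is an illustrative assumption and not part of Anaconda or Jupyter.

```python
import shutil


def find_on_path(command, path=None):
    """Return the full path of `command` if it is found on PATH, else None.

    `path` is an optional PATH-style string; when None, the PATH of the
    current environment is searched.
    """
    return shutil.which(command, path=path)


# Report where (or whether) the `jupyter` command resolves.
location = find_on_path("jupyter")
if location is None:
    print("jupyter was not found on PATH; re-check the export step")
else:
    print("jupyter resolved to " + location)
```

If the printed location is not under the Anaconda installation directory, another Jupyter installation earlier on PATH is probably taking precedence.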

Configuring and starting Jupyter Notebook

You can configure Jupyter Notebook using the configuration file or the equivalent command line options (which can be passed while starting the Notebook server). By default, Jupyter Notebook is configured to run on a local server (localhost). You can override the default configuration to run Jupyter Notebook on a remote server.

Generate configuration file

Run the following command to generate the configuration file (jupyter_notebook_config.py).

# jupyter notebook --generate-config

The configuration file is generated in the ~/.jupyter directory. Modify the configuration file to set the Notebook server IP address, port, Notebook directory path, and so on. Listing 1 shows a reference Jupyter Notebook configuration to run on a remote server.

Listing 1. Jupyter Notebook configuration to run on remote server
# IP address (public or private) that the Notebook server listens on,
# for example: c.NotebookApp.ip = '9.3.90.141'
c.NotebookApp.ip = '<public or private ip>'

# Working directory for notebooks and kernels.
c.NotebookApp.notebook_dir = '/<path>/<to>/<working directory>/'

# Port that the Notebook server listens on. The default is 8888.
c.NotebookApp.port = <port number>

Start the Jupyter Notebook server

Run the following command to start Jupyter Notebook.

# jupyter notebook
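As noted earlier, the same settings can be passed as command line options instead of being written to the configuration file. The sketch below builds such a command line with the --ip, --port, --notebook-dir, and --no-browser options of jupyter notebook. The function name notebook_command and the example directory are hypothetical; the IP address is the example value used in Listing 1.

```python
import shlex


def notebook_command(ip, port=8888, notebook_dir=None):
    """Build the argument list that starts the Notebook server with overrides."""
    argv = [
        "jupyter", "notebook",
        "--ip={}".format(ip),
        "--port={}".format(port),
        "--no-browser",  # a remote server usually has no local browser to open
    ]
    if notebook_dir is not None:
        argv.append("--notebook-dir={}".format(notebook_dir))
    return argv


# Example values only: adjust the address, port, and directory to your server.
argv = notebook_command("9.3.90.141", port=8888, notebook_dir="/home/demo/notebooks")
print(" ".join(shlex.quote(arg) for arg in argv))
# To launch the server from Python, pass the list to subprocess.run(argv, check=True).
```

Options given on the command line take precedence over values in jupyter_notebook_config.py, which makes this form convenient for one-off runs.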

Figure 2 shows the Jupyter Notebook server starting log. Open the web browser and connect to the Jupyter Notebook server using the respective IP and port number as shown in Figure 3.

Figure 2. Jupyter Notebook server starting log


Figure 3. Connecting to the Jupyter Notebook server from the client (web browser)


JupyterHub installation

JupyterHub is not part of the Anaconda packages for IBM Power servers (ppc64le) [3]. Perform the following steps to install JupyterHub on an IBM Power server.

  1. Make sure that the following prerequisites are installed:

    • Python version 3.5 or higher
    • nodejs / npm (node package manager)
    • Anaconda3

  2. Install Node.js.

    # conda install -c conda-forge nodejs

  3. Install JupyterHub.

    # python -m pip install jupyterhub
    # npm install -g configurable-http-proxy

  4. Add the JupyterHub installation location to the standard path by updating the PATH environment variable.

    # export PATH=/<path>/<to>/<JupyterHub>/<installation>/bin:$PATH

Configuring and starting JupyterHub

You can configure JupyterHub using a configuration file or the equivalent command line options (which can be passed while starting the JupyterHub server). By default, JupyterHub is configured to run on a local server (localhost). You can override the default configuration to run JupyterHub on a remote server.

Generate configuration file

Run the following command to create and store the JupyterHub configuration file (jupyterhub_config.py) in the current directory.

# jupyterhub --generate-config

Modify the configuration file to override the default configuration and set the JupyterHub and spawner IP address, port, password, Notebook directory path and so on. Listing 2 in Appendix A shows the JupyterHub reference configuration to run on the remote server.

Starting JupyterHub using configuration file

Run the following command to start JupyterHub using the configuration file:

# jupyterhub -f <configuration file>

Figure 4 shows the JupyterHub server starting log. Open the web browser and connect to the JupyterHub server using the respective IP and port number as shown in Figure 5.

Figure 4. JupyterHub server starting log


Figure 5. Connecting to the JupyterHub server from client (web browser)


Integrating new language support in JupyterHub and Jupyter Notebook

By default, Jupyter Notebook supports Python programming. Over time, kernels for many other programming and scripting languages have been made available for use within Jupyter Notebook. You can find the list of available or supported languages (kernels) at: https://github.com/jupyter/jupyter/wiki/Jupyter-kernels

Add C language support (install C kernel)

Refer to the jupyter-c-kernel repository on GitHub for details about C kernel installation. Here is a quick reference to the prerequisites and installation steps.

Prerequisites
  • gcc compiler
  • jupyter (application to run the Jupyter Notebook)
  • python3 interpreter
  • pip (Python package installer)

Installation steps
  1. Install the jupyter-c-kernel Python package.
    pip install jupyter-c-kernel
  2. Install the kernel specification for the C language kernel.
    install_c_kernel
  3. Start the Jupyter Notebook server.
    jupyter-notebook

Add bash shell support (install bash kernel)

Refer to the bash_kernel repository on GitHub for details about bash kernel installation. Here is a quick reference to the prerequisite and the installation steps.

As a prerequisite, make sure that IPython 3 is installed on your system.

Installation steps
  1. Install the bash_kernel Python package.
    pip install bash_kernel

  2. Run the bash_kernel library module.
    python -m bash_kernel.install

After the kernels are installed successfully, they appear in the New drop-down list in Jupyter Notebook (or JupyterHub) as shown in Figure 6.

Figure 6. Jupyter Notebook with bash and C programming language support
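The installed kernels can also be checked without the browser. The jupyter kernelspec list --json command prints the registered kernel specifications as JSON. The sketch below parses that output into a mapping of kernel name to display name. The function names installed_kernels and list_kernels are illustrative assumptions, and the actual kernel names on your system may differ.

```python
import json
import subprocess


def installed_kernels(kernelspec_json):
    """Map kernel name to display name from `jupyter kernelspec list --json` output."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return {name: info["spec"]["display_name"] for name, info in specs.items()}


def list_kernels():
    """Run the jupyter CLI and return the kernels it reports (requires jupyter on PATH)."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return installed_kernels(result.stdout)
```

Calling list_kernels() on the server after both installations should return entries for the C and bash kernels alongside the default Python kernel.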


Summary

The scientific and research community is adopting Jupyter Notebook and JupyterHub as collaboration tools. Jupyter Notebook is architecture independent and easy to share, and it makes results easy to reproduce. Users can write markup text, equations, and code in more than 100 different programming languages in Jupyter Notebook. In this tutorial, we showed how to install and configure Jupyter Notebook and JupyterHub on an IBM Power System server (ppc64le architecture). This can help developers, data scientists, and researchers to develop and run notebooks on IBM Power servers.

Appendix

A. JupyterHub reference configuration to run on remote server

Listing 2. JupyterHub configuration to run on remote server
# Class used for authenticating users. This should be a subclass of the `jupyterhub.auth.Authenticator` class.
# JupyterHub ships with a default PAM-based authenticator and a DummyAuthenticator for logging in.
# Note: The DummyAuthenticator is extremely insecure because it allows any username to log in with any password unless a global password is set.
c.JupyterHub.authenticator_class = 'jupyterhub.auth.DummyAuthenticator'

# set global password to all users
c.DummyAuthenticator.password = "passw0rd"  

# The ip or hostname for proxies and spawners to use for connecting to the Hub.
c.JupyterHub.hub_connect_ip = '9.3.90.141'  

# The public facing ip of the whole JupyterHub application. This is the address on which the proxy will listen. This is the only address through which JupyterHub should be accessed by users.
c.JupyterHub.ip = '9.3.90.141'

# The public-facing URL of the JupyterHub application. By default, the JupyterHub proxy listens on port 8000.
# Use one of the following two forms:
# c.JupyterHub.bind_url = 'http://:<port number>'
c.JupyterHub.bind_url = 'http://<IP address>:<port number>'

# The URL the single-user server should start in. {username} will be expanded to the user’s username
c.Spawner.default_url = '/tree/home/{username}'

# The IP address (or hostname) the single-user server should listen on. The JupyterHub proxy implementation should be able to send packets to this interface.
c.Spawner.ip = '9.3.90.141'

# Path to the Notebook directory for the single-user server. The user sees a file listing of this directory when the Notebook interface is started.
c.Spawner.notebook_dir = '/'

B. Opening port in firewall

Run the following commands with an appropriate port number for the JupyterHub or Notebook server:

  • $ sudo firewall-cmd --zone=public --add-port=<port number>/tcp --permanent
  • $ sudo firewall-cmd --reload
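
After reloading the firewall, you can check from a client machine whether the port is reachable before opening the browser. The following is a minimal sketch using the Python standard library; the function name port_is_open is an illustrative assumption, and the host and port in the usage comment are the example values from the listings.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Example usage (example address, default JupyterHub proxy port):
# port_is_open("9.3.90.141", 8000)
```

A False result only shows that the TCP connection failed. The cause can be the firewall, a server that is not running, or a server that is bound to a different address.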

References

  1. JupyterHub documentation
    a) JupyterHub setup
    b) JupyterHub GitHub repository
  2. Jupyter Notebook programming language kernels
  3. Anaconda documentation – packages for IBM Power with Python 3.7
  4. Jupyter Notebook: An Introduction
  5. Installing and configuring Python machine learning packages on IBM AIX
  6. Working with Notebook – IBM Watson Studio
  7. JupyterHub with Kubernetes

Aditya Nitsure
Asis Patra