This blog provides the steps and instructions required to bring up Data Science Experience (DSX) Local on IBM Power Systems.
DSX (Data Science Experience) is an interactive environment in which data scientists can collaborate on machine learning projects and tackle tough data challenges with the best tools available in DSX, such as RStudio, Jupyter, and Watson Machine Learning (WML) with Spark, all in an integrated environment. Like DSX on IBM Cloud, Data Science Experience is enabled on the IBM Power platform.
DSX provides an environment that works with PowerAI and exposes its open source software through interfaces such as Jupyter notebooks, giving data scientists an easier way to work with the technologies provided by the PowerAI tools. With DSX on Power, data analysis is very quick. In addition to all the capabilities that consumers can experience, Power technology offers further advantages such as advanced GPU support and NVLink. Data Science Experience can be installed and run on a local setup of Power systems or on IBM Private Cloud, and can make use of the tools available in the installed DSX cluster.
Installing DSX Local on virtual or physical machines
The installation of DSX Local is simple and straightforward: the complete DSX Local stack can be configured through a single installer. Installation takes 3 to 3.5 hours. Some of the installation steps load docker images (some of them gigabytes in size) such as notebook, spark, the cloudant database, and rstudio, and pull different services from docker. The installation time depends mainly on network speed and available resources, so before installing DSX, ensure that the system requirements are met and that disk performance is good. Otherwise, even if the installation succeeds, the cluster will experience issues due to the slowness of the disk.
The link System requirements for Data Science describes in detail the hardware and software requirements for three-node and nine-node configurations. The configuration details in that link apply to production environments; for testing purposes, you can use a lower configuration, which is also described later in this blog.
The following configuration was tried out for a three-node cluster.
- Number of servers used: Three virtual machines (Servers can be virtual or physical machines)
- Installed Operating System: RHEL 7.2 ppc64le
| Node | CPU cores | Memory (GB) | Network | Disks |
| --- | --- | --- | --- | --- |
| Master1 | 8 | 24 | Public IP + private IP configured | Two disks of size 500 GB and 400 GB |
| Master2 | 8 | 24 | Only private IP | Two disks of size |
| Master3 | 8 | 24 | Only private IP | Two disks of size 400 GB |
Note: After installing with a private IP, it can later be changed to a public IP for external connections.
Preconfiguration of the system before installing DSX
Create partitions and volume groups of required size
- Create and format the two disk partitions on all three nodes with XFS.
- Create new partitions using parted if free partitions are not available.
Example (assuming the disks are named vda and vdb):
# parted /dev/vda --script mklabel gpt
# parted /dev/vda --script mkpart primary '0%' '100%'
# mkfs.xfs -f -n ftype=1 /dev/vda1
# parted /dev/vdb --script mklabel gpt
# parted /dev/vdb --script mkpart primary '0%' '100%'
# mkfs.xfs -f -n ftype=1 /dev/vdb1
Create two directories and mount the partitions on all three nodes
# mkdir -p /ibm
# mkdir -p /data
where /ibm is the installer partition and /data is the data partition.
To ensure that the mounts persist after reboot, add similar entries to /etc/fstab on all three nodes:
# echo "/dev/vda1 /ibm xfs defaults,noatime 1 2" >> /etc/fstab
# echo "/dev/vdb1 /data xfs defaults,noatime 1 2" >> /etc/fstab
Mount the partitions on all three nodes
# mount /ibm
# mount /data
Verify that the partitions are mounted properly on all three nodes:
# df -h /ibm
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       400G   33M  400G   1% /ibm
# df -h /data
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb1       500G   33M  500G   1% /data
SELinux must be in permissive (not enforcing) mode. To achieve this, perform one of the following:
- Run:
# setenforce 0
and verify that # getenforce shows the desired result.
- Otherwise, modify /etc/sysconfig/selinux and set "SELINUX=permissive". This requires a system reboot. Reboot and ensure that the result shows permissive:
# getenforce
Permissive
Also, after the reboot, ensure that the partitions are mounted as described in the previous section.
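A small post-reboot check along these lines can confirm both settings in one go. This is a sketch; it assumes getenforce and mountpoint are available, as they are on RHEL:

```shell
# Report the SELinux mode and whether the two DSX partitions are mounted
mode=$(getenforce 2>/dev/null || echo Disabled)
echo "SELinux: $mode"
for mp in /ibm /data; do
  if mountpoint -q "$mp" 2>/dev/null; then
    echo "$mp: mounted"
  else
    echo "$mp: NOT mounted"
  fi
done
```

If the script reports Enforcing or a missing mount, fix that before launching the installer.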
Autologin from master-1 to master-2 and master-3
This is required because the installer connects to the other two nodes to copy docker images, start different services, and so on.
- Create an SSH key on master-1 using ssh-keygen.
- Add the RSA public key ( .ssh/id_rsa.pub ) to the .ssh/authorized_keys file on master-2 and master-3.
- Ensure that login from master-1 to the other two nodes happens without a password.
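The steps above can be sketched as follows. The block generates a throwaway key in a temporary directory so it is safe to run as-is; the master-2/master-3 targets in the comments are placeholder hostnames for the real private IPs:

```shell
# Generate a throwaway RSA key pair in a temp dir (the real key would live in ~/.ssh)
tmpd=$(mktemp -d)
ssh-keygen -q -t rsa -N "" -f "$tmpd/id_rsa"
# On the real cluster, the public key is appended to each node's authorized_keys, e.g.:
#   ssh-copy-id -i ~/.ssh/id_rsa.pub root@master-2
#   ssh-copy-id -i ~/.ssh/id_rsa.pub root@master-3
if [ -f "$tmpd/id_rsa.pub" ]; then created=yes; else created=no; fi
echo "key pair created: $created"
rm -rf "$tmpd"
```

After copying the key, `ssh root@master-2 hostname` should return without a password prompt.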
Get Proxy IP
The installation requires a proxy IP (in this case, an unused private IP) to serve as the HA proxy IP address. One way to obtain one is to install one more node with a private IP assigned, and later shut down this proxy node.
# shutdown -h now
(Run this in the proxy node to shut it down.)
Reuse the private IP of this proxy node for the DSX installation.
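Before reusing the address, it is worth confirming that nothing else answers on it. A minimal sketch; 192.0.2.50 is a placeholder, so substitute the proxy node's private IP:

```shell
# Ping the candidate HA proxy IP; if anything answers, pick another address
VIP=192.0.2.50
if ping -c 1 -W 1 "$VIP" >/dev/null 2>&1; then
  status="in-use"
else
  status="free"
fi
echo "$VIP is $status"
```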
- For testing purposes, the RHEL servers can use two separate partitions that are not used by the operating system installation, with a minimum of 150 GB and 350 GB.
- All the IPs in the cluster must be in the same subnet.
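The same-subnet requirement can be checked with a quick sketch like the one below. The IP addresses are placeholders, and the check assumes a /24 netmask (compare only the first three octets), so adjust it for other prefix lengths:

```shell
# Compare the first three octets of each cluster IP (a /24 assumption)
ips="192.168.10.11 192.168.10.12 192.168.10.13 192.168.10.20"
subnet=""
ok=yes
for ip in $ips; do
  prefix=${ip%.*}                 # strip the last octet, e.g. 192.168.10
  if [ -z "$subnet" ]; then subnet=$prefix; fi
  [ "$prefix" = "$subnet" ] || ok=no
done
echo "same /24 subnet: $ok"
```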
The systems are now set for the installation. Let us move on to the steps related to the DSX installer. This section describes installation using the command line and a configuration file; wdp.conf is the configuration file that holds all the required parameters.
Sample configuration file used:
# Warning: This file generated by a script, do NOT share
user=root
virtual_ip_address=
node_1=
node_data_1=/data
node_path_1=/ibm
node_2=
node_data_2=/data
node_path_2=/ibm
node_3=
node_data_3=/data
node_path_3=/ibm
ssh_port=22
overlay_network=<>
NOTE: The virtual_ip_address is either any unused IP address or the IP address that is obtained from the proxy node
Command line installation of DSX using wdp.conf
In master-1 node:
- Create the configuration file ( wdp.conf ) under /ibm folder.
- Download the DSX installer ( DSX-Local-Build-Config.ppc64le.* ) from the appropriate location and copy it into the same /ibm folder.
- Start the installation for the three-node cluster:
# cd /ibm
# chmod +x DSX-Local-Build-Config.ppc64le.117
# ./DSX-Local-Build-Config.ppc64le.117 --three-nodes
When you run the installer, it detects wdp.conf in the same folder and prompts you to use this configuration file. Press "y" here, then accept the terms and conditions and proceed. The installer might prompt you to enter the root password for all the nodes.
The installer detected a configuration file. Do you want to use the parameters in this file for your installation? [Y/N]: y
Validating the information in the file...SUCCESS
By typing (A), you agree to the terms and conditions: http://www14.software.ibm.com/cgi-bin/weblap/lap.pl?la_formnum=&li_formnum=L-KLSYAF9UXF&title=IBM+Data+Science+Experience+Local+Enterprise+Edition&l=en
Type (R) if you do not agree to the terms and conditions
Please type (A) for accept or (R) for reject: A
Thank you for using IBM Data Platform on Private Cloud
Installer is preparing files for the initial setup, this will take several minutes...
Initial setup starts, log file will be located at /ibm/InstallPackage/tmp/wdp.2017_11_09__04_08_16.log
Docker client is not found. Installer is installing and starting docker via yum
Checking if the docker daemon is running
Clean up the old images and containers if any exist
Load the wdp docker image (1/2)
If the time across the nodes is not synchronized, the following message is displayed. In that case, manually sync up the time on all three nodes and press Enter.
All the nodes are not synced to the same NTP server
Warning: Continuing without time synchronization among the nodes will cause unexpected issues
===== NTP configuration summary =====
x.x.x.x is synced to NTP server x.x.x.x,
x.x.x.x System clock is not synced
x.x.x.x System clock is not synced
Please configure NTP properly and press Enter to continue
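A crude way to spot skew before the installer complains is to compare epoch seconds between nodes. In this sketch, master-2 is a placeholder hostname, and the fallback keeps the snippet runnable even when that node is unreachable:

```shell
# Compare the local clock with master-2's clock (placeholder hostname)
local_t=$(date +%s)
remote_t=$(ssh -o BatchMode=yes -o ConnectTimeout=2 root@master-2 date +%s 2>/dev/null || echo "$local_t")
skew=$((local_t - remote_t))
if [ "$skew" -lt 0 ]; then skew=$((-skew)); fi
echo "clock skew: ${skew}s"
```

Anything more than a couple of seconds suggests NTP/chrony needs to be fixed before continuing.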
When the installation completes successfully, the URL for the DSX Local client is displayed.
The installation completed successfully.
Please visit https://x.x.x.x/dsx-admin for DSX portal
Change the IP to public for external connect
Since private IPs were used for the cluster, change the IP to public for external connections.
# cd /wdp/k8s/dsx-local-proxy/k8s
# cp nginx-service.yaml nginx-service.yaml.orig
Then, edit nginx-service.yaml and change the IP in the file to the public IP of master-1. Run the following:
# kubectl delete -f nginx-service.yaml.orig --namespace=ibm-private-cloud
# kubectl create -f nginx-service.yaml --namespace=ibm-private-cloud
You should now be able to connect using the public IP of the first master node.
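The IP edit itself is a one-line sed substitution. The sketch below demonstrates it on a temporary file, since the real service definition is larger; both IPs and the externalIPs layout shown here are illustrative, not the exact file contents:

```shell
# Demonstrate the private-to-public IP substitution on a throwaway copy
tmp=$(mktemp)
printf 'spec:\n  externalIPs:\n  - 10.0.0.11\n' > "$tmp"
sed -i 's/10\.0\.0\.11/203.0.113.10/' "$tmp"   # on the cluster: private IP -> master-1 public IP
hits=$(grep -c '203\.0\.113\.10' "$tmp")
echo "replacements: $hits"
rm -f "$tmp"
```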
On ppc64le, also run this command to make it accessible:
# iptables -P FORWARD ACCEPT
Login to DSX Portal
Installation and configuration of DSX is complete. Now sign in to the portal URL displayed at the end of the installation, similar to the "https://x.x.x.x/dsx-admin" address shown above.
Troubleshooting guide for DSX installation on Power
The following are some common errors during DSX installation, with solutions listed below each of them:
- ERROR: Disk latency test failed. By copying 512 kB, the time must be shorter than 60s, recommended to be shorter than 10s, validation result is 95s
Solution: Ensure that your servers meet the hardware and software requirements for DSX Local; refer to https://datascience.ibm.com/docs/content/local/requirements.html for the system requirements. This error means the node is not acceptable for the install. Bypassing the latency checks and forcing the installation will result in a cluster that has issues due to the slowness of the disk.
- WARNING: Disk throughput test failed. By copying 1.1 GB, the time is recommended to be shorter than 5s, validation result is 26s
WARNING: NTP/Chronyc is not setup
WARNING: CPU cores are 4, while requirement are 8
Solution: Add suppress_warning=true in wdp.conf to skip warnings that can safely be ignored.
- ERROR: Kubernetes is already installed with a different version or settings, please uninstall Kubernetes
Solution: The installation script takes care of installing Kubernetes and the other required packages. Uninstall any previously installed version before running the installer.
- Pre-install script timeout, trying again
Solution: After the installer is extracted, scripts under "InstallPackage" perform pre-install checks to verify that the system requirements are met; "parse.sh" in particular verifies the connection to all nodes. The above message indicates that either the connection to the nodes is not working, or a node is not acceptable for the install because it did not meet the install requirements. Check the log file under the tmp folder to see what is happening.
- "error: error validating \"/wdp/create_calico/calico.yaml\": error validating data: Get http://localhost:8080/swaggerapi/api/v1: dial tcp 127.0.0.1:8080: getsockopt: connection refused"
"The connection to the server localhost:8080 was refused - did you specify the right host or port?"
Solution: This comes from Kubernetes because of a cgroup driver difference. With docker version 1.12.6, the kubelet service has issues because of the cgroup option difference.
The system logs show a message such as "kubelet cgroup driver: \"cgroupfs\" is different from docker cgroup driver: \"systemd\"". Check the system logs/dmesg for similar errors from kubelet. In a later version of docker (17.03), this option is modified. The DSX installer takes care of installing the proper docker and Kubernetes versions. Uninstall any docker version that was installed before the installation, and the installer will pick the right version for the distro.
- Selinux should be in permissive
Solution: The installer requires SELinux to be in permissive mode. This can be achieved by modifying /etc/sysconfig/selinux or by running # setenforce 0.
- NTP/Chronyc is not setup
Solution: The date/time on all the nodes should be synchronized. Manually synchronize the clocks and proceed with the installation.
- Retry or skip any installation step
In total, there are around 63 steps, and if a step fails, the installer allows you to retry after correcting the issue in the background; this happens during the installer runtime. The installer also allows you to resume the installation from a specific step by adding "jump_install=<step number>" in wdp.conf. This is helpful if you got disconnected at some step and want to continue from the same point, but make sure the environment has not been altered or cleaned up.
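For example, resuming a failed run from a given step while also suppressing ignorable warnings would mean adding lines like the following to wdp.conf (the step number 25 is purely illustrative):

```
jump_install=25
suppress_warning=true
```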
- Uninstall DSX
From the install folder, run /wdp/utils/uninstall.sh
I would like to thank Suchitra Venugopal, co-author of this blog, for her contributions. We would like to thank Kanda Zhang, Omer Kamal, and Manjunath Kumatagi for their guidance and help with the issues during DSX installation.
We would like to thank Poornima Nayak, Pradipta Banerjee, Indrajit Poddar, and GopiKrishnan Gopi for encouraging us to work on this blog and providing review comments. And we would like to extend our thanks to the members who worked on building the product, mainly Igor Khapov, Yulia Gaponenko, Konstantin Maximov, Ilsiyar Gaynutdinov, Ekaterina Krivtsova, Alanny Lopez, Shilpa Kaul, Champakala Shankarappa, and Anita Nayak.