Connect a custom machine learning model to a SingleStore database

Data scientists developing machine learning models need a reliable, scalable database to store and access their data. SingleStore DB offers this capability. In this tutorial, we show you how to:

  • Create an instance of a SingleStore database on an OpenShift cluster using a Red Hat Marketplace operator.
  • Connect a custom machine learning model to a SingleStore database hosted on an OpenShift cluster.

Introduction to SingleStore

SingleStore, formerly MemSQL, is a distributed, highly scalable SQL database that ingests data continuously to perform operational analytics for your business. SingleStore ingests millions of events per day with ACID transactions while simultaneously analyzing billions of rows of data in relational SQL, JSON, geospatial, and full-text search formats. Queries use standard SQL drivers and syntax, so a broad ecosystem of drivers and applications is available.

Learning objectives

After completing this tutorial, you will know how to:

  • Deploy a SingleStore operator on an OpenShift cluster from Red Hat Marketplace
  • Create a table within a SingleStore database
  • Configure the database as a data asset in Watson Studio
  • Access the database using a Python client
  • Create and store data in a SingleStore database using a Python client

Prerequisites

Estimated time

Completing this tutorial should take about 30 minutes.

Steps to Deploy SingleStore Operator from Red Hat Marketplace on OpenShift Cluster

  1. Configure an OpenShift cluster with Red Hat Marketplace
  2. Deploy a SingleStore operator on an OpenShift cluster
  3. Create a database instance
  4. Create a new Watson Studio instance on IBM Cloud
  5. Create a new project on IBM Cloud
  6. Create a database connection to the project
  7. Create a new Python notebook for the project
  8. Configure the notebook

1. Configure an OpenShift cluster with Red Hat Marketplace

Follow the steps in this tutorial to configure a Red Hat OpenShift cluster on Red Hat Marketplace and connect to the cluster from your CLI: Configure a Red Hat OpenShift cluster hosted on Red Hat Marketplace

2. Deploy a SingleStore operator on an OpenShift cluster

  1. Go to the Red Hat Marketplace catalog and search for SingleStore. Select SingleStore from the results.

  2. The SingleStore product page gives you an overview, documentation, and pricing options associated with the product. Click on the Free Trial button.

  3. Next, the purchase summary shows the subscription term and a total cost of $0.00. Click Start trial.

  4. A Red Hat login is required. Click Logon with Red Hat credentials.

  5. Now, select Start trial.

    You can visit Workspace > My Software to view your list of purchased software.

  6. Back in the web dashboard, select the SingleStore tile and then select the Operators tab. Click on the Install Operator button.

  7. Leave the default selection for Update channel and Approval strategy. Select the cluster and namespace scope as SingleStore-project for the operator and click Install.

  8. A message appears at the top of your screen indicating that the installation has started in the cluster.

3. Create a database instance

To launch the OpenShift cluster console, navigate to Workspace > Clusters and click Cluster console.

Note: In the OpenShift cluster web console, the operator still appears under its old name, MemSQL, instead of SingleStore.

  1. Next, navigate to Operators > Installed Operators to confirm that the installation was successful. The MemSQL Operator should be listed under the project/namespace SingleStore-dtl.

  2. Under Provided APIs, click on the first Create Instance.

  3. The Create MemSQL page displays the default YAML. Edit the storageclass_name value in the YAML, and click the Create button. If the default YAML is not visible, paste in the following YAML and replace the storageclass_name value with the name of a storage class that exists in your cluster.

     apiVersion: labs.ai/v1alpha1
     kind: MemSQL
     metadata:
       name: SingleStore-dtl
     spec:
       size: 1
       mongodb:
         environment: prod
         # Replace with the name of a storage class that exists in your cluster
         storageclass_name: <existing_storageclass>
         storage_size: 20G

  4. SingleStore Operator pods should come up when the installation is completed.

  5. After the deployment completes, run the following command to display the two SingleStore DB service endpoints that are created during the deployment.

$ oc get svc

The output will resemble the following (actual values will vary):

NAME                     TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)          AGE
svc-memsql-cluster       ClusterIP      None            <none>         3306/TCP         42h
svc-memsql-cluster-ddl   LoadBalancer   172.21.29.233   169.46.26.10   3306:32278/TCP   42h
svc-memsql-cluster-dml   LoadBalancer   172.21.6.77     169.46.26.11   3306:30922/TCP   42h

You can now access the SingleStore database.
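
If you want to pick up the endpoint addresses programmatically rather than copying them by hand, the service listing can be parsed with a few lines of Python. The following is a minimal sketch, not part of the original tutorial: `parse_endpoints` and `Endpoint` are hypothetical names, and the parser assumes the default column layout of `oc get svc` shown above.

```python
from typing import List, NamedTuple


class Endpoint(NamedTuple):
    name: str
    host: str
    port: int


def parse_endpoints(oc_output: str) -> List[Endpoint]:
    """Extract services that have an external address from `oc get svc` text output."""
    lines = [line for line in oc_output.strip().splitlines() if line.strip()]
    endpoints = []
    for line in lines[1:]:  # skip the header row
        name, _type, _cluster_ip, external_ip, ports = line.split()[:5]
        if external_ip.startswith("<"):  # "<none>" or "<pending>": no external address
            continue
        # PORT(S) looks like "3306:32278/TCP"; the first number is the service port
        port = int(ports.split(",")[0].split(":")[0].split("/")[0])
        endpoints.append(Endpoint(name, external_ip, port))
    return endpoints


# The sample listing from this tutorial; your values will differ.
SAMPLE = """\
NAME                     TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)          AGE
svc-memsql-cluster       ClusterIP      None            <none>         3306/TCP         42h
svc-memsql-cluster-ddl   LoadBalancer   172.21.29.233   169.46.26.10   3306:32278/TCP   42h
svc-memsql-cluster-dml   LoadBalancer   172.21.6.77     169.46.26.11   3306:30922/TCP   42h
"""

for ep in parse_endpoints(SAMPLE):
    print(f"{ep.name}: {ep.host}:{ep.port}")
```

For the sample listing, this prints the two LoadBalancer services with their external IP addresses and port 3306, and skips the ClusterIP service, which has no external address.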

4. Create a new Watson Studio instance on IBM Cloud

Visit the IBM Watson Studio page to create an instance of IBM Watson Studio. Click the Create button.

Create Watson Studio instance

After the Watson Studio page loads, click on Get Started.

Click Get Started to create an instance

5. Create a new project on IBM Cloud

  • After you log in to your IBM Watson Studio instance, click the (☰) menu icon in the top left corner of your screen and click Projects.

    Navigate to the project

  • When you reach the project list, click Create an Empty Project. On the new page, enter the desired name (or Telco_CallDrop) and click Ok. On the next screen, click Create to finish creating your project.

    create a project

6. Create a database connection to the project

  1. After your new project is created, go to the project’s landing page and click Add to Project. A menu opens; click Connection.

    Connect to the database

  2. Select the desired database connection: Compose for MySQL Database.

    Choose Compose for MySQL database

  3. Enter the credentials of the database. Click on Test and then select Create.

    Enter your credentials
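
Before you enter the credentials, it can help to confirm that the database endpoint accepts TCP connections at all. The following is a minimal sketch under stated assumptions: the host is the EXTERNAL-IP of one of the services listed by `oc get svc`, the port is 3306, and `port_is_open` is a hypothetical helper that is not part of Watson Studio or SingleStore. A successful check from your workstation does not guarantee that Watson Studio can reach the endpoint, but a failed check usually points to a networking problem rather than a credentials problem.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name resolution failed
        return False


# Example call (placeholder address; substitute your own EXTERNAL-IP):
#   port_is_open("169.46.26.10", 3306)
```

The check only opens and closes a socket. It does not authenticate, so the Test button in the connection dialog is still the real verification of the credentials.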

7. Create a new Python notebook for the project

On the project page, click the Add to Project button. Then click Notebook.

Add a Python notebook to your project

You will be taken to a new page. Click on the From URL tab. Enter this URL: https://github.com/IBM/icp4d-telco-manage-ml-project/blob/master/notebooks/Multivariate_Time_Series_MemSQL_DB.ipynb

8. Configure the notebook

  1. Download the Telco_training_final.csv file and load it into a table named call_drop_data in SingleStore. Refer to the documentation for instructions on how to load CSV data into a SingleStore database (using a MySQL client).

  2. Select the cell that says “Insert Db2 Connection Credentials here”. In the assets tab, click Connections, then select Insert to code > Insert Credentials for your SingleStore connection.

    The inserted credentials contain the connection details that you entered when you created the connection.

  3. Run the notebook by selecting Cell > Run all.

Note: You can also run cells individually. Highlight a cell, then click the Run button at the top of the notebook. While the cell is running, an asterisk ([*]) appears to the left of the cell. When the cell has finished executing, a sequential number appears in its place (for example, [17]).
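
As an alternative to loading the CSV file with the MySQL client in step 1, the rows could be inserted from Python. The following is a minimal sketch, not the method used by the tutorial: `build_insert` is a hypothetical helper, the two-column sample is invented for illustration (the real file has its own header row), and the call_drop_data table is assumed to exist already with matching columns.

```python
import csv
import io
from typing import List, TextIO, Tuple


def build_insert(table: str, csv_file: TextIO) -> Tuple[str, List[Tuple[str, ...]]]:
    """Return a parameterized INSERT statement and the rows read from a CSV file."""
    reader = csv.reader(csv_file)
    header = next(reader)  # first row holds the column names
    columns = ", ".join(f"`{name.strip()}`" for name in header)
    placeholders = ", ".join(["%s"] * len(header))
    sql = f"INSERT INTO `{table}` ({columns}) VALUES ({placeholders})"
    rows = [tuple(row) for row in reader if row]  # skip blank lines
    return sql, rows


# Hypothetical sample data. For the real file, use
# open("Telco_training_final.csv", newline="") instead of io.StringIO.
sample = io.StringIO("cell_id,dropped\nA1,0\nB7,1\n")
sql, rows = build_insert("call_drop_data", sample)
print(sql)
print(rows)
```

With a MySQL-compatible driver such as PyMySQL, the statement and rows could then be passed to `cursor.executemany(sql, rows)`. All values are read as strings here, so the database converts them to the column types on insert.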

  • After you have run the notebook, you should see the machine learning model output, which is loaded into the SingleStore database.
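
To read or write the database from your own code outside the provided notebook, the connection details can be assembled into a SQLAlchemy-style URL for a MySQL-compatible driver. The following is a minimal sketch under stated assumptions: `mysql_url` is a hypothetical helper, the dictionary keys and values are placeholders, and the keys generated by Insert to code in your notebook may be named differently.

```python
from urllib.parse import quote_plus


def mysql_url(creds: dict, driver: str = "pymysql") -> str:
    """Build a SQLAlchemy-style URL for a MySQL-compatible database."""
    user = quote_plus(str(creds["username"]))
    password = quote_plus(str(creds["password"]))  # escape characters such as "@" or "/"
    return (
        f"mysql+{driver}://{user}:{password}"
        f"@{creds['host']}:{int(creds['port'])}/{creds['database']}"
    )


# Placeholder credentials; the key names are assumptions and may differ
# from what Insert to code generates in your notebook.
credentials = {
    "host": "169.46.26.10",
    "port": "3306",
    "username": "admin",
    "password": "p@ss/word",
    "database": "telco",
}
print(mysql_url(credentials))
```

If SQLAlchemy and PyMySQL are installed, a URL like this could be passed to `sqlalchemy.create_engine`, and a pandas DataFrame of predictions could be written with `DataFrame.to_sql`. The notebook itself may use a different approach.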
