Train a cloud-based machine learning model using on-premises data


Many companies and individuals struggle to use their on-premises data (the data that lives on a local machine, within a data center, or behind a firewall) for machine learning in the cloud. Finding a quick, easy, and secure way to connect resources in a protected environment to resources in the cloud can be challenging.

With Watson Studio, Watson Machine Learning, Db2, and Secure Gateway, you can establish a secure, persistent connection between your on-premises data and the cloud, and then train machine learning models on cloud computing resources such as Spark, elastic environments, and GPUs.


In this guide, we’ll create an on-premises Db2 database on our local computer, populate it, and connect it to Watson Studio via Secure Gateway. Next, we’ll read building violations data from this database and build a model that predicts the likelihood that a particular building will fail an inspection, based on historical data from the City of Chicago. After we build the model, we’ll deploy it with Watson Machine Learning as an API endpoint that only authorized users can access.

After completing this code pattern, you’ll understand how to:

  • Configure a secure gateway to IBM Cloud.
  • Connect Watson Studio to an on-premises Db2 database.
  • Create a machine learning model.



  1. Source data is retained in an on-premises Db2 database.
  2. The data is made accessible to Watson Studio through a secure gateway.
  3. Data read through the secure gateway is used to train a cloud-based machine learning model.
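
To make the flow above more concrete, here is a minimal Python sketch of how a cloud-side client might address the on-premises database: it connects to the gateway's cloud host and port rather than to the Db2 machine itself. The helper name build_db2_dsn and every value passed to it (gateway.example.com, port 15000, the credentials) are hypothetical placeholders, not values from this code pattern. The key=value; layout follows the connection string format used by the ibm_db Python driver, which is assumed here and is not necessarily what the README uses.

```python
def build_db2_dsn(database, gateway_host, gateway_port, user, password):
    """Build an ibm_db-style connection string that targets the gateway.

    gateway_host and gateway_port are the cloud-side address of the secure
    gateway destination (hypothetical values in this sketch), not the
    address of the on-premises Db2 server.
    """
    port = int(gateway_port)
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {gateway_port}")
    parts = [
        ("DATABASE", database),
        ("HOSTNAME", gateway_host),
        ("PORT", str(port)),
        ("PROTOCOL", "TCPIP"),
        ("UID", user),
        ("PWD", password),
    ]
    return "".join(f"{key}={value};" for key, value in parts)


# Placeholder values for illustration only.
dsn = build_db2_dsn("SAMPLE", "gateway.example.com", 15000, "db2user", "secret")

# With the ibm_db driver installed, the string could be used like this:
#   import ibm_db
#   conn = ibm_db.connect(dsn, "", "")
```

Because the client only ever sees the gateway address, the Db2 server can stay behind the firewall with no inbound ports opened to the internet.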


Get the detailed instructions in the README file. These steps will show you how to:

  1. Load sample data into an on-premises Db2 database.
  2. Create IBM Cloud service dependencies.
  3. Configure a secure gateway to IBM Cloud.
  4. Create a Watson Studio project.
  5. Create a machine learning model.
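
As a rough illustration of the final step, the sketch below trains a tiny logistic regression in plain Python to estimate the probability that a building fails an inspection. It is not the model from the README: the two features (past violation count and building age in decades) and the six toy rows are invented for illustration, whereas the code pattern itself trains on the City of Chicago data read from Db2.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # derivative of the log loss w.r.t. the score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


def predict_failure_probability(weights, bias, x):
    """Return the modeled probability that a building fails inspection."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Invented toy rows: [past_violations, building_age_in_decades].
TOY_ROWS = [[0, 1], [1, 2], [0, 3], [5, 4], [6, 8], [4, 6]]
TOY_LABELS = [0, 0, 0, 1, 1, 1]  # 1 = failed inspection

weights, bias = train_logistic(TOY_ROWS, TOY_LABELS)
print(predict_failure_probability(weights, bias, [6, 8]))
```

In practice, the training rows would come from the Db2 table reached through the secure gateway, and a library running on the cloud resources mentioned earlier would replace this hand-written loop.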