Predict heart medicine using machine learning



DISCLAIMER: This notebook is used for demonstrative and illustrative purposes only and does not constitute an offering that has gone through regulatory review. It is not intended to serve as a medical application. There is no representation as to the accuracy of the output of this application and it is presented without warranty.

In this code pattern, we use anonymized patient data to predict the best medication to treat heart disease. This notebook introduces commands for getting the data, building the model, deploying the model, and scoring.


Using machine learning in an application can produce impressive results, but moving from the model training stage to a production application is a lot of work. While frameworks like Apache Spark MLlib, scikit-learn, and XGBoost can help reduce the model-building workload, IBM Watson Machine Learning is a solution that can put those models into production in minutes. By deploying your models as Watson Machine Learning web services, you can start building your application on REST APIs.

In this code pattern, we use a machine learning classification algorithm to solve a requirement from a fictional biomedical company that produces heart medication. The company has collected data about a set of patients, all of whom suffered from the same illness. During their course of treatment, each patient responded to one of five medications. Based on these treatment records, the company would like to predict the best medication for a new patient. The pattern walks through the steps that use this data and the Spark MLlib package to train a model that predicts the best medication.
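The training step described above can be sketched as a Spark ML pipeline. This is a minimal illustration, not the notebook's actual code: the column names (`AGE`, `SEX`, `BP`, `CHOLESTEROL`, `SODIUM`, `POTASSIUM`, `DRUG`), the `_IX` suffix, and the choice of a decision tree classifier are all assumptions made for the example. The Spark imports are deferred into the function so that the column helper can be used without a Spark installation.

```python
# Hypothetical column names; replace them with the columns of your data set.
CATEGORICAL_COLS = ["SEX", "BP", "CHOLESTEROL"]
NUMERIC_COLS = ["AGE", "SODIUM", "POTASSIUM"]
LABEL_COL = "DRUG"


def feature_columns():
    """Return the assembler inputs: indexed categorical columns, then numeric ones."""
    return [c + "_IX" for c in CATEGORICAL_COLS] + list(NUMERIC_COLS)


def build_pipeline(train_df):
    """Assemble an unfitted pipeline. Requires pyspark and an active SparkSession."""
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import DecisionTreeClassifier
    from pyspark.ml.feature import IndexToString, StringIndexer, VectorAssembler

    # Fit the label indexer first so its labels can be reused to decode predictions.
    label_indexer = StringIndexer(inputCol=LABEL_COL, outputCol="label").fit(train_df)
    # Encode each string-valued feature as a numeric index.
    indexers = [
        StringIndexer(inputCol=c, outputCol=c + "_IX") for c in CATEGORICAL_COLS
    ]
    # Combine all feature columns into the single vector column Spark ML expects.
    assembler = VectorAssembler(inputCols=feature_columns(), outputCol="features")
    # Assumed classifier; any Spark ML multiclass classifier could be used here.
    classifier = DecisionTreeClassifier(labelCol="label", featuresCol="features")
    # Map the numeric prediction back to the original medication name.
    label_converter = IndexToString(
        inputCol="prediction",
        outputCol="predictedLabel",
        labels=label_indexer.labels,
    )
    stages = [label_indexer] + indexers + [assembler, classifier, label_converter]
    return Pipeline(stages=stages)
```

With a patient DataFrame `df` loaded, a typical use would be `train_df, test_df = df.randomSplit([0.8, 0.2], seed=24)` followed by `model = build_pipeline(train_df).fit(train_df)`; `model.transform(test_df)` then adds a `predictedLabel` column holding the predicted medication.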

Next, the trained model is published to a Watson Machine Learning repository on IBM Cloud and then deployed as a web service. The new patient’s records are sent in an authenticated request to the scoring endpoint, and the model returns a drug recommendation in the response.
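As a rough sketch of the scoring call described above, the request can be built with the Python standard library. The payload layout (`fields` plus `values`), the bearer-token header, the placeholder URL, and the field names are assumptions for illustration; check the Watson Machine Learning documentation for the payload and authentication scheme your service version expects.

```python
import json
import urllib.request


def build_scoring_request(scoring_url, token, fields, rows):
    """Build (but do not send) an authenticated JSON POST for a scoring endpoint."""
    # Assumed payload shape: column names plus one list of values per record.
    payload = {"fields": list(fields), "values": [list(row) for row in rows]}
    return urllib.request.Request(
        scoring_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,  # assumed authentication scheme
        },
        method="POST",
    )


# Placeholder endpoint, token, and patient record; none of these are real values.
request = build_scoring_request(
    "https://example.invalid/scoring",
    "TOKEN",
    ["AGE", "SEX", "BP", "CHOLESTEROL", "SODIUM", "POTASSIUM"],
    [[20, "F", "HIGH", "HIGH", 0.71, 0.07]],
)
```

Sending the request with `urllib.request.urlopen(request)` and decoding the response body with `json.loads` would return the model's prediction for each submitted record.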

When you have completed the code pattern, you’ll understand how to:

  • Prepare data, create an Apache Spark machine learning pipeline, and train a model
  • Publish a sample model in the Watson Machine Learning repository
  • Deploy a model for online scoring (as a web service)
  • Score the model by using sample scoring records and the scoring endpoint


[Figure: flow diagram]

  1. Create a project in Watson Studio that uses a Jupyter Notebook, Python 3.5, and Spark.
  2. Use Db2 Warehouse on Cloud to load and read the data.
  3. Use PySpark to create a pipeline, train a model, and store the model using the Watson Machine Learning service.


Find the detailed steps for this pattern in the readme file. The steps will show you how to:

  1. Clone the repository.
  2. Create Watson services in IBM Cloud.
  3. Save the credentials for your Watson Machine Learning service.
  4. Create the Db2 Warehouse on Cloud service and load the data.
  5. Create a notebook in IBM Watson Studio.
  6. Run the notebook in IBM Watson Studio.