Whether you’re an incredibly smart and capable data scientist or a beginner struggling with the different hats you’re asked to wear, suppose you’ve trained one model to classify customers and another to predict which incentives will influence a purchase. How do you put these insights into production? Most guides and documentation today focus on finding data, cleaning data, and training models, while the dirty work of moving insights into production is left to development and infrastructure teams. IBM’s Watson Machine Learning Service lets you save, deploy, and test models right from your development environment.

Watson Machine Learning is a cloud service that allows you to build models that best suit your data and then deploy them online. It also lets you import custom models written in Spark MLlib and scikit-learn, or it can automatically select and train a model for you. It brings machine learning to a wider audience by allowing anyone, even without a development background, to deploy models easily.

Learning objectives

Demonstrate how to deploy your trained machine learning model and pipeline into production on Watson Machine Learning (WML), IBM’s Machine Learning as a Service offering. Once the model is deployed online, you will have an auto-scaling API endpoint fronting an ML model that is available for public consumption.


Prerequisites

A machine learning model trained with one of the supported frameworks

Estimated time

Creating the WML Service, saving your trained model, deploying your model, and testing the deployment should take 10-15 minutes.


You, rock star or aspiring rock star data scientist, have trained a new model, have evaluated its accuracy, and want to move it into production.

The following 6 steps will guide you through the process of deploying your machine learning model in production:

  1. Create Watson ML Service
  2. Create a set of credentials for using the service
  3. Download the SDK
  4. Authenticate and Save the model
  5. Deploy the model
  6. Call the model

Create Watson ML Service

To get started, create a new instance of Watson’s ML Service for hosting the model. The service is available for free on IBM’s Cloud. The free tier provides hosting for 5 models and 5000 predictions; not too shabby for a service that doesn’t ask for a credit card. Create an IBM Cloud account or log in to an existing one.

  1. Once logged in, select Catalog from the upper right menu
  2. Search for the “Machine Learning” service
  3. Select the Machine Learning service that appears

IBM Cloud Catalog

Under the service details page, you are provided with an overview, some configuration options for where the service should run, and a menu of pricing plans.

  • Name your new Machine Learning Service
  • Select the Lite Plan
  • Click Create



Create a set of credentials for using the service

Your Watson Machine Learning Service has been created, and you’re taken to a panel with information about the newly created service. Before moving back into the code, though, we need to create a set of credentials for authenticating with the service through an SDK or HTTP API calls.

  1. Select Service Credentials from the left menu
  2. Select New Credential
  3. Leave defaults and select Add
  4. View New Credentials
  5. Copy credentials to clipboard and save for later

Create Credentials


Download the Python SDK

With the Watson Machine Learning Service created, and the credentials saved, everything else can be performed programmatically; music to my ears as a developer.

The watson-machine-learning-client library requires at least one of the following libraries to be installed as well: pyspark, scikit-learn, xgboost, mlpipelinepy, or ibmsparkpipeline. In this example I’m using scikit-learn. Documentation for the library is available at

pip install watson-machine-learning-client scikit-learn
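
Before moving on, it can help to confirm that the packages actually import in the environment you plan to use. The following is a minimal sketch using only the standard library. The helper name missing_packages is hypothetical, and the module names are assumptions based on the import used later in this tutorial (watson_machine_learning_client) and on scikit-learn's import name (sklearn).

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed import names for the two packages installed above.
required = ["watson_machine_learning_client", "sklearn"]
missing = missing_packages(required)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported as not installed, rerun the pip command in the same environment that your notebook or script uses.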

Authenticate and Save the model

Authenticate to the Watson Machine Learning Service with your credentials saved from above.

from watson_machine_learning_client import WatsonMachineLearningAPIClient

# Paste in the values from the service credentials you saved earlier
wml_credentials = {
    "url": "",
    "access_key": "*****",
    "username": "*****",
    "password": "*****",
    "instance_id": "*****"
}
client = WatsonMachineLearningAPIClient(wml_credentials)
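
If you would rather not paste credentials into a notebook, one option is to keep them in a JSON file and load them at run time. This is a minimal sketch, not part of the Watson SDK: the helper name load_wml_credentials and the environment variable WML_CREDENTIALS_FILE are hypothetical, and the expected keys are simply the ones shown in the dictionary above.

```python
import json
import os

# Keys taken from the credentials dictionary shown in this tutorial.
REQUIRED_KEYS = ("url", "access_key", "username", "password", "instance_id")


def load_wml_credentials(path):
    """Read service credentials from a JSON file and check the expected keys."""
    with open(path, encoding="utf-8") as handle:
        credentials = json.load(handle)
    missing = [key for key in REQUIRED_KEYS if key not in credentials]
    if missing:
        raise ValueError("credentials file is missing: " + ", ".join(missing))
    return credentials


# Hypothetical usage: the file path comes from an environment variable you set.
# wml_credentials = load_wml_credentials(os.environ["WML_CREDENTIALS_FILE"])
# client = WatsonMachineLearningAPIClient(wml_credentials)
```

Keeping the file outside version control means the notebook can be shared without exposing the service credentials.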

Save your machine learning pipeline and model in the Watson Machine Learning Service. It’s important to include a pipeline in your model so that data is transformed before any predictions are made. With your pipeline built, your model is trained by calling the fit function on your data.

model =
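
The assignment above is incomplete as shown, so here is a minimal sketch of what building and fitting a scikit-learn pipeline can look like. It is not necessarily the pipeline used for this tutorial: the toy data, the step names, and the choice of StandardScaler with LogisticRegression are all assumptions made for illustration.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy, made-up training data: two numeric features and a binary label.
train_features = [[1.0, 20.0], [2.0, 18.0], [8.0, 3.0], [9.0, 1.0]]
train_labels = [0, 0, 1, 1]

# The pipeline scales the features, then fits a classifier on the result.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("classify", LogisticRegression()),
])

# fit() trains every step in order and returns the fitted pipeline.
model = pipeline.fit(train_features, train_labels)

# The same transformations are applied automatically at prediction time.
predictions = model.predict([[1.5, 19.0], [8.5, 2.0]])
print(predictions)
```

Because the scaler lives inside the pipeline, whoever calls the deployed model can send raw feature values and does not need to reproduce the preprocessing.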

Within your development environment you can make predictions by calling the predict method on your model, but this doesn’t lend itself easily to a production deployment. Here’s where Watson’s Machine Learning Service simplifies deployment: just save your trained model to the Watson Machine Learning Service.

model_props = {"authorName":"IBM", "authorEmail":""}
model_artifact = client.repository.store_model(model, name="My Awesome Prediction Model", meta_props=model_props, training_data=train_data)

That’s it: your model has been saved, and its details are available by calling the get_details method.

import json
print(json.dumps(client.repository.get_details(model_artifact.uid), indent=2))

Deploy the model

The model has been saved in the Watson Machine Learning Service, but before you can begin to make predictions it needs to be deployed. Deploying the saved model puts an API scoring endpoint in front of it.

deployed_model = client.deployments.create(model_artifact.uid, "Deployment of My Awesome Prediction Model")

Display the API endpoint of the newly deployed model and details of the deployed models.
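
A minimal sketch of this step, assuming deployed_model is a JSON-serializable dictionary of deployment details. It relies on client.deployments.get_scoring_url, the same call that the scoring function in the next section uses; the helper name describe_deployment is hypothetical.

```python
import json


def describe_deployment(client, deployed_model):
    """Return the scoring URL and a printable dump of the deployment details."""
    # get_scoring_url is the call used by the scoring function in this tutorial.
    scoring_url = client.deployments.get_scoring_url(deployed_model)
    # Assumption: deployed_model is a JSON-serializable dict of details.
    details_text = json.dumps(deployed_model, indent=2)
    return scoring_url, details_text


# Hypothetical usage, once client and deployed_model exist:
# scoring_url, details_text = describe_deployment(client, deployed_model)
# print(scoring_url)
# print(details_text)
```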


Call the model

With the model deployed, you call the endpoint, passing your features in the HTTP request. The following sample method demonstrates how to construct an HTTP POST request with a payload specifying the fields the model pipeline requires.

import json
import requests

def get_prediction_ml(featureA, featureB, featureC, featureD):
    # watson_ml_token must already hold a valid bearer token for the service
    scoring_url = client.deployments.get_scoring_url(deployed_model)
    scoring_payload = {"fields": ["FEATUREA", "FEATUREB", "FEATUREC", "FEATURED"], "values": [[featureA, featureB, featureC, featureD]]}
    header = {'authorization': 'Bearer ' + watson_ml_token, 'content-type': "application/json"}
    scoring_response = requests.post(scoring_url, json=scoring_payload, headers=header)
    # Position 18 of the first result row is specific to this model's output; adjust it for yours
    return json.loads(scoring_response.text).get("values")[0][18]
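
A fixed position in the result row is easy to get wrong when the pipeline changes. The sketch below shows one way to build the payload and to pick a result column by name instead. It rests on an assumption that is not confirmed by this tutorial: that the scoring response lists its column names under "fields" in the same order as each row under "values". The helper names and the sample response are made up for illustration.

```python
import json


def build_scoring_payload(field_names, rows):
    """Build the request body: field names plus one list of values per row."""
    for row in rows:
        if len(row) != len(field_names):
            raise ValueError("each row needs one value per field")
    return {"fields": list(field_names), "values": [list(row) for row in rows]}


def extract_column(response_text, column_name):
    """Return one named column from a scoring response.

    Assumption: the response body lists its column names under "fields"
    in the same order as the entries of each row under "values".
    """
    body = json.loads(response_text)
    index = body["fields"].index(column_name)
    return [row[index] for row in body["values"]]


# Made-up response used only to exercise the helpers.
sample_response = json.dumps({
    "fields": ["FEATUREA", "FEATUREB", "prediction"],
    "values": [[1, 2, "yes"]],
})
payload = build_scoring_payload(["FEATUREA", "FEATUREB"], [[1, 2]])
print(payload)
print(extract_column(sample_response, "prediction"))
```

Looking the column up by name keeps the calling code working if the pipeline later adds or reorders output columns.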


And that’s it. With just a few lines you can take a trained ML model with a pipeline and deploy it as a service that scales automatically with demand. So, next time you want to quickly move your insights into production, just add a few more lines of code to that notebook instead of working through capacity planning exercises, infrastructure build-out, and configuring a network to front your model with an API.

While this all works from any development environment, it’s simplified even further when using IBM’s Watson Studio, your cloud-based IDE for data science. Here Watson ML is integrated with open source tools such as Jupyter Notebooks and Zeppelin Notebooks, along with additional enhancements around data governance, collaboration, and scale, simplifying the workflow of the data science and development teams.