Performance evaluation of machine learning models

In machine learning, numerous models are created to accomplish a given task. With so many models to choose from, how do you decide which one is best for your purpose, which performs better, and what the relevant performance metrics are? The real question is which model is the right fit for your requirements.

This code pattern gives you a way to compare Watson cognitive service models to help you decide which model performs better on a particular set of data. It provides a platform to configure models, supply input data, and run the models to produce performance evaluation statistics such as confusion matrices and ROC curves. The Workbench presents recommendations, ROC curves, and summary statistics for all of the configured models on a single dashboard so that you can select the best-performing model.
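
As an illustration of the kind of statistics the Workbench reports, here is a minimal sketch, using scikit-learn rather than the pattern's own code, of computing a confusion matrix and an ROC curve from one model's scores against ground truth. The arrays below are made-up example data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_curve, auc

# Example ground-truth labels and one model's confidence scores
# for the positive class (illustrative values only).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1])

# Confusion matrix at a 0.5 decision threshold
# (rows = actual classes, columns = predicted classes).
y_pred = (y_score >= 0.5).astype(int)
print(confusion_matrix(y_true, y_pred))

# ROC curve: true-positive rate vs. false-positive rate across
# thresholds, summarized by the area under the curve (AUC).
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(f"AUC = {auc(fpr, tpr):.3f}")
```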

High-level steps involved in comparing the models are:

  • Create the models to be compared, or use existing models.
  • Develop an application that can consume the models (a minimal sketch follows this list).
  • Upload test data with ground truth for your supported Watson cognitive service.
  • The application compares the configured models on that test data.
  • The application provides recommendations, ROC curves, and summary statistics for all of the configured models on a single dashboard so that you can select the best-performing model.
  • You can use the statistics to fine-tune a model’s parameters and re-run the comparison.
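
The sketch below illustrates the consume-and-compare steps under some stated assumptions: two already-trained Watson Natural Language Classifier models accessed through the ibm-watson Python SDK, with the API key, service URL, classifier IDs, and `test_data.csv` all placeholders, and simple accuracy standing in for the fuller statistics the dashboard computes.

```python
import csv

from ibm_watson import NaturalLanguageClassifierV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholder credentials and service endpoint.
authenticator = IAMAuthenticator("YOUR_API_KEY")
nlc = NaturalLanguageClassifierV1(authenticator=authenticator)
nlc.set_service_url(
    "https://api.us-south.natural-language-classifier.watson.cloud.ibm.com")

# Hypothetical IDs of the two trained classifiers to compare.
CLASSIFIER_IDS = ["model_a_classifier_id", "model_b_classifier_id"]

# Test data: CSV rows of (text, ground-truth label).
with open("test_data.csv", newline="") as f:
    test_rows = [(text, label) for text, label in csv.reader(f)]

for classifier_id in CLASSIFIER_IDS:
    correct = 0
    for text, truth in test_rows:
        # classify() returns the service's ranked classes for the text;
        # count a hit when the top class matches the ground truth.
        result = nlc.classify(classifier_id, text).get_result()
        if result["top_class"] == truth:
            correct += 1
    print(f"{classifier_id}: accuracy = {correct / len(test_rows):.3f}")
```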

This code pattern can be extended to support other types of machine learning models.