After an introduction to Apache Spark™, this new workshop will guide you through building a Machine Learning model that predicts credit risk.
We will build the prediction model with the SparkML library, following these steps:
* Loading and visualizing the data
* Building the model with the SparkML API
Spark MLlib is Spark’s Machine Learning (ML) library.
Its goal is to make practical Machine Learning scalable and easy.
The library provides tools such as:
* ML Algorithms: common learning algorithms such as classification, regression, clustering …
* Featurization: feature extraction, transformation, dimensionality reduction …
* Pipelines: tools for building, evaluating and tuning ML pipelines
* Persistence: saving and loading algorithms, models and pipelines
* Utilities: linear algebra, statistics, data processing, etc.
This workshop will be led by Georges-Henri Moll, IBM Digital Developer Advocate – Data Scientist – Master Inventor (https://www.linkedin.com/in/georgeshenrimoll/?originalSubdomain=fr).