One of the challenges of deploying a recommendation engine at scale is that developers often need to integrate many disparate systems to meet the requirements for data storage, data processing, machine learning, model serving, and search and content filtering. Teams typically end up managing and maintaining many systems, and they often have to write custom components as well, particularly for model scoring.
To address this complexity, we’ve created a new developer pattern titled Build a recommender with Apache Spark and Elasticsearch. The pattern shows how to combine the data processing and machine learning capabilities of Apache Spark with the data storage, search, and custom ranking features of Elasticsearch to create a scalable, flexible recommendation engine.
Elasticsearch is a great fit not only for content metadata but also for time-series data, which is exactly what the user event data in a recommender system typically looks like. By taking advantage of Elasticsearch’s custom scoring functionality, the data storage, model scoring, and search systems can be combined into one. Elasticsearch integrates well with Spark, which in turn offers everything needed for large-scale data processing. The final piece of the puzzle is Spark’s MLlib machine learning library, which ships with a highly scalable implementation of alternating least squares (ALS), one of the most popular collaborative filtering models.
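To make the “model scoring inside the search engine” idea concrete, here is a minimal sketch in plain Python (all names and vectors are hypothetical, not from the pattern itself). A collaborative filtering model such as MLlib’s ALS learns a latent factor vector per user and per item, and the predicted preference is the dot product of the two. If the item vectors are stored in Elasticsearch, a custom scoring script can compute that same dot product at query time, so ranking and search filtering happen in one system:

```python
def dot(u, v):
    """Dot product of two equal-length latent factor vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical ALS output: 3-dimensional factors for one user
# and a handful of candidate items (in practice these would be
# indexed as fields on the item documents in Elasticsearch).
user_vector = [0.9, 0.1, 0.4]
item_vectors = {
    "movie_a": [0.8, 0.0, 0.5],
    "movie_b": [0.1, 0.9, 0.2],
    "movie_c": [0.7, 0.2, 0.6],
}

# Rank items for the user the way a custom Elasticsearch scoring
# script would: score each candidate by dot(user, item), descending.
ranked = sorted(
    item_vectors,
    key=lambda item: dot(user_vector, item_vectors[item]),
    reverse=True,
)
```

Because the score is an ordinary per-document function, the same query can layer business rules on top, for example filtering candidates by genre or availability before the personalized ranking is applied.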
Check out Build a recommender with Apache Spark and Elasticsearch for the building blocks to create a simpler, more integrated solution for a recommendation engine that provides the power and flexibility your business needs.