Building your own Distributed Streaming Recommendation Engine

06/11/2018 - 14:00 to 14:40
long talk (40 min)

Session abstract: 

Collaborative filtering is a well-known method for implementing recommendation engines. Modern techniques such as Alternating Least Squares (ALS) allow us to predict product ratings from large numbers of observations, but ALS is typically implemented as a (distributed) batch algorithm in which retraining must be performed on the entire data set. In scenarios where large amounts of data arrive as a stream, batch retraining can be problematic. In this talk, Rui will guide us through building a distributed streaming ALS implementation based on Stochastic Gradient Descent, in which the model is trained on observations as they arrive. The advantages of real-time streaming collaborative filtering will be discussed, as well as the scenarios where batch ALS might be preferable.
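
To make the idea of training "as observations arrive" concrete, here is a minimal single-process sketch of a per-observation Stochastic Gradient Descent update for a matrix-factorization recommender. It is not the implementation presented in the talk, and it is not distributed. The class name StreamingMF, its methods, and the hyperparameter values (rank, lr, reg) are assumptions chosen for illustration only.

```python
import numpy as np


class StreamingMF:
    """Hypothetical sketch: matrix factorization trained one rating at a time."""

    def __init__(self, rank=8, lr=0.05, reg=0.02, seed=0):
        self.rank = rank  # number of latent factors (assumed value)
        self.lr = lr      # SGD learning rate (assumed value)
        self.reg = reg    # L2 regularization strength (assumed value)
        self.rng = np.random.default_rng(seed)
        self.user_factors = {}  # user id -> latent vector
        self.item_factors = {}  # item id -> latent vector

    def _factors(self, table, key):
        # Lazily create a small random vector for users/items not seen before.
        if key not in table:
            table[key] = self.rng.normal(scale=0.1, size=self.rank)
        return table[key]

    def predict(self, user, item):
        p = self._factors(self.user_factors, user)
        q = self._factors(self.item_factors, item)
        return float(p @ q)

    def update(self, user, item, rating):
        """Apply one SGD step for a single (user, item, rating) observation.

        Returns the prediction error measured before the step.
        """
        p = self._factors(self.user_factors, user)
        q = self._factors(self.item_factors, item)
        err = rating - p @ q
        p_old = p.copy()  # use the pre-update user vector for the item step
        p += self.lr * (err * q - self.reg * p)      # in-place update
        q += self.lr * (err * p_old - self.reg * q)  # in-place update
        return float(err)


# Usage: feed observations one by one, as they would arrive from a stream.
model = StreamingMF()
stream = [("alice", "book", 5.0), ("bob", "book", 3.0), ("alice", "film", 4.0)]
for _ in range(200):  # replaying a tiny stream, only to show convergence
    for user, item, rating in stream:
        model.update(user, item, rating)
```

Each call to update touches only the two latent vectors involved in the observation, which is what allows the model to absorb new ratings without a full batch retrain. A batch ALS solver would instead recompute all factors from the complete rating set.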