Today's industrial systems produce more and more data every day. Companies are increasingly using Big Data technologies and data analysis to monitor their systems.
I will illustrate this trend with a predictive maintenance project on trains. In cities where millions of people use public transportation every day, avoiding faults on trains while they are in service is critical. The goal of the project is to predict faults in advance so that trains can be dispatched to the workshops accordingly, thus avoiding delays and reducing maintenance costs.
I'll explain our approach, which uses random forests and artificial neural networks built with Theano, then present the results and how we measure success. Finally, I'll describe how we progressively moved from developing models on historical data in Python to production, porting the models to a distributed environment using Spark.