TensorFlow is all kinds of fancy, from helping startups raise their Series A in Silicon Valley to detecting whether something is a cat. However, when things start to get “real” you may find yourself no longer dealing with mnist.csv, and instead needing to do large-scale data prep as well as training. This talk will explore how TensorFlow can be used in conjunction with Apache Spark, Flink, and Beam to create a full machine learning pipeline, including those annoying “feature engineering” and “data prep” components that we like to pretend don’t exist. We’ll also talk about how these feature prep stages need to be integrated into the serving layer.
This talk will also explore how Apache Arrow impacts cross-language development for big data, including things like deep learning. Even if you’re not trying to raise a round of funding in Silicon Valley, this talk will give you tools to tackle interesting machine learning problems at scale (or find more cats).