How do you measure the behavior of users firing 3,000 requests per second?
At bol.com, the biggest online retail platform in the Netherlands and Flanders, thousands of users visit the site daily. My team provides the measurement infrastructure that delivers near-real-time metrics on these customers' interactions with the platform.
Our new measuring system, Measuring 2.0, is built using open-source big data technologies such as Apache Flink, Kafka, Avro, and Parquet, supplemented by in-house technology.
At Berlin Buzzwords 2016, Niels Basjes introduced this project; we can now share the technical details of running it in production. During the talk I will guide you through how we kept our Kafka cluster resilient enough to handle our peak loads, how we used Apache Flink to show trending products on our website in real time, how we handle continuous changes in our measurement data structure using Avro, and how we use Parquet to deal with large volumes of data while keeping that data easily accessible. I will also share the lessons we learned along the way.