Eventually, time will kill your data pipeline

06/17/2019 - 16:30 to 17:10
Palais Atelier
long talk (40 min)

Session abstract: 

Race conditions, intermittent failures, daylight saving time, time zones, leap seconds, and overload conditions: time is a factor in many of the most annoying problems in computer systems. Data engineering is not exempt from these problems, and it also has a slew of time-related problems of its own. In this presentation, we will enumerate the time-related problems that we have seen cause trouble in the components of data processing systems, including data collection, batch processing, workflow orchestration, and stream processing. We will give examples of time-related incidents, and share tools and tricks for avoiding timing issues in data processing systems.