Big Data, Fast Data, Easy Data: distributed stream processing for everyone with KSQL, the streaming SQL engine for Apache Kafka

06/12/2018 - 14:50 to 15:30
Moon Lounge
long talk (40 min)

Session abstract: 

Modern businesses have data at their core, and this data is changing continuously. Stream processing is what allows you to harness this torrent of information in real time, and thousands of companies use Apache Kafka as the streaming platform to transform and reshape their industries. However, the world of stream processing still has a very high barrier to entry. Today’s most popular stream processing technologies require the user to write code in programming languages such as Java or Scala. This hard requirement on coding skills is preventing many companies from unlocking the full benefits of stream processing.

Imagine instead that all you need to get started with stream processing is a simple SQL statement, such as SELECT * FROM payments-kafka-stream WHERE fraudProbability > 0.8, so that you can detect anomalies and fraudulent activities in data feeds, monitor application behavior and infrastructure, conduct session-based analysis of user activities, and perform real-time ETL.

In this talk, I introduce the audience to KSQL, the open source streaming SQL engine for Apache Kafka. KSQL provides an easy and completely interactive SQL interface for data processing on Kafka -- no need to write any programming code. KSQL brings together the worlds of streams and databases by allowing you to work with your data in a stream and in a table format. Built on top of Kafka's Streams API, KSQL supports many powerful operations including filtering, transformations, aggregations, joins, windowing, sessionization, and much more. It is open source, distributed, scalable, fault-tolerant, and real-time. You will learn how KSQL makes it easy to get started with a wide range of stream processing use cases such as those described at the beginning. I cover how to get up and running with KSQL and explore the under-the-hood details of how it all works.
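
To give a flavor of the stream and table formats and of the filtering and windowed aggregation mentioned above, here is a minimal KSQL sketch. It is not taken from the talk: the topic name, the stream and table names, the columns, and the JSON value format are all assumptions made for illustration.

```sql
-- Hypothetical topic, stream, table, and column names; adjust to your own data.

-- Register an existing Kafka topic as a stream (assumes JSON-encoded values).
CREATE STREAM payments (userId VARCHAR, amount DOUBLE, fraudProbability DOUBLE)
  WITH (KAFKA_TOPIC='payments', VALUE_FORMAT='JSON');

-- Filtering: a continuous query that writes suspicious payments to a new stream.
CREATE STREAM suspicious_payments AS
  SELECT userId, amount, fraudProbability
  FROM payments
  WHERE fraudProbability > 0.8;

-- Windowed aggregation: number of payments per user in one-minute tumbling windows,
-- materialized as a table.
CREATE TABLE payments_per_minute AS
  SELECT userId, COUNT(*) AS paymentCount
  FROM payments
  WINDOW TUMBLING (SIZE 1 MINUTE)
  GROUP BY userId;
```

Statements like these would typically be entered in the KSQL command-line interface. Each CREATE ... AS SELECT keeps running and updates its result as new records arrive in the source topic.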