Program committee

Fran Bennett

Francine is a data scientist, and the CEO and cofounder of Mastodon C. Mastodon C are agile big data specialists, who offer the open source big data technology and data science skills which help organisations to realise the potential of their data, and who are currently building the Witan city data platform with the Greater London Authority.

Before founding Mastodon C, she spent a number of years working on big data analysis for Google, helping them to turn lots of data into even more money. She enjoys good coffee, running, sleeping as much as possible, and exploring large datasets.

William Benton

William Benton leads a team of data scientists and engineers at Red Hat where he has focused on enabling machine learning workflows and data processing pipelines in cloud-native environments. After stints in academia doing HPC and static bytecode analysis, he vowed never to touch anything looking like scientific code or the JVM again; in a pleasant professional irony, he’s liking both much better this time around. Will lives in the midwestern United States with his wife and three children and spends some of his spare time chasing light on bicycles or capturing it with cameras.

Mandy Chessell

Mandy Chessell is an IBM Distinguished Engineer and PMC leader of the ODPi Egeria and ODPi Data Governance projects. She is also an Apache Atlas committer. Her focus is on supporting organizations in their transformation towards becoming data-driven. This includes working with them to develop their strategy and architecture relating to the governance, integration and management of information. It was through this work that the vision for open metadata and governance was born, which eventually led to the two ODPi open source projects that she leads. More information about Mandy’s work and publications can be found on LinkedIn and her blog.

Ellen Friedman

Ellen Friedman is Principal Technologist with MapR Technologies, who provide a large scale data platform for AI & Analytics. Ellen is also a committer for the Apache Drill and Apache Mahout projects. With a PhD in Biochemistry, she has experience as a research scientist and has written about technical topics such as molecular biology, oceanography, machine learning and other big data topics. She is co-author of O'Reilly publications including "Machine Learning Logistics", "AI & Analytics in Production", "Introduction to Apache Flink" and "Streaming Architecture".

Fabian Hüske

Fabian Hueske is a committer and PMC member of the Apache Flink project and has been contributing to Flink since its earliest days. Fabian is a cofounder of data Artisans, a Berlin-based startup devoted to fostering Flink, where he works as a software engineer and contributes to Apache Flink. He holds a PhD in computer science from TU Berlin and is currently writing a book about “Stream Processing with Apache Flink”.

Holden Karau

Holden is a transgender Canadian open source developer advocate @ Google with a focus on Apache Spark, Beam, and related "big data" tools. She is the co-author of Learning Spark, High Performance Spark, and another Spark book that's a bit more out of date. She is a committer on the Apache Spark, SystemML, and Mahout projects. She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal.

Billie Rinaldi

Billie Rinaldi is a Principal Software Engineer I at Hortonworks, currently prototyping new features related to long-running services and containers in Apache Hadoop YARN. Prior to August 2012, Billie engaged in big data science and research at the National Security Agency, where she provided early leadership for Apache Accumulo. Billie is a member of the Apache Software Foundation and a committer for Apache Hadoop and a number of other Apache projects in the Hadoop ecosystem. She holds a Ph.D. in applied mathematics from Rensselaer Polytechnic Institute.

Monica Sarbu

Monica Sarbu currently leads the Ingest team at Elastic, which is responsible for the popular open source projects Beats and Logstash, as well as turnkey solutions built on top of the Beats platform. She is the founder of the Packetbeat open source project and co-creator of Beats.

The solutions she leads at Elastic, for logging, monitoring, and security, all work by collecting large amounts of data into Elasticsearch and extracting domain-specific insights from the data. When she’s not busy with the data grokking, she enjoys spending time with her daughter and travelling the world.

Georgi Knox

Georgi is a back-end engineer who works on the Platforms team at GitHub. Originally from Sydney, Georgi now lives in Brooklyn, NY. Her current nerd crush is on distributed systems and Go. When not geeking out she likes to lose convincingly in her shuffleboard league, drink single-origin flat whites and dress up her cats in costumes.

Grant Ingersoll

Grant is the CTO and co-founder of Lucidworks, co-author of “Taming Text” from Manning Publications, co-founder of Apache Mahout and a long-standing committer on the Apache Lucene and Solr open source projects. Grant’s experience includes engineering a variety of search, question answering and natural language processing applications for a variety of domains and languages. He earned his B.S. from Amherst College in Math and Computer Science and his M.S. in Computer Science from Syracuse University. In his spare time, he cycles and rock climbs.

Sean Treadway

Sean currently heads the architectural evolution of SoundCloud backed by a diverse history of engineering and operating desktop, server and web applications. He tackles the challenges of scaling to multi-millions of users while maintaining rapid development at SoundCloud - a social sound platform for anyone to record, promote and share their sounds on all devices, including the web.

Vasia Kalavri

Vasia Kalavri is a Postdoctoral researcher at the ETH Zurich Systems group, where she is working on distributed data processing, data center performance, and graph streaming algorithms. She is a PMC member of Apache Flink and a core developer of its graph processing API, Gelly. Vasia has a PhD in Distributed Computing from KTH Stockholm and UCLouvain, Belgium, and she has previously interned at Telefonica Research and data Artisans.

Tugdual Grall

Tugdual Grall is a Technical Evangelist at MapR, an open source advocate and a passionate developer. He currently works with the European developer communities to ease MapR, Hadoop and NoSQL adoption. Before joining MapR, Tug was Technical Evangelist at MongoDB and Couchbase. Tug has also worked as CTO at eXo Platform and as JavaEE product manager and software engineer at Oracle.
Tugdual is co-founder of the Nantes JUG (Java User Group), which has held monthly meetings about the Java ecosystem since 2008. Tugdual also writes a blog. Twitter/GitHub: @tgrall

Gary Dusbabek

An Apache Cassandra committer and PMC member, Gary Dusbabek is a life-long programmer specializing in distributed systems. Past experience includes working with large-scale text and image indexes in the newspaper industry and building a multi-data center distributed metrics and monitoring system for a large Cloud provider. Gary is the principal architect behind the open source Blueflood metrics platform and is currently building platforms for clients at Silicon Valley Data Science.

Owen O'Malley

Owen is a software architect who has worked exclusively on Hadoop since the project's start. He was the first committer added to Hadoop and was the original chair of the Hadoop Project Management Committee. In July 2011, he co-founded Hortonworks, which is accelerating development and adoption of Hadoop for the enterprise. Before working on Hadoop, he worked on Yahoo Search's WebMap project, which builds a graph of the known web and applies many heuristics to the entire graph that control search. Prior to Yahoo, he wandered between testing, static analysis, distributed configuration management, and software model checking. He received his PhD in Software Engineering from University of California, Irvine.

Ema Orhian

Ema is a passionate Big Data Engineer and Consultant contributing to open source projects. She is one of the co-founders of the Big Data Research Group, a group that provides open source solutions and proofs of concept in the Big Data area. Ema is actively involved in organizing technical events and speaks at local and international conferences and meetups.

Ted Dunning

Ted has been involved with a number of startups, the latest being MapR Technologies, where he is Chief Application Architect working on advanced Hadoop-related technologies. He is also a PMC member for the Apache Zookeeper and Mahout projects. Opinionated about software and data-mining and passionate about open source, he is an active participant in Hadoop and related communities and loves helping projects get going with new technologies.

Steve Loughran

Steve is a member of technical staff at Hortonworks, where he works on leading-edge developments within the Hadoop ecosystem, including service availability, cloud infrastructure integration, and emerging layers in the Hadoop stack. Previously, he worked at HP Laboratories. He is the author of Ant in Action, a member of the Apache Software Foundation, an active committer on the Hadoop core projects, and a lapsed committer on Apache Ant and Axis. He lives and works in Bristol, England. For fun he falls off bicycles in the local woodland.

Leslie Hawthorn

An internationally known community manager, speaker and author, Leslie Hawthorn has spent the past decade creating, cultivating and enabling open source communities. She created the world’s first initiative to involve pre-university students in open source software development, launched Google’s #2 Developer Blog, received an O’Reilly Open Source Award in 2010 and gave a few great talks on many things open source. In August 2013, she joined Elasticsearch as Community Manager.

Erik Hatcher

Erik Hatcher co-authored “Lucene in Action” and “Java Development with Ant”. Erik is an active member of the Lucene community – a leading Lucene and Solr committer, member of the Lucene Project Management Committee, member of the Apache Software Foundation as well as a frequent invited speaker at various industry events.

Erik is a co-founder of LucidWorks, dedicated to Lucene/Solr support, services, and training.

Michael Stack

Michael is on the Apache HBase and Hadoop Project Management Committees and an Apache Software Foundation member. He works for Apple's Open Source Technologies group out of San Francisco.


Simon Willnauer

Simon is an Elasticsearch & Apache Lucene core committer and Apache Software Foundation Member. He has been involved with Lucene since 2006 and has contributed to several other open source projects within and without the Apache Software Foundation. During the last couple of years he worked on design and implementation of scalable information retrieval systems and search infrastructure. His main interests are performance optimizations and concurrency. He studied Computer Science at the University of Applied Sciences Berlin. He is a co-founder & member of technical staff at Elasticsearch and a co-founder of the Berlin Buzzwords conference.

Jan Lehnardt

Jan is an Open Source developer. He works on all parts of the web stack and tries to make things easier for everyone. He’s a core contributor to Apache CouchDB, a co-curator for JSConf EU and lives in Berlin.

Isabel Drost-Fromm

Isabel is a member of the Apache Software Foundation. She is co-founder of the Berlin Buzzwords conference and the Apache Hadoop Get Together in Berlin, and was co-organiser of the first NoSQL meetup in Europe. Isabel co-founded and is an active committer of Apache Mahout. She is actively engaged with communities of several big data and search related Apache projects, e.g. Apache Lucene and Apache Hadoop. She is a regular speaker at renowned conferences on topics related to free software development, scalability, Apache Lucene, Apache Hadoop and Apache Mahout. Isabel would like to thank Elasticsearch GmbH for supporting the conference by donating part of her working hours to Berlin Buzzwords preparation.

Jim Webber

Dr. Jim Webber is Chief Scientist with Neo Technology, the company behind the popular open source graph database Neo4j, where he works on graph database server technology and writes open source software. Jim is interested in using big graphs like the Web for building distributed systems, which led him to being a co-author on the book REST in Practice, having previously written Developing Enterprise Web Services - An Architect's Guide. Jim is an active speaker, presenting regularly around the world. He also writes a blog and tweets often @jimwebber.

Jonathan Ellis

Jonathan is CTO and co-founder at DataStax as well as Project Chair of Apache Cassandra. Prior to his work on Cassandra, Jonathan built a multi-petabyte, scalable storage system based on Reed-Solomon encoding for backup provider Mozy.