Hops in the Cloud

06/18/2019 - 11:00 to 11:40
Palais Atelier
long talk (40 min)

Session abstract: 

Hops is a European open-source, next-generation distribution of Apache Hadoop that is being repurposed for the cloud. In this talk, we will walk through some of the recent technical developments in Hops, including solving the small files problem by stuffing them into metadata using NVMe disks, free-text search of the file system with extended metadata (this is great for automated annotation of millions of images and then finding them in milliseconds with consistent), and, most interestingly, data-center-level HA for HopsFS with millions of filesystem operations per second on real industrial workloads. So yes, we will tell you why a POSIX-style hierarchical filesystem with indexed, extensible metadata is superior to an object store. Finally, we will show you what else you can do with Hops, and how we built Hopsworks, a horizontally scalable, secure platform for Data and AI, using Hops' extended metadata.