Linux containers are increasingly popular with application developers: they offer improved elasticity, fault-tolerance, and portability between different public and private clouds, along with an unbeatable development workflow. It’s hard to imagine a technology that has had more impact on application developers in the last decade than containers, with the possible exception of ubiquitous analytics. Indeed, analytics is no longer a separate workload that occasionally generates reports on things that happened yesterday; instead, it pulses beneath the rhythms of contemporary business and supports today’s most interesting and vital applications. Since applications depend on analytic capabilities, it makes good sense to deploy our data-processing frameworks alongside our applications.
In this talk, you’ll learn from our expertise deploying Apache Spark and other data-processing frameworks in Linux containers on Kubernetes. We’ll explain what containers are and why you should care about them. We'll cover the benefits of containerizing applications, architectures for analytic applications that make sense in containers, and how to handle external data sources. You’ll also get practical advice on how to ensure security and isolation, how to achieve high performance, and how to sidestep and negotiate potential challenges. Throughout the talk, we’ll refer back to concrete lessons we’ve learned about containerized analytic jobs ranging from interactive notebooks to production applications. You’ll leave inspired and enabled to deploy high-performance analytic applications without giving up the security you need or the developer-friendly workflow you want.