The majority of production Hadoop clusters today run in a secure environment. In our experience, most of them contain sensitive data that needs to be protected. The Hadoop ecosystem provides lots of choices, but no single solution, piece of documentation or book covers everything needed to secure your Hadoop environment. I’m going to talk about a comprehensive security design we created for one of our customers.
This security concept covers everything from the operating system layer through authentication, authorization, auditing, encryption, data masking, row- and column-based access, backup and DR, acquisition processes and more.
The technologies involved are fairly standard, but I’m going to talk about CDH, HDP and BigInsights/IOP, as well as tools like SPSS, Qlik and Informatica, which you’ll frequently find in use at your customers or in your own organization.
In this talk I’ll share the lessons we have learned while securing Hadoop environments. Securing a cluster is much more than just enabling Kerberos, and it includes lots of organizational red tape to get approvals and information. I hope that after listening to this talk you can avoid a few of these pitfalls the next time you have to do it.
This talk combines business and technical aspects without boring either half of the audience, so everyone is welcome and should get something out of it.