Building a data lake at a bank

06/11/2018 - 14:50 to 15:10
Moon Lounge
short talk (20 min)

Session abstract:

Banks are data companies. All the services they provide and the processes they execute eventually boil down to processing data and making decisions based on the information derived from it. Technology is what allows them to do that at scale. Unlike at other technology companies, though, data at banks tends to be the Achilles heel rather than the biggest asset. To meet regulatory and compliance requirements, banks are obliged to handle information properly and provide it to regulators upon request. To stay competitive, they have to innovate to find new sources of revenue or to streamline business processes. These often contradictory forces, combined with legacy IT landscapes, frequently produce chaotic environments.

At SolarisBank, we are fortunate not to have the legacy technology that established banks struggle with. Hence, we have the opportunity to build data infrastructure with a modern tech stack and according to contemporary architectural principles. Nevertheless, regulatory and legal requirements often run counter to design patterns you would normally take for granted. For instance, it might feel like common sense to keep large sets of raw, immutable data around indefinitely for future analytics purposes. Your Data Protection Officer, though, will have a very different opinion on that. This talk is about the engineering aspects of building a data lake in a highly regulated environment. Using the SolarisBank case as an example, I will discuss the impact of regulatory requirements on system architecture and present design patterns for complying without wreaking havoc on usability.