Knowledge level: Intermediate
Special requirements: Personal laptop
The volume of data handled by the industry keeps growing, now reaching tens of zettabytes, and data is spreading into more and more areas. This rapidly changing ecosystem pushes us to look for new ways of turning our data storage into a pool of valuable data. In this workshop, you will learn how a mixture of open source technologies and proven solutions can provide a full-stack service with a robust data platform that manages information productively and with high quality.
The first part of the workshop will introduce you to Datalakes and give you an accurate overview of data management systems within highly regulated industries. The following sections will cover the hot topics of a Datalake ecosystem: data ingestion, advanced analytics, data integration, real-time events, open metadata governance, data lineage, containers, and cloud.
The second part of the workshop provides a hands-on introduction to designing a mock-up Datalake platform. While preparing the Datalake design, participants will decide on the tooling and concepts that will help them build its foundation.