Our Journey to a Modern Analytics Processing Pipeline
Cross-Device Analytics (CDA) provides person-centric reporting in Adobe Analytics by grouping devices that belong to the same identity.
In 2016, when we started working on this solution, the technology stack was innovative, as it was composed entirely of orchestrated containerized workloads. The underlying infrastructure was built on top of Azure and enhanced with services such as DC/OS for service orchestration and virtual machine management, Mesos for resource allocation, Marathon for service discovery, ZooKeeper, HDFS, HBase, and Kafka.
Today, CDA handles billions of events per day. The technological landscape has changed tremendously since 2016, however, and this pushed us to challenge the status quo and migrate towards Databricks and Delta.
Follow our journey in modernizing a large-scale data processing pipeline using Databricks technologies. Hear about the challenges and the trade-offs we have faced.