Intro to Big Data through use cases and architecture examples
When teaching Big Data trainings and workshops, one of the questions we hear most often is: can you give us an example of how a certain architecture was built? What made that company choose Apache Flink or Spark for stream processing, or why was HDFS not a good choice, or not enough on its own, for their storage architecture?
In this workshop we aim to go through some use cases and architectures for big data solutions and briefly discuss the role that certain solutions (e.g. Kafka, HDFS, Spark, Cassandra) play. The workshop will also have a hands-on part: we will use Kafka as a messaging bus and, through KSQL, as a stream processor; we will then capture data from Kafka topics using Spark and persist it in a local repository.
Content of the workshop:
- What a Big Data solution architecture could look like
- Examples of use cases and architectures
- Overview of the usual suspects in a big data architecture:
  - The distributed messaging bus: Kafka
  - The distributed processing: Spark
  - The distributed storage: HDFS, Cassandra
- Hands-on session with Kafka, KSQL and Spark
Prior knowledge of IT architecture and a bit of SQL and Scala would be a plus, but is not mandatory.
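
To give a feel for the data flow of the hands-on part, here is a minimal sketch in plain Python. It does not use Kafka, KSQL or Spark at all: the `Topic` class, the `filter_stream` and `persist` helpers, and the click records are hypothetical stand-ins that only mimic the three roles (messaging bus, stream processor, persistence to a local file). The real session relies on the actual tools.

```python
import json
import tempfile
from pathlib import Path


class Topic:
    """Toy stand-in for a Kafka topic: an append-only log read by offset."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def produce(self, record):
        self.log.append(record)

    def consume(self, offset=0):
        # Consumers track their own position, so reading does not remove records.
        return self.log[offset:]


def filter_stream(source, sink, predicate):
    """Stream-processor role: read one topic, write matching records to another."""
    for record in source.consume():
        if predicate(record):
            sink.produce(record)


def persist(topic, path):
    """Persistence role: read a topic and write it to a local JSON-lines file."""
    with open(path, "w", encoding="utf-8") as out:
        for record in topic.consume():
            out.write(json.dumps(record) + "\n")
    return len(topic.log)


def run_pipeline(path):
    # Made-up sample events, purely for illustration.
    clicks = Topic("clicks")
    for user, page in [("ana", "/home"), ("bob", "/cart"), ("ana", "/cart")]:
        clicks.produce({"user": user, "page": page})

    cart_views = Topic("cart_views")
    filter_stream(clicks, cart_views, lambda r: r["page"] == "/cart")
    return persist(cart_views, path)


if __name__ == "__main__":
    target = Path(tempfile.mkdtemp()) / "cart_views.jsonl"
    print(run_pipeline(target), "records written to", target)
```

The point of the sketch is the separation of concerns: producers and consumers only know about topics, the filtering step derives a new topic from an existing one, and persistence is just another consumer. That is the same shape the hands-on session builds with the real components.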