AM6: Architecting a Big Data Platform

What are the essential components of a data platform? This tutorial will explain how the various parts of the Hadoop, Spark, and big data ecosystem fit together in production to create a data platform supporting batch, interactive, and real-time analytical workloads.

By tracing the flow of data from source to output, we’ll explore the options and considerations for components, including:

Acquisition: from internal and external data sources
Ingestion: offline and real-time processing
Storage
Analytics: batch and interactive
Providing data services: exposing data to applications

We’ll also give advice on:

Tool selection
The function of the major Hadoop components and other big data technologies such as Spark and Kafka
Integration with legacy systems

With over 15 years of experience in advanced analytical applications and architecture, John Akred is dedicated to helping organizations become more data-driven. He combines deep expertise in analytics and data science with business acumen and dynamic engineering leadership.

A leading expert on big data architecture and Hadoop, Stephen O'Sullivan brings over 20 years of experience creating scalable, high-availability data and application solutions. A veteran of WalmartLabs, Sun, and Yahoo!, Stephen leads data architecture and infrastructure.

Mark Mims has extensive experience architecting and implementing data science solutions across a variety of industries. His passion is data plumbing, where data science meets the real world of DevOps and infrastructure engineering.