AM6: Architecting a Big Data Platform
John Akred
Silicon Valley Data Science
Stephen O'Sullivan
VP, Engineering
Silicon Valley Data Science
Mark Mims
Principal Data Engineer
Silicon Valley Data Science


Monday, April 3, 2017
08:30 AM - 11:45 AM

Level: Intermediate

What are the essential components of a data platform? This tutorial will explain how the various parts of the Hadoop, Spark, and big data ecosystem fit together in production to create a data platform supporting batch, interactive, and real-time analytical workloads.

By tracing the flow of data from source to output, we’ll explore the options and considerations for components, including:

  • Acquisition: from internal and external data sources
  • Ingestion: offline and real-time processing
  • Storage
  • Analytics: batch and interactive
  • Providing data services: exposing data to applications

We’ll also give advice on:

  • Tool selection
  • The function of the major Hadoop components and other big data technologies such as Spark and Kafka
  • Integration with legacy systems

With over 15 years in advanced analytical applications and architecture, John Akred is dedicated to helping organizations become more data-driven. He combines deep expertise in analytics and data science with business acumen and dynamic engineering leadership.

A leading expert on big data architecture and Hadoop, Stephen O'Sullivan brings over 20 years of experience creating scalable, high-availability data and application solutions. A veteran of WalmartLabs, Sun, and Yahoo!, Stephen leads data architecture and infrastructure.

Mark Mims has extensive experience architecting and implementing data science solutions across a variety of industries. His passion is Data Plumbing, where Data Science meets the real world of DevOps and Infrastructure Engineering.
