Instant and Repeatable Data Platforms
Heather Nelson
Senior Solution Architect
Silicon Valley Data Science
 
Mark Mims
Principal Data Engineer
Silicon Valley Data Science
 


 

Wednesday, April 5, 2017
11:45 AM - 12:30 PM

Level:  Intermediate


Configuring a data platform and data science environment can be a tedious, error-prone process. It spans development, continuous integration, QA, staging, and production environments, and each often has to be configured from scratch. By combining cloud platforms such as AWS or Azure with Terraform and Ansible, we can create a repeatable data science infrastructure.

In this talk, we'll discuss our "push button" infrastructure tool and how attendees can use it in their own projects to create a cloud-agnostic environment that spins up quickly and is easy to configure as required.

We will cover:

  • Use cases, such as bringing up the same cluster repeatedly or recovering from a disaster
  • How to parameterize your cloud environment
  • Creating a data lab for the data scientist, with all the tools they require for their exploration
  • The development and release process, including integration testing
  • How to model costs in real time to analyze price and desired performance


Heather Nelson is a Senior Solution Architect at Silicon Valley Data Science. A problem solver by nature, Heather is passionate about helping organizations leverage data to drive competitive advantage. She draws from a diverse background in business and technology consulting to find the best solutions for her clients’ toughest data problems.

Mark Mims has extensive experience architecting and implementing data science solutions across a variety of industries. His passion is Data Plumbing, where Data Science meets the real world of DevOps and Infrastructure Engineering.


   