Big Data Quality
Laura Sebastian-Coleman
Data Quality Center of Excellence Lead


Wednesday, April 5, 2017
11:45 AM - 12:30 PM

Level:  Intermediate

Many organizations are ready to take advantage of data lakes to store more data at a fraction of the cost. Data lakes promise organizations that they will be able to magically integrate data with a fraction of the work. Almost all promoters of data lakes have been cavalier about the need to manage the quality of data in such environments. But data quality always matters. Gartner reports that data lakes carry "substantial risks," the most important of which is the "inability to determine the quality of data." This session offers a case study of how Cigna, a major health care insurer, is approaching this challenge. Participants will learn:
  • Risks of ignoring data quality in data lakes
  • The role of ingestion engines in managing data quality in a data lake
  • How to assess data lake data quality
  • The benefits of managing data quality upfront

Laura Sebastian-Coleman, Data Quality Center of Excellence Lead at Cigna, has worked on data quality in large health care analytic data warehouses since 2003. Cigna is a global health service company dedicated to helping people improve their health, well-being and sense of security. Laura has implemented data quality metrics and reporting, launched and facilitated data quality working groups, and contributed to data consumer training programs. She has led efforts to establish data standards and to manage metadata for large analytic data warehouses. In 2009, she led a group of analysts at Optum in developing the original Data Quality Assessment Framework (DQAF), which is the basis for her book "Measuring Data Quality for Ongoing Improvement" (Morgan Kaufmann, 2013). She is the DAMA Publications Director and the 2015 recipient of the IAIDQ Distinguished Member Award.
