Data Quality At Scale

This course gives learners a comprehensive understanding of data observability and its significance in data-driven decision-making.

Meet your instructor Chad Sanderson

Chad writes on Data Management, Contracts, and Product monthly. He believes that applying product thinking to holistic data challenges is the only way to make trustworthy decisions at scale. He has built feature stores, experimentation platforms, metrics layers, streaming platforms, analytics tools, data discovery systems, and workflow development platforms.

He has also implemented open-source and SaaS products (early and late stage) and has built cutting-edge technology from the ground up.

Week 1

Detecting Problems in the Data Pipeline

Understanding where in the data pipeline problems occur, and what people's roles are, is a crucial first step to detecting and preventing data quality issues. In this session, we will:

  • Map out key stakeholders’ roles in a data pipeline
  • Discover where in the pipeline data quality issues can occur and why
  • Learn how to detect data quality issues from case studies such as Convoy

By the end of the session, learners will be able to:

  • Map out key stakeholders’ roles in your data pipeline
  • Describe where in the pipeline data quality issues can occur and why

Week 2

Creating Requirements for your Data Quality System

Garbage in, garbage out: the quality of your data model is determined by the quality of its data inputs. In this session, we will:

  • Discuss how you can identify the type of data you need for your model to run effectively
  • Determine the requirements for a data quality system
  • Get to know specific tools & frameworks, such as medallion data infrastructure, data warehouses, data marts, data modeling and data mesh

By the end of the session, learners will be able to:

  • Identify what data you need and where in the pipeline it is processed (constraints)
  • Define the requirements for your data quality system

Week 3

Architecting a Data Observability System

With your requirements defined, we can now move on to implementing the technical solution for your data observability system. In this session, we will:

  • Contrast data observability with data contracts
  • Review the technical components of observability, including alerting systems, ownership, and process, as well as the three pillars: logs, metrics, and traces

By the end of the session, learners will be able to:

  • Decide whether you need an observability system or a data contract
  • Architect an observability system
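
To make the "metrics" and "alerting" components concrete, here is a minimal Python sketch of two common observability checks: freshness and null rate. It is an illustration only, not the course's implementation. The function name `check_batch`, the field names `weight_kg` and `loaded_at`, and both thresholds are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def check_batch(rows, now, field="weight_kg", ts_field="loaded_at",
                max_age=timedelta(hours=1), max_null_rate=0.1):
    """Return a list of alert messages for one batch of rows (hypothetical checks)."""
    if not rows:
        return ["no rows received"]

    alerts = []

    # Freshness: how old is the newest row in the batch?
    age = now - max(row[ts_field] for row in rows)
    if age > max_age:
        alerts.append(f"stale data: newest row is {age} old")

    # Null rate: what share of rows is missing the monitored field?
    null_rate = sum(row.get(field) is None for row in rows) / len(rows)
    if null_rate > max_null_rate:
        alerts.append(f"null rate for {field} is {null_rate:.0%}")

    return alerts


# Example batch: the newest row is two hours old and half the weights are missing.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
rows = [
    {"weight_kg": 12.5, "loaded_at": now - timedelta(hours=3)},
    {"weight_kg": None, "loaded_at": now - timedelta(hours=2)},
]
for alert in check_batch(rows, now):
    print(alert)
```

In a real system, the returned messages would be routed to whoever owns the dataset, which is where the ownership and process components come in.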

Week 4

Developing Data Contracts

Data contracts are API-like agreements between software engineers who produce operational data and data consumers who use it for business-critical analytics. In this session, we will:

  • Review a CDC-based implementation of entity-based data contracts, covering contract definition, schema enforcement, fulfillment and monitoring
  • Discuss the elements of an effective data contract definition as code, including using the Transactional Outbox pattern
  • Learn from guest lecturer Adrian Kreuziger

By the end of the session, learners will be able to:

  • Define a data contract (as code) between data producers and data consumers
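
As a rough idea of what "a data contract as code" can look like, here is a minimal Python sketch. It is not taken from the course material: the `shipment` entity, its fields, the owner name, and the `validate` helper are all hypothetical, and a production contract would more likely be expressed in a schema format such as Avro, Protobuf, or JSON Schema.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ContractField:
    name: str
    type: type
    required: bool = True


@dataclass(frozen=True)
class DataContract:
    entity: str
    version: int
    owner: str
    fields: tuple[ContractField, ...]

    def validate(self, record: dict[str, Any]) -> list[str]:
        """Return a list of violations; an empty list means the record conforms."""
        errors = []
        for field in self.fields:
            value = record.get(field.name)
            if value is None:
                if field.required:
                    errors.append(f"missing required field: {field.name}")
                continue
            if not isinstance(value, field.type):
                errors.append(
                    f"{field.name}: expected {field.type.__name__}, "
                    f"got {type(value).__name__}"
                )
        return errors


# Hypothetical contract agreed between a producing team and its data consumers.
shipment_contract = DataContract(
    entity="shipment",
    version=1,
    owner="shipments-team",
    fields=(
        ContractField("shipment_id", str),
        ContractField("weight_kg", float),
        ContractField("delivered_at", str, required=False),
    ),
)

good = {"shipment_id": "s-1", "weight_kg": 12.5}
bad = {"shipment_id": 42}
print(shipment_contract.validate(good))
print(shipment_contract.validate(bad))
```

Keeping the contract in version control next to the producing service means changes to it go through the same review process as any other code change.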

Week 5

Implementing Data Contracts

With a well-defined data contract in place, you need to enforce it in production and monitor it for any bugs slipping through. In this session, we will:

  • Learn how to enforce contracts in our CI/CD pipeline as part of the normal deployment process
  • Review an implementation for the enforcement of semantic contracts and the need for good monitoring
  • Discuss open-source tools for data contracts such as Apache Flink, Apache Airflow, dbt, and Great Expectations
  • Bring it all together: review how to implement an end-to-end data quality system
  • Learn from guest lecturer Daniel Dicker

By the end of the session, learners will be able to:

  • Enforce, fulfill, and monitor data contracts for a continuous data quality system
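
One way to picture contract enforcement in CI/CD is a check that compares the schema a pull request proposes against the agreed contract and fails the build on breaking changes. The Python sketch below assumes schemas are plain name-to-type mappings. The function names `breaking_changes` and `ci_exit_code` and the example fields are hypothetical, not part of any specific tool.

```python
def breaking_changes(contract: dict, proposed: dict) -> list:
    """List changes in `proposed` that would break consumers of `contract`.

    Removed fields and changed types are treated as breaking;
    newly added fields are treated as safe.
    """
    problems = []
    for name, expected_type in contract.items():
        if name not in proposed:
            problems.append(f"removed field: {name}")
        elif proposed[name] != expected_type:
            problems.append(
                f"type change on {name}: {expected_type} -> {proposed[name]}"
            )
    return problems


def ci_exit_code(contract: dict, proposed: dict) -> int:
    """Return 1 (fail the build) if there are breaking changes, else 0."""
    problems = breaking_changes(contract, proposed)
    for problem in problems:
        print(f"contract violation: {problem}")
    return 1 if problems else 0


# Hypothetical contract and a proposed schema from a pull request.
contract = {"shipment_id": "string", "weight_kg": "double"}
proposed = {"shipment_id": "string", "weight_kg": "string", "carrier": "string"}

print(ci_exit_code(contract, proposed))
```

A CI job could pass the returned value to `sys.exit` so that a breaking change blocks the deployment until producers and consumers agree on a new contract version.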

A learning approach that aligns with your company values.

Self-guided

Bite-sized daily lessons that you can easily fit into your schedule. Each day, we release new lessons no longer than 15 minutes. Our lessons are carefully curated to ensure that they're both engaging and informative, allowing you to learn something new every day, and at your own pace.

Collaborative

Collaborate with other engineers from around the world, providing you with a unique opportunity to learn from others and build your professional network.

Engaging

Our live learning sessions are designed to be interactive and engaging, giving you the opportunity to ask questions and interact with subject-matter experts.

Project-based

Learn by solving real-world problems. Our courses are designed to get rid of the fluff and provide you with the most relevant information to help you apply your learning.

Frequently asked questions

Are all sessions live?

Yes, all sessions during the cohort will be live with the instructor. We will also record each session and make the recordings available to everyone in the cohort.

What is the time commitment?

Our courses typically have 2-4 modules, each taking approximately 2 hours per week, which you can block out on your calendar during your workday. You also get some take-home projects that you can complete at your own pace.

Do I earn a certificate for this course?

Of course! Once you’ve completed the course modules, you will get a certificate of completion that you can showcase to the world.

What is included in a LearnCrunch membership?

With your LearnCrunch yearly membership, you get access to our live instructor-led cohorts, our catalog of self-guided courses, unlimited real-world projects to learn from and master new skills, exclusive live events for members and a vetted global community of experts and peers.

How much does a LearnCrunch membership cost?

An individual LearnCrunch membership costs $1,000 USD per year. If your company is interested in purchasing multiple seats, please contact hello@learncrunch.com.

Can I expense this course?

Yes. Most LearnCrunch members have expensed this course through their Learning & Development budget, similar to how you expense conferences. You can use this email template to request expense approval from your manager.

I have more questions. Get in touch with us!

If you have more questions, email us at hello@learncrunch.com.

Book a call with us

Victor Chima

Co-Founder at LearnCrunch

Connect with us to learn how we can help you grow your team.
