Data Quality Dimensions Cheat Sheet

In this cheat sheet, you'll learn about data quality dimensions and how they help you ensure that your data is fit for purpose.
Mar 2023  · 3 min read

What are Data Quality Dimensions?

Data quality is a measure of the degree to which data is fit for purpose. Good data quality generates trust in data. A data quality dimension is a measure of one specific attribute of a dataset's quality.

Completeness

Completeness measures the degree to which all expected records in a dataset are present. At a data element level, completeness is the degree to which all records have data populated when expected.

Completeness Example

All records must have a value populated in the CustomerName field.
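
The rule above could be checked along these lines. This is a minimal sketch, assuming records are plain Python dictionaries; the sample rows and the helper name `completeness_ratio` are hypothetical, and treating empty or whitespace-only strings as missing is an assumption rather than part of the rule.

```python
def completeness_ratio(records, field):
    """Return the share of records with a non-empty value in `field`."""
    if not records:
        return 1.0  # assumption: an empty dataset has nothing missing
    populated = sum(
        1
        for r in records
        if r.get(field) is not None and str(r[field]).strip() != ""
    )
    return populated / len(records)


# Hypothetical sample data for illustration only
customers = [
    {"CustomerID": 1, "CustomerName": "Ada Lovelace"},
    {"CustomerID": 2, "CustomerName": ""},
    {"CustomerID": 3, "CustomerName": None},
    {"CustomerID": 4, "CustomerName": "Alan Turing"},
]

name_completeness = completeness_ratio(customers, "CustomerName")
print(f"CustomerName completeness: {name_completeness:.0%}")
```

A ratio below 1.0 means at least one record violates the rule.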

Validity

Validity measures the degree to which the values in a data element conform to the rules defined for that element, such as an allowed type, format, range, or set of values.

Validity Example

  • CustomerBirthDate value must be a date in the past.
  • CustomerAccountType value must be either Loan or Deposit.
  • LatestAccountOpenDate value must be a date in the past.
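
The three rules above could be expressed as a single record-level check. This is a rough sketch, assuming dates are already parsed into `datetime.date` objects; the sample records, the fixed `today` value, and the helper name `is_valid` are hypothetical.

```python
from datetime import date

ALLOWED_ACCOUNT_TYPES = {"Loan", "Deposit"}


def is_valid(record, today):
    """Apply the three validity rules to one customer record."""
    return (
        record["CustomerBirthDate"] < today
        and record["CustomerAccountType"] in ALLOWED_ACCOUNT_TYPES
        and record["LatestAccountOpenDate"] < today
    )


# Hypothetical sample data; `today` is fixed so the result is reproducible
today = date(2023, 3, 1)
records = [
    {"CustomerBirthDate": date(1990, 5, 1), "CustomerAccountType": "Loan",
     "LatestAccountOpenDate": date(2020, 1, 15)},
    {"CustomerBirthDate": date(2999, 1, 1), "CustomerAccountType": "Deposit",
     "LatestAccountOpenDate": date(2021, 6, 30)},  # birth date in the future
    {"CustomerBirthDate": date(1985, 9, 9), "CustomerAccountType": "Savings",
     "LatestAccountOpenDate": date(2019, 3, 3)},   # account type not allowed
]

valid_flags = [is_valid(r, today) for r in records]
validity_rate = sum(valid_flags) / len(records)
print(valid_flags, f"{validity_rate:.0%}")
```

In practice `today` would usually come from `date.today()`.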

Uniqueness

Uniqueness measures the degree to which the records in a dataset are not duplicated.

Uniqueness Example

All records must have a unique CustomerID and CustomerName.
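
One way to look for violations of this rule is to count how often each key occurs. The sketch below assumes the rule refers to the combination of CustomerID and CustomerName, which is one possible reading; the sample rows and the helper name `duplicate_keys` are hypothetical.

```python
from collections import Counter


def duplicate_keys(records, key_fields):
    """Return the key tuples that appear in more than one record."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical sample data for illustration only
customers = [
    {"CustomerID": 1, "CustomerName": "Ada Lovelace"},
    {"CustomerID": 2, "CustomerName": "Alan Turing"},
    {"CustomerID": 1, "CustomerName": "Ada Lovelace"},  # duplicate of the first row
]

duplicates = duplicate_keys(customers, ("CustomerID", "CustomerName"))
print(duplicates)
```

To require each field to be unique on its own, the same helper could be called once per field, for example with `("CustomerID",)`.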

Timeliness

Timeliness measures the degree to which a dataset is available when expected. It depends on service level agreements (SLAs) being set up between technical and business resources.

Timeliness Example

All records in the customer dataset must be loaded by 9:00 am.
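
A deadline like this could be checked against the timestamp at which a load finished. This is a minimal sketch, assuming the deadline is a same-day clock time in the same time zone as the load timestamp; the timestamps and the helper name `loaded_on_time` are hypothetical.

```python
from datetime import datetime, time

DEADLINE = time(9, 0)  # 9:00 am, taken from the example rule


def loaded_on_time(load_finished_at, deadline=DEADLINE):
    """Return True if the load finished at or before the daily deadline."""
    return load_finished_at.time() <= deadline


# Hypothetical load completion timestamps
early_load = datetime(2023, 3, 1, 8, 45)
late_load = datetime(2023, 3, 1, 9, 30)

print(loaded_on_time(early_load), loaded_on_time(late_load))
```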

Consistency

Consistency is a data quality dimension that measures the degree to which data is the same across all instances of the data. Consistency can be measured by setting a threshold for how much difference there can be between two datasets.

Consistency Example

The count of records loaded today must be within +/- 5% of the count of records loaded yesterday.
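
The threshold above can be written as a relative-change check. This is a sketch under stated assumptions: the counts are hypothetical, the helper name `within_tolerance` is illustrative, and treating a zero count yesterday as consistent only with a zero count today is a choice made here, not something the rule specifies.

```python
def within_tolerance(today_count, yesterday_count, tolerance=0.05):
    """Return True if today's count is within +/- tolerance of yesterday's."""
    if yesterday_count == 0:
        # Assumption: with no baseline, only an equally empty load is consistent
        return today_count == 0
    change = abs(today_count - yesterday_count) / yesterday_count
    return change <= tolerance


# Hypothetical record counts
print(within_tolerance(1040, 1000))  # 4% increase
print(within_tolerance(1100, 1000))  # 10% increase
```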

Accuracy

Accuracy measures the degree to which data values are correct when compared to a trusted reference source.

Accuracy Example

All records in the Customer Table must have accurate Customer Name, Customer Birthdate, and Customer Address fields when compared to the Tax Form.
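
A comparison like this could be sketched as a field-by-field match against the reference. The example assumes both sources are keyed by a shared customer ID and compared by exact equality; the column names, sample values, and the helper name `inaccurate_fields` are hypothetical.

```python
# Hypothetical column names for the three fields named in the rule
FIELDS = ("CustomerName", "CustomerBirthDate", "CustomerAddress")


def inaccurate_fields(record, reference, fields=FIELDS):
    """Return the fields whose values differ from the reference record."""
    return [f for f in fields if record.get(f) != reference.get(f)]


# Hypothetical sample data, keyed by customer ID
customer_table = {
    101: {"CustomerName": "Ada Lovelace", "CustomerBirthDate": "1990-05-01",
          "CustomerAddress": "1 Main St"},
    102: {"CustomerName": "Alan Turing", "CustomerBirthDate": "1985-09-09",
          "CustomerAddress": "22 Oak Ave"},
}
tax_forms = {
    101: {"CustomerName": "Ada Lovelace", "CustomerBirthDate": "1990-05-01",
          "CustomerAddress": "1 Main St"},
    102: {"CustomerName": "Alan Turing", "CustomerBirthDate": "1985-09-09",
          "CustomerAddress": "22 Oak Avenue"},
}

# A customer with no tax form is compared to an empty reference,
# so every field is reported as a mismatch
mismatches = {
    cid: inaccurate_fields(rec, tax_forms.get(cid, {}))
    for cid, rec in customer_table.items()
}
print(mismatches)
```

Exact equality is strict: formatting differences such as "Ave" versus "Avenue" count as mismatches, so real checks often normalize values before comparing.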
