
Data Quality Dimensions Cheat Sheet

In this cheat sheet, you'll learn about data quality dimensions, allowing you to ensure that your data is fit for purpose.
Mar 14, 2023 · 3 min read


What are Data Quality Dimensions?

Data quality is a measure of the degree to which data is fit for purpose. Good data quality generates trust in data. Each data quality dimension measures one specific attribute of a dataset's quality.

Completeness

Completeness measures the degree to which all expected records in a dataset are present. At a data element level, completeness is the degree to which all records have data populated when expected.


Completeness Example

All records must have a value populated in the CustomerName field.
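As a rough illustration, the rule above could be scored as the share of records with a populated CustomerName. This is a minimal sketch, not a prescribed method: the sample records and the `completeness` helper are hypothetical, and it assumes that `None` and blank strings both count as missing.

```python
# Hypothetical sample records; only the CustomerName field comes from the rule above.
records = [
    {"CustomerID": 1, "CustomerName": "Ada Lopez"},
    {"CustomerID": 2, "CustomerName": ""},
    {"CustomerID": 3, "CustomerName": None},
    {"CustomerID": 4, "CustomerName": "Raj Patel"},
]


def completeness(rows, field):
    """Return the share of rows where `field` is populated (not None, not blank)."""
    if not rows:
        return 1.0  # assumption: an empty dataset has nothing missing
    populated = sum(
        1
        for row in rows
        if row.get(field) is not None and str(row[field]).strip() != ""
    )
    return populated / len(rows)


name_completeness = completeness(records, "CustomerName")
print(f"CustomerName completeness: {name_completeness:.0%}")
```

A score below 100% means at least one record breaks the rule; whether a lower threshold is acceptable depends on how the data is used.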


Validity

Validity measures the degree to which the values in a data element conform to the rules defined for it, such as an expected format, range, or set of allowed values.


Validity Example

  • CustomerBirthDate value must be a date in the past.
  • CustomerAccountType value must be either Loan or Deposit.
  • LatestAccountOpenDate value must be a date in the past.
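The three rules above could be checked per record along these lines. This is a sketch under stated assumptions: the helper names and sample records are hypothetical, dates are assumed to be stored as `datetime.date` values, and "today" is passed in explicitly so the check is repeatable.

```python
from datetime import date

# Allowed values taken from the rule above.
ALLOWED_ACCOUNT_TYPES = {"Loan", "Deposit"}


def is_past_date(value, today):
    """True when `value` is a date strictly before `today`."""
    return isinstance(value, date) and value < today


def invalid_fields(row, today):
    """Return the names of the fields in `row` that break a validity rule."""
    failed = []
    if not is_past_date(row.get("CustomerBirthDate"), today):
        failed.append("CustomerBirthDate")
    if row.get("CustomerAccountType") not in ALLOWED_ACCOUNT_TYPES:
        failed.append("CustomerAccountType")
    if not is_past_date(row.get("LatestAccountOpenDate"), today):
        failed.append("LatestAccountOpenDate")
    return failed


# Hypothetical records, checked against a fixed reference date.
today = date(2023, 3, 14)
good_row = {
    "CustomerBirthDate": date(1985, 6, 2),
    "CustomerAccountType": "Loan",
    "LatestAccountOpenDate": date(2020, 1, 15),
}
bad_row = {
    "CustomerBirthDate": date(2030, 1, 1),  # in the future
    "CustomerAccountType": "Savings",       # not an allowed value
    "LatestAccountOpenDate": None,          # missing, so not a past date
}

print(invalid_fields(good_row, today))
print(invalid_fields(bad_row, today))
```

Note that a missing value fails the validity check here; some teams prefer to report that case under completeness instead.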


Uniqueness

Uniqueness measures the degree to which the records in a dataset are not duplicated.


Uniqueness Example

All records must have a unique CustomerID and CustomerName.
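One way to test this rule is to count how often each key occurs. The sketch below assumes the rule means that the combination of CustomerID and CustomerName must be unique, which is only one possible reading; the helper names and sample records are hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, fields):
    """Return the key tuples (one value per field) that occur more than once."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return [key for key, n in counts.items() if n > 1]


def uniqueness(rows, fields):
    """Return distinct keys divided by total rows (1.0 means no duplicates)."""
    if not rows:
        return 1.0  # assumption: an empty dataset has no duplicates
    distinct = {tuple(row[f] for f in fields) for row in rows}
    return len(distinct) / len(rows)


# Hypothetical sample records; the last row repeats the first.
records = [
    {"CustomerID": 1, "CustomerName": "Ada Lopez"},
    {"CustomerID": 2, "CustomerName": "Raj Patel"},
    {"CustomerID": 3, "CustomerName": "Mei Chen"},
    {"CustomerID": 1, "CustomerName": "Ada Lopez"},
]

key_fields = ("CustomerID", "CustomerName")
duplicates = duplicate_keys(records, key_fields)
score = uniqueness(records, key_fields)
print(duplicates, score)
```

If each field must be unique on its own, the same helpers can be called once per field, for example `duplicate_keys(records, ("CustomerID",))`.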


Timeliness

Timeliness is the degree to which a dataset is available when expected and depends on service level agreements being set up between technical and business resources.


Timeliness Example

All records in the customer dataset must be loaded by 9:00 am.
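A deadline like this can be checked by comparing the load completion time with the cutoff for the business date. The sketch below is illustrative only: the `loaded_on_time` helper is hypothetical, and it assumes the load timestamp and the 9:00 am deadline are in the same time zone and that a load finishing exactly at 9:00 am counts as on time.

```python
from datetime import date, datetime, time

# Deadline from the rule above; the time zone is an assumption (same as the load timestamps).
DEADLINE = time(9, 0)


def loaded_on_time(load_finished_at, business_date, deadline=DEADLINE):
    """True when the load finished at or before the deadline on `business_date`."""
    cutoff = datetime.combine(business_date, deadline)
    return load_finished_at <= cutoff


# Hypothetical load completion timestamps for one business date.
business_date = date(2023, 3, 14)
early_load = datetime(2023, 3, 14, 8, 45)
late_load = datetime(2023, 3, 14, 9, 5)

print(loaded_on_time(early_load, business_date))
print(loaded_on_time(late_load, business_date))
```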


Consistency

Consistency is a data quality dimension that measures the degree to which data is the same across all instances of the data. Consistency can be measured by setting a threshold for how much difference there can be between two datasets.


Consistency Example

The count of records loaded today must be within +/- 5% of the count of records loaded yesterday.
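The threshold rule above amounts to a relative-change check between two record counts. This is a minimal sketch with a hypothetical helper name and made-up counts; it assumes yesterday's count is the baseline and treats a zero baseline as consistent only when today's count is also zero.

```python
def within_tolerance(today_count, yesterday_count, tolerance=0.05):
    """True when today's count differs from yesterday's by at most `tolerance`."""
    if yesterday_count == 0:
        # Assumption: with no baseline, only an equally empty load is consistent.
        return today_count == 0
    relative_change = abs(today_count - yesterday_count) / yesterday_count
    return relative_change <= tolerance


# Hypothetical record counts.
print(within_tolerance(1040, 1000))  # 4% more than yesterday
print(within_tolerance(1060, 1000))  # 6% more than yesterday
print(within_tolerance(930, 1000))   # 7% fewer than yesterday
```

The 5% tolerance is the value from the example; in practice it would be tuned to how much the dataset normally varies from day to day.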


Accuracy

Accuracy measures the degree to which data is correct when compared against a trusted source of truth.


Accuracy Example

All records in the Customer Table must have accurate Customer Name, Customer Birthdate, and Customer Address fields when compared to the Tax Form.
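A comparison like this could be sketched as a field-by-field match between the Customer Table and the Tax Form. Everything here is an assumption made for illustration: the two sources are joined on a hypothetical CustomerID key, the field names are spelled without spaces, values are compared by exact equality, and a customer with no matching tax form is reported as mismatched on every field.

```python
# Field names are assumed spellings of the fields named in the rule above.
FIELDS = ("CustomerName", "CustomerBirthDate", "CustomerAddress")


def accuracy_mismatches(customers, tax_forms, fields=FIELDS):
    """Map each CustomerID to the fields that differ from its tax form record."""
    reference = {row["CustomerID"]: row for row in tax_forms}
    mismatches = {}
    for row in customers:
        source = reference.get(row["CustomerID"])
        if source is None:
            # No tax form to compare against, so nothing can be confirmed.
            mismatches[row["CustomerID"]] = list(fields)
            continue
        differing = [f for f in fields if row.get(f) != source.get(f)]
        if differing:
            mismatches[row["CustomerID"]] = differing
    return mismatches


# Hypothetical sample data.
customer_table = [
    {"CustomerID": 1, "CustomerName": "Ada Lopez",
     "CustomerBirthDate": "1985-06-02", "CustomerAddress": "12 Elm St"},
    {"CustomerID": 2, "CustomerName": "Raj Patel",
     "CustomerBirthDate": "1990-11-23", "CustomerAddress": "8 Oak Ave"},
]
tax_form = [
    {"CustomerID": 1, "CustomerName": "Ada Lopez",
     "CustomerBirthDate": "1985-06-02", "CustomerAddress": "12 Elm St"},
    {"CustomerID": 2, "CustomerName": "Raj Patel",
     "CustomerBirthDate": "1990-11-23", "CustomerAddress": "80 Oak Ave"},
]

result = accuracy_mismatches(customer_table, tax_form)
print(result)
```

Exact equality is strict; real comparisons often normalize case, whitespace, and address formats before matching.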
