Integrity Constraints in SQL: A Guide With Examples

Integrity constraints in SQL are rules enforced on database tables to maintain data accuracy, consistency, and validity, such as ensuring unique primary keys and valid foreign key relationships.

Jun 19, 2024 · 15 min read

Imagine a hospital database where patient allergies can't be left blank or a financial system where transaction amounts must be positive numbers. In these scenarios and countless others, we rely on integrity constraints to ensure our data remains accurate, consistent, and reliable.

In SQL, integrity constraints are rules that we impose on our database tables to maintain data quality. They help prevent errors, enforce business rules, and ensure our data reflects the real-world entities and relationships it represents.

In this article, we'll delve into the essential types of integrity constraints in SQL, providing clear explanations and practical examples to illustrate their use in a PostgreSQL database. While we'll use PostgreSQL syntax, the concepts easily translate to other SQL dialects.

If you want to learn more about SQL, check out this list of SQL courses.

What Are Integrity Constraints in SQL?

Consider a scenario where we have a table that stores user information for a web application. Certain data, like a user's age, might be optional because it doesn’t hinder users from accessing our application. However, having a password for each user is essential for login purposes. To address this, we would implement an integrity constraint on the password column of our user’s table to guarantee that every entry in this table includes a password.

In essence, integrity constraints are crucial for:

Preventing missing data.
Ensuring all data adheres to the expected types and value ranges.
Maintaining proper links between data across different tables.

In this article, we'll explore the following essential integrity constraints in SQL:

PRIMARY KEY: Uniquely identifies each record in a table.
NOT NULL: Ensures a column cannot contain NULL values.
UNIQUE: Ensures that all values in a column or group of columns are unique.
DEFAULT: Provides a default value for a column when none is specified.
CHECK: Ensures that all values in a column satisfy a specific condition.
FOREIGN KEY: Establishes relationships between tables by referencing a primary key in another table.

Study Case: University Database

Let's consider a relational database for a university. This database contains three tables: students, courses, and enrollments.

Students table

The students table holds information on all university students.

student_id: The identifier of the student
first_name: The first name of the student.
last_name: The last name of the student.
email: The email address of the student.
major: The student’s major.
enrollment_year: The year in which the student enrolled.

Courses table

The courses table holds information on the courses available at the university.

course_id: The course identifier.
course_name: The name of the course.
department: The course department.

Enrollments table

The enrollments table stores information on which students are enrolled in which courses.

student_id: The identifier of the student enrolled in the course.
course_id: The identifier of the course.
year: The enrollment year.
grade: The student's grade in this course.
is_passing_grade: A boolean indicating if the grade is passing.

Throughout this article, we’ll use this example database and show several ways to enforce data integrity. We’ll use PostgreSQL syntax in our queries. However, the concepts easily translate to other flavors of SQL.

PRIMARY KEY Constraint

The university aims to identify each student uniquely. Utilizing first and last names for this purpose isn't advisable due to the possibility of name duplication among students. Likewise, relying on email addresses is not ideal since students may change their emails.

The common solution for ensuring unique identification is to assign a unique identifier to each student, which we store in the student_id column. We can apply a PRIMARY KEY constraint to the student_id column to guarantee that each student has a unique identifier.

This constraint is defined in the CREATE TABLE command, following the specification of the column data type:

CREATE TABLE students (
    student_id INT PRIMARY KEY,
    first_name TEXT,
    last_name TEXT,
    email TEXT,
    major TEXT,
    enrollment_year INT
);

The above query creates the student table with the six columns we mentioned above.

The PRIMARY KEY constraint ensures that:

Each student has a student_id.
Each student_id is unique.

PRIMARY KEY constraint on multiple columns

In certain scenarios, we need to utilize multiple columns to uniquely identify each row. Take, for instance, the enrollments table. A student can enroll in multiple courses, resulting in several rows sharing the same student_id. Similarly, a course can have multiple students enrolled, leading to numerous rows with the same course_id.

Given that no single field can uniquely identify a row, each enrollment record is determined using a combination of student_id, course_id, and year.

When multiple columns are involved, the PRIMARY KEY constraint is specified at the end of the CREATE TABLE command.

CREATE TABLE enrollments (
    student_id INT,
    course_id INT,
    year INT,
    grade INT,
    is_passing_grade BOOLEAN,
    PRIMARY KEY (student_id, course_id, year)
);

Integrity constraints after table creation

There are two ways to add integrity constraints. We just learned how to do it when the table is created.

Imagine the table already exists, but you forgot to specify the PRIMARY KEY constraint. You can also define it after the table is created using the ALTER TABLE command, like so:

ALTER TABLE enrollments
ADD CONSTRAINT enroll_pk 
PRIMARY KEY (student_id, course_id, year);

In the ALTER TABLE query, we named the constraint enroll_pk (standing for enrollment primary key). This name can be any identifier of your choosing, but it’s recommended to select a name that succinctly conveys the purpose of the constraint.

It’s best practice to name integrity constraints as it provides several benefits:

It allows for easier reference, especially when you need to modify or drop the constraint in the future.
It supports the management and organization of constraints, particularly in databases with a large number of constraints.

NOT NULL Constraint

The university wants to ensure that every student's name and email are recorded in the database. They don’t want staff to forget to input any of those fields accidentally.

To overcome this, we can use NOT NULL constraints on these three columns when creating the table:

CREATE TABLE students (
    student_id INT,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    email TEXT NOT NULL,
    major TEXT,
    enrollment_year INT
);

The above query uses the NOT NULL constraints to ensure that the columns first_name, last_name, and email cannot have NULL (undefined) values.

Adding a NOT NULL constraint to an existing table can be done using the ALTER TABLE command. The syntax for adding NOT NULL constraints is as follows:

ALTER TABLE students
ALTER COLUMN first_name SET NOT NULL,
ALTER COLUMN last_name SET NOT NULL,
ALTER COLUMN email SET NOT NULL;

UNIQUE Constraint

Email addresses are inherently unique to each individual. In situations where we’re sure that a field should never have duplicate values, it’s good practice to enforce this at a database level. This helps prevent mistakes and ensure data integrity.

Adding this constraint is done in the same way as the others.

CREATE TABLE students (
    ...
    email TEXT UNIQUE,
    ...
);

However, in our case, we also want to enforce the NOT NULL constraint. We can enforce multiple constraints on a single column by space separating them:

CREATE TABLE students (
    ...
    email TEXT UNIQUE NOT NULL,
    ...
);

Note that the order doesn’t matter, we could also have used NOT NULL UNIQUE.

Adding a UNIQUE constraint to an existing table is done as follows:

ALTER TABLE students 
ADD CONSTRAINT unique_emails UNIQUE (email);

UNIQUE constraint on multiple columns

Suppose we want to ensure that each department's course name is unique in the courses table. In this case, the course_name and department columns together should be unique.

When multiple columns are involved, the constraint is added at the end of the CREATE TABLE command:

CREATE TABLE courses (
    course_id INT,
    course_name TEXT,
    department TEXT,
    UNIQUE (course_name, department)
);

Alternatively, we can add the constraint by altering an existing table. In this case, we provide a tuple with the names of the columns:

ALTER TABLE courses 
ADD CONSTRAINT unique_course_name_department 
UNIQUE (course_name, department);

NOT NULL UNIQUE vs PRIMARY KEY constraints

We’ve learned that a PRIMARY KEY constraint enforces both unicity and that there are no missing values. So you might be wondering what’s the difference between:

course_id INT PRIMARY KEY

course_id INT UNIQUE NOT NULL

The difference between NOT NULL UNIQUE and PRIMARY KEY on a table is their intended purpose and usage.

While both enforce uniqueness and non-nullability on a column(s), a table can only have one PRIMARY KEY, which is intended to uniquely identify each record in the table.

On the other hand, the combination NOT NULL UNIQUE constraint can be applied to additional columns to enforce a unique value across each row, without null values, serving to maintain data integrity for specific business rules. A table can have any number of NOT NULL UNIQUE constraints.

The existence of both allows for greater flexibility in database design, enabling multiple ways to enforce uniqueness and data integrity while distinguishing between the primary identifier of a record and other important, unique attributes within a table.

DEFAULT Constraint

Students may need some time after university registration to select their major. The university would like the value of the major column to be the string ’Undeclared’ for students who haven’t selected their major yet.

To do so, we can set a default value of this column using the DEFAULT constraint. We can alter the students table like so:

ALTER TABLE students
ALTER COLUMN major SET DEFAULT 'Undeclared';

If, instead, we want to set a DEFAULT constraint when the table is created, we can do it by declaring it after the column data type:

CREATE TABLE students (
    ...
    major TEXT DEFAULT 'Undeclared',
    ...
);

CHECK Constraint

In this particular university, grades go from 0 to 100. Without any constraint, the grade column from the enrollment table accepts any integer value. We can fix that by using a CHECK constraint to enforce the values to be between 0 and 100.

ALTER TABLE enrollments
ADD CONSTRAINT grade_range CHECK (grade BETWEEN 0 AND 100);

In general, CHECK constraints enable us to validate specific conditions we want the data to satisfy. These are important to ensure data consistency and integrity.

A CHECK constraint can involve more than one column. Let’s use it to ensure that the grade and is_passing_grade have consistent values. Say a grade is passing if its value is at least 60. Then we can ensure that is_passing_grade is TRUE if and only if the grade is at least 60. Let’s do it in the table creation to show how CHECK constraints are declared in the CREATE TABLE command:

CREATE TABLE enrollments (
    ...
    grade INT,
    is_passing_grade BOOLEAN,
    CONSTRAINT grade_check CHECK (grade BETWEEN 0 AND 100),
    CONSTRAINT is_passing_grade CHECK (
        (grade >= 60 AND is_passing_grade = TRUE) OR
        (grade < 60 AND is_passing_grade = FALSE)
    )
);

There is a problem with the above constraint. When a student enrolls in a course, they won’t have a grade yet. So we should allow NULL values on the grade. When doing so, we also need to update the passing grade constraint to be NULL when the grade is still undefined. Here’s how to update the constraint to take these into account:

CREATE TABLE enrollments (
    ...    
    grade INT NULL DEFAULT NULL,
    is_passing_grade BOOLEAN NULL DEFAULT NULL,
    CONSTRAINT grade_check CHECK (grade BETWEEN 0 AND 100),
    CONSTRAINT is_passing_grade CHECK (
        (grade IS NULL AND is_passing_grade IS NULL) OR
        (grade >= 60 AND is_passing_grade = TRUE) OR
        (grade < 60 AND is_passing_grade = FALSE)
    ),
    ...
);

Note that on both the grade and is_passing_grade columns we added NULL on the data type and as the default value. The purpose of these is simply to increase readability.

Let’s now see what conditions we can enforce with a CHECK constraint.

Range conditions

We can ensure that values in a column fall within a specific range.

ALTER TABLE table_name
ADD CONSTRAINT constraint_name 
CHECK (column_name BETWEEN min_value AND max_value);

List conditions

We can validate that a column's value matches one in a list of specific values.

ALTER TABLE table_name
ADD CONSTRAINT constraint_name 
CHECK (column_name IN ('Value1', 'Value2', 'Value3'));

Comparison conditions

We can compare values in a column to ensure they meet a specified condition (greater than, less than, etc.).

ALTER TABLE table_name
ADD CONSTRAINT constraint_name 
CHECK (column_name > some_value);

Pattern matching conditions

We can use pattern matching (e.g., with LIKE or SIMILAR TO) to validate text data.

ALTER TABLE table_name
ADD CONSTRAINT constraint_name 
CHECK (column_name LIKE 'pattern');

Logical conditions

We can allow multiple conditions using logical operators (AND, OR, NOT).

ALTER TABLE table_name
ADD CONSTRAINT constraint_name 
CHECK (condition1 AND condition2 OR condition3);

Composite conditions

Applies a condition over multiple columns.

ALTER TABLE table_name
ADD CONSTRAINT constraint_name 
CHECK (column1 + column2 < some_value);

FOREIGN KEY Constraint

Foreign key constraints are used to link the columns of two tables together, ensuring the referential integrity of the data. In essence, a foreign key in one table points to a primary key in another table, indicating that the rows in these two tables are related. This ensures that you can't have a row in a table with a foreign key that doesn't correspond to any row in the linked table with the primary key.

In our example, each record in the enrollments table refers to a student and a course through the student_id and course_id columns, respectively. Without any constraint, there’s nothing to ensure that the values of these identifiers in the enrollments table match existing entries in the students and courses tables.

Here’s how we ensure this when creating the enrollments table:

CREATE TABLE enrollments (
    student_id INT,
    course_id INT,
    ...
    FOREIGN KEY (student_id) REFERENCES Students(student_id),
    FOREIGN KEY (course_id) REFERENCES Courses(course_id)
);

Note that for these to be foreign keys, they must be primary keys in the students and courses tables, respectively.

Just as with other constraints, we could also declare these after table creation using the ALTER TABLE command:

ALTER TABLE enrollments
ADD CONSTRAINT fk_student_id
FOREIGN KEY (student_id)
REFERENCES students(student_id);

ALTER TABLE enrollments
ADD CONSTRAINT fk_course_id
FOREIGN KEY (course_id)
REFERENCES courses(course_id);

Putting It All Together

Throughout this article, we showcased several integrity constraints and how to use them to improve a university database. Here’s the final version of the CREATE TABLE command, which combines everything we’ve learned.

First, we'll define the students table:

CREATE TABLE students (
    student_id INT PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    email TEXT UNIQUE NOT NULL,
    major TEXT DEFAULT 'Undeclared',
    enrollment_year INT,
    CONSTRAINT year_check CHECK (enrollment_year >= 1900),
    CHECK (major IN (
        'Undeclared',
        'Computer Science',
        'Mathematics',
        'Biology',
        'Physics',
        'Chemistry',
        'Biochemistry'
    ))
);

Next, let's define the courses table:

CREATE TABLE courses (
    course_id INT PRIMARY KEY,
    course_name TEXT NOT NULL,
    department TEXT NOT NULL,
    UNIQUE (course_name, department),
    CHECK (department IN (
        'Physics & Mathematics',
        'Sciences'
    ))
);

Finally, we'll define the enrollments table, establishing the relationships between students and courses:

CREATE TABLE enrollments (
    student_id INT,
    course_id INT,
    year INT CHECK (year >= 1900),
    grade INT NULL DEFAULT NULL,
    is_passing_grade BOOLEAN NULL DEFAULT NULL,
    CONSTRAINT grade_check CHECK (grade BETWEEN 0 AND 100),
    CONSTRAINT is_passing_grade CHECK (
        (grade IS NULL AND is_passing_grade IS NULL) OR
        (grade >= 60 AND is_passing_grade = TRUE) OR
        (grade < 60 AND is_passing_grade = FALSE)
    ),
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES courses(course_id),
    PRIMARY KEY (student_id, course_id, year)
);

We’ve added a few constraints in this final example. A university has a known set of departments and majors. Therefore, the set of values that the major and department columns can have is finite and known in advance. In these situations, using a CHECK constraint is recommended to ensure that the columns can only take values from that known set of values.

Conclusion

In this article, we explored the different types of integrity constraints in SQL and how to implement them using PostgreSQL. We covered primary keys, NOT NULL constraints, UNIQUE constraints, DEFAULT constraints, CHECK constraints, and FOREIGN KEY constraints, providing practical examples for each.

By understanding these concepts, we can ensure the accuracy, consistency, and reliability of our data.

If you want to learn more about organizing data efficiently, check out this course on Database Design.

Author

François Aubry

Full-stack engineer & founder at CheapGPT. Teaching has always been my passion. From my early days as a student, I eagerly sought out opportunities to tutor and assist other students. This passion led me to pursue a PhD, where I also served as a teaching assistant to support my academic endeavors. During those years, I found immense fulfillment in the traditional classroom setting, fostering connections and facilitating learning. However, with the advent of online learning platforms, I recognized the transformative potential of digital education. In fact, I was actively involved in the development of one such platform at our university. I am deeply committed to integrating traditional teaching principles with innovative digital methodologies. My passion is to create courses that are not only engaging and informative but also accessible to learners in this digital age.

Topics

SQL

Data Science

Learn SQL with these courses!

Course

Data-Driven Decision Making in SQL

4 hr

62.7K

Learn how to analyze a SQL table and report insights to management.

See Details

Start Course

Course

Intermediate SQL

4 hr

454.9K

Accompanied at every step with hands-on practice queries, this course teaches you everything you need to know to analyze data using your own SQL code today!

See Details

Start Course

Course

Joining Data in SQL

4 hr

288.8K

Level up your SQL knowledge and learn to join tables together, apply relational set theory, and work with subqueries.

See Details

Start Course

Tutorial

SQL Triggers: A Guide for Developers

Learn how to use SQL triggers to automate tasks, maintain data integrity, and enhance database performance. Try practical examples like the CREATE, ALTER, and DROP commands in MySQL and Oracle.

Oluseye Jeremiah

Tutorial

SQL Query Examples and Tutorial

If you are looking to get started with SQL, we’ve got you covered. In this SQL tutorial, we will introduce you to SQL queries - a powerful tool that enables us to work with the data stored in a database.

Sejal Jaiswal

Tutorial

SQL Commands for Data Scientists

Learn the basic and essential commands for SQL with examples and a workspace notebook to refer to.

Emiko Sano

Tutorial

Set Operators in SQL: A Comprehensive Guide

Set operations in SQL are techniques for combining or comparing the results of two or more SELECT statements.

Kurtis Pykes

Tutorial

Joins in SQL Tutorial

This tutorial will explain how to join tables together using primary and foreign keys in an SQL Server.

DataCamp Team

Tutorial

SQL OR Operator: A Guide with Examples

The SQL OR operator allows you to filter data by combining multiple conditions, returning rows where at least one condition is true. Learn how to use OR with INSERT, UPDATE, and DELETE statements to modify records in tables.

Allan Ouko

See More See More

What Are Integrity Constraints in SQL?

Study Case: University Database

Students table

Courses table

Enrollments table

PRIMARY KEY Constraint

PRIMARY KEY constraint on multiple columns

Integrity constraints after table creation

NOT NULL Constraint

UNIQUE Constraint

UNIQUE constraint on multiple columns

NOT NULL UNIQUE vs PRIMARY KEY constraints

DEFAULT Constraint

CHECK Constraint

Range conditions

List conditions

Comparison conditions

Pattern matching conditions

Logical conditions

Composite conditions

FOREIGN KEY Constraint

Putting It All Together

Conclusion

SQL Triggers: A Guide for Developers

SQL Query Examples and Tutorial

SQL Commands for Data Scientists

Set Operators in SQL: A Comprehensive Guide

Joins in SQL Tutorial

SQL OR Operator: A Guide with Examples

.css-1531qan{-webkit-text-decoration:none;text-decoration:none;color:inherit;}Data-Driven Decision Making in SQL

Intermediate SQL

Joining Data in SQL

SQL Triggers: A Guide for Developers

SQL Query Examples and Tutorial

SQL Commands for Data Scientists

Set Operators in SQL: A Comprehensive Guide

Joins in SQL Tutorial

SQL OR Operator: A Guide with Examples

Data-Driven Decision Making in SQL