
UNION vs UNION ALL in SQL

Discover the output and performance differences between UNION and UNION ALL: UNION removes duplicates while UNION ALL includes them.
Jul 19, 2024  · 7 min read

In SQL, combining data from multiple queries is a fundamental requirement, and the set operators UNION and UNION ALL are the standard way to do it.

Understanding the differences between UNION and UNION ALL is important for efficient data querying and management. This tutorial explains their key similarities, differences, and usage scenarios, using small sample datasets, to help you optimize your SQL queries.

The Short Answer: SQL UNION vs. UNION ALL 

The key difference is that UNION removes duplicate records, whereas UNION ALL includes all duplicates. This distinction not only changes the number of rows in the query result, but it also impacts performance.

Let's consider two sample tables, employees_2023 and employees_2024, representing employee records for two different years.

SELECT *
FROM employees_2023

| employee_id | name    | department |
|-------------|---------|------------|
| 1           | Alice   | HR         |
| 2           | Bob     | IT         |
| 3           | Charlie | Finance    |

SELECT *
FROM employees_2024

| employee_id | name    | department |
|-------------|---------|------------|
| 2           | Bob     | IT         |
| 3           | Charlie | Finance    |
| 4           | David   | Marketing  |

Now, let’s combine our two tables using the UNION operator.

SELECT employee_id, name, department FROM employees_2023 
UNION 
SELECT employee_id, name, department FROM employees_2024;

| employee_id | name    | department |
|-------------|---------|------------|
| 1           | Alice   | HR         |
| 2           | Bob     | IT         |
| 3           | Charlie | Finance    |
| 4           | David   | Marketing  |

In this result, the duplicate records for Bob and Charlie, who appear in both tables, are removed. Now let’s compare this with the result we obtain after using UNION ALL.

SELECT employee_id, name, department FROM employees_2023 
UNION ALL 
SELECT employee_id, name, department FROM employees_2024;

| employee_id | name    | department |
|-------------|---------|------------|
| 1           | Alice   | HR         |
| 2           | Bob     | IT         |
| 3           | Charlie | Finance    |
| 2           | Bob     | IT         |
| 3           | Charlie | Finance    |
| 4           | David   | Marketing  |

In this result, all records are included, and duplicates are not removed.
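If you want to reproduce the two results above without a database server, the following is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in engine. The table definitions (column types in particular) are our own assumption, since the tutorial only shows the table contents.

```python
import sqlite3

# In-memory SQLite database, used only as a convenient stand-in engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees_2023 (employee_id INT, name TEXT, department TEXT);
CREATE TABLE employees_2024 (employee_id INT, name TEXT, department TEXT);
INSERT INTO employees_2023 VALUES
    (1, 'Alice', 'HR'), (2, 'Bob', 'IT'), (3, 'Charlie', 'Finance');
INSERT INTO employees_2024 VALUES
    (2, 'Bob', 'IT'), (3, 'Charlie', 'Finance'), (4, 'David', 'Marketing');
""")

# Same query text for both runs; only the set operator changes.
query = """
SELECT employee_id, name, department FROM employees_2023
{op}
SELECT employee_id, name, department FROM employees_2024
"""

union_rows = conn.execute(query.format(op="UNION")).fetchall()
union_all_rows = conn.execute(query.format(op="UNION ALL")).fetchall()

print(len(union_rows))      # 4 rows: Bob and Charlie appear once
print(len(union_all_rows))  # 6 rows: Bob and Charlie appear twice
```

The only thing that differs between the two runs is the operator, which makes the effect on the row count easy to see.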

What is UNION in SQL?

The UNION operator combines the results of two or more SELECT queries into a single result set, removing duplicate rows. Each SELECT statement within the UNION must return the same number of columns, the corresponding columns must have compatible data types, and the columns must appear in the same order. To understand the operator, let’s create two sample tables below.

CREATE TABLE sales_team (
    employee_id INT,
    employee_name VARCHAR(50)
);
INSERT INTO sales_team (employee_id, employee_name) VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie');
CREATE TABLE support_team (
    employee_id INT,
    employee_name VARCHAR(50)
);
INSERT INTO support_team (employee_id, employee_name) VALUES
(3, 'Charlie'),
(4, 'David'),
(5, 'Eve');

The code above creates the two tables we’ll use in the rest of this tutorial, sales_team and support_team. You can view the tables using the queries below.

SELECT * 
FROM sales_team


Records in the sales_team table. Image by Author 

SELECT * 
FROM support_team


Records in the support_team table. Image by Author 

These are small tables with three records each, and the column names are self-explanatory. Now that we have our two tables created, let’s apply the UNION operator.

SELECT employee_id, employee_name
    FROM sales_team
UNION
SELECT employee_id, employee_name
    FROM support_team;


Output from the UNION operator. Image by Author

You can see that the duplicate row for Charlie (employee_id 3), which appears in both tables, is returned only once.
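The UNION query and the column-count rule described earlier can be checked with a short script. This is a sketch that assumes SQLite (through Python's sqlite3 module) as the engine. The ORDER BY clause is our addition to make the printed order deterministic, and the exact error type and message for a column mismatch will differ on other platforms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_team (employee_id INT, employee_name VARCHAR(50));
CREATE TABLE support_team (employee_id INT, employee_name VARCHAR(50));
INSERT INTO sales_team VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie');
INSERT INTO support_team VALUES (3, 'Charlie'), (4, 'David'), (5, 'Eve');
""")

# UNION alone does not guarantee row order, so ORDER BY is added here.
rows = conn.execute("""
    SELECT employee_id, employee_name FROM sales_team
    UNION
    SELECT employee_id, employee_name FROM support_team
    ORDER BY employee_id
""").fetchall()
print(rows)  # five rows; (3, 'Charlie') appears once

# A UNION whose SELECTs return different numbers of columns is rejected.
try:
    conn.execute("""
        SELECT employee_id, employee_name FROM sales_team
        UNION
        SELECT employee_id FROM support_team
    """)
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)
print(mismatch_error)
```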

What Are the Use Cases for UNION in SQL?

Here are some common use cases for the UNION operator.

  • Combining Results From Different Tables: When you want to combine data from multiple tables and ensure no duplicate records, UNION is the go-to operator.

  • Handling Different Data Sources: UNION is useful when the same records may exist in more than one data source and should appear only once in the combined result.

  • Removing Duplicates Across Queries: When you want to ensure the uniqueness of the combined result set, use UNION.

What is UNION ALL in SQL?

The UNION ALL operator combines the results of two or more SELECT queries, including all duplicate rows. It is generally faster than UNION because it skips the duplicate-removal step.

SELECT employee_id, employee_name
    FROM sales_team
UNION ALL
SELECT employee_id, employee_name
    FROM support_team;


Output from the UNION ALL operator. Image by Author

You can see that, in this case, the duplicate entries are repeated and not removed.
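One way to see how the two operators relate is to apply DISTINCT on top of a UNION ALL and compare the result with a plain UNION. The sketch below assumes SQLite via Python's sqlite3 module; the subquery alias combined is a name we introduce for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_team (employee_id INT, employee_name VARCHAR(50));
CREATE TABLE support_team (employee_id INT, employee_name VARCHAR(50));
INSERT INTO sales_team VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie');
INSERT INTO support_team VALUES (3, 'Charlie'), (4, 'David'), (5, 'Eve');
""")

# UNION ALL keeps every row from both inputs.
all_rows = conn.execute("""
    SELECT employee_id, employee_name FROM sales_team
    UNION ALL
    SELECT employee_id, employee_name FROM support_team
""").fetchall()

# Deduplicating the UNION ALL output by hand...
distinct_rows = conn.execute("""
    SELECT DISTINCT employee_id, employee_name FROM (
        SELECT employee_id, employee_name FROM sales_team
        UNION ALL
        SELECT employee_id, employee_name FROM support_team
    ) AS combined
""").fetchall()

# ...gives the same set of rows as UNION.
union_rows = conn.execute("""
    SELECT employee_id, employee_name FROM sales_team
    UNION
    SELECT employee_id, employee_name FROM support_team
""").fetchall()

print(len(all_rows))                                # 6
print(sorted(distinct_rows) == sorted(union_rows))  # True
```

The rows are sorted before comparison because neither query guarantees an output order.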

What Are the Use Cases for UNION ALL in SQL?

Here are some common use cases for the UNION ALL operator.

  • Combining Results With Duplicates: Use UNION ALL when you need to combine results from multiple queries and preserve all duplicate rows.
  • Performance Considerations: UNION ALL is generally more time-efficient than UNION because it doesn't require the additional step of removing duplicates.
  • Aggregating Data From Different Periods: When aggregating data from different periods or sources, and you need to preserve the duplicate entries, UNION ALL is preferred.
  • Reporting and Analysis: For reporting purposes where every record, including duplicates, is necessary, UNION ALL is suitable.
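The aggregation use case in the list above is where the choice matters most, because removing duplicates can silently change a total. The following sketch uses two hypothetical tables, sales_q1 and sales_q2, that are not part of the tutorial's datasets, and assumes SQLite via Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical quarterly sales tables, invented for this illustration.
conn.executescript("""
CREATE TABLE sales_q1 (product TEXT, amount INT);
CREATE TABLE sales_q2 (product TEXT, amount INT);
INSERT INTO sales_q1 VALUES ('widget', 100), ('gadget', 50);
INSERT INTO sales_q2 VALUES ('widget', 100), ('gadget', 75);
""")

total_sql = """
SELECT SUM(amount) FROM (
    SELECT product, amount FROM sales_q1
    {op}
    SELECT product, amount FROM sales_q2
) AS combined
"""

total_all = conn.execute(total_sql.format(op="UNION ALL")).fetchone()[0]
total_union = conn.execute(total_sql.format(op="UNION")).fetchone()[0]

print(total_all)    # 325: every sale is counted
print(total_union)  # 225: the second ('widget', 100) row was dropped
```

Here the two identical widget rows are separate sales from different quarters, so UNION ALL gives the correct total and UNION undercounts.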

UNION and UNION ALL: Similarities and Differences

Let's create a summary table of the similarities and differences. 

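| Feature        | UNION                        | UNION ALL                 |
|----------------|------------------------------|---------------------------|
| Duplicate rows | Removed                      | Included                  |
| Performance    | Slower                       | Faster                    |
| Use Case       | When you need unique records | When you need all records |
| Result Size    | Smaller                      | Larger                    |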

UNION and UNION ALL Syntax Differences Across SQL Dialects 

While the basic syntax is largely the same across SQL dialects, there are slight variations depending on the platform. On most platforms, such as SQL Server, Oracle, MySQL, PostgreSQL, or BigQuery, the code looks like the example below. Note that BigQuery requires the deduplicating form to be written as UNION DISTINCT rather than a bare UNION.

SELECT employee_id, employee_name
    FROM sales_team
UNION ALL
SELECT employee_id, employee_name
    FROM support_team;

Basic commands in these SQL dialects remain the same, with some nuances and differences:

  1. Handling NULLs and Collations: Different SQL dialects may have different rules for handling NULL values and string collation.
  2. Performance Optimizations: The performance optimizations and execution efficiency might vary.
  3. Error Handling and Warnings: Different dialects may handle errors and warnings differently when using these operators.
  4. Platform-Specific Functions and Features: Some SQL dialects might offer additional functions or features that can be used in conjunction with UNION and UNION ALL for more complex scenarios.
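The NULL and collation point in the list above is easy to test on whichever engine you use. The sketch below shows the behavior in SQLite only (via Python's sqlite3 module); other platforms, or columns configured with a case-insensitive collation, may give different results for the string comparison, so treat this as a template for checking rather than a statement about every dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# For duplicate removal, SQLite treats two NULLs as the same value.
null_union = conn.execute(
    "SELECT NULL AS v UNION SELECT NULL"
).fetchall()
null_union_all = conn.execute(
    "SELECT NULL AS v UNION ALL SELECT NULL"
).fetchall()

# With SQLite's default binary collation, 'Bob' and 'bob' are different
# strings, so UNION keeps both rows.
case_union = conn.execute(
    "SELECT 'Bob' AS v UNION SELECT 'bob'"
).fetchall()

print(null_union)       # [(None,)]
print(null_union_all)   # [(None,), (None,)]
print(len(case_union))  # 2
```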

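The syntax is different in PySpark, which operates in a distributed environment and uses DataFrame methods instead of SQL keywords. Assuming df1 and df2 are the names of the two DataFrames holding the two tables we created above, union() combines them by column position and, like UNION ALL, keeps duplicates. Chaining distinct() gives the UNION behavior:

df1.union(df2)             # keeps duplicates, like UNION ALL
df1.union(df2).distinct()  # removes duplicates, like UNION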

Final Thoughts

It is essential to understand when to use UNION and when to use UNION ALL to manage data efficiently in SQL. UNION returns only distinct records, whereas UNION ALL returns all of them, which affects both performance and result set size. Use this knowledge to choose the appropriate operator for your specific requirements.

For further learning, consider exploring the following sources:

  • Introduction to SQL: Learn how to create and query relational databases using SQL.
  • SQL Fundamentals: Gain the fundamental SQL skills you need to interact with and query your data. 
  • SQL Associate Certification: Demonstrate your SQL ability by extracting appropriate data from a database, and using it to answer common data questions.
  • Intermediate SQL: With this hands-on course, you’ll learn everything you need to know to analyze data using your own SQL code today.

Author
Vikash Singh

Seasoned professional in data science, artificial intelligence, analytics, and data strategy.

Frequently Asked Questions

What is the primary difference between SQL UNION and UNION ALL?

UNION removes duplicate records, while UNION ALL includes all duplicates.

Is there a performance difference between UNION and UNION ALL?

Yes, UNION is generally slower because it removes duplicates.

Are there specific use cases where UNION ALL is preferred over UNION?

UNION ALL is preferred when you need all records, including duplicates, such as in reporting and analysis.

When should I use SQL UNION instead of UNION ALL?

Use UNION when you need a combined result set without duplicates.

How do different SQL dialects affect the use of UNION and UNION ALL?

While the basic syntax is consistent, there might be slight variations in performance, handling of NULLs, and collation across SQL dialects.
