course
CTE in SQL: A Complete Guide with Examples
If you've been working with SQL for a while but haven't used CTEs, you'll likely wonder how you managed without them. I use them pretty much everywhere, including in SELECT
, INSERT
, UPDATE
, and DELETE
statements.
In this article, I will go through the basics, including how to create a CTE. I will also go into more advanced things, such as how to differentiate between non-recursive and recursive CTEs, which both serve a purpose.
If you are a bit unfamiliar with SQL operations, try our very popular Introduction to SQL course to get started. The course is well-designed and comprehensive, and it will teach you everything you need to know to extract data using efficient queries.
What is a SQL CTE?
The idea of CTEs will become clear when I show examples. But for now, we can say that a CTE, or common table expression, is a temporary, named result set in SQL that allows you to simplify complex queries, making them easier to read and maintain.
CTEs are commonly used when working with multiple subqueries. You might recognize them because they are created with the distinctive WITH
keyword and, like I mentioned, they can be used in SELECT
, INSERT
, UPDATE
, and DELETE
statements.
Associate Data Engineer in SQL
How to Create a SQL CTE
When creating a CTE, we use the WITH
keyword to initiate the CTE definition. The general syntax of a CTE is as follows:
WITH cte_name (column1, column2, ...)
AS (
-- Query that defines the CTE
SELECT ...
FROM ...
WHERE ...
)
-- Main query
SELECT ...
FROM cte_name;
Where:
-
WITH
: Initiates the CTE definition, indicating that the following name represents a temporary result set. -
cte_name
: The name is assigned to the CTE to reference it in the main query. -
Optional column list (
column1
,column2
, ...): Specifies column names for the CTE’s result set. This is useful when column names need to be adjusted. -
Query that defines the CTE: The inner query that selects data and shapes the temporary result set.
-
Main query: References the CTE by its name, using it like a table.
Let us look at the following example of creating a CTE using a layered approach. Assume we have a table Employees
, and we want to create a CTE that selects employees who earn a salary above $50,000.
Step 1: Write the base query
We start by writing the basic SELECT
query:
SELECT EmployeeID, FirstName, LastName, Salary
FROM Employees
WHERE Salary > 50000;
Step 2: Wrap the query using the WITH keyword to create a CTE
Use the WITH
keyword to give the CTE a name.
WITH HighEarningEmployees AS (
SELECT EmployeeID, FirstName, LastName, Salary
FROM Employees
WHERE Salary > 50000
)
Step 3: Use the CTE in the main query
Finally, refer to the CTE in a SELECT
statement by calling the CTE name defined above.
-- Define a Common Table Expression (CTE)
WITH HighEarningEmployees AS (
SELECT EmployeeID, FirstName, LastName, Salary
FROM Employees
WHERE Salary > 50000
)
-- Use the CTE to select high-earning employees
SELECT EmployeeID, FirstName, LastName
FROM HighEarningEmployees;
To summarize the above steps, we have used the WITH
keyword to define the CTE named HighEarningEmployees
. The inner query was used to generate the temporary dataset. The main query references the HighEarningEmployees
to display the specified columns EmployeeID
, FirstName
, and LastName
.
Why SQL CTEs are Useful
From the above example, you may wonder why we use CTEs when even simple queries yield the same results. The following are the reasons:
Simplify complex queries
CTEs break down complex SQL statements into smaller, more manageable parts, making the code easier to read, write, and maintain.
Suppose we have three tables: Orders
, Customers
, and Products
. We want to find the total revenue generated by each customer who purchased in 2024. When we write the query without using CTE, it looks cluttered and hard to read and understand.
-- Select customer names and total revenue from their orders
SELECT c.CustomerName, SUM(p.Price * o.Quantity) AS TotalRevenue
FROM Orders o
-- Join to get customer and products table
JOIN Customers c ON o.CustomerID = c.CustomerID
JOIN Products p ON o.ProductID = p.ProductID
WHERE YEAR(o.OrderDate) = 2024
GROUP BY c.CustomerName
HAVING SUM(p.Price * o.Quantity) > 1000;
By using a CTE, we can separate the logic into a more readable format:
-- Define the CTE
WITH OrderDetails AS (
SELECT o.OrderID, c.CustomerName, p.Price, o.Quantity, o.OrderDate
FROM Orders o
JOIN Customers c ON o.CustomerID = c.CustomerID
JOIN Products p ON o.ProductID = p.ProductID
WHERE YEAR(o.OrderDate) = 2024
)
--Main query
SELECT CustomerName, SUM(Price * Quantity) AS TotalRevenue
FROM OrderDetails
GROUP BY CustomerName
HAVING SUM(Price * Quantity) > 1000;
Code reusability
CTEs help avoid duplication by allowing the same result set to be reused across different parts of a query. If multiple calculations or operations are based on the same dataset, you can define it once in a CTE and refer to it as needed.
Assume we need to calculate the average and total sales for each product category in an e-commerce database. We can use a CTE to define the calculations once and reuse them in subsequent queries.
-- Define a CTE to calculate total and average sales for each category
WITH CategorySales AS (
SELECT Category, SUM(SalesAmount) AS TotalSales, AVG(SalesAmount) AS AverageSales
FROM Products
GROUP BY Category
)
-- Select category, total sales, and average sales from the CTE
SELECT Category, TotalSales, AverageSales
FROM CategorySales
WHERE TotalSales > 5000;
Other applications
In addition to simplifying queries and code reusability, CTEs have other uses as well. I'm not able to cover every possible use of CTEs in detail. Our Data Manipulation in SQL course is a great option if you want to keep practicing. However, I'll document some of the main other reasons here:
- Query Organization and Readability: CTEs improve SQL code readability by dividing queries into logical, sequential steps. Each step in the query process can be represented by its own CTE, making the entire query easier to follow.
- Hierarchical Data Traversal: CTEs can help navigate hierarchical relationships, such as organizational structures, parent-child relationships, or any data model that involves nested levels. Recursive CTEs are useful for querying hierarchical data because they allow you to traverse levels iteratively.
- Multi-Level Aggregations: CTEs can help perform aggregations at multiple levels, such as calculating sales figures at different granularities (e.g., by month, quarter, and year). Using CTEs to separate these aggregation steps ensures that each level is calculated independently and logically.
- Combining Data from Multiple Tables: Multiple CTEs can be used to combine data from different tables, making the final combination step more structured. This approach simplifies complex joins and ensures the source data is organized logically for improved readability.
Advanced SQL CTE Techniques
CTEs support advanced SQL techniques, making them versatile and useful for different use cases. The following are some of CTEs' advanced applications.
Multiple CTEs in a single query
You can define multiple CTEs in a single query, which allows for complex transformations and calculations. This method is useful when a problem requires multiple stages of data processing, where each CTE represents a distinct stage.
Suppose we have sales data in a table called Sales
and we want to calculate the total sales for each product, identify products with above-average total sales, and rank these products based on their total sales.
WITH ProductSales AS (
-- Step 1: Calculate total sales for each product
SELECT ProductID, SUM(SalesAmount) AS TotalSales
FROM Sales
GROUP BY ProductID
),
AverageSales AS (
-- Step 2: Calculate the average total sales across all products
SELECT AVG(TotalSales) AS AverageTotalSales
FROM ProductSales
),
HighSalesProducts AS (
-- Step 3: Filter products with above-average total sales
SELECT ProductID, TotalSales
FROM ProductSales
WHERE TotalSales > (SELECT AverageTotalSales FROM AverageSales)
)
-- Step 4: Rank the high-sales products
SELECT ProductID, TotalSales, RANK() OVER (ORDER BY TotalSales DESC) AS SalesRank
FROM HighSalesProducts;
In the above example;
-
The first CTE (
ProductSales
) calculates the total sales per product. -
The second CTE (
AverageSales
) computes the average total sales across all products. -
The third CTE
(HighSalesProducts
) filters for products whose total sales exceed the average. -
The final query ranks these products based on their total sales.
CTEs in UPDATE, DELETE, and MERGE Statements
When incorporated into UPDATE
, DELETE
, and MERGE
operations, CTEs can simplify data manipulation tasks, especially when dealing with complex filters or hierarchical data.
Using CTE with an UPDATE statement
Assume we have an Employees
table with an EmployeeSalary
column. We want to give a 10% raise to all employees who have worked for the company for over 5 years.
-- Define a CTE to find employees hired more than 5 years ago
WITH LongTermEmployees AS (
SELECT EmployeeID
FROM Employees
WHERE DATEDIFF(YEAR, HireDate, GETDATE()) > 5
)
-- Update salaries by 10% for long-term employees identified in the CTE
UPDATE Employees
SET EmployeeSalary = EmployeeSalary * 1.1
WHERE EmployeeID IN (SELECT EmployeeID FROM LongTermEmployees);
The CTE LongTermEmployees
identifies employees who have worked more than five years. The UPDATE
statement uses this CTE to selectively increase salaries.
Using CTE with a DELETE statement
Now assume we have a table named Products
and want to delete all products that haven’t been sold in the last 2 years. We can use a CTE to filter the products:
-- Define a CTE to identify products not sold in the last 2 years
WITH OldProducts AS (
SELECT ProductID
FROM Products
-- Use DATEADD to find products with a LastSoldDate more than 2 years ago
WHERE LastSoldDate < DATEADD(YEAR, -2, GETDATE())
)
-- Delete products identified as old from the main table
DELETE FROM Products
WHERE ProductID IN (SELECT ProductID FROM OldProducts);
The CTE OldProducts
identifies products that haven't been sold in the last two years, and then the DELETE
statement uses this CTE to remove those products.
Using CTE with a MERGE statement
The MERGE
statement in SQL allows for conditional updates, inserts, or deletions in a target table based on data in a source table. In the following example, the CTE MergedInventory
combines new and existing inventory data. The MERGE
statement then updates quantities for existing products or inserts new products based on the CTE data.
-- CTE to merge new and existing inventory data
WITH MergedInventory AS (
SELECT ni.ProductID, ni.Quantity AS NewQuantity, i.Quantity AS CurrentQuantity
FROM NewInventoryData ni
-- Use LEFT JOIN to include all new data, even if not in current inventory
LEFT JOIN Inventory i ON ni.ProductID = i.ProductID
)
-- Merge the prepared data into the Inventory table
MERGE INTO Inventory AS i
USING MergedInventory AS mi
ON i.ProductID = mi.ProductID
-- Update existing products with new quantities
WHEN MATCHED THEN
UPDATE SET i.Quantity = mi.NewQuantity
-- Insert new products if they don't exist in the inventory
WHEN NOT MATCHED BY TARGET THEN
INSERT (ProductID, Quantity) VALUES (mi.ProductID, mi.NewQuantity);
Recursive Common Table Expressions (CTEs)
Recursive CTEs help perform advanced and repeated operations.
Introduction to recursive CTEs
Recursive CTEs are a special type of CTE that references itself within its definition, allowing the query to perform repeated operations. This makes them ideal for working with hierarchical or tree-structured data, such as organizational charts, directory structures, or product assemblies. The recursive CTE iteratively processes data, returning results step by step until a termination condition is met.
Anchor and recursive members
A recursive CTE consists of two main parts:
- Anchor Member: The part that defines the base query that starts the recursion.
- Recursive Member: The part that references the CTE itself, allowing it to perform the "recursive" operations.
Suppose we have an Employees
table, where each row contains an EmployeeID
, EmployeeName
, and ManagerID
. If we want to find all direct and indirect reports for a specific manager, we start with the anchor member identifying the top-level manager. The anchor member starts with the employee with EmployeeID = 1
.
The recursive member finds employees whose ManagerID
matches the EmployeeID
from the previous iteration. Each iteration retrieves the next level of the hierarchy.
WITH EmployeeHierarchy AS (
-- Anchor member: select the top-level manager
SELECT EmployeeID, EmployeeName, ManagerID, 1 AS Level
FROM Employees
WHERE EmployeeID = 1 -- Starting with the top-level manager
UNION ALL
-- Recursive member: find employees who report to the current managers
SELECT e.EmployeeID, e.EmployeeName, e.ManagerID, eh.Level + 1
FROM Employees e
INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
SELECT EmployeeID, EmployeeName, Level
FROM EmployeeHierarchy;
Potential Issues or Limitations of CTEs in SQL
Understanding the features and limitations of CTEs is important for writing logical and readable queries. Let us look at some limitations and potential issues of using CTEs in different databases.
SQL Server and Azure limitations
There are some environment-specific limitations for SQL CTEs when working with SQL Server or Azure Synapse Analytics. They include the following:
-
SQL Server: The default maximum recursion level for recursive CTEs is 100, which can be modified using the
OPTION (MAXRECURSION)
hint. If this limit is exceeded without adjustment, an error occurs. CTEs cannot be nested directly within each other or defined inside another CTE. -
Azure Synapse Analytics: CTEs have limited support for certain SQL operations such as
INSERT
,UPDATE
,DELETE
, andMERGE
. Also, recursive CTEs are not supported in the Azure Synapse Analytics cloud-based environments restricting the ability to perform certain hierarchical data operations.
If you find yourself working with SQL Server, know that DataCamp has a lot of great resources to help. To start, I recommend taking DataCamp’s Introduction to SQL Server course to master the basics of SQL Server for data analysis. You can try our SQL Server Developer career track, which covers everything from transactions and error handling to time series analysis. Our Hierarchical and Recursive Queries in SQL Server course gets right to the heart of how to write advanced queries in SQL Server, including methods involving CTEs.
Other potential issues
Although CTEs are useful for simplifying complex queries, there are some common pitfalls you should be aware of. They include the following:
-
Infinite Loops in Recursive CTEs: If the termination condition for a recursive CTE is not met, it can result in an infinite loop, causing the query to run indefinitely. To avoid the recursive CTE running infinitely, use the
OPTION (MAXRECURSION N)
hint to limit the maximum number of recursive iterations, whereN
is a specified limit. -
Performance Considerations: Recursive CTEs can become resource-intensive if the recursion depth is high or large datasets are being processed. To optimize the performance, limit the data processed in each iteration and ensure appropriate filtering to avoid excessive recursion levels.
When to Use CTEs vs. Other Techniques
While CTEs are appropriate for simplifying queries involving repeated tasks, derived tables, views, and temp tables also serve similar purposes. The following table highlights the advantages and disadvantages of each method and when to use each.
Technique | Advantages | Disadvantages | Suitable Use Case |
---|---|---|---|
CTEs | Temporary scope within a single queryNo storage or maintenance requiredImproves readability by modularizing code | Limited to the query in which they are defined | Organizing complex queries, temporary transformations, and breaking down multi-step operations |
Derived Tables | Simplifies nested subqueriesNo need for permanent storage | Harder to read/maintain for complex queriesIt cannot be reused multiple times within a query | Quick, single-use transformations and aggregations within a query |
Views | Reusable across queriesCan enhance security by restricting data access | Requires maintenance and can affect multiple queriesComplex views can impact performance | Long-term reusable logic and data access control |
Conclusion
Mastering CTEs takes practice, like anything: I recommend trying out DataCamp’s Associate Data Analyst in SQL career track to become a proficient data analyst. The Reporting in SQL course will also help you become proficient in building complex reports and dashboards for effective data presentation. Finally, you should obtain the SQL Associate Certification to showcase your mastery in using SQL to solve business problems and stand out among other professionals.
Become SQL Certified
SQL CTE FAQs
What is a CTE in SQL?
A CTE (common table expression) is a temporary, named result set defined within an SQL query using the WITH
keyword, which is used to simplify complex queries by breaking them into smaller, more manageable parts.
How does a CTE differ from a view?
CTEs are temporary and exist only for the duration of a single query. Views are stored in the database and can be reused across multiple queries. CTEs don’t consume storage space, while views do.
Are CTEs faster than temporary tables?
Not necessarily. CTEs improve readability but may not always perform better than temp tables for large datasets.
Can CTEs be used in INSERT, UPDATE, or DELETE operations?
Yes, CTEs can be used in data modification statements to simplify the process, especially when filtering or joining data is involved.
What is the difference between non-recursive and recursive CTEs?
Non-recursive CTEs do not reference themselves and act similarly to a subquery or temporary table. Non-recursive CTEs simplify complex queries similar to subqueries or temporary tables. Recursive CTEs, on the other hand, reference themselves within the query definition and are used for iterative data processing, such as traversing hierarchical data structures. They are appropriate for tasks that require repeated execution, with each step building on the previous one.
Learn SQL with DataCamp
course
Intermediate SQL
course
Hierarchical and Recursive Queries in SQL Server
cheat-sheet
SQL Basics Cheat Sheet
tutorial
SQL Tutorial: How To Write Better Queries
tutorial
SQL Order of Execution: Understanding How Queries Run
Allan Ouko
5 min
tutorial
How to Use a SQL Alias to Simplify Your Queries
Allan Ouko
9 min
tutorial
SQL Query Examples and Tutorial
code-along
Getting Started in SQL
Kelsey McNeillie