Courses
In relational databases, rows are often interdependent, and answering a complex question often requires a query to look back at the table it is currently processing.
To query such tables, SQL allows you to do correlated subqueries, which define a specific relationship where the inner query depends on the values of the outer query. While a standard subquery runs once and finishes, a correlated subquery is dynamic, executing repeatedly for every single row the main query evaluates.
In this tutorial, I will explain how a correlated subquery works in SQL, its performance considerations, and when it’s the right choice compared to joins and window functions. If you are new to SQL, start with our Introduction to SQL course, or the Intermediate SQL course if you have some experience.
What Is a Correlated Subquery?
A correlated subquery is a type of subquery that depends on values from the outer query to run.
Instead of executing once and returning a fixed result, the subquery is evaluated once for every row processed by the outer query. This happens because the inner query references a column from the outer query, creating a direct link between the two queries.
In comparison, a non-correlated subquery runs independently of the outer query. It executes once, returns a result set or value, and the outer query uses that result without re-running the subquery for each row.
How a Correlated Subquery Works
A typical correlated subquery in SQL has the following workflow:

How a correlated subquery works. Image by Gemini.
- The outer query picks a row: SQL begins scanning the outer query's table and selects the first row.
- Referencing: The inner query pulls a value from that specific row, often using an alias.
- Execution: The inner query runs using that value.
- Filter/update: The result is passed back to the outer query to determine if to include the row.
- Iteration: The process repeats for the next row until the table is finished.
Correlated Subquery Example in SQL
So far, what I’ve been saying is conceptual. The best way to learn is by working through examples.
Example 1: Employees earning above department average
Suppose you have an employees table with employee salaries and department IDs. You want to find employees who earn more than the average salary in their department.
You will use the query below, where:
-
The outer query selects employees from the
employeestable. -
The subquery calculates the average salary for the same department.
-
The condition
e2.department_id = e.department_idreferences the outer query aliase.
-- Fetch employees earning more than the average salary in dept
SELECT
e.employee_id,
e.employee_name,
e.salary,
e.department_id
FROM employees e
WHERE e.salary > (
SELECT AVG(e2.salary) -- Calculate the average salary
FROM employees e2
WHERE e2.department_id = e.department_id
-- Correlation: references the outer query's department_id
);
Example 2: Using EXISTS() with correlated subquery
You can also use the EXISTS() operator with a correlated subquery to check whether related records exist in another table.
Let's say you have records in customers and orders tables. You want to list customers who have placed at least one order. You will use the query below, where:
-
The outer query scans rows in the
customerstable. -
The subquery checks if at least one order exists for that customer.
-
The condition
o.customer_id = c.customer_idlinks the subquery to the outer query.
-- Fetch customers with at least one order
SELECT
c.customer_id,
c.customer_name
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_id = c.customer_id
-- Correlation: references the outer query customer_id
);
In the above query, SQL checks whether a matching row exists in the orders table. If it does, the EXISTS() operator returns true, and the customer is included in the result.
Correlated vs. Non-Correlated Subqueries
As we had learned earlier, subqueries in SQL fall into either non-correlated subqueries or correlated subqueries. The key difference is whether the inner query depends on the outer query.
For a non-correlated subquery, the database executes it once, then uses the result in the outer query.
For example, the query below finds employees who earn more than the overall average salary.
-- Query employees who earn more than the overall average salary
SELECT
employee_id,
employee_name,
salary
FROM employees
WHERE salary > (
SELECT AVG(salary)
FROM employees
);
In the above query, the subquery calculates the average salary for the entire table, and it runs a single time. The outer query then compares each employee’s salary to that one value.
Since non-correlated subqueries run once, they are usually faster when the result can be reused. They are best used for global comparisons, such as overall averages and totals.
However, correlated subqueries can be slower on large tables. They become useful when conditions must be evaluated relative to each row, such as department-level comparisons or existence checks.
I recommend taking our Introduction to SQL Server course to learn more about grouping and data aggregation, and joining tables.
Correlated Subquery vs. JOIN
Many correlated subqueries can be rewritten using JOINs. In relational databases, JOINs perform better because the database can process relationships set-by-set rather than row-by-row.
Consider the query below, using a correlated subquery. This query lists employees who are paid above the average salary within their department
-- Use subquery to fetch employees earning more than the average salary in dept
SELECT
e.employee_id,
e.employee_name,
e.salary,
e.department_id
FROM employees e
WHERE e.salary > (
SELECT AVG(e2.salary)
FROM employees e2
WHERE e2.department_id = e.department_id
);
You can rewrite the query using the JOIN clause to produce the same results.
-- Use JOIN to fetch employees earning more than the average salary in dept
SELECT
e.employee_id,
e.employee_name,
e.salary,
e.department_id
FROM employees e
JOIN (
SELECT
department_id,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
-- Precompute the department average once per department
) dept_avg
ON e.department_id = dept_avg.department_id
-- Match employees with their department averages
WHERE e.salary > dept_avg.avg_salary;
-- Compare salary with the computed department average
The table below summarizes the difference between correlated subqueries and JOIN in SQL.
|
Feature |
Correlated Subquery |
JOIN |
|
Readability |
Often easier to read because the logic is expressed directly in the |
It can be slightly more complex because it may require derived tables or CTEs. |
|
Logic Expression |
Expresses conditions naturally. For example, “salary greater than department average.” |
Requires calculating aggregated values first and then joining them back to the main table. |
|
Execution Behavior |
The subquery may run once for each row in the outer query. |
Aggregated results are typically computed once and reused. |
|
Performance |
It may be slower on large datasets due to repeated execution. |
Usually more efficient for large tables. |
|
Common Use Cases |
Checking row-specific conditions, filtering with |
Reporting queries, aggregations, and performance-sensitive workloads. |
I recommend taking our Joining Data in SQL course to learn the different types of joins in SQL and how to work with different related tables in the database.
Correlated Subquery vs. Window Functions
In modern SQL, Window functions like AVG() and OVER (PARTITION BY) can compute aggregates per row in one scan.
For example, the query below returns employees whose salary is higher than the average salary of their department. Inside the subquery, it uses OVER () to turn the aggregate into a window function and PARTITION BY department_id to divide the table into groups (partitions) by department.
-- Use window function to get employees earning more than dept average salary
SELECT
employee_id,
employee_name,
salary,
department_id
FROM (
SELECT
employee_id,
employee_name,
salary,
department_id,
AVG(salary) OVER (PARTITION BY department_id) AS dept_avg_salary
-- Window function calculates department average once per partition
FROM employees
) t
WHERE salary > dept_avg_salary;
However, correlated subqueries remain useful when you want to use EXISTS() or NOT EXISTS() to test relationships between tables. You may also use correlated subqueries when working in databases or situations where window functions are unavailable.
Correlated Subquery Performance
Although correlated subqueries are powerful, they often come with some performance issues.
Correlated subqueries execute repeatedly
Since the query runs once per outer query row, it can slow queries on large tables by rescanning inner data multiple times. If your outer table has 100,000 rows, the database performs 100,000 sub-tasks.
Bottlenecks
If you don’t do proper optimization, correlated queries can lead to high CPU usage and long wait times, especially if the inner query is performing complex math or scanning large tables.
Indexing columns
Indexing the columns used in the correlation will help the database to find the related row in the subquery almost instantly, rather than scanning the entire inner table every single time.
Query Planner optimization
Modern databases often optimize correlated subqueries internally. The query planner may transform the query into a more efficient form, such as a JOIN or a cached aggregation, and reduce query time significantly.
When to Use a Correlated Subquery
You can use correlated subqueries if you want to perform the following:
-
Filter based on row-specific aggregates: Use it when you need to compare a value relative to each row, such as employees earning above their department average.
-
Checking related data with EXISTS(): You can also use correlated subqueries with
EXISTS()to test whether related rows exist. -
Expressing complex nested logic: Correlated subqueries may help make complex conditions easier to read and express compared to long
JOINchains.
However, avoid using correlated subqueries when:
-
A simple JOIN works: If you can get the same result with a
LEFT JOINorINNER JOIN, use it, as it will always be faster. -
Working with big data: If the correlated condition references large tables without indexes, the repeated evaluations can slow the query significantly.
Common Mistakes with Correlated Subqueries
The following are some of the common issues you may encounter when using correlated subqueries, and how to troubleshoot:
- Forgetting the correlation condition: A correlated subquery must reference a column from the outer query. If this condition is missing, the subquery becomes independent and may produce incorrect results.
- Misunderstanding execution order: Always remember that the outer query runs first and the inner query follows. Reversing this logic in your head can lead to incorrect results.
- Unnecessary nesting: Sometimes you may wrap a simple value in a correlated subquery when a standard subquery would suffice. If the inner query doesn't need the outer row to function, remove the correlation to save performance.
- Ignoring performance impact: Correlated subqueries may work perfectly on small datasets, but become slow as tables grow. To avoid this, always test your queries with realistic data sizes and consider indexing or rewriting the query if performance becomes an issue.
Conclusion
Learning when and how to use correlated subqueries, and when to replace them with other techniques, is an important skill for writing clear, efficient SQL queries.
As a next step, I recommend obtaining our SQL Associate Certification to demonstrate your mastery of using SQL for data analysis and stand out among other data professionals. Finally, I recommend taking our Database Design course, where you will learn to create and manage databases and select the appropriate DBMS for your needs.
FAQs
How is a correlated subquery different from a regular subquery?
A regular (non-correlated) subquery runs independently and usually executes once, while a correlated subquery depends on the outer query and may execute repeatedly for each row.
Are correlated subqueries supported in all SQL databases?
Yes. Correlated subqueries are part of the standard SQL language and are supported by most relational database systems, including PostgreSQL, MySQL, SQL Server, and Oracle.
What happens if I forget the correlation condition?
The subquery becomes non-correlated, running once against all rows and likely producing incorrect results.
Can correlated subqueries always be replaced with JOINs?
Not always, but many correlated subqueries can be rewritten using JOINs or aggregations. JOINs are often preferred for performance on larger datasets.

