Course
When writing SQL queries, we usually follow a certain order. (SQL is known to be written from the inside-out.) However, the SQL engines follow a specific order of execution when compiling queries, which is different from the typical order of writing. Understanding the SQL order of execution is important in mastering query optimization, improving query accuracy and performance, and debugging complex issues, as you will see.
As we get started, I recommend taking DataCamp’s Introduction to SQL course and SQL Fundamentals skill track to learn the basics of SQL and how to extract data using queries. The SQL Basics Cheat Sheet will be a helpful guide for common SQL functions for filtering and aggregating data.
What is SQL Order of Execution?
SQL order of execution refers to the order in which the different clauses in the query are evaluated. It's worth understanding because the execution order is usually different from how we write the SQL queries. To take the most simple example, you might think that in the case of SELECT * FROM database
, the SELECT
is evaluated first, but really the order of execution starts with it’s the FROM
clause.
Here is the SQL order of execution. In the next section, we will go through the steps in detail.
- FROM/JOIN: Specifies the tables from which to retrieve data.
- WHERE: Filters the rows that meet the condition before grouping.
- GROUP BY: Groups rows that share a property.
- HAVING: Filters groups based on conditions, applied after grouping.
- SELECT: Specifies the columns to retrieve or calculate.
- DISTINCT: Removes duplicate rows from the result set.
- ORDER BY: Sorts the result set by specified columns.
- LIMIT: Specifies the maximum number of rows to return.
- OFFSET: Specifies how many rows to skip before starting to return rows.
In this following query, I have added comments to show which are evaluated first.
-- #6+7 SELECT DISTINCT department_id
-- #1 FROM employees
-- #2 JOIN orders ON customers.customer_id = orders.customer_id
-- #3 WHERE salary > 3000
-- #4 GROUP BY department
-- #5 HAVING AVG(salary) > 5000
-- #8 ORDER BY department
-- #9 LIMIT 10 OFFSET 5
-- #10 OFFSET 5 ROWS FETCH NEXT 10 ROWS ONLY;
I have also created an acrostic that might help: For Work Goals and Hurdles Search DataCamp's Organized Learning Opportunities.
Stages of SQL Query Execution
Although most of the SQL queries we write begin with the SELECT
statement, the logical order of execution starts with the FROM
clause. Here, I will write queries to show the query order of execution. Although, keep in mind that these queries are incomplete and won't compile.
FROM clause
SQL queries begin the execution process with the FROM
clause. This is usually the first stage since the database identifies the data sources/table. Where multiple tables are involved, the SQL query will also evaluate the JOIN
condition to combine the tables specified as the data source.
The incomplete query below would first select data from the customers
table using the FROM
clause and the orders
table using the JOIN
clause.
FROM customers
JOIN orders ON customers.customer_id = orders.customer_id;
WHERE clause
The WHERE
clause is executed after the FROM
and JOIN
clauses to filter the rows based on specific conditions. It is important to note that columns defined with aliases in the SELECT
clause cannot be referenced directly in the WHERE
clause since it is processed before the SELECT
.
The query below uses the WHERE
clause to filter records of employees with a bonus
greater than 5000.
FROM employees
JOIN departments ON employees.employee_id = departments.department_id
WHERE salary * 0.1 > 5000;
GROUP BY clause
After filtering the rows, SQL executes the GROUP BY
clause to group the results based on specified columns. This step is usually used with aggregate functions like COUNT()
, SUM()
, and AVG()
to perform some calculations on the specified columns.
The query below first filters out employees with a salary
greater than 3,000
, then groups them by department
and calculates the average salary for each group.
FROM employees e
JOIN departments d ON e.employee_id = d.department_id
WHERE e.salary > 3000
GROUP BY e.name, d.department_name;
HAVING clause
The HAVING
clause is similar to the WHERE
clause but is used to filter grouped data after the GROUP BY
operation. In the query below, SQL groups employees
by department
, calculate the average salary for each group, and then filters out groups where the average salary is less than or equal to 5,000
.
FROM employees
JOIN departments ON employees.employee_id = departments.department_id
WHERE salary > 3000
GROUP BY department_name
HAVING AVG(salary) > 5000;
SELECT clause
The SELECT
clause is where SQL derives the columns or expressions to be returned after executing the preceding steps. You can apply arithmetic operations, aliasing, and aggregate functions in the SELECT
clause.
The following query uses the SELECT
clause to retrieve the name
and bonus
calculated as salary * 0.1
from the employees
table.
SELECT name, salary * 0.1 AS bonus
FROM employees
JOIN orders ON customers.customer_id = orders.customer_id
WHERE salary > 3000
GROUP BY name
HAVING AVG(salary) > 5000;
DISTINCT clause
The DISTINCT
clause is evaluated after the SELECT clause in a query. The DISTINCT
clause is important in removing duplicate records from a table because it returns unique rows. The query below returns each unique department_id
, filtering out duplicates.
SELECT DISTINCT department_id
FROM employees
JOIN orders ON customers.customer_id = orders.customer_id
WHERE salary > 3000
GROUP BY department
HAVING AVG(salary) > 5000;
ORDER BY clause
The ORDER BY
clause sorts out the results set of specific columns or expressions. Unlike the WHERE
clause, the ORDER BY
clause can use column aliases defined in the SELECT
statement.
The query below sorts the bonus
column in descending order. Note that the bonus
was defined in the SELECT
statement as an alias of an expression.
SELECT DISTINCT department_id
FROM employees
JOIN orders ON customers.customer_id = orders.customer_id
WHERE salary > 3000
GROUP BY department
HAVING AVG(salary) > 5000
ORDER BY bonus DESC;
LIMIT/OFFSET clause
The LIMIT
and OFFSET
clauses are usually the last to be executed in a SQL query to restrict the number of rows to return. The LIMIT
clause specifies the maximum number of rows to return, and OFFSET
specifies how many rows to skip before starting to return rows.
The query below retrieves employees' names and salaries, sorts them by salary
in descending order, and limits the output to 10 results while skipping the first 5 rows.
SELECT DISTINCT department_id
FROM employees
JOIN orders ON customers.customer_id = orders.customer_id
WHERE salary > 3000
GROUP BY department
HAVING AVG(salary) > 5000
ORDER BY bonus DESC
LIMIT 10 OFFSET 5;
-- OFFSET 5 ROWS FETCH NEXT 10 ROWS ONLY --SQL SERVER / ORACLE
The LIMIT
and OFFSET
clauses are supported in MySQL and PostgreSQL databases. For SQL Server and Oracle, you use OFFSET
, ROWS FETCH
, and ROWS ONLY
to restrict the number of rows to return from a query.
Check out our tutorial on How to Use SQL OFFSET to learn more about data pagination and database-specific support for OFFSET
and LIMIT
clauses.
SQL Order of Execution vs. Order of Writing
SQL is a declarative language, meaning that the query execution order differs from the written order. So, instead of specifying how to perform a task, you declare what you want, and the database engine decides the best way to achieve it. This method differs from imperative programming languages like Python or Java, where you explicitly write step-by-step instructions for execution.
Understanding the SQL order of execution changes how you think about constructing queries. For example, imagine you write a query to filter rows based on an alias you created in the SELECT
clause:
SELECT price * 0.9 AS discounted_price
FROM products
WHERE discounted_price > 100;
At first glance, this looks logical, but it will throw an error. Why? Because the WHERE
clause is evaluated before the SELECT
clause in SQL's execution order. To fix it, you'd need to use a subquery or HAVING
instead:
SELECT price * 0.9 AS discounted_price
FROM products
HAVING discounted_price > 100;
To learn more about WHERE
and HAVING
specifically, read our tutorial: The Difference Between WHERE and HAVING in SQL.
Common Mistakes and Best Practices
Although the order of query execution is not affected by the order of writing, understanding the execution flow is crucial to avoid common mistakes and improve query performance. The following mistakes are directly linked to misunderstanding SQL’s execution order:
Common mistakes
The following are the common mistakes that might hinder your query performance.
-
Using Column Aliases in the WHERE Clause: Since the
WHERE
clause is executed before theSELECT
clause, trying to use an alias inWHERE
will result in an error. Understanding that SQL evaluatesWHERE
before theSELECT
clause teaches you that you need to repeat the full expression instead of relying on an alias. -
Using HAVING for Row Filtering Instead of WHERE: The
HAVING
clause is executed afterGROUP BY
and is designed for filtering aggregated data. If you're filtering non-aggregated data, it belongs in the WHERE clause. Knowing the difference in execution order between WHERE and HAVING helps you determine where each condition should be placed. -
Incorrect Use of Aggregates in SELECT Without GROUP BY: Since
GROUP BY
is executed beforeHAVING
orSELECT
, failing to group your data before applying an aggregate function will lead to incorrect results or errors. Understanding the execution order clarifies why these two clauses must go together. -
Not Using Aliases Properly in the ORDER BY Clause: Unlike the
WHERE
clause, theORDER BY
clause is evaluated afterSELECT
. This allows you to use aliases created inSELECT
for sorting, helping you avoid confusion by knowing when aliases are available for use.
Best practices
Consider the following best practices to ensure your queries execute as expected.
-
Filter Early with WHERE: Since the
WHERE
clause is executed beforeGROUP BY
andJOIN
, applying filters early reduces the number of rows processed by subsequent clauses, improving query performance. By filtering non-aggregated data as early as possible, you limit the data that needs to be grouped or joined, saving processing time. -
Pre-Aggregate Data Before Joins: Knowing that
FROM
andJOIN
are the first clauses executed, pre-aggregating data using subqueries or common table expressions (CTEs) allows you to shrink the dataset before the join process. This ensures that fewer rows are processed during the join. -
Optimize ORDER BY with Indexes: Since the
ORDER BY
clause is one of the last executed steps, ensuring that the sorted columns are indexed will speed up query performance by helping the database handle sorting operations more efficiently. -
Avoid SELECT * in Production Queries: The
SELECT
clause is executed after filtering, grouping, and aggregating, so specifying only the needed columns minimizes the amount of data retrieved, reducing unnecessary overhead.
Conclusion
Understanding the SQL order of execution is important for writing efficient, accurate, and optimized queries. We have discussed the logical order of query execution in SQL and compared it to the written order. I encourage you to practice writing different queries to better understand the logical execution order. Mastering this concept will greatly improve your ability to troubleshoot and optimize SQL queries.
If you are looking to advance your SQL skills, I recommend trying out DataCamp’s Associate Data Analyst in SQL career track to become a proficient data analyst. The Reporting in SQL course will also help you become proficient in building complex reports and dashboards for effective data presentation. Finally, you should obtain the SQL Associate Certification to showcase your mastery in using SQL to solve business problems and stand out among other professionals.
Become a Data Engineer
SQL Order of Execution FAQs
How does SQL execution differ from the written order?
The SQL execution order usually starts with the FROM
clause followed by clauses like WHERE
and GROUP BY
while the writing order starts with the SELECT
statement.
How does JOIN fit into the order of execution?
The JOIN
operations are executed as part of the FROM
clause.
Can I use column aliases in the WHERE clause?
No, column aliases are defined in the SELECT
clause, executed after the WHERE
clause.
What is the difference between WHERE and HAVING?
WHERE
filters rows before grouping, while HAVING
filters after GROUP BY
and works on aggregated data.
Does SQL execution order impact query performance?
Yes, understanding the execution order allows you to optimize queries by applying filters early and reducing unnecessary operations.