Skip to main content
HomeTutorialsSQL

QUALIFY: The SQL Filtering Statement You Never Knew You Needed

Learn about the SQL QUALIFY clause, an essential yet lesser-known filtering method in SQL. Understand its syntax, uses, and how it differs from other SQL filtering methods.
Jun 5, 2023  · 8 min read

Structured Query Language (SQL) is one of the most important programming languages in the world. This statement is particularly true for those who work with data since it’s the language of choice when communicating with databases.

As more people turn to data to drive their business, demand for SQL expertise has only increased over the years. Fortunately, the language is easy-to-learn and well established, which means it has a large community of users to support you if you get stuck.

In this article, we will cover a specific clause from the SQL language called QUALIFY. By the end of this article, you will know:

  • The basic concept and syntax for QUALIFY
  • When to use the QUALIFY clause
  • How it differs from other filtering methods.

Basic Concept and Syntax of the SQL QUALIFY Clause

QUALIFY is a clause used to filter the results of a window function. Therefore, to successfully use the QUALIFY clause, there must be at least one WINDOW function in the SELECT list or QUALIFY clause – only the rows where the boolean expression evaluates to TRUE will be returned.

Tip: Check out the SQL Window Functions Cheat Sheet.

If there was no QUALIFY clause, filtering on the results of a WINDOW function would require nesting. SQL developers typically recommend against using nested queries as much as possible because they make the code less human-readable while increasing the complexity of debugging and collaboration.

In other words, the main idea behind the QUALIFY clause was to simplify queries that require filtering on the result of window functions; this means QUALIFY is evaluated after window functions are computed.

Here’s the typical order of execution for a query with a QUALIFY statement clause:

  • FROM
  • WHERE
  • GROUP BY and Aggregation
  • HAVING
  • WINDOW
  • QUALIFY
  • DISTINCT
  • ORDER BY
  • LIMIT

The general syntax of a QUALIFY statement is as follows:

QUALIFY <predicate>

In its general form, it would look something like this:

SELECT <column_list>
  FROM <data_source>
  [GROUP BY ...]
  [HAVING ...]
  QUALIFY <predicate>
  [ ... ]

Note: <predicate> is an expression used to filter the result after aggregations and window functions are computed.

When to Use SQL QUALIFY

The QUALIFY statement clause is much like the HAVING clause in the sense that it avoids the need for a subquery to perform filtering. For example, you may use QUALIFY to filter the results of an analytic function – a function used to calculate an aggregate value based on a group of rows.

To make it clearer when exactly you would use QUALIFY, here is a short scenario:

Let’s say you’re working as a data scientist, and a stakeholder asks you for each customer's last login information. Your initial query may look something like this:

SELECT 
  user_id, 
  ip, 
  country_code,
  os,
  RANK() over (
    PARTITION BY user_id ORDER BY log_datetime DESC
  ) as previous_logins
FROM login_logs
WHERE TRUE 

This is a good start, but it does not fulfill the stakeholder's requests as it will return all of the logins and not just their most recent one. To return the last login, we need to add a filter.

Newcomers to SQL may fall into the trap of attempting to filter this using the WHERE statement clause, as follows:

SELECT 
  user_id, 
  ip, 
  country_code,
  os,
  RANK() over (
    PARTITION BY user_id ORDER BY log_datetime DESC
  ) AS previous_logins
FROM login_logs
WHERE TRUE 
  AND last_login = 1

Unfortunately, this code will not execute; the same is true if you attempt GROUP BY and HAVING.

The reason this occurs is all down to the order of operations we discussed in the previous section.

WHERE, GROUP BY, and HAVING are all evaluated before WINDOW functions; this means they can not filter WINDOW functions because they don’t know they exist – the WINDOW function has not been evaluated at the time those three clauses are run.

One way to make the filter come after the WINDOW function is to use a Common Table Expression (CTE).

Here’s how our code will look:

WITH 
logins AS (
  SELECT 
      user_id, 
      ip, 
      country_code,
      os,
      RANK() OVER (
        PARTITION BY user_id ORDER BY log_datetime DESC
      ) AS previous_logins
  FROM login_logs
  WHERE TRUE
)

SELECT user_id, ip, country_code, os 
FROM logins
WHERE previous_logins = 1

Technically speaking, this code is valid – It will execute perfectly fine and return the results you’re stakeholder is after.

However, we had to introduce two queries, and there are some extra lines of code, which can become redundant if you’re repeating this process regularly.

The best workaround for this problem is to use QUALIFY.

Here’s how the code will look:

-- Starter code from @Jiho Choi on StackOverflow 
SELECT 
  user_id, 
  ip, 
  country_code,
  os,
  RANK() over (
    PARTITION BY user_id ORDER BY log_datetime DESC
  ) as previous_logins
FROM login_logs
WHERE TRUE 
QUALIFY previous_logins = 1

This works solution works because QUALIFY clauses are evaluated after WINDOW functions in SQLs order of operations, which means they are aware of their existence so they can filter them in the same query.

You may be wondering what the benefits QUALIFY are other than writing fewer lines of code, and the answer is that there aren’t really any performance benefits.

The CTE solution we used and the QUALIFY solution both execute in similar timeframes, so we can not say there’s much of a performance enhancement from the QUALIFY solution.

The main benefit is the quality of life improvement; there’s less code, and easier to read.

Become a Data Engineer

Become a data engineer through advanced Python learning
Start Learning for Free

Comparing SQL QUALIFY With Other Filtering Methods

By now, you understand the QUALIFY statement clause is another filtering method available in SQL. You also know the best time to use the QUALIFY clause.

But what’s the difference between QUALIFY and the other filtering methods?

Let’s recap each filtering method.

The WHERE clause

WHERE is used to filter records in a table; this means it implements row operations. We use it when we want to extract the records that meet a specific condition. For example, we may want to filter a dataset by someone's age. Here’s how it will look in code:

SELECT name, gender, height, weight, age 
FROM gym_members_info 
WHERE age < 23; 

Here are the main things to note about WHERE clause.

  1. It’s used to filter records based on a specific condition.
  2. Operations are implemented on the rows.
  3. You can use WHERE clause without a GROUP BY clause
  4. WHERE can be used with SELECT, UPDATE, DELETE statements.

Let’s compare this to the HAVING clause.

The HAVING clause

HAVING is used to filter records from groups defined by a GROUP BY clause based on a specific condition. Thus, the most telling difference between WHERE and HAVING is that HAVING requires a GROUP BY clause to be present for it to be evaluated successfully.

The main benefit of this is you can apply your HAVING clause to a subset of aggregated groups, which is not possible in a WHERE block.

For example:

SELECT gender, AVG(height) 
FROM gym_members_info
GROUP BY gender 
HAVING AVG(height) > 170 

The code above will return the gender groups where the average height is greater than 170 centimeters; in other words, HAVING implements a column operation.

Another thing to be aware of is that HAVING can only be used with the SELECT statement, unlike WHERE.

The QUALIFY clause

QUALIFY is the equivalent of HAVING except it performs filters on the results of WINDOW functions, whereas HAVING performs filters on aggregate functions and GROUP BY clauses – learn more about How to use GROUP BY.

If we wanted to know information about the tallest person within each age group, we could use a WINDOW function to perform the calculation and a QUALIFY clause to filter the results so we only discover the tallest person from each age group.

Here’s how it would look in code:

SELECT name, gender, height, weight, age 
RANK() over (
    PARTITION BY age ORDER BY height DESC
  ) as ranked_ages
FROM gym_members_info
WHERE TRUE
QUALIFY ranked_ages = 1

Essentially, the QUALIFY clause specifies a conditional expression used to filter the output of an ordered analytical function that has already been computed in accordance with user-specified criteria.

Conclusion and Further Study

SQL plays a major role in the typical data science workflow; knowing how to implement useful operations in SQL, like the QUALIFY clause, is extremely helpful as it enables you to write cleaner code, which makes it more pleasant for others to collaborate with you.

If you would like to master your SQL skills, we recommend you check out the resources below:

Become a Data Engineer

Prove your skills as a job-ready data engineer.

Photo of Kurtis Pykes
Author
Kurtis Pykes
LinkedIn
Topics

Start your SQL journey today!

Course

Introduction to SQL

2 hr
664.3K
Learn how to create and query relational databases using SQL in just two hours.
See DetailsRight Arrow
Start Course
See MoreRight Arrow
Related

tutorial

Introduction to the Where Clause in SQL

In this tutorial, you will be introduced to filtering rows in SQL using the where clause.
Sayak Paul's photo

Sayak Paul

7 min

tutorial

How to Best Use the SQL LIMIT Clause

Learn the application of the SQL LIMIT clause to filter data. Master the use of the LIMIT clause in PostgreSQL and MySQL databases.
Allan Ouko's photo

Allan Ouko

8 min

tutorial

How to Use the SQL IN Operator: A Guide to Efficient Filtering Techniques

The SQL IN operator allows you to filter query results to include only rows that meet specified conditions. Explore its full potential and alternatives.
Allan Ouko's photo

Allan Ouko

8 min

tutorial

SQL: Reporting and Analysis

Master SQL for Data Reporting & daily data analysis by learning how to select, filter & sort data, customize output, & how you can report aggregated data from a database!
Hafsa Jabeen's photo

Hafsa Jabeen

37 min

tutorial

How to Use the SQL EXISTS() Operator

Learn how to use the SQL EXISTS() operator for subquery evaluation and filtering, complete with examples, best practices, and tips for optimizing your queries.
Allan Ouko's photo

Allan Ouko

10 min

tutorial

SQL Tutorial: How To Write Better Queries

Learn about anti-patterns, execution plans, time complexity, query tuning, and optimization in SQL.
Karlijn Willems's photo

Karlijn Willems

35 min

See MoreSee More