The SQL IN operator is a useful tool for filtering data based on a set of values. In long queries, the IN operator can replace chains of repeated conditions, which simplifies the query and improves readability.
If you are an aspiring data practitioner, I recommend taking DataCamp’s Introduction to SQL and SQL Fundamentals courses to learn the basics of applying SQL to filter data. For a quick refresher, check out our SQL Basics Cheat Sheet.
What is the SQL IN Operator?
The SQL IN operator filters rows by checking whether a column's value matches any value in a specified list. It refines query results by returning only the rows that meet this condition.
The IN operator is placed in the WHERE clause, where it acts as shorthand for several equality conditions combined with OR. It is therefore useful whenever a column should match one of several possible values.
-- Select the employee ID, employee name, and department columns
SELECT employee_id, employee_name, department
FROM employees
-- Filter the results to include only employees whose department is 'Sales' or 'HR'
WHERE department IN ('Sales', 'HR');
In the example above, we select the employee_id, employee_name, and department columns from the employees table. The query returns only the rows where the employee's department is Sales or HR.
The IN operator can filter different SQL data types, including strings, numbers, and dates. Always ensure the values in the list match the data type of the column being filtered.
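As a minimal sketch of filtering non-string types, the snippet below runs IN against a numeric column and a date column. It uses Python's built-in sqlite3 module only so the example is self-contained; the table contents and the hire_date column are assumptions made up for illustration, and SQLite stores the dates as ISO-8601 text.

```python
import sqlite3

# Hypothetical in-memory table used only for illustration
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER, employee_name TEXT, department TEXT, hire_date TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Ana", "Sales", "2021-03-15"),
        (2, "Ben", "HR", "2022-07-01"),
        (3, "Cleo", "IT", "2021-03-15"),
        (4, "Dev", "Finance", "2023-01-10"),
    ],
)

# Numeric list: match employee_id against integers
by_id = conn.execute(
    "SELECT employee_name FROM employees "
    "WHERE employee_id IN (1, 3) ORDER BY employee_id"
).fetchall()

# Date list: the dates are stored as ISO-8601 text, so the list uses strings
by_date = conn.execute(
    "SELECT employee_name FROM employees "
    "WHERE hire_date IN ('2021-03-15', '2023-01-10') ORDER BY employee_id"
).fetchall()

print(by_id)    # [('Ana',), ('Cleo',)]
print(by_date)  # [('Ana',), ('Cleo',), ('Dev',)]
```

In databases with a native DATE type, the same pattern applies, but the list values should be written in whatever date literal format that database expects.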
Syntax of the SQL IN Operator
The IN operator has a simple syntax. It is used in the WHERE clause to compare a column against a list of values. The overall syntax for the IN operator is shown below.
-- Select the columns you want to retrieve
SELECT column_name(s)
-- Specify the table from which to retrieve the data
FROM table_name
-- Filter the results to include only those rows where the column's value matches any value in the specified list
WHERE column_name IN (value1, value2, ...);
The following is a practical example of including the IN operator in the WHERE clause.
-- Select the employee ID, employee name, and department columns
SELECT employee_id, employee_name, department
FROM employees
-- Filter the results to include only employees whose department is 'Sales' or 'HR'
WHERE department IN ('Sales', 'HR');
Important Points to Remember
The IN operator is typically used in the WHERE clause of a SELECT statement. This placement allows the query to filter rows against multiple values at once. Compared with a chain of OR conditions, the IN operator keeps the query shorter and easier to read.
The IN operator is standard across different databases, including SQL Server, PostgreSQL, MySQL, and Oracle databases. SQL keywords are not case-sensitive, but writing the IN operator in uppercase is a common convention that improves readability.
Common Use Cases of the IN Operator
The SQL IN operator is used for different data analysis applications. This section will explore some of the most common use cases, including filtering data, handling multiple values, and utilizing subqueries.
Filtering data
The IN operator filters records and returns those that match a set of specified values. For example, the query below selects the employee_id, employee_name, and department columns from the employees table and returns the rows for employees in the Sales, HR, and IT departments.
-- Select the employee ID, employee name, and department columns from the employees table
SELECT employee_id, employee_name, department
FROM employees
-- Filter to include only 'Sales', 'HR', and 'IT' departments
WHERE department IN ('Sales', 'HR', 'IT');
Handling multiple values
The IN operator is also used to match a column against multiple values. Without it, the same filter has to be written as a series of OR conditions, one per value. The IN operator expresses the same logic in a single condition, which improves query readability.
Below is how the query would look with the OR operator instead of the IN operator.
-- Main query to select employee_id and name from the employees table
SELECT employee_id, name
FROM employees
-- Filter employees based on multiple department names using OR conditions
WHERE department = 'Sales'
OR department = 'Marketing'
OR department = 'HR';
Here is the query using the IN operator. Note how removing the repeated OR conditions makes the query shorter and easier to read.
-- Main query to select employee_id and name from the employees table
SELECT employee_id, name
FROM employees
-- Filter employees based on a list of department names using the IN operator
WHERE department IN ('Sales', 'Marketing', 'HR');
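To check that the two forms are equivalent, the sketch below runs both queries against a small in-memory SQLite database through Python's built-in sqlite3 module. The sample rows are assumptions made up for illustration, and an ORDER BY is added only to make the comparison deterministic.

```python
import sqlite3

# Hypothetical sample data for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ana", "Sales"), (2, "Ben", "HR"), (3, "Cleo", "IT"), (4, "Dev", "Marketing")],
)

# Filter written as a chain of OR conditions
or_query = """
    SELECT employee_id, name FROM employees
    WHERE department = 'Sales' OR department = 'Marketing' OR department = 'HR'
    ORDER BY employee_id
"""
# Same filter written with the IN operator
in_query = """
    SELECT employee_id, name FROM employees
    WHERE department IN ('Sales', 'Marketing', 'HR')
    ORDER BY employee_id
"""

or_rows = conn.execute(or_query).fetchall()
in_rows = conn.execute(in_query).fetchall()

print(or_rows == in_rows)  # True
print(in_rows)             # [(1, 'Ana'), (2, 'Ben'), (4, 'Dev')]
```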
Subqueries
You can also use the IN operator with a subquery, so that the list of values comes from another query instead of being typed by hand. This lets you filter records based on related data in another table. In the example below, the subquery selects the department_id of the Sales department from the departments table. The main query then returns the employee_id and name of every row in the employees table whose department_id appears in the subquery's result.
-- Main query to select employee_id and name from the employees table
SELECT employee_id, name
FROM employees
-- Filter employees based on department_id using a subquery
WHERE department_id IN (
    -- Subquery to get department_id for the "Sales" department
    SELECT department_id
    FROM departments
    WHERE department_name = 'Sales'
);
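The following is a minimal runnable sketch of the subquery pattern above, using Python's built-in sqlite3 module. The rows and department IDs are assumptions invented for the example.

```python
import sqlite3

# Hypothetical tables and rows for illustration
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER, department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER, name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'Sales'), (20, 'HR');
    INSERT INTO employees VALUES (1, 'Ana', 10), (2, 'Ben', 20), (3, 'Cleo', 10);
""")

# The subquery produces the list of department IDs that IN compares against
subquery_rows = conn.execute("""
    SELECT employee_id, name
    FROM employees
    WHERE department_id IN (
        SELECT department_id
        FROM departments
        WHERE department_name = 'Sales'
    )
    ORDER BY employee_id
""").fetchall()

print(subquery_rows)  # [(1, 'Ana'), (3, 'Cleo')]
```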
Limitations and Alternatives
Although the IN operator is simple to use in SQL queries, there are some limitations in its applications, which include the following:
- Performance Issues with Large Lists: Using the IN operator with very large lists of values can slow down query execution.
- Handling NULL values: The IN operator can return unexpected results when NULL values are involved. Any comparison with NULL evaluates to unknown rather than true, so rows where the column is NULL are never matched, even if NULL appears in the value list, and NOT IN returns no rows when the list contains a NULL.
- Limited Flexibility: The IN operator usually compares a single column against a list. Filtering on a combination of several columns is less convenient, and support for multi-column comparisons varies between databases.
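The NULL behavior is easiest to see in a small experiment. The sketch below uses Python's built-in sqlite3 module with made-up rows; the ids helper is a hypothetical convenience function written for this example, not part of any library. Other databases that follow standard three-valued logic should behave the same way, but this has only been checked here against SQLite.

```python
import sqlite3

# Hypothetical sample data: employee 3 has no department (NULL)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, department TEXT);
    INSERT INTO employees VALUES (1, 'Sales'), (2, 'HR'), (3, NULL);
""")

def ids(where_clause):
    """Return the employee IDs matching the given WHERE clause (illustration only)."""
    sql = f"SELECT employee_id FROM employees WHERE {where_clause} ORDER BY employee_id"
    return [row[0] for row in conn.execute(sql)]

# NULL in the list does not match the row whose department is NULL
in_with_null = ids("department IN ('Sales', NULL)")

# NOT IN with a NULL in the list never evaluates to true, so no rows come back
not_in_with_null = ids("department NOT IN ('Sales', NULL)")

# NULL rows have to be requested explicitly with IS NULL
explicit_null = ids("department IN ('Sales') OR department IS NULL")

print(in_with_null)      # [1]
print(not_in_with_null)  # []
print(explicit_null)     # [1, 3]
```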
The alternatives for avoiding the limitations of the IN operator include:
Using the EXISTS operator
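The EXISTS operator checks whether a subquery returns at least one row. Because the database can stop evaluating the subquery as soon as a match is found, EXISTS is often efficient for correlated subqueries, although the actual performance depends on the database's query optimizer.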
-- Select employee ID and name for employees in departments located in New York using EXISTS
SELECT employee_id, employee_name
FROM employees e
-- Check if there exists a row in the departments table
WHERE EXISTS (
    -- Select 1 is used here as a placeholder; it simply checks for existence
    SELECT 1
    FROM departments d
    -- Match the department_id in employees with department_id in departments
    WHERE d.department_id = e.department_id
    -- Filter for departments located in New York
    AND d.location = 'New York'
);
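As a rough check that EXISTS can stand in for IN here, the sketch below runs both forms against the same made-up data using Python's built-in sqlite3 module. The rows are assumptions for illustration, and the comparison says nothing about performance, only about the rows returned.

```python
import sqlite3

# Hypothetical tables and rows for illustration
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER, location TEXT);
    CREATE TABLE employees (employee_id INTEGER, employee_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'New York'), (20, 'Chicago'), (30, 'New York');
    INSERT INTO employees VALUES (1, 'Ana', 10), (2, 'Ben', 20), (3, 'Cleo', 30), (4, 'Dev', NULL);
""")

# Correlated EXISTS: keep an employee if a matching New York department exists
exists_rows = conn.execute("""
    SELECT employee_id, employee_name
    FROM employees e
    WHERE EXISTS (
        SELECT 1
        FROM departments d
        WHERE d.department_id = e.department_id
        AND d.location = 'New York'
    )
    ORDER BY employee_id
""").fetchall()

# Equivalent filter written with IN and an uncorrelated subquery
in_rows = conn.execute("""
    SELECT employee_id, employee_name
    FROM employees
    WHERE department_id IN (
        SELECT department_id FROM departments WHERE location = 'New York'
    )
    ORDER BY employee_id
""").fetchall()

print(exists_rows)             # [(1, 'Ana'), (3, 'Cleo')]
print(exists_rows == in_rows)  # True
```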
Using JOIN
The SQL JOIN clause is efficient when querying data with complex filtering and useful when filtering data from multiple related tables or columns. In the example below, the JOIN clause has been used to join the employees and departments tables on department_id to return the employees whose department is located in New York.
-- Select employee ID and name for employees in departments located in New York using JOIN
SELECT e.employee_id, e.employee_name
FROM employees e
-- Join the employees table with the departments table on department_id
JOIN departments d ON e.department_id = d.department_id
-- Filter the results to include only departments located in New York
WHERE d.location = 'New York';
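One difference worth keeping in mind is that a JOIN returns one row per matching pair, while IN only tests membership. The sketch below, using Python's built-in sqlite3 module, assumes a hypothetical departments table in which department 10 appears twice (for example, two New York offices) to make the difference visible. With a unique department_id, both queries would return the same rows.

```python
import sqlite3

# Hypothetical data: department 10 has two rows, both located in New York
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER, location TEXT);
    CREATE TABLE employees (employee_id INTEGER, employee_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'New York'), (10, 'New York'), (20, 'Chicago');
    INSERT INTO employees VALUES (1, 'Ana', 10), (2, 'Ben', 20);
""")

# JOIN produces one output row for every matching department row
join_rows = conn.execute("""
    SELECT e.employee_id, e.employee_name
    FROM employees e
    JOIN departments d ON e.department_id = d.department_id
    WHERE d.location = 'New York'
    ORDER BY e.employee_id
""").fetchall()

# IN returns each employee at most once, however many departments match
in_rows = conn.execute("""
    SELECT employee_id, employee_name
    FROM employees
    WHERE department_id IN (
        SELECT department_id FROM departments WHERE location = 'New York'
    )
    ORDER BY employee_id
""").fetchall()

print(join_rows)  # [(1, 'Ana'), (1, 'Ana')]
print(in_rows)    # [(1, 'Ana')]
```

If duplicates like these are possible in your data, adding DISTINCT to the JOIN query is one way to get the same result as the IN version.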
Using Common Table Expressions (CTEs) or temporary tables
If the list of values is long or is derived from another table, consider defining it in a CTE, or loading it into a temporary table, before querying. This keeps the main query readable and, depending on the database, can also help performance.
-- Define a CTE (Common Table Expression) to store the list of department IDs located in New York
WITH DepartmentList AS (
    SELECT department_id
    FROM departments
    WHERE location = 'New York'
)
-- Select employee ID and name for employees who belong to the departments listed in the CTE
SELECT e.employee_id, e.employee_name
FROM employees e
-- Filter the results to include only employees whose department_id is in the list of department IDs from the CTE
WHERE e.department_id IN (SELECT department_id FROM DepartmentList);
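The temporary table variant mentioned in the heading can be sketched in the same way. The example below uses Python's built-in sqlite3 module; the temporary table name wanted_departments and all rows are assumptions made up for illustration. It loads a list of IDs into a temporary table, filters with IN against that table, and compares the result with the CTE query above.

```python
import sqlite3

# Hypothetical tables and rows for illustration
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER, location TEXT);
    CREATE TABLE employees (employee_id INTEGER, employee_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'New York'), (20, 'Chicago'), (30, 'New York');
    INSERT INTO employees VALUES (1, 'Ana', 10), (2, 'Ben', 20), (3, 'Cleo', 30);
""")

# CTE version: the list of department IDs is defined inline, then used with IN
cte_rows = conn.execute("""
    WITH DepartmentList AS (
        SELECT department_id
        FROM departments
        WHERE location = 'New York'
    )
    SELECT e.employee_id, e.employee_name
    FROM employees e
    WHERE e.department_id IN (SELECT department_id FROM DepartmentList)
    ORDER BY e.employee_id
""").fetchall()

# Temporary table version: load an externally supplied list of IDs once
conn.execute("CREATE TEMP TABLE wanted_departments (department_id INTEGER)")
conn.executemany("INSERT INTO wanted_departments VALUES (?)", [(10,), (30,)])

temp_rows = conn.execute("""
    SELECT e.employee_id, e.employee_name
    FROM employees e
    WHERE e.department_id IN (SELECT department_id FROM wanted_departments)
    ORDER BY e.employee_id
""").fetchall()

print(cte_rows)               # [(1, 'Ana'), (3, 'Cleo')]
print(cte_rows == temp_rows)  # True
```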
To learn other advanced techniques for filtering data, I recommend taking DataCamp’s Associate Data Analyst in SQL career track. You can also check DataCamp’s Intermediate SQL Server course, which includes a chapter on common table expressions and other advanced topics.
Conclusion
Understanding the SQL IN operator is important for a data practitioner, as it allows you to filter data concisely during analysis. To enhance your SQL skills in the application of the IN operator, I encourage you to practice with different datasets for analysis.
If you want to advance your SQL knowledge to an intermediate level, I recommend the Reporting in SQL course to explore other complex querying techniques. Moreover, the SQL Associate Certification will help you stand out among other data analysts by showing your practical knowledge of using SQL for advanced analysis.
Frequently Asked Questions
Can I use the IN operator to filter different data types?
Yes. The IN operator filters different SQL data types, including strings, numbers, and dates.
Can you use the IN operator in subqueries?
Yes. The IN operator can take a subquery as its list of values, which lets the main query filter on the results of another query.
What are the alternatives to the IN operator?
The EXISTS operator, the JOIN clause, and Common Table Expressions (CTEs) are the main alternatives to the IN operator.
What databases are compatible with the IN operator?
Most common databases support the IN operator, including SQL Server, MySQL, PostgreSQL, and Oracle databases.