Course
Like it or not, the LIKE operator is essential in SQL. It gives data scientists and data engineers the power to filter data on specific string matches. This article provides a quick tutorial on LIKE for beginners and intermediates.
The SQL LIKE Operator for Pattern Matching
Note: To easily run all the example code in this tutorial yourself, you can create a free DataLab workbook with SQL installed and databases with sample data.
Associate Data Engineer in SQL
Suppose you have an employees table and would like to find all names that start with ‘A’:
|
employees |
|||||
|
emp_no |
birth_date |
first_name |
last_name |
gender |
hire_date |
|
10001 |
1953-09-02T00:00:00.000Z |
Georgi |
Facello |
M |
1986-06-26T00:00:00.000Z |
|
10002 |
1964-06-02T00:00:00.000Z |
Bezalel |
Simmel |
F |
1985-11-21T00:00:00.000Z |
|
10003 |
1959-12-03T00:00:00.000Z |
Parto |
Bamford |
M |
1986-08-28T00:00:00.000Z |
You could spend time looking through the table manually. But why would you do that when you have the LIKE operator?
SELECT DISTINCT
first_name
FROM employees
WHERE first_name LIKE 'A%'
The magic here is in the clause `WHERE first_name LIKE ‘A%’`, which means “find all first_name starting with "A" and ending with any number of characters.” The `A%` here is known as a pattern for matching.
The `%` is not the only wildcard you can use in conjunction with the LIKE operator. You could use the underscore sign `_` , too:
`%`matches any number of characters.`_`matches any single character.
The syntax is easy to remember. It is simply:
column_name LIKE pattern
You can use LIKE to achieve a variety of pattern-matching. Here’s how:
Beginner SQL LIKE Examples
Below, we’ve outlined some practical examples of how you can use the LIKE statement and the results from our sample data set.
1. Use LIKE for exact string match
If you’d like to perform an exact string match, use LIKE without ‘%’ or ‘_’
SELECT
first_name, last_name
FROM employees
WHERE first_name LIKE 'Barry' -- the same as WHERE first_name = ‘Barry’
2. Use ‘%’ to match any number of characters
‘%’ can be used to match any (even zero) number of characters – a number, an alphabet, or a symbol.
Suppose you want to find all employees whose name starts with ‘Adam’; you can use the pattern ‘Adam%’
SELECT DISTINCT
first_name
FROM employees
WHERE first_name LIKE 'Adam%'
To find names that end with "z" try the pattern ‘%z’. You can also use multiple ‘%’ in one pattern. For example, if you want to find names that contain "z", use ‘%z%’.
3. Use ‘_’ to match one (and only one) character
Like the game Hangman, the underscore sign _ can only fit one character.
How many ways are there to spell Le_n? The pattern would match anything from "Lexn", "LeAn", "Le3n", or "Le-n".
SELECT DISTINCT
first_name
FROM employees
WHERE first_name LIKE 'Le_n'
What are the different names with only three characters? We can find out using three consecutive underscores ___ as the pattern.
SELECT DISTINCT
first_name
FROM employees
WHERE first_name LIKE '___'
4. Use both ‘%’ and ‘_’ to match any pattern
Of course, you can use both ‘%’ and ‘_’ to create interesting patterns.
SELECT DISTINCT
first_name
FROM employees
WHERE first_name LIKE '%ann_'
The pattern ‘%ann_’ matches a string that starts with any number of characters and ends with "ann" and one other character.
5. Use NOT to find strings that do not match a pattern
What if you want to find all rows that do not match a specific criterion? You can use the NOT LIKE operator. For example, to find all titles except for "Staff", we can use the syntax
`WHERE title NOT LIKE ‘Staff’`
This is exactly equivalent to the syntax
`WHERE title != ‘Staff’`
SELECT DISTINCT
title
FROM titles
WHERE title NOT LIKE 'Staff'
Of course, you can use NOT LIKE with any of the patterns we described.
SELECT DISTINCT
title
FROM titles
WHERE title NOT LIKE '%engineer'
6. Use LOWER (or UPPER) with LIKE for case-insensitive pattern matching
You can use the following syntax if you need to perform pattern matching but aren’t sure if the string is stored in lowercase, uppercase, or mixed case.
`LOWER(column_name) LIKE pattern`
The function LOWER() returns all strings in lowercase, regardless of whether they are stored as upper-, lower-, or mixed case.
When using the syntax, make sure you spell the pattern in all lowercase! Else, you might not get any matches.
You could replace LOWER with UPPER in the syntax above, too. Be sure to spell out the pattern in capital letters.
`UPPER(column_name) LIKE PATTERN`
For example, to find out if an employee’s name is "Joanne", "JoAnne", "Joanna", or "JoAnna", try either of the following:
SELECT DISTINCT
first_name
FROM employees
WHERE lower(first_name) LIKE 'joann_'
SELECT DISTINCT
first_name
FROM employees
WHERE UPPER(first_name) LIKE 'JOANN_'
7. SQL LIKE with multiple values using OR/AND
You can combine multiple conditions using the LIKE syntax, too.
For example, use the OR condition to find results that satisfy at least one out of multiple LIKE patterns.
SELECT DISTINCT
first_name
FROM employees
WHERE
first_name LIKE 'Ad_l' OR
first_name LIKE 'Ad_m'
On the other hand, to find a string that matches multiple LIKE conditions, use the AND keyword.
SELECT DISTINCT
first_name
FROM employees
WHERE
first_name LIKE '%am%' AND
first_name LIKE '%me%'
The LIKE syntax can be applied to multiple columns as long as their variable type is a variable-length character (varchar). Knowing that we can find the names of employees whose initials are "Z. Z."
SELECT DISTINCT
first_name, last_name
FROM employees
WHERE
first_name LIKE 'Z%' AND
last_name LIKE 'Z%'
8. Use LIKE in the SELECT CASE WHEN clause
Thus far, we have focused on using LIKE as a condition for selecting records in the WHERE clause. We also use LIKE in the SELECT clause. For example, can we find out how many employees named "Adam" are in our database?
SELECT
COUNT(CASE WHEN first_name LIKE 'Adam' THEN 1 END) num_employees_adam
FROM employees
On the other hand, how many employees have the initials "A.Z."?
SELECT
COUNT(CASE WHEN first_name LIKE 'A%' AND last_name LIKE 'Z%' THEN 1 END) num_employees
FROM employees
Common patterns used in SQL LIKE
Here's a summary of the patterns we discussed for quick reference:
| Pattern | Description | Example use case |
|---|---|---|
| A% | Matches strings starting with "A" | Finding names starting with "A" |
| %z% | Matches strings containing "z" | Finding names containing "z" |
| Le_n | Matches strings like "Len", "Leon", etc. | Finding names with a single character variation |
| %ann_ | Matches strings ending with "ann" and one additional character | Finding names like "Joann", "Joanna" |
Intermediate Examples of SQL LIKE
The LIKE function is largely similar across different flavors of SQL (e.g. PostgreSQL, MySQL, Redshift, etc.). However some have additional variations of the LIKE function that are worth mentioning.
1. The ILIKE operator
Available in Redshift and PostgreSQL, ILIKE is the case-insensitive version of LIKE. As such, all of the following are equivalent.
SELECT
datacamp ILIKE ‘datacamp’, -- returns TRUE
DATACAMP ILIKE ‘datacamp’, -- returns TRUE
Datacamp ILIKE ‘datacamp’, -- returns TRUE
datacamp ILIKE ‘DataCamp’, -- returns TRUE
2. Using square brackets [] and [^] as wildcard characters
Users of T-SQL or SQL Server have additional wildcard characters for more complex pattern matching.
The square bracket syntax [] matches any specific character within the range or set. For example, the following will all return TRUE.
SELECT
‘Carson’ LIKE ‘[C-K]arson’, -- returns TRUE because C is in the range C-K
‘Karson’ LIKE ‘[C-K]arson’, -- returns TRUE because K is in range
‘Larson’ LIKE ‘[CKL]arson’, -- returns TRUE because L is in the set [CKL]
‘Parson’ LIKE ‘[C-K]arson’ -- returns FALSE because P is out of range
The caret-in-square-bracket [^] wildcard matches any single character that is not within the specified range or set. Can you see why the following results are such?
SELECT
‘Carson’ LIKE ‘[^C-K]arson’ -- returns FALSE
‘Parson’ LIKE ‘[^C-K]arson’ -- returns TRUE
Here, since "C" is within the range of [C-K], the pattern ‘C’ will not match [^C-K]. Thus, "Carson" will not match [^C-K]arson, but "Parson" will.
3. The RLIKE operator
Available in MySQL, the RLIKE operator recognizes regular expressions (RegEx) in the pattern. RegEx is a powerful and versatile tool for advanced pattern matching.
It does not hurt to have a basic understanding of RegEx, especially if your SQL flavor supports RLIKE. You can learn more about RegEx with our Regular Expressions in Python course.
SELECT DISTINCT
first_name
FROM employees
WHERE first_name RLIKE 'Susann[a-e]'
4. The ‘~~’ operator
In PostgreSQL, ‘~~’ is completely synonymous with LIKE. There are also equivalents of ILIKE, NOT LIKE, and NOT ILIKE.
|
Operator |
Equivalent |
|
|
|
|
|
|
|
|
|
|
|
|
Troubleshooting: Common Errors and Mistakes with LIKE
Here are some common errors you may run into when using LIKE, and how to correct them:
- No results returned: Check for case sensitivity. In some SQL dialects,
LIKEis case-sensitive by default. UseLOWER()orILIKE(if supported) for case-insensitive matching. - Unexpected results with wildcards: Make sure
%and_are used correctly.%matches any number of characters, while_matches exactly one character. Misplacing these can lead to unexpected matches. - Performance issues: If your query is slow, look for leading
%in patterns (e.g.,%pattern), which disables index usage. Rewrite patterns to start with a specific string if possible. See the section below for more on performance. - SQL injection errors: If you’re using user input in your query, make sure it is properly parameterized to avoid syntax errors or vulnerabilities.
- Special characters in patterns: Wildcard symbols like
%and_in the search string must be escaped if intended as literal characters. UseESCAPEin your query to define an escape character.
WHERE column_name LIKE '50\% OFF' ESCAPE '\'
Performance Considerations When Using the LIKE Operator
The LIKE operator is great, but it can potentially impact query performance, especially when used on large datasets. Here are some considerations to optimize its use:
- Indexes: The
LIKEoperator performs best when the pattern starts with a constant string, such asAdam%, because the database can use indexes. However, patterns like%Adamor%Adam%require a full table scan, which can be slow for large tables. - Avoid leading wildcards: Starting a pattern with
%, such as%pattern, disables index usage, as the database has to examine every record. - Collation and case-insensitive matching: Using functions like
LOWER()orUPPER()on columns for case-insensitive searches can also prevent indexes from being used. Instead, make sure the database collation is set appropriately for case-insensitive comparisons. - Alternative approaches: For large datasets, consider using full-text search or database-specific search features, such as
GINindexes in PostgreSQL orFULLTEXTindexes in MySQL, when performing complex or frequent string matching. - Selective queries: Limit the scope of your queries using additional filters, such as date ranges or numerical columns, to reduce the data processed by the
LIKEoperator.
LIKE vs. Full-Text Search
Full-text search, available in databases like MySQL, PostgreSQL, and SQL Server, is optimized for advanced text queries, such as searching for keywords, handling linguistic variations, and ranking results by relevance. While LIKE is great for basic tasks, full-text search is better for scalable and sophisticated text search.
Use SQL LIKE Confidently
Mastering SQL functions is key to succeeding in data science, and SQL’s LIKE command is no exception. A good command of SQL will supercharge your analytics progress, so be sure to learn it well!
For more resources on SQL, be sure to check out the following:
Become a Data Engineer
FAQs
Can the LIKE operator be used with numeric data types?
No, the LIKE operator is specifically used for pattern matching with string data types such as CHAR, VARCHAR, and TEXT. For numeric data types, other comparison operators such as =, <, >, etc., are used.
How does performance of LIKE compare to other SQL operators?
The LIKE operator can be less efficient, especially with patterns starting with %, as it requires a full table scan. Indexing can't be used effectively in these cases, which might slow down queries on large datasets.
Are there any security concerns when using LIKE in SQL?
While LIKE itself is not inherently insecure, using user inputs directly in SQL queries can lead to SQL injection attacks. Always sanitize inputs and consider using parameterized queries to mitigate this risk.
How would you handle case-sensitive searches in a SQL flavor that doesn’t support ILIKE?
In SQL flavors without ILIKE, you can use LOWER(column_name) LIKE LOWER(pattern) or UPPER(column_name) LIKE UPPER(pattern) to perform case-insensitive searches.
Can LIKE be used with non-ASCII characters?
Yes, LIKE can be used with non-ASCII characters as long as the database encoding supports those characters. This includes UTF-8, which is commonly used to support a wide range of characters.
How would you modify a LIKE query to search for a literal wildcard character (e.g., % or _)?
To search for a literal % or _, you need to use an escape character. For example, in SQL Server, you might use LIKE 'A[%]%' ESCAPE '%' to search for strings containing a literal %.
Can you combine LIKE with other SQL functions to improve search capabilities?
Yes, combining LIKE with functions like CONCAT or SUBSTRING can help refine search patterns. For example, using CONCAT can dynamically construct patterns based on other column values.
How can LIKE be used in conjunction with JOIN operations?
LIKE can be applied in JOIN conditions to match patterns between columns of different tables. For instance, ON table1.col1 LIKE table2.col2 || '%' can be used to join tables where table1.col1 starts with table2.col2.
What are some alternatives to LIKE for complex pattern matching?
For more complex patterns, SQL flavors that support regular expressions, such as MySQL's RLIKE or PostgreSQL's SIMILAR TO, can be used. These provide a richer syntax for advanced pattern matching.
How does LIKE handle null values in columns?
When a column contains null values, LIKE will not match these records, as nulls are not considered equal to any string or pattern. To include nulls, use a condition like OR column IS NULL.
