The SQL LIKE Operator for Pattern Matching
Like it or not, the
LIKE operator is essential in SQL. It gives data practitioners the power to filter data on specific string matches. This article provides a quick tutorial on
LIKE for beginners and intermediates. If you learn better with hands-on practice, you can also follow along (and run the code) on this DataCamp Workspace.
Suppose you have an
employees table and would like to find all names that start with
‘A’. You could spend time looking through the table manually. But why would you do that when you have the LIKE operator?
SELECT DISTINCT first_name FROM employees WHERE first_name LIKE 'A%'
The magic here is in the clause
`WHERE first_name LIKE ‘A%’`, which means “find all
first_name starting with A and ending with any number of characters.” The
`A%` here is known as a pattern for matching.
`%` is not the only wildcard you can use in conjunction with the
LIKE operator. You could use the underscore sign
`%`matches any number of characters.
`_`matches any single character.
The syntax is easy to remember. It is simply
column_name LIKE pattern
You can use
LIKE to achieve a variety of pattern-matching. Here’s how.
Beginner SQL LIKE Examples
Below, we’ve outlined some practical examples of how you can use the
LIKE statement and the results from our sample data set.
LIKE for Exact String Match
If you’d like to perform an exact string match, use LIKE without
SELECT first_name, last_name FROM employees WHERE first_name LIKE 'Barry' -- the same as WHERE first_name = ‘Barry’
‘%’ to match any number of characters
‘%’ can be used to match any (even zero) number of characters – a number, an alphabet, or a symbol.
Suppose you want to find all employees whose name starts with
‘Adam’; you can use the pattern
SELECT DISTINCT first_name FROM employees WHERE first_name LIKE 'Adam%'
To find names that end with
’Z’, try the pattern
‘%z’. You can also use multiple
‘%’ in one pattern. For example, if you want to find names that contain z, use
‘_’ to match one (and only one) character
Like the game Hangman, the underscore sign
_ can only fit one character.
How many ways are there to spell
Le_n? The pattern would match anything from
SELECT DISTINCT first_name FROM employees WHERE first_name LIKE 'Le_n'
What are the different names with only three characters? We can find out using three consecutive underscores ___ as the pattern.
SELECT DISTINCT first_name FROM employees WHERE first_name LIKE '___'
4. Use both
‘_’ to match any pattern
Of course, you can use both
‘_’ to create interesting patterns.
SELECT DISTINCT first_name FROM employees WHERE first_name LIKE '%ann_'
‘%ann_’ matches a string that starts with any number of characters and ends with
‘ann’ and one other character.
NOT to find strings that do not match a pattern
What if you want to find all rows that do not match a specific criterion? You can use the
NOT LIKE operator. For example, to find all titles except for Staff, we can use the syntax
`WHERE title NOT LIKE ‘Staff’`
This is exactly equivalent to the syntax
`WHERE title != ‘Staff’`
SELECT DISTINCT title FROM titles WHERE title NOT LIKE 'Staff'
Of course, you can use
`NOT LIKE` with any of the patterns we described.
SELECT DISTINCT title FROM titles WHERE title NOT LIKE '%engineer'
LIKE for case-insensitive pattern matching
If you need to perform pattern matching but aren’t sure if the string is stored in lowercase, uppercase, or mixed case, you can use the following syntax.
`LOWER(column_name) LIKE pattern`
LOWER() returns all strings in lowercase, regardless of whether they are stored as upper-, lower- or mixed case.
When using the syntax, make sure you spell the pattern in all lowercase! Else, you might not get any matches.
You could replace
UPPER in the syntax above too. Be sure to spell out the pattern in CAPITAL LETTERS.
`UPPER(column_name) LIKE PATTERN`
For example, to find out if an employee’s name is Joanne, JoAnne, Joanna, or JoAnna, try either of the following.
SELECT DISTINCT first_name FROM employees WHERE lower(first_name) LIKE 'joann_'
SELECT DISTINCT first_name FROM employees WHERE UPPER(first_name) LIKE 'JOANN_'
LIKE with Multiple Values Using
You can combine multiple conditions using the
LIKE syntax too.
For example, use the
OR condition to find results that satisfy at least one out of multiple
SELECT DISTINCT first_name FROM employees WHERE first_name LIKE 'Ad_l' OR first_name LIKE 'Ad_m'
On the other hand, to find a string that matches multiple
LIKE conditions, use the
SELECT DISTINCT first_name FROM employees WHERE first_name LIKE '%am%' AND first_name LIKE '%me%'
LIKE syntax can be applied to multiple columns, as long as their variable type is a variable-length character (
varchar). Knowing that we can find out the names of employees whose initials are
SELECT DISTINCT first_name, last_name FROM employees WHERE first_name LIKE 'Z%' AND last_name LIKE 'Z%'
LIKE in the
SELECT CASE WHEN clause
Thus far, we have focused on using
LIKE as a condition for selecting records in the
WHERE clause. We also use
LIKE in the
SELECT clause too. For example, can we find out how many employees with the name
‘Adam’ are in our database?
SELECT COUNT(CASE WHEN first_name LIKE 'Adam' THEN 1 END) num_employees_adam FROM employees
On the other hand, how many employees have the initials
SELECT COUNT(CASE WHEN first_name LIKE 'A%' AND last_name LIKE 'Z%' THEN 1 END) num_employees FROM employees
Intermediate Examples of SQL LIKE
LIKE function is largely similar across different flavors of SQL (e.g. PostgreSQL, MySQL, Redshift, etc.). Some have additional variations of the
LIKE function that are worth mentioning.
Available in Redshift and PostgreSQL,
ILIKE is the case-insensitive version of
LIKE. As such, all of the following are equivalent.
SELECT datacamp ILIKE ‘datacamp’, -- returns TRUE DATACAMP ILIKE ‘datacamp’, -- returns TRUE Datacamp ILIKE ‘datacamp’, -- returns TRUE datacamp ILIKE ‘DataCamp’, -- returns TRUE
2. Using square brackets
[^] as wildcard characters
Users of T-SQL or SQL Server have additional wildcard characters for more complex pattern matching.
The square bracket syntax
 matches any single character specific within the range or set. For example, the following will all return TRUE.
SELECT ‘Carson’ LIKE ‘[C-K]arson’, -- returns TRUE because C is in the range C-K ‘Karson’ LIKE ‘[C-K]arson’, -- returns TRUE because K is in range ‘Larson’ LIKE ‘[CKL]arson’, -- returns TRUE because L is in the set [CKL] ‘Parson’ LIKE ‘[C-K]arson’ -- returns FALSE because P is out of range
[^] wildcard matches any single character that is not within the specified range or set. Can you see why the following results are such?
SELECT ‘Carson’ LIKE ‘[^C-K]arson’ -- returns FALSE ‘Parson’ LIKE ‘[^C-K]arson’ -- returns TRUE
C is within the range of
[C-K], the pattern
‘C’ will not match
‘Carson’ will not match
Available in MySQL, the
RLIKE operator recognizes regular expressions (RegEx) in the pattern. RegEx is a powerful and versatile tool for advanced pattern matching.
It does not hurt to have a basic understanding of RegEx, especially if your flavor of SQL supports RLIKE. You can learn more about RegEx with our Regular Expressions in Python course.
SELECT DISTINCT first_name FROM employees WHERE first_name RLIKE 'Susann[a-e]'
‘~~’ is completely synonymous with
LIKE. There are also equivalents of
NOT LIKE, and
Use SQL LIKE Confidently
Mastering SQL functions is key to succeeding in data science, and SQL’s LIKE command is no exception. Good command over SQL will supercharge your analytics progress, so be sure to learn it well!
For more resources on SQL, be sure to check out the following:
Learn more about SQL