Track
Working with relational databases often involves handling messy data. One of the most common challenges is cleaning and processing text data. Luckily, SQL provides powerful string functions that can make this process much more efficient, saving you time and effort.
In this article, we are going to explore the most important SQL string functions that can help you clean and manipulate text data with ease. Let’s get started!
What Are SQL String Functions?
SQL String Functions are built-in functions that enable us to manipulate and process text data stored in a database. Common examples of text fields include names, descriptions, and addresses.
These functions can be used to perform a wide range of tasks, including:
- Concatenate strings
- Formatting text
- Extraction of parts of a string
- Search and replace specific text within a string
Concatenating Strings in SQL
We can concatenate two or more strings using the function concat():
SELECT CONCAT('Street Roma',', ','72',', ','Padova') AS full_address
This is the output:
full_address |
-----------------------+
Street Roma, 72, Padova|
This function is widely used to combine information provided by different columns, like names and addresses.
As an alternative, you can also use the function CONCAT_WS():
SELECT CONCAT_WS(',','Street Roma','72','Padova') as full_address
From the syntax, you can notice that it’s more easy and efficient in situations where you want to combine more than two columns with the same separator.
Formatting Text
These are several SQL string functions to adjust text by changing the case, removing extra spaces and other operations. Each function is listed with its syntax and the corresponding description.
|
Function Syntax |
Description |
| LOWER(string) |
Returns the text with all the characters in lowercase |
| UPPER(string) |
Returns the text with all the characters in uppercase |
| TRIM(string, [character]) |
Removes white spaces from both left and right sides of the string by default, special character if specified |
| LTRIM(string, [character]) |
Removes white spaces from the left of the string by default, special character if specified |
| RTRIM(string, [character]) |
Removes white spaces from the right of the string by default, special character if specified |
Now, let’s show some examples to understand how to apply these functions. Imagine an instance where you have email addresses with upper characters and the state acronym in lowercase, and you want to change to correct the format of these columns.
SELECT
LOWER('ANONymous@gmail.com') AS email,
UPPER('it') AS country
This is the following output:
email |country|
-------------------+-------+
anonymous@gmail.com|IT |
In addition to changing the case, it can also be useful to get rid of white spaces on the front or at the end of strings. The primary function for this type of operation is TRIM() to remove from both sides. Other alternatives are LTRIM() and RTRIM() to delete white spaces respectively from the left and right sides:
SELECT TRIM(' Street Roma, 72 ') AS trimmed_address,
LTRIM(' Street Roma, 72 ') AS ltrimmed_address,
RTRIM(' Street Roma, 72 ') AS rtrimmed_address
The query returns the following result:
trimmed_address|ltrimmed_address|rtrimmed_address|
---------------+----------------+----------------+
Street Roma, 72|Street Roma, 72 | Street Roma, 72|
From the output, you can catch the differences between each function. Moreover, TRIM() can be applied to remove other special characters, besides white spaces. For example, we want to clean the field of the telephone by getting rid of the + sign:
SELECT TRIM('++345','+') AS telephone
The output returned is the following:
telephone|
---------+
345 |
As you can see, the telephone number is completely clean.
Extracting Text
After concatenating and modifying the format of the strings, it’s time to discover how to extract text using the special SQL function SUBSTRING(). Let’s show different examples to master this function:
SELECT
SUBSTRING('Antony',1,1) AS first_character,
SUBSTRING('Antony',1,3) AS first_3characters,
SUBSTRING('Antony',1,5) AS first_5characters
This is the output:
first_character|first_3characters|first_5characters|
---------------+-----------------+-----------------+
A |Ant |Anton |
We can also combine this function with the previous functions to capitalize the first character, converting the rest of the letters to lowercase and, then, concatenating all the letters:
SELECT
LENGTH('antony') AS lenght_name,
SUBSTRING(UPPER('antony'),1,1) AS first_character,
SUBSTRING(LOWER('antony'),2,LENGTH('antony')) AS last_characters,
CONCAT(SUBSTRING(UPPER('antony'),1,1),SUBSTRING(LOWER('antony'),
2,LENGTH('antony'))) AS name
This query returns the following result:
lenght_name|first_character|LAST_characters|name |
-----------+---------------+---------------+------+
6|A |ntony |Antony|
In addition to the functions seen until now, you can notice the LENGTH() function that helps to extract the length of the string. In this context, it’s useful to specify the index of the last character.
Search and Replace Strings in SQL
We can show the most principal functions to search and replace strings:
|
Function Syntax |
Description |
| REPLACE(string,x,y) |
Replace the x value in the string with the y value |
| CHARINDEX(x,string) |
FInd the position of the x value within the string |
A very useful function is REPLACE() to replace a substring with another substring within the string:
SELECT
'Street Roma - 32' AS address,
replace('Street Roma - 32','-',',') AS cleaned_address
This is the output:
address |cleaned_address |
----------------+----------------+
Street Roma - 32|Street Roma , 32|
In this case, we just substituted a dash with a comma within the address. We can also try to find the position of the dash within the address:
SELECT
'Street Roma - 32' AS address,
LENGTH('Street Roma - 32') AS length_address,
CHARINDEX('-','Street Roma - 32') AS location_dash
The query returns the following result:
address |length_address|location_dash|
----------------+--------------+-------------+
Street Roma - 32| 16| 13|
We can notice that the symbol - is in position 13 of the address, mostly at the end of the string.
Conclusion
Mastering these SQL string functions can make a difference in cleaning data efficiently. Of course, the article doesn’t cover all the functions, but just a short list of the most important functions. Not all SQL functions are always valid and can change depending on the type of database you are using. For example, CHARINDEX can be replaced by LOCATE() in a MySQL database.
To practice with these functions, you can check out our Reporting in SQL Course. It can help you to become an expert in cleaning data and building complex reports. We also recommend the SQL Fundamentals skill track to cover all essential aspects of SQL.


