CASE Statements in PostgreSQL
Conditional expressions are one of the most fundamental elements of any programming paradigm. Common conditional expressions include
if-else blocks and
switch cases. You can formulate conditional expressions in PostgreSQL using
WHEN-THEN case which is very similar to
if-else blocks. In this tutorial, you will learn how to do this.
Before you start writing condition queries, it is important that you set up a local PostgreSQL database. You will do this in the first section of this tutorial.
Note: To be able to follow along with this tutorial you need to know the basics of SQL and PostgreSQL. Feel free to refresh the basics with the following resources:
Setting up a PostgreSQL database locally
You can use this backup file and restore it in your PostgreSQL server. If you want to know how to do this, you can follow this article. After you successfully restore the backup, you should be able to see these tables in the database -
This tutorial, however, only uses the
countries table. Feel free explore the tables provided in the database backup. Also, you may want to form your own databases and tables. You should have a table (populated with the details about different countries around the globe) by now if you restored the database backup. Let's now explore the
countries table very briefly.
countries table very briefly
After restoring the database, you can view the table in the
pgAdmin interface which comes with the default installation of PostgreSQL. Upon running a simple
SELECT statement, you get to know about the columns, their data-types and also the rows the table is containing -
The table contains details about a total of 206 different countries from various regions of the world. How many regions exactly? The below query should give you the answer to the question.
SELECT COUNT(DISTINCT(region)) from countries;
It turns out that there are 23 different regions covered in this table such as Western Europe, Polynesia, the Middle East and so on. You can explore the table with your own quick questions, and along the way, you may find out interesting facts. Let's now see the anatomy of a basic PostgreSQL
Introduction to PostgreSQL
The following describes the general form of a PostgreSQL case with
WHEN-THEN construct -
CASE WHEN condition_1 THEN result_1 WHEN condition_2 THEN result_2 ... ELSE result_n END
Here are some critical points that you should keep in mind while constructing
CASEs in PostgreSQL:
Each condition is a boolean expression and based on its output the result is chosen. If all the expressions corresponding to
WHENare evaluated to be
False, then the result respective to the
ELSEpart is shown. In case, you don't specify the
ELSEpart; the query will return null.
A condition is generally applied on a column that is present in the table you are operating on. Conditional query constructs like the above are typically done with
SELECTstatements. Keep in mind that, the column must be present in the
SELECTstatement of your query, on which you are specifying the boolean expressions (in
CASE). But when you are using
CASEin pair with a where clause, you need not to follow this (more on this later).
The data-type of the results has to be uniform. You cannot return a
result_2and so on.
Let's dig into some examples now.
Let's start simple. You will now write a
CASE to group the countries with respect to the independence year. Independence years are recorded in the
indep_year column. Let's say you have three groups of independence years of the countries which you need to follow. You will create three groups of independence years for this -
- Before 1900 (
indep_year < 1900)
- Between 1900 and 1930 (
indep_year <= 1930)
- After 1930 (
indep_year > 1930)
The final result of the query should contain the following columns -
- name (of the country)
- continent (to which the country belongs)
- And the independence years' group that you will create. Let's name that as
Let's now write the query -
SELECT name, continent, indep_year, CASE WHEN indep_year < 1900 THEN 'before 1900' WHEN indep_year <= 1930 THEN 'between 1900 and 1930' ELSE 'after 1930' END AS indep_year_group FROM countries ORDER BY indep_year_group;
(The query is inspired from DataCamp's Joining Data in SQL course.)
Upon executing the query, you will see many countries like Palestine, Puerto Rico, etc. for which no independence year is provided in the table. Hence they got
between 1900 and 1930 group.
CASEs can have multiple conditions. There are a handful of entries in the table where the name of country and capital is the same. You can add this to your previously constructed
SELECT name, continent, indep_year, CASE WHEN (indep_year < 1900) AND (countries.name = countries.capital) THEN 'before 1900 and capital same' WHEN indep_year <= 1930 AND (countries.name = countries.capital) THEN 'between 1900 and 1930 and capital same' ELSE 'after 1930_and_no_same_capital' END AS indep_year_group FROM countries ORDER BY indep_year_group;
Another exciting thing about
CASEs is that you can pair them with aggregate functions like
SUM(). To implement a
CASE with the aggregate function
SUM(), you will need a different table where this might be a good idea. Let's create a simple table named
student_grades having the following columns and data-types -
- student_name (string)
- student_stream (string)
- student_grade (character)
The following query creates the table for you -
CREATE TABLE student_grades (student_name character varying, student_stream character varying, student_grade character);
Let's insert some records into the table now. You can do this using a few
INSERT statements. Let's say after the insertions, the table
student_grades now has the following records -
Now, you will write a
CASE to get the sum of the students that belong to certain grade groups. As you can see, according to the instances of the table, there are three distinct grades - A, B and C. The conditional query should return an output similar to the following -
The query for this -
SELECT SUM ( CASE WHEN student_grade = 'A' THEN 1 ELSE 0 END ) AS "High Scoring", SUM ( CASE WHEN student_grade = 'B' THEN 1 ELSE 0 END ) AS "Mid Scoring", SUM ( CASE WHEN student_grade = 'C' THEN 1 ELSE 0 END ) AS "Low Scoring" FROM student_grades;
This is how you can pair up
CASE with aggregate functions to come up with interesting facts about the data.
Conclusion and further reading
That is all for this tutorial. In this tutorial, you learned how to incorporate
if-then like conditioning into your SQL queries to come with interesting reportings. You worked through several examples to strengthen this skill. If you are looking for more challenging exercises, you should take the following DataCamp courses on SQL -
Feel free to let me know your questions on this tutorial via the