Materialized Views in PostgreSQL
If you work in SQL (preferably a PostgreSQL user) extensively or on an average level to query databases, then you might have written a good number of complex queries to find answers to critical questions. Also, sometimes it so happens that you need to use the results of those queries quite frequently in your works. Running a complex query multiple times can be expensive. What if there was a way to cache the results of a complex query to aid the performance? Well, materialized views are there for you.
You might be thinking there are simple views also which can store the result of a query. But a simple view differs from a materialized view from multiple aspects. Here is what the official documentation of PostgreSQL says about simple views -
The view is not physically materialized. Instead, the query is run every time the view is referenced in a query.
The statement means that simple views are not physically stored in the database. Each time a view is used in a query, the query that created the view is executed. Now, this is what makes simple views a bit inefficient when it comes to access time.
On the other hand, materialized views come with a lot of flexibility by allowing you to persist a view in the database physically. And you can operate on the materialized views just like you do in case of simple views (but with a lower access time). For the rest of this tutorial, you will be studying about materialized views in PostgreSQL. In one of his webinars, DataCamp's Chief Data Scientist David Robinson showed how he uses materialized views to effectively store the results of complex queries.
Note that in order to follow along with this tutorial, you should be familiar with the contents covered in the Joining Data in SQL course.
Initial database set up
It is better to set up a database in PostgreSQL and create a basic table in there so that you can effectively query it. This would help you in the following sections of this tutorial as well. You can restore this database backup which contains the table which you are going to use in this tutorial. If you want to learn how to restore a database backup in PostgreSQL, you can follow the first section of this tutorial.
If you were able to restore the backup, you should see a table named
international_debt within the database (you will need to create a database first, if you don't have one). Let's quickly take a look at the first few rows of the table (a simple select query will help you do this) -
The table contains information about the debt statistics of different countries across the globe for this current year in different categories (refer to the columns
indicator_code). The debt column shows the amount of debt (in USD) a particular country has in a particular category. The data belongs to the domain of economics and is often used to analyze the economic conditions of different countries. The data was collected from the World Bank .
Now that you have set up a database successfully, you are good to proceed to the next sections of this tutorial.
Creating materialized views
Let's create a materialized view of the following query -
select country_name, sum(debt) as total_debt from international_debt group by country_name order by country_name;
The query returns the country names with their total debts (in alphabetical order). The output of the query should look like so -
Now the materialized view -
CREATE MATERIALIZED VIEW country_total_debt as select country_name, sum(debt) as total_debt from international_debt group by country_name order by country_name;
You should get a success message saying -
Query returned successfully: 124 rows affected, 242 ms execution time.
Refreshing a materialized view
You can also instruct PostgreSQL not load any data into the materialized view while you are creating it by specifying the
WITH NO DATA option -
CREATE MATERIALIZED VIEW country_total_debt_2 as select country_name, sum(debt) as total_debt from international_debt group by country_name order by country_name WITH NO DATA;
You will get a message saying -
Query returned successfully with no result in 41 ms. Notice the difference between the two output messages. If you append the
WITH NO DATA option, an empty materialized view gets created. You cannot query this materialized view. However, you can populate the materialized view by executing -
REFRESH MATERIALIZED VIEW country_total_debt_2;
Querying a materialized view
By now, you should have two materialized views (country_total_debt, country_total_debt_2) created. Let's execute a simple select query using any of the two -
select * from country_total_debt_2;
The query should return something like -
The output is exactly the same as the query which we wrapped in a materialized view. You can run queries on materialized views with aggregate functions as well -
select max(total_debt) from country_total_debt_2;
You should get -
Dropping a materialized view
Materialized views can be dropped just like the way tables are dropped in PostgreSQL. You just need to specify
Materialized View after
Drop Materialized View country_total_debt_2;
That is all for this tutorial. Thank you for reading to the end. In this tutorial, you got to learn about materialized views in PostgreSQL, and how you can operate on them. Due to their effectiveness, materialized views are extensively used in database warehousing. If you are interested in learning more complex queries, you can check out the following DataCamp courses -