Functions for Manipulating Data in PostgreSQL
Here you can access the tables used in the course. To access a table, you will need to specify the dvdrentals schema in your queries (e.g., dvdrentals.film for the film table and dvdrentals.country for the country table).
Note: When using sample integrations such as those that contain course data, you have read-only access. You can run queries, but cannot make any changes such as adding, deleting, or modifying the data (e.g., creating tables, views, etc.).
Take Notes
Add notes about the concepts you've learned and SQL cells with queries you want to keep.
Add your notes here
-- Add your own queries here
SELECT *
FROM dvdrentals.film
LIMIT 10;
Explore Datasets
Use the different tables to explore the data and practice your skills!
- Select the title, release_year, and rating of films in the film table.
  - Add a description_shortened column which contains the first 50 characters of the description column, ending with "...".
  - Filter the film table for rows where the special_features column contains "Commentaries".
- Select the customer_id, amount, and payment_date from the payment table.
  - Extract date information from the payment_date column, creating new columns for the day, month, quarter, and year of transaction.
  - Use the rental table to include a column containing the number of days rented (i.e., time between the rental_date and the return_date).
- Update the title column so that titles with multiple words are reduced to the first word and the first letter of the second word followed by a period.
  - For example:
    - "BEACH HEARTBREAKERS" becomes "BEACH H."
    - "BEAST HUNCHBACK" becomes "BEAST H."
  - Reformat your shortened title to title case (e.g., "BEACH H." becomes "Beach H.").
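The title-shortening task above could be approached along these lines. This is a minimal sketch, not the course's solution: it runs against two inline sample titles taken from the examples above instead of the film table, the column name short_title is made up for illustration, and it assumes words are separated by single spaces.

```sql
-- Sketch: shorten multi-word titles to "<first word> <initial>." and title-case them.
-- The inline VALUES list stands in for dvdrentals.film so the query runs on its own.
SELECT
  title,
  CASE
    -- SPLIT_PART returns '' when the title has no second word
    WHEN SPLIT_PART(title, ' ', 2) = '' THEN INITCAP(title)
    ELSE INITCAP(
      SPLIT_PART(title, ' ', 1) || ' ' ||
      LEFT(SPLIT_PART(title, ' ', 2), 1) || '.'
    )
  END AS short_title  -- hypothetical column name
FROM (VALUES
  ('BEACH HEARTBREAKERS'),
  ('BEAST HUNCHBACK')
) AS sample(title);
-- BEACH HEARTBREAKERS -> Beach H.
-- BEAST HUNCHBACK     -> Beast H.
```

To try it on the real data, replace the VALUES list with dvdrentals.film.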
Putting it all together
Many of the techniques you've learned in this course will be useful when building queries to extract data for model training. Now let's use some date/time functions to extract and manipulate DVD rentals data from our fictional DVD rental store. In this exercise, you will extract a list of customers and their rental history over 90 days. You will use the EXTRACT(), DATE_TRUNC(), and AGE() functions that you learned about during this chapter, along with some general SQL skills from the prerequisites, to extract a data set that could be used to determine which day of the week customers are most likely to rent a DVD and how likely they are to return it late.
Finally, use a CASE statement and DATE_TRUNC() to create a new column called past_due, which is TRUE if rental_days is greater than rental_duration and FALSE otherwise.
SELECT
  c.first_name || ' ' || c.last_name AS customer_name,
  f.title,
  r.rental_date,
  -- Extract the day of week date part from the rental_date
  EXTRACT(dow FROM r.rental_date) AS dayofweek,
  AGE(r.return_date, r.rental_date) AS rental_days,
  -- Use DATE_TRUNC to get days from the AGE function
  CASE WHEN DATE_TRUNC('day', AGE(r.return_date, r.rental_date)) >
    -- Calculate number of days
    f.rental_duration * INTERVAL '1' day
  THEN TRUE
  ELSE FALSE END AS past_due
FROM film AS f
  INNER JOIN inventory AS i
    ON f.film_id = i.film_id
  INNER JOIN rental AS r
    ON i.inventory_id = r.inventory_id
  INNER JOIN customer AS c
    ON c.customer_id = r.customer_id
WHERE
  -- Use an INTERVAL for the upper bound of the rental_date
  r.rental_date BETWEEN CAST('2005-05-01' AS DATE)
    AND CAST('2005-05-01' AS DATE) + INTERVAL '90 day';
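To see why the query truncates the AGE() result before comparing it, it can help to run the comparison on a single made-up rental. The sketch below uses invented timestamps and an invented rental_duration of 3 (none of these values come from the course data), and the column names past_due_raw and past_due_truncated are hypothetical.

```sql
-- Sketch: how truncating the AGE() interval changes the past_due comparison.
-- The timestamps and rental_duration are made-up sample values.
SELECT
  -- Day of week as a number: 0 = Sunday ... 6 = Saturday (2 here, a Tuesday)
  EXTRACT(dow FROM rental_date) AS dayofweek,
  -- Full interval between rental and return: 3 days 05:00:00
  AGE(return_date, rental_date) AS rental_days,
  -- Without truncation, the extra 5 hours make the rental count as late: true
  AGE(return_date, rental_date)
    > rental_duration * INTERVAL '1 day' AS past_due_raw,
  -- With truncation to whole days, 3 days is not greater than 3 days: false
  DATE_TRUNC('day', AGE(return_date, rental_date))
    > rental_duration * INTERVAL '1 day' AS past_due_truncated
FROM (VALUES
  (TIMESTAMP '2005-05-24 05:00:00', TIMESTAMP '2005-05-27 10:00:00', 3)
) AS sample(rental_date, return_date, rental_duration);
```

Truncating to the day means a rental only counts as past due once a full extra day has elapsed, which is the behavior the exercise query relies on.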
Extracting substrings from text data
In this exercise, you are going to practice extracting substrings from text columns. The Sakila database contains the address table, which stores the street address for all the rental store locations. You need a list of all the street names where the stores are located, but the address column also contains the street number. You'll use several functions that you've learned about in the video to manipulate the address column and return only the street name.
Extract only the street name, without the street number, from the address column. Use functions to determine the starting and ending position parameters.
SELECT
  -- Select only the street name from the address table
  SUBSTRING(address FROM POSITION(' ' IN address)+1 FOR CHAR_LENGTH(address))
FROM address;
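The intermediate values are easier to follow on a couple of literal addresses. The following is a small sketch using inline sample strings in place of the address table; the column aliases first_space and street are made up for illustration.

```sql
-- Sketch: locate the first space, then take everything after it.
-- The inline VALUES list stands in for the address table.
SELECT
  address,
  -- Position of the first space (3 for '47 MySakila Drive')
  POSITION(' ' IN address) AS first_space,
  -- Everything after that space ('MySakila Drive', 'MySQL Boulevard')
  SUBSTRING(address FROM POSITION(' ' IN address) + 1) AS street
FROM (VALUES
  ('47 MySakila Drive'),
  ('28 MySQL Boulevard')
) AS sample(address);
```

Omitting the FOR clause returns the rest of the string. Passing FOR CHAR_LENGTH(address), as the exercise does, asks for more characters than remain, which PostgreSQL handles by stopping at the end of the string, so both forms give the same result.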
The TRIM function
In this exercise, we are going to revisit and combine a couple of exercises from earlier in this chapter. If you recall, you used the LEFT() function to truncate the description column to 50 characters but saw that some words were cut off and/or had trailing whitespace. We can use trimming functions to eliminate the whitespace at the end of the string after it's been truncated.
Convert the film category name to uppercase and use CONCAT() to concatenate it with the title. Truncate the description to the first 50 characters and make sure there is no leading or trailing whitespace after truncating.
SELECT
  CONCAT(UPPER(name), ': ', title) AS film_category,
  -- Truncate the description and remove leading/trailing whitespace
  TRIM(LEFT(description, 50)) AS film_desc
FROM film AS f
  INNER JOIN film_category AS fc
    ON f.film_id = fc.film_id
  INNER JOIN category AS c
    ON fc.category_id = c.category_id;
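The effect of wrapping LEFT() in TRIM() is easiest to see when the cut lands right after a space. This is a minimal sketch on a made-up string, not course data; the column aliases are hypothetical.

```sql
-- Sketch: LEFT() can leave a trailing space at the cut point; TRIM() removes it.
SELECT
  LEFT(txt, 6)                    AS truncated,   -- 'hello ' (trailing space)
  CHAR_LENGTH(LEFT(txt, 6))       AS len_before,  -- 6
  TRIM(LEFT(txt, 6))              AS trimmed,     -- 'hello'
  CHAR_LENGTH(TRIM(LEFT(txt, 6))) AS len_after    -- 5
FROM (VALUES ('hello world')) AS sample(txt);
```

Note that TRIM() only removes whitespace at the ends; it does not repair a word that LEFT() cut in half.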
Basic full-text search
Searching text will become something you do repeatedly when building applications or exploring data sets for data science. Full-text search is helpful when performing exploratory data analysis for a natural language processing model or building a search feature into your application. In this exercise, you will practice searching a text column and matching it against a string. The search returns results similar to a query that uses the LIKE operator with the % wildcard at the beginning and end of the string, except that it matches whole words rather than arbitrary substrings. With a suitable index it can also perform much better, and it provides a foundation for more advanced full-text search queries. Let's dive in.
Select the title and description columns from the film table. Perform a full-text search on the title column for the word elf.
SELECT title, description
FROM film
-- Convert the title to a tsvector and match it against the tsquery
WHERE to_tsvector(title) @@ to_tsquery('elf');
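The difference between a full-text match and a wildcard match shows up on titles that merely contain the letters "elf". The sketch below compares the two on a pair of illustrative titles supplied inline; it names the 'english' text search configuration explicitly (an assumption, since the exercise query relies on the server default) and uses ILIKE so that the comparison ignores case.

```sql
-- Sketch: full-text search matches the word "elf"; a wildcard pattern matches any substring.
-- The titles are illustrative sample values, not a query against the film table.
SELECT
  title,
  to_tsvector('english', title) @@ to_tsquery('english', 'elf') AS fts_match,
  title ILIKE '%elf%' AS like_match
FROM (VALUES
  ('ELF PARTY'),
  ('SHELF LIFE')
) AS sample(title);
-- ELF PARTY  -> fts_match = true,  like_match = true
-- SHELF LIFE -> fts_match = false, like_match = true
```

The full-text match works on normalized words (lexemes), so "SHELF" does not count as a hit for "elf" even though it contains those letters.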