Course Notes
Use this workspace to take notes, store sample queries, and build your own interactive cheat sheet!
You will need to connect your SQL cells to an integration to run a query.
- You can use a sample integration from the dropdown menu. This includes the Course Databases integration, which contains tables you used in our SQL courses.
- You can connect your own integration by following the instructions provided here.
Note: When using sample integrations such as those that contain course data, you have read-only access. You can run queries, but cannot make any changes such as adding, deleting, or modifying the data (e.g., creating tables, views, etc.).
Take Notes
Add notes here about the concepts you've learned and SQL cells with code you want to keep.
Add your notes here
-- A sample query for you to replace!
SELECT
*
FROM booksDate truncation Unlike date_part() or EXTRACT(), date_trunc() keeps date/time units larger than the field you specify as part of the date. So instead of just extracting one component of a timestamp, date_trunc() returns the specified unit and all larger ones as well. Recall the syntax: date_trunc('field', timestamp) Using date_trunc(), find the average number of Evanston 311 requests created per day for each month of the data. Ignore days with no requests when taking the average.
Write a subquery to count the number of requests created per day. Select the month and average count per month from the daily_count subquery.
-- Aggregate daily counts by month SELECT date_trunc('month', day) AS month, avg(count) -- Subquery to compute daily counts FROM (SELECT date_trunc('day', date_created) AS day, count(*) AS count FROM evanston311 GROUP BY day) AS daily_count GROUP BY month ORDER BY month;
Find missing dates The generate_series() function can be useful for identifying missing dates.
Recall: generate_series(from, to, interval) where from and to are dates or timestamps, and interval can be specified as a string with a number and a unit of time, such as '1 month'. Are there any days in the Evanston 311 data where no requests were created? Write a subquery using generate_series() to get all dates between the min() and max() date_created in evanston311. Write another subquery to select all values of date_created as dates from evanston311. Both subqueries should produce values of type date (look for the ::). Select dates (day) from the first subquery that are NOT IN the results of the second subquery. This gives you days that are not in date_created.
SELECT day
-- 1) Subquery to generate all dates
-- from min to max date_created
FROM (SELECT generate_series(min(date_created),
max(date_created),
'1 day')::date AS day
-- What table is date_created in?
FROM evanston311) AS all_dates
-- 4) Select dates (day from above) that are NOT IN the subquery
WHERE day NOT IN
-- 2) Subquery to select all date_created values as dates
(SELECT date_created::date
FROM evanston311);
Monthly average with missing dates Find the average number of Evanston 311 requests created per day for each month of the data. This time, do not ignore dates with no requests. Generate a series of dates from 2016-01-01 to 2018-06-30. Join the series to a subquery to count the number of requests created per day. Use date_trunc() to get months from date, which has all dates, NOT day. Use coalesce() to replace NULL count values with 0. Compute the average of this value.
WITH all_days AS (SELECT generate_series('2016-01-01', '2018-06-30', '1 day'::interval) AS date), -- Subquery to compute daily counts daily_count AS (SELECT date_trunc('day', date_created) AS day, count(*) AS count FROM evanston311 GROUP BY day) -- Aggregate daily counts by month using date_trunc SELECT date_trunc('month',date) AS month, -- Use coalesce to replace NULL count values with 0 avg(coalesce(count, 0)) AS average FROM all_days LEFT JOIN daily_count -- Joining condition ON all_days.date=daily_count.day GROUP BY month ORDER BY month;