Skip to content
Course Notes: Exploratory Data Analysis in SQL (copy)
add text here
In the SQL cell below:
- Set the source to 'Course Databases' to query databases used in SQL courses.
- Set the source to 'DataFrames and CSVs' to query this course's CSV files (if any) with SQL
- Write SQL! Note that you cannot run queries that modify the database.
Queryas
query
variable
-- To clear table if it already exists;
-- fill in name of temp table
DROP TABLE IF EXISTS temp_table;
-- Create the temporary table
CREATE TEMP TABLE temp_table AS
-- Select the two columns you need; alias as needed
SELECT sector,
percentile_disc(0.8) WITHIN GROUP (ORDER BY profits) AS pct80
-- What table are you getting the data from?
FROM fortune500
-- What do you need to group by?
GROUP BY sector;
-- See what you created: select all columns and rows
-- from the table you created
SELECT *
FROM temp_table;
DataFrameas
df
variable
-- Count rows
SELECT count(*)
FROM evanston311
WHERE description ILIKE '%trash%'
OR description ILIKE '%garbage%';
Building up the query through the steps below, find inquires that mention trash or garbage in the description without trash or garbage being in the category. What are the most frequent categories for such inquiries?
DataFrameas
df1
variable
-- Count rows with each category
SELECT category, count(*)
FROM evanston311
WHERE (description ILIKE '%trash%'
OR description ILIKE '%garbage%')
AND category NOT LIKE '%Trash%'
AND category NOT LIKE '%Garbage%'
-- What are you counting?
GROUP BY category
--- order by most frequent values
ORDER BY count(category) DESC
LIMIT 10;