Intermediate SQL
Here you can access every table used in the course. To access each table, you will need to specify the cinema
schema in your queries (e.g., cinema.reviews
for the reviews
table.
Note: When using sample integrations such as those that contain course data, you have read-only access. You can run queries, but cannot make any changes such as adding, deleting, or modifying the data (e.g., creating tables, views, etc.).
Take Notes
Add notes about the concepts you've learned and SQL cells with queries you want to keep.
Results often include duplicates, we can use distinct to select all unique values from the field Combining count and distinct is useful to count all unique values in a field
-- Add your own queries here SELECT * FROM cinema.reviews LIMIT 5
--Count, counts the number of records with a value in a field ; use alias for clarity COUNT(field name) --- counts values in field COUNT(*) --- counts records in a table
- --- represents all fields
-- Count with Distinct, Distinct removes duplicates to return only unique values SELECT COUNT (DISTINCT birthdate) AS count_distinct_birthdates FROM people;
-- SQL formatting --- formatting is not required, but makes code difficult to read ----- capitalized key words, you can indent to identify multiple -------- add semi-colon, it's not necessary but helps distinguish end of code, is best practice, some sql flavors require it so easier to have the habit and transfer code ------ https://www.sqlstyle.guide/
--- WHERE, allows you to filter for specific data, comes after FROM statement --- SELECT -> FROM -> WHERE -> LIMIT for Postgresql, swith select and from for SQL server
SELECT title FROM films WHERE release_year < 1960
--Comparison Operators -- <, >, = <=, >=, <>
--with an exception SELECT title FROM films WHERE release_year <> 1960
--Multiple Critera for Filtering -- OR, AND, BETWEEN
SELECT * FROM coats WHERE color = 'yellow' OR length = 'short';
--OR when you need to statisfy at least one condition --OR example
---VALID SELECT * FROM films WHERE release_year = 1994 OR release_year = 2000 ;
---INVALID SELECT * FROM films WHERE release_year = 1994 OR 2000 ;
--------AND ---Use AND to satisfy all critera
------VALID SELECT title FROM films WHERE release_year > 1994 AND release_year < 2000 ;
---AND, OR SELECT title FROM films WHERE (release_year = 1994 OR release_year = 1995) AND (certification = 'PG' OR certification = 'R') ;
-----BETWEEN, AND -----Provides shortand with AND ----Between is inclusive, contains beginning and end values
SELECT title FROM films WHERE release_year BETWEEN 1994 AND 2000 ;
----Can Add Multiple at Once with Between SELECT title FROM films WHERE release_year BETWEEN 1994 AND 2000 AND country='UK' ;
** -------------------------------
Filtering Text
** WHERE can also filter text Filter a pattern rather than a specific text string Can be done with LIKE, NOT LIKE, IN
LIKE - used to search for pattern in field use wildcards, "%" and "_" % will match zero, one, or many characters in a text _ will match a single character
** % Code** SELECT name FROM people WHERE name LIKE 'Ade%' ;
** _ Code** SELECT name FROM people WHERe name LIKE 'Ev_'
-- NOT LIKE Case sensitive, find records that don't match the pattern
SELECT name FROM people WHERE name NOT LIKE 'A.%'
Wildcard positions can be put anywhere
IN Allows us to specify multiple places in a where clause
SELECT title FROM films WHERE release_year IN (1920, 1930, 1940) ;
----identify NULL
IS NULL
SELECT name FROM people WHERE birthdate IS NULL ;
SELECT COUNT(*) as no_birthdates FROM people WHERE birthdate IS NULL (or IS NOT NULL)
----- NEW Functions AVG(), SUM(), MIN(), MAX(), COUNT()
-AVG: Average of numerical variable, only numerical -SUM: Result of adding the value, only numerical -MIN: Lowest Value, can be used on both numerical and non-numerical -MAX: Highest Value, can be used on both numerical and non-numerical
Some can be used with numerical and non-numerical fields
-Min and Max can give the record that is figueratively the lowest or highest (e.g., earliest date, country beginning with A)
ROUND() - round to a specified decimal ROUND(number to round, decimal places)
SELECT ROUND(AVG(budget), 2) AS avg_budget FROM films WHERE release_year >= 2010 ;
defaults to whole number with second part is left out can pass a negative number and still get a result
---Arithmatic and Aliasaing We can add subtract and multiply in SQL
When dividing we can add decimal places to our numbers if we want more precision - SELECT (4/3) or SELECT (4.0 / 3.0)
Aggregate Functions - perform operations on the fields vertically Arthimatic - performs calculations horizontally
SELECT (gross - budget) AS profit FROM films ;
You must always use an alias with arthimatic and aggregative functions If using multiple max functions you get two max functions, its important to alias
Order of Execution FROM WHERE SELECT (alias defined here) LIMIT SELECT
Sorting Results
Order By - orders in ascending order of one or more results, from A-Z smallest to biggest Comes after from Can add ASC or DESC to clarify ascending or descending order
Can help to include the variable we're sorting on for clarity Can be used for multiple fields - ORDER BY field one, field two ; - Second field is a tie breaker of sorts
SELECt title, wins FROM best_movies ORDER by wins DESC ; -- multiple have top results of 11
SELECT title, wins, imdb_score FROM best_movies ORDER BY wins DESC, imdb_score DESC ; -----breaks tie
GROUP BY is used with aggregate functions to privde summary statistics
- can do when only grouping a single field, and selecting multiple fields
SQL will make an error when selecting a field not in our group by clause
In SQL we can't filter aggregate functions with WHERE clauses - example, filtering the title with COUNT is in valid Groups have special HAVING filter
Example SELECT certification, COUNT(title) AS title_count FROM films WHERE certification IN('G','PG','PG-13') GROUP BY certification HAVING COUNT(title) > 500 ORDER BY title_count DESC LIMIT 3 ;
WHERE: filters individual records HAVING filters group recrods
Difference between WHERE and HAVING: Example "What films were released in the year 2000" - WHERE clause "In what years was the avg film duration over 2 hours" - HAVING clause
-- Add your own queries here
SELECT *
FROM cinema.reviews
LIMIT 5
--Count, counts the number of records with a value in a field ; use alias for clarity
COUNT(field name) --- counts values in field
COUNT(*) --- counts records in a table
* --- represents all fields
-- Count with Distinct, Distinct removes duplicates to return only unique values
SELECT COUNT (DISTINCT birthdate) AS count_distinct_birthdates
FROM people;
-- SQL formatting
--- formatting is not required, but makes code difficult to read
----- capitalized key words, you can indent to identify multiple
-------- add semi-colon, it's not necessary but helps distinguish end of code, is best practice, some sql flavors require it so easier to have the habit and transfer code
------ https://www.sqlstyle.guide/
--- WHERE, allows you to filter for specific data, comes after FROM statement
--- SELECT -> FROM -> WHERE -> LIMIT for Postgresql, swith select and from for SQL server
SELECT title
FROM films
WHERE release_year < 1960
--Comparison Operators
-- <, >, = <=, >=, <>
--with an exception
SELECT title
FROM films
WHERE release_year <> 1960
--Multiple Critera for Filtering
-- OR, AND, BETWEEN
SELECT *
FROM coats
WHERE color = 'yellow' OR length = 'short';
--OR when you need to statisfy at least one condition
--OR example
---VALID
SELECT *
FROM films
WHERE release_year = 1994
OR release_year = 2000 ;
---INVALID
SELECT *
FROM films
WHERE release_year = 1994 OR 2000 ;
--------AND
---Use AND to satisfy all critera
------VALID
SELECT title
FROM films
WHERE release_year > 1994
AND release_year < 2000 ;
---AND, OR
SELECT title
FROM films
WHERE (release_year = 1994 OR release_year = 1995)
AND (certification = 'PG' OR certification = 'R') ;
Explore Datasets
Use the descriptions
, films
, people
, reviews
, and roles
tables to explore the data and practice your skills!
- Which titles in the
reviews
table have an IMDB score higher than 8.5? - Select all titles from Germany released after 2010 from the
films
table. - Calculate a count of all movies by country using the
films
table.