Joining Data with SQL
Here you can access every table used in the course. To access each table, you will need to specify the world
schema in your queries (e.g., world.countries
for the countries
table, and world.languages
for the languages
table).
Take Notes
Add notes about the concepts you've learned and SQL cells with queries you want to keep.
Chapter 1
The ins and outs of INNER JOIN
- Key: field that uniquely identifies each record (ID)
- INNER JOIN looks for records in both tables which match on a given field
- Must match the keys in the query after the INNER JOIN with an ON keyword (or if column names are identical, you can use USING intead) <- For Example: INNER JOIN presidents USING (country);
- Aliasing tables helps with shortening the query
- Example: SELECT prime_ministers.country, prime_minsters.continent, prime_minister, president FROM prime_ministers INNER JOIN presidents ON prime_ministers.country = presidents.country;
Defining Relationships
- One to many relationship (author with their books)
- One to one relationship (finger with fingerprint)
- Many to many relationship (languages and countries)
Multiple Joins
- Can add another INNER JOIN under the first (will join on the table from the FROM clause unless otherwise specified)
- Can join on multiple keys (ADD on the ON clause)
Chapter 2
LEFT and RIGHT JOINs
- LEFT JOIN: Return all records in the left table, and those records in the right table that match on the joining field provided.(Keeps all values in left table, while INNER JOIN only contains only matching and removes all non-matched values) - Can also be written as LEFT OUTER JOIN
- RIGHT JOIN: Same as LEFT JOIN, but portrays all records in the right table instead of the left (Not common) - Can also be written as RIGHT OUTER JOIN
FULL JOINs
- Combines a LEFT JOIN and a RIGHT JOIN (contains ALL data whether matching or not) - Can also be written as FULL OUTER JOIN
- Syntax the same as LEFT, RIGHT, and INNER JOINS
Crossing into CROSS JOIN
- CROSS JOIN: Creates all possible combinations of two tables
- Minimal syntax = dont use ON or USING with the clause
- Example: SELECT prime_minister, president FROM prime_ministers AS p1 CROSS JOIN presidents AS p2 WHERE p1.continent IN ('Asia') AND p2.continent IN ('South America');
Self joins
- Tables joined with themselves; can be used to compare parts of the same table
- Aliasing is required; does not use SELF JOIN clause
- Will be returned as pairs
- Example: SELECT p1.country AS country1, p2.country AS country2, p1.continent FROM prime_ministers AS p1 INNER JOIN prime_ministers AS p2 ON p1.continent = p2.continent AND p1.country <> p2.country;
Chapter 3
Set theory for SQL Joins
- 3 main set operations: Union, Intersect, Except
- UNION: takes two tables as input, and returns all records for both tables (does not return identical records)
- UNION ALL: same as UNION, but DOES include duplicates
- MUST have the same # of columnds, and all the same field types in order to stack, will be the left table alias
- Example: SELECT * FROM left_table UNION SELECT * FROM right_table
At the INTERSECT
- INTERSECT: takes two tables as input, and returns only the records that exist in both tables
- Same syntax as UNION and UNION ALL
- No duplicates will be shown, unlike INNER JOIN
- All columns must be exact match in order to get results, or can be a blank table if not exact
EXCEPT
- EXCEPT: takes two tables as input, and only returns the records that exist in the left and not right table
- All columns must match in order for results to be shown
- Same syntax as UNION, UNION ALL, and INTERSECT
Chapter 4
Subquerying with semi joins and anti joins
- Semi joins: chooses records in the first table where a condition is met in the second table
- Uses subqueries
- Example: SELECT president, country, continent FROM presidents WHERE country IN (SELECT country FROM states WHERE indep_year < 1800);
- Anti joins: same as semi joins just NOT matching
- Uses subqueries -> just add NOT before the IN statement
- Example: SELECT president, country, continent FROM presidents WHERE country NOT IN (SELECT country FROM states WHERE indep_year < 1800);
Subqueries inside WHERE and SELECT
- The WHERE clause is the most common place for subqueries
- Can use a subquery next to the IN operator of a WHERE _ IN () clause -> they just need to be same data type
- Example: SELECT * FROM some_table WHERE some_field IN (SELECT some_numeric_field FROM another_table WHERE field2 = some_condition);
- Subqueries can also be found in SELECT statements
- Example SELECT DISTINCT continent, (SELECT COUNT(*) FROM monarchs WHERE states.continent = monarch.continent) AS monarch_count FROM states;
Subqueries inside FROM
- Select data from multiple tables
- Example: SELECT DISTINCT monarchs.continent, sub.most_recent FROM monarchs, (SELECT continent, MAX(indep_year) AS most_recent FROM states GROUP BY continent) AS sub WHERE monarchs.continent = sub.continent ORDER BY continent;
Difficult Example: -- Select fields from cities SELECT name, country_code, city_proper_pop, metroarea_pop, city_proper_pop / metroarea_pop * 100 AS city_perc FROM cities -- Use subquery to filter city name WHERE name IN (SELECT capital FROM countries WHERE (continent = 'Europe' OR continent LIKE '%America')) -- Add filter condition such that metroarea_pop does not have null values AND metroarea_pop IS NOT NULL -- Sort and limit the result ORDER BY city_perc DESC LIMIT 10;
-- Add your own queries here
SELECT *
FROM world.languages
LIMIT 5