Skip to content

Intro + Intermediate SQL

-- SQL order of operations --

Nb/ the order of execution for SQL code is:

  1. FROM - Which tables the info is being pulled from,can use aliases. Any joins happen around here *
  2. WHERE - Filtering the table with conditions on certain fields or text matching, conditions separated with 'and'. Note that aggregate functions and aliases can't be used here. e.g purchase_date > '2018-01-01'
  3. GROUP BY - The group by clause is often used in conjunction with aggregate functions to return an aggregation of results grouped by one or more columns.
  4. HAVING - HAVING is like WHERE with aggregate functions, (or you could use a subquery).
  5. (Window functions) - so can only use in SELECT or ORDER BY
  6. SELECT - The SELECT statement is where we define the columns and aggregate functions we want to return as columns on our table. Note that including an aggregate function requires a GROUP BY statement.
  7. ORDER BY - Once all of our data has been grouped and aggregated the ORDER BY clause will sort the resulting set of rows. Because it comes so late in the order of operations we can order by aggregates, window functions, and column aliases. Can be ASC or DESC, and can be over multiple columns.
  8. LIMIT - How many rows maximum to be returned by query

*Nb/ Joining on many rows can be slow, can be sped up by filtering with CTEs before joining

Intro to SQL

# Start coding here... 

Other Misc Queries


Spinner
DataFrameas
df
variable
SELECT * 
FROM cinema.films
WHERE language LIKE '' AND gross IS NULL
LIMIT 5;
Spinner
DataFrameas
df
variable
-- Select the names, continents, and areas of countries
-- Note that the query may take a long time to run if you remove the LIMIT statement
SELECT countries.name AS country_name,
        continents.name AS continent_name,
        area
FROM countries
  INNER JOIN regions USING(region_id)
  INNER JOIN continents USING(continent_id)
ORDER BY country_name
LIMIT 5

'Answering business questions' query (chapter 4)

Spinner
DataFrameas
df1
variable
SELECT release_year, COUNT(DISTINCT language) AS distinct_langs
FROM cinema.films
GROUP BY release_year
ORDER BY distinct_langs DESC;
Hidden output