Skip to content
Intermediate SQL
  • AI Chat
  • Code
  • Report
  • Intermediate SQL

    Here you can access every table used in the course. To access each table, you will need to specify the cinema schema in your queries (e.g., cinema.reviews for the reviews table.


    Note: When using sample integrations such as those that contain course data, you have read-only access. You can run queries, but cannot make any changes such as adding, deleting, or modifying the data (e.g., creating tables, views, etc.).

    Take Notes

    Add notes about the concepts you've learned and SQL cells with queries you want to keep.

    Results often include duplicates, we can use distinct to select all unique values from the field Combining count and distinct is useful to count all unique values in a field

    -- Add your own queries here SELECT * FROM cinema.reviews LIMIT 5

    --Count, counts the number of records with a value in a field ; use alias for clarity COUNT(field name) --- counts values in field COUNT(*) --- counts records in a table

    • --- represents all fields

    -- Count with Distinct, Distinct removes duplicates to return only unique values SELECT COUNT (DISTINCT birthdate) AS count_distinct_birthdates FROM people;

    -- SQL formatting --- formatting is not required, but makes code difficult to read ----- capitalized key words, you can indent to identify multiple -------- add semi-colon, it's not necessary but helps distinguish end of code, is best practice, some sql flavors require it so easier to have the habit and transfer code ------ https://www.sqlstyle.guide/

    --- WHERE, allows you to filter for specific data, comes after FROM statement --- SELECT -> FROM -> WHERE -> LIMIT for Postgresql, swith select and from for SQL server

    SELECT title FROM films WHERE release_year < 1960

    --Comparison Operators -- <, >, = <=, >=, <>

    --with an exception SELECT title FROM films WHERE release_year <> 1960

    --Multiple Critera for Filtering -- OR, AND, BETWEEN

    SELECT * FROM coats WHERE color = 'yellow' OR length = 'short';

    --OR when you need to statisfy at least one condition --OR example

    ---VALID SELECT * FROM films WHERE release_year = 1994 OR release_year = 2000 ;

    ---INVALID SELECT * FROM films WHERE release_year = 1994 OR 2000 ;

    --------AND ---Use AND to satisfy all critera

    ------VALID SELECT title FROM films WHERE release_year > 1994 AND release_year < 2000 ;

    ---AND, OR SELECT title FROM films WHERE (release_year = 1994 OR release_year = 1995) AND (certification = 'PG' OR certification = 'R') ;

    -----BETWEEN, AND -----Provides shortand with AND ----Between is inclusive, contains beginning and end values

    SELECT title FROM films WHERE release_year BETWEEN 1994 AND 2000 ;

    ----Can Add Multiple at Once with Between SELECT title FROM films WHERE release_year BETWEEN 1994 AND 2000 AND country='UK' ;

    ** ------------------------------- Filtering Text

    ** WHERE can also filter text Filter a pattern rather than a specific text string Can be done with LIKE, NOT LIKE, IN


    LIKE - used to search for pattern in field use wildcards, "%" and "_" % will match zero, one, or many characters in a text _ will match a single character

    ** % Code** SELECT name FROM people WHERE name LIKE 'Ade%' ;

    ** _ Code** SELECT name FROM people WHERe name LIKE 'Ev_'

    -- NOT LIKE Case sensitive, find records that don't match the pattern

    SELECT name FROM people WHERE name NOT LIKE 'A.%'


    Wildcard positions can be put anywhere

    IN Allows us to specify multiple places in a where clause

    SELECT title FROM films WHERE release_year IN (1920, 1930, 1940) ;

    ----identify NULL

    IS NULL

    SELECT name FROM people WHERE birthdate IS NULL ;

    SELECT COUNT(*) as no_birthdates FROM people WHERE birthdate IS NULL (or IS NOT NULL)

    ----- NEW Functions AVG(), SUM(), MIN(), MAX(), COUNT()

    -AVG: Average of numerical variable, only numerical -SUM: Result of adding the value, only numerical -MIN: Lowest Value, can be used on both numerical and non-numerical -MAX: Highest Value, can be used on both numerical and non-numerical

    Some can be used with numerical and non-numerical fields

    -Min and Max can give the record that is figueratively the lowest or highest (e.g., earliest date, country beginning with A)

    ROUND() - round to a specified decimal ROUND(number to round, decimal places)

    SELECT ROUND(AVG(budget), 2) AS avg_budget FROM films WHERE release_year >= 2010 ;

    defaults to whole number with second part is left out can pass a negative number and still get a result

    ---Arithmatic and Aliasaing We can add subtract and multiply in SQL

    When dividing we can add decimal places to our numbers if we want more precision - SELECT (4/3) or SELECT (4.0 / 3.0)

    Aggregate Functions - perform operations on the fields vertically Arthimatic - performs calculations horizontally

    SELECT (gross - budget) AS profit FROM films ;

    You must always use an alias with arthimatic and aggregative functions If using multiple max functions you get two max functions, its important to alias

    Order of Execution FROM WHERE SELECT (alias defined here) LIMIT SELECT


    Sorting Results

    Order By - orders in ascending order of one or more results, from A-Z smallest to biggest Comes after from Can add ASC or DESC to clarify ascending or descending order

    Can help to include the variable we're sorting on for clarity Can be used for multiple fields - ORDER BY field one, field two ; - Second field is a tie breaker of sorts

    SELECt title, wins FROM best_movies ORDER by wins DESC ; -- multiple have top results of 11

    SELECT title, wins, imdb_score FROM best_movies ORDER BY wins DESC, imdb_score DESC ; -----breaks tie

    GROUP BY is used with aggregate functions to privde summary statistics
    - can do when only grouping a single field, and selecting multiple fields SQL will make an error when selecting a field not in our group by clause

    In SQL we can't filter aggregate functions with WHERE clauses - example, filtering the title with COUNT is in valid Groups have special HAVING filter

    Example SELECT certification, COUNT(title) AS title_count FROM films WHERE certification IN('G','PG','PG-13') GROUP BY certification HAVING COUNT(title) > 500 ORDER BY title_count DESC LIMIT 3 ;

    WHERE: filters individual records HAVING filters group recrods

    Difference between WHERE and HAVING: Example "What films were released in the year 2000" - WHERE clause "In what years was the avg film duration over 2 hours" - HAVING clause

    Unknown integration
    DataFrameavailable as
    df
    variable
    -- Add your own queries here
    SELECT *
    FROM cinema.reviews
    LIMIT 5
    
    --Count, counts the number of records with a value in a field ; use alias for clarity 
    COUNT(field name) --- counts values in field 
    COUNT(*) --- counts records in a table
    * --- represents all fields
    
    -- Count with Distinct, Distinct removes duplicates to return only unique values 
    SELECT COUNT (DISTINCT birthdate) AS count_distinct_birthdates 
    FROM people; 
    
    -- SQL formatting 
    --- formatting is not required, but makes code difficult to read 
    ----- capitalized key words, you can indent to identify multiple
    -------- add semi-colon, it's not necessary but helps distinguish end of code, is best practice, some sql flavors require it so easier to have the habit and transfer code 
    ------ https://www.sqlstyle.guide/ 
    
    --- WHERE, allows you to filter for specific data, comes after FROM statement 
    --- SELECT -> FROM -> WHERE -> LIMIT for Postgresql, swith select and from for SQL server 
    
    SELECT title
    FROM films 
    WHERE release_year < 1960 
    
    --Comparison Operators 
    -- <, >, = <=, >=, <> 
    
    --with an exception
    SELECT title
    FROM films
    WHERE release_year <> 1960
    
    --Multiple Critera for Filtering 
    -- OR, AND, BETWEEN 
    
    SELECT * 
    FROM coats 
    WHERE color = 'yellow' OR length = 'short';
    
    --OR when you need to statisfy at least one condition 
    --OR example 
    
    ---VALID 
    SELECT * 
    FROM films 
    WHERE release_year = 1994 
    	OR release_year = 2000 ;
    	
    ---INVALID 
    SELECT * 
    FROM films 
    WHERE release_year = 1994 OR 2000 ;
    
    --------AND 
    ---Use AND to satisfy all critera
    
    ------VALID 
    SELECT title
    FROM films 
    WHERE release_year > 1994 
    	AND release_year < 2000 ; 
    	
    ---AND, OR 
    SELECT title
    FROM films
    WHERE (release_year = 1994 OR release_year = 1995)
    	AND (certification = 'PG' OR certification = 'R') ; 
    
    
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

    Explore Datasets

    Use the descriptions, films, people, reviews, and roles tables to explore the data and practice your skills!

    • Which titles in the reviews table have an IMDB score higher than 8.5?
    • Select all titles from Germany released after 2010 from the films table.
    • Calculate a count of all movies by country using the films table.