Skip to content
Introduction to SQL
  • AI Chat
  • Code
  • Report
  • Intermediate SQL Queries

    Here you can access every table used in the course. To access each table, you will need to specify the cinema schema in your queries (e.g., cinema.reviews for the reviews table.

    Filtering

    • WHERE clause can also be used to filter numeric records, such as years or ages.
      • SELECT * FROM films WHERE budget > 10000;
      • SELECT * FROM films WHERE release_year='2016'
    • WHERE clause can also be used to filter text results, such as names or countries.
      • SELECT * FROM films WHERE language='French'
    • You can build up your WHERE queries by combining multiple conditions with the AND keyword.
      • Note that you need to specify the column name separately for every AND condition
        • SELECT title FROM films WHERE release_year > 1994 AND release_year < 2000;
        • SELECT title, release_year FROM films WHERE language='Spanish' AND release_year<'2000'
    • You can select rows that meet some but not all conditions by using the OR clause
      • Note the use of parenthesis
        • SELECT title, release_year FROM films WHERE (release_year >= 1990 AND release_year < 2000) AND (language = 'French' OR language = 'Spanish') AND (gross>2000000)
    • BETWEEN clAuse allows for the return of a range of values.
      • the range will be inclusive of the values at the beginning and end of the command
      • SELECT title, release_year FROM films WHERE release_year BETWEEN 1990 AND 2000
      • Similar to the WHERE clause, the BETWEEN clause can be used with multiple AND and OR operators, so you can build up your queries and make them even more powerful!
      • SELECT title, release_year FROM films WHERE (release_year BETWEEN 1990 AND 2000) AND (budget > 100000000) AND (language = 'Spanish' OR language='French');
    • IN operator allows you to specify multiple values in a WHERE clause, making it easier and quicker to specify multiple OR conditions!
      • SELECT name FROM kids WHERE age IN (2, 4, 6, 8, 10);
    • NULL/NOT NULL finds if there is a missing value
      • SELECT name FROM people WHERE deathdate IS NULL
    • LIKE operator can be used in a WHERE clause to search for a pattern in a column.
    • NOT LIKE operator to find records that don't match the pattern you specify
      • % wildcard will match zero, one, or many characters in text
        • SELECT name FROM companies WHERE name LIKE 'Data%'; Results can be: 'Data', 'DataC' 'DataCamp', 'DataMind'
      • _ wildcard will match a single character.
        • SELECT name FROM companies WHERE name LIKE 'DataC_mp'; Results can be:'DataCamp', 'DataComp'
      • _EX: return values with'r' as the second letter
        • SELECT name FROM people
          WHERE name LIKE '_r%'

    Aggregate Functions

    AVG function gives you the average value MAX function returns the highest value SUM function returns the results of adding up the numeric values in a field MIN function returns the lowest value

    • Aggregate functions can be combined with the WHERE clause to gain further insights from your data.

      • For example, to get the total budget of movies made in the year 2010 or later:
        • SELECT SUM(budget) FROM films WHERE release_year >= 2010;
    • In addition to using aggregate functions, you can perform basic arithmetic with symbols like +, -, *****, and /.

      • SQL assumes that if you divide an integer by an integer, you want to get an integer back. So be careful when dividing!
        • SELECT (4 / 3); result is 1
      • If you want more precision when dividing, you can add decimal places to your numbers.
        • SELECT (4.0 / 3.0) AS result; 1.33
    • AS function for aliasing

    Get the title and net profit (the amount a film grossed, minus its budget) for all films. Alias the net profit as net_profit.

    Unknown integration
    DataFrameavailable as
    movie_info
    variable
    -- Add your own queries here
    SELECT title, (gross-budget) AS net_profit
    FROM cinema.films
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    Hidden output

    Get the title and duration in hours for all films. Alias the duration in hours as duration_hours

    Unknown integration
    DataFrameavailable as
    df
    variable
    SELECT title, (duration/60.0) AS duration_hours
    FROM cinema.films
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    Hidden output

    Get the average duration in hours for all films, aliased as avg_duration_hours.

    Unknown integration
    DataFrameavailable as
    df
    variable
    SELECT AVG(duration)/60.0 AS avg_duration_hours
    FROM cinema.films
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    Hidden output

    Get the percentage of people who are no longer alive. Alias the result as percentage_dead.

    Unknown integration
    DataFrameavailable as
    df
    variable
    SELECT (COUNT(deathdate)*100.0/COUNT(*)) AS percentage_dead
    FROM cinema.people
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    Hidden output

    Get the number of decades the films table covers. Alias the result as number_of_decades. The top half of your fraction should be enclosed in parentheses.

    Unknown integration
    DataFrameavailable as
    df
    variable
    SELECT (MAX(release_year)-MIN(release_year))/10.0 AS number_of_decades
    FROM cinema.films
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    Hidden output

    Sorting and Grouping

    ORDER BY keyword is used to sort results in ascending or descending order according to the values of one or more columns.

    • By default ORDER BY will sort in ascending order. If you want to sort the results in descending order, you can use the DESC keyword.
    • ORDER BY can also be used to sort on multiple columns. It will sort by the first column specified, then sort by the next, then the next, and so on. GROUP BY allows you to group a result by one or more columns,
    • Commonly, GROUP BY is used with aggregate functions like COUNT() or MAX(). Note that GROUP BY always goes after the FROM clause!
    • ORDER BY comes after GROUP BY clause. HAVING clause allows for filtering based on aggregate functions, in the GROUP BY clause
    • SELECT release_year FROM films GROUP BY release_year HAVING COUNT(title) > 10;shows only those years in which more than 10 films were released.

    average budget and average gross earnings for films in each year after 1990, if the average budget is greater than $60 million