If the data include duplicate, we use function COUNT still include duplicates, using SELECT COUNT(DISTINCT sth) to **remove duplicates. **
OR, AND, BETWEEN are used to query multiple criteria.
OR to satisfy at least 1 condition.
WHERE + AND to satisfy ALL criterias,
WHERE + AND & WHERE + OR must be used in ()
WHERE + BETWEEN + AND for filtering values within a specified range
BETWEEN means inclusive, it contains the beginning and end values
FILTERING TEXT Filter LIKE, NOT LIKE, IN
- LIKE used to search for a pattern in a field
% will match 0, 1 or many characters in the text
For example: SELECT name FROM people WHERE name LIKE 'Ade%' the result will come out Adele, Adelia, Aden
_ match a single character
For example: SELECT name FROM people WHERE name LIKE 'Ev_'; The result come out: Eve, Eva etc
- NOT LIKE To find records that don't match the specified pattern.
WHERE....NOT LIKE....
Arithmetic
To divide to get exact result, need to add 0.
For example: 4.0/3.0 = 1.333
GROUPING DATA
To summarize data for a particular group of results.
GROUP BY is commonly used with aggregate functions to provide summary statistics.
Can combine GROUP BY with ORDER BY to group our results, make a calculation, and then order our results.
GROUP BY is always written before ORDER BY
SORTING
To put our data in a specific order.
The ORDER BY keyword is used to sort results of one or more fields. When used on its own, it is written after the FROM statement.
Add the ASC keyword to our query to clarify that we are sorting in ascending order (smallest to biggest) For ex: ORDER BY budget ASC
Can use the DESC keyword to sort the results in descending order.
Add your notes here
In SQL, we can't filter aggregate functions with WHERE clauses For ex: WHERE COUNT(title) > 10 => CANNOT
Use HAVING instead WHERE to filter based on the result of an aggregate function. For ex: HAVING COUNT(title) > 10
WHERE filters individual records while HAVING filters grouped records
-- Add your own queries here
SELECT *
FROM cinema.reviews
LIMIT 5
Explore Datasets
Use the descriptions
, films
, people
, reviews
, and roles
tables to explore the data and practice your skills!
- Which titles in the
reviews
table have an IMDB score higher than 8.5? - Select all titles from Germany released after 2010 from the
films
table. - Calculate a count of all movies by country using the
films
table.
WHERE is a filtering clause to answer relevant questions, use the "" for the string in the WHERE clause.
NULL means human error or the information is unavailble or unknown.
Using WHERE sth NULL or NOT NULL to include or exlcude missing values.
If want to understand the dataset as a whole but not individual records. One way to do this is to summarize the data using SQL's aggregate functions. An aggregate function performs a calculation on several values and returns a single value.
They are: **COUNT, AVG, MIN, MAX, SUM **
Numerical data function only: AVG(), SUM() Various data types: COUNT, MIN, MAX (A-Z or Z-A)
ROUND() to round number to specified decimal place.