Video games are big business: the global gaming market is projected to be worth more than $300 billion by 2027 according to Mordor Intelligence. With so much money at stake, the major game publishers are hugely incentivized to create the next big hit. But are games getting better, or has the golden age of video games already passed?
In this project, you'll analyze video game critic and user scores as well as sales data for the top 400 video games released since 1977. You'll search for a golden age of video games by identifying release years that users and critics liked best, and you'll explore the business side of gaming by looking at game sales data.
Your search will involve joining datasets and comparing results with set theory. You'll also filter, group, and order data. Make sure you brush up on these skills before trying this project! The database contains two tables. Each table has been limited to 400 rows for this project, but you can find the complete dataset with over 13,000 games on Kaggle.
game_sales table
game_sales table| Column | Definition | Data Type |
|---|---|---|
| name | Name of the video game | varchar |
| platform | Gaming platform | varchar |
| publisher | Game publisher | varchar |
| developer | Game developer | varchar |
| games_sold | Number of copies sold (millions) | float |
| year | Release year | int |
reviews table
reviews table| Column | Definition | Data Type |
|---|---|---|
| name | Name of the video game | varchar |
| critic_score | Critic score according to Metacritic | float |
| user_score | User score according to Metacritic | float |
users_avg_year_rating table
users_avg_year_rating table| Column | Definition | Data Type |
|---|---|---|
| year | Release year of the games reviewed | int |
| num_games | Number of games released that year | int |
| avg_user_score | Average score of all the games ratings for the year | float |
critics_avg_year_rating table
critics_avg_year_rating table| Column | Definition | Data Type |
|---|---|---|
| year | Release year of the games reviewed | int |
| num_games | Number of games released that year | int |
| avg_critic_score | Average score of all the games ratings for the year | float |
-- best_selling_games
-- Select all the columns from the table game_sales.
SELECT *
FROM public.game_sales
-- Order the results based on the games_sold column, in descending order and return the top ten results.
ORDER BY games_sold DESC
LIMIT 10;-- critics_top_ten_years
-- Select the release year, number of games released, and average critic score for each year;
SELECT cayr.year,
cayr.num_games,
-- Round the column for average critic score to two decimal places.
ROUND(cayr.avg_critic_score, 2) AS avg_critic_score
FROM public.critics_avg_year_rating AS cayr
-- Join the game_sales and reviews tables so that only games which appear on both tables are represented.
INNER JOIN public.game_sales AS gs
ON cayr.year = gs.year
INNER JOIN public.reviews as r
ON gs.name = r.name
-- Only include years where the count of games released is more than 4.
WHERE cayr.num_games > 4
-- Group the data by release year, number of games released in a year, and its respective average critic scores.
GROUP BY cayr.year, cayr.num_games, cayr.avg_critic_score
-- Order the data from highest to lowest avg_critic_score and limit the results to the top ten years.
ORDER BY ROUND(cayr.avg_critic_score, 2) DESC
LIMIT 10;-- golden_years
-- Select the release year, number of games released, average critic score, average user score and difference between the critic and the user scores.
SELECT cayr.year,
cayr.num_games,
cayr.avg_critic_score,
uayr.avg_user_score,
-- Calculate the difference by subtracting the critic score from the user score
-- Have it aliased as diff.
ROUND(cayr.avg_critic_score - uayr.avg_user_score, 2) AS diff
FROM public.critics_avg_year_rating AS cayr
-- Join the critics_avg_year_rating and users_avg_year_rating tables so that only games which appear on both tables are represented.
INNER JOIN public.users_avg_year_rating AS uayr
ON cayr.year = uayr.year
-- Filter the date, so that only years that had an average rating of higher than nine from either the critics or the users appear.
WHERE cayr.avg_critic_score > 9 OR uayr.avg_user_score > 9
-- ORDER the data in ascending order by the year column.
ORDER BY cayr.year ASC;