Video games are big business: the global gaming market is projected to be worth more than $300 billion by 2027 according to Mordor Intelligence. With so much money at stake, the major game publishers are hugely incentivized to create the next big hit. But are games getting better, or has the golden age of video games already passed?

In this project, you'll analyze video game critic and user scores as well as sales data for the top 400 video games released since 1977. You'll search for a golden age of video games by identifying release years that users and critics liked best, and you'll explore the business side of gaming by looking at game sales data.

Your search will involve joining datasets and comparing results with set theory. You'll also filter, group, and order data. Make sure you brush up on these skills before trying this project! The database contains the four tables described below. Each table has been limited to 400 rows for this project, but you can find the complete dataset with over 13,000 games on Kaggle.

game_sales table

| Column | Definition | Data Type |
| --- | --- | --- |
| name | Name of the video game | varchar |
| platform | Gaming platform | varchar |
| publisher | Game publisher | varchar |
| developer | Game developer | varchar |
| games_sold | Number of copies sold (millions) | float |
| year | Release year | int |

reviews table

| Column | Definition | Data Type |
| --- | --- | --- |
| name | Name of the video game | varchar |
| critic_score | Critic score according to Metacritic | float |
| user_score | User score according to Metacritic | float |

users_avg_year_rating table

| Column | Definition | Data Type |
| --- | --- | --- |
| year | Release year of the games reviewed | int |
| num_games | Number of games released that year | int |
| avg_user_score | Average user score of all games rated for the year | float |

critics_avg_year_rating table

| Column | Definition | Data Type |
| --- | --- | --- |
| year | Release year of the games reviewed | int |
| num_games | Number of games released that year | int |
| avg_critic_score | Average critic score of all games rated for the year | float |
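-- best_selling_games
-- Total copies sold (millions) per game. year is selected without an aggregate, so it must appear in GROUP BY.
SELECT name, SUM(games_sold) AS total_sales, year
FROM game_sales
GROUP BY name, year
ORDER BY total_sales DESC
LIMIT 10;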
-- critics_top_ten_years
-- Group by release year (not by game) so that each row is one year's average critic score.
SELECT gs.year, AVG(r.critic_score) AS avg_critic_score, COUNT(r.name) AS count_of_reviews
FROM public.game_sales gs
    JOIN public.reviews r ON r.name = gs.name
GROUP BY gs.year
HAVING COUNT(r.name) > 4  -- keep only years with more than four reviewed games
ORDER BY avg_critic_score DESC, count_of_reviews DESC
LIMIT 10;
-- golden_years
-- Years where the average critic score or the average user score is above 9, ordered by how closely the two agree.
WITH combined AS (
    SELECT 
        c.year,
        c.num_games AS num_games_critic,
        u.num_games AS num_games_user,
        c.avg_critic_score,
        u.avg_user_score,
        ABS(c.avg_critic_score - u.avg_user_score) AS diff
    FROM critics_avg_year_rating c
    JOIN users_avg_year_rating u ON c.year = u.year
    WHERE c.avg_critic_score > 9 OR u.avg_user_score > 9
)
SELECT 
    year,
    CASE 
        WHEN num_games_critic = num_games_user THEN num_games_critic
        ELSE NULL
    END AS num_games,
    avg_critic_score,
    avg_user_score,
    diff
FROM combined
ORDER BY diff ASC;
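
The set-theory comparison mentioned in the project description could be sketched as follows. This is a minimal illustration, not part of the original notebook: it wraps two SQL set operations (INTERSECT and EXCEPT) in Python's standard sqlite3 module so it can run without the project database. The table and column names come from the schema above; the helper function names, the threshold parameter, and every row inserted by `build_demo_db` are made up for the demonstration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few invented rows (not real project data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE critics_avg_year_rating (year INT, num_games INT, avg_critic_score REAL);
        CREATE TABLE users_avg_year_rating (year INT, num_games INT, avg_user_score REAL);
        -- Hypothetical values chosen only to exercise the set operations.
        INSERT INTO critics_avg_year_rating VALUES (2001, 5, 9.3), (2002, 6, 8.7), (2003, 4, 9.1);
        INSERT INTO users_avg_year_rating VALUES (2001, 5, 9.2), (2002, 6, 9.4), (2003, 4, 8.8);
        """
    )
    return conn


def years_both_rated_highly(conn, threshold=9.0):
    """Years where BOTH the critic and the user average exceed the threshold (INTERSECT)."""
    query = """
        SELECT year FROM critics_avg_year_rating WHERE avg_critic_score > :t
        INTERSECT
        SELECT year FROM users_avg_year_rating WHERE avg_user_score > :t
        ORDER BY year
    """
    return [row[0] for row in conn.execute(query, {"t": threshold})]


def years_only_critics_rated_highly(conn, threshold=9.0):
    """Years critics rated above the threshold but users did not (EXCEPT)."""
    query = """
        SELECT year FROM critics_avg_year_rating WHERE avg_critic_score > :t
        EXCEPT
        SELECT year FROM users_avg_year_rating WHERE avg_user_score > :t
        ORDER BY year
    """
    return [row[0] for row in conn.execute(query, {"t": threshold})]


if __name__ == "__main__":
    demo = build_demo_db()
    print(years_both_rated_highly(demo))
    print(years_only_critics_rated_highly(demo))
```

The two SELECT statements inside each query should also work unchanged against the project tables, since INTERSECT and EXCEPT are standard SQL. Compared with the OR filter in golden_years, INTERSECT keeps only the years on which critics and users agree.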