Skip to content

Project Overview

Video games are a massive industry, with the global gaming market projected to surpass $300 billion by 2027, according to Mordor Intelligence. With so much on the line, game publishers are constantly striving to create the next big hit. But the question remains—are video games truly getting better, or have we already experienced the golden age?

In this project, I analyzed critic and user scores, as well as sales data during their release year, for the top 400 video games released from 1977 up to 2020. My goal was to identify the years that gamers and critics loved the most and delve into the business side of gaming through sales data.

Methodology

To carry out this analysis, I used SQL to join datasets, filter, group, and order data. The database contains the following tables, each limited to 400 rows for this project:

game_sales table

ColumnDefinitionData Type
nameName of the video gamevarchar
platformGaming platformvarchar
publisherGame publishervarchar
developerGame developervarchar
games_soldNumber of copies sold (millions)float
yearRelease yearint

reviews table

ColumnDefinitionData Type
nameName of the video gamevarchar
critic_scoreCritic score according to Metacriticfloat
user_scoreUser score according to Metacriticfloat

users_avg_year_rating table

ColumnDefinitionData Type
yearRelease year of the games reviewedint
num_gamesNumber of games released that yearint
avg_user_scoreAverage score of all the games ratings for the yearfloat

critics_avg_year_rating table

ColumnDefinitionData Type
yearRelease year of the games reviewedint
num_gamesNumber of games released that yearint
avg_critic_scoreAverage score of all the games ratings for the yearfloat

Top 10 Best Selling Games

Spinner
DataFrameas
best_selling_games
variable
-- best_selling_games
SELECT *
FROM game_sales
ORDER BY games_sold DESC
LIMIT 10;

According to the data, "Wii Sports for Wii" by Nintendo is the highest selling game, which sold 82.9 million copies during 2006. It has a 51.45% difference from the second game on the list, "Super Mario Bros. for NES", which sold 40.24 million copies during 1985.

Top 5 Years for Game Sales

Spinner
DataFrameas
sales_top_5_years
variable
-- sales_top_5_years
SELECT year, COUNT(name) AS num_games, SUM(games_sold) AS total_games_sold
FROM game_sales
GROUP BY year
ORDER BY total_games_sold DESC
LIMIT 5;

Looking at the table, 2010 has the highest number of total games sold, at 219.3 million, with 25 games released that year. 2006 comes in 3rd place with 216.74 million from 17 games.

The line chart below shows that sales have been on the rise since 2003, spiked in 2006, achieved the highest sales in 2010, and then started decreasing in 2011.

Spinner
DataFrameas
df
variable
SELECT year, SUM(games_sold) AS total_games_sold
FROM game_sales
GROUP BY year
ORDER BY year

2010 Games

Spinner
DataFrameas
games_2010
variable
-- games_2010
WITH total_games_sold AS (
    SELECT SUM(games_sold) AS total
    FROM game_sales
    WHERE year = 2010
)
SELECT
    r.name,
    g.platform,
    c.avg_critic_score,
    u.avg_user_score,
    c.avg_critic_score - u.avg_user_score AS diff,
    g.games_sold,
    ROUND((g.games_sold / t.total) * 100, 2) AS percentage_of_total
FROM
    reviews r
JOIN
    game_sales g ON r.name = g.name
JOIN
    critics_avg_year_rating c ON g.year = c.year
JOIN
    users_avg_year_rating u ON c.year = u.year
JOIN
    total_games_sold t ON TRUE
WHERE
    c.year = 2010
ORDER BY
    u.avg_user_score ASC;

The top game in 2010 is "Minecraft for PC" by Mojang which sold 33.15 million copies, marking it behind "Wii Sports for Wii with a 60.02%"" difference.

2006 Games

Spinner
DataFrameas
games_2006
variable
[24]
-- games_2006
WITH total_games_sold AS (
    SELECT SUM(games_sold) AS total
    FROM game_sales
    WHERE year = 2006
)
SELECT
    r.name,
    g.platform,
    c.avg_critic_score,
    u.avg_user_score,
    c.avg_critic_score - u.avg_user_score AS diff,
    g.games_sold,
    ROUND((g.games_sold / t.total::numeric) * 100, 2) AS percent_of_total
FROM
    reviews r
JOIN
    game_sales g ON r.name = g.name
JOIN
    critics_avg_year_rating c ON g.year = c.year
JOIN
    users_avg_year_rating u ON c.year = u.year
JOIN
    total_games_sold t ON TRUE
WHERE
    c.year = 2006
ORDER BY
    games_sold DESC;