Project Overview
Video games are a massive industry, with the global gaming market projected to surpass $300 billion by 2027, according to Mordor Intelligence. With so much on the line, game publishers are constantly striving to create the next big hit. But the question remains—are video games truly getting better, or have we already experienced the golden age?
In this project, I analyzed critic and user scores, as well as sales data during their release year, for the top 400 video games released from 1977 up to 2020. My goal was to identify the years that gamers and critics loved the most and delve into the business side of gaming through sales data.
Methodology
To carry out this analysis, I used SQL to join datasets, filter, group, and order data. The database contains the following tables, each limited to 400 rows for this project:
game_sales table
game_sales table| Column | Definition | Data Type |
|---|---|---|
| name | Name of the video game | varchar |
| platform | Gaming platform | varchar |
| publisher | Game publisher | varchar |
| developer | Game developer | varchar |
| games_sold | Number of copies sold (millions) | float |
| year | Release year | int |
reviews table
reviews table| Column | Definition | Data Type |
|---|---|---|
| name | Name of the video game | varchar |
| critic_score | Critic score according to Metacritic | float |
| user_score | User score according to Metacritic | float |
users_avg_year_rating table
users_avg_year_rating table| Column | Definition | Data Type |
|---|---|---|
| year | Release year of the games reviewed | int |
| num_games | Number of games released that year | int |
| avg_user_score | Average score of all the games ratings for the year | float |
critics_avg_year_rating table
critics_avg_year_rating table| Column | Definition | Data Type |
|---|---|---|
| year | Release year of the games reviewed | int |
| num_games | Number of games released that year | int |
| avg_critic_score | Average score of all the games ratings for the year | float |
Top 10 Best Selling Games
-- best_selling_games
SELECT *
FROM game_sales
ORDER BY games_sold DESC
LIMIT 10;
According to the data, "Wii Sports for Wii" by Nintendo is the highest selling game, which sold 82.9 million copies during 2006. It has a 51.45% difference from the second game on the list, "Super Mario Bros. for NES", which sold 40.24 million copies during 1985.
Top 5 Years for Game Sales
-- sales_top_5_years
SELECT year, COUNT(name) AS num_games, SUM(games_sold) AS total_games_sold
FROM game_sales
GROUP BY year
ORDER BY total_games_sold DESC
LIMIT 5;
Looking at the table, 2010 has the highest number of total games sold, at 219.3 million, with 25 games released that year. 2006 comes in 3rd place with 216.74 million from 17 games.
The line chart below shows that sales have been on the rise since 2003, spiked in 2006, achieved the highest sales in 2010, and then started decreasing in 2011.
SELECT year, SUM(games_sold) AS total_games_sold
FROM game_sales
GROUP BY year
ORDER BY year2010 Games
-- games_2010
WITH total_games_sold AS (
SELECT SUM(games_sold) AS total
FROM game_sales
WHERE year = 2010
)
SELECT
r.name,
g.platform,
c.avg_critic_score,
u.avg_user_score,
c.avg_critic_score - u.avg_user_score AS diff,
g.games_sold,
ROUND((g.games_sold / t.total) * 100, 2) AS percentage_of_total
FROM
reviews r
JOIN
game_sales g ON r.name = g.name
JOIN
critics_avg_year_rating c ON g.year = c.year
JOIN
users_avg_year_rating u ON c.year = u.year
JOIN
total_games_sold t ON TRUE
WHERE
c.year = 2010
ORDER BY
u.avg_user_score ASC;The top game in 2010 is "Minecraft for PC" by Mojang which sold 33.15 million copies, marking it behind "Wii Sports for Wii with a 60.02%"" difference.
2006 Games
-- games_2006
WITH total_games_sold AS (
SELECT SUM(games_sold) AS total
FROM game_sales
WHERE year = 2006
)
SELECT
r.name,
g.platform,
c.avg_critic_score,
u.avg_user_score,
c.avg_critic_score - u.avg_user_score AS diff,
g.games_sold,
ROUND((g.games_sold / t.total::numeric) * 100, 2) AS percent_of_total
FROM
reviews r
JOIN
game_sales g ON r.name = g.name
JOIN
critics_avg_year_rating c ON g.year = c.year
JOIN
users_avg_year_rating u ON c.year = u.year
JOIN
total_games_sold t ON TRUE
WHERE
c.year = 2006
ORDER BY
games_sold DESC;