Skip to content

Video games are big business: the global gaming market is projected to be worth more than $300 billion by 2027 according to Mordor Intelligence. With so much money at stake, the major game publishers are hugely incentivized to create the next big hit. But are games getting better, or has the golden age of video games already passed?

In this project, I analyzed video game critic and user scores as well as sales data for the top 400 video games released since 1977. You'll find the golden age of video games by identifying release years that users and critics liked best, and you'll explore the business side of gaming by looking at game sales data.

For this challange I used basic SQL like joining datasets, filtering, group, and order data.

game_sales table

ColumnDefinitionData Type
nameName of the video gamevarchar
platformGaming platformvarchar
publisherGame publishervarchar
developerGame developervarchar
games_soldNumber of copies sold (millions)float
yearRelease yearint

reviews table

ColumnDefinitionData Type
nameName of the video gamevarchar
critic_scoreCritic score according to Metacriticfloat
user_scoreUser score according to Metacriticfloat

users_avg_year_rating table

ColumnDefinitionData Type
yearRelease year of the games reviewedint
num_gamesNumber of games released that yearint
avg_user_scoreAverage score of all the games ratings for the yearfloat

critics_avg_year_rating table

ColumnDefinitionData Type
yearRelease year of the games reviewedint
num_gamesNumber of games released that yearint
avg_critic_scoreAverage score of all the games ratings for the yearfloat
Spinner
DataFrameas
best_selling_games
variable
-- best_selling_games
/* Find the ten best-selling games. The output should contain all the columns in the game_sales table and be sorted by the games_sold column in descending order. Save the output as best_selling_games. */

SELECT
	*
FROM
	public.game_sales
ORDER BY games_sold DESC
LIMIT 10
;
Spinner
DataFrameas
critics_top_ten_years
variable
-- critics_top_ten_years with at least four released games for that year ensuring good sample size. Challange: don't use critics table */

-- SELECT years with at least 4 games
SELECT 
    year, 
    COUNT(game_sales.name) AS num_games, 
	ROUND(AVG(reviews.critic_score),2) AS avg_critic_score
FROM game_sales
INNER JOIN reviews
ON game_sales.name = reviews.name
GROUP BY year
HAVING COUNT(*) >= 4
ORDER BY avg_critic_score DESC
LIMIT 10;
Spinner
DataFrameas
golden_years
variable
-- golden_years
/* Find the years where critics AND users broadly agreed that the games released were highly rated (i.e. score over 9) */

SELECT
	u.year,
	u.num_games,
	c.avg_critic_score,
	u.avg_user_score,
	c.avg_critic_score - u.avg_user_score AS diff
FROM users_avg_year_rating AS u 
INNER JOIN critics_avg_year_rating AS c
ON u.year = c.year
WHERE u.avg_user_score > 9 OR c.avg_critic_score > 9
ORDER BY year ASC;