Video Games Sales Data
This dataset contains sales records for popular video games in North America, Japan, Europe, and other parts of the world. Every video game in this dataset has at least 100,000 global sales.
Not sure where to begin? Scroll to the bottom to find challenges!
import pandas as pd
sales = pd.read_csv("vgsales.csv", index_col=0)
print(sales.shape)
sales.head()
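# Filter the data for the seventh-generation consoles (Xbox 360, PlayStation 3, Wii).
# Assumption: the Platform column uses abbreviated codes such as 'X360';
# check sales['Platform'].unique() and adjust the list if your copy differs.
seventh_gen = sales[sales['Platform'].isin(['X360', 'PS3', 'Wii'])]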
# Group by platform and calculate total global sales
total_sales = seventh_gen.groupby('Platform')['Global_Sales'].sum()
# Print the results
print(total_sales)
print(total_sales.shape)
total_sales.head()
# Filter the data for the top 3 genres
top_3_genres = sales[sales['Genre'].isin(sales['Genre'].value_counts().index[:3])]
# Group by genre and calculate average sales
avg_sales = top_3_genres.groupby('Genre')[['NA_Sales', 'EU_Sales', 'Global_Sales']].mean()
# Plot the results
import matplotlib.pyplot as plt
avg_sales.plot(kind='bar')
plt.show()
total_sales.plot(kind='bar')
plt.show()
Data Dictionary
| Column | Explanation | 
|---|---|
| Rank | Ranking of overall sales | 
| Name | Name of the game | 
| Platform | Platform the game was released on (e.g., PC, PS4) | 
| Year | Year the game was released | 
| Genre | Genre of the game | 
| Publisher | Publisher of the game | 
| NA_Sales | Number of sales in North America (in millions) | 
| EU_Sales | Number of sales in Europe (in millions) | 
| JP_Sales | Number of sales in Japan (in millions) | 
| Other_Sales | Number of sales in other parts of the world (in millions) | 
| Global_Sales | Number of total sales (in millions) | 
Source of dataset.
Don't know where to start?
Challenges are brief tasks designed to help you practice specific skills:
- 🗺️ Explore: Which of the three seventh-generation consoles (Xbox 360, PlayStation 3, and Nintendo Wii) had the highest total sales globally?
- 📊 Visualize: Create a plot visualizing the average sales for games in the three most popular genres. Differentiate between NA, EU, and global sales.
- 🔎 Analyze: Are some genres significantly more likely to perform better or worse in Japan than others? If so, which ones?
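
One possible first step for the Analyze challenge is to compare Japan's share of global sales across genres. The sketch below is only a descriptive starting point: the helper name `jp_share_by_genre` is hypothetical, and the small `toy` frame is made-up data used purely to show the call. It is not drawn from `vgsales.csv`. Judging whether differences are statistically significant would need a further test on the real data.

```python
import pandas as pd

def jp_share_by_genre(df: pd.DataFrame) -> pd.Series:
    """Return Japan's share of global sales per genre, highest first."""
    # Sum Japanese and global sales within each genre
    totals = df.groupby("Genre")[["JP_Sales", "Global_Sales"]].sum()
    # Share of each genre's global sales that came from Japan
    share = totals["JP_Sales"] / totals["Global_Sales"]
    return share.sort_values(ascending=False)

# Made-up illustrative rows (NOT real sales figures)
toy = pd.DataFrame({
    "Genre": ["Role-Playing", "Role-Playing", "Shooter", "Shooter"],
    "JP_Sales": [1.0, 2.0, 0.1, 0.1],
    "Global_Sales": [2.0, 4.0, 2.0, 2.0],
})
toy_share = jp_share_by_genre(toy)
print(toy_share)
```

With the real data loaded as `sales`, the same helper could be called as `jp_share_by_genre(sales)`. Genres at the top of the result sell relatively better in Japan.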
 
Scenarios are broader questions to help you develop an end-to-end project for your portfolio:
You are working as a data analyst for a video game retailer based in Japan. The retailer typically orders games based on sales in North America and Europe, as the games are often released later in Japan. However, they have found that North American and European sales are not always a perfect predictor of how a game will sell in Japan.
Your manager has asked you to develop a model that can predict the sales in Japan using sales in North America and Europe and other attributes such as the name of the game, the platform, the genre, and the publisher.
You will need to prepare a report that is accessible to a broad audience. It should outline your motivation, steps, findings, and conclusions.
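
As a possible starting point for the modelling step in the scenario above, here is a minimal baseline sketch: an ordinary least squares fit of `JP_Sales` on `NA_Sales`, `EU_Sales`, and one-hot encoded `Genre`, using NumPy's `lstsq`. The function name `fit_jp_baseline` and the synthetic `toy` frame are assumptions for illustration only. A real project would also hold out test data, consider platform and publisher, and compare against stronger models.

```python
import numpy as np
import pandas as pd

def fit_jp_baseline(df: pd.DataFrame) -> pd.Series:
    """Fit JP_Sales ~ Intercept + NA_Sales + EU_Sales + Genre dummies by least squares."""
    # One-hot encode Genre, dropping the first level to avoid collinearity with the intercept
    X = pd.get_dummies(df[["NA_Sales", "EU_Sales", "Genre"]],
                       columns=["Genre"], drop_first=True)
    X.insert(0, "Intercept", 1.0)
    X = X.astype(float)
    y = df["JP_Sales"].to_numpy(dtype=float)
    # Solve the least squares problem; returns one coefficient per column of X
    coefs, *_ = np.linalg.lstsq(X.to_numpy(), y, rcond=None)
    return pd.Series(coefs, index=X.columns)

# Synthetic rows (NOT real sales figures) built from a known linear rule,
# so the fit can be checked against the coefficients used to generate them
toy = pd.DataFrame({
    "Genre": ["Action"] * 3 + ["Role-Playing"] * 3,
    "NA_Sales": [1.0, 2.0, 3.0, 1.0, 2.0, 4.0],
    "EU_Sales": [0.5, 0.2, 1.0, 0.3, 0.9, 0.4],
})
toy["JP_Sales"] = (0.5 * toy["NA_Sales"] + 0.1 * toy["EU_Sales"]
                   + 1.0 * (toy["Genre"] == "Role-Playing"))

coefs = fit_jp_baseline(toy)
print(coefs.round(3))
```

Because the toy target is an exact linear function of the inputs, the fitted coefficients should match the generating rule. On real data the fit will be noisy, and the residuals are where the report's discussion would begin.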