
Introduction

This project identifies the game genres, categories, and tags best suited to an upcoming game. The analysis informs the strategic direction of the game's development, including its market positioning, pricing strategy, and release timeline.

The primary objective is to use data-driven insights to inform key decisions in game development. We look for the genres and categories that resonate most with target audiences, so the game is well received and engaging at release. The findings also guide pricing decisions and the use of downloadable content (DLC) to improve revenue and user experience.

Key Questions to Address:

  1. Genre Selection: Which genre(s) align best with the intended gameplay experience and target demographic?
  2. Category Determination: What categories should the game encompass to cater to diverse player preferences and expectations?
  3. Release Timing: When is the optimal timeframe for launching the game to capitalize on market trends and maximize visibility?
  4. Monetization Strategy: Should the game be released as a free-to-play model, and if so, how can DLCs be strategically incorporated to drive revenue without compromising user satisfaction? Alternatively, what pricing model should be adopted if the game is not offered for free, and how can DLCs be integrated post-launch to sustain player engagement?
  5. Publisher Consideration: Is it advisable to seek collaboration with a publisher prior to release to leverage their expertise and resources for enhanced market penetration and promotional efforts?
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud


# Load the dataset; read_csv already returns a DataFrame
df = pd.read_csv('games.csv')

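# Each game lists several comma-separated genres, categories, and tags;
# split and explode them so every label is counted on its own row
genres_series = df['Genres'].dropna().str.split(',').explode()
categories_series = df['Categories'].dropna().str.split(',').explode()
tags_series = df['Tags'].dropna().str.split(',').explode()

# Top 5 most frequent labels of each kind
popular_genres = genres_series.value_counts().head(5)
popular_categories = categories_series.value_counts().head(5)
popular_tags = tags_series.value_counts().head(5)

# Simple mean of the lifetime and two-week average playtime columns
df['Average_Playtime'] = (df['Average playtime forever'] + df['Average playtime two weeks']) / 2
# Any price above zero counts as a paid game
df['Game_Type'] = df['Price'].apply(lambda x: 'Paid Game' if x > 0 else 'Free Game')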

# Parse release dates; unparseable values become NaT
df['Release date'] = pd.to_datetime(df['Release date'], errors='coerce')
df['Year'] = df['Release date'].dt.year
df['Release Month'] = df['Release date'].dt.month
# Count releases per (month, year), then average those counts over the years for each month
monthly_average_releases = (
    df.groupby(['Release Month', 'Year']).size()
      .groupby(level='Release Month').mean()
      .sort_index()
)
month_names = {1: 'January', 2: 'February', 3: 'March', 4: 'April', 5: 'May', 6: 'June',
               7: 'July', 8: 'August', 9: 'September', 10: 'October', 11: 'November', 12: 'December'}
# 'Estimated owners' is a range such as '0 - 20000'; keep the upper bound as an integer
df['Owners'] = df['Estimated owners'].apply(lambda x: int(x.split('-')[1].replace(',', '')))

Guide for Analysis

  • Select the top 5 tags, genres, and categories.
  • Determine the average length of the games from the top 5 selections.
  • Determine the quarter or month in which the game should be released (we want to avoid months with a high number of game releases).
  • Determine the game type (free or paid) and how to generate revenue when the game releases.
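
The release-timing step above could be sketched roughly as follows. This is a minimal illustration on toy data rather than on games.csv; it assumes the `Year` and `Release Month` columns are derived as in the setup code, and the helper name `quietest_months` is hypothetical.

```python
import pandas as pd

def quietest_months(df: pd.DataFrame, n: int = 3) -> pd.Series:
    """Return the n months with the lowest average number of releases per year."""
    # Releases per (month, year), then the mean of those counts for each month
    per_month_year = df.groupby(['Release Month', 'Year']).size()
    monthly_avg = per_month_year.groupby(level='Release Month').mean()
    return monthly_avg.nsmallest(n)

# Toy release dates for illustration only (not taken from games.csv)
toy = pd.DataFrame({
    'Release date': pd.to_datetime([
        '2021-01-10', '2021-01-20', '2022-01-05',
        '2021-03-02', '2022-03-15',
        '2021-11-01', '2021-11-08', '2021-11-20', '2022-11-11',
    ])
})
toy['Year'] = toy['Release date'].dt.year
toy['Release Month'] = toy['Release date'].dt.month

quiet = quietest_months(toy, n=1)
print(quiet)
```

On the real data, the same helper could be applied to `df` and the resulting month numbers mapped to names with `month_names`.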


Summary

The following tables give an overview of the two primary game types: Paid and Free. They break down the total number of games of each type, distinguishing between games with and without downloadable content (DLC), and between games published independently and those released through a publisher. They also show the average peak concurrent users and the estimated ownership for each game type.
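
One way such a summary could be assembled is sketched below. The code that produced the original tables is not shown, so this is only an illustration on toy rows; the column names `DLC count`, `Publishers`, and `Peak CCU` are assumptions about the dataset, and `Has_DLC` and `Has_Publisher` are hypothetical helper columns.

```python
import pandas as pd

# Toy rows for illustration; 'DLC count', 'Publishers' and 'Peak CCU'
# are assumed column names, not confirmed by the notebook
toy = pd.DataFrame({
    'Price': [0.0, 0.0, 9.99, 19.99],
    'DLC count': [0, 2, 1, 0],
    'Publishers': [None, 'Acme', 'Acme', None],
    'Peak CCU': [100, 300, 50, 150],
    'Owners': [20000, 50000, 20000, 100000],
})

# Same paid/free rule as in the setup code
toy['Game_Type'] = toy['Price'].apply(lambda x: 'Paid Game' if x > 0 else 'Free Game')
toy['Has_DLC'] = toy['DLC count'] > 0            # hypothetical helper column
toy['Has_Publisher'] = toy['Publishers'].notna()  # hypothetical helper column

# One row per game type with counts and averages
summary = toy.groupby('Game_Type').agg(
    total_games=('Price', 'size'),
    games_with_dlc=('Has_DLC', 'sum'),
    games_with_publisher=('Has_Publisher', 'sum'),
    avg_peak_ccu=('Peak CCU', 'mean'),
    avg_owners=('Owners', 'mean'),
)
print(summary)
```

Treating a missing publisher value as "published independently" is itself an assumption; the real data may instead repeat the developer name in the publisher field.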


Game Length

Determining how long the game should take players to finish.
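
A rough sketch of how playtime could be compared across the most common genres is shown below. It runs on toy rows, uses the top 2 genres instead of the top 5 to keep the example small, and treats average playtime as a proxy for game length, which is an assumption (playtime also reflects replay value and idle time).

```python
import pandas as pd

# Toy rows for illustration only (not taken from games.csv)
toy = pd.DataFrame({
    'Genres': ['Action,Indie', 'Indie', 'Action,RPG', None],
    'Average playtime forever': [120, 60, 300, 45],
})

# One row per (game, genre) pair so a game counts toward each of its genres
exploded = (
    toy.dropna(subset=['Genres'])
       .assign(Genre=lambda d: d['Genres'].str.split(','))
       .explode('Genre')
)

# Most frequent genres, then playtime statistics restricted to them
top_genres = exploded['Genre'].value_counts().head(2).index
playtime_by_genre = (
    exploded[exploded['Genre'].isin(top_genres)]
    .groupby('Genre')['Average playtime forever']
    .agg(['mean', 'median'])
)
print(playtime_by_genre)
```

Reporting the median alongside the mean helps because a few games with very high playtime can pull the mean upward.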
