Netflix! What started in 1997 as a DVD rental service has since exploded into one of the largest entertainment and media companies.
Given the large number of movies and series available on the platform, it is a perfect opportunity to flex your exploratory data analysis skills and dive into the entertainment industry.
You work for a production company that specializes in nostalgic styles. You want to do some research on movies released in the 1990s. You'll delve into Netflix data and perform exploratory data analysis to better understand this awesome movie decade!
You have been supplied with the dataset netflix_data.csv, along with the following table detailing the column names and descriptions. Feel free to experiment further after submitting!
The data
netflix_data.csv
Column | Description |
---|---|
show_id | The ID of the show |
type | Type of show |
title | Title of the show |
director | Director of the show |
cast | Cast of the show |
country | Country of origin |
date_added | Date added to Netflix |
release_year | Year of Netflix release |
duration | Duration of the show in minutes |
description | Description of the show |
genre | Show genre |
# Importing pandas, matplotlib, and seaborn
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Set seaborn style for better look
sns.set(style='whitegrid')
# Read in the Netflix CSV as a DataFrame
netflix_df = pd.read_csv("netflix_data.csv")
Load the data, check basic info, and identify any null values or issues for cleaning.
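A minimal sketch of that inspection step follows. It runs on a small hypothetical stand-in frame (`sample_df`) so it works without the CSV; the helper name `summarize_quality` is also an assumption made for illustration. The same calls can be pointed at `netflix_df`.

```python
import pandas as pd

# Hypothetical stand-in for netflix_df so the sketch runs without the CSV.
sample_df = pd.DataFrame({
    "title": ["A", "B", "C", "A"],
    "type": ["Movie", "TV Show", "Movie", "Movie"],
    "director": ["X", None, "Y", "X"],
    "release_year": [1994, 1998, 2005, 1994],
})

def summarize_quality(df):
    """Return per-column null counts and the number of fully duplicated rows."""
    null_counts = df.isna().sum()
    duplicate_rows = int(df.duplicated().sum())
    return null_counts, duplicate_rows

null_counts, duplicate_rows = summarize_quality(sample_df)

sample_df.info()                      # dtypes and non-null counts per column
print(null_counts)                    # columns that may need cleaning
print("Duplicate rows:", duplicate_rows)
```

Columns with many nulls (often free-text fields such as the director) are worth noting before any grouping, since `value_counts()` silently skips missing values by default.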
Identifying the unique content types
# Find unique values in the 'type' column
unique_types = netflix_df['type'].unique()
print("\nUnique Types in the Dataset:")
print(unique_types)
# Count the occurrences of each type
type_counts = netflix_df['type'].value_counts()
print("\nCounts of Each Type:")
print(type_counts)
Number of Movies Released Each Year in the 1990s
Visualize the count of movies released each year to see if there were any trends or spikes in production.
# Keep titles released between 1990 and 1999 (inclusive)
movies_90s = netflix_df[(netflix_df['release_year'] >= 1990) & (netflix_df['release_year'] <= 1999)]
print(movies_90s.head())
# Optional: Plot the number of movies released each year in the 1990s
movies_90s_by_year = movies_90s['release_year'].value_counts().sort_index()
# Plotting
plt.figure(figsize=(14, 8))
sns.lineplot(x=movies_90s_by_year.index, y=movies_90s_by_year.values, marker='o')
plt.title('Yearly Movie Production Trends (1990-1999)')
plt.xlabel('Year')
plt.ylabel('Number of Movies Produced')
plt.xticks(movies_90s_by_year.index)
plt.show()
print("\nYearly Movie Production Analysis:")
print(movies_90s_by_year)
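The decade filter above selects on `release_year` only, so any other content type present in the data is counted too. If only films are wanted, one possible variant is sketched below. It assumes the `type` column uses the label `"Movie"` (check the unique values printed earlier), and both `sample_df` and `movies_per_year_90s` are hypothetical names used for illustration.

```python
import pandas as pd

# Hypothetical stand-in for netflix_df; the "Movie" label is an assumption.
sample_df = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "type": ["Movie", "TV Show", "Movie", "Movie", "Movie"],
    "release_year": [1990, 1995, 1995, 1999, 2003],
})

def movies_per_year_90s(df):
    """Count rows of type 'Movie' per release year for 1990-1999."""
    is_movie = df["type"] == "Movie"
    in_90s = df["release_year"].between(1990, 1999)  # inclusive on both ends
    counts = df.loc[is_movie & in_90s, "release_year"].value_counts()
    # Reindex so years with no titles show up as 0 instead of disappearing.
    return counts.reindex(range(1990, 2000), fill_value=0)

yearly = movies_per_year_90s(sample_df)
print(yearly)
```

Reindexing over the full decade keeps the x-axis of a line plot continuous even when a year has no matching titles.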
The yearly counts for 1990 to 1999 show a steady pattern through the early and middle of the decade, averaging 15–16 films per year. That rate held until 1997, when the count rose to 26 per year and stayed there through 1999. Because the dataset describes titles available on Netflix, not films Netflix produced (the company only started as a DVD rental service in 1997), the late-decade rise is best read as the catalog holding more titles released in 1997–1999, not as a change in production strategy during the 1990s.
Genre Analysis of 1990s Movies
The genre counts for the 1990s titles are led by Action, Drama, and Comedy. Action leads the decade with 48 releases, followed by Dramas at 44 and Comedies at 40. Together these three genres account for the majority of the titles, which suggests that the 1990s portion of the catalog leans toward broadly popular genres.
# Count the occurrences of each genre
genre_counts = movies_90s['genre'].value_counts()
print(genre_counts)
# Plot genre counts
plt.figure(figsize=(12, 8))
genre_counts.plot(kind = 'bar', color='lightgreen')
plt.title("Popular Genres in 1990s Movies")
plt.xlabel("Genre")
plt.ylabel("Number of Movies")
plt.show()
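To check how much of the decade the leading genres cover, the share of the top few genres can be computed directly from the counts. The sketch below uses a small made-up `genres` series and a hypothetical helper `top_n_share`; on the real data, `movies_90s['genre']` would be passed in instead.

```python
import pandas as pd

# Made-up genre labels for illustration only; not the real dataset.
genres = pd.Series(
    ["Action", "Action", "Action", "Dramas", "Dramas",
     "Comedies", "Comedies", "Horror Movies", "Documentaries", "Cult Movies"],
    name="genre",
)

def top_n_share(values, n=3):
    """Fraction of all non-null entries that fall in the n most frequent categories."""
    counts = values.value_counts()  # sorted from most to least frequent
    return counts.head(n).sum() / counts.sum()

share = top_n_share(genres)
print(f"Top 3 genres cover {share:.0%} of titles")
```

A share above one half supports describing the top three genres as the majority; anything lower would call for softer wording.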
Leading Genres: The high frequency of Action, Drama, and Comedy aligns with broader cinema trends and with a catalog weighted toward content with mass appeal.
Niche Categories: Genres such as Horror Movies, Documentaries, and Cult Movies appear infrequently, pointing to their niche status or limited representation among titles from this era.
Diversity in Content: Categories like Stand-Up and Docuseries appear in small numbers, showing that the 1990s selection is not limited to mainstream feature films.
Conclusion
This genre analysis offers a snapshot of the 1990s titles on Netflix: high-demand genres dominate, with a smaller set of niche categories alongside them. Tracking these patterns can help balance popular demand with niche interests when building an engaging, well-rounded library.
Top Countries for Movies
An analysis of movie production by country reveals that the United States led significantly in the 1990s with 100 films, followed by India with 34 films. Other notable contributors included the United Kingdom, Hong Kong, and France, although their production volumes were much lower by comparison.
# Exploring Country Trends
plt.figure(figsize=(12, 6))
country_counts = movies_90s['country'].value_counts()
country_counts[:10].plot(kind='bar')
plt.title('Top Countries for Movies in the 1990s')
plt.xlabel('Country')
plt.ylabel('Number of Movies')
plt.xticks(rotation=45)
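plt.grid(True)  # explicit True: a bare plt.grid() toggles off the grid that the whitegrid style already enabled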
plt.show()
print("\nTop Countries Analysis:")
print(country_counts[:10])
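One caveat on the counts above: `value_counts()` drops missing values by default, so titles without a recorded country do not appear in the ranking at all. A small sketch of how to surface them follows; the `countries` series, the helper `country_counts_with_missing`, and the `"Unknown"` label are all made up for illustration.

```python
import pandas as pd

# Made-up country values for illustration only; None marks a missing country.
countries = pd.Series(
    ["United States", "India", "United States", None, "United Kingdom", None],
    name="country",
)

def country_counts_with_missing(values, label="Unknown"):
    """Count countries, grouping missing entries under an explicit label."""
    return values.fillna(label).value_counts()

counts = country_counts_with_missing(countries)
print(counts)
```

If the "Unknown" bucket turns out to be large, the gap between the leading countries and the rest should be read with more caution.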
Dominance of the United States: The high volume of movies produced in the U.S. underscores its dominant role in the global film industry, with an output significantly higher than other countries.
Significant Contributors: India’s notable output, followed by the United Kingdom and Hong Kong, highlights a diverse international presence in the Netflix catalog, particularly for countries with robust film industries.
Emerging Markets: Countries like Mexico, Germany, and Japan contributed modestly, indicating growing but limited international representation in the 90s Netflix collection.
Conclusion
This analysis underscores the influence of American cinema on the 90s entertainment landscape, alongside a diverse range of international contributors. The spread of countries among these titles offers some insight into regional production trends and potential markets for future growth.