Netflix! What started in 1997 as a DVD rental service has since exploded into one of the largest entertainment and media companies.

Given the large number of movies and series available on the platform, it is a perfect opportunity to flex my exploratory data analysis skills and dive into the entertainment industry. My friend has also been brushing up on his Python skills and has taken a first crack at a CSV file containing Netflix data. He believes that the average duration of movies has been declining. Using my friend's initial research, I'll dive into the Netflix data to see whether movie lengths are actually getting shorter and, if they are, explain some of the contributing factors.

I have been supplied with the dataset netflix_data.csv, along with the following table detailing the column names and descriptions:

The data

netflix_data.csv

Column          Description
show_id         The ID of the show
type            Type of show
title           Title of the show
director        Director of the show
cast            Cast of the show
country         Country of origin
date_added      Date added to Netflix
release_year    Year of Netflix release
duration        Duration of the show in minutes
description     Description of the show
genre           Show genre

# Importing necessary libraries
import pandas as pd
import matplotlib.pyplot as plt

# Loading the dataset
netflix_df = pd.read_csv('netflix_data.csv')

# Filtering the dataset for movies only
netflix_movies = netflix_df[netflix_df['type'] == 'Movie']

# Selecting relevant columns
netflix_movies = netflix_movies[['title', 'country', 'genre', 'release_year', 'duration']]

# Filtering for short movies (duration < 60 minutes)
short_movies = netflix_movies[netflix_movies['duration'] < 60]

# Displaying the first 20 short movies
print(short_movies.head(20))

# Assigning colors based on genre
# (keys must match the genre labels in the data; the report lists this one as 'Stand-Up')
genre_colors = {
    'Children': 'blue',
    'Documentaries': 'green',
    'Stand-Up': 'orange'
}

# Building one color per movie from its genre; unlisted genres fall back to gray
colors = netflix_movies['genre'].map(genre_colors).fillna('gray').tolist()

# Plotting the data
plt.figure(figsize=(12, 8))
plt.scatter(netflix_movies['release_year'], netflix_movies['duration'], c=colors)
plt.xlabel("Release Year")
plt.ylabel("Duration (min)")
plt.title("Movie Duration by Year of Release")
plt.show()

# Are we certain that movies are getting shorter?
answer = 'no'

Netflix Movie Duration Analysis Report

Introduction

This report analyzes trends in Netflix movie durations over time, with a particular focus on identifying whether movies are getting shorter. The analysis was conducted using Python with pandas for data manipulation and matplotlib for visualization.

Methodology

  1. Data Preparation:

    • Loaded Netflix data from 'netflix_data.csv'
    • Filtered for only movie content (excluding TV shows)
    • Selected relevant columns: title, country, genre, release_year, and duration
  2. Special Focus:

    • Identified short movies (duration < 60 minutes) for closer examination
    • Created a visualization of movie durations by release year
  3. Visual Encoding:

    • Applied color coding by genre:
      • Children: Blue
      • Documentaries: Green
      • Stand-Up: Orange
      • All others: Gray
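
The color coding above is easier to read with a legend, which the original plot does not include. The following is a minimal sketch of how legend entries could be built from the color mapping. The helper name genre_legend_handles is hypothetical, and the genre label is assumed to be 'Stand-Up' as written in the list above.

```python
from matplotlib.patches import Patch

# Same mapping as in the report; 'Stand-Up' spelling is an assumption
genre_colors = {
    "Children": "blue",
    "Documentaries": "green",
    "Stand-Up": "orange",
}

def genre_legend_handles(colors, default_label="Other", default_color="gray"):
    """Build one legend patch per highlighted genre, plus a fallback entry."""
    handles = [Patch(color=color, label=genre) for genre, color in colors.items()]
    handles.append(Patch(color=default_color, label=default_label))
    return handles

handles = genre_legend_handles(genre_colors)

# With the scatter plot already drawn, the legend could be attached with:
# plt.legend(handles=handles, title="Genre")
```

Building the handles by hand is needed here because the scatter plot receives a list of colors rather than one labeled series per genre.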

Key Findings

Short Movies Analysis

The first 20 short movies (duration < 60 minutes) include:

  • Holiday specials (e.g., "A Go! Go! Cory Carson Christmas")
  • Documentaries (e.g., "3 Seconds Divorce")
  • Children's content (e.g., "100 Things to do Before High School")
  • Various Christmas specials and short films

Duration Trends Visualization

The scatter plot "Movie Duration by Year of Release" shows:

  • A wide distribution of movie durations across all years
  • No clear downward trend in movie durations over time
  • Short movies (under 60 minutes) appear consistently throughout the timeline
  • Most movies cluster between 60 and 150 minutes regardless of release year
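
A scatter plot shows the spread, but the original question is about the average duration, which is hard to judge by eye. As a rough sketch, the per-year count, mean, and median could be computed as follows. The function name yearly_duration_summary is hypothetical, and the sample rows are made up for illustration; they are not taken from netflix_data.csv.

```python
import pandas as pd

def yearly_duration_summary(movies: pd.DataFrame) -> pd.DataFrame:
    """Return the count, mean, and median duration per release year."""
    return (
        movies.groupby("release_year")["duration"]
        .agg(["count", "mean", "median"])
        .sort_index()
    )

# Tiny made-up sample, NOT rows from netflix_data.csv
sample = pd.DataFrame({
    "release_year": [2010, 2010, 2015, 2015, 2020],
    "duration": [100, 120, 90, 110, 45],
})

summary = yearly_duration_summary(sample)
print(summary)
```

On the real data this would be called as yearly_duration_summary(netflix_movies). The count column matters because years with only a handful of titles produce noisy averages.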

Genre Patterns

The color-coded visualization reveals:

  • Children's content (blue) tends to have shorter durations
  • Documentaries (green) show a mix of short and medium lengths
  • Stand-Up specials (orange) typically fall in the medium duration range
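
These genre observations come from reading the colors in the plot. One way to back them with numbers is a per-genre summary, sketched below. The function name duration_by_genre and the sample rows are hypothetical, and the three highlighted genre labels are assumed to match the values in the genre column.

```python
import pandas as pd

HIGHLIGHTED = ("Children", "Documentaries", "Stand-Up")

def duration_by_genre(movies: pd.DataFrame, genres=HIGHLIGHTED) -> pd.DataFrame:
    """Summarize duration for the highlighted genres, pooling the rest as 'Other'."""
    # Keep the genre label where it is highlighted, otherwise replace it with 'Other'
    group = movies["genre"].where(movies["genre"].isin(genres), "Other")
    return movies.groupby(group)["duration"].agg(["count", "median", "min", "max"])

# Tiny made-up sample, NOT rows from netflix_data.csv
sample = pd.DataFrame({
    "genre": ["Children", "Children", "Documentaries", "Stand-Up", "Dramas", "Action"],
    "duration": [40, 60, 90, 65, 120, 110],
})

by_genre = duration_by_genre(sample)
print(by_genre)
```

The median is used instead of the mean so that a few very long or very short titles do not dominate the comparison.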

Conclusion

Based on the analysis:

  • The scatter plot does not show movies demonstrably getting shorter over time
  • Short movies (under 60 minutes) have consistently existed alongside standard-length features
  • Genre appears to be more closely associated with duration than release year, although this was judged visually and not tested statistically

The answer to "Are we certain that movies are getting shorter?" is: No

Recommendations

  1. Further analysis could examine duration trends within specific genres
  2. Investigating the relationship between country of origin and duration might reveal additional patterns
  3. A time-series analysis with moving averages could provide more nuanced insights about duration trends
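
The moving-average idea in the third recommendation could look roughly like the sketch below. The function name rolling_mean_duration, the window of 3, and the sample rows are all assumptions made for illustration. Note that the window counts rows (years present in the data), so a gap in release years is not treated specially.

```python
import pandas as pd

def rolling_mean_duration(movies: pd.DataFrame, window: int = 3) -> pd.Series:
    """Smooth the yearly mean duration with a trailing moving average."""
    yearly = movies.groupby("release_year")["duration"].mean().sort_index()
    # min_periods=1 keeps the earliest years instead of returning NaN for them
    return yearly.rolling(window=window, min_periods=1).mean()

# Tiny made-up sample, NOT rows from netflix_data.csv
sample = pd.DataFrame({
    "release_year": [2016, 2017, 2018, 2019, 2020],
    "duration": [100, 110, 90, 95, 85],
})

smoothed = rolling_mean_duration(sample)
print(smoothed)
```

A steadily falling smoothed series on the real data would support the friend's hypothesis, while a flat or noisy one would be consistent with the "No" answer above.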