Netflix Movie Data

This dataset contains more than 8,500 Netflix movies and TV shows, including cast members, duration, and genre. It contains titles added as recently as late September 2021.

suppressPackageStartupMessages(library(tidyverse))

read_csv('data/netflix_dataset.csv', show_col_types = FALSE)

Data Dictionary

| variable     | class     | description                                                  |
|--------------|-----------|--------------------------------------------------------------|
| type         | character | Either 'TV Show' or 'Movie'                                  |
| title        | character | The title of the movie or TV show                            |
| director     | character | The director of the movie or TV show                         |
| cast         | character | The actors playing in the movie or TV show                   |
| country      | character | The country in which the movie or TV show was directed       |
| date_added   | character | The date on which the movie or TV show was added to Netflix  |
| release_year | character | The year the movie or TV show was released                   |
| rating       | character | The kid-friendly rating the movie or TV show received        |
| duration     | character | The length of the movie or TV show                           |
| listed_in    | character | The genre of the movie or TV show                            |
| description  | character | The description/short summary of the movie or TV show        |

Source of dataset.

Don't know where to start?

Challenges are brief tasks designed to help you practice specific skills:

  • 🗺️ Explore: How much variety exists in Netflix's offering? Base this on three variables: type, country, and listed_in.
  • 📊 Visualize: Build a word cloud from the movie and TV show descriptions. Make sure to remove stop words!
  • 🔎 Analyze: Has Netflix invested more in certain genres (see listed_in) in recent years? What about certain age groups (see rating)?
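As a starting point for the Explore challenge, here is a minimal sketch that counts the distinct values in type, country, and listed_in. It assumes that country and listed_in can hold several comma-separated values per title, which you should check against the real file. The toy rows and the helper name count_values are made up for illustration.

```r
suppressPackageStartupMessages(library(tidyverse))

# Toy rows standing in for the real data (hypothetical values). Replace with:
# netflix <- read_csv('data/netflix_dataset.csv', show_col_types = FALSE)
netflix <- tibble(
  type      = c("Movie", "TV Show", "Movie"),
  country   = c("United States, India", "Japan", NA),
  listed_in = c("Dramas, Comedies", "Anime Series", "Dramas")
)

# Hypothetical helper: split a comma-separated column into one row per value,
# drop missing values, and count how often each value occurs.
count_values <- function(data, column) {
  data %>%
    separate_rows({{ column }}, sep = ",\\s*") %>%
    filter(!is.na({{ column }})) %>%
    count({{ column }}, sort = TRUE)
}

type_counts    <- count(netflix, type, sort = TRUE)
country_counts <- count_values(netflix, country)
genre_counts   <- count_values(netflix, listed_in)
```

The number of rows in each result gives a rough measure of variety, and the n column shows how concentrated the catalog is in a few values.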

Scenarios are broader questions to help you develop an end-to-end project for your portfolio:

A talent agency has hired you to analyze patterns in the professional relationships of cast members and directors. The key deliverable is a network graph where each node represents a cast member or director. An edge represents a movie or TV show worked on by both nodes in this undirected graph. You can limit the actors to the first four names listed in cast. The client is interested in any insights you can derive from your Netflix network analysis, such as actor/actor and actor/director pairs that work most closely together, most popular actors and directors to work with, and graph differences over time.
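One possible way to prepare the graph described above is to build an edge list first. The sketch below takes the first four cast names plus the director(s) of each title, forms every pair, and counts how many titles each pair shares. It assumes that names in cast and director are comma-separated. The toy rows and the helper name split_names are hypothetical.

```r
suppressPackageStartupMessages(library(tidyverse))

# Toy rows standing in for the real data (hypothetical names and titles).
netflix <- tibble(
  title    = c("Show A", "Film B"),
  director = c("Dana Dir", NA),
  cast     = c("Ann Actor, Bob Actor, Cy Actor, Di Actor, Ed Actor",
               "Ann Actor, Bob Actor")
)

# Hypothetical helper: split a comma-separated string into trimmed names,
# keeping at most the first n. Missing values give an empty vector.
split_names <- function(x, n = Inf) {
  if (is.na(x)) return(character(0))
  parts <- str_trim(str_split(x, ",")[[1]])
  parts[seq_len(min(n, length(parts)))]
}

edges <- netflix %>%
  # Nodes for a title: first four cast members plus the director(s).
  mutate(people = map2(cast, director,
                       ~ unique(c(split_names(.x, 4), split_names(.y))))) %>%
  filter(lengths(people) >= 2) %>%
  # Sort names before pairing so each undirected edge has one canonical form.
  mutate(pairs = map(people, ~ {
    p <- combn(sort(.x), 2)
    tibble(from = p[1, ], to = p[2, ])
  })) %>%
  select(title, pairs) %>%
  unnest(pairs)

# One row per pair, weighted by the number of titles the pair shares.
edge_weights <- count(edges, from, to, name = "shared_titles", sort = TRUE)
```

A table like edge_weights could then be handed to a graph package, for example igraph's graph_from_data_frame() with directed = FALSE, to compute centrality or draw the network. Filtering netflix by date_added or release_year before building the edges is one way to compare graphs over time.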

You will need to prepare a report that is accessible to a broad audience. It will need to outline your motivation, analysis steps, findings, and conclusions.