Going Down to South Park: A Text Analysis

Analyze the dialog and IMDb ratings of 287 South Park episodes. Warning: contains explicit language.

9 Tasks, 1,500 XP


Project Description

_Warning: the dataset in this project contains explicit language._ South Park is a satirical American TV show that is popular around the world. In this project, you will combine two datasets: the dialog from the first 21 seasons (287 episodes) and the IMDb ratings of those episodes. Using some basic text analysis techniques, you will answer questions such as: Are naughtier episodes more popular? Is Eric Cartman the naughtiest character in the show?

Project Tasks

  1. Import and explore data
  2. Sentiments, swear words, and stemming
  3. Summarize data by episode
  4. South Park overall sentiment
  5. South Park episode popularity
  6. Are naughty episodes more popular?
  7. Comparing profanity of two characters
  8. Is Eric Cartman the naughtiest character?
  9. Let's answer some questions
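
To give a feel for the workflow the tasks above describe, here is a minimal sketch in base R. It is not the project's actual solution: the `dialog` and `ratings` data frames, their column names, and the tiny `swear_words` list are all made-up stand-ins, and the sketch skips stemming and sentiment lexicons entirely. It only shows the general shape of the analysis: split lines into words, flag swear words, summarize by episode, join with ratings, and check the correlation.

```r
# Toy stand-ins for the two datasets (hypothetical values, not real South Park data).
dialog <- data.frame(
  episode   = c(1, 1, 2, 2, 3),
  character = c("Cartman", "Kyle", "Cartman", "Stan", "Kyle"),
  text      = c("Darn it, you guys!", "That is not fair.", "Darn darn heck!",
                "Oh my gosh.", "Heck no."),
  stringsAsFactors = FALSE
)
ratings <- data.frame(episode = c(1, 2, 3), rating = c(8.1, 8.6, 7.9))

# Hypothetical stand-in for a swear-word lexicon.
swear_words <- c("darn", "heck")

# Lowercase a line, drop everything except letters and spaces, split into words.
tokenize <- function(x) {
  cleaned <- gsub("[^a-z ]", "", tolower(x))
  words <- unlist(strsplit(cleaned, " +"))
  words[words != ""]
}

# Build a long table with one row per spoken word.
tokens <- do.call(rbind, lapply(seq_len(nrow(dialog)), function(i) {
  data.frame(
    episode   = dialog$episode[i],
    character = dialog$character[i],
    word      = tokenize(dialog$text[i]),
    stringsAsFactors = FALSE
  )
}))
tokens$is_swear <- tokens$word %in% swear_words

# Share of swear words per episode.
by_episode <- aggregate(is_swear ~ episode, data = tokens, FUN = mean)
names(by_episode)[2] <- "swear_ratio"

# Share of swear words per character.
by_character <- aggregate(is_swear ~ character, data = tokens, FUN = mean)
names(by_character)[2] <- "swear_ratio"

# Combine with the ratings and measure how swearing relates to popularity.
combined <- merge(by_episode, ratings, by = "episode")
naughty_cor <- cor(combined$swear_ratio, combined$rating)

print(combined)
print(by_character)
print(naughty_cor)
```

With real data, three episodes would be far too few for the correlation to mean anything; the toy values are only there so the sketch runs on its own.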
Technologies
R
Topics
Data Manipulation, Data Visualization, Probability & Statistics, Importing & Cleaning Data
Patrik Drhlík

Freelance Data Scientist
Patrik is a freelance data scientist who helps small local companies with data-related problems. He is also pursuing a Ph.D., specializing in missing data. He never leaves home without his Rubik's cube and loves hitchhiking, athletics, mountains, kangaroos, and beer (he is Czech).
