
Who's Tweeting? Trump or Trudeau?

Build a machine learning classifier that knows whether President Trump or Prime Minister Trudeau is tweeting!

8 Tasks, 1,500 XP

Project Description

Let's apply our natural language processing knowledge to Twitter. Tweets are notoriously difficult to analyze: they are shorter than most texts and often contain hard-to-parse content such as hashtags, mentions, links, and emoji. Despite these difficulties, tweets are fun to work with, so in this notebook we'll classify tweets from two prominent North American politicians. Can we tell whether Donald Trump or Justin Trudeau wrote a given tweet? Let's see!
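
To make the "hard-to-parse content" concrete, here is a minimal sketch of one way to normalize hashtags, mentions, and links before vectorizing. The function name `normalize_tweet`, the placeholder tokens, and the example string are all hypothetical; the project itself may handle this step differently or leave it to the vectorizer.

```python
import re

# Hypothetical patterns and placeholder tokens, chosen for illustration only.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")


def normalize_tweet(text: str) -> str:
    """Replace links and mentions with placeholders and unwrap hashtags."""
    text = URL_RE.sub("<URL>", text)          # links first, so "#" or "@" inside a URL is consumed
    text = MENTION_RE.sub("<MENTION>", text)  # "@user" becomes a generic token
    text = HASHTAG_RE.sub(r"\1", text)        # "#topic" becomes "topic"
    return " ".join(text.split())             # collapse repeated whitespace


# Invented example string, not a real tweet.
print(normalize_tweet("Great meeting with @someone today! #Trade https://example.com/x"))
```

Emoji are left untouched here; whether to keep, drop, or map them to words is a modeling choice.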

Project Tasks

  1. Tweet classification: Trump vs. Trudeau
  2. Transforming our collected data
  3. Vectorize the tweets
  4. Training a multinomial naive Bayes model
  5. Evaluating our model using a confusion matrix
  6. Trying out another classifier: Linear SVC
  7. Introspecting our top model
  8. Bonus: can you write a Trump or Trudeau tweet?
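
The tasks above (vectorize, train a multinomial naive Bayes model, evaluate with a confusion matrix, then try Linear SVC) can be sketched with scikit-learn roughly as follows. This is an assumption about tooling, not the project's actual notebook: the texts are invented placeholders rather than real tweets, the labels `author_a` and `author_b` stand in for the two politicians, and the function name `run_sketch` and all parameter choices are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# Invented placeholder texts (NOT real tweets) with neutral stand-in labels.
TEXTS_A = [
    "big rally tonight with a huge crowd", "huge win for jobs this week",
    "big tax plan coming very soon", "huge crowd and big energy tonight",
    "jobs numbers are big this week", "big win coming soon for workers",
    "huge rally planned for next week", "tax plan is a big win for jobs",
]
TEXTS_B = [
    "together we build strong communities", "thank you friends for the warm welcome",
    "we welcome families and communities together", "strong communities grow together",
    "thank you to everyone who joined us", "families deserve a warm welcome",
    "we grow stronger when we work together", "thank you friends and families",
]


def run_sketch(seed: int = 0):
    texts = TEXTS_A + TEXTS_B
    labels = ["author_a"] * len(TEXTS_A) + ["author_b"] * len(TEXTS_B)

    # Hold out a stratified test set so both authors appear in it.
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, stratify=labels, random_state=seed
    )

    # Fit the vectorizer on training text only, then reuse it for the test text.
    vectorizer = TfidfVectorizer(stop_words="english")
    train_vecs = vectorizer.fit_transform(X_train)
    test_vecs = vectorizer.transform(X_test)

    results = {}
    for name, model in [("nb", MultinomialNB()), ("svc", LinearSVC())]:
        model.fit(train_vecs, y_train)
        predictions = model.predict(test_vecs)
        # Rows are true labels, columns are predicted labels.
        results[name] = confusion_matrix(
            y_test, predictions, labels=["author_a", "author_b"]
        )
    return results


if __name__ == "__main__":
    for name, matrix in run_sketch().items():
        print(name)
        print(matrix)
```

With so little toy data the matrices say nothing about real performance; the point is only the shape of the workflow, where both classifiers share one vectorizer and are compared on the same held-out set.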
Technologies
Python
Topics
Data Manipulation, Data Visualization, Probability & Statistics, Importing & Cleaning Data
Katharine Jarmul

Founder, kjamistan
Katharine Jarmul runs a data analysis company called kjamistan that specializes in helping companies analyze data and in training others on data analysis best practices, particularly with Python. She has been using Python for 8 years for a variety of data work, including telling stories at major national newspapers, building large-scale aggregation software, making decisions based on customer analytics and marketing spend, and advising new ventures on the competitive landscape.