Interactive Course

Introduction to Text Analysis in R

Analyze text data in R using the tidy framework.

  • 4 hours
  • 15 Videos
  • 46 Exercises
  • 3,967 Participants
  • 3,850 XP

Course Description

From social media posts to product reviews, text is an increasingly important type of data across applications, including marketing analytics. In many instances, text is replacing other forms of unstructured data because it is inexpensive and current. To take advantage of everything text has to offer, however, you need to know how to think about, clean, summarize, and model it. In this course, you will use tidy tools in R to get started with text quickly. You will learn how to wrangle and visualize text, perform sentiment analysis, and run and interpret topic models.

  1. Wrangling Text (Free)

    Since text is unstructured data, a certain amount of wrangling is required to get it into a form where you can analyze it. In this chapter, you will learn how to add structure to text by tokenizing, cleaning, and treating text as categorical data.

  2. Visualizing Text

    While counts are nice, visualizations are better. In this chapter, you will learn how to apply what you know from ggplot2 to tidy text data.

  3. Sentiment Analysis

    While word counts and visualizations suggest something about the content, we can do more. In this chapter, we move beyond word counts alone to analyze the sentiment or emotional valence of text.

  4. Topic Modeling

    In this final chapter, we move beyond word counts to uncover the underlying topics in a collection of documents. We will use a standard topic model known as latent Dirichlet allocation.
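
As a rough illustration of the wrangling and sentiment steps described in the chapters above, here is a minimal sketch in R. It assumes the dplyr and tidytext packages are installed; the `reviews` data and all variable names are hypothetical and are not taken from the course materials.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy data: two short product reviews (not from the course).
reviews <- tibble(
  id = c(1, 2),
  text = c("The vacuum is great and works well.",
           "Terrible battery, the vacuum broke fast.")
)

# Tokenize: one row per word, lowercased, with punctuation stripped.
tidy_reviews <- reviews %>%
  unnest_tokens(word, text)

# Clean: drop common stop words using tidytext's built-in stop_words table.
clean_reviews <- tidy_reviews %>%
  anti_join(stop_words, by = "word")

# Treat words as categorical data: count how often each one occurs.
word_counts <- clean_reviews %>%
  count(word, sort = TRUE)

# Sentiment: keep only words found in the Bing lexicon bundled with
# tidytext, then count positive and negative words per review.
sentiment_counts <- clean_reviews %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(id, sentiment)
```

Printing `word_counts` or `sentiment_counts` shows the resulting tidy tables, which can be passed to ggplot2 for the kind of visualizations covered in chapter 2. Topic modeling is left out of this sketch because it needs an additional modeling package.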

What do other learners have to say?

“I've used other sites, but DataCamp's been the one that I've stuck with.”

Devon Edwards Joseph

Lloyds Banking Group

“DataCamp is the top resource I recommend for learning data science.”

Louis Maiden

Harvard Business School

“DataCamp is by far my favorite website to learn from.”

Ronald Bowers

Decision Science Analytics @ USAA

Marc Dotson

Assistant Professor of Marketing, BYU Marriott School of Business

Marc's research is focused on applications of Bayesian inference in marketing, including choice modeling and text analysis. He teaches courses in survey research, conjoint analysis, marketing analytics, and statistical modeling.
