Categorical Data in the Tidyverse

Skill level: Basic
Rating: 4.7+ (154 reviews)
Updated 01/2026
Get ready to categorize! In this course, you will work with non-numerical data, such as job titles or survey responses, using the Tidyverse landscape.
R | Data Manipulation | 4 hr | 13 videos | 44 exercises | 3,600 XP | 16,455 | Statement of Accomplishment

Course Description

As a data scientist, you will often find yourself working with non-numerical data, such as job titles, survey responses, or demographic information. R represents this kind of data with a special type called a factor, and this course will help you master factors using the tidyverse package forcats. We’ll also work with other tidyverse packages, including ggplot2, dplyr, stringr, and tidyr, and use real-world datasets, such as the FiveThirtyEight flight dataset and Kaggle’s State of Data Science and ML Survey. After this course, you’ll be able to identify and manipulate factor variables, visualize your data quickly and efficiently, and communicate your results effectively. Get ready to categorize!

Prerequisites

Reshaping Data with tidyr

1. Introduction to Factor Variables

In this chapter, you’ll learn all about factors. You’ll discover the difference between nominal (unordered) and ordinal (ordered) categorical variables, how R represents them, and how to inspect them to find the number and names of their levels. Finally, you’ll see how forcats, a tidyverse package, can improve your plots by letting you quickly reorder factor levels by their frequency.
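
As a rough illustration of the ideas in this chapter, the sketch below builds a nominal and an ordinal factor, inspects their levels, and reorders levels by frequency with `fct_infreq()`. The job titles and satisfaction ratings are made-up values for illustration, not data from the course.

```r
library(forcats)

# Hypothetical job titles; not taken from the course datasets.
responses <- c("Manager", "Engineer", "Manager", "Analyst", "Manager", "Engineer")

# A nominal factor: levels default to alphabetical order.
job <- factor(responses)
nlevels(job)  # number of levels
levels(job)   # names of the levels

# An ordinal factor: the order of the levels carries meaning.
satisfaction <- factor(
  c("Low", "High", "Medium", "Low"),
  levels = c("Low", "Medium", "High"),
  ordered = TRUE
)
is.ordered(satisfaction)

# Reorder levels by how often each one occurs, most frequent first.
job_by_freq <- fct_infreq(job)
levels(job_by_freq)
```

Because ggplot2 draws categories in level order, a factor reordered this way would typically produce a bar chart sorted from most to least frequent.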

2. Manipulating Factor Variables

3. Creating Factor Variables

Having gotten a good grasp of forcats, you’ll expand out to the rest of the tidyverse, learning and reviewing functions from dplyr, tidyr, and stringr. You’ll refine graphs with ggplot2 by changing axes to percentage scales, editing the layout of the text, and more.
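
One way to put a percentage scale on an axis, sketched under the assumption that the dplyr, ggplot2, and scales packages are installed. The `survey` data frame and its `answer` column are hypothetical and only stand in for real survey data.

```r
library(dplyr)
library(ggplot2)

# Hypothetical responses for illustration only.
survey <- data.frame(
  answer = c("Yes", "No", "Yes", "Yes", "No", "Unsure", "Yes", "No")
)

# Compute each answer's share of all responses.
shares <- survey %>%
  count(answer) %>%
  mutate(share = n / sum(n))

# Plot the shares and label the y-axis as percentages.
p <- ggplot(shares, aes(x = answer, y = share)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(x = NULL, y = "Share of responses")
```

Printing `p` renders the chart; the proportions stay as numbers in the data, and only the axis labels are formatted as percentages.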

4. Case Study on Flight Etiquette

FAQs

What is the forcats package and why does this course focus on it?

The forcats package is part of the tidyverse and is designed for working with factor variables in R. This course teaches you to use it to reorder, lump, and recode categorical data efficiently.
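
A minimal sketch of those three operations, assuming forcats 0.5.0 or later (for `fct_lump_n()`). The airline names are made-up values for illustration, not course data.

```r
library(forcats)

# Hypothetical data for illustration only.
airline <- factor(
  c("Delta", "United", "Delta", "JetBlue", "Delta", "United", "Alaska")
)

# Lump: keep the 2 most frequent levels and collapse the rest into "Other".
lumped <- fct_lump_n(airline, n = 2)
levels(lumped)

# Recode: rename a level using new = old pairs.
recoded <- fct_recode(airline, "Delta Air Lines" = "Delta")
levels(recoded)

# Reorder: move a chosen level to the front.
releveled <- fct_relevel(airline, "United")
levels(releveled)
```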

What datasets are used in this course?

You will work with the FiveThirtyEight flight dataset and Kaggle's State of Data Science and ML Survey to practice manipulating and visualizing categorical data.

Which other tidyverse packages besides forcats will I use?

You will also use ggplot2 for visualization, dplyr for data manipulation, stringr for string operations, and tidyr for reshaping data alongside forcats.

What should I already know before starting this course?

You should have completed Introduction to the Tidyverse, Data Manipulation with dplyr, and Reshaping Data with tidyr to be prepared for this course.

Will I learn to visualize categorical data effectively?

Yes. The course covers how to create clear visualizations of factor variables, including reordering them by frequency or other metrics to make plots more readable.
