Course

Reshaping Data with tidyr

Skill Level: Intermediate

4.8+

Updated 03/2023

Transform almost any dataset into a tidy format to make analysis easier.

R, Data Manipulation

4 hr

15 videos

54 Exercises

4,650 XP

24,562

Statement of Accomplishment

Course Description

Data in the wild can be scary. When confronted with a complicated and messy dataset, you may find yourself wondering: where do I even start? The tidyr package allows you to wrangle such beasts into nice and tidy datasets. Inaccessible values stored in column names will be put into rows, JSON files will become data frames, and missing values will never go missing again. You'll practice these techniques on a wide range of messy datasets, learning along the way how many dogs the Soviet Union sent into space and what bird is most popular in New Zealand. With the tidyr package in your tidyverse toolkit, you'll be able to transform almost any dataset into a tidy format, which will pay off during the rest of your analysis.

Prerequisites

Data Manipulation with dplyr

1

Tidy Data

You'll be introduced to the concept of tidy data, which is central to this course. In the first two lessons, you'll jump straight into the action by separating messy character columns into tidy variables and observations ready for analysis. In the final lesson, you'll learn how to overwrite and remove missing values.

What is tidy data?

Tidy data structure

Multiple variables per column

Columns with multiple values

International phone numbers

Extracting observations from values

Separating into columns and rows

Missing values

And the Oscar for best director goes to ... <NA>

Imputing sales data

Nuclear bombs per continent
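
The separating and missing-value steps this chapter describes might look roughly like the sketch below. It assumes tidyr's separate(), separate_rows(), and replace_na(); the contacts table and its column names are made up for illustration and are not course data.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: name and country share one column,
# and several phone numbers are packed into a single string.
contacts <- tribble(
  ~person,     ~phones,
  "Ada, UK",   "111; 222",
  "Grace, US", "333",
  "Linus, FI", NA
)

tidy_contacts <- contacts |>
  # One variable per column: split "Ada, UK" into name and country
  separate(person, into = c("name", "country"), sep = ", ") |>
  # One observation per row: each phone number gets its own row
  separate_rows(phones, sep = "; ") |>
  # Overwrite the remaining missing value with an explicit label
  replace_na(list(phones = "unknown"))

print(tidy_contacts)
```

To remove incomplete rows rather than overwrite them, drop_na(phones) could be used in place of the replace_na() step.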

2

From Wide to Long and Back

This chapter is all about pivoting data from a wide to a long format and back again using the pivot_longer() and pivot_wider() functions. You'll need these functions when variables are hidden in messy column names or when variables are stored in rows instead of columns. You'll learn about space dogs, nuclear bombs, and planet temperatures along the way.

From wide to long data

Nuclear bombs per country

WHO obesity per country

Bond... James Bond

Deriving variables from column headers

New Zealand's bird of the year

Big tech stock prices

Deriving variables from complex column headers

Soviet space dogs, the dog perspective

WHO obesity vs. life expectancy

Uncounting observations

From long to wide data

Soviet space dogs, the flight perspective

Planet temperature & distance to the Sun

Transposing planet data
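
As a rough illustration of the round trip this chapter covers, the sketch below pivots a small wide table to long format and back. The table, its column names, and its numbers are invented for the example; only pivot_longer() and pivot_wider() come from the chapter description.

```r
library(tidyr)
library(tibble)

# Hypothetical wide data: the year variable is hidden in the column names
wide <- tribble(
  ~country, ~tests_1945, ~tests_1946,
  "A",      3,           2,
  "B",      0,           1
)

# Wide to long: move the year out of the headers into its own column
long <- wide |>
  pivot_longer(
    cols = -country,
    names_to = "year",
    names_prefix = "tests_",                   # strip the shared prefix
    names_transform = list(year = as.integer), # "1945" -> 1945L
    values_to = "n_tests"
  )

# Long to wide: rebuild one column per year
back <- long |>
  pivot_wider(
    names_from = year,
    values_from = n_tests,
    names_prefix = "tests_"
  )

print(long)
print(back)
```

Here names_prefix and names_transform handle a simple header pattern; headers that encode several variables would typically need names_sep or names_pattern instead.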

3

Expanding Data

Values can often be missing in your data, and sometimes entire observations are absent too. In this chapter, you'll learn how to complete your dataset with these missing observations. You'll add observations with zero values to counted data, expand time series to a full sequence of intervals, and more!

Creating unique combinations of vectors

Letters of the genetic code

When did humans replace dogs in space?

Finding missing observations

Completing data with all value combinations

Completing the Solar System

Zero Olympic medals

Creating a sequence with full_seq()

The Cold War's hottest year

Advanced completions

Olympic medals per continent

Tracking a virus outbreak

Counting office occupants
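
The completion techniques described in this chapter could be sketched as follows, assuming tidyr's expand_grid(), complete(), and full_seq(). The medal counts and the four-year period are made-up values for illustration.

```r
library(tidyr)
library(tibble)

# All unique combinations of two vectors (2 x 2 = 4 rows)
combos <- expand_grid(first = c("A", "C"), second = c("G", "T"))

# Hypothetical counted data with absent observations:
# country A has no row for 2004, country B only has 2004.
medals <- tribble(
  ~country, ~year, ~n_medals,
  "A",      2000,  3,
  "A",      2008,  1,
  "B",      2004,  2
)

completed <- medals |>
  complete(
    country,
    # Fill the gaps in the time series: 2000, 2004, 2008
    year = full_seq(year, period = 4),
    # Absent observations become explicit zero counts
    fill = list(n_medals = 0)
  )

print(completed)
```

Without the fill argument, the added rows would hold NA instead of zero, which is useful when the goal is to find missing observations rather than to count them as zero.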

4

Rectangling Data

In the final chapter, you'll learn how to turn nested data structures such as JSON and XML files into tidy, rectangular data. This skill will enable you to process data from web APIs. You'll also learn how nested data structures can be used to write elegant modeling pipelines that produce tidy outputs.

Intro to non-rectangular data

Rectangular vs. non-rectangular files

Rectangling Star Wars movies

From nested values to observations

Unnesting wide or long

Rectangling Star Wars planets

The Solar System's biggest moons

Selecting nested variables

Hoisting Star Wars films

Hoisting movie ratings

Nesting data for modeling

Tidy model outputs with broom

Nesting tibbles

Modeling on nested data frames

Congratulations!
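
A minimal sketch of rectangling, assuming tidyr's unnest_wider(), unnest_longer(), and hoist(). The nested list below is a hypothetical stand-in for parsed JSON from a web API; the film titles and planet names are placeholders, not course data.

```r
library(tidyr)
library(tibble)

# Hypothetical nested data: one list per film, as a JSON parser might return
films <- tibble(
  film = list(
    list(title = "Film One", year = 1977, planets = c("P1", "P2")),
    list(title = "Film Two", year = 1980, planets = "P3")
  )
)

by_planet <- films |>
  # Named list elements become columns: title, year, planets
  unnest_wider(film) |>
  # Each planet becomes its own row
  unnest_longer(planets)

first_planets <- films |>
  # Pull out only the nested values of interest, by name and position
  hoist(film, title = "title", first_planet = list("planets", 1L))

print(by_planet)
print(first_planets)
```

unnest_wider() and unnest_longer() unpack everything at one level, while hoist() picks selected values and leaves the rest nested, which can be handy when an API response is deep and only a few fields matter.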

Don’t just take our word for it

4.8 from 452 reviews

85%

13%

2%

0%

0%

Eric

6 days ago

Viplav

2 weeks ago

Marcin

2 weeks ago

Ednei Luís

2 weeks ago

Great course

Maria

3 weeks ago

Ntobeko

3 weeks ago

Up until now, I've been using very basic data wrangling, rarely pivoting (never [un]nested). Working through this course has shown me all the new kinds of questions I can ask of my data.

FAQs

Is this course suitable for beginners?

Yes, this course is suitable for beginners. It starts with an introduction to tidy data, the concept central to the course, then progresses to the tidyr package functions before introducing more sophisticated concepts such as pivoting and rectangling data.

Will I receive a certificate at the end of the course?

Yes, you will receive a certificate of completion after successfully completing all the chapters of the course.

What jobs would benefit from this course?

This course is especially beneficial to anyone in the data science or analytics field, as the skills it teaches are invaluable for reshaping and wrangling data for any kind of analysis or graphing. It also benefits developers working with web APIs, as the Rectangling Data chapter teaches how to transform nested data structures into tidy, rectangular data.
