
Joining Data in R with dplyr

  • 20 Videos
  • 84 Exercises
  • 4 hours 
  • 4,359 Participants
  • 6550 XP

Instructor(s):

Garrett Grolemund

Garrett is the author of Hands-On Programming with R and R for Data Science from O'Reilly Media. He is a Data Scientist at RStudio and holds a Ph.D. in Statistics, but specializes in teaching. He has taught people how to use R at over 50 government agencies, small businesses, and multi-billion-dollar global companies. He has also designed RStudio's training materials for R, Shiny, dplyr, and more, and is a frequent contributor to the RStudio blog. He wrote the popular lubridate package for R.

Collaborator(s):

Nick Carchedi

Tom Jeon

Course Description

This course builds on what you learned in Data Manipulation in R with dplyr by showing you how to combine data sets with dplyr's two-table verbs. In the real world, data comes split across many data sets, but dplyr's core functions are designed to work with single tables of data. In this course, you'll learn the best ways to combine data sets into single tables. You'll learn how to add columns from one data set to another with mutating joins, how to filter one data set against another with filtering joins, and how to compare data sets row by row with set operations. Along the way, you'll discover best practices for building data sets and troubleshooting joins with dplyr. Afterwards, you'll be well on your way to data manipulation mastery!
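As a rough preview of the three families of two-table verbs named above, here is a minimal sketch. The `artists` and `bands` tables are hypothetical toy data invented for illustration and are not taken from the course; the functions themselves (`left_join()`, `semi_join()`, `anti_join()`, `union()`, `intersect()`, `setdiff()`) are dplyr's.

```r
library(dplyr)

# Hypothetical toy tables, invented for illustration only
artists <- tibble(
  name       = c("Ann", "Bob", "Cy"),
  instrument = c("guitar", "bass", "drums")
)
bands <- tibble(
  name = c("Ann", "Bob", "Dee"),
  band = c("The Joins", "The Joins", "The Sets")
)

# Mutating join: keep every row of artists and add the band column;
# Cy has no match in bands, so his band is NA
joined <- left_join(artists, bands, by = "name")

# Filtering joins: keep or drop rows of artists based on a match in bands,
# without adding any columns
matched   <- semi_join(artists, bands, by = "name")  # Ann and Bob
unmatched <- anti_join(artists, bands, by = "name")  # Cy

# Set operations: compare whole rows of tables that share the same columns
all_names    <- union(select(artists, name), select(bands, name))
shared_names <- intersect(select(artists, name), select(bands, name))
only_artists <- setdiff(select(artists, name), select(bands, name))
```

The set operations are applied to single-column selections here because they compare complete rows, so both inputs need the same columns.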

Assembling data 

This chapter will show you how to build data sets from basic elements: vectors, lists, and individual data sets that do not require a join. dplyr contains a set of functions for assembling data that work more intuitively than base R's functions. The chapter will also look at when dplyr does and does not coerce data types.
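The kind of assembly this chapter describes might look like the sketch below. The tables `q1` and `q2` and all column names are hypothetical, and the sketch uses `tibble()` and `as_tibble()` (re-exported by dplyr), which may differ from the constructor functions the course itself uses.

```r
library(dplyr)

# Hypothetical toy tables, invented for illustration only
q1 <- tibble(id = 1:2, sales = c(10, 20))
q2 <- tibble(id = 3:4, sales = c(30, 40))

# Stack tables on top of each other; .id records which list element
# each row came from
stacked <- bind_rows(list(q1 = q1, q2 = q2), .id = "quarter")

# Place tables side by side; rows are paired by position, not by a key,
# so the inputs must already be in the same order
side <- bind_cols(q1, tibble(region = c("east", "west")))

# Build a table from a list of equal-length vectors
from_list <- as_tibble(list(id = 1:2, label = c("a", "b")))

# Coercion: integer and double columns are combined into a double column
mixed <- bind_rows(tibble(x = 1L), tibble(x = 2.5))

# No coercion: dplyr refuses to combine a numeric column with a character
# column, so this call raises an error instead of converting silently
clash <- tryCatch(
  bind_rows(tibble(x = 1), tibble(x = "a")),
  error = function(e) e
)
```

Because `bind_cols()` matches rows purely by position, a join is the safer choice whenever the two tables share a key column.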