Joining Data in R with dplyr

This course will show you how to combine data sets with dplyr's two-table verbs.
4 Hours, 20 Videos, 84 Exercises, 31,649 Learners
6550 XP


Course Description

This course builds on what you learned in Data Manipulation in R with dplyr by showing you how to combine data sets with dplyr's two-table verbs. In the real world, data comes split across many data sets, but dplyr's core functions are designed to work with single tables of data. In this course, you'll learn the best ways to combine data sets into single tables. You'll learn how to augment one data set with columns from another using mutating joins, how to filter one data set against another using filtering joins, and how to sift through data sets using set operations. Along the way, you'll discover best practices for building data sets and troubleshooting joins with dplyr. Afterwards, you'll be well on your way to data manipulation mastery!
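As a rough illustration of the three families of two-table verbs named above, here is a minimal sketch. The tables `artists` and `instruments` and their contents are hypothetical, invented for this example and not taken from the course materials.

```r
library(dplyr)

# Hypothetical toy tables, invented for illustration only
artists <- tibble(
  name = c("Ann", "Bob", "Cy"),
  band = c("A-Team", "B-Sides", "C-Notes")
)
instruments <- tibble(
  name = c("Ann", "Bob", "Dee"),
  plays = c("guitar", "bass", "drums")
)

# Mutating join: keeps every row of artists and adds the plays column;
# artists with no match in instruments get NA
joined <- left_join(artists, instruments, by = "name")

# Filtering joins: keep or drop rows of artists without adding columns
matched <- semi_join(artists, instruments, by = "name")
unmatched <- anti_join(artists, instruments, by = "name")

# Set operation: names that appear in both tables
common <- intersect(select(artists, name), select(instruments, name))
```

In this sketch `joined` keeps all three artists, `matched` keeps only those who also appear in `instruments`, and `unmatched` keeps the rest.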

  1. Mutating joins

    Mutating joins add new variables to one dataset from another dataset, matching observations across rows in the process. This chapter will explain the various ways you can join datasets together and what happens when you do.
  2. Filtering joins and set operations

    Filtering joins and set operations combine information from datasets without adding new variables. Filtering joins filter the observations of one dataset based on whether or not they occur in a second dataset. Set operations use combinations of observations from both datasets to create a new dataset.
  3. Assembling data

    This chapter will show you how to build datasets from basic elements: vectors, lists, and individual datasets that do not require a join. dplyr contains a set of functions for assembling data that work more intuitively than base R's functions. The chapter will also look at when dplyr does and does not use data type coercion.
  4. Advanced joining

    Now that you have the basics, let's dive deep into the mechanics of joins. This chapter will show you how to spot common join problems, how to join based on multiple or mismatched keys, how to join multiple tables, and how to recreate dplyr's joins with SQL and base R.
  5. Case study

    You know the ins and outs of two-table verbs with dplyr, but your knowledge is untested! Let's cement what you've learned with a real-world application.
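Two of the ideas in the chapter outline above, assembling data without a join and joining on mismatched key names, can be sketched as follows. All table and column names here (`side_one`, `side_two`, `writers`, `track`, `title`) are hypothetical and chosen only for illustration.

```r
library(dplyr)

# Hypothetical toy tables, invented for illustration only
side_one <- tibble(track = c("Intro", "Theme"), seconds = c(95, 210))
side_two <- tibble(track = "Finale", seconds = 180)

# bind_rows() stacks data frames, matching columns by name
album <- bind_rows(side_one, side_two)

# The key is called track in album but title in writers,
# so the by argument names both sides of the match
writers <- tibble(title = c("Intro", "Finale"), writer = c("Lee", "Kim"))
credited <- left_join(album, writers, by = c("track" = "title"))
```

The result keeps the key column under its left-hand name, `track`, and fills `writer` with NA for tracks that have no entry in `writers`.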
Aerosmith, The Eagles, Elvis Presley, Hank Williams, Jimi Hendrix, Julie Andrews, Michael Jackson, Frank Sinatra and Bing Crosby, Musicals, The Dark Side of the Moon (Pink Floyd), Top selling albums in the US, The Complete Studio Recordings, The Song Remains the Same, The Definitive Collection, Lahman Names, Live! Bootleg, Supergroups
Nick Carchedi, Tom Jeon

Team RStudio

