Joining Data with data.table in R

This course will show you how to combine and merge datasets with data.table.

4 Hours | 13 Videos | 47 Exercises | 10,503 Learners | 3950 XP


Course Description

In the real world, data sets typically come split across many tables, while most data analysis functions in R are designed to work with a single table of data. In this course, you'll learn how to combine data sets into single tables using data.table. You'll learn how to add columns from one table to another, how to filter a table based on observations in another table, and how to identify records across multiple tables that match complex criteria. Along the way, you'll learn how to troubleshoot failed join operations and pick up best practices for working with complex data sets. After completing this course, you'll be well on your way to becoming a data.table master!
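The operations named above can be sketched with data.table's merge() method and its bracket syntax. This is a minimal illustration and not material from the course: the tables customers and orders, their columns, and their values are hypothetical.

```r
library(data.table)

# Hypothetical example tables (not from the course):
# one row per customer and one row per order.
customers <- data.table(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
orders <- data.table(id = c(2, 3, 4), total = c(20, 35, 50))

# Add columns from one table to another, keeping only ids found in both.
inner <- merge(customers, orders, by = "id")

# Keep every row from both tables; cells with no match are filled with NA.
full <- merge(customers, orders, by = "id", all = TRUE)

# Keep every customer (left join) or every order (right join).
left <- merge(customers, orders, by = "id", all.x = TRUE)
right <- merge(customers, orders, by = "id", all.y = TRUE)

# Filter one table based on observations in another.
has_order <- customers[id %in% orders$id]   # customers with at least one order
no_order <- customers[!orders, on = "id"]   # customers with no order (anti-join)
```

The all, all.x, and all.y arguments control which unmatched rows survive the join, so the same merge() call covers inner, full, left, and right joins.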

  1. Joining Multiple data.tables

    Free

    This chapter will show you how to perform simple joins that will enable you to combine information spread across multiple tables.

    Welcome to the course (50 xp)
    Exploring data.tables (100 xp)
    Identifying join keys (50 xp)
    Multiple data.tables, multiple keys (50 xp)
    The merge function (50 xp)
    Inner join (100 xp)
    Full join (100 xp)
    Left and right joins (50 xp)
    Left join (100 xp)
    Right join (100 xp)
    Mastering simple joins (100 xp)
  3. Diagnosing and Fixing Common Join Problems

    This chapter will discuss common problems and errors encountered when performing data.table joins and show you how to troubleshoot and avoid them.

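The chapter listing does not say which join problems are covered, so the following is only an assumed illustration of the kind of checks that help when a join returns unexpected rows. The tables, column names, and values are hypothetical.

```r
library(data.table)

# Hypothetical tables whose key columns have different names.
customers <- data.table(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
orders <- data.table(customer_id = c(2, 2, 4), total = c(20, 35, 50))

# Name the key on each side explicitly rather than relying on shared column names.
joined <- merge(customers, orders, by.x = "id", by.y = "customer_id")

# Count rows per key to spot duplicates that multiply rows in the result.
dup_keys <- orders[, .N, by = customer_id][N > 1]

# List keys present in orders that have no partner in customers.
unmatched <- setdiff(orders$customer_id, customers$id)
```

Inspecting duplicate and unmatched keys before joining makes it easier to explain why a result has more or fewer rows than expected.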

In the following tracks

Data Analyst, Data Manipulation

Collaborators

Sumedh Panchadhar, Richie Cotton, Eunkyung Park

Scott Ritchie

Postdoctoral Researcher in Systems Genomics

Scott Ritchie is a Postdoctoral Researcher in the field of systems genomics. He applies and develops tools to analyse genetic and molecular data in population studies of common diseases. He is a daily user of R and the data.table package. He has contributed to the development of the data.table package and to course material used by Software Carpentry. He holds an MSc in Bioinformatics and a PhD in systems biology.

What do other learners have to say?

I've used other sites—Coursera, Udacity, things like that—but DataCamp's been the one that I've stuck with.

Devon Edwards Joseph
Lloyds Banking Group

DataCamp is the top resource I recommend for learning data science.

Louis Maiden
Harvard Business School

DataCamp is by far my favorite website to learn from.

Ronald Bowers
Decision Science Analytics, USAA