In the real world, data sets typically come split across many tables, while most data analysis functions in R are designed to work with a single table of data. In this course, you'll learn how to combine data sets into single tables using data.table. You'll learn how to add columns from one table to another, how to filter a table based on observations in another table, and how to identify records across multiple tables that match complex criteria. Along the way, you'll learn how to troubleshoot failed join operations and pick up best practices for working with complex data sets. After completing this course, you'll be well on your way to becoming a data.table master!
Joining Multiple data.tables
This chapter will show you how to perform simple joins that will enable you to combine information spread across multiple tables.
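As a rough illustration of what a simple join looks like, the sketch below uses `merge()` on two small tables. The tables `customers` and `orders` and their columns are made up for this example and are not taken from the course; the chapter itself may approach joins differently.

```r
library(data.table)

# Hypothetical example tables (not from the course material)
customers <- data.table(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
orders    <- data.table(id = c(2, 3, 4), total = c(10, 20, 30))

# Inner join: keep only ids present in both tables
inner <- merge(customers, orders, by = "id")

# Full join: keep ids present in either table, filling gaps with NA
full <- merge(customers, orders, by = "id", all = TRUE)

# Left join: keep every row of customers, matched to orders where possible
left <- merge(customers, orders, by = "id", all.x = TRUE)
```

The `all`, `all.x`, and `all.y` arguments control which unmatched rows are kept, so one function covers inner, full, left, and right joins.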
Joins Using data.table Syntax
In this chapter you will perform joins using the data.table syntax, set and view data.table keys, and perform anti-joins.
Joins using data.table syntax: Right join with the data.table syntax; Inner join with the data.table syntax; Anti-joins
Setting and viewing data.table keys: Setting keys; Getting keys
Incorporating joins into your data.table workflow: Exploring the Australian population; Finding multiple matches; Exploring world life expectancy
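The operations named in this chapter can be sketched as follows. This is a minimal example on invented tables (`customers` and `orders` are hypothetical), assuming a data.table version recent enough to accept `nomatch = NULL`; older versions use `nomatch = 0` for the same effect.

```r
library(data.table)

# Hypothetical example tables (not from the course material)
customers <- data.table(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
orders    <- data.table(id = c(2, 3, 4), total = c(10, 20, 30))

# Right join: X[Y, on = ...] keeps every row of Y (orders here)
right_join <- customers[orders, on = "id"]

# Inner join: drop rows of orders that have no match in customers
inner_join <- customers[orders, on = "id", nomatch = NULL]

# Anti-join: rows of customers with no match in orders
anti_join <- customers[!orders, on = "id"]

# Setting keys sorts each table by the key column and lets joins omit `on`
setkey(customers, id)
setkey(orders, id)

# Viewing keys
customer_key <- key(customers)

# With keys set on both tables, the join uses them implicitly
keyed_join <- customers[orders]
```

In the `X[Y]` form, the table inside the brackets decides which rows are kept, which is why the default behaves as a right join.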
Diagnosing and Fixing Common Join Problems
This chapter will discuss common problems and errors encountered when performing data.table joins and show you how to troubleshoot and avoid them.
Concatenating and Reshaping data.tables
In the last chapter of this course you'll learn how to concatenate observations from multiple tables together, how to identify observations present in one table but not another, and how to reshape tables between long and wide formats.
Concatenating two or more data.tables: Concatenating data.table variables; Concatenating a list of data.tables
Set operations: Identifying observations shared by multiple tables; Removing duplicates while combining tables; Identifying observations unique to a table
Melting data.tables: Melting a wide table; More melts
Casting data.tables: Casting a long table; Casting multiple columns; Splitting by multiple groups
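A short sketch of these operations, using functions that data.table provides (`rbindlist()`, `fintersect()`, `funion()`, `fsetdiff()`, `melt()`, and `dcast()`). The tables `jan`, `feb`, and `wide` are invented for illustration and do not come from the course exercises.

```r
library(data.table)

# Hypothetical monthly tables with identical columns
jan <- data.table(id = c(1, 2), sales = c(10, 20))
feb <- data.table(id = c(2, 3), sales = c(20, 30))

# Concatenate a named list of tables, recording the source in a "month" column
stacked <- rbindlist(list(jan = jan, feb = feb), idcol = "month")

# Set operations compare whole rows
shared   <- fintersect(jan, feb)  # rows present in both tables
either   <- funion(jan, feb)      # rows in either table, duplicates removed
only_jan <- fsetdiff(jan, feb)    # rows in jan that are not in feb

# Hypothetical wide table: one column per measurement
wide <- data.table(id = c(1, 2), height = c(150, 160), weight = c(50, 60))

# Melt: wide to long, one row per id and measurement
long <- melt(wide, id.vars = "id",
             measure.vars = c("height", "weight"),
             variable.name = "measure", value.name = "value")

# Cast: long back to wide, one column per level of `measure`
wide_again <- dcast(long, id ~ measure, value.var = "value")
```

The set functions require both tables to have the same column names and types, which is why `jan` and `feb` share an identical layout here.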
Prerequisites: Data Manipulation with data.table in R
Postdoctoral Researcher in Systems Genomics
Scott Ritchie is a postdoctoral researcher in the field of systems genomics. He applies and develops tools to analyse genetic and molecular data in population studies of common diseases. He is a daily user of R and the data.table package. He has contributed to the development of the data.table package and to course material used by Software Carpentry. He holds an MSc in Bioinformatics and a PhD in systems biology.