Interactive Course

Data Manipulation in R with data.table

Master core data manipulation concepts such as filtering rows, selecting columns, and calculating groupwise statistics using data.table.

  • 4 hours
  • 15 Videos
  • 59 Exercises
  • 3,183 Participants
  • 5,050 XP

Course Description

The data.table package provides a high-performance version of base R's data.frame, with syntax and feature enhancements for ease of use, convenience, and programming speed. This course shows you how to create, subset, and manipulate data.tables. You'll also learn about the database-inspired features of data.tables, including built-in groupwise operations. The course concludes with fast methods for importing and exporting tabular text data such as CSV files. By the end of the course, you will be able to use data.table in R to manipulate and analyze data more efficiently. Throughout the course you'll explore the San Francisco Bay Area bike share trip dataset from 2014.

  1. Introduction to data.table

    Free

    This chapter introduces data.tables as a drop-in replacement for data.frames and shows how to use data.table's i argument to filter rows.

  2. Selecting and Computing on Columns

    Just as the i argument lets you filter rows, the j argument of data.table lets you select columns and also perform computations on them. The syntax is more convenient and flexible than the equivalent data.frame syntax.

  3. Groupwise Operations

    This chapter introduces data.table's by argument that lets you perform computations by groups. By the end of this chapter, you will master the concise DT[i, j, by] syntax of data.table.

  4. Reference Semantics

    In this chapter you will learn about a distinctive feature of data.table: modifying existing data.tables in place. In-place modification avoids copying the data, which makes operations fast, and the syntax is easy to learn.

  5. Importing and Exporting Data

    The data.table package not only performs computations quickly, it also reads and writes data to disk at high speed. This chapter focuses on data.table's fread() and fwrite() functions, which let you import and export flat files quickly and easily.
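
The chapter topics above can be illustrated with a minimal sketch. It assumes the data.table package is installed; the `trips` table and its column names (`trip_id`, `start_station`, `duration`) are hypothetical stand-ins, not the actual bike share dataset used in the course.

```r
library(data.table)

# Hypothetical toy data; column names and values are assumptions for illustration.
trips <- data.table(
  trip_id       = 1:6,
  start_station = c("Market St", "Townsend", "Market St",
                    "Embarcadero", "Townsend", "Market St"),
  duration      = c(300, 1200, 450, 900, 600, 150)  # seconds
)

# i argument: filter rows (trips longer than 500 seconds)
long_trips <- trips[duration > 500]

# j argument: select columns, or compute on them
selected      <- trips[, .(trip_id, duration)]
mean_duration <- trips[, mean(duration)]

# by argument: compute per group (.N is the row count within each group)
by_station <- trips[, .(n_trips = .N, mean_duration = mean(duration)),
                    by = start_station]

# i, j and by combined in a single DT[i, j, by] call
filtered_means <- trips[duration > 200,
                        .(mean_duration = mean(duration)),
                        by = start_station]

# Reference semantics: := adds a column in place, without copying the table
trips[, duration_min := duration / 60]

# fwrite() and fread(): write to a temporary CSV file and read it back
tmp <- tempfile(fileext = ".csv")
fwrite(trips, tmp)
trips_back <- fread(tmp)
unlink(tmp)
```

Note that `:=` changes `trips` itself rather than returning a modified copy, so no assignment with `<-` is needed on that line.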

Matt Dowle

Author of data.table

Matt Dowle is the main author of the data.table package. Matt has worked for some of the world’s largest financial organizations and has been programming in R for over a decade.

Arun Srinivasan

R's data.table co-developer

Arun Srinivasan is originally from Tamil Nadu, India. He holds a bachelor's degree in electronics engineering and a master's degree in bioinformatics. He started using R in 2010 and has contributed to R's data.table package since late 2013. He currently lives in London, where he works as a developer and analyst in finance. He has a passion for developing tools and algorithms that facilitate analyses of large data.

Collaborators
  • Richie Cotton
  • Benjamin Feder
  • Eunkyung Park
  • Sumedh Panchadhar
