Garrett is the author of Hands-On Programming with R and R for Data Science from O'Reilly Media. He is a Data Scientist at RStudio and holds a Ph.D. in Statistics, but specializes in teaching. He has taught people how to use R at over 50 government agencies, small businesses, and multi-billion dollar global companies. He designed RStudio's training materials for R, Shiny, dplyr and more, is a frequent contributor to the RStudio blog, and wrote the popular lubridate package for R.
In this interactive tutorial, you will learn how to use dplyr to carry out data manipulation in R. First you will master dplyr's five data manipulation verbs: select, mutate, filter, arrange and summarise. Next, you will learn how to chain your dplyr operations using the pipe operator of the magrittr package. The final section focuses on performing group-wise operations with the group_by function and on accessing data stored outside of R in a database. By the end, you will be familiar with tools and techniques that allow you to manipulate data efficiently.
Introduction to the dplyr package and the tbl class. Learn the philosophy that guides dplyr, discover some useful applications of the dplyr package, and meet the data structures that dplyr uses behind the scenes.
Get familiar with dplyr's manipulation verbs. Meet the five verbs and then practice using the mutate and select verbs.
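As a rough illustration of the two verbs practiced in this chapter, here is a minimal sketch. It assumes the dplyr package is installed and uses R's built-in mtcars data set as a stand-in; the column name kpl is hypothetical and not taken from the course.

```r
library(dplyr)

# mtcars ships with R; it is used here only as example data
cars <- as_tibble(mtcars)

# select() keeps only the named columns
small <- select(cars, mpg, cyl, wt)

# mutate() adds a column computed from existing ones
# (kpl is a hypothetical name: miles per gallon converted to km per litre)
small <- mutate(small, kpl = mpg * 0.425144)
```

Both verbs return a new data frame and leave the original untouched, which is why the result is assigned back to a name.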
Learn how to search through the observations in your data set and extract the ones you need with the filter function. Rearrange the observations in your data set with the arrange verb.
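A short sketch of what filtering and arranging can look like, assuming dplyr is installed and again borrowing the built-in mtcars data set purely as an example; the object names four_cyl and ranked are made up for illustration.

```r
library(dplyr)

# filter() keeps only the rows for which the condition is TRUE
four_cyl <- filter(mtcars, cyl == 4)

# arrange() reorders rows; desc() sorts from highest to lowest
ranked <- arrange(four_cyl, desc(mpg))
```

filter changes which rows are present, while arrange only changes their order.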
Master the data manipulation verb summarise, and practice combining the five verbs to solve advanced data manipulation tasks. Learn to chain the operations together with the pipe operator.
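The idea of summarising and chaining can be sketched as follows. This is an illustrative example rather than course material: it assumes dplyr is installed (dplyr re-exports magrittr's %>% pipe), uses the built-in mtcars data set, and the column names n_cars and mean_mpg are hypothetical.

```r
library(dplyr)

# The pipe passes the result on its left as the first argument
# of the function on its right, so the steps read top to bottom
result <- mtcars %>%
  filter(cyl == 4) %>%
  summarise(n_cars = n(), mean_mpg = mean(mpg))
```

Without the pipe, the same computation would be written inside out as summarise(filter(mtcars, cyl == 4), ...), which becomes harder to read as more verbs are combined.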
Complete your mastery of data manipulation with group-wise operations and databases. Learn to use group_by to group your data into subsets of observations, and use dplyr to access data stored outside of R in a database.
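A minimal sketch of a group-wise operation, assuming dplyr is installed and using the built-in mtcars data set as stand-in data; the names by_cyl, n_cars and mean_mpg are chosen for illustration only. The database part of the chapter is not shown here.

```r
library(dplyr)

# group_by() marks the groups; summarise() then computes
# one row of results per group instead of one row overall
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(n_cars = n(), mean_mpg = mean(mpg))
```

The result has one row for each distinct value of cyl, with the count and the mean computed within that group.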