Case Study: DataCamp, dplyr, and Blended Learning

How can you use an online interactive coding environment for R in class? A case study using the DataCamp platform.

Editorial Note: This is a guest blog post by Professor Matthew J. Salganik (Princeton University) in which he describes his experiences using the DataCamp interactive learning platform for blended learning. The article was first published on Wheels on the Bus. Want to use DataCamp in your class as well? Contact us via [email protected].

DataCamp, dplyr, and blended learning

As I’ve written about in previous posts (here, here, and here), this semester I taught a course called Advanced Data Analysis for the Social Sciences, which is the second course in our department’s required sequence for Ph.D. students. I’ve taught this course before, and this time I tried to modernize it in both content and form. As part of that effort, I partnered with DataCamp to make their dplyr course, taught by Garrett Grolemund, available to my students. This combination of face-to-face teaching and online content is called blended learning, and it’s something that I’d like to explore more in future classes. For a first attempt, I think it worked pretty well, and the people at DataCamp were very helpful. Here’s more about what happened.

DataCamp is an online learning platform focused on data science. They offer courses on a variety of topics, but most are focused on R and a variety of R packages. I was happy to see that they offer a course on dplyr, a wonderful data manipulation package by Hadley Wickham and colleagues. dplyr is very well designed, but it takes some getting used to because it works differently (some might say better) than base R. So, as in a traditional class, we had a face-to-face lab session on dplyr and a homework assignment that required students to use it (here’s our class syllabus). But I knew that the students would need more practice if they were going to become truly fluent in dplyr. So, in addition to our traditional class activities, I offered the students the chance to take Garrett Grolemund’s dplyr course. The course consists of 5 chapters, each with videos and instantly graded exercises. I had taken Garrett’s class myself, and I found the exercises to be really helpful for practicing dplyr’s style of thinking.
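To illustrate the difference in style between base R and dplyr, here is a minimal sketch. It is not taken from Garrett's course; the built-in mtcars dataset, the 100-horsepower cutoff, and the variable names are assumptions chosen only for illustration. Both versions compute the mean miles per gallon by cylinder count for cars with more than 100 horsepower.

```r
library(dplyr)

# Base R: subset rows with bracket indexing, then aggregate with a formula.
powerful <- mtcars[mtcars$hp > 100, ]
base_result <- aggregate(mpg ~ cyl, data = powerful, FUN = mean)

# dplyr: the same steps written as a pipeline of verbs, read top to bottom.
dplyr_result <- mtcars %>%
  filter(hp > 100) %>%          # keep cars with more than 100 hp
  group_by(cyl) %>%             # one group per cylinder count
  summarise(mpg = mean(mpg)) %>% # mean mpg within each group
  arrange(cyl)                  # sort by cylinder count

print(base_result)
print(dplyr_result)
```

The two results should hold the same numbers. The difference is in how the steps are expressed: base R nests indexing and function calls, while dplyr chains one verb per step.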

How did my students respond? About half the students started Garrett’s course, and, of the students who started, a bit less than half completed pretty much the whole thing.


Is that a success or a failure? I’d say a success, because Garrett’s course was not required and it was quite long (about 4 hours). In other words, because this enrichment was offered, about a quarter of the class ended up spending more time learning. I’ve redacted names from the plot, but the students who were most engaged with Garrett’s course were an interesting mix; it was not just the strongest students or the struggling students. If this kind of enrichment were offered more regularly, I wonder which students would be the most likely to take it up.

Finally, I’d like to thank Martijn Theuwissen from DataCamp for helping to make this experiment in blended learning possible. I’d definitely like to try this again next time I teach a course on data analysis.