Machine Learning in the Tidyverse

Leverage the tools in the tidyverse to generate, explore and evaluate machine learning models.

5 Hours | 15 Videos | 52 Exercises | 10,366 Learners | 4300 XP

Course Description

This course will teach you to leverage the tools in the "tidyverse" to generate, explore, and evaluate machine learning models. Using a combination of the tidyr and purrr packages, you will build a foundation for working with complex model objects in a "tidy" way. You will also learn how to use the broom package to explore your resulting models. You will then be introduced to the tools in the test-train-validate workflow, which will empower you to evaluate the performance of both classification and regression models and will provide the information you need to optimize model performance via hyperparameter tuning.
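To give a rough sense of what working with model objects in a "tidy" way can look like, here is a minimal sketch. It is not taken from the course materials: the built-in mtcars dataset, the grouping by cyl, and the mpg ~ wt formula are illustrative assumptions, and the nest(data = -cyl) syntax assumes tidyr 1.0 or later.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Nest: one row per cylinder group; each group's rows live in the
# list column `data`. (mtcars and cyl are illustrative choices.)
nested <- mtcars %>%
  nest(data = -cyl)

# Map: fit one linear model per group and store it in a list column,
# then pull a single fit statistic out of each model with broom::glance().
models <- nested %>%
  mutate(
    model     = map(data, ~ lm(mpg ~ wt, data = .x)),
    fit_stats = map(model, glance),
    r_squared = map_dbl(fit_stats, "r.squared")
  )

# Tidy and unnest: one row per model coefficient, back in a flat data frame.
coefs <- models %>%
  mutate(coef = map(model, tidy)) %>%
  select(cyl, coef) %>%
  unnest(coef)

print(coefs)
```

Keeping the data, the fitted models, and their summaries as columns of one data frame means each model stays aligned with the group it was fit on, which is the idea behind the list column approach.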

  1. Foundations of "tidy" Machine Learning

    Free

    This chapter will introduce you to the backbone of machine learning in the tidyverse, the List Column Workflow (LCW). The LCW will empower you to work with many models in one dataframe.
    This chapter will also introduce you to the fundamentals of the broom package for exploring your models.

    Foundations of "tidy" machine learning (50 xp)
    Nesting your data (100 xp)
    Unnesting your data (100 xp)
    Explore a nested cell (100 xp)
    The map family of functions (50 xp)
    Mapping your data (100 xp)
    Expecting mapped output (100 xp)
    Mapping many models (100 xp)
    Tidy your models with broom (50 xp)
    The three ways to tidy your model (50 xp)
    Extracting model statistics tidily (100 xp)
    Augmenting your data (100 xp)

In the following tracks

Intermediate Tidyverse Toolbox, Machine Learning Scientist, Supervised Machine Learning

Collaborators

Sumedh Panchadhar, Chester Ismay, Eunkyung Park

Dmitriy Gorenshteyn

Lead Data Scientist at Memorial Sloan Kettering Cancer Center

Dmitriy is a Lead Data Scientist in the Strategy & Innovation department at Memorial Sloan Kettering Cancer Center. At MSK he develops predictive models for programs aimed at improving patient care. Prior to this role, Dmitriy completed his Doctorate in Quantitative & Computational Biology at Princeton University. With a passion for teaching and for R, he regularly holds cross-departmental R training sessions within MSK. His core teaching philosophy is centered on building intuition and understanding for the methods and tools available.
