This course will teach you to leverage the tools in the "tidyverse" to generate, explore, and evaluate machine learning models. Using a combination of the tidyr and purrr packages, you will build a foundation for working with complex model objects in a "tidy" way. You will also learn how to use the broom package to explore your resulting models. You will then be introduced to the tools in the test-train-validate workflow, which will enable you to evaluate the performance of both classification and regression models and give you the information needed to optimize model performance via hyperparameter tuning.
Foundations of "tidy" Machine Learning (Free)
This chapter will introduce you to the backbone of machine learning in the tidyverse, the List Column Workflow (LCW). The LCW will empower you to work with many models in one dataframe.
This chapter will also introduce you to the fundamentals of the broom package for exploring your models.
- Foundations of "tidy" machine learning (50 xp)
- Nesting your data (100 xp)
- Unnesting your data (100 xp)
- Explore a nested cell (100 xp)
- The map family of functions (50 xp)
- Mapping your data (100 xp)
- Expecting mapped output (100 xp)
- Mapping many models (100 xp)
- Tidy your models with broom (50 xp)
- The three ways to tidy your model (50 xp)
- Extracting model statistics tidily (100 xp)
- Augmenting your data (100 xp)
Multiple Models with broom
This chapter leverages the List Column Workflow to build and explore the attributes of 77 models. You will use the tools from the broom package to gain a multidimensional understanding of all of these models.
- Exploring coefficients across models (50 xp)
- Tidy up the coefficients of your models (100 xp)
- What can we learn about these 77 countries? (50 xp)
- Evaluating the fit of many models (50 xp)
- Glance at the fit of your models (100 xp)
- Best and worst fitting models (100 xp)
- Visually inspect the fit of many models (50 xp)
- Augment the fitted values of each model (100 xp)
- Explore your best and worst fitting models (100 xp)
- Improve the fit of your models (50 xp)
- Build better models (100 xp)
- Predicting the future (50 xp)
Build, Tune & Evaluate Regression Models
In this chapter you will learn how to use the List Column Workflow to build, tune and evaluate regression models. You will have the chance to work with two types of models: linear models and random forest models.
- Training, test and validation splits (50 xp)
- The test-train split (100 xp)
- Cross-validation data frames (100 xp)
- Measuring cross-validation performance (50 xp)
- Build cross-validated models (100 xp)
- Preparing for evaluation (100 xp)
- Evaluate model performance (100 xp)
- Building and tuning a random forest model (50 xp)
- Build a random forest model (100 xp)
- Evaluate a random forest model (100 xp)
- Fine tune your model (100 xp)
- The best performing parameter (100 xp)
- Measuring the test performance (50 xp)
- Build & evaluate the best model (100 xp)
Build, Tune & Evaluate Classification Models
In this chapter you will shift gears to build, tune and evaluate classification models.
- Logistic regression models (50 xp)
- Prepare train-test-validate parts (100 xp)
- Build cross-validated models (100 xp)
- Evaluating classification models (50 xp)
- Predictions of a single model (100 xp)
- Performance of a single model (100 xp)
- Prepare for cross-validated performance (100 xp)
- Calculate cross-validated performance (100 xp)
- Random forest for classification (50 xp)
- Tune random forest models (100 xp)
- Random forest performance (100 xp)
- Build final classification model (100 xp)
- Measure final model performance (100 xp)
- Wrap-up (50 xp)
In the following tracks: Intermediate Tidyverse Toolbox, Machine Learning Scientist, Supervised Machine Learning
Prerequisites: Modeling with Data in the Tidyverse
Lead Data Scientist at Memorial Sloan Kettering Cancer Center
Dmitriy is a Principal Data Scientist at Interos Inc. Previously, he worked in the Strategy & Innovation department at Memorial Sloan Kettering Cancer Center where he developed predictive models for programs aimed at improving patient care. Dmitriy completed his Doctorate in Quantitative & Computational Biology at Princeton University. His core teaching philosophy is centered on building intuition and understanding for the methods and tools available.