Tidymodels is a powerful suite of R packages designed to streamline machine learning workflows. Learn to split datasets for cross-validation, preprocess data with the recipes package, and fine-tune machine learning algorithms. You'll learn key concepts such as defining model objects and creating modeling workflows. Then, you'll apply your skills to predict home prices and classify employees by their risk of leaving a company.
Machine Learning with tidymodels
In this chapter, you’ll explore the rich ecosystem of R packages that power tidymodels and learn how they can streamline your machine learning workflows. You’ll then put your tidymodels skills to the test by predicting house sale prices in Seattle, Washington.
The tidymodels ecosystem (50 xp)
Tidymodels packages (100 xp)
Creating training and test datasets (100 xp)
Distribution of outcome variable values (100 xp)
Linear regression with tidymodels (50 xp)
Fitting a linear regression model (100 xp)
Exploring estimated model parameters (50 xp)
Predicting home selling prices (100 xp)
Evaluating model performance (50 xp)
Model performance metrics (100 xp)
R squared plot (100 xp)
Complete model fitting process with last_fit() (100 xp)
Learn how to predict categorical outcomes by training classification models. Using the skills you’ve gained so far, you’ll predict the likelihood of customers canceling their service with a telecommunications company.
Classification models (50 xp)
Data resampling (100 xp)
Fitting a logistic regression model (100 xp)
Combining test dataset results (100 xp)
Assessing model fit (50 xp)
Calculating metrics from the confusion matrix (50 xp)
Evaluating performance with yardstick (100 xp)
Creating custom metric sets (100 xp)
Visualizing model performance (50 xp)
Plotting the confusion matrix (100 xp)
ROC curves and area under the ROC curve (100 xp)
Automating the modeling workflow (50 xp)
Streamlining the modeling process (100 xp)
Collecting predictions and creating custom metrics (100 xp)
Complete modeling workflow (100 xp)
Find out how to bake feature engineering pipelines with the recipes package. You’ll prepare numeric and categorical data to help machine learning algorithms optimize your predictions.
Feature engineering (50 xp)
Exploring recipe objects (50 xp)
Creating recipe objects (100 xp)
Training a recipe object (100 xp)
Numeric predictors (50 xp)
Discovering correlated predictors (100 xp)
Removing correlated predictors with recipes (100 xp)
Multiple feature engineering steps (100 xp)
Nominal predictors (50 xp)
Applying step_dummy() to predictors (100 xp)
Ordering of step_*() functions (100 xp)
Complete feature engineering pipeline (100 xp)
Complete modeling workflow (50 xp)
Feature engineering process (100 xp)
Model training and prediction (100 xp)
Model performance metrics (100 xp)
Workflows and Hyperparameter Tuning
Now it’s time to streamline the modeling process using workflows and fine-tune models with cross-validation and hyperparameter tuning. You’ll learn how to tune a decision tree classification model to predict whether a bank's customers are likely to default on their loan.
Machine learning workflows (50 xp)
Exploring the loans dataset (100 xp)
Specifying a model and recipe (100 xp)
Creating workflows (100 xp)
Estimating performance with cross validation (50 xp)
Measuring performance with cross validation (100 xp)
Cross validation with logistic regression (100 xp)
Comparing model performance profiles (100 xp)
Hyperparameter tuning (50 xp)
Setting model hyperparameters (100 xp)
Random grid search (100 xp)
Exploring tuning results (100 xp)
Selecting the best model (50 xp)
Finalizing a workflow (100 xp)
Training a finalized workflow (100 xp)
Congratulations! (50 xp)
In the following tracks: Machine Learning Fundamentals, Machine Learning Scientist, Supervised Machine Learning
Prerequisites: Modeling with Data in the Tidyverse
David is a data scientist in the Washington, D.C., area, where he helps organizations leverage data science and machine learning to solve complex business problems and build data products. He is also an adjunct professor of Business Analytics in the Graduate School of Business at George Mason University, where he teaches courses focused on applied statistics, data analysis, machine learning, and database design.