Machine Learning with caret in R

Rating: 4.625 (16 reviews)

Advanced

This course teaches the big ideas in machine learning, such as how to build and evaluate predictive models.

4 Hours, 24 Videos, 88 Exercises

56,727 Learners, Statement of Accomplishment

Course Description

Machine learning is the study and application of algorithms that learn from and make predictions on data. From search results to self-driving cars, it has manifested itself in all areas of our lives and is one of the most exciting and fast-growing fields of research in the world of data science. This course teaches the big ideas in machine learning: how to build and evaluate predictive models, how to tune them for optimal performance, how to preprocess data for better results, and much more. The popular caret R package, which provides a consistent interface to all of R's most powerful machine learning facilities, is used throughout the course.

In the following Tracks

Machine Learning Fundamentals in R

Machine Learning Scientist with R

1
Regression Models: Fitting and Evaluating Their Performance
Free
In the first chapter of this course, you'll fit regression models with train() and evaluate their out-of-sample performance using cross-validation and root-mean-square error (RMSE).
Welcome to the course
50 xp
In-sample RMSE for linear regression
50 xp
In-sample RMSE for linear regression on diamonds
100 xp
Out-of-sample error measures
50 xp
Out-of-sample RMSE for linear regression
50 xp
Randomly order the data frame
100 xp
Try an 80/20 split
100 xp
Predict on test set
100 xp
Calculate test set RMSE by hand
100 xp
Comparing out-of-sample RMSE to in-sample RMSE
50 xp
Cross-validation
50 xp
Advantage of cross-validation
50 xp
10-fold cross-validation
100 xp
5-fold cross-validation
100 xp
5 x 5-fold cross-validation
100 xp
Making predictions on new data
100 xp
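
The split-and-evaluate workflow named in this chapter's exercises (randomly order the data frame, take an 80/20 split, predict on the test set, calculate RMSE by hand) can be sketched in base R. This is only an illustration under stated assumptions: the built-in mtcars data and the formula mpg ~ hp + wt are stand-ins chosen here, not the course's diamonds exercise, and rmse() is a helper defined for this sketch.

```r
# Assumption: mtcars (shipped with base R) stands in for the course's data.
set.seed(42)

# Randomly order the rows so the split does not depend on the original ordering.
rows <- sample(nrow(mtcars))
shuffled <- mtcars[rows, ]

# 80/20 split: the first 80% of shuffled rows train the model, the rest test it.
split <- round(nrow(shuffled) * 0.80)
train_set <- shuffled[1:split, ]
test_set <- shuffled[(split + 1):nrow(shuffled), ]

# Fit a linear regression on the training rows only.
model <- lm(mpg ~ hp + wt, data = train_set)

# Hypothetical helper: root-mean-square error computed by hand.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# In-sample error uses the rows the model has seen; out-of-sample uses held-out rows.
in_sample_rmse <- rmse(train_set$mpg, predict(model, train_set))
out_of_sample_rmse <- rmse(test_set$mpg, predict(model, test_set))

print(c(in_sample = in_sample_rmse, out_of_sample = out_of_sample_rmse))
```

A single split gives one estimate of out-of-sample error that depends on which rows land in the test set; cross-validation, covered later in the chapter, repeats the idea over several folds and averages the results.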
2
Classification Models: Fitting and Evaluating Their Performance
In this chapter, you'll fit classification models with train() and evaluate their out-of-sample performance using cross-validation and area under the curve (AUC).
Logistic regression on sonar
50 xp
Why a train/test split?
50 xp
Try a 60/40 split
100 xp
Fit a logistic regression model
100 xp
Confusion matrix
50 xp
Confusion matrix takeaways
50 xp
Calculate a confusion matrix
100 xp
Calculating accuracy
50 xp
Calculating true positive rate
50 xp
Calculating true negative rate
50 xp
Class probabilities and predictions
50 xp
Probabilities and classes
50 xp
Try another threshold
100 xp
From probabilities to confusion matrix
100 xp
Introducing the ROC curve
50 xp
What's the value of a ROC curve?
50 xp
Plot an ROC curve
100 xp
Area under the curve (AUC)
50 xp
Model, ROC, and AUC
50 xp
Customizing trainControl
100 xp
Using custom trainControl
100 xp
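
The step from class probabilities to a confusion matrix, and then to accuracy, true positive rate, and true negative rate, can be sketched in base R. The probabilities and labels below are made up for illustration (they are not output from the course's sonar model), and the 0.5 threshold is an assumption that the "Try another threshold" exercise would vary.

```r
# Hypothetical predicted probabilities of the positive class ("yes") and true labels.
p <- c(0.9, 0.8, 0.65, 0.4, 0.3, 0.2, 0.55, 0.1)
actual <- factor(c("yes", "yes", "no", "yes", "no", "no", "yes", "no"),
                 levels = c("no", "yes"))

# Turn probabilities into classes with a cutoff (assumed 0.5 here).
threshold <- 0.5
predicted <- factor(ifelse(p > threshold, "yes", "no"), levels = c("no", "yes"))

# Confusion matrix: rows are predicted classes, columns are actual classes.
cm <- table(predicted = predicted, actual = actual)
print(cm)

# Accuracy: share of all cases on the diagonal (correct predictions).
accuracy <- sum(diag(cm)) / sum(cm)

# True positive rate: share of actual "yes" cases predicted as "yes".
tpr <- cm["yes", "yes"] / sum(cm[, "yes"])

# True negative rate: share of actual "no" cases predicted as "no".
tnr <- cm["no", "no"] / sum(cm[, "no"])

print(c(accuracy = accuracy, tpr = tpr, tnr = tnr))
```

Raising the threshold generally trades true positives for true negatives; an ROC curve summarizes that trade-off across all thresholds at once.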
3
Tuning Model Parameters to Improve Performance
In this chapter, you will use the train() function to tweak model parameters through cross-validation and grid search.
Random forests and wine
50 xp
Random forests vs. linear models
50 xp
Fit a random forest
100 xp
Explore a wider model space
50 xp
Advantage of a longer tune length
50 xp
Try a longer tune length
100 xp
Custom tuning grids
50 xp
Advantages of a custom tuning grid
50 xp
Fit a random forest with custom tuning
100 xp
Introducing glmnet
50 xp
Advantage of glmnet
50 xp
Make a custom trainControl
100 xp
Fit glmnet with custom trainControl
100 xp
glmnet with custom tuning grid
50 xp
Why a custom tuning grid?
50 xp
glmnet with custom trainControl and tuning
100 xp
Interpreting glmnet plots
50 xp
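
The idea behind tuning through cross-validation and grid search can be sketched by hand in base R: try each candidate value in a grid, estimate its out-of-sample RMSE with k-fold cross-validation, and keep the best one. Everything specific below is an assumption made for illustration: the mtcars data, the polynomial degree used as the tuning parameter, and the 5 folds. The course itself tunes random forest and glmnet models, where caret's train() accepts a grid through its tuneGrid argument.

```r
# Assumptions: mtcars as stand-in data, polynomial degree as the tuning parameter.
set.seed(1)

# Assign every row to one of k folds.
k <- 5
folds <- sample(rep(1:k, length.out = nrow(mtcars)))

# The tuning grid: one row per candidate parameter value.
grid <- data.frame(degree = 1:4)

# For each candidate, average the held-out RMSE across the k folds.
grid$rmse <- sapply(grid$degree, function(d) {
  model_formula <- as.formula(paste0("mpg ~ poly(hp, ", d, ")"))
  fold_rmse <- sapply(1:k, function(i) {
    train_set <- mtcars[folds != i, ]
    test_set <- mtcars[folds == i, ]
    fit <- lm(model_formula, data = train_set)
    sqrt(mean((test_set$mpg - predict(fit, test_set))^2))
  })
  mean(fold_rmse)
})

# Pick the candidate with the lowest cross-validated RMSE.
best <- grid[which.min(grid$rmse), ]
print(grid)
print(best)
```

Using the same fold assignment for every candidate keeps the comparison fair, because each value is scored on identical held-out rows.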
4
Preprocessing Data
In this chapter, you will practice using train() to preprocess data before fitting models, improving your ability to make accurate predictions.
Median imputation
50 xp
Median imputation vs. omitting rows
50 xp
Apply median imputation
100 xp
KNN imputation
50 xp
Comparing KNN imputation to median imputation
50 xp
Use KNN imputation
100 xp
Compare KNN and median imputation
50 xp
Multiple preprocessing methods
50 xp
Order of operations
50 xp
Combining preprocessing methods
100 xp
Handling low-information predictors
50 xp
Why remove near zero variance predictors?
50 xp
Remove near zero variance predictors
100 xp
preProcess() and nearZeroVar()
50 xp
Fit model on reduced blood-brain data
100 xp
Principal components analysis (PCA)
50 xp
Using PCA as an alternative to nearZeroVar()
100 xp
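
What median imputation does to a data frame can be sketched in base R: each missing value is replaced by the median of the observed values in the same column. The toy data frame and the impute_median() helper are hypothetical and exist only for this sketch. In caret the same idea is requested through the preProcess argument of train() (for example preProcess = "medianImpute"), rather than written by hand.

```r
# Hypothetical toy data with one missing value in each predictor.
df <- data.frame(
  x1 = c(1, 2, NA, 4, 100),
  x2 = c(10, NA, 30, 40, 50)
)

# Hypothetical helper: fill NAs in one column with that column's observed median.
impute_median <- function(col) {
  col[is.na(col)] <- median(col, na.rm = TRUE)
  col
}

# Apply the helper to every column and rebuild the data frame.
imputed <- as.data.frame(lapply(df, impute_median))
print(imputed)
```

The median is used instead of the mean because it is not pulled around by extreme values such as the 100 in x1. Unlike omitting rows, imputation keeps every observation available for model fitting.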
5
Selecting Models: A Case Study in Churn Prediction
In the final chapter of this course, you'll learn how to use resamples() to compare multiple models and select (or ensemble) the best one(s).
Reusing a trainControl
50 xp
Why reuse a trainControl?
50 xp
Make custom train/test indices
100 xp
Reintroducing glmnet
50 xp
glmnet as a baseline model
50 xp
Fit the baseline model
100 xp
Reintroducing random forest
50 xp
Random forest drawback
50 xp
Random forest with custom trainControl
100 xp
Comparing models
50 xp
Matching train/test indices
50 xp
Create a resamples object
100 xp
More on resamples
50 xp
Create a box-and-whisker plot
100 xp
Create a scatterplot
100 xp
Ensembling models
100 xp
Summary
50 xp

Datasets

Diamonds, Sonar, Wine, Overfit data, Breast Cancer, Blood-brain, Churn

Collaborators

Nick Carchedi

Tom Jeon

Prerequisites

Introduction to Regression in R

Zachary Deane-Mayer

VP, Data Science at DataRobot

Max Kuhn

Software Engineer at RStudio and creator of caret

Dr. Max Kuhn is a Software Engineer at RStudio. He is the author or maintainer of several R packages for predictive modeling including caret, AppliedPredictiveModeling, Cubist, C50 and SparseLDA. He routinely teaches classes in predictive modeling at Predictive Analytics World and UseR! and his publications include work on neuroscience biomarkers, drug discovery, molecular diagnostics and response surface methodology.

Don’t just take our word for it

Chin-Huai S.

9 months

I think the course is very concise and informative.

Dimitris L.

10 months

great course, excellent instructor

Danny W.

12 months

Caret is an amazing package, all there is to say about it

Pedro D.

about 1 year

great course

Andrew H.

about 1 year

Very informative and certainly I will be using caret in the future.