Course

Machine Learning with Tree-Based Models in R

Skill level: Basic

Updated 08/2023

Learn how to use tree-based models and ensembles to make classification and regression predictions with tidymodels.

R | Machine Learning | 4 hr | 16 videos | 58 exercises | 4,850 XP | 10,526 | Statement of Accomplishment

Course Description

Tree-based machine learning models can reveal complex non-linear relationships in data and often dominate machine learning competitions. In this course, you'll use the tidymodels package to explore and build different tree-based models—from simple decision trees to complex random forests. You’ll also learn to use boosted trees, a powerful machine learning technique that uses ensemble learning to build high-performing predictive models. Along the way, you'll work with health and credit risk data to predict the incidence of diabetes and customer churn.

Prerequisites

Modeling with tidymodels in R

1

Classification Trees

Ready to build a real machine learning pipeline? Complete step-by-step exercises to learn how to create decision trees, split your data, and predict which patients are most likely to suffer from diabetes. Last but not least, you’ll build performance measures to assess your models and judge your predictions.
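
As a rough illustration of the "predict and evaluate" step, the sketch below tallies a confusion matrix and accuracy for a binary outcome. The course itself works in R with tidymodels; this is a language-agnostic sketch in plain Python, and the helper names (`confusion_counts`, `accuracy`) and the labels are made up for the example.

```python
def confusion_counts(truth, predicted, positive="yes"):
    """Return (tp, fp, fn, tn) for a binary outcome."""
    tp = fp = fn = tn = 0
    for t, p in zip(truth, predicted):
        if p == positive and t == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif t == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def accuracy(truth, predicted):
    """Share of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(truth, predicted))
    return correct / len(truth)


# Hypothetical labels, not taken from the course data.
truth = ["yes", "no", "no", "yes", "no", "yes"]
predicted = ["yes", "no", "yes", "no", "no", "yes"]
tp, fp, fn, tn = confusion_counts(truth, predicted)
```

Accuracy alone can mislead when one class is rare, which is one reason the four cells of the matrix are worth reading separately.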

Welcome to the course!

Why tree-based methods?

Specify that tree

Train that model

How to grow your tree

Train/test split

Avoiding class imbalances

From zero to hero

Predict and evaluate

Make predictions

Crack the matrix

Are you predicting correctly?

2

Regression Trees and Cross-Validation

Ready for some candy? Use a chocolate rating dataset to build regression trees and assess their performance using suitable error measures. You’ll overcome statistical insecurities of single train/test splits by applying sweet techniques like cross-validation and then dive even deeper by mastering the bias-variance tradeoff.
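
To make the error measures and cross-validation concrete, here is a minimal sketch in plain Python (the course uses R with tidymodels, so none of this is the course's own code). It assumes a trivial baseline that always predicts the training mean, and it uses contiguous folds to stay deterministic; in practice rows are usually shuffled before folding. All function names are hypothetical.

```python
import math


def mae(truth, estimate):
    """Mean absolute error: every unit of error costs the same."""
    return sum(abs(t - e) for t, e in zip(truth, estimate)) / len(truth)


def rmse(truth, estimate):
    """Root mean squared error: squaring penalizes large errors more."""
    return math.sqrt(
        sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth)
    )


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds; yield (train_idx, test_idx)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def cv_rmse_mean_model(y, k):
    """Average out-of-fold RMSE of a baseline that predicts the training mean."""
    scores = []
    for train, test in k_fold_indices(len(y), k):
        mean = sum(y[i] for i in train) / len(train)
        scores.append(rmse([y[i] for i in test], [mean] * len(test)))
    return sum(scores) / len(scores)
```

With errors of 1, 1, 1 and 5, MAE is 2 while RMSE is the square root of 7 (about 2.65): the single large miss dominates the squared measure.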

Continuous outcomes

Train a regression tree

Predict new values

Inspect model output

Performance metrics for regression trees

In-sample performance

Out-of-sample performance

Bigger mistakes, bigger penalty

Cross-validation

Create the folds

Fit the folds

Evaluate the folds

Bias-variance tradeoff

Call things by their names

Adjust model complexity

In-sample and out-of-sample performance

3

Hyperparameters and Ensemble Models

Time to get serious with tuning your hyperparameters and interpreting receiver operating characteristic (ROC) curves. In this chapter, you’ll leverage the wisdom of the crowd with ensemble models like bagging or random forests and build ensembles that forecast which credit card customers are most likely to churn.
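
One way to read the area under the ROC curve is as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes AUC directly from that definition, plus specificity at a fixed threshold. It is plain Python rather than the R/tidymodels code used in the course, the function names are hypothetical, and the pairwise loop is chosen for clarity over speed.

```python
def roc_auc(truth, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    truth holds 1 (positive) or 0 (negative); ties count as half a win.
    """
    pos = [s for t, s in zip(truth, scores) if t == 1]
    neg = [s for t, s in zip(truth, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def specificity(truth, predicted):
    """True negative rate: negatives correctly predicted as negative."""
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    return tn / (tn + fp)


# Hypothetical scores, thresholded at 0.5 to get class predictions.
truth = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, independent of any single classification threshold.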

Tuning hyperparameters

Generate a tuning grid

Tune along the grid

Pick the winner

More model measures

Calculate specificity

Draw the ROC curve

Area under the ROC curve

Bagged trees

Create bagged trees

In-sample ROC and AUC

Check for overfitting

Random forest

Bagged trees vs. random forest

Variable importance

4

Boosted Trees

Ready for the high society of tree-based models? Apply gradient boosting to create powerful ensembles that perform better than anything that you have seen or built. Learn about their fine-tuning and how to compare different models to pick a winner for production.
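
The core idea of gradient boosting can be sketched in a few lines: each new tree is fitted to the residuals of the ensemble so far, and its contribution is scaled by a learning rate. The example below is a simplified illustration in plain Python, not the course's R/tidymodels code. It assumes squared-error regression on a single numeric feature with one-split trees (stumps), whereas the course applies boosting to classification; all names are hypothetical.

```python
def fit_stump(x, residuals):
    """Find the single split on x that minimizes squared error.

    Assumes x has at least two distinct values.
    """
    best = None
    # Splitting at the largest value would leave the right side empty.
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def boost(x, y, n_trees=50, learn_rate=0.1):
    """Fit stumps sequentially, each one to the current residuals."""
    base = sum(y) / len(y)  # start from the overall mean
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [
            pi + learn_rate * (left_mean if xi <= threshold else right_mean)
            for xi, pi in zip(x, preds)
        ]
    return base, learn_rate, stumps


def predict(model, xi):
    """Sum the base value and the scaled contribution of every stump."""
    base, learn_rate, stumps = model
    return base + learn_rate * sum(
        left if xi <= threshold else right for threshold, left, right in stumps
    )
```

A smaller learning rate needs more trees to reach the same fit, which is why the two are typically tuned together.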

Introduction to boosting

Bagging vs. boosting

Specify a boosted ensemble

Gradient boosting

Train a boosted ensemble

Evaluate the ensemble

Compare to a single classifier

Optimize the boosted ensemble

Tuning preparation

The actual tuning

Finalize the model

Model comparison

Compare AUC

Plot ROC curves
