Machine Learning with Tree-Based Models in Python

Rating: 4.5 (42 reviews)

Intermediate

In this course, you'll learn how to use tree-based models and ensembles for regression and classification using scikit-learn.

5 Hours, 15 Videos, 57 Exercises

86,390 Learners, Statement of Accomplishment

Course Description

Decision trees are supervised learning models used for classification and regression problems. Tree models offer high flexibility, but it comes at a price: on one hand, trees can capture complex non-linear relationships; on the other hand, they are prone to memorizing the noise present in a dataset. By aggregating the predictions of trees that are trained differently, ensemble methods take advantage of the flexibility of trees while reducing their tendency to memorize noise. Ensemble methods are used across a variety of fields and have a proven track record of winning machine learning competitions.

In this course, you'll learn how to use Python to train decision trees and tree-based models with the user-friendly scikit-learn machine learning library. You'll understand the advantages and shortcomings of trees and demonstrate how ensembling can alleviate these shortcomings, all while practicing on real-world datasets. Finally, you'll learn how to tune the most influential hyperparameters to get the most out of your models.

1. Classification and Regression Trees (Free)
Classification and Regression Trees (CART) are a set of supervised learning models used for problems involving classification and regression. In this chapter, you'll be introduced to the CART algorithm.
Decision tree for classification
50 xp
Train your first classification tree
100 xp
Evaluate the classification tree
100 xp
Logistic regression vs classification tree
100 xp
Classification tree Learning
50 xp
Growing a classification tree
50 xp
Using entropy as a criterion
100 xp
Entropy vs Gini index
100 xp
Decision tree for regression
50 xp
Train your first regression tree
100 xp
Evaluate the regression tree
100 xp
Linear regression vs regression tree
100 xp
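
As a rough illustration of the kind of exercise this chapter describes, the sketch below trains a classification tree with each of the two split criteria named above (Gini index and entropy) and compares test accuracy. It is not taken from the course: the bundled scikit-learn copy of the Wisconsin Breast Cancer data, the 80/20 split, `max_depth=4`, and the random seed are all assumptions chosen for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: use scikit-learn's bundled Wisconsin Breast Cancer data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

# Train one tree per split criterion and record its test accuracy.
scores = {}
for criterion in ("gini", "entropy"):
    tree = DecisionTreeClassifier(
        max_depth=4, criterion=criterion, random_state=1
    )
    tree.fit(X_train, y_train)
    scores[criterion] = accuracy_score(y_test, tree.predict(X_test))

print(scores)
```

On a dataset like this the two criteria usually give similar accuracy, so the choice between them tends to matter less than the depth of the tree.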
2. The Bias-Variance Tradeoff
The bias-variance tradeoff is one of the fundamental concepts in supervised machine learning. In this chapter, you'll understand how to diagnose the problems of overfitting and underfitting. You'll also be introduced to the concept of ensembling, where the predictions of several models are aggregated to produce more robust predictions.
Generalization Error
50 xp
Complexity, bias and variance
50 xp
Overfitting and underfitting
50 xp
Diagnose bias and variance problems
50 xp
Instantiate the model
100 xp
Evaluate the 10-fold CV error
100 xp
Evaluate the training error
100 xp
High bias or high variance?
50 xp
Ensemble Learning
50 xp
Define the ensemble
100 xp
Evaluate individual classifiers
100 xp
Better performance with a Voting Classifier
100 xp
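
The diagnosis step listed above (comparing the 10-fold CV error with the training error) can be sketched roughly as follows. This is an illustrative example, not the course's own code: the bundled diabetes dataset, the unconstrained regression tree, and the 70/30 split are assumptions.

```python
from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumption: a small bundled regression dataset stands in for course data.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# An unconstrained tree is free to memorize the training set.
dt = DecisionTreeRegressor(random_state=1)

# 10-fold cross-validated RMSE, computed on the training set only.
neg_mse = cross_val_score(
    dt, X_train, y_train, cv=10, scoring="neg_mean_squared_error"
)
rmse_cv = (-neg_mse.mean()) ** 0.5

# Training RMSE of the same model fitted on the full training set.
dt.fit(X_train, y_train)
rmse_train = mean_squared_error(y_train, dt.predict(X_train)) ** 0.5

print(f"CV RMSE: {rmse_cv:.2f}, training RMSE: {rmse_train:.2f}")
```

A CV error far above the training error points to high variance (overfitting). When both errors are high and close to each other, high bias (underfitting) is the more likely problem.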
3. Bagging and Random Forests
Bagging is an ensemble method involving training the same algorithm many times using different subsets sampled from the training data. In this chapter, you'll understand how bagging can be used to create a tree ensemble. You'll also learn how the random forests algorithm can lead to further ensemble diversity through randomization at the level of each split in the trees forming the ensemble.
Bagging
50 xp
Define the bagging classifier
100 xp
Evaluate Bagging performance
100 xp
Out of Bag Evaluation
50 xp
Prepare the ground
100 xp
OOB Score vs Test Set Score
100 xp
Random Forests (RF)
50 xp
Train an RF regressor
100 xp
Evaluate the RF regressor
100 xp
Visualizing features importances
100 xp
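
A minimal sketch of the ideas in this chapter, under stated assumptions: a bagging ensemble of trees evaluated with its out-of-bag (OOB) score next to a held-out test score, followed by a random forest whose feature importances are inspected. The dataset, the ensemble sizes, and the use of a classifier for the forest (the exercise list mentions a regressor) are illustrative choices, not the course's code.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: bundled Wisconsin Breast Cancer data, 70/30 split.
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, stratify=data.target,
    random_state=1,
)

# Bagging: the same tree trained on different bootstrap samples.
# The base estimator is passed positionally because its keyword name
# differs between scikit-learn versions.
bag = BaggingClassifier(
    DecisionTreeClassifier(min_samples_leaf=8, random_state=1),
    n_estimators=50,
    oob_score=True,  # score each sample with trees that never saw it
    random_state=1,
)
bag.fit(X_train, y_train)
test_acc = accuracy_score(y_test, bag.predict(X_test))
print(f"OOB accuracy: {bag.oob_score_:.3f}, test accuracy: {test_acc:.3f}")

# Random forest: adds random feature selection at each split.
rf = RandomForestClassifier(n_estimators=100, random_state=1)
rf.fit(X_train, y_train)
importances = sorted(
    zip(data.feature_names, rf.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, value in importances[:5]:
    print(f"{name}: {value:.3f}")
```

The OOB score gives an estimate of generalization performance without setting aside a separate validation set, which is why it is typically compared against the test score.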
4. Boosting
Boosting refers to an ensemble method in which several models are trained sequentially with each model learning from the errors of its predecessors. In this chapter, you'll be introduced to the two boosting methods of AdaBoost and Gradient Boosting.
Adaboost
50 xp
Define the AdaBoost classifier
100 xp
Train the AdaBoost classifier
100 xp
Evaluate the AdaBoost classifier
100 xp
Gradient Boosting (GB)
50 xp
Define the GB regressor
100 xp
Train the GB regressor
100 xp
Evaluate the GB regressor
100 xp
Stochastic Gradient Boosting (SGB)
50 xp
Regression with SGB
100 xp
Train the SGB regressor
100 xp
Evaluate the SGB regressor
100 xp
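
To make the gradient boosting part of this chapter concrete, here is a hedged sketch of stochastic gradient boosting for regression. In scikit-learn, setting `subsample` below 1.0 trains each tree on a random fraction of the rows, and `max_features` limits the features considered at each split. The dataset and every hyperparameter value below are assumptions for illustration only, and AdaBoost is not shown.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Assumption: bundled diabetes data stands in for the course datasets.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# Trees are added sequentially, each fitted to the errors left by the
# ensemble so far. subsample < 1.0 makes the procedure stochastic.
sgb = GradientBoostingRegressor(
    max_depth=2,
    n_estimators=100,
    subsample=0.8,     # fraction of rows sampled for each tree
    max_features=0.5,  # fraction of features considered per split
    random_state=1,
)
sgb.fit(X_train, y_train)
rmse_sgb = mean_squared_error(y_test, sgb.predict(X_test)) ** 0.5

# Naive baseline: always predict the mean of the training targets.
rmse_baseline = ((y_test - y_train.mean()) ** 2).mean() ** 0.5

print(f"SGB RMSE: {rmse_sgb:.2f}, baseline RMSE: {rmse_baseline:.2f}")
```

Comparing against a mean-prediction baseline is a quick sanity check that the boosted ensemble has learned something from the features.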
5. Model Tuning
The hyperparameters of a machine learning model are parameters that are not learned from data. They should be set prior to fitting the model to the training set. In this chapter, you'll learn how to tune the hyperparameters of a tree-based model using grid search with cross-validation.
Tuning a CART's Hyperparameters
50 xp
Tree hyperparameters
50 xp
Set the tree's hyperparameter grid
100 xp
Search for the optimal tree
100 xp
Evaluate the optimal tree
100 xp
Tuning a RF's Hyperparameters
50 xp
Random forests hyperparameters
50 xp
Set the hyperparameter grid of RF
100 xp
Search for the optimal forest
100 xp
Evaluate the optimal forest
100 xp
Congratulations!
50 xp
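
The tuning workflow named in this chapter (set a hyperparameter grid, search for the optimal tree, evaluate it) might look roughly like the sketch below. The grid values, the ROC AUC scoring, the 5-fold setting, and the dataset are assumptions made for illustration and are not taken from the course.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: bundled Wisconsin Breast Cancer data, 80/20 split.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

# Hypothetical grid: every combination is fitted and cross-validated.
params = {
    "max_depth": [2, 3, 4],
    "min_samples_leaf": [0.04, 0.08, 0.12],  # fractions of the samples
}

grid = GridSearchCV(
    estimator=DecisionTreeClassifier(random_state=1),
    param_grid=params,
    scoring="roc_auc",
    cv=5,
)
grid.fit(X_train, y_train)

# By default the best model is refitted on the whole training set.
best_model = grid.best_estimator_
test_auc = roc_auc_score(y_test, best_model.predict_proba(X_test)[:, 1])

print("Best hyperparameters:", grid.best_params_)
print(f"Best CV ROC AUC: {grid.best_score_:.3f}")
print(f"Test ROC AUC: {test_auc:.3f}")
```

Grid search cost grows with the product of the grid sizes, so small grids over the most influential hyperparameters are usually a sensible starting point.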

In the following tracks

Associate Data Scientist in Python

Machine Learning Scientist with Python

Supervised Machine Learning in Python

Datasets

Auto-mpg

Bike Sharing Demand

Wisconsin Breast Cancer

Indian Liver Patient

Collaborators

Kara Woo

Eunkyung Park

Sumedh Panchadhar

Prerequisites

Supervised Learning with scikit-learn

Elie Kawerk

Senior Data Scientist

Elie is a data scientist with a background in computational quantum physics. His experience encompasses several industries, including brick-and-mortar retail, e-commerce, entertainment, and quick-commerce. He uses a variety of tools and techniques, such as machine learning, experimentation, and causal inference, to drive business value. His work on a Word2vec-based recommender system has been featured on the Amazon Web Services blog. As a meetup organizer, Elie is passionate about teaching data science and mentoring new entrants to the field. Elie holds a PhD in physics from Sorbonne University.

Don’t just take our word for it

rabbit 1.

8 months

Great experience with practically applying concepts learned

Vengadesh W.

10 months

Very good

Joern B.

11 months

This course is well organized and keeps classification as well as regression in focus. If you download the files you are able to get the same results as doing it online. Never had this before!!! Thanks to Elie Kawerk who set this course up. This course is crystal clear and transparent.

Jonathan W.

12 months

Great course. I especially like that the instructor used pictures to explain the structure/flow of different kinds of models. It made things a lot clearer. One thing is that I think it would be better if the instructor talks more about in which situations we should use which kinds of models, as such a variety of models in the course may seem confusing without specifying in which scenario should we use them.

Kyaw A.

12 months

Really, really great course.

FAQs

Is this course suitable for beginners?

Will I receive a certificate at the end of the course?

Does the course cover the concept of the bias-variance trade-off?

Does the course provide an understanding of how to tune hyperparameters?

Who will benefit from this course?
