Course

Machine Learning with Tree-Based Models in Python

Skill level: Intermediate

Updated 12-2025

In this course, you'll learn how to use tree-based models and ensembles for regression and classification with scikit-learn.

Python · Machine Learning · 5 hours · 15 videos · 57 exercises · 4,650 XP · 110K+ · Statement of Accomplishment

Course Description

Decision trees are supervised learning models used for classification and regression problems. Tree-based models are highly flexible, but that flexibility comes at a price: on the one hand, trees can capture complex non-linear relationships; on the other hand, they are prone to memorizing the noise in a dataset. By combining the predictions of trees that are trained in different ways, ensemble methods exploit the flexibility of trees while reducing their tendency to memorize noise. Ensemble methods are used in many fields and have a proven track record of winning machine learning competitions. In this course, you'll learn how to use Python to train decision trees and tree-based models with the user-friendly scikit-learn machine learning library. You'll get to know the strengths and limitations of trees and see how ensembling can mitigate those limitations, all while practicing on realistic datasets. Finally, you'll also learn how to tune the most influential hyperparameters to get the most out of your models.

Prerequisites

Supervised Learning with scikit-learn

1

Classification and Regression Trees

Classification and Regression Trees (CART) are a set of supervised learning models used for problems involving classification and regression. In this chapter, you'll be introduced to the CART algorithm.
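As a rough illustration of the kind of model this chapter introduces, the sketch below trains a classification tree with scikit-learn and compares the Gini and entropy split criteria. The dataset (scikit-learn's bundled breast cancer data), the depth limit, and the random seeds are assumptions chosen for the example; the course exercises may use different data and settings.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed example data: a small binary classification set bundled with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

# Fit one shallow tree per split criterion and record its test accuracy.
accuracies = {}
for criterion in ("gini", "entropy"):
    tree = DecisionTreeClassifier(criterion=criterion, max_depth=4, random_state=1)
    tree.fit(X_train, y_train)
    accuracies[criterion] = accuracy_score(y_test, tree.predict(X_test))
    print(f"{criterion}: test accuracy = {accuracies[criterion]:.3f}")
```

On most datasets the two criteria give similar accuracy, so the choice is often a matter of convention.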

Decision tree for classification

Train your first classification tree

Evaluate the classification tree

Logistic regression vs classification tree

Classification tree learning

Growing a classification tree

Using entropy as a criterion

Entropy vs Gini index

Decision tree for regression

Train your first regression tree

Evaluate the regression tree

Linear regression vs regression tree

2

The Bias-Variance Tradeoff

The bias-variance tradeoff is one of the fundamental concepts in supervised machine learning. In this chapter, you'll understand how to diagnose the problems of overfitting and underfitting. You'll also be introduced to the concept of ensembling where the predictions of several models are aggregated to produce predictions that are more robust.
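One common way to diagnose overfitting and underfitting is to compare a model's cross-validated error with its training error. The sketch below does this for an unconstrained regression tree; the diabetes dataset, the 70/30 split, and the seeds are assumptions made for the example, not details from the course.

```python
from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumed example data: a small regression set bundled with scikit-learn.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# An unconstrained tree can grow until it memorizes the training set.
tree = DecisionTreeRegressor(random_state=1)

# 10-fold cross-validated RMSE on the training set.
# scikit-learn reports negated MSE, so flip the sign before the square root.
neg_mse = cross_val_score(
    tree, X_train, y_train, cv=10, scoring="neg_mean_squared_error"
)
rmse_cv = (-neg_mse.mean()) ** 0.5

# Training RMSE of the same model fitted on the full training set.
tree.fit(X_train, y_train)
rmse_train = mean_squared_error(y_train, tree.predict(X_train)) ** 0.5

print(f"CV RMSE:    {rmse_cv:.2f}")
print(f"Train RMSE: {rmse_train:.2f}")
```

A cross-validated error far above the training error points to high variance (overfitting); if both errors are high and close together, the model is more likely suffering from high bias (underfitting).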

Generalization Error

Complexity, bias and variance

Overfitting and underfitting

Diagnose bias and variance problems

Instantiate the model

Evaluate the 10-fold CV error

Evaluate the training error

High bias or high variance?

Ensemble Learning

Define the ensemble

Evaluate individual classifiers

Better performance with a Voting Classifier
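The ensembling idea from this chapter, aggregating the predictions of several different models, can be sketched with scikit-learn's `VotingClassifier`. The three base models, their settings, and the dataset are assumptions chosen for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumed example data bundled with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

# Three different kinds of classifier (scaling added for the distance- and
# gradient-based models, which are sensitive to feature scale).
classifiers = [
    ("logreg", make_pipeline(StandardScaler(), LogisticRegression(random_state=1))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=15))),
    ("tree", DecisionTreeClassifier(min_samples_leaf=0.1, random_state=1)),
]

# Evaluate each classifier on its own.
scores = {}
for name, clf in classifiers:
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))

# Hard voting (the default): each model casts one vote per sample.
voter = VotingClassifier(estimators=classifiers)
voter.fit(X_train, y_train)
scores["voting"] = accuracy_score(y_test, voter.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Whether the voting ensemble beats every individual model depends on the data and on how diverse the base models' errors are, so it is worth comparing the printed scores rather than assuming an improvement.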

3

Bagging and Random Forests

Bagging is an ensemble method involving training the same algorithm many times using different subsets sampled from the training data. In this chapter, you'll understand how bagging can be used to create a tree ensemble. You'll also learn how the random forests algorithm can lead to further ensemble diversity through randomization at the level of each split in the trees forming the ensemble.
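A minimal sketch of bagging with out-of-bag (OOB) evaluation follows. Each tree is trained on a bootstrap sample, and the samples a tree never saw are used to score it. The dataset, the number of trees, and the leaf-size setting are assumptions for the example. The base tree is passed positionally because the keyword for it has changed name across scikit-learn versions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed example data bundled with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

# 50 trees, each fitted on a bootstrap sample of the training set.
# oob_score=True asks scikit-learn to score every training sample using
# only the trees whose bootstrap sample did not contain it.
bag = BaggingClassifier(
    DecisionTreeClassifier(min_samples_leaf=8, random_state=1),
    n_estimators=50,
    oob_score=True,
    random_state=1,
)
bag.fit(X_train, y_train)

oob_acc = bag.oob_score_
test_acc = accuracy_score(y_test, bag.predict(X_test))
print(f"OOB accuracy:  {oob_acc:.3f}")
print(f"Test accuracy: {test_acc:.3f}")
```

The OOB score gives an estimate of generalization performance without setting aside a separate validation set, which is why it is usually compared against the test set score.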

Define the bagging classifier

Evaluate Bagging performance

Out of Bag Evaluation

Prepare the ground

OOB Score vs Test Set Score

Random Forests (RF)

Train an RF regressor

Evaluate the RF regressor

Visualizing feature importances
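The random forest lessons listed above can be sketched as follows: train a forest regressor, evaluate it with RMSE, and rank its feature importances. The diabetes dataset and the hyperparameter values are assumptions for the example, and the importances are printed as text rather than plotted to keep the sketch dependency-free.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Assumed example data bundled with scikit-learn.
data = load_diabetes()
X, y, names = data.data, data.target, data.feature_names
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=2
)

# A forest of 100 trees; each split considers a random subset of features.
rf = RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=2)
rf.fit(X_train, y_train)

rmse_test = mean_squared_error(y_test, rf.predict(X_test)) ** 0.5
print(f"Test RMSE: {rmse_test:.2f}")

# Impurity-based importances, sorted from most to least important.
ranked = sorted(
    zip(names, rf.feature_importances_), key=lambda pair: pair[1], reverse=True
)
for name, importance in ranked:
    print(f"{name:>4}: {importance:.3f}")
```

The same `ranked` list could be passed to a bar chart in a plotting library to produce the visualization the lesson title refers to.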

4

Boosting

Boosting refers to an ensemble method in which several models are trained sequentially with each model learning from the errors of its predecessors. In this chapter, you'll be introduced to the two boosting methods of AdaBoost and Gradient Boosting.
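To make the idea of sequential learners concrete, here is a minimal AdaBoost sketch that uses decision stumps (depth-1 trees) as the weak learners and scores the ensemble with ROC AUC. The dataset, the number of estimators, and the choice of metric are assumptions for the example. The base tree is passed positionally because its keyword name differs between scikit-learn versions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed example data bundled with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

# Each stump is trained on a reweighted training set that emphasizes the
# samples its predecessors misclassified.
stump = DecisionTreeClassifier(max_depth=1, random_state=1)
ada = AdaBoostClassifier(stump, n_estimators=100, random_state=1)
ada.fit(X_train, y_train)

# Probability of the positive class, scored with ROC AUC.
y_proba = ada.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, y_proba)
print(f"Test ROC AUC: {auc:.3f}")
```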

Define the AdaBoost classifier

Train the AdaBoost classifier

Evaluate the AdaBoost classifier

Gradient Boosting (GB)

Define the GB regressor

Train the GB regressor

Evaluate the GB regressor

Stochastic Gradient Boosting (SGB)

Regression with SGB

Train the SGB regressor

Evaluate the SGB regressor
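The gradient boosting lessons listed above can be sketched with a single scikit-learn class: `GradientBoostingRegressor` behaves as plain gradient boosting by default and as stochastic gradient boosting once `subsample` is set below 1.0 (row sampling) and `max_features` restricts the features considered per split. The dataset and all hyperparameter values are assumptions for the example.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Assumed example data bundled with scikit-learn.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=2
)

models = {
    # Plain gradient boosting: every tree sees all rows and all features.
    "GB": GradientBoostingRegressor(
        n_estimators=200, max_depth=3, learning_rate=0.05, random_state=2
    ),
    # Stochastic variant: each tree is fitted on 80% of the rows and
    # considers half of the features at each split.
    "SGB": GradientBoostingRegressor(
        n_estimators=200, max_depth=3, learning_rate=0.05,
        subsample=0.8, max_features=0.5, random_state=2
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
    print(f"{name}: test RMSE = {results[name]:.2f}")
```

The added randomness tends to decorrelate the trees; whether it lowers the test error on a given dataset is an empirical question.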

5

Model Tuning

The hyperparameters of a machine learning model are parameters that are not learned from data. They should be set prior to fitting the model to the training set. In this chapter, you'll learn how to tune the hyperparameters of a tree-based model using grid search cross validation.
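Grid search cross validation can be sketched with scikit-learn's `GridSearchCV`, which fits the model once per hyperparameter combination and fold, then refits the best combination on the whole training set. The grid values, the scoring metric, and the dataset below are assumptions chosen to keep the example small.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed example data bundled with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

# Hypothetical grid: 4 x 3 = 12 combinations, each scored with 5-fold CV.
param_grid = {
    "max_depth": [2, 3, 4, 5],
    "min_samples_leaf": [0.02, 0.05, 0.1],
}
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=1),
    param_grid,
    scoring="roc_auc",
    cv=5,
)
grid.fit(X_train, y_train)

best_params = grid.best_params_
best_model = grid.best_estimator_  # already refitted on the full training set
test_auc = roc_auc_score(y_test, best_model.predict_proba(X_test)[:, 1])

print(f"Best hyperparameters: {best_params}")
print(f"Best CV ROC AUC:      {grid.best_score_:.3f}")
print(f"Test ROC AUC:         {test_auc:.3f}")
```

The same pattern applies to a random forest by swapping the estimator and the grid, although the search gets slower because every combination trains a whole ensemble.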

Tuning a CART's Hyperparameters

Tree hyperparameters

Set the tree's hyperparameter grid

Search for the optimal tree

Evaluate the optimal tree

Tuning an RF's Hyperparameters

Random forest hyperparameters

Set the hyperparameter grid of RF

Search for the optimal forest

Evaluate the optimal forest

Congratulations!
