Course

Machine Learning with Tree-Based Models in Python

Intermediate skill level

Updated December 2025

In this course, you'll learn how to use tree-based models and ensembles for regression and classification with scikit-learn.

Python, Machine Learning

5 hours

15 videos

57 exercises

4,650 XP

110K+

Statement of Accomplishment


Course Description

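Decision trees are supervised learning models used for classification and regression problems. Tree models are highly flexible, but that flexibility comes at a price: on the one hand, trees can capture complex nonlinear relationships; on the other, they are prone to memorizing the noise present in a dataset. By aggregating the predictions of trees that are trained differently, ensemble methods keep the flexibility of trees while reducing their tendency to overfit noise. Ensemble methods are used across a variety of fields and have a proven track record in many machine learning competitions. In this course, you'll learn how to use Python to train decision trees and tree-based models with the user-friendly scikit-learn machine learning library. You'll understand the advantages and shortcomings of trees and see how ensembling alleviates those shortcomings, all while practicing on real-world datasets. Finally, you'll learn how to tune the most influential hyperparameters to get the most out of your models.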

Prerequisites

Supervised Learning with scikit-learn

1

Classification and Regression Trees

Classification and Regression Trees (CART) are a set of supervised learning models used for problems involving classification and regression. In this chapter, you'll be introduced to the CART algorithm.

Decision tree for classification

Train your first classification tree

Evaluate the classification tree

Logistic regression vs classification tree

Classification tree learning

Growing a classification tree

Using entropy as a criterion

Entropy vs Gini index

Decision tree for regression

Train your first regression tree

Evaluate the regression tree

Linear regression vs regression tree
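As a rough illustration of what this chapter covers, the sketch below fits two classification trees that differ only in their impurity criterion (Gini index vs entropy) and compares their test accuracy. The dataset (scikit-learn's bundled breast cancer data), the `max_depth=4` cap, and the train/test split are assumptions chosen for illustration, not the course's own exercise setup.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative dataset bundled with scikit-learn (an assumption, not the course data).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

# Fit one tree per impurity criterion; max_depth=4 is an arbitrary illustrative cap.
scores = {}
for criterion in ("gini", "entropy"):
    tree = DecisionTreeClassifier(max_depth=4, criterion=criterion, random_state=1)
    tree.fit(X_train, y_train)
    scores[criterion] = accuracy_score(y_test, tree.predict(X_test))

print(scores)
```

On many datasets the two criteria give similar accuracy, so the choice is often a matter of convention rather than performance.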

2

The Bias-Variance Tradeoff

The bias-variance tradeoff is one of the fundamental concepts in supervised machine learning. In this chapter, you'll understand how to diagnose the problems of overfitting and underfitting. You'll also be introduced to the concept of ensembling where the predictions of several models are aggregated to produce predictions that are more robust.

Generalization Error

Complexity, bias and variance

Overfitting and underfitting

Diagnose bias and variance problems

Instantiate the model

Evaluate the 10-fold CV error

Evaluate the training error

High bias or high variance?

Ensemble Learning

Define the ensemble

Evaluate individual classifiers

Better performance with a Voting Classifier
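One common way to diagnose a variance problem, sketched here under assumed settings, is to compare a model's cross-validated error with its training error. The example uses an unconstrained regression tree on scikit-learn's bundled diabetes data; the dataset, the split, and the variable names are illustrative assumptions rather than the course's exercises.

```python
from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Illustrative dataset bundled with scikit-learn (an assumption).
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# An unconstrained tree can grow until it fits the training set almost perfectly.
dt = DecisionTreeRegressor(random_state=1)

# 10-fold CV on the training set; scikit-learn reports negated MSE for this scorer.
neg_mse = cross_val_score(
    dt, X_train, y_train, cv=10, scoring="neg_mean_squared_error"
)
rmse_cv = (-neg_mse.mean()) ** 0.5

# Training error of the same model fit on the full training set.
dt.fit(X_train, y_train)
rmse_train = mean_squared_error(y_train, dt.predict(X_train)) ** 0.5

print(f"CV RMSE: {rmse_cv:.2f}  training RMSE: {rmse_train:.2f}")
```

A cross-validated error far above the training error suggests high variance (overfitting); if both errors are high and close together, the model more likely suffers from high bias.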

3

Bagging and Random Forests

Bagging is an ensemble method involving training the same algorithm many times using different subsets sampled from the training data. In this chapter, you'll understand how bagging can be used to create a tree ensemble. You'll also learn how the random forests algorithm can lead to further ensemble diversity through randomization at the level of each split in the trees forming the ensemble.

Define the bagging classifier

Evaluate Bagging performance

Out of Bag Evaluation

Prepare the ground

OOB Score vs Test Set Score

Random Forests (RF)

Train an RF regressor

Evaluate the RF regressor

Visualizing feature importances
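The following is a minimal sketch of bagging with out-of-bag (OOB) evaluation next to a random forest, assuming scikit-learn's bundled breast cancer data and 100 trees per ensemble. These settings, and the use of classifiers for both ensembles, are illustrative assumptions; the chapter itself trains an RF regressor.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative dataset bundled with scikit-learn (an assumption).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

# Bagging: every tree is fit on a bootstrap sample of the training rows.
# The base tree is passed positionally because its keyword name differs
# between scikit-learn versions.
bag = BaggingClassifier(
    DecisionTreeClassifier(random_state=1),
    n_estimators=100,
    oob_score=True,  # score each row using only trees that did not see it
    random_state=1,
)
bag.fit(X_train, y_train)
bag_test_acc = accuracy_score(y_test, bag.predict(X_test))

# Random forest: bootstrap samples plus a random feature subset at each split.
rf = RandomForestClassifier(
    n_estimators=100, max_features="sqrt", oob_score=True, random_state=1
)
rf.fit(X_train, y_train)
rf_test_acc = accuracy_score(y_test, rf.predict(X_test))

print(f"Bagging  OOB: {bag.oob_score_:.3f}  test: {bag_test_acc:.3f}")
print(f"Forest   OOB: {rf.oob_score_:.3f}  test: {rf_test_acc:.3f}")
```

The OOB score gives an estimate of generalization accuracy without setting aside a separate validation set, and `rf.feature_importances_` holds the per-feature importances that can then be plotted.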

4

Boosting

Boosting refers to an ensemble method in which several models are trained sequentially with each model learning from the errors of its predecessors. In this chapter, you'll be introduced to the two boosting methods of AdaBoost and Gradient Boosting.

Define the AdaBoost classifier

Train the AdaBoost classifier

Evaluate the AdaBoost classifier

Gradient Boosting (GB)

Define the GB regressor

Train the GB regressor

Evaluate the GB regressor

Stochastic Gradient Boosting (SGB)

Regression with SGB

Train the SGB regressor

Evaluate the SGB regressor
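To make the three boosting variants in this chapter concrete, the sketch below trains AdaBoost, gradient boosting, and stochastic gradient boosting on the same data and reports test ROC AUC. The dataset, the number of estimators, the tree depths, and the `subsample=0.8` setting are all assumptions for illustration; the chapter's own GB and SGB exercises use regressors.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative dataset bundled with scikit-learn (an assumption).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

models = {
    # AdaBoost: reweights training rows so later stumps focus on earlier mistakes.
    "adaboost": AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=1), n_estimators=100, random_state=1
    ),
    # Gradient boosting: each tree is fit to the gradient of the loss so far.
    "gb": GradientBoostingClassifier(
        n_estimators=100, max_depth=2, random_state=1
    ),
    # Stochastic gradient boosting: each tree sees a random 80% of the rows.
    "sgb": GradientBoostingClassifier(
        n_estimators=100, max_depth=2, subsample=0.8, random_state=1
    ),
}

auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Probability of the positive class is what ROC AUC ranks on.
    auc[name] = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

print(auc)
```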

5

Model Tuning

The hyperparameters of a machine learning model are parameters that are not learned from data. They should be set prior to fitting the model to the training set. In this chapter, you'll learn how to tune the hyperparameters of a tree-based model using grid search cross validation.

Tuning a CART's Hyperparameters

Tree hyperparameters

Set the tree's hyperparameter grid

Search for the optimal tree

Evaluate the optimal tree

Tuning an RF's Hyperparameters

Random forests hyperparameters

Set the hyperparameter grid of RF

Search for the optimal forest

Evaluate the optimal forest

Congratulations!
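Grid search cross validation, as described in this chapter, can be sketched roughly as follows for a single classification tree. The grid values, the ROC AUC scoring choice, the 5-fold setting, and the dataset are assumptions for illustration, not the course's exact configuration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative dataset bundled with scikit-learn (an assumption).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

# Hypothetical grid: tree depth and minimum leaf size (as a fraction of rows).
param_grid = {
    "max_depth": [2, 3, 4, 5],
    "min_samples_leaf": [0.02, 0.05, 0.1],
}

# Every grid combination is scored with 5-fold CV on the training set only.
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=1),
    param_grid=param_grid,
    scoring="roc_auc",
    cv=5,
)
grid.fit(X_train, y_train)

# With the default refit=True, best_estimator_ is retrained on all training rows.
best_tree = grid.best_estimator_
test_auc = roc_auc_score(y_test, best_tree.predict_proba(X_test)[:, 1])

print(grid.best_params_, round(grid.best_score_, 3), round(test_auc, 3))
```

The same pattern applies to a random forest by swapping in the estimator and a grid over forest hyperparameters such as `n_estimators` and `max_features`, at a higher computational cost.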
