Course

Winning a Kaggle Competition in Python

Advanced skill level

Updated May 2026

Learn how to approach Kaggle competitions and the strategies that lead to winning them.

Python, Machine Learning

4 hours

16 videos

52 exercises

4,200 XP

21,642

Statement of Accomplishment

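Course Description

Kaggle is the best-known platform for Data Science competitions. Taking part in these competitions lets you work with real-world datasets, explore a variety of Machine Learning problems, compete with other participants and, most importantly, gain valuable hands-on experience. In this course, you will learn how to approach and structure any Data Science competition. You will learn to choose the correct local validation scheme and to avoid overfitting. You will also master advanced feature engineering and model ensembling techniques. All of these techniques are practiced on Kaggle competition datasets.

Prerequisites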

Extreme Gradient Boosting with XGBoost

1

Kaggle competitions process

In this first chapter, you will be introduced to the Kaggle competition process. You will train a model and prepare a CSV file ready for submission. You will learn the difference between the Public and Private test splits, and how to prevent overfitting.

Competitions overview

Explore train data

Explore test data

Prepare your first submission

Determine a problem type

Train a simple model

Prepare a submission

Public vs Private leaderboard

What model is overfitting?

Train XGBoost models

Explore overfitting XGBoost
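
To make the "train a simple model, then prepare a submission" workflow from this chapter concrete, here is a minimal sketch. It is not the course's own code: the data is synthetic, the column names `id`, `feature` and `target` are assumptions (a real competition defines them in its sample submission file), and a scikit-learn `LinearRegression` stands in for whatever model the exercises use.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for competition data (column names are assumptions).
rng = np.random.default_rng(0)
train = pd.DataFrame({"id": range(100), "feature": rng.normal(size=100)})
train["target"] = 3.0 * train["feature"] + rng.normal(scale=0.1, size=100)
test = pd.DataFrame({"id": range(100, 120), "feature": rng.normal(size=20)})

# Train a simple model on the labeled training data.
model = LinearRegression().fit(train[["feature"]], train["target"])

# A submission is one row per test id with the predicted target.
submission = pd.DataFrame({
    "id": test["id"],
    "target": model.predict(test[["feature"]]),
})

# index=False keeps the pandas row index out of the file.
# Pass a path such as "submission.csv" to write to disk instead.
submission_csv = submission.to_csv(index=False)
```

The one detail that most often breaks a first submission is the stray index column, which is why `index=False` is set explicitly.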

2

Dive into the Competition

Now that you know the basics of Kaggle competitions, you will learn how to study the specific problem at hand. You will practice exploratory data analysis (EDA) and establish a correct local validation strategy. You will also learn about data leakage.

Understand the problem

Understand the problem type

Define a competition metric

Initial EDA

EDA statistics

EDA plots I

EDA plots II

Local validation

K-fold cross-validation

Stratified K-fold

Validation usage

Time K-fold

Overall validation score
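
The validation schemes listed in this chapter can be sketched with scikit-learn's splitters. This is an illustrative sketch on synthetic data, not the course's exercise code: `StratifiedKFold` keeps the class ratio in every fold, and `TimeSeriesSplit` is used here as one possible way to realize "time K-fold", where each validation fold lies strictly after its training data.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, TimeSeriesSplit

# Synthetic imbalanced binary target: 80 negatives, 20 positives.
y = np.array([0] * 80 + [1] * 20)
X = np.arange(len(y)).reshape(-1, 1)  # rows assumed to be in time order

# Stratified K-fold: every validation fold keeps the 20% positive rate.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
positive_rates = [y[val_idx].mean() for _, val_idx in skf.split(X, y)]

# Time-based split: training rows always precede validation rows,
# so the model is never validated on data older than what it saw.
tss = TimeSeriesSplit(n_splits=4)
time_folds = [
    (train_idx.max(), val_idx.min()) for train_idx, val_idx in tss.split(X)
]
```

Shuffled splits on time-ordered data are a common source of leakage, which is why the time-based splitter never shuffles.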

3

Feature Engineering

You will now work with different types of features. You will modify existing features and create new ones. You will also learn how to find and impute missing data.

Feature engineering

Arithmetical features

Date features

Categorical features

Label encoding

One-Hot encoding

Target encoding

Mean target encoding

K-fold cross-validation

Beyond binary classification

Missing data

Find missing data

Impute missing data
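
Mean target encoding combined with K-fold cross-validation, as listed above, can be sketched as follows. This is a hedged illustration, not the course's implementation: the function name `oof_mean_target_encode`, the smoothing parameter `alpha`, and the toy data are all assumptions. Each row is encoded using target statistics from the other folds only, so a row's own target never leaks into its feature.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

def oof_mean_target_encode(train, cat_col, target_col,
                           n_splits=5, alpha=5.0, seed=0):
    """Out-of-fold smoothed mean target encoding (hypothetical helper)."""
    encoded = pd.Series(np.nan, index=train.index, dtype=float)
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for fit_idx, enc_idx in kf.split(train):
        fit_part = train.iloc[fit_idx]
        prior = fit_part[target_col].mean()  # fallback and smoothing prior
        stats = fit_part.groupby(cat_col)[target_col].agg(["sum", "count"])
        # Shrink each category mean toward the prior; alpha sets the strength.
        smoothed = (stats["sum"] + alpha * prior) / (stats["count"] + alpha)
        values = train.iloc[enc_idx][cat_col].map(smoothed)
        # Categories unseen in the fitting folds fall back to the prior.
        encoded.iloc[enc_idx] = values.fillna(prior).to_numpy()
    return encoded

# Toy data with a binary target (purely illustrative).
toy = pd.DataFrame({
    "city": ["a", "a", "b", "b", "b", "c", "a", "c", "b", "a"],
    "target": [1, 0, 1, 1, 0, 0, 1, 0, 1, 1],
})
toy["city_enc"] = oof_mean_target_encode(toy, "city", "target")
```

For the test set, the usual choice is to fit the same statistics once on the full training data; that step is omitted here to keep the sketch short.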

4

Modeling

Time to bring everything together and build some models! In this last chapter, you will build a base model before tuning some hyperparameters and improving your results with ensembles. You will then get some final tips and tricks to help you compete more efficiently.

Baseline model

Replicate validation score

Baseline based on the date

Baseline based on the gradient boosting

Hyperparameter tuning

Grid search

2D grid search

Model ensembling

Model blending

Model stacking I

Model stacking II

Testing Kaggle forum ideas

Select final submissions

Final thoughts
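
Model blending, the simplest of the ensembling techniques in this chapter, can be sketched as an average of two models' predictions. The predictions below are synthetic stand-ins (the true target plus independent noise), not outputs of real models, and the equal 50/50 weights are an assumption.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

rng = np.random.default_rng(42)
y_true = rng.normal(size=500)

# Synthetic "model" predictions: truth plus independent noise.
pred_a = y_true + rng.normal(scale=0.5, size=500)
pred_b = y_true + rng.normal(scale=0.5, size=500)

# Blend: simple arithmetic mean of the two prediction vectors.
blend = (pred_a + pred_b) / 2

rmse_a = rmse(y_true, pred_a)
rmse_b = rmse(y_true, pred_b)
rmse_blend = rmse(y_true, blend)
```

Because squared error is convex, the blend's RMSE can never exceed that of the worse input model, and it improves most when the models' errors are weakly correlated. Stacking goes one step further by training a second-level model on out-of-fold predictions instead of using fixed weights.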
