
Extreme Gradient Boosting with XGBoost

Learn the fundamentals of gradient boosting and build state-of-the-art machine learning models using XGBoost to solve classification and regression problems.

4 Hours · 16 Videos · 49 Exercises · 35,461 Learners · 3,750 XP


Course Description

Do you know the basics of supervised learning and want to use state-of-the-art models on real-world datasets? Gradient boosting is currently one of the most popular techniques for efficient modeling of tabular datasets of all sizes. XGBoost is a very fast, scalable implementation of gradient boosting; models built with XGBoost regularly win online data science competitions and are used at scale across many industries. In this course, you'll learn how to use this powerful library alongside pandas and scikit-learn to build and tune supervised learning models. You'll work with real-world datasets to solve classification and regression problems.

  1.

    Classification with XGBoost


    This chapter will introduce you to the fundamental idea behind XGBoost: boosted learners. Once you understand how XGBoost works, you'll apply it to solve a common classification problem found in industry, customer churn: predicting whether a customer will stop doing business with a company at some point in the future.

    Play Chapter Now
    Welcome to the course!
    50 xp
    Which of these is a classification problem?
    50 xp
    Which of these is a binary classification problem?
    50 xp
    Introducing XGBoost
    50 xp
    XGBoost: Fit/Predict
    100 xp
    What is a decision tree?
    50 xp
    Decision trees
    100 xp
    What is Boosting?
    50 xp
    Measuring accuracy
    100 xp
    Measuring AUC
    100 xp
    When should I use XGBoost?
    50 xp
    Using XGBoost
    50 xp
  2.

    Regression with XGBoost

    After a brief review of supervised regression, you'll apply XGBoost to the regression task of predicting house prices in Ames, Iowa. You'll learn about the two kinds of base learners that XGBoost can use as its weak learners, and review how to evaluate the quality of your regression models.

    Play Chapter Now

In the following tracks

Machine Learning Scientist


Hugo Bowne-Anderson, Yashas Roy

Sergey Fogelson

VP of Analytics and Measurement Sciences, Viacom

Sergey loves applying his quantitative skills to large-scale, data-intensive problems and mentoring junior colleagues. An avid learner, he is always refining his programming chops and applying state-of-the-art analytical and statistical methods to hard data problems. He began his career as an academic at Dartmouth College in Hanover, New Hampshire, where he researched the neural bases of visual category learning and obtained his Ph.D. in Cognitive Neuroscience. After leaving academia, Sergey joined the rapidly growing startup scene in the NYC metro area, where he has worked as a data scientist in digital advertising, cybersecurity, finance, and media. He is heavily involved in the NYC-area teaching community: he has taught courses at various bootcamps and has been a volunteer computer science teacher through TEALSK12. When Sergey is not working or teaching, he is probably hiking. (He thru-hiked the Appalachian Trail before graduate school.)

What do other learners have to say?

I've used other sites—Coursera, Udacity, things like that—but DataCamp's been the one that I've stuck with.

Devon Edwards Joseph
Lloyds Banking Group

DataCamp is the top resource I recommend for learning data science.

Louis Maiden
Harvard Business School

DataCamp is by far my favorite website to learn from.

Ronald Bowers
Decision Science Analytics, USAA