
Machine Learning with scikit-learn

Grow your machine learning skills with scikit-learn in Python. Use real-world datasets in this interactive course and learn how to make powerful predictions!

4 Hours, 15 Videos, 49 Exercises, 4050 XP


Course Description

Grow your machine learning skills with scikit-learn and discover how to use this popular Python library to train models using labeled data. In this course, you'll learn how to make powerful predictions, such as whether a customer will churn from your business, whether an individual has diabetes, and even how to classify the genre of a song. Using real-world datasets, you'll find out how to build predictive models, tune their parameters, and determine how well they will perform with unseen data.

  1.

    In this chapter, you'll be introduced to classification problems and learn how to solve them using supervised learning techniques. You'll learn how to split data into training and test sets, fit a model, make predictions, and evaluate accuracy. You’ll discover the relationship between model complexity and performance, applying what you learn to a churn dataset, where you will classify the churn status of a telecom company's customers.

    Machine learning with scikit-learn (50 xp)
    Binary classification (50 xp)
    The supervised learning workflow (100 xp)
    The classification challenge (50 xp)
    k-Nearest Neighbors: Fit (100 xp)
    k-Nearest Neighbors: Predict (100 xp)
    Measuring model performance (50 xp)
    Train/test split + computing accuracy (100 xp)
    Overfitting and underfitting (100 xp)
    Visualizing model complexity (100 xp)
  2.

    In this chapter, you will be introduced to regression, and build models to predict sales values using a dataset on advertising expenditure. You will learn about the mechanics of linear regression and common performance metrics such as R-squared and root mean squared error. You will perform k-fold cross-validation, and apply regularization to regression models to reduce the risk of overfitting.

  3. Fine-tuning your model

    Having trained models, you will now learn how to evaluate them. In this chapter, you will be introduced to several metrics along with a visualization technique for analyzing classification model performance using scikit-learn. You will also learn how to optimize classification and regression models through the use of hyperparameter tuning.

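The classification workflow from the first chapter (split the data, fit a k-Nearest Neighbors model, predict, and measure accuracy) might look roughly like the sketch below. The course's telecom churn data is not reproduced here, so the example assumes a synthetic dataset from `make_classification` as a stand-in; the feature counts and the values of `k` are illustrative choices, not taken from the course.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: synthetic binary-labeled data standing in for a churn table.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, random_state=42
)

# Hold out 30% of the rows as a test set, keeping class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Fit a k-Nearest Neighbors classifier and predict on unseen rows.
knn = KNeighborsClassifier(n_neighbors=6)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# For classifiers, score() returns accuracy: the fraction of correct predictions.
test_accuracy = knn.score(X_test, y_test)
print(f"Test accuracy with k=6: {test_accuracy:.3f}")

# Model complexity: small k gives a more complex model, large k a simpler one.
accuracies = {}
for k in (1, 5, 15, 45):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    accuracies[k] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"k={k:>2}: train={accuracies[k][0]:.3f}, test={accuracies[k][1]:.3f}")
```

Comparing the training and test accuracy across values of `k` is one simple way to see overfitting (high training accuracy, lower test accuracy) and underfitting (both low).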

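For the regression chapter, a minimal sketch of linear regression, R-squared, root mean squared error, k-fold cross-validation, and a regularized model could look like the following. The advertising dataset itself is not included, so the example assumes made-up spend columns and a made-up linear relationship with noise; the coefficients, the ridge `alpha`, and the number of folds are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# Assumption: three hypothetical advertising-spend columns and a noisy
# linear sales target, standing in for the course's dataset.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(200, 3))
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 10, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Ordinary least squares fit and predictions on the held-out rows.
reg = LinearRegression().fit(X_train, y_train)
y_pred = reg.predict(X_test)

# For regressors, score() returns R-squared; RMSE is the square root of MSE.
r_squared = reg.score(X_test, y_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"R^2: {r_squared:.3f}, RMSE: {rmse:.2f}")

# 5-fold cross-validation: one R-squared value per fold.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(LinearRegression(), X, y, cv=kf)
print(f"CV R^2 mean: {cv_scores.mean():.3f}, std: {cv_scores.std():.3f}")

# Ridge regression adds an L2 penalty that shrinks large coefficients.
ridge = Ridge(alpha=1.0).fit(X_train, y_train)
ridge_r_squared = ridge.score(X_test, y_test)
print(f"Ridge R^2: {ridge_r_squared:.3f}")
```

Cross-validation reports a spread of scores rather than a single number, which makes the estimate less dependent on one particular train/test split.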

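The evaluation and tuning ideas from the third chapter can be sketched with a confusion matrix, a classification report, and a grid search over one hyperparameter. This is an illustration under assumptions: the data is synthetic, the choice of k-Nearest Neighbors and the candidate values in `param_grid` are hypothetical, and the chapter's visualization technique is not reproduced here.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: synthetic binary-labeled data in place of a real dataset.
X, y = make_classification(n_samples=400, n_features=8, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1, stratify=y
)

# Hyperparameter tuning: try each candidate n_neighbors with 5-fold CV
# on the training set only, so the test set stays unseen.
param_grid = {"n_neighbors": [3, 5, 7, 9, 11]}
kf = KFold(n_splits=5, shuffle=True, random_state=1)
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=kf)
search.fit(X_train, y_train)
print("Best parameters:", search.best_params_)
print(f"Best CV accuracy: {search.best_score_:.3f}")

# After fitting, the search object predicts with the best model it found.
y_pred = search.predict(X_test)

# Rows of the confusion matrix are true classes, columns are predictions.
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Precision, recall, and F1-score per class.
print(classification_report(y_test, y_pred))
```

Metrics such as precision and recall are often more informative than accuracy alone when the classes are imbalanced.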
Advertising and Sales, Diabetes, Telecom Churn, Music


James Chapman, Amy Peterson, Izzy Weber

George Boorman

Core Curriculum Manager, DataCamp

George is a Core Curriculum Manager at DataCamp. He has experience in project management across public health, applied research, and not-for-profit sectors. George is passionate about health technologies and all things data science.

What do other learners have to say?

I've used other sites—Coursera, Udacity, things like that—but DataCamp's been the one that I've stuck with.

Devon Edwards Joseph
Lloyds Banking Group

DataCamp is the top resource I recommend for learning data science.

Louis Maiden
Harvard Business School

DataCamp is by far my favorite website to learn from.

Ronald Bowers
Decision Science Analytics, USAA