Emilien van Hoecke has completed

Machine Learning with scikit-learn

4 hr
4,200 XP
Statement of Accomplishment Badge

Course Description

Machine learning is the field that teaches computers to learn from existing data in order to make predictions on new data: Will a tumor be benign or malignant? Which of your customers will take their business elsewhere? Is a particular email spam? In this course, you'll learn how to use Python to perform supervised learning, an essential component of machine learning. You'll learn how to build predictive models, tune their parameters, and determine how well they will perform on unseen data, all while using real-world datasets. You'll be using scikit-learn, one of the most popular and user-friendly machine learning libraries for Python.
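
The workflow described above (fit on existing data, predict on unseen data, measure how often the predictions are right) can be sketched in a few lines. The example below is a minimal, standard-library-only illustration of the k-Nearest Neighbors idea that Chapter 1 covers; it is not the course's code. The course itself uses scikit-learn, which provides `KNeighborsClassifier` and `train_test_split` for these steps. The toy data, the hand-picked test indices, and the helper names `knn_predict` and `accuracy` are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, point, k=3):
    """Predict a label for `point` by majority vote among its k nearest training points."""
    # Order the training samples by Euclidean distance to the query point.
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], point),
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of held-out points whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y
        for x, y in zip(test_X, test_y)
    )
    return correct / len(test_X)


# Hypothetical toy data: two features per sample, two classes.
X = [(1.0, 1.2), (0.8, 0.9), (1.1, 0.7), (1.3, 1.0),
     (3.0, 3.2), (2.9, 3.1), (3.3, 2.8), (3.1, 3.4)]
y = ["a", "a", "a", "a", "b", "b", "b", "b"]

# Hold out two samples as "unseen" data; the model never sees them while fitting.
test_idx = {3, 7}
train_X = [x for i, x in enumerate(X) if i not in test_idx]
train_y = [label for i, label in enumerate(y) if i not in test_idx]
test_X = [x for i, x in enumerate(X) if i in test_idx]
test_y = [label for i, label in enumerate(y) if i in test_idx]

test_accuracy = accuracy(train_X, train_y, test_X, test_y, k=3)
print(f"Accuracy on held-out data: {test_accuracy:.2f}")
```

Scoring the model only on samples it was not fitted on is what makes the accuracy figure an estimate of performance on new data rather than a measure of memorization.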
  1. Classification

    Free

    In this chapter, you will be introduced to classification problems and learn how to solve them using supervised learning techniques. You'll then apply what you learn to a political dataset, classifying the party affiliation of United States congressmen based on their voting records.

    Supervised learning (50 XP)
    Which of these is a classification problem? (50 XP)
    Exploratory data analysis (50 XP)
    Numerical EDA (50 XP)
    Visual EDA (50 XP)
    The classification challenge (50 XP)
    k-Nearest Neighbors: Fit (100 XP)
    k-Nearest Neighbors: Predict (100 XP)
    Measuring model performance (50 XP)
    The digits recognition dataset (100 XP)
    Train/Test Split + Fit/Predict/Accuracy (100 XP)
    Overfitting and underfitting (100 XP)
  2. Regression

    In the previous chapter, you used image and political datasets to predict binary and multiclass outcomes. But what if your problem requires a continuous outcome? Regression is best suited to solving such problems. You will learn about fundamental concepts in regression and apply them to predict the life expectancy in a given country using Gapminder data.

  3. Fine-tuning your model

    Having trained your model, your next task is to evaluate its performance. In this chapter, you will learn about some of the other metrics available in scikit-learn that allow you to assess your model's performance in a more nuanced manner. You will then learn to optimize your classification and regression models using hyperparameter tuning.
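
As a rough illustration of what hyperparameter tuning means in Chapter 3, the sketch below tries several values of k for a k-Nearest Neighbors classifier and keeps the one with the best cross-validated accuracy. It uses only the Python standard library and is an assumption-laden toy, not the course's method: scikit-learn ships tools such as `GridSearchCV` for this, while the data, the interleaved fold scheme, and the helper names `knn_predict` and `cross_val_accuracy` here are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, point, k):
    """Majority vote among the k training samples closest to `point`."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(samples, k, n_folds=4):
    """Mean held-out accuracy of k-NN over `n_folds` interleaved folds."""
    scores = []
    for fold in range(n_folds):
        # Every n_folds-th sample (starting at `fold`) is held out for testing.
        test = samples[fold::n_folds]
        train = [s for i, s in enumerate(samples) if i % n_folds != fold]
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / n_folds


# Hypothetical toy data: (features, label) pairs with some class overlap.
samples = [
    ((1.0, 1.1), "a"), ((3.0, 3.1), "b"), ((0.9, 1.4), "a"), ((2.8, 3.3), "b"),
    ((1.2, 0.8), "a"), ((3.2, 2.9), "b"), ((2.1, 2.0), "a"), ((1.9, 2.2), "b"),
    ((0.7, 1.0), "a"), ((3.4, 3.0), "b"), ((1.4, 1.3), "a"), ((2.9, 2.6), "b"),
]

# The "grid" of hyperparameter values to try (odd values avoid tied votes
# when there are only two classes).
candidate_ks = [1, 3, 5]
cv_scores = {k: cross_val_accuracy(samples, k) for k in candidate_ks}
best_k = max(cv_scores, key=cv_scores.get)

for k, score in cv_scores.items():
    print(f"k={k}: mean cross-validated accuracy {score:.2f}")
print(f"Selected k: {best_k}")
```

Averaging the score over several folds, rather than relying on a single train/test split, makes the choice of k less sensitive to which samples happened to be held out.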


Datasets

Automobile miles per gallon
Boston housing
Diabetes
Gapminder
US Congressional Voting Records (1984)
White wine quality
Red wine quality

Collaborators

Yashas Roy

Hugo Bowne-Anderson

Data Scientist

Hugo is a data scientist, educator, writer, and podcaster formerly at DataCamp. His main interests are promoting data and AI literacy, helping to spread data skills through organizations and society, and doing amateur stand-up comedy in NYC. If you want to know what he likes to talk about, check out DataFramed, the DataCamp podcast, which he hosted and produced.