Free Course

Preprocessing for Machine Learning in Python

Skill Level: Intermediate
4.7+
407 reviews
Updated 12/2025
Learn how to clean and prepare your data for machine learning!

Included for Free

Python, Machine Learning
4 hr
20 videos
62 Exercises
4,700 XP
66,197
Statement of Accomplishment

Course Description

This course covers the basics of how and when to perform data preprocessing, the essential step in any machine learning project where you get your data ready for modeling. Preprocessing comes into play after you import and clean your data and before you fit your machine learning model. You'll learn how to standardize your data so that it's in the right form for your model, create new features that make the most of the information in your dataset, and select the features that best improve your model fit. Finally, you'll practice preprocessing by getting a dataset on UFO sightings ready for modeling.

Prerequisites

Cleaning Data in Python, Supervised Learning with scikit-learn
1

Introduction to Data Preprocessing

In this chapter you'll learn exactly what it means to preprocess data. You'll take the first steps in any preprocessing journey, including exploring data types and dealing with missing data.
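As a rough illustration of those first steps, the sketch below inspects column types and handles missing values with pandas. The toy DataFrame and its column names are hypothetical and are not taken from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
df = pd.DataFrame({
    "duration": ["12", "7", None, "30"],   # numbers stored as strings
    "state": ["ca", "tx", "ny", None],
    "rating": [4.0, np.nan, 3.5, 5.0],
})

# Inspect the type pandas inferred for each column.
print(df.dtypes)

# Count missing values per column.
missing_counts = df.isna().sum()
print(missing_counts)

# Drop rows that are missing a value in the columns we care about.
cleaned = df.dropna(subset=["duration", "state"]).copy()

# Convert the string column to a numeric type.
cleaned["duration"] = pd.to_numeric(cleaned["duration"])
```

Dropping rows is only one option. Depending on the dataset, filling missing values (for example with `fillna`) may preserve more information.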
2

Standardizing Data

This chapter is all about standardizing data. Models often make assumptions about the distribution or scale of your features. Standardization transforms your data to fit these assumptions, which can improve the algorithm's performance.
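A minimal sketch of two common standardization approaches, assuming scikit-learn's `StandardScaler` for feature scaling and a NumPy log transform for a high-variance column. The array values are made up for illustration, and the course may present these techniques differently.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: the second column is on a much larger
# scale and has far higher variance than the first.
X = np.array([
    [1.0, 1000.0],
    [2.0, 2000.0],
    [3.0, 3000.0],
    [4.0, 100000.0],
])

# Scaling: center each column at 0 and rescale it to unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Log normalization: compress the range of the high-variance column.
X_log = np.log(X[:, 1])

print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
print(X[:, 1].var(), X_log.var())
```

In practice, the scaler is usually fit on the training split only and then applied to the test split with `transform`, so that information from the test data does not leak into the model.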
4

Selecting Features for Modeling

Earn Statement of Accomplishment

Add this credential to your LinkedIn profile, resume, or CV
Share it on social media and in your performance review

Don’t just take our word for it

4.7 from 407 reviews
Rating distribution (5 stars to 1 star): 80%, 20%, 0%, 0%, 0%

FAQs

Is this course suitable for beginners in machine learning?

No. This is an intermediate course with several prerequisites, including pandas, scikit-learn, and statistics. You should have prior supervised learning experience.

What preprocessing techniques does this course cover?

You will learn data standardization, feature creation, feature selection, and how to handle missing data to prepare datasets for machine learning models.
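To make feature creation and feature selection concrete, here is a small sketch under stated assumptions: one-hot encoding with `pandas.get_dummies` stands in for feature creation, and dropping one column from each highly correlated pair stands in for feature selection. The data, the column names, and the 0.95 threshold are all hypothetical choices, not the course's.

```python
import pandas as pd

# Hypothetical toy data; length_cm duplicates the information in length_m.
df = pd.DataFrame({
    "shape": ["disk", "light", "disk", "triangle"],
    "length_m": [1.0, 2.0, 3.0, 4.0],
    "length_cm": [100.0, 200.0, 300.0, 400.0],
    "seconds": [60, 5, 300, 42],
})

# Feature creation: one-hot encode the categorical column.
encoded = pd.get_dummies(df, columns=["shape"], dtype=int)

# Feature selection: find pairs of numeric columns with a very high
# absolute correlation and keep only the first column of each pair.
numeric_cols = ["length_m", "length_cm", "seconds"]
corr = encoded[numeric_cols].corr().abs()

to_drop = []
for i, first in enumerate(numeric_cols):
    for second in numeric_cols[i + 1:]:
        redundant = corr.loc[first, second] > 0.95  # illustrative threshold
        if redundant and first not in to_drop and second not in to_drop:
            to_drop.append(second)

selected = encoded.drop(columns=to_drop)
print(list(selected.columns))
```

Which column of a correlated pair to keep is a judgment call. Here the first one encountered is kept purely for simplicity.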

What is the UFO sightings dataset used for?

The UFO sightings dataset is used in the final chapter as a hands-on exercise where you apply all the preprocessing techniques learned throughout the course.

How many chapters and exercises does this course have?

The course has 5 chapters and 62 exercises. Most learners complete it in about 4 hours.

Why is preprocessing important for machine learning?

Preprocessing ensures your data is in the right form for your model. Poorly prepared data can lead to inaccurate predictions regardless of which algorithm you choose.
