Feature Engineering for Machine Learning in Python

Skill Level: Intermediate
Rating: 4.8+ (933 reviews)
Updated 02/2023
Create new features to improve the performance of your Machine Learning models.
Python | Machine Learning | 4 hr | 16 videos | 53 exercises | 4,350 XP | 38,322 | Statement of Accomplishment

Course Description

Every day you read about breakthroughs in how the newest applications of machine learning are changing the world. This reporting often glosses over the large amount of data munging and feature engineering that must be done before any of these models can be used. In this course, you will learn how to do just that. You will work with the Stack Overflow Developer Survey and historic US presidential inauguration addresses to understand how best to preprocess and engineer features from categorical, continuous, and unstructured data. This course will give you hands-on experience in preparing data for your own machine learning models.

Prerequisites

Supervised Learning with scikit-learn
1. Creating Features

In this chapter, you will explore what feature engineering is and how to get started with applying it to real-world data. You will load, explore and visualize a survey response dataset, and in doing so you will learn about its underlying data types and why they have an influence on how you should engineer your features. Using the pandas package you will create new features from both categorical and continuous columns.
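
As a rough illustration of what creating features from categorical and continuous columns with pandas can look like, here is a minimal sketch. The DataFrame, its column names (`Country`, `YearsCoding`), and the bin edges are invented for this example and are not taken from the course data.

```python
import pandas as pd

# Hypothetical survey-style data; names and values are made up for illustration.
df = pd.DataFrame({
    "Country": ["USA", "India", "USA", "Germany"],
    "YearsCoding": [1, 12, 5, 25],
})

# Categorical column: one indicator column per category (one-hot encoding).
country_dummies = pd.get_dummies(df["Country"], prefix="Country")

# Continuous column: group values into ordered, labeled bins.
# The bin edges here are an arbitrary assumption.
df["ExperienceLevel"] = pd.cut(
    df["YearsCoding"],
    bins=[0, 2, 10, 50],
    labels=["junior", "mid", "senior"],
)

# Combine the original columns with the new indicator columns.
features = pd.concat([df, country_dummies], axis=1)
print(features)
```

One-hot encoding adds a column per category, so it suits columns with few distinct values; binning trades precision for a simpler, more robust feature.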
2. Dealing with Messy Data

3. Conforming to Statistical Assumptions

4. Dealing with Text Data

Finally, in this chapter, you will work with unstructured text data and learn ways to engineer columnar features from a text corpus. You will compare how different approaches affect how much context is extracted from a text, and how to balance the need for context against the number of features created.
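
The trade-off between context and feature count can be sketched with plain word counts. The example below uses only the Python standard library and three invented sentences; the helper names `ngram_counts`, `vocabulary`, and `count_matrix` are hypothetical, and the course itself may use different tooling.

```python
from collections import Counter

def ngram_counts(text, n):
    """Count word n-grams in a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(grams)

def vocabulary(corpus, n):
    """Sorted list of every distinct n-gram in the corpus (one feature each)."""
    vocab = set()
    for doc in corpus:
        vocab.update(ngram_counts(doc, n))
    return sorted(vocab)

def count_matrix(corpus, n):
    """One row per document, one column per n-gram in the vocabulary."""
    vocab = vocabulary(corpus, n)
    return [[ngram_counts(doc, n)[term] for term in vocab] for doc in corpus]

# Invented sentences, not taken from the inauguration addresses.
corpus = [
    "we the people",
    "the people we serve",
    "we serve the people",
]

unigrams = vocabulary(corpus, 1)
bigrams = vocabulary(corpus, 2)
uni_matrix = count_matrix(corpus, 1)
bi_matrix = count_matrix(corpus, 2)

print(len(unigrams), "unigram features;", len(bigrams), "bigram features")
# Single-word counts ignore word order, so the last two documents look identical.
print(uni_matrix[1] == uni_matrix[2])
# Bigram counts keep some word order, so the same two documents differ.
print(bi_matrix[1] == bi_matrix[2])
```

Longer n-grams preserve more context but usually enlarge the vocabulary, which is the balance the chapter description refers to.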

FAQs

What types of features will I learn to engineer in this course?

You will create features from categorical columns, continuous variables, and unstructured text data, covering the full spectrum of feature types found in real-world machine learning projects.

What datasets are used for hands-on practice?

You will work with the Stack Overflow Developer Survey for structured feature engineering and historic US presidential inauguration addresses for text-based feature creation.

How does this course handle missing data?

Chapter 2 teaches you to locate missing values and explore several approaches to imputing or removing them, along with string manipulation techniques for cleaning messy columns.
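
A minimal pandas sketch of these ideas follows. The data, the column names (`Age`, `Gender`), the salary strings, and the choice of median and mode as fill values are assumptions made for illustration, not the course's prescribed method.

```python
import pandas as pd

# Hypothetical data with gaps; names and values are made up for illustration.
df = pd.DataFrame({
    "Age": [25.0, None, 40.0, 31.0],
    "Gender": ["Male", None, "Female", "Female"],
})

# Locate missing values: count them per column.
missing_per_column = df.isna().sum()

# Removal: drop every row that has at least one missing value.
dropped = df.dropna()

# Imputation: fill numeric gaps with the median and categorical gaps with the mode.
imputed = df.copy()
imputed["Age"] = imputed["Age"].fillna(imputed["Age"].median())
imputed["Gender"] = imputed["Gender"].fillna(imputed["Gender"].mode()[0])

# String cleaning: strip currency symbols and thousands separators, then convert.
salary = pd.Series(["$55,000", "$72,500"])
salary_clean = salary.str.replace("[$,]", "", regex=True).astype(float)

print(missing_per_column)
print(imputed)
print(salary_clean)
```

Dropping rows is simplest but discards data; imputing keeps every row at the cost of introducing values that were never observed.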

Does the course cover statistical assumptions for features?

Yes. Chapter 3 focuses on analyzing data distributions, dealing with skewed data, and handling outliers that could negatively impact your machine learning models.
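
To make this concrete, here is a small sketch of two common techniques, a log transform for right-skewed data and quantile-based trimming for outliers. The values and the 95th-percentile cutoff are arbitrary assumptions for illustration; the course may teach other transformations or thresholds.

```python
import numpy as np
import pandas as pd

# Invented right-skewed values (each one doubles the previous).
values = pd.Series([1, 2, 4, 8, 16, 32, 64, 128], dtype=float)

# A log transform compresses the long right tail.
log_values = np.log(values)

print("skew before:", values.skew())
print("skew after: ", log_values.skew())

# Quantile-based trimming: treat everything at or above the 95th percentile
# as an outlier. The 0.95 cutoff is an assumption, not a rule.
upper = values.quantile(0.95)
trimmed = values[values < upper]

print("kept", len(trimmed), "of", len(values), "values")
```

Transforming keeps every observation while reshaping the distribution, whereas trimming removes observations, so the two are often considered in that order.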

What text feature engineering techniques are included?

You will learn multiple approaches for extracting columnar features from text corpora, comparing how each method balances context richness against the number of features generated.
