Dimensionality Reduction in Python

Rating: 4.45 (11 reviews)

Intermediate

Understand the concept of reducing dimensionality in your data, and master the techniques to do so in Python.

4 Hours, 16 Videos, 58 Exercises

27,928 Learners, Statement of Accomplishment

Course Description

High-dimensional datasets can be overwhelming and leave you not knowing where to start. Typically, you’d visually explore a new dataset first, but when you have too many dimensions, the classical approaches will seem insufficient. Fortunately, there are visualization techniques designed specifically for high-dimensional data, and you’ll be introduced to these in this course. After exploring the data, you’ll often find that many features hold little information because they don’t show any variance or because they are duplicates of other features. You’ll learn how to detect these features and drop them from the dataset so that you can focus on the informative ones. As a next step, you might want to build a model on these features, and it may turn out that some have no effect on the thing you’re trying to predict. You’ll learn how to detect and drop these irrelevant features too, in order to reduce dimensionality and thus complexity. Finally, you’ll learn how feature extraction techniques can reduce dimensionality for you through the calculation of uncorrelated principal components.

In the following Tracks

Machine Learning Scientist with Python

1
Exploring High Dimensional Data
Free
You'll be introduced to the concept of dimensionality reduction and will learn when and why this is important. You'll learn the difference between feature selection and feature extraction and will apply both techniques for data exploration. The chapter ends with a lesson on t-SNE, a powerful feature extraction technique that will allow you to visualize a high-dimensional dataset.
Introduction
50 xp
Finding the number of dimensions in a dataset
50 xp
Removing features without variance
100 xp
Feature selection vs. feature extraction
50 xp
Visually detecting redundant features
100 xp
Advantage of feature selection
50 xp
t-SNE visualization of high-dimensional data
50 xp
t-SNE intuition
50 xp
Fitting t-SNE to the ANSUR data
100 xp
t-SNE visualisation of dimensionality
100 xp
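
The t-SNE lessons in this chapter can be illustrated with a minimal sketch. It assumes scikit-learn is installed and uses a synthetic dataset from `make_blobs` as a stand-in for the course data; the sample count, feature count, and `perplexity` value are illustrative choices, not settings taken from the course.

```python
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

# Synthetic stand-in for a high-dimensional dataset (assumption for
# illustration, not the ANSUR data): 60 samples, 20 features, 3 clusters.
X, labels = make_blobs(n_samples=60, n_features=20, centers=3, random_state=0)

# t-SNE projects the 20-dimensional samples onto 2 dimensions for plotting.
# perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, random_state=0)
embedding = tsne.fit_transform(X)

print(embedding.shape)  # (60, 2)
```

Each row of `embedding` is a 2D coordinate that could be drawn on a scatter plot and colored by `labels` to see whether the clusters separate.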
2
Feature Selection I - Selecting for Feature Information
In this first of two chapters on feature selection, you'll learn about the curse of dimensionality and how dimensionality reduction can help you overcome it. You'll be introduced to a number of techniques to detect and remove features that bring little added value to the dataset, either because they have little variance or too many missing values, or because they are strongly correlated with other features.
The curse of dimensionality
50 xp
Train - test split
100 xp
Fitting and testing the model
100 xp
Accuracy after dimensionality reduction
100 xp
Features with missing values or little variance
50 xp
Finding a good variance threshold
100 xp
Features with low variance
100 xp
Removing features with many missing values
100 xp
Pairwise correlation
50 xp
Correlation intuition
50 xp
Inspecting the correlation matrix
50 xp
Visualizing the correlation matrix
100 xp
Removing highly correlated features
50 xp
Filtering out highly correlated features
100 xp
Nuclear energy and pool drownings
100 xp
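
The three filters named in this chapter (missing values, low variance, high correlation) can be sketched as follows. This is an illustrative example under stated assumptions: the DataFrame, its column names, and the thresholds (50% missing, zero variance, 0.95 correlation) are all hypothetical and are not taken from the course exercises.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

rng = np.random.default_rng(0)
n = 200

# Hypothetical toy data with one constant, one duplicate-like,
# and one mostly-missing column.
df = pd.DataFrame({
    "height": rng.normal(170, 10, n),
    "weight": rng.normal(70, 8, n),
    "constant": np.ones(n),
})
df["height_in"] = df["height"] / 2.54  # perfectly correlated with height
df["mostly_missing"] = np.where(rng.random(n) < 0.8, np.nan, rng.normal(size=n))

# 1) Drop features where more than half of the values are missing.
missing_ratio = df.isna().mean()
df = df.loc[:, missing_ratio < 0.5]

# 2) Drop features without variance (threshold=0.0 removes constant columns).
selector = VarianceThreshold(threshold=0.0)
selector.fit(df)
df = df.loc[:, selector.get_support()]

# 3) Drop one feature from each strongly correlated pair.
corr = df.corr().abs()
upper_and_diagonal = np.triu(np.ones(corr.shape, dtype=bool))
lower = corr.mask(upper_and_diagonal)  # keep each pair only once
to_drop = [col for col in lower.columns if (lower[col] > 0.95).any()]
reduced = df.drop(columns=to_drop)

print(list(reduced.columns))
```

Which member of a correlated pair gets dropped depends only on column order here; in practice you would choose the one that is easier to interpret or measure.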
3
Feature Selection II - Selecting for Model Accuracy
In this second chapter on feature selection, you'll learn how to let models help you find the most important features in a dataset for predicting a particular target feature. In the final lesson of this chapter, you'll combine the advice of multiple different models to decide which features are worth keeping.
Selecting features for model performance
50 xp
Building a diabetes classifier
100 xp
Manual Recursive Feature Elimination
100 xp
Automatic Recursive Feature Elimination
100 xp
Tree-based feature selection
50 xp
Building a random forest model
100 xp
Random forest for feature selection
100 xp
Recursive Feature Elimination with random forests
100 xp
Regularized linear regression
50 xp
Creating a LASSO regressor
100 xp
Lasso model results
100 xp
Adjusting the regularization strength
100 xp
Combining feature selectors
50 xp
Creating a LassoCV regressor
100 xp
Ensemble models for extra votes
100 xp
Combining 3 feature selectors
100 xp
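
The idea of combining several model-based selectors can be sketched as a simple vote. The example below is an assumption-laden illustration, not the course's own solution: it uses synthetic regression data, and the number of features to keep (5) and the vote threshold (2 out of 3) are arbitrary choices.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 10 features, only 3 of which carry signal.
X, y = make_regression(n_samples=300, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)
X = StandardScaler().fit_transform(X)  # Lasso is sensitive to feature scale

# Selector 1: LassoCV tunes the regularization strength by cross-validation
# and shrinks the coefficients of unhelpful features to exactly zero.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
lasso_mask = lasso.coef_ != 0

# Selectors 2 and 3: Recursive Feature Elimination wrapped around tree ensembles.
rfe_rf = RFE(RandomForestRegressor(n_estimators=50, random_state=0),
             n_features_to_select=5).fit(X, y)
rfe_gb = RFE(GradientBoostingRegressor(n_estimators=50, random_state=0),
             n_features_to_select=5).fit(X, y)

# Each selector casts one vote per feature; keep features with at least 2 votes.
votes = (lasso_mask.astype(int)
         + rfe_rf.support_.astype(int)
         + rfe_gb.support_.astype(int))
keep_mask = votes >= 2
X_reduced = X[:, keep_mask]

print(votes)
print(X_reduced.shape)
```

Requiring agreement between different model families makes the final selection less dependent on the quirks of any single model.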
4
Feature Extraction
This chapter is a deep dive into the most frequently used dimensionality reduction algorithm, Principal Component Analysis (PCA). You'll build intuition on how and why this algorithm is so powerful and will apply it both for data exploration and data pre-processing in a modeling pipeline. You'll end with a cool image compression use case.
Feature extraction
50 xp
Manual feature extraction I
100 xp
Manual feature extraction II
100 xp
Principal component intuition
50 xp
Principal component analysis
50 xp
Calculating Principal Components
100 xp
PCA on a larger dataset
100 xp
PCA explained variance
100 xp
PCA applications
50 xp
Understanding the components
100 xp
PCA for feature exploration
100 xp
PCA in a model pipeline
100 xp
Principal Component selection
50 xp
Selecting the proportion of variance to keep
100 xp
Choosing the number of components
100 xp
PCA for image compression
100 xp
Congratulations!
50 xp
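
As a rough illustration of PCA in a pipeline with a chosen proportion of variance to keep, consider the sketch below. The data is synthetic (10 observed features generated from 3 hidden factors, an assumption made for this example), and the step names `scaler` and `reducer` as well as the 90% target are illustrative choices.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic data (assumption): 10 correlated features driven by 3 latent factors.
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.1 * rng.normal(size=(500, 10))

# Scale first, because PCA is sensitive to the variance of each feature.
# A float n_components between 0 and 1 keeps just enough components
# to explain that proportion of the total variance.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("reducer", PCA(n_components=0.9)),
])
components = pipe.fit_transform(X)

pca = pipe.named_steps["reducer"]
print(pca.n_components_)                       # number of components kept
print(pca.explained_variance_ratio_.cumsum())  # cumulative explained variance
```

Plotting the cumulative explained variance against the number of components is one common way to choose how many components to keep.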

Datasets

ANSUR Female

ANSUR Male

Diabetes

Grocery store sales

Boston Public Schools

Pokemon

Collaborators

Hadrien Lacroix

Hillary Green-Lerman

Chester Ismay

Prerequisites

Supervised Learning with scikit-learn

Jeroen Boeye

Machine Learning Engineer @ Faktion

Jeroen is a machine learning engineer working at Faktion, an AI company from Belgium. He uses both R and Python for his analyses and has a PhD background in computational biology. His experience mostly lies in working with structured data, produced by sensors or digital processes.

Don’t just take our word for it

4.4 from 11 reviews

HARPREET S.

9 months

concepts delivered

Ankush B.

about 1 year

Topics are very well explained in the course.

Swee M.

over 1 year

Great

Nur K.

over 1 year

He is an absolutely amazing teacher. This course not only helped me learn more about dimensionality reduction and feature selection, but it also gave me the motivation to continue my research. Thank you!

Hsing-chuan H.

over 1 year

very practical content for immediate application for my analysis

FAQs

Is this course suitable for beginners?

Will I receive a certificate at the end of the course?

What jobs would benefit from this course?

What skills will I gain after completing this course?

What techniques will be covered in the course?

Does this course cover specific visualization techniques?
