Dimensionality Reduction in R

Learn dimensionality reduction techniques in R and master feature selection and extraction for your own data and models.

4 hours, 16 videos, 56 exercises

Course description

Do you ever work with datasets with an overwhelming number of features? Do you need all those features? Which ones are the most important? In this course, you will learn dimensionality reduction techniques that simplify your data and the models you build from it, while preserving the information in the original data and maintaining good predictive performance.

Why learn dimensionality reduction?

We live in the information age—an era of information overload. The art of extracting essential information from data is a marketable skill. Models train faster on reduced data. In production, smaller models mean faster response time. Perhaps most important, smaller data and models are often easier to understand. Dimensionality reduction is your Occam’s razor in data science.

What will you learn in this course?

The difference between feature selection and feature extraction! Using R, you will learn how to identify and remove features with low or redundant information, keeping the features with the most information. That’s feature selection. You will also learn how to extract combinations of features as condensed components that contain maximal information. That’s feature extraction!

But most importantly, using R's tidymodels framework, you will use real-world data to build models with fewer features without sacrificing significant performance.

Part of the following tracks

Machine Learning Scientist with R

1
Foundations of Dimensionality Reduction
Free
Prepare to simplify large data sets! You will learn about information, how to assess feature importance, and practice identifying low-information features. By the end of the chapter, you will understand the difference between feature selection and feature extraction—the two approaches to dimensionality reduction.
Introduction to dimensionality reduction
50 xp
Dimensionality and feature information
100 xp
Mutual information features
100 xp
Information and feature importance
50 xp
Calculating root entropy
100 xp
Calculating child entropies
100 xp
Calculating information gain of color
100 xp
The Importance of Dimensionality Reduction in Data and Model Building
50 xp
Calculate possible combinations
100 xp
Curse of dimensionality, overfitting, and bias
100 xp
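
The entropy and information gain exercises listed above can be sketched in a few lines. This is a minimal illustration, not the course's own material: the course works in R, while this sketch uses plain Python so that it runs without any packages. The `color` and `bought` toy data and both function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Root entropy minus the size-weighted entropy of the child groups."""
    total = len(labels)
    children = {}
    for value, label in zip(feature_values, labels):
        children.setdefault(value, []).append(label)
    weighted = sum(len(c) / total * entropy(c) for c in children.values())
    return entropy(labels) - weighted

# Hypothetical toy data: does color help predict the outcome?
color = ["red", "red", "blue", "blue", "green", "green"]
bought = ["yes", "yes", "no", "no", "yes", "no"]

root = entropy(bought)                  # 3 "yes" vs 3 "no"
gain = information_gain(color, bought)  # red and blue are pure, green is mixed
print(round(root, 3), round(gain, 3))
```

A feature with a gain close to the root entropy explains most of the uncertainty in the target; a feature with a gain near zero is a candidate for removal.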
2
Feature Selection for Feature Importance
Learn how to identify information-rich and information-poor features using missing value ratios, variance, and correlation. Then you'll discover how to build tidymodels recipes to select features using these information indicators.
Feature selection vs. feature extraction
50 xp
Create a zero-variance filter
100 xp
Create a missing values filter
100 xp
Feature selection with the combined filter
100 xp
Selecting based on missing values
50 xp
Create a missing value ratio filter
100 xp
Apply a missing value ratio filter
100 xp
Create a missing values recipe
100 xp
Selecting based on variance
50 xp
Create a low-variance filter
100 xp
Create a low-variance recipe
100 xp
Selecting based on correlation with other features
50 xp
Identify highly correlated features
100 xp
Select correlated feature to remove
50 xp
Create a high-correlation recipe
100 xp
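
As a rough sketch of the three filters this chapter covers (missing value ratio, variance, and correlation with other features), the following plain-Python example applies them in sequence. It is an illustration under stated assumptions, not the tidymodels recipe the course builds in R: the function names, the thresholds, and the toy housing columns are all hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance

def missing_ratio(column):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in column) / len(column)

def pearson(x, y):
    """Pearson correlation using only rows where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    mx = mean(a for a, _ in pairs)
    my = mean(b for _, b in pairs)
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    ssx = sum((a - mx) ** 2 for a, _ in pairs)
    ssy = sum((b - my) ** 2 for _, b in pairs)
    return cov / sqrt(ssx * ssy)

def select_features(data, max_missing=0.5, min_variance=1e-8, max_corr=0.9):
    """Keep features that pass all three filters (thresholds are assumptions)."""
    kept = []
    for name, column in data.items():
        if missing_ratio(column) > max_missing:
            continue  # too many missing values
        observed = [v for v in column if v is not None]
        if pvariance(observed) <= min_variance:
            continue  # (near-)zero variance carries no information
        if any(abs(pearson(column, data[k])) > max_corr for k in kept):
            continue  # redundant with a feature that was already kept
        kept.append(name)
    return kept

# Hypothetical toy data
houses = {
    "sqft": [1000, 1500, 2000, 2500, 3000],
    "sqm": [93, 139, 186, 232, 279],            # almost a copy of sqft
    "bedrooms": [2, 3, 3, 4, 3],
    "country": [1, 1, 1, 1, 1],                 # constant
    "pool_size": [None, None, None, 20, None],  # mostly missing
}
selected = select_features(houses)
print(selected)
```

Which member of a highly correlated pair gets dropped depends on column order here; in practice that choice deserves a deliberate rule.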
3
Feature Selection for Model Performance
Chapter three introduces the difference between unsupervised and supervised feature selection approaches. You'll review how to use tidymodels workflows to build models. Then, you'll perform supervised feature selection using lasso regression and random forest models.
Supervised feature selection
50 xp
Supervised vs. unsupervised feature selection
100 xp
Decision tree feature selection type
50 xp
Model Building and Evaluation with tidymodels
50 xp
Split out the train and test sets
100 xp
Create a recipe-model workflow
100 xp
Fit, explore, and evaluate the model
100 xp
Lasso Regression
50 xp
Scale the data for lasso regression
100 xp
Explore lasso regression penalty values
100 xp
Tune the penalty hyperparameter
100 xp
Fit the best model
100 xp
Random forest models
50 xp
Create full random forest model
100 xp
Reduce data using feature importances
100 xp
Create reduced random forest
100 xp
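
To show why lasso regression can act as a supervised feature selector, here is a minimal coordinate-descent sketch in plain Python. It is not the tidymodels workflow used in the course; the function names, the penalty value, and the toy data are assumptions made for illustration. The objective minimized is (1/2n)·||y - Xb||² + penalty·||b||₁, with features assumed to be centered already.

```python
def soft_threshold(z, penalty):
    """Shrink z toward zero by penalty; values inside [-penalty, penalty] become 0."""
    if z > penalty:
        return z - penalty
    if z < -penalty:
        return z + penalty
    return 0.0

def lasso_coordinate_descent(X, y, penalty, n_iter=100):
    """Lasso coefficients via cyclic coordinate descent (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, penalty) / scale
    return beta

# Hypothetical toy data: y depends strongly on x1 and only weakly on x2
x1 = [-2, -1, 0, 1, 2]
x2 = [1, -2, 2, -2, 1]
X = [[a, b] for a, b in zip(x1, x2)]
y = [2 * a + 0.1 * b for a, b in zip(x1, x2)]

beta = lasso_coordinate_descent(X, y, penalty=0.5)
print(beta)
```

The weak feature's coefficient is set exactly to zero, which is what makes the penalty usable for selection; a larger penalty would remove more features at the cost of more shrinkage on the ones that remain.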
4
Feature Extraction and Model Performance
In this final chapter, you'll build a strong intuition for feature extraction by understanding how principal components extract and combine the most important information from different features. Then you'll learn about and apply three types of feature extraction: principal component analysis (PCA), t-SNE, and UMAP. Discover how you can use these feature extraction methods as a preprocessing step in the tidymodels model-building process.
Foundations of feature extraction - principal components
50 xp
Understanding principal components
100 xp
Naming principal components
50 xp
Principal Component Analysis (PCA)
50 xp
PCA: variance explained
50 xp
Mapping features to principal components
100 xp
PCA in tidymodels
100 xp
t-Distributed Stochastic Neighbor Embedding (t-SNE)
50 xp
Separating house prices with PCA
100 xp
Separating house prices with t-SNE
100 xp
Uniform Manifold Approximation and Projection (UMAP)
50 xp
Separating house prices with UMAP
100 xp
UMAP reduction in a decision tree model
100 xp
Evaluate the UMAP decision tree model
100 xp
Wrap up
50 xp
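
The ideas of "variance explained" and "mapping features to principal components" from this chapter can be sketched for the two-feature case, where the eigenvalues of the covariance matrix have a closed form. This is a plain-Python illustration rather than the course's R code; `pca_2d` and the toy data are hypothetical, and the features are centered but not scaled, which a real analysis would usually also do.

```python
from math import sqrt
from statistics import mean

def pca_2d(x, y):
    """PCA for two features: eigenvalues (descending) and first-component loadings."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    # Eigenvalues of the covariance matrix [[sxx, sxy], [sxy, syy]]
    half_trace = (sxx + syy) / 2
    spread = sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam1, lam2 = half_trace + spread, half_trace - spread
    # Eigenvector for lam1; fall back to an axis when the features are uncorrelated
    if sxy != 0:
        vx, vy = sxy, lam1 - sxx
    else:
        vx, vy = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    norm = sqrt(vx ** 2 + vy ** 2)
    return (lam1, lam2), (vx / norm, vy / norm)

# Hypothetical toy data: two strongly related features
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

(lam1, lam2), (load_x, load_y) = pca_2d(x, y)
explained = lam1 / (lam1 + lam2)  # share of total variance on the first component

# Project each centered observation onto the first component
mx, my = mean(x), mean(y)
scores = [(a - mx) * load_x + (b - my) * load_y for a, b in zip(x, y)]
print(round(explained, 4), (round(load_x, 3), round(load_y, 3)))
```

Because almost all of the variance falls on the first component, the single column of `scores` could stand in for both original features with little information lost.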

Collaborators

George Boorman

Jasmin Ludolf

Izzy Weber

Prerequisites

Modeling with tidymodels in R

Owner, Pickard Predictives, LLC
