Sara Bengoechea Rodríguez has completed

Advanced Dimensionality Reduction in R

Start Course for Free
4 hr
4,300 XP
Statement of Accomplishment Badge

Loved by learners at thousands of companies


Course Description

Dimensionality reduction techniques are based on unsupervised machine learning algorithms, and applying them offers several advantages. In this course you will learn how to exploit those advantages using interesting datasets such as the MNIST database of handwritten digits, the fashion version of MNIST released by Zalando, and a credit card fraud detection dataset. First, you will look at t-SNE, an algorithm that performs non-linear dimensionality reduction. Then you will explore some useful characteristics of dimensionality reduction that help when building predictive models. Finally, you will see how GLRM (generalized low rank models) can compress big data (with both numerical and categorical values) and impute missing values. Are you ready to start compressing high-dimensional data?
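As a taste of what the course covers, here is a minimal sketch of running t-SNE on an MNIST-style sample with the Rtsne package. The `mnist_sample` object and its column layout (label in the first column, pixel intensities in the rest) are assumptions made for illustration, not the course's exact code.

```r
# Minimal sketch: embed an MNIST-style sample in 2-D with t-SNE and plot it.
# Assumes a hypothetical data frame `mnist_sample` whose first column is the
# digit label and whose remaining columns are pixel intensities.
library(Rtsne)

pixels <- as.matrix(mnist_sample[, -1])
tsne_out <- Rtsne(pixels, dims = 2, perplexity = 30,
                  pca = TRUE, check_duplicates = FALSE)

# Plot the 2-D embedding, colored by digit label
plot(tsne_out$Y, col = as.factor(mnist_sample[[1]]), pch = 19,
     xlab = "t-SNE 1", ylab = "t-SNE 2")
```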
For Business

Training 2 or more people?

Get your team access to the full DataCamp platform, including all the features.
DataCamp for Business. For a bespoke solution, book a demo.
  1.

    Introduction to Advanced Dimensionality Reduction

    Free

    Are you ready to become a master of dimensionality reduction? In this chapter, you'll start by learning how to represent handwritten digits using the MNIST dataset. You will learn what a distance metric is and which ones are the most common, along with the problems that arise from the curse of dimensionality. Finally, you will compare the application of PCA and t-SNE (see the first sketch after the chapter list).

    Play Chapter Now
    Exploring the MNIST dataset
    50 xp
    Exploring MNIST dataset
    100 xp
    Digits features
    100 xp
    Distance metrics
    50 xp
    Euclidean distance
    100 xp
    Minkowski distance
    100 xp
    KL divergence
    100 xp
    PCA and t-SNE
    50 xp
    Generating PCA from MNIST sample
    100 xp
    t-SNE output from MNIST sample
    100 xp
  2.

    Introduction to t-SNE

    Now you will learn how to apply the t-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm. After finishing this chapter, you will understand the hyperparameters that affect your results and how to optimize them. Finally, you will do something really cool: compute centroid prototypes of each digit and use them to classify other digits (see the second sketch after the chapter list).

    Play Chapter Now
  3.

    Using t-SNE with Predictive Models

    In this chapter, you'll apply t-SNE to train predictive models faster, one of the many advantages of dimensionality reduction. You will learn how to train a random forest with the original features and with the embedded features and compare the two (see the third sketch after the chapter list). You will also apply t-SNE to understand the patterns learned by a neural network. And all of this using a real credit card fraud dataset!

    Play Chapter Now
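Chapter 1 sketch (referenced above): common distance metrics and a PCA projection of the digits. It reuses the same hypothetical `mnist_sample` layout as the earlier sketch and needs only base R functions (dist, prcomp).

```r
# Distances between two digit images and a PCA projection of the sample.
pixels <- as.matrix(mnist_sample[, -1])
x <- pixels[1, ]
y <- pixels[2, ]

# Euclidean and Minkowski (order 3) distances
dist(rbind(x, y), method = "euclidean")
dist(rbind(x, y), method = "minkowski", p = 3)

# Kullback-Leibler divergence, treating each image as a discrete distribution.
# A small epsilon avoids log(0) on sparse pixel vectors; note KL is asymmetric.
eps <- 1e-12
p <- (x + eps) / sum(x + eps)
q <- (y + eps) / sum(y + eps)
sum(p * log(p / q))

# PCA: project the sample onto its first two principal components
pca_out <- prcomp(pixels, center = TRUE)
plot(pca_out$x[, 1:2], col = as.factor(mnist_sample[[1]]), pch = 19,
     xlab = "PC1", ylab = "PC2")
```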
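Chapter 2 sketch (referenced above): fit t-SNE with a chosen perplexity, build a centroid prototype per digit in the embedded space, and classify points by their nearest centroid. Variable names again follow the hypothetical `mnist_sample` layout.

```r
library(Rtsne)

set.seed(1234)                          # t-SNE is stochastic; fix the seed
pixels <- as.matrix(mnist_sample[, -1])
digit_labels <- mnist_sample[[1]]

# Perplexity is the key hyperparameter to tune
# (roughly the effective number of neighbours per point)
tsne_out <- Rtsne(pixels, dims = 2, perplexity = 30,
                  max_iter = 1500, check_duplicates = FALSE)

# Centroid prototype of each digit in the embedded space
embedding <- data.frame(tsne_out$Y, label = digit_labels)
centroids <- aggregate(cbind(X1, X2) ~ label, data = embedding, FUN = mean)

# Classify an embedded point by its nearest centroid (Euclidean distance)
classify <- function(point) {
  d <- sqrt((centroids$X1 - point[1])^2 + (centroids$X2 - point[2])^2)
  centroids$label[which.min(d)]
}
classify(as.numeric(embedding[1, c("X1", "X2")]))
```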
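Chapter 3 sketch (referenced above): compare a random forest trained on the original features with one trained on a 2-D t-SNE embedding. The `creditcard` data frame, its `Class` column, and the randomForest package are assumptions chosen for illustration, not the course's exact objects.

```r
library(Rtsne)
library(randomForest)

# Assumed layout: numeric feature columns plus a factor column `Class`
# marking fraudulent transactions.
features <- as.matrix(creditcard[, setdiff(names(creditcard), "Class")])

# Random forest on the original, high-dimensional features
rf_original <- randomForest(x = features, y = creditcard$Class, ntree = 100)

# Random forest on a 2-D t-SNE embedding of the same features
embedded <- Rtsne(features, dims = 2, check_duplicates = FALSE)$Y
rf_embedded <- randomForest(x = embedded, y = creditcard$Class, ntree = 100)

rf_original   # compare out-of-bag error rates (and training time) of the two
rf_embedded
```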

Datasets

MNIST sample, Credit card fraud, Fashion MNIST sample

Collaborators

Chester Ismay
Sara Billen
Federico Castanedo

Data Scientist at DataRobot


Join over 17 million learners and start Advanced Dimensionality Reduction in R today!
