Unsupervised Learning in R

This course provides an intro to clustering and dimensionality reduction in R from a machine learning perspective.

4 Hours | 16 Videos | 49 Exercises | 40,636 Learners | 3600 XP

Course Description

Many times in machine learning, the goal is to find patterns in data without trying to make predictions. This is called unsupervised learning. One common use case of unsupervised learning is grouping consumers based on demographics and purchasing history to deploy targeted marketing campaigns. Another example is wanting to describe the unmeasured factors that most influence crime differences between cities. This course provides a basic introduction to clustering and dimensionality reduction in R from a machine learning perspective, so that you can get from data to insights as quickly as possible.

  1. Chapter 1

    Unsupervised learning in R


    The k-means algorithm is one common approach to clustering. Learn how the algorithm works under the hood, implement k-means clustering in R, visualize and interpret the results, and select the number of clusters when it's not known ahead of time. By the end of the chapter, you'll have applied k-means clustering to a fun "real-world" dataset!

    Welcome to the course! (50 xp)
    Identify clustering problems (50 xp)
    Introduction to k-means clustering (50 xp)
    k-means clustering (100 xp)
    Results of kmeans() (100 xp)
    Visualizing and interpreting results of kmeans() (100 xp)
    How k-means works and practical matters (50 xp)
    Handling random algorithms (100 xp)
    Selecting number of clusters (100 xp)
    Introduction to the Pokemon data (50 xp)
    Practical matters: working with real data (100 xp)
    Review of k-means clustering (50 xp)
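
    As a rough illustration of the topics in this chapter, the sketch below runs `kmeans()` on simulated data, inspects and plots the result, and compares the total within-cluster sum of squares across several cluster counts. The simulated matrix `x` and the choice of 20 random starts are assumptions made for this example; they are not taken from the course exercises.

    ```r
    # Simulated stand-in data (not the course's dataset): two 2-D groups of 50 points
    set.seed(1)
    x <- rbind(
      matrix(rnorm(100, mean = 0, sd = 0.5), ncol = 2),
      matrix(rnorm(100, mean = 4, sd = 0.5), ncol = 2)
    )

    # kmeans() starts from random centers, so nstart = 20 runs it 20 times
    # and keeps the run with the lowest total within-cluster sum of squares
    km <- kmeans(x, centers = 2, nstart = 20)

    # Inspect the results: cluster sizes and the fitted cluster centers
    print(table(km$cluster))
    print(km$centers)

    # Color each point by its assigned cluster
    plot(x, col = km$cluster, xlab = "", ylab = "", main = "k-means with 2 clusters")

    # Elbow heuristic: total within-cluster sum of squares for k = 1..6
    wss <- sapply(1:6, function(k) kmeans(x, centers = k, nstart = 20)$tot.withinss)
    plot(1:6, wss, type = "b",
         xlab = "Number of clusters",
         ylab = "Total within-cluster sum of squares")
    ```

    When the number of clusters is not known ahead of time, one common heuristic is to look for the "elbow" in the second plot, the point after which adding clusters reduces the within-cluster sum of squares only slightly.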
  2. Chapter 3

    Dimensionality reduction with PCA

    Principal component analysis, or PCA, is a common approach to dimensionality reduction. Learn exactly what PCA does, visualize the results of PCA with biplots and scree plots, and deal with practical issues such as centering and scaling the data before performing PCA.
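
    A minimal sketch of these steps, assuming the built-in `USArrests` data set as a stand-in (it is not necessarily the data used in the course): `prcomp()` centers and scales the variables, `biplot()` shows observations and variables on the first two components, and a scree plot shows the proportion of variance each component explains.

    ```r
    # Stand-in data: the built-in USArrests data set (50 states, 4 numeric columns).
    # The columns are on very different scales, so center and scale before PCA.
    pr <- prcomp(USArrests, center = TRUE, scale. = TRUE)

    # Standard deviation and proportion of variance for each component
    print(summary(pr))

    # Biplot: observations and original variables on the first two components
    biplot(pr)

    # Scree plot: proportion of variance explained by each component
    pve <- pr$sdev^2 / sum(pr$sdev^2)
    plot(pve, type = "b", ylim = c(0, 1),
         xlab = "Principal component",
         ylab = "Proportion of variance explained")
    ```

    Without `scale. = TRUE`, the variable with the largest variance would tend to dominate the first component, which is why scaling matters when columns use different units.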

  3. Chapter 4

    Putting it all together with a case study

    This chapter guides you through a complete analysis using the unsupervised learning techniques covered in the first three chapters. You'll extend what you've learned by using PCA as a preprocessing step for clustering, working with data that consist of measurements of cell nuclei from human breast masses.
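
    The general pattern of PCA followed by clustering might look like the sketch below. It is only an illustration under stated assumptions: the built-in `iris` measurements stand in for the breast-mass data, the 90% variance threshold is an arbitrary choice, and k-means is used as the clustering step, which may differ from the method the chapter applies.

    ```r
    # Stand-in data: numeric columns of the built-in iris data set
    # (the course's breast-mass measurements are not reproduced here)
    features <- iris[, 1:4]

    # Step 1: PCA on centered and scaled features
    pr <- prcomp(features, center = TRUE, scale. = TRUE)

    # Step 2: keep the fewest components explaining at least 90% of the variance
    # (the 90% threshold is an arbitrary choice for illustration)
    cum_pve <- cumsum(pr$sdev^2) / sum(pr$sdev^2)
    n_comp <- which(cum_pve >= 0.90)[1]
    scores <- pr$x[, 1:n_comp, drop = FALSE]

    # Step 3: cluster the observations in the reduced space
    set.seed(1)
    km <- kmeans(scores, centers = 3, nstart = 20)

    # Compare the clusters with the known labels, used here only for evaluation
    print(table(cluster = km$cluster, species = iris$Species))
    ```

    Clustering on a few principal components instead of all original variables can reduce noise and redundancy among correlated measurements, at the cost of discarding some variance.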


In the following tracks

Data Engineer, Data Scientist, Machine Learning Fundamentals, Machine Learning Scientist, SQL Fundamentals, Unsupervised Machine Learning


Nick Carchedi (n10i), Tom Jeon (tommyjee)


Introduction to R
Hank Roark

Senior Data Scientist, Boeing

Hank is a Senior Data Scientist at Boeing and a longtime user of the R language. Prior to his current role, he led the Customer Data Science team at, a leading provider of machine learning and predictive analytics services.