
Cluster Analysis in R

4.8+
12 reviews
Intermediate

Develop a strong intuition for how hierarchical and k-means clustering work and learn how to apply them to extract insights from your data.

4 Hours, 16 Videos, 52 Exercises
40,331 Learners, Statement of Accomplishment



Course Description

Learn How to Perform Cluster Analysis

Cluster analysis is a powerful toolkit in the data science workbench. It is used to find groups of observations (clusters) that share similar characteristics. These similarities can inform all kinds of business decisions; for example, in marketing, it is used to identify distinct groups of customers for which advertisements can be tailored.

Explore Hierarchical and K-Means Clustering Techniques

In this course, you will learn about two commonly used clustering methods: hierarchical clustering and k-means clustering. You won't just learn how to use these methods; you'll build a strong intuition for how they work and how to interpret their results. You'll develop this intuition by exploring three different datasets: soccer player positions, wholesale customer spending data, and longitudinal occupational wage data.

Hone Your Skills with a Hands-On Case Study

You’ll finish the course by applying your new skills to a case study based around average salaries and how they have changed over time. This will combine hierarchical clustering techniques such as occupation trees, preparing for exploration, and plotting occupational clusters, with k-means techniques including elbow analysis and average silhouette widths.
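As a rough illustration of the elbow analysis and average silhouette widths mentioned above, here is a minimal sketch in base R. The data are simulated and every variable name is hypothetical; this is not taken from the course exercises. It assumes the elbow is read from the total within-cluster sum of squares reported by `kmeans()`, and that the `cluster` package is available for the silhouette step (the code skips that step otherwise).

```r
set.seed(42)

# Simulated data: three well-separated groups of 20 points each (made-up values)
group_centers <- matrix(c(0, 0, 10, 10, 0, 10), ncol = 2, byrow = TRUE)
x <- group_centers[rep(1:3, each = 20), ] + matrix(rnorm(120, sd = 0.5), ncol = 2)

# Elbow analysis: total within-cluster sum of squares for k = 1..6
tot_withinss <- sapply(1:6, function(k) {
  kmeans(x, centers = k, nstart = 20)$tot.withinss
})
print(round(tot_withinss, 1))

# Average silhouette width for k = 2..6 (requires the 'cluster' package)
if (requireNamespace("cluster", quietly = TRUE)) {
  avg_sil <- sapply(2:6, function(k) {
    km <- kmeans(x, centers = k, nstart = 20)
    mean(cluster::silhouette(km$cluster, dist(x))[, "sil_width"])
  })
  names(avg_sil) <- 2:6
  print(round(avg_sil, 3))
}
```

With data like these, the within-cluster sum of squares should drop sharply up to k = 3 and flatten afterwards, which is the "elbow" one looks for; the average silhouette width gives a second, independent view of the same choice.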

DataCamp courses comprise a mixture of videos, articles, and practice exercises, giving you the chance to test and cement your new-found skills so that you feel confident applying them outside a course setting.

In the following Tracks

Machine Learning Scientist with R

  1. Calculating Distance Between Observations

    Free

    Cluster analysis seeks to find groups of observations such that observations within a group are similar to one another, while the groups themselves differ from each other. This similarity or difference is captured by a metric called distance. In this chapter, you will learn how to calculate the distance between observations for both continuous and categorical features. You will also develop an intuition for how the scales of your features can affect distance.

    What is cluster analysis?
    50 xp
    When to cluster?
    50 xp
    Distance between two observations
    50 xp
    Calculate & plot the distance between two players
    100 xp
    Using the dist() function
    100 xp
    Who are the closest players?
    50 xp
    The importance of scale
    50 xp
    Effects of scale
    100 xp
    When to scale data?
    50 xp
    Measuring distance for categorical data
    50 xp
    Calculating distance between categorical variables
    100 xp
    The closest observation to a pair
    50 xp
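The ideas in this chapter can be sketched in a few lines of base R. The example below is illustrative only: the values and variable names are invented, and it is not the course's own exercise code. It computes a Euclidean distance by hand and with `dist()`, shows that changing a feature's units changes raw distances but not distances on standardized data, and, as one possible treatment of categorical data, computes a Jaccard-style distance on 0/1 indicator columns with `dist(method = "binary")`.

```r
# Hypothetical positions of two players on a pitch (made-up values)
players <- data.frame(x = c(0, 9), y = c(0, 12))

# Euclidean distance computed by hand ...
manual <- sqrt((players$x[1] - players$x[2])^2 +
               (players$y[1] - players$y[2])^2)
# ... and with dist(), which uses the Euclidean method by default
d <- dist(players)
print(manual)         # 15
print(as.numeric(d))  # 15

# Effect of scale: the same three people with height in metres vs centimetres
people_m  <- data.frame(height = c(1.60, 1.95, 1.61), weight_kg = c(60, 62, 66))
people_cm <- transform(people_m, height = height * 100)

raw_m  <- dist(people_m)   # dominated by weight
raw_cm <- dist(people_cm)  # dominated by height: the units changed the answer

# Standardizing each feature (mean 0, sd 1) removes the dependence on units
scaled_m  <- dist(scale(people_m))
scaled_cm <- dist(scale(people_cm))
print(all.equal(as.numeric(scaled_m), as.numeric(scaled_cm)))  # TRUE

# Categorical features encoded as 0/1 indicators (hypothetical encoding):
# method = "binary" counts mismatches among positions where at least one is 1
prefs <- matrix(c(1, 1, 0, 0,
                  1, 0, 1, 0), nrow = 2, byrow = TRUE)
jaccard <- dist(prefs, method = "binary")
print(as.numeric(jaccard))  # 2/3
```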
  2. Hierarchical Clustering

    This chapter will help you answer the last question from chapter 1: how do you find groups of similar observations (clusters) in your data using the distances that you have calculated? You will learn about the fundamental principles of hierarchical clustering (the linkage criteria and the dendrogram plot) and how both are used to build clusters. You will also explore data from a wholesale distributor in order to perform market segmentation of clients using their spending habits.
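A minimal sketch of this workflow in base R follows, using a tiny made-up spending table rather than the course's wholesale dataset. The column names and values are hypothetical. It builds a distance matrix, runs `hclust()` under two linkage criteria, and cuts the resulting tree into two clusters with `cutree()`.

```r
# Hypothetical spending of six clients on two product categories (made-up values)
spend <- data.frame(
  milk    = c(1, 2, 1.5, 10, 11, 10.5),
  grocery = c(1, 1.5, 2, 10, 10.5, 11)
)

# Pairwise Euclidean distances between clients
d <- dist(spend)

# The linkage criterion decides how the distance between two clusters is measured
hc_complete <- hclust(d, method = "complete")  # farthest pair of members
hc_single   <- hclust(d, method = "single")    # closest pair of members

# Cut the complete-linkage tree into two clusters
clusters <- cutree(hc_complete, k = 2)
print(clusters)

# plot(hc_complete) would draw the dendrogram for visual inspection
```

Here the first three clients end up in one cluster and the last three in the other, because the two groups are far apart relative to their internal spread; on real data the choice of linkage and of where to cut the tree usually matters much more.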


Datasets

Soccer player positions
Occupational Employment Statistics (OES)
Wholesale customer spending

Collaborators

Yashas Roy
Richie Cotton

Prerequisites

Intermediate R
Dmitriy Gorenshteyn

Lead Data Scientist at Memorial Sloan Kettering Cancer Center


Don’t just take our word for it

4.8 from 12 reviews
5 stars: 83%
4 stars: 17%
3 stars: 0%
2 stars: 0%
1 star: 0%
  • Richard L.
    12 months

    Practical and applicable instructions on various clustering methods with the right mix of code and real life examples.

  • Edwin A.
    about 1 year

    I recommend this course for those who want to learn about cluster analysis using R.

  • Aleksandar V.
    over 1 year

    I enjoyed this very smooth course, as well as Machine Learning with Tidyverse. It is very intuitive and well explained. I wish Dima made more courses

  • Dimitris L.
    over 1 year

    excellent instructor, thoughtful notes

  • Nicolas F.
    over 1 year

    From this course I feel like I'm actually able to engage in cluster analyses just from taking this course! Thank you so much! Could you please create a cluster analysis cheat sheet?

