Cluster Analysis in R
Develop a strong intuition for how hierarchical and k-means clustering work and learn how to apply them to extract insights from your data.
4 hours · 16 videos · 52 exercises · 41,501 learners · Statement of Accomplishment
Course Description
Learn How to Perform Cluster Analysis
Cluster analysis is a powerful toolkit in the data science workbench. It is used to find groups of observations (clusters) that share similar characteristics. These similarities can inform all kinds of business decisions; for example, in marketing, cluster analysis is used to identify distinct groups of customers for whom advertisements can be tailored.
Explore Hierarchical and K-Means Clustering Techniques
In this course, you will learn about two commonly used clustering methods: hierarchical clustering and k-means clustering. You won't just learn how to use these methods; you'll build a strong intuition for how they work and how to interpret their results. You'll develop this intuition by exploring three different datasets: soccer player positions, wholesale customer spending data, and longitudinal occupational wage data.
Hone Your Skills with a Hands-On Case Study
You'll finish the course by applying your new skills to a case study based on average salaries and how they have changed over time. The case study combines hierarchical clustering steps (building occupation trees, preparing for exploration, and plotting occupational clusters) with k-means techniques, including elbow analysis and average silhouette widths.
DataCamp courses consist of a mixture of videos, articles, and practice exercises, so you have the chance to test and cement your new-found skills and feel confident applying them outside a course setting.
In the following tracks
Machine Learning Scientist in R
1. Calculating Distance Between Observations
Cluster analysis seeks to find groups of observations that are similar to one another, while the groups themselves differ from each other. This similarity or difference is captured by a metric called distance. In this chapter, you will learn how to calculate the distance between observations for both continuous and categorical features. You will also develop an intuition for how the scales of your features can affect distance.
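As a rough illustration of how feature scale can change which observations look closest, here is a minimal sketch in base R. The `players` data frame and its values are invented for this example and are not taken from the course datasets; `dist()` and `scale()` are standard base R functions. The last lines show one possible distance for binary (dummy-coded) categorical data, using `dist()` with `method = "binary"`.

```r
# Hypothetical toy data (not from the course): height in metres, weight in kg
players <- data.frame(
  height_m  = c(1.60, 1.62, 2.00, 1.80),
  weight_kg = c(60, 66, 61, 100),
  row.names = c("A", "B", "C", "D")
)

# Euclidean distances on the raw features: weight dominates because its
# numeric range is far larger than that of height
d_raw <- as.matrix(dist(players))

# Standardize each feature to mean 0 and standard deviation 1, then recompute
d_scaled <- as.matrix(dist(scale(players)))

# Nearest neighbour of player A before and after scaling
# (the -1 drops A's own column, which holds a distance of 0)
nearest_raw    <- names(which.min(d_raw["A", -1]))     # "C"
nearest_scaled <- names(which.min(d_scaled["A", -1]))  # "B"

# Binary (Jaccard-style) distance for dummy-coded categorical features:
# the share of columns that differ among columns where at least one row is 1
flags <- rbind(x = c(1, 0, 1), y = c(1, 1, 0))
jaccard <- dist(flags, method = "binary")  # 2 of 3 columns differ
```

On the raw data, A looks closest to C because the two differ by only 1 kg, even though C is 40 cm taller. After scaling, both features contribute comparably and A's nearest neighbour becomes B.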
What is cluster analysis? (50 xp)
When to cluster? (50 xp)
Distance between two observations (50 xp)
Calculate & plot the distance between two players (100 xp)
Using the dist() function (100 xp)
Who are the closest players? (50 xp)
The importance of scale (50 xp)
Effects of scale (100 xp)
When to scale data? (50 xp)
Measuring distance for categorical data (50 xp)
Calculating distance between categorical variables (100 xp)
The closest observation to a pair (50 xp)
2. Hierarchical Clustering
This chapter will help you answer the last question from chapter 1: how do you find groups of similar observations (clusters) in your data using the distances that you have calculated? You will learn about the fundamental principles of hierarchical clustering, namely the linkage criteria and the dendrogram plot, and how both are used to build clusters. You will also explore data from a wholesale distributor to perform market segmentation of clients based on their spending habits.
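The workflow described above (distances, then linkage, then cutting the tree) might be sketched as follows in base R. The `pts` data are made up for illustration; `hclust()` and `cutree()` are standard functions from the `stats` package, and the cut height of 5 is an arbitrary choice that happens to suit this toy data.

```r
# Hypothetical toy data (not from the course): two well-separated groups
pts <- data.frame(
  x = c(1, 2, 1, 10, 11, 10),
  y = c(1, 1, 2, 10, 10, 11)
)

# Pairwise Euclidean distances between observations
d <- dist(pts)

# Build trees with different linkage criteria:
# complete = farthest pair, single = closest pair, average = mean of all pairs
hc_complete <- hclust(d, method = "complete")
hc_single   <- hclust(d, method = "single")
hc_average  <- hclust(d, method = "average")

# Extract cluster assignments, either by the number of clusters wanted...
clusters_k <- cutree(hc_complete, k = 2)
# ...or by cutting the dendrogram at a chosen height
clusters_h <- cutree(hc_complete, h = 5)

# The dendrogram shows the merge heights; only drawn in interactive sessions
if (interactive()) plot(hc_complete)
```

With complete linkage, the height of a merge is the largest distance between members of the two groups being joined, so cutting at a height `h` yields clusters whose members are all within `h` of one another.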
Comparing more than two observations (50 xp)
Calculating linkage (100 xp)
Revisited: The closest observation to a pair (50 xp)
Capturing K clusters (50 xp)
Assign cluster membership (100 xp)
Exploring the clusters (100 xp)
Validating the clusters (50 xp)
Visualizing the dendrogram (50 xp)
Comparing average, single & complete linkage (100 xp)
Height of the tree (50 xp)
Cutting the tree (50 xp)
Clusters based on height (100 xp)
Exploring the branches cut from the tree (100 xp)
What do we know about our clusters? (50 xp)
Making sense of the clusters (50 xp)
Segment wholesale customers (100 xp)
Explore wholesale customer clusters (100 xp)
Interpreting the wholesale customer clusters (50 xp)
3. K-means Clustering
In this chapter, you will build an understanding of the principles behind the k-means algorithm, learn how to select a suitable k when it isn't known in advance, and revisit the wholesale data from a different perspective.
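To make the idea of choosing k more concrete, here is a minimal sketch using base R's `kmeans()`. The `pts` matrix is invented for this example (three tight groups of five points each). The elbow approach tracks the total within-cluster sum of squares as k grows. The silhouette part assumes the `cluster` package is installed and calls `cluster::silhouette()` on the k-means assignments, which is one of several ways to compute average silhouette widths and may differ from the approach used in the course exercises.

```r
# Hypothetical toy data (not from the course): three tight groups of 5 points
centers <- rbind(c(0, 0), c(10, 0), c(5, 10))
offsets <- rbind(c(0, 0), c(1, 0), c(0, 1), c(-1, 0), c(0, -1))
pts <- do.call(rbind, lapply(1:3, function(i) {
  sweep(offsets, 2, centers[i, ], "+")  # shift the offsets to centre i
}))

set.seed(42)  # k-means starts from random centres, so fix the seed

# Elbow analysis: total within-cluster sum of squares for each candidate k
ks  <- 1:5
wss <- sapply(ks, function(k) {
  kmeans(pts, centers = k, nstart = 50)$tot.withinss
})
# Look for the k after which wss stops dropping sharply
if (interactive()) plot(ks, wss, type = "b", xlab = "k", ylab = "Total WSS")

# Silhouette analysis (assumes the 'cluster' package is available)
best_k <- NA
if (requireNamespace("cluster", quietly = TRUE)) {
  sil_ks  <- 2:5  # silhouette width is undefined for a single cluster
  avg_sil <- sapply(sil_ks, function(k) {
    km <- kmeans(pts, centers = k, nstart = 50)
    mean(cluster::silhouette(km$cluster, dist(pts))[, "sil_width"])
  })
  best_k <- sil_ks[which.max(avg_sil)]  # k with the highest average width
}
```

Using `nstart` greater than 1 reruns the algorithm from several random starts and keeps the best result, which makes the elbow curve less sensitive to an unlucky initialization.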
Introduction to K-means (50 xp)
K-means on a soccer field (100 xp)
K-means on a soccer field (part 2) (100 xp)
Evaluating different values of K by eye (50 xp)
Many K's many models (100 xp)
Elbow (Scree) plot (100 xp)
Interpreting the elbow plot (50 xp)
Silhouette analysis: observation level performance (50 xp)
Silhouette analysis (100 xp)
Making sense of the K-means clusters (50 xp)
Revisiting wholesale data: "Best" k (100 xp)
Revisiting wholesale data: Exploration (100 xp)
4. Case Study: National Occupational Mean Wage
In this chapter, you will apply the skills you have learned to explore how average salaries across professions have changed over time.
Occupational wage data (50 xp)
Initial exploration of the data (50 xp)
Hierarchical clustering: Occupation trees (100 xp)
Hierarchical clustering: Preparing for exploration (100 xp)
Hierarchical clustering: Plotting occupational clusters (100 xp)
Reviewing the HC results (50 xp)
K-means: Elbow analysis (100 xp)
K-means: Average Silhouette Widths (100 xp)
The "best" number of clusters (50 xp)
Review K-means results (50 xp)
Datasets
Soccer player positions
Occupational Employment Statistics (OES)
Wholesale customer spending
Prerequisites
Intermediate R

Dmitriy Gorenshteyn
Lead Data Scientist at Memorial Sloan Kettering Cancer Center