Course

Unsupervised Learning in Python

Skill level: Intermediate

Updated 2025-12

Learn how to cluster, transform, visualize, and extract insights from unlabeled datasets using scikit-learn and SciPy.

Python, Machine Learning

4 hours

13 videos

52 exercises

4,150 XP

170K+

Certificate of completion

Course description

Say you have a collection of customers with a variety of characteristics such as age, location, and financial history, and you wish to discover patterns and sort the customers into clusters. Or perhaps you have a set of texts, such as Wikipedia pages, and you wish to segment them into categories based on their content. This is the world of unsupervised learning, so called because you are not guiding, or supervising, the pattern discovery with a prediction task, but instead uncovering hidden structure in unlabeled data. Unsupervised learning encompasses a variety of machine learning techniques, from clustering to dimension reduction to matrix factorization. In this course, you'll learn the fundamentals of unsupervised learning and implement the essential algorithms using scikit-learn and SciPy. You will learn how to cluster, transform, visualize, and extract insights from unlabeled datasets, and end the course by building a recommender system to recommend popular musical artists.

The videos contain live transcripts you can reveal by clicking "Show transcript" at the bottom left of the videos. The course glossary can be found on the right in the resources section.

To obtain CPE credits you need to complete the course and reach a score of 70% on the qualified assessment. You can navigate to the assessment by clicking on the CPE credits callout on the right.

Prerequisites

Supervised Learning with scikit-learn

1

Clustering for Dataset Exploration

Learn how to discover the underlying groups (or "clusters") in a dataset. By the end of this chapter, you'll be clustering companies using their stock market prices, and distinguishing different species by clustering their measurements.

Unsupervised Learning

How many clusters?

Clustering 2D points

Inspect your clustering

Evaluating a clustering

How many clusters of grain?

Evaluating the grain clustering

Transforming features for better clusterings

Scaling fish data for clustering

Clustering the fish data

Clustering stocks using KMeans

Which stocks move together?
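
As a rough illustration of the ideas this chapter lists (k-means clustering, choosing the number of clusters, and scaling features first), here is a minimal sketch. The synthetic data, the variable names, and the choice of three clusters are assumptions made for the example, not material taken from the course exercises.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: three groups of 50 samples, two features on very different scales.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5000.0], [10.0, 0.0]])
samples = np.vstack(
    [c + rng.normal(scale=[0.5, 50.0], size=(50, 2)) for c in centers]
)

# Standardize each feature, then cluster, so the large-scale feature does not dominate.
pipeline = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
labels = pipeline.fit_predict(samples)

# Inertia (within-cluster sum of squares) for several k, computed on the scaled data.
# A common heuristic is to look for the "elbow" where the decrease levels off.
scaled = StandardScaler().fit_transform(samples)
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(scaled).inertia_
    for k in range(1, 6)
}
```

Here `labels` holds one cluster index per sample, and `inertias` can be plotted against `k` to judge how many clusters the data supports.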

2

Visualization with Hierarchical Clustering and t-SNE

In this chapter, you'll learn about two unsupervised learning techniques for data visualization: hierarchical clustering and t-SNE. Hierarchical clustering merges the data samples into ever-coarser clusters, yielding a tree visualization of the resulting cluster hierarchy. t-SNE maps the data samples into 2D space so that the proximity of the samples to one another can be visualized.

Visualizing hierarchies

How many merges?

Hierarchical clustering of the grain data

Hierarchies of stocks

Cluster labels in hierarchical clustering

Which clusters are closest?

Different linkage, different hierarchical clustering!

Intermediate clusterings

Extracting the cluster labels

t-SNE for 2-dimensional maps

t-SNE visualization of grain dataset

A t-SNE map of the stock market
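
The two techniques described in this chapter can be sketched with SciPy and scikit-learn as follows. This is only an illustrative sketch on synthetic data; the linkage method, the number of extracted clusters, and the t-SNE settings are assumptions, not values prescribed by the course.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.manifold import TSNE

# Hypothetical data: three well-separated groups of 20 samples in 4 dimensions.
rng = np.random.default_rng(0)
centers = np.array(
    [[0.0, 0.0, 0.0, 0.0], [8.0, 8.0, 0.0, 0.0], [0.0, 8.0, 8.0, 8.0]]
)
samples = np.vstack([c + rng.normal(scale=0.5, size=(20, 4)) for c in centers])

# Hierarchical clustering: each row of `mergings` records one merge of two clusters,
# so n samples produce n - 1 merges.
mergings = linkage(samples, method="complete")

# Cut the hierarchy so that at most 3 flat clusters remain (labels start at 1).
labels = fcluster(mergings, t=3, criterion="maxclust")

# t-SNE: map the samples to 2D for plotting. TSNE has no separate transform step,
# so fit_transform is applied to the same data that is being visualized.
tsne = TSNE(n_components=2, perplexity=10, learning_rate=100, random_state=0)
embedding = tsne.fit_transform(samples)
```

Passing `mergings` to `scipy.cluster.hierarchy.dendrogram` draws the tree, and a scatter plot of the two columns of `embedding` colored by `labels` gives the 2D map.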

3

Decorrelating Your Data and Dimension Reduction

Dimension reduction summarizes a dataset using its commonly occurring patterns. In this chapter, you'll learn about the most fundamental of dimension reduction techniques, "Principal Component Analysis" ("PCA"). PCA is often used before supervised learning to improve model performance and generalization. It can also be useful for unsupervised learning. For example, you'll employ a variant of PCA that will allow you to cluster Wikipedia articles by their content!

Visualizing the PCA transformation

Correlated data in nature

Decorrelating the grain measurements with PCA

Principal components

Intrinsic dimension

The first principal component

Variance of the PCA features

Intrinsic dimension of the fish data

Dimension reduction with PCA

Dimension reduction of the fish measurements

A tf-idf word-frequency array

Clustering Wikipedia part I

Clustering Wikipedia part II
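
To make the chapter's themes concrete (decorrelation, explained variance, and dimension reduction), here is a small sketch on synthetic data. The second half uses `TruncatedSVD` on a tf-idf matrix as one PCA-like method that accepts sparse input; treating it as the "variant of PCA" mentioned above is an assumption, and the documents are invented for the example.

```python
import numpy as np
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical measurements: length is strongly correlated with width.
rng = np.random.default_rng(0)
width = rng.normal(3.0, 0.5, size=200)
length = 2.0 * width + rng.normal(0.0, 0.1, size=200)
samples = np.column_stack([width, length])

# PCA rotates and centers the data; the resulting features are uncorrelated.
pca = PCA()
pca_features = pca.fit_transform(samples)
correlation = np.corrcoef(pca_features[:, 0], pca_features[:, 1])[0, 1]

# Share of variance per component: a rough guide to the intrinsic dimension.
ratios = pca.explained_variance_ratio_

# Keep only the highest-variance component.
reduced = PCA(n_components=1).fit_transform(samples)

# Sparse tf-idf word frequencies, reduced with TruncatedSVD (toy documents).
documents = [
    "the striker scored a late goal",
    "the keeper saved the goal",
    "the band released a new album",
    "the album features a guitar solo",
]
tfidf = TfidfVectorizer().fit_transform(documents)
doc_features = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)
```

In this toy data nearly all the variance lies along one direction, so `ratios` is dominated by its first entry and a single PCA feature keeps most of the information.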

4

Discovering Interpretable Features

In this chapter, you'll learn about a dimension reduction technique called "Non-negative matrix factorization" ("NMF") that expresses samples as combinations of interpretable parts. For example, it expresses documents as combinations of topics, and images in terms of commonly occurring visual patterns. You'll also learn to use NMF to build recommender systems that can find you similar articles to read, or musical artists that match your listening history!

Non-negative matrix factorization (NMF)

Non-negative data

NMF applied to Wikipedia articles

NMF features of the Wikipedia articles

NMF reconstructs samples

NMF learns interpretable parts

NMF learns topics of documents

Explore the LED digits dataset

NMF learns the parts of images

PCA doesn't learn parts

Building recommender systems using NMF

Which articles are similar to 'Cristiano Ronaldo'?

Recommend musical artists part I

Recommend musical artists part II

Final thoughts
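
The NMF workflow outlined in this chapter (learn non-negative parts, then recommend similar items) might look roughly like the sketch below. The word counts, titles, and number of components are hypothetical, and cosine similarity on the NMF features is used here as one reasonable way to compare items.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.preprocessing import normalize

# Hypothetical word-count array: rows are documents, columns are words.
words = ["goal", "match", "striker", "guitar", "album", "chorus"]
titles = ["football A", "football B", "music A", "music B"]
counts = np.array(
    [
        [3, 2, 1, 0, 0, 0],
        [2, 3, 2, 0, 0, 0],
        [0, 0, 0, 3, 2, 1],
        [0, 0, 0, 1, 3, 2],
    ],
    dtype=float,
)

# NMF requires non-negative input and yields non-negative features and components.
model = NMF(n_components=2, init="nndsvd", max_iter=500, random_state=0)
nmf_features = model.fit_transform(counts)  # one row per document
components = model.components_              # one row per learned "topic"

# The product of features and components approximately reconstructs the samples.
reconstruction = nmf_features @ components

# Cosine similarity: normalize each row to unit length, then take dot products.
norm_features = normalize(nmf_features)
similarities = norm_features @ norm_features[0]
similarities[0] = -np.inf  # exclude the query document itself
most_similar = titles[int(np.argmax(similarities))]
```

Each row of `components` can be read against `words` to see which words define a topic, and `most_similar` names the document closest to the first one.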
