Unsupervised Learning in Python

Skill Level: Intermediate
Rating: 4.8+ (1,032 reviews)
Updated 12/2025
Learn how to cluster, transform, visualize, and extract insights from unlabeled datasets using scikit-learn and scipy.
Python, Machine Learning
4 hr
13 videos
52 Exercises
4,150 XP
170K+
Statement of Accomplishment

Course Description

Say you have a collection of customers with a variety of characteristics such as age, location, and financial history, and you wish to discover patterns and sort them into clusters. Or perhaps you have a set of texts, such as Wikipedia pages, and you wish to segment them into categories based on their content. This is the world of unsupervised learning, so called because you are not guiding, or supervising, the pattern discovery with a prediction task, but instead uncovering hidden structure in unlabeled data.

Unsupervised learning encompasses a variety of machine learning techniques, from clustering to dimension reduction to matrix factorization. In this course, you'll learn the fundamentals of unsupervised learning and implement the essential algorithms using scikit-learn and SciPy. You will learn how to cluster, transform, visualize, and extract insights from unlabeled datasets, and end the course by building a recommender system to recommend popular musical artists.

The videos contain live transcripts you can reveal by clicking "Show transcript" at the bottom left of the videos. The course glossary can be found on the right in the resources section.

To obtain CPE credits, you need to complete the course and reach a score of 70% on the qualified assessment. You can navigate to the assessment by clicking on the CPE credits callout on the right.

What you'll learn

  • Assess intrinsic dimensionality by interpreting PCA explained-variance ratios and selecting optimal n_components for compression
  • Distinguish between k-means, agglomerative hierarchical clustering, and t-SNE based on their algorithms, input requirements, and visualization outputs
  • Evaluate cluster quality using inertia plots, dendrogram linkage distances, and cross-tabulations against known categories
  • Identify appropriate preprocessing, clustering, and dimension-reduction tools in scikit-learn for specific unsupervised learning tasks
  • Recognize significant latent features produced by NMF and apply cosine similarity to recommend documents or images with related topics or patterns
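As a rough illustration of two of these outcomes (choosing preprocessing for a clustering task and checking clusters against known categories), here is a minimal sketch. It uses synthetic data from `make_blobs` rather than any course dataset, and the feature rescaling, variable names, and choice of three clusters are assumptions made for the example.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a dataset with known categories (not course data).
X, known = make_blobs(n_samples=150, centers=3, cluster_std=0.5, random_state=0)
X[:, 1] *= 100  # put the second feature on a much larger scale than the first

# Standardize features before clustering so no single feature dominates distances.
pipeline = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
labels = pipeline.fit_predict(X)

# Cross-tabulate cluster labels against the known categories.
ct = pd.crosstab(labels, known, rownames=["cluster"], colnames=["known"])
print(ct)
```

In a table like this, a good clustering tends to place most samples of each known category in a single row.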

Prerequisites

Supervised Learning with scikit-learn

1. Clustering for Dataset Exploration

Learn how to discover the underlying groups (or "clusters") in a dataset. By the end of this chapter, you'll be clustering companies using their stock market prices, and distinguishing different species by clustering their measurements.
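To make the idea of discovering groups concrete, the sketch below fits k-means for several values of k and records the inertia of each fit, which is one common way to judge how many clusters to use. The data is synthetic (`make_blobs`), and the range of k values is an arbitrary choice for illustration, not something taken from the course.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D samples drawn around three centers (illustrative only).
X, _ = make_blobs(n_samples=200, centers=3, random_state=42)

# Fit k-means for k = 1..6 and record the inertia
# (sum of squared distances from each sample to its nearest centroid).
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    model.fit(X)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(f"k={k}: inertia={value:.1f}")
```

Plotting these values against k usually shows an "elbow" where adding more clusters stops reducing inertia by much; that point is a reasonable candidate for the number of clusters.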

2. Visualization with Hierarchical Clustering and t-SNE

In this chapter, you'll learn about two unsupervised learning techniques for data visualization: hierarchical clustering and t-SNE. Hierarchical clustering merges the data samples into ever-coarser clusters, yielding a tree visualization of the resulting cluster hierarchy. t-SNE maps the data samples into 2D space so that the proximity of the samples to one another can be visualized.
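Both techniques can be sketched in a few lines. The example below is a minimal illustration on synthetic data, assuming SciPy's `linkage`, `dendrogram`, and `fcluster` for the hierarchy and scikit-learn's `TSNE` for the 2D map. The linkage method, perplexity, and learning rate are illustrative choices, not values prescribed by the course.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from sklearn.manifold import TSNE

# Two synthetic groups of 30 samples each in 5 dimensions (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0, 1, (30, 5)),
    rng.normal(8, 1, (30, 5)),
])

# Agglomerative clustering: each row of `mergings` records one merge step.
mergings = linkage(X, method="complete")

# Compute the dendrogram layout without drawing it (no matplotlib needed).
tree = dendrogram(mergings, no_plot=True)

# Cut the hierarchy so that at most two flat clusters remain.
labels = fcluster(mergings, t=2, criterion="maxclust")

# Map the samples to 2-D with t-SNE; perplexity must be below n_samples.
embedding = TSNE(
    n_components=2, perplexity=10, learning_rate=100, random_state=0
).fit_transform(X)

print(mergings.shape, embedding.shape)
```

Passing `embedding[:, 0]` and `embedding[:, 1]` to a scatter plot, colored by `labels`, gives the kind of 2D view the chapter describes. Note that the t-SNE axes themselves have no interpretable meaning.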

3. Decorrelating Your Data and Dimension Reduction

Dimension reduction summarizes a dataset using its commonly occurring patterns. In this chapter, you'll learn about the most fundamental of dimension reduction techniques, "Principal Component Analysis" ("PCA"). PCA is often used before supervised learning to improve model performance and generalization. It can also be useful for unsupervised learning. For example, you'll employ a variant of PCA that will allow you to cluster Wikipedia articles by their content!
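A small sketch of how PCA decorrelates features and supports dimension reduction is shown below. The data is synthetic (six features generated from two hidden factors plus noise), and the 95% variance threshold is an assumed rule of thumb for this example rather than a value from the course.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: 6 observed features driven by 2 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 6))

# Fit PCA with all components and inspect the explained-variance ratios.
pca = PCA()
features = pca.fit_transform(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# PCA features are decorrelated: their correlation is numerically zero.
corr = np.corrcoef(features[:, 0], features[:, 1])[0, 1]

# Smallest number of components whose cumulative ratio reaches 95% (assumed threshold).
n_components = int(np.searchsorted(cumulative, 0.95) + 1)
reduced = PCA(n_components=n_components).fit_transform(X)

print(np.round(pca.explained_variance_ratio_, 3), n_components, reduced.shape)
```

The number of components with non-negligible variance is a practical estimate of the dataset's intrinsic dimension.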

4. Discovering Interpretable Features

In this chapter, you'll learn about a dimension reduction technique called "Non-negative matrix factorization" ("NMF") that expresses samples as combinations of interpretable parts. For example, it expresses documents as combinations of topics, and images in terms of commonly occurring visual patterns. You'll also learn to use NMF to build recommender systems that can find you similar articles to read, or musical artists that match your listening history!
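The following sketch shows one way an NMF-based "similar items" lookup might be put together: factorize a non-negative matrix, normalize the resulting features, and rank rows by cosine similarity. The tiny word-count matrix is hypothetical, and the number of components and the `nndsvda` initialization are assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.preprocessing import normalize

# Hypothetical word-count matrix: rows are documents, columns are words.
counts = np.array([
    [3, 2, 0, 0],
    [4, 3, 0, 0],
    [0, 0, 2, 3],
    [0, 0, 3, 4],
    [2, 1, 0, 0],
], dtype=float)

# Express each document as a non-negative combination of 2 "topics".
model = NMF(n_components=2, init="nndsvda", max_iter=500, random_state=0)
features = model.fit_transform(counts)

# After L2-normalizing the rows, a dot product equals cosine similarity.
norm_features = normalize(features)
similarities = norm_features @ norm_features[0]  # similarity to document 0

# Documents ordered from most to least similar to document 0.
ranking = np.argsort(-similarities)
print(np.round(similarities, 3), ranking)
```

Here `model.components_` holds the learned topics (one row per topic, one column per word), which is what makes the NMF features interpretable.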

Don’t just take our word for it

4.8 from 1,032 reviews
Rating distribution: 85%, 14%, 1%, 0%, 0%

FAQs

Is this course suitable for beginners?

You should be comfortable with basic and intermediate Python before starting. No prior knowledge of machine learning or unsupervised learning is required.

What unsupervised learning techniques does this course cover?

The course covers k-means clustering, hierarchical clustering, t-SNE, principal component analysis, and non-negative matrix factorization, implemented using scikit-learn and SciPy.

What is the difference between clustering and dimensionality reduction, and does the course explain both?

Yes. Clustering groups similar items together, while dimensionality reduction re-expresses data along its most important axes to reduce size and noise. The course covers both and explains when each approach is useful.

Which Python library is used throughout the course?

The clustering and dimensionality reduction algorithms are implemented primarily with scikit-learn, the standard Python library for machine learning, alongside SciPy.

Join over 19 million learners and start Unsupervised Learning in Python today!
