Course Description
Say you have a collection of customers with a variety of characteristics such as age, location, and financial history, and you wish to discover patterns and sort them into clusters. Or perhaps you have a set of texts, such as Wikipedia pages, and you wish to segment them into categories based on their content. This is the world of unsupervised learning, so called because you are not guiding, or supervising, the pattern discovery with some prediction task, but instead uncovering hidden structure from unlabeled data. Unsupervised learning encompasses a variety of techniques in machine learning, from clustering to dimension reduction to matrix factorization. In this course, you'll learn the fundamentals of unsupervised learning and implement the essential algorithms using scikit-learn and SciPy. You'll learn how to cluster, transform, visualize, and extract insights from unlabeled datasets, and end the course by building a recommender system that suggests popular musical artists.
1. Clustering for Dataset Exploration (Free)
Learn how to discover the underlying groups (or "clusters") in a dataset. By the end of this chapter, you'll be clustering companies using their stock market prices, and distinguishing different species by clustering their measurements.
Lessons and exercises: Unsupervised Learning; How many clusters?; Clustering 2D points; Inspect your clustering; Evaluating a clustering; How many clusters of grain?; Evaluating the grain clustering; Transforming features for better clusterings; Scaling fish data for clustering; Clustering the fish data; Clustering stocks using KMeans; Which stocks move together?
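As a taste of what this chapter covers, below is a minimal k-means sketch with scikit-learn. The blob data, the choice of 3 clusters, and the random seeds are illustrative assumptions, not the course's datasets.

```python
# Minimal k-means sketch with scikit-learn (illustrative data, not the course datasets).
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 2D points: three blobs of 50 samples each.
rng = np.random.default_rng(42)
points = np.vstack([
    rng.normal(loc=center, scale=0.5, size=(50, 2))
    for center in [(0, 0), (5, 5), (0, 5)]
])

# Fit k-means with 3 clusters and assign a cluster label to each sample.
model = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = model.fit_predict(points)

print(labels[:10])             # cluster label of the first 10 points
print(model.cluster_centers_)  # coordinates of the 3 cluster centroids
print(model.inertia_)          # within-cluster sum of squares (used to choose the number of clusters)
```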
2. Visualization with Hierarchical Clustering and t-SNE
In this chapter, you'll learn about two unsupervised learning techniques for data visualization, hierarchical clustering and t-SNE. Hierarchical clustering merges the data samples into ever-coarser clusters, yielding a tree visualization of the resulting cluster hierarchy. t-SNE maps the data samples into 2D space so that the proximity of the samples to one another can be visualized.
Lessons and exercises: Visualizing hierarchies; How many merges?; Hierarchical clustering of the grain data; Hierarchies of stocks; Cluster labels in hierarchical clustering; Which clusters are closest?; Different linkage, different hierarchical clustering!; Intermediate clusterings; Extracting the cluster labels; t-SNE for 2-dimensional maps; t-SNE visualization of grain dataset; A t-SNE map of the stock market
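The sketch below shows the chapter's two visualization tools on made-up data: SciPy's linkage and dendrogram for hierarchical clustering, and scikit-learn's TSNE for a 2D map. The sample array, linkage method, and cut height are illustrative assumptions rather than the course's grain or stock data.

```python
# Hierarchical clustering (SciPy) and a t-SNE map (scikit-learn) on illustrative data.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
samples = rng.normal(size=(60, 5))  # hypothetical measurements: 60 samples, 5 features

# Hierarchical clustering: merge samples into ever-coarser clusters.
mergings = linkage(samples, method='complete')
dendrogram(mergings, no_labels=True)  # tree visualization of the merge hierarchy
plt.show()

# Extract flat cluster labels by cutting the tree at a chosen height.
labels = fcluster(mergings, t=6, criterion='distance')

# t-SNE: map the samples to 2D so their relative proximity can be plotted.
tsne = TSNE(learning_rate=100, random_state=0)
embedding = tsne.fit_transform(samples)
plt.scatter(embedding[:, 0], embedding[:, 1], c=labels)
plt.show()
```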
3. Decorrelating Your Data and Dimension Reduction
Dimension reduction summarizes a dataset using its most commonly occurring patterns. In this chapter, you'll learn about the most fundamental of dimension reduction techniques, "Principal Component Analysis" ("PCA"). PCA is often used before supervised learning to improve model performance and generalization. It can also be useful for unsupervised learning. For example, you'll employ a variant of PCA that will allow you to cluster Wikipedia articles by their content!
Lessons and exercises: Visualizing the PCA transformation; Correlated data in nature; Decorrelating the grain measurements with PCA; Principal components; Intrinsic dimension; The first principal component; Variance of the PCA features; Intrinsic dimension of the fish data; Dimension reduction with PCA; Dimension reduction of the fish measurements; A tf-idf word-frequency array; Clustering Wikipedia part I; Clustering Wikipedia part II
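Here is a hedged PCA sketch with scikit-learn on synthetic correlated data: inspect the explained variances to gauge the intrinsic dimension, then keep only the leading components. For sparse tf-idf arrays like the Wikipedia data, a PCA variant such as scikit-learn's TruncatedSVD is typically used instead. The data and component counts below are placeholders.

```python
# PCA sketch: decorrelate features and reduce dimension (synthetic data, not the course's measurements).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
x = rng.normal(size=200)
# Three features, two of which are strongly correlated.
samples = np.column_stack([x, 2 * x + rng.normal(scale=0.3, size=200), rng.normal(size=200)])

# Fit PCA and inspect the variance explained by each principal component;
# the number of large variances suggests the intrinsic dimension.
pca = PCA()
pca.fit(samples)
print(pca.explained_variance_)

# Keep only the leading components to reduce the dimension of the data.
pca2 = PCA(n_components=2)
reduced = pca2.fit_transform(samples)
print(reduced.shape)  # (200, 2)
```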
4. Discovering Interpretable Features
In this chapter, you'll learn about a dimension reduction technique called "Non-negative matrix factorization" ("NMF") that expresses samples as combinations of interpretable parts. For example, it expresses documents as combinations of topics, and images in terms of commonly occurring visual patterns. You'll also learn to use NMF to build recommender systems that can find you similar articles to read, or musical artists that match your listening history!
Lessons and exercises: Non-negative matrix factorization (NMF); Non-negative data; NMF applied to Wikipedia articles; NMF features of the Wikipedia articles; NMF reconstructs samples; NMF learns interpretable parts; NMF learns topics of documents; Explore the LED digits dataset; NMF learns the parts of images; PCA doesn't learn parts; Building recommender systems using NMF; Which articles are similar to 'Cristiano Ronaldo'?; Recommend musical artists part I; Recommend musical artists part II; Final thoughts
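To illustrate this chapter's endpoint, the sketch below factors a tiny made-up tf-idf array with NMF and then recommends similar documents by cosine similarity of the normalized NMF features. The documents, titles, and topic count are invented for illustration, not the course's Wikipedia or artist data.

```python
# NMF sketch: factor a non-negative array into interpretable parts, then use
# the NMF features with cosine similarity to recommend similar items.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF
from sklearn.preprocessing import normalize
import pandas as pd

documents = [
    "football player scores goal in final",
    "midfielder transfers to new football club",
    "new smartphone features faster processor",
    "chip maker unveils faster mobile processor",
]
titles = ["match report", "transfer news", "phone launch", "chip launch"]

# tf-idf word-frequency array (non-negative, so NMF applies).
tfidf = TfidfVectorizer().fit_transform(documents)

# Express each document as a combination of 2 topics.
nmf = NMF(n_components=2, random_state=0)
features = nmf.fit_transform(tfidf)  # document-topic weights
# nmf.components_ holds the topic-word weights (the interpretable parts).

# Recommend: cosine similarity between normalized NMF features.
norm_features = normalize(features)
df = pd.DataFrame(norm_features, index=titles)
query = df.loc["match report"]
similarities = df.dot(query)
print(similarities.nlargest(3))  # the query itself plus its nearest neighbours
```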
In other tracks
Machine Learning Scientist

Datasets
Company stock price movements, Eurovision 2016, Fish measurements, Grains, LCD digits, Musical artists, Wikipedia articles, Wine
Prerequisites
Supervised Learning with scikit-learn

Benjamin Wilson, Director of Research at lateral.io
Ben is a machine learning specialist and the director of research at lateral.io. He is passionate about learning and has worked as a data scientist in real-time bidding, e-commerce, and recommendation. Ben holds a PhD in mathematics and a degree in computer science.