New Course: Unsupervised Learning in Python

We are launching our new course Unsupervised Learning in Python by Benjamin Wilson!
Feb 2017  · 3 min read

Course description

Say you have a collection of customers with a variety of characteristics such as age, location, and financial history, and you wish to discover patterns and sort them into clusters. Or perhaps you have a set of texts, such as Wikipedia pages, and you wish to segment them into categories based on their content. This is the world of unsupervised learning, so called because you are not guiding, or supervising, the pattern discovery with a prediction task, but instead uncovering hidden structure from unlabeled data. Unsupervised learning encompasses a variety of techniques in machine learning, from clustering to dimension reduction to matrix factorization.

In this course, you'll learn the fundamentals of unsupervised learning and implement the essential algorithms using scikit-learn and scipy. You will learn how to cluster, transform, visualize, and extract insights from unlabeled datasets, and end the course by building a recommender system to recommend popular musical artists.

Start for free

Unsupervised Learning in Python features interactive exercises that combine high-quality video, in-browser coding, and gamification for an engaging learning experience that will make you a master at Data Science with Python!

What you'll learn

Chapter one: Clustering for dataset exploration

In the first chapter, you'll learn how to discover the underlying groups (or "clusters") in a dataset. By the end of the chapter, you'll be clustering companies using their stock market prices, and distinguishing different species by clustering their measurements. Start the first chapter for free here.
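To give a flavor of what this chapter covers, here is a minimal clustering sketch with scikit-learn's k-means. The data is synthetic (two well-separated blobs), not the stock-price or species datasets used in the course:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy dataset: two well-separated blobs of 2-D points
rng = np.random.default_rng(42)
points = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])

# Fit k-means with 2 clusters and assign each point a cluster label
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(points)

print(len(set(labels)))  # 2
```

Because the blobs are far apart, each one lands in its own cluster; on real data, choosing the number of clusters is part of the work.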

Chapter two: Visualization with hierarchical clustering and t-SNE

In the second chapter, you'll learn about two unsupervised learning techniques for data visualization: hierarchical clustering and t-SNE.
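As a taste of hierarchical clustering, here is a small sketch using scipy's agglomerative clustering on synthetic data (the course also covers t-SNE, which is omitted here):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy dataset: two distant blobs of 2-D points
rng = np.random.default_rng(0)
samples = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(4.0, 0.3, size=(20, 2)),
])

# Agglomerative (hierarchical) clustering: repeatedly merge
# the closest groups, building a tree of merges
mergings = linkage(samples, method="complete")

# Cut the tree so that exactly 2 flat clusters remain
labels = fcluster(mergings, t=2, criterion="maxclust")
print(len(set(labels)))  # 2
```

The `mergings` array can also be passed to `scipy.cluster.hierarchy.dendrogram` to visualize the full merge tree, which is the visualization angle this chapter takes.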

Chapter three: Decorrelating your data and dimension reduction

The third chapter is about the most fundamental of dimension reduction techniques, "Principal Component Analysis" ("PCA"). PCA is often used before supervised learning to improve model performance and generalization.
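A minimal PCA sketch with scikit-learn, on made-up data where one feature is nearly a copy of another, so most of the variance fits in fewer dimensions:

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy data: 3 features, but the third is a noisy copy of the first,
# so almost all the variance lives in 2 directions
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = rng.normal(size=100)
data = np.column_stack([x, y, x + 0.01 * rng.normal(size=100)])

# Project onto the 2 directions of highest variance;
# the resulting features are decorrelated
pca = PCA(n_components=2)
transformed = pca.fit_transform(data)

print(transformed.shape)  # (100, 2)
print(pca.explained_variance_ratio_.sum())  # close to 1.0
```

Dropping a redundant dimension like this, before fitting a supervised model, is the "improve performance and generalization" use mentioned above.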

Chapter four: Discovering interpretable features

Finally, you'll end the course by learning about a dimension reduction technique called "Non-negative matrix factorization" ("NMF") that expresses samples as combinations of interpretable parts. For example, it expresses documents as combinations of topics, and images in terms of commonly occurring visual patterns.
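To illustrate, here is a small NMF sketch with scikit-learn on a made-up document-term matrix (word counts per document), not the course's actual datasets:

```python
import numpy as np
from sklearn.decomposition import NMF

# Tiny "document-term" matrix: rows are documents, columns are word counts;
# the first two documents share vocabulary, as do the last two
docs = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
], dtype=float)

# NMF factors the matrix into two non-negative parts:
# W (documents x topics) and H (topics x words)
model = NMF(n_components=2, init="nndsvda", random_state=0, max_iter=500)
W = model.fit_transform(docs)
H = model.components_

print(W.shape, H.shape)  # (4, 2) (2, 4)
```

Because both factors are non-negative, each row of `H` reads as a "topic" (a bundle of words) and each row of `W` says how much of each topic a document contains, which is what makes the parts interpretable.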

About Ben

Ben is a machine learning specialist and a director of research. He is passionate about learning and has worked as a data scientist in real-time bidding, e-commerce, and recommender systems. Ben holds a PhD in mathematics and a degree in computer science.

Start course for free