# Cluster Analysis in Python

Rating: 4+ (11 reviews)
Level: Beginner

In this course, you will be introduced to unsupervised learning through techniques such as hierarchical and k-means clustering using the SciPy library.

4 Hours, 14 Videos, 46 Exercises
47,853 Learners

## Course Description

You have probably come across Google News, which automatically groups similar news articles under a topic. Have you ever wondered what process runs in the background to arrive at these groups? In this course, you will be introduced to unsupervised learning through clustering using the SciPy library in Python. The course covers pre-processing of data and the application of hierarchical and k-means clustering. Throughout the course, you will explore player statistics from a popular football video game, FIFA 18. After completing the course, you will be able to quickly apply various clustering algorithms to data, visualize the clusters formed, and analyze the results.

### 1. Introduction to Clustering

Before you are ready to classify news articles, you need to be introduced to the basics of clustering. This chapter familiarizes you with a class of machine learning algorithms called unsupervised learning and then introduces you to clustering, one of the most popular unsupervised learning techniques. You will learn about two popular clustering techniques - hierarchical clustering and k-means clustering. The chapter concludes with the basic pre-processing steps to perform before you start clustering data.

• Unsupervised learning: basics (50 xp)
• Unsupervised learning in real world (50 xp)
• Pokémon sightings (100 xp)
• Basics of cluster analysis (50 xp)
• Pokémon sightings: hierarchical clustering (100 xp)
• Pokémon sightings: k-means clustering (100 xp)
• Data preparation for cluster analysis (50 xp)
• Normalize basic list data (100 xp)
• Visualize normalized data (100 xp)
• Normalization of small numbers (100 xp)
• FIFA 18: Normalize data (100 xp)
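
As a rough illustration of the normalization step this chapter ends with, the sketch below rescales each feature with `whiten` from `scipy.cluster.vq`. The feature matrix is made up for illustration and is not the course's FIFA 18 or Pokémon data; whether the course uses exactly this call on its datasets is an assumption.

```python
import numpy as np
from scipy.cluster.vq import whiten

# Hypothetical feature matrix (not course data):
# rows are observations, columns are features on very different scales.
data = np.array([
    [5.0, 200.0],
    [3.0, 180.0],
    [8.0, 260.0],
    [6.0, 220.0],
])

# whiten divides each column by its standard deviation,
# so every feature ends up with unit variance.
scaled = whiten(data)

print(scaled.std(axis=0))
```

After rescaling, no single feature dominates the distance calculations that clustering relies on.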

### 2. Hierarchical Clustering

This chapter focuses on a popular clustering algorithm - hierarchical clustering - and its implementation in SciPy. In addition to the procedure for performing hierarchical clustering, it helps you answer an important question - how many clusters are present in your data? The chapter concludes with a discussion of the limitations of hierarchical clustering and the points to consider when using it.
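
A minimal sketch of hierarchical clustering with SciPy, assuming synthetic two-dimensional points rather than the course data. The choice of Ward linkage and of two clusters is illustrative only; the chapter may use other settings.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Synthetic data (an assumption for illustration): two well-separated blobs.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.5, size=(20, 2)),
    rng.normal(5.0, 0.5, size=(20, 2)),
])

# Build the merge tree; each row of Z records one merge and its distance.
Z = linkage(points, method="ward", metric="euclidean")

# Cut the tree so that at most two flat clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")

# Compute the dendrogram layout without drawing it (no_plot=True).
# Large jumps in merge distance hint at a sensible number of clusters.
tree = dendrogram(Z, no_plot=True)

print(np.unique(labels))
```

Dropping `no_plot=True` draws the dendrogram with matplotlib, which is the usual way to inspect it visually.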

### 3. K-Means Clustering

This chapter introduces a different clustering algorithm - k-means clustering - and its implementation in SciPy. K-means clustering overcomes the biggest drawback of hierarchical clustering, which was discussed in the previous chapter. Because dendrograms are specific to hierarchical clustering, this chapter discusses one method for finding the number of clusters before running k-means clustering. The chapter concludes with a discussion of the limitations of k-means clustering and the points to consider when using this algorithm.
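
The sketch below shows one common way to pick the number of clusters for k-means: run it for several values of k and compare the distortion (an elbow-style comparison). It assumes synthetic data with three blobs, and the source does not say which method the chapter teaches, so treat this as one possibility rather than the course's approach.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq, whiten

# Synthetic data (an assumption for illustration): three separated blobs.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.5, size=(30, 2)),
    rng.normal(5.0, 0.5, size=(30, 2)),
    rng.normal(10.0, 0.5, size=(30, 2)),
])
scaled = whiten(points)

# Record the distortion (mean distance to the nearest centroid) for each k.
distortions = []
for k in range(1, 7):
    _, distortion = kmeans(scaled, k)
    distortions.append(distortion)

# Fit the chosen k and assign every observation to its nearest centroid.
centers, _ = kmeans(scaled, 3)
labels, _ = vq(scaled, centers)

print(distortions)
```

Plotting `distortions` against k and looking for the point where the curve flattens gives a visual estimate of the number of clusters. Results vary slightly between runs because the initial centroids are chosen at random.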

### 4. Clustering in Real World

Now that you are familiar with two of the most popular clustering techniques, this chapter helps you apply this knowledge to real-world problems. The chapter first discusses the process of finding dominant colors in an image, before moving on to the problem raised in the introduction - clustering of news articles. The chapter concludes with a discussion of clustering with multiple variables, where it becomes difficult to visualize all the data.
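
To make the dominant-color idea concrete, here is a hedged sketch that treats each pixel as an (R, G, B) observation and takes the k-means centroids as the dominant colors. The pixel array is generated synthetically as a stand-in for a real image, and the choice of two colors is an assumption for illustration.

```python
import numpy as np
from scipy.cluster.vq import kmeans, whiten

# Synthetic stand-in for an image (an assumption): 200 reddish pixels
# and 200 bluish pixels, one row per pixel, columns are R, G, B.
rng = np.random.default_rng(0)
red = np.clip(rng.normal([200, 30, 30], 5, size=(200, 3)), 0, 255)
blue = np.clip(rng.normal([30, 30, 200], 5, size=(200, 3)), 0, 255)
pixels = np.vstack([red, blue])

# Keep the per-channel standard deviations so the centroids
# can be converted back to the 0-255 scale afterwards.
stds = pixels.std(axis=0)
scaled = whiten(pixels)

# Each centroid is one dominant color in the scaled space.
centers, _ = kmeans(scaled, 2)
dominant_colors = centers * stds

print(dominant_colors.round())
```

With a real image, the pixel array would come from an image-loading library and be reshaped to one row per pixel before clustering.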

In the following tracks

Machine Learning Scientist with Python

Prerequisites

Intermediate Python

Shaumik Daityari

Shaumik is a business analyst at American Express by day and a comic book enthusiast by night (or maybe he's Batman?). He has master's degrees from IIT Roorkee and IIM Lucknow, but apparently neither was as fun as coding in Python all day. Shaumik has been writing tutorials and creating screencasts for over five years. When not working, he's busy automating daily tasks through Python scripts!

## Don’t just take our word for it

4 from 11 reviews

Rating distribution: 36%, 27%, 36%, 0%, 0%

• Charles L.

Covers many useful operations required in actual CA, which I believe will allow me to perform actual CA on my own.

• Sofia K.
2 months

Excellent

• David T.
3 months

Great. Concise and precise

• Tyler J.
7 months

Good content

• Pierre-Etienne T.
7 months

This course is a good introduction to clustering analysis. Unfortunately, I wanted to learn advanced clustering algorithms like DBSCAN and spectrum analysis.
