
Introduction to Unsupervised Learning

Learn about unsupervised learning, its types—clustering, association rule mining, and dimensionality reduction—and how it differs from supervised learning.
Mar 2023  · 9 min read

Unsupervised learning is a machine learning problem type in which training data consists of a set of input vectors but no corresponding target values. The idea behind this type of learning is to group information based on similarities, patterns, and differences. 

Unlike in supervised learning problems, unsupervised learning algorithms do not require input-to-output mappings to learn a mapping function—this is what is meant when we say, “no teacher is provided to the learning algorithm.” Consequently, an unsupervised learning algorithm cannot perform classification or regression.  

The role of an unsupervised learning algorithm is to discover the underlying structure of an unlabeled dataset by itself. 

Supervised vs Unsupervised Learning 

In the table below, we’ve compared some of the key differences between unsupervised and supervised learning: 


| Supervised Learning | Unsupervised Learning |
| --- | --- |
| Approximates a function that maps inputs to outputs based on example input-output pairs. | Builds a concise representation of the data and derives insights from it. |
| Highly accurate and reliable. | Less accurate and reliable. |
| Simpler method. | Computationally complex. |
| Number of classes is known. | Number of classes is unknown. |
| Uses a desired output value (also called the supervisory signal). | Has no corresponding output values. |

Types of Unsupervised Learning

In the introduction, we mentioned that unsupervised learning is a method we use to group data when no labels are present. Because there are no labels, unsupervised learning methods are typically applied to build a concise representation of the data from which we can derive meaningful insights. 

For example, if we were releasing a new product, we could use unsupervised learning methods to identify the target market for the new product, since there is no historical information about who the target customers are or their demographics. 

Unsupervised learning can be broken down into three main tasks: 

  • Clustering
  • Association rules
  • Dimensionality reduction

Let’s delve deeper into each one:


Clustering

From a theoretical standpoint, instances within the same group tend to share similar properties. You can observe this phenomenon in the periodic table: elements in the same group (the same column of the table) have the same number of electrons in the outermost shells of their atoms and form bonds of the same type. 

This is the idea at play in clustering algorithms: clustering methods group unlabeled data based on similarities and differences. When two instances fall into different groups, we can infer that they have dissimilar properties. 

Clustering is a popular type of unsupervised learning approach. You can even break it down further into different types of clustering; for example: 

  • Exclusive clustering: Data is grouped such that a single data point belongs exclusively to one cluster. 
  • Overlapping clustering: A soft clustering in which a single data point may belong to multiple clusters with varying degrees of membership. 
  • Hierarchical clustering: A type of clustering in which clusters are built in a hierarchy, so that similar instances end up in the same group and dissimilar instances in different groups. 
  • Probabilistic clustering: Clusters are created using probability distributions.
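
As a minimal sketch of exclusive clustering, here is k-means with scikit-learn on synthetic data (the blob data and the choice of three clusters are illustrative assumptions, not part of the original article):

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Generate synthetic unlabeled data with three natural groups
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Fit k-means; each point is assigned to exactly one cluster
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print(labels[:10])                    # cluster index (0, 1, or 2) for the first 10 points
print(kmeans.cluster_centers_.shape)  # one 2-D centroid per cluster

Because this is exclusive clustering, every point receives exactly one label; an overlapping method would instead return a degree of membership per cluster.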

Association Rule Mining

This type of unsupervised machine learning takes a rule-based approach to discovering interesting relationships between features in a given dataset. It works by using a measure of interest to identify strong rules found within a dataset. 

We typically see association rule mining used for market basket analysis: this is a data mining technique retailers use to gain a better understanding of customer purchasing patterns based on the relationships between various products. 

The most widely used algorithm for association rule learning is the Apriori algorithm. However, other algorithms are used for this type of unsupervised learning, such as the Eclat and FP-growth algorithms. 
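
To make the "measure of interest" concrete, here is a toy, pure-Python sketch of support and confidence, the two measures the Apriori algorithm builds on (the basket data below is hypothetical):

# Hypothetical market-basket transactions
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)

print(support({"bread", "milk"}))       # 0.5: bread and milk co-occur in 2 of 4 baskets
print(confidence({"bread"}, {"milk"}))  # bread appears in 3 baskets; 2 of them also contain milk

Apriori searches for itemsets whose support clears a threshold and then derives rules whose confidence is high; libraries such as mlxtend implement this at scale.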

Dimensionality Reduction

Popular algorithms used for dimensionality reduction include principal component analysis (PCA) and Singular Value Decomposition (SVD). These algorithms seek to transform data from high-dimensional spaces to low-dimensional spaces without compromising meaningful properties in the original data. These techniques are typically deployed during exploratory data analysis (EDA) or data processing to prepare the data for modeling.

It’s helpful to reduce the dimensionality of a dataset during EDA to help visualize data: this is because visualizing data in more than three dimensions is difficult. From a data processing perspective, reducing the dimensionality of the data simplifies the modeling problem.

When more input features are fed into the model, the model must learn a more complex approximation function. This phenomenon is known as the “curse of dimensionality.” 
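
A small numerical sketch of the curse of dimensionality (the uniform data and the contrast measure are illustrative assumptions): as the number of dimensions grows, the gap between the nearest and farthest point shrinks relative to the distances themselves, which makes similarity-based reasoning harder.

import numpy as np

rng = np.random.default_rng(0)

def relative_contrast(n_dims, n_points=500):
    """(max distance - min distance) / min distance from a random query point."""
    points = rng.random((n_points, n_dims))
    query = rng.random(n_dims)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

# Contrast falls off sharply as dimensionality grows
for d in (2, 10, 100, 1000):
    print(d, relative_contrast(d))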

Unsupervised Learning Applications

Most executives would have no problem identifying use cases for supervised machine learning tasks; the same cannot be said for unsupervised learning. 

One reason for this comes down to risk. Unsupervised learning introduces much more risk than supervised learning, since there’s no clear way to measure results against ground truth offline, and it may be too risky to conduct an online evaluation. 

Nonetheless, there are several valuable unsupervised learning use cases at the enterprise level. Beyond using unsupervised techniques to explore data, some common use cases in the real world include: 

  • Natural language processing (NLP). Google News is known to leverage unsupervised learning to categorize articles on the same story from various news outlets. For instance, the results of the football transfer window can all be categorized under football.
  • Image and video analysis. Visual perception tasks such as object recognition leverage unsupervised learning.
  • Anomaly detection. Unsupervised learning is used to identify data points, events, and/or observations that deviate from a dataset's normal behavior.
  • Customer segmentation. Interesting buyer persona profiles can be created using unsupervised learning. This helps businesses understand their customers' common traits and purchasing habits, thus enabling them to align their products accordingly.
  • Recommendation engines. Past purchase behavior coupled with unsupervised learning can help businesses discover data trends that they could use to develop effective cross-selling strategies.
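
As one illustration of the anomaly-detection use case, here is a brief sketch with scikit-learn's IsolationForest on synthetic data (the data, the contamination value, and the model choice are all illustrative assumptions):

import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# 200 "normal" observations near the origin, plus 5 obvious outliers far away
normal = rng.normal(loc=0.0, scale=0.5, size=(200, 2))
outliers = rng.uniform(low=6.0, high=8.0, size=(5, 2))
X = np.vstack([normal, outliers])

# contamination is the expected fraction of anomalies (a rough guess here)
model = IsolationForest(contamination=0.025, random_state=42)
predictions = model.fit_predict(X)  # +1 = normal, -1 = anomaly

print((predictions == -1).sum())  # number of points flagged as anomalies

No labels were used: the model learns what "normal" looks like from the data's own structure and flags the points that deviate from it.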

Unsupervised Learning Example in Python

Principal component analysis (PCA) is the process of computing the principal components and then using them to perform a change of basis on the data. In other words, PCA is an unsupervised learning dimensionality reduction technique. 

It’s useful to reduce the dimensionality of a dataset for two main reasons: 

  1. When there are too many dimensions in a dataset to visualize 
  2. To identify the most predictive n dimensions for feature selection when building a predictive model. 

In this section, we will implement the PCA algorithm in Python on the Iris dataset and then visualize it using matplotlib. Check out this DataCamp Workspace to follow along with the code used in this tutorial. 

Let’s start by importing the necessary libraries and the data.

from sklearn.datasets import load_iris # Dataset
from sklearn.decomposition import PCA # Algorithm
import matplotlib.pyplot as plt # Visualization

# Load the data 
iris_data = load_iris(as_frame=True)

# Preview the first few rows of the feature DataFrame
print(iris_data.data.head())
The iris dataset has four features. Visualizing data in four or more dimensions is impossible because we have no way to picture what such a high-dimensional space looks like. The next best thing we can do is depict it in three dimensions, which is possible but still challenging. 

For example:

Credit: Rishikesh Kumar Rishi
plt.rcParams["figure.figsize"] = [7.00, 3.50]
plt.rcParams["figure.autolayout"] = True

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Extract the four feature columns from the DataFrame
sepal_length = iris_data.data["sepal length (cm)"]
sepal_width = iris_data.data["sepal width (cm)"]
petal_length = iris_data.data["petal length (cm)"]
petal_width = iris_data.data["petal width (cm)"]

# Plot three features on the axes and encode the fourth as color
ax.scatter(sepal_length, sepal_width, petal_length, c=petal_width)
plt.show()


It’s quite difficult to get insights from this visualization because all of the instances are jumbled together: we only have access to a single viewpoint when we visualize the data in three dimensions. 

With PCA, we can reduce the dimensions of the data down to two, which would then make it easier to visualize our data and tell apart the classes. 

Note: Learn how to implement PCA in R in “Principal Component Analysis in R Tutorial.”

# Instantiate PCA with 2 components
pca = PCA(n_components=2)

# Fit the model and transform the features down to two components
iris_data_reduced = pca.fit_transform(iris_data.data)

# Plot the reduced data, coloring points by class
plt.scatter(iris_data_reduced[:, 0], iris_data_reduced[:, 1], c=iris_data.target)
plt.show()

In the code above, we transform the iris dataset features, only keeping two components, and then plot the reduced data in a two-dimensional plane.  

Now, it’s much easier for us to gather information about the data and how the classes are separated. We can use this insight to decide on the next steps to take if we were to fit a machine learning model onto our data. 

Final thoughts

Unsupervised learning refers to a class of problems in machine learning where a model is used to characterize or extract relationships in data. 

In contrast to supervised learning, unsupervised learning algorithms discover the underlying structure of a dataset using only input features. This means unsupervised learning models do not require a teacher to correct them, unlike in supervised learning. 

In this article, you learned the three main types of unsupervised learning, which are association rule mining, clustering, and dimensionality reduction. You also learned several applications of unsupervised learning, and how to do dimensionality reduction using the PCA algorithm in Python. 

Why not check out these resources to continue your education: 


What is unsupervised learning in machine learning?

Unsupervised learning is a type of machine learning where a model is used to discover the underlying structure of a dataset using only input features, without the need for a teacher to correct the model.

What are the main tasks of unsupervised learning?

The main tasks of unsupervised learning are clustering, association rules, and dimensionality reduction.

What is clustering in unsupervised learning?

Clustering is a type of unsupervised learning where untagged data is grouped based on their similarities and differences. This helps to identify groups with similar properties.

What is association rule mining in unsupervised learning?

Association rule mining is a type of unsupervised learning that uses a rule-based approach to discover interesting relationships between features in a dataset. This is commonly used for market basket analysis.

What is dimensionality reduction in unsupervised learning?

Dimensionality reduction is a technique used in unsupervised learning to transform data from high-dimensional spaces to low-dimensional spaces without compromising meaningful properties in the original data. This helps to simplify the modeling problem.

What are some applications of unsupervised learning?

Some applications of unsupervised learning include natural language processing, image and video analysis, anomaly detection, customer segmentation, and recommendation engines.

How can PCA be used for dimensionality reduction in unsupervised learning?

PCA is a popular algorithm used for dimensionality reduction in unsupervised learning. It seeks to transform data from high-dimensional spaces to low-dimensional spaces without compromising meaningful properties in the original data. This can help to simplify the modeling problem and make it easier to visualize data.


What is Named Entity Recognition (NER)? Methods, Use Cases, and Challenges

Explore the intricacies of Named Entity Recognition (NER), a key component in Natural Language Processing (NLP). Learn about its methods, applications, and challenges, and discover how it's revolutionizing data analysis, customer support, and more.
Abid Ali Awan's photo

Abid Ali Awan

9 min

The Curse of Dimensionality in Machine Learning: Challenges, Impacts, and Solutions

Explore The Curse of Dimensionality in data analysis and machine learning, including its challenges, effects on algorithms, and techniques like PCA, LDA, and t-SNE to combat it.
Abid Ali Awan's photo

Abid Ali Awan

7 min

Machine Learning Engineer Salaries in 2023

Find out how much machine learning engineers make around the world at different career stages. Learn how you can become a top-earning machine learning engineer today.
Natassha Selvaraj's photo

Natassha Selvaraj

16 min

What is Continuous Learning? Revolutionizing Machine Learning & Adaptability

A primer on continuous learning: an evolution of traditional machine learning that incorporates new data without periodic retraining.

Yolanda Ferreiro

7 min

What is Natural Language Processing (NLP)? A Comprehensive Guide for Beginners

Explore the transformative world of Natural Language Processing (NLP) with DataCamp’s comprehensive guide for beginners. Dive into the core components, techniques, applications, and challenges of NLP.
Matt Crabtree's photo

Matt Crabtree

11 min

What is Topic Modeling? An Introduction With Examples

Unlock insights from unstructured data with topic modeling. Explore core concepts, techniques like LSA & LDA, practical examples, and more.
Kurtis Pykes 's photo

Kurtis Pykes

13 min

See MoreSee More