Course

Anomaly Detection in Python

Skill Level: Intermediate

4.8+

Updated 11/2025

Detect anomalies in your data analysis and expand your Python statistical toolkit in this four-hour course.

Python, Probability & Statistics

4 hr

16 videos

59 Exercises

4,950 XP

7,196

Statement of Accomplishment

Course Description

Spot Anomalies in Your Data Analysis

Extreme values or anomalies are present in almost any dataset, and it is critical to detect and deal with them before continuing statistical exploration. When left untouched, anomalies can easily disrupt your analyses and skew the performance of machine learning models.

Learn to Use Estimators Like Isolation Forest and Local Outlier Factor

In this course, you'll leverage Python to implement a variety of anomaly detection methods. You'll spot extreme values visually and use tested statistical techniques like Median Absolute Deviation for univariate datasets. For multivariate data, you'll learn to use estimators such as Isolation Forest, k-Nearest-Neighbors, and Local Outlier Factor. You'll also learn how to ensemble multiple outlier classifiers into a low-risk final estimator. You'll walk away with an essential data science tool in your belt: anomaly detection with Python.

Expand Your Python Statistical Toolkit

Better anomaly detection means a better understanding of your data and, in particular, better root cause analysis and clearer communication about system behavior. Adding this skill to your existing Python repertoire will help you with data cleaning, fraud detection, and identifying system disturbances.

Prerequisites

Supervised Learning with scikit-learn

1. Detecting Univariate Outliers

This chapter covers techniques to detect outliers in 1-dimensional data using histograms, scatterplots, box plots, z-scores, and modified z-scores.

What are anomalies and outliers?

Print a 5-number summary

Histograms for outlier detection

Scatterplots for outlier detection

Box plots and IQR

Boxplots for outlier detection

Calculating outlier limits with IQR

Using outlier limits for filtering

Using z-scores for Anomaly Detection

Finding outliers with z-scores

Using modified z-scores with PyOD
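
As a rough illustration of the IQR limits and modified z-scores named in this chapter, here is a minimal sketch that uses only Python's standard library. The course itself works with PyOD for modified z-scores; the sample data, the function names, the 1.5 IQR factor, and the 3.5 cutoff below are assumptions chosen for the example, not values taken from the course.

```python
import statistics


def iqr_limits(values, factor=1.5):
    """Return (lower, upper) outlier limits based on the interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - factor * iqr, q3 + factor * iqr


def modified_z_scores(values):
    """Modified z-score: 0.6745 * (x - median) / MAD."""
    med = statistics.median(values)
    mad = statistics.median(abs(x - med) for x in values)
    if mad == 0:
        raise ValueError("MAD is zero; modified z-scores are undefined")
    return [0.6745 * (x - med) / mad for x in values]


# Hypothetical sample with one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 95]

lower, upper = iqr_limits(data)
iqr_outliers = [x for x in data if x < lower or x > upper]

# 3.5 is a commonly quoted cutoff for modified z-scores (an assumption here).
mad_outliers = [x for x, z in zip(data, modified_z_scores(data)) if abs(z) > 3.5]

print(lower, upper, iqr_outliers, mad_outliers)
```

Both approaches rely on the median rather than the mean, which is why a single extreme value does not distort the limits themselves.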

2. Isolation Forests with PyOD

In this chapter, you’ll learn the ins and outs of how the Isolation Forest algorithm works. Explore how Isolation Trees are built, the essential parameters of PyOD's IForest and how to tune them, and how to interpret the output of IForest using outlier probability scores.

Getting started with Isolation Forests

The difference between univariate and multivariate anomalies

Detecting outliers with IForest

Overview of Isolation Forest hyperparameters

Most important IForest parameters

Choosing contamination

Choosing n_estimators

Checking the theory

Hyperparameter tuning of Isolation Forest

Tuning contamination

Tuning multiple hyperparameters

Interpreting the output of IForest

Alternative way of classifying with IForest

Using outlier probabilities
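
To give a feel for how isolation trees separate points, the following is a simplified, standard-library sketch and not PyOD's IForest. It assumes that each "tree" is grown lazily along the path of the point being scored, with fresh random splits per point, whereas a real Isolation Forest builds its trees once on subsamples and reuses them. All names, the toy data, and the parameter defaults are hypothetical.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary search tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def path_length(point, sample, rng, depth=0, max_depth=8):
    """Follow `point` down one randomly grown isolation tree and return its depth."""
    if depth >= max_depth or len(sample) <= 1:
        return depth + c_factor(len(sample))
    feature = rng.randrange(len(point))
    lo = min(row[feature] for row in sample)
    hi = max(row[feature] for row in sample)
    if lo == hi:
        # Cannot split on this feature; stop early (a simplification).
        return depth + c_factor(len(sample))
    split = rng.uniform(lo, hi)
    # Keep only the side of the split that contains `point`.
    if point[feature] < split:
        side = [row for row in sample if row[feature] < split]
    else:
        side = [row for row in sample if row[feature] >= split]
    return path_length(point, side, rng, depth + 1, max_depth)


def anomaly_score(point, data, n_trees=100, seed=0):
    """Score near 1 means easy to isolate; scores around 0.5 or below look ordinary."""
    rng = random.Random(seed)
    mean_depth = sum(path_length(point, data, rng) for _ in range(n_trees)) / n_trees
    return 2.0 ** (-mean_depth / c_factor(len(data)))


# Hypothetical data: a 5 x 5 grid of inliers plus one distant point.
grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
data = grid + [(20.0, 20.0)]

scores = {p: anomaly_score(p, data) for p in data}
most_anomalous = max(scores, key=scores.get)
print(most_anomalous, round(scores[most_anomalous], 3))
```

The distant point is usually cut off by the very first random split, so its average path is short and its score is high. In this sketch, `n_trees` plays the role that `n_estimators` plays in the chapter.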

3. Distance and Density-based Algorithms

After working with a tree-based outlier classifier, you will explore distance-based and density-based detectors. KNN and Local Outlier Factor are widely used classifiers in this area, and you will learn how to apply them.

KNN for outlier detection

KNN for the first time

KNN with outlier probabilities

Outlier-robust feature scaling

Finding the euclidean distance manually

Finding the euclidean distance with SciPy

Practicing standardization

Testing QuantileTransformer

Hyperparameters of KNN

Differentiating distance metrics

Calculating manhattan distance manually

Tuning n_neighbors

Tuning the aggregation method

Local Outlier Factor

LOF for the first time

LOF with outlier probabilities
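
The manual distance calculations and the KNN aggregation idea from this chapter can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not PyOD's KNN: the function names, the toy points, and the `method` options are hypothetical and only loosely mirror the "aggregation method" lesson above.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_scores(points, n_neighbors=2, metric=euclidean, method="largest"):
    """Score each point by its distance to its nearest neighbors."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(metric(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:n_neighbors]
        if method == "largest":
            scores.append(nearest[-1])  # distance to the k-th neighbor
        elif method == "mean":
            scores.append(sum(nearest) / len(nearest))
        else:
            raise ValueError(f"unknown method: {method}")
    return scores


# Hypothetical data: four clustered points and one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8)]
scores = knn_scores(points)
farthest = points[scores.index(max(scores))]

print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)), farthest)
```

Because the scores are raw distances, features on very different scales would dominate the result, which is the motivation for the scaling lessons in this chapter.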

4. Time Series Anomaly Detection and Outlier Ensembles

In this chapter, you’ll learn how to perform anomaly detection on time series datasets and make your predictions more stable and trustworthy using outlier ensembles.

Introduction to time series

Working with DateTime columns

Creating a DateTimeIndex

MAD on time series

Isolation Forest on time series

Time Series Decomposition for Outlier Detection

Practicing decomposition

Fitting on residuals

Outlier classifier ensembles

Scaling parts of a dataset

Manual outlier ensembles - creating the arrays

Storing outlier probabilities

Aggregating and thresholding the probabilities

How to deal with identified outliers

Classifying the reasons for outlier presence

When to drop outliers

Non-aggressive methods of dealing with outliers

Congratulations!
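
The manual ensemble lessons in this chapter store outlier probabilities from several detectors, then aggregate and threshold them. A minimal sketch of that idea is shown below; the probability values, the detector labels, and the 0.65 threshold are made up for illustration and do not come from the course.

```python
import statistics

# Hypothetical outlier probabilities for five rows from three detectors.
probs = {
    "iforest": [0.10, 0.20, 0.90, 0.15, 0.60],
    "knn": [0.05, 0.30, 0.85, 0.10, 0.40],
    "lof": [0.20, 0.25, 0.95, 0.05, 0.80],
}


def aggregate(prob_lists, how=statistics.mean):
    """Combine per-detector probabilities row by row."""
    return [how(row) for row in zip(*prob_lists)]


mean_probs = aggregate(probs.values())
median_probs = aggregate(probs.values(), how=statistics.median)

# Flag a row only when the combined probability clears the threshold (assumed value).
threshold = 0.65
is_outlier = [p > threshold for p in mean_probs]

print(is_outlier)
```

The last row shows why aggregation can be the lower-risk choice: one detector rates it at 0.80, but the combined estimate stays below the threshold because the other two disagree.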

Don’t just take our word for it

4.8 from 175 reviews

Rating distribution (5 stars to 1 star): 87%, 12%, 1%, 0%, 0%

FAQs

What anomaly detection methods does this course teach?

You will learn z-scores, modified z-scores, Isolation Forest with PyOD, Local Outlier Factor, and how to combine multiple outlier classifiers for a reliable final estimate.

Is this course focused on univariate or multivariate anomaly detection?

Both. It starts with univariate outlier detection using visual and statistical methods, then progresses to multivariate techniques like Isolation Forest and Local Outlier Factor.

What Python libraries are used for anomaly detection?

You will use PyOD for Isolation Forest and other outlier detection algorithms, alongside pandas, scikit-learn, and standard visualization tools for analysis and plotting.

What practical applications can I pursue after this course?

You can apply these techniques to data cleaning, fraud detection, network intrusion detection, manufacturing quality control, and identifying unusual system behavior.

Do I need machine learning experience before enrolling?

Yes. You should have completed supervised learning with scikit-learn and introductory statistics in Python, along with solid pandas and intermediate Python skills.
