
Probability Mass Function: A Guide to Discrete Distributions

Learn how the probability mass function defines discrete probability distributions. Explore its properties, examples, and differences from probability density functions.
Jun 20, 2025  · 8 min read

In probability and statistics, the likelihood of an outcome for discrete random variables is quantified by the probability mass function (PMF), whereas for continuous variables, we use the probability density function (PDF).

Whether modeling coin flips, equipment failures, or user clicks on a webpage, the PMF helps us quantify the likelihood of different outcomes.

Understanding the Probability Mass Function

The probability mass function formula

Mathematically, the probability mass function of a discrete random variable X is defined as:

p(x) = P(X = x)

Where:

  • X is a discrete random variable,
  • x is a value that X can take,
  • p(x) is the probability that X equals x.

Two critical conditions must be met for a valid PMF:

1. Non-negativity: p(x) ≥ 0 for every value x that X can take.

2. Total probability must equal one: Σ p(x) = 1, where the sum runs over all possible values of X.

These conditions ensure that all assigned probabilities make logical and mathematical sense. The PMF provides a complete description of the distribution of X, making it possible to compute expected values, variances, and other statistical measures.

Definition and key properties

The PMF has several defining properties that distinguish it:

  • Discreteness: The PMF only applies to discrete random variables, those that can take on countable values (like 0, 1, 2, ...).
  • Normalization: The sum of the PMF over all possible values of X must be exactly 1.
  • Individual probability assignment: Each possible value x is assigned a specific probability p(x), which captures how likely that particular outcome is.

For example, the PMF for rolling a fair six-sided die is:

p(x) = 1/6, for x ∈ {1, 2, 3, 4, 5, 6}

Here, each value from 1 to 6 has an equal chance of appearing, and the sum of all probabilities is 1.
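
These two conditions are easy to verify in code. Here is a minimal sketch in Python that checks them for the die PMF:

# PMF of a fair six-sided die
pmf = {x: 1/6 for x in range(1, 7)}

# Condition 1: non-negativity
assert all(prob >= 0 for prob in pmf.values())

# Condition 2: total probability equals one (allowing for floating-point rounding)
assert abs(sum(pmf.values()) - 1.0) < 1e-9

print("Valid PMF:", pmf)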

Probability Mass Function vs. Probability Density Function

Let’s look at the key differences between PMFs and PDFs and how they model discrete and continuous distributions, respectively:

| Characteristic | PMF | PDF |
| --- | --- | --- |
| Variable type | Discrete | Continuous |
| Value output | Probability of an exact outcome: P(X = x) | Probability density f(x), which is not itself a probability |
| Summation rule | Σ p(x) = 1 (sum over all x) | ∫ f(x) dx = 1 (integral over all x) |

PMF example

Let’s take a classic PMF example: tossing a fair coin (a Bernoulli trial):

P(X = 1) = 0.5 (heads) and P(X = 0) = 0.5 (tails)

Let’s perform a Bernoulli trial in Python to understand this better.

import matplotlib.pyplot as plt
from scipy.stats import bernoulli
# Parameters
p = 0.5  # Probability of Heads
# PMF for a fair coin
x = [0, 1]  # 0 = Tails, 1 = Heads
pmf_values = bernoulli.pmf(x, p)

# Plotting
plt.bar(x, pmf_values, tick_label=['Tails (0)', 'Heads (1)'])
plt.title('PMF of a Fair Coin Toss (Bernoulli Trial)')
plt.ylabel('Probability')
plt.xlabel('Outcome')
plt.grid(axis='y', linestyle='--')
plt.show()

PDF example

Here’s a PDF example: measuring a person’s height (continuous variable).

The PDF might show that the probability density is highest around 170 cm, but the probability of any exact height is technically 0.

Again, let’s understand this with a code example.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Parameters
mu = 170   # mean height
sigma = 10  # standard deviation

# Continuous range of heights
x = np.linspace(130, 210, 500)

pdf_values = norm.pdf(x, mu, sigma)

# Plotting
plt.plot(x, pdf_values, label='PDF')
plt.title('PDF of Human Height (Normal Distribution)')
plt.xlabel('Height (cm)')
plt.ylabel('Probability Density')
plt.grid(True)
plt.axvline(mu, color='red', linestyle='--', label='Mean (170 cm)')
plt.legend()
plt.show()

While PMFs assign actual probabilities to specific outcomes, PDFs describe the relative likelihood of outcomes within an interval.
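
To make this contrast concrete, the short sketch below reuses the height distribution from the PDF example: the value norm.pdf returns at a point is a density, while an interval gets an actual probability from the CDF (the integral of the PDF):

from scipy.stats import norm

mu, sigma = 170, 10  # same mean and standard deviation as above

# The PDF value at exactly 170 cm is a density, not a probability
print("Density at 170 cm:", norm.pdf(170, mu, sigma))

# P(X = 170) is 0 for a continuous variable; intervals carry the probability
p_interval = norm.cdf(175, mu, sigma) - norm.cdf(165, mu, sigma)
print("P(165 <= X <= 175):", p_interval)  # about 0.383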

Common Discrete Distributions Using PMF

Common examples of discrete probability distributions are the Bernoulli, binomial, geometric, and Poisson distributions. Let's take a look at each.

PMF of the Bernoulli distribution

The Bernoulli distribution models a single experiment with only two possible outcomes: success (1) and failure (0):

p(x) = p^x (1 − p)^(1 − x), for x ∈ {0, 1}

Where p is the probability of success.

Here is an example: tossing a fair coin where success = heads gives p = 0.5, so P(X = 1) = P(X = 0) = 0.5.

PMF of the binomial distribution

The Binomial distribution generalizes the Bernoulli to multiple trials:

P(X = x) = C(n, x) p^x (1 − p)^(n − x), for x = 0, 1, ..., n

Where:

  • n is the number of independent trials,
  • p is the probability of success in each trial,
  • x is the number of successes,
  • C(n, x) = n! / (x!(n − x)!) is the binomial coefficient, counting the ways to choose which x of the n trials are successes.

One example would be counting defective items in a batch of products.
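
To evaluate this PMF in code, here is a minimal sketch using scipy.stats.binom; the batch size of 20 and the 5% defect rate are made-up numbers for illustration:

from scipy.stats import binom

n, p = 20, 0.05  # hypothetical batch of 20 items with a 5% defect rate

# P(X = x) for x defective items in the batch
for x in range(4):
    print(f"P(X = {x}) = {binom.pmf(x, n, p):.4f}")

# The PMF sums to 1 over its full support 0..n
print("Total:", sum(binom.pmf(x, n, p) for x in range(n + 1)))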

PMF of the Geometric distribution

The Geometric distribution is used to model the number of failures before the first success in repeated independent Bernoulli trials:

P(X = x) = (1 − p)^x p, for x = 0, 1, 2, ...

One common example of this would be the number of tails flipped before the first head appears.
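
One caveat when coding this: scipy.stats.geom counts the number of trials up to and including the first success (support starting at 1), so shifting it with loc=-1 matches the failures-before-first-success form used above. A minimal sketch:

from scipy.stats import geom

p = 0.5  # probability of heads on each flip

# loc=-1 shifts scipy's geometric distribution from "trials until first success"
# (support 1, 2, ...) to "failures before first success" (support 0, 1, ...)
for x in range(4):
    print(f"P({x} tails before the first head) = {geom.pmf(x, p, loc=-1):.4f}")
    # each value matches (1 - p)**x * p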

PMF of the Poisson distribution

The Poisson distribution models the number of events in a fixed interval of time or space:

P(X = x) = (λ^x e^(−λ)) / x!, for x = 0, 1, 2, ...

Where λ is the expected number of events per interval.

An example of this is the number of customer arrivals at a service center per hour.
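
Assuming a hypothetical average of λ = 4 arrivals per hour, a quick sketch with scipy.stats.poisson evaluates this PMF:

from scipy.stats import poisson

lam = 4  # hypothetical average of 4 arrivals per hour

# P(X = x) customers arriving in a given hour
for x in range(7):
    print(f"P(X = {x}) = {poisson.pmf(x, lam):.4f}")

# Probability of at most 2 arrivals, via the CDF
print("P(X <= 2) =", poisson.cdf(2, lam))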

Visualizing Probability Mass Functions

PMFs are visualized using bar charts, where:

  • The x-axis represents the possible values of the random variable.
  • The y-axis represents the probability assigned by the PMF.

Here is a table showing an example: rolling a fair die.

| X | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| P(X) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |

This uniform PMF would appear as a flat bar chart, where all outcomes are equally likely.
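
Here is one way to draw that chart, following the same matplotlib pattern as the coin-toss example earlier:

import matplotlib.pyplot as plt

# Fair die: every face has probability 1/6
faces = [1, 2, 3, 4, 5, 6]
probs = [1/6] * 6

plt.bar(faces, probs)
plt.title('PMF of a Fair Six-Sided Die')
plt.xlabel('Outcome')
plt.ylabel('Probability')
plt.ylim(0, 0.25)
plt.grid(axis='y', linestyle='--')
plt.show()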

Such visualizations make it easier to:

  • Interpret distributions intuitively,
  • Compare relative probabilities,
  • Spot skewed or symmetric distributions.

Applications of Probability Mass Functions

PMFs are used across various domains:

Statistics

For calculating expected values:

E[X] = Σ x · p(x)

and variances:

Var(X) = Σ (x − E[X])^2 · p(x) = E[X^2] − (E[X])^2

where both sums run over all possible values x.
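
Applied to the fair die from earlier, a minimal sketch of both calculations:

# PMF of a fair six-sided die
pmf = {x: 1/6 for x in range(1, 7)}

# Expected value: E[X] = sum of x * p(x)
mean = sum(x * p for x, p in pmf.items())

# Variance: Var(X) = sum of (x - E[X])^2 * p(x)
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(f"E[X] = {mean:.4f}")        # 3.5
print(f"Var(X) = {variance:.4f}")  # about 2.9167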

Machine learning

In classification algorithms (e.g., Naive Bayes), PMFs describe the likelihoods of categorical features. Probabilistic models like Hidden Markov Models also depend on PMFs for state transitions and emissions.
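
As a rough illustration of the Naive Bayes idea, each class gets an empirical PMF over a categorical feature, estimated from observed counts; the toy data below is made up:

from collections import Counter

# Made-up categorical feature values observed within one class
observations = ['red', 'red', 'blue', 'green', 'red', 'blue']

# Empirical PMF: the relative frequency of each category
counts = Counter(observations)
total = sum(counts.values())
pmf = {value: count / total for value, count in counts.items()}

print(pmf)  # {'red': 0.5, 'blue': 0.333..., 'green': 0.166...}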

Reliability engineering

PMFs help determine the probability of a system or component failing at a specific time step.

Economics and finance

PMFs model market scenarios or investment outcomes where the number of possible outcomes is finite and distinct.

Bayesian inference

PMFs serve as prior distributions in discrete Bayesian models. When new evidence is observed, the prior PMF is updated to a posterior PMF, making Bayesian updating tractable for discrete problems.
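
Here is a small sketch of one such update; the three candidate coin biases and the uniform prior are illustrative assumptions. After observing a head, each prior weight is multiplied by the likelihood of heads under that hypothesis and renormalized:

# Hypothetical candidate biases for a coin, with a uniform prior PMF over them
hypotheses = [0.3, 0.5, 0.7]           # P(heads) under each hypothesis
prior = {h: 1/3 for h in hypotheses}

# Observe one head: posterior is proportional to prior * likelihood
unnormalized = {h: prior[h] * h for h in hypotheses}
evidence = sum(unnormalized.values())
posterior = {h: w / evidence for h, w in unnormalized.items()}

print(posterior)  # mass shifts toward the head-favoring hypotheses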

If you want to learn about how conditional probabilities work, check out our Conditional Probability: A Close Look tutorial.

Monte Carlo Simulations

PMFs are also essential in Monte Carlo simulations, especially for systems with discrete and probabilistic outcomes. These simulations use random sampling from a PMF to estimate a model's behavior or performance over many iterations.
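
For example, here is a minimal sketch that samples from the fair-die PMF with numpy and compares the Monte Carlo estimate of the mean against the exact E[X] = 3.5:

import numpy as np

rng = np.random.default_rng(seed=42)

# Fair-die PMF
values = [1, 2, 3, 4, 5, 6]
probs = [1/6] * 6

# Draw many samples from the PMF and estimate the expected value
samples = rng.choice(values, size=100_000, p=probs)
print("Monte Carlo estimate of E[X]:", samples.mean())  # close to 3.5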

Conclusion

The probability mass function provides the mathematical structure for modeling discrete random variables, unlike the PDF, which deals with continuous variables. PMFs enable powerful modeling across domains, from basic coin tosses to complex machine learning systems.

Key takeaways:

  • PMFs are non-negative and sum to 1 across all possible values.
  • They define the shape and characteristics of discrete probability distributions.
  • PMFs are essential in various applications, from classical statistics to modern AI and finance.
  • In Bayesian inference and Monte Carlo methods, PMFs are central to updating beliefs and simulating uncertain systems.

For those diving deeper into probability theory, exploring cumulative distribution functions (CDFs) and expected value calculations will further enrich your understanding of how uncertainty can be quantified and leveraged in decision-making.


Author: Vidhi Chugh

I am an AI Strategist and Ethicist working at the intersection of data science, product, and engineering to build scalable machine learning systems. Listed as one of the "Top 200 Business and Technology Innovators" in the world, I am on a mission to democratize machine learning and break the jargon for everyone to be a part of this transformation.

PMF FAQs

What is a probability mass function?

A PMF assigns a probability to each possible outcome of a discrete random variable, with all probabilities summing to 1.

How does a PMF differ from a PDF?

A PMF applies to discrete variables, while a PDF applies to continuous variables.

Can a PMF be negative?

No, a PMF must always be non-negative.

What are common distributions that use PMFs?

Examples include the Bernoulli, Binomial, Geometric, and Poisson distributions.

How is a PMF used in real-world applications?

PMFs are widely used in machine learning, finance, reliability engineering, and statistical modeling.
