
Markov Chain Analysis in R

In this tutorial, you'll learn what a Markov chain is and use one to analyze sales velocity data in R.
Aug 2018  · 8 min read

What is a Markov Chain?

A Markov chain is a mathematical system that experiences transitions from one state to another according to a given set of probabilistic rules. Markov chains are stochastic processes, but they are distinguished by being "memoryless": the probability of the next state of the system depends only on the present state, not on any prior states. This is called the Markov property (seen below):

P(X_{n+1} = x | X_1 = x_1, X_2 = x_2, ..., X_n = x_n) = P(X_{n+1} = x | X_n = x_n)

In order to have a functional Markov chain model, it is essential to define a transition matrix P. The transition matrix contains the probabilities of transitioning between the different states in the system. For a transition matrix to be valid, each row must be a probability vector: all of its entries are non-negative and sum to 1.
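As a minimal sketch in base R (the two-state matrix below is made up purely for illustration), validity can be checked by confirming that every entry is non-negative and every row sums to 1:

```r
# A hypothetical two-state transition matrix: each row holds the
# probabilities of moving from that state to every state
P <- matrix(c(0.7, 0.3,
              0.4, 0.6),
            nrow = 2, byrow = TRUE,
            dimnames = list(c('A', 'B'), c('A', 'B')))

# Valid: every entry is non-negative and each row sums to 1
all(P >= 0) && all(abs(rowSums(P) - 1) < 1e-12)
```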

Transition matrices have the property that the product of subsequent matrices describes the transition probabilities over an interval of multiple steps. Therefore, one can model how probable it is to be in a certain state after k steps by calculating the following:

P(X_k = j | X_0 = i) = (P^k)_ij
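As a quick sanity check of this in base R (a sketch using the three sales stages and the industry probabilities that appear later in this tutorial), squaring the one-step matrix gives the two-step transition probabilities:

```r
P <- matrix(c(0.5573215, 0.4426785, 0.0000000,
              0.0000000, 0.8678118, 0.1321882,
              0.0000000, 0.0000000, 1.0000000),
            nrow = 3, byrow = TRUE,
            dimnames = list(c('SDR', 'AE', 'CW'), c('SDR', 'AE', 'CW')))

# Two-step transition probabilities: entry (i, j) of P %*% P is the
# probability of being in state j two steps after starting in state i
P2 <- P %*% P

# Each row of a matrix power is still a probability vector
rowSums(P2)
```

For example, the only two-step path from SDR to CW is SDR → AE → CW, so `P2["SDR","CW"]` equals the product of those two one-step probabilities.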
This tutorial will also cover absorbing Markov chains. These chains contain at least one state that, once reached, cannot be left: the probability of staying in it is 1.

What are Absorbing Markov Chains?

An absorbing Markov chain is a Markov chain in which it is impossible to leave at least one state (an absorbing state) once it is entered. However, this is only one of the prerequisites. For a chain to be an absorbing Markov chain, every transient state must also be able to reach an absorbing state in some number of steps, so that absorption eventually happens with probability 1.

Absorbing Markov chains have properties that differentiate them from time-homogeneous Markov chains in general. One of these is the way in which the transition matrix can be written. For a chain with t transient states and r absorbing states, the transition matrix P can be written in canonical form as follows:

P = | Q   R   |
    | 0   I_r |

Where Q is a t x t matrix, R is a t x r matrix, 0 is an r x t zero matrix, and I_r is an r x r identity matrix. This decomposition of the transition matrix allows certain calculations, such as the expected number of steps until absorption from each transient state, via the fundamental matrix N, which is calculated as follows:

N = (I_t - Q)^(-1)

Where I_t is the t x t identity matrix.

The expected number of steps until absorption follows from linearity of expectation and is calculated as follows:

t = N1

Where 1 is a column vector of the same length as the number of transient states with all entries equal to 1.

Furthermore, we can calculate the probability of being absorbed by a specific absorbing state when starting from any given transient state. This probability is calculated as follows:

B = NR

Where B is a t x r matrix whose (i, j) entry is the probability that the chain, starting from the i-th transient state, is eventually absorbed by the j-th absorbing state.
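The sales chain analyzed below has a single absorbing state, so every deal is eventually absorbed there with probability 1; absorption probabilities become interesting when there is more than one absorbing state. Here is a minimal base R sketch with one hypothetical transient state ('Open') and two absorbing states ('Won' and 'Lost'); the probabilities are made up for illustration:

```r
# One transient state ('Open') that closes as 'Won' with probability 0.2,
# as 'Lost' with probability 0.1, and stays open with probability 0.7
Q <- matrix(0.7, nrow = 1, ncol = 1)        # transient-to-transient block
R <- matrix(c(0.2, 0.1), nrow = 1,
            dimnames = list('Open', c('Won', 'Lost')))

N <- solve(diag(1) - Q)  # fundamental matrix
B <- N %*% R             # absorption probabilities
B
# Starting from 'Open', the deal is eventually won with probability
# 0.2 / (0.2 + 0.1) = 2/3 and lost with probability 1/3
```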
Analysis of Sales Velocity

Markov chains are widely used in many fields, such as finance, game theory, and genetics. The focus of this tutorial, however, is using them to model the length of a company's sales process, since this process can plausibly be treated as a Markov process. This was in fact validated by testing whether sequences detailing the steps a deal went through before successfully closing complied with the Markov property.

This analysis assumed that, for a given industry, the probability of a deal moving forward in our sales process was constant from month to month, which allowed us to use time-homogeneous Markov chains: chains in which the transition probabilities between states stay constant as time goes on (as the number of steps k increases).

The probabilities that were calculated were as follows:

  • The probability of a deal moving from our sales representative stages to our account executive stages vs. staying in them for a given month.
  • The probability of a deal moving from our account executive stages to a successfully closed deal vs. staying in them for a given month.
  • The probability of staying in a successfully closed deal, which was 1, thereby making closed an absorbing state.

This analysis was conducted using the R programming language. R has a handy package called markovchain that can handle a vast array of Markov chain types.

To begin with, the first thing we did was check whether our sales sequences followed the Markov property. To that end, the markovchain package provides a handy function called verifyMarkovProperty() that tests whether a given sequence of events follows the Markov property by performing chi-square tests on a series of contingency tables derived from the sequence. A large p-value indicates that the null hypothesis that the sequence follows the Markov property should not be rejected. An example sales sequence showing where a deal was each month, from the first meeting to when it was closed, is provided below:

# SDR Funnel is our sales representative stages, AE Funnel is our account executive stages, and CW is a successfully closed deal
library(markovchain)
seq <- c('SDR Funnel','SDR Funnel','AE Funnel','AE Funnel','AE Funnel','AE Funnel','AE Funnel','AE Funnel','CW')
verifyMarkovProperty(seq)
## Testing markovianity property on given data sequence
## Chi - square statistic is: 0.5733333  degrees of freedom are: 27  and corresponding p-value is: 1

Since the p-value shown is above 0.05, we do not reject the null hypothesis that the sequence follows the Markov property.

Once that was verified, we plotted the Markov chain structure along with the transition probabilities that were derived from our data. This code instantiates a Markov chain object by defining the transition matrix as well as the names of the states. It also displays the Markov chain and the transition probabilities.

# GatherTransMat() is a custom helper (not shown) that builds the transition matrix from our deal data
transElec <- GatherTransMat('Manufacturing','Between 100M and 500M', 'Between 500 and 1k')

markov2 <- new('markovchain',
              transitionMatrix = transElec, # These are the transition probabilities of a random industry
              states = c('SDR','AE','CW'))

layout <- matrix(c(0,0,0,1,1,0), ncol = 2, byrow = TRUE)
plot(markov2, node.size = 10, layout = layout)
##           [,1]      [,2]      [,3]
## [1,] 0.5573215 0.4426785 0.0000000
## [2,] 0.0000000 0.8678118 0.1321882
## [3,] 0.0000000 0.0000000 1.0000000
[Plot: Markov chain diagram showing the transition probabilities between the SDR, AE, and CW states]

Since we have an absorbing Markov chain, we can calculate the expected time until absorption. The first entry of the resulting vector gives the expected number of steps until closing when starting from the SDR funnel, while the second entry gives the expected number when starting from the AE funnel.

# Extract Q from the transition matrix
Q <- transElec[1:2, 1:2]
# Generate It
It <- diag(2)
# Calculate the fundamental matrix
N <- solve(It - Q)
# Generate a column vector of 1s
one <- matrix(1, nrow = 2)

# Calculate the expected steps by matrix multiplication
expected <- N %*% one
##          [,1]
## [1,] 9.823945
## [2,] 7.564969

We can also visualize how the state probabilities change as the number of steps increases, to complement the expected number of steps. Thus, we also checked the probabilities of a deal being in any of the three stages over 24 steps.


initState <- c(1,0,0) # The initial state will be SDR Funnel (mimicking setting an initial discovery appointment)

# Initiate probability vectors
SDRProb <- c()
AEProb <- c()
CW <- c()

# Calculate probabilities for 24 steps.
for(k in 1:24){
  nsteps <- initState * markov2^k
  SDRProb[k] <- nsteps[1,1]
  AEProb[k] <- nsteps[1,2]
  CW[k] <- nsteps[1,3]
}

# Make data frames (one per stage) and merge them
SDRProb <- as.data.frame(SDRProb)
SDRProb$Group <- 'SDR'
SDRProb$Iter <- 1:24
names(SDRProb)[1] <- 'Value'

AEProb <- as.data.frame(AEProb)
AEProb$Group <- 'AE'
AEProb$Iter <- 1:24
names(AEProb)[1] <- 'Value'

CW <- as.data.frame(CW)
CW$Group <- 'CW'
CW$Iter <- 1:24
names(CW)[1] <- 'Value'

steps <- rbind(SDRProb,AEProb,CW)

# Plot the probabilities using ggplot
library(ggplot2)
ggplot(steps, aes(x = Iter, y = Value, col = Group)) +
  geom_line() +
  xlab('Chain Step') +
  ylab('Probability') +
  ggtitle('24 Step Chain Probability Prediction')+
  theme(plot.title = element_text(hjust = 0.5))

If we combine both results, it seems apparent that the CW state becomes the most likely state after 6-9 steps. Given that the transition probabilities are monthly transition probabilities, you can argue that the typical sales velocity from initial appointment set to successfully closing a deal for that industry is between 6 and 9 months. That's quite a lengthy sales process.
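The crossover point can also be checked numerically with base R matrix powers, using the industry transition matrix printed above (a sketch that finds the first step at which CW is the most likely state):

```r
P <- matrix(c(0.5573215, 0.4426785, 0.0000000,
              0.0000000, 0.8678118, 0.1321882,
              0.0000000, 0.0000000, 1.0000000),
            nrow = 3, byrow = TRUE,
            dimnames = list(c('SDR', 'AE', 'CW'), c('SDR', 'AE', 'CW')))

dist <- c(1, 0, 0)  # start in the SDR stage
first_cw <- NA
for (k in 1:24) {
  dist <- dist %*% P       # state distribution after k steps
  if (which.max(dist) == 3) {  # column 3 is CW
    first_cw <- k
    break
  }
}
first_cw
# For this matrix, CW first overtakes the other stages at step 8,
# consistent with the 6-9 month range read off the plot
```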


The goal of this analysis was to show how the basic principles of Markov chains and absorbing Markov chains can be used to answer a question relevant to business. In this case, the Markov chain results were quite accurate despite the time-homogeneity assumption, since further empirical analyses revealed that the average sales velocity for the industry used in this analysis was 208 days, which is almost 7 months. Hopefully, this example will encourage you to explore Markov chains further on your own and apply them to your own business questions.

If you are interested in learning more about R, take DataCamp's Intermediate R course.

Check out our Markov Chains in Python: Beginner Tutorial.
