
Multivariate Probability Distributions in R

Learn to analyze, plot, and model multivariate data.

4 Hours | 15 Videos | 51 Exercises | 6,681 Learners | 4000 XP


Course Description

When working with data that contain many variables, we are often interested in studying the relationships among those variables using multivariate statistics. In this course, you'll learn ways to analyze such datasets. You will also learn about common multivariate probability distributions, including the multivariate normal, the multivariate t, and some multivariate skew distributions. You will then be introduced to techniques for representing high-dimensional data in fewer dimensions, including principal component analysis (PCA) and multidimensional scaling (MDS).

  1. Reading and plotting multivariate data


    In this introduction to multivariate data, you will learn how to read it into R and summarize it using descriptive statistics such as the mean vector, the variance-covariance matrix, and the correlation matrix. You'll then explore plotting techniques that provide insight into multivariate data.

    Reading multivariate data (50 xp)
    Reading multivariate data using read.table (100 xp)
    Specifying datatypes for columns (100 xp)
    Mean vector and variance-covariance matrix (50 xp)
    Calculating the mean vector (100 xp)
    Calculating the variance-covariance matrix (100 xp)
    Calculating the correlation matrix (100 xp)
    Plotting multivariate data (50 xp)
    Pairs plot using base graphics and lattice (100 xp)
    Plotting multivariate data using ggplot (100 xp)
    3D plotting techniques (100 xp)
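
As a rough illustration of the summaries named in this chapter, the sketch below computes a mean vector, a variance-covariance matrix, and a correlation matrix in base R. The built-in `iris` data is an assumption used only as a stand-in; it is not necessarily a dataset used in the course.

```r
# Illustrative only: the built-in iris data stands in for the course datasets.
x <- iris[, 1:4]         # keep the four numeric measurement columns

mu    <- colMeans(x)     # mean vector (one entry per variable)
sigma <- cov(x)          # 4 x 4 variance-covariance matrix
rho   <- cor(x)          # 4 x 4 correlation matrix

# The correlation matrix is the covariance matrix rescaled by the
# standard deviations, so these two should agree.
same_matrix <- isTRUE(all.equal(rho, cov2cor(sigma)))

# Base-graphics pairs plot, with points coloured by species.
pairs(x, col = iris$Species)
```

Your own data could be loaded with `read.table()` in place of `iris`; the summary calls stay the same as long as the columns are numeric.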
  2. Multivariate Normal Distribution

    This chapter will introduce you to the most important and widely used multivariate probability distribution, the multivariate normal. You will learn how to generate random samples from a multivariate normal distribution and how to calculate and plot densities and probabilities under this distribution. You will also learn how to test whether a dataset follows a multivariate normal distribution.

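
A minimal sketch of the ideas in this chapter, under stated assumptions: it draws samples with `mvrnorm()` from the MASS package (one of several possible tools, and not necessarily the one the course uses), writes the density out from its formula in a hypothetical helper named `dmvn_manual`, and runs an informal normality check based on squared Mahalanobis distances. The mean vector and covariance matrix are made-up values.

```r
library(MASS)  # recommended package bundled with standard R; provides mvrnorm()

set.seed(1)
mu    <- c(0, 2)                       # assumed mean vector
sigma <- matrix(c(1.0, 0.6,
                  0.6, 2.0), nrow = 2, byrow = TRUE)  # assumed covariance

# Draw 1000 samples from the bivariate normal N(mu, sigma).
samples <- mvrnorm(n = 1000, mu = mu, Sigma = sigma)

# Hypothetical helper: multivariate normal density at a single point x,
# written directly from the density formula.
dmvn_manual <- function(x, mu, sigma) {
  p <- length(mu)
  d <- x - mu
  q <- drop(t(d) %*% solve(sigma) %*% d)   # squared Mahalanobis distance
  exp(-0.5 * q) / sqrt((2 * pi)^p * det(sigma))
}
dens_at_mean <- dmvn_manual(mu, mu, sigma)  # density at the mode

# Informal check: under multivariate normality the squared Mahalanobis
# distances are approximately chi-square with p = 2 degrees of freedom.
d2 <- mahalanobis(samples, center = colMeans(samples), cov = cov(samples))
qqplot(qchisq(ppoints(length(d2)), df = 2), d2,
       xlab = "Chi-square quantiles", ylab = "Squared Mahalanobis distance")
abline(0, 1)
```

Points lying close to the reference line suggest the sample is consistent with multivariate normality; formal tests exist in add-on packages but are not shown here.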
  3. Other Multivariate Distributions

    This chapter introduces a host of probability distributions for modeling non-normal data. In particular, you will be introduced to multivariate t-distributions, which can model heavier tails and generalize the univariate Student's t-distribution. You will also be introduced to various skew distributions, which are specifically designed to model data that are right or left skewed.

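
To make the "heavier tails" point concrete, here is a small sketch that builds multivariate t samples from the standard construction (a multivariate normal vector divided by the square root of an independent chi-square variable over its degrees of freedom). This is an illustration under assumed settings (`df = 4`, identity scale matrix), not the course's own code; dedicated packages offer ready-made samplers.

```r
library(MASS)  # for mvrnorm(); bundled with standard R installations

set.seed(42)
n  <- 5000
df <- 4                      # assumed degrees of freedom (small = heavy tails)
sigma <- diag(2)             # assumed scale matrix

z <- mvrnorm(n, mu = c(0, 0), Sigma = sigma)   # multivariate normal draws
w <- rchisq(n, df = df)                        # independent chi-square draws

# Each row of z is divided by its own scalar sqrt(w / df); R recycles the
# length-n vector down the columns, so row i uses element i.
t_samples <- z / sqrt(w / df)

# Share of points beyond the 99% chi-square contour of the normal:
cutoff      <- qchisq(0.99, df = 2)
tail_normal <- mean(rowSums(z^2) > cutoff)
tail_t      <- mean(rowSums(t_samples^2) > cutoff)
c(normal = tail_normal, t = tail_t)
```

The t samples place noticeably more mass beyond the cutoff than the normal samples, which is the heavier-tail behaviour the chapter description refers to.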




Nick Solomon, Chester Ismay, Amy Peterson

Surajit Ray

Senior Lecturer in Statistics, University of Glasgow

Surajit is a Professor of Statistics at the University of Glasgow's School of Mathematics & Statistics. His research interests include model selection, the theory and geometry of mixture models, and functional data analysis. He is especially interested in the challenges presented by "large magnitude", both in the dimension of data vectors and in the number of vectors. He is the author of the R package Modalclust. He is also a founding board member of, and instructor for, the Online MSc in Data Analytics at the University of Glasgow.
