Handling Missing Data with Imputations in R

Skill level: Advanced
Rating: 4.7 from 85 reviews
Updated 10/2022
Diagnose, visualize, and treat missing data using a range of imputation techniques, with tips to improve your results.
R | Data Manipulation | 4 hr | 13 videos | 49 exercises | 4,200 XP | 6,074 | Statement of Accomplishment


Course Description

Missing data is everywhere. The process of filling in missing values is known as imputation, and knowing how to correctly fill in missing data is an essential skill if you want to produce accurate predictions and distinguish yourself from the crowd. In this course, you’ll learn how to use visualizations and statistical tests to recognize missing data patterns and how to impute data using a collection of statistical and machine learning models. You’ll also gain decision-making skills, helping you decide which imputation method fits best in a particular situation. Finally, you’ll learn to incorporate uncertainty from imputation into your inference and predictions, making them more robust and reliable.

Prerequisites

Intermediate Regression in R
Dealing With Missing Data in R
1. The Problem of Missing Data

In this chapter, you’ll find out why missing data can be a risk when analyzing a dataset. You’ll be introduced to the three missing data mechanisms and learn how to recognize them using statistical tests and visualization tools.
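
As a rough illustration of recognizing a mechanism with a statistical test, the base R sketch below simulates data in which one variable goes missing more often when another is high, then compares the second variable between the missing and observed groups. The variable names (humidity, temp) and the simulated values are assumptions made up for this example; they are not the course's dataset or its exact procedure.

```r
# Simulated data (hypothetical): temp is more likely to be missing when humidity is high.
set.seed(1)
n <- 500
humidity <- rnorm(n, mean = 70, sd = 10)
temp <- rnorm(n, mean = 25, sd = 3)
p_miss <- plogis((humidity - 70) / 5)  # missingness probability rises with humidity
temp[runif(n) < p_miss] <- NA

# Compare humidity between rows where temp is missing and rows where it is observed.
temp_missing <- is.na(temp)
test <- t.test(humidity ~ temp_missing)
test$p.value
```

A small p-value here is evidence against missing completely at random, because missingness is related to an observed variable. A test of this kind cannot separate missing at random from missing not at random, since that distinction depends on the unobserved values themselves.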
2. Donor-Based Imputation

Get to know the taxonomy of imputation methods and learn three donor-based techniques: mean, hot-deck, and k-Nearest-Neighbors imputation. You’ll look under the hood to see how these methods work, before learning how to apply them to a real-world tropical weather dataset. Along the way, you’ll also learn useful tricks that you can use to make them work even better for your problems.
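
To make the donor idea concrete, here is a minimal base R sketch of mean imputation and a one-predictor k-nearest-neighbors imputation. The toy data frame and the helper knn_impute are hypothetical, written only for this illustration; the course may well use dedicated packages instead.

```r
# Toy data (made up for illustration): air_temp has two missing values.
df <- data.frame(
  sea_temp = c(24.1, 25.3, 26.0, 27.2, 28.4, 29.0),
  air_temp = c(23.0,   NA, 25.1, 26.4,   NA, 28.2)
)

# Mean imputation: every gap receives the mean of the observed values.
mean_imputed <- df$air_temp
mean_imputed[is.na(mean_imputed)] <- mean(df$air_temp, na.rm = TRUE)

# kNN imputation with a single predictor: average air_temp over the k donors
# whose sea_temp is closest to that of the incomplete row.
knn_impute <- function(target, predictor, k = 2) {
  donors <- which(!is.na(target))
  out <- target
  for (i in which(is.na(target))) {
    dist <- abs(predictor[donors] - predictor[i])
    nearest <- donors[order(dist)[seq_len(k)]]
    out[i] <- mean(target[nearest])
  }
  out
}
knn_imputed <- knn_impute(df$air_temp, df$sea_temp, k = 2)
```

Mean imputation fills both gaps with the same value, while the kNN version fills each gap from the rows most similar to it, so the two imputed values differ.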
3. Model-Based Imputation

It’s time to learn how to use statistical and machine learning models, such as linear regression, logistic regression, and random forests, to impute missing data. In this chapter, you’ll look into how the models make their predictions and use this knowledge to draw the imputed values from conditional distributions. This is important as it ensures your imputations are more varied and plausible, making them more similar to the true data.
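
The difference between filling in a model's prediction and drawing from a conditional distribution can be sketched in base R as follows. The data are simulated and the setup is an assumption for illustration: a linear regression is fit on the complete rows, and each missing value is drawn from a normal distribution centered on its prediction with the model's residual standard deviation. This simplified version ignores uncertainty in the fitted coefficients.

```r
# Simulated data (hypothetical): y depends linearly on x, and 40 values of y are missing.
set.seed(42)
n <- 200
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n, sd = 1.5)
y[sample(n, 40)] <- NA
dat <- data.frame(x = x, y = y)

# Fit the imputation model on the rows where y is observed.
miss <- is.na(dat$y)
fit <- lm(y ~ x, data = dat[!miss, ])
pred <- predict(fit, newdata = dat[miss, ])

# Deterministic imputation: plug in the predicted mean.
y_pred <- dat$y
y_pred[miss] <- pred

# Stochastic imputation: draw from a normal centered on the prediction,
# using the residual standard deviation of the fitted model.
y_draw <- dat$y
y_draw[miss] <- rnorm(sum(miss), mean = pred, sd = summary(fit)$sigma)
```

The plugged-in predictions all lie exactly on the regression line, which understates the spread of the data; the drawn values scatter around the line in roughly the way the observed values do.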
4. Uncertainty from Imputation

Imputed values are not set in stone. They are just estimates, and estimates come with some uncertainty. In this final chapter, you’ll discover how bootstrapping and chained equations using the mice package can be used to incorporate imputation uncertainty into your models and analyses to make them more reliable and robust.
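
One way to picture the bootstrapping idea is the base R sketch below: resample the rows, impute within each resample, fit the analysis model, and look at how much the estimate varies across replicates. The simulated data and the helper impute_y are assumptions made for this illustration; this is not the mice package's chained-equations procedure.

```r
# Simulated data (hypothetical): y depends on x, and 30 values of y are missing.
set.seed(7)
n <- 150
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
y[sample(n, 30)] <- NA
dat <- data.frame(x = x, y = y)

# Impute y from x with a regression fit on the complete rows, adding residual noise.
impute_y <- function(d) {
  miss <- is.na(d$y)
  if (!any(miss)) return(d)
  fit <- lm(y ~ x, data = d[!miss, ])
  d$y[miss] <- predict(fit, newdata = d[miss, ]) +
    rnorm(sum(miss), sd = summary(fit)$sigma)
  d
}

# Bootstrap: resample rows, impute within each replicate, then estimate the slope.
boot_slopes <- replicate(200, {
  resampled <- dat[sample(nrow(dat), replace = TRUE), ]
  coef(lm(y ~ x, data = impute_y(resampled)))[["x"]]
})

# Percentile interval for the slope, reflecting both sampling and imputation variability.
quantile(boot_slopes, c(0.025, 0.975))
```

Because imputation is repeated inside every replicate, the width of the resulting interval includes the variability introduced by the imputation step rather than treating the imputed values as if they had been observed.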


FAQs

What imputation methods are taught in this course?

You will learn mean imputation, hot-deck imputation, k-nearest neighbors, linear regression, logistic regression, random forests, and multiple imputation using the mice package.

How does this course differ from Dealing With Missing Data in R?

Dealing With Missing Data in R is a prerequisite that covers recognition and basic handling. This advanced course focuses specifically on imputation techniques and incorporating imputation uncertainty.

What are the three missing data mechanisms and will I learn to identify them?

They are missing completely at random, missing at random, and missing not at random. Chapter 1 teaches you to recognize them using statistical tests and visualizations.

What real-world dataset is used for practice?

You will apply imputation techniques to a tropical weather dataset, practicing donor-based and model-based methods on real missing data patterns.

Does the course cover uncertainty in imputed values?

Yes. The final chapter teaches bootstrapping and chained equations with the mice package to incorporate imputation uncertainty into your analyses and make results more robust.
