Handling Missing Data with Imputations in R

Diagnose, visualize, and treat missing data with a range of imputation techniques, and pick up tips to improve your results.
4 hours, 13 videos, 49 exercises, 4200 XP




Course Description

Missing data is everywhere. The process of filling in missing values is known as imputation, and knowing how to correctly fill in missing data is an essential skill if you want to produce accurate predictions and distinguish yourself from the crowd. In this course, you’ll learn how to use visualizations and statistical tests to recognize missing data patterns and how to impute data using a collection of statistical and machine learning models. You’ll also gain decision-making skills, helping you decide which imputation method fits best in a particular situation. Finally, you’ll learn to incorporate uncertainty from imputation into your inference and predictions, making them more robust and reliable.

  1. The problem of missing data

    Free
    In this chapter, you’ll find out why missing data can be a risk when analyzing a dataset. You’ll be introduced to the three missing data mechanisms (missing completely at random, missing at random, and missing not at random) and learn how to recognize them using statistical tests and visualization tools.
  2. Donor-based imputation

    Get to know the taxonomy of imputation methods and learn three donor-based techniques: mean, hot-deck, and k-Nearest-Neighbors imputation. You’ll look under the hood to see how these methods work, before learning how to apply them to a real-world tropical weather dataset. Along the way, you’ll also learn useful tricks that you can use to make them work even better for your problems.
  3. Model-based imputation

    It’s time to learn how to use statistical and machine learning models, such as linear regression, logistic regression, and random forests, to impute missing data. In this chapter, you’ll look into how the models make their predictions and use this knowledge to draw the imputed values from conditional distributions. This matters because drawing from a distribution, rather than always using the single predicted value, keeps your imputations varied and plausible, so they resemble the true data more closely.
  4. Uncertainty from imputation

    Imputed values are not set in stone. They are just estimates, and estimates come with some uncertainty. In this final chapter, you’ll discover how bootstrapping and chained equations, using the mice package, can be used to incorporate imputation uncertainty into your models and analyses to make them more reliable and robust.
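
To make the ideas in the chapter outline above a little more concrete, here is a minimal base R sketch. It is not taken from the course: it assumes R's built-in `airquality` data rather than the course datasets, and the variable names (`aq`, `ozone_mean_imp`, `ozone_model_imp`, `impute_once`) are made up for illustration. It counts missing values, applies mean imputation, and then imputes from a linear regression by adding random noise to the predictions, which is one simple way of drawing from a conditional distribution. The final step just repeats that random draw to show that an estimate changes from one imputation to the next; it is a stand-in for intuition, not the bootstrap or the mice workflow the course teaches.

```r
# Illustrative sketch only: built-in airquality data, hypothetical variable names.
set.seed(42)
aq <- airquality  # Ozone and Solar.R contain NAs; Temp and Wind are complete

# 1. Diagnose: count missing values per column.
na_counts <- colSums(is.na(aq))
print(na_counts)

# 2. Mean imputation: replace each missing Ozone value with the observed mean.
ozone_missing <- is.na(aq$Ozone)
ozone_mean_imp <- aq$Ozone
ozone_mean_imp[ozone_missing] <- mean(aq$Ozone, na.rm = TRUE)

# 3. Model-based imputation: regress Ozone on fully observed predictors
#    (lm() drops rows with missing Ozone by default), then add noise with the
#    residual standard deviation instead of using the bare prediction.
#    Simplification: normal noise can yield implausible (e.g. negative) values.
fit <- lm(Ozone ~ Temp + Wind, data = aq)
pred <- predict(fit, newdata = aq[ozone_missing, ])
ozone_model_imp <- aq$Ozone
ozone_model_imp[ozone_missing] <- pred + rnorm(sum(ozone_missing), sd = sigma(fit))

# Mean imputation shrinks the spread of the variable; compare the two results.
print(c(observed  = sd(aq$Ozone, na.rm = TRUE),
        mean_imp  = sd(ozone_mean_imp),
        model_imp = sd(ozone_model_imp)))

# 4. Repeat the random draw to see how an estimate varies across imputations.
impute_once <- function() {
  out <- aq$Ozone
  out[ozone_missing] <- pred + rnorm(sum(ozone_missing), sd = sigma(fit))
  mean(out)
}
mean_estimates <- replicate(20, impute_once())
print(range(mean_estimates))
```

The spread of `mean_estimates` is a rough picture of why a single imputed dataset understates uncertainty. A fuller treatment would also account for uncertainty in the regression coefficients themselves, which this sketch ignores.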
Datasets
Biopics dataset, Tropical Atmosphere Ocean dataset
Collaborators
Adel Nehme, Amy Peterson

Michał Oleszak

Machine Learning Engineer
Michał is a Machine Learning Engineer with a background in statistics and econometrics, holding degrees from Erasmus University Rotterdam, The Netherlands and Warsaw School of Economics, Poland. He is the author of the pmpp R package for forecasting with panel data. Having worked at a data science consultancy, he has gained experience in squeezing value from messy and incomplete data. He's currently shaping the future at an AI startup. Visit his homepage to find out more.

What do other learners have to say?

I've used other sites—Coursera, Udacity, things like that—but DataCamp's been the one that I've stuck with.

Devon Edwards Joseph
Lloyds Banking Group

DataCamp is the top resource I recommend for learning data science.

Louis Maiden
Harvard Business School

DataCamp is by far my favorite website to learn from.

Ronald Bowers
Decision Science Analytics, USAA

Join over 6 million learners and start Handling Missing Data with Imputations in R today!
