Interactive Course

Modeling with Data in the Tidyverse

Explore Linear Regression in a tidy framework.

• 4 hours
• 17 Videos
• 49 Exercises
• 4,162 Participants
• 3,900 XP

Course Description

In this course, you will learn to model with data. Models attempt to capture the relationship between an outcome variable of interest and a series of explanatory/predictor variables. Such models can be used for both explanatory purposes, e.g. "Does knowing professors' ages help explain their teaching evaluation scores?", and predictive purposes, e.g., "How well can we predict a house's price based on its size and condition?" You will leverage your tidyverse skills to construct and interpret such models. This course centers around the use of linear regression, one of the most commonly-used and easy to understand approaches to modeling. Such modeling and thinking is used in a wide variety of fields, including statistics, causal inference, machine learning, and artificial intelligence.

1. 1

Introduction to Modeling

Free

This chapter will introduce you to some background theory and terminology for modeling, in particular, the general modeling framework, the difference between modeling for explanation and modeling for prediction, and the modeling problem. Furthermore, you'll start performing your first exploratory data analysis, a crucial first step before any formal modeling.

2. Modeling with Multiple Regression

In the previous chapter, you learned about basic regression using either a single numerical or a categorical predictor. But why limit ourselves to using only one variable to inform your explanations/predictions? You will now extend basic regression to multiple regression, which allows for incorporation of more than one explanatory or one predictor variable in your models. You'll be modeling house prices using a dataset of houses in the Seattle, WA metropolitan area.

3. Modeling with Basic Regression

Equipped with your understanding of the general modeling framework, in this chapter, we'll cover basic linear regression where you'll keep things simple and model the outcome variable y as a function of a single explanatory/ predictor variable x. We'll use both numerical and categorical x variables. The outcome variable of interest in this chapter will be teaching evaluation scores of instructors at the University of Texas, Austin.

4. Model Assessment and Selection

In the previous chapters, you fit various models to explain or predict an outcome variable of interest. However, how do we know which models to choose? Model assessment measures allow you to assess how well an explanatory model "fits" a set of data or how accurate a predictive model is. Based on these measures, you'll learn about criteria for determining which models are "best".

1. 1

Introduction to Modeling

Free

This chapter will introduce you to some background theory and terminology for modeling, in particular, the general modeling framework, the difference between modeling for explanation and modeling for prediction, and the modeling problem. Furthermore, you'll start performing your first exploratory data analysis, a crucial first step before any formal modeling.

2. Modeling with Basic Regression

Equipped with your understanding of the general modeling framework, in this chapter, we'll cover basic linear regression where you'll keep things simple and model the outcome variable y as a function of a single explanatory/ predictor variable x. We'll use both numerical and categorical x variables. The outcome variable of interest in this chapter will be teaching evaluation scores of instructors at the University of Texas, Austin.

3. Modeling with Multiple Regression

In the previous chapter, you learned about basic regression using either a single numerical or a categorical predictor. But why limit ourselves to using only one variable to inform your explanations/predictions? You will now extend basic regression to multiple regression, which allows for incorporation of more than one explanatory or one predictor variable in your models. You'll be modeling house prices using a dataset of houses in the Seattle, WA metropolitan area.

4. Model Assessment and Selection

In the previous chapters, you fit various models to explain or predict an outcome variable of interest. However, how do we know which models to choose? Model assessment measures allow you to assess how well an explanatory model "fits" a set of data or how accurate a predictive model is. Based on these measures, you'll learn about criteria for determining which models are "best".

What do other learners have to say?

“I've used other sites, but DataCamp's been the one that I've stuck with.”

Devon Edwards Joseph

Lloyd's Banking Group

“DataCamp is the top resource I recommend for learning data science.”

Louis Maiden

Harvard Business School

“DataCamp is by far my favorite website to learn from.”

Ronald Bowers

Decision Science Analytics @ USAA

Albert Y. Kim

Assistant Professor of Statistical & Data Sciences at Smith College

Albert completed his Ph.D. in Statistics at the University of Washington. He is a co-author of the fivethirtyeight R package as well as ModernDive, an online textbook for introductory statistics and data science using R. Previously he worked in the Search Ads Metrics Team at Google Inc. as well as at Reed, Middlebury, and Amherst Colleges. Find out more about his work on Twitter @rudeboybert and on his webpage.

See More
Collaborators
• Chester Ismay

• Sumedh Panchadhar

• Benjamin Feder