# Data Analysis and Statistical Inference

#### Open Course Description

This interactive DataCamp course complements the Coursera course *Data Analysis and Statistical Inference* by Mine Çetinkaya-Rundel. For every lesson on Coursera, you can work through interactive exercises in your browser to master the topics covered.

#### Lab 0: Introduction to R

In this first lab, you'll learn the basics of analyzing data with R. We suggest taking this introductory lab if you are not yet familiar with this powerful open-source language.

#### Lab 1: Introduction to data

Some define statistics as the field that focuses on turning information into knowledge. The first step in that process is to summarize and describe the raw information, that is, the data. In this lab, we will gain insight into public health by generating simple graphical and numerical summaries of a data set collected by the Centers for Disease Control and Prevention (CDC). Because this is a large data set, we will also learn the indispensable skills of data processing and subsetting along the way.

#### Lab 2: Probability

In this lab, we will investigate the phenomenon of hot hands in basketball and, specifically, whether Kobe Bryant has hot hands. We will use simulations in our investigation.
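As a rough illustration of the simulation idea, the sketch below generates shots from an independent shooter (one without hot hands) and records the length of each streak of hits. It is written in Python for illustration only; the lab itself uses R, and this is not the lab's code. The function names, the 133 shots, and the 45% hit probability are all assumptions chosen for the example.

```python
import random
from collections import Counter


def simulate_shots(n_shots, p_hit, rng):
    """Simulate an independent shooter: each shot is a hit ('H') with probability p_hit."""
    return ["H" if rng.random() < p_hit else "M" for _ in range(n_shots)]


def streak_lengths(shots):
    """Return the length of every run of hits that ends with a miss or the end of the data.

    A miss that directly follows another miss counts as a streak of length 0.
    """
    streaks = []
    current = 0
    for shot in shots:
        if shot == "H":
            current += 1
        else:
            streaks.append(current)
            current = 0
    streaks.append(current)  # close the final streak
    return streaks


# Hypothetical settings for illustration (not taken from the lab's data).
rng = random.Random(42)
shots = simulate_shots(n_shots=133, p_hit=0.45, rng=rng)
streaks = streak_lengths(shots)

# Frequency table of streak lengths for the simulated independent shooter.
for length, count in sorted(Counter(streaks).items()):
    print(f"streak length {length}: {count}")
```

Comparing a table like this with the streaks of a real player is one way to judge whether long streaks are more common than independence alone would produce.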

#### Lab 3A: Foundations for inference: Sampling distributions

In the first part of this two-part lab, we will investigate sampling distributions and the Central Limit Theorem. We will use housing data from Ames, Iowa (a small town in the US) in our exploration.
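A sampling distribution can be sketched by drawing many samples from a population and looking at how the sample means vary. The example below is a minimal Python sketch (the lab itself uses R) and uses a synthetic, right-skewed population as a stand-in; it does not use the Ames data, and the population size, sample size, and number of samples are assumptions.

```python
import math
import random
import statistics

rng = random.Random(0)

# Synthetic right-skewed stand-in population (NOT the Ames housing data).
population = [rng.expovariate(1 / 1500) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)


def sample_means(population, sample_size, n_samples, rng):
    """Draw n_samples random samples without replacement and return each sample's mean."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


sample_size = 50
means = sample_means(population, sample_size=sample_size, n_samples=2000, rng=rng)

# The Central Limit Theorem suggests the sample means centre on the population
# mean with a spread close to sigma / sqrt(n).
print(f"population mean:       {pop_mean:.1f}")
print(f"mean of sample means:  {statistics.fmean(means):.1f}")
print(f"sd of sample means:    {statistics.stdev(means):.1f}")
print(f"sigma / sqrt(n):       {pop_sd / math.sqrt(sample_size):.1f}")
```

Although the population is strongly skewed, a histogram of `means` would look roughly bell-shaped, which is the behaviour the Central Limit Theorem describes.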

#### Lab 3B: Foundations for inference: Confidence intervals

In the second part of this two-part lab, we will build on sampling distributions and the Central Limit Theorem to investigate confidence intervals. We will again use housing data from Ames, Iowa (a small town in the US) in our exploration.
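To make the meaning of a confidence level concrete, the sketch below builds many 95% intervals for a mean and counts how often they contain the true population mean. It is a minimal Python illustration (the lab itself uses R) under stated assumptions: the population is synthetic rather than the Ames data, the interval uses the normal approximation, and `mean_confidence_interval` is a hypothetical helper name.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation interval for a mean: mean +/- z * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error


rng = random.Random(1)

# Synthetic stand-in population (NOT the Ames housing data).
population = [rng.gauss(1500, 500) for _ in range(20_000)]
true_mean = statistics.fmean(population)

# Build many intervals from independent samples and count how many capture the true mean.
n_intervals = 1000
covered = 0
for _ in range(n_intervals):
    low, high = mean_confidence_interval(rng.sample(population, 60))
    if low <= true_mean <= high:
        covered += 1

coverage = covered / n_intervals
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The share printed at the end should land near the nominal 95%, which is what "95% confident" refers to: the long-run success rate of the procedure, not a statement about any single interval.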

#### Lab 4: Inference for numerical data

In this two-part lab, we will work on inference for numerical data. We will use a data set on births from North Carolina as well as data from the General Social Survey.

#### Lab 5: Inference for categorical data

In this lab, we will work on inference for categorical data using data from a worldwide survey on religiosity and atheism.

#### Lab 6: Introduction to linear regression

The movie *Moneyball* focuses on the "quest for the secret of success in baseball". It follows a low-budget team, the Oakland Athletics, who believed that underused statistics, such as a player's ability to get on base, predict the ability to score runs better than typical statistics like home runs, RBIs (runs batted in), and batting average. In this lab, we will look at data from all 30 Major League Baseball teams and examine the linear relationship between runs scored in a season and a number of other player statistics. Our aim is to summarize these relationships both graphically and numerically in order to find which variable, if any, best helps us predict a team's runs scored in a season.

#### Lab 7: Multiple linear regression

Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because these measures may reflect the influence of non-teaching-related characteristics, such as the physical appearance of the instructor. The article "Beauty in the classroom: instructors' pulchritude and putative pedagogical productivity" (Hamermesh and Parker, 2005) found that instructors who are viewed as better looking receive higher instructional ratings. In this lab, we will analyze the data from this study to learn what goes into a positive professor evaluation.