# Inference for Categorical Data in R

In this course you'll learn how to leverage statistical techniques for working with categorical data.

4 Hours · 14 Videos · 53 Exercises · 6,263 Learners

4000 XP

## Course Description

Categorical data is all around us. It's in the latest opinion polling numbers, in the data that leads to new breakthroughs in genomics, and in the troves of data that internet companies collect to sell products to you. In this course you'll learn techniques for separating the signal from the noise: tools for identifying when structure in this data represents interesting phenomena and when it is just random noise.

### 1. Inference for a single parameter

**Free.** In this chapter you will learn how to perform statistical inference on a single parameter that describes categorical data. This includes both resampling-based methods and approximation-based methods for a single proportion.
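As a rough illustration of the two approaches named above, the sketch below builds an interval for a single proportion twice: once by bootstrapping and once with the normal-approximation standard error. It uses base R only, the `responses` vector is made-up data rather than anything from the course, and the "plus or minus 2 standard errors" rule for a roughly 95% interval is an assumption of this sketch.

```r
set.seed(42)

# Hypothetical sample: 150 yes/no answers (made-up data, not from the course)
responses <- c(rep("yes", 90), rep("no", 60))
n <- length(responses)
p_hat <- mean(responses == "yes")   # observed sample proportion

# Resampling-based: resample the data with replacement many times
# and record the proportion of "yes" in each bootstrap sample
boot_props <- replicate(
  2000,
  mean(sample(responses, size = n, replace = TRUE) == "yes")
)
se_boot <- sd(boot_props)
ci_boot <- c(p_hat - 2 * se_boot, p_hat + 2 * se_boot)

# Approximation-based: standard error formula for a proportion
se_approx <- sqrt(p_hat * (1 - p_hat) / n)
ci_approx <- c(p_hat - 2 * se_approx, p_hat + 2 * se_approx)

print(ci_boot)
print(ci_approx)
```

With a sample of this size and a proportion away from 0 and 1, the two intervals should land close to each other; they tend to diverge when the sample is small or the proportion is extreme.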

- The General Social Survey (50 xp)
- Exploring consci (100 xp)
- Generating via bootstrap (100 xp)
- Constructing a CI (100 xp)
- Why more bootstraps? (50 xp)
- Interpreting a Confidence Interval (50 xp)
- CIs and confidence level (50 xp)
- SE with less data (100 xp)
- SE with different p (100 xp)
- The approximation shortcut (50 xp)
- CI via approximation (100 xp)
- Methods compared (50 xp)

### 2. Proportions: testing and power

This chapter dives deeper into performing hypothesis tests and creating confidence intervals for a single parameter. Then, you'll learn how to perform inference on a difference between two proportions. Finally, this chapter wraps up with an exploration of what happens when you know the null hypothesis is true.
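One way to picture a test on a difference between two proportions is a permutation test: shuffle the group labels to simulate a world where the null hypothesis is true, then see how unusual the observed difference looks. The sketch below is a minimal base R version under that assumption; the groups `A` and `B`, the counts, and the helper `diff_in_props()` are all hypothetical and not taken from the course.

```r
set.seed(1)

# Hypothetical data: two groups of 100 respondents with yes/no answers
group <- rep(c("A", "B"), each = 100)
outcome <- c(rep(c("yes", "no"), times = c(60, 40)),   # group A: 60% yes
             rep(c("yes", "no"), times = c(45, 55)))   # group B: 45% yes

# Hypothetical helper: difference in the proportion of "yes" (A minus B)
diff_in_props <- function(outcome, group) {
  mean(outcome[group == "A"] == "yes") - mean(outcome[group == "B"] == "yes")
}

diff_obs <- diff_in_props(outcome, group)

# Null distribution: shuffling the labels breaks any link between
# group and outcome, which is what the null hypothesis asserts
null_diffs <- replicate(2000, diff_in_props(outcome, sample(group)))

# Two-sided p-value; the small tolerance guards against
# floating-point ties at the observed value
p_value <- mean(abs(null_diffs) >= abs(diff_obs) - 1e-12)

print(diff_obs)
print(p_value)
```

Because every shuffled data set is generated with the null hypothesis true, the share of shuffles that would be declared "significant" at a given threshold approximates the type I error rate of the procedure.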

- Hypothesis test for a proportion (50 xp)
- Life after death (100 xp)
- Generating from H0 (100 xp)
- Testing a claim (100 xp)
- Making a decision (50 xp)
- Intervals for differences (50 xp)
- Death penalty and sex (100 xp)
- Hypothesis test on the difference in proportions (100 xp)
- Interpreting the test (50 xp)
- Hypothesis tests and confidence intervals (100 xp)
- Statistical errors (50 xp)
- When the null is true (100 xp)
- When the null is true: decision (100 xp)

### 3. Comparing many parameters: independence

This part of the course will teach you how to use both resampling methods and classical methods to test for the independence of two categorical variables. This chapter covers how to perform a Chi-squared test.
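To make the classical version of the test concrete, the sketch below computes the Chi-squared statistic for a small contingency table by hand and compares it with base R's `chisq.test()`. The table and its labels (`group`, `answer`) are invented for illustration and are not course data.

```r
# Hypothetical 2 x 3 contingency table of counts (made-up data)
counts <- matrix(
  c(30, 20, 10,
    20, 30, 40),
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("g1", "g2"), answer = c("a", "b", "c"))
)

# Expected counts if the two variables were independent:
# (row total * column total) / grand total
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Chi-squared statistic: squared deviations scaled by expected counts
chi_sq <- sum((counts - expected)^2 / expected)
df <- (nrow(counts) - 1) * (ncol(counts) - 1)

# Classical p-value from the Chi-squared distribution
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# Base R's built-in test should agree with the hand calculation
builtin <- chisq.test(counts)

print(chi_sq)
print(p_value)
print(builtin)
```

A resampling version would keep the same statistic but replace the `pchisq()` step with a null distribution built by repeatedly shuffling one of the two variables.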

- Contingency tables (50 xp)
- Politics and Space (100 xp)
- Understanding contingency tables (50 xp)
- From tidy to table to tidy (100 xp)
- Chi-squared test statistic (50 xp)
- A single permuted Chi-sq (100 xp)
- Building two null distributions (100 xp)
- Is the data consistent with the model? (50 xp)
- Alternate method: the chi-squared distribution (50 xp)
- Checking conditions (50 xp)
- The geography of happiness (100 xp)
- A p-value two ways (100 xp)
- Intervals for the chi-squared distribution (50 xp)

### 4. Comparing many parameters: goodness of fit

The course wraps up with two case studies using election data. Here, you'll learn how to use a Chi-squared test to check goodness-of-fit. You'll study election results from Iran and Iowa and test if Benford's law applies to these datasets.
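Benford's law says that the leading digit d (1 through 9) appears with probability log10(1 + 1/d), so a goodness-of-fit check compares observed first-digit counts against those probabilities. The sketch below shows one way that could look in base R. The `vote_counts` vector is simulated, not real election data, and the digit-extraction approach is just one possible choice.

```r
set.seed(7)

# Benford's law: probability that the first digit is d, for d = 1..9
benford <- log10(1 + 1 / (1:9))

# Hypothetical vote totals (simulated, NOT real election results),
# spread across several orders of magnitude
vote_counts <- round(10^runif(500, min = 1, max = 5))

# Extract the first digit of each count; sprintf avoids
# scientific notation such as "1e+05"
first_digit <- as.integer(substr(sprintf("%.0f", vote_counts), 1, 1))

# Tabulate, keeping all nine digits even if one never occurs
observed <- table(factor(first_digit, levels = 1:9))

# Chi-squared goodness-of-fit test against the Benford probabilities
gof <- chisq.test(observed, p = benford)

print(round(benford, 3))
print(observed)
print(gof)
```

A small p-value here only says the first digits do not match Benford's law; whether that counts as evidence of fraud depends on whether the law is a reasonable expectation for the data in the first place.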

- Case study: election fraud (50 xp)
- Getting to know the Iran data (50 xp)
- Who won? (100 xp)
- Breaking it down by province (100 xp)
- Extracting the first digit I (100 xp)
- Goodness of fit (50 xp)
- Goodness of fit test (100 xp)
- A p-value, two ways (100 xp)
- Is this evidence of fraud? (50 xp)
- And now to US (50 xp)
- Getting to know the Iowa data (50 xp)
- Extracting the first digit II (100 xp)
- Testing Iowa (100 xp)
- Fraud in Iowa? (50 xp)
- Election fraud in Iran and Iowa: debrief (50 xp)

Prerequisites

Foundations of Inference

#### Andrew Bray

Assistant Professor of Statistics at Reed College

Andrew Bray is an assistant professor of statistics at Reed College. His interests are in computing, differential privacy, environmental statistics, and statistics education. He is a co-author of the infer package for tidy statistical inference.

