Ben Bolstad has completed

Statistical Thinking in Python (Part 2)

4 hr

5,350 XP

Course Description

After completing Statistical Thinking in Python (Part 1), you have the probabilistic mindset and foundational hacker stats skills to dive into data sets and extract useful information from them. In this course, you will do just that, expanding and honing your hacker stats toolbox to perform the two key tasks in statistical inference: parameter estimation and hypothesis testing. You will work with real data sets as you learn, culminating with an analysis of measurements of the beaks of Darwin's famous finches. You will emerge from this course with new knowledge and lots of practice under your belt, ready to attack your own inference problems out in the world.

1
Parameter estimation by optimization
Free
When doing statistical inference, we speak the language of probability. A probability distribution that describes your data has parameters. So, a major goal of statistical inference is to estimate the values of these parameters, which allows us to concisely and unambiguously describe our data and draw conclusions from it. In this chapter, you will learn how to find the optimal parameters, those that best describe your data.
Optimal parameters
50 xp
How often do we get no-hitters?
100 xp
Do the data follow our story?
100 xp
How is this parameter optimal?
100 xp
Linear regression by least squares
50 xp
EDA of literacy/fertility data
100 xp
Linear regression
100 xp
How is it optimal?
100 xp
The importance of EDA: Anscombe's quartet
50 xp
The importance of EDA
50 xp
Linear regression on appropriate Anscombe data
100 xp
Linear regression on all Anscombe data
100 xp
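The least-squares idea behind this chapter's regression lessons can be sketched briefly. This is a minimal illustration, not the course's own exercise code: the data are synthetic (an assumption, standing in for data such as the literacy/fertility set), `np.polyfit` is one of several ways to fit a line, and `rss` is a hypothetical helper name.

```python
import numpy as np

# Synthetic data (assumption): a line with slope 2 and intercept 1 plus noise.
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=x.size)

# Fit a first-degree polynomial; polyfit returns the highest power first.
slope, intercept = np.polyfit(x, y, 1)


def rss(a, b):
    """Residual sum of squares for the line y = a * x + b (hypothetical helper)."""
    return np.sum((y - (a * x + b)) ** 2)


# "Optimal" here means: no other slope/intercept pair gives a smaller RSS.
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"RSS at the fit: {rss(slope, intercept):.3f}")
print(f"RSS with the slope nudged by 0.1: {rss(slope + 0.1, intercept):.3f}")
```

Nudging either parameter away from the fitted value increases the residual sum of squares, which is the sense in which the fitted parameters are optimal.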
2
Bootstrap confidence intervals
To "pull yourself up by your bootstraps" is a classic idiom meaning that you achieve a difficult task by yourself with no help at all. In statistical inference, you want to know what would happen if you could repeat your data acquisition an infinite number of times. This task is impossible, but can we use only the data we actually have to get close to the same result as an infinitude of experiments? The answer is yes! The technique to do it is aptly called bootstrapping. This chapter will introduce you to this extraordinarily powerful tool.
Generating bootstrap replicates
50 xp
Getting the terminology down
50 xp
Bootstrapping by hand
50 xp
Visualizing bootstrap samples
100 xp
Bootstrap confidence intervals
50 xp
Generating many bootstrap replicates
100 xp
Bootstrap replicates of the mean and the SEM
100 xp
Confidence intervals of rainfall data
50 xp
Bootstrap replicates of other statistics
100 xp
Confidence interval on the rate of no-hitters
100 xp
Pairs bootstrap
50 xp
A function to do pairs bootstrap
100 xp
Pairs bootstrap of literacy/fertility data
100 xp
Plotting bootstrap regressions
100 xp
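The bootstrap procedure described in this chapter can be sketched as follows. This is a minimal illustration under stated assumptions: the data are synthetic rather than any of the course data sets, and the helper names `bootstrap_replicate` and `draw_bs_reps` are chosen here for illustration.

```python
import numpy as np

# Synthetic measurements (assumption), standing in for real data.
rng = np.random.default_rng(0)
data = rng.exponential(scale=5.0, size=200)


def bootstrap_replicate(data, func, rng):
    """Resample the data with replacement and compute one summary statistic."""
    sample = rng.choice(data, size=len(data), replace=True)
    return func(sample)


def draw_bs_reps(data, func, size, rng):
    """Draw `size` bootstrap replicates of the statistic `func`."""
    return np.array([bootstrap_replicate(data, func, rng) for _ in range(size)])


# Replicates of the mean approximate its sampling distribution.
bs_means = draw_bs_reps(data, np.mean, 2000, rng)

# A 95% confidence interval from the 2.5th and 97.5th percentiles.
ci_low, ci_high = np.percentile(bs_means, [2.5, 97.5])
print(f"sample mean = {np.mean(data):.3f}")
print(f"95% bootstrap CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

Passing a different function in place of `np.mean` (for example `np.median` or `np.var`) gives replicates of other statistics with no other changes.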
3
Introduction to hypothesis testing
You now know how to define and estimate parameters given a model. But the question remains: how reasonable is it to observe your data if a model is true? This question is addressed by hypothesis tests. They are the icing on the inference cake. After completing this chapter, you will be able to carefully construct and test hypotheses using hacker statistics.
Formulating and simulating a hypothesis
50 xp
Generating a permutation sample
100 xp
Visualizing permutation sampling
100 xp
Test statistics and p-values
50 xp
Test statistics
50 xp
What is a p-value?
50 xp
Generating permutation replicates
100 xp
Look before you leap: EDA before hypothesis testing
100 xp
Permutation test on frog data
100 xp
Bootstrap hypothesis tests
50 xp
A one-sample bootstrap hypothesis test
100 xp
A two-sample bootstrap hypothesis test for difference of means
100 xp
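A permutation test of the kind this chapter introduces might look like the sketch below. The two groups are synthetic (an assumption, not the frog data), the test statistic is a difference of means, and the function names are hypothetical.

```python
import numpy as np

# Two synthetic groups (assumption); group_b is shifted upward by 1.
rng = np.random.default_rng(1)
group_a = rng.normal(0.0, 1.0, size=40)
group_b = rng.normal(1.0, 1.0, size=40)


def diff_of_means(a, b):
    """Test statistic: difference between the two group means."""
    return np.mean(a) - np.mean(b)


def permutation_replicates(a, b, func, size, rng):
    """Simulate the null hypothesis that both groups share one distribution."""
    pooled = np.concatenate((a, b))
    reps = np.empty(size)
    for i in range(size):
        # Shuffle the pooled data, then split it back into two groups.
        shuffled = rng.permutation(pooled)
        reps[i] = func(shuffled[: len(a)], shuffled[len(a):])
    return reps


observed = diff_of_means(group_b, group_a)
reps = permutation_replicates(group_b, group_a, diff_of_means, 5000, rng)

# One-sided p-value: fraction of replicates at least as large as the observed value.
p_value = np.mean(reps >= observed)
print(f"observed difference = {observed:.3f}, p-value = {p_value:.4f}")
```

The p-value is the probability of a test statistic at least as extreme as the observed one, assuming the null hypothesis is true; it is not the probability that the null hypothesis is true.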
4
Hypothesis test examples
As you saw from the last chapter, hypothesis testing can be a bit tricky. You need to define the null hypothesis, figure out how to simulate it, and define clearly what it means to be "more extreme" in order to compute the p-value. Like any skill, practice makes perfect, and this chapter gives you some good practice with hypothesis tests.
A/B testing
50 xp
The vote for the Civil Rights Act in 1964
100 xp
What is equivalent?
50 xp
A time-on-website analog
100 xp
What should you have done first?
50 xp
Test of correlation
50 xp
Simulating a null hypothesis concerning correlation
50 xp
Hypothesis test on Pearson correlation
100 xp
Do neonicotinoid insecticides have unintended consequences?
100 xp
Bootstrap hypothesis test on bee sperm counts
100 xp
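The three steps named in this chapter's description (state the null hypothesis, simulate it, and define "more extreme") can be illustrated with a test of correlation. This is a sketch under assumptions: the data are synthetic, the null hypothesis is that the two variables are uncorrelated, and `pearson_r` is a helper name chosen here.

```python
import numpy as np

# Synthetic paired data (assumption) with a built-in positive correlation.
rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 0.5 * x + rng.normal(size=100)


def pearson_r(x, y):
    """Pearson correlation coefficient between two arrays."""
    return np.corrcoef(x, y)[0, 1]


r_obs = pearson_r(x, y)

# Simulate the null: permuting x breaks any pairing with y.
perm_reps = np.array([pearson_r(rng.permutation(x), y) for _ in range(5000)])

# "More extreme" here is two-sided: |r| at least as large as the observed |r|.
p_value = np.mean(np.abs(perm_reps) >= abs(r_obs))
print(f"observed r = {r_obs:.3f}, p-value = {p_value:.4f}")
```

Choosing a one-sided or two-sided definition of "more extreme" changes the p-value, so that choice is best made before looking at the result.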
5
Putting it all together: a case study
Every year for the past 40-plus years, Peter and Rosemary Grant have gone to the Galápagos island of Daphne Major and collected data on Darwin's finches. Using your skills in statistical inference, you will spend this chapter with their data and witness evolution in action firsthand. It's an exhilarating way to end the course!
Finch beaks and the need for statistics
50 xp
EDA of beak depths of Darwin's finches
100 xp
ECDFs of beak depths
100 xp
Parameter estimates of beak depths
100 xp
Hypothesis test: Are beaks deeper in 2012?
100 xp
Variation in beak shapes
50 xp
EDA of beak length and depth
100 xp
Linear regressions
100 xp
Displaying the linear regression results
100 xp
Beak length to depth ratio
100 xp
How different is the ratio?
50 xp
Calculation of heritability
50 xp
EDA of heritability
100 xp
Correlation of offspring and parental data
100 xp
Pearson correlation of offspring and parental data
100 xp
Measuring heritability
100 xp
Is beak depth heritable at all in G. scandens?
100 xp
Final thoughts
50 xp

datasets

Anscombe data
Bee sperm counts
Female literacy and fertility
Finch beaks (1975)
Finch beaks (2012)
Fortis beak depth heredity
Frog tongue data
Major League Baseball no-hitters
Scandens beak depth heredity
Sheffield Weather Station

collaborators

Yashas Roy

Hugo Bowne-Anderson

prerequisites

Statistical Thinking in Python (Part 1)

Justin Bois

Lecturer at the California Institute of Technology

Justin Bois is a Teaching Professor in the Division of Biology and Biological Engineering at the California Institute of Technology. He teaches nine different classes there, nearly all of which heavily feature Python. He is dedicated to empowering students in the biological sciences with quantitative tools, particularly data analysis skills. Beyond biologists, he is thrilled to develop courses for DataCamp, whose students are an excited bunch of burgeoning data scientists!
