In this course, you'll learn how to use statistical techniques to make inferences and estimations using numerical data. The course takes two approaches to these common tasks. The first uses bootstrapping and permutation to create resampling-based tests and confidence intervals. The second uses theoretical results and the t-distribution to accomplish the same tasks. You'll learn how (and when) to perform a t-test, create a confidence interval, and do an ANOVA!
Bootstrapping for estimating a parameter
In this chapter you'll use bootstrapping techniques to estimate a single parameter from a numerical distribution.
Welcome to the course! (50 xp)
Generate bootstrap distribution for median (100 xp)
Review percentile and standard error methods (50 xp)
Calculate bootstrap interval using both methods (100 xp)
Which method more appropriate: percentile or SE? (50 xp)
Doctor visits during pregnancy (50 xp)
Average number of doctor's visits (100 xp)
SD of number of doctor's visits (100 xp)
Re-centering a bootstrap distribution (50 xp)
Test for median price of 1 BR apartments in Manhattan (100 xp)
Conclude the hypothesis test on median (50 xp)
Test for average weight of babies (100 xp)
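To make the bootstrap idea in this chapter concrete, here is a minimal Python sketch of a bootstrap distribution for a median and the two interval methods named in the exercises (percentile and standard error). The function names and the rent values are hypothetical, invented for illustration. The SE method here assumes a normal critical value; a t critical value is another common choice, and the course's own exercises may differ.

```python
import random
import statistics


def bootstrap_medians(data, n_reps=2000, seed=0):
    """Medians of n_reps resamples, each drawn with replacement from data."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.median(rng.choices(data, k=n)) for _ in range(n_reps)]


def percentile_interval(boot_stats, level=0.95):
    """Percentile method: cut (1 - level) / 2 of the distribution off each tail."""
    ordered = sorted(boot_stats)
    tail = (1 - level) / 2
    last = len(ordered) - 1
    return ordered[round(tail * last)], ordered[round((1 - tail) * last)]


def se_interval(point_estimate, boot_stats, level=0.95):
    """Standard error method: estimate +/- critical value * bootstrap SE."""
    se = statistics.stdev(boot_stats)
    # Assumption: normal critical value (about 1.96 for a 95% interval).
    crit = statistics.NormalDist().inv_cdf(1 - (1 - level) / 2)
    return point_estimate - crit * se, point_estimate + crit * se


# Hypothetical monthly rents, invented for illustration only.
rents = [2150, 2300, 2350, 2400, 2495, 2550, 2600, 2700,
         2850, 2900, 3100, 3400, 3750, 4200, 5000]
observed_median = statistics.median(rents)
boot = bootstrap_medians(rents)
print("observed median:", observed_median)
print("percentile interval:", percentile_interval(boot))
print("SE interval:", se_interval(observed_median, boot))
```

The percentile interval reads its endpoints directly off the bootstrap distribution, so it can be asymmetric around the observed median, while the SE interval is symmetric by construction.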
Introducing the t-distribution
In this chapter you'll use Central Limit Theorem-based techniques to estimate a single parameter from a numerical distribution. You will do this using the t-distribution.
t-distribution (50 xp)
When to t? (50 xp)
Probabilities under the t-distribution (100 xp)
Cutoffs under the t-distribution (100 xp)
Estimating a mean with a t-interval (50 xp)
Average commute time of Americans (100 xp)
Average number of hours worked (100 xp)
t-interval for paired data (50 xp)
t-interval at various levels (100 xp)
Understanding confidence intervals (50 xp)
Testing a mean with a t-test (50 xp)
Estimate the median difference in textbook prices (100 xp)
Test for a difference in median test scores (100 xp)
Interpret the p-value (50 xp)
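As a rough illustration of the t-based approach this chapter describes, the following Python sketch computes a t-interval and a t-statistic for a single mean. It uses only the standard library, so the critical value is passed in from a t-table rather than computed. The function names and the data are hypothetical; the value 2.776 is the standard 95% two-sided cutoff for 4 degrees of freedom.

```python
import math
import statistics


def standard_error(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def t_interval(sample, t_star):
    """Confidence interval: sample mean +/- t_star * standard error."""
    center = statistics.mean(sample)
    margin = t_star * standard_error(sample)
    return center - margin, center + margin


def t_statistic(sample, null_mean):
    """Test statistic for H0: population mean equals null_mean."""
    return (statistics.mean(sample) - null_mean) / standard_error(sample)


# Hypothetical commute times in minutes, invented for illustration only.
commute = [20, 25, 30, 35, 40]
T_STAR_95_DF4 = 2.776  # table value for a 95% interval with n - 1 = 4 df

print("95% t-interval:", t_interval(commute, T_STAR_95_DF4))
print("t statistic vs. null mean of 25:", t_statistic(commute, 25))

# Paired data: analyze the within-pair differences as a single sample.
before = [12, 15, 11, 14, 13]  # hypothetical
after = [10, 14, 12, 11, 10]   # hypothetical
differences = [b - a for b, a in zip(before, after)]
print("95% t-interval for mean difference:",
      t_interval(differences, T_STAR_95_DF4))
```

Treating paired observations as one sample of differences is what lets the same one-sample functions be reused for the paired case.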
Inference for difference in two parameters
In this chapter you'll extend what you have learned so far to use both simulation and CLT-based techniques for inference on the difference between two parameters from two independent numerical distributions.
Hypothesis testing for comparing two means (50 xp)
Evaluating the effectiveness of stem cell treatment (100 xp)
Evaluating the effectiveness of stem cell treatment (cont.) (100 xp)
Conclusion of the hypothesis test (50 xp)
Evaluating the relationship between smoking during pregnancy and birth weight (100 xp)
Bootstrap CI for difference in two means (50 xp)
Quantifying the relationship between smoking during pregnancy and birth weight (100 xp)
Median lengths of pregnancies for smoking and non-smoking mothers (100 xp)
Comparing means with a t-test (50 xp)
Hourly pay vs. citizenship status (100 xp)
Estimating the difference of two means using a t-interval (100 xp)
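The simulation-based side of this chapter can be sketched as a permutation test for a difference in two means. The Python below is a minimal version under stated assumptions: the function name, the two-sided p-value definition, and the data are hypothetical choices for illustration, not taken from the course materials.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_reps=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference (mean of a minus mean of b) and the
    share of shuffled differences at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_reps):
        # Under H0 the group labels are exchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_reps


# Hypothetical outcome measurements for two independent groups.
treatment = [3.5, 4.1, 2.8, 5.0, 4.4, 3.9]
control = [1.2, 2.0, 0.5, 2.6, 1.8, 1.1]
observed_diff, p_value = permutation_test(treatment, control)
print("observed difference in means:", observed_diff)
print("approximate two-sided p-value:", p_value)
```

Because the p-value is estimated from a finite number of shuffles, it varies slightly with the seed and the number of repetitions.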
Comparing many means
In this chapter you will use ANOVA (analysis of variance) to test for a difference in means across many groups.
Vocabulary score vary between vs. (self identified) social class (50 xp)
EDA for vocabulary score vs. social class (100 xp)
Comparing many means, visually (50 xp)
ANOVA (50 xp)
ANOVA for vocabulary score vs. (self identified) social class (100 xp)
Conditions for ANOVA (50 xp)
Checking the normality condition (50 xp)
Checking the constant variance condition (100 xp)
Post-hoc testing (50 xp)
Calculate alpha* (50 xp)
Compare pairwise means (100 xp)
Congratulations! (50 xp)
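For a sense of what the ANOVA computation involves, here is a small Python sketch of the one-way F statistic, along with an adjusted significance level for post-hoc pairwise comparisons. It assumes that alpha* refers to a Bonferroni-style correction (alpha divided by the number of pairwise comparisons), which is a common convention but is not confirmed by the chapter description. The function names and scores are hypothetical.

```python
import statistics


def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


def bonferroni_alpha(alpha, k):
    """Assumed Bonferroni adjustment: alpha / number of pairwise comparisons."""
    n_comparisons = k * (k - 1) // 2
    return alpha / n_comparisons


# Hypothetical scores for three groups, invented for illustration only.
scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print("F statistic:", anova_f(scores))
print("adjusted alpha for 4 groups:", bonferroni_alpha(0.05, 4))
```

A large F statistic indicates that the group means differ by more than the within-group variability would suggest; converting it to a p-value requires the F distribution with (k - 1, n - k) degrees of freedom, which this sketch leaves out.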
In the following tracks: Statistical Inference
Datasets:
Chp1-vid1-boot-dist-noaxes-parantheses
Chp1-vid1-bootsamp-bootpop.001
Chp1-vid1-manhattan-rents
Chp1-vid2-boot-dist-withaxes
Chp1-vid2-perc-method.001
Chp1-vid2-perc-method.002
Chp1-vid3-boot-test.001
Chp3-vid3-hrly-rate-citizen-smaller
Chp3-vid3-hrly-rate-citizen
Chp4-vid1-class-bar
Chp4-vid1-wodrsum-hist
Gss moredays
GSS data
Manhattan rent data
Runners.001
Tdistcomparetonormaldist
Prerequisites: Foundations of Inference
Associate Professor at Duke University & Data Scientist and Professional Educator at RStudio
Mine is the Director of Undergraduate Studies and an Associate Professor of the Practice in the Department of Statistical Science at Duke University as well as a Professional Educator at RStudio. Her work focuses on innovation in statistics pedagogy, with an emphasis on computation, reproducible research, open-source education, and student-centered learning. She is the author of three open-source introductory statistics textbooks as part of the OpenIntro project and teaches the popular Statistics with R MOOC on Coursera.