If you have experience with SAS and want to learn R, this is the course for you. R is FREE (cost) and OPEN (license), and it is one of the fastest-growing programming languages for statistics and data science. This course is a gentle introduction to the R language; every chapter provides a detailed mapping of R functions to SAS procedures, highlighting similarities and differences. You will orient yourself in the R environment and discover how to wrangle, visualize, and model data, and how to customize your output for final presentation. Throughout the course, you will follow a consistent workflow of data quality checking and cleaning, exploring relationships, modeling, and presenting results. You will leave this course with coded examples that serve as a template you can use immediately with a dataset of your own.
This first chapter will get you oriented in the R programming environment. You'll learn how to get help, load a dataset, and extend R's functionality by adding packages. You'll begin working with the abalone dataset, using the dplyr package workflow to get descriptive statistics and the ggplot2 package to create helpful visualizations.
Now that you are oriented in the R environment, this chapter will advance your understanding of how versatile R is when working with data objects. You'll learn how to create and modify variables in the abalone dataset. Using your ggplot2 visualization skills, you will discover the data errors in the abalone data and then create a final cleaned dataset ready for analysis and modeling.
Once your dataset has been cleaned, the next step is exploration. In chapter 3, you will learn how to compute descriptive statistics, explore associations (e.g., correlations) among the variables, and perform bivariate statistical tests (e.g., t-tests and chi-square tests). You will also create visualizations that illustrate the bivariate associations and group comparison tests.
In this final chapter, you will learn how to work with the list, one of the most versatile data object types in R. These skills will enable you to save and manipulate the output from your descriptive statistics, associations, and group comparison computations. You will also learn how to perform ANOVA (analysis of variance) and linear regression in R. You will put all your skills to use in the final exercises to create the best models for predicting abalone ages from their sex, size, and weight measurements.
Research Professor, Senior Biostatistician