Bridget Balkaran has completed
Supervised Learning in R: Case Studies
4 hr
4,400 XP

Course Description
Predictive modeling, or supervised machine learning, is a powerful tool for using data to make predictions about the world around us. Once you understand the basic ideas of supervised machine learning, the next step is to practice your skills so you know how to apply these techniques wisely and appropriately. In this course, you will work through four case studies using data from the real world; you will gain experience in exploratory data analysis, preparing data so it is ready for predictive modeling, training supervised machine learning models, and evaluating those models.
Training 2 or more people?
Get your team access to the full DataCamp platform, including all the features.

1. Not mtcars AGAIN (Free)
In this first case study, you will predict fuel efficiency from a US Department of Energy data set for real cars of today.
Making predictions using machine learning (50 xp)
Choosing an appropriate model (50 xp)
Visualizing the fuel efficiency distribution (100 xp)
Building a simple linear model (100 xp)
Getting started with caret (50 xp)
Training and testing data (100 xp)
Training models with caret (100 xp)
Evaluating your models (100 xp)
Using the testing data (100 xp)
Let's sample our data (50 xp)
Bootstrap resampling (100 xp)
Plotting modeling results (100 xp)

2. Stack Overflow Developer Survey
Stack Overflow is the world's largest online community for developers, and you have probably used it to find an answer to a programming question. The second chapter uses data from the annual Stack Overflow Developer Survey to practice predictive modeling and find which developers are more likely to work remotely.
Essential copying and pasting from Stack Overflow (50 xp)
Choosing an appropriate model (50 xp)
Exploring the Stack Overflow survey (100 xp)
Start with a simple model (100 xp)
Dealing with imbalanced data (50 xp)
Training and testing data (100 xp)
Upsampling (100 xp)
Understanding upsampling (50 xp)
Upsampling in your workflow (50 xp)
Predicting remote status (50 xp)
Training models (100 xp)
Confusion matrix (100 xp)
Classification model metrics (100 xp)

3. Get out the vote
In the third case study, you will use data on attitudes and beliefs in the United States to predict voter turnout. You will apply your skills in dealing with imbalanced data and explore more resampling options.
Predicting voter turnout from survey data (50 xp)
Choosing an appropriate model (50 xp)
Exploring the VOTER data (100 xp)
Visualization for exploratory data analysis (100 xp)
Imbalanced data (50 xp)
Fit a simple model (100 xp)
VOTE 2016 (50 xp)
Training and testing data (100 xp)
Upsampling for imbalanced data (100 xp)
Cross-validation (50 xp)
Understanding cross-validation (50 xp)
Training models with cross-validation (100 xp)
Comparing model performance (50 xp)
Confusion matrix for your training data (100 xp)
Confusion matrix for your testing data (100 xp)
Which model is best? (50 xp)

4. But what do the nuns think?
The last case study uses an extensive survey of Catholic nuns fielded in 1967 to once more put your practical machine learning skills to use. You will predict the age of these religious women from their responses about their beliefs and attitudes.
Surveying Catholic sisters in 1967 (50 xp)
Choosing an appropriate model (50 xp)
Visualizing the age distribution (100 xp)
Tidying the survey data (100 xp)
Exploratory data analysis with tidy data (50 xp)
Visualizing agreement with age (100 xp)
Building a simple linear model (100 xp)
Training, validation, and testing data (100 xp)
Using your validation set (50 xp)
Predicting age with supervised machine learning (50 xp)
Training models (100 xp)
Making predictions (100 xp)
Choosing between models (100 xp)
Estimating uncertainty for new data (100 xp)
Wrapping up (50 xp)

Datasets
Fuel efficiency of real cars (2018)
Annual Stack Overflow Developer Survey (2017)
Survey responses of voters (2016 US Presidential elections)
Survey responses of Catholic sisters (1967)

Collaborators
DataCamp Content Creator
Course Instructor
DataCamp offers interactive R, Python, Spreadsheets, SQL, and shell courses, all on topics in data science, statistics, and machine learning. Learn from a team of expert teachers in the comfort of your browser with video lessons, coding challenges, and projects.