Supervised Learning in R: Classification
In this course you will learn the basics of machine learning for classification.
4 hours · 14 videos · 55 exercises · 90,341 learners · Statement of Accomplishment
Course Description
This beginner-level introduction to machine learning covers four of the most common classification algorithms. You will come away with a basic understanding of how each algorithm approaches a learning task, as well as the R functions needed to apply these tools to your own work.
This course is part of the following tracks: Machine Learning Fundamentals in R and Machine Learning Scientist in R.

1. k-Nearest Neighbors (kNN)
Because the kNN algorithm literally "learns by example," it is a natural starting point for understanding supervised machine learning. This chapter introduces classification while working through the application of kNN to road sign recognition for self-driving vehicles.
- Classification with Nearest Neighbors (50 xp)
- Recognizing a road sign with kNN (100 xp)
- Thinking like kNN (50 xp)
- Exploring the traffic sign dataset (100 xp)
- Classifying a collection of road signs (100 xp)
- What about the 'k' in kNN? (50 xp)
- Understanding the impact of 'k' (50 xp)
- Testing other 'k' values (100 xp)
- Seeing how the neighbors voted (100 xp)
- Data preparation for kNN (50 xp)
- Why normalize data? (50 xp)
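As a preview of the chapter's approach, here is a minimal sketch of kNN classification in R using the class package. The signs and test_signs data frames, the sign_type column, and the file names are hypothetical stand-ins for the course's road sign data.

```r
library(class)

# Hypothetical data: each row is a road sign described by numeric color features
# (already on a common 0-255 scale), with sign_type holding the known label.
# signs      <- read.csv("knn_traffic_signs.csv")   # labeled training examples
# test_signs <- read.csv("test_signs.csv")          # new signs to classify

# kNN has no explicit training step: each test sign is labeled by a majority
# vote among its k nearest neighbors in the training data.
sign_pred <- knn(train = signs[-1],         # feature columns (column 1 assumed to be the label)
                 test  = test_signs[-1],
                 cl    = signs$sign_type,   # labels of the training rows
                 k     = 7)                 # number of neighbors that vote

# Proportion of test signs classified correctly
mean(sign_pred == test_signs$sign_type)

# Because kNN is distance-based, features on very different scales should first
# be rescaled, for example with min-max normalization:
normalize <- function(x) (x - min(x)) / (max(x) - min(x))
```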
2. Naive Bayes
Naive Bayes uses principles from the field of statistics to make predictions. This chapter will introduce the basics of Bayesian methods while exploring how to apply these techniques to iPhone-like destination suggestions.
- Understanding Bayesian methods (50 xp)
- Computing probabilities (100 xp)
- Understanding dependent events (50 xp)
- A simple Naive Bayes location model (100 xp)
- Examining "raw" probabilities (100 xp)
- Understanding independence (50 xp)
- Understanding NB's "naivety" (50 xp)
- Who are you calling naive? (50 xp)
- A more sophisticated location model (100 xp)
- Preparing for unforeseen circumstances (100 xp)
- Understanding the Laplace correction (50 xp)
- Applying Naive Bayes to other problems (50 xp)
- Handling numeric predictors (50 xp)
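The following is a minimal sketch of the kind of Naive Bayes location model the chapter builds, assuming the naivebayes package; the locations data frame and its location, daytype, and hourtype columns are hypothetical.

```r
library(naivebayes)

# Hypothetical data: one row per observation, recording where the phone's owner
# was (location) plus context such as weekday/weekend and time of day.
# locations <- read.csv("locations.csv")

# Naive Bayes estimates P(location | daytype, hourtype) by combining the
# individual conditional probabilities, "naively" assuming the predictors
# are independent of one another.
locmodel <- naive_bayes(location ~ daytype + hourtype,
                        data = locations,
                        laplace = 1)   # Laplace correction avoids zero probabilities
                                       # for combinations never seen in the data

# Predict the most likely location for a new observation
newcase <- data.frame(daytype = "weekday", hourtype = "morning")
predict(locmodel, newcase)                  # single best guess
predict(locmodel, newcase, type = "prob")   # full posterior probabilities
```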
3. Logistic Regression
Logistic regression involves fitting a curve to numeric data to make predictions about binary events, and it is arguably one of the most widely used machine learning methods. This chapter provides an overview of the technique while illustrating how to apply it to fundraising data.
- Making binary predictions with regression (50 xp)
- Building simple logistic regression models (100 xp)
- Making a binary prediction (100 xp)
- The limitations of accuracy (50 xp)
- Model performance tradeoffs (50 xp)
- Calculating ROC Curves and AUC (100 xp)
- Comparing ROC curves (50 xp)
- Dummy variables, missing data, and interactions (50 xp)
- Coding categorical features (100 xp)
- Handling missing data (100 xp)
- Understanding missing value indicators (50 xp)
- Building a more sophisticated model (100 xp)
- Automatic feature selection (50 xp)
- The dangers of stepwise regression (50 xp)
- Building a stepwise regression model (100 xp)
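Here is a minimal sketch of logistic regression on a binary fundraising outcome using base R's glm(), with the pROC package assumed for the ROC/AUC step; the donors data frame and its column names are hypothetical stand-ins for the course data.

```r
# Hypothetical data: donors$donated is 1 if the prospect gave a gift, 0 otherwise.
# donors <- read.csv("donors.csv")

# Fit a logistic regression: glm() with a binomial family models the
# log-odds of donating as a linear function of the predictors.
donation_model <- glm(donated ~ bad_address + interest_religion + interest_veterans,
                      data = donors, family = "binomial")
summary(donation_model)

# Predicted probabilities, then a binary prediction using a cutoff.
# With rare outcomes, a cutoff near the base rate is often more useful than 0.5.
donors$donation_prob <- predict(donation_model, type = "response")
donors$donation_pred <- ifelse(donors$donation_prob > mean(donors$donated), 1, 0)
mean(donors$donation_pred == donors$donated)   # accuracy (can be misleading)

# ROC curve and AUC give a cutoff-independent view of model performance
library(pROC)
ROC <- roc(donors$donated, donors$donation_prob)
plot(ROC, col = "blue")
auc(ROC)
```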
4. Classification Trees
Classification trees use flowchart-like structures to make decisions. Because humans can readily understand these tree structures, classification trees are useful when transparency is needed, such as in loan approval. We'll use the Lending Club dataset to simulate this scenario.
- Making decisions with trees (50 xp)
- Building a simple decision tree (100 xp)
- Visualizing classification trees (100 xp)
- Understanding the tree's decisions (50 xp)
- Growing larger classification trees (50 xp)
- Why do some branches split? (50 xp)
- Creating random test datasets (100 xp)
- Building and evaluating a larger tree (100 xp)
- Conducting a fair performance evaluation (50 xp)
- Tending to classification trees (50 xp)
- Preventing overgrown trees (100 xp)
- Creating a nicely pruned tree (100 xp)
- Why do trees benefit from pruning? (50 xp)
- Seeing the forest from the trees (50 xp)
- Understanding random forests (50 xp)
- Building a random forest model (100 xp)
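Below is a minimal sketch of a classification tree and a random forest for loan decisions, assuming the rpart, rpart.plot, and randomForest packages; the loans data frame and its columns are hypothetical.

```r
library(rpart)
library(rpart.plot)
library(randomForest)

# Hypothetical data: loans$outcome is "repaid" or "default", with applicant
# characteristics (loan amount, credit score, ...) as predictors.
# loans <- read.csv("loans.csv")
loans$outcome <- factor(loans$outcome)   # classification requires a factor outcome

# Grow a classification tree; cp (the complexity parameter) sets how much a
# split must improve the fit, which helps keep the tree from overgrowing.
loan_tree <- rpart(outcome ~ loan_amount + credit_score,
                   data = loans, method = "class",
                   control = rpart.control(cp = 0.01))

# The flowchart-like structure is easy to visualize and explain to loan officers
rpart.plot(loan_tree)

# Classify new applications:
# predict(loan_tree, newdata = new_loans, type = "class")

# A random forest grows many trees on bootstrap samples of the data and lets
# them vote, usually improving accuracy at the cost of a single tree's transparency.
loan_forest <- randomForest(outcome ~ ., data = loans, ntree = 500)
loan_forest
```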
Prerequisites
Intermediate R

Instructor
Brett Lantz, Senior Data Scientist at Sony PlayStation
Brett Lantz currently works as a data scientist at Sony PlayStation, is the author of Machine Learning with R, and teaches machine learning at the Global School in Empirical Research Methods summer program. After training as a sociologist, Brett has applied his endless thirst for data to projects that involve understanding and predicting human behavior in fields including epidemiology, higher education fundraising, and most recently, the video gaming industry.