R Tutorial by DrivenData & DataCamp: Data Mining the Water Table
New Free Course: DrivenData Water Pumps Challenge
Interested in putting your data science skills to work on some of the world's biggest social challenges? DrivenData provides that opportunity by hosting challenges where data scientists compete to build the best statistical model for difficult predictive problems that make a difference. In this practice challenge tutorial, you will use a number of variables to predict which water pumps throughout Tanzania are functional, which need some repairs, and which do not work at all. Knowing which water points are likely to fail can improve maintenance operations and help ensure that clean, potable water is available to communities across Tanzania.
Explore and Predict
In this tutorial, you will be introduced to the DrivenData Water Pumps challenge. The data sets in the challenge are fairly large and feature-heavy. The tutorial will show you how to:
Load the challenge data set using R.
Explore and examine the features of your data.
Create visualizations to better understand the feature set.
Engineer new features with predictive power.
Quickly apply a Random Forest to test variable importance.
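To give a feel for these steps, here is a minimal sketch in R. It does not use the real challenge files: it builds a small random toy data frame, and the column names (`construction_year`, `date_recorded`, `quantity`, `status_group`) as well as the derived `pump_age` feature are assumptions made for illustration. The Random Forest step assumes the `randomForest` package and is skipped if that package is not installed.

```r
# Toy stand-in for the challenge data. Column names and values are
# illustrative assumptions, not the tutorial's actual data.
set.seed(42)
n <- 200
pumps <- data.frame(
  id = seq_len(n),
  construction_year = sample(c(0, 1970:2013), n, replace = TRUE),
  date_recorded = as.Date("2013-01-01") + sample(0:364, n, replace = TRUE),
  quantity = factor(sample(c("dry", "enough", "insufficient"), n, replace = TRUE)),
  status_group = factor(sample(
    c("functional", "functional needs repair", "non functional"),
    n, replace = TRUE
  ))
)
# With downloaded files you would call read.csv() on them instead.

# Explore: structure, summaries, and the class balance of the target.
str(pumps)
summary(pumps)
status_counts <- table(pumps$status_group)
print(prop.table(status_counts))

# Visualize: pump status broken down by water quantity.
barplot(
  table(pumps$status_group, pumps$quantity),
  legend.text = TRUE,
  xlab = "quantity",
  ylab = "number of pumps"
)

# Feature engineering: treat a construction year of 0 as missing,
# then derive the pump's age at the time it was recorded.
pumps$construction_year[pumps$construction_year == 0] <- NA
recorded_year <- as.integer(format(pumps$date_recorded, "%Y"))
pumps$pump_age <- recorded_year - pumps$construction_year

# Simple median imputation so the model can use every row.
pumps$pump_age[is.na(pumps$pump_age)] <- median(pumps$pump_age, na.rm = TRUE)

# Random Forest for a quick look at variable importance
# (assumes the randomForest package; skipped when it is unavailable).
if (requireNamespace("randomForest", quietly = TRUE)) {
  fit <- randomForest::randomForest(
    status_group ~ quantity + pump_age,
    data = pumps,
    ntree = 100,
    importance = TRUE
  )
  print(randomForest::importance(fit))
}
```

Because the toy labels are random, the importance scores here carry no meaning. The point is only the shape of the workflow: load, explore, plot, derive a feature, then fit a quick model to see which variables matter.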
Create your own course
Would you like to create your own course? With DataCamp Teach, you can easily create and host your own interactive tutorial for free. Use the same system DataCamp course creators use to develop their courses, and share your R knowledge with the rest of the world.