R Tutorial by DrivenData & DataCamp: Data Mining the Water Table

Weston Stearns,
May 24, 2016
DrivenData Water Pumps Challenge interactive tutorial on DataCamp

New Free Course: DrivenData Water Pumps Challenge

Want to put your data science skills to work on some of the world's biggest social challenges? DrivenData provides that opportunity by hosting challenges in which data scientists compete to build the best statistical model for difficult predictive problems that make a difference. In this practice challenge tutorial, you will use a number of variables to predict which water pumps throughout Tanzania are functional, which need some repairs, and which do not work at all. A clear understanding of which water points will fail can improve maintenance operations and help ensure that clean, potable water is available to communities across Tanzania.

DrivenData Exercise Example

Explore and Predict

In this tutorial, you will be introduced to the DrivenData Water Pumps challenge. The data sets in the challenge are fairly large and feature-heavy. The tutorial will show you how to:

  • Load the challenge data set using R.

  • Explore and examine the features of your data.

  • Create visualizations to better understand the feature set.

  • Engineer new features with predictive power.

  • Quickly apply a Random Forest to test variable importance.
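
The steps above can be sketched in a few lines of R. This is only a rough illustration, not the course's own code: a small synthetic data frame stands in for the challenge files, and every column name, label, and value in it (construction_year, year_recorded, quantity, status_group, pump_age) is an assumption made for the example. The Random Forest step assumes the randomForest package is installed and is skipped otherwise.

```r
# Synthetic stand-in for the challenge data. All column names, labels, and
# values are illustrative assumptions, not the real challenge schema.
set.seed(42)
n <- 300
raw <- data.frame(
  construction_year = sample(1970:2010, n, replace = TRUE),
  year_recorded     = sample(2011:2013, n, replace = TRUE),
  quantity          = sample(c("dry", "enough", "insufficient"), n, replace = TRUE)
)
age <- raw$year_recorded - raw$construction_year
raw$status_group <- ifelse(raw$quantity == "dry" | age > 35, "non functional",
                    ifelse(age > 20, "functional needs repair", "functional"))

# Write the synthetic data to a temporary CSV so the loading step is realistic.
csv_path <- tempfile(fileext = ".csv")
write.csv(raw, csv_path, row.names = FALSE)

# 1. Load the data set, reading text columns as factors.
pumps <- read.csv(csv_path, stringsAsFactors = TRUE)

# 2. Explore the features: structure, summaries, and class balance.
str(pumps)
summary(pumps)
print(table(pumps$status_group))

# 3. Visualize: pump status stacked within each water-quantity category.
status_by_quantity <- table(pumps$status_group, pumps$quantity)
barplot(status_by_quantity, legend.text = TRUE,
        xlab = "Water quantity", ylab = "Number of pumps")

# 4. Feature engineering: the pump's age at the time it was recorded.
pumps$pump_age <- pumps$year_recorded - pumps$construction_year

# 5. Fit a quick Random Forest and inspect variable importance
#    (only if the randomForest package is available).
if (requireNamespace("randomForest", quietly = TRUE)) {
  fit <- randomForest::randomForest(
    status_group ~ construction_year + quantity + pump_age,
    data = pumps, ntree = 100, importance = TRUE
  )
  print(randomForest::importance(fit))
}
```

With the real challenge files, the read.csv() call would point at the downloaded CSVs instead of a temporary file, and the engineered features would be chosen after exploring the actual columns.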

So don't wait: get started and use your data science skills for good! Want to see other topics covered as well? Just let us know on Twitter.

Create your own course

Would you like to create your own course? With DataCamp Teach, you can easily create and host your own interactive tutorial for free. Use the same system DataCamp course creators use to develop their courses, and share your Python knowledge with the rest of the world.