12 Useful Data Science Walkthroughs
July 24th, 2017
So you’ve developed some base skills in programming, data visualization, data manipulation, and so on, and are looking for ways to apply those skills and build a data science portfolio?
We’re here to help.
Practicing your skills with concrete examples will boost your data science confidence and help you identify and solve problems in the real world. For this reason, we’ve put together a collection of high-quality walkthroughs covering Text Mining, ML, Deep Learning, Finance, and more.
Check it out and let us know your favorite!
Text Mining in R

In this 3-part tutorial, you will learn how to scrape H-1B visa data with R. DataCamp instructor Ted Kwartler walks you through how to parse and store the JSON data, perform Exploratory Data Analysis, add visuals, and finally create a map of the data with the help of a geocoding API. This walkthrough is valuable because it shows all the steps a data scientist would take to answer a question: Can Data Help Your H-1B Visa Application?

Characterizing Twitter followers with tidytext - Explore tidytext in this walkthrough by analyzing your Twitter followers’ descriptions to learn more about them.
Data Mining (Python)
Introduction to Market Basket Analysis in Python - Learn how to use market basket analysis to find common patterns of items in large datasets. This walkthrough applies the technique to a large online retail data set to look for interesting purchase combinations.
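To give a feel for the idea before diving into the walkthrough, here is a minimal sketch of the counting step behind market basket analysis: measuring how often pairs of items appear in the same transaction (their "support"). It uses only the Python standard library; the function name `pair_support`, the threshold, and the toy baskets are made up for illustration and are not taken from the walkthrough, which works with a real retail data set and its own tooling.

```python
from collections import Counter
from itertools import combinations


def pair_support(transactions, min_support=0.5):
    """Return item pairs whose support meets min_support.

    Support of a pair = share of transactions that contain both items.
    (Hypothetical helper, written for illustration only.)
    """
    n = len(transactions)
    counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each pair has one canonical ordering
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Made-up toy baskets, for illustration only
baskets = [
    ["bread", "milk"],
    ["bread", "milk", "butter"],
    ["milk", "butter"],
    ["bread", "milk"],
]

frequent_pairs = pair_support(baskets, min_support=0.5)
for pair, support in sorted(frequent_pairs.items()):
    print(pair, support)
```

On real data, the pair counts quickly become too large to enumerate naively, which is why dedicated algorithms and libraries are normally used for this step.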
Machine Learning
Machine Learning (ML) is becoming increasingly essential in a data scientist’s toolbox for both R and Python. Advances in ML are a big reason why data science has become such an in-demand skill. The 3 walkthroughs below show you how to use scikit-learn (Python) and Caret (R) along with a series of Machine Learning techniques.
Scikit-Learn (Python)

Python Machine Learning: Scikit-Learn Tutorial - This introductory post covers the basics of scikit-learn using digits data. The techniques covered here are Principal Component Analysis (PCA), Support Vector Machines (SVM), and K-Means.

Scikit-Learn Tutorial: Baseball Analytics - This 2-part walkthrough uses baseball datasets to predict Major League Baseball (MLB) teams’ wins per season based on team statistics, and which players will be voted into the Hall of Fame based on career statistics and awards. The techniques covered here are Linear Regression, K-Means, Logistic Regression, and Random Forest.
Caret (R)

Machine Learning in R For Beginners - This walkthrough covers multiclass classification with the well-known k-nearest neighbor algorithm, using the caret library. This short introduction to ML in R is a must for R learners, and the data used here is the famous iris dataset.
Building a Classifier

What I learned From Implementing A Classifier From Scratch - This is a great walkthrough for understanding what is under the hood of Machine Learning. Without using a pre-existing library, you build a classifier from scratch to better understand its inner workings.
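As a taste of what "from scratch" can look like, here is a minimal sketch of a k-nearest neighbor classifier written with only the Python standard library. Note the assumptions: k-nearest neighbor is chosen here purely for illustration and is not necessarily the classifier built in the walkthrough, and the function name `knn_predict` and the toy points are made up.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest
    training points (Euclidean distance). Hypothetical helper for illustration."""
    # Pair each training point's distance to the query with its label,
    # then sort so the closest points come first
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest points and return the most common one
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up toy data: two well-separated clusters
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(points, labels, (1.1, 1.0), k=3)
print(prediction)
```

Writing even a small classifier like this by hand makes it easier to see what a library call is doing for you, such as the distance metric and the voting rule.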
Forecasting (Python)

Forecasting Website Traffic Using Facebook’s Prophet Library - Facebook open-sourced an R and Python library called prophet to automate the forecasting process. This walkthrough introduces the library and uses it to predict traffic volume for a website.
Deep Learning
Even more so than Machine Learning, Deep Learning is drawing attention across the data science world. Companies are investing in infrastructure and talent to take advantage of this new field. If you want to become an elite data scientist, Deep Learning is a must.
Keras (R + Python)

Keras Tutorial: Deep Learning in Python - Build a Multi-Layer Perceptron (MLP) for classification and regression tasks using a wine data set.

keras: Deep Learning in R - The Keras package was recently launched in R, so be an early adopter! Here you will build an MLP for multiclass classification, again using the iris dataset.
TensorFlow

TensorFlow Tutorial For Beginners (Python) - Work on Belgian traffic signs data with Google’s very own TensorFlow, one of the more promising deep learning libraries.
Finance (Python)
Python For Finance: Algorithmic Trading - Perform financial analysis, develop a trading strategy, and backtest it using Quantopian in this popular walkthrough.
For more data science content, create a free DataCamp account to receive a newsletter every Tuesday with the best data science news and projects!