Project Description

When starting a career in data science, it is natural to wonder which programming tools and languages to learn first and which ones are actually used in industry. By exploring the 2017 Kaggle Data Science Survey results, you can learn about the tools used by 10,000+ people in the professional data science community.

Before starting this project, you should be comfortable manipulating data frames and have some experience working with the tidyverse packages dplyr, tidyr, and ggplot2. We recommend that you have completed at least one of the following courses:

This project uses a subset of the 2017 Kaggle Machine Learning and Data Science Survey dataset. If you want to know more about the tools and techniques Kaggle participants use, check out the full report of the 2017 Kaggle survey results.

Project Tasks

  • 1 Welcome to the world of data science
  • 2 Using multiple tools
  • 3 Counting users of each tool
  • 4 Plotting the most popular tools
  • 5 The R vs Python debate
  • 6 Plotting R vs Python users
  • 7 Language recommendations
  • 8 The most recommended language by the language used
  • 9 The moral of the story
Amber Thomas

Journalist-Engineer at The Pudding

Amber Thomas is a journalist-engineer at The Pudding, an online collection of data-driven, visual essays. Before joining The Pudding, she was a marine biologist, collecting data on all things beneath the waves. Follow her on Twitter (@ProQuesAsker) or on her personal website.

TECHNOLOGY
  • R