[Infographic] Data Science Learning Checklist

Use this handy checklist to guide your data science learning journey.
Jan 2023  · 4 min read

A career in data science is highly sought-after and lucrative. It spans a range of tasks, such as exploring and organizing data, applying machine learning techniques, and understanding business objectives. To excel in this field, you need a combination of skills, including data analysis, business understanding, communication, and more. Use this checklist as a reference point in your learning journey.

Exploratory Data Analysis

Descriptive Statistics

  • Calculate metrics on measures of location like mean and median, measures of variation like range and standard deviation, and other characteristics of features
  • Calculate metrics like correlation to understand the relationships between features.
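
As a rough illustration of the two items above, the following sketch computes location, variation, and correlation metrics using only the Python standard library. The sample data and the helper name pearson_r are made up for this example; in practice a data analysis library would usually do this work.

```python
import math
import statistics

# Hypothetical sample data: two numeric features for five observations.
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]
weights_kg = [55.0, 60.0, 66.0, 70.0, 79.0]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


summary = {
    # Measures of location
    "mean": statistics.mean(heights_cm),
    "median": statistics.median(heights_cm),
    # Measures of variation
    "range": max(heights_cm) - min(heights_cm),
    "stdev": statistics.stdev(heights_cm),  # sample standard deviation
    # Relationship between the two features
    "correlation": pearson_r(heights_cm, weights_kg),
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```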

Data Visualization

  • Create plots like bar plots, histograms and box plots to visualize single features.
  • Create plots like scatter plots, line plots and heat maps to visualize relationships between features.

Data Management

Importing & Reading Data

  • Import data from common file formats like CSV and spreadsheets.
  • Import data by querying SQL databases.
  • Import data via web APIs.
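
The first two import paths above can be sketched with the standard library alone. This example assumes a tiny inline CSV and an in-memory SQLite database, both invented for illustration; the table and column names are hypothetical. Importing via a web API is left out because it needs a live endpoint.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; a real file would be opened with open(path, newline="").
csv_text = "city,sales\nLeuven,120\nBoston,95\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
# csv returns every value as text, so numeric columns need an explicit cast.

# Load the same records into an in-memory SQL database and query them back.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(row["city"], int(row["sales"])) for row in rows],
)
big_sales = conn.execute(
    "SELECT city, amount FROM sales WHERE amount > ? ORDER BY amount DESC",
    (100,),
).fetchall()
conn.close()

print(rows)
print(big_sales)
```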

Data Wrangling

  • Perform common data manipulations such as sorting, subsetting, adding new features, and aggregating.
  • Join two datasets together via inner, left and other joins.
  • Pivot a rectangular dataset to convert rows to columns or columns to rows.
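
A minimal sketch of these manipulations, using plain lists and dictionaries rather than a data frame library. The orders and customers data are hypothetical, as is the high_value feature and its threshold of 100.

```python
from collections import defaultdict

# Hypothetical datasets.
orders = [
    {"order_id": 1, "customer_id": "a", "quarter": "Q1", "revenue": 100},
    {"order_id": 2, "customer_id": "b", "quarter": "Q1", "revenue": 80},
    {"order_id": 3, "customer_id": "a", "quarter": "Q2", "revenue": 120},
    {"order_id": 4, "customer_id": "c", "quarter": "Q2", "revenue": 50},
]
customers = {"a": "Ana", "b": "Ben"}  # note: no record for customer "c"

# Add a new feature (the threshold is an arbitrary choice for illustration).
for order in orders:
    order["high_value"] = order["revenue"] >= 100

# Subset, then sort by revenue in descending order.
high_value = sorted(
    (o for o in orders if o["high_value"]),
    key=lambda o: o["revenue"],
    reverse=True,
)

# Aggregate: total revenue per customer.
revenue_by_customer = defaultdict(int)
for order in orders:
    revenue_by_customer[order["customer_id"]] += order["revenue"]

# Inner join keeps only orders with a matching customer record.
inner = [
    {**o, "name": customers[o["customer_id"]]}
    for o in orders
    if o["customer_id"] in customers
]

# Left join keeps every order and fills missing names with None.
left = [{**o, "name": customers.get(o["customer_id"])} for o in orders]

# Pivot from long to wide: one row per customer, one column per quarter.
wide = defaultdict(dict)
for order in orders:
    wide[order["customer_id"]][order["quarter"]] = order["revenue"]
```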

Data Cleaning

  • Identify and fix issues with data constraints such as wrong data types, numbers out of range, or duplicate values.
  • Identify and fix issues with text and categorical data, such as invalid categories or incorrect formatting.
  • Identify and fix issues with data uniformity, such as incorrect units, incorrect date formats, and inconsistency between features.
  • Identify and fix issues with missing data values.
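
The checks above might look something like the following sketch. The records, the COUNTRY_MAP lookup, the 0 to 120 age range, and the rule that id identifies duplicates are all assumptions made for this example. Missing or invalid values are marked as None here rather than filled in.

```python
# Hypothetical raw records, as they might arrive from a CSV file (all text).
raw = [
    {"id": "1", "age": "34", "country": "us "},
    {"id": "2", "age": "-5", "country": "US"},
    {"id": "2", "age": "-5", "country": "US"},  # duplicate record
    {"id": "3", "age": "", "country": "Belgium"},
    {"id": "4", "age": "51", "country": "BE"},
]

# Assumed mapping from messy labels to valid categories.
COUNTRY_MAP = {"us": "US", "be": "BE", "belgium": "BE"}


def clean(records):
    """Return de-duplicated records with typed, validated fields."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        # Duplicate values: keep the first record for each id.
        if rec["id"] in seen_ids:
            continue
        seen_ids.add(rec["id"])

        # Wrong data types and missing values: text to int, blank to None.
        age = int(rec["age"]) if rec["age"].strip() else None
        # Numbers out of range: treat impossible ages as missing.
        if age is not None and not 0 <= age <= 120:
            age = None

        # Invalid categories and formatting: trim, lowercase, then map.
        country = COUNTRY_MAP.get(rec["country"].strip().lower())

        cleaned.append({"id": int(rec["id"]), "age": age, "country": country})
    return cleaned


cleaned = clean(raw)
```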

Business Acumen

Business Goals

  • Make recommendations for analytic approaches based on business goals
  • Judge performance of analytic results against KPIs or other relevant business criteria

Organizational Knowledge

  • Understand the impact of data science projects on your business.
  • Understand which teams or employees need to be involved in a data project, and in what capacity.

Programming for Data Science

Computational Thinking

  • Use common programming constructs like flow control and iteration.
  • Understand functions and functional programming to write repeatable code for analysis.
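
As a small illustration, the sketch below wraps a piece of flow control in a reusable function and applies it with a loop, with map, and inside a list comprehension. The function name, the temperature thresholds, and the readings are invented for this example.

```python
def label_temperature(celsius):
    """Classify a temperature reading using simple flow control."""
    if celsius < 10:
        return "cold"
    elif celsius < 25:
        return "mild"
    return "hot"


readings = [4.5, 18.0, 31.2, 22.1]  # hypothetical readings

# Iteration with an explicit loop.
labels_loop = []
for reading in readings:
    labels_loop.append(label_temperature(reading))

# The same result in a functional style: apply the function to each item.
labels_map = list(map(label_temperature, readings))

# Filtering with a list comprehension.
not_cold = [r for r in readings if label_temperature(r) != "cold"]
```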

Production Coding

  • Use a version control system like Git to manage code
  • Use error handling, assertions, and unit tests to ensure code quality
  • Write documentation to make your code understandable by others
  • Develop packages to make your code reusable
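
The sketch below shows one way error handling, an assertion, documentation, and unit tests can fit together in a single small function. The function conversion_rate and its rules are hypothetical, chosen only to keep the example short; version control and packaging are outside what a code snippet can show.

```python
import unittest


def conversion_rate(conversions, visits):
    """Return conversions / visits as a proportion between 0 and 1.

    Returns None when there were no visits. Raises ValueError when the
    inputs are negative or when conversions exceed visits.
    """
    if conversions < 0 or visits < 0:
        raise ValueError("counts cannot be negative")
    if conversions > visits:
        raise ValueError("conversions cannot exceed visits")
    try:
        rate = conversions / visits
    except ZeroDivisionError:
        # Error handling: no visits means the rate is undefined.
        return None
    # Assertion: guards an invariant that should always hold at this point.
    assert 0.0 <= rate <= 1.0, "rate should be a proportion"
    return rate


class ConversionRateTests(unittest.TestCase):
    def test_regular_case(self):
        self.assertEqual(conversion_rate(5, 20), 0.25)

    def test_no_visits(self):
        self.assertIsNone(conversion_rate(0, 0))

    def test_invalid_counts(self):
        with self.assertRaises(ValueError):
            conversion_rate(3, 2)


# Run the tests programmatically so the script keeps going afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ConversionRateTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```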

Model Development

Model Design

  • Choose an appropriate model type (regression, classification, clustering, etc.) based on your dataset and the analysis goals

Feature Engineering

  • Extract problem-relevant information from existing features, like getting the day of week from a datetime variable, or getting an "is working age" indicator from a date of birth.
  • Combine multiple features into new features, for example summing regional sales into total sales, or calculating profit as revenue minus costs.
  • Use external datasets to define new features, for example using a geographic API to get the city from a longitude and latitude, or using a computer vision API to determine if an image contains people.
  • Use imputation to estimate missing values.
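
Some of the feature engineering steps above can be sketched as follows. The helper names, the 16 to 64 working-age bounds, and the choice of mean imputation are assumptions for this example; the external-dataset step is omitted because it depends on a third-party API.

```python
from datetime import date
from statistics import mean

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def day_of_week(d):
    """Extract the weekday name from a date."""
    return DAY_NAMES[d.weekday()]


def is_working_age(date_of_birth, on_date, low=16, high=64):
    """Indicator feature; the age bounds are an assumption for this sketch."""
    had_birthday = (on_date.month, on_date.day) >= (
        date_of_birth.month, date_of_birth.day)
    age = on_date.year - date_of_birth.year - (0 if had_birthday else 1)
    return low <= age <= high


def profit(revenue, costs):
    """Combine two features into a new one."""
    return revenue - costs


def impute_mean(values):
    """Replace missing values (None) with the mean of the observed values."""
    observed_mean = mean(v for v in values if v is not None)
    return [observed_mean if v is None else v for v in values]


reference = date(2023, 1, 2)
print(day_of_week(reference))
print(is_working_age(date(1990, 6, 15), reference))
print(profit(1000, 650))
print(impute_mean([10.0, None, 14.0]))
```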

Model Fitting

  • Generate training and testing splits from a dataset, including using cross-validation.
  • Use hyperparameter tuning to optimize model performance.
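
To make these two ideas concrete, here is a from-scratch sketch: a shuffled train/test split, k-fold cross-validation, and a small grid search over one hyperparameter. The toy one-dimensional k-nearest-neighbours classifier, the synthetic data, and all function names are assumptions made for illustration; machine learning libraries provide tested versions of each piece.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, validation
        start = stop


def knn_predict(train, x, k):
    """Majority label (0 or 1) among the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > len(nearest) else 0


def cv_accuracy(rows, k_neighbors, folds=4):
    """Mean validation accuracy across the folds."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(rows), folds):
        train = [rows[i] for i in train_idx]
        correct = sum(
            knn_predict(train, rows[i][0], k_neighbors) == rows[i][1]
            for i in val_idx
        )
        scores.append(correct / len(val_idx))
    return sum(scores) / len(scores)


# Synthetic data: the label is 1 when the single feature is at least 10.
data = [(x, 1 if x >= 10 else 0) for x in range(20)]
train_rows, test_rows = train_test_split(data, test_fraction=0.25, seed=0)

# Hyperparameter tuning: pick the candidate with the best cross-validated score.
candidates = [1, 3, 5]
best_k = max(candidates, key=lambda k: cv_accuracy(train_rows, k))
```

The test rows are never used during tuning, so they remain available for a final, unbiased check of the chosen model.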

Model Validation

  • Evaluate supervised learning model performance using metrics like accuracy, precision and recall.
  • Evaluate unsupervised learning model performance using metrics like homogeneity, completeness, and silhouette coefficient.
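
For the supervised case, the three metrics named above follow directly from the counts in a binary confusion matrix. The sketch below uses made-up labels and a hypothetical helper, classification_metrics; the unsupervised metrics are not shown.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Of everything predicted positive, how much was right?
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of everything actually positive, how much was found?
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```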

Statistical Experimentation

Sampling Methods

  • Understand statistical distributions like the normal, uniform and Poisson distributions
  • Choose appropriate sampling methods to answer your questions while avoiding bias.
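
One way to see how the sampling method affects bias is to compare simple random sampling with proportional stratified sampling. In this sketch the population, the region field, and the stratified_sample helper are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, fraction, seed=0):
    """Sample the same fraction from every stratum defined by key(row)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample = []
    for group in strata.values():
        n = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, n))
    return sample


# Hypothetical population: 80 records from the north, 20 from the south.
population = [
    {"id": i, "region": "north" if i < 80 else "south"} for i in range(100)
]

# Simple random sample: regional shares vary from draw to draw.
simple = random.Random(0).sample(population, 10)

# Stratified sample: regional shares match the population by construction.
stratified = stratified_sample(
    population, key=lambda row: row["region"], fraction=0.1
)
south_count = sum(1 for row in stratified if row["region"] == "south")
```

With a 10% fraction, the stratified sample always contains 8 northern and 2 southern records, whereas the simple random sample may over- or under-represent the smaller region.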

Hypothesis Testing

  • Understand null and alternative hypotheses
  • Know when and how to use hypothesis tests like the t-test, chi-squared test, and Mann-Whitney U test
  • Interpret test statistics and p-values
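
The logic of null hypothesis, test statistic, and p-value can be sketched with a two-sided permutation test on the difference in means. Note that this is a different technique from the t-test, chi-squared test, and Mann-Whitney U test named above; it is used here only because it needs nothing beyond the standard library. The measurements and the function name are invented for the example.

```python
import random
from statistics import mean


def permutation_test(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Null hypothesis: both groups come from the same distribution, so the
    group labels are exchangeable. Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))  # the test statistic
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical measurements for a control and a treatment group.
control = [12.1, 11.8, 12.4, 12.0, 11.9, 12.2]
treatment = [13.0, 13.4, 12.9, 13.2, 13.1, 13.3]

observed, p_value = permutation_test(control, treatment)
print(f"difference in means: {observed:.3f}, p-value: {p_value:.4f}")
```

A small p-value means a difference at least this large would rarely arise if the null hypothesis were true; it does not measure the size or practical importance of the effect.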

Data Communication

Data Storytelling

  • Create a narrative that describes your motivation, methods, results, and conclusions
  • Ensure your narrative is consistent with the findings of the data
  • Edit your stories to remove extraneous details

Understand Your Audience

  • Understand your audience's prior knowledge and interests
  • Tailor your message to resonate with the audience, even if they are non-technical
