#### Introduction to Data Engineering

Learn about the world of data engineering with an overview of all its relevant topics and tools!

4 hours
Data Engineering
Vincent Vankrunkelsven
Course

Master the basics of data analysis in Python. Expand your skillset by learning scientific computing with numpy.

4 hours
Programming
Hugo Bowne-Anderson
Course

Master the basics of data analysis by manipulating common data structures such as vectors, matrices, and data frames.

4 hours
Programming
Jonathan Cornelissen
Course

Master the basics of querying tables in relational databases such as MySQL, Oracle, SQL Server, and PostgreSQL.

4 hours
Programming
Nick Carchedi
Course

Level up your data science skills by creating visualizations using Matplotlib and manipulating DataFrames with pandas.

4 hours
Programming
Filip Schouwenaars
Course

Continue your journey to becoming an R ninja by learning about conditional statements, loops, and vector functions.

6 hours
Programming
Filip Schouwenaars
Course

Learn the art of writing your own functions in Python, as well as key concepts like scoping and error handling.

3 hours
Programming
Hugo Bowne-Anderson
Course

Get started on the path to exploring and visualizing your own data with the tidyverse, a powerful and popular collection of data science tools within R.

4 hours
Programming
David Robinson
Course

Join two or three tables together into one, combine tables using set theory, and work with subqueries in PostgreSQL.

5 hours
Data Manipulation
Chester Ismay
Course

Learn how to use the industry-standard pandas library to import, build, and manipulate DataFrames.

4 hours
Data Manipulation
Team Anaconda
Course

Continue to build your modern Data Science skills by learning about iterators and list comprehensions.

4 hours
Programming
Hugo Bowne-Anderson
Course

Learn to import data into Python from various sources, such as Excel, SQL, SAS and right from the web.

3 hours
Importing & Cleaning Data
Hugo Bowne-Anderson
Course

Learn how to build and tune predictive models and evaluate how well they'll perform on unseen data.

4 hours
Machine Learning
Hugo Bowne-Anderson
Course

Learn to produce meaningful and beautiful data visualizations with ggplot2 by understanding the grammar of graphics.

5 hours
Data Visualization
Rick Scavetta
Course

In this course, you will learn to read CSV, XLS, and text files in R using tools like readxl and data.table.

3 hours
Importing & Cleaning Data
Filip Schouwenaars
Course

This course will equip you with all the skills you need to clean your data in Python.

4 hours
Importing & Cleaning Data
Daniel Chen
Course

Dive into data science using Python and learn how to effectively analyze and visualize your data.

4 hours
Programming
Hillary Green-Lerman
Course

Learn how to analyze data with spreadsheets using functions such as SUM(), AVERAGE(), and VLOOKUP().

3 hours
Programming
Vincent Vankrunkelsven
Course

Learn to transform and manipulate your data using dplyr.

4 hours
Data Manipulation
Chris Cardillo
Course

Learn the fundamentals of neural networks and how to build deep learning models using Keras 2.0.

4 hours
Machine Learning
Dan Becker
Course

Learn the language of data, study types, sampling strategies, and experimental design.

4 hours
Probability & Statistics
Mine Cetinkaya-Rundel
Course

Learn the basics of spreadsheets by working with rows, columns, addresses, and ranges.

2 hours
Programming
Vincent Vankrunkelsven
Course

Improve your Python data importing skills and learn to work with web and API data.

2 hours
Importing & Cleaning Data
Hugo Bowne-Anderson
Course

Build the foundation you need to think statistically and to speak the language of your data.

3 hours
Probability & Statistics
Justin Bois
Course

Learn about data science and how you can use it to strengthen your organization.

4 hours
Management
Michael Chow
Course

Learn to explore your data so you can properly clean and prepare it for analysis.

4 hours
Importing & Cleaning Data
Nick Carchedi
Course

You will learn how to tidy, rearrange, and restructure your data using versatile pandas DataFrames.

4 hours
Data Manipulation
Team Anaconda
Course

Learn complex data visualization techniques using Matplotlib and seaborn.

4 hours
Data Visualization
Team Anaconda
Course

Learn how to create relational databases, one of the most efficient ways of storing data!

4 hours
Programming
Timo Grossenbacher
Course

Master the complex SQL queries necessary to answer a wide variety of data science questions and prepare robust data sets for analysis in PostgreSQL.

4 hours
Data Manipulation
Mona Khalil
Course

The Unix command line helps users combine existing programs in new ways, automate repetitive tasks, and run programs on clusters and clouds.

4 hours
Programming
Greg Wilson
Course

Explore the world of Pivot Tables within Google Sheets, and learn how to quickly organize thousands of data points with just a few clicks of the mouse.

4 hours
Data Manipulation
Frank Sumanski
Course

Take your R skills up a notch by learning to write efficient, reusable functions.

4 hours
Programming
Richie Cotton
Course

Learn to implement distributed data management and machine learning in Spark using the PySpark package.

4 hours
Other
Lore Dirick
Course

This course is an introduction to version control with Git for data scientists.

4 hours
Programming
Greg Wilson
Course

Learn the fundamentals of how to build conversational bots using rule-based systems as well as machine learning.

4 hours
Machine Learning
Alan Nichol
Course

This course is all about the act of combining, or merging, DataFrames, an essential part of your Data Scientist's toolbox.

4 hours
Data Manipulation
Team Anaconda
Course

Learn how to describe relationships between two numerical quantities and characterize these relationships graphically.

4 hours
Probability & Statistics
Ben Baumer
Course

Learn the fundamentals of object-oriented programming: classes, objects, methods, inheritance, polymorphism, and others!

4 hours
Programming
Vicki Boykis
Course

This course will equip you with the skills to analyze, visualize, and make sense of networks using the NetworkX library.

4 hours
Probability & Statistics
Eric Ma
Course

Learn how to use graphical and numerical techniques to begin uncovering the structure of your data.

4 hours
Probability & Statistics
Andrew Bray
Course

Learn fundamental natural language processing techniques using Python and how to apply them to extract insights from real-world text data.

4 hours
Machine Learning
Katharine Jarmul
Course

Parse data in any format, whether it's flat files, statistical software, databases, or data right from the web.

3 hours
Importing & Cleaning Data
Filip Schouwenaars
Course

Explore the Stanford Open Policing Project dataset and analyze the impact of gender on police behavior using pandas.

4 hours
Data Manipulation
Kevin Markham
Course

Learn how to cluster, transform, visualize, and extract insights from unlabeled datasets using scikit-learn and scipy.

4 hours
Machine Learning
Benjamin Wilson
Course

Learn to perform the two key tasks in statistical inference: parameter estimation and hypothesis testing.

4 hours
Probability & Statistics
Justin Bois
Course

Learn to train and assess models performing common machine learning tasks such as classification and clustering.

6 hours
Machine Learning
Gilles Inghelbrecht
Course

Learn essential data structures such as lists and data frames and apply that knowledge directly to financial examples.

4 hours
Applied Finance
Lore Dirick
Course

In this course, you'll learn how to use tree-based models and ensembles for regression and classification using scikit-learn.

5 hours
Machine Learning
Elie Kawerk
Course

Learn how to create versatile and interactive data visualizations using Bokeh.

4 hours
Data Visualization
Team Anaconda
Course

Analyze the gender distribution of children's book writers and use sound to match names to gender.

45 minutes
Case Studies
Tufool Alnuaimi
Project

You will explore the market capitalization of Bitcoin and other cryptocurrencies.

45 minutes
Data Manipulation, Data Visualization...
Juan González-Vallinas
Project

Wrangle and visualize musical data to find common chords and compare the styles of different artists.

45 minutes
Case Studies
Kris Shaffer
Project

Analyze the network of characters in Game of Thrones and how it changes over the course of the books.

45 minutes
Case Studies
Mridul Seth
Project

Apply your importing and data cleaning skills to real-world soccer data.

45 minutes
Data Manipulation, Importing & Cleaning Data...
Erin LaBrecque
Project

Write SQL queries to answer interesting questions about international debt using data from The World Bank.

45 minutes
Data Manipulation, Importing & Cleaning Data
Sayak Paul
Project

Explore Disney movie data, then build a linear regression model to predict box office success.

45 minutes
Data Manipulation, Data Visualization...
Sirinda Palahan
Project

Discover the top tools Kaggle participants use for data science and machine learning.

45 minutes
Data Manipulation, Data Visualization...
Amber Thomas
Project

Discover how US bond yields behave using descriptive statistics and advanced modeling.

45 minutes
Data Visualization, Applied Finance
József Soltész
Project

Import, clean, and analyze seven years' worth of training data tracked on the Runkeeper app.

45 minutes
Data Manipulation, Data Visualization...
Andrii Pavlenko
Project

Use tree-based machine learning methods to identify the characteristics of legendary Pokémon.

45 minutes
Data Manipulation, Data Visualization...
Joshua Feldman
Project

Use logistic regression to determine which treatment procedure is more effective for kidney stone removal.

45 minutes
Data Visualization, Probability & Statistics...
Amy Yang
Project

Process ingredient lists for cosmetics on Sephora then visualize similarity using t-SNE and Bokeh.

45 minutes
Data Manipulation, Data Visualization...
Jiwon Jeong
Project

Load, clean, and explore Super Bowl data in the age of soaring ad costs and flashy halftime shows.

45 minutes
Data Manipulation, Data Visualization...
Erin LaBrecque
Project

Load, clean, and explore Super Bowl data in the age of soaring ad costs and flashy halftime shows.

45 minutes
Data Manipulation, Data Visualization...
David Venturi
Project

Check which passwords fail to conform to the National Institute of Standards and Technology password guidelines.

45 minutes
Case Studies
Rasmus Bååth
Project

Analyze health survey data to determine how BMI is associated with physical activity and smoking.

45 minutes
Data Manipulation, Probability & Statistics...
Jessica Minnier
Project

Apply hierarchical and mixed-effect models to analyze Maryland crime rates.

45 minutes
Data Manipulation, Data Visualization...
Richard Erickson
Project

Use your logistic regression skills to protect people from becoming zombies!

45 minutes
Data Manipulation, Data Visualization...
Jenine Harris
Project

Predict the impact of climate change on bird distributions using spatial data and machine learning.

45 minutes
Data Manipulation, Data Visualization...
Laurens Geffert
Project

Use pandas to calculate and compare profitability and risk of different investments using the Sharpe Ratio.

45 minutes
Applied Finance, Case Studies
Stefan Jansen
Project

Use NLP and clustering on movie plot summaries from IMDb and Wikipedia to quantify movie similarity.

45 minutes
Data Manipulation, Data Visualization...
Anubhav Singh
Project

Build a binary classifier to predict if a blood donor is likely to donate again.

45 minutes
Data Manipulation, Machine Learning...
Dimitri Denisjonok
Project

Use cluster analysis to glean insights into cryptocurrency gambling behavior.

45 minutes
Data Manipulation, Data Visualization...
Eric Hare
Project

Apply unsupervised learning techniques to help plan an education program in Argentina.

45 minutes
Data Manipulation, Data Visualization...
Rafael La Buonora
Project

Use R to make art and create imaginary flowers inspired by nature.

45 minutes
Data Visualization, Case Studies
Antonio Sánchez Chinchón
Project

Load, clean, and visualize scraped Google Play Store data to understand the Android app market.

45 minutes
Data Manipulation, Data Visualization...
Lavanya Gupta
Project

Use data science to catch criminals, plus find new ways to volunteer personal time for social good.

45 minutes
Data Manipulation, Data Visualization...
William Connell
Project

Scrape news headlines for FB and TSLA then apply sentiment analysis to generate investment insight.

45 minutes
Data Manipulation, Data Visualization...
Juan González-Vallinas
Project

Build a book recommendation system using NLP and the text of books like "On the Origin of Species."

45 minutes
Data Manipulation, Data Visualization...
Philippe Julien
Project

Explore the salary potential of college majors with a k-means cluster analysis.

45 minutes
Data Manipulation, Data Visualization...
Jaclyn Burge
Project

If you've never done a DataCamp project, this is the place to start!

45 minutes
Data Manipulation, Data Visualization...
David Venturi
Project

Analyze admissions data from UC Berkeley and find out if the university was biased against women.

45 minutes
Data Manipulation, Data Visualization...
Joshua Feldman
Project

Analyze the dialog and IMDb ratings of 287 South Park episodes. Warning: contains explicit language.

45 minutes
Data Manipulation, Data Visualization...
Patrik Drhlík
Project

Build a machine learning model to predict if a credit card application will get approved.

45 minutes
Data Manipulation, Machine Learning...
Sayak Paul
Project

Build a deep learning model that can automatically detect honey bees and bumble bees in images.

45 minutes
Data Manipulation, Data Visualization...
Emily Miller
Project

Experiment with clustering algorithms to help doctors inform treatment for heart disease patients.

45 minutes
Data Manipulation, Data Visualization...
Megan Robertson
Project

Plot Google Trends data to find the most famous Kardashian/Jenner sister. Is it Kim? Kendall? Kylie?

45 minutes
Data Manipulation, Data Visualization...
David Venturi
Project

Write functions to forecast time series of food prices in Rwanda.

45 minutes
Data Manipulation, Data Visualization...
Richie Cotton
Project

Apply text mining to Donald Trump's tweets to confirm if he writes the (angrier) Android half.

45 minutes
Data Manipulation, Data Visualization...
David Robinson
Project

Build a convolutional neural network to classify images of letters from American Sign Language.

45 minutes
Data Manipulation, Data Visualization...
Alexis Cook
Project

Play bank data scientist and use regression discontinuity to see which debts are worth collecting.

45 minutes
Data Manipulation, Data Visualization...
Howard Friedman
Project

Use regression trees and random forests to find places where New York taxi drivers earn the most.

45 minutes
Data Visualization, Machine Learning...
Robert Grant
Project

Reanalyse the data behind one of the most important discoveries of modern medicine: handwashing.

60 minutes
Data Manipulation, Data Visualization...
Rasmus Bååth
Project

Apply your skills from "Working with Dates and Times in R" to breathalyzer data from Ames, Iowa.

45 minutes
Data Manipulation, Data Visualization...
Samantha Tyner
Project

Use pandas and Bayesian statistics to see if left-handed people actually die earlier than righties.

45 minutes
Data Manipulation, Data Visualization...
Madeleine Bonsma-Fisher
Project

Create and explore interactive maps using Leaflet to determine where to open the next Chipotle.

45 minutes
Data Manipulation, Data Visualization...
Rich Majerus
Project

Flex your pandas muscles on breath alcohol test data from Ames, Iowa, USA.

45 minutes
Data Manipulation, Data Visualization...
Samantha Tyner
Project

Build a machine learning classifier that knows whether President Trump or Prime Minister Trudeau is tweeting!

60 minutes
Data Manipulation, Data Visualization...
Katharine Jarmul
Project

How can we find a good strategy for reducing traffic-related deaths?

60 minutes
Data Manipulation, Data Visualization...
Joel Östblom
Project