Data Engineering Toolkits

  • Learn at your own pace
  • Get hands-on experience
  • Practice and apply your skills

Data Engineering Toolkit

Introduction to Python

Skill Level: Beginner
4 hr
5.9M learners
Master the basics of data analysis with Python in just four hours. This online course will introduce the Python interface and explore popular packages.

Introduction to Functions in Python

Skill Level: Beginner
3 hr
429.8K learners
Learn the art of writing your own functions in Python, as well as key concepts like scoping and error handling.

Python Toolbox

Skill Level: Beginner
4 hr
284.8K learners
Continue to build your modern Data Science skills by learning about iterators and list comprehensions.

Introduction to SQL

Skill Level: Beginner
2 hr
919.2K learners
Learn how to create and query relational databases using SQL in just two hours.

Introduction to Data Engineering

Skill Level: Beginner
4 hr
114.9K learners
Learn about the world of data engineering in this short course, covering tools and topics like ETL and cloud computing.

Intermediate SQL

Skill Level: Beginner
4 hr
287.8K learners
Accompanied at every step with hands-on practice queries, this course teaches you everything you need to know to analyze data using your own SQL code today!

Joining Data in SQL

Skill Level: Beginner
4 hr
183.2K learners
Level up your SQL knowledge and learn to join tables together, apply relational set theory, and work with subqueries.

Database Design

Skill Level: Beginner
4 hr
80.1K learners
Learn to design databases in SQL to process, store, and organize data in a more efficient way.

NoSQL Concepts

Skill Level: Beginner
2 hr
13.9K learners
In this conceptual course (no coding required), you will learn about the four major types of NoSQL databases and popular engines.

Introduction to Shell

Skill Level: Beginner
4 hr
131.9K learners
The Unix command line helps users combine existing programs in new ways, automate repetitive tasks, and run programs on clusters and clouds.

Introduction to PySpark

Skill Level: Beginner
4 hr
147.6K learners
Learn to implement distributed data management and machine learning in Spark using the PySpark package.

Data Management in SQL (PostgreSQL)

0.23 hr
This assessment might cover the following topics:

  • Assess data quality and perform validation tasks using SQL
  • Perform standard cleaning tasks to prepare data for analysis using SQL
  • Perform standard data joining tasks using SQL
  • Perform standard data extraction and aggregation tasks using SQL

Importing and Cleaning with Python

0.19 hr
This assessment might cover the following topics:

  • Performing common tidying operations such as moving between wide and long format data
  • Working with a range of file types to import into a suitable format
  • Working with multiple data sets to combine and create a relevant single data structure
  • Performing cleaning and manipulation operations on strings, dates and times
  • Accessing data through APIs and scraping techniques
  • Identifying and resolving common data issues such as incorrect categories and missing data issues

Intermediate Python

Skill Level: Beginner
4 hr
1.2M learners
Level up your data science skills by creating visualizations using Matplotlib and manipulating DataFrames with pandas.

Writing Efficient Python Code

Skill Level: Beginner
4 hr
125.4K learners
Learn to write efficient code that executes quickly and allocates resources skillfully to avoid unnecessary overhead.

Data Manipulation with pandas

Skill Level: Beginner
4 hr
418.6K learners
Learn how to import and clean data, calculate statistics, and create visualizations with pandas.

Reshaping Data with pandas

Skill Level: Beginner
4 hr
17.9K learners
Reshape DataFrames from a wide to long format, stack and unstack rows and columns, and wrangle multi-index DataFrames.

Creating PostgreSQL Databases

Skill Level: Beginner
4 hr
14.5K learners
Learn how to create a PostgreSQL database and explore the structure, data types, and how to normalize databases.

Introduction to Data Quality

Skill Level: Beginner
2 hr
10K learners
Explore the basics of data quality management. Learn the key concepts, dimensions, and techniques for monitoring and improving data quality.

Data Processing in Shell

Skill Level: Beginner
4 hr
19.5K learners
Learn powerful command-line skills to download, process, and transform data, including building a machine learning pipeline.

Understanding Cloud Computing

Skill Level: Beginner
2 hr
110.2K learners
A non-coding introduction to cloud computing, covering key concepts, terminology, and tools.

Streaming Concepts

Skill Level: Beginner
2 hr
3.8K learners
Learn about the difference between batching and streaming, scaling streaming systems, and real-world applications.

DevOps Concepts

Skill Level: Beginner
4 hr
6.3K learners
In this Introduction to DevOps, you’ll master the DevOps basics and learn the key concepts, tools, and techniques to improve productivity.

Introduction to Git

Skill Level: Beginner
4 hr
44.2K learners
Familiarize yourself with Git for version control. Explore how to track, compare, modify, and revert files, as well as collaborate with colleagues using Git.

Introduction to Docker

Skill Level: Beginner
4 hr
21.9K learners
Gain an introduction to Docker and discover its importance in the data professional’s toolkit. Learn about Docker containers, images, and more.

MLOps Concepts

Skill Level: Beginner
2 hr
20.7K learners
Discover how MLOps can take machine learning models from local notebooks to functioning models in production that generate real business value.

MLOps for Business

Skill Level: Beginner
3 hr
2.7K learners
Learn about MLOps, including the tools and practices needed for automating and scaling machine learning applications.

Machine Learning with PySpark

Skill Level: Beginner
4 hr
23.9K learners
Learn how to make predictions from data with Apache Spark, using decision trees, logistic regression, linear regression, ensembles, and pipelines.

Introduction to Scala

Skill Level: Beginner
3 hr
24.9K learners
Begin your journey with Scala, a popular language for scalable applications and data engineering infrastructure.

Our full library contains 50+ learning tracks, 400+ interactive courses, 100+ training projects, and other material.

What is DataCamp?

Learn the data skills you need online at your own pace—from non-coding essentials and Google Spreadsheets to Python, R, and SQL.

Join other IoA learners

Visit My IoA—your gateway to online services for Members and Students—to unlock your DataCamp access

Start Learning

Hands-on learning experience

Grow your data skills with short video tutorials and hands-on coding exercises.

Choose your own learning path

Choose from over 350 courses or enroll in IoA's custom learning paths for Business People, Practitioners, Thought Leaders, and C-Suite, to overcome your biggest business and technology challenges.

Flexible online training for every role

Grow your skills with data-oriented assessments, courses, projects, and practice exercises in Python, R, SQL, Excel, Tableau, Oracle, Power BI, data engineering, and more.

Ready to learn?

Visit My IoA to unlock your DataCamp access

Start Learning