Professional Data Engineer in Python

Updated March 2026

Dive deep into advanced skills and state-of-the-art tools revolutionizing data engineering roles today with our Professional Data Engineer track.

Track Description

Professional Data Engineer in Python

Take your skills to the next level with our Professional Data Engineer track. This advanced track is designed to build on the Associate Data Engineer in SQL and Data Engineer in Python tracks. It equips you with the cutting-edge knowledge and tools demanded by modern data engineering roles. Throughout this journey, you'll master modern data architectures, enhance your Python skills with a deep dive into object-oriented programming, explore NoSQL databases, and harness the power of dbt for seamless data transformation. Unlock the secrets of DevOps with essential practices, advanced testing techniques, and tools like Docker to streamline your development and deployment processes. Immerse yourself in big data technologies with PySpark and achieve mastery in data processing and automation using shell scripting. Engage in hands-on projects and tackle real-world datasets to apply your knowledge, debug complex workflows, and optimize data processes. By completing this track, you'll not only gain the advanced skills needed to conquer complex data engineering challenges but also the confidence to apply them in the dynamic world of data engineering.

Prerequisites

Data Engineer

Course
1
Understanding Modern Data Architecture
Discover modern data architecture's key components, from ingestion and serving to governance and orchestration.
Course
2
Introduction to Shell
The Unix command line helps users combine existing programs in new ways, automate repetitive tasks, and run programs on clusters and clouds.
Course
3
Containerization and Virtualization Concepts
Learn the essentials of VMs, containers, Docker, and Kubernetes. Understand the differences to get started!
Course
4
Introduction to dbt
This course introduces dbt for data modeling, transformations, testing, and building documentation.
Course
5
Introduction to Object-Oriented Programming in Python
Discover the fundamental concepts of object-oriented programming (OOP), building custom classes and objects!
Course
6
Introduction to NoSQL
Conquer NoSQL and supercharge data workflows. Learn Snowflake to work with big data, Postgres JSON for handling document data, and Redis for key-value data.
Course
7
DevOps Concepts
In this Introduction to DevOps, you’ll master the DevOps basics and learn the key concepts, tools, and techniques to improve productivity.
Course
8
Introduction to Testing in Python
Master Python testing: Learn methods, create checks, and ensure error-free code with pytest and unittest.
Project
Bonus
Debugging Code
Sharpen your debugging skills to enhance sales data accuracy.
Course
10
Introduction to Docker
Gain an introduction to Docker and discover its importance in the data professional’s toolkit. Learn about Docker containers, images, and more.
Course
11
Introduction to PySpark
Master PySpark to handle big data with ease—learn to process, query, and optimize massive datasets for powerful analytics!
Chapter
Bonus
Introduction to Big Data analysis with Spark
This chapter introduces Big Data along with the main concepts and frameworks used to process it. You will understand why Apache Spark is widely regarded as a leading framework for Big Data.
Chapter
Bonus
Programming in PySpark RDD’s
The main abstraction Spark provides is the resilient distributed dataset (RDD), the engine's fundamental data type. This chapter introduces RDDs and shows how they can be created and processed using RDD transformations and actions.
Chapter
Bonus
PySpark SQL & DataFrames
In this chapter, you'll learn about Spark SQL, a Spark module for structured data processing. It provides a programming abstraction called DataFrames and can also act as a distributed SQL query engine. This chapter shows how Spark SQL lets you work with DataFrames in Python.
Project
Bonus
Cleaning an Orders Dataset with PySpark
Step into a data engineer's shoes and master data cleaning with PySpark on an e-commerce orders dataset!
Chapter
Bonus
Downloading Data on the Command Line
In this chapter, we learn how to download data files from web servers via the command line. In the process, we also learn about documentation manuals, option flags, and multi-file processing.
Chapter
Bonus
Data Pipeline on the Command Line
In the last chapter, we bridge the connection between command line and other data science languages and learn how they can work together. Using Python as a case study, we learn to execute Python on the command line, to install dependencies using the package manager pip, and to build an entire model pipeline using the command line.
Course
18
Streaming Concepts
Learn about the difference between batching and streaming, scaling streaming systems, and real-world applications.
Course
19
Introduction to Apache Kafka
Master Apache Kafka! From core concepts to advanced architecture, learn to create, manage, and troubleshoot Kafka for real-world data streaming challenges!
Course
20
Introduction to Kubernetes
In this course, you will learn the fundamentals of Kubernetes and deploy and orchestrate containers using Manifests and kubectl instructions.
Resource
Bonus
Impactful Data Engineering—with Datadog's Wouter de Bie
Understand how data engineering can impact your business.
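
Two of the topics in the list above, building custom classes and testing Python code with unittest, can be illustrated together in a few lines. The sketch below is not taken from any course material: the `OrderLine` class, its fields, and the `run_tests` helper are hypothetical names chosen only for illustration, and it uses the standard library alone.

```python
import unittest
from dataclasses import dataclass


@dataclass
class OrderLine:
    """Hypothetical record type, used only for this illustration."""
    sku: str
    quantity: int
    unit_price: float

    def total(self) -> float:
        # Reject impossible quantities instead of returning a misleading total.
        if self.quantity < 0:
            raise ValueError("quantity must be non-negative")
        return self.quantity * self.unit_price


class OrderLineTest(unittest.TestCase):
    def test_total_multiplies_quantity_by_price(self):
        self.assertEqual(OrderLine("A-1", 3, 2.5).total(), 7.5)

    def test_negative_quantity_is_rejected(self):
        with self.assertRaises(ValueError):
            OrderLine("A-1", -1, 2.5).total()


def run_tests() -> unittest.TestResult:
    # Load and run only the test case defined above, then return the result.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(OrderLineTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Saved to a file, the same tests can be run with `python -m unittest`. pytest can also collect `unittest.TestCase` classes, so the sketch should work unchanged with either tool named in the testing course description.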

Professional Data Engineer in Python

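13 courses

Track completion

Earn a Statement of Accomplishment

Add this credential to your LinkedIn profile, resume, or CV.
Share it on social media and in your performance review.

Join over 19 million learners and start Professional Data Engineer in Python today!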
