
Course

Data Transformation with Spark SQL in Databricks

Intermediate skill level
Updated May 2026
Build end-to-end data pipelines, from cleaning and aggregation to streaming and orchestration.
Databricks, Data Engineering
3 hours
7 videos
25 exercises
1,750 XP
Statement of Accomplishment


Course Description

Ready to handle real-world data at scale? This course teaches you to transform large datasets using Spark SQL and PySpark in Databricks. Learn to shape and clean data, run aggregations with optimized joins, and apply window functions for advanced analytics. You'll also set up file-based streaming with fault-tolerant checkpoints and persist results as Delta tables. By the end, you'll be orchestrating multi-step production pipelines with Databricks Workflows and Lakeflow Declarative Pipelines.

Prerequisites

Introduction to Databricks SQL
Introduction to PySpark
1. Loading and Shaping Data

In this chapter, you'll learn how to work with Databricks notebooks, load CSV data into Spark DataFrames, and shape data using PySpark and SQL.
2. Data Cleaning and Optimization

Learn how to define explicit schemas, build a data cleaning pipeline, and optimize query performance with broadcast joins.