跳至内容
首页Spark

课程

Feature Engineering with PySpark

高级技能水平
更新时间 2026年1月
Learn the gritty details that data scientists are spending 70-80% of their time on; data wrangling and feature engineering.
免费开始课程
SparkData Manipulation
4小时
16 视频
60 道练习
5,000 XP
17,763
成就证明

创建您的免费帐户

继续使用 Google显示更多选项


继续操作即表示您接受我们的《使用条款》和《隐私政策》,并同意您的数据存储在美国。

深受数千家公司学习者的喜爱

Group

需要团队培训?

企业版试用

课程描述

The real world is messy and your job is to make sense of it. Toy datasets like MTCars and Iris are the result of careful curation and cleaning, even so the data needs to be transformed for it to be useful for powerful machine learning algorithms to extract meaning, forecast, classify or cluster. This course will cover the gritty details that data scientists are spending 70-80% of their time on; data wrangling and feature engineering. With size of datasets now becoming ever larger, let's use PySpark to cut this Big Data problem down to size!

先决条件

Supervised Learning with scikit-learnIntroduction to PySpark
1

Exploratory Data Analysis

Get to know a bit about your problem before you dive in! Then learn how to statistically and visually inspect your dataset!
开始章节
2

Wrangling with Spark Functions

Real data is rarely clean and ready for analysis. In this chapter learn to remove unneeded information, handle missing values and add additional data to your analysis.
开始章节
Feature Engineering with PySpark
课程完成

获得成就证明

将此证书添加到您的 LinkedIn 档案、简历或履历中
在社交媒体和绩效评估中分享
立即注册

加入超过19百万学习者,今天就开始Feature Engineering with PySpark!

创建您的免费帐户

继续使用 Google显示更多选项


继续操作即表示您接受我们的《使用条款》和《隐私政策》,并同意您的数据存储在美国。

通过 DataCamp for Mobile 提升您的数据技能

随时随地通过我们的移动课程和每日 5 分钟编程挑战提升技能。