
Spark courses

With Spark, data is read into memory, operations are performed, and the results are written back, resulting in faster execution. Learn core principles and common packages on DataCamp.


Recommended for Spark beginners

Build your Spark skills with interactive courses curated by real-world experts

Course

Foundations of PySpark

Skill level: Intermediate
4.7+
601 reviews
4 hr
Learn to implement distributed data management and machine learning in Spark using the PySpark package.

Track

Big Data with PySpark

3.6+
6 reviews
25 hr
Master big data processing and put it to effective use with Apache Spark through the PySpark API.

Browse Spark courses and tracks

Course

Introduction to PySpark

Skill level: Intermediate
4.7+
2,536 reviews
4 hr
Master PySpark to handle big data with ease—learn to process, query, and optimize massive datasets for powerful analytics!

Course

Big Data Fundamentals with PySpark

Skill level: Advanced
4.7+
215 reviews
4 hr
Learn the fundamentals of working with big data with PySpark.

Course

Cleaning Data with PySpark

Skill level: Advanced
4.7+
466 reviews
4 hr
Learn how to clean data with Apache Spark in Python.

Course

Machine Learning with PySpark

Skill level: Advanced
4.8+
689 reviews
4 hr
Learn how to make predictions from data with Apache Spark, using decision trees, logistic regression, linear regression, ensembles, and pipelines.

Course

Introduction to Spark SQL in Python

Skill level: Advanced
4.7+
142 reviews
4 hr
Learn how to manipulate data and create machine learning feature sets in Spark using SQL in Python.

Course

Foundations of PySpark

Skill level: Intermediate
4.7+
601 reviews
4 hr
Learn to implement distributed data management and machine learning in Spark using the PySpark package.

Course

Feature Engineering with PySpark

Skill level: Advanced
4.8+
286 reviews
4 hr
Learn the gritty details that data scientists spend 70-80% of their time on: data wrangling and feature engineering.

Course

Introduction to Spark with sparklyr in R

Skill level: Intermediate
4.7+
81 reviews
4 hr
Learn how to run big data analysis using Spark and the sparklyr package in R, and explore Spark MLlib in just 4 hours.

Course

Building Recommendation Engines with PySpark

Skill level: Advanced
4.8+
232 reviews
4 hr
Learn tools and techniques to leverage your own big data to facilitate positive experiences for your users.

Related Spark resources

Blog

The Top 20 Spark Interview Questions

Essential Spark interview questions with example answers for job-seekers, data professionals, and hiring managers.

Tim Lu

Blog

Flink vs. Spark: A Comprehensive Comparison

Comparing Flink vs. Spark, two open-source frameworks at the forefront of batch and stream processing.

Maria Eugenia Inzaugarat

8 min

Tutorial

PySpark Tutorial: Getting Started with PySpark

A hands-on PySpark tutorial: install PySpark, explore data with DataFrames, and build a K-Means clustering model for customer segmentation.

Natassha Selvaraj

15 min


Ready to apply your skills?

Projects let you apply your knowledge to a wide range of datasets and solve real-world problems in your browser.

Frequently asked questions

Which Spark course is the best for absolute beginners?

For new learners, DataCamp has three introductory Spark courses across the most popular programming languages:

Introduction to PySpark 

Introduction to Spark with sparklyr in R 

Introduction to Spark SQL in Python

Do I need any prior experience to take a Spark course?

You’ll need to have completed an introductory course in the programming language you plan to use with Spark.

DataCamp offers an introductory course for each:

Introduction to Python

Introduction to R

Introduction to SQL

Beyond that, anyone can get started with Spark through simple, interactive exercises on DataCamp.

What is PySpark used for?

If you're already familiar with Python and libraries such as pandas, PySpark is a good tool to learn for building more scalable analyses and pipelines.

Apache Spark is a computational engine that handles very large datasets by splitting the work into pieces and processing them in parallel, typically as batch jobs.

Spark is written in Scala; PySpark is its Python API, which lets you drive Spark from Python code.
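To make the "processing in parallel" idea concrete, here is a minimal sketch in plain Python. It does not use Spark or PySpark at all: it only imitates, on one machine with a thread pool, the split / process / combine pattern that Spark applies across a cluster. The helper names (`split_into_partitions`, `count_words`, `parallel_word_count`) and the sample sentences are made up for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Deal the records round-robin into num_partitions lists."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def count_words(partition):
    """Per-partition work: count the words in one chunk of lines."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, num_partitions=4):
    """Count words by processing partitions concurrently, then merging."""
    partitions = split_into_partitions(lines, num_partitions)

    # Each partition is handled independently by a worker thread.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_counts = list(pool.map(count_words, partitions))

    # Combine step: merge the per-partition results into one total.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


# Hypothetical sample data, standing in for a large dataset.
lines = [
    "spark processes data in parallel",
    "pyspark exposes spark to python",
    "python users can scale with spark",
]
word_counts = parallel_word_count(lines, num_partitions=2)
print(word_counts.most_common(3))
```

In real PySpark code the partitioning, scheduling, and merging are handled by the engine, and the workers can be separate machines rather than threads in one process.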

How can Spark help my career?

You’ll gain the ability to analyze data and train machine learning models on large-scale datasets—a valuable skill for becoming a data scientist. 

Having the expertise to work with big data frameworks like Apache Spark will set you apart.

What is Apache Spark?

Apache Spark is an open-source, distributed processing system used for big data workloads. 

It uses in-memory caching and optimized query execution to run fast analytic queries against data of any size.

It provides development APIs in Java, Scala, Python, and R, and supports code reuse across multiple workloads—batch processing, interactive queries, real-time analytics, machine learning, and graph processing.
