Introduction to PySpark
Learn to implement distributed data management and machine learning in Spark using the PySpark package.
Follow short videos led by expert instructors and then practice what you’ve learned with interactive exercises in your browser.
Learn the fundamentals of working with big data with PySpark.
Learn how to clean data with Apache Spark in Python.
Learn how to make predictions from data with Apache Spark, using decision trees, logistic regression, linear regression, ensembles, and pipelines.
Learn how to manipulate data and create machine learning feature sets in Spark using SQL in Python.
Learn the gritty details that data scientists spend 70-80% of their time on: data wrangling and feature engineering.
Learn tools and techniques to leverage your own big data to facilitate positive experiences for your users.
Learn how to run big data analysis using Spark and the sparklyr package in R, and explore Spark MLlib in just 4 hours.