Advance your data skills by mastering Apache Spark. Using PySpark, the Python API for Spark, you will run parallel computations on large datasets and get ready for high-performance machine learning. From cleaning data to creating features and implementing machine learning models, you'll execute end-to-end workflows with Spark. The track ends with building a recommendation engine using the popular MovieLens dataset and the Million Song Dataset.