PySpark Cheat Sheet: Spark in Python
This PySpark cheat sheet with code samples covers the basics like initializing Spark in Python, loading data, sorting, and repartitioning.
Jul 2021 · 6 min read