Course Notes: Introduction to PySpark

    Use this workspace to take notes, store code snippets, and build your own interactive cheatsheet!

    Note that the data from the course is not yet added to this workspace. You will need to navigate to the course overview page, download any data you wish to use, and add it to the file browser.

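    # Import any packages you want to use here
    from pyspark.sql import SparkSession
    # The course environment predefines `spark`; getOrCreate() reuses that session
    spark = SparkSession.builder.getOrCreate()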

    Take Notes

    Add notes here about the concepts you've learned and code cells with code you want to keep.

    Use SQL with Spark

    query = "SELECT * FROM flights LIMIT 10"
    
    # Get the first 10 rows of flights
    flights10 = spark.sql(query)
    
    # Show the results
    flights10.show()

    Read a CSV with Spark

    # Don't change this file path
    file_path = "/usr/local/share/datasets/airports.csv"
    
    # Read in the airports data
    airports = spark.read.csv(file_path, header=True)
    
    # Show the data
    airports.show()