Joining Data with pandas

    Run the hidden code cell below to import a few of the datasets used in this course.

    Note: There are a large number of datasets in the datasets/ folder. Many of these are Pickle files, which you can read using pd.read_pickle(path_to_file). An example is included in the cell below.

    # Import pandas
    import pandas as pd
    
    # Import some of the course datasets 
    actors_movies = pd.read_csv("datasets/actors_movies.csv")
    business_owners = pd.read_pickle("datasets/business_owners.p")
    casts = pd.read_pickle("datasets/casts.p")
    
    # Preview one of the DataFrames
    casts

    Take Notes

    Add notes here about the concepts you've learned and code cells with code you want to keep.

    An inner join returns only the rows whose key values match in both the left and right tables. In the above, when both tables contain columns with the same name (other than the merge key), the merge method tells them apart by adding suffixes: address_x comes from the ward table (left) and address_y from the census table (right). To control the suffix names, pass the suffixes argument to the merge method.

    Relationships:-

    Merging multiple tables:-
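    The notes above on inner joins, suffixes, and merging multiple tables can be sketched as follows. The three small tables and their values are hypothetical stand-ins, not the course datasets; only merge() and its on and suffixes arguments are taken from pandas itself.

```python
import pandas as pd

# Hypothetical toy tables (made-up values, not the course data)
wards = pd.DataFrame({
    "ward": ["1", "2", "3"],
    "address": ["10 Main St", "20 Oak Ave", "30 Elm Rd"],
})
census = pd.DataFrame({
    "ward": ["1", "2", "4"],
    "address": ["11 Main St", "21 Oak Ave", "41 Pine Ct"],
    "pop_2010": [100, 200, 400],
})
licenses = pd.DataFrame({
    "ward": ["1", "1", "2"],
    "business": ["Cafe A", "Shop B", "Bar C"],
})

# merge() defaults to an inner join: only wards "1" and "2" exist in both tables.
# The clashing "address" columns get the default suffixes _x (left) and _y (right).
default_merge = wards.merge(census, on="ward")

# The suffixes argument replaces _x/_y with more descriptive names.
named_merge = wards.merge(census, on="ward", suffixes=("_ward", "_cen"))

# Merging multiple tables: chain another merge onto the previous result.
three_tables = named_merge.merge(licenses, on="ward")

print(default_merge.columns.tolist())
print(named_merge.columns.tolist())
print(three_tables)
```

    Because ward "1" has two licenses in the toy data, it appears twice in the three-table result: a one-to-many relationship repeats the left-hand row once per match.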

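    Here 5 is matched with 3, because 3 is the nearest value that is less than or equal to 5; 10 is matched with 7, because 7 is the nearest value that is less than or equal to 10; and 1 is matched with 1, because the two are equal. So matching uses <=: each value in the left table is paired with the nearest right-table value that does not exceed it. In the example above, notice how rows are paired on the nearest date-time across both tables; the row with the nearest date-time is taken from the ibm table.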
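    A minimal sketch of this matching rule with pd.merge_asof(), using the key values from the note (1, 5, 10 on the left; 1, 2, 3, 6, 7 on the right). The column names and the extra C_right column are assumptions added only to make the matched key visible; both tables must be sorted on the key.

```python
import pandas as pd

# Hypothetical left and right tables, both sorted on the key column "C"
left = pd.DataFrame({"C": [1, 5, 10], "A": ["A2", "A3", "A4"]})
right = pd.DataFrame({"C": [1, 2, 3, 6, 7], "D": ["D1", "D2", "D3", "D6", "D7"]})

# Copy the right-hand key so the matched value shows up in the output
right["C_right"] = right["C"]

# direction="backward" is the default: for each left key, take the last
# right row whose key is <= the left key
matched = pd.merge_asof(left, right, on="C", direction="backward")

print(matched)
```

    With these values the left keys 1, 5 and 10 pick up the right keys 1, 3 and 7, the same pairing described in the note.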

    :- For merge_asof() 3rd time

    # Add your code snippets here