Joining Data with Pandas

Ch1 Data Merging Basics

An inner join merges two DataFrames and keeps only the rows whose key values appear in both, so the result is the intersection of the two tables on that key.

--What column to merge on?

.merge(dataframe, on='overlapped_column')  # on= names the column that both DataFrames share
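
A minimal sketch of an inner join on a shared column. The tables taxi_owners and taxi_veh, the key vid, and all values are made up for illustration; they are not taken from the course data.

```python
import pandas as pd

# Hypothetical toy tables (names and values are assumptions for illustration).
taxi_owners = pd.DataFrame({"vid": [1, 2, 3], "owner": ["Ann", "Bob", "Cy"]})
taxi_veh = pd.DataFrame({"vid": [2, 3, 4], "make": ["Toyota", "Ford", "Kia"]})

# how="inner" is the default, so only vid values found in BOTH tables are kept.
merged = taxi_owners.merge(taxi_veh, on="vid")

print(merged)
print(merged.shape)
```

vid 1 exists only on the left and vid 4 only on the right, so both are dropped and two rows remain.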

--Your first inner join

.merge(dataframe, on='same_column', suffixes=('_df','_df1')) # suffixes are appended to overlapping non-key column names to show which DataFrame each came from
.value_counts() # counts how many times each distinct value occurs
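
A small sketch of suffixes and value_counts() together. The tables, the column names (name, fuel_type) and the suffix strings are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical tables that both contain a non-key column called "name".
owners = pd.DataFrame({"vid": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
vehicles = pd.DataFrame({
    "vid": [1, 2, 3],
    "name": ["Cab A", "Cab B", "Cab C"],
    "fuel_type": ["hybrid", "gasoline", "hybrid"],
})

# The key column "vid" is kept once; the clashing "name" columns get suffixes.
merged = owners.merge(vehicles, on="vid", suffixes=("_own", "_veh"))
print(merged.columns.tolist())

# Count how often each fuel type occurs in the merged table.
fuel_counts = merged["fuel_type"].value_counts()
print(fuel_counts)
```

Without suffixes=, pandas would fall back to its default suffixes _x and _y for the overlapping columns.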

--Inner joins and number of rows returned

--One-to-many classification

"""
one to one: each row in the left table matches only one row in the right table
one to many: a row in the left table can match more than one row in the right table
"""
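
A rough sketch of how a one-to-many relationship changes the row count. The wards and licenses tables and their values are hypothetical. The validate argument is optional; it asks pandas to raise an error if the left key turns out not to be unique.

```python
import pandas as pd

# Hypothetical tables: one row per ward, several businesses per ward.
wards = pd.DataFrame({"ward": [1, 2], "alderman": ["A. Smith", "B. Jones"]})
licenses = pd.DataFrame({
    "ward": [1, 1, 2],
    "business": ["Cafe", "Bakery", "Garage"],
})

# Each ward row is repeated once for every matching license row.
ward_licenses = wards.merge(licenses, on="ward", validate="one_to_many")

print(len(wards), len(ward_licenses))
print(ward_licenses)
```

The merged table has more rows than wards because ward 1 matches two licenses.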

--One-to-many merge

--Group the results by title then count the number of accounts--

counted_df = licenses_owners.groupby("title").agg({'account':'count'})

The dict {'account':'count'} applies the count function to the account column within each group of the DataFrame grouped by title. The result is the number of non-null account values for each title.
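
A short sketch of the groupby/agg step on a stand-in for licenses_owners. The rows below are invented for illustration; only the column names account and title come from the notes.

```python
import pandas as pd

# Hypothetical stand-in for the merged licenses_owners table.
licenses_owners = pd.DataFrame({
    "account": [10, 11, 12, 13],
    "title": ["PRESIDENT", "SECRETARY", "PRESIDENT", "PRESIDENT"],
})

# Group rows by title, then count the non-null account values in each group.
counted_df = licenses_owners.groupby("title").agg({"account": "count"})

# Optional: put the most frequent titles first.
sorted_df = counted_df.sort_values("account", ascending=False)
print(sorted_df)
```

The result is indexed by title, with one account column holding the per-title counts.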

--Three table merge