Course Notes: Joining Data with pandas
Ch1 Data Merging Basics
An inner join merges two DataFrames and keeps only the rows whose key values appear in both, so the result is the intersection of the two tables on the key.
--What column to merge on?
.merge(dataframe, on='overlapped_column') # on is the column that both DataFrames have in common
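As a minimal sketch of the inner join described above: the two tables, their column names, and their values are made up for illustration and are not from the course data.

```python
import pandas as pd

# Hypothetical tables that share a "ward" column.
left = pd.DataFrame({"ward": [1, 2, 3], "alderman": ["A", "B", "C"]})
right = pd.DataFrame({"ward": [2, 3, 4], "pop": [100, 200, 300]})

# merge() defaults to how="inner": only wards present in both tables survive.
merged = left.merge(right, on="ward")

print(merged)
print(merged.shape)  # (2, 3): wards 2 and 3, with columns ward, alderman, pop
```

Wards 1 and 4 are dropped because each appears in only one of the two tables.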
--Your first inner join
.merge(dataframe, on='same_column', suffixes=('_df','_df1')) # suffixes are appended to overlapping column names to show which DataFrame each one came from
.value_counts() # counts how many times each distinct value occurs
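A small sketch of suffixes and .value_counts() together, assuming two hypothetical tables (owners, shops) that both have a "name" column in addition to the key:

```python
import pandas as pd

# Hypothetical data; "name" exists in both tables, so it needs suffixes.
owners = pd.DataFrame({"id": [1, 2, 3], "name": ["x", "y", "z"], "city": ["A", "A", "B"]})
shops = pd.DataFrame({"id": [1, 2, 3], "name": ["p", "q", "r"]})

both = owners.merge(shops, on="id", suffixes=("_own", "_shop"))
print(both.columns.tolist())  # ['id', 'name_own', 'city', 'name_shop']

# Count how many merged rows fall in each city.
counts = both["city"].value_counts()
print(counts)
```

Without suffixes, pandas would fall back to its defaults ('_x', '_y'), which say less about where each column came from.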
--Inner joins and number of rows returned
--One-to-many classification
"""
one to one: each row in the left table matches exactly one row in the right table
one to many: each row in the left table can match one or more rows in the right table
"""
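The one-to-many case can be sketched as follows. The tables are hypothetical, and the validate argument is an optional check that pandas offers on merge; it is included here only to make the assumed relationship explicit.

```python
import pandas as pd

# Hypothetical data: each ward appears once on the left, several times on the right.
wards = pd.DataFrame({"ward": [1, 2], "alderman": ["A", "B"]})
licenses = pd.DataFrame({"ward": [1, 1, 1, 2], "business": ["b1", "b2", "b3", "b4"]})

# validate="one_to_many" raises MergeError if the left keys are not unique.
ward_licenses = wards.merge(licenses, on="ward", validate="one_to_many")

print(len(wards), len(ward_licenses))  # 2 4
```

The merged table has more rows than the left table because the row for ward 1 is repeated once per matching license.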
--One-to-many merge
--Group the results by title then count the number of accounts--
counted_df = licenses_owners.groupby("title").agg({'account':'count'})
The dict {'account':'count'} applies the count function to the account column within each title group. The result therefore holds, for each title, the number of non-null account values.
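To see the grouping and counting on something concrete, here is a sketch with a made-up licenses_owners table (the real one comes from the course merge and is not reproduced here):

```python
import pandas as pd

# Hypothetical stand-in for the merged licenses_owners table.
licenses_owners = pd.DataFrame({
    "title": ["CEO", "CEO", "SECRETARY", "CEO"],
    "account": [10, 11, 12, None],  # one missing account on purpose
})

counted_df = licenses_owners.groupby("title").agg({"account": "count"})

# Sort so the most common titles come first.
sorted_df = counted_df.sort_values("account", ascending=False)
print(sorted_df)
```

CEO is counted twice rather than three times because 'count' skips the missing account value.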
--Three table merge