Being able to combine and work with multiple datasets is an essential skill for any aspiring Data Scientist. pandas is a cornerstone of the Python data science ecosystem, with Stack Overflow recording 5 million views for pandas questions. Learn to handle multiple DataFrames by combining, organizing, joining, and reshaping them using pandas. You'll work with datasets from the World Bank and the City of Chicago. You will finish the course with a solid skillset for joining data in pandas.
Data Merging Basics (Free)
Learn how you can merge disparate data using inner joins. By combining information from multiple sources you’ll uncover compelling insights that may have previously been hidden. You’ll also learn how the relationship between those sources, such as one-to-one or one-to-many, can affect your result.
Exercises:
Inner join
What column to merge on?
Your first inner join
Inner joins and number of rows returned
One-to-many relationships
One-to-many classification
One-to-many merge
Merging multiple DataFrames
Total riders in a month
Three table merge
One-to-many merge with multiple tables
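As a rough illustration of the inner join and the one-to-many relationship described in this chapter, here is a minimal sketch. The `wards` and `licenses` tables, their columns, and their values are made up for this example and are not the course datasets.

```python
import pandas as pd

# Hypothetical toy tables (illustrative only, not the course data).
wards = pd.DataFrame({"ward": [1, 2, 3], "alderman": ["A", "B", "C"]})
licenses = pd.DataFrame(
    {"ward": [1, 1, 2, 4], "business": ["Cafe", "Bakery", "Gym", "Salon"]}
)

# Inner join: keep only rows whose "ward" value appears in both tables.
# Ward 1 has two licenses (one-to-many), so it appears twice in the result.
ward_licenses = wards.merge(licenses, on="ward", how="inner")

print(ward_licenses)
print(len(ward_licenses), "rows returned")
```

Ward 3 (no licenses) and ward 4 (no matching ward row) are dropped, which is why the row count of an inner join depends on both the overlap of the keys and the relationship between the tables.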
Merging Tables With Different Join Types
Take your knowledge of joins to the next level. In this chapter, you’ll work with TMDb movie data as you learn about left, right, and outer joins. You’ll also discover how to merge a table to itself and merge on a DataFrame index.
Exercises:
Left join
Counting missing rows with left join
Enriching a dataset
How many rows with a left join?
Other joins
Right join to find unique movies
Popular genres with right join
Using outer join to select actors
Merging a table to itself
Self join
How does pandas handle self joins?
Merging on indexes
Index merge for movie ratings
Do sequels earn more?
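The join types named in this chapter can be sketched as follows. This is only an illustration under assumed data: the `movies`, `taglines`, and `ratings` tables below are invented stand-ins, not the TMDb data used in the course, and the self join is left out for brevity.

```python
import pandas as pd

# Hypothetical toy tables (illustrative only).
movies = pd.DataFrame({"id": [1, 2, 3], "title": ["Alpha", "Beta", "Gamma"]})
taglines = pd.DataFrame(
    {"id": [1, 3, 4], "tagline": ["First.", "Third.", "Fourth."]}
)

# Left join: keep every movie; movies without a tagline get NaN.
left = movies.merge(taglines, on="id", how="left")
n_missing = int(left["tagline"].isna().sum())

# Outer join: keep rows from both tables; indicator=True adds a "_merge"
# column recording where each row came from.
outer = movies.merge(taglines, on="id", how="outer", indicator=True)

# Index merge: join on the index of both tables instead of on a column.
ratings = pd.DataFrame({"rating": [7.5, 6.0]}, index=pd.Index([1, 2], name="id"))
rated = movies.set_index("id").merge(
    ratings, left_index=True, right_index=True, how="inner"
)

print(n_missing)
print(outer)
print(rated)
```

Counting the NaN values after a left join is one simple way to see how many rows on the left had no match on the right.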
Advanced Merging and Concatenating
In this chapter, you’ll leverage powerful filtering techniques, including semi-joins and anti-joins. You’ll also learn how to glue DataFrames together vertically using the pandas.concat function to create new datasets. Finally, because data is rarely clean, you’ll also learn how to validate your newly combined data structures.
Exercises:
Filtering joins
Steps of a semi join
Performing an anti join
Performing a semi join
Concatenate DataFrames together vertically
Concatenation basics
Concatenating with keys
Using the append method
Verifying integrity
Validating a merge
Concatenate and merge to find common songs
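The filtering joins, vertical concatenation, and merge validation mentioned in this chapter might look roughly like this. It is a sketch under assumed data: `genres`, `tracks`, `jan`, and `feb` are hypothetical tables, and the course may build its semi join and anti join through slightly different steps.

```python
import pandas as pd

# Hypothetical toy tables (illustrative only).
genres = pd.DataFrame({"gid": [1, 2, 3], "name": ["Rock", "Jazz", "Pop"]})
tracks = pd.DataFrame({"tid": [10, 11, 12], "gid": [1, 1, 3]})

# Semi join: genres that have at least one track, keeping only genre columns
# and without duplicating rows.
semi = genres[genres["gid"].isin(tracks["gid"])]

# Anti join: genres with no track. The indicator column marks unmatched rows.
flagged = genres.merge(tracks, on="gid", how="left", indicator=True)
anti_ids = flagged.loc[flagged["_merge"] == "left_only", "gid"]
anti = genres[genres["gid"].isin(anti_ids)]

# Vertical concatenation, first with a fresh index, then with keys that
# label which source table each row came from.
jan = pd.DataFrame({"tid": [10, 11], "plays": [5, 3]})
feb = pd.DataFrame({"tid": [10, 12], "plays": [2, 8]})
stacked = pd.concat([jan, feb], ignore_index=True)
keyed = pd.concat([jan, feb], keys=["jan", "feb"])

# Validation: ask pandas to check the expected relationship. Genre 1 has two
# tracks, so a one-to-one merge is rejected.
try:
    genres.merge(tracks, on="gid", validate="one_to_one")
    validation_failed = False
except pd.errors.MergeError:
    validation_failed = True

print(semi, anti, stacked, keyed, validation_failed, sep="\n\n")
```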
Merging Ordered and Time-Series Data
In this final chapter, you’ll step up a gear and learn to apply pandas' specialized methods for merging time-series and ordered data together with real-world financial and economic data from the city of Chicago. You’ll also learn how to query resulting tables using a SQL-style format, and unpivot data using the melt method.
Exercises:
Using merge_ordered()
Correlation between GDP and S&P500
Phillips curve using merge_ordered()
merge_ordered() caution, multiple columns
Using merge_asof()
Using merge_asof() to study stocks
Using merge_asof() to create dataset
merge_asof() and merge_ordered() differences
Selecting data with .query()
Explore financials with .query()
Subsetting rows with .query()
Reshaping data with .melt()
Select the right .melt() arguments
Using .melt() to reshape government data
Using .melt() for stocks vs bond performance
Course wrap-up
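To give a feel for the four tools covered in this chapter, here is a small sketch using `pd.merge_ordered`, `pd.merge_asof`, `.query()`, and `.melt()`. All tables and numbers below are invented for illustration and do not come from the course datasets.

```python
import pandas as pd

# Hypothetical quarterly GDP and monthly index closes (illustrative only).
gdp = pd.DataFrame(
    {"date": pd.to_datetime(["2020-01-01", "2020-04-01"]), "gdp": [100.0, 90.0]}
)
sp = pd.DataFrame(
    {
        "date": pd.to_datetime(["2020-01-01", "2020-02-01", "2020-04-01"]),
        "close": [3200.0, 3300.0, 2500.0],
    }
)

# merge_ordered: an ordered outer join; forward-fill carries the last known
# GDP value into months that have no GDP reading.
ordered = pd.merge_ordered(sp, gdp, on="date", fill_method="ffill")

# merge_asof: match each trade to the most recent quote at or before it.
# Both tables must already be sorted by the key.
trades = pd.DataFrame(
    {
        "time": pd.to_datetime(["2020-01-01 09:30:01", "2020-01-01 09:30:05"]),
        "qty": [10, 20],
    }
)
quotes = pd.DataFrame(
    {
        "time": pd.to_datetime(["2020-01-01 09:30:00", "2020-01-01 09:30:03"]),
        "bid": [10.0, 10.5],
    }
)
asof = pd.merge_asof(trades, quotes, on="time")

# query: select rows with a SQL-like condition string.
selected = ordered.query("close > 3000 and gdp >= 100")

# melt: unpivot a wide table (one column per year) into a long table.
wide = pd.DataFrame({"country": ["A", "B"], "2019": [1, 2], "2020": [3, 4]})
long = wide.melt(id_vars="country", var_name="year", value_name="value")

print(ordered, asof, selected, long, sep="\n\n")
```

The main difference to notice is that `merge_ordered` matches keys exactly and fills gaps afterwards, while `merge_asof` matches each left row to the nearest earlier key on the right.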
Datasets: Chicago Wards, Chicago Business Licenses, Chicago Census, Chicago Demographics by Zip Code, Chicago Business Owners, Chicago Land Use, Chicago Taxi Vehicles, Chicago Taxi Owners, CTA Ridership, CTA Calendar, CTA Stations, Movies, Movie Actors, Movie Ratings, Movie Casts, Movie Crews, Movie Genres, Movie Sequels, Movie Financial Data, Movie Tag Lines, S&P 500, World Bank GDP, World Bank Population
Prerequisites: Data Manipulation with pandas
Manager, Supply Chain Analytics @ Ingredion Incorporated
I am a Manager of Supply Chain Analytics with over 7 years of experience analyzing data to find insights for business-related questions. I am responsible for supply chain analytics for the NA business of a $5.8 billion ingredient solutions provider to the food, beverage, brewing, and pharmaceutical sectors. I graduated from DePaul University with distinction and received an MS in Predictive Analytics. I am passionate about Data Science / Machine Learning and I continue to work on my craft by learning new concepts through online classes.