Building a good model requires a decent base table to start with. In this course you will learn how to construct a good base table, create variables and prepare your data for modeling. The course finishes with advanced base table topics: seasonality, multiple snapshots and the timegap.
Crucial base table concepts (Free)
In this chapter you will learn how to construct the foundations of your base table, namely the population and the target.
The basetable timeline (50 xp)
Timeline violations (50 xp)
Available data (100 xp)
Timeline violation (100 xp)
The population (50 xp)
Select the relevant population (50 xp)
A timeline compliant population (100 xp)
Removing duplicate objects (100 xp)
The target (50 xp)
Calculate an event target (100 xp)
Calculate an aggregated target (100 xp)
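The ideas in this chapter (timeline, population, event target, aggregated target) can be sketched with pandas. This is a minimal illustration rather than the course's own solution: the `gifts` table, its column names, the dates and the threshold of 30 are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical donation records; table and column names are assumptions.
gifts = pd.DataFrame({
    "donor_id": [1, 1, 2, 3, 3, 4],
    "date": pd.to_datetime([
        "2016-03-10", "2017-02-01", "2016-11-20",
        "2015-06-05", "2017-03-15", "2017-01-20",
    ]),
    "amount": [50, 20, 100, 10, 40, 30],
})

# Timeline: variables may only use data from before the observation date;
# the target is measured in the period that follows it.
observation_date = pd.Timestamp("2017-01-01")
target_end = pd.Timestamp("2018-01-01")

# Population: donors with at least one donation before the observation date.
# Building it from a set removes duplicate donors.
history = gifts[gifts["date"] < observation_date]
population = sorted(set(history["donor_id"]))
basetable = pd.DataFrame({"donor_id": population})

# Donations that fall inside the target period.
target_gifts = gifts[
    (gifts["date"] >= observation_date) & (gifts["date"] < target_end)
]

# Event target: 1 if the donor donated at least once in the target period.
target_donors = list(set(target_gifts["donor_id"]))
basetable["target_event"] = basetable["donor_id"].isin(target_donors).astype(int)

# Aggregated target: 1 if the donor gave more than 30 in total
# during the target period (the threshold is an arbitrary example value).
total_in_target = target_gifts.groupby("donor_id")["amount"].sum()
basetable["target_over_30"] = (
    basetable["donor_id"].map(total_in_target).fillna(0) > 30
).astype(int)

print(basetable)
```

Donor 4 only donated after the observation date, so that donor is left out of the population even though the donation appears in the raw data.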
You will learn how to add variables to the base table that you can use to predict the target.
Adding fixed variables (50 xp)
Selecting the right value (50 xp)
Adding age (100 xp)
Adding the donor segment (100 xp)
Adding living place (100 xp)
Adding aggregated variables (50 xp)
Selecting the appropriate date (50 xp)
Maximum value last year (100 xp)
Recency of donations (100 xp)
Adding evolutions (50 xp)
Ratio of last month's and last year's average (100 xp)
Absolute difference between two years (100 xp)
Using evolution variables (50 xp)
Performance of evolution variables (100 xp)
Meaning of evolution (100 xp)
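As a rough sketch of the three kinds of variables named here (fixed, aggregated, evolution), the following pandas example adds age, the maximum gift of the last year, the recency of donations and a year-over-year difference. The tables `donors` and `gifts`, their columns and the helper `age_on` are hypothetical and exist only for this illustration.

```python
import pandas as pd

observation_date = pd.Timestamp("2017-01-01")

# Hypothetical input tables; names and values are assumptions.
donors = pd.DataFrame({
    "donor_id": [1, 2],
    "date_of_birth": pd.to_datetime(["1980-06-15", "1990-01-01"]),
})
gifts = pd.DataFrame({
    "donor_id": [1, 1, 1, 2],
    "date": pd.to_datetime(
        ["2015-05-01", "2016-03-10", "2016-12-01", "2016-07-04"]
    ),
    "amount": [40, 50, 20, 100],
})

basetable = donors[["donor_id"]].copy()


def age_on(birth, reference):
    """Age in whole years on the reference date (hypothetical helper)."""
    had_birthday = (reference.month, reference.day) >= (birth.month, birth.day)
    return reference.year - birth.year - (0 if had_birthday else 1)


# Fixed variable: age on the observation date.
basetable["age"] = [age_on(b, observation_date) for b in donors["date_of_birth"]]

# Aggregated variable: maximum gift in the year before the observation date.
start_last_year = observation_date - pd.DateOffset(years=1)
last_year = gifts[
    (gifts["date"] >= start_last_year) & (gifts["date"] < observation_date)
]
max_last_year = last_year.groupby("donor_id")["amount"].max()
basetable["max_gift_last_year"] = basetable["donor_id"].map(max_last_year)

# Aggregated variable: recency, in days since the last gift before the observation date.
past = gifts[gifts["date"] < observation_date]
last_gift = past.groupby("donor_id")["date"].max()
basetable["recency_days"] = (
    observation_date - basetable["donor_id"].map(last_gift)
).dt.days

# Evolution variable: total of the last year minus total of the year before.
start_prev_year = observation_date - pd.DateOffset(years=2)
prev_year = gifts[
    (gifts["date"] >= start_prev_year) & (gifts["date"] < start_last_year)
]
sum_last = last_year.groupby("donor_id")["amount"].sum()
sum_prev = prev_year.groupby("donor_id")["amount"].sum()
basetable["diff_amount"] = (
    basetable["donor_id"].map(sum_last).fillna(0)
    - basetable["donor_id"].map(sum_prev).fillna(0)
)

print(basetable)
```

Every filter uses `date < observation_date`, so none of the variables leak information from the target period.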
Once you have derived variables from the raw data, it is time to clean the data and prepare it for modeling. In this chapter we discuss the steps needed to make your data modeling-ready.
Creating dummies (50 xp)
Creating a dummy from a two-category variable (100 xp)
Creating dummies from a many-categories variable (100 xp)
Missing values (50 xp)
How to replace missing values (50 xp)
Creating a missing value dummy (100 xp)
Replace missing values with the median value (100 xp)
Replace missing values with a fixed value (100 xp)
Handling outliers (50 xp)
Influence of outliers on predictive models (50 xp)
Handle outliers with winsorization (100 xp)
Handle outliers with standard deviation (100 xp)
Transformations (50 xp)
Interactions (50 xp)
Square root transformation (100 xp)
Adding interactions to the basetable (100 xp)
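The preparation steps listed in this chapter could look roughly like the sketch below. The example base table and its column names are invented. Winsorization is approximated here by clipping at the 5th and 95th percentiles with `Series.clip`, which is one common way to do it and not necessarily the approach taken in the course.

```python
import numpy as np
import pandas as pd

# Hypothetical base table; columns and values are assumptions.
basetable = pd.DataFrame({
    "gender": ["F", "M", "F", "M", "F"],
    "country": ["UK", "USA", "India", "UK", "USA"],
    "age": [25.0, np.nan, 40.0, 35.0, np.nan],
    "max_gift": [10.0, 20.0, 30.0, 40.0, 1000.0],
})

# Dummy from a two-category variable: one 0/1 column is enough.
basetable["gender_F"] = (basetable["gender"] == "F").astype(int)

# Dummies from a many-categories variable; drop_first leaves out one
# category because it is implied by the others.
country_dummies = pd.get_dummies(
    basetable["country"], prefix="country", drop_first=True
).astype(int)
basetable = pd.concat([basetable, country_dummies], axis=1)

# Missing values: keep a flag for missingness, then fill with the median.
basetable["age_missing"] = basetable["age"].isna().astype(int)
basetable["age"] = basetable["age"].fillna(basetable["age"].median())

# Outliers: clip to the 5th and 95th percentiles (a simple winsorization).
lower, upper = basetable["max_gift"].quantile([0.05, 0.95])
basetable["max_gift_wins"] = basetable["max_gift"].clip(lower=lower, upper=upper)

# Transformation: a square root dampens the effect of large values.
basetable["max_gift_sqrt"] = np.sqrt(basetable["max_gift_wins"])

# Interaction: product of two dummies (female donors living in the UK).
basetable["gender_F_x_country_UK"] = basetable["gender_F"] * basetable["country_UK"]

print(basetable)
```

Creating the missing value dummy before filling matters: once the gaps are replaced by the median, the information that the value was missing is gone.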
Advanced base table concepts
In some cases, the target or variables change heavily with the seasons. You will learn how you can deal with seasonality by adding different snapshots to the base table.
Seasonality (50 xp)
Seasonality or not (50 xp)
Detecting seasonality (100 xp)
The effect of seasonality (100 xp)
Using multiple snapshots (50 xp)
Target values (50 xp)
Calculating snapshot targets (100 xp)
Calculating aggregated variables (100 xp)
Stacking basetables (100 xp)
The timegap (50 xp)
Events during the timegap (50 xp)
Calculating aggregated variables with timegap (100 xp)
Adding age with timegap (100 xp)
Congratulations (50 xp)
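One possible way to combine multiple snapshots with a timegap is sketched below: a base table is built for each snapshot date and the results are stacked. The function `build_snapshot`, its parameters, the one-month timegap and the three-month target period are assumptions chosen for illustration, as is the `gifts` data.

```python
import pandas as pd

# Hypothetical donation records; names and values are assumptions.
gifts = pd.DataFrame({
    "donor_id": [1, 1, 1, 2, 2, 3],
    "date": pd.to_datetime([
        "2015-11-15", "2016-05-20", "2017-01-15",
        "2016-02-10", "2017-03-01", "2016-12-20",
    ]),
    "amount": [30, 60, 100, 25, 80, 45],
})


def build_snapshot(gifts, target_start, timegap_months=1, target_months=3):
    """Build a small base table for one snapshot (hypothetical helper)."""
    # Variables only use data from before the timegap starts.
    observation_end = target_start - pd.DateOffset(months=timegap_months)
    target_end = target_start + pd.DateOffset(months=target_months)

    history = gifts[gifts["date"] < observation_end]
    target_period = gifts[
        (gifts["date"] >= target_start) & (gifts["date"] < target_end)
    ]

    # Aggregated variable: total amount donated before the timegap.
    table = (
        history.groupby("donor_id")["amount"]
        .sum()
        .rename("total_amount")
        .reset_index()
    )
    # Event target: donated at least once during the target period.
    table["target"] = table["donor_id"].isin(target_period["donor_id"]).astype(int)
    table["snapshot"] = target_start
    return table


# Snapshots taken in different seasons, stacked into one base table.
snapshots = [pd.Timestamp("2016-05-01"), pd.Timestamp("2017-02-01")]
basetable = pd.concat(
    [build_snapshot(gifts, s) for s in snapshots], ignore_index=True
)

print(basetable)
```

The donation by donor 1 on 2017-01-15 falls inside the timegap of the second snapshot, so it counts neither towards `total_amount` nor towards the target.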
Datasets
Donor IDs
Basetable with countries and age
Basetable used in Ex 2.13
Living place of donors
Donations
Prerequisites
Introduction to Predictive Analytics in Python
Data Scientist at Python Predictions
Nele is a senior data scientist at Python Predictions, which she joined in 2014. She holds a master’s degree in mathematical computer science and a PhD in computer science, both from Ghent University. At Python Predictions, she has developed several predictive models and recommendation systems in the fields of banking, retail and utilities. Nele has a keen interest in big data technologies and business applications.