Bank Marketing
This dataset consists of direct marketing campaigns by a Portuguese banking institution using phone calls. The campaigns aimed to sell subscriptions to a bank term deposit (see variable y
).
Not sure where to begin? Scroll to the bottom to find challenges!
suppressPackageStartupMessages(library(tidyverse))
read_delim('data/bank-marketing.csv', delim=";", show_col_types = FALSE)
Data Dictionary
Column | Variable | Class |
---|---|---|
age | age of customer | |
job | type of job | categorical: "admin.","blue-collar","entrepreneur","housemaid","management","retired","self-employed","services","student","technician","unemployed","unknown" |
marital | marital status | categorical: "divorced","married","single","unknown"; note: "divorced" means divorced or widowed |
education | highest degree of customer | categorical: "basic.4y","basic.6y","basic.9y","high.school","illiterate","professional.course","university.degree","unknown" |
default | has credit in default? | categorical: "no","yes","unknown" |
housing | has housing loan? | categorical: "no","yes","unknown" |
loan | has personal loan? | categorical: "no","yes","unknown" |
contact | contact communication type | categorical: "cellular","telephone" |
month | last contact month of year | categorical: "jan", "feb", "mar", ..., "nov", "dec" |
day_of_week | last contact day of the week | categorical: "mon","tue","wed","thu","fri" |
campaign | number of contacts performed during this campaign and for this client | numeric, includes last contact |
pdays | number of days that passed by after the client was last contacted from a previous campaign | numeric; 999 means client was not previously contacted |
previous | number of contacts performed before this campaign and for this client | numeric |
poutcome | outcome of the previous marketing campaign | categorical: "failure","nonexistent","success" |
emp.var.rate | employment variation rate - quarterly indicator | numeric |
cons.price.idx | consumer price index - monthly indicator | numeric |
cons.conf.idx | consumer confidence index - monthly indicator | numeric |
euribor3m | euribor 3 month rate - daily indicator | numeric |
nr.employed | number of employees - quarterly indicator | numeric |
y | has the client subscribed a term deposit? | binary: "yes","no" |
Source of dataset.
Citations:
- S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014
- S. Moro, R. Laureano and P. Cortez. Using Data Mining for Bank Direct Marketing: An Application of the CRISP-DM Methodology. In P. Novais et al. (Eds.), Proceedings of the European Simulation and Modelling Conference - ESM'2011, pp. 117-121, Guimaraes, Portugal, October, 2011. EUROSIS.
Don't know where to start?
Challenges are brief tasks designed to help you practice specific skills:
- πΊοΈ Explore: What are the jobs of the people most likely to subscribe to a term deposit?
- π Visualize: Create a plot to visualize the number of people subscribing to a term deposit by
month
. - π Analyze: What impact does the number of contacts performed during the last campaign have on the likelihood that a customer subscribes to a term deposit?
Scenarios are broader questions to help you develop an end-to-end project for your portfolio:
You work for a financial services firm. The past few campaigns have not gone as well as the firm would have hoped, and they are looking for ways to optimize their marketing efforts.
They have supplied you with data from a previous campaign and some additional metrics such as the consumer price index and consumer confidence index. They want to know whether you can predict the likelihood of subscribing to a term deposit. The manager would also like to know what factors are most likely to increase a customer's probability of subscribing.
You will need to prepare a report that is accessible to a broad audience. It should outline your motivation, steps, findings, and conclusions.