Credit Card Recommendation

Credit Card Recommendation in Banking

“Golden Horizon Bank” is a private bank that offers its customers a range of products, such as savings accounts, home loans, car loans, and credit cards. The bank's new manager has found that income from credit cards is quite low compared with the other services, and he has decided to take action to increase it. As a first step, he wants to recommend credit cards to the bank's existing customers rather than to new ones, because existing customers already trust the bank. Knowing the potential of data in banking, he has asked the data science team lead to build a model that predicts which customers are most likely to buy a credit card, and the team lead has assigned the task to you. Try to build a model that helps the bank get the most out of its efforts to sell more credit cards. Good luck!

Data Set Information:

Total Entries: 245,725

Number of Columns: 11

Column Types:

4 columns of type int64 (Age, Vintage, Avg_Account_Balance, Need_Credit_Card)
7 object type columns (User_ID, Gender, Area_Code, Profession, Channel_Code, Has_Credit, Is_Active)

Important Columns:

User_ID: Customer ID

Gender: Gender.

Age: Age

Area_Code: Area code

Profession: Profession

Channel_Code: Channel code used

Vintage: Duration of the relationship with the bank

Has_Credit: Whether the customer already has a credit card

Avg_Account_Balance: Average account balance

Is_Active: Is the account active or not?

Need_Credit_Card: Whether the customer needs a credit card (the target variable)
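Before exploring the data, it can help to confirm that the loaded file matches the column summary above. The following is a minimal sketch, not part of the original analysis: the helper name `schema_problems` and the toy rows are made up for illustration, and the check only distinguishes integer columns from text columns.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype, is_numeric_dtype

# Column groups as listed in the data set summary above.
INTEGER_COLUMNS = ["Age", "Vintage", "Avg_Account_Balance", "Need_Credit_Card"]
TEXT_COLUMNS = ["User_ID", "Gender", "Area_Code", "Profession",
                "Channel_Code", "Has_Credit", "Is_Active"]


def schema_problems(df):
    """Return a list of schema mismatches (hypothetical helper); empty if none."""
    problems = []
    for col in INTEGER_COLUMNS + TEXT_COLUMNS:
        if col not in df.columns:
            problems.append(f"missing column: {col}")
    for col in INTEGER_COLUMNS:
        if col in df.columns and not is_integer_dtype(df[col]):
            problems.append(f"{col}: expected integer, got {df[col].dtype}")
    for col in TEXT_COLUMNS:
        if col in df.columns and is_numeric_dtype(df[col]):
            problems.append(f"{col}: expected text, got {df[col].dtype}")
    return problems


# Toy frame with invented values, used only to demonstrate the check.
toy = pd.DataFrame({
    "User_ID": ["A1", "B2"],
    "Gender": ["Female", "Male"],
    "Age": [30, 45],
    "Area_Code": ["R1", "R2"],
    "Profession": ["Salaried", "Other"],
    "Channel_Code": ["C1", "C2"],
    "Vintage": [12, 60],
    "Has_Credit": ["Yes", "No"],
    "Avg_Account_Balance": [500000, 1200000],
    "Is_Active": ["No", "Yes"],
    "Need_Credit_Card": [0, 1],
})

print(schema_problems(toy))
```

On the real data, the same call would be `schema_problems(df)` after `pd.read_csv`; an empty list means the columns and their broad types match the summary.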

Basic Information & Summary

import pandas as pd
import numpy as np
df = pd.read_csv("credit_card_recommendation.csv")
df.info()  # Column names, dtypes and non-null counts
df.head(5)

Data Set Statistics & Missing Data

def check_data(df, head=5):
    print("######## SHAPE ########")  # (rows,columns)
    print(df.shape)
    print("######## TYPES ########")  # Data types
    print(df.dtypes)
    print("######## HEAD ########")  # First 5 lines
    print(df.head(head))
    print("######## TAIL ########")  # Last 5 lines
    print(df.tail(head))
    print("######## NaN ########")  # Check for missing values
    print(df.isnull().sum())
    print("######## DESCRIBE ########")  # Summary statistics
    print(df.describe())
    print("######## INDEX ########")  # Describe index
    print(df.index)
    print("######## COLUMNS ########")  # Describe DataFrame columns
    print(df.columns)
    print("######## COUNT ########")  # Number of non-NA values
    print(df.count())
check_data(df)
  • Age: mean 43.86, minimum 23, maximum 85.
  • Vintage: mean 46.96, minimum 7, maximum 135.
  • Avg_Account_Balance: mean 1,128,403.10, minimum 20,790, maximum 10,352,009.
  • Need_Credit_Card: 23.72 per cent of the responses are positive.
  • Has_Credit: 29,325 missing values.
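The share of positive responses and the missing-value counts listed above can also be pulled out directly rather than read off the `check_data` printout. A minimal sketch, assuming the target is coded as 0/1 (so its mean is the positive rate); the helper name `target_rate_and_missing` and the toy rows are hypothetical:

```python
import pandas as pd


def target_rate_and_missing(df, target="Need_Credit_Card"):
    """Return (positive rate in per cent, {column: missing count})."""
    # Assumes the target is 0/1, so the mean equals the share of positives.
    rate = round(float(df[target].mean()) * 100, 2)
    missing = df.isnull().sum()
    # Keep only the columns that actually have missing values.
    missing = {col: int(n) for col, n in missing.items() if n > 0}
    return rate, missing


# Toy frame with invented values, used only to demonstrate the helper.
toy = pd.DataFrame({
    "Has_Credit": ["Yes", None, "No", "No"],
    "Need_Credit_Card": [1, 0, 0, 0],
})

rate, missing = target_rate_and_missing(toy)
print(rate, missing)
```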
# Fill missing values with the most frequent value (mode), since Has_Credit is categorical
df['Has_Credit'] = df['Has_Credit'].fillna(df['Has_Credit'].mode()[0])

# Check the data set again
check_data(df)
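Filling with the mode folds every missing Has_Credit entry into the most frequent category, so it hides whether missingness itself is related to the target. One way to check this before imputing is to label the missing entries separately and compare the target rate per group. This is a sketch on invented toy data, not a result from the actual data set; the label "Unknown" and the helper name are assumptions.

```python
import pandas as pd


def need_rate_by_has_credit(df, col="Has_Credit", target="Need_Credit_Card",
                            missing_label="Unknown"):
    """Mean of the 0/1 target per Has_Credit group, with NaN as its own group."""
    groups = df[col].fillna(missing_label)  # does not modify df
    return df[target].groupby(groups).mean()


# Toy frame with invented values, used only to demonstrate the comparison.
toy = pd.DataFrame({
    "Has_Credit": ["Yes", "No", None, None, "No", "Yes"],
    "Need_Credit_Card": [1, 0, 1, 1, 0, 0],
})

rates = need_rate_by_has_credit(toy)
print(rates)
```

If the "Unknown" group's rate on the real data differs clearly from the other groups, keeping it as a separate category may preserve more signal than mode imputation; if the rates are similar, filling with the mode loses little.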

Visualization

Several visualizations are made on the data set: a line chart of average account balance by age, a bar chart of average account balance by profession split by credit card ownership, a scatter plot of average account balance and credit card need by bank relationship duration, a connected scatter plot of relationship duration by age, a bubble chart combining age, relationship duration, account balance and credit card need, and a word cloud of professions.
import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud

# Line Plot
plt.figure(figsize=(12, 6))
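# errorbar=None replaces the deprecated ci=None (seaborn >= 0.12; keep ci=None on older versions)
sns.lineplot(x='Age', y='Avg_Account_Balance', data=df, errorbar=None)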
plt.title('Average Account Balance by Age')
plt.show()

# Grouped Bar Plot (sns.barplot with hue draws side-by-side bars, not stacked ones)
plt.figure(figsize=(12, 6))
sns.barplot(x='Profession', y='Avg_Account_Balance', hue='Has_Credit', data=df)
plt.title('Credit Card Ownership and Average Account Balance Across Occupations')
plt.show()

# Scatter Plot
plt.figure(figsize=(12, 6))
sns.scatterplot(x='Vintage', y='Avg_Account_Balance', hue='Need_Credit_Card', data=df)
plt.title('Average Account Balance and Credit Card Need by Bank Relationship Duration')
plt.show()

# Connected Scatter Plot
plt.figure(figsize=(12, 6))
sns.lineplot(x='Age', y='Vintage', data=df, sort=False)
plt.title('Bank Relationship Duration by Age')
plt.show()

# Bubble Graph
plt.figure(figsize=(12, 6))
sns.scatterplot(x='Age', y='Vintage', size='Avg_Account_Balance', hue='Need_Credit_Card', data=df)
plt.title('Credit Card Need and Average Account Balance by Age and Bank Relationship Duration')
plt.show()

# Word Cloud
wordcloud = WordCloud(width=800, height=400, random_state=42, background_color='white').generate(' '.join(df['Profession']))
plt.figure(figsize=(12, 6))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.title('Occupations Word Cloud')
plt.show()
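With 245,725 rows, the scatter and bubble plots above can be slow to draw and heavily overplotted. One option is to plot a sample that keeps the same share of positive and negative Need_Credit_Card rows. A minimal sketch, assuming pandas 1.1 or later (for `DataFrameGroupBy.sample`); the helper name `sample_for_plot`, the default sample size and the toy data are hypothetical:

```python
import pandas as pd


def sample_for_plot(df, n=5000, stratify="Need_Credit_Card", seed=42):
    """Return roughly n rows, sampled with the same fraction from each class."""
    if len(df) <= n:
        return df
    frac = n / len(df)
    # Sampling the same fraction per group preserves the class balance.
    return df.groupby(stratify, group_keys=False).sample(frac=frac, random_state=seed)


# Toy frame with invented values: 250 positive and 750 negative rows.
toy = pd.DataFrame({
    "Age": list(range(1000)),
    "Need_Credit_Card": [1] * 250 + [0] * 750,
})

small = sample_for_plot(toy, n=100)
print(len(small), small["Need_Credit_Card"].mean())
```

The sampled frame can then be passed as `data=` to `sns.scatterplot` in place of `df`; the aggregated line plots do not need it.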

Modelling