TELECOM CUSTOMER CHURN PREDICTION 📈
Contents
- Introduction
- What is Customer Churn?
- How can customer churn be reduced?
- Objectives
- Loading libraries and data
- Understanding the data
- Visualize missing values
- Data Manipulation
- Data Visualization
- Data Preprocessing
- Standardizing numeric attributes
- Machine Learning Model Evaluations and Predictions
- KNN
- SVC
- Random Forest
- Logistic Regression
- Decision Tree Classifier
- AdaBoost Classifier
- Gradient Boosting Classifier
- Voting Classifier
1. Introduction
What is Customer Churn?
Customer churn occurs when customers or subscribers stop doing business with a firm or service.
Customers in the telecom industry can choose from a variety of service providers and actively switch from one to the next. In this highly competitive market, the telecommunications business has an annual churn rate of 15-25 percent.
Individualized customer retention is tough because most firms have a large number of customers and can't afford to devote much time to each of them. The costs would be too great, outweighing the additional revenue. However, if a company could forecast ahead of time which customers are likely to leave, it could focus its retention efforts only on these "high risk" clients. The ultimate goal is to expand the company's coverage area and win more customer loyalty. The key to success in this market lies in the customers themselves.
Customer churn is a critical metric because it is much less expensive to retain existing customers than it is to acquire new customers.
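The idea of focusing retention efforts on "high risk" clients can be sketched as ranking customers by a predicted churn probability and keeping only the top slice. The snippet below is a minimal illustration on synthetic data: the column names (`tenure`, `monthly_charges`), the toy rule that generates churn, and the 10% cut-off are all assumptions made for the example, not properties of the telecom dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic, illustrative data: columns and values are assumptions,
# not taken from the telecom dataset.
rng = np.random.default_rng(0)
n = 200
tenure = rng.integers(1, 72, size=n)
monthly = rng.uniform(20, 120, size=n)

# Assumed toy rule: short tenure and high charges make churn more likely.
logit = -0.08 * tenure + 0.03 * monthly
churn = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

features = ["tenure", "monthly_charges"]
X = pd.DataFrame({"tenure": tenure, "monthly_charges": monthly})
model = LogisticRegression(max_iter=1000).fit(X[features], churn)

# Probability of the positive class (churn = 1) for every customer.
# Scored on the training rows for brevity; a real workflow would score held-out data.
X["churn_probability"] = model.predict_proba(X[features])[:, 1]

# Target only the 10% of customers with the highest predicted risk.
top_k = int(0.10 * n)
high_risk = X.sort_values("churn_probability", ascending=False).head(top_k)
print(high_risk.head())
```

Ranking by probability rather than by the hard 0/1 prediction lets the size of the targeted group follow the retention budget instead of a fixed decision threshold.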
How can customer churn be reduced?
To reduce customer churn, telecom companies need to predict which customers are at high risk of churn.
To detect early signs of potential churn, one must first develop a holistic view of the customers and their interactions across numerous channels, including store/branch visits, product purchase histories, customer service calls, web-based transactions, and social media interactions, to mention a few.
By addressing churn, these businesses may not only preserve their market position but also grow and thrive. The more customers they have in their network, the lower the cost of initiation and the larger the profit. The company's key focus for success is therefore reducing client attrition and implementing an effective retention strategy.
Objectives
I will explore the data and try to answer some questions like:
- What percentage of customers churn, and what percentage keep their active services?
- Are there any patterns in customer churn based on gender?
- Are there any patterns or preferences in customer churn based on the type of service provided?
- What are the most profitable service types?
- Which features and services are most profitable?
- Many more questions that will arise during the analysis
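The first two questions above can be sketched with a couple of pandas calls. This is only an illustration on a tiny hand-made table; the column names `gender` and `Churn` and their values are assumptions, since the dataset's columns have not been shown yet at this point.

```python
import pandas as pd

# Toy data; the column names "gender" and "Churn" are assumptions.
toy = pd.DataFrame({
    "gender": ["Female", "Male", "Female", "Male", "Female", "Male", "Female", "Male"],
    "Churn":  ["Yes", "No", "No", "No", "Yes", "Yes", "No", "No"],
})

# Share of churned vs. retained customers, in percent.
churn_pct = toy["Churn"].value_counts(normalize=True).mul(100).round(1)

# Churn rate within each gender; each row sums to 100.
churn_by_gender = (
    pd.crosstab(toy["gender"], toy["Churn"], normalize="index").mul(100).round(1)
)

print(churn_pct)
print(churn_by_gender)
```

Normalizing the crosstab by row (`normalize="index"`) compares churn rates between groups even when the groups have different sizes.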
2. Loading libraries and data
import pandas as pd
import numpy as np
import missingno as msno
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import warnings
warnings.filterwarnings('ignore')

from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier
from catboost import CatBoostClassifier
from sklearn import metrics
from sklearn.metrics import roc_curve
from sklearn.metrics import recall_score, confusion_matrix, precision_score, f1_score, classification_report

# loading data
df = pd.read_csv('Telecom_Customer_Data.csv')
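Right after loading, a few quick checks can catch common CSV surprises. The sketch below uses an in-memory stand-in for the file so it runs anywhere; the column names (`customerID`, `tenure`, `TotalCharges`, `Churn`) and the blank value are hypothetical and only illustrate the kind of issue worth checking for.

```python
import io
import pandas as pd

# Stand-in for 'Telecom_Customer_Data.csv' so the sketch runs anywhere;
# the columns and values below are hypothetical.
csv_text = """customerID,tenure,TotalCharges,Churn
0001,1,29.85,No
0002,34,1889.5,No
0003,2, ,Yes
"""

# dtype keeps IDs as strings so leading zeros survive the load.
sample = pd.read_csv(io.StringIO(csv_text), dtype={"customerID": str})

# A blank string in a numeric column cannot be parsed as a number;
# errors="coerce" turns it into NaN instead of raising.
sample["TotalCharges"] = pd.to_numeric(sample["TotalCharges"], errors="coerce")

print(sample.shape)         # rows, columns
print(sample.dtypes)        # inferred type of each column
print(sample.isna().sum())  # missing values per column
```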
3. Understanding the data
Each row represents a customer, and each column contains one of the customer's attributes, as described in the column metadata.
df.head()
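Beyond looking at the first rows, a per-column overview helps when every column is a customer attribute. The helper below, `summarize_columns`, is a hypothetical name introduced for this sketch, and the small table it runs on is made up; on the real data it could be called as `summarize_columns(df)`.

```python
import pandas as pd

def summarize_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, number of unique values, missing count."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "n_unique": frame.nunique(),
        "n_missing": frame.isna().sum(),
    })

# Made-up example table; column names are assumptions.
toy = pd.DataFrame({
    "customerID": ["A1", "B2", "C3"],
    "tenure": [1, 34, 2],
    "Churn": ["No", "No", "Yes"],
})

summary = summarize_columns(toy)
print(summary)
```

Columns with very few unique values are usually categorical, while a column with as many unique values as rows (such as an ID) rarely carries predictive signal.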