Telecom Churn Analysis with Random Forest

Telecom Customer Churn - Introduction

This dataset comes from an Iranian telecom company, with each row representing a customer over a one-year period. Along with a churn label, it contains information on each customer's activity, such as call failures and subscription length.

Not sure where to begin? Scroll to the bottom to find challenges!

import pandas as pd

pd.read_csv("data/customer_churn.csv")

Data Dictionary

Column	Explanation
Call Failure	number of call failures
Complains	binary (0: No complaint, 1: complaint)
Subscription Length	total months of subscription
Charge Amount	ordinal attribute (0: lowest amount, 9: highest amount)
Seconds of Use	total seconds of calls
Frequency of use	total number of calls
Frequency of SMS	total number of text messages
Distinct Called Numbers	total number of distinct phone calls
Age Group	ordinal attribute (1: younger age, 5: older age)
Tariff Plan	binary (1: Pay as you go, 2: contractual)
Status	binary (1: active, 2: non-active)
Age	age of customer
Customer Value	the calculated value of customer
Churn	class label (1: churn, 0: non-churn)
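The coded columns in the dictionary above can be sanity-checked against their documented values. The following is a minimal sketch, assuming the CSV headers match the dictionary names exactly (this is not confirmed here); `EXPECTED_CODES`, `find_unexpected_codes`, and the toy frame are hypothetical names used only for illustration.

```python
import pandas as pd

# Allowed codes, copied from the data dictionary.
# Assumption: the CSV headers match the dictionary names exactly.
EXPECTED_CODES = {
    "Complains": {0, 1},
    "Charge Amount": set(range(0, 10)),
    "Age Group": set(range(1, 6)),
    "Tariff Plan": {1, 2},
    "Status": {1, 2},
    "Churn": {0, 1},
}

def find_unexpected_codes(frame, expected=EXPECTED_CODES):
    """Return {column: sorted unexpected values} for coded columns present in frame."""
    problems = {}
    for column, allowed in expected.items():
        if column not in frame.columns:
            continue  # skip columns whose header differs from the dictionary
        observed = set(frame[column].dropna().unique().tolist())
        unexpected = observed - allowed
        if unexpected:
            problems[column] = sorted(unexpected)
    return problems

# Toy stand-in for the real file, with one deliberately invalid Status code.
toy_codes = pd.DataFrame({"Complains": [0, 1, 0], "Status": [1, 2, 3], "Churn": [0, 0, 1]})
print(find_unexpected_codes(toy_codes))
```

On the real data, an empty dictionary would suggest that every coded column present in the file stays within its documented range.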

Source of dataset and source of dataset description.

Citation: Jafari-Marandi, R., Denton, J., Idris, A., Smith, B. K., & Keramati, A. (2020). Optimum Profit-Driven Churn Decision Making: Innovative Artificial Neural Networks in Telecom Industry. Neural Computing and Applications.

Objective and Motivation

I'm going to propose a model that predicts customer churn better than a baseline classifier.
I'm going to highlight the most important causes related to customer churn.
In chapter 3 I will explore the data, showing appropriate graphs and making observations about the findings.
In chapter 4 I will design the model and show its performance and the most important variables for churned customers.
Finally, in chapter 5 I will draw conclusions and make recommendations.
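To make the first goal concrete, here is a minimal sketch of one possible baseline. The text does not say which baseline is used, so this assumes the simplest one: always predicting the most frequent class. `majority_baseline_accuracy` and the toy labels are hypothetical, not taken from the dataset.

```python
import pandas as pd

def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent class in labels."""
    shares = pd.Series(labels).value_counts(normalize=True)
    return float(shares.iloc[0])  # value_counts sorts by descending frequency

# Hypothetical labels: 8 non-churners (0) and 2 churners (1).
toy_churn = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
baseline = majority_baseline_accuracy(toy_churn)
print(f"Baseline accuracy to beat: {baseline:.2f}")
```

When churners are a small minority, this baseline already scores high on accuracy, so comparing recall or F1 on the churn class may be more informative than accuracy alone.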

Exploratory Data Analysis

## Basic Imports

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

DataFrame definition

df = pd.read_csv("data/customer_churn.csv")

df.shape

Statistical description

df.describe()

Null Cells
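A minimal sketch of a null-cell check, assuming the goal is a per-column count of missing values. The helper `null_report` and the toy frame are hypothetical stand-ins; on the real data the call would be `null_report(df)`.

```python
import numpy as np
import pandas as pd

def null_report(frame):
    """Count missing cells per column and express them as a percentage of rows."""
    counts = frame.isnull().sum()
    percent = (counts / len(frame) * 100).round(2)
    return pd.DataFrame({"null_count": counts, "null_percent": percent})

# Toy stand-in for df, with one missing value in "Age".
toy_nulls = pd.DataFrame({"Age": [30, np.nan, 45, 25], "Churn": [0, 1, 0, 0]})
report = null_report(toy_nulls)
print(report)
```

If every `null_count` is zero, no imputation or row dropping is needed before modelling.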
