Skip to content
Telecom Customer Churn
  • AI Chat
  • Code
  • Report
  • Telecom Customer Churn

    This dataset comes from an Iranian telecom company, with each row representing a customer over a year period. Along with a churn label, there is information on the customers' activity, such as call failures and subscription length.

    Not sure where to begin? Scroll to the bottom to find challenges!

    import pandas as pd
    
    pd.read_csv("data/customer_churn.csv")

    Data Dictionary

    ColumnExplanation
    Call Failurenumber of call failures
    Complaintsbinary (0: No complaint, 1: complaint)
    Subscription Lengthtotal months of subscription
    Charge Amountordinal attribute (0: lowest amount, 9: highest amount)
    Seconds of Usetotal seconds of calls
    Frequency of usetotal number of calls
    Frequency of SMStotal number of text messages
    Distinct Called Numberstotal number of distinct phone calls
    Age Groupordinal attribute (1: younger age, 5: older age)
    Tariff Planbinary (1: Pay as you go, 2: contractual)
    Statusbinary (1: active, 2: non-active)
    Ageage of customer
    Customer Valuethe calculated value of customer
    Churnclass label (1: churn, 0: non-churn)

    Source of dataset and source of dataset description.

    Citation: Jafari-Marandi, R., Denton, J., Idris, A., Smith, B. K., & Keramati, A. (2020). Optimum Profit-Driven Churn Decision Making: Innovative Artificial Neural Networks in Telecom Industry. Neural Computing and Applications.

    Don't know where to start?

    Challenges are brief tasks designed to help you practice specific skills:

    • πŸ—ΊοΈ Explore: Which age groups send more SMS messages than make phone calls?
    • πŸ“Š Visualize: Create a plot visualizing the number of distinct phone calls by age group. Within the chart, differentiate between short, medium, and long calls (by the number of seconds).
    • πŸ”Ž Analyze: Are there significant differences between the length of phone calls between different tariff plans?

    Scenarios are broader questions to help you develop an end-to-end project for your portfolio:

    You have just been hired by a telecom company. A competitor has recently entered the market and is offering an attractive plan to new customers. The telecom company is worried that this competitor may start attracting its customers.

    You have access to a dataset of the company's customers, including whether customers churned. The telecom company wants to know whether you can use this data to predict whether a customer will churn. They also want to know what factors increase the probability that a customer churns.

    You will need to prepare a report that is accessible to a broad audience. It should outline your motivation, steps, findings, and conclusions.

    To assist the client we would need to fully understand the company's current perfomance and by extension its impact in the market. We would like to propose to the client the exact amount they can offer up as a discount without drastically affecting profit. But adequate to prevent customer churn.

    Data Understanding

    df = pd.read_csv("data/customer_churn.csv")
    df.head()
    # Summary of columns and rows
    df.info()

    We observe the data has 13 columns all numeric with zero null values. We therefore willnot need to carry out much data cleaning. although some columns contain cartegorical data and we'll need to consider that during modelling.

    How many clients have active Subscriptions

    we will use group by in order to see the average nuber of months subscriptions are held.

    # Active subscription
    df.groupby(['Subscription Length', 'Tariff Plan']).sum().reset_index()
    # Futher analysis on Subscription
    # how many clients with over 30 months who left. 
    df[(df['Subscription Length'] > 30) & (df['Churn'] == 1)].count()
    

    We observe out out the total nuber of 3150, only 386 clients left. This is a good indicator since we see not a large number of long term clients left the organization. we can try and observe the inverse to see those who had recently joined the subscription service and left.

    df[(df['Subscription Length'] < 30) & (df['Churn'] == 1)].count()
    β€Œ
    β€Œ
    β€Œ