Skip to content
Superior customer segmentation to the status quo
  • AI Chat
  • Code
  • Report
  • Can you find a better way to segment your customers?

    📖 Background

    You work for a medical device manufacturer in Switzerland. Your company manufactures orthopedic devices and sells them worldwide. The company sells directly to individual doctors who use them on rehabilitation and physical therapy patients.

    Historically, the sales and customer support departments have grouped doctors by geography. However, the region is not a good predictor of the number of purchases a doctor will make or their support needs.

    Your team wants to use a data-centric approach to segmenting doctors to improve marketing, customer service, and product planning.

    Hidden code

    Q1.1 How many doctors are there in each region?

    df = doctors.groupby('Region')['DoctorID'].count().reset_index().sort_values(by="DoctorID", ascending = False)
    df.columns = ['Region', 'Count']
    df.head(10)
    plt.figure(figsize = (16, 6))
    sns.barplot(data = df, x = "Region", y = "Count")
    plt.xticks(rotation = 90)
    plt.tight_layout()
    df = doctors.groupby(['Region','Category'])['DoctorID'].count().reset_index()
    df.columns = ['Region', 'Category', 'Count']
    df = pd.pivot(data = df, index='Region', columns='Category', values='Count').sort_values(by = "Specialist", ascending = False).fillna(0)
    df = df.reset_index()
    df = df.set_index("Region")
    df['Total'] = df['General Practitioner'] + df['Specialist']
    df = df.sort_values(by = "Total", ascending = False)
    df.head()
    df = doctors.groupby(['Region','Category'])['DoctorID'].count().reset_index()
    df.columns = ['Region', 'Category', 'Count']
    df = df.sort_values(by = "Count", ascending = False)
    plt.figure(figsize = (16, 6))
    sns.barplot(data = df, x = "Region", y = "Count", hue = "Category")
    plt.xticks(rotation = 90)
    plt.tight_layout()

    Q1.2 What is the average number of purchases per region?

    keep_cols = ['Purchases','Region']
    df = doctors[keep_cols]
    df = df.groupby('Region').agg({'Purchases':['mean','count','sum']}).reset_index()
    df.columns = list(map("_".join, df.columns))
    df.columns = ['Region', 'Average', 'Count', 'Sum']
    df.sort_values(by = "Average", ascending = False).head(10)
    plt.figure(figsize = (16, 6))
    df = df.sort_values(by = "Average", ascending = False)
    sns.barplot(data = df, x = 'Region', y = 'Average')
    plt.xticks(rotation = 90)
    plt.tight_layout()
    plt.figure(figsize = (16, 6))
    sns.scatterplot(data = df, x = 'Count', y = 'Average')
    plt.tight_layout()
    threshold = 5
    df2 = df[df['Count'] > threshold].sort_values(by = "Average", ascending = False)
    df2.head(10)
    plt.figure(figsize = (16, 6))
    sns.scatterplot(data = df2, x = 'Count', y = 'Average')
    plt.tight_layout()

    Q2 Can you find a relationship between Purchases and Complaints?