Skip to content

A SaaS company seeks to uncover what drives its clients to renew subscriptions. They’ve collected data on client details, subscription records, and economic indicators and would like to connect them to better understand its clients’ behavior.

They’ve tasked you with analyzing these datasets to identify the key factors influencing clients’ decisions to renew their subscriptions.

Your analysis will provide them with insights into which customers are renewing their products and the reasons behind their renewals. The company can leverage these insights to make informed decisions to increase renewal rates and improve customer loyalty, helping them stay competitive and ensure long-term growth.

The Data

The company have provided you with three datasets for your analysis. A summary of each data is provided below.

client_details.csv

ColumnDescription
client_idUnique identifier for each client.
company_sizeSize of the company (Small, Medium, Large).
industryIndustry to which the client belongs (Fintech, Gaming, Crypto, AI, E-commerce).
locationLocation of the client (New York, New Jersey, Pennsylvania, Massachusetts, Connecticut).

subscription_records.csv

ColumnDescription
client_idUnique identifier for each client.
subscription_typeType of subscription (Yearly, Monthly).
start_dateStart date of the subscription - YYYY-MM-DD.
end_dateEnd date of the subscription - YYYY-MM-DD.
renewedIndicates whether the subscription was renewed (True, False).

economic_indicators.csv

ColumnDescription
start_dateStart date of the economic indicator (Quarterly) - YYYY-MM-DD.
end_dateEnd date of the economic indicator (Quarterly) - YYYY-MM-DD.
inflation_rateInflation rate in the period.
gdp_growth_rateGross Domestic Product (GDP) growth rate in the period.
# Re-run this cell
# Import required libraries
import pandas as pd

# Import data
client_details = pd.read_csv('data/client_details.csv')
subscription_records = pd.read_csv('data/subscription_records.csv', parse_dates = ['start_date','end_date'])
economic_indicators = pd.read_csv('data/economic_indicators.csv', parse_dates = ['start_date','end_date'])
# Start coding here, good luck!
client_details.head()
Hidden output
subscription_records.head()
subscription_records.info()
Hidden output
economic_indicators.head()
economic_indicators.info()
Hidden output
# How many total Fintech and Crypto clients does the company have? Store as an integer variable called total_fintech_crypto_clients.
client_details["industry"].unique()
fintech_crypto_clients = client_details[client_details["industry"].isin(["Fintech", "Crypto"])]
total_fintech_crypto_clients = fintech_crypto_clients["industry"].value_counts()
total_fintech_crypto_clients = fintech_crypto_clients["industry"].count()
total_fintech_crypto_clients
# Which industry has the highest renewal rate? Store as a string variable called top_industry.
client_sub = client_details.merge(subscription_records, on="client_id")
client_sub.head()
Hidden output
# Filter out the renewed subscription
renewed_subs = client_sub[client_sub["renewed"] == True]

# count renewal rate for each industry
count_renewal_subs = renewed_subs["industry"].value_counts().reset_index()
count_renewal_subs.columns = ["industry", "count"]

# Find the industry with the highest renewal count
top_industry = count_renewal_subs.loc[count_renewal_subs["count"].idxmax(), "industry"]
top_industry
subscription_records.head()
import pandas as pd

# Ensure datetime format
subscription_records["end_date"] = pd.to_datetime(subscription_records["end_date"])
economic_indicators["start_date"] = pd.to_datetime(economic_indicators["start_date"])
economic_indicators["end_date"] = pd.to_datetime(economic_indicators["end_date"])

# Sort for merge_asof
subscription_records = subscription_records.sort_values("end_date")
economic_indicators = economic_indicators.sort_values("start_date")

# Merge: match subscription end_date to the latest economic start_date
merged_sub_econ = pd.merge_asof(
    subscription_records,
    economic_indicators,
    left_on="end_date",
    right_on="start_date",
    suffixes = ('_sub', '_econ'),
    direction='backward'
)

print(merged_sub_econ.head())

# Filter renewed subscriptions
renewed_subs = merged_sub_econ[merged_sub_econ["renewed"] == True]

# Calculate average inflation rate
average_inflation_for_renewals = round(renewed_subs["inflation_rate"].mean(), 2)

print("Average Inflation Rate for Renewals:", average_inflation_for_renewals)