impact_redesign_landing_page_on_conversion_rate
Which version of the website should you use?
Background
You work for an early-stage startup in Germany. Your team has been working on a redesign of the landing page. The team believes a new design will increase the number of people who click through and join your site.
They have been testing the changes for a few weeks. Now they want to measure the impact of the change, and they need you to determine whether the observed increase could be due to random chance or is statistically significant.
The data
The team assembled the following file:
Redesign test data
- "treatment" - "yes" if the user saw the new version of the landing page, "no" otherwise.
- "new_images" - "yes" if the page used a new set of images, "no" otherwise.
- "converted" - 1 if the user joined the site, 0 otherwise.
The control group is those users with "no" in both columns: the old version with the old set of images.
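The two flag columns together define four groups. As a minimal sketch of how the combinations could be labeled and summarized in one pass, the example below uses a tiny made-up DataFrame (`toy`) and a hypothetical helper (`label_group`); neither is part of the real `redesign.csv` data, and the numbers are for illustration only.

```python
import pandas as pd

# Hypothetical toy rows in the same shape as the test data (not real results).
toy = pd.DataFrame({
    "treatment":  ["no", "no", "yes", "yes", "no", "yes", "no", "yes"],
    "new_images": ["no", "yes", "no", "yes", "no", "no", "yes", "yes"],
    "converted":  [0, 1, 1, 0, 0, 1, 0, 1],
})

def label_group(row):
    """Map the two yes/no flags to one of the four group names."""
    names = {
        ("no", "no"): "control",
        ("no", "yes"): "new_images_only",
        ("yes", "no"): "new_landing_only",
        ("yes", "yes"): "complete_redesign",
    }
    return names[(row["treatment"], row["new_images"])]

toy["group"] = toy.apply(label_group, axis=1)

# Users, conversions, and conversion rate per group.
summary = toy.groupby("group")["converted"].agg(
    users="count", conversions="sum", rate="mean"
)
print(summary)
```

Because "converted" is coded as 0/1, the mean of the column within each group is that group's conversion rate.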
# imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import ttest_ind
# read data
df = pd.read_csv('./data/redesign.csv')
df.head()
Challenge
Complete the following tasks:
- Analyze the conversion rates for each of the four groups: the new/old design of the landing page and the new/old pictures.
- Can the increases observed be explained by randomness? (Hint: Think A/B test)
- Which version of the website should they use?
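One way to make the question "can the increase be explained by randomness?" concrete is a permutation test: shuffle the group labels many times and count how often a difference at least as large as the observed one appears by chance. The sketch below is only an illustration of that idea, not the method used in this notebook; the arrays `control` and `variant` and the helper `permutation_p_value` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 0/1 conversion outcomes for two groups (not the real data).
control = np.array([0] * 90 + [1] * 10)
variant = np.array([0] * 85 + [1] * 15)

def permutation_p_value(a, b, n_permutations=5000, rng=rng):
    """One-sided p-value: share of shuffles whose rate difference (b - a)
    is at least as large as the observed difference."""
    observed = b.mean() - a.mean()
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)
        diff = shuffled[len(a):].mean() - shuffled[:len(a)].mean()
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            hits += 1
    return observed, hits / n_permutations

observed, p_value = permutation_p_value(control, variant)
print(f"observed difference: {observed:.3f}, permutation p-value: {p_value:.3f}")
```

A small p-value means that random relabeling rarely reproduces the observed gap, which is the same logic a parametric A/B test relies on.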
# count rows
df.shape
# are there missing values?
df.isnull().values.any()
# Question 1 - Look into data & prepare groups
# 1. control group - conversion rate (treatment "no", new_images "no")
print('1. CONTROL GROUP:')
print('---------------------------------------------------')
## 1.1 control group count
count_control = sum((df['treatment'] == 'no') & (df['new_images'] == 'no'))
print('Count of individuals: ', count_control)
## 1.2 control group - converted count
control_group_converted = sum((df['treatment'] == 'no') & (df['new_images'] == 'no') & (df['converted'] == 1))
print('Count of converted individuals: ', control_group_converted)
## 1.3 control group - not converted count
control_group_nconverted = sum((df['treatment'] == 'no') & (df['new_images'] == 'no') & (df['converted'] == 0))
print('Count of NOT converted individuals: ', control_group_nconverted)
print('---------------------------------------------------')
print()
# 2. only new images group - conversion rate (treatment "no", new_images "yes")
print('2. ONLY NEW IMAGES GROUP:')
print('---------------------------------------------------')
## 2.1 only new images group count
count_new_images = sum((df['treatment'] == 'no') & (df['new_images'] =='yes'))
print('Count of individuals: ', count_new_images)
## 2.2 only new images group - converted count
new_images_group_converted = sum((df['treatment'] == 'no') & (df['new_images'] =='yes') & (df['converted'] == 1))
print('Count of converted individuals: ', new_images_group_converted)
## 2.3 only new images group - not converted count
new_images_group_nconverted = sum((df['treatment'] == 'no') & (df['new_images'] =='yes') & (df['converted'] == 0))
print('Count of NOT converted individuals: ', new_images_group_nconverted)
print('---------------------------------------------------')
print()
# 3. new landing page group - conversion rate (treatment "yes", new_images "no")
print('3. ONLY NEW LANDING PAGE GROUP:')
print('---------------------------------------------------')
## 3.1 only new landing page group count
count_new_landing = sum((df['treatment'] == 'yes') & (df['new_images'] == 'no'))
print('Count of individuals: ', count_new_landing)
## 3.2 only new landing page group - converted count
new_landing_group_converted = sum((df['treatment'] == 'yes') & (df['new_images'] =='no') & (df['converted'] == 1))
print('Count of converted individuals: ', new_landing_group_converted)
## 3.3 only new landing page group - not converted count
new_landing_group_nconverted = sum((df['treatment'] == 'yes') & (df['new_images'] =='no') & (df['converted'] == 0))
print('Count of NOT converted individuals: ', new_landing_group_nconverted)
print('---------------------------------------------------')
print()
# 4. complete redesign group - conversion rate (treatment "yes", new_images "yes")
print('4. COMPLETE REDESIGN GROUP:')
print('---------------------------------------------------')
## 4.1 complete redesign group count
count_complete_redesign = sum((df['treatment'] == 'yes') & (df['new_images'] == 'yes'))
print('Count of individuals: ', count_complete_redesign)
## 4.2 complete redesign group - converted count
complete_redesign_group_converted = sum((df['treatment'] == 'yes') & (df['new_images'] =='yes') & (df['converted'] == 1))
print('Count of converted individuals: ', complete_redesign_group_converted)
## 4.3 complete redesign group - not converted count
complete_redesign_group_nconverted = sum((df['treatment'] == 'yes') & (df['new_images'] =='yes') & (df['converted'] == 0))
print('Count of NOT converted individuals: ', complete_redesign_group_nconverted)
print('---------------------------------------------------')
# Question 1 - Allocation Plot
fig, ax = plt.subplots()
W = ['Control Group', 'New Images Group', 'New Landing Page Group', 'Complete Redesign Group']
X = [count_control, count_new_images, count_new_landing, count_complete_redesign]
Y = [control_group_nconverted, new_images_group_nconverted, new_landing_group_nconverted, complete_redesign_group_nconverted]
Z = [control_group_converted, new_images_group_converted, new_landing_group_converted, complete_redesign_group_converted]
df_grouped = pd.DataFrame(np.c_[X,Y,Z], index=W)
df_grouped.plot(kind="bar", ax=ax)
ax.legend(["Group Count", "Not Converted", "Converted"], loc=(1.05, 0.5));
plt.show()
# Question 1 - Calculate Conversion Rates
# 1. control group
conversion_rate_control = control_group_converted/count_control
print('1. Control Group CR: ', round(conversion_rate_control*100, 2), "%")
# 2. only new images
conversion_rate_new_images = new_images_group_converted/count_new_images
print('2. New Images Group CR: ', round(conversion_rate_new_images*100, 2), "%")
# 3. only new landing page
conversion_rate_new_landing = new_landing_group_converted/count_new_landing
print('3. New Landing Page Group CR: ', round(conversion_rate_new_landing*100, 2), "%")
# 4. complete redesign
conversion_rate_complete_redesign = complete_redesign_group_converted/count_complete_redesign
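print('4. Complete Redesign Group CR: ', round(conversion_rate_complete_redesign*100, 2), "%")
# Question 2 - Lift
def lift(a, b):
    # relative change of rate b over baseline rate a, formatted as a percentage
    relative_change = (b - a) / a
    # scale to percent before rounding to avoid floating-point noise in the output
    return str(round(relative_change * 100, 2)) + '%'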
print('New Images: ', lift(conversion_rate_control, conversion_rate_new_images))
print('New Landing Page: ', lift(conversion_rate_control, conversion_rate_new_landing))
print('Complete Redesign: ', lift(conversion_rate_control, conversion_rate_complete_redesign))
# Question 2 - Statistical Significance
t_new_img = ttest_ind(df[(df['treatment'] == 'no') & (df['new_images'] == 'yes')]['converted'],
                      df[(df['treatment'] == 'no') & (df['new_images'] == 'no')]['converted'])
print('Only New Images: ', t_new_img)
t_new_landing = ttest_ind(df[(df['treatment'] == 'yes') & (df['new_images'] == 'no')]['converted'],
                          df[(df['treatment'] == 'no') & (df['new_images'] == 'no')]['converted'])
print('Only New Landing Page: ', t_new_landing)
t_complete_redesign = ttest_ind(df[(df['treatment'] == 'yes') & (df['new_images'] == 'yes')]['converted'],
                                df[(df['treatment'] == 'no') & (df['new_images'] == 'no')]['converted'])
print('Complete Redesign: ', t_complete_redesign)
Question 3 - Findings
- 4 groups in the data (an A/B/C/D test)
- a) Control Group
- b) Only New Images Introduced Group
- c) Only Redesign Landing Page Group
- d) Complete Redesign Group (New Images + Redesign Landing Page)
- All groups have the same number of users allocated
- No missing data
- No difference in allocation between groups, so unequal group sizes cannot explain the differences in conversion rate
- Conversion rates of users differ between groups (see above)
- The conversion rate for the redesigned landing page alone (group c) is the highest relative to the control group (control: ≈10.71 % converted, new_landing: 12 % converted)
- Lift effect: ≈12.08 %
- The difference is statistically significant only for group c)
- p-value less than 0.05 --> for c) ≈ 0.004
The company should indeed use the version with ONLY the redesigned landing page.
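As a cross-check on the significance finding, a two-proportion z-test is a common alternative to a t-test for 0/1 outcomes, and a Bonferroni adjustment accounts for the fact that three variants are each compared with the same control group. The sketch below is an assumption-laden illustration, not part of the original analysis: the helper `two_proportion_z_test` and the counts passed to it are hypothetical and are not taken from `redesign.csv`.

```python
from math import erfc, sqrt

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# Hypothetical counts for illustration only (not the real test data).
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)

# Bonferroni: three variants are each compared with the control group.
n_comparisons = 3
alpha = 0.05
adjusted_alpha = alpha / n_comparisons

print(f"z = {z:.3f}, p = {p_value:.4f}")
print(f"significant at adjusted alpha {adjusted_alpha:.4f}: {p_value < adjusted_alpha}")
```

With the p-value of about 0.004 reported for group c), the result would also pass the stricter Bonferroni threshold of 0.05 / 3 ≈ 0.0167.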