Beyond Saint Petersburg beer profile
library(tidyverse)
library(skimr)
library(cluster)
data <- readr::read_csv('./data/russian_alcohol_consumption.csv')
#skim(data)

1. Executive summary

Our latest promotion in Saint Petersburg yielded good outcomes. Running the same type of promotion in regions with a similar sales history and similar characteristics could be of great benefit to the company.

Building on the results of the previous sales promotion in Saint Petersburg, this analysis investigates regions where we could run an additional promotion and expect similar or better outcomes.

# Impute missing values in the five drink columns (columns 3 to 7)
# with the median of each column
data <- data %>%
  mutate(across(3:7, ~ replace(.x, is.na(.x), median(.x, na.rm = TRUE))))
#mean(is.na(data))
data <- as_tibble(data)

2. Saint Petersburg profile

st_pet <- data %>% 
  filter(str_detect(region,"[Pp]etersburg")) %>% 
  select(-c(year,region)) 

cat("Average sale in litres per capita by year in Saint Petersburg")
round(colMeans(st_pet),2)

barplot(colMeans(st_pet), main = "Average sale in litres per capita by year in Saint Petersburg")

On average, in Saint Petersburg, beer has the highest mean sales in litres per capita of all alcoholic drinks: `r round(mean(st_pet$beer),2)`, followed by vodka (`r round(mean(st_pet$vodka),2)`) and wine (`r round(mean(st_pet$wine),2)`).

3. General profile

barplot(colMeans(data[,3:7]),
        main = "Overall average sale in litres per capita by year ")

The overall mean profile shows the same pattern, but is there any association between beer and vodka?

data %>% 
  ggplot(aes(vodka,beer )) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) + 
  #scale_y_log10() +
  labs(title = "Scatter plot of beer vs vodka across all regions")

There is a very weak linear association between beer and vodka, with a correlation coefficient r = `r round(cor(data$beer,data$vodka),2)`. Without individual product prices to weight each product's effect on company income, we treat beer as the leading product to guide our investigation. Which regions might be good candidates for a promotion?

data %>% 
  #filter(!str_detect(region,"^[Ss]aint" )) %>% 
  group_by(region) %>% 
  summarize(wine = median(wine) , 
            beer = median(beer) ,
            champagne = median(champagne) ,
            vodka = median(vodka) ,
            brandy = median(brandy) 
            ) -> data_sum
#dim(data_sum)

4. Intuitive region selection by beer profile

Roughly speaking, and considering only the beer profile, the following 11 regions, ranked by median beer sales, might have a similar beer profile.

data_sum %>% 
  arrange(desc(beer)) %>% 
  distinct(region) %>% 
  head(11) 
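
Sorting by descending beer picks the regions with the highest median beer sales. A complementary way to read "similar beer profile" is to rank regions by how close their median beer value is to that of Saint Petersburg. The sketch below illustrates that idea on a small made-up table; `toy_sum`, its region names and its values are hypothetical stand-ins for `data_sum`, and the cut-off of 2 regions is arbitrary.

```r
library(dplyr)

# Hypothetical toy summary table: region names and values are invented
# for illustration and do not come from the real data set.
toy_sum <- tibble(
  region = c("Saint Petersburg", "Region A", "Region B", "Region C", "Region D"),
  beer   = c(70, 68, 40, 75, 20)
)

# Median beer value of the reference region
ref_beer <- toy_sum %>%
  filter(region == "Saint Petersburg") %>%
  pull(beer)

# Rank the remaining regions by absolute distance to the reference value
closest <- toy_sum %>%
  filter(region != "Saint Petersburg") %>%
  mutate(beer_gap = abs(beer - ref_beer)) %>%
  arrange(beer_gap) %>%
  head(2)

closest
```

With the real data, the same steps could be applied to `data_sum`, provided the reference region is matched by the exact name used in the `region` column.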

Selecting by descending vodka instead yields a different list of regions.
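
To see how much the two rankings differ, one option is to build both lists and look at their overlap. The following is a minimal sketch on made-up values; `toy_sum`, its region names and the top-2 cut-off are assumptions used only for illustration, with `data_sum` and `head(11)` being the counterparts in this analysis.

```r
library(dplyr)

# Hypothetical toy summary table: region names and values are invented.
toy_sum <- tibble(
  region = c("Region A", "Region B", "Region C", "Region D"),
  beer   = c(68, 40, 75, 20),
  vodka  = c(5, 12, 14, 15)
)

# Top regions by median beer and by median vodka
top_beer  <- toy_sum %>% arrange(desc(beer))  %>% head(2) %>% pull(region)
top_vodka <- toy_sum %>% arrange(desc(vodka)) %>% head(2) %>% pull(region)

# Regions that appear in both lists
common <- intersect(top_beer, top_vodka)
common
```

Regions that show up in both lists would be candidates where a promotion could benefit from both strong beer and strong vodka sales.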