knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE, 
                      comment = NA, prompt = FALSE, tidy = FALSE, 
                      fig.width = 12, fig.height = 8)

Where to focus a marketing campaign?

📖 Background

You are a data analyst at a crowdfunding site. For the next quarter, your company will be running a marketing campaign. The marketing manager wants to target those segments that have donated the most in the past year. She turned to you to help her with her upcoming meeting with the CEO.

💾 The data

You have access to the following information:

Historic crowdfunding donations

First, we need to load the packages that will be used to write the report.


install.packages(c("questionr", "vcd", "forcats"))

library(tidyverse)
library(questionr)
library(ggplot2)
library(forcats)
library(vcd)

Then the read_csv function loads the data into a data frame with the following variables:

  • category - "Sports", "Fashion", "Technology", etc.
  • device - the type of device used.
  • gender - gender of the user.
  • age range - one of five age brackets.
  • amount - how much the user donated in Euros.

df <- readr::read_csv('./data/crowdfunding.csv')
head(df)

💪 Challenge

Create a single visualization that the marketing manager can use to explore the data. Include:

  1. What are the top three categories in terms of total donations?
  2. What device type has historically provided the most contributions?
  3. What age bracket should the campaign target?

First, we look at the structure of the data frame. As can be seen, several variables are stored as character and have to be recoded as factors.


glimpse(df)

These variables are converted to factors using the mutate_if function combined with as.factor.


df <- df %>% mutate_if(is.character, as.factor)
glimpse(df)
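
In recent dplyr releases, mutate_if is superseded by across combined with where. As a minimal sketch of the equivalent conversion, the example below uses a small made-up tibble; the name toy and its values are illustrative assumptions, not the crowdfunding data.

```r
library(dplyr)

# Hypothetical toy data standing in for the crowdfunding data frame
toy <- tibble(
  category = c("Sports", "Games", "Games"),
  device   = c("iOS", "android", "iOS"),
  amount   = c(10, 25, 40)
)

# Convert every character column to a factor; numeric columns are left untouched
toy_fct <- toy %>% mutate(across(where(is.character), as.factor))

glimpse(toy_fct)
```

Either form should give the same result here; across is simply the style the dplyr documentation now recommends.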

After this conversion, we inspect the relative frequencies of each factor's levels to see whether any rare levels should be regrouped. In this case, all levels of all variables seem to be sufficiently represented.


df %>% select_if(is.factor) %>% lapply(freq)
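
Had some levels been too rare, one option would be to fold them into a single "Other" level. The sketch below shows this with fct_lump_min from forcats on a made-up factor; the vector x and the threshold of 2 are assumptions chosen only for illustration.

```r
library(forcats)

# Hypothetical factor with two rare levels ("Film" and "Art" appear once each)
x <- factor(c(rep("Games", 5), rep("Sports", 4), "Film", "Art"))

# Lump every level that appears fewer than 2 times into "Other"
lumped <- fct_lump_min(x, min = 2, other_level = "Other")

table(lumped)
```

No such regrouping is applied in this report, since every level is sufficiently represented.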

Next, we study the amount donated in euros across the different categorical variables, starting with device and category. On the one hand, iOS devices account for a far higher total amount donated than Android devices. On the other hand, the amount donated seems fairly even across categories, with a slight lead for the Games category.


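# Count donations and total the amount for each category/device pair
df %>% group_by(category, device) %>% 
  summarize(n = n(), amount = sum(amount), .groups = "drop_last") %>%
  # Still grouped by category here, so prop is each device's share of donations within its category
  mutate(prop = n/sum(n)) %>% 
  arrange(desc(amount))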
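
The same grouping pattern can rank a single variable by total donations, which is what the first two challenge questions ask for. The sketch below runs on a made-up tibble; toy, its values, and the result names top_categories and device_totals are illustrative assumptions, so the output says nothing about the real data.

```r
library(dplyr)

# Hypothetical toy data with the same column names as the report's data frame
toy <- tibble(
  category = c("Sports", "Games", "Games", "Fashion", "Technology", "Sports"),
  device   = c("iOS", "android", "iOS", "iOS", "android", "iOS"),
  amount   = c(10, 25, 40, 5, 15, 30)
)

# Total donated per category, keeping the three largest totals
top_categories <- toy %>%
  group_by(category) %>%
  summarise(total = sum(amount), .groups = "drop") %>%
  slice_max(total, n = 3)

# Total donated per device, largest first
device_totals <- toy %>%
  group_by(device) %>%
  summarise(total = sum(amount), .groups = "drop") %>%
  arrange(desc(total))

top_categories
device_totals
```

Replacing toy with df would apply the same ranking to the crowdfunding data, and grouping by the age bracket column instead would address the third question.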