knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(class.output = "code-background")
.code-background {
background-color: lightgreen;
border: 3px solid brown;
font-weight: bold;
}
install.packages("FactoMineR")
install.packages("factoextra")
library(tidyverse)
library(car)
library(scales)
library(psych)
library(FactoMineR)
library(factoextra)
theme_set(theme_bw())
theme_update(plot.title = element_text(hjust = 0.5, size = 20),
plot.subtitle = element_text(hjust = 0.5, size = 15),
axis.text = element_text(size = 18),
axis.title = element_text(size = 18),
legend.position = "bottom")
df <- readr::read_csv('./data/employee_churn_data.csv')
head(df)
1. Which department has the highest employee turnover? Which one has the lowest?
df %>%
  # recode the response as 0/1 so that its mean is the turnover rate
  mutate(left = ifelse(left == "yes", 1, 0)) %>%
  group_by(department) %>%
  summarize(turnover_rate = mean(left)) %>%
  # order the departments by turnover once instead of inside every aesthetic
  mutate(department = fct_reorder(department, turnover_rate, .desc = TRUE)) %>%
  ggplot(aes(department, turnover_rate)) +
  geom_segment(aes(col = department, xend = department, y = 0, yend = turnover_rate), show.legend = FALSE) +
  geom_point(aes(fill = department), pch = 21, size = 10, show.legend = FALSE) +
  geom_text(aes(label = 100 * round(turnover_rate, 3)), col = "black") +
  scale_color_brewer(palette = "Set3") +
  scale_fill_brewer(palette = "Set3") +
  scale_y_continuous(labels = label_percent()) +
  labs(x = "Department", y = "Turnover",
       title = "Turnover percentage per department",
       subtitle = "n = 9540") +
  coord_flip() +
  theme(legend.position = "none",
        plot.title = element_text(hjust = 0.5),
        plot.subtitle = element_text(hjust = 0.5))
- The IT department has the highest turnover percentage (30.9 %), followed by logistics (30.8 %), retail (30.6 %) and marketing (30.3 %)
- Support (28.8 %), engineering (28.8 %), operations (28.6 %), sales (28.5 %) and administration (28.1 %) fall in the middle
- The finance department has the lowest turnover percentage (26.9 %), 1.2 percentage points below the next-lowest department
- Looks like money, even if one is just working with it, does indeed buy happiness
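The ranking above rests on a simple calculation: the mean of a 0/1 "left" indicator within each department. A minimal base-R sketch of that step, run on a small made-up data frame (the `toy` values are hypothetical and are not taken from `employee_churn_data.csv`):

```r
# Hypothetical toy data, NOT the real employee data
toy <- data.frame(
  department = c("IT", "IT", "IT", "IT",
                 "finance", "finance", "finance", "finance"),
  left       = c("yes", "yes", "no", "no",
                 "yes", "no", "no", "no")
)

# The mean of a logical vector is the share of TRUE values,
# so this yields the turnover rate per department
turnover <- tapply(toy$left == "yes", toy$department, mean)

# Sort from highest to lowest turnover
turnover <- sort(turnover, decreasing = TRUE)
print(turnover)
```

In this toy example IT (2 of 4 left) ranks above finance (1 of 4 left), which mirrors how the plot orders the real departments.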
2. Predictor variables for employee turnover
- We can use an ordination method such as principal component analysis (PCA) to answer this question
- PCA is designed for numeric variables, though, and this dataset contains several categorical variables
- If we dummy-code the categorical variables (without centering or scaling the resulting 0/1 columns), PCA should still give a usable picture
df_trans <- df %>%
select(review, projects, tenure, satisfaction, avg_hrs_month) %>%
map_df(~ .x - mean(.x)) %>% #centering on the numeric variables
map_df(~ .x / sd(.x)) %>% # scaling of the numeric variables
cbind(
df %>%
select(department, promoted, salary, bonus, left) %>%
map(~ psych::dummy.code(.x)) %>%
as.data.frame())
head(df_trans)
# scale.unit = FALSE because the continuous variables were already centered and scaled above
df_pca <- PCA(df_trans, graph = FALSE, scale.unit = FALSE)
plot.PCA(df_pca, choix = "var", alpha.var = "contrib") +
theme_bw()
- Arrows (variables) that point in the same direction are positively correlated with each other
- Arrows (variables) that point in opposite directions are negatively correlated with each other
- Arrows (variables) that are at a 90° angle to each other are uncorrelated
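These reading rules follow from a general fact: for centered variables, the cosine of the angle between two variable vectors equals their Pearson correlation. On a two-dimensional variable plot this holds only approximately, and best for variables that are well represented on the displayed plane. A small base-R check of the underlying identity, using made-up vectors (the values of `x` and `y` are hypothetical):

```r
# Hypothetical toy vectors, not taken from the employee data
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 6)

# Center both variables
xc <- x - mean(x)
yc <- y - mean(y)

# Cosine of the angle between the two centered vectors
cos_angle <- sum(xc * yc) / sqrt(sum(xc^2) * sum(yc^2))

# The cosine matches the Pearson correlation
print(all.equal(cos_angle, cor(x, y)))
```

A cosine near 1 corresponds to arrows pointing the same way, near -1 to opposite arrows, and near 0 to arrows at roughly 90°.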
3. Recommendations to reduce employee turnover
- We can infer:
  - Being dissatisfied with the job is associated with a higher chance of quitting (not very surprising)
  - High review results are associated with a higher chance of quitting
  - The chances of an employee quitting do not seem to be correlated with the following variables, although both contributed to constructing the PCA plane:
    - The average working hours per month
    - The person's tenure in the company
  - The other variables do not seem to have much effect on the likelihood of quitting
- Recommendations:
  - It looks as if people who get high scores on their reviews are more likely to quit, possibly because:
    - A) They now know their worth
    - B) It's probably easier to get a new job with a strong review score
  - Therefore, we recommend rewarding people who get good reviews with some sort of compensation (e.g. a raise)
    - Or even stock shares in the company like the cool kids in Silicon Valley do
  - Invest in job satisfaction (new chairs, tables, maybe a ping-pong room, a PS5 in the break room, a movie room, ...), prioritizing the departments with the highest turnover rates (IT, logistics, retail, marketing)
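A variable plot shows only two dimensions, so the inferences in this section are worth checking numerically. One simple option is the correlation of each numeric predictor with a 0/1 "left" indicator. A minimal sketch on made-up data (the `toy` values are hypothetical; only the column names mirror the dataset):

```r
# Hypothetical toy data, NOT the real employee data
toy <- data.frame(
  review       = c(0.80, 0.75, 0.60, 0.55, 0.70, 0.50),
  satisfaction = c(0.30, 0.35, 0.70, 0.80, 0.40, 0.75),
  left         = c("yes", "yes", "no", "no", "yes", "no")
)

# Recode the response as 0/1, as in the turnover plot
left01 <- ifelse(toy$left == "yes", 1, 0)

# Correlation of each numeric predictor with leaving
cors <- sapply(toy[c("review", "satisfaction")],
               function(v) cor(v, left01))
print(round(cors, 2))
```

On the real data, the same call could be applied to `df` with the columns `review`, `projects`, `tenure`, `satisfaction` and `avg_hrs_month`. A positive value for `review` and a negative one for `satisfaction` would be consistent with the reading of the PCA plot.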