Skip to content

Case Study Analytics Employee Attrition Performance on SQL

The purpose of this case study on SQL is to conduct a quick review of this year’s employees, focusing on several key points.

Key points to check

  1. Employee Engagement and Satisfaction
  • Attrition: Identify employees at risk of leaving.
  • Work Life Balance: Determine if work-life balance issues correlate with overtime or job dissatisfaction.
  1. Employee Performance and Compensation
  • Percent Salary Hike: Review salary increases in relation to performance ratings.
  1. Workforce Distribution and Demographics
  • Gender: Review gender distribution across departments and roles to ensure diversity.
  1. Productivity and Logistics
  • Standard Hours: Ensure consistent work hours across roles and departments.
  1. Operational Insights
  • Training Times Last Year: Review the adequacy of training investments and correlate with performance.
Hidden code Table
Spinner
DataFrameas
Point_1
variable
--Employee Engagement and Satisfaction
--Attrition

--First let´s check the average satisfaction level of employees who have already left the company.
WITH Satisfaction AS (
SELECT EmployeeNumber, EnvironmentSatisfaction, JobSatisfaction, RelationshipSatisfaction, JobInvolvement,
(EnvironmentSatisfaction + JobSatisfaction + RelationshipSatisfaction + WorkLifeBalance + JobInvolvement) / 5 AS AVG_Satisfaction
FROM 'employees.hr_employee_attrition.csv'
WHERE Attrition = 'Yes')

SELECT 
ROUND(AVG(AVG_Satisfaction),2) AS AVG_Satisfaction
FROM Satisfaction
Spinner
DataFrameas
Point_1
variable
--The AVG_ Satisfaction of the employees that already left the company is 2.54, I would recommend first focus on employees with less ot equal level of AVG_ satisfaction.

SELECT EmployeeNumber, EnvironmentSatisfaction, JobSatisfaction, RelationshipSatisfaction, JobInvolvement,
(EnvironmentSatisfaction + JobSatisfaction + RelationshipSatisfaction + WorkLifeBalance + JobInvolvement) / 5 AS AVG_Satisfaction
FROM 'employees.hr_employee_attrition.csv'
WHERE Attrition = 'No'
AND (EnvironmentSatisfaction + JobSatisfaction + RelationshipSatisfaction + WorkLifeBalance + JobInvolvement) / 5 <= 2.54
ORDER BY AVG_Satisfaction ASC
-- List of employees with AVG less or equal to 2.54
Spinner
DataFrameas
Point_1
variable
--Employee Engagement and Satisfaction
--Work Life Balance

--Determine if work-life balance issues correlate with overtime.
SELECT
OverTime,
ROUND((COUNT(*) * 100) / (SELECT COUNT(*) FROM 'employees.hr_employee_attrition.csv' where attrition = 'No' AND WorkLifeBalance <= 3),2) AS '% Overtime'
FROM 'employees.hr_employee_attrition.csv'
WHERE Attrition = 'No'
AND WorkLifeBalance <= 3 
GROUP BY OverTime
--76.06% of employees with a Work-Life Balance score less than or equal to 3 do not overwork, while 23.94% do overwork. Based on these results, overworking may not be the main reason why employees lack a Work-Life Balance, but it can be a secundary reason.
Spinner
DataFrameas
Point_1
variable
--Determine if work-life balance issues correlate with job Satisfaction
SELECT
JobSatisfaction,
ROUND((COUNT(*) * 100) / (SELECT COUNT(*) FROM 'employees.hr_employee_attrition.csv' where attrition = 'No' AND WorkLifeBalance <= 3),2) AS WorkLifeBalance
FROM 'employees.hr_employee_attrition.csv'
WHERE Attrition = 'No'
AND WorkLifeBalance <= 3
GROUP BY JobSatisfaction
ORDER BY JobSatisfaction ASC

--Using the last table, we calculated the percentage of employees with a Work-Life Balance of 3 or less, categorized by their Job Satisfaction levels.The results are as follows: 18.07% for a Job Satisfaction of 1, 19.24% for a Job Satisfaction of 2, 29.45% for a Job Satisfaction of 3, and 33.24% for a Job Satisfaction of 4.

Spinner
DataFrameas
Point_1
variable
--Sum the porcentage with Job Satisfaction 3 or lower
SELECT
(18.07 + 19.24 + 29.45) AS Low_Level_Job_Satis

--Based on the results, 66.76% of employees with a Work-Life Balance of 3 or less are dissatisfied with their current Job Satisfaction. This could be one of the main reasons for their lower level of Work-Life Balance.
Spinner
DataFrameas
Point_2
variable
--Employee Performance and Compensation
--Percent Salary Hike

--Salary increases in relation to performance ratings
SELECT
PercentSalaryHike, 
AVG(PerformanceRating) AVG_Performance_Rating
FROM 'employees.hr_employee_attrition.csv'
WHERE Attrition = 'No'
GROUP BY PercentSalaryHike
ORDER BY PercentSalaryHike DESC;

--Employees who receive a salary hike between 20% and 25% have an average performance rating of 4, while employees with a hike of 11% to 19% have an average performance rating of 3. Based on this information, we can infer that the bigger the salary hike, the better the performance of the employee.
Spinner
DataFrameas
Point_3
variable
--Workforce Distribution and Demographics
--Gender: gender distribution across departments and roles

--Count of gender
SELECT
Gender,
COUNT(*) AS Count_Gender_Distribution,
ROUND((COUNT(*) * 100) / (SELECT COUNT(*) FROM 'employees.hr_employee_attrition.csv' where attrition = 'No'),2) AS '%_Gender_Distribution'
FROM 'employees.hr_employee_attrition.csv'
WHERE Attrition = 'No'
GROUP BY Gender;
--The percentage of men in the company exceeds that of women by almost 20%.



Spinner
DataFrameas
Point_3
variable
--Count of gender across departments
SELECT
Gender,
Department,
COUNT(*) AS Count_Gender_Distribution,
ROUND((COUNT(*) * 100) / (SELECT COUNT(*) FROM 'employees.hr_employee_attrition.csv' where attrition = 'No'),2) AS '%_Gender_Distribution'
FROM 'employees.hr_employee_attrition.csv'
WHERE Attrition = 'No'
GROUP BY Gender, Department
ORDER BY Department, '%_Gender_Distribution';
--The percentage of men in each department exceeds that of women.
Spinner
DataFrameas
Point_3
variable
--Count of gender across Job levels
SELECT
Gender,
JobLevel,
COUNT(*) AS Count_Gender_Distribution,
ROUND((COUNT(*) * 100) / (SELECT COUNT(*) FROM 'employees.hr_employee_attrition.csv' where attrition = 'No'),2) AS '%_Gender_Distribution'
FROM 'employees.hr_employee_attrition.csv'
WHERE Attrition = 'No'
GROUP BY Gender, JobLevel
ORDER BY JobLevel;
--The percentage of men in each job level exceeds that of women, with Level 5 showing the greatest difference by more than 20%.
Spinner
DataFrameas
Point_4
variable
--Productivity and Logistics
--Standard Hours across employees
SELECT 
DISTINCT StandardHours
FROM 'employees.hr_employee_attrition.csv'; 

--All the employees works the same 80 hours at month, I´m missing how many extra hours the employees works, so I can´t calculate it.
Spinner
DataFrameas
Point_5
variable
--Operational Insights
--Training Times Last Year
--adequacy of training investments and correlate with performance
SELECT
TrainingTimesLastYear,
ROUND(AVG(PerformanceRating),2)
FROM 'employees.hr_employee_attrition.csv'
WHERE Attrition = 'No'
GROUP BY TrainingTimesLastYear
ORDER BY TrainingTimesLastYear DESC;
--We would expect an increase in performance depending on the number of training sessions the employee receives. However, after checking the results in the last table, all the values hover around 3,regardless of the number of training sessions. This could mean that the training is not significantly affecting employee performance, as the company expected