Workforce Analytics Initiative
1. Project Overview
This initiative explores workforce-related datasets from a simulated Canadian university. By examining data on human resources, employee performance, and absenteeism, the objective is to uncover insights related to workforce productivity, staffing trends, compensation structures, and retention patterns.
2. Key Business Questions
- Which departments exhibit the highest rates of absenteeism?
- How does employee performance correlate with tenure, age, and departmental affiliation?
- Is there a measurable relationship between high performance and higher compensation?
 
3. Set-up Environment
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
4. Data Loading
import pandas as pd
employees = pd.read_csv("employees.csv", parse_dates=["hire_date"])
departments = pd.read_csv("departments.csv")
absences = pd.read_csv("absences.csv")
performance_reviews = pd.read_csv("performance_reviews.csv", parse_dates=["review_date"])
5. Initial Exploration
We'll preview the first rows of each dataset with head() to understand its structure and contents.
# employees
employees.head()
# departments
departments.head()
# absences
absences.head()
# performance_reviews
performance_reviews.head()
6. Data Cleaning
We will check each dataset for null values and duplicate rows, and confirm that each column uses an appropriate data type.
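As a rough illustration of these three checks, the sketch below wraps them in one helper that can be applied to any of the loaded DataFrames. The helper name `summarize_quality` and the sample columns (`employee_id`, `hire_date`, `department_id`) are assumptions made for this example; the real CSV files may use different column names.

```python
import pandas as pd

def summarize_quality(df: pd.DataFrame, name: str) -> dict:
    """Collect null counts, duplicate-row count, and dtypes for one DataFrame."""
    return {
        "name": name,
        "rows": len(df),
        # Number of missing values per column
        "null_counts": df.isna().sum().to_dict(),
        # Rows that exactly repeat an earlier row
        "duplicate_rows": int(df.duplicated().sum()),
        # Column data types as readable strings
        "dtypes": df.dtypes.astype(str).to_dict(),
    }

# Hypothetical stand-in for employees.csv; actual columns may differ.
sample = pd.DataFrame({
    "employee_id": [1, 2, 2, 3],
    "hire_date": pd.to_datetime(["2019-01-15", "2020-06-01", "2020-06-01", None]),
    "department_id": [10, 20, 20, None],
})

report = summarize_quality(sample, "employees (sample)")
for key, value in report.items():
    print(f"{key}: {value}")
```

The same call could be repeated for each dataset, for example `summarize_quality(absences, "absences")`, to compare them side by side before deciding how to handle missing or repeated records.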
Employees
Check column types