
Salaries and Workforce Distribution Analysis

Comparison of workforce distribution and salaries of Data Scientists and Data Engineers

Executive Summary

The dataset contains 57,194 HR records spanning a 4-year range of work years. In this dataset, Data Scientists earn an average of 170k a year, while Data Engineers earn an average of 150k, so Data Scientists earn about 20k more on average.

The dataset contains 11,125 full-time employees who are fully remote and based in the US. These employees account for 19.45% of the total dataset (11,125 of 57,194 records).

Recommendations

We recommend offering more remote roles when making offers to new talent, particularly Data Engineers. Although the 20k average salary difference could be a factor in a candidate's decision, the option to work from home could make the offer more attractive. Further analysis is required before drawing firmer conclusions.


import pandas as pd
salaries_df = pd.read_csv('salaries.csv')
salaries_df.head()  # Overview of the dataset
years_covered = salaries_df['work_year'].max() - salaries_df['work_year'].min() # Calculating the range of years
total_rows = len(salaries_df) # Number of records

print(f"Total Number of Records in Dataset: {total_rows}")
print(f"Range in Years: {years_covered} years")
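Note that `max() - min()` measures the span between the first and last year, which is one less than the number of distinct calendar years when every year in between is present. A minimal sketch on hypothetical toy data (the `toy_df` values are made up for illustration and are not taken from `salaries.csv`) shows the difference:

```python
import pandas as pd

# Hypothetical toy data, not taken from salaries.csv
toy_df = pd.DataFrame({'work_year': [2020, 2021, 2021, 2022, 2023, 2024]})

# Span: last year minus first year
year_span = int(toy_df['work_year'].max() - toy_df['work_year'].min())

# Number of different calendar years that appear in the column
distinct_years = int(toy_df['work_year'].nunique())

print(f"Span: {year_span} years, distinct years: {distinct_years}")
```

Which of the two figures to report depends on whether the summary is meant to describe elapsed time or the number of reporting years.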
# Calculating average salaries

avg_salaries = salaries_df.groupby('job_title')['salary'].mean() 

ds_avg_salary = avg_salaries['Data Scientist'].round(2)
de_avg_salary = avg_salaries['Data Engineer'].round(2)

print(f'Average Data Scientist Salary: ${ds_avg_salary}')
print(f'Average Data Engineer Salary: ${de_avg_salary}')
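The two averages can also be computed side by side, together with the gap between them. The sketch below is one possible way to do it, assuming the same `job_title` and `salary` columns; `toy_df` and its values are hypothetical and only illustrate the calculation. The median and count are included because a mean alone can be skewed by a few very high salaries.

```python
import pandas as pd

# Hypothetical toy data; values are illustrative only
toy_df = pd.DataFrame({
    'job_title': ['Data Scientist', 'Data Scientist',
                  'Data Engineer', 'Data Engineer', 'Data Analyst'],
    'salary': [180000, 160000, 155000, 145000, 100000],
})

roles = ['Data Scientist', 'Data Engineer']

# Keep only the two roles of interest, then summarise salary per role
summary = (
    toy_df[toy_df['job_title'].isin(roles)]
    .groupby('job_title')['salary']
    .agg(['mean', 'median', 'count'])
)

# Difference between the two average salaries
salary_gap = summary.loc['Data Scientist', 'mean'] - summary.loc['Data Engineer', 'mean']

print(summary)
print(f"Average salary gap: ${salary_gap:,.2f}")
```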
# Counting FT USA employees that are fully remote

us_remote_ft_emp = salaries_df[
    (salaries_df['employment_type'] == 'FT') &
    (salaries_df['remote_ratio'] == 100) &
    (salaries_df['employee_residence'] == 'US')
]

print(f'Total of full-time remote US employees: {len(us_remote_ft_emp)}')

# Share of full-time, fully remote US employees among all records, as a percentage
remote_emp_share = len(us_remote_ft_emp) / total_rows * 100
print(f'Share: {round(remote_emp_share, 2)}% of the total workforce.')
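A subset's share of the total is the subset count divided by the total count. Dividing the other way round gives the number of records per remote employee, which is not a percentage. A quick check of both quantities, using the two counts reported in the summary (the variable names here are illustrative):

```python
# Counts reported in the executive summary
total_records = 57194
us_remote_ft_count = 11125

# Percentage of all records that are full-time, fully remote, US-based
remote_share_pct = us_remote_ft_count / total_records * 100

# Reciprocal quantity: records in the dataset per remote employee (not a percentage)
records_per_remote_emp = total_records / us_remote_ft_count

print(f"Share of total: {remote_share_pct:.2f}%")
print(f"Records per remote employee: {records_per_remote_emp:.2f}")
```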