📖 Background
In this project, I will take a data-driven approach to analyze how various factors—such as job role, experience level, remote work, and company size—impact salaries in the global tech industry. By leveraging salary data from thousands of employees worldwide, I will identify key trends and provide actionable insights that help companies attract and retain top talent.
At the core of my analysis, I will explore and visualize salary patterns to highlight industry benchmarks and competitive compensation strategies. These insights will empower companies to make informed hiring and compensation decisions in an increasingly competitive market.
📖 Executive Summary
The analysis of salaries by job title, remote work ratio, and company size yields the following key takeaways:
- Top Paying Job Titles: Companies looking to attract top talent should offer competitive salaries for the highest-paying roles, such as specialized tech and senior leadership positions.
- Remote Work and Salaries: Employees working fully on-site (0%) tend to earn higher average salaries than hybrid (50%) or fully remote (100%) workers. This suggests that on-site work remains an attractive option for talent retention, rather than offering fully remote work alone.
- Company Size and Salaries: Medium-sized firms (M) offer the highest average salaries, while small companies (S) tend to pay the least. Large companies (L) fall in between.
📖 Recommendation:
- Tech firms should leverage competitive on-site work salaries to attract top-tier professionals.
- Companies must benchmark salaries against industry trends to remain competitive.
- Smaller companies should focus on non-monetary benefits (e.g., equity, flexible work) to retain talent.
💾 The data
The data comes from a survey hosted by an HR consultancy, available in 'salaries.csv'.
Each row represents a single employee's salary record for a given year:
- work_year - The year the salary was paid.
- experience_level - Employee experience level:
  - EN: Entry-level / Junior
  - MI: Mid-level / Intermediate
  - SE: Senior / Expert
  - EX: Executive / Director
- employment_type - Employment type:
  - PT: Part-time
  - FT: Full-time
  - CT: Contract
  - FL: Freelance
- job_title - The job title during the year.
- salary - Gross salary paid (in local currency).
- salary_currency - Salary currency (ISO 4217 code).
- salary_in_usd - Salary converted to USD using average yearly FX rate.
- employee_residence - Employee's primary country of residence (ISO 3166 code).
- remote_ratio - Percentage of remote work:
  - 0: No remote work (<20%)
  - 50: Hybrid (50%)
  - 100: Fully remote (>80%)
- company_location - Employer's main office location (ISO 3166 code).
- company_size - Company size:
  - S: Small (<50 employees)
  - M: Medium (50–250 employees)
  - L: Large (>250 employees)
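The coded columns in the data dictionary can be turned into readable labels before plotting. The following is a minimal sketch, not part of the original analysis: the helper name `add_labels` and the `*_label` column names are assumptions made for illustration, and the small sample frame is made up rather than taken from 'salaries.csv'.

```python
import pandas as pd

# Code-to-label mappings copied from the data dictionary above.
CODED_COLUMNS = {
    "experience_level": {
        "EN": "Entry-level / Junior",
        "MI": "Mid-level / Intermediate",
        "SE": "Senior / Expert",
        "EX": "Executive / Director",
    },
    "employment_type": {
        "PT": "Part-time",
        "FT": "Full-time",
        "CT": "Contract",
        "FL": "Freelance",
    },
    "remote_ratio": {0: "No remote work", 50: "Hybrid", 100: "Fully remote"},
    "company_size": {"S": "Small", "M": "Medium", "L": "Large"},
}


def add_labels(df):
    """Return a copy of df with one '<column>_label' column per coded column.

    Hypothetical helper: raises ValueError if a column holds a code
    that the data dictionary does not list.
    """
    out = df.copy()
    for column, labels in CODED_COLUMNS.items():
        unknown = set(out[column].dropna().unique()) - set(labels)
        if unknown:
            raise ValueError(f"Unexpected codes in {column}: {sorted(unknown)}")
        out[f"{column}_label"] = out[column].map(labels)
    return out


# Made-up rows, only to show the call.
sample = pd.DataFrame(
    {
        "experience_level": ["SE", "EN"],
        "employment_type": ["FT", "CT"],
        "remote_ratio": [100, 0],
        "company_size": ["M", "S"],
    }
)
labelled = add_labels(sample)
print(labelled[["remote_ratio", "remote_ratio_label"]])
```

On the real data the same call would be `add_labels(salaries_df)`, assuming the file uses exactly the codes listed above.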
Analysing and processing the data to extract meaningful insights
#importing the required libraries and dataset
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
salaries_df = pd.read_csv('salaries.csv')
salaries_df
Q. Create a bar chart displaying the top 5 job titles with the highest average salary (in USD).
# Group by job title and compute the average salary
top_jobs = salaries_df.groupby("job_title")["salary_in_usd"].mean().nlargest(5)
# Plot bar chart
plt.figure(figsize=(10, 6))
sns.barplot(x=top_jobs.values, y=top_jobs.index, palette="Blues_r")
plt.title("Top 5 Job Titles with Highest Average Salary (USD)")
plt.xlabel("Average Salary (USD)")
plt.ylabel("Job Title")
plt.show()
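A ranking by mean salary can be dominated by job titles that appear only once or twice in the data. As a hedged sketch of one way to check this, the function below reports the mean, median, and record count per title and optionally drops titles with few records. The function name `top_job_titles`, the `min_count` parameter, and the demo rows are all assumptions for illustration, not part of the original analysis.

```python
import pandas as pd


def top_job_titles(df, n=5, min_count=1):
    """Rank job titles by mean salary_in_usd.

    Titles with fewer than min_count records are dropped before ranking.
    """
    stats = df.groupby("job_title")["salary_in_usd"].agg(["mean", "median", "count"])
    stats = stats[stats["count"] >= min_count]
    return stats.sort_values("mean", ascending=False).head(n)


# Made-up rows: "Title A" has a single, very high record.
demo = pd.DataFrame(
    {
        "job_title": ["Title A", "Title B", "Title B", "Title B", "Title C", "Title C"],
        "salary_in_usd": [300000, 100000, 120000, 110000, 90000, 95000],
    }
)

unfiltered = top_job_titles(demo, n=2)             # single-record title ranks first
filtered = top_job_titles(demo, n=2, min_count=2)  # single-record title is excluded
print(unfiltered)
print(filtered)
```

Applied to the real data, `top_job_titles(salaries_df, n=5, min_count=10)` would show whether the top 5 changes once sparsely represented titles are excluded; the threshold of 10 is an arbitrary example value.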
Q. Compare the average salaries for employees working remotely 100%, 50%, and 0%. What patterns or trends do you observe?
# Group by remote ratio and compute the average salary
remote_salaries = salaries_df.groupby("remote_ratio")["salary_in_usd"].mean()
# Plot bar chart
plt.figure(figsize=(4, 6))
sns.barplot(x=remote_salaries.index, y=remote_salaries.values, palette="Greens_r")
plt.title("Average Salary by Remote Work Ratio")
plt.xlabel("Remote Work Ratio (%)")
plt.ylabel("Average Salary (USD)")
plt.show()
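The bar chart compares means only, so it does not show how many employees fall into each remote-work group or how skewed each group is. A minimal sketch of a companion table follows, assuming the three `remote_ratio` codes from the data dictionary; the function name `salary_by_remote_ratio` and the demo rows are made up for illustration.

```python
import pandas as pd


def salary_by_remote_ratio(df):
    """Return count, mean, and median salary_in_usd per remote_ratio value."""
    summary = df.groupby("remote_ratio")["salary_in_usd"].agg(["count", "mean", "median"])
    # Keep a fixed 0 / 50 / 100 row order; a ratio absent from df shows up as NaN.
    return summary.reindex([0, 50, 100])


# Made-up rows, only to show the shape of the output.
demo = pd.DataFrame(
    {
        "remote_ratio": [0, 0, 50, 100, 100, 100],
        "salary_in_usd": [100000, 140000, 80000, 90000, 110000, 130000],
    }
)
remote_summary = salary_by_remote_ratio(demo)
print(remote_summary)
```

If the mean and median differ noticeably within a group, or one group has far fewer records than the others, the difference between bars should be read with more caution.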
Q. Visualise the salary distribution (in USD) across company sizes (S, M, L). Which company size offers the highest average salary?
# Group by company size and compute the average salary (sorted ascending, so the last row is the highest)
company_size_salaries = salaries_df.groupby("company_size")["salary_in_usd"].mean().sort_values()
print(company_size_salaries)
# Violin plot for salary distribution across company sizes
plt.figure(figsize=(8, 6))
sns.violinplot(x=salaries_df["company_size"], y=salaries_df["salary_in_usd"], order=["S", "M", "L"], palette="Set3")
plt.title("Salary Distribution by Company Size")
plt.xlabel("Company Size")
plt.ylabel("Salary (USD)")
plt.show()
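The violin plot shows the shape of each distribution, but the quartiles are hard to read off it exactly. The sketch below tabulates the quartiles and the mean per company size in S, M, L order. The function name `salary_spread_by_company_size`, the column names `q25` and `q75`, and the demo rows are assumptions for illustration and do not come from 'salaries.csv'.

```python
import pandas as pd

SIZE_ORDER = ["S", "M", "L"]  # small, medium, large, as in the data dictionary


def salary_spread_by_company_size(df):
    """Return the quartiles and mean of salary_in_usd per company_size."""
    grouped = df.groupby("company_size")["salary_in_usd"]
    # quantile() with a list returns a MultiIndex Series; unstack() makes one column per quantile.
    spread = grouped.quantile([0.25, 0.5, 0.75]).unstack()
    spread.columns = ["q25", "median", "q75"]
    spread["mean"] = grouped.mean()
    return spread.reindex(SIZE_ORDER)


# Made-up rows, only to show the call.
demo = pd.DataFrame(
    {
        "company_size": ["S", "S", "M", "M", "M", "L", "L"],
        "salary_in_usd": [60000, 80000, 120000, 140000, 130000, 110000, 115000],
    }
)
size_summary = salary_spread_by_company_size(demo)
highest_mean_size = size_summary["mean"].idxmax()
print(size_summary)
print("Highest mean salary in demo data:", highest_mean_size)
```

On the real data, `salary_spread_by_company_size(salaries_df)` would give the numbers behind the violin plot and answer the question of which company size has the highest average salary directly.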