
📖 Background

You work for an international HR consultancy helping companies attract and retain top talent in the competitive tech industry. As part of your services, you provide clients with insights into industry salary trends to ensure they remain competitive in hiring and compensation practices.

Your team wants to use a data-driven approach to analyse how various factors—such as job role, experience level, remote work, and company size—impact salaries globally. By understanding these trends, you can advise clients on offering competitive packages to attract the best talent.

In this competition, you’ll explore and visualise salary data from thousands of employees worldwide. If you're tackling the advanced level, you'll go a step further, building predictive models to uncover key salary drivers and providing insights on how to enhance future data collection.

💾 The data

The data comes from a survey hosted by an HR consultancy, available in 'salaries.csv'.

Each row represents a single employee's salary record for a given year:
  • work_year - The year the salary was paid.
  • experience_level - Employee experience level:
    • EN: Entry-level / Junior
    • MI: Mid-level / Intermediate
    • SE: Senior / Expert
    • EX: Executive / Director
  • employment_type - Employment type:
    • PT: Part-time
    • FT: Full-time
    • CT: Contract
    • FL: Freelance
  • job_title - The job title during the year.
  • salary - Gross salary paid (in local currency).
  • salary_currency - Salary currency (ISO 4217 code).
  • salary_in_usd - Salary converted to USD using average yearly FX rate.
  • employee_residence - Employee's primary country of residence (ISO 3166 code).
  • remote_ratio - Percentage of remote work:
    • 0: No remote work (<20%)
    • 50: Hybrid (50%)
    • 100: Fully remote (>80%)
  • company_location - Employer's main office location (ISO 3166 code).
  • company_size - Company size:
    • S: Small (<50 employees)
    • M: Medium (50–250 employees)
    • L: Large (>250 employees)
salaries_df <- read.csv('salaries.csv')
head(salaries_df)
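
Before analysing the columns described above, it can help to confirm that the categorical fields contain only the documented codes. The following is a minimal sketch in base R and is not part of the original notebook: the helper name `find_unexpected_codes`, the `allowed_codes` list, and the toy data frame are hypothetical and stand in for `salaries_df`.

```r
# Toy stand-in for salaries.csv (hypothetical rows, not real survey data).
# "XX" is deliberately invalid so the check has something to report.
toy_df <- data.frame(
  experience_level = c("EN", "SE", "MI", "XX"),
  employment_type  = c("FT", "FT", "CT", "FT"),
  remote_ratio     = c(0, 100, 50, 100),
  company_size     = c("S", "M", "L", "M"),
  stringsAsFactors = FALSE
)

# Codes copied from the data dictionary above.
allowed_codes <- list(
  experience_level = c("EN", "MI", "SE", "EX"),
  employment_type  = c("PT", "FT", "CT", "FL"),
  remote_ratio     = c(0, 50, 100),
  company_size     = c("S", "M", "L")
)

# Hypothetical helper: for each column, return the distinct values
# that do not appear in the data dictionary.
find_unexpected_codes <- function(df, allowed) {
  sapply(names(allowed), function(col) {
    setdiff(unique(df[[col]]), allowed[[col]])
  }, simplify = FALSE)
}

unexpected <- find_unexpected_codes(toy_df, allowed_codes)
print(unexpected)
```

Assuming the real file uses the same column names, the same call could be made as `find_unexpected_codes(salaries_df, allowed_codes)`; an empty result for every column would suggest the data matches the dictionary.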

💪 Competition challenge

In this first level, you’ll explore and summarise the dataset to understand its structure and key statistics. If you want to push yourself further, check out level two! Create a report that answers the following:

  • How many records are in the dataset, and what is the range of years covered?
  • What is the average salary (in USD) for Data Scientists and Data Engineers? Which role earns more on average?
  • How many full-time employees based in the US work 100% remotely?

🧑‍⚖️ Judging criteria

This is a community-based competition. Once the competition concludes, you'll have the opportunity to view and vote for the best submissions of others as the voting begins. The top 5 most upvoted entries will win. The winners will receive DataCamp merchandise.

✅ Checklist before publishing into the competition

  • Rename your workspace to make it descriptive of your work. N.B. you should leave the notebook name as notebook.ipynb.
  • Remove redundant cells like the judging criteria, so the workbook is focused on your story.
  • Make sure the workbook reads well and explains how you found your insights.
  • Try to include an executive summary of your recommendations at the beginning.
  • Check that all the cells run without error.

⌛️ Time is ticking. Good luck!

# loading the necessary package
library(dplyr)
# loading the data
salaries_df <- read.csv('salaries.csv')
# 1. How many records are in the dataset, and what is the range of years covered?
num_records <- nrow(salaries_df)
year_range <- range(salaries_df$work_year, na.rm = TRUE)

cat("1. Total records:", num_records, "\n")
cat("   Range of years:", year_range[1], "to", year_range[2], "\n\n")

# 2. Average salary (in USD) for Data Scientists and Data Engineers
avg_salaries <- salaries_df %>%
  filter(job_title %in% c("Data Scientist", "Data Engineer")) %>%
  group_by(job_title) %>%
  summarise(avg_salary_usd = mean(salary_in_usd, na.rm = TRUE)) %>%
  arrange(desc(avg_salary_usd))

cat("2. Average salaries (USD):\n")
print(avg_salaries)

higher_paid_role <- avg_salaries$job_title[1]
cat("\n   ➤", higher_paid_role, "earns more on average.\n\n")

# 3. Number of full-time US employees working 100% remotely
us_fulltime_remote <- salaries_df %>%
  filter(employment_type == "FT",
         employee_residence == "US",
         remote_ratio == 100) %>%
  nrow()

cat("3. Full-time US-based employees working 100% remotely:", us_fulltime_remote, "\n")



Remote Pay Realities: A Global Dive into Tech Salaries (2020–2024)

Executive Summary

In an era where remote work and tech talent are reshaping the global job landscape, understanding salary dynamics has become critical for both companies and professionals. This report analyzes more than 57,000 salary records from tech professionals across the globe, focusing on key roles such as Data Scientists and Data Engineers.

Our analysis highlights which roles command higher pay, how remote work influences compensation, and the prevalence of full-time remote work in the US. These insights equip HR professionals and companies with data-backed recommendations to remain competitive in attracting and retaining top talent.

1. Dataset Overview

Total Records: 57,194

Year Range Covered: 2020 to 2024

This dataset captures global salary patterns across five years, offering a comprehensive view of tech compensation trends.

2. Average Salaries by Role (in USD)

Job Title        Average Salary (USD)
Data Scientist   149,315

Insight: Data Scientists earn more on average than Data Engineers, with a difference of over $10,000. This may reflect a higher market valuation of analytical and machine learning skills in recent years, although the comparison does not control for experience level or location.
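
The gap between the two roles can be computed directly instead of being read off the printed averages. The sketch below uses base R on a small, made-up data frame; the helper name `role_gap` and the toy salaries are hypothetical, and the median is included only as an assumed robustness check, because means are sensitive to a few very high salaries.

```r
# Toy stand-in for salaries.csv (hypothetical values, not real survey data).
toy_df <- data.frame(
  job_title     = c("Data Scientist", "Data Scientist",
                    "Data Engineer", "Data Engineer", "Data Analyst"),
  salary_in_usd = c(150000, 130000, 120000, 140000, 90000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean and median USD salary for two roles,
# plus the difference (role_a minus role_b) for each statistic.
role_gap <- function(df, role_a, role_b) {
  pay_a <- df$salary_in_usd[df$job_title == role_a]
  pay_b <- df$salary_in_usd[df$job_title == role_b]
  c(
    mean_a     = mean(pay_a, na.rm = TRUE),
    mean_b     = mean(pay_b, na.rm = TRUE),
    mean_gap   = mean(pay_a, na.rm = TRUE) - mean(pay_b, na.rm = TRUE),
    median_gap = median(pay_a, na.rm = TRUE) - median(pay_b, na.rm = TRUE)
  )
}

gap_toy <- role_gap(toy_df, "Data Scientist", "Data Engineer")
print(gap_toy)
```

On the real data, `role_gap(salaries_df, "Data Scientist", "Data Engineer")` would be the analogous call. If the mean and median gaps differ substantially, the average difference is probably driven by a few outliers.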

3. Remote Work in the US

Number of Full-time, Fully Remote US Employees: 11,125

Insight: Full-time, fully remote US-based employees account for 11,125 of the 57,194 records in the dataset, or about 19%. This trend has major implications for hiring strategies, compensation adjustments by location, and global talent mobility.
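
A raw count is easier to interpret next to a denominator. The sketch below is a base R illustration on made-up rows; the helper name `remote_share` and the toy data are hypothetical. It reports the count alongside two shares: fully remote among US-based full-time records, and fully remote US full-time records among all records.

```r
# Toy stand-in for salaries.csv (hypothetical rows, not real survey data).
toy_df <- data.frame(
  employment_type    = c("FT", "FT", "FT", "CT", "FT"),
  employee_residence = c("US", "US", "US", "US", "GB"),
  remote_ratio       = c(100, 0, 100, 100, 100),
  stringsAsFactors   = FALSE
)

# Hypothetical helper: counts and shares for full-time, US-resident,
# fully remote records.
remote_share <- function(df) {
  us_ft        <- df$employment_type == "FT" & df$employee_residence == "US"
  fully_remote <- us_ft & df$remote_ratio == 100
  n_us_ft      <- sum(us_ft, na.rm = TRUE)
  n_remote     <- sum(fully_remote, na.rm = TRUE)
  c(
    us_full_time   = n_us_ft,
    fully_remote   = n_remote,
    share_of_us_ft = n_remote / n_us_ft,   # share within US full-time records
    share_of_all   = n_remote / nrow(df)   # share of the whole dataset
  )
}

share_toy <- remote_share(toy_df)
print(round(share_toy, 3))
```

Using the figures in this report, the whole-dataset share would be 11,125 / 57,194, roughly 19.5%. The share within US full-time records needs the US full-time count, which this report does not state.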

Recommendations for Employers

Offer competitive salaries to attract Data Scientists, whose roles are commanding premium compensation.

Embrace remote-first hiring policies to tap into a broader talent pool, especially as remote work becomes standard in the US.

Monitor and adapt to remote compensation trends, ensuring pay equity across locations and roles.