Shadows of Salaries: Unveiling Tech Pay Trends
An HR Consultancy Report for Level 1 Competition
February 22, 2025
In the neon-lit corridors of the tech world, salaries pulse like code through a mainframe. As an HR consultancy, we’re diving into a shadowy dataset to decode what drives compensation—job roles, remote work, and more. This report explores thousands of employee records with a dark, futuristic lens, answering three key questions to guide our clients in attracting top talent. Buckle up for a visual journey through the data abyss—charts that glow, insights that strike, and a story that sticks. Let’s illuminate the shadows.
# Import libraries and set dark theme for Plotly
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
# Set Plotly to dark mode globally
pio.templates.default = 'plotly_dark'
dark_template = pio.templates['plotly_dark']
dark_template.layout.paper_bgcolor = '#1a1a1a'
dark_template.layout.plot_bgcolor = '#1a1a1a'
dark_template.layout.font.color = 'white'
dark_template.layout.title.font.size = 20
dark_template.layout.xaxis.gridcolor = '#333333'
dark_template.layout.yaxis.gridcolor = '#333333'
# Load the dataset
df = pd.read_csv('salaries.csv')
print("Data loaded. The shadows are ready to speak.")
1. The Pulse of the Data: How Many Records, What Years?
Our first step is to map the terrain: how vast is this dataset, and across what years does it stretch? Imagine a digital heartbeat, each record a pulse in time. The bar chart below lights up the count per year in neon blue, revealing the dataset’s scope and rhythm. Watch the glow: does it surge or fade over time?
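To put a number on whether the signal surges or fades, year-over-year percentage change is one option. The sketch below is a minimal illustration on made-up counts, not the competition data; `toy_counts` and `yoy_growth` are hypothetical names introduced here.

```python
import pandas as pd

def yoy_growth(counts: pd.Series) -> pd.Series:
    """Return year-over-year percentage change for a Series indexed by year."""
    ordered = counts.sort_index()
    # pct_change compares each year with the previous one; the first year has no baseline
    return ordered.pct_change().mul(100).round(1)

# Toy counts for illustration only (not taken from salaries.csv)
toy_counts = pd.Series({2020: 100, 2021: 150, 2022: 300})
growth = yoy_growth(toy_counts)
print(growth)
```

On the real data, the same helper could be fed the per-year counts directly, for example `yoy_growth(df['work_year'].value_counts())`.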
# Count records per year
year_counts = df['work_year'].value_counts().sort_index()
# Plotly bar chart
fig = px.bar(x=year_counts.index, y=year_counts.values, 
             title='Records Over Time',
             labels={'x': 'Year', 'y': 'Number of Records'},
             text=[f"{v:,}" for v in year_counts.values],
             color_discrete_sequence=['#00b7eb'])
fig.update_traces(textposition='auto')
fig.update_layout(
    plot_bgcolor='#1a1a1a', paper_bgcolor='#1a1a1a',
    font_color='white', title_font_size=20, title_font_color='white',
    xaxis={'gridcolor': '#333333'}, yaxis={'gridcolor': '#333333'}
)
fig.show()
# Calculate stats
num_records = len(df)
min_year = df['work_year'].min()
max_year = df['work_year'].max()
print(f"Total records: {num_records:,}")
print(f"Range of years: {min_year} to {max_year}")
2. Salary Showdown: Data Scientists vs. Data Engineers
Now, we jack into the pay matrix—Data Scientists versus Data Engineers, a clash of neon titans. The violin plot below hums with purple and green, revealing salary distributions beyond mere averages. Hover to explore the spread, and spot the cyan mean lines cutting through. Who rules this digital paygrid?
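Since the violins are meant to show more than averages, a quantile table can sit beside the plot. What follows is a minimal sketch on invented salaries, assuming the same `job_title` and `salary_in_usd` column names as the dataset; `toy` and `salary_summary` are hypothetical.

```python
import pandas as pd

def salary_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Median, quartiles and interquartile range of salary_in_usd per job title."""
    grouped = frame.groupby('job_title')['salary_in_usd']
    # quantile() with a list returns one row per (title, quantile); unstack pivots quantiles to columns
    summary = grouped.quantile([0.25, 0.5, 0.75]).unstack()
    summary.columns = ['q25', 'median', 'q75']
    summary['iqr'] = summary['q75'] - summary['q25']
    return summary

# Invented salaries for illustration only
toy = pd.DataFrame({
    'job_title': ['Data Scientist'] * 5 + ['Data Engineer'] * 5,
    'salary_in_usd': [100_000, 120_000, 140_000, 160_000, 180_000,
                      90_000, 110_000, 130_000, 150_000, 170_000],
})
summary = salary_summary(toy)
print(summary)
```

A median that sits well below the mean would hint that a few very high salaries are pulling the average up.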
# Filter for roles
roles = ['Data Scientist', 'Data Engineer']
filtered_df = df[df['job_title'].isin(roles)]
# Plotly violin plot
fig = go.Figure()
for role, color in zip(roles, ['#9b59b6', '#2ecc71']):
    fig.add_trace(go.Violin(x=filtered_df['job_title'][filtered_df['job_title'] == role],
                           y=filtered_df['salary_in_usd'][filtered_df['job_title'] == role],
                           name=role, fillcolor=color, line_color=color, opacity=0.8))
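# Add mean lines, iterating in trace order so each line sits on its own violin
# (groupby sorts job titles alphabetically, which would swap the two x positions)
means = filtered_df.groupby('job_title')['salary_in_usd'].mean()
for i, role in enumerate(roles):
    mean = means[role]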
    fig.add_shape(type='line', x0=i-0.4, x1=i+0.4, y0=mean, y1=mean,
                  line=dict(color='cyan', width=2, dash='dash'))
    fig.add_annotation(x=i, y=mean + 5000, text=f"${mean:,.0f}", showarrow=False,
                       font=dict(color='cyan', size=12))
fig.update_layout(
    title='Salary Duel: Data Scientists vs Data Engineers',
    xaxis_title='Job Title', yaxis_title='Salary (USD)',
    plot_bgcolor='#1a1a1a', paper_bgcolor='#1a1a1a',
    font_color='white', title_font_size=20, title_font_color='white',
    xaxis={'gridcolor': '#333333'}, yaxis={'gridcolor': '#333333'}
)
fig.update_traces(box_visible=True, meanline_visible=True)
fig.show()
# Stats
ds_avg = means['Data Scientist']
de_avg = means['Data Engineer']
higher_earner = "Data Scientists" if ds_avg > de_avg else "Data Engineers"
print(f"Data Scientists avg: ${ds_avg:,.2f}")
print(f"Data Engineers avg: ${de_avg:,.2f}")
print(f"{higher_earner} earn more.")
3. The Remote Frontier: US Full-Time Talent
Last, we scan the ether: how many US full-time employees operate fully remote? The sunburst chart below flares with neon red, yellow, and blue, one wedge per remote ratio around a US full-time core. Click and hover to feel the shift: how deep does the remote signal run?
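Counts alone hide how large each slice is relative to the whole, so a percentage share can complement the chart. The sketch assumes `remote_ratio` takes only the values 0, 50, and 100, as the labels in this report imply; the data is invented and `remote_share` is a hypothetical helper.

```python
import pandas as pd

def remote_share(ratios: pd.Series) -> pd.Series:
    """Percentage of rows at each remote ratio, with absent ratios reported as 0."""
    share = ratios.value_counts(normalize=True).mul(100)
    # Reindex to a fixed order so a missing category cannot shift the labels
    return share.reindex([0, 50, 100], fill_value=0.0).round(1)

# Invented ratios for illustration only: three on-site rows, five fully remote, no hybrid
toy_ratios = pd.Series([0, 0, 0, 100, 100, 100, 100, 100])
share = remote_share(toy_ratios)
print(share)
```

Fixing the index order up front also guards against a filtered subset that happens to contain no hybrid rows.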
# Filter for US full-time
us_ft = df[(df['employment_type'] == 'FT') & (df['employee_residence'] == 'US')]
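# Reindex so all three ratios are present, in a fixed order that matches the labels below
remote_counts = us_ft['remote_ratio'].value_counts().reindex([0, 50, 100], fill_value=0)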
# Plotly sunburst chart
sunburst_data = pd.DataFrame({
    'Remote Ratio': ['0% (On-site)', '50% (Hybrid)', '100% (Remote)'],
    'Count': remote_counts.values,
    'Parent': ['US Full-Time'] * 3
})
fig = px.sunburst(sunburst_data, path=['Parent', 'Remote Ratio'], values='Count',
                  title='Remote Work: US Full-Time Employees',
                  color='Remote Ratio', color_discrete_sequence=['#e74c3c', '#f1c40f', '#3498db'])
fig.update_layout(
    plot_bgcolor='#1a1a1a', paper_bgcolor='#1a1a1a',
    font_color='white', title_font_size=20, title_font_color='white'
)
fig.show()
# Stats
num_us_fulltime_remote = remote_counts[100]
print(f"Full-time US employees working 100% remotely: {num_us_fulltime_remote:,}")
Shadows of Salaries: Final Report
1. The Pulse of the Data
The neon bar chart pulses with 57,194 records from 2020 to 2024, a rising signal in the dark.
2. Salary Showdown
The violin grid hums with Data Scientists at $159,397.07 and Data Engineers at $149,315.00. Data Scientists surge ahead, their neon cutting the shadows.
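For readers who want the size of that lead, the gap between the two averages can be worked out as in this small sketch. The two figures are copied from this report; `pay_gap` is a hypothetical helper, and expressing the gap relative to the lower average is one choice among several.

```python
def pay_gap(higher: float, lower: float) -> tuple[float, float]:
    """Return the absolute gap and the gap as a percentage of the lower figure."""
    gap = higher - lower
    return round(gap, 2), round(gap / lower * 100, 2)

ds_avg = 159_397.07  # Data Scientists average, as quoted in this report
de_avg = 149_315.00  # Data Engineers average, as quoted in this report
gap_usd, gap_pct = pay_gap(ds_avg, de_avg)
print(f"Gap: ${gap_usd:,.2f} ({gap_pct}% above Data Engineers)")
```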
3. The Remote Frontier
The sunburst ignites with 11,125 US full-time employees at 100% remote, a blue flare in the tech constellation.
Closing Signal
From pulsing records to salary duels and remote flares, this data hacks the tech pay code. Our clients can wield these insights to snag top talent. Upvote if this dark-mode dive lit your screen!