How can the company improve collaboration?

📖 Background

You work in the analytics department of a multinational company, and the head of HR wants your help mapping out the company's employee network using message data.

They plan to use the network map to understand interdepartmental dynamics better and explore how the company shares information. The ultimate goal of this project is to think of ways to improve collaboration throughout the company.

💾 The data

The company has six months of information on inter-employee communication. For privacy reasons, only sender, receiver, and message length information are available (source).

Messages has information on the sender, receiver, and time.
  • "sender" - represents the employee id of the employee sending the message.
  • "receiver" - represents the employee id of the employee receiving the message.
  • "timestamp" - the date of the message.
  • "message_length" - the length in words of the message.
Employees has information on each employee:
  • "id" - represents the employee id of the employee.
  • "department" - is the department within the company.
  • "location" - is the country where the employee lives.
  • "age" - is the age of the employee.

Acknowledgments: Pietro Panzarasa, Tore Opsahl, and Kathleen M. Carley. "Patterns and dynamics of users' behavior and interaction: Network analysis of an online community." Journal of the American Society for Information Science and Technology 60.5 (2009): 911-932.

Executive summary.

1. Which departments are the most/least active?
The Sales department leads in the number of messages sent and received. Marketing is the least active department.

2. Which employee has the most connections?
Employee 598 has the most connections: 81 distinct contacts over the period covered by the data.

3. Identify the most influential departments and employees.
The most influential departments are Sales, Admin, and Operations.
The 14 most influential employees are: 605, 128, 509, 144, 389, 317, 598, 586, 483, 725, 337, 422, 260, 734.

4. Using the network analysis, in which departments would you recommend the HR team focus to boost collaboration?
HR should look at the gap between received and sent messages in the following departments: Engineering, IT, and Marketing.
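
As a rough illustration of the comparison behind point 4, sent and received counts per department can be tabulated side by side. This is a sketch on invented rows, assuming a merged table with `department_sender` and `department_receiver` columns (the names the merge in this notebook produces). The names `demo` and `balance` are hypothetical.

```python
import pandas as pd

# Invented rows standing in for the merged message table (not real data).
demo = pd.DataFrame({
    "department_sender":   ["Sales", "Sales", "IT", "Sales"],
    "department_receiver": ["IT", "Admin", "Sales", "IT"],
})

# Count how many messages each department sent and received.
sent = demo["department_sender"].value_counts()
received = demo["department_receiver"].value_counts()

# Align both counts on the department name; departments missing from
# one side get a count of 0.
balance = pd.DataFrame({"sent": sent, "received": received}).fillna(0).astype(int)

# Positive values mean a department receives more than it sends.
balance["received_minus_sent"] = balance["received"] - balance["sent"]
print(balance.sort_values("received_minus_sent"))
```

A strongly negative or positive gap only indicates one-directional traffic; whether that is a collaboration problem still needs a closer look.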

[1]
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

messages = pd.read_csv('data/messages.csv', parse_dates= ['timestamp'])
messages
[2]
employees = pd.read_csv('data/employees.csv')
employees

First, let's take a look at the data.

[3]
# Attach the sender's attributes, then the receiver's, to every message.
df = pd.merge(messages, employees, left_on='sender', right_on='id', how='left')
df = pd.merge(df, employees, left_on='receiver', right_on='id', how='left', suffixes=(None, '_receiver'))
# The unsuffixed employee columns belong to the sender; make that explicit.
df = df.rename(columns={"department": "department_sender", "location": "location_sender", "age": "age_sender"})
# Drop the key columns, which duplicate 'sender' and 'receiver'.
df = df.drop(columns=["id", "id_receiver"])
df
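
A left merge silently keeps messages whose sender or receiver id has no match in the employee table. One way to check for that is sketched below, using the `validate` and `indicator` parameters of `DataFrame.merge`. The toy tables `messages_demo` and `employees_demo` and their values are made up for illustration.

```python
import pandas as pd

# Toy stand-ins for messages.csv and employees.csv (values are invented).
messages_demo = pd.DataFrame({
    "sender": [1, 2, 3],
    "receiver": [2, 3, 99],   # id 99 has no row in employees_demo
    "message_length": [10, 25, 7],
})
employees_demo = pd.DataFrame({
    "id": [1, 2, 3],
    "department": ["Sales", "IT", "Admin"],
})

# validate="many_to_one" raises MergeError if employee ids are duplicated;
# indicator=True adds a "_merge" column showing where each row matched.
checked = messages_demo.merge(
    employees_demo,
    left_on="receiver",
    right_on="id",
    how="left",
    validate="many_to_one",
    indicator=True,
)

# Rows marked "left_only" are messages whose receiver is not in the employee table.
unmatched = checked[checked["_merge"] == "left_only"]
print(f"Messages with an unknown receiver: {len(unmatched)}")
```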

Next, let's derive calendar features from the timestamp.

# Work on a datetime view of the column; to_datetime is a no-op if
# read_csv already parsed it, so no round trip through strings is needed.
ts = pd.to_datetime(df['timestamp'])
df['month'] = ts.dt.month
df['day'] = ts.dt.day
df['hour'] = ts.dt.hour
# Calendar date as a datetime (time of day set to midnight).
df['date'] = ts.dt.normalize()
# Day of week as a number (Monday=0) and as a name.
df['dayofweek'] = ts.dt.dayofweek
df['day_of_week'] = ts.dt.day_name()
df.info()

As the output of df.info() shows, the final table contains no missing values, so we can move on.
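
The same check can be made explicit by counting missing values per column rather than reading them off the `df.info()` output. Here is a minimal sketch on an invented table. The name `demo` and its contents are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Invented table with two deliberately missing entries.
demo = pd.DataFrame({
    "sender": [1, 2, 3],
    "department_sender": ["Sales", None, "IT"],
    "message_length": [10.0, np.nan, 7.0],
})

# Number of missing values in each column.
missing_per_column = demo.isna().sum()

# True if any column has at least one missing value.
has_missing = bool(missing_per_column.any())

print(missing_per_column)
print("Any missing values:", has_missing)
```

On the merged table, the equivalent call would be `df.isna().sum()`.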

Descriptive statistics help us look at the data, summarize the characteristics of a data set and assess the likelihood of outliers and other anomalies.

[6]
df.describe()
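
Beyond the summary table, one common way to flag potential outliers is the 1.5 × IQR rule. The sketch below applies it to made-up message lengths. The values and the name `lengths` are illustrative, not taken from the dataset, and the 1.5 multiplier is a convention rather than something the notebook prescribes.

```python
import pandas as pd

# Invented message lengths with one obviously extreme value.
lengths = pd.Series([12, 15, 14, 13, 16, 15, 14, 120], name="message_length")

# Interquartile range: the spread of the middle 50% of the values.
q1, q3 = lengths.quantile([0.25, 0.75])
iqr = q3 - q1

# Values further than 1.5 * IQR outside the quartiles are flagged.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = lengths[(lengths < lower) | (lengths > upper)]

print(f"Bounds: [{lower}, {upper}]")
print(outliers)
```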

The number of messages sent per day can be visualized with the following line graph:

# Count the messages sent on each date.
df1 = df.groupby('date')['sender'].count().reset_index()
df1 = df1.rename(columns={'sender':'message_send_cnt'})

plt.rcParams['figure.figsize']=(12,4)
# The label gives plt.legend() an entry to display.
plt.plot(df1['date'], df1['message_send_cnt'], color='lightskyblue', label='Messages per day')

plt.title('Message count distribution', fontsize=14)
plt.xlabel('Date')
plt.ylabel('Message count')
plt.legend()

plt.show()

The daily message count drops sharply towards the end of July 2021. One possible explanation is that the company switched to another messenger.
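
To put a number on a drop like this, daily counts can be aggregated by week and compared week over week. The sketch below uses hypothetical daily counts (the dates and values are invented, as are the names `counts` and `weekly`). It only measures the size of the decline and says nothing about its cause.

```python
import pandas as pd

# Hypothetical daily counts: three steady weeks, then a sharp drop.
dates = pd.date_range("2021-06-07", periods=28, freq="D")
counts = pd.Series([20] * 21 + [2] * 7, index=dates, name="message_send_cnt")

# Total messages per calendar week (weeks end on Sunday).
weekly = counts.resample("W").sum()

# Relative change from the previous week; a large negative value marks a drop.
weekly_change = weekly.pct_change()

print(pd.DataFrame({"messages": weekly, "change": weekly_change}))
```

With the notebook's data, `df1.set_index('date')['message_send_cnt']` would play the role of `counts`.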
