How can the company improve collaboration?
Background
You work in the analytics department of a multinational company, and the head of HR wants your help mapping out the company's employee network using message data.
They plan to use the network map to understand interdepartmental dynamics better and explore how the company shares information. The ultimate goal of this project is to think of ways to improve collaboration throughout the company.
The data
The company has six months of information on inter-employee communication. For privacy reasons, only sender, receiver, and message length information are available (source).
Messages has information on the sender, receiver, and time.
- "sender" - represents the employee id of the employee sending the message.
- "receiver" - represents the employee id of the employee receiving the message.
- "timestamp" - the date of the message.
- "message_length" - the length in words of the message.
Employees has information on each employee:
- "id" - represents the employee id of the employee.
- "department" - is the department within the company.
- "location" - is the country where the employee lives.
- "age" - is the age of the employee.
Acknowledgments: Pietro Panzarasa, Tore Opsahl, and Kathleen M. Carley. "Patterns and dynamics of users' behavior and interaction: Network analysis of an online community." Journal of the American Society for Information Science and Technology 60.5 (2009): 911-932.
Executive Summary
Measuring how much employees from different departments interact shows how well those departments collaborate. It can also guide efforts to improve inter-departmental dynamics and the flow of information between departments.
The analysis found that:
- The most active department is the Sales department.
- The least active department is the Engineering department.
- The employee with the most connections has the employee id 598.
- The two most influential departments are the Engineering and Admin departments.
- The Marketing department does not collaborate with the IT department, or even within itself.
- The Engineering department does not collaborate with the Marketing department.
- The three most influential employees have the employee ids 194, 32 and 249.
My recommendations would be for the HR team to:
- develop strategies that would ensure more teamwork and collaboration between the Marketing, Engineering, and IT departments.
- set up tasks that would make the Engineering department work more with other departments even though they are the most influential.
- organise inter-departmental hangouts to boost the relationships between different departments and also increase inter-departmental connections.
- possibly, organise rotation of employees through every department for some period of time to help them appreciate other departments more and improve collaboration.
- encourage special collaboration between the Marketing and Engineering departments.
import pandas as pd
# load the messages table and parse the timestamp column as dates
messages = pd.read_csv('data/messages.csv', parse_dates=['timestamp'])
messages
# load the employees table
employees = pd.read_csv('data/employees.csv')
employees
Report
The messages and employees dataframes were merged to produce a single master dataframe that carries employee information for both the sender and the receiver. For each message, the sender's department, age, and location are present, and likewise for the receiver.
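The merge code itself is not shown in the report, so the following is only a minimal sketch of one way such a master dataframe could be built. It uses small made-up tables in place of the real CSV files. The column name `sender_dept` matches the one used later in the report; `receiver_dept`, the `*_location` and `*_age` names, and the helper `add_employee_info` are assumptions made for illustration.

```python
import pandas as pd

# Toy stand-ins for the `messages` and `employees` tables (values are made up).
messages = pd.DataFrame({
    "sender": [1, 2, 1],
    "receiver": [2, 3, 3],
    "timestamp": pd.to_datetime(["2021-06-02", "2021-06-03", "2021-06-04"]),
    "message_length": [10, 25, 7],
})
employees = pd.DataFrame({
    "id": [1, 2, 3],
    "department": ["Sales", "IT", "Admin"],
    "location": ["US", "France", "Brazil"],
    "age": [30, 41, 28],
})


def add_employee_info(df, key, prefix):
    """Hypothetical helper: attach employee attributes for the id in column `key`."""
    info = employees.rename(columns={
        "id": key,
        "department": f"{prefix}_dept",
        "location": f"{prefix}_location",
        "age": f"{prefix}_age",
    })
    # A left join keeps every message, even if an employee id were missing.
    return df.merge(info, on=key, how="left")


# Join once for the sender and once for the receiver.
master = add_employee_info(messages, "sender", "sender")
master = add_employee_info(master, "receiver", "receiver")
```

Renaming the employee columns before each join keeps the sender and receiver attributes apart without relying on merge suffixes.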
master
1. Which departments are the most/least active?
# group the master dataframe by the sender departments and summarize by the sum of message length
sender_group = master[['sender_dept','message_length']].groupby('sender_dept').sum().reset_index().sort_values('message_length',ascending=False)
sender_group
Report
The master dataframe was grouped by sender department and summarized using the sum of the message length in words. The resulting table shows each department and the total number of words it sent. This includes messages sent to employees within the same department.
The Sales department is the most active department and the Engineering department is the least active, measured by total words sent.
Report
The master dataframe was grouped by receiver department and summarized using the sum of the message length in words. The resulting table shows each department and the total number of words it received. This includes messages received from employees within the same department.
The Sales department is the most active department and the Engineering department is the least active, measured by total words received.
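The receiver-side aggregation described above is not shown in the report. A minimal sketch that mirrors the sender-side grouping could look like the following. The column name `receiver_dept` and the variable name `receiver_group` are assumptions, and the rows are made-up toy data, not the company's messages.

```python
import pandas as pd

# Toy rows standing in for the merged `master` dataframe (values are made up).
master = pd.DataFrame({
    "sender_dept": ["Sales", "Sales", "IT", "Admin"],
    "receiver_dept": ["IT", "Sales", "Sales", "IT"],
    "message_length": [10, 20, 5, 8],
})

# Group by the receiver's department and total the words received,
# sorted from most to least active.
receiver_group = (
    master[["receiver_dept", "message_length"]]
    .groupby("receiver_dept")
    .sum()
    .reset_index()
    .sort_values("message_length", ascending=False)
)
```

Because the grouping keys come straight from the message rows, words exchanged within a single department count toward that department's total, as the report notes.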
# create visualization to show the most/least active departments
# imports needed by this cell (they may already be loaded elsewhere in the notebook)
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

plt.style.use('default')
fig, ax = plt.subplots(figsize=[5,3])
ax = sns.barplot(data=sender_group,y='sender_dept',x='message_length',color='#B22222',ax=ax)
# remove axis labels and move the x ticks to the top
ax.set_xlabel('')
ax.set_ylabel('')
ax.set_xticks([0,25000,50000,75000])
ax.xaxis.tick_top()
ax.tick_params(left=False,top=False)
ax.tick_params(axis='x',colors='grey')
# hide the frame around the plot
for loc in ['left','right','top','bottom']:
    ax.spines[loc].set_visible(False)
# fade the bars from the most active to the least active department
for bar, alpha in zip(ax.containers[0],np.linspace(1,0.5,6)):
    bar.set_alpha(alpha)
# title and subtitle
ax.text(-18500,-2.6,'The Total Message Length Is 170K+ Words',size=12,weight='bold')
ax.text(-18500,-2.0,'Message length in words for all departments',size=12)
# label each bar with its approximate total
y = 0
for x,text in zip(sender_group.message_length,['75k+','48k+','41k+','2k+','980','848']):
    ax.text(x+100,y,text,color='grey')
    y += 1.03