Student Happiness Survey Data

Explore the student happiness survey dataset and publish your findings.

# Load packages
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns
color = sns.color_palette()

Load your data

# Read the uploaded .csv file, using response_id as the index
df = pd.read_csv('survey.csv', index_col='response_id')
df.head()
response_id | Career | Citizenship | Nationality | Year since Matriculation | Year of Study | Primary Programme | Gender | Department | Housing Type | Q1-How many events have you Volunteered in ? | Q2-How many events have you Participated in ? | Q3-How many activities are you Interested in ? | Q4-How many activities are you Passionate about ? | Q5-What are your levels of stress ? | Q6-How Satisfied You are with your Student Life ? | Q7-How much effort do you make to interact with others ? | Q8-About How events are you aware about ? | Q9-What is an ideal student life ?
1 | UGRD | Foreigner | Indonesia | 2 | 2 | Bachelor of Science | F | School of Science | Residences | 0 | 1 | 3 | 1 | 1 | 2 | 2.0 | 2.0 | NaN
2 | UGRD | Country Citizen | Singapore | 1 | 1 | Bachelor of Engineering | F | School of Engineering | Out of Campus | 0 | 1 | 2 | 3 | 1 | 2 | 2.0 | 3.0 | Friends+CCas+good result
3 | UGRD | Foreigner | Malaysia | 2 | 2 | Bachelor of Science | M | School of Science | Halls | 3 | 1 | 1 | 5 | 2 | 2 | 2.0 | 2.0 | just want everything to go smooth. serious
4 | UGRD | Foreigner | Malaysia | 2 | 2 | Bachelor of Engineering | M | School of Engineering | Halls | 3 | 4 | 3 | 3 | 7 | 1 | 1.0 | 1.0 | NaN
5 | UGRD | Foreigner | Viet Nam | 3 | 3 | Bachelor of Engineering | F | School of Engineering | Out of Campus | 4 | 3 | 4 | 5 | 4 | 2 | 2.0 | 2.0 | a mixture of both academic and non-academic

Understand your variables

# Rename the nine question columns to shorter names
to_rename = list(df.columns[9:])
acronyms = ['Volunteer', 'Participate', 'Interest', 'Passion', 'Stress', 'Satisfaction', 'Interaction', 'Events', 'Ideal']
mapping = dict(zip(to_rename, acronyms))
df = df.rename(columns=mapping)
df.columns
Index(['Career', 'Citizenship', 'Nationality', 'Year since Matriculation',
  'Year of Study', 'Primary Programme', 'Gender', 'Department',
  'Housing Type', 'Volunteer', 'Participate', 'Interest', 'Passion',
  'Stress', 'Satisfaction', 'Interaction', 'Events', 'Ideal'],
dtype='object')
# Understand your variables
variables = pd.DataFrame(columns=['Variable','Number of unique values','Values'])

for i, var in enumerate(df.columns):
    variables.loc[i] = [var, df[var].nunique(), df[var].unique().tolist()]
    
variables
 | Variable | Number of unique values | Values
0 | Career | 3 | [UGRD, GRAD, NGRD]
1 | Citizenship | 3 | [Foreigner, Country Citizen, Permanent Resident]
2 | Nationality | 31 | [Indonesia, Singapore, Malaysia, Viet Nam, Hon...
3 | Year since Matriculation | 6 | [2, 1, 3, 4, 5, 6]
4 | Year of Study | 5 | [2, 1, 3, 4, 5]
5 | Primary Programme | 68 | [Bachelor of Science, Bachelor of Engineering,...
6 | Gender | 2 | [F, M]
7 | Department | 21 | [School of Science, School of Engineering, Sch...
8 | Housing Type | 4 | [Residences, Out of Campus, Halls, Residential...
9 | Volunteer | 12 | [0, 3, 4, 2, 1, 5, 10, 6, 8, 7, 9, 11]
10 | Participate | 6 | [1, 4, 3, 2, 0, 5]
11 | Interest | 8 | [3, 2, 1, 4, 5, 6, 7, 8]
12 | Passion | 12 | [1, 3, 5, 2, 4, 7, 6, 8, 10, 9, 0, 11]
13 | Stress | 10 | [1, 2, 7, 4, 3, 5, 6, 8, 9, 0]
14 | Satisfaction | 4 | [2, 1, 3, 0]
15 | Interaction | 4 | [2.0, 1.0, 3.0, 0.0, nan]
16 | Events | 4 | [2.0, 3.0, 1.0, 4.0, nan]
17 | Ideal | 2245 | [nan, Friends+CCas+good result, just want ever...
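
The summary above shows nan among the values of Interaction, Events and Ideal, so it may be worth counting the missing entries before analyzing those columns. Below is a minimal sketch, assuming a small stand-in frame called sample (a hypothetical name) built from the five rows printed by df.head(); in the notebook you would run the same two calls on the full df.

```python
import numpy as np
import pandas as pd

# Stand-in data: the five rows shown by df.head(), NOT the full survey.
sample = pd.DataFrame(
    {
        "Interaction": [2.0, 2.0, 2.0, 1.0, 2.0],
        "Events": [2.0, 3.0, 2.0, 1.0, 2.0],
        "Ideal": [
            np.nan,
            "Friends+CCas+good result",
            "just want everything to go smooth. serious",
            np.nan,
            "a mixture of both academic and non-academic",
        ],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="response_id"),
)

# Count missing entries per column and express them as a share of all rows
missing = pd.DataFrame(
    {
        "Missing": sample.isna().sum(),
        "Share": sample.isna().mean(),
    }
)
print(missing)
```

On the full dataset, replacing sample with df gives the same report for all eighteen columns.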

Identify what variables are worth analyzing further

Create a heatmap to identify correlations

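# Generate correlation matrix from the numeric columns only
# (recent pandas versions raise an error if text columns are passed to corr)
corr = df.select_dtypes(include='number').corr(method='pearson')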

# Generate a mask for the upper triangle
mask = np.triu(np.ones_like(corr, dtype=bool))

# Set up the matplotlib figure
fig, ax = plt.subplots(figsize=(11, 9))                    # Set figure size

# Generate a custom diverging colormap
cmap = sns.diverging_palette(230, 20, as_cmap=True)

# Draw the heatmap with the mask 
sns.heatmap(corr, 
            mask = mask, 
            cmap = cmap, 
            vmax = 1,                                      # Set scale max value
            vmin = -1,                                     # Set scale min value
            center = 0,                                    # Set scale center value
            square = True,                                 # Ensure perfect squares
            linewidths = 1.5,                              # Set linewidth between squares
            cbar_kws = {"shrink": .8},                     # Set size of color bar
            annot = True                                   # Include values within squares
           );

plt.xticks(rotation=90)                                    # Rotate x labels
plt.title('Diagonal Correlation Plot', size=20, y=1.05);   # Set plot title and position
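
A heatmap is easy to scan, but a ranked list of pairs can make the strongest relationships easier to pick out. The sketch below is one possible way to do that, not part of the original template. It assumes a stand-in frame called sample (a hypothetical name) holding six numeric columns from the five rows printed by df.head(). With only five rows the coefficients carry no real meaning; the point is the mechanics.

```python
import numpy as np
import pandas as pd

# Stand-in data: numeric answers from the five rows shown by df.head().
sample = pd.DataFrame(
    {
        "Volunteer": [0, 0, 3, 3, 4],
        "Participate": [1, 1, 1, 4, 3],
        "Interest": [3, 2, 1, 3, 4],
        "Passion": [1, 3, 5, 3, 5],
        "Stress": [1, 1, 2, 7, 4],
        "Satisfaction": [2, 2, 2, 1, 2],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="response_id"),
)

corr = sample.corr(method="pearson")

# Take each pair once: positions strictly below the diagonal,
# i.e. the same cells the masked heatmap displays.
rows, cols = np.tril_indices(len(corr), k=-1)
pairs = pd.Series(
    corr.to_numpy()[rows, cols],
    index=pd.MultiIndex.from_arrays([corr.index[rows], corr.columns[cols]]),
    name="pearson_r",
)

# Sort by absolute strength so strong negative correlations rank high too
pairs = pairs.sort_values(key=lambda s: s.abs(), ascending=False)
print(pairs.head())
```

In the notebook, the corr matrix computed for the heatmap can be used directly in place of the one built from sample.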

Answer interesting questions:

Now you get to explore this exciting dataset! Can't think of where to start? No worries, we've got you covered. Try your hand at these questions:

  • Are international students happier than domestic students?
  • How does the number of events students attend influence their stress levels?
  • Does the type of housing influence stress levels, passion and happiness?
# Start coding
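
As a possible starting point for the first and third questions, the sketch below compares group averages. It rests on several assumptions that are not stated in the template: Satisfaction is used as a stand-in for happiness, students whose Citizenship is "Foreigner" are treated as international and everyone else as domestic, and sample (a hypothetical name) contains only the five rows printed by df.head(). Five rows are far too few to support any conclusion, so treat the numbers as a demonstration of the approach only.

```python
import pandas as pd

# Stand-in data: selected columns from the five rows shown by df.head().
sample = pd.DataFrame(
    {
        "Citizenship": ["Foreigner", "Country Citizen", "Foreigner", "Foreigner", "Foreigner"],
        "Housing Type": ["Residences", "Out of Campus", "Halls", "Halls", "Out of Campus"],
        "Passion": [1, 3, 5, 3, 5],
        "Stress": [1, 1, 2, 7, 4],
        "Satisfaction": [2, 2, 2, 1, 2],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="response_id"),
)

# Assumption: "Foreigner" means international; all other values mean domestic
origin = (
    sample["Citizenship"]
    .map(lambda c: "International" if c == "Foreigner" else "Domestic")
    .rename("Origin")
)

# Question 1: group size and mean satisfaction by origin
by_origin = sample.groupby(origin)["Satisfaction"].agg(["count", "mean"])
print(by_origin)

# Question 3: mean stress, passion and satisfaction by housing type
by_housing = sample.groupby("Housing Type")[["Stress", "Passion", "Satisfaction"]].mean()
print(by_housing)
```

Reporting the group size next to each mean helps flag comparisons that rest on very few responses.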