Does attending college abroad have an impact on your mental health? After surveying its students in 2018, a Japanese international university released a research the following year that received approval from multiple regulatory and ethical bodies.
According to the study, depression is predicted by social connection (belonging to a social group) and acculturative stress (stress related to assimilating into a new culture), and overseas students are more likely than the general population to experience mental health issues.
By exploring the student data using PostgreSQL, the objective was to understand if there is a link between social connection and acculturative stress on depression levels, and if the duration of their stay at the University plays a role in their overall declining mental health.
# Let's first load and inspect the uploaded CSV file to understand its structure and contents.
import pandas as pd
# Load the CSV file
file_path = 'students.csv'
students = pd.read_csv(file_path)
# Display the first few rows to understand the structure
students.head()Below are some descriptions of columns within the students dataset that may provide useful context for this investigation.
| Field Name | Description |
|---|---|
inter_dom | Types of students (international or domestic) |
japanese_cate | Japanese language proficiency |
english_cate | English language proficiency |
academic | Current academic level (undergraduate or graduate) |
age | Current age of student |
stay | Current length of stay in years |
todep | Total score of depression (PHQ-9 test) |
tosc | Total score of social connectedness (SCS test) |
toas | Total score of acculturative stress (ASISS test) |
- Exploring and understanding the data:
To begin, I conducted a count of total number of records, and grouped the records to differentiate between international and domestic students alike and analyse their respective quantities.
-- Count the number of records in the dataset
SELECT
COUNT(*) AS total_records
FROM
students;-- How many international and domestic students are there?
SELECT
inter_dom,
COUNT(*) AS count_inter_dom
FROM
students
WHERE
inter_dom IS NOT NULL
GROUP BY
inter_dom
ORDER BY
count_inter_dom DESC;- Now we want to understand and query the summary statistics of diagnostic scores for all students
Here we will be retreiving summary statistics for each diagnostic test (PHQ-9, SCS & ASISS) utilising aggregate functions. These functions will help to understand the different tests and what the scores mean.
-- Summary statistics of diagnostic scores for international students
SELECT
MIN(todep) min_phq,
MAX(todep) AS max_phq,
ROUND(AVG(todep),2) AS avg_phq,
MIN(tosc) min_scs,
MAX(tosc) AS max_scs,
ROUND(AVG(tosc),2) AS avg_scs,
MIN(toas) min_as,
MAX(toas) AS max_as,
ROUND(AVG(toas),2) AS avg_as
FROM students
WHERE inter_dom = 'Inter';- Filter the data for context on international students
Here I wanted to understand more about the students to better analyse the data and provide relevant recommendations, taking into account demographics such as age, nationality, gender & education.
-- Understand the distribution of students across age and region
SELECT
age,
region AS nationality,
COUNT(*) AS student_count
FROM
students
WHERE
inter_dom = 'Inter'
GROUP BY
age, region
ORDER BY
student_count DESC;-- To calculate the proportion of each gender among international students with average diagnostics
SELECT
gender,
COUNT(*) AS gender_count,
ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM students WHERE inter_dom = 'Inter'), 2) AS gender_percentage,
ROUND(AVG(todep),2) AS average_dep,
ROUND(AVG(tosc),2) AS average_scs,
ROUND(AVG(toas),2) AS average_as
FROM
students
WHERE
inter_dom = 'Inter'
GROUP BY
gender
ORDER BY
gender_percentage DESC;-- To analyse living situations with average diagnostics
SELECT
alone_bi AS lives_alone,
COUNT(*) AS student_count,
ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM students WHERE inter_dom = 'Inter'), 2) AS percentage,
ROUND(AVG(todep),2) AS average_dep,
ROUND(AVG(tosc),2) AS average_scs,
ROUND(AVG(toas),2) AS average_as
FROM
students
WHERE
inter_dom = 'Inter'
GROUP BY
alone_bi;This information provides important context for further analysis to understand what influences depression levels and helps to prevent bias in our analysis. It will also assist in informing us when deciding on actionable strategies to combat mental health in international students.
Key Insights:
- The majority of students in the dataset are aged 18-21, and are likely navigating adult challenges alongside cultural adjustment.
- Most students come from SEA (South East Asia), providing context when designing mental health support programs.
- Nearly 65% of students in the dataset are Female, highlighting the need to consider gender-specific mental health concerns.
- 27% of students live alone, a factor strongly associated with social isolation and higher depression levels.