Analyzing Students Mental Health, Using PostgreSQL
| Field Name | Description | 
|---|---|
| inter_dom | Types of students (international or domestic) | 
| japanese_cate | Japanese language proficiency | 
| english_cate | English language proficiency | 
| academic | Current academic level (undergraduate or graduate) | 
| age | Current age of student | 
| stay | Current length of stay in years | 
| todep | Total score of depression (PHQ-9 test) | 
| tosc | Total score of social connectedness (SCS test) | 
| toas | Total score of acculturative stress (ASISS test) | 
SELECT *
FROM students;Analysing Lenght of Stay Impacts on Mental Health Diagnostic:
Objective: To explore and analyze the students data to see how the length of stay (stay) impacts the average mental health diagnostic scores of the international students present in the study.
- Do international students have higher/lower depression scores than domestic students?
- How does the academic level influence it? Which group (grad/under) is more prone to depression?
- What is the correlation between the total score of depression (todep) and length of stay for international students?
Are international students with a longer stay more prone to depression?
- I selected the stay column and calculated the average scores for todep (PHQ-9 test), tosc (SCS test), and toas (ASISS test), which represent the study’s mental health diagnostic measures. I rounded these values to two decimal places. To make the analysis easier to understand, I mapped the original data columns to my results as follows:
Total score of depression (todep) → average_phq Total score of social connectedness (tosc) → average_scs Total score of acculturative stress (toas) → average_as
- 
Since my focus is on international students, I counted the number of international students (count_int) for each length of stay. 
- 
Finally, I sorted the results by length of stay in descending order. 
SELECT stay,
       COUNT(inter_dom) AS count_int, 
       ROUND(AVG(todep), 2) AS average_phq, 
       ROUND(AVG(tosc), 2) AS average_scs, 
       ROUND(AVG(toas), 2) AS average_as
FROM students
WHERE inter_dom = 'Inter'
GROUP BY stay
ORDER BY 1 DESC;Compare Depression Scores by Academic Level
The previous result made me want to understand more of the data to determine if the academic level could be an influential factor
SELECT stay, 
       academic, 
       COUNT(*) AS count_students, 
       ROUND(AVG(todep), 2) AS average_phq, 
       ROUND(AVG(tosc), 2) AS average_scs, 
       ROUND(AVG(toas), 2) AS average_as
FROM students
WHERE tosc IS NOT NULL 
AND toas IS NOT NULL 
AND todep IS NOT NULL
GROUP BY academic, stay
ORDER BY stay, count_students DESC;We can see that the majority of students are undergraduates
Compare Average Depression Scores Between Groups
The following query calculates the average depression, social connectedness, and acculturative stress for each group.It gives a high-level comparison between the two populations (international vs domestic students)
SELECT inter_dom, 
       COUNT(*) AS count_students, 
       ROUND(AVG(todep), 2) AS average_phq, 
       ROUND(AVG(tosc), 2) AS average_scs, 
       ROUND(AVG(toas), 2) AS average_as
FROM students
WHERE todep IS NOT NULL 
AND tosc IS NOT NULL 
AND toas IS NOT NULL
GROUP BY inter_dom;Compare Depression Scores by Stay Duration for Both Groups
- Do international students' depression scores increase with longer stays?
SELECT stay, 
       inter_dom, 
       COUNT(*) AS count_students, 
       ROUND(AVG(todep), 2) AS average_phq, 
       ROUND(AVG(tosc), 2) AS average_scs, 
       ROUND(AVG(toas), 2) AS average_as
FROM students
WHERE todep IS NOT NULL 
AND tosc IS NOT NULL 
AND toas IS NOT NULL
GROUP BY stay, inter_dom
ORDER BY stay DESC, inter_dom;What is the correlation between the the total score of depression (todep) and lenght of stay?
I used the PostgreSQL CORR () function to calculate how stay relates to depression and grouped the results by academic level.