
Does going to university in a different country affect your mental health? A Japanese international university surveyed its students in 2018 and published a study the following year that was approved by several ethical and regulatory boards.

The study found that international students have a higher risk of mental health difficulties than the general population, and that social connectedness (belonging to a social group) and acculturative stress (stress associated with joining a new culture) are predictive of depression.

Explore the students data using PostgreSQL to find out if you would come to a similar conclusion for international students and see if the length of stay is a contributing factor.

Here is a data description of the columns you may find helpful.

Field Name       Description
inter_dom        Types of students (international or domestic)
japanese_cate    Japanese language proficiency
english_cate     English language proficiency
academic         Current academic level (undergraduate or graduate)
age              Current age of student
stay             Current length of stay in years
todep            Total score of depression (PHQ-9 test)
tosc             Total score of social connectedness (SCS test)
toas             Total score of acculturative stress (ASISS test)
-- Run this code to save the CSV file as students
SELECT * 
FROM 'students.csv';

The query returns 286 rows and 50 columns. Of these 50 columns, the most important for this project are todep, tosc, and toas.

The ‘todep’ field represents the total score of depression (PHQ-9 test), the ‘tosc’ field represents the total score of social connectedness (SCS test), and the ‘toas’ field represents the total score of acculturative stress (ASISS test).

-- Number of records in the dataset
SELECT COUNT(*) AS total_records
FROM students;

The query confirms that the dataset consists of 286 records, that is, data for 286 students.

-- Number of International and Domestic students
-- COUNT(*) is used so that rows with a NULL inter_dom are counted too
SELECT inter_dom, COUNT(*) AS total_students
FROM students
GROUP BY 1;

Out of 286 students, 201 are international and 67 are domestic. That accounts for only 268 records, so the remaining 18 fall into neither group.
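The gap between 286 and 201 + 67 = 268 is worth a closer look, because COUNT(column) skips NULL values while COUNT(*) counts every row. A minimal sketch of the difference, using a hypothetical four-row table named toy rather than the real students data:

```sql
-- Toy rows (not the real data): three labelled students and one missing label
WITH toy(inter_dom) AS (
	VALUES ('Inter'), ('Inter'), ('Dom'), (NULL)
)
SELECT
	COUNT(*) AS all_rows,               -- counts every row, including the NULL one
	COUNT(inter_dom) AS labelled_rows   -- counts only rows where inter_dom is not NULL
FROM toy;
-- all_rows = 4, labelled_rows = 3
```

Comparing the two counts on a column is a quick way to see how many values are missing before grouping by it.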

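-- Find the students whose inter_dom is neither Dom nor Inter, including missing values.
-- AND binds more tightly than OR; the parentheses make that grouping explicit.
SELECT *
FROM students
WHERE (inter_dom NOT LIKE 'D%'
	AND inter_dom NOT LIKE 'I%')
	OR inter_dom IS NULL;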
-- Number of international students from different regions.
SELECT region, COUNT(inter_dom) AS total_students
FROM students
WHERE inter_dom = 'Inter'
GROUP BY 1;
-- Find the summary statistics for each diagnostic test using aggregate functions for all students.

SELECT 
	MIN(todep) AS min_phq,
	MAX(todep) AS max_phq,
	ROUND(AVG(todep),2) AS avg_phq,
	MIN(tosc) AS min_scs,
	MAX(tosc) AS max_scs,
	ROUND(AVG(tosc),2) AS avg_scs,
	MIN(toas) AS min_as,
	MAX(toas) AS max_as,
	ROUND(AVG(toas),2) AS avg_as
FROM students;

This query uses the aggregate functions MIN(), MAX(), and AVG() to compute summary statistics for each test across all students.

-- Find the summary statistics for each diagnostic test using aggregate functions for Inter and Dom students only.

SELECT 
	inter_dom,
	MIN(todep) AS min_phq,
	MAX(todep) AS max_phq,
	ROUND(AVG(todep),2) AS avg_phq,
	MIN(tosc) AS min_scs,
	MAX(tosc) AS max_scs,
	ROUND(AVG(tosc),2) AS avg_scs,
	MIN(toas) AS min_as,
	MAX(toas) AS max_as,
	ROUND(AVG(toas),2) AS avg_as
FROM students
WHERE inter_dom IN ('Inter', 'Dom')
GROUP BY 1;

The same aggregate functions are used again, this time grouped by inter_dom so that international and domestic students can be compared side by side.
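The introduction also asks whether length of stay is a contributing factor. One possible way to examine that, sketched here under the assumption that the students table and the stay, todep, tosc, and toas columns are as described above, is to average each score per length of stay for international students only. The output column names are illustrative choices, not taken from the study.

```sql
-- Average test scores for international students, broken down by length of stay
SELECT
	stay,
	COUNT(*) AS count_int,                 -- number of international students with this stay
	ROUND(AVG(todep), 2) AS average_phq,   -- depression (PHQ-9)
	ROUND(AVG(tosc), 2) AS average_scs,    -- social connectedness (SCS)
	ROUND(AVG(toas), 2) AS average_as      -- acculturative stress (ASISS)
FROM students
WHERE inter_dom = 'Inter'
GROUP BY stay
ORDER BY stay DESC;
```

Including count_int alongside the averages helps when reading the result, since an average based on only a handful of students at a given length of stay should be interpreted with caution.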