Does going to university in a different country affect your mental health? A Japanese international university surveyed its students in 2018 and published a study the following year that was approved by several ethical and regulatory boards.

The study found that international students have a higher risk of mental health difficulties than the general population, and that social connectedness (belonging to a social group) and acculturative stress (stress associated with joining a new culture) are predictive of depression.

Explore the students data using PostgreSQL to find out if you would come to a similar conclusion for international students and see if the length of stay is a contributing factor.

Here is a data description of the columns you may find helpful.

| Field Name | Description |
| --- | --- |
| inter_dom | Types of students (international or domestic) |
| japanese_cate | Japanese language proficiency |
| english_cate | English language proficiency |
| academic | Current academic level (undergraduate or graduate) |
| age | Current age of student |
| stay | Current length of stay in years |
| todep | Total score of depression (PHQ-9 test) |
| tosc | Total score of social connectedness (SCS test) |
| toas | Total score of acculturative stress (ASISS test) |

-- Run this code to save the CSV file as students
SELECT * 
FROM 'students.csv';

Below, I'm counting all of the records in the students data.

SELECT
    COUNT(*) AS Total_Records
FROM students;

Below, I'm counting all records per student type.

SELECT inter_dom,
    COUNT(*) AS Record_Count
FROM students
GROUP BY inter_dom;

Below, I'm filtering the data to see how it differs between the student types. To switch groups, swap the active WHERE clause between the values 'Inter' and 'Dom', or use the condition IS NULL to see the records with unknown status.

SELECT *
FROM students
--WHERE inter_dom = 'Inter';
WHERE inter_dom = 'Dom';
--WHERE inter_dom IS NULL;

Below, I'm finding the summary statistics of the diagnostic tests for all students using aggregate functions, rounding the average scores to two decimal places and using aliases.

In this example we're using the diagnostic tests for:

  • todep: Total score of depression (PHQ-9 test)
  • tosc: Total score of social connectedness (SCS test)
  • toas: Total score of acculturative stress (ASISS test)

SELECT
    MIN(todep) AS min_phq,
    MAX(todep) AS max_phq,
    ROUND(AVG(todep), 2) AS avg_phq,
    MIN(tosc) AS min_scs,
    MAX(tosc) AS max_scs,
    ROUND(AVG(tosc), 2) AS avg_scs,
    MIN(toas) AS min_as,
    MAX(toas) AS max_as,
    ROUND(AVG(toas), 2) AS avg_as  -- no trailing comma before FROM
FROM students;

Below, I'm summarizing the same statistics per student type, so the row for international students can be compared with the other groups.

SELECT
    inter_dom,
    MIN(todep) AS min_phq,
    MAX(todep) AS max_phq,
    ROUND(AVG(todep), 2) AS avg_phq,
    MIN(tosc) AS min_scs,
    MAX(tosc) AS max_scs,
    ROUND(AVG(tosc), 2) AS avg_scs,
    MIN(toas) AS min_as,
    MAX(toas) AS max_as,
    ROUND(AVG(toas), 2) AS avg_as
FROM students
GROUP BY inter_dom;
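
If only the international students' row is of interest, one option is to filter before aggregating. This is a minimal sketch, assuming the same students table and that 'Inter' is the value marking international students, as in the filter query earlier:

```sql
-- Sketch: summary statistics restricted to international students.
-- Assumes inter_dom = 'Inter' identifies international students.
SELECT
    MIN(todep) AS min_phq,
    MAX(todep) AS max_phq,
    ROUND(AVG(todep), 2) AS avg_phq,
    MIN(tosc) AS min_scs,
    MAX(tosc) AS max_scs,
    ROUND(AVG(tosc), 2) AS avg_scs,
    MIN(toas) AS min_as,
    MAX(toas) AS max_as,
    ROUND(AVG(toas), 2) AS avg_as
FROM students
WHERE inter_dom = 'Inter';  -- filter rows before the aggregates run
```

The WHERE clause is applied before aggregation, so the result is a single row covering international students only.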

Below, I'm finding the average scores by length of stay for international students and viewing them in descending order of stay.

stay - Current length of stay in years
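
The step described above could be written roughly as follows. This is a sketch rather than the original cell: the count_int alias and the choice to order by stay (rather than by one of the averages) are assumptions, and the other aliases reuse the naming from the earlier queries.

```sql
-- Sketch: average test scores per length of stay, international students only.
-- count_int is a hypothetical alias; ordering by stay is an assumption.
SELECT
    stay,
    COUNT(*) AS count_int,               -- number of students per length of stay
    ROUND(AVG(todep), 2) AS avg_phq,     -- depression (PHQ-9)
    ROUND(AVG(tosc), 2) AS avg_scs,      -- social connectedness (SCS)
    ROUND(AVG(toas), 2) AS avg_as        -- acculturative stress (ASISS)
FROM students
WHERE inter_dom = 'Inter'
GROUP BY stay
ORDER BY stay DESC;
```

Including the per-group count helps when reading the averages, since groups with very few students can produce averages that are not representative.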