Does going to university in a different country affect your mental health? A Japanese international university surveyed its students in 2018 and published a study the following year that was approved by several ethical and regulatory boards.

The study found that international students have a higher risk of mental health difficulties than the general population, and that social connectedness (belonging to a social group) and acculturative stress (stress associated with joining a new culture) are predictive of depression.

Explore the students data using PostgreSQL to find out if you would come to a similar conclusion for international students and see if the length of stay is a contributing factor.

Here is a data description of the columns you may find helpful.

Field Name      Description
inter_dom       Types of students (international or domestic)
japanese_cate   Japanese language proficiency
english_cate    English language proficiency
academic        Current academic level (undergraduate or graduate)
age             Current age of student
stay            Current length of stay in years
todep           Total score of depression (PHQ-9 test)
tosc            Total score of social connectedness (SCS test)
toas            Total score of acculturative stress (ASISS test)

Displaying all available columns and the first five rows

Number of Columns: 50

Which are:

inter_dom, region, gender, academic, age, age_cate, stay, stay_cate, japanese, japanese_cate, english, english_cate, intimate, religion, suicide, dep, deptype, todep, depsev, tosc, apd, ahome, aph, afear, acs, aguilt, amiscell, toas, partner, friends, parents, relative, profess, phone, doctor, reli, alone, others, internet, partner_bi, friends_bi, parents_bi, relative_bi, professional_bi, phone_bi, doctor_bi, religion_bi, alone_bi, others_bi, internet_bi

SELECT * FROM students LIMIT 5;

Number of students Surveyed: 286

SELECT COUNT(*) FROM students;

A.1. Explaining what we know about the columns

  • Without further documentation, we can only infer what each column means from its distinct values and their counts.

A.1.1. Column: inter_dom

Theory of Column: This data pertains to Domestic (Dom) and International (Inter) students. The column includes three categories: Dom, Inter, and NULL. It can be inferred that there are 201 International students, 67 Domestic students, and 18 NULL or empty values.

index  inter_dom
0      Inter
1      Dom
2      NULL
SELECT DISTINCT inter_dom FROM students ORDER BY inter_dom DESC NULLS LAST;

index  inter_dom  count
0      Inter      201
1      Dom        67
2      NULL       18
SELECT inter_dom, COUNT(*) AS count FROM students GROUP BY inter_dom ORDER BY inter_dom DESC NULLS LAST;
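
One detail worth checking when counting a column that contains NULLs: COUNT(column) skips NULL values, while COUNT(*) counts every row in the group, so the two can disagree for the NULL group. The sketch below illustrates this on a handful of made-up rows (not the survey data). It assumes Python's built-in sqlite3 module as a stand-in for the PostgreSQL workbook; the counting rule is standard SQL and is the same in both engines.

```python
import sqlite3

# Toy rows invented for illustration only; they are NOT the survey data.
rows = [("Inter",), ("Inter",), ("Dom",), (None,), (None,)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (inter_dom TEXT)")
conn.executemany("INSERT INTO students VALUES (?)", rows)

query = """
SELECT inter_dom,
       COUNT(inter_dom) AS count_col,   -- counts non-NULL values only
       COUNT(*)         AS count_rows   -- counts every row in the group
FROM students
GROUP BY inter_dom
"""

# Map each category to its (count_col, count_rows) pair.
counts = {row[0]: (row[1], row[2]) for row in conn.execute(query)}

for category, pair in counts.items():
    print(category, pair)
```

On these toy rows the NULL group reports 0 under COUNT(inter_dom) and 2 under COUNT(*), which is why COUNT(*) is the safer choice when the empty rows should appear in the tally.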

A.1.2. Column: region

Theory of Column: This data relates to students from South East Asia (SEA), South Asia (SA), Other Locations (Others), Japan (JAP), and East Asia (EA). The column contains six categories: SEA, SA, Others, JAP, EA, and NULL. It can be inferred that there are 122 students from South East Asia, 18 from South Asia, 11 from Other Locations, 69 from Japan, 48 from East Asia, and 18 NULL or empty values.

index  region
0      SEA
1      SA
2      Others
3      JAP
4      EA
5      NULL
SELECT DISTINCT region FROM students ORDER BY region DESC NULLS LAST;

index  region  count
0      SEA     122
1      SA      18
2      Others  11
3      JAP     69
4      EA      48
5      NULL    18
SELECT region, COUNT(*) AS count FROM students GROUP BY region ORDER BY region DESC NULLS LAST;

A.1.3. Column: gender

Theory of Column: This data pertains to Male and Female students. It can be inferred that there are 170 Female students, 98 Male students, and 18 NULL or empty values.

index  gender
0      Male
1      Female
2      NULL
SELECT DISTINCT gender FROM students ORDER BY gender DESC NULLS LAST;

index  gender  count
0      Male    98
1      Female  170
2      NULL    18
SELECT gender, COUNT(*) AS count FROM students GROUP BY gender ORDER BY gender DESC NULLS LAST;

A.1.4. Column: academic

Theory of Column: This data pertains to Undergraduate (Under) and Graduate (Grad) students. It can be inferred that there are 201 Undergraduate students, 67 Graduate students, and 18 NULL or empty values.

index  academic
0      Under
1      Grad
2      NULL
SELECT DISTINCT academic FROM students ORDER BY academic DESC NULLS LAST;

index  academic  count
0      Under     201
1      Grad      67
2      NULL      18
SELECT academic, COUNT(*) AS count FROM students GROUP BY academic ORDER BY academic DESC NULLS LAST;
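
The four columns above are all profiled with the same query shape, so the pattern can be wrapped in a small helper. This is only a sketch under assumptions: the `profile` function and the `COLUMNS` whitelist are hypothetical names introduced here, the rows are made up, and sqlite3 stands in for PostgreSQL.

```python
import sqlite3

# Toy rows invented for illustration only; they are NOT the survey data.
rows = [
    ("Inter", "SEA", "Female", "Under"),
    ("Inter", "EA", "Male", "Grad"),
    ("Dom", "JAP", "Female", "Under"),
    (None, None, None, None),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (inter_dom TEXT, region TEXT, gender TEXT, academic TEXT)"
)
conn.executemany("INSERT INTO students VALUES (?, ?, ?, ?)", rows)

# Column names cannot be bound as query parameters, so they are checked
# against a fixed whitelist before being placed into the SQL text.
COLUMNS = ("inter_dom", "region", "gender", "academic")


def profile(connection, column):
    """Return (value, row_count) pairs for one categorical column."""
    if column not in COLUMNS:
        raise ValueError(f"unexpected column: {column}")
    sql = f"SELECT {column}, COUNT(*) FROM students GROUP BY {column}"
    return connection.execute(sql).fetchall()


profiles = {column: profile(conn, column) for column in COLUMNS}

for column, result in profiles.items():
    print(column, result)
```

Because every profile uses COUNT(*), the counts for each column add up to the total number of rows, which gives a quick consistency check.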

B. Instructions:

  • Explore and analyze the students data to see how the length of stay (stay) impacts the average mental health diagnostic scores of the international students present in the study.
    • Return a table with nine rows and five columns.
    • The five columns should be aliased as: stay, count_int, average_phq, average_scs, and average_as, in that order.
    • The average columns should contain the average of the todep (PHQ-9 test), tosc (SCS test), and toas (ASISS test) columns for each length of stay, rounded to two decimal places.
    • The count_int column should be the number of international students for each length of stay.
    • Sort the results by the length of stay in descending order.
  • Note: Creating new cells in the workbook will rename the DataFrame. Make sure that your final solution uses the name df.
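
As a rough preview of where these instructions lead, the five requested columns can be produced by one grouped query. The sketch below is an assumption-laden illustration, not the workbook's solution: it runs on a few made-up rows through Python's built-in sqlite3 module rather than PostgreSQL, so it returns two rows instead of the nine expected from the real data.

```python
import sqlite3

# Toy rows invented for illustration only; they are NOT the survey data.
# Columns: inter_dom, stay, todep, tosc, toas
rows = [
    ("Inter", 1, 8, 40, 70),
    ("Inter", 1, 5, 36, 75),
    ("Inter", 2, 10, 30, 80),
    ("Dom", 1, 3, 45, 50),
    ("Dom", 2, 4, 44, 52),
    (None, None, None, None, None),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students "
    "(inter_dom TEXT, stay INTEGER, todep INTEGER, tosc INTEGER, toas INTEGER)"
)
conn.executemany("INSERT INTO students VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT stay,
       COUNT(*)              AS count_int,    -- international students per stay
       ROUND(AVG(todep), 2)  AS average_phq,  -- PHQ-9 depression score
       ROUND(AVG(tosc), 2)   AS average_scs,  -- SCS social connectedness score
       ROUND(AVG(toas), 2)   AS average_as    -- ASISS acculturative stress score
FROM students
WHERE inter_dom = 'Inter'
GROUP BY stay
ORDER BY stay DESC
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

One PostgreSQL-specific caveat: the two-argument ROUND there is defined for numeric values, so if the score columns are stored as floating point the averages may need a cast such as AVG(todep)::numeric before rounding.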

C. Working through the columns, field by field.

  • The columns are built in this order: stay, count_int, average_phq, average_scs, and average_as.

C.1. Column: stay

-- Distinct lengths of stay (in years), longest first.
-- A NULL stay fails the comparison in WHERE, so no separate IS NOT NULL check is needed.
SELECT
    stay
FROM students
WHERE stay >= 1
GROUP BY stay
ORDER BY stay DESC
LIMIT 9;

C.2. Column: count_int

  • The count_int column should be the number of international students for each length of stay.
-- Number of international students for each length of stay.
SELECT
    stay,
    COUNT(inter_dom) AS count_int
FROM students
WHERE stay IS NOT NULL
  AND inter_dom = 'Inter'
GROUP BY stay
ORDER BY stay DESC
LIMIT 9;
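
A quick way to sanity-check the count_int column is to confirm that the per-stay counts add up to the total number of international students. The sketch below does this on made-up rows, again assuming sqlite3 as a stand-in for PostgreSQL; the variable names `per_stay` and `total_int` are introduced here for illustration.

```python
import sqlite3

# Toy rows invented for illustration only; they are NOT the survey data.
rows = [("Inter", 1), ("Inter", 1), ("Inter", 3), ("Dom", 2), (None, None)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (inter_dom TEXT, stay INTEGER)")
conn.executemany("INSERT INTO students VALUES (?, ?)", rows)

# Same shape as the count_int query above, without the LIMIT.
per_stay = conn.execute(
    """
    SELECT stay, COUNT(inter_dom) AS count_int
    FROM students
    WHERE stay IS NOT NULL
      AND inter_dom = 'Inter'
    GROUP BY stay
    ORDER BY stay DESC
    """
).fetchall()

# Total number of international students, regardless of stay.
total_int = conn.execute(
    "SELECT COUNT(*) FROM students WHERE inter_dom = 'Inter'"
).fetchone()[0]

print(per_stay)
print(sum(count for _, count in per_stay), total_int)
```

If the two totals differ on the real table, some international students have a NULL stay or the LIMIT has cut off a group.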