Project: Analyzing Students' Mental Health in SQL
    Does going to university in a different country affect your mental health? A Japanese international university surveyed its students in 2018 and published a study the following year that was approved by several ethical and regulatory boards.

    The study found that international students have a higher risk of mental health difficulties than the general population, and that social connectedness (belonging to a social group) and acculturative stress (stress associated with joining a new culture) are predictive of depression.

    Explore the students data using PostgreSQL to find out if you would come to a similar conclusion for international students and see if the length of stay is a contributing factor.

    Here is a data description of the columns you may find helpful.

    Field Name       Description
    inter_dom        Types of students (international or domestic)
    japanese_cate    Japanese language proficiency
    english_cate     English language proficiency
    academic         Current academic level (undergraduate or graduate)
    age              Current age of student
    stay             Current length of stay in years
    todep            Total score of depression (PHQ-9 test)
    tosc             Total score of social connectedness (SCS test)
    toas             Total score of acculturative stress (ASISS test)
    See if length of stay impacts the average diagnostic scores rounded to two decimal places for international students, and order the results by descending order of the length of stay.
    -- Average diagnostic scores per length of stay (international students only)

    SELECT stay,
           ROUND(AVG(todep), 2) AS average_phq,
           ROUND(AVG(tosc), 2) AS average_scs,
           ROUND(AVG(toas), 2) AS average_as
    FROM students
    WHERE inter_dom = 'Inter'
    GROUP BY stay
    ORDER BY stay DESC;
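
    An average for a given length of stay can rest on only a handful of students, so it may help to show the group size next to each score. The following is a minimal sketch against the same `students` table; the `count_int` alias is a name chosen here for illustration, not one the notebook uses.

```sql
-- Same grouping as the query above, plus the number of
-- international students behind each average.
SELECT stay,
       COUNT(*) AS count_int,              -- illustrative alias
       ROUND(AVG(todep), 2) AS average_phq,
       ROUND(AVG(tosc), 2) AS average_scs,
       ROUND(AVG(toas), 2) AS average_as
FROM students
WHERE inter_dom = 'Inter'
GROUP BY stay
ORDER BY stay DESC;
```

    Rows with a small `count_int` are worth reading cautiously, since one or two students can move the average a lot.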

    Counting all of the records in the data

    There are 268 students in total:

    • 67 domestic students
    • 201 international students
    -- Check the dataset
    SELECT *
    FROM students;

    -- Domestic students count
    SELECT COUNT(inter_dom) AS dom_students
    FROM students
    WHERE inter_dom = 'Dom';

    -- International students count
    SELECT COUNT(inter_dom) AS international_students
    FROM students
    WHERE inter_dom = 'Inter';

    -- Check for NULL values.
    -- COUNT(*) is needed here: COUNT(inter_dom) skips NULLs and would always return 0.
    SELECT COUNT(*) AS null_values
    FROM students
    WHERE inter_dom IS NULL;

    -- All students with a recorded student type
    -- (COUNT(inter_dom) counts only non-NULL values)
    SELECT COUNT(inter_dom) AS all_students
    FROM students;
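
    The separate counts can also be collected in one pass. This is a sketch rather than the notebook's own query, and the `student_count` alias is illustrative. Because `GROUP BY` puts NULLs in a group of their own, any rows without a student type show up as a separate line.

```sql
-- One row per student type; rows where inter_dom is NULL
-- form their own group, so missing values are visible too.
SELECT inter_dom,
       COUNT(*) AS student_count           -- illustrative alias
FROM students
GROUP BY inter_dom
ORDER BY student_count DESC;
```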
    
    

    Filter the data to see how it differs between the student types.

    After checking the dataset, I found the following:

    1. First, there is the region column, which holds the region each student comes from.
    2. The gender column contains Male and Female values.
    3. Other important columns are age and stay_cate.
    4. I think japanese_cate is also an important column for relating mental health to how well students can communicate: if they communicate well in Japanese, does that affect their mental health or not?
    5. The same goes for english_cate, as English is the most widely used language in the world.
    6. The last important column is religion. My hypothesis is that religious students are less likely to attempt suicide, and it will be interesting to check this against the dataset.
    -- start by checking the dataset again.
    SELECT * 
    FROM students
    WHERE religion = 'Yes';
    
    
    -- Average depression score for all students
    SELECT ROUND(AVG(todep), 2) AS avg_dep_score
    FROM students;

    -- Average depression score for religious students
    SELECT ROUND(AVG(todep), 2) AS avg_dep_score
    FROM students
    WHERE religion = 'Yes';

    -- Average depression score for non-religious students
    SELECT ROUND(AVG(todep), 2) AS avg_dep_score
    FROM students
    WHERE religion = 'No';


    -- Check whether region affects the depression score: list the regions first
    SELECT DISTINCT region
    FROM students;
    
    
    -- AVG dep for EA region is 8.2
    SELECT round(avg(todep), 2)
    FROM students
    WHERE region = 'EA';
    
    -- AVG dep for SEA region is 8.2
    SELECT round(avg(todep), 2)
    FROM students
    WHERE region = 'SEA';
    
    -- AVG dep for SA region is 7
    SELECT round(avg(todep), 2)
    FROM students
    WHERE region = 'SA';
    
    -- AVG dep for JAP region is 8.6
    SELECT round(avg(todep), 2)
    FROM students
    WHERE region = 'JAP';
    
    -- AVG dep for Others region is 7
    SELECT round(avg(todep), 2)
    FROM students
    WHERE region = 'Others';
    
    
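    -- AVG dep by gender (Male: 8, Female: 8.4); change 'Female' to 'Male' for the other figure
    SELECT ROUND(AVG(todep), 2)
    FROM students
    WHERE gender = 'Female';

    -- Check the average Japanese proficiency score
    SELECT AVG(japanese)
    FROM students;

    -- AVG Japanese score is 3; AVG dep for students above it
    SELECT ROUND(AVG(todep), 2)
    FROM students
    WHERE japanese > 3;


    -- AVG English score is 3.6; AVG dep for students above it
    -- (assumes a numeric english column alongside japanese)
    SELECT ROUND(AVG(todep), 2)
    FROM students
    WHERE english > 3.6;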
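
    Running one filtered query per category works, but a single `GROUP BY` returns every category side by side and makes the comparison easier to read. The following is a sketch under the assumption that the `region` and `todep` columns are as used above; the aliases are illustrative.

```sql
-- Average depression score and group size for every region at once.
-- NULLS LAST keeps any group without a score at the bottom.
SELECT region,
       COUNT(*) AS student_count,              -- illustrative alias
       ROUND(AVG(todep), 2) AS avg_dep_score
FROM students
GROUP BY region
ORDER BY avg_dep_score DESC NULLS LAST;
```

    Replacing `region` with `gender` or `religion` in both the `SELECT` list and the `GROUP BY` clause gives the same breakdown for those columns.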
    
    

    Findings

    1. Religious students have an average total depression score of 7.3 and non-religious students 8.6. There is a difference, but it is not large.
    2. Students from Japan have the highest average depression score (8.6).
    3. Female students score higher on average (8.4) than male students (8).
    4. Students with above-average Japanese proficiency still have an average depression score of 8.2.
    5. The same holds for English, with an average depression score of 8.1.
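
    The language findings compare students above the average proficiency with the overall figure. A more direct comparison puts the above-average and not-above-average groups next to each other. The following is a minimal sketch, assuming `japanese` is the numeric proficiency column used in the queries above; the aliases are illustrative.

```sql
-- Split students by whether their Japanese score is above the
-- overall average, then compare depression scores between the groups.
SELECT japanese > (SELECT AVG(japanese) FROM students) AS above_avg_japanese,
       COUNT(*) AS student_count,
       ROUND(AVG(todep), 2) AS avg_dep_score
FROM students
WHERE japanese IS NOT NULL      -- leave out students with no score
GROUP BY 1;                     -- group by the first output column
```

    Under the same assumption about column names, swapping `japanese` for the English proficiency column gives the equivalent comparison for English.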
    Find the summary statistics of the diagnostic tests for all students using aggregate functions, rounding the test scores to two decimal places, remembering to use aliases.

    For all students:

    1. Avg depression score: 8.2
    2. Avg social connectedness: 37.5
    3. Avg acculturative stress: 72.3

    For international students:

    1. Avg depression score: 8.04
    2. Avg social connectedness: 37.4
    3. Avg acculturative stress: 75.6
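    -- First, for all students

    SELECT * FROM students;

    SELECT ROUND(AVG(todep), 2) AS avg_dep_score, ROUND(AVG(tosc), 2) AS avg_social_connectedness, ROUND(AVG(toas), 2) AS avg_acculturative_stress
    FROM students;

    -- Then, for international students only
    SELECT ROUND(AVG(todep), 2) AS avg_dep_score, ROUND(AVG(tosc), 2) AS avg_social_connectedness, ROUND(AVG(toas), 2) AS avg_acculturative_stress
    FROM students
    WHERE inter_dom = 'Inter';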
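
    Summary statistics usually cover more than the mean. As a sketch of how the minimum, maximum and average of each test could be reported per student type in one query (the aliases are illustrative, and rows without a student type are left out):

```sql
-- Min, max and rounded average of each diagnostic test,
-- reported separately for domestic and international students.
SELECT inter_dom,
       MIN(todep) AS min_phq, MAX(todep) AS max_phq, ROUND(AVG(todep), 2) AS avg_phq,
       MIN(tosc)  AS min_scs, MAX(tosc)  AS max_scs, ROUND(AVG(tosc), 2)  AS avg_scs,
       MIN(toas)  AS min_as,  MAX(toas)  AS max_as,  ROUND(AVG(toas), 2)  AS avg_as
FROM students
WHERE inter_dom IS NOT NULL
GROUP BY inter_dom;
```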