In today's fast-paced and competitive educational environment, understanding the factors that influence student success is more important than ever. Just like the transport system in a bustling city like London must adapt to serve its residents, schools and educators must adapt to meet the needs of students. In this project, we will take a deep dive into a dataset containing rich details about various aspects of student life, such as hours studied, sleep patterns, attendance, and more, to uncover what truly impacts exam performance.
The dataset we'll be working with includes a wide range of factors influencing student performance. By analyzing this data, we'll be able to identify key drivers of success and provide insights that could help students, teachers, and policymakers make informed decisions. The table we'll use for this project is called student_performance and includes the following data:
| Column | Definition | Data type |
|---|---|---|
attendance | Percentage of classes attended | float |
extracurricular_activities | Participation in extracurricular activities | varchar (Yes, No) |
sleep_hours | Average number of hours of sleep per night | float |
tutoring_sessions | Number of tutoring sessions attended per month | integer |
teacher_quality | Quality of the teachers | varchar (Low, Medium, High) |
exam_score | Final exam score | float |
-- Overview of the data
SELECT *
FROM student_performance
LIMIT 30;-- This query calculates the average exam score of students who studied more than 10 hours and participated in extracurricular activities.
SELECT hours_studied, AVG(exam_score) AS avg_exam_score
FROM (SELECT *
FROM student_performance
WHERE hours_studied > 10
AND extracurricular_activities = 'Yes') AS a
GROUP BY hours_studied
ORDER BY hours_studied DESC;Exam Score in relation to the hours studied
-- avg_exam_score_by_hours_studied_range
SELECT CASE
-- Categorizes hours studied into specific ranges: 1-5, 6-10, 11-15, and 16+ hours.
WHEN hours_studied BETWEEN 1 AND 5 THEN '1-5 hours'
WHEN hours_studied BETWEEN 6 AND 10 THEN '6-10 hours'
WHEN hours_studied BETWEEN 11 AND 15 THEN '11-15 hours'
ELSE '16+ hours'
END AS hours_studied_range,
AVG(exam_score) AS avg_exam_score
FROM student_performance
GROUP BY hours_studied_range
ORDER BY avg_exam_score DESC;Exam Score in relation to the hours studied
-- student_exam_ranking
SELECT attendance,
hours_studied,
sleep_hours,
tutoring_sessions,
-- Assigns a rank to each student based on their exam score with highest score ranked first
DENSE_RANK() OVER(ORDER BY exam_score DESC) AS exam_rank
FROM student_performance
ORDER BY exam_rank
LIMIT 30; WITH encoded_student_performance AS (
SELECT
hours_studied,
attendance,
sleep_hours,
tutoring_sessions,
exam_score,
CASE
WHEN extracurricular_activities = 'Yes' THEN 1
ELSE 0
END AS extracurricular_activities
FROM student_performance
)
-- Correlation of hours_studied with other variables
SELECT
'hours_studied' AS measure,
CAST(corr(hours_studied, hours_studied) AS numeric(10,2)) AS hours_studied,
CAST(corr(hours_studied, attendance) AS numeric(10,2)) AS attendance,
CAST(corr(hours_studied, sleep_hours) AS numeric(10,2)) AS sleep_hours,
CAST(corr(hours_studied, tutoring_sessions) AS numeric(10,2)) AS tutoring_sessions,
CAST(corr(hours_studied, exam_score) AS numeric(10,2)) AS exam_score,
CAST(corr(hours_studied, extracurricular_activities) AS numeric(10,2)) AS extracurricular_activities
FROM encoded_student_performance
UNION ALL
-- Correlation of attendance with other variables
SELECT
'attendance' AS measure,
CAST(corr(attendance, hours_studied) AS numeric(10,2)),
CAST(corr(attendance, attendance) AS numeric(10,2)),
CAST(corr(attendance, sleep_hours) AS numeric(10,2)),
CAST(corr(attendance, tutoring_sessions) AS numeric(10,2)),
CAST(corr(attendance, exam_score) AS numeric(10,2)),
CAST(corr(attendance, extracurricular_activities) AS numeric(10,2))
FROM encoded_student_performance
UNION ALL
-- Correlation of sleep_hours with other variables
SELECT
'sleep_hours' AS measure,
CAST(corr(sleep_hours, hours_studied) AS numeric(10,2)),
CAST(corr(sleep_hours, attendance) AS numeric(10,2)),
CAST(corr(sleep_hours, sleep_hours) AS numeric(10,2)),
CAST(corr(sleep_hours, tutoring_sessions) AS numeric(10,2)),
CAST(corr(sleep_hours, exam_score) AS numeric(10,2)),
CAST(corr(sleep_hours, extracurricular_activities) AS numeric(10,2))
FROM encoded_student_performance
UNION ALL
-- Correlation of tutoring_sessions with other variables
SELECT
'tutoring_sessions' AS measure,
CAST(corr(tutoring_sessions, hours_studied) AS numeric(10,2)),
CAST(corr(tutoring_sessions, attendance) AS numeric(10,2)),
CAST(corr(tutoring_sessions, sleep_hours) AS numeric(10,2)),
CAST(corr(tutoring_sessions, tutoring_sessions) AS numeric(10,2)),
CAST(corr(tutoring_sessions, exam_score) AS numeric(10,2)),
CAST(corr(tutoring_sessions, extracurricular_activities) AS numeric(10,2))
FROM encoded_student_performance
UNION ALL
-- Correlation of exam_score with other variables
SELECT
'exam_score' AS measure,
CAST(corr(exam_score, hours_studied) AS numeric(10,2)),
CAST(corr(exam_score, attendance) AS numeric(10,2)),
CAST(corr(exam_score, sleep_hours) AS numeric(10,2)),
CAST(corr(exam_score, tutoring_sessions) AS numeric(10,2)),
CAST(corr(exam_score, exam_score) AS numeric(10,2)),
CAST(corr(exam_score, extracurricular_activities) AS numeric(10,2))
FROM encoded_student_performance
UNION ALL
-- Correlation of extracurricular_activities with other variables
SELECT
'extracurricular_activities' AS measure,
CAST(corr(extracurricular_activities, hours_studied) AS numeric(10,2)),
CAST(corr(extracurricular_activities, attendance) AS numeric(10,2)),
CAST(corr(extracurricular_activities, sleep_hours) AS numeric(10,2)),
CAST(corr(extracurricular_activities, tutoring_sessions) AS numeric(10,2)),
CAST(corr(extracurricular_activities, exam_score) AS numeric(10,2)),
CAST(corr(extracurricular_activities, extracurricular_activities) AS numeric(10,2))
FROM encoded_student_performance;
Overall Insights
-
Study Hours and Attendance: Both study hours and attendance have noticeable positive impacts on exam scores and are important contributors to academic performance.
-
Tutoring Sessions: Tutoring sessions have a positive but smaller impact on exam scores.
-
Extracurricular Activities: Participation in extracurricular activities has a minimal impact on exam scores, suggesting that while they may provide other benefits, they are not significantly affecting exam performance.
-
Sleep Hours: Sleep hours have a negligible impact on exam scores, indicating that other factors play a more crucial role in academic performance.
These insights can help students, educators, and policymakers focus on the most impactful factors to improve academic performance.