In today's fast-paced and competitive educational environment, understanding the factors that influence student success is more important than ever. Just like the transport system in a bustling city like London must adapt to serve its residents, schools and educators must adapt to meet the needs of students. In this project, we will take a deep dive into a dataset containing rich details about various aspects of student life, such as hours studied, sleep patterns, attendance, and more, to uncover what truly impacts exam performance.
The dataset we'll be working with includes a wide range of factors influencing student performance. By analyzing this data, we'll be able to identify key drivers of success and provide insights that could help students, teachers, and policymakers make informed decisions. The table we'll use for this project is called student_performance and includes the following data:
| Column | Definition | Data type |
|---|---|---|
attendance | Percentage of classes attended | float |
extracurricular_activities | Participation in extracurricular activities | varchar (Yes, No) |
sleep_hours | Average number of hours of sleep per night | float |
tutoring_sessions | Number of tutoring sessions attended per month | integer |
teacher_quality | Quality of the teachers | varchar (Low, Medium, High) |
exam_score | Final exam score | float |
You will execute SQL queries to answer three questions, as listed in the instructions.
-- Do more study hours and extracurricular activities lead to better scores? Analyze how studying more than 10 hours per week, while also participating in extracurricular activities, impacts exam performance.
-- WHERE extracurricular = 'Yes' AND hours studied > 10
-- FILTER hours_studied, avg_exam_score (AVG)
-- GROUP BY hours_studied
-- ORDER BY hours_studied DESC
SELECT hours_studied, AVG(exam_score) AS avg_exam_score
FROM student_performance
WHERE extracurricular_activities = 'Yes'
AND hours_studied > 10
GROUP BY hours_studied
ORDER BY hours_studied DESC;
-- Looks like there is a positive correlation between hours studied and average exam score. So this would suggest that extracurricular activities and hours studied do improve average exam score when more than 10 hours studied.-- Is there a sweet spot for study hours? Explore how different ranges of study hours impact exam performance by calculating the average exam score for each study range.
-- CASE WHEN for hours studied per week. END AS hours_studied_range
-- FILTER hours_studied_range, avg_exam_score
-- GROUP BY hours_studied_range
-- ORDER BY avg_exam_score DESC
SELECT CASE WHEN hours_studied BETWEEN 1 AND 5 THEN '1-5 hours'
WHEN hours_studied BETWEEN 6 AND 10 THEN '6-10 hours'
WHEN hours_studied BETWEEN 11 AND 15 THEN '11-15 hours'
ELSE '16+ hours' END AS hours_studied_range,
AVG(exam_score) AS avg_exam_score
FROM student_performance
GROUP BY hours_studied_range
ORDER BY avg_exam_score DESC;
-- Judging by the results, there's not much difference between the study ranges. However it still follows the trend that the more you study, the greater the exam score.-- A teacher wants to show their students their relative rank in the class, without revealing their exam scores to each other.
-- Window function to assign ranks based on exam score
-- Can share same rank if same score and no skipped ranks (DENSE_RANK)
-- FILTER attendance, hours_studied, sleep_hours, tutoring_sessions & exam_rank
-- ORDER BY exam_rank ASC
-- LIMIT 30
SELECT attendance, hours_studied, sleep_hours, tutoring_sessions,
DENSE_RANK() OVER (ORDER BY exam_score DESC) AS exam_rank
FROM student_performance
ORDER BY exam_rank ASC
LIMIT 30;
-- No ranks were skipped and in the top 30 students. T14 was the lowest rank.