Exam Scores Descriptive Analysis
💾 The data
The file has the following fields (source):
- "gender" - male / female
- "race/ethnicity" - one of 5 combinations of race/ethnicity
- "parent_education_level" - highest education level of either parent
- "lunch" - whether the student receives free/reduced or standard lunch
- "test_prep_course" - whether the student took the test preparation course
- "math" - exam score in math
- "reading" - exam score in reading
- "writing" - exam score in writing
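Before analysing the file, it can help to confirm that it actually contains the fields listed above. The following is a minimal sketch, not part of the original notebook: `check_schema` is a hypothetical helper, and the tiny `toy` frame (including its category labels such as "none" and "completed") is made-up placeholder data used only to exercise it.

```python
import pandas as pd

# Column names as listed in the field description above.
EXPECTED_COLUMNS = [
    "gender", "race/ethnicity", "parent_education_level", "lunch",
    "test_prep_course", "math", "reading", "writing",
]
SCORE_COLUMNS = ["math", "reading", "writing"]


def check_schema(df: pd.DataFrame) -> dict:
    """Hypothetical helper: report missing columns and non-numeric score columns."""
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    non_numeric = [
        c for c in SCORE_COLUMNS
        if c in df.columns and not pd.api.types.is_numeric_dtype(df[c])
    ]
    return {"missing": missing, "non_numeric_scores": non_numeric}


# Made-up rows for illustration only; these are NOT values from exams.csv.
toy = pd.DataFrame({
    "gender": ["female", "male"],
    "race/ethnicity": ["group A", "group B"],
    "parent_education_level": ["high school", "some college"],
    "lunch": ["standard", "free/reduced"],
    "test_prep_course": ["none", "completed"],
    "math": [70, 65],
    "reading": [80, 60],
    "writing": [75, 62],
})

report = check_schema(toy)
print(report)
```

With the real data, the same call would be `check_schema(df)` once `df` has been loaded; an empty list under both keys suggests the column names and score types match the description.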
Contents
- Import necessary libraries and load the dataset
- What are the average reading scores for students with/without the test preparation course?
- What are the average scores for the different parental education levels?
- Create plots to visualize findings for questions 1 and 2.
- Look at the effects within subgroups. Compare the average scores for students with/without the test preparation course for different parental education levels (e.g., faceted plots).
- The principal wants to know if kids who perform well on one subject also score well on the others. Look at the correlations between scores.
- Summary.
1. Import necessary libraries and load the dataset
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import warnings
warnings.simplefilter("ignore")
df = pd.read_csv('data/exams.csv')
df.head()
2. [Ques 1] What are the average reading scores for students with/without the test preparation course?
df.groupby('test_prep_course')[['reading']].mean()
3. [Ques 2] What are the average scores for the different parental education levels?
df.groupby('parent_education_level')[['math','reading','writing']].mean()
4. Create plots to visualize findings for questions 1 and 2
sns.catplot(x='test_prep_course', y='reading', data=df, kind='bar').set(title='average reading scores for students with/without the test preparation course')
sns.catplot(x='parent_education_level', y='math', data=df, kind='bar').set(title='average scores in math')
plt.xticks(rotation=90)
sns.catplot(x='parent_education_level', y='reading', data=df, kind='bar').set(title='average scores in reading')
plt.xticks(rotation=90)
sns.catplot(x='parent_education_level', y='writing', data=df, kind='bar').set(title='average scores in writing')
plt.xticks(rotation=90)
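The three per-subject plots above can also be driven from a single long-format table, which makes it easier to compare subjects side by side. This is a sketch of one possible reshaping, assuming the column names from the field list; the `toy` frame and the names `long_df`, `subject`, and `score` are illustrative choices, not part of the original notebook.

```python
import pandas as pd

# Made-up rows for illustration only; these are NOT values from exams.csv.
toy = pd.DataFrame({
    "parent_education_level": ["high school", "high school", "some college"],
    "math": [60, 70, 80],
    "reading": [62, 72, 82],
    "writing": [64, 74, 84],
})

# Reshape from one column per subject to one row per (student, subject).
long_df = toy.melt(
    id_vars="parent_education_level",
    value_vars=["math", "reading", "writing"],
    var_name="subject",
    value_name="score",
)

# Mean score for every (education level, subject) pair.
means = long_df.groupby(["parent_education_level", "subject"])["score"].mean()
print(means)
```

A frame shaped like `long_df` can then be passed to a single call such as `sns.catplot(data=long_df, x='parent_education_level', y='score', hue='subject', kind='bar')`, which draws the three subjects as grouped bars in one figure instead of three separate ones.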
5. Look at the effects within subgroups. Compare the average scores for students with/without the test preparation course for different parental education levels
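One way to tabulate this comparison before plotting it is a pivot table with one row per parental education level and one column per test preparation status. The sketch below is an assumption about how this could be done, not the notebook's own method; the `toy` frame and its category labels ("none", "completed") are made-up placeholders, and only the reading score is shown.

```python
import pandas as pd

# Made-up rows for illustration only; these are NOT values from exams.csv.
toy = pd.DataFrame({
    "parent_education_level": [
        "high school", "high school", "some college", "some college",
    ],
    "test_prep_course": ["none", "completed", "none", "completed"],
    "reading": [60, 70, 66, 80],
})

# Rows: education level; columns: test prep status; cells: mean reading score.
table = toy.pivot_table(
    index="parent_education_level",
    columns="test_prep_course",
    values="reading",
    aggfunc="mean",
)

# Gap between students who completed the course and those who did not
# (column labels here are the placeholder labels used in the toy data).
table["difference"] = table["completed"] - table["none"]
print(table)
```

Each row of `table` then shows whether the gap between the two groups holds within a given education level or only appears in the pooled averages.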