Every year, American high school students take the SAT, a standardized test intended to measure literacy, numeracy, and writing skills. It has three sections (reading, math, and writing), each with a maximum score of 800 points. The test matters to both students and colleges because it plays a pivotal role in the admissions process.
Analyzing the performance of schools is important for a variety of stakeholders, including policy and education professionals, researchers, government, and even parents considering which school their children should attend.
You have been provided with a dataset called schools.csv, which is previewed below.
You have been tasked with answering three key questions about New York City (NYC) public school SAT performance.
# Re-run this cell
import pandas as pd
# Read in the data
schools = pd.read_csv("schools.csv")
# Preview the data
schools.head()
# Start coding here...
# Add as many cells as you like...
# Schools whose average math score is at least 640 (80% of the 800-point maximum)
high_math = schools[schools["average_math"] >= 640]
high_math
# Ten highest average math scores, in descending order
best_math = high_math.sort_values("average_math", ascending=False).head(10)
best_math_schools = best_math[["school_name", "average_math"]]
best_math_schools
import matplotlib.pyplot as plt
best_math_schools.plot(x="school_name", y="average_math", kind="bar", legend=False)
plt.title('Top 10 Schools by Average Math Score')
plt.grid(True)
plt.show()
schools.head()
# Total SAT score: sum of the three section averages
schools["total_SAT"] = schools["average_math"] + schools["average_reading"] + schools["average_writing"]
schools.head()
# Ten schools with the highest total SAT score
top_10 = schools.sort_values("total_SAT", ascending=False).head(10)
top_10_schools = top_10[["school_name","total_SAT"]]
top_10_schools
top_10_schools.plot(x="school_name", y="total_SAT", kind="bar", legend=False)
plt.title('Top 10 Schools by SAT Score')
plt.grid(True)
plt.show()
schools.head()
# Number of schools, mean, and standard deviation of total SAT score per borough
school_groupby = schools.groupby("borough")["total_SAT"].agg(
    num_schools="count",
    average_SAT="mean",
    std_SAT="std"
)
school_groupby
school_groupby["average_SAT"] = school_groupby["average_SAT"].round(2)
school_groupby["std_SAT"] = school_groupby["std_SAT"].round(2)
school_groupby
# Borough(s) with the largest standard deviation in total SAT score
largest_std_dev = school_groupby[school_groupby["std_SAT"] == school_groupby["std_SAT"].max()]
largest_std_dev = largest_std_dev.reset_index()
largest_std_dev
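The borough-level steps above can be checked on a small example. The following is a minimal sketch, not the notebook's own method: the `toy` DataFrame and its values are made up for illustration, `borough_sat_summary` is a hypothetical helper name, and `idxmax` is swapped in for the boolean-mask comparison against `.max()`. The column names match those used with `schools.csv` above.

```python
import pandas as pd

# Hypothetical toy data standing in for schools.csv (values are invented)
toy = pd.DataFrame({
    "school_name": ["A", "B", "C", "D", "E"],
    "borough": ["Bronx", "Bronx", "Queens", "Queens", "Queens"],
    "average_math": [400, 500, 600, 450, 700],
    "average_reading": [410, 490, 590, 460, 690],
    "average_writing": [390, 510, 610, 440, 710],
})


def borough_sat_summary(df):
    """Per-borough school count, mean, and std of total SAT, rounded to 2 decimals."""
    # Total SAT is the sum of the three section averages
    total = df["average_math"] + df["average_reading"] + df["average_writing"]
    return (
        df.assign(total_SAT=total)
        .groupby("borough")["total_SAT"]
        .agg(num_schools="count", average_SAT="mean", std_SAT="std")
        .round(2)
    )


summary = borough_sat_summary(toy)

# idxmax returns the index label (the borough) of the row with the largest std;
# wrapping it in a list keeps the result a one-row DataFrame
most_variable = summary.loc[[summary["std_SAT"].idxmax()]].reset_index()
print(most_variable)
```

One difference worth noting: `idxmax` returns only the first borough when several tie for the largest standard deviation, whereas the comparison against `.max()` used above keeps every tied row.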