GOOD NOTES Introduction to Data Visualization with Seaborn
Introduction to Data Visualization with Seaborn
Run the hidden code cell below to import the data used in this course.
1 hidden cell
Take Notes
Add notes about the concepts you've learned and code cells with code you want to keep.
Add your notes here
# Add your code snippets here

Explore Datasets
Use the DataFrames imported in the first cell to explore the data and practice your skills!
- From `country_data`, create a scatter plot to look at the relationship between GDP and Literacy. Use color to segment the data points by region.
- Use `mpg` to create a line plot with `model_year` on the x-axis and `weight` on the y-axis. Create differentiating lines for each country of origin (`origin`).
- Create a box plot from `student_data` to explore the relationship between the number of failures (`failures`) and the average final grade (`G3`).
- Create a bar plot from `survey` to compare how `Loneliness` differs across values for `Internet usage`. Format it to have two subplots for gender.
- Make sure to add titles and labels to your plots and adjust their format for readability!
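As a starting point for the last two exercises, here is a minimal sketch of a bar plot split into one subplot per gender, with a title and axis labels added. The `toy_survey` DataFrame and its values are hypothetical stand-ins so the example runs on its own; only the column names are taken from the exercise text, and the real data comes from the hidden cell.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the survey data (values invented for illustration)
toy_survey = pd.DataFrame({
    "Internet usage": ["few hours a day", "few hours a day",
                       "most of the day", "most of the day"] * 2,
    "Gender": ["female"] * 4 + ["male"] * 4,
    "Loneliness": [2, 3, 4, 3, 1, 2, 3, 5],
})

# Bar plot of mean Loneliness per Internet usage level, one subplot per gender
g = sns.catplot(data=toy_survey, x="Internet usage", y="Loneliness",
                kind="bar", col="Gender")

# Figure-level title, shared axis labels, and a title for each subplot
g.fig.suptitle("Loneliness by internet usage", y=1.03)
g.set_axis_labels("Internet usage", "Average loneliness")
g.set_titles("Gender: {col_name}")

# Show plot
plt.show()
```

`catplot()` returns a `FacetGrid`, which is why the title and labels are set through `g` rather than through a single Axes object.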
import matplotlib.pyplot as plt
import seaborn as sns

# Count plot with horizontal bars; use x= instead of y= for vertical bars
sns.catplot(y="Internet usage", data=survey_data,
            kind="count")
# Show plot
plt.show()

# Separate into column subplots based on age category
sns.catplot(y="Internet usage", data=survey_data,
            kind="count", col="Age Category")
# Show plot
plt.show()

# Create a bar plot of interest in math, separated by gender
sns.catplot(data=survey_data, x="Gender", y="Interested in Math", kind="bar")
# Show plot
plt.show()

# Create bar plot of average final grade in each study category
sns.catplot(data=student_data, x="study_time", y="G3", kind="bar")
# Show plot
plt.show()

# List of categories from lowest to highest
category_order = ["<2 hours",
                  "2 to 5 hours",
                  "5 to 10 hours",
                  ">10 hours"]
# Rearrange the categories
sns.catplot(x="study_time", y="G3",
            data=student_data, order=category_order,
            kind="bar")
# Show plot
plt.show()

# List of categories from lowest to highest
category_order = ["<2 hours",
                  "2 to 5 hours",
                  "5 to 10 hours",
                  ">10 hours"]
# Turn off the confidence intervals
# (ci=None is deprecated from seaborn 0.12 on; errorbar=None is the replacement there)
sns.catplot(x="study_time", y="G3",
            data=student_data,
            kind="bar",
            order=category_order, ci=None)
# Show plot
plt.show()

# Specify the category ordering
study_time_order = ["<2 hours", "2 to 5 hours",
                    "5 to 10 hours", ">10 hours"]
# Create a box plot and set the order of the categories
sns.catplot(data=student_data, x="study_time", y="G3", kind="box", order=study_time_order)
# Show plot
plt.show()

# Create a box plot with subgroups and omit the outliers
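# (the course used sym="" here; showfliers=False is the matplotlib option that
# hides outlier points and may be needed if a newer seaborn rejects sym)
sns.catplot(data=student_data, x="internet", y="G3", kind="box", hue="location", showfliers=False)
# Show plot
plt.show()

# Extend the whiskers to the 5th and 95th percentile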
sns.catplot(x="romantic", y="G3",
            data=student_data,
            kind="box",
            whis=[5, 95])
# Show plot
plt.show()
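
To make the `whis=[5, 95]` setting above concrete, the sketch below computes the same two percentiles with NumPy and lists the values that would fall outside the whiskers. The `grades` array is hypothetical and only illustrates the calculation; it is not taken from `student_data`.

```python
import numpy as np

# Hypothetical final grades 0, 1, ..., 20, used only to illustrate the calculation
grades = np.arange(0, 21)

# With whis=[5, 95], the whiskers are drawn at the 5th and 95th percentiles
low, high = np.percentile(grades, [5, 95])

# Observations beyond the whiskers are the ones drawn as outlier points
outliers = grades[(grades < low) | (grades > high)]

print(low, high, outliers)
```

For this toy array the whiskers sit at 1 and 19, so only the values 0 and 20 would be plotted as outliers.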