Introduction to Data Visualization with Seaborn

Run the hidden code cell below to import the data used in this course.


Explore Datasets

Use the DataFrames imported in the first cell to explore the data and practice your skills!

  • From country_data, create a scatter plot to look at the relationship between GDP and Literacy. Use color to segment the data points by region.
  • Use mpg to create a line plot with model_year on the x-axis and weight on the y-axis. Create differentiating lines for each country of origin (origin).
  • Create a box plot from student_data to explore the relationship between the number of failures (failures) and the average final grade (G3).
  • Create a bar plot from survey to compare how Loneliness differs across values for Internet usage. Format it to have two subplots for gender.
  • Make sure to add titles and labels to your plots and adjust their format for readability!
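As a starting point for the last two tasks, here is a minimal sketch of a bar plot split into gender subplots, with a title and axis labels added. It does not use the course data: `toy_survey` and its values are made up for illustration, and only the column names (`Internet usage`, `Loneliness`, `Gender`) are assumed to match the survey DataFrame.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the survey data (values are invented)
toy_survey = pd.DataFrame({
    "Internet usage": ["less than an hour a day", "few hours a day"] * 4,
    "Loneliness": [2, 4, 3, 5, 1, 3, 2, 4],
    "Gender": ["female"] * 4 + ["male"] * 4,
})

# One bar per internet-usage level, one subplot column per gender
g = sns.catplot(data=toy_survey, x="Internet usage", y="Loneliness",
                kind="bar", col="Gender")

# Figure-level title (y > 1 lifts it above the subplot titles)
g.fig.suptitle("Average loneliness by internet usage", y=1.03)

# Axis labels and per-subplot titles for readability
g.set_axis_labels("Internet usage", "Mean loneliness score")
g.set_titles("Gender: {col_name}")

# Show plot
plt.show()
```

Because `catplot()` returns a `FacetGrid`, the title and labels are set on the returned object `g` rather than on a single set of axes.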
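# Import the plotting libraries (skip if the hidden setup cell already did this)
import matplotlib.pyplot as plt
import seaborn as sns

# Count plot of internet usage; y= draws horizontal bars, use x= for vertical bars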
sns.catplot(y="Internet usage", data=survey_data,
            kind="count")

# Show plot
plt.show()

# Separate into column subplots based on age category
sns.catplot(y="Internet usage", data=survey_data,
            kind="count", col="Age Category")

# Show plot
plt.show()

# Create a bar plot of interest in math, separated by gender
sns.catplot(x="Gender", y="Interested in Math",
            data=survey_data, kind="bar")

# Show plot
plt.show()

# Create a bar plot of the average final grade in each study-time category
sns.catplot(x="study_time", y="G3",
            data=student_data, kind="bar")

# Show plot
plt.show()

# List of categories from lowest to highest
category_order = ["<2 hours", 
                  "2 to 5 hours", 
                  "5 to 10 hours", 
                  ">10 hours"]

# Rearrange the categories
sns.catplot(x="study_time", y="G3",
            data=student_data, order=category_order,
            kind="bar")

# Show plot
plt.show()

# List of categories from lowest to highest
category_order = ["<2 hours", 
                  "2 to 5 hours", 
                  "5 to 10 hours", 
                  ">10 hours"]

# Turn off the confidence intervals
# (seaborn 0.12+ deprecates ci= in favor of errorbar=None)
sns.catplot(x="study_time", y="G3",
            data=student_data,
            kind="bar",
            order=category_order, ci=None)

# Show plot
plt.show()

# Specify the category ordering
study_time_order = ["<2 hours", "2 to 5 hours", 
                    "5 to 10 hours", ">10 hours"]

# Create a box plot and set the order of the categories
sns.catplot(x="study_time", y="G3",
            data=student_data, kind="box",
            order=study_time_order)

# Show plot
plt.show()
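
# Create a box plot with subgroups and omit the outliers
# (showfliers=False hides outlier markers; sym="" is not accepted by newer seaborn releases)
sns.catplot(x="internet", y="G3",
            data=student_data, kind="box",
            hue="location", showfliers=False)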

# Show plot
plt.show()

# Extend the whiskers to the 5th and 95th percentile
sns.catplot(x="romantic", y="G3",
            data=student_data,
            kind="box",
            whis=[5, 95])

# Show plot
plt.show()
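
To see what `whis=[5, 95]` means numerically, the sketch below computes the 5th and 95th percentile bounds for a small list of made-up grades using only the standard library. It assumes the bounds are found by linear interpolation (NumPy's default percentile behavior); `grades` is hypothetical and not taken from `student_data`.

```python
from statistics import quantiles

# Made-up final grades, sorted for readability (illustration only)
grades = [0, 5, 8, 9, 10, 10, 11, 11, 12, 12,
          13, 13, 14, 14, 15, 15, 16, 17, 18, 20]

# n=20 yields cut points at 5%, 10%, ..., 95%;
# method="inclusive" interpolates linearly between data points
cuts = quantiles(grades, n=20, method="inclusive")
low, high = cuts[0], cuts[-1]

# Values outside [low, high] would be drawn as individual outlier markers
outliers = [g for g in grades if g < low or g > high]

print(low, high, outliers)  # 4.75 18.1 [0, 20]
```

With these bounds, each whisker would end at the most extreme grade that still lies inside them (5 and 18 here), and the two remaining grades would appear as separate points.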