Intermediate Data Visualization with Seaborn

Run the hidden code cell below to import the data used in this course.

Take Notes

Add notes about the concepts you've learned and code cells with code you want to keep.

Add your notes here

# Add your code snippets here

Explore Datasets

Use the DataFrames imported in the first cell to explore the data and practice your skills!

Use lmplot() to look at the relationship between temp and total_rentals from bike_share. Plot two regression lines for working and non-working days (workingday).
Create a heat map from daily_show to see how the types of guests (Group) have changed yearly.
Explore the variables from insurance and their relationships by creating pairwise plots and experimenting with different variables and plot types. You can also use color to segment the data visually by region.
Make sure to add titles and labels to your plots and adjust their format for readability!
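As a starting point for the heat map exercise above, here is a minimal sketch on made-up data. The real daily_show DataFrame is loaded in the hidden cell, so the toy frame below only mimics it: the Group column comes from the exercise text, while the YEAR column name and all values are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for daily_show; YEAR is an assumed column name
toy = pd.DataFrame({
    "YEAR": [1999, 1999, 1999, 2000, 2000, 2001],
    "Group": ["Acting", "Acting", "Comedy", "Acting", "Media", "Media"],
})

# Count guests per group and year: rows are groups, columns are years
guest_counts = pd.crosstab(toy["Group"], toy["YEAR"])

# Draw the counts as a heat map and label it for readability
ax = sns.heatmap(guest_counts, annot=True, fmt="d", cbar=False)
ax.set_title("Guests per group and year (toy data)")
ax.set_xlabel("Year")
ax.set_ylabel("Guest group")

plt.show()
```

pd.crosstab() turns the long table into the wide group-by-year matrix that heatmap() expects, so the same two steps should carry over once the toy frame is swapped for the real data.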

Rug plot and kde shading

Now that you understand some function arguments for displot(), we can continue refining the output. Creating a visualization and then updating it incrementally is a useful and common way to look at data from multiple perspectives.

Seaborn excels at making this process simple.

Instructions

Create a displot of the Award_Amount column in the df.
Configure it to show a shaded kde plot (using the kind and fill parameters).
Add a rug plot above the x axis (using the rug parameter).
Display the plot.

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('datasets/schoolimprovement2010grants.csv')

# Create a displot of the Award Amount
sns.displot(df['Award_Amount'],
            kind='kde',
            rug=True,
            fill=True)

# Plot the results
plt.show()

Create a regression plot

For this set of exercises, we will be looking at FiveThirtyEight's data on which US state has the worst drivers. The data set includes summary-level information about fatal accidents as well as insurance premiums for each state as of 2010.

In this exercise, we will look at the difference between the regression plotting functions.

Instructions

The data is available in the dataframe called df.
Create a regression plot using regplot() with "insurance_losses" on the x axis and "premiums" on the y axis.

df = pd.read_csv('datasets/insurance_premiums.csv')

# Create a regression plot of premiums vs. insurance_losses
sns.regplot(x='insurance_losses', y='premiums', data=df)

# Display the plot
plt.show()

# Create an lmplot of premiums vs. insurance_losses
sns.lmplot(x='insurance_losses', y='premiums', data=df)

# Display the second plot
plt.show()

# Create a regression plot using hue
sns.lmplot(x='insurance_losses', y='premiums', data=df,
           hue="Region")

# Show the results
plt.show()

# Create a regression plot with multiple rows
sns.lmplot(data=df,
           x="insurance_losses",
           y="premiums",
           row="Region")

# Show the plot
plt.show()
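One way to see the difference between the two regression functions used above is to look at what each returns. The sketch below uses a small made-up DataFrame with the same column names as the exercise. The values are invented, and the two region labels are placeholders, not the real data.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy stand-in for the premiums data; all values are made up for illustration
toy = pd.DataFrame({
    "insurance_losses": [90, 110, 130, 100, 120, 140, 95, 125],
    "premiums": [700, 820, 950, 760, 900, 1010, 720, 930],
    "Region": ["West", "West", "West", "West", "South", "South", "South", "South"],
})

# regplot() is axes-level: it draws on a single Axes, which you can create yourself
fig, ax = plt.subplots()
returned = sns.regplot(data=toy, x="insurance_losses", y="premiums", ax=ax)
ax.set_title("regplot on an existing Axes")

# lmplot() is figure-level: it builds its own figure and returns a FacetGrid,
# which is what makes the hue, col, and row arguments possible
grid = sns.lmplot(data=toy, x="insurance_losses", y="premiums", col="Region")
grid.set_axis_labels("Insurance losses", "Premiums")

plt.show()
```

Because regplot() accepts an ax argument, it fits into a subplot layout you manage yourself, while lmplot() is the more convenient choice when the plot should be split by a categorical column.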

Create and display a palette with 10 colors using the husl system.

sns.palplot(sns.color_palette('husl', 10))
plt.show()
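To check what the palette call produces before plotting with it, the following minimal sketch inspects the object returned by color_palette(). It relies only on the call shown above; the variable name is arbitrary.

```python
import seaborn as sns

# Ask for 10 evenly spaced hues from the husl color space
palette = sns.color_palette("husl", 10)

# The palette behaves like a list of (red, green, blue) tuples with values in [0, 1]
print(len(palette))
print(palette[0])

# The same colors as hex strings, handy for reuse outside seaborn
print(palette.as_hex())
```

The same object can be passed to the palette argument of most seaborn plotting functions.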

