Intermediate Data Visualization with Seaborn

# Importing the course packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Importing the course datasets
bike_share = pd.read_csv('datasets/bike_share.csv')
college_data = pd.read_csv('datasets/college_datav3.csv')
daily_show = pd.read_csv('datasets/daily_show_guests_cleaned.csv')
insurance = pd.read_csv('datasets/insurance_premiums.csv')
grants = pd.read_csv('datasets/schoolimprovement2010grants.csv', index_col=0)

# Plot total_rentals against temp, with one regression line per workingday value
sns.lmplot(x='temp', y='total_rentals', data=bike_share, hue='workingday')
plt.title('Temperature vs. total rentals on working and non-working days (bike_share)')

plt.show()
plt.clf()

Heat map from daily_show to see how types of guests have changed yearly

daily_show.head()

# Create the crosstab DataFrame
pd_crosstab = pd.crosstab(daily_show["Group"], daily_show["YEAR"])

# Plot a heatmap of the table using the BuGn palette (the color bar is kept)
sns.set_style('whitegrid')
sns.heatmap(pd_crosstab, cmap='BuGn')

# Rotate tick marks for visibility
plt.yticks(rotation=0)
plt.xticks(rotation=90)
plt.title('Heatmap: how the types of guests (Group) have changed by year')

# Show the plot
plt.show()
plt.clf()
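
The heatmap above is driven entirely by the table that `pd.crosstab` returns, so it can help to see that table on a tiny input first. The following is only a sketch: the `guests` frame and its values are made up for illustration and are not taken from `daily_show`; only the column names `Group` and `YEAR` come from the notebook.

```python
import pandas as pd

# Hypothetical miniature of the guest list (values invented for illustration)
guests = pd.DataFrame({
    "YEAR": [1999, 1999, 1999, 2000, 2000],
    "Group": ["Acting", "Acting", "Media", "Acting", "Politician"],
})

# Rows are the distinct Group values, columns are the distinct YEAR values,
# and each cell counts the guests with that (Group, YEAR) combination
counts = pd.crosstab(guests["Group"], guests["YEAR"])
print(counts)
```

Combinations that never occur (here, Media in 2000) are filled with 0, which is why the heatmap has a cell for every group in every year.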

# Pairwise scatter plots of the fatal-collision variables against premiums and insurance losses, colored by Region
g = sns.pairplot(data=insurance,
                 x_vars=['fatal_collisions_speeding', 'fatal_collisions_alc', 'fatal_collisions',
                         'fatal_collisions_not_distracted', 'fatal_collisions_no_hist'],
                 y_vars=['premiums', 'insurance_losses'],
                 kind='scatter',
                 hue='Region',
                 palette='husl')

# Leave room above the grid for the figure-level title
plt.subplots_adjust(top=0.9)
plt.suptitle('Relationship between fatal collisions and premiums / insurance losses by region (pairwise plot)', fontsize=16)
plt.show()
plt.clf()

Explore Datasets

Use the DataFrames imported in the first cell to explore the data and practice your skills!

  • Use lmplot() to look at the relationship between temp and total_rentals from bike_share. Plot two regression lines for working and non-working days (workingday).
  • Create a heat map from daily_show to see how the types of guests (Group) have changed yearly.
  • Explore the variables from insurance and their relationship by creating pairwise plots and experimenting with different variables and types of plots. Additionally, you can use color to segment visually for region.
  • Make sure to add titles and labels to your plots and adjust their format for readability!
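
As one possible way to act on the last two bullets, the sketch below switches the pairwise plot to `kind="reg"` and adds a figure-level title and a friendlier axis label. The helper name `labeled_pairplot` and the `demo` frame are hypothetical: the frame holds random stand-in numbers, not the real `insurance` data, and only reuses a few of its column names.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import numpy as np
import pandas as pd
import seaborn as sns


def labeled_pairplot(df, x_vars, y_vars, hue, title):
    """Hypothetical helper: regression-style pairwise plot with a figure title."""
    g = sns.pairplot(data=df, x_vars=x_vars, y_vars=y_vars,
                     kind="reg", hue=hue, palette="husl")
    g.fig.subplots_adjust(top=0.85)   # make room for the title above the grid
    g.fig.suptitle(title, fontsize=14)
    return g


# Random stand-in data (assumption: not the real insurance dataset)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "fatal_collisions": rng.uniform(5, 25, 40),
    "premiums": rng.uniform(600, 1300, 40),
    "insurance_losses": rng.uniform(80, 200, 40),
    "Region": np.repeat(["West", "South"], 20),
})

g = labeled_pairplot(demo, ["fatal_collisions"], ["premiums", "insurance_losses"],
                     hue="Region", title="Fatal collisions vs. premiums and losses (demo data)")

# Replace the raw column name on the bottom x-axis with a readable label
g.axes[-1, 0].set_xlabel("Fatal collisions")
```

In the notebook itself, the same helper could be called with `insurance` and the column lists used earlier, followed by `plt.show()`.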