Introduction to Data Visualization with Seaborn
Run the hidden code cell below to import the data used in this course.
# Importing the course packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Importing the course datasets
country_data = pd.read_csv('datasets/countries-of-the-world.csv', decimal=",")
mpg = pd.read_csv('datasets/mpg.csv')
student_data = pd.read_csv('datasets/student-alcohol-consumption.csv', index_col=0)
survey = pd.read_csv('datasets/young-people-survey-responses.csv', index_col=0)
Take Notes
Add notes about the concepts you've learned and code cells with code you want to keep.
Add your notes here
# Add your code snippets here
Explore Datasets
Use the DataFrames imported in the first cell to explore the data and practice your skills!
- From country_data, create a scatter plot to look at the relationship between GDP and Literacy. Use color to segment the data points by region.
- Use mpg to create a line plot with model_year on the x-axis and weight on the y-axis. Create differentiating lines for each country of origin (origin).
- Create a box plot from student_data to explore the relationship between the number of failures (failures) and the average final grade (G3).
- Create a bar plot from survey to compare how Loneliness differs across values for Internet usage. Format it to have two subplots for gender.
- Make sure to add titles and labels to your plots and adjust their format for readability!
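As a rough illustration of the first and last tasks above, here is a minimal sketch of a scatter plot colored by region with a title and axis labels. It runs on a small made-up DataFrame rather than the real country_data: the values, the placeholder region names, and the "Literacy (%)" column name are assumptions for illustration only. The Agg backend line just lets the sketch run without a display and can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy stand-in for country_data; all values and region names are made up
toy = pd.DataFrame({
    "GDP ($ per capita)": [500, 1200, 9000, 27000, 35000, 800],
    "Literacy (%)": [36.0, 58.0, 90.0, 99.0, 99.0, 47.0],
    "Region": ["A", "A", "B", "C", "C", "B"],
})

# Scatter plot with color (hue) segmenting the points by region
ax = sns.scatterplot(x="GDP ($ per capita)", y="Literacy (%)",
                     hue="Region", data=toy)

# Add a title and axis labels, then adjust the layout for readability
ax.set_title("Literacy vs. GDP by region (toy data)")
ax.set_xlabel("GDP ($ per capita)")
ax.set_ylabel("Literacy (%)")
plt.xticks(rotation=45)
plt.tight_layout()

# Show plot
plt.show()
```

The same set_title / set_xlabel / set_ylabel calls work on the Axes object returned by the other axes-level Seaborn functions, such as countplot.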
country_data.head()
GETTING STARTED
# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
# Extract columns from country_data into lists
gdp = country_data['GDP ($ per capita)'].tolist()
phones = country_data['Phones (per 1000)'].tolist()
# Create scatter plot with GDP on the x-axis and number of phones on the y-axis
sns.scatterplot(x=gdp, y=phones)
# Show plot
plt.show()
# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
# Create list of region
region = country_data['Region'].tolist()
# Create count plot with region on the y-axis
sns.countplot(y=region)
# Show plot
plt.show()
MAKING COUNTPLOT
# Import Matplotlib, pandas, and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
# Create a DataFrame from csv file
df = pd.read_csv('csv_filepath')
# Create a count plot with "Spiders" on the x-axis
sns.countplot(x="Spiders", data=df)
# Display the plot
plt.show()
HUE and ADDING 3RD VARIABLE
# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
# Change the legend order in the scatter plot
sns.scatterplot(x="absences", y="G3",
                data=student_data,
                hue="location", hue_order=["Rural", "Urban"])
# Show plot
plt.show()
# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
# Create a dictionary mapping subgroup values to colors
palette_colors = {"Rural": "green", "Urban": "blue"}
# Create a count plot of school with location subgroups
sns.countplot(x="school", data=student_data, hue="location", palette=palette_colors)
# Display plot
plt.show()
RELATIONAL PLOTS - relplot() can take the place of scatterplot() in Seaborn because it draws either plot type when you pass kind="scatter" or kind="line"