Introduction to Data Visualization with Seaborn

Run the hidden code cell below to import the data used in this course.

# Importing the course packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Importing the course datasets
country_data = pd.read_csv('datasets/countries-of-the-world.csv', decimal=",")
mpg = pd.read_csv('datasets/mpg.csv')
student_data = pd.read_csv('datasets/student-alcohol-consumption.csv', index_col=0)
survey = pd.read_csv('datasets/young-people-survey-responses.csv', index_col=0)

Take Notes

Add notes about the concepts you've learned and code cells with code you want to keep.

Add your notes here

# Add your code snippets here

Explore Datasets

Use the DataFrames imported in the first cell to explore the data and practice your skills!

From country_data, create a scatter plot to look at the relationship between GDP and Literacy. Use color to segment the data points by region.
Use mpg to create a line plot with model_year on the x-axis and weight on the y-axis. Create differentiating lines for each country of origin (origin).
Create a box plot from student_data to explore the relationship between the number of failures (failures) and the average final grade (G3).
Create a bar plot from survey to compare how Loneliness differs across values for Internet usage. Format it to have two subplots for gender.
Make sure to add titles and labels to your plots and adjust their format for readability!
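As a starting point for the box plot task and the reminder about titles and labels, here is a minimal sketch. It uses a small made-up DataFrame (`toy`, a hypothetical stand-in and not the course's student_data) so that it runs on its own; the column names failures and G3 are taken from the task description above, and the title and label text are only suggestions.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in data (assumption): real values come from student_data
toy = pd.DataFrame({
    "failures": [0, 0, 0, 1, 1, 2, 2, 3],
    "G3": [15, 13, 12, 11, 10, 8, 7, 5],
})

# Axes-level functions such as sns.boxplot() return a Matplotlib Axes object
ax = sns.boxplot(x="failures", y="G3", data=toy)

# Add a title and readable axis labels through the Axes object
ax.set_title("Final grade by number of past failures")
ax.set_xlabel("Number of failures")
ax.set_ylabel("Final grade (G3)")

# Show plot
plt.show()
```

Swapping `toy` for student_data should give the plot the task asks for, assuming the real columns carry these names.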

country_data.head()

GETTING STARTED

# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns

# Extract columns from country_data into lists
gdp = country_data['GDP ($ per capita)'].tolist()
phones = country_data['Phones (per 1000)'].tolist()

# Create scatter plot with GDP on the x-axis and number of phones on the y-axis
sns.scatterplot(x=gdp, y=phones)

# Show plot
plt.show()

# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns

# Create a list of regions
region = country_data['Region'].tolist()

# Create count plot with region on the y-axis
sns.countplot(y=region)

# Show plot
plt.show()

MAKING COUNT PLOTS

# Import Matplotlib, pandas, and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd


# Create a DataFrame from csv file
df = pd.read_csv('csv_filepath')

# Create a count plot with "Spiders" on the x-axis
sns.countplot(x="Spiders", data=df)

# Display the plot
plt.show()

HUE - ADDING A THIRD VARIABLE

# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns

# Change the legend order in the scatter plot
sns.scatterplot(x="absences", y="G3", 
                data=student_data, 
                hue="location", hue_order=["Rural", "Urban"])

# Show plot
plt.show()


# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns

# Create a dictionary mapping subgroup values to colors
palette_colors = {"Rural": "green", "Urban": "blue"}

# Create a count plot of school with location subgroups
sns.countplot(x="school", data=student_data, hue="location", palette=palette_colors)



# Display plot
plt.show()

RELATIONAL PLOTS - relplot() can take the place of scatterplot() in Seaborn because its kind argument lets you choose "scatter" or "line" as needed

