Introduction to Data Visualization with Matplotlib
Run the hidden code cell below to import the data used in this course.
# Importing the course packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Importing the course datasets
climate_change = pd.read_csv('datasets/climate_change.csv', parse_dates=["date"], index_col="date")
medals = pd.read_csv('datasets/medals_by_country_2016.csv', index_col=0)
summer_2016 = pd.read_csv('datasets/summer2016.csv')
austin_weather = pd.read_csv("datasets/austin_weather.csv", index_col="DATE")
weather = pd.read_csv("datasets/seattle_weather.csv", index_col="DATE")
# Some pre-processing on the weather datasets, including adding a month column
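# .copy() makes an independent DataFrame, so adding a column below does not trigger SettingWithCopyWarning
seattle_weather = weather[weather["STATION"] == "USW00094290"].copy()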
month = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
seattle_weather["MONTH"] = month
austin_weather["MONTH"] = month

Take Notes
Add notes about the concepts you've learned and code cells with code you want to keep.
Add your notes here
# Add your code snippets here

Explore Datasets
Use the DataFrames imported in the first cell to explore the data and practice your skills!
- Using `austin_weather` and `seattle_weather`, create a Figure with an array of two Axes objects that share an x-axis (`MONTH` in this case). Plot Seattle's and Austin's `MLY-TAVG-NORMAL` (for average temperature) in the top Axes and plot their `MLY-PRCP-NORMAL` (for average precipitation) in the bottom Axes. The cities should have different colors and the line style should be different between precipitation and temperature. Make sure to label your viz!
- Using `climate_change`, create a twin Axes object with the shared x-axis as time. There should be two lines of different colors not sharing a y-axis: `co2` and `relative_temp`. Only include dates from the 2000s and annotate the first date at which `co2` exceeded 400.
- Create a scatter plot from `medals` comparing the number of Gold medals vs the number of Silver medals with each point labeled with the country name.
- Explore if the distribution of `Age` varies in different sports by creating histograms from `summer_2016`.
- Try out the different Matplotlib styles available and save your visualizations as a PNG file.
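
As a starting point for the first exercise, here is a minimal sketch of two stacked Axes that share the month axis. The numbers below are made-up placeholders, not the course data, and the names `seattle` and `austin` are hypothetical stand-ins; with the real DataFrames you would plot the `MONTH`, `MLY-TAVG-NORMAL` and `MLY-PRCP-NORMAL` columns of `seattle_weather` and `austin_weather` instead.

```python
import pandas as pd
import matplotlib.pyplot as plt

month = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Placeholder values for illustration only (NOT the course datasets)
seattle = pd.DataFrame({
    "MONTH": month,
    "MLY-TAVG-NORMAL": [42, 44, 47, 51, 57, 62, 66, 67, 62, 53, 46, 41],
    "MLY-PRCP-NORMAL": [5.6, 3.5, 3.7, 2.7, 1.9, 1.6, 0.7, 0.9, 1.5, 3.5, 6.6, 5.3],
})
austin = pd.DataFrame({
    "MONTH": month,
    "MLY-TAVG-NORMAL": [51, 55, 62, 69, 76, 82, 85, 86, 80, 71, 61, 53],
    "MLY-PRCP-NORMAL": [2.2, 2.0, 2.8, 2.1, 4.4, 4.3, 1.9, 2.4, 3.0, 3.9, 3.0, 2.4],
})

# Two rows, one column; sharex=True links the month axis of both panels
fig, axes = plt.subplots(2, 1, sharex=True)

# Top panel: temperature, solid lines, one color per city
axes[0].plot(seattle["MONTH"], seattle["MLY-TAVG-NORMAL"],
             color="tab:blue", linestyle="-", label="Seattle")
axes[0].plot(austin["MONTH"], austin["MLY-TAVG-NORMAL"],
             color="tab:red", linestyle="-", label="Austin")
axes[0].set_ylabel("Average temperature")
axes[0].legend()

# Bottom panel: precipitation, same city colors but dashed lines
axes[1].plot(seattle["MONTH"], seattle["MLY-PRCP-NORMAL"],
             color="tab:blue", linestyle="--", label="Seattle")
axes[1].plot(austin["MONTH"], austin["MLY-PRCP-NORMAL"],
             color="tab:red", linestyle="--", label="Austin")
axes[1].set_xlabel("Month")
axes[1].set_ylabel("Average precipitation")
axes[1].legend()
```

In a notebook the figure renders at the end of the cell; in a script, finish with `plt.show()`. Keeping one color per city across both panels and varying only the line style makes the two panels easy to read together.
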
import pandas as pd
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(climate_change.index, climate_change['co2'], color = 'blue')
ax.set_xlabel('Time')
ax.set_ylabel('CO2 (ppm)', color = 'blue')
ax.tick_params('y', colors = 'blue')
ax2 = ax.twinx()
ax2.plot(climate_change.index, climate_change['relative_temp'], color = 'red')
ax2.set_ylabel('Relative temperature (Celsius)', color = 'red')
ax2.tick_params('y', colors = 'red')
plt.show()

def plot_timeseries(axes, x, y, color, xlabel, ylabel):
    axes.plot(x, y, color=color)
    axes.set_xlabel(xlabel)
    axes.set_ylabel(ylabel, color=color)
    axes.tick_params('y', colors=color)

fig, ax = plt.subplots()
plot_timeseries(ax, climate_change.index, climate_change['co2'], 'blue', 'Time', 'CO2 (ppm)')
ax2 = ax.twinx()
plot_timeseries(ax2, climate_change.index, climate_change['relative_temp'], 'red', 'Time', 'Relative temperature (Celsius)')
ax2.annotate('>1 degree', xy=(pd.Timestamp('2015-10-06'), 1),
             xytext=(pd.Timestamp('2008-10-06'), -0.2),
             arrowprops={'arrowstyle': '->', 'color': 'gray'})
plt.show()
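
The remaining parts of the `climate_change` exercise, restricting the plot to recent dates and annotating the first date on which `co2` exceeded 400, could be sketched roughly as follows. The series here is synthetic (a straight line invented for illustration, stored in a hypothetical `climate` DataFrame), and reading "the 2000s" as "from 2000-01-01 onward" is an assumption; adjust the slice if you mean 2000 to 2009 only.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in for climate_change (NOT the course data):
# monthly values rising linearly from 350 to 410 ppm.
dates = pd.date_range("1990-01-01", "2016-12-01", freq="MS")
climate = pd.DataFrame({"co2": np.linspace(350, 410, len(dates))}, index=dates)

# Label-based slicing on a DatetimeIndex keeps rows from 2000 onward
recent = climate.loc["2000-01-01":]

# Keep only rows above the threshold; the first one is the earliest exceedance.
# (This raises IndexError if the threshold is never exceeded.)
above = recent[recent["co2"] > 400]
first_date = above.index[0]
first_value = above["co2"].iloc[0]

fig, ax = plt.subplots()
ax.plot(recent.index, recent["co2"], color="blue")
ax.set_xlabel("Time")
ax.set_ylabel("CO2 (ppm)")

# Arrow points at the first exceedance; the text position is an arbitrary choice
ax.annotate("first > 400 ppm",
            xy=(first_date, first_value),
            xytext=(pd.Timestamp("2003-01-01"), 405),
            arrowprops={"arrowstyle": "->", "color": "gray"})
```

Computing `first_date` from the data, rather than hard-coding a timestamp, keeps the annotation correct if the date range or threshold changes.
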