Statistical Thinking in Python (Part 1)

👋 Welcome to your workspace! Here, you can write and run Python code and add text in Markdown. Below, we've imported the datasets from the course Statistical Thinking in Python (Part 1) as DataFrames as well as the packages used in the course. This is your sandbox environment: analyze the course datasets further, take notes, or experiment with code!

# Importing course packages; you can add more too!
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Importing course datasets as DataFrames
belmont = pd.read_csv('datasets/belmont.csv')
michelson = pd.read_csv('datasets/michelson_speed_of_light.csv', index_col=0)
all_states = pd.read_csv('datasets/2008_all_states.csv')
swing_states = pd.read_csv('datasets/2008_swing_states.csv')

belmont.head() # Display the first five rows
# Begin writing your own code here!

Don't know where to start?

Try completing these tasks:

  • Pick five states from all_states and create a bee swarm plot of the votes in those states. Each point should represent the share of the vote Obama got in a single county (dem_share), segmented by state.
  • Compute the variance and standard deviation of dem_share for each state in swing_states.
  • Draw 10,000 samples from a normal distribution with the mean and standard deviation of the Belmont winners' times and plot them. This will require you to clean the Time column to be numeric.
  • Check the normality of the Michelson measurements, specifically velocity of light in air, by plotting the theoretical and empirical CDFs on the same plot.
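
As a starting point for the last two tasks, here is a minimal sketch of two helpers. The names ecdf and time_to_seconds are illustrative, not part of the workspace, and the 'M:SS.ss' layout of the Time column is an assumption you should check against belmont.head(). The measurements array is synthetic stand-in data so the block runs without the CSV files.

```python
import numpy as np


def ecdf(data):
    """Return sorted values x and cumulative fractions y of an empirical CDF."""
    x = np.sort(np.asarray(data, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


def time_to_seconds(time_str):
    """Convert a time string to seconds, assuming a 'M:SS.ss' layout."""
    minutes, seconds = time_str.strip().split(":")
    return int(minutes) * 60 + float(seconds)


# Synthetic stand-in data (made-up values, NOT the Michelson measurements)
rng = np.random.default_rng(42)
measurements = rng.normal(loc=100.0, scale=10.0, size=100)

# Empirical CDF of the observed data
x_emp, y_emp = ecdf(measurements)

# "Theoretical" CDF approximated by 10,000 draws from a normal distribution
# with the same mean and standard deviation as the data
samples = rng.normal(np.mean(measurements), np.std(measurements), size=10_000)
x_theor, y_theor = ecdf(samples)
```

To compare the two curves, you could plot x_theor against y_theor as a line and x_emp against y_emp as points, for example with plt.plot(x_emp, y_emp, marker='.', linestyle='none'). With the real data, replace measurements with the relevant Michelson column, or with the Belmont times after mapping time_to_seconds over the Time column.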