Statistical Thinking in Python (Part 2)

👋 Welcome to your workspace! Here, you can write and run Python code and add text in Markdown. Below, we've imported the datasets from the course Statistical Thinking in Python (Part 2) as DataFrames as well as the packages used in the course. This is your sandbox environment: analyze the course datasets further, take notes, or experiment with code!

# Importing course packages; you can add more too!
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Importing course datasets as DataFrames
anscombe = pd.read_csv('datasets/anscombe.csv', header=[0,1])
bees = pd.read_csv('datasets/bee_sperm.csv', comment='#')
literacy_fertility = pd.read_csv('datasets/female_literacy_fertility.csv')
beaks_1975 = pd.read_csv('datasets/finch_beaks_1975.csv')
beaks_2012 = pd.read_csv('datasets/finch_beaks_2012.csv')
frogs = pd.read_csv('datasets/frog_tongue.csv', comment='#')
mlb = pd.read_csv('datasets/mlb_nohitters.csv')
weather = pd.read_csv('datasets/sheffield_weather_station.csv', comment='#', delimiter=r'\s+', na_values='---')

bees.head() # Display the first five rows
# Begin writing your own code here!

Don't know where to start?

Try completing these tasks:

  • Show that the four sets of Anscombe data have nearly the same best-fit slope and intercept. Plot each set separately along with its best-fit line.
  • Investigate whether beak length and beak depth are significantly different across years (beaks_1975 vs beaks_2012) or species (see species column).
  • Pick two continents from literacy_fertility and test the null hypothesis that the country-level female literacy rate is identically distributed between the two continents you've picked.
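One possible starting point for the tasks above is sketched here. It is a minimal example, not the course's reference solution: the helper names `fit_line` and `permutation_pvalue` are made up for illustration, and the arrays are synthetic stand-ins, because the column names in the course datasets are not shown here. The first helper fits a least-squares line with `np.polyfit`. The second runs a two-sided permutation test on the difference of means, which is one way to test a null hypothesis that two samples are identically distributed.

```python
import numpy as np

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

def permutation_pvalue(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test using the difference of means as the test statistic."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        # Under the null, group labels are exchangeable, so shuffle and re-split.
        shuffled = rng.permutation(pooled)
        diff = shuffled[:len(a)].mean() - shuffled[len(a):].mean()
        if abs(diff) >= abs(observed):
            count += 1
    return count / n_perm

# Synthetic stand-in data (assumption: replace with columns from the course DataFrames).
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 0.5 * x + 3.0 + rng.normal(0, 0.1, size=x.size)
slope, intercept = fit_line(x, y)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")

group_a = rng.normal(10.0, 1.0, size=40)
group_b = rng.normal(12.0, 1.0, size=40)
p = permutation_pvalue(group_a, group_b, n_perm=2000)
print(f"p-value={p:.4f}")
```

To apply this to the course data, pass the relevant columns as arrays, for example the beak depths from `beaks_1975` and `beaks_2012`, after checking the actual column names with `.head()`.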