Hypothesis Testing in Python
Run the hidden code cell below to import the data used in this course.
# Import pandas
import pandas as pd
# Import the course datasets
republican_votes = pd.read_feather('datasets/repub_votes_potus_08_12.feather')
democrat_votes = pd.read_feather('datasets/dem_votes_potus_12_16.feather')
shipments = pd.read_feather('datasets/late_shipments.feather')
stackoverflow = pd.read_feather('datasets/stack_overflow.feather')
Take Notes
Add notes about the concepts you've learned and code cells with code you want to keep.
Calculating the sample mean
In pandas, the proportion of rows in a categorical DataFrame column that equal a given value can be calculated with:
prop = (df['col'] == val).mean()
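This works because the comparison yields a boolean Series, and the mean of booleans is the fraction that are True. A minimal sketch, assuming a small made-up DataFrame (the column name `late` and its values are hypothetical, not the course data):

```python
import pandas as pd

# Hypothetical toy data, not the course's late_shipments dataset
df = pd.DataFrame({"late": ["Yes", "No", "No", "Yes", "No"]})

# The comparison gives a boolean Series: True where the value matches
is_late = df["late"] == "Yes"

# The mean of booleans is the share of True values
prop = is_late.mean()
print(prop)
```

Here two of the five rows match, so `prop` is the share 2/5.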
Calculating a z-score
The p-value is the smallest significance level at which the null hypothesis would be rejected.
Calculating a p-value
To decide whether to reject the null hypothesis in favor of the alternative hypothesis, calculate a p-value from the z-score.
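# Import the normal distribution from SciPy
from scipy.stats import norm
# Calculate the z-score of late_prop_samp
# (late_prop_samp, late_prop_hyp, and std_error are assumed to be defined already)
z_score = (late_prop_samp - late_prop_hyp) / std_error
# Calculate the right-tailed p-value: P(Z >= z_score) under the standard normal
p_value = 1 - norm.cdf(z_score, loc=0, scale=1)
# Print the p-value
print(p_value)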
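To see the whole calculation end to end, here is a minimal sketch with made-up inputs. The three input numbers and the 0.05 significance level are assumptions for illustration, not values from the course data, and `statistics.NormalDist` from the standard library stands in for `scipy.stats.norm` so the snippet runs without extra packages.

```python
from statistics import NormalDist

# Hypothetical inputs, chosen only for illustration
late_prop_samp = 0.10   # sample proportion
late_prop_hyp = 0.06    # hypothesized proportion under the null
std_error = 0.02        # standard error of the sample proportion

# Standardize the difference between sample and hypothesized values
z_score = (late_prop_samp - late_prop_hyp) / std_error

# Right-tailed p-value: probability of a z-score at least this large
p_value = 1 - NormalDist(mu=0, sigma=1).cdf(z_score)

# Assumed significance level; reject the null when p_value <= alpha
alpha = 0.05
reject_null = p_value <= alpha

print(z_score, p_value, reject_null)
```

This sketch is right-tailed, matching the `1 - cdf` form. For a left-tailed test the p-value would be the CDF value itself, and for a two-tailed test it would be twice the smaller tail.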