course notes


pd.merge_asof

Perform a merge by key distance.

This is similar to a left-join except that we match on nearest key rather than equal keys. Both DataFrames must be sorted by the key.
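
A minimal sketch of the nearest-key match described above. The `trades` and `quotes` frames and their column names are made up for illustration; only `pd.merge_asof` and its `on` / `direction` parameters come from pandas.

```python
import pandas as pd

# Hypothetical data: both frames are already sorted by the "time" key.
trades = pd.DataFrame({"time": [1, 5, 10], "trade": ["a", "b", "c"]})
quotes = pd.DataFrame({"time": [2, 4, 7], "quote": [100, 101, 102]})

# Default direction="backward": for each trade, take the last quote
# whose time is <= the trade time (no match gives NaN).
backward = pd.merge_asof(trades, quotes, on="time")

# direction="nearest": take the quote whose time is closest,
# whether it lies before or after the trade time.
nearest = pd.merge_asof(trades, quotes, on="time", direction="nearest")

print(backward)
print(nearest)
```

Like a left join, every row of the left frame is kept; the trade at time 1 has no earlier quote, so its `quote` is NaN in the backward match but gets a value in the nearest match.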

Basic statistics

Variance: measures the average squared distance from each point to the mean. algo:

for every point take the distance to the mean
square each distance
sum the squared distances
divide by the number of points - 1

numpy: np.var(data, ddof=1) (use ddof=0 only for full-population stats)
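
The steps above can be sketched by hand and checked against numpy. The sample values in `data` are made up for illustration.

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 6.0, 9.0])  # made-up sample

distances = data - data.mean()             # distance of every point to the mean
squared = distances ** 2                   # square each distance
total = squared.sum()                      # sum the squared distances
sample_variance = total / (len(data) - 1)  # divide by number of points - 1

# Should agree with numpy's sample variance (ddof=1).
print(sample_variance, np.var(data, ddof=1))

# ddof=0 (the numpy default) divides by the number of points instead.
print(np.var(data, ddof=0))
```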

Standard deviation: square root of the variance. numpy: np.std(data, ddof=1)
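
A quick check, on made-up data, that taking the square root of the sample variance gives the same number as `np.std` with the same `ddof`. Unlike the variance, the result is in the same units as the data.

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 6.0, 9.0])  # made-up sample

# Standard deviation as the square root of the sample variance.
sample_std = np.sqrt(np.var(data, ddof=1))

print(sample_std, np.std(data, ddof=1))
```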

algo:

std vs mad: std penalizes longer distances more than shorter ones (because distances are squared), whereas mad (mean absolute deviation) penalizes every distance equally
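
A small sketch of that difference, assuming "mad" here means mean absolute deviation (the average of |x - mean|). The helper `mean_abs_dev` and the data are hypothetical, written only for this comparison.

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 6.0, 9.0])  # made-up sample
with_outlier = np.append(data, 50.0)         # same sample plus one far-away point

def mean_abs_dev(x):
    """Mean absolute deviation: average absolute distance to the mean."""
    return np.mean(np.abs(x - np.mean(x)))

mad_before, mad_after = mean_abs_dev(data), mean_abs_dev(with_outlier)
std_before, std_after = np.std(data, ddof=1), np.std(with_outlier, ddof=1)

# Growth factor of each measure once the far-away point is added.
print("mad grows by", mad_after / mad_before)
print("std grows by", std_after / std_before)
```

In this example the std grows by a larger factor than the mad, because the far-away point's distance is squared before averaging.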

Quantiles: split up the data into some number of equal parts. numpy: np.quantile(data, quantiles)
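
A short sketch of `np.quantile` with a single quantile and with a list of quantiles. The data is made up; `np.linspace` is just one convenient way to build evenly spaced cut points.

```python
import numpy as np

data = np.arange(1, 10)  # made-up data: 1, 2, ..., 9

median = np.quantile(data, 0.5)                          # one quantile
quartiles = np.quantile(data, [0, 0.25, 0.5, 0.75, 1])   # four equal parts
quintiles = np.quantile(data, np.linspace(0, 1, 6))      # five equal parts

print(median)
print(quartiles)
print(quintiles)
```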

Interquartile range (IQR): distance between the 0.25 quantile and the 0.75 quantile = height of the box in a boxplot. scipy:

from scipy.stats import iqr
iqr(data)
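
The same quantity can be computed from the two quantiles directly. A minimal check on made-up data that it matches `scipy.stats.iqr` with default settings:

```python
import numpy as np
from scipy.stats import iqr

data = np.arange(1, 10)  # made-up data: 1, 2, ..., 9

# IQR by hand: 0.75 quantile minus 0.25 quantile.
q1, q3 = np.quantile(data, [0.25, 0.75])
manual_iqr = q3 - q1

print(manual_iqr, iqr(data))
```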

Outliers are defined as: data < Q1 - 1.5 * IQR or data > Q3 + 1.5 * IQR
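
A minimal sketch of this rule on made-up data. The two conditions are combined with OR (`|` for numpy boolean arrays), since no value can be below the lower bound and above the upper bound at the same time.

```python
import numpy as np
from scipy.stats import iqr

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 40])  # made-up data with one extreme value

q1, q3 = np.quantile(data, [0.25, 0.75])
spread = iqr(data)

lower = q1 - 1.5 * spread  # anything below this is an outlier
upper = q3 + 1.5 * spread  # anything above this is an outlier

# Boolean mask: below the lower bound OR above the upper bound.
outliers = data[(data < lower) | (data > upper)]
print(lower, upper, outliers)
```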