Probability and Quantile Functions
The most important R functions for probability distributions are the p<⋯> and q<⋯> functions. The p<⋯> functions compute values of cumulative density functions F(x), while the q<⋯> functions compute the inverse F-1(x).
For example pnorm computes probabilities for a normal distribution X
pnorm(q) = P(X≤ q) = F(q)
But qnorm does the reverse (input is a probability)
qnorm(p) = F-1(p)
I remember this by recalling that "probability" starts with "p". And "q" looks like a backwards "p", so it is the reverse.
Remember:
P(X≤q
) = p
corresponds to pnorm(q
)=p
and qnorm(p
)=q
This workspace defines new functions similar to those in R, which also draw a graph indicating the source of the values -- area (p) and cutoff (q)
- plot_pnorm(...) corresponding to pnorm(...)
- plot_qnorm(...) corresponding to qnorm(...)
# plot_pnorm(...) and plot_qnorm(...) definitions
######################################################################
#' Helper function plotting normal distribution with filled in area
#' marks cutoff (q) as well as total area (p)
#'
#' @ lim c(start,end) x-values where filled area begins and ends
#' @ mean mean of normal distribution
#' @ sd sd of normal distribution
#' @ width # of sd to plot on either side of mean
#'
norm_area <- function(lim, mean=0, sd=1, width=3) {
plot.start <- mean - width*sd # starting point for plot
plot.end <- mean + width*sd # ending point for plot
################################
# plot the normal distribution
curve( # create a new plot with a curve
dnorm(x, mean, sd), # normal distribution function
xlim=c(plot.start,plot.end), # start and end for graph
main='',xlab='', ylab='', # add title and labels later
lwd=2
)
#################################
# fill in polygon representing the area between limits
if (is.na(lim[1])) { lim[1] <- plot.start }
if (is.na(lim[2])) { lim[2] <- plot.end }
fill.x <- c( # x-coordinates of area to fill
lim[1], # start on axis at beginning of area
seq(lim[1], lim[2], .1), # x-coordinates to plot along curve
lim[2], lim[2] # final x coordinate for top & bottom
)
fill.y <- dnorm(fill.x,mean,sd) # top of fill area is along normal pdf
fill.y[c(1,length(fill.y))] <- c(0,0) # ends on x-axis to mark bottom of fill area
polygon(fill.x,fill.y, col='skyblue')
##################################
# mark the edges of filled area on graph
for (n in 1:2) { if (lim[n] != c(plot.start,plot.end)[n]) {
# draw a vertical line marking the limits (wide, red line)
abline(v=lim[n], lwd=3, col='red')
# mark the cutoff value under the x-axis (margin text)
mtext(signif(lim[n],3), # three significant digits of cutoff
line=2, side=1, at=lim[n], # a bit below, on bottom, at cutoff value
cex=1.5, col='red') # in big, red font
}}
###################################
# add the value of area to graph
if (lim[2] == plot.end) {
area.value <- 1
if (lim[1] > mean)
{ area.pos <- c('x'=lim[1]-1.2*sd,
'y'=dnorm(mean,mean,sd)/4) }
else
{ area.pos <- c('x'=mean,
'y'=dnorm(mean,mean,sd)/4) }
} else {
area.value <- pnorm(lim[2],mean,sd)
if (lim[1] == plot.start) { i <- length(fill.x)*2/3 }
else { i <- length(fill.x)/2 }
area.pos <- c('x'=fill.x[i], 'y'=fill.y[i]/2)
}
if (lim[1] != plot.start) {
area.value <- area.value - pnorm(lim[1],mean,sd)
}
# mark the area value on the graph
text(area.pos['x'],area.pos['y'],
signif(area.value,3), # print the area (3 significant digits)
cex=2, col='blue' # big blue font
)
}
######################################################################
#' Convert pnorm into a function that makes a graph marking the normal
#' distribution as well as area and cutoff
#'
#' @ cutoff, @ mu, @ sigma are parameters for pnorm(...)
#'
plot_pnorm <- function(cutoff, mean=0, sd=1, width=3) {
norm_area(c(NA,cutoff),mean,sd,width)
title(paste('pnorm(',cutoff,', mean=',mean,', sd=',sd,')',
' is ', signif( pnorm(cutoff,mean,sd), 3 ),
sep='')
)
}
#######################################################################
#' Convert qnorm into a function that makes a graph marking the normal
#' distribution as well as area and cutoff
#'
#' @ quantile, @ mu, @ sigma are parameters for qnorm(...)
#'
plot_qnorm <- function(quantile, mean=0, sd=1, width=3) {
norm_area(c(NA,qnorm(quantile,mean,sd)),mean,sd,width)
title(paste('qnorm(',quantile,', mean=',mean,', sd=',sd,')',
' is ', signif( qnorm(quantile,mean,sd), 3 ),
sep='')
)
}
plot_pnorm(3, 2, 5)
plot_qnorm(.4, 2, 5)
norm_area(lim=c(-1.5,2.3))
The area above would be computed by
pnorm(2.3) - pnorm(-1.5)
Z-values (Rightward Probabilities)
When computing confidence intervals and doing hypothesis testing, we'll often be interested in
zα, which is the quantile for rightwards probability.
P(Z≥zα) = α ⇔ P(Z≤zα) = 1-α
So zα = qnorm(1-α) = -qnorm(α)
1 hidden cell
par(mfrow=c(2,2))
plot_x_alpha(.1/2, width=4)
plot_x_alpha(.05/2,width=4)
plot_x_alpha(.01, width=4)
plot_x_alpha(.01/2,width=4)