Skip to content

Probability and Quantile Functions

The most important R functions for probability distributions are the p<⋯> and q<⋯> functions. The p<⋯> functions compute values of cumulative density functions F(x), while the q<⋯> functions compute the inverse F-1(x).

For example pnorm computes probabilities for a normal distribution X

pnorm(q) = P(X≤ q) = F(q)

But qnorm does the reverse (input is a probability)

qnorm(p) = F-1(p)

I remember this by recalling that "probability" starts with "p". And "q" looks like a backwards "p", so it is the reverse.

Remember:

P(X≤q) = p   corresponds to   pnorm(q)=p   and   qnorm(p)=q

This workspace defines new functions similar to those in R, which also draw a graph indicating the source of the values -- area (p) and cutoff (q)

  • plot_pnorm(...) corresponding to pnorm(...)
  • plot_qnorm(...) corresponding to qnorm(...)
# plot_pnorm(...) and plot_qnorm(...) definitions

######################################################################
#' Helper function plotting normal distribution with filled in area 
#' marks cutoff (q) as well as total area (p)
#'
#' @ lim    c(start,end) x-values where filled area begins and ends
#' @ mean   mean of normal distribution
#' @ sd     sd of normal distribution
#' @ width  # of sd to plot on either side of mean
#'
norm_area <- function(lim, mean=0, sd=1, width=3) {
	plot.start <- mean - width*sd      # starting point for plot
	plot.end   <- mean + width*sd      # ending   point for plot
	
	################################
	# plot the normal distribution
	curve(                            # create a new plot with a curve
	  	 dnorm(x, mean, sd),          # normal distribution function 
	  	 xlim=c(plot.start,plot.end), # start and end for graph
	 	 main='',xlab='', ylab='',    # add title and labels later
		 lwd=2
		 )

	#################################
	# fill in polygon representing the area between limits
	if (is.na(lim[1])) { lim[1] <- plot.start }
	if (is.na(lim[2])) { lim[2] <- plot.end   }
	
	fill.x <- c(                     # x-coordinates of area to fill
				lim[1],                   # start on axis at beginning of area
			    seq(lim[1], lim[2], .1),  # x-coordinates to plot along curve
		    	lim[2], lim[2]            # final x coordinate for top & bottom
		   		)
	fill.y <- dnorm(fill.x,mean,sd)       # top of fill area is along normal pdf
	fill.y[c(1,length(fill.y))] <- c(0,0) # ends on x-axis to mark bottom of fill area
	
	polygon(fill.x,fill.y, col='skyblue')

	##################################
	# mark the edges of filled area on graph
	for (n in 1:2) { if (lim[n] != c(plot.start,plot.end)[n]) {
		# draw a vertical line marking the limits (wide, red line)
		abline(v=lim[n], lwd=3, col='red')
	
		# mark the cutoff value under the x-axis (margin text)
		mtext(signif(lim[n],3),          # three significant digits of cutoff 
			  line=2, side=1, at=lim[n], # a bit below, on bottom, at cutoff value 
			  cex=1.5, col='red')        # in big, red font
	}}
	
	###################################
	# add the value of area to graph
	if (lim[2] == plot.end) {  
		area.value <- 1 
		if (lim[1] > mean) 
		  { area.pos <- c('x'=lim[1]-1.2*sd,
						  'y'=dnorm(mean,mean,sd)/4) }
		else               
		  { area.pos <- c('x'=mean,       
						  'y'=dnorm(mean,mean,sd)/4) }
	} else {                 
		area.value <- pnorm(lim[2],mean,sd)
		
		if (lim[1] == plot.start) { i <- length(fill.x)*2/3 }
		else                      { i <- length(fill.x)/2 }
		
		area.pos   <- c('x'=fill.x[i], 'y'=fill.y[i]/2)
	}
	
	if (lim[1] != plot.start) {
		area.value <- area.value - pnorm(lim[1],mean,sd) 
	}
	
	# mark the area value on the graph
	text(area.pos['x'],area.pos['y'],
		 signif(area.value,3),  # print the area (3 significant digits)
		 cex=2, col='blue'      # big blue font
		)
}

######################################################################
#' Convert pnorm into a function that makes a graph marking the normal
#' distribution as well as area and cutoff
#'
#' @ cutoff, @ mu, @ sigma  are parameters for pnorm(...)
#'
plot_pnorm <- function(cutoff, mean=0, sd=1, width=3) {

	norm_area(c(NA,cutoff),mean,sd,width)

	title(paste('pnorm(',cutoff,', mean=',mean,', sd=',sd,')',
				'  is  ', signif( pnorm(cutoff,mean,sd), 3 ),
				sep='')
		 )
}

#######################################################################
#' Convert qnorm into a function that makes a graph marking the normal
#' distribution as well as area and cutoff
#'
#' @ quantile, @ mu, @ sigma  are parameters for qnorm(...)
#'
plot_qnorm <- function(quantile, mean=0, sd=1, width=3) {

	norm_area(c(NA,qnorm(quantile,mean,sd)),mean,sd,width)

	title(paste('qnorm(',quantile,', mean=',mean,', sd=',sd,')',
				'  is  ', signif( qnorm(quantile,mean,sd), 3 ),
				sep='')
		 )
}
plot_pnorm(3, 2, 5)
plot_qnorm(.4, 2, 5)
norm_area(lim=c(-1.5,2.3))

The area above would be computed by

pnorm(2.3) - pnorm(-1.5)

Z-values (Rightward Probabilities)

When computing confidence intervals and doing hypothesis testing, we'll often be interested in zα, which is the quantile for rightwards probability.
  P(Z≥zα) = α   ⇔   P(Z≤zα) = 1-α
So zα = qnorm(1-α) = -qnorm(α)


1 hidden cell
par(mfrow=c(2,2))
plot_x_alpha(.1/2, width=4)
plot_x_alpha(.05/2,width=4)
plot_x_alpha(.01,  width=4)
plot_x_alpha(.01/2,width=4)