Confidence Intervals
Evaluating a point estimator for a parameter
Example:
mu<-2; sigma<-5; n<-100
sample_means <- rowMeans(matrix(rnorm(100*n,mu,sigma), # generate 100 sample means
nrow=100))
plot(density(sample_means), # plot density estimate
main='Density estimate of sample means', # title
xlab='sample mean',ylab='probability density' ) # axis labels
points(sample_means,rep(0,100)) # mark sample means on x-axis
abline(v=mu, lwd=2, col='blue') # expected value of estimator
abline(v=mu+2*sigma/sqrt(n), lwd=2, col='blue', lty=2) # two standard errors up
abline(v=mu-2*sigma/sqrt(n), lwd=2, col='blue', lty=2) # two standard errors down
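As a rough numerical companion to the plot, the sketch below counts how many simulated sample means land within two standard errors of `mu`. The names `num_means`, `se`, and `within_2se` and the seed are illustrative assumptions, not part of the example above. For a normal population the share should come out near 95%.

```r
set.seed(1)                      # assumed seed, only for reproducibility
mu <- 2; sigma <- 5; n <- 100    # same settings as the example
num_means <- 10000               # hypothetical: more draws than the plot uses

# each row of the matrix is one sample of size n
sample_means <- rowMeans(matrix(rnorm(num_means * n, mu, sigma),
                                nrow = num_means))

se <- sigma / sqrt(n)            # standard error of the sample mean
within_2se <- mean(abs(sample_means - mu) < 2 * se)

cat('average of sample means:', signif(mean(sample_means), 3), '\n')
cat('share within 2 standard errors:', signif(within_2se, 3), '\n')
```

Using more sample means than the 100 in the plot only makes the estimated share steadier; it does not change what is being estimated.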
If the point estimator is unbiased, then the expected value of the estimator equals the parameter it estimates.
point estimate for margin of error estimate for confidence interval for
Note that
More generally we can hope to replace the point estimate
Confidence Intervals for Mean of Normal (Using )
Suppose we are using
(Or the distribution isn't
•
•
Applying this to error estimates for
•
• where
The
The
The mean
Example:
# Some values of z_alpha
cat('z.₀₅ = ' ,signif( qnorm(1-.05), 3),'\n')
cat('z.₀₂₅ = ' ,signif( qnorm(1-.025),3),'\n')
cat('z.₀₁ = ' ,signif( qnorm(1-.01), 3),'\n')
cat('z.₀₀₅ = ' ,signif( qnorm(1-.005),3),'\n')
Note that, in this case, the interval could also be described as
where
Where
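One way to see that a z interval has more than one description is to build it twice. The sketch below assumes a known `sigma` and a single simulated sample; `x_bar`, `se`, `ci_pm`, and `ci_q` are hypothetical names. It forms the interval once as the sample mean plus or minus `qnorm(1-alpha/2)` standard errors, and once as the `alpha/2` and `1-alpha/2` quantiles of a normal distribution centered at the sample mean, which is the form `qnorm` accepts directly.

```r
set.seed(2)                         # assumed seed, only for reproducibility
mu <- 2; sigma <- 5; n <- 100; alpha <- .05
x_bar <- mean(rnorm(n, mu, sigma))  # one sample mean
se <- sigma / sqrt(n)               # standard error, sigma assumed known

z <- qnorm(1 - alpha/2)             # z value for alpha/2
ci_pm <- c(x_bar - z * se, x_bar + z * se)          # mean plus or minus margin of error
ci_q  <- qnorm(c(alpha/2, 1 - alpha/2), x_bar, se)  # quantiles of N(x_bar, se^2)

print(ci_pm)
print(ci_q)
print(all.equal(ci_pm, ci_q))       # compares the two forms up to rounding error
```

The quantile form avoids computing the margin of error separately, at the cost of hiding the "estimate plus or minus margin" structure.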
Example:
The code below computes
For the
### CI.z(..) and plot_CI(..) functions
################################ z interval ############
#' Make z confidence intervals for mean of normal rv
#'
#' @ mu mean of X
#' @ sigma standard deviation of X
#' @ n sample size used for each sample mean
#' @ alpha significance level (confidence level is 1-alpha)
#' @ num_estimates number of confidence intervals to construct
#' @ data boolean: return data or return bootstrap confidence
#'
#' returns either data.frame: $x_bar, $min, $max, $in_CI
#' or number: bootstrap confidence level
#'
CI.z <- function(mu=0, sigma=1, n=100, alpha=.05, num_estimates=10000, data=FALSE) {
sample_means <- rowMeans(matrix( # generate num_estimates sample means
rnorm(n*num_estimates, mu, sigma),
nrow=num_estimates
))
CI <- data.frame(
min = qnorm( alpha/2, sample_means, sigma/sqrt(n)),
max = qnorm(1-alpha/2, sample_means, sigma/sqrt(n))
)
# T/F vector telling if CI contains actual mean
in_CI <- ((mu > CI$min) & (mu < CI$max))
if (data==TRUE) { return(cbind(x_bar=sample_means, CI, in_CI)) }
return (sum(in_CI)/num_estimates) # return bootstrap confidence
}
################################
################################ plot_CI ####################
#' plots confidence intervals spread vertically across an existing graph
#' (uses lines() and points(), so call plot() first)
#'
#' @ CI data.frame $x_bar, $min, $max, $in_CI
#' @ y.max confidence intervals plotted at heights from 0 to y.max
#'
plot_CI <- function(CI, y.max) {
n <- nrow(CI)
color <- rep('red', n) # bad CI will be red and thick
width <- rep(2, n)
color[CI$in_CI] <- 'black' # good CI will be black and thin
width[CI$in_CI] <- 1
height <- (1:n) * y.max / n # spread CI vertically across plot
for (i in 1:n) {
lines(c(CI$min[i], CI$max[i]),
c(height[i], height[i]),
col=color[i], lwd=width[i])
points(CI$x_bar[i], height[i],
col=color[i], pch=16)
}
}
Bootstrap Confidence of Interval
In the graphs above, it looks like approximately
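A compact, self-contained version of this kind of check is sketched below for several values of `alpha` at once. `coverage_z` is a hypothetical helper, not one of the functions defined above. It assumes `sigma` is known and reports the fraction of simulated intervals that contain `mu`, which should sit close to `1 - alpha`.

```r
set.seed(3)   # assumed seed, only for reproducibility

# hypothetical helper: fraction of z intervals (known sigma) that contain mu
coverage_z <- function(alpha, mu = 0, sigma = 1, n = 100, num_estimates = 10000) {
  x_bar <- rowMeans(matrix(rnorm(n * num_estimates, mu, sigma),
                           nrow = num_estimates))
  margin <- qnorm(1 - alpha/2) * sigma / sqrt(n)   # half-width of the interval
  mean(abs(x_bar - mu) < margin)                   # share of intervals covering mu
}

alphas <- c(.10, .05, .01)
coverage <- sapply(alphas, coverage_z)
print(data.frame(alpha = alphas, nominal = 1 - alphas, simulated = coverage))
```

The simulated fractions vary from run to run, so small departures from the nominal level are expected.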
Confidence Intervals for Mean of (Using )
If we don't know the variance ahead of time, then we could try replacing the variance with the sample variance.
•
Example (directly substituting sample variance for variance):
The code below copies the setup from the previous example, but now each sample mean's confidence interval uses the corresponding sample variance
Though this code is attempting to compute
Bootstrap Confidence
Looking at the graphs above, it seems that less than
As you can see, once
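How the shortfall depends on sample size can be sketched with a small simulation. `coverage_s` is a hypothetical helper, not the code used for the graphs above. It substitutes each sample's standard deviation for `sigma` while still using normal quantiles, and the sample sizes tried are arbitrary choices.

```r
set.seed(4)   # assumed seed, only for reproducibility

# hypothetical helper: coverage of x_bar +/- qnorm(1-alpha/2) * s / sqrt(n),
# where s is the sample standard deviation of each sample
coverage_s <- function(n, mu = 0, sigma = 1, alpha = .05, num_estimates = 20000) {
  samples <- matrix(rnorm(n * num_estimates, mu, sigma), nrow = num_estimates)
  x_bar <- rowMeans(samples)
  s <- apply(samples, 1, sd)                 # one sample sd per row
  margin <- qnorm(1 - alpha/2) * s / sqrt(n)
  mean(abs(x_bar - mu) < margin)             # share of intervals covering mu
}

sizes <- c(5, 10, 30, 100)
coverage <- sapply(sizes, coverage_s)
print(data.frame(n = sizes, nominal = .95, simulated = coverage))
```

With small `n` the simulated fraction typically falls noticeably below the nominal level, and the gap narrows as `n` grows.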