$2^k$ Factorial Experiments ("Screening Experiments")
Because this is only an initial test for a large number of factors, we will consider only two levels for each factor.
Model Change
Since we have only two levels for each factor, we will modify our point of view. We will think of the two levels as a "control" (or "low") level and a "treatment" (or "high") level. Previously we defined treatment effects as deviations away from the global mean; now we will define them as changes away from the "control" value. Numerically this means that our effects are twice as big as they were before (since the mean lies halfway between the "control" and "treatment" values).
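To see the factor of two concretely, here is a one-factor sketch of that claim, writing $\mu_-$ and $\mu_+$ for the mean responses at the two levels:

$$\bar{\mu} = \frac{\mu_- + \mu_+}{2}, \qquad \underbrace{\mu_+ - \bar{\mu}}_{\text{old effect}} = \frac{\mu_+ - \mu_-}{2}, \qquad \underbrace{\mu_+ - \mu_-}_{\text{new effect}} = 2\,(\mu_+ - \bar{\mu}).$$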
We will use the notation "
In the notation from before, our observations are written $y_{i_1 i_2 \cdots i_k,\,r}$, where each index $i_j \in \{-,+\}$ gives the level of the $j$-th factor and $r$ indexes the replicate. For example (with $k=3$): $y_{---,\,r}$ are the replicates of the control. $y_{+--,\,3}$ is the third replicate observation where only treatment $A$ is applied (and other treatments are at their "control" level). $y_{++-,\,2}$ is the second replicate observation where only treatments $A$ and $B$ are applied.
In our notation from before, averages would be written with subscripts and superscripts indicating the factor and level ("control" $=-$ or "treatment" $=+$) to fix during the averaging. For example: $\bar{y}_{A^+}$ averages over the observations where the treatment $A$ was applied; $\bar{y}_{A^-}$ averages over the observations where the treatment $A$ was at "control" level; $\bar{y}_{A^+ B^-}$ averages over observations where $A$ was applied but $B$ was not.
In the work below we will use sums rather than averages. We will use this same notation, without the "bar".
Yates Notation
One of the big obstacles in dealing with a large number of factors is the logistical bookkeeping required to organize and perform tests for interactions on the resulting large dataset. A beautiful organizational method was developed by F. Yates in the 1930s to help with this. Yates' notation was designed to help with computations by hand... so it isn't quite as important now that analysis is done by computer, but it is still standard for discussing and organizing data in $2^k$ experiments.
Suppose that the factors are each indicated by some (upper-case) letter: $A$, $B$, $C$, etc. A combination of treatments is then denoted by the product of the corresponding lower-case letters, with $(1)$ standing for the all-"control" combination.
For example, "$ab$" denotes the combination where treatments $A$ and $B$ are applied and all other factors are at their "control" level.
There's a fairly standard picture of the "geometric" view for 3 factors that you can quickly find by a Google search: the eight combinations $(1), a, b, ab, c, ac, bc, abc$ are drawn at the vertices of a cube.
In general, we are looking at vertices of a $k$-dimensional cube.
In analysis, we'll reuse this notation... we will use the product-of-lower-case-letters notation to represent the sum of all replicates of a certain type. So $ab$ denotes the sum of all replicate observations where treatments $A$ and $B$ (and no others) are applied, and $(1)$ denotes the sum of the control replicates.
We also use a product of upper-case letters to represent main and interaction effects themselves ($A$ is the main effect of factor $A$, $AB$ is the interaction effect of $A$ and $B$, and so on).
Contrasts
We want to look at sums of squares and mean squares. But it is helpful to introduce new notation for a "half-way" step first. The main benefit of this is to get simpler formulas for the interaction sums of squares (which were previously somewhat tedious).
Main Contrast
The contrast for a main effect is the total difference in value due to applying a treatment.
This can be expressed nicely in terms of the lower-case notation from earlier. For example (a numeric sketch follows the list):
- For one factor $A$: $\text{Contrast}_A = a - (1)$
- For two factors $A$ and $B$: $\text{Contrast}_A = ab + a - b - (1)$
- For three factors $A$, $B$, and $C$: $\text{Contrast}_A = abc + ab + ac + a - bc - b - c - (1)$
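For concreteness, here's a quick numeric sketch in R with made-up replicate totals for a $2^2$ experiment:

```r
## Hypothetical replicate totals for a 2^2 experiment
tot <- c("(1)" = 40, "a" = 55, "b" = 42, "ab" = 60)
## Main contrast for A: (ab + a) - (b + (1))
contrast_A <- (tot["ab"] + tot["a"]) - (tot["b"] + tot["(1)"])
unname(contrast_A)   # 33
```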
Two-Way Interaction Contrast
The contrast for a two-way interaction effect is the difference of differences.
It is not hard to show that the order doesn't matter: the difference of $A$-differences across the levels of $B$ equals the difference of $B$-differences across the levels of $A$. Using Yates notation this looks like the following (checked numerically below the list).
- For two factors $A$ and $B$: $\text{Contrast}_{AB} = ab - a - b + (1)$
- For three factors $A$, $B$, and $C$: $\text{Contrast}_{AB} = abc + ab - ac - a - bc - b + c + (1)$
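We can check the difference-of-differences claim on the same made-up $2^2$ totals (redefined here so the chunk runs on its own):

```r
## Same hypothetical totals as before
tot <- c("(1)" = 40, "a" = 55, "b" = 42, "ab" = 60)
## difference of A-differences across the levels of B ...
(tot["ab"] - tot["b"]) - (tot["a"] - tot["(1)"])   # 3
## ... equals the signed-sum form of Contrast_AB
tot["ab"] - tot["a"] - tot["b"] + tot["(1)"]       # 3
```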
Three-Way Interaction (and Higher) Contrast
Contrasts for three-way and higher interactions are defined similarly. The three-way interaction contrast is the triple difference.
Note that the sign of each term is the product of the signs ($-$ for "control", $+$ for "treatment") of the factors named in the contrast.
For example:
- For three factors $A$, $B$, and $C$: $\text{Contrast}_{ABC} = abc - ab - ac + a - bc + b + c - (1)$
- For four factors $A$, $B$, $C$, and $D$: $\text{Contrast}_{ABCD} = abcd - abc - abd - acd - bcd + ab + ac + ad + bc + bd + cd - a - b - c - d + (1)$
Converting Contrast to Effect
The main and interaction effects can be computed from contrasts (since these are the average value of the corresponding differences):

$$A = \frac{\text{Contrast}_A}{n\,2^{k-1}}, \qquad AB = \frac{\text{Contrast}_{AB}}{n\,2^{k-1}}, \qquad \text{etc.}$$

Here $n$ is the number of replicates, so $n\,2^{k-1}$ counts the differences being averaged.
Converting Contrast to Sum of Squares
Looking back at our earlier work, we can connect contrasts to the sum of squares values (treatment and interaction effects) from before:

$$SS_A = \frac{(\text{Contrast}_A)^2}{n\,2^{k}}, \qquad SS_{AB} = \frac{(\text{Contrast}_{AB})^2}{n\,2^{k}}, \qquad \text{etc.}$$

(Recall that the total number of observations is $n \cdot 2^k$.)
This is very nice, because high-order interaction sums of squares were complicated to compute in our previous formulation.
Note: each of these has only $1$ degree of freedom (each factor has only two levels, so each effect is a single number).
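Continuing the made-up $2^2$ numbers from above (say with $n = 10$ replicates), the conversions look like:

```r
## Hypothetical: contrast_A = 33 from the 2^2 example, n = 10 replicates
n <- 10; k <- 2
contrast_A <- 33
effect_A <- contrast_A / (n * 2^(k-1))   # = 1.65, average change due to applying A
SS_A     <- contrast_A^2 / (n * 2^k)     # = 27.225, sum of squares for A (1 df)
```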
The Design Matrix
The benefit of using contrasts as an intermediate step is that there is a simple matrix multiplication rule which can be used to quickly compute all of the contrasts for your dataset! Especially for the interaction effects, this is a great time-saver.
Each contrast is a sum over all of the combinations of levels of factors. For a main contrast, like $\text{Contrast}_A$, the coefficient of a combination is $+1$ when treatment $A$ is applied and $-1$ when $A$ is at "control".
For example, with three factors:
- For $A$ applied, observations $a$, $ab$, $ac$, $abc$ are all $+1$
- For $A$ at "control", observations $(1)$, $b$, $c$, $bc$ are all $-1$
For two-way interaction contrasts, the coefficient of a term is the product of the coefficients for the two corresponding main contrasts.
For example, for $\text{Contrast}_{AB}$ with three factors:
- Observations $(1)$, $ab$, $c$, $abc$ are all $+1$
- Observations $a$, $b$, $ac$, $bc$ are all $-1$
Three-way and above also have coefficients given by products of the main contrast coefficients.
If we organize our work carefully then the coefficients make a pretty table.
Levels | $A$ | $B$ | $C$ |
---|---|---|---|
$(1)$ | $-1$ | $-1$ | $-1$ |
$a$ | $+1$ | $-1$ | $-1$ |
$b$ | $-1$ | $+1$ | $-1$ |
$ab$ | $+1$ | $+1$ | $-1$ |
$c$ | $-1$ | $-1$ | $+1$ |
$ac$ | $+1$ | $-1$ | $+1$ |
$bc$ | $-1$ | $+1$ | $+1$ |
$abc$ | $+1$ | $+1$ | $+1$ |

The column under each factor lists the coefficients of that factor's main contrast.
Note: in each column above, half of the numbers are $+1$ and half are $-1$, so each column sums to $0$.
Note: the columns above are mutually orthogonal -- the dot product of any column with another is $0$.
Once we have these coefficients, we can multiply them to get coefficients for the interaction effects.
Levels | $A$ | $B$ | $C$ | $AB$ | $AC$ | $BC$ | $ABC$ |
---|---|---|---|---|---|---|---|
$(1)$ | $-1$ | $-1$ | $-1$ | $+1$ | $+1$ | $+1$ | $-1$ |
$a$ | $+1$ | $-1$ | $-1$ | $-1$ | $-1$ | $+1$ | $+1$ |
$b$ | $-1$ | $+1$ | $-1$ | $-1$ | $+1$ | $-1$ | $+1$ |
$ab$ | $+1$ | $+1$ | $-1$ | $+1$ | $-1$ | $-1$ | $-1$ |
$c$ | $-1$ | $-1$ | $+1$ | $+1$ | $-1$ | $-1$ | $+1$ |
$ac$ | $+1$ | $-1$ | $+1$ | $-1$ | $+1$ | $-1$ | $-1$ |
$bc$ | $-1$ | $+1$ | $+1$ | $-1$ | $-1$ | $+1$ | $-1$ |
$abc$ | $+1$ | $+1$ | $+1$ | $+1$ | $+1$ | $+1$ | $+1$ |
The comments above about sum of column values and orthogonality of columns extend to the entire table of products. Although we haven't expressed it this way, from a linear algebra point of view, converting to contrasts is equivalent to making a change of basis.
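Since we're claiming orthogonality, here's a quick R check that builds the $2^3$ table (with a column of $1$s thrown in for the grand total) and verifies the change-of-basis claim:

```r
## Build the 2^3 coefficient table and check that the columns are orthogonal
M <- as.matrix(expand.grid(A = c(-1,1), B = c(-1,1), C = c(-1,1)))
M <- cbind(I = 1, M,
           AB = M[,"A"]*M[,"B"], AC = M[,"A"]*M[,"C"],
           BC = M[,"B"]*M[,"C"], ABC = M[,"A"]*M[,"B"]*M[,"C"])
t(M) %*% M   # = 8 * identity matrix: the columns form an orthogonal basis of R^8
```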
If we wanted to, I'm pretty sure we could express this entire chapter in terms of discrete Fourier transforms. In fact, if we had a massive amount of data, then we could probably speed up the computation of all contrasts by using something like a fast Fourier transform. ... though you'd need a really ridiculous amount of data or factors in order for anyone to care about this, given current computer speeds.
```r
## 2^3 Factorial design matrix
F <- sapply(0:7, function(x) as.numeric(intToBits(x)))
## contrast coefficient table is binary expansion
A <- F[1,]*2 - 1
B <- F[2,]*2 - 1
C <- F[3,]*2 - 1
## coefficients for interactions are products
AB <- A * B
AC <- A * C
BC <- B * C
ABC <- A * B * C
## full contrast coefficient table
data.frame(A,B,C,AB,AC,BC,ABC,
row.names=c("1","a","b","ab","c","ac","bc","abc"))
## Rows are:
# cba
#------
# 000 -> (1)
# 001 -> a
# 010 -> b
# 011 -> ab
# 100 -> c
# 101 -> ac
# 110 -> bc
# 111 -> abc
## We can also construct this table using arrays
fA <- array(c(-1,1), dim=rep(2,3))
fB <- aperm(array(c(-1,1), dim=rep(2,3)), c(3,1,2))
fC <- aperm(array(c(-1,1), dim=rep(2,3)), c(2,3,1))
## full contrast coefficient table
data.frame(A = as.vector(fA),
B = as.vector(fB),
C = as.vector(fC),
AB = as.vector(fA*fB),
AC = as.vector(fA*fC),
BC = as.vector(fB*fC),
ABC= as.vector(fA*fB*fC),
row.names=c("1","a","b","ab","c","ac","bc","abc"))
Once we have vectors of coefficients, we can use dot products with our data in order to make contrasts, which can then be converted to mean square values and used for tests of significance.
Example:
```r
## Generate some data... let's do something with 4 factors
control <- 5 # control value
tA <- 2/2 # treatment A effect
tB <- 2/2 # treatment B effect
tC <- 0 # treatment C effect (nonexistent)
tD <- 2/2 # treatment D effect
tAB <- 2/2 # AB interaction
tAD <- 0 # no AD interaction
tBD <- 2/2 # BD interaction
tABD<- 0 # no ABD interaction
# maybe all of the other interactions are 0...
n <- 50 # let's use 50 replicates
## Combine control and effects to get expected values
Y <- array(control, rep(2,4)) + # 4 dimensions, each with 2 values
array(c(-tA,tA), rep(2,4)) + # A
aperm(array(c(-tB,tB), rep(2,4)), c(4,1,2,3)) + # B
aperm(array(c(-tC,tC), rep(2,4)), c(3,4,1,2)) + # C
aperm(array(c(-tD,tD), rep(2,4)), c(2,3,4,1)) + # D
array(c(tAB,-tAB,-tAB,tAB), rep(2,4)) + # AB
aperm(array(c(tAD,-tAD,-tAD,tAD), rep(2,4)), c(1,3,4,2)) + # AD
aperm(array(c(tBD,-tBD,-tBD,tBD), rep(2,4)), c(3,1,4,2)) # BD
## Now repeat for the n replicates and add error
Y <- as.vector(rep(Y, n)) + rnorm(n*16, 0, 3)
## Make factor vectors
A <- array(c(-1,1), dim=rep(2,4))
B <- aperm(array(c(-1,1), dim=rep(2,4)), c(4,1,2,3))
C <- aperm(array(c(-1,1), dim=rep(2,4)), c(3,4,1,2))
D <- aperm(array(c(-1,1), dim=rep(2,4)), c(2,3,4,1))
# include replicates
A <- as.vector(rep(A, n))
B <- as.vector(rep(B, n))
C <- as.vector(rep(C, n))
D <- as.vector(rep(D, n))
data <- data.frame(Y,A,B,C,D)
data
## Now we can compute contrasts!
contrasts <-
data.frame(A=sum(Y*A), B=sum(Y*B), C=sum(Y*C), D=sum(Y*D),
AB=sum(Y*A*B), AC=sum(Y*A*C), AD=sum(Y*A*D),
BC=sum(Y*B*C), BD=sum(Y*B*D), CD=sum(Y*C*D) )
## Ugh.. this is a lot of contrasts, let's not include any three-way or higher ones...
cat('contrasts:')
contrasts
## We can also compute effects and mean squares!
cat('computed effects:')
contrasts / (n*8)
cat('mean squares:')
contrasts^2 / (n*16)
```
Tests for Significance
Once we've computed effects and mean squares there are two different (equivalent) tests we could use to check for significance.
We can do the usual ANOVA $F$-test, or an equivalent $t$-test on each effect.
F-test for significance
For the $F$-test we compare $F = MS_{\text{effect}}/MSE$ to an $F$ distribution with $1$ and $2^k(n-1)$ degrees of freedom (each effect has $1$ degree of freedom, so $MS = SS$ for every effect).
The sum of squares table looks like
Source | df | SS | MS | F | p-val |
---|---|---|---|---|---|
$A$ | $1$ | $SS_A$ | $MS_A = SS_A$ | $MS_A/MSE$ | `1-pf(F, 1, 2^k*(n-1))` |
$B$ | $1$ | $SS_B$ | $MS_B = SS_B$ | $MS_B/MSE$ | `1-pf(F, 1, 2^k*(n-1))` |
$AB$ | $1$ | $SS_{AB}$ | $MS_{AB} = SS_{AB}$ | $MS_{AB}/MSE$ | `1-pf(F, 1, 2^k*(n-1))` |
$\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
Error | $2^k(n-1)$ | $SSE$ | $MSE$ | | |
t-test for significance
Since our factor values are numerical ($\pm 1$), we can treat the analysis as a regression of $Y$ on the coefficient columns of the design matrix. For each effect, the test statistic is

$$t = \frac{\text{Effect}}{\sqrt{MSE/(n\,2^{k-2})}},$$

where $MSE$ is the mean square error. Instead of doing an $F$-test, we can do a $t$-test of whether each effect is $0$.

(Recall that $t^2 = F$ when the numerator of $F$ has $1$ degree of freedom, so the two tests are equivalent.)
```r
coeffs <- matrix(c(A,B,C,D,A*B,A*C,A*D,B*C,B*D,C*D,A*B*C,A*B*D,A*C*D,B*C*D,A*B*C*D),
ncol=16-1)
contrasts <- apply(t(coeffs) %*% Y, 1, sum)
SS <- contrasts^2 / (n*16)
SSE <- (n*16-1)*sd(Y)^2 - sum(SS)
MSE <- SSE / (16*(n-1))
aov <- data.frame(
row.names=c('A','B','C','D','AB','AC','AD','BC','BD','CD','ABC','ABD','ACD','BCD','ABCD','Error'),
df =c(1 , 1, 1, 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , (n-1)*(2^4)),
'Sum Sq' =c(SS, SSE),
'Mean Sq'=c(SS, MSE),
'F value'=c(SS/MSE,''),
'p-value'=c(1-pf(SS/MSE,1,16*(n-1)),'')
)
aov
```
... of course, we could also do this with a single call to aov(..) in R.
```r
summary( aov( Y ~ A + B + C + D + A*B + A*C + A*D + B*C + B*D + C*D + A*B*C + A*B*D + A*C*D + B*C*D + A*B*C*D,
              data=data) )
```
This is equivalent to a regression:

```r
summary( lm( Y ~ A + B + C + D + A*B + A*C + A*D + B*C + B*D + C*D + A*B*C + A*B*D + A*C*D + B*C*D + A*B*C*D,
             data=data) )
```
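As a sanity check on the $t^2 = F$ claim, here's a short sketch (reusing `Y`, `A`, `n`, and `MSE` from the code above) that computes the $t$ statistic for effect $A$ by hand:

```r
## t statistic for effect A, using objects computed above (k = 4)
effect_A <- sum(Y*A) / (n*8)                 # contrast / (n 2^(k-1))
t_A <- effect_A / sqrt(MSE / (n * 2^(4-2)))  # se(effect) = sqrt(MSE/(n 2^(k-2)))
t_A^2                                        # should match the F value for A
```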