Tests Supplementing ANOVA
Prerequisites
One-Factor ANOVA, Multi-Factor ANOVA, Pairwise Comparisons Among Means,
Specific Comparisons Among Means
Learning Objectives
 Compute the Tukey HSD test
 Describe an interaction in words
 Describe why one might want to compute simple effect tests following a significant interaction
The null hypothesis tested in a one-factor ANOVA is that all the
population means are equal. Stated more formally,
\[ H_0: \mu_1 = \mu_2 = \cdots = \mu_k \]
where H0 is the null hypothesis and k is the number of conditions.
When the null hypothesis is rejected, all that can be said is that at
least one population mean is different from at least one other
population mean. The methods for doing more specific tests described
in the sections on All Pairwise Comparisons and on Specific Comparisons
apply here. Keep in mind that these tests are valid whether or not they
are preceded by an ANOVA.
Main Effects
As shown below, significant main effects in multi-factor
designs can be followed up in the same way as significant effects
in one-way designs. Table 1 shows the data from an imaginary
experiment with three levels of Factor A and two levels of Factor B.
Table 2 shows the ANOVA Summary Table for these
data. The significant main effect of A
indicates that, in the population, at least one of the marginal means for A is
different from at least one of the others.
The Tukey HSD can be used to test all
pairwise comparisons among means in a one-factor ANOVA as well
as comparisons among marginal means in a multi-factor ANOVA. The
formula for the equal-sample-size case is
\[ Q = \frac{M_i - M_j}{\sqrt{MSE / n}} \]
where Mi and Mj are
marginal means, MSE is the mean square error from the ANOVA,
and n is the number of scores each mean is based upon. For this
example, MSE = 1.847 and n = 8 because there are eight scores at
each level of A. The probability value can be computed using the Studentized
Range Calculator. The degrees of freedom are equal to the degrees
of freedom error. For this example, df = 18. The results of the
Tukey HSD test are shown in Table 3. The mean for A1 is
significantly lower than the mean for A2 and
the mean for A3. The means for A2 and A3 are
not significantly different.
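The Q statistics behind these conclusions can be recomputed with a short script. This is a minimal sketch that uses only the values quoted in this section: MSE = 1.847, n = 8, and the marginal means 5.125, 7.375, and 7.875 for A1, A2, and A3 that appear in the specific-comparison example. The function name tukey_q is illustrative. Turning each Q into a probability value still requires the studentized range distribution with three means and 18 degrees of freedom, which this sketch does not compute.

```python
import math
from itertools import combinations

# Values quoted in the text; the mapping of means to levels follows
# the specific-comparison example in this section.
marginal_means = {"A1": 5.125, "A2": 7.375, "A3": 7.875}
mse = 1.847  # mean square error from the ANOVA
n = 8        # number of scores each marginal mean is based on


def tukey_q(m_i, m_j, mse, n):
    """Studentized range statistic for two means (equal sample sizes)."""
    return (m_i - m_j) / math.sqrt(mse / n)


# One Q value per pair of levels, keyed by (level_i, level_j).
q_values = {
    (a, b): tukey_q(marginal_means[a], marginal_means[b], mse, n)
    for a, b in combinations(marginal_means, 2)
}

for (a, b), q in q_values.items():
    print(f"{a} vs {b}: Q = {q:.2f}")
```

The sign of Q only reflects which mean is listed first; the probability value is based on its absolute value.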
Specific comparisons among means are also carried
out in much the same way as shown in the relevant
section on testing means. The formula for L is
\[ L = \sum c_i M_i \]
where ci is the coefficient
for the ith marginal
mean and Mi is the ith marginal
mean. For example, to compare A1 with the average of A2 and A3,
the coefficients would be -1, 0.5, 0.5. Therefore,
L = (-1)(5.125) + (0.5)(7.375) + (0.5)(7.875)
= 2.5
To compute t, use:
\[ t = \frac{L}{\sqrt{\dfrac{\sum c_i^2 \, MSE}{n}}} = 4.25 \]
where MSE is the mean square error from the ANOVA
and n is the number of scores each marginal mean is based on (eight
in this example). The degrees of freedom are the degrees of freedom
error from the ANOVA and are equal to 18. Using the Online
Calculator, we find that the two-tailed probability value is
0.0005. Therefore, the difference between A1 and the average
of A2 and A3 is significant.
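The arithmetic for L and t can be checked with the sketch below. It assumes the coefficient for A1 is -1, which is the value that reproduces L = 2.5, and it uses the usual standard error for a comparison among means with equal sample sizes, the square root of (sum of squared coefficients × MSE / n). The function names are illustrative.

```python
import math

means = [5.125, 7.375, 7.875]  # marginal means for A1, A2, A3
coeffs = [-1.0, 0.5, 0.5]      # A1 versus the average of A2 and A3
mse = 1.847                    # mean square error from the ANOVA
n = 8                          # scores per marginal mean


def contrast_value(coeffs, means):
    """L: the sum of each coefficient times its marginal mean."""
    return sum(c * m for c, m in zip(coeffs, means))


def contrast_t(coeffs, means, mse, n):
    """t for a specific comparison, assuming equal n per mean."""
    standard_error = math.sqrt(sum(c ** 2 for c in coeffs) * mse / n)
    return contrast_value(coeffs, means) / standard_error


L = contrast_value(coeffs, means)
t = contrast_t(coeffs, means, mse, n)
print(f"L = {L:.3f}, t = {t:.2f}")
```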
Important issues concerning multiple comparisons
and orthogonal comparisons are discussed in the Specific
Comparisons section in the Testing
Means chapter.
Interactions
The presence of a significant interaction makes
the interpretation of the results more complicated. Since an
interaction means that the simple effects are different, the main effect as the mean of
the simple effects does not tell the whole story. This section
discusses how to describe interactions, proper and improper uses
of simple effects tests, and how to test components of interactions.
Describing Interactions
A crucial first step in understanding
a significant interaction is constructing an interaction plot. Figure 1 shows an interaction
plot from data presented in the section on Multi-Factor
ANOVA.
The second step is to describe the interaction
in a clear and understandable way. This is often done by describing
how the simple effects differed. Since this should
be done using as little jargon as possible, the term "simple
effect" need not appear in the description. An example is as
follows:
The effect of Outcome differed depending on the subject's
self-esteem. The difference between the attributions to self
following success and attributions to self following failure
was larger for high-self-esteem subjects (mean difference =
2.50) than for low-self-esteem subjects (mean difference =
-2.33).
No further analyses are helpful in understanding
the interaction since the interaction means only that the simple
effects differ. The interaction's significance indicates that
the simple effects differ from each other, but provides no information
about whether they differ from zero.
Simple Effect Tests
It is not necessary to know
whether the simple effects differ from zero in order to understand
an interaction because the question of whether simple effects
differ from zero has nothing to do with interaction except
that if they are both zero there is no interaction. It is
not uncommon to see research articles in which the authors report
that they analyzed simple effects in order to explain the interaction.
However, this is not correct, since an interaction does
not depend on the analysis of the simple effects.
However, there is a reason to test simple effects
following a significant interaction. Since an interaction indicates
that simple effects differ, it means that the main effects are
not general. In the made-up example, the main effect of Outcome
is not very informative, and the effect of Outcome
should be considered separately for high- and low-self-esteem
subjects.
As will be seen, the simple effects of Outcome
are significant and in opposite directions: Success significantly
increases attribution to self for high-self-esteem subjects and
significantly lowers attribution to self for low-self-esteem
subjects. This is a very easy result to interpret.
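A small calculation shows why the main effect is uninformative here. The sketch below uses the two mean differences reported in the example and assumes the low-self-esteem difference is negative (-2.33), because the simple effects are described as being in opposite directions. It treats the main effect as the mean of the simple effects, as stated earlier in this section, and the interaction as the difference between them.

```python
# Mean differences (success minus failure) reported in the example.
# The negative sign for the low-self-esteem group is an assumption
# based on the effects being described as in opposite directions.
simple_effects = {"high_self_esteem": 2.50, "low_self_esteem": -2.33}

# Main effect of Outcome: the mean of the simple effects.
main_effect = sum(simple_effects.values()) / len(simple_effects)

# Interaction: how much the simple effects differ from each other.
interaction_contrast = (
    simple_effects["high_self_esteem"] - simple_effects["low_self_esteem"]
)

print(f"main effect of Outcome: {main_effect:.3f}")
print(f"difference between simple effects: {interaction_contrast:.2f}")
```

The main effect is close to zero even though each simple effect is sizable, which is why the effect of Outcome is better described separately for each self-esteem group.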
What
would the interpretation have been if neither simple effect
had been significant? On the surface, this seems impossible:
How can the simple effects both be zero if they differ from each
other significantly as tested by the interaction? The answer
is that a nonsignificant simple effect does not mean that the
simple effect is zero: the null hypothesis should not be accepted
just because it is not rejected
(see the section on Interpreting
Non-Significant Results).
If neither simple effect is
significant, the conclusion should be that the simple effects
differ, and that at least one of them is not zero. However,
no conclusion should be drawn about which simple effect(s) is/are
not zero.
Another error that can be made by mistakenly accepting
the null hypothesis is to conclude that two simple effects are
different because one is significant and the other is not. Consider
the results of an imaginary experiment in which the researcher
hypothesized that addicted people would show a larger increase
in brain activity following some treatment than would non-addicted
people. In other words, the researcher hypothesized that addiction
status and treatment would interact. The results
shown in Figure 2 are very much in line with the hypothesis.
However, the test of the interaction resulted in a probability
value of 0.08, a value not quite low enough to be significant
at the conventional 0.05 level. The proper conclusion is that
the experiment supports the researcher's hypothesis, but not
strongly enough to allow a firm conclusion.
Unfortunately, the researcher was not satisfied
with such a weak conclusion and went on to test the simple effects.
It turned out that the effect of Treatment was significant for
the Addicted group (p = 0.02) but not significant for the Non-Addicted
group (p = 0.09). The researcher then went on to conclude that
since there is an effect of Treatment for the Addicted group
but not for the Non-Addicted group, the hypothesis of a greater
effect for the former than for the latter group is demonstrated.
This is faulty logic, however, since it is based on accepting
the null hypothesis that the simple effect of Treatment is zero
for the Non-Addicted group just because it is not significant.
Components of Interaction (optional)
Figure 3 shows the results of an imaginary
experiment on diet and weight loss. A control group and two diets were used
for both overweight teens and overweight adults.
The difference between Diet A and the Control
diet was essentially the same for teens
and adults whereas the difference between Diet B and Diet A was
much larger for the Teens than it was for the Adults. Over one
portion of the graph the lines are parallel whereas over another
portion they are not. It is possible to test these portions or
components of interactions using the method of specific comparisons
discussed previously. The difference between Teens
and Adults on the difference between Diets A and B could be tested
with the coefficients shown in Table 4. Naturally, the same
considerations regarding multiple comparisons and orthogonal comparisons apply
to comparisons involving components of interaction as apply
to other comparisons among means.
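As an illustration of how such a component can be expressed as a specific comparison, the sketch below builds coefficients for the six cells of a 2 (Age) x 3 (Diet) design. Both the coefficients and the cell means are assumptions made for illustration: Table 4 and the data behind Figure 3 are not reproduced here, and the cell means are invented only so that they follow the pattern described above.

```python
# Hypothetical cell means (weight loss); NOT the data behind Figure 3.
cell_means = {
    ("Teens", "Control"): 2.0, ("Teens", "Diet A"): 5.0, ("Teens", "Diet B"): 11.0,
    ("Adults", "Control"): 1.0, ("Adults", "Diet A"): 4.0, ("Adults", "Diet B"): 6.0,
}

# Assumed coefficients: (Diet B - Diet A) for Teens minus the same for Adults.
b_vs_a_by_age = {
    ("Teens", "Control"): 0, ("Teens", "Diet A"): -1, ("Teens", "Diet B"): 1,
    ("Adults", "Control"): 0, ("Adults", "Diet A"): 1, ("Adults", "Diet B"): -1,
}

# Assumed coefficients: (Diet A - Control) for Teens minus the same for Adults.
a_vs_control_by_age = {
    ("Teens", "Control"): -1, ("Teens", "Diet A"): 1, ("Teens", "Diet B"): 0,
    ("Adults", "Control"): 1, ("Adults", "Diet A"): -1, ("Adults", "Diet B"): 0,
}


def contrast_value(coeffs, means):
    """L: the sum of each coefficient times its cell mean."""
    return sum(coeffs[cell] * means[cell] for cell in means)


L_b_vs_a = contrast_value(b_vs_a_by_age, cell_means)
L_a_vs_control = contrast_value(a_vs_control_by_age, cell_means)
print(f"B vs A component: L = {L_b_vs_a}")
print(f"A vs Control component: L = {L_a_vs_control}")
```

With these invented means, the A-versus-Control component is zero (the parallel portion of the lines) while the B-versus-A component is not. Each L would then be tested with the t formula for specific comparisons, using the MSE and per-cell n from the ANOVA.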
