One-Way Analysis of Variance (ANOVA)

Researchers use one-way analysis of variance in controlled experiments to test for significant differences among group means. This lesson explains when, why, and how to use one-way analysis of variance. The discussion covers fixed-effects models and random-effects models.

Note: One-way analysis of variance is also known as simple analysis of variance or as single-factor analysis of variance.

When to Use One-Way ANOVA

You should only use one-way analysis of variance when you have the right data from the right experimental design.

Experimental Design

One-way analysis of variance should only be used with one type of experimental design - a completely randomized design with one factor (also known as a single-factor, independent groups design). This design is distinguished by the following attributes:

  • The design has one, and only one, factor (i.e., one independent variable) with two or more levels.
  • Each treatment group is defined by a unique, non-overlapping level of the factor.
  • The design has k treatment groups, where k is greater than one.
  • Experimental units are randomly selected from a known population.
  • Each experimental unit is randomly assigned to one, and only one, treatment group.
  • Each experimental unit provides one dependent variable score.

Data Requirements

One-way analysis of variance requires that the dependent variable be measured on an interval scale or a ratio scale. In addition, you need to know three things about the experimental design:

  • k = Number of treatment groups
  • nj = Number of subjects assigned to Group j (i.e., number of subjects that receive treatment j)
  • Xi,j = The dependent variable score for the ith subject in Group j

For example, the table below shows the critical information that a researcher would need to conduct a one-way analysis of variance, given a typical single-factor, independent groups design:

Group 1   Group 2   Group 3
X1,1      X1,2      X1,3
X2,1      X2,2      X2,3
X3,1                X3,3
                    X4,3

The design has three treatment groups (k =3). Nine subjects have been randomly assigned to the groups: three subjects to Group 1 (n1 = 3), two subjects to Group 2 (n2 = 2), and four subjects to Group 3 (n3 = 4). The dependent variable score is X1,1 for the first subject in Group 1; X1,2 for the first subject in Group 2; X1,3 for the first subject in Group 3; X2,1 for the second subject in Group 1; and so on.
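
To make the notation concrete, here is a minimal sketch (in Python, using hypothetical scores) of how the data for such a design might be organized: one list of scores per treatment group, with unequal group sizes allowed.

    # Hypothetical scores for a single-factor, independent groups design.
    # Each inner list holds the dependent variable scores for one group.
    groups = [
        [4.0, 6.0, 5.0],            # Group 1: n1 = 3 subjects
        [7.0, 9.0],                 # Group 2: n2 = 2 subjects
        [3.0, 2.0, 4.0, 5.0],       # Group 3: n3 = 4 subjects
    ]

    k = len(groups)                  # number of treatment groups
    n_j = [len(g) for g in groups]   # group sample sizes
    n = sum(n_j)                     # total sample size

    print(k, n_j, n)                 # 3 [3, 2, 4] 9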

Assumptions of ANOVA

One-way analysis of variance makes three assumptions about dependent variable scores:

  • Independence. The dependent variable score for each experimental unit is independent of the score for any other unit.
  • Normality. In the population, dependent variable scores are normally distributed within treatment groups.
  • Equality of variance. In the population, the variance of dependent variable scores in each treatment group is equal. (Equality of variance is also known as homogeneity of variance or homoscedasticity.)

The assumption of independence is the most important assumption. When that assumption is violated, the resulting statistical tests can be misleading. This assumption is tenable when (a) experimental units are randomly sampled from the population and (b) sampled units are randomly assigned to treatments.

With respect to the other two assumptions, analysis of variance is more forgiving. Violations of normality are less problematic when the sample size is large. And violations of the equal variance assumption are less problematic when the sample size within groups is equal.

Before conducting an analysis of variance, it is best practice to check for violations of the normality and homogeneity-of-variance assumptions.
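
As one illustration, the sketch below (Python with scipy, hypothetical data) checks normality within each group with the Shapiro-Wilk test and equality of variance with Levene's test. These particular tests are just one reasonable choice for screening the assumptions, not the only way to do it.

    from scipy import stats

    # Hypothetical scores, one list per treatment group.
    groups = [
        [4.0, 6.0, 5.0, 7.0, 6.5],
        [7.0, 9.0, 8.0, 7.5, 8.5],
        [3.0, 2.0, 4.0, 5.0, 3.5],
    ]

    # Normality within each group (Shapiro-Wilk test).
    for j, g in enumerate(groups, start=1):
        w, p = stats.shapiro(g)
        print(f"Group {j}: Shapiro-Wilk W = {w:.3f}, p = {p:.3f}")

    # Homogeneity of variance across groups (Levene's test).
    stat, p = stats.levene(*groups)
    print(f"Levene statistic = {stat:.3f}, p = {p:.3f}")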

Why to Use One-Way ANOVA

Researchers use one-way analysis of variance to assess the effect of one independent variable on one dependent variable. The analysis answers two research questions:

  • Is the mean score in any treatment group significantly different from the mean score in another treatment group?
  • What is the magnitude of the effect of the independent variable on the dependent variable?

Notice that analysis of variance tells us whether treatment groups differ significantly, but it doesn't tell us how the groups differ. Understanding how the groups differ requires additional analysis.

How to Use One-Way ANOVA

To implement one-way analysis of variance with a single-factor, independent groups design, a researcher takes the following steps:

  • Specify a mathematical model to describe the causal factors that affect the dependent variable.
  • Write statistical hypotheses to be tested by experimental data.
  • Specify a significance level for a hypothesis test.
  • Compute the grand mean and the mean scores for each group.
  • Compute sums of squares for each effect in the model.
  • Find the degrees of freedom associated with each effect in the model.
  • Based on sums of squares and degrees of freedom, compute mean squares for each effect in the model.
  • Find the expected value of the mean squares for each effect in the model.
  • Compute a test statistic, based on observed mean squares and their expected values.
  • Find the P value for the test statistic.
  • Accept or reject the null hypothesis, based on the P value and the significance level.
  • Assess the magnitude of the effect of the independent variable, based on sums of squares.

Whew! Altogether, the steps to implement one-way analysis of variance may look challenging, but each step is simple and logical. That makes the whole process easy to implement, if you just focus on one step at a time. So let's go over each step, one-by-one.
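
Before walking through the steps by hand, note that the whole calculation can be cross-checked with library code. Here is a minimal sketch using scipy and hypothetical scores (the same scores reused in the sketches later in this lesson):

    from scipy import stats

    # Hypothetical scores, one list per treatment group.
    group1 = [4.0, 6.0, 5.0]
    group2 = [7.0, 9.0]
    group3 = [3.0, 2.0, 4.0, 5.0]

    # One-way ANOVA: returns the F ratio and its P-value.
    f_ratio, p_value = stats.f_oneway(group1, group2, group3)
    print(f"F = {f_ratio:.2f}, P = {p_value:.4f}")   # approximately F = 9.00, P = 0.0156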

Mathematical Model

For every experimental design, there is a mathematical model that accounts for all of the independent and extraneous variables that affect the dependent variable.

Fixed Effects

For example, here is the fixed-effects mathematical model for a completely randomized design:

X i j = μ + β j + ε i ( j )

where X i j is the dependent variable score for subject i in treatment group j, μ is the population mean, β j is the treatment effect in group j; and ε i ( j ) is the effect of all other extraneous variables on subject i in treatment j.

For this model, it is assumed that ε i ( j ) is normally and independently distributed with a mean of zero and a variance of σε2. The mean ( μ ) is constant.

Note: The parentheses in ε i ( j ) indicate that subjects are nested under treatment groups. When a subject is assigned to only one treatment group, we say that the subject is nested under a treatment.
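
To see what the fixed-effects model says about the data-generating process, here is a small simulation sketch (Python with numpy, hypothetical parameter values): each score is the population mean, plus the fixed treatment effect for its group, plus normally distributed error.

    import numpy as np

    rng = np.random.default_rng(0)

    mu = 50.0                      # population mean (hypothetical)
    beta = [-2.0, 0.0, 2.0]        # fixed treatment effects, one per group (hypothetical)
    sigma_eps = 4.0                # standard deviation of the error term
    n_per_group = 5

    # X_ij = mu + beta_j + eps_i(j)
    data = [mu + b + rng.normal(0.0, sigma_eps, n_per_group) for b in beta]
    for j, scores in enumerate(data, start=1):
        print(f"Group {j}:", np.round(scores, 1))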

Random Effects

The random-effects mathematical model for a completely randomized design is similar to the fixed-effects mathematical model. It can also be expressed as:

X i j = μ + β j + ε i ( j )

Like the fixed-effects mathematical model, the random-effects model also assumes that (1) ε i ( j ) is normally and independently distributed with a mean of zero and a variance of σε2 and (2) the mean ( μ ) is constant.

Here's the difference between the two mathematical models. With a fixed-effects model, the experimenter includes all treatment levels of interest in the experiment. With a random-effects model, the experimenter includes a random sample of treatment levels in the experiment. Therefore, in the random-effects mathematical model, the treatment effect ( β j ) is a random variable with a mean of zero and a variance of σ2β.

Statistical Hypotheses

For fixed-effects models, it is common practice to write statistical hypotheses in terms of the treatment effect β j; for random-effects models, in terms of the treatment variance σ2β .

  • Null hypothesis: The null hypothesis states that the independent variable has no effect on the dependent variable in any treatment group. Thus,

    H0: β j = 0 for all j (fixed-effects)

    H0: σ2β = 0 (random-effects)

  • Alternative hypothesis: The alternative hypothesis states that the independent variable has an effect on the dependent variable in at least one treatment group. Thus,

    H1: β j ≠ 0 for some j (fixed-effects)

    H1: σ2β ≠ 0 (random-effects)

If the null hypothesis is true, the mean score in each treatment group should equal the population mean. Thus, if the null hypothesis is true, sample means in the k treatment groups should be roughly equal. If the null hypothesis is false, at least one pair of sample means should be unequal.

Significance Level

The significance level (also known as alpha or α) is the probability of rejecting the null hypothesis when it is actually true. The significance level for an experiment is specified by the experimenter, before data collection begins. Experimenters often choose significance levels of 0.05 or 0.01.

A significance level of 0.05 means that there is a 5% chance of rejecting the null hypothesis when it is true. A significance level of 0.01 means that there is a 1% chance of rejecting the null hypothesis when it is true. The lower the significance level, the more persuasive the evidence needs to be before an experimenter can reject the null hypothesis.

Mean Scores

Analysis of variance begins by computing a grand mean and group means:

  • Grand mean. The grand mean ( X ) is the mean of all observations, computed as follows:

    \bar{X} = \frac{1}{n} \sum_{j=1}^{k} \sum_{i=1}^{n_j} X_{ij} , \qquad \text{where } n = \sum_{j=1}^{k} n_j

  • Group means. The mean of group j ( X j ) is the mean of all observations in group j, computed as follows:

    \bar{X}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} X_{ij}

In the equations above, n is the total sample size across all groups; and n j is the sample size in Group j .
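
A minimal sketch of these two computations in Python, using the hypothetical scores from earlier:

    # Hypothetical scores, one list per treatment group.
    groups = [
        [4.0, 6.0, 5.0],
        [7.0, 9.0],
        [3.0, 2.0, 4.0, 5.0],
    ]

    n = sum(len(g) for g in groups)                   # total sample size
    grand_mean = sum(x for g in groups for x in g) / n
    group_means = [sum(g) / len(g) for g in groups]   # one mean per group

    print(grand_mean, group_means)                    # 5.0 [5.0, 8.0, 3.5]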

Sums of Squares

A sum of squares is the sum of squared deviations from a mean score. One-way analysis of variance makes use of three sums of squares:

  • Between-groups sum of squares. The between-groups sum of squares (SSB) measures variation of group means around the grand mean. It can be computed from the following formula:

    SSB = \sum_{j=1}^{k} \sum_{i=1}^{n_j} ( \bar{X}_j - \bar{X} )^2 = \sum_{j=1}^{k} n_j ( \bar{X}_j - \bar{X} )^2

  • Within-groups sum of squares. The within-groups sum of squares (SSW) measures variation of all scores around their respective group means. It can be computed from the following formula:

    SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} ( X_{ij} - \bar{X}_j )^2

  • Total sum of squares. The total sum of squares (SST) measures variation of all scores around the grand mean. It can be computed from the following formula:

    SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} ( X_{ij} - \bar{X} )^2

It turns out that the total sum of squares is equal to the between-groups sum of squares plus the within-groups sum of squares, as shown below:

SST = SSB + SSW

As you'll see later on, this relationship will allow us to assess the magnitude of the effect of the independent variable on the dependent variable.
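
The sketch below (Python, same hypothetical scores as before) computes all three sums of squares directly from their definitions and confirms the additive relationship:

    # Hypothetical scores, one list per treatment group.
    groups = [
        [4.0, 6.0, 5.0],
        [7.0, 9.0],
        [3.0, 2.0, 4.0, 5.0],
    ]

    n = sum(len(g) for g in groups)
    grand_mean = sum(x for g in groups for x in g) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups, within-groups, and total sums of squares.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

    print(ssb, ssw, sst)                     # 27.0 9.0 36.0
    assert abs(sst - (ssb + ssw)) < 1e-9     # SST = SSB + SSW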

Degrees of Freedom

The term degrees of freedom (df) refers to the number of independent sample points used to compute a statistic minus the number of parameters estimated from the sample points.

To illustrate what is going on, let's find the degrees of freedom associated with the various sum of squares computations:

  • Between-groups degrees of freedom. The between-groups sum of squares formula appears below:
    SSB = \sum_{j=1}^{k} n_j ( \bar{X}_j - \bar{X} )^2

    Here, the formula uses k independent sample points, the sample means X  j . And it uses one parameter estimate, the grand mean X, which was estimated from the sample points. So, the between-groups sum of squares has k - 1 degrees of freedom.

  • Within-groups degrees of freedom. The within-groups sum of squares formula appears below:
    SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} ( X_{ij} - \bar{X}_j )^2

    Here, the formula uses n independent sample points, the individual subject scores X i j . And it uses k parameter estimates, the group means X j , which were estimated from the sample points. So, the within-groups sum of squares has n - k degrees of freedom (where n is total sample size across all groups).

  • Total degrees of freedom. The total sum of squares formula appears below:
    SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} ( X_{ij} - \bar{X} )^2

    Here, the formula uses n independent sample points, the individual subject scores X i j . And it uses one parameter estimate, the grand mean X, which was estimated from the sample points. So, the total sum of squares has n - 1 degrees of freedom (where n is total sample size across all groups).

The degrees of freedom for each sum of squares are summarized in the table below:

Sum of squares Degrees of freedom
Between-groups k - 1
Within-groups n - k
Total n - 1

Notice that there is an additive relationship among the degrees of freedom. The degrees of freedom for the total sum of squares (dfTOT) is equal to the degrees of freedom for the between-groups sum of squares (dfBG) plus the degrees of freedom for the within-groups sum of squares (dfWG). That is,

dfTOT = dfBG + dfWG

Mean Squares

A mean square is an estimate of population variance. It is computed by dividing a sum of squares (SS) by its corresponding degrees of freedom (df), as shown below:

MS = SS / df

To conduct a one-way analysis of variance, we are interested in two mean squares:

  • Within-groups mean square. The within-groups mean square ( MSWG ) refers to variation due to differences among experimental units within the same group. It can be computed as follows:

    MSWG = SSW / dfWG

  • Between groups mean square. The between-groups mean square ( MSBG ) refers to variation due to differences among experimental units within the same group plus variation due to treatment effects. It can be computed as follows:

    MSBG = SSB / dfBG
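
Continuing the running example, here is a minimal sketch of the mean square computations (Python; the sums of squares are the hypothetical values produced by the earlier sketch):

    # Hypothetical values from the sums-of-squares sketch above.
    k, n = 3, 9
    ssb, ssw = 27.0, 9.0

    df_bg = k - 1            # between-groups degrees of freedom = 2
    df_wg = n - k            # within-groups degrees of freedom = 6

    ms_bg = ssb / df_bg      # between-groups mean square = 13.5
    ms_wg = ssw / df_wg      # within-groups mean square = 1.5

    print(ms_bg, ms_wg)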

Expected Value

The expected value of a mean square is the average value of the mean square over a large number of experiments.

Statisticians have derived formulas for the expected value of the within-groups mean square ( MSWG ) and for the expected value of the between-groups mean square ( MSBG ). For one-way analysis of variance, the expected value formulas are:

Fixed- and Random-Effects:

E( MSWG ) = σε2

Fixed-Effects:

E( MS_{BG} ) = \sigma_\varepsilon^2 + \frac{ \sum_{j=1}^{k} \beta_j^2 }{ k - 1 }

Random-Effects:

E( MSBG ) = σε2 + nσβ2

In the equations above, E( MSWG ) is the expected value of the within-groups mean square; E( MSBG ) is the expected value of the between-groups mean square; n is total sample size; k is the number of treatment groups; β j is the treatment effect in Group j; σε2 is the variance attributable to everything except the treatment effect (i.e., all the extraneous variables); and σβ2 is the variance due to random selection of treatment levels.

Notice that MSBG should equal MSWG when the variation due to treatment effects ( β j for fixed effects and σβ2 for random effects) is zero (i.e., when the independent variable does not affect the dependent variable). And MSBG should be bigger than MSWG when the variation due to treatment effects is not zero (i.e., when the independent variable does affect the dependent variable).

Conclusion: By examining the relative size of the mean squares, we can make a judgment about whether an independent variable affects a dependent variable.
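
One way to build intuition for these expected values is a small Monte Carlo sketch (Python with numpy, hypothetical parameters): when the treatment effects are all zero, the average MSBG and the average MSWG both settle near σε²; when the effects are nonzero, the average MSBG is inflated while MSWG is not.

    import numpy as np

    rng = np.random.default_rng(0)

    def mean_squares(beta, sigma_eps=2.0, n_per_group=10, reps=2000):
        """Average MSBG and MSWG over many simulated experiments."""
        k = len(beta)
        ms_bg, ms_wg = [], []
        for _ in range(reps):
            data = [50.0 + b + rng.normal(0.0, sigma_eps, n_per_group) for b in beta]
            grand = np.mean(np.concatenate(data))
            means = [g.mean() for g in data]
            ssb = sum(n_per_group * (m - grand) ** 2 for m in means)
            ssw = sum(((g - m) ** 2).sum() for g, m in zip(data, means))
            ms_bg.append(ssb / (k - 1))
            ms_wg.append(ssw / (k * n_per_group - k))
        return np.mean(ms_bg), np.mean(ms_wg)

    print(mean_squares(beta=[0.0, 0.0, 0.0]))   # both close to sigma_eps**2 = 4
    print(mean_squares(beta=[-2.0, 0.0, 2.0]))  # MSBG inflated, MSWG still near 4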

Test Statistic

Suppose we use the mean squares to define a test statistic F as follows:

F(v1, v2) = MSBG / MSWG

where MSBG is the between-groups mean square, MSWG is the within-groups mean square, v1 is the degrees of freedom for MSBG, and v2 is the degrees of freedom for MSWG.

Defined in this way, the F ratio measures the size of MSBG relative to MSWG. The F ratio is a convenient measure that we can use to test the null hypothesis. Here's how:

  • When the F ratio is close to one, MSBG is approximately equal to MSWG. This indicates that the independent variable did not affect the dependent variable, so we cannot reject the null hypothesis.
  • When the F ratio is significantly greater than one, MSBG is bigger than MSWG. This indicates that the independent variable did affect the dependent variable, so we must reject the null hypothesis.

What does it mean for the F ratio to be significantly greater than one? To answer that question, we need to talk about the P-value.

Note: With a completely randomized design, the test statistic F is computed in the same way for fixed-effects and for random-effects. With more complex designs (i.e., designs with more than one factor), test statistics may be computed differently for fixed-effects models than for random-effects models.

P-Value

In an experiment, a P-value is the probability of obtaining a result more extreme than the observed experimental outcome, assuming the null hypothesis is true.

With analysis of variance, the F ratio is the observed experimental outcome that we are interested in. So, the P-value would be the probability that an F statistic would be more extreme (i.e., bigger) than the actual F ratio computed from experimental data.

How does an experimenter attach a probability to an observed F ratio? Luckily, the F ratio is a random variable that has an F distribution. Therefore, we can use an F table or an online calculator to find the probability that an F statistic will be bigger than the actual F ratio observed in the experiment.
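
In code, the same lookup can be done with the survival function of the F distribution. A minimal sketch (Python with scipy), using the hypothetical F ratio and degrees of freedom from the running example:

    from scipy import stats

    # Hypothetical experiment: F ratio of 9.0 with 2 and 6 degrees of freedom.
    f_ratio, v1, v2 = 9.0, 2, 6

    # P-value: probability that F(v1, v2) exceeds the observed F ratio.
    p_value = stats.f.sf(f_ratio, v1, v2)
    print(round(p_value, 4))   # about 0.0156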

F Distribution Calculator

To find the P-value associated with an observed F ratio, use Stat Trek's free F distribution calculator. You can find the calculator in the Appendix section of the table of contents, which can be accessed by tapping the "Analysis of Variance: Table of Contents" button at the top of the page. Or you can tap the button below.

F Distribution Calculator

For an example that shows how to find the P-value for an F ratio, see Problem 2 at the bottom of this page.

Hypothesis Test

Recall that the experimenter specified a significance level early on - before the first data point was collected. Once you know the significance level and the P-value, the hypothesis test is routine. Here's the decision rule for accepting or rejecting the null hypothesis:

  • If the P-value is bigger than the significance level, accept the null hypothesis.
  • If the P-value is equal to or smaller than the significance level, reject the null hypothesis.

A "big" P-value indicates that (1) none of the k treatment means ( X j ) were significantly different, so (2) the independent variable did not have a statistically significant effect on the dependent variable.

A "small" P-value indicates that (1) at least one treatment mean differed significantly from another treatment mean, so (2) the independent variable had a statistically significant effect on the dependent variable.

Magnitude of Effect

The hypothesis test tells us whether the independent variable in our experiment has a statistically significant effect on the dependent variable, but it does not address the magnitude (strength) of the effect. Here's the issue:

  • When the sample size is large, you may find that even small differences in treatment means are statistically significant.
  • When the sample size is small, you may find that even big differences in treatment means are not statistically significant.

With this in mind, it is customary to supplement analysis of variance with an appropriate measure of effect size. Eta squared (η2) is one such measure. Eta squared is the proportion of variance in the dependent variable that is explained by a treatment effect. The eta squared formula for one-way analysis of variance is:

η2 = SSB / SST

where SSB is the between-groups sum of squares and SST is the total sum of squares.

ANOVA Summary Table

It is traditional to summarize ANOVA results in an analysis of variance table. Here, filled with hypothetical data, is an analysis of variance table for a one-way analysis of variance.

Analysis of Variance Table

Source   SS    df           MS    F     P
BG       230   k - 1 = 10   23    2.3   0.09
WG       220   N - k = 22   10
Total    450   N - 1 = 32

This is an ANOVA table for a single-factor, independent groups design. The experiment used 11 treatment groups, so k equals 11. And three subjects were assigned to each treatment group, so N equals 33. The table shows critical outputs for between-groups (BG) variation and within-groups (WG) variation.

Many of the table entries are derived from the sum of squares (SS) and degrees of freedom (df), based on the following formulas:

SSTOTAL = SSBG + SSWG = 230 + 220 = 450

MSBG = SSBG / dfBG = 230/10 = 23

MSWG = SSWG / dfWG = 220/22 = 10

F(v1, v2) = MSBG / MSWG = 23/10 = 2.3

where MSBG is the between-groups mean square, MSWG is the within-groups mean square, v1 ( = dfBG ) is the degrees of freedom for MSBG, v2 ( = dfWG ) is the degrees of freedom for MSWG, and F(v1, v2) is the F ratio.

An ANOVA table provides all the information an experimenter needs to (1) test hypotheses and (2) assess the magnitude of treatment effects.
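
The sketch below (Python) reproduces the derived entries of this hypothetical table from the sums of squares and degrees of freedom, and adds the eta squared value discussed in the Magnitude of Effects section below:

    # Entries from the hypothetical ANOVA table above.
    ss_bg, ss_wg = 230.0, 220.0
    k, N = 11, 33

    df_bg, df_wg = k - 1, N - k          # 10 and 22
    ms_bg = ss_bg / df_bg                # 23.0
    ms_wg = ss_wg / df_wg                # 10.0
    f_ratio = ms_bg / ms_wg              # 2.3
    ss_total = ss_bg + ss_wg             # 450.0
    eta_squared = ss_bg / ss_total       # about 0.51

    print(ms_bg, ms_wg, f_ratio, round(eta_squared, 2))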

Hypothesis Tests

The P-value (shown in the last column of the ANOVA table) is the probability that an F statistic would be more extreme (bigger) than the F ratio shown in the table, assuming the null hypothesis is true. When the P-value is bigger than the significance level, we accept the null hypothesis; when it is smaller, we reject it.

Suppose the significance level for this experiment was 0.05. Based on the table entries, can we reject the null hypothesis? From the ANOVA table, we see that the P-value is 0.09. Since the P-value is bigger than the significance level (0.05), we cannot reject the null hypothesis.

Magnitude of Effects

Since the P-value in the ANOVA table was bigger than the significance level, the treatment effect in this experiment was not statistically significant. Does that mean the treatment effect was small? Not necessarily.

To assess the strength of the treatment effect, an experimenter might compute eta squared (η2). The computation is easy, using sums of squares entries from the ANOVA table, as shown below:

η2 = SSB / SST = 230 / 450 = 0.51

where SSB is the between-groups sum of squares and SST is the total sum of squares.

For this experiment, eta squared is 0.51. This means that 51% of the variance in the dependent variable can be explained by the effect of the independent variable.

Even though the treatment effect was not statistically significant, it was not unimportant, since the independent variable accounted for more than half of the variance in the dependent variable. The moral here is that a hypothesis test by itself may not tell the whole story. It also pays to look at the magnitude of an effect.

Advantages and Disadvantages

One-way analysis of variance with a single-factor, independent groups design has advantages and disadvantages. Advantages include the following:

  • The design layout is simple - one factor with k factor levels.
  • Data analysis is easier with this design than with other designs.
  • Computational procedures are identical for fixed-effects and random-effects models.
  • The design does not require equal sample sizes for treatment groups.
  • The design requires subjects to participate in only one treatment group.

Disadvantages include the following:

  • The design does not permit repeated measures.
  • The design can test the effect of only one independent variable.

Test Your Understanding

Problem 1

In analysis of variance, what is a mean square?

(A) The average deviation from the mean.
(B) A measure of standard deviation.
(C) A measure of variance.
(D) A measure of skewness.
(E) A vicious geometric shape.

Solution

The correct answer is (C). Mean squares are estimates of variance within groups or across groups. Mean squares are used to calculate F ratios, such as the following:

F = MSbg / MSwg

where MSbg is the between-group mean square and MSwg is the within-group mean square.


Problem 2

In the ANOVA table shown below, the P-value is missing. What is the correct entry for the P-value?

Source   SS    df   MS   F   P-value
BG       300   5    60   3   ???
WG       600   30   20
Total    900   35

Hint: Stat Trek's F Distribution Calculator may be helpful.

(A) 0.01
(B) 0.03
(C) 0.11
(D) 0.89
(E) 0.97

Solution

The correct answer is (B).

A P-value is the probability of obtaining a result more extreme (bigger) than the observed F ratio, assuming the null hypothesis is true. From the ANOVA table, we know the following:

  • The observed value of the F ratio is 3.
  • The degrees of freedom (v1) for the between-groups mean square is 5.
  • The degrees of freedom (v2) for the within-groups mean square is 30.

Therefore, the P-value we are looking for is the probability that an F with 5 and 30 degrees of freedom is greater than 3. We want to know:

P [ F(5, 30) > 3 ]

Now, we are ready to use the F Distribution Calculator. We enter the degrees of freedom (v1 = 5) for the between-groups mean square, the degrees of freedom (v2 = 30) for the within-groups mean square, and the F ratio (3) into the calculator; and hit the Calculate button.

[Figure: F Distribution Calculator output, showing a cumulative probability of about 0.97 for F = 3 with 5 and 30 degrees of freedom.]

The calculator reports that the probability that F is greater (more extreme) than 3 equals about 0.026. Hence, the correct P-value is 0.026.
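
The same lookup can be reproduced in code. A minimal sketch (Python with scipy):

    from scipy import stats

    # P[ F(5, 30) > 3 ]: survival function of the F distribution.
    p_value = stats.f.sf(3.0, 5, 30)
    print(p_value)   # about 0.026, matching the answer above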