Section 17.1: The Analysis of Variance (A.N.O.V.A.)

A Quick Example

More Explanation and Examples






Definition

One-way analysis of variance is a hypothesis-testing technique that is used to compare the means of three or more populations. Analysis of variance is usually abbreviated ANOVA.


The hypothesis-testing procedure using ANOVA involves the same five steps that were used in earlier chapters. To begin a one-way analysis of variance test, you should first state the null and alternate hypotheses. For a one-way ANOVA test, the null and alternate hypotheses will be:
\[ \eqalign{ H_0: &\quad \mu_1=\mu_2=\mu_3=\dots=\mu_k \quad \text{ (all population means are equal)} \cr H_1: &\quad \text{ at least one mean is different from the others} } \]

When you reject the null hypothesis in a one-way ANOVA test, you can conclude that at least one of the means is different from the others. Without performing more statistical tests, however, you cannot determine which of the means is different.

Before performing a one-way ANOVA test, you must check that these three conditions are satisfied:

Assumptions for the ANOVA Test for Comparing Three or More Means

  1. The populations from which the samples were obtained must be normally or approximately normally distributed.
  2. The samples must be independent of one another.
  3. The variances of the populations must be equal.
Even though you are comparing three or more means in this hypothesis test, variances are used in the test instead of means. The ANOVA test is applied by calculating two estimates of the population variance, $\sigma^2$:

  1. the variance between samples, denoted $MS_B$, and
  2. the variance within samples, denoted $MS_W$.

The variance between samples, $MS_B$, gives an estimate of $\sigma^2$ based on the variation among the means of samples taken from different populations. The variance within samples, $MS_W$, gives an estimate of $\sigma^2$ that is calculated using all of the data from the different samples.

The test statistic for a one-way ANOVA test is the ratio of the two estimates for population variance: \[ \text{Test Statistic }=\frac{\text{Variance between samples}}{\text{Variance within samples}} \] If the assumptions given above are met, then the sampling distribution for the test is approximated by the $F$-distribution. The graph of an $F$-distribution is skewed right like the graph given below.
The test statistic is a value of $F$ given by the formula \[ F= \frac{MS_B}{MS_W} \] The values of $F$ that make up the sampling distribution lie along the horizontal axis in the above graph and will range between 0 and $\infty$. The relative frequency with which these values occur is represented by the height of the sampling distribution curve.

If there is little or no difference between the means, then $MS_B$ will be approximately equal to $MS_W$ and the test statistic will be approximately 1. Values of $F$ close to 1 suggest that you should fail to reject the null hypothesis. However, if one of the means differs significantly from the others, then $MS_B$ will be greater than $MS_W$ and the test statistic will be greater than 1. Values of $F$ significantly greater than 1 suggest that you should reject the null hypothesis.

The one-way ANOVA test is always a right-tailed test. The p-value for the hypothesis test is the probability that $F$ is greater than or equal to the value of $F$ obtained from the samples. This is equal to the area under the sampling distribution, right of a vertical line fixed at the value of $F$ you obtain from your samples. As usual, if the p-value is less than $\alpha$, then reject $H_0$.

Hypothesis Test Conclusion | This means that...
Reject $H_0$ | The sample evidence suggests that at least one mean is different from the others. If $H_0$ were true, the sample data would be very surprising.
Fail to Reject $H_0$ | There is not convincing sample evidence to conclude that there is a difference in the means. If $H_0$ were true, the sample data would not be considered surprising.
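
The right-tail area described above can be approximated without special software. The following is a minimal sketch, not the calculator's own method: the function name `f_right_tail` is hypothetical, it assumes both degrees of freedom are at least 2, and it relies on the standard identity that expresses the right tail of an $F$-distribution through the regularized incomplete beta function, which is integrated here with Simpson's rule.

```python
import math

def f_right_tail(f_value, df_between, df_within, steps=2000):
    """Approximate P(F >= f_value) for an F-distribution.

    Uses P(F >= f) = I_x(df_within / 2, df_between / 2) with
    x = df_within / (df_within + df_between * f), where I_x is the
    regularized incomplete beta function.
    Assumption: both degrees of freedom are at least 2.
    """
    if f_value <= 0:
        return 1.0  # the whole distribution lies to the right of 0
    a = df_within / 2
    b = df_between / 2
    x = df_within / (df_within + df_between * f_value)

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    # Composite Simpson's rule on [0, x]; steps must be even.
    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3

    # Normalize by the beta function B(a, b), computed through log-gamma.
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)

# Usage: F = 9.69 with 3 and 18 degrees of freedom (k = 4 samples, n = 22).
p_value = f_right_tail(9.69, 3, 18)
print(round(p_value, 4))
```

If the printed p-value is less than $\alpha$, the decision is to reject $H_0$, exactly as in the table above.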


Example

From time to time, unknown to its employees, the research department at Main Street Bank observes various employees for their work productivity. Recently this department wanted to check whether the four tellers at a branch of this bank serve, on average, the same number of customers per hour. The research manager observed each of the four tellers for a certain number of hours. The following table gives the number of customers served by the four tellers during each of the observed hours.


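$$ \begin{array}{cccc} Teller \ 1 & Teller \ 2 & Teller \ 3 & Teller \ 4\\\hline 19 & 14 & 11 & 24 \\ 21 & 16 & 14 & 19 \\ 26 & 14 & 21 & 21 \\ 24 & 13 & 13 & 26 \\ 18 & 17 & 16 & 20 \\ & 13 & 18 & \\ \end{array} $$
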
At the 5% significance level, test the null hypothesis that the mean number of customers served per hour by each of these four tellers is the same. Assume that all the assumptions required to apply the one-way ANOVA procedure hold true.

Questions/Instructions

  • What statements represent the null and alternative hypotheses?
  • What value of $\alpha$ should you use for the hypothesis test?
  • What is the formula for the test statistic?
  • What is the value of the test statistic? Round to the hundredths.
  • What p-value is obtained from performing one-way ANOVA test? Round to the ten-thousandths.
  • Sketch a graph of the sampling distribution and the p-value.
  • Did you reject $H_0$ or fail to reject $H_0$?
  • Interpret the decision.


Answers  

  • What statements represent the null and alternative hypotheses?

    \[ \eqalign{ H_0: &\quad \mu_1=\mu_2=\mu_3=\mu_4 \quad \text{ (all population means are equal)} \cr H_1: &\quad \text{ at least one mean is different from the others} } \]

  • What value of $\alpha$ should you use for the hypothesis test?

    \[ \alpha=0.05 \]

  • What is the formula for the test statistic?

    \[ F= \frac{MS_B}{MS_W} \]

  • What is the value of the test statistic? Round to the hundredths.

    One-Way Analysis of Variance (ANOVA)

    1. Enter the data into L1, L2, L3, etc.
    2. Press STAT and move the cursor to TESTS.
    3. Arrow Up/Down until the cursor highlights ANOVA. Press ENTER.
    4. Type each list followed by a comma. End with ) and press ENTER.



    The test statistic is $F=9.69$.


  • What p-value is obtained from performing one-way ANOVA test? Round to the ten-thousandths.

    The p-value is $P(F\geq 9.69) \approx 4.97 \times 10^{-4}$, or $0.0005$ when rounded to the ten-thousandths.


  • Sketch a graph of the sampling distribution and the p-value.



  • Did you reject $H_0$ or fail to reject $H_0$?

    Since the p-value is less than $\alpha$, we reject $H_0$.



  • Interpret the decision in the context of the data.

    The mean number of customers served per hour is not the same for all four tellers. The sample evidence suggests that at least one mean is different from the others.
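
The calculator result for the teller example can be cross-checked with a short Python sketch. It computes $MS_B$ and $MS_W$ directly from their definitions using only the standard library. One assumption is made about the data: the two values in the short last row of the table (13 and 18) are taken to belong to Tellers 2 and 3, since that assignment reproduces the reported test statistic.

```python
from statistics import mean, variance

# Customers served per observed hour.
# Assumption: the sixth observations (13 and 18) belong to Tellers 2 and 3.
tellers = [
    [19, 21, 26, 24, 18],
    [14, 16, 14, 13, 17, 13],
    [11, 14, 21, 13, 16, 18],
    [24, 19, 21, 26, 20],
]

k = len(tellers)                           # number of samples
n = sum(len(s) for s in tellers)           # total number of observations
grand_mean = sum(sum(s) for s in tellers) / n

# Variation between the sample means and pooled variation within the samples.
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in tellers)
ss_within = sum((len(s) - 1) * variance(s) for s in tellers)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n - k)
f_stat = ms_between / ms_within

print(f"df = ({k - 1}, {n - k}), F = {f_stat:.2f}")
```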

The Design of an Experiment


The way that a sample is selected is called the sampling plan or experimental design. The sampling plan determines how much information is gathered in the sample. Some research involves an observational study, in which the researcher does not actually produce the data but only observes the characteristics of data that already exist. Most sample surveys, in which information is gathered with a questionnaire, fall into this category. The researcher forms a plan for collecting the data—called the sampling plan—and then uses the appropriate statistical procedures to draw conclusions about the population or populations from which the sample comes.

Other research involves experimentation. The researcher may deliberately impose one or more experimental conditions on the experimental units in order to determine their effect on the response. Here are some new terms we will use to discuss the design of a statistical experiment.

Definitions

An experimental unit is the object on which a measurement (or measurements) is taken.

A factor is an independent variable whose values are controlled and varied by the experimenter.

A level is the intensity setting of a factor.

A treatment is a specific combination of factor levels.

The response is the variable being measured by the experimenter.

Example

A group of golfers who all have the same handicap (they all typically score the same) is randomly divided into an experimental and control group. The control group is asked to golf 18 holes and record their scores after having eaten a full breakfast. The experimental group is asked to golf 18 holes at the same time as the control group, and record their scores, without having eaten any breakfast. What are the factors, levels and treatments in this experiment?

Solution

The experimental units are the people on whom the response (golf score) is measured. The factor of interest could be described as 'meal' and has two levels: 'breakfast' and 'no breakfast.' Since this is the only factor controlled by the experimenter, the two levels, 'breakfast' and 'no breakfast,' also represent the treatments of interest in the experiment.

Example

Suppose the experimenter in the previous example began by randomly selecting 30 golfers aged 20 to 29 and 30 golfers aged 30 to 39 for the experiment. These two groups were then randomly divided into 15 each for the experimental and control groups. What are the factors, levels and treatments?

Solution

Now there are two factors of interest to the experimenter, and each factor has two levels:
  • 'age' at two levels: ages 20 to 29 and ages 30 to 39
  • 'meal' at two levels: breakfast and no breakfast

In this more complex experiment, there are four treatments, one for each specific combination of factor levels: golfers ages 20 to 29 with no breakfast, golfers ages 20 to 29 with breakfast, golfers ages 30 to 39 with no breakfast and golfers ages 30 to 39 with breakfast.
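
Because a treatment is one combination of factor levels, the list of treatments is the Cartesian product of the level lists. The sketch below enumerates the four treatments of the golfer example; the dictionary layout and variable names are illustrative choices, not part of the original example.

```python
from itertools import product

# Factor names and levels taken from the golfer example above.
factors = {
    "age": ["20 to 29", "30 to 39"],
    "meal": ["breakfast", "no breakfast"],
}

# Each treatment pairs one level of every factor.
treatments = list(product(*factors.values()))

for age, meal in treatments:
    print(f"golfers ages {age} with {meal}")
```

With two factors at two levels each, the product has $2 \times 2 = 4$ entries, matching the four treatments listed above.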


In this section, we will concentrate on an experiment that involves one factor set at $k$ levels, and we will use a technique called the analysis of variance to judge the effects of that factor on the experimental response.

What is an Analysis of Variance?

The responses that are generated in an experimental situation always exhibit a certain amount of variability. In an analysis of variance, you divide the total variation in the response measurements into portions that may be attributed to various factors of interest to the experimenter. If the experiment has been properly designed, these portions can then be used to answer questions about the effects of the various factors on the response of interest.

You can better understand the logic underlying an analysis of variance by looking at a simple experiment. Consider two sets of samples randomly selected from populations 1 (white ovals) and 2 (black triangles), each set having the same pair of sample means, $\bar{x}_1$ and $\bar{x}_2$. The two sets are shown in the figure below.

Is it easier to detect the difference in the two means when you look at set A or set B? You will probably agree that set A shows the difference much more clearly. In set A, the variability of the measurements within the respective groups (black triangles and white ovals) is much smaller than the variability between the two groups. In set B, there is more variability within the groups (black triangles and white ovals) causing the two groups to 'mix' together and making it more difficult to see the identical difference between the means.

Placeholder image

The comparison you have just done intuitively is formalized by the analysis of variance. Moreover, the analysis of variance can be used not only to compare two means but also to make comparisons of more than two population means and to determine the effects of various factors in more complex experimental designs. The analysis of variance relies on statistics with sampling distributions that are modeled by the $F$ distribution of Section 10.3 in your textbook.

The Assumptions for an Analysis of Variance

The assumptions upon which the test and estimation procedures for an analysis of variance are based are similar to those required for the Student's $t$ and $F$ statistics from chapters 7, 8 and 10. Regardless of the experimental design used to generate the data, you must assume that the observations within each treatment group are normally distributed with a common variance, $\sigma^2$. The analysis of variance procedures are fairly robust when the sample sizes are equal and when the data are fairly bell shaped. Violating the assumption of a common variance is more serious, especially when the sample sizes are not nearly equal.

Definitions

Assumptions for Analysis of Variance Test and Estimation Procedures
  • The observations within each population are normally distributed with a common variance, $\sigma^2$.
  • Assumptions regarding the sampling procedure are specified for each experimental design.
This section describes the analysis of variance for one simple experimental design. This design is based on independent random sampling from several populations and is an extension of the unpaired $t$ test from chapter 8 (section 2).


The Completely Randomized Design: A One-Way Classification

One of the simplest experimental designs is the completely randomized design, in which random samples are selected independently from $k$ populations. This design involves only one factor, the population from which the measurement comes, hence the designation as a one-way classification. There are $k$ different levels corresponding to the $k$ populations, which are also the treatments for this one-way classification. Are the $k$ population means all the same, or is at least one mean different from the others? Why do you need a new procedure, the analysis of variance, to compare the population means when you already have the Student's $t$ test available? In comparing $k=3$ means, you could test each of three pairs of hypotheses: $$ H_0: \mu_1=\mu_2 \qquad H_0: \mu_1=\mu_3 \qquad H_0: \mu_2=\mu_3 \qquad $$ to find out where the differences lie. However, you must remember that each test you perform is subject to the possibility of error. To compare $k=4$ means, you would need six tests, and you would need ten tests to compare $k=5$ means. The more tests you perform on a set of measurements, the more likely it is that at least one of your conclusions will be incorrect. The analysis of variance procedure provides one overall test to judge the equality of $k$ population means. Once you have determined whether there is actually a difference in the means, you can use another procedure to find out where the differences lie.
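
The test counts quoted above, and the way the chance of an error grows with the number of tests, can be illustrated with a short sketch. The function names are hypothetical. The second function assumes the tests are independent and each uses $\alpha = 0.05$; pairwise $t$ tests on the same data are not independent, so its output is only a rough indication of the problem, not an exact error rate.

```python
from math import comb

def pairwise_tests(k):
    """Number of distinct pairs of means among k populations."""
    return comb(k, 2)

def chance_of_any_false_rejection(num_tests, alpha=0.05):
    """P(at least one Type I error), ASSUMING the tests are independent."""
    return 1 - (1 - alpha) ** num_tests

for k in (3, 4, 5):
    m = pairwise_tests(k)
    print(k, m, round(chance_of_any_false_rejection(m), 3))
```

Under the independence assumption, the chance of at least one false rejection is already about 0.14 for three tests, well above the 0.05 used for each individual test.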

How can you select these $k$ random samples? Sometimes the populations actually exist, and you can use a computerized random number generator or a random number table to randomly select the samples. For example, in a study to compare the average sizes of health insurance claims in four different states, you could use a computer database provided by the health insurance companies to select random samples from the four states.

Example

A researcher is interested in the effects of five types of insecticide for use in controlling the boll weevil in cotton fields. Explain how to implement a completely randomized design to investigate the effects of the five insecticides on crop yield.

Solution

The only way to generate the equivalent of five random samples from the hypothetical populations corresponding to the five insecticides is to use a method called randomized assignment. A fixed number of cotton plants are chosen for treatment, and each is assigned a random number. Suppose that each sample is to have an equal number of measurements. Using a randomization device, you can assign the first $n$ plants chosen to receive insecticide 1, the second $n$ plants to receive insecticide 2, and so on, until all five treatments have been assigned.
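
The randomized assignment described above can be sketched in a few lines of Python. Everything specific here is hypothetical: the function name, the 20 plant labels, the group size $n = 4$, and the seed (used only so the run is repeatable). Shuffling the plant labels plays the role of assigning each plant a random number and then ordering the plants by it.

```python
import random

def assign_treatments(unit_ids, num_treatments, seed=None):
    """Randomly split experimental units into equal-sized treatment groups."""
    if len(unit_ids) % num_treatments != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)  # random order stands in for the random numbers
    n = len(shuffled) // num_treatments
    # First n units receive treatment 1, the next n treatment 2, and so on.
    return {t + 1: shuffled[t * n:(t + 1) * n] for t in range(num_treatments)}

# Hypothetical example: 20 plants labeled 1..20, five insecticides, n = 4 each.
groups = assign_treatments(range(1, 21), num_treatments=5, seed=2024)
for insecticide, plants in groups.items():
    print(insecticide, sorted(plants))
```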

Whether by random selection or random assignment, both of these examples result in a completely randomized design, or one-way classification, for which the analysis of variance is used.


The Analysis of Variance for a Completely Randomized Design



Suppose you want to compare $k$ population means, $\mu_1, \mu_2, ..., \mu_k,$ based on independent random samples of size $n_1, n_2, ..., n_k$ from normal populations with a common variance, $\sigma^2$. That is, each of the normal populations has the same shape but their locations might be different, as shown in the figure below.

Placeholder image

Partitioning the Total Variation in an Experiment

Let $x$ be the variable representing the list of all measurements taken from all $k$ samples. Also let $\bar{\bar{x}}$ represent the mean of all measurements from all $k$ samples. That is, $\bar{\bar{x}} = \dfrac{\sum x }{ n}.$ The analysis of variance procedure begins by examining the total variation in the experiment, which is measured by a quantity called the total sum of squares: $$\text{Total SS} = \sum(x-\bar{\bar{x}})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$$ Here $\sum(x-\bar{\bar{x}})^2$ is the familiar numerator in the formula for the sample variance of the entire set of $n$ measurements ($n=n_1+n_2+\dots+n_k$). The second part of the computational formula is sometimes called the correction for the mean (abbreviated as CM in the formulas below). If we let $G$ represent the grand total of all $n$ observations, then $$CM = \frac{(\sum x)^2}{n} = \frac{G^2}{n}$$ This Total SS is partitioned (split) into two components. The first component, called the sum of squares for treatments ($SS_B$), measures the variation between the $k$ sample means: $$SS_B = \sum n_i(\bar{x}_i-\bar{\bar{x}})^2 = \sum\frac{T_i^2}{n_i}-CM $$ where $T_i$ is the total of the observations for treatment $i$, and $\bar{x}_i$ is the mean of sample $i$. The second component of the Total SS, called the sum of squares for error ($SS_W$), is used to measure the pooled variation within the $k$ samples: $$ SS_W = (n_1-1)s_1^2 + (n_2-1)s_2^2 + \dots +(n_k-1)s_k^2 = \sum (n_i-1)s_i^2 $$ This formula is a direct extension of the numerator in the formula for the pooled estimate of $\sigma^2$ from chapter 8, section 2. We can show algebraically that, in the analysis of variance, $$\text{Total SS} = SS_B+SS_W $$ Therefore, you need to calculate only two of the three sums of squares (Total SS, $SS_B$ and $SS_W$), and the third can be found by subtraction.
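
The algebra behind the partition can be sketched in a few lines. The double-subscript notation $x_{ij}$ (the $j$th observation in sample $i$) is an assumption introduced only for this derivation; it is not used elsewhere in this section.

$$\begin{align*} \text{Total SS} & = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{\bar{x}})^2 \\ & = \sum_{i=1}^{k}\sum_{j=1}^{n_i}\bigl[(x_{ij}-\bar{x}_i)+(\bar{x}_i-\bar{\bar{x}})\bigr]^2 \\ & = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i)^2 + 2\sum_{i=1}^{k}(\bar{x}_i-\bar{\bar{x}})\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i) + \sum_{i=1}^{k} n_i(\bar{x}_i-\bar{\bar{x}})^2 \\ & = \sum_{i=1}^{k}(n_i-1)s_i^2 + 0 + \sum_{i=1}^{k} n_i(\bar{x}_i-\bar{\bar{x}})^2 \\ & = SS_W + SS_B \end{align*}$$

The middle term vanishes because the deviations from a sample's own mean always sum to zero, that is, $\sum_{j}(x_{ij}-\bar{x}_i)=0$ for every sample $i$.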

Each of the sources of variation, when divided by its appropriate degrees of freedom, provides an estimate of variation in the experiment. Since Total SS involves $n$ squared observations, its degrees of freedom are $df=(n-1)$. Similarly, the sum of squares for treatments involves $k$ squared observations, and so its degrees of freedom are $df=(k-1)$. Finally, the sum of squares for error, a direct extension of the pooled estimate in Chapter 8, has $$df= (n_1-1)+(n_2-1)+\dots+(n_k-1)=n-k$$ Notice that the degrees of freedom for treatments and error are additive; that is, $$df(total)= df(treatments)+df(error)$$ These two sources of variation and their respective degrees of freedom are combined to form the mean squares as $\text{MS}=\frac{\text{SS}}{df}$. The total variation in the experiment (that is, of the list of all $n$ measurements in the experiment) $$ Total \ Sample \ Variation= \frac{\sum(x-\bar{\bar{x}})^2}{n-1} = \frac{SS_B+SS_W}{(k-1)+(n-k)} $$ is then displayed (in parts) in an analysis of variance (or ANOVA) table.



ANOVA Table for $k$ Independent Random Samples: Completely Randomized Design

$$ \begin{array}{c|c|c|c|c} \color{blue}{Variation \ Source} & \color{blue}{ df } & \color{blue}{ Sum \ of \ Squares (SS) } & \color{blue}{Mean \ of \ Squares (MS)} &\color{blue}{F} \\\hline Treatments & k-1 & SS_B& MS_B& MS_B/MS_W\\ Error & n-k & SS_W & MS_W&\\ \hline Total & n-1 & Total \ SS & & \end{array} $$

Where $$\begin{align*} Total \ SS & = \sum x^2 -CM \\ &\\ & = (\text{sum of squares of all x values}) - CM \end{align*}$$ with $$ \begin{array}{cccc} CM = \dfrac{(\sum x)^2}{n} = \dfrac{G^2}{n} &&&\\ & & &\\ SS_B = \sum\dfrac{T_i^2}{n_i}-CM & & & MS_B=\dfrac{SS_B}{k-1} \\ & & &\\ SS_W = Total \ SS - SS_B &&& MS_W=\dfrac{SS_W}{n-k} \\ \end{array} $$

and $$\begin{align*} G & = \text{Grand total of all $n$ observations}\\ T_i &= \text{Total of all observations in sample $i$}\\ n_i &= \text{Number of observations in sample $i$}\\ n &= n_1+n_2+\dots+n_k\\ \end{align*}$$

Example

In an experiment to determine the effect of nutrition on the scores of golfers, a group of 15 golfers who all scored 72 on average were randomly assigned to one of three meal plans (five golfers per plan): no breakfast, light breakfast and full breakfast. Their scores were recorded during a morning round of golf and are shown in the table below. Construct the analysis of variance table for this experiment. Note: in golf the lowest score is the best score.

$$ \begin{array}{ccc} No \ Breakfast & Light \ Breakfast & Full \ Breakfast\\\hline 75 & 75 & 72 \\ 76 & 73 & 68 \\ 71 & 76 & 69 \\ 74 & 72 & 72 \\ 77 & 74 & 71 \\ \hline T_1=373& T_2=370 & T_3=352 \\ \end{array} $$

Solution

There are $k=3$ treatments with each sample size equaling 5 (that is, $n_1=n_2=n_3=5$), the total number of measurements is $n=n_1+n_2+n_3=15$ and the sum total of all the measurements is $\sum x = T_1+T_2+T_3=1095$. Then $$CM = \frac{1095^2}{15}=79,935$$

\begin{align*} Total \ SS & = (75^2+76^2+71^2+\dots+71^2)-CM\\ & =80,031-79,935\\ & =96\\ \end{align*} with $(n-1)=15-1=14$ degrees of freedom. Next,

\begin{align*} SS_B & = \sum\frac{T_i^2}{n_i}-CM\\ & = \biggl(\frac{373^2}{5}+\frac{370^2}{5}+\frac{352^2}{5} \biggr)-79,935\\ & =51.6\\ \end{align*} with $(k-1)=3-1=2$ degrees of freedom. Using subtraction, \begin{align*} SS_W & = Total \ SS - SS_B\\ & = 96-51.6\\ &=44.4 \end{align*} with $(n-k)=15-3=12$ degrees of freedom. Then, \begin{align*} MS_B & =\frac{SS_B}{k-1}\\ & =\frac{51.6}{2}\\ & = 25.8\\ \end{align*} and \begin{align*} MS_W & =\frac{SS_W}{n-k}\\ & =\frac{44.4}{12}\\ & = 3.7\\ \end{align*} Finally, $F=\frac{MS_B}{MS_W}=\frac{25.8}{3.7}=6.972972972...\approx 6.973$

The ANOVA Table

\begin{array}{c|c|c|c|c} \color{blue}{Variation \ Source} & \color{blue}{ df } & \color{blue}{ Sum \ of \ Squares (SS) } & \color{blue}{Mean \ of \ Squares (MS)} &\color{blue}{F} \\\hline Treatments & k-1=2 & SS_B=51.6& MS_B=25.8& MS_B/MS_W=6.973\\ Error & n-k=12 & SS_W=44.4 & MS_W=3.7&\\ \hline Total & n-1=14 & Total \ SS =96 & & \end{array}
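
The entries of this table can be reproduced with a short Python sketch that follows the computational formulas ($CM$, $T_i$ and subtraction for $SS_W$). It uses only built-in functions; the variable names and the printed layout are illustrative choices.

```python
# Scores from the golf example, one list per meal plan.
samples = {
    "No Breakfast": [75, 76, 71, 74, 77],
    "Light Breakfast": [75, 73, 76, 72, 74],
    "Full Breakfast": [72, 68, 69, 72, 71],
}

k = len(samples)                                   # number of treatments
n = sum(len(s) for s in samples.values())          # total observations
grand_total = sum(sum(s) for s in samples.values())  # G

cm = grand_total ** 2 / n                          # correction for the mean
total_ss = sum(x ** 2 for s in samples.values() for x in s) - cm
ss_b = sum(sum(s) ** 2 / len(s) for s in samples.values()) - cm
ss_w = total_ss - ss_b                             # found by subtraction

ms_b = ss_b / (k - 1)
ms_w = ss_w / (n - k)
f_stat = ms_b / ms_w

print(f"Treatments  df={k - 1:<3} SS={ss_b:<6.1f} MS={ms_b:<5.1f} F={f_stat:.3f}")
print(f"Error       df={n - k:<3} SS={ss_w:<6.1f} MS={ms_w:.1f}")
print(f"Total       df={n - 1:<3} SS={total_ss:.1f}")
```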

Table Check

As a check, we can verify that $s_x$ (the sample standard deviation of all 15 measurements in the experiment) is equal to the number we get from taking the square root of $\frac{SS_B+SS_W}{(k-1)+(n-k)}$. That is, we need to verify that the two formulas $$ s_x = \sqrt{\frac{SS_B+SS_W}{(k-1)+(n-k)} }$$ and $$ s_x = \sqrt{\frac{\sum(x-\bar{\bar{x}})^2}{n-1}} $$ give the same number. If they are not equal, then we made a mistake somewhere and need to recheck our arithmetic. Another way to do the problem is to employ technology, such as a graphing calculator or statistical software. Notice that \begin{align*} s_x & = \sqrt{\frac{SS_B+SS_W}{(k-1)+(n-k)} }\\ & = \sqrt{\frac{51.6+44.4}{2+12} }\\ & = \sqrt{\frac{96}{14}}\\ & = 2.618614683\\ \end{align*} We can get the calculator to evaluate the formula $$ s_x = \sqrt{\frac{\sum(x-\bar{\bar{x}})^2}{n-1}} $$ for us. All we need to do is enter all 15 measurements from the experiment into our calculator's list one (L1), then run 1-Var Stats on L1. The value of $s_x$ in the calculator display should match the number above.
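
The same table check can also be run in Python instead of on a calculator. This is a minimal sketch that assumes the sums of squares from the ANOVA table above; `statistics.stdev` computes the sample standard deviation with divisor $n-1$, which is the quantity the calculator reports as $s_x$.

```python
from math import isclose, sqrt
from statistics import stdev

scores = [75, 76, 71, 74, 77,   # no breakfast
          75, 73, 76, 72, 74,   # light breakfast
          72, 68, 69, 72, 71]   # full breakfast

ss_b, ss_w = 51.6, 44.4         # sums of squares from the ANOVA table
k, n = 3, len(scores)

from_table = sqrt((ss_b + ss_w) / ((k - 1) + (n - k)))
from_data = stdev(scores)       # sample standard deviation, divisor n - 1

print(round(from_table, 6), round(from_data, 6))
print(isclose(from_table, from_data, rel_tol=1e-9))
```

If the two printed values disagree, one of the sums of squares in the table was computed incorrectly.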

How to get the Table with the Calculator

  1. Enter the data into L1, L2, L3, etc.
  2. Press STAT and move the cursor to TESTS.
  3. Arrow Up/Down until the cursor highlights ANOVA. Press ENTER.
  4. Type each list followed by a comma. End with ) and press ENTER.

