
LCGC Europe Online Supplement statistics and data analysis 9

Analysis
of Variance
Shaun Burke, RHM Technology Ltd, High Wycombe, Buckinghamshire, UK.

Statistical methods can be powerful tools for unlocking the information
contained in analytical data. This second part in our statistics refresher series
looks at one of the most frequently used of these tools: Analysis of Variance
(ANOVA). In the previous paper we examined the initial steps in describing
the structure of the data and explained a number of alternative significance
tests (1). In particular, we showed that t-tests can be used to compare the
results from two analytical methods or chemical processes. In this article,
we will expand on the theme of significance testing by showing how ANOVA
can be used to compare the results from more than two sets of data at
the same time, and how it is particularly useful in analysing data from
designed experiments.

With the advent of built-in spreadsheet functions and affordable dedicated statistical software packages, Analysis of Variance (ANOVA) has become relatively simple to carry out. This article will therefore concentrate on how to select the correct variant of the ANOVA method, the advantages of ANOVA, how to interpret the results and how to avoid some of the pitfalls. For those wanting more detailed theory than is given in the following section, several texts are available (2–5).

A bit of ANOVA theory
Whenever we make repeated measurements there is always some variation. Sometimes this variation (known as within-group variation) makes it difficult for analysts to see if there have been significant changes between different groups of replicates. For example, in Figure 1 (which shows the results from four replicate analyses by 12 analysts), we can see that the total variation is a combination of the spread of results within groups and the spread between the mean values (between-group variation). The statistic that measures the within- and between-group variations in ANOVA is called the sum of squares and often appears in the output tables abbreviated as SS. It can be shown that the different sums of squares calculated in ANOVA are equivalent to variances (1). The central tenet of ANOVA is that the total SS in an experiment can be divided into the components caused by random error, given by the within-group (or sample) SS, and the components resulting from differences between means. It is these latter components that are used to test for statistical significance using a simple F-test (1).

Why not use multiple t-tests instead of ANOVA?
Why should we use ANOVA in preference to carrying out a series of t-tests? I think this is best explained by using an example; suppose we want to compare the results from 12 analysts taking part in a training exercise. If we were to use t-tests, we would need to calculate 66 t-values. Not only is this a lot of work but the chance of reaching a wrong conclusion increases. The correct way to analyse this sort of data is to use one-way ANOVA.

One-way ANOVA
One-way ANOVA will answer the question: "Is there a significant difference between the mean values (or levels), given that the means are calculated from a number of replicate observations?" "Significant" refers to an observed spread of means that would not normally arise from the chance variation within groups. We have already seen an example of this type of problem in the form of the data contained in Figure 1, which shows the results from 12 different analysts analysing the same material. Using these data and a spreadsheet, the results obtained from carrying out one-way ANOVA are reported in Example 1. In this example, the ANOVA shows there are significant differences between analysts (Fvalue > Fcrit at the 95% confidence level). This result is obvious from a plot of the data (Figure 1) but in many situations a visual inspection of a plot will not give such a clear-cut result. Notice that the output also includes a p-value (see the "Interpretation of the result(s)" section, which follows).

Note: ANOVA cannot tell us which individual mean or means are different from the consensus value and in what direction they deviate. The most effective way to show this is to plot the data (Figure 1) or alternatively, but less effectively, carry out a multiple comparison test such as Scheffe's test (2). It is also important to make sure the right questions are being asked and that the right data are being captured. In Example 1, it is possible that the time difference between the analysts carrying out the determinations is the reason for the difference in the mean values. This example shows how good experimental design procedures could have prevented ambiguity in the conclusions.
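The figure of 66 t-values quoted for the 12-analyst exercise, and the growing chance of a wrong conclusion, can be illustrated with a short Python sketch. It assumes a 5% false-positive rate per test and, as a simplification that is not part of the original argument, treats the pairwise tests as independent; the variable names are illustrative only.

```python
from math import comb

n_groups = 12   # analysts taking part in the training exercise
alpha = 0.05    # false-positive rate of a single test at 95% confidence

# Number of distinct pairs of analysts, i.e. t-tests needed
n_pairs = comb(n_groups, 2)

# Chance of at least one false positive across all tests.
# ASSUMPTION: the tests are treated as independent, which is only
# a rough approximation for pairwise comparisons sharing groups.
familywise_error = 1 - (1 - alpha) ** n_pairs

print("pairwise t-tests needed:", n_pairs)
print("approx. chance of at least one false positive:",
      round(familywise_error, 3))
```

Under these assumptions almost every such exercise would flag at least one spurious difference, which is why a single ANOVA is preferred.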

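The one-way calculation reported in Example 1 can be sketched in plain Python as follows. This is a minimal illustration rather than a reproduction of the spreadsheet routine: the function name `one_way_anova` is hypothetical, the data are typed in as printed in Example 1, and because the printed values appear to be rounded the sums of squares may agree with the published table only approximately. The critical value is the one quoted in Example 1, not computed here.

```python
# Example 1 data as printed: four replicates for each analyst A_1..A_12
groups = [
    [34.1, 34.1, 34.69, 34.6],       # A_1
    [35.84, 36.58, 31.3, 34.19],     # A_2
    [36.67, 37.33, 36.96, 36.83],    # A_3
    [40.54, 40.67, 40.81, 40.78],    # A_4
    [41.19, 40.29, 40.99, 40.4],     # A_5
    [41.22, 39.61, 37.89, 36.67],    # A_6
    [40.71, 40.91, 40.8, 38.42],     # A_7
    [39.2, 39.3, 39.3, 39.3],        # A_8
    [42.5, 42.3, 42.5, 42.5],        # A_9
    [39.75, 39.69, 39.23, 39.73],    # A_10
    [36.04, 37.03, 36.85, 36.24],    # A_11
    [44.36, 45.73, 45.25, 45.34],    # A_12
]


def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_value)."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # Spread of the group mean about the grand mean
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Spread of the replicates about their own group mean
        ss_within += sum((x - group_mean) ** 2 for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_value


ss_b, ss_w, df_b, df_w, f_value = one_way_anova(groups)
F_CRIT = 2.066606  # quoted in Example 1 (95% level, 11 and 36 df)

print("SS between:", round(ss_b, 3), "df:", df_b)
print("SS within: ", round(ss_w, 3), "df:", df_w)
print("F:", round(f_value, 2), "significant:", f_value > F_CRIT)
```

The between-group mean square is divided by the within-group mean square, so a large F indicates that the analysts' means are spread more widely than replicate scatter alone would explain.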
Example 1: An example of one-way ANOVA carried out by Excel

              A_1     A_2     A_3     A_4     A_5     A_6
Replicate 1   34.1    35.84   36.67   40.54   41.19   41.22
Replicate 2   34.1    36.58   37.33   40.67   40.29   39.61
Replicate 3   34.69   31.3    36.96   40.81   40.99   37.89
Replicate 4   34.6    34.19   36.83   40.78   40.4    36.67

              A_7     A_8     A_9     A_10    A_11    A_12
Replicate 1   40.71   39.2    42.5    39.75   36.04   44.36
Replicate 2   40.91   39.3    42.3    39.69   37.03   45.73
Replicate 3   40.8    39.3    42.5    39.23   36.85   45.25
Replicate 4   38.42   39.3    42.5    39.73   36.24   45.34

Anova: Single Factor
Source of Variation   SS         df   MS         F          P-value   F crit
Between Groups        438.7988   11   39.8908    40.31545   6.6E-17   2.066606
Within Groups         35.6208    36   0.989467

(Note: the data table has been split into two sections (A_1 to A_6, A_7 to A_12) for display purposes. The ANOVA is carried out on a single table.)

SS = sum of squares, df = degrees of freedom, MS = mean square (SS/df).

The P-value is < 0.05 (Fvalue is > Fcrit - 95% confidence level for 11 and 36 degrees of freedom) therefore it can be concluded that there is a significant difference between the analysts' results.

Two-way ANOVA
In a typical experiment things can be more complex than described previously. For example, in Example 2 the aim is to find out if time and/or temperature have any effect on protein yield when analysing samples of tinned ham. When analysing data from this type of experiment we use two-way ANOVA. Two-way ANOVA can test the significance of each of two experimental variables (factors or treatments) with respect to the response, such as an instrument's output. When replicate measurements are made we can also examine whether or not there are significant interactions between variables. An interaction is said to be present when the response being measured changes more than can be explained from the change in level of an individual factor. This is illustrated in Figure 2 for a process with two factors (Y and Z) when both factors are studied at two levels (low and high). In Figure 2(b), the changes in response caused by Y depend on Z, and vice versa.

In two-way ANOVA we ask the following questions:
• Is there a significant interaction between the two factors (variables)?
• Does a change in any of the factors affect the measured result?
It is important to check the answers in the right order: Figure 3 illustrates the decision process. In the case of Example 2 the questions are:
• Is there an interaction between temperature and time which affects the protein yield?
• Does time and/or temperature affect the protein yield?

Example 2: Two-way ANOVA
The analysis of tinned ham was carried out at three temperatures (415, 435 and 460 °C) and three times (30, 60 and 90 minutes). Three analyses, determining protein yield, were made at each temperature and time. The measurements are summarized in the diagram below and the results of the two-way ANOVA are given in the table.

[Diagram: the three replicate measurements plotted for each combination of time (30, 60 and 90 min) and temperature (415, 435 and 460 °C), on a scale running from 26.9 to 27.3.]

Time (min)   Temp (°C)
             415      435      460
30           27.13    27.2     27.03
30           27.2     26.97    27.1
30           27.13    27.13    27.13
60           27.29    27.07    27.1
60           27.13    27.1     27.07
60           27.23    27.03    27.03
90           27.03    27.2     27.03
90           27.13    27.23    27.07
90           27.07    27.27    26.9

Anova: Two-factor with replication
Source of Variation       SS         df   MS         F          P-value    F crit
Sample (=Time)            0.000867   2    0.000433   0.100429   0.904952   3.554561
Columns (=Temperature)    0.049689   2    0.024844   5.75794    0.011667   3.554561
Interaction               0.087644   4    0.021911   5.078112   0.006437   2.927749
Within                    0.077667   18   0.004315
Total                     0.215867   26

Note: in the above example, the spreadsheet (Excel) labels Source of Variation as Sample, Columns, Interaction and Within. Sample = Time, Columns = Temperature, Interaction is the interaction between temperature and time, and Within is a measure of the within-group variation.

Using the built-in functions of a spreadsheet (in this case Excel's data analysis tools, two-factor analysis with replication) we see that there is a significant interaction between time and temperature and a significant effect of temperature alone (both p-value < 0.05 and F > Fcrit). Following the process outlined in Figure 3, we consider the interaction question first by comparing the mean squares (MS) for the within-group variation with the interaction MS. This is reported in the results table of Example 2.

F = 0.021911/0.004315 = 5.078

If the interaction is significant (F > Fcrit), as in this case, then the individual factors (time and temperature) should each be compared with the MS for the interaction (not the within-group MS) thus:

Ftemp = 0.024844/0.021911 = 1.134


Ftime = 0.000433/0.021911 = 0.020

Fcrit = 6.944, for 2 and 4 degrees of freedom (at the 95% confidence level)

In other words, there is no significant difference between the interaction of time and temperature with respect to either of the individual factors, and, therefore, the interaction of temperature with time is worth further investigation. If one or both of the individual factors were significant compared with the interaction, then the individual factor or factors would dominate and for all practical purposes any interaction could be ignored.

If the interaction term is not significant then it can be considered to be another small error term and can thus be pooled with the within-group (error) sums of squares term. It is the pooled value ($SS^2_{\mathrm{pooled}}$) that is then used as the denominator in the F-test to determine if the individual factors affect the measured results significantly. To combine the sums of squares the following formula is used:

$$SS^2_{\mathrm{pooled}} = \frac{SS_{\mathrm{inter}} + SS_{\mathrm{within}}}{dof_{\mathrm{inter}} + dof_{\mathrm{within}}}$$

where $dof_{\mathrm{inter}}$ and $dof_{\mathrm{within}}$ are the degrees of freedom for the interaction term and error term, and $SS_{\mathrm{inter}}$ and $SS_{\mathrm{within}}$ are the sums of squares for the interaction term and error term, respectively.

($dof_{\mathrm{pooled}} = dof_{\mathrm{inter}} + dof_{\mathrm{within}}$)

Interpretation of the result(s)
To reiterate the interpretation of ANOVA results, a calculated F-value that is greater than Fcrit for a stated level of confidence (typically 95%) means that the difference being tested is statistically significant at that level. As an alternative to using the F-values, the p-value can be used to indicate the degree of confidence we have that there is a significant difference between means (i.e., (1-p) * 100 is the percentage confidence). Normally a p-value of 0.05 or less is considered to denote a significant difference.

Note: Extrapolation of ANOVA results is not advisable, so in Example 2 for instance, it is impossible to say if a time of 15 or 120 minutes would lead to a measurable effect on protein yield. It is, therefore, always more economic in the long run to design the experiment in advance, in order to cover the likely ranges of the parameter(s) of interest.
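The two-way calculation for Example 2, including the comparison of each factor with the interaction mean square and the pooling formula, can be sketched in plain Python. This is a minimal illustration for a balanced design with equal replication, not the spreadsheet's own routine; the data layout and variable names are assumptions made for the example, and the interaction sum of squares is obtained by subtraction from the total.

```python
# Example 2 data: data[time_min][temp_C] -> three replicate protein yields
data = {
    30: {415: [27.13, 27.2, 27.13], 435: [27.2, 26.97, 27.13],
         460: [27.03, 27.1, 27.13]},
    60: {415: [27.29, 27.13, 27.23], 435: [27.07, 27.1, 27.03],
         460: [27.1, 27.07, 27.03]},
    90: {415: [27.03, 27.13, 27.07], 435: [27.2, 27.23, 27.27],
         460: [27.03, 27.07, 26.9]},
}


def mean(values):
    return sum(values) / len(values)


times = sorted(data)
temps = sorted(data[times[0]])
n_rep = len(data[times[0]][temps[0]])  # replicates per cell (balanced)

everything = [x for t in times for c in temps for x in data[t][c]]
grand = mean(everything)
time_means = [mean([x for c in temps for x in data[t][c]]) for t in times]
temp_means = [mean([x for t in times for x in data[t][c]]) for c in temps]

# Sums of squares for each source of variation
ss_time = n_rep * len(temps) * sum((m - grand) ** 2 for m in time_means)
ss_temp = n_rep * len(times) * sum((m - grand) ** 2 for m in temp_means)
ss_within = sum((x - mean(data[t][c])) ** 2
                for t in times for c in temps for x in data[t][c])
ss_total = sum((x - grand) ** 2 for x in everything)
ss_inter = ss_total - ss_time - ss_temp - ss_within  # by subtraction

df_time, df_temp = len(times) - 1, len(temps) - 1
df_inter = df_time * df_temp
df_within = len(times) * len(temps) * (n_rep - 1)

ms_time, ms_temp = ss_time / df_time, ss_temp / df_temp
ms_inter, ms_within = ss_inter / df_inter, ss_within / df_within

# Step 1: test the interaction against the within-group mean square
f_inter = ms_inter / ms_within
# Step 2 (interaction significant): test factors against the interaction
f_temp_vs_inter = ms_temp / ms_inter
f_time_vs_inter = ms_time / ms_inter
# Alternative step 2 (interaction NOT significant): pooled denominator
ms_pooled = (ss_inter + ss_within) / (df_inter + df_within)

print("F interaction:", round(f_inter, 3))
print("F temp vs interaction:", round(f_temp_vs_inter, 3))
print("F time vs interaction:", round(f_time_vs_inter, 3))
print("pooled mean square (not needed here):", round(ms_pooled, 6))
```

The F-values are then compared with the critical values quoted in the text (2.927749 for the interaction; 6.944 for each factor against the interaction). The pooled mean square is computed only to show the formula, since in Example 2 the interaction is significant and pooling does not apply.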
Selecting the ANOVA method
One-way ANOVA should be used when there is only one factor being considered and replicate data from changing the level of that factor are available. Two-way ANOVA (with or without replication) is used when there are two factors being considered. If no replicate data are collected then the interactions between the two factors cannot be calculated. Higher level ANOVAs are also available for looking at more than two factors.

Advantages of ANOVA
Compared with using multiple t-tests, one-way and two-way ANOVA require fewer measurements to discover significant effects (i.e., the tests are said to have more power). This is one reason why ANOVA is used frequently when analysing data from statistically designed experiments.
Other ANOVA and multivariate ANOVA (MANOVA) methods exist for more complex experimental situations but a description of these is beyond the scope of this introductory article. More details can be found in reference 6.

[Figure 1: Plot comparing the results from 12 analysts. Analyte concentration (ppm, 30 to 48) is plotted against analyst ID (A1 to A12); the mean and the total standard deviation are marked.]

Avoiding some of the pitfalls using ANOVA
In ANOVA it is assumed that the data for each variable are normally distributed. Usually in ANOVA we don't have a large amount of data so it is difficult to prove any departure from normality. It has been shown, however, that even quite large deviations do not affect the decisions made on the basis of the F-test.
A more important assumption about ANOVA is that the variance (spread) is homogeneous from group to group (homoscedastic). If this is not the case (this often happens in chemistry, see Figure 1) then the F-test can suggest a statistically significant difference when none is present. The best way to avoid this pitfall is, as ever, to plot the data. There also exist a number of tests for heteroscedasticity (e.g., Bartlett's test (5) and Levene's test (2)). It may be possible to overcome this type of problem in the data structure by transforming it, such as by taking logs (7).
If the variability within a group is correlated with its mean value then ANOVA may not be appropriate and/or it may indicate the presence of outliers in the data (Figure 4). Cochran's test (5) can be used to test for variance outliers.
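As a rough illustration of the variance checks described above, the Python sketch below lists the mean and variance of each analyst's results from Example 1 (the pairs that would be plotted in a variance-versus-mean plot) and computes the Cochran C statistic, taken here as the largest group variance divided by the sum of all group variances. The data are as printed in Example 1; the variable names are illustrative, and no critical value is computed, so the statistic would still need to be compared with tabulated values for 12 groups of 4 replicates.

```python
from statistics import mean, variance

# Example 1 data as printed: four replicates per analyst
groups = {
    "A_1": [34.1, 34.1, 34.69, 34.6],
    "A_2": [35.84, 36.58, 31.3, 34.19],
    "A_3": [36.67, 37.33, 36.96, 36.83],
    "A_4": [40.54, 40.67, 40.81, 40.78],
    "A_5": [41.19, 40.29, 40.99, 40.4],
    "A_6": [41.22, 39.61, 37.89, 36.67],
    "A_7": [40.71, 40.91, 40.8, 38.42],
    "A_8": [39.2, 39.3, 39.3, 39.3],
    "A_9": [42.5, 42.3, 42.5, 42.5],
    "A_10": [39.75, 39.69, 39.23, 39.73],
    "A_11": [36.04, 37.03, 36.85, 36.24],
    "A_12": [44.36, 45.73, 45.25, 45.34],
}

# Sample variance (n - 1 denominator) of each analyst's replicates
variances = {name: variance(values) for name, values in groups.items()}

# (mean, variance) pairs, as would be plotted to look for a trend
for name, values in groups.items():
    print(name, round(mean(values), 2), round(variances[name], 3))

# Cochran's C: share of the total variance held by the largest variance
suspect = max(variances, key=variances.get)
cochran_c = variances[suspect] / sum(variances.values())
print("largest variance:", suspect, "C =", round(cochran_c, 3))
```

A single group holding a large share of the summed variance is a candidate variance outlier and is worth inspecting on the plot before the ANOVA result is relied upon.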

[Figure 2: Interactive factors. Response is plotted against Y (low and high) for Z low and Z high: (a) Y and Z are independent; (b) Y and Z are interacting.]

[Figure 3: Comparing mean squares in two-way ANOVA with replication. Flow chart: Start. Compare within-group mean squares with interaction mean squares. Significant difference (F > Fcrit)? If yes, compare interaction mean squares with individual factor mean squares. If no, pool the within-group and interaction sums of squares, then compare pooled mean squares with individual factor mean squares.]

[Figure 4: A plot of variance versus the mean. Variance is plotted against mean value; annotations mark "Significantly different means by ANOVA" and "Unreliable high mean (may contain outliers)".]

Conclusions
• ANOVA is a powerful tool for determining if there is a statistically significant difference between two or more sets of data.
• One-way ANOVA should be used when we are comparing several sets of observations.
• Two-way ANOVA is the method used when there are two separate factors that may be influencing a result.
• Except for the smallest of data sets ANOVA is best carried out using a spreadsheet or statistical software package.
• You should always plot your data to make sure the assumptions ANOVA is based on are not violated.

Acknowledgements
The preparation of this paper was supported under a contract with the UK Department of Trade and Industry as part of the National Measurement System Valid Analytical Measurement Programme (VAM) (8).

References
(1) S. Burke, Scientific Data Management 1(1), 32–38, September 1997.
(2) G.A. Milliken and D.E. Johnson, Analysis of Messy Data, Volume 1: Designed Experiments, Van Nostrand Reinhold Company, New York, USA (1984).
(3) J.C. Miller and J.N. Miller, Statistics for Analytical Chemistry, Ellis Horwood PTR Prentice Hall, London, UK (ISBN 0 13 0309907).
(4) C. Chatfield, Statistics for Technology, Chapman & Hall, London, UK (ISBN 0412 25340 2).
(5) T.J. Farrant, Practical Statistics for the Analytical Scientist, A Bench Guide, Royal Society of Chemistry, London, UK (ISBN 0 85404 442 6) (1997).
(6) K.V. Mardia, J.T. Kent and J.M. Bibby, Multivariate Analysis, Academic Press Inc. (ISBN 0 12 471252 5) (1979).
(7) ISO 4259: 1992. Petroleum Products - Determination and Application of Precision Data in Relation to Methods of Test. Annex E, International Organisation for Standardisation, Geneva, Switzerland (1992).
(8) M. Sargent, VAM Bulletin, Issue 13, 4–5, Laboratory of the Government Chemist (Autumn 1995).

Shaun Burke currently works in the Food Technology Department of RHM Technology Ltd, High Wycombe, Buckinghamshire, UK. However, these articles were produced while he was working at LGC, Teddington, Middlesex, UK (http://www.lgc.co.uk).