
Analysis of Variance

Shaun Burke, RHM Technology Ltd, High Wycombe, Buckinghamshire, UK.

contained in analytical data. This second part in our statistics refresher series looks at one of the most frequently used of these tools: Analysis of Variance (ANOVA). In the previous paper we examined the initial steps in describing the structure of the data and explained a number of alternative significance tests (1). In particular, we showed that t-tests can be used to compare the results from two analytical methods or chemical processes. In this article, we will expand on the theme of significance testing by showing how ANOVA can be used to compare the results from more than two sets of data at the same time, and how it is particularly useful in analysing data from designed experiments.

With the advent of built-in spreadsheet functions and affordable dedicated statistical software packages, Analysis of Variance (ANOVA) has become relatively simple to carry out. This article will therefore concentrate on how to select the correct variant of the ANOVA method, the advantages of ANOVA, how to interpret the results and how to avoid some of the pitfalls. For those wanting more detailed theory than is given in the following section, several texts are available (2–5).

A bit of ANOVA theory

Whenever we make repeated measurements there is always some variation. Sometimes this variation (known as within-group variation) makes it difficult for analysts to see if there have been significant changes between different groups of replicates. For example, in Figure 1 (which shows the results from four replicate analyses by 12 analysts), we can see that the total variation is a combination of the spread of results within groups and the spread between the mean values (between-group variation). The statistic that measures the within-group and between-group variations in ANOVA is called the sum of squares and often appears in the output tables abbreviated as SS. It can be shown that the different sums of squares calculated in ANOVA, once divided by their degrees of freedom, are equivalent to variances (1). The central tenet of ANOVA is that the total SS in an experiment can be divided into the components caused by random error, given by the within-group (or sample) SS, and the components resulting from differences between means. It is these latter components that are used to test for statistical significance using a simple F-test (1).

Why not use multiple t-tests instead of ANOVA?

Why should we use ANOVA in preference to carrying out a series of t-tests? I think this is best explained by using an example: suppose we want to compare the results from 12 analysts taking part in a training exercise. If we were to use t-tests, we would need to calculate 66 t-values, one for each pair of analysts. Not only is this a lot of work but the chance of reaching a wrong conclusion increases. The correct way to analyse this sort of data is to use one-way ANOVA.

One-way ANOVA

One-way ANOVA will answer the question: Is there a significant difference between the mean values (or levels), given that the means are calculated from a number of replicate observations? "Significant" refers to the observed spread of means that would not normally arise from the chance variation within groups. We have already seen an example of this type of problem in the form of the data contained in Figure 1, which shows the results from 12 different analysts analysing the same material. Using these data and a spreadsheet, the results obtained from carrying out one-way ANOVA are reported in Example 1. In this example, the ANOVA shows there are significant differences between analysts (F > Fcrit at the 95% confidence level). This result is obvious from a plot of the data (Figure 1) but in many situations a visual inspection of a plot will not give such a clear-cut result. Notice that the output also includes a p-value (see the Interpretation of the result(s) section, which follows).

Note: ANOVA cannot tell us which individual mean or means are different from the consensus value and in what direction they deviate. The most effective way to show this is to plot the data (Figure 1) or alternatively, but less effectively, carry out a multiple comparison test such as Scheffe's test (2). It is also important to make sure the right questions are being asked and that the right data are being captured. In Example 1, it is possible that the time difference between the analysts carrying out the determinations is the reason for the difference in the mean values. This example shows how good experimental design procedures could have prevented ambiguity in the conclusions.
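The contrast between the two routes can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet procedure used for Example 1: it counts the pairwise t-tests needed for 12 analysts, estimates the chance of at least one false positive under the simplifying assumption (not made in the article) that the tests are independent and each run at the 5% level, and then computes the single one-way F value from the Example 1 data as printed. The names `results`, `one_way_anova` and `familywise_error` are illustrative.

```python
from math import comb

# Example 1 results as printed (four replicates per analyst).
results = {
    "A_1": [34.1, 34.1, 34.69, 34.6],
    "A_2": [35.84, 36.58, 31.3, 34.19],
    "A_3": [36.67, 37.33, 36.96, 36.83],
    "A_4": [40.54, 40.67, 40.81, 40.78],
    "A_5": [41.19, 40.29, 40.99, 40.4],
    "A_6": [41.22, 39.61, 37.89, 36.67],
    "A_7": [40.71, 40.91, 40.8, 38.42],
    "A_8": [39.2, 39.3, 39.3, 39.3],
    "A_9": [42.5, 42.3, 42.5, 42.5],
    "A_10": [39.75, 39.69, 39.23, 39.73],
    "A_11": [36.04, 37.03, 36.85, 36.24],
    "A_12": [44.36, 45.73, 45.25, 45.34],
}


def one_way_anova(groups):
    """Return (SS_between, df_between, SS_within, df_within, F)."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    # Between-group SS: spread of the group means about the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group SS: spread of each result about its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, df_between, ss_within, df_within, f_value


# Route 1: one t-test for every pair of analysts.
n_pairs = comb(len(results), 2)
# Assumption for illustration only: independent tests, each at alpha = 0.05.
familywise_error = 1 - (1 - 0.05) ** n_pairs

# Route 2: a single one-way ANOVA.
ss_between, df_between, ss_within, df_within, f_value = one_way_anova(
    list(results.values())
)
F_CRIT = 2.066606  # F crit quoted in Example 1 for 11 and 36 degrees of freedom

print(f"pairwise t-tests needed: {n_pairs}")
print(f"chance of at least one false positive: {familywise_error:.2f}")
print(f"SS between = {ss_between:.3f} (df {df_between})")
print(f"SS within  = {ss_within:.3f} (df {df_within})")
print(f"F = {f_value:.2f}, significant: {f_value > F_CRIT}")
```

The F value comes out at roughly 40, in line with Example 1. Small differences from the spreadsheet figures are to be expected because the printed data appear to have been rounded for display.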

LCGC Europe Online Supplement, statistics and data analysis

Example 1 One-way ANOVA

| | A_1 | A_2 | A_3 | A_4 | A_5 | A_6 |
|---|---|---|---|---|---|---|
| Replicate 1 | 34.1 | 35.84 | 36.67 | 40.54 | 41.19 | 41.22 |
| Replicate 2 | 34.1 | 36.58 | 37.33 | 40.67 | 40.29 | 39.61 |
| Replicate 3 | 34.69 | 31.3 | 36.96 | 40.81 | 40.99 | 37.89 |
| Replicate 4 | 34.6 | 34.19 | 36.83 | 40.78 | 40.4 | 36.67 |

| | A_7 | A_8 | A_9 | A_10 | A_11 | A_12 |
|---|---|---|---|---|---|---|
| Replicate 1 | 40.71 | 39.2 | 42.5 | 39.75 | 36.04 | 44.36 |
| Replicate 2 | 40.91 | 39.3 | 42.3 | 39.69 | 37.03 | 45.73 |
| Replicate 3 | 40.8 | 39.3 | 42.5 | 39.23 | 36.85 | 45.25 |
| Replicate 4 | 38.42 | 39.3 | 42.5 | 39.73 | 36.24 | 45.34 |

Anova: Single Factor

| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between Groups | 438.7988 | 11 | 39.8908 | 40.31545 | 6.6E-17 | 2.066606 |
| Within Groups | 35.6208 | 36 | 0.989467 | | | |

(Note: the data table has been split into two sections (A_1 to A_6, A_7 to A_12) for display purposes. The ANOVA is carried out on a single table.)

SS = sum of squares, df = degrees of freedom, MS = mean square (SS/df).

The P-value is < 0.05 (F > Fcrit at the 95% confidence level for 11 and 36 degrees of freedom), therefore it can be concluded that there is a significant difference between the analysts' results.

Two-way ANOVA

In a typical experiment things can be more complex than described previously. For example, in Example 2 the aim is to find out if time and/or temperature have any effect on protein yield when analysing samples of tinned ham. When analysing data from this type of experiment we use two-way ANOVA. Two-way ANOVA can test the significance of each of two experimental variables (factors or treatments) with respect to the response, such as an instrument's output. When replicate measurements are made we can also examine whether or not there are significant interactions between variables. An interaction is said to be present when the response being measured changes more than can be explained from the change in level of an individual factor. This is illustrated in Figure 2 for a process with two factors (Y and Z) when both factors are studied at two levels (low and high). In Figure 2(b), the changes in response caused by Y depend on Z, and vice versa.

Example 2 Two-way ANOVA

The analysis of tinned ham was carried out at three temperatures (415, 435 and 460 °C) and three times (30, 60 and 90 minutes). Three analyses, determining protein yield, were made at each temperature and time. The measurements are summarized in the diagram below and the results of the two-way ANOVA are given in the table.

[Diagram: the three replicate measurements (axis range 26.9 to 27.3) plotted for each combination of time (30, 60 and 90 min) and temperature (415, 435 and 460 °C).]

| Time (min) | 415 °C | 435 °C | 460 °C |
|---|---|---|---|
| 30 | 27.13 | 27.2 | 27.03 |
| 30 | 27.2 | 26.97 | 27.1 |
| 30 | 27.13 | 27.13 | 27.13 |
| 60 | 27.29 | 27.07 | 27.1 |
| 60 | 27.13 | 27.1 | 27.07 |
| 60 | 27.23 | 27.03 | 27.03 |
| 90 | 27.03 | 27.2 | 27.03 |
| 90 | 27.13 | 27.23 | 27.07 |
| 90 | 27.07 | 27.27 | 26.9 |

Anova: Two-factor with replication

| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Sample (= Time) | 0.000867 | 2 | 0.000433 | 0.100429 | 0.904952 | 3.554561 |
| Columns (= Temperature) | 0.049689 | 2 | 0.024844 | 5.75794 | 0.011667 | 3.554561 |
| Interaction | 0.087644 | 4 | 0.021911 | 5.078112 | 0.006437 | 2.927749 |
| Within | 0.077667 | 18 | 0.004315 | | | |

Note: in the above example, the spreadsheet (Excel) labels Source of Variation as Sample, Columns, Interaction and Within. Sample = Time, Columns = Temperature, Interaction is the interaction between temperature and time, and Within is a measure of the within-group variation.

In two-way ANOVA we ask the following questions:

- Is there a significant interaction between the two factors (variables)?
- Does a change in any of the factors affect the measured result?

It is important to check the answers in the right order: Figure 3 illustrates the decision process. In the case of Example 2, the questions are:

- Is there an interaction between temperature and time which affects the protein yield?
- Does time and/or temperature affect the protein yield?

Using a spreadsheet (in this case Excel's data analysis tools, two-factor analysis with replication) we see that there is a significant interaction between time and temperature and a significant effect of temperature alone (both p-value < 0.05 and F > Fcrit). Following the process outlined in Figure 3, we consider the interaction question first by comparing the mean squares (MS) for the within-group variation with the interaction MS. This is reported in the results table of Example 2.

F = 0.021911/0.004315 = 5.078

If the interaction is significant (F > Fcrit), as in this case, then the individual factors (time and temperature) should each be compared with the MS for the interaction (not the within-group MS) thus:

F(temperature) = 0.024844/0.021911 = 1.134

F(time) = 0.000433/0.021911 = 0.020


Fcrit = 6.944, for 2 and 4 degrees of freedom (at the 95% confidence level).

In other words, there is no significant difference between the interaction of time and temperature with respect to either of the individual factors, and, therefore, the interaction of temperature with time is worth further investigation. If one or both of the individual factors were significant compared with the interaction, then the individual factor or factors would dominate and for all practical purposes any interaction could be ignored.

If the interaction term is not significant then it can be considered to be another small error term and can thus be pooled with the within-group (error) sums of squares term. It is the pooled value ($s^2_{\text{pooled}}$) that is then used as the denominator in the F-test to determine if the individual factors affect the measured results significantly. To combine the sums of squares the following formula is used:

$$ s^2_{\text{pooled}} = \frac{\mathrm{SS}_{\text{inter}} + \mathrm{SS}_{\text{within}}}{\mathrm{dof}_{\text{inter}} + \mathrm{dof}_{\text{within}}} $$

where $\mathrm{dof}_{\text{inter}}$ and $\mathrm{dof}_{\text{within}}$ are the degrees of freedom for the interaction term and error term, and $\mathrm{SS}_{\text{inter}}$ and $\mathrm{SS}_{\text{within}}$ are the sums of squares for the interaction term and error term, respectively. The pooled term has $\mathrm{dof}_{\text{pooled}} = \mathrm{dof}_{\text{inter}} + \mathrm{dof}_{\text{within}}$ degrees of freedom.

Interpretation of the result(s)

To reiterate the interpretation of ANOVA results, a calculated F-value that is greater than Fcrit for a stated level of confidence (typically 95%) means that the difference being tested is statistically significant at that level. As an alternative to using the F-values, the p-value can be used to indicate the degree of confidence we have that there is a significant difference between means (i.e., (1 − p) × 100 is the percentage confidence). Normally a p-value of 0.05 or less is considered to denote a significant difference.

Note: Extrapolation of ANOVA results is not advisable, so in Example 2, for instance, it is impossible to say if a time of 15 or 120 minutes would lead to a measurable effect on protein yield. It is, therefore, always more economic in the long run to design the experiment in advance, in order to cover the likely ranges of the parameter(s) of interest.
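The two-way calculation for Example 2, the decision process of Figure 3 and the pooling formula can be sketched together in Python. This is a minimal sketch assuming a balanced design (the same number of replicates in every cell); it is not the Excel procedure itself, and the names `data`, `two_way_anova` and `factor_f_values` are illustrative. The critical value is the one quoted in the Example 2 table.

```python
# Example 2 data: data[time in min][temperature] -> three replicate protein yields.
data = {
    30: {415: [27.13, 27.20, 27.13], 435: [27.20, 26.97, 27.13], 460: [27.03, 27.10, 27.13]},
    60: {415: [27.29, 27.13, 27.23], 435: [27.07, 27.10, 27.03], 460: [27.10, 27.07, 27.03]},
    90: {415: [27.03, 27.13, 27.07], 435: [27.20, 27.23, 27.27], 460: [27.03, 27.07, 26.90]},
}


def mean(xs):
    return sum(xs) / len(xs)


def two_way_anova(data):
    """Sums of squares and degrees of freedom for a balanced two-factor design."""
    rows = list(data)                  # levels of the first factor (time)
    cols = list(data[rows[0]])         # levels of the second factor (temperature)
    n = len(data[rows[0]][cols[0]])    # replicates per cell
    grand = mean([x for r in rows for c in cols for x in data[r][c]])
    row_mean = {r: mean([x for c in cols for x in data[r][c]]) for r in rows}
    col_mean = {c: mean([x for r in rows for x in data[r][c]]) for c in cols}
    ss_rows = n * len(cols) * sum((row_mean[r] - grand) ** 2 for r in rows)
    ss_cols = n * len(rows) * sum((col_mean[c] - grand) ** 2 for c in cols)
    ss_cells = n * sum((mean(data[r][c]) - grand) ** 2 for r in rows for c in cols)
    ss_within = sum(
        (x - mean(data[r][c])) ** 2 for r in rows for c in cols for x in data[r][c]
    )
    ss = {"rows": ss_rows, "cols": ss_cols,
          "inter": ss_cells - ss_rows - ss_cols, "within": ss_within}
    df = {"rows": len(rows) - 1, "cols": len(cols) - 1,
          "inter": (len(rows) - 1) * (len(cols) - 1),
          "within": len(rows) * len(cols) * (n - 1)}
    return ss, df


def factor_f_values(ss, df, interaction_significant):
    """F values for the two factors, choosing the denominator as in Figure 3."""
    if interaction_significant:
        # Compare each factor with the interaction mean square.
        denominator = ss["inter"] / df["inter"]
    else:
        # Pool the interaction and within-group terms.
        denominator = (ss["inter"] + ss["within"]) / (df["inter"] + df["within"])
    return {k: (ss[k] / df[k]) / denominator for k in ("rows", "cols")}


ss, df = two_way_anova(data)
f_inter = (ss["inter"] / df["inter"]) / (ss["within"] / df["within"])
F_CRIT_INTER = 2.927749  # F crit for the interaction, from the Example 2 table

f_factors = factor_f_values(ss, df, f_inter > F_CRIT_INTER)
# Arithmetic illustration only: the interaction IS significant in Example 2,
# so pooling would not be applied to these data.
f_pooled = factor_f_values(ss, df, interaction_significant=False)

print(f"F(interaction) = {f_inter:.3f}")
print(f"F(time) = {f_factors['rows']:.3f}, F(temperature) = {f_factors['cols']:.3f}")
```

With the Example 2 data the sums of squares agree with the spreadsheet table, and both factor F values fall well below the Fcrit of 6.944 quoted for 2 and 4 degrees of freedom.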

Selecting the ANOVA method

One-way ANOVA should be used when there is only one factor being considered and replicate data from changing the level of that factor are available. Two-way ANOVA (with or without replication) is used when there are two factors being considered. If no replicate data are collected then the interactions between the two factors cannot be calculated. Higher level ANOVAs are also available for looking at more than two factors.

Advantages of ANOVA

Compared with using multiple t-tests, one-way and two-way ANOVA require fewer measurements to discover significant effects (i.e., the tests are said to have more power). This is one reason why ANOVA is used frequently when analysing data from statistically designed experiments.

Other ANOVA and multivariate ANOVA (MANOVA) methods exist for more complex experimental situations but a description of these is beyond the scope of this introductory article. More details can be found in reference 6.

Avoiding some of the pitfalls using ANOVA

In ANOVA it is assumed that the data for each variable are normally distributed. Usually in ANOVA we don't have a large amount of data so it is difficult to prove any departure from normality. It has been shown, however, that even quite large deviations do not affect the decisions made on the basis of the F-test.

A more important assumption about ANOVA is that the variance (spread) of the groups is homogeneous (homoscedastic). If this is not the case (this often happens in chemistry, see Figure 1) then the F-test can suggest a statistically significant difference when none is present. The best way to avoid this pitfall is, as ever, to plot the data. There also exist a number of tests for heteroscedasticity (e.g., Bartlett's test (5) and Levene's test (2)). It may be possible to overcome this type of problem in the data structure by transforming it, such as by taking logs (7). If the variability within a group is correlated with its mean value then ANOVA may not be appropriate and/or it may indicate the presence of outliers in the data (Figure 4). Cochran's test (5) can be used to test for variance outliers.

Figure 1: Analyte concentration (ppm) plotted against analyst ID (A1 to A12), with the mean and the total standard deviation marked.
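As a rough illustration of the homogeneity-of-variance checks named above, the following Python sketch applies Bartlett's test statistic, in its usual textbook form, and the Cochran ratio (largest variance divided by the sum of variances) to the Example 1 data as printed. Neither formula is given in this article, so both are assumptions here, as is the chi-squared critical value of 19.675 (5% level, 11 degrees of freedom), which is taken from standard tables. No critical value is quoted for Cochran's test; it would need to be looked up for 12 groups of four results.

```python
from math import log

# Example 1 results as printed (four replicates per analyst).
results = {
    "A_1": [34.1, 34.1, 34.69, 34.6],
    "A_2": [35.84, 36.58, 31.3, 34.19],
    "A_3": [36.67, 37.33, 36.96, 36.83],
    "A_4": [40.54, 40.67, 40.81, 40.78],
    "A_5": [41.19, 40.29, 40.99, 40.4],
    "A_6": [41.22, 39.61, 37.89, 36.67],
    "A_7": [40.71, 40.91, 40.8, 38.42],
    "A_8": [39.2, 39.3, 39.3, 39.3],
    "A_9": [42.5, 42.3, 42.5, 42.5],
    "A_10": [39.75, 39.69, 39.23, 39.73],
    "A_11": [36.04, 37.03, 36.85, 36.24],
    "A_12": [44.36, 45.73, 45.25, 45.34],
}


def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def bartlett_statistic(groups):
    """Bartlett's statistic; compared with chi-squared on (k - 1) degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    variances = [sample_variance(g) for g in groups]  # must all be > 0
    pooled = sum((len(g) - 1) * v for g, v in zip(groups, variances)) / (n_total - k)
    numerator = (n_total - k) * log(pooled) - sum(
        (len(g) - 1) * log(v) for g, v in zip(groups, variances)
    )
    correction = 1 + (
        sum(1 / (len(g) - 1) for g in groups) - 1 / (n_total - k)
    ) / (3 * (k - 1))
    return numerator / correction


variances = {name: sample_variance(g) for name, g in results.items()}
statistic = bartlett_statistic(list(results.values()))
CHI2_CRIT = 19.675  # assumption: standard-table value, 5% level, 11 degrees of freedom

# Cochran ratio: how much of the total variance comes from the most variable group.
largest = max(variances, key=variances.get)
cochran_c = variances[largest] / sum(variances.values())

print(f"Bartlett statistic = {statistic:.1f} (critical value {CHI2_CRIT})")
print(f"largest variance: {largest}, Cochran C = {cochran_c:.2f}")
```

For these data the Bartlett statistic is far above the critical value, which is consistent with the remark that the analysts' variances in Figure 1 are not homogeneous. Bartlett's test is itself sensitive to departures from normality, so a plot of the data remains the first check.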

Conclusions

- ANOVA is a powerful tool for determining if there is a statistically significant difference between two or more sets of data.
- One-way ANOVA should be used when we are comparing several sets of observations.
- Two-way ANOVA is the method used when there are two separate factors that may be influencing a result.
- Except for the smallest of data sets, ANOVA is best carried out using a spreadsheet or statistical software package.
- You should always plot your data to make sure the assumptions ANOVA is based on are not violated.

Figure 2: Interactive factors. Response plotted against Y (low and high) at two levels of Z (low and high): (a) Y and Z are independent; (b) Y and Z are interacting.

Figure 3: Comparing mean squares in two-way ANOVA with replication.

- Start: compare within-group mean squares with interaction mean squares. Is there a significant difference (F > Fcrit)?
- Yes: compare interaction mean squares with individual factor mean squares.
- No: pool the within-group and interaction sums of squares, then compare pooled mean squares with individual factor mean squares.

Figure 4: A plot of variance versus the mean. Annotations: "Significantly different means by ANOVA" and "Unreliable high mean (may contain outliers)".

Acknowledgements

The preparation of this paper was supported under a contract with the UK Department of Trade and Industry as part of the National Measurement System Valid Analytical Measurement Programme (VAM) (8).

References

(1) S. Burke, Scientific Data Management 1(1), 32–38, September 1997.
(2) G.A. Milliken and D.E. Johnson, Analysis of Messy Data, Volume 1: Designed Experiments, Van Nostrand Reinhold Company, New York, USA (1984).
(3) J.C. Miller and J.N. Miller, Statistics for Analytical Chemistry, Ellis Horwood PTR Prentice Hall, London, UK (ISBN 0 13 0309907).
(4) C. Chatfield, Statistics for Technology, Chapman & Hall, London, UK (ISBN 0412 25340 2).
(5) T.J. Farrant, Practical Statistics for the Analytical Scientist, A Bench Guide, Royal Society of Chemistry, London, UK (ISBN 0 85404 442 6) (1997).
(6) K.V. Mardia, J.T. Kent and J.M. Bibby, Multivariate Analysis, Academic Press Inc. (ISBN 0 12 471252 5) (1979).
(7) … Determination and Application of Precision Data in Relation to Methods of Test. Annex E, International Organisation for Standardisation, Geneva, Switzerland (1992).
(8) M. Sargent, VAM Bulletin, Issue 13, 4–5, Laboratory of the Government Chemist (Autumn 1995).

… Technology Department of RHM Technology Ltd, High Wycombe, Buckinghamshire, UK. However, these articles were produced while he was working at LGC, Teddington, Middlesex, UK (http://www.lgc.co.uk).
