10: Analysis of Variance
"Data! Data! Data! I can't make bricks without clay!" (Sherlock Holmes)
We have already been introduced to the concept of comparing the means of two populations
when the data gathered represent independent random samples from normal populations.
When the response was quantitative, we learned about two methods, an unpooled method and
a pooled method.
We now turn to a method that allows us to compare the means of two or more normal
populations based on independent random samples when the population variances are assumed
to be equal. This method is called ANALYSIS OF VARIANCE (abbreviated ANOVA) and is an
extension of the two independent samples POOLED t-test.
Let's step back for a moment to our two independent samples t-test. The purpose of this test was
to decide whether or not two population means were equal:
H0: $\mu_1 = \mu_2$ (or $\mu_1 - \mu_2 = 0$)
The test was based on a t statistic that had $n_1 + n_2 - 2$ degrees of freedom.
One-way ANOVA is basically an extension of our two independent samples t-test to handling
more than 2 populations. One-way ANOVA is a technique for testing whether or not the means
of several populations are equal.
Picture:
The assumptions are an extension of those for the two independent samples t-test to k groups.
Each sample is a ... random sample
The k random samples are ... independent
For each population, the model for the response is ... a normal distribution
The k population variances are ... equal
The ANOVA Hypotheses:
H0: $\mu_1 = \mu_2 = \cdots = \mu_k$
versus Ha: at least one $\mu_i$ is different
Notice this alternative does not require all the population means to be different from each other.
One possible Ha picture
Question: Why call a technique for testing the equality of the means
analysis of VARIANCE?
Answer: We are going to compare two estimators of the common population variance, $\sigma^2$.
MS Groups (Mean Square between the Groups):
A good/unbiased estimator of $\sigma^2$ if the null hypothesis H0 is true,
else tends to be too big.
MSE (Mean Square Within or due to Error):
A good/unbiased estimator of $\sigma^2$
These two estimates are used to form the F statistic:
$F = \dfrac{\text{MS Groups}}{\text{MSE}}$
If this F ratio is too BIG we would reject the null hypothesis.
The Logic behind the ANOVA F Test
Look at the plots below. For each Scenario, we have plotted data obtained by taking independent
random samples of size 10 from three populations.
For Scenarios A and B, the three populations each had a normal distribution and the population
means were 60, 65, and 70, respectively. So the population means are indeed not all equal.
In Scenario A, the population standard deviations were all equal to 1.5. In Scenario B, the
population standard deviations were all equal to 3. So in each case the assumption that the
populations have equal standard deviations is met.
Scenario A
Samples from 3 populations whose means are different.
Variability within each population is small.
Difference between sample means more readily seen.
F statistic somewhat big.

Scenario B
Samples from 3 populations whose means are different.
Variability within each population is larger.
Difference between sample means not readily seen.
F statistic smaller.
Which of the above two scenarios do you think would provide more evidence that at least one
of the population means is different from the others? Scenario A or Scenario B?
Below is a final set of plots for three independent random samples of size 10, each taken from a
population with a normal model with a population mean of 65 and population standard deviation
of 1.5. So in Scenario C, the population means are indeed all equal; that is, the null hypothesis
tested in one-way ANOVA is true. Notice that, although the population means were all equal,
there is still some variation between the sample means.
Also in Scenario C there is still some natural variation within the samples, making the slight
variation between the sample means hardly noticeable. The data in Scenario C do not provide
evidence that the population means are different.
Scenario C
Samples from 3 populations whose means are all equal.
Still some variability within each population.
Very little difference between the sample means.
F statistic very small.
The F statistic will be sensitive to differences between the sample means. The larger the variation
between the sample means, the larger the value of the F statistic, and larger values of the F
statistic provide more support for rejecting the null hypothesis. The variation between the
sample means was greater for Scenarios A and B than for Scenario C. The natural variation
within the samples was greater for Scenario B than for Scenarios A and C. The F statistic is
the ratio of these two measures of variation:
$F = \dfrac{\text{variation between the sample means}}{\text{natural variation within the samples}}$
So which scenario would you expect to result in the largest value of the F statistic? Provided
below are the values of the F statistic for the test of equality of the population means.
Scenario          Value of F Statistic   p-value
A (Ha is true)    F = 80.4               0.0000
B (Ha is true)    F = 16.4               0.01
C (H0 is true)    F = 0.17               0.84
Note the value of the F statistic is smallest and the p-value the largest when the null hypothesis
is true (Scenario C). For Scenarios A and B, the population means are different, but the smaller
population standard deviation in Scenario A accentuates the differences by producing a larger F
ratio and an extremely small p-value. A larger F statistic value (and thus smaller p-value)
corresponds to more evidence that the population means are not all equal.
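
The logic of the three scenarios can be tried out with a small simulation. The sketch below is only an illustration, assuming Python's standard library; the names `f_statistic`, `simulate`, and the seed are hypothetical choices, and the simulated F values will not reproduce the 80.4, 16.4, and 0.17 reported above because the original samples are not available.

```python
import random
from statistics import mean, variance


def f_statistic(groups):
    """One-way ANOVA F statistic: MS Groups divided by MSE."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation between the sample means.
    ss_groups = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Natural variation within the samples.
    ss_error = sum((len(g) - 1) * variance(g) for g in groups)
    return (ss_groups / (k - 1)) / (ss_error / (n_total - k))


def simulate(pop_means, pop_sd, n=10, seed=250):
    """Draw one random sample of size n from each normal population."""
    rng = random.Random(seed)  # fixed seed so reruns give the same samples
    return [[rng.gauss(mu, pop_sd) for _ in range(n)] for mu in pop_means]


# Population settings taken from the text: (means, common standard deviation).
scenarios = {
    "A": ([60, 65, 70], 1.5),
    "B": ([60, 65, 70], 3.0),
    "C": ([65, 65, 65], 1.5),
}

for name, (mus, sd) in scenarios.items():
    print(f"Scenario {name}: F = {f_statistic(simulate(mus, sd)):.2f}")
```

Changing the seed gives different samples; over many seeds, Scenario A should tend to give the largest F and Scenario C the smallest.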
Computing the F Test Statistic
We will see how to get MS Groups and MSE and perform the F test. These two mean squares will
each be a sum of squares (SS) divided by its corresponding degrees of freedom (DF).
The data can be generically represented as below, where $X_{ij}$ = the $j$th observation from the $i$th
population. However, we really don't have to worry too much about these subscripts, as we will
go through the steps using words!
Data from Population 1: $X_{11}, X_{12}, \ldots, X_{1n_1}$
Data from Population 2: $X_{21}, X_{22}, \ldots, X_{2n_2}$
...
Data from Population k: $X_{k1}, X_{k2}, \ldots, X_{kn_k}$
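The details leading to the F statistic are presented in six steps, ending with an ANOVA table.
Step 1: Calculate the mean and variance for each sample: $\bar{x}_i$, $s_i^2$
Step 2: Calculate the overall sample mean
(using all $N = n_1 + n_2 + \cdots + n_k$ observations): $\bar{x}$
Step 3: Calculate the sum of squares between groups:
$\text{SS Groups} = \sum_{\text{groups}} n_i (\bar{x}_i - \bar{x})^2$
Step 4: Calculate the sum of squares within groups (due to error):
$\text{SSE} = \sum_{\text{groups}} (n_i - 1) s_i^2$
Step 5: OPTIONAL: Calculate the total sum of squares:
$\text{SS Total} = \sum_{\text{values}} (x_{ij} - \bar{x})^2 = \text{SS Groups} + \text{SSE}$
Step 6: Fill in the ANOVA table: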
Source           DF      Sum of Squares   Mean Square                     F
Groups           k - 1   SS Groups        MS Groups = SS Groups/(k - 1)   MS Groups/MSE
Error (Within)   N - k   SSE              MSE = SSE/(N - k)
Total            N - 1   SS Total
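
The six steps can be sketched in code working directly from raw data. This is a minimal illustration, assuming Python's standard library; the function name `anova_table` and the toy data are hypothetical and not part of the notes. Step 4 is computed here as the squared deviations of each value from its own group mean, which equals the sum of $(n_i - 1) s_i^2$ over groups.

```python
from statistics import mean


def anova_table(groups):
    """Build the one-way ANOVA table (as a dict) from a list of samples."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    group_means = [mean(g) for g in groups]                      # Step 1
    grand_mean = mean(all_values)                                # Step 2
    ss_groups = sum(len(g) * (m - grand_mean) ** 2
                    for g, m in zip(groups, group_means))        # Step 3
    ss_error = sum((x - m) ** 2
                   for g, m in zip(groups, group_means)
                   for x in g)                                   # Step 4
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)    # Step 5
    ms_groups = ss_groups / (k - 1)
    mse = ss_error / (n_total - k)
    return {                                                     # Step 6
        "Groups": {"DF": k - 1, "SS": ss_groups, "MS": ms_groups,
                   "F": ms_groups / mse},
        "Error": {"DF": n_total - k, "SS": ss_error, "MS": mse},
        "Total": {"DF": n_total - 1, "SS": ss_total},
    }


# Hypothetical toy data: three samples of size 3.
toy = [[4, 6, 8], [7, 9, 11], [10, 12, 14]]
for source, row in anova_table(toy).items():
    print(source, row)
```

For the toy data, SS Groups plus SSE matches SS Total, which is why Step 5 is optional.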
If H0 is true, then the F statistic, $F = \dfrac{\text{MS Groups}}{\text{MSE}}$, has an $F(k - 1, N - k)$ distribution. Below are a
few pictures of some F distributions.
From Utts, Jessica M. and Robert F. Heckard. Mind on Statistics, Fourth Edition. 2012.
Used with permission.
Stat 250 Formula Card: Summary of ANOVA
Try It! Comparing 3 Drugs
We wish to compare three drugs for treating some disease. A quantitative response (time to
cure in days) is measured such that a smaller value indicates a more favorable response.
Independent random samples: seems OK.
[Data display not recovered; surviving values: 10.9, 8.5, 7.8, 5.2]
N = 19, k = 3
n1 = 5, n2 = 7, n3 = 7
Note: the medians seem to differ, with Drug 3 giving the lower responses overall.
This is what we are testing. It tells us we have some evidence against H0.
Recall the assumptions for performing an F test. Think about how you would check them.
Each sample is a ... random sample
The k random samples are ... independent
For each population, the model for the response is ... a normal distribution
The k population variances are ... equal.
State the hypotheses to be tested: in context and clearly about population means
Note: We would use a computer or calculator to work out at least the
basic summaries in steps 1 and 2, and likely to create the entire
ANOVA table for us. Let's be sure we understand where the values
are coming from and how to interpret the final results.
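Step 1: Calculate the mean and variance for each sample:
$\bar{x}_1 = 8.22$, $s_1^2 = 2.74$
$\bar{x}_2 = 9.30$, $s_2^2 = 2.61$
$\bar{x}_3 = 6.80$ (descriptively best), $s_3^2 = 2.56$
Note: sample variances are similar
Step 2: Calculate the overall sample mean:
$\bar{x} = \dfrac{7.3 + 8.2 + \cdots + 8.5 + 5.2}{19} = 8.095$ OR $\bar{x} = \dfrac{5(8.22) + 7(9.30) + 7(6.80)}{19} = 8.095$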
Step 3: Calculate the sum of squares between groups:
$\text{SS Groups} = \sum_{\text{groups}} n_i (\bar{x}_i - \bar{x})^2$
$= 5(8.22 - 8.095)^2 + 7(9.30 - 8.095)^2 + 7(6.80 - 8.095)^2$
$= 21.98$
Step 4: Calculate the sum of squares within groups (due to error):
$\text{SSE} = (5 - 1)(2.74) + (7 - 1)(2.61) + (7 - 1)(2.56)$
$= 41.98$
Step 5: OPTIONAL: Calculate the total sum of squares: No Thank You!
Step 6: Fill in the ANOVA table:
Source           DF            Sum of Squares   Mean Square        F
Groups           3 - 1 = 2     21.98            21.98/2 = 10.99    10.99/2.62 = 4.2
Error (Within)   19 - 3 = 16   41.98            41.98/16 = 2.62
Total            19 - 1 = 18   63.96
Both mean squares estimate the common population variance,
but MSE is an unbiased estimator whether or not H0 is true.
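
The table above can be checked from the summary statistics alone. The sketch below assumes Python's standard library; the function name `anova_from_summary` and the dictionary layout are hypothetical. It also reports the square root of MSE, which is the usual estimate of the common population standard deviation.

```python
from math import sqrt

# Summary statistics from the drug example: (sample size, mean, variance).
drugs = {
    "Drug 1": (5, 8.22, 2.74),
    "Drug 2": (7, 9.30, 2.61),
    "Drug 3": (7, 6.80, 2.56),
}


def anova_from_summary(summary):
    """Compute the ANOVA quantities from per-group n, mean, and variance."""
    k = len(summary)
    n_total = sum(n for n, _, _ in summary.values())
    grand_mean = sum(n * m for n, m, _ in summary.values()) / n_total
    ss_groups = sum(n * (m - grand_mean) ** 2 for n, m, _ in summary.values())
    ss_error = sum((n - 1) * v for n, _, v in summary.values())
    ms_groups = ss_groups / (k - 1)
    mse = ss_error / (n_total - k)
    return {
        "grand_mean": grand_mean,
        "SS Groups": ss_groups,
        "SSE": ss_error,
        "MS Groups": ms_groups,
        "MSE": mse,
        "F": ms_groups / mse,
        "s_p": sqrt(mse),  # pooled estimate of the common standard deviation
    }


for name, value in anova_from_summary(drugs).items():
    print(f"{name}: {value:.2f}")
```

Small differences in later decimal places can appear because the hand calculation rounds the overall mean and the mean squares before dividing.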
Here are the results from R:
One of the assumptions in ANOVA is that the population standard deviations are all equal.
Using the data, give an estimate of this common population standard deviation.
Give the observed test statistic value.
F = 4.2
What is the distribution of the test statistic if the three drugs are equally effective in terms of
the mean response?
An F distribution with 2 and 16 df
What is the corresponding p-value for assessing if the three drugs are equally effective in terms
of the mean response?
The p-value is 0.034
At the 5% level, what is your conclusion?
We reject H0 and conclude that the three drugs do not appear to be equally effective
in terms of the mean response.
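
The p-value is the area to the right of the observed F under the F(2, 16) curve. In general this needs software or a table, but when the numerator degrees of freedom equal 2 (that is, k = 3 groups) the upper-tail area has a simple closed form, $(1 + 2F/\text{df}_2)^{-\text{df}_2/2}$. The sketch below uses that shortcut; the function name is hypothetical, and the formula should not be used for other numerator degrees of freedom.

```python
def f_upper_tail_df1_2(f_value, df_error):
    """P(F > f_value) for an F(2, df_error) distribution.

    Closed form valid ONLY when the numerator degrees of freedom are 2.
    """
    return (1 + 2 * f_value / df_error) ** (-df_error / 2)


alpha = 0.05
p_value = f_upper_tail_df1_2(4.2, 16)  # observed F and error df from the drug example
print(f"p-value = {p_value:.3f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```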
We Rejected H0 in ANOVA: What is next? Multiple Comparisons
The term multiple comparisons is used when two or more comparisons are made to examine the
specific pattern of differences among means. The most commonly analyzed set of multiple
comparisons is the set of all pairwise comparisons among population means. In our previous
Drug example, the possible pairwise comparisons are: Drug 1 with Drug 2, Drug 1 with Drug 3,
and Drug 2 with Drug 3. To compare a pair of means we could:
Compute a confidence interval for the difference between the two population means
and see if 0 falls in the interval or not.
Perform a test of hypotheses to assess if the two population means differ significantly.
When many statistical tests are done there is an increased risk of making at least one type I error
(erroneously rejecting a null hypothesis). Consequently, several procedures have been developed
to control the overall family type I error rate or the overall family confidence level when
inferences for a set (family) of multiple comparisons are done.
Tukey's procedure is one such procedure for the family of pairwise comparisons. If the family
error rate is not a concern, Fisher's procedure is used.
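
To see why the family error rate grows, consider a rough calculation. The sketch below assumes the comparisons are independent, which pairwise comparisons sharing the same groups are not, so the result is only an approximation; the function names are hypothetical.

```python
from math import comb


def n_pairwise(k):
    """Number of pairwise comparisons among k group means."""
    return comb(k, 2)


def familywise_error_rate(alpha, m):
    """Chance of at least one type I error in m tests, each at level alpha.

    Assumes the m tests are independent (an approximation here).
    """
    return 1 - (1 - alpha) ** m


for k in (3, 4, 5):
    m = n_pairwise(k)
    print(f"k = {k}: {m} comparisons, "
          f"family error rate about {familywise_error_rate(0.05, m):.3f}")
```

With three groups there are three comparisons, and testing each at the 5% level gives a family error rate of roughly 14% under this approximation.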
Try It! Comparing 3 Drugs
In the comparison of the three drugs, we rejected the null hypothesis at the 5% significance level.
We follow with a multiple comparison procedure to determine which group means are
significantly different from each other.
R gives family-wise confidence interval comparisons using Tukey's method and a family
confidence level of 95%.
95% family-wise confidence level
Linear Hypotheses:
                Estimate   lwr       upr
II - I == 0      1.0800   -1.3670    3.5270
III - I == 0    -1.4200   -3.8670    1.0270
III - II == 0   -2.5000   -4.7338   -0.2662

I      II    III
"ab"   "b"   "a"
a. Use the above output to report about the three pairwise comparisons:
Does the confidence interval for comparing Drug I and II contain 0? ____Yes____
Does the confidence interval for comparing Drug I and III contain 0? ____Yes____
Does the confidence interval for comparing Drug II and III contain 0? ____No____
b. State your conclusions regarding the differences between the mean response for the three
drug groups based on the Tukey family-wise comparison method.
We can conclude that the population mean responses differ for
subjects taking Drug 2 and Drug 3
but do not differ for subjects taking Drugs 1 and 3 nor Drugs 1 and 2.
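
The rule used in parts a and b (a pair differs significantly when its interval excludes 0) can be written as a short check. The interval values below are copied from the R output; the variable and function names are hypothetical.

```python
# Tukey family-wise intervals from the output: pair -> (estimate, lwr, upr).
tukey = {
    "II - I": (1.0800, -1.3670, 3.5270),
    "III - I": (-1.4200, -3.8670, 1.0270),
    "III - II": (-2.5000, -4.7338, -0.2662),
}


def excludes_zero(lwr, upr):
    """True when 0 is not inside the interval (lwr, upr)."""
    return lwr > 0 or upr < 0


significant = [pair for pair, (_, lwr, upr) in tukey.items()
               if excludes_zero(lwr, upr)]

for pair, (est, lwr, upr) in tukey.items():
    verdict = "differ" if excludes_zero(lwr, upr) else "no significant difference"
    print(f"{pair}: ({lwr}, {upr}) -> {verdict}")
```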
Individual Confidence Intervals for the Population Means
Sometimes it is helpful to examine a confidence interval for the mean for each population. Since
in ANOVA we assume the population standard deviations are all equal, the estimate of that
common population standard deviation, $s_p = \sqrt{\text{MSE}}$, is used in forming the individual
confidence intervals. The degrees of freedom used to find the t* multiplier will be those
associated with the estimated standard deviation, namely $N - k$. The formula for the individual
confidence intervals is:
$\bar{x}_i \pm t^* \dfrac{s_p}{\sqrt{n_i}}$
Try It! Comparing 3 Drugs
We were comparing k = 3 groups based on a total of N = 19 observations. The pooled standard
deviation for the comparison of the three drugs data set is sp = 1.62. The sample means and
sample sizes were:
          Sample mean   Sample size
Drug 1:   8.22          5
Drug 2:   9.30          7
Drug 3:   6.80          7
The degrees of freedom for the t* multiplier is N - k = ___19 - 3 = 16___.
From the table of t* multipliers (Table A.2) with confidence level = 0.95
and the above degrees of freedom we have t* = ___2.12___
Drug 3 was descriptively the best. Compute a 95% confidence interval for the population mean
time to cure for all subjects taking Drug 3.
$6.80 \pm 2.12 \left( 1.62 / \sqrt{7} \right)$ → 6.80 ± 1.30 → (5.50, 8.10)
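
The same interval can be computed for every drug. The sketch below assumes the t* multiplier of 2.12 read from the table and the pooled standard deviation of 1.62 quoted above; the function name `mean_ci` is hypothetical.

```python
from math import sqrt


def mean_ci(xbar, n, s_p, t_star):
    """Individual confidence interval: xbar +/- t* * s_p / sqrt(n)."""
    margin = t_star * s_p / sqrt(n)
    return xbar - margin, xbar + margin


s_p = 1.62      # pooled standard deviation from the drug example
t_star = 2.12   # 95% multiplier with N - k = 16 degrees of freedom (from the table)

# (sample mean, sample size) for each drug, from the text.
drug_summary = {"Drug 1": (8.22, 5), "Drug 2": (9.30, 7), "Drug 3": (6.80, 7)}

for drug, (xbar, n) in drug_summary.items():
    low, high = mean_ci(xbar, n, s_p, t_star)
    print(f"{drug}: ({low:.2f}, {high:.2f})")
```

Drug 1 gets a wider interval than the other two because its sample size is smaller, even though all three use the same pooled standard deviation.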
Try It! Memory Experiment
In a memory experiment, three groups of subjects were given a list of words to try to remember.
The length of the list for the first group was 10 words (short list), whereas for the second group
it was 20 words (medium list) and for the third group 40 words (long list). The percentage of
words recalled for each subject was recorded. The sample mean percentage of words recalled
was 68.3% for the short list, 48% for the medium list, and 39.2% for the long list. A one-way
ANOVA was used to assess whether the length of the list had a significant effect on the
percentage of words recalled.
              df   SS       Mean Square   F       Sig.
List Length   2    2668.8   1334.4        15.77   .0003
Residuals     14   1184.1   84.6
Total         16   3852.9
a. Some values in the ANOVA table are missing. Complete the above table.
b. State the null and alternative hypotheses that the above F statistic is testing.
c. Suppose the necessary assumptions hold. Using a 5% significance level, does it appear that
the average percentage of words recalled is the same for the three different lengths of lists?
Explain.
No, since the p-value is only 0.0003 (much less than 0.05) we would reject H0.
d. Family-wise confidence interval comparisons were performed using Tukey's method.
                 Estimate   lwr      upr
medium - short   -20.33     -39.61   -1.06
long - short     -29.17     -47.54   -10.79
long - medium    -8.83      -28.11   10.44

medium   long
"a"      "a"
Use the results and circle the pairs that are significantly different at a 5% level.
short versus medium
short versus long
medium versus long
e. Give a 99% confidence interval for the population mean percentage of words recalled for the
long list group. Recall that the sample mean based on the 6 subjects in the long list group
was 39.2 percent.
$39.2 \pm (2.98)(9.198/\sqrt{6})$ → 39.2 ± 11.19 → (28.01, 50.39)
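
Part a can be worked by arithmetic on the table entries: degrees of freedom and sums of squares add up to the Total row, and each mean square is a sum of squares divided by its degrees of freedom. The sketch below is one way to check the completed table; the variable names are hypothetical, and it also recovers the pooled standard deviation used in part e.

```python
from math import sqrt

# Entries taken from the memory experiment ANOVA table.
ss_groups, ss_error = 2668.8, 1184.1
df_error, df_total = 14, 16

df_groups = df_total - df_error      # degrees of freedom add up to the total
ss_total = ss_groups + ss_error      # sums of squares add up to the total
ms_groups = ss_groups / df_groups
mse = ss_error / df_error
f_stat = ms_groups / mse
s_p = sqrt(mse)                      # pooled standard deviation for part e

print(f"df Groups = {df_groups}, SS Total = {ss_total:.1f}")
print(f"MS Groups = {ms_groups:.1f}, MSE = {mse:.1f}")
# The table's 15.77 divides by the rounded MSE of 84.6, so the last digit may differ.
print(f"F = {f_stat:.2f}, s_p = {s_p:.3f}")
```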
What if some conditions do not hold?
You probably won't be surprised to learn that the necessary conditions for using an analysis of
variance F test don't hold for all data sets. There are methods that can be used when one or both
of the assumptions about equal population standard deviations and normal distributions are
violated.
When the observed data are skewed, or when extreme outliers are present, it usually is better
to analyze the median rather than the mean. One test for comparing medians is the Kruskal-Wallis
Test. It is based on a comparison of the relative rankings (sizes) of the data in the observed
samples, and for this reason is called a rank test. The term nonparametric test also is used to
describe this test because there are no assumptions made about a specific distribution for the
population of measurements. Another nonparametric test used to compare population medians
is Mood's Median Test.
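
To make the idea of a rank test concrete, the sketch below computes the usual Kruskal-Wallis H statistic, $H = \frac{12}{N(N+1)} \sum R_i^2 / n_i - 3(N+1)$, where $R_i$ is the sum of the ranks in group i. This formula is not given in these notes; it is the standard textbook form without the correction for ties, and the function names and toy data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 through j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction applied)."""
    values = [x for g in groups for x in g]
    ranks = average_ranks(values)
    n_total = len(values)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical toy data: completely separated groups vs. interleaved groups.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
print(kruskal_wallis_h([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))
```

As with F, a larger H means the groups' rankings are more separated, which is more evidence against the null hypothesis.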
Two-Way ANOVA
So far we have focused on the one-way ANOVA procedure. The "one-way" referred to having
only one explanatory variable (or factor) and one quantitative response variable.
Two-way ANOVA examines the effect of two explanatory variables (or factors) on the mean
response. The researcher is interested in the individual effect of each explanatory variable on the
mean response and also in the combined effect of the two explanatory variables on the mean
response. The individual effect of each factor on the response is called a main effect. If one of
the factors does not have an effect on the response, we say there is no main effect due to that
factor.
Besides assessing the main effects of each factor on the response, an interesting feature in two-way
analyses is the possibility of interaction between the two factors. We say there is interaction
between two factors if the effect of one factor on the mean response depends on the specific
level of the other factor. The interpretation of the factor main effects can be more difficult when
interaction is present.
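
For two factors with two levels each, interaction can be seen by comparing the effect of one factor at each level of the other. The sketch below uses made-up cell means purely for illustration; the function name and all numbers are hypothetical, and it describes sample cell means only, not a significance test.

```python
def interaction_contrast(cell_means):
    """Difference between the effect of factor B at each level of factor A.

    cell_means is [[A1B1, A1B2], [A2B1, A2B2]]. A result of 0 means the
    effect of B is the same at both levels of A (no interaction).
    """
    (a1b1, a1b2), (a2b1, a2b2) = cell_means
    effect_of_b_at_a1 = a1b2 - a1b1
    effect_of_b_at_a2 = a2b2 - a2b1
    return effect_of_b_at_a1 - effect_of_b_at_a2


# Hypothetical cell means.
no_interaction = [[10, 14], [15, 19]]    # B adds 4 at both levels of A
with_interaction = [[10, 14], [15, 25]]  # B adds 4 at A1 but 10 at A2

print(interaction_contrast(no_interaction))
print(interaction_contrast(with_interaction))
```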
Additional Notes
A place to jot down questions you may have and ask during office hours, take a few extra notes, write
out an extra problem or summary completed in lecture, or create your own summary about these concepts.