Chapter 12 Analysis of Variance

In the previous chapter we covered techniques for determining whether a difference exists between the means of two independent populations. It is not uncommon to encounter situations in which we wish to test for a difference among several independent means rather than between only two. The extension of the two-sample t-test to three or more samples is known as the analysis of variance.

12.1 One-Way Analysis of Variance

12.1.1 The Problem

When discussing the paired t-test, we examined data from a study that investigates the effect of carbon monoxide exposure on patients with coronary artery disease. The subjects involved in the study were recruited from three different medical centers: the Johns Hopkins University School of Medicine, the Rancho Los Amigos Medical Center, and the St. Louis University School of Medicine. Before combining the subjects into one large group to conduct the analysis, we can first examine some baseline characteristics to ensure that the patients from the various centers are in fact comparable.

One characteristic that we might wish to consider is pulmonary function before the start of the study; if the patients from one medical center begin with measures of forced expiratory volume in 1 second (FEV$_1$) that are much larger, or much smaller, than those from the other centers, then the results of the analysis may be affected. Therefore, given that the populations of patients in the three centers have mean baseline FEV$_1$ measurements $\mu_1$, $\mu_2$, and $\mu_3$ respectively, we would like to test the null hypothesis that the population means are identical. This may be expressed as

$$H_0: \mu_1 = \mu_2 = \mu_3.$$

The alternative hypothesis is that at least one of the population means differs from the others.

In general, we are interested in comparing the means of $k$ different populations. Suppose that the $k$ populations are independent and normally distributed.
We begin by drawing a random sample of size $n_1$ from the normal population with mean $\mu_1$ and standard deviation $\sigma_1$. The mean of this sample is denoted by $\bar{x}_1$ and its standard deviation by $s_1$. Similarly, we select a random sample of size $n_2$ from the normal population with mean $\mu_2$ and standard deviation $\sigma_2$, and so on for the remaining populations. The numbers of observations in each sample need not be the same.

| | | Group 1 | Group 2 | $\cdots$ | Group $k$ |
|---|---|---|---|---|---|
| Population | Mean | $\mu_1$ | $\mu_2$ | $\cdots$ | $\mu_k$ |
| | Standard deviation | $\sigma_1$ | $\sigma_2$ | $\cdots$ | $\sigma_k$ |
| Sample | Mean | $\bar{x}_1$ | $\bar{x}_2$ | $\cdots$ | $\bar{x}_k$ |
| | Standard deviation | $s_1$ | $s_2$ | $\cdots$ | $s_k$ |
| | Sample size | $n_1$ | $n_2$ | $\cdots$ | $n_k$ |

For the study investigating the effects of carbon monoxide exposure on individuals with coronary artery disease, the FEV$_1$ distributions of patients associated with each of the three medical centers comprise distinct populations. From the population of FEV$_1$ measurements for the patients at Johns Hopkins University, we select a sample of size $n_1 = 21$. From the population at Rancho Los Amigos we draw a sample of size $n_2 = 16$, and from the one at St. Louis University we select a sample of size $n_3 = 23$. The data, along with their sample means and standard deviations, are provided in Table 12.1 [1].

Presented with these data, we might attempt to compare the three population means by evaluating all possible pairs of sample means using the two-sample t-test. For a total of three groups, the number of tests required is $\binom{3}{2} = 3$. We would compare group 1 to group 2, group 1 to group 3, and group 2 to group 3. We assume that the variances of the underlying populations are all equal, or

$$\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma^2.$$

The pooled estimate of the common variance, which we denote $s_W^2$, combines information from all three samples; in particular,

$$s_W^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2}{n_1 + n_2 + n_3 - 3}.$$

This quantity is simply an extension of $s_p^2$, the pooled estimate of the variance used in the two-sample t-test.
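As a rough numerical illustration of the pooled estimate, the sketch below plugs in the sample sizes and sample standard deviations reported with Table 12.1. The function name `pooled_variance` and the variable names are hypothetical, chosen only for this example; the calculation assumes, as the text does, that the three population variances are equal.

```python
from math import comb

# Summary statistics as reported with Table 12.1 (standard deviations in liters).
# Order: Johns Hopkins, Rancho Los Amigos, St. Louis.
sample_sizes = [21, 16, 23]
sample_sds = [0.496, 0.523, 0.498]


def pooled_variance(sizes, sds):
    """Pooled variance: sum((n_i - 1) * s_i^2) / (sum(n_i) - k)."""
    k = len(sizes)
    numerator = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
    return numerator / (sum(sizes) - k)


# Number of pairwise two-sample tests among the three groups.
n_pairs = comb(len(sample_sizes), 2)

# Pooled estimate of the common variance, in liters squared.
s_w_sq = pooled_variance(sample_sizes, sample_sds)

print(n_pairs)           # 3
print(round(s_w_sq, 3))  # 0.254
```

Each group's variance is weighted by its degrees of freedom, $n_i - 1$, so larger samples contribute more to the estimate; with only two groups the same function reduces to the pooled variance of the two-sample t-test.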
Table 12.1 Forced expiratory volume in 1 second for patients with coronary artery disease sampled at three different medical centers

| | Johns Hopkins | Rancho Los Amigos | St. Louis |
|---|---|---|---|
| Observations (liters) | [not legible] | [not legible] | [not legible] |
| Sample mean | $\bar{x}_1 = 2.63$ liters | $\bar{x}_2 = 3.03$ liters | $\bar{x}_3 = 2.88$ liters |
| Sample standard deviation | $s_1 = 0.496$ liters | $s_2 = 0.523$ liters | $s_3 = 0.498$ liters |

Performing all possible pairwise tests is not a problem if the number of populations is relatively small. In the instance where $k = 3$, there are only three such tests. If $k = 10$, however, the process becomes much more complicated. In this case, we would have to perform $\binom{10}{2} = 45$ different pairwise tests.

More important, another problem that arises when all possible two-sample tests are conducted is that this procedure is likely to lead to an incorrect conclusion. Suppose that the three population means are in fact equal and that we conduct all three pairwise tests. Assume that the tests are independent and set the significance level at 0.05 for each one. By the multiplicative rule, the probability of failing to reject a null hypothesis of no difference in all three instances would be

$$P(\text{fail to reject in all three tests}) = (1 - 0.05)^3 = (0.95)^3 = 0.857.$$
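The multiplicative-rule calculation can be sketched for any number of groups. The snippet below is a minimal illustration that keeps the simplifying assumption made in the text, namely that the pairwise tests are independent; the function name `pairwise_test_summary` is hypothetical. It also reports the complement, the probability of rejecting at least one true null hypothesis, which follows directly from the same rule.

```python
from math import comb


def pairwise_test_summary(k, alpha=0.05):
    """Return (number of pairwise tests, P(no rejection), P(at least one rejection)).

    Assumes all k population means are equal and that the pairwise
    tests are independent, each run at significance level alpha.
    """
    n_tests = comb(k, 2)
    p_no_rejection = (1 - alpha) ** n_tests
    return n_tests, p_no_rejection, 1 - p_no_rejection


# Three groups, as in the FEV1 example.
tests_3, p_none_3, p_any_3 = pairwise_test_summary(3)
print(tests_3, round(p_none_3, 3), round(p_any_3, 3))  # 3 0.857 0.143

# Ten groups: 45 tests, and a false rejection becomes very likely.
tests_10, p_none_10, p_any_10 = pairwise_test_summary(10)
print(tests_10, round(p_none_10, 3), round(p_any_10, 3))
```

Under these assumptions the chance of at least one incorrect rejection grows quickly with the number of groups, which is the motivation for a single overall test of the equality of all the means.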