THE ANALYSIS OF CATEGORICAL DATA

GRIFFIN'S STATISTICAL MONOGRAPHS & COURSES

PUBLISHERS' NOTE

As its title implies, the Series in which this volume appears has two purposes. One is to encourage the publication of monographs on advanced or specialised topics in, or related to, the theory and applications of probability and statistics. Such works may sometimes be more suited to the present form of publication because the topic [...] probability and statistics, which are sometimes based on unpublished lectures.

The Series was edited from its inception in 1957 by Professor Maurice G. Kendall, under whose editorship the first 21 volumes in the Series appeared. He was succeeded as editor in 1965 by Professor Alan Stuart.

The publishers will be interested in approaches from any authors who have work of importance suitable for the Series.

CHARLES GRIFFIN & CO. LTD.
THE ANALYSIS OF CATEGORICAL DATA

R. L. PLACKETT, M.A., Sc.D.
Professor of Statistics, University of Newcastle upon Tyne

Monograph No. 35
Series Editor: ALAN STUART

CHARLES GRIFFIN & COMPANY LIMITED
42 DRURY LANE, LONDON WC2B 5RX

Copyright © R. L. PLACKETT, 1974
All rights reserved.
First published 1974

CONTENTS

Preface
Acknowledgements

1 THE POISSON DISTRIBUTION
  1.1 Introduction
  1.2 Properties of Poisson variates
  1.3 Inferences about [...]

2 SINGLE CLASSIFICATIONS
  2.1 Response
  2.2 Homogeneity
  2.3 Large-sample tests
  2.4 Cycles
  2.5 Goodness of fit

3 TWO-WAY CLASSIFICATIONS
  3.1 Contingency tables
  3.2 Variation of response
  3.3 Association between responses

4 2 × 2 TABLES
  4.1 Hypergeometric distribution
  4.2 Extended hypergeometric distribution
  4.3 Sample cross-product ratio
  4.4 Estimation of [...]
  4.5 [...]

5 r × s TABLES
  5.1 Distribution theory
  5.2 Estimation of λ
  5.3 Hypotheses about λ
  5.4 Partition of chi-square
  5.5 Symmetric tables
  5.6 Incomplete tables

6 MODELS AND METHODS
  Statistical models
  Logit regression
  Linear models
  [...]

7 THREE-WAY CLASSIFICATIONS
  7.1 Three-dimensional contingency tables
  7.2 Association of three responses
  7.3 Distribution theory
  7.4 Experiments involving factors
  7.5 Estimation and test procedures
  7.6 Numerical examples

8 MATCHING
  8.1 Designs and hypotheses
  8.2 Matched [...]
  8.3 Efficiency of matched pairs for binary data

9 MULTIVARIATE DATA
  9.1 Multidimensional contingency tables
  9.2 Principles of analysis
  9.3 Multivariate binary data
  9.4 Allocation rules
  9.5 Numerical example

Exercises
References
Author Index
Subject Index
Experiment Index

PREFACE

Categorical data arise whenever counts are made instead of measurements. They form an important part of statistical information, especially in medicine and the social sciences.
Their analysis has been developed over a long period, the beginning of which is clearly marked by Karl Pearson's first paper on $\chi^2$ in 1900. Many significant contributions were made by R. A. Fisher. Notwithstanding this early advance, the methodology for counts has been given less attention and has grown more slowly than the methodology for measurements. However, progress has continued steadily, and the subject now has a definite structure which can form the basis of future developments. An earlier naive phase, preoccupied at first with the invention of coefficients and later with the formulation of hypotheses, has been replaced by a positive phase in which models are explored. Present conditions therefore justify another addition to the modest number of books which are solely concerned with the analysis of categorical data.

The objects of this book are to describe the principal concepts, explain how they are combined to form statistical methods, illustrate how the methods are applied in practice, indicate the main branches of the subject, and suggest interesting questions which remain to be studied. Readers should have a thorough grasp of the basic principles of statistical inference. The book is intended for students taking advanced courses in statistics, teachers at institutions of higher education, and research workers in methodology or fields of application.

Statistics is concerned with the collection, analysis, and interpretation of data. Consequently the material is presented in the form of a balance between the problems posed by practical examples and the theoretical structure in which they are embedded. Lengthy sections of mathematical argument are avoided by giving proofs in outline or omitting them altogether. A reader who wishes to have further details will usually find that references sufficient for this purpose are placed at suitable points throughout the text and exercises. Such signposts can be ignored by those who are simply interested in the basic technique.
The main text deals with broad lines of development, and is extended by the exercises, where many subsidiary results are presented as problems. Much the same treatment is accorded to the applications, where the text illustrates interpretation, and the exercises [...]

ACKNOWLEDGEMENTS

I am grateful to Frank Anscombe, Vic Barnett, Dennis H[...], David Newell and Alan Stuart for their comments and contributions; to Sheila Boyd for the construction of computer programs, much calculation, and drawing the diagrams; to Joyce [...] for preparing the typescript and dealing with correspondence; and to James R. Griffin for helpful advice. I acknowledge the generosity of authors who have allowed me to quote data from their books or papers, and whose names are given in the text. I also wish to thank: Mr [...] Greenwood; the Trustees of Biometrika, the Biometric Society, the Royal Statistical Society, the Massachusetts Medical Society, Messrs Methuen Ltd, and Stanford University Press; the editors of Journal of Applied Ecology, British Medical Journal, Chemical Engineering Progress, Japanese Journal of Geology and Geography, Journal of Horticultural Science, and Operational Research Quarterly; the publishers of Annals of Human Genetics, Journal of Experimental Education, and Journal of Hygiene; and the editors and publishers of British Journal of Preventive and Social Medicine. I am solely responsible for any defects that may be found in the book and would be glad to have notification of their occurrence.

1 THE POISSON DISTRIBUTION

1.1 Introduction

A variate $N$ has a Poisson distribution with parameter $\mu$ when

$P(N = n) = e^{-\mu}\mu^{n}/n! \quad (n = 0, 1, 2, \ldots;\ \mu > 0).$

We say that $N$ is $P(\mu)$. The distribution occupies a central place in the analysis of categorical data. There are several reasons, as follows. [...]
The distribution often provides a satisfactory description of experimental [...] with two positive parameters. Then $N$ has the negative binomial distribution [...].

According to one hypothesis of accident proneness, industrial workers can be divided into groups such that the number of accidents [...] is a Poisson variate within each group, and the Poisson parameter has a gamma distribution over the groups. Table 1.3 is taken from Greenwood and Yule (1920), and gives a frequency distribution of the number of accidents to 414 machinists in three months, together with the expected frequencies corresponding to a Poisson distribution and a negative binomial distribution. The parameters of the negative binomial distribution were estimated by equating the observed mean and variance to the theoretical values, and inspection of the table suggests that the negative binomial distribution fits the observed frequencies better than the Poisson distribution.

TABLE 1.3 Accidents to 414 machinists in three months

No. of accidents                0     1     2     3     4     5+    Total
Frequency observed              296   ...   ...   ...   ...   ...   414
Expected (Poisson)              256   122   30    ...   ...   ...
Expected (negative binomial)    299   ...   ...   ...   ...   ...

Suppose that $Y_1, Y_2, \ldots$ are identically distributed [...]. A generalized Poisson variate $Z$ is defined by

$Z = Y_1 + Y_2 + \cdots + Y_N,$

where $N$ is $P(\mu)$. The probability generating function of $Z$ is

$E(t^{Z}) = \exp[\mu\{g(t) - 1\}],$

where $g(t)$ is the probability generating function of $Y$.

Example 1.4 $Y$ has a Poisson distribution with parameter $\lambda$. Then $Z$ has a Neyman type A distribution.

Example 1.5 $Y$ has a logarithmic series distribution defined by $g(t) = \log(1 - \alpha t)/\log(1 - \alpha)$ $(0 < \alpha < 1)$. Hence [...]

[...] obtained by referring to its distribution conditional on $N_1 + N_2 = n$, supplemented by randomization in order to attain a specified significance level (Lehmann, 1959, p. 140 ff.). However, randomization is seldom if ever used in practice, mainly for two reasons. First, we are interested more in the level of significance than in whether or not a specified value is exceeded; and secondly, inferences should not depend on a device which bears no relation to the problem studied. The conditional distribution of $N_1$ is binomial with index $n$ and probability $\mu_1/(\mu_1 + \mu_2)$. When $\mu_1 = \mu_2$ the probability is $\frac{1}{2}$ and the variate $(2N_1 - n)/\sqrt{n}$ converges to $N(0, 1)$ as $n \to \infty$.

[...] have different distributions or [...]

Example 2.6 In Table 2.1, the frequency of acute lymphatic leukaemia is 245 for the first six months and 261 for the second six months. Assume that the frequencies are observations on $P(\mu_1)$ and $P(\mu_2)$ respectively and consider the hypothesis $\mu_1 = \mu_2$.
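The conditional comparison of two Poisson counts can be sketched in a few lines. This is a minimal illustration, not the author's own computation: the function names are hypothetical, and the half-year totals 245 and 261 are taken as an assumption (they sum to the total of 506 quoted for Table 2.1 in Example 2.7).

```python
import math

def poisson_pair_deviate(n1, n2):
    """Approximate normal deviate for testing mu1 = mu2.

    Conditional on n1 + n2 = n, N1 is binomial(n, 1/2) under the
    hypothesis, so (2*N1 - n) / sqrt(n) is approximately N(0, 1).
    """
    n = n1 + n2
    return (2 * n1 - n) / math.sqrt(n)

def two_sided_tail(z):
    """Two-sided tail area of the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Assumed half-year totals for illustration (first and second six months).
z = poisson_pair_deviate(245, 261)
p = two_sided_tail(z)
print(f"deviate = {z:.3f}, two-sided tail area = {p:.3f}")
```

The tail area is far from any conventional significance level, so a deviate of this size gives no grounds for distinguishing the two half-year rates.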
The observed value of $(2N_1 - n)/\sqrt{n}$ is $-0.71$, which gives no evidence for the assertion $\mu_1 \neq \mu_2$.

When $r > 2$, no test of homogeneity can be uniformly most powerful against a general alternative. The choice of test criterion will usually lie between the index of dispersion and the likelihood ratio statistic. Most of what follows is concerned with the first criterion, not because it is preferred but because it has been given more attention. Bayes tests of homogeneity are discussed by Good (1967).

The Poisson index of dispersion $X^2$ is defined by

$X^2 = \sum_i (N_i - \bar{N})^2/\bar{N},$

where $\bar{N} = \sum_i N_i/r$. It is a special case of Karl Pearson's statistic

$\sum (\text{Observed} - \text{Expected})^2/\text{Expected}$

for testing the agreement of observed and expected frequencies. On this [...] $X^2$ was proposed as a test of $H_0$ by Fisher, Thornton and Mackenzie (1922). The procedure of testing with $X^2$ is known also as the variance test, because $X^2/(r - 1)$ is the ratio of the sample variance to the sample mean. Thus $X^2/(r - 1)$ is approximately 1 for a sample from a Poisson distribution but will tend to exceed 1 if $N_1, N_2, \ldots, N_r$ have different Poisson distributions or the same mixed Poisson distribution. These intuitive ideas have been formalized by Potthoff and Whittinghill (1966), who prove the following result. Among all unbiased tests of $H_0$ against [...], the dispersion test is most powerful for alternatives sufficiently close to $H_0$, irrespective of the value of [...]. Selby (1965) shows that $X^2$ is a test statistic associated with restricted maximum likelihood estimation procedures. We shall prove in the following section that $X^2$ can be derived from the large-sample theory of testing a simple hypothesis against composite alternatives.

When $H_0$ is true, let $\mu$ be the common value of $\mu_1, \mu_2, \ldots, \mu_r$. Since $N = \sum_i N_i$ is sufficient for $\mu$, the distribution of $X^2$ conditional on $N = n$ is independent of $\mu$ and depends only on $n$ and $r$. This distribution converges to $\chi^2_{r-1}$ as $n \to \infty$ (Cramér, 1946, sect. 30.1). The adequacy of the asymptotic distribution when $n$ is finite has been examined in detail. Consider first the moments of $X^2$ conditional on $N = n$, which are evaluated by Haldane (1937), Hoel (1943), and Rao and Chakravarti (1956), using various methods. For example,

$E(X^2 \mid N = n) = r - 1$ and $\mathrm{var}(X^2 \mid N = n) = 2(r - 1)(1 - 1/n),$

compared with

$E(\chi^2_{r-1}) = r - 1$ and $\mathrm{var}(\chi^2_{r-1}) = 2(r - 1).$

The rapidity with which the third and fourth cumulants tend to those of $\chi^2_{r-1}$ depends mainly on $n$, and partly on $r$. For small values of $n$ and $r$ the exact distribution can be enumerated.
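Looking back at the definition of the index of dispersion, a short sketch may help to connect its three equivalent forms: the Pearson-type sum, the shortcut based on the sum of squares, and $(r-1)$ times the variance-to-mean ratio. The function names and the six counts below are hypothetical and serve only as an illustration.

```python
def dispersion_index(counts):
    """Poisson index of dispersion: sum((N_i - mean)^2) / mean."""
    r = len(counts)
    mean = sum(counts) / r
    return sum((c - mean) ** 2 for c in counts) / mean

def dispersion_index_from_sums(r, total, sum_sq):
    """Same statistic from r, the total n and the sum of squares:
    X^2 = r * sum(n_i^2) / n - n."""
    return r * sum_sq / total - total

def variance_to_mean_ratio(counts):
    """Sample variance (divisor r - 1) divided by the sample mean."""
    r = len(counts)
    mean = sum(counts) / r
    variance = sum((c - mean) ** 2 for c in counts) / (r - 1)
    return variance / mean

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
counts = [3, 7, 4, 6, 5, 5]
x2 = dispersion_index(counts)
x2_short = dispersion_index_from_sums(
    len(counts), sum(counts), sum(c * c for c in counts)
)
ratio = variance_to_mean_ratio(counts)
print(x2, x2_short, ratio * (len(counts) - 1))
```

All three printed values agree, which is the sense in which the dispersion test is also called the variance test.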
Tables are given by Rao and Chakravarti (1956), and Chakravarti and Rao (1959). However, the rate at which the number of partitions of $n$ increases with $n$ eventually sets the task of enumeration beyond the power of any computer. Recourse must then be made to Monte Carlo experiments. The results obtained by Cochran (1936), Lancaster and Brown (1965), Slakter (1966), and Good, Gover and Mitchell (1970) show that the agreement between the conditional distribution of $X^2$ and $\chi^2_{r-1}$ can be satisfactory down to quite low values of [...].

The observed value of $X^2$ is computed from the formulae

$X^2 = \sum_i (n_i - \bar{n})^2/\bar{n} \quad\text{or}\quad X^2 = r\sum_i n_i^2/n - n,$

the second following from the definition of $\bar{n}$, depending respectively on whether or not a desk-top calculator is used.

Example 2.7 For the data in Table 2.1, $\sum n_i = 506$ and $\sum n_i^2 = 22236$. The observed value of $X^2$ is 21.3. Since $\chi^2_{11,\,0.05} = 19.7$, there is significant variation in the number of cases from month to month.

Example 2.8 For the data in Table 1.[...], $r = 550$. When $r$ is large, the significance of $X^2$ is tested by referring the deviate $\sqrt{2X^2} - \sqrt{2r - 3}$ to a table of $N(0, 1)$. The observed value of $X^2$ is 589, which gives a deviate of 1.217 and implies that a Poisson distribution fits the data satisfactorily.

We now discuss the power of the dispersion test. First suppose that $X^2$ is used to test that the distribution of $\{N_i\}$ conditional on $N = n$ is a symmetric multinomial, against the alternative hypothesis specified by [...], with $\{c_i\}$ fixed and such that $\sum c_i = 0$. [...] Since [...] $\to 1$ as $n \to \infty$, we find that $X^2$ converges to noncentral $\chi^2_{r-1}$ with noncentrality parameter [...]. This argument is taken from Cochran (1952) and further details are given by Patnaik (1949). An examination of the adequacy of the result when $n$ is finite has been made by numerical comparisons of the exact power of the dispersion test with the nominal power obtained from noncentral $\chi^2$. Examples given by Bennett (1959), Haynam and Leone (1965), and Slakter (1968) suggest that the exact power tends to be less than the nominal power in small samples. This is explained at least in part by the fact that the exact test is necessarily conservative, in the sense that the actual significance level is less than the nominal value.

Second, suppose that $X^2$ is used to test for homogeneity when the alternative hypothesis is that $N_1, N_2, \ldots,$
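$N_r$ have the same non-Poisson distribution with mean $\mu$, variance $\sigma^2$, and cumulants $\kappa_j$ which are $O(\mu)$ for $j > 2$. Darwin (1957) proves that [...] converges in distribution to $\chi^2_{r-1}$ as $\mu \to \infty$. The special case of this result for the Neyman type A distribution was established by Bateman (1950).

We conclude this section by describing the test of $H_0$ based on the likelihood ratio, which is defined by

(Maximum of likelihood under $H_0$) / (Maximum of likelihood without restriction).

The numerator is $e^{-n}\bar{n}^{\,n}/\prod_i n_i!$ and the denominator is $e^{-n}\prod_i n_i^{n_i}/\prod_i n_i!$. Consequently the ratio is $\prod_i (\bar{n}/n_i)^{n_i}$. In practice, we use an equivalent test criterion which consists of minus twice the logarithm of the likelihood ratio, and is denoted by $Y^2$. Hence

$Y^2 = 2\sum_i N_i \log(N_i/\bar{N}).$

According to the general theory of likelihood-ratio tests (Kendall and Stuart, vol. 2, ch. 24), $Y^2$ converges in distribution to $\chi^2_{r-1}$ when $H_0$ is true. The asymptotic distribution of $Y^2$ is thus the same as for $X^2$. In fact, $X^2$ and $Y^2$ are asymptotically equivalent, a result which can be verified by expanding $N_i \log(N_i/\bar{N})$ in powers of $(N_i - \bar{N})$. Good, Gover and Mitchell (1970) show that the distribution of $Y^2$ is by no means as well approximated by $\chi^2_{r-1}$ as that of $X^2$ when $n$ is small.

[...] $S_1 =$ [...], $S_2 =$ [...], and $S_3 =$ [...]. No single one of these tests is uniformly best (Peers, 1971). However, $S_3$ has the practical advantage that the calculation of [...] is not required.

We now show how these ideas are relevant to testing the hypothesis of homogeneity, and that $S_3$ then becomes the Poisson index of dispersion. Define

$\beta = \log \mu_r$ and $\theta_a = \log(\mu_a/\mu_r) \quad (a = 1, 2, \ldots, r - 1).$

Then $H_0$ is equivalent to the simple hypothesis $H$: $\theta_a = 0$ for all $a$. The distribution of $\{N_i\}$ is given by

$P(N_i = n_i \text{ for all } i) = \exp\{\beta n + \sum_a \theta_a n_a - e^{\beta}(1 + \sum_a e^{\theta_a})\}/\prod_i n_i!.$

Thus $N$ and $\{N_a\}$ are minimal sufficient for $\beta$ and $\{\theta_a\}$, and we can eliminate $\beta$ by conditioning on $N = n$, as follows:

$P(N_a = n_a \text{ for all } a \mid N = n) = n!\,\exp(\sum_a \theta_a n_a)/\{(1 + \sum_a e^{\theta_a})^{n}\prod_i n_i!\}.$

This is another version of the multinomial distribution obtained in section 2.2. We proceed to evaluate the first and second derivatives of the log likelihood at $\theta = 0$. The elements of the vector of first derivatives are typified by $N_a - n/r$, and the typical element of the matrix of expected second derivatives, taken with negative sign, is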
$n(\delta_{ab}/r - 1/r^2)$, where $\delta_{ab}$ is 1 when $a = b$ and 0 otherwise. The expression [...] does not involve [...]. It is a consequence of the theory of generalized inverses (Pringle and Rayner, 1971) that [...]. This result can be proved by the method outlined in exercise 7, page 116. We therefore have the alternative expression $S_3 =$ [...], which involves all $r$ frequencies $\{N_i\}$. For a symmetric multinomial distribution, [...]. The condition [...] is satisfied by [...], and then $S_3 = X^2$. This completes the proof that $S_3$ is the Poisson index of dispersion. The amount of detail here may appear out of proportion to the results derived; in fact, much of the argument is used again in section 5.3.

2.4 Cycles

Suppose that we have a set of frequencies $N_1, N_2, \ldots, N_r$ which relate to successive equal intervals of time, e.g. hours in a day, or months in a year. The context then implies a particular form of departure from homogeneity, such as a regular oscillation or an isolated peak. This information can be used to construct tests of $H_0$ which have greater power than the standard tests against unrestricted alternatives. For example, the data in Table 2.1 indicate a summer peak, and the question arises how to incorporate this alternative in the procedure for testing homogeneity.

A model proposed by Fix, Hodges and Lehmann (1959) is [...], and consequently [...], where [...]. The test statistic is

$T_1 = 2\{[\sum_i N_i \sin((2i - 1)\pi/r)]^2 + [\sum_i N_i \cos((2i - 1)\pi/r)]^2\}/N,$

where $N$ denotes $\sum_i N_i$. When the model holds, and [...], $T_1$ converges in distribution to noncentral $\chi^2_2$ with noncentrality parameter [...]. Other harmonic terms could be included if appropriate.

Suppose that $r = 2s$, and write [...]. The criterion $T_2 =$ [...] was suggested by David and Newell (1965), and they derived an approximation for the percentage points. When this test is applied to the data of Table 2.1, the observed value of $T_2$ does not quite reach the 5 per cent significance level.
The power of this test is unknown.

When $r$ is infinite, the problem becomes that of testing that a set of points on the perimeter of a circle is uniformly distributed. Many criteria have been proposed. The results obtained can be approached through Ajne (1968), who discusses the criterion corresponding to $T_2$. His work indicates that $T_2$ is optimal against alternatives where $\mu_i$ takes one value for [...] successive values of $i$ and another for the remainder. The relations between the problems for finite and infinite values of $r$ suggest, therefore, [...]. Further details are given by Stephens (196...) and Mardia (1972). The methods of time series analysis [...] could also be applied to a cycle of frequencies.

2.5 Goodness of fit

Consider a population classified into $r$ categories and let $\pi_i$ be the probability for category $i$ $(i = 0, 1, \ldots, r - 1)$. A sample of size $f$ is taken and the frequency of units placed in category $i$ is $F_i$. We assume that the selection of units corresponds to a sequence of mutually independent events. Under these conditions, the joint distribution of $F_0, F_1, \ldots, F_{r-1}$ is multinomial with index $f$ and probabilities $\pi_0, \pi_1, \ldots, \pi_{r-1}$ respectively. Suppose that these probabilities are specified functions of an unknown parameter $\theta$. For example, the distribution of the number of boys in the first 4 children of families of size 4 or more can be derived from the following two assumptions: first, the probability of a boy is $\theta$ for all births; and secondly, successive births form a sequence of mutually independent events. The probability that a family contains $i$ boys is then

$\pi_i(\theta) = \binom{4}{i}\theta^{i}(1 - \theta)^{4-i} \quad (i = 0, 1, \ldots, 4).$

We require to test whether the agreement of the observed [...] makes the statistical model satisfactory. This is the problem of goodness of fit.

The first stage of the analysis consists of the estimation of $\theta$. Given observed frequencies $\{f_i\}$, the likelihood function is $f!\prod_i \{\pi_i(\theta)\}^{f_i}/f_i!$, so that

log likelihood = const. + $\sum_i f_i \log \pi_i(\theta)$.

We say that the second term on the right is the log likelihood, i.e. the part which depends on $\theta$. Denote by $\hat\theta$ the maximum likelihood estimator of $\theta$. Then $\hat\theta$ satisfies the equation

$\sum_i F_i\, \partial \log \pi_i(\theta)/\partial\theta = 0.$

The maximum likelihood estimator of $\pi_i(\theta)$ is $\pi_i(\hat\theta)$.

In the second stage of the analysis we test the hypothesis $H$ against the alternative hypothesis $K$, where $H$ and $K$ are defined as follows: under the hypothesis $H$, the frequencies $F_0, F_1, \ldots,$
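$F_{r-1}$ have a multinomial distribution with index $f$ and probabilities $\pi_0(\theta), \pi_1(\theta), \ldots, \pi_{r-1}(\theta)$ respectively; under the hypothesis $K$, the frequencies have a multinomial distribution with index $f$ but without any restriction on the probabilities. As before, there are two test criteria in common use. The first is Pearson's statistic

$X^2 = \sum_i \{F_i - f\pi_i(\hat\theta)\}^2/\{f\pi_i(\hat\theta)\}.$

Alternatively, we can use minus twice the logarithm of the likelihood ratio, given by

$Y^2 = 2\sum_i F_i \log[F_i/\{f\pi_i(\hat\theta)\}].$

In the limit as $f \to \infty$, both $X^2$ and $Y^2$ converge in distribution to $\chi^2_{r-2}$ when $H$ is true. A test of $H$ is therefore made by referring the observed value of either $X^2$ or $Y^2$ to a table of $\chi^2_{r-2}$ and rejecting $H$ if the value is significantly large.

When $\pi_i(\theta)$ is a function of $k$ unknown parameters $(k > 1)$, the foregoing procedure is followed with appropriate changes, the chief of which is that the test statistics are now asymptotically $\chi^2_{r-k-1}$ when $H$ is true. A rigorous proof of this result for the statistic $X^2$ is given by Cramér (1946, sect. 30.3). His methods are extended by Mitra (1958) to show that $X^2$ is asymptotically noncentral $\chi^2$ for a sequence of alternative hypotheses which converges [...].

Example 2.11 Let $f_0, f_1, \ldots, f_4$ be the observed frequencies of 0, 1, ..., 4 boys respectively in the first 4 children of families with 4 or more children. Then

log likelihood = const. + $\log\theta \sum_i i f_i + \log(1 - \theta)\sum_i (4 - i) f_i$,

whence, on setting the derivative with respect to $\theta$ equal to zero, the maximum likelihood estimator of $\theta$ is

$\hat\theta = \sum_i iF_i/(4f).$

This is the ratio of the total number of boys to the total number of children. For the data in Table 2.2, we find

$\hat\theta = 0.515854$, $X^2 =$ [...], $Y^2 = 0.97$.

The test criteria are distributed asymptotically as $\chi^2_3$ if the statistical model is correct. We conclude that there is good agreement between the observed frequency distribution and a binomial distribution with index 4 and probability 0.515854.

Example 2.12 Let $f_0, f_1, f_2, \ldots$ be the observed frequencies of 0, 1, 2, ... respectively in a sample of size $f$ from $P(\mu)$. The maximum likelihood estimate of $\mu$ is the sample mean $m = \sum_i i f_i/f$. For the data in Table 1.[...], $m = 0.62$. The expected frequencies are small for 4, 5, and 6+ units; they could invalidate the $\chi^2$ approximations to $X^2$ and $Y^2$.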
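Looking back at Example 2.11, the two stages of the analysis (estimate $\theta$, then compare observed with expected frequencies) can be sketched as follows. The frequencies used here are hypothetical, since the entries of Table 2.2 are not reproduced above, and the function names are illustrative only.

```python
import math

def binomial_probs(theta, index=4):
    """pi_i(theta) for i = 0, ..., index under the binomial model."""
    return [
        math.comb(index, i) * theta**i * (1 - theta) ** (index - i)
        for i in range(index + 1)
    ]

def fit_binomial(freqs):
    """Return (theta_hat, expected, X2, Y2) for frequencies of 0..index."""
    index = len(freqs) - 1
    f = sum(freqs)
    # Stage 1: maximum likelihood estimate = total successes / total trials.
    theta_hat = sum(i * fi for i, fi in enumerate(freqs)) / (index * f)
    expected = [f * p for p in binomial_probs(theta_hat, index)]
    # Stage 2: Pearson's statistic and the likelihood-ratio statistic.
    x2 = sum((o - e) ** 2 / e for o, e in zip(freqs, expected))
    y2 = 2 * sum(o * math.log(o / e) for o, e in zip(freqs, expected) if o > 0)
    return theta_hat, expected, x2, y2

# Hypothetical frequencies of 0, 1, 2, 3, 4 boys in 100 families.
freqs = [6, 26, 37, 25, 6]
theta_hat, expected, x2, y2 = fit_binomial(freqs)
print(round(theta_hat, 4), [round(e, 2) for e in expected])
print(round(x2, 3), round(y2, 3))
```

With five categories and one estimated parameter, both statistics would be referred to $\chi^2$ with $5 - 2 = 3$ degrees of freedom.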
We therefore classify the number of units into categories 0, 1, 2 and 3+. This alters the maximum likelihood estimate of $\mu$, which now satisfies the equation [...]. The solution is 0.6073, and the corresponding expected frequencies are given below, with the values of $X^2$ and $Y^2$.

Number of units       0        1        2        3+
Frequency observed    298      190      ...      ...
Frequency expected    299.63   181.98   55.26    13.41

Reference to a table of $\chi^2_2$ indicates that agreement is good.

When the categories of a population are determined by the frequencies of a specified event, as in these three examples, we say that the statistics $X^2$ and $Y^2$ are based on the frequencies of the frequencies. In such cases, we may be able to devise a test more powerful against some alternatives, in which the statistic is a function of the actual variates instead of the frequencies with which the different values occur. Thus the Poisson index of dispersion is likely to be more powerful against [...] alternatives than are statistics based on the frequencies with which 0, 1, 2, ... occur. There is a similar test for the binomial distribution. The details are given by Potthoff and Whittinghill (1966a) and Wisniewski (1968, 1972).

3 TWO-WAY CLASSIFICATIONS

3.1 Contingency tables

Populations which give rise to the same classification can themselves be related by qualitative or quantitative differences. For example, consider an experiment designed to study the robustness of a particular make of cup. The experiment consists in dropping cups from a specified height and observing the degree of fracture. A classification of the possible fractures defines the response. Comparable populations are those corresponding to different makes of cup at the same price, different qualities of cup manufactured by the same firm, and different heights for cups of the same make. They exemplify respectively the three methods of expressing differences which are listed in section 2.1. A classification of populations defines a factor, say $S$. By analogy with experiments in agriculture, the $j$th population in a classified group is known as the $j$th level of $S$.

Suppose that a classification with $r$ categories can be made in $s$ populations. A unit selected at random from the $j$th population is placed in category $i$ with probability $p_{ij}$ for $i = 1, 2, \ldots,$
$r$ and $j = 1, 2, \ldots, s$. The response at the $j$th level of $S$ is a variate $R$ defined by

$P(R = i \mid S = j) = p_{ij} \quad (i = 1, 2, \ldots, r;\ j = 1, 2, \ldots, s).$

We require to find what differences there are, if any, between the distributions of $R$ at different levels of $S$. A sample is taken from each population and the units are classified. Denote by $n_{ij}$ the frequency for the $i$th category in the $j$th sample. The frequencies can be presented as a rectangular array with $r$ rows and $s$ columns, in which $n_{ij}$ occupies the cell formed by the $i$th row and $j$th column. The array is known as a contingency table of order $r \times s$. For convenience, the table also includes the totals of each row and of each column, known as the marginal totals, together with the total of all frequencies.

Example 3.1 The data in Table 3.1 are quoted from Innes et al. (1969). They relate to an experiment designed to test the possible carcinogenic effect of a fungicide, Avadex. Male mice were fed with Avadex at 560 ppm for 45 weeks. Another group of male mice, known as the controls, was kept under standard conditions. The frequency of pulmonary tumours was observed. Here the response is determined by the presence or absence of tumours, and the populations consist of treated mice and controls.

TABLE 3.1 Numbers of mice bearing tumours in treated and control groups

Condition of mice    Treated   Control   Total
Tumours              4         5         9
No tumours           12        74        86
Total                16        79        95

Thus, both factor and response are defined by qualitative differences. The proportion of tumours is higher for the treated group and we need to test the significance of this observation.

Example 3.2 The data in Table 3.2 are quoted from Hewlett and Plackett (1950). They are concerned with the toxicity to the beetle Tribolium castaneum of films formed by the insecticide benzene hexachloride.

TABLE 3.2 Toxicity to Tribolium castaneum of films formed by 0.1% w/v benzene hexachloride
[Frequencies by period of survival (at most 6 days; between 7 and 9 days; more than 9 days; total) for six deposits of BHC (mg/10 cm²); the entries are not legible in the source.]

The populations are defined by the deposit of 0.1% w/v BHC measured in
Response Is else accor to the period of suv follow ing the application ofthe inset, The eatgoie ae: death within si ay, death between seven ad nine days, and sia forme than nine ‘ys. Ths both fictor and respons ae defined by quate diforenes. We wish to deteomine the reationship between se options in the varus ‘aepries and the dove of BHC ‘Suppve ext that the unl of population canbe vied ito categories by etch of two maths of eset, which conta raps eatepres ‘esgectvel. A unt selsted at random fom the population Is plan wth ‘rosin eatery # ofthe Mat eleiton an eatery ff the second. Define variates Rand by the probability dstbation RSI = Oy CAN PIA Nerd + | ! | | a ‘Thus the double clastiestin define vate teponse (R 5). We ow concerned with the problem of desciblag the tiation of ‘A sompl stake fom the population and the wie ae clad with respec to each response, Denote by nth Beane Tor the ith category ‘ofthe Hat clasifenon aod the th category ofthe second, The Gequencs fu) sprint presned as 9 contingecy table of onder 2, ‘Ssample 22 ‘The dita in Table 3.3 ae quoted by Amite (1958) from olmes and Wiis (1984) wih further Stal, Thy show the relaonship TARLE3 _Retuionsip between nat carr rate for Steprococes pyogenes and size of tonal among 1398 cite aged 15 yeas Pron, but 4 lage toasts | Tou not enlarged a Cares 0 » ou | nm Noteatier o7 so 269 | i6 Ta S16 3a Comieente Ose 0049200819 between nul eae ate for Syptacccu pyogenes and le of ons mong 1398 ehidren aged 0-18 yen, Cdn ae eased quaiately ‘nets and nowcariers, whereas there i natal nding forte ie ‘of fon, The cries rate increases wth tonal sie and gg pater Form for the joint ditlbuton of responses ‘Bxaple 24 ‘Th daa in Tabb 3.4 se quoted by Yates (1948). They wee binned in the oure ofa pl enguiy Int the conditions in which TABLE 34 Clarseation of 1019 ciiren according to conditions unter whic homework was cried cut and the fear ting ofthe quay of ‘hat homework. 
schoolchildren do their homework. Children are classified according to the conditions under which homework was carried out, and according to the teacher's rating of the quality of that homework. Although the effect of the homework conditions on the quality of the preparation appears to be small, the question arises whether the slight trend observed has any statistical significance.

TABLE 3.4 Classification of 1019 children according to conditions under which homework was carried out and the teacher's rating of the quality of that homework. (Each scale is graded, A being the highest rating.)

Teacher's        Homework conditions
rating           A      B      C      D      E      Total
A                141    67     114    79     39     440
B                131    66     143    72     35     447
C                36     14     38     28     16     132
Total            308    147    295    179    90     1019

Example 3.5 The data in Table 3.5 are quoted by Stuart (1953). They are based on case records of the eye-testing of women employees in Royal Ordnance factories during 1943-6. There are two responses, defined by vision in the right and left eyes. The categories are the same for each response, and ordered. A hypothesis of possible interest is that the margins of the table are homogeneous, in the sense that each response has the same probability distribution.

TABLE 3.5 Unaided distance vision of 7477 women aged 30-39

Grade of         Grade of left eye
right eye        Highest   Second   Third   Lowest   Total
Highest          1520      266      124     66       1976
Second           234       1512     432     78       2256
Third            117       362      1772    205      2456
Lowest           36        82       179     492      789
Total            1907      2222     2507    841      7477

Example 3.6 The data in Table 3.6 are quoted by Waite (1915). Fingerprints of the right hand are classified by the numbers of whorls and small loops. Each fingerprint is either a whorl, an arch, a composite, or one of four varieties of loop. Consequently the total of whorls and small loops on each hand is at most 5, and the table is triangular. The question arises of constructing an appropriate model for the corresponding probabilities.

TABLE 3.6 Fingerprints of the right hand classified by the numbers of whorls and small loops

Small            Whorls
loops            0      1      2      3      4      5      Total
0                78     144    204    211    179    45     861
1                106    153    126    80     32            497
2                130    92     55     15                   292
3                125    38     7                           170
4                104    26                                 130
5                50                                        50
Total            593    453    392    306    211    45     2000

Although the concepts of response and factor are of great appeal and are widely used, they are often difficult to distinguish in practice. There are few experiments where the sample sizes are previously determined. Thus the column totals in Table 3.1, although treated as fixed, have every appearance of being fortuitous.
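The marginal totals of a contingency table, and the proportions that depend on them, are easy to compute directly. The sketch below uses the counts of Table 3.1 as read above (tumours 4 and 5, no tumours 12 and 74); the helper name `with_margins` is hypothetical.

```python
def with_margins(table):
    """Return (row totals, column totals, grand total) of a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)

# Rows: tumours, no tumours. Columns: treated, control.
table_3_1 = [[4, 5], [12, 74]]
row_totals, col_totals, total = with_margins(table_3_1)

# Proportion of mice with tumours in each group (column).
tumour_rates = [table_3_1[0][j] / col_totals[j] for j in range(2)]
print(row_totals, col_totals, total)
print([round(rate, 4) for rate in tumour_rates])
```

The column totals are the group sizes, which the analysis treats as fixed even though, as the text notes, they may not have been determined in advance.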
The desptions of experinentlprocedre pen Ia tapers manly concerned with staal methodology may pert various allocations ofthe labels “tesponse” ot "lactr”, A posible tlution ofthe iit to Meniy factors and responses with emer and effet epee {ively ba an inspection of sts of tn suggest hat sich Hentfeaton ean sk te ambiguous, We tall conta forthe preset to syppote thal reponse fn Fates canbe cll seperated, benuse this serves to ntodce sad esr other leant concpts.In's sense, the dno is important “The base tsa! mel i one wher the Grequences are sled ndepem ‘ent Poison dsiibutions. Under this model, eotns on the fequencles ‘a equivalent to the definition of factor, We al find that the anys of at sae independent of such consraits Conzequentiy, fale to sings Flly between response and factors serves ly A modi the [wesetation of the vel 82 Variation of rexpome ‘A resgonse Is observed at cach love of factor J, where is defined bya elation with» eateries, and by a at of + populations. The Aaibtin of a the th eel of is expres by PREF = py 1,2, where HPD 59 Putra tect py = 1 forall We asume that py > 0 for If ad . Define = Nona) a0 Yas = 10 a Paltarsd ete © 1,25... 7= Land b 18> 1. Consider the hypo ese WP = P= Py PETA ‘This specie that the dstbuton of Rl he same teach eel of. Then 1 equiaent to the hypotse hw 0 forall «and 1% ‘The rel is pond by showing that euch hypothe imple the ot 4 tse, then is fy, Convey, spose that fy ols and con P= (i. Shoe Poke = PaPre (0 alle and b, we deduce that P es an 1. Thus vectors w adv exs seh that P= wv? weve, 17P = 17, where 1 Hen vector of unit, whee = 1/1 an B= uf" This Shows tha J ods, and empltes The prot tha Hy nd Hy ar eguialent When Hy hall, we can wile. Fala my ‘Te hypothe © fora the ena tha Pa = HP fora ‘ample 3.7 When r= += 2 he probabilis (py) ste sul thal Pu tpn =1 and patpn = 1 Westy that puppy the od ati beats of the alternative ox: presion {olmak Pad ‘his quantity I also known asthe rela ik a erm moe aceuaely snplied (opr. 
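There is just one parameter in the class $\{\lambda_{ab}\}$, which we denote by $\lambda$. Thus

$\lambda = \log\{p_{11}/(1 - p_{11})\} - \log\{p_{12}/(1 - p_{12})\},$

which shows that $p_{11} - p_{12}$ and $\lambda$ are negative, zero and positive together. Consequently, hypotheses about the sign of $p_{11} - p_{12}$ are equivalent to hypotheses about the sign of $\lambda$. More generally, inferences about $p_{11} - p_{12}$ are conveniently replaced by inferences about $\lambda$. This corresponds to measuring probabilities on a logit scale, where $\log\{p/(1 - p)\}$ is known as the logit of the probability $p$.

A sample is taken from each population, and $n_{ij}$ is the frequency observed in the $i$th category for the $j$th population. Let $n_{i\cdot} = \sum_j n_{ij}$, $n_{\cdot j} = \sum_i n_{ij}$ and $n = \sum_{ij} n_{ij}$, where each summation is over all the values of the suffix concerned. Let $N_{ij}$ be the variate corresponding to $n_{ij}$. The variates $N_{i\cdot}$ and $N_{\cdot j}$ are defined by analogy with $n_{i\cdot}$ and $n_{\cdot j}$ respectively. Since the sample size $N_{\cdot j}$ is fixed at $n_{\cdot j}$ for all $j$, the resulting contingency table takes the form shown in Table 3.7.

TABLE 3.7 Contingency table for one response and one factor

Response         Populations
categories       1        2        ...   s        Total
1                N_11     N_12     ...   N_1s     N_1.
2                N_21     N_22     ...   N_2s     N_2.
...
r                N_r1     N_r2     ...   N_rs     N_r.
Total            n_.1     n_.2     ...   n_.s     n

We assume that units are chosen independently and at random, whence two consequences follow.
(a) The vector $(N_{1j}, N_{2j}, \ldots, N_{rj})$ has a multinomial distribution with index $n_{\cdot j}$ and probabilities $(p_{1j}, p_{2j}, \ldots, p_{rj})$.
(b) Such vectors are mutually independent for $j = 1, 2, \ldots, s$.
The distribution of $\{N_{ij}\}$ is therefore given by

$P(N_{ij} = n_{ij} \text{ for all } i \text{ and } j) = \prod_j \{n_{\cdot j}!\prod_i p_{ij}^{n_{ij}}/n_{ij}!\},$

where the $\{n_{ij}\}$ range over the possible values consistent with the totals $\{n_{\cdot j}\}$. This distribution can be derived as follows from $rs$ independent variates $\{N_{ij}\}$ such that $N_{ij}$ has a Poisson distribution with parameter $\mu_{ij}$. Denote $\sum_i \mu_{ij}$ by $\mu_{\cdot j}$. When a classification with $r$ categories is made in $s$ populations, the conditions $\{N_{\cdot j} = n_{\cdot j}\}$ are imposed. The distribution of $\{N_{ij}\}$ conditional on $\{N_{\cdot j} = n_{\cdot j}\}$ consists of $s$ mutually independent multinomial distributions, the $j$th of which has index $n_{\cdot j}$ and probabilities $(p_{1j}, p_{2j}, \ldots, p_{rj})$, where $p_{ij} = \mu_{ij}/\mu_{\cdot j}$. In terms of the parameters [...] and $\{\lambda_{ab}\}$, the distribution of $\{N_{ij}\}$ is given by

$P(N_{ij} = n_{ij} \text{ for all } i \text{ and } j) =$ [...].

The equivalence of this expression to the one given previously is verified by checking the coefficients of $\log p_{11}, \log p_{12}, \ldots, \log p_{rs}$ in the exponent. [...]

The statistician is primarily interested in $\{\lambda_{ab}\}$ and therefore considers the distribution of $\{N_{ab}\}$ conditional on the row totals $\{N_{i\cdot} = n_{i\cdot}\}$.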
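The derivation of a multinomial distribution from independent Poisson variates, described above for a single population, can be checked numerically. The sketch below is an illustration under stated assumptions: the counts and Poisson parameters are hypothetical, and the function names are not from the text.

```python
import math

def poisson_pmf(k, mu):
    """P(N = k) for a Poisson variate with parameter mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def multinomial_pmf(counts, probs):
    """Multinomial probability with index sum(counts)."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)
    return coef * math.prod(p**c for p, c in zip(probs, counts))

def conditional_poisson_pmf(counts, mus):
    """P(N_i = n_i for all i | sum N_i = n) for independent Poisson N_i."""
    joint = math.prod(poisson_pmf(c, m) for c, m in zip(counts, mus))
    return joint / poisson_pmf(sum(counts), sum(mus))

# Hypothetical counts in r = 3 categories of one population.
counts = [2, 3, 1]
mus = [1.0, 2.5, 0.5]
probs = [m / sum(mus) for m in mus]  # p_i = mu_i / mu_.

lhs = conditional_poisson_pmf(counts, mus)
rhs = multinomial_pmf(counts, probs)
print(lhs, rhs)
```

The two values coincide, and the common factor $e^{-\mu_{\cdot}}$ cancels, which is why the conditional distribution depends on the Poisson parameters only through the ratios $\mu_{i}/\mu_{\cdot}$.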
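The fact that the column totals are fixed provides the further condition $\{N_{\cdot j} = n_{\cdot j}\}$. Thus the distribution of $\{N_{ab}\}$ [...], where the summation is taken over all $\{n_{ab}\}$ which satisfy both the conditions. Hence the conditional distribution is [...]. We denote the expression on the right by $h(\{n_{ab}\} \mid \{n_{i\cdot}\}, \{n_{\cdot j}\}, \{\lambda_{ab}\})$. This is also the distribution of $\{N_{ab}\}$ conditional on $\{N_{i\cdot} = n_{i\cdot}\}$ and $\{N_{\cdot j} = n_{\cdot j}\}$ when the $\{N_{ij}\}$ are mutually independent Poisson variates with parameters $\{\mu_{ij}\}$ respectively, and

$\lambda_{ab} = \log(\mu_{ab}\mu_{rs}/\mu_{as}\mu_{rb}).$

Example 3.8 When $r = s = 2$, the conditions on $\{n_{ij}\}$ are that $n_{11} + n_{12} = n_{1\cdot}$, $n_{21} + n_{22} = n_{2\cdot}$, $n_{11} + n_{21} = n_{\cdot 1}$ and $n_{12} + n_{22} = n_{\cdot 2}$. Hence the possible values of $n_{11}$ are $v, v + 1, \ldots, w$, where $v = \max(0, n_{1\cdot} + n_{\cdot 1} - n)$ and $w = \min(n_{1\cdot}, n_{\cdot 1})$. The conditional distribution of $N_{11}$ is

$h(n_{11} \mid n_{1\cdot}, n_{\cdot 1}, n, \lambda) = \binom{n_{\cdot 1}}{n_{11}}\binom{n_{\cdot 2}}{n_{1\cdot} - n_{11}}e^{\lambda n_{11}} \Big/ \sum_{u=v}^{w}\binom{n_{\cdot 1}}{u}\binom{n_{\cdot 2}}{n_{1\cdot} - u}e^{\lambda u}.$

This distribution was first derived by Fisher (1935).

When $H_1$ holds, the distribution of $R$ is the same for each level of $S$, and $p_{ij} = p_i$, say, for all $i$ and $j$. The distribution of $\{N_{i\cdot}\}$ is the convolution of $s$ multinomial distributions with indices $n_{\cdot 1}, n_{\cdot 2}, \ldots, n_{\cdot s}$ and the same probabilities $\{p_i\}$. Hence $\{N_{i\cdot}\}$ has a multinomial distribution with index $n$ and probabilities $\{p_i\}$. Inferences about the probabilities are made along the lines discussed in Chapter 2. The main part of the analysis is therefore concerned with inferences about $\{\lambda_{ab}\}$ from the distribution with probabilities $h(\{n_{ab}\} \mid \{n_{i\cdot}\}, \{n_{\cdot j}\}, \{\lambda_{ab}\})$. We turn to this topic in the following two chapters.

3.3 Association between responses

A population is classified by each of two methods, which contain $r$ and $s$ categories respectively. The corresponding bivariate response is $(R, S)$, where

$P(R = i, S = j) = p_{ij} \quad (i = 1, 2, \ldots, r;\ j = 1, 2, \ldots, s)$

and $\sum_{ij} p_{ij} = 1$. We assume as before that $p_{ij} > 0$ for all $i$ and $j$. This assumption is not fulfilled for Example 3.6. The relevant theory is introduced in section 5.6, below.

Consider the hypothesis

$H_1$: $p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i$ and $j$,

where $p_{i\cdot} = \sum_j p_{ij}$ and $p_{\cdot j} = \sum_i p_{ij}$. This specifies that the responses are independent, i.e. that $P(R = i \mid S = j)$ is the same for all $j$, and similarly with the responses interchanged.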
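Returning to Example 3.8, the conditional distribution of $N_{11}$ in a $2 \times 2$ table can be tabulated directly from its definition. The following is a sketch under stated assumptions: the function name is hypothetical, the binomial-coefficient form of the weights is one way of writing the distribution, and the margins (row total 9, column total 16, grand total 95) are those read from Table 3.1.

```python
import math

def conditional_2x2(n1_row, n1_col, n, lam=0.0):
    """Distribution of N11 given both sets of margins.

    n1_row: total of row 1, n1_col: total of column 1, n: grand total,
    lam: log cross-product ratio. Weights are
    C(n.1, u) * C(n.2, n1. - u) * exp(lam * u) for u = v, ..., w.
    """
    n2_col = n - n1_col
    v = max(0, n1_row + n1_col - n)
    w = min(n1_row, n1_col)
    weights = {
        u: math.comb(n1_col, u) * math.comb(n2_col, n1_row - u) * math.exp(lam * u)
        for u in range(v, w + 1)
    }
    total = sum(weights.values())
    return {u: wt / total for u, wt in weights.items()}

# Margins as read from Table 3.1: 9 mice with tumours, 16 treated, 95 in all.
null_dist = conditional_2x2(9, 16, 95, lam=0.0)
upper_tail = sum(p for u, p in null_dist.items() if u >= 4)
null_mean = sum(u * p for u, p in null_dist.items())
print(round(null_mean, 4), round(upper_tail, 4))
```

With `lam=0.0` the weights reduce to the ordinary hypergeometric distribution, so `upper_tail` is the one-sided tail area for observing 4 or more tumours among the treated mice; positive values of `lam` shift the distribution towards larger values of $N_{11}$.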
When the responses are not independent there is an association between them, and the question arises of how the association can be measured. Consider a 2 × 2 table in which $R = 1$ and $S = 1$ are the categories of primary interest for the two binary responses. Then these categories are positively associated if either of the following equivalent conditions holds:

$$P(S = 1 \mid R = 1) > P(S = 1 \mid R = 2) \quad \text{and} \quad P(R = 1 \mid S = 1) > P(R = 1 \mid S = 2).$$

When the inequalities are reversed the categories are negatively associated.

Example 3.9
Table 3.3, above, shows the relationship between nasal carrier rate for Streptococcus pyogenes and size of tonsils. By combining the categories of enlarged tonsils, we obtain the 2 × 2 table given below.

Tonsils          Carriers    Non-carriers
Enlarged            53           829
Not enlarged        19           497

The proportion with enlarged tonsils among carriers is higher than among non-carriers. There appears to be a positive association between the categories of carrier and enlarged tonsils.

Many coefficients of association for 2 × 2 tables have been proposed, and they are reviewed by Kendall and Stuart (vol. 2, ch. 33). Such coefficients take the value zero when the responses are independent and are positive or negative depending on the type of association. They are purely descriptive and have no real interpretation. Considerations of symmetry (Simpson, 1951; Edwards, 1963) indicate that the measure of association for a 2 × 2 table should be some function of

$$\psi = p_{11}p_{22}/p_{12}p_{21},$$

which is known as the cross-product ratio of the table. A convenient function is

$$\lambda = \log(p_{11}p_{22}/p_{12}p_{21}).$$

Thus $R$ and $S$ are independent when $\lambda = 0$, while the two types of association correspond respectively to $\lambda > 0$ and $\lambda < 0$.

We now turn to $r \times s$ tables. The measurement of association is fully considered by Goodman and Kruskal (1954, 1959, 1963, 1972), who introduce coefficients for $r \times s$ tables with probabilistic interpretations. They have also given the theory appropriate for various sampling procedures.
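On the other hand, the symmetry argument for 2 × 2 tables can be extended to $r \times s$ tables (Altham, 1970), and shows that association ought to be measured by some function of the $(r - 1)(s - 1)$ cross-product ratios

$$\psi_{ij} = p_{ij}p_{rs}/p_{is}p_{rj} \quad (i = 1, 2, \ldots, r - 1;\ j = 1, 2, \ldots, s - 1).$$

Equivalently, the measure of association is a function of the parameters $\lambda_{ij} = \log \psi_{ij}$. A further reason for the importance of $\{\lambda_{ij}\}$ will appear in the following section.

Analysis of the association between $R$ and $S$ is based on a transformation of the probabilities $\{p_{ij}\}$ to new parameters, which consist of $\{\lambda_{ij}\}$,

$$\lambda_{i0} = \log(p_{is}/p_{rs}) \quad \text{and} \quad \lambda_{0j} = \log(p_{rj}/p_{rs}).$$

The hypothesis of independence is equivalent to the hypothesis that $\lambda_{ij} = 0$ for all $i$ and $j$. The proof follows the same lines as in section 3.2. Let $P$ denote the matrix $(p_{ij})$. If the responses are independent, then $\lambda_{ij} = 0$ for all $i$ and $j$. Conversely, $\lambda_{ij} = 0$ for all $i$ and $j$ implies that $P = uv'$ for some vectors $u$ and $v$. The condition that $1'P1 = 1$ ensures that $P = (P1)(1'P)$, which is the hypothesis of independence.

Suppose that $\lambda_{ij} = 0$ for all $i$ and $j$. Then the hypothesis that $\lambda_{0j} = 0$ for all $j$ implies that the columns of $P$ are identical. When both of these hold, the hypothesis that $\lambda_{i0} = 0$ for all $i$ ensures that each element of $P$ is $1/rs$. The same conclusion follows when the roles of $\{\lambda_{i0}\}$ and $\{\lambda_{0j}\}$ in this sequence are interchanged.

There is a similar procedure of considering a sequence of hypotheses in the analysis of variance of a two-way layout. The comparison suggests that the parameters $\{\lambda_{i0}\}$, $\{\lambda_{0j}\}$ and $\{\lambda_{ij}\}$ can be described as the main effects of $R$, the main effects of $S$, and the first-order or two-factor interaction $R \times S$ respectively. In such terms, independence is the hypothesis of zero first-order interaction.

A sample of size $n$ is taken from the population, and $n_{ij}$ is the frequency observed in the $i$th category of the first classification and the $j$th category of the second. The resulting contingency table takes the form shown in Table 3.8, where $N_{ij}$ is the variate corresponding to $n_{ij}$.

TABLE 3.8
Contingency table for two responses

Categories of        Categories of second response
first response     1       2      ...     s       Total
      1           N_11    N_12    ...    N_1s     N_10
      2           N_21    N_22    ...    N_2s     N_20
     ...          ...     ...     ...    ...      ...
      r           N_r1    N_r2    ...    N_rs     N_r0
    Total         N_01    N_02    ...    N_0s      n

We assume that units are taken independently and at random, which ensures that the variates $\{N_{ij}\}$ have a multinomial distribution with index $n$ and probabilities $\{p_{ij}\}$. The distribution of $\{N_{ij}\}$ is thus

$$P(N_{ij} = n_{ij} \text{ for all } i \text{ and } j) = n! \prod_{i,j} \frac{p_{ij}^{\,n_{ij}}}{n_{ij}!}.$$

We can derive the same result from $rs$ mutually independent variates $\{N_{ij}\}$ such that $N_{ij}$ has a Poisson distribution with mean $\mu_{ij}$. Denote $\sum_{i,j} \mu_{ij}$ by $\mu$. When a population is classified by each of two methods and a sample of size $n$ is taken, the condition $\sum_{i,j} N_{ij} = n$ is imposed.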
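The distribution of $\{N_{ij}\}$ conditional on $\sum_{i,j} N_{ij} = n$ is multinomial with index $n$ and probabilities $\{p_{ij}\}$, where $p_{ij} = \mu_{ij}/\mu$.

In terms of the new parameters, the distribution of $\{N_{ij}\}$ is given by

$$P(N_{ij} = n_{ij} \text{ for all } i \text{ and } j) = \frac{n!}{\prod_{i,j} n_{ij}!}\, p_{rs}^{\,n} \exp\Big( \sum_{i<r} n_{i0}\lambda_{i0} + \sum_{j<s} n_{0j}\lambda_{0j} + \sum_{i<r}\sum_{j<s} n_{ij}\lambda_{ij} \Big),$$

where

$$p_{rs}^{-1} = 1 + \sum_{i<r} e^{\lambda_{i0}} + \sum_{j<s} e^{\lambda_{0j}} + \sum_{i<r}\sum_{j<s} e^{\lambda_{i0} + \lambda_{0j} + \lambda_{ij}}.$$

Thus $\{N_{i0}\}$, $\{N_{0j}\}$ and $\{N_{ij}\}$ are minimal sufficient for $\{\lambda_{i0}\}$, $\{\lambda_{0j}\}$ and $\{\lambda_{ij}\}$. We are primarily interested in $\{\lambda_{ij}\}$ and therefore derive the distribution of $\{N_{ij}\}$ conditional on $\{N_{i0} = n_{i0}\}$ and $\{N_{0j} = n_{0j}\}$. By the same arguments as in section 3.2, the conditional distribution is

$$P(N_{ij} = n_{ij} \mid N_{i0} = n_{i0} \text{ and } N_{0j} = n_{0j} \text{ for all } i \text{ and } j) = M(n_{ij} \mid n_{i0}, n_{0j}, \lambda_{ij}).$$

When $\lambda_{ij} = 0$ for all $i$ and $j$, the responses $R$ and $S$ are independent, in which case $\{N_{i0}\}$ and $\{N_{0j}\}$ have independent multinomial distributions with index $n$ and probabilities $\{p_{i0}\}$ and $\{p_{0j}\}$ respectively. We have described in Chapter 2 how to make inferences from multinomial distributions. Thus the main part of the analysis is again concerned with inferences about $\{\lambda_{ij}\}$ from the distribution with probabilities $M(n_{ij} \mid n_{i0}, n_{0j}, \lambda_{ij})$. We return to this topic in the following two chapters.

3.4 Contingency tables with given margins

Suppose that the distribution of a bivariate response $(R, S)$ is defined by

$$P(R = i, S = j) = p_{ij} \quad (i = 1, 2, \ldots, r;\ j = 1, 2, \ldots, s),$$

where $\sum_{i,j} p_{ij} = 1$ and $p_{ij} > 0$. In the previous section we introduced a transformation from the probabilities $\{p_{ij}\}$ to parameters $\{\lambda_{i0}\}$, $\{\lambda_{0j}\}$ and $\{\lambda_{ij}\}$. The reverse transformation is given by

$$\log p_{is} = \log p_{rs} + \lambda_{i0}, \quad \log p_{rj} = \log p_{rs} + \lambda_{0j}$$

and

$$\log p_{ij} = \log p_{rs} + \lambda_{i0} + \lambda_{0j} + \lambda_{ij} \quad (i \le r - 1;\ j \le s - 1).$$

These equations constitute a linear model for the logarithms of the probabilities $\{p_{ij}\}$, known as a log-linear model.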
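The transformation between the probabilities and the parameters $\lambda_{i0} = \log(p_{is}/p_{rs})$, $\lambda_{0j} = \log(p_{rj}/p_{rs})$ and $\lambda_{ij} = \log(p_{ij}p_{rs}/p_{is}p_{rj})$ can be sketched in a few lines. The function names `to_loglinear` and `from_loglinear` and the example table are assumptions made for illustration; the last row and last column serve as reference categories, as in the definitions above.

```python
import math

def to_loglinear(p):
    """Map a positive r x s probability table to (lam_i0, lam_0j, lam_ij).

    The last row and last column are the reference categories, so the
    returned lists have lengths r-1, s-1 and (r-1) x (s-1).
    """
    r, s = len(p), len(p[0])
    prs = p[r - 1][s - 1]
    lam_i0 = [math.log(p[i][s - 1] / prs) for i in range(r - 1)]
    lam_0j = [math.log(p[r - 1][j] / prs) for j in range(s - 1)]
    lam_ij = [
        [math.log(p[i][j] * prs / (p[i][s - 1] * p[r - 1][j]))
         for j in range(s - 1)]
        for i in range(r - 1)
    ]
    return lam_i0, lam_0j, lam_ij

def from_loglinear(lam_i0, lam_0j, lam_ij):
    """Reverse transformation: recover the r x s probability table."""
    r, s = len(lam_i0) + 1, len(lam_0j) + 1
    a = list(lam_i0) + [0.0]   # main effect is zero in the reference row
    b = list(lam_0j) + [0.0]   # main effect is zero in the reference column

    def interaction(i, j):
        # The interaction is zero in the last row and in the last column.
        return lam_ij[i][j] if i < r - 1 and j < s - 1 else 0.0

    w = [[math.exp(a[i] + b[j] + interaction(i, j)) for j in range(s)]
         for i in range(r)]
    total = sum(sum(row) for row in w)   # equals 1 / p_rs
    return [[x / total for x in row] for row in w]

# Illustrative 2 x 2 table (assumed values).
p = [[0.3, 0.2], [0.1, 0.4]]
lam_i0, lam_0j, lam_ij = to_loglinear(p)
print(lam_ij[0][0])                            # log cross-product ratio
print(from_loglinear(lam_i0, lam_0j, lam_ij))  # recovers p up to rounding
```

The round trip returns the original table, which reflects the one-to-one correspondence between the probabilities and the parameters.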
The equations can be solved uniquely for the new parameters, and so the model is described as saturated. A hypothesis, as distinct from a tautology, is expressed only when some of the parameters have specified values, in which case the model is unsaturated.

There is an analogy with the analysis of variance, where main effects and interactions are defined in terms of the expected values of the observations, with the consequence that the expected values can be expressed as a linear combination of such effects. An alternative version of the log-linear model is suggested by the standard analysis of variance model for cross-classified factorial designs, and was introduced by Birch (1963). According to the second version, we write

$$\log p_{ij} = \beta + \beta_{i0} + \beta_{0j} + \beta_{ij},$$

where

$$\sum_i \beta_{i0} = \sum_j \beta_{0j} = \sum_i \beta_{ij} = \sum_j \beta_{ij} = 0.$$

The two models are different, but are related when $r = s = 2$, in which case $\beta_{11} = \tfrac{1}{4}\lambda_{11}$. We shall continue to use the first version, which was proposed in general terms by Mantel (1966).

The probabilities $\{p_{ij}\}$ are uniquely determined by the combination of parameters $\{\lambda_{i0}\}$, $\{\lambda_{0j}\}$ and $\{\lambda_{ij}\}$. We proceed to show that the $\{p_{ij}\}$ are also uniquely determined by a combination which consists of the parameters $\{\lambda_{ij}\}$ and the marginal probabilities $\{p_{i0}\}$ and $\{p_{0j}\}$. This result confirms the basic position of $\{\lambda_{ij}\}$ in the measurement of association and will also be found important in the asymptotic behaviour of the distribution $M(n_{ij} \mid n_{i0}, n_{0j}, \lambda_{ij})$.

Consider first the case $r = s = 2$. Since $\{p_{i0}\}$ and $\{p_{0j}\}$ are given, the probabilities are expressed in terms of $p_{11}$ by

$$p_{12} = p_{10} - p_{11}, \quad p_{21} = p_{01} - p_{11} \quad \text{and} \quad p_{22} = 1 - p_{10} - p_{01} + p_{11}.$$

Hence $p_{11}$ is subject to the conditions

$$\max(0,\, p_{10} + p_{01} - 1) < p_{11} < \min(p_{10},\, p_{01}).$$

The cross-product ratio $\psi$ is also given, whence

$$p_{11}(1 - p_{10} - p_{01} + p_{11}) / \{(p_{10} - p_{11})(p_{01} - p_{11})\} = \psi.$$

Thus $p_{11}$ satisfies the quadratic equation

$$p_{11}^2(\psi - 1) - p_{11}\{(\psi - 1)(p_{10} + p_{01}) + 1\} + \psi\, p_{10}\, p_{01} = 0.$$

The left-hand side is positive at the lower limit of $p_{11}$ and negative at the upper limit. We conclude that only one root of the quadratic equation is in the permissible range of $p_{11}$. Therefore the probabilities $\{p_{ij}\}$ are uniquely determined by the combination of $p_{10}$, $p_{01}$ and $\psi$.

The generalization of this result to a table of order $r \times s$ is based on the following theorem, which is due to Sinkhorn (1967). Let $\{p_{i0}\}$ and $\{p_{0j}\}$ be specified marginal probabilities.
Then, corresponding to each positive matrix $Q$ of order $r \times s$, there is a unique matrix $CQD$ with row sums $\{p_{i0}\}$ and column sums $\{p_{0j}\}$, where $C$ and $D$ are respectively $r \times r$ and $s \times s$ diagonal matrices with positive diagonals and are themselves unique up to a scalar multiple. Furthermore, $CQD$ can be found by a convergent iterative process which consists of alternately scaling the rows and columns of $Q$ to have row sums $\{p_{i0}\}$ and column sums $\{p_{0j}\}$ respectively. A proof of the theorem is omitted.

Suppose that we are given the marginal probabilities $\{p_{i0}\}$ and $\{p_{0j}\}$ of a table of order $r \times s$, together with the parameters $\{\lambda_{ij}\}$. Thus the probabilities $\{p_{ij}\}$ satisfy the conditions

$$\sum_j p_{ij} = p_{i0}, \quad \sum_i p_{ij} = p_{0j} \quad \text{and} \quad \log(p_{ij}p_{rs}/p_{is}p_{rj}) = \lambda_{ij}.$$

We prove as follows that there is a unique solution. Denote the cross-product ratio $\exp \lambda_{ij}$ by $\psi_{ij}$. Let $Q$ be any positive matrix which has order $r \times s$ and cross-product ratios $\{\psi_{ij}\}$. We may construct $Q$ by assigning arbitrary positive values to its last row and column, and then determining the remaining elements from

$$q_{ij} = \psi_{ij}\, q_{is}\, q_{rj} / q_{rs}.$$

The choice $q_{is} = q_{rj} = q_{rs} = 1$ is usually the most convenient. According to the theorem stated above, there is a unique matrix $CQD$ with the specified marginal totals. Moreover, the cross-product ratios of $CQD$ are given by
