
Catalogue no.
12-001-XIE
Survey Methodology
June 2002
How to obtain more information

Specific inquiries about this product and related statistics or services should be directed to: Business Survey Methods Division, Statistics Canada, Ottawa, Ontario, K1A 0T6 (telephone: 1-800-263-1136). For information on the wide range of data available from Statistics Canada, you can contact us by calling one of our toll-free numbers. You can also contact us by e-mail or by visiting our website.

National inquiries line: 1-800-263-1136
National telecommunications device for the hearing impaired: 1-800-363-7629
Depository Services Program inquiries: 1-800-700-1033
Fax line for Depository Services Program: 1-800-889-9734
E-mail inquiries: infostats@statcan.ca
Website: www.statcan.ca
Information to access the product

This product, catalogue no. 12-001-XIE, is available for free. To obtain a single issue, visit our website at www.statcan.ca and select Publications.
Standards of service to the public

Statistics Canada is committed to serving its clients in a prompt, reliable and courteous manner and in the official language of their choice. To this end, the Agency has developed standards of service that its employees observe in serving its clients. To obtain a copy of these service standards, please contact Statistics Canada toll free at 1 800 263-1136. The service standards are also published on www.statcan.ca under About us > Providing services to Canadians.
Statistics Canada
Business Survey Methods Division
Survey Methodology
June 2002
Published by authority of the Minister responsible for Statistics Canada. Minister of Industry, 2006. All rights reserved. The content of this electronic publication may be reproduced, in whole or in part, and by any means, without further permission from Statistics Canada, subject to the following conditions: that it be done solely for the purposes of private study, research, criticism, review or newspaper summary, and/or for non-commercial purposes; and that Statistics Canada be fully acknowledged as follows: Source (or Adapted from, if appropriate): Statistics Canada, year of publication, name of product, catalogue number, volume and issue numbers, reference period and page(s). Otherwise, no part of this publication may be reproduced, stored in a retrieval system or transmitted in any form, by any means (electronic, mechanical or photocopy), or for any purposes without prior written permission of Licensing Services, Client Services Division, Statistics Canada, Ottawa, Ontario, Canada K1A 0T6.
October 2006. Catalogue no. 12-001-XIE. ISSN 1492-0921. Frequency: semi-annual. Ottawa. This publication is available in French upon request (catalogue no. 12-001-XIF).
Note of appreciation

Canada owes the success of its statistical system to a long-standing partnership between Statistics Canada, the citizens of Canada, its businesses, governments and other institutions. Accurate and timely statistical information could not be produced without their continued cooperation and goodwill.
Survey Methodology, June 2002, Vol. 28, No. 1, pp. 5-23
Statistics Canada, Catalogue No. 12-001-XIE

Regression Estimation for Survey Samples

Wayne A. Fuller¹

Abstract

Regression and regression related procedures have become common in survey estimation. We review the basic properties of regression estimators, discuss implementation of regression estimation, and investigate variance estimation for regression estimators. The role of models in constructing regression estimators and the use of regression in nonresponse adjustment are explored.

Key Words: Auxiliary information; Calibration; Least squares; Design consistency; Linear prediction.
1. Introduction
Design and estimation in survey sampling involve the use of information about the study population to construct efficient procedures. While design and estimation are intimately related, with estimators depending on the design, the two topics are often treated somewhat separately in the survey sampling literature. We follow tradition, first studying estimation treating the design as given. The estimation task is to combine the available information about the population with the sample data to produce good representations of characteristics of interest.

Regression estimation is one of the important procedures that use population information, or information from a larger sample, to construct estimators with good efficiency. The information, sometimes called auxiliary information, may have been used in the design or may not have been available at the design stage. In surveys of the human population, the information often comes from official sources such as the national census. Similar sources may provide information for other types of surveys. For example, in a survey of land use the total surface area, the area owned by the national government, and the area in permanent water bodies may be available from national data archives.

Three distinct situations can be identified with respect to the nature of the auxiliary information that is available. In the first, the values of the auxiliary vector $\mathbf{x}$ are known for each element in the population at the time of sample selection. In this case the auxiliary variable can be used in designing the sample selection procedure.

In the second situation all values of the vector $\mathbf{x}$ are known, but a particular value cannot be associated with a particular element until the sample is observed. In this case, the auxiliary information cannot be used in design, but a wide range of estimation options are available once the observations are available. For example, the population census may give the age-sex distribution of the population, but a list of individuals and their characteristics is not available to non governmental institutions selecting samples.

In the third situation, only the population mean of $\mathbf{x}$ is known, or known for a large sample. In this case, the auxiliary information cannot be used in design and the estimation options are limited. For example, the U.S. Department of Agriculture might release an estimate of the total number of animals of a particular type on farms on a particular date. Our discussion concentrates on this situation.

Two estimation situations can also be identified. In one, a single variable and a parameter, or a very small number of parameters, is under consideration. The analyst is willing to invest a great deal of effort in the analysis, has a well formulated population model, and is prepared to support the estimation procedure on the basis of the reasonableness of the model. In the second situation, a large number of analyses of a large number of variables is anticipated. No single model is judged adequate for all variables. The prototypical example of the second situation is the case in which a data set is prepared by the survey sampler to be analyzed by others. Because the person preparing the data set does not have knowledge of the analysis variables, emphasis is placed on the use of estimators that can be defended with minimal recourse to models.

Regression estimators fall in the class of linear estimators. Linear estimators have a particular advantage in survey sampling because once the weights are calculated they are appropriate for any analysis variable. Several properties of estimators will be examined in our discussion. Given a model, we accept the classical goal of minimizing the mean square error in a class of estimators. That class may be the class of linear estimators that are unbiased under the model, but the class may be further restricted.

Estimators that are scale and location invariant can be used in general settings. Mickey (1959) suggested that the term regression estimator be restricted to linear estimators that are location and scale invariant. While we may not adhere strictly to this definition, we support the distinction between estimators that are location and scale invariant and those that are not. We consider location invariance to be important for sampling designs where the unit of interest for analysis is also the sampling unit. For cluster and two-stage designs in which weights are constructed for primary sampling unit totals, location invariance is less important.

Models play an important role in the construction of regression estimators. It is desirable that the estimators retain good properties if the model specification is not exact. Therefore properties conditional on the realized finite population, as well as properties under the model, are important.

Linear estimators that reproduce the known means of the auxiliary variables are said to be calibrated. This is a desirable property in that, for example, the marginals of tables with an auxiliary variable as an analysis variable agree with known totals. If the auxiliary variable is of no analytic interest, then calibration is less important.

1. Wayne Fuller, Emeritus Distinguished Professor, Iowa State University, 221 Snedecor Hall, Ames, IA 50011-1210, U.S.A.
A great deal of research was conducted in the 1970s and 1980s on the general nature of the regression estimator in survey samples and on the degree to which the model prediction approach can be reconciled with the design perspective. Fuller (1973, 1975) gave the large sample properties of a vector of regression coefficients computed from a survey sample. Isaki (1970) studied regression estimators and the results were published in expanded versions in Isaki and Fuller (1982) and Fuller and Isaki (1981). It was shown that a regression estimator constructed under a model is design consistent for the population mean if the model contains certain variables. Cassel, Särndal and Wretman (1976) considered both model and design principles in estimator construction and suggested the term generalized regression estimator for design consistent estimators of the total of the form

$$\hat{T}_{y,GREG} = \hat{T}_{y,HT} + (\mathbf{T}_{x,N} - \hat{\mathbf{T}}_{x,HT})\hat{\boldsymbol{\beta}},$$

where $\hat{T}_{y,HT}$ and $\hat{\mathbf{T}}_{x,HT}$ are the Horvitz-Thompson estimators of the totals of $y$ and $\mathbf{x}$, respectively, $\mathbf{T}_{x,N}$ is the known population total of $\mathbf{x}$ and $\hat{\boldsymbol{\beta}}$ is an estimated regression coefficient. Särndal (1980), Wright (1983), and Särndal and Wright (1984) discussed classes of regression estimators. The text by Särndal, Swensson and Wretman (1992) contains an extensive discussion of regression estimation and Mukhopadhyay (1993) is a review.

It was the 1970s before the use of regression for general purpose, multiple characteristic, surveys appeared and it was the 1990s before the use of regression weighting could be called widespread. An early use of regression weights was at Doane Agricultural Services Inc., now Doane Marketing Research. During 1971-1972 a readership study of farmers was conducted under the direction of Mr. John Wilkin in which 6,920 farmers responded. Weights for the respondents were constructed using regression procedures, where the controls came from the U.S. Agricultural Census and from Department of Agriculture sources. Doane provided financial support to Iowa State University to develop a regression weight generation program. To guarantee positive weights in the Doane study, observations with small weights were grouped and assigned a common weight. Grouping continued until the common weight was positive. Later computer programs used modifications of the Huang and Fuller (1978) procedure to guarantee positive weights. Doane has used regression weights for their syndicated market research studies since 1972.

Regression estimation was first used at Statistics Canada in 1988 for the Canadian Labour Force Survey. In 1992 regression estimation was used by the 1991 Canadian Census of Population to ensure that the weighted sum of variables collected via the long form (a one in five systematic sample of all households in Canada) was equal to known household and population totals as collected in the 1991 Census. See Bankier, Rathwell and Majkowski (1992) and Bankier, Houle and Luc (1997).
The regression estimator is also the key component of the Generalized Estimation System (GES) developed at Statistics Canada and used in numerous business and social surveys since its release in 1992. The methodology is described in Estevao, Hidiroglou and Särndal (1995). See also Hidiroglou, Särndal and Binder (1995). Regression estimation is now used to construct composite estimators for the Canadian Labour Force Survey. See Singh, Kennedy and Wu (2001), Gambino, Kennedy and Singh (2001) and Fuller and Rao (2001).

Bethlehem and Keller (1987) report on the use of regression estimation at the Netherlands Central Bureau of Statistics (now Statistics Netherlands) in a program called LIN WEIGHT. Nieuwenbrock, Renssen and Hofman (2000) describe the software package Bascula, that has replaced LIN WEIGHT. Deville, Särndal and Sautory (1993) describe a computer program CALMAR developed at Institut National de la Statistique et des Études Économiques (I.N.S.E.E.) that computes weights of the regression type with options for different objective functions. A program developed at Statistics Sweden and called CLAN97 is documented in Anderson and Nordberg (1998). Folsom and Singh (2000) discuss a procedure developed at the Research Triangle Institute.

2. Background

The earliest references to the use of regression in survey sampling include Jessen (1942) and Cochran (1942). Regression in similar contexts would certainly have been used earlier and Cochran (1977, page 189) mentions a regression on leaf area by Watson (1937). It is interesting that Jessen's use of regression was essentially composite estimation where regression was used to improve estimates for two time points given samples at each point with some common elements in the two samples. Cochran (1942) gave the basic theory for regression in survey sampling relying heavily on linear model theory. He showed that the linear model did not need to hold in order for the regression estimator to perform well. He derived an expression for the $O(n^{-1})$ bias and an $O(n^{-2})$ approximation for the variance. He also showed that for the model with regression passing through the origin and error variances proportional to $x$, the ratio estimator is the generalized least squares estimator.

Regression estimation attracted theoretical interest in the 1950s, often in the form of studies of the bias. See Mickey (1959). Brewer (1963) is an early reference that considers linear estimation using a superpopulation model to determine an optimal procedure. He was concerned with finding the optimal design for the ratio estimator and discussed the possible conflict between an optimal design under the model and a design that is less model dependent. See also Brewer (1979). Royall (1970) argued for the use of models, that the conditional properties that are important are those conditional on the auxiliary information in the sample, and that the design should be chosen to optimize those properties. Royall and his coworkers, e.g., Royall and Cumberland (1981), studied the conditional properties of regression estimators, conditional on the realized sample of auxiliary variables.

3. The Classical Linear Model

The classical linear model is the foundation for survey regression estimation, but the survey situation requires certain adaptations. To introduce regression estimation for survey samples, we review the classical linear model. Assume

$$y_i = \mathbf{x}_i\boldsymbol{\beta} + e_i, \quad i = 1, 2, \ldots, n, \qquad e_i \sim NI(0, \sigma_e^2), \tag{3.1}$$

where $e_i$ is independent of the $k$-dimensional row vectors $\mathbf{x}_j$ for all $i$ and $j$, and $\boldsymbol{\beta}$ is the unknown parameter column vector. We will also use matrix representations for the sample quantities. Thus, for a sample of $n$ elements,

$$\mathbf{X} = (\mathbf{x}_1', \mathbf{x}_2', \ldots, \mathbf{x}_n')' \quad \text{and} \quad \mathbf{y} = (y_1, y_2, \ldots, y_n)'.$$

Given a sample of size $n$ and treating the $\mathbf{x}_i$ as fixed, the best (minimum mean squared error) estimator of $\boldsymbol{\beta}$ is

$$\hat{\boldsymbol{\beta}} = \Big(\sum_{i \in A} \mathbf{x}_i'\mathbf{x}_i\Big)^{-1} \sum_{i \in A} \mathbf{x}_i' y_i = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}, \tag{3.2}$$

where $A$ is the set of indexes of the sample elements and we assume, as we will throughout, that the matrix to be inverted is nonsingular. If the $e_i$ are not normally distributed, $\hat{\boldsymbol{\beta}}$ is the estimator with smallest variance in the class of linear unbiased estimators. The estimator of a linear combination of the coefficients, say $\theta_a = \sum_{j=1}^{k} a_j\beta_j$, can be written as

$$\hat{\theta}_a = \sum_{i \in A} w_i y_i,$$

where the weights, $w_i$, minimize the Lagrangean

$$\sum_{i \in A} w_i^2 + \sum_{j=1}^{k} \lambda_j \Big(\sum_{i \in A} w_i x_{ij} - a_j\Big)$$

and the $\lambda_j$ are Lagrange multipliers. The variance of $\hat{\theta}_a$ is

$$V\{\hat{\theta}_a\} = V\Big\{\sum_{i \in A} w_i e_i\Big\} = \sum_{i \in A} w_i^2 \sigma_e^2$$

because the weights are functions of the $\mathbf{x}_i$ and not of $y_i$. The covariance matrix of $\hat{\boldsymbol{\beta}}$ is

$$V\{\hat{\boldsymbol{\beta}}\} = \Big(\sum_{i \in A} \mathbf{x}_i'\mathbf{x}_i\Big)^{-1} V\Big\{\sum_{i \in A} \mathbf{b}_i\Big\} \Big(\sum_{i \in A} \mathbf{x}_i'\mathbf{x}_i\Big)^{-1} = V\Big\{\sum_{i \in A} \mathbf{c}_i\Big\}, \tag{3.3}$$

where $\mathbf{b}_i = \mathbf{x}_i' e_i$ and $\mathbf{c}_i = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_i' e_i$. Because $e_i$ is independent of $\mathbf{x}_j$ for all $i$ and $j$,

$$V\Big\{\sum_{i \in A} \mathbf{b}_i\Big\} = \sum_{i \in A} V\{\mathbf{b}_i\} = \sum_{i \in A} \mathbf{x}_i'\mathbf{x}_i\,\sigma_e^2$$

and we obtain the familiar expression,

$$V\{\hat{\boldsymbol{\beta}}\} = \Big(\sum_{i \in A} \mathbf{x}_i'\mathbf{x}_i\Big)^{-1}\sigma_e^2.$$

The usual unbiased estimator of the covariance matrix of $\hat{\boldsymbol{\beta}}$ is obtained by replacing $\sigma_e^2$ with the unbiased estimator of $\sigma_e^2$ obtained as the mean square of the residuals, $\hat{e}_i = y_i - \mathbf{x}_i\hat{\boldsymbol{\beta}}$. An estimator of the covariance matrix that estimates $V\{\sum_{i \in A}\mathbf{b}_i\}$ directly is

$$\tilde{V}_b\{\hat{\boldsymbol{\beta}}\} = \Big(\sum_{i \in A} \mathbf{x}_i'\mathbf{x}_i\Big)^{-1} \sum_{i \in A} \hat{\mathbf{b}}_i\hat{\mathbf{b}}_i' \Big(\sum_{i \in A} \mathbf{x}_i'\mathbf{x}_i\Big)^{-1} = \sum_{i \in A} \hat{\mathbf{c}}_i\hat{\mathbf{c}}_i', \tag{3.4}$$

where $\hat{\mathbf{b}}_i = \mathbf{x}_i'\hat{e}_i$ and $\hat{\mathbf{c}}_i = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_i'\hat{e}_i$. In the same way

$$\tilde{V}_b\{\hat{\theta}_a\} = \sum_{i \in A} w_i^2\hat{e}_i^2 \tag{3.5}$$

is a linear combination of the elements of (3.4) and is a consistent estimator of $V\{\hat{\theta}_a\}$. The estimator (3.4) is a consistent estimator of $V\{\hat{\boldsymbol{\beta}}\}$ when the covariance matrix of the $e_i$ is a diagonal matrix with bounded elements. Thus it is a more robust estimator. However, the estimator (3.4) is biased downward because the variance of $\hat{e}_i$ is usually less than the variance of $e_i$. Two methods are available for reducing the bias. The first is to make a degrees-of-freedom adjustment by multiplying $\tilde{V}_b\{\hat{\boldsymbol{\beta}}\}$ by $(n - k)^{-1}n$, where $k$ is the dimension of $\mathbf{x}_i$. An alternative adjustment is to replace $\hat{e}_i$ with

$$\tilde{e}_i = (1 - \psi_{ii})^{-0.5}\hat{e}_i.$$
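As a numerical illustration of the classical results above, the following is a minimal sketch, assuming NumPy and an `X` matrix of full column rank. The function names `ols_with_robust_variance` and `linear_combination_weights` are hypothetical and are not taken from the paper or from any survey package. The sketch computes the least squares estimator (3.2), the model-based covariance matrix, the direct estimator (3.4) with both bias adjustments, and the weights $w_i$ for a linear combination of the coefficients. In the leverage adjustment, $\psi_{ii}$ is taken to be the $i$th diagonal element of $\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$.

```python
import numpy as np

def ols_with_robust_variance(X, y):
    """Least squares fit (3.2) with several covariance estimators.

    Hypothetical helper for illustration; X is n x k with full column rank.
    """
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y                       # (3.2)
    resid = y - X @ beta                           # e_hat_i
    sigma2 = resid @ resid / (n - k)               # mean square of residuals
    v_model = XtX_inv * sigma2                     # familiar model expression
    meat = (X * resid[:, None] ** 2).T @ X         # sum of b_hat_i b_hat_i'
    v_direct = XtX_inv @ meat @ XtX_inv            # (3.4)
    v_dof = v_direct * n / (n - k)                 # degrees-of-freedom adjustment
    leverage = np.einsum("ij,jk,ik->i", X, XtX_inv, X)   # psi_ii
    adj = resid / np.sqrt(1.0 - leverage)          # e_tilde_i
    v_lev = XtX_inv @ ((X * adj[:, None] ** 2).T @ X) @ XtX_inv
    return beta, v_model, v_direct, v_dof, v_lev

def linear_combination_weights(X, a):
    """Weights w with sum_i w_i y_i = a' beta_hat and X'w = a."""
    return X @ np.linalg.solve(X.T @ X, a)
```

With these weights, `np.sum(w**2 * resid**2)` reproduces (3.5) and equals `a @ v_direct @ a`, which is the sense in which (3.5) is a linear combination of the elements of (3.4).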
Here $\psi_{ii}$ is the $i$th diagonal element of $\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$. See Horn, Horn and Duncan (1975), Royall and Cumberland (1978) and Cook and Weisberg (1982, section 2.2).

If we observe the value $\mathbf{x}_i$ for an element, but do not observe $y_i$, then the best predictor of $y_i$ for that element is $\hat{y}_i = \mathbf{x}_i\hat{\boldsymbol{\beta}}$. Likewise, if we know the sum of $\mathbf{x}_i$ for a set of $\mathbf{x}$'s, then the best predictor for the sum of the $y_i$ is the sum of $\mathbf{x}_i\hat{\boldsymbol{\beta}}$. Thus, given a set of $N$ elements that satisfy model (3.1), a set of observations $(y_i, \mathbf{x}_i)$ on a subset denoted by $A$, and the known values of $\mathbf{x}_i$ for the remaining $N - n$ elements,

$$\hat{Y}_{N-n,reg} = \sum_{i \in \bar{A}} \hat{y}_i = \sum_{i \in \bar{A}} \mathbf{x}_i\hat{\boldsymbol{\beta}},$$

where $\bar{A}$ is the set of elements for which $y$ is not observed, is the best predictor of the sum of the unobserved $y$'s. See Goldberger (1962), Brewer (1963), Royall (1970), Harville (1976) and Graybill (1976, section 12.2). Hence

$$\hat{T}_{y,reg} = \sum_{i \in A} y_i + \hat{Y}_{N-n,reg} \tag{3.6}$$

is the best predictor for the total of $N$ observations.

If the first element in the $\mathbf{x}$ vector is always one, we can partition the $\mathbf{x}$ vector as $\mathbf{x}_i = (1, \mathbf{x}_{1,i})$ and write the regression estimator of the mean as

$$\bar{y}_{reg} = N^{-1}\hat{T}_{y,reg} = \bar{\mathbf{x}}_N\hat{\boldsymbol{\beta}} = \bar{y}_n + (\bar{\mathbf{x}}_{1,N} - \bar{\mathbf{x}}_{1,n})\hat{\boldsymbol{\beta}}_1, \tag{3.7}$$

where $\hat{\boldsymbol{\beta}}$ of (3.2) is partitioned as $(\hat{\beta}_0, \hat{\boldsymbol{\beta}}_1')'$ and $(\bar{y}_n, \bar{\mathbf{x}}_n)$ is the vector of simple sample means. We call $\bar{\mathbf{x}}_N\hat{\boldsymbol{\beta}}$ the regression estimator of the mean.

Given the model (3.1), the expected value of the mean of $y$ for the finite population of $N$ elements generated by the model is $\bar{\mathbf{x}}_N\boldsymbol{\beta}$ and $\bar{\mathbf{x}}_N\hat{\boldsymbol{\beta}}$ is an unbiased estimator of the finite population mean. This, we believe, is the point at which regression estimation for the finite population mean under more complex designs begins.

4. Design Based Estimation

The development of this section treats the finite population as a sample realization from an infinite population. The use of such models has a long history in survey sampling. Some references through 1970 are Cochran (1939, 1942, 1946), Deming and Stephan (1941), Madow and Madow (1944), Yates (1949), Godambe (1955), Hájek (1959), Rao, Hartley, and Cochran (1962), Konijn (1962), Brewer (1963), Godambe and Joshi (1965), Hanurav (1966), Ericson (1969), Isaki (1970), and Royall (1970).

To discuss the large sample properties of regression estimators we consider sequences of finite populations and associated probability samples. The set of indices of the elements in the $N$th finite population is $U_N = \{1, \ldots, N\}$, where $N = 1, 2, \ldots$. Associated with the $i$th element of the $N$th population is a row vector of characteristics $\mathbf{z}_{iN} = (y_{iN}, \mathbf{x}_{iN})$. Let

$$F_N = [(y_{1N}, \mathbf{x}_{1N}), (y_{2N}, \mathbf{x}_{2N}), \ldots, (y_{NN}, \mathbf{x}_{NN})]$$

be the set of vectors for the $N$th finite population. The subscript $N$ on the vectors will often be omitted. The finite population mean is

$$\bar{\mathbf{z}}_N = (\bar{y}_N, \bar{\mathbf{x}}_N) = N^{-1}\sum_{i=1}^{N} (y_i, \mathbf{x}_i). \tag{4.1}$$

We denote the set of indices appearing in the sample selected from the $N$th finite population by $A_N$.

When the finite population is a sample from an infinite superpopulation, the probability properties of a sample are determined by the properties of the superpopulation and the properties of the probability mechanism used to select the sample. One can consider the unconditional properties, the properties conditional on the particular finite population, or the properties conditional on some part of the realized sample. Properties conditional on the finite population depend primarily on the survey design and are often called design properties. Thus an estimator $\hat{\theta}$ is said to be design consistent for the finite population parameter $\theta_N$ if, for all $\epsilon > 0$,

$$\lim_{N, n \to \infty} \text{prob}\{|\hat{\theta} - \theta_N| > \epsilon \mid F_N\} = 0,$$

where the notation means that we condition on the realized finite population $F_N$ and, hence, the probability is with respect to the design.

Assume the finite population is generated as independent selections from a superpopulation for which $E\{\mathbf{z}_i'\mathbf{z}_i\}$ is positive definite, where $\mathbf{z}_i = (y_i, \mathbf{x}_i)$. We define a superpopulation vector of least squares regression coefficients by

$$\boldsymbol{\beta} = [E\{\mathbf{x}_i'\mathbf{x}_i\}]^{-1}E\{\mathbf{x}_i' y_i\}. \tag{4.2}$$
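Given a sample of $n$ observations on $\mathbf{z}_i$ we define the $n \times (k + 1)$ matrix $\mathbf{Z} = (\mathbf{y}, \mathbf{X})$ of observations, where the $i$th row of $\mathbf{Z}$ is $(y_i, \mathbf{x}_i)$. If we assume the model

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{u}, \qquad E\{\mathbf{u}\} = \mathbf{0}, \quad E\{\mathbf{u}\mathbf{u}'\} = \boldsymbol{\Phi}, \tag{4.3}$$

the generalized least squares estimator of $\boldsymbol{\beta}$ is

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\boldsymbol{\Phi}^{-1}\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\Phi}^{-1}\mathbf{y}. \tag{4.4}$$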
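The generalized least squares estimator (4.4) can be sketched in a few lines. This is an illustration only, assuming NumPy and a symmetric positive definite `Phi`; the function name `gls_coefficients` is hypothetical. Linear systems are solved directly rather than forming explicit inverses.

```python
import numpy as np

def gls_coefficients(X, y, Phi):
    """Generalized least squares estimator (4.4) for a weight matrix Phi.

    Hypothetical helper; Phi is assumed symmetric positive definite.
    """
    Phi_inv_X = np.linalg.solve(Phi, X)      # Phi^{-1} X
    Phi_inv_y = np.linalg.solve(Phi, y)      # Phi^{-1} y
    return np.linalg.solve(X.T @ Phi_inv_X, X.T @ Phi_inv_y)
```

With `Phi` equal to the identity matrix the result coincides with ordinary least squares, and if `y` lies exactly in the column space of `X` the generating coefficients are recovered for any admissible `Phi`.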
The model (4.3) serves as motivation for estimators of the form (4.4) but we shall consider estimators where $\boldsymbol{\Phi}$ is a general symmetric positive definite weight matrix, not necessarily the covariance matrix of the errors.

We give the large sample properties of the vector of estimated regression coefficients (4.4) following Fuller (1975). See also Hidiroglou (1974), Scott and Wu (1981), and Robinson and Särndal (1983). Assume the superpopulation has eighth moments and that the sample design is such that the error in the Horvitz-Thompson estimator of the mean is $O_p(n^{-1/2})$, where the Horvitz-Thompson estimator of the mean is

$$\bar{\mathbf{z}}_{HT} = (\bar{y}_{HT}, \bar{\mathbf{x}}_{HT}) = N^{-1}\sum_{i \in A} \pi_i^{-1}\mathbf{z}_i. \tag{4.5}$$
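A minimal sketch of the Horvitz-Thompson estimator of the mean (4.5) follows, assuming NumPy, known inclusion probabilities for the sampled elements, and a known population size. The function name `horvitz_thompson_mean` is hypothetical.

```python
import numpy as np

def horvitz_thompson_mean(z, pi, N):
    """Horvitz-Thompson estimator of the mean (4.5).

    z  : sample values, shape (n,) or (n, p)
    pi : selection probabilities of the sampled elements, shape (n,)
    N  : population size
    """
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]                      # treat a single variable as one column
    pi = np.asarray(pi, dtype=float)
    return (z / pi[:, None]).sum(axis=0) / N
```

For simple random sampling without replacement, where every selection probability equals $n/N$, the estimator reduces to the simple sample mean.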
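Here $\pi_i$ is the selection probability for element $i$. Then the error in the vector of regression coefficients is

$$\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}_N \mid F_N = \mathbf{Q}_{xxN}^{-1}\bar{\mathbf{b}}_{HT} + O_p(n^{-1}), \tag{4.6}$$

where

$$\boldsymbol{\beta}_N = \mathbf{Q}_{xxN}^{-1}\mathbf{Q}_{xyN}, \tag{4.7}$$

$$(\mathbf{Q}_{xxN}, \mathbf{Q}_{xyN}) = E\{(\hat{\mathbf{Q}}_{xx}, \hat{\mathbf{Q}}_{xy}) \mid F_N\}, \qquad (\hat{\mathbf{Q}}_{xx}, \hat{\mathbf{Q}}_{xy}) = n^{-1}(\mathbf{X}'\boldsymbol{\Phi}^{-1}\mathbf{X}, \mathbf{X}'\boldsymbol{\Phi}^{-1}\mathbf{y}), \tag{4.8}$$

$$\bar{\mathbf{b}}_{HT} = N^{-1}\sum_{i \in A} \pi_i^{-1}\mathbf{b}_i, \tag{4.9}$$

$\mathbf{b}_i = n^{-1}N\pi_i\boldsymbol{\zeta}_i e_i$, $e_i = y_i - \mathbf{x}_i\boldsymbol{\beta}_N$, and $\boldsymbol{\zeta}_i$ is column $i$ of $\mathbf{X}'\boldsymbol{\Phi}^{-1}$. By (4.9) the error in the estimator of $\boldsymbol{\beta}_N$ is approximately the error in a Horvitz-Thompson estimator of the mean. In result (4.6), the $\boldsymbol{\beta}_N$ is defined as a function of the expected values of the sample quantities $(\hat{\mathbf{Q}}_{xx}, \hat{\mathbf{Q}}_{xy})$. Thus $\boldsymbol{\beta}_N$ is not necessarily the ordinary least squares finite population regression coefficient. The vector $\mathbf{b}_i$ of (4.9) is the generalization of the vector $\mathbf{b}_i$ of (3.3). If the limiting distribution of the properly standardized Horvitz-Thompson estimator is normal, and if there is a design consistent estimator of the variance of the Horvitz-Thompson estimator, then it is possible to construct tests and confidence intervals for the coefficients.

Assume the design is such that

$$\mathbf{V}_{\bar{z}\bar{z}}^{-1/2}(\bar{\mathbf{z}}_{HT} - \bar{\mathbf{z}}_N)' \mid F_N \xrightarrow{L} N(\mathbf{0}, \mathbf{I}), \tag{4.10}$$

as $N, n \to \infty$, where $\mathbf{V}_{\bar{z}\bar{z}}$ is the covariance matrix of $\bar{\mathbf{z}}_{HT} - \bar{\mathbf{z}}_N$. If $\mathbf{V}_{\bar{z}\bar{z}}$ is $O(n^{-1})$ and the estimator $\hat{\mathbf{V}}_{\bar{z}\bar{z}}$ is consistent for $\mathbf{V}_{\bar{z}\bar{z}}$, then

$$[\hat{V}\{\hat{\boldsymbol{\beta}}\}]^{-1/2}(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}_N) \mid F_N \xrightarrow{L} N(\mathbf{0}, \mathbf{I}), \tag{4.11}$$

where

$$\hat{V}\{\hat{\boldsymbol{\beta}}\} = \hat{\mathbf{Q}}_{xx}^{-1}\hat{\mathbf{V}}_{\bar{b}\bar{b}}\hat{\mathbf{Q}}_{xx}^{-1} = \hat{V}\{\bar{\mathbf{c}}_{HT}\}, \tag{4.12}$$

$\hat{\mathbf{V}}_{\bar{b}\bar{b}} = \hat{V}\{\bar{\mathbf{b}}_{HT}\}$ is the estimated design variance of $\bar{\mathbf{b}}_{HT}$ calculated with $\hat{\mathbf{b}}_i = n^{-1}N\pi_i\boldsymbol{\zeta}_i\hat{e}_i$, $\hat{e}_i = y_i - \mathbf{x}_i\hat{\boldsymbol{\beta}}$, and $\hat{V}\{\bar{\mathbf{c}}_{HT}\}$ is the estimated design variance of $\bar{\mathbf{c}}_{HT}$ calculated with $\hat{\mathbf{c}}_i = \hat{\mathbf{Q}}_{xx}^{-1}\hat{\mathbf{b}}_i$. The limiting properties hold for stratified samples and for stratified two-stage samples under mild restrictions on the sequence of populations.

By analogy to (3.7), a regression estimator of the finite population mean is obtained by evaluating the estimated regression function at the population mean of $\mathbf{x}$ to obtain

$$\bar{y}_{reg} = \bar{\mathbf{x}}_N\hat{\boldsymbol{\beta}}, \tag{4.13}$$

where $\hat{\boldsymbol{\beta}}$ is of the form (4.4) with a general $\boldsymbol{\Phi}$ matrix. The estimator can be written as $\mathbf{w}'\mathbf{y}$, where the vector of weights can be constructed by minimizing the Lagrangean

$$\mathbf{w}'\boldsymbol{\Phi}\mathbf{w} + (\mathbf{w}'\mathbf{X} - \bar{\mathbf{x}}_N)\boldsymbol{\lambda}$$

and $\boldsymbol{\lambda}$ is the vector of Lagrange multipliers.

If there is a column vector $\boldsymbol{\gamma}$ such that

$$\mathbf{X}\boldsymbol{\gamma} = \boldsymbol{\Phi}\mathbf{D}_\pi^{-1}\mathbf{J} \tag{4.14}$$

for all possible samples, where $\mathbf{D}_\pi = \text{diag}(\pi_1, \pi_2, \ldots, \pi_n)$ and $\mathbf{J}$ is an $n$-dimensional column vector of ones, then the regression estimator $\bar{\mathbf{x}}_N\hat{\boldsymbol{\beta}}$ of (4.13) with $\hat{\boldsymbol{\beta}}$ defined in (4.4) is a design consistent estimator of $\bar{y}_N$. It follows from (4.11) that

$$[\bar{\mathbf{x}}_N\hat{V}\{\hat{\boldsymbol{\beta}}\}\bar{\mathbf{x}}_N']^{-1/2}(\bar{\mathbf{x}}_N\hat{\boldsymbol{\beta}} - \bar{y}_N) \xrightarrow{L} N(0, 1). \tag{4.15}$$

The requirement of (4.14) that $\boldsymbol{\Phi}\mathbf{D}_\pi^{-1}\mathbf{J}$ be in the column space of $\mathbf{X}$ is crucial for design consistency. Simple ways to satisfy this requirement are to let one column of $\mathbf{X}$ be the column of ones and to use a multiple of $\mathbf{D}_\pi$ as $\boldsymbol{\Phi}$, or to let one column of $\mathbf{X}$ be the elements $\pi_i^{-1}$ and set $\boldsymbol{\Phi} = \mathbf{I}$, or to let one column of $\mathbf{X}$ be the elements $\pi_i$ and set $\boldsymbol{\Phi} = \mathbf{D}_\pi^2$. If $\mathbf{X}$ is composed of the single column vector with elements $\pi_i$ and if $\boldsymbol{\Phi} = \mathbf{D}_\pi^2$, then the estimator (4.13) reduces to the Horvitz-Thompson estimator of (4.5) for fixed size designs. If $\mathbf{X} = \mathbf{J}$ and $\boldsymbol{\Phi} = \mathbf{D}_\pi$, the estimator (4.13) reduces to the ratio estimator,

$$\bar{y}_\pi = \Big(\sum_{i \in A} \pi_i^{-1}\Big)^{-1}\sum_{i \in A} \pi_i^{-1} y_i, \tag{4.16}$$

which is location and scale invariant.

To see the nature of the estimator when (4.14) is satisfied, let, with no loss of generality, $\mathbf{X} = (\mathbf{x}_0, \mathbf{X}_1)$, where $\mathbf{x}_0 = \boldsymbol{\Phi}\mathbf{D}_\pi^{-1}\mathbf{J}$ and $\mathbf{x}_i = (x_{0,i}, \mathbf{x}_{1,i})$. Then

$$\bar{y}_{reg} = \bar{x}_{0,N}\bar{x}_{0,\pi}^{-1}\bar{y}_\pi + (\bar{\mathbf{x}}_{1,N} - \bar{x}_{0,N}\bar{x}_{0,\pi}^{-1}\bar{\mathbf{x}}_{1,\pi})\hat{\boldsymbol{\beta}}_1, \tag{4.17}$$

where

$$\hat{\boldsymbol{\beta}}_1 = [(\mathbf{X}_1 - \mathbf{x}_0\hat{\mathbf{m}}_{x1})'\boldsymbol{\Phi}^{-1}(\mathbf{X}_1 - \mathbf{x}_0\hat{\mathbf{m}}_{x1})]^{-1}(\mathbf{X}_1 - \mathbf{x}_0\hat{\mathbf{m}}_{x1})'\boldsymbol{\Phi}^{-1}\mathbf{y},$$

$\hat{\mathbf{m}}_{x1} = \bar{x}_{0,\pi}^{-1}\bar{\mathbf{x}}_{1,\pi}$, and $(\bar{y}_\pi, \bar{\mathbf{x}}_\pi)$ is defined in (4.16). The ratios, such as $\bar{x}_{0,\pi}^{-1}\bar{y}_\pi$, can also be written as ratios of Horvitz-Thompson estimators. If $\mathbf{J}$ is in the column space of $\mathbf{X}$, estimator (4.17) is location invariant. If $\boldsymbol{\Phi} = \mathbf{D}_\pi$, then $\bar{x}_{0,\pi}^{-1}\bar{x}_{0,N} = 1$, and

$$\bar{y}_{reg} = \bar{\mathbf{x}}_N\hat{\boldsymbol{\beta}} = \bar{y}_\pi + (\bar{\mathbf{x}}_{1,N} - \bar{\mathbf{x}}_{1,\pi})\hat{\boldsymbol{\beta}}_1, \tag{4.18}$$

where

$$\hat{\boldsymbol{\beta}}_1 = \Big[\sum_{i \in A} (\mathbf{x}_{1,i} - \bar{\mathbf{x}}_{1,\pi})'\pi_i^{-1}(\mathbf{x}_{1,i} - \bar{\mathbf{x}}_{1,\pi})\Big]^{-1}\sum_{i \in A} (\mathbf{x}_{1,i} - \bar{\mathbf{x}}_{1,\pi})'\pi_i^{-1}(y_i - \bar{y}_\pi). \tag{4.19}$$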
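The regression estimator of the mean with $\boldsymbol{\Phi} = \mathbf{D}_\pi$ can be sketched as follows. This is an illustration under stated assumptions, not the implementation of any survey package: NumPy is assumed, the first column of `X` is assumed to be a column of ones, and the function name `regression_estimator_mean` is hypothetical. The function returns the estimate (4.13), the coefficient vector, and the regression weights $w_i$.

```python
import numpy as np

def regression_estimator_mean(X, y, pi, xbar_N):
    """Regression estimator of the mean (4.13) with Phi = D_pi.

    X      : n x k sample matrix, first column assumed to be ones
    y      : sample values, shape (n,)
    pi     : selection probabilities, shape (n,)
    xbar_N : known population mean of the x vector, shape (k,)
    """
    d = 1.0 / np.asarray(pi, dtype=float)         # diagonal of D_pi^{-1}
    A = X.T @ (X * d[:, None])                    # X' D_pi^{-1} X
    beta = np.linalg.solve(A, X.T @ (d * y))      # (4.4) with Phi = D_pi
    w = (X * d[:, None]) @ np.linalg.solve(A, xbar_N)   # weights: y_reg = w'y
    return xbar_N @ beta, beta, w
```

The weights are calibrated, in the sense that `X.T @ w` reproduces `xbar_N`, and the estimate agrees with the form (4.18), $\bar{y}_\pi + (\bar{\mathbf{x}}_{1,N} - \bar{\mathbf{x}}_{1,\pi})\hat{\boldsymbol{\beta}}_1$.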
Also, when $\boldsymbol{\Phi} = \mathbf{D}_\pi$, the $\boldsymbol{\beta}_N$ of (4.7) is the population regression coefficient

$$\boldsymbol{\beta}_N = \Big(\sum_{i \in U} \mathbf{x}_i'\mathbf{x}_i\Big)^{-1}\sum_{i \in U} \mathbf{x}_i' y_i. \tag{4.20}$$

Because the regression estimator of the mean is a linear combination of regression coefficients, it is a regression coefficient for a linear combination of the original $x$-variables. To see this, let $\mathbf{x}_i = (x_{0,i}, \mathbf{x}_{1,i}) = (1, \mathbf{x}_{1,i})$, and define a new vector with one in the first position and a second vector with population mean equal to zero obtained by subtracting the original population mean $\bar{\mathbf{x}}_{1,N}$ from the original $\mathbf{x}_{1,i}$ vector. Let $\mathbf{q}_i = (1, \mathbf{x}_{1,i} - \bar{\mathbf{x}}_{1,N})$ be the transformed vector. Then the transformed regression model is

$$y_i = \mathbf{q}_i\boldsymbol{\gamma} + e_i, \tag{4.21}$$

where the finite population coefficient vector is

$$\boldsymbol{\gamma}_N = (\bar{y}_N, \boldsymbol{\beta}_{1,N}')' = \Big(\sum_{i \in U} \mathbf{q}_i'\mathbf{q}_i\Big)^{-1}\sum_{i \in U} \mathbf{q}_i' y_i. \tag{4.22}$$

The expression for the regression estimator of the mean becomes

$$\bar{y}_{reg} = \bar{\mathbf{q}}_N\hat{\boldsymbol{\gamma}} = \hat{\gamma}_0, \tag{4.23}$$

where $\hat{\boldsymbol{\gamma}}$ is obtained from (4.4) with $\mathbf{q}_i$ replacing $\mathbf{x}_i$. Because the estimator is a linear estimator of the form $\mathbf{w}'\mathbf{y}$ we can write

$$\bar{y}_{reg} = \sum_{i \in A} w_i y_i = \sum_{i \in A} \pi_i^{-1} g_i y_i, \tag{4.24}$$

where $w_i = \pi_i^{-1} g_i$. Furthermore, the estimated variance from (4.12) is

$$\hat{V}\{\bar{y}_{reg}\} = \hat{V}\{\hat{\gamma}_0\} = \hat{V}\Big\{\sum_{i \in A} \pi_i^{-1}(g_i\hat{e}_i)\Big\}, \tag{4.25}$$

where it is understood that the estimated design variance of (4.25) is computed for the variable $g_i\hat{e}_i$, $\hat{e}_i = y_i - \mathbf{x}_i\hat{\boldsymbol{\beta}}$, and $\hat{\boldsymbol{\beta}}$ is defined in (4.4). The variance estimator (4.25) is a direct generalization of expression (3.5). By transforming the variables so that the population mean of the auxiliary vector is zero, the first element of the regression vector is the regression estimator of the mean and the first element of (4.12) is an estimator of the variance of the regression estimator that contains a component due to estimating $\boldsymbol{\beta}$. This was pointed out in Hidiroglou, Fuller, and Hickman (1978). Also, see Särndal (1982). Särndal, Swensson and Wretman (1989) suggested the $g$-factor terminology for the calculation of the estimated variance of a regression estimated total.

From (4.17), we can write

$$\bar{y}_{reg} - \bar{y}_N = \bar{x}_{0,N}\bar{x}_{0,\pi}^{-1}(\bar{y}_\pi - \bar{\mathbf{x}}_{1,\pi}\boldsymbol{\beta}_{1,N}) - (\bar{y}_N - \bar{\mathbf{x}}_{1,N}\boldsymbol{\beta}_{1,N}) + O_p(n^{-1}) = \bar{e}_\pi + O_p(n^{-1}),$$

where $e_i = y_i - \mathbf{x}_i\boldsymbol{\beta}_N$. Hence, the variance of the regression estimator can be estimated with

$$\hat{V}\{\bar{e}_\pi\} = \hat{V}\Big\{\Big(\sum_{i \in A} \pi_i^{-1}\Big)^{-1}\sum_{i \in A} \pi_i^{-1}\hat{e}_i\Big\}, \tag{4.26}$$

where $\hat{e}_i = y_i - \mathbf{x}_i\hat{\boldsymbol{\beta}}$. Because (4.25) is as easy to compute as (4.26), and is applicable when $\bar{\mathbf{x}}_{1,\pi} - \bar{\mathbf{x}}_{1,N}$ is not $O_p(n^{-1/2})$, the estimator (4.25) is recommended.

The variance of the regression estimator can also be computed using the jackknife or other replication methods, and the use of replication methods is becoming more common. See Frankel (1971), Kish and Frankel (1974), Woodruff and Causey (1976), Royall and Cumberland (1978), and Duchesne (2000). Yung and Rao (1996) showed that (4.25) is identical to a jackknife linearization estimator for stratified multistage designs.

The approach to regression estimation associated with (4.18) and (4.19) falls completely within a design formulation. No models of the population, beyond the existence of moments, are used, though one might argue that one would only consider regression when one feels there is some linear correlation between $\mathbf{x}_{1,i}$ and $y_i$.

The estimator (4.19) is a very natural estimator because the estimated regression coefficient is a design consistent estimator of the population regression coefficient. It is mildly annoying that (4.18) does not always yield the smallest large sample design variance for the estimated mean. Treating $\hat{\boldsymbol{\beta}}_1$ of (4.18) as a fixed vector, the value that minimizes the variance of the linear combination of means is

$$\boldsymbol{\beta}_{1,dopt} = [V\{\bar{\mathbf{x}}_{1,\pi} \mid F_N\}]^{-1}C\{\bar{\mathbf{x}}_{1,\pi}, \bar{y}_\pi \mid F_N\}. \tag{4.27}$$
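The two variance estimators (4.25) and (4.26) can be compared numerically in the simplest design. The sketch below assumes simple random sampling without replacement, so that $\pi_i = n/N$, $\boldsymbol{\Phi} = \mathbf{D}_\pi$ gives ordinary least squares, and the estimated design variance of a Horvitz-Thompson total of a variable $t_i$ is $N^2(1 - f)s_t^2/n$. Under that assumption (4.25) becomes $(1 - f)\,n\,s^2_{w\hat{e}}$, where $s^2_{w\hat{e}}$ is the sample variance of $w_i\hat{e}_i$, and (4.26) becomes $(1 - f)s^2_{\hat{e}}/n$. The function name `srs_regression_variances` is hypothetical, and the first column of `X` is assumed to be ones.

```python
import numpy as np

def srs_regression_variances(X, y, N, xbar_N):
    """Regression estimate of the mean with variance estimators (4.25), (4.26).

    Assumes simple random sampling without replacement from N elements.
    """
    n = len(y)
    f = n / N                                        # sampling fraction
    beta = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS, since pi_i is constant
    resid = y - X @ beta                             # e_hat_i
    w = X @ np.linalg.solve(X.T @ X, xbar_N)         # regression weights w_i
    v_425 = (1.0 - f) * n * np.var(w * resid, ddof=1)    # (4.25) under SRS
    v_426 = (1.0 - f) * np.var(resid, ddof=1) / n        # (4.26) under SRS
    return xbar_N @ beta, v_425, v_426
```

When the sample mean of the auxiliary variables happens to equal the population mean, every weight equals $1/n$ and the two estimators coincide. They differ as $\bar{\mathbf{x}}_{1,\pi}$ moves away from $\bar{\mathbf{x}}_{1,N}$.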
See Cochran (1977, page 201), Fuller and Isaki (1981), Montanari (1987, 1999) and Rao (1994). If there is a design consistent estimator of the variance of $\bar{\mathbf{x}}_{1,\pi}$, then the $\boldsymbol{\beta}_{1,d}$ that minimizes the estimated variance

$$\hat{V}\{\bar{y}_\pi - \bar{\mathbf{x}}_{1,\pi}\boldsymbol{\beta}_{1,d}\}, \tag{4.28}$$

denoted by $\hat{\boldsymbol{\beta}}_{1,dopt}$, is a consistent estimator of $\boldsymbol{\beta}_{1,dopt}$. It follows that the estimator

$$\bar{y}_{d,reg} = \bar{y}_\pi + (\bar{\mathbf{x}}_{1,N} - \bar{\mathbf{x}}_{1,\pi})\hat{\boldsymbol{\beta}}_{1,dopt} \tag{4.29}$$

has the minimum limit variance for design consistent estimators of the form $\bar{y}_\pi + (\bar{\mathbf{x}}_{1,N} - \bar{\mathbf{x}}_{1,\pi})\boldsymbol{\beta}_{1,d}$. Also

$$[\hat{V}\{\bar{e}_\pi\}]^{-1/2}(\bar{y}_{d,reg} - \bar{y}_N) \xrightarrow{L} N(0, 1), \tag{4.30}$$

where $\hat{V}\{\bar{e}_\pi\}$ is the estimator of (4.26) constructed with $\hat{e}_i = y_i - \bar{y}_\pi - (\mathbf{x}_{1,i} - \bar{\mathbf{x}}_{1,\pi})\hat{\boldsymbol{\beta}}_{1,dopt}$.

In a large sample sense, (4.29) answers the question of how to construct a regression estimator with optimum design properties. In practice a number of questions remain. The estimator is obtained under the assumption of a large sample and a vector $\mathbf{x}$ of fixed dimension. In practice there may be a number of potential auxiliary variables and if a large number are included in the regression, terms excluded in the large-sample approximation become important. This is particularly true for cluster samples where the number of primary sampling units in the sample is small. In such cases, the number of degrees of freedom in $\hat{V}\{\bar{\mathbf{x}}_{1,\pi}\}$ is small and the inverse can be unstable. These issues are discussed further in section 9.

The estimator $\hat{\boldsymbol{\beta}}_{1,dopt}$ of (4.29) is linear in $y$ for most designs. See Rao (1994). For example, for a stratified design with simple random sampling within strata,

$$\hat{C}\{\bar{\mathbf{x}}_{1,\pi}, \bar{y}_\pi\} = \sum_{h=1}^{H}\sum_{j=1}^{n_h} K_h(\mathbf{x}_{1,hj} - \bar{\mathbf{x}}_{1,h})'(y_{hj} - \bar{y}_h), \tag{4.31}$$

where

$$K_h = W_h^2(1 - f_h)(n_h - 1)^{-1}n_h^{-1} = N^{-2}\pi_h^{-2}(1 - f_h)(n_h - 1)^{-1}n_h,$$

$N^{-1}N_h = W_h$, $N_h$ is the size of stratum $h$, $f_h = \pi_h = N_h^{-1}n_h$, and $n_h$ is the sample size in stratum $h$. It follows that the weights associated with estimator (4.29) are

$$w_{hi} = N^{-1}\pi_h^{-1} + (\bar{\mathbf{x}}_{1,N} - \bar{\mathbf{x}}_{1,\pi})\Big[\sum_{t=1}^{H}\sum_{j=1}^{n_t} K_t(\mathbf{x}_{1,tj} - \bar{\mathbf{x}}_{1,t})'(\mathbf{x}_{1,tj} - \bar{\mathbf{x}}_{1,t})\Big]^{-1}K_h(\mathbf{x}_{1,hi} - \bar{\mathbf{x}}_{1,h})'. \tag{4.32}$$

See also Särndal (1996). The weights of (4.32) can be constructed by minimizing $\sum_{hi \in A} w_{hi}^2K_h^{-1}$ subject to the constraints

$$\sum_{i \in A_h} w_{hi} = N^{-1}N_h, \quad h = 1, 2, \ldots, H,$$

and

$$\sum_{hi \in A} w_{hi}\mathbf{x}_{1,hi} = \bar{\mathbf{x}}_{1,N},$$

where $A_h$ is the set of sample elements in stratum $h$.

The estimator of (4.19) with $\boldsymbol{\Phi} = \mathbf{D}_\pi$ is a function of Horvitz-Thompson estimators of population moments. The estimator (4.17) with $\boldsymbol{\Phi}^{-1} = \text{diag}\{K_t\}$, the diagonal matrix with $K_t$ on the diagonal for elements in stratum $t$, and dummy variables for stratum effects, gives the estimator of the mean in the class

$$\bar{y}_{reg} = \bar{y}_\pi + (\bar{\mathbf{x}}_{1,N} - \bar{\mathbf{x}}_{1,\pi})\hat{\boldsymbol{\beta}}_1$$

with the smallest estimated design variance. If the true slopes in the strata are the same and if the selection probabilities are proportional to the square roots of the within-stratum variances, then the use of $\boldsymbol{\Phi} = \mathbf{D}_\pi^2$ gives a smaller small sample MSE than the use of $\boldsymbol{\Phi}^{-1} = \text{diag}\{K_t\}$ because the sum of $w_{hi}^2\sigma_h^2$ is smaller. Fuller and Isaki (1981) noted that the design-optimum estimator is often well approximated by the estimator constructed with $\boldsymbol{\Phi} = \mathbf{D}_\pi^2$.

We have introduced regression estimation for the mean, but it is often the totals that are estimated and totals that are used as controls. Consider the regression estimator of the total of $y$ defined by

$$\hat{T}_{y,reg} = \hat{T}_{y,\pi} + (\mathbf{T}_{x,N} - \hat{\mathbf{T}}_{x,\pi})\hat{\boldsymbol{\beta}}_{yx}, \tag{4.33}$$

where $\mathbf{T}_{x,N}$ is the known total of $\mathbf{x}$ and $(\hat{T}_{y,\pi}, \hat{\mathbf{T}}_{x,\pi})$ is a vector of design consistent estimators of $(T_{y,N}, \mathbf{T}_{x,N})$. By analogy to (4.28), the estimator of the optimum $\boldsymbol{\beta}_{yx}$ is

$$\hat{\boldsymbol{\beta}}_{yx} = [\hat{V}\{\hat{\mathbf{T}}_{x,\pi}\}]^{-1}\hat{C}\{\hat{\mathbf{T}}_{x,\pi}, \hat{T}_{y,\pi}\}, \tag{4.34}$$

where $\hat{V}\{\hat{\mathbf{T}}_{x,\pi}\}$ is a design consistent estimator of the variance of $\hat{\mathbf{T}}_{x,\pi}$ and $\hat{C}\{\hat{\mathbf{T}}_{x,\pi}, \hat{T}_{y,\pi}\}$ is a design consistent estimator of the covariance of $\hat{\mathbf{T}}_{x,\pi}$ and $\hat{T}_{y,\pi}$.

The estimator of the total is $N\bar{y}_{reg}$ for simple random sampling, but the exact equivalence may not hold in more complicated samples, because in such situations the estimated mean may be a ratio estimator. However, if the regression estimator of the two totals is constructed using (4.34), the ratio of the two estimated totals has large sample variance equal to that of the regression estimator of the mean. To see this write the error in the regression estimated totals of $y$ and $u$ as

$$\hat{T}_{y,reg} - T_{y,N} = \hat{T}_{y,\pi} - T_{y,N} + (\mathbf{T}_{x,N} - \hat{\mathbf{T}}_{x,\pi})\boldsymbol{\beta}_{yx,N} + O_p(Nn^{-1})$$

and

$$\hat{T}_{u,reg} - T_{u,N} = \hat{T}_{u,\pi} - T_{u,N} + (\mathbf{T}_{x,N} - \hat{\mathbf{T}}_{x,\pi})\boldsymbol{\beta}_{ux,N} + O_p(Nn^{-1}), \tag{4.35}$$

where we are assuming $\hat{T}_{y,\pi} - T_{y,N}$, $\hat{\boldsymbol{\beta}}_{yx} - \boldsymbol{\beta}_{yx,N}$ and the corresponding quantities for $u$, to be $O_p(Nn^{-1/2})$ and $O_p(n^{-1/2})$, respectively. Then the error in $\hat{T}_{u,reg}^{-1}\hat{T}_{y,reg}$ is

$$\hat{T}_{u,reg}^{-1}\hat{T}_{y,reg} - T_{u,N}^{-1}T_{y,N} = T_{u,N}^{-1}\big[(\hat{T}_{y,\pi} - T_{y,N}) - R_N(\hat{T}_{u,\pi} - T_{u,N}) + (\mathbf{T}_{x,N} - \hat{\mathbf{T}}_{x,\pi})(\boldsymbol{\beta}_{yx,N} - R_N\boldsymbol{\beta}_{ux,N}) + O_p(Nn^{-1})\big]. \tag{4.36}$$
12 where R N =T-1 Ty, N . If we construct the regression u N , estimatorfor RN startingwith R =Tu-, 1 Ty,p , wehave p
Rreg = R + (Tx , N - T,p ) R x, x
where G AA = S eeAA S-1 , x N - n = ( N - n ) -1( N x N -n x ) , eeAA n , SeeAA = E{e A eA}

eeAA ) -1 = (X S-1 X -1X S eeAA y, eA =( n+1,e + 2,..., e ), JN- n is an N -n dimensional e n N columnvectorofones, x isthesimplesamplemean,and n A isthesetofelementsinUthatarenotinA.SeeRoyall (1976). Underthemodel, q - y N = C ( - ) + N -1 J -n (GAA e A -eA) N xA
(4.37)
where
R x =[V {Tx , p }] C{T,p , R} x
-1
and
C{Tx , p , R} = C{T, p , Tu-, 1 (Ty , p -RN Tu,p )}. x N
Itfollowsthatthelargesampledesignoptimumcoefficient for the ratio is Tu,1 ( y x, N -R u x, N ) and the ratio of N N designoptimum regression estimators is the large sample designoptimumregressionestimatoroftheratio.
and
V {q - y N | X A} = C V{}C xA xA + N -2 J -n (S eeA A - G AA S ) JN - n,(5.4) N eeAA
5. Modelsand Regression Estimation

In this section we assume that the analyst postulates a detailed superpopulation model. Assume also that the sample is an unequal probability sample or (and) the specified errorcovariancestructureis nota multiple ofthe identity matrix.Then, only inspecial cases willthe design optimal estimator of (4.29) agree with the best estimator constructed under the model, conditioning on the sample x-values. To investigate this possible conflict, write the modelforthepopulationinmatrixnotationas
y U = X + e U U e ~ ( ,SeeUU), 0 U
where
C xA = N -1[( N - n) x N -n - J -n GAA XA] . N
(5.1)
Design consistency of estimator (5.3) andthe situations in which the model estimator reduces to the Horvitz Thompson estimator have been considered by, among others,Isaki(1970),Royall(1970,1976), ScottandSmith (1974),Cassel, Srndal,andWretman(1976,1979,1983), Zyskind (1976), Tallis (1978), Isaki and Fuller (1982), Wright (1983), Pfefferman (1984), Tam (1986), Brewer, Hanif and Tam (1988), Montanari (1999), and Gerow and McCulloch(2000). The estimator (5.3) reduces to x if there is an h N suchthat
XA h= SeeAAJn + S eeAA JN-n,
where y =(y , y ,..., yN ), e =( 1 ,e ,..., e ) and e 2 U 1 2 U N 2 X =( 1 ,x ,...,xN). Itisassumed that SeeUU is known x U or known up to a multiple. The model for a sample ofn observationis y A = XA + eA, eA ~ ( ,SeeAA), 0 where yA =(y , y ,...,y ), e A =( 1 ,e ,..., e ), XA = e 2 1 2 n n ( ,x2,...,xn), and we index the sample elements by 1, x 1 2, ..., n forconvenience.WehaveusedthesubscriptUto , identifypopulationquantities,andthesubscript A toidentify samplequantities,butwewilloftenomitthesubscriptAto simplify the notation. For example, we may sometimes write the n n covariance matrix as See. The unknown finitepopulationmeanis y N = x + e . N N
(5.5)
forallsampleswith positiveprobability.Ifthereisalso g suchthat

XA = SeeAAD-1 Jn p
(5.6)
forallsampleswithpositiveprobability,then q of(5.3)is designconsistent,where Dp wasdefinedfor(4.14).Given a k suchthat XA k =SeeAA( -1Jn - Jn)S eeAA JN-n, Dp
(5.7)
then q of(5.3)isexpressibleas
q = yp + ( x N -x ) p
(5.8)
(5.2) and if the design is such that yp is design consistent for y N, q of(5.8)isdesignconsistentfor y . N Wecallaregression model of the form(5.1) forwhich (5.5)and(5.6),or(5.7),holdsafullmodel.If(5.6)or(5.7) does not hold, we call the model a reduced model or a restrictedmodel.Wecannotexpecttheconditionsforafull model to hold for every analysis variable in a general purpose survey because See will be different for different
Undermodel(5.1),thebestlinear,conditionallyunbiased predictorof q N = yN, conditionalon X is

yi + ( N - n x - n ) N , = N -1 iA q ) + J N - n GAA( y A - XA
(5.3)
13
E{ yreg - yN } = E{E [ yreg - y | H ]} N = E {(0, qreg - q N ) y h = 0, &
ys.Therefore,givenareducedmodel,onemightsearchfor a good model estimator in the class of design consistent estimators. To construct a design consistent estimator of the form x whenmodel(5.1)isareducedmodel,wecanadda N vector satisfying (5.7) to the X matrix to create a full model.Therearetwopossiblesituationsassociatedwiththis approach.Inthefirst,thepopulationmean(ortotal)ofthe added variable is known. With known mean, one can constructtheusualregressionestimatorand theusualdesign varianceestimationformulasareappropriate. To describean estimation procedure forthesituationin which the population mean of the added variable is not known, let q=( 1 , q , ..., q ) denote the added vector, q 2 n where q is the vector on the right side of the equality in (5.7).Let H =( , q , whereXisthe matrix ofauxiliary X ) variables with known population mean vector, x . We N writethefullmodelforthesampleas
y = Zyh + e ,
(5.14)
where y isdefinedin(5.12)andtheapproximationisdue reg to the approximate design expectation of the regression estimator q . reg The estimator (5.13) is a linear estimator, where the vectorofweights, w,minimizestheLagrangean w See w + [w H -( x , qreg)] . (5.15) N Theestimatorislocationinvariantifthecolumnofonesis inthecolumnspaceof X. Because the variable q is the variable whose omission fromthefullmodelcanproduceabias,itseemsprudentto testthecoefficient ofqbefore usingthereduced modelto constructanestimatorforthemeanofy.Thiscanbedone usingamodelestimatorofthevariance, V { | H} = ( H S-1 H) 1
yh ee
(5.9)
where e~(0 See). Thebestlinearconditionally unbiased , estimatorof y h is yh =( See H)-1H S -1 y H -1 (5.10) ee . If the coefficient for q in (5.9) is not zero, it is not possible to construct a conditionally unbiased estimator of hN yh because the q component of hN is unknown. N However,because y h isunbiasedfor y h , itispossible toconstructaconditionallyunbiasedestimatorofanylinear functionof yh. Thus,itisnaturaltoreplacetheunknown q with the best available estimator of q , and a N N reasonablechoiceistheregressionestimator,
qreg = qp + ( x N -x ) x, p q
(5.11)
where q x =( X S ee1 X) -1 X See1 q. Then the estimator (5.3)becomes
or using the design estimator of variance of (4.12). See Du MouchelandDuncan(1983)andFuller(1984). A working specification for See may be particularly appropriatefortwostagesamples,seeRoyall(1976,1986) andMontanari(1987).Areasonablemodelisthatinwhich there is common correlation among items in the same primarysamplingunitandzerocorrelationbetweenunitsin different primary sampling units. Because the associated See is block diagonal of a particular form, it is relatively easy to invert and hence the estimator based on such a working is relatively easy to construct. The regression estimatorusinga withanonzerocorrelationforunitsin the same primary sampling unit is a combination of the estimator based on primary sampling unit totals and that basedonelements.SeeFullerandBattese(1973).Thus,the use of such a can avoid variance problems associated withtheuseofprimarysamplingunittotals.
q = yp + [( x N , qreg) -( x , qp )]y h p
(5.12)
The estimator (5.12) can be expressed in the familiar regressionestimatorform,

yreg = yp + ( x N -x ) y x. p
6. Maximum Likelihood and Raking Ratio

The theoretical foundation for the regression estimators discussedinsection3andsection4ismaximumlikelihood estimationforthelinearmodelwithnormalerrors.Wenow consider the likelihood for multinomial variables. Given a simple randomsample from a multinomial defined by the entriesinatwo waytable, thelogarithm ofthelikelihood, exceptforaconstant,is
r c
(5.13)
That is, the regression estimator of the finite population meanof ybasedonthefullmodel,butwiththemeanof q i unknownandestimatedwiththeregressionestimator,isthe regressionestimatorwith yx estimatedbythegeneralized least squares regression of y on x using the covariance matrix See. SeePark(2002).Theestimatorisconditionally modelunbiasedunderthereducedmodelcontainingonlyx ifthereducedmodelistrue.Ifthepopulationcoefficientfor q is not zero, the reduced model is not true. Then the i estimatorisconditionallymodelbiased,buttheestimatoris unbiasedforthefinitepopulationmeanunderthefullmodel andanunbiaseddesign,because
aijlogpij ,
i=1 j= 1
(6.1)
where a is the estimated fraction in cell ij, p is the ij ij populationfractionincell ij, r isthenumberofrows,and c isthe numberofcolumns.If(6.1)is maximizedsubjectto the restriction pij =1 , one obtains the maximum
likelihood estimators $\hat{p}_{ij} = a_{ij}$. If the marginal row fractions $p_{i\cdot,N}$ and the marginal column fractions $p_{\cdot j,N}$ are known, it is natural to maximize the likelihood subject to these constraints by using the Lagrangean

$$\sum_{i=1}^{r}\sum_{j=1}^{c} a_{ij}\log p_{ij} + \sum_{i=1}^{r}\lambda_i\left(\sum_{j=1}^{c} p_{ij} - p_{i\cdot,N}\right) + \sum_{j=1}^{c}\lambda_{r+j}\left(\sum_{i=1}^{r} p_{ij} - p_{\cdot j,N}\right), \tag{6.2}$$

where $\lambda_i$, i = 1, 2, ..., r, are for the row restrictions and $\lambda_{r+j}$, j = 1, 2, ..., c, are for the column restrictions. There is no explicit expression for the solution to (6.2) and there may be no solution if there are too many empty cells. A procedure that produces estimates close to the maximum likelihood solution is that called raking ratio or iterative proportional fitting. The procedure iterates, first making ratio adjustments for the row restrictions, then making ratio adjustments for the column restrictions, then making ratio adjustments for the row restrictions, etc. The method is generally credited to Deming and Stephan (1940). See, for example, Bishop, Fienberg and Holland (1975, Chapter 3).

Deville and Särndal (1992) considered a class of objective functions of the form $\sum_{i \in A} G(w_i, a_i)$, where $G(w, a)$ is a measure of distance between an initial weight $a_i$ and a final weight $w_i$. The objective function is minimized subject to the constraints

$$\sum_{i \in A} w_i x_i = \bar{x}_N. \tag{6.3}$$

Deville and Särndal (1992) used the term calibrated to describe weights satisfying (6.3). If the initial weight is $a_i = (\sum_{j \in A}\pi_j^{-1})^{-1}\pi_i^{-1}$ and if one is the first element of x, the solution to the minimization problem is approximated by a regression estimator of the mean of the form

$$\bar{y}_{reg} = \bar{y}_\pi + (\bar{x}_N - \bar{x}_\pi)\hat{\beta}, \tag{6.4}$$

where

$$\hat{\beta} = \left(\sum_{i \in A} x_i'\phi_{ii}^{-1}x_i\right)^{-1}\sum_{i \in A} x_i'\phi_{ii}^{-1}y_i,$$

and $\phi_{ii}$ is the second derivative of $G(w, a)$ with respect to w evaluated at $(w, a) = (a_i, a_i)$. Using this approach, Deville and Särndal (1992) showed that the maximum likelihood and raking ratio estimators have the same limiting distribution as the regression estimator (4.18) with $\Phi = D_\pi$. To obtain the raking ratio weights they used the objective function

$$\sum_{i \in A}[w_i\log(a_i^{-1}w_i) + a_i - w_i], \tag{6.5}$$

and to obtain the maximum likelihood weights they used the objective function

$$\sum_{i \in A}[w_i - a_i - a_i\log(a_i^{-1}w_i)]. \tag{6.6}$$

Deville, Särndal and Sautory (1993) investigated four estimators in the class. Although weights constructed using different functions could differ considerably, the authors concluded that estimates were quite similar, a result consistent with the theory. Singh and Mohl (1996) and Théberge (1999, 2000) discuss estimators with the calibration property.

7. Population of Auxiliary Vectors Known at Estimation Step

If the x vector is known for all of the population elements, the number of possible regression type estimators is greatly expanded. Most procedures involve the fitting of an approximating function for the relationship between y and the auxiliary variables. The most used procedure is to assign the population elements to categories on the basis of the auxiliary data and to use these categories as post strata. This procedure is equivalent to approximating the expected value of y given x by a step function. The estimator is formally equivalent to the regression estimator (4.19) where the x vector is a vector of indicator variables for post stratum membership.

The application of the procedure often requires the development of criteria to use in forming the post strata. Typically the post strata are formed so that each post stratum contains a minimum number of sample elements and so that the weights for any post stratum are not overly large. Estimation with post strata and the formation of post strata have been studied by Fuller (1966), Holt and Smith (1979), Tremblay (1986), Kalton and Maligalig (1991), Little (1993), Eltinge and Yansaneh (1997), and Lazzeroni and Little (1998), among others. Holt and Smith (1979) argued for the use of a conditional variance estimator for poststratification.

Given the population of x vectors, one can use the sample to estimate a functional relationship between y and x and then predict the unobserved y. If the procedure is to be design consistent, then a condition similar to (4.14) must hold. One way to ensure design consistency is to require the fitted model to satisfy

$$\sum_{i \in A}\pi_i^{-1}[y_i - f(x_i, \hat{\theta})] = 0, \tag{7.1}$$

where $f(x_i, \hat{\theta})$ is the model estimated value for the ith observation.

Firth and Bennett (1998) pointed out that some nonlinear models satisfy (7.1). If the initial model does not satisfy (7.1), an estimated intercept term can be added to create an expanded full model,

$$\tilde{f}_F(x_i) = f(x_i, \hat{\theta}) + \left(\sum_{j \in A}\pi_j^{-1}\right)^{-1}\sum_{j \in A}\pi_j^{-1}[y_j - f(x_j, \hat{\theta})].$$
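The intercept correction can be illustrated with a short numerical sketch. The code below is not taken from the paper: the data values, the selection probabilities and the fitted function `f` are hypothetical, and `expanded_model` is an illustrative helper name. It adds the weighted mean residual to every fitted value and checks that the corrected fit satisfies the weighted residual condition (7.1).

```python
import math


def expanded_model(y, x, pi, f):
    """Add an estimated intercept to the fitted function f.

    The shift is the pi-inverse weighted mean of the sample residuals,
    so the corrected fitted values satisfy condition (7.1).
    Returns (corrected fitted values on the sample, shift).
    """
    weighted_resid = sum((yi - f(xi)) / p for yi, xi, p in zip(y, x, pi))
    weight_sum = sum(1.0 / p for p in pi)
    shift = weighted_resid / weight_sum
    return [f(xi) + shift for xi in x], shift


def weighted_residual_sum(y, fitted, pi):
    """Left side of (7.1): sum over the sample of (y_i - fitted_i) / pi_i."""
    return sum((yi - fi) / p for yi, fi, p in zip(y, fitted, pi))


# Hypothetical sample: responses, one auxiliary variable, selection probabilities.
y = [2.1, 3.9, 6.2, 8.1]
x = [1.0, 2.0, 3.0, 4.0]
pi = [0.1, 0.2, 0.25, 0.5]


def f(v):
    """Assumed already-fitted nonlinear model (illustrative only)."""
    return math.exp(0.5 * v)


before = weighted_residual_sum(y, [f(v) for v in x], pi)
fitted, shift = expanded_model(y, x, pi, f)
after = weighted_residual_sum(y, fitted, pi)
```

Here `before` is nonzero because the assumed model does not satisfy (7.1), while `after` is zero up to floating point error.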
The expanded model is a direct extension of the ideas of difference estimation to the nonlinear case. See Isaki (1970), Cassel, Särndal and Wretman (1976) and Wright (1983). A closely related approach was suggested by Wu and Sitter (2001) in which the fitted function $f(x_i, \hat{\theta})$ is used as the auxiliary variable in a linear regression estimator.

A number of local procedures, other than step functions, can be used to approximate the functional relationship between x and y. Spline functions and polynomials are linear models that fall within the class of section 4. Estimators that use some kind of local smoothing to estimate population quantities have been considered for finite populations from a model viewpoint by Kuo (1988), Dorfman (1993), Dorfman and Hall (1993), Chambers (1996), and Chambers, Dorfman and Wehrly (1993). Breidt and Opsomer (2000) showed that estimators based on local polynomial regression are design consistent. Firth and Bennett (1998) also considered local fit models.

8. Regression Estimation and Nonresponse

Regression estimation is frequently a part of procedures used to adjust data for unit nonresponse. Regression can be justified on the basis of a model such as (3.1) or on the basis that regression can adjust for unequal response probabilities. See Cassel, Särndal and Wretman (1979, 1983), Little (1982, 1986), Bethlehem (1988), Kott (1994), Fuller, Loughin and Baker (1994) and Fuller and An (1998).

Consider an estimator of the population regression vector of the form (4.4) with $\Phi = D_\pi$ constructed with the responding units. Denote the estimator by $\tilde{\beta}$ and let $p_i$ be the conditional probability of observing unit i given that the unit is selected for the sample. Then under regularity conditions, the estimator $\tilde{\beta}$ is a consistent estimator of

$$\gamma_N = \left(\sum_{i \in U} x_i' p_i x_i\right)^{-1}\sum_{i \in U} x_i' p_i y_i. \tag{8.1}$$

The population mean of y can be expressed as

$$\bar{y}_N = \bar{x}_N\gamma_N + \bar{a}_N, \tag{8.2}$$

where $a_i = y_i - x_i\gamma_N$ and $\bar{a}_N$ is the population mean of the $a_i$. The regression estimator $\tilde{y}_{reg} = \bar{x}_N\tilde{\beta}$ will be consistent for $\bar{y}_N$ if the probability limit of $\bar{a}_N$ is zero. The probability limit of $\bar{a}_N$ will be zero if the sequence of finite populations is a sequence of random samples from an infinite population in which

$$y_i = x_i\beta + e_i, \tag{8.3}$$

and the $e_i$ in the sample are independent of $x_i$ with $E\{e_i \mid x_i\} = 0$.

Alternatively, a sufficient condition for $\bar{a}_N$ to be zero is the existence of a column vector $\lambda_x$ such that

$$x_i\lambda_x = p_i^{-1} \tag{8.4}$$

for i = 1, 2, ..., N. Thus, if the reciprocal of the response probability is a linear function of the control variables, the regression estimator is a consistent estimator of the mean of y. One way in which (8.4) can be satisfied is for the elements of $x_i$ to be dummy variables that define subgroups and for the response probabilities to be constant in each subgroup.

If (8.4) holds and if the probability of responding is independent from unit to unit, then the estimated variance based on (4.12) is an appropriate estimator for the variance of the regression estimator of the mean. It is particularly important that a variance estimator of the form (4.12) or (4.25), and not of the form (4.26), be used, because $\bar{x}_N - \bar{x}_\pi$ is, in general, not $O_p(n^{-1/2})$ in the presence of nonresponse. Singh and Folsom (2000) make a similar argument for the variance estimator (4.25) when using regression to adjust for coverage error.

Often a preliminary adjustment to the selection probabilities is made for nonresponse and this is followed by regression estimation. The most frequently used response adjustment is to form adjustment cells (post strata) and to ratio adjust the weights of respondents in the cell so that the sum of the weights is equal to the estimated (or known) total for the cell. See, for example, Little and Rubin (1987, page 250). Procedures using an estimated response probability function are discussed by Cassel, Särndal and Wretman (1983), Rosenbaum and Rubin (1983), Folsom and Witt (1994), Fuller and An (1998), and Folsom and Singh (2000). Brick, Waksberg and Keeter (1996) use an estimated contact probability to adjust for frame coverage.

To consider procedures based on estimated response probabilities, assume that the inverse of the response probability for individual i is given by

$$p_i^{-1} = g(z_i, \theta_0), \tag{8.5}$$

where $z_i$ is a vector of variables that can be observed for both respondents and nonrespondents, $\theta_0$ is the true value of $\theta$, and $g(z_i, \theta)$ is continuous in $\theta$ with continuous first and second derivatives in an open set containing $\theta_0$ for all $z_i$. The vector $(y_i, x_i, z_i)$ is observed, and we assume that $p_i$ is bounded below by a positive number.

Let $d_i$ be the indicator variable with $d_i = 1$ if a response is obtained and $d_i = 0$ if a response is not obtained. Using the vector $(d_i, z_i)$, the parameter $\theta_0$ of the response probability function is estimated. Assume that $\hat{\theta} - \theta_0 = O_p(n^{-1/2})$, where $\hat{\theta}$ is the estimator of $\theta_0$. Let $\beta_N$ denote the finite population regression vector for the regression of y on x. Let

$$\tilde{\beta} = \left(\sum_{i \in A} x_i'x_i\,\pi_i^{-1}\hat{p}_i^{-1}d_i\right)^{-1}\sum_{i \in A} x_i'y_i\,\pi_i^{-1}\hat{p}_i^{-1}d_i, \tag{8.6}$$

where $\pi_i$ are the selection probabilities and $\hat{p}_i^{-1} = g(z_i, \hat{\theta})$. Under conditions of the type used in section 4,

$$\tilde{\beta} - \beta_N = \hat{M}_{xx}^{-1}\sum_{i \in A} d_i\,\pi_i^{-1}p_i^{-1}x_i'a_i[1 + p_i\,g_{1,i}(\hat{\theta} - \theta_0)] + O_p(n^{-1}),$$

where $g_{1,i}$ is the row vector of first derivatives of $g(z_i, \theta)$ evaluated at $\theta = \theta_0$ and $\hat{M}_{xx} = \sum_{i \in A} x_i'x_i\,\pi_i^{-1}\hat{p}_i^{-1}d_i$. If $g_{1,i}$ is uncorrelated with $a_i$, then the term involving $g_{1,i}a_i$ is $O_p(n^{-1})$ and the variance estimator constructed as if $g(z_i, \theta_0)$ is known is appropriate. The conditions are satisfied if $z_i$ is a subvector of $x_i$ and $z_i$ defines imputation cells (adjustment cells) with equal response rates within a cell.
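The adjustment cell procedure described in this section can be sketched in a few lines. The example below is an illustration under assumed data, not code from the paper: the base weights, cell labels, response indicators and y values are hypothetical, and `cell_adjusted_weights` is an illustrative helper name. It ratio adjusts the respondent weights so that, in each cell, the respondent weights sum to the full sample weight total for the cell, and then computes the weighted mean of y from respondents only.

```python
from collections import defaultdict


def cell_adjusted_weights(base_w, cell, responded):
    """Ratio-adjust respondent weights within adjustment cells.

    After adjustment, the respondent weights in each cell sum to the
    weight total of all sampled units (respondents and nonrespondents)
    in that cell. Nonrespondents receive weight zero.
    """
    cell_total = defaultdict(float)
    resp_total = defaultdict(float)
    for w, c, d in zip(base_w, cell, responded):
        cell_total[c] += w
        if d:
            resp_total[c] += w
    for c in cell_total:
        if resp_total[c] == 0.0:
            raise ValueError(f"cell {c!r} has no respondents")
    return [w * cell_total[c] / resp_total[c] if d else 0.0
            for w, c, d in zip(base_w, cell, responded)]


def weighted_mean(y, w, responded):
    """Weighted mean of y over respondents only."""
    num = sum(wi * yi for yi, wi, d in zip(y, w, responded) if d)
    den = sum(wi for wi, d in zip(w, responded) if d)
    return num / den


# Hypothetical sample with two adjustment cells.
base_w = [10.0, 10.0, 10.0, 20.0, 20.0, 20.0]
cell = ["a", "a", "a", "b", "b", "b"]
responded = [True, True, False, True, False, False]
y = [1.0, 3.0, None, 5.0, None, None]  # y is unobserved for nonrespondents

adj_w = cell_adjusted_weights(base_w, cell, responded)
y_mean = weighted_mean(y, adj_w, responded)
```

With these values the estimate equals the post stratified combination of the respondent cell means, (30/90)(2) + (60/90)(5) = 4, which is the form the regression estimator takes when the x vector consists of cell indicators.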
9. Practical Considerations

If the regression weights are to be used in a general purpose survey, no individual weight used in estimating a total should be less than one. Also, it seems reasonable, on robustness grounds, to avoid very large weights. We discuss some procedures that have been developed to accomplish these objectives.

A number of algorithms produce positive weights with a high probability. Raking ratio procedures produce positive weights for most data configurations. Deville, Särndal and Sautory (1993) discuss the extension of raking ratio to general x variables and extensions to include bounds on the weights.

Tillé (1998) suggested the use of approximate conditional probabilities, conditional on $\bar{x}_\pi$, to compute an estimator. His approximation can be extended to produce regression weights that are positive with high probability. Let $\bar{x}_\pi^{(i)}$ be an estimator obtained by deleting element i, or primary sampling unit i, and modifying the remaining weights so that $\bar{x}_\pi^{(i)}$ is unbiased, or consistent to the same order as $\bar{x}_\pi$, for the population mean of all elements excluding i. The estimator $\bar{x}_\pi^{(i)}$ can be the estimator used to construct jackknife deviates. Let $\hat{\Sigma}_{\bar{x}\bar{x}}$ be an estimator of the covariance matrix of $\bar{x}_\pi$ and let $\hat{\Sigma}_{\bar{x}\bar{x}(i)}$ be an estimator of the conditional covariance matrix of $\bar{x}_\pi^{(i)}$ conditional on $i \in A$. Then, in large samples $\bar{x}_\pi$ and $\bar{x}_\pi^{(i)}$ are approximately normally distributed and an estimator of the probability that i is in the sample given the estimated mean $\bar{x}_\pi$, is

$$\hat{\pi}_{i|A} = \hat{P}\{i \in A \mid \mathcal{F}_N, \bar{x}_\pi\} = \pi_i\,|\hat{\Sigma}_{\bar{x}\bar{x}}|^{1/2}\,|\hat{\Sigma}_{\bar{x}\bar{x}(i)}|^{-1/2}\exp\{0.5(G_{\bar{x}\bar{x}} - G_{\bar{x}\bar{x}(i)})\}, \tag{9.1}$$

where

$$G_{\bar{x}\bar{x}} = (\bar{x}_\pi - \bar{x}_N)\hat{\Sigma}_{\bar{x}\bar{x}}^{-1}(\bar{x}_\pi - \bar{x}_N)',$$

$$G_{\bar{x}\bar{x}(i)} = (\bar{x}_\pi^{(i)} - \bar{x}_N^{(i)})\hat{\Sigma}_{\bar{x}\bar{x}(i)}^{-1}(\bar{x}_\pi^{(i)} - \bar{x}_N^{(i)})',$$

and $\bar{x}_N^{(i)} = (N - 1)^{-1}(N\bar{x}_N - x_i)$. For simple random sampling, Tillé (1998) showed that the estimator

$$\bar{y}_{\pi\pi} = N^{-1}\sum_{i \in A}\hat{\pi}_{i|A}^{-1}y_i, \tag{9.2}$$

where $\hat{\pi}_{i|A}$ is the conditional probability calculated under the normality assumptions, is approximately equal to the regression estimator. Because the estimator is not calibrated, we suggest a calibrated version obtained by computing the regression estimator with $\hat{\pi}_{i|A}^{-1}$ as initial weights. The difference between (9.2) and the regression estimator constructed with initial weights $\hat{\pi}_{i|A}^{-1}$ is $O_p(n^{-1})$. Hence, there is a good chance that the regression weights so constructed will be positive. The variance estimator $\hat{\Sigma}_{\bar{x}\bar{x}(i)}$ is relatively simple to compute for stratified samples but may require considerable computation for other cases. Thus one may choose to approximate $\hat{\Sigma}_{\bar{x}\bar{x}(i)}$.

Given that the regression weights are being constructed by minimizing an objective function, one can add restrictions to the problem to place bounds on the weights. Huang and Fuller (1978) gave an iterative procedure equivalent to constructing a $\Phi$ matrix at each step that reduces the weight on observations whose current weight deviates from the average by a large absolute amount.

To discuss additional procedures associated with quadratic objective functions, assume we have a working covariance matrix, denoted by $\Phi_{ee}$, for the model (5.1) that is to be used to construct a regression estimator. Let $\alpha$ be the column vector of initial weights and assume $\Phi_{ee}\alpha$ is in the column space of X. Then the weights that minimize the conditional model variance are the weights that minimize $w'\Phi_{ee}w$ or, equivalently, that minimize

$$(w - \alpha)'\Phi_{ee}(w - \alpha) \tag{9.3}$$

subject to the constraint

$$w'X = \bar{x}_N. \tag{9.4}$$

Given an objective function, we can add restrictions on the $w_i$ such as

$$L_1 \le w_i \le L_2, \quad i \in A, \tag{9.5}$$

where $L_1$ and $L_2$ are nonnegative constants. Minimizing (9.3), subject to the constraints (9.4) and (9.5), is a quadratic programming problem. The use of quadratic programming was suggested by Husain (1969) and was used by Isaki, Tsay and Fuller (2000).

If a large number of control variables are used, it may not be possible to construct weights satisfying the calibration constraints and also falling within reasonable bounds. The practitioner is faced with making compromises. The most common practice is to drop variables from the model. See Bankier, Rathwell and Majkowski (1992) and Silva and Skinner (1997). To discuss an alternative procedure, consider the situation in which some of the constraints are required but others can be relaxed. Let the matrix of observations on the auxiliary variables be partitioned as $(X_0, X_2)$, where $X_0$ is the set of variables for which exact constraints are required and $X_2$ is the set for which the constraints can be relaxed. Assume $\Phi_{ee}\alpha$ is in the column space of $X_0$. Then a generalization of (9.3) and (9.4) is the function

$$(w - \alpha)'\Phi_{ee}(w - \alpha) + (w'X_2 - \bar{x}_{2,N})\Psi(w'X_2 - \bar{x}_{2,N})' \tag{9.6}$$

and the constraint

$$w'X_0 - \bar{x}_{0,N} = 0, \tag{9.7}$$

where $\Phi_{ee}$ and $\Psi$ are positive definite symmetric matrices and $\bar{x}_N = (\bar{x}_{0,N}, \bar{x}_{2,N})$. The w that minimizes (9.6) subject to (9.7) minimizes the mean squared error of the unbiased linear predictor of $\bar{x}_N\beta$ under the mixed model

$$y = X_0\beta_0 + X_2\beta_2 + e,$$

where $\beta_2 \sim (0, \Psi)$, $e \sim (0, \Phi_{ee})$, the random vector $\beta_2$ is independent of e, and $\beta_0$ is a fixed vector. See Lazzeroni and Little (1998) for the use of random models for post stratification.

The vector w that minimizes (9.6) subject to restriction (9.7) is

$$w' = \alpha' + (\bar{x}_N - \bar{x}_\pi)H_{x\psi x}^{-1}X'\Phi_{ee}^{-1}, \tag{9.8}$$

where

$$H_{x\psi x} = \begin{pmatrix} X_0'\Phi_{ee}^{-1}X_0 & X_0'\Phi_{ee}^{-1}X_2 \\ X_2'\Phi_{ee}^{-1}X_0 & X_2'\Phi_{ee}^{-1}X_2 + \Psi^{-1} \end{pmatrix}. \tag{9.9}$$

The estimator can be written

$$\bar{y}_{r,reg} = w'y = \bar{y}_\pi + (\bar{x}_N - \bar{x}_\pi)\hat{\theta}, \tag{9.10}$$

where $\hat{\theta} = H_{x\psi x}^{-1}X'\Phi_{ee}^{-1}y$. See Henderson (1963), Robinson (1991), and Rao (2002, Chapter 6).

Husain (1969) considered (9.6) for a simple random sample from a normal distribution with $X_0 = J$, $\Phi_{ee} = I$, and $\Psi^{-1} = \gamma^{-1}\hat{\Sigma}_{\bar{x}\bar{x},22}$, where $\hat{\Sigma}_{\bar{x}\bar{x},22}$ is the estimated covariance matrix of $\bar{x}_{2,\pi}$, and $\gamma$ is a constant to be determined. For this case, Husain showed that the optimal $\gamma$ is

$$\gamma_{opt} = [k_2(1 - R^2)]^{-1}(n - k_2 - 2)R^2, \tag{9.11}$$

where $k_2$ is the dimension of $x_2$ and $R^2$ is the squared multiple correlation coefficient. Bardsley and Chambers (1984) considered the function (9.6), the division of $x_i$ into two components, and studied the behavior of the estimator from a model perspective. The procedure associated with (9.5), (9.6) and (9.7) was used by Isaki, Tsay and Fuller (2000). In that application, the vector $\bar{x}_{0,N}$ contained marginal totals of a multiway table and $\bar{x}_{2,N}$ contained totals for interior cells. Rao and Singh (1997) studied a closely related estimator in which tolerances are given for the difference between the final estimates for the elements of $\bar{x}_{2,N}$ and the corresponding elements of $\bar{x}_{2,N}$ itself.

Park (2002) extended Husain's optimality results to a more general $\Psi$. The $x_2$ vector can be transformed so that $V\{\bar{x}_{2,\pi}\}$ for the transformed vector is a diagonal matrix and so that $\tilde{X}_2'\Phi_{ee}^{-1}\tilde{X}_2$ is a diagonal matrix, where $\tilde{X}_2$ is the part of $X_2$ that is orthogonal to $X_0$ in the metric $\Phi_{ee}$. That is,

$$\tilde{X}_2 = X_2 - X_0(X_0'\Phi_{ee}^{-1}X_0)^{-1}X_0'\Phi_{ee}^{-1}X_2.$$

Then the diagonal $\Psi$ that minimizes the approximate variance has elements

$$\psi_{ii} = (m_{ii}V_{\beta\beta ii})^{-1}\beta_i^2, \tag{9.12}$$

where $m_{ii}$ is the ith element of the diagonal matrix $\tilde{X}_2'\Phi_{ee}^{-1}\tilde{X}_2$ and $V_{\beta\beta ii}$ is the variance of $\hat{\beta}_i$ in the transformed scale. To implement the procedure one must estimate the population parameters or choose realistic values for a general purpose $\Psi$. If one postulates a superpopulation random model for $\beta_2$, then the $\beta_i^2$ of (9.12) is replaced with $E\{\beta_i^2\}$, where the expectation is the model expectation.

10. Comments

Regression estimation is a flexible and powerful tool for the incorporation of auxiliary information into the estimation process. Closely related procedures, such as raking ratio, have large sample properties equivalent to those of regression estimators. The linearity of such estimators is of paramount importance because it permits the construction of a general purpose data set that provides very good estimators for a wide range of parameters.

Given a concentrated interest in a single y variable, efficiency gains may be possible by postulating a particular set of auxiliary variables and a particular error covariance matrix. Because of the simple nature of the design consistency requirement, it is easy to test such models for design consistency.

Acknowledgements

This research was partially supported by Cooperative Agreement 433AEU080064 between Iowa State University, the U.S. National Agricultural Statistics Service and the U.S. Bureau of the Census. I am deeply indebted to Mingue Park for assistance in literature review, for comments on and repair of theoretical results, and for use of material from his thesis. I thank Michael Hidiroglou, J.N.K. Rao, Harold Mantel, and Jean Opsomer for useful comments on drafts of the manuscript. I thank the Associate Editor for numerous comments that improved the presentation.
Appendix

This appendix contains theorems supporting the limiting properties of the regression estimators discussed in section 4.

Theorem A.1. Let $\{U_N, \mathcal{F}_N, A_N, n_N : N = k + 3, k + 4, \ldots\}$ be a sequence of finite populations and samples, where $\mathcal{F}_N$ is a sample from an infinite population with eighth moments, and $A_N$ is the sample of size $n_N$ selected from the Nth population. Let $\hat{\beta}$ be defined by (4.4) of the text, and let

$$\hat{Q}_{zz} = n^{-1}Z'\Phi^{-1}Z,$$

where $\Phi$ is a positive definite symmetric n × n matrix that may be a function of X but not of y, Z is defined following (4.2), and we omit the subscript N on sample quantities. Assume $\hat{Q}_{zz}$ is positive definite with probability one. If $\Phi$ is random, assume the rows of $\Phi^{-1}Z$ have bounded fourth moments. Assume the selection probabilities satisfy

$$0 < K_1 < Nn^{-1}\pi_i < K_2,$$

where $\pi_i$ are the selection probabilities. Assume the sample design is such that for any z with bounded fourth moments

$$[(\bar{z}_{HT} - \bar{z}_N),\ (\hat{Q}_{zz} - Q_{zzN})] \mid \mathcal{F}_N = O_p(n^{-1/2}), \tag{A.1}$$

where

$$\bar{z}_{HT} = (\bar{y}_{HT}, \bar{x}_{HT}) = N^{-1}\sum_{i \in A}\pi_i^{-1}z_i, \tag{A.2}$$

$Q_{zzN} = E\{\hat{Q}_{zz} \mid \mathcal{F}_N\}$, $\bar{z}_N$ is the finite population mean of z, $Q_{zzN}$ is a positive definite matrix for the Nth population, and the limit of $Q_{zzN}$ is positive definite. Then

$$\hat{\beta} - \beta_N \mid \mathcal{F}_N = Q_{xxN}^{-1}\bar{b}_{HT} + O_p(n^{-1}), \tag{A.3}$$

where $\beta_N = Q_{xxN}^{-1}Q_{xyN}$, $\bar{b}_{HT} = N^{-1}\sum_{i \in A}\pi_i^{-1}b_i$, $b_i = n^{-1}N\pi_i\zeta_i e_i$,

$$Q_{zzN} = \begin{pmatrix} Q_{yyN} & Q_{yxN} \\ Q_{xyN} & Q_{xxN} \end{pmatrix}, \tag{A.4}$$

$e_i = y_i - x_i\beta_N$, and $\zeta_i$ is column i of $X'\Phi^{-1}$. Assume the design is such that

$$V_{\bar{z}\bar{z}}^{-1/2}\{\bar{z}_{HT} - \bar{z}_N \mid \mathcal{F}_N\} \xrightarrow{L} N(0, I), \tag{A.5}$$

as $n \to \infty$ for any z with finite fourth moments, where $V_{\bar{z}\bar{z}}$ is the covariance matrix of $\bar{z}_{HT} - \bar{z}_N$. Assume that $V_{\bar{z}\bar{z}}$ is $O(n^{-1})$ and that the design admits an estimator $\hat{V}_{\bar{z}\bar{z}}$ such that

$$n(\hat{V}_{\bar{z}\bar{z}} - V_{\bar{z}\bar{z}}) \mid \mathcal{F}_N = o_p(1) \tag{A.6}$$

for any z with bounded fourth moments. Then

$$[\hat{V}\{\hat{\beta}\}]^{-1/2}[\hat{\beta} - \beta_N] \mid \mathcal{F}_N \xrightarrow{L} N(0, I), \tag{A.7}$$

where

$$\hat{V}\{\hat{\beta}\} = \hat{Q}_{xx}^{-1}\hat{V}_{\bar{b}\bar{b}}\hat{Q}_{xx}^{-1}, \tag{A.8}$$

$\hat{V}_{\bar{b}\bar{b}} = \hat{V}\{\bar{b}_{HT}\}$ is the estimated design variance of $\bar{b}_{HT}$ calculated with $\hat{b}_i = n^{-1}N\pi_i\zeta_i\hat{e}_i$ and $\hat{e}_i = y_i - x_i\hat{\beta}$.

Proof. The error in $\hat{\beta}$ is

$$\hat{\beta} - \beta_N = (X'\Phi^{-1}X)^{-1}[X'\Phi^{-1}y - X'\Phi^{-1}X\beta_N] = \hat{Q}_{xx}^{-1}(n^{-1}X'\Phi^{-1}e).$$

Now $\hat{\beta}$ is a generalized least squares estimator. Therefore

$$\hat{e}'\Phi^{-1}X = (y - X\hat{\beta})'\Phi^{-1}X = 0$$

and

$$Q_{xyN} - Q_{xxN}\beta_N = Q_{exN} = 0.$$

By assumption (A.1),

$$\hat{Q}_{ex} = n^{-1}X'\Phi^{-1}e = O_p(n^{-1/2}).$$

Thus

$$\hat{\beta} - \beta_N = Q_{xxN}^{-1}n^{-1}\sum_{i \in A}\zeta_i e_i + O_p(n^{-1}) = Q_{xxN}^{-1}N^{-1}\sum_{i \in A}\pi_i^{-1}b_i + O_p(n^{-1}).$$

The $b_i$ have bounded fourth moments by the assumptions. Thus, by assumption (A.5),

$$V_{\beta\beta}^{-1/2}(\hat{\beta} - \beta_N) \xrightarrow{L} N(0, I),$$

where

$$V_{\beta\beta} = Q_{xxN}^{-1}V_{\bar{b}\bar{b}}Q_{xxN}^{-1}$$

and $V_{\bar{b}\bar{b}} = V\{\bar{b}_{HT}\}$. Now

$$n^{-1}X'\Phi^{-1}\hat{e} = n^{-1}X'\Phi^{-1}e + n^{-1}X'\Phi^{-1}X(\beta_N - \hat{\beta}) =: N^{-1}\sum_{i \in A}\pi_i^{-1}b_i + N^{-1}\sum_{i \in A}\pi_i^{-1}h_i,$$

where

$$h_i = n^{-1}N\pi_i\zeta_i x_i\delta_\beta$$

and $\delta_\beta = \beta_N - \hat{\beta}$. For any fixed $\delta_\beta$, by (A.6), the estimated variance of $N^{-1}\sum_{i \in A}\pi_i^{-1}(b_i + h_i)$ is consistent for the variance of the estimator of the mean of b + h. By assumption, the elements of $\zeta_i x_i$ have fourth moments. For a fixed $\delta_\beta$ the variance of $\bar{h}_{HT}$ is $O(n^{-1})$. For $\delta_\beta = \beta_N - \hat{\beta}$,

$$\hat{V}\{\bar{h}_{HT}\} = o_p(n^{-1}),$$

and

$$\hat{V}\{\hat{\bar{b}}_{HT}\} = \hat{V}\{\bar{b}_{HT}\} + o_p(n^{-1})$$

because $\delta_\beta = O_p(n^{-1/2})$. Result (A.7) then follows from the asymptotic normality of $\hat{\beta} - \beta_N$.

Theorem A.2. Let $y = (y_1, y_2, \ldots, y_n)'$ and $X = (x_1', x_2', \ldots, x_n')'$. Let $\Phi$ be a nonsingular symmetric n × n matrix and let $\Phi_N$ be a nonsingular symmetric N × N matrix. Let

$$\bar{y}_\pi,\ \bar{x}_\pi,\ n^{-1}(X'\Phi^{-1}X)\ \text{and}\ n^{-1}X'\Phi^{-1}y$$

be design consistent estimators for finite population characteristics $\bar{y}_N$, $\bar{x}_N$, $Q_{xxN}$ and $Q_{xyN}$, respectively, where

$$[Q_{xxN}, Q_{xyN}] = [N^{-1}X_N'\Phi_N^{-1}X_N,\ N^{-1}X_N'\Phi_N^{-1}y_N]. \tag{A.9}$$

Let $\beta_N = Q_{xxN}^{-1}Q_{xyN}$. Let there be a sequence of column vectors $\{\gamma_N\}$ such that

$$X\gamma_N = \Phi D_\pi^{-1}J_n \tag{A.10}$$

for all possible samples, where $D_\pi = \mathrm{diag}(\pi_1, \pi_2, \ldots, \pi_n)$ and $J_n$ is an n dimensional column vector of ones. Then, the regression estimator $\bar{x}_N\hat{\beta}$ with

$$\hat{\beta} = (X'\Phi^{-1}X)^{-1}X'\Phi^{-1}y, \tag{A.11}$$

is a design consistent estimator of $\bar{y}_N$.

Proof. If $\hat{\beta}$ is defined by (A.11), then by the properties of generalized least squares estimators,

$$(y - X\hat{\beta})'\Phi^{-1}X = 0.$$

If (A.10) holds, then

$$(y - X\hat{\beta})'D_\pi^{-1}J_n = \left(\sum_{i \in A}\pi_i^{-1}\right)(\bar{y}_\pi - \bar{x}_\pi\hat{\beta}) = 0.$$

It follows that $\bar{y}_{reg}$ is design consistent because

$$0 = \operatorname*{plim}_{N \to \infty}\{(\bar{y}_\pi - \bar{x}_\pi\hat{\beta}) \mid \mathcal{F}_N\} = \operatorname*{plim}_{N \to \infty}\{(\bar{y}_\pi - \bar{x}_\pi\beta_N) \mid \mathcal{F}_N\} = \operatorname*{plim}_{N \to \infty}\{(\bar{y}_N - \bar{x}_N\beta_N) \mid \mathcal{F}_N\}.$$

Theorem A.3. Let a sequence of populations and samples be as defined in Theorem A.1. Let $z_i$ be a vector of the form $z_i = (y_i, 1, x_{1,i})$ and let $z_{1,i} = (y_i, x_{1,i})$. Assume $\bar{z}_{1,\pi}$ is a design consistent estimator of the population mean $\bar{z}_{1,N}$ with nonsingular covariance matrix

$$V\{\bar{z}_{1,\pi} \mid \mathcal{F}_N\} = O(n^{-1}) \tag{A.12}$$

and

$$n^{1/2}(\bar{z}_{1,\pi} - \bar{z}_{1,N}) \mid \mathcal{F}_N \xrightarrow{L} N(0, \Sigma_{zz}), \tag{A.13}$$

where $\Sigma_{zz}$ is the limit of $nV\{\bar{z}_{1,\pi} \mid \mathcal{F}_N\}$. Assume there is an estimator of the variance of $\bar{z}_{1,\pi}$, denoted by $\hat{V}\{\bar{z}_{1,\pi}\}$, such that

$$\operatorname*{plim}_{N \to \infty} n^{1+\delta}(\hat{V}\{\bar{z}_{1,\pi}\} - V\{\bar{z}_{1,\pi} \mid \mathcal{F}_N\}) = 0 \tag{A.14}$$

for some $\delta > 0$. Let $\beta_{1,dopt}$ be the vector that minimizes

$$V\{\bar{y}_\pi - \bar{x}_{1,\pi}\beta_{1,d}\} \tag{A.15}$$

and let $\hat{\beta}_{1,dopt}$ be the vector that minimizes $\hat{V}\{\bar{y}_\pi - \bar{x}_{1,\pi}\beta_{1,d}\}$. Let $\bar{y}_{d,reg}$ be defined by (4.29). Then $\bar{y}_{d,reg}$ has the minimum limit variance for design consistent estimators of the form $\bar{y}_\pi + (\bar{x}_{1,N} - \bar{x}_{1,\pi})\beta_{1,d}$. Also

$$[\hat{V}\{\bar{e}_\pi\}]^{-1/2}(\bar{y}_{d,reg} - \bar{y}_N) \xrightarrow{L} N(0, 1), \tag{A.16}$$

where $\hat{V}\{\bar{e}_\pi\}$ is the estimator of (A.14) constructed with $\hat{e}_i = y_i - \bar{y}_\pi - (x_{1,i} - \bar{x}_{1,\pi})\hat{\beta}_{1,dopt}$.

Proof. The estimator

$$\hat{\beta}_{1,dopt} = [\hat{V}\{\bar{x}_{1,\pi}\}]^{-1}\hat{C}\{\bar{x}_{1,\pi}, \bar{y}_\pi\}$$

minimizes the estimated variance of (A.15), and, by assumption (A.14), the estimated variance is consistent for the true variance. Hence, $\hat{\beta}_{1,dopt}$ is design consistent for $\beta_{1,dopt}$ and $\beta_{1,dopt}$ minimizes $V\{\bar{y}_\pi - \bar{x}_{1,\pi}\beta\}$. Therefore, no estimator of the form (4.29) has a limit distribution with smaller variance. Now

$$\bar{y}_{d,reg} - \bar{y}_N = \bar{y}_\pi - \bar{y}_N + (\bar{x}_{1,N} - \bar{x}_{1,\pi})\hat{\beta}_{1,dopt} = \bar{e}_\pi + o_p(n^{-1/2}),$$

where $e_i = y_i - \bar{y}_N - (x_{1,i} - \bar{x}_{1,N})\beta_{1,dopt}$. Therefore the variance of the limiting distribution of $n^{1/2}(\bar{y}_{d,reg} - \bar{y}_N)$ is the variance of $n^{1/2}(\bar{e}_\pi - \bar{e}_N)$. By assumption (A.14), the estimator $\hat{V}\{\bar{z}_{1,\pi}\lambda\}$ is a consistent variance estimator of $V\{\bar{z}_{1,\pi}\lambda\}$ for any fixed vector $\lambda$. Because $\hat{\beta}_{1,dopt} - \beta_{1,dopt} = o_p(1)$, the estimated variance based on $\hat{e}_i$ converges to the estimated variance based on $e_i$ and (A.16) holds.

References

Anderson, C., and Nordberg, L. (1998). A user's guide to CLAN97. Statistics Sweden, Örebro, Sweden.
Bankier, M.D., Rathwell, S. and Majkowski, M. (1992). Two step generalized least squares estimation in the 1991 Canadian Census. Working Paper, Methodology Branch, Census Operations Section, Social Survey Methods Division. Statistics Canada, Ottawa.
Bankier, M.D., Houle, A.M. and Luc, M. (1997). Calibration estimation in the 1991 and 1996 Canadian census. Statistics Canada (draft), 8 pages.
Bardsley, P., and Chambers, R.L. (1984). Multipurpose estimation from unbalanced samples. Applied Statistics, 33, 290-299.
Bethlehem, J.G. (1988). Reduction of nonresponse bias through regression estimation. Journal of Official Statistics, 4, 251-260.
Bethlehem, J.G., and Keller, W.J. (1987). Linear weighting of sample survey data. Journal of Official Statistics, 3, 141-153.
Bishop, Y.M.M., Fienberg, S.E. and Holland, P.W. (1975). Discrete Multivariate Analysis: Theory and Practice. MIT Press, Cambridge, MA.
Breidt, F.J., and Opsomer, J.D. (2000). Local polynomial regression estimators in survey sampling. The Annals of Statistics, 28, 1026-1053.
Brewer, K.R.W. (1963). Ratio estimation and finite populations: Some results deducible from the assumption of an underlying stochastic process. Australian Journal of Statistics, 5, 93-105.
20
Brewer, K.R.W. (1979). A class of robust sampling designs for large-scale surveys. Journal of the American Statistical Association, 74, 911-915.
Brewer, K.R.W., Hanif, M. and Tam, S.M. (1988). How nearly can model-based prediction and design-based estimation be reconciled? Journal of the American Statistical Association, 83, 128-132.
Brick, J.M., Waksberg, J. and Keeter, S. (1996). Using data on interruptions in telephone service as coverage adjustments. Survey Methodology, 22, 185-197.
Cassel, C.M., Särndal, C.-E. and Wretman, J.H. (1976). Some results on generalized difference estimation and generalized regression estimation for finite populations. Biometrika, 63, 615-620.
Cassel, C.M., Särndal, C.-E. and Wretman, J.H. (1979). Prediction theory for finite populations when model-based and design-based principles are combined. Scandinavian Journal of Statistics, 6, 97-106.
Cassel, C.M., Särndal, C.-E. and Wretman, J.H. (1983). Some uses of statistical models in connection with the nonresponse problem. In Incomplete Data in Sample Surveys, (Eds. W.G. Madow, I. Olkin, and D. Rubin). New York: Academic Press, 3, 143-160.
Chambers, R.L. (1996). Robust case-weighting for multipurpose establishment surveys. Journal of Official Statistics, 12, 3-32.
Chambers, R.L., Dorfman, A.H. and Wehrly, T.E. (1993). Bias robust estimation in finite populations using nonparametric calibration. Journal of the American Statistical Association, 88, 268-277.
Cochran, W.G. (1939). The use of the analysis of variance in enumeration by sampling. Journal of the American Statistical Association, 34, 492-510.
Cochran, W.G. (1942). Sampling theory when the sampling units are of unequal sizes. Journal of the American Statistical Association, 37, 199-212.
Cochran, W.G. (1946). Relative accuracy of systematic and stratified random samples for a certain class of populations. Annals of Mathematical Statistics, 17, 164-177.
Cochran, W.G. (1977). Sampling Techniques, 3rd ed., New York: John Wiley & Sons, Inc.
Cook, R.D., and Weisberg, S. (1982). Residuals and Influence in Regression. New York: Chapman and Hall.
Deming, W.E., and Stephan, F.F. (1940). On a least squares adjustment of a sampled frequency table when the expected marginal totals are known. Annals of Mathematical Statistics, 11, 427-444.
Deming, W.E., and Stephan, F.F. (1941). On the interpretation of censuses as samples. Journal of the American Statistical Association, 36, 45-49.
Deville, J.-C., and Särndal, C.-E. (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87, 376-382.
Deville, J.-C., Särndal, C.-E. and Sautory, O. (1993). Generalized raking procedures in survey sampling. Journal of the American Statistical Association, 88, 1013-1020.
Dorfman, A.H. (1993). A comparison of design-based and model-based estimators of the finite population distribution function. Australian Journal of Statistics, 35, 29-41.
Dorfman, A.H., and Hall, P. (1993). Estimators of the finite population distribution function using nonparametric regression. Annals of Statistics, 21, 1452-1475.
Duchesne, P. (2000). A note on jackknife variance estimation for the general regression estimator. Journal of Official Statistics, 16, 133-138.
DuMouchel, W.H., and Duncan, G.J. (1983). Using survey weights in multiple regression analysis of stratified samples. Journal of the American Statistical Association, 78, 535-543.
Eltinge, J.L., and Yansaneh, I.S. (1997). Diagnostics for formation of nonresponse adjustment cells, with an application to income nonresponse in the U.S. Consumer Expenditure Survey. Survey Methodology, 23, 33-40.
Ericson, W.A. (1969). Subjective Bayesian models in sampling finite populations. Journal of the Royal Statistical Society, Series B, 31, 195-224.
Estevao, V., Hidiroglou, M.A. and Särndal, C.-E. (1995). Methodological principles for a generalized estimation system at Statistics Canada. Journal of Official Statistics, 11, 181-204.
Firth, D., and Bennett, K.E. (1998). Robust models in probability sampling. Journal of the Royal Statistical Society, Series B, 60, 3-21.
Folsom, R.E., and Witt, M.B. (1994). Testing a new attrition nonresponse adjustment method for SIPP. Proceedings of the Section on Survey Research Methods, American Statistical Association, 428-433.
Folsom, R.E., and Singh, A.C. (2000). The generalized exponential model for a unified approach to sampling weight calibration for outlier weight treatment, nonresponse adjustment and poststratification. Proceedings of the Section on Survey Research Methods, American Statistical Association, 598-603.
Frankel, M.R. (1971). Inference from survey samples: An empirical investigation. Institute for Social Research, University of Michigan, Ann Arbor.
Fuller, W.A. (1966). Estimation employing post strata. Journal of the American Statistical Association, 61, 1172-1183.
Fuller, W.A. (1973). Regression for sample surveys. Paper presented at meeting of International Statistical Institute. August, 1973, Vienna, Austria.
Fuller, W.A. (1975). Regression analysis for sample survey. Sankhyā, Series C, 37, 117-132.
Fuller, W.A. (1984). Least squares and related analyses for complex survey designs. Survey Methodology, 10, 97-118.
Fuller, W.A., and An, A.B. (1998). Regression adjustments for nonresponse. Journal of the Indian Society of Agricultural Statistics, 51, 331-342.
Fuller, W.A., and Battese, G.E. (1973). Transformations for estimation of linear models with nested error structure. Journal of the American Statistical Association, 68, 626-632.
Fuller, W.A., and Isaki, C.T. (1981). Survey design under superpopulation models. In Current Topics in Survey Sampling, (Eds., D. Krewski, J.N.K. Rao and R. Platek), New York: Academic Press, 199-226.
Fuller, W.A., Loughin, M.M. and Baker, H.D. (1994). Regression weighting for the 1987-88 National Food Consumption Survey. Survey Methodology, 20, 75-85.
Fuller, W.A., and Rao, J.N.K. (2001). A regression composite estimator with application to the Canadian labour force survey. Survey Methodology, 27, 45-52.
Gambino, J., Kennedy, B. and Singh, M.P. (2001). Regression composite estimation for the Canadian labour force survey: Evaluation and implementation. Survey Methodology, 27, 65-74.
Gerow, K., and McCulloch, C.E. (2000). Simultaneously model-unbiased, design-unbiased estimation. Biometrics, 56, 873-878.
Godambe, V.P. (1955). A unified theory of sampling from finite populations. Journal of the Royal Statistical Society, Series B, 17, 269-278.
Godambe, V.P., and Joshi, V.M. (1965). Admissibility and Bayes estimation in sampling finite populations, 1. Annals of Mathematical Statistics, 36, 1707-1722.
Goldberger, A.S. (1962). Best linear unbiased prediction in the generalized linear regression model. Journal of the American Statistical Association, 57, 369-375.
Graybill, F.A. (1976). Theory and application of the linear model. Wadsworth, Belmont, CA.
Hájek, J. (1959). Optimum strategy and other problems in probability sampling. Casopis Pro Pestovani Matematiky, 84, 387-423.
Hanurav, T.V. (1966). Some aspects of unified sampling theory. Sankhyā, Series A, 28, 175-204.
Henderson, C.R. (1963). Selection index and expected genetic advance. In Statistical Genetics and Plant Breeding, 141-163. National Academy Sciences, National Research Council Publication 982, Washington, DC.
Harville, D.A. (1976). Extension of the Gauss-Markov Theorem to include estimation of random effects. Annals of Statistics, 4, 384-395.
Hidiroglou, M.A. (1974). Estimation of regression parameters for finite populations. Ph.D. thesis, Iowa State University, Ames, Iowa.
Hidiroglou, M.A., Fuller, W.A. and Hickman, R.D. (1978). Super Carp, (sixth edition, 1980) Survey Section, Statistical Laboratory, Iowa State University, Ames, Iowa.
Hidiroglou, M.A., Särndal, C.-E. and Binder, D.A. (1995). Weighting and estimation in business surveys. Business Survey Methods, (Eds. Cox, Binder, Chinnappa, Colledge and Kott) New York: John Wiley & Sons, Inc., 477-502.
Holt, D., and Smith, T.M.F. (1979). Post Stratification. Journal of the Royal Statistical Society, Series A, 142, 33-46.
Horn, S.D., Horn, R.A. and Duncan, D.B. (1975). Estimating heteroscedastic variances in linear models. Journal of the American Statistical Association, 70, 380-385.
Huang, E.T., and Fuller, W.A. (1978). Nonnegative regression estimation for sample survey data. Proceedings of the Social Statistics Section, American Statistical Association, 300-305.
Husain, M. (1969). Construction of regression weights for estimation in sample surveys. Unpublished M.S. thesis, Iowa State University, Ames, Iowa.
Isaki, C.T. (1970). Survey designs utilizing prior information. Unpublished Ph.D. thesis. Iowa State University.
Isaki, C.T., and Fuller, W.A. (1982). Survey design under the regression superpopulation model. Journal of the American Statistical Association, 77, 89-96.
Isaki, C.T., Tsay, J.H. and Fuller, W.A. (2000). Estimation of census adjustment factors. Survey Methodology, 26, 31-42.
Jessen, R.J. (1942). Statistical investigation of a sample survey for obtaining farm facts. Iowa Agriculture Experiment Station Research Bulletin, 304.
Kalton, G., and Maligalig, D.S. (1991). A Comparison of Methods of Weighting Adjustment for Nonresponse. Proceedings of the 1991 Annual Research Conference, U.S. Bureau of the Census, 409-428.
Kish, L., and Frankel, M.R. (1974). Inference from complex samples (with discussion). Journal of the Royal Statistical Society, Series B, 36, 1-37.
Konijn, H.S. (1962). Regression analysis for sample surveys. Journal of the American Statistical Association, 57, 590-606.
Kott, P.S. (1994). A note on handling nonresponse in sample surveys. Journal of the American Statistical Association, 89, 693-696.
Kuo, L. (1988). Classical and prediction approaches to estimating distribution function from survey data. Proceedings of the Section on Survey Research Methods, American Statistical Association, 280-285.
Lazzeroni, L.C., and Little, R.J.A. (1998). Random-effects models for smoothing poststratification weights. Journal of Official Statistics, 14, 61-78.
Little, R.J.A. (1982). Models for nonresponse in sample surveys. Journal of the American Statistical Association, 77, 237-250.
Little, R.J.A. (1986). Survey nonresponse adjustments for estimates of means. International Statistical Review, 54, 139-157.
Little, R.J.A., and Rubin, D.B. (1987). Statistical Analysis with Missing Data. New York: John Wiley & Sons, Inc.
Little, R.J.A. (1993). Poststratification: A modeler's perspective. Journal of the American Statistical Association, 88, 1001-1012.
Madow, W.G., and Madow, L.H. (1944). On the theory of systematic sampling. Annals of Mathematical Statistics, 15, 1-24.
Mickey, M.R. (1959). Some finite population unbiased ratio and regression estimators. Journal of the American Statistical Association, 54, 594-612.
Montanari, G.E. (1987). Post-sampling efficient QR-prediction in large-sample surveys. International Statistical Review, 55, 191-202.
Montanari, G.E. (1999). A study on the conditional properties of finite population mean estimators. Metron, 57, 21-35.
Mukhopadhyay, P. (1993). Estimation of a finite population total under regression models: A review. Sankhyā, 55, 141-155.
Nieuwenbroek, N., Renssen, R. and Hofman, L. (2000). Towards a generalized weighting system. In Proceedings of the Second International Conference on Establishment Surveys, American Statistical Association, Alexandria, Virginia.
Park, M. (2002). Regression estimation of the mean in Survey Sampling. Unpublished Ph.D. dissertation, Iowa State University, Ames, Iowa.
Pfeffermann, D. (1984). Note on large sample properties of balanced samples. Journal of the Royal Statistical Society, Series B, 46, 38-41.
Rao, J.N.K. (1994). Estimating totals and distribution functions using auxiliary information at the estimation stage. Journal of Official Statistics, 10, 153-165.
Rao, J.N.K. (2002). Small Area Estimation Theory and Methods, New York: John Wiley & Sons, Inc.
Rao, J.N.K., Hartley, H.O. and Cochran, W.G. (1962). On a simple procedure of unequal probability sampling without replacement. Journal of the Royal Statistical Society, Series B, 24, 482-491.
Rao, J.N.K., and Singh, A.C. (1997). A ridge-shrinkage method for range-restricted weight calibration in survey sampling. Proceedings of the Section on Survey Research Methods, American Statistical Association, 57-64.
Robinson, G.K. (1991). That BLUP is a good thing: The estimation of random effects. Statistical Science, 6, 15-32.
Robinson, P.M., and Särndal, C.-E. (1983). Asymptotic properties of the generalized regression estimator in probability sampling. Sankhyā, Series B, 45, 240-248.
Rosenbaum, P.R., and Rubin, D.B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41-55.
Royall, R.M. (1970). On finite population sampling theory under certain linear regression models. Biometrika, 57, 377-387.
Royall, R.M. (1976). The linear least squares prediction approach to two-stage sampling. Journal of the American Statistical Association, 71, 657-664.
Royall, R.M. (1986). The prediction approach to robust variance estimation in two-stage cluster sampling. Journal of the American Statistical Association, 81, 119-123.
Royall, R.M., and Cumberland, W.G. (1978). Variance estimation in finite population sampling. Journal of the American Statistical Association, 73, 351-358.
Royall, R.M., and Cumberland, W.G. (1981). The finite population linear regression estimator and estimators of its variance, an empirical study. Journal of the American Statistical Association, 76, 924-930.
Särndal, C.-E. (1980). On π-inverse weighting versus best linear unbiased weighting in probability sampling. Biometrika, 67, 639-650.
Särndal, C.-E. (1982). Implications of survey design for generalized regression estimation of linear functions. Journal of Statistical Planning and Inference, 7, 155-170.
Särndal, C.-E. (1996). Efficient estimators with simple variance in unequal probability sampling. Journal of the American Statistical Association, 91, 1289-1300.
Särndal, C.-E., Swensson, B. and Wretman, J.H. (1989). The weighted residual technique for estimating the variance of the general regression estimator of the finite population total. Biometrika, 76, 527-537.
Särndal, C.-E., Swensson, B. and Wretman, J.H. (1992). Model Assisted Survey Sampling. New York: Springer-Verlag.
Särndal, C.-E., and Wright, R. (1984). Cosmetic form of estimators in survey sampling. Scandinavian Journal of Statistics, 11, 146-156.
Scott, A., and Smith, T.M.F. (1974). Linear superpopulation models in survey sampling. Sankhyā, C, 36, 143-146.
Scott, A., and Wu, C.F. (1981). On the asymptotic distribution of ratio and regression estimators. Journal of the American Statistical Association, 76, 98-102.
Silva, P.L.D.N., and Skinner, C.J. (1997). Variable selection for regression estimation in finite populations. Survey Methodology, 23, 23-32.
Singh, A.C., and Folsom, R.E. (2000). Bias corrected estimating functions approach for variance estimation adjusted for poststratification. Proceedings of the Section on Survey Research Methods, American Statistical Association, 610-615.
Singh, A.C., Kennedy, B. and Wu, S. (2001). Regression composite estimation for the Canadian Labour Force Survey with a rotating design. Survey Methodology, 27, 33-44.
Singh, A.C., and Mohl, C.A. (1996). Understanding calibration estimators in survey sampling. Survey Methodology, 22, 107-115.
Tallis, G.M. (1978). Note on robust estimation in finite populations. Sankhyā, C, 40, 136-138.
Tam, S.M. (1986). Characterization of best model-based predictors in survey sampling. Biometrika, 73, 232-235.
Théberge, A. (1999). Extensions of calibration estimators in survey sampling. Journal of the American Statistical Association, 94, 635-644.
Théberge, A. (2000). Calibration and restricted weights. Survey Methodology, 26, 99-107.
Tillé, Y. (1998). Estimation in surveys using conditional inclusion probabilities: Simple random sampling. International Statistical Review, 66, 303-322.
Tremblay, V. (1986). Practical Criteria for Definition of Weighting Classes. Survey Methodology, 12, 85-97.
Watson, D.J. (1937). The estimation of leaf area in field crops. Journal of Agricultural Science, 27, 474-483.
Woodruff, R.S., and Causey, B.D. (1976). Computerized method for approximating the variance of a complicated estimate. Journal of the American Statistical Association, 71, 315-321.
Wright, R.L. (1983). Finite population sampling with multivariate auxiliary information. Journal of the American Statistical Association, 78, 879-884.
Wu, C., and Sitter, R.R. (2001). A model-calibration approach to using complete auxiliary information from survey data. Journal of the American Statistical Association, 96, 185-193.
Yates, F. (1949). Sampling Methods for Census and Surveys. London: Griffin.
Yung, W., and Rao, J.N.K. (1996). Jackknife linearization variance estimators under stratified multistage sampling. Survey Methodology, 22, 23-31.
Zyskind, G. (1976). On canonical forms, non-negative covariance matrices and best and simple least squares linear estimators in linear models. Annals of Mathematical Statistics, 38, 1092-1109.
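
As a numerical illustration of Theorem A.3, the sketch below works through the special case of simple random sampling with a single auxiliary variable. In that case the estimated variance in (A.15) is proportional to the sample variance of $y_i - x_{1,i} \beta_{1,d}$, so the minimizing coefficient reduces to the ordinary least squares slope. The function names `srs_regression_estimate` and `srs_variance_of_mean`, the synthetic population, and the choice of design are assumptions made for illustration only; they do not come from the paper.

```python
import numpy as np


def srs_regression_estimate(y, x, x_bar_N):
    """Regression estimate of the population mean of y under simple random sampling.

    Returns the estimate, the estimated coefficient and the residuals.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    x_c = x - x.mean()
    # Under SRS, the estimated variance of ybar - xbar * b is proportional to the
    # sample variance of y_i - x_i * b, which is minimized by the least squares slope.
    beta_hat = np.dot(x_c, y - y.mean()) / np.dot(x_c, x_c)
    e_hat = (y - y.mean()) - x_c * beta_hat
    y_reg = y.mean() + (x_bar_N - x.mean()) * beta_hat
    return y_reg, beta_hat, e_hat


def srs_variance_of_mean(values, N):
    """Usual SRS variance estimator of a sample mean, with finite population correction."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    return (1.0 - n / N) * np.var(values, ddof=1) / n


# Hypothetical finite population with a linear relationship between y and x.
rng = np.random.default_rng(2002)
N, n = 2000, 100
x_pop = rng.gamma(shape=2.0, scale=1.0, size=N)
y_pop = 3.0 + 2.0 * x_pop + rng.normal(scale=1.0, size=N)

# Draw one simple random sample without replacement.
sample = rng.choice(N, size=n, replace=False)
y_s, x_s = y_pop[sample], x_pop[sample]

y_reg, beta_hat, e_hat = srs_regression_estimate(y_s, x_s, x_pop.mean())
v_reg = srs_variance_of_mean(e_hat, N)   # estimated variance based on the residuals
v_mean = srs_variance_of_mean(y_s, N)    # estimated variance of the unadjusted mean

# Normal-theory interval suggested by (A.16).
half_width = 1.96 * np.sqrt(v_reg)
print("regression estimate:", y_reg, "+/-", half_width)
print("estimated variance, regression vs. sample mean:", v_reg, v_mean)
```

The residual-based variance is the quantity that standardizes the estimator in (A.16). For unequal probability or multistage designs the coefficient would instead have to minimize the design-specific variance estimator, which this sketch does not attempt.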

Survey Methodology, June 2002, Vol. 28, No. 1, pp. 5-23. Statistics Canada, Catalogue No. 12-001-XIE
