A new face on two-phase sampling with calibration estimators

Victor M. Estevao and Carl-Erik Särndal 1

Survey Methodology, June 2009, Vol. 35, No. 1, pp. 3-14
Statistics Canada, Catalogue No. 12-001-X, Business Survey Methods Division

Abstract
This paper provides a framework for estimation by calibration in two-phase sampling designs. This work grew out of the
continuing development of generalized estimation software at Statistics Canada. An important objective in this development
is to provide a wide range of options for effective use of auxiliary information in different sampling designs. This objective
is reflected in the general methodology for two-phase designs presented in this paper.
We consider the traditional two-phase sampling design. A phase-one sample is drawn from the finite population and then a
phase-two sample is drawn as a sub-sample of the first. The study variable, whose unknown population total is to be
estimated, is observed only for the units in the phase-two sample. Arbitrary sampling designs are allowed in each phase of
sampling. Different types of auxiliary information are identified for the computation of the calibration weights at each
phase. The auxiliary variables and the study variables can be continuous or categorical.
The paper contributes to four important areas in the general context of calibration for two-phase designs:
(1) Three broad types of auxiliary information for two-phase designs are identified and used in the estimation. The
information is incorporated into the weights in two steps: a phase-one calibration and a phase-two calibration. We
discuss the composition of the appropriate auxiliary vectors for each step, and use a linearization method to arrive at the
residuals that determine the asymptotic variance of the calibration estimator.
(2) We examine the effect of alternative choices of starting weights for the calibration. The two natural choices for the
starting weights generally produce slightly different estimators. However, under certain conditions, these two estimators
have the same asymptotic variance.
(3) We re-examine variance estimation for the two-phase calibration estimator. A new procedure is proposed that can
improve significantly on the usual technique of conditioning on the phase-one sample. A simulation in section 10 serves
to validate the advantage of this new method.
(4) We compare the calibration approach with the traditional model-assisted regression technique which uses a linear
regression fit at two levels. We show that the model-assisted estimator has properties similar to a two-phase calibration
estimator.
Key Words: Auxiliary information; Two-phase regression estimator; Starting weights; Separate residual variance
estimator; Combined residual variance estimator.
1. Introduction

The term double sampling refers to sampling designs whose common feature is a selection of two probability samples, denoted $s_1$ and $s$, both of them subsets of the finite population of interest, given by $U = \{1, \ldots, k, \ldots, N\}$. The sample $s_1$ is realized and observed prior to $s$. A typical study variable is denoted by $y$; its value $y_k$ is obtained only for the units $k \in s$. The objective is to estimate the population $y$-total $Y = \sum_U y_k$ (if $A$ is a set of units, $A \subseteq U$, then we write $\sum_A$ as a short form for $\sum_{k \in A}$ when there is no ambiguity).

Hidiroglou (2001) discusses two types of double sampling, nested and non-nested. This paper focuses on the nested type, usually referred to as two-phase sampling: The phase-two sample $s$ is a sub-sample from the phase-one sample $s_1$ drawn from $U$, so $s \subseteq s_1 \subseteq U$.

Estimation for two-phase sampling has been examined in several earlier papers in a context where two kinds of auxiliary information are recognized and addressed by their levels: At the population level, the total $\sum_U \mathbf{x}_{1k}$ is known, where $\mathbf{x}_{1k}$ is a vector known for every $k \in s_1$; therefore, it is also known for every $k \in s$. At the level of the first sample, the vector value $\mathbf{x}_{2k}$ is observed for every $k \in s_1$, and is thereby known for every $k \in s$; the total $\sum_U \mathbf{x}_{2k}$ is unknown but can be estimated without design bias at the $s_1$-level. Two arguments are found in the literature for incorporating these two types of auxiliary information in estimating $Y = \sum_U y_k$: the regression fit argument and the calibration argument. Under certain conditions they can lead to identical estimators, but this is not so in general.

The regression fit argument prevails in Särndal and Swensson (1987), Särndal, Swensson and Wretman (1992), Sitter (1997), Hidiroglou and Särndal (1998), Axelson (1998) and Hidiroglou, Rao and Haziza (2006). The calibration approach in Deville and Särndal (1992) was applied to two-phase sampling by Dupont (1995). She compares the resulting calibration estimators with those obtained from the regression approach. For the same auxiliary information, the two approaches may not give
1. Victor M. Estevao, Business Survey Methods Division, Statistics Canada, Ottawa, Ontario, Canada, K1A 0T6. E-mail: victor.estevao@statcan.gc.ca; Carl-Erik Särndal, professor. E-mail: carl.sarndal@rogers.com.
identical estimators, although in practice the difference is likely to be of little consequence. Resampling for two-phase variance estimation is considered in Kott and Stukel (1997). Estevao and Särndal (2002) focus on the calibration argument and distinguish ten different ways to use all or part of the information available at the two levels. The present paper also focuses on the calibration approach. It extends earlier work by recognizing three (rather than two) types of auxiliary information, each having different characteristics.

In the regression approach, it is natural to fit two linear least squares regressions. One set of regression-predicted $y$-values is produced for $k \in s_1$ using both $\mathbf{x}_{1k}$ and $\mathbf{x}_{2k}$ as predictors; another set is produced for $k \in s_1$ using only the vector $\mathbf{x}_{1k}$ as predictor. Both sets of predicted $y$-values, as well as the known total $\sum_U \mathbf{x}_{1k}$, are used to build the regression-type estimator of $Y$, in the manner described in section 9.

The calibration approach is motivated by two factors: to create a set of weights that are consistent with known or estimated totals for the auxiliary variables, and to reduce the variance of the estimates made for the study variable(s). We want the weights $w_k$ in $\hat{Y}_{2P} = \sum_s w_k y_k$ to achieve consistency with the total $\sum_U \mathbf{x}_{1k}$ known at the level of the population and/or with an (approximately) unbiased estimate, made at the level of the phase-one sample, of the unknown $\sum_U \mathbf{x}_{2k}$. Since $y$ is observed only at the ultimate level (the phase-two sample), consistency at higher levels on important auxiliary variables will often significantly reduce the variance of $\hat{Y}_{2P} = \sum_s w_k y_k$. We can distinguish two steps in the process leading to the weights $w_k$, a phase-one calibration and a phase-two calibration.

The two-phase sampling design is as follows: From the finite population of units $U = \{1, 2, \ldots, k, \ldots, N\}$ we select a phase-one sample $s_1$. The known positive inclusion probability of unit $k$ is $\pi_{1k} = \Pr(k \in s_1)$, and the phase-one design weight is $a_{1k} = 1/\pi_{1k}$. Certain variables may be observed for the units $k \in s_1$. Then, conditionally on $s_1$, we select a phase-two sample $s$ from $s_1$. The known and positive conditional inclusion probability of $k$ is $\pi_{2k} = \Pr(k \in s \mid s_1)$ for $k \in s_1$, and the conditional phase-two design weight is $a_{2k} = 1/\pi_{2k}$. (To keep the notation simple, we use $\pi_{2k}$ and $a_{2k}$ rather than the more suggestive $\pi_{2k \mid s_1}$ and $a_{2k \mid s_1}$; it should be kept in mind that both $\pi_{2k}$ and $a_{2k}$ are conditional on the phase-one sample $s_1$.) The combined or double-expansion design weight is $a_k = a_{1k} a_{2k}$ for $k \in s$. The analysis of the estimators in this article is design based. The term (approximately) unbiased means (approximately) design unbiased. We assume mild conditions on the population and the two sampling designs, permitting us to discard lower order terms in the analysis of our estimators when the expected sizes of the phase-one and phase-two samples are sufficiently large.

The double-expansion estimator $\sum_s a_k y_k$ is unbiased for $Y = \sum_U y_k$. We can produce more efficient estimators by taking into account the available auxiliary information. Three types or sets of auxiliary variables (called x-variables) can be distinguished for two-phase sampling designs. Their information characteristics are specified in the following table.

Table 1.1
Sets of auxiliary variables for calibration in two-phase sampling

| Set of auxiliary variables | Auxiliary variable total over $U$ | Unit variable values for $k \in s_1$ | Unit variable values for $k \in s$ |
|---|---|---|---|
| First set | known | known | known |
| Second set | known | unknown | known |
| Third set | unknown | known | known |

Each set may contain any number of x-variables. The three sets are mutually exclusive. The properties in the last three columns apply to every x-variable in the corresponding set. All x-variables used for calibration belong to one of these three sets.

2. Phase-one calibration

For the phase-one calibration, we use a vector $\mathbf{x}_{1k}$ of auxiliary variables selected from the first set in Table 1.1. While it is natural to let $\mathbf{x}_{1k}$ consist of all the variables in this set, the general presentation here allows us to define $\mathbf{x}_{1k}$ to include some or even none of them. The phase-one calibration weights $w_{1k}$ are derived by modifying the phase-one starting weights $a_{1k}$ subject to the calibration constraint $\sum_{s_1} w_{1k} \mathbf{x}_{1k} = \sum_U \mathbf{x}_{1k}$. In our formulation, the calibration weights are given for $k \in s_1$ as

$$w_{1k} = a_{1k}\Big\{1 + (\mathbf{X}_1 - \hat{\mathbf{X}}_1)' \Big(\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{1k}\Big)^{-1} \mathbf{z}_{1k}\Big\} \tag{2.1}$$

where $\mathbf{X}_1 = \sum_U \mathbf{x}_{1k}$, $\hat{\mathbf{X}}_1 = \sum_{s_1} a_{1k}\mathbf{x}_{1k}$ and $\mathbf{z}_{1k}$ is an instrumental vector of the same dimension as $\mathbf{x}_{1k}$. It replaces $\mathbf{x}_{1k}/\sigma_{1k}^2$ in the form of the model-assisted estimator described by Särndal, Swensson and Wretman (1992), and permits a more general specification of the calibration weights. The use of an instrumental vector is discussed in Estevao and Särndal (2000) and Deville (2002). Here and in the following, we always assume the invertibility of matrices such as the one over $s_1$ in (2.1) and those (over $s$ and $U$) appearing later.

3. Phase-two calibration

We use a vector $\mathbf{x}_k$ of auxiliary variables to produce a set of phase-two (or final) calibration weights $w_k$. They are

used to calculate $\hat{Y}_{2P} = \sum_s w_k y_k$ as our estimator of $Y = \sum_U y_k$. The vector $\mathbf{x}_k = (\mathbf{x}'_{k(t)}, \mathbf{x}'_{k(w)}, \mathbf{x}'_{k(a)})'$ has three components, as described below. No auxiliary variable can appear in more than one of the three vector components. These three components have different roles in the setup of the phase-two calibration equation $\sum_s w_k \mathbf{x}_k = \mathbf{X}$ and in the determination of the phase-two calibration weights.

The variables in the vector $\mathbf{x}_{k(t)}$ are selected from among those in the first two sets of Table 1.1. This means that the total $\sum_U \mathbf{x}_{k(t)}$ is known and can be included in $\mathbf{X}$. Variables in $\mathbf{x}_{1k}$ are allowed to reoccur in $\mathbf{x}_{k(t)}$, and this is usually preferable in order to reduce the variance of the estimator. We can specify $\mathbf{x}_{k(t)} = \mathbf{x}_{1k}$, but our framework permits $\mathbf{x}_{k(t)}$ to include variables from the second set. This allows us to use variables with known population totals in situations where the variables are too expensive to collect for a large phase-one sample $s_1$ but are observable for the smaller phase-two sample $s$. These variables are excluded from the phase-one calibration because they are unavailable for $k \in s_1$.

The variables in $\mathbf{x}_{k(w)}$ and $\mathbf{x}_{k(a)}$ are selected from among those in the sets of Table 1.1, provided they are not already included in $\mathbf{x}_{k(t)}$. The variables in $\mathbf{x}_{k(w)}$ are those for which we want to satisfy the phase-two calibration equation $\sum_s w_k \mathbf{x}_{k(w)} = \sum_{s_1} w_{1k}\mathbf{x}_{k(w)}$, where the right-hand side is approximately unbiased for $\sum_U \mathbf{x}_{k(w)}$. The variables in $\mathbf{x}_{k(a)}$ are those for which we want to satisfy the phase-two calibration equation $\sum_s w_k \mathbf{x}_{k(a)} = \sum_{s_1} a_{1k}\mathbf{x}_{k(a)}$. Here, the right-hand side is unbiased for $\sum_U \mathbf{x}_{k(a)}$. The inclusion of both $\mathbf{x}_{k(w)}$ and $\mathbf{x}_{k(a)}$ in the definition of $\mathbf{x}_k$ allows us to calibrate on one or both of these vectors and provides a general framework for producing different estimators from the phase-two calibration.

The phase-two calibration equation is $\sum_s w_k \mathbf{x}_k = \mathbf{X}$, where $\mathbf{X}$ is the stacked auxiliary vector

$$\mathbf{X} = \begin{pmatrix} \sum_U \mathbf{x}_{k(t)} \\ \sum_{s_1} w_{1k}\mathbf{x}_{k(w)} \\ \sum_{s_1} a_{1k}\mathbf{x}_{k(a)} \end{pmatrix}. \tag{3.1}$$

A specific variable can only occur once in $\mathbf{x}_k$. Otherwise, the calibration equation may be inconsistent and admit no solution.

The starting weights for the phase-two calibration are denoted by $a_k^{\circ}$ for $k \in s$. There is more than one reasonable choice for the $a_k^{\circ}$. We consider two alternatives, both of which seem natural: (1) $a_k^{\circ} = a_k = a_{1k}a_{2k}$, and (2) $a_k^{\circ} = w_{1k}a_{2k}$, where $w_{1k}$ is the phase-one calibration weight given by (2.1).

Given the starting weights $a_k^{\circ}$, we determine final weights $w_k$ subject to the calibration equation $\sum_s w_k \mathbf{x}_k = \mathbf{X}$. These final weights are given for $k \in s$ by

$$w_k = a_k^{\circ}\Big\{1 + (\mathbf{X} - \hat{\mathbf{X}})' \Big(\sum_s a_k^{\circ}\mathbf{z}_k\mathbf{x}'_k\Big)^{-1} \mathbf{z}_k\Big\} \tag{3.2}$$

where $\hat{\mathbf{X}} = \sum_s a_k^{\circ}\mathbf{x}_k$ is an unbiased or approximately unbiased estimator of $\mathbf{X}$, depending on the composition of $\mathbf{x}_k$. The instrumental variable $\mathbf{z}_k$ has the same dimension as $\mathbf{x}_k$. The vectors $\mathbf{z}_{1k}$ and $\mathbf{z}_k$ are assumed to be fixed functions of $\mathbf{x}_{1k}$ and $\mathbf{x}_k$. How to choose $\mathbf{z}_{1k}$ and $\mathbf{z}_k$ is a topic we leave for others to address.

4. Comparison of two options for the starting weights

The objective in this section is to analyze how the final weights $w_k$ in $\hat{Y}_{2P} = \sum_s w_k y_k$ depend on the specification of the starting weights $a_k^{\circ}$ in (3.2). We consider two distinct cases based on whether or not the auxiliary variables $\mathbf{x}_k$ are used for the phase-two calibration. When we carry out the phase-two calibration, the two different choices for starting weights generally lead to different estimators. We show that these estimators are asymptotically equivalent under certain conditions, commonly found in practice. When we have no phase-two calibration, the two choices for starting weights lead to two other estimators that are usually less efficient than those obtained by performing the phase-two calibration.

4.1 Estimators with phase-two calibration ($\mathbf{x}_k \neq \emptyset$)

As noted previously, there are two alternatives for the starting weights $a_k^{\circ}$ in (3.2): (1) $a_k^{\circ} = a_k = a_{1k}a_{2k}$, and (2) $a_k^{\circ} = w_{1k}a_{2k}$, where $w_{1k}$ is the phase-one calibration weight given by (2.1). We now provide a detailed analysis of the form of the estimator under these two choices. In this subsection, we look at the more interesting case where we perform the phase-two calibration ($\mathbf{x}_k \neq \emptyset$). In the next subsection, we consider what happens when we do not carry out the phase-two calibration ($\mathbf{x}_k = \emptyset$).

Our procedure is as follows. First, we derive the linearized (asymptotic) form of $\hat{Y}_{2P}$ based on the general starting weights $a_k^{\circ}$. Then we substitute the two choices for $a_k^{\circ}$ in this expression. We determine $\hat{Y}_{2P}$ based on the starting weights $a_k^{\circ} = a_k = a_{1k}a_{2k}$. We denote this estimator by $\hat{Y}_{2Pa}$ and derive its linearized form, $\hat{Y}_{2Pa\,\mathrm{lin}}$. Similarly, we obtain $\hat{Y}_{2P}$ based on the starting weights $a_k^{\circ} = w_{1k}a_{2k}$. We refer to this estimator as $\hat{Y}_{2Pw}$ and derive its linearized form, $\hat{Y}_{2Pw\,\mathrm{lin}}$. These two forms are slightly different but we prove in Result 4.2 that $\hat{Y}_{2Pa\,\mathrm{lin}} = \hat{Y}_{2Pw\,\mathrm{lin}}$ under certain conditions.
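
The two calibration steps in (2.1) and (3.2) can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes simple random sampling without replacement at both phases, the instrument choices $\mathbf{z}_{1k} = \mathbf{x}_{1k}$ and $\mathbf{z}_k = \mathbf{x}_k$, an empty $\mathbf{x}_{k(a)}$ component, and a synthetic population. The helper name `calibrate` and all variable names are hypothetical.

```python
import numpy as np

def calibrate(start_w, x, z, target):
    """Linear calibration: w_k = a_k * {1 + (target - sum a_k x_k)' M^{-1} z_k},
    with M = sum a_k z_k x_k' (assumed invertible)."""
    m = (z * start_w[:, None]).T @ x                  # M = sum_k a_k z_k x_k'
    lam = np.linalg.solve(m.T, target - start_w @ x)  # lam = (M')^{-1} (target - estimate)
    return start_w * (1.0 + z @ lam)

rng = np.random.default_rng(2009)                     # synthetic population (assumption)
N, n1, n = 2000, 400, 100
u = rng.gamma(2.0, 2.0, N)                            # x-variable with known total over U
v = u + rng.normal(0.0, 1.0, N)                       # x-variable observed for k in s1, total unknown
y = 3.0 + 2.0 * u + v + rng.normal(0.0, 1.0, N)       # study variable, used only for k in s
x1 = np.column_stack([np.ones(N), u])                 # x_1k = (1, u_k)'
X1 = x1.sum(axis=0)                                   # known total over U

s1 = rng.choice(N, n1, replace=False)                 # phase one: SRS from U
sub = rng.choice(n1, n, replace=False)                # phase two: SRS from s1 (positions in s1)
s = s1[sub]
a1 = np.full(n1, N / n1)                              # a_1k = 1 / pi_1k
a2 = np.full(n, n1 / n)                               # a_2k = 1 / pi_2k

# Phase-one calibration (2.1) with z_1k = x_1k.
w1 = calibrate(a1, x1[s1], x1[s1], X1)

# Phase-two calibration (3.2) with x_k(t) = x_1k and x_k(w) = v_k.
xs = np.column_stack([x1[s], v[s]])
X = np.append(X1, w1 @ v[s1])                         # stacked vector (3.1) without an x_k(a) part
w_a = calibrate(a1[sub] * a2, xs, xs, X)              # starting weights a_1k * a_2k
w_w = calibrate(w1[sub] * a2, xs, xs, X)              # starting weights w_1k * a_2k

print("Y =", y.sum(), " Y_2Pa =", w_a @ y[s], " Y_2Pw =", w_w @ y[s])
```

Both sets of final weights reproduce the same stacked vector $\mathbf{X}$; the two point estimates printed at the end will in general differ slightly, which is the situation analyzed in this section.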

We start by inserting the weights $w_k$ into $\hat{Y}_{2P} = \sum_s w_k y_k$ and writing the estimator as

$$\begin{aligned}
\hat{Y}_{2P} ={}& \sum_U \mathbf{x}'_{k(t)}\mathbf{B}_{(y;x)(t)} + \sum_{s_1} w_{1k}\mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)} + \sum_{s_1} a_{1k}\mathbf{x}'_{k(a)}\mathbf{B}_{(y;x)(a)} \\
&+ \sum_s a_k^{\circ} e_{(y;x)k} + (\mathbf{X} - \hat{\mathbf{X}})'(\hat{\mathbf{B}}^{*}_{(y;x)} - \mathbf{B}_{(y;x)})
\end{aligned} \tag{4.1}$$

where $\hat{\mathbf{B}}^{*}_{(y;x)} = (\sum_s a_k^{\circ}\mathbf{z}_k\mathbf{x}'_k)^{-1}\sum_s a_k^{\circ}\mathbf{z}_k y_k$, $\mathbf{B}_{(y;x)} = (\sum_U \mathbf{z}_k\mathbf{x}'_k)^{-1}\sum_U \mathbf{z}_k y_k$ and $\mathbf{B}_{(y;x)} = (\mathbf{B}'_{(y;x)(t)}, \mathbf{B}'_{(y;x)(w)}, \mathbf{B}'_{(y;x)(a)})'$ is the partitioning corresponding to $\mathbf{x}_k = (\mathbf{x}'_{k(t)}, \mathbf{x}'_{k(w)}, \mathbf{x}'_{k(a)})'$. Our subscript notation of the form $(v_1; v_2)$ identifies the variables in the regression. The term $v_2$ refers to the independent variables and $v_1$ identifies the dependent variable or variables. For simplicity, the instrumental vectors $\mathbf{z}_{1k}$ and $\mathbf{z}_k$ are not included in the notation.

The term $e_{(y;x)k} = y_k - \mathbf{x}'_k\mathbf{B}_{(y;x)}$ is defined for $k \in U$. Note that although $e_{(y;x)k}$ looks like a regression residual, it does not arise as the result of fitting a proper regression model. We then develop the term $\sum_{s_1} w_{1k}\mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)}$ in (4.1) by inserting expression (2.1) for $w_{1k}$ and making use of the phase-one calibration equation $\sum_{s_1} w_{1k}\mathbf{x}_{1k} = \sum_U \mathbf{x}_{1k}$. We obtain

$$\sum_{s_1} w_{1k}\mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)} = \mathbf{X}'_1\mathbf{B}_{(xB_{(w)};x_1)} + \sum_{s_1} a_{1k} e_{(xB_{(w)};x_1)k} + (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\hat{\mathbf{B}}_{(xB_{(w)};x_1)} - \mathbf{B}_{(xB_{(w)};x_1)}) \tag{4.2}$$

where $\hat{\mathbf{B}}_{(xB_{(w)};x_1)} = (\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{1k})^{-1}\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)}$ converges in probability to (and is approximately unbiased for) $\mathbf{B}_{(xB_{(w)};x_1)} = (\sum_U \mathbf{z}_{1k}\mathbf{x}'_{1k})^{-1}\sum_U \mathbf{z}_{1k}\mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)}$, and $e_{(xB_{(w)};x_1)k} = \mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)} - \mathbf{x}'_{1k}\mathbf{B}_{(xB_{(w)};x_1)}$ is defined for $k \in U$.

We can interpret $e_{(xB_{(w)};x_1)k}$ as a residual arising from a population fit based on a generalized regression of $\mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)}$ as the dependent variable and $\mathbf{x}_{1k}$ as the predictor vector. Replacing expression (4.2) into expression (4.1) for $\hat{Y}_{2P}$ leads to

$$\begin{aligned}
\hat{Y}_{2P} ={}& \sum_U \big(\mathbf{x}'_{k(t)}\mathbf{B}_{(y;x)(t)} + \mathbf{x}'_{1k}\mathbf{B}_{(xB_{(w)};x_1)}\big) + \sum_{s_1} a_{1k}\big(\mathbf{x}'_{k(a)}\mathbf{B}_{(y;x)(a)} + e_{(xB_{(w)};x_1)k}\big) + \sum_s a_k^{\circ} e_{(y;x)k} \\
&+ (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\hat{\mathbf{B}}_{(xB_{(w)};x_1)} - \mathbf{B}_{(xB_{(w)};x_1)}) + (\mathbf{X} - \hat{\mathbf{X}})'(\hat{\mathbf{B}}^{*}_{(y;x)} - \mathbf{B}_{(y;x)}).
\end{aligned} \tag{4.3}$$

The following result establishes the relationship between the estimators obtained for the two choices of starting weights.

Result 4.1: The linearized forms of $\hat{Y}_{2Pa}$ and $\hat{Y}_{2Pw}$ are related by the equation $\hat{Y}_{2Pw\,\mathrm{lin}} = \hat{Y}_{2Pa\,\mathrm{lin}} + (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\mathbf{B}_{(y;x_1)} - \mathbf{B}_{(x;x_1)}\mathbf{B}_{(y;x)})$.

Proof

We consider expression (4.3) under the two possible choices for $a_k^{\circ}$. First, with $a_k^{\circ} = a_k = a_{1k}a_{2k}$ we obtain $\hat{Y}_{2Pa}$ given by

$$\begin{aligned}
\hat{Y}_{2Pa} ={}& \sum_U \big(\mathbf{x}'_{k(t)}\mathbf{B}_{(y;x)(t)} + \mathbf{x}'_{1k}\mathbf{B}_{(xB_{(w)};x_1)}\big) + \sum_{s_1} a_{1k}\big(\mathbf{x}'_{k(a)}\mathbf{B}_{(y;x)(a)} + e_{(xB_{(w)};x_1)k}\big) + \sum_s a_k e_{(y;x)k} \\
&+ (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\hat{\mathbf{B}}_{(xB_{(w)};x_1)} - \mathbf{B}_{(xB_{(w)};x_1)}) + (\mathbf{X} - \hat{\mathbf{X}})'(\hat{\mathbf{B}}_{(y;x)} - \mathbf{B}_{(y;x)}).
\end{aligned} \tag{4.4}$$

The term $\hat{\mathbf{B}}_{(y;x)} = (\sum_s a_k\mathbf{z}_k\mathbf{x}'_k)^{-1}\sum_s a_k\mathbf{z}_k y_k$ converges in probability to (and is approximately unbiased for) $\mathbf{B}_{(y;x)} = (\sum_U \mathbf{z}_k\mathbf{x}'_k)^{-1}\sum_U \mathbf{z}_k y_k$. The first term is constant and does not contribute to the variance of $\hat{Y}_{2Pa}$. The next two terms are random quantities, defined as sums over $s_1$ and $s$ respectively. The last two terms are products of differences with zero or almost zero expectation. As for the product $(\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\hat{\mathbf{B}}_{(xB_{(w)};x_1)} - \mathbf{B}_{(xB_{(w)};x_1)})$, both differences are functions of the phase-one sample $s_1$. We know that $\hat{\mathbf{X}}_1$ is unbiased for $\mathbf{X}_1$ and $\hat{\mathbf{B}}_{(xB_{(w)};x_1)}$ is approximately unbiased for $\mathbf{B}_{(xB_{(w)};x_1)}$. Under fairly general conditions, $N^{-1}(\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\hat{\mathbf{B}}_{(xB_{(w)};x_1)} - \mathbf{B}_{(xB_{(w)};x_1)}) = O_P(n_1^{-1})$, where $n_1$ is the expected size of $s_1$, assumed sufficiently large. By a similar reasoning, $N^{-1}(\mathbf{X} - \hat{\mathbf{X}})'(\hat{\mathbf{B}}_{(y;x)} - \mathbf{B}_{(y;x)}) = O_P(n^{-1})$, where $n$ is the expected size of $s$, also assumed sufficiently large. Consequently, we can drop the last two terms of (4.4), because they are of lower order than the preceding terms: $N^{-1}\sum_{s_1} a_{1k}(\mathbf{x}'_{k(a)}\mathbf{B}_{(y;x)(a)} + e_{(xB_{(w)};x_1)k})$ is $O_P(n_1^{-1/2})$ and $N^{-1}\sum_s a_k e_{(y;x)k}$ is $O_P(n^{-1/2})$. The first three terms define the linearized form of $\hat{Y}_{2Pa}$,

$$\hat{Y}_{2Pa\,\mathrm{lin}} = \sum_U \big(\mathbf{x}'_{k(t)}\mathbf{B}_{(y;x)(t)} + \mathbf{x}'_{1k}\mathbf{B}_{(xB_{(w)};x_1)}\big) + \sum_{s_1} a_{1k}\big(\mathbf{x}'_{k(a)}\mathbf{B}_{(y;x)(a)} + e_{(xB_{(w)};x_1)k}\big) + \sum_s a_k e_{(y;x)k}. \tag{4.5}$$

Now let us consider expression (4.3) under the second choice, $a_k^{\circ} = w_{1k}a_{2k}$. This leads to $\hat{Y}_{2Pw}$ given by

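$$\begin{aligned}
\hat{Y}_{2Pw} ={}& \sum_U \big(\mathbf{x}'_{k(t)}\mathbf{B}_{(y;x)(t)} + \mathbf{x}'_{1k}\mathbf{B}_{(xB_{(w)};x_1)}\big) + \sum_{s_1} a_{1k}\big(\mathbf{x}'_{k(a)}\mathbf{B}_{(y;x)(a)} + e_{(xB_{(w)};x_1)k}\big) + \sum_s a_k e_{(y;x)k} \\
&+ (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'\Big(\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{1k}\Big)^{-1}\sum_s a_k\mathbf{z}_{1k}e_{(y;x)k} + (\mathbf{X} - \hat{\mathbf{X}})'(\hat{\mathbf{B}}^{w}_{(y;x)} - \mathbf{B}_{(y;x)})
\end{aligned} \tag{4.6}$$

where $\hat{\mathbf{B}}^{w}_{(y;x)} = (\sum_s w_{1k}a_{2k}\mathbf{z}_k\mathbf{x}'_k)^{-1}\sum_s w_{1k}a_{2k}\mathbf{z}_k y_k$ and $\hat{\mathbf{X}} = \sum_s w_{1k}a_{2k}\mathbf{x}_k$. The first three terms of $\hat{Y}_{2Pw}$ are the same as those found in expression (4.4) for $\hat{Y}_{2Pa}$. The fourth and fifth terms differ from their counterparts in (4.4). Although $\hat{\mathbf{B}}^{w}_{(y;x)}$ and $\hat{\mathbf{X}}$ are functions of the phase-one calibration weights $w_{1k}$, we do not need to replace them in $\hat{\mathbf{B}}^{w}_{(y;x)}$ and $\hat{\mathbf{X}}$ in the fifth term; this would simply split the lower order term $(\mathbf{X} - \hat{\mathbf{X}})'(\hat{\mathbf{B}}^{w}_{(y;x)} - \mathbf{B}_{(y;x)})$ into other lower order terms. Therefore, we can drop the fifth term of (4.6) when the sample sizes are sufficiently large. The fourth term can be written as follows.

$$\begin{aligned}
(\mathbf{X}_1 - \hat{\mathbf{X}}_1)'\Big(\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{1k}\Big)^{-1}\sum_s a_k\mathbf{z}_{1k}e_{(y;x)k}
={}& (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\mathbf{B}_{(y;x_1)} - \mathbf{B}_{(x;x_1)}\mathbf{B}_{(y;x)}) \\
&+ (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\hat{\mathbf{B}}_{(y;x_1)s_1} - \mathbf{B}_{(y;x_1)}) \\
&- (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\hat{\mathbf{B}}_{(x;x_1)} - \mathbf{B}_{(x;x_1)})\mathbf{B}_{(y;x)} \\
&+ (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'\Big(\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{1k}\Big)^{-1}\Big(\sum_s a_k\mathbf{z}_{1k}e_{(y;x)k} - \sum_{s_1} a_{1k}\mathbf{z}_{1k}e_{(y;x)k}\Big).
\end{aligned} \tag{4.7}$$

The quantities in this expression are defined as follows: $\hat{\mathbf{B}}_{(x;x_1)} = (\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{1k})^{-1}\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_k$ and $\hat{\mathbf{B}}_{(y;x_1)s_1} = (\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{1k})^{-1}\sum_{s_1} a_{1k}\mathbf{z}_{1k}y_k$. The statistic $\hat{\mathbf{B}}_{(y;x_1)s_1}$ cannot be computed from the phase-one sample because the values $y_k$ are only known for $k \in s$. It is implicitly defined for the purpose of determining the linearized form. We can define such a construct in the same manner as $\hat{\mathbf{B}}_{(xB_{(w)};x_1)}$ is a function of the unknown quantity $\mathbf{B}_{(y;x)(w)}$. Now $\hat{\mathbf{B}}_{(y;x_1)s_1}$ is approximately unbiased for its corresponding population quantity $\mathbf{B}_{(y;x_1)} = (\sum_U \mathbf{z}_{1k}\mathbf{x}'_{1k})^{-1}\sum_U \mathbf{z}_{1k}y_k$. Similarly, $\hat{\mathbf{B}}_{(x;x_1)}$ is approximately unbiased for $\mathbf{B}_{(x;x_1)} = (\sum_U \mathbf{z}_{1k}\mathbf{x}'_{1k})^{-1}\sum_U \mathbf{z}_{1k}\mathbf{x}'_k$. As before, we can argue that the last three terms of (4.7) are of lower order than the first term $(\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\mathbf{B}_{(y;x_1)} - \mathbf{B}_{(x;x_1)}\mathbf{B}_{(y;x)})$, which provides the linear approximation. The substitution of this term into (4.6) leads to the linearized form of $\hat{Y}_{2Pw}$,

$$\begin{aligned}
\hat{Y}_{2Pw\,\mathrm{lin}} ={}& \sum_U \big(\mathbf{x}'_{k(t)}\mathbf{B}_{(y;x)(t)} + \mathbf{x}'_{1k}\mathbf{B}_{(xB_{(w)};x_1)}\big) + \sum_{s_1} a_{1k}\big(\mathbf{x}'_{k(a)}\mathbf{B}_{(y;x)(a)} + e_{(xB_{(w)};x_1)k}\big) + \sum_s a_k e_{(y;x)k} \\
&+ (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\mathbf{B}_{(y;x_1)} - \mathbf{B}_{(x;x_1)}\mathbf{B}_{(y;x)}).
\end{aligned} \tag{4.8}$$

Comparing (4.5) with (4.8), we see that $\hat{Y}_{2Pw\,\mathrm{lin}} = \hat{Y}_{2Pa\,\mathrm{lin}} + (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\mathbf{B}_{(y;x_1)} - \mathbf{B}_{(x;x_1)}\mathbf{B}_{(y;x)})$ as stated in the result. This completes the proof of Result 4.1.

Result 4.1 shows that in general, the linearized forms of $\hat{Y}_{2Pw}$ and $\hat{Y}_{2Pa}$ are not the same. However, they are the same under certain conditions. Let us consider the case of nested calibration (not to be confused with nested sampling), meaning that $\mathbf{x}_k$ includes $\mathbf{x}_{1k}$. Then $\mathbf{x}_k$ is of the form $\mathbf{x}_k = (\mathbf{x}'_{1k}, \mathbf{x}'_{+k})'$ where the vector $\mathbf{x}_{+k}$ is composed of the remaining variables. We now state and prove the following result.

Result 4.2: If $\mathbf{x}_k = (\mathbf{x}'_{1k}, \mathbf{x}'_{+k})'$ and $\mathbf{z}_k = (\mathbf{z}'_{1k}, \mathbf{z}'_{+k})'$ then $\hat{Y}_{2Pw\,\mathrm{lin}} = \hat{Y}_{2Pa\,\mathrm{lin}}$ and $\hat{Y}_{2Pa}$ and $\hat{Y}_{2Pw}$ are asymptotically equivalent.

Proof

The proof follows from Result 4.1 by showing $\mathbf{B}_{(y;x_1)} - \mathbf{B}_{(x;x_1)}\mathbf{B}_{(y;x)} = \mathbf{0}$ under the specified conditions. We have

$$\mathbf{B}_{(y;x_1)} - \mathbf{B}_{(x;x_1)}\mathbf{B}_{(y;x)} = \Big(\sum_U \mathbf{z}_{1k}\mathbf{x}'_{1k}\Big)^{-1}\Big(\sum_U \mathbf{z}_{1k}h_k\Big)$$

where $h_k = y_k - \mathbf{x}'_k(\sum_U \mathbf{z}_k\mathbf{x}'_k)^{-1}(\sum_U \mathbf{z}_k y_k)$. Since $\sum_U \mathbf{z}_k h_k = \mathbf{0}$ and we assume $\mathbf{z}_k = (\mathbf{z}'_{1k}, \mathbf{z}'_{+k})'$, it follows that $\sum_U \mathbf{z}_{1k}h_k = \mathbf{0}$ and $\mathbf{B}_{(y;x_1)} - \mathbf{B}_{(x;x_1)}\mathbf{B}_{(y;x)} = \mathbf{0}$. Therefore from Result 4.1, $\hat{Y}_{2Pw\,\mathrm{lin}} = \hat{Y}_{2Pa\,\mathrm{lin}}$. Since their linear forms are the same, $\hat{Y}_{2Pa}$ and $\hat{Y}_{2Pw}$ are asymptotically equivalent estimators.

Interestingly, Result 4.2 only requires that we include $\mathbf{x}_{1k}$ somewhere within $\mathbf{x}_k$. Obviously, it makes sense to include $\mathbf{x}_{1k}$ within the component $\mathbf{x}_{k(t)}$ of $\mathbf{x}_k$ because the $\mathbf{x}_1$-totals are known. However, we obtain the same asymptotic result as long as all variables in $\mathbf{x}_{1k}$ are included somewhere in $\mathbf{x}_k = (\mathbf{x}'_{k(t)}, \mathbf{x}'_{k(w)}, \mathbf{x}'_{k(a)})'$. In practice, we often find $\mathbf{x}_{k(t)} = \mathbf{x}_{1k}$ with $\mathbf{z}_{1k} = \mathbf{x}_{1k}$ and $\mathbf{z}_k = \mathbf{x}_k = (\mathbf{x}'_{1k}, \mathbf{x}'_{+k})'$ where $\mathbf{x}_{+k}$ is the vector for the remaining variables $\mathbf{x}_{k(w)}$ and $\mathbf{x}_{k(a)}$. This satisfies the requirements for the asymptotic equivalence of $\hat{Y}_{2Pa}$ and $\hat{Y}_{2Pw}$.

To study the properties of $\hat{Y}_{2Pa}$ and $\hat{Y}_{2Pw}$ we work with their linearized forms given respectively by (4.5) and (4.8). With appropriate definitions for the residuals $e_{0k}$, $e_{1k}$ and $e_{2k}$, we can represent $\hat{Y}_{2Pa\,\mathrm{lin}}$ and $\hat{Y}_{2Pw\,\mathrm{lin}}$ as the sum of three terms: a constant term $\sum_U e_{0k}$, a phase-one expansion term $\sum_{s_1} a_{1k}e_{1k}$, and a double-expansion term $\sum_s a_k e_{2k}$,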

$$\hat{Y}_{2P\,\mathrm{lin}} = \sum_U e_{0k} + \sum_{s_1} a_{1k}e_{1k} + \sum_s a_k e_{2k}. \tag{4.9}$$

This makes (4.9) a suitable starting point for studying the bias and the asymptotic variance of the two estimators $\hat{Y}_{2Pa}$ and $\hat{Y}_{2Pw}$.

For the linearized form $\hat{Y}_{2Pa\,\mathrm{lin}}$ given by (4.5), the three residual quantities are defined as follows for $k \in U$:

$$\begin{aligned}
e_{0k} &= \mathbf{x}'_{k(t)}\mathbf{B}_{(y;x)(t)} + \mathbf{x}'_{1k}\mathbf{B}_{(xB_{(w)};x_1)} \\
e_{1k} &= \mathbf{x}'_{k(a)}\mathbf{B}_{(y;x)(a)} + \mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)} - \mathbf{x}'_{1k}\mathbf{B}_{(xB_{(w)};x_1)} \\
e_{2k} &= y_k - \mathbf{x}'_{k(t)}\mathbf{B}_{(y;x)(t)} - \mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)} - \mathbf{x}'_{k(a)}\mathbf{B}_{(y;x)(a)}.
\end{aligned} \tag{4.10}$$

Note that $e_{2k}$ is simply $e_{(y;x)k}$. Similarly, for $\hat{Y}_{2Pw\,\mathrm{lin}}$ given by (4.8), the residuals have the following definitions for $k \in U$:

$$\begin{aligned}
e_{0k} &= \mathbf{x}'_{k(t)}\mathbf{B}_{(y;x)(t)} + \mathbf{x}'_{1k}\big(\mathbf{B}_{(xB_{(w)};x_1)} + \mathbf{B}_{(y;x_1)} - \mathbf{B}_{(x;x_1)}\mathbf{B}_{(y;x)}\big) \\
e_{1k} &= \mathbf{x}'_{k(a)}\mathbf{B}_{(y;x)(a)} + \mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)} - \mathbf{x}'_{1k}\big(\mathbf{B}_{(xB_{(w)};x_1)} + \mathbf{B}_{(y;x_1)} - \mathbf{B}_{(x;x_1)}\mathbf{B}_{(y;x)}\big) \\
e_{2k} &= y_k - \mathbf{x}'_{k(t)}\mathbf{B}_{(y;x)(t)} - \mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)} - \mathbf{x}'_{k(a)}\mathbf{B}_{(y;x)(a)}.
\end{aligned} \tag{4.11}$$

Note that in both cases, $e_{0k} + e_{1k} + e_{2k} = y_k$ for every $k$, and hence $\sum_U (e_{0k} + e_{1k} + e_{2k}) = \sum_U y_k = Y$. This additivity allows us to prove in section 5 that $\hat{Y}_{2Pa}$ and $\hat{Y}_{2Pw}$ are approximately unbiased. To save space, we concentrate on the properties of $\hat{Y}_{2Pa}$ in the remaining sections. However, the analysis is similar for $\hat{Y}_{2Pw}$ and the method for variance estimation proposed in section 7 can also be used for this estimator.

4.2 Estimators without the phase-two calibration ($\mathbf{x}_k = \emptyset$)

If there is no phase-two calibration ($\mathbf{x}_k = \emptyset$), then $w_k = a_k^{\circ}$. Accordingly, the final weights are either $w_k = a_k = a_{1k}a_{2k}$ or $w_k = w_{1k}a_{2k}$. The first alternative gives the double-expansion estimator $\sum_s a_k y_k$. The second produces a different estimator that is usually more efficient. However, both of these are generally inefficient compared to the estimators obtained by carrying out the phase-two calibration. The linearized form of the two-phase estimator with $w_k = w_{1k}a_{2k}$ is obtained by writing it as follows.

$$\begin{aligned}
\hat{Y}_{2P} ={}& \mathbf{X}'_1\mathbf{B}_{(y;x_1)} - \hat{\mathbf{X}}'_1\mathbf{B}_{(y;x_1)} + \sum_s a_k y_k + (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\hat{\mathbf{B}}_{(y;x_1)s_1} - \mathbf{B}_{(y;x_1)}) \\
&+ (\mathbf{X}_1 - \hat{\mathbf{X}}_1)'\Big(\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{1k}\Big)^{-1}\Big(\sum_s a_k\mathbf{z}_{1k}y_k - \sum_{s_1} a_{1k}\mathbf{z}_{1k}y_k\Big).
\end{aligned} \tag{4.12}$$

The terms $\hat{\mathbf{B}}_{(y;x_1)s_1}$ and $\mathbf{B}_{(y;x_1)}$ were defined in the previous section. When the samples are sufficiently large, $(\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{1k})^{-1}(\sum_s a_k\mathbf{z}_{1k}y_k - \sum_{s_1} a_{1k}\mathbf{z}_{1k}y_k)$ and $(\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\hat{\mathbf{B}}_{(y;x_1)s_1} - \mathbf{B}_{(y;x_1)})$ are of lower order and can be ignored. This leads to the linearized form of this estimator.

$$\hat{Y}_{2P\,\mathrm{lin}} = \mathbf{X}'_1\mathbf{B}_{(y;x_1)} - \hat{\mathbf{X}}'_1\mathbf{B}_{(y;x_1)} + \sum_s a_k y_k. \tag{4.13}$$

We can also write this linearized form as a sum (4.9) of three residual terms, with the residuals $e_{0k}$, $e_{1k}$ and $e_{2k}$ having the following definitions for $k \in U$.

$$\begin{aligned}
e_{0k} &= \mathbf{x}'_{1k}\mathbf{B}_{(y;x_1)} \\
e_{1k} &= -\mathbf{x}'_{1k}\mathbf{B}_{(y;x_1)} \\
e_{2k} &= y_k.
\end{aligned} \tag{4.14}$$

These residuals show a resemblance to those given by (4.10) if we set $\mathbf{x}_k = \emptyset$ and remove $\mathbf{B}_{(y;x)}$. Note how $\mathbf{B}_{(y;x_1)}$ has the same role as $\mathbf{B}_{(xB_{(w)};x_1)}$ in (4.10). As before, $e_{0k} + e_{1k} + e_{2k} = y_k$ for every $k$, and hence $\sum_U (e_{0k} + e_{1k} + e_{2k}) = \sum_U y_k = Y$.

The double-expansion estimator is a special case of this estimator when we also have $\mathbf{x}_{1k} = \emptyset$. This means that $\mathbf{B}_{(y;x_1)}$ is not defined. The corresponding definitions for $e_{0k}$, $e_{1k}$ and $e_{2k}$ are simply $e_{0k} = 0$, $e_{1k} = 0$ and $e_{2k} = y_k$ for $k \in U$.

In the following sections, we examine the bias and variance of the two-phase calibration estimator $\hat{Y}_{2Pa}$ and we propose a new method for estimation of variance. We can derive corresponding results when there is no phase-two calibration because the residuals for these two groups of estimators have similar properties and linearized form. The only difference occurs in the estimation of variance. We use the same variance estimator (as described in section 7) but the residuals are estimated by using $\hat{e}_{1k} = -\mathbf{x}'_{1k}\hat{\mathbf{B}}_{(y;x_1)s}$ where $\hat{\mathbf{B}}_{(y;x_1)s} = (\sum_s a_k\mathbf{z}_{1k}\mathbf{x}'_{1k})^{-1}\sum_s a_k\mathbf{z}_{1k}y_k$, and $\hat{e}_{2k} = y_k$.
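
The additivity of the residuals in (4.10) and the condition behind Result 4.2 can be checked numerically at the population level. The sketch below is an illustration only, not part of the paper: it assumes a synthetic population, one variable in each of $\mathbf{x}_{k(w)}$ and $\mathbf{x}_{k(a)}$, $\mathbf{x}_{k(t)} = \mathbf{x}_{1k}$, and the instruments $\mathbf{z}_{1k} = \mathbf{x}_{1k}$, $\mathbf{z}_k = \mathbf{x}_k$; all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)                       # synthetic population (assumption)
N = 1000
u = rng.gamma(2.0, 2.0, N)
v = u + rng.normal(0.0, 1.0, N)                      # plays the role of x_k(w)
g = rng.normal(5.0, 1.0, N)                          # plays the role of x_k(a)
y = 1.0 + 2.0 * u + v - 0.5 * g + rng.normal(0.0, 1.0, N)

x1 = np.column_stack([np.ones(N), u])                # x_1k, also used as x_k(t)
x = np.column_stack([x1, v, g])                      # x_k = (x_k(t)', x_k(w), x_k(a))'
z1, z = x1, x                                        # instruments z_1k = x_1k, z_k = x_k

# Population coefficients B_(y;x) and its partition into (t), (w), (a) parts.
B = np.linalg.solve(z.T @ x, z.T @ y)
B_t, B_w, B_a = B[:2], B[2], B[3]

# B_(xB(w);x1): regression of x_k(w) * B_(y;x)(w) on x_1k over U.
B_xBw = np.linalg.solve(z1.T @ x1, z1.T @ (v * B_w))

# Residuals (4.10).
e0 = x1 @ B_t + x1 @ B_xBw
e1 = g * B_a + v * B_w - x1 @ B_xBw
e2 = y - x @ B

# Quantity appearing in Result 4.1; Result 4.2 states it vanishes for nested calibration.
B_y_x1 = np.linalg.solve(z1.T @ x1, z1.T @ y)        # B_(y;x1)
B_x_x1 = np.linalg.solve(z1.T @ x1, z1.T @ x)        # B_(x;x1)
gap = B_y_x1 - B_x_x1 @ B

print(np.abs(e0 + e1 + e2 - y).max(), np.abs(gap).max())
```

Both printed quantities should be zero up to floating-point rounding: the first reflects $e_{0k} + e_{1k} + e_{2k} = y_k$, the second the identity $\mathbf{B}_{(y;x_1)} - \mathbf{B}_{(x;x_1)}\mathbf{B}_{(y;x)} = \mathbf{0}$ when $\mathbf{z}_k$ contains $\mathbf{z}_{1k}$.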

5. Bias and variance of the two-phase calibration estimator $\hat{Y}_{2Pa}$

The two-phase calibration estimator $\hat{Y}_{2Pa} = \sum_s w_k y_k$ is approximately unbiased for $Y = \sum_U y_k$. To show this, we derive the expectation of the linearized form given by (4.9) via the usual method of conditioning on the phase-one sample $s_1$. We have $E(\sum_s a_k e_{2k}) = E_{s_1}E_{s \mid s_1}(\sum_s a_k e_{2k}) = E_{s_1}(\sum_{s_1} a_{1k}e_{2k}) = \sum_U e_{2k}$, $E_{s_1}(\sum_{s_1} a_{1k}e_{1k}) = \sum_U e_{1k}$, and $\sum_U e_{0k}$ is a constant term, so

$$E(\hat{Y}_{2Pa\,\mathrm{lin}}) = \sum_U (e_{0k} + e_{1k} + e_{2k}) = \sum_U y_k = Y.$$

This shows that $\hat{Y}_{2Pa\,\mathrm{lin}}$ is unbiased for $Y$. By (4.4), $\hat{Y}_{2Pa} = \hat{Y}_{2Pa\,\mathrm{lin}} + R$, so the bias of $\hat{Y}_{2Pa}$ equals the expectation of $R$, which is the sum of the two lower order terms $(\mathbf{X}_1 - \hat{\mathbf{X}}_1)'(\hat{\mathbf{B}}_{(xB_{(w)};x_1)} - \mathbf{B}_{(xB_{(w)};x_1)})$ and $(\mathbf{X} - \hat{\mathbf{X}})'(\hat{\mathbf{B}}_{(y;x)} - \mathbf{B}_{(y;x)})$. As pointed out in section 4, each of these terms has expectation close to zero. It follows that $\hat{Y}_{2Pa}$ is approximately unbiased for $Y$.

The variance of $\hat{Y}_{2Pa} = \sum_s w_k y_k$ is closely approximated by the variance of the linearized form $\hat{Y}_{2Pa\,\mathrm{lin}}$ given by (4.9) with residuals defined by (4.10). Its first term, $\sum_U e_{0k}$, is constant and does not contribute to the variance. Therefore,

$$V(\hat{Y}_{2Pa\,\mathrm{lin}}) = V\Big(\sum_{s_1} a_{1k}e_{1k} + \sum_s a_k e_{2k}\Big). \tag{5.1}$$

We use (5.1) as the starting point for deriving a variance estimator for $\hat{Y}_{2Pa\,\mathrm{lin}}$. Two different approaches can be used and it is of interest to compare them. The one in section 7 is new and more interesting because it produces a more efficient variance estimator than the one in section 8, derived by the traditional technique of conditioning on the phase-one sample $s_1$. The residuals $e_{1k}$ and $e_{2k}$ given by (4.10) play an important role in both derivations.

6. Preliminaries for variance estimation

Our objective is to estimate the variance $V(\hat{Y}_{2Pa\,\mathrm{lin}})$ given by (5.1). This is done in sections 7 and 8 by two different arguments. The residuals $e_{1k}$ and $e_{2k}$ are defined for all $k \in U$ but they cannot be computed. They must be replaced by estimates $\hat{e}_{1k}$ and $\hat{e}_{2k}$. These estimates, formed in the image of (4.10), are

$$\begin{aligned}
\hat{e}_{1k} &= \mathbf{x}'_{k(a)}\hat{\mathbf{B}}_{(y;x)(a)} + \mathbf{x}'_{k(w)}\hat{\mathbf{B}}_{(y;x)(w)} - \mathbf{x}'_{1k}\hat{\mathbf{B}}_{(x\hat{B}_{(w)};x_1)} && \text{for } k \in s_1 \\
\hat{e}_{2k} &= y_k - \mathbf{x}'_k\hat{\mathbf{B}}_{(y;x)} \\
&= y_k - \mathbf{x}'_{k(t)}\hat{\mathbf{B}}_{(y;x)(t)} - \mathbf{x}'_{k(w)}\hat{\mathbf{B}}_{(y;x)(w)} - \mathbf{x}'_{k(a)}\hat{\mathbf{B}}_{(y;x)(a)} && \text{for } k \in s
\end{aligned} \tag{6.1}$$

where

$$\begin{aligned}
\hat{\mathbf{B}}_{(y;x)} &= \Big(\sum_s a_k\mathbf{z}_k\mathbf{x}'_k\Big)^{-1}\sum_s a_k\mathbf{z}_k y_k = \big(\hat{\mathbf{B}}'_{(y;x)(t)}, \hat{\mathbf{B}}'_{(y;x)(w)}, \hat{\mathbf{B}}'_{(y;x)(a)}\big)' \\
\text{and}\quad \hat{\mathbf{B}}_{(x\hat{B}_{(w)};x_1)} &= \Big(\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{1k}\Big)^{-1}\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{k(w)}\hat{\mathbf{B}}_{(y;x)(w)}.
\end{aligned} \tag{6.2}$$

The term $\hat{\mathbf{B}}_{(x\hat{B}_{(w)};x_1)}$ in the definition of $\hat{e}_{1k}$ is the estimate of $\mathbf{B}_{(xB_{(w)};x_1)} = (\sum_U \mathbf{z}_{1k}\mathbf{x}'_{1k})^{-1}\sum_U \mathbf{z}_{1k}\mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)}$ in (4.10). Two replacements are required in $\mathbf{B}_{(xB_{(w)};x_1)}$ to arrive at $\hat{\mathbf{B}}_{(x\hat{B}_{(w)};x_1)}$: First, sums over $U$ are replaced by appropriately weighted sums over $s_1$, giving $\hat{\mathbf{B}}_{(xB_{(w)};x_1)} = (\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{1k})^{-1}\sum_{s_1} a_{1k}\mathbf{z}_{1k}\mathbf{x}'_{k(w)}\mathbf{B}_{(y;x)(w)}$. In this expression, $\mathbf{B}_{(y;x)(w)}$ is still unknown, so we replace it by its estimate $\hat{\mathbf{B}}_{(y;x)(w)}$ to arrive at $\hat{\mathbf{B}}_{(x\hat{B}_{(w)};x_1)}$.

A key point to note is that estimates $\hat{e}_{1k}$ can be obtained for $k \in s_1$, because $\mathbf{x}_{k(a)}$, $\mathbf{x}_{k(w)}$ and $\mathbf{x}_{1k}$ are all known for $k \in s_1$, but estimates $\hat{e}_{2k}$ can only be made for $k \in s$, because $y_k$ is available only for $k \in s$. The fact that the estimates $\hat{e}_{1k}$ are available for $k \in s_1$ rather than $k \in s$ allows us to construct (in section 7) a more efficient estimator of $V(\hat{Y}_{2Pa\,\mathrm{lin}})$ than the traditional approach to variance estimation (in section 8) where all estimated residuals are calculated only for $k \in s$.

The design weights $a_{1k} = 1/\pi_{1k}$, $a_{2k} = 1/\pi_{2k}$ and $a_k = a_{1k}a_{2k}$ were defined in section 1. In the following sections, we also need the quantities given below, defined as functions of the second-order inclusion probabilities $\pi_{1k\ell} = \Pr(k \,\&\, \ell \in s_1)$ and $\pi_{2k\ell} = \Pr(k \,\&\, \ell \in s \mid s_1)$:

$$a_{1k\ell} = 1/\pi_{1k\ell}, \quad a_{2k\ell} = 1/\pi_{2k\ell}, \quad a_{k\ell} = a_{1k\ell}a_{2k\ell}$$

$$D_{1k\ell} = a_{1k}a_{1\ell} - a_{1k\ell}, \quad D_{2k\ell} = a_{2k}a_{2\ell} - a_{2k\ell}, \quad D_{k\ell} = a_k a_\ell - a_{k\ell}.$$

Here, $\pi_{2k\ell}$ and $a_{2k\ell}$ are conditional on the sample $s_1$. All first-order and second-order inclusion probabilities are assumed positive. Using this notation and the above results, we now develop two different variance estimators in the next two sections.

7. The separate residual variance estimator

The variance of $\hat{Y}_{2Pa\,\mathrm{lin}}$ is given by (5.1), where $e_{1k}$ and $e_{2k}$ are defined by (4.10). It can be expanded as

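\[
V(\hat{Y}_{2Pa\,\mathrm{lin}}) = V\Big( \sum_{s_1} a_{1k} e_{1k} \Big) + V\Big( \sum_{s} a_k e_{2k} \Big) + 2\,\mathrm{Cov}\Big( \sum_{s_1} a_{1k} e_{1k},\ \sum_{s} a_k e_{2k} \Big). \tag{7.1}
\]

If we knew the residuals $e_{1k}$ and $e_{2k}$, unbiased estimates for these three components would be given respectively by

\[
\sum_{k \in s_1} \sum_{\ell \in s_1} D_{1k\ell}\, e_{1k} e_{1\ell}, \qquad
\sum_{k \in s} \sum_{\ell \in s} D_{k\ell}\, e_{2k} e_{2\ell}, \qquad
2 \sum_{k \in s_1} \sum_{\ell \in s} D_{1k\ell}\, a_{2\ell}\, e_{1k} e_{2\ell}. \tag{7.2}
\]

The proof of unbiasedness is similar for all three components. For example, for the second one, we have

\[
\begin{aligned}
E_{s_1} E_{s|s_1}\Big( \sum_{k \in s} \sum_{\ell \in s} D_{k\ell}\, e_{2k} e_{2\ell} \Big)
&= E_{s_1}\Big( \sum_{k \in s_1} \sum_{\ell \in s_1} (D_{k\ell}/a_{2k\ell})\, e_{2k} e_{2\ell} \Big) \\
&= \sum_{k \in U} \sum_{\ell \in U} (D_{k\ell}/a_{k\ell})\, e_{2k} e_{2\ell} \\
&= \sum_{k \in U} \sum_{\ell \in U} (a_k a_\ell/a_{k\ell})\, e_{2k} e_{2\ell} - \Big( \sum_U e_{2k} \Big)^2 \\
&= E\Big( \sum_{s} a_k e_{2k} \Big)^2 - \Big[ E\Big( \sum_{s} a_k e_{2k} \Big) \Big]^2 \\
&= V\Big( \sum_{s} a_k e_{2k} \Big).
\end{aligned}
\]

We now replace the unknown residuals in (7.2) by the respective estimates given by (6.1); that is, $e_{1k}$ by $\hat{e}_{1k}$ for $k \in s_1$ and $e_{2k}$ by $\hat{e}_{2k}$ for $k \in s$. Then, the resulting three components are added to arrive at the separate residual variance estimator

\[
\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}}) = \sum_{k \in s_1} \sum_{\ell \in s_1} D_{1k\ell}\, \hat{e}_{1k} \hat{e}_{1\ell}
+ \sum_{k \in s} \sum_{\ell \in s} D_{k\ell}\, \hat{e}_{2k} \hat{e}_{2\ell}
+ 2 \sum_{k \in s_1} \sum_{\ell \in s} D_{1k\ell}\, a_{2\ell}\, \hat{e}_{1k} \hat{e}_{2\ell}. \tag{7.3}
\]

The term separate residual and the corresponding subscript sr reflect the fact that (7.3) keeps the residuals separate, where $\hat{e}_{1k}$ is defined over the larger sample $s_1$ and $\hat{e}_{2k}$ over the smaller sample $s$. The fact that residuals computed for the larger sample $s_1$ can be advantageous for variance estimation was recognized by Axelson (1998). However, his derivation differs from our calibration approach based on $\mathbf{x}_{1k}$ and $\mathbf{x}_k$. The technique for variance estimation of the two-phase regression estimator in Hidiroglou, Rao and Haziza (2006) has certain traits in common with our approach, but there are also considerable differences.

8. The combined residual variance estimator

We arrived at (7.3) by recognizing that the estimates $\hat{e}_{1k}$ are obtainable for $k \in s_1$. The traditional approach, reviewed in this section, is to derive a variance estimator by conditioning on the phase-one sample $s_1$. This produces a variance estimator where all required residuals are defined for $k \in s$. Later, we compare it with the more efficient (7.3). From (5.1), we condition on the phase-one sample $s_1$ to obtain

\[
\begin{aligned}
V(\hat{Y}_{2Pa\,\mathrm{lin}})
&= V_{s_1} E_{s|s_1}\Big( \sum_{s_1} a_{1k} e_{1k} + \sum_{s} a_k e_{2k} \Big) + E_{s_1} V_{s|s_1}\Big( \sum_{s_1} a_{1k} e_{1k} + \sum_{s} a_k e_{2k} \Big) \\
&= V_{s_1}\Big( \sum_{s_1} a_{1k} e_{1k} + \sum_{s_1} a_{1k} e_{2k} \Big) + E_{s_1} V_{s|s_1}\Big( \sum_{s} a_k e_{2k} \Big) \\
&= V_{s_1}\Big( \sum_{s_1} a_{1k} e_{12k} \Big) + E_{s_1} V_{s|s_1}\Big( \sum_{s} a_k e_{2k} \Big)
\end{aligned}
\tag{8.1}
\]

where $e_{12k} = e_{1k} + e_{2k}$ is called the combined residual. From (4.10), we obtain the following.

\[
\begin{aligned}
e_{12k} &= y_k - \mathbf{x}'_{k(t)} \mathbf{B}_{(y;\mathbf{x})(t)} - \mathbf{x}'_{1k} \mathbf{B}_{(\mathbf{x}\mathbf{B}_{(w)};\,\mathbf{x}_1)} \\
e_{2k} &= y_k - \mathbf{x}'_{k(t)} \mathbf{B}_{(y;\mathbf{x})(t)} - \mathbf{x}'_{k(w)} \mathbf{B}_{(y;\mathbf{x})(w)} - \mathbf{x}'_{k(a)} \mathbf{B}_{(y;\mathbf{x})(a)}.
\end{aligned}
\tag{8.2}
\]

It is straightforward to define estimators of the two components $V_{s_1}( \sum_{s_1} a_{1k} e_{12k} )$ and $E_{s_1} V_{s|s_1}( \sum_{s} a_k e_{2k} )$. Each of these has the form of a double sum over $s$ because $e_{12k}$ and $e_{2k}$ contain $y_k$ which is only available for $k \in s$. The first component uses $\hat{e}_{12k} = \hat{e}_{1k} + \hat{e}_{2k} = y_k - \mathbf{x}'_{k(t)} \hat{\mathbf{B}}_{(y;\mathbf{x})(t)} - \mathbf{x}'_{1k} \hat{\mathbf{B}}_{(\mathbf{x}\hat{\mathbf{B}}_{(w)};\,\mathbf{x}_1)}$ for $k \in s$. We then have $\sum_{k \in s} \sum_{\ell \in s} D_{1k\ell}\, a_{2k\ell}\, \hat{e}_{12k} \hat{e}_{12\ell}$ as an estimator of $V_{s_1}( \sum_{s_1} a_{1k} e_{12k} )$.

For the second component, we use the residual estimates $\hat{e}_{2k} = y_k - \mathbf{x}'_k \hat{\mathbf{B}}_{(y;\mathbf{x})}$ given by (6.1) for $k \in s$, and obtain $\sum_{k \in s} \sum_{\ell \in s} D_{2k\ell}\, a_{1k} a_{1\ell}\, \hat{e}_{2k} \hat{e}_{2\ell}$ as an estimator of $E_{s_1} V_{s|s_1}( \sum_{s} a_k e_{2k} )$. Summing the two estimated terms we have the following variance estimator, where the subscript cr indicates combined residual,

\[
\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}}) = \sum_{k \in s} \sum_{\ell \in s} D_{1k\ell}\, a_{2k\ell}\, \hat{e}_{12k} \hat{e}_{12\ell}
+ \sum_{k \in s} \sum_{\ell \in s} D_{2k\ell}\, a_{1k} a_{1\ell}\, \hat{e}_{2k} \hat{e}_{2\ell}. \tag{8.3}
\]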
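As an illustration only, the double sums in (7.3) and (8.3) can be evaluated directly once the sampling design is fixed. The following Python sketch assumes simple random sampling without replacement at both phases and takes the residual estimates as given inputs. It is not the authors' software: the function names, the argument layout and the example numbers are hypothetical.

```python
def srs_inverse_probs(size, pop):
    """Inverse first- and second-order inclusion probabilities for an
    SRS without replacement of `size` units from `pop` units."""
    a = pop / size
    a_pair = pop * (pop - 1) / (size * (size - 1))
    return a, a_pair


def variance_estimates_srs(e1, e2, s_pos, N):
    """Evaluate (7.3) and (8.3), assuming SRS at both phases.

    e1    : residual estimates e1k for the n1 units of s1
    e2    : residual estimates e2k for the n units of s
    s_pos : position within s1 of each unit of s (same order as e2)
    N     : population size
    Returns (v_sr, v_cr).
    """
    n1, n = len(e1), len(e2)
    a1, a1_pair = srs_inverse_probs(n1, N)   # phase one
    a2, a2_pair = srs_inverse_probs(n, n1)   # phase two, conditional on s1

    def a1kl(k, l):          # a1kl, with a1kk = a1k
        return a1 if k == l else a1_pair

    def a2kl(k, l):          # a2kl, with a2kk = a2k
        return a2 if k == l else a2_pair

    def D1(k, l):            # D1kl = a1k a1l - a1kl
        return a1 * a1 - a1kl(k, l)

    def D2(k, l):            # D2kl = a2k a2l - a2kl
        return a2 * a2 - a2kl(k, l)

    def D(k, l):             # Dkl = ak al - akl, with akl = a1kl a2kl
        return (a1 * a2) ** 2 - a1kl(k, l) * a2kl(k, l)

    pairs_s = [(i, j) for i in range(n) for j in range(n)]

    # Separate residual estimator (7.3): three components.
    t1 = sum(D1(k, l) * e1[k] * e1[l] for k in range(n1) for l in range(n1))
    t2 = sum(D(s_pos[i], s_pos[j]) * e2[i] * e2[j] for i, j in pairs_s)
    t3 = sum(D1(k, s_pos[j]) * a2 * e1[k] * e2[j]
             for k in range(n1) for j in range(n))
    v_sr = t1 + t2 + 2 * t3

    # Combined residual estimator (8.3): both components are sums over s.
    e12 = [e1[s_pos[i]] + e2[i] for i in range(n)]
    c1 = sum(D1(s_pos[i], s_pos[j]) * a2kl(s_pos[i], s_pos[j]) * e12[i] * e12[j]
             for i, j in pairs_s)
    c2 = sum(D2(s_pos[i], s_pos[j]) * a1 * a1 * e2[i] * e2[j] for i, j in pairs_s)
    v_cr = c1 + c2
    return v_sr, v_cr


# Hypothetical residuals: n1 = 6 phase-one units, n = 3 phase-two units.
e1_demo = [0.5, -1.2, 0.3, 2.0, -0.7, 1.1]
e2_demo = [1.0, -0.6, 0.9]
print(variance_estimates_srs(e1_demo, e2_demo, s_pos=[0, 2, 5], N=20))
```

The sketch stores no matrices and simply loops over pairs, which is adequate for small samples. With larger samples the SRS double sums collapse to closed forms, which would be the practical choice.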

Let us review how (7.3) and (8.3) differ. The separate residual variance estimator (7.3) starts with the expansion $V(\hat{Y}_{2Pa\,\mathrm{lin}}) = V( \sum_{s_1} a_{1k} e_{1k} ) + V( \sum_{s} a_k e_{2k} ) + 2\,\mathrm{Cov}( \sum_{s_1} a_{1k} e_{1k}, \sum_{s} a_k e_{2k} )$. We estimate these three components separately as functions of the residuals $e_{1k}$ and $e_{2k}$. The resulting variance expression has three terms: a double sum over $s_1$ in terms of $e_{1k}$ and $e_{1\ell}$, a double sum over $s$ in terms of $e_{2k}$ and $e_{2\ell}$, and a cross-sum over $s_1$ and $s$ in terms of $e_{1k}$, $k \in s_1$, and $e_{2\ell}$, $\ell \in s$. Finally, we arrive at (7.3) by estimating $e_{1k}$ by $\hat{e}_{1k}$ for $k \in s_1$ and $e_{2k}$ by $\hat{e}_{2k}$ for $k \in s$.

The combined residual variance estimator (8.3) arises from the traditional conditioning on the phase-one sample $s_1$ as $V(\hat{Y}_{2Pa\,\mathrm{lin}}) = V_{s_1} E_{s|s_1}(\hat{Y}_{2Pa\,\mathrm{lin}}) + E_{s_1} V_{s|s_1}(\hat{Y}_{2Pa\,\mathrm{lin}})$. This leads us to combine $e_{1k}$ and $e_{2k}$ as $e_{12k} = e_{1k} + e_{2k}$ in the first term. The second term, $E_{s_1} V_{s|s_1}(\hat{Y}_{2Pa\,\mathrm{lin}})$, is a function of $e_{2k}$. Since $e_{12k}$ and $e_{2k}$ can only be estimated over $s$, the resulting variance estimator becomes a sum of two terms, each of them expressed as a double sum over $s$.

The separate residual estimator (7.3) is more efficient than the combined residual alternative (8.3), because it is based on residuals $\hat{e}_{1k}$ obtained for the typically larger sample $s_1$. The advantage of (7.3) over (8.3) is illustrated by the simulation in section 10. The approach behind the separate residual variance estimator (7.3) can be extended to three-phase sampling and other complex designs. In those extensions of the technique, we proceed in a similar manner, starting by a derivation of the linearized form through an expansion of the variance components and the determination of the appropriate residuals.

9. A comparison with the two-phase regression estimator

Särndal, Swensson and Wretman (1992) developed a two-phase regression estimator for $Y = \sum_U y_k$, based on an earlier paper by Särndal and Swensson (1989). It is useful to see how this estimator, denoted here by $\hat{Y}_{reg}$, compares with the calibration estimator $\hat{Y}_{2P}$ considered in the preceding sections of this paper. When based on the same auxiliary information, the two estimators are close but not identical. This is because the estimator $\hat{Y}_{2P}$ is derived by calibration in each of the two phases, whereas the two-phase regression estimator $\hat{Y}_{reg}$ is derived by model-assisted reasoning.

We now describe the two-phase regression estimator of Särndal, Swensson and Wretman (1992). Their derivation involves the fit of two linear regression models with the use of the available auxiliary data; one at the top level and the other at the bottom level. These authors develop a corresponding estimator of variance, via the traditional conditioning argument. We compare their variance estimator with the combined residual variance estimator (8.3), also developed by the conditioning argument. The two variance estimators do not agree exactly, because the point estimators are slightly different, but they are numerically close, as shown in this section.

Let $\mathbf{x}_{1k}$ be a vector of auxiliary variables with known population totals, and let $\mathbf{x}_k = (\mathbf{x}'_{1k}, \mathbf{x}'_{2k})'$, where both $\mathbf{x}_{1k}$ and $\mathbf{x}_{2k}$ are known vector values for $k \in s_1$. The total $\sum_U \mathbf{x}_{1k}$ is assumed known whereas the total $\sum_U \mathbf{x}_{2k}$ is unknown. The predicted values produced for $k \in s_1$ by the two regressions fitted at the top level and bottom level are given respectively by

\[
\hat{y}_{1k} = \mathbf{x}'_{1k} \hat{\mathbf{B}}_{1s} \quad \text{with} \quad
\hat{\mathbf{B}}_{1s} = \Big( \sum_{s} a_k \mathbf{x}_{1k} \mathbf{x}'_{1k} / \sigma^2_{1k} \Big)^{-1} \sum_{s} a_k \mathbf{x}_{1k} y_k / \sigma^2_{1k} \tag{9.1}
\]

and

\[
\hat{y}_{k} = \mathbf{x}'_{k} \hat{\mathbf{B}}_{s} \quad \text{with} \quad
\hat{\mathbf{B}}_{s} = \Big( \sum_{s} a_k \mathbf{x}_{k} \mathbf{x}'_{k} / \sigma^2_{k} \Big)^{-1} \sum_{s} a_k \mathbf{x}_{k} y_k / \sigma^2_{k}. \tag{9.2}
\]

The resulting two-phase regression estimator $\hat{Y}_{reg}$ of $Y = \sum_U y_k$ is

\[
\hat{Y}_{reg} = \Big( \sum_U \mathbf{x}_{1k} \Big)' \hat{\mathbf{B}}_{1s} + \sum_{s_1} a_{1k} ( \hat{y}_k - \hat{y}_{1k} ) + \sum_{s} a_k ( y_k - \hat{y}_k ). \tag{9.3}
\]

Can $\hat{Y}_{reg}$ be interpreted as a calibration estimator? To answer this question, let us determine the implicit weights in (9.3). We can write $\hat{Y}_{reg} = \sum_s w_k y_k$, with weights $w_k$ identified by substituting (9.1) and (9.2) into (9.3) and simplifying. We find $w_k = a_k g_k = a_{1k} a_{2k} g_k$, where the calibration factor $g_k$ is given for $k \in s$ by

\[
\begin{aligned}
g_k = 1 &+ \Big( \sum_U \mathbf{x}_{1k} - \sum_{s_1} a_{1k} \mathbf{x}_{1k} \Big)' \Big( \sum_{s} a_k \mathbf{x}_{1k} \mathbf{x}'_{1k} / \sigma^2_{1k} \Big)^{-1} \mathbf{x}_{1k} / \sigma^2_{1k} \\
&+ \Big( \sum_{s_1} a_{1k} \mathbf{x}_{k} - \sum_{s} a_k \mathbf{x}_{k} \Big)' \Big( \sum_{s} a_k \mathbf{x}_{k} \mathbf{x}'_{k} / \sigma^2_{k} \Big)^{-1} \mathbf{x}_{k} / \sigma^2_{k}.
\end{aligned}
\tag{9.4}
\]

The weights $w_k$ are not explicitly stated in Särndal, Swensson and Wretman (1992). In what sense, if any, can $w_k$ be considered a calibration weight? To examine this, we first replace $y_k$ in (9.3) with $\mathbf{x}_{1k}$. Using (9.1) and (9.2) with $y_k = \mathbf{x}_{1k}$ gives $\sum_U \mathbf{x}_{1k}$ as the right-hand side of (9.3). Thus, the weights $w_k = a_k g_k$ satisfy $\sum_s w_k \mathbf{x}_{1k} = \sum_U \mathbf{x}_{1k}$. Next we replace $y_k$ in (9.3) with $\mathbf{x}_{2k}$, again using (9.1) and (9.2) to obtain

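\[
\sum_{s_1} a_{1k} \mathbf{x}'_{2k} + \Big( \sum_U \mathbf{x}_{1k} - \sum_{s_1} a_{1k} \mathbf{x}_{1k} \Big)' \Big( \sum_{s} a_k \mathbf{x}_{1k} \mathbf{x}'_{1k} / \sigma^2_{1k} \Big)^{-1} \Big( \sum_{s} a_k \mathbf{x}_{1k} \mathbf{x}'_{2k} / \sigma^2_{1k} \Big). \tag{9.5}
\]

Although (9.5) is an approximately unbiased estimate of the unknown $\mathbf{x}_{2k}$-total $\sum_U \mathbf{x}_{2k}$, it does not have the usual form of the right-hand side of a phase-two calibration equation, such as $\sum_{s_1} a_{1k} \mathbf{x}_{2k}$ or $\sum_{s_1} w_{1k} \mathbf{x}_{2k}$. However, it is close. If we replace the two sums over $s$ with appropriately weighted sums over $s_1$, then (9.5) becomes $\sum_{s_1} w_{1k} \mathbf{x}_{2k}$ where $w_{1k}$ is given by (2.1) with $\mathbf{z}_{1k} = \mathbf{x}_{1k} / \sigma^2_{1k}$. Thus, the implicit weights $w_k$ in $\hat{Y}_{reg}$ calibrate exactly on the known population $\mathbf{x}_{1k}$-total, and they come close to calibrating on the estimated $\mathbf{x}_{2k}$-total $\sum_{s_1} w_{1k} \mathbf{x}_{2k}$. This suggests that $\hat{Y}_{reg}$ should have properties similar to an estimator $\hat{Y}_{2P}$ obtained by defining $\mathbf{x}_k$ in $\hat{Y}_{2P}$ as $\mathbf{x}_k = (\mathbf{x}'_{1k}, \mathbf{x}'_{2k})'$ with $\mathbf{x}_{k(t)} = \mathbf{x}_{1k}$, $\mathbf{x}_{k(w)} = \mathbf{x}_{2k}$ and $\mathbf{x}_{k(a)} = \emptyset$. In addition, the form of the model-assisted estimator implies $\mathbf{z}_{1k} = \mathbf{x}_{1k} / \sigma^2_{1k}$ and $\mathbf{z}_k = \mathbf{x}_k / \sigma^2_k$. Since $\mathbf{x}_k$ includes $\mathbf{x}_{1k}$ it is reasonable to define $\mathbf{z}_k = \mathbf{x}_k / \sigma^2_k$ as $\mathbf{z}_k = (\mathbf{x}'_{1k} / \sigma^2_{1k}, \mathbf{x}'_{2k} / \sigma^2_{2k})'$. These specifications meet the requirements for asymptotic equivalence of $\hat{Y}_{2Pa}$ and $\hat{Y}_{2Pw}$ so we do not need to worry about the choice of starting weights in $\hat{Y}_{2P}$. We can simply work with $\hat{Y}_{2Pa}$ as the estimator comparable to $\hat{Y}_{reg}$. Now, let us look at variance estimation for $\hat{Y}_{reg}$ and the estimator $\hat{Y}_{2Pa}$ under these specifications.

The variance estimator of Särndal, Swensson and Wretman (1992) contains calibration factors denoted $g_{ks}$ and $g_{1ks_1}$. They are not to be confused with $g_k$ given by (9.4). If we disregard $g_{ks}$ and $g_{1ks_1}$, both of which are near one and of limited numerical impact, their variance estimator is

\[
\hat{V}(\hat{Y}_{reg}) = \sum_{k \in s} \sum_{\ell \in s} D_{1k\ell}\, a_{2k\ell}\, e_{1ks} e_{1\ell s}
+ \sum_{k \in s} \sum_{\ell \in s} D_{2k\ell}\, a_{1k} a_{1\ell}\, e_{ks} e_{\ell s} \tag{9.6}
\]

where, for $k \in s$,

\[
e_{1ks} = y_k - \mathbf{x}'_{1k} \hat{\mathbf{B}}_{1s} \quad \text{and} \quad e_{ks} = y_k - \mathbf{x}'_{k} \hat{\mathbf{B}}_{s}. \tag{9.7}
\]

Both components of (9.6) are double sums over $s$, reflecting the fact that both $e_{1ks}$ and $e_{ks}$ can only be obtained for $k \in s$. Formula (9.6) looks similar to formula (8.3) for the combined residual estimator but how different are the residuals in the two formulas? Let us look at the residuals for the comparable point estimator. As noted above, this estimator $\hat{Y}_{2P}$ has $\mathbf{x}_k = (\mathbf{x}'_{1k}, \mathbf{x}'_{2k})'$ with $\mathbf{x}_{k(t)} = \mathbf{x}_{1k}$, $\mathbf{x}_{k(w)} = \mathbf{x}_{2k}$, $\mathbf{x}_{k(a)} = \emptyset$, $\mathbf{z}_{1k} = \mathbf{x}_{1k} / \sigma^2_{1k}$ and $\mathbf{z}_k = \mathbf{x}_k / \sigma^2_k = (\mathbf{x}'_{1k} / \sigma^2_{1k}, \mathbf{x}'_{2k} / \sigma^2_{2k})'$. Under these specifications, the residuals $\hat{e}_{1k}$ and $\hat{e}_{2k}$ in (6.1) are given by

\[
\begin{aligned}
\hat{e}_{1k} &= \mathbf{x}'_{2k} \hat{\mathbf{B}}_{(y;\mathbf{x})(2)} - \mathbf{x}'_{1k} \hat{\mathbf{B}}_{(\mathbf{x}\hat{\mathbf{B}}_{(2)};\,\mathbf{x}_1)} \quad \text{for } k \in s_1 \\
\hat{e}_{2k} &= y_k - \mathbf{x}'_k \hat{\mathbf{B}}_{(y;\mathbf{x})} \\
&= y_k - \mathbf{x}'_{1k} \hat{\mathbf{B}}_{(y;\mathbf{x})(1)} - \mathbf{x}'_{2k} \hat{\mathbf{B}}_{(y;\mathbf{x})(2)} \quad \text{for } k \in s
\end{aligned}
\tag{9.8}
\]

where $\hat{\mathbf{B}}_{(y;\mathbf{x})} = (\hat{\mathbf{B}}'_{(y;\mathbf{x})(1)}, \hat{\mathbf{B}}'_{(y;\mathbf{x})(2)})'$ corresponds to the partitioning of $\mathbf{x}_k = (\mathbf{x}'_{1k}, \mathbf{x}'_{2k})'$ and from (6.2)

\[
\begin{aligned}
\hat{\mathbf{B}}_{(y;\mathbf{x})} &= \Big( \sum_{s} a_k \mathbf{z}_k \mathbf{x}'_k \Big)^{-1} \Big( \sum_{s} a_k \mathbf{z}_k y_k \Big) \\
\hat{\mathbf{B}}_{(\mathbf{x}\hat{\mathbf{B}}_{(2)};\,\mathbf{x}_1)} &= \Big( \sum_{s_1} a_{1k} \mathbf{x}_{1k} \mathbf{x}'_{1k} / \sigma^2_{1k} \Big)^{-1} \Big( \sum_{s_1} a_{1k} \mathbf{x}_{1k} \mathbf{x}'_{2k} \hat{\mathbf{B}}_{(y;\mathbf{x})(2)} / \sigma^2_{1k} \Big).
\end{aligned}
\tag{9.9}
\]

The residuals $\hat{e}_{2k}$ in (9.8) are the same as $e_{ks}$ in (9.7). But how do the residuals $\hat{e}_{12k} = \hat{e}_{1k} + \hat{e}_{2k}$, obtained by adding in (9.8), relate to their counterparts $e_{1ks}$ in (9.7)? To find this link, we first show that $\hat{\mathbf{B}}_{1s} = ( \sum_{s} a_k \mathbf{x}_{1k} \mathbf{x}'_{1k} / \sigma^2_{1k} )^{-1} \sum_{s} a_k \mathbf{x}_{1k} y_k / \sigma^2_{1k}$ can be written as

\[
\hat{\mathbf{B}}_{1s} = \hat{\mathbf{B}}_{(y;\mathbf{x})(1)} + \Big( \sum_{s} a_k \mathbf{x}_{1k} \mathbf{x}'_{1k} / \sigma^2_{1k} \Big)^{-1} \Big( \sum_{s} a_k \mathbf{x}_{1k} \mathbf{x}'_{2k} \hat{\mathbf{B}}_{(y;\mathbf{x})(2)} / \sigma^2_{1k} \Big). \tag{9.10}
\]

To see this, we start with $\hat{\mathbf{B}}_{(y;\mathbf{x})}$, which by definition satisfies $\sum_{s} a_k \mathbf{z}_k y_k = ( \sum_{s} a_k \mathbf{z}_k \mathbf{x}'_k ) \hat{\mathbf{B}}_{(y;\mathbf{x})}$. This equality can also be written as $\sum_{s} a_k \mathbf{z}_k y_k = \sum_{s} a_k \mathbf{z}_k ( \mathbf{x}'_{1k} \hat{\mathbf{B}}_{(y;\mathbf{x})(1)} + \mathbf{x}'_{2k} \hat{\mathbf{B}}_{(y;\mathbf{x})(2)} )$. Since $\mathbf{z}_k = (\mathbf{x}'_{1k} / \sigma^2_{1k}, \mathbf{x}'_{2k} / \sigma^2_{2k})'$, the component of this equation corresponding to $\mathbf{x}_{1k}$ is $\sum_{s} a_k \mathbf{x}_{1k} y_k / \sigma^2_{1k} = \sum_{s} a_k \mathbf{x}_{1k} \mathbf{x}'_{1k} \hat{\mathbf{B}}_{(y;\mathbf{x})(1)} / \sigma^2_{1k} + \sum_{s} a_k \mathbf{x}_{1k} \mathbf{x}'_{2k} \hat{\mathbf{B}}_{(y;\mathbf{x})(2)} / \sigma^2_{1k}$. Premultiplying both sides by $( \sum_{s} a_k \mathbf{x}_{1k} \mathbf{x}'_{1k} / \sigma^2_{1k} )^{-1}$, we obtain (9.10).

Then, starting with (9.8) and using the definition of $\hat{\mathbf{B}}_{(\mathbf{x}\hat{\mathbf{B}}_{(2)};\,\mathbf{x}_1)}$ given by (9.9), we have

\[
\begin{aligned}
\hat{e}_{12k} &= \hat{e}_{1k} + \hat{e}_{2k} \\
&= y_k - \mathbf{x}'_{1k} \hat{\mathbf{B}}_{(y;\mathbf{x})(1)} - \mathbf{x}'_{1k} \hat{\mathbf{B}}_{(\mathbf{x}\hat{\mathbf{B}}_{(2)};\,\mathbf{x}_1)} \\
&= y_k - \mathbf{x}'_{1k} \Big\{ \hat{\mathbf{B}}_{(y;\mathbf{x})(1)} + \Big( \sum_{s_1} a_{1k} \mathbf{x}_{1k} \mathbf{x}'_{1k} / \sigma^2_{1k} \Big)^{-1} \Big( \sum_{s_1} a_{1k} \mathbf{x}_{1k} \mathbf{x}'_{2k} \hat{\mathbf{B}}_{(y;\mathbf{x})(2)} / \sigma^2_{1k} \Big) \Big\}.
\end{aligned}
\]

In the expression within curly brackets, let us replace the two $a_{1k}$-weighted sums over $s_1$ with the corresponding $a_k$-weighted sums over $s$; the result is equal to $\hat{\mathbf{B}}_{1s}$ as given by (9.10). This means $\hat{e}_{12k} \approx y_k - \mathbf{x}'_{1k} \hat{\mathbf{B}}_{1s} = e_{1ks}$.

In summary, $\hat{e}_{12k} \approx e_{1ks}$ for $k \in s$ and $\hat{e}_{2k} = e_{ks}$ for $k \in s$. Hence, the variance estimator (9.6) for the two-phase regression estimator $\hat{Y}_{reg}$ should be numerically close to the combined residual variance estimator (8.3) for the calibration estimator $\hat{Y}_{2P}$ defined in this section. We present empirical support for this through the simulation in the next section.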
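The identity (9.10) follows from the first block of the normal equations, so it can be checked numerically. The sketch below is an illustrative Python check under simplifying assumptions that are not in the paper: $x_{1k}$ and $x_{2k}$ are scalars, all $\sigma^2$ values equal one (so that $z_k = x_k$), and the data are made up. The function name is hypothetical.

```python
def check_identity_9_10(a, x1, x2, y):
    """Return both sides of (9.10) for scalar x1k, x2k and sigma^2 = 1.

    a  : weights ak for the units of s
    x1 : values x1k, x2 : values x2k, y : values yk (all for k in s)
    """
    # Weighted cross-product sums over s.
    s11 = sum(w * u * u for w, u in zip(a, x1))
    s12 = sum(w * u * v for w, u, v in zip(a, x1, x2))
    s22 = sum(w * v * v for w, v in zip(a, x2))
    s1y = sum(w * u * t for w, u, t in zip(a, x1, y))
    s2y = sum(w * v * t for w, v, t in zip(a, x2, y))

    # Bottom-level fit on (x1, x2): solve the 2 x 2 normal equations.
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det

    # Top-level fit on x1 alone, as in (9.1).
    b_1s = s1y / s11

    # Right-hand side of (9.10).
    rhs = b1 + (s12 * b2) / s11
    return b_1s, rhs


# Made-up data for five sampled units.
a_demo = [4.0, 4.0, 5.0, 5.0, 6.0]
x1_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_demo = [2.0, 1.0, 4.0, 3.0, 6.0]
y_demo = [3.1, 2.9, 7.2, 6.8, 11.5]
lhs, rhs = check_identity_9_10(a_demo, x1_demo, x2_demo, y_demo)
print(abs(lhs - rhs) < 1e-9)
```

The check covers only the algebraic identity over $s$. The approximation $\hat{e}_{12k} \approx e_{1ks}$ additionally depends on how close the $a_{1k}$-weighted sums over $s_1$ are to the $a_k$-weighted sums over $s$, which this sketch does not examine.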

10. Simulation

In this section we present a small simulation to validate the claim that the separate residual variance estimator $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ given by (7.3) can be considerably more efficient than the combined residual variance estimator $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ given by (8.3), and that the behaviour of the latter is very similar to that of the two-phase regression estimator $\hat{V}(\hat{Y}_{reg})$ given by (9.6). We created a population of $N = 5{,}000$ units in two steps as follows: First, the values $(u_{1k}, u_{2k})$ for $k = 1, 2, ..., 5{,}000$ were generated by 5,000 realizations of the independent random variables $u_{1k} \sim 2\,\mathrm{Gamma}(4)$ and $u_{2k} \sim 3\,\mathrm{Gamma}(6)$, where the $\mathrm{Gamma}(a)$ distribution has density $f(x) = [\Gamma(a)]^{-1} x^{a-1} e^{-x}$ for $x > 0$. Secondly, the values of the variable of interest were created as $y_k = 10 + u_{1k} + 3 u_{2k} + \varepsilon_k$, $k = 1, 2, ..., 5{,}000$, with $\varepsilon_k \sim 5\,\mathrm{Normal}(0)$, where $\mathrm{Normal}(0)$ is the standard Normal distribution with mean 0 and variance 1. The target of estimation in the experiment is the population $y$-total $Y = \sum_U y_k = 358{,}205$. For the phase-one calibration, we used the auxiliary vector $\mathbf{x}_{1k} = (1, u_{1k})'$ and $\mathbf{z}_{1k} = \mathbf{x}_{1k}$. That is, the weights $w_{1k}$ for $k \in s_1$ were determined by calibration to the known total $(N, \sum_U u_{1k}) = (5{,}000,\ 39{,}611.8)$. For the phase-two calibration we used $\mathbf{x}_k = (\mathbf{x}'_{k(t)}, \mathbf{x}'_{k(w)}, \mathbf{x}'_{k(a)})'$ with $\mathbf{x}_{k(t)} = (1, u_{1k})'$, $\mathbf{x}_{k(w)} = u_{2k}$, $\mathbf{x}_{k(a)} = \emptyset$ and $\mathbf{z}_k = \mathbf{x}_k$. These specifications satisfy the conditions for asymptotic equivalence between $\hat{Y}_{2Pa}$ and $\hat{Y}_{2Pw}$. Therefore, for this simulation, we can work with $\hat{Y}_{2Pa}$ and its linearized form $\hat{Y}_{2Pa\,\mathrm{lin}}$.

For each phase-one sample $s_1$, the final weights $w_k$ for the estimator $\hat{Y}_{2Pa} = \sum_s w_k y_k$ were determined by calibrating to the known totals given by the vector $(N, \sum_U u_{1k}, \sum_{s_1} w_{1k} u_{2k}) = (5{,}000,\ 39{,}611.8,\ \sum_{s_1} w_{1k} u_{2k})$. It is important to note that it was not necessary to have $\mathbf{x}_{k(a)} = \emptyset$ in order to run a simulation to compare $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ and $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$. However, we cannot compare $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ and $\hat{V}(\hat{Y}_{reg})$ unless we define an estimator $\hat{Y}_{2Pa}$ comparable to $\hat{Y}_{reg}$, and to achieve this we need $\mathbf{x}_{k(a)} = \emptyset$, as noted in section 9.

We drew repeated sample pairs $(s_1, s)$, where $s_1$ is an SRS of $n_1$ units from $U$, and $s$ is an SRS of $n$ units from $s_1$. Here SRS stands for simple random sampling without replacement. We worked with different size combinations $(n_1, n)$: (4,000, 3,000), (4,000, 2,000), (4,000, 1,000), (3,000, 2,000), (3,000, 1,000) and (2,000, 1,000). If $n = n_1$, two-phase sampling is equivalent to one-phase sampling, and $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ and $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ are identical.

For each combination $(n_1, n)$, we realized 100,000 sample pairs $(s_1, s)$. Based on the data for each of these outcomes, we computed the separate residual variance estimator $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$, the combined residual variance estimator $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ and the variance estimator $\hat{V}(\hat{Y}_{reg})$. For this purpose, we used the respective expressions that follow from (7.3), (8.3) and (9.6) when SRS is specified at each phase. To save space, these expressions are not shown here. We obtained 100,000 realized values for each of the three variance estimators. Figure 10.1 shows the distributions of the 100,000 $\hat{V}$-values for $n_1 = 4{,}000$ and $n = 2{,}000$.

The figure shows strikingly different distributions for $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ and $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$. The distribution of the separate residual estimator $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ is much more concentrated. Thus $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ is more efficient than $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ and on average, it produces considerably shorter confidence intervals. We also note that the distribution of $\hat{V}(\hat{Y}_{reg})$ is very similar to that of $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$. This supports our analysis in section 9. Similar results were obtained for the other sample sizes in the simulation.
[Figure: three histogram panels, one each for $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$, $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ and $\hat{V}(\hat{Y}_{reg})$. Horizontal axis: 750,000 to 825,000 in each panel. Vertical axis: Frequency, 0 to 11,000.]

Figure 10.1 Distribution of 100,000 realized values for $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$, $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ and $\hat{V}(\hat{Y}_{reg})$
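The population recipe and the two-phase SRS design described in this section can be mimicked with a few lines of Python. This is a minimal sketch, not the authors' program: the function names and the seed are hypothetical, and because the seed differs from whatever the authors used, the generated totals will not reproduce the reported $Y = 358{,}205$ or $\sum_U u_{1k} = 39{,}611.8$.

```python
import random


def make_population(N=5000, seed=2009):
    """Generate (u1, u2, y) following the recipe in section 10.

    u1k ~ 2 * Gamma(4), u2k ~ 3 * Gamma(6), with unit scale in the Gamma
    density, and yk = 10 + u1k + 3 * u2k + 5 * (standard Normal draw).
    The seed is an arbitrary choice made for this sketch.
    """
    rng = random.Random(seed)
    u1 = [2 * rng.gammavariate(4, 1) for _ in range(N)]
    u2 = [3 * rng.gammavariate(6, 1) for _ in range(N)]
    y = [10 + a + 3 * b + 5 * rng.gauss(0, 1) for a, b in zip(u1, u2)]
    return u1, u2, y


def draw_two_phase_srs(N, n1, n, rng):
    """Draw s1 as an SRS of n1 labels from U = {0, ..., N-1},
    then s as an SRS of n labels from s1 (both without replacement)."""
    s1 = rng.sample(range(N), n1)
    s = rng.sample(s1, n)
    return s1, s


u1, u2, y = make_population()
s1, s = draw_two_phase_srs(len(y), n1=4000, n=2000, rng=random.Random(1))
print(round(sum(y)), round(sum(u1), 1), len(s1), len(s))
```

Repeating `draw_two_phase_srs` many times on one fixed population, and computing the variance estimators for each pair $(s_1, s)$, gives the kind of empirical distribution summarized in Figure 10.1.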

To obtain a measure of the efficiency of the three variance estimators, we computed the simulation variance of the 100,000 $\hat{V}$-values. These simulation variances are shown in Table 10.1, Table 10.2 and Table 10.3. The numbers are dramatically lower for $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ than for the other two. Table 10.4 shows the relative advantage of $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ over $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$. For this population, the simulation variance of $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$ is less than half the simulation variance of $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$.

Table 10.1
Simulation variance for the separate residual variance estimator $\hat{V}_{sr}(\hat{Y}_{2Pa\,\mathrm{lin}})$

                          n
  n1          3,000       2,000       1,000
  4,000       64.82       95.91      484.92
  3,000                1,179.62    1,806.79
  2,000                           13,995.94

Note: Actual values are the displayed values times $10^6$.

Table 10.2
Simulation variance for the combined residual variance estimator $\hat{V}_{cr}(\hat{Y}_{2Pa\,\mathrm{lin}})$

                          n
  n1          3,000       2,000       1,000
  4,000      153.22      364.08    1,290.41
  3,000                2,449.05    6,855.69
  2,000                           33,220.88

Note: Actual values are the displayed values times $10^6$.

Table 10.3
Simulation variance for the variance estimator $\hat{V}(\hat{Y}_{reg})$

                          n
  n1          3,000       2,000       1,000
  4,000      153.25      364.14    1,289.79
  3,000                2,449.36    6,854.52
  2,000                           33,210.31

Note: Actual values are the displayed values times $10^6$.

Table 10.4
Ratio of entries in Table 10.1 to corresponding entries in Table 10.2

                          n
  n1          3,000       2,000       1,000
  4,000        0.42        0.26        0.38
  3,000                    0.48        0.26
  2,000                                0.42

11. Discussion

In a design-based perspective on estimation for two-phase sampling designs, one can follow a regression estimation approach or a calibration estimation approach. We concentrate on the calibration approach to create approximately design-unbiased estimators. The extent of the information available for the calibration holds the key to the efficiency of the estimates. We recognize in this paper that there are three different types of auxiliary variables associated with two-phase designs. They have different information characteristics. From these we define four different auxiliary vectors; one for the phase-one calibration and the other three for the phase-two calibration. The calibration approach is suitable for analyzing the resulting estimators in a systematic manner. As the paper shows, this approach also leads to a more efficient variance estimator than the traditional method for variance estimation in two-phase designs.

References

Axelson, M. (1998). Variance estimation for the generalised regression estimator under two-phase sampling - a modified approach. Proceedings of the Section on Survey Research Methods, American Statistical Association, 85-89.

Deville, J.-C., and Särndal, C.-E. (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87, 376-382.

Deville, J.-C. (2002). La correction de la nonréponse par calage généralisé. Actes des Journées de Méthodologie, I.N.S.E.E., Paris.

Dupont, F. (1995). Alternative adjustments when there are several levels of auxiliary information. Survey Methodology, 21, 125-136.

Estevao, V.M., and Särndal, C.-E. (2002). The ten cases of auxiliary information for calibration in two phase sampling. Journal of Official Statistics, 18, 233-255.

Hidiroglou, M.A. (2001). Double sampling. Survey Methodology, 27, 143-154.

Hidiroglou, M.A., and Särndal, C.-E. (1998). Use of auxiliary information for two-phase sampling. Survey Methodology, 24, 11-20.

Hidiroglou, M.A., Rao, J.N.K. and Haziza, D. (2006). Variance estimation in two phase sampling. (Accepted paper to appear in) Australian and New Zealand Journal of Statistics.

Kott, P.S., and Stukel, D.M. (1997). Can the jackknife be used with a two-phase sample? Survey Methodology, 23, 81-90.

Särndal, C.-E., and Swensson, B. (1987). A general view of estimation for two phases of selection with applications to two-phase sampling and nonresponse. International Statistical Review, 55, 279-294.

Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling. New York: Springer-Verlag.

Sitter, R.R. (1997). Variance estimation for the regression estimator in two-phase sampling. Journal of the American Statistical Association, 92, 780-787.
