
Ancillary Statistics

January 25, 2016 Debdeep Pati

1 Ancillary statistics

Suppose $X \sim P_\theta$, $\theta \in \Theta$.

Definition 1. A statistic is ancillary if its distribution does not depend on $\theta$. More precisely, a statistic $S(X)$ is ancillary for $\Theta$ if its distribution is the same for all $\theta \in \Theta$. That is, $P_\theta(S(X) \in A)$ is constant in $\theta \in \Theta$ for any set $A$.

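Example: $X = (X_1, \ldots, X_n)$ iid $N(\mu, \sigma^2)$. Let

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 .$$

We know

$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1} \quad\Longrightarrow\quad S^2 \sim \frac{\sigma^2}{n-1}\,\chi^2_{n-1},$$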

so that the distribution of $S^2$ depends on $\sigma^2$ but not on $\mu$. Thus $S^2$ is ancillary for

$$\Theta_1 = \{(\mu, \sigma^2) : \sigma^2 = \sigma_0^2\},$$

but is not ancillary for

$$\Theta_2 = \{(\mu, \sigma^2) : \sigma^2 > 0\}.$$
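The example can be checked numerically with a small sketch. It is an illustration only: the helper name `sample_variance_under`, the seed, and the parameter values are assumptions made for this example. The same standard normal draws are reused for every parameter setting, so that changing the mean leaves $S^2$ untouched while changing the standard deviation rescales it.

```python
import random
import statistics

def sample_variance_under(mu, sigma0, z):
    """Return S^2 for the sample x_i = mu + sigma0 * z_i (hypothetical helper)."""
    x = [mu + sigma0 * zi for zi in z]
    return statistics.variance(x)  # divides by n - 1

rng = random.Random(0)                        # arbitrary seed
z = [rng.gauss(0.0, 1.0) for _ in range(20)]  # shared N(0, 1) draws

s2_a = sample_variance_under(mu=-3.0, sigma0=2.0, z=z)
s2_b = sample_variance_under(mu=10.0, sigma0=2.0, z=z)  # only the mean changes
s2_c = sample_variance_under(mu=10.0, sigma0=5.0, z=z)  # the variance changes

print(s2_a, s2_b, s2_c)
```

Up to floating-point rounding, `s2_a` and `s2_b` coincide, while `s2_c` is larger by the factor $(5/2)^2$.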

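Let $\psi(x)$ be a fixed density.

1. Location Family (LF) of densities: $f(x \mid \theta) = \psi(x - \theta)$, $-\infty < \theta < \infty$.

2. Scale Family (SF) of densities: $f(x \mid \theta) = \frac{1}{\theta}\,\psi\!\left(\frac{x}{\theta}\right)$, $\theta > 0$.

3. Location-Scale Family (LSF) of densities: $f(x \mid \mu, \sigma) = \frac{1}{\sigma}\,\psi\!\left(\frac{x - \mu}{\sigma}\right)$, $\sigma > 0$, $-\infty < \mu < \infty$.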

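If $X \sim f(\cdot \mid \theta)$ and $Z \sim \psi(\cdot)$, then

1. (LF) $X \overset{d}{=} Z + \theta$ (equivalently, $X - \theta \overset{d}{=} Z$)

2. (SF) $X \overset{d}{=} \theta Z$ (equivalently, $X/\theta \overset{d}{=} Z$)

3. (LSF) $X \overset{d}{=} \sigma Z + \mu$ (equivalently, $(X - \mu)/\sigma \overset{d}{=} Z$)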

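If $X = (X_1, \ldots, X_n)$ is iid $f(\cdot \mid \theta)$ and $Z = (Z_1, \ldots, Z_n)$ is iid $\psi(\cdot)$, then, with $\mathbf{1} = (1, \ldots, 1)$,

1. (LF) $X \overset{d}{=} Z + \theta\mathbf{1}$ (equivalently, $X - \theta\mathbf{1} \overset{d}{=} Z$)

2. (SF) $X \overset{d}{=} \theta Z$ (equivalently, $X/\theta \overset{d}{=} Z$)

3. (LSF) $X \overset{d}{=} \sigma Z + \mu\mathbf{1}$ (equivalently, $(X - \mu\mathbf{1})/\sigma \overset{d}{=} Z$)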
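These representations also give a way to simulate from such a family: draw from the fixed density, then shift and rescale. The sketch below is a minimal illustration under assumptions that are not part of the notes: the helper `draw_location_scale` is hypothetical, and the fixed density is taken to be Unif(0, 1) in the location case and Exp(1) in the scale case.

```python
import random

def draw_location_scale(psi_sampler, n, mu=0.0, sigma=1.0):
    """Return (z, x) with z_i drawn from the fixed density and x_i = sigma * z_i + mu."""
    z = [psi_sampler() for _ in range(n)]
    x = [sigma * zi + mu for zi in z]
    return z, x

rng = random.Random(1)  # arbitrary seed

# LF: fixed density Unif(0, 1); shifting by theta gives Unif(theta, theta + 1).
theta = 4.0
z_unif, x_unif = draw_location_scale(rng.random, n=1000, mu=theta)

# SF: fixed density Exp(1); scaling by 3 gives the exponential density with scale 3.
scale = 3.0
z_exp, x_exp = draw_location_scale(lambda: rng.expovariate(1.0), n=1000, sigma=scale)

print(min(x_unif), max(x_unif))
print(min(x_exp))
```

Every value in `x_unif` lies in $[\theta, \theta + 1]$, and dividing `x_exp` by the scale recovers the underlying Exp(1) draws.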

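1. Examples of Location families:

(a) Unif($\theta$, $\theta + 1$) distributions ($\Theta = \mathbb{R}$) with pdf $f(x \mid \theta) = I(\theta \le x \le \theta + 1)$.

(b) Cauchy location family with pdf
$$f(x \mid \theta) = \frac{1}{\pi\{1 + (x - \theta)^2\}} .$$

(c) $N(\mu, \sigma_0^2)$ distributions with $\mu \in \mathbb{R}$ unknown, $\sigma_0^2$ known.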

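2. Examples of Scale families:

(a) Unif($0$, $\theta$) distributions ($\theta > 0$ unknown) with pdf $f(x \mid \theta) = \frac{1}{\theta}\, I(0 \le x \le \theta)$.

(b) Cauchy scale family with pdf
$$f(x \mid \theta) = \frac{1}{\pi\theta\{1 + (x/\theta)^2\}} .$$

(c) $N(0, \sigma^2)$ distributions with $\sigma^2 > 0$ unknown.

(d) Exp($\theta$) distributions ($\theta > 0$ unknown) with pdf $f(x \mid \theta) = \frac{1}{\theta}\, e^{-x/\theta}\, I(x \ge 0)$.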

3. Examples of Location-Scale families:

(a) Unif($\alpha$, $\beta$), $-\infty < \alpha < \beta < \infty$ (all uniform distributions).

(b) $N(\mu, \sigma^2)$, $\mu \in \mathbb{R}$, $\sigma^2 > 0$ (all normal distributions).

1.1 Facts

1. If $X = (X_1, X_2, \ldots, X_n)$ is iid from a LF and $S(x)$ is a location invariant function ($S(x + c\mathbf{1}) = S(x)$ for all $x \in \mathbb{R}^n$ and $c \in \mathbb{R}$), then $S(X)$ is ancillary.

2. If $X = (X_1, X_2, \ldots, X_n)$ is iid from a SF and $S(x)$ is a scale invariant function ($S(cx) = S(x)$ for all $x \in \mathbb{R}^n$ and $c > 0$), then $S(X)$ is ancillary.

3. If $X = (X_1, X_2, \ldots, X_n)$ is iid from a LSF and $S(x)$ is a location-scale invariant function ($S(ax + b\mathbf{1}) = S(x)$ for all $x \in \mathbb{R}^n$ and $a > 0$, $b \in \mathbb{R}$), then $S(X)$ is ancillary.

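Proof. Let $X = (X_1, X_2, \ldots, X_n)$ be iid $f(\cdot \mid \theta)$ and $Z = (Z_1, \ldots, Z_n)$ be iid $\psi(\cdot)$.

1. Since $X \overset{d}{=} Z + \theta\mathbf{1}$, we have

$$P_\theta(S(X) \in A) = P(S(Z + \theta\mathbf{1}) \in A) = P(S(Z) \in A),$$

which does not involve $\theta$, by the location invariance of $S$.

2. Since $X \overset{d}{=} \theta Z$, we have

$$P_\theta(S(X) \in A) = P(S(\theta Z) \in A) = P(S(Z) \in A),$$

which does not involve $\theta$, by the scale invariance of $S$.

3. Since $X \overset{d}{=} \sigma Z + \mu\mathbf{1}$, we have

$$P_{\mu,\sigma}(S(X) \in A) = P(S(\sigma Z + \mu\mathbf{1}) \in A) = P(S(Z) \in A),$$

which does not involve $(\mu, \sigma)$, by the location-scale invariance of $S$.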

1.2 Location Invariant Statistics
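1. $S(X) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$ is location invariant:

$$S(X + c\mathbf{1}) = \frac{1}{n-1} \sum_{i=1}^{n} \left(X_i + c - \overline{X + c\mathbf{1}}\right)^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i + c - \bar{X} - c)^2 = S(X).$$

Here $\overline{X + c\mathbf{1}} = (1/n) \sum_{i=1}^{n} (X_i + c) = \bar{X} + c$.

2. $S(X) = \sum_{i=1}^{n} |X_i - \mathrm{median}(X)|$ is location invariant:

$$S(X + c\mathbf{1}) = \sum_{i=1}^{n} |X_i + c - \mathrm{median}(X + c\mathbf{1})| = \sum_{i=1}^{n} |X_i + c - \mathrm{median}(X) - c| = S(X).$$

3. $S(X) = \max X_i - \min X_i = X_{(n)} - X_{(1)}$ is location invariant:

$$S(X + c\mathbf{1}) = \max(X_i + c) - \min(X_i + c) = \max(X_i) + c - \min(X_i) - c = X_{(n)} - X_{(1)}.$$

4. The vector $S(X) = (X_2 - X_1, X_3 - X_1, \ldots, X_n - X_1)$ is location invariant by a similar argument.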
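The four invariance claims above are easy to verify numerically. The sketch below is only a check on one simulated data set: the function names, the seed, and the shift `c = 7.5` are assumptions chosen for illustration, and the fourth statistic is coded as the differences from the first observation.

```python
import random
import statistics

def sample_variance(x):
    return statistics.variance(x)  # divides by n - 1

def abs_dev_from_median(x):
    m = statistics.median(x)
    return sum(abs(xi - m) for xi in x)

def sample_range(x):
    return max(x) - min(x)

def differences_from_first(x):
    return [xi - x[0] for xi in x[1:]]

rng = random.Random(2)  # arbitrary seed
x = [rng.gauss(0.0, 1.0) for _ in range(15)]
c = 7.5
shifted = [xi + c for xi in x]  # the vector x + c*1

for stat in (sample_variance, abs_dev_from_median, sample_range):
    print(stat.__name__, stat(x), stat(shifted))

max_diff_gap = max(
    abs(a - b)
    for a, b in zip(differences_from_first(x), differences_from_first(shifted))
)
print("largest change in the difference vector:", max_diff_gap)
```

Each pair of printed values should agree up to floating-point rounding, and the largest change in the difference vector should be negligible.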

1.3 Scale Invariant Statistics


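1. $t = \dfrac{\bar{x} - 0}{S/\sqrt{n}}$ is scale invariant, since for $c > 0$

$$t(cx) = \frac{c\bar{x}}{cS/\sqrt{n}} = t(x),$$

because the factors of $c$ cancel. Here we have used

$$\overline{cx} = \frac{1}{n} \sum_{i=1}^{n} c x_i = c\, \frac{1}{n} \sum_{i=1}^{n} x_i = c\bar{x},$$

$$S(cx) = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (c x_i - c\bar{x})^2} = c \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} = cS(x).$$

2. $S(X) = \dfrac{\bar{X}}{X_{(n)}}$ is scale invariant:

$$S(cX) = \frac{c\bar{X}}{cX_{(n)}} = S(X)$$

for all $c > 0$.
Note: $S(cX) \neq S(X)$ for $c \le 0$ (for $c < 0$ the maximum of $cX$ is $cX_{(1)}$, not $cX_{(n)}$, and for $c = 0$ the ratio is undefined).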

3. $$S(X) = \left( \frac{X_1}{\sum X_i}, \frac{X_2}{\sum X_i}, \ldots, \frac{X_n}{\sum X_i} \right)$$
is scale invariant.
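A quick numerical check of the three scale invariant statistics is sketched below. It is illustrative only: the function names, the seed, the positive exponential data, and the factors `c = 2.5` and `-1` are assumptions made for this example. The negative factor is included to show why invariance is only claimed for $c > 0$.

```python
import math
import random
import statistics

def t_stat(x):
    """One-sample t statistic for a null mean of 0."""
    return statistics.mean(x) / (statistics.stdev(x) / math.sqrt(len(x)))

def mean_over_max(x):
    return statistics.mean(x) / max(x)

def proportions(x):
    total = sum(x)
    return [xi / total for xi in x]

rng = random.Random(3)  # arbitrary seed
x = [rng.expovariate(1.0) for _ in range(12)]  # positive data
c = 2.5
scaled = [c * xi for xi in x]
negated = [-xi for xi in x]  # corresponds to c = -1

print(t_stat(x), t_stat(scaled))
print(mean_over_max(x), mean_over_max(scaled), mean_over_max(negated))

max_prop_gap = max(abs(a - b) for a, b in zip(proportions(x), proportions(scaled)))
print("largest change in the proportions:", max_prop_gap)
```

For the positive factor the statistics agree up to rounding. For the negated data the mean-over-maximum ratio becomes the mean over the minimum of the original sample, so it changes.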

1.4 Location-Scale Invariant Statistics

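1. Sample skewness is proportional to

$$S(X) = \frac{\sum (X_i - \bar{X})^3}{\left[\sum (X_i - \bar{X})^2\right]^{3/2}} .$$

2. Sample kurtosis is proportional to

$$S(X) = \frac{\sum (X_i - \bar{X})^4}{\left[\sum (X_i - \bar{X})^2\right]^{2}} .$$

They are location-scale invariant.

Proof. It suffices to show:

(a) $S(aX) = S(X)$ for $a > 0$, and
(b) $S(X + b\mathbf{1}) = S(X)$ for all $b$.

Part (b) follows from
$$(X_i + b) - (\bar{X} + b) = X_i - \bar{X}.$$
Part (a) follows from
$$\sum (c x_i - c\bar{x})^m = c^m \sum (x_i - \bar{x})^m,$$
so that, with $c = a > 0$, the numerator and the denominator are multiplied by the same power of $a$ ($a^3$ for skewness, $a^4$ for kurtosis), which cancels.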

3. The standardized residuals

$$z = (z_1, z_2, \ldots, z_n), \qquad z_i = \frac{x_i - \bar{x}}{S},$$

are location-scale invariant.
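The location-scale invariance of these three statistics can be checked on simulated data. The sketch below is a numerical illustration under assumed values: the function names, the seed, and the transformation `a = 3.0`, `b = -11.0` are all chosen for this example.

```python
import random
import statistics

def central_sum(x, m):
    """Sum of (x_i - xbar)^m."""
    xbar = statistics.mean(x)
    return sum((xi - xbar) ** m for xi in x)

def skewness_ratio(x):
    return central_sum(x, 3) / central_sum(x, 2) ** 1.5

def kurtosis_ratio(x):
    return central_sum(x, 4) / central_sum(x, 2) ** 2

def standardized_residuals(x):
    xbar = statistics.mean(x)
    s = statistics.stdev(x)
    return [(xi - xbar) / s for xi in x]

rng = random.Random(4)  # arbitrary seed
x = [rng.gauss(0.0, 1.0) for _ in range(25)]
a, b = 3.0, -11.0       # any a > 0 and real b
y = [a * xi + b for xi in x]

print(skewness_ratio(x), skewness_ratio(y))
print(kurtosis_ratio(x), kurtosis_ratio(y))

max_resid_gap = max(
    abs(u - v)
    for u, v in zip(standardized_residuals(x), standardized_residuals(y))
)
print("largest change in the standardized residuals:", max_resid_gap)
```

All three comparisons should agree up to floating-point rounding.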

General comment: An ancillary statistic by itself can tell us nothing about $\theta$, but when combined with other statistics, it may give information about $\theta$.

Example: $X = (X_1, X_2, \ldots, X_n)$ iid Unif($\theta$, $\theta + 1$). We know $(X_{(1)}, X_{(n)})$ is a minimal sufficient statistic (MSS). Any 1-1 function of a MSS is also a MSS. Therefore $(X_{(1)}, X_{(n)} - X_{(1)})$ is a MSS. We cannot drop $X_{(n)} - X_{(1)}$ without losing information about $\theta$. But $X_{(n)} - X_{(1)}$ is ancillary! It is ancillary because Unif($\theta$, $\theta + 1$) is a location family, and $X_{(n)} - X_{(1)}$ is a location invariant statistic.
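A small Monte Carlo sketch can make this example concrete. It is an illustration under assumed settings, not part of the notes: the helper `simulate_min_and_range`, the sample size, the number of replications, the seeds, and the two parameter values 0 and 50 are all chosen for this example. Independent seeds are used for the two parameter values, so the comparison is between distributions rather than between identical draws.

```python
import random

def simulate_min_and_range(theta, n, reps, seed):
    """Monte Carlo draws of (X_(1), X_(n) - X_(1)) for n iid Unif(theta, theta + 1)."""
    rng = random.Random(seed)
    mins, ranges = [], []
    for _ in range(reps):
        x = [theta + rng.random() for _ in range(n)]
        mins.append(min(x))
        ranges.append(max(x) - min(x))
    return mins, ranges

def average(values):
    return sum(values) / len(values)

n, reps = 5, 20000
mins_a, ranges_a = simulate_min_and_range(theta=0.0, n=n, reps=reps, seed=10)
mins_b, ranges_b = simulate_min_and_range(theta=50.0, n=n, reps=reps, seed=11)

mean_range_a, mean_range_b = average(ranges_a), average(ranges_b)
mean_min_a, mean_min_b = average(mins_a), average(mins_b)

print("mean range:  ", mean_range_a, mean_range_b)
print("mean minimum:", mean_min_a, mean_min_b)
```

The two average ranges should be close to each other, while the average minimum moves with the parameter. This matches the claim that the range alone carries no information about the parameter, even though it is part of the minimal sufficient statistic.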
