
MINIMAL SUFFICIENT STATISTICS

If a statistic $S$ is sufficient, then so is any augmented statistic $S' = (S, T)$. Since the goal is to summarize information concisely, we desire to work with minimal sufficient statistics.
Def: A statistic $S = T(X)$ is minimal sufficient if, for any other sufficient statistic $T'(X)$, $T(X)$ is a function of $T'(X)$.
Example: Suppose $T(X) = \sum_{i=1}^{n} X_i$ and $T'(X) = \left( \sum_{i=1}^{n} X_i,\ \sum_{i=1}^{n} X_i^2 \right)$.

Clearly, $T(X) = g_1(T'(X))$, but $T'(X) \neq g_2(T(X))$.

We desire to use minimal sufficient statistics whenever possible, since they give the greatest reduction of the data.
Example: Assume $X_i$, $i = 1, \ldots, n$ are iid Bernoulli$(\theta)$. Let $S = T(X) = \sum_{i=1}^{n} X_i$. Now suppose that $V = g(S)$. By definition, $V$ is a strict summary of $S$ only if for two different values of $S$, say $S = r_1$ and $S = r_2$, we have $g(r_1) = g(r_2) = v$. Let's check it.

\[
P(S = s \mid V = v) = \frac{P(S = s)}{P(V = v)}
= \frac{\binom{n}{s}\,\theta^{s}(1-\theta)^{n-s}}
       {\binom{n}{r_1}\,\theta^{r_1}(1-\theta)^{n-r_1} + \binom{n}{r_2}\,\theta^{r_2}(1-\theta)^{n-r_2}},
\quad s \in \{r_1, r_2\}
\]

But this is still a function of $\theta$, so $V$ is NOT sufficient. $S$ is sufficient. $S$ is also minimal.
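As a quick numerical sketch of this argument (using hypothetical values $n = 10$, $r_1 = 3$, $r_2 = 7$ for the collapsed pair), the conditional probability of $S$ given $V$ can be evaluated at several values of $\theta$:

```python
from math import comb

def pmf_S(s, n, theta):
    # Binomial pmf of S = sum of n iid Bernoulli(theta) trials
    return comb(n, s) * theta**s * (1 - theta)**(n - s)

def cond_prob_S_given_V(s, r1, r2, n, theta):
    # P(S = s | V = v) when g collapses r1 and r2 into the same value v
    return pmf_S(s, n, theta) / (pmf_S(r1, n, theta) + pmf_S(r2, n, theta))

# The conditional probability changes with theta, so V is not sufficient
for theta in (0.2, 0.5, 0.8):
    print(theta, cond_prob_S_given_V(3, 3, 7, 10, theta))
```

The printed probabilities differ across $\theta$, confirming that conditioning on $V$ does not free the distribution of the data from the parameter.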

Can we formalize this? Sometimes factorization of the likelihood function, $L(\theta \mid x) = f(x \mid \theta)$, gives us the minimal sufficient statistic directly. In other cases, we can exploit the relationship between ratios of likelihood functions and minimal sufficiency.

Theorem: Let $f(x \mid \theta)$ be the pdf (pmf) of a sample $X$. Suppose that a function $T(x)$ exists such that for every two sample points (i.e., samples of observations) $x$, $y$ the ratio

\[
\frac{f(x \mid \theta)}{f(y \mid \theta)}
\]

is constant as a function of $\theta$ iff $T(x) = T(y)$. Then $T(X)$ is a minimal sufficient statistic.

Proof: See class handout.

Consider a statistic as dividing the sample space into classes called equivalence classes. Each class contains all observations $x$ with the same value of $S$. If $S$ is minimal sufficient, then so is any one-to-one function of $S$ (unique inverse). So minimal sufficiency is somehow related to the set of equivalence classes, but not to the particular labeling of the equivalence classes.

Consider the partition of the sample space:

\[
D(x) = \left\{ z : \frac{f(z \mid \theta)}{f(x \mid \theta)} = h(z, x) \right\}
\]

That is, the set of points where the likelihood functions are proportional (the ratio does not depend on $\theta$).


If $z \in D(x_1)$ and $z \in D(x_2)$, then $D(x_1) = D(x_2)$.

This gives insight into the requirements for minimal sufficiency.

By the Factorization Theorem:

\[
f(x \mid \theta) = g(T(x) \mid \theta)\, h(x)
\]

Now suppose that for some other set of data $z$,

\[
f(z \mid \theta) = g(T(z) \mid \theta)\, h(z) \quad \text{(same family of pdfs)}
\]

If $T(x) = T(z)$, then $g(T(x) \mid \theta) = g(T(z) \mid \theta)$ (since the function $g$ is the same), so

\[
\frac{f(x \mid \theta)}{f(z \mid \theta)} = \frac{h(x)}{h(z)} = m(x, z)
\]

But this implies that $x$ and $z$ are in the same equivalence class. Therefore, the partition defined by the ratio of the likelihoods includes that based on the statistic $T$, so this partition is minimal sufficient. (This result is essentially Theorem 6.2.13, which is proved differently.)

Example: Suppose that $X_i$ are iid Poisson$(\lambda)$. We know that $S_1 = T(X) = \sum_{i=1}^{n} X_i$ and $S_2 = T'(X) = \left( X_{(1)}, \ldots, X_{(n)} \right)$ (the order statistics) are both sufficient for $\lambda$.

Consider $S_1$:

\[
\frac{f(x \mid \lambda)}{f(y \mid \lambda)}
= \frac{e^{-n\lambda}\, \lambda^{\sum x_i} \big/ \prod_i x_i!}
       {e^{-n\lambda}\, \lambda^{\sum y_j} \big/ \prod_j y_j!}
= \lambda^{\sum x_i - \sum y_j}\, \frac{\prod_j y_j!}{\prod_i x_i!}
= h(x, y) \text{ iff } \sum x_i = \sum y_j, \quad i, j = 1, \ldots, n
\]

So, this is a minimal sufficient statistic.
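This ratio test can be checked numerically. The sketch below (with made-up samples) evaluates the joint Poisson likelihood ratio at several values of $\lambda$: it is constant when the sample sums agree and varies when they do not.

```python
from math import exp, factorial, prod

def poisson_lik(x, lam):
    # Joint likelihood of an iid Poisson(lam) sample x
    return exp(-lam * len(x)) * lam**sum(x) / prod(factorial(k) for k in x)

def ratio(x, y, lam):
    return poisson_lik(x, lam) / poisson_lik(y, lam)

x = [2, 0, 3, 1]   # sum = 6
y = [1, 1, 1, 3]   # sum = 6: ratio is free of lambda
z = [4, 4, 0, 0]   # sum = 8: ratio depends on lambda
print([ratio(x, y, lam) for lam in (0.5, 1.0, 2.0)])   # constant
print([ratio(x, z, lam) for lam in (0.5, 1.0, 2.0)])   # varies
```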


Now consider $S_2$:

\[
\frac{f(x \mid \lambda)}{f(y \mid \lambda)}
= \frac{e^{-n\lambda}\, \lambda^{\sum x_{(i)}} \big/ \prod_i x_{(i)}!}
       {e^{-n\lambda}\, \lambda^{\sum y_{(j)}} \big/ \prod_j y_{(j)}!}
\]

$S_1$ is a function of $S_2$, so $S_1$ is a coarser statistic. Therefore, $S_2$ cannot be minimal sufficient.
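A small sketch (hypothetical samples) makes the coarseness relation concrete: $S_1$ is recoverable from $S_2$, and two samples can share $S_1$ while having different order statistics, so the $S_2$ partition is strictly finer.

```python
def s1(x):
    return sum(x)                # S1: the sample sum

def s2(x):
    return tuple(sorted(x))      # S2: the order statistics

x = [2, 0, 3]
y = [1, 1, 3]                    # same sum as x, different order statistics

# S1 is a function of S2: summing the order statistics recovers the sum
print(s1(x) == sum(s2(x)))       # True
# x and y share S1 but not S2, so S2 splits an S1 equivalence class
print(s1(x) == s1(y), s2(x) == s2(y))
```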

Example: Assume that you are performing life testing. Suppose that of $n$ components, $r$ die after $y_1, \ldots, y_r$ time periods and that $m = n - r$ are still alive after $y'_1, \ldots, y'_m$ time periods. Assuming that the lives $Y_1, \ldots, Y_n$ are iid with $f(y_i \mid \theta) = \theta e^{-\theta y_i}$, the joint pdf is

\[
f(y \mid \theta) = \prod_{j=1}^{r} \theta e^{-\theta y_j} \prod_{k=1}^{m} e^{-\theta y'_k}
\]

(The second factor is the probability of the times to death exceeding $y'_k$, $k = 1, \ldots, m$.) Then

\[
f(y \mid \theta) = \theta^{r} e^{-\theta y^*}, \quad \text{where } y^* = \sum_{j=1}^{r} y_j + \sum_{k=1}^{m} y'_k.
\]

What is a sufficient statistic $S$? $S = (R, Y^*)$ (note that $R$ is the random variable of which $r$ is a particular observation).

For this problem, you have a sufficient statistic that is of dimension 2, while the dimension of the parameter vector is 1. Is this a minimal sufficient statistic? Yes; check the ratio.
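A sketch of this (with hypothetical death and censoring times) shows the likelihood depends on the data only through $(r, y^*)$: two different data sets with the same $(r, y^*)$ give identical likelihood functions.

```python
from math import exp

def censored_lik(deaths, alive, theta):
    # theta * exp(-theta * y) density for each observed death;
    # exp(-theta * y') survival probability for each right-censored unit
    r = len(deaths)
    y_star = sum(deaths) + sum(alive)
    return theta**r * exp(-theta * y_star)

d1, a1 = [1.0, 2.0, 3.0], [4.0, 5.0]   # r = 3, y* = 15
d2, a2 = [2.0, 2.0, 2.0], [4.5, 4.5]   # r = 3, y* = 15
print(censored_lik(d1, a1, 0.3) == censored_lik(d2, a2, 0.3))   # True
```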

Can we say anything special if we are dealing with the exponential family of distributions,

\[
f(x \mid \theta) = h(x)\, c(\theta) \exp\left( \sum_{i=1}^{k} w_i(\theta)\, t_i(x) \right)?
\]

Recall that $\left( \sum_{j=1}^{n} t_i(x_j),\ i = 1, \ldots, k \right)$ is sufficient. Now look at the ratio of joint pdfs! Thus, if the $x_j$ are iid from a member of the exponential family, these statistics are also minimal sufficient.
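The sufficient statistic $\left( \sum_j t_i(x_j) \right)_{i=1}^{k}$ can be computed generically; the sketch below uses the Gamma family's natural statistics $t_1(x) = x$ and $t_2(x) = \ln x$ as an illustration (sample values are made up).

```python
import math

def expfam_sufficient(x, t_funcs):
    # (sum_j t_1(x_j), ..., sum_j t_k(x_j)) for an iid exponential-family sample
    return tuple(sum(t(xj) for xj in x) for t in t_funcs)

# Gamma(alpha, beta): t_1(x) = x, t_2(x) = ln x
x = [0.5, 1.0, 2.0]
print(expfam_sufficient(x, [lambda v: v, math.log]))
```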

Example: Suppose that $X \sim N(\mu, \sigma^2)$ and that neither parameter is known.

\[
\frac{f(x \mid \mu, \sigma^2)}{f(y \mid \mu, \sigma^2)}
= \exp\left\{ -\frac{1}{2\sigma^2} \left[ \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} y_i^2
  - 2\mu \left( \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i \right) \right] \right\}
\]

Now if $\sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$ and $\sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} y_i^2$, the ratio is independent of $\mu, \sigma$. So $\left( \sum_{i=1}^{n} X_i,\ \sum_{i=1}^{n} X_i^2 \right)$ is minimal sufficient for $(\mu, \sigma^2)$.
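This can again be checked numerically: with hypothetical samples, the log-likelihood ratio is free of $(\mu, \sigma)$ exactly when both $\sum x_i$ and $\sum x_i^2$ match.

```python
import math

def normal_loglik(x, mu, sigma):
    # Log-likelihood of an iid N(mu, sigma^2) sample
    n = len(x)
    return (-n / 2 * math.log(2 * math.pi * sigma**2)
            - sum((xi - mu)**2 for xi in x) / (2 * sigma**2))

def log_ratio(x, y, mu, sigma):
    return normal_loglik(x, mu, sigma) - normal_loglik(y, mu, sigma)

x = [1.0, 2.0, 3.0]   # sum = 6, sum of squares = 14
y = [3.0, 2.0, 1.0]   # same sums: ratio free of the parameters
z = [0.0, 2.0, 4.0]   # sum = 6, but sum of squares = 20
print([log_ratio(x, y, m, s) for m, s in ((0, 1), (2, 3))])   # all zero
print([log_ratio(x, z, m, s) for m, s in ((0, 1), (2, 3))])   # varies
```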

Note Example 6.2.14 in C&B. The densities are expressed in terms of $(\bar{x}, s^2)$. Not surprisingly, the ratio of the likelihood functions is still independent of the parameters, so these are minimal sufficient statistics as well. Neither sufficient nor minimal sufficient statistics are necessarily unique for a family.


Example: Assume $X_i \sim \text{Gamma}(\alpha, \beta)$:

\[
f(x \mid \alpha, \beta) = \frac{x^{\alpha - 1} e^{-x/\beta}}{\Gamma(\alpha)\, \beta^{\alpha}}
= \exp\left\{ -\frac{x}{\beta} + (\alpha - 1) \ln x - \alpha \ln \beta - \ln \Gamma(\alpha) \right\}
\]

Assume that $\alpha$ is known:

\[
w_1(\beta) = -\frac{1}{\beta}, \quad t_1(x) = x, \quad c(\beta) = \beta^{-\alpha}, \quad h(x) = \frac{x^{\alpha - 1}}{\Gamma(\alpha)}
\]

Suppose instead that $\beta$ is known and $\alpha$ is unknown:

\[
w_1(\alpha) = \alpha - 1, \quad t_1(x) = \ln x, \quad c(\alpha) = \frac{1}{\Gamma(\alpha)\, \beta^{\alpha}}, \quad h(x) = e^{-x/\beta}
\]
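The algebraic rewriting of the Gamma density into exponential-family form can be verified numerically; this sketch compares the standard density with the rewritten form at a few arbitrary points.

```python
import math

def gamma_pdf(x, alpha, beta):
    # Gamma(alpha, beta) density, shape alpha and scale beta
    return x**(alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta**alpha)

def gamma_pdf_expfam(x, alpha, beta):
    # exp{ -x/beta + (alpha - 1) ln x - alpha ln beta - ln Gamma(alpha) }
    return math.exp(-x / beta + (alpha - 1) * math.log(x)
                    - alpha * math.log(beta) - math.lgamma(alpha))

for x in (0.5, 1.0, 2.5):
    assert math.isclose(gamma_pdf(x, 2.0, 1.5), gamma_pdf_expfam(x, 2.0, 1.5))
```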

Consider a generalization of the exponential family to $\theta = (\theta_i,\ i = 1, \ldots, q)$. The joint pdf is

\[
f(x_j \mid \theta) = \exp\left\{ \sum_{m=1}^{k} a_m(\theta)\, b_{jm}(x_j) + c_j(\theta) + d_j(x_j) \right\}
\]

\[
f(x \mid \theta) = \exp\left\{ \sum_{m=1}^{k} a_m(\theta)\, T_m(x) + c'(\theta) + d'(x) \right\},
\quad \text{where } T_m(x) = \sum_{j=1}^{n} b_{jm}(x_j)
\]

For the general exponential family with multiple unknown parameters, the dimension $q$ of the parameter vector and the number $k$ of terms in the sum are not necessarily the same. (Recall the life-testing example.)

If

    k < q: some nonlinear relationship exists between the parameters
    k = q: standard case
    k > q: not common, but can happen

Example: Assume that $\theta = \mu$, with $\sigma$ fixed, and $X \sim N(\mu, \sigma^2 \mu^2)$:

\[
f(x) = \frac{1}{\sqrt{2\pi \sigma^2 \mu^2}} \exp\left\{ -\frac{(x - \mu)^2}{2\sigma^2 \mu^2} \right\}
= \exp\left\{ -\frac{x^2}{2\sigma^2 \mu^2} + \frac{x}{\sigma^2 \mu} - \frac{1}{2\sigma^2}
  - \frac{1}{2} \ln\left( 2\pi \sigma^2 \mu^2 \right) \right\}
\]

Here, $k = 2$, $q = 1$, and a minimal sufficient statistic is $\left( \sum_{i=1}^{n} X_i,\ \sum_{i=1}^{n} X_i^2 \right)$.
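As a check on this expansion, the sketch below compares the $N(\mu, \sigma^2\mu^2)$ density with its expanded exponential-family form ($\sigma$ treated as a fixed constant; values are arbitrary).

```python
import math

def curved_normal_pdf(x, mu, sigma):
    # N(mu, sigma^2 * mu^2) density, sigma fixed
    v = sigma**2 * mu**2
    return math.exp(-(x - mu)**2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def curved_normal_expfam(x, mu, sigma):
    # exp{ -x^2/(2 s^2 m^2) + x/(s^2 m) - 1/(2 s^2) - (1/2) ln(2 pi s^2 m^2) }
    v = sigma**2 * mu**2
    return math.exp(-x**2 / (2 * v) + x / (sigma**2 * mu)
                    - 1 / (2 * sigma**2) - 0.5 * math.log(2 * math.pi * v))

for x in (-1.0, 0.5, 2.0):
    assert math.isclose(curved_normal_pdf(x, 1.5, 0.8), curved_normal_expfam(x, 1.5, 0.8))
```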

See also Example 6.2.15, a uniform distribution.
