
THE ANOVA APPROACH TO THE ANALYSIS OF

LINEAR MIXED EFFECTS MODELS


We begin with a relatively simple special case. Suppose

$y_{ijk} = \mu + \tau_i + u_{ij} + e_{ijk}, \quad (i = 1, \ldots, t;\ j = 1, \ldots, n;\ k = 1, \ldots, m),$

$\beta = (\mu, \tau_1, \ldots, \tau_t)', \quad u = (u_{11}, u_{12}, \ldots, u_{tn})', \quad e = (e_{111}, e_{112}, \ldots, e_{tnm})',$

$\beta \in \mathbb{R}^{t+1}$, an unknown parameter vector,

$\begin{bmatrix} u \\ e \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_u^2 I & 0 \\ 0 & \sigma_e^2 I \end{bmatrix} \right)$, where $\sigma_u^2, \sigma_e^2 \in \mathbb{R}^+$ are unknown variance components.

© 2012 Iowa State University, Statistics 511 (47 slides)

This is the standard model for a CRD with $t$ treatments, $n$ experimental units per treatment, and $m$ observations per experimental unit.

We can write the model as $y = X\beta + Zu + e$, where

$X = [1_{tnm \times 1},\ I_{t \times t} \otimes 1_{nm \times 1}] \quad \text{and} \quad Z = I_{tn \times tn} \otimes 1_{m \times 1}.$
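The Kronecker-product structure of $X$ and $Z$ can be sketched numerically. A minimal illustration assuming NumPy is available; the values of $t$, $n$, $m$ are arbitrary choices:

```python
import numpy as np

# Build the design matrices for the balanced CRD with subsampling,
# with illustrative dimensions t = 3 treatments, n = 2 experimental
# units per treatment, m = 4 observations per experimental unit.
t, n, m = 3, 2, 4

X = np.hstack([np.ones((t * n * m, 1)),                   # column for mu
               np.kron(np.eye(t), np.ones((n * m, 1)))])  # columns for tau_1..tau_t
Z = np.kron(np.eye(t * n), np.ones((m, 1)))               # one column per experimental unit

print(X.shape)  # (24, 4)
print(Z.shape)  # (24, 6)
```

Each row of $Z$ has a single 1 marking the experimental unit that observation belongs to.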

The ANOVA Table

Source                              DF
treatments                          $t - 1$
exp.units(treatments)               $t(n - 1)$
obs.units(exp.units, treatments)    $tn(m - 1)$
c.total                             $tnm - 1$

The ANOVA Table

Source        DF            Sum of Squares
trt           $t - 1$       $\sum_{i=1}^t \sum_{j=1}^n \sum_{k=1}^m (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$
xu(trt)       $t(n - 1)$    $\sum_{i=1}^t \sum_{j=1}^n \sum_{k=1}^m (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot})^2$
ou(xu, trt)   $tn(m - 1)$   $\sum_{i=1}^t \sum_{j=1}^n \sum_{k=1}^m (y_{ijk} - \bar{y}_{ij\cdot})^2$
c.total       $tnm - 1$     $\sum_{i=1}^t \sum_{j=1}^n \sum_{k=1}^m (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$

Source        DF           Sum of Squares                                                                 Mean Square
trt           $t - 1$      $nm \sum_{i=1}^t (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$
xu(trt)       $tn - t$     $m \sum_{i=1}^t \sum_{j=1}^n (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot})^2$    SS/DF
ou(xu, trt)   $tnm - tn$   $\sum_{i=1}^t \sum_{j=1}^n \sum_{k=1}^m (y_{ijk} - \bar{y}_{ij\cdot})^2$
c.total       $tnm - 1$    $\sum_{i=1}^t \sum_{j=1}^n \sum_{k=1}^m (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$
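The sums of squares above can be computed directly from a three-way data array. A sketch with simulated balanced data, assuming NumPy; all parameter values are illustrative:

```python
import numpy as np

# Simulate balanced data y[i, j, k] from the model and compute the
# ANOVA sums of squares using the formulas in the table above.
rng = np.random.default_rng(0)
t, n, m = 3, 2, 4
tau = np.array([0.0, 1.0, 2.0])                 # illustrative treatment effects
u = rng.normal(0, 1.0, size=(t, n))             # experimental-unit effects
e = rng.normal(0, 0.5, size=(t, n, m))          # observational errors
y = 10.0 + tau[:, None, None] + u[:, :, None] + e

ybar_i = y.mean(axis=(1, 2))    # treatment means  ybar_{i..}
ybar_ij = y.mean(axis=2)        # unit means       ybar_{ij.}
ybar = y.mean()                 # grand mean       ybar_{...}

SS_trt = n * m * np.sum((ybar_i - ybar) ** 2)
SS_xu = m * np.sum((ybar_ij - ybar_i[:, None]) ** 2)
SS_ou = np.sum((y - ybar_ij[:, :, None]) ** 2)
SS_total = np.sum((y - ybar) ** 2)

# The three component sums of squares add to the corrected total.
print(np.isclose(SS_trt + SS_xu + SS_ou, SS_total))  # True
```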

Expected Mean Squares

$E(\text{MS}_{trt}) = \frac{nm}{t-1} \sum_{i=1}^t E(\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$

$= \frac{nm}{t-1} \sum_{i=1}^t E(\tau_i + \bar{u}_{i\cdot} + \bar{e}_{i\cdot\cdot} - \bar{\tau}_{\cdot} - \bar{u}_{\cdot\cdot} - \bar{e}_{\cdot\cdot\cdot})^2$

$= \frac{nm}{t-1} \sum_{i=1}^t E(\tau_i - \bar{\tau}_{\cdot} + \bar{u}_{i\cdot} - \bar{u}_{\cdot\cdot} + \bar{e}_{i\cdot\cdot} - \bar{e}_{\cdot\cdot\cdot})^2$

$= \frac{nm}{t-1} \sum_{i=1}^t \left[ (\tau_i - \bar{\tau}_{\cdot})^2 + E(\bar{u}_{i\cdot} - \bar{u}_{\cdot\cdot})^2 + E(\bar{e}_{i\cdot\cdot} - \bar{e}_{\cdot\cdot\cdot})^2 \right]$

$= \frac{nm}{t-1} \left[ \sum_{i=1}^t (\tau_i - \bar{\tau}_{\cdot})^2 + E\left\{ \sum_{i=1}^t (\bar{u}_{i\cdot} - \bar{u}_{\cdot\cdot})^2 \right\} + E\left\{ \sum_{i=1}^t (\bar{e}_{i\cdot\cdot} - \bar{e}_{\cdot\cdot\cdot})^2 \right\} \right]$

To simplify this expression further, note that

$\bar{u}_{1\cdot}, \ldots, \bar{u}_{t\cdot} \overset{\text{i.i.d.}}{\sim} N\!\left( 0, \frac{\sigma_u^2}{n} \right)$.

Thus,

$E\left\{ \sum_{i=1}^t (\bar{u}_{i\cdot} - \bar{u}_{\cdot\cdot})^2 \right\} = (t - 1)\frac{\sigma_u^2}{n}$.

Similarly,

$\bar{e}_{1\cdot\cdot}, \ldots, \bar{e}_{t\cdot\cdot} \overset{\text{i.i.d.}}{\sim} N\!\left( 0, \frac{\sigma_e^2}{nm} \right)$.

Thus,

$E\left\{ \sum_{i=1}^t (\bar{e}_{i\cdot\cdot} - \bar{e}_{\cdot\cdot\cdot})^2 \right\} = (t - 1)\frac{\sigma_e^2}{nm}$.

It follows that

$E(\text{MS}_{trt}) = \frac{nm}{t-1} \sum_{i=1}^t (\tau_i - \bar{\tau}_{\cdot})^2 + m\sigma_u^2 + \sigma_e^2$.

Similar calculations allow us to add an Expected Mean Squares (EMS) column to our ANOVA table.

Source        EMS
trt           $\sigma_e^2 + m\sigma_u^2 + \frac{nm}{t-1} \sum_{i=1}^t (\tau_i - \bar{\tau}_{\cdot})^2$
xu(trt)       $\sigma_e^2 + m\sigma_u^2$
ou(xu, trt)   $\sigma_e^2$

The entire table could also be derived using matrices

$X_1 = 1, \quad X_2 = I_{t \times t} \otimes 1_{nm \times 1}, \quad X_3 = I_{tn \times tn} \otimes 1_{m \times 1}.$

Source        Sum of Squares     DF                                      MS
trt           $y'(P_2 - P_1)y$   $\text{rank}(X_2) - \text{rank}(X_1)$
xu(trt)       $y'(P_3 - P_2)y$   $\text{rank}(X_3) - \text{rank}(X_2)$   SS/DF
ou(xu, trt)   $y'(I - P_3)y$     $tnm - \text{rank}(X_3)$
c.total       $y'(I - P_1)y$     $tnm - 1$

Expected Mean Squares (EMS) could be computed using

$E(y'Ay) = \text{tr}(A\Sigma) + E(y)'A\,E(y),$

where

$\Sigma = \text{Var}(y) = ZGZ' + R = \sigma_u^2\, I_{tn \times tn} \otimes 1 1'_{m \times m} + \sigma_e^2\, I_{tnm \times tnm}$

and

$E(y) = \begin{bmatrix} \mu + \tau_1 \\ \mu + \tau_2 \\ \vdots \\ \mu + \tau_t \end{bmatrix} \otimes 1_{nm \times 1}.$

Furthermore, it can be shown that

$\frac{y'(P_2 - P_1)y}{\sigma_e^2 + m\sigma_u^2} \sim \chi^2_{t-1}\!\left( \frac{nm \sum_{i=1}^t (\tau_i - \bar{\tau}_{\cdot})^2}{\sigma_e^2 + m\sigma_u^2} \right),$

$\frac{y'(P_3 - P_2)y}{\sigma_e^2 + m\sigma_u^2} \sim \chi^2_{tn-t},$

$\frac{y'(I - P_3)y}{\sigma_e^2} \sim \chi^2_{tnm-tn},$

and that these three $\chi^2$ random variables are independent.

It follows that

$F_1 = \frac{\text{MS}_{trt}}{\text{MS}_{xu(trt)}} \sim F_{t-1,\ tn-t}\!\left( \frac{nm \sum_{i=1}^t (\tau_i - \bar{\tau}_{\cdot})^2}{\sigma_e^2 + m\sigma_u^2} \right)$

and

$F_2 = \frac{\text{MS}_{xu(trt)}}{\text{MS}_{ou(xu,trt)}} \sim \frac{\sigma_e^2 + m\sigma_u^2}{\sigma_e^2}\, F_{tn-t,\ tnm-tn}.$

Thus, we can use $F_1$ to test $H_0: \tau_1 = \cdots = \tau_t$ and $F_2$ to test $H_0: \sigma_u^2 = 0$.
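That $F_1$ is central $F_{t-1,\, tn-t}$ under $H_0$ can be checked by simulation. A Monte Carlo sketch assuming NumPy; parameter values are illustrative, and the check compares the simulated mean of $F_1$ to the known mean $d_2/(d_2 - 2)$ of an $F$ distribution with $d_2$ denominator DF:

```python
import numpy as np

# Under H0 (all tau_i equal), F1 = MStrt / MSxu(trt) should follow
# F_{t-1, tn-t}.  Simulate many datasets and check the mean of F1.
rng = np.random.default_rng(2)
t, n, m = 4, 3, 2
sigma_u, sigma_e = 1.0, 0.5

F1 = []
for _ in range(4000):
    u = rng.normal(0, sigma_u, size=(t, n))
    e = rng.normal(0, sigma_e, size=(t, n, m))
    y = 5.0 + u[:, :, None] + e                  # tau_i = 0 for all i
    ybar_i, ybar_ij, ybar = y.mean(axis=(1, 2)), y.mean(axis=2), y.mean()
    MStrt = n * m * np.sum((ybar_i - ybar) ** 2) / (t - 1)
    MSxu = m * np.sum((ybar_ij - ybar_i[:, None]) ** 2) / (t * n - t)
    F1.append(MStrt / MSxu)

d2 = t * n - t                                   # denominator DF = 8
print(round(np.mean(F1), 2), round(d2 / (d2 - 2), 2))  # both near 1.33
```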

Estimating $\sigma_u^2$

Note that

$E\left( \frac{\text{MS}_{xu(trt)} - \text{MS}_{ou(xu,trt)}}{m} \right) = \frac{(\sigma_e^2 + m\sigma_u^2) - \sigma_e^2}{m} = \sigma_u^2.$

Thus,

$\frac{\text{MS}_{xu(trt)} - \text{MS}_{ou(xu,trt)}}{m}$

is an unbiased estimator of $\sigma_u^2$.

Although $\left( \text{MS}_{xu(trt)} - \text{MS}_{ou(xu,trt)} \right) / m$ is an unbiased estimator of $\sigma_u^2$, this estimator can take negative values. This is undesirable because $\sigma_u^2$, the variance of the $u$ random effects, cannot be negative.
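Both properties, unbiasedness and the possibility of negative values, can be seen in simulation. A sketch assuming NumPy; the variance-component values are illustrative:

```python
import numpy as np

# The ANOVA estimator (MSxu - MSou)/m is unbiased for sigma_u^2 but
# can go negative; check both by simulation.
rng = np.random.default_rng(3)
t, n, m = 3, 4, 2
sigma_u2, sigma_e2 = 0.2, 1.0

ests = []
for _ in range(5000):
    u = rng.normal(0, np.sqrt(sigma_u2), size=(t, n))
    e = rng.normal(0, np.sqrt(sigma_e2), size=(t, n, m))
    y = u[:, :, None] + e
    ybar_i, ybar_ij = y.mean(axis=(1, 2)), y.mean(axis=2)
    MSxu = m * np.sum((ybar_ij - ybar_i[:, None]) ** 2) / (t * (n - 1))
    MSou = np.sum((y - ybar_ij[:, :, None]) ** 2) / (t * n * (m - 1))
    ests.append((MSxu - MSou) / m)

ests = np.array(ests)
print(round(ests.mean(), 2))   # near sigma_u2 = 0.2
print((ests < 0).mean())       # a nonzero fraction of estimates is negative
```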

As we have seen previously,

$\Sigma = \text{Var}(y) = \sigma_u^2\, I_{tn \times tn} \otimes 1 1'_{m \times m} + \sigma_e^2\, I_{tnm \times tnm}.$

It turns out that

$\hat{\beta}_\Sigma = (X'\Sigma^{-1}X)^- X'\Sigma^{-1}y = (X'X)^- X'y = \hat{\beta}.$

Thus, the GLS estimator of any estimable $C\beta$ is equal to the OLS estimator in this special case.

An Analysis Based on the Average for Each Experimental Unit

Recall that our model is

$y_{ijk} = \mu + \tau_i + u_{ij} + e_{ijk}, \quad (i = 1, \ldots, t;\ j = 1, \ldots, n;\ k = 1, \ldots, m).$

The average of the observations for experimental unit $ij$ is

$\bar{y}_{ij\cdot} = \mu + \tau_i + u_{ij} + \bar{e}_{ij\cdot}.$

If we define

$\varepsilon_{ij} = u_{ij} + \bar{e}_{ij\cdot} \ \forall\ i, j \quad \text{and} \quad \sigma^2 = \sigma_u^2 + \frac{\sigma_e^2}{m},$

we have

$\bar{y}_{ij\cdot} = \mu + \tau_i + \varepsilon_{ij},$

where the $\varepsilon_{ij}$ terms are i.i.d. $N(0, \sigma^2)$. Thus, averaging the same number ($m$) of observations per experimental unit results in a normal theory Gauss-Markov linear model for the averages $\{\bar{y}_{ij\cdot} : i = 1, \ldots, t;\ j = 1, \ldots, n\}$.
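The equivalence of the unit-means analysis can be sketched numerically: OLS on the unit means reproduces the treatment-mean BLUEs from the full data. A minimal illustration assuming NumPy; all values are illustrative:

```python
import numpy as np

# Simulate balanced data, average within experimental units, and fit
# the cell-means model to the unit means by OLS.
rng = np.random.default_rng(4)
t, n, m = 3, 2, 4
tau = np.array([0.0, 1.0, 2.0])
y = (10.0 + tau[:, None, None]
     + rng.normal(0, 1.0, size=(t, n))[:, :, None]
     + rng.normal(0, 0.5, size=(t, n, m)))

ybar_ij = y.mean(axis=2)                     # unit means: the "data" for OLS
Xbar = np.kron(np.eye(t), np.ones((n, 1)))   # cell-means design for the means
beta_hat, *_ = np.linalg.lstsq(Xbar, ybar_ij.reshape(-1), rcond=None)

# The OLS fit on the means equals the treatment means of the full data.
print(np.allclose(beta_hat, y.mean(axis=(1, 2))))  # True
```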

Inferences about estimable functions of $\beta$ obtained by analyzing these averages are identical to the results obtained using the ANOVA approach, as long as the number of observations per experimental unit is the same for all experimental units.

When using the averages as data, our estimate of $\sigma^2$ is an estimate of $\sigma_u^2 + \frac{\sigma_e^2}{m}$.

We can't separately estimate $\sigma_u^2$ and $\sigma_e^2$, but this doesn't matter if our focus is on inference for estimable functions of $\beta$.

Because

$E(y) = \begin{bmatrix} \mu + \tau_1 \\ \mu + \tau_2 \\ \vdots \\ \mu + \tau_t \end{bmatrix} \otimes 1_{nm \times 1},$

the only estimable quantities are linear combinations of the treatment means $\mu + \tau_1, \mu + \tau_2, \ldots, \mu + \tau_t$, whose Best Linear Unbiased Estimators are $\bar{y}_{1\cdot\cdot}, \bar{y}_{2\cdot\cdot}, \ldots, \bar{y}_{t\cdot\cdot}$, respectively.

Thus, any estimable $C\beta$ can always be written as

$A \begin{bmatrix} \mu + \tau_1 \\ \mu + \tau_2 \\ \vdots \\ \mu + \tau_t \end{bmatrix}$

for some matrix $A$. It follows that the BLUE of $C\beta$ can be written as

$A \begin{bmatrix} \bar{y}_{1\cdot\cdot} \\ \bar{y}_{2\cdot\cdot} \\ \vdots \\ \bar{y}_{t\cdot\cdot} \end{bmatrix}.$

Now note that

$\text{Var}(\bar{y}_{i\cdot\cdot}) = \text{Var}(\mu + \tau_i + \bar{u}_{i\cdot} + \bar{e}_{i\cdot\cdot}) = \text{Var}(\bar{u}_{i\cdot} + \bar{e}_{i\cdot\cdot}) = \text{Var}(\bar{u}_{i\cdot}) + \text{Var}(\bar{e}_{i\cdot\cdot}) = \frac{\sigma_u^2}{n} + \frac{\sigma_e^2}{nm} = \frac{1}{n}\left( \sigma_u^2 + \frac{\sigma_e^2}{m} \right) = \frac{\sigma^2}{n}.$

Thus

$\text{Var}\begin{bmatrix} \bar{y}_{1\cdot\cdot} \\ \bar{y}_{2\cdot\cdot} \\ \vdots \\ \bar{y}_{t\cdot\cdot} \end{bmatrix} = \frac{\sigma^2}{n} I_{t \times t},$

which implies that the variance of the BLUE of $C\beta$ is

$\text{Var}\left( A \begin{bmatrix} \bar{y}_{1\cdot\cdot} \\ \vdots \\ \bar{y}_{t\cdot\cdot} \end{bmatrix} \right) = A\, \frac{\sigma^2}{n} I_{t \times t}\, A' = \frac{\sigma^2}{n} AA'.$

Thus, we don't need separate estimates of $\sigma_u^2$ and $\sigma_e^2$ to carry out inference for estimable $C\beta$. We do need to estimate $\sigma^2 = \sigma_u^2 + \frac{\sigma_e^2}{m}$.

This can equivalently be estimated by $\frac{\text{MS}_{xu(trt)}}{m}$ or by the MSE in an analysis of the experimental unit means $\{\bar{y}_{ij\cdot} : i = 1, \ldots, t;\ j = 1, \ldots, n\}$.

For example, suppose we want to estimate $\tau_1 - \tau_2$. The BLUE is $\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot}$, whose variance is

$\text{Var}(\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot}) = \text{Var}(\bar{y}_{1\cdot\cdot}) + \text{Var}(\bar{y}_{2\cdot\cdot}) = 2\frac{\sigma^2}{n} = 2\left( \frac{\sigma_u^2}{n} + \frac{\sigma_e^2}{mn} \right) = \frac{2}{mn}(\sigma_e^2 + m\sigma_u^2) = \frac{2\,E(\text{MS}_{xu(trt)})}{mn}.$

Thus,

$\widehat{\text{Var}}(\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot}) = \frac{2\,\text{MS}_{xu(trt)}}{mn}.$

A $100(1 - \alpha)\%$ confidence interval for $\tau_1 - \tau_2$ is

$\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot} \pm t_{t(n-1),\ 1-\alpha/2} \sqrt{\frac{2\,\text{MS}_{xu(trt)}}{mn}}.$

A test of $H_0: \tau_1 = \tau_2$ can be based on

$t = \frac{\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot}}{\sqrt{\frac{2\,\text{MS}_{xu(trt)}}{mn}}} \sim t_{t(n-1)}\!\left( \frac{\tau_1 - \tau_2}{\sqrt{\frac{2(\sigma_e^2 + m\sigma_u^2)}{mn}}} \right).$
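Computing this interval is mechanical once $\text{MS}_{xu(trt)}$ is in hand. A sketch assuming NumPy, with $t = 3$, $n = 3$, $m = 2$ so that $\text{DF} = t(n-1) = 6$; the critical value $t_{6,\,0.975} \approx 2.447$ is taken from a t table, and all data values are simulated for illustration:

```python
import numpy as np

# 95% CI for tau_1 - tau_2 based on MSxu(trt).
rng = np.random.default_rng(5)
t, n, m = 3, 3, 2
tau = np.array([0.0, 1.5, 3.0])
y = (tau[:, None, None]
     + rng.normal(0, 1.0, size=(t, n))[:, :, None]
     + rng.normal(0, 0.5, size=(t, n, m)))

ybar_i, ybar_ij = y.mean(axis=(1, 2)), y.mean(axis=2)
MSxu = m * np.sum((ybar_ij - ybar_i[:, None]) ** 2) / (t * (n - 1))

est = ybar_i[0] - ybar_i[1]                   # BLUE of tau_1 - tau_2
half = 2.447 * np.sqrt(2 * MSxu / (m * n))    # t_{t(n-1), 0.975} * SE
print(est - half, est + half)
```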

What if the number of observations per experimental unit is not the same for all experimental units?

Let us look at two miniature examples to understand how this type of unbalancedness affects estimation and inference.

First Example

$y = \begin{bmatrix} y_{111} \\ y_{121} \\ y_{211} \\ y_{212} \end{bmatrix}, \quad X = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}, \quad Z = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}$

$X_1 = 1, \quad X_2 = X, \quad X_3 = Z$

$\text{MS}_{trt} = y'(P_2 - P_1)y = 2(\bar{y}_{1\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 + 2(\bar{y}_{2\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = (\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot})^2$

$\text{MS}_{xu(trt)} = y'(P_3 - P_2)y = (y_{111} - \bar{y}_{1\cdot\cdot})^2 + (y_{121} - \bar{y}_{1\cdot\cdot})^2 = \frac{1}{2}(y_{111} - y_{121})^2$

$\text{MS}_{ou(xu,trt)} = y'(I - P_3)y = (y_{211} - \bar{y}_{2\cdot\cdot})^2 + (y_{212} - \bar{y}_{2\cdot\cdot})^2 = \frac{1}{2}(y_{211} - y_{212})^2$

$E(\text{MS}_{trt}) = E(\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot})^2$
$= E(\tau_1 - \tau_2 + \bar{u}_{1\cdot} - u_{21} + \bar{e}_{1\cdot\cdot} - \bar{e}_{2\cdot\cdot})^2$
$= (\tau_1 - \tau_2)^2 + \text{Var}(\bar{u}_{1\cdot}) + \text{Var}(u_{21}) + \text{Var}(\bar{e}_{1\cdot\cdot}) + \text{Var}(\bar{e}_{2\cdot\cdot})$
$= (\tau_1 - \tau_2)^2 + \frac{\sigma_u^2}{2} + \sigma_u^2 + \frac{\sigma_e^2}{2} + \frac{\sigma_e^2}{2}$
$= (\tau_1 - \tau_2)^2 + 1.5\sigma_u^2 + \sigma_e^2$

$E(\text{MS}_{xu(trt)}) = \frac{1}{2}E(y_{111} - y_{121})^2 = \frac{1}{2}E(u_{11} - u_{12} + e_{111} - e_{121})^2 = \frac{1}{2}(2\sigma_u^2 + 2\sigma_e^2) = \sigma_u^2 + \sigma_e^2$

$E(\text{MS}_{ou(xu,trt)}) = \frac{1}{2}E(y_{211} - y_{212})^2 = \frac{1}{2}E(e_{211} - e_{212})^2 = \sigma_e^2$
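The EMS result for this tiny example, $E(\text{MS}_{trt}) = (\tau_1 - \tau_2)^2 + 1.5\sigma_u^2 + \sigma_e^2$, can be checked by Monte Carlo. A sketch assuming NumPy; $\mu$ is set to 0 since it cancels in $\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot}$, and all parameter values are illustrative:

```python
import numpy as np

# Simulate the 4-observation example many times and average MStrt.
rng = np.random.default_rng(6)
tau1, tau2 = 0.0, 1.0
sigma_u2, sigma_e2 = 1.0, 0.5

MStrt = []
for _ in range(20000):
    u11, u12, u21 = rng.normal(0, np.sqrt(sigma_u2), size=3)
    e = rng.normal(0, np.sqrt(sigma_e2), size=4)   # e111, e121, e211, e212
    y111, y121 = tau1 + u11 + e[0], tau1 + u12 + e[1]
    y211, y212 = tau2 + u21 + e[2], tau2 + u21 + e[3]
    ybar1, ybar2 = (y111 + y121) / 2, (y211 + y212) / 2
    MStrt.append((ybar1 - ybar2) ** 2)             # MStrt = (ybar1 - ybar2)^2

expected = (tau1 - tau2) ** 2 + 1.5 * sigma_u2 + sigma_e2   # = 3.0
print(round(np.mean(MStrt), 1))   # near expected = 3.0
```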

SOURCE        EMS
trt           $(\tau_1 - \tau_2)^2 + 1.5\sigma_u^2 + \sigma_e^2$
xu(trt)       $\sigma_u^2 + \sigma_e^2$
ou(xu, trt)   $\sigma_e^2$

$F = \frac{\text{MS}_{trt}}{1.5\sigma_u^2 + \sigma_e^2} \bigg/ \frac{\text{MS}_{xu(trt)}}{\sigma_u^2 + \sigma_e^2} \sim F_{1,1}\!\left( \frac{(\tau_1 - \tau_2)^2}{1.5\sigma_u^2 + \sigma_e^2} \right)$

The test statistic that we used to test $H_0: \tau_1 = \cdots = \tau_t$ in the balanced case is not F distributed in this unbalanced case:

$\frac{\text{MS}_{trt}}{\text{MS}_{xu(trt)}} \sim \frac{1.5\sigma_u^2 + \sigma_e^2}{\sigma_u^2 + \sigma_e^2}\, F_{1,1}\!\left( \frac{(\tau_1 - \tau_2)^2}{1.5\sigma_u^2 + \sigma_e^2} \right).$

A Statistic with an Approximate F Distribution

We'd like our denominator to be an unbiased estimator of $1.5\sigma_u^2 + \sigma_e^2$ in this case.

Consider $1.5\,\text{MS}_{xu(trt)} - 0.5\,\text{MS}_{ou(xu,trt)}$. The expectation is

$1.5(\sigma_u^2 + \sigma_e^2) - 0.5\sigma_e^2 = 1.5\sigma_u^2 + \sigma_e^2.$

The ratio

$\frac{\text{MS}_{trt}}{1.5\,\text{MS}_{xu(trt)} - 0.5\,\text{MS}_{ou(xu,trt)}}$

can be used as an approximate F statistic with 1 numerator DF and a denominator DF obtained using the Cochran-Satterthwaite method.
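The Cochran-Satterthwaite denominator DF for a linear combination $\sum_i c_i \text{MS}_i$ is $\nu = (\sum_i c_i \text{MS}_i)^2 / \sum_i (c_i \text{MS}_i)^2 / \text{df}_i$. A sketch assuming NumPy-free plain Python; the function name and the mean-square values are illustrative, not from real data:

```python
# Satterthwaite DF for the linear combination 1.5*MSxu - 0.5*MSou
# (coefficients from the EMS table above; both DF are 1 in this example).
def satterthwaite_df(coefs, mean_squares, dfs):
    # nu = (sum c_i MS_i)^2 / sum_i (c_i MS_i)^2 / df_i
    num = sum(c * ms for c, ms in zip(coefs, mean_squares)) ** 2
    den = sum((c * ms) ** 2 / df for c, ms, df in zip(coefs, mean_squares, dfs))
    return num / den

MSxu, MSou = 2.0, 0.8                  # illustrative mean-square values
denom = 1.5 * MSxu - 0.5 * MSou        # estimates 1.5 sigma_u^2 + sigma_e^2
nu = satterthwaite_df([1.5, -0.5], [MSxu, MSou], [1, 1])
print(round(denom, 2), round(nu, 2))   # 2.6 0.74
```

Note how small the approximate DF is here, consistent with the warning below about this tiny dataset.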

The Cochran-Satterthwaite method will be explained in the next set of notes.

We should not expect this approximate F-test to be reliable in this case because of our pitifully small dataset.

Best Linear Unbiased Estimates in this First Example

What do the BLUEs of the treatment means look like in this case? Recall

$\beta = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \quad X = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}, \quad Z = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}.$

$\Sigma = \text{Var}(y) = ZGZ' + R = \sigma_u^2 ZZ' + \sigma_e^2 I$

$= \sigma_u^2 \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix} + \sigma_e^2 \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

$= \begin{bmatrix} \sigma_u^2 + \sigma_e^2 & 0 & 0 & 0 \\ 0 & \sigma_u^2 + \sigma_e^2 & 0 & 0 \\ 0 & 0 & \sigma_u^2 + \sigma_e^2 & \sigma_u^2 \\ 0 & 0 & \sigma_u^2 & \sigma_u^2 + \sigma_e^2 \end{bmatrix}$

It follows that

$\hat{\beta}_\Sigma = (X'\Sigma^{-1}X)^- X'\Sigma^{-1}y = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & 0 & 0 \\ 0 & 0 & \frac{1}{2} & \frac{1}{2} \end{bmatrix} y = \begin{bmatrix} \bar{y}_{1\cdot\cdot} \\ \bar{y}_{2\cdot\cdot} \end{bmatrix}.$

Fortunately, this is a linear estimator that does not depend on unknown variance components.
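That the GLS coefficient matrix reduces to simple group averaging, whatever the variance components, can be verified directly. A sketch assuming NumPy; the variance-component values are arbitrary illustrations:

```python
import numpy as np

# First example: GLS with the true Sigma reduces to the group means,
# independent of sigma_u^2 and sigma_e^2.
su2, se2 = 2.0, 0.7                              # illustrative values
X = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
Z = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]], dtype=float)
Sigma = su2 * Z @ Z.T + se2 * np.eye(4)

Si = np.linalg.inv(Sigma)
A = np.linalg.inv(X.T @ Si @ X) @ X.T @ Si       # GLS coefficient matrix
print(np.allclose(A, [[0.5, 0.5, 0, 0], [0, 0, 0.5, 0.5]]))  # True
```

Try other values of `su2` and `se2`: the coefficient matrix stays the same, which is exactly why this GLS estimator is linear.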

Second Example

$y = \begin{bmatrix} y_{111} \\ y_{112} \\ y_{121} \\ y_{211} \end{bmatrix}, \quad X = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad Z = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

In this case, it can be shown that

$\hat{\beta}_\Sigma = (X'\Sigma^{-1}X)^- X'\Sigma^{-1}y = \begin{bmatrix} \frac{\sigma_e^2 + \sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2} & \frac{\sigma_e^2 + \sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2} & \frac{\sigma_e^2 + 2\sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} y_{111} \\ y_{112} \\ y_{121} \\ y_{211} \end{bmatrix} = \begin{bmatrix} \frac{2\sigma_e^2 + 2\sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2}\, \bar{y}_{11\cdot} + \frac{\sigma_e^2 + 2\sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2}\, y_{121} \\ y_{211} \end{bmatrix}.$

It is straightforward to show that the weights on $\bar{y}_{11\cdot}$ and $y_{121}$ are

$\frac{1/\text{Var}(\bar{y}_{11\cdot})}{1/\text{Var}(\bar{y}_{11\cdot}) + 1/\text{Var}(y_{121})} \quad \text{and} \quad \frac{1/\text{Var}(y_{121})}{1/\text{Var}(\bar{y}_{11\cdot}) + 1/\text{Var}(y_{121})},$

respectively.

This is a special case of a more general phenomenon: the BLUE is a weighted average of independent linear unbiased estimators, with weights proportional to the inverse variances of those estimators.
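The inverse-variance characterization of the weights can be checked against the closed-form expressions above. A sketch assuming NumPy; the variance-component values are arbitrary illustrations:

```python
import numpy as np

# Verify the GLS weights in the second example equal the normalized
# inverse variances of ybar_11. and y_121.
su2, se2 = 1.0, 0.5                       # sigma_u^2, sigma_e^2 (illustrative)

w1_formula = (2 * se2 + 2 * su2) / (3 * se2 + 4 * su2)
w2_formula = (se2 + 2 * su2) / (3 * se2 + 4 * su2)

v1 = su2 + se2 / 2                        # Var(ybar_11.): mean of 2 obs, shared u_11
v2 = su2 + se2                            # Var(y_121)
w1 = (1 / v1) / (1 / v1 + 1 / v2)
w2 = (1 / v2) / (1 / v1 + 1 / v2)

print(np.isclose(w1, w1_formula), np.isclose(w2, w2_formula))  # True True
```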

Of course, in this case and in many others,

$\hat{\beta}_\Sigma = \begin{bmatrix} \frac{2\sigma_e^2 + 2\sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2}\, \bar{y}_{11\cdot} + \frac{\sigma_e^2 + 2\sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2}\, y_{121} \\ y_{211} \end{bmatrix}$

is not an estimator because it is a function of unknown parameters. Thus, we use $\hat{\beta}_{\hat{\Sigma}}$ as our estimator (i.e., we replace $\sigma_e^2$ and $\sigma_u^2$ by estimates in the expression above).

$\hat{\beta}_{\hat{\Sigma}}$ is an approximation to the BLUE.

$\hat{\beta}_{\hat{\Sigma}}$ is not even a linear estimator in this case.

Its exact distribution is unknown.

When sample sizes are large, it is reasonable to assume that the distribution of $\hat{\beta}_{\hat{\Sigma}}$ is approximately the same as the distribution of $\hat{\beta}_\Sigma$.

$\text{Var}(\hat{\beta}_\Sigma) = \text{Var}[(X'\Sigma^{-1}X)^{-1} X'\Sigma^{-1}y]$
$= (X'\Sigma^{-1}X)^{-1} X'\Sigma^{-1}\, \text{Var}(y)\, [(X'\Sigma^{-1}X)^{-1} X'\Sigma^{-1}]'$
$= (X'\Sigma^{-1}X)^{-1} X'\Sigma^{-1}\, \Sigma\, \Sigma^{-1} X (X'\Sigma^{-1}X)^{-1}$
$= (X'\Sigma^{-1}X)^{-1} X'\Sigma^{-1} X (X'\Sigma^{-1}X)^{-1}$
$= (X'\Sigma^{-1}X)^{-1}$

$\text{Var}(\hat{\beta}_{\hat{\Sigma}}) = \text{Var}[(X'\hat{\Sigma}^{-1}X)^{-1} X'\hat{\Sigma}^{-1}y] \overset{????}{=} (X'\hat{\Sigma}^{-1}X)^{-1}$

Summary of Main Points

Many of the concepts we have seen by examining special cases hold in greater generality.

For many of the linear mixed models commonly used in practice, balanced data are nice because...

It is relatively easy to determine degrees of freedom, sums of squares, and expected mean squares in an ANOVA table.

Ratios of appropriate mean squares can be used to obtain exact F-tests.

For estimable $C\beta$, $C\hat{\beta}_{\hat{\Sigma}} = C\hat{\beta}$ (OLS = GLS).

When $\text{Var}(c'\hat{\beta}) = \text{constant} \cdot E(\text{MS})$, exact inferences about $c'\beta$ can be obtained by constructing t-tests or confidence intervals based on

$t = \frac{c'\hat{\beta} - c'\beta}{\sqrt{\text{constant} \cdot \text{MS}}} \sim t_{\text{DF(MS)}}.$

Simple analysis based on experimental unit averages gives the same results as those obtained by linear mixed model analysis of the full data set.

When data are unbalanced, the analysis of linear mixed models may be considerably more complicated.

Approximate F-tests can be obtained by forming linear combinations of mean squares to obtain denominators for test statistics.

The estimator $C\hat{\beta}_{\hat{\Sigma}}$ may be a nonlinear estimator of $C\beta$ whose exact distribution is unknown.

Approximate inference for $C\beta$ is often obtained by using the distribution of $C\hat{\beta}_\Sigma$, with unknowns in that distribution replaced by estimates.

Whether data are balanced or unbalanced, unbiased estimators of variance components can be obtained using linear combinations of mean squares from the ANOVA table.
