Abstract-- This paper discusses the estimation of the multivariate linear mixed model, or multivariate variance components model, with an equal number of replications. We focus on two estimation methods, namely Maximum Likelihood Estimation (MLE) and Restricted Maximum Likelihood Estimation (REMLE). The results show that the estimation of the fixed effects yields unbiased estimators, whereas the estimation of the random effects or variance components yields biased estimators. Moreover, assuming that both the likelihood and ln-likelihood functions satisfy certain regularity conditions, it can be proved that the estimators, as a solution set of the likelihood equations, satisfy strong consistency, asymptotic normality, and efficiency for large sample sizes.

Index Terms-- Linear Mixed Model, Multivariate Linear Model, Maximum Likelihood, Asymptotic Normality and Efficiency, Consistency.
I. INTRODUCTION
Linear mixed models, or variance components models, have been used effectively and extensively by statisticians for analyzing data when the response is univariate. Reference [12] discussed a latent variable model for mixed ordinal (or discrete) and continuous outcomes that was applied to birth defects data. Reference [16] showed that maximum likelihood estimation of variance components from twin data can be parameterized in the framework of linear mixed models. Specialized variance component estimation software that can handle pedigree data and user-defined covariance structures can be used to analyze multivariate data for simple and complex models with a large number of random effects. Reference [2] showed that Linear Mixed Models (LMM) can handle data in which the observations are not independent, i.e., they can be used to model data with correlated errors. Several technical terms are used for predictor variables in linear mixed models: (i) random effects, i.e., a categorical predictor variable whose values are selected not exhaustively but as a random sample of all possible values (for example, the variable product has values representing only 5 of a possible 42 brands); (ii) hierarchical effects, i.e., predictor variables measured at more than one level; and (iii) fixed effects, i.e., predictor variables for which all possible category values (levels) are measured.
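For orientation, the contrast between fixed and random effects can be made concrete with a small simulated fit; the sketch below uses Python's statsmodels with a random intercept for brand (the data frame, column names, and sizes are made-up assumptions for illustration, not data from this paper).

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
df = pd.DataFrame({
    "brand": np.repeat(np.arange(5), 40),   # 5 brands sampled from many possible
    "x": rng.normal(size=200),
})
df["y"] = (2.0 + 1.5 * df["x"]                      # fixed effects
           + rng.normal(0, 1.0, 5)[df["brand"]]     # random brand intercepts
           + rng.normal(0, 0.5, 200))

# fixed effect: x; random intercept: brand (fit by REML by default)
fit = smf.mixedlm("y ~ x", df, groups=df["brand"]).fit()
print(fit.summary())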
In contrast to the univariate case, several papers have discussed linear models for multivariate cases. Reference [5] applied a multivariate linear mixed model to Scholastic Aptitude Test data and proposed Restricted Maximum Likelihood (REML) estimation of the parameters. Reference [6] used a multivariate linear mixed model, or multivariate variance components model with equal replication, to predict the sum of the regression mean and the random effects of the model.
Consider first the linear mixed model

y = Xβ + Zu + e,   (1)

where u is the vector of random effects and e is the vector of errors, or, equivalently,

y ~ N(Xβ, Σ).   (2)

The likelihood function of model (2) is

L(β, θ | y) = (2π)^{-n/2} |Σ|^{-1/2} exp{-(y - Xβ)^T Σ^{-1} (y - Xβ)/2},   (3)

with ln-likelihood

ℓ(β, θ) = ln L(β, θ | y),   (4)

where the variance components θ_q, q = 1, 2, ..., n(n+1)/2, are the elements of θ = vech(Σ). Equating the derivatives of (4) to zero yields the likelihood equations

X^T Σ^{-1} X β = X^T Σ^{-1} y,   (5)

y^T P (∂Σ/∂θ_q) P y = tr(Σ^{-1} ∂Σ/∂θ_q);  q = 1, 2, ..., n(n+1)/2,   (6)

where P = Σ^{-1} - Σ^{-1} X (X^T Σ^{-1} X)^{-1} X^T Σ^{-1}.

Based on (5) and a result of Theorem 1 in [15], the ML iterative algorithm can be used to compute the MLE of the unknown parameters in (3). Through the iterative process, the estimators of the variance components are found as the elements of vech(Σ̂), so that the estimator of the covariance matrix can be written as

Σ̂ = vech^{-1}(θ̂).   (8)
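To make the iterative scheme concrete, the following minimal Python sketch alternates the closed-form solve (5) with a Fisher scoring step on (6), using the information terms given in (7) below. The two-component structure Σ(θ) = θ_1 I_n + θ_2 ZZ^T, the group layout, and all dimensions are illustrative assumptions, not the paper's setting.

import numpy as np

rng = np.random.default_rng(0)
n = 120
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Z = np.kron(np.eye(10), np.ones((12, 1)))     # 10 groups of 12 replicates
G = [np.eye(n), Z @ Z.T]                      # G_q = dSigma/dtheta_q
y = X @ np.array([1.0, 2.0, -1.0]) + Z @ rng.normal(0, 1.0, 10) \
    + rng.normal(0, 0.7, n)

theta = np.array([1.0, 1.0])                  # starting values
for _ in range(100):
    Sigma = theta[0] * G[0] + theta[1] * G[1]
    Si = np.linalg.inv(Sigma)
    XtSiX = X.T @ Si @ X
    beta = np.linalg.solve(XtSiX, X.T @ Si @ y)          # equation (5)
    P = Si - Si @ X @ np.linalg.solve(XtSiX, X.T @ Si)
    # ML score for theta_q, cf. (6), and expected information, cf. (7)
    score = np.array([0.5 * (y @ P @ Gq @ P @ y - np.trace(Si @ Gq)) for Gq in G])
    info = np.array([[0.5 * np.trace(Si @ Gq @ Si @ Gr) for Gr in G] for Gq in G])
    step = np.linalg.solve(info, score)
    theta = np.maximum(theta + step, 1e-8)    # keep components positive
    if np.linalg.norm(step) < 1e-8:
        break
print(beta, theta)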
Equation (5) has a closed form, whereas (6) does not; therefore [4], [17] proposed three algorithms that can be used to compute the estimates of θ_q, namely the EM (expectation-maximization), N-R (Newton-Raphson), and Fisher scoring algorithms. Since the MLE is consistent and asymptotically normal with asymptotic covariance matrix equal to the inverse of the Fisher information matrix, we need the components of the Fisher information matrix. In both univariate and multivariate linear mixed models, the regression coefficients and covariance matrix components are collected in a single vector ψ = (β^T, θ^T)^T. Thus, the Fisher information matrix has the following expressions:
E[-∂²ℓ/∂β ∂β^T] = X^T Σ^{-1} X,

E[-∂²ℓ/∂β ∂θ_q] = 0;  1 ≤ q ≤ n(n+1)/2,   (7)

E[-∂²ℓ/∂θ_q ∂θ_{q'}] = (1/2) tr(Σ^{-1} (∂Σ/∂θ_q) Σ^{-1} (∂Σ/∂θ_{q'}));  1 ≤ q, q' ≤ n(n+1)/2.

For REML estimation, let A be an n × (n-p) matrix of full column rank with A^T X = 0, and let y_1 = A^T y be the corresponding vector of error contrasts. The density of y_1 is

f_R(y_1) = (2π)^{-(n-p)/2} |A^T Σ A|^{-1/2} exp{-(1/2) y_1^T (A^T Σ A)^{-1} y_1},   (9)

so that, up to an additive constant, the restricted ln-likelihood is

ℓ_R(θ) = -(1/2)[ln|A^T Σ A| + y_1^T (A^T Σ A)^{-1} y_1].   (10)

Equating the derivatives of (10) to zero gives the REML likelihood equations

(1/2){y^T P (∂Σ/∂θ_q) P y - tr(P ∂Σ/∂θ_q)} = 0;  q = 1, 2, ..., n(n+1)/2,   (11)

and the REML Fisher information matrix

E[-∂²ℓ_R/∂θ ∂θ^T]   (12)

has elements

(1/2) tr(P (∂Σ/∂θ_q) P (∂Σ/∂θ_{q'}));  1 ≤ q, q' ≤ n(n+1)/2.   (13)
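Under the same assumed setup as the previous sketch, the REML score (11) and information (13) differ from their ML counterparts only in replacing Σ^{-1} by P inside the traces; a hypothetical helper, with y, X, and the derivative matrices G as before:

import numpy as np

def reml_score_info(y, X, G, theta):
    # G is the list of derivative matrices dSigma/dtheta_q
    Sigma = sum(t * Gq for t, Gq in zip(theta, G))
    Si = np.linalg.inv(Sigma)
    P = Si - Si @ X @ np.linalg.solve(X.T @ Si @ X, X.T @ Si)
    score = np.array([0.5 * (y @ P @ Gq @ P @ y - np.trace(P @ Gq))
                      for Gq in G])                        # cf. (11)
    info = np.array([[0.5 * np.trace(P @ Gq @ P @ Gr) for Gr in G]
                     for Gq in G])                         # cf. (13)
    return score, info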
Turning to the multivariate case with s responses per unit, the multivariate linear model is

y_i = B^T x_i + e_i;  i = 1, 2, ..., n,   (14)

where

y_i = [Y_i1, Y_i2, ..., Y_is]^T,  x_i = [1, x_i1, ..., x_ip]^T,  e_i = [ε_i1, ε_i2, ..., ε_is]^T,

and

B = [ β_01  β_02  ...  β_0s
      β_11  β_12  ...  β_1s
      ...
      β_p1  β_p2  ...  β_ps ],   (15)

with E(e_i) = 0, Cov(e_i, e_i) = Σ, and Cov(e_i, e_{i'}) = 0 for i ≠ i'.   (16)

Collecting the n units as Y = XB + E and vectorizing gives

vec(Y^T) = (X ⊗ I_s) vec(B^T) + vec(E^T),   (18)

whose least squares estimator vec(B̂^T) is unbiased with

Var(vec(B̂^T)) = (X^T X)^{-1} ⊗ Σ.   (19)

Adding multivariate random effects D with design matrix Z yields the multivariate linear mixed model

Y = XB + ZD + E,   (21)

or, in vectorized form,

vec(Y^T) = (X ⊗ I_s) vec(B^T) + (Z ⊗ I_s) vec(D^T) + vec(E^T).   (22)

Let the variance matrix be V = (Z ⊗ I_s) Γ (Z ⊗ I_s)^T + (I_n ⊗ Σ), of size ns × ns; then the variance components of model (21) are the elements of τ = vech(V) = [τ_1, ..., τ_{ns(ns+1)/2}]^T.
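A small numerical check of the vectorized representation (18)-(19), with made-up dimensions: vec(Y^T) corresponds to a row-major flatten of Y, and the covariance of vec(B̂^T) can be compared against (X^T X)^{-1} ⊗ Σ.

import numpy as np

rng = np.random.default_rng(1)
n, p, s = 200, 2, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
B = rng.normal(size=(p + 1, s))
Sigma = np.array([[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]])
Y = X @ B + rng.multivariate_normal(np.zeros(s), Sigma, size=n)

XI = np.kron(X, np.eye(s))                    # (X kron I_s) from (18)
vecYt = Y.reshape(-1)                         # row-major flatten == vec(Y^T)
vecBt_hat = np.linalg.lstsq(XI, vecYt, rcond=None)[0]
B_hat = vecBt_hat.reshape(p + 1, s)           # unbiased for B
var_vecBt = np.kron(np.linalg.inv(X.T @ X), Sigma)   # cf. (19)
print(np.abs(B_hat - B).max(), var_vecBt.shape)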
The ln-likelihood of model (21) is maximized with respect to vec(B^T) by the generalized least squares estimator

vec(B̂^T) = [(X ⊗ I_s)^T V^{-1} (X ⊗ I_s)]^{-1} (X ⊗ I_s)^T V^{-1} vec(Y^T),   (24)

and the variance components are estimated by nonlinear optimization, with inequality constraints imposed on τ so that the positive definiteness requirements on the Γ and Σ matrices are satisfied. There is no closed-form solution for τ, so the estimate of τ is obtained by the Fisher scoring algorithm or the Newton-Raphson algorithm. After an iterative computational process we obtain the estimators Γ̂ and Σ̂, i.e.,

V̂ = (Z ⊗ I_s) Γ̂ (Z ⊗ I_s)^T + (I_n ⊗ Σ̂).   (25)
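One common way to realize the constrained optimization described above is to reparameterize. The sketch below (an assumed implementation, not the paper's) takes Γ = I_q ⊗ Γ_0 and parameterizes Γ_0 and Σ through their Cholesky factors, so positive definiteness holds automatically and an unconstrained optimizer can maximize the profile likelihood of model (21); all names and dimensions are illustrative.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n, p, s, q = 30, 1, 2, 6                      # small, illustrative sizes
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
Z = (rng.integers(0, q, n)[:, None] == np.arange(q)).astype(float)
W = np.kron(X, np.eye(s))                     # (X kron I_s) from (22)

def tril_to_spd(v, k):
    # unpack a lower triangle; L @ L.T is positive semidefinite
    L = np.zeros((k, k))
    L[np.tril_indices(k)] = v
    return L @ L.T

def neg_profile_loglik(params, vecYt):
    m = s * (s + 1) // 2
    Gamma0 = tril_to_spd(params[:m], s)                   # random-effect block
    Sigma = tril_to_spd(params[m:], s) + 1e-8 * np.eye(s) # small ridge for safety
    # with Gamma = I_q kron Gamma0, V of model (21) collapses to:
    V = np.kron(Z @ Z.T, Gamma0) + np.kron(np.eye(n), Sigma)
    Vi = np.linalg.inv(V)
    b = np.linalg.solve(W.T @ Vi @ W, W.T @ Vi @ vecYt)   # profile out B, cf. (24)
    r = vecYt - W @ b
    return 0.5 * (np.linalg.slogdet(V)[1] + r @ Vi @ r)

# simulate data from model (21) and maximize the profile likelihood
B = np.array([[1.0, -0.5], [2.0, 0.5]])
D = rng.multivariate_normal(np.zeros(s), np.eye(s), size=q)
E = rng.multivariate_normal(np.zeros(s), 0.25 * np.eye(s), size=n)
Y = X @ B + Z @ D + E
res = minimize(neg_profile_loglik, x0=np.ones(2 * (s * (s + 1) // 2)),
               args=(Y.reshape(-1),), method="L-BFGS-B")
print(res.x)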
Before stating the asymptotic results, let ∇ = (∂/∂θ_1, ..., ∂/∂θ_k)^T denote the gradient operator, ∇∇^T the corresponding Hessian, and ‖·‖ the norm of a matrix. We assume the following regularity conditions.

(R1). For each θ ∈ Θ, the derivatives

∂ℓ(y, θ)/∂θ_i,  ∂²ℓ(y, θ)/∂θ_i ∂θ_{i'},  ∂³ℓ(y, θ)/∂θ_i ∂θ_{i'} ∂θ_{i''}

exist for all y and all 1 ≤ i, i', i'' ≤ k.

(R2). For each θ_0 ∈ Θ, there exist functions g(y), h(y), and H(y) (possibly depending on θ_0) such that for θ in a neighborhood N(θ_0) the relations

|∂L(y, θ)/∂θ_i| ≤ g(y),  |∂²L(y, θ)/∂θ_i ∂θ_{i'}| ≤ h(y),  |∂³ℓ(y, θ)/∂θ_i ∂θ_{i'} ∂θ_{i''}| ≤ H(y)

hold for all y, and ∫ g(y) dy < ∞, ∫ h(y) dy < ∞, E_θ[H(y)] < ∞ for θ ∈ N(θ_0).

(R3). For each θ ∈ Θ,

0 < E_θ[∇ℓ(y, θ) ∇^T ℓ(y, θ)] < ∞;

i.e., the Fisher information matrix

I(θ) = Var(∇ℓ(y, θ)) = -E[∇∇^T ℓ(y, θ)] = E[∇ℓ(y, θ) ∇^T ℓ(y, θ)]   (26)

is finite and positive definite.

A sequence of estimators θ̂_m converges to θ with probability 1 (wp1) if P(lim_{m→∞} θ̂_m = θ) = 1. This is written θ̂_m →^{wp1} θ; m → ∞. An equivalent condition for convergence wp1 is

lim_{m→∞} P(‖θ̂_l - θ‖ < ε, ∀ l ≥ m) = 1, ∀ ε > 0 [14].   (27)

Lemma:
Let L(y, θ) and ℓ(y, θ) be the likelihood function and ln-likelihood function, respectively. Assume that the regularity conditions (R1) and (R2) hold. Then the Fisher information satisfies I(θ) = Var(∇ℓ(y, θ)) = -E[∇∇^T ℓ(y, θ)] = E[∇ℓ(y, θ) ∇^T ℓ(y, θ)] [3], [14].

Proof:
L(y, θ) is a likelihood function; thus it can be considered as a joint probability density function, so

∫ L(y, θ) dy = 1.   (28)

Differentiating both sides of (28) with respect to θ, and interchanging differentiation and integration (which (R1) and (R2) permit), gives ∫ ∇L(y, θ) dy = 0, or

∫ [∇L(y, θ)/L(y, θ)] L(y, θ) dy = ∫ ∇(ln L(y, θ)) L(y, θ) dy = 0.   (29)

Writing (29) as an expectation, we have established

E[∇ℓ(y, θ)] = 0.   (30)

Differentiating (29) once more under the integral sign yields

∫ {∇∇^T(ln L(y, θ)) L(y, θ) + ∇(ln L(y, θ)) ∇^T(ln L(y, θ)) L(y, θ)} dy = 0,

which is equivalent to -E[∇∇^T ℓ(y, θ)] = E[∇ℓ(y, θ) ∇^T ℓ(y, θ)] = Var(∇ℓ(y, θ)) = I(θ), where the last equality uses (30).
Theorem:
Let the vector observations y_1, y_2, ..., y_m be iid with distribution F_θ, for θ ∈ Θ. Assume that regularity conditions (R1), (R2), and (R3) hold on the family F_θ. Then the likelihood equations admit a sequence of solutions {θ̂_m} satisfying:
a). strong consistency:

θ̂_m →^{wp1} θ;  m → ∞;   (31)

b). asymptotic normality and efficiency:

θ̂_m is AN(θ, (1/m) I^{-1}(θ));  m → ∞,   (32)

where I(θ) = E[∇ℓ(y, θ) ∇^T ℓ(y, θ)] and AN(·, ·) denotes a multivariate asymptotic normal distribution.

Proof:
Expanding the function ∇ℓ(y, θ) into a Taylor series about θ_0 and evaluating it at θ̂_m gives

∇ℓ(y, θ̂_m) - ∇ℓ(y, θ_0) = {∇∇^T ℓ(y, θ_0) + (1/2)[(θ̂_m - θ_0)^T ⊗ (1 1^T)] H(y, θ*_m)} (θ̂_m - θ_0),   (33)

where θ*_m lies between θ_0 and θ̂_m, i.e. ‖θ*_m - θ_0‖ ≤ ‖θ̂_m - θ_0‖, and H(y, θ*_m) = (∇ ⊗ ∇∇^T) ℓ(y, θ*_m) is the array of third-order derivatives. Therefore, putting ∇ℓ(y, θ̂_m) = 0 (since θ̂_m solves the likelihood equations) and dividing by √m, the Taylor series (33) can be written as

{(1/m) ∇∇^T ℓ(y, θ_0) + (1/(2m))[(θ̂_m - θ_0)^T ⊗ (1 1^T)] H(y, θ*_m)} √m (θ̂_m - θ_0) = -(1/√m) ∇ℓ(y, θ_0).   (34)

This is equivalent to

{B_m - [(θ̂_m - θ_0)^T ⊗ (1 1^T)] C_m} √m (θ̂_m - θ_0) = a_m,   (35)

where

a_m = (1/√m) Σ_{i=1}^m ∇ℓ(y_i, θ_0),  B_m = -(1/m) Σ_{i=1}^m ∇∇^T ℓ(y_i, θ_0),  C_m = (1/(2m)) Σ_{i=1}^m H(y_i, θ*_m).

But, by (30) and (26),

E_{θ_0}[a_m] = (1/√m) Σ_{i=1}^m E_{θ_0}[∇ℓ(y_i, θ_0)] = 0,  Var[a_m] = (1/√m)² · m I(θ_0) = I(θ_0),

and E_{θ_0}[B_m] = I(θ_0), so that by the Law of Large Numbers (LLN)

B_m →^{wp1} I(θ_0).   (36)

For C_m we have E_{θ*}[C_m] = (1/(2m)) · m · E_{θ*}[H(y, θ*_m)] = (1/2) E_{θ*}[H(y, θ*_m)], and H(y, θ*_m) is bounded in the neighborhood N(θ_0), since

(1/m) H(y, θ*_m) = (1/m) Σ_{i=1}^m H(y_i, θ*_m) ≤ (1/m) Σ_{i=1}^m H(y_i, θ_0),   (37)

where H(y_i, θ_0) denotes the dominating function of condition (R2) on N(θ_0). By condition (R2), E‖H(y)‖ < ∞, and by applying the LLN it can be shown that

(1/m) Σ_{i=1}^m H(y_i, θ_0) →^{wp1} E_{θ_0}[H(y, θ_0)].   (38)

Hence there exist M_1 and M_2 so that

m ≥ M_1 ⟹ P[‖θ̂_m - θ_0‖ < c_0] ≥ 1/2;  m ≥ M_2 ⟹ P[‖(1/m) Σ_{i=1}^m H(y_i, θ_0) - E_{θ_0}[H(y, θ_0)]‖ < 1] ≥ 1/2.   (39)

From Eq. (35), together with (36)-(39), it can be concluded that (θ̂_m - θ_0) →^{wp1} 0 for all θ_0 ∈ Θ. The sequence of solutions {θ̂_m} satisfying strong consistency, θ̂_m →^{wp1} θ; m → ∞, is proven.

Furthermore, by using the Central Limit Theorem (CLT), we have

a_m ~ AN(0, I(θ_0)).   (40)

By (40), a_m →^{d} N(0, I(θ_0)); by (36), B_m →^{p} I(θ_0); and by strong consistency the term [(θ̂_m - θ_0)^T ⊗ (1 1^T)] C_m vanishes in probability. Applying Slutsky's theorem to (35) therefore gives

√m (θ̂_m - θ_0) →^{d} AN(0, I^{-1}(θ_0)).

Applying the above Theorem to the multivariate linear mixed model (21), whose parameter vector θ_n = [θ_{n1}, ..., θ_{n(ps+s+ns(ns+1)/2)}]^T collects vec(B^T) (of length ps + s) and τ (of length ns(ns+1)/2), the MLE θ̂_n converges with probability 1 to θ, i.e., strong consistency θ̂_n →^{wp1} θ; n → ∞, and satisfies asymptotic normality and efficiency, i.e.,

√n (θ̂_n - θ) →^{d} AN(0, I^{-1}(θ)), or θ̂_n →^{d} AN(θ, n^{-1} I^{-1}(θ)).
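The Theorem's conclusion can likewise be illustrated by simulation; for the MLE of a normal variance with known mean (an assumed toy case, not the paper's model), I(σ²) = 1/(2σ⁴), so √m(σ̂²_m - σ²) should be approximately N(0, 2σ⁴).

import numpy as np

rng = np.random.default_rng(3)
sig2, m, reps = 2.0, 500, 2_000
y = rng.normal(0.0, np.sqrt(sig2), size=(reps, m))
sig2_hat = (y ** 2).mean(axis=1)              # MLE of sigma^2 (mean known to be 0)
z = np.sqrt(m) * (sig2_hat - sig2)
print(z.mean(), z.var())                      # ~ 0 and ~ 2*sig2**2 = I^{-1}(sig2)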
V. CONCLUSION
This paper has discussed the estimation of the multivariate linear mixed model, or multivariate variance components model, with an equal number of replications. The results show that the estimation of the fixed effects yields unbiased estimators, whereas the estimation of the random effects or variance components yields biased estimators. Moreover, assuming that both the likelihood and ln-likelihood functions satisfy certain regularity conditions, it can be proved that the estimators, as a solution set of the likelihood equations, satisfy strong consistency, asymptotic normality, and efficiency for large sample sizes.
Based on the discussion in the previous sections, the following theoretical conclusions can be drawn:

1). The estimators obtained by the MLE method in the multivariate linear mixed model Y = XB + ZD + E, with variance V = (Z ⊗ I_s) Γ (Z ⊗ I_s)^T + (I_n ⊗ Σ), are

vec(B̂^T) = [(X ⊗ I_s)^T V̂^{-1} (X ⊗ I_s)]^{-1} (X ⊗ I_s)^T V̂^{-1} vec(Y^T),

V̂ = (Z ⊗ I_s) Γ̂ (Z ⊗ I_s)^T + (I_n ⊗ Σ̂),

and

Var(vec(B̂^T)) = [(X ⊗ I_s)^T V̂^{-1} (X ⊗ I_s)]^{-1}.

2). The sequence of solutions θ̂_n of the likelihood equations satisfies strong consistency, θ̂_n →^{wp1} θ as n → ∞, together with asymptotic normality and efficiency, i.e., θ̂_n ~ AN(θ, n^{-1} I^{-1}(θ)).

REFERENCES