If
$$X = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & & \vdots \\ X_{m1} & \cdots & X_{mn} \end{pmatrix}$$
then
$$E(X) = \begin{pmatrix} E(X_{11}) & \cdots & E(X_{1n}) \\ \vdots & & \vdots \\ E(X_{m1}) & \cdots & E(X_{mn}) \end{pmatrix}$$
Properties:

$E(X') = E(X)'$

If $A$ and $B$ are constant matrices, $E(AXB) = A\,E(X)\,B$

$E(X_1 + X_2) = E(X_1) + E(X_2)$

If $X_1$ and $X_2$ are independent, $E(X_1 X_2) = E(X_1)\,E(X_2)$
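As a numerical illustration of these properties, here is a minimal numpy sketch (not from the notes; the matrices, the distribution of $X$, and the replication count are arbitrary choices) that checks $E(AXB) = A\,E(X)\,B$ by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 2
A = rng.normal(size=(2, m))   # arbitrary constant matrices
B = rng.normal(size=(n, 2))
M = rng.normal(size=(m, n))   # E(X) for the simulated X below

# Simulate many m x n random matrices X = M + noise, so E(X) = M.
reps = 200_000
Xs = M + rng.normal(size=(reps, m, n))

# Empirical E(AXB) versus A E(X) B.
lhs = (A @ Xs @ B).mean(axis=0)
rhs = A @ M @ B
print(np.max(np.abs(lhs - rhs)))  # small, shrinking as reps grows
```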
The variance-covariance matrix (or covariance matrix) of an $m \times 1$ random vector $x$ is the $m \times m$ matrix $V(x)$ (or $\mathrm{Var}(x)$ or $\mathrm{Cov}(x)$) defined by
$$V(x) = E\big[(x - E(x))(x - E(x))'\big]$$
when the expectations all exist.
Note: If
$$x = \begin{pmatrix} X_1 \\ \vdots \\ X_m \end{pmatrix}$$
then
$$V(x) = \begin{pmatrix} V(X_1) & \mathrm{Cov}(X_1, X_2) & \cdots & \mathrm{Cov}(X_1, X_m) \\ \mathrm{Cov}(X_2, X_1) & V(X_2) & & \vdots \\ \vdots & & \ddots & \\ \mathrm{Cov}(X_m, X_1) & \cdots & & V(X_m) \end{pmatrix}$$
and, in particular, $V(x)$ is symmetric.
If the elements of $x$ are independent, $V(x)$ is diagonal. If the elements of $x$ are independent and identically distributed with variance $\sigma^2$, then $V(x) = \sigma^2 I_m$.
Using properties of the expected value,
$$\begin{aligned} V(x) &= E\big[(x - E(x))(x - E(x))'\big] \\ &= E\big(xx' - E(x)\,x' - x\,E(x)' + E(x)\,E(x)'\big) \\ &= E(xx') - E(x)\,E(x)' - E(x)\,E(x)' + E(x)\,E(x)' \\ &= E(xx') - E(x)\,E(x)' \end{aligned}$$
and thus
$$E(xx') = V(x) + E(x)\,E(x)'$$
Some properties:

If $a$ is a constant scalar, $V(ax) = a^2\,V(x)$.

If $A$ is a constant matrix and $b$ a constant vector, $V(Ax + b) = A\,V(x)\,A'$.

If $a$ is a constant vector, $a'V(x)a = V(a'x) \ge 0$; that is, $V(x)$ is nonnegative definite.
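A short simulation along the same lines (again an added illustration, with arbitrary choices of $A$, $b$, and the covariance of $x$) checks $V(Ax + b) = A\,V(x)\,A'$ using sample covariances:

```python
import numpy as np

rng = np.random.default_rng(1)
m = 3
A = rng.normal(size=(2, m))          # constant matrix (arbitrary)
b = rng.normal(size=2)               # constant vector (arbitrary)

# Draw x with a known covariance: x = L z has V(x) = L L'.
L = np.tril(rng.normal(size=(m, m))) + 2 * np.eye(m)
V_x = L @ L.T
xs = rng.normal(size=(100_000, m)) @ L.T

ys = xs @ A.T + b                    # rows are realizations of Ax + b
print(np.cov(ys, rowvar=False))      # approx. A V(x) A'
print(A @ V_x @ A.T)
```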
More generally, the covariance between the $m \times 1$ random vector $x_1$ and the $n \times 1$ random vector $x_2$ is defined to be the $m \times n$ matrix
$$\mathrm{Cov}(x_1, x_2) = E\big[(x_1 - E(x_1))(x_2 - E(x_2))'\big]$$
when the expectations all exist.
Note: If
$$x_1 = \begin{pmatrix} X_{11} \\ \vdots \\ X_{1m} \end{pmatrix} \quad\text{and}\quad x_2 = \begin{pmatrix} X_{21} \\ \vdots \\ X_{2n} \end{pmatrix}$$
then
$$\mathrm{Cov}(x_1, x_2) = \begin{pmatrix} \mathrm{Cov}(X_{11}, X_{21}) & \cdots & \mathrm{Cov}(X_{11}, X_{2n}) \\ \vdots & & \vdots \\ \mathrm{Cov}(X_{1m}, X_{21}) & \cdots & \mathrm{Cov}(X_{1m}, X_{2n}) \end{pmatrix}$$
An identity (proven similarly to the one for variance):
$$\mathrm{Cov}(x_1, x_2) = E(x_1 x_2') - E(x_1)\,E(x_2)'$$
In particular, if $x_1$ and $x_2$ are independent then
$$\mathrm{Cov}(x_1, x_2) = E(x_1)\,E(x_2)' - E(x_1)\,E(x_2)' = 0$$
This also yields
$$E(x_1 x_2') = \mathrm{Cov}(x_1, x_2) + E(x_1)\,E(x_2)'$$
Note also that $\mathrm{Cov}(x, x) = V(x)$.
If $a$ and $b$ are constant scalars,
$$\mathrm{Cov}(a x_1, b x_2) = ab\,\mathrm{Cov}(x_1, x_2)$$
If $A$ and $B$ are constant matrices and $c$ and $d$ are constant vectors,
$$\mathrm{Cov}(Ax_1 + c, Bx_2 + d) = A\,\mathrm{Cov}(x_1, x_2)\,B'$$
Also,
$$\mathrm{Cov}(x_1 + x_2, x_3) = \mathrm{Cov}(x_1, x_3) + \mathrm{Cov}(x_2, x_3)$$
and
$$\mathrm{Cov}(x_1, x_2 + x_3) = \mathrm{Cov}(x_1, x_2) + \mathrm{Cov}(x_1, x_3)$$
If $x_1$ and $x_2$ are two $m \times 1$ random vectors whose variances exist, then
$$\begin{aligned} V(x_1 + x_2) &= \mathrm{Cov}(x_1 + x_2,\, x_1 + x_2) \\ &= \mathrm{Cov}(x_1, x_1) + \mathrm{Cov}(x_1, x_2) + \mathrm{Cov}(x_2, x_1) + \mathrm{Cov}(x_2, x_2) \\ &= V(x_1) + V(x_2) + \mathrm{Cov}(x_1, x_2) + \mathrm{Cov}(x_2, x_1) \end{aligned}$$
In particular, if $x_1$ and $x_2$ are independent,
$$V(x_1 + x_2) = V(x_1) + V(x_2)$$
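The decomposition can also be seen numerically. In the sketch below (an added illustration with an arbitrary construction of correlated $x_1$ and $x_2$), the identity holds exactly for sample covariances too, since the sample covariance is bilinear:

```python
import numpy as np

rng = np.random.default_rng(2)
N, m = 200_000, 3
z = rng.normal(size=(N, 2 * m))            # 2m independent sources
x1 = z[:, :m] + 0.5 * z[:, m:]             # x1 and x2 share the same
x2 = z[:, m:] - 0.25 * z[:, :m]            # sources, hence correlated

C = np.cov(np.hstack([x1, x2]), rowvar=False)
V1, V2 = C[:m, :m], C[m:, m:]
C12, C21 = C[:m, m:], C[m:, :m]

lhs = np.cov(x1 + x2, rowvar=False)
rhs = V1 + V2 + C12 + C21
print(np.max(np.abs(lhs - rhs)))   # ~0 up to floating point: the identity
                                   # is exact for sample covariances as well
```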
Suppose $x' = \begin{pmatrix} X_1 & \cdots & X_m \end{pmatrix}$ and let
$$D = \mathrm{diag}(V(X_1), \ldots, V(X_m))$$
Then the correlation matrix of $x$ is
$$\mathrm{Corr}(x) = D^{-1/2}\,V(x)\,D^{-1/2}$$
assuming that all of the variances in $D$ exist and are nonzero.
That is,
$$\mathrm{Corr}(x) = \begin{pmatrix} 1 & \mathrm{Corr}(X_1, X_2) & \cdots & \mathrm{Corr}(X_1, X_m) \\ \mathrm{Corr}(X_2, X_1) & 1 & & \vdots \\ \vdots & & \ddots & \\ \mathrm{Corr}(X_m, X_1) & \cdots & & 1 \end{pmatrix}$$
Note that $\mathrm{Corr}(x)$ is symmetric and nonnegative definite. It is positive definite if $V(x)$ is positive definite.
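For a concrete check, the following sketch (an added illustration with a hand-picked positive definite $V$) computes $D^{-1/2}\,V\,D^{-1/2}$ two equivalent ways:

```python
import numpy as np

# A hand-picked 3x3 covariance matrix (symmetric, positive definite).
V = np.array([[ 4.0, 1.2, -0.8],
              [ 1.2, 1.0,  0.3],
              [-0.8, 0.3,  2.0]])

d = np.sqrt(np.diag(V))                 # standard deviations
Corr = V / np.outer(d, d)               # same as D^{-1/2} V D^{-1/2}
print(Corr)                             # unit diagonal, symmetric

# Equivalent form using the diagonal matrix explicitly:
D_inv_sqrt = np.diag(1.0 / d)
print(np.allclose(Corr, D_inv_sqrt @ V @ D_inv_sqrt))  # True
```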
Suppose
$$x_1 = \begin{pmatrix} X_{11} \\ \vdots \\ X_{1m} \end{pmatrix} \quad\text{and}\quad x_2 = \begin{pmatrix} X_{21} \\ \vdots \\ X_{2n} \end{pmatrix}$$
and let
$$D_1 = \mathrm{diag}(V(X_{11}), \ldots, V(X_{1m}))$$
$$D_2 = \mathrm{diag}(V(X_{21}), \ldots, V(X_{2n}))$$
Then we may also define
$$\mathrm{Corr}(x_1, x_2) = D_1^{-1/2}\,\mathrm{Cov}(x_1, x_2)\,D_2^{-1/2} = \begin{pmatrix} \mathrm{Corr}(X_{11}, X_{21}) & \cdots & \mathrm{Corr}(X_{11}, X_{2n}) \\ \vdots & & \vdots \\ \mathrm{Corr}(X_{1m}, X_{21}) & \cdots & \mathrm{Corr}(X_{1m}, X_{2n}) \end{pmatrix}$$
If
$$z = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad\text{then}\quad E(z) = \begin{pmatrix} E(x_1) \\ E(x_2) \end{pmatrix}$$
and
$$V(z) = \begin{pmatrix} V(x_1) & \mathrm{Cov}(x_1, x_2) \\ \mathrm{Cov}(x_2, x_1) & V(x_2) \end{pmatrix}$$
The variance-covariance structure of an $m \times n$ random matrix
$$X = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & & \vdots \\ X_{m1} & \cdots & X_{mn} \end{pmatrix}$$
can be characterized by vectoring:
$$V\big(\mathrm{vec}(X)\big)$$
Notice that, if $A$ and $B$ are constant matrices, the variance-covariance structure of $AXB'$ is characterized by
$$V\big(\mathrm{vec}(AXB')\big) = V\big((B \otimes A)\,\mathrm{vec}(X)\big) = (B \otimes A)\,V\big(\mathrm{vec}(X)\big)\,(B \otimes A)'$$
since $\mathrm{vec}(AXB') = (B \otimes A)\,\mathrm{vec}(X)$ for the column-stacking $\mathrm{vec}$.
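The vec–Kronecker identity can be verified deterministically. A minimal sketch (an added illustration), assuming the column-stacking convention for vec (Fortran-order flattening in numpy):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(2, 3))
X = rng.normal(size=(3, 4))
B = rng.normal(size=(5, 4))

vec = lambda M: M.flatten(order="F")   # column-stacking vec

# vec(A X B') = (B kron A) vec(X)
lhs = vec(A @ X @ B.T)
rhs = np.kron(B, A) @ vec(X)
print(np.allclose(lhs, rhs))           # True
```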
The moment generating function of an $m \times 1$ random vector $x$ is
$$M_x(t) = E\big(e^{t'x}\big)$$
assuming that the expectation exists for all $m \times 1$ vectors $t$ in a neighborhood of $0$. Note that $M_x(t)$ is a scalar-valued function.
Important Fact: If two random vectors have the
same moment generating function (in a
neighborhood of 0) then they have the same
distribution.
Some properties:

$\nabla M_x(0) = E(x)$ and $\nabla^2 M_x(0) = E(xx')$ (the gradient and Hessian of $M_x$ at $0$).

If $c$ is a constant scalar, $M_{cx}(t) = M_x(ct)$.

If $A$ is a constant matrix and $b$ a constant vector, $M_{Ax+b}(t) = e^{b't}\,M_x(A't)$.

If $x_1$ and $x_2$ are independent $m \times 1$ random vectors, then $M_{x_1 + x_2}(t) = M_{x_1}(t)\,M_{x_2}(t)$.
Suppose $x_1$ is an $m \times 1$ random vector and $x_2$ is an $n \times 1$ random vector, and let
$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$
Then $x_1$ and $x_2$ are independent if and only if
$$M_x(t) = M_{x_1}(t_1)\,M_{x_2}(t_2) \quad\text{where}\quad t = \begin{pmatrix} t_1 \\ t_2 \end{pmatrix}$$
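For the multivariate normal, the MGF has the closed form $M_x(t) = \exp(t'\mu + \tfrac{1}{2} t' \Sigma t)$, which allows an exact spot-check of the property $M_{Ax+b}(t) = e^{b't} M_x(A't)$. The sketch below (an added illustration) uses the fact that $Ax + b \sim N(A\mu + b,\, A\Sigma A')$:

```python
import numpy as np

def mvn_mgf(t, mu, Sigma):
    """MGF of N(mu, Sigma) at t: exp(t'mu + t'Sigma t / 2)."""
    return np.exp(t @ mu + 0.5 * t @ Sigma @ t)

rng = np.random.default_rng(4)
m = 3
mu = rng.normal(size=m)
L = np.tril(rng.normal(size=(m, m))) + 2 * np.eye(m)
Sigma = L @ L.T                       # a valid covariance matrix

A = rng.normal(size=(2, m))
b = rng.normal(size=2)
t = rng.normal(size=2) * 0.1          # a point near 0

# Ax + b ~ N(A mu + b, A Sigma A'), so its MGF is known exactly.
lhs = mvn_mgf(t, A @ mu + b, A @ Sigma @ A.T)
rhs = np.exp(b @ t) * mvn_mgf(A.T @ t, mu, Sigma)
print(np.isclose(lhs, rhs))           # True
```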
Conditional Expectation

For random matrices $X_1$ and $X_2$, the conditional expectation $E(X_1 \mid X_2 = A)$ of $X_1$ given $X_2 = A$ is the expectation of $X_1$ defined using the conditional distribution of its elements given $X_2 = A$ (where $A$ is a constant matrix).

Similarly, the conditional expectation $E(X_1 \mid X_2)$ of $X_1$ given $X_2$ is the expectation of $X_1$ defined using the conditional distribution of its elements given $X_2$.

The double expectation formula applies:
$$E\big(E(X_1 \mid X_2)\big) = E(X_1)$$
If $x_1$ is a random vector, the conditional variance-covariance matrix $V(x_1 \mid X_2 = A)$ or $V(x_1 \mid X_2)$ is defined by substituting the appropriate conditional expectations into the definition of the variance-covariance matrix.

The conditional variance formula applies:
$$V(x_1) = E\big(V(x_1 \mid X_2)\big) + V\big(E(x_1 \mid X_2)\big)$$
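A Monte Carlo sanity check of the conditional variance formula (an added illustration using a simple construction: $x_1 = a X_2 + \varepsilon$ with $X_2 \sim N(0,1)$ and $\varepsilon \sim N(0, I_m)$, so $E(x_1 \mid X_2) = a X_2$ and $V(x_1 \mid X_2) = I_m$):

```python
import numpy as np

rng = np.random.default_rng(5)
N, m = 500_000, 2
a = np.array([1.0, -2.0])            # x1 = a * X2 + eps

X2 = rng.normal(size=N)              # X2 ~ N(0, 1)
eps = rng.normal(size=(N, m))        # so V(x1 | X2) = I_m
x1 = np.outer(X2, a) + eps

# V(x1) = E(V(x1|X2)) + V(E(x1|X2)) = I_m + a a' V(X2), with V(X2) = 1.
lhs = np.cov(x1, rowvar=False)
rhs = np.eye(m) + np.outer(a, a)
print(lhs)
print(rhs)                           # close, up to Monte Carlo error
```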
Also, if $x_1$ and $x_2$ are random vectors, the conditional covariance $\mathrm{Cov}(x_1, x_2 \mid X_3 = A)$ or $\mathrm{Cov}(x_1, x_2 \mid X_3)$ can be defined by substituting the appropriate conditional expectations into the definition of the covariance.

There is a conditional covariance formula:
$$\mathrm{Cov}(x_1, x_2) = E\big(\mathrm{Cov}(x_1, x_2 \mid X_3)\big) + \mathrm{Cov}\big(E(x_1 \mid X_3),\, E(x_2 \mid X_3)\big)$$
Samples of Random Vectors

A collection of $p \times 1$ random vectors $x_1, \ldots, x_n$ is a sample (of size $n$) if the vectors are independent and identically distributed.

A sample is often represented as the rows of an $n \times p$ full data matrix
$$X = \begin{pmatrix} x_1' \\ \vdots \\ x_n' \end{pmatrix}$$
The rows correspond to the observations and the columns correspond to the variables.
The $p \times 1$ sample mean vector is
$$\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i = \frac{1}{n}\, X' 1_n$$
where $1_n$ is the $n \times 1$ vector of 1s.
Note that element $j$ of $\bar{x}$ is the (univariate) sample mean of variable $j$: if $x_i = \begin{pmatrix} X_{i1} & \cdots & X_{ip} \end{pmatrix}'$ then $\bar{x} = \begin{pmatrix} \bar{X}_1 & \cdots & \bar{X}_p \end{pmatrix}'$.
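In code, $\bar{x} = \frac{1}{n} X' 1_n$ is simply the vector of column means of the data matrix (a two-line added illustration):

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 50, 4
X = rng.normal(size=(n, p))               # n x p full data matrix

xbar = X.T @ np.ones(n) / n               # (1/n) X' 1_n
print(np.allclose(xbar, X.mean(axis=0)))  # True
```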
The $p \times p$ sample variance-covariance matrix is
$$S = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})'$$
The element in row $j$ and column $j$ is the sample variance of variable $j$,
$$s_{jj} = \frac{1}{n-1} \sum_{i=1}^n (X_{ij} - \bar{X}_j)^2$$
and the element in row $j$ and column $k$ is the sample covariance of variables $j$ and $k$,
$$s_{jk} = \frac{1}{n-1} \sum_{i=1}^n (X_{ij} - \bar{X}_j)(X_{ik} - \bar{X}_k)$$
Note that $S$ is symmetric and nonnegative definite.

An alternative version is normalized by $n$ instead of $n-1$:
$$\hat{\Sigma} = \frac{1}{n} \sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})' = \frac{n-1}{n}\, S$$
This is also symmetric and nonnegative definite.
Now,
$$\begin{aligned} \sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})' &= \sum_{i=1}^n \big( x_i x_i' - x_i \bar{x}' - \bar{x} x_i' + \bar{x}\bar{x}' \big) \\ &= \sum_{i=1}^n x_i x_i' - \Big( \sum_{i=1}^n x_i \Big) \bar{x}' - \bar{x} \sum_{i=1}^n x_i' + n \bar{x}\bar{x}' \\ &= \sum_{i=1}^n x_i x_i' - n \bar{x}\bar{x}' - n \bar{x}\bar{x}' + n \bar{x}\bar{x}' \\ &= \sum_{i=1}^n x_i x_i' - n \bar{x}\bar{x}' \\ &= X'X - n \Big( \frac{1}{n} X' 1_n \Big) \Big( \frac{1}{n} X' 1_n \Big)' \\ &= X'X - X' \Big( \frac{1}{n} 1_n 1_n' \Big) X \\ &= X' \Big( I_n - \frac{1}{n} 1_n 1_n' \Big) X \end{aligned}$$
and thus
$$S = \frac{1}{n-1}\, X' \Big( I_n - \frac{1}{n} 1_n 1_n' \Big) X$$
and
$$\hat{\Sigma} = \frac{1}{n}\, X' \Big( I_n - \frac{1}{n} 1_n 1_n' \Big) X$$
Note that the centering matrix
$$I_n - \frac{1}{n} 1_n 1_n'$$
is the projection onto the subspace of $n \times 1$ vectors that are orthogonal to $1_n$.
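The centering-matrix form can be checked against numpy's built-in covariance (an added sketch; np.cov with rowvar=False uses the same $n-1$ normalization as $S$):

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 50, 3
X = rng.normal(size=(n, p))

# Centering matrix: projection onto the complement of span{1_n}.
C = np.eye(n) - np.ones((n, n)) / n
S = X.T @ C @ X / (n - 1)
Sigma_hat = X.T @ C @ X / n                          # n-normalized version

print(np.allclose(S, np.cov(X, rowvar=False)))       # True
print(np.allclose(C @ C, C), np.allclose(C, C.T))    # idempotent, symmetric
```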
If the diagonal elements $s_{11}, s_{22}, \ldots, s_{pp}$ of $S$ are nonzero and
$$D = \mathrm{diag}(s_{11}, s_{22}, \ldots, s_{pp})$$
then the sample correlation matrix is
$$R = D^{-1/2}\, S\, D^{-1/2}$$
Note that
$$R = \begin{pmatrix} 1 & r_{12} & \cdots & r_{1p} \\ r_{21} & 1 & & \vdots \\ \vdots & & \ddots & \\ r_{p1} & \cdots & & 1 \end{pmatrix}$$
where
$$r_{jk} = \frac{s_{jk}}{\sqrt{s_{jj}\, s_{kk}}}$$
is the sample correlation of variables $j$ and $k$.
Note that $R$ is a symmetric nonnegative definite matrix. It is positive definite if $S$ is positive definite.

The sample correlation matrix could alternatively have been defined using $\hat{\Sigma}$. This would give exactly the same matrix.
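A quick check (an added illustration) that $D^{-1/2} S D^{-1/2}$ matches np.corrcoef, and that the $n$-normalized $\hat{\Sigma}$ yields the same correlation matrix:

```python
import numpy as np

rng = np.random.default_rng(8)
X = rng.normal(size=(60, 4))

S = np.cov(X, rowvar=False)                    # n-1 normalization
Sigma_hat = S * (X.shape[0] - 1) / X.shape[0]  # n normalization

d = np.sqrt(np.diag(S))
R = S / np.outer(d, d)                         # D^{-1/2} S D^{-1/2}

print(np.allclose(R, np.corrcoef(X, rowvar=False)))  # True
d2 = np.sqrt(np.diag(Sigma_hat))
print(np.allclose(R, Sigma_hat / np.outer(d2, d2)))  # True: same matrix
```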
Properties of Summary Statistics

Suppose $x_1, \ldots, x_n$ is a sample of $p \times 1$ random vectors whose common distribution has mean vector $\mu$ and variance-covariance matrix $\Sigma$. Then
$$E(\bar{x}) = E\Big( \frac{1}{n} \sum_{i=1}^n x_i \Big) = \frac{1}{n} \sum_{i=1}^n E(x_i) = \frac{1}{n} \sum_{i=1}^n \mu = \mu$$
and, using independence,
$$V(\bar{x}) = V\Big( \frac{1}{n} \sum_{i=1}^n x_i \Big) = \frac{1}{n^2} \sum_{i=1}^n V(x_i) = \frac{1}{n^2} \sum_{i=1}^n \Sigma = \frac{1}{n}\, \Sigma$$
Also, as shown earlier,
$$S = \frac{1}{n-1} \Big( \sum_{i=1}^n x_i x_i' - n \bar{x}\bar{x}' \Big)$$
Thus,
$$\begin{aligned} E(S) &= \frac{1}{n-1}\, E\Big( \sum_{i=1}^n x_i x_i' - n \bar{x}\bar{x}' \Big) \\ &= \frac{1}{n-1} \Big( \sum_{i=1}^n E(x_i x_i') - n\, E(\bar{x}\bar{x}') \Big) \\ &= \frac{1}{n-1} \Big( \sum_{i=1}^n (\Sigma + \mu\mu') - n\big( V(\bar{x}) + E(\bar{x})\,E(\bar{x})' \big) \Big) \\ &= \frac{1}{n-1} \Big( n\Sigma + n\mu\mu' - n\Big( \frac{1}{n}\Sigma + \mu\mu' \Big) \Big) \\ &= \frac{1}{n-1} \big( (n-1)\Sigma \big) \\ &= \Sigma \end{aligned}$$
and therefore $S$ is an unbiased estimator of $\Sigma$.
On the other hand,
$$\hat{\Sigma} = \frac{n-1}{n}\, S$$
is a biased estimator of $\Sigma$, and its bias is
$$E(\hat{\Sigma}) - \Sigma = \frac{n-1}{n}\, E(S) - \Sigma = \Big( 1 - \frac{1}{n} \Big) \Sigma - \Sigma = -\frac{1}{n}\, \Sigma$$
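A closing simulation (an added illustration with an arbitrary $\Sigma$ and a small $n$, so the bias is visible): averaging $S$ over many samples gives $E(S) \approx \Sigma$, while the $n$-normalized $\hat{\Sigma}$ averages to roughly $\Sigma - \frac{1}{n}\Sigma$.

```python
import numpy as np

rng = np.random.default_rng(9)
n, p, reps = 5, 2, 100_000
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
L = np.linalg.cholesky(Sigma)

S_sum = np.zeros((p, p))
for _ in range(reps):
    X = rng.normal(size=(n, p)) @ L.T       # sample of n iid N(0, Sigma)
    S_sum += np.cov(X, rowvar=False)

E_S = S_sum / reps
print(E_S)                    # approx. Sigma (S is unbiased)
print(E_S * (n - 1) / n)      # approx. Sigma - Sigma/n (bias of Sigma-hat)
```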