
Expectation for multivariate distributions

Definition
Let X_1, X_2, \dots, X_n denote n jointly distributed random variables with joint density function f(x_1, x_2, \dots, x_n). Then

E[g(X_1, \dots, X_n)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, \dots, x_n)\, f(x_1, \dots, x_n)\, dx_1 \cdots dx_n

Example
Let X, Y, Z denote 3 jointly distributed random variables with joint density function

f(x, y, z) = \begin{cases} \dfrac{12}{7}\left(x^2 + yz\right) & 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1 \\ 0 & \text{otherwise} \end{cases}

Determine E[XYZ].

Solution:

E[XYZ] = \int_0^1 \int_0^1 \int_0^1 xyz \cdot \frac{12}{7}\left(x^2 + yz\right) dx\, dy\, dz

= \frac{12}{7}\int_0^1 \int_0^1 \int_0^1 \left(x^3 yz + x y^2 z^2\right) dx\, dy\, dz

= \frac{12}{7}\int_0^1 \int_0^1 \left[\frac{x^4}{4} yz + \frac{x^2}{2} y^2 z^2\right]_{x=0}^{x=1} dy\, dz
= \frac{3}{7}\int_0^1 \int_0^1 \left(yz + 2 y^2 z^2\right) dy\, dz

= \frac{3}{7}\int_0^1 \left[\frac{y^2}{2} z + \frac{2 y^3}{3} z^2\right]_{y=0}^{y=1} dz
= \frac{3}{7}\int_0^1 \left(\frac{z}{2} + \frac{2 z^2}{3}\right) dz

= \frac{3}{7}\left[\frac{z^2}{4} + \frac{2 z^3}{9}\right]_0^1 = \frac{3}{7}\left(\frac{1}{4} + \frac{2}{9}\right) = \frac{3}{7}\cdot\frac{17}{36} = \frac{17}{84}
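As a quick sanity check, the value 17/84 ≈ 0.2024 can be approximated numerically. The following is a minimal sketch (added here, not part of the original slides) that estimates E[XYZ] by rejection sampling from f; all names are illustrative.

import numpy as np

rng = np.random.default_rng(0)

# Rejection sampling from f(x,y,z) = (12/7)(x^2 + yz) on the unit cube.
# f is bounded above by 24/7, so accept uniform points with probability
# f / (24/7) = (x^2 + yz) / 2.
n = 2_000_000
x, y, z, u = rng.uniform(size=(4, n))
keep = u <= (x**2 + y * z) / 2.0

est = np.mean(x[keep] * y[keep] * z[keep])
print(est, 17 / 84)        # both approximately 0.2024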

Some Rules for Expectation

1.

E[X_i] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_i\, f(x_1, \dots, x_n)\, dx_1 \cdots dx_n = \int_{-\infty}^{\infty} x_i\, f_i(x_i)\, dx_i

Thus you can calculate E[X_i] either from the joint distribution of X_1, \dots, X_n or from the marginal distribution of X_i.

Proof:

\int \cdots \int x_i\, f(x_1, \dots, x_n)\, dx_1 \cdots dx_n
= \int_{-\infty}^{\infty} x_i \left[\int \cdots \int f(x_1, \dots, x_n)\, dx_1 \cdots dx_{i-1}\, dx_{i+1} \cdots dx_n\right] dx_i
= \int_{-\infty}^{\infty} x_i\, f_i(x_i)\, dx_i

2. (The Linearity property)

E[a_1 X_1 + \cdots + a_n X_n] = a_1 E[X_1] + \cdots + a_n E[X_n]

Proof:

\int \cdots \int (a_1 x_1 + \cdots + a_n x_n)\, f(x_1, \dots, x_n)\, dx_1 \cdots dx_n
= a_1 \int \cdots \int x_1\, f(x_1, \dots, x_n)\, dx_1 \cdots dx_n + \cdots + a_n \int \cdots \int x_n\, f(x_1, \dots, x_n)\, dx_1 \cdots dx_n
= a_1 E[X_1] + \cdots + a_n E[X_n]

3. (The Multiplicative property) Suppose X_1, \dots, X_q are independent of X_{q+1}, \dots, X_k. Then

E[g(X_1, \dots, X_q)\, h(X_{q+1}, \dots, X_k)] = E[g(X_1, \dots, X_q)]\, E[h(X_{q+1}, \dots, X_k)]

In the simple case when k = 2:

E[XY] = E[X]\,E[Y] \quad \text{if } X \text{ and } Y \text{ are independent}

Proof:

E[g(X_1, \dots, X_q)\, h(X_{q+1}, \dots, X_k)]
= \int \cdots \int g(x_1, \dots, x_q)\, h(x_{q+1}, \dots, x_k)\, f(x_1, \dots, x_k)\, dx_1 \cdots dx_k
= \int \cdots \int g(x_1, \dots, x_q)\, h(x_{q+1}, \dots, x_k)\, f_1(x_1, \dots, x_q)\, f_2(x_{q+1}, \dots, x_k)\, dx_1 \cdots dx_q\, dx_{q+1} \cdots dx_k
= \int \cdots \int h(x_{q+1}, \dots, x_k)\, f_2(x_{q+1}, \dots, x_k) \left[\int \cdots \int g(x_1, \dots, x_q)\, f_1(x_1, \dots, x_q)\, dx_1 \cdots dx_q\right] dx_{q+1} \cdots dx_k
= E[g(X_1, \dots, X_q)] \int \cdots \int h(x_{q+1}, \dots, x_k)\, f_2(x_{q+1}, \dots, x_k)\, dx_{q+1} \cdots dx_k
= E[g(X_1, \dots, X_q)]\, E[h(X_{q+1}, \dots, X_k)]
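A short numerical illustration of the k = 2 case (an added sketch, not in the original slides): for independent samples the mean of XY tracks the product of the means, and stops doing so once dependence is introduced.

import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(2.0, size=500_000)    # E[X] = 2
y = rng.normal(3.0, 1.0, size=500_000)    # E[Y] = 3, independent of X

print(np.mean(x * y), np.mean(x) * np.mean(y))        # both close to 6

# with dependence (Y' = X + noise) the factorization no longer holds
y_dep = x + rng.normal(size=500_000)
print(np.mean(x * y_dep), np.mean(x) * np.mean(y_dep))  # differ by Var(X) = 4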

Some Rules for Variance


Var[X] = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2 = \sigma_X^2

1. Var[X + Y] = Var[X] + Var[Y] + 2\,\mathrm{Cov}[X, Y]

where

\mathrm{Cov}[X, Y] = E[(X - \mu_X)(Y - \mu_Y)]

Proof:

Var[X + Y] = E[(X + Y - \mu_{X+Y})^2], \quad \text{where } \mu_{X+Y} = E[X + Y] = \mu_X + \mu_Y

Thus

Var[X + Y] = E[(X - \mu_X + Y - \mu_Y)^2]
= E[(X - \mu_X)^2 + 2(X - \mu_X)(Y - \mu_Y) + (Y - \mu_Y)^2]
= Var[X] + 2\,\mathrm{Cov}[X, Y] + Var[Y]

Note: If X and Y are independent, then

\mathrm{Cov}[X, Y] = E[(X - \mu_X)(Y - \mu_Y)] = E[X - \mu_X]\, E[Y - \mu_Y] = 0

and

Var[X + Y] = Var[X] + Var[Y]
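A minimal numerical check of this rule (added as an illustration, not from the original slides), using correlated samples so the covariance term is visibly nonzero:

import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=300_000)
y = 0.6 * x + rng.normal(scale=0.8, size=300_000)     # correlated with x

lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y)[0, 1]
print(lhs, rhs)                                        # essentially equal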

Definition: For any two random variables X and Y define the correlation coefficient \rho_{XY} to be:

\rho_{XY} = \frac{\mathrm{Cov}[X, Y]}{\sqrt{Var[X]\, Var[Y]}} = \frac{\mathrm{Cov}[X, Y]}{\sigma_X \sigma_Y}

Thus \mathrm{Cov}[X, Y] = \rho_{XY}\, \sigma_X \sigma_Y, and

Var[X + Y] = \sigma_X^2 + \sigma_Y^2 + 2\rho_{XY}\, \sigma_X \sigma_Y
= \sigma_X^2 + \sigma_Y^2 \quad \text{if } X \text{ and } Y \text{ are independent}

Properties of the correlation coefficient \rho_{XY}

\rho_{XY} = \frac{\mathrm{Cov}[X, Y]}{\sqrt{Var[X]\, Var[Y]}} = \frac{\mathrm{Cov}[X, Y]}{\sigma_X \sigma_Y}

If X and Y are independent then \rho_{XY} = 0.

Reason: \mathrm{Cov}[X, Y] = 0.

The converse is not necessarily true: \rho_{XY} = 0 does not imply that X and Y are independent.

More properties of the correlation coefficient \rho_{XY}

-1 \le \rho_{XY} \le 1

and |\rho_{XY}| = 1 if there exist a and b such that

P[Y = bX + a] = 1

where \rho_{XY} = +1 if b > 0 and \rho_{XY} = -1 if b < 0.
Proof: Let U = X - \mu_X and V = Y - \mu_Y, and let

g(b) = E[(V - bU)^2] \ge 0 \quad \text{for all } b.

Consider choosing b to minimize g(b):

g(b) = E[V^2 - 2bVU + b^2 U^2] = E[V^2] - 2b\,E[VU] + b^2 E[U^2]

g'(b) = -2E[VU] + 2b\,E[U^2] = 0

or

b = b_{\min} = \frac{E[VU]}{E[U^2]}

Since g(b) \ge 0 for all b, then g(b_{\min}) \ge 0:

g(b_{\min}) = E[V^2] - 2 b_{\min} E[VU] + b_{\min}^2 E[U^2]
= E[V^2] - 2\frac{(E[VU])^2}{E[U^2]} + \frac{(E[VU])^2}{E[U^2]}
= E[V^2] - \frac{(E[VU])^2}{E[U^2]} \ge 0

Hence

(E[VU])^2 \le E[U^2]\, E[V^2]

or

\frac{(E[VU])^2}{E[U^2]\,E[V^2]} = \frac{\left(E[(X-\mu_X)(Y-\mu_Y)]\right)^2}{E[(X-\mu_X)^2]\, E[(Y-\mu_Y)^2]} = \rho_{XY}^2 \le 1

Note: \rho_{XY}^2 = 1 if and only if

g(b_{\min}) = E[(V - b_{\min} U)^2] = 0

This will be true if P[V - b_{\min}U = 0] = 1, i.e.

P[Y - \mu_Y = b_{\min}(X - \mu_X)] = 1

i.e. P[Y = b_{\min} X + a] = 1 \quad \text{where } a = \mu_Y - b_{\min}\mu_X

Summary

-1 \le \rho_{XY} \le 1

and |\rho_{XY}| = 1 if there exist a and b such that P[Y = bX + a] = 1, where

b = b_{\min} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{E[(X - \mu_X)^2]} = \frac{\mathrm{Cov}[X, Y]}{Var[X]} = \frac{\rho_{XY}\,\sigma_X \sigma_Y}{\sigma_X^2} = \rho_{XY}\frac{\sigma_Y}{\sigma_X}

and

a = \mu_Y - b_{\min}\mu_X = \mu_Y - \rho_{XY}\frac{\sigma_Y}{\sigma_X}\mu_X
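As an illustration (a sketch added for this write-up), b_min = Cov(X, Y)/Var(X) is exactly the least-squares slope, and an exactly linear relationship drives the sample correlation to ±1:

import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100_000)

y_exact = -2.0 * x + 5.0                          # exact linear relation, b < 0
print(np.corrcoef(x, y_exact)[0, 1])              # -1.0

y_noisy = -2.0 * x + 5.0 + rng.normal(size=100_000)
b_min = np.cov(x, y_noisy)[0, 1] / np.var(x, ddof=1)
print(b_min)                                       # close to -2.0
print(np.corrcoef(x, y_noisy)[0, 1])               # between -1 and 0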

2. Var[aX + bY] = a^2 Var[X] + b^2 Var[Y] + 2ab\,\mathrm{Cov}[X, Y]

Proof:

Var[aX + bY] = E[(aX + bY - \mu_{aX+bY})^2], \quad \text{with } \mu_{aX+bY} = E[aX + bY] = a\mu_X + b\mu_Y

Thus

Var[aX + bY] = E[(a(X - \mu_X) + b(Y - \mu_Y))^2]
= E[a^2(X - \mu_X)^2 + 2ab(X - \mu_X)(Y - \mu_Y) + b^2(Y - \mu_Y)^2]
= a^2 Var[X] + 2ab\,\mathrm{Cov}[X, Y] + b^2 Var[Y]

3. Var[a_1 X_1 + \cdots + a_n X_n]
= a_1^2 Var[X_1] + \cdots + a_n^2 Var[X_n]
\quad + 2a_1 a_2\,\mathrm{Cov}[X_1, X_2] + \cdots + 2a_1 a_n\,\mathrm{Cov}[X_1, X_n]
\quad + 2a_2 a_3\,\mathrm{Cov}[X_2, X_3] + \cdots + 2a_2 a_n\,\mathrm{Cov}[X_2, X_n]
\quad + \cdots + 2a_{n-1} a_n\,\mathrm{Cov}[X_{n-1}, X_n]

= \sum_{i=1}^{n} a_i^2 Var[X_i] + 2\sum_{i < j} a_i a_j\,\mathrm{Cov}[X_i, X_j]

= \sum_{i=1}^{n} a_i^2 Var[X_i] \quad \text{if } X_1, \dots, X_n \text{ are mutually independent}
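In matrix form this rule says Var(a^T X) = a^T Σ a, where Σ is the covariance matrix of (X_1, ..., X_n). A quick sketch verifying this on simulated data (illustrative, not from the slides):

import numpy as np

rng = np.random.default_rng(4)
# three correlated variables built from common noise
z = rng.normal(size=(3, 200_000))
X = np.array([z[0], z[0] + 0.5 * z[1], z[1] - z[2]])

a = np.array([2.0, -1.0, 3.0])
Sigma = np.cov(X)                    # 3x3 sample covariance matrix

print(np.var(a @ X, ddof=1))         # direct variance of a1 X1 + a2 X2 + a3 X3
print(a @ Sigma @ a)                 # sum_i a_i^2 Var + 2 sum_{i<j} a_i a_j Cov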

Some Applications
(Rules of Expectation & Variance)
Let X_1, \dots, X_n be n mutually independent random variables each having mean \mu and standard deviation \sigma (variance \sigma^2).

Let

\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{1}{n}X_1 + \cdots + \frac{1}{n}X_n = a_1 X_1 + \cdots + a_n X_n \quad \text{with } a_i = \frac{1}{n}

Then

\mu_{\bar{X}} = E[\bar{X}] = \frac{1}{n}E[X_1] + \cdots + \frac{1}{n}E[X_n] = \frac{1}{n}\mu + \cdots + \frac{1}{n}\mu = \mu

Also

Var[\bar{X}] = \frac{1}{n^2}Var[X_1] + \cdots + \frac{1}{n^2}Var[X_n] = \frac{1}{n^2}\sigma^2 + \cdots + \frac{1}{n^2}\sigma^2 = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}

Thus

\mu_{\bar{X}} = \mu, \quad \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \quad \text{or} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}

Hence the distribution of \bar{X} is centered at \mu and becomes more and more compact about \mu as n increases.
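A brief simulation of this concentration effect (an added sketch, not from the original slides): the standard deviation of X̄ shrinks like σ/√n.

import numpy as np

rng = np.random.default_rng(5)
mu, sigma = 10.0, 4.0

for n in (4, 16, 64, 256):
    # 20,000 replicate samples of size n; take the mean of each
    xbar = rng.normal(mu, sigma, size=(20_000, n)).mean(axis=1)
    print(n, xbar.mean(), xbar.std(), sigma / np.sqrt(n))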

Tchebychev's Inequality

Let X denote a random variable with mean \mu = E[X] and variance Var[X] = E[(X - \mu)^2] = \sigma^2. Then

P[|X - \mu| \ge k\sigma] \le \frac{1}{k^2}

P[\mu - k\sigma < X < \mu + k\sigma] \ge 1 - \frac{1}{k^2}

Note: \sigma = \sqrt{Var[X]} = \sqrt{E[(X - \mu)^2]} is called the standard deviation of X.

Proof:

\sigma^2 = Var(X) = E[(X - \mu)^2] = \int_{-\infty}^{\infty}(x - \mu)^2 f(x)\, dx

= \int_{-\infty}^{\mu - k\sigma}(x - \mu)^2 f(x)\, dx + \int_{\mu - k\sigma}^{\mu + k\sigma}(x - \mu)^2 f(x)\, dx + \int_{\mu + k\sigma}^{\infty}(x - \mu)^2 f(x)\, dx

\ge \int_{-\infty}^{\mu - k\sigma}(x - \mu)^2 f(x)\, dx + \int_{\mu + k\sigma}^{\infty}(x - \mu)^2 f(x)\, dx

\ge \int_{-\infty}^{\mu - k\sigma} k^2\sigma^2 f(x)\, dx + \int_{\mu + k\sigma}^{\infty} k^2\sigma^2 f(x)\, dx

= k^2\sigma^2\left[\int_{-\infty}^{\mu - k\sigma} f(x)\, dx + \int_{\mu + k\sigma}^{\infty} f(x)\, dx\right]

= k^2\sigma^2\left(P[X \le \mu - k\sigma] + P[X \ge \mu + k\sigma]\right) = k^2\sigma^2\, P[|X - \mu| \ge k\sigma]

Thus k^2\sigma^2\, P[|X - \mu| \ge k\sigma] \le \sigma^2

or P[|X - \mu| \ge k\sigma] \le \frac{1}{k^2}

and P[|X - \mu| < k\sigma] \ge 1 - \frac{1}{k^2}

Tchebychev's inequality is very conservative:

P[|X - \mu| < k\sigma] = P[\mu - k\sigma < X < \mu + k\sigma] \ge 1 - \frac{1}{k^2}

k = 1: \quad P[\mu - \sigma < X < \mu + \sigma] \ge 1 - \frac{1}{1^2} = 0

k = 2: \quad P[\mu - 2\sigma < X < \mu + 2\sigma] \ge 1 - \frac{1}{2^2} = \frac{3}{4}

k = 3: \quad P[\mu - 3\sigma < X < \mu + 3\sigma] \ge 1 - \frac{1}{3^2} = \frac{8}{9}
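To see how conservative the bound is, the following sketch (an added illustration; the normal distribution is just one convenient choice) compares the Tchebychev lower bound with the actual probability P[μ − kσ < X < μ + kσ]:

from scipy.stats import norm

for k in (1, 2, 3):
    tcheby = 1 - 1 / k**2
    actual = norm.cdf(k) - norm.cdf(-k)    # exact for a normal distribution
    print(k, tcheby, round(actual, 4))     # 0 vs 0.6827, 0.75 vs 0.9545, 0.889 vs 0.9973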

The Law of Large Numbers

Let X_1, \dots, X_n be n mutually independent random variables each having mean \mu (and, for the proof below, common finite variance \sigma^2).

Let \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i.

Then for any \varepsilon > 0 (no matter how small)

P[|\bar{X} - \mu| < \varepsilon] = P[\mu - \varepsilon < \bar{X} < \mu + \varepsilon] \to 1 \quad \text{as } n \to \infty

Proof:

We will use Tchebychev's inequality, which states that for any random variable X

P[\mu_{\bar{X}} - k\sigma_{\bar{X}} < \bar{X} < \mu_{\bar{X}} + k\sigma_{\bar{X}}] \ge 1 - \frac{1}{k^2}

Now \mu_{\bar{X}} = \mu and \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}, so

P[|\bar{X} - \mu| < \varepsilon] = P[\mu - k\sigma_{\bar{X}} < \bar{X} < \mu + k\sigma_{\bar{X}}] \ge 1 - \frac{1}{k^2}

where k\sigma_{\bar{X}} = k\frac{\sigma}{\sqrt{n}} = \varepsilon, \quad \text{or} \quad k = \frac{\varepsilon\sqrt{n}}{\sigma}

Thus

P[|\bar{X} - \mu| < \varepsilon] \ge 1 - \frac{1}{k^2} = 1 - \frac{\sigma^2}{n\varepsilon^2} \to 1 \quad \text{as } n \to \infty

Thus P[|\bar{X} - \mu| < \varepsilon] \to 1 as n \to \infty.

A Special case

Let X_1, \dots, X_n be n mutually independent random variables each having a Bernoulli distribution with parameter p:

X_i = \begin{cases} 1 & \text{if repetition } i \text{ is S (prob } p) \\ 0 & \text{if repetition } i \text{ is F (prob } q = 1 - p) \end{cases}

E[X_i] = p

\bar{X} = \frac{X_1 + \cdots + X_n}{n} = \hat{p} = \text{proportion of successes}

Thus the Law of Large Numbers states

P[p - \varepsilon < \hat{p} < p + \varepsilon] \to 1 \quad \text{as } n \to \infty

Thus the Law of Large Numbers states that \hat{p}, the proportion of successes, converges to the probability of success p as n \to \infty.

Some people misinterpret this to mean that if the proportion of successes is currently lower than p, then the proportion of successes in the future will have to be larger than p to counter this and ensure that the Law of Large Numbers holds true. Of course, if in the infinite future the proportion of successes is p, then this is enough to ensure that the Law of Large Numbers holds true.
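A small simulation of the running proportion of successes (an added sketch, not part of the slides):

import numpy as np

rng = np.random.default_rng(6)
p = 0.3
flips = rng.random(100_000) < p                     # Bernoulli(p) trials
running_prop = np.cumsum(flips) / np.arange(1, flips.size + 1)

for n in (100, 1_000, 10_000, 100_000):
    print(n, running_prop[n - 1])                   # approaches p = 0.3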

Some more applications


Rules of expectation and Rules of
Variance

The mean and variance of a Binomial random variable

We have already computed these by other methods:
1. Using the probability function p(x).
2. Using the moment generating function m_X(t).

Suppose that we have observed n independent repetitions of a Bernoulli trial. Let X_1, \dots, X_n be n mutually independent random variables each having a Bernoulli distribution with parameter p, defined by

X_i = \begin{cases} 1 & \text{if repetition } i \text{ is S (prob } p) \\ 0 & \text{if repetition } i \text{ is F (prob } q) \end{cases}

E[X_i] = 1\cdot p + 0\cdot q = p

Var[X_i] = (1 - p)^2 p + (0 - p)^2 q = q^2 p + p^2 q = pq

Now X = X_1 + \cdots + X_n has a Binomial distribution with parameters n and p; X is the total number of successes in the n repetitions.

\mu_X = E[X_1] + \cdots + E[X_n] = p + \cdots + p = np

\sigma_X^2 = Var[X_1] + \cdots + Var[X_n] = pq + \cdots + pq = npq
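A one-line check of np and npq against simulated binomial counts (an added sketch):

import numpy as np

rng = np.random.default_rng(7)
n, p = 40, 0.25
x = rng.binomial(n, p, size=500_000)
print(x.mean(), n * p)               # ~10
print(x.var(), n * p * (1 - p))      # ~7.5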

The mean and variance of a Hypergeometric distribution

The hypergeometric distribution arises when we sample without replacement n objects from a population of N = a + b objects. The population is divided into two groups (group A and group B). Group A contains a objects while group B contains b objects.

Let X denote the number of objects in the sample of n that come from group A. The probability function of X is:

p(x) = \frac{\dbinom{a}{x}\dbinom{b}{n - x}}{\dbinom{a + b}{n}}

Let X_1, \dots, X_n be n random variables defined by

X_i = \begin{cases} 1 & \text{if the } i^{\text{th}} \text{ object selected comes from group A} \\ 0 & \text{if the } i^{\text{th}} \text{ object selected comes from group B} \end{cases}

Then

X = X_1 + \cdots + X_n

P[X_i = 1] = \frac{a}{a + b} \quad \text{and} \quad P[X_i = 0] = \frac{b}{a + b}

Proof:

P[X_i = 1] = \frac{a \cdot {}_{(a+b-1)}P_{(n-1)}}{{}_{(a+b)}P_{n}} = \frac{a\,\dfrac{(a+b-1)!}{(a+b-n)!}}{\dfrac{(a+b)!}{(a+b-n)!}} = \frac{a}{a+b}

Therefore

E[X_i] = 1\cdot P[X_i = 1] + 0\cdot P[X_i = 0] = \frac{a}{a+b}

E[X_i^2] = 1^2\cdot P[X_i = 1] + 0^2\cdot P[X_i = 0] = \frac{a}{a+b}

and

Var[X_i] = E[X_i^2] - \left(E[X_i]\right)^2 = \frac{a}{a+b} - \left(\frac{a}{a+b}\right)^2 = \frac{a}{a+b}\left(1 - \frac{a}{a+b}\right) = \frac{a}{a+b}\cdot\frac{b}{a+b}

Thus

E[X] = E[X_1 + \cdots + X_n] = \sum_{i=1}^{n} E[X_i] = n\,\frac{a}{a+b}

Also

Var[X] = Var[X_1 + \cdots + X_n] = \sum_{i=1}^{n} Var[X_i] + 2\sum_{i<j}\mathrm{Cov}[X_i, X_j]

and

Var[X_i] = \frac{a}{a+b}\cdot\frac{b}{a+b}

We also need to calculate \mathrm{Cov}[X_i, X_j].


Note:

\mathrm{Cov}[U, V] = E[(U - \mu_U)(V - \mu_V)]
= E[UV - \mu_U V - \mu_V U + \mu_U\mu_V]
= E[UV] - \mu_U E[V] - \mu_V E[U] + \mu_U\mu_V
= E[UV] - \mu_U\mu_V = E[UV] - E[U]E[V]

Thus

\mathrm{Cov}[X_i, X_j] = E[X_i X_j] - E[X_i]E[X_j], \quad \text{with } E[X_i] = \frac{a}{a+b}

and

E[X_i X_j] = 1\cdot P[X_i X_j = 1] + 0\cdot P[X_i X_j = 0] = P[X_i = 1, X_j = 1]

Note:

P[X_i = 1, X_j = 1] = \frac{a(a-1)\, {}_{(a+b-2)}P_{(n-2)}}{{}_{(a+b)}P_{n}} = \frac{a(a-1)\,\dfrac{(a+b-2)!}{(a+b-n)!}}{\dfrac{(a+b)!}{(a+b-n)!}} = \frac{a(a-1)}{(a+b)(a+b-1)}

Thus

E[X_i X_j] = \frac{a(a-1)}{(a+b)(a+b-1)}

and

\mathrm{Cov}[X_i, X_j] = E[X_i X_j] - E[X_i]E[X_j] = \frac{a(a-1)}{(a+b)(a+b-1)} - \left(\frac{a}{a+b}\right)^2

= \frac{a}{a+b}\left[\frac{a-1}{a+b-1} - \frac{a}{a+b}\right] = \frac{a}{a+b}\cdot\frac{(a-1)(a+b) - a(a+b-1)}{(a+b-1)(a+b)} = \frac{-ab}{(a+b-1)(a+b)^2}

Thus

Var[X] = Var[X_1 + \cdots + X_n] = \sum_{i=1}^{n} Var[X_i] + 2\sum_{i<j}\mathrm{Cov}[X_i, X_j]

with

Var[X_i] = \frac{a}{a+b}\cdot\frac{b}{a+b} = \frac{ab}{(a+b)^2} \quad \text{and} \quad \mathrm{Cov}[X_i, X_j] = \frac{-ab}{(a+b-1)(a+b)^2}

so

Var[X] = n\,\frac{ab}{(a+b)^2} + 2\binom{n}{2}\left(\frac{-ab}{(a+b-1)(a+b)^2}\right)
= n\,\frac{ab}{(a+b)^2} - n(n-1)\,\frac{ab}{(a+b-1)(a+b)^2}

= n\,\frac{ab}{(a+b)^2}\left[1 - \frac{n-1}{a+b-1}\right] = n\,p_A\,p_B\,(1 - f)

where

p_A = \frac{a}{a+b}, \quad p_B = \frac{b}{a+b} \quad \text{and} \quad f = \frac{n-1}{a+b-1} = \frac{n-1}{N-1}

Thus if X has a hypergeometric distribution with parameters a, b and n, then

E[X] = n\,\frac{a}{a+b} = n\,p_A

Var[X] = n\,p_A\,p_B\,(1 - f)

where p_A = \frac{a}{a+b}, \quad p_B = \frac{b}{a+b} \quad \text{and} \quad f = \frac{n-1}{a+b-1} = \frac{n-1}{N-1}

The mean and variance of a Negative Binomial distribution

The Negative Binomial distribution arises when we repeat a Bernoulli trial until k successes (S) occur. Then X = the trial on which the k-th success occurred. The probability function of X is:

p(x) = \binom{x-1}{k-1} p^k q^{x-k}, \quad x = k, k+1, k+2, \dots

Let X_1 = the number of trials up to and including the 1st success, and let X_i = the number of trials after the (i-1)-st success up to and including the i-th success (i \ge 2).

Then X = X_1 + \cdots + X_k, and X_1, \dots, X_k are mutually independent. Each X_i has a geometric distribution with parameter p, thus

E[X_i] = \frac{1}{p} \quad \text{and} \quad Var[X_i] = \frac{q}{p^2}

hence

E[X] = \sum_{i=1}^{k} E[X_i] = \frac{k}{p}

and

Var[X] = \sum_{i=1}^{k} Var[X_i] = \frac{kq}{p^2}

Thus if X has a negative binomial distribution with parameters k and p, then

E[X] = \frac{k}{p}, \quad Var[X] = \frac{kq}{p^2}
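A quick check with scipy (an added sketch). Note that scipy.stats.nbinom counts the number of failures before the k-th success, so the total number of trials is that count plus k:

from scipy.stats import nbinom

k, p = 5, 0.3
q = 1 - p

print(nbinom.mean(k, p) + k, k / p)      # 16.67, 16.67 (trials, not failures)
print(nbinom.var(k, p), k * q / p**2)    # 38.89, 38.89 (variance is unchanged by the shift)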

Multivariate Moments
Non-central and Central

Definition
Let X_1 and X_2 be jointly distributed random variables (discrete or continuous). Then for any pair of positive integers (k_1, k_2) the joint moment of (X_1, X_2) of order (k_1, k_2) is defined to be:

\mu_{k_1 k_2} = E\left[X_1^{k_1} X_2^{k_2}\right] = \begin{cases} \displaystyle\sum_{x_1}\sum_{x_2} x_1^{k_1} x_2^{k_2}\, p(x_1, x_2) & \text{if } X_1, X_2 \text{ are discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1^{k_1} x_2^{k_2}\, f(x_1, x_2)\, dx_1\, dx_2 & \text{if } X_1, X_2 \text{ are continuous} \end{cases}

Definition
Let X_1 and X_2 be jointly distributed random variables (discrete or continuous). Then for any pair of positive integers (k_1, k_2) the joint central moment of (X_1, X_2) of order (k_1, k_2) is defined to be:

\mu_{k_1, k_2}^{0} = E\left[(X_1 - \mu_1)^{k_1}(X_2 - \mu_2)^{k_2}\right] = \begin{cases} \displaystyle\sum_{x_1}\sum_{x_2} (x_1 - \mu_1)^{k_1}(x_2 - \mu_2)^{k_2}\, p(x_1, x_2) & \text{if } X_1, X_2 \text{ are discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x_1 - \mu_1)^{k_1}(x_2 - \mu_2)^{k_2}\, f(x_1, x_2)\, dx_1\, dx_2 & \text{if } X_1, X_2 \text{ are continuous} \end{cases}

where \mu_1 = E[X_1] and \mu_2 = E[X_2].

Note

\mu_{1,1}^{0} = E[(X_1 - \mu_1)(X_2 - \mu_2)] = \mathrm{Cov}[X_1, X_2] = \text{the covariance of } X_1 \text{ and } X_2.
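For a sample, the (1,1) joint central moment is just the sample covariance. A small illustration (an added sketch):

import numpy as np

rng = np.random.default_rng(8)
x1 = rng.normal(size=100_000)
x2 = 0.5 * x1 + rng.normal(size=100_000)

central_11 = np.mean((x1 - x1.mean()) * (x2 - x2.mean()))   # mu^0_{1,1}
print(central_11, np.cov(x1, x2, ddof=0)[0, 1])             # identical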


Definition: For any two random variables X and Y define the correlation coefficient \rho_{XY} to be:

\rho_{XY} = \frac{\mathrm{Cov}[X, Y]}{\sqrt{Var[X]\, Var[Y]}} = \frac{\mathrm{Cov}[X, Y]}{\sigma_X \sigma_Y}

Properties of the correlation coefficient \rho_{XY}

If X and Y are independent then \rho_{XY} = 0 (since \mathrm{Cov}[X, Y] = 0). The converse is not necessarily true: \rho_{XY} = 0 does not imply that X and Y are independent.

More properties of the correlation coefficient

-1 \le \rho_{XY} \le 1, and |\rho_{XY}| = 1 if there exist a and b such that P[Y = bX + a] = 1, where \rho_{XY} = +1 if b > 0 and \rho_{XY} = -1 if b < 0.

Some Rules for Expectation

1. E[X_i] = \int \cdots \int x_i\, f(x_1, \dots, x_n)\, dx_1 \cdots dx_n = \int_{-\infty}^{\infty} x_i\, f_i(x_i)\, dx_i

Thus you can calculate E[X_i] either from the joint distribution of X_1, \dots, X_n or from the marginal distribution of X_i.

2. (The Linearity property)

E[a_1 X_1 + \cdots + a_n X_n] = a_1 E[X_1] + \cdots + a_n E[X_n]

3. (The Multiplicative property) Suppose X_1, \dots, X_q are independent of X_{q+1}, \dots, X_k. Then

E[g(X_1, \dots, X_q)\, h(X_{q+1}, \dots, X_k)] = E[g(X_1, \dots, X_q)]\, E[h(X_{q+1}, \dots, X_k)]

In the simple case when k = 2:

E[XY] = E[X]\,E[Y] \quad \text{if } X \text{ and } Y \text{ are independent}

Some Rules for Variance

Var[X] = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2 = \sigma_X^2

1. Var[X + Y] = Var[X] + Var[Y] + 2\,\mathrm{Cov}[X, Y]

where \mathrm{Cov}[X, Y] = E[(X - \mu_X)(Y - \mu_Y)]

Note: If X and Y are independent, then

\mathrm{Cov}[X, Y] = E[(X - \mu_X)(Y - \mu_Y)] = E[X - \mu_X]\,E[Y - \mu_Y] = 0

and Var[X + Y] = Var[X] + Var[Y]

Definition: For any two random variables X and Y define the correlation coefficient \rho_{XY} to be:

\rho_{XY} = \frac{\mathrm{Cov}[X, Y]}{\sqrt{Var[X]\, Var[Y]}} = \frac{\mathrm{Cov}[X, Y]}{\sigma_X \sigma_Y}

Thus \mathrm{Cov}[X, Y] = \rho_{XY}\,\sigma_X\sigma_Y, and

Var[X + Y] = \sigma_X^2 + \sigma_Y^2 + 2\rho_{XY}\,\sigma_X\sigma_Y = \sigma_X^2 + \sigma_Y^2 \quad \text{if } X \text{ and } Y \text{ are independent}

2. Var[aX + bY] = a^2 Var[X] + b^2 Var[Y] + 2ab\,\mathrm{Cov}[X, Y]

Proof:

Var[aX + bY] = E[(aX + bY - \mu_{aX+bY})^2], \quad \text{with } \mu_{aX+bY} = a\mu_X + b\mu_Y

Thus

Var[aX + bY] = E[(a(X - \mu_X) + b(Y - \mu_Y))^2]
= E[a^2(X - \mu_X)^2 + 2ab(X - \mu_X)(Y - \mu_Y) + b^2(Y - \mu_Y)^2]
= a^2 Var[X] + 2ab\,\mathrm{Cov}[X, Y] + b^2 Var[Y]

3. Var[a_1 X_1 + \cdots + a_n X_n]
= a_1^2 Var[X_1] + \cdots + a_n^2 Var[X_n]
\quad + 2a_1 a_2\,\mathrm{Cov}[X_1, X_2] + \cdots + 2a_1 a_n\,\mathrm{Cov}[X_1, X_n]
\quad + 2a_2 a_3\,\mathrm{Cov}[X_2, X_3] + \cdots + 2a_2 a_n\,\mathrm{Cov}[X_2, X_n]
\quad + \cdots + 2a_{n-1} a_n\,\mathrm{Cov}[X_{n-1}, X_n]

= \sum_{i=1}^{n} a_i^2 Var[X_i] + 2\sum_{i < j} a_i a_j\,\mathrm{Cov}[X_i, X_j]

= \sum_{i=1}^{n} a_i^2 Var[X_i] \quad \text{if } X_1, \dots, X_n \text{ are mutually independent}

Distribution functions,
Moments,
Moment generating functions
in the Multivariate case

The distribution function F(x)

This is defined for any random variable X:

F(x) = P[X \le x]

Properties
1. F(-\infty) = 0 and F(\infty) = 1.
2. F(x) is non-decreasing (i.e. if x_1 < x_2 then F(x_1) \le F(x_2)).
3. F(b) - F(a) = P[a < X \le b].

4. Discrete Random Variables

F(x) = P[X \le x] = \sum_{u \le x} p(u)

p(x) = F(x) - F(x^-) = \text{the jump in } F(x) \text{ at } x.

[Figure: F(x) plotted against x as a step function, with the jump at each point equal to p(x).]

F(x) is a non-decreasing step function with F(-\infty) = 0 and F(\infty) = 1.

5. Continuous Random Variables

F(x) = P[X \le x] = \int_{-\infty}^{x} f(u)\, du

f(x) = F'(x) = \text{the slope of } F(x) \text{ at } x.

[Figure: F(x) plotted against x as a smooth, non-decreasing curve whose slope at x is the density f(x).]

F(x) is a non-decreasing continuous function with F(-\infty) = 0 and F(\infty) = 1.

To find the probability density function f(x), one first finds F(x) and then

f(x) = F'(x).

The joint distribution function F(x_1, x_2, \dots, x_k)

is defined for k random variables X_1, X_2, \dots, X_k:

F(x_1, x_2, \dots, x_k) = P[X_1 \le x_1, X_2 \le x_2, \dots, X_k \le x_k]

For k = 2:

F(x_1, x_2) = P[X_1 \le x_1, X_2 \le x_2]

[Figure: the event is the quadrant below and to the left of the point (x_1, x_2) in the (x_1, x_2) plane.]

Properties

1. F(x_1, -\infty) = F(-\infty, x_2) = F(-\infty, -\infty) = 0

2. F(x_1, \infty) = P[X_1 \le x_1, X_2 \le \infty] = P[X_1 \le x_1] = F_1(x_1)
   = the marginal cumulative distribution function of X_1

   F(\infty, x_2) = P[X_1 \le \infty, X_2 \le x_2] = P[X_2 \le x_2] = F_2(x_2)
   = the marginal cumulative distribution function of X_2

   F(\infty, \infty) = P[X_1 \le \infty, X_2 \le \infty] = 1

3. F(x_1, x_2) is non-decreasing in both the x_1 direction and the x_2 direction,
   i.e. if a_1 < b_1 and a_2 < b_2 then
   i.   F(a_1, x_2) \le F(b_1, x_2)
   ii.  F(x_1, a_2) \le F(x_1, b_2)
   iii. F(a_1, a_2) \le F(b_1, b_2)

[Figure: the rectangle with corners (a_1, a_2), (b_1, a_2), (a_1, b_2), (b_1, b_2) in the (x_1, x_2) plane.]

4. P[a < X_1 \le b,\ c < X_2 \le d] = F(b, d) - F(a, d) - F(b, c) + F(a, c)

[Figure: the rectangle with corners (a, c), (b, c), (a, d), (b, d) in the (x_1, x_2) plane.]
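A short numerical illustration of property 4 (an added sketch, using independent exponential components so the joint CDF factors as F(x_1, x_2) = F_1(x_1) F_2(x_2)):

from scipy.stats import expon

def F(x1, x2):
    # joint CDF of two independent Exponential(1) random variables
    return expon.cdf(x1) * expon.cdf(x2)

a, b, c, d = 0.5, 2.0, 1.0, 3.0
rect = F(b, d) - F(a, d) - F(b, c) + F(a, c)
direct = (expon.cdf(b) - expon.cdf(a)) * (expon.cdf(d) - expon.cdf(c))
print(rect, direct)   # equal: P[a < X1 <= b, c < X2 <= d]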

4. Discrete Random Variables

F(x_1, x_2) = P[X_1 \le x_1, X_2 \le x_2] = \sum_{u_1 \le x_1}\sum_{u_2 \le x_2} p(u_1, u_2)

F(x_1, x_2) is a step surface, and

p(x_1, x_2) = \text{the jump in } F(x_1, x_2) \text{ at } (x_1, x_2).

5. Continuous Random Variables

F(x_1, x_2) = P[X_1 \le x_1, X_2 \le x_2] = \int_{-\infty}^{x_2}\int_{-\infty}^{x_1} f(u_1, u_2)\, du_1\, du_2

F(x_1, x_2) is a smooth surface, and

f(x_1, x_2) = \frac{\partial^2 F(x_1, x_2)}{\partial x_1 \partial x_2} = \frac{\partial^2 F(x_1, x_2)}{\partial x_2 \partial x_1}
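A symbolic sketch of recovering the density from the joint CDF (an added example; sympy and the independent-exponential CDF are illustrative choices):

import sympy as sp

x1, x2 = sp.symbols('x1 x2', positive=True)
F = (1 - sp.exp(-x1)) * (1 - sp.exp(-x2))   # joint CDF of independent Exp(1)'s

f = sp.diff(F, x1, x2)                      # mixed partial d^2 F / dx1 dx2
print(sp.simplify(f))                       # exp(-x1 - x2), the joint density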

Multivariate Moments
Non-central and Central

Definition
Let X_1 and X_2 be jointly distributed random variables (discrete or continuous). Then for any pair of positive integers (k_1, k_2) the joint moment of (X_1, X_2) of order (k_1, k_2) is defined to be:

\mu_{k_1 k_2} = E\left[X_1^{k_1} X_2^{k_2}\right] = \begin{cases} \displaystyle\sum_{x_1}\sum_{x_2} x_1^{k_1} x_2^{k_2}\, p(x_1, x_2) & \text{if } X_1, X_2 \text{ are discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1^{k_1} x_2^{k_2}\, f(x_1, x_2)\, dx_1\, dx_2 & \text{if } X_1, X_2 \text{ are continuous} \end{cases}

Definition
Let X_1 and X_2 be jointly distributed random variables (discrete or continuous). Then for any pair of positive integers (k_1, k_2) the joint central moment of (X_1, X_2) of order (k_1, k_2) is defined to be:

\mu_{k_1, k_2}^{0} = E\left[(X_1 - \mu_1)^{k_1}(X_2 - \mu_2)^{k_2}\right] = \begin{cases} \displaystyle\sum_{x_1}\sum_{x_2} (x_1 - \mu_1)^{k_1}(x_2 - \mu_2)^{k_2}\, p(x_1, x_2) & \text{if } X_1, X_2 \text{ are discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x_1 - \mu_1)^{k_1}(x_2 - \mu_2)^{k_2}\, f(x_1, x_2)\, dx_1\, dx_2 & \text{if } X_1, X_2 \text{ are continuous} \end{cases}

where \mu_1 = E[X_1] and \mu_2 = E[X_2].

Note

\mu_{1,1}^{0} = E[(X_1 - \mu_1)(X_2 - \mu_2)] = \mathrm{Cov}[X_1, X_2] = \text{the covariance of } X_1 \text{ and } X_2.

Multivariate Moment Generating Functions

Recall the moment generating function:

m_X(t) = E\left[e^{tX}\right] = \begin{cases} \displaystyle\sum_{x} e^{tx}\, p(x) & \text{if } X \text{ is discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty} e^{tx}\, f(x)\, dx & \text{if } X \text{ is continuous} \end{cases}

Definition
Let X_1, X_2, \dots, X_k be jointly distributed random variables (discrete or continuous). Then the joint moment generating function is defined to be:

m_{X_1,\dots,X_k}(t_1, \dots, t_k) = E\left[e^{t_1 X_1 + \cdots + t_k X_k}\right]
= \begin{cases} \displaystyle\sum_{x_1}\cdots\sum_{x_k} e^{t_1 x_1 + \cdots + t_k x_k}\, p(x_1, \dots, x_k) & \text{if } X_1, \dots, X_k \text{ are discrete} \\[2ex] \displaystyle\int \cdots \int e^{t_1 x_1 + \cdots + t_k x_k}\, f(x_1, \dots, x_k)\, dx_1 \cdots dx_k & \text{if } X_1, \dots, X_k \text{ are continuous} \end{cases}


Note: m_{X_1,\dots,X_k}(0, \dots, 0) = 1

m_{X_1,\dots,X_k}(0, \dots, 0, t_i, 0, \dots, 0) = m_{X_i}(t_i)
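A small symbolic check of these two facts for a toy discrete joint distribution (an added sketch; the probability table is made up for illustration):

import sympy as sp

t1, t2 = sp.symbols('t1 t2')

# toy joint pmf p(x1, x2) on {0,1} x {0,1}
p = {(0, 0): sp.Rational(1, 8), (0, 1): sp.Rational(3, 8),
     (1, 0): sp.Rational(2, 8), (1, 1): sp.Rational(2, 8)}

m = sum(prob * sp.exp(t1 * x1 + t2 * x2) for (x1, x2), prob in p.items())

print(m.subs({t1: 0, t2: 0}))         # 1
print(sp.simplify(m.subs(t2, 0)))     # marginal MGF of X1: 1/2 + exp(t1)/2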

Power Series expansion of the joint moment generating function (k = 2)

m_{X,Y}(t, s) = E\left[e^{tX + sY}\right] = E\left[e^{tX} e^{sY}\right]

= E\left[\left(1 + tX + \frac{(tX)^2}{2!} + \cdots\right)\left(1 + sY + \frac{(sY)^2}{2!} + \cdots\right)\right]

using e^u = 1 + u + \frac{u^2}{2!} + \frac{u^3}{3!} + \frac{u^4}{4!} + \cdots

= E\left[1 + Xt + Ys + \frac{t^2}{2!}X^2 + XY\,ts + \frac{s^2}{2!}Y^2 + \cdots + \frac{t^k s^m}{k!\,m!}X^k Y^m + \cdots\right]

= 1 + \mu_{1,0}\,t + \mu_{0,1}\,s + \mu_{2,0}\frac{t^2}{2!} + \mu_{1,1}\,ts + \mu_{0,2}\frac{s^2}{2!} + \cdots + \mu_{k,m}\frac{t^k s^m}{k!\,m!} + \cdots
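As an illustration of reading moments off this expansion (an added sketch): the joint moments μ_{k,m} are the mixed partial derivatives of m_{X,Y} at (0, 0), equivalently k! m! times the series coefficients. The MGF below is the standard closed form for a standard bivariate normal pair with correlation ρ, used only as a convenient example.

import sympy as sp

t, s, rho = sp.symbols('t s rho')

# joint MGF of a standard bivariate normal pair with correlation rho
m = sp.exp((t**2 + 2*rho*t*s + s**2) / 2)

mu_11 = sp.diff(m, t, s).subs({t: 0, s: 0})   # coefficient of ts  -> E[XY] = rho
mu_20 = sp.diff(m, t, 2).subs({t: 0, s: 0})   # second derivative in t -> E[X^2] = 1
print(mu_11, mu_20)                           # rho, 1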
