A random variable (r.v.) is a function that assigns one and only one numerical
value to each simple event in an experiment.
We will denote r.vs by capital letters, X, Y or Z, and their values by lowercase letters, x, y or z, respectively.
There are two types of r.vs: discrete and continuous. Random variables that can
take a countable number of values are called discrete. Random variables that take
values from an interval of real numbers are called continuous.
Xt = 1 if the National League won in year t,
Xt = −1 if the American League won in year t.
Definition 3.2. Let X denote a r.v. and x its particular value from the whole range
of all values of X, say RX . The probability of the event (X ≤ x) expressed as a
function of x:
FX (x) = PX (X ≤ x) (3.1)
is called the cumulative distribution function (c.d.f.) of the r.v. X.
The c.d.f. has the following limiting properties:
• lim_{x→∞} FX(x) = 1
• lim_{x→−∞} FX(x) = 0
Note that
FX(xi) = PX(X ≤ xi) = Σ_{x ≤ xi} pX(x).   (3.2)
Ex. 3.4 The Uniform Distribution. The p.m.f. of a r.v. X uniformly distributed on {1, 2, . . . , n} is
P(X = k) = 1/n, k = 1, 2, . . . , n,
where n is a positive integer.
For example, for X ∼ Bin(8, 0.4) the p.m.f. and c.d.f. have the following tabular form:

k          0       1       2       3       4       5       6       7       8
P(X = k)   0.0168  0.0896  0.2090  0.2787  0.2322  0.1239  0.0413  0.0079  0.0007
P(X ≤ k)   0.0168  0.1064  0.3154  0.5941  0.8263  0.9502  0.9915  0.9993  1
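These values are easy to reproduce numerically. A minimal Python sketch, assuming scipy is available (n, p and k are just illustrative names); note how the c.d.f. column is the running sum of the p.m.f. column, as in (3.2):

    # Reproduce the Bin(8, 0.4) table: pmf gives P(X = k), cdf gives P(X <= k)
    from scipy.stats import binom

    n, p = 8, 0.4
    for k in range(n + 1):
        print(k, round(binom.pmf(k, n, p), 4), round(binom.cdf(k, n, p), 4))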
Figure 3.1: Graphical representation of the mass function and the cumulative distribution function for X ∼ Bin(8, 0.4).
Other commonly used discrete distributions include:
• Bernoulli(p)
• Geometric(p)
• Hypergeometric(n, M, N)
For a continuous r.v. X the probability density function (p.d.f.) is the derivative of the c.d.f.,
fX(x) = d/dx FX(x).   (3.6)
Hence
FX(x) = ∫_{−∞}^{x} fX(t) dt.   (3.7)
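As a quick numerical illustration of (3.7), here is a small Python sketch assuming X ∼ Exp(1), for which fX(t) = e^{−t} for t ≥ 0 and FX(x) = 1 − e^{−x}:

    # Check that integrating the density up to x recovers the c.d.f.
    import numpy as np
    from scipy.integrate import quad

    f = lambda t: np.exp(-t)      # p.d.f. of Exp(1), zero for t < 0
    x = 2.0
    F_int, _ = quad(f, 0, x)      # numerical value of (3.7)
    print(F_int, 1 - np.exp(-x))  # both approximately 0.8647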
Figure 3.2: Density function and cumulative distribution function for N(4.5, 2).
The continuous uniform distribution U(a, b) has p.d.f.
fX(x) = 1/(b − a), if a ≤ x ≤ b, and fX(x) = 0 otherwise.
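Applying (3.7) to this density gives the c.d.f. explicitly: FX(x) = 0 for x < a,
FX(x) = ∫_{a}^{x} 1/(b − a) dt = (x − a)/(b − a), for a ≤ x ≤ b,
and FX(x) = 1 for x > b.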
Figure 3.3: Density function and cumulative distribution function for Exp(1).
Note that all these distributions depend on some parameters, such as p, λ, µ or σ. These values are usually unknown, and so their estimation is one of the important problems of statistical analysis.
3.1.3 Expectation
The expectation of a function g of a r.v. X is defined by
E(g(X)) = ∫_{−∞}^{∞} g(x)f(x) dx, for a continuous r.v.,
E(g(X)) = Σ_{j=0}^{∞} g(xj)p(xj), for a discrete r.v.   (3.11)
Example 3.10
Let X be a r.v. such that
f(x) = (1/2) sin x for 0 ≤ x ≤ π,
f(x) = 0 otherwise.
• Expectation:
E X = (1/2) ∫_{0}^{π} x sin x dx = π/2.
• Variance:
var(X) = E X² − (E X)²
= (1/2) ∫_{0}^{π} x² sin x dx − (π/2)²
= (π² − 4)/2 − π²/4
= π²/4 − 2.
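The integrals in this example are easy to check numerically; a small Python sketch using scipy (π²/4 − 2 ≈ 0.4674):

    import numpy as np
    from scipy.integrate import quad

    f = lambda x: 0.5 * np.sin(x)                   # p.d.f. on [0, pi]
    EX, _  = quad(lambda x: x * f(x), 0, np.pi)     # E X
    EX2, _ = quad(lambda x: x**2 * f(x), 0, np.pi)  # E X^2
    print(EX, np.pi / 2)                            # both approximately 1.5708
    print(EX2 - EX**2, np.pi**2 / 4 - 2)            # both approximately 0.4674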
Example 3.11 The Normal distribution, X ∼ N(µ, σ 2 )
We will show that the parameter µ is the expectation of X and the parameter σ 2
is the variance of X.
Substituting z = (x − µ)/σ in E X = ∫_{−∞}^{∞} x (1/(σ√2π)) e^{−(x−µ)²/(2σ²)} dx gives E X = ∫_{−∞}^{∞} (µ + σz)φ(z) dz = µ · 1 + σ · 0 = µ, where φ denotes the standard normal density, since φ integrates to 1 and zφ(z) is an odd function. Hence E X = µ.
The expectation is linear:
E[a g(X) + b h(Y) + c] = a E[g(X)] + b E[h(Y)] + c
for any real constants a, b and c and for any functions g and h of r.vs X and Y whose expectations exist.
3.2 Two Dimensional Random Variables
Definition 3.4. The cumulative distribution function of a two-dimensional r.v.
X = (X1 , X2 ) is
FX (x1 , x2 ) = P (X1 ≤ x1 , X2 ≤ x2 ) (3.14)
For a discrete bivariate r.v. this is
FX(x1, x2) = Σ_{x1i ≤ x1} Σ_{x2j ≤ x2} p(x1i, x2j),
where p(x1i, x2j) denotes the joint probability mass function and
p(x1i, x2j) = P(X1 = x1i, X2 = x2j).
As in the univariate case, the joint p.m.f. satisfies the following conditions.
1. p(x1i , x2j ) ≥ 0 , for all i, j
2. Σ_{all i} Σ_{all j} p(x1i, x2j) = 1
Now, with each of these 36 elements associate values of two variables, X1 and X2, such that
X1 = the sum of the two numbers obtained and X2 = the absolute value of their difference.
Then the bivariate r.v. X = (X1, X2) has the following joint probability mass function.
                                     x1
          2     3     4     5     6     7     8     9    10    11    12
     0  1/36    0   1/36    0   1/36    0   1/36    0   1/36    0   1/36
     1   0    1/18    0   1/18    0   1/18    0   1/18    0   1/18    0
     2   0     0    1/18    0   1/18    0   1/18    0   1/18    0     0
x2   3   0     0     0    1/18    0   1/18    0   1/18    0     0     0
     4   0     0     0     0    1/18    0   1/18    0     0     0     0
     5   0     0     0     0     0    1/18    0     0     0     0     0
Marginal p.m.fs
Example 3.12 cont. The marginal distributions of the variables X1 and X2 are as follows.

x1            2     3     4     5     6     7     8     9    10    11    12
P(X1 = x1)  1/36  1/18  1/12  1/9   5/36  1/6   5/36  1/9   1/12  1/18  1/36

x2            0     1     2     3     4     5
P(X2 = x2)  1/6   5/18  2/9   1/6   1/9   1/18
For example, for g(X) = X1X2,
E[g(X)] = 2 · 0 · (1/36) + · · · + 7 · 5 · (1/18) = 245/18.
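The whole example can be verified by enumerating the 36 equally likely outcomes; a Python sketch using exact fractions, with X1 and X2 defined as above:

    from fractions import Fraction
    from itertools import product
    from collections import defaultdict

    pmf = defaultdict(Fraction)
    for d1, d2 in product(range(1, 7), repeat=2):
        x1, x2 = d1 + d2, abs(d1 - d2)   # X1 = sum, X2 = |difference|
        pmf[(x1, x2)] += Fraction(1, 36)

    # marginal P(X1 = 7) and the expectation E[X1 * X2]
    p7 = sum(p for (x1, _), p in pmf.items() if x1 == 7)
    E = sum(x1 * x2 * p for (x1, x2), p in pmf.items())
    print(p7, E)   # prints 1/6 and 245/18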
For a continuous bivariate r.v. the joint p.d.f. is obtained from the joint c.d.f. by differentiation:
∂²F(x1, x2)/∂x1∂x2 = f(x1, x2).   (3.17)
Also
P(a ≤ X1 ≤ b, c ≤ X2 ≤ d) = ∫_{c}^{d} ∫_{a}^{b} f(x1, x2) dx1 dx2.   (3.18)
[Figure 3.4: the region R(X1, X2) in the (X_1, X_2) plane, within the unit square.]
The marginal p.d.fs of X1 and X2 are defined analogously to the discrete case, here using integrals.
fX1(x1) = ∫_{−∞}^{∞} f(x1, x2) dx2, for −∞ < x1 < ∞,   (3.19)
fX2(x2) = ∫_{−∞}^{∞} f(x1, x2) dx1, for −∞ < x2 < ∞.   (3.20)
Example 3.13
Calculate P[X ∈ A], where A = {(x1, x2) : x1 + x2 ≥ 1} and the joint p.d.f. of
X = (X1 , X2 ) is defined by
fX(x1, x2) = 6x1x2², for 0 < x1 < 1, 0 < x2 < 1, and fX(x1, x2) = 0 otherwise.   (3.21)
The probability is a double integral of the p.d.f. over the region A. The region is, however, limited to the domain in which the p.d.f. is positive; see Figure 3.4.
We can write
A = {(x1 , x2 ) : x1 + x2 ≥ 1, 0 < x1 < 1, 0 < x2 < 1}
= {(x1 , x2 ) : x1 ≥ 1 − x2 , 0 < x1 < 1, 0 < x2 < 1}
= {(x1 , x2 ) : 1 − x2 < x1 < 1, 0 < x2 < 1}.
Hence, the probability is
P(X ∈ A) = ∫∫_A f(x1, x2) dx1 dx2 = ∫_{0}^{1} ∫_{1−x2}^{1} 6x1x2² dx1 dx2 = 0.9.
Also, using formulae (3.19) and (3.20) we can calculate the marginal p.d.fs:
fX1(x1) = ∫_{0}^{1} 6x1x2² dx2 = 2x1x2³ |_{0}^{1} = 2x1, for 0 < x1 < 1,
fX2(x2) = ∫_{0}^{1} 6x1x2² dx1 = 3x1²x2² |_{0}^{1} = 3x2², for 0 < x2 < 1.
These functions allow us to calculate probabilities involving only one variable. For example,
P(1/4 < X1 < 1/2) = ∫_{1/4}^{1/2} 2x1 dx1 = 3/16.
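Both of these values can be checked numerically; a Python sketch using scipy, with the integration limits derived above:

    from scipy.integrate import dblquad, quad

    f = lambda x1, x2: 6 * x1 * x2**2
    # dblquad integrates f(y, x) with x in [a, b] and y in [gfun(x), hfun(x)];
    # here y = x1 runs over [1 - x2, 1] and x = x2 over [0, 1]
    p, _ = dblquad(f, 0, 1, lambda x2: 1 - x2, lambda x2: 1)
    print(p)   # 0.9

    # P(1/4 < X1 < 1/2) from the marginal f_X1(x1) = 2 * x1
    q, _ = quad(lambda x1: 2 * x1, 0.25, 0.5)
    print(q)   # 0.1875 = 3/16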
Similarly to the case of a univariate r.v., the following linearity property of the expectation holds.
E[ag(X) + bh(X) + c] = a E[g(X)] + b E[h(X)] + c, (3.22)
where a, b and c are constants and g and h are some functions of the bivariate r.v.
X = (X1 , X2 ).
The conditional p.m.f. of X2 given X1 = x1 is defined by
pX2(x2|x1) = pX(x1, x2)/pX1(x1),   (3.23)
and analogously for the conditional p.m.f. of X1 given X2 = x2. It is easy to check that these functions are indeed p.m.fs. For example, for (3.23) we have
Σ_{RX2} pX2(x2|x1) = Σ_{RX2} pX(x1, x2)/pX1(x1) = (Σ_{RX2} pX(x1, x2)) / pX1(x1) = pX1(x1)/pX1(x1) = 1.
For example, from the joint table above, the conditional p.m.f. of X2 given X1 = 5 is:

x2            0     1     2     3     4     5
pX2(x2|5)     0    1/2    0    1/2    0     0
Analogously to the conditional distribution for discrete r.vs, we define the condi-
tional distribution for continuous r.vs.
Definition 3.6. Let X = (X1 , X2 ) denote a continuous bivariate r.v. with joint
p.d.f. fX (x1 , x2 ) and marginal p.d.fs fX1 (x1 ) and fX2 (x2 ). For any x1 such that
fX1 (x1 ) > 0, the conditional p.d.f. of X2 given that X1 = x1 is the function of x2
defined by
fX2(x2|x1) = fX(x1, x2)/fX1(x1).   (3.25)
Analogously, we define the conditional p.d.f. of X1 given X2 = x2 by
fX1(x1|x2) = fX(x1, x2)/fX2(x2).   (3.26)
Here too, it is easy to verify that these functions are p.d.fs. For example, for the
function (3.25) we have
∫_{RX2} fX2(x2|x1) dx2 = ∫_{RX2} fX(x1, x2)/fX1(x1) dx2 = (∫_{RX2} fX(x1, x2) dx2) / fX1(x1) = fX1(x1)/fX1(x1) = 1.
Definition 3.7. Let X = (X1 , X2 ) denote a continuous bivariate r.v. with joint
p.d.f. fX (x1 , x2 ) and marginal p.d.fs fX1 (x1 ) and fX2 (x2 ). Then X1 and X2 are
called independent random variables if, for every x1 ∈ RX1 and x2 ∈ RX2,
fX(x1, x2) = fX1(x1) fX2(x2).
In fact, two r.vs are independent if and only if there exist functions g(x1) and h(x2) such that, for every x1 ∈ RX1 and x2 ∈ RX2,
fX(x1, x2) = g(x1)h(x2).
Independence also implies that for any sets A and B,
P(X1 ∈ A, X2 ∈ B) = P(X1 ∈ A) P(X2 ∈ B),
that is, the events {X1 ∈ A} and {X2 ∈ B} are independent events.
Also, we assume that σ²_{X1} and σ²_{X2} are finite positive values.
A simplified notation µ1, µ2, σ1², σ2² will be used when it is clear which r.vs the notation refers to.
Here cov(X1, X2) = E[(X1 − µ1)(X2 − µ2)] is the covariance of X1 and X2, and the correlation coefficient of X1 and X2 is
ρ(X1,X2) = corr(X1, X2) = cov(X1, X2) / (σX1 σX2).   (3.30)
Properties of covariance
• If X1 and X2 are independent, then cov(X1, X2) = 0.
Properties of correlation
• −1 ≤ ρ(X1,X2) ≤ 1,
• |ρ(X1,X2)| = 1 if and only if there exist constants a ≠ 0 and b such that
P(X2 = aX1 + b) = 1.
3.3 Multivariate Normal Distribution
The joint p.d.f. of the bivariate normal distribution can be written in matrix form as
fX(x) = (2π)^{−1} (det V)^{−1/2} exp{ −(1/2)(x − µ)ᵀ V^{−1} (x − µ) },   (3.32)
where x = (x1, x2)ᵀ.
The determinant of V is
det V = det [ σ1²  ρσ1σ2 ; ρσ1σ2  σ2² ] = (1 − ρ²)σ1²σ2².
Remark 3.1. Note that when ρ = 0 the joint p.d.f. (3.32) simplifies to
fX(x) = 1/(2πσ1σ2) exp{ −(1/2) [ (x1 − µ1)²/σ1² + (x2 − µ2)²/σ2² ] },
which can be written as a product of the marginal distributions of X1 and X2 .
Hence, if X = (X1 , X2 )T has a bivariate normal distribution and ρ = 0 then the
variables X1 and X2 are independent.
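The factorization is easy to see numerically; a Python sketch assuming scipy is available (the values of µ, σ1, σ2 and the evaluation point are arbitrary illustrations):

    import numpy as np
    from scipy.stats import multivariate_normal, norm

    mu = np.array([1.0, -2.0])
    s1, s2 = 1.5, 0.5
    V = np.diag([s1**2, s2**2])    # rho = 0, so the off-diagonals vanish

    x = np.array([0.7, -1.8])
    joint = multivariate_normal(mu, V).pdf(x)
    product = norm(mu[0], s1).pdf(x[0]) * norm(mu[1], s2).pdf(x[1])
    print(np.isclose(joint, product))   # True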
If X ∼ N(µ, V) and Y = a + BX, for a constant vector a and a constant matrix B, then Y also has a multivariate normal distribution, with mean vector E Y = a + Bµ and variance-covariance matrix
VY = BVBᵀ.
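A small simulation sketch illustrating this rule (the particular a, B and V are arbitrary; the sample covariance of Y approaches BVBᵀ as the sample grows):

    import numpy as np

    rng = np.random.default_rng(0)
    mu = np.array([0.0, 1.0])
    V = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
    B = np.array([[1.0, 2.0],
                  [0.0, 3.0]])
    a = np.array([1.0, -1.0])

    X = rng.multivariate_normal(mu, V, size=200_000)
    Y = a + X @ B.T                 # each row is a + B x
    print(np.cov(Y.T))              # close to B V B^T
    print(B @ V @ B.T)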
Two important properties of multivariate normal random vectors are given in the
following proposition.
Proposition 3.1. Suppose that the normal r.v. is partitioned into two subvectors of dimensions n1 and n2,
X = (X(1), X(2))ᵀ,
and, correspondingly, the mean vector and the variance-covariance matrix are partitioned as
µ = (µ(1), µ(2))ᵀ,   V = [ V11  V12 ; V21  V22 ],
where
µ(i) = E(X (i) ) and Vij = cov(X (i) , X (j) ).
Then
1. X (1) and X (2) are independent iff the (n1 × n2 ) dimensional covariance
matrix V12 is a zero matrix.
Definition 3.10. Xt is a Gaussian time series if all its joint distributions are
multivariate normal, i.e., if for any collection of integers i1 , . . . , in , the random
vector (Xi1 , . . . , Xin )T has a multivariate normal distribution.
For proofs of the results given in this chapter see Casella and Berger (1990) and Brockwell and Davis (2002), or other textbooks on probability and statistics.