Random Variables

Consider a random experiment described by a


triple (S, E, P ), where S is a sample space, E
is an event space, and P is a probability mea-
sure. The triple (S, E, P ) is often referred to as a
probability space. Let R be the set of all real
numbers. A random variable is a real-valued
function
X:S→R
such that for any x ∈ R
{s : X(s) ≤ x} ∈ E. (1)

In what follows, instead of {s : X(s) ≤ x} we


will write X ≤ x. Condition (1) ensures that
the probability P (X ≤ x) is defined.

The term “random” refers to the fact that the


value of X is not known before the experiment
is performed. The experiment results in some
outcome s and the corresponding value (real-
ization) X(s).
Example Consider the experiment of tossing
two fair coins. Then,
S = {HH, HT, T H, T T }
and each coin lands H or T with probability 0.5, independently of the other, so each outcome has probability

P (HH) = P (HT ) = P (T H) = P (T T ) = 0.25.

Let X(HH) = 2, X(HT ) = 1, X(T H) = 1,


X(T T ) = 0. In other words, the random vari-
able X is the number of heads appearing in
this experiment. Then, for example,
P (X ≤ 1.3) = P ({T T, HT, T H})

= P (T T ) + P (HT ) + P (T H)

= 0.25 + 0.25 + 0.25 = 0.75
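The same probability can be recovered by brute-force enumeration of the sample space; a minimal Python sketch (the dictionaries below simply encode the experiment described above):

# Sample space of two fair coin tosses; each outcome has probability 0.25.
outcomes = ["HH", "HT", "TH", "TT"]
prob = {s: 0.25 for s in outcomes}
X = {s: s.count("H") for s in outcomes}   # X(s) = number of heads in s

# P(X <= 1.3) = sum of P(s) over all outcomes s with X(s) <= 1.3
print(sum(prob[s] for s in outcomes if X[s] <= 1.3))   # 0.75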


Example Suppose that our experiment consists
of seeing how long a battery can operate be-
fore wearing down. Then, we can consider

S = {s : s ≥ 0}
If the random variable X is the lifetime of this
battery, then

X(s) = s for all s ∈ S


Another example is the random variable

Y (s) = 1 if s ≥ 3, and Y (s) = 0 otherwise.

The function

F (x) = P (X ≤ x), −∞ < x < ∞


is called the (cumulative) distribution function
of X (cdf of X).
In words, F (x) is the probability that the ran-
dom variable X takes on a value that is less
than or equal to x.

Let a and b be real numbers and a < b. Since


{X ≤ a} ∩ {a < X ≤ b} = ∅
and
{X ≤ a} ∪ {a < X ≤ b} = {X ≤ b},
we have
F (b) = P (X ≤ b) = P (X ≤ a) + P (a < X ≤ b)

= F (a) + P (a < X ≤ b).


Hence,
P (a < X ≤ b) = F (b) − F (a)
• If F (a) = F (b), then the probability of X
taking a value in the interval (a, b] is 0.
• If F (x) is continuous at b, then
P (X = b) = 0.
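For the two-coin variable X above, these facts are easy to verify numerically; a small Python sketch (the pmf below restates P (X = 0) = 0.25, P (X = 1) = 0.5, P (X = 2) = 0.25):

# pmf of X = number of heads in two fair coin tosses
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

def F(x):
    # cdf: F(x) = P(X <= x)
    return sum(p for value, p in pmf.items() if value <= x)

a, b = 0, 1.5
direct = sum(p for value, p in pmf.items() if a < value <= b)  # P(a < X <= b) from the pmf
print(direct, F(b) - F(a))   # both 0.5
print(F(1.8) - F(1.2))       # 0.0, so P(1.2 < X <= 1.8) = 0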

Joint Distribution Function

Let X and Y be random variables on the same


sample space S. We also assume the same
probability measure. Then, the event
{s : X(s) ≤ x and Y (s) ≤ y}
consists of all sample points s ∈ S such that
X(s) ≤ x and Y (s) ≤ y. Instead of
P ({s : X(s) ≤ x and Y (s) ≤ y})
we will write P (X ≤ x, Y ≤ y).

The joint cumulative probability distribution


function of X and Y is
F (x, y) = P (X ≤ x, Y ≤ y),
where −∞ < x < ∞ and −∞ < y < ∞. Knowl-
edge of F (x, y) gives the “marginal” distribu-
tion functions of X and Y :
FX (x) = P (X ≤ x) = P (X ≤ x, Y < ∞) = F (x, ∞)
and
FY (y) = P (Y ≤ y) = P (X < ∞, Y ≤ y) = F (∞, y)
The random variables X and Y are indepen-
dent if, for all x and y, the events

{s : X(s) ≤ x} and {s : Y (s) ≤ y}


are independent. In other words, the random
variables X and Y are independent if, for all x
and y,

P (X ≤ x, Y ≤ y) = P (X ≤ x)P (Y ≤ y).
Equivalently, in terms of the joint distribution
function F of X and Y , the random variables
X and Y are independent if, for all x and y,

F (x, y) = FX (x)FY (y).
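As an illustration, the factorization of the joint cdf can be checked on a grid of points for two independent fair coin tosses coded as 0 and 1; a short Python sketch:

# Joint pmf of two independent fair coins, coded 0/1.
joint = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

def F(x, y):
    # joint cdf F(x, y) = P(X <= x, Y <= y)
    return sum(p for (u, v), p in joint.items() if u <= x and v <= y)

def FX(x):
    return sum(p for (u, _), p in joint.items() if u <= x)

def FY(y):
    return sum(p for (_, v), p in joint.items() if v <= y)

# Independence: F(x, y) = FX(x) * FY(y) for all x, y.
for x in (-1, 0, 0.5, 1, 2):
    for y in (-1, 0, 0.5, 1, 2):
        assert abs(F(x, y) - FX(x) * FY(y)) < 1e-12
print("factorization holds on the grid")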

Discrete Random Variables

A random variable that takes on a finite or


countably infinite number of values is called
discrete. Let X be a discrete random variable
with range {x1, x2, ...} (finite or countably infi-
nite). The function
p(xi) = P (X = xi)
is called the probability mass function. Then,

∑_i p(xi) = 1,

and the cumulative distribution function is

F (x) = ∑_{i: xi ≤ x} p(xi).

If the infinite series ∑_i |xi| p(xi) converges, i.e. the series ∑_i xi p(xi) is absolutely convergent, then the expected value of X is defined as

E[X] = ∑_i xi p(xi).
The symbol µ is often used to denote E[X].
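For a random variable with finitely many values the series is just a finite sum; a minimal Python sketch computing E[X] for the two-coin variable above:

# pmf given as {value: probability}; here X = number of heads in two tosses
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

def expectation(pmf):
    # E[X] = sum over i of x_i * p(x_i)
    return sum(x * p for x, p in pmf.items())

print(expectation(pmf))   # 1.0, i.e. µ = 1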
Consider discrete random variables X and

Y = u(X),
where u : R → R is some function such that
E[Y ] exists. Then, for each yj in the range of
Y , there exists xi in the range of X such that
yj = u(xi). We have

E[Y] = ∑_j yj P (Y = yj)

= ∑_j yj P (∪_{i: u(xi)=yj} {s : X(s) = xi})

= ∑_j yj ∑_{i: u(xi)=yj} P ({s : X(s) = xi})

= ∑_j yj ∑_{i: u(xi)=yj} p(xi) = ∑_i u(xi) p(xi)

Hence,

E[u(X)] = ∑_i u(xi) p(xi).
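This identity lets us compute E[u(X)] without first finding the distribution of Y = u(X); it can be checked numerically. A sketch with an arbitrary illustrative pmf and u(x) = x²:

from collections import defaultdict

pmf_X = {-1: 0.2, 0: 0.3, 2: 0.5}   # an arbitrary pmf, chosen for illustration

def u(x):
    return x * x                     # Y = u(X) = X^2

# Right-hand side: sum_i u(x_i) p(x_i), no distribution of Y needed
rhs = sum(u(x) * p for x, p in pmf_X.items())

# Left-hand side: build the pmf of Y = u(X), then sum_j y_j P(Y = y_j)
pmf_Y = defaultdict(float)
for x, p in pmf_X.items():
    pmf_Y[u(x)] += p
lhs = sum(y * p for y, p in pmf_Y.items())

print(lhs, rhs)   # both 2.2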

Let X be a random variable with expected
value µ. The variance of X is defined as

V ar[X] = E[(X − µ)²].


So, for any discrete random variable X,

V ar[X] = ∑_i (xi − µ)² p(xi)

= ∑_i (xi² − 2µ xi + µ²) p(xi)

= ∑_i xi² p(xi) − 2µ ∑_i xi p(xi) + µ² ∑_i p(xi)

= E[X²] − 2µ² + µ² = E[X²] − µ².


Hence, we can compute V ar[X] using the formula

V ar[X] = E[X²] − (E[X])².
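Both the defining formula and this shortcut can be evaluated directly; a quick Python check for the two-coin variable:

pmf = {0: 0.25, 1: 0.5, 2: 0.25}

mu = sum(x * p for x, p in pmf.items())              # E[X]
ex2 = sum(x * x * p for x, p in pmf.items())         # E[X^2]

var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())  # E[(X - mu)^2]
var_shortcut = ex2 - mu ** 2                                     # E[X^2] - (E[X])^2

print(var_definition, var_shortcut)   # both 0.5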

z-Transforms
Many random variables assume only integral
values 0, 1, 2, 3, ... . Let X be such a variable.
In order to simplify the notation we replace p(i)
by pi. Consider the power series


g(z) = ∑_{i=0}^∞ pi z^i,

which is called the probability generating function. The series g(z) converges for |z| ≤ 1. In particular, g(1) = 1, and since

(d/dz) g(z) = ∑_{i=1}^∞ i pi z^{i−1},

we have

E[X] = ∑_{i=0}^∞ i pi = ∑_{i=1}^∞ i pi = g′(1).
Moreover, since

(d²/dz²) g(z) = ∑_{i=2}^∞ i(i − 1) pi z^{i−2},

we have

g′′(1) = ∑_{i=2}^∞ i(i − 1) pi = ∑_{i=0}^∞ i(i − 1) pi

= ∑_{i=0}^∞ i² pi − ∑_{i=0}^∞ i pi = E[X²] − g′(1).

Hence,

V ar[X] = g′′(1) + g′(1) − (g′(1))².
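For a pmf with finite support, g′(1) = ∑ i pi and g′′(1) = ∑ i(i − 1) pi are finite sums, so these formulas are easy to check; a Python sketch with an arbitrary pmf on {0, 1, 2, 3}:

# p[i] = P(X = i); an arbitrary pmf chosen for illustration
p = [0.1, 0.3, 0.4, 0.2]

g1 = sum(i * pi for i, pi in enumerate(p))             # g'(1)  = E[X]
g2 = sum(i * (i - 1) * pi for i, pi in enumerate(p))   # g''(1) = E[X(X - 1)]

mean = g1
var = g2 + g1 - g1 ** 2        # Var[X] = g''(1) + g'(1) - (g'(1))^2

# Compare with the direct formulas for the mean and variance.
direct_mean = sum(i * pi for i, pi in enumerate(p))
direct_var = sum((i - direct_mean) ** 2 * pi for i, pi in enumerate(p))
print(mean, direct_mean)   # 1.7 1.7
print(var, direct_var)     # both 0.81 (up to rounding)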

Another important use of probability generat-


ing functions is the analysis of problems con-
cerning sums of independent random variables.
Let X1 and X2 be independent nonnegative in-
teger valued random variables, and let
Y = X1 + X2.
Then P (Y = k) is given by the convolution

P (Y = k) = ∑_{j=0}^{k} P (X1 = j) P (X2 = k − j),

and therefore the corresponding probability generating function is

g(z) = ∑_{k=0}^∞ ( ∑_{j=0}^{k} P (X1 = j) P (X2 = k − j) ) z^k.

Consider

g1(z) = ∑_{j=0}^∞ P (X1 = j) z^j

and

g2(z) = ∑_{j=0}^∞ P (X2 = j) z^j.

Then

g1(z) g2(z) = ∑_{k=0}^∞ ( ∑_{j=0}^{k} P (X1 = j) P (X2 = k − j) ) z^k

= ∑_{k=0}^∞ P (Y = k) z^k = g(z).

Hence, for Y = X1 + X2,

g(z) = g1(z) g2(z).
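Multiplying the generating functions is the same as convolving the coefficient sequences, so the pmf of Y = X1 + X2 can be computed by a double loop over the two pmfs; a Python sketch with two small, arbitrary pmfs:

# pmf1[j] = P(X1 = j), pmf2[j] = P(X2 = j); arbitrary illustrative values
pmf1 = [0.5, 0.5]         # X1 uniform on {0, 1}
pmf2 = [0.2, 0.3, 0.5]    # X2 on {0, 1, 2}

# Convolution: P(Y = k) = sum_j P(X1 = j) * P(X2 = k - j)
pmf_Y = [0.0] * (len(pmf1) + len(pmf2) - 1)
for j, p1 in enumerate(pmf1):
    for m, p2 in enumerate(pmf2):
        pmf_Y[j + m] += p1 * p2

print(pmf_Y)        # [0.1, 0.25, 0.4, 0.25] up to rounding
print(sum(pmf_Y))   # 1.0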
Example. A Bernoulli random variable has two
possible values: 1 with probability p and 0 with
probability q = 1 − p. Then

E[X] = 0 · q + 1 · p = p

and
V ar[X] = E[X²] − (E[X])²

= 1² · p + 0² · q − p² = p − p² = pq.


Alternatively,

g(z) = q + pz.
Therefore,

E[X] = g ′(1) = p
and
V ar[X] = g′′(1) + g′(1) − (g′(1))²

= 0 + p − p² = pq.

Binomial Distribution
Some experiments can be viewed as a sequence
of identical and independent trials, each re-
sulting in one of two outcomes called “suc-
cess” and “failure”. Such trials are often called
Bernoulli trials. Let p be the probability of suc-
cess in any single Bernoulli trial and Y be the
number of successes observed in n such trials.
Then
Y = X1 + X2 + ... + Xn,
where all X1, X2, ..., Xn are mutually inde-
pendent and identically distributed Bernoulli
random variables. Therefore Y has generat-
ing function
g(z) = ∑_{k=0}^{n} P (Y = k) z^k = (q + pz)^n

= ∑_{k=0}^{n} C(n, k) (pz)^k q^{n−k} = ∑_{k=0}^{n} C(n, k) p^k q^{n−k} z^k,

where

C(n, k) = n! / (k! (n − k)!)

is the binomial coefficient.

Hence, for k = 0, 1, ..., n,

P (Y = k) = C(n, k) p^k q^{n−k}.
We will say that Y has a binomial distribution
(is a binomial random variable). We have
(d/dz) g(z) = (d/dz)(q + pz)^n = np(q + pz)^{n−1}

= np(1 − p + pz)^{n−1}.
Hence
E[Y ] = g ′(1) = np.
Similarly,

g′′(z) = n(n − 1) p² (1 − p + pz)^{n−2},

and therefore

V ar[Y] = g′′(1) + g′(1) − (g′(1))²

= n(n − 1) p² + np − (np)² = np − np² = np(1 − p) = npq.
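A short Python check of these formulas, building the binomial pmf with math.comb and computing the mean and variance directly (the values of n and p below are arbitrary):

from math import comb

n, p = 10, 0.3
q = 1 - p

# Binomial pmf: P(Y = k) = C(n, k) p^k q^(n - k)
pmf = [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, n * p)       # both 3.0
print(var, n * p * q)    # both 2.1 (up to rounding)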

Alternatively,

E[Y] = ∑_{i=0}^{n} i · P (Y = i) = ∑_{i=0}^{n} i · C(n, i) p^i (1 − p)^{n−i}

= ∑_{i=1}^{n} i · (n! / ((n − i)! i!)) p^i (1 − p)^{n−i}

= ∑_{i=1}^{n} (n! / ((n − i)! (i − 1)!)) p^i (1 − p)^{n−i}

= np ∑_{i=1}^{n} ((n − 1)! / ((n − i)! (i − 1)!)) p^{i−1} (1 − p)^{n−i}

= np ∑_{j=0}^{n−1} ((n − 1)! / ((n − 1 − j)! j!)) p^j (1 − p)^{n−1−j}

= np ∑_{j=0}^{n−1} C(n − 1, j) p^j (1 − p)^{n−1−j}

= np [p + (1 − p)]^{n−1} = np.

Poisson Distribution
Let X be a discrete random variable that can
take on the values 0, 1, 2, ... . Then X is
a Poisson random variable (X is Poisson dis-
tributed or X has a Poisson distribution) if
P (X = k) = λ^k e^{−λ} / k!,

where P (X = k) is the probability that X = k and λ is a positive constant. The probability
generating function is given by
g(z) = ∑_{k=0}^∞ (λ^k e^{−λ} / k!) z^k = e^{−λ} ∑_{k=0}^∞ (λz)^k / k!

= e^{−λ} e^{λz} = e^{λ(z−1)}.
Hence,
g′(z) = λ e^{λ(z−1)} and g′′(z) = λ² e^{λ(z−1)},
and therefore,

E[X] = λ and V ar[X] = λ.
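A numerical check of E[X] = λ and V ar[X] = λ, truncating the Poisson pmf at a point K beyond which the remaining mass is negligible (the values of λ and K below are arbitrary):

from math import exp, factorial

lam, K = 2.5, 60   # rate and truncation point

# Poisson pmf: P(X = k) = lam^k e^(-lam) / k!
pmf = [lam**k * exp(-lam) / factorial(k) for k in range(K + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, var)   # both approximately 2.5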

The idea of the z-transform is applicable not only to random variables with nonnegative integer values but to any sequence of real numbers. Thus, let a0, a1, a2, ... be a sequence of real numbers. If

g(z) = a0 + a1 z + a2 z² + a3 z³ + ... = ∑_{i=0}^∞ ai z^i
converges in some interval |z| < b, then g(z) is
called the generating function of the sequence
a0, a1, a2, ... .

Example Consider the sequence where aj = 1


for all j ≥ 0. Then, the generating function is
a sum of the following geometric series:

g(z) = 1 + z + z² + ...

This series converges for |z| < 1, and

g(z) = 1 / (1 − z).

Example Consider the sequence where ai = 0 if i is 0 or 1, and ai = 1 otherwise. Then, the generating function is

g(z) = 0 + 0·z + 1·z² + 1·z³ + ... = ∑_{i=2}^∞ z^i = z² ∑_{i=0}^∞ z^i.

This series converges for |z| < 1, and

g(z) = z² / (1 − z).

Let X be a random variable with nonnegative


integer values. For every nonnegative integer i, let

pi = P (X = i) and qi = P (X > i).

Then,

qi = p_{i+1} + p_{i+2} + ... = ∑_{j=i+1}^∞ pj.

Theorem 1 If g(z) = ∑_{i=0}^∞ pi z^i and q(z) = ∑_{i=0}^∞ qi z^i, then for |z| < 1

q(z) = (1 − g(z)) / (1 − z).

Proof For all i ≥ 1,

q_{i−1} = pi + p_{i+1} + p_{i+2} + ...

qi = p_{i+1} + p_{i+2} + p_{i+3} + ...

and therefore,

pi = q_{i−1} − qi.

This implies

∑_{i=1}^∞ pi z^i = ∑_{i=1}^∞ q_{i−1} z^i − ∑_{i=1}^∞ qi z^i

= z ∑_{i=1}^∞ q_{i−1} z^{i−1} − ∑_{i=1}^∞ qi z^i = z ∑_{i=0}^∞ qi z^i − ∑_{i=1}^∞ qi z^i.
Taking into account that

p0 = 1 − ∑_{i=1}^∞ pi = 1 − q0,
we have

g(z) = p0 + ∑_{i=1}^∞ pi z^i = 1 − q0 + z ∑_{i=0}^∞ qi z^i − ∑_{i=1}^∞ qi z^i

= 1 + z ∑_{i=0}^∞ qi z^i − ∑_{i=0}^∞ qi z^i = 1 + z q(z) − q(z).

Hence,

g(z) = 1 + z q(z) − q(z)    (2)

and consequently

q(z) = (1 − g(z)) / (1 − z).

Theorem 2 The expectation E[X] and variance V ar[X] satisfy the relations

E[X] = q(1) and V ar[X] = 2q′(1) + q(1) − (q(1))².

Proof By differentiation of (2), we obtain

g′(z) = q(z) + z q′(z) − q′(z)

and
g′′(z) = 2q′(z) + z q′′(z) − q′′(z).

Hence, g′(1) = q(1) and g′′(1) = 2q′(1). Then,

E[X] = g′(1) = q(1)

and

V ar[X] = g′′(1) + g′(1) − (g′(1))² = 2q′(1) + q(1) − (q(1))².
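In particular, E[X] = q(1) = ∑_{i≥0} P (X > i), the familiar tail-sum formula for the mean of a nonnegative integer-valued variable; a quick Python check for a binomial variable (the values of n and p are arbitrary):

from math import comb

n, p = 6, 0.4
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# q_i = P(X > i): the tail of the pmf beyond i
q = [sum(pmf[i + 1:]) for i in range(n + 1)]

mean_from_tails = sum(q)                                # q(1) = sum_i q_i
mean_direct = sum(k * pk for k, pk in enumerate(pmf))
print(mean_from_tails, mean_direct, n * p)              # all 2.4 (up to rounding)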
