CHAPTER 2
2.1 Introduction
At the start of Sec. 1.1.2, we indicated that one of the possible ways of classifying signals is as deterministic or random. By random we mean unpredictable; that is, in the case of a random signal, we cannot with certainty predict its future value, even if the entire past history of the signal is known. If the signal is of the deterministic type, no such uncertainty exists.
then (we are assuming them to be constants) we know the value of x ( t ) for all t .
time).
Even this value may not remain constant and could vary with time. Then,
observing the output of such a source over a long period of time would not be of
much use in predicting the future values. We say that the source output varies in
a random manner.
Even when the transmitted (radio) signal is from a highly stable source, the voltage at the terminals of a receiving antenna varies in an unpredictable fashion. This is because the conditions of propagation of the radio waves are not under our control.
2.2
2.2.1. Terminology
Def. 2.1: Outcome
The end result of an experiment. For example, if the experiment consists of throwing a die, the outcome would be any one of the six faces, F1 , ......, F6 .
Def. 2.2: Random experiment
An experiment whose outcomes are not known in advance. (e.g. tossing a
coin, throwing a die, measuring the noise voltage at the terminals of a resistor
etc.)
Def. 2.3: Random event
A random event is an outcome or set of outcomes of a random experiment
that share a common attribute. For example, considering the experiment of
throwing a die, an event could be the 'face F1 ' or 'even indexed faces'
A1 , A2 ⋅ ⋅ ⋅ ⋅
This concept can be generalized to the union of more than two events.
Def. 2.7: Intersection of events
The intersection of two events, A and B , is the set of all outcomes which
belong to A as well as B . The intersection of A and B is denoted by ( A ∩ B )
P ( A ) = lim_{n → ∞} ( n_A / n )   (2.1)
( n_A / n ) represents the fraction of occurrences of A in n trials.
For small values of n , it is likely that ( n_A / n ) will fluctuate quite badly. But as n becomes larger and larger, we expect ( n_A / n ) to tend to a definite limiting value. For example, let the experiment be that of tossing a coin and A the event 'outcome of a toss is Head'. If n is of the order of 100, ( n_A / n ) may not deviate from 1/2 by more than, say, ten percent; and as n becomes larger and larger, we expect ( n_A / n ) to converge to 1/2 .
Def. 2.12: The classical definition:
The relative frequency definition given above has an empirical flavour. In the classical approach, the probability of the event A is found without experimentation. This is done by counting the total number N of possible outcomes of the experiment. If N_A of those outcomes are favourable to the occurrence of the event A , then
P ( A ) = N_A / N   (2.2)
where it is assumed that all outcomes are equally likely!
P2) P ( S ) = 1 (2.3b)
(Note that in Eq. 2.3(c), the symbol + is used to mean two different things;
namely, to denote the union of A and B and to denote the addition of two real
numbers). Using Eq. 2.3, it is possible for us to derive some additional
relationships:
i) If A B ≠ φ , then P ( A + B ) = P ( A ) + P ( B ) − P ( A B ) (2.4)
b) A1 + A2 + ...... + An = S. (2.5b)
iii) P ( Ā ) = 1 − P ( A ) , where Ā denotes the complement of A   (2.7)
that A has occurred. In a real world random experiment, it is quite likely that the
occurrence of the event B is very much influenced by the occurrence of the
event A . To give a simple example, let a bowl contain 3 resistors and 1
capacitor. The occurrence of the event 'capacitor on the second draw' very much depends on what has been drawn at the first instant. Such dependencies between events are brought out using the notion of conditional probability .
we have
P ( B | A ) = lim_{n → ∞} [ ( n_AB / n ) / ( n_A / n ) ] = P ( A B ) / P ( A ) ,  P ( A ) ≠ 0   (2.8a)
or P ( A B ) = P (B | A) P ( A)
Similarly
P ( A | B ) = P ( A ) P ( B | A ) / P ( B ) ,  P ( B ) ≠ 0   (2.10b)
relation can be derived for the joint probability of a joint event involving the
intersection of three or more events. For example P ( A BC ) can be written as
P ( A BC ) = P ( A B ) P (C | AB )
= P ( A ) P ( B | A ) P (C | AB ) (2.11)
Exercise 2.1
Let A1 , A2 , ......, An be n mutually exclusive and exhaustive events, and let B be any event with P ( B ) ≠ 0 . Show that
P ( A_j | B ) = P ( B | A_j ) P ( A_j ) / [ Σ_{i = 1}^{n} P ( B | A_i ) P ( A_i ) ]   (2.12)
Either Eq. 2.13 or 2.14 can be used to define the statistical independence
of two events. Note that if A and B are independent, then
P ( A B ) = P ( A ) P ( B ) , whereas if they are disjoint, then P ( A B ) = 0 . The notion of statistical independence can be extended to more than two events: a set of events is independent if and only if (iff) the probability of every intersection of k or fewer events equals the product of the probabilities of its constituents. Thus three events A , B , C are independent when
P ( A B ) = P ( A) P (B )
P ( AC ) = P ( A ) P (C )
P ( BC ) = P ( B ) P (C )
and P ( A BC ) = P ( A ) P ( B ) P (C )
We shall illustrate some of the concepts introduced above with the help of
two examples.
Example 2.1
Priya (P1) and Prasanna (P2), after seeing each other for some time (and
after a few tiffs) decide to get married, much against the wishes of the parents on
both the sides. They agree to meet at the office of registrar of marriages at 11:30
a.m. on the ensuing Friday (looks like they are not aware of Rahu Kalam or they
don’t care about it).
However, both are somewhat lacking in punctuality, and their arrival times are equally likely to be anywhere in the interval 11 to 12 hrs on that day. Also, the arrival of one person is independent of the other. Unfortunately, both are also very short tempered and will wait only 10 min before leaving in a huff, never to meet again.
c) Probability of marriage = ( Shaded area ) / ( Total area ) = 11 / 36
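As a numerical illustration (not part of the original example), the result 11/36 can be checked by simulation. The sketch below, in Python, draws the two arrival times independently and uniformly over a 60-minute window and estimates the probability that they differ by at most 10 minutes; the function name and trial count are chosen only for this illustration.

import random

def estimate_meeting_probability(trials=200_000, wait=10.0, window=60.0):
    """Estimate P(|arrival1 - arrival2| <= wait) for independent uniform arrivals."""
    meet = 0
    for _ in range(trials):
        t1 = random.uniform(0.0, window)   # first arrival, minutes past 11:00
        t2 = random.uniform(0.0, window)   # second arrival, independent of the first
        if abs(t1 - t2) <= wait:
            meet += 1
    return meet / trials

print("estimated:", estimate_meeting_probability())
print("exact    :", 11 / 36)

The estimate converges to the exact geometric-probability answer as the number of trials grows.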
Example 2.2:
Let two honest coins, marked 1 and 2, be tossed together. The four
possible outcomes are T1 T2 , T1 H2 , H1 T2 , H1 H2 . ( T1 indicates toss of coin 1
resulting in tails; similarly T2 etc.) We shall treat all these outcomes as equally likely; that is, the probability of occurrence of any of these four outcomes is 1/4 . (Treating each of these outcomes as an event, we find that these events are mutually exclusive and exhaustive.) Let the event A be 'not H1 H2 ' and B be the event 'match' (that is, both coins show the same face).
We know that P ( B | A ) = P ( A B ) / P ( A ) .
A B is the event 'not H1 H2 ' and 'match'; i.e., it represents the outcome T1 T2 .
Hence P ( A B ) = 1/4 . The event A comprises the outcomes T1 T2 , T1 H2 and H1 T2 ; therefore,
P ( A ) = 3/4
and P ( B | A ) = ( 1/4 ) / ( 3/4 ) = 1/3
Intuitively, the result P ( B | A ) = 1/3 is satisfying because, given 'not H1 H2 ', the toss would have resulted in any one of the three other outcomes, each of which can be treated as equally likely, with probability 1/3 . This implies that the outcome T1 T2 , given 'not H1 H2 ', has a probability of 1/3 .
As P ( B ) = 1/2 and P ( B | A ) = 1/3 , A and B are dependent events.
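The same numbers fall out of a direct enumeration of the four equally likely outcomes. The short Python sketch below is purely illustrative; the event definitions mirror the example above.

from itertools import product

outcomes = list(product("HT", repeat=2))          # ('H','H'), ('H','T'), ('T','H'), ('T','T')
A = [o for o in outcomes if o != ("H", "H")]      # event 'not H1 H2'
B = [o for o in outcomes if o[0] == o[1]]         # event 'match'
AB = [o for o in A if o in B]                     # intersection of A and B

P = lambda ev: len(ev) / len(outcomes)            # equally likely outcomes
print("P(A)   =", P(A))                           # 3/4
print("P(AB)  =", P(AB))                          # 1/4
print("P(B|A) =", P(AB) / P(A))                   # 1/3
print("P(B)   =", P(B))                           # 1/2, so A and B are dependent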
then the events on the sample space will become transformed into appropriate
segments of the real line. Then we can enquire into the probabilities such as
or
P ⎡⎣{s : X ( s ) = c}⎤⎦
These and other probabilities can be arrived at, provided we know the Distribution Function of X , denoted by FX ( ) , which is given by
FX ( x ) = P [ { s : X ( s ) ≤ x } ]
That is, FX ( x ) is the probability of the event, comprising all those sample points
which are transformed by X into real numbers less than or equal to x . (Note
that, for convenience, we use x as the argument of FX ( ) . But, it could be any
other symbol and we may use FX ( α ) , FX ( a1 ) etc.) Evidently, FX ( ) is a function
whose domain is the real line and whose range is the interval [0, 1] ).
Consider a sample space with four sample points, s1 to s4 , each sample point representing an event with the probabilities P ( s1 ) = 1/4 , P ( s2 ) = 1/8 , P ( s3 ) = 1/8 and P ( s4 ) = 1/2 . If X ( si ) = i − 1.5 , i = 1, 2, 3, 4 , then the distribution function FX ( x ) will be as shown in Fig. 2.3.
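For a discrete random variable such as this, FX ( x ) is a staircase: it jumps by P ( si ) at each value X ( si ) and is flat in between. The following Python sketch, given only as an illustration, evaluates the CDF of this four-point example at a few arguments.

def make_cdf(values_probs):
    """Return F(x) = P[X <= x] for a discrete RV given (value, probability) pairs."""
    pts = sorted(values_probs)
    def F(x):
        return sum(p for v, p in pts if v <= x)
    return F

# X(s_i) = i - 1.5 with P(s1) = 1/4, P(s2) = 1/8, P(s3) = 1/8, P(s4) = 1/2
F = make_cdf([(-0.5, 0.25), (0.5, 0.125), (1.5, 0.125), (2.5, 0.5)])
for x in (-1.0, -0.5, 0.0, 0.5, 1.5, 2.5, 3.0):
    print(f"F({x:+.1f}) = {F(x):.3f}")
# F is 0 for x < -0.5, 1/4 on [-0.5, 0.5), 3/8 on [0.5, 1.5), 1/2 on [1.5, 2.5), 1 beyond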
i) FX ( x ) ≥ 0, − ∞ < x < ∞
ii) FX ( − ∞ ) = 0
iii) FX ( ∞ ) = 1
v) If a > b , then FX ( a ) ≥ FX ( b )
The first three properties follow from the fact that FX ( ) represents the
{s : X ( s ) ≤ b} ∪ {s : b < X ( s ) ≤ a} = {s : X ( s ) ≤ a}
Referring to the Fig. 2.3, note that FX ( x ) = 0 for x < − 0.5 whereas
FX ( − 0.5 ) = 1/4 . In other words, there is a discontinuity of 1/4 at the point
x = − 0.5 . In general, there is a discontinuity in FX of magnitude P_a at a point x = a , if and only if P [ X = a ] = P_a > 0 .
be an event in the sample space with some assigned probability. (The term
“random variable” is somewhat misleading because an RV is a well defined
function from S into the real line.) However, every transformation from S into
the real line need not be a random variable. For example, let S consist of six sample points, s1 to s6 . The only events that have been identified on the sample space are A = { s1 , s2 } , B = { s3 , s4 , s5 } and C = { s6 } , with P ( A ) = 2/6 , P ( B ) = 1/2 and P ( C ) = 1/6 . We see that the probabilities for the various unions and intersections of A , B and C can be obtained.
Footnote 1: Let x = a . Consider, with ∆ > 0 , the set { s : a < X ( s ) ≤ a + ∆ } . We intuitively feel that as ∆ → 0 , the limit of this set is the null set, and hence FX ( a⁺ ) − FX ( a ) = 0 , where a⁺ = lim_{∆ → 0} ( a + ∆ ) . That is, FX ( x ) is continuous to the right.
Exercise 2.2
Let S be a sample space with six sample points, s1 to s6 . The events
identified on S are the same as above, namely, A = {s1 , s2 } ,
B = { s3 , s4 , s5 } and C = { s6 } with P ( A ) = 1/3 , P ( B ) = 1/2 and P ( C ) = 1/6 .
Let Y ( ) be the transformation,
⎧1, i = 1, 2
⎪
Y ( si ) = ⎨2 , i = 3, 4, 5
⎪3 , i = 6
⎩
Show that Y ( ) is a random variable by finding FY ( y ) . Sketch FY ( y ) .
continuous to the right.) The second case is taken care of by introducing the
impulse in the probability domain. That is, if there is a discontinuity in FX at
example, for the CDF shown in Fig. 2.3, the PDF will be,
fX ( x ) = (1/4) δ ( x + 1/2 ) + (1/8) δ ( x − 1/2 ) + (1/8) δ ( x − 3/2 ) + (1/2) δ ( x − 5/2 )   (2.18)
In Eq. 2.18, fX ( x ) has an impulse of weight 1/8 at x = 1/2 as P [ X = 1/2 ] = 1/8 . This impulse function cannot be taken as the limiting case of an even function (such as (1/ε) ga ( x/ε ) ) because
lim_{ε → 0} ∫_{1/2 − ε}^{1/2} fX ( x ) d x = lim_{ε → 0} ∫_{1/2 − ε}^{1/2} (1/8) δ ( x − 1/2 ) d x ≠ 1/16
However, lim_{ε → 0} ∫_{1/2 − ε}^{1/2} fX ( x ) d x = 1/8 . This ensures
FX ( x ) = 2/8 for − 1/2 ≤ x < 1/2 , and FX ( x ) = 3/8 for 1/2 ≤ x < 3/2 .
Such an impulse is referred to as the left-sided delta function.
As FX is non-decreasing and FX ( ∞ ) = 1, we have
i) fX ( x ) ≥ 0 (2.19a)
ii) ∫_{−∞}^{∞} fX ( x ) d x = 1   (2.19b)
Footnote 1: As the domain of the random variable X ( ) is known, it is convenient to denote the variable simply by X .
That is, FX ,Y ( x , y ) is the probability associated with the set of all those
sample points such that under X , their transformed values will be less than or
equal to x and at the same time, under Y , the transformed values will be less
than or equal to y . In other words, FX ,Y ( x1 , y1 ) is the probability associated with
the set of all sample points whose transformation does not fall outside the
shaded region in the two dimensional (Euclidean) space shown in Fig. 2.5.
Looking at the sample space S, let A be the set of all those sample
points s ∈ S such that X ( s ) ≤ x1 . Similarly, if B is comprised of all those
sample points s ∈ S such that Y ( s ) ≤ y1 ; then F ( x1 , y1 ) is the probability
associated with the event AB .
ii) FX ,Y ( − ∞ , y ) = FX ,Y ( x , − ∞ ) = 0
iii) FX ,Y ( ∞ , ∞ ) = 1
iv) FX ,Y ( ∞ , y ) = FY ( y )
v) FX ,Y ( x , ∞ ) = FX ( x )
FX ,Y ( x2 , y 2 ) ≥ FX ,Y ( x2 , y1 ) ≥ FX ,Y ( x1 , y1 )
The notion of joint CDF and joint PDF can be extended to the case of k random
variables, where k ≥ 3 .
FX ,Y ( x1 , ∞ ) = FX ( x1 ) .
That is, FX ( x1 ) = ∫_{−∞}^{x1} ∫_{−∞}^{∞} fX,Y ( α , β ) d β d α
fX ( x1 ) = d/d x1 { ∫_{−∞}^{x1} [ ∫_{−∞}^{∞} fX,Y ( α , β ) d β ] d α }   (2.22)
Eq. 2.22 involves the derivative of an integral. Hence,
fX ( x1 ) = d FX ( x ) / d x |_{x = x1} = ∫_{−∞}^{∞} fX,Y ( x1 , β ) d β
or fX ( x ) = ∫_{−∞}^{∞} fX,Y ( x , y ) d y   (2.23a)
Similarly, fY ( y ) = ∫_{−∞}^{∞} fX,Y ( x , y ) d x   (2.23b)
(In the study of several random variables, the statistics of any individual variable are called marginal. Hence it is common to refer to FX ( x ) as the marginal distribution function of X and to fX ( x ) as the marginal density function of X .)
f_{X|Y} ( x | y1 ) = fX,Y ( x , y1 ) / fY ( y1 )   (2.24)
and f_{Y|X} ( y | x ) = fX,Y ( x , y ) / fX ( x )   (2.25b)
or f X ,Y ( x , y ) = f X |Y ( x | y ) fY ( y ) (2.25c)
= fY | X ( y | x ) f X ( x ) (2.25d)
or f X |Y ( x | y ) = f X ( x ) (2.30a)
Similarly, fY | X ( y | x ) = fY ( y ) (2.30b)
Eq. 2.30(a) and 2.30(b) are alternate expressions for the statistical independence
between X and Y . We shall now give a few examples to illustrate the concepts
discussed in sec. 2.3.
Example 2.3
A random variable X has
⎧ 0 , x < 0
⎪
FX ( x ) = ⎨ K x 2 , 0 ≤ x ≤ 10
⎪100 K , x > 10
⎩
i) Find the constant K
ii) Evaluate P ( X ≤ 5 ) and P ( 5 < X ≤ 7 )
iii) What is f X ( x ) ?
i) FX ( ∞ ) = 100 K = 1 ⇒ K = 1/100 .
ii) P ( X ≤ 5 ) = FX ( 5 ) = ( 1/100 ) × 25 = 0.25
P ( 5 < X ≤ 7 ) = FX ( 7 ) − FX ( 5 ) = 0.24
fX ( x ) = d FX ( x ) / d x = 0.02 x for 0 ≤ x ≤ 10 , and 0 elsewhere.
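The value of K and the two probabilities are easy to confirm numerically. The Python sketch below is only an illustration; the helper names and the midpoint-rule integrator are not part of the original example.

def f_X(x):
    """PDF of Example 2.3 with K = 1/100, i.e. f_X(x) = 0.02 x on [0, 10]."""
    return 0.02 * x if 0.0 <= x <= 10.0 else 0.0

def integrate(f, a, b, n=20_000):
    """Simple midpoint-rule numerical integration."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

print("total area    :", round(integrate(f_X, 0, 10), 4))   # should be 1
print("P(X <= 5)     :", round(integrate(f_X, 0, 5), 4))    # 0.25
print("P(5 < X <= 7) :", round(integrate(f_X, 5, 7), 4))    # 0.24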
Example 2.4
Let fX ( x ) = a e^{− b | x |} , − ∞ < x < ∞ , where a and b are positive constants. For fX ( x ) to represent a legitimate PDF, we require
∫_{−∞}^{∞} a e^{− b | x |} d x = 2 ∫_{0}^{∞} a e^{− b x} d x = 1 .
That is, ∫_{0}^{∞} a e^{− b x} d x = 1/2 ; hence b = 2 a .
For x > 0 ,
FX ( x ) = 1 − ∫_{x}^{∞} fX ( α ) d α
Therefore, FX ( x ) = 1 − (1/2) e^{− b x} , x > 0 .
We can now write the complete expression for the CDF as
FX ( x ) = (1/2) e^{b x} for x < 0 , and FX ( x ) = 1 − (1/2) e^{− b x} for x ≥ 0 .
iii) FX ( 2 ) − FX ( 1 ) = (1/2) ( e^{− b} − e^{− 2 b} )
Example 2.5
Let fX,Y ( x , y ) = 1/2 for 0 ≤ x ≤ y , 0 ≤ y ≤ 2 , and 0 otherwise.
(a) f_{Y|X} ( y | x ) = fX,Y ( x , y ) / fX ( x )
(ii) f_{Y|X} ( y | 1.5 ) = ( 1/2 ) / ( 1/4 ) = 2 for 1.5 ≤ y ≤ 2 , and 0 otherwise.
b) The dependence between the random variables X and Y is evident from the statement of the problem: given a value X = x1 , Y must be greater than or equal to x1 for the joint PDF to be non-zero. Also, if X and Y were independent we would have f_{Y|X} ( y | x ) = fY ( y ) , which is clearly not the case here.
Exercise 2.3
For the two random variables X and Y , the following density functions
have been given. (Fig. 2.6)
b) Show that
fY ( y ) = y / 100 for 0 ≤ y ≤ 10 , and fY ( y ) = 1/5 − y/100 for 10 < y ≤ 20 .
Y ( s1 ) = g ( X ( s1 ) )
Our interest is to obtain fY ( y ) . This can be obtained with the help of the following
Proof: Consider the transformation shown in Fig. 2.7. We see that the equation
y 1 = g ( x ) has three roots namely, x1 , x2 and x3 .
need to find the set of values x such that y1 < g ( x ) ≤ y1 + d y and the
probability that X is in this set. As we see from the figure, this set consists of the
following intervals:
x1 < x ≤ x1 + d x1 , x2 + d x2 < x ≤ x2 , x3 < x ≤ x3 + d x3
P [ x2 + d x2 < x ≤ x2 ] = fX ( x2 ) d x2 ,  d x2 = d y / g ' ( x2 )
P [ x3 < X ≤ x3 + d x3 ] = fX ( x3 ) d x3 ,  d x3 = d y / g ' ( x3 )
We conclude that
fY ( y ) d y = [ fX ( x1 ) / | g ' ( x1 ) | ] d y + [ fX ( x2 ) / | g ' ( x2 ) | ] d y + [ fX ( x3 ) / | g ' ( x3 ) | ] d y   (2.32)
FX ( x1 ) − FX ( x0 ) .
Example 2.6
Y = g ( X ) = X + a , where a is a constant. Let us find fY ( y ) .
fY ( y ) = f X ( y − a )
Let fX ( x ) = 1 − | x | for | x | ≤ 1 , and 0 elsewhere,
and a = − 1 . Then fY ( y ) = 1 − | y + 1 | , and the interval ( − 1 , 1 ) over which fX ( x ) is non-zero maps into ( − 2 , 0 ) .
Hence fY ( y ) = 1 − | y + 1 | for − 2 ≤ y ≤ 0 , and 0 elsewhere.
Example 2.7
Let Y = b X , where b is a constant. Let us find fY ( y ) .
Solving for X , we have X = ( 1/b ) Y . Again, for a given y there is a unique x . As g ' ( x ) = b , we have
fY ( y ) = ( 1 / | b | ) fX ( y / b ) .
Let fX ( x ) = 1 − x/2 for 0 ≤ x ≤ 2 , and 0 otherwise, and let b = − 2 . Then
fY ( y ) = (1/2) [ 1 + y/4 ] for − 4 ≤ y ≤ 0 , and 0 otherwise.
Exercise 2.4
Let Y = a X + b , where a and b are constants. Show that
fY ( y ) = ( 1 / | a | ) fX ( ( y − b ) / a ) .
b = 1.
Example 2.8
Let Y = a X² , with a > 0 . For y < 0 , the equation y = a x² has no real solution; hence fY ( y ) = 0 for y < 0 . If y ≥ 0 , it has two solutions, x1 = √( y / a ) and x2 = − √( y / a ) , and Eq. 2.31 yields
fY ( y ) = [ 1 / ( 2 a √( y / a ) ) ] [ fX ( √( y / a ) ) + fX ( − √( y / a ) ) ] for y ≥ 0 , and 0 otherwise.
Let a = 1 , and fX ( x ) = ( 1 / √(2 π) ) exp ( − x² / 2 ) , − ∞ < x < ∞ .
Then fY ( y ) = [ 1 / ( 2 √y √(2 π) ) ] [ exp ( − y/2 ) + exp ( − y/2 ) ]
 = ( 1 / √( 2 π y ) ) exp ( − y/2 ) for y ≥ 0 , and 0 otherwise.
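The density just derived can be checked by simulation: square a large number of N ( 0, 1 ) samples and compare the empirical frequencies with ( 1 / √( 2 π y ) ) exp ( − y/2 ) . The short sketch below is illustrative only and assumes NumPy is available.

import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(200_000) ** 2          # samples of Y = X^2 with X ~ N(0,1)

for a, b in [(0.1, 0.2), (0.5, 0.6), (1.0, 1.1)]:
    empirical = np.mean((y >= a) & (y < b))    # empirical P(a <= Y < b)
    mid = 0.5 * (a + b)
    predicted = np.exp(-mid / 2) / np.sqrt(2 * np.pi * mid) * (b - a)
    print(f"P({a} <= Y < {b}): empirical {empirical:.4f}, predicted {predicted:.4f}")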
Example 2.9
Consider the half wave rectifier transformation given by
⎧0 , X ≤ 0
Y =⎨
⎩X , X > 0
a) Let us find the general expression for fY ( y )
b) Let fX ( x ) = 1/2 for − 1/2 < x < 3/2 , and 0 otherwise.
a) fY ( y ) = fX ( y ) w ( y ) + FX ( 0 ) δ ( y )
where w ( y ) = 1 for y ≥ 0 , and 0 otherwise.
b) Specifically, let fX ( x ) = 1/2 for − 1/2 < x < 3/2 , and 0 elsewhere.
Then fY ( y ) = (1/4) δ ( y ) at y = 0 ,  fY ( y ) = 1/2 for 0 < y ≤ 3/2 , and 0 otherwise.
f X ( x ) and fY ( y ) are sketched in Fig. 2.10.
Example 2.10
⎧− 1, X < 0
Let Y = ⎨
⎩+ 1, X ≥ 0
a) Let us find the general expression for fY ( y ) .
a) In this case, Y assumes only two values, namely ±1. Hence the PDF of Y
has only two impulses. Let us write fY ( y ) as
fY ( y ) = P1 δ ( y − 1 ) + P−1 δ ( y + 1 ) , where P−1 = P [ X < 0 ] and P1 = P [ X ≥ 0 ] .
b) Taking fX ( x ) of Example 2.9, we have P1 = 3/4 and P−1 = 1/4 . Fig. 2.11 has the sketches of fX ( x ) and fY ( y ) .
Note that this transformation has converted a continuous random variable X into
a discrete random variable Y .
Exercise 2.5
Let a random variable X with the PDF shown in Fig. 2.12(a) be the
input to a device with the input-output characteristic shown in Fig. 2.12(b).
Compute and sketch fY ( y ) .
Fig. 2.12: (a) Input PDF for the transformation of exercise 2.5
(b) Input-output transformation
Exercise 2.6
The random variable X of exercise 2.5 is applied as input to the
X − Y transformation shown in Fig. 2.13. Compute and sketch fY ( y ) .
Example 2.11
Let Y = X 2 .
a) If fX ( x ) = (1/6) Σ_{i = 1}^{6} δ ( x − i ) , find fY ( y ) .
b) If fX ( x ) = (1/6) Σ_{i = −2}^{3} δ ( x − i ) , find fY ( y ) .
a) If X takes the values ( 1, 2, ......., 6 ) each with probability 1/6 , then Y takes the values 1², 2², ......., 6² each with probability 1/6 . That is,
fY ( y ) = (1/6) Σ_{i = 1}^{6} δ ( y − i² ) .
b) If, however, X takes the values − 2, − 1, 0, 1, 2, 3 each with probability 1/6 , then Y takes the values 0, 1, 4, 9 with probabilities 1/6 , 1/3 , 1/3 , 1/6 respectively. That is,
fY ( y ) = (1/6) [ δ ( y ) + δ ( y − 9 ) ] + (1/3) [ δ ( y − 1 ) + δ ( y − 4 ) ]
J ( z , w / x , y ) , where
J ( z , w / x , y ) = det [ ∂z/∂x , ∂z/∂y ; ∂w/∂x , ∂w/∂y ] = ( ∂z/∂x ) ( ∂w/∂y ) − ( ∂z/∂y ) ( ∂w/∂x )
That is, the Jacobian is the determinant of the appropriate partial derivatives. We
shall now state the theorem which relates fZ ,W ( z , w ) and f X ,Y ( x , y ) .
Suppose that, for the given ( z1 , w1 ) , the pair of equations
g ( x , y ) = z1   (2.33a)
h ( x , y ) = w1   (2.33b)
has the single solution ( x1 , y1 ) . Then
fZ,W ( z1 , w1 ) = fX,Y ( x1 , y1 ) / | J ( z , w / x1 , y1 ) |   (2.34)
Proof of this theorem is given in appendix A2.1. For a more general version of
this theorem, refer [1].
Example 2.12
X and Y are two independent RVs with the PDFs
fX ( x ) = 1/2 ,  | x | ≤ 1
fY ( y ) = 1/2 ,  | y | ≤ 1
If Z = X + Y and W = X − Y , let us find (a) fZ ,W ( z , w ) and (b) fZ ( z ) .
a) From the given transformations, we obtain x = (1/2) ( z + w ) and y = (1/2) ( z − w ) . We see that the mapping is one-to-one. Fig. 2.14(a) depicts the (product) space A on which fX,Y ( x , y ) is non-zero.
Hence fZ,W ( z , w ) = ( 1/4 ) / 2 = 1/8 for ( z , w ) ∈ B , and 0 otherwise.
b) fZ ( z ) = ∫_{−∞}^{∞} fZ,W ( z , w ) d w
From Fig. 2.14(b), we can see that, for a given z ( 0 ≤ z ≤ 2 ), w can take values only in the range − ( 2 − z ) to ( 2 − z ) . Hence
fZ ( z ) = ∫_{− ( 2 − z )}^{( 2 − z )} (1/8) d w = (1/4) ( 2 − z ) ,  0 ≤ z ≤ 2
Hence fZ ( z ) = (1/4) ( 2 − | z | ) , | z | ≤ 2 .
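A quick simulation confirms this triangular density for the sum of two independent uniform variables. The sketch below is illustrative only and assumes NumPy; the sample size and test points are arbitrary.

import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 500_000)
y = rng.uniform(-1.0, 1.0, 500_000)
z = x + y                                  # Z = X + Y, support [-2, 2]

for z0 in (-1.5, -0.5, 0.0, 0.5, 1.5):
    dz = 0.05
    empirical = np.mean(np.abs(z - z0) < dz) / (2 * dz)   # histogram-style density estimate
    predicted = (2 - abs(z0)) / 4
    print(f"f_Z({z0:+.1f}): empirical {empirical:.3f}, predicted {predicted:.3f}")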
Example 2.13
Let R = √( X² + Y² ) and Φ = arc tan ( Y / X ) , where we assume R ≥ 0 and − π < Φ < π . It is given that
fX,Y ( x , y ) = ( 1 / ( 2 π ) ) exp [ − ( x² + y² ) / 2 ] , − ∞ < x , y < ∞ .
Let us find fR,Φ ( r , ϕ ) .
J = det [ ∂r/∂x , ∂r/∂y ; ∂ϕ/∂x , ∂ϕ/∂y ] = det [ cos ϕ , sin ϕ ; − ( sin ϕ ) / r , ( cos ϕ ) / r ] = 1/r
Hence, fR,Φ ( r , ϕ ) = ( r / ( 2 π ) ) exp ( − r² / 2 ) for 0 ≤ r < ∞ , − π < ϕ < π , and 0 otherwise.
As fR,Φ ( r , ϕ ) factors into a function of r alone and a function of ϕ alone, R and Φ are independent variables; the marginal PDFs fR ( r ) (a Rayleigh density) and fΦ ( ϕ ) (uniform over ( − π , π )) are easily obtained.
W = Y . Then, X = Z − W , and Y = W .
As J = 1 ,
fZ ,W ( z , w ) = f X ,Y ( z − w , w )
and fZ ( z ) = ∫_{−∞}^{∞} fX,Y ( z − w , w ) d w   (2.35)
−∞
Example 2.14
Let X and Y be two independent random variables, with
fX ( x ) = 1/2 for − 1 ≤ x ≤ 1 , and 0 otherwise
fY ( y ) = 1/3 for − 2 ≤ y ≤ 1 , and 0 otherwise
If Z = X + Y , let us find P [ Z ≤ − 2] .
P [ Z ≤ − 2 ] is the shaded area = 1/12 .
Example 2.15
Let Z = X / Y ; let us find an expression for fZ ( z ) .
Let fX,Y ( x , y ) = ( 1 + x y ) / 4 for | x | ≤ 1 , | y | ≤ 1 , and 0 elsewhere.
Then fZ ( z ) = ∫_{−∞}^{∞} [ ( 1 + z w² ) / 4 ] | w | d w
The integral is evaluated over the appropriate ranges of w , which depend on z .
i) Let | z | < 1 ; then
fZ ( z ) = ∫_{−1}^{1} [ ( 1 + z w² ) / 4 ] | w | d w = 2 ∫_{0}^{1} [ ( 1 + z w² ) / 4 ] w d w = (1/4) ( 1 + z/2 )
ii) For z > 1 , we have
fZ ( z ) = 2 ∫_{0}^{1/z} [ ( 1 + z w² ) / 4 ] w d w = (1/4) ( 1/z² + 1/( 2 z³ ) )
iii) For z < − 1 , we have
fZ ( z ) = 2 ∫_{0}^{− 1/z} [ ( 1 + z w² ) / 4 ] w d w = (1/4) ( 1/z² + 1/( 2 z³ ) )
Hence fZ ( z ) = (1/4) ( 1 + z/2 ) for | z | ≤ 1 , and fZ ( z ) = (1/4) ( 1/z² + 1/( 2 z³ ) ) for | z | > 1 .
Exercise 2.7
Let Z = X Y and W = Y .
a) Show that fZ ( z ) = ∫_{−∞}^{∞} ( 1 / | w | ) fX,Y ( z / w , w ) d w
and fY ( y ) = y e^{− y² / 2} for y ≥ 0 , and 0 otherwise.
Show that fZ ( z ) = ( 1 / √(2 π) ) e^{− z² / 2} , − ∞ < z < ∞ .
Example 2.16
The input to a noisy channel is a binary random variable with
P [ X = 0 ] = P [ X = 1 ] = 1/2 . The output of the channel is given by Z = X + Y
where Y is the channel noise with fY ( y ) = ( 1 / √(2 π) ) e^{− y² / 2} , − ∞ < y < ∞ . Find fZ ( z ) .
Let us first compute the distribution function of Z from which the density
function can be derived.
P ( Z ≤ z ) = P [ Z ≤ z | X = 0] P [ X = 0 ] + P [ Z ≤ z | X = 1] P [ X = 1]
As Z = X + Y , we have
P [ Z ≤ z | X = 0 ] = FY ( z )
Similarly P [ Z ≤ z | X = 1] = P ⎡⎣Y ≤ ( z − 1) ⎤⎦ = FY ( z − 1)
Hence FZ ( z ) = (1/2) FY ( z ) + (1/2) FY ( z − 1 ) . As fZ ( z ) = d FZ ( z ) / d z , we have
fZ ( z ) = (1/2) [ ( 1 / √(2 π) ) exp ( − z² / 2 ) + ( 1 / √(2 π) ) exp ( − ( z − 1 )² / 2 ) ]
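The output density is thus an equal-weight mixture of two unit-variance Gaussians centred at 0 and 1. The short Python sketch below, given only as an illustration, evaluates fZ ( z ) and checks that it integrates to one.

import math

def f_Z(z):
    """Equal-weight mixture of N(0,1) and N(1,1) densities (Example 2.16)."""
    g = lambda u: math.exp(-u * u / 2) / math.sqrt(2 * math.pi)
    return 0.5 * (g(z) + g(z - 1))

# crude check that the density integrates to about 1 over (-8, 9)
total = sum(f_Z(-8 + k * 0.01) * 0.01 for k in range(1700))
print("area under f_Z:", round(total, 4))
print("f_Z(0.5)      =", round(f_Z(0.5), 4))   # value midway between the two means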
Example 2.17
Let Z = X + Y . Obtain fZ ( z ) , given f X ,Y ( x , y ) .
P [ Z ≤ z ] = P [ X + Y ≤ z ] = P [ Y ≤ z − X ]
This probability is the probability of ( X , Y ) lying in the shaded area shown in Fig. 2.17.
That is,
FZ ( z ) = ∫_{−∞}^{∞} [ ∫_{−∞}^{z − x} fX,Y ( x , y ) d y ] d x
fZ ( z ) = ∂/∂z [ ∫_{−∞}^{∞} d x ∫_{−∞}^{z − x} fX,Y ( x , y ) d y ]
 = ∫_{−∞}^{∞} d x [ ∂/∂z ∫_{−∞}^{z − x} fX,Y ( x , y ) d y ]
 = ∫_{−∞}^{∞} fX,Y ( x , z − x ) d x   (2.37a)
fZ ( z ) = ∫_{−∞}^{∞} fX,Y ( z − y , y ) d y   (2.37b)
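When X and Y are independent, fX,Y ( x , z − x ) = fX ( x ) fY ( z − x ) and Eq. 2.37(a) becomes the convolution of the two marginal densities. The following sketch (illustrative only, assuming NumPy) evaluates this convolution on a grid for the densities of Example 2.14.

import numpy as np

dx = 0.01
x = np.arange(-5, 5, dx)

f_X = np.where(np.abs(x) <= 1, 0.5, 0.0)          # uniform on (-1, 1)
f_Y = np.where((x >= -2) & (x <= 1), 1 / 3, 0.0)  # uniform on (-2, 1), as in Example 2.14

f_Z = np.convolve(f_X, f_Y) * dx                  # discrete approximation of f_X * f_Y
z = 2 * x[0] + dx * np.arange(len(f_Z))           # grid on which f_Z is defined

print("area under f_Z:", round(np.sum(f_Z) * dx, 3))          # ~1
print("P[Z <= -2]    :", round(np.sum(f_Z[z <= -2]) * dx, 3)) # ~1/12, cf. Example 2.14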
Remarks: The terminology expected value or expectation has its origin in games
of chance. This can be illustrated as follows: three small similar discs, numbered 1, 2 and 2 respectively, are placed in a bowl and mixed. A player is to be
blindfolded and is to draw a disc from the bowl. If he draws the disc numbered 1,
he will receive nine dollars; if he draws either disc numbered 2, he will receive 3
dollars. It seems reasonable to assume that the player has a '1/3 claim' on the 9
dollars and '2/3 claim' on three dollars. His total claim is 9(1/3) + 3(2/3), or five
dollars. If we take X to be (discrete) random variable with the PDF
fX ( x ) = (1/3) δ ( x − 1 ) + (2/3) δ ( x − 2 ) and g ( X ) = 15 − 6 X , then
E [ g ( X ) ] = ∫_{−∞}^{∞} ( 15 − 6 x ) fX ( x ) d x = 5
−∞
The most widely used moments are the first moment ( n = 1, which results in the
mean value of Eq. 2.38) and the second moment ( n = 2 , resulting in the mean
square value of X ).
E [ X² ] = ∫_{−∞}^{∞} x² fX ( x ) d x   (2.41)
E [ ( X − m_X )ⁿ ] = ∫_{−∞}^{∞} ( x − m_X )ⁿ fX ( x ) d x   (2.42)
Consider a function of two random variables, g ( X , Y ) . Then,
E [ g ( X , Y ) ] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g ( x , y ) fX,Y ( x , y ) d x d y   (2.43)
For example, with Z = g ( X , Y ) = α X + β Y ,
E [ Z ] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} ( α x ) fX,Y ( x , y ) d x d y + ∫_{−∞}^{∞} ∫_{−∞}^{∞} ( β y ) fX,Y ( x , y ) d x d y
Integrating out the variable y in the first term and the variable x in the second term, we have
E [ Z ] = α ∫_{−∞}^{∞} x fX ( x ) d x + β ∫_{−∞}^{∞} y fY ( y ) d y = α E [ X ] + β E [ Y ]
2.5.1 Variance
Coming back to the central moments, we have the first central moment
being always zero because,
E [ ( X − m_X ) ] = ∫_{−∞}^{∞} ( x − m_X ) fX ( x ) d x = m_X − m_X = 0
E [ ( X − m_X )² ] = E [ X² − 2 m_X X + m_X² ]
From the linearity property of expectation,
E [ X² − 2 m_X X + m_X² ] = E [ X² ] − 2 m_X E [ X ] + m_X²
 = E [ X² ] − 2 m_X² + m_X²
 = E [ X² ] − m_X² = E [ X² ] − ( E [ X ] )²
The second central moment of a random variable is called the variance and its
(positive) square root is called the standard deviation. The symbol σ 2 is
generally used to denote the variance. (If necessary, we use a subscript on σ 2 )
Let g ( X ) be a non-negative function of the random variable X and let c be any positive constant. Then
P [ g ( X ) ≥ c ] ≤ E [ g ( X ) ] / c   (2.44)
To establish Eq. 2.44, let A = { x : g ( x ) ≥ c } . If x ∈ A , then g ( x ) ≥ c , hence
E [ g ( X ) ] ≥ ∫_A g ( x ) fX ( x ) d x ≥ c ∫_A fX ( x ) d x
But ∫_A fX ( x ) d x = P [ x ∈ A ] = P [ g ( X ) ≥ c ] , which establishes Eq. 2.44.
To see how innocuous (or weak, perhaps) the inequality 2.44 is, let g ( X ) be the height of a randomly chosen person, with an average height of 1.6 m, and let c = 16 m. Then Eq. 2.44 states that the probability of choosing a person over 16 m tall is at most 1/10 ! (In a population of 1 billion, at most 100 million would be as tall as a full grown Palmyra tree!)
Taking g ( X ) = ( X − m_X )² and c = k² σ_X² in Eq. 2.44, we obtain
P [ ( X − m_X )² ≥ k² σ_X² ] ≤ σ_X² / ( k² σ_X² ) = 1/k²
In other words,
P [ | X − m_X | ≥ k σ_X ] ≤ 1/k²
which is the desired result (the Chebyshev inequality). Naturally, we would take the positive number k to be greater than one to have a meaningful result. The Chebyshev inequality can be interpreted as: the probability of observing any RV outside ± k standard deviations of its mean value is no larger than 1/k² . With k = 2 for example, the probability of | X − m_X | ≥ 2 σ_X does not exceed 1/4 or 25%. By the same token, we expect X to occur within the range ( m_X ± 2 σ_X ) for more than 75% of the observations. That is, the smaller the standard deviation, the smaller is the width of the interval around m_X where the required probability is concentrated. The Chebyshev inequality requires only the mean and the variance of X , not the full PDF.
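The looseness of the bound is easy to see experimentally. The following sketch (illustrative only, assuming NumPy) compares the actual tail probability with 1/k² for a uniform random variable.

import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, 1_000_000)   # m_X = 0, sigma_X^2 = 1/3
m, s = x.mean(), x.std()

for k in (1.5, 2.0, 3.0):
    actual = np.mean(np.abs(x - m) >= k * s)
    print(f"k = {k}: P[|X - m| >= k*sigma] = {actual:.4f}, Chebyshev bound = {1 / k**2:.4f}")

For k ≥ √3 the actual probability is zero for this uniform variable, while the bound is still 1/k², illustrating how conservative the inequality can be.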
Note that the variance need not exist for every PDF. For example, if
fX ( x ) = α / [ π ( α² + x² ) ] , − ∞ < x < ∞ , with α > 0 ,
then E [ X ] = 0 but E [ X² ] is not finite. (This is called Cauchy's PDF.)
2.5.2 Covariance
An important joint expectation is the quantity called covariance, which is obtained by letting g ( X , Y ) = ( X − m_X ) ( Y − m_Y ) in Eq. 2.43. The covariance, normalized by σ_X σ_Y , is the correlation coefficient:
ρ_XY = ( E [ X Y ] − m_X m_Y ) / ( σ_X σ_Y )   (2.47)
The correlation coefficient is a measure of dependency between the variables.
Suppose X and Y are independent. Then,
E [ X Y ] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x y fX,Y ( x , y ) d x d y
 = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x y fX ( x ) fY ( y ) d x d y
 = ∫_{−∞}^{∞} x fX ( x ) d x ∫_{−∞}^{∞} y fY ( y ) d y = m_X m_Y
many trials of the experiment with the outcome X = x1 , the sum of the numbers x1 ( y − m_Y ) would be very small, and the quantity ( sum ) / ( number of trials ) tends to zero as the number of trials keeps increasing.
On the other hand, let X and Y be dependent. Suppose for example, the
outcome y is conditioned on the outcome x in such a manner that there is a
is, for X and Y independent, we have ρ_XY = 0 , and for the totally dependent case, | ρ_XY | = 1 .
fX ( x ) = 1/α for − α/2 < x < α/2 , and 0 otherwise,
and Y = X² . Then,
E [ X Y ] = E [ X³ ] = ∫_{−∞}^{∞} x³ fX ( x ) d x = ( 1/α ) ∫_{− α/2}^{α/2} x³ d x = 0
E [ Y ] = k1 m1 + k2 m2
E [ Y² ] = E [ k1² X1² + k2² X2² + 2 k1 k2 X1 X2 ]
Then σ_Z² = σ_W² = σ1² + σ2² . That is, the sum as well as the difference of the two random variables has the same variance.
If Y = Σ_{i = 1}^{n} k_i X_i , then
σ_Y² = Σ_{i = 1}^{n} k_i² σ_i² + 2 Σ Σ_{i < j} k_i k_j ρ_ij σ_i σ_j   (2.49)
where the meaning of various symbols on the RHS of Eq. 2.49 is quite obvious.
We shall now give a few examples based on the theory covered so far in
this section.
Example 2.18
Let a random variable X have the CDF
FX ( x ) = 0 for x < 0 ;  x/8 for 0 ≤ x ≤ 2 ;  x²/16 for 2 ≤ x ≤ 4 ;  1 for 4 ≤ x .
Differentiating, fX ( x ) = 1/8 for 0 < x < 2 and fX ( x ) = x/8 for 2 < x < 4 . Therefore,
E [ X ] = ∫_{0}^{2} (1/8) x d x + ∫_{2}^{4} (1/8) x² d x = 1/4 + 7/3 = 31/12
Example 2.19
Let Y = cos π X , where
fX ( x ) = 1 for − 1/2 < x < 1/2 , and 0 otherwise.
E [ Y ] = ∫_{− 1/2}^{1/2} cos ( π x ) d x = 2/π
E [ Y² ] = ∫_{− 1/2}^{1/2} cos² ( π x ) d x = 1/2 = 0.5
Hence σ_Y² = 1/2 − 4/π² ≈ 0.095
Example 2.20
Let X and Y have the joint PDF
⎧ x + y , 0 < x < 1, 0 < y < 1
f X ,Y ( x , y ) = ⎨
⎩0 , elsewhere
E [ X² Y ] = ∫_{0}^{1} ∫_{0}^{1} x² y ( x + y ) d x d y = 17/72
It can be shown that
E [ X ] = E [ Y ] = 7/12 ,  σ_X² = σ_Y² = 11/144  and  E [ X Y ] = 48/144
Hence ρ_XY = − 1/11 .
Similarly, we can define the conditional variance etc. We shall illustrate the
calculation of conditional mean with the help of an example.
Example 2.21
Let the joint PDF of the random variables X and Y be
fX,Y ( x , y ) = 1/x for 0 < x < 1 , 0 < y < x , and 0 outside.
Let us compute E [ X | Y ] .
To find E [ X | Y ] , we require the conditional PDF f_{X|Y} ( x | y ) = fX,Y ( x , y ) / fY ( y ) .
fY ( y ) = ∫_{y}^{1} ( 1/x ) d x = − ln y ,  0 < y < 1
f_{X|Y} ( x | y ) = ( 1/x ) / ( − ln y ) = − 1 / ( x ln y ) ,  y < x < 1
Hence E [ X | Y = y ] = ∫_{y}^{1} x [ − 1 / ( x ln y ) ] d x = ( y − 1 ) / ln y
Note that E [ X | Y = y ] is a function of y .
Let p be the probability of occurrence of A on each trial, and let the trials be independent. Let X denote the random variable 'number of occurrences of A in n trials'. X can be equal to 0, 1, 2, ........., n . If we can compute P [ X = k ] , k = 0, 1, ........., n , then we can write fX ( x ) .
sample points such as ABAAB , A A A B B etc. will map into real number 3 as
shown in the Fig. 2.18. (Each sample point is actually an element of the five
dimensional Cartesian product space).
= p ( 1 − p ) p² ( 1 − p ) = p³ ( 1 − p )²
There are 5! / ( 3! 2! ) = 10 sample points for which X ( s ) = 3 .
In other words, for n = 5 , k = 3 ,
P [ X = 3 ] = [ 5! / ( 3! 2! ) ] p³ ( 1 − p )²
Generalizing this to arbitrary n and k , we have the binomial density, given by
fX ( x ) = Σ_{i = 0}^{n} P_i δ ( x − i )   (2.52)
where P_i = [ n! / ( i! ( n − i )! ) ] p^i ( 1 − p )^{n − i}
As can be seen, fX ( x ) ≥ 0 and
∫_{−∞}^{∞} fX ( x ) d x = Σ_{i = 0}^{n} P_i = Σ_{i = 0}^{n} [ n! / ( i! ( n − i )! ) ] p^i ( 1 − p )^{n − i} = [ ( 1 − p ) + p ]^n = 1
(It can be shown that the mean and the variance of X are n p and n p ( 1 − p ) respectively; though the formulae for the mean and the variance of a binomial PDF are simple, the algebra to derive them is laborious.)
We write X is b ( n , p ) to indicate that X has a binomial PDF with parameters n and p .
Example 2.22
A digital communication system transmits binary digits over a noisy
channel in blocks of 16 digits. Assume that the probability of a binary digit being
in error is 0.01 and that errors in various digit positions within a block are
statistically independent.
i) Find the expected number of errors per block
ii) Find the probability that the number of errors per block is greater than or
equal to 3.
Let X be the random variable representing the number of errors per block.
Then X is b (16, 0.01) .
i) E [ X ] = n p = 16 × 0.01 = 0.16;
ii) P ( X ≥ 3 ) = 1 − P [ X ≤ 2 ]
 = 1 − Σ_{i = 0}^{2} [ 16! / ( i! ( 16 − i )! ) ] ( 0.01 )^i ( 0.99 )^{16 − i}
 ≈ 0.0005
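The block-error probability can be computed exactly from the binomial PMF. The short sketch below uses only the Python standard library and confirms that P [ X ≥ 3 ] is of the order of 5 × 10⁻⁴.

from math import comb

n, p = 16, 0.01
pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]

print("E[X]      =", n * p)              # 0.16
print("P[X >= 3] =", 1 - sum(pmf[:3]))   # about 4.5e-4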
Exercise 2.8
tight.
ii) Poisson:
A random variable X which takes on only integer values is Poisson
distributed, if
fX ( x ) = Σ_{m = 0}^{∞} [ λ^m e^{− λ} / m! ] δ ( x − m )   (2.53)
Since e^λ = Σ_{m = 0}^{∞} λ^m / m! , we have
d ( e^λ ) / d λ = e^λ = Σ_{m = 0}^{∞} m λ^{m − 1} / m! = ( 1/λ ) Σ_{m = 1}^{∞} m λ^m / m!
E [ X ] = Σ_{m = 1}^{∞} m λ^m e^{− λ} / m! = λ e^λ e^{− λ} = λ
Differentiating the series again, we obtain,
E ⎡⎣ X 2 ⎤⎦ = λ 2 + λ . Hence σ2X = λ .
i) Uniform:
fX ( x ) = 1 / ( b − a ) for a ≤ x ≤ b , and 0 elsewhere   (2.54)
E [ X ] = ( a + b ) / 2 and σ_X² = ( b − a )² / 12
Note that the variance of the uniform PDF depends only on the width of the interval ( b − a ) . Therefore, whether X is uniform in ( − 1, 1 ) or ( 2, 4 ) , it has the same variance, namely 1/3 .
ii) Rayleigh:
An RV X is said to be Rayleigh distributed if,
fX ( x ) = ( x / b ) exp ( − x² / ( 2 b ) ) for x ≥ 0 , and 0 elsewhere   (2.55)
Exercise 2.9
a) Let fX ( x ) be as given in Eq. 2.55. Show that ∫_{0}^{∞} fX ( x ) d x = 1 . Hint: make the change of variable x² = z ; then x d x = d z / 2 .
b) Show that if X is Rayleigh distributed, then E [ X ] = √( π b / 2 ) and E [ X² ] = 2 b .
iii) Gaussian
By far the most widely used PDF, in the context of communication theory
is the Gaussian (also called normal) density, specified by
fX ( x ) = ( 1 / ( √(2 π) σ_X ) ) exp [ − ( x − m_X )² / ( 2 σ_X² ) ] , − ∞ < x < ∞   (2.56)
where m_X is the mean value and σ_X² the variance. That is, the Gaussian PDF is completely specified by the two parameters m_X and σ_X² . We use the symbol N ( m_X , σ_X² ) to denote the Gaussian density with mean m_X and variance σ_X² . As can be seen from Fig. 2.21, the Gaussian PDF is symmetrical with respect to m_X .
Footnote 1: In this notation, N ( 0, 1 ) denotes the Gaussian PDF with zero mean and unit variance. Note that if X is N ( m_X , σ_X² ) , then Y = ( X − m_X ) / σ_X is N ( 0, 1 ) .
Hence FX ( m_X ) = ∫_{−∞}^{m_X} fX ( x ) d x = 0.5
Consider P [ X ≥ a ] . We have,
P [ X ≥ a ] = ∫_{a}^{∞} ( 1 / ( √(2 π) σ_X ) ) exp [ − ( x − m_X )² / ( 2 σ_X² ) ] d x
This integral cannot be evaluated in closed form. By making a change of variable
z = ( x − m_X ) / σ_X , we have
P [ X ≥ a ] = ∫_{( a − m_X ) / σ_X}^{∞} ( 1 / √(2 π) ) e^{− z² / 2} d z = Q ( ( a − m_X ) / σ_X )
where Q ( y ) = ∫_{y}^{∞} ( 1 / √(2 π) ) exp ( − x² / 2 ) d x   (2.57)
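Since Q ( y ) has no closed form, it is evaluated numerically, most conveniently through the complementary error function: Q ( y ) = (1/2) erfc ( y / √2 ) , a relation given in Appendix A2.2. A minimal Python sketch using only the standard library:

import math

def Q(y: float) -> float:
    """Gaussian tail probability Q(y) = P[N(0,1) > y], via the erfc relation."""
    return 0.5 * math.erfc(y / math.sqrt(2.0))

for y in (0.0, 1.0, 2.0, 3.10):
    print(f"Q({y}) = {Q(y):.4f}")
# Q(0) = 0.5, Q(1.0) ~ 0.1587, Q(2.0) ~ 0.0228, Q(3.10) ~ 1e-3, cf. the table in Appendix A2.2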
Example 2.23
A random variable Y is said to have a log-normal PDF if X = ln Y has a Gaussian PDF; that is, let
fY ( y ) = ( 1 / ( √(2 π) β y ) ) exp [ − ( ln y − α )² / ( 2 β² ) ] for y ≥ 0 , and 0 otherwise.
d x / d y = 1/y → | J | = 1/y
Also, as y → 0 , x → − ∞ and as y → ∞ , x → ∞ .
Hence fX ( x ) = ( 1 / ( √(2 π) β ) ) exp [ − ( x − α )² / ( 2 β² ) ] , − ∞ < x < ∞
Note that X is N ( α , β² ) .
b) E [ Y ] = E [ e^X ] = ( 1 / ( √(2 π) β ) ) ∫_{−∞}^{∞} e^x exp [ − ( x − α )² / ( 2 β² ) ] d x
 = e^{α + β²/2} [ ∫_{−∞}^{∞} ( 1 / ( √(2 π) β ) ) exp ( − [ x − ( α + β² ) ]² / ( 2 β² ) ) d x ]
As the bracketed quantity, being the integral of a Gaussian PDF between the limits ( − ∞ , ∞ ) , is 1, we have
E [ Y ] = e^{α + β²/2}
c) If m denotes the median of Y , then P [ Y ≤ m ] = P [ X ≤ ln m ] = 0.5 . As X is Gaussian with mean α , this requires ln m = α , or m = e^α .
fX,Y ( x , y ) = ( 1 / k1 ) exp { − ( 1 / k2 ) [ ( x − m_X )² / σ_X² + ( y − m_Y )² / σ_Y² − 2 ρ ( x − m_X ) ( y − m_Y ) / ( σ_X σ_Y ) ] }   (2.58)
where
k1 = 2 π σ_X σ_Y √( 1 − ρ² )
k2 = 2 ( 1 − ρ² )
P2) f X Y ( x , y ) = f X ( x ) fY ( y ) iff ρ = 0
That is, if the Gaussian variables are uncorrelated, then they are
independent. That is not true, in general, with respect to non-Gaussian
variables (we have already seen an example of this in Sec. 2.5.2).
P3) If Z = α X + β Y , where α and β are constants and X and Y are jointly Gaussian, then Z is Gaussian; its PDF is obtained simply by computing m_Z and σ_Z² with the help of the formulae given in section 2.5.
Figure 2.22 gives the plot of a bivariate Gaussian PDF for the case of
ρ = 0 and σ X = σY .
Footnote 1: Note that the converse is not necessarily true. Let fX and fY be obtained from fX,Y and let fX and fY be Gaussian. This does not imply fX,Y is jointly Gaussian, unless X and Y are independent. We can construct examples of a joint PDF fX,Y which is not Gaussian but results in fX and fY that are Gaussian.
the striker missing! For ρ ≠ 0 , we have two cases (i) ρ , positive and (ii)
ρ , negative. If ρ > 0 , imagine the bell being compressed along the
X = − Y axis so that it elongates along the X = Y axis. Similarly for
ρ < 0.
Example 2.24
Let X and Y be jointly Gaussian with m_X = − m_Y = 1 , σ_X² = σ_Y² = 1 and ρ_XY = − 1/2 . Let us find the probability of ( X , Y ) lying in the shaded region D shown in Fig. 2.23.
Let A be the shaded region shown in Fig. 2.24(a) and B be the shaded
region in Fig. 2.24(b).
Fig. 2.24: (a) Region A and (b) Region B used to obtain region D
The required probability is
P [ Y + X/2 ≥ 1 ] − P [ Y + X/2 ≥ 2 ]
Let Z = Y + X/2 . Then Z is Gaussian with the parameters
E [ Z ] = E [ Y ] + (1/2) E [ X ] = − 1/2
σ_Z² = (1/4) σ_X² + σ_Y² + 2 ( 1/2 ) ρ_XY σ_X σ_Y = 1/4 + 1 − 2 ( 1/2 ) ( 1/2 ) = 3/4
That is, Z is N ( − 1/2 , 3/4 ) . Then W = ( Z + 1/2 ) / √( 3/4 ) is N ( 0, 1 ) , and
P [ Z ≥ 1 ] = P [ W ≥ √3 ]
P [ Z ≥ 2 ] = P [ W ≥ 5 / √3 ]
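Numerically, the required probability is Q ( √3 ) − Q ( 5/√3 ) . A quick check, using math.erfc (or the Q helper sketched after Eq. 2.57) is shown below; the result is about 0.04.

import math

Q = lambda y: 0.5 * math.erfc(y / math.sqrt(2.0))   # Gaussian tail probability

p_D = Q(math.sqrt(3.0)) - Q(5.0 / math.sqrt(3.0))
print("P[(X, Y) in D] =", round(p_D, 4))            # ~0.0416 - 0.0019 = 0.04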
Exercise 2.10
X and Y are independent, identically distributed (iid) random variables,
each being N ( 0, 1) . Find the probability of X , Y lying in the region A shown in
Fig. 2.25.
Exercise 2.11
Two random variables X and Y are obtained by means of the
transformation given below.
X = ( − 2 log_e U1 )^{1/2} cos ( 2 π U2 )   (2.59a)
Y = ( − 2 log_e U1 )^{1/2} sin ( 2 π U2 )   (2.59b)
where U1 and U2 are independent random variables, each uniformly distributed in the range 0 < u1 , u2 < 1 . Show that X and Y are independent and each is N ( 0, 1 ) .
(Hint: let Y1 = ( − 2 log_e U1 )^{1/2} , so that X = Y1 cos Θ and Y = Y1 sin Θ , where Θ = 2 π U2 .)
Note: The transformation given by Eq. 2.59 is called the Box-Muller
transformation and can be used to generate two Gaussian random number
sequences from two independent uniformly distributed (in the range 0 to 1)
sequences.
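A direct implementation of the Box-Muller transformation of Eq. 2.59 is sketched below, purely as an illustration; it uses only the Python standard library, and the function name and sample size are arbitrary.

import math
import random

def box_muller():
    """Generate two independent N(0,1) samples from two U(0,1) samples (Eq. 2.59)."""
    u1 = 1.0 - random.random()             # in (0, 1], avoids log(0)
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))     # Rayleigh-distributed radius
    theta = 2.0 * math.pi * u2             # uniform phase
    return r * math.cos(theta), r * math.sin(theta)

samples = [v for _ in range(100_000) for v in box_muller()]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean ~ {mean:.3f}, variance ~ {var:.3f}")   # expected: about 0 and 1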
Appendix A2.1
Proof of Eq. 2.34
The proof of Eq. 2.34 depends on establishing a relationship between the
differential area d z d w in the z − w plane and the differential area d x d y in
the x − y plane. We know that
fZ ,W ( z , w ) d z d w = P [ z < Z ≤ z + d z , w < W ≤ w + d w ]
(The infinitesimal rectangle in the z − w plane maps into a parallelogram in the x − y plane, with the vertex A mapping to A' , B to B' , etc.) We shall now find the relation between the differential area of the rectangle and the differential area of the parallelogram.
Let P1 = ( x , y ) , and let the remaining vertices of the parallelogram be P2 , P3 and P4 .
P2 = ( x + ( ∂g⁻¹/∂z ) d z , y + ( ∂h⁻¹/∂z ) d z ) = ( x + ( ∂x/∂z ) d z , y + ( ∂y/∂z ) d z )
P3 = ( x + ( ∂x/∂w ) d w , y + ( ∂y/∂w ) d w )
V1 = ( P2 − P1 ) and V2 = ( P3 − P1 ) .
That is,
V1 = ( ∂x/∂z ) d z i + ( ∂y/∂z ) d z j
V2 = ( ∂x/∂w ) d w i + ( ∂y/∂w ) d w j
where i and j are the unit vectors in the appropriate directions. Then, the area A of the parallelogram is
A = | V1 × V2 |
| V1 × V2 | = | ( ∂x/∂z ) ( ∂y/∂w ) − ( ∂y/∂z ) ( ∂x/∂w ) | d z d w
A = | J ( x , y / z , w ) | d z d w
That is,
fZ,W ( z , w ) = fX,Y ( x , y ) | J ( x , y / z , w ) | = fX,Y ( x , y ) / | J ( z , w / x , y ) | .
Appendix A2.2
Q( ) Function Table
Q ( α ) = ∫_{α}^{∞} ( 1 / √(2 π) ) e^{− x² / 2} d x
Note that Q ( 0 ) = 0.5 .
y      Q(y)      y      Q(y)      y      Q(y)       y      Q(y)
0.05   0.4801    1.05   0.1469    2.10   0.0179     3.10   10⁻³
0.10   0.4602    1.10   0.1357    2.20   0.0139     3.28   (1/2) 10⁻³
0.15   0.4405    1.15   0.1251    2.30   0.0107     3.70   10⁻⁴
0.20   0.4207    1.20   0.1151    2.40   0.0082     3.90   (1/2) 10⁻⁴
0.25   0.4013    1.25   0.1056    2.50   0.0062     4.27   10⁻⁵
0.30   0.3821    1.30   0.0968    2.60   0.0047     4.78   10⁻⁶
0.35   0.3632    1.35   0.0885    2.70   0.0035
0.40   0.3446    1.40   0.0808    2.80   0.0026
0.45   0.3264    1.45   0.0735    2.90   0.0019
0.50   0.3085    1.50   0.0668    3.00   0.0013
0.55   0.2912    1.55   0.0606    3.10   0.0010
0.60   0.2743    1.60   0.0548    3.20   0.00069
0.65   0.2578    1.65   0.0495    3.30   0.00048
0.70   0.2420    1.70   0.0446    3.40   0.00034
0.75   0.2266    1.75   0.0401    3.50   0.00023
0.80   0.2119    1.80   0.0359    3.60   0.00016
0.85   0.1977    1.85   0.0322    3.70   0.00010
0.90   0.1841    1.90   0.0287    3.80   0.00007
0.95   0.1711    1.95   0.0256    3.90   0.00005
1.00   0.1587    2.00   0.0228    4.00   0.00003
Note that some authors use erfc ( ) , the complementary error function which is
given by
erfc ( α ) = 1 − erf ( α ) = ( 2 / √π ) ∫_{α}^{∞} e^{− β²} d β
and the error function, erf ( α ) = ( 2 / √π ) ∫_{0}^{α} e^{− β²} d β
Hence Q ( α ) = (1/2) erfc ( α / √2 ) .
Appendix A2.3
Proof that N ( m_X , σ_X² ) is a valid PDF
We will show that fX ( x ) , as given by Eq. 2.56, is a valid PDF by establishing ∫_{−∞}^{∞} fX ( x ) d x = 1 . (Note that fX ( x ) ≥ 0 for − ∞ < x < ∞ .)
Let I = ∫_{−∞}^{∞} e^{− v² / 2} d v = ∫_{−∞}^{∞} e^{− y² / 2} d y .
Then, I² = [ ∫_{−∞}^{∞} e^{− v² / 2} d v ] [ ∫_{−∞}^{∞} e^{− y² / 2} d y ] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{− ( v² + y² ) / 2} d v d y
Let v = r cos θ and y = r sin θ . Then r = √( v² + y² ) and θ = tan⁻¹ ( y / v ) , and d v d y = r d r d θ (Cartesian to polar coordinate transformation).
I² = ∫_{0}^{2 π} ∫_{0}^{∞} e^{− r² / 2} r d r d θ = 2 π , or I = √(2 π)
That is, ( 1 / √(2 π) ) ∫_{−∞}^{∞} e^{− v² / 2} d v = 1   (A2.3.1)
Let v = ( x − m_X ) / σ_X   (A2.3.2)
Then, d v = d x / σ_X   (A2.3.3)
Using Eq. A2.3.2 and Eq. A2.3.3 in Eq. A2.3.1, we have the required result.
References
1) Papoulis, A., ‘Probability, Random Variables and Stochastic Processes’,
McGraw Hill (3rd edition), 1991.