
Chapter 5: Probability densities

• Continuous Random Variables
• The Normal Distribution
• The Normal Approximation to the Binomial Distribution
• The Uniform Distribution
• The Log-normal Distribution
• The Gamma Distribution
• The Beta Distribution
• The Weibull Distribution
• Transformations of Variables
• Joint Distributions – Discrete and Continuous
Department of Mathematics,
4-Apr-08 BITS Pilani, Goa Campus 1
Continuous Random Variables

Random variables are real-valued functions defined on the
sample space of an experiment. A random variable whose
values are not countable is called a continuous random
variable. A continuous random variable can assume values
on a continuous scale, i.e. over an interval or a union of
intervals.
Examples: the height of a person, the price of a house, the
time taken to complete an examination, the weight of a
baby, etc.
Continuous Random Variables (cont’d)

P(a ≤ X ≤ b) = ?

represents the probability associated with the points of the
sample space for which the value of the random variable
falls in the interval from a to b.
Continuous Random Variables (cont’d)

Suppose we are interested in the probability that a given
random variable will take on a value in the interval from a
to b, where a and b are constants with a ≤ b. First, we divide
the interval from a to b into n equal subintervals of width ∆x
containing, respectively, the points x1, x2, …, xn.
Suppose that the probability that the random variable takes
on a value in the subinterval containing xi is given by f(xi)·∆x.
Then the probability that the random variable takes on a
value in the interval from a to b is given by

P(a ≤ X ≤ b) = Σ_{i=1}^{n} f(xi)·∆x
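The limiting process on this slide can be sketched numerically. The following is a minimal stdlib-only illustration (not part of the original slides); the density f(x) = 2x on (0, 1) is a hypothetical example chosen so the exact integral is easy to check.

```python
# Riemann-sum approximation of P(a <= X <= b) = sum of f(x_i) * dx,
# illustrating how the sum approaches the integral of f from a to b.
def riemann_prob(f, a, b, n):
    dx = (b - a) / n
    # take x_i as the midpoint of the i-th subinterval
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

f = lambda x: 2 * x            # hypothetical density on (0, 1)
exact = 0.8 ** 2 - 0.2 ** 2    # integral of 2x from 0.2 to 0.8

approx = riemann_prob(f, 0.2, 0.8, 1000)
```

As n grows, the sum converges to the exact probability, here 0.60.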
Continuous Random Variables (cont’d)

If f is an integrable function defined for all values of the
random variable, the probability that the value of the random
variable falls between a and b is defined by letting ∆x → 0:

P(a ≤ X ≤ b) = lim_{∆x→0} Σ_{i=1}^{n} f(xi)∆x = ∫_a^b f(x) dx

Note: The value of f(x) does not give the probability that the
corresponding random variable takes on the value x; in the
continuous case, probabilities are given by integrals, not by
the values f(x).
Continuous Random Variables (cont’d)

P(a ≤ X ≤ b) = area under f(x) from a to b = ∫_a^b f(x) dx

Figure: Probability as area under f
Continuous Random Variables (cont’d)

The function f is called the probability density function, or
simply the probability density.
Characteristics of a probability density function f:
1. f(x) ≥ 0 for all x.
2. ∫_{-∞}^{∞} f(x) dx = 1.
F(x) represents the probability that a random variable with
probability density f(x) takes on a value less than or equal to
x, and the corresponding function F is called the cumulative
distribution function, or simply the distribution function, of
the random variable X.
Continuous Random Variables (cont’d)

Thus, for any value x,

F(x) = P(X ≤ x)

is the area under the probability density function over the
interval from -∞ to x. Mathematically,

F(x) = ∫_{-∞}^{x} f(t) dt

The probability that the random variable will take on a value
in the interval from a to b is given by

P(a ≤ X ≤ b) = F(b) − F(a)
Continuous Random Variables (cont’d)

By the fundamental theorem of integral calculus, it follows
that

dF(x)/dx = f(x)

wherever this derivative exists.
F is a non-decreasing function, with F(-∞) = 0 and F(∞) = 1.
kth moment about the origin:

µ′_k = ∫_{-∞}^{∞} x^k · f(x) dx
Continuous Random Variables (cont’d)
Mean of a probability density:

µ = ∫_{-∞}^{∞} x f(x) dx

kth moment about the mean:

µ_k = ∫_{-∞}^{∞} (x − µ)^k · f(x) dx
Continuous Random Variables (cont’d)

Variance of a probability density:

σ² = ∫_{-∞}^{∞} (x − µ)² f(x) dx
   = ∫_{-∞}^{∞} x² f(x) dx − µ²
   = µ′₂ − µ²

σ is referred to as the standard deviation.
Continuous Random Variables (cont’d)

Example 1: If the probability density of a random variable is
given by

f(x) = x for 0 < x < 1;  2 − x for 1 ≤ x < 2;  0 elsewhere

find the probabilities that a random variable having this
probability density will take on a value
(a) between 0.2 and 0.8;
(b) between 0.6 and 1.2.
Solution:

(a) P(0.2 ≤ X ≤ 0.8) = ∫_{0.2}^{0.8} f(x) dx = ∫_{0.2}^{0.8} x dx = [x²/2]_{0.2}^{0.8} = 0.30.
Continuous Random Variables (cont’d)

(b) P(0.6 ≤ X ≤ 1.2) = ∫_{0.6}^{1.2} f(x) dx = ∫_{0.6}^{1.0} x dx + ∫_{1.0}^{1.2} (2 − x) dx

    = [x²/2]_{0.6}^{1.0} + [2x − x²/2]_{1.0}^{1.2} = 0.50
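As a quick numerical check (a sketch, not part of the original slides), the probabilities in Example 1 can be verified with stdlib-only numerical integration of the piecewise density:

```python
# Piecewise density from Example 1.
def f(x):
    if 0 < x < 1:
        return x
    if 1 <= x < 2:
        return 2 - x
    return 0.0

def prob(a, b, n=100_000):
    # midpoint-rule approximation of the integral of f from a to b
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

pa = prob(0.2, 0.8)   # close to 0.30
pb = prob(0.6, 1.2)   # close to 0.50
```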

Example 2: With reference to the preceding example, find the
corresponding distribution function and use it to determine the
probabilities that a random variable having this distribution
function will take on a value
(a) greater than 1.8;
(b) between 0.4 and 1.6.
Continuous Random Variables (cont’d)
Solution:

F(x) = ∫_{-∞}^{x} f(t) dt

If x ≤ 0:      F(x) = 0.
If 0 < x < 1:  F(x) = ∫_0^x t dt = x²/2.
If 1 ≤ x < 2:  F(x) = ∫_0^1 t dt + ∫_1^x (2 − t) dt = 2x − x²/2 − 1.
If x ≥ 2:      F(x) = ∫_0^1 t dt + ∫_1^2 (2 − t) dt = 1.
Continuous Random Variables (cont’d)

⎧0 for x ≤ 0
⎪ 2
⎪x for 0 < x < 1
⎪2
F ( x) = ⎨ 2
⎪2 x − x
−1 for 1 ≤ x < 2
⎪ 2

⎩1 for x ≥ 2
( a ) P ( X > 1 .8) = 1 − F (1 .8) = .02
( b ) P ( 0 .4 ≤ X ≤ 1 .6 ) = F (1 .6 ) − F ( 0 .4 ) = 0 .84 .

Department of Mathematics,
4-Apr-08 BITS Pilani, Goa Campus 15
Continuous Random Variables (cont’d)

Example 3: Find µ and σ² for the probability density of the
previous example.
Solution:

µ = ∫_{-∞}^{∞} x f(x) dx = ∫_0^1 x² dx + ∫_1^2 x(2 − x) dx = 1

µ′₂ = ∫_{-∞}^{∞} x² f(x) dx = ∫_0^1 x³ dx + ∫_1^2 x²(2 − x) dx = 7/6

σ² = µ′₂ − µ² = 7/6 − 1 = 1/6.
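The values µ = 1 and σ² = 1/6 can also be checked numerically; a stdlib-only sketch using the same density:

```python
# Density of Examples 1-3.
def f(x):
    if 0 < x < 1:
        return x
    if 1 <= x < 2:
        return 2 - x
    return 0.0

def moment(k, n=200_000):
    # midpoint-rule approximation of the integral of x**k * f(x) over (0, 2)
    dx = 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += x ** k * f(x) * dx
    return total

mu = moment(1)             # close to 1
var = moment(2) - mu ** 2  # close to 1/6
```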
Find the mean of

f(x) = (1/π) · 1/(1 + x²),  −∞ < x < ∞
Find the median and mode of

f(x) = λ e^{−λx},  x > 0

median = (log 2)/λ
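A quick check of the stated median (a sketch; the rate λ = 2.5 below is an arbitrary choice for illustration): the distribution function of this density is F(x) = 1 − e^{−λx}, which equals 1/2 exactly at x = (log 2)/λ. The mode is 0, since the density is strictly decreasing on x > 0.

```python
from math import exp, log

lam = 2.5  # arbitrary rate, chosen only for illustration

def F(x):
    # distribution function of f(x) = lam * exp(-lam * x), x > 0
    return 1 - exp(-lam * x)

median = log(2) / lam
half = F(median)  # equals 0.5
```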
Density Curves

• Density curves describe the overall shape of a distribution.
• They are ideal patterns that are accurate enough for
practical purposes.
• They are faster to draw and easier to use.
• Areas or proportions under the curve represent counts or
percents of observations.
Center of a Density Curve

• The mode of a distribution is the point where the curve is
highest.
• The median is the point where half of the area under the
curve lies to the left and the other half to the right (the
equal-areas point).
• Quartiles can be found by dividing the area under the curve
into four equal parts:
  - 1/4 of the area is to the left of the 1st quartile.
  - 3/4 of the area is to the left of the 3rd quartile.
• The mean is the balance point.
5.2 The Normal Distribution

• A Gaussian function (named after Carl Friedrich Gauss) is a
function of the form

f(x) = a e^{−(x−b)²/c²}

for some real constants a > 0, b, and c.
• The Fourier transform of a Gaussian function is again a
Gaussian function (and, for a suitable choice of the constants,
a scalar multiple of the function whose Fourier transform was
taken).
The Normal Distribution (cont’d)

• Gaussian functions are among those elementary functions
that lack elementary antiderivatives.
• But their improper integrals over the whole real line can be
evaluated exactly:

∫_{-∞}^{∞} e^{−x²} dx = √π
The Normal Distribution (cont’d)
• The antiderivative of the Gaussian function is the error
function.
• Gaussian functions are used as pre-smoothing kernels in
image processing.
• A Gaussian function is the wave function of the ground state
of the quantum harmonic oscillator.
• Gaussian functions are also associated with the vacuum state
in quantum field theory.
• Gaussian beams are used in optical and microwave systems.
• Gaussian orbitals are used in computational chemistry.
The Normal Distribution (cont’d)

• In probability and statistics, Gaussian functions appear as
the density function of the normal distribution, which is a
limiting probability distribution of complicated sums,
according to the central limit theorem.
• The central limit theorem states that if a sum of independent
random variables has finite variance, then it is approximately
normally distributed.
• Since many real processes yield distributions with finite
variance, this explains the omnipresence of the normal
distribution.
Normal Distributions

• Symmetric
• Single-peaked (unimodal)
• Bell-shaped
• Tails fall off quickly
• The mean, median, and mode are the same.
• The points where the curvature changes lie one standard
deviation on either side of the mean.
• The mean and standard deviation completely specify the
curve.
The Normal Distribution (cont’d)

Definition: A continuous random variable X has a normal
distribution, and is referred to as a normal random variable,
if its probability density is given by

f(x; µ, σ²) = (1/(√(2π) σ)) e^{−(x−µ)²/2σ²} for −∞ < x < ∞

where −∞ < µ < ∞ and σ > 0.

The parameters µ and σ of the normal distribution are indeed
its mean and its standard deviation.
The Empirical Rule

• 68% of the observations fall within one standard deviation
of the mean.
• 95% of the observations fall within two standard deviations
of the mean.
• 99.7% of the observations fall within three standard
deviations of the mean.
The Normal Distribution (cont’d)

The Empirical Rule, or the 68–95–99.7 Rule
The Normal Distribution (cont’d)

Figure: Normal probability density functions for selected
values of the parameters µ and σ²
The Normal Distribution (cont’d)

mean = ∫_{-∞}^{∞} x f(x; µ, σ²) dx

     = ∫_{-∞}^{∞} x (1/(√(2π) σ)) e^{−(x−µ)²/2σ²} dx

     = (1/√(2π)) ∫_{-∞}^{∞} (x/σ) e^{−(x−µ)²/2σ²} dx

     = (1/√(2π)) ∫_{-∞}^{∞} ((x − µ)/σ + µ/σ) e^{−(x−µ)²/2σ²} dx
The Normal Distribution (cont’d)
Mean = (1/√(2π)) ∫_{-∞}^{∞} ((x − µ)/σ) e^{−(x−µ)²/2σ²} dx + (µ/(√(2π) σ)) ∫_{-∞}^{∞} e^{−(x−µ)²/2σ²} dx

(put u = (x − µ)/σ in the first integral)

     = (σ/√(2π)) ∫_{-∞}^{∞} u e^{−u²/2} du + µ ∫_{-∞}^{∞} f(x; µ, σ²) dx

Since the value of the first integral is zero (the integrand is an
odd function) and the value of the integral in the second term
is equal to 1, we have

Mean = µ
The Normal Distribution (cont’d)

Variance = ∫ ( x − µ )2 f ( x; µ ,σ 2 )dx
−∞

1
= ∫ (x − µ) 2
e −( x −µ ) 2 / 2σ 2
dx
−∞ 2π σ
x−µ
(put = u)
σ
σ 2 ∞ 2 −u / 2 ∞
2σ 2 2 −u / 2
∫ ∫
2 2
= ue du = ue du
2π −∞ 2π 0
(since integrand is even function)
Department of Mathematics,
4-Apr-08 BITS Pilani, Goa Campus 32
The Normal Distribution (cont’d)

Variance = (2σ²/√(2π)) ∫_0^{∞} u · u e^{−u²/2} du

Integrating by parts, using

∫ u e^{−u²/2} du = −e^{−u²/2}

(which we can easily get by putting v = u²/2 in this integral),
we obtain

Variance = (2σ²/√(2π)) ( [−u e^{−u²/2}]_0^{∞} + ∫_0^{∞} e^{−u²/2} du )
         = (2σ²/√(2π)) ∫_0^{∞} e^{−u²/2} du
The Normal Distribution (cont’d)
We know that

∫_{-∞}^{∞} f(x; µ, σ²) dx = 1  ⇔  (1/(√(2π) σ)) ∫_{-∞}^{∞} e^{−(x−µ)²/2σ²} dx = 1

Putting µ = 0 and σ = 1, we get

(1/√(2π)) ∫_{-∞}^{∞} e^{−x²/2} dx = 1  ⇔  (2/√(2π)) ∫_0^{∞} e^{−x²/2} dx = 1.

Hence,

Variance = σ²
Moment Generating Function for the Normal
Distribution

f(x; µ, σ²) = (1/(√(2π) σ)) e^{−(x−µ)²/2σ²} for −∞ < x < ∞

m.g.f. = E(e^{tX}) = ∫_{-∞}^{∞} e^{tx} (1/(√(2π) σ)) e^{−(x−µ)²/2σ²} dx

= (1/(√(2π) σ)) ∫_{-∞}^{∞} e^{tx} e^{−(x−µ)²/2σ²} dx

= (e^{tµ}/(√(2π) σ)) ∫_{-∞}^{∞} e^{tx − tµ} e^{−(x−µ)²/2σ²} dx
= (e^{tµ}/(√(2π) σ)) ∫_{-∞}^{∞} e^{tx − tµ} e^{−(x−µ)²/2σ²} dx

= (e^{tµ}/(√(2π) σ)) ∫_{-∞}^{∞} e^{t(x−µ)} e^{−(x−µ)²/2σ²} dx
= (e^{tµ}/(√(2π) σ)) ∫_{-∞}^{∞} e^{t(x−µ) − (x−µ)²/2σ²} dx

= (e^{tµ}/(√(2π) σ)) ∫_{-∞}^{∞} e^{(2σ²t(x−µ) − (x−µ)²)/2σ²} dx
= (e^{tµ}/(√(2π) σ)) ∫_{-∞}^{∞} e^{−((x−µ)² − 2σ²t(x−µ))/2σ²} dx

= (e^{tµ}/(√(2π) σ)) ∫_{-∞}^{∞} e^{−((x−µ)² − 2σ²t(x−µ) + σ⁴t² − σ⁴t²)/2σ²} dx
= (e^{tµ}/(√(2π) σ)) ∫_{-∞}^{∞} e^{−((x−µ−σ²t)² − σ⁴t²)/2σ²} dx

= (e^{tµ}/(√(2π) σ)) ∫_{-∞}^{∞} e^{−(x−µ−σ²t)²/2σ²} e^{σ⁴t²/2σ²} dx
= (e^{tµ}/(√(2π) σ)) ∫_{-∞}^{∞} e^{−(x−µ−σ²t)²/2σ²} e^{σ²t²/2} dx

= (e^{tµ + σ²t²/2}/(√(2π) σ)) ∫_{-∞}^{∞} e^{−(x−µ−σ²t)²/2σ²} dx
= e^{tµ + σ²t²/2} (1/(√(2π) σ)) ∫_{-∞}^{∞} e^{−(x−µ−σ²t)²/2σ²} dx

Since the remaining integrand is a normal density (with mean
µ + σ²t and standard deviation σ), the integral equals 1, so

= e^{tµ + σ²t²/2}
M(t) = e^{tµ + σ²t²/2}

log(M(t)) = tµ + (1/2)σ²t²

dM(t)/dt |_{t=0} = [ e^{µt + σ²t²/2} (µ + σ²t) ]_{t=0}

Hence E(X) = µ.
The Normal Distribution (cont’d)
The normal distribution with µ = 0 and σ = 1 is called the
standard normal distribution.
Distribution function of the standard normal distribution:

F(z) = (1/√(2π)) ∫_{-∞}^{z} e^{−t²/2} dt = P(Z ≤ z)

The standard normal table at the end of the book gives the
values of F(z) for positive or negative values of
z = 0.00, 0.01, 0.02, …, 3.49, and for z = 3.50, z = 4.00, and
z = 5.00.
The Normal Distribution (cont’d)

F(z) F(b) - F(a)

0 z a b

Figure: The standard normal probabilities Figure: The standard normal probability
F(z) = P(Z ≤ z) F(b) - F(a) = P(a ≤ Z ≤ b)

F(- z) = 1 - F(z)
Proof: The standard normal density function is given by
1 −z 2 / 2
f (z) = e which is a even function, i.e. f(-z) = f(z)

Department of Mathematics,
4-Apr-08 BITS Pilani, Goa Campus 45
The Normal Distribution (cont’d)
Hence,

F(−z) = ∫_{-∞}^{−z} f(t) dt = −∫_{∞}^{z} f(−s) ds = ∫_{z}^{∞} f(−s) ds
      = ∫_{z}^{∞} f(s) ds = 1 − ∫_{-∞}^{z} f(s) ds = 1 − F(z)

Example 4: Find the probabilities that a random variable having
the standard normal distribution will take on a value
(a) between −1.25 and 0.37;
(b) greater than 1.26;
(c) greater than −1.37.
The Normal Distribution (cont’d)

(a) P(-1.25 < Z < 0.37) = F(0.37) – F(-1.25) = 0.6443 – 0.1056


= 0.5387

(b) P(Z > 1.26) = 1 - F(1.26) = 1 – 0.8962 = 0.1038

Department of Mathematics,
4-Apr-08 BITS Pilani, Goa Campus 47
The Normal Distribution (cont’d)

(c) P(Z > -1.37) = 1 – F(-1.37) = 1 – 0.0853 = 0.9147

1
-1.37 -1.37

or

P(Z > -1.37) = P(Z < 1.37) = F(1.37) = 0.9147

Department of Mathematics,
4-Apr-08 BITS Pilani, Goa Campus 48
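Table lookups like these can be reproduced without the table, since F(z) = ½(1 + erf(z/√2)) with the error function from the standard library. A sketch checking Example 4:

```python
from math import erf, sqrt

def F(z):
    # standard normal distribution function via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

pa = F(0.37) - F(-1.25)  # close to 0.5387
pb = 1 - F(1.26)         # close to 0.1038
pc = 1 - F(-1.37)        # close to 0.9147
```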
The Normal Distribution (cont’d)

• There are also problems in which we are given probabilities
relating to the standard normal distribution and asked to find
the corresponding values of z.
Let z_α be such that the probability is α that it will be exceeded
by a random variable having the standard normal distribution.
That is,

α = P(Z > z_α) = 1 − F(z_α)  ⇔  F(z_α) = 1 − α

Figure: The z_α notation for a standard normal distribution
The Normal Distribution (cont’d)

Example 5: Find (a) z_{0.01}; (b) z_{0.05}.
Solution:
(a) Since F(z_{0.01}) = 1 − 0.01 = 0.99, we look for the entry in
Table 3 which is closest to 0.99 and get 0.9901,
corresponding to z = 2.33. Thus z_{0.01} = 2.33.
(b) Since F(z_{0.05}) = 1 − 0.05 = 0.95, we look for the entries in
Table 3 which are closest to 0.95 and get 0.9495 and 0.9505,
corresponding to z = 1.64 and z = 1.65. Thus, by
interpolation, we take z_{0.05} = 1.645.
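Since F is increasing, z_α can also be found by solving F(z) = 1 − α numerically instead of searching the table; a bisection sketch (stdlib only):

```python
from math import erf, sqrt

def F(z):
    # standard normal distribution function via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

def z_alpha(alpha):
    # bisection for the root of F(z) = 1 - alpha; F is increasing
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if F(mid) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z01 = z_alpha(0.01)  # about 2.326 (the table lookup gives 2.33)
z05 = z_alpha(0.05)  # about 1.645
```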
The Normal Distribution (cont’d)
If a random variable X has a normal distribution with mean µ
and standard deviation σ, then

Z = (X − µ)/σ

is a random variable which has the standard normal
distribution. In this case, Z is called the standardized random
variable.
The probability that the random variable X will take on a
value less than or equal to a is given by

P(X ≤ a) = P((X − µ)/σ ≤ (a − µ)/σ) = P(Z ≤ (a − µ)/σ) = F((a − µ)/σ)

which we can get from Table 3.
The Normal Distribution (cont’d)

• The probability that a random variable having the normal
distribution with mean µ and standard deviation σ will take
on a value between a and b is given by

P(a < X ≤ b) = P((a − µ)/σ < (X − µ)/σ ≤ (b − µ)/σ)
             = P((a − µ)/σ < Z ≤ (b − µ)/σ)
             = F((b − µ)/σ) − F((a − µ)/σ)
The Normal Distribution (cont’d)

Example 6: If a random variable has the normal distribution
with µ = 16.2 and σ² = 1.5625, find the probabilities that it
will take on a value
(a) greater than 16.8;
(b) less than 14.9;
(c) between 13.6 and 18.8.
Solution: σ = 1.25

(a) P(X > 16.8) = 1 − P(X ≤ 16.8) = 1 − F((16.8 − 16.2)/1.25)
              = 1 − F(0.48) = 1 − 0.6844 = 0.3156.
The Normal Distribution (cont’d)

(b) P(X < 14.9) = F((14.9 − 16.2)/1.25) = F(−1.04) = 0.1492.

(c) P(13.6 < X < 18.8) = F((18.8 − 16.2)/1.25) − F((13.6 − 16.2)/1.25)
    = F(2.08) − F(−2.08)
    = 2F(2.08) − 1        (using F(−z) = 1 − F(z))
    = 2(0.9812) − 1
    = 0.9624.
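Example 6 can be verified the same way by standardizing first (a sketch; F is again the erf-based standard normal distribution function):

```python
from math import erf, sqrt

def F(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 16.2, 1.25

def P_le(a):
    # P(X <= a) = F((a - mu) / sigma)
    return F((a - mu) / sigma)

pa = 1 - P_le(16.8)           # close to 0.3156
pb = P_le(14.9)               # close to 0.1492
pc = P_le(18.8) - P_le(13.6)  # close to 0.9624
```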
The Normal Distribution (cont’d)
Although the normal distribution applies to continuous
random variables, it is often used to approximate distributions
of discrete random variables.
For that we must use the continuity correction, according to
which each integer k is represented by the interval from
k − ½ to k + ½.
For instance, 3 is represented by the interval from 2.5 to 3.5,
"at least 7" is represented by the interval from 6.5 to ∞, and
"at most 5" is represented by the interval from -∞ to 5.5.
Similarly, "less than 5" is represented by the interval from -∞
to 4.5, and "greater than 7" by the interval from 7.5 to ∞.
The Normal Distribution (cont’d)
Example 7: A continuity correction to improve the normal
approximation to a count variable
In a certain city, the number of power outages per month is a
random variable having a distribution with µ = 11.6 and
σ = 3.3. If this distribution can be approximated closely with
a normal distribution, what is the probability that there will
be at least 8 outages in any one month?
Solution: The number of outages is a discrete random variable,
and if we want to approximate its distribution with a normal
distribution, we must spread its values over a continuous
scale, i.e. we make the continuity correction according to
which “at least 8” is represented by the interval to the right
of 7.5.
The Normal Distribution (cont’d)
Thus the desired probability is approximated by

P(at least 8 outages) = P(X ≥ 7.5) = 1 − P(X < 7.5)
                      = 1 − F((7.5 − 11.6)/3.3) = 1 − F(−1.24)
                      = F(1.24) = 0.8925.

Figure: Diagram for the example dealing with power outages
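The same erf-based check works for the continuity-corrected probability in this example (the small difference from 0.8925 comes from rounding z to −1.24 before the table lookup):

```python
from math import erf, sqrt

def F(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 11.6, 3.3
# continuity correction: "at least 8" becomes X >= 7.5
p = 1 - F((7.5 - mu) / sigma)  # close to 0.8925
```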
5.3 The Normal Approximation
to the Binomial Distribution
• The normal distribution can be used to approximate the
binomial distribution when n is large and p, the probability of
a success, is close to 0.50 and hence not small enough to use
the Poisson approximation.
Normal approximation to the binomial distribution
Theorem 5.1: If X is a random variable having the binomial
distribution with the parameters n and p, and if

Z = (X − np)/√(np(1 − p))

then the limiting form of the distribution function of this
standardized random variable as n → ∞ is given by
The Normal Approximation to the
Binomial Distribution (cont'd)

F(z) = (1/√(2π)) ∫_{-∞}^{z} e^{−t²/2} dt,  −∞ < z < ∞.

Although X takes on only the values 0, 1, 2, …, n, in the limit
as n → ∞ the distribution of the corresponding standardized
random variable is continuous, and the corresponding
probability density is the standard normal density.
A good rule of thumb for the normal approximation:
use the normal approximation to the binomial distribution only
when np and n(1 − p) are both greater than 15.
The Normal Approximation to the
Binomial Distribution (cont'd)
Example 8: If a random variable has the binomial distribution
with n = 30 and p = 0.60, use the normal approximation to
determine the probabilities that it will take on
(a) a value less than 12;
(b) the value 14;
(c) a value greater than 16.
Solution: µ = np = 18; σ² = np(1 − p) = 7.2; σ = 2.6833

(a) P(X < 12) = P(X < 11.5)   (using the continuity correction)
             = F((11.5 − 18)/2.6833) = F(−2.42) = 0.0078.
The Normal Approximation to the
Binomial Distribution (cont'd)
(b) P(X = 14) = P(13.5 < X < 14.5)   (using the continuity correction)
    = F((14.5 − 18)/2.6833) − F((13.5 − 18)/2.6833)
    = F(−1.3044) − F(−1.677)
    = 0.0961 − 0.0468 = 0.0493.

(c) P(X > 16) = P(X > 16.5)   (using the continuity correction)
    = 1 − P(X ≤ 16.5) = 1 − F((16.5 − 18)/2.6833)
    = 1 − F(−0.559) = F(0.559) = 0.7120.
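Since n = 30 is small enough to enumerate, the approximation in Example 8(a) can be compared against the exact binomial probability (a stdlib-only sketch):

```python
from math import comb, erf, sqrt

def F(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 30, 0.60
mu, sigma = n * p, sqrt(n * p * (1 - p))

def binom_cdf(k):
    # exact P(X <= k) for the binomial distribution
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

approx_a = F((11.5 - mu) / sigma)  # normal approximation, close to 0.0078
exact_a = binom_cdf(11)            # exact P(X < 12)
```

The two values agree to within the accuracy expected of the approximation at this n.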
5.5 The Uniform Distribution

• The uniform, or rectangular, distribution was introduced in
1937 by J. V. Uspensky, who defined it as follows: "A
stochastic variable is said to have uniform distribution of
probability if probabilities attached to two equal intervals
are equal."
• Although the uniform distribution is not commonly found in
nature, it is particularly useful for sampling from arbitrary
distributions.
• Uniform distributions are used by computers for random
number generation within a given range.
5.5 The Uniform Distribution
The uniform distribution, with parameters α and β, has
probability density function

f(x) = 1/(β − α) for α < x < β;  0 elsewhere

Figure: Graph of the uniform probability density
The Uniform Distribution (cont’d)

Note: All values of x from α to β are equally likely in the sense
that the probability that x lies in an interval of width ∆x
entirely contained in the interval from α to β is equal to
∆x/(β − α), regardless of the exact location of the interval.
Mean of the uniform distribution:

µ = (α + β)/2

Proof:

µ = ∫_α^β x · (1/(β − α)) dx = (1/(β − α)) [x²/2]_α^β = (α + β)/2
The Uniform Distribution (cont’d)

Variance of the uniform distribution:

σ² = (β − α)²/12

Proof:

µ′₂ = ∫_α^β x² (1/(β − α)) dx = (1/(β − α)) [x³/3]_α^β = (β² + αβ + α²)/3

Hence

σ² = µ′₂ − µ² = (β² + αβ + α²)/3 − (α + β)²/4 = (β − α)²/12.
The Uniform Distribution (cont’d)
Distribution function for the uniform density:

F(x) = 0 for x ≤ α;  (x − α)/(β − α) for α < x < β;  1 for x ≥ β

Example: In certain experiments, the error made in determining
the solubility of a substance is a random variable having the
uniform density with α = −0.025 and β = 0.025. What are the
probabilities that such an error will be
(a) between 0.010 and 0.015;
(b) between −0.012 and 0.012?
The Uniform Distribution (cont’d)

Solution: β − α = 0.05, hence the density function is given by

f(x) = 1/0.05 for −0.025 < x < 0.025;  0 elsewhere

(a) P(0.010 < error < 0.015) = ∫_{0.010}^{0.015} (1/0.05) dx = 0.1.

(b) P(−0.012 < error < 0.012) = ∫_{−0.012}^{0.012} (1/0.05) dx = 0.48.
The Uniform Distribution (cont’d)

Example: From experience Mr. Harris has found that a low bid
on a construction job can be regarded as a random variable
having the uniform density

f(x) = 3/(4C) for 2C/3 < x < 2C;  0 elsewhere

where C is his own estimate of the cost of the job. What
percentage should Mr. Harris add to his cost estimate when
submitting bids to maximize his expected profit?
The Uniform Distribution (cont’d)

Hint: Let X be the random variable that assumes the value of
the low bid on the construction job.
Given that C is Mr. Harris's own estimate of the cost of the
job, suppose Mr. Harris adds the fraction p to his cost
estimate, i.e. he bids C + Cp.
If X < C + Cp, Mr. Harris will not get the job and hence there
is no profit.
The Uniform Distribution (cont’d)

If X ≥ C + Cp, Mr. Harris should get the job and hence his
net profit = (C + Cp) − C = Cp.
So the expected profit of Mr. Harris is

E(p) = 0 · P(X < C + Cp) + Cp · P(X ≥ C + Cp)
     = Cp ∫_{(1+p)C}^{2C} (3/(4C)) dx = 3C(p − p²)/4.

Homework: Maximize E(p) to get the value of p (= 0.5).
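The homework step can be sketched numerically: E(p) = 3C(p − p²)/4 is maximized over a grid of p values, and C drops out of the argmax.

```python
# Expected profit from the slide: E(p) = 3 * C * (p - p**2) / 4.
C = 1.0  # the optimal p does not depend on C

def expected_profit(p):
    return 3 * C * (p - p * p) / 4

# simple grid search over p in [0, 1]
best_p = max((i / 10_000 for i in range(10_001)), key=expected_profit)
```

The quadratic p − p² peaks at p = 0.5, i.e. Mr. Harris should add 50% to his cost estimate.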
5.6 The Log-Normal Distribution

• In probability and statistics, the log-normal distribution is
the probability distribution of any random variable whose
logarithm is normally distributed (the base of the logarithm
is immaterial, in that log_a X is normally distributed if and
only if log_b X is normally distributed).
• If X is a random variable with a normal distribution, then
exp(X) has a log-normal distribution.
• "Log-normal" is also written "log normal" or "lognormal".
5.6 The Log-Normal Distribution

• A variable might be modeled as log-normal if it can be
thought of as the multiplicative product of many small
independent factors.
• A typical example is the long-term return rate on a stock
investment: it can be considered as the product of the daily
return rates.
• Examples of variates which have approximately log-normal
distributions include the size of silver particles in a
photographic emulsion, the survival time of bacteria in
germicides, and the weight and blood pressure of humans.
5.6 The Log-Normal Distribution

The probability density of the log-normal distribution is
given by

f(x) = (1/(√(2π) β)) x^{−1} e^{−(ln x − α)²/2β²} for x > 0, β > 0;  0 elsewhere

where ln x is the natural logarithm of x.
The Log-Normal Distribution (cont’d)

Figure: Graph of the log-normal probability density

• The log-normal distribution is positively skewed, i.e. it has
a long right-hand tail.
The Log-Normal Distribution (cont’d)

The probability that a random variable having the log-normal
distribution will take on a value between a and b (0 < a < b)
is given by

P(a < X < b) = ∫_a^b (1/(√(2π) β)) x^{−1} e^{−(ln x − α)²/2β²} dx

(putting y = ln x)

             = ∫_{ln a}^{ln b} (1/(√(2π) β)) e^{−(y − α)²/2β²} dy

The integrand is a normal density function with µ = α and
σ = β, hence we have
The Log-Normal Distribution (cont’d)

⎛ ln b − α ⎞ ⎛ ln a − α ⎞
P(a < X < b) = F ⎜⎜ ⎟⎟ − F ⎜⎜ ⎟⎟
⎝ β ⎠ ⎝ β ⎠
where F is the distribution function of the standard normal
distribution.
Mean of log-normal distribution 2
α +β / 2
µ =e
Variance of log-normal distribution

2α + β 2 β2
σ =e
2
(e − 1)
Department of Mathematics,
4-Apr-08 BITS Pilani, Goa Campus 76
The Log-Normal Distribution (cont’d)

Proof for the mean:

µ = (1/(√(2π) β)) ∫_0^{∞} x · x^{−1} e^{−(ln x − α)²/2β²} dx

(putting y = ln x)

  = (1/(√(2π) β)) ∫_{-∞}^{∞} e^{y} e^{−(y − α)²/2β²} dy

  = (1/(√(2π) β)) ∫_{-∞}^{∞} e^{−(y² + α² − 2(α + β²)y)/2β²} dy
The Log-Normal Distribution (cont’d)

1 ∞ ( )
− ⎧⎨ y − (α + β 2 ) − β 4 − 2αβ 2 ⎫⎬ / 2 β 2
2

µ=
2π β ∫ −∞
e ⎩ ⎭
dy

α +β 2 / 2 ∞ 1 − ( y − (α + β 2 ) ) / 2 β 2
2

=e ∫ −∞
2π β
e dy

Since the integrand is normal density function with µ = α + β2


and σ = β, so the value of the integral is 1. Hence we have
α +β 2 / 2
µ =e .

Department of Mathematics,
4-Apr-08 BITS Pilani, Goa Campus 78
The Log-Normal Distribution (cont’d)

Proof for the variance:

µ′₂ = (1/(√(2π) β)) ∫_0^{∞} x² · x^{−1} e^{−(ln x − α)²/2β²} dx

    = (1/(√(2π) β)) ∫_{-∞}^{∞} e^{2y} e^{−(y − α)²/2β²} dy

    = (1/(√(2π) β)) ∫_{-∞}^{∞} e^{−(y² + α² − 2(α + 2β²)y)/2β²} dy
The Log-Normal Distribution (cont’d)

1 ∞ ( )
− ⎧⎨ y − (α + 2 β 2 ) − 4 β 4 − 4αβ 2 ⎫⎬ / 2 β 2
2

µ 2′ =
2π β ∫ −∞
e ⎩ ⎭
dy

∞ 1 − ( y − (α + 2 β 2 ) ) / 2 β 2
2


2 (α + β 2 ) 2 (α + β 2 )
=e e dy = e
−∞
2π β
Since the integrand is normal density function with µ = α + 2β2
and σ = β, so the value of the integral is 1. Hence we have
2 (α + β 2 ) 2α + β 2 2α + β 2 β2
σ = µ 2′ − µ = e
2 2
−e =e (e − 1).

Department of Mathematics,
4-Apr-08 BITS Pilani, Goa Campus 80
The Log-Normal Distribution (cont’d)

Example: The current gain of certain transistors is measured in
units which make it equal to the logarithm of I_o/I_i, the ratio
of output to input current. If this logarithm is normally
distributed with µ = 2 and σ² = 0.01, find the probability that
I_o/I_i will take on a value between 7.0 and 7.5.
Solution: Since α = 2 and β² = 0.01, so β = 0.1, we have

P(7.0 < I_o/I_i < 7.5) = P(ln 7.0 < ln(I_o/I_i) < ln 7.5)
  = F((ln 7.5 − 2)/0.1) − F((ln 7.0 − 2)/0.1)
  = F(0.149) − F(−0.54)
  = 0.5592 − 0.2946 = 0.2646.
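An erf-based check of this result (the last decimal can differ slightly because the slide rounds z to 0.149 and −0.54 before the table lookup):

```python
from math import erf, log, sqrt

def F(z):
    # standard normal distribution function via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

alpha, beta = 2.0, 0.1
# P(7.0 < Io/Ii < 7.5) = F((ln 7.5 - alpha)/beta) - F((ln 7.0 - alpha)/beta)
p = F((log(7.5) - alpha) / beta) - F((log(7.0) - alpha) / beta)
# close to 0.2646
```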
5.7 The Gamma Distribution

The probability density of the gamma distribution is given by

f(x) = (1/(β^α Γ(α))) x^{α−1} e^{−x/β} for x > 0, α > 0, β > 0;  0 elsewhere

where Γ(α) is a value of the gamma function, defined by

Γ(α) = ∫_0^{∞} x^{α−1} e^{−x} dx

The parameter α is known as the shape parameter and the
parameter β as the scale parameter.
The above improper integral exists (converges) whenever α > 0.
5.7 The Gamma Distribution

• Γ(α) = (α − 1)Γ(α − 1) for any α > 1.
Proof: Integrating by parts,
Γ(α) = [−e^{−x} x^{α−1}]_0^{∞} + (α − 1) ∫_0^{∞} e^{−x} x^{α−2} dx
     = 0 + (α − 1) Γ(α − 1)
• Γ(α) = (α − 1)! when α is a positive integer.
Proof: Γ(α) = (α − 1)Γ(α − 1) = (α − 1)(α − 2)Γ(α − 2) = …
     = (α − 1)(α − 2) ⋯ Γ(1) = (α − 1)!
• Homework: Γ(1) = 1 and Γ(½) = √π
The Gamma Distribution (cont’d)

Figure: Graphs of some gamma probability density functions (α = 1, β = 1; α = 1, β = 2; α = 2, β = 3).

The Gamma Distribution (cont’d)

Mean of gamma distribution: µ = αβ


Proof: Substituting y = x/β,
$$\mu=\frac{1}{\beta^{\alpha}\Gamma(\alpha)}\int_{0}^{\infty}x\cdot x^{\alpha-1}e^{-x/\beta}\,dx = \frac{\beta}{\Gamma(\alpha)}\int_{0}^{\infty}y^{\alpha}e^{-y}\,dy = \frac{\beta\,\Gamma(\alpha+1)}{\Gamma(\alpha)}$$
Using the identity Γ(α + 1) = αΓ(α), we get µ = αβ.
The Gamma Distribution (cont’d)

Variance of gamma distribution:


$$\sigma^2=\alpha\beta^2$$
Proof: Substituting y = x/β,
$$\mu_2'=\frac{1}{\beta^{\alpha}\Gamma(\alpha)}\int_{0}^{\infty}x^{2}\cdot x^{\alpha-1}e^{-x/\beta}\,dx = \frac{\beta^{2}}{\Gamma(\alpha)}\int_{0}^{\infty}y^{\alpha+1}e^{-y}\,dy = \frac{\beta^{2}\,\Gamma(\alpha+2)}{\Gamma(\alpha)}=\alpha(\alpha+1)\beta^{2}$$
Hence
$$\sigma^2=\mu_2'-\mu^2=\alpha(\alpha+1)\beta^{2}-\alpha^{2}\beta^{2}=\alpha\beta^{2}.$$
The Gamma Distribution (cont’d)

Exponential Distribution: The density function of exponential


distribution is given by
$$f(x)=\begin{cases}\dfrac{1}{\beta}\,e^{-x/\beta} & \text{for } x>0,\ \beta>0\\[4pt] 0 & \text{elsewhere}\end{cases}$$

which is the special case of the gamma distribution with α = 1.
Mean and variance of the exponential distribution are given by
$$\mu=\beta \quad\text{and}\quad \sigma^2=\beta^2$$

The Gamma Distribution (cont’d)

Application of Exponential Distribution


In Poisson processes (e.g., arrival times of telephone calls, or bus arrivals at a bus stop), the waiting time between successive arrivals has an exponential distribution. If in a Poisson process the mean arrival rate (average number of arrivals per unit time) is α, then the time until the first arrival, or the waiting time between successive arrivals, has an exponential distribution with β = 1/α.
Proof:
$$P(x \text{ arrivals in the time interval } T)=f(x;\lambda)=\frac{\lambda^{x}e^{-\lambda}}{x!}$$
where λ = αT.
The Gamma Distribution (cont’d)

P(waiting time between successive arrivals is at least t)
= P(no arrivals during a time interval of length t)
= f(0; λ) where λ = αt
$$=\frac{(\alpha t)^{0}e^{-\alpha t}}{0!}=e^{-\alpha t}$$
P(waiting time between successive arrivals < t)
$$=1-e^{-\alpha t}=F(t)$$

The Gamma Distribution (cont’d)
So if the waiting time between successive arrivals is a random variable with distribution function
$$F(t)=1-e^{-\alpha t}$$
the probability density of the waiting time between successive arrivals is given by
$$f(t)=\frac{d}{dt}F(t)=\alpha e^{-\alpha t}$$
which is an exponential distribution with β = 1/α.
Note: If X represents the waiting time to the first arrival, then
$$P(X>t)=1-P(X\le t)=1-F(t)=e^{-\alpha t}.$$
The Gamma Distribution (cont’d)

Example: Given that the switchboard of consultant’s office


receives on the average 0.6 calls per minute, find the
probabilities that the time between successive calls arriving
at the switchboard of the consulting firm will be
(a) less than 1/2 minute;
(b) more than 3 minutes.
Solution: α = 0.6; the waiting time t between successive calls arriving at the switchboard has an exponential distribution with β = 1/0.6, hence the density function is given by

The Gamma Distribution (cont’d)

$$f(t)=\begin{cases}0.6\,e^{-0.6t} & \text{for } t>0\\ 0 & \text{elsewhere}\end{cases}$$

$$\text{(a)}\quad P\!\left(t<\tfrac{1}{2}\right)=\int_{0}^{1/2}0.6\,e^{-0.6t}\,dt=\left[-e^{-0.6t}\right]_{0}^{1/2}=1-e^{-0.3}.$$

$$\text{(b)}\quad P(t>3)=\int_{3}^{\infty}0.6\,e^{-0.6t}\,dt=\left[-e^{-0.6t}\right]_{3}^{\infty}=e^{-1.8}.$$
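The two probabilities can be checked with a few lines of Python (a sketch using the exponential CDF F(t) = 1 − e^(−αt); standard library only):

```python
import math

# Sketch: the switchboard probabilities from the exponential CDF
# F(t) = 1 - exp(-alpha * t), with alpha = 0.6 calls per minute.
def expo_cdf(t, alpha):
    return 1 - math.exp(-alpha * t)

alpha = 0.6
p_less_half = expo_cdf(0.5, alpha)        # P(t < 1/2) = 1 - e^(-0.3)
p_more_three = 1 - expo_cdf(3.0, alpha)   # P(t > 3)   = e^(-1.8)
```

This gives approximately 0.2592 and 0.1653 respectively.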
The Gamma Distribution (cont’d)

Memoryless Property of Exponential Distribution:


Let X be a r.v. that has an exponential distribution, and let s, t ≥ 0. Then
$$P\{X>t+s \mid X>s\}=\frac{P\{X>t+s\}}{P\{X>s\}}=\frac{e^{-(t+s)/\beta}}{e^{-s/\beta}}=e^{-t/\beta}=P\{X>t\}$$
since the event {X > t + s} ⊂ {X > s}. If X represents the lifetime of an equipment, then the above equation states that if the equipment has been working for time s, then the probability that it will survive an additional time t depends only on t
The Gamma Distribution (cont’d)

(not on s) and is identical to the probability of


survival for time t of a new piece of equipment.
In that sense, the equipment does not remember
that it has been in use for time s.
NOTES: (1) The memoryless property simplifies
many calculations and is mainly the reason for wide
applicability of the exponential model.
(2) Under this model, an item that has not failed so far is as good as new.

The Gamma Distribution (cont’d)

Example: Suppose the life length of a machine has an exponential distribution with β = 10 years. A machine that has already been used for 7 years is bought by someone. What is the probability that it will not fail in the next 5 years?
Solution: Because of the memoryless property, it is irrelevant how many years the machine has been in service prior to its purchase.
The Gamma Distribution (cont’d)

Let X be a r.v. that represents the length of the lifetime of the machine. So the density function is
$$f(t)=\begin{cases}0.1\,e^{-0.1t} & \text{for } t>0\\ 0 & \text{elsewhere}\end{cases}$$
Here, s = 7 is its actual life duration up to the present time instant. Then,
$$P\{X>s+5 \mid X>s\}=P\{X>5\}=e^{-0.1(5)}=e^{-0.5}\approx 0.607.$$
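The memoryless property itself is easy to verify numerically; the sketch below checks P(X > s + t | X > s) = P(X > t) for the β = 10 case (function names are illustrative):

```python
import math

# Sketch: numerical check of the memoryless property for beta = 10.
beta = 10.0

def surv(t):
    """Survival function P(X > t) = exp(-t/beta)."""
    return math.exp(-t / beta)

s, t = 7.0, 5.0
# P(X > s + t | X > s) = P(X > s + t) / P(X > s) should equal P(X > t):
assert math.isclose(surv(s + t) / surv(s), surv(t))
```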
The Beta Distribution

Beta distribution: The probability density function of the beta


distribution is given by
$$f(x)=\begin{cases}\dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1} & \text{for } 0<x<1,\ \alpha>0,\ \beta>0\\[4pt] 0 & \text{elsewhere}\end{cases}$$

Mean and variance of the beta distribution:
$$\mu=\frac{\alpha}{\alpha+\beta} \quad\text{and}\quad \sigma^2=\frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}$$
The Beta Distribution (cont’d)
Beta Function:
$$B(m,n)=\int_{0}^{1}x^{m-1}(1-x)^{n-1}\,dx, \qquad m>0,\ n>0$$

Relation between the beta and gamma functions:
$$B(m,n)=\frac{\Gamma(m)\,\Gamma(n)}{\Gamma(m+n)}, \qquad B(m,n)=B(n,m)$$

The Beta Distribution (cont’d)

Figure: Graphs of some beta probability density functions for various (α, β).
The Beta Distribution (cont’d)

• The beta distribution can be used to model events which are constrained to take place within an interval defined by a minimum and maximum value. For this reason, it is used extensively in PERT (Program Evaluation and Review Technique), CPM (Critical Path Method) and other project management/control systems to describe the time to completion of a task.
• Beta distributions are used extensively in Bayesian statistics (Bayesian inference uses aspects of the scientific method, which involves collecting evidence that is meant to be consistent or inconsistent with a given hypothesis).

The Beta Distribution (cont’d)
Proof for mean
$$\mu=\int_{0}^{1}x\cdot\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}\,dx = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\int_{0}^{1}x^{\alpha+1-1}(1-x)^{\beta-1}\,dx$$
$$=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,B(\alpha+1,\beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\cdot\frac{\Gamma(\alpha+1)\,\Gamma(\beta)}{\Gamma(\alpha+\beta+1)}=\frac{\alpha}{\alpha+\beta}$$
The Beta Distribution (cont’d)
Proof for variance
$$\mu_2'=\int_{0}^{1}x^{2}\cdot\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}\,dx = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\int_{0}^{1}x^{\alpha+2-1}(1-x)^{\beta-1}\,dx$$
$$=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,B(\alpha+2,\beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\cdot\frac{\Gamma(\alpha+2)\,\Gamma(\beta)}{\Gamma(\alpha+\beta+2)}=\frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}$$
The Beta Distribution (cont’d)
Hence
$$\sigma^2=\mu_2'-\mu^2=\frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}-\frac{\alpha^{2}}{(\alpha+\beta)^{2}}=\frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}$$
Notes. (1) For α = 1 and β = 1, we obtain the uniform distribution on the interval from 0 to 1 as a special case.
(2) If α = β, then the density function is symmetric about 1/2 (red & purple plots in the figure).
The Beta Distribution (cont’d)

Example: If the annual proportion of erroneous income tax returns


filed with the IRS can be looked upon as a random variable
having a beta distribution with α = 2 and β = 9, what is the
probability that in any given year there will be fewer than 10%
erroneous returns?
Solution: The density function is given by
$$f(x)=\begin{cases}\dfrac{\Gamma(11)}{\Gamma(2)\,\Gamma(9)}\,x(1-x)^{8} & \text{for } 0<x<1\\[4pt] 0 & \text{elsewhere}\end{cases} = \begin{cases}90\,x(1-x)^{8} & \text{for } 0<x<1\\ 0 & \text{elsewhere}\end{cases}$$
The Beta Distribution (cont’d)
$$P(X<0.1)=90\int_{0}^{0.1}x(1-x)^{8}\,dx=90\int_{0}^{0.1}\left[1-(1-x)\right](1-x)^{8}\,dx$$
$$=90\left[\int_{0}^{0.1}(1-x)^{8}\,dx-\int_{0}^{0.1}(1-x)^{9}\,dx\right] = 90\left[\frac{(1-x)^{9}}{-9}\Big|_{0}^{0.1}+\frac{(1-x)^{10}}{10}\Big|_{0}^{0.1}\right]=0.2639.$$

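The closed-form value can be confirmed in one line of Python (a sketch; no external libraries assumed):

```python
# Sketch: the beta-distribution probability above in closed form,
# P(X < 0.1) = 90 * [(1 - 0.9^9)/9 - (1 - 0.9^10)/10].
p = 90 * ((1 - 0.9**9) / 9 - (1 - 0.9**10) / 10)
assert round(p, 4) == 0.2639
```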
The Weibull Distribution

The probability density function of the Weibull distribution is given by
$$f(x)=\begin{cases}\alpha\beta\,x^{\beta-1}e^{-\alpha x^{\beta}} & x>0,\ \alpha>0,\ \beta>0\\ 0 & \text{elsewhere}\end{cases}$$

$$P(X\le a)=\int_{0}^{a}\alpha\beta\,x^{\beta-1}e^{-\alpha x^{\beta}}\,dx=\int_{0}^{a^{\beta}}\alpha e^{-\alpha y}\,dy \quad (\text{put } y=x^{\beta})$$
$$=\left[-e^{-\alpha y}\right]_{0}^{a^{\beta}}=1-e^{-\alpha a^{\beta}}$$
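The CDF just derived can be cross-checked against a crude numerical integration of the density; this is only a sketch, with arbitrarily chosen parameter values:

```python
import math

# Sketch: check the Weibull CDF F(a) = 1 - exp(-alpha * a**beta)
# against midpoint-rule integration of the density (stdlib only).
def weibull_pdf(x, alpha, beta):
    return alpha * beta * x**(beta - 1) * math.exp(-alpha * x**beta)

def weibull_cdf(a, alpha, beta):
    return 1 - math.exp(-alpha * a**beta)

def midpoint(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

a, alpha, beta = 2.0, 0.5, 1.5     # illustrative values
numeric = midpoint(lambda x: weibull_pdf(x, alpha, beta), 0.0, a)
assert abs(weibull_cdf(a, alpha, beta) - numeric) < 1e-6
```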
The Weibull Distribution
(cont’d)
When β = 1, the Weibull distribution reduces to the exponential distribution.

Mean and variance of the Weibull distribution:
$$\mu=\alpha^{-1/\beta}\,\Gamma\!\left(1+\frac{1}{\beta}\right), \qquad \sigma^{2}=\alpha^{-2/\beta}\left[\Gamma\!\left(1+\frac{2}{\beta}\right)-\left\{\Gamma\!\left(1+\frac{1}{\beta}\right)\right\}^{2}\right]$$
The Weibull Distribution
(cont’d)
Proof for mean: Substituting u = αx^β, so that x = (u/α)^{1/β},
$$\mu=\int_{0}^{\infty}x\cdot\alpha\beta\,x^{\beta-1}e^{-\alpha x^{\beta}}\,dx=\int_{0}^{\infty}\left(\frac{u}{\alpha}\right)^{1/\beta}e^{-u}\,du = \alpha^{-1/\beta}\int_{0}^{\infty}u^{1+\frac{1}{\beta}-1}e^{-u}\,du=\alpha^{-1/\beta}\,\Gamma\!\left(1+\frac{1}{\beta}\right)$$
Proof for variance: With the same substitution u = αx^β,
$$\mu_2'=\int_{0}^{\infty}x^{2}\cdot\alpha\beta\,x^{\beta-1}e^{-\alpha x^{\beta}}\,dx=\int_{0}^{\infty}\left(\frac{u}{\alpha}\right)^{2/\beta}e^{-u}\,du$$
The Weibull Distribution
(cont’d)
$$\mu_2'=\alpha^{-2/\beta}\int_{0}^{\infty}u^{1+\frac{2}{\beta}-1}e^{-u}\,du=\alpha^{-2/\beta}\,\Gamma\!\left(1+\frac{2}{\beta}\right)$$
Hence
$$\sigma^{2}=\mu_2'-\mu^{2}=\alpha^{-2/\beta}\,\Gamma\!\left(1+\frac{2}{\beta}\right)-\alpha^{-2/\beta}\left(\Gamma\!\left(1+\frac{1}{\beta}\right)\right)^{2} = \alpha^{-2/\beta}\left[\Gamma\!\left(1+\frac{2}{\beta}\right)-\left\{\Gamma\!\left(1+\frac{1}{\beta}\right)\right\}^{2}\right]$$
The Weibull Distribution (cont’d)

• The Weibull distribution is used to represent manufacturing and delivery times in industrial engineering problems, and it is very important in weather forecasting.
• The Weibull distribution is used for fading-channel modeling in wireless communications (fading refers to the time variation of the received signal power caused by changes in the transmission medium or path).
• The Weibull distribution is also commonly used to describe wind speed distributions.
• It is also a very popular statistical model in reliability engineering and failure analysis.

In the equation x² + 2x − Q = 0, Q is a random variable, uniformly distributed over the interval (0, 2). Find the distribution of the larger root.

$$x^2+2x-Q=0$$
The roots are:
$$x=\frac{-2\pm\sqrt{4+4Q}}{2}=-1\pm\sqrt{1+Q}$$
Let Y denote the larger root:
$$Y=g(Q)=-1+\sqrt{1+Q}$$
Now, Q is a random variable, uniformly distributed over the interval (0, 2). The density function for Q is given by:
$$f(q)=\begin{cases}\dfrac{1}{2} & 0<q<2\\ 0 & \text{elsewhere}\end{cases}$$
Since y = √(1 + q) − 1 and 0 < q < 2, we have
$$1<1+q<3 \;\Rightarrow\; 1<\sqrt{1+q}<\sqrt{3} \;\Rightarrow\; 0<\sqrt{1+q}-1<\sqrt{3}-1$$
$$\therefore\; 0<y<\sqrt{3}-1$$
Hence, the density function for y is given by
$$f(y)=f\!\left(g^{-1}(y)\right)\left|\frac{dq}{dy}\right|$$

Since y = √(1 + q) − 1,
$$\frac{dy}{dq}=\frac{1}{2\sqrt{1+q}} \quad\text{or}\quad \frac{dq}{dy}=2\sqrt{1+q}$$
Hence
$$f(y)=f(q)\cdot 2\sqrt{1+q}=\frac{1}{2}\cdot 2\sqrt{1+q}=\sqrt{1+q} \qquad \left[\text{but } \sqrt{1+q}=y+1\right]$$
$$\therefore\; f(y)=1+y$$
The density function for Y is given by:
$$f(y)=\begin{cases}y+1 & 0<y<\sqrt{3}-1\\ 0 & \text{otherwise}\end{cases}$$
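A Monte Carlo experiment gives an independent check of this result; the sketch below samples Q ~ U(0, 2), forms the larger root, and compares the empirical CDF with the exact CDF F(y) = y + y²/2 obtained by integrating f(y) = 1 + y (the sample size and seed are arbitrary):

```python
import math
import random

# Sketch: Monte Carlo check of f(y) = 1 + y on (0, sqrt(3) - 1).
random.seed(42)
n = 200_000
ys = [-1 + math.sqrt(1 + random.uniform(0, 2)) for _ in range(n)]

y0 = 0.5
empirical = sum(y <= y0 for y in ys) / n   # empirical CDF at y0
exact = y0 + y0**2 / 2                     # F(0.5) = 0.625
assert abs(empirical - exact) < 0.01
```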
5.10 Joint Distributions—Discrete
and Continuous
• In many statistical investigations, one is frequently interested in studying the relationship between two or more r.v.'s, such as the relationship between annual income and yearly savings per family or the relationship between occupation and hypertension.
• In this chapter, we consider n-dimensional vector-valued r.v.'s; however, we start with the case of n = 2. We will study them simultaneously in order to determine not only their individual behavior but also the degree of relationship between them.

5.10 Joint Distributions—Discrete
and Continuous (cont’d)
Discrete Variables
• For two discrete random variables X1 and X2, the probability that X1 will take the value x1 and X2 will take the value x2 is written as P(X1 = x1, X2 = x2).
• Consequently, P(X1 = x1, X2 = x2) is the probability of the intersection of the events X1 = x1 and X2 = x2.
• If X1 and X2 are discrete random variables, the function given by f(x1, x2) = P(X1 = x1, X2 = x2) for each pair of values (x1, x2) within the range of X1 and X2 is called the joint probability distribution of X1 and X2.

Joint Distributions—Discrete and
Continuous (cont’d)
• The distribution of probability is specified by listing the probabilities associated with all possible pairs of values x1 and x2, either by formula or in a table.
• A function of two variables can serve as the joint probability distribution of a pair of discrete random variables X1 and X2 if and only if its values, f(x1, x2), satisfy the conditions
1. f(x1, x2) ≥ 0 for each pair of values (x1, x2) within its domain;
2. $\sum_{x_1}\sum_{x_2}f(x_1,x_2)=1$, where the double summation extends over all possible pairs (x1, x2) within its domain.
Joint Distributions—Discrete and
Continuous (cont’d)
Joint distribution function: If X1 and X2 are discrete random variables, the function given by
$$F(x_1,x_2)=P(X_1\le x_1,\ X_2\le x_2)=\sum_{s\le x_1}\sum_{t\le x_2}f(s,t)$$
for −∞ < x1 < ∞, −∞ < x2 < ∞, where f(s, t) is the value of the joint probability distribution of X1 and X2 at (s, t), is called the joint distribution function, or the joint cumulative distribution, of X1 and X2.

Joint Distributions—Discrete and
Continuous (cont’d)
Marginal distribution: If X1 and X2 are discrete random variables and f(x1, x2) is the value of their joint probability distribution at (x1, x2), the function given by
$$P(X_1=x_1)=f_1(x_1)=\sum_{x_2}f(x_1,x_2)$$
for each x1 within the range of X1 is called the marginal distribution of X1. Correspondingly, the function given by
$$P(X_2=x_2)=f_2(x_2)=\sum_{x_1}f(x_1,x_2)$$
Joint Distributions—Discrete and
Continuous (cont’d)
for each x2 within the range of X2 is called the marginal distribution of X2.
Conditional probability distribution: Consistent with the definition of conditional probability of events, where A is the event X1 = x1 and B is the event X2 = x2, the conditional probability distribution of X1 given X2 = x2 is defined as
$$f_1(x_1\mid x_2)=\frac{f(x_1,x_2)}{f_2(x_2)} \quad\text{for all } x_1, \text{ provided } f_2(x_2)\ne 0$$
Similarly, the conditional probability distribution of X2 given X1 = x1 is defined as
$$f_2(x_2\mid x_1)=\frac{f(x_1,x_2)}{f_1(x_1)} \quad\text{for all } x_2, \text{ provided } f_1(x_1)\ne 0$$
Joint Distributions—Discrete and
Continuous (cont’d)
If f1(x1 | x2) = f1(x1) for all x1 and x2, so that the conditional probability distribution of X1 is free of x2, or equivalently, if
f(x1, x2) = f1(x1) f2(x2) for all x1, x2,
the two random variables are independent.
Example: Two scanners are needed for an experiment. Of the five, two
have electronic defects, another one has a defect in memory, and two
are in good working order. Two units are selected at random.
(a) Find the joint probability distribution of X1 = the number with
electronic defects and X2 = the number with a defect in memory.
(b) Find the probability of 0 and 1 total defects among the two selected.
(c) Find the marginal probability distribution of X1.
(d) Find the conditional probability distribution of X1 given X2 = 0.
Solution: (a) The joint probability distribution of X1 and X2 is given by
Joint Distributions—Discrete and
Continuous (cont’d)
$$f(x_1,x_2)=\frac{\dbinom{2}{x_1}\dbinom{1}{x_2}\dbinom{2}{2-x_1-x_2}}{\dbinom{5}{2}} \quad\text{where } x_1=0,1,2,\ x_2=0,1 \text{ and } 0\le x_1+x_2\le 2$$
The joint probability distribution f(x1, x2) of X1 and X2 can be summarized in the following table:

f(x1, x2)     x2 = 0    x2 = 1
x1 = 0         0.1       0.2
x1 = 1         0.4       0.2
x1 = 2         0.1       0.0

Joint Distributions—Discrete and
Continuous (cont’d)
(b) Let A be the event that X1 + X2 equals 0 or 1. Then
P(A) = f(0, 0) + f(0, 1) + f(1, 0) = 0.1 + 0.2 + 0.4 = 0.7
(c) The marginal probability distribution of X1 is given by
$$f_1(x_1)=\sum_{x_2}f(x_1,x_2)=f(x_1,0)+f(x_1,1)$$
f1(0) = f(0, 0) + f(0, 1) = 0.1 + 0.2 = 0.3
f1(1) = f(1, 0) + f(1, 1) = 0.4 + 0.2 = 0.6
f1(2) = f(2, 0) + f(2, 1) = 0.1 + 0.0 = 0.1

Joint Distributions—Discrete and
Continuous (cont’d)
f(x1, x2)     x2 = 0    x2 = 1    f1(x1)
x1 = 0         0.1       0.2       0.3
x1 = 1         0.4       0.2       0.6
x1 = 2         0.1       0.0       0.1
f2(x2)         0.6       0.4       1

(d) The conditional probability distribution of X1 given X2 = 0 is
$$f_1(x_1\mid 0)=\frac{f(x_1,0)}{f_2(0)} \quad\text{for } x_1=0,1,2$$
Joint Distributions—Discrete and
Continuous (cont’d)
$$f_2(0)=\sum_{x_1}f(x_1,0)=f(0,0)+f(1,0)+f(2,0)=0.1+0.4+0.1=0.6$$
$$f_1(0\mid 0)=\frac{f(0,0)}{f_2(0)}=\frac{0.1}{0.6}=\frac{1}{6}, \qquad f_1(1\mid 0)=\frac{f(1,0)}{f_2(0)}=\frac{0.4}{0.6}=\frac{4}{6}, \qquad f_1(2\mid 0)=\frac{f(2,0)}{f_2(0)}=\frac{0.1}{0.6}=\frac{1}{6}$$
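The whole scanner example can be reproduced from the counting argument; the sketch below (assuming Python 3.8+ for `math.comb`) builds the joint table, the marginal of X1, and the conditional distribution given X2 = 0 as exact fractions:

```python
from fractions import Fraction
from math import comb  # Python 3.8+

# Sketch of the scanner example: 2 units with electronic defects,
# 1 with a memory defect, 2 good; 2 units selected at random.
def f(x1, x2):
    """Joint pmf f(x1, x2) = C(2,x1) C(1,x2) C(2, 2-x1-x2) / C(5,2)."""
    k = 2 - x1 - x2                  # number of good units selected
    if 0 <= k <= 2:
        return Fraction(comb(2, x1) * comb(1, x2) * comb(2, k), comb(5, 2))
    return Fraction(0)

joint = {(x1, x2): f(x1, x2) for x1 in range(3) for x2 in range(2)}
f1 = {x1: f(x1, 0) + f(x1, 1) for x1 in range(3)}     # marginal of X1
f2_0 = sum(f(x1, 0) for x1 in range(3))               # f2(0) = 3/5
cond = {x1: f(x1, 0) / f2_0 for x1 in range(3)}       # f1(x1 | 0)
```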
Joint Distributions—Discrete and
Continuous (cont’d)
All the definitions concerned with two random
variables X1 and X2 can be generalized for k random
variables X1, X2, …, Xk.
Let x1 be a possible value for the first random variable X1, x2 a possible value for the second random variable X2, and so on, with xk a possible value for the k-th random variable Xk. The values of the joint probability distribution of k discrete random variables X1, X2, …, Xk are given by
f(x1, x2, …, xk) = P(X1 = x1, X2 = x2, …, Xk = xk)
Joint Distributions—Discrete and
Continuous (cont’d)

Continuous Variables
There are many situations in which we describe an outcome by giving the values of several continuous variables. For instance, we may measure the weight and the hardness of a rock; the volume, pressure and temperature of a gas; or the thickness, color, compressive strength and potassium content of a piece of glass.

Joint Distributions—Discrete and
Continuous (cont’d)
Example 1: If two random variables have the joint density
$$f(x_1,x_2)=\begin{cases}x_1x_2 & \text{for } 0<x_1<1,\ 0<x_2<2\\ 0 & \text{elsewhere}\end{cases}$$
find the probabilities that
(a) both random variables will take on values less than 1;
(b) the sum of the values taken on by the two random variables will be less than 1.
Joint Distributions—Discrete and
Continuous (cont’d)
Solution:
$$\text{(a)}\quad P(X_1<1,\ X_2<1)=\int_{-\infty}^{1}\int_{-\infty}^{1}f(x_1,x_2)\,dx_1\,dx_2 = \int_{0}^{1}\int_{0}^{1}x_1x_2\,dx_1\,dx_2$$
$$=\int_{0}^{1}x_2\,\frac{x_1^{2}}{2}\Big|_{0}^{1}\,dx_2=\frac{1}{2}\int_{0}^{1}x_2\,dx_2 = \frac{1}{2}\cdot\frac{x_2^{2}}{2}\Big|_{0}^{1}=\frac{1}{4}.$$

Joint Distributions—Discrete and
Continuous (cont’d)
$$\text{(b)}\quad P(X_1+X_2<1)=\int_{-\infty}^{1}\int_{-\infty}^{1-x_1}f(x_1,x_2)\,dx_2\,dx_1 = \int_{0}^{1}\int_{0}^{1-x_1}x_1x_2\,dx_2\,dx_1$$
(the region of integration is the triangle below the line x1 + x2 = 1)

$$=\int_{0}^{1}x_1\,\frac{x_2^{2}}{2}\Big|_{0}^{1-x_1}\,dx_1=\frac{1}{2}\int_{0}^{1}x_1(1-x_1)^{2}\,dx_1 = \frac{1}{2}\int_{0}^{1}\left(x_1-2x_1^{2}+x_1^{3}\right)dx_1$$
$$=\frac{1}{2}\left[\frac{x_1^{2}}{2}-\frac{2x_1^{3}}{3}+\frac{x_1^{4}}{4}\right]_{0}^{1}=\frac{1}{24}.$$
Problem

With reference to the previous example, find the probability that the sum of the values taken on by the two random variables will be greater than 1.

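All three probabilities — parts (a), (b), and the problem above — can be checked by Monte Carlo. The sketch below exploits the fact that f(x1, x2) = x1x2 factors into the marginals 2x1 and x2/2, so X1 and X2 are independent and each can be sampled by inverse transform (sample size and seed are arbitrary):

```python
import math
import random

# Sketch: Monte Carlo check for f(x1, x2) = x1*x2 on 0<x1<1, 0<x2<2.
# Marginals: f1(x1) = 2*x1 with CDF x1^2, so X1 = sqrt(U);
#            f2(x2) = x2/2 with CDF x2^2/4, so X2 = 2*sqrt(U).
random.seed(1)
n = 200_000
pairs = [(math.sqrt(random.random()), 2 * math.sqrt(random.random()))
         for _ in range(n)]

p_both_lt_1 = sum(x1 < 1 and x2 < 1 for x1, x2 in pairs) / n  # exact: 1/4
p_sum_lt_1 = sum(x1 + x2 < 1 for x1, x2 in pairs) / n         # exact: 1/24
p_sum_gt_1 = 1 - p_sum_lt_1                                   # exact: 23/24
```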
Joint Distributions—Discrete and
Continuous (cont’d)
Example 2: With reference to Example 1, find the marginal densities of the two random variables.
Solution: The marginal density of X1 is given by
$$f_1(x_1)=\int_{-\infty}^{\infty}f(x_1,x_2)\,dx_2=\int_{0}^{2}x_1x_2\,dx_2=x_1\,\frac{x_2^{2}}{2}\Big|_{0}^{2}=2x_1, \quad 0<x_1<1.$$
The marginal density of X2 is given by
$$f_2(x_2)=\int_{-\infty}^{\infty}f(x_1,x_2)\,dx_1=\int_{0}^{1}x_1x_2\,dx_1=x_2\,\frac{x_1^{2}}{2}\Big|_{0}^{1}=\frac{x_2}{2}, \quad 0<x_2<2.$$
Joint Distributions—Discrete and
Continuous (cont’d)

Conditional probability density
If f(x1, x2) is the joint density of the continuous random variables X1 and X2 and f2(x2) is the marginal density of X2, the conditional probability density of X1 given X2 = x2 is given by
$$f_1(x_1\mid x_2)=\frac{f(x_1,x_2)}{f_2(x_2)} \quad\text{provided } f_2(x_2)\ne 0, \text{ for } -\infty<x_1<\infty.$$
Joint Distributions—Discrete and
Continuous (cont’d)
Correspondingly, if f1(x1) is the marginal density of X1, the conditional probability density of X2 given X1 = x1 is given by
$$f_2(x_2\mid x_1)=\frac{f(x_1,x_2)}{f_1(x_1)} \quad\text{provided } f_1(x_1)\ne 0, \text{ for } -\infty<x_2<\infty.$$
Joint Distributions—Discrete and
Continuous (cont’d)
Example 17: If two random variables have the joint density
$$f(x,y)=\begin{cases}\dfrac{6}{5}\left(x+y^{2}\right) & \text{for } 0<x<1,\ 0<y<1\\ 0 & \text{elsewhere}\end{cases}$$
find
(a) an expression for f1(x | y) for 0 < y < 1;
(b) an expression for f1(x | 0.5);
(c) the mean of the conditional density of X when Y = 0.5.
Joint Distributions—Discrete and
Continuous (cont’d)
$$\text{(a)}\quad f_1(x\mid y)=\frac{f(x,y)}{f_2(y)} \quad\text{provided } f_2(y)\ne 0$$
But for 0 < y < 1,
$$f_2(y)=\int_{-\infty}^{\infty}f(x,y)\,dx=\frac{6}{5}\int_{0}^{1}\left(x+y^{2}\right)dx=\frac{6}{5}\left(y^{2}+\frac{1}{2}\right)$$
and f2(y) = 0 elsewhere.

Joint Distributions—Discrete and
Continuous (cont’d)
Therefore for 0 < y < 1
$$f_1(x\mid y)=\begin{cases}\dfrac{x+y^{2}}{y^{2}+1/2} & 0<x<1\\ 0 & \text{elsewhere}\end{cases}$$

$$\text{(b)}\quad f_1(x\mid 0.5)=\begin{cases}\dfrac{x+0.25}{0.75} & 0<x<1\\ 0 & \text{elsewhere}\end{cases}$$
Joint Distributions—Discrete and
Continuous (cont’d)
$$f_1(x\mid 0.5)=\begin{cases}\dfrac{4x+1}{3} & 0<x<1\\ 0 & \text{elsewhere}\end{cases}$$
(c) The mean of the conditional density of X when Y = 0.5 is
$$\int_{0}^{1}x\,f_1(x\mid 0.5)\,dx=\int_{0}^{1}\frac{x(4x+1)}{3}\,dx = \left(\frac{4x^{3}}{9}+\frac{x^{2}}{6}\right)\Big|_{0}^{1}=\frac{11}{18}.$$
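The value 11/18 can be confirmed with a crude midpoint-rule integration (a stdlib-only sketch):

```python
# Sketch: numerical check that the conditional mean above equals 11/18,
# integrating x * f1(x | 0.5) over (0, 1) with a midpoint rule.
def cond_density(x):
    """f1(x | 0.5) = (4x + 1)/3 on 0 < x < 1."""
    return (4 * x + 1) / 3

n = 100_000
h = 1.0 / n
mids = [(i + 0.5) * h for i in range(n)]
mean = sum(x * cond_density(x) for x in mids) * h
assert abs(mean - 11 / 18) < 1e-6
```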
Joint Distributions—Discrete and
Continuous (cont’d)
Properties of Expectation
Consider a function g(X) of a single continuous random variable X. For instance, if X is an oven temperature in degrees centigrade, then g(X) = (9/5)X + 32 is the same temperature in degrees Fahrenheit.
If X has probability density function f, then the mean or expectation of g(X) is given by
$$E[g(X)]=\int_{-\infty}^{\infty}g(x)f(x)\,dx$$
In the discrete case, where X has probability distribution f,
$$E[g(X)]=\sum_{x_i}g(x_i)f(x_i)$$
where xi is a possible value for X.
Joint Distributions—Discrete and
Continuous (cont’d)
If X has mean µ = E(X), then taking g(x) = (x − µ)², we have E[g(X)] = E[(X − µ)²] = σ² (the variance of X).
For any random variable Y, let E(Y) denote its expectation, which is also its mean µY. Its variance is Var(Y), which is also written as σ²_Y.
When g(x) = ax + b, for given constants a and b, the random variable g(X) has expectation
$$E(aX+b)=\int_{-\infty}^{\infty}(ax+b)f(x)\,dx=a\int_{-\infty}^{\infty}xf(x)\,dx+b\int_{-\infty}^{\infty}f(x)\,dx=aE(X)+b=a\mu_X+b$$
and variance
$$\mathrm{Var}(aX+b)=\int_{-\infty}^{\infty}(ax+b-a\mu_X-b)^{2}f(x)\,dx=a^{2}\int_{-\infty}^{\infty}(x-\mu_X)^{2}f(x)\,dx=a^{2}\,\mathrm{Var}(X)$$
Joint Distributions—Discrete and
Continuous (cont’d)
For given constants a and b,
$$E(aX+b)=aE(X)+b \quad\text{and}\quad \mathrm{Var}(aX+b)=a^{2}\,\mathrm{Var}(X)$$
Given any collection of k random variables, the function Y = g(X1, X2, …, Xk) is also a random variable. For example, Y = X1 − X2 is a random variable when g(x1, x2) = x1 − x2, and Y = 2X1 + 3X2 is a random variable when g(x1, x2) = 2x1 + 3x2.
The random variable g(X1, X2, …, Xk) has expected value, or mean, given by
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}g(x_1,x_2,\ldots,x_k)\,f(x_1,x_2,\ldots,x_k)\,dx_1\,dx_2\cdots dx_k$$
or, in the discrete case,
$$\sum_{x_1}\sum_{x_2}\cdots\sum_{x_k}g(x_1,x_2,\ldots,x_k)\,f(x_1,x_2,\ldots,x_k)$$
where the summation is over all k-tuples (x1, x2, …, xk) of possible values.
Joint Distributions—Discrete and
Continuous (cont’d)
Let X1 and X2 be two random variables with means µ1 and µ2 respectively. Taking g(x1, x2) = (x1 − µ1)(x2 − µ2), we see that the product (x1 − µ1)(x2 − µ2) will be positive if both values x1 and x2 are above their respective means or both are below their respective means. Otherwise it will be negative.
The expected value E[(X1 − µ1)(X2 − µ2)] will tend to be positive when large X1 and X2 tend to occur together and small X1 and X2 tend to occur together with high probability.
This measure E[(X1 − µ1)(X2 − µ2)] of joint variation is called the population covariance of X1 and X2.
When X1 and X2 are independent, their covariance E[(X1 − µ1)(X2 − µ2)] = 0.

Joint Distributions—Discrete and
Continuous (cont’d)
Proof: If X1 and X2 are independent, then f(x1, x2) = f1(x1) f2(x2), so
$$E[(X_1-\mu_1)(X_2-\mu_2)]=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(x_1-\mu_1)(x_2-\mu_2)f(x_1,x_2)\,dx_1\,dx_2$$
$$=\int_{-\infty}^{\infty}(x_1-\mu_1)f_1(x_1)\,dx_1\cdot\int_{-\infty}^{\infty}(x_2-\mu_2)f_2(x_2)\,dx_2=0$$
The expectation of a linear combination of two independent random variables Y = a1X1 + a2X2 is
$$\mu_Y=E(Y)=E(a_1X_1+a_2X_2)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(a_1x_1+a_2x_2)f(x_1,x_2)\,dx_1\,dx_2$$
$$=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(a_1x_1+a_2x_2)f_1(x_1)f_2(x_2)\,dx_1\,dx_2 \quad(\text{since } X_1 \text{ and } X_2 \text{ are independent})$$

Joint Distributions—Discrete and
Continuous (cont’d)
$$\mu_Y=a_1\int_{-\infty}^{\infty}x_1f_1(x_1)\,dx_1\int_{-\infty}^{\infty}f_2(x_2)\,dx_2+a_2\int_{-\infty}^{\infty}f_1(x_1)\,dx_1\int_{-\infty}^{\infty}x_2f_2(x_2)\,dx_2 = a_1E(X_1)+a_2E(X_2)=a_1\mu_1+a_2\mu_2$$
Note: This result holds even if the two random variables are not independent.
$$\mathrm{Var}(Y)=E(Y-\mu_Y)^{2}=E[(a_1X_1+a_2X_2-a_1\mu_1-a_2\mu_2)^{2}] = E[(a_1(X_1-\mu_1)+a_2(X_2-\mu_2))^{2}]$$
$$=a_1^{2}E[(X_1-\mu_1)^{2}]+a_2^{2}E[(X_2-\mu_2)^{2}]+2a_1a_2E[(X_1-\mu_1)(X_2-\mu_2)] = a_1^{2}\,\mathrm{Var}(X_1)+a_2^{2}\,\mathrm{Var}(X_2)$$
since the third term is zero because X1 and X2 are independent.
Joint Distributions—Discrete and
Continuous (cont’d)
• These properties hold for any number of variables, whether they are continuous or discrete.
The mean and variance of linear combinations
Let Xi have mean µi and variance σi² for i = 1, 2, …, k. The linear combination Y = a1X1 + a2X2 + ⋯ + akXk has
E(a1X1 + a2X2 + ⋯ + akXk) = a1E(X1) + a2E(X2) + ⋯ + akE(Xk), or
$$\mu_Y=\sum_{i=1}^{k}a_i\mu_i$$
When the random variables are independent,
Var(a1X1 + a2X2 + ⋯ + akXk) = a1²Var(X1) + a2²Var(X2) + ⋯ + ak²Var(Xk), or
$$\sigma_Y^{2}=\sum_{i=1}^{k}a_i^{2}\sigma_i^{2}$$
Joint Distributions—Discrete and
Continuous (cont’d)
Example: If X1 has mean 4 and variance 9 while X2 has mean –2 and
variance 5, and the two are independent, find
(a) E(2X1 + X2 – 5)
(b) Var(2X1 + X2 – 5)
Solution:
(a) E(2X1 + X2 – 5) = E(2X1 + X2) – 5 = 2E(X1) +E(X2) – 5
= 2(4) + ( - 2) – 5 = 1.
(b) Var (2X1 + X2 – 5) = Var(2X1 + X2) = 22 Var(X1) + Var(X2)
= 22(9) + 5 = 41.

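A Monte Carlo check of this example: the distributions of X1 and X2 are not specified (only their means and variances), so normal variates are assumed here purely for illustration — the two identities hold for any distributions:

```python
import math
import random

# Sketch: Monte Carlo confirmation of E(2X1 + X2 - 5) = 1 and
# Var(2X1 + X2 - 5) = 41, with X1 ~ N(4, 9), X2 ~ N(-2, 5) assumed.
random.seed(7)
n = 200_000
ys = [2 * random.gauss(4, 3) + random.gauss(-2, math.sqrt(5)) - 5
      for _ in range(n)]

mean = sum(ys) / n
var = sum((y - mean) ** 2 for y in ys) / n
assert abs(mean - 1) < 0.1
assert abs(var - 41) < 1.0
```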
5.13 Simulation
Simulation is a technique of manipulating a model of a system through a
process of imitation.
To simulate the observation of continuous random variables we usually
start with uniform random numbers and relate these to the distribution
function of interest.
Let X be a continuous random variable with cumulative distribution function F(x); then U = F(X) is uniformly distributed on [0, 1]. So to find a random observation x of X, we select an n-digit uniform random number u and solve the equation
u = F(x) for x as x = F⁻¹(u).
Further, to generate a random sample of size r from X, we take a sequence
of r independent n-digit uniform random numbers say u1, u2, …., ur, and
then generate x1, x2, …., xr where
xi = F -1(ui); i = 1, 2, …..,r.
Simulation (cont’d)

Uniform random numbers: Ideally speaking, a uniform random number u is a random observation from the uniform distribution on [0, 1]. This can be done as follows:
Let u = .d1d2…
where the digits d1, d2, … are independent and each di is chosen giving equal chance to the 10 digits 0, 1, 2, …, 9. We call u a uniform random number.
Since u has infinitely many digits, the generation of u is impracticable. Hence we take its discretised approximation, the n-digit uniform random number
u = .d1d2…dn
where d1, d2, …, dn are independently chosen such that each digit di is chosen at random from the digits 0, 1, 2, …, 9, or equivalently u is chosen at random giving equal chance to the 10ⁿ numbers .00…0 to .99…9.

Simulation (cont’d)

Simulation of an observation from the exponential distribution
Let X be a random variable with an exponential distribution, i.e.
$$f(x)=\begin{cases}\dfrac{1}{\beta}\,e^{-x/\beta} & \text{for } x>0,\ \beta>0\\ 0 & \text{elsewhere.}\end{cases}$$
Its cumulative distribution function is given by
$$F(x)=\begin{cases}1-e^{-x/\beta}, & x>0\\ 0, & x\le 0\end{cases}$$
Hence select a uniform random number u and solve the equation
$$u=1-e^{-x/\beta}$$
Simulation (cont’d)
We get x = −β ln(1 − u).
Example: Let β = 2 and u = .788429, a 6-digit uniform random number; then
x = −2 ln(1 − u) = −2 ln(1 − 0.788429) = 3.10639
Note: If U is uniformly distributed on [0, 1], then 1 − U is also uniformly distributed on [0, 1]. Hence 1 − u can be replaced by u:
$$\therefore\; x=-\beta\ln u=\beta\ln\frac{1}{u}$$
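The worked example can be reproduced directly (a sketch of inverse-transform sampling; standard library only):

```python
import math

# Sketch: inverse-transform sampling for the exponential distribution,
# reproducing the worked example (beta = 2, u = 0.788429).
def exp_inverse(u, beta):
    """Solve u = 1 - exp(-x/beta) for x."""
    return -beta * math.log(1 - u)

x = exp_inverse(0.788429, 2.0)
assert abs(x - 3.10639) < 1e-4
```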
Simulation (cont’d)
Further, to generate a random sample of size n from X, we take n independent n-digit random numbers u1, u2, …, un, and then generate x1, x2, …, xn as
$$x_i=\beta\ln\!\left(\frac{1}{u_i}\right), \quad i=1,2,\ldots,n.$$
{x1, x2, …, xn} is then the required random sample from X, whose distribution is exponential with parameter β.

Simulation of an observation from the normal distribution
To simulate values from the normal distribution with a specified µ and σ², consider the relation
$$z=\frac{x-\mu}{\sigma} \;\Rightarrow\; x=\mu+\sigma z$$
Simulation (cont’d)
so a value of x can be calculated from the value of a standard normal variable z. The value of z can be obtained from the value of a uniform variable u by numerically solving the equation u = F(z).
Another approach, called the Box–Muller method, is almost universally preferred. In this we start with a pair of independent uniform variables (u1, u2) and produce two standard normal variables
$$z_1=\sqrt{-2\ln(u_2)}\,\cos(2\pi u_1)$$
$$z_2=\sqrt{-2\ln(u_2)}\,\sin(2\pi u_1)$$
where the angle is expressed in radians. Then
$$x_1=\mu+\sigma z_1 \quad\text{and}\quad x_2=\mu+\sigma z_2$$
are treated as two independent observations of normal random variables.
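The Box–Muller construction can be sketched as follows; by construction each generated pair satisfies z1² + z2² = −2 ln u2, which the code checks (function names are illustrative):

```python
import math

# Sketch of the Box-Muller method: one pair of independent uniforms
# (u1, u2) yields two independent standard normal variates (z1, z2).
def box_muller(u1, u2):
    r = math.sqrt(-2.0 * math.log(u2))
    theta = 2.0 * math.pi * u1
    return r * math.cos(theta), r * math.sin(theta)

def normal_pair(mu, sigma, u1, u2):
    """Two observations x = mu + sigma*z from one uniform pair."""
    z1, z2 = box_muller(u1, u2)
    return mu + sigma * z1, mu + sigma * z2

# By construction z1^2 + z2^2 = -2 ln(u2):
z1, z2 = box_muller(0.885090, 0.907445)
assert math.isclose(z1**2 + z2**2, -2.0 * math.log(0.907445))
```

With µ = 5.8 and σ = 1.2, `normal_pair` produces learning-time pairs of the kind tabulated in the example that follows.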
Simulation (cont’d)
Example: Suppose the time it takes a person to learn how to operate a certain machine is a random variable having a normal distribution with µ = 5.8 and σ = 1.2. Suppose it takes two persons to operate the machine. Simulate the time it takes four pairs of persons to learn how to operate the machine. That is, for each pair, calculate the maximum of the two learning times.
Solution: We use the Box–Muller method, i.e. we consider 4 pairs of (u1, u2) and calculate two standard normal variables (z1, z2) for each pair from the relations
$$z_1=\sqrt{-2\ln(u_2)}\,\cos(2\pi u_1), \qquad z_2=\sqrt{-2\ln(u_2)}\,\sin(2\pi u_1)$$
Then we calculate (x1, x2) from
$$x_1=\mu+\sigma z_1 \quad\text{and}\quad x_2=\mu+\sigma z_2$$
and see which one is larger.
Simulation (cont’d)

So, we have

u1          u2          z1          z2          x1        x2
0.885090    0.907445    0.302728    -0.074016   6.16327   5.71118
0.604684    0.656732    -0.357316   -0.541115   5.37122   5.15066
0.337516    0.592905    -0.484167   0.536939    5.21900   6.44433
0.389570    0.366152    -0.888335   0.464292    4.73400   6.35715

The simulated times it takes the four pairs to learn how to operate the machine are 6.16327, 5.37122, 6.44433 and 6.35715.

Department of Mathematics,
4-Apr-08 BITS Pilani, Goa Campus 159
Simulation (cont’d)
Simulation of an observation from the uniform distribution on [a, b]
The density function of the uniform distribution is given by
$$f(x)=\begin{cases}\dfrac{1}{b-a} & \text{for } a<x<b\\ 0 & \text{elsewhere}\end{cases}$$
The cumulative distribution function is
$$F(x)=\begin{cases}0 & \text{for } x\le a\\[2pt] \dfrac{x-a}{b-a} & \text{for } a<x<b\\[2pt] 1 & \text{for } x\ge b\end{cases}$$
Solving u = F(x), we get x = a + (b − a)u.

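The inverse-transform step for the uniform case is a one-liner (illustrative values only):

```python
# Sketch: inverse-transform sampling for the uniform distribution on
# [a, b]: solving u = (x - a)/(b - a) gives x = a + (b - a)*u.
def uniform_inverse(u, a, b):
    return a + (b - a) * u

x = uniform_inverse(0.25, 2.0, 10.0)
assert x == 4.0
```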