
Discrete Random Variable

MATH142-3
Engineering Data Analysis
Course Outcome
Compute the probability distribution of a random variable for
both discrete and continuous data.
Learning Objectives
At the end of the lesson, the students are expected to:
• Determine probabilities from probability mass functions and the reverse;
• Determine probabilities from cumulative distribution functions and cumulative distribution functions from probability mass functions, and the reverse;
• Calculate the means and variances for discrete random variables;
• Understand the assumptions for some common discrete probability distributions;
• Select an appropriate discrete probability distribution to calculate probabilities in specific applications; and
• Calculate probabilities and determine means and variances for some common discrete probability distributions.
Random Variables
A random variable is a function that assigns a real number to each outcome in the sample space of a random experiment.

Notation
A random variable is denoted by an uppercase letter such as X. The measured value is denoted by a lowercase letter, for example x = 70 milliamperes. For multiple values, subscripts are used (x1, x2, x3, …).

Discrete
A random variable with a finite (or countably infinite) range.
• Examples: number of scratches on a surface, proportion of defective parts among 1000 tested, number of transmitted bits in error

Continuous
A random variable with an interval (either finite or infinite) of real numbers for its range.
• Examples: electrical current, length, pressure, temperature, time, voltage, weight
Random Variables
2-182/58 Decide whether a discrete or continuous random variable is the best model for each of the following variables:
(a) The time until a projectile returns to earth.
(b) The number of times a transistor in a computer memory changes state in one operation.
(c) The volume of gasoline that is lost to evaporation during the filling of a gas tank.
(d) The outside diameter of a machined shaft.
(e) The number of cracks exceeding 1.2 cm in 16 km of an interstate highway.

2-183/58 Decide whether a discrete or continuous random variable is the best model for each of the following variables:
(b) The weight of an injection-molded plastic part.
(c) The number of molecules in a sample of gas.
(d) The concentration of output from a reactor.
(e) The current in an electronic circuit.
Discrete Random Variable
PROBABILITY MASS FUNCTION
For a discrete random variable X with possible values x1, x2, …, xn, a probability mass function is a function such that
(1) $f(x_i) \ge 0$
(2) $\sum_{i=1}^{n} f(x_i) = 1$
(3) $f(x_i) = P(X = x_i)$
(3-1)

CUMULATIVE DISTRIBUTION FUNCTION
The cumulative distribution function of a discrete random variable X, denoted as F(x), is
$F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$.
For a discrete random variable X, F(x) satisfies the following properties.
(1) $F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$
(2) $0 \le F(x) \le 1$
(3) If $x \le y$, then $F(x) \le F(y)$
(3-2)
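The definitions (3-1) and (3-2) can be sketched in a few lines of Python. The PMF values below are hypothetical and chosen only for illustration, and the function names are not from the course material. The sketch checks the PMF properties, builds F(x) by summing f(xi) over xi ≤ x, and recovers the PMF from the jumps of F (the "reverse" direction).

```python
# Hypothetical PMF for illustration only: value -> probability.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

def is_valid_pmf(pmf, tol=1e-9):
    """Check properties (1) and (2): f(x_i) >= 0 and the f(x_i) sum to 1."""
    nonnegative = all(p >= 0 for p in pmf.values())
    sums_to_one = abs(sum(pmf.values()) - 1.0) < tol
    return nonnegative and sums_to_one

def cdf(pmf, x):
    """F(x) = P(X <= x): add f(x_i) over every x_i <= x."""
    return sum(p for xi, p in pmf.items() if xi <= x)

def pmf_from_cdf(F, values):
    """Recover f(x_i) as the size of the jump of F at each x_i."""
    recovered, previous = {}, 0.0
    for v in sorted(values):
        current = F(v)
        recovered[v] = current - previous
        previous = current
    return recovered

recovered = pmf_from_cdf(lambda x: cdf(pmf, x), pmf.keys())
```

Because F is a step function, cdf(pmf, 1.5) equals cdf(pmf, 1): F only changes at the values in the range of X.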
Mathematical Expectation of Random Variable
MEAN OF A RANDOM VARIABLE
The mean or expected value of the discrete random variable X, denoted as μ or E(X), is
$\mu = E(X) = \sum_x x f(x)$.
(3-3)

VARIANCE OF A RANDOM VARIABLE
The variance of X, denoted as σ² or V(X), is
$\sigma^2 = V(X) = E(X - \mu)^2 = \sum_x (x - \mu)^2 f(x) = \sum_x x^2 f(x) - \mu^2$.

EXPECTED VALUE OF A FUNCTION OF A DISCRETE RANDOM VARIABLE
If X is a discrete random variable with probability mass function f(x),
$E[h(X)] = \sum_x h(x) f(x)$.
(3-4)
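As a minimal sketch of the mean, variance, and expected-value formulas on this slide, the Python below computes all three from a PMF stored as a dictionary. The PMF values and function names are assumptions made for illustration. It also checks that the defining form of the variance agrees with the shortcut form, the sum of x² f(x) minus μ².

```python
def expectation(pmf, h=lambda x: x):
    """E[h(X)] = sum over x of h(x) * f(x); h defaults to the identity, giving the mean."""
    return sum(h(x) * p for x, p in pmf.items())

def variance(pmf):
    """V(X) = E[(X - mu)^2], computed directly from the definition."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: (x - mu) ** 2)

# Hypothetical PMF for illustration only.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

mu = expectation(pmf)                                   # 0(0.2) + 1(0.5) + 2(0.3)
var = variance(pmf)                                     # definition
shortcut = expectation(pmf, lambda x: x ** 2) - mu ** 2  # sum of x^2 f(x) minus mu^2
```

Here the mean is 1.1 and both variance computations give 0.49.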
Discrete Random Variable
3-16/69 The sample space of a random experiment is {a, b, c, d, e, f}, and each outcome is equally likely. A random variable is defined as follows:

outcome   a     b     c     d     e     f
x         0     0     1.5   1.5   2     3

Determine the probability mass function of X. Use the probability mass function to determine the following probabilities:
(a) P(X = 1.5) (b) P(0.5 < X < 2.7)
(c) P(X > 3) (d) P(0 ≤ X < 2)
(e) P(X = 0 or X = 2)

3-38/73 Determine the cumulative distribution function of the random variable in Exercise 3-16.

Determine the mean and variance of the random variable in Exercise 3-16.
Guided Learning Activity
3-17/70 Verify that the following functions are probability mass functions, and determine the requested probabilities.

x      −2    −1    0     1     2
f(x)   0.2   0.4   0.1   0.2   0.1

(a) P(X ≤ 2) (b) P(X > −2)
(c) P(−1 ≤ X ≤ 1) (d) P(X ≤ −1 or X = 2)

3-39/73 Determine the cumulative distribution function for the random variable and also determine the following probabilities:
(a) P(X ≤ 1.25) (b) P(X ≤ 2.2)
(c) P(−1.1 < X ≤ 1) (d) P(X > 0)

3-59/76 Determine the mean and variance of the random variable in Exercise 3-17.

3-18/69 $f(x) = \frac{8}{7}\left(\frac{1}{2}\right)^{x}$, x = 1, 2, 3
Obtain the following probabilities:
(a) P(X ≤ 1) (b) P(X > 1)
(c) P(2 < X < 6) (d) P(X ≤ 1 or X > 1)

3-40/73 Determine the cumulative distribution function for the random variable in Exercise 3-18; also determine the following probabilities:
(a) P(X < 2) (b) P(X ≤ 3)
(c) P(X > 2) (d) P(1 < X ≤ 2)

3-50/76 Determine the mean and variance of the random variable in Exercise 3-18.
Guided Learning Activity
3-19/70 $f(x) = \frac{2x + 1}{25}$, x = 0, 1, 2, 3, 4
(a) P(X = 4) (b) P(X ≤ 1)
(c) P(2 ≤ X < 4) (d) P(X > −10)

3-61/77 Determine the mean and variance of the random variable in Exercise 3-19.
$f(x) = \frac{2x + 1}{25}$, x = 0, 1, 2, 3, 4

3-20/70 $f(x) = \frac{3}{4}\left(\frac{1}{4}\right)^{x}$, x = 0, 1, 2, …
(a) P(X = 2) (b) P(X ≤ 2)
(c) P(X > 2) (d) P(X ≥ 1)

3-62/77 Determine the mean and variance of the random variable in Exercise 3-20.
$f(x) = \frac{3}{4}\left(\frac{1}{4}\right)^{x}$, x = 0, 1, 2, …

4.17/118 Let X be a random variable with the following probability distribution:

x      −3    6     9
f(x)   1/6   1/2   1/3

Find $\mu_{g(X)}$, where $g(X) = (2X + 1)^2$.
Guided Learning Activity
3-25/69 In a semiconductor manufacturing process, three wafers from a lot are tested. Each wafer is classified as pass or fail. Assume that the probability that a wafer passes the test is 0.8 and that wafers are independent. Determine the probability mass function of the number of wafers from a lot that pass the test.

3-27/70 A disk drive manufacturer sells storage devices with capacities of one terabyte, 500 gigabytes, and 100 gigabytes with probabilities 0.5, 0.3, and 0.2, respectively. The revenues associated with the sales in that year are estimated to be $50 million, $25 million, and $10 million, respectively. Let X denote the revenue of storage devices during that year. Determine the probability mass function of X.

3-50/73 Errors in an experimental transmission channel are found when the transmission is checked by a certifier that detects missing pulses. The number of errors found in an eight-bit byte is a random variable with the following distribution:
$F(x) = \begin{cases} 0, & x < 1 \\ 0.7, & 1 \le x < 4 \\ 0.9, & 4 \le x < 7 \\ 1, & 7 \le x \end{cases}$
Determine each of the following probabilities:
(a) P(X ≤ 4) (b) P(X > 7)
(c) P(X ≤ 5) (d) P(X > 4)
(e) P(X ≤ 2)
Guided Learning Activity
3-11/76 Messages. The number of e-mail messages received per hour has the following distribution:

x      10     11     12     13     14     15
f(x)   0.08   0.15   0.30   0.20   0.20   0.07

where x is the number of messages. Determine the mean and standard deviation of the number of messages sent per hour.

3-59/76 Determine the mean and variance of the random variable in Exercise 3-17.

x      −2    −1    0     1     2
f(x)   0.2   0.4   0.1   0.2   0.1

4.18/118 Find the expected value of the random variable g(X) = X², where X has the probability distribution of Exercise 4.2.
$f(x) = \binom{3}{x}\left(\frac{1}{4}\right)^{x}\left(\frac{3}{4}\right)^{3-x}$, x = 0, 1, 2, 3.
Discrete Uniform Distribution
A random variable X has a discrete uniform distribution if each of the n values in its range, say, x1, x2, …, xn, has equal probability. Then,
$f(x_i) = \frac{1}{n}$.
(3-5)

Mean and Variance
Suppose X is a discrete uniform random variable on the consecutive integers a, a + 1, a + 2, …, b, for a ≤ b. The mean of X is
$\mu = E(X) = \frac{b + a}{2}$
The variance of X is
$\sigma^2 = \frac{(b - a + 1)^2 - 1}{12}$
(3-6)

3-57/76 If the range of X is the set {0, 1, 2, 3, 4} and P(X = x) = 0.2, determine the mean and variance of the random variable.

3-77/79 Let the random variable X have a discrete uniform distribution on the integers 1 ≤ x ≤ 8. Determine the mean and variance of X.
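The closed-form results in (3-6) can be checked against the general definitions of mean and variance. The sketch below does this in Python for a hypothetical range a = 0, b = 9 (these endpoints and the function names are assumptions for illustration, not taken from the exercises).

```python
def uniform_direct(a, b):
    """Mean and variance from the definitions, with f(x) = 1/n on a, a+1, ..., b."""
    values = range(a, b + 1)
    n = len(values)
    mu = sum(values) / n
    var = sum((x - mu) ** 2 for x in values) / n
    return mu, var

def uniform_formula(a, b):
    """Mean and variance from the closed forms in (3-6)."""
    return (b + a) / 2, ((b - a + 1) ** 2 - 1) / 12

a, b = 0, 9  # hypothetical endpoints
direct = uniform_direct(a, b)
closed = uniform_formula(a, b)
```

Both routes give a mean of 4.5 and a variance of 8.25 for this range.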
Discrete Uniform Distribution
• 3-81/79 Assume that the wavelengths of photosynthetically active radiations (PAR) are uniformly distributed at integer nanometers in the red spectrum from 657 to 700 nm.
(a) What are the mean and variance of the wavelength distribution for this radiation?
(b) If the wavelengths are uniformly distributed at integer nanometers from 57 to 100 nanometers, how do the mean and variance of the wavelength distribution compare to the previous part? Explain.
Binomial Distribution
• The terms success and failure are just labels. We can just as well use A and B or 0 or 1. Unfortunately, the usual labels can sometimes be misleading. In experiment 2 below, because X counts defective parts, the production of a defective part is called a success.
• A trial with only two possible outcomes is used so frequently as a building block of a random experiment that it is called a Bernoulli trial.

BINOMIAL DISTRIBUTION
A random experiment consists of n Bernoulli trials such that
(1) The trials are independent.
(2) Each trial results in only two possible outcomes, labeled as “success” and “failure”.
(3) The probability of a success in each trial, denoted as p, remains constant.
Binomial Distribution
• Consider the following random experiments and random variables:
1. Flip a coin 10 times. Let X = number of heads obtained.
2. A worn machine tool produces 1% defective parts. Let X = number of defective parts in the next 25 parts produced.
3. Each sample of air has a 10% chance of containing a particular rare molecule. Let X = the number of air samples that contain the rare molecule in the next 19 samples analyzed.
4. Of all bits transmitted through a digital transmission channel, 10% are received in error. Let X = the number of bits in error in the next five bits transmitted.
5. A multiple-choice test contains 10 questions, each with four choices, and you guess at each question. Let X = the number of questions answered correctly.
6. In the next 20 births at a hospital, let X = the number of female births.
7. Of all patients suffering a particular illness, 35% experience improvement from a particular medication. In the next 100 patients administered the medication, let X = the number of patients who experience improvement.
Binomial Distribution
The random variable X that equals the number of trials that result in a success is a binomial random variable with parameters 0 < p < 1 and n = 1, 2, … . The probability mass function of X is
$f(x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$, x = 0, 1, …, n.
(3-7)

Mean and Variance
If X is a binomial random variable with parameters p and n,
$\mu = E(X) = np$ and $\sigma^2 = V(X) = np(1 - p)$
(3-8)
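A short Python sketch of (3-7) and (3-8), using experiment 4 from the list of binomial experiments (10% of bits received in error, next five bits transmitted). The function name is an assumption for illustration; math.comb supplies the binomial coefficient.

```python
from math import comb

def binomial_pmf(x, n, p):
    """f(x) = C(n, x) * p^x * (1 - p)^(n - x) for x = 0, 1, ..., n."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 0.10  # five bits, each in error with probability 0.10

probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance from the definitions; they should match np and np(1 - p).
mean = sum(x * f for x, f in enumerate(probs))
var = sum((x - mean) ** 2 * f for x, f in enumerate(probs))
```

For these parameters P(X = 2) = 10(0.1)²(0.9)³ = 0.0729, the mean is np = 0.5, and the variance is np(1 − p) = 0.45.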
Binomial Distribution
3-94/85 The random variable X has a binomial distribution with n = 20 and p = 0.5. Determine the following probabilities.
(a) P(X = 5) (b) P(X ≤ 2)
(c) P(X ≥ 9) (d) P(3 ≤ X < 5)

3-97/85 Sketch the probability mass function of a binomial distribution with n = 10 and p = 0.01 and comment on the shape of the distribution.
(a) What value of X is most likely?
(b) What value of X is least likely?

3-99/85 Determine the cumulative distribution function of a binomial random variable with n = 3 and p = 1/4.

3-100/85 An electronic product contains 40 integrated circuits. The probability that any integrated circuit is defective is 0.01, and the integrated circuits are independent. The product operates only if there are no defective circuits. What is the probability that the product operates?

3-101/85 The phone lines to an airline reservation system are occupied 40% of the time. Assume that the events that the lines are occupied on successive calls are independent. Assume that 10 calls are placed to the airline.
(a) What is the probability that for exactly three calls the lines are occupied?
(b) What is the probability that for at least one call the lines are not occupied?
(c) What is the expected number of calls in which the lines are all occupied?
Hypergeometric Distribution

The probability distribution of the hypergeometric random variable X, the number of successes in a random sample of size n selected from N items of which k are labeled success and N − k labeled failure, is
$h(x; N, n, k) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$,
where $\max\{0,\, n - (N - k)\} \le x \le \min\{n,\, k\}$.

MEAN AND VARIANCE
The mean and variance of the hypergeometric distribution h(x; N, n, k) are
$\mu = \frac{nk}{N}$ and $\sigma^2 = \frac{N - n}{N - 1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right)$.

Example 5.29: A homeowner plants 6 bulbs selected at random from a box containing 5 tulip bulbs and 4 daffodil bulbs. What is the probability that he planted 2 daffodil bulbs and 4 tulip bulbs?
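Example 5.29 can be worked through with a small Python sketch of the hypergeometric formula. Treating a daffodil as a "success" gives N = 9, n = 6, k = 4, and x = 2; the function name and the use of exact fractions are choices made here for illustration.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k) = C(k, x) * C(N - k, n - x) / C(N, n), zero outside the valid range of x."""
    if x < max(0, n - (N - k)) or x > min(n, k):
        return Fraction(0)
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

# Example 5.29: 9 bulbs in the box, 6 planted, 4 of the 9 are daffodils.
N, n, k = 9, 6, 4
p_two_daffodils = hypergeom_pmf(2, N, n, k)

# Check the mean and variance formulas against the definitions.
support = range(0, n + 1)
mean = sum(x * hypergeom_pmf(x, N, n, k) for x in support)
var = sum((x - mean) ** 2 * hypergeom_pmf(x, N, n, k) for x in support)
```

The probability is C(4, 2) C(5, 4) / C(9, 6) = 30/84 = 5/14, about 0.357. The computed mean is 8/3 and the variance is 5/9, matching nk/N and the variance formula on this slide.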
Guided Learning Activity
5.30 To avoid detection at customs, a traveler places 6 narcotic tablets in a bottle containing 9 vitamin tablets that are similar in appearance. If the customs official selects 3 of the tablets at random for analysis, what is the probability that the traveler will be arrested for illegal possession of narcotics?

5.31 A random committee of size 3 is selected from 4 doctors and 2 nurses. Write a formula for the probability distribution of the random variable X representing the number of doctors on the committee. Find P(2 ≤ X ≤ 3).

5.33 If 7 cards are dealt from an ordinary deck of 52 playing cards, what is the probability that
(a) exactly 2 of them will be face cards?
(b) at least 1 of them will be a queen?

5.34 What is the probability that a waitress will refuse to serve alcoholic beverages to only 2 minors if she randomly checks the IDs of 5 among 9 students, 4 of whom are minors?

5.35 A company is interested in evaluating its current inspection procedure for shipments of 50 identical items. The procedure is to take a sample of 5 and pass the shipment if no more than 2 are found to be defective. What proportion of shipments with 20% defectives will be accepted?

5.36 A manufacturing company uses an acceptance scheme on items from a production line before they are shipped. The plan is a two-stage one. Boxes of 25 items are readied for shipment, and a sample of 3 items is tested for defectives. If any defectives are found, the entire box is sent back for 100% screening. If no defectives are found, the box is shipped.
Negative Binomial Distribution
If repeated independent trials can result in a success with probability p and a failure with probability q = 1 − p, then the probability distribution of the random variable X, the number of the trial on which the kth success occurs, is
$b^{*}(x; k, p) = \binom{x - 1}{k - 1} p^{k} q^{x - k}$, x = k, k + 1, k + 2, …

Example: In an NBA (National Basketball Association) championship series, the team that wins four games out of seven is the winner. Suppose that teams A and B face each other in the championship games and that team A has probability 0.55 of winning a game over team B.
(a) What is the probability that team A will win the series in 6 games?
(b) What is the probability that team A will win the series?
(c) If teams A and B were facing each other in a regional playoff series, which is decided by winning three out of five games, what is the probability that team A would win the series?

• 5.49 The probability that a person living in a certain city owns a dog is estimated to be 0.3. Find the probability that the tenth person randomly interviewed in that city is the fifth one to own a dog.
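The NBA example maps directly onto the negative binomial formula: a series win in exactly x games means the kth win falls on game x. The Python sketch below evaluates the three parts under that reading; the function and variable names are assumptions for illustration.

```python
from math import comb

def neg_binomial_pmf(x, k, p):
    """b*(x; k, p) = C(x - 1, k - 1) * p^k * q^(x - k): the kth success occurs on trial x."""
    if x < k:
        return 0.0
    return comb(x - 1, k - 1) * p ** k * (1 - p) ** (x - k)

p_a = 0.55  # probability that team A wins any single game

# (a) Team A's fourth win comes on game 6.
win_in_six = neg_binomial_pmf(6, 4, p_a)

# (b) Team A's fourth win comes on game 4, 5, 6, or 7.
win_best_of_seven = sum(neg_binomial_pmf(x, 4, p_a) for x in range(4, 8))

# (c) Best of five: team A's third win comes on game 3, 4, or 5.
win_best_of_five = sum(neg_binomial_pmf(x, 3, p_a) for x in range(3, 6))
```

Rounded to four decimals, these evaluate to about 0.1853, 0.6083, and 0.5931. The longer series gives the stronger team a slightly better chance of winning.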
Geometric Distribution
If repeated independent trials can result in a success with probability p and a failure with probability q = 1 − p, then the probability distribution of the random variable X, the number of the trial on which the first success occurs, is
$g(x; p) = p q^{x - 1}$, x = 1, 2, 3, …

MEAN AND VARIANCE
The mean and variance of a random variable following the geometric distribution are
$\mu = \frac{1}{p}$ and $\sigma^2 = \frac{1 - p}{p^2}$

5.50 Find the probability that a person flipping a coin gets
(a) the third head on the seventh flip;
(b) the first head on the fourth flip.

5.51 Three people toss a fair coin and the odd one pays for coffee. If the coins all turn up the same, they are tossed again. Find the probability that fewer than 4 tosses are needed.
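The geometric mean and variance formulas can be checked numerically by summing the PMF over a long but finite range of x. In the sketch below, p = 0.2 and the cutoff of 2000 terms are hypothetical choices for illustration; the neglected tail is far below floating-point precision for this p.

```python
def geometric_pmf(x, p):
    """g(x; p) = p * q^(x - 1): the first success occurs on trial x."""
    if x < 1:
        return 0.0
    return p * (1 - p) ** (x - 1)

p = 0.2               # hypothetical success probability
xs = range(1, 2001)   # truncated support; the tail beyond this is negligible

total = sum(geometric_pmf(x, p) for x in xs)
mean = sum(x * geometric_pmf(x, p) for x in xs)
var = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in xs)
```

The sums agree with 1/p = 5 for the mean and (1 − p)/p² = 20 for the variance.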
Poisson Distribution
A widely-used distribution emerges as the number of trials in a binomial experiment increases to infinity while the mean of the distribution remains constant. Consider the following example.

3-30/98 Wire Flaws. Flaws occur at random along the length of a thin copper wire. Let X denote the random variable that counts the number of flaws in a length of T millimeters of wire and suppose that the average number of flaws per millimeter is λ.

We expect E(X) = λT from the definition of λ. The probability distribution of X is determined as follows. Partition the length of wire into n subintervals of small length Δt = T/n (say, one micrometer each). If the subintervals are chosen small enough, the probability that more than one flaw occurs in a subinterval is negligible. Furthermore, we can interpret the assumption that flaws occur at random to imply that every subinterval has the same probability of containing a flaw, say p. Also, the occurrence of a flaw in a subinterval is assumed to be independent of flaws in other subintervals.
• Example 3-30 can be generalized to include a broad array of random experiments. The interval that was partitioned was a length of wire. However, the same reasoning can be applied to an interval of time, an area, or a volume. For example, counts of (1) particles of contamination in semiconductor manufacturing, (2) flaws in rolls of textiles, (3) calls to a telephone exchange, (4) power outages, and (5) atomic particles emitted from a specimen have been successfully modeled by the probability mass function in the following definition.
• In general, consider subintervals of small length Δt and assume that as Δt tends to zero,
1. The probability of more than one event in a subinterval tends to zero.
2. The probability of one event in a subinterval tends to λΔt.
3. The event in each subinterval is independent of other subintervals.
• A random experiment with these properties is called a Poisson process.
Poisson Distribution
Then we can model the distribution of X as approximately a binomial random variable. Each subinterval generates an event (flaw) or not. Therefore,
E(X) = λT = np
and one can solve for p to obtain
p = λT/n
From the approximate binomial distribution
$P(X = x) \approx \binom{n}{x} p^{x} (1 - p)^{n - x}$
With small enough subintervals, n is large and p is small. Basic properties of limits can be used to show that as n increases
$\binom{n}{x} \left(\frac{\lambda T}{n}\right)^{x} \to \frac{(\lambda T)^{x}}{x!}$
$\left(1 - \frac{\lambda T}{n}\right)^{-x} \to 1$
$\left(1 - \frac{\lambda T}{n}\right)^{n} \to e^{-\lambda T}$
Therefore,
$\lim_{n \to \infty} P(X = x) = \frac{e^{-\lambda T} (\lambda T)^{x}}{x!}$, x = 0, 1, 2, …
Because the number of subintervals tends to infinity, the range of X (the number of flaws) can equal any nonnegative integer.
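The limiting argument can be observed numerically. The sketch below compares the binomial probability with p = λT/n against the Poisson limit for increasing n. The values λT = 2 and x = 3 are hypothetical, and the function names are assumptions for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_limit(x, lam_T):
    """Limiting form e^(-lambda T) * (lambda T)^x / x!."""
    return exp(-lam_T) * lam_T ** x / factorial(x)

lam_T = 2.0  # hypothetical expected number of flaws in the length T
x = 3

# Absolute gap between the binomial approximation and the limit as n grows.
gaps = [
    abs(binomial_pmf(x, n, lam_T / n) - poisson_limit(x, lam_T))
    for n in (10, 100, 1000, 10000)
]
```

Each tenfold increase in n shrinks the gap by roughly a factor of ten, which is consistent with the binomial probabilities converging to the Poisson form.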
The sum of the probabilities is 1 because
$\sum_{x=0}^{\infty} \frac{e^{-\lambda T} (\lambda T)^{x}}{x!} = e^{-\lambda T} \sum_{x=0}^{\infty} \frac{(\lambda T)^{x}}{x!}$
and the summation on the right-hand side of the previous equation is recognized to be Taylor’s expansion of $e^{x}$ evaluated at λT. Therefore, the summation equals $e^{\lambda T}$ and the right-hand side equals $e^{-\lambda T} e^{\lambda T} = 1$.

Use consistent units for λ and T. For example, if λ = 2.3 flaws per millimeter, then T should be expressed in millimeters. If λ = 7.1 flaws per square centimeter, then an area of 4.5 square inches should be expressed as T = 4.5(2.54²) = 29.03 square centimeters.
The random variable X that equals the number of events in a Poisson process is a Poisson random variable with parameter 0 < λ, and
$f(x) = \frac{e^{-\lambda T} (\lambda T)^{x}}{x!}$, x = 0, 1, 2, …

Mean and Variance
If X is a Poisson random variable over an interval of length T with parameter λ, then
$\mu = E(X) = \lambda T$ and $\sigma^2 = V(X) = \lambda T$
(3-16)

3-169/103 The number of content changes to a Web site follows a Poisson distribution with a mean of 0.25 per day.
(a) What is the probability of two or more changes in a day?
(b) What is the probability of no content changes in five days?
(c) What is the probability of two or fewer changes in five days?
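A minimal Python sketch of the Poisson probability mass function and of result (3-16). It uses λ = 2.3 flaws per millimeter from the units note earlier in these slides; the length T = 1 millimeter, the truncation at 60 terms, and the function name are assumptions made for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam, T):
    """f(x) = e^(-lambda T) * (lambda T)^x / x!, with lam and T in consistent units."""
    mean_count = lam * T
    return exp(-mean_count) * mean_count ** x / factorial(x)

lam = 2.3  # flaws per millimeter
T = 1.0    # assumed length in millimeters, same length unit as lam

p_two_flaws = poisson_pmf(2, lam, T)

# Truncated sums over x = 0..59; the remaining tail is negligible for lambda T = 2.3.
xs = range(60)
total = sum(poisson_pmf(x, lam, T) for x in xs)
mean = sum(x * poisson_pmf(x, lam, T) for x in xs)
var = sum((x - mean) ** 2 * poisson_pmf(x, lam, T) for x in xs)
```

With these values P(X = 2) is about 0.265, and the computed mean and variance both come out at λT = 2.3. To change the interval, scale T rather than λ, so the units stay consistent.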
Guided Learning Activity
3-132/102 The number of telephone calls that arrive at a phone exchange is often modeled as a Poisson random variable. Assume that on the average there are 10 calls per hour.
(a) What is the probability that there are exactly 5 calls in one hour?
(b) What is the probability that there are 3 or fewer calls in one hour?
(c) What is the probability that there are exactly 15 calls in two hours?
(d) What is the probability that there are exactly 5 calls in 30 minutes?

3-165/102 When a computer disk manufacturer tests a disk, it writes to the disk and then tests it using a certifier. The certifier counts the number of missing pulses or errors. The number of errors on a test area on a disk has a Poisson distribution with λ = 0.2.
(a) What is the expected number of errors per test area?
(b) What percentage of test areas have two or fewer errors?
Summary
A random variable is a function that assigns a real number to each outcome in the sample space of a random experiment.
A continuous random variable has an interval (either finite or infinite) of real numbers for its range.
A discrete random variable has a finite (or countably infinite) range.
The probability mass function provides probabilities for the values in the range of a discrete random variable.
The cumulative distribution function of a discrete random variable X, denoted as F(x), is
$F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$
The mean refers to the expected value of a random variable. For the discrete case, $\mu = E(X) = \sum_x x f(x)$.
The variance is a measure of variability defined as the expected value of the squared deviation of the random variable from its mean. For the discrete case,
$\sigma^2 = V(X) = E(X - \mu)^2 = \sum_x (x - \mu)^2 f(x) = \sum_x x^2 f(x) - \mu^2$.
Summary
A discrete uniform random variable is a discrete random variable with a finite range and constant probability mass function.

Discrete uniform distribution
$f(x_i) = 1/n$
$\mu = E(X) = (b + a)/2$
$\sigma^2 = \frac{(b - a + 1)^2 - 1}{12}$

Bernoulli trials are sequences of independent trials with only two outcomes, generally called “success” and “failure”, in which the probability of success remains constant.

Binomial distribution
$f(x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$, x = 0, 1, …, n
$\mu = E(X) = np$
$\sigma^2 = V(X) = np(1 - p)$

Hypergeometric Distribution
Negative Binomial Distribution
Geometric Distribution

A Poisson process is a random experiment with events that occur in an interval and satisfy the following assumptions. The interval can be partitioned into subintervals such that the probability of more than one event in a subinterval tends to zero, the probability of an event in a subinterval is proportional to the length of the subinterval, and the event in each subinterval is independent of other subintervals.

Poisson distribution
$f(x) = \frac{e^{-\lambda T} (\lambda T)^{x}}{x!}$, x = 0, 1, 2, …
$\mu = E(X) = \sigma^2 = V(X) = \lambda T$
References
• Montgomery and Runger. Applied Statistics and Probability for
Engineers, 6th Ed. © 2014
• Walpole, et al. Probability and Statistics for Engineers and
Scientists 9th Ed. © 2012, 2007, 2002
