INFERENTIAL STATISTICS
Session-2
BASIC PROBABILITY
Basic Probability Concepts
• Probability – the chance that an uncertain
event will occur (always between 0 and 1)
• Impossible Event – an event that has no
chance of occurring (probability = 0)
• Certain Event – an event that is sure to occur
(probability = 1)
Assessing Probability
There are three approaches to assessing the
probability of an uncertain event:
1. A priori -- based on prior knowledge of the process
$$\text{probability of occurrence} = \frac{X}{T} = \frac{\text{number of ways the event can occur}}{\text{total number of elementary outcomes}}$$
2. Empirical probability -- based on observed data
$$\text{probability of occurrence} = \frac{\text{number of ways the event can occur}}{\text{total number of elementary outcomes}}$$
3. Subjective probability -- based on a combination of an individual’s past experience,
personal opinion, and analysis of a particular situation
Approaches 1 and 2 assume all outcomes are equally likely.
Example of a priori probability
$$\text{probability of a day in January} = \frac{X}{T} = \frac{31 \text{ days in January}}{365 \text{ days in 2013}} = \frac{31}{365}$$
Example of empirical probability
Find the probability of selecting a male taking
statistics from the population described in the following
table:
Subjective probability
• Subjective probability may differ from person to person
• A media development team assigns a 60% probability of success to its new ad
campaign.
• The chief media officer of the company is less optimistic and assigns a 40% probability of
success to the same campaign
• The assignment of a subjective probability is based on a person’s experiences,
opinions, and analysis of a particular situation
• Subjective probability is useful in situations when an empirical or a priori
probability cannot be computed
Events
Each possible outcome of a variable is an event.
• Simple event
• An event described by a single characteristic
• e.g., A day in January from all days in 2013
• Joint event
• An event described by two or more characteristics
• e.g. A day in January that is also a Wednesday from all days in 2013
• Complement of an event A (denoted A’)
• All events that are not part of event A
• e.g., All days from 2013 that are not in January
Sample Space
The Sample Space is the collection of all
possible events
e.g. All 6 faces of a die:
Organizing & Visualizing Events
• Contingency Tables -- For All Days in 2013

            Jan.   Not Jan.   Total
Wed.          5       47        52
Not Wed.     26      287       313
Total        31      334       365

• Decision Trees

Sample Space: All Days In 2013
    Jan.
        Wed.        5
        Not Wed.   26
    Not Jan.
        Wed.       47
        Not Wed.  287
Total Number Of Sample Space Outcomes: 365
Definition: Simple Probability
• Simple Probability refers to the probability of a
simple event.
• ex. P(Jan.)
• ex. P(Wed.)
P(Jan.) = 31 / 365
Definition: Joint Probability
• Joint Probability refers to the probability of an
occurrence of two or more events (joint event).
• ex. P(Jan. and Wed.)
• ex. P(Not Jan. and Not Wed.)
Computing Joint and
Marginal Probabilities
            Jan.   Not Jan.   Total
Wed.          5       47        52
Not Wed.     26      287       313
Total        31      334       365
Marginal Probability Example
$$P(\text{Wed.}) = P(\text{Jan. and Wed.}) + P(\text{Not Jan. and Wed.}) = \frac{5}{365} + \frac{47}{365} = \frac{52}{365}$$

            Jan.   Not Jan.   Total
Wed.          5       47        52
Not Wed.     26      287       313
Total        31      334       365
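As a rough check on the contingency-table counts, the sketch below classifies every day of 2013 by the two characteristics used in these slides (January or not, Wednesday or not) and derives one joint and two marginal probabilities. The variable names are illustrative and not from the slides.

```python
from collections import Counter
from datetime import date, timedelta

# Classify every day of 2013 by (is it in January?, is it a Wednesday?)
counts = Counter()
day = date(2013, 1, 1)
while day.year == 2013:
    in_jan = day.month == 1
    is_wed = day.weekday() == 2  # Monday is 0, so Wednesday is 2
    counts[(in_jan, is_wed)] += 1
    day += timedelta(days=1)

total = sum(counts.values())

# Joint probability: both characteristics at once
p_jan_and_wed = counts[(True, True)] / total

# Marginal probabilities: add the joint counts across the other characteristic
p_jan = (counts[(True, True)] + counts[(True, False)]) / total
p_wed = (counts[(True, True)] + counts[(False, True)]) / total

print(f"P(Jan. and Wed.) = {p_jan_and_wed:.4f}")
print(f"P(Jan.) = {p_jan:.4f}, P(Wed.) = {p_wed:.4f}")
```

Counting from the calendar rather than typing the cells by hand makes it easy to confirm that each row and column of the table adds up to its total.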
Marginal & Joint Probabilities In A
Contingency Table
Event     B1              B2              Total
A1        P(A1 and B1)    P(A1 and B2)    P(A1)
A2        P(A2 and B1)    P(A2 and B2)    P(A2)
Total     P(B1)           P(B2)           1

The cells hold joint probabilities; the row and column totals hold marginal probabilities.
Probability Summary So Far
• Probability is the numerical measure of
the likelihood that an event will occur, on a scale from 0 (impossible) to 1 (certain)
Computing Conditional Probabilities
• A conditional probability is the probability of one
event, given that another event has occurred:
$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$$
is the conditional probability of A given that B has occurred.
Conditional Probability Example
(continued)
Of the cars on a used car lot, 70% have air conditioning
(AC) and 40% have a GPS and
20% of the cars have both.
          GPS    No GPS   Total
AC        0.2    0.5      0.7
No AC     0.2    0.1      0.3
Total     0.4    0.6      1.0
• Two events A and B are independent if and only if
$$P(A \mid B) = P(A)$$
• Events A and B are independent when the probability of
one event is not affected by the fact that the other event
has occurred
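A minimal sketch of the conditional probability formula and the independence check, applied to the used car lot table above. The question asked here, P(GPS | AC), is chosen for illustration; the helper name `marginal` is hypothetical.

```python
# Joint probabilities from the used car lot table
joint = {
    ("AC", "GPS"): 0.2,
    ("AC", "No GPS"): 0.5,
    ("No AC", "GPS"): 0.2,
    ("No AC", "No GPS"): 0.1,
}

def marginal(event, position):
    """Sum the joint probabilities over the other characteristic."""
    return sum(p for key, p in joint.items() if key[position] == event)

p_ac = marginal("AC", 0)    # row total for AC
p_gps = marginal("GPS", 1)  # column total for GPS

# Conditional probability: P(GPS | AC) = P(GPS and AC) / P(AC)
p_gps_given_ac = joint[("AC", "GPS")] / p_ac

# Independence would require P(GPS | AC) to equal P(GPS)
independent = abs(p_gps_given_ac - p_gps) < 1e-9

print(f"P(GPS | AC) = {p_gps_given_ac:.4f}, P(GPS) = {p_gps:.4f}")
print("independent" if independent else "not independent")
```

Because P(GPS | AC) differs from P(GPS), knowing that a car has AC changes the probability that it has a GPS, so the two events are not independent in this table.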
Multiplication Rules
Discrete Probability Distributions
The probability distribution of a discrete
random variable
• The Binomial distribution
• The Poisson distribution
• The Hypergeometric distribution
Statistics for Managers Using Microsoft Excel® 7e Copyright ©2014 Pearson Education, Inc. Chap 5-30
Definitions
• Discrete variables produce outcomes that come
from a counting process (e.g. number of classes you
are taking).
Example of a Discrete Random Variable
Probability Distribution
Experiment: Toss 2 coins. Let X = # of heads.
4 possible outcomes: T T, T H, H T, H H

X Value   Probability
0         1/4 = 0.25
1         2/4 = 0.50
2         1/4 = 0.25
Discrete Variables
Expected Value (Measuring Center)
• Expected Value (or mean) of a discrete
variable (Weighted Average)
$$E(X) = \sum_{i=1}^{N} X_i \, P(X = X_i)$$
• Example: Toss 2 coins, X = # of heads,
compute expected value of X:

X    P(X=Xi)
0    0.25
1    0.50
2    0.25

E(X) = ((0)(0.25) + (1)(0.50) + (2)(0.25))
= 1.0
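The weighted-average calculation can be sketched in a few lines of Python. The dictionary layout and the function name `expected_value` are illustrative choices, not part of the slides.

```python
# Probability distribution of X = number of heads in two coin tosses
distribution = {0: 0.25, 1: 0.50, 2: 0.25}

def expected_value(dist):
    """Weighted average of the outcomes, weighted by their probabilities."""
    return sum(x * p for x, p in dist.items())

mean = expected_value(distribution)
print(f"E(X) = {mean}")
```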
$$\sigma = \sqrt{\sigma^2} = \sqrt{\sum_{i=1}^{N} [X_i - E(X)]^2 \, P(X = X_i)}$$
where:
E(X) = Expected value of the discrete random variable X
Xi = the ith outcome of X
P(X=Xi) = Probability of the ith occurrence of X
Discrete Random Variables
Measuring Dispersion
(continued)
$$\sigma = \sqrt{\sum [X_i - E(X)]^2 \, P(X_i)}$$
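A short sketch of the dispersion formula, assuming the two-coin distribution (X = number of heads) used for the expected value example. The worked numbers for this slide did not survive in the text, so the values printed here come from the formula itself.

```python
from math import sqrt

# Probability distribution of X = number of heads in two coin tosses
distribution = {0: 0.25, 1: 0.50, 2: 0.25}

# Expected value: probability-weighted average of the outcomes
mean = sum(x * p for x, p in distribution.items())

# Variance: probability-weighted average of squared deviations from E(X)
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())

# Standard deviation: square root of the variance
std_dev = sqrt(variance)

print(f"variance = {variance:.4f}, standard deviation = {std_dev:.4f}")
```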
Discrete probability distributions: Binomial, Poisson, Hypergeometric
Continuous probability distributions: Normal, Uniform, Exponential
Binomial Probability Distribution
• A fixed number of observations, n
  • e.g., 15 tosses of a coin; ten light bulbs taken from a warehouse
• Each observation is categorized as to whether or not the
“event of interest” occurred
  • e.g., head or tail in each toss of a coin; defective or not defective
  light bulb
  • Since these two categories are mutually exclusive and
  collectively exhaustive, when the probability of the event of interest is represented as π,
  the probability of the event of interest not occurring is 1 - π
• Constant probability for the event of interest occurring
(π) for each observation
  • e.g., the probability of getting a tail is the same each time we toss the
  coin
Binomial Probability Distribution
(continued)
With x = 1, n = 5, and π = 0.1:
$$\begin{aligned}
P(X = 1 \mid 5, 0.1) &= \frac{n!}{x!\,(n-x)!}\,\pi^{x}(1-\pi)^{n-x} \\
&= \frac{5!}{1!\,(5-1)!}\,(0.1)^{1}(1-0.1)^{5-1} \\
&= (5)(0.1)(0.9)^{4} \\
&= 0.32805
\end{aligned}$$
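The binomial formula translates directly into code. This is a minimal sketch using only the standard library; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P(X = x | n, pi): probability of x events of interest in n observations."""
    # comb(n, x) equals n! / (x! (n - x)!)
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

# One event of interest in five observations, with pi = 0.1
p_one = binomial_pmf(1, 5, 0.1)
print(f"P(X = 1 | 5, 0.1) = {p_one:.5f}")

# The probabilities over all possible x should add up to 1
total = sum(binomial_pmf(x, 5, 0.1) for x in range(6))
print(f"sum over x = 0..5: {total:.5f}")
```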
The Binomial Distribution
Example
Suppose the probability of purchasing a defective
computer is 0.02. What is the probability of
purchasing 2 defective computers in a group of 10?
x = 2, n = 10, and π = 0.02
$$\begin{aligned}
P(X = 2 \mid 10, 0.02) &= \frac{n!}{x!\,(n-x)!}\,\pi^{x}(1-\pi)^{n-x} \\
&= \frac{10!}{2!\,(10-2)!}\,(0.02)^{2}(1-0.02)^{10-2} \\
&= (45)(0.0004)(0.8508) \\
&= 0.01531
\end{aligned}$$
Examples:
n = 10, π = 0.35, x = 3: P(X = 3|10, 0.35) = 0.2522
n = 10, π = 0.75, x = 8: P(X = 8|10, 0.75) = 0.2816
Binomial Distribution
Characteristics
• Mean
$$\mu = E(X) = n\pi$$
• Variance and Standard Deviation
$$\sigma^2 = n\pi(1-\pi)$$
$$\sigma = \sqrt{n\pi(1-\pi)}$$
Where n = sample size
π = probability of the event of interest for any trial
(1 – π) = probability of no event of interest for any trial
The Binomial Distribution
Characteristics
Examples
• n = 5, π = 0.1
$$\mu = n\pi = (5)(0.1) = 0.5$$
$$\sigma = \sqrt{n\pi(1-\pi)} = \sqrt{(5)(0.1)(1-0.1)} = 0.6708$$
[Figure: bar chart of P(X=x|5, 0.1) for x = 0 to 5]
• n = 5, π = 0.5
$$\mu = n\pi = (5)(0.5) = 2.5$$
$$\sigma = \sqrt{n\pi(1-\pi)} = \sqrt{(5)(0.5)(1-0.5)} = 1.118$$
[Figure: bar chart of P(X=x|5, 0.5) for x = 0 to 5]
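The two examples can be reproduced with a small helper. This is a sketch; `binomial_mean_sd` is a hypothetical name.

```python
from math import sqrt

def binomial_mean_sd(n, pi):
    """Return (mean, standard deviation) of a binomial distribution."""
    mean = n * pi
    std_dev = sqrt(n * pi * (1 - pi))
    return mean, std_dev

for n, pi in [(5, 0.1), (5, 0.5)]:
    mean, std_dev = binomial_mean_sd(n, pi)
    print(f"n = {n}, pi = {pi}: mean = {mean:.1f}, sd = {std_dev:.4f}")
```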
The Poisson Distribution
Definitions
• You use the Poisson distribution when you are
interested in the number of times an event occurs in
a given area of opportunity.
• An area of opportunity is a continuous unit or
interval of time, volume, or such area in which more
than one occurrence of an event can occur.
• The number of scratches in a car’s paint
• The number of mosquito bites on a person
• The number of computer crashes in a day
$$P(X = x \mid \lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}$$
where:
x = number of events in an area of opportunity
λ = expected number of events
e = base of the natural logarithm system (2.71828...)
• Mean
$$\mu = \lambda$$
• Variance and Standard Deviation
$$\sigma^2 = \lambda$$
$$\sigma = \sqrt{\lambda}$$
where λ = expected number of events
• Example: with λ = 0.50,
$$P(X = 2 \mid 0.50) = \frac{e^{-\lambda}\lambda^{x}}{x!} = \frac{e^{-0.50}(0.50)^{2}}{2!} = 0.0758$$
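A minimal sketch of the Poisson formula using the standard library. The function name `poisson_pmf` and the parameter name `lam` are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x | lam): probability of x events when lam events are expected."""
    return exp(-lam) * lam**x / factorial(x)

# Probability of exactly 2 events when 0.50 are expected
p_two = poisson_pmf(2, 0.50)
print(f"P(X = 2 | 0.50) = {p_two:.4f}")

# Probabilities for x = 0 through 7 with lam = 0.50
for x in range(8):
    print(x, f"{poisson_pmf(x, 0.50):.4f}")
```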
Graph of Poisson Probabilities
Graphically, for λ = 0.50:

X    P(X = x | λ = 0.50)
0    0.6065
1    0.3033
2    0.0758
3    0.0126
4    0.0016
5    0.0002
6    0.0000
7    0.0000

P(X = 2 | λ = 0.50) = 0.0758
Poisson Distribution Shape
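• The shape of the Poisson Distribution depends
on the parameter λ:
[Figure: Poisson probabilities for λ = 0.50 and λ = 3.00]
• For the normal distribution, changing σ increases
or decreases the spread.
[Figure: normal curve over X with mean μ and standard deviation σ]
• The standardized normal probability density function:
$$f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-(1/2)Z^{2}}$$
[Figure: standardized normal curve over Z with mean 0 and standard deviation 1]
Values above the mean have positive Z-values,
values below the mean have negative Z-values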
Example
$$Z = \frac{X - \mu}{\sigma} = \frac{\$200 - \$100}{\$50} = 2.0$$
• This says that X = $200 is two standard deviations
(2 increments of $50 units) above the mean of
$100.
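The transformation is a one-line function. This sketch uses the mean of $100 and standard deviation of $50 from the example; `z_score` is a hypothetical name.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

z = z_score(200, mu=100, sigma=50)
print(f"Z = {z}")

# A value below the mean gives a negative Z
z_below = z_score(50, mu=100, sigma=50)
print(f"Z = {z_below}")
```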
Probability as
Area Under the Curve
The total area under the curve is 1.0, and the curve is
symmetric, so half is above the mean, half is below
$$P(-\infty < X < \mu) = 0.5$$
$$P(\mu < X < \infty) = 0.5$$
[Figure: normal curve f(X) with area 0.5 on each side of μ]
Example:
P(Z < 2.00) = 0.9772
[Figure: standardized normal curve with area 0.9772 to the left of Z = 2.00]
Finding Normal Probabilities
(continued)
• Let X represent the time it takes, in seconds to download an image file from
the internet.
• Suppose X is normal with a mean of 18.0 seconds and a standard deviation of
5.0 seconds. Find P(X < 18.6)
$$Z = \frac{X - \mu}{\sigma} = \frac{18.6 - 18.0}{5.0} = 0.12$$
[Figure: normal curve for X (μ = 18, σ = 5) with 18.6 marked, next to the standardized curve for Z (μ = 0, σ = 1) with 0.12 marked]
Finding Normal
Upper Tail Probabilities
• Suppose X is normal with mean 18.0 and
standard deviation 5.0.
• Now Find P(X > 18.6)
Finding Normal
Upper Tail Probabilities (continued)
• Now Find P(X > 18.6)…
P(X > 18.6) = P(Z > 0.12) = 1.0 - P(Z ≤ 0.12)
= 1.0 - 0.5478 = 0.4522
[Figure: standardized normal curves showing area 0.5478 to the left of Z = 0.12 and area 1.0 - 0.5478 = 0.4522 to the right]
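As a sketch, the same lower-tail and upper-tail probabilities can be computed with `statistics.NormalDist` from the Python standard library instead of a printed table. Results can differ from table values in the fourth decimal place because the table rounds Z to two decimals.

```python
from statistics import NormalDist

# Download time in seconds: normal with mean 18.0 and standard deviation 5.0
download_time = NormalDist(mu=18.0, sigma=5.0)

# Lower tail: P(X < 18.6)
p_lower = download_time.cdf(18.6)

# Upper tail: P(X > 18.6) = 1.0 - P(X <= 18.6)
p_upper = 1.0 - p_lower

print(f"P(X < 18.6) = {p_lower:.4f}")
print(f"P(X > 18.6) = {p_upper:.4f}")
```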
Finding a Normal Probability
Between Two Values
Calculate Z-values:
$$Z = \frac{X - \mu}{\sigma} = \frac{18 - 18}{5} = 0$$
$$Z = \frac{X - \mu}{\sigma} = \frac{18.6 - 18}{5} = 0.12$$
P(18 < X < 18.6) = P(0 < Z < 0.12)
[Figure: area between 18 and 18.6 on the X scale, equal to the area between 0 and 0.12 on the Z scale]
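A probability between two values is the difference of two cumulative probabilities. The sketch below applies that to P(18 < X < 18.6), again using `statistics.NormalDist` rather than a table.

```python
from statistics import NormalDist

# Download time in seconds: normal with mean 18.0 and standard deviation 5.0
download_time = NormalDist(mu=18.0, sigma=5.0)

# P(18 < X < 18.6) = P(X < 18.6) - P(X < 18)
p_between = download_time.cdf(18.6) - download_time.cdf(18.0)

print(f"P(18 < X < 18.6) = {p_between:.4f}")
```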
Probabilities in the Lower Tail (continued)
• μ ± 1σ encloses about 68.26% of X’s
The Empirical Rule
(continued)
• μ ± 2σ covers about 95.44% of X’s
• μ ± 3σ covers about 99.73% of X’s
• To convert a Z value to X units:
$$X = \mu + Z\sigma$$
Example:
• Let X represent the time it takes (in seconds) to download an
image file from the internet.
• Suppose X is normal with mean 18.0 and standard deviation
5.0
• Find X such that 20% of download times are less than X.
[Figure: normal curve with lower-tail area 0.2000 to the left of the unknown X (below the mean 18.0), and the corresponding unknown Z below 0]
Find the Z value for
20% in the Lower Tail
1. Find the Z value for the known probability
• 20% area in the lower tail is consistent with a Z
value of -0.84
[Table: portion of the Standardized Normal Probability Table, columns .03, .04, .05]
2. Convert to X units using the formula:
$$X = \mu + Z\sigma = 18.0 + (-0.84)(5.0) = 13.8$$
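The two steps can be sketched with `NormalDist.inv_cdf`, which returns the Z value for a given lower-tail area. It does not round Z to two decimals as the table does, so the result is close to, but not exactly, 13.8.

```python
from statistics import NormalDist

mu, sigma = 18.0, 5.0

# Step 1: Z value with 20% of the area in the lower tail
z = NormalDist().inv_cdf(0.20)

# Step 2: convert to X units with X = mu + Z * sigma
x = mu + z * sigma

print(f"Z = {z:.4f}, X = {x:.2f} seconds")
```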
• Examples:
• Time between trucks arriving at an unloading dock
• Time between transactions at an ATM Machine
• Time between phone calls to the main operator
$$P(\text{arrival time} < X) = 1 - e^{-\lambda X}$$
where e = mathematical constant approximated by 2.71828
λ = the population mean number of arrivals per unit
X = any value of the continuous variable where 0 < X < ∞
Exponential Distribution
Example
Example: Customers arrive at the service counter at
the rate of 15 per hour. What is the probability that the
arrival time between consecutive customers is less
than three minutes?
The mean number of arrivals per hour is 15, so λ = 15
Three minutes is 0.05 hours
$$P(\text{arrival time} < 0.05) = 1 - e^{-\lambda X} = 1 - e^{-(15)(0.05)} = 0.5276$$
So there is a 52.76% chance that the arrival time
between successive customers is less than three
minutes
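A minimal sketch of the exponential calculation for the service counter example. The function name `exponential_cdf` is illustrative; the rate and the time must use the same unit (hours here).

```python
from math import exp

def exponential_cdf(x, lam):
    """P(arrival time < x) when arrivals occur at a mean rate of lam per unit."""
    return 1.0 - exp(-lam * x)

arrivals_per_hour = 15
three_minutes_in_hours = 3 / 60  # 0.05 hours

p = exponential_cdf(three_minutes_in_hours, arrivals_per_hour)
print(f"P(arrival time < 0.05 hours) = {p:.4f}")
```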
Developing a
Sampling Distribution
• Random variable, X,
is age of individuals
• Values of X: 18, 20,
22, 24 (years)
Developing a
Sampling Distribution
(continued)
$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$
$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$
[Figure: Uniform Distribution, P(x) for x = 18, 20, 22, 24 (individuals A, B, C, D)]
Developing a
Sampling Distribution
(continued)
Now consider all possible samples of size n=2
16 possible samples (sampling with replacement):

1st Obs   2nd Observation
          18      20      22      24
18        18,18   18,20   18,22   18,24
20        20,18   20,20   20,22   20,24
22        22,18   22,20   22,22   22,24
24        24,18   24,20   24,22   24,24

16 Sample Means:

1st Obs   2nd Observation
          18   20   22   24
18        18   19   20   21
20        19   20   21   22
22        20   21   22   23
24        21   22   23   24
Developing a
Sampling Distribution
(continued)
Sampling Distribution of All Sample Means
$$\mu_{\bar X} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$$
$$\sigma_{\bar X} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$
[Figure: population distribution over X = 18, 20, 22, 24 (A, B, C, D) next to the sampling distribution of the sample mean over 18 to 24]
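The sampling distribution for this four-person population can be built by brute force. The sketch below enumerates all 16 samples of size 2 (with replacement) and compares the population's mean and standard deviation with those of the sample means. Variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import fmean, pstdev

population = [18, 20, 22, 24]  # ages of individuals A, B, C, D

# All possible samples of size n = 2, sampling with replacement
samples = list(product(population, repeat=2))
sample_means = [fmean(sample) for sample in samples]

# Population summary measures
mu = fmean(population)
sigma = pstdev(population)

# Summary measures of the sampling distribution of the mean
mu_xbar = fmean(sample_means)
sigma_xbar = pstdev(sample_means)

print(f"{len(samples)} samples")
print(f"population:   mean = {mu}, sd = {sigma:.3f}")
print(f"sample means: mean = {mu_xbar}, sd = {sigma_xbar:.3f}")
print(f"sigma / sqrt(n) = {sigma / sqrt(2):.3f}")
```

The standard deviation of the sample means matches σ divided by the square root of n, which is the standard error of the mean.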
Sample Mean Sampling Distribution:
Standard Error of the Mean
• Different samples of the same size from the same
population will yield different sample means
• A measure of the variability in the mean from sample to
sample is given by the Standard Error of the Mean:
$$\sigma_{\bar X} = \frac{\sigma}{\sqrt{n}}$$
(This assumes that sampling is with replacement or
sampling is without replacement from an infinite population)
Z-value for Sampling Distribution
of the Mean
• Z-value for the sampling distribution of $\bar X$:
$$Z = \frac{\bar X - \mu_{\bar X}}{\sigma_{\bar X}} = \frac{\bar X - \mu}{\sigma / \sqrt{n}}$$
Sample Mean Sampling Distribution:
If the Population is not Normal
$$\mu_{\bar x} = \mu \quad \text{and} \quad \sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}$$
Central Limit Theorem
As the sample size gets large enough (n↑), the sampling
distribution of the sample mean becomes almost normal
regardless of shape of population
Sample Mean Sampling Distribution:
If the Population is not Normal
(continued)
Sampling distribution properties:
• Central Tendency
$$\mu_{\bar x} = \mu$$
• Variation
$$\sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}$$
[Figure: Population Distribution above the Sampling Distribution of the mean, which becomes normal as n increases; the curve for a larger sample size is narrower than the curve for a smaller sample size]
How Large is Large Enough?
• For most distributions, n > 30 will give a sampling
distribution that is nearly normal
• For fairly symmetric distributions, n > 15 is usually sufficient
• For normal population distributions, the sampling
distribution of the mean is always normally
distributed
Example
• Suppose a population has mean μ = 8 and standard
deviation σ = 3. Suppose a random sample of size n
= 36 is selected.
• What is the probability that the sample mean is between 7.8 and 8.2?
Example
(continued)
Solution:
• Even if the population is not normally distributed,
the central limit theorem can be used (n > 30)
• … so the sampling distribution of $\bar x$ is approximately
normal
• … with mean $\mu_{\bar x} = 8$
• … and standard deviation
$$\sigma_{\bar x} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$$
Example
(continued)
Solution (continued):
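$$\begin{aligned}
P(7.8 < \bar X < 8.2) &= P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar X - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right) \\
&= P(-0.4 < Z < 0.4) = 0.6554 - 0.3446 = 0.3108
\end{aligned}$$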
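The solution above can be sketched in code by treating the sample mean as a normal variable with mean μ and standard deviation σ/√n, which is the approximation the central limit theorem supports for n = 36.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.0, 3.0, 36

# Standard error of the mean: sigma / sqrt(n)
standard_error = sigma / sqrt(n)

# Approximate sampling distribution of the sample mean
xbar = NormalDist(mu=mu, sigma=standard_error)

# P(7.8 < sample mean < 8.2)
p = xbar.cdf(8.2) - xbar.cdf(7.8)

print(f"standard error = {standard_error}")
print(f"P(7.8 < sample mean < 8.2) = {p:.4f}")
```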
• Approximated by a
normal distribution if:
$$n\pi \ge 5 \quad \text{and} \quad n(1 - \pi) \ge 5$$
where
$$\mu_p = \pi \quad \text{and} \quad \sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$$
(where π = population proportion)
[Figure: Sampling Distribution of the sample proportion, P(p) plotted against p from 0 to 1]
Z-Value for Proportions
Standardize p to a Z value with the formula:
$$Z = \frac{p - \pi}{\sigma_p} = \frac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}}$$
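A sketch of the standardization for a sample proportion. The numbers used (π = 0.40, n = 200, p = 0.45) are hypothetical values chosen for illustration; they are not taken from the slides, and `proportion_z` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(p, pi, n):
    """Z value for a sample proportion p, given population proportion pi and sample size n."""
    sigma_p = sqrt(pi * (1 - pi) / n)  # standard error of the proportion
    return (p - pi) / sigma_p

# Hypothetical inputs, for illustration only
pi, n = 0.40, 200

# The normal approximation requires n*pi >= 5 and n*(1 - pi) >= 5
approximation_ok = n * pi >= 5 and n * (1 - pi) >= 5

z = proportion_z(0.45, pi, n)

# P(pi <= p <= 0.45) under the normal approximation
p_between = NormalDist().cdf(z) - NormalDist().cdf(0.0)

print(f"approximation ok: {approximation_ok}")
print(f"Z = {z:.4f}, P(0.40 <= p <= 0.45) = {p_between:.4f}")
```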
Example
Example
(continued)
Example
(continued)
[Figure: standardizing the sample proportion to Z; shaded area 0.4251]