Normal Distribution
The most important type of random variable is
the normal (Gaussian) random variable, which has
a normal distribution. In fact, the binomial
distribution can be approximated by the normal
distribution.
Note that a normal (Gaussian) random variable is
continuous.
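$$f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
where $-\infty < x < \infty$, $-\infty < \mu < \infty$, and $\sigma > 0$.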
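As a quick numerical check of the density, the sketch below evaluates $f(x \mid \mu, \sigma)$ directly from the formula and compares it with `statistics.NormalDist` from the Python standard library. The function name `normal_pdf` and the sample values are illustrative assumptions, not part of the slides.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density with mean mu and standard deviation sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


# Illustrative values (assumed): x = 1, mu = 0, sigma = 2.
direct = normal_pdf(1.0, mu=0.0, sigma=2.0)
library = NormalDist(mu=0.0, sigma=2.0).pdf(1.0)
print(direct, library)  # the two values should agree
```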
[Figure: normal curves with varying $\sigma^2$ and fixed $\mu$]
Empirical Rule
For a distribution that is symmetrical and bell shaped
(in particular, for a normal distribution):
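Approximately 68% of the data fall in the interval $(\mu - \sigma,\ \mu + \sigma)$
Approximately 95% of the data fall in the interval $(\mu - 2\sigma,\ \mu + 2\sigma)$
Approximately 99.7% of the data fall in the interval $(\mu - 3\sigma,\ \mu + 3\sigma)$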
Empirical Rule
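In fact, from this empirical rule one can conclude that the
probabilities of the events:
i. $\mu - \sigma < x < \mu + \sigma$
ii. $\mu - 2\sigma < x < \mu + 2\sigma$
iii. $\mu - 3\sigma < x < \mu + 3\sigma$
are approximately 0.68, 0.95, and 0.997, respectively.
That is,
$P(|x - \mu| < 1\sigma) \approx 0.68$
$P(|x - \mu| < 2\sigma) \approx 0.95$
$P(|x - \mu| < 3\sigma) \approx 0.997$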
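The empirical-rule probabilities can be checked against the exact normal areas. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `prob_within` is an assumption made for illustration.

```python
from statistics import NormalDist


def prob_within(k: float) -> float:
    """P(|x - mu| < k*sigma) for a normal variable, via the standard normal."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


# Exact areas within 1, 2, and 3 standard deviations of the mean.
within = {k: prob_within(k) for k in (1, 2, 3)}
for k, p in within.items():
    print(f"within {k} sigma: {p:.4f}")
```

The exact areas are close to, but not identical to, the rounded 68%, 95%, and 99.7% figures quoted in the rule.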
Control Charts
A control chart is used to examine data over a
period of equally spaced time intervals.
For a given random variable X, the control chart
is a plot of the observed values of X = x in time
sequence order.
Out-of-Control Signal-I:
One point beyond the three standard deviation level,
either above or below the center line (mean, $\mu$).
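Signal I is straightforward to check mechanically. The following is a minimal sketch, assuming the mean and standard deviation of the process are already known; the function name `signal_one`, the observations, and the values $\mu = 16$ and $\sigma = 3$ are all hypothetical.

```python
def signal_one(values, mu, sigma):
    """Return the 0-based positions of points beyond mu +/- 3*sigma."""
    upper = mu + 3 * sigma
    lower = mu - 3 * sigma
    return [i for i, x in enumerate(values) if x > upper or x < lower]


# Hypothetical observations in time order, with assumed mu = 16 and sigma = 3,
# so the control limits are 7 and 25.
observations = [15, 17, 14, 16, 26, 18, 13, 16, 5, 15]
flagged = signal_one(observations, mu=16, sigma=3)
print(flagged)  # positions of the out-of-control points
```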
Out-of-Control Signal-II:
Out-of-Control Signal-III:
[Figure: control chart of the observed values plotted against trial number]
Empirically,
$$P(|x - \mu| > 3\sigma) \approx 1 - 0.997 = 0.003$$
[Figures: control charts of the observed values plotted against trial number]
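Empirically, the probability that a point lies more than two standard deviations above the mean is
$$P(x > \mu + 2\sigma) \approx \frac{1 - 0.95}{2} = 0.025$$
More accurately, this probability is about 0.0228.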
Chebyshev's Theorem*
For any set of data (population or sample) with
more than one observation, regardless of the
distribution of the data set, the proportion of the
data that must be within k standard deviations on
either side of the mean (for k > 1) is at least
$$1 - \frac{1}{k^2}$$
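Chebyshev's bound, at least $1 - 1/k^2$ of the data within $k$ standard deviations of the mean for $k > 1$, can be compared with the proportion actually observed in a data set. The sketch below is illustrative only: the function names and the deliberately skewed data are assumptions.

```python
from statistics import fmean, pstdev


def chebyshev_bound(k: float) -> float:
    """Minimum proportion of data within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is informative only for k > 1")
    return 1.0 - 1.0 / k ** 2


def proportion_within(data, k: float) -> float:
    """Observed proportion of data within k population std devs of the mean."""
    mu = fmean(data)
    sd = pstdev(data)
    inside = sum(1 for x in data if abs(x - mu) <= k * sd)
    return inside / len(data)


# Hypothetical, deliberately non-normal data with one extreme value.
data = [1, 2, 2, 3, 3, 3, 4, 4, 20]
print(chebyshev_bound(2), proportion_within(data, 2))
```

The observed proportion can be much larger than the bound; Chebyshev only guarantees a minimum that holds for every distribution.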
z score:
$$z = \frac{x - \mu}{\sigma}$$
$$P(X \le x) = \int_{-\infty}^{x} f(t \mid \mu, \sigma)\,dt$$
Remarks
Note
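Let $y = x - \mu$. Then
$$E(y) = E(x - \mu) = E(x) + (-1)E(\mu) = \mu - \mu \;\Rightarrow\; E(y) = 0$$
$$V(y) = V(x - \mu) = V(x) + (-1)^2 V(\mu) = \sigma^2 + 0 \;\Rightarrow\; V(y) = \sigma^2$$
What are $E(z)$ and $V(z)$?
Let $z = \dfrac{x - \mu}{\sigma} = \dfrac{1}{\sigma}(x - \mu)$. Hence,
$$E(z) = E\!\left[\frac{1}{\sigma}(x - \mu)\right] = \frac{1}{\sigma}E(x - \mu) = \frac{1}{\sigma}\cdot 0 = 0 \;\Rightarrow\; E(z) = 0$$
$$V(z) = V\!\left[\frac{1}{\sigma}(x - \mu)\right] = \frac{1}{\sigma^2}V(x - \mu) = \frac{1}{\sigma^2}\,\sigma^2 = 1 \;\Rightarrow\; V(z) = 1$$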
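The result that standardized scores have mean 0 and variance 1 can be verified numerically on any data set. The sketch below treats a small hypothetical data set as a population, so it uses the population variance (`pvariance`); the data values are assumptions for illustration.

```python
from statistics import fmean, pvariance

# Hypothetical data treated as a complete population.
data = [48, 50, 50, 52, 52, 54]

mu = fmean(data)
sigma = pvariance(data) ** 0.5

# Standardize: z = (x - mu) / sigma.
z_scores = [(x - mu) / sigma for x in data]

mean_z = fmean(z_scores)
var_z = pvariance(z_scores)
print(mean_z, var_z)  # expected to be 0 and 1 up to rounding error
```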
Convention? Debatable.
Some instructors and books state that:
The area to the left of a z-value smaller than
−3.49 is 0.000.
It is better to state 0.0002, from the table.
The area to the left of a z-value greater than 3.49
is 1.000.
It is better to state 0.9998, from the table.
Always avoid absolute statements.
Example 3
a.
b. $P(z < 1.78) = 0.9625$
c. $P(z < -3.09) = 0.001$
d.
Example 4
$$P(-2.18 < z < 1.34) = P(z < 1.34) - P(z < -2.18) = 0.9099 - 0.0146 = 0.8953$$
$P(z < 1.34) = 0.9099$ (table row 1.3, column 0.04)
$P(z < -2.18) = 1 - P(z < 2.18) = 1 - 0.9854 = 0.0146$ (table row 2.1, column 0.08)
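The table lookups in Example 4 can be reproduced with the standard normal cumulative distribution function. This is a sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative. Small differences from the slide arise because the table values are rounded to four decimals.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

p_upper = Z.cdf(1.34)    # area to the left of z = 1.34
p_lower = Z.cdf(-2.18)   # area to the left of z = -2.18
p_between = p_upper - p_lower

# Symmetry used on the slide: P(z < -2.18) = 1 - P(z < 2.18).
p_lower_by_symmetry = 1 - Z.cdf(2.18)

print(f"{p_upper:.4f} {p_lower:.4f} {p_between:.4f}")
```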
Example 5
Given that the mean is 25 and the standard deviation is 5,
what is the probability that the observed data point is at most
28.15?
$$P(x \le 28.15) = P\!\left(\frac{x - \mu}{\sigma} \le \frac{28.15 - 25}{5}\right) = P(z \le 0.63) = 0.7357$$
Example 6
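Given $\mu = 4$, $\sigma = 2$:
$$P(3 < x < 6) = P\!\left(\frac{3 - 4}{2} < \frac{x - 4}{2} < \frac{6 - 4}{2}\right) = P(-0.50 < z < 1.00) = 0.8413 - 0.3085 = 0.5328$$
$P(z < 1.00) = 0.8413$ (table row 1.0, column 0.00)
By symmetry,
$P(z < -0.50) = 1 - P(z < 0.50) = 1 - 0.6915 = 0.3085$ (table row 0.5, column 0.00)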
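Examples 5 and 6 follow the same pattern: convert the raw scores to z-scores, then take areas under the standard normal curve. A minimal sketch of that pattern is given below; the helper name `prob_between` is an assumption, and the results may differ from the table values in the fourth decimal because of rounding.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def prob_between(lo: float, hi: float, mu: float, sigma: float) -> float:
    """P(lo < x < hi) for x normal with mean mu and std dev sigma."""
    z_lo = (lo - mu) / sigma
    z_hi = (hi - mu) / sigma
    return Z.cdf(z_hi) - Z.cdf(z_lo)


# Example 5: mean 25, standard deviation 5, P(x <= 28.15).
ex5 = Z.cdf((28.15 - 25) / 5)

# Example 6: mean 4, standard deviation 2, P(3 < x < 6).
ex6 = prob_between(3, 6, mu=4, sigma=2)

print(f"{ex5:.4f} {ex6:.4f}")
```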
Example 7
b.
Example 7 (continue)
c.
Sometimes we may be required to find the z
value or raw score x that corresponds to a given
area under the normal curve.
To do this, we look up the area associated with
the given problem and find the corresponding z
value.
Next, the raw score x can be computed as
follows:
$$x = \mu + z\sigma$$
Example 10
1. Find the z value such that 90% of the area
under the standard normal curve lies between
−z and z.
2. Find the z value such that 3% of the area under
the standard normal curve lies to the right of z.
3. If a random variable X is normally distributed
with mean 50 and standard deviation 10, find k
so that P(X ≤ k) = 0.99.
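Finding a z value from an area is the inverse of a table lookup, and the raw score then follows from $x = \mu + z\sigma$. The sketch below works through Example 10 with `NormalDist.inv_cdf` from the Python standard library. It assumes part 3 asks for the k with 99% of the area to its left; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

# 1. 90% between -z and z leaves 5% in each tail,
#    so z has 95% of the area to its left.
z_central_90 = Z.inv_cdf(0.95)

# 2. 3% of the area to the right of z means 97% to its left.
z_right_3 = Z.inv_cdf(0.97)

# 3. Assumed reading: P(X <= k) = 0.99 with mean 50 and std dev 10.
#    Convert the z value back to a raw score: k = mu + z * sigma.
k = 50 + Z.inv_cdf(0.99) * 10

print(f"{z_central_90:.3f} {z_right_3:.3f} {k:.2f}")
```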
Sampling Distribution
A sampling distribution is a
probability distribution of a
sample statistic based on all
possible simple random samples
of the same size from the same
population.
Sales Representative    Number Sold
Zina                    54
Jan                     48
Woon                    50
Molly                   50
Ernie                   52
Rachel                  52
Proof
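$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Prove: $\mu_{\bar{x}} = E(\bar{x}) = \mu$ and $\sigma_{\bar{x}}^2 = V(\bar{x}) = \dfrac{\sigma^2}{n}$, and therefore
the standard error for the sampling distribution is
$$SE = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$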
Detailed Proof
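$$\mu_{\bar{x}} = E(\bar{x}) = E\!\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right] = \frac{1}{n}\,E\!\left[\sum_{i=1}^{n} x_i\right]$$
$$= \frac{1}{n}\,E\left[x_1 + x_2 + \cdots + x_n\right] = \frac{1}{n}\left[E(x_1) + E(x_2) + \cdots + E(x_n)\right]$$
$$= \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}\sum_{i=1}^{n} \mu = \frac{1}{n}\,(\underbrace{\mu + \mu + \cdots + \mu}_{n \text{ times}})$$
$$= \frac{1}{n}\,(n\mu) = \mu$$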
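Assuming the observations $x_1, x_2, \ldots, x_n$ are independent,
$$\sigma_{\bar{x}}^2 = V(\bar{x}) = V\!\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right] = \frac{1}{n^2}\,V\!\left[\sum_{i=1}^{n} x_i\right]$$
$$= \frac{1}{n^2}\,V\left[x_1 + x_2 + \cdots + x_n\right] = \frac{1}{n^2}\left[V(x_1) + V(x_2) + \cdots + V(x_n)\right]$$
$$= \frac{1}{n^2}\sum_{i=1}^{n} V(x_i) = \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 = \frac{1}{n^2}\,(\underbrace{\sigma^2 + \sigma^2 + \cdots + \sigma^2}_{n \text{ times}})$$
$$= \frac{1}{n^2}\,(n\sigma^2) = \frac{\sigma^2}{n}$$
Therefore,
$$\sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$$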
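The two results, $E(\bar{x}) = \mu$ and $V(\bar{x}) = \sigma^2/n$, can be verified by brute force on a small population. The sketch below uses the six "Number Sold" values as the population and enumerates every ordered sample of size 2 drawn with replacement. Drawing with replacement is an assumption made here so that the draws are independent, as the variance derivation requires; the sample size and variable names are also illustrative.

```python
from itertools import product
from statistics import fmean, pvariance

# Population: the six "Number Sold" values.
population = [54, 48, 50, 50, 52, 52]
n = 2  # assumed sample size for the illustration

mu = fmean(population)
sigma2 = pvariance(population)

# Every ordered sample of size n, drawn with replacement (independent draws).
sample_means = [fmean(sample) for sample in product(population, repeat=n)]

mean_of_means = fmean(sample_means)    # should equal mu
var_of_means = pvariance(sample_means)  # should equal sigma2 / n
std_error = var_of_means ** 0.5         # should equal sigma / sqrt(n)

print(mu, mean_of_means)
print(sigma2 / n, var_of_means)
```

Sampling without replacement from a finite population gives the same mean but a smaller variance, which is why the independence assumption is stated explicitly.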
Example 12
Assume that the weights of marbles are normally
distributed with mean 172 grams and standard
deviation 29 grams.
a. If 4 marbles are selected, find the probability that their
mean weight is less than 167 grams.
b. If 25 marbles are selected, find the probability that
their mean weight is more than 167 grams.
c. If 100 marbles are selected, find the probability that
their mean weight is between 167 grams and
180 grams.
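One way to work Example 12 is to model the sample mean as normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$ and read the areas from that distribution. The following sketch does this with `statistics.NormalDist`; the helper name `sampling_dist` is an assumption, and answers obtained from a printed table may differ slightly because z is rounded to two decimals there.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 172.0, 29.0  # population mean and standard deviation (grams)


def sampling_dist(n: int) -> NormalDist:
    """Distribution of the sample mean for samples of size n."""
    return NormalDist(MU, SIGMA / sqrt(n))


# a. n = 4, P(mean < 167)
p_a = sampling_dist(4).cdf(167)

# b. n = 25, P(mean > 167)
p_b = 1 - sampling_dist(25).cdf(167)

# c. n = 100, P(167 < mean < 180)
dist_c = sampling_dist(100)
p_c = dist_c.cdf(180) - dist_c.cdf(167)

print(f"a: {p_a:.4f}  b: {p_b:.4f}  c: {p_c:.4f}")
```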
Continuity Correction
$$z = \frac{x - \mu}{\sigma} = \frac{x - np}{\sqrt{np(1 - p)}}$$
$$z = \frac{(x \pm 0.5) - \mu}{\sigma} = \frac{(x \pm 0.5) - np}{\sqrt{np(1 - p)}}$$
Example 13
The Denver Post stated that 80% of all new products introduced in
grocery stores fail (and are taken off the market) within 2 years.
Using normal approximation for this binomial distribution and
correction for continuity, if a grocery store chain introduces 75 new
products,
a. Verify that the assumption for normal approximation to the
binomial is satisfied.
b. What is the probability that within two years, 54 or more will fail?
c. What is the probability that within two years, fewer than 62 will
fail?
d. What is the probability that within two years, more than 49 will
fail?
e. What is the probability that within two years, 58 or fewer fail?
Example 13 (solution)
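a.
b. Without correction for continuity:
$$P(x \ge 54) = P\!\left(\frac{x - 60}{3.464} \ge \frac{54 - 60}{3.464}\right) = P(z \ge -1.73) = 0.9582$$
With correction for continuity, $\mu = 60$ and $\sigma = 3.464$:
$$P(x \ge 54 - 0.5) = P\!\left(\frac{x - 60}{3.464} \ge \frac{54 - 0.5 - 60}{3.464}\right) = P(z \ge -1.88) = 0.9699$$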
Example 13 (solution)
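c. Fewer than 62 means 61 or fewer. Without correction for continuity:
$$P(x \le 61) = P\!\left(\frac{x - 60}{3.464} \le \frac{61 - 60}{3.464}\right) = P(z \le 0.29) = 0.6141$$
With correction for continuity, $\mu = 60$ and $\sigma = 3.464$:
$$P(x \le 61 + 0.5) = P\!\left(\frac{x - 60}{3.464} \le \frac{61 + 0.5 - 60}{3.464}\right) = P(z \le 0.43) = 0.6664$$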
Example 13 (solution)
d. With correction for continuity
Example 13 (solution)
e. With correction for continuity
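The parts of Example 13 can be computed in one pass once each question is rewritten as an inequality on whole counts and shifted by 0.5. The sketch below is one way to do this with `statistics.NormalDist`. The adequacy check `np > 5` and `n(1 - p) > 5` is a common rule of thumb and is an assumption here, since the slide's own criterion is not shown; the variable names are illustrative, and results can differ from table answers in the third decimal because z is not rounded.

```python
from math import sqrt
from statistics import NormalDist

n, p = 75, 0.80
mu = n * p                      # 60
sigma = sqrt(n * p * (1 - p))   # about 3.464

# a. Assumed rule of thumb for using the normal approximation.
approx_ok = n * p > 5 and n * (1 - p) > 5

X = NormalDist(mu, sigma)

# b. 54 or more:    x >= 54  ->  x >= 53.5
p_b = 1 - X.cdf(54 - 0.5)
# c. fewer than 62: x <= 61  ->  x <= 61.5
p_c = X.cdf(61 + 0.5)
# d. more than 49:  x >= 50  ->  x >= 49.5
p_d = 1 - X.cdf(50 - 0.5)
# e. 58 or fewer:   x <= 58  ->  x <= 58.5
p_e = X.cdf(58 + 0.5)

print(f"b: {p_b:.4f}  c: {p_c:.4f}  d: {p_d:.4f}  e: {p_e:.4f}")
```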
QQ PLOT
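$$z = \frac{x - \mu}{\sigma} = \frac{1}{\sigma}\,x - \frac{\mu}{\sigma}$$
Let $m = \dfrac{1}{\sigma}$ and $b = -\dfrac{\mu}{\sigma}$. Then
$$z = mx + b$$
Hence, the data are approximately normal if the scatter plot of the data against
the corresponding z-scores (obtained by matching percentiles) is close to a
straight line.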
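The points of a QQ plot can be computed without any plotting library: sort the data, assign each value a percentile, and look up the z-score with the same percentile. The sketch below measures how close the points are to a line with the Pearson correlation. The plotting position $(i - 0.5)/n$ is one common convention and is an assumption here, as are the function names and the data.

```python
from statistics import NormalDist, fmean


def qq_points(data):
    """Pair each sorted value with the z-score at the matching percentile."""
    xs = sorted(data)
    n = len(xs)
    z = NormalDist()
    # Assumed plotting position: (i + 0.5) / n for 0-based rank i.
    return [(x, z.inv_cdf((i + 0.5) / n)) for i, x in enumerate(xs)]


def pearson_r(pairs):
    """Pearson correlation of the (x, z) pairs; near 1 means near-linear."""
    xs = [x for x, _ in pairs]
    zs = [z for _, z in pairs]
    mx, mz = fmean(xs), fmean(zs)
    sxz = sum((x - mx) * (z - mz) for x, z in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    szz = sum((z - mz) ** 2 for z in zs)
    return sxz / (sxx * szz) ** 0.5


# Hypothetical data set.
data = [48, 50, 50, 52, 52, 54]
r = pearson_r(qq_points(data))
print(f"correlation of QQ points: {r:.3f}")
```

A correlation close to 1 is consistent with normality but is not a formal test; the plot itself should still be inspected for curvature.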
[Figure: example QQ plots, one labeled "Approximately Normal" and one labeled "Not Normal"]
PERCENTAGE PLOT
Normality
Central Limit Theorem
Distribution
Normal Approximation
Normal Probability Distribution
Sampling Distribution
Standard Score
PP plot
QQ plot
Standardized plot
Percentage plot
Assignment Problems
Section 6.1: # 6.1
Section 6.2: # 6.6
Section 6.3: # 6.15, 6.17, 6.19, 6.27, 6.29
Section 6.4: # 6.31, 6.33, 6.41
Section 6.5: # 6.49, 6.54, 6.60
Section 6.6: # 6.65, 6.68, 6.70
Section 6.1
# 6.1 Determine if the following are continuous or
discrete random variables:
a. Number of characters in a document.
b. The amount of time it takes to make dinner.
c. The height of a palm tree.
Section 6.2
# 6.6 Illustrate the following curves, indicating the
points of inflection.
a. X~N()
b. X~N()
c. X~N()
Section 6.3
# 6.15 Determine the probability that the standard
normal random variable Z will assume a single value
between -1.42 and 0.75.
# 6.17 The random variable X is normally
distributed with mean and . Find the following
probabilities:
Section 6.6
# 6.65 Let the random variable X be binomially
distributed with and . Evaluate the following
probabilities:
c.
Using the normal approximation with
correction for continuity.
# 6.70 A fair die is rolled 200 times. Using normal
approximation to the binomial, what is the
probability that an ace (one) will appear between 34
and 36 times?