
The Normal Distribution

• Using Statistics
• Properties of the Normal Distribution
• The Standard Normal Distribution
• The Transformation of Normal Random Variables
• The Inverse Transformation
• The Normal Approximation of Binomial Distributions

LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Identify when a random variable will be normally distributed
• Use the properties of normal distributions
• Explain the significance of the standard normal distribution
• Compute probabilities using normal distribution tables
• Transform a normal distribution into a standard normal distribution
• Convert a binomial distribution into an approximate normal distribution

Introduction

As n increases, the binomial distribution approaches a ...

[Figure: Binomial Distribution with p = .5 for n = 6, n = 10, and n = 14; P(x) plotted against x]

Normal Probability Density Function:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \quad \text{for } -\infty < x < \infty$$

where $e = 2.7182818\ldots$ and $\pi = 3.14159265\ldots$

[Figure: Normal Distribution: $\mu = 0$, $\sigma = 1$; f(x) plotted against x]
The Normal Probability Distribution

The normal probability density function:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \quad \text{for } -\infty < x < \infty$$

where $e = 2.7182818\ldots$ and $\pi = 3.14159265\ldots$

[Figure: Normal Distribution: $\mu = 0$, $\sigma = 1$; f(x) plotted against x]

Properties of the Normal Distribution

• The normal is a family of bell-shaped and symmetric distributions. Because the distribution is symmetric, one-half (.50 or 50%) lies on either side of the mean.
• Each is characterized by a different pair of mean, $\mu$, and variance, $\sigma^2$. That is: $X \sim N(\mu, \sigma^2)$.
• Each is asymptotic to the horizontal axis.
• The area under any normal probability density function within $k\sigma$ of $\mu$ is the same for any normal distribution, regardless of the mean and variance.

Properties of the Normal Distribution (continued)

• If several independent random variables are normally distributed, then their sum will also be normally distributed.
• The mean of the sum will be the sum of all the individual means.
• The variance of the sum will be the sum of all the individual variances (by virtue of the independence).

Properties of the Normal Distribution (continued)

• If $X_1, X_2, \ldots, X_n$ are independent normal random variables, then their sum $S$ will also be normally distributed with
  • $E(S) = E(X_1) + E(X_2) + \cdots + E(X_n)$
  • $V(S) = V(X_1) + V(X_2) + \cdots + V(X_n)$
• Note: It is the variances that can be added above and not the standard deviations.

Properties of the Normal Distribution: Example

Example 4.1: Let $X_1$, $X_2$, and $X_3$ be independent random variables that are normally distributed with means and variances as shown.

        Mean   Variance
X1       10       1
X2       20       2
X3       30       3

Let $S = X_1 + X_2 + X_3$. Then $E(S) = 10 + 20 + 30 = 60$ and $V(S) = 1 + 2 + 3 = 6$. The standard deviation of $S$ is $\sqrt{6} = 2.45$.

Properties of the Normal Distribution (continued)

• If $X_1, X_2, \ldots, X_n$ are independent normal random variables, then the random variable $Q$ defined as $Q = a_1X_1 + a_2X_2 + \cdots + a_nX_n + b$ will also be normally distributed with
  • $E(Q) = a_1E(X_1) + a_2E(X_2) + \cdots + a_nE(X_n) + b$
  • $V(Q) = a_1^2V(X_1) + a_2^2V(X_2) + \cdots + a_n^2V(X_n)$
• Note: It is the variances that can be added above and not the standard deviations.
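The rules for $E(Q)$ and $V(Q)$ can be checked numerically. The following is a minimal sketch, not part of the original slides; the helper name `linear_combination_moments` is an assumption introduced for illustration, and it is applied to the numbers of Example 4.1 (all coefficients equal to 1, constant $b = 0$).

```python
import math

def linear_combination_moments(coeffs, means, variances, constant=0.0):
    """Mean and variance of Q = a1*X1 + ... + an*Xn + b for independent Xi.

    Hypothetical helper for illustration; it assumes the Xi are independent.
    """
    mean_q = sum(a * m for a, m in zip(coeffs, means)) + constant
    # Variances are weighted by squared coefficients; the constant b adds none.
    var_q = sum(a * a * v for a, v in zip(coeffs, variances))
    return mean_q, var_q

# Example 4.1: S = X1 + X2 + X3
mean_s, var_s = linear_combination_moments([1, 1, 1], [10, 20, 30], [1, 2, 3])
sd_s = math.sqrt(var_s)  # standard deviation is the square root of the variance
print(mean_s, var_s, round(sd_s, 2))
```

Note that the standard deviation is obtained only at the end, from the summed variance; adding the individual standard deviations would give a different (wrong) number.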

Properties of the Normal Distribution: Example

Example 4.3: Let $X_1$, $X_2$, $X_3$ and $X_4$ be independent random variables that are normally distributed with means and variances as shown. Find the mean and variance of $Q = X_1 - 2X_2 + 3X_3 - 4X_4 + 5$.

        Mean   Variance
X1       12       4
X2       -5       2
X3        8       5
X4       10       1

$E(Q) = 12 - 2(-5) + 3(8) - 4(10) + 5 = 11$
$V(Q) = 4 + (-2)^2(2) + 3^2(5) + (-4)^2(1) = 73$
$SD(Q) = \sqrt{73} = 8.544$

Normal Probability Distributions

All of these are normal probability density functions, though each has a different mean and variance.

[Figure: four normal density curves: $W \sim N(40, 1)$ with $\mu = 40$, $\sigma = 1$; $X \sim N(30, 25)$ with $\mu = 30$, $\sigma = 5$; $Y \sim N(50, 9)$ with $\mu = 50$, $\sigma = 3$; $Z \sim N(0, 1)$ with $\mu = 0$, $\sigma = 1$]

Consider:
• $P(39 \le W \le 41)$
• $P(25 \le X \le 35)$
• $P(47 \le Y \le 53)$
• $P(-1 \le Z \le 1)$

The probability in each case is an area under a normal probability density function.
The Standard Normal Distribution

The standard normal random variable, Z, is the normal random variable with mean $\mu = 0$ and standard deviation $\sigma = 1$: $Z \sim N(0, 1^2)$.

[Figure: Standard Normal Distribution; f(z) plotted against z, with $\mu = 0$ and $\sigma = 1$]

Finding Probabilities of the Standard Normal Distribution: $P(0 \le Z \le 1.56)$

Standard Normal Probabilities (area between 0 and z)

  z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
 0.0  0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
 0.1  0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
 0.2  0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
 0.3  0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
 0.4  0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
 0.5  0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
 0.6  0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
 0.7  0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
 0.8  0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
 0.9  0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
 1.0  0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
 1.1  0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
 1.2  0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
 1.3  0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
 1.4  0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
 1.5  0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
 1.6  0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
 1.7  0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
 1.8  0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
 1.9  0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
 2.0  0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
 2.1  0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
 2.2  0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
 2.3  0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
 2.4  0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
 2.5  0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
 2.6  0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
 2.7  0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
 2.8  0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
 2.9  0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
 3.0  0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990

Look in the row labeled 1.5 and the column labeled .06 to find $P(0 \le Z \le 1.56) = 0.4406$.

[Figure: Standard Normal Distribution with the area between 0 and 1.56 shaded]
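The table entries are areas under the standard normal curve between 0 and z. As a rough cross-check of a lookup, the same area can be computed from the error function, since $P(0 \le Z \le z) = \tfrac{1}{2}\,\mathrm{erf}(z/\sqrt{2})$. This is a minimal sketch; the function name `table_area` is an assumption introduced here and does not come from the slides.

```python
import math

def table_area(z):
    """Area under the standard normal curve between 0 and z.

    Hypothetical helper mirroring one table entry: P(0 <= Z <= z).
    """
    return 0.5 * math.erf(z / math.sqrt(2.0))

area = table_area(1.56)  # row 1.5, column .06 of the table
print(round(area, 4))
```

Rounded to four decimals, the result should agree with the table entry found in row 1.5, column .06.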

Finding Probabilities of the Standard Normal Distribution: $P(Z < -2.47)$

To find $P(Z < -2.47)$:

1. Find the table area for 2.47 (row 2.4, column .07): $P(0 < Z < 2.47) = .4932$
2. $P(Z < -2.47) = .5 - P(0 < Z < 2.47) = .5 - .4932 = 0.0068$

[Figure: Standard Normal Distribution; the table area for 2.47 is $P(0 < Z < 2.47) = 0.4932$ and the area to the left of -2.47 is $P(Z < -2.47) = .5 - 0.4932 = 0.0068$]

Finding Probabilities of the Standard Normal Distribution: $P(1 \le Z \le 2)$

To find $P(1 \le Z \le 2)$:

1. Find the table area for 2.00: $F(2) = P(Z \le 2.00) = .5 + .4772 = .9772$
2. Find the table area for 1.00: $F(1) = P(Z \le 1.00) = .5 + .3413 = .8413$
3. $P(1 \le Z \le 2.00) = P(Z \le 2.00) - P(Z \le 1.00) = .9772 - .8413 = 0.1359$

[Figure: Standard Normal Distribution; area between 1 and 2, $P(1 \le Z \le 2) = .9772 - .8413 = 0.1359$]
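Both table calculations above can be reproduced with a cumulative distribution function. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Standard normal random variable Z ~ N(0, 1)
Z = NormalDist(mu=0.0, sigma=1.0)

# Left-tail area: P(Z < -2.47)
p_left = Z.cdf(-2.47)

# Area between two points: P(1 <= Z <= 2) = F(2) - F(1)
p_between = Z.cdf(2.0) - Z.cdf(1.0)

print(round(p_left, 4), round(p_between, 4))
```

The cumulative function F(z) used here already includes the .5 to the left of zero, which is why no explicit ".5 +" or ".5 -" step appears in the code.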
Finding Values of the Standard Normal Random Variable: $P(0 \le Z \le z) = 0.40$

To find z such that $P(0 \le Z \le z) = .40$:

1. Find a probability as close as possible to .40 in the table of standard normal probabilities. The closest entry is .3997.
2. Then determine the value of z from the corresponding row and column (row 1.2, column .08).

$P(0 \le Z \le 1.28) \approx .40$

Also, since $P(Z \le 0) = .50$:

$P(Z \le 1.28) \approx .90$

[Figure: Standard Normal Distribution; area to the left of 0 = .50, area between 0 and Z = 1.28 is .40 (.3997)]

99% Interval around the Mean

To have .99 in the center of the distribution, there should be (1/2)(1 - .99) = (1/2)(.01) = .005 in each tail of the distribution, and (1/2)(.99) = .495 in each half of the .99 interval. That is:

$P(0 \le Z \le z_{.005}) = .495$

Look to the table of standard normal probabilities to find that .495 falls between the entries .4949 (row 2.5, column .07) and .4951 (row 2.5, column .08), so:

$z_{.005} \approx 2.575$

$P(-2.575 \le Z \le 2.575) = .99$

[Figure: Standard Normal Distribution; total area in center = .99, area in center left = .495, area in center right = .495, area in left tail = .005, area in right tail = .005, with $-z_{.005} = -2.575$ and $z_{.005} = 2.575$]
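The reverse table lookup (from a probability to a z value) corresponds to the inverse cumulative distribution function. The following sketch assumes Python 3.8 or later and uses `statistics.NormalDist.inv_cdf`; the variable names are illustrative. Because the table is read to the nearest entry, the values obtained here can differ slightly in the third decimal from 1.28 and 2.575.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, mean 0 and standard deviation 1

# z with P(0 <= Z <= z) = .40, i.e. P(Z <= z) = .50 + .40 = .90
z_40 = Z.inv_cdf(0.50 + 0.40)

# z_.005 cuts off a right-tail area of .005, i.e. P(Z <= z) = .995
z_005 = Z.inv_cdf(1.0 - 0.005)

print(round(z_40, 3), round(z_005, 3))
```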

The Transformation of Normal Random Variables

The area within $k\sigma$ of the mean is the same for all normal random variables. So an area under any normal distribution is equivalent to an area under the standard normal. In this example: $P(40 \le X \le 60) = P(-1 \le Z \le 1)$, since $\mu = 50$ and $\sigma = 10$.

The transformation of X to Z:

$$Z = \frac{X - \mu_x}{\sigma_x}$$

(1) Subtraction: $(X - \mu_x)$
(2) Division by $\sigma_x$

The inverse transformation of Z to X:

$$X = \mu_x + Z\sigma_x$$

[Figure: Normal Distribution with $\mu = 50$, $\sigma = 10$ transformed into the Standard Normal Distribution with $\sigma = 1.0$]

Example: Using the Normal Transformation

$X \sim N(160, 30^2)$

$$P(100 \le X \le 180) = P\left(\frac{100 - \mu}{\sigma} \le \frac{X - \mu}{\sigma} \le \frac{180 - \mu}{\sigma}\right)$$
$$= P\left(\frac{100 - 160}{30} \le Z \le \frac{180 - 160}{30}\right)$$
$$= P(-2 \le Z \le .6666)$$
$$= 0.4772 + 0.2475 = 0.7247$$
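The two-step transformation (subtract the mean, divide by the standard deviation) can be sketched as a small function. This is an illustration under the assumption of Python 3.8 or later; `normal_interval_probability` is a hypothetical name, not something defined in the slides.

```python
from statistics import NormalDist

def normal_interval_probability(lo, hi, mu, sigma):
    """P(lo <= X <= hi) for X ~ N(mu, sigma^2) via the Z transformation.

    Hypothetical helper; returns the probability and both z values.
    """
    z_lo = (lo - mu) / sigma  # (1) subtraction, (2) division by sigma
    z_hi = (hi - mu) / sigma
    Z = NormalDist()
    return Z.cdf(z_hi) - Z.cdf(z_lo), z_lo, z_hi

# X ~ N(160, 30^2): P(100 <= X <= 180)
p, z_lo, z_hi = normal_interval_probability(100, 180, 160, 30)
print(round(z_lo, 4), round(z_hi, 4), round(p, 4))
```

The result should be close to the table-based 0.7247; small differences in the fourth decimal come from interpolating the table at z = .6666.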
Using the Normal Transformation

The transformation of X to Z:

$$Z = \frac{X - \mu_x}{\sigma_x}$$

The inverse transformation of Z to X:

$$X = \mu_x + Z\sigma_x$$

Example

$X \sim N(127, 22^2)$

$$P(X < 150) = P\left(\frac{X - \mu}{\sigma} < \frac{150 - \mu}{\sigma}\right)$$
$$= P\left(Z < \frac{150 - 127}{22}\right)$$
$$= P(Z < 1.045)$$
$$= 0.5 + 0.3520 = 0.8520$$

The transformation of X to Z, where a and b are numbers:

$$P(X < a) = P\left(Z < \frac{a - \mu}{\sigma}\right)$$
$$P(X > b) = P\left(Z > \frac{b - \mu}{\sigma}\right)$$
$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right)$$
Normal Probabilities (Empirical Rule)

• The probability that a normal random variable will be within 1 standard deviation from its mean (on either side) is 0.6826, or approximately 0.68.
• The probability that a normal random variable will be within 2 standard deviations from its mean is 0.9544, or approximately 0.95.
• The probability that a normal random variable will be within 3 standard deviations from its mean is 0.9974.

[Figure: Standard Normal Distribution; f(z) plotted against z]

The Inverse Transformation

The area within $k\sigma$ of the mean is the same for all normal random variables. To find a probability associated with any interval of values for any normal random variable, all that is needed is to express the interval in terms of numbers of standard deviations from the mean. That is the purpose of the standard normal transformation. If $X \sim N(50, 10^2)$,

$$P(X > 70) = P\left(\frac{X - \mu}{\sigma} > \frac{70 - \mu}{\sigma}\right) = P\left(Z > \frac{70 - 50}{10}\right) = P(Z > 2)$$

That is, $P(X > 70)$ can be found easily because 70 is 2 standard deviations above the mean of X: $70 = \mu + 2\sigma$. $P(X > 70)$ is equivalent to $P(Z > 2)$, an area under the standard normal distribution.

Example 4-12

$X \sim N(124, 12^2)$

$P(X > x) = 0.10$ and $P(Z > 1.28) \approx 0.10$ (the table entry closest to .40 is .3997, in row 1.2, column .08)

$x = \mu + z\sigma = 124 + (1.28)(12) = 139.36$

[Figure: Normal Distribution: $\mu = 124$, $\sigma = 12$, with the point 139.36 marked]
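The empirical-rule probabilities listed earlier in this section can be checked directly. The sketch below is illustrative and assumes Python 3.8 or later; note that the slide values (0.6826, 0.9544, 0.9974) are doubled four-decimal table entries, so the computed values can differ from them in the fourth decimal.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# P(-k <= Z <= k) for k = 1, 2, 3 standard deviations
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, prob in within.items():
    print(k, round(prob, 4))
```

Because the area within $k\sigma$ of the mean is the same for every normal distribution, computing these once for Z covers any $X \sim N(\mu, \sigma^2)$.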

The Inverse Transformation (Continued)

Example

$X \sim N(5.7, 0.5^2)$

$P(X > x) = 0.01$ and $P(Z > 2.33) \approx 0.01$ (table entry .4901 in row 2.3, column .03)

$x = \mu + z\sigma = 5.7 + (2.33)(0.5) = 6.865$

[Figure: Normal Distribution: $\mu = 5.7$, $\sigma = 0.5$, with area = 0.49 between the mean and $X_{.01} = 6.865$ and area = 0.01 in the right tail; the corresponding standard normal point is $Z_{.01} = 2.33$]

Example

$X \sim N(2450, 400^2)$

$P(a < X < b) = 0.95$ and $P(-1.96 < Z < 1.96) \approx 0.95$ (table entry .4750 in row 1.9, column .06)

$x = \mu \pm z\sigma = 2450 \pm (1.96)(400) = 2450 \pm 784 = (1666, 3234)$

$P(1666 < X < 3234) = 0.95$

[Figure: Normal Distribution: $\mu = 2450$, $\sigma = 400$, and the Standard Normal Distribution, each with area .4750 on either side of the mean and .0250 in each tail, between -1.96 and 1.96]
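The inverse transformation $x = \mu + z\sigma$ used in these examples can be sketched as follows. This is an illustration assuming Python 3.8 or later; the function names `upper_tail_value` and `central_interval` are hypothetical. The code uses unrounded z values, so its results differ slightly from the slide answers, which use z rounded to 1.28, 2.33 and 1.96.

```python
from statistics import NormalDist

def upper_tail_value(mu, sigma, tail_area):
    """x such that P(X > x) = tail_area for X ~ N(mu, sigma^2)."""
    z = NormalDist().inv_cdf(1.0 - tail_area)
    return mu + z * sigma  # inverse transformation of Z to X

def central_interval(mu, sigma, coverage):
    """Symmetric (a, b) around mu with P(a < X < b) = coverage."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return mu - z * sigma, mu + z * sigma

x_10 = upper_tail_value(124, 12, 0.10)      # Example 4-12
x_01 = upper_tail_value(5.7, 0.5, 0.01)     # X ~ N(5.7, 0.5^2)
lo, hi = central_interval(2450, 400, 0.95)  # X ~ N(2450, 400^2)
print(round(x_10, 2), round(x_01, 3), round(lo), round(hi))
```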

Finding Values of a Normal Random Variable, Given a Probability

1. Draw pictures of the normal distribution in question and of the standard normal distribution.
2. Shade the area corresponding to the desired probability.
3. From the table of the standard normal distribution, find the z value or values.
4. Use the transformation from z to x to get value(s) of the original random variable.

  z     .05     .06     .07
 1.8  0.4678  0.4686  0.4693
 1.9  0.4744  0.4750  0.4756
 2.0  0.4798  0.4803  0.4808

$x = \mu \pm z\sigma = 2450 \pm (1.96)(400) = 2450 \pm 784 = (1666, 3234)$

[Figure: Normal Distribution: $\mu = 2450$, $\sigma = 400$, and the Standard Normal Distribution, each with central area .9500 made up of two areas of .4750; the standard normal limits are -1.96 and 1.96]

The Normal Approximation of Binomial Distribution

The normal distribution with $\mu = 3.5$ and $\sigma = 1.323$ is a close approximation to the binomial with n = 7 and p = 0.50.

Normal Distribution: $\mu = 3.5$, $\sigma = 1.323$: $P(x < 4.5) = 0.7749$
Binomial Distribution: n = 7, p = 0.50: $P(x \le 4) = 0.7734$

[Figure: Normal Distribution with $\mu = 3.5$, $\sigma = 1.323$ next to Binomial Distribution with n = 7, p = 0.50]

    MTB > cdf 4.5;
    SUBC> normal 3.5 1.323.

    Cumulative Distribution Function
    Normal with mean = 3.50000 and standard deviation = 1.32300
         x    P( X <= x)
    4.5000        0.7751

    MTB > cdf 4;
    SUBC> binomial 7,.5.

    Cumulative Distribution Function
    Binomial with n = 7 and p = 0.500000
         x    P( X <= x)
      4.00        0.7734

The Normal Approximation of Binomial Distribution

The normal distribution with $\mu = 5.5$ and $\sigma = 1.6583$ is a closer approximation to the binomial with n = 11 and p = 0.50.

Normal Distribution: $\mu = 5.5$, $\sigma = 1.6583$: $P(x < 4.5) = 0.2732$
Binomial Distribution: n = 11, p = 0.50: $P(x \le 4) = 0.2744$

[Figure: Normal Distribution with $\mu = 5.5$, $\sigma = 1.6583$ next to Binomial Distribution with n = 11, p = 0.50]

Approximating a Binomial Probability Using the Normal Distribution

$$P(a \le X \le b) \approx P\left(\frac{a - np}{\sqrt{np(1-p)}} \le Z \le \frac{b - np}{\sqrt{np(1-p)}}\right)$$

for n large ($n \ge 50$) and p not too close to 0 or 1.00

or:

$$P(a \le X \le b) \approx P\left(\frac{a - 0.5 - np}{\sqrt{np(1-p)}} \le Z \le \frac{b + 0.5 - np}{\sqrt{np(1-p)}}\right)$$

for n moderately large ($20 \le n < 50$).

NOTE: If p is either small (close to 0) or large (close to 1), use the Poisson approximation.
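The comparison between an exact binomial probability and its normal approximation can be reproduced in a few lines. This is a minimal sketch assuming Python 3.8 or later (for `math.comb` and `statistics.NormalDist`); the function names are hypothetical. The approximation uses the 0.5 continuity correction, matching the $P(x < 4.5)$ calculations on the slides.

```python
import math
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k + 0.5)

exact = binomial_cdf(4, 11, 0.5)        # n = 11, p = 0.50
approx = normal_approx_cdf(4, 11, 0.5)  # normal with mu = 5.5, sigma = 1.6583
print(round(exact, 4), round(approx, 4))
```

Calling the same two functions with n = 7 gives the pair of values shown in the MINITAB output for the smaller sample.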
Confidence Interval

Using Statistics

• Consider the following statements:

  $\bar{x} = 550$
  • A single-valued estimate that conveys little information about the actual value of the population mean.

  We are 99% confident that $\mu$ is in the interval [449,551]
  • An interval estimate which locates the population mean within a narrow interval, with a high level of confidence.

  We are 90% confident that $\mu$ is in the interval [400,700]
  • An interval estimate which locates the population mean within a broader interval, with a lower level of confidence.

Confidence Interval or Interval Estimate

A confidence interval or interval estimate is a range or interval of numbers believed to include an unknown population parameter. Associated with the interval is a measure of the confidence we have that the interval does indeed contain the parameter of interest.

• A confidence interval or interval estimate has two components:
  • A range or interval of values
  • An associated level of confidence

Confidence Interval for $\mu$ When $\sigma$ Is Known

• If the population distribution is normal, the sampling distribution of the mean is normal.
• If the sample is sufficiently large, regardless of the shape of the population distribution, the sampling distribution is approximately normal (Central Limit Theorem).

In either case:

$$P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$

or

$$P\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$

[Figure: Standard Normal Distribution: 95% Interval]

Confidence Interval for $\mu$ When $\sigma$ Is Known (Continued)

Before sampling, there is a 0.95 probability that the interval

$$\mu \pm 1.96\frac{\sigma}{\sqrt{n}}$$

will include the sample mean (and 5% that it will not).

Conversely, after sampling, approximately 95% of such intervals

$$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$$

will include the population mean (and 5% of them will not).

That is, $\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$ is a 95% confidence interval for $\mu$.
A 95% Interval around the Population Mean

Approximately 95% of sample means can be expected to fall within the interval $\left[\mu - 1.96\frac{\sigma}{\sqrt{n}},\ \mu + 1.96\frac{\sigma}{\sqrt{n}}\right]$.

Conversely, about 2.5% can be expected to be above $\mu + 1.96\frac{\sigma}{\sqrt{n}}$ and 2.5% can be expected to be below $\mu - 1.96\frac{\sigma}{\sqrt{n}}$.

So 5% can be expected to fall outside the interval $\left[\mu - 1.96\frac{\sigma}{\sqrt{n}},\ \mu + 1.96\frac{\sigma}{\sqrt{n}}\right]$.

[Figure: Sampling Distribution of the Mean with 95% of the area between $\mu - 1.96\frac{\sigma}{\sqrt{n}}$ and $\mu + 1.96\frac{\sigma}{\sqrt{n}}$ and 2.5% in each tail; of the sample means $\bar{x}$ drawn below the curve, 2.5% fall below the interval, 95% fall within the interval, and 2.5% fall above the interval]

The 95% Confidence Interval for $\mu$

A 95% confidence interval for $\mu$ when $\sigma$ is known and sampling is done from a normal population, or a large sample is used, is:

$$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$$

The quantity $1.96\frac{\sigma}{\sqrt{n}}$ is often called the margin of error or the sampling error.

For example, if n = 25, $\sigma$ = 20, and $\bar{x}$ = 122, a 95% confidence interval is:

$$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} = 122 \pm 1.96\frac{20}{\sqrt{25}} = 122 \pm (1.96)(4) = 122 \pm 7.84 = [114.16, 129.84]$$
A $(1-\alpha)100\%$ Confidence Interval for $\mu$

We define $z_{\alpha/2}$ as the z value that cuts off a right-tail area of $\alpha/2$ under the standard normal curve. $(1-\alpha)$ is called the confidence coefficient. $\alpha$ is called the error probability, and $(1-\alpha)100\%$ is called the confidence level.

$$P\left(z > z_{\alpha/2}\right) = \frac{\alpha}{2}$$
$$P\left(z < -z_{\alpha/2}\right) = \frac{\alpha}{2}$$
$$P\left(-z_{\alpha/2} < z < z_{\alpha/2}\right) = (1 - \alpha)$$

$(1-\alpha)100\%$ Confidence Interval:

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

[Figure: Standard Normal Distribution with central area $(1-\alpha)$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and area $\alpha/2$ in each tail]

Critical Values of z and Levels of Confidence

 (1 - α)    α/2     z_{α/2}
   0.99    0.005    2.576
   0.98    0.010    2.326
   0.95    0.025    1.960
   0.90    0.050    1.645
   0.80    0.100    1.282
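The critical values in the table follow from the definition of $z_{\alpha/2}$ as the point with right-tail area $\alpha/2$. The sketch below recomputes them, assuming Python 3.8 or later; `z_critical` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z value cutting off a right-tail area of alpha/2 (hypothetical helper)."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

# Recompute the table of critical values, rounded to three decimals
table = {c: round(z_critical(c), 3) for c in (0.99, 0.98, 0.95, 0.90, 0.80)}
for confidence, z in table.items():
    print(confidence, z)
```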
The Level of Confidence and the Width of the Confidence Interval

When sampling from the same population, using a fixed sample size, the higher the confidence level, the wider the confidence interval.

80% Confidence Interval: $\bar{x} \pm 1.28\frac{\sigma}{\sqrt{n}}$
95% Confidence Interval: $\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$

[Figure: two Standard Normal Distribution plots, one for the 80% confidence interval and one for the 95% confidence interval]

The Sample Size and the Width of the Confidence Interval

When sampling from the same population, using a fixed confidence level, the larger the sample size, n, the narrower the confidence interval.

[Figure: two Sampling Distribution of the Mean plots, 95% Confidence Interval with n = 20 and 95% Confidence Interval with n = 40]

Example

• Shrimmy is planning to invest heavily in the black tiger breed. As part of the decision, the company wants to estimate the average amount of black tiger shrimp a family of four would need per month. A random sample of n = 100 families is obtained, and in this sample the average amount of shrimp in pounds per month is 6.5 and the population standard deviation is known to be 3.2. Construct a 95% confidence interval for the average amount of shrimp consumed by the entire population of families of 4.

Confidence Interval or Interval Estimate for $\mu$ When $\sigma$ Is Unknown: The t Distribution

If the population standard deviation, $\sigma$, is not known, replace $\sigma$ with the sample standard deviation, s. If the population is normal, the resulting statistic:

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

has a t distribution with (n - 1) degrees of freedom.

• The t is a family of bell-shaped and symmetric distributions, one for each number of degrees of freedom.
• The expected value of t is 0.
• For df > 2, the variance of t is df/(df-2). This is greater than 1, but approaches 1 as the number of degrees of freedom increases. The t is flatter and has fatter tails than does the standard normal.
• The t distribution approaches a standard normal as the number of degrees of freedom increases.

[Figure: density curves for the standard normal, t with df = 20, and t with df = 10]
Confidence Intervals for $\mu$ When $\sigma$ Is Unknown: The t Distribution

A $(1-\alpha)100\%$ confidence interval for $\mu$ when $\sigma$ is not known (assuming a normally distributed population) is given by:

$$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}$$

where $t_{\alpha/2}$ is the value of the t distribution with n-1 degrees of freedom that cuts off a tail area of $\alpha/2$ to its right.

The t Distribution

 df   t0.100  t0.050  t0.025  t0.010  t0.005
 ---  ------  ------  ------  ------  ------
   1   3.078   6.314  12.706  31.821  63.657
   2   1.886   2.920   4.303   6.965   9.925
   3   1.638   2.353   3.182   4.541   5.841
   4   1.533   2.132   2.776   3.747   4.604
   5   1.476   2.015   2.571   3.365   4.032
   6   1.440   1.943   2.447   3.143   3.707
   7   1.415   1.895   2.365   2.998   3.499
   8   1.397   1.860   2.306   2.896   3.355
   9   1.383   1.833   2.262   2.821   3.250
  10   1.372   1.812   2.228   2.764   3.169
  11   1.363   1.796   2.201   2.718   3.106
  12   1.356   1.782   2.179   2.681   3.055
  13   1.350   1.771   2.160   2.650   3.012
  14   1.345   1.761   2.145   2.624   2.977
  15   1.341   1.753   2.131   2.602   2.947
  16   1.337   1.746   2.120   2.583   2.921
  17   1.333   1.740   2.110   2.567   2.898
  18   1.330   1.734   2.101   2.552   2.878
  19   1.328   1.729   2.093   2.539   2.861
  20   1.325   1.725   2.086   2.528   2.845
  21   1.323   1.721   2.080   2.518   2.831
  22   1.321   1.717   2.074   2.508   2.819
  23   1.319   1.714   2.069   2.500   2.807
  24   1.318   1.711   2.064   2.492   2.797
  25   1.316   1.708   2.060   2.485   2.787
  26   1.315   1.706   2.056   2.479   2.779
  27   1.314   1.703   2.052   2.473   2.771
  28   1.313   1.701   2.048   2.467   2.763
  29   1.311   1.699   2.045   2.462   2.756
  30   1.310   1.697   2.042   2.457   2.750
  40   1.303   1.684   2.021   2.423   2.704
  60   1.296   1.671   2.000   2.390   2.660
 120   1.289   1.658   1.980   2.358   2.617
   ∞   1.282   1.645   1.960   2.326   2.576

[Figure: t Distribution: df = 10, with area = 0.10 beyond -1.372 and beyond 1.372, and area = 0.025 beyond -2.228 and beyond 2.228]

Whenever $\sigma$ is not known (and the population is assumed normal), the correct distribution to use is the t distribution with n-1 degrees of freedom. Note, however, that for large degrees of freedom, the t distribution is approximated well by the Z distribution.

Example

A blood analyst wants to estimate the average AFP index of the Vietnamese people. A random blood sample of size 15 yields an average of $\bar{x} = 10.37$ ng/ml and a standard deviation of s = 3.5 ng/ml. Assuming a normal population of the AFP values, give a 95% confidence interval for the average AFP value of the Vietnamese population. (AFP = alpha-fetoprotein)

 df   t0.100  t0.050  t0.025  t0.010  t0.005
 ---  ------  ------  ------  ------  ------
  13   1.350   1.771   2.160   2.650   3.012
  14   1.345   1.761   2.145   2.624   2.977
  15   1.341   1.753   2.131   2.602   2.947

The critical value of t for df = (n - 1) = (15 - 1) = 14 and a right-tail area of 0.025 is:

$$t_{0.025} = 2.145$$

The corresponding confidence interval or interval estimate is:

$$\bar{x} \pm t_{0.025}\frac{s}{\sqrt{n}} = 10.37 \pm 2.145\frac{3.5}{\sqrt{15}} = 10.37 \pm 1.94 = [8.43, 12.31]$$

Large Sample Confidence Intervals for the Population Mean

Whenever $\sigma$ is not known (and the population is assumed normal), the correct distribution to use is the t distribution with n-1 degrees of freedom. Note, however, that for large degrees of freedom, the t distribution is approximated well by the Z distribution.

 df   t0.100  t0.050  t0.025  t0.010  t0.005
 ---  ------  ------  ------  ------  ------
 120   1.289   1.658   1.980   2.358   2.617
   ∞   1.282   1.645   1.960   2.326   2.576

A large-sample $(1-\alpha)100\%$ confidence interval for $\mu$:

$$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$$

Example

An environmental scientist wants to estimate the average amount of NOx in a given region. A random sample of 100 data points gives $\bar{x}$ = 357.60 ppm and s = 140.00 ppm. Give a 95% confidence interval for $\mu$, the average amount of NOx in any sample taken.

$$\bar{x} \pm z_{0.025}\frac{s}{\sqrt{n}} = 357.60 \pm 1.96\frac{140.00}{\sqrt{100}} = 357.60 \pm 27.44 = [330.16, 385.04]$$

Exercise 1

Exercise 2
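Both the small-sample t interval and the large-sample z interval have the same form, sample mean plus or minus a critical value times $s/\sqrt{n}$. The sketch below makes that explicit. It is an illustration only: `mean_interval` is a hypothetical name, and the critical values (2.145 for t with 14 degrees of freedom, 1.96 for z) are passed in by hand from the tables, since the Python standard library has no t quantile function.

```python
import math

def mean_interval(xbar, s, n, critical_value):
    """Interval xbar +/- critical_value * s / sqrt(n) (hypothetical helper).

    critical_value is t_{alpha/2} (small normal sample) or z_{alpha/2}
    (large sample), looked up from the tables.
    """
    margin = critical_value * s / math.sqrt(n)
    return xbar - margin, xbar + margin

afp = mean_interval(10.37, 3.5, 15, 2.145)      # t interval, df = 14
nox = mean_interval(357.60, 140.00, 100, 1.96)  # large-sample z interval
print([round(v, 2) for v in afp], [round(v, 2) for v in nox])
```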
Large-Sample Confidence Intervals for the Population Proportion, p

The estimator of the population proportion, $p$, is the sample proportion, $\hat{p}$. If the sample size is large, $\hat{p}$ has an approximately normal distribution, with $E(\hat{p}) = p$ and $V(\hat{p}) = \frac{pq}{n}$, where $q = (1 - p)$. When the population proportion is unknown, use the estimated value, $\hat{p}$, to estimate the standard deviation of $\hat{p}$.

For estimating $p$, a sample is considered large enough when both $n \cdot p$ and $n \cdot q$ are greater than 5.

A large-sample $(1-\alpha)100\%$ confidence interval for the population proportion, $p$:

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

where the sample proportion, $\hat{p}$, is equal to the number of successes in the sample, x, divided by the number of trials (the sample size), n, and $\hat{q} = 1 - \hat{p}$.

Example

A marketing research firm wants to estimate the share that foreign companies have in the American market for certain products. A random sample of 100 consumers is obtained, and it is found that 34 people in the sample are users of foreign-made products; the rest are users of domestic products. Give a 95% confidence interval for the share of foreign products in this market.

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.34 \pm 1.96\sqrt{\frac{(0.34)(0.66)}{100}} = 0.34 \pm (1.96)(0.04737) = 0.34 \pm 0.0928 = [0.2472, 0.4328]$$

Thus, the firm may be 95% confident that foreign manufacturers control anywhere from 24.72% to 43.28% of the market.

Exercise 3
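The proportion interval from the marketing example can be sketched in code. This is an illustration; `proportion_interval` is a hypothetical name, and the default z = 1.96 corresponds to the 95% level used in the example.

```python
import math

def proportion_interval(successes, n, z=1.96):
    """Large-sample interval p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    margin = z * math.sqrt(p_hat * q_hat / n)  # estimated standard error times z
    return p_hat - margin, p_hat + margin

# 34 users of foreign-made products in a sample of 100 consumers
lo, hi = proportion_interval(34, 100)
print(round(lo, 4), round(hi, 4))
```

Before relying on such an interval, it is worth checking the large-sample condition stated above (both $n \cdot p$ and $n \cdot q$ greater than 5); here 34 and 66 both satisfy it.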
Confidence Intervals for the Population Variance: The Chi-Square ($\chi^2$) Distribution

• The sample variance, $s^2$, is an unbiased estimator of the population variance, $\sigma^2$.
• Confidence intervals for the population variance are based on the chi-square ($\chi^2$) distribution.
  • The chi-square distribution is the probability distribution of the sum of several independent, squared standard normal random variables.
  • The mean of the chi-square distribution is equal to the degrees of freedom parameter, ($E[\chi^2] = df$). The variance of a chi-square is equal to twice the number of degrees of freedom, ($V[\chi^2] = 2df$).

The Chi-Square ($\chi^2$) Distribution

• The chi-square random variable cannot be negative, so it is bounded by zero on the left.
• The chi-square distribution is skewed to the right.
• The chi-square distribution approaches a normal as the degrees of freedom increase.

[Figure: Chi-Square Distribution density curves for df = 10, df = 30, and df = 50]

In sampling from a normal population, the random variable:

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$

has a chi-square distribution with (n - 1) degrees of freedom.
Confidence Interval for the Population Variance

A $(1-\alpha)100\%$ confidence interval for the population variance* (where the population is assumed normal) is:

$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right]$$

where $\chi^2_{\alpha/2}$ is the value of the chi-square distribution with n - 1 degrees of freedom that cuts off an area of $\alpha/2$ to its right and $\chi^2_{1-\alpha/2}$ is the value of the distribution that cuts off an area of $\alpha/2$ to its left (equivalently, an area of $1 - \alpha/2$ to its right).

* Note: Because the chi-square distribution is skewed, the confidence interval for the population variance is not symmetric.

Example

In an automated process, a machine fills cans of coffee. If the average amount filled is different from what it should be, the machine may be adjusted to correct the mean. If the variance of the filling process is too high, however, the machine is out of control and needs to be repaired. Therefore, from time to time regular checks of the variance of the filling process are made. This is done by randomly sampling filled cans, measuring their amounts, and computing the sample variance. A random sample of 30 cans gives an estimate $s^2 = 18{,}540$. Give a 95% confidence interval for the population variance, $\sigma^2$.

$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right] = \left[\frac{(30-1)18540}{45.7},\ \frac{(30-1)18540}{16.0}\right] = [11765, 33604]$$
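The variance interval can be sketched as follows. This is an illustration only: `variance_interval` is a hypothetical name, and the chi-square critical values are passed in by hand (rounded to 45.7 and 16.0 as in the worked example), because the Python standard library does not provide chi-square quantiles.

```python
def variance_interval(s2, n, chi2_upper, chi2_lower):
    """Confidence interval for a population variance (hypothetical helper).

    chi2_upper cuts off alpha/2 to its right; chi2_lower cuts off alpha/2
    to its left. Both come from a chi-square table with n - 1 df.
    """
    numerator = (n - 1) * s2
    # Dividing by the larger critical value gives the lower limit.
    return numerator / chi2_upper, numerator / chi2_lower

# 30 cans, s^2 = 18,540, 95% confidence, df = 29
lo, hi = variance_interval(18540, 30, chi2_upper=45.7, chi2_lower=16.0)
print(round(lo), round(hi))
```

The resulting interval is not centered on $s^2$, which reflects the skewness of the chi-square distribution noted above.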
Example (continued)

                            Area in Right Tail
 df   .995   .990   .975   .950   .900   .100   .050   .025   .010   .005
 28  12.46  13.56  15.31  16.93  18.94  37.92  41.34  44.46  48.28  50.99
 29  13.12  14.26  16.05  17.71  19.77  39.09  42.56  45.72  49.59  52.34
 30  13.79  14.95  16.79  18.49  20.60  40.26  43.77  46.98  50.89  53.67

[Figure: Chi-Square Distribution: df = 29, with central area 0.95 and 0.025 in each tail; $\chi^2_{0.975} = 16.05$ and $\chi^2_{0.025} = 45.72$]

Exercise 4

Sample-Size Determination

Before determining the necessary sample size, three questions must be answered:

• How close do you want your sample estimate to be to the unknown parameter? (What is the desired bound, B?)
• What do you want the desired confidence level $(1-\alpha)$ to be so that the distance between your estimate and the parameter is less than or equal to B?
• What is your estimate of the variance (or standard deviation) of the population in question?

For example: A $(1-\alpha)$ Confidence Interval for $\mu$: $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, where the bound is $B = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$.

Sample Size and Standard Error

The sample size determines the bound of a statistic, since the standard error of a statistic shrinks as the sample size increases.

[Figure: sampling distributions of a statistic for sample size = n and sample size = 2n, each marked with the standard error of the statistic]
Minimum Sample Size: Mean and Proportion

Minimum required sample size in estimating the population mean, $\mu$:

$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{B^2}$$

Bound of estimate:

$$B = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Minimum required sample size in estimating the population proportion, $p$:

$$n = \frac{z_{\alpha/2}^2\,pq}{B^2}$$

Example

A microbiologist wants to conduct an experiment to estimate the average amount of micro-organisms in the water of a popular river. He plans to determine the average amount of micro-organisms to within 120 µg/ml, with 95% confidence. From past records, an estimate of the population standard deviation is s = 400 µg/ml. What is the minimum required sample size?

$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{B^2} = \frac{(1.96)^2(400)^2}{120^2} = 42.684 \Rightarrow 43$$
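The two sample-size formulas can be sketched as below. The function names are hypothetical and the code is an illustration; the result is rounded up because a sample size must be a whole number at least as large as the computed value. The proportion call at the end uses made-up inputs (p = 0.5, B = 0.05) purely to show usage; those numbers do not come from the slides.

```python
import math

def min_sample_size_mean(z, sigma, bound):
    """Smallest n with z * sigma / sqrt(n) <= bound."""
    return math.ceil((z * sigma / bound) ** 2)

def min_sample_size_proportion(z, p, bound):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= bound."""
    return math.ceil(z * z * p * (1 - p) / bound**2)

# Microbiologist example: B = 120, 95% confidence, sigma estimated as 400
n_required = min_sample_size_mean(1.96, 400, 120)

# Hypothetical proportion example (inputs assumed for illustration only)
n_prop = min_sample_size_proportion(1.96, 0.5, 0.05)
print(n_required, n_prop)
```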
