2 Interval Estimation

Interval Estimation and
Sample Size Decision

• Point estimation
• Interval estimation for
  - Population Mean
  - Population Proportion
  - Population Variance
• Sample size decision in estimating
  - Population Mean
  - Population Proportion
  - Population Variance
QAM – II by Gaurav Garg (IIM Lucknow)

Statistical Estimation
• We take data from a sample and say something about the population from which the sample was drawn.
• A sample statistic is used to estimate an unknown parameter.
• There are two types of estimation:
• Point Estimation:
  - Calculation of a single value of a sample statistic
• Interval Estimation:
  - Calculation of an interval using a sample statistic
  - This interval is calculated at a desired level of confidence, e.g. 95% or 99%; the level cannot be 100%
  - Sample-to-sample variation (standard error) is also taken into consideration.
Confidence Interval Estimates
• Let θ be the unknown parameter.
• Suppose T is the point estimate of θ and E(T) = θ.
• Fix the confidence level at (1 − α) × 100%.
• α is the probability of "error".
• (1 − α) is called the confidence coefficient.
• Thus, for a 95% confidence level, α = 0.05.
• The confidence interval estimate of θ is [T − h, T + h]
• It means that P(T − h ≤ θ ≤ T + h) = 1 − α
• where h = critical value × standard error

• The formula for the confidence interval is [T − h, T + h]
• T = unbiased (point) estimate of the unknown parameter
• h = critical value × standard error of the estimate
• The critical value is obtained using the confidence coefficient (1 − α) (discussed later)
• Lower Confidence Limit = T − h
• Upper Confidence Limit = T + h
[Figure: the point estimate lies at the centre of the interval, between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval.]
• Using the Central Limit Theorem, for a large sample
  $Z = \frac{T - \theta}{SE(T)} \sim N(0,1)$
• where T is the unbiased point estimate of θ
• SE(T) is the standard error of T.
• The confidence coefficient is fixed as (1 − α).
• The critical value is given by $z_{\alpha/2}$ as below
• $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$, where Z ~ N(0,1).
[Figure: N(0,1) density.]

• $Z = \frac{T - \theta}{SE(T)} \sim N(0,1)$
• For Z ~ N(0,1), $P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$
• This implies $P\left(-z_{\alpha/2} \le \frac{T - \theta}{SE(T)} \le z_{\alpha/2}\right) = 1 - \alpha$
• or $P\left(T - z_{\alpha/2}\,SE(T) \le \theta \le T + z_{\alpha/2}\,SE(T)\right) = 1 - \alpha$
• Thus the (1 − α) × 100% confidence interval estimate of θ is
• $\left[T - z_{\alpha/2}\,SE(T),\ T + z_{\alpha/2}\,SE(T)\right]$
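The general recipe [T − h, T + h] with h = critical value × standard error can be sketched in a few lines of Python. This is only an illustration under the large-sample normal assumption; the function names `z_critical` and `confidence_interval` and the numbers in the usage line are hypothetical and do not come from the slides.

```python
from statistics import NormalDist

def z_critical(alpha):
    """Two-sided critical value z_{alpha/2} of the N(0,1) distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def confidence_interval(estimate, std_error, alpha=0.05):
    """Return (T - h, T + h) with h = z_{alpha/2} * SE(T)."""
    h = z_critical(alpha) * std_error
    return estimate - h, estimate + h

# Hypothetical values for illustration only: T = 10.0, SE(T) = 2.0
lower, upper = confidence_interval(10.0, 2.0, alpha=0.05)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```

Passing the standard error as an argument keeps the same helper usable for means and proportions, where only SE(T) changes.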

Confidence Interval for Population Mean μ
(σ Known)
• When
  - the population standard deviation σ is known
  - the population is normally distributed
  - or, if the population is not normal, the sample size is large
• the (1 − α) × 100% confidence interval estimate of μ is given by
  $\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$
• where $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$, Z ~ N(0,1).
Commonly used confidence levels and corresponding critical values (N(0,1) distribution)
[Figure: N(0,1) density with central area 1 − α = 0.95 and area α/2 = 0.025 in each tail; −z_{α/2} = −1.96, z_{α/2} = 1.96.]

Confidence Level   Confidence Coefficient   α       Critical Value
80%                0.80                     0.20    1.28
90%                0.90                     0.10    1.645
95%                0.95                     0.05    1.96
98%                0.98                     0.02    2.33
99%                0.99                     0.01    2.58
99.8%              0.998                    0.002   3.09
99.9%              0.999                    0.001   3.29
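The critical values in this table can be recomputed rather than read off a printed table. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_value` is an assumption made for this illustration.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Two-sided N(0,1) critical value z_{alpha/2} for a confidence coefficient."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

levels = [0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999]
table = {c: critical_value(c) for c in levels}

for c, z in table.items():
    print(f"{c:>6.1%}  alpha = {1 - c:.3f}  z = {z:.3f}")
```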
[Figure: distribution of the sample mean, $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$, with central area 1 − α and area α/2 in each tail, and the confidence intervals $\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$ obtained for different samples. (1 − α) × 100% of such intervals will contain μ.]
• Example:
• A sample of 11 circuits from a large normal population
has a mean resistance of 2.20 ohms.
• We know from past testing that the population standard
deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean
resistance of the population.
• Ans.
  $\bar{x} \pm z_{0.025}\frac{\sigma}{\sqrt{n}}$
  = 2.20 ± 1.96 (0.35/√11)
  = 2.20 ± 0.2068
  = (1.9932, 2.4068)
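The arithmetic of this example can be checked with a short Python sketch. The function name `mean_ci_sigma_known` is hypothetical; the inputs are the ones given in the example.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_sigma_known(xbar, sigma, n, confidence=0.95):
    """CI for the population mean when sigma is known: xbar +/- z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    h = z * sigma / sqrt(n)
    return xbar - h, xbar + h

# Circuit example: n = 11, mean 2.20 ohms, sigma = 0.35 ohms, 95% confidence
lower, upper = mean_ci_sigma_known(2.20, 0.35, 11)
print(f"({lower:.4f}, {upper:.4f})")
```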
Confidence Interval for Population Mean μ
(σ Unknown)
• Use the estimate of σ based on the unbiased sample variance, given by
  $s_1 = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
• Case 1: n is small
  - The value of s1 varies from sample to sample
  - This introduces extra variability
  - The normal distribution cannot be used
  - We use the t distribution with (n − 1) d.f.
• Case 2: n is large
  - When n is large, the t distribution approaches the normal distribution
  - We use the N(0,1) distribution

Case 1: σ is unknown and n is small
• Assumption: Population has normal distribution
• The (1 − α) × 100% confidence interval estimate of μ is given by
  $\left[\bar{x} - t_{\alpha/2}\frac{s_1}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\frac{s_1}{\sqrt{n}}\right]$
• where $t_{\alpha/2}$ is given such that
• $P(-t_{\alpha/2} < T < t_{\alpha/2}) = 1 - \alpha$, for T ~ t(n−1).

Some critical values of the t(n−1) distribution for given α and d.f. (n − 1)
[Figure: t(n−1) density centred at 0, with central area 1 − α and area α/2 in each tail beyond −t_{α/2} and t_{α/2}.]

d.f. (n−1)   Critical value at α = 0.05   Critical value at α = 0.10
1            12.706                       6.314
2            4.303                        2.920
3            3.182                        2.353
4            2.776                        2.132
5            2.571                        2.015
6            2.447                        1.943
7            2.365                        1.895
• Consider the same example
• A sample of 11 circuits from a large normal population
has a mean resistance of 2.20 ohms.
• Population standard deviation is not known.
• Sample standard deviation (s1) is 0.35 ohms.
• Determine a 95% confidence interval for the true mean
resistance of the population.
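• Ans. (using the t distribution with n − 1 = 10 d.f.)
  $\bar{x} \pm t_{0.025}\frac{s_1}{\sqrt{n}}$
  = 2.20 ± 2.22814 × (0.35/√11)
  = 2.20 ± 0.2351
  = (1.9649, 2.4351)
• If we are given s² (the sample variance computed with divisor n), we can use the following formula:
  $s_1^2 = \frac{n}{n-1}\,s^2$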
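The same calculation in Python, as a sketch. Python's standard library has no t quantile function, so the critical value 2.22814 is taken as given in the example rather than computed; the function names `mean_ci_t` and `s1_from_s_squared` are hypothetical.

```python
from math import sqrt

# Critical value t_{0.025} with 10 d.f., as quoted in the example (not computed here)
T_CRIT_10DF = 2.22814

def mean_ci_t(xbar, s1, n, t_crit):
    """CI for the mean when sigma is unknown: xbar +/- t * s1 / sqrt(n)."""
    h = t_crit * s1 / sqrt(n)
    return xbar - h, xbar + h

def s1_from_s_squared(s_squared, n):
    """Convert a divisor-n sample variance s^2 to s1 via s1^2 = n/(n-1) * s^2."""
    return sqrt(n / (n - 1) * s_squared)

lower, upper = mean_ci_t(2.20, 0.35, 11, T_CRIT_10DF)
print(f"({lower:.4f}, {upper:.4f})")
```

The t-based interval is wider than the z-based one for the same data because 2.22814 > 1.96.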
Case 2: σ is unknown and n is large
• Population may or may not have normal distribution
• The (1 − α) × 100% confidence interval estimate of μ is given by
  $\left[\bar{x} - z_{\alpha/2}\frac{s_1}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s_1}{\sqrt{n}}\right]$
• where $z_{\alpha/2}$ is given such that
• for Z ~ N(0,1), $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$.

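Confidence Interval Estimate of μ

Case                  Population distribution   Interval estimate
σ known, n small      Normal                    $\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$
σ known, n large      Any                       $\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$
σ unknown, n small    Normal                    $\left[\bar{x} - t_{\alpha/2}\frac{s_1}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\frac{s_1}{\sqrt{n}}\right]$
σ unknown, n large    Any                       $\left[\bar{x} - z_{\alpha/2}\frac{s_1}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s_1}{\sqrt{n}}\right]$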
Confidence Intervals for Population Proportion π
Case 1:
• Small Sample: out of scope
Case 2:
• Large Sample
• We know that $Z = \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}} \sim N(0,1)$ for large n
• For Z ~ N(0,1), we have
  $P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$
  or $P\left(-z_{\alpha/2} \le \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}} \le z_{\alpha/2}\right) = 1 - \alpha$
  or $P\left(p - z_{\alpha/2}\sqrt{\pi(1-\pi)/n} \le \pi \le p + z_{\alpha/2}\sqrt{\pi(1-\pi)/n}\right) = 1 - \alpha$
• Thus the (1 − α) × 100% CI estimate of π is given by
  $\left[p - z_{\alpha/2}\sqrt{\pi(1-\pi)/n},\ p + z_{\alpha/2}\sqrt{\pi(1-\pi)/n}\right]$
• This expression itself contains π, which is unknown.
• So this CI estimate cannot be computed as it stands.
• We use the unbiased estimate p of π in the standard error.
• Then the (1 − α) × 100% CI estimate of π is given by
  $\left[p - z_{\alpha/2}\sqrt{pq/n},\ p + z_{\alpha/2}\sqrt{pq/n}\right]$
• where q = 1 − p.
• Required assumption: large sample only.
• Example:
• A random sample of 100 people shows that 25
have opened IRA (individual retirement
arrangement) this year.
• Construct a 95% confidence interval for the true
proportion of population who have opened IRA.
• Ans.
  $p \pm z_{0.025}\sqrt{p(1-p)/n}$
  = 25/100 ± 1.96 √(0.25 (0.75)/100)
  = 0.25 ± 1.96 (0.0433)
  = (0.1651, 0.3349)
Confidence Interval for Population Variance σ²
• Variance is an inverse measure of the group’s
homogeneity.
• Variance is an important indicator of total quality in
standardized products and services.
• Managers improve processes by reducing variance.
• Variance is a measure of financial risk.
• Variance of rates of return helps managers assess financial and capital investment alternatives.
• Variability is a reality in global markets.
• Productivity, wages, and costs of living vary between
regions and nations.

Case 1:
• Small Sample
• Parent Population is Normal
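• Let us take a sample x1, x2, ..., xn from N(μ, σ).
• Then, $\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{\sigma}\right)^2 \sim \chi^2_{(n-1)}$
• We know that $s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
• So, $\chi^2 = \frac{(n-1)s_1^2}{\sigma^2} \sim \chi^2_{(n-1)}$
• Then the (1 − α) × 100% CI estimate of σ² is given by
  $\frac{(n-1)s_1^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s_1^2}{\chi^2_{1-\alpha/2}}$
• Or $\left[\frac{(n-1)s_1^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)s_1^2}{\chi^2_{1-\alpha/2}}\right]$
• Here, $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are critical values obtained using the Chi-Square distribution with (n − 1) d.f.; the subscript is the area to the right of the critical value, so $\chi^2_{\alpha/2}$ is the larger of the two.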

[Figure: Chi-Square density with df = 7 and α = 0.10: central area 1 − α = 0.90, area α/2 = 0.05 in each tail, critical values 2.167 and 14.067.]

• Example:
• The cholesterol concentration in the yolks of a
sample of 18 randomly selected eggs laid by
genetically engineered chickens were found to
have a mean value of 9.38 mg/g of yolk and a
standard deviation of 1.62 mg/g.
• Use this information to construct a confidence
interval estimate of the true variance of the
cholesterol concentration in these egg yolks.
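The example does not state a confidence level, so the sketch below assumes 95% purely for illustration. Python's standard library has no Chi-Square quantile function, so the two critical values for 17 d.f. (30.191 and 7.564) are standard table look-ups supplied by hand; they, and the function name `variance_ci_chi2`, are assumptions not taken from the slides.

```python
def variance_ci_chi2(s1, n, chi2_upper, chi2_lower):
    """CI for sigma^2: [(n-1) s1^2 / chi2_{alpha/2}, (n-1) s1^2 / chi2_{1-alpha/2}].

    chi2_upper is the critical value with area alpha/2 to its right,
    chi2_lower the one with area 1 - alpha/2 to its right.
    """
    sum_sq = (n - 1) * s1 ** 2
    return sum_sq / chi2_upper, sum_sq / chi2_lower

# Egg-yolk example: n = 18, s1 = 1.62 mg/g.
# ASSUMED 95% confidence level; table values for 17 d.f. entered by hand.
CHI2_0025_17DF = 30.191
CHI2_0975_17DF = 7.564

lower, upper = variance_ci_chi2(1.62, 18, CHI2_0025_17DF, CHI2_0975_17DF)
print(f"({lower:.3f}, {upper:.3f})")
```

Unlike the intervals for μ and π, this interval is not symmetric about the point estimate s1², because the Chi-Square distribution is skewed.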

Case 2:
• Large Sample
• Parent Population may or may not be Normal
• We know that $E(s_1^2) = \sigma^2$
• Also, $S.E.(s_1^2) = \sigma^2\sqrt{\frac{2}{n-1}}$ (proof is out of scope)
• So, $\frac{s_1^2 - \sigma^2}{\sigma^2\sqrt{2/(n-1)}} \sim N(0,1)$ for large samples.
• Using this, the (1 − α) × 100% CI estimate of σ² is given by
  $\left[\frac{s_1^2}{1 + z_{\alpha/2}\sqrt{2/(n-1)}},\ \frac{s_1^2}{1 - z_{\alpha/2}\sqrt{2/(n-1)}}\right]$
• Example:
• A technologist is developing a new method for processing
a food material.
• For best quality, it is important to control moisture content
in the final product.
• So, as one part of determining the practicality of the new
method, the technologist must estimate the variability of
water content in the resulting product.
• He collects 50 specimens of product from the new
process, and determines the percent water in each.
• These 50 specimens give a sample mean water content of
43.24% and a sample standard deviation of 7.93%.
• Compute a 95% confidence interval estimate of the true
variance of the percentage water for this new process.
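A sketch of how the large-sample formula could be applied to this example in Python. The function name `variance_ci_large_sample` is hypothetical; the guard against a non-positive denominator is an addition for safety and is not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

def variance_ci_large_sample(s1, n, confidence=0.95):
    """Large-sample CI for sigma^2: s1^2 / (1 +/- z * sqrt(2 / (n - 1)))."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    k = z * sqrt(2 / (n - 1))
    if k >= 1:
        raise ValueError("sample too small for the large-sample formula")
    s1_sq = s1 ** 2
    return s1_sq / (1 + k), s1_sq / (1 - k)

# Moisture example: n = 50 specimens, s1 = 7.93 (percent water), 95% confidence
lower, upper = variance_ci_large_sample(7.93, 50)
print(f"({lower:.2f}, {upper:.2f})")
```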

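Sample Size Decision (when Estimating μ)
• We have seen (for sufficiently large n) that
  $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$ or $Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$
• Error of estimation: $e = |\bar{x} - \mu|$
• Fix the confidence level at (1 − α) × 100%
• Obtain the critical value $z_{\alpha/2}$ using N(0,1) such that $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$
• Then, we have
  $z_{\alpha/2} = \frac{e}{\sigma/\sqrt{n}}$ or $n = \left(\frac{\sigma\, z_{\alpha/2}}{e}\right)^2$
• Thus the sample size for estimating the population mean μ is
  $n = \left(\frac{\sigma\, z_{\alpha/2}}{e}\right)^2$
• The critical value $z_{\alpha/2}$ can be taken from the table.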

• Estimation Error (e) should be fixed by the researcher in
advance.
• Clearly, e ≠ 0
• Population standard deviation σ can be estimated from
some other small sample or pilot survey as
• Range/6 or by sample standard deviation
• Example:
• In a pilot survey, it is observed that the smallest
observation is 6 and the largest observation is 276.
• What should be the sample size needed to estimate the
population mean within ± 5 with 90% confidence level?
• Ans.
  Estimate of population standard deviation: $\hat{\sigma} = \frac{276 - 6}{6} = 45$
  Estimation error: e = 5
  For 90% confidence level, critical value $z_{0.05} = 1.645$
  So, $n = \left(\frac{\hat{\sigma}\, z_{0.05}}{e}\right)^2 = \left(\frac{45 \times 1.645}{5}\right)^2 = 219.19 \approx 219$
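The sample-size formula is easy to check numerically. The sketch below is an illustration with a hypothetical function name, `sample_size_mean`; it reproduces the unrounded value and also shows the result of rounding up, a common convention that keeps the error within e (the slide rounds to the nearest integer instead).

```python
from math import ceil

def sample_size_mean(sigma, e, z):
    """Unrounded sample size n = (sigma * z / e)^2 for estimating a mean."""
    return (sigma * z / e) ** 2

# Pilot survey example: range 6 to 276, so sigma is estimated as Range/6
sigma_hat = (276 - 6) / 6
n_raw = sample_size_mean(sigma_hat, e=5, z=1.645)
n_rounded_up = ceil(n_raw)

print(f"n = {n_raw:.2f}, rounded up: {n_rounded_up}")
```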

Sample Size Decision (when Estimating π)
• Similarly, the sample size for estimating the population proportion π is given by
  $n = \frac{\pi(1-\pi)\,(z_{\alpha/2})^2}{e^2}$
• For a fixed confidence coefficient (1 − α), the critical value $z_{\alpha/2}$ can be taken from the normal table.
• The estimation error (e = |p − π|) should be fixed by the researcher in advance. Clearly, e ≠ 0.
• The population proportion π can be estimated from some other small sample or pilot survey.
• If no information is available, it can be decided by the researcher using past experience or can be taken as 0.5.

• Example:
• How large a sample would be necessary to
estimate the true proportion defective in a large
population within ±3%, with 95% confidence?
• (Assume a pilot sample yields p = 0.12)
• Ans.
  Estimate of population proportion: p = 0.12
  Estimation error: e = 3/100 = 0.03
  For 95% confidence level, critical value $z_{0.025} = 1.96$
  So, $n = \frac{pq\,(z_{0.025})^2}{e^2} = \frac{0.12 \times 0.88 \times 1.96 \times 1.96}{0.03 \times 0.03} = 450.75 \approx 451$
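The same check for the proportion formula, as a sketch with a hypothetical function name `sample_size_proportion`. The second call illustrates the fallback of 0.5 mentioned above, which gives the largest sample size because π(1 − π) is maximised at π = 0.5.

```python
from math import ceil

def sample_size_proportion(p, e, z):
    """Unrounded sample size n = p * (1 - p) * z^2 / e^2 for estimating a proportion."""
    return p * (1 - p) * z ** 2 / e ** 2

# Example: pilot estimate p = 0.12, error 0.03, 95% confidence (z = 1.96)
n_raw = sample_size_proportion(0.12, e=0.03, z=1.96)
n_needed = ceil(n_raw)

# With no prior information, p = 0.5 gives the most conservative (largest) n
n_conservative = ceil(sample_size_proportion(0.5, e=0.03, z=1.96))

print(f"n = {n_raw:.2f} -> {n_needed}; conservative n = {n_conservative}")
```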

Sample Size Decision (when Estimating σ²)
• We know that, for large samples,
  $\frac{s_1^2 - \sigma^2}{\sigma^2\sqrt{2/(n-1)}} \sim N(0,1)$
• Similarly, the sample size for estimating the population variance σ² is given by
  $n = 1 + \frac{2\sigma^4 z_{\alpha/2}^2}{e^2}$
• For a fixed confidence coefficient (1 − α), the critical value $z_{\alpha/2}$ can be taken from the normal table.
• The estimation error $e = |s_1^2 - \sigma^2|$ should be fixed by the researcher in advance. Clearly, e ≠ 0.
• The population variance σ² can be estimated from some other small sample or pilot survey.
• If no information is available, it can be decided by the researcher using past experience or can be taken as the square of Range/6.
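The slides give no worked example for this formula, so the sketch below uses made-up inputs: σ² is taken as the square of Range/6 with a hypothetical Range/6 of 45, and the error e = 500 is likewise hypothetical. The function name `sample_size_variance` is an assumption.

```python
from math import ceil
from statistics import NormalDist

def sample_size_variance(sigma_sq, e, confidence=0.95):
    """Unrounded sample size n = 1 + 2 * sigma^4 * z^2 / e^2 for estimating a variance."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 1 + 2 * sigma_sq ** 2 * z ** 2 / e ** 2

# HYPOTHETICAL inputs for illustration only
sigma_sq_guess = 45 ** 2   # square of an assumed Range/6 = 45
n_raw = sample_size_variance(sigma_sq_guess, e=500, confidence=0.95)

print(f"n = {n_raw:.2f}, rounded up: {ceil(n_raw)}")
```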
Estimating Total
• In auditing, one is often more interested in an estimate of the population total amount.
• Its point estimate is given by $N\bar{x}$
• The CI estimate at the (1 − α) × 100% confidence level is given by
  $N\bar{x} \pm N\, t_{\alpha/2}\frac{s_1}{\sqrt{n}}$ (small sample size, normal distribution)
  $N\bar{x} \pm N\, z_{\alpha/2}\frac{s_1}{\sqrt{n}}$ (large sample size)
• The finite population correction (fpc) should be used when n/N > 0.05:
  $N\bar{x} \pm N\, t_{\alpha/2}\frac{s_1}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ (small sample size, normal distribution)
  $N\bar{x} \pm N\, z_{\alpha/2}\frac{s_1}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ (large sample size)

Example: A firm has a population of 1000 accounts and
wishes to estimate the total population value.
• A sample of 80 accounts is selected with average
balance of $87.6 and standard deviation of $22.3.
• Find the 95% confidence interval estimate of the total
balance.
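• Ans: N = 1000, n = 80, $\bar{x}$ = 87.6, s1 = 22.3
• n/N = 80/1000 = 0.08 > 0.05, so the fpc is used.
  $N\bar{x} \pm N\, z_{0.025}\frac{s_1}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
  = (1000)(87.6) ± (1000)(1.96)(22.3/√80) √((1000 − 80)/(1000 − 1))
  = 87,600 ± 4,689.51
  = (82,910.49, 92,289.51)
• With the t critical value for 79 d.f. (approximately 1.9905) in place of 1.96, the margin becomes 4,762.48 and the interval is (82,837.52, 92,362.48).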
Estimating Total Difference
• An auditor may wish to estimate the magnitude of
errors
• An error is the difference of the values reached
during audit and the original values recorded.
• A sample of size n items is collected.
• Let $D_i$ denote the error in the ith item (i = 1, 2, …, n).
  - $D_i$ = 0, if the auditor finds that the original value is correct
  - $D_i$ > 0, if the audited value is larger than the original value
  - $D_i$ < 0, if the audited value is smaller than the original value

• Define: $\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i$ and $s_D = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(D_i - \bar{D})^2}$
• The point estimate of the Total Difference is $N\bar{D}$

• CI estimate of the Total Difference:
  $N\bar{D} \pm N\, t_{\alpha/2}\frac{s_D}{\sqrt{n}}$ (for small samples, normal distribution)
  $N\bar{D} \pm N\, z_{\alpha/2}\frac{s_D}{\sqrt{n}}$ (for large samples)
• The fpc should be used when n/N > 0.05:
  $N\bar{D} \pm N\, t_{\alpha/2}\frac{s_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ (for small samples, normal distribution)
  $N\bar{D} \pm N\, z_{\alpha/2}\frac{s_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ (for large samples)

• Example:
• Econe Dresses has 1200 inventory items.
• In the past 15% items were incorrectly priced.
• A sample of 120 items was selected.
• Historical cost of each item was compared with
the audited value.
• 15 items differ in their historical costs and
audited values.
• These values are as follows:

Historical Cost   Audited Value   D_i
261               240             21
87                105             -18
201               276             -75
121               110             11
315               298             17
411               356             55
249               211             38
216               305             -89
21                210             -189
140               152             -12
129               112             17
340               216             124
341               402             -61
135               97              38
228               220             8

• Here $D_i$ is tabulated as Historical Cost − Audited Value; the remaining 105 sampled items have $D_i$ = 0.
• n = 120, N = 1200
• $\bar{D}$ = −115/120 = −0.95833, $s_D$ = 25.24482
• n/N = 120/1200 = 0.1 > 0.05, so we use the fpc.
• The 95% CI is
  $N\bar{D} \pm N\, z_{0.025}\frac{s_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
  = [1200 × (−0.95833) ± 1200 × 1.96 × (25.24482/√120) × √((1200 − 120)/(1200 − 1))]
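The summary statistics and the interval for this example can be recomputed from the 15 listed differences. This is a sketch with hypothetical variable names; it follows the table's sign convention (Historical Cost − Audited Value) and assumes, as the example implies, that the other 105 sampled items have zero difference.

```python
from math import sqrt

# The 15 non-zero differences from the table (Historical Cost - Audited Value)
nonzero_diffs = [21, -18, -75, 11, 17, 55, 38, -89, -189, -12, 17, 124, -61, 38, 8]
n, N = 120, 1200

# The remaining sampled items agree with the audit, so their difference is 0
diffs = nonzero_diffs + [0] * (n - len(nonzero_diffs))

d_bar = sum(diffs) / n
s_d = sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))

z = 1.96                      # 95% confidence, large sample
fpc = sqrt((N - n) / (N - 1)) # used because n/N = 0.1 > 0.05
margin = N * z * (s_d / sqrt(n)) * fpc
lower, upper = N * d_bar - margin, N * d_bar + margin

print(f"D-bar = {d_bar:.5f}, s_D = {s_d:.5f}")
print(f"95% CI for total difference: ({lower:.2f}, {upper:.2f})")
```

The interval contains zero, so on these data the total pricing error is not distinguishable from zero at the 95% level.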
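SUMMARY (INTERVAL ESTIMATES)

Parameter                   Case                                                 Interval estimate
Population Mean (μ)         σ known, small sample (Normal distribution)          $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
Population Mean (μ)         σ known, large sample (any distribution)             $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
Population Mean (μ)         σ not known, small sample (Normal distribution)      $\bar{x} \pm t_{\alpha/2}\frac{s_1}{\sqrt{n}}$
Population Mean (μ)         σ not known, large sample (any distribution)         $\bar{x} \pm z_{\alpha/2}\frac{s_1}{\sqrt{n}}$
Population Proportion (π)   Small sample                                         OUT OF SCOPE
Population Proportion (π)   Large sample (any distribution)                      $p \pm z_{\alpha/2}\sqrt{pq/n}$
Population Variance (σ²)    Small sample (Normal distribution)                   $\left[\frac{(n-1)s_1^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)s_1^2}{\chi^2_{1-\alpha/2}}\right]$
Population Variance (σ²)    Large sample (any distribution)                      $\left[\frac{s_1^2}{1 + z_{\alpha/2}\sqrt{2/(n-1)}},\ \frac{s_1^2}{1 - z_{\alpha/2}\sqrt{2/(n-1)}}\right]$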
SUMMARY (SAMPLE SIZE DECISION)

For estimating              Case                               Sample size
Population Mean (μ)         Large sample (any distribution)    $n = \left(\frac{\sigma\, z_{\alpha/2}}{e}\right)^2$
Population Proportion (π)   Large sample (any distribution)    $n = \frac{\pi(1-\pi)\,(z_{\alpha/2})^2}{e^2}$
Population Variance (σ²)    Large sample (any distribution)    $n = 1 + \frac{2\sigma^4 z_{\alpha/2}^2}{e^2}$
