
Chapter 1.

Statistical inference in one population

Contents
I Statistical inference
I Point estimators
I The estimation of the population mean and variance
I Estimating the population mean using confidence intervals
I Confidence intervals for the mean of a normal population with known
variance
I Confidence intervals for the mean in large samples
I Confidence intervals for the population proportion
I Confidence intervals for the mean of a normal population with
unknown variance
I Estimating the population variance using confidence intervals
I Confidence intervals for the variance of a normal population

Learning goals
At the end of this chapter you should know how to:
I Estimate the unknown population parameters from the sample data
I Construct confidence intervals for the unknown population
parameters from the sample data:
I In the case of a normal distribution: confidence intervals for the
population mean and variance
I In large samples: confidence intervals for the population mean and
proportion
I Interpret the confidence interval
I Understand the impact of the sample size, confidence level, etc on
the length of the confidence interval
I Calculate a sample size needed to control a given interval width

References
I Newbold, P. Statistics for Business and Economics
I Chapters 7 and 8 (8.1-8.6)
I Ross, S. Introduction to Statistics
I Chapter 8
Statistical inference: key words (i)

I Population: the complete set of numerical information on a particular
quantity in which an investigator is interested.
  I We identify the concept of the population with that of the random
  variable $X$.
  I The law or the distribution of the population is the distribution of $X$,
  $F_X$.
I Sample: an observed subset (say, of size $n$) of the population values.
  I Represented by a collection of $n$ random variables $X_1, X_2, \ldots, X_n$,
  typically iid (independent and identically distributed).
I Parameter: a constant characterizing $X$ or $F_X$.
Statistical inference: key words (ii)

I Statistical inference: the process of drawing conclusions about a
population on the basis of measurements or observations made on a
sample of individuals from the population.
I Statistic: a random variable obtained as a function of a random
sample, $X_1, X_2, \ldots, X_n$.
I Estimator of a parameter: a random variable obtained as a function,
say $T$, of a random sample, $X_1, X_2, \ldots, X_n$, used to estimate the
unknown population parameter.
I Estimate: a specific realization of that random variable, i.e., $T$
evaluated at the observed sample, $x_1, x_2, \ldots, x_n$, that provides an
approximation to that unknown parameter.
Statistical inference: example

                Population                  Sample                              Observed sample
What we have    $X \sim F$              =>  $X_1, X_2, \ldots, X_n \sim F$  =>  $x_1, x_2, \ldots, x_n$
                We want to know             We have $n$ copies of $X$           We have $n$ observed values
                $\mu_X = E[X]$                                                  of $X_1, X_2, \ldots, X_n$

What we get     $\mu_X = E[X]$          <=  $\bar X$                        <=  $\bar x$
                Expected value of $X$       Sample mean                         Sample mean
                                            Estimator of $\mu_X$ (r. v.)        Estimate of $\mu_X$ (number)
Point estimators: introduction

I A point estimator of a population parameter is a function, call it $T$,
of the sample information $X_n = (X_1, \ldots, X_n)$ that yields a single
number.
I Examples of population parameters, estimators and estimates:

Population parameter        Estimator $T(X_n)$                                                                           Estimator notation        Estimate notation
Pop. mean $\mu_X$           sample mean, $\frac{X_1 + \ldots + X_n}{n}$                                                  $\hat\mu_X = \bar X$      $\bar x$
Pop. prop. $p_X$            sample prop.                                                                                 $\hat p_X$                $\hat p_x$
Pop. var. $\sigma_X^2$      sample var., $\frac{\sum_i X_i^2 - n(\bar X)^2}{n}$                                          $\hat\sigma_X^2$          $\hat\sigma_x^2$
Pop. var. $\sigma_X^2$      sample quasi-var., $\frac{\sum_i X_i^2 - n(\bar X)^2}{n-1} = \frac{n}{n-1}\hat\sigma_X^2$    $s_X^2$                   $s_x^2$
...                         ...                                                                                          ...                       ...
In general, $\theta_X$      ...                                                                                          $\hat\theta_X$            $\hat\theta_x$
Point estimators: properties (i)

What are desirable characteristics of the estimators?

I Unbiasedness. This means that the bias of the estimator is zero.
What is bias? Bias equals the expected value of the estimator minus
the target parameter:

$$\mathrm{Bias}[\hat\theta_X] = E[\hat\theta_X] - \theta_X$$

Population parameter       Estimator $T(X_n)$    Bias                                     Unbiased?   Minimum Variance Unbiased Estimator?
Pop. mean $\mu_X$          $\bar X$              $E[\bar X] - \mu_X = 0$                  Yes         Yes, if $X$ normal
Pop. prop. $p_X$           $\hat p_X$            $E[\hat p_X] - p_X = 0$                  Yes         Yes
Pop. var. $\sigma_X^2$     $\hat\sigma_X^2$      $E[\hat\sigma_X^2] - \sigma_X^2 \ne 0$   No          No
Pop. var. $\sigma_X^2$     $s_X^2$               $E[s_X^2] - \sigma_X^2 = 0$              Yes         Yes, if $X$ normal
In general, $\theta_X$     $\hat\theta_X$        $E[\hat\theta_X] - \theta_X$             Often       Rarely
Point estimators: properties (ii)

I Efficiency. Measured by the estimator's variance. Estimators with
smaller variance are more efficient.
I Relative efficiency of two unbiased estimators $\hat\theta_{X,1}$ and
$\hat\theta_{X,2}$ of a parameter $\theta_X$ is

$$\text{Relative efficiency}(\hat\theta_{X,1}, \hat\theta_{X,2}) = \frac{\mathrm{Var}[\hat\theta_{X,1}]}{\mathrm{Var}[\hat\theta_{X,2}]}$$

Note:
I sometimes the inverse is used as a definition
I in any case, an estimator with smaller variance is more efficient
Point estimators: properties (iii)

I A more general criterion to select estimators (among unbiased and
biased ones) is the mean squared error, defined as

$$\mathrm{MSE}[\hat\theta_X] = E[(\hat\theta_X - \theta_X)^2] = \mathrm{Var}[\hat\theta_X] + (\mathrm{Bias}[\hat\theta_X])^2$$

Note:
I the mean squared error of an unbiased estimator equals its variance
I an estimator with smaller MSE is better
I the minimum variance unbiased estimator has the smallest
variance/MSE among all unbiased estimators
I How do we come up with the definition of the estimator $T$?
  I In some situations, there exists an optimal estimator called the minimum
  variance unbiased estimator.
  I If that's not the case, there are various alternative methods that
  yield reasonable estimators, for example:
    I Maximum likelihood estimation
    I Method of moments
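
The decomposition of the mean squared error into variance plus squared bias can be checked in a few lines. The sketch below only uses the definitions of bias and variance given in this chapter; the step of adding and subtracting $E[\hat\theta_X]$ is the standard argument, written out here as a worked derivation.

```latex
\begin{aligned}
\mathrm{MSE}[\hat\theta_X]
  &= E\big[(\hat\theta_X - \theta_X)^2\big] \\
  &= E\big[\big((\hat\theta_X - E[\hat\theta_X]) + (E[\hat\theta_X] - \theta_X)\big)^2\big] \\
  &= E\big[(\hat\theta_X - E[\hat\theta_X])^2\big]
     + 2\,(E[\hat\theta_X] - \theta_X)\,\underbrace{E\big[\hat\theta_X - E[\hat\theta_X]\big]}_{=\,0}
     + (E[\hat\theta_X] - \theta_X)^2 \\
  &= \mathrm{Var}[\hat\theta_X] + (\mathrm{Bias}[\hat\theta_X])^2
\end{aligned}
```

The cross term vanishes because $E[\hat\theta_X] - \theta_X$ is a constant and the centred estimator has expectation zero.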
Point estimation: example

Example: 7.1 (Newbold) Price-earnings ratios for a random sample of
ten stocks traded on the NY Stock Exchange on a particular day were

10, 16, 5, 10, 12, 8, 4, 6, 5, 4

Use an unbiased estimation procedure to find point estimates of the
following population parameters: mean, variance, proportion of values
exceeding 8.5.

$$\bar x = \frac{80}{10} = 8$$

$$s_x^2 = \frac{782 - 10(8)^2}{10 - 1} = 15.78$$

$$\hat p_x = \frac{1+1+0+1+1+0+0+0+0+0}{10} = 0.4$$
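
The three point estimates above can be reproduced with a short script. This is a minimal sketch using only Python's standard library; the variable names (`ratios`, `x_bar`, `s2`, `p_hat`) are illustrative and not part of the original example.

```python
from statistics import mean, variance

# Price-earnings ratios from Example 7.1
ratios = [10, 16, 5, 10, 12, 8, 4, 6, 5, 4]

x_bar = mean(ratios)        # sample mean
s2 = variance(ratios)       # sample quasi-variance (divides by n - 1)
# Sample proportion of values exceeding 8.5 (mean of zero-one indicators)
p_hat = sum(r > 8.5 for r in ratios) / len(ratios)

print(x_bar, round(s2, 2), p_hat)
```

Note that `statistics.variance` divides by n - 1, so it matches the quasi-variance; `statistics.pvariance` would give the biased sample variance instead.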
Point estimation: example
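Example: Let $\hat\mu_X = \frac{2}{n(n+1)}(X_1 + 2X_2 + \ldots + nX_n)$ be an estimator of the
population mean based on a SRS $X_n$. Compare this estimator with the sample
mean, $\bar X$.

We know that $\bar X$ is an unbiased estimator of $\mu_X$, whose variance is $\frac{\sigma_X^2}{n}$.

$\hat\mu_X$ is also unbiased:

$$
\begin{aligned}
E[\hat\mu_X] &= E\left[\frac{2}{n(n+1)}(X_1 + 2X_2 + \ldots + nX_n)\right] \\
&= \frac{2}{n(n+1)}\left(E[X_1] + 2E[X_2] + \ldots + nE[X_n]\right) \\
&\overset{\text{id}}{=} \frac{2}{n(n+1)}\left(\mu_X + 2\mu_X + \ldots + n\mu_X\right) \\
&= \frac{2\mu_X}{n(n+1)}\overbrace{(1 + 2 + \ldots + n)}^{n(n+1)/2} = \mu_X
\quad\Rightarrow\quad \mathrm{Bias}[\hat\mu_X] = 0
\end{aligned}
$$

And its variance/MSE is:

$$
\begin{aligned}
V[\hat\mu_X] &= V\left[\frac{2}{n(n+1)}(X_1 + 2X_2 + \ldots + nX_n)\right] \\
&\overset{\text{indep.}}{=} \left(\frac{2}{n(n+1)}\right)^2\left(V[X_1] + 2^2 V[X_2] + \ldots + n^2 V[X_n]\right) \\
&\overset{\text{id}}{=} \frac{4}{n^2(n+1)^2}\,\sigma_X^2\,\overbrace{(1^2 + 2^2 + \ldots + n^2)}^{n(n+1)(2n+1)/6} \\
&= \frac{2(2n+1)}{3n(n+1)}\,\sigma_X^2
\end{aligned}
$$

$$\mathrm{MSE}[\hat\mu_X] = V[\hat\mu_X] + 0^2 = \frac{2(2n+1)}{3n(n+1)}\,\sigma_X^2$$

$$\text{Relative efficiency}(\bar X, \hat\mu_X) = \frac{\sigma_X^2/n}{\frac{2(2n+1)}{3n(n+1)}\sigma_X^2} = \frac{3(n+1)}{2(2n+1)}$$

It's easy to see that for $n \ge 2$, this ratio is smaller than 1, so $\bar X$ is a more
efficient estimator for $\mu_X$.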
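
The algebra in this example can be double-checked with exact rational arithmetic. The sketch below is only an illustration: `weight_sums` and `relative_efficiency` are hypothetical helper names, and variances are expressed in units of the population variance.

```python
from fractions import Fraction

def weight_sums(n):
    """Return (sum of weights, sum of squared weights) for w_i = 2i / (n(n+1))."""
    weights = [Fraction(2 * i, n * (n + 1)) for i in range(1, n + 1)]
    return sum(weights), sum(w * w for w in weights)

def relative_efficiency(n):
    """Var[sample mean] / Var[weighted estimator], both in units of sigma^2."""
    _, sum_sq = weight_sums(n)
    return Fraction(1, n) / sum_sq

for n in (1, 2, 5, 10, 100):
    total, sum_sq = weight_sums(n)
    # total == 1 means the weighted estimator is unbiased
    print(n, total, sum_sq, relative_efficiency(n))
```

The weights summing to 1 corresponds to unbiasedness, and the sum of squared weights is the factor multiplying the population variance in the variance of the weighted estimator.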
From point estimation to confidence interval estimation

I So far, we have considered the point estimation of an unknown
population parameter which, assuming we had a SRS of n
observations from $X$, would produce an educated guess about that
unknown parameter.
I Point estimates, however, do not take into account the variability of
the estimation procedure due to, among other factors:
  I sample size - surely, larger samples should provide more accurate
  information about the population parameter
  I variability in the population - samples from populations with smaller
  variance should give more accurate estimates
  I whether other population parameters are known
  I etc.
These drawbacks can be overcome by considering confidence interval
estimation, that is, a method that gives a range of values (an interval) in
which the parameter is likely to fall.
Confidence interval estimator and confidence interval
Let $X_n = (X_1, X_2, \ldots, X_n)$ be a SRS from a population $X$ with a cdf $F_X$
that depends on an unknown parameter $\theta$.
I A confidence interval estimator for $\theta$ at a confidence level
$(1 - \alpha) = 100(1 - \alpha)\%$ is an interval $(T_1(X_n), T_2(X_n))$ that satisfies

$$P(\theta \in (T_1(X_n), T_2(X_n))) = 1 - \alpha$$

  I Interpretation: we have a probability of $(1 - \alpha)$ that the unknown
  population parameter will be in $(T_1(X_n), T_2(X_n))$.
I A confidence interval for $\theta$ at a confidence level $1 - \alpha$ is the
observed value of the confidence interval estimator,

$$(T_1(x_n), T_2(x_n))$$

  I Interpretation: we can be $100(1 - \alpha)\%$ confident that the unknown
  population parameter will be in $(T_1(x_n), T_2(x_n))$.
Typical levels of confidence

$\alpha$               0.01    0.05    0.10
$100(1 - \alpha)\%$    99%     95%     90%
Finding confidence interval estimators: procedure

1. Find a quantity involving the unknown parameter $\theta$ and the sample
$X_n$, $C(X_n, \theta)$, whose distribution is known and does not depend on
the parameter - a so-called pivotal quantity or a pivot for $\theta$
2. Use the upper $1 - \alpha/2$ and $\alpha/2$ quantiles of that distribution and the
definition of the confidence interval estimator to set up the equation

$$P(\overbrace{(1 - \alpha/2)\text{ quantile} < C(X_n, \theta) < (\alpha/2)\text{ quantile}}^{\text{double inequality}}) = 1 - \alpha$$

3. To find the end points $T_1(X_n)$ and $T_2(X_n)$ of the confidence
interval estimator, solve the double inequality for the parameter $\theta$
4. A $100(1 - \alpha)\%$ confidence interval for $\theta$ is $(T_1(x_n), T_2(x_n))$
Confidence interval for the population mean, normal
population with known variance

1. Let $X_n$ be a SRS of size $n$ from $X$. Under the assumptions:
  I $X$ follows a normal distribution with parameters $\mu_X$ and $\sigma_X^2$
  I $\sigma_X^2$ is known (rather unrealistic)

2. The pivotal quantity for $\mu_X$ is

$$\frac{\bar X - \mu_X}{\sigma_X / \sqrt{n}} \sim N(0, 1)$$

  I Note: the standard deviation of $\bar X$, $\sigma_X / \sqrt{n}$, (or of any other statistic) is
  called the standard error
Confidence interval for the population mean, normal
population with known variance

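3. Hence, if $z_{1-\alpha/2}$ and $z_{\alpha/2}$ are the
$(1 - \alpha/2)$ and $(\alpha/2)$ upper
quantiles of the $N(0, 1)$, we have

$$P(z_{1-\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$$

[Figure: standard normal density with central area $1 - \alpha$ between $z_{1-\alpha/2} = -z_{\alpha/2}$ and $z_{\alpha/2}$, and area $\alpha/2$ in each tail.]

Recall: If $Z \sim N(0, 1)$ then $E[Z] = 0$, $V[Z] = 1$

4. Therefore

$$P\Big(\overbrace{z_{1-\alpha/2}}^{-z_{\alpha/2}} < \overbrace{\frac{\bar X - \mu_X}{\sigma_X / \sqrt{n}}}^{Z} < z_{\alpha/2}\Big) = 1 - \alpha$$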
Confidence interval for the population mean, normal
population with known variance
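5. Solve the double inequality for $\mu_X$:

$$
\begin{aligned}
-z_{\alpha/2} &< \frac{\bar X - \mu_X}{\sigma_X / \sqrt{n}} < z_{\alpha/2} \\
-z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} &< \bar X - \mu_X < z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} \\
-z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} - \bar X &< -\mu_X < -\bar X + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} \\
z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} + \bar X &> \mu_X > \bar X - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}
\end{aligned}
$$

to obtain the confidence interval estimator

$$\Big(\overbrace{\bar X - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}}^{T_1(X_n)},\ \overbrace{\bar X + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}}^{T_2(X_n)}\Big)$$

6. The confidence interval is:

$$CI_{1-\alpha}(\mu_X) = \Big(\bar x - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}},\ \bar x + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\Big) = \Big(\bar x \mp z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\Big)$$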
Example: finding a confidence interval for $\mu_X$
Example: 8.2 (Newbold) A process produces bags of refined sugar. The
weights of the contents of these bags are normally distributed with standard
deviation 1.2 ounces. The contents of a random sample of twenty-five bags had
mean weight 19.8 ounces. Find a 95% confidence interval for the true mean
weight for all bags of sugar produced by the process.

Population: $X$ = weight of a sugar bag (in oz), $X \sim N(\mu_X, \sigma_X^2 = 1.2^2)$
SRS: $n = 25$
Sample: $\bar x = 19.8$

Objective: $CI_{0.95}(\mu_X) = \Big(\bar x \mp z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\Big)$

$\sigma_X = 1.2$, $n = 25$, $\bar x = 19.8$
$1 - \alpha = 0.95 \Rightarrow \alpha/2 = 0.025$
$z_{\alpha/2} = z_{0.025} = 1.96$

[Figure: standard normal density with upper-tail area 0.025 to the right of $z_{0.025} = 1.96$.]

$$CI_{0.95}(\mu_X) = \Big(19.8 \mp 1.96\,\frac{1.2}{\sqrt{25}}\Big) = (19.8 \mp 0.47) = (19.33, 20.27)$$

Interpretation: We can be 95% confident that $\mu_X$ is in $(19.33, 20.27)$.
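
The interval in this example can be recomputed as follows. This is a minimal sketch using Python's standard library; `z_interval` is a hypothetical helper name, and the quantile comes from `statistics.NormalDist` rather than from a printed table, so it is 1.95996... instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, conf_level):
    """Confidence interval for the mean of a normal population with known sigma."""
    alpha = 1 - conf_level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 quantile of N(0, 1)
    half_width = z * sigma / sqrt(n)          # quantile times the standard error
    return x_bar - half_width, x_bar + half_width

# Sugar bag data from Example 8.2
lower, upper = z_interval(x_bar=19.8, sigma=1.2, n=25, conf_level=0.95)
print(round(lower, 2), round(upper, 2))
```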
Frequency interpretation of the CI, conf. level effect
In this simulated example, 150 samples of the same size $n = 50$ were generated
from $X \sim N(\mu_X = 5, \sigma_X^2 = 1^2)$ and 150 $CI_{1-\alpha}(\mu_X)$ were constructed with
$\alpha = 0.1$ and $\alpha = 0.01$.

[Figure: two panels showing the 150 confidence intervals (horizontal axis: confidence interval; vertical axis: sample index, 0 to 150).
Left panel, $(1 - \alpha) = 0.9$, $n = 50$: $\mu_X$ is in approximately $150(0.9) = 135$ intervals (but not in $150(0.1) = 15$).
Right panel, $(1 - \alpha) = 0.99$, $n = 50$: $\mu_X$ is in approximately $150(0.99) = 148.5$ intervals (but not in $150(0.01) = 1.5$).]

The width of the interval,

$$w = \Big(\bar x + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\Big) - \Big(\bar x - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\Big) = 2\,z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}},$$

increases with the increasing confidence level (keeping everything else the
same). Why?
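
A simulation along the lines of the one described above can be sketched in a few lines. This is not the code behind the original figure: the function name `coverage`, the seed and the default of 2000 replications (instead of 150) are assumptions chosen to make the empirical coverage more stable.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(conf_level, n, n_samples=2000, mu=5.0, sigma=1.0, seed=0):
    """Return (fraction of intervals containing mu, interval width)."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(n_samples):
        # Draw one sample of size n and compute its sample mean
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - half_width < mu < x_bar + half_width:
            hits += 1
    return hits / n_samples, 2 * half_width

for level in (0.90, 0.99):
    frac, width = coverage(level, n=50)
    print(level, frac, round(width, 3))
```

The fraction of intervals that contain the true mean should be close to the confidence level, while the width grows with the level because the quantile is larger; calling `coverage` with a larger `n` shrinks the width through the factor $1/\sqrt{n}$.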
Frequency interpretation of the CI, sample size effect
Here we collect 150 samples of size $n = 50$ and another 150 of size $n = 200$
from $X \sim N(\mu_X = 5, \sigma_X^2 = 1^2)$.

[Figure: two panels showing the 150 confidence intervals (horizontal axis: confidence interval; vertical axis: sample index, 0 to 150).
Left panel, $(1 - \alpha) = 0.9$, $n = 50$: $\mu_X$ is in approximately $150(0.9) = 135$ intervals (but not in $150(0.1) = 15$).
Right panel, $(1 - \alpha) = 0.9$, $n = 200$: $\mu_X$ is in approximately $150(0.9) = 135$ intervals (but not in $150(0.1) = 15$).]

The width of the interval decreases with the increasing sample size (keeping
everything else the same). Why?

Question: What is the effect of $\sigma_X$ on the width?


Example: estimating the sample size
Example: 8.14 (Newbold) The lengths of metal rods produced by an industrial
process are normally distributed with standard deviation 1.8mm. Suppose that
a production manager requires a 99% confidence interval extending no further
than 0.5mm on each side of the sample mean. How large a sample is needed to
achieve such an interval?

Population: $X$ = length of a metal rod (in mm), $X \sim N(\mu_X, \sigma_X^2 = 1.8^2)$
SRS: $n = ?$
$CI_{0.99}(\mu_X)$: $\overbrace{2\,z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}}^{\text{width}} \le 2(0.5) = 1$

[Figure: standard normal density with upper-tail area 0.005 to the right of $z_{0.005} = 2.575$.]

Objective: $n$ such that width $\le 1$

$$
\begin{aligned}
2\,z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} &\le 1 \\
2\,z_{\alpha/2}\,\sigma_X &\le \sqrt{n} \\
85.93 = (2(2.575)(1.8))^2 &\le n
\end{aligned}
$$

To satisfy the manager's requirement, a sample of at least
86 observations is needed.
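
The sample-size calculation generalises to any target half-width. The sketch below is illustrative: `required_sample_size` is a hypothetical helper, and it uses the unrounded quantile from `statistics.NormalDist` (about 2.5758) rather than the table value 2.575; for this example both lead to the same rounded-up answer.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, half_width, conf_level):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= half_width."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil((z * sigma / half_width) ** 2)

# Metal rods from Example 8.14: sigma = 1.8 mm, 0.5 mm on each side, 99% level
n_needed = required_sample_size(sigma=1.8, half_width=0.5, conf_level=0.99)
print(n_needed)
```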
Confidence interval for the population mean in large
samples

1. Let $X_n$ be a SRS of size $n$ from $X$. Under the assumptions:
  I $X$ follows a nonnormal distribution with parameters $\mu_X$ and $\sigma_X^2$
  I the sample size $n$ is large ($n \ge 30$)
2. The pivotal quantity for $\mu_X$ based on the Central Limit Theorem is

$$\frac{\bar X - \mu_X}{\hat\sigma_X / \sqrt{n}} \overset{\text{approx.}}{\sim} N(0, 1)$$
Confidence interval for the population mean in large
samples

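3. Hence, if $z_{1-\alpha/2}$ and $z_{\alpha/2}$ are the
$(1 - \alpha/2)$ and $(\alpha/2)$ upper
quantiles of the $N(0, 1)$, we have

$$P(z_{1-\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$$

[Figure: standard normal density with central area $1 - \alpha$ between $z_{1-\alpha/2} = -z_{\alpha/2}$ and $z_{\alpha/2}$, and area $\alpha/2$ in each tail.]

4. Therefore

$$P\Big(\overbrace{z_{1-\alpha/2}}^{-z_{\alpha/2}} < \overbrace{\frac{\bar X - \mu_X}{\hat\sigma_X / \sqrt{n}}}^{Z} < z_{\alpha/2}\Big) = 1 - \alpha$$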
Confidence interval for the population mean in large
samples

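5. Solve the double inequality for $\mu_X$:

$$-z_{\alpha/2} < \frac{\bar X - \mu_X}{\hat\sigma_X / \sqrt{n}} < z_{\alpha/2}$$

to obtain the confidence interval estimator

$$\Big(\overbrace{\bar X - z_{\alpha/2}\frac{\hat\sigma_X}{\sqrt{n}}}^{T_1(X_n)},\ \overbrace{\bar X + z_{\alpha/2}\frac{\hat\sigma_X}{\sqrt{n}}}^{T_2(X_n)}\Big)$$

6. The confidence interval is:

$$CI_{1-\alpha}(\mu_X) = \Big(\bar x - z_{\alpha/2}\frac{\hat\sigma_x}{\sqrt{n}},\ \bar x + z_{\alpha/2}\frac{\hat\sigma_x}{\sqrt{n}}\Big)$$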
Confidence interval for the population proportion in large
samples
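Application of CIs for the population mean in large samples
Let $X_n$, $n \ge 30$, be a SRS from a Bernoulli distr. with parameter $p_X$
($\mu_X = E[X] = p_X$ and $\sigma_X = \sqrt{p_X(1 - p_X)}$). The sample proportion $\hat p_X$
is a special case of the sample mean of zero-one observations, $\hat p_X = \bar X$.

Thus, from the CLT

$$\frac{\hat p_X - p_X}{\underbrace{\sqrt{p_X(1 - p_X)/n}}_{\sigma_X/\sqrt{n}}} \overset{\text{approx.}}{\sim} N(0, 1)$$

This result remains true if we use an estimate for the population
standard deviation

$$\frac{\hat p_X - p_X}{\underbrace{\sqrt{\hat p_X(1 - \hat p_X)}/\sqrt{n}}_{\hat\sigma_X/\sqrt{n}}} \overset{\text{approx.}}{\sim} N(0, 1)$$

Thus, in large samples, the confidence interval for $p_X$ is:

$$CI_{1-\alpha}(p_X) = \left(\hat p_x - z_{\alpha/2}\sqrt{\frac{\hat p_x(1 - \hat p_x)}{n}},\ \hat p_x + z_{\alpha/2}\sqrt{\frac{\hat p_x(1 - \hat p_x)}{n}}\right)$$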
Example: finding a confidence interval for $p_X$
Example: 8.6 (Newbold) A random sample of 344 industrial buyers were asked:
"What is your firm's policy for purchasing personnel to follow on accepting
gifts from vendors?". For 83 of these buyers, the policy of the firm was for the
buyer to make his/her own decision. Find a 90% confidence interval for the
population proportion of all buyers who are allowed to make their own decisions.

Population: $X$ = 1 if a buyer makes their own decision and 0 otherwise, $X \sim \text{Bernoulli}(p_X)$
SRS: $n = 344$ large
Sample: $\hat p_x = \frac{83}{344} = 0.241$

Objective: $CI_{0.9}(p_X) = \left(\hat p_x \mp z_{\alpha/2}\sqrt{\frac{\hat p_x(1 - \hat p_x)}{n}}\right)$

$\hat p_x = 0.241$, $n = 344$
$1 - \alpha = 0.9 \Rightarrow \alpha/2 = 0.05$
$z_{\alpha/2} = z_{0.05} = 1.645$

[Figure: standard normal density with upper-tail area 0.05 to the right of $z_{0.05} = 1.645$.]

$$CI_{0.9}(p_X) = \left(0.241 \mp 1.645\sqrt{\frac{0.241(1 - 0.241)}{344}}\right) = (0.241 \mp 0.038) = (0.203, 0.279)$$

Interpretation: We can be 90% confident that the proportion of
buyers who make their own decision, $p_X$, falls in $(0.203, 0.279)$.
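
The same numbers can be obtained programmatically. The sketch below is illustrative only: `proportion_interval` is a hypothetical helper, and the normal quantile is taken from `statistics.NormalDist` instead of the rounded table value 1.645.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, conf_level):
    """Large-sample (CLT-based) confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)   # quantile times estimated s.e.
    return p_hat - half_width, p_hat + half_width

# Example 8.6: 83 of 344 buyers make their own decision, 90% level
lower, upper = proportion_interval(successes=83, n=344, conf_level=0.90)
print(round(lower, 3), round(upper, 3))
```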
Confidence interval for the population mean, normal
population with unknown variance

1. Let $X_n$ be a SRS of size $n$ from $X$. Under the assumptions:
  I $X$ follows a normal distribution with parameters $\mu_X$ and $\sigma_X^2$
  I $\sigma_X^2$ is unknown (quite realistic)

2. The pivotal quantity for $\mu_X$ is

$$\frac{\bar X - \mu_X}{s_X / \sqrt{n}} \sim t_{n-1}$$
Confidence interval for the population mean, normal
population with unknown variance

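3. Hence, if $t_{n-1;1-\alpha/2}$ and $t_{n-1;\alpha/2}$ are the
$(1 - \alpha/2)$ and $(\alpha/2)$ upper quantiles of
the $t$ distribution with $n - 1$ degrees of
freedom (df), we have

$$P(t_{n-1;1-\alpha/2} < \overbrace{T}^{t_{n-1}} < t_{n-1;\alpha/2}) = 1 - \alpha$$

[Figure: $t$ (Student) density with central area $1 - \alpha$ between $t_{n-1;1-\alpha/2} = -t_{n-1;\alpha/2}$ and $t_{n-1;\alpha/2}$, and area $\alpha/2$ in each tail.]

Recall: if $T \sim t_n$, $E[T] = 0$, $V[T] = \frac{n}{n-2}$ (for $n > 2$)

4. Therefore

$$P\Big(\overbrace{t_{n-1;1-\alpha/2}}^{-t_{n-1;\alpha/2}} < \overbrace{\frac{\bar X - \mu_X}{s_X / \sqrt{n}}}^{T \sim t_{n-1}} < t_{n-1;\alpha/2}\Big) = 1 - \alpha$$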
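Confidence interval for the population mean, normal
population with unknown variance

5. Solve the double inequality for $\mu_X$:

$$-t_{n-1;\alpha/2} < \frac{\bar X - \mu_X}{s_X / \sqrt{n}} < t_{n-1;\alpha/2}$$

to obtain the confidence interval estimator

$$\Big(\overbrace{\bar X - t_{n-1;\alpha/2}\frac{s_X}{\sqrt{n}}}^{T_1(X_n)},\ \overbrace{\bar X + t_{n-1;\alpha/2}\frac{s_X}{\sqrt{n}}}^{T_2(X_n)}\Big)$$

6. The confidence interval is:

$$CI_{1-\alpha}(\mu_X) = \Big(\bar x - t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}},\ \bar x + t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}}\Big)$$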
Example: finding a confidence interval for $\mu_X$
Example: 8.4 (Newbold) A random sample of six cars from a particular model
year had the following fuel consumption figures, in mpg: 18.6, 18.4, 19.2, 20.8,
19.4, 20.5. Find a 90% confidence interval for the population mean fuel
consumption, assuming that the population distribution is normal.

Population: $X$ = mpg of a car from the model year, $X \sim N(\mu_X, \sigma_X^2)$, $\sigma_X^2$ unknown
SRS: $n = 6$ small
Sample: $\bar x = \frac{116.9}{6} = 19.4833$, $s_x^2 = \frac{2282.41 - 6(19.4833)^2}{6 - 1} = 0.96$

Objective: $CI_{0.9}(\mu_X) = \Big(\bar x \mp t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}}\Big)$

$s_x = \sqrt{0.96} = 0.98$, $n = 6$, $\bar x = 19.48$
$1 - \alpha = 0.9 \Rightarrow \alpha/2 = 0.05$
$t_{n-1;\alpha/2} = t_{5;0.05} = 2.015$

[Figure: $t_5$ density with upper-tail area 0.05 to the right of $t_{5;0.05} = 2.015$.]

$$CI_{0.9}(\mu_X) = \Big(19.48 \mp 2.015\,\frac{0.98}{\sqrt{6}}\Big) = (19.48 \mp 0.81) = (18.67, 20.29)$$

Interpretation: We can be 90% confident that the population mean
fuel consumption for these cars, $\mu_X$, is between 18.67 and 20.29.
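
The calculation can be repeated without rounding the intermediate results. This is a minimal sketch: Python's standard library has no t quantile function, so the table value 2.015 quoted in the example is hard-coded (the constant name `T_5_005` is illustrative).

```python
from math import sqrt
from statistics import mean, stdev

# Fuel consumption figures (mpg) from Example 8.4
mpg = [18.6, 18.4, 19.2, 20.8, 19.4, 20.5]
T_5_005 = 2.015  # t_{5;0.05}, table value quoted in the example (assumed, not computed)

n = len(mpg)
x_bar = mean(mpg)
s = stdev(mpg)                         # quasi standard deviation (divides by n - 1)
half_width = T_5_005 * s / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width

print(round(x_bar, 4), round(s ** 2, 2), round(half_width, 2))
print(lower, upper)
```

With unrounded intermediate values the lower end-point comes out a little closer to 18.68 than to 18.67; the difference from the slide is only due to rounding $\bar x$ and the half-width before subtracting.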
Example: finding a confidence interval for $\mu_X$
Example: 8.4 (cont.) in Excel: Go to menu: Data, submenu: Data
Analysis, choose function: Descriptive Statistics.
Column A (data), in yellow (sample mean, half-width $t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}}$, lower
end-point (cell D3-D16), upper end-point (cell D3+D16)).
$t$ and $\chi^2$ distributions
I Recall that $T \sim t_n$ if $T = \frac{Z}{\sqrt{\chi_n^2 / n}}$, where $Z \sim N(0, 1)$ and $\chi_n^2$ follows a
chi-square distribution with df = $n$, independent of $Z$.
I On the other hand, $\chi_n^2$ is the distribution of the sum of $n$ independent
squared $N(0, 1)$ random variables.
I Note that the rescaled sample quasi-variance follows a chi-square
distribution with $n - 1$ degrees of freedom

$$\frac{(n-1)s_X^2}{\sigma_X^2} = \frac{\sum_{i=1}^n (X_i - \bar X)^2}{\sigma_X^2} = \sum_{i=1}^n \left(\frac{X_i - \bar X}{\sigma_X}\right)^2 \sim \chi_{n-1}^2$$

Why $n - 1$ and not $n$?

If we knew $\mu_X$, the number of degrees of freedom would be $n$,
because we would have $n$ iid random variables $\frac{X_i - \mu_X}{\sigma_X}$.

Since we have to estimate $\mu_X$ with $\bar X$, the df are $n - 1$, because only
$n - 1$ of the random variables $\frac{X_i - \bar X}{\sigma_X}$ are free to vary
(once you know $n - 1$ of them, you can figure out the remaining one).

We say that one degree of freedom is used up to estimate $\mu_X$.


$t$ and $\chi^2$ distributions

[Figure: two panels. Left: $t$ and $N(0, 1)$ densities, plotted on $(-4, 4)$. Right: $\chi^2$ densities, plotted on $(0, 40)$. Curves are labelled by degrees of freedom (df = 3, 5, 10, 15, 20).]
Confidence interval for the population variance, normal
population

1. Let $X_n$ be a SRS of size $n$ from $X$. Under the assumptions:
  I $X$ follows a normal distribution with parameter $\sigma_X^2$

2. The pivotal quantity for $\sigma_X^2$ is

$$\frac{(n-1)s_X^2}{\sigma_X^2} \sim \chi_{n-1}^2$$
Confidence interval for the population variance, normal
population

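3. Hence, if $\chi_{n-1;1-\alpha/2}^2$ and $\chi_{n-1;\alpha/2}^2$ are
the $(1 - \alpha/2)$ and $(\alpha/2)$ upper
quantiles of the chi-square distribution
with $n - 1$ degrees of freedom, we have

$$P(\chi_{n-1;1-\alpha/2}^2 < \chi_{n-1}^2 < \chi_{n-1;\alpha/2}^2) = 1 - \alpha$$

[Figure: chi-square density with central area $1 - \alpha$ between $\chi_{n-1;1-\alpha/2}^2$ and $\chi_{n-1;\alpha/2}^2$, and area $\alpha/2$ in each tail.]

Recall: $E[\chi_n^2] = n$, $V[\chi_n^2] = 2n$

4. Therefore

$$P\Big(\chi_{n-1;1-\alpha/2}^2 < \overbrace{\frac{(n-1)s_X^2}{\sigma_X^2}}^{\chi_{n-1}^2} < \chi_{n-1;\alpha/2}^2\Big) = 1 - \alpha$$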
Confidence interval for the population variance, normal
population
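5. Solve the double inequality for $\sigma_X^2$:

$$
\begin{aligned}
\chi_{n-1;1-\alpha/2}^2 &< \frac{(n-1)s_X^2}{\sigma_X^2} < \chi_{n-1;\alpha/2}^2 \\
\frac{1}{\chi_{n-1;1-\alpha/2}^2} &> \frac{\sigma_X^2}{(n-1)s_X^2} > \frac{1}{\chi_{n-1;\alpha/2}^2} \\
\frac{(n-1)s_X^2}{\chi_{n-1;1-\alpha/2}^2} &> \sigma_X^2 > \frac{(n-1)s_X^2}{\chi_{n-1;\alpha/2}^2}
\end{aligned}
$$

to obtain the confidence interval estimator

$$\left(\frac{(n-1)s_X^2}{\chi_{n-1;\alpha/2}^2},\ \frac{(n-1)s_X^2}{\chi_{n-1;1-\alpha/2}^2}\right)$$

6. The confidence interval is:

$$CI_{1-\alpha}(\sigma_X^2) = \left(\frac{(n-1)s_x^2}{\chi_{n-1;\alpha/2}^2},\ \frac{(n-1)s_x^2}{\chi_{n-1;1-\alpha/2}^2}\right)$$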
Example: finding a confidence interval for $\sigma_X^2$ and $\sigma_X$
Example: 8.8 (Newbold) A random sample of fifteen pills for headache relief
showed a quasi standard deviation of 0.8% in the concentration of the active
ingredient. Find a 90% confidence interval for the population variance for these
pills. How would you obtain a CI for the population standard deviation?

Population: $X$ = concentration of an active ingredient in a pill (in %), $X \sim N(\mu_X, \sigma_X^2)$
SRS: $n = 15$
Sample: $s_x = 0.8$

Objective: $CI_{0.9}(\sigma_X^2) = \left(\frac{(n-1)s_x^2}{\chi_{n-1;\alpha/2}^2},\ \frac{(n-1)s_x^2}{\chi_{n-1;1-\alpha/2}^2}\right)$

$s_x^2 = 0.8^2 = 0.64$, $n = 15$
$1 - \alpha = 0.9 \Rightarrow \alpha/2 = 0.05$
$\chi_{n-1;1-\alpha/2}^2 = \chi_{14;0.95}^2 = 6.57$
$\chi_{n-1;\alpha/2}^2 = \chi_{14;0.05}^2 = 23.68$

[Figure: chi-square density with 14 df, with area 0.05 to the left of $\chi_{14;0.95}^2 = 6.57$ and area 0.05 to the right of $\chi_{14;0.05}^2 = 23.68$.]

$$CI_{0.9}(\sigma_X^2) = \left(\frac{14(0.64)}{23.68},\ \frac{14(0.64)}{6.57}\right) = (0.378, 1.364)$$

$$\Rightarrow CI_{0.9}(\sigma_X) = (\sqrt{0.378}, \sqrt{1.364}) = (0.61, 1.17)$$

To obtain $CI(\sigma_X)$ we apply $\sqrt{\cdot}$ to the end-points of $CI(\sigma_X^2)$.
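
Both intervals can be recomputed as follows. This is a minimal sketch: the chi-square quantiles are the table values quoted in the example and are hard-coded, since the standard library has no chi-square quantile function; the constant names are illustrative.

```python
from math import sqrt

CHI2_14_095 = 6.57    # chi-square_{14;0.95}, table value from the example (assumed)
CHI2_14_005 = 23.68   # chi-square_{14;0.05}, table value from the example (assumed)

n = 15
s = 0.8               # quasi standard deviation (in %)
s2 = s ** 2

# The larger quantile gives the lower end-point, the smaller one the upper end-point
var_lower = (n - 1) * s2 / CHI2_14_005
var_upper = (n - 1) * s2 / CHI2_14_095

# A CI for the standard deviation: square roots of the variance end-points
sd_lower, sd_upper = sqrt(var_lower), sqrt(var_upper)

print(round(var_lower, 3), round(var_upper, 3))
print(sd_lower, sd_upper)
```

Taking square roots preserves the coverage because the square root is an increasing function on positive numbers. The unrounded lower end-point for the standard deviation is about 0.615, so it may print slightly differently from the 0.61 obtained from the rounded 0.378.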
Confidence intervals formulae

Summary for one population


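I Let $X_n$ be a simple random sample from a population $X$ with mean $\mu_X$
and variance $\sigma_X^2$

Parameter        Assumptions                        Pivotal quantity                                                                                     $(1 - \alpha)$ Conf. Interval
Mean             Normal data, known variance        $\frac{\bar X - \mu_X}{\sigma_X / \sqrt{n}} \sim N(0, 1)$                                            $\mu_X \in \Big(\bar x - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}},\ \bar x + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\Big)$
Mean             Nonnormal data, large sample       $\frac{\bar X - \mu_X}{\hat\sigma_X / \sqrt{n}} \overset{\text{approx.}}{\sim} N(0, 1)$              $\mu_X \in \Big(\bar x - z_{\alpha/2}\frac{\hat\sigma_x}{\sqrt{n}},\ \bar x + z_{\alpha/2}\frac{\hat\sigma_x}{\sqrt{n}}\Big)$
Mean             Bernoulli data, large sample       $\frac{\hat p_X - p_X}{\sqrt{\hat p_X(1 - \hat p_X)/n}} \overset{\text{approx.}}{\sim} N(0, 1)$      $p_X \in \Big(\hat p_x \mp z_{\alpha/2}\sqrt{\frac{\hat p_x(1 - \hat p_x)}{n}}\Big)$
Mean             Normal data, unknown variance      $\frac{\bar X - \mu_X}{s_X / \sqrt{n}} \sim t_{n-1}$                                                 $\mu_X \in \Big(\bar x - t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}},\ \bar x + t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}}\Big)$
Variance         Normal data                        $\frac{(n-1)s_X^2}{\sigma_X^2} \sim \chi_{n-1}^2$                                                    $\sigma_X^2 \in \Big(\frac{(n-1)s_x^2}{\chi_{n-1;\alpha/2}^2},\ \frac{(n-1)s_x^2}{\chi_{n-1;1-\alpha/2}^2}\Big)$
Standard dev.    Normal data                        $\frac{(n-1)s_X^2}{\sigma_X^2} \sim \chi_{n-1}^2$                                                    $\sigma_X \in \Big(\sqrt{\frac{(n-1)s_x^2}{\chi_{n-1;\alpha/2}^2}},\ \sqrt{\frac{(n-1)s_x^2}{\chi_{n-1;1-\alpha/2}^2}}\Big)$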
Confidence intervals for the population mean:
when to use what?

$X$ distribution with mean $\mu_X$ and standard deviation $\sigma_X$

  $X$ normal
    $\sigma_X$ known     =>  z-based (exact)
    $\sigma_X$ unknown   =>  t-based (exact)
  $X$ nonnormal
    $n$ small            =>  Methods beyond Est II
    $n$ large            =>  z-based (approx. CLT)
