
Business Statistics

Estimating Population Values


Business Statistics
Topic Index
Probability Distribution
Sampling Distribution
Point & Interval Estimates
Types of Estimates
Point Estimate
A single number used to estimate an unknown population
parameter
Department Head: Our current data indicate that this course will
have 350 students in the fall
Characteristics
Either right or wrong
No estimate of reliability
Interval Estimate
A range of values used to estimate a population parameter
Department Head: I estimate that the true enrollment in this course
this fall will be between 330 and 380, and it is very likely that the
exact enrollment will fall within this interval
Characteristics
Better idea of reliability of estimate
Decision making is facilitated; e.g., on the basis of the estimate, cancel one
of the sections and offer an elective instead
Point and Interval Estimates
A point estimate is a single number;
a confidence interval provides additional
information about variability
[Figure: the Point Estimate sits between the Lower Confidence Limit and the Upper Confidence Limit; the distance between the two limits is the width of the confidence interval]
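Point Estimates
Estimate population parameters with sample statistics:

Quantity      Population parameter    Sample statistic
Mean          $\mu$                   $\bar{X}$
Proportion    $p$                     $P_S$
Variance      $\sigma^2$              $S^2$
Difference    $\mu_1 - \mu_2$         $\bar{X}_1 - \bar{X}_2$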
Confidence Intervals
How much uncertainty is associated with a
point estimate of a population parameter?

An interval estimate provides more
information about a population characteristic
than does a point estimate

Such interval estimates are called confidence
intervals
Confidence Interval Estimate
An interval gives a range of values:
Takes into consideration variation in
sample statistics from sample to sample
Based on observation from 1 sample
Gives information about closeness to
unknown population parameters
Stated in terms of level of confidence
Never 100% sure
Estimation Process
Population: the mean, $\mu$, is unknown
Random sample: sample mean $\bar{x} = 50$
Conclusion: I am 95% confident that $\mu$ is between 40 & 60.
General Formula
The general formula for all
confidence intervals is:
Point Estimate ± (Critical Value)(Standard Error)
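The general formula can be sketched in a few lines of Python. This is only an illustration: the function name `confidence_interval` and the standard error of 5.1 are assumptions, chosen so that the result lands close to the 40-to-60 interval quoted on the Estimation Process slide; neither comes from the slides themselves.

```python
from statistics import NormalDist

def confidence_interval(point_estimate, critical_value, standard_error):
    """Return (lower, upper) = point estimate -/+ critical value * standard error."""
    margin = critical_value * standard_error
    return point_estimate - margin, point_estimate + margin

# Two-sided 95% critical value from the standard normal distribution (about 1.96)
z = NormalDist().inv_cdf(0.975)

# Hypothetical inputs: sample mean 50, assumed standard error 5.1
lower, upper = confidence_interval(50, z, 5.1)
print(f"{lower:.2f} to {upper:.2f}")
```

The same function covers every interval in this deck; only the critical value (z or t) and the standard error change.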
Confidence Level
The confidence that the
interval will contain the
unknown population parameter
A percentage (less than 100%)
Confidence Level, $(1-\alpha)$
Suppose confidence level = 95%
Also written $(1 - \alpha) = .95$
A relative frequency interpretation:
In the long run, 95% of all the confidence
intervals that can be constructed will
contain the unknown true parameter
A specific interval either will contain or
will not contain the true parameter
No probability is involved in a specific interval
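The relative frequency interpretation can be checked with a small simulation. The sketch below assumes a hypothetical normal population (mean 50, standard deviation 10) and samples of size 25; none of these numbers come from the slides. It builds one interval per sample and counts how often the interval contains the true mean.

```python
import random
from statistics import NormalDist, mean

def coverage(true_mean=50.0, sigma=10.0, n=25, level=0.95, trials=2000, seed=1):
    """Fraction of simulated z-intervals (sigma known) that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = mean(sample)
        # Each individual interval either contains the true mean or it does not
        if x_bar - half_width <= true_mean <= x_bar + half_width:
            hits += 1
    return hits / trials

rate = coverage()
print(f"share of intervals containing the true mean: {rate:.3f}")
```

The share should settle near the chosen confidence level as the number of trials grows, which is what the confidence level describes.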
Confidence Intervals
Population Mean: $\sigma$ known, $\sigma$ unknown
Population Proportion
Confidence Interval for $\mu$ ($\sigma$ Known)
Assumptions
Population standard deviation $\sigma$ is known
Population is normally distributed
If population is not normal, use large sample
Confidence interval estimate for $\mu$:
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
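Finding the Critical Value
Consider a 95% confidence interval:
$1 - \alpha = .95$, so $\alpha/2 = .025$ lies in each tail
$z_{\alpha/2} = \pm 1.96$: the lower critical value is $-z_{.025} = -1.96$ and the upper critical value is $z_{.025} = 1.96$
[Figure: standard normal curve with central area .95 and area .025 in each tail. In z units the Lower Confidence Limit, Point Estimate, and Upper Confidence Limit fall at -1.96, 0, and 1.96; in x units the Point Estimate lies between the two limits]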
Common Levels of Confidence
Commonly used confidence levels
are 90%, 95%, and 99%

Confidence Level    Confidence Coefficient, $1-\alpha$    z value, $z_{\alpha/2}$
80%                 .80                                   1.28
90%                 .90                                   1.645
95%                 .95                                   1.96
98%                 .98                                   2.33
99%                 .99                                   2.57
99.8%               .998                                  3.08
99.9%               .999                                  3.27
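The z values in this table can be reproduced from the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an assumption made for illustration. Computed values can differ in the second or third decimal from values read off a printed table.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```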
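Interval and Level of Confidence
Sampling Distribution of the Mean
Confidence intervals extend from
$$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \quad \text{to} \quad \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
$100(1-\alpha)\%$ of intervals constructed contain $\mu$; $100\alpha\%$ do not.
[Figure: sampling distribution of $\bar{x}$ centered at $\mu_{\bar{x}} = \mu$, with central area $1-\alpha$ and area $\alpha/2$ in each tail; intervals built around sample means $\bar{x}_1$, $\bar{x}_2$, ... are drawn beneath the curve]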
Margin of Error
Margin of Error (e): the amount added to and
subtracted from the point estimate to form the
confidence interval
Example: Margin of error for estimating $\mu$, $\sigma$ known:
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \qquad e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
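Factors Affecting Margin of Error
$$e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
Data variation, $\sigma$: e decreases as $\sigma$ decreases
Sample size, n: e decreases as n increases
Level of confidence, $1 - \alpha$: e increases if $1 - \alpha$ increases
Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$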
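The three factors can be seen by varying one input at a time. This is a minimal sketch with hypothetical numbers (a baseline of standard deviation 10, n = 25, 95% confidence); the function name `margin_of_error` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, confidence_level):
    """e = z_{alpha/2} * sigma / sqrt(n), for a known population standard deviation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z * sigma / sqrt(n)

base = margin_of_error(sigma=10, n=25, confidence_level=0.95)
less_variation = margin_of_error(sigma=5, n=25, confidence_level=0.95)      # smaller sigma
larger_sample = margin_of_error(sigma=10, n=100, confidence_level=0.95)     # larger n
higher_confidence = margin_of_error(sigma=10, n=25, confidence_level=0.99)  # higher level

print(base, less_variation, larger_sample, higher_confidence)
```

Because n enters through its square root, quadrupling the sample size only halves the margin of error.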
Case Study 4.A
A sample of 11 circuits from a large normal
population has a mean resistance of 2.20
ohms. We know from past testing that the
population standard deviation is .35 ohms.

Determine a 95% confidence interval for the
true mean resistance of the population.
Solution Case Study 4.A
A sample of 11 circuits from a large normal
population has a mean resistance of 2.20
ohms. We know from past testing that the
population standard deviation is .35 ohms.
Solution:
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(.35/\sqrt{11}) = 2.20 \pm .2068$$
$$1.9932 \le \mu \le 2.4068$$
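The arithmetic for Case Study 4.A can be verified directly. The sketch below takes the tabulated value z = 1.96 as given; the variable names are chosen for illustration.

```python
from math import sqrt

n = 11          # sample size
x_bar = 2.20    # sample mean resistance (ohms)
sigma = 0.35    # known population standard deviation (ohms)
z = 1.96        # tabulated critical value for 95% confidence

e = z * sigma / sqrt(n)              # margin of error
lower, upper = x_bar - e, x_bar + e  # confidence limits
print(f"e = {e:.4f}, interval = {lower:.4f} to {upper:.4f}")
```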
Interpretation
We are 95% confident that the true mean
resistance is between 1.9932 and 2.4068 ohms
Although the true mean may or may not be in
this interval, 95% of intervals formed in this
manner will contain the true mean

An incorrect interpretation is that there is a 95% probability that this
interval contains the true population mean.
(This interval either does or does not contain the true mean; there is
no probability for a single interval)
Case Study 4.B
Police Chief Ackert has recently instituted a crackdown on drug dealers
in her city. Since the crackdown began, 750 of the 12,368 drug
dealers in the city have been caught. The mean dollar value of drugs
found on these 750 dealers is $250,000. The SD of the dollar value
of drugs for these 750 dealers is $41,000. Construct a 90%
confidence interval for the mean dollar value of drugs possessed by
the city's drug dealers.
Determining Sample Size
The required sample size can be found to
reach a desired margin of error (e) and level
of confidence $(1 - \alpha)$

Required sample size, $\sigma$ known:
$$n = \left( \frac{z_{\alpha/2}\,\sigma}{e} \right)^2 = \frac{z_{\alpha/2}^2\,\sigma^2}{e^2}$$
Required Sample Size Example
If $\sigma = 45$, what sample size is needed to be
90% confident of being correct within ± 5?
$$n = \left( \frac{z_{\alpha/2}\,\sigma}{e} \right)^2 = \left( \frac{1.645(45)}{5} \right)^2 = 219.19$$
(Always round up)
So the required sample size is n = 220
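The sample size calculation, including the round-up rule, can be sketched as follows. The function name `required_n_mean` is an assumption; the tabulated z = 1.645 is taken as given.

```python
from math import ceil

def required_n_mean(sigma, e, z):
    """Smallest whole n with margin of error at most e when sigma is known."""
    return ceil((z * sigma / e) ** 2)   # always round up, never to nearest

n = required_n_mean(sigma=45, e=5, z=1.645)
print(n)
```

Rounding up rather than to the nearest integer guarantees the margin of error does not exceed the target.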
If $\sigma$ is unknown
If unknown, $\sigma$ can be estimated when
using the required sample size formula
Use a value for $\sigma$ that is expected to be
at least as large as the true $\sigma$
Select a pilot sample and estimate $\sigma$ with
the sample standard deviation, s
Confidence Intervals for the
Population Proportion, p
An interval estimate for the
population proportion ( $p$ ) can be
calculated by adding an allowance
for uncertainty to the sample
proportion ( $\hat{p}$ )
Confidence Intervals for the
Population Proportion, p
Recall that the distribution of the sample
proportion is approximately normal if the
sample size is large, with standard deviation
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
We will estimate this with sample data:
$$s_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Confidence interval endpoints
Upper and lower confidence limits for the
population proportion are calculated with the
formula
$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
where
$z_{\alpha/2}$ is the standard normal value for the level of confidence desired
$\hat{p}$ is the sample proportion
n is the sample size
Example
A random sample of 100 people shows
that 25 are left-handed. Form a 95%
confidence interval for the true proportion
of left-handers.
1. $\hat{p} = 25/100 = .25$
2. $S_{\hat{p}} = \sqrt{\hat{p}(1-\hat{p})/n} = \sqrt{.25(.75)/100} = .0433$
3. $.25 \pm 1.96\,(.0433)$, which gives $0.1651 \le p \le 0.3349$
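The three steps of the left-handers example translate directly into code. A minimal sketch, assuming the tabulated z = 1.96; the function name `proportion_ci` is introduced here for illustration.

```python
from math import sqrt

def proportion_ci(successes, n, z):
    """Return (sample proportion, standard error, lower limit, upper limit)."""
    p_hat = successes / n                  # step 1: sample proportion
    se = sqrt(p_hat * (1 - p_hat) / n)     # step 2: estimated standard error
    margin = z * se                        # step 3: critical value times standard error
    return p_hat, se, p_hat - margin, p_hat + margin

p_hat, se, lower, upper = proportion_ci(25, 100, 1.96)
print(f"p_hat = {p_hat}, se = {se:.4f}, interval = {lower:.4f} to {upper:.4f}")
```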
Interpretation
We are 95% confident that the true
percentage of left-handers in the population
is between
16.51% and 33.49%.

Although this range may or may not contain
the true proportion, 95% of intervals formed
from samples of size 100 in this manner will
contain the true proportion.
Changing the sample size
Increases in the sample size reduce
the width of the confidence interval.

Example:
If the sample size in the above example is
doubled to 200, and if 50 are left-handed in the
sample, then the interval is still centered at .25,
but it narrows to $.19 \le p \le .31$
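The effect of sample size on interval width can be tabulated for a fixed sample proportion of .25. This is a sketch under the assumption that the proportion stays at .25 as n grows; the helper name `interval_width` and the extra sample size of 400 are illustrative.

```python
from math import sqrt

def interval_width(p_hat, n, z=1.96):
    """Full width (upper limit minus lower limit) of the proportion interval."""
    return 2 * z * sqrt(p_hat * (1 - p_hat) / n)

widths = {n: interval_width(0.25, n) for n in (100, 200, 400)}
for n, w in widths.items():
    print(f"n = {n}: width = {w:.4f}")
```

Doubling n narrows the interval by a factor of the square root of 2; halving the width takes four times the sample.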
Finding the Required Sample Size
for proportion problems
Define the margin of error:
$$e = z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}}$$
Solve for n:
$$n = \frac{z_{\alpha/2}^2\, p(1-p)}{e^2}$$
p can be estimated with a pilot sample, if
necessary (or conservatively use p = .50)
What sample size...?
How large a sample would be necessary
to estimate the true proportion defective
in a large population within 3%, with
95% confidence?
(Assume a pilot sample yields p = .12)
What sample size...?
Solution:
For 95% confidence, use Z = 1.96
E = .03
The pilot sample proportion is .12, so use this to estimate p
$$n = \frac{z_{\alpha/2}^2\, p(1-p)}{e^2} = \frac{(1.96)^2 (.12)(1 - .12)}{(.03)^2} = 450.74$$
So use n = 451
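The proportion sample size formula can be sketched the same way as the one for the mean. The function name `required_n_proportion` is an assumption; the second call shows the conservative choice p = .50 mentioned on the earlier slide.

```python
from math import ceil

def required_n_proportion(p, e, z):
    """Smallest whole n with margin of error at most e for a proportion near p."""
    return ceil(z ** 2 * p * (1 - p) / e ** 2)

n_pilot = required_n_proportion(p=0.12, e=0.03, z=1.96)         # pilot estimate of p
n_conservative = required_n_proportion(p=0.50, e=0.03, z=1.96)  # worst case, p = .50
print(n_pilot, n_conservative)
```

The conservative choice needs a noticeably larger sample because p(1 - p) is largest at p = .50.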
Determining Sample
Size for Mean
What sample size is needed to be 90% confident
of being correct within ± 5? A pilot study
suggested that the standard deviation is 45.
$$n = \frac{Z^2 \sigma^2}{\text{Error}^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.2 \approx 220$$
Round Up
Confidence Interval for $\mu$ ($\sigma$ Unknown)
If the population standard deviation $\sigma$ is
unknown, we can substitute the sample
standard deviation, s
This introduces extra uncertainty, since s
varies from sample to sample
So we use the t distribution instead of the
normal distribution
Determining Sample Size (Cost)
Too big: requires too many resources
Too small: won't do the job
Confidence Interval for $\mu$ ($\sigma$ Unknown)
Assumptions
Population standard deviation is unknown
Population is normally distributed
If population is not normal, use large sample
Use Student's t Distribution
Confidence Interval Estimate
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$
Approximation for Large Samples
Since t approaches z as the sample size
increases, an approximation is
sometimes used when n > 30:
Technically correct:
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$
Approximation for large n:
$$\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$$
Student's t Distribution
Bell-shaped and symmetric
Fatter tails than the standard normal
[Figure: the standard normal curve (Z) overlaid with t curves for df = 13 and df = 5, all centered at 0]
Degrees of Freedom (df)
Number of observations that are free to
vary after the sample mean has been
calculated
Example:
Mean of 3 numbers is 2
$X_1 = 1$ (or any number)
$X_2 = 2$ (or any number)
$X_3 = 3$ (cannot vary)
degrees of freedom
= n - 1
= 3 - 1
= 2
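Student's t Table
Let: n = 3
df = n - 1 = 2
$\alpha = .10$
$\alpha/2 = .05$

Upper Tail Area
df    .25      .10      .05
1     1.000    3.078    6.314
2     0.817    1.886    2.920
3     0.765    1.638    2.353

t Values: for df = 2 and upper tail area $\alpha/2 = .05$, t = 2.920
[Figure: t curve centered at 0 with area .05 to the right of 2.920]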
t distribution values
With comparison to the z value

Confidence Level    t (10 d.f.)    t (20 d.f.)    t (30 d.f.)    z
.80                 1.372          1.325          1.310          1.28
.90                 1.812          1.725          1.697          1.64
.95                 2.228          2.086          2.042          1.96
.99                 3.169          2.845          2.750          2.57

Note: t approaches z as n increases
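Example
A random sample of n = 25 has $\bar{X} = 50$ and S = 8.
Set up a 95% confidence interval estimate for $\mu$.
$$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$$
$$50 - 2.0639 \frac{8}{\sqrt{25}} \le \mu \le 50 + 2.0639 \frac{8}{\sqrt{25}}$$
$$46.70 \le \mu \le 53.30$$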
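The t interval for this example can be checked numerically. Python's standard library has no t distribution, so the sketch below simply takes the tabulated critical value 2.0639 (24 degrees of freedom, upper tail area .025) from the slide as an input. It also computes the z-based interval for comparison; the variable names are illustrative.

```python
from math import sqrt

n, x_bar, s = 25, 50.0, 8.0
t_crit = 2.0639   # tabulated t value for df = n - 1 = 24, upper tail area .025
z_crit = 1.96     # standard normal value, for comparison only

standard_error = s / sqrt(n)
t_margin = t_crit * standard_error
z_margin = z_crit * standard_error

lower, upper = x_bar - t_margin, x_bar + t_margin
print(f"t interval: {lower:.2f} to {upper:.2f}")
print(f"t margin {t_margin:.3f} vs z margin {z_margin:.3f}")
```

The t margin is wider than the z margin, reflecting the extra uncertainty from estimating the standard deviation with s.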
Confidence Interval
Estimate for Proportion
Assumptions
Two categorical outcomes
Population follows binomial distribution
Normal approximation can be used if
$np \ge 5$ and $n(1-p) \ge 5$
Confidence interval estimate
$$p_S - Z_{\alpha/2} \sqrt{\frac{p_S(1-p_S)}{n}} \le p \le p_S + Z_{\alpha/2} \sqrt{\frac{p_S(1-p_S)}{n}}$$
where $p_S$ is the sample proportion
Example
A random sample of 400 voters showed 32
preferred candidate A. Set up a 95%
confidence interval estimate for p.
$$p_S - Z_{\alpha/2} \sqrt{\frac{p_S(1-p_S)}{n}} \le p \le p_S + Z_{\alpha/2} \sqrt{\frac{p_S(1-p_S)}{n}}$$
$$.08 - 1.96 \sqrt{\frac{.08(1-.08)}{400}} \le p \le .08 + 1.96 \sqrt{\frac{.08(1-.08)}{400}}$$
$$.053 \le p \le .107$$
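The voter example, together with the normal-approximation check from the assumptions slide, can be sketched as below. Because the true p is unknown, the check here uses the sample proportion as a stand-in, which is an assumption of this sketch; the function name `proportion_interval` is also illustrative.

```python
from math import sqrt

def proportion_interval(successes, n, z):
    """z interval for a proportion; refuses samples too small for the normal approximation."""
    p_s = successes / n
    # Rule-of-thumb check, with p_s standing in for the unknown p (assumption)
    if n * p_s < 5 or n * (1 - p_s) < 5:
        raise ValueError("normal approximation not appropriate for this sample")
    margin = z * sqrt(p_s * (1 - p_s) / n)
    return p_s - margin, p_s + margin

lower, upper = proportion_interval(32, 400, 1.96)
print(f"{lower:.3f} <= p <= {upper:.3f}")
```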
Determining Sample Size for
Proportion
Out of a population of 1,000, we randomly
selected 100, of which 30 were defective.
What sample size is needed to be within
± 5% with 90% confidence?
$$n = \frac{Z^2 p(1-p)}{\text{Error}^2} = \frac{(1.645)^2 (0.3)(0.7)}{(0.05)^2} = 227.3 \approx 228$$
Round Up
Confidence Interval for Population
Total Amount
Point estimate
$$N\bar{X}$$
Confidence interval estimate
$$N\bar{X} \pm N\, t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$$
Confidence Interval for Population
Total: Example
An auditor is faced with a
population of 1000 vouchers
and wants to estimate the
total value of the population.
A sample of 50 vouchers is
selected with average
voucher amount of
$1076.39, standard deviation
of $273.62. Set up the 95%
confidence interval estimate
of the total amount for the
population of vouchers.
Example Solution
$N = 1000$, $n = 50$, $\bar{X} = 1076.39$, $S = 273.62$ (dollars)
$$N\bar{X} \pm N\, t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} = (1000)(1076.39) \pm (1000)(2.0096) \frac{273.62}{\sqrt{50}} \sqrt{\frac{1000-50}{1000-1}}$$
$$= 1,076,390 \pm 75,830.85$$
The 95% confidence interval for the population total
amount of the vouchers is between 1,000,559.15 and
1,152,220.85
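The voucher calculation can be reproduced with the sketch below, which takes the tabulated t value 2.0096 (49 degrees of freedom) as an input. The function name `total_ci` is an assumption. Because of rounding in intermediate steps, the computed margin may differ from the slide's figure by a dollar or so.

```python
from math import sqrt

def total_ci(N, n, x_bar, s, t_crit):
    """Point estimate and margin of error for a population total, with finite population correction."""
    point = N * x_bar
    fpc = sqrt((N - n) / (N - 1))               # finite population correction factor
    margin = N * t_crit * s / sqrt(n) * fpc
    return point, margin

point, margin = total_ci(N=1000, n=50, x_bar=1076.39, s=273.62, t_crit=2.0096)
print(f"{point:,.2f} +/- {margin:,.2f}")
print(f"interval: {point - margin:,.2f} to {point + margin:,.2f}")
```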
Confidence Interval for Total
Difference in the Population
Point estimate
$$N\bar{D}$$
Where $\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$ is the sample average
difference
Confidence interval estimate
$$N\bar{D} \pm N\, t_{\alpha/2,\,n-1} \frac{S_D}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$$
Where
$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n-1}}$$
Estimation for Finite Population
Samples are selected without replacement
Confidence interval for the mean ($\sigma$ unknown)
$$\bar{X} \pm t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$$
Confidence interval for proportion
$$p_S \pm Z_{\alpha/2} \sqrt{\frac{p_S(1-p_S)}{n}} \sqrt{\frac{N-n}{N-1}}$$
Sample Size Determination
for Finite Population
Samples are selected without replacement
When estimating the mean
$$n_0 = \frac{Z_{\alpha/2}^2\, \sigma^2}{e^2}$$
When estimating the proportion
$$n_0 = \frac{Z_{\alpha/2}^2\, p(1-p)}{e^2}$$
Case Study 4.1 LR 357 7.17
An apartment manager wants to inform potential renters about how
much electricity they can expect to use during August. She
randomly selects 61 residents and discovers their average
electricity usage in August to be 894 kWh. She believes the
variance in usage to be 131 kWh².

a. Establish an interval estimate for the average August electricity
usage so that the apartment manager can be 68.3% certain that
the true population mean lies within this interval
b. Repeat part (a) with 99.7% certainty
c. If the price per kWh is $0.12, within what interval can the
apartment manager be 68.3% certain that the average August
cost for electricity will lie?
Case Study 4.2 LR 356 SC 7.3
For a population with a known variance of 185, a sample of 64
individuals leads to 217 as an estimate of the mean

a. Find the standard error of the mean
b. Establish an interval estimate that should include the population
mean 68.3% of the time.
Case Study 4.3 LR 365 7.27
The manager of Cardinal Electric's light bulb division must estimate the
average number of hours that a light bulb made by each light bulb
machine will last. A sample of 40 light bulbs was selected from
machine A and the average burning time was 1,416 hours. The
standard deviation of burning time is known to be 30 hours.

a. Compute the standard error of the mean
b. Construct a 90% confidence interval for the true population mean.
Case Study 4.4 LR 365 SC 7.7
In an automotive safety test conducted by the North Carolina Highway
Safety Research Center, the average tire pressure in a sample of
62 tires was found to be 24 pounds per square inch, and the
standard deviation was 2.1 pounds per square inch.

a. What is the estimated population standard deviation for this
population? (There are about a million cars registered in North
Carolina.)
b. Calculate the estimated standard error of the mean.
c. Construct a 95% confidence interval for the population mean.
Case Study 4.5 LR 369 SC 7.8
When a sample of 70 retail executives was surveyed regarding the
poor November performance of the retail industry, 66% believed
that decreased sales were due to unseasonably warm
temperatures, resulting in consumers' delayed purchase of
cold-weather items.

a. Estimate the standard error of the proportion of retail executives
who blame warm weather for low sales.
b. Find the upper & lower confidence limits for this proportion, given
a 95% confidence level.
Case Study 4.6 LR 370 7.41
For a year and a half, sales have been falling consistently in all 1,500
franchises of a fast-food chain. A consulting firm has determined
that 31 percent of a sample of 95 indicate clear signs of
mismanagement. Construct a 98% confidence interval for this
proportion.
Case Study 4.7 LR 378 7.49
Twelve bank tellers were randomly sampled, and it was determined that they
made an average of 3.6 errors per day with a sample standard
deviation of .42 errors. Construct a 90% confidence interval for the
population mean of errors per day. What assumption is implied
about the number of errors bank tellers make?
Case Study 4.8 LR 383 7.58
A local store that specializes in candles and clocks is interested in
obtaining an interval estimate for the mean number of customers
that enter the store daily. The owners are reasonably sure that the
actual standard deviation of the daily number of customers is 15
customers. Help the store out of the fix by determining the sample
size it should use in order to develop a 96% confidence interval
for the true mean that will have a width of only 8 customers.
