
Chapter 15

Sampling Distribution Models

Copyright 2014, 2012, 2009 Pearson Education, Inc.

Objectives
54. State and apply the conditions and uses of the Central Limit Theorem.
55. Determine the mean and standard deviation (standard error) for a sampling distribution of proportions or means.
56. Apply the sampling distribution of a proportion or a mean to application problems.


15.1 Sampling Distribution of a Proportion


Sample Proportions and Sampling Distributions
The Harris poll found that of 889 U.S. adults, 40% said
they believe in ghosts. CBS News found that of 808
U.S. adults, 48% said they believe in ghosts.
Why are these two sample proportions different?
What is the true population proportion (of ALL U.S.
adults)?
We'll denote the population proportion $p$ and the sample proportion $\hat{p}$.
Consider all possible samples of size 808. If we made a histogram of the number of samples having a given $\hat{p}$, what might that look like?


The Central Limit Theorem for Sample Proportions
Rather than showing real repeated samples, imagine
what would happen if we were to actually draw many
samples and look at their proportions.
The histogram we'd get if we could see all the
proportions from all possible samples is called the
sampling distribution of the proportions.
What would the histogram of all the sample proportions
look like?


Sampling About Evolution


According to a Gallup poll, 43% believe in
evolution. Assume this is true of all Americans.
If many surveys were done of 1007 Americans, we
could calculate the sample proportion for each.

The histogram shows the distribution of a simulation of 2000 sample proportions.

The distribution of all possible sample proportions from samples with the same sample size is called the sampling distribution.
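
The simulation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code behind the slide's histogram: the function name `simulate_sample_proportions` and the fixed seed are assumptions, and each respondent is modeled as an independent draw with p = 0.43.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n; return each sample's proportion of successes."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        # Count respondents who "believe", each with probability p
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

# 2000 simulated surveys of 1007 Americans, assuming the true proportion is 0.43
props = simulate_sample_proportions(p=0.43, n=1007, num_samples=2000)
print("mean of sample proportions:", round(statistics.mean(props), 4))
print("SD of sample proportions:  ", round(statistics.stdev(props), 4))
```

A histogram of `props` should look roughly symmetric and unimodal, with its center near 0.43.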

Sampling Distributions
Sampling Distribution for Proportions
Symmetric
Unimodal
Centered at p
The sampling distribution follows the Normal model: $N\left(p, \sqrt{\frac{pq}{n}}\right)$, where $q = 1 - p$.

What does the sampling distribution tell us?


The sampling distribution allows us to make
statements about where we think the corresponding
population parameter is and how precise these
statements are likely to be.

Another way of saying this


Sample statistics are random variables themselves
Sample proportion (for categorical data)
Sample mean (for quantitative data)
They have a probability distribution, mean, standard
deviation, etc.


Mean and Standard Deviation


Sampling Distribution for Proportions

Mean = p

$\sigma(\hat{p}) = \frac{\sqrt{npq}}{n} = \sqrt{\frac{pq}{n}}$

Model: $N\left(p, \sqrt{\frac{pq}{n}}\right)$


The Normal Model for Evolution


Population: p = 0.43, n = 1007. Sampling Distribution:
Mean = 0.43
Standard deviation = $\sigma(\hat{p}) = \sqrt{\frac{0.43 \times 0.57}{1007}} \approx 0.0156$

Assumptions and Conditions


Most models are useful only when specific assumptions
are true.
There are two assumptions in the case of the model for
the distribution of sample proportions:
1. The Independence Assumption: The sampled
values must be independent of each other.
2. The Sample Size Assumption: The sample size, n,
must be large enough.


Assumptions and Conditions (cont.)


Assumptions are hard (often impossible) to check.
That's why we assume them.
Still, we need to check whether the assumptions are
reasonable by checking conditions that provide
information about the assumptions.
The corresponding conditions to check before using the
Normal to model the distribution of sample proportions
are the Randomization Condition, the 10% Condition, and
the Success/Failure Condition.


Assumptions and Conditions (cont.)


1. Randomization Condition: The sample should be a
simple random sample of the population.
2. 10% Condition: If sampling has not been made with
replacement, then the sample size, n, must be no
larger than 10% of the population.
3. Success/Failure Condition: The sample size has to
be big enough so that both np and nq are at least
10.
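
The two numerical conditions above can be checked mechanically. The sketch below is a hypothetical helper (the name `check_proportion_conditions` and the optional `population_size` parameter are assumptions, not part of the slides). The Randomization Condition depends on how the data were collected, so it cannot be checked this way.

```python
def check_proportion_conditions(p, n, population_size=None):
    """Check the numerical conditions for the Normal model of a sample proportion."""
    q = 1 - p
    results = {
        # Success/Failure Condition: both np and nq are at least 10
        "success_failure": n * p >= 10 and n * q >= 10,
    }
    # 10% Condition: needs the population size, so skip it when that is unknown
    if population_size is not None:
        results["ten_percent"] = n <= 0.10 * population_size
    return results

# The evolution poll: p = 0.43, n = 1007
print(check_proportion_conditions(p=0.43, n=1007))
```

Passing a `population_size` adds the 10% Condition to the result.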


The Central Limit Theorem for Sample Proportions (cont.)
Because we have a Normal model, for example, we
know that 95% of Normally distributed values fall
within two standard deviations of the mean. So we
should not be surprised if 95% of various polls gave
results that were near the mean but varied above and
below that by no more than two standard deviations.
This is what we mean by sampling error. It's not really
an error at all, but just variability you'd expect to see
from one sample to another.
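
As a worked illustration of the two-standard-deviation range, take the evolution poll model from earlier in the chapter (mean 0.43, standard deviation about 0.0156). The arithmetic below is a sketch using those rounded values:

```latex
\[
0.43 \pm 2(0.0156) = 0.43 \pm 0.0312
\quad\Rightarrow\quad (0.3988,\ 0.4612)
\]
```

So about 95% of polls of 1007 people would be expected to report between roughly 40% and 46%.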


Solving Sampling Distribution Problems (Proportions)

o First identify what sampling distribution is involved.
  o Hint: you must know the underlying population p and there must be a sample proportion involved.
  o The sampling distribution is given by $N\left(p, \sqrt{\frac{pq}{n}}\right)$.
o Check the conditions to be sure the sampling distribution applies.
o Draw a picture of the sampling distribution (Normal curve).
o Find where $\hat{p}$ falls on this distribution and use normalcdf to solve for the probability of seeing $\hat{p}$ or something more extreme (shade from $\hat{p}$ to the nearest tail).

Practice
12) Public Health statistics indicate that 26.4% of American adults
smoke cigarettes. Describe the sampling distribution model for
the proportion of smokers among a randomly selected group of
50 adults. What are your assumptions and conditions?
15) Based on past experience, a bank believes that 7% of the
people who receive loans will not make payments on time. The
bank has recently approved 200 loans.
What are the mean and standard deviation of the proportion of
clients in this group who may not make timely payments?
What assumptions underlie your model? Are the conditions met?
What is the probability that over 10% of these clients will not
make timely payments?

Practice
16) Assume that 30% of students at a university wear
contact lenses.
We randomly pick 100 students. Let $\hat{p}$ represent the
proportion of students who wear contact lenses.
What's the appropriate model for the distribution of $\hat{p}$?
Specify the name of the distribution, the mean, and
the standard deviation.
Be sure to verify that the conditions are met.
What's the approximate probability that more than one
third of this sample wear contacts?


Enough Lefty Seats?


13% of all people are left handed.
A 200-seat auditorium has 15 lefty seats.
What is the probability that there will not be enough
lefty seats for a class of 90 students?
Think
Plan: $\hat{p} = 15/90 \approx 0.167$. Want $P(\hat{p} > 0.167)$.
Model:
Independence Assumption: With respect to
lefties, the students are independent.
10% Condition: 90 students are fewer than 10% of all people.
Success/Failure Condition: $np = 90(0.13) = 11.7 \ge 10$, $nq = 90(0.87) = 78.3 \ge 10$

Enough Lefty Seats?


Think
Model: p = 0.13, n = 90

$SD(\hat{p}) = \sqrt{\frac{0.13 \times 0.87}{90}} \approx 0.035$

The model is: N(0.13, 0.035)


Show
Plot
Mechanics: $z = \frac{0.167 - 0.13}{0.035} \approx 1.06$

$P(\hat{p} > 0.167) = P(z > 1.06) \approx 0.1446$

Or normalcdf(0.167, 1E99, 0.13, 0.035)
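
The calculator's normalcdf step has a rough Python counterpart in the standard library's `statistics.NormalDist`. The sketch below reworks the lefty-seat calculation without intermediate rounding. The variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.13, 90        # population proportion of lefties, class size
lefty_seats = 15

sd = sqrt(p * (1 - p) / n)      # standard deviation of the sample proportion
threshold = lefty_seats / n     # not enough seats if p-hat exceeds 15/90

# Upper-tail probability under N(p, sd); plays the role of normalcdf(threshold, 1E99, p, sd)
prob_not_enough = 1 - NormalDist(mu=p, sigma=sd).cdf(threshold)
print(round(sd, 4), round(prob_not_enough, 3))
```

Because nothing is rounded along the way, the result differs slightly from the slide's 0.1446, which uses z = 1.06 and SD = 0.035.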



Enough Lefty Seats?


Tell
Conclusion: There is about a 14.5% chance that
there will not be enough seats for the left handed
students in the class.


15.3 The Sampling Distribution of Other Statistics

The Sampling Distribution for Others

There is a sampling distribution for any statistic, but the Normal model may not fit.
Below are histograms showing results of simulations of sampling distributions.

The Sampling Distribution For Others

The medians seem to be approximately Normal.

The variances seem somewhat skewed right.

The minimums are all over the place.

In this course, we will focus on the proportions and the means.

Sampling Distribution of the Means

Imagine we roll a number of dice and take the average of the rolls over and over again.

For 1 die, the distribution is Uniform.

For 3 dice, the sampling distribution for the means is closer to Normal.

For 20 dice, the sampling distribution for the means is very close to Normal. The standard deviation is much smaller.
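
The dice experiment is easy to try in code. The following is a minimal sketch, assuming fair six-sided dice and 10,000 repetitions per case. The function name `simulate_dice_means`, the trial count, and the seed are illustrative choices, not taken from the slides.

```python
import random
import statistics

def simulate_dice_means(num_dice, num_trials=10000, seed=1):
    """Roll num_dice fair dice num_trials times; return the mean of each roll."""
    rng = random.Random(seed)
    return [
        sum(rng.randint(1, 6) for _ in range(num_dice)) / num_dice
        for _ in range(num_trials)
    ]

for num_dice in (1, 3, 20):
    means = simulate_dice_means(num_dice)
    # The center stays near 3.5 while the spread shrinks as more dice are averaged
    print(num_dice, round(statistics.mean(means), 2), round(statistics.stdev(means), 2))
```

Plotting a histogram of `means` for each case should show the shape moving from flat toward a bell curve.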


15.4 The Central Limit Theorem: The Fundamental Theorem of Statistics

The Central Limit Theorem


The Central Limit Theorem
The sampling distribution of any mean becomes
nearly Normal as the sample size grows.
Requirements
Independent
Randomly collected sample
The sampling distribution of the means is close to Normal
if either:
Large sample size
Population close to Normal

Video on the Central Limit Theorem


http://www.nytimes.com/video/science/100000002452709/bunnies-dragons-and-the-normal-world.html?playlistId=100000002438160


How Normal?


Population Distribution and Sampling Distribution of the Means

Population Distribution    Sampling Distribution for the Means
Normal                     Normal (any sample size)
Uniform                    Normal (large sample size)
Bimodal                    Normal (larger sample size)
Skewed                     Normal (larger sample size)

Standard Deviation of the Means

Which would be more unusual: a student who is 6'9" tall in the class or a class that has a mean height of 6'9"?

The sample means have a smaller standard deviation than the individuals.

The standard deviation of the sample means goes down by the square root of the sample size:

$SD(\bar{y}) = \frac{\sigma}{\sqrt{n}}$

The Sampling Distribution Model for a Mean

When a random sample is drawn from a population with mean $\mu$ and standard deviation $\sigma$, the sampling distribution has:
Mean: $\mu$
Standard Deviation: $\frac{\sigma}{\sqrt{n}}$

For large sample size, the distribution is approximately Normal regardless of the population the random sample comes from.

The larger the sample size, the closer to Normal.

Solving Sampling Distribution Problems (Means)

Caution!
Pay attention to how the sampling distribution of means
differs depending on the size of the sample.
Be careful to distinguish between the underlying
distribution of the population (which may or may not be
normal) and the sampling distribution of means (which
depends on sample size n).


38) Statistics indicate that Ithaca, NY gets an average rainfall of 35.4 inches each year, with a standard deviation of 4.2 inches. Assume that a Normal model applies.

During what percentage of years does Ithaca get more than 40 inches of rain?

Less than how much rain falls in the driest 20% of all years?

A Cornell student is in Ithaca for 4 years. Let $\bar{y}$ represent the mean amount of rain for those 4 years. Describe the sampling distribution model of this sample mean $\bar{y}$.

What's the probability that those 4 years average less than 30 inches of rain?

Too Heavy for the Elevator?


Mean weight of US men is 190 lb, the
standard deviation is 59 lb. An elevator has a weight limit
of 10 persons or 2500 lb. Find the probability that 10 men
in the elevator will overload the weight limit.
Think
Plan: 10 men totaling over 2500 lb is the same as their mean weight being over 250 lb.

Model:
Independence Assumption: Not random, but probably independent.
Sample Size Condition: Weight is approximately Normal.

Too Heavy for the Elevator


Think
Model: $\mu = 190$, $\sigma = 59$
By the CLT, the sampling distribution of $\bar{y}$ is
approximately Normal:

$\mu(\bar{y}) = 190$, $SD(\bar{y}) = \frac{\sigma}{\sqrt{n}} = \frac{59}{\sqrt{10}} \approx 18.66$

Show
Plot:

Too Heavy for the Elevator?

Mechanics:
$z = \frac{\bar{y} - \mu}{SD(\bar{y})} = \frac{250 - 190}{18.66} \approx 3.21$

$P(\bar{y} > 250) \approx P(z > 3.21) \approx 0.0007$

Tell
Conclusion: There is only about a 0.0007 chance that the
10 men will exceed the elevator's weight limit.
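
The elevator calculation can be reproduced with Python's standard library. This is a sketch under the slide's stated model (weights with mean 190 lb and standard deviation 59 lb, ten riders treated as independent). The variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 190, 59, 10   # mean and SD of men's weights (lb), number of riders
limit_total = 2500           # elevator weight limit (lb)

sd_mean = sigma / sqrt(n)                 # SD of the sample mean
z = (limit_total / n - mu) / sd_mean      # z-score of a 250 lb mean weight
prob_overload = 1 - NormalDist().cdf(z)   # upper-tail probability P(Z > z)
print(sd_mean, z, prob_overload)
```

The unrounded values land close to the slide's 18.66, 3.21, and 0.0007.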


43) The College Board reported the score distribution shown in the table for all students who took the 2006 AP Statistics Exam:

Score    Percent of Students
5        12.6
4        22.2
3        25.3
2        18.3
1        21.6

Find the mean and standard deviation of the scores.
If we select a random sample of 40 AP students, would we expect their scores to follow a Normal model?
Consider the mean scores of random samples of 40 AP stats students. Describe the sampling model for these means.
An AP stats teacher had 63 students preparing to take the AP exam. He considers his students to be typical of all the national students. What's the probability that his students will achieve an average score of at least 3?

48) The weight of potato chips in a bag is stated to be 10 ounces. The amount that the machine puts in these bags is believed to have a Normal model with mean 10.2 oz and standard deviation 0.12 oz.
What fraction of all bags are underweight?
Some of the chips are sold in bargain packs of 3 bags. What is the probability that none of the 3 is underweight?
What's the probability that the mean weight of the 3 bags is below 10 oz?
What's the probability that the mean weight of a 24-bag case is below 10 oz?

15.5 Sampling Distributions: A Summary

Sample Size and Standard Deviation

$SD(\bar{y}) = \frac{\sigma}{\sqrt{n}}$

$SD(\hat{p}) = \sqrt{\frac{pq}{n}}$

Larger sample size → smaller standard deviation.

Multiply n by 4 → divide the standard deviation by 2.

Need a sample 100 times as large to reduce the standard deviation by a factor of 10.
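
The square-root relationship can be confirmed numerically. In this small sketch the helper `sd_of_mean`, the population standard deviation of 59, and the starting sample size of 25 are illustrative assumptions. Only the ratios matter.

```python
from math import sqrt

def sd_of_mean(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / sqrt(n)

sigma = 59                      # illustrative population SD
base = sd_of_mean(sigma, 25)    # starting sample size of 25

ratio_4x = base / sd_of_mean(sigma, 25 * 4)       # n multiplied by 4
ratio_100x = base / sd_of_mean(sigma, 25 * 100)   # n multiplied by 100
print(ratio_4x, ratio_100x)
```

The two ratios come out at about 2 and 10, whatever values are chosen for `sigma` and the starting sample size.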


Billion Dollar Misunderstanding


The Bill and Melinda Gates Foundation found that 12% of the top 50 performing schools were from the smallest 3%. They funded a transformation to small schools.

Small schools have a smaller n, thus a higher standard deviation of $\bar{y}$.

Likely to see both higher and lower means.

18% of the bottom 50 were also from the smallest 3%.


Distribution of the Sample vs. the Sampling Distribution

Don't confuse the distribution of the sample and the sampling distribution.
If the population's distribution is not Normal, then the sample's distribution will not be Normal even if the sample size is very large.

For large sample sizes, the sampling distribution, which is the distribution of all possible sample means from samples of that size, will be approximately Normal.
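
A short simulation can make this distinction concrete. The sketch below assumes a right-skewed population (exponential with mean 1, chosen purely for illustration) and compares the skewness of one large sample with the skewness of many sample means. The `skewness` helper, sample sizes, and seed are all assumptions.

```python
import random
import statistics

rng = random.Random(2)

def skewness(values):
    """Sample skewness: near 0 for symmetric data, positive for right-skewed data."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Distribution of the sample: one large sample from the skewed population
one_sample = [rng.expovariate(1.0) for _ in range(5000)]

# Sampling distribution: means of 2000 samples, each of size 100
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(100))
    for _ in range(2000)
]

print("skewness of one large sample:", round(skewness(one_sample), 2))
print("skewness of sample means:    ", round(skewness(sample_means), 2))
```

The single sample stays strongly skewed no matter how large it is, while the sample means are much closer to symmetric.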

Two Truths About Sampling Distributions

Sampling distributions arise because samples vary. Each random sample will contain different cases and, so, a different value of the statistic.

Although we can always simulate a sampling distribution, the Central Limit Theorem saves us the trouble for proportions and means. This is especially important when we do not know the population's distribution.


What Can Go Wrong?


Don't confuse the sampling distribution with the
distribution of the sample.
A histogram of the data shows the sample's
distribution. The sampling distribution is more
theoretical.
Beware of observations that are not independent.
The CLT fails for dependent samples. A good
survey design can ensure independence.
Watch out for small samples from skewed or bimodal
populations.
The CLT requires large samples or a Normal
population or both.
