
8- 1

Sampling Methods & Central Limit Theorem

Copyright 2004 by The McGraw-Hill Companies, Inc. All rights reserved.


8- 2

When you have completed this chapter, you will be able to:
1. Explain under what conditions sampling is the proper way to learn something about a population.
2. Describe methods for selecting a sample.
3. Define and construct a sampling distribution of the sample mean.
4. Explain the central limit theorem.
5. Use the central limit theorem to find probabilities of selecting possible sample means from a specified population.


8- 3

We use sample information to make decisions or inferences about the population.

Two KEY steps:

1. Choice of a proper method for selecting sample data
2. Proper analysis of the sample data (more later)

8- 4



8- 5

KEY 1.
If a proper method for selecting the sample is NOT used, the SAMPLE will not be truly representative of the TOTAL population, and wrong conclusions can be drawn!

8- 6

Why Sample the Population?

1. It is physically impossible to check all items in the population, and it would also be too time-consuming.
2. Studying all the items in a population would NOT be cost effective.
3. The sample results are usually adequate.
4. Certain tests are destructive in nature.

8- 7

Techniques

With Replacement: each data unit in the population is allowed to appear in the sample more than once.
Without Replacement: each data unit in the population is allowed to appear in the sample no more than once.

Probability Sampling: each data unit in the population has a known likelihood of being included in the sample.
Non-Probability Sampling: does not involve random selection; inclusion of an item is based on convenience.

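The difference between sampling with and without replacement can be sketched in a few lines of Python. This is only an illustration: the population of ten numbered units and the fixed seed are assumptions made for the example, not part of the slides.

```python
import random

rng = random.Random(42)          # fixed seed (an assumption) so the sketch is repeatable
population = list(range(1, 11))  # hypothetical population of 10 numbered units

# With replacement: the same unit may be drawn more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: each unit can appear at most once.
without_replacement = rng.sample(population, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

Both draws are probability samples, because every unit has a known chance of being selected.
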
8- 8

Methods

Simple Random: each item (person) in the population has an equal chance of being included.

Systematic Random: the items (people) of the population are arranged in some order. A random starting point is selected, and then every kth member of the population is selected for the sample.

Stratified Random: a population is first divided into subgroups, called strata, and a sample is selected from each stratum.

Cluster: a population is first divided into primary units (clusters); some of these units are then selected at random, and samples are selected from each chosen unit.

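The four methods can be compared side by side in a short Python sketch. Everything here is hypothetical: the population of 100 numbered items, the two strata, the ten clusters, and the helper function names are assumptions chosen for illustration. The cluster helper assumes that a random subset of primary units is chosen first and that items are then sampled within the chosen units.

```python
import random

def simple_random(population, n, rng):
    # Every item has the same chance of being included.
    return rng.sample(population, n)

def systematic(population, k, rng):
    # Random starting point among the first k items, then every kth item.
    start = rng.randrange(k)
    return population[start::k]

def stratified(strata, n_per_stratum, rng):
    # Draw a separate random sample from every stratum.
    sample = []
    for units in strata.values():
        sample.extend(rng.sample(units, n_per_stratum))
    return sample

def cluster(clusters, n_clusters, n_per_cluster, rng):
    # Randomly choose some primary units, then sample within each chosen unit.
    chosen = rng.sample(list(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return sample

rng = random.Random(0)                      # fixed seed, for repeatability only
population = list(range(100))               # hypothetical population
strata = {"low": population[:50], "high": population[50:]}
clusters = {c: population[c * 10:(c + 1) * 10] for c in range(10)}

srs = simple_random(population, 10, rng)
sys_sample = systematic(population, 10, rng)
strat = stratified(strata, 5, rng)
clus = cluster(clusters, 2, 5, rng)

for name, s in [("simple", srs), ("systematic", sys_sample),
                ("stratified", strat), ("cluster", clus)]:
    print(name, sorted(s))
```
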
8- 9

Terminology

Sampling error is the difference between a sample statistic and its corresponding population parameter.

Sampling distribution of the sample mean is a probability distribution consisting of all possible sample means of a given sample size selected from a population.

8- 10

Example

The law firm of Hoya and Associates has five partners. At their weekly partners' meeting, each reported the number of hours they billed their clients last week:

Partner      Hours
Dunn         22
Hardy        26
Kiers        30
Malinowski   26
Tillman      22

If two partners are selected randomly, how many different samples are possible?

8- 11

If two partners are selected randomly, how many different samples are possible?

Partner      Hours
Dunn         22
Hardy        26
Kiers        30
Malinowski   26
Tillman      22

5 objects taken 2 at a time, for a total of 10 samples! Using $_5C_2$.

8- 12

If two partners are selected randomly, how many different samples are possible?

$$_5C_2 = \frac{5!}{2!\,(5-2)!} = 10 \text{ samples}$$

8- 13

Partners are numbered in the order listed: 1 = Dunn, 2 = Hardy, 3 = Kiers, 4 = Malinowski, 5 = Tillman.

Partners   Sample of 2      Mean
1&2        (22+26)/2   =    24
1&3        (22+30)/2   =    26
1&4        (22+26)/2   =    24
1&5        (22+22)/2   =    22
2&3        (26+30)/2   =    28
2&4        (26+26)/2   =    26
2&5        (26+22)/2   =    24
3&4        (30+26)/2   =    28
3&5        (30+22)/2   =    26
4&5        (26+22)/2   =    24

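The count of 10 samples and the table of sample means can be reproduced with the Python standard library. This is a minimal sketch; the dictionary name `hours` is just a label chosen for the example.

```python
from itertools import combinations
from math import comb

# Hours billed last week by the five partners (from the example).
hours = {"Dunn": 22, "Hardy": 26, "Kiers": 30, "Malinowski": 26, "Tillman": 22}

# Number of different samples of 2 partners out of 5 (5C2).
n_samples = comb(len(hours), 2)

# Mean hours for every possible pair of partners.
sample_means = {
    pair: (hours[pair[0]] + hours[pair[1]]) / 2
    for pair in combinations(hours, 2)
}

print("number of samples:", n_samples)
for pair, mean in sample_means.items():
    print(pair, mean)
```
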
8- 14

Example continued

Organize the sample means into a Sampling Distribution

Sample Mean   Frequency   Relative frequency (Probability)
22            1           1/10
24            4           4/10
26            3           3/10
28            2           2/10

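As a rough sketch, the same sampling distribution can be tabulated by counting how often each sample mean occurs. The variable names are illustrative only; note that `Fraction` prints probabilities in lowest terms (for example 4/10 appears as 2/5).

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

hours = [22, 26, 30, 26, 22]  # hours billed by the five partners

# Mean of every possible sample of two partners.
means = [(a + b) / 2 for a, b in combinations(hours, 2)]

# Frequency of each distinct sample mean, then relative frequency.
freq = Counter(means)
distribution = {m: Fraction(c, len(means)) for m, c in sorted(freq.items())}

for m, prob in distribution.items():
    print(m, freq[m], prob)
```
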
8- 15

Example continued

Compute the mean of the sample means. Compare it with the population mean.

Sample Mean   Frequency
22            1
24            4
26            3
28            2

$$\mu_{\bar{X}} = \frac{22(1) + 24(4) + 26(3) + 28(2)}{10} = 25.2$$

8- 16

Example continued

Note: The population mean is the same as the mean of the sample means: 25.2 hours!

Partner      Hours
Dunn         22
Hardy        26
Kiers        30
Malinowski   26
Tillman      22

$$\mu = \frac{22 + 26 + 30 + 26 + 22}{5} = 25.2$$

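The equality between the population mean and the mean of all possible sample means can be checked directly. A minimal sketch using only the standard library:

```python
from itertools import combinations
from statistics import mean

hours = [22, 26, 30, 26, 22]  # hours billed by the five partners

# Population mean: (22 + 26 + 30 + 26 + 22) / 5
population_mean = mean(hours)

# Mean of the means of all 10 possible samples of size 2.
sample_means = [mean(pair) for pair in combinations(hours, 2)]
mean_of_sample_means = mean(sample_means)

print(population_mean, mean_of_sample_means)
```
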
8- 17

Central Limit Theorem

For a sufficiently large sample size n, the sampling distribution of the means of all possible samples of size n generated from the population will be approximately normally distributed!

Sampling distribution of the sample mean:
Mean: $\mu_{\bar{X}} = \mu$
Variance: $\sigma^2_{\bar{X}} = \sigma^2 / n$
Standard deviation (standard error of the mean): $\sigma_{\bar{X}} = \sigma / \sqrt{n}$

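A small simulation can make the theorem more concrete. The sketch below is an illustration under stated assumptions, not part of the original slides: it draws repeated samples of size 30 from a skewed exponential population with mean 1 and standard deviation 1, then compares the spread of the sample means with $\sigma / \sqrt{n}$. The seed, the population, and the number of repetitions are all arbitrary choices.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(2004)  # fixed seed (an assumption) for repeatability
n = 30                     # sample size
reps = 5000                # number of repeated samples

# Exponential population with rate 1: mean = 1, standard deviation = 1, skewed.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

observed_mean = mean(sample_means)   # should be close to the population mean, 1
observed_se = stdev(sample_means)    # should be close to sigma / sqrt(n)
theoretical_se = 1.0 / math.sqrt(n)

print("mean of sample means:", round(observed_mean, 3))
print("observed standard error:", round(observed_se, 3))
print("sigma / sqrt(n):", round(theoretical_se, 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is strongly skewed.
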
8- 18

Point Estimates

A point estimate is one value (a single point) that is used to estimate a population parameter.

Examples: sample mean, sample standard deviation, sample variance, sample proportion.

8- 19

Point Estimates

Population follows the normal distribution:
The sampling distribution of the sample means also follows the normal distribution.
To find the probability of a sample mean falling within a particular region, use:
$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

Population does NOT follow the normal distribution:
If the sample is of at least 30 observations, the sampling distribution of the sample means WILL approximately follow the normal distribution.
To find the probability of a sample mean falling within a particular region, use:
$$z = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$


8- 20

Central Limit Theorem

[Chart 8-6: Results for Several Populations]


8- 21

Generating Random Numbers in Excel

8- 22

Using Excel

Click on DATA ANALYSIS

8- 23

Using Excel

Highlight RANDOM NUMBER GENERATION
Click OK

8- 24

Using Excel

INPUT NEEDS: 1; 20; 0 and 100; $A:$A

8- 25

Using Excel

If you want whole numbers, use the FUNCTION WIZARD (fx) to ROUND to the nearest integer.

8- 26

Using Excel

Highlight Math & Trig
Scroll down to find Round

8- 27

Using Excel

Highlight, and click OK

8- 28

Using Excel

INPUT REQUIRED VALUES: A1 (the number to round) and 0 (the number of digits)
Click on OK

8- 29

Using Excel

CLICK on B1 and DRAG to fill COLUMN B
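For readers without Excel, a roughly equivalent workflow can be sketched in Python. It assumes, based on the values shown in the steps above, that the goal is 20 uniform random numbers between 0 and 100, rounded to whole numbers; the variable names are illustrative. Python's `round` handles exact halves differently from Excel's ROUND, which practically never matters for continuous random values.

```python
import random

rng = random.Random()  # unseeded, so the numbers differ on every run

# "Column A": 20 uniform random numbers between 0 and 100 (assumed settings).
column_a = [rng.uniform(0, 100) for _ in range(20)]

# "Column B": each value rounded to the nearest integer (0 decimal digits).
column_b = [round(value) for value in column_a]

for a, b in zip(column_a, column_b):
    print(f"{a:10.4f} {b:4d}")
```
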


8- 30

Selecting a Simple Random Sample in Excel

8- 31

Using Excel

Input data in Column A
Scroll to Sampling and select it
Click OK

8- 32

Using Excel

INPUT REQUIRED VALUES: $A:$A; 10; $B:$B
Click on OK

8- 33

Using Excel

Since this is random number generation, you will get different numbers each time you do this.
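The same idea can be sketched in Python: take the values in one column and draw a random sample of 10 into another. The 20 input values below are hypothetical stand-ins for whatever data sits in Column A. This sketch draws without replacement using `random.sample`; that is an assumption made for the example and is not a claim about how the Excel tool draws its sample (use `random.choices` to draw with replacement instead).

```python
import random

rng = random.Random()  # unseeded: a different sample on every run

# Hypothetical "Column A" data: 20 whole numbers between 0 and 100.
column_a = [rng.randint(0, 100) for _ in range(20)]

# "Column B": a simple random sample of 10 of the Column A entries,
# drawn without replacement (each entry can be picked at most once).
column_b = rng.sample(column_a, k=10)

print("data:  ", column_a)
print("sample:", column_b)
```
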


8- 34
Using the Sampling Distribution of the Sample Mean

Data: Suppose it takes an average of 330 minutes for taxpayers to prepare, copy, and mail an income tax return form. A consumer watchdog agency selects a random sample of 40 taxpayers and finds the standard deviation of the time needed is 80 minutes.

What is the standard error of the mean?

Formula: $\sigma / \sqrt{n} = 80 / \sqrt{40} = 12.6$ (using the sample standard deviation as the estimate of $\sigma$)

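The standard error calculation is a one-liner in Python. A minimal sketch using the numbers from the example:

```python
import math

s = 80   # standard deviation of preparation time, in minutes
n = 40   # sample size

# Standard error of the mean: standard deviation divided by sqrt(n).
standard_error = s / math.sqrt(n)

print(round(standard_error, 1))  # 12.6
```
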
8- 35
Using the Sampling Distribution of the Sample Mean

Data: Suppose it takes an average of 330 minutes for taxpayers to prepare, copy, and mail an income tax return form. A consumer watchdog agency selects a random sample of 40 taxpayers and finds the standard deviation of the time needed is 80 minutes.

What is the likelihood the sample mean is greater than 320 minutes?

8- 36
Using the Sampling Distribution of the Sample Mean

Data: average of 330 minutes; random sample of 40; standard deviation is 80 minutes

What is the likelihood the sample mean is greater than 320 minutes?

Step 1. Formula:
$$z = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{320 - 330}{80 / \sqrt{40}} = -0.79$$

[Figure: normal curve centred at 330; a1 is the area between 320 and 330]

8- 37
Using the Sampling Distribution of the Sample Mean

Data: average of 330 minutes; random sample of 40; standard deviation is 80 minutes

What is the likelihood the sample mean is greater than 320 minutes?

Step 2. Look up 0.79 in the table: a1 = 0.2852

Required area = a1 + 0.5 = 0.2852 + 0.5 = 0.7852

[Figure: normal curve centred at 330; a1 is the area between 320 and 330]

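The two steps above can be checked without a printed table. This sketch uses `statistics.NormalDist` from the Python standard library in place of the table lookup; because it does not round z to two decimals first, the result differs from 0.7852 in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu = 330     # stated average time, in minutes
s = 80       # standard deviation found in the sample
n = 40       # sample size
x_bar = 320  # sample mean of interest

# Step 1: z-score of the sample mean.
z = (x_bar - mu) / (s / sqrt(n))

# Step 2: area to the right of z under the standard normal curve.
p_greater = 1 - NormalDist().cdf(z)

print(round(z, 2))          # -0.79
print(round(p_greater, 3))  # 0.785
```
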
8- 38
Sampling Distribution of Proportion

The normal distribution (a continuous distribution) yields a good approximation of the binomial distribution (a discrete distribution) for large values of n.

Use when np and n(1 - p) are both greater than 5!


8- 39

Mean and Variance of a Binomial Probability Distribution

Formula: $\mu = np$
Formula: $\sigma^2 = np(1 - p)$
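The two formulas, together with the rule of thumb that np and n(1 - p) should both exceed 5, fit in a small helper. The function name `binomial_summary` and the example values of n and p are hypothetical, chosen only to illustrate one case that passes the rule and one that fails it.

```python
def binomial_summary(n, p):
    """Return (mean, variance, normal_ok) for a binomial distribution.

    normal_ok is True when both n*p and n*(1-p) are greater than 5,
    the rule of thumb for using the normal approximation.
    """
    mean = n * p
    variance = n * p * (1 - p)
    normal_ok = n * p > 5 and n * (1 - p) > 5
    return mean, variance, normal_ok

# Hypothetical examples: one where the approximation is reasonable, one where it is not.
large_case = binomial_summary(20, 0.5)
small_case = binomial_summary(10, 0.1)

print(large_case)
print(small_case)
```
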


8- 40
Sampling Distribution of Proportion

A multinational company claims that 55% of its employees are bilingual. To verify this claim, a statistician selected a sample of 60 employees of the company using simple random sampling and found 48% to be bilingual. Based on this information, what can we say about the company's claim?

np = 60(.55) = 33
n(1 - p) = 60(.45) = 27

The sample size is big enough to use the normal approximation with a mean of .55 and a standard deviation of $\sqrt{(.55)(.45)/60} = 0.064$.

8- 41
Sampling Distribution of Proportion, continued

Formula: $z = \frac{X - \mu}{\sigma}$

Step 1. z = (0.48 - 0.55) / 0.064 = -1.09
Step 2. Look up 1.09 in the table: a1 = 0.3621

Required area = 0.5 - 0.3621 = 0.1379, or 14%

[Figure: normal curve centred at .55; a1 is the area between .48 and .55]

8- 42
Sampling Distribution of Proportion, continued

Formula: $z = \frac{X - \mu}{\sigma}$

Step 1. z = (0.48 - 0.55) / 0.064 = -1.09
Step 2. Look up 1.09 in the table: a1 = 0.3621

Required area = 0.5 - 0.3621 = 0.1379, or 14%

Conclusion: If the company's claim is true, there is approximately a 14% chance of finding a sample proportion of 48% or less in a sample of this size.

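The proportion example can be reproduced with a few lines of Python. This is a sketch that uses `statistics.NormalDist` in place of the printed table; the variable names are illustrative. It keeps the unrounded standard error, so z is about -1.09 either way.

```python
from math import sqrt
from statistics import NormalDist

p_claim = 0.55  # proportion claimed by the company
n = 60          # sample size
p_hat = 0.48    # proportion observed in the sample

# Check the rule of thumb for the normal approximation.
assert n * p_claim > 5 and n * (1 - p_claim) > 5

# Standard deviation of the sample proportion if the claim is true.
se = sqrt(p_claim * (1 - p_claim) / n)

# z-score of the observed proportion, and the area to its left.
z = (p_hat - p_claim) / se
p_at_most = NormalDist().cdf(z)

print(round(se, 3))         # 0.064
print(round(z, 2))          # -1.09
print(round(p_at_most, 2))  # 0.14
```
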
8- 43
Sampling Distribution of Mean

Suppose the mean selling price of a litre of gasoline in Canada is $.659. Further, assume the distribution is positively skewed, with a standard deviation of $0.08. What is the probability of selecting a sample of 35 gasoline stations and finding the sample mean within $.03 of the population mean?

8- 44
Sampling Distribution of Mean

Data: mean selling price is $.659; SD of $0.08; sample of 35 gasoline stations
Probability of sample mean within $.03?

Find the z-scores for .659 +/- .03, i.e. .629 and .689:

$$z_1 = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{0.629 - 0.659}{0.08 / \sqrt{35}} = -2.22$$

$$z_2 = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{0.689 - 0.659}{0.08 / \sqrt{35}} = 2.22$$

[Figure: normal curve with the region between .629 and .689 marked]

8- 45
Sampling Distribution of Mean

Data: mean selling price is $.659; SD of $0.08; sample of 35 gasoline stations
Probability of sample mean within $.03?

Find areas from the table:

z1 = -2.22   a1 = .4868
z2 = 2.22    a2 = .4868
Required area = a1 + a2 = .9736

We would expect about 97% of the sample means to be within $0.03 of the population mean.
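As a cross-check on the table lookup, the gasoline example can be computed directly. This sketch models the sampling distribution of the mean as a normal distribution with standard deviation $\sigma / \sqrt{n}$, which the central limit theorem suggests is reasonable for a sample of 35 even though the population is skewed. It does not round z first, so the area differs slightly from .9736.

```python
from math import sqrt
from statistics import NormalDist

mu = 0.659     # mean selling price per litre, in dollars
sigma = 0.08   # standard deviation, in dollars
n = 35         # number of gasoline stations sampled
margin = 0.03  # "within $.03 of the population mean"

se = sigma / sqrt(n)                  # standard error of the mean
z2 = margin / se                      # upper z-score; the lower one is -z2
sampling_dist = NormalDist(mu, se)    # approximate distribution of the sample mean

# Probability that the sample mean falls between .629 and .689.
p_within = sampling_dist.cdf(mu + margin) - sampling_dist.cdf(mu - margin)

print(round(z2, 2))        # 2.22
print(round(p_within, 2))  # 0.97
```
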


8- 46

Test your learning

www.mcgrawhill.ca/college/lind

Online Learning Centre for quizzes, extra content, data sets, searchable glossary, access to Statistics Canada's E-Stat data, and much more!


8- 47

This completes Chapter 8

