Statistics for Business and Economics
6th Edition

Chapter 20
Sampling: Additional Topics in Sampling
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.

Chapter Goals
After completing this chapter, you should be able to:

Explain the basic steps of a sampling study

Describe sampling and nonsampling errors

Explain simple random sampling and stratified sampling

Analyze results from simple random or stratified samples

Determine sample size when estimating population mean, population total, or population proportion

Describe other sampling methods: Cluster Sampling, Two-Phase Sampling, Nonprobability Samples

Steps of a Sampling Study


Step 1: Information Required?
Step 2: Relevant Population?
Step 3: Sample Selection?
Step 4: Obtaining Information?
Step 5: Inferences From Sample?
Step 6: Conclusions?

Sampling and
Nonsampling Errors

A sample statistic is an estimate of an unknown population parameter

Sample evidence from a population is variable

Sample-to-sample variation is expected

Sampling error results from the fact that we only see a subset of the population when a sample is selected

Statistical statements can be made about sampling error

It can be measured and interpreted using confidence intervals, probabilities, etc.

Sampling and
Nonsampling Errors
(continued)

Nonsampling error results from sources not related to the sampling procedure used

Examples:

The population actually sampled is not the relevant one

Survey subjects may give inaccurate or dishonest answers

Nonresponse to survey questions

Types of Samples

Probability Sample

Items in the sample are chosen on the basis of known probabilities

Nonprobability Sample

Items included are chosen without regard to their probability of occurrence

Types of Samples
(continued)

Samples

Probability Samples: Simple Random, Stratified, Systematic, Cluster

Non-Probability Samples: Judgement, Convenience, Quota

Simple Random Samples

Suppose that a sample of n objects is to be selected from a population of N objects

A simple random sample procedure is one in which every possible sample of n objects is equally likely to be chosen

Only sampling without replacement is considered here

Random samples can be obtained from a table of random numbers or from computer random number generators
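
A minimal Python sketch of drawing a simple random sample without replacement with a computer random number generator. The frame of 64 labelled units, the function name `simple_random_sample`, and the seed are illustrative assumptions, not part of the chapter.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct items so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # the seed only makes the illustration repeatable
    return rng.sample(population, n)   # random.sample draws without replacement


frame = list(range(1, 65))             # hypothetical frame of N = 64 labelled units
sample = simple_random_sample(frame, 8, seed=20)
print(sorted(sample))
```

Because the draw is without replacement, no unit can appear twice in the sample.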

Systematic Sampling

Decide on sample size: n

Divide frame of N individuals into groups of j individuals: j = N/n

Randomly select one individual from the 1st group

Select every jth individual thereafter

Example: N = 64, n = 8, j = 8 (one individual is selected at random from the first group of 8)
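
The systematic sampling steps can be sketched in Python as follows. This is an illustration under the assumption that n divides N evenly, as in the N = 64, n = 8 case; the function name and seed are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Random start in the first group of j = N // n units, then every j-th unit."""
    N = len(frame)
    j = N // n                     # group size (assumes n divides N evenly)
    rng = random.Random(seed)
    start = rng.randrange(j)       # random position inside the first group
    return [frame[start + k * j] for k in range(n)]


frame = list(range(1, 65))         # N = 64 labelled units
sample = systematic_sample(frame, 8, seed=1)
print(sample)
```

Only the starting unit is random; the remaining seven units follow from it at a fixed spacing of j = 8.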

Finite Population
Correction Factor

Suppose sampling is without replacement and the sample size is large relative to the population size

Assume the population size is large enough to apply the central limit theorem

Apply the finite population correction factor when estimating the population variance

finite population correction factor:

$$\frac{N - n}{N}$$

Estimating the Population Mean

Let a simple random sample of size n be taken from a population of N members with mean $\mu$

The sample mean is an unbiased estimator of the population mean $\mu$

The point estimate is:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Estimating the Population Mean


(continued)

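An unbiased estimation procedure for the variance of the sample mean yields the point estimate

$$\hat{\sigma}_{\bar{x}}^{2} = \frac{s^{2}}{n} \cdot \frac{N - n}{N}$$

Provided the sample size is large, 100(1 - $\alpha$)% confidence intervals for the population mean are given by

$$\bar{x} - z_{\alpha/2}\,\hat{\sigma}_{\bar{x}} < \mu < \bar{x} + z_{\alpha/2}\,\hat{\sigma}_{\bar{x}}$$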
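
A short Python sketch of the point estimate, the finite-population-corrected variance, and the confidence interval for the mean. The data values, the population size of 50, and the function name `srs_mean_ci` are made up for illustration; with only 8 observations the large-sample condition is not really met, so the interval is shown purely to trace the arithmetic.

```python
import math
import statistics


def srs_mean_ci(sample, N, z=1.96):
    """Estimate the population mean from a simple random sample of a finite population."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s2 = statistics.variance(sample)        # sample variance s^2 (divisor n - 1)
    var_xbar = (s2 / n) * (N - n) / N       # finite population correction (N - n) / N
    half_width = z * math.sqrt(var_xbar)
    return xbar, var_xbar, (xbar - half_width, xbar + half_width)


balances = [82.0, 95.5, 60.25, 110.0, 87.5, 91.0, 73.75, 101.0]   # hypothetical data
xbar, var_xbar, (lo, hi) = srs_mean_ci(balances, N=50)
print(xbar, var_xbar, lo, hi)
```

When n equals N the correction factor is zero, which matches the intuition that a complete census has no sampling error.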

Estimating the Population Total

Consider a simple random sample of size n from a population of size N

The quantity to be estimated is the population total $N\mu$

An unbiased estimation procedure for the population total $N\mu$ yields the point estimate $N\bar{x}$

Estimating the Population Total

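An unbiased estimator of the variance of the estimator of the population total is

$$N^{2}\hat{\sigma}_{\bar{x}}^{2} = \frac{s^{2}}{n}\,N(N - n)$$

Provided the sample size is large, a 100(1 - $\alpha$)% confidence interval for the population total is

$$N\bar{x} - z_{\alpha/2}\,N\hat{\sigma}_{\bar{x}} < N\mu < N\bar{x} + z_{\alpha/2}\,N\hat{\sigma}_{\bar{x}}$$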

Confidence Interval for


Population Total: Example
A firm has a population of 1000 accounts and wishes to estimate the total population value

A sample of 80 accounts is selected with average balance of $87.6 and standard deviation of $22.3

Find the 95% confidence interval estimate of the total balance

Example Solution
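$$N = 1000, \quad n = 80, \quad \bar{x} = 87.6, \quad s = 22.3$$

$$N^{2}\hat{\sigma}_{\bar{x}}^{2} = \frac{s^{2}}{n}\,N(N - n) = \frac{(22.3)^{2}}{80}(1000)(920) = 5718835$$

$$N\hat{\sigma}_{\bar{x}} = \sqrt{5718835} = 2391.41$$

$$N\bar{x} \pm z_{\alpha/2}\,N\hat{\sigma}_{\bar{x}} = (1000)(87.6) \pm (1.96)(2391.41)$$

$$82912.84 < N\mu < 92287.16$$

The 95% confidence interval for the population total balance is $82,912.84 to $92,287.16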
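
The arithmetic of this example can be checked with a few lines of Python. The function name `srs_total_ci` is an illustrative choice; the inputs are the figures given in the example.

```python
import math


def srs_total_ci(N, n, xbar, s, z=1.96):
    """Point estimate, variance estimate, and confidence interval for a population total."""
    total = N * xbar
    var_total = (s ** 2 / n) * N * (N - n)     # N^2 times the estimated variance of the mean
    half_width = z * math.sqrt(var_total)
    return total, var_total, (total - half_width, total + half_width)


total, var_total, (lo, hi) = srs_total_ci(N=1000, n=80, xbar=87.6, s=22.3)
print(round(var_total), round(lo, 2), round(hi, 2))
```

The interval is centred on the point estimate of the total, 1000 × 87.6 = 87,600.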

Estimating the
Population Proportion

Let the true population proportion be P

Let $\hat{p}$ be the sample proportion from n observations from a simple random sample

The sample proportion, $\hat{p}$, is an unbiased estimator of the population proportion, P

Estimating the
Population Proportion
(continued)

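An unbiased estimator for the variance of the estimator of the population proportion is

$$\hat{\sigma}_{\hat{p}}^{2} = \frac{\hat{p}(1 - \hat{p})}{n - 1} \cdot \frac{N - n}{N}$$

Provided the sample size is large, a 100(1 - $\alpha$)% confidence interval for the population proportion is

$$\hat{p} - z_{\alpha/2}\,\hat{\sigma}_{\hat{p}} < P < \hat{p} + z_{\alpha/2}\,\hat{\sigma}_{\hat{p}}$$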
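
A small Python sketch of the proportion estimate and its confidence interval. The counts (56 of 140 sampled from a population of 1200) and the function name are hypothetical values chosen only to illustrate the formulas.

```python
import math


def srs_proportion_ci(successes, n, N, z=1.96):
    """Sample proportion, its estimated variance, and a confidence interval."""
    p_hat = successes / n
    var_p = (p_hat * (1 - p_hat) / (n - 1)) * (N - n) / N   # note the n - 1 divisor
    half_width = z * math.sqrt(var_p)
    return p_hat, var_p, (p_hat - half_width, p_hat + half_width)


p_hat, var_p, (lo, hi) = srs_proportion_ci(successes=56, n=140, N=1200)
print(p_hat, var_p, lo, hi)
```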

Stratified Sampling
Overview of stratified sampling:

Divide population into two or more subgroups (called strata) according to some common characteristic

A simple random sample is selected from each subgroup

Samples from subgroups are combined into one

[Figure: population divided into 4 strata, with a sample drawn from each]

Stratified Random Sampling

Suppose that a population of N individuals can be subdivided into K mutually exclusive and collectively exhaustive groups, or strata

Stratified random sampling is the selection of independent simple random samples from each stratum of the population.

Let the K strata in the population contain $N_1, N_2, \ldots, N_K$ members, so that $N_1 + N_2 + \cdots + N_K = N$

Let the numbers in the samples be $n_1, n_2, \ldots, n_K$. Then the total number of sample members is $n_1 + n_2 + \cdots + n_K = n$

Estimation of the Population Mean,


Stratified Random Sample

Let random samples of $n_j$ individuals be taken from strata containing $N_j$ individuals (j = 1, 2, . . ., K)

Let

$$\sum_{j=1}^{K} N_j = N \quad \text{and} \quad \sum_{j=1}^{K} n_j = n$$

Denote the sample means and variances in the strata by $\bar{x}_j$ and $s_j^{2}$ and the overall population mean by $\mu$

An unbiased estimator of the overall population mean $\mu$ is:

$$\bar{x}_{st} = \frac{1}{N} \sum_{j=1}^{K} N_j \bar{x}_j$$

Estimation of the Population Mean,


Stratified Random Sample
(continued)

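An unbiased estimator for the variance of the estimator of the overall population mean is

$$\hat{\sigma}_{\bar{x}_{st}}^{2} = \frac{1}{N^{2}} \sum_{j=1}^{K} N_j^{2}\,\hat{\sigma}_{\bar{x}_j}^{2}$$

where

$$\hat{\sigma}_{\bar{x}_j}^{2} = \frac{s_j^{2}}{n_j} \cdot \frac{N_j - n_j}{N_j}$$

Provided the sample size is large, a 100(1 - $\alpha$)% confidence interval for the population mean for stratified random samples is

$$\bar{x}_{st} - z_{\alpha/2}\,\hat{\sigma}_{\bar{x}_{st}} < \mu < \bar{x}_{st} + z_{\alpha/2}\,\hat{\sigma}_{\bar{x}_{st}}$$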
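
The stratified estimator of the mean and its variance can be traced in a short Python sketch. The two strata below (sizes, sample sizes, means, and standard deviations) are invented for illustration, and `stratified_mean_ci` is a hypothetical helper name.

```python
import math


def stratified_mean_ci(strata, z=1.96):
    """strata: list of (N_j, n_j, xbar_j, s_j) tuples, one per stratum."""
    N = sum(N_j for N_j, _, _, _ in strata)
    # Weighted average of the stratum sample means, with weights N_j / N.
    xbar_st = sum(N_j * xbar_j for N_j, _, xbar_j, _ in strata) / N
    # Each stratum contributes N_j^2 times its own corrected variance of the mean.
    var_st = sum(
        N_j ** 2 * (s_j ** 2 / n_j) * (N_j - n_j) / N_j
        for N_j, n_j, _, s_j in strata
    ) / N ** 2
    half_width = z * math.sqrt(var_st)
    return xbar_st, var_st, (xbar_st - half_width, xbar_st + half_width)


strata = [(400, 40, 20.0, 4.0), (600, 60, 30.0, 6.0)]   # hypothetical strata
xbar_st, var_st, (lo, hi) = stratified_mean_ci(strata)
print(xbar_st, var_st, lo, hi)
```

Multiplying the point estimate by N and the variance by N squared gives the corresponding estimates for the population total.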

Estimation of the Population Total,


Stratified Random Sample

Suppose that random samples of $n_j$ individuals from strata containing $N_j$ individuals (j = 1, 2, . . ., K) are selected and that the quantity to be estimated is the population total, $N\mu$

An unbiased estimation procedure for the population total $N\mu$ yields the point estimate

$$N\bar{x}_{st} = \sum_{j=1}^{K} N_j \bar{x}_j$$

Estimation of the Population Total,


Stratified Random Sample
(continued)

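An unbiased estimation procedure for the variance of the estimator of the population total yields the point estimate

$$N^{2}\hat{\sigma}_{\bar{x}_{st}}^{2} = \sum_{j=1}^{K} N_j^{2}\,\hat{\sigma}_{\bar{x}_j}^{2}$$

Provided the sample size is large, 100(1 - $\alpha$)% confidence intervals for the population total for stratified random samples are obtained from

$$N\bar{x}_{st} - z_{\alpha/2}\,N\hat{\sigma}_{\bar{x}_{st}} < N\mu < N\bar{x}_{st} + z_{\alpha/2}\,N\hat{\sigma}_{\bar{x}_{st}}$$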

Estimation of the Population


Proportion, Stratified Random Sample

Suppose that random samples of $n_j$ individuals from strata containing $N_j$ individuals (j = 1, 2, . . ., K) are obtained

Let $P_j$ be the population proportion, and $\hat{p}_j$ the sample proportion, in the jth stratum

If P is the overall population proportion, an unbiased estimation procedure for P yields

$$\hat{p}_{st} = \frac{1}{N} \sum_{j=1}^{K} N_j \hat{p}_j$$

Estimation of the Population


Proportion, Stratified Random Sample
(continued)

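An unbiased estimation procedure for the variance of the estimator of the overall population proportion is

$$\hat{\sigma}_{\hat{p}_{st}}^{2} = \frac{1}{N^{2}} \sum_{j=1}^{K} N_j^{2}\,\hat{\sigma}_{\hat{p}_j}^{2}$$

where

$$\hat{\sigma}_{\hat{p}_j}^{2} = \frac{\hat{p}_j(1 - \hat{p}_j)}{n_j - 1} \cdot \frac{N_j - n_j}{N_j}$$

is the estimate of the variance of the sample proportion in the jth stratum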

Estimation of the Population


Proportion, Stratified Random Sample
(continued)

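Provided the sample size is large, 100(1 - $\alpha$)% confidence intervals for the population proportion for stratified random samples are obtained from

$$\hat{p}_{st} - z_{\alpha/2}\,\hat{\sigma}_{\hat{p}_{st}} < P < \hat{p}_{st} + z_{\alpha/2}\,\hat{\sigma}_{\hat{p}_{st}}$$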

Proportional Allocation:
Sample Size

One way to allocate sampling effort is to make the proportion of sample members in any stratum the same as the proportion of population members in the stratum

If so, for the jth stratum,

$$\frac{n_j}{n} = \frac{N_j}{N}$$

The sample size for the jth stratum using proportional allocation is

$$n_j = \frac{N_j}{N} \cdot n$$
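
Proportional allocation amounts to one line of arithmetic per stratum. The stratum sizes and total sample size below are hypothetical, as is the function name.

```python
def proportional_allocation(stratum_sizes, n):
    """Split a total sample of n across strata in proportion to stratum size N_j."""
    N = sum(stratum_sizes)
    return [n * N_j / N for N_j in stratum_sizes]


alloc = proportional_allocation([500, 300, 200], n=100)   # hypothetical strata
print(alloc)
```

In practice the results are rarely whole numbers and have to be rounded, which can shift the total by a unit or two.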

Optimal Allocation
To estimate an overall population mean or total, if the population variances in the individual strata are denoted $\sigma_j^{2}$, the most precise estimators are obtained with optimal allocation

The sample size for the jth stratum using optimal allocation is

$$n_j = \frac{N_j \sigma_j}{\sum_{i=1}^{K} N_i \sigma_i} \cdot n$$
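
A sketch of optimal allocation for a mean or total, assuming the stratum standard deviations are known or have been estimated beforehand. All numbers and the function name are illustrative.

```python
def optimal_allocation(stratum_sizes, stratum_sds, n):
    """Allocate n in proportion to N_j * sigma_j."""
    weights = [N_j * sd_j for N_j, sd_j in zip(stratum_sizes, stratum_sds)]
    total_weight = sum(weights)
    return [n * w / total_weight for w in weights]


sizes = [500, 300, 200]          # hypothetical stratum sizes N_j
sds = [10.0, 20.0, 40.0]         # hypothetical stratum standard deviations sigma_j
alloc = optimal_allocation(sizes, sds, n=100)
print([round(a, 1) for a in alloc])
```

Strata with more variability receive more of the sample; when all stratum standard deviations are equal, the result coincides with proportional allocation.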

Optimal Allocation
(continued)

To estimate the overall population proportion, estimators with the smallest possible variance are obtained by optimal allocation

The sample size for the jth stratum for population proportion using optimal allocation is

$$n_j = \frac{N_j \sqrt{P_j(1 - P_j)}}{\sum_{i=1}^{K} N_i \sqrt{P_i(1 - P_i)}} \cdot n$$

Determining Sample Size

The sample size is directly related to the size of the variance of the population estimator

If the researcher sets the allowable size of the variance in advance, the necessary sample size can be determined

Sample Size, Mean,


Simple Random Sampling

Consider estimating the mean of a population of N members, which has variance $\sigma^{2}$

If the desired variance, $\sigma_{\bar{x}}^{2}$, of the sample mean is specified, the required sample size to estimate the population mean through simple random sampling is

$$n = \frac{N\sigma^{2}}{(N - 1)\sigma_{\bar{x}}^{2} + \sigma^{2}}$$

Sample Size, Mean,


Simple Random Sampling
(continued)

Often it is more convenient to specify directly the desired width of the confidence interval for the population mean rather than $\sigma_{\bar{x}}^{2}$

Thus the researcher specifies the desired margin of error for the mean

Calculations are simple since, for example, a 95% confidence interval for the population mean will extend an approximate amount $1.96\,\sigma_{\bar{x}}$ on each side of the sample mean, $\bar{x}$

Required Sample Size Example


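2000 items are in a population. If $\sigma$ = 45, what sample size is needed to estimate the mean within $\pm$5 with 95% confidence?

$$N = 2000, \quad 1.96\,\sigma_{\bar{x}} = 5 \;\Rightarrow\; \sigma_{\bar{x}} = 2.551$$

$$n = \frac{N\sigma^{2}}{(N - 1)\sigma_{\bar{x}}^{2} + \sigma^{2}} = \frac{(2000)(45)^{2}}{(1999)(2.551)^{2} + (45)^{2}} = 269.39$$

So the required sample size is n = 270

(Always round up)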
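
The example can be reproduced with a small Python function. The name `srs_sample_size_mean` is an illustrative choice; the inputs are the values from the example, and the margin of error is converted to a target standard deviation by dividing by z.

```python
import math


def srs_sample_size_mean(N, sigma, margin, z=1.96):
    """Sample size needed so that z * sigma_xbar equals the requested margin of error."""
    sigma_xbar = margin / z
    n = N * sigma ** 2 / ((N - 1) * sigma_xbar ** 2 + sigma ** 2)
    return n, math.ceil(n)        # always round up to the next whole unit


n_exact, n_required = srs_sample_size_mean(N=2000, sigma=45, margin=5)
print(round(n_exact, 2), n_required)
```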

Sample Size, Proportion,


Simple Random Sampling
(continued)

Consider estimating the proportion P of individuals in a population of size N who possess a certain attribute

If the desired variance, $\sigma_{\hat{p}}^{2}$, of the sample proportion is specified, the required sample size to estimate the population proportion through simple random sampling is

$$n = \frac{NP(1 - P)}{(N - 1)\sigma_{\hat{p}}^{2} + P(1 - P)}$$

Sample Size, Proportion,


Simple Random Sampling
(continued)

The largest possible value for this expression occurs when the value of P is 0.5, so that P(1 - P) = 0.25

$$n_{\max} = \frac{0.25N}{(N - 1)\sigma_{\hat{p}}^{2} + 0.25}$$

A 95% confidence interval for the population proportion will extend an approximate amount $1.96\,\sigma_{\hat{p}}$ on each side of the sample proportion

Required Sample Size Example


How large a sample would be necessary to estimate the true proportion of voters who will vote for proposition A, within $\pm$3%, with 95% confidence, from a population of 34000 voters?

Required Sample Size Example


(continued)

Solution:
N = 34000
For 95% confidence, use z = 1.96
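$$1.96\,\sigma_{\hat{p}} = 0.03 \;\Rightarrow\; \sigma_{\hat{p}} = 0.015306$$

$$n_{\max} = \frac{0.25N}{(N - 1)\sigma_{\hat{p}}^{2} + 0.25} = \frac{(0.25)(34000)}{(33999)(0.0153)^{2} + 0.25} = 1035.47$$

So use n = 1036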
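
A Python sketch of the same calculation. The function name is hypothetical, and the target standard deviation is entered as the rounded value 0.0153 used in the worked arithmetic; carrying more decimal places of 0.03/1.96 changes the result slightly.

```python
import math


def srs_sample_size_proportion(N, sigma_p, P=0.5):
    """Required n for a target standard deviation sigma_p of the sample proportion.

    P = 0.5 is the conservative choice because it maximises P * (1 - P).
    """
    pq = P * (1 - P)
    n = N * pq / ((N - 1) * sigma_p ** 2 + pq)
    return n, math.ceil(n)


n_exact, n_required = srs_sample_size_proportion(N=34000, sigma_p=0.0153)
print(round(n_exact, 2), n_required)
```

Passing any other value of P gives a smaller required sample, which is why P = 0.5 is used when nothing is known about the proportion in advance.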

Sample Size, Mean,


Stratified Sampling

Suppose that a population of N members is subdivided into K strata containing $N_1, N_2, \ldots, N_K$ members

Let $\sigma_j^{2}$ denote the population variance in the jth stratum

An estimate of the overall population mean is desired

If the desired variance, $\sigma_{\bar{x}_{st}}^{2}$, of the sample estimator is specified, the required total sample size, n, can be found

Sample Size, Mean,


Stratified Sampling
(continued)

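For proportional allocation:

$$n = \frac{\sum_{j=1}^{K} N_j \sigma_j^{2}}{N\sigma_{\bar{x}_{st}}^{2} + \frac{1}{N}\sum_{j=1}^{K} N_j \sigma_j^{2}}$$

For optimal allocation:

$$n = \frac{\frac{1}{N}\left(\sum_{j=1}^{K} N_j \sigma_j\right)^{2}}{N\sigma_{\bar{x}_{st}}^{2} + \frac{1}{N}\sum_{j=1}^{K} N_j \sigma_j^{2}}$$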
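
The two total-sample-size formulas for stratified sampling can be compared in a short Python sketch. It assumes the standard forms, with the sum of N_j times the stratum variance in the numerator for proportional allocation and (1/N) times the squared sum of N_j times the stratum standard deviation for optimal allocation. The strata, the target variance of 1.0, and the function name are hypothetical.

```python
import math


def stratified_sample_size_mean(stratum_sizes, stratum_sds, target_var,
                                allocation="proportional"):
    """Total sample size n for a desired variance of the stratified mean estimator."""
    N = sum(stratum_sizes)
    sum_N_var = sum(N_j * sd ** 2 for N_j, sd in zip(stratum_sizes, stratum_sds))
    denominator = N * target_var + sum_N_var / N
    if allocation == "proportional":
        n = sum_N_var / denominator
    elif allocation == "optimal":
        sum_N_sd = sum(N_j * sd for N_j, sd in zip(stratum_sizes, stratum_sds))
        n = (sum_N_sd ** 2 / N) / denominator
    else:
        raise ValueError("allocation must be 'proportional' or 'optimal'")
    return math.ceil(n)


sizes = [500, 300, 200]          # hypothetical stratum sizes
sds = [10.0, 20.0, 40.0]         # hypothetical stratum standard deviations
n_prop = stratified_sample_size_mean(sizes, sds, target_var=1.0)
n_opt = stratified_sample_size_mean(sizes, sds, target_var=1.0, allocation="optimal")
print(n_prop, n_opt)
```

For the same target variance, optimal allocation never needs a larger total sample than proportional allocation.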

Cluster Sampling

Population is divided into several clusters, each representative of the population

A simple random sample of clusters is selected

Generally, all items in the selected clusters are examined

An alternative is to choose items from selected clusters using another probability sampling technique

[Figure: population divided into 16 clusters, with randomly selected clusters forming the sample]

Estimators for Cluster Sampling

A population is subdivided into M clusters, a simple random sample of m of these clusters is selected, and information is obtained from every member of the sampled clusters

Let $n_1, n_2, \ldots, n_m$ denote the numbers of members in the m sampled clusters

Denote the means of these clusters by $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_m$

Denote the proportions of cluster members possessing an attribute of interest by $P_1, P_2, \ldots, P_m$

Estimators for Cluster Sampling


(continued)

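The objective is to estimate the overall population mean $\mu$ and proportion P

Unbiased estimation procedures give

Mean:

$$\bar{x}_c = \frac{\sum_{i=1}^{m} n_i \bar{x}_i}{\sum_{i=1}^{m} n_i}$$

Proportion:

$$\hat{p}_c = \frac{\sum_{i=1}^{m} n_i P_i}{\sum_{i=1}^{m} n_i}$$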

Estimators for Cluster Sampling


(continued)

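Estimates of the variance of these estimators, following from unbiased estimation procedures, are

Mean:

$$\hat{\sigma}_{\bar{x}_c}^{2} = \frac{M - m}{M m \bar{n}^{2}} \cdot \frac{\sum_{i=1}^{m} n_i^{2}(\bar{x}_i - \bar{x}_c)^{2}}{m - 1}$$

Proportion:

$$\hat{\sigma}_{\hat{p}_c}^{2} = \frac{M - m}{M m \bar{n}^{2}} \cdot \frac{\sum_{i=1}^{m} n_i^{2}(P_i - \hat{p}_c)^{2}}{m - 1}$$

where

$$\bar{n} = \frac{1}{m}\sum_{i=1}^{m} n_i$$

is the average number of individuals in the sampled clusters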

Estimators for Cluster Sampling


(continued)

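Provided the sample size is large, 100(1 - $\alpha$)% confidence intervals using cluster sampling are

for the population mean

$$\bar{x}_c - z_{\alpha/2}\,\hat{\sigma}_{\bar{x}_c} < \mu < \bar{x}_c + z_{\alpha/2}\,\hat{\sigma}_{\bar{x}_c}$$

for the population proportion

$$\hat{p}_c - z_{\alpha/2}\,\hat{\sigma}_{\hat{p}_c} < P < \hat{p}_c + z_{\alpha/2}\,\hat{\sigma}_{\hat{p}_c}$$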
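
A Python sketch of the cluster estimators. The cluster counts, sizes, and means are invented, the function name is hypothetical, and three sampled clusters are far too few for the large-sample interval to be trusted; the numbers only trace the arithmetic.

```python
import math


def cluster_estimate_ci(M, sizes, values, z=1.96):
    """Cluster-sampling estimate from m sampled clusters out of M.

    sizes  : n_i, the number of members in each sampled cluster
    values : the cluster means (or cluster proportions, for estimating P)
    """
    m = len(sizes)
    n_bar = sum(sizes) / m                                   # average cluster size
    estimate = sum(n_i * v_i for n_i, v_i in zip(sizes, values)) / sum(sizes)
    spread = sum(
        n_i ** 2 * (v_i - estimate) ** 2 for n_i, v_i in zip(sizes, values)
    ) / (m - 1)
    variance = (M - m) / (M * m * n_bar ** 2) * spread
    half_width = z * math.sqrt(variance)
    return estimate, variance, (estimate - half_width, estimate + half_width)


x_c, var_c, (lo, hi) = cluster_estimate_ci(M=20, sizes=[10, 20, 30], values=[5.0, 6.0, 7.0])
print(x_c, var_c, lo, hi)
```

The same function covers the proportion case because the two variance formulas differ only in whether cluster means or cluster proportions are plugged in.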

Two-Phase Sampling

Sometimes sampling is done in two steps

An initial pilot sample can be done

Disadvantage:

Takes more time

Advantages:

Can adjust survey questions if problems are noted

Additional questions may be identified

Initial estimates of response rate or population parameters can be obtained

Non-Probability Samples

It may be simpler or less costly to use a nonprobability-based sampling method

Judgement sample
Quota sample
Convenience sample

These methods may still produce good estimates of population parameters

But

They are more subject to bias

There is no valid way to determine reliability

Chapter Summary

Reviewed basic steps in a sampling study

Defined sampling and nonsampling errors

Examined probability sampling methods

Simple Random Sampling, Systematic Sampling, Stratified Random Sampling, Cluster Sampling

Identified estimators for the population mean, population total, and population proportion for different types of samples

Determined the required sample size for specified confidence interval width

Examined nonprobabilistic sampling methods