
Chapter 7

Sampling
distributions

Digital Image PowerPoint to accompany:


Cover illustration: © Raw Pixel/Shutterstock.com
Sampling Distributions

• A sampling distribution is a distribution of all of the possible values of a statistic for a given size sample selected from a population.
[Diagram: Sampling Distributions, split into the Sampling Distribution of the Mean and the Sampling Distribution of the Proportion]

Copyright © 2013 Pearson Australia (a division of Pearson Australia Group


3 Pty Ltd) – 9781442549272/Berenson/Business Statistics /2e
Developing a Sampling Distribution
Assume there is a population:

Population size N=4

Random variable, X, is age of individuals.

Values of X: 18, 20, 22, 24 (years).


Developing a Sampling Distribution

Summary Measures for the Population Distribution

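Population mean:
$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$

Population standard deviation:
$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$

[Figure: population distribution, P(x) against x = 18, 20, 22, 24; Uniform Distribution]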
Developing a Sampling Distribution

Now consider all possible samples of size n = 2.
There are 16 possible samples (sampling with replacement):

1st Obs | 2nd Observation
        | 18      20      22      24
18      | 18,18   18,20   18,22   18,24
20      | 20,18   20,20   20,22   20,24
22      | 22,18   22,20   22,22   22,24
24      | 24,18   24,20   24,22   24,24

Developing a Sampling Distribution

Sampling Distribution of All Sample Means

The 16 sample means:

1st Obs | 2nd Observation
        | 18   20   22   24
18      | 18   19   20   21
20      | 19   20   21   22
22      | 20   21   22   23
24      | 21   22   23   24

Sample means distribution (no longer uniform; already roughly bell-shaped):

$\bar{X}$    | 18    19    20    21    22    23    24
$P(\bar{X})$ | 1/16  2/16  3/16  4/16  3/16  2/16  1/16
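
The enumeration above can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the original slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [18, 20, 22, 24]  # ages from the slide
n = 2                          # sample size

# All ordered samples of size n drawn with replacement: 4**2 = 16 of them.
samples = list(product(population, repeat=n))

# Mean of each sample, kept exact with Fraction.
sample_means = [Fraction(sum(s), n) for s in samples]

# Probability of each distinct sample mean = its count / number of samples.
counts = Counter(sample_means)
sampling_distribution = {
    m: Fraction(c, len(samples)) for m, c in sorted(counts.items())
}

for m, p in sampling_distribution.items():
    print(f"sample mean = {float(m):4.1f}   probability = {float(p):.4f}")
```

Each of the 16 samples is equally likely, so the probability of a given sample mean is simply how often it appears in the table divided by 16.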
So what do we do next?
First, as a reminder, what is the formula for E(X)?
$E(X) = \sum_i X_i \, P(X_i)$

Let's work out the mean of the sample means, i.e. $E(\bar{X})$, also called $\mu_{\bar{X}}$.

Take each value of $\bar{X}$ and multiply it by the associated probability of it occurring.

So $E(\bar{X}) = \sum_i \bar{X}_i \, P(\bar{X}_i)$

= 18*(1/16) + 19*(2/16) + 20*(3/16) + 21*(4/16) + 22*(3/16) + 23*(2/16) + 24*(1/16)
= 21
Notice anything?
The mean of the sample means is equal to the population mean!
We thus say our estimator is unbiased.
Roughly speaking, it's right on average.
Let's finish off by working out the variance of the sample mean.
Recall the formula for Var(X):
$\sigma^2 = \sum_{i=1}^{N} [X_i - E(X)]^2 \, P(X_i)$
So what about Var($\bar{X}$)?
$\sigma^2_{\bar{X}} = \sum_i (\bar{X}_i - \mu_{\bar{X}})^2 \, P(\bar{X}_i)$
= (18-21)^2*(1/16) + etc. = 2.5
So $\sigma_{\bar{X}} = \sqrt{2.5} = 1.58$.
It turns out there is a formula for the standard deviation of the sample mean (sometimes called the standard error): $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
Here $\sigma / \sqrt{n} = 2.236 / \sqrt{2} = 1.58$, which matches the value worked out above.
It's a measure of the variability in the mean from sample to sample.
*Note that the standard error of the mean decreases as the sample size increases.*
As a rule of thumb, the sample size should be at least 30: n < 30 counts as a small sample and n ≥ 30 as a large sample.
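
As a check on the arithmetic, the sketch below recomputes the mean and variance of the sample means for the age example and compares the result with $\sigma / \sqrt{n}$. It assumes only the Python standard library. Because all 16 samples are equally likely, a plain average over the list of sample means equals the probability-weighted sum used above.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev, pvariance

population = [18, 20, 22, 24]
n = 2

# Mean of every equally likely sample of size n (with replacement).
sample_means = [mean(s) for s in product(population, repeat=n)]

mu = mean(population)               # population mean
sigma = pstdev(population)          # population standard deviation (divides by N)

mu_xbar = mean(sample_means)        # mean of the sample means
var_xbar = pvariance(sample_means)  # variance of the sample means
se_xbar = sqrt(var_xbar)            # standard error of the mean

print(mu, mu_xbar)                                   # both equal 21
print(var_xbar)                                      # 2.5
print(round(se_xbar, 3), round(sigma / sqrt(n), 3))  # both 1.581
```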
If the Population is Normal
If a population is normal with mean μ and standard deviation σ, the
sampling distribution of $\bar{X}$ is also normally distributed with:
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
If the Population is NOT Normal (left skewed/right skewed)
The Central Limit Theorem states that regardless of the shape of the
population distribution, as long as the sample size is large enough
(generally n ≥ 30) the sampling distribution of $\bar{X}$ will be
approximately normally distributed with once again:
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
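
A small simulation can make the Central Limit Theorem concrete. The sketch below is illustrative only: the exponential population, the seed and the number of repeats are assumptions chosen for the demonstration and do not come from the slides.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # arbitrary fixed seed so the run is repeatable

# Assumed right-skewed population: exponential with mean 1 and sd 1.
mu, sigma = 1.0, 1.0
n = 36            # sample size (at least 30)
repeats = 20_000  # number of samples drawn

# Draw many samples of size n and record each sample mean.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(repeats)
]

se = sigma / n ** 0.5  # theoretical standard error, sigma / sqrt(n)

# Share of sample means within one standard error of mu;
# this is roughly 0.68 for a normal distribution.
within = sum(abs(m - mu) <= se for m in sample_means) / repeats

print(round(mean(sample_means), 3), "vs mu =", mu)
print(round(pstdev(sample_means), 3), "vs sigma/sqrt(n) =", round(se, 3))
print(round(within, 3))
```

Even though the assumed population is strongly skewed, the simulated sample means centre on μ, their spread is close to σ/√n, and roughly 68% fall within one standard error, as a normal distribution would suggest.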
Two Sampling Distribution Properties

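As n increases, $\sigma_{\bar{X}}$ decreases.

[Figure: two sampling distributions of $\bar{X}$ centred on μ, one for a larger sample size and one for a smaller sample size]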
The Central Limit Theorem

As the sample size n gets large enough, the sampling distribution becomes almost normal, regardless of the shape of the population.

Recall
If $X \sim N(\mu, \sigma^2)$
then $(X - \mu)/\sigma \sim N(0,1)$, i.e. standard normal: subtract the mean
and divide by the square root of the variance.
So we now know that if $X \sim N(\mu, \sigma^2)$ then $\bar{X} \sim N(\mu, \sigma^2/n)$.

So convert to standard normal: $Z = (\bar{X} - \mu)/\sqrt{\sigma^2/n} \sim N(0,1)$

Equivalently, $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$
Sampling Distribution Example
The number of a real estate agent’s clients follows a
non-normal population distribution with μ = 8
and σ = 3.
What is the probability that the sample mean is
between 7.8 and 8.2 if a random sample
of size n = 36 is selected?

So let's summarise
If X ~ ?(µ, σ²), i.e. X has mean µ and variance σ² but an unknown (possibly non-normal) distribution,
what is the distribution of $\bar{X}$?
If the central limit theorem applies, i.e. if the sample size n ≥ 30, then approximately:
$\bar{X} \sim N(\mu, \sigma^2/n)$
In our specific example:
X ~ ?(8, 9), i.e. µ = 8 and σ² = 9, and n = 36 ≥ 30
Therefore:
$\bar{X} \sim N(8, 9/36)$
So what is $P(7.8 < \bar{X} < 8.2)$?
Convert to standard normal: subtract the mean and divide by the square root of the variance.

If $\bar{X} = 7.8$ then $Z = (7.8 - 8)/\sqrt{9/36} = -0.2/0.5 = -0.4$

Similarly, if $\bar{X} = 8.2$ then $Z = 0.4$
So work out $P(-0.4 < Z < 0.4)$
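
The final probability can be read from a standard normal table or computed directly. The sketch below uses `statistics.NormalDist` from the Python standard library in place of the table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36
se = sigma / sqrt(n)       # standard error: 3 / 6 = 0.5

z_low = (7.8 - mu) / se    # about -0.4
z_high = (8.2 - mu) / se   # about  0.4

std_normal = NormalDist()  # mean 0, standard deviation 1
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(prob, 4))      # 0.3108
```

So there is roughly a 31% chance that the mean of a sample of 36 falls between 7.8 and 8.2.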
Sampling Distribution of the Proportion

[Diagram: Sampling Distributions]
• Sampling Distribution of the Mean: in numbers; cannot use this for categorical data.
• Sampling Distribution of the Proportion: in % or decimal; use this for categorical data.

Sampling Distribution of the Proportion (used for categorical data)

π is the proportion of items in the population with a characteristic of interest.
p is the sample proportion and provides an estimate of π, i.e. use p to estimate π.

One example might be the proportion of a population that smokes.

$p = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$

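For a large enough sample, p is approximately normally distributed:
$p \sim N(\pi, \pi(1-\pi)/n)$
Convert to standard normal (Z):

$(p - \pi)/\sqrt{\pi(1-\pi)/n} \sim N(0,1)$

$\sqrt{\pi(1-\pi)/n}$ is the standard error of the proportion.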


Proportion example:

If the true proportion of voters who support Proposition A is π = 0.4, what is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45?

Sampling Distribution of the Proportion Example

Find $\sigma_p$: $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{0.4(1-0.4)}{200}} = 0.03464$

Convert to standard normal (Z):

$P(0.40 \le p \le 0.45) = P\left(\frac{0.40 - 0.40}{0.03464} \le Z \le \frac{0.45 - 0.40}{0.03464}\right) = P(0 \le Z \le 1.44)$

Sampling Distribution of the Proportion Example
If π = 0.4 and n = 200, what is
P(0.40 ≤ p ≤ 0.45) ?

Use standard normal table: P(0 ≤ Z ≤ 1.44) = 0.4251
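
The table lookup can be cross-checked numerically. This is a minimal sketch with the Python standard library; Z is rounded to two decimals only to mirror the precision of a printed table, so the unrounded result would differ slightly in the fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.4, 200
se_p = sqrt(pi * (1 - pi) / n)         # standard error of the proportion

z_low = (0.40 - pi) / se_p             # 0.0
z_high = round((0.45 - pi) / se_p, 2)  # 1.44, rounded as for a table lookup

std_normal = NormalDist()
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(se_p, 5), z_high, round(prob, 4))  # 0.03464 1.44 0.4251
```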

[Figure: the sampling distribution of p between 0.40 and 0.45 is standardised to the standardised normal distribution of Z between 0 and 1.44]
Reason for Taking a Sample

• Less time-consuming than a census.


• Less costly to administer than a census.

Types of Samples Used

• Non-probability sample:
Items included are chosen without regard to their probability of occurrence.

• Probability sample:
Items in the sample are chosen on the basis of known probabilities.

Types of Samples Used

Types of Samples

• Non-Probability Samples: Judgement, Chunk, Quota, Convenience

• Probability Samples: Simple Random, Systematic, Stratified, Cluster

Probability Samples

Items in the sample are chosen based on known probabilities.

Probability Samples: Simple Random, Systematic, Stratified, Cluster
Simple Random Sampling

• Every individual or item from the frame (N) has an equal chance of being selected
(1/N).

• Selection may be with replacement or without replacement.

• Samples can be obtained from a table of random numbers or computer random number generators.

• Simple to use but may not be a good representation of the population’s underlying characteristics.
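
A computer random number generator makes both selection schemes one-liners. The sketch below is illustrative: the frame of 64 numbered individuals, the sample size and the seed are assumptions made for the example.

```python
import random

random.seed(1)  # arbitrary seed, only to make the run repeatable

frame = list(range(1, 65))  # hypothetical frame of N = 64 numbered individuals
n = 8                       # sample size

# Without replacement: no individual can be selected twice.
without_replacement = random.sample(frame, n)

# With replacement: the same individual may be selected more than once.
with_replacement = random.choices(frame, k=n)

print(sorted(without_replacement))
print(sorted(with_replacement))
```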

Systematic Sampling

• Divide frame of N individuals into n groups of k individuals: k = N/n.

• Randomly select one individual from the 1st group.

• Select every kth individual thereafter.

• Like simple random sampling, simple to use but may not be a good
representation of the population’s underlying characteristics.

Systematic Sampling Example
Divide our frame of 64 into eight groups with eight people in
each group.
Randomly select one individual from the 1st group,
e.g. the third person, and then select every 8th person
after that.

N = 64, n = 8, k = N/n = 8

[Figure: the frame of 64 individuals, with the first group labelled]
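
The example can be sketched in code as follows. The numbering of individuals from 1 to 64 and the seed are assumptions for illustration; the slide only fixes N, n and k.

```python
import random

random.seed(2)  # arbitrary seed, only to make the run repeatable

N, n = 64, 8
k = N // n                     # group size: k = 8

frame = list(range(1, N + 1))  # individuals numbered 1..64 (assumed labelling)

start = random.randrange(k)    # random position 0..k-1 within the first group
sample = frame[start::k]       # that individual, then every k-th one after it

print(sample)
```

If the random start happens to be the third person, the slice returns individuals 3, 11, 19, and so on up to 59, matching the slide's example.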
Stratified Sampling

• Divide population into two or more subgroups (called strata) according to some common characteristic.

• A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes –
called proportionate stratified sampling.

• Samples from subgroups are combined into one.
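
The three steps above might look like this in code. The strata names and sizes and the overall sample size are hypothetical, chosen so that the proportional allocation comes out in whole numbers; with other sizes the rounding step may need adjusting so the parts still add up to n.

```python
import random

random.seed(3)  # arbitrary seed, only to make the run repeatable

# Hypothetical population of 100 individuals split into three strata.
strata = {
    "A": [f"A{i}" for i in range(50)],
    "B": [f"B{i}" for i in range(30)],
    "C": [f"C{i}" for i in range(20)],
}
total = sum(len(members) for members in strata.values())
n = 10  # overall sample size

sample = []
for name, members in strata.items():
    # Sample size proportional to the stratum's share of the population.
    n_stratum = round(n * len(members) / total)
    # Simple random sample within the stratum, added to the combined sample.
    sample.extend(random.sample(members, n_stratum))

print(sample)
```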

Stratified Sampling

More efficient than simple random sampling or systematic sampling because of assured representation of items across entire population.
Homogeneity of items within each stratum provides greater precision in the
estimates of underlying population parameters.

[Figure: population divided into 4 strata, with a sample drawn from the strata]

Cluster Samples

• Population is divided into several ‘clusters’, each representative of the population e.g. postcode areas,
electorates etc.

• A simple random sample of clusters is selected.

• All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique.
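
Both options in the last bullet can be sketched briefly. The postcode labels, cluster sizes and seed below are made up for illustration and are not taken from the slides.

```python
import random

random.seed(4)  # arbitrary seed, only to make the run repeatable

# Hypothetical population: 6 postcode clusters of 5 households each.
clusters = {
    postcode: [f"{postcode}-house{i}" for i in range(5)]
    for postcode in ("2000", "2010", "2020", "2030", "2040", "2050")
}

# Simple random sample of clusters.
chosen = random.sample(sorted(clusters), 2)

# Option 1: use every item in the selected clusters.
one_stage = [item for c in chosen for item in clusters[c]]

# Option 2: simple random sample of 3 items within each selected cluster.
two_stage = [item for c in chosen for item in random.sample(clusters[c], 3)]

print(chosen)
print(one_stage)
print(two_stage)
```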

Cluster Sampling
More cost effective than random sampling, especially if population is
geographically widespread.
Often requires a larger sample size compared to simple random sampling or
stratified sampling for same level of precision.

[Figure: population divided into clusters; randomly selected clusters for sample]

Evaluating Survey Worthiness

• What is the purpose of the survey?

• Is the survey based on a probability or non-probability sample?

• Survey errors:
• Coverage error – appropriate or adequate frame?
• Non-response error – results in non-response bias.
• Measurement error – ambiguous wording, halo effect or respondent error.
• Sampling error – always exists and is the difference between the sample statistic and the population parameter; the aim is to minimise it.

Types of Errors

Coverage error: excluded from frame

Non-response error: follow up on non-responses

Sampling error: random differences from sample to sample

Measurement error: bad or leading question