

Selecting individual observations to most
efficiently yield knowledge without bias
v If all members of a population are identical, the
population is considered to be homogeneous.

v That is, the characteristics of any one individual
in the population would be the same as the
characteristics of any other individual (little or no
variation among individuals).






v When individual members of a population are different from
each other, the population is considered to be
heterogeneous (having significant variation among individuals).
v How does this change an alien's abduction scheme to find
out more about humans?
v In order to describe a heterogeneous population,
observations of multiple individuals are needed to account
for all possible characteristics that may exist.





v If a sample of a population is to provide useful
information about that population, then the sample
must contain essentially the same variation as the
population.
v The more heterogeneous a population is:

M The greater the chance that a sample may not
adequately describe the population; we could be wrong in
the inferences we make about the population.

M The larger the sample needs to be to adequately describe
the population; we need more observations to be able to
make accurate inferences.

v Sampling is the process of selecting observations (a
sample) to provide an adequate description and robust
inferences of the population.
M The sample is representative of the population.

v There are 2 types of sampling:

M Non-Probability sampling (Thursday's lecture)
M Probability sampling
v A sample must be representative of the population
with respect to the variables of interest.
v A sample will be representative of the population from
which it is selected if each member of the population
has an equal chance (probability) of being selected.
v Probability samples are more accurate than non-
probability samples
M They remove conscious and unconscious sampling bias.
v Probability samples allow us to estimate the accuracy
of the sample.
v Probability samples permit the estimation of population parameters.
v Sampling element: a case or a single unit that is selected from a population
and measured in some way; the basis of analysis (e.g., a person, thing,
specific time, etc.).

v Universe: the theoretical aggregation of all possible elements, unspecified
as to time and space (e.g., University of Idaho).

v Population: the theoretical aggregation of specified elements as defined
for a given survey by time and space (e.g., UI students and staff in
a given academic year).

v Study population: the aggregation of the population from
which the sample is actually drawn (e.g., UI students and faculty in the 2008-09
academic year).

v Sampling frame: a specific list that closely approximates all elements in the
population; from this the researcher selects units to create the study
sample (e.g., Vandal database of UI students and faculty in 2008-09).

v Sample: a set of cases that is drawn from a larger pool and used to make
generalizations about the population.




v Sample size depends on:

M How much sampling error can be tolerated (level of precision)
M Size of the population: sample size matters with small populations
M Variation within the population with respect to the characteristic of
interest (what you are investigating)
M Smallest subgroup within the sample for which estimates are needed
M Sample needs to be big enough to properly estimate the smallest
subgroup
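
As a rough illustration of how the first three factors interact, here is a minimal sketch, not a method given in the lecture. It assumes the characteristic is a proportion, that precision is stated as a tolerable standard error, and that small populations are handled with the common textbook finite population correction. The function name `required_sample_size` and its parameters are hypothetical.

```python
import math

def required_sample_size(p, tolerable_se, population_size=None):
    """Approximate sample size needed to estimate a proportion p
    with a standard error no larger than tolerable_se.

    Assumption: SE = sqrt(p * q / n), solved for n. If population_size
    is given, a finite population correction shrinks the result.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if tolerable_se <= 0:
        raise ValueError("tolerable_se must be positive")
    q = 1.0 - p
    n = (p * q) / tolerable_se ** 2          # more variation (p near 0.5) -> larger n
    if population_size is not None:
        # Finite population correction: matters only when the population is small
        n = n / (1.0 + (n - 1.0) / population_size)
    # Round away floating-point noise before rounding up to a whole person
    return math.ceil(round(n, 6))

n_large_population = required_sample_size(0.5, 0.05)
n_small_population = required_sample_size(0.5, 0.05, population_size=500)
```

Under these assumptions, a 50:50 characteristic and a tolerable standard error of 5% call for about 100 observations in a very large population, and somewhat fewer when the whole population is only 500 people.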


v Parameter: any characteristic of a population that is true; known on
the basis of a census (e.g., % of males or females; % of college
students in a population).
v Estimate: any characteristic of a population that is estimated on
the basis of samples (e.g., % of males or females; % of college
students in a sample). Samples have:

v Sampling error: an estimate of precision; estimates how close
sample estimates are to a true population value for a characteristic.
M Occurs as a result of selecting a sample rather than surveying an entire population

v Standard error (SE): a measure of sampling error.
v SE is inversely related to sample size.
M As sample size increases, SE decreases; the sample is more precise.
M So, we want the smallest SE we can get: greatest precision!
M When in doubt, increase sample size.
v For a given sample size, SE will be highest for a population that has a 50:50
distribution on some characteristic of interest, while it is non-existent with a
distribution of 100:0.

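s = standard error
n = sample size
p = proportion having a particular characteristic (or 1 - q)
q = proportion not having a particular characteristic (or 1 - p)

$$ s = \sqrt{\frac{q \cdot p}{n}} $$

With n = 100:

$$ s = \sqrt{\frac{0.9 \times 0.1}{100}} = 0.03 \text{ or } 3\% $$

$$ s = \sqrt{\frac{0.5 \times 0.5}{100}} = 0.05 \text{ or } 5\% $$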
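
The standard error calculation can be checked with a few lines of Python. This is a small sketch that assumes p is given as a proportion between 0 and 1; the function name `standard_error` is illustrative.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion: sqrt(p * q / n), where q = 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    q = 1.0 - p
    return math.sqrt(p * q / n)

# The two worked cases, both with n = 100
se_90_10 = standard_error(0.9, 100)
se_50_50 = standard_error(0.5, 100)

# A 100:0 distribution has no sampling error at all
se_100_0 = standard_error(1.0, 100)

# Quadrupling the sample size halves the standard error
se_50_50_n400 = standard_error(0.5, 400)
```

The last line shows why "when in doubt, increase sample size" has diminishing returns: SE shrinks with the square root of n, so halving it takes four times as many observations.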

v Selection process with no pattern; unpredictable
v Each element has an equal probability of being selected for a study
v Reduces the likelihood of researcher bias
v Researcher can calculate the probability of certain outcomes
v Variety of types of probability samples


v Why random?
v Samples that are selected in a random fashion are most likely to be
truly representative of the population under consideration.

v Can calculate the deviation between sample results and a
population parameter due to random processes.
v Simple random sampling (SRS): the basic sampling method on which most others are based.

M A sample of size 'n' is drawn from a population of size 'N' in such a way that every
element in the population has the same chance of being selected.
M Take a number of samples to create a sampling distribution.

v Typically conducted "without replacement"

v Random selection methods:

M Random numbers table, drawing out of a hat, random timer, etc.

v Not usually the most efficient, but can be most accurate!

M Time & money can become an issue
M What if you only have enough time and money to conduct one sample?
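
A simple random sample without replacement can be sketched with Python's standard `random` module, which plays the role of the random numbers table. The frame contents and the function name `simple_random_sample` are made up for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements from the sampling frame without replacement.

    Every element in the frame has the same chance of being selected.
    """
    if n > len(frame):
        raise ValueError("sample size n cannot exceed the frame size N")
    rng = random.Random(seed)      # seed only makes the draw repeatable
    return rng.sample(frame, n)    # random.sample never picks an element twice

# Hypothetical sampling frame of N = 10,000 people
frame = [f"person_{i}" for i in range(1, 10_001)]
sample = simple_random_sample(frame, 1000, seed=42)
```

Because the draw is without replacement, no person can appear in the sample more than once.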
v h 
M Starting from a random point on a sampling frame, every nth element in the frame
is selected at equal intervals º=     

v M tells the researcher how to select elements from

the frame (1 in µk¶ elements is selected).
M Depends on sample size needed

v Example
M You have a sampling frame (list) of 10,000 people and you need a sample of
1,000 for your study. What is the sampling interval?

M Every 10th person listed (1 in 10 persons): 10,000 / 1,000 = 10

v Empirically provides nearly identical results to SRS, but is more efficient.

v Caution: Need to keep in mind the nature of your frame for SS to
work; beware of periodicity (a cyclical pattern in the list that lines up
with the sampling interval).
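
The systematic procedure can be sketched as follows. This is one common variant, assumed here rather than taken from the slides: the random start is chosen within the first interval, and the interval k is the frame size divided by the sample size, rounded down. The function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every kth element of the frame after a random start."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the frame size N")
    k = N // n                      # sampling interval (1 in k elements)
    rng = random.Random(seed)
    start = rng.randrange(k)        # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

# Frame of 10,000 list positions, sample of 1,000 -> interval of 10
frame = list(range(10_000))
sample = systematic_sample(frame, 1000, seed=1)
```

If the frame is ordered in a cycle whose length matches k (for example, every 10th entry is a household head), this selection would pick the same kind of element every time, which is the periodicity problem.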










v h 
M Divide the population by certain characteristics into homogeneous
subgroups () (e.g., UI PhD students, Masters Students,
Bachelors students).
M Elements þ h  each strata are homogeneous, but are
M A simple random or a systematic sample is taken from each strata
relative to the proportion of that stratum to each of the others.

M Ähen a stratum of interest is a small percentage of a population
and random processes could miss the stratum by chance.
M Ähen enough is known about the population that it can be easily
broken into subgroups or strata.
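
A proportional stratified sample might look like the sketch below. The stratum counts (700 Bachelors, 200 Masters, 100 PhD) are invented for illustration, as are the function name `stratified_sample` and the rule of taking at least one element per stratum so that a small stratum is never missed.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, n, seed=None):
    """Draw a simple random sample from each stratum, sized in
    proportion to that stratum's share of the frame."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)

    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding means the total may differ slightly from n
        size = max(1, round(n * len(members) / len(frame)))
        sample[name] = rng.sample(members, min(size, len(members)))
    return sample

# Hypothetical frame: (degree level, student id)
frame = (
    [("Bachelors", i) for i in range(700)]
    + [("Masters", i) for i in range(200)]
    + [("PhD", i) for i in range(100)]
)
sample = stratified_sample(frame, lambda e: e[0], 100, seed=7)
sizes = {name: len(members) for name, members in sample.items()}
```

With these made-up counts, a sample of 100 is split 70 / 20 / 10, so PhD students are guaranteed to appear instead of being left to chance.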
[Figure: stratified sampling from a population]



v Some populations are spread out (over a state or
country).

v Elements occur in clumps (towns, districts): primary
sampling units (PSU).

v Elements are hard to reach and identify.

v Trade accuracy for efficiency.

v You cannot assume that any one clump is better or

worse than another clump.
[Figure: cluster sampling from a population]


v Used when:
M Researchers lack a good sampling frame for a dispersed
population.
M The cost to reach an element to sample is very high.

v Each cluster is heterogeneous internally and
homogeneous relative to all the other clusters.

v Usually less expensive than SRS but not as accurate

M Each stage in cluster sampling introduces sampling
error; the more stages there are, the more error there
tends to be.
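
A two-stage cluster sample can be sketched like this: first sample the clumps (primary sampling units), then sample elements within each chosen clump. The towns, residents, and the function name `two_stage_cluster_sample` are hypothetical, and using simple random sampling at both stages is an assumption.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly pick clusters (PSUs).
    Stage 2: randomly pick elements inside each chosen cluster."""
    if n_clusters > len(clusters):
        raise ValueError("cannot select more clusters than exist")
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)     # stage 1
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen                                # stage 2
    }

# Hypothetical population: 20 towns with 50 residents each
clusters = {
    f"town_{t}": [f"town_{t}_resident_{i}" for i in range(50)]
    for t in range(20)
}
sample = two_stage_cluster_sample(clusters, n_clusters=5, n_per_cluster=10, seed=3)
```

Only the residents of the 5 chosen towns need to be listed and reached, which is where the cost savings come from; the price is that each stage adds its own sampling error.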

v Can combine SRS, SS, stratification and cluster
sampling.

v Examples

M Strata: weekday-weekend; gender; type of
travel; season; size of operation; etc.
M What are some others?

M Clusters: counties; entry points (put-ins and take-outs);
time of day; city blocks; road or trail
M What are some others?