
Ch. 7: Sampling and Inferential Statistics

Presented by
Hussein Walid Hussein Alkhawaja

Supervised by
Vahid Nimehchisalem, PhD
Important statistical terms
Population:
A set that includes all measurements of interest to the researcher
(the collection of all responses, measurements, or counts that are of interest)

Sample:
A subset of the population
Why sampling?

Get information about large populations


 Lower cost
 Less field time
 More accuracy, i.e., a better job of data collection can be done
 When it’s impossible to study the whole
population
Target Population:
The population to be studied/ to which the
investigator wants to generalize his results
Sampling Unit:
smallest unit from which sample can be selected
Sampling frame
List of all the sampling units from which sample is
drawn
Sampling scheme
Method of selecting sampling units from sampling
frame
Steps in sampling?
1) Identification of the target population
- the large group to which the researcher wishes to generalize the results of the study
e.g.,
- all students in the Faculty of Modern Languages and Communication
- all Malaysian boys and girls in the age range of 12 to 21 years

Note: the accessible population is the population of subjects accessible to the researcher for drawing a sample, e.g., a sample of adolescents from one state
- this saves time and money
- but we could then generalize results only to adolescents in the chosen state, not to all Malaysian adolescents.
Steps in sampling?
2) Sample selection

 Probability sampling
- By chance inclusion of subjects/elements
- Random selection
- Help in inferential statistics to generalize findings
Criterion: every member or element of the population has a known probability of being chosen in the sample.

 Nonprobability sampling :
- the elements are not chosen by chance procedures.
- Its success depends on the knowledge, expertise, and judgment of
the researcher.
- It is used when the application of probability sampling is not feasible.
- Its advantages are convenience and economy.
Probability samples

 Random sampling
 Each subject has a known probability of
being selected
 Allows application of statistical sampling
theory to results to:
 Generalise
 Test hypotheses
Methods used in probability
samples

 Simple random sampling


 Systematic sampling

 Stratified sampling

 Multi-stage sampling

 Cluster sampling
Simple random sampling
Definition
 random means “without purpose or by accident.”
 chance alone determines which elements in the population will be in the sample

Purpose:
 When random sampling is used, the researcher can employ inferential statistics to
estimate how much the population is likely to differ from the sample.

Steps
1. Define the population.

2. List all members of the population.

3. Select the sample by employing a procedure where sheer chance determines which
members on the list are drawn for the sample.

Methods:

- drawing slips from a container
- a table of random numbers (can be absolutely without bias)
- Research Randomizer (www.randomizer.org).
Table of random numbers
684257954125632140
582032154785962024
362333254789120325
985263017424503686
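
The three steps of simple random sampling can be sketched in Python. This is a minimal illustration and not part of the original slides: the population of 500 numbered students and the helper name `simple_random_sample` are assumptions, and the standard library's `random.sample` stands in for a container draw or a table of random numbers.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; chance alone decides who is chosen."""
    rng = random.Random(seed)  # the seed only makes the example repeatable
    return rng.sample(population, n)

# Steps 1-2: define and list the population (hypothetical student IDs 1..500).
population = list(range(1, 501))

# Step 3: select the sample by a chance procedure.
sample = simple_random_sample(population, 50, seed=1)
print(sorted(sample)[:5])
```

Every member of the list has the same chance of being drawn, which is what allows inferential statistics to be applied to the result.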
Stratified Sampling
Definition
 The population is divided into a number of subgroups, or strata, that may differ in the characteristics being studied, e.g., age, neighborhood, and occupation
Purpose:
 To employ inferential statistics to estimate how much the population is likely to differ
from the sample.
 When the population to be sampled is not homogeneous but consists of several subgroups, stratified sampling may give a more representative sample than simple random sampling.

Steps
1. Identify the strata of interest
2. Randomly draw a specified number of subjects from each stratum, keeping to the chosen ratio between strata

Possible bases for stratification
- Geographic
- Characteristics of the population (income, occupation, gender, age, or year in college)

 Proportional stratified sampling
Select from each stratum in proportion to the size of the stratum in the population. The alternative is to take equal numbers from each stratum.
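
Proportional allocation can be sketched as follows. This is an illustration under stated assumptions: the three strata, their sizes (300, 150, 50), and the function name `proportional_stratified_sample` are hypothetical and chosen so that the proportions divide evenly.

```python
import random

def proportional_stratified_sample(strata, n, seed=None):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Stratum quota = n * (stratum size / population size).
        # With uneven proportions, rounding can shift the total slightly.
        k = round(n * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical strata by year in college.
strata = {
    "first_year": [f"F{i}" for i in range(300)],
    "second_year": [f"S{i}" for i in range(150)],
    "third_year": [f"T{i}" for i in range(50)],
}
stratified = proportional_stratified_sample(strata, 50, seed=1)
print({name: len(chosen) for name, chosen in stratified.items()})
```

Within each stratum the draw is still a simple random sample; only the number drawn per stratum is fixed in advance.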
Cluster Sampling
Definition
 Cluster: a naturally occurring group of sampling units close to each other, i.e., crowding together in the same area or neighborhood
e.g.,
- choosing a number of schools randomly from a list of schools and then including all the students in those schools in the sample
- choosing 50 blocks from a city map and then polling all the adults living on those blocks
- using intact classrooms as clusters
Purpose:
 When it is very difficult, if not impossible, to list all the members of a target
population and select the sample from among them
 When it would be very expensive to study a sample that is scattered throughout Malaysia

Steps
1. Random selection of the cluster
2. all the members of the cluster must be included in the sample
Possible problems
- sampling error is larger, especially when the number of clusters is small
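
The two steps of cluster sampling can be sketched like this. It is only an illustration: the ten schools of 30 students each and the function name `cluster_sample` are assumptions made for the example.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep every member of each one."""
    rng = random.Random(seed)
    # Step 1: random selection of clusters (not of individuals).
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Step 2: all members of each chosen cluster enter the sample.
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical clusters: 10 schools with 30 students each.
schools = {
    f"school_{i}": [f"s{i}_{j}" for j in range(30)] for i in range(1, 11)
}
chosen_schools, students = cluster_sample(schools, 3, seed=1)
print(chosen_schools, len(students))
```

Note that the unit chosen by chance is the cluster, so no list of individual population members is needed.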
Cluster sampling
[Figure: an area divided into five clusters, labeled Section 1 to Section 5]
Systematic Sampling
Definition
drawing a sample by taking every Kth case from a list of the population.
Sampling fraction: Ratio between sample size and population size

Method:
 Randomize the order of the population list
 Decide how many subjects you want in the sample (n)
 Divide N (total number of members in the population) by n to determine the sampling interval (K) to apply to the list
 Select the first member randomly from the first K members of the list and then select every Kth member of the population for the sample by adding the K value each time
e.g., N=500 subjects and a desired sample size n is 50:
K = N/n = 500/50 = 10.
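
The method can be sketched in Python using the same numbers (N = 500, n = 50, K = 10). The numbered population list and the function name `systematic_sample` are assumptions for illustration.

```python
import random

def systematic_sample(population, n, seed=None):
    """Take every Kth member after a random start within the first K."""
    rng = random.Random(seed)
    k = len(population) // n   # sampling interval K = N / n
    start = rng.randrange(k)   # random position among the first K members
    # Add K each time to reach the next selected member.
    return [population[start + i * k] for i in range(n)]

population = list(range(1, 501))   # hypothetical list of 500 subjects
systematic = systematic_sample(population, 50, seed=1)
print(systematic[:5])
```

Only the starting point is left to chance; once it is fixed, the rest of the sample is determined by the interval.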

Note: you could use cluster sampling if you were studying a very large and widely
dispersed population. At the same time, you might be interested in stratifying the
sample to answer questions regarding its different strata.
Non probability samples
 Probability of being chosen is unknown
 Cheaper, but findings cannot be generalised
 potential for bias

 Convenience sampling (ease of access)


 Snowball sampling (friend of a friend, etc.)
 Purposive sampling (judgemental)
 Quota sample
Convenience sampling
 sample is selected from elements of a population
that are easily accessible
 The weakest of all sampling procedures
 E.g.,
- interviewing the first individuals you encounter on campus
- using a large undergraduate class
- using the students in your own classroom as a sample
- taking volunteers to be interviewed in survey research
 a convenience sample is perhaps better than nothing at all
 Be extremely cautious in interpreting the findings and know that you
cannot generalize the findings
Purposive/ judgment sampling

 Sample elements judged to be typical, or representative, are chosen from the population
 E.g.,

- In forecasting national elections.


1. In each state, choose a number of small districts whose returns in
previous elections have been typical of the entire state.
2. Interview all the eligible voters in these districts
3. use the results to predict the voting patterns of the state.
4. Using similar procedures in all states, the pollsters forecast the national results
Threats
 There is no reason to assume that the units judged to be typical of the
population will continue to be typical over a period of time.
 The results of a study using purposive sampling may be misleading.

Note: useful in attitude and opinion surveys because it is cheap


Quota Sampling
 Involves selecting typical cases from diverse strata of a population. Based on known characteristics of the population to which you wish to generalize.
 For example, if census results show that 25 percent of the population
of an urban area lives in the suburbs, then 25 percent of the sample
should come from the suburbs.
Steps
 Determine a number of variables related to the question to be used as
bases for stratification. (gender, age, education, and social class).
 Using census or other available data, determine the size of each segment of the population.
 Compute quotas for each segment of the population that are
proportional to the size of each segment.
 Select typical cases from each segment, or stratum, of the population
to fill the quotas (problematic)
Why? The selection of elements is likely to be based on accessibility
and convenience
Random Assignment
 Definition:
A procedure used after we have a sample of 
participants and before we expose them to a
treatment.
 E.g., to compare the effects of two treatments on the same dependent variable, we use random assignment to put our available participants into groups. A chance procedure (a table of random numbers or a coin toss) is then used to decide which group gets which treatment.
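
Random assignment can be sketched as follows. This is a minimal illustration: the 20 participant labels, the treatment names, and the function name `random_assignment` are hypothetical.

```python
import random

def random_assignment(participants, seed=None):
    """Split an existing sample into two groups purely by chance."""
    rng = random.Random(seed)
    shuffled = participants[:]   # copy so the original list is untouched
    rng.shuffle(shuffled)        # chance decides who lands in which group
    half = len(shuffled) // 2
    groups = [shuffled[:half], shuffled[half:]]
    # A simulated coin toss decides which group gets which treatment.
    if rng.random() < 0.5:
        groups.reverse()
    return {"treatment_A": groups[0], "treatment_B": groups[1]}

participants = [f"P{i}" for i in range(1, 21)]
assignment = random_assignment(participants, seed=1)
print(assignment["treatment_A"])
```

This happens after the sample has been obtained, so it concerns how participants are divided among treatments, not how they were drawn from the population.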
Sample Size
 How large should a sample be?
 Answer?
most important characteristic of a sample is its
representativeness, not its size.
A random sample of 200 is better than a random sample of
100, but a random sample of 100 is better than a biased
sample of 2.5 million.

Power Calculations

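Sample size

Quantitative:
$$n = \frac{Z^2 \sigma^2}{D^2} \qquad\qquad n = \frac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2}$$

Qualitative:
$$n = \frac{Z^2 \pi (1 - \pi)}{D^2} \qquad\qquad n = \frac{2 P (1 - P) F}{D^2}$$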
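
The two single-sample formulas, n = Z²σ²/D² and n = Z²π(1 − π)/D², can be tried out in Python. The slide does not define its symbols, so the usual reading is assumed here: Z is the standard normal critical value for the chosen confidence level, σ the population standard deviation, π the expected proportion, and D the acceptable margin of error. The function names and input values are hypothetical.

```python
import math
from statistics import NormalDist

def n_for_mean(sigma, d, confidence=0.95):
    """n = Z^2 * sigma^2 / D^2, rounded up to a whole subject."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * sigma**2 / d**2)

def n_for_proportion(pi, d, confidence=0.95):
    """n = Z^2 * pi * (1 - pi) / D^2, rounded up to a whole subject."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * pi * (1 - pi) / d**2)

# Assumed inputs: SD of 15 with a margin of 2 points,
# and a proportion near 0.5 with a margin of 0.05.
n_mean = n_for_mean(sigma=15, d=2)
n_prop = n_for_proportion(pi=0.5, d=0.05)
print(n_mean, n_prop)
```

Halving D roughly quadruples n, because D enters the formula squared.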
Conclusions

 Probability samples are the best

 Ensure
 Representativeness
 Precision
Errors in sample

Systematic error (or bias)


Inaccurate response (information bias)
Selection bias

Sampling error (random error)


“the difference between a population parameter and a sample statistic.” For example, if you know the mean of the entire population (symbolized μ) and also the mean of a random sample (symbolized X̄) from that population, the difference between these two (X̄ − μ) represents sampling error (symbolized e).

Thus, e = X̄ − μ. For example, if you know that the mean intelligence score for a population of 10,000 fourth-graders is μ = 100 and a particular random sample of 200 has a mean of X̄ = 99, then the sampling error is X̄ − μ = 99 − 100 = −1.
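
Sampling error, defined as the sample mean minus the population mean, can be illustrated with a short simulation. The simulated population (10,000 scores with mean near 100 and standard deviation 15) and the function name `sampling_error` are assumptions for the sketch, not data from the slides.

```python
import random
from statistics import mean

def sampling_error(sample_mean, population_mean):
    """e = sample mean minus population mean."""
    return sample_mean - population_mean

# The worked example: a sample mean of 99 against a population mean of 100.
print(sampling_error(99, 100))  # -1

# Hypothetical population of 10,000 scores; draw five random samples of 200.
rng = random.Random(0)
population = [rng.gauss(100, 15) for _ in range(10_000)]
mu = mean(population)
errors = [
    sampling_error(mean(rng.sample(population, 200)), mu) for _ in range(5)
]
print([round(e, 2) for e in errors])
```

Each random sample gives a slightly different error, sometimes positive and sometimes negative, which is why sampling error is treated as random rather than systematic.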
Type 1 error
 The probability of finding a difference in our sample when no difference actually exists in the population

 The investigator will either retain or reject the null hypothesis. Either
decision may be correct or wrong. If the null hypothesis is true, the
investigator is correct in retaining it and in error in rejecting it. The
rejection of a true null hypothesis is labeled a Type I error.

 Its probability is denoted α (the “type 1 error” rate)


 Usually set at 5% (or 0.05)
Type 2 error

 The probability of not finding a difference that actually exists between our sample and the population
 If the null hypothesis is false, the investigator is in error in retaining it and correct in rejecting it. The retention of a false null hypothesis is labeled a Type II error.

 Its probability is denoted β (the “type 2 error” rate)

 Power is (1 − β) and is usually set at 80%
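
The link between α, β, and power can be made concrete with a two-sided one-sample z-test. This test is chosen here only for illustration and is not named in the slides; the function name `z_test_power` and the input values (a true difference of 5, standard deviation 15, n = 71) are assumptions.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Chance that a two-sided z-test rejects H0 when the true mean is off by delta."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    shift = delta * sqrt(n) / sigma             # true difference in standard-error units
    upper = 1 - std_normal.cdf(z_crit - shift)  # rejections in the upper tail
    lower = std_normal.cdf(-z_crit - shift)     # rejections in the lower tail
    return upper + lower

# When H0 is true (delta = 0), the rejection rate is the Type I error rate.
alpha_check = z_test_power(delta=0, sigma=15, n=50)

# When H0 is false, the rejection rate is the power, and beta = 1 - power.
power = z_test_power(delta=5, sigma=15, n=71)
beta = 1 - power
print(round(alpha_check, 3), round(power, 3), round(beta, 3))
```

Raising n, or looking for a larger difference, increases power and so lowers β, while α stays at whatever level the investigator set.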