Presented by

Hussein Walid Hussein Alkhawaja

Supervised by

Vahid Nimehchisalem, PhD

Important statistical terms

Population:

a set which includes all

measurements of interest

to the researcher

(The collection of all

responses, measurements, or

counts that are of interest)

Sample:

A subset of the population

Why sampling?

Lower cost

Less field time

Greater accuracy, i.e., can do a better job of

data collection

When it’s impossible to study the whole

population

Target Population:

The population to be studied, to which the

investigator wants to generalize the results

Sampling Unit:

The smallest unit from which a sample can be selected

Sampling frame

List of all the sampling units from which the sample is

drawn

Sampling scheme

Method of selecting sampling units from the sampling

frame

Steps in sampling?

1) Identification of the target population

- the large group to which the researcher wishes to generalize the

results of the study

e.g.,

- all students in the Faculty of Modern Languages and

Communication

- all Malaysian boys and girls in the age range of 12 to 21 years

- the accessible population: the part of the target population that is

accessible to the researcher for drawing a sample, e.g., a sample of

adolescents from one state

- this saves time and money,

- but we could only generalize results to adolescents in the chosen

state, not to all Malaysian adolescents.

Steps in sampling?

2) Sample selection

Probability sampling

- Inclusion of subjects/elements by chance

- Random selection

- Helps in using inferential statistics to generalize findings

Criterion: every member or element of the population has a known

probability of being chosen in the sample.

Nonprobability sampling:

- the elements are not chosen by chance procedures.

- Its success depends on the knowledge, expertise, and judgment of

the researcher.

- It is used when the application of probability sampling is not feasible.

- Its advantages are convenience and economy.

Probability samples

Random sampling

Each subject has a known probability of

being selected

Allows application of statistical sampling

theory to results to:

Generalise

Test hypotheses

Methods used in probability samples

Systematic sampling

Stratified sampling

Multi-stage sampling

Cluster sampling

Simple random sampling

Definition

random means “without purpose or by accident.”

chance alone determines which elements in the population will be in the sample

Purpose:

When random sampling is used, the researcher can employ inferential statistics to

estimate how much the population is likely to differ from the sample.

Steps

1. Define the population.

2. List all members of the population.

3. Select the sample by employing a procedure where sheer chance determines which

members on the list are drawn for the sample.

Methods:

- Drawing names from a container

- a table of random numbers (can be absolutely without bias)

- Research Randomizer (www.randomizer.org).
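The steps for simple random sampling can be sketched in a few lines of Python. This is only an illustration: the student IDs, the population size of 500, the sample size of 50 and the function name `simple_random_sample` are all made up for the example, and Python's built-in pseudo-random generator stands in for the container or random number table.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; chance alone decides who is chosen."""
    rng = random.Random(seed)  # a seed is used only to make the example repeatable
    return rng.sample(population, n)

# Define the population and list its members (hypothetical student IDs).
population = [f"S{i:03d}" for i in range(1, 501)]

# Let chance select 50 members from the list.
sample = simple_random_sample(population, 50, seed=42)
print(sample[:5])
```

Because `random.Random.sample` draws without replacement, no member can appear twice in the sample.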

Table of random numbers

684257954125632140

582032154785962024

362333254789120325

985263017424503686

Stratified Sampling

Definition

The population is divided into a number of subgroups, or strata, that may differ in the characteristics being studied

e.g., age, neighborhood, and occupation

Purpose:

To employ inferential statistics to estimate how much the population is likely to differ

from the sample.

When the population to be sampled is not homogeneous but consists of several

subgroups, stratified sampling may give a more representative sample than simple random

sampling.

Steps

1. Identify the strata of interest

2. Randomly draw a specified number of subjects from each stratum, in an exact,

predetermined ratio

Possible bias

- Geographic

- characteristics of the population (income, occupation, gender, age, or year in college)

Take equal numbers from each stratum or select in proportion to the size of the stratum in

the population
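A minimal sketch of the proportional option, assuming three hypothetical strata (year in college) of 400, 350 and 250 students and a desired total sample of 100. The function name and all figures are invented for illustration.

```python
import random

def proportional_stratified_sample(strata, n, seed=None):
    """Randomly draw from each stratum in proportion to its size in the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Stratum share of the sample; rounding can make the total differ
        # slightly from n when the shares are not whole numbers.
        k = round(n * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical strata: students listed by year in college.
strata = {
    "year1": [f"Y1-{i}" for i in range(400)],
    "year2": [f"Y2-{i}" for i in range(350)],
    "year3": [f"Y3-{i}" for i in range(250)],
}
sample = proportional_stratified_sample(strata, 100, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
# {'year1': 40, 'year2': 35, 'year3': 25}
```

To take equal numbers from each stratum instead, `k` would simply be set to the same value for every stratum.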

Cluster Sampling

Definition

Cluster: a naturally occurring group of sampling units close to each other, i.e.,

crowding together in the same area or neighborhood

e.g., selecting a number of schools randomly from a list of schools and then including all the

students in those schools in the sample

selecting 50 blocks from a city map and then polling all the adults living on those

blocks

using intact classrooms as clusters

Purpose:

When it is very difficult, if not impossible, to list all the members of a target

population and select the sample from among them

When it would be very expensive to study a sample that is scattered throughout

Malaysia

Steps

1. Random selection of the cluster

2. all the members of the cluster must be included in the sample

Possible problems

- sampling error, especially when the number of clusters is small
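The two steps of cluster sampling can be sketched as follows. The eight schools of 30 students each, the choice of three clusters and the function name are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters, then include every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # step 1: random clusters
    sample = [member for name in chosen for member in clusters[name]]  # step 2: all members
    return chosen, sample

# Hypothetical sampling frame: eight schools, 30 students in each.
schools = {f"School {c}": [f"{c}-{i}" for i in range(30)] for c in "ABCDEFGH"}
chosen, sample = cluster_sample(schools, 3, seed=7)
print(chosen, len(sample))
```

Note that chance operates only at the level of the cluster: once a school is drawn, none of its students is left out.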

Cluster sampling

[Figure: an area divided into five clusters, Section 1 to Section 5]

Systematic Sampling

Definition

drawing a sample by taking every Kth case from a list of the population.

Sampling fraction: Ratio between sample size and population size

Method:

Randomize the sample

Decide how many subjects you want in the sample (n)

Divide N (total number of members in the population) by n to determine the

sampling interval (K) to apply to the list.

Select the first member randomly from the first K members of the list and then

select every Kth member of the population for the sample by adding the K

value each time

e.g., N=500 subjects and a desired sample size n is 50:

K = N/n = 500/50 = 10.
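The worked example (N = 500, n = 50, K = 10) can be sketched in Python. The population here is just the numbers 1 to 500 standing in for a list of subjects, and the function name is made up for the example. The sketch assumes N divides evenly by n.

```python
import random

def systematic_sample(population, n, seed=None):
    """Take every Kth case from the list after a random start within the first K."""
    rng = random.Random(seed)
    k = len(population) // n      # sampling interval K = N / n
    start = rng.randrange(k)      # random position among the first K members
    return population[start::k][:n]

population = list(range(1, 501))  # N = 500 subjects, numbered 1 to 500
sample = systematic_sample(population, 50, seed=3)
print(len(sample), sample[1] - sample[0])  # 50 10
```

Only the starting point is left to chance; every later selection is fixed by adding K.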

Note: you could use cluster sampling if you were studying a very large and widely

dispersed population. At the same time, you might be interested in stratifying the

sample to answer questions regarding its different strata.

Systematic sampling

Nonprobability samples

Probability of being chosen is unknown

Cheaper, but findings cannot be generalised

Potential for bias

Snowball sampling (friend of a friend, etc.)

Purposive sampling (judgemental)

Quota sample

Convenience sampling

sample is selected from elements of a population

that are easily accessible

The weakest of all sampling procedures

E.g.,

- Interviewing the first individuals you encounter on campus,

- a large undergraduate class

- using the students in your own classroom as a sample

- taking volunteers to be interviewed in survey research

a convenience sample is perhaps better than nothing at all

Be extremely cautious in interpreting the findings and know that you

cannot generalize the findings

Purposive/judgment sampling

Elements judged to be typical, or representative, are chosen

from the population

E.g.,

1. In each state, choose a number of small districts whose returns in

previous elections have been typical of the entire state.

2. Interview all the eligible voters in these districts

3. use the results to predict the voting patterns of the state.

4. Using similar procedures in all states, the pollsters forecast the

national results

Threats

There is no reason to assume that the units judged to be typical of the

population will continue to be typical over a period of time.

The results of a study using purposive sampling may be misleading.

Quota Sampling

Involves selecting typical cases from diverse strata of a population.

Based on known characteristics of the population to which you wish to

generalize.

For example, if census results show that 25 percent of the population

of an urban area lives in the suburbs, then 25 percent of the sample

should come from the suburbs.

Steps

Determine a number of variables related to the question to be used as

bases for stratification. (gender, age, education, and social class).

Using census or other available data, determine the size of each

segment of the population.

Compute quotas for each segment of the population that are

proportional to the size of each segment.

Select typical cases from each segment, or stratum, of the population

to fill the quotas (problematic)

Why? The selection of elements is likely to be based on accessibility

and convenience
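The quota computation step is simple arithmetic and can be sketched as below. The 25 percent suburban share comes from the census example; the sample size of 200, the label for the remaining 75 percent and the function name are assumptions made for illustration.

```python
def compute_quotas(segment_shares, sample_size):
    """Quota for each segment, proportional to its share of the population."""
    return {
        segment: round(share * sample_size)
        for segment, share in segment_shares.items()
    }

# Shares of the population (the 25% figure is from the census example;
# the remaining 75% is grouped under a made-up label).
shares = {"suburbs": 0.25, "rest of urban area": 0.75}
quotas = compute_quotas(shares, 200)  # 200 is a hypothetical sample size
print(quotas)  # {'suburbs': 50, 'rest of urban area': 150}
```

The code covers only the quotas themselves. Filling them is still left to the interviewer's choice of cases, which is where accessibility and convenience enter.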

Random Assignment

Definition:

A procedure used after we have a sample of

participants and before we expose them to a

treatment.

E.g., to compare the effects of two treatments on

the same dependent variable, we use random

assignment to put our available participants into

groups. Then a chance procedure (a table of

random numbers or tossing a coin) is used to

decide which group gets which treatment.
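A minimal sketch of random assignment, assuming 20 hypothetical participants and two treatments labelled A and B. The names and the function are invented for the example, and a shuffle stands in for the table of random numbers and the coin toss.

```python
import random

def randomly_assign(participants, seed=None):
    """Split available participants into two groups by chance,
    then let chance decide which group gets which treatment."""
    rng = random.Random(seed)
    shuffled = participants[:]      # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    groups = [shuffled[:half], shuffled[half:]]
    treatments = ["treatment A", "treatment B"]
    rng.shuffle(treatments)         # plays the role of the coin toss
    return dict(zip(treatments, groups))

participants = [f"P{i:02d}" for i in range(1, 21)]  # hypothetical sample
assignment = randomly_assign(participants, seed=5)
print({treatment: len(group) for treatment, group in assignment.items()})
```

Random assignment operates on a sample that has already been obtained, so it says nothing about how that sample was selected from the population.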

Sample Size

How large should a sample be?

Answer?

The most important characteristic of a sample is its

representativeness, not its size.

A random sample of 200 is better than a random sample of

100, but a random sample of 100 is better than a biased

sample of 2.5 million.


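Power Calculations

Sample size

Quantitative: $n = \dfrac{Z^{2}\sigma^{2}}{D^{2}}$

Qualitative: $n = \dfrac{Z^{2}\,\pi(1-\pi)}{D^{2}}$

Quantitative: $n = \dfrac{(\sigma_{1}^{2}+\sigma_{2}^{2}) \times F}{D^{2}}$

Qualitative: $n = \dfrac{2P(1-P)\,F}{D^{2}}$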
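As a rough illustration, the first pair of formulas on this slide can be read as n = Z²σ²/D² for a quantitative variable and n = Z²π(1 − π)/D² for a qualitative one. The slide does not define its symbols, so the sketch below assumes that Z is the standard normal value for the chosen confidence level (1.96 for 95%), σ the population standard deviation, π the expected proportion, and D the largest acceptable difference between the estimate and the true value. The function names and input values are hypothetical.

```python
import math

def n_for_mean(z, sigma, d):
    """n = Z^2 * sigma^2 / D^2, rounded up to a whole subject."""
    return math.ceil(z**2 * sigma**2 / d**2)

def n_for_proportion(z, pi, d):
    """n = Z^2 * pi * (1 - pi) / D^2, rounded up to a whole subject."""
    return math.ceil(z**2 * pi * (1 - pi) / d**2)

# Hypothetical inputs: 95% confidence (Z = 1.96).
print(n_for_mean(1.96, sigma=15, d=2))          # 217
print(n_for_proportion(1.96, pi=0.5, d=0.05))   # 385
```

Halving D roughly quadruples the required n, since D enters the formulas squared.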

Conclusions

Ensure

Representativeness

Precision

Errors in sample

Inaccurate response (information bias)

Selection bias

Sampling error: “the difference between a population parameter and a sample statistic.” For example,

if you know the mean of the entire population (symbolized μ) and also the mean of a

random sample (symbolized X̄) from that population, the difference between these

two (X̄ − μ) represents sampling error (symbolized e).

Thus, e = X̄ − μ. For example, if you know that the mean intelligence score for a

population of 10,000 fourth-graders is μ = 100 and a particular random sample of 200

has a mean of X̄ = 99, then the sampling error is X̄ − μ = 99 − 100 = −1.
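The arithmetic of the fourth-grader example, followed by a small simulation, can be sketched as below. The simulated scores are artificial (normally distributed with mean 100 and standard deviation 15, an assumption made only for this illustration), and the function name is hypothetical.

```python
import random
from statistics import mean

def sampling_error(sample_mean, population_mean):
    """e = sample mean minus population mean."""
    return sample_mean - population_mean

# The worked example: sample mean 99, population mean 100.
print(sampling_error(99, 100))  # -1

# Simulated illustration: 10,000 artificial scores, random sample of 200.
rng = random.Random(0)
population = [rng.gauss(100, 15) for _ in range(10_000)]
mu = mean(population)
sample = rng.sample(population, 200)
e = sampling_error(mean(sample), mu)
print(round(e, 2))
```

Drawing a different random sample gives a different value of e, which is why sampling error is described statistically and not as one fixed number.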

Type 1 error

The probability of finding a difference with our sample compared to the

population when no real difference exists

The investigator will either retain or reject the null hypothesis. Either

decision may be correct or wrong. If the null hypothesis is true, the

investigator is correct in retaining it and in error in rejecting it. The

rejection of a true null hypothesis is labeled a Type I error.

Usually set at 5% (or 0.05)
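A small simulation can illustrate what setting the Type I error at 5% means. The sketch assumes a true null hypothesis (every sample really does come from a population with mean 100), a known standard deviation of 15, samples of 30, and a two-tailed z test with critical value 1.96. None of these figures come from the slides, and the function name is hypothetical.

```python
import random

def type_one_error_rate(trials=5000, n=30, mu=100, sigma=15, z_crit=1.96, seed=0):
    """Share of samples for which a true null hypothesis is wrongly rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true: the sample comes from the stated population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu) / (sigma / n ** 0.5)
        if abs(z) > z_crit:  # reject the null hypothesis at the 5% level
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(rate)  # close to 0.05
```

Even though no real difference exists, about one sample in twenty leads to rejection, purely through sampling error.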

Type 2 error

The probability of not finding a difference that actually

exists between our sample and the population

If the null hypothesis is false, the investigator is in error

in retaining it and correct in rejecting it. The retention of

a false null hypothesis is labeled a Type II error.

