Types of Sampling Design

A sampling design specifies the technique used to select objects from a set of related objects.
Surveys of woodfuel consumption, supply and provision are basically conducted by means of
sampling techniques. This means that by studying a small group (sample) selected at random one
obtains information on variables of interest to a larger group (universe [6]), thus permitting
inferences as to the behaviour of these variables within the universe. This procedure is adopted
because surveying an entire universe (unless very small) entails high costs.
3.1 UNIVERSE
The universe must be defined in the light of the objectives of the survey. It can be expressed in
geographical terms (locality, municipality, district, province, country or some intermediate
category) or in sectoral terms (urban population, pottery manufacturers, fuelwood producers). It
is also necessary to place time limits on the definition of the universe, because its composition
and characteristics can change over time. It is recommended that the universe be given spatial
limits that coincide with standard or official groupings (political, administrative, natural, etc.) in
common use in countries, so that its dimensions can be estimated from information already
available.
The universe is given a preliminary definition at the start of the methodological design of the
survey. It will subsequently be refined once its size and spatial and temporal distribution are
known by reviewing existing information. The redefinition may mean extending or reducing the
universe. An extension may be called for when it is realized that an area exists with sizeable
woodfuel use or where there is real or potential supply. Causes for universe reduction might be
that the scarcity of information on supply and demand in a certain area is such that its inclusion
in the survey would introduce greater error than its elimination; or the realization that a given
locality or area does not form part of the universe because it has no major users.
3.2 SAMPLING FRAME
Once the universe has been defined, information that is as precise as possible has to be sought on
its dimensions and spatial and temporal distribution in order to construct the sampling frame,
this being the basis on which to develop the sampling design. The sampling frame is the
information that locates and defines the dimensions of the universe and may consist of housing
censuses and maps grouped by locality, district, quarter, etc.; maps of forest cover with types of
vegetation or land use; or housing lists in small localities. Constructing the sampling frame is
described in the sections on General Variables Supply, Demand and Provision (Chapter 2).
3.3 SAMPLING UNIT
A basic concept in sampling theory is the sampling unit, which is the minimum unit of
observation for information on the operative variables. The sampling unit must be clearly
defined for constructing the sampling frame. By convention in statistics, a capital N is used to
refer to the number of sampling units making up the universe, and a lowercase n for the
number of sampling units in the sample itself. The sampling unit best suited for the respective
sectors is shown in Table 3.1. Other sampling units can be defined as suggested by the objective
of the survey.
Table 3.1: Sampling unit for thematic group and sector or branch under examination
Group      | Sector/branch                                        | Sampling unit
Demand     | Residential (rural, urban)                           | Home
Demand     | Industrial                                           | Establishment
Demand     | Commercial                                           | Establishment
Demand     | Institutional                                        | Establishment
Supply     | Direct                                               | Plot
Supply     | Indirect                                             | Establishment
Provision  | Producers, transport operators, commercial suppliers | Individual producers, companies
Once the universe and sampling unit have been defined, and once the sampling frame is ready,
the sample design comprises two major stages: definition of type of sampling and determination
of sample size.
3.4 TYPES OF SAMPLING
There are different types of sampling, but all are based on the principle of randomness. In order
to be able to make valid inferences from a sample for transposition to the universe, the sample
must be representative of the universe; and this is achieved by its randomness and adequate
size.
The basis for statistical inference, then, is randomness. This means that all the elements making
up the universe have the same chance of being selected to form the sample. If the selection is
not random, there is a serious risk that the findings will not be representative of the whole
population, but of a section only. This is referred to as bias. An example of bias due to non-random
selection in an inventory of wood resources occurs when the plots selected are those in
the vicinity of access roads, which are likely to be more heavily visited and have smaller stocks
of wood. Extrapolating the results of this non-random sample to the universe would lead to an
underestimation of stocks.
Sample size will depend on the variability of the phenomenon under study, the level of
confidence set and acceptable error. One common mistake is to think that for a sample to be
representative of a universe, it must be directly proportional in size to that of the universe, in
other words, the larger the universe, the larger the sample. This is not true and details on how to
arrive at the required sample size are given later.
3.4.1 Simple random sampling
This consists in randomly selecting n sampling units (SUs) from the universe, in a way that gives
all the SUs the same opportunity of being selected.
A simple random sample is a subset of a statistical population in which each member of the
population has an equal probability of being chosen. It is meant to be an unbiased representation of a group.
An example of a simple random sample would be a group of 25 employees chosen out of a hat
from a company of 250 employees. In this case, the population is all 250 employees, and the
sample is random because each employee has an equal chance of being chosen.
The steps involved in simple random sampling (SRS) are as follows:
Assign a single number to each element in the sampling frame.
Use random numbers to select elements into the sample until the desired number of
cases is obtained.
The method is not very different from winning a lottery.
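The two steps above can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the function name `simple_random_sample`, the `seed` parameter and the list of 250 hypothetical employees are assumptions made for the example.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct sampling units from the frame, each with equal chance."""
    if n > len(frame):
        raise ValueError("sample size n cannot exceed the frame size N")
    rng = random.Random(seed)
    # Step 1: assign a single number (here 1..N) to each element of the frame.
    numbered = dict(enumerate(frame, start=1))
    # Step 2: draw n distinct random numbers and look up the matching elements.
    chosen_numbers = rng.sample(range(1, len(frame) + 1), n)
    return [numbered[k] for k in chosen_numbers]


# Hypothetical frame: a company of 250 employees, of whom 25 are drawn.
employees = [f"employee_{i:03d}" for i in range(1, 251)]
sample = simple_random_sample(employees, 25, seed=1)
print(len(sample), len(set(sample)))
```

Drawing without replacement (`random.sample`) ensures that no employee appears twice in the sample.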
Each SU is assigned a number and the sample is selected randomly from tables of random
numbers, calculators, lots, etc. This technique can only be used when there is a complete
sampling frame that includes all the sampling units and where these can be readily recognised
and identified in the field, for example a telephone directory or a list of homes identified by
street and number or the name of the occupant. When constructing a sample of natural
resources, it is usually difficult to identify or locate the selected plots accurately, as this requires
a detailed map and instruments for precise geographical location.
When simple random sampling should be used:
When it is known that the variable of greatest interest is randomly distributed within the
universe
With small universes (not more than 200 SUs)
With universes with little geographical dispersion
When the pattern of distribution for the variable under study is not known
3.4.2 Stratified random sampling
Stratified random sampling is used when the whole universe of size N is broken down into
relatively homogeneous strata for the variable under study. This is advisable provided the
variation between strata is greater than the variation within each stratum.
Regarding the selection of sampling units and estimation of parameters, each stratum is treated
independently, as if it were a universe on its own. Within each stratum, the sampling units can
be selected at random, by clusters or systematically.
Stratified sampling makes it possible to improve the precision of estimates with reduced
sampling effort, to characterize each stratum separately and to facilitate field work.
It is most important to realize that the sampling units should belong to only one stratum, that the
strata should be recognizable by people outside the survey group, and that the actual size of the
stratum should be known. It is not advisable to form a large number of strata, because this
would unnecessarily complicate field surveying and data analysis.
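As a rough sketch of how each stratum can be treated as a universe of its own, the Python example below draws a simple random sample inside every stratum. Proportional allocation of the total sample among strata is an assumption of this example (the text does not prescribe an allocation rule), and the stratum names and sizes are hypothetical.

```python
import random


def stratified_sample(strata, n, seed=None):
    """strata: dict mapping stratum name -> list of sampling units.

    Returns a dict mapping stratum name -> list of sampled units.
    """
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Proportional allocation (an assumption): n_h = n * N_h / N, at least 1.
        # Rounding means the stratum sizes may not add up exactly to n.
        n_h = max(1, round(n * len(units) / total))
        n_h = min(n_h, len(units))
        # Each stratum is sampled independently, as if it were its own universe.
        sample[name] = rng.sample(units, n_h)
    return sample


# Hypothetical universe of 1000 homes split into two strata.
strata = {
    "rural": [f"rural_home_{i}" for i in range(600)],
    "urban": [f"urban_home_{i}" for i in range(400)],
}
result = stratified_sample(strata, 50, seed=7)
print({name: len(units) for name, units in result.items()})
```

Note that the function needs the actual size of every stratum, which is why the text insists that stratum sizes be known in advance.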
When it comes to deciding on a stratified sample there are general criteria that one can apply. In
the group on woodfuel demand, the advisability of stratification is defined in the first instance
by the patterns of saturation and consumption. In the direct supply group, stratification is done
by source and type of land cover or use. For the indirect supply group and providers, producers,
transport operators and traders, volume of production or sales is used. Since these are variables
that need to be known before the survey takes place, the relevant data can be obtained from
secondary sources or from indicator variables, as described in Chapter 2.
When stratified sampling should be used:
It is recommended for universes where it is supposed or known that distribution of the key
variable(s) differs between readily identifiable sub-universes;
Because of its low sampling efficiency, it is not recommended for small universes with fewer
than 200 sampling units and variables showing normal distribution.
3.4.3 Sampling by clusters

Cluster sampling is a sampling technique used when "natural" groupings are evident in a
statistical population. It is often used in marketing research. In this technique, the total
population is divided into these groups (or clusters) and a sample of the groups is selected. Then
the required information is collected from the elements within each selected group. This may be
done for every element in these groups or a subsample of elements may be selected within each
of these groups. A common motivation for cluster sampling is to reduce the average cost per
interview. Given a fixed budget, this can allow an increased sample size. Assuming a fixed
sample size, the technique gives more accurate results when most of the variation in the
population is within the groups, not between them.
Cluster elements
Elements within a cluster should ideally be as heterogeneous as possible, but there should be
homogeneity between cluster means. Each cluster should be a small scale representation of the
total population. The clusters should be mutually exclusive and collectively exhaustive. A
random sampling technique is then used on any relevant clusters to choose which clusters to
include in the study. In single-stage cluster sampling, all the elements from each of the selected
clusters are used. In two-stage cluster sampling, a random sampling technique is applied to the
elements from each of the selected clusters.
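The difference between single-stage and two-stage cluster sampling can be illustrated with a short Python sketch. The function name `cluster_sample`, its parameters and the village data are hypothetical and serve only to make the two variants concrete.

```python
import random


def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """clusters: dict mapping cluster id -> list of elements.

    n_per_cluster=None gives single-stage sampling (all elements are kept);
    an integer gives two-stage sampling (a random subsample per cluster).
    """
    rng = random.Random(seed)
    # Stage 1: randomly choose which clusters enter the study.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for cid in chosen:
        elements = clusters[cid]
        if n_per_cluster is None:
            # Single-stage: every element of a selected cluster is used.
            sample[cid] = list(elements)
        else:
            # Stage 2: random subsample within the selected cluster.
            k = min(n_per_cluster, len(elements))
            sample[cid] = rng.sample(elements, k)
    return sample


# Hypothetical universe: 10 villages (clusters) of 20 homes each.
villages = {f"village_{v}": [f"v{v}_home_{h}" for h in range(20)] for v in range(10)}
one_stage = cluster_sample(villages, 3, seed=3)
two_stage = cluster_sample(villages, 3, n_per_cluster=5, seed=3)
print({cid: len(homes) for cid, homes in two_stage.items()})
```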
The main difference between cluster sampling and stratified sampling is that in cluster sampling
the cluster is treated as the sampling unit so analysis is done on a population of clusters (at least
in the first stage). In stratified sampling, the analysis is done on elements within strata. In
stratified sampling, a random sample is drawn from each of the strata, whereas in cluster
sampling only the selected clusters are studied. The main objective of cluster sampling is to
reduce costs by increasing sampling efficiency. This contrasts with stratified sampling where the
main objective is to increase precision.
There also exists multistage sampling, where more than two steps are taken in selecting clusters
from clusters.
Aspects of cluster sampling
One version of cluster sampling is area sampling or geographical cluster sampling. Clusters
consist of geographical areas. Because a geographically dispersed population can be expensive to
survey, greater economy than simple random sampling can be achieved by treating several
respondents within a local area as a cluster. It is usually necessary to increase the total sample
size to achieve equivalent precision in the estimators, but cost savings may make that feasible.
In some situations cluster sampling is only appropriate when the clusters are approximately the
same size. This can be achieved by combining clusters. If this is not possible, probability
proportionate to size sampling is used. In this method, the probability of selecting any cluster
varies with the size of the cluster, giving larger clusters a greater probability of selection and
smaller clusters a lower probability. However, if clusters are selected with probability
proportionate to size, the same number of interviews should be carried out in each sampled
cluster so that each unit sampled has the same probability of selection.
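The claim that a fixed number of interviews per cluster equalizes selection probabilities can be checked with exact arithmetic. The sketch below assumes that the chance of a cluster being selected equals the number of clusters drawn times its share of the total size, and that units inside a selected cluster are drawn with equal probability; the cluster names and sizes are hypothetical.

```python
from fractions import Fraction


def overall_selection_probability(cluster_sizes, n_clusters, interviews_per_cluster):
    """Return, per cluster, the chance that any one of its units is sampled.

    Assumes P(cluster selected) = n_clusters * size / total size, followed by
    an equal-probability draw of a fixed number of units inside the cluster.
    """
    total = sum(cluster_sizes.values())
    probs = {}
    for cid, size in cluster_sizes.items():
        p_cluster = Fraction(n_clusters * size, total)
        p_unit_given_cluster = Fraction(interviews_per_cluster, size)
        # The cluster size cancels out in the product.
        probs[cid] = p_cluster * p_unit_given_cluster
    return probs


# Hypothetical clusters of unequal size; 2 clusters drawn, 10 interviews in each.
sizes = {"A": 100, "B": 200, "C": 300, "D": 400}
probs = overall_selection_probability(sizes, n_clusters=2, interviews_per_cluster=10)
print(probs)
```

Every unit ends up with the same probability (here 1/50) because the larger chance of selecting a big cluster is offset by the smaller chance of being interviewed within it.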
Clusters are spatially compact groups of sampling units. They are selected randomly and within
each cluster all the sampling units are studied or subjected to further sampling.
When sampling by clusters should be used:
When there is considerable difficulty in reaching every sampling
unit in the universe, because of wide dispersion or physical barriers to access.
3.4.4 Systematic sampling selection
Systematic sampling is a type of probability sampling method in which sample members from a larger population are
selected according to a random starting point and a fixed, periodic interval. This interval, called
the sampling interval, is calculated by dividing the population size by the desired sample size.
Despite the sample population being selected in advance, systematic sampling is still thought of
as being random, provided the periodic interval is determined beforehand and the starting point
is random.
Steps:
Calculate the sampling interval as the ratio between population size and sample size, I =
N/n.
Arrange all elements in the population in an order.
Select a case in the first interval randomly.
Select every I-th case from this point.
Systematic sampling is easier and simpler than SRS.
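The steps listed above can be sketched as follows. Taking the integer part of N/n as the sampling interval is an assumption of this example, as are the function name and the hypothetical list of 1000 homes.

```python
import random


def systematic_sample(population, n, seed=None):
    """Select n units at a fixed interval after a random start."""
    N = len(population)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the population size N")
    rng = random.Random(seed)
    # Sampling interval I = N / n (integer part, an assumption of this sketch).
    interval = N // n
    # Random case within the first interval.
    start = rng.randrange(interval)
    # Every I-th case from the starting point.
    return [population[start + j * interval] for j in range(n)]


# Hypothetical ordered list of 1000 homes, of which 50 are selected.
homes = list(range(1, 1001))
sample = systematic_sample(homes, 50, seed=5)
print(sample[:3])
```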
This is not strictly speaking a type of sampling and is best considered as a frame for regular
sample selection.
The first sampling unit is chosen at random, while the remainder are selected at regular
intervals of unit, distance or time. Its theoretical limitation lies in the fact that only the first
number is chosen at random and the remainder do not have the same probability of being
included in the sample. Its advantage is that it facilitates location of the sampling units in areas
of difficult access and permits visits to sampling units that are not included in the sampling
frame.
When systematic selection should be used:
Whenever it is not possible to identify every sampling unit within the sample frame, e.g. in
large towns where lists of homes are not kept.
When access to sampling units is difficult because of distance, lack of roads or difficult
terrain, e.g. in forest inventories.
Combining several types of sampling
It is possible to combine different types of sampling within the same survey, depending on the
characteristics of the sectors or branches concerned and the degree of acceptable trade-off
between precision and cost of the exercise. For example, in a residential sector one may opt for
a two-stage stratified sample in clusters, whereas for a small homogeneous and compact
industrial branch, simple random sampling may be preferred.
3.5 SAMPLE SIZE
Sample size must be determined independently for each universe, according to three factors: the
variability of the most important numerical variable, the level of confidence required and the
acceptable level of error. This is summarized by the following formula [7]:
$$n_0 = \frac{s^2 \, t^2_{\alpha,\nu}}{e^2} \qquad (1)$$
in terms of variance and absolute error, or
$$n_0 = \frac{cv^2 \, t^2_{\alpha,\nu}}{e^2}$$
in terms of variation coefficient and relative error,
where:
$n_0$ = size of sample
$s^2$ = variance of the sample
$t_{\alpha,\nu}$ = critical value of Student's t test with significance level $\alpha$ and $\nu$ degrees of freedom
$e$ = acceptable error
$cv$ = variation coefficient = standard deviation of the sample / sample mean
$\nu$ = degrees of freedom = $n - 1$
Variance ($s^2$) and variation coefficient ($cv$) indicate the degree of homogeneity of the variable
under consideration in the sample. These are calculated, manually, by calculator or with Excel,
from the data of a preliminary sample or earlier survey.
Acceptable error (e) refers to the allowable difference between sample mean and mean of the
universe. It is set in accordance with previous knowledge of the phenomenon under study, and
it is advisable to keep it within 10-20% - which can also be expressed in absolute values with
the units of measurement of the variable in question.
The critical value of t is obtained from tables in statistics books or from Excel, selecting first
the level of significance ($\alpha$) or its complement, the level of confidence ($1 - \alpha$). A level of
confidence of 0.95, which is equivalent to $\alpha = 0.05$, is enough for surveys of this kind. In
addition, in order to define the degrees of freedom ($\nu = n - 1$), a first assessment of the number
(n) of cases in the sample is needed. These two values are the entry data for the tables.
Subsequently, the sample size is specified by means of an iterative process: the value of
n obtained from Formula (1) is used to determine a new value of t, and the calculation is repeated
until n stabilizes.
This formula shows that the number of elements making up the sample is directly proportional
to the variance and the value of $t^2$, and inversely proportional to the square of the error. The sample
size will be large when: (a) the element under study is highly variable (high variance or
variation coefficient); (b) the level of confidence sought is high; and/or (c) the acceptable error
is low. Conversely, the sample size will be small if the phenomenon shows little variance, a low
level of confidence is set, and a high level of error is accepted.
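The iterative use of Formula (1) can be sketched in Python. Several things here are assumptions of the example rather than part of the method: the short embedded table of two-sided critical t values for a significance level of 0.05, the conservative lookup that uses the largest tabulated degrees of freedom not exceeding the actual value, the starting guess of 30, and the hypothetical variation coefficient of 0.6.

```python
import math

# Two-sided critical values of Student's t for alpha = 0.05,
# keyed by degrees of freedom (a small excerpt of a standard table).
T_TABLE_95 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
    15: 2.131, 20: 2.086, 25: 2.060, 30: 2.042, 40: 2.021,
    60: 2.000, 120: 1.980,
}


def t_critical_95(df):
    """Use the largest tabulated df not exceeding df (slightly conservative)."""
    usable = [d for d in T_TABLE_95 if d <= df]
    if not usable:
        raise ValueError("need at least 1 degree of freedom")
    return T_TABLE_95[max(usable)]


def sample_size(cv, e, n_start=30, max_iter=50):
    """Iterate n0 = cv^2 * t^2 / e^2, with t taken at n - 1 degrees of freedom."""
    n = n_start
    for _ in range(max_iter):
        t = t_critical_95(n - 1)
        n_new = max(2, math.ceil(cv ** 2 * t ** 2 / e ** 2))
        if n_new == n:
            break  # n has stabilized
        n = n_new
    return n


# Hypothetical inputs: variation coefficient 0.6, relative error 10% or 20%.
print(sample_size(cv=0.6, e=0.1), sample_size(cv=0.6, e=0.2))
```

Halving the acceptable error roughly quadruples the required sample, which reflects the inverse proportionality to the square of the error.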
From this it is clear that the size of a sample does not depend on the size of the universe. Thus,
starting with an equal level of confidence and error acceptance in a tropical rainforest covering
the same surface area as a temperate pine forest, the sample size will be larger for the rainforest
because of its greater heterogeneity in the wood stock variable in relation to the pine forest.
So far no consideration has been given to the size of the universe in determining sample size.
Nevertheless, for a small universe (fewer than 120 sampling units), it is necessary to correct the
value of $n_0$ obtained from Formula (1), by using Formula (2) [8]:
$$n = \frac{n_0}{1 + n_0 / N} \qquad (2)$$
where:
$n_0$ = sample size obtained from Formula (1)
$N$ = size of universe
$n$ = definitive size of sample
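A small numerical illustration of Formula (2) follows. The input values (an uncorrected sample size of 142 and universes of 100 and one million sampling units) are hypothetical, and rounding up to a whole sampling unit is an assumption of this sketch.

```python
import math


def finite_population_correction(n0, N):
    """Formula (2): n = n0 / (1 + n0 / N), rounded up to a whole sampling unit."""
    return math.ceil(n0 / (1 + n0 / N))


small_universe = finite_population_correction(142, 100)
large_universe = finite_population_correction(142, 10 ** 6)
print(small_universe, large_universe)
```

For the small universe the corrected size drops well below the uncorrected one, while for the very large universe the correction has practically no effect.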
Annex III gives the calculated sample size for the estimation of fuelwood consumption in a
residential sector for varying universe size and error margin and corrected for finite population.
It applies for the variable specific fuelwood consumption, where, due to the abundance of
case studies, the variation coefficient is known.
Variables to be used in calculating sample size
To define sample size of any sector or branch of woodfuel demand, it is best to use the
unit consumption variable.
In the industrial, commercial and institutional sectors it is not always possible to find
data on unit consumption, but one can use the volume of production per unit time,
which is closely correlated with unit consumption.
In the case of direct supply (from forest, plantation, etc.) the important variables may
be stock or productivity, but the first is recommended as there is more secondary
information and it is easier to measure in a preliminary sample. If there are no data on
stock, basal area data (G) may be used.
In sectors or branches of indirect supply (sawmills, carpentry workshops, etc.), volume
of production per unit time must be used.
For provision sectors: in the case of producers, it is best to use volume of woodfuel
production; traders, volume of sales; and transport operators, transport capacity, all
expressed per unit time.
The final decision on the size of the sample will depend on the agreed trade-off between
desired accuracy and availability of monetary, human and time resources for conducting
the field survey. It is recommended that sectors or branches having greater importance in
woodfuel demand, supply and provision be given priority in the allocation of resources for field
surveys so that estimations can be more accurate. In situations where it is not possible to realize
the sample size determined by statistical calculation, it is essential to survey at least ten sample
units per sector, branch or stratum, and to indicate the error in estimation, finding the e value of
Formula (1).
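Solving Formula (1) for e gives the error achieved with a given sample size, e = cv * t / sqrt(n). The sketch below applies this to the minimum of ten sampling units; the variation coefficient of 0.6 is a hypothetical value, and t = 2.262 is the two-sided critical value for a confidence level of 0.95 with 9 degrees of freedom.

```python
import math


def achieved_relative_error(cv, t, n):
    """Formula (1) rearranged for the error: e = cv * t / sqrt(n)."""
    return cv * t / math.sqrt(n)


# Ten sampling units, hence 9 degrees of freedom and t = 2.262 at 95% confidence.
e = achieved_relative_error(cv=0.6, t=2.262, n=10)
print(f"{e:.1%}")
```

With these assumed inputs the relative error is above 40%, which is why the error should be reported alongside any estimate based on such a small sample.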
[6] In statistics, the universe is also referred to as the population.
[7] Formula used to determine the sample size needed to estimate the population mean; for
hypothesis testing for differences between means and variances other formulas are available.
Useful statistics reference books include Zar 1999, Cochran 1977, and Steel and Torrie 1988.
[8] Also termed correction for finite population.
Non probability sample

A quota sample is a type of non-probability sample in which the researcher selects people
according to some fixed quota. That is, units are selected into a sample on the basis of
pre-specified characteristics so that the total sample has the same distribution of characteristics
assumed to exist in the population being studied. For example, if you are a researcher conducting
a national quota sample, you might need to know what proportion of the population is male and
what proportion is female as well as what proportions of each gender fall into different age
categories, race or ethnic categories, educational categories, etc. The researcher would then
collect a sample with the same proportions as the national population.
Nonprobability Sampling
The difference between nonprobability and probability sampling is that nonprobability sampling
does not involve random selection and probability sampling does. Does that mean that
nonprobability samples aren't representative of the population? Not necessarily. But it does mean
that nonprobability samples cannot depend upon the rationale of probability theory. At least with
a probabilistic sample, we know the odds or probability that we have represented the population
well. We are able to estimate confidence intervals for the statistic. With nonprobability samples,
we may or may not represent the population well, and it will often be hard for us to know how
well we've done so. In general, researchers prefer probabilistic or random sampling methods over
non probabilistic ones, and consider them to be more accurate and rigorous. However, in applied
social research there may be circumstances where it is not feasible, practical or theoretically
sensible to do random sampling. Here, we consider a wide range of non-probabilistic
alternatives.
We can divide nonprobability sampling methods into two broad types: accidental or purposive.
Most sampling methods are purposive in nature because we usually approach the sampling
problem with a specific plan in mind. The most important distinctions among these types of
sampling methods are the ones between the different types of purposive sampling approaches.
Accidental, Haphazard or Convenience Sampling
One of the most common methods of sampling goes under the various titles listed here. I would
include in this category the traditional "man on the street" (of course, now it's probably the
"person on the street") interviews conducted frequently by television news programs to get a
quick (although nonrepresentative) reading of public opinion. I would also argue that the typical
use of college students in much psychological research is primarily a matter of convenience.
(You don't really believe that psychologists use college students because they believe they're
representative of the population at large, do you?). In clinical practice, we might use clients who
are available to us as our sample. In many research contexts, we sample simply by asking for
volunteers. Clearly, the problem with all of these types of samples is that we have no evidence
that they are representative of the populations we're interested in generalizing to -- and in many
cases we would clearly suspect that they are not.
Purposive Sampling
In purposive sampling, we sample with a purpose in mind. We usually would have one or more
specific predefined groups we are seeking. For instance, have you ever run into people in a mall
or on the street who are carrying a clipboard and who are stopping various people and asking if
they could interview them? Most likely they are conducting a purposive sample (and most likely
they are engaged in market research). They might be looking for Caucasian females between 30 and 40
years old. They size up the people passing by and anyone who looks to be in that category
they stop to ask if they will participate. One of the first things they're likely to do is verify that
the respondent does in fact meet the criteria for being in the sample. Purposive sampling can be
very useful for situations where you need to reach a targeted sample quickly and where sampling
for proportionality is not the primary concern. With a purposive sample, you are likely to get the
opinions of your target population, but you are also likely to overweight subgroups in your
population that are more readily accessible.
All of the methods that follow can be considered subcategories of purposive sampling methods.
We might sample for specific groups or types of people as in modal instance, expert, or quota
sampling. We might sample for diversity as in heterogeneity sampling. Or, we might capitalize
on informal social networks to identify specific respondents who are hard to locate otherwise, as
in snowball sampling. In all of these methods we know what we want -- we are sampling with a
purpose.
Modal Instance Sampling
In statistics, the mode is the most frequently occurring value in a distribution. In sampling, when
we do a modal instance sample, we are sampling the most frequent case, or the "typical" case. In
a lot of informal public opinion polls, for instance, they interview a "typical" voter. There are a
number of problems with this sampling approach. First, how do we know what the "typical" or
"modal" case is? We could say that the modal voter is a person who is of average age,
educational level, and income in the population. But, it's not clear that using the averages of
these is the fairest (consider the skewed distribution of income, for instance). And, how do you
know that those three variables -- age, education, income -- are the only or even the most
relevant for classifying the typical voter? What if religion or ethnicity is an important
discriminator? Clearly, modal instance sampling is only sensible for informal sampling contexts.
Expert Sampling
Expert sampling involves the assembling of a sample of persons with known or demonstrable
experience and expertise in some area. Often, we convene such a sample under the auspices of a
"panel of experts." There are actually two reasons you might do expert sampling. First, because it
would be the best way to elicit the views of persons who have specific expertise. In this case,
expert sampling is essentially just a specific subcase of purposive sampling. But the other reason
you might use expert sampling is to provide evidence for the validity of another sampling
approach you've chosen. For instance, let's say you do modal instance sampling and are
concerned that the criteria you used for defining the modal instance are subject to criticism. You
might convene an expert panel consisting of persons with acknowledged experience and insight
into that field or topic and ask them to examine your modal definitions and comment on their
appropriateness and validity. The advantage of doing this is that you aren't out on your own
trying to defend your decisions -- you have some acknowledged experts to back you. The
disadvantage is that even the experts can be, and often are, wrong.
Quota Sampling
In quota sampling, you select people non randomly according to some fixed quota. There are two
types of quota sampling : proportional and non-proportional. In proportional quota sampling you
want to represent the major characteristics of the population by sampling a proportional amount
of each. For instance, if you know the population has 40% women and 60% men, and that you
want a total sample size of 100, you will continue sampling until you get those percentages and
then you will stop. So, if you've already got the 40 women for your sample, but not the sixty
men, you will continue to sample men but even if legitimate women respondents come along,
you will not sample them because you have already "met your quota." The problem here (as in
much purposive sampling) is that you have to decide the specific characteristics on which you
will base the quota. Will it be by gender, age, education, race, religion, etc.?
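The stopping rule of proportional quota sampling can be sketched in a few lines of Python, using the 40 women and 60 men of the example above. The function name and the alternating stream of hypothetical passers-by are assumptions of this illustration.

```python
def proportional_quota_sample(respondents, quotas):
    """respondents: iterable of (id, category) in the order they are met.

    quotas: dict mapping category -> number of respondents wanted.
    """
    remaining = dict(quotas)
    sample = []
    for rid, category in respondents:
        # Accept a respondent only while the quota for the category is open.
        if remaining.get(category, 0) > 0:
            sample.append((rid, category))
            remaining[category] -= 1
        # Stop as soon as every quota has been met.
        if not any(remaining.values()):
            break
    return sample


quotas = {"woman": 40, "man": 60}
# Hypothetical stream of passers-by, alternating between women and men.
stream = [(i, "woman" if i % 2 == 0 else "man") for i in range(200)]
sample = proportional_quota_sample(stream, quotas)
women = sum(1 for _, c in sample if c == "woman")
men = sum(1 for _, c in sample if c == "man")
print(women, men)
```

Once the quota of women is filled, further women in the stream are skipped even though they are legitimate respondents, exactly as described in the text.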
Nonproportional quota sampling is a bit less restrictive. In this method, you specify the
minimum number of sampled units you want in each category. Here, you're not concerned with
having numbers that match the proportions in the population. Instead, you simply want to have
enough to assure that you will be able to talk about even small groups in the population. This
method is the nonprobabilistic analogue of stratified random sampling in that it is typically used
to assure that smaller groups are adequately represented in your sample.
Heterogeneity Sampling
We sample for heterogeneity when we want to include all opinions or views, and we aren't
concerned about representing these views proportionately. Another term for this is sampling
for diversity. In many brainstorming or nominal group processes (including concept mapping),
we would use some form of heterogeneity sampling because our primary interest is in getting a
broad spectrum of ideas, not identifying the "average" or "modal instance" ones. In effect, what
we would like to be sampling is not people, but ideas. We imagine that there is a universe of all
possible ideas relevant to some topic and that we want to sample this population, not the
population of people who have the ideas. Clearly, in order to get all of the ideas, and especially
the "outlier" or unusual ones, we have to include a broad and diverse range of participants.
Heterogeneity sampling is, in this sense, almost the opposite of modal instance sampling.
Snowball Sampling
In snowball sampling, you begin by identifying someone who meets the criteria for inclusion in
your study. You then ask them to recommend others who they may know who also meet the
criteria. Although this method would hardly lead to representative samples, there are times when
it may be the best method available. Snowball sampling is especially useful when you are trying
to reach populations that are inaccessible or hard to find. For instance, if you are studying the
homeless, you are not likely to be able to find good lists of homeless people within a specific
geographical area. However, if you go to that area and identify one or two, you may find that
they know very well who the other homeless people in their vicinity are and how you can find
them.
Researchers use this sampling method if the sample for the study is very rare or is limited to a
very small subgroup of the population. This type of sampling technique works like chain referral.
After observing the initial subject, the researcher asks for assistance from the subject to help
identify people with a similar trait of interest.
The process of snowball sampling is much like asking your subjects to nominate another person
with the same trait as your next subject. The researcher then observes the nominated subjects and
continues in the same way until a sufficient number of subjects has been obtained.
For example, if obtaining subjects for a study that wants to observe a rare disease, the researcher
may opt to use snowball sampling since it will be difficult to obtain subjects. It is also possible
that the patients with the same disease have a support group; being able to observe one of the
members as your initial subject will then lead you to more subjects for the study.
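The chain-referral process can be pictured as following nominations outward from the initial subjects. The sketch below is only one possible way to model it: the referral network, the subject names and the breadth-first order in which nominees are visited are all assumptions of the example.

```python
from collections import deque


def snowball_sample(referrals, seeds, target_size):
    """referrals: dict mapping subject -> list of subjects they nominate.

    Follows nominations breadth-first from the seeds until target_size
    subjects have been observed or no nominees remain.
    """
    seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < target_size:
        subject = queue.popleft()
        sample.append(subject)
        # Ask the observed subject to nominate others with the same trait.
        for nominee in referrals.get(subject, []):
            if nominee not in seen:
                seen.add(nominee)
                queue.append(nominee)
    return sample


# Hypothetical referral network among members of a support group.
referrals = {
    "ana": ["ben", "cho"],
    "ben": ["dev"],
    "cho": ["ana", "eli"],
    "dev": [],
    "eli": ["fay"],
}
print(snowball_sample(referrals, ["ana"], 4))
```

The sample depends entirely on who the seed subjects know, which is the source of the bias discussed in the disadvantages of this method.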
Types of Snowball Sampling
Linear Snowball Sampling
Exponential Non-Discriminative Snowball Sampling
Exponential Discriminative Snowball Sampling
Advantages of Snowball Sampling
The chain referral process allows the researcher to reach populations that are difficult to
sample when using other sampling methods.
The process is cheap, simple and cost-efficient.
This sampling technique needs little planning and a smaller workforce compared to other
sampling techniques.
Disadvantages of Snowball Sampling
The researcher has little control over the sampling method. The subjects that the
researcher can obtain rely mainly on the previous subjects that were observed.
Representativeness of the sample is not guaranteed. The researcher has no idea of the true
distribution of the population and of the sample.
Sampling bias is also a concern for researchers using this sampling technique. Initial
subjects tend to nominate people that they know well. Because of this, it is highly possible that
the subjects share the same traits and characteristics, so the sample that the
researcher obtains may be only a small subgroup of the entire population.
