PHASE 2
Test in a small group of patients (about 100-500)
Objective is to determine possible short-term side effects and risks associated with the drug, and whether it works
according to the expected mechanism
PHASE 3
Test in a large group of patients (about 1000-5000) to
show safety and efficacy
PHASE 4
Post-marketing surveillance of drug to determine long-term safety and reassess effectiveness, acceptability and
continued use under normal field settings
e. The results of the literature review can be used to generate hypotheses, methods and comparative
data which are useful in the interpretation and discussion of results
f.
g. Related literature should be summarized by topic rather than as a running bibliography. This
means that the conclusions of authors dealing with a particular topic should be compared and
synthesized
h. If the research being proposed is pioneering, and no previous studies have been done in the
area, this has to be mentioned in the review of related literature to provide additional
basis/justification for the conduct of the proposed research
3. Revisit the research objectives and redefine the actual problem for investigation in clearer and more specific
terms
Refers to the process of reviewing, refining or fine-tuning the first draft of the general and specific
objectives based on new knowledge derived from the review of related literature.
It may involve delimiting the scope of the study without dealing with a trivial problem
4. Formulate testable hypotheses and define basic concepts and variables
Identifying attributes of the variables to be tested in the research project
* Estimating magnitudes
* Determining differences
* Looking at relationships
Formulating conceptual and operational definitions of variables
Steps in Conducting Research
5. Construct the research design
Areas of concern include:
Study design
Methods of subject selection
Sample size
Strategies for control and manipulation of relevant variables
Establishment of criteria to evaluate outcomes
Instrumentation
Major considerations in formulating the research design:
Internal Validity
Does the study measure what it intends to measure?
Refers to the extent to which various types of biases are controlled in the study like comparability of
subjects, measurement bias and others
External Validity
Refers to the extent to which the study results can be generalized to a larger population
Covers issues related to sample selection and sample size
6. Design the tools for data collection
what is not known about the problem hence the need for the proposed research
Providing
Does the problem relate to broader social, economic or health issues (ex., poverty; climate change; status
of women and children, etc.)?
Who else is concerned about the problem (ex., government; civil society; church, etc.)?
1.1 JUSTIFYING THE SIGNIFICANCE OF THE RESEARCH: HOW TO WRITE-UP THIS SECTION
a. Review your answers to the questions listed earlier.
b. Sort your answers into 2 categories, according to whether they address broad or specific issues related to your
research problem
c. Arrange your answers in 1 or 2 paragraphs which justify the importance of the research problem. The
suggested flow of the discussion is one which follows an inverted triangle, starting with broad issues, then
focusing on specific issues related to particular groups or settings to be studied in the proposed research
1.2 JUSTIFYING THE SIGNIFICANCE OF THE RESEARCH: FLOW OF DISCUSSION
BROAD ISSUES
SPECIFIC ISSUES
end-user/target beneficiary may have a different use or can benefit from the research results in a
different way
The proponent must describe in a concise way specifically how each end-user/target beneficiary can apply
or benefit from the research results
1.3.1 IDENTIFICATION OF END-USERS AND TARGET BENEFICIARIES: EXAMPLE
Title of Research:
Capacities and Needs Assessment for Health Emergency Management among conflict-affected and disaster-prone LGUs in the Ligawasan Wetlands Biodiversity Reserve (LWBR)
1.3.2 IDENTIFICATION OF END-USERS AND TARGET BENEFICIARIES: BAD EXAMPLE
The following are the end-users and target beneficiaries of this research:
LGUs of disaster-prone areas
Legislators at the regional and local levels
Academicians/researchers
Residents in disaster-prone communities
1.3.3 IDENTIFICATION OF END-USERS AND TARGET BENEFICIARIES: GOOD EXAMPLE (actual write-up
presented in the proposal)
This study has immense use not only for the health services providers networks and government health
functionaries and personnel in the four LGUs, but also for Local Government Units, in harnessing and
mobilizing local resources toward an integrated and harmonized health emergency planning for preparedness
and resilience.
On the policy side, legislators, both at the local and regional levels, use the results of this study to push for
more integrative approaches in capacitating local health and health-related functionaries and other personnel
down to the barangay level.
The tools for gathering data can be integrated in various social science courses, especially in the Sociology of
Disaster, and in the graduate program in Public Administration, especially in Public Policy (Health and
Emergencies in LGUs). These tools are not yet included in the catalogue of traditional methods of gathering
data in most institutions of higher learning in the region.
More importantly, communities that continue to suffer from inordinate and heavy damage to life and property
after armed conflicts and natural disasters can also learn to appreciate their pro-active role in mitigating
disasters and in lessening their vulnerabilities to health and life risks resulting from disasters.
2. DEVELOPMENT OF THE CONCEPTUAL FRAMEWORK
The conceptual framework is a written or a visual presentation which explains either graphically or in
narrative form, the main variables being studied in the proposed research and how they are related to each
other
Inputs needed in developing a conceptual framework include:
Experiential knowledge of the researcher
Technical knowledge
Research background.
Personal experience.
Literature review:
Prior related theory - concepts and relationships that are used to represent the world, what is happening
and why
Prior related research - how people have tackled similar problems and what they have learned
Other theory and research - approaches, lines of investigation and theory that are not obviously
relevant/previously used.
In the research process, the development of the conceptual framework is done after the review of related
literature and before the formulation of the research objectives
There must be consistency between the conceptual framework presented and the research objectives to be
investigated
2.1 DEVELOPMENT OF THE CONCEPTUAL FRAMEWORK: CONVENTIONS/USUAL PRACTICES
In building the framework:
Start with the dependent/outcome variable or endpoint for intervention
Identify potential independent variables deemed to affect the dependent/outcome variable based on
empirical or theoretical evidence
Identify intervening, confounding, antecedent or mediating variables whose effects may alter the
relationship between the dependent and independent variables
Variables
OUTPUT
Number of nutrition education classes on child feeding conducted for mothers
Numbers of mothers trained on proper child feeding
OUTCOME
Change in mothers' knowledge, attitudes and practices on child feeding
IMPACT
Change in the prevalence of malnutrition
Lecture 3:
OBJECTIVE FORMULATION
Ophelia M. Mendoza, DrPH
Lecturer
Steps in the Formulation of Research Objectives
(Questions Asked / Steps to be Taken)
Step: SELECTION, ANALYSIS, STATEMENT OF RESEARCH PROBLEM
* Problem identification
* Problem prioritization
* Justification/significance of the research problem
Question asked: What information is already available?
Step: LITERATURE REVIEW
* Literature and other available information
* Synthesis of previous studies done
Step: FORMULATION OF OBJECTIVES
* General and specific research objectives
To guide the researcher in the development of the research methodology, and orient the collection, analysis
and interpretation of the data
CHARACTERISTICS OF RESEARCH OBJECTIVES
They are phrased in such a way that they focus on what the study is attempting to solve, and cover the different
parts of the problem in a logical way.
They are clearly phrased in measurable and operational terms, specifying exactly what the researcher
wishes to do.
They are realistic and feasible, considering the constraints within local conditions.
They use action verbs which are specific enough to be measured.
SPECIFIC ACTION VERBS: Determine, Compare, Compute, Describe
NON-SPECIFIC ACTION VERBS: Appreciate, Understand, Explore
SOME DOWN TO EARTH REMINDERS WHEN SPECIFYING RESEARCH OBJECTIVES
1. We were taught basic grammar in grade school for a reason: to write readable and understandable
research objectives when we get older
To describe the psychiatric needs of Hospital X through physicians assessment
2. KISS: Keep It Short and Simple
To determine the efficacy of indoctrinating
a superannuated canine with innovative maneuvers
3. Say what you mean and mean what you say.
Consider the following research objectives:
1. To determine the mean birth weight of babies born to mothers in the following age-groups: <18, 18-35, and
36-49
2. To compare the mean birth weight of babies born to mothers in the following age-groups: <18, 18-35, and 36-49
3. To compare the incidence of low birth weight among babies born to mothers in the following age-groups:
<18, 18-35, and 36-49
4. To determine if there is an association between the incidence of low birth weight and the age of the mother
5. To determine if the age of the mother is a predictor of the incidence of low birth weight
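The objectives above sound alike but call for different computations. The sketch below illustrates this for a mean birth weight per age-group versus a proportion of low-birth-weight babies per age-group. The records are hypothetical, and the 2500 g cut-off is an assumption taken from the commonly used definition of low birth weight rather than from these notes.

```python
from statistics import mean

# Hypothetical records: (mother's age in years, birth weight in grams)
records = [
    (16, 2300), (17, 2900), (17, 2450),
    (22, 3200), (28, 3400), (33, 2400), (30, 3100),
    (38, 2950), (41, 2350), (45, 3000),
]

LOW_BIRTH_WEIGHT_G = 2500  # assumed cut-off: below 2500 g counts as low


def age_group(age):
    """Assign the mother's age to one of the groups named in the objectives."""
    if age < 18:
        return "<18"
    if age <= 35:
        return "18-35"
    return "36-49"  # ages above 49 are not expected in this sketch


def summarize(rows):
    """Return {group: (mean birth weight, proportion with low birth weight)}."""
    by_group = {}
    for age, weight in rows:
        by_group.setdefault(age_group(age), []).append(weight)
    return {
        g: (mean(w), sum(x < LOW_BIRTH_WEIGHT_G for x in w) / len(w))
        for g, w in by_group.items()
    }


summary = summarize(records)
for group, (avg, lbw) in summary.items():
    print(f"{group}: mean = {avg:.0f} g, low birth weight = {lbw:.0%}")
```

Objectives 1 and 2 use the first number of each pair (a mean), while objectives 3 to 5 use the second (a proportion), which is why the wording of the objective determines the analysis.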
Exercise: Consider the following research objectives and identify the potential problems
1. To determine reported cases of injuries and accidents related to waste management practices
2. Title: Assessment of HIV spread among the most vulnerable populations of Province X
General objective:
To assess the possible spread of HIV among the most vulnerable populations of Province X
3. Determine the utilization of health delivery facilities to support the health needs and problems of the
community.
4. To determine the health condition of women in the rural areas of Province X along the following:
Household morbidity
Household disability
Child and household deaths
LECTURE 4:
DEFINITION OF VARIABLES AND DATA COLLECTION
Ophelia M. Mendoza, DrPH
1. DEFINITION OF VARIABLES
The main questions to be answered when defining variables are:
a. What variables are needed for this research?
b. What specific data elements need to be collected in order to measure this variable?
These questions have to be considered in three phases of the research process, namely:
a. the formulation of research objectives;
b. the development of data collection tools; and
c. data analysis
1.1 Formulation of Specific Objectives
involves transforming abstract concepts presented in the theoretical or conceptual framework into
observable and measurable indicators
Academic Achievement
Based on the above conceptual framework, the following research objective can be formulated:
To determine the effect of school characteristics on the academic achievement of students
The above research objective can be rephrased as follows, after making it more specific:
To compare the grades of students in large and small class sizes
1.2 Development of Data Collection Tool
involves identifying variables needed to measure or compute the indicators of interest
[Diagram: Nutritional Status, with the variables Weight, Age, Sex, Height]
involves selecting an indicator which is the most relevant to the phenomenon of interest or outcome
considered
Grades
of data collection
Design
o requires rigid training of data collectors; ensuring objectivity of observations may be a problem
o can provide more in-depth picture of non-quantitative phenomenon
2.2.2 Design
Use of primary vs secondary data
Observational vs Experimental
Cross-sectional vs Longitudinal
Paired vs Independent Samples
With or without a control group
Quantitative vs qualitative approaches
2.3 From whom do we collect data?
identification of the most appropriate respondent
crucial issue when the subjects of the study are not in a position to provide answers (e.g., babies, sick
elderly)
2.4 Who should be the data collector?
Factors to consider:
required skills
cost
potential biases that can be committed
2.5 When and how often should data be collected?
Factors to consider:
objectives of the study
design
variable(s) being collected
important factor to consider when variable being studied is affected by time or seasonal patterns
2.6 Where should data be collected?
subject's home vs health facility vs public meeting place (e.g., barangay hall)
involves issues of practicality, the need to maximize study yield as well as to minimize biases
target vs sampling population
2.7 What procedures, activities or mechanisms are needed to minimize data collection problems?
memory/recall bias
lack of cooperation
Hawthorne effect (tendency of people to change their behavior because they are observed)
Observer bias
3. DESIGN OF DATA COLLECTION TOOLS
3.1 TYPES OF DATA COLLECTION TOOLS
a. Interview Schedule - a tool used by the interviewer to ask questions and record responses during the
conduct of personal interviews.
b. Questionnaire - a data collection tool which is self-administered or completed without the assistance of an
interviewer.
c. Form - a concise data collection tool. It contains only labels or names of variables (ex., age) instead of the
items being phrased in question form (How old are you?)
d. Guide questions - a listing of questions which serve as discussion or observation guides to be used for
qualitative modes of data collection (like focus group discussions, nominal group techniques, participant
observation, etc.)
3.2 GOALS IN DESIGNING DATA COLLECTION TOOLS
3.2.1 RELEVANCE
What specific kinds of data are needed by the researcher?
The inclusion of each item in the data collection tool must be justified in relation to:
a. why the item or question will be asked (RESEARCH OBJECTIVES)
b. what will be done with the information (DATA ANALYSIS)
3.2.2 ACCURACY
enhanced when the wording and the sequence of the items/questions are designed to facilitate recall or to
motivate the respondent to answer accurately
3.3 CONSIDERATIONS IN THE CONSTRUCTION OF DATA COLLECTION TOOLS
a. Who will make the entries?
b. Wording of questions
c. Sequence and flow of questions
d. Number of questions or items asked
e. Purpose and relevance
f. Is the questionnaire or interview schedule to be used repeatedly?
g. What type of data processing will be used?
3.4. GUIDELINES TO QUESTION DEVELOPMENT AND FORMATTING
3.4.1 General Guidelines
a. Remember that the aim of designing data collection tools is to obtain complete and accurate information
which is relevant to the objectives of the data collection activity
b. Remember that the respondent is doing the data collector a favor by providing the necessary information.
c. Justify the relevance of each question or item in the data collection tool.
Avoid extraneous or irrelevant questions
Avoid back-rider questions to the extent possible.
d. Be sensitive to concerns the respondent may have about his/her privacy
Empathy - think as a respondent when developing the data collection tool
3.4.2 Question Wording
a. Be careful about questions which require respondents to recall events or facts which occurred sometime in
the past
need to minimize recall or memory bias
the respondent can be helped in recalling events by tying-up dates with significant events
b. Use simple, generally familiar words which respondents might use in a conversation. Avoid technical jargon,
formal language and colloquialisms
c. Avoid questions which are ambiguous because of a generally inadequate frame of reference
ex., How many times were you sick?
d. Avoid multi-barrelled questions. These are questions which ask for more than one item at the same
time.
e. Avoid leading questions. These are questions which are phrased in such a way that the respondent gets a
clue on what the desired response is, and will be encouraged to provide it.
f. Avoid emotionally charged words in questions which arouse positive or negative feelings which might
overshadow the specific content of the question.
3.4.3 Format
a. When arranging the sequence of the questions or items in the data collection tool, start with those which are
easy to administer or to answer. The first questions should be an attempt to:
Lecture 5:
EPIDEMIOLOGIC STUDY DESIGNS
Ophelia M. Mendoza, DrPH
Lecturer
WAYS OF CATEGORIZING STUDY DESIGNS
1. Objectives of the study
descriptive vs analytical
DESCRIPTIVE
Describes
Is more exploratory
Profiles characteristics of group
Focuses on what
Assumes no hypothesis
Does not require comparisons between groups or over time
ANALYTICAL
Explains
Is more explanatory
Analyzes why group has characteristics
Focuses on why
Assumes a hypothesis
Requires comparisons between groups or over time
CASE-CONTROL STUDIES
ADVANTAGES
1. Feasible when dealing with rare diseases
2. Requires a smaller sample size than a cohort study
3. Little problem with attrition
DISADVANTAGES:
1. Incidence rates and attributable risks cannot be computed.
2. The temporal sequence between disease and exposure may be a problem
3. Big chance for bias in the selection of cases and controls
4. Difficult to obtain information on exposure if the recall period is too long.
5. Selective survival may bias the comparison.
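Because incidence rates cannot be computed from a case-control study, the association between exposure and disease is usually summarized with an odds ratio instead. The notes do not name this measure, so the sketch below is an illustration under that assumption. It uses hypothetical counts and Woolf's logit method for an approximate 95% confidence interval.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI (Woolf's logit method) from a 2x2 table.

    a = exposed cases,    b = unexposed cases
    c = exposed controls, d = unexposed controls
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: 40 of 100 cases and 20 of 100 controls were exposed.
or_hat, lo, hi = odds_ratio(a=40, b=60, c=20, d=80)
print(f"OR = {or_hat:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

The function needs all four cells to be non-zero; a table with an empty cell would need a different approach.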
POPULATION-BASED
CASE-CONTROL STUDY
Cases and controls are sampled from a
defined population
ADVANTAGES
1. Source population is better defined.
2. It is easier to make certain that cases and controls come from the same source population
3. The exposure histories of the controls are more likely to reflect those of persons without the
disease of interest.
HOSPITAL-BASED
CASE-CONTROL STUDIES
Investigator selects cases from persons with the disease of interest who are admitted to a particular hospital;
controls are selected from persons admitted with other conditions but with no evidence of the disease of
interest
ADVANTAGES
1. Subjects are more accessible.
2. Subjects tend to be more cooperative.
3. Background characteristics of cases and controls may be balanced.
4. Easier to collect exposure information from medical records and biologic experiments
CROSS-SECTIONAL STUDIES
ADVANTAGES:
1. Less time-consuming and less costly than prospective studies
2. They often serve as the starting-point in prospective cohort studies for screening-out already existing
conditions
3. The design allows the measurement of risk, although the estimate is not precise
CROSS-SECTIONAL STUDIES
DISADVANTAGES:
1. It does not enable the direct estimation of risk.
2. Prone to bias from selective survival
3. Often difficult to establish the temporal sequence of exposure factor and the disease
ECOLOGICAL STUDIES
- unit of observation and unit of analysis is an aggregate rather than individual persons
- most practical design to use when exposure level is relatively homogeneous in a population but differs
between populations (ex., water quality) or when individual measurements of exposure are impossible
(ex., air pollution)
- they are used to generate hypothesis, or as a quick method of examining associations
- they cannot be used as basis for making causal inference
- their most serious flaw is the risk of ecological fallacy, i.e., the characteristics of the geographical units
are incorrectly attributed to individuals
Lecture 6
EXPERIMENTAL STUDY DESIGNS
Ophelia M. Mendoza, DrPH
Lecturer
FEATURES:
they provide the best evidence for testing any hypothesis or to investigate possible cause-effect
relationships
they resemble cohort studies in that they require follow-up of subjects to determine outcome
its essential distinguishing feature is that it involves action, manipulation or intervention on the part of the
investigator
it typically uses a control group as a baseline against which to compare the group(s) receiving the
experimental treatment
they are generally difficult to carry out and raise some ethical issues
DEFINITION OF TERMS
a. reference population - the group of ultimate interest
b. experimental population - the group actually studied
c. random allocation - process of permitting chance to determine the assignment of subjects to sub-groups
assures similarity on the average
should be distinguished from random selection
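The distinction between random selection and random allocation can be sketched in a few lines. The sampling frame, the group sizes and the fixed seed below are hypothetical choices made only so that the example is reproducible. Real studies do not always draw the experimental population at random.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame of 1000 people (the reference population).
reference_population = [f"person_{i:04d}" for i in range(1000)]

# Random SELECTION: chance decides who enters the study at all.
experimental_population = rng.sample(reference_population, k=20)

# Random ALLOCATION: chance decides which sub-group each enrolled subject joins.
shuffled = experimental_population[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
treatment_group = shuffled[:half]
control_group = shuffled[half:]

print(len(treatment_group), len(control_group))
```

Selection affects how far results can be generalized to the reference population, while allocation affects how comparable the sub-groups are to each other.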
COMMONLY USED EXPERIMENTAL DESIGNS (based on types of treatment and measurements included in
the study)
1. One-shot Case-Study Design
Treatment    Post-test
X            T
1.1 Uses
a. to develop ideas
b. to explore researchable hypotheses (i.e., conduct fishing expeditions)
1.2 Disadvantages
a. It is difficult, if not impossible to know if any change has occurred or to assess the degree to which the
observed behavior resulted from the treatment or intervention
b. The design cannot be used as basis for making defensible conclusions
2. One-Group Pre-test Post-test Design
Pre-test    Treatment    Post-test
T1          X            T2
2.1 Advantages and Disadvantages
a. Pre-test provides comparison between performance by the same group before and after exposure to X
b. Its major limitation is that there is no control group to permit the assessment of the possibility that the
observed change was influenced by factors other than the treatment given
2.2 Possible Sources of Error
a. maturation -- subjects growing older, more tired, less enthusiastic or less attentive
b. testing effect -- the experience of T1 by itself may increase motivation or modify attitudes
c. changes in instrumentation -- changes in the type of test given, in scoring, in observation or interviewing
techniques or calibration of instruments which make T1 and T2 different events
3. Nonrandomized Control-Group Pre-test Post-test Design
                      Pre-test    Treatment    Post-test
Experimental Group    T1          X            T2
Control Group         T1          -            T2
3.1 Characteristics
a. The design requires pre and post treatment measures for both the experimental and control groups
b. Randomization is not done in the assignment of the experimental and control groups hence this is often
referred to as quasi-experimental design
3.2 Guidelines
a. Subjects in the experimental group should not be exposed to X before the pre-treatment measure
b. The control group should be drawn from a population similar to that of the experimental group
c. Analysis of pre-treatment measures should be made to ensure comparability of the experimental and control
groups
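One common way to use the four sets of measurements in this design is to compare the average T1-to-T2 change in the two groups. The sketch below uses hypothetical scores, and the gain-score comparison is only one possible analysis, not necessarily the one intended in the lecture.

```python
from statistics import mean

# Hypothetical scores; each position is the same subject at T1 and T2.
experimental = {"T1": [52, 48, 60, 55, 50], "T2": [63, 58, 66, 65, 61]}
control = {"T1": [51, 49, 58, 56, 50], "T2": [54, 50, 60, 59, 52]}


def mean_gain(group):
    """Average within-subject change from pre-test (T1) to post-test (T2)."""
    return mean(post - pre for pre, post in zip(group["T1"], group["T2"]))


# Guideline (c): check that the groups look similar before treatment.
baseline_gap = mean(experimental["T1"]) - mean(control["T1"])

# Change in the experimental group beyond the change seen without X.
net_effect = mean_gain(experimental) - mean_gain(control)

print(f"baseline gap = {baseline_gap:.1f}")
print(f"net gain (interpretable only if the groups are comparable) = {net_effect:.1f}")
```

Since the groups are not randomized, a small baseline gap makes the comparison more credible but does not rule out other differences between the groups.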
4. Randomized Control Group Post-test Only Design
                         Pre-test    Treatment    Post-test
Experimental Group    R              X            T2
Control Group         R              -            T2
4.1 Characteristics
a. Requires the use of both a control and experimental group, with the assignment of subjects to groups strictly
at random. The design represents that of a true experiment.
b. Pre-treatment measures are omitted since randomization techniques ensure comparability and objectivity in
the assignment of groups
4.2 Uses
a. When pre-tests are unavailable
b. When subjects' anonymity must be maintained
c. When pre-test may interact with the intervention or treatment X
5. Variations of Nonrandomized Control Group Pre-test Post-test Design
5.1 Extending design to include additional post-treatment measures
                      Pre-test    Treatment    Post-test
Experimental Group    T1          X            T2 T3 T4 ......
Control Group         T1          -            T2 T3 T4 ......
ISSUES IN EXPERIMENTAL DESIGNS
1. What comparisons shall be made?
1.1 Between the treatment being studied and the complete absence of treatment
1.2 Between the test treatment and another treatment known or believed to be without therapeutic effect
(placebo)
1.3 Between the test treatment and another treatment of established therapeutic efficacy
1.4 Between one form of treatment and another form(s) of the same treatment (e.g., different dose levels of the
same drug, routes of administration, etc.)
1.5 Between early and later effects of the same treatment
2. Differences in composition of the study and control groups
Remedies:
a. randomization
b. stratified randomization or blocking
c. matching
d. using the patient as his own control
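Remedy (b) can be sketched as follows: subjects are first grouped by a stratifying variable, then assigned to arms in small shuffled blocks within each stratum. The function name, the arm labels and the subject list are hypothetical, and the block size is assumed to equal the number of arms.

```python
import random
from collections import defaultdict


def stratified_block_randomize(subjects, stratum_of,
                               arms=("treatment", "control"), seed=0):
    """Assign subjects to arms in shuffled blocks within each stratum.

    Each block holds every arm exactly once, so arm sizes within a stratum
    never differ by more than one.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for s in subjects:
        by_stratum[stratum_of(s)].append(s)

    assignment = {}
    for members in by_stratum.values():
        for start in range(0, len(members), len(arms)):
            block = list(arms)
            rng.shuffle(block)  # chance decides the order within the block
            for subject, arm in zip(members[start:start + len(arms)], block):
                assignment[subject] = arm
    return assignment


# Hypothetical subjects: (id, sex); sex is the stratifying variable.
subjects = [(i, "F" if i % 3 else "M") for i in range(1, 13)]
allocation = stratified_block_randomize(subjects, stratum_of=lambda s: s[1])

for subject, arm in sorted(allocation.items()):
    print(subject, arm)
```

With 8 female and 4 male subjects, each sex is split evenly between the two arms, which plain randomization would not guarantee in a sample this small.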
3. Subject expectations and observer bias
Open trial -- both subject and investigator are fully aware of what treatment is being given/received
Single-blind trial -- either the subject or the investigator (usually the former) is unaware of the nature of the
treatment given
Double-blind trial -- neither subject nor person assessing efficacy is aware of the nature of the treatment given
Treble-blind trial -- the subject, data collector and the data analyst are all unaware of the nature of the
treatments given
4. Interference between treatments
Cross-over design -- subjects in each group are taken off one treatment and crossed over to the treatment
previously given to other subjects
5. Sample attrition
creates problems in statistical analysis of the data
affects the comparability of the treatment and control groups
severe side-effects in treatment group may lead the investigator to withdraw patients from the trial, leaving
only the successes; treatment will appear more successful than what it really is
treatment may be so effective that patients believe themselves to have been cured and cease taking the
medication; successes disappear from treatment, leaving the treatment to appear less effective than what it
really is
drop-out rate should be considered in sample size estimation
drop-out data can be used as an indicator of therapeutic usefulness and effectiveness and should be considered
when drawing conclusions from the trials
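A simple way to take the drop-out rate into account in sample size estimation is to divide the required number of completers by the expected proportion who stay in the trial. The helper name and the figures below are hypothetical, and the adjustment assumes that drop-out is unrelated to treatment, which the points above show is often not the case.

```python
import math


def inflate_for_dropout(n_required, dropout_rate):
    """Subjects to enroll so that about n_required remain after drop-out."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(n_required / (1 - dropout_rate))


# Hypothetical: 100 completers needed per group, 25% expected to drop out.
n_enroll = inflate_for_dropout(100, 0.25)
print(n_enroll)
```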
6. Ethical issues
the random allocation of subjects to the control and experimental groups gives rise to certain ethical
questions
informed consent in writing must be obtained from subjects
BASIC CATEGORIES OF EXPERIMENTAL DESIGNS BASED ON RANDOMIZATION PROCEDURE USED
1. Completely Randomized Design (CRD)
1.1 Description
a. Used for single factor experiments, where the effect of only one variable is being studied.
b. Random samples of size n are selected from each of k populations, representing the different
categories/groups of the variable or treatment being studied.
c. The sample size per treatment group may or may not be equal. Differences in sample size do not complicate
the computations for the one-way ANOVA to be applied to analyze the data, even when manual computation is
done.
1.2 Example
An experimenter is interested in evaluating the effectiveness of 3 methods of teaching a course. Three groups
of 8 subjects each were selected at random. The subjects were then taught using one of the 3 teaching
methods being tested. Upon completion of the course, each of the sub-groups was given a common test, and
their scores are shown below. Which of the 3 teaching methods is the most effective?
          Method 1    Method 2    Method 3    All
          3           4           6
          5           4           7
          2           3           8
          4           8           6
          8           7           7
          4           4           9
          3           2           10
          9           5           9
TOTAL     38          37          62          137
MEAN      4.75        4.62        7.75        5.71
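The one-way ANOVA mentioned in the description can be worked out for the teaching-methods scores with a few lines of plain Python. This is a minimal sketch of the standard between-group and within-group sums of squares. The function name is illustrative and no statistical library is assumed.

```python
# Scores for the three teaching methods in the example above.
scores = {
    "Method 1": [3, 5, 2, 4, 8, 4, 3, 9],
    "Method 2": [4, 4, 3, 8, 7, 4, 2, 5],
    "Method 3": [6, 7, 8, 6, 7, 9, 10, 9],
}


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a completely randomized design."""
    all_values = [x for g in groups.values() for x in g]
    n_total, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g
    )
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_b, df_w = one_way_anova(scores)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The computed F is compared with the tabulated critical value for 2 and 21 degrees of freedom (about 3.47 at the 5% level). A significant F only says that the method means are not all equal; identifying which method is best needs a follow-up comparison of the means.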
           Bleeding Time: AM    Bleeding Time: PM
Control     8.53                39.14
           20.53                26.20
           12.53                31.33
           14.00                45.80
           10.80                40.20
Treated    17.53                32.00
           21.07                23.80
           20.80                28.87
           17.33                25.06
           20.07                29.33
Source: Steel, R. and Torrie, J. Principles and Procedures of Statistics, page 201
Mean Plasma Phospholipid Levels in Lambs According to Treatment Group
Diethylstilbestrol Status    Bleeding Time: AM    Bleeding Time: PM    Total
Control                      13.28                36.53                24.91
Treated                      19.36                27.81                23.59
Total                        16.32                32.17                24.25
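The cell, row and column means in the table above can be reproduced from the individual readings. The assignment of readings to the four cells below is an assumption inferred from the order of the listing, chosen because it reproduces the tabulated means. The helper name is illustrative.

```python
from statistics import mean

# Plasma phospholipid readings by (diethylstilbestrol status, bleeding time).
# Cell membership is inferred from the listing, not stated in the notes.
cells = {
    ("Control", "AM"): [8.53, 20.53, 12.53, 14.00, 10.80],
    ("Control", "PM"): [39.14, 26.20, 31.33, 45.80, 40.20],
    ("Treated", "AM"): [17.53, 21.07, 20.80, 17.33, 20.07],
    ("Treated", "PM"): [32.00, 23.80, 28.87, 25.06, 29.33],
}


def pooled_mean(status=None, time=None):
    """Mean over all readings matching the given status and/or bleeding time."""
    values = [
        x
        for (s, t), xs in cells.items()
        if status in (None, s) and time in (None, t)
        for x in xs
    ]
    return mean(values)


# Print one row per status, with AM, PM and Total columns.
for status in ("Control", "Treated", None):
    row = [pooled_mean(status, t) for t in ("AM", "PM", None)]
    label = status or "Total"
    print(f"{label:8s}" + "".join(f"{m:8.2f}" for m in row))
```

The gap between AM and PM is large for the control lambs and much smaller for the treated lambs, which is the kind of pattern a two-factor design is meant to reveal.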
D C A B
C A B D
B D C A
LECTURE 7
MOST COMMONLY USED MODES OF DATA COLLECTION
IN QUALITATIVE STUDY: IN-DEPTH INTERVIEWS
AND FOCUS GROUP DISCUSSIONS
Ophelia M. Mendoza, DrPH
Lecturer
1. COMPARISON BETWEEN QUANTITATIVE AND QUALITATIVE RESEARCH METHODS
ASPECT: General framework
QUANTITATIVE
Usually seeks to confirm hypotheses about phenomena
Instruments use more rigid methods of eliciting and categorizing responses to questions
Use highly structured methods of data collection such as questionnaires, surveys, and structured observation
QUALITATIVE
Seek to explore phenomena
Instruments use more flexible, iterative style of eliciting and categorizing responses to questions
Use semi-structured methods such as in-depth interviews, focus group discussions, and participant observation

ASPECT: Analytical objectives
QUANTITATIVE
To quantify variation
To predict causal relationships
To describe characteristics of a population
QUALITATIVE
To describe variation
To describe and explain relationships
To describe individual experiences
To describe group norms

ASPECT: Question format
QUANTITATIVE
Closed-ended
QUALITATIVE
Open-ended

ASPECT: Data format
QUANTITATIVE
Numerical (obtained by assigning numerical values to responses)
QUALITATIVE
Textual (obtained from audiotapes, videotapes, and field notes)

ASPECT: Flexibility in study design
QUANTITATIVE
Study design is pre-determined and stable from beginning to end
Participant responses do not influence or determine how and which questions researchers ask next
Study design is subject to statistical assumptions and conditions
QUALITATIVE
Some aspects of the study are flexible (for example, the addition, exclusion, or wording of particular
interview questions)
Participant responses affect how and which questions researchers ask next
Study design is iterative, that is, data collection and research questions are adjusted according to what is
learned

ASPECT: Method of sample selection
QUANTITATIVE
Makes use of a representative sample selected through the application of a probability sampling design
QUALITATIVE
Makes use of non-probability sampling designs, usually purposive, quota and snow-ball sampling

ASPECT: Data analysis
QUANTITATIVE
Information collected is classified according to predetermined categories (deductive process)
Succinct, quantifiable, can be presented in numerical tables and analyzed statistically
QUALITATIVE
Information is classified into categories which are identified in the data itself through an inductive process
Extensive, descriptive, cannot be succinctly presented, interpretation more subjective
Adapted from Family Health International (2005). Qualitative Research Methods: A Data Collector's Guide
(Module 1: Qualitative Research Methods Overview).
2. IN-DEPTH INTERVIEWS
2.1 Description of Method
a. The in-depth interview is a technique designed to elicit a vivid picture of the participant's perspective on the
research topic. It is a useful and effective method when the objective is to elicit individual experiences,
opinions and feelings, as well as when addressing sensitive topics.
b. During the in-depth interview, the person being interviewed is considered an expert on the topic being
considered in the research. Hence this method is also called a key informant interview (KII).
c. Subjects or respondents of an in-depth interview are purposively selected based on their position, or a specific
characteristic which makes them the best source of information regarding the topic being considered for the
research.
d. The researcher's interviewing techniques during an in-depth interview are motivated by the desire to learn
everything the participant can share about the research topic or issue being discussed. Researchers engage
with participants by posing questions in a neutral manner, listening attentively to participants' responses, and
asking follow-up questions and probes based on those responses. They do not lead participants according to
any preconceived notions, nor do they encourage participants to provide particular answers by expressing
approval or disapproval of what they say.
e. In-depth interviews are usually conducted face-to-face and involve one interviewer and one participant.
Phone conversations and interviews with more than one participant also qualify as in-depth interviews. On
average, in-depth interviews last from one to two hours.
2.2 Skills Needed in Conducting In-depth Interviews
Skill needed: Emphasizing the participant's perspective
What the required skill means:
Treating the participant as the expert
Keeping the participant from interviewing you
Balancing deference to the participant with control over the interview
Being an engaged listener
Demonstrating a neutral attitude
Rationale:
The interviewer's perspective on the research issue should be invisible. This avoids the risk that
participants will modify their responses to please the interviewer instead of describing their own
perspectives.
Tips:
Remember that the purpose of the interview is to elicit the participant's perspective; consider yourself a
student
If a participant asks for factual information during the interview, write down the questions and respond after
the interview is over.
If a participant asks what you think, deflect the question. Let the participant know that you consider his or her
point of view more important.
Don't overcompensate for perceived status differences by giving the participant too much control over the
interview.
Pay attention to what participants say and follow up with relevant questions and probes.
Be aware that what you say, how you say it, and your body language can convey your own biases and
emotional reactions. Use them instead to convey neutrality and acceptance.

Skill needed: Adapting to different personalities and emotional states
Different interviewing styles may be needed for different participants; for example, be able to retain control of
a conversation with a dominant personality and to animate a shy participant.
Know how to tone down heightened emotions, such as when a participant starts crying or becomes
belligerent.
Adapting to each individual may require softening the way you broach sensitive issues, adjusting your tone of
voice to be more sober or upbeat, or exhibiting increased
Source: Family Health International (2005). Qualitative Research Methods: A Data Collector's Guide (Module
3: In-depth Interviews).
3. FOCUS GROUP DISCUSSIONS
3.1 Description of Method
a. A focus group is a qualitative data collection method in which one or two researchers and a number of
participants meet as a group to discuss a given research topic.
b. A focus group consists of a small number (8-12) of relatively similar individuals who provide information
during a directed and moderated interactive group discussion. Focus group participants are typically chosen
based on their ability to provide specialized knowledge or insight into the issue under study.
c. Focus groups are especially effective for capturing information about social norms and the variety of
opinions or views within a population. The richness of focus group data emerges from the group dynamic and
from the diversity of the group. Participants influence each other through their presence and their reactions to
what other people say. Because not everyone will have the same views and experiences (because of
differences in age, gender, education, access to resources, and other factors), many different viewpoints will
likely be expressed by participants.
d. One researcher (the moderator) leads the discussion by asking participants to respond to open-ended
questions, that is, questions that require an in-depth response rather than a single phrase or a simple "yes" or
"no" answer. A second researcher (the note-taker) takes detailed notes on the discussion. A principal
advantage of focus groups is that they yield a large amount of information over a relatively short period of time.
They are also effective for accessing a broad range of views on a specific topic, as opposed to achieving group
consensus.
3.2 Preparing the FGD Guide Questions
a. Ask questions that encourage description and depth
One of the advantages of a focus group over a written survey is the opportunity to achieve greater depth of
understanding using open-ended rather than yes/no questions. FGD questions often begin with:
"How do you feel about ...,"
"What is your opinion of ...," or
"Please describe ...."
Particularly effective are questions that begin with "how." Beware of "why" questions, however, at the beginning
of focus groups, because they may lead participants to justify their actions or opinions.
b. Use simple, clear language
Use language participants understand. Avoid asking questions that have several possible meanings or
questions that are so long that they are difficult to follow.
c. Avoid biased or leading questions
Avoid questions that lead respondents to answer a particular way. Similarly, avoid words such as "all,"
"always," "none," "never," "only," "just," and "merely," which may bias responses.
d. Use only one concept per question
Questions addressing more than one concept may confuse participants, leading them to answer only one part
of the question or to answer neither part. The solution is to separate two ideas into two questions.
e. List areas to probe
To ensure that the moderator consistently covers specific topics in all sessions, list probes or follow-up
questions after the main question.
Sample question: What are some factors that would motivate you to enroll for one course rather than another?
Probe: Explore the following factors:
i. fit with preferred schedule
ii. interest in subject
iii. instructor's reputation
iv. course difficulty and grading
v. career concerns
f. Organize focus group topics
Focus group discussions typically begin with general questions and end with one or two specific questions tied
to the study objectives. Because a group cannot adequately discuss a long list of questions in 90 minutes,
choose 6 to 10 questions, grouping similar questions. Once participants become comfortable, they may be
more likely to answer sensitive questions, so ask these questions toward the middle of the discussion.
3.3 Conducting the FGD
3.3.1 Participants
According to Goldenkoff (2004), "The key to focus groups is participant chemistry." To encourage participation
and openness, select participants with common concerns or backgrounds who don't know each other. The
American Statistical Association (1997) cautions, "Never put people together who are in the same chain of
command," so don't include a professor and her student or an employee and his boss in the same group. It's
not necessary to randomly select participants because results from a focus group are not meant to generalize
to a larger population. The goal is to recruit enough participants to get a full range of opinion, but not so many
as to discourage participation.
3.3.2 Setting
The setting should be convenient, comfortable, and relaxing. Rooms with one-way mirrors,
conference tables, and microphones hanging from the ceiling may make participants feel like they are
performing, so make the setting informal, because people are more likely to open up if they feel at home. If
business operations are being discussed, a conference room may be fine, but for more personal topics, living
room-style seating is better. Serving light snacks and beverages can create a friendly atmosphere. If you are
using food as an incentive, however, serve it before or after the session, so it doesn't distract participants from
the discussion.
Dressing appropriately for the setting will improve rapport. It's acceptable to wear blue jeans for a student
focus group but better to wear more professional attire among program managers or administrators.
3.3.3 Moderating
An effective moderator keeps the discussion focused without discouraging the sharing of ideas and gets all
members to contribute while making sure that one or two members don't dominate. Some of the important
qualities of moderators are the following:
a. Knowledgeable: become thoroughly familiar with the topics of the focus group.
b. Enthusiastic: value your work but remain impartial.
c. Structuring: explain the purpose for the focus group; ask whether participants have questions.
d. Clear: ask simple, easy, short questions without using jargon.
e. Approachable: blend in; make sure the group can relate to you.
f. Gentle: allow people to finish; give them time to think; tolerate pauses.
g. Sensitive: listen attentively to what is said and how it is said; be empathic.
h. Open and flexible: respond to what is important to the participants.
i. Steering: know what you want to find out; keep the group focused; keep one or two members from
dominating.
j. Critical: prepare to politely challenge what is said. For example, you might question inconsistencies in
participants' replies.
k. Remembering and integrating: relate what is said to what has previously been said.
l. Interpreting: clarify and extend meanings of participants' statements without changing the meaning.
m. Inclusive: encourage reserved members to contribute by using eye contact, body language, and directly
asking for their input.
The focus group discussion begins with an introduction that explains the purpose, ground rules, and duration
(usually between 45 and 90 minutes) and conveys the expectation that everyone will contribute, all
contributions will be valued and remain confidential, and the session will be tape-recorded. Recording
increases the accuracy of your conclusions, so test your recording equipment immediately before each focus
group.
Inform participants of any exceptions to confidentiality. For example, if a participant discloses details of child
abuse or threats to his or her safety, you may be required by law to report this. Anticipate possible emotional
reactions from participants and how you will handle them.
After the introduction, the moderator typically has group members introduce themselves or uses an icebreaking
exercise to get them involved. To preserve confidentiality and commonality, the moderator should ask
members to introduce themselves by first name only and should avoid topics that emphasize differences in
status that might threaten cohesion.
For groups that focus on sensitive issues such as race or gender, the moderator's demographic background
should match that of participants.
Skilled moderators use reinforcers and probes. Reinforcers communicate interest in what members share but
don't suggest what is expected or acceptable. Use reinforcers like, "I see," or "Let me write that down," but
avoid comments like, "Excellent response," or nodding your head after some responses but not others. Try to
smile and appear open and friendly.
Be prepared to use probes such as, "Could you tell me some more about that?" "What do you mean by that?"
or "Anything else?" Allow participants time to respond, using silence in moderation to encourage someone to
expand on an answer. Nonverbal behaviors will help you judge whether a participant is uncomfortable or just
thinking about an answer. When a participant rambles or does not state a clear point of view, ask an
interpretive question, such as, "Do you mean that your priorities have shifted from developing programs to
building support for programs?"
At the end of the discussion, summarize important points to ensure you have made the correct interpretation
and to allow participants to elaborate. Always thank respondents for their participation and ask them if they
have any questions for you.
3.4 STEPS IN ANALYZING FGD RESULTS
3.4.1 Review individual transcripts
Family Health International (2005). Qualitative Research Methods: A Data Collector's Guide. Module 1:
Qualitative Research Methods Overview. Research Triangle Park, NC.
Family Health International (2005). Qualitative Research Methods: A Data Collector's Guide. Module 3:
In-depth Interviews. Research Triangle Park, NC.
Family Health International (2005). Qualitative Research Methods: A Data Collector's Guide. Module 4: Focus
Groups. Research Triangle Park, NC.
Frechtling, J. and Sharp, L. (1997). User-Friendly Handbook for Mixed Method Evaluations. Directorate for
Education and Human Resources, National Science Foundation.
USAID Center for Development Information and Evaluation (1996). Performance Monitoring and Evaluation
TIPS: Conducting Focus Group Interviews. USAID CDIE.
LECTURE 8
SAMPLING DESIGNS
Ophelia M. Mendoza, DrPH
1. Advantages of Sampling
a. It is cheaper.
b. It is faster.
c. Better quality of information can be collected.
d. More comprehensive data may be obtained.
e. It is the only possible method when the procedure is destructive.
2. Definition of Terms
a. Population: the entire group of individuals or items of interest in the study
b. Target population: the group from which representative information is desired and to which inferences will
be made. Whatever conclusions are derived from the study will be generalized to the target population.
c. Sampling population: the population from which a sample will actually be taken
Ideally, the target population should be the same as the sampling population. However, there are
certain instances when there is a gap between the two, resulting from limited resources and other field
realities. When this occurs, what is important is for the investigator to determine the extent and
direction of the bias (if any) created by the gap between the target and the sampling population.
sampling is used, the sampling units at each level of selection and the corresponding sampling frames used
must be mentioned.
e. When? This refers to the time period for the conduct of the survey. It is an important consideration when the
variable being studied has seasonality.
4. Basic Sampling Designs
4.1 Non-probability Sampling Designs: the probability of each member of the sampling population being
selected in the sample is difficult to determine or cannot be specified. Hence the reliability of the resulting
estimates cannot be assessed.
4.1.1 Judgment or Purposive Sampling: a representative sample is selected based on an expert's subjective
judgment or on some pre-specified criteria
4.1.2 Accidental or Haphazard Sampling: whatever comes to hand or whoever is available is included in the
sample
4.1.3 Quota Sampling: data collectors are given quotas to meet; they keep on collecting data in a given place
until the quota is met
4.1.4 Snowball Technique: frequently used in studying hidden populations such as drug users, commercial sex
workers, etc. Selection of subjects is based on whom the earlier respondents have identified as members of the
eligible population for the survey
4.2 Probability Sampling Designs: the rules and procedures for selecting the sample and estimating the
parameters are explicitly and rigidly specified
4.2.1 Simple Random Sampling
Characteristics: Every element in the population has an equal chance of
being included in the sample
Procedures for Sample Selection:
a. Prepare the sampling frame
b. Number all the population elements in the sampling frame consecutively from 1 to N, where N is the
population size
c. Determine the required sample size, n.
d. Select n numbers at random between 1 and N, using either the lottery method or a table of random numbers
e. The population elements in the list whose numbers correspond to the n numbers randomly selected will
comprise the simple random sample
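The selection steps above can be sketched in Python. This is only an illustration, not part of the original handout: the function name `simple_random_sample`, the `seed` parameter, and the example frame of 800 households are assumptions, and the pseudo-random generator stands in for the lottery method or a table of random numbers.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements from the sampling frame without replacement."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and N")
    rng = random.Random(seed)
    # Number the elements 1..N and pick n distinct numbers at random.
    numbers = rng.sample(range(1, len(frame) + 1), n)
    # The elements whose numbers were drawn make up the sample.
    return [frame[i - 1] for i in sorted(numbers)]

# Hypothetical sampling frame of N = 800 households, sample size n = 200.
frame = [f"household-{i}" for i in range(1, 801)]
sample = simple_random_sample(frame, 200, seed=1)
```

Because the numbers are drawn without replacement, no element can appear twice in the sample.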
4.2.2 Stratified Random Sampling
Characteristics: This design is used when the investigator wants to:
a. ensure that groups of interest or subsections of the population
considered important for the study are adequately represented
b. derive reasonably accurate estimates for important subsections of the population
Procedures:
a. Identify the stratification variable.
b. Classify the population elements according to the categories of the stratification variable.
c. Number the population elements consecutively from 1 to N, within each category of the stratification
variable
d. Determine the sample size needed from each stratum
e. Within each stratum, select the required number of samples by simple random sampling.
A frequent question asked in relation to stratified random sampling is how to allocate the computed sample
size to the various strata. There are several ways of doing it, and one of the most frequently used is
proportional allocation. It is commonly used because its application results in equal probability of
selection. As such, data analysis is simplified since it avoids the need to compute and apply sampling
weights in the estimation of population parameters. The following is an example of how proportional allocation
of samples is applied to the different strata:
Suppose we want to allocate 250 samples to 3 sample barangays included in the study. These 3 barangays
have the following populations:

BARANGAY     POPULATION SIZE
             NUMBER       %
A              3000     15.0
B             10500     52.5
C              6500     32.5
TOTAL         20000    100.0

The 250 samples can be allocated to the 3 barangays to reflect the population distribution as follows:

BARANGAY     POPULATION SIZE       SAMPLE SIZE
             NUMBER       %        NUMBER       %
A              3000     15.0           38     15.0
B             10500     52.5          131     52.5
C              6500     32.5           81     32.5
TOTAL         20000    100.0          250    100.0
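The allocation in the table can be reproduced with a short Python sketch. The function name `proportional_allocation` is hypothetical, and the rounding rule (largest remainder, so that the stratum sample sizes add up exactly to the total) is an assumption; the handout does not state how the fractional values 37.5, 131.25 and 81.25 were rounded.

```python
def proportional_allocation(strata_sizes, n):
    """Allocate n samples to strata in proportion to their population sizes."""
    total = sum(strata_sizes.values())
    # Whole-number part of each stratum's share (integer arithmetic avoids float error).
    allocation = {k: n * size // total for k, size in strata_sizes.items()}
    remainders = {k: n * size % total for k, size in strata_sizes.items()}
    # Hand out the samples still unassigned to the strata with the largest remainders.
    leftover = n - sum(allocation.values())
    for k in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        allocation[k] += 1
    return allocation

barangays = {"A": 3000, "B": 10500, "C": 6500}
allocation = proportional_allocation(barangays, 250)
# allocation -> {'A': 38, 'B': 131, 'C': 81}
```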
Systematic sampling
Not needed
1. Compute the sampling interval, k, where k = N/n. Therefore k = 800/200 = 4. This means that for
every 4 households in the population, 1 household will be selected as sample.
2. Select a random number between 1 and 4. Suppose #2 was selected. Therefore the second
household in the population to be studied is included as sample.
3. Every 4th household thereafter will be included in the study. These include households number
2, 6, 10, 14, 18, 22, 26, 30, 34, 38, etc.
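The systematic selection described above can be sketched as follows. The function name `systematic_sample` and its parameters are assumptions for illustration; the sketch also assumes that N is an exact multiple of n, as in the example with N = 800 and n = 200.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the numbers of the n units selected from a population of N units."""
    k = N // n  # sampling interval
    if start is None:
        # Random start between 1 and k, inclusive.
        start = random.Random(seed).randint(1, k)
    # Take the random start, then every k-th unit after it.
    return [start + i * k for i in range(n)]

selected = systematic_sample(800, 200, start=2)
# selected[:10] -> [2, 6, 10, 14, 18, 22, 26, 30, 34, 38]
```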
4.2.4 Cluster Sampling
Characteristics:
a. It is used when a frame for the individual elementary units in the
population is not available. However, a frame for groups or clusters of
elements is available.
b. The sampling unit is different from the elementary unit.
Procedures:
a. Identify the groups or clusters of elementary units. It is best if the sizes of
the clusters are not too big and do not vary much from each other.
b. Select a random sample of clusters.
c. All elements in the selected clusters will be included in the survey.
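A minimal sketch of the cluster sampling procedure, assuming the frame of clusters is available as a dictionary that maps each cluster to its elementary units. The function name `cluster_sample` and the example of 20 schools with 30 students each are hypothetical.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Select m clusters at random and include every element of each selected cluster."""
    rng = random.Random(seed)
    # The sampling unit is the cluster, not the elementary unit.
    chosen = rng.sample(sorted(clusters), m)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical frame: 20 schools (clusters), each with 30 students (elementary units).
clusters = {f"school-{s}": [f"student-{s}-{i}" for i in range(1, 31)] for s in range(1, 21)}
sample = cluster_sample(clusters, m=4, seed=3)
```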
4.2.5 Multi-stage Sampling
Characteristics:
a. It is generally used when the survey has a wide coverage and a sampling
frame for the elementary units is difficult to obtain.
b. Sampling is done in successive stages.
c. Data collection is concentrated only on the samples selected at each stage, resulting in lower cost per unit of
inquiry.
d. Stratification and systematic sampling may be incorporated at any stage.
e. Statistical analysis of the data is more complicated.
Procedures for Sample Selection
a. Identify the number of stages of selection to be used in the sampling
design and the sampling units to be used at each stage.
b. Determine the sample size necessary for each stage of selection.
c. Prepare the sampling frame for the 1st stage of selection, and select at random a sample of primary
sampling units (PSUs).
d. For each of the PSUs earlier selected, prepare the sampling frame for the 2nd stage of selection. Randomly
select the corresponding number of secondary sampling units (SSUs) from each PSU included in the sample.
e. Repeat the process of frame preparation and sample selection until the last stage of sampling is reached.
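The procedure above can be illustrated for the simplest case of two stages. This is a sketch under assumptions not found in the handout: the function name `two_stage_sample`, the use of simple random sampling at both stages, and the example frame of 10 barangays with 50 households each are all hypothetical.

```python
import random

def two_stage_sample(frame, n_psu, n_ssu, seed=None):
    """frame maps each primary sampling unit (PSU) to the list of its secondary units (SSUs)."""
    rng = random.Random(seed)
    # Stage 1: simple random sample of PSUs from the first-stage frame.
    psus = rng.sample(sorted(frame), n_psu)
    # Stage 2: within each selected PSU only, a simple random sample of SSUs.
    return {psu: rng.sample(frame[psu], min(n_ssu, len(frame[psu]))) for psu in psus}

# Hypothetical frame: 10 barangays (PSUs), each with 50 numbered households (SSUs).
frame = {f"barangay-{b}": list(range(1, 51)) for b in range(1, 11)}
sample = two_stage_sample(frame, n_psu=3, n_ssu=5, seed=7)
```

Only the selected PSUs need a second-stage frame, which is why data collection and listing costs are lower than with a single-stage design.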
LECTURE 9
SAMPLE SIZE DETERMINATION
Ophelia M. Mendoza, DrPH
1. WHEN IS IT IMPORTANT TO DETERMINE THE ADEQUATE SAMPLE SIZE FOR A GIVEN STUDY?
When the study is based on a sample instead of the whole population
When a probability sampling design was used to select the sample
When it is important to derive precise estimates of the variables/parameters being studied
2. GENERAL COMMENTS ABOUT SAMPLE SIZE DETERMINATION:
2.1 Sample size determination is a complicated issue which needs a lot of:
a. statistical inputs
b. practical considerations
2.2 The formula for sample size determination differs according to:
a. type of study design
b. type of sampling design
c. type of variables being measured
d. study objectives
e. number of groups being studied and compared
2.3 The following generalizations can be made regarding the sample size requirements of a given study:
a. longitudinal study designs require larger samples than cross-sectional or case-control study designs
b. cluster sampling designs require larger samples than simple random sampling of elementary units
c. the smaller the value of the parameter being estimated, the larger the sample size needed
rare conditions
small differences
d. the more heterogeneous the variable is in the population, the larger the sample size that is necessary
e. the more precise and the higher the confidence level you wish to have for the resulting estimates, the larger
is the sample size needed
3. INFORMATION NEEDED FOR SAMPLE SIZE DETERMINATION
3.1 ESTIMATING A MEAN OR A PROPORTION:
a. the anticipated value of the parameter to be estimated in the study (e.g., the prevalence of the disease; the
average length of stay in the hospital, etc.)
Possible sources of this value are:
i. previous studies or past records
ii. values derived from the pre-test or pilot phase of the project
iii. an expert's opinion or an educated guess
iv. conducting the study in two parts
b. the degree of precision required for the resulting estimates (margin of error): this can be expressed either
in absolute (e.g., 5%) or in relative terms (e.g., 5% of the value of the resulting estimate). The value of an
acceptable margin of error will depend on:
the magnitude/level of the parameter being estimated
how the results of the study will be used
available resources for the conduct of the study
c. the desired confidence level: standard levels used are 90%, 95% and 99%, with 95% being the most
commonly used
d. the estimated degree of variability of the observations (variance or the standard deviation)
3.2 SAMPLE SIZE FORMULA FOR ESTIMATING PROPORTIONS, USING SIMPLE RANDOM
SAMPLING
$$ n = \frac{z^{2} P Q}{d^{2}} $$
where:
z = a value derived from the normal distribution that depends on the desired confidence level
for the derivation of the estimate. The z-values corresponding to the standard confidence levels
used when deriving estimates in research studies are as follows:

Confidence level    z-value
90%                 1.645
95%                 1.96
99%                 2.58

P = anticipated value of the proportion to be estimated in the population
Q = 1 - P (the complement of P, where P + Q = 1)
d = the margin of error or maximum permissible error; a measure of the desired level of precision for the
resulting estimates
3.2.1 EXAMPLE:
A Municipal Health Officer wishes to conduct a survey to determine the prevalence of malnutrition among
preschoolers in his area. The only background data available is the result of a study done by his predecessor 5
years ago which indicates that the prevalence of moderate and severe malnutrition among preschoolers is
25%. If he decides to select a random sample of preschoolers for his study, how big should his sample size be
if he sets his error rate to be within 5%, with 95% confidence?
For this example:
z = 1.96 (based on the desired confidence level of 95%)
P = 0.25 (malnutrition prevalence based on a survey done 5 years ago)
Q = 1 - 0.25 = 0.75
d = 0.05
Therefore,
$$ n = \frac{(1.96)^{2}(0.25)(0.75)}{(0.05)^{2}} \approx 288 $$
Note that sample sizes for single proportions corresponding to a given confidence level have already been
tabulated. An example is Table 1 from the book of Lwanga and Lemeshow which is included in your handout.
From this table, we get the same sample size of 288.
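The computation in this example can be checked with a few lines of Python. The function name `sample_size_proportion` is hypothetical. The sketch returns the unrounded value; the handout reports 288 (the value rounded to the nearest whole number), whereas a rule of always rounding up would give 289.

```python
def sample_size_proportion(p, d, z=1.96):
    """Unrounded sample size for estimating a proportion: n = z^2 * P * Q / d^2."""
    q = 1 - p
    return z ** 2 * p * q / d ** 2

# Malnutrition example: P = 0.25, d = 0.05, 95% confidence (z = 1.96).
n = sample_size_proportion(p=0.25, d=0.05)
# n is about 288.12; round(n) -> 288
```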
3.3 COMPARING TWO SAMPLE PROPORTIONS, P1 AND P2
3.3.1 INFORMATION NEEDED:
a. Anticipated values of the two proportions, P1 and P2
b. Magnitude of the difference between P1 and P2 which the investigator regards as clinically or
practically meaningful and which should be detected by the statistical test. If targets have been specified for
the amount of change observed in the indicators being studied in the research (ex., a 30% decrease in the
prevalence of malnutrition among preschoolers between the intervention and the control groups) the difference
between P1 and P2 will be equal to the target
c. Desired level of significance of the test (α): this refers to the degree of confidence with which it is desired
to be certain that an observed change or comparison group difference of the magnitude specified in (b) above
would not have occurred by chance. The conventional levels of significance used are 1%, 5% and 10%, with
5% being the most commonly used.
d. Desired power of the test (1-β): this refers to the degree of confidence with which it is desired to be certain
that an actual change or difference of the magnitude specified in (b) above will be detected. Conventional
levels of statistical power used are 80% and 90%.
3.3.2 SAMPLE SIZE FORMULA FOR TESTING FOR THE DIFFERENCE BETWEEN TWO
PROPORTIONS, USING SIMPLE RANDOM SAMPLING
$$ n = \frac{\left\{ z_{1-\alpha/2}\sqrt{2\bar{P}(1-\bar{P})} + z_{1-\beta}\sqrt{P_1(1-P_1) + P_2(1-P_2)} \right\}^{2}}{(P_1 - P_2)^{2}} $$
Where:
$\bar{P}$ = (P1 + P2) / 2
P1 = anticipated value of the population proportion for the first group; this is usually the
pre/baseline value, or the value of the control or the comparison group
P2 = anticipated value of the population proportion for the second group; this is usually the
post/endline value, or the value of the study group
$z_{1-\alpha/2}$ = z-value corresponding to the desired level of significance for the test of hypothesis (for a
one-tailed test, $z_{1-\alpha}$ is used instead)
$z_{1-\beta}$ = z-value corresponding to the desired power of the test
The sample size values for the difference between two proportions for a one-tailed test with a 5% level of
significance and 90% power are presented in Table 5(a) of Lwanga and Lemeshow, which is included among
your handouts.
3.3.3 EXAMPLE:
a. It is believed that the proportion of patients who develop complications after undergoing one type of surgery
is 5% while the proportion of patients who develop complications after a second type of surgery is 15%. How
large should the sample size be in each of the two groups of patients if an investigator wishes to detect, with a
power of 90%, whether the second procedure has a complication rate significantly higher than the first at the
5% level of significance? (Ans. From Table 5(a) of Lwanga and Lemeshow, the needed sample size for this
study is 153 patients per group).
b. An NGO wishes to determine the effectiveness of their health education program on HIV/AIDS which they
have implemented. Among the important indicators used in this program is the proportion of female sex
workers (FSWs) who require their clients to use condoms. Suppose experiences in other areas where similar
projects on HIV/AIDS have been conducted showed that the proportion of female sex workers who required
condom use increased from 20% at baseline (before exposure to health education) to 50% after the program. Using
these results from previous studies as basis, how many FSWs should be interviewed by the NGO in order to
determine, with 90% power and 5% level of significance, whether the proportion of FSWs who require the use
of condoms by their clients has significantly increased after their program? (Use Table 5(a) to determine the
sample size). (Ans. From Table 5(a) of Lwanga and Lemeshow, the needed sample size is 42 female sex
workers).
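The two answers above can be reproduced without the table. The sketch below applies the formula in 3.3.2 with the square roots written out and, because Table 5(a) is for a one-tailed test, uses the z-value for 1-α rather than 1-α/2 by default. The function name `sample_size_two_proportions` and its parameters are assumptions for illustration, and the result is rounded up to the next whole number.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.90, one_tailed=True):
    """Sample size per group for testing the difference between two proportions."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    z_alpha = z(1 - alpha) if one_tailed else z(1 - alpha / 2)
    z_beta = z(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

n_surgery = sample_size_two_proportions(0.05, 0.15)  # example (a) -> 153
n_fsw = sample_size_two_proportions(0.20, 0.50)      # example (b) -> 42
```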
3.4 SAMPLE SIZE FORMULAS FOR ESTIMATING MEANS
3.4.1 ESTIMATING A SINGLE POPULATION MEAN
3.4.1.1 Formula for sample size determination
The sample size formula for estimating a single population mean is:
$$ n = \frac{z_{1-\alpha/2}^{2}\,\sigma^{2}}{d^{2}} $$
where:
σ = estimated standard deviation of the variable being studied
d = the desired precision, expressed in absolute terms
3.4.1.2 Example
Suppose a researcher wishes to determine the average increase in body weight of infant rats given treatment
A within a certain period of time. Since the study is still in its planning stage, there is no reliable estimate of the
variance or its standard deviation available. However, on the basis of a previous study, as well as of the results
of a number of pilot studies done on a small number of rats, it can be approximated that the standard deviation
of the body weight increase of infant rats would be 20g. How large should the sample size be if the researcher
wishes to have a margin of error of 10g for the resulting estimate with 95% confidence?
Solution:
In the above problem, z = 1.96 (based on a 95% confidence level)
σ = 20
d = 10
Substituting the above values in the formula for sample size determination for estimating a single mean, we
get:
$$ n = \frac{(1.96)^{2}(20)^{2}}{(10)^{2}} = \frac{(3.84)(400)}{100} = \frac{1536}{100} \approx 15.4, \text{ or 16 rats} $$
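As a quick check of the arithmetic, the same calculation can be written in Python. The function name `sample_size_mean` is hypothetical; the result is rounded up to the next whole number, as in the example.

```python
from math import ceil

def sample_size_mean(sigma, d, z=1.96):
    """Sample size for estimating a single mean: n = z^2 * sigma^2 / d^2, rounded up."""
    return ceil(z ** 2 * sigma ** 2 / d ** 2)

# Infant rat example: standard deviation 20 g, margin of error 10 g, 95% confidence.
n = sample_size_mean(sigma=20, d=10)
# n -> 16
```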
3.4.2 TESTING FOR THE DIFFERENCE BETWEEN TWO POPULATION MEANS
3.4.2.1 Formula for sample size determination
The formula for comparing the means of 2 groups, with variances assumed to be equal in each group, is:
$$ n = \frac{2\sigma^{2}\left[ z_{1-\alpha/2} + z_{1-\beta} \right]^{2}}{(\mu_1 - \mu_2)^{2}} $$
where:
σ = estimated standard deviation of the variable being studied, assumed to be equal for each group
μ1 - μ2 = clinically or practically meaningful difference between the means of the 2 groups which the
investigators wish to detect in the study
and $z_{1-\alpha/2}$ and $z_{1-\beta}$ have the following values corresponding to the confidence level and power of the test,
respectively:
Confidence level    Value of $z_{1-\alpha/2}$
90%                 1.645
95%                 1.96
99%                 2.58

Power of the test   Value of $z_{1-\beta}$
70%                 0.524
80%                 0.842
90%                 1.282
95%                 1.645
3.4.2.2 Example
A researcher wishes to compare the mean weight gain of rats subjected to treatment A with those subjected to
treatment B in a given period of time. He considers a difference of at least 30g between the two groups to be
practically meaningful. How many rats per group should he include in his study if he sets his confidence level
and power of the test to be both 95%? Based on past studies, the standard deviation of the weight gains of rats
given treatments A and B were found to be equal at 20g.
Solution:
In the above problem:
σ = 20 g
$z_{1-\alpha/2}$ = 1.96
$z_{1-\beta}$ = 1.645
μ1 - μ2 = 30 g
Substituting the above values in the formula for sample size determination for the difference between two
means:
$$ n = \frac{2\sigma^{2}\left[ z_{1-\alpha/2} + z_{1-\beta} \right]^{2}}{(\mu_1 - \mu_2)^{2}} = \frac{2(20)^{2}\left[ 1.96 + 1.645 \right]^{2}}{(30)^{2}} = \frac{800\,(12.9960)}{900} = 11.6 $$
n = 12 rats per group
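The same result can be obtained in Python, computing the z-values from the standard normal distribution instead of reading them from the table. The function name `sample_size_two_means` and its parameters are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_means(sigma, diff, confidence=0.95, power=0.95):
    """Sample size per group for comparing two means with equal standard deviations."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    z_alpha = z(1 - (1 - confidence) / 2)  # two-sided, e.g. about 1.96 for 95%
    z_beta = z(power)                      # e.g. about 1.645 for 95% power
    return ceil(2 * sigma ** 2 * (z_alpha + z_beta) ** 2 / diff ** 2)

# Rat weight-gain example: sigma = 20 g, meaningful difference = 30 g.
n = sample_size_two_means(sigma=20, diff=30)
# n -> 12 rats per group
```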
3.5 ADDITIONAL CONSIDERATIONS IN SAMPLE SIZE DETERMINATION
a. When deciding on the sample size requirements for a study with more than one objective involving the
estimation and/or testing of several parameters and hypotheses, the sample size requirement of each important
parameter has to be computed and considered.
b. When estimating a proportion whose value is unknown, a common practice is to assume that P = 0.50. The
basis for this is the fact that the variance of an indicator in the form of a proportion reaches its maximum
value when P = 0.50 and Q = 0.50; assuming this value therefore ensures an adequate sample size irrespective
of the actual value of P.
c. When the sampling design used makes use of cluster sampling instead of pure simple random sampling, the
sample size has to be corrected for the design effect (deff) i.e.,
n(cluster sampling) = n(simple random sampling) x deff
Deff is the factor by which the sample size for a cluster sample has to be increased in order to derive
estimates with the same precision as a simple random sample. It has been shown that for most health surveys,
deff = 1.5 to 2.0, with deff=2.0 being a common value used.
d. In order to ensure that the required sample size is reached, a correction factor for non-response is usually
applied at the time of sample size determination. This avoids the need for looking for substitutes during data
collection which usually introduces biases in sample selection. The non-response rate varies depending on the
survey setting (e.g., urban areas generally have higher non-response rates compared to rural areas; surveys
which ask sensitive questions also have higher non-response rates) but in general, an inflation factor of
10% for non-response has been shown to be adequate in most situations. Therefore, if for example, the
required sample size of a given survey after applying the design effect is 800, then the revised target sample
size after applying the correction factor for non-response will be 800 + 80 = 880. Data collection activities
should therefore be planned for a sample size of 880.
e. There are instances when the computed sample size is deemed too big relative to the population size.
(There are even instances when the computed sample size is bigger than the population size.) This is when the
finite population correction (fpc) can be applied to determine the final sample size to be considered. The
sample size formula after application of the fpc is:
nfpc = n0 / (1 + n0/N)
where nfpc = computed sample size after application of the finite population correction
n0 = initial sample size computed prior to application of fpc
N = population size
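The adjustments described in items b to e can be sketched in Python as follows. This is a minimal illustration and not part of the original lecture: the function names are hypothetical, rounding up to the next whole unit is an assumption, and the starting sample size of 400 and the population size of 2000 are invented figures chosen only to exercise the formulas (400 x deff of 2.0 happens to reproduce the 800 used in the text).

```python
import math

def variance_of_proportion(p):
    """P x Q, the quantity that is largest when P = Q = 0.50."""
    return p * (1 - p)

def apply_design_effect(n_srs, deff=2.0):
    """n(cluster sampling) = n(simple random sampling) x deff."""
    return math.ceil(n_srs * deff)

def apply_nonresponse(n, inflation_percent=10):
    """Add an allowance for non-response (10% in the text)."""
    # The percentage is kept as a whole number so that 800 gives exactly 80.
    return n + math.ceil(n * inflation_percent / 100)

def apply_fpc(n0, N):
    """nfpc = n0 / (1 + n0/N)."""
    return math.ceil(n0 / (1 + n0 / N))

# P x Q over P = 0.01, 0.02, ..., 0.99 is largest at P = 0.50
best_percent = max(range(1, 100), key=lambda k: variance_of_proportion(k / 100))
print("P x Q is largest at P =", best_percent / 100)

n_cluster = apply_design_effect(400)      # hypothetical n of 400 under simple random sampling
n_target = apply_nonresponse(n_cluster)   # 800 + 80 = 880, as in the text
n_final = apply_fpc(n_target, 2000)       # hypothetical population of 2000
print(n_cluster, n_target, n_final)
```

The text does not say in which order the finite population correction and the other adjustments should be applied, so the order used in the last three lines is only one possible choice.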
References:
Aday, L.A. Designing and Conducting Health Surveys: A Comprehensive Guide (Second Edition). Jossey-Bass
Publishers. San Francisco. 1996.
Lwanga, S.K. and Lemeshow, S. Sample Size Determination in Health Studies: A Practical Manual.
World Health Organization. Geneva. 1991.
Magnani, R. Sampling Guide. Food and Nutrition Technical Assistance Project. USAID. 1997.
LECTURE 10
AN OVERVIEW OF DATA ANALYSIS TECHNIQUES
Ophelia M. Mendoza, DrPH
SEX          NUMBER     PERCENT (%)
Male            365            43.8
Female          469            56.2
TOTAL           834           100.0

SMOKING BEHAVIOR     PERCENT (%)
Smoker                      20.9
Non-smoker                  79.1
TOTAL                      100.0
b. Example of a cross-tabulation
Table 3. Distribution of Respondents According to Sex and Smoking Behavior
SEX          SMOKER     NON-SMOKER     TOTAL
Male            102            263       365
Female           72            397       469
TOTAL           174            660       834
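As a rough illustration of how Table 3 can be read, the sketch below stores the four cell counts and derives the marginal totals and the percentage of smokers within each sex. Only the counts come from the table; the variable and function names are hypothetical, and computing row percentages is an assumption about which comparison is of interest.

```python
# Cell counts from Table 3, stored as counts[sex][smoking behavior]
counts = {
    "Male":   {"Smoker": 102, "Non-smoker": 263},
    "Female": {"Smoker": 72,  "Non-smoker": 397},
}

def row_totals(table):
    """Total number of respondents in each row (here, each sex)."""
    return {row: sum(cells.values()) for row, cells in table.items()}

def column_totals(table):
    """Total number of respondents in each column (here, each smoking category)."""
    totals = {}
    for cells in table.values():
        for col, value in cells.items():
            totals[col] = totals.get(col, 0) + value
    return totals

def row_percentages(table):
    """Each cell as a percentage of its row total, rounded to one decimal."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: round(100 * v / total, 1) for col, v in cells.items()}
    return result

print(row_totals(counts))
print(column_totals(counts))
print(row_percentages(counts))
```

Row percentages put the two sexes on the same footing even though the number of male and female respondents differs.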
Suppose the research objective is to determine if there is a relationship between the smoking behavior and sex
of high school students in private and public schools. Which of the two dummy tables below is more
appropriate for answering this objective?
Table 4. Distribution of High School Students According to Smoking Behavior, Sex and Type of School Attended
                        Sex of Student
Smoking Behavior        Male       Female       TOTAL
Smoker
Non-smoker
TOTAL

Table 5. Distribution of High School Students According to Smoking Behavior, Sex and Type of School Attended
                        Public Schools          Private Schools
Smoking Behavior        Male       Female       Male       Female       TOTAL
Smoker
Non-smoker
TOTAL
EXERCISE ON CROSS-TABULATION
Construct dummy tables corresponding to the following objectives:
1. To compare the seasonal patterns in the incidence of diarrhea and acute respiratory infections in 2010 and
2011, for Municipality X.
2. To compare the incidence of food poisoning among those who ate and did not eat fresh lumpia during a
wedding party, for different age-groups of guests.
3. To determine the relationship between hypertension and diabetes among adults with different levels of
physical activity.
4. To determine the relationship between the immunization status of children and the educational attainment of
mothers in urban and rural areas.
5. To determine the relationship between family planning practice and socio-economic status among Catholic
and non-Catholic women
2.1.2 Graphical Presentation
Table 6. Types of Graphs Commonly Used in Presenting Statistical Data
TYPE OF GRAPH              TYPE OF VARIABLE/                PURPOSE OF PRESENTING
                           DATA BEING GRAPHED               THE GRAPH

Histogram                  Continuous quantitative          To present a frequency distribution of a
                                                            quantitative continuous variable like age,
                                                            height, etc.

Frequency Polygon          Continuous quantitative          The same use as the histogram, but is better
                                                            to use when presenting more than one
                                                            frequency distribution in the same graph
                                                            (ex., comparison of the weights of male and
                                                            female children)

Bar Chart                  Qualitative, or                  To show or compare absolute counts or
(horizontal or vertical)   Discrete quantitative            relative figures (percentages, rates, etc.)
                                                            of qualitative or discrete quantitative
                                                            variables

Line Diagram                                                Used to show trends in absolute counts,
                                                            rates or means with respect to time, age,
                                                            etc.

Pie Chart                  Qualitative, or                  Shows how a total is divided into
                           broad categories of              sub-categories; used when the number of
                           quantitative variables           categories are not too many

                                                            Same as the pie chart, but is better to use
                                                            when presenting or comparing two or more
                                                            sets of data

Scatterpoint diagram       Quantitative                     To show the nature and the strength of the
                           (discrete or continuous)         relationship between two continuous
                                                            quantitative variables
2.1.3 Measures of Central Tendency - measures used to summarize a set of quantitative data by computing for
a representative figure.
If I were asked to describe the data using just one value, what would that value be?
a. Mean
statistical measure derived by dividing the sum of all observations by the total number of observations
included in the computations
is easily affected by outliers with extremely high or low values; therefore it is not recommended to be used
when the data has extreme values
there are several kinds of means; the most commonly used are the arithmetic mean, the weighted mean
and the geometric mean
b. Median
computed by determining the middlemost value in a set of observations
is not affected by outliers with extremely low or high values
is usually the measure of central tendency used when the distribution is skewed
c. Mode
is determined by identifying the value which occurs most frequently in the data
it is possible for a data set not to have any mode at all; it is also possible for a data set to have several
modes
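The behavior of the three measures can be illustrated with a short Python sketch using the standard `statistics` module. The weights are hypothetical values, not data from the lecture, and `modes` is an illustrative helper written here so that a data set in which every value occurs only once is reported as having no mode.

```python
import statistics
from collections import Counter

weights = [48, 50, 50, 52, 55]      # hypothetical weights in kg
with_outlier = weights + [120]      # the same data plus one extreme value

def modes(values):
    """Values that occur most often; an empty list if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []                   # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

# The mean is pulled toward the extreme value; the median barely moves.
print(statistics.mean(weights), statistics.mean(with_outlier))
print(statistics.median(weights), statistics.median(with_outlier))

# One mode, no mode, and several modes
print(modes(weights), modes([1, 2, 3]), modes([1, 1, 2, 2, 3]))
```

Adding the single value 120 raises the mean from 51 to 62.5 while the median only moves from 50 to 51, which is why the median is preferred for skewed data.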
2.1.4 Measures of Dispersion - measures used to describe the degree of variability or heterogeneity of a given
data set
If I were asked to describe how different the values in the data are using just one value, what would that
value be?
a. Range
computed by determining the difference between the highest and the lowest value in the data set
it is easily affected by outliers with extremely high or low values
b. Variance and Standard Deviation
computed by determining the average of the squared deviations from the mean of a given data set
the units of the computed value of the variance are in squared units (ex., 5.8 kg²)
if the square root of the variance is extracted, the resulting value is called the standard deviation
the variance and the standard deviation are the measures of dispersion usually used in more advanced
statistical analysis like hypothesis testing
c. Coefficient of Variation (CV)
computed by expressing the standard deviation as a percentage of the mean
used when comparing the variability of two variables which have different units of measurement
VARIABLE     MEAN          STANDARD DEVIATION     COEFFICIENT OF VARIATION
Weight       50.0 kg.      3.9 kg.                (3.9/50.0) x 100 = 7.8%
Height       160.0 cm.     7.5 cm.                (7.5/160.0) x 100 = 4.7%
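The measures of dispersion above can be sketched in Python with the standard `statistics` module. The list of weights is hypothetical; only the means and standard deviations passed to `coefficient_of_variation` come from the table. Note one assumption: "the average of the squared deviations" is taken literally here as division by n (`pvariance`), whereas many texts divide by n - 1 when the data are a sample (`variance`).

```python
import statistics

weights = [48, 50, 50, 52, 55]   # hypothetical weights in kg

def value_range(values):
    """Difference between the highest and the lowest value."""
    return max(values) - min(values)

def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100

print("range:", value_range(weights))
print("variance (divide by n):", statistics.pvariance(weights))
print("variance (divide by n - 1):", statistics.variance(weights))
print("standard deviation:", statistics.pstdev(weights))

# Figures from the table: weight and height have different units,
# so their variability is compared through the CV.
print("CV of weight:", round(coefficient_of_variation(3.9, 50.0), 1))
print("CV of height:", round(coefficient_of_variation(7.5, 160.0), 1))
```

Because the CV has no units, the 7.8% for weight and the 4.7% for height can be compared directly, showing that weight is relatively more variable than height in this example.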
Lecture 11
ADMINISTRATIVE ASPECTS OF RESEARCH:
DETERMINING THE PROJECT TIMETABLE AND BUDGET
Ophelia M. Mendoza, DrPH
Lecturer
1. THE RESEARCH TIMETABLE
1.1 What is the Research Timetable or Schedule?
It designates work to be done and specifies deadlines for completing tasks and deliverables. It includes:
Estimates of time/duration of each research task/activity
Start and finish dates of each task/activity
Sequence of tasks/activities
Names of responsible staff and resources assigned for each activity
It is a tool used by the Research Manager in planning, executing and controlling the various research tasks,
as well as in monitoring the progress of the research.
It defines timelines for key deliverables and sets expectations for research progress and completion
1.2 Steps in developing a research timetable
a. Determine the tasks to be included in the research timetable.
b. Determine the relationships among the tasks. This involves identifying the sequence of tasks, especially
those which need to be completed before other tasks can be started, as well as those which can be
performed at the same time.
c. Identify/assign responsible persons for each task.
d. Estimate the amount of time/effort required for each task.
e. Consider the other variables that go into building the schedule, such as when, where and how the tasks must be
performed; expected delays due to uncontrolled circumstances like scheduled brown-outs; time constraints of
research staff working only on a part-time basis, etc.
f. Construct a Gantt chart to graphically present the research timetable
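Steps a, b, d and f can be sketched in Python as below. Everything in this example is hypothetical: the task names, the durations in weeks and the dependencies are invented for illustration, and the text-based chart is only a stand-in for a proper Gantt chart. The sketch assumes that tasks are listed so that each one appears after the tasks it depends on.

```python
# (task, duration in weeks, tasks that must be finished first) - all hypothetical
tasks = [
    ("Finalize protocol",        2, []),
    ("Prepare data tools",       3, ["Finalize protocol"]),
    ("Train data collectors",    1, ["Prepare data tools"]),
    ("Collect data",             8, ["Train data collectors"]),
    ("Set up data entry system", 2, ["Prepare data tools"]),
    ("Analyze data",             4, ["Collect data", "Set up data entry system"]),
    ("Write report",             3, ["Analyze data"]),
]

def schedule(task_list):
    """Earliest start and finish week of each task, given its predecessors."""
    finish = {}
    plan = []
    for name, duration, predecessors in task_list:
        # A task starts once its latest predecessor has finished (week 0 if none).
        start = max((finish[p] for p in predecessors), default=0)
        finish[name] = start + duration
        plan.append((name, start, finish[name]))
    return plan

def gantt_lines(plan):
    """One text line per task, with one '#' per week of work."""
    width = max(len(name) for name, _, _ in plan)
    return [f"{name:<{width}} |{' ' * start}{'#' * (end - start)}"
            for name, start, end in plan]

plan = schedule(tasks)
for line in gantt_lines(plan):
    print(line)
```

Tasks that share a predecessor, such as training the data collectors and setting up the data entry system, appear side by side in the chart, which makes it easy to see what can be done at the same time.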
1.3 Determining Duration of Data Collection Activity in Research
Data collection is generally allocated the longest time among the various tasks in research
It is important for the researcher to have a good basis in determining the duration of time needed for data
collection. In certain instances, especially when the research results are needed
immediately, the time for data collection may be set at the start (ex., maximum of 2 months). In this case, the
task of the researcher is to determine how many data collectors and other resources are needed in order to
ensure that data collection can be finished within the prescribed time.
Important inputs needed to determine the duration of data collection are:
Average length of time needed to collect data from one sample. For example, in a household survey, this
refers to the average length of time to complete one household interview, including the time needed to travel to
and locate the sample household. The average length of interview is one of the important variables to be
determined during the pre-testing of the data collection tool.
Number of data collectors to be used/hired
An example of how the duration of data collection is computed is as follows:
Suppose a particular research project needs to interview 500 sample households. During the pre-test, it was
determined that the average time to interview one household is 2 hours, including travel time. If 2 interviewers
are available to collect the data for the research, how long will it take to collect data from 500 households?
Note that the interviewers will be collecting data for 8 hours/day, d days per week.
No. of households/day/interviewer = 8 working hours per day / 2 hrs per household
= 4 households/day/interviewer
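The computation above can be carried through with a short Python sketch. The function names are hypothetical and rounding up to whole days is an assumption. The number of working days per week is given only as d in the text, so it is left as a parameter; the value of 5 used below is a placeholder, as is the 40-day limit in the last line.

```python
import math

def households_per_day(hours_per_day, hours_per_household):
    """Households one interviewer can cover in a working day."""
    return hours_per_day / hours_per_household

def working_days_needed(sample_size, interviewers, hours_per_day, hours_per_household):
    """Working days needed for all interviewers together to cover the sample."""
    per_day_all = interviewers * households_per_day(hours_per_day, hours_per_household)
    return math.ceil(sample_size / per_day_all)

def weeks_needed(working_days, days_per_week):
    """Calendar weeks, given the number of working days per week."""
    return math.ceil(working_days / days_per_week)

def interviewers_needed(sample_size, max_working_days, hours_per_day, hours_per_household):
    """Interviewers required when the duration of data collection is fixed in advance."""
    per_interviewer = max_working_days * households_per_day(hours_per_day, hours_per_household)
    return math.ceil(sample_size / per_interviewer)

days = working_days_needed(500, 2, 8, 2)      # figures from the example in the text
print("working days:", days)
print("weeks at 5 days/week (placeholder):", weeks_needed(days, 5))
print("interviewers for a 40-day limit (placeholder):", interviewers_needed(500, 40, 8, 2))
```

With the figures from the text, 2 interviewers covering 4 households each per day reach 8 households per day, so 500 households take 62.5 days, rounded up to 63 working days.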