Course 3 CB: Research Methodology

Scientific Research Methodology
Data Collection
Measurement Strategies
Corin Badiu, 2007
Objectives
Establish the types of variables
IT data sources
Identify and locate the correct data sets
Within the institution
Outside the institution
Measurement strategies
Analysis, interpretation and reporting of results
Select the appropriate program: Excel, SPSS
Use the programs for simple statistical analyses and
graphical presentation of the results
Interpret the results
Types of Variables
Confounding variables*
Predictor*
Outcome
Effect modifiers*
*Generally regarded as exposure to risk factors
Types of Clinical Studies
Studies without variables
Case studies, case series, editorials,
opinions / commentaries, review reports
Studies with a single variable
Descriptive studies
Studies with 2 variables
Experiments
Observational studies
Meta-analyses and systematic reviews
Hierarchy of Clinical Study Types
Clinical studies
Descriptive
Analytical
Experimental
Observational
Cohort
Case-control
Cross-sectional
Variables
Predictor variable
(independent)
Outcome variable
(dependent)
Evidence Based Medicine

Treatment methods supported by
clinical and research evidence.
Requires integrating the best research
evidence for diagnosis and treatment
with clinical experience.
Takes into account what is optimal for
each patient, as well as the patient's
preferences.
The Reality
Research on treatment effectiveness is the
subject of only a small number of articles.
Evidence based medicine is regarded as a
concept that uses databases, including
systematic case studies, to guide
therapeutic interventions.
Evidence must be evaluated in an actual
therapeutic context: what type of intervention
makes sense for me as a practitioner?
Clinical Questions
What is the best choice of therapy
for my patient?
Is this program theoretically sound?
Does this therapy program work?
How long will the therapy take?
Where do I go from here?
Practicing Clinicians Needed!

Clinicians
are on the front line
have necessary clinical expertise
know their patients well
are naturally scientific thinkers
are well-versed in data collection
know how to look for outcomes
Baseline Measurements
A baseline is a measure of response
rates in the absence of treatment
Baselines
Establish a need for treatment
Document improvement
Allow us to modify the treatment if we don't see
improvement
Baseline Data
Create a set of exemplars of each of your
targets and prepare a recording sheet.
Utilize criterion referenced measures.
Data Collection Strategies

Always have more than one measurement
Check the reliability of the baseline data
Select research/clinical design
Research/Clinical Designs
ABA designs
Test, treat and test
ABAB designs
Test, treat, test and treat
Time-Series designs
Establish stable baseline
Begin treatment
Measure treatment results
Multiple-Baseline designs
Have a number of different baselines
Each baseline must be independent of the others
Only treat one variable
Data Collection Instruments

Requirements
Reliable
Valid
Responsive
Universal
Unbiased
Data Collection Instrument

Is it reliable?
Will the instrument measure consistently across:
Different testing situations?
Test-retest reliability
Different judges?
Inter-rater reliability

Is it valid?
Is the instrument
being used to
measure the kind of
data for which it
was intended?

Is it responsive?
The instrument should be equally sensitive, whether a
characteristic is present or absent.
Must measure both:
False-negatives:
You thought it was intact, but it was torn.
False-positives:
You thought it was torn, but it was intact.

Is it universal?
The investigator should
employ a widely used data
collection instrument, which
helps minimize reporting bias
because the data can then be
compared with other published
literature.

Is it unbiased?
There should be no difference between the true value
and the value that an investigator actually obtains
other than a difference caused by sampling variability.
Sampling Methods
1. Random Sampling (Simple)
2. Systematic Sampling
3. Stratified Sampling
4. Cluster Sampling
5. Convenience Sampling
6. More complex sampling
Qualitative and Quantitative Variables
Examples of qualitative variables are occupation, sex,
marital status, etc.
Variables that yield observations that can be measured are
considered to be quantitative variables. Examples of
quantitative variables are weight, height, and age
Quantitative variables can further be classified as discrete or
continuous
Variable types
1. Categorical variables (e.g., Sex, Marital Status,
income category)
2. Continuous variables (e.g., Age, income,
weight, height, time to achieve an outcome)
3. Discrete variables (e.g., Number of Children in
a family)
4. Binary or Dichotomous variables (e.g.,
response to all Yes or No type of questions)
Scale of Data
1. Nominal: These data do not represent an amount or quantity (e.g.,
Marital Status, Sex)
2. Ordinal: These data represent an ordered series of relationships (e.g.,
level of education)
3. Interval: These data are measured on an interval scale having equal
units but an arbitrary zero point (e.g., temperature in Fahrenheit)
4. Ratio: Variables such as weight, which have a true zero, so that one
value can be compared meaningfully with another (say, 100 kg is twice 50 kg)
Variables in the protocol

TYPES OF VARIABLE
independent
dependent
intermediate
confounding
Independent Variable
The characteristic being observed and/or
measured that is hypothesized to influence an
event or outcome (dependent variable).
NOTE
The independent variable is not influenced
by the event or outcome, but may cause it
or contribute to its variation.
Dependent Variable
A variable whose value is dependent on
the effect of other variables (i.e.,
independent variables) in the
relationship being studied. Synonyms:
outcome or response variable.
NOTE
an event or outcome whose variation we
seek to explain or account for by the
influence of independent variables.
Intermediate Variable
A variable that occurs in a causal pathway from
an independent to a dependent variable.
Synonyms: intervening, mediating
NOTES
it produces variation in the dependent
variable, and is caused to vary by the
independent variable.
such a variable is associated with both the
dependent and independent variables.
Confounding Variable
A factor (that is itself a determinant of the
outcome), that distorts the apparent effect of a
study variable on the outcome.
NOTE
such a factor may be unequally distributed
among the exposed and the unexposed, and
thereby influence the apparent magnitude and
even the direction of the effect.
Organizing Data
1. Frequency Table
2. Frequency Histogram
3. Relative Frequency Histogram
4. Frequency polygon
5. Relative Frequency polygon
6. Bar chart
7. Pie chart
8. Stem-and-leaf display
9. Box Plot
Frequency Table
Suppose we are interested in studying the number of
children in the families living in a community. The
following data has been collected based on a random
sample of n = 30 families from the community.
2, 2, 5, 3, 0, 1, 3, 2, 3, 4, 1, 3, 4, 5, 7, 3, 2, 4, 1, 0, 5, 8, 6,
5, 4 , 2, 4, 4, 7, 6
Organize this data in a Frequency Table!
Frequency Table
X = No. of Children   Count (Freq.)   Relative Freq.
0                     2               2/30 = 0.067
1                     3               3/30 = 0.100
2                     5               5/30 = 0.167
3                     5               5/30 = 0.167
4                     6               6/30 = 0.200
5                     4               4/30 = 0.133
6                     2               2/30 = 0.067
7                     2               2/30 = 0.067
8                     1               1/30 = 0.033
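The tallying behind this table can be sketched in a few lines of Python. This is only an illustration: the helper name `frequency_table` is an assumption, and the data are the 30 family sizes listed on the previous slide.

```python
from collections import Counter

# Number of children in each of the n = 30 sampled families (from the slide).
children = [2, 2, 5, 3, 0, 1, 3, 2, 3, 4, 1, 3, 4, 5, 7,
            3, 2, 4, 1, 0, 5, 8, 6, 5, 4, 2, 4, 4, 7, 6]


def frequency_table(values):
    """Return rows of (value, count, relative frequency), sorted by value."""
    counts = Counter(values)
    n = len(values)
    return [(x, counts[x], counts[x] / n) for x in sorted(counts)]


table = frequency_table(children)
for x, count, rel in table:
    # One row per distinct value: X, count, relative frequency.
    print(f"{x:>2} {count:>3} {rel:.3f}")
```

The relative frequencies always sum to 1, which is a quick check that no observation was dropped while counting.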
Frequency plot
Frequency Table
Now suppose we need to construct a similar frequency table for the
age of patients with Heart related problems in a clinic.
The following data has been collected based on a random sample of
n = 30 patients who went to the emergency room of the clinic for
Heart related problems.
The measurements are: 42, 38, 51, 53, 40, 68, 62, 36, 32, 45, 51, 67,
53, 59, 47, 63, 52, 64, 61, 43, 56, 58, 66, 54, 56, 52, 40, 55, 72, 69.
Frequency Table
Age Groups    Frequency   Relative Frequency
32-36.99      2           2/30 = 0.067
37-41.99      3           3/30 = 0.100
42-46.99      3           3/30 = 0.100
47-51.99      3           3/30 = 0.100
52-56.99      8           8/30 = 0.267
57-61.99      3           3/30 = 0.100
62-66.99      4           4/30 = 0.133
67-72         4           4/30 = 0.133
Total         n = 30      1.00
Measures of Central Tendency

Where is the heart of the distribution?
1. Mean
2. Median
3. Mode
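As a minimal sketch, the three measures can be computed with Python's standard `statistics` module. The data reused here are the family sizes from the frequency table example; the variable names are illustrative.

```python
import statistics

# Family sizes from the frequency table example (n = 30).
children = [2, 2, 5, 3, 0, 1, 3, 2, 3, 4, 1, 3, 4, 5, 7,
            3, 2, 4, 1, 0, 5, 8, 6, 5, 4, 2, 4, 4, 7, 6]

mean_children = statistics.mean(children)      # arithmetic average
median_children = statistics.median(children)  # middle value of the sorted data
mode_children = statistics.mode(children)      # most frequent value

print(mean_children, median_children, mode_children)
```

With an even number of observations the median is the average of the two middle values, so it need not be a value that actually occurs in the data.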
Empirical Rule
For a Normal distribution approximately,
a) 68% of the measurements fall within one standard
deviation around the mean
b) 95% of the measurements fall within two standard
deviations around the mean
c) 99.7% of the measurements fall within three
standard deviations around the mean
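The three percentages can be checked against the Normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `share_within` is an assumption made for illustration.

```python
from statistics import NormalDist


def share_within(k, mu=0.0, sigma=1.0):
    """Probability that a Normal(mu, sigma) value lies within k standard deviations of mu."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The result does not depend on the particular mean and standard deviation, which is why the rule can be stated for any Normal distribution.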
Prerequisite Skills
Fundamental concepts of measurement
Scales of measurement
Distribution, central tendency, variability,
probability
Disease prevalence and incidence
Disease outcomes (eg, fatality rates)
Associations (correlation or covariance)
Health impact (eg, risk differences and ratios)
Sensitivity, specificity, predictive values
Scales of Measure
Nominal - qualitative classification of equal
value: gender, race, color, city
Ordinal - qualitative classification which can
be rank ordered: socioeconomic status of
families
Interval - Numerical or quantitative data: can
be rank ordered and sizes compared :
temperature
Ratio - interval data with absolute zero value:
time or space
Distribution, Central Tendency, Variability, Probability
Mean
Median
Mode
Standard deviation
Statistical Significance p < .01
Confidence Interval
Statistical Significance
Type I and Type II errors
Null Hypothesis = Ho
                    Ho True            Ho False
Reject Ho           Type I error       Correct decision
Do Not Reject Ho    Correct decision   Type II error
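A small simulation can make the Type I cell of this table concrete: when Ho is true, a test run at significance level alpha should reject in roughly a fraction alpha of repeated studies. The sketch below is an illustration under stated assumptions, not part of the original slides: it uses a two-sided z-test with known standard deviation, alpha = 0.01 (echoing the p < .01 mentioned earlier), and arbitrary choices for sample size, number of trials and random seed.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value of a z-test for the mean, with known sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_i_error_rate(alpha=0.01, trials=5000, n=25, seed=0):
    """Fraction of simulated studies that reject Ho although Ho is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Ho is true by construction: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(rate)  # expected to be close to alpha
```

Estimating the Type II error rate would need the same loop with data drawn from a population where Ho is false, counting the studies that fail to reject.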
Statistics Online Textbook

The Statistics Homepage
http://www.statsoftinc.com/textbook/stathome.html
Disease Prevalence and Incidence
Prevalence
probability of disease in entire population at
any point in time
2% of the population has diabetes
Incidence
probability that patient without disease
develops disease during interval
0.2% or 2 per 1000 new cases per year
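The two definitions differ mainly in their denominators, which the following sketch tries to make explicit. The counts are hypothetical and were chosen only so that the results reproduce the 2% and 2 per 1000 figures above; the function names are likewise assumptions.

```python
def prevalence(existing_cases, population):
    """Proportion of the whole population that has the disease at a point in time."""
    return existing_cases / population


def cumulative_incidence(new_cases, at_risk_at_start):
    """Proportion of initially disease-free people who develop the disease in the interval."""
    return new_cases / at_risk_at_start


# Hypothetical community (illustrative numbers, not real data).
population = 50_000
existing_cases = 1_000                  # people who already have diabetes
at_risk = population - existing_cases   # only disease-free people can become new cases
new_cases_in_year = 98

print(prevalence(existing_cases, population))            # share of the population with disease
print(cumulative_incidence(new_cases_in_year, at_risk))  # yearly risk among the disease-free
```

Prevalence divides by everyone, while incidence divides only by those still at risk at the start of the interval.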
Sensitivity, Specificity
                   Patients with disease   Patients without disease
Test is positive   a                       b
Test is negative   c                       d
sensitivity = a / (a+c)
specificity = d / (b+d)
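A minimal sketch of the two formulas, where a, b, c and d are the cells of the 2x2 table: a = diseased with a positive test, b = non-diseased with a positive test, c = diseased with a negative test, d = non-diseased with a negative test. The counts used in the example are hypothetical.

```python
def sensitivity(a, c):
    """Proportion of patients with disease whose test is positive: a / (a + c)."""
    return a / (a + c)


def specificity(b, d):
    """Proportion of patients without disease whose test is negative: d / (b + d)."""
    return d / (b + d)


# Hypothetical study of 1000 patients (illustrative counts only).
a, b, c, d = 90, 30, 10, 870

sens = sensitivity(a, c)
spec = specificity(b, d)
print(round(sens, 3), round(spec, 3))
```

Both measures are computed within a column of the table, so they do not change when the proportion of diseased patients in the study changes.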
Predictive Value
                   Patients with disease   Patients without disease
Test is positive   a                       b
Test is negative   c                       d
Positive predictive value = a / (a+b)
Negative predictive value = d / (c+d)
Post-test probability of disease given positive test = a / (a+b)
Post-test probability of disease given negative test = c / (c+d)
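Unlike sensitivity and specificity, predictive values are computed across a row of the 2x2 table and therefore depend on how common the disease is. The sketch below first applies the formulas to hypothetical counts (a, b, c, d as in the table), then recomputes the positive predictive value from sensitivity, specificity and prevalence using Bayes' rule. The function names and all numbers are assumptions for illustration.

```python
def predictive_values(a, b, c, d):
    """Predictive values from the 2x2 cells (a, b: test positive; c, d: test negative)."""
    return {
        "ppv": a / (a + b),                        # disease given positive test
        "npv": d / (c + d),                        # no disease given negative test
        "post_test_given_negative": c / (c + d),   # disease given negative test
    }


def ppv_from_rates(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical counts: 100 diseased and 900 non-diseased patients.
values = predictive_values(a=90, b=30, c=10, d=870)

# Same test characteristics, two different prevalences.
ppv_at_10_percent = ppv_from_rates(0.9, 870 / 900, prevalence=0.10)
ppv_at_2_percent = ppv_from_rates(0.9, 870 / 900, prevalence=0.02)
print(values, ppv_at_10_percent, ppv_at_2_percent)
```

At the 10% prevalence implied by the counts, the Bayes calculation agrees with a / (a+b); at a lower prevalence the same test yields a lower positive predictive value.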
Good Resource: Sensitivity, Specificity, Predictive Value
An Introduction to Information Mastery
http://www.poems.msu.edu/InfoMastery/default.htm
Diagnosis
Sensitivity and specificity
Predictive values
Likelihood ratios
InfoRetriever
Calculators: Epidemiology, Diagnostic test
Bias in Clinical Trials

Areas in which bias can occur
Systematic error in . . .
Allocation
Response
Assessment

Allocation or Susceptibility Bias
Can occur when patient assignments to a trial
group are influenced by an investigator's
knowledge of the treatment to be received.
Can result in
treatment groups
that have different
prognoses.

Allocation or Susceptibility Bias
Treatment groups must have similar
prognoses, which is achieved by:
Randomization of patients
Prospective evaluation of patients
Well-defined inclusion and exclusion criteria
Randomization in Clinical Trials

Occurs when patients
are assigned to
treatments by means
of a mechanism that
prevents both the
patients and the
investigator from
knowing which
treatment is being
assigned.
Benefits of Randomization
Prevents the systematic introduction of bias.
Minimizes the possibility of allocation bias.
Balances prognostic factors for treatment groups.
Improves the validity of statistical tests used to
compare treatments.

Response & Assessment/Recording Bias
Can occur when a patient reports a treatment
response or when an investigator assesses that
response; either person can be influenced by
knowing the treatment.
A patient or an investigator may have a
preconceived idea of which treatment is better.
The patient may also want to please the
investigator.

Blinding
To minimize Response & Assessment/Recording Bias
Single Blind (patient blinded): protects against
response bias.
Double Blind (patient and investigator blinded):
protects against assessment/recording bias as well
as response bias.

Transfer bias
Occurs when patients are lost to follow-up.
Must be minimized.
Performance bias
Can occur with a single surgeon or with
multiple surgeons.
Confounding Example
Relationship between coffee and
pancreatic cancer, BUT
Smoking is a known risk factor for
pancreatic cancer
Smoking is associated with coffee
drinking but it is not a result of coffee
drinking.
What is confounding?
If an association is observed between
coffee drinking and pancreatic cancer, then either
Coffee actually causes pancreatic cancer,
or
The coffee drinking and pancreatic cancer
association is the result of confounding by
cigarette smoking.
How to handle confounding

If you know something is a possible
confounder, in the data analysis use
Stratification, or
Adjustment
Fear the unknown!
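Stratification can be sketched with the coffee, smoking and pancreatic cancer example. The counts below are entirely hypothetical and were constructed so that coffee has no effect within either smoking stratum; the Mantel-Haenszel estimator is used here as one common way to pool strata and is not named in the original slides.

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: exposed non-cases, c: unexposed cases, d: unexposed non-cases."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata, each given as a tuple (a, b, c, d)."""
    strata = list(strata)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts per smoking stratum:
# (coffee & cancer, coffee & no cancer, no coffee & cancer, no coffee & no cancer)
strata = {
    "smokers": (40, 160, 10, 40),
    "non-smokers": (5, 95, 10, 190),
}

# Crude analysis: ignore smoking and add the strata together cell by cell.
crude = odds_ratio(*[sum(cell) for cell in zip(*strata.values())])

# Stratified analysis: compare coffee drinkers with non-drinkers within each stratum.
adjusted = mantel_haenszel_or(strata.values())

print(round(crude, 2), round(adjusted, 2))
```

In this constructed example the crude odds ratio suggests that coffee doubles the odds of cancer, while the stratified estimate shows no association once smokers are compared with smokers and non-smokers with non-smokers. This only works for confounders that were measured, which is the point of the warning about the unknown.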
Study Design Taxonomy

Treatment vs. Observational
Prospective vs. Retrospective
Longitudinal vs. Cross-sectional
Randomized vs. Non-Randomized
Blinded/Masked or Not
Single-blind, Double blind, Unblinded
Randomization: Definition
Random Allocation
known chance of receiving a treatment
cannot predict the treatment to be given
Eliminate Selection Bias

Similar Treatment Groups
ONE Factor is Different

Randomization tries to ensure that ONE
factor is different between two or more
groups.
Observe the Consequences
Attribute Causality
Types of Randomization
Standard ways:
Random number tables (see text)
Computer programs
NOT legitimate
Birth date
Last digit of the medical record number
Odd/even room number
Types of Randomization
Simple
Blocked Randomization
Stratified Randomization
Simple Randomization
Randomize each patient to a treatment
with a known probability
Corresponds to flipping a coin
Could have imbalance in # / group or
trends in group assignment
Could have different distributions of a
trait like gender in the two arms
Block Randomization
Ensure the # of patients assigned to
each treatment is not far out of balance
Variable block size
An additional layer of blindness
Different distributions of a trait like
gender in the two arms possible
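Block randomization with a variable block size can be sketched as follows. This is a simplified illustration, assuming two treatments labelled "A" and "B" and block sizes that are multiples of the number of treatments; the function name and defaults are hypothetical, and a real trial would generate and conceal the list through a dedicated allocation system.

```python
import random


def block_randomize(n_patients, treatments=("A", "B"), block_sizes=(2, 4), seed=None):
    """Return a treatment sequence built from randomly ordered, balanced blocks."""
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_patients:
        size = rng.choice(block_sizes)  # variable block size makes the next arm harder to guess
        block = list(treatments) * (size // len(treatments))  # equal numbers of each arm
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_patients]  # the last block may be cut short


schedule = block_randomize(20, seed=1)
print(schedule, schedule.count("A"), schedule.count("B"))
```

Because every complete block contains each treatment equally often, the group sizes can never drift apart by more than half of the largest block.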
Stratified Randomization
A priori, certain factors are likely important
(e.g. Age, Gender)
Randomize so different levels of the
factor are balanced between treatment
groups
Cannot evaluate the stratification
variable
For each subgroup or stratum, perform a
separate block randomization
Common strata
Clinical center, Age, Gender
Stratification MUST be taken into
account in the data analysis!
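The idea of running a separate block randomization within each stratum can be sketched like this. The function name, the fixed block size of 4 and the example strata are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def stratified_block_randomize(patients, treatments=("A", "B"), block_size=4, seed=None):
    """patients: iterable of (patient_id, stratum) in order of enrollment.

    Each stratum draws from its own sequence of shuffled, balanced blocks.
    """
    rng = random.Random(seed)
    pending = defaultdict(list)  # unused assignments of the current block, per stratum
    allocation = {}
    for patient_id, stratum in patients:
        if not pending[stratum]:
            # Start a new balanced block for this stratum only.
            block = list(treatments) * (block_size // len(treatments))
            rng.shuffle(block)
            pending[stratum] = block
        allocation[patient_id] = pending[stratum].pop()
    return allocation


# Hypothetical enrollment list: 16 patients, stratified by gender.
patients = [(i, "female" if i % 2 else "male") for i in range(16)]
allocation = stratified_block_randomize(patients, seed=3)

for stratum in ("female", "male"):
    arms = [allocation[pid] for pid, s in patients if s == stratum]
    print(stratum, arms.count("A"), arms.count("B"))
```

With eight patients per stratum and blocks of four, each stratum ends up with exactly four patients per arm, so the stratification factor is balanced between the treatment groups.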
Outline
Introductory Statistical Definitions

What is Randomization?
Randomized Study Design
Experimental vs. Observational
Non-Randomized Study Design
Stat Software, Books, Articles
Types of Randomized Studies
Parallel Group
Sequential Trials
Group Sequential trials
Cross-over
Factorial Designs
Parallel Group
Randomize patients to one of k
treatments
Response
Measure at end of study

Delta or % change from baseline
Repeated measures
Function of multiple measures
Ideal Study - Gold Standard

Double blind
Randomized
Parallel groups
Two Scenarios
Study 1
A U.S. study (2000) compared 469 patients with brain cancer
to 422 patients who did not have brain cancer. The patients'
cell phone use was measured using a questionnaire. The two
groups' use of cell phones was similar.
Study 2
An Australian study (1997) used 200
transgenic mice. One hundred were exposed for two 30
minute periods a day to the same kind of microwaves with
roughly the same power as the kind transmitted from a cell
phone. The other 100 mice were not exposed. After 18
months, the brain tumor rate for the exposed mice was twice
as high as that for the unexposed mice.
Questions to Consider
How do the two studies differ?
Study 1
Study 2
Why do the results of different medical
studies sometimes disagree?
Could the second study be performed
on human beings?
Suppose a friend recently diagnosed with
brain cancer was a frequent cell phone
user. Is this strong evidence that frequent
cell phone use increases the likelihood of
getting brain cancer?
Informal observations of this type are called
_____________ _____________.
You should rely on reputable research studies,
not anecdotes.
Two Main Ways to Gather Data
Observational Study
The researcher observes values of the response and
explanatory variables for the sampled subjects without
imposing any treatments
Example:
Experiment
The researcher assigns experimental conditions (also
called treatments) to subjects (also called experimental
units) and then observes outcomes on the response
variable.
Treatments correspond to values of the explanatory
variable
Example:
Types of Observational Studies
Retrospective
Observational studies that look back in time
This is sometimes done to find risk factors for certain
diseases
Cross-Sectional
Observational studies that take a cross section of
the population at the current time
Prospective
Observational studies in which subjects are
followed into the future
Advantages of Experiments over Observational Studies
In an observational study, there can always be
lurking variables affecting the results.
This means that observational studies can
_________ show causation.
It is easier to adjust for lurking variables in an
experiment.
In general, we can study the effect of an explanatory
variable on a response variable more accurately
with an experiment than with an observational study.
Disadvantages of Experiments
They can be ____________ to perform on the
subjects in which you are interested.
It can be difficult to monitor subjects to ensure that
they are doing what they are told.
They can take many years, even decades, to
complete.
Results of experiments that use animals do not
______________ to humans.
They are unnecessary if the question of interest does
not involve trying to assess _____________.
Sampling Designs for Observational Studies
Simple Random Sampling (SRS)
A simple random sample of n subjects from a population is
one in which each possible sample of that size has the
_______ chance of being selected.
Sampling Designs for Observational Studies
Stratified Sampling
A stratified random sample divides the population into
separate groups, called ________, and then selects an SRS
from each stratum.

Cluster Sampling
A cluster random sample can be used if the target population
naturally divides into groups, each of which is representative of the
entire target population. In this method, an SRS of groups (or clusters)
is taken. Every member of the selected groups is put into the
sample.

Systematic Sampling
A systematic sample selects every kth
person from the sampling frame. The
researcher randomly selects a number
between 1 and k in order to know which
person to select first, then selects every kth
person after this.
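The four sampling designs described on these slides can be sketched with Python's standard `random` module. The function names, the sampling frame of 100 numbered subjects and the way it is split into strata and clusters are all hypothetical choices made for illustration.

```python
import random


def simple_random_sample(frame, n, rng):
    """SRS: every possible sample of size n is equally likely."""
    return rng.sample(frame, n)


def stratified_sample(strata, n_per_stratum, rng):
    """strata: dict of stratum name -> list of subjects; an SRS is drawn from each stratum."""
    return {name: rng.sample(members, n_per_stratum) for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """clusters: dict of cluster name -> list of subjects; whole clusters are selected."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]


def systematic_sample(frame, k, rng):
    """Pick a random start between 1 and k, then take every kth subject after it."""
    start = rng.randint(1, k)
    return frame[start - 1::k]


rng = random.Random(0)
frame = list(range(1, 101))  # hypothetical sampling frame of 100 subjects

srs = simple_random_sample(frame, 10, rng)
strat = stratified_sample({"low": frame[:50], "high": frame[50:]}, 5, rng)
wards = {f"ward{i}": frame[i * 10:(i + 1) * 10] for i in range(10)}
clus = cluster_sample(wards, 2, rng)
syst = systematic_sample(frame, 10, rng)
print(srs, strat, clus, syst, sep="\n")
```

Note that the stratified design samples within every group, whereas the cluster design samples the groups themselves and then keeps all of their members.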
Advantages of the Various Sampling Designs
Simple Random Sampling (SRS)
It is the easiest and most widespread form of sampling.
Each subject has an _______ chance to be in the
sample.
The sample enables us to determine how likely it is
that descriptive statistics (like the sample mean)
fall close to corresponding values for which we
would like to make inference (like the population
mean).
Advantages of the Various Sampling Designs
Stratified Sampling
It ensures that there are enough _________ in
each group that you want to compare.
Cluster Sampling
It does not require a sampling frame of subjects.
It is less ___________ to implement.
Bias in Sampling
A sampling method is _________ if
The sample tends to favor some parts of the
population over others.
In other words, the results from the sample are not
representative of the population.
Obviously, __________ samples are our goal.
Types of Bias
Undercoverage
Occurs when a sampling frame leaves out some groups in the
population
Nonresponse bias
Occurs when some sampled subjects cannot be reached, refuse to
participate or fail to answer some questions
Response bias
Occurs when the subject gives an incorrect response or when the
question wording or the way the interviewer asks the questions is
confusing or misleading
Examples of Poor Samples that Result in Bias
Convenience Samples
Voluntary Response Samples
Elements of a Good Experiment
Control group
Gives us something to compare against
Enables us to control the __________ _______
The placebo effect occurs when patients seem to improve
regardless of the treatment they receive.
Randomization
Eliminates ______ that can result when researchers assign
treatments to the subjects
Balances the group on variables that you know affect the
response
Balances the group on _________ variables that may be
unknown to you
Elements of a Good Experiment
Blinding
Increases reliability of the results
_________-blind: subjects do not know the
treatment assignment
_________-blind: neither the subjects nor those
in contact with the subjects know the treatment
assignment
Example
A pharmaceutical company has developed a new drug for treating
high blood pressure. To determine the effectiveness of the drug,
the company conducted an experiment in which subjects with a
history of high blood pressure were treated with the new drug.
A later experiment randomly divided subjects with a history of
high blood pressure into two groups. Group A was treated with
the new drug as before. Group B received the most popular drug
on the market at that time. The subjects were unaware of which
treatment they received. 60% of the patients in Group A
improved, while 63% of the patients in Group B improved.
The __________ experiment is better because
Example
To investigate whether antidepressants help smokers to quit smoking,
one study used 429 men and women who were 18 or older and had
smoked 15 cigarettes or more per day in the previous year. They were
all highly motivated to quit and in good health. They were assigned to
one of two groups: one group took an antidepressant called Zyban,
while the other group did not take anything. At the end of a year, the
study observed whether each subject had successfully abstained from
smoking.
Logic Behind Randomized Comparative Experiments
Randomization ensures that the groups of subjects
are similar in all respects before the treatments are
applied.
Using a control group for comparison ensures that
external influences operate equally on both groups.
If the groups are large enough, natural differences in
subjects will average out.
This means that there will be little difference in the results
for the groups unless the treatments themselves
actually cause the difference.
Did You Know?
Observational studies can also have control groups.
These are called ______-________ studies.
The cases are people who have a certain disease or
condition, and the controls are people who do not have the
disease.
Their purpose is to see if one of the explanatory variables is
related to the disease.
_________ from the beginning of these notes is an example
of a case-control study.
Important Points
Types of studies:
Observational studies and experiments
Experiments control for lurking variables

Sampling designs:
SRS, stratified random samples and cluster samples
SRS is the preferred method

Potential sources of bias:
Undercoverage
Response bias
Nonresponse bias
Convenience sampling
Voluntary response sampling

Elements of good experiments:
Control group, randomization and blinding
Important Points
If a group is underrepresented in the sample, we cannot
make inference about it.
We must be careful when interpreting the results of
observational studies.
For comparison of several treatments to be valid, you must
apply all treatments to similar groups of experimental units.
Interesting questions are usually pretty tough to answer. This
is due in part to the fact that no single experiment or
observational study can determine causation.
Stop and Think!!!

Write the study!
Describe & classify the
variables.
Instruments for measurement?
Bias?
Prepare to analyze data!
