
Framing and testing hypotheses

Hypotheses
Potential explanations that can account for
our observations of the external world
They usually describe cause and effect
relationships

Collecting observations is a means to the understanding of a cause

Observations from
Manipulative experiments
Observational or correlative studies

Hypothesis
Suggested by the
data
Existing body of
scientific literature
Predictions of
theoretical models
Our own intuition and
reasoning

A valid scientific hypothesis


Must be testable
Should generate novel predictions
Should provide a unique set of predictions
that do not emerge from other
explanations

Scientific method
Is the technique used
to decide among
hypotheses on the
basis of observations
and predictions

Deduction and induction


Deduction proceeds from the general case
to the specific case: certain inference
Induction proceeds from the specific case
to the general case: probable inference
Both induction and deduction are used in all models of
scientific reasoning, but they receive different emphasis

Statistics
It is an inductive process: we are trying to
draw general conclusions based on a
specific, limited sample

The inductive method


[Flowchart] An initial observation suggests a hypothesis, which generates a prediction.
Experiments and data provide new observations.
Do the new observations match the predictions?
NO: modify the hypothesis and test again.
YES: confirm the hypothesis, which becomes the "accepted truth".

Advantages of the inductive method
It emphasizes the link between data and
theory
Explicitly builds and modifies the
hypothesis based on previous knowledge
It is confirmatory (we seek data that
support the hypothesis)

Disadvantages of the inductive method
Considers only a single starting hypothesis
Derives theory exclusively from empirical
observations; some important hypotheses
have emerged well in advance of the critical
data that are needed to test them
Places emphasis on a single correct
hypothesis, making it difficult to evaluate
cases in which multiple factors are at work.

The null hypothesis


Is the starting point of a scientific
investigation
It tries to account for patterns in the data
in the simplest way possible, which often
means initially attributing variation in the
data to randomness or measurement error

How do we generate an
appropriate null hypothesis?
Example:
The photosynthetic
response of leaves to
increases in light
intensity

Each point represents a different leaf for which we record the light intensity (x axis, predictor variable) and the photosynthetic rate (y axis, response variable)

The simplest null hypothesis is that there is no relationship between the two variables

The Michaelis-Menten equation


Y = kX / (D + X)

Notice that if X is large compared to D, X/(D + X) approaches 1. Therefore, Y approaches k, the maximum rate of product formation (Ymax).
When X equals D, X/(D + X) equals 0.5. In this case, the rate of product formation is half of the maximum rate (k/2). By plotting Y against X, one can estimate Ymax (k) and D.
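The two limiting cases described above can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name and the values chosen for k and D are hypothetical and only illustrate the shape of the curve.

```python
def michaelis_menten(x, k, d):
    """Return Y = k * X / (D + X)."""
    return k * x / (d + x)


# Hypothetical parameter values, chosen only for illustration.
k = 20.0   # asymptotic (maximum) rate
d = 150.0  # half-saturation constant

# When X equals D, the rate is half of the maximum.
half_rate = michaelis_menten(d, k, d)

# When X is much larger than D, the rate approaches (but never reaches) k.
near_max_rate = michaelis_menten(1000 * d, k, d)

print(half_rate, near_max_rate)
```

The curve rises steeply when X is small relative to D and flattens as X grows, which is why it is a plausible description of a saturating response.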

Using our knowledge about plant physiology, we can formulate a more realistic initial hypothesis:
The Michaelis-Menten equation [Y = kX/(D + X)], where k = asymptotic assimilation rate and D = half-saturation constant

Real data could be used to test the degree of support for this more realistic hypothesis against other alternatives

The Hypothetico-Deductive Method


Championed by the
philosopher of science Karl
Popper (1902-1994)
The goal of these tests is not
to confirm, but to falsify, the
hypothesis
The accepted scientific
explanation is the hypothesis
that successfully withstands
repeated attempts to falsify it

The Hypothetico-Deductive Method


[Flowchart] An initial observation suggests several alternative hypotheses, each of which generates its own prediction (Predictions A, B, C and D).
New observations are compared with each prediction.
Do the new observations match the predictions?
NO: falsify that hypothesis.
YES: repeat attempts to falsify.
After multiple failed falsifications, the surviving hypothesis becomes the "accepted truth".

Advantages of the Hypothetico-Deductive Method


It forces a consideration of multiple
working hypotheses right from the start
It highlights the key predictive differences
between them
The emphasis on falsification tends to
produce simple, testable hypotheses, so
that parsimonious explanations are
considered first and more complicated
mechanisms only later.

Disadvantages of the Hypothetico-Deductive Method


Multiple working hypotheses may not always be
available, particularly in the early stages of
investigation
Even if multiple hypotheses are available, the
method does not really work unless the correct
hypothesis is among the alternatives
Places emphasis on a single correct hypothesis,
making it difficult to evaluate cases in which
multiple factors are at work.

Testing Statistical Hypotheses


Statistical hypothesis versus Scientific
hypothesis
We use statistics to describe patterns in our
data, and then we use statistical tests to
decide whether the predictions of a
hypothesis are supported or not

The Scientific Method


Establishing hypotheses
Articulating predictions

Designing and executing valid experiments
Collecting data
Organizing data
Summarizing data

Statistical tests

Statistical hypothesis versus Scientific hypothesis
Accepting or rejecting a statistical
hypothesis is quite distinct from accepting
or rejecting a scientific hypothesis.
The statistical null hypothesis is usually
one of no pattern, such as no difference
between groups or no relationship
between two continuous variables.

Statistical hypothesis versus Scientific hypothesis (continued)
In contrast, the alternative hypothesis
is that a pattern exists.
You must ask how such patterns relate to
the scientific hypothesis you are testing
The absence of evidence is not
evidence of absence; failure to reject a
null hypothesis is not equivalent to
accepting a null hypothesis

The statistical null hypothesis


A typical statistical null hypothesis is
that differences between groups are
no greater than we would expect due
to random variation

The statistical alternative hypothesis
Once we state the statistical null
hypothesis, we then define one or more
alternatives to the null hypothesis
The alternative hypothesis is focused
simply on the pattern that is present in the
data

The investigator infers the mechanism from the pattern, but that inference is a separate step

The statistical test merely reveals whether the pattern is likely or unlikely, given that the null hypothesis is true.
Our ability to assign causal mechanisms
to those statistical patterns depends on
the quality of our experimental design and
our measurements

An important goal of a good experimental design is to avoid confounded designs

Statistical significance and P-values
In many statistical analyses, we ask
whether the null hypothesis of random
variation among individuals can be
rejected
A statistical P-value measures the
probability that the observed or more extreme
differences would be found if the null
hypothesis were true: P(data | H0)
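One way to make this definition concrete is a permutation test, which estimates P(data | H0) by shuffling group labels. The sketch below is an illustration rather than a method named in the slides; the function name, the measurements, and the choice of 5000 shuffles are all hypothetical.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Estimate a two-sided P-value for a difference in means.

    Under H0 the group labels are arbitrary, so we shuffle them many times
    and count how often the shuffled difference is at least as extreme as
    the observed one.
    """
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed:
            as_extreme += 1
    # Adding 1 to numerator and denominator counts the observed arrangement.
    return (as_extreme + 1) / (n_perm + 1)


# Hypothetical measurements, invented for illustration only.
control = [4.1, 5.0, 4.6, 5.3, 4.8, 4.4]
treated = [6.9, 7.4, 6.2, 7.8, 7.1, 6.6]

p_value = permutation_p_value(control, treated)
print(p_value)
```

A small value says that a difference this large would rarely arise from random relabeling alone; it does not say how probable the null hypothesis itself is.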

What determines the P-value?

The calculated P-value depends on three things:
1. The number of observations in the
samples (n)
2. The differences between the means of
the samples
3. The level of variation among individuals
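The effect of each of these three quantities can be sketched with a simple two-sample z-test. This is an assumption made for illustration: it supposes two groups of equal size n, a known common standard deviation, and a normal sampling distribution, none of which is specified in the slides. The function name and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_p(mean_diff, sd, n):
    """Two-sided P-value for a difference in means (z-test approximation).

    Assumes two groups of n observations each with a known common
    standard deviation sd.
    """
    standard_error = sd * sqrt(2.0 / n)
    z = mean_diff / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


baseline = two_sample_p(mean_diff=1.0, sd=2.0, n=10)

# Change one quantity at a time relative to the baseline.
more_observations = two_sample_p(mean_diff=1.0, sd=2.0, n=40)
bigger_difference = two_sample_p(mean_diff=2.0, sd=2.0, n=10)
less_variation = two_sample_p(mean_diff=1.0, sd=1.0, n=10)

print(baseline, more_observations, bigger_difference, less_variation)
```

Each change (more observations, a larger difference between means, less variation among individuals) lowers the P-value relative to the baseline.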

When is a P-value small enough?


This is a judgment call, as there is no
natural critical value below which we
should always reject the null hypothesis
and above which we should never reject it.
Convention: P < 0.05 (1 in 20)

When is a P-value small enough?


Perhaps the strongest argument in favor of requiring a low critical value is that we humans are psychologically predisposed to recognizing and seeing patterns in our data, even when they don't exist!

Decision Errors
Because we have incomplete and imperfect information,
there are four possible outcomes when testing a H0:
1. We correctly reject a false H0
2. We correctly retain a true H0
3. We mistakenly reject a true H0 (Type I Error)
4. We mistakenly retain a false H0 (Type II Error)

Decision Errors

Type I Error
If we falsely reject a null hypothesis that is true, we have
made a false claim that some factor above and beyond
random variation is causing patterns in our data.
In environmental impact assessment, this would be a false positive.
It is signified by the Greek letter α (alpha).
This error only occurs when the H0 is indeed true.
Generally, this is considered the more concerning error because it
misleads us into believing that an effect is real
when it is not. It is also called producer error.


Type II Error
This error occurs when there are systematic differences
between the groups being compared, but the investigator
has failed to reject the null hypothesis and has concluded
incorrectly that only random variation among observations
is present.
In environmental impact assessment, this would be a false negative.
It is signified by the Greek letter β (beta).
This error only occurs when the H0 is false.
A Type II error will mislead you into thinking that there is no
significant effect happening, when in actuality there is.
Depending on the experimental design, this type of error
can be just as damaging (e.g. environmental impact
surveys, medical diagnosis, etc.). It is also called consumer error.


Power
Power (1 − β) equals the probability of correctly
rejecting the null hypothesis when it is false.
Ideally, we would like to minimize both
Type I and Type II errors in our statistical
inference. However, strategies designed to
reduce Type I error inevitably increase the
risk of Type II error, and vice versa.
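The trade-off between the two error types can be illustrated by simulation. The sketch below is hypothetical throughout: it assumes normally distributed data with a known standard deviation of 1, a two-sample z-test, and arbitrary choices of sample size, effect size, and number of simulated experiments.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def rejection_rate(true_diff, n=20, alpha=0.05, n_sim=2000, seed=1):
    """Fraction of simulated experiments in which H0 is rejected.

    Each experiment draws two groups of n values (sd = 1, assumed known)
    whose true means differ by true_diff, then applies a two-sided z-test.
    """
    rng = random.Random(seed)
    standard_error = sqrt(2.0 / n)
    rejections = 0
    for _ in range(n_sim):
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        group_b = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        z = (fmean(group_b) - fmean(group_a)) / standard_error
        p = 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
        if p < alpha:
            rejections += 1
    return rejections / n_sim


# H0 true: every rejection is a Type I error, so the rate should be near alpha.
type_1_rate = rejection_rate(true_diff=0.0, alpha=0.05)

# H0 false: the rejection rate estimates power; 1 - power is the Type II rate.
power_at_05 = rejection_rate(true_diff=1.0, alpha=0.05)
power_at_01 = rejection_rate(true_diff=1.0, alpha=0.01)
type_2_rate_at_05 = 1.0 - power_at_05

print(type_1_rate, power_at_05, power_at_01, type_2_rate_at_05)
```

Because the same seed produces the same simulated data for both thresholds, tightening alpha from 0.05 to 0.01 can only remove rejections, so the estimated power cannot go up and the Type II error rate cannot go down.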

Power

Although Type I and Type II errors are inversely related to one another, there is no simple mathematical relationship between them, because the probability of a Type II error depends on:
1. The alternative hypothesis
2. How large an effect we hope to detect
3. Sample size
4. The wisdom of our experimental design and sampling protocol

The relationship between Type I and Type II errors

Estimating Power
Power ∝ (ES × α × √n) / σ

ES is the effect size we wish to detect, n is the sample size, α is the significance level, and σ is the standard deviation between sampling or experimental units
R. Lenth provides free online software to assist in a
priori power analysis for various statistical tests:
http://www.stat.uiowa.edu/~rlenth/Power/
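An a priori power analysis of the kind mentioned above can be sketched for one simple case. This is an illustration under stated assumptions, not the calculation performed by the software cited: it uses a two-sided, two-sample z-test with equal group sizes and a known standard deviation. The function names, the target power of 0.8, and the input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_test_power(effect_size, sd, n, alpha=0.05):
    """Power of a two-sided, two-sample z-test with n units per group."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size * sqrt(n / 2.0) / sd
    upper_tail = 1.0 - STD_NORMAL.cdf(z_crit - shift)
    lower_tail = STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


def smallest_n(effect_size, sd, target_power=0.8, alpha=0.05, n_max=100000):
    """Smallest per-group sample size reaching target_power, or None."""
    for n in range(2, n_max + 1):
        if z_test_power(effect_size, sd, n, alpha) >= target_power:
            return n
    return None


n_needed = smallest_n(effect_size=1.0, sd=2.0)
print(n_needed, z_test_power(1.0, 2.0, n_needed))
```

In this sketch power rises with effect size, sample size, and significance level, and falls as the standard deviation grows, which matches the direction of each term described above.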

Parameter estimation and prediction
Rather than try to test multiple
hypotheses, it may be more worthwhile to
estimate the relative contributions of each
to a particular pattern.
In such cases, rather than ask whether a
particular cause has some effect versus
no effect, we ask what is the best estimate
of the parameter that expresses the
magnitude of the effect
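As a small illustration of estimating the magnitude of an effect rather than testing for its presence, the sketch below reports a difference in means with an approximate confidence interval. The data and function name are hypothetical, and the interval uses a normal approximation as a simplifying assumption; with samples this small a t-based interval would be wider.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev


def mean_difference_ci(group_a, group_b, level=0.95):
    """Estimate mean(group_b) - mean(group_a) with an approximate interval.

    Returns (estimate, lower, upper) using a normal approximation.
    """
    estimate = fmean(group_b) - fmean(group_a)
    standard_error = sqrt(
        stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b)
    )
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate, estimate - z * standard_error, estimate + z * standard_error


# Hypothetical measurements, invented for illustration only.
control = [4.1, 5.0, 4.6, 5.3, 4.8, 4.4]
treated = [6.9, 7.4, 6.2, 7.8, 7.1, 6.6]

estimate, lower, upper = mean_difference_ci(control, treated)
print(estimate, lower, upper)
```

The interval conveys both the size of the effect and the uncertainty around it, which a bare "reject" or "fail to reject" decision does not.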
