Lec 1 DOE Intro LZ

Introduction to Design and Analysis of Experiments
Dr. Li Zou
Department of Statistics and Biostatistics

CSUEB
Jan 21, 2020
A classic experiment from Ronald Fisher (1935): Lady tasting tea
(Design) A lady declares that by tasting a cup of tea made with milk she
can discriminate whether the milk or the tea infusion was first added to
the cup. We will consider the problem of designing an experiment by
means of which this assertion can be tested. [. . . ] [It] consists in mixing
eight cups of tea, four in one way and four in the other, and presenting
them to the subject for judgment in a random order. The subject has
been told in advance of what the test will consist, namely, that she will be
asked to taste eight cups, that these shall be four of each kind [. . . ].
(Fisher, 1935)
The lady in question eventually judged six of the eight cups correctly.
How do we analyze these data? Before doing so, let's summarize the results
in the following contingency table.
                     Guessed milk first   Guessed tea first
Milk poured first            3                    1
Tea poured first             1                    3
(Analysis) Since the lady knew in advance that there would be four cups of each kind,
her judgments on the cups are not independent, so we cannot use a binomial test. Given the fixed
margins, we can use Fisher's exact test instead. Under Fisher's
approach, we only need to focus on the count X in the first cell, which has
a hypergeometric distribution under the null hypothesis, and ask how extreme the observed value a is, e.g., P(X ≥ a).
data = rbind(c(3, 1), c(1, 3))
fisher.test(data, alternative = "greater")$p.value
Thus, if the lady is not able to discriminate whether tea or milk was poured
first (the null model), the chance of observing a result at least as favorable
to her claim would be about 24% (the p-value). If we use 5% as an
informal threshold for rejecting the null hypothesis, the test is not significant.
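As a check on the 24% figure, the same tail probability can be computed directly from the hypergeometric distribution. The following is a minimal sketch in R; the variable names are illustrative and not part of the original slides.

```r
# 4 milk-first cups, 4 tea-first cups; the lady labels 4 cups as milk-first.
# X = number of milk-first cups she labels correctly (the first cell).
n_milk <- 4
n_tea <- 4
n_picked <- 4

# P(X >= 3) under the null hypothesis of pure guessing
p_hyper <- sum(dhyper(3:4, m = n_milk, n = n_tea, k = n_picked))

# The same tail probability from the one-sided Fisher's exact test
tab <- rbind(c(3, 1), c(1, 3))
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

c(hypergeometric = p_hyper, fisher = p_fisher)  # both equal 17/70
```

Here 17/70 comes from 16 ways of getting exactly three right plus 1 way of getting all four right, out of choose(8, 4) = 70 possible selections.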
An interesting take-home experiment
Some people, including me, claim that they can tell the difference between
Coke and Pepsi. How would you conduct an experiment to test such a
claim?
Remark: Besides working on homework, the best way to study ANOVA is
doing it by hand.
Examples of Observational Studies
In an observational study, investigators observe subjects and measure
variables of interest without assigning treatments to the subjects. The
treatment that each subject receives is determined beyond the control of
investigators.
Remark: Most data analyses you have performed so far perhaps come
from observational studies.
Effects of 2,4-D: a case-control study
A study in 1994 examined the risk of cancer in dogs exposed to
the herbicide 2,4-dichlorophenoxyacetic acid (2,4-D). The study involved
491 dogs that had developed cancer and 945 dogs without cancer as controls. For these
two groups, researchers identified which dogs had been exposed to 2,4-D
in their owners' yards (no control over the assignment of the 2,4-D treatment).
                     Cancer   No Cancer
Treatment: 2,4-D       191       304
Control: no 2,4-D      300       641
Remark: Review and show the procedure of modeling the data.
This is an example of an observational study.
Investigators collected the data by searching databases at animal
hospitals. Thus, they had no control over assigning the treatment,
the herbicide 2,4-D, to the experimental units, i.e., the dogs.
The crude cancer rates in the two groups are 0.39 (191/495) for the treatment
group and 0.32 (300/941) for the control group. A quick
one-sided Fisher's exact test can be used to test the
equality of the two proportions.
data = rbind(c(191, 304), c(300, 641))
fisher.test(data, alternative = "greater")$p.value
A p-value of 0.00659 (< 0.05) indicates that the test rejects the null
hypothesis that the two proportions are equal in favor of the alternative
hypothesis that the cancer rate in the treatment group is higher than
that in the control group. Thus, the data support a significant
association between exposure (2,4-D) and disease (cancer).
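The crude rates can be reproduced from the table with a few lines of R. This is a hedged sketch: the row and column labels and variable names are illustrative, and the sample odds ratio is an addition not mentioned on the slide (it is the usual summary measure for a case-control table).

```r
# 2 x 2 table from the case-control study (rows: exposure, columns: outcome)
tab <- rbind(c(191, 304), c(300, 641))
dimnames(tab) <- list(exposure = c("2,4-D", "no 2,4-D"),
                      outcome = c("Cancer", "No cancer"))

# Crude cancer rate within each exposure group
rates <- tab[, "Cancer"] / rowSums(tab)
round(rates, 2)

# Sample odds ratio: (a * d) / (b * c)
odds_ratio <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
odds_ratio
```

An odds ratio above 1 points in the same direction as the one-sided test: exposure is associated with higher odds of cancer in this sample.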
Here are some limitations of observational studies.
1. Since researchers do not control the assignment of treatments to
experimental units, the sample may be biased. For example, most of
the dogs with cancer may also have come from air-polluted areas (a dissimilarity
between the treatment and control groups).
2. Hidden confounders are another problem in observational studies: a
confounder (also called a confounding variable) is a variable that influences
both the dependent variable and the independent variable, causing a
spurious association. E.g., when people study the association between
obesity and cardiovascular disease, age is a common confounder to be
considered.
3. Even after adjusting for a number of confounding variables, investigators
can only conclude an association between treatments and diseases.
Remark: Randomized controlled experiments can avoid the above three
limitations and support causal conclusions.
Even so, observational studies are a powerful tool. Probably the most
famous example is the discovery of the
strong association between smoking and diseases (e.g., lung cancer). It is
unethical to make healthy people smoke cigarettes in an experiment because of the
carcinogens in cigarettes. Nobody is going to smoke for twenty years just
to please a statistician. Thus these kinds of studies are necessarily
observational.
What does an experiment usually mean?
Researchers use experiments to answer questions.
1. Justice officials consider three methods (the assailant could be
warned, sent to counseling, or arrested for assault) to reduce or delay
the recurrence of spousal assault. Which of these actions works best?
2. A new drug is introduced. Is the new drug a safe, effective cure for a
disease?
3. Which combination of gum type and protein source provides
the highest sensory rating of the ice cream?
Research Questions → Design of Experiments → Data collection → Statistical inferences
An experiment applies treatments to experimental units using a chance
mechanism and measures responses. Investigators use the responses to
learn about the treatments. In this book, we focus on comparative
experiments, which compare different treatments.
We want to learn the effects of treatments (e.g., in the first question above, the
treatments are the three actions used by the police).
We compare treatments by using them on experimental units (e.g., the
experimental units are individuals who assault their spouses).
Responses are measured and tell us how the treatments worked (e.g., the
response could be the length of time until recurrence of assault).
Experimenters assign treatments to experimental units.
What makes an experiment special is control. The experimenter gets to
control the assignment of treatments to units. (Control has a twofold
meaning in DOE: 1. The assignment of treatments to experimental units is
controlled; 2. A control treatment is one that is used as the basis of
comparison for other treatments.)
In contrast, an observational study has the same triple of treatments,
experimental units, and responses, but the investigator merely observes the
assignment of treatments to units. Examples: the case-control study of 2,4-D, and
studies of smoking and lung cancer.
An example of using controls
Infections following surgery are a serious concern that can have a major
impact on a patient’s road to recovery. One approach to counter infection
is to kill surgical pathogens by oxidation. In one study (Greif et al.
(2000)), researchers randomly assigned 250 patients to receive 30%
inspired oxygen and 250 patients to receive 80% inspired oxygen. All
patients were undergoing surgery for colorectal resection. Of the patients
receiving 30% inspired oxygen, 28 had a surgical wound infection compared
with 13 patients who received the 80% inspired oxygen treatment.
Q1: Can we conclude that 80% inspired oxygen is effective in reducing infection rates?
Q2: Can we conclude that 30% inspired oxygen is effective in reducing infection rates?
Q1: Yes. We can compare the treatment group (80%) with the control
group (30%) to see the effect. In R, we can do
prop.test(c(28, 13), c(250, 250))
Q2: No. There was no control group in this study, i.e., no group that received no
inspired oxygen, so we do not know whether 30% inspired oxygen is
better than no treatment at all.
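To make the Q1 comparison concrete, the pieces of the test result can be pulled out individually. This is a sketch using the counts stated above; the variable names are illustrative, and prop.test applies its default continuity correction.

```r
# Infections and group sizes: 30% oxygen group first, 80% oxygen group second
infections <- c(28, 13)
patients <- c(250, 250)

res <- prop.test(infections, patients)

res$estimate   # sample infection proportions in the two groups
res$p.value    # two-sided p-value for equality of the two proportions
res$conf.int   # confidence interval for the difference in proportions
```

A small p-value here supports a difference between the 30% and 80% groups only; it says nothing about either group relative to no inspired oxygen.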
Causal inferences
Designed experiments take much more effort than observational studies. What do we gain from them?
Most importantly, since we are in control of the experiment, we can
make stronger inferences about the nature of the differences that we see in the
experiment. Specifically, experiments allow causal inferences.
In an observational study, however, we merely observe which units are in
which treatment groups; we do not get to control that assignment.
In summary, an experiment is characterized by the treatments and
experimental units to be used, the way treatments are assigned to units,
and the responses that are measured. The feature that distinguishes an
experiment from an observational study is that experiments allow
causal inferences.
Considerations of a good design
A good experimental design must
Avoid systematic error;
Be precise;
Allow estimation of error;
Have broad validity.
Remark: These considerations also apply to observational studies.
Avoid systematic error
Comparative experiments estimate differences in responses between
treatments. Thus, changes in response should be attributable mainly to the
differences in treatments.
Response = Treatment effect + Random error.
For example, a company is evaluating two different word processing
packages (A and B) for use by its clerical staff. The goal is to see how
quickly a test document can be entered correctly using each program.
Suppose that 20 test secretaries each entered the document twice, once with each
program, always using A first and B second.
Under this design, does the difference in entry times
reflect only the difference between programs A and B?
Solution: In this design we do not know whether any observed differences are due
to treatment (program) effects or to the fact that the secretary is already
familiar with the document and thus enters it faster the second time.
What can you do to improve the design?
In Chapter 2, randomization will be discussed as our main tool to combat
systematic error.
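One possible improvement, offered here only as a sketch and not as the slide's official answer, is to randomize which program each secretary uses first, so that the practice effect is not tied to one program. The seed, labels, and balanced 10/10 split are assumptions made for illustration.

```r
set.seed(2020)  # arbitrary seed, used only so the sketch is reproducible

n_secretaries <- 20

# Half of the secretaries get the order A then B, half get B then A;
# sample() shuffles which secretary receives which order.
orders <- sample(rep(c("A then B", "B then A"), each = n_secretaries / 2))

design <- data.frame(secretary = seq_len(n_secretaries), order = orders)

table(design$order)  # 10 secretaries per order
head(design)
```

With this assignment, any speed-up from seeing the document a second time affects both programs about equally instead of always favoring B.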
Be precise
Experiments are precise when the random error in the treatment comparisons
is small.
One possible way to achieve this is replication. By replication we mean an
independent repeat of each combination of treatments. For example,
consider inference about a population mean based on a one-sample t-test with n
i.i.d. observations $X_1, \ldots, X_n$. The $(1 - \alpha)100\%$ confidence interval for
the population mean is given by
$$\bar{X} \pm t_{\alpha/2,\, n-1} \frac{\hat{s}}{\sqrt{n}},$$
where $\bar{X} = \sum_{i=1}^{n} X_i / n$ and $\hat{s}^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 / (n - 1)$.
By increasing the sample size n (the number of replicates), one can reduce the random
error $\hat{s}/\sqrt{n}$. Other techniques that reduce the random error, such as blocking and analysis of covariance,
will also be discussed.
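The effect of n on the interval can be seen numerically. The sketch below computes the half-width of the t-based interval for a fixed sample standard deviation; the function name, the choice s = 1, and the grid of sample sizes are assumptions for illustration.

```r
# Half-width of the (1 - alpha) t-based confidence interval for a mean,
# given the sample standard deviation s and the sample size n
ci_half_width <- function(s, n, alpha = 0.05) {
  qt(1 - alpha / 2, df = n - 1) * s / sqrt(n)
}

ns <- c(5, 10, 20, 40, 80)
half_widths <- sapply(ns, function(n) ci_half_width(s = 1, n = n))

# The half-width shrinks as the number of replicates grows
data.frame(n = ns, half_width = round(half_widths, 3))
```

Both the critical value and the factor 1/sqrt(n) decrease with n, so quadrupling the sample size slightly more than halves the interval width.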
Allow estimation of error (e.g., replications)
Experiments must be designed so that we have an estimate of the size of the
random error, e.g., through replication. This permits statistical inference.
For example, in the one-sample t-based confidence
interval considered earlier, the standard error of $\bar{X}$ has the form $\hat{s}/\sqrt{n}$, where
$\hat{s}^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 / (n - 1)$. If we had only n = 1 replicate, we
would be unable to make statistical inferences about the population mean.
Why? With n = 1 the denominator n − 1 is zero, so $\hat{s}^2$ is undefined, and the observed estimate could
simply be the result of experimental error.
The point is that without replication we have no way of quantifying the
random error in our estimates, and hence no basis for statistical inference. This
has serious consequences, as we will see in more complicated designs.
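R makes the n = 1 problem visible directly. In this sketch the measurement values are made up purely for illustration.

```r
x_single <- c(5.2)            # hypothetical single measurement (n = 1)
x_rep <- c(5.2, 4.8, 5.5)     # hypothetical replicated measurements (n = 3)

sd_single <- sd(x_single)     # NA: the sample variance divides by n - 1 = 0
sd_rep <- sd(x_rep)

# The standard error of the mean is available only with replication
se_rep <- sd_rep / sqrt(length(x_rep))

c(sd_single = sd_single, sd_rep = sd_rep, se_rep = se_rep)
```

With a single observation there is a point estimate but no standard error, so no confidence interval or test can be formed.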
Have broad validity
Our experimental units should reflect the population about which we wish
to draw inference. If the units are actually a statistical sample from some
population of units, then the conclusions are also valid for the population.
Beyond this, we are extrapolating, and the extrapolation might or might
not be successful.
Some concepts
Treatments: The different procedures we want to compare.
Experimental units: The things to which treatments are applied.
Responses: Outcomes we measure to judge what the treatments do.
Randomization: The use of a known probabilistic mechanism for the
assignment of treatments to units. It is also used in other aspects of an
experiment, e.g., the order in which units are evaluated for their responses.
Experimental (random) error: The random variation present in all
experimental results, e.g., natural differences among the experimental units,
variation in the measurement devices, and the effects of all extraneous factors.
Replication: Once a treatment is assigned to an experimental unit, a
single replication of the treatment has occurred. In general, we will
randomly assign several experimental units to each treatment.
Control treatment: A baseline treatment. Usually, it is hard to judge the
effect of a treatment without comparing it to something else. See
examples in the book.
Blinding (double blinding): Experimental units do not know whether they
are in the treatment or the control group; neither do those who evaluate the
responses. This guards against bias, either in the responses or in the
evaluations.
Factors: Aspects that are combined to form treatments. Recall
the ice cream example: the sensory rating depends on a combination of gum type
and protein source, but we can vary the gum type and protein source separately.
In this case, we say gum type and protein source are factors of the experiment.
The individual settings of each factor are called the levels of the factor.
Measurement units: The actual objects on which the response is
measured. These may differ from the experimental units.
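The relationship between factors, levels, and treatments can be sketched for the ice cream example. The slides do not say how many gum types or protein sources there are, so the level names and counts below are hypothetical placeholders.

```r
# Hypothetical levels; the real experiment's levels are not given here
gum_type <- c("gum 1", "gum 2", "gum 3")
protein_source <- c("protein 1", "protein 2")

# Each treatment is one combination of a gum type and a protein source
treatments <- expand.grid(gum_type = gum_type,
                          protein_source = protein_source)

treatments
nrow(treatments)  # 3 levels x 2 levels = 6 treatments
```

Two factors with 3 and 2 levels give 6 treatment combinations, one per row of the data frame.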
A quick test
Consider the following experiment. Four types of protective coatings for
frying pans are to be evaluated. Five frying pans are randomly assigned to
each of the four coatings. The abrasion resistance of the
coating is measured at three locations on each of the 20 pans. Identify the
following items for this study: treatments, replications, experimental unit,
measurement unit, and total number of measurements.
Treatments: Four types of protective coatings.
Replication: There are five frying pans (replications) for each treatment.
Experimental unit: The frying pan, because coatings (treatments) are
randomly assigned to the frying pans.
Measurement unit: A particular location on the frying pan.
Total number of measurements: 4 × 5 × 3 = 60 measurements in this
experiment.
Later on, you will recognize this experimental design as a
completely randomized design.
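The layout of this design can be sketched in R to show how pans and locations relate. The coating labels, seed, and column names are assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, used only so the sketch is reproducible

coatings <- c("C1", "C2", "C3", "C4")  # hypothetical coating labels

# Experimental units: 20 pans, 5 assigned at random to each coating
pans <- data.frame(pan = 1:20,
                   coating = sample(rep(coatings, each = 5)))

# Measurement units: 3 locations on each pan
measurements <- pans[rep(seq_len(nrow(pans)), each = 3), ]
measurements$location <- rep(1:3, times = nrow(pans))
rownames(measurements) <- NULL

table(pans$coating)   # 5 pans per coating
nrow(measurements)    # 4 * 5 * 3 = 60 measurements
```

The randomization happens at the pan level only; the three locations on a pan share that pan's coating, which is why the pan and not the location is the experimental unit.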