Pearl (2018) HST 4

4
CONFOUNDING AND DECONFOUNDING:

OR, SLAYING THE LURKING VARIABLE
If our conception of causal effects had anything to do with randomized experiments, the latter would have been invented 500 years before Fisher.
-THE AUTHOR (2016)
[Figure labels: Control (King's Diet); Intervention (Vegetarian Diet)]
The biblical story of Daniel, often cited as the first controlled experiment. Daniel (third from left?) realized that a proper comparison of two diets could only be made when they were given to two groups of similar individuals, chosen in advance. King Nebuchadnezzar (rear) was impressed with the results. (Source: Drawing by Dakota Harr.)

ASHPENAZ, the overseer of King Nebuchadnezzar's court, had a major problem. In 597 BC, the king of Babylon had sacked the kingdom of Judah and brought back thousands of captives, many of them the nobility of Jerusalem. As was customary in his kingdom, Nebuchadnezzar wanted some of them to serve in his court, so he commanded Ashpenaz to seek out "children in whom was no blemish, but well favoured, and skilful in all wisdom, and cunning in knowledge, and understanding science." These lucky children were to be educated in the language and culture of Babylon so that they could serve in the administration of the empire, which stretched from the Persian Gulf to the Mediterranean Sea. As part of their education, they would get to eat royal meat and drink royal wine.

And therein lay the problem. One of his favorites, a boy named Daniel, refused to touch the food. For religious reasons, he could
THE BOOK OF WHY
not eat meat not prepared according to Jewish laws, and he asked that he and his friends be given a diet of vegetables instead. Ashpenaz would have liked to comply with the boy's wishes, but he was afraid that the king would notice: "Once he sees your frowning faces, different from the other children your age, it will cost me my head."

Daniel tried to assure Ashpenaz that the vegetarian diet would not diminish their capacity to serve the king. As befits a person "cunning in knowledge, and understanding science," he proposed an experiment. Try us for ten days, he said. Take four of us and feed us only vegetables; take another group of children and feed them the king's meat and wine. After ten days, compare the two groups. Said Daniel, "And as thou seest, deal with thy servants."

Even if you haven't read the story, you can probably guess what happened next. Daniel and his three companions prospered on the vegetarian diet. The king was so impressed with their wisdom and learning (not to mention their healthy appearance) that he gave them a favored place in his court, where "he found them ten times better than all the magicians and astrologers that were in all his realm." Later Daniel became an interpreter of the king's dreams and survived a memorable encounter in a lion's den.

Believe it or not, the biblical story of Daniel encapsulates in a profound way the conduct of experimental science today. Ashpenaz asks a question about causation: Will a vegetarian diet cause my servants to lose weight? Daniel proposes a methodology to deal with any such questions: Set up two groups of people, identical in all relevant ways. Give one group a new treatment (a diet, a drug, etc.), while the other group (called the control group) either gets the old treatment or no special treatment at all. If, after a suitable amount of time, you see a measurable difference between the two supposedly identical groups of people, then the new treatment must be the cause of the difference.

Nowadays we call this a controlled experiment. The principle is simple. To understand the causal effect of the diet, we would like to compare what happens to Daniel on one diet with what would have happened if he had stayed on the other. But we can't go back in time and rewrite history, so instead we do the next best thing: we compare a group of people who get the treatment with a group of similar people who don't. It's obvious, but nevertheless crucial, that the groups be comparable and representative of some population. If these conditions are met, then the results should be transferable to the population at large. To Daniel's credit, he seems to understand this. He isn't just asking for vegetables on his own behalf: if the trial shows the vegetarian diet is better, then all the Israelite servants should be allowed that diet in the future. That, at least, is how I interpret the phrase, "As thou seest, deal with thy servants."

Daniel also understood that it was important to compare groups. In this respect he was already more sophisticated than many people today, who choose a fad diet (for example) just because a friend went on that diet and lost weight. If you choose a diet based only on one friend's experience, you are essentially saying that you believe you are similar to your friend in all relevant details: age, heredity, home environment, previous diet, and so forth. That is a lot to assume.

Another key point of Daniel's experiment is that it was prospective: the groups were chosen in advance. By contrast, suppose that you see twenty people in an infomercial who all say they lost weight on a diet. That seems like a pretty large sample size, so some viewers might consider it convincing evidence. But that would amount to basing their decision on the experience of people who already had a good response. For all you know, for every person who lost weight, ten others just like him or her tried the diet and had no success. But of course, they weren't chosen to appear on the infomercial.

Daniel's experiment was strikingly modern in all these ways. Prospective controlled trials are still a hallmark of sound science. However, Daniel didn't think of one thing: confounding bias. Suppose that Daniel and his friends are healthier than the control group to start with. In that case, their robust appearance after ten days on the diet may have nothing to do with the diet itself; it may reflect their overall health. Maybe they would have prospered even more if they had eaten the king's meat!

Confounding bias occurs when a variable influences both who is selected for the treatment and the outcome of the experiment.
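The definition of confounding bias can be made concrete with a small simulation. The sketch below is illustrative only: the function name `simulate_diet_study`, the sample size, and the 80/20 selection rule are assumptions made for this example, not details from the story of Daniel or from any real study. Baseline health influences both who ends up on the vegetarian diet and how healthy each person looks afterward, while the diet itself is given no effect at all.

```python
import random
import statistics


def simulate_diet_study(n_people, rng):
    """Return the naive outcome gap (vegetarian minus king's diet).

    Assumed setup: baseline health drives both diet choice and outcome;
    the diet has zero true effect on the outcome.
    """
    veg_outcomes, meat_outcomes = [], []
    for _ in range(n_people):
        health = rng.gauss(0.0, 1.0)           # the lurking variable
        p_veg = 0.8 if health > 0 else 0.2     # healthier people favor vegetables
        chooses_veg = rng.random() < p_veg
        outcome = health + rng.gauss(0.0, 1.0)  # no term for the diet itself
        if chooses_veg:
            veg_outcomes.append(outcome)
        else:
            meat_outcomes.append(outcome)
    return statistics.mean(veg_outcomes) - statistics.mean(meat_outcomes)


naive_gap = simulate_diet_study(10_000, random.Random(597))
print(f"naive gap between the groups: {naive_gap:.2f} (true diet effect: 0)")
```

Under these assumed settings the vegetarian group comes out looking clearly healthier even though the simulated diet does nothing; the whole gap is produced by the variable that affects both selection and outcome.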
[Diagram: Z → X, Z → Y, X → Y]
FIGURE 4.1. The most basic version of confounding: Z is a confounder of the proposed causal relationship between X and Y.

Sometimes the confounders are known; other times they are merely suspected and act as a "lurking third variable." In a causal diagram, confounders are extremely easy to recognize: in Figure 4.1, the variable Z at the center of the fork is a confounder of X and Y. (We will see a more universal definition later, but this triangle is the most recognizable and common situation.)

The term "confounding" originally meant "mixing" in English, and we can understand from the diagram why this name was chosen. The true causal effect X → Y is "mixed" with the spurious correlation between X and Y induced by the fork X ← Z → Y. For example, if we are testing a drug and give it to patients who are younger on average than the people in the control group, then age becomes a confounder, a lurking third variable. If we don't have any data on the ages, we will not be able to disentangle the true effect from the spurious effect.

However, the converse is also true. If we do have measurements of the third variable, then it is very easy to deconfound the true and spurious effects. For instance, if the confounding variable Z is age, we compare the treatment and control groups in every age group separately. We can then take an average of the effects, weighting each age group according to its percentage in the target population. This method of compensation is familiar to all statisticians; it is called "adjusting for Z" or "controlling for Z."

Oddly, statisticians both over- and underrate the importance of adjusting for possible confounders. They overrate it in the sense that they often control for many more variables than they need to and even for variables that they should not control for. I recently came across a quote from a political blogger named Ezra Klein who expresses this phenomenon of "overcontrolling" very clearly: "You see it all the time in studies. 'We controlled for ...' And then the list starts. The longer the better. Income. Age. Race. Religion. Height. Hair color. Sexual preference. Crossfit attendance. Love of parents. Coke or Pepsi. The more things you can control for, the stronger your study is, or, at least, the stronger your study seems. Controls give the feeling of specificity, of precision.... But sometimes, you can control for too much. Sometimes you end up controlling for the thing you're trying to measure." Klein raises a valid concern. Statisticians have been immensely confused about what variables should and should not be controlled for, so the default practice has been to control for everything one can measure. The vast majority of studies conducted in this day and age subscribe to this practice. It is a convenient, simple procedure to follow, but it is both wasteful and ridden with errors. A key achievement of the Causal Revolution has been to bring an end to this confusion.

At the same time, statisticians greatly underrate controlling in the sense that they are loath to talk about causality at all, even if the controlling has been done correctly. This too stands contrary to the message of this chapter: if you have identified a sufficient set of deconfounders in your diagram, gathered data on them, and properly adjusted for them, then you have every right to say that you have computed the causal effect X → Y (provided, of course, that you can defend your causal diagram on scientific grounds).

The textbook approach of statisticians to confounding is quite different and rests on an idea most effectively advocated by R. A. Fisher: the randomized controlled trial (RCT). Fisher was exactly right, but not for exactly the right reasons. The randomized controlled trial is indeed a wonderful invention, but until recently the generations of statisticians who followed Fisher could not prove
that what they got from the RCT was indeed what they sought to obtain. They did not have a language to write down what they were looking for, namely, the causal effect of X on Y. One of my goals in this chapter is to explain, from the point of view of causal diagrams, precisely why RCTs allow us to estimate the causal effect X → Y without falling prey to confounder bias. Once we have understood why RCTs work, there is no need to put them on a pedestal and treat them as the gold standard of causal analysis, which all other methods should emulate. Quite the opposite: we will see that the so-called gold standard in fact derives its legitimacy from more basic principles.

This chapter will also show that causal diagrams make possible a shift of emphasis from confounders to deconfounders. The former cause the problem; the latter cure it. The two sets may overlap, but they don't have to. If we have data on a sufficient set of deconfounders, it does not matter if we ignore some or even all of the confounders.

This shift of emphasis is a main way in which the Causal Revolution allows us to go beyond Fisherian experiments and infer causal effects from nonexperimental studies. It enables us to determine which variables should be controlled for to serve as deconfounders. This question has bedeviled both theoretical and practical statisticians; it has been an Achilles' heel of the field for decades. That is because it has nothing to do with data or statistics. Confounding is a causal concept; it belongs on rung two of the Ladder of Causation.

Graphical methods, beginning in the 1990s, have totally deconfounded the confounding problem. In particular, we will soon meet a method called the back-door criterion, which unambiguously identifies which variables in a causal diagram are deconfounders. If the researcher can gather data on those variables, she can adjust for them and thereby make predictions about the result of an intervention even without performing it.

In fact, the Causal Revolution has gone even farther than this. In some cases we can control for confounding even when we do not have data on a sufficient set of deconfounders. In these cases we can use different adjustment formulas (not the conventional one, which is only appropriate for use with the back-door criterion) and still eradicate all confounding. We will save these exciting developments for Chapter 7.

Although confounding has a long history in all areas of science, the recognition that the problem requires causal, not statistical, solutions is very recent. Even as recently as 2001, reviewers rebuked a paper of mine while insisting, "Confounding is solidly founded in standard statistics." Fortunately, the number of such reviewers has shrunk dramatically in the past decade. There is now an almost universal consensus, at least among epidemiologists, philosophers, and social scientists, that (1) confounding needs and has a causal solution, and (2) causal diagrams provide a complete and systematic way of finding that solution. The age of confusion over confounding has come to an end!

THE CHILLING FEAR OF CONFOUNDING

In 1998, a study in the New England Journal of Medicine revealed an association between regular walking and reduced death rates among retired men. The researchers used data from the Honolulu Heart Program, which has followed the health of 8,000 men of Japanese ancestry since 1965.

The researchers, led by Robert Abbott, a biostatistician at the University of Virginia, wanted to know whether the men who exercised more lived longer. They chose a sample of 707 men from the larger group of 8,000, all of whom were physically healthy enough to walk. Abbott's team found that the death rate over a twelve-year period was two times higher among men who walked less than a mile a day (I'll call them "casual walkers") than among men who walked more than two miles a day ("intense walkers"). To be precise, 43 percent of the casual walkers had died, while only 21.5 percent of the intense walkers had died.

However, because the experimenters did not prescribe who would be a casual walker and who would be an intense walker, we have to take into consideration the possibility of confounding bias. An obvious confounder might be age: younger men might be more willing to do a vigorous workout and also would be
less likely to die. So we would have a causal diagram like that in Figure 4.2.

[Diagram: Age → Walking, Age → Mortality, Walking → Mortality]
FIGURE 4.2. Causal diagram for walking example.

The classic forking pattern at the "Age" node tells us that age is a confounder of walking and mortality. I'm sure you can think of other possible confounders. Perhaps the casual walkers were slacking off for a reason; maybe they couldn't walk as much. Thus, physical condition could be a confounder. We could go on and on like this. What if the light walkers were alcohol drinkers? What if they ate more?

The good news is, the researchers thought about all these factors. The study has accounted and adjusted for every reasonable factor: age, physical condition, alcohol consumption, diet, and several others. For example, it's true that the intense walkers tended to be slightly younger. So the researchers adjusted the death rate for age and found that the difference between casual and intense walkers was still very large. (The age-adjusted death rate for the casual walkers was 41 percent, compared to 24 percent for the intense walkers.)

Even so, the researchers were very circumspect in their conclusions. At the end of the article, they wrote, "Of course, the effects on longevity of intentional efforts to increase the distance walked per day by physically capable older men cannot be addressed in our study." To use the language of Chapter 1, they decline to say anything about your probability of surviving twelve years given that you do(exercise).

In fairness to Abbott and the rest of his team, they may have had good reasons for caution. This was a first study, and the sample was relatively small and homogeneous. Nevertheless, this caution reflects a more general attitude, transcending issues of homogeneity and sample size. Researchers have been taught to believe that an observational study (one where subjects choose their own treatment) can never illuminate a causal claim. I assert that this caution is overexaggerated. Why else would one bother adjusting for all these confounders, if not to get rid of the spurious part of the association and thereby get a better view of the causal part?

Instead of saying "Of course we can't," as they did, we should proclaim that of course we can say something about an intentional intervention. If we believe that Abbott's team identified all the important confounders, we must also believe that intentional walking tends to prolong life (at least in Japanese males).

This provisional conclusion, predicated on the assumption that no other confounders could play a major role in the relationships found, is an extremely valuable piece of information. It tells a potential walker precisely what kind of uncertainty remains in taking the claim at face value. It tells him that the remaining uncertainty is not higher than the possibility that additional confounders exist that were not taken into account. It is also valuable as a guide to future studies, which should focus on those other factors (if they exist), not the ones neutralized in the current study. In short, knowing the set of assumptions that stand behind a given conclusion is not less valuable than attempting to circumvent those assumptions with an RCT, which, as we shall see, has complications of its own.

THE SKILLFUL INTERROGATION OF NATURE: WHY RCTS WORK

As I have mentioned already, the one circumstance under which scientists will abandon some of their reticence to talk about causality is when they have conducted a randomized controlled trial. You can read it on Wikipedia or in a thousand other places: "The RCT is often considered the gold standard of a clinical trial." We have one person to thank for this, R. A. Fisher, so it is very interesting to
read what a person very close to him wrote about his reasons. The passage is lengthy, but worth quoting in full:

    The whole art and practice of scientific experimentation is comprised in the skillful interrogation of Nature. Observation has provided the scientist with a picture of Nature in some aspect, which has all the imperfections of a voluntary statement. He wishes to check his interpretation of this statement by asking specific questions aimed at establishing causal relationships. His questions, in the form of experimental operations, are necessarily particular, and he must rely on the consistency of Nature in making general deductions from her response in a particular instance or in predicting the outcome to be anticipated from similar operations on other occasions. His aim is to draw valid conclusions of determinate precision and generality from the evidence he elicits.

    Far from behaving consistently, however, Nature appears vacillating, coy, and ambiguous in her answers. She responds to the form of the question as it is set out in the field and not necessarily to the question in the experimenter's mind; she does not interpret for him; she gives no gratuitous information; and she is a stickler for accuracy. In consequence, the experimenter who wants to compare two manurial treatments wastes his labor if, dividing his field into two equal parts, he dresses each half with one of his manures, grows a crop, and compares the yields from the two halves. The form of his question was: what is the difference between the yield of plot A under the first treatment and that of plot B under the second? He has not asked whether plot A would yield the same as plot B under uniform treatment, and he cannot distinguish plot effects from treatment effects, for Nature has recorded, as requested, not only the contribution of the manurial differences to the plot yields but also the contributions of differences in soil fertility, texture, drainage, aspect, microflora, and innumerable other variables.

The author of this passage is Joan Fisher Box, the daughter of Ronald Aylmer Fisher, and it is taken from her biography of her illustrious father. Though not a statistician herself, she has clearly absorbed very deeply the central challenge statisticians face. She states in no uncertain terms that the questions they ask are "aimed at establishing causal relationships." And what gets in their way is confounding, although she does not use that word. They want to know the effect of a fertilizer (or "manurial treatment," as fertilizers were called in that era), that is, the expected yield under one fertilizer compared with the yield under an alternative. Nature, however, tells them about the effect of the fertilizer mixed (remember, this is the original meaning of "confounded") with a variety of other causes.

I like the image that Fisher Box provides in the above passage: Nature is like a genie that answers exactly the question we pose, not necessarily the one we intend to ask. But we have to believe, as Fisher Box clearly does, that the answer to the question we wish to ask does exist in nature. Our experiments are a sloppy means of uncovering the answer, but they do not by any means define the answer. If we follow her analogy exactly, then do(X = x) must come first, because it is a property of nature that represents the answer we seek: What is the effect of using the first fertilizer on the whole field? Randomization comes second, because it is only a man-made means to elicit the answer to that question. One might compare it to the gauge on a thermometer, which is a means to elicit the temperature but is not the temperature itself.

In his early years at Rothamsted Experimental Station, Fisher usually took a very elaborate, systematic approach to disentangling the effects of fertilizer from other variables. He would divide his fields into a grid of subplots and plan carefully so that each fertilizer was tried with each combination of soil type and plant (see Figure 4.3). He did this to ensure the comparability of each sample; in reality, he could never anticipate all the confounders that might determine the fertility of a given plot. A clever enough genie could defeat any structured layout of the field.

Around 1923 or 1924, Fisher began to realize that the only experimental design that the genie could not defeat was a random one. Imagine performing the same experiment one hundred times on a field with an unknown distribution of fertility. Each time you assign fertilizers to subplots randomly. Sometimes you may be very unlucky and use Fertilizer 1 in all the least fertile subplots. Other
times you may get lucky and apply it to the most fertile subplots. But by generating a new random assignment each time you perform the experiment, you can guarantee that the great majority of the time you will be neither lucky nor unlucky. In those cases, Fertilizer 1 will be applied to a selection of subplots that is representative of the field as a whole. This is exactly what you want for a controlled trial. Because the distribution of fertility in the field is fixed throughout your series of experiments (the genie can't change it), he is tricked into answering (most of the time) the causal question you wanted to ask.

FIGURE 4.3. R. A. Fisher with one of his many innovations: a Latin square experimental design, intended to ensure that one plot of each plant type appears in each row (fertilizer type) and column (soil type). Such designs are still used in practice, but Fisher would later argue convincingly that a randomized design is even more effective. (Source: Drawing by Dakota Harr.)

From our perspective, in an era when randomized trials are the gold standard, all of this may appear obvious. But at the time, the idea of a randomly designed experiment horrified Fisher's statistical colleagues. Fisher's literally drawing from a deck of cards to assign subplots to each fertilizer may have contributed to their dismay. Science subjected to the whims of chance?

But Fisher realized that an uncertain answer to the right question is much better than a highly certain answer to the wrong question. If you ask the genie the wrong question, you will never find out what you want to know. If you ask the right question, getting an answer that is occasionally wrong is much less of a problem. You can still estimate the amount of uncertainty in your answer, because the uncertainty comes from the randomization procedure (which is known) rather than the characteristics of the soil (which are unknown).

Thus, randomization actually brings two benefits. First, it eliminates confounder bias (it asks Nature the right question). Second, it enables the researcher to quantify his uncertainty. However, according to historian Stephen Stigler, the second benefit was really Fisher's main reason for advocating randomization. He was the world's master of quantifying uncertainty, having developed many new mathematical procedures for doing so. By comparison, his understanding of deconfounding was purely intuitive, for he lacked a mathematical notation for articulating what he sought.

Now, ninety years later, we can use the do-operator to fill in what Fisher wanted to but couldn't ask. Let's see, from a causal point of view, how randomization enables us to ask the genie the right question.

Let's start, as usual, by drawing a causal diagram. Model 1, shown in Figure 4.4, describes how the yield of each plot is determined under normal conditions, where the farmer decides by whim or bias which fertilizer is best for each plot. The query he wants to pose to the genie Nature is "What is the yield under a uniform application of Fertilizer 1 (versus Fertilizer 2) to the entire field?" Or, in do-operator notation, what is P(yield | do(fertilizer = 1))?

[Diagram nodes: Soil Fertility, Texture, Drainage, Microflora, Other]
FIGURE 4.4. Model 1: an improperly controlled experiment.
Confounding and Deconfound'ing: Or, Slaying the Lurking Variable 149
THE BOOK OF WHY
148
erim ent naivel
y, f or example ap Soil Fertility Texture Microflora
the f armer p erf orms the e xp
If and Fertilizer
2 to the
t th hig h end of h is field
plying Fertilizer 1 o e
onf ou nd er. If
raina ge as a c
pr b b y i ntroduci ng D
low end, he is a l
o
year, he is
1 one year and
Fertilizer 2 the next
he uses F t z er e, h e
er. I n either cas
er ili
a conf oun d
i t d ci ng Weather as
proba bly n ro u
mparison.
will get a biased co now ab out is
described
er w ants to k
F1GURE4.6.
h t th f arm
The world t a e
(see Figure
wh all plots recei
ve the same f ertilizer
by Model 2, ere
erator is to
in d in Chapter 1,
the eff ect of the do-op .
world described by Figure 4 · 6, ther e 1s no d'Ifferenc e between see.mg
xp l
4.5). As nd f o rce this va riable to
e a e
ws p int g t o Fertilizer a . . er = 1 and do i·ng Fert1.1.1zer = 1.

erase all the arro
o in Fertiliz
alue-say, Fert
ilizer = 1. That brings us to the punch line · randomizat1.on .is a way of sim-
a part icular v . .
ulating M od el 2. It disables a11 the old confounders without intro-
Drainage Microflora ducing any new confounders. That is . the source o f its power; there
Soil Fertility Texture
.1s noth'mg mysterious or myst1c .
aI about .It· It is
. noth'mg more or less
than, as J oan Fisher Box said' "the sk'l]f · errogat10n of Nature. "
I u1 mt
The experiment would' however f a.il .m its . ob'1 ect1v .
. e of s1mul a t-
. '
ing Model 2 if either the expenmenter were allowed to use his . own
.
.1udgment to choose a f er t·1· I izer or the expenmenta I sub'1 ects, in this
ut. .
ould like to know abo case the plants' "knew" w h'ich card they had drawn. This .1s why
FIGURE 4. 5. Model 2: the world we w
. . al trials with human s u6.J ects go to gr ea t lengths to conce al
cl1mc
Finally, let's see what the world looks like when we apply randomization. Now some plots will be subjected to do(fertilizer = 1) and others to do(fertilizer = 2), but the choice of which treatment goes to which plot is random. The world created by such a model is shown by Model 3 in Figure 4.6, showing the variable Fertilizer obtaining its assignment by a random device (say, Fisher's deck of cards).

Notice that all the arrows pointing toward Fertilizer have been erased, reflecting the assumption that the farmer listens only to the card when deciding which fertilizer to use. It is equally important to note that there is no arrow from Card to Yield, because plants cannot read the cards. (This is a fairly safe assumption for plants, but for human subjects in a randomized trial it is a serious concern.) Therefore Model 3 describes a world in which the relation between Fertilizer and Yield is unconfounded (i.e., there is no common cause of Fertilizer and Yield). This means that in

this information from both the patients and the experimenters (a procedure known as double blinding).

I will add to this a second punch line: there are other ways of simulating Model 2. One way, if you know what all the possible confounders are, is to measure and adjust for them. However, randomization does have one great advantage. It severs every incoming link to the randomized variable, including the ones we don't know about or cannot measure (e.g., "Other" factors in Figures 4.4 to 4.6).

By contrast, in a nonrandomized study, the experimenter must rely on her knowledge of the subject matter. If she is confident that her causal model accounts for a sufficient number of deconfounders and she has gathered data on them, then she can estimate the effect of Fertilizer on Yield in an unbiased way. But the danger is that she may have missed a confounding factor, and her estimate may therefore be biased.
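
The contrast between randomizing and adjusting can be made concrete with a small simulation. What follows is only a sketch under invented assumptions: a single "Other" factor (called soil here), a true fertilizer effect of 2 units of yield, and a farmer who prefers to fertilize good soil. None of these numbers comes from Fisher's experiments, and the function names are hypothetical.

```python
import random

TRUE_EFFECT = 2.0  # assumed extra yield from the fertilizer (illustrative)

def simulate(n, randomized, seed=0):
    """Return (soil, fertilizer, yield) records for n plots."""
    rng = random.Random(seed)
    plots = []
    for _ in range(n):
        soil = rng.random() < 0.5  # the "Other" factor: good soil or not
        if randomized:
            fert = rng.random() < 0.5  # the card decides, ignoring soil
        else:
            fert = rng.random() < (0.8 if soil else 0.2)  # farmer favors good soil
        y = 10.0 + TRUE_EFFECT * fert + 5.0 * soil + rng.gauss(0.0, 1.0)
        plots.append((soil, fert, y))
    return plots

def mean(values):
    return sum(values) / len(values)

def naive_effect(plots):
    """Difference in mean yield between fertilized and unfertilized plots."""
    treated = [y for _, f, y in plots if f]
    control = [y for _, f, y in plots if not f]
    return mean(treated) - mean(control)

def adjusted_effect(plots):
    """Compare within each soil stratum, then weight strata by their size."""
    total = 0.0
    for s in (True, False):
        stratum = [p for p in plots if p[0] == s]
        total += naive_effect(stratum) * len(stratum) / len(plots)
    return total

observational = simulate(20000, randomized=False)
experiment = simulate(20000, randomized=True)
print("naive, observational:   ", round(naive_effect(observational), 2))
print("adjusted, observational:", round(adjusted_effect(observational), 2))
print("naive, randomized:      ", round(naive_effect(experiment), 2))
```

Under these assumptions the naive observational contrast lands near 5 rather than 2, because good soil is more common among fertilized plots. Adjusting for soil recovers roughly 2, but only because soil was measured. The randomized contrast recovers roughly 2 without any knowledge of soil at all.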
All things being equal, RCTs are still preferred to observational studies, just as safety nets are recommended for tightrope walkers. But all things are not necessarily equal. In some cases, intervention may be physically impossible (for instance, in a study of the effect of obesity on heart disease, we cannot randomly assign patients to be obese or not). Or intervention may be unethical (in a study of the effects of smoking, we can't ask randomly selected people to smoke for ten years). Or we may encounter difficulties recruiting subjects for inconvenient experimental procedures and end up with volunteers who do not represent the intended population.

Fortunately, the do-operator gives us scientifically sound ways of determining causal effects from nonexperimental studies, which challenge the traditional supremacy of RCTs. As discussed in the walking example, such causal estimates produced by observational studies may be labeled "provisional causality," that is, causality contingent upon the set of assumptions that our causal diagram advertises. It is important that we not treat these studies as second-class citizens: they have the advantage of being conducted in the natural habitat of the target population, not in the artificial setting of a laboratory, and they can be "pure" in the sense of not being contaminated by issues of ethics or feasibility.

Now that we understand that the principal objective of an RCT is to eliminate confounding, let's look at the other methods that the Causal Revolution has given us. The story begins with a 1986 paper by two of my longtime colleagues, which started a reevaluation of what confounding means.

THE NEW PARADIGM OF CONFOUNDING

"While confounding is widely recognized as one of the central problems in epidemiological research, a review of the literature will reveal little consistency among the definitions of confounding or confounder." With this one sentence, Sander Greenland of the University of California, Los Angeles, and Jamie Robins of Harvard University put their finger on the central reason why the control of confounding had not advanced one bit since Fisher. Lacking a principled understanding of confounding, scientists could not say anything meaningful in observational studies where physical control over treatments is infeasible.

How was confounding defined then, and how should it be defined? Armed with what we now know about the logic of causality, the answer to the second question is easier. The quantity we observe is the conditional probability of the outcome given the treatment, P(Y | X). The question we want to ask of Nature has to do with the causal relationship between X and Y, which is captured by the interventional probability P(Y | do(X)). Confounding, then, should simply be defined as anything that leads to a discrepancy between the two: P(Y | X) ≠ P(Y | do(X)). Why all the fuss?

Unfortunately, things were not as easy as that before the 1990s because the do-operator had yet to be formalized. Even today, if you stop a statistician in the street and ask, "What does 'confounding' mean to you?" you will probably get one of the most convoluted and confounded answers you ever heard from a scientist. One recent book, coauthored by leading statisticians, spends literally two pages trying to explain it, and I have yet to find a reader who understood the explanation.

The reason for the difficulty is that confounding is not a statistical notion. It stands for the discrepancy between what we want to assess (the causal effect) and what we actually do assess using statistical methods. If you can't articulate mathematically what you want to assess, you can't expect to define what constitutes a discrepancy.

Historically, the concept of "confounding" has evolved around two related conceptions: incomparability and a lurking third variable. Both of these concepts have resisted formalization. When we talked about comparability, in the context of Daniel's experiment, we said that the treatment and control groups should be identical in all relevant ways. But this begs us to distinguish relevant from irrelevant attributes. How do we know that age is relevant in the Honolulu walking study? How do we know that the alphabetical order of a participant's name is not relevant? You might say it's obvious or common sense, but generations of scientists have struggled to articulate that common sense formally, and a robot cannot rely on our common sense when asked to act properly.
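
The discrepancy between P(Y | X) and P(Y | do(X)) can be worked out exactly on a toy model. The sketch below assumes a hypothetical three-variable diagram, Z → X, Z → Y, and X → Y, with all variables binary and all probabilities invented for illustration. The interventional quantity is computed by letting Z keep its own distribution, which is what erasing the arrow into X amounts to in this particular model.

```python
# Hypothetical model: Z -> X, Z -> Y, X -> Y, all binary. Numbers are invented.
P_Z = {0: 0.6, 1: 0.4}                      # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.7}             # P(X = 1 | Z = z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3,  # P(Y = 1 | X = x, Z = z)
                 (0, 1): 0.5, (1, 1): 0.7}

def p_x_given_z(x, z):
    p1 = P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1.0 - p1

def p_y1_given_x(x):
    """Observed P(Y = 1 | X = x): Z is weighted by how often it occurs with x."""
    joint = sum(P_Y1_GIVEN_XZ[(x, z)] * p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    marginal = sum(p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    return joint / marginal

def p_y1_do_x(x):
    """Interventional P(Y = 1 | do(X = x)): Z keeps its own distribution."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * P_Z[z] for z in P_Z)

observed_gap = p_y1_given_x(1) - p_y1_given_x(0)
causal_gap = p_y1_do_x(1) - p_y1_do_x(0)
print(f"seeing: {observed_gap:.2f}")  # seeing: 0.40
print(f"doing:  {causal_gap:.2f}")    # doing:  0.20
```

In this made-up model the observed contrast is twice the causal one, so by the definition above the relationship is confounded, and Z is the variable responsible.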
The same ambiguity plagues the third-variable definition. Should a confounder be a common cause of both X and Y or merely correlated with each? Today we can answer such questions by referring to the causal diagram and checking which variables produce a discrepancy between P(X | Y) and P(X | do(Y)). Lacking a diagram or a do-operator, five generations of statisticians and health scientists had to struggle with surrogates, none of which were satisfactory. Considering that the drugs in your medicine cabinet may have been developed on the basis of a dubious definition of "confounders," you should be somewhat concerned.

Let's take a look at some of the surrogate definitions of confounding. These fall into two main categories, declarative and procedural. A typical (and wrong) declarative definition would be "A confounder is any variable that is correlated with both X and Y." On the other hand, a procedural definition would attempt to characterize a confounder in terms of a statistical test. This appeals to statisticians, who love any test that can be performed on the data directly without appealing to a model.

Here is a procedural definition that goes by the scary name of "noncollapsibility." It comes from a 1996 paper by the Norwegian epidemiologist Sven Hernberg: "Formally one can compare the crude relative risk and the relative risk resulting after adjustment for the potential confounder. A difference indicates confounding, and in that case one should use the adjusted risk estimate. If there is no or a negligible difference, confounding is not an issue and the crude estimate is to be preferred." In other words, if you suspect a confounder, try adjusting for it and try not adjusting for it. If there is a difference, it is a confounder, and you should trust the adjusted value. If there is no difference, you are off the hook. Hernberg was by no means the first person to advocate such an approach; it has misguided a century of epidemiologists, economists, and social scientists, and it still reigns in certain quarters of applied statistics. I have picked on Hernberg only because he was unusually explicit about it and because he wrote this in 1996, well after the Causal Revolution was already underway.

The most popular of the declarative definitions evolved over a period of time. Alfredo Morabia, author of A History of Epidemiologic Methods and Concepts, calls it "the classic epidemiological definition of confounding," and it consists of three parts. A confounder of X (the treatment) and Y (the outcome) is a variable Z that is (1) associated with X in the population at large, and (2) associated with Y among people who have not been exposed to the treatment X. In recent years, this has been supplemented by a third condition: (3) Z should not be on the causal path between X and Y.

Observe that all the terms in the "classic" version (1 and 2) are statistical. In particular, Z is only assumed to be associated with (not a cause of) X and Y. Edward Simpson proposed the rather convoluted condition "Y is associated with Z among the unexposed" in 1951. From the causal point of view, it seems that Simpson's idea was to discount the part of the correlation of Z with Y that is due to the causal effect of X on Y; in other words, he wanted to say that Z has an effect on Y independent of its effect on X. The only way he could think to express this discounting was to condition on X by focusing on the control group (X = 0). Statistical vocabulary, deprived of the word "effect," gave him no other way of saying it.

If this is a bit confusing, it should be! How much easier it would have been if he could have simply written a causal diagram, like Figure 4.1, and said, "Y is associated with Z via paths not going through X." But he didn't have this tool, and he couldn't talk about paths, which were a forbidden concept.

The "classical epidemiological definition" of a confounder has other flaws, as the following two examples show:

(i) X → Z → Y

and

(ii) X → M → Y
         ↓
         Z

In example (i), Z satisfies conditions (1) and (2) but is not a confounder. It is known as a mediator: it is the variable that explains
the causal effect of X on Y. It is a disaster to control for Z if you are trying to find the causal effect of X on Y. If you look only at those individuals in the treatment and control groups for whom Z = 0, then you have completely blocked the effect of X, because it works by changing Z. So you will conclude that X has no effect on Y. This is exactly what Ezra Klein meant when he said, "Sometimes you end up controlling for the thing you're trying to measure."

In example (ii), Z is a proxy for the mediator M. Statisticians very often control for proxies when the actual causal variable can't be measured; for instance, party affiliation might be used as a proxy for political beliefs. Because Z isn't a perfect measure of M, some of the influence of X on Y might "leak through" if you control for Z. Nevertheless, controlling for Z is still a mistake. While the bias might be less than if you controlled for M, it is still there.

For this reason later statisticians, notably David Cox in his textbook The Design of Experiments (1958), warned that you should only control for Z if you have a "strong prior reason" to believe that it is not affected by X. This "strong prior reason" is nothing more or less than a causal assumption. He adds, "Such hypotheses may be perfectly in order, but the scientist should always be aware when they are being appealed to." Remember that it's 1958, in the midst of the great prohibition on causality. Cox is saying that you can go ahead and take a swig of causal moonshine when adjusting for confounders, but don't tell the preacher. A daring suggestion! I never fail to commend him for his bravery.

By 1980, Simpson's and Cox's conditions had been combined into the three-part test for confounding that I mentioned above. It is about as trustworthy as a canoe with only three leaks. Even though it does make a halfhearted appeal to causality in part (3), each of the first two parts can be shown to be both unnecessary and insufficient.

Greenland and Robins drew that conclusion in their landmark 1986 paper. The two took a completely new approach to confounding, which they called "exchangeability." They went back to the original idea that the control group (X = 0) should be comparable to the treatment group (X = 1). But they added a counterfactual twist. (Remember from Chapter 1 that counterfactuals are at rung three of the Ladder of Causation and therefore powerful enough to detect confounding.) Exchangeability requires the researcher to consider the treatment group, imagine what would have happened to its constituents if they had not gotten treatment, and then judge whether the outcome would be the same as for those who (in reality) did not receive treatment. Only then can we say that no confounding exists in the study.

In 1986, talking counterfactuals to an audience of epidemiologists took some courage, because they were still very much under the influence of classical statistics, which holds that all the answers are in the data, not in what might have been, which will remain forever unobserved. However, the statistical community was somewhat prepared to listen to such heresy, thanks to the pioneering work of another Harvard statistician, Donald Rubin. In Rubin's "potential outcomes" framework, proposed in 1974, counterfactual variables like "Blood Pressure of Person X had he received Drug D" and "Blood Pressure of Person X had he not received Drug D" are just as legitimate as a traditional variable like Blood Pressure, despite the fact that one of those two variables will remain forever unobserved.

Robins and Greenland set out to express their conception of confounding in terms of potential outcomes. They partitioned the population into four types of individuals: doomed, causative, preventive, and immune. The language is suggestive, so let's think of the treatment X as a flu vaccination and the outcome Y as coming down with flu. The doomed people are those for whom the vaccine doesn't work; they will get flu whether they get the vaccine or not. The causative group (which may be nonexistent) includes those for whom the vaccine actually causes the disease. The preventive group consists of people for whom the vaccine prevents the disease: they will get flu if they are not vaccinated, and they will not get flu if they are vaccinated. Finally, the immune group consists of people who will not get flu in either case. Table 4.1 sums up these considerations.

Ideally, each person would have a sticker on his forehead identifying which group he belonged to. Exchangeability simply means that the percentage of people with each kind of sticker (d percent, c percent, p percent, and i percent, respectively) should be the same in
both the treatment and control groups. Equality among these proportions guarantees that the outcome would be just the same if we switched the treatments and controls. Otherwise, the treatment and control groups are not alike, and our estimate of the effect of the vaccine will be confounded. Note that the two groups may be different in many ways. They can differ in age, sex, health conditions, and a variety of other characteristics. Only equality among d, c, p, and i determines whether they are exchangeable or not. So exchangeability amounts to equality between two sets of four proportions, a vast reduction in complexity from the alternative of assessing the innumerable factors by which the two groups may differ.

TABLE 4.1. Classification of individuals according to response type.

Group        Outcome If Vaccinated   Outcome If Not Vaccinated   Percentage in Group
Doomed       Flu                     Flu                         d
Causative    Flu                     No flu                      c
Preventive   No flu                  Flu                         p
Immune       No flu                  No flu                      i

Using this commonsense definition of confounding, Greenland and Robins showed that the "statistical" definitions, both declarative and procedural, give incorrect answers. A variable can satisfy the three-part test of epidemiologists and still increase bias, if adjusted for.

Greenland and Robins's definition was a great achievement, because it enabled them to give explicit examples showing that the previous definitions of confounding were inadequate. However, the definition could not be translated into practice. To put it simply, those stickers on the forehead don't exist. We do not even have a count of the proportions d, c, p, and i. In fact, this is precisely the kind of information that the genie of Nature keeps locked inside her magic lantern and doesn't show to anybody. Lacking this information, the researcher is left to intuit whether the treatment and control groups are exchangeable or not.

By now, I hope that your curiosity is well piqued. How can causal diagrams turn this massive headache of confounding into a fun game? The trick lies in an operational test for confounding, called the back-door criterion. This criterion turns the problem of defining confounding, identifying confounders, and adjusting for them into a routine puzzle that is no more challenging than solving a maze. It has thus brought the thorny, age-old problem to a happy conclusion.

THE DO-OPERATOR AND THE BACK-DOOR CRITERION

To understand the back-door criterion, it helps first to have an intuitive sense of how information flows in a causal diagram. I like to think of the links as pipes that convey information from a starting point X to a finish Y. Keep in mind that the conveying of information goes in both directions, causal and noncausal, as we saw in Chapter 3.

In fact, the noncausal paths are precisely the source of confounding. Remember that I define confounding as anything that makes P(Y | do(X)) differ from P(Y | X). The do-operator erases all the arrows that come into X, and in this way it prevents any information about X from flowing in the noncausal direction. Randomization has the same effect. So does statistical adjustment, if we pick the right variables to adjust.

In the last chapter, we looked at three rules that tell us how to stop the flow of information through any individual junction. I will repeat them for emphasis:

(a) In a chain junction, A → B → C, controlling for B prevents information about A from getting to C or vice versa.

(b) Likewise, in a fork or confounding junction, A ← B → C, controlling for B prevents information about A from getting to C or vice versa.

(c) Finally, in a collider, A → B ← C, exactly the opposite rules hold. The variables A and C start out independent, so that information about A tells you nothing about C. But if you control
for B, then information starts flowing through the "pipe," due to the explain-away effect.

We must also keep in mind another fundamental rule:

(d) Controlling for descendants (or proxies) of a variable is like "partially" controlling for the variable itself. Controlling for a descendant of a mediator partly closes the pipe; controlling for a descendant of a collider partly opens the pipe.

Now, what if we have longer pipes with more junctions, like this:

A ← B ← C → D ← E → F → G ← H → I → J?

The answer is very simple: if a single junction is blocked, then J cannot "find out" about A through this path. So we have many options to block communication between A and J: control for B, control for C, don't control for D (because it's a collider), control for E, and so forth. Any one of these is sufficient. This is why the usual statistical procedure of controlling for everything that we can measure is so misguided. In fact, this particular path is blocked if we don't control for anything! The colliders at D and G block the path without any outside help. Controlling for D and G would open this path and enable J to listen to A.

Finally, to deconfound two variables X and Y, we need only block every noncausal path between them without blocking or perturbing any causal paths. More precisely, a back-door path is any path from X to Y that starts with an arrow pointing into X. X and Y will be deconfounded if we block every back-door path (because such paths allow spurious correlation between X and Y). If we do this by controlling for some set of variables Z, we also need to make sure that no member of Z is a descendant of X on a causal path; otherwise we might partly or completely close off that path.

That's all there is to it! With these rules, deconfounding becomes so simple and fun that you can treat it like a game. I urge you to try a few examples just to get the hang of it and see how easy it is. If you still find it hard, be assured that algorithms exist that can crack all such problems in a matter of nanoseconds. In each case, the goal of the game is to specify a set of variables that will deconfound X and Y. In other words, they should not be descended from X, and they should block all the back-door paths.

GAME 1.

[Diagram: X → A → Y, with A → B]

This one is easy! There are no arrows leading into X, therefore no back-door paths. We don't need to control for anything.

Nevertheless, some researchers would consider B a confounder. It is associated with X because of the chain X → A → B. It is associated with Y among individuals with X = 0 because there is an open path B ← A → Y that does not pass through X. And B is not on the causal path X → A → Y. It therefore passes the three-step "classical epidemiological definition" for confounding, but it does not pass the back-door criterion and will lead to disaster if controlled for.

GAME 2.

[Diagram: nodes A, B, C, D, E, X, Y]
In this example you should think of A, B, C, and D as "pretreatment" variables. (The treatment, as usual, is X.) Now there is one back-door path X ← A → B ← D → E → Y. This path is already blocked by the collider at B, so we don't need to control for anything. Many statisticians would control for B or C, thinking there is no harm in doing so as long as they occur before the treatment. A leading statistician even recently wrote, "To avoid conditioning on some observed covariates ... is nonscientific ad hockery." He is wrong; conditioning on B or C is a poor idea because it would open the noncausal path and therefore confound X and Y. Note that in this case we could reclose the path by controlling for A or D. This example shows that there may be different strategies for deconfounding. One researcher might take the easy way and not control for anything; a more traditional researcher might control for C and D. Both would be correct and should get the same result (provided that the model is correct, and we have a large enough sample).

GAME 3.

[Diagram: nodes A, B, X, Y]

In Games 1 and 2 you didn't have to do anything, but this time you do. There is one back-door path from X to Y, X ← B → Y, which can only be blocked by controlling for B. If B is unobservable, then there is no way of estimating the effect of X on Y without running a randomized controlled experiment. Some (in fact, most) statisticians in this situation would control for A, as a proxy for the unobservable variable B, but this only partially eliminates the confounding bias and introduces a new collider bias.

GAME 4.

[Diagram: the M-shaped graph X ← A → B ← C → Y]

This one introduces a new kind of bias, called "M-bias" (named for the shape of the graph). Once again there is only one back-door path, and it is already blocked by a collider at B. So we don't need to control for anything. Nevertheless, all statisticians before 1986 and many today would consider B a confounder. It is associated with X (via X ← A → B) and associated with Y via a path that doesn't go through X (B ← C → Y). It does not lie on a causal path and is not a descendant of anything on a causal path, because there is no causal path from X to Y. Therefore B passes the traditional three-step test for a confounder.

M-bias puts a finger on what is wrong with the traditional approach. It is incorrect to call a variable, like B, a confounder merely because it is associated with both X and Y. To reiterate, X and Y are unconfounded if we do not control for B. B only becomes a confounder when you control for it!

When I started showing this diagram to statisticians in the 1990s, some of them laughed it off and said that such a diagram was extremely unlikely to occur in practice. I disagree! For example, seat-belt usage (B) has no causal effect on smoking (X) or lung disease (Y); it is merely an indicator of a person's attitudes toward societal norms (A) as well as safety and health-related measures (C). Some of these attitudes may affect susceptibility to lung disease (Y). In practice, seat-belt usage was found to be correlated with both X and Y; indeed, in a study conducted in 2006 as part of a tobacco
litigation, seat-belt usage was listed as one of the first variables to be controlled for. If you accept the above model, then controlling for B alone would be a mistake.

Note that it's all right to control for B if you also control for A or C. Controlling for the collider B opens the "pipe," but controlling for A or C closes it again. Unfortunately, in the seat-belt example, A and C are variables relating to people's attitudes and not likely to be observable. If you can't observe it, you can't adjust for it.

GAME 5.

[Diagram: nodes A, B, C, X, Y]

Game 5 is just Game 4 with a little extra wrinkle. Now a second back-door path X ← B ← C → Y needs to be closed. If we close this path by controlling for B, then we open up the M-shaped path X ← A → B ← C → Y. To close that path, we must control for A or C as well. However, notice that we could just control for C alone; that would close the path X ← B ← C → Y and not affect the other path.

Games 1 through 3 come from a 1993 paper by Clarice Weinberg, a deputy chief at the National Institutes of Health, called "Toward a Clearer Definition of Confounding." It came out during the transitional period between 1986 and 1995, when Greenland and Robins's paper was available but causal diagrams were still not widely known. Weinberg therefore went through the considerable arithmetic exercise of verifying exchangeability in each of the cases shown. Although she used graphical displays to communicate the scenarios involved, she did not use the logic of diagrams to assist in distinguishing confounders from deconfounders. She is the only person I know of who managed this feat. Later, in 2012, she collaborated on an updated version that analyzes the same examples with causal diagrams and verifies that all her conclusions from 1993 were correct.

In both of Weinberg's papers, the medical application was to estimate the effect of smoking (X) on miscarriages, or "spontaneous abortions" (Y). In Game 1, A represents an underlying abnormality that is induced by smoking; this is not an observable variable because we don't know what the abnormality is. B represents a history of previous miscarriages. It is very, very tempting for an epidemiologist to take previous miscarriages into account and adjust for them when estimating the probability of future miscarriages. But that is the wrong thing to do here! By doing so we are partially inactivating the mechanism through which smoking acts, and we will thus underestimate the true effect of smoking.

Game 2 is a more complicated version where there are two different smoking variables: X represents whether the mother smokes now (at the beginning of the second pregnancy), while A represents whether she smoked during the first pregnancy. B and E are underlying abnormalities caused by smoking, which are unobservable, and D represents other physiological causes of those abnormalities. Note that this diagram allows for the fact that the mother could have changed her smoking behavior between pregnancies, but the other physiological causes would not change. Again, many epidemiologists would adjust for prior miscarriages (C), but this is a bad idea unless you also adjust for smoking behavior in the first pregnancy (A).

Games 4 and 5 come from a paper published in 2014 by Andrew Forbes, a biostatistician at Monash University in Australia, along with several collaborators. He is interested in the effect of smoking on adult asthma. In Game 4, X represents an individual's smoking behavior, and Y represents whether the person has asthma as an adult. B represents childhood asthma, which is a collider because it is affected by both A, parental smoking, and C, an underlying (and unobservable) predisposition toward asthma. In Game 5 the variables have the same meanings, but Forbes added two arrows
for greater realism. (Game 4 was only meant to introduce the M-graph.)

In fact, the full model in Forbes' paper has a few more variables and looks like the diagram in Figure 4.7. Note that Game 5 is embedded in this model in the sense that the variables A, B, C, X, and Y have exactly the same relationships. So we can transfer our conclusions over and conclude that we have to control for A and B or for C; but C is an unobservable and therefore uncontrollable variable. In addition we have four new confounding variables: D = parental asthma, E = chronic bronchitis, F = sex, and G = socioeconomic status. The reader might enjoy figuring out that we must control for E, F, and G, but there is no need to control for D. So a sufficient set of variables for deconfounding is A, B, E, F, and G.

FIGURE 4.7. Andrew Forbes's model of smoking (X) and asthma (Y).

In the end, Forbes found that smoking had a small and statistically insignificant association with adult asthma in the raw data, and the effect became even smaller and more insignificant after adjusting for the confounders. The null result should not detract, however, from the fact that his paper is a model for the "skillful interrogation of Nature."

One final comment about these "games": when you start identifying the variables as smoking, miscarriage, and so forth, they are quite obviously not games but serious business. I have referred to them as games because the joy of being able to solve them swiftly and meaningfully is akin to the pleasure a child feels on figuring out that he can crack puzzles that stumped him before.

Few moments in a scientific career are as satisfying as taking a problem that has puzzled and confused generations of predecessors and reducing it to a straightforward game or algorithm. I consider the complete solution of the confounding problem one of the main highlights of the Causal Revolution because it ended an era of confusion that has probably resulted in many wrong decisions in the past. It has been a quiet revolution, raging primarily in research laboratories and scientific meetings. Yet, armed with these new tools and insights, the scientific community is now tackling harder problems, both theoretical and practical, as subsequent chapters will show.
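
The junction rules behind the games in this chapter are mechanical enough to sketch in a few lines of code. The fragment below is an illustration, not the algorithm any particular paper or software package uses. It checks one given path at a time and treats a pipe as simply open or closed, so the "partial" closing that comes from controlling a descendant of a mediator is not modeled. It also does not check that the controlled set avoids descendants of X. The function names are hypothetical, and the Game 5 edge list contains only the arrows that appear in the two back-door paths quoted in the text.

```python
def descendants(node, edges):
    """All nodes reachable from `node` by following arrows forward."""
    found, frontier = set(), [node]
    while frontier:
        current = frontier.pop()
        for parent, child in edges:
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found

def is_blocked(path, edges, controlled):
    """True if some junction on `path` stops the flow of information."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, node) in edges and (nxt, node) in edges
        if collider:
            # A collider blocks unless it, or one of its descendants, is controlled.
            if node not in controlled and not (descendants(node, edges) & controlled):
                return True
        elif node in controlled:
            # Controlling the middle of a chain or fork closes the pipe.
            return True
    return False

# The long pipe A <- B <- C -> D <- E -> F -> G <- H -> I -> J.
LONG_PATH = list("ABCDEFGHIJ")
LONG_EDGES = {("B", "A"), ("C", "B"), ("C", "D"), ("E", "D"), ("E", "F"),
              ("F", "G"), ("H", "G"), ("H", "I"), ("I", "J")}
print(is_blocked(LONG_PATH, LONG_EDGES, set()))       # True: colliders D and G block it
print(is_blocked(LONG_PATH, LONG_EDGES, {"D", "G"}))  # False: both colliders opened

# Game 5: the two back-door paths named in the text.
GAME5_EDGES = {("A", "X"), ("A", "B"), ("C", "B"), ("C", "Y"), ("B", "X")}
BACK_DOOR_PATHS = [["X", "A", "B", "C", "Y"], ["X", "B", "C", "Y"]]

def closes_all_back_doors(controlled):
    return all(is_blocked(p, GAME5_EDGES, controlled) for p in BACK_DOOR_PATHS)

for candidate in (set(), {"B"}, {"A", "B"}, {"C"}):
    print(sorted(candidate), closes_all_back_doors(candidate))
```

The loop at the end reproduces the Game 5 reasoning: controlling for nothing leaves X ← B ← C → Y open, controlling for B alone opens the M-shaped path, and either {A, B} or {C} closes both.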
