
The New England Journal of Medicine

Special article

Selective Publication of Antidepressant Trials and Its Influence on Apparent Efficacy
Erick H. Turner, M.D., Annette M. Matthews, M.D., Eftihia Linardatos, B.S.,
Robert A. Tell, L.C.S.W., and Robert Rosenthal, Ph.D.

From the Departments of Psychiatry (E.H.T., A.M.M.) and Pharmacology (E.H.T.), Oregon Health and Science University; and the Behavioral Health and Neurosciences Division, Portland Veterans Affairs Medical Center (E.H.T., A.M.M., R.A.T.), both in Portland, OR; the Department of Psychology, Kent State University, Kent, OH (E.L.); the Department of Psychology, University of California–Riverside, Riverside (R.R.); and Harvard University, Cambridge, MA (R.R.). Address reprint requests to Dr. Turner at Portland VA Medical Center, P3MHDC, 3710 SW US Veterans Hospital Rd., Portland, OR 97239, or at turnere@ohsu.edu.

N Engl J Med 2008;358:252-60.
Copyright © 2008 Massachusetts Medical Society.

ABSTRACT

BACKGROUND
Evidence-based medicine is valuable to the extent that the evidence base is complete and unbiased. Selective publication of clinical trials, and the outcomes within those trials, can lead to unrealistic estimates of drug effectiveness and alter the apparent risk–benefit ratio.

METHODS
We obtained reviews from the Food and Drug Administration (FDA) for studies of 12 antidepressant agents involving 12,564 patients. We conducted a systematic literature search to identify matching publications. For trials that were reported in the literature, we compared the published outcomes with the FDA outcomes. We also compared the effect size derived from the published reports with the effect size derived from the entire FDA data set.

RESULTS
Among 74 FDA-registered studies, 31%, accounting for 3449 study participants, were not published. Whether and how the studies were published were associated with the study outcome. A total of 37 studies viewed by the FDA as having positive results were published; 1 study viewed as positive was not published. Studies viewed by the FDA as having negative or questionable results were, with 3 exceptions, either not published (22 studies) or published in a way that, in our opinion, conveyed a positive outcome (11 studies). According to the published literature, it appeared that 94% of the trials conducted were positive. By contrast, the FDA analysis showed that 51% were positive. Separate meta-analyses of the FDA and journal data sets showed that the increase in effect size ranged from 11 to 69% for individual drugs and was 32% overall.

CONCLUSIONS
We cannot determine whether the bias observed resulted from a failure to submit manuscripts on the part of authors and sponsors, from decisions by journal editors and reviewers not to publish, or both. Selective reporting of clinical trial results may have adverse consequences for researchers, study participants, health care professionals, and patients.

252 n engl j med 358;3  www.nejm.org  january 17, 2008

Downloaded from nejm.org on February 7, 2019. For personal use only. No other uses without permission.
Copyright © 2008 Massachusetts Medical Society. All rights reserved.

Medical decisions are based on an understanding of publicly reported clinical trials.[1,2] If the evidence base is biased, then decisions based on this evidence may not be the optimal decisions. For example, selective publication of clinical trials, and the outcomes within those trials, can lead to unrealistic estimates of drug effectiveness and alter the apparent risk–benefit ratio.[3,4]

Attempts to study selective publication are complicated by the unavailability of data from unpublished trials. Researchers have found evidence for selective publication by comparing the results of published trials with information from surveys of authors,[5] registries,[6] institutional review boards,[7,8] and funding agencies,[9,10] and even with published methods.[11] Numerous tests are available to detect selective-reporting bias, but none are known to be capable of detecting or ruling out bias reliably.[12-16]

In the United States, the Food and Drug Administration (FDA) operates a registry and a results database.[17] Drug companies must register with the FDA all trials they intend to use in support of an application for marketing approval or a change in labeling. The FDA uses this information to create a table of all studies.[18] The study protocols in the database must prospectively identify the exact methods that will be used to collect and analyze data. Afterward, in their marketing application, sponsors must report the results obtained using the prespecified methods. These submissions include raw data, which FDA statisticians use in corroborative analyses. This system prevents selective post hoc reporting of favorable trial results and outcomes within those trials.

How accurately does the published literature convey data on drug efficacy to the medical community? To address this question, we compared drug efficacy inferred from the published literature with drug efficacy according to FDA reviews.

Methods

Data from FDA Reviews
We identified the phase 2 and 3 clinical-trial programs for 12 antidepressant agents approved by the FDA between 1987 and 2004 (median, August 1996), involving 12,564 adult patients. For the eight older antidepressants, we obtained hard copies of statistical and medical reviews from colleagues who had procured them through the Freedom of Information Act.[19] Reviews for the four newer antidepressants were available on the FDA Web site.[17,20] This study was approved by the Research and Development Committee of the Portland Veterans Affairs Medical Center; because of its nature, informed consent from individual patients was not required.

From the FDA reviews of submitted clinical trials, we extracted efficacy data on all randomized, double-blind, placebo-controlled studies of drugs for the short-term treatment of depression. We included data pertaining only to dosages later approved as safe and effective; data pertaining to unapproved dosages were excluded.

We extracted the FDA’s regulatory decisions, that is, whether, for purposes of approval, the studies were judged to be positive or negative with respect to the prespecified primary outcomes (or primary end points).[21] We classified as questionable those studies that the FDA judged to be neither positive nor clearly negative, that is, studies that did not have significant findings on the primary outcome but did have significant findings on several secondary outcomes. Failed studies[22] were also classified as questionable (for more information, see the Methods section of the Supplementary Appendix, available with the full text of this article at www.nejm.org). For fixed-dose studies (studies in which patients are randomly assigned to receive one of two or more dose levels or placebo) with a mix of significant and nonsignificant results for different doses, we used the FDA’s stated overall decisions on the studies. We used double data extraction and entry, as detailed in the Methods section of the Supplementary Appendix.

Data from Journal Articles
Our literature-search strategy consisted of the following steps: a search of articles in PubMed, a search of references listed in review articles, and a search of the Cochrane Central Register of Controlled Trials; contact by telephone or e-mail with the drug sponsor’s medical-information department; and finally, contact by means of a certified letter sent to the sponsor’s medical-information department, including a deadline for responding in writing to our query about whether the study results had been published. If these steps failed to reveal any publications, we concluded that the study results had not been published.


We identified the best match between the FDA-reviewed clinical trials and journal articles on the basis of the following information: drug name, dose groups, sample size, active comparator (if used), duration, and name of principal investigator. We sought published reports on individual studies; articles covering multiple studies were excluded. When the results of a trial were reported in two or more primary publications, we selected the first publication.

Few journal articles used the term “primary efficacy outcome” or a reasonable equivalent. Therefore, we identified the apparent primary efficacy outcome, or the result highlighted most prominently, as the drug–placebo comparison reported first in the text of the results section or in the table or figure first cited in the text. As with the FDA reviews, we used double data extraction and entry (see the Methods section of the Supplementary Appendix for details).

Statistical Analysis
We categorized the trials on the basis of the FDA regulatory decision, whether the trial results were published, and whether the apparent primary outcomes agreed or conflicted with the FDA decision. We calculated risk ratios with exact 95% confidence intervals and Pearson’s chi-square analysis, using Stata software, version 9. We used a similar approach to examine the numbers of patients within the studies. Sample sizes were compared between published and unpublished studies with the use of the Wilcoxon rank-sum test.

For our major outcome indicator, we calculated the effect size for each trial using Hedges’s g, that is, the difference between two means divided by their pooled standard deviation.[23] However, because means and standard deviations (or standard errors) were inconsistently reported in both the FDA reviews and the journal articles, we used the algebraically equivalent computational equation[24]:

$$g = t \times \sqrt{\frac{1}{n_{\text{drug}}} + \frac{1}{n_{\text{placebo}}}}$$

We calculated the t statistic[25] using the precise P value and the combined sample size as arguments in Microsoft Excel’s TINV (inverse T) function, multiplying t by −1 when the study drug was inferior to the placebo. Hedges’s correction for small sample size was applied to all g values.[26]

Precise P values were not always available for the above calculation. Rather, P values were often indicated as being below or above a certain threshold, for example, P<0.05 or “not significant” (i.e., P>0.05). In these cases, we followed the procedure described in the Supplementary Appendix.

For each fixed-dose (multiple-dose) study, we computed a single study-level effect size weighted by the degrees of freedom for each dose group. On the basis of the study-level effect-size values for both fixed-dose and flexible-dose studies, we calculated weighted mean effect-size values for each drug and for all drugs combined, using a random-effects model with the method of DerSimonian and Laird[27] in Stata.[28]

Within the published studies, we compared the effect-size values derived from the journal articles with the corresponding effect-size values derived from the FDA reviews. Next, within the FDA data set, we compared the effect-size values for the published studies with the effect-size values for the unpublished studies. Finally, we compared the journal-based effect-size values with those derived from the entire FDA data set, that is, both published and unpublished studies.

We made these comparisons at the level of studies and again at the level of the 12 drugs. Because the data were not normally distributed, we used the nonparametric rank-sum test for unpaired data and the signed-rank test for paired data. In these analyses, all the effect-size values were given equal weight.

Results

Study Outcome and Publication Status
Of the 74 FDA-registered studies in the analysis, we could not find evidence of publication for 23 (31%) (Table 1). The difference between the sample sizes for the published studies (median, 153 patients) and the unpublished studies (median, 146 patients) was neither large nor significant (5% difference between medians; P = 0.29 by the rank-sum test).

Table 1. Overall Publication Status of FDA-Registered Antidepressant Studies.

Publication Status                                                     No. of Studies (%)   No. of Patients in Studies (%)
Published results agree with FDA decision                              40 (54)              7,272 (58)
Published results conflict with FDA decision (published as positive)   11 (15)              1,843 (15)
Results not published                                                  23 (31)              3,449 (27)
Total                                                                  74 (100)             12,564 (100)

The data in Table 1 are displayed in terms of the study outcome in Figure 1A. The questions of whether the studies were published and, if so, how the results were reported were strongly related to their overall outcomes. The FDA deemed 38 of the 74 studies (51%) positive, and all but 1 of the 38 were published. The remaining 36 studies (49%) were deemed to be either negative (24 studies) or questionable (12). Of these 36 studies, 3 were published as not positive, whereas the remaining 33 either were not published (22 studies) or were published, in our opinion, as positive (11) and therefore conflicted with the FDA’s conclusion. Overall, the studies that the FDA judged as positive were approximately 12 times as likely to be published in a way that agreed with the FDA analysis as were studies with nonpositive results according to the FDA (risk ratio, 11.7; 95% confidence interval [CI], 6.2 to 22.0; P<0.001). This association of publication status with study outcome remained significant when we excluded questionable studies and when we examined publication status without regard to whether the published conclusions and the FDA conclusions were in agreement (for details, see the Supplementary Appendix).

Overall, 48 of the 51 published studies were reported to have positive results (94%; binomial 95% CI, 84 to 99). According to the FDA, 38 of the 74 registered studies had positive results (51%; 95% CI, 39 to 63). There was no overlap between these two sets of confidence intervals.

These data are broken down by drug and study number in Figure 2A. For each of the 12 drugs, the results of at least one study either were unpublished or were reported in the literature as positive despite a conflicting judgment by the FDA.

Number of Study Participants
As shown in Table 1, a total of 12,564 patients participated in these trials. The data from 3449 patients (27%) were not published. Data from an additional 1843 patients (15%) were reported in journal articles in which the highlighted finding conflicted with the FDA-defined primary outcome. Thus, the percentages for the patients closely mirrored those for the studies (Table 1).

Whether a patient’s data were reported in a way that was in concert with the FDA review was associated with the study outcome (Fig. 1B) (risk ratio, 27.1), which was consistent with the above-reported finding with the studies. Figure 2B shows these same data according to the drug being evaluated.

Qualitative Description of Selective Reporting within Trials
The methods reported in 11 journal articles appear to depart from the prespecified methods reflected in the FDA reviews (Table B of the Supplementary Appendix). Although for each of these studies the finding with respect to the protocol-specified primary outcome was nonsignificant, each publication highlighted a positive result as if it were the primary outcome. The nonsignificant results for the prespecified primary outcomes were either subordinated to nonprimary positive results (in two reports) or omitted (in nine). (Study-level methodologic differences are detailed in the footnotes to Table B of the Supplementary Appendix.)

Effect Size
The effect-size values derived from the journal reports were often greater than those derived from the FDA reviews. The difference between these two sets of values was significant whether the studies (P = 0.003) or the drugs (P = 0.012) were used as the units of analysis (see Table D in the Supplementary Appendix).

The effect sizes of the published and unpublished studies reviewed by the FDA are compared in Figure 3A. The overall mean weighted effect-size value was 0.37 (95% CI, 0.33 to 0.41) for published studies and 0.15 (95% CI, 0.08 to 0.22) for unpublished studies. The difference was significant whether the studies (P<0.001) or the drugs (P = 0.005) were used as the units of analysis (Table D in the Supplementary Appendix).

The mean effect-size values for all FDA studies, both published and unpublished, are compared with those for all published studies, as shown in Figure 3B. Again, the differences were significant whether the studies (P<0.001) or the drugs (P = 0.002) were used as units of analysis (Table D in the Supplementary Appendix).

For each of the 12 drugs, the effect size derived from the journal articles exceeded the effect size derived from the FDA reviews (sign test, P<0.001) (Fig. 3B). The magnitude of the increases in effect size between the FDA reviews and the published reports ranged from 11 to 69%, with a median


increase of 32%. A 32% increase was also observed in the weighted mean effect size for all drugs combined, from 0.31 (95% CI, 0.27 to 0.35) to 0.41 (95% CI, 0.36 to 0.45).

A list of the study-level effect-size values used in the above analyses (derived from both the FDA reviews and the published reports) is provided in Table C of the Supplementary Appendix. These effect-size values are based on P values and sample sizes shown in Table A of the Supplementary Appendix, which also lists reference information for the publications consulted.

Figure 1. Effect of FDA Regulatory Decisions on Publication.
Among the 74 studies reviewed by the FDA (Panel A), 38 were deemed to have positive results, 37 of which were published with positive results; the remaining study was not published. Among the studies deemed to have questionable or negative results by the FDA, there was a tendency toward nonpublication or publication with positive results, conflicting with the conclusion of the FDA. Among the 12,564 patients in all 74 studies (Panel B), data for patients who participated in studies deemed positive by the FDA were very likely to be published in a way that agreed with the FDA. In contrast, data for patients participating in studies deemed questionable or negative by the FDA tended either not to be published or to be published in a way that conflicted with the FDA’s judgment.

Panel A, Studies (N=74)
FDA Decision           Published, agrees with FDA   Published, conflicts with FDA   Not published
Positive (N=38)        37 (97%)                     0                               1 (3%)
Questionable (N=12)    0                            6 (50%)                         6 (50%)
Negative (N=24)        3 (12%)                      5 (21%)                         16 (67%)

Panel B, Patients in Studies (N=12,564)
FDA Decision           Published, agrees with FDA   Published, conflicts with FDA   Not published
Positive (N=7155)      7075 (99%)                   0                               80 (1%)
Questionable (N=2309)  0                            1180 (51%)                      1129 (49%)
Negative (N=3100)      197 (6%)                     663 (21%)                       2240 (72%)

Figure 2. Publication Status and FDA Regulatory Decision by Study and by Drug.
Panel A shows the publication status of individual studies. Nearly every study deemed positive by the FDA (top row) was published in a way that agreed with the FDA’s judgment. By contrast, most studies deemed negative (bottom row) or questionable (middle row) by the FDA either were published in a way that conflicted with the FDA’s judgment or were not published. Numbers shown in boxes indicate individual studies and correspond to the study numbers listed in Table A of the Supplementary Appendix. Panel B shows the numbers of patients participating in the individual studies indicated in Panel A. Data for patients who participated in studies deemed positive by the FDA were very likely to be published in a way that agreed with the FDA’s judgment. By contrast, data for patients who participated in studies deemed negative or questionable by the FDA tended either not to be published or to be published in a way that conflicted with the FDA’s judgment.

[Figure 2 image not reproduced. Panel A: Studies; Panel B: Patients in Studies; rows: FDA decision (positive, questionable, negative); columns: the 12 drugs; legend: published, agrees with FDA; published, conflicts with FDA; not published.]

Figure 3. Mean Weighted Effect Size According to Drug, Publication Status, and Data Source.
Values for effect size are expressed as Hedges’s g (the difference between two means divided by their pooled standard deviation). Effect-size values of 0.2 and 0.5 are considered to be small and medium, respectively.[29] Effect-size values for unpublished studies and published studies, as extracted from data in FDA reviews, are shown in Panel A. Horizontal lines indicate 95% confidence intervals. There were no unpublished studies for controlled-release paroxetine or fluoxetine. For each of the other antidepressants, the effect size for the published subgroup of studies was greater than the effect size for the unpublished subgroup of studies. Overall effect-size values (i.e., based on data from the FDA for published and unpublished studies combined), as compared with effect-size values based on data from corresponding published reports, are shown in Panel B. For each drug, the effect-size value based on published literature was higher than the effect-size value based on FDA data, with increases ranging from 11 to 69%. For the entire drug class, effect sizes increased by 32%.

                                                 Panel A: FDA-Based Effect Size   Panel B: Overall Effect Size
Drug (brand, sponsor)                            Unpublished g   Published g      FDA g   Journals g   Change in g
Bupropion SR (Wellbutrin SR, GlaxoSmithKline)    0.14            0.27             0.17    0.27         +55%
Citalopram (Celexa, Forest)                      0.01            0.31             0.24    0.30         +25%
Duloxetine (Cymbalta, Eli Lilly)                 0.15            0.34             0.30    0.40         +33%
Escitalopram (Lexapro, Forest)                   0.15            0.35             0.31    0.36         +16%
Fluoxetine (Prozac, Eli Lilly)                   none            0.26             0.26    0.29         +14%
Mirtazapine (Remeron, Organon)                   0.19            0.45             0.35    0.57         +61%
Nefazodone (Serzone, Bristol-Myers Squibb)       0.09            0.33             0.26    0.44         +69%
Paroxetine (Paxil, GlaxoSmithKline)              0.20            0.55             0.42    0.59         +40%
Paroxetine CR (Paxil CR, GlaxoSmithKline)        none            0.32             0.32    0.36         +11%
Sertraline (Zoloft, Pfizer)                      0.18            0.30             0.26    0.42         +64%
Venlafaxine (Effexor, Wyeth)                     0.11            0.45             0.40    0.51         +28%
Venlafaxine XR (Effexor XR, Wyeth)               0.19            0.52             0.40    0.51         +27%
Overall mean weighted effect size                0.15            0.37             0.31    0.41         +32%

Discussion

We found a bias toward the publication of positive results. Not only were positive results more likely to be published, but studies that were not positive, in our opinion, were often published in a way that conveyed a positive outcome. We analyzed these data in terms of the proportion of positive studies and in terms of the effect size associated with drug treatment. Using both approaches, we found that the efficacy of this drug class is less than would be gleaned from an examination of the published literature alone. According to the published literature, the results of nearly all of the trials of antidepressants were positive. In contrast, FDA analysis of the trial data showed that roughly half of the trials had positive results. The statistical significance of a study’s results was strongly associated with whether and how they were reported, and the association was independent of sample size. The study outcome also affected the chances that the data from a participant would be published. As a result of selective reporting, the published literature conveyed an effect size nearly one third larger than the effect size derived from the FDA data.

Previous studies have examined the risk–benefit ratio for drugs after combining data from regulatory authorities with data published in journals.[3,30-32] We built on this approach by comparing study-level data from the FDA with matched data from journal articles. This comparative approach allowed us to quantify the effect of selective publication on apparent drug efficacy.

Our findings have several limitations: they are


restricted to antidepressants, to industry-spon- the rate of false positive findings (type I error),
sored trials registered with the FDA, and to issues and it prevents HARKing,36 or hypothesizing af-
of efficacy (as opposed to “real-world” effective- ter the results are known.
ness33). This study did not account for other factors It might be argued that some trials did not
that may distort the apparent risk–benefit ratio, merit publication because of methodologic flaws,
such as selective publication of safety issues, as including problems beyond the control of the in-
has been reported with rofecoxib (Vioxx, Merck)34 vestigator. However, since the protocols were writ-
and with the use of selective serotonin-reuptake ten according to international guidelines for ef-
inhibitors for depression in children.3 Because ficacy studies37 and were carried out by companies
we excluded articles covering multiple studies, we with ample financial and human resources, to be
probably counted some studies as unpublished that fair to the people who put themselves at risk to
were — technically — published. The prac­tice of participate, a cogent public reason should be given
bundling negative and positive studies in a single for failure to publish.
article has been found to be associated with du- Selective reporting deprives researchers of the
plicate or multiple publication,35 which may also influence the apparent risk–benefit ratio.

There can be many reasons why the results of a study are not published, and we do not know the reasons for nonpublication. Thus, we cannot determine whether the bias observed resulted from a failure to submit manuscripts on the part of authors and sponsors, decisions by journal editors and reviewers not to publish submitted manuscripts, or both.

We wish to clarify that nonsignificance in a single trial does not necessarily indicate lack of efficacy. Each drug, when subjected to meta-analysis, was shown to be superior to placebo. On the other hand, the true magnitude of each drug’s superiority to placebo was less than a diligent literature review would indicate. We do not mean to imply that the primary methods agreed on between sponsors and the FDA are necessarily preferable to alternative methods. Nevertheless, when multiple analyses are conducted, the principle of prespecification controls

accurate data they need to estimate effect size realistically. Inflated effect sizes lead to underestimates of the sample size required to achieve statistical significance. Underpowered studies, and selectively reported studies in general, waste resources and the contributions of investigators and study participants, and they hinder the advancement of medical knowledge. By altering the apparent risk–benefit ratio of drugs, selective publication can lead doctors to make inappropriate prescribing decisions that may not be in the best interest of their patients and, thus, the public health.

Dr. Turner reports having served as a medical reviewer for the Food and Drug Administration. No other potential conflict of interest relevant to this article was reported.

We thank Emily Kizer, Marcus Griffith, and Tammy Lewis for clerical assistance; David Wilson, Alex Sutton, Ohidul Siddiqui, and Benjamin Chan for statistical consultation; Linda Ganzini, Thomas B. Barrett, and Daniel Hilfet-Hilliker for their comments on an earlier version of this manuscript; Arifula Khan, Kelly Schwartz, and David Antonuccio for providing access to FDA reviews; Thomas B. Barrett, Norwan Moaleji and Samantha Ruimy for double data extraction and entry; and Andrew Hamilton for literature database searches.
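The discussion's point that inflated effect sizes lead to underestimates of the required sample size can be illustrated with a minimal sketch. It is not part of the original analysis. It assumes a two-arm trial with equal allocation, a two-sided test at alpha = 0.05, 80% power, and the standard normal-approximation formula n = 2(z_{1-alpha/2} + z_{power})^2 / d^2 patients per arm, where d is the standardized mean difference. The function name `per_arm_sample_size` and the two effect-size values are hypothetical, chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist


def per_arm_sample_size(d, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-arm comparison of means.

    Hypothetical helper using the normal approximation
    n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    where d is the standardized mean difference.
    """
    if d <= 0:
        raise ValueError("effect size d must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Hypothetical inputs (assumptions, not values taken from this section):
true_d = 0.30      # effect size a complete evidence base might support
inflated_d = 0.40  # larger effect size suggested by selective publication

n_needed = per_arm_sample_size(true_d)       # sample size the trial needs
n_planned = per_arm_sample_size(inflated_d)  # sample size actually planned
print(n_needed, n_planned)
```

Under these assumed inputs, planning on the inflated value calls for fewer patients per arm than the smaller effect would require, which is how a trial designed from a biased literature ends up underpowered.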

References
1. Hagdrup N, Falshaw M, Gray RW, Carter Y. All members of primary care team are aware of importance of evidence based medicine. BMJ 1998;317:282.
2. Craig JC, Irwig LM, Stockler MR. Evidence-based medicine: useful tools for decision making. Med J Aust 2001;174:248-53.
3. Whittington CJ, Kendall T, Fonagy P, Cottrell D, Cotgrove A, Boddington E. Selective serotonin reuptake inhibitors in childhood depression: systematic review of published versus unpublished data. Lancet 2004;363:1341-5.
4. Kyzas PA, Loizou KT, Ioannidis JP. Selective reporting biases in cancer prognostic factor studies. J Natl Cancer Inst 2005;97:1043-55.
5. Dickersin K, Chan S, Chalmers TC, Sacks HS, Smith H Jr. Publication bias and clinical trials. Control Clin Trials 1987;8:343-53.
6. Simes RJ. Confronting publication bias: a cohort design for meta-analysis. Stat Med 1987;6:11-29.
7. Stern JM, Simes RJ. Publication bias: evidence of delayed publication in a cohort study of clinical research projects. BMJ 1997;315:640-5.
8. Chan AW, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA 2004;291:2457-65.
9. Ioannidis JP. Effect of the statistical significance of results on the time to completion and publication of randomized efficacy trials. JAMA 1998;279:281-6.
10. Chan AW, Krleza-Jerić K, Schmid I, Altman DG. Outcome reporting bias in randomized trials funded by the Canadian Institutes of Health Research. CMAJ 2004;171:735-40.
11. Chan AW, Altman DG. Identifying outcome reporting bias in randomised trials on PubMed: review of publications and survey of authors. BMJ 2005;330:753.
12. Lau J, Ioannidis JP, Terrin N, Schmid CH, Olkin I. The case of the misleading funnel plot. BMJ 2006;333:597-600.
13. Hayashino Y, Noguchi Y, Fukui T. Systematic evaluation and comparison of statistical tests for publication bias. J Epidemiol 2005;15:235-43.
14. Pham B, Platt R, McAuley L, Klassen TP, Moher D. Is there a “best” way to detect and minimize publication bias? An empirical evaluation. Eval Health Prof 2001;24:109-25.

15. Sterne JA, Gavaghan D, Egger M. Publication and related bias in meta-analysis: power of statistical tests and prevalence in the literature. J Clin Epidemiol 2000;53:1119-29.
16. Ioannidis JP, Trikalinos TA. The appropriateness of asymmetry tests for publication bias in meta-analyses: a large survey. CMAJ 2007;176:1091-6.
17. Turner EH. A taxpayer-funded clinical trials registry and results database. PLoS Med 2004;1(3):e60.
18. Center for Drug Evaluation and Research. Manual of policies and procedures: clinical review template. Rockville, MD: Food and Drug Administration, 2004. (Accessed December 20, 2007, at http://www.fda.gov/cder/mapp/6010.3.pdf.)
19. Committee on Government Reform, U.S. House of Representatives, 109th Congress, 1st Session. A citizen’s guide on using the Freedom of Information Act and the Privacy Act of 1974 to request government records. Report no. 109-226. Washington, DC: Government Printing Office, 2005. (Also available at: http://www.fas.org/sgp/foia/citizen.pdf.)
20. Center for Drug Evaluation and Research. Drugs@FDA: FDA approved drug products. Rockville, MD: Food and Drug Administration. (Accessed December 20, 2007, at http://www.accessdata.fda.gov/scripts/cder/drugsatfda.)
21. International Conference on Harmonisation (ICH), European Medicines Agency (EMEA). Topic E9: statistical principles for clinical trials. Rockville, MD: Food and Drug Administration. (Accessed December 20, 2007, at http://www.fda.gov/cder/guidance/iche3.pdf.)
22. Temple R, Ellenberg SS. Placebo-controlled trials and active-control trials in the evaluation of new treatments. Part 1: ethical and scientific issues. Ann Intern Med 2000;133:455-63.
23. Hedges LV. Estimation of effect size from a series of independent experiments. Psychol Bull 1982;92:490-9.
24. Rosenthal R. Meta-analytic procedures for social research. Newbury Park, CA: Sage, 1991.
25. Whitley E, Ball J. Statistics review 5: comparison of means. Crit Care 2002;6:424-8.
26. Hedges LV, Olkin I. Statistical methods for meta-analysis. New York: Academic Press, 1985.
27. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials 1986;7:177-88.
28. Stata statistical software, release 9. College Station, TX: StataCorp, 2005.
29. Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. New York: Lawrence Erlbaum Associates, 1988.
30. Nissen SE, Wolski K. Effect of rosiglitazone on the risk of myocardial infarction and death from cardiovascular causes. N Engl J Med 2007;356:2457-71. [Erratum, N Engl J Med 2007;357:100.]
31. Nissen SE, Wolski K, Topol EJ. Effect of muraglitazar on death and major adverse cardiovascular events in patients with type 2 diabetes mellitus. JAMA 2005;294:2581-6.
32. Sackner-Bernstein JD, Kowalski M, Fox M, Aaronson K. Short-term risk of death after treatment with nesiritide for decompensated heart failure: a pooled analysis of randomized controlled trials. JAMA 2005;293:1900-5.
33. Revicki DA, Frank L. Pharmacoeconomic evaluation in the real world: effectiveness versus efficacy studies. Pharmacoeconomics 1999;15:423-34.
34. Topol EJ. Failing the public health--rofecoxib, Merck, and the FDA. N Engl J Med 2004;351:1707-9.
35. Melander H, Ahlqvist-Rastad J, Meijer G, Beermann B. Evidence b(i)ased medicine--selective reporting from studies sponsored by pharmaceutical industry: review of studies in new drug applications. BMJ 2003;326:1171-3.
36. Kerr NL. HARKing: hypothesizing after the results are known. Pers Soc Psychol Rev 1998;2:196-217.
37. International Conference on Harmonisation--Efficacy. Rockville, MD: Food and Drug Administration. (Accessed December 20, 2007, at http://www.fda.gov/cder/guidance/#ICH_efficacy.)

Copyright © 2008 Massachusetts Medical Society.
