Clinical Trials
Robert M. Califf
III | Large-scale comparative trial of new therapy versus standard of practice | Definitive evaluation of new therapy to determine if it should replace current standard of practice; randomized controlled trials required by regulatory agencies for registration of new therapeutic modalities
Source: Adapted from Antman, E.M. and Califf, R.M. (1996) Clinical trials and meta-analysis. In: Smith, T.W. (ed.),
Cardiovascular Therapeutics, p. 679. Philadelphia, Saunders
phases, attention to detail is critical and should take priority over simplicity (although gathering detail for no specific purpose is a waste of resources, regardless of the phase of the trial).

The third phase, commonly referred to as the pivotal phase, evaluates the therapy in the relevant clinical context with the goal of determining whether the treatment should be used in clinical practice. For phase III studies, relevant endpoints include measures that can be recognized by patients as important: survival, major clinical events, quality of life and cost. A well-designed clinical trial that informs the decisions that must be made by patients and healthcare providers justifies serious consideration for changing clinical practice, and certainly provides grounds for regulatory approval for sales and marketing.

After a therapy or diagnostic test is approved by regulatory authorities and is in use, phase IV begins. Traditionally, phase IV has been viewed as including a variety of studies that monitor a therapy in clinical practice with the accompanying responsibility of developing more effective protocols for its use, based on observational inference and reported adverse events. Phase IV is also used to develop new indications for drugs and devices already approved for a different use (see Chapter 36). The importance of this phase has grown with the recognition that many circumstances that arise in clinical practice will not have been encountered in randomized trials completed at the time the therapy receives regulatory approval. Phase IV studies may now include evaluation of new dosing regimens (Rogers et al., 1994; Forrow et al., 1992; Society of Thoracic Surgeons Database, 2005) and comparisons of one effective marketed therapy against another, giving birth to a discipline of comparative effectiveness (Tunis et al., 2003). In some cases, this need arises because of changing doses or expanding indications for a therapy; in other cases, a phase III study might not have provided the relevant comparisons for a particular therapeutic context; information that is only obtainable in the period after the therapy is approved for marketing.

CRITICAL GENERAL CONCEPTS

Purposes of clinical trials
Clinical trials may be divided into two broad categories: explanatory/scientific or probabilistic/pragmatic. The simplest but most essential concepts for understanding the relevance of a clinical study to clinical practice are validity and generalizability. Table 2.2 illustrates an approach to these issues, developed by the McMaster group, to be used when reading the literature.

Validity
The most fundamental question to ask of a clinical trial is whether the result is valid. Are the results of the trial internally consistent? Would the same result be obtained if the trial were repeated in an identical population? Was the trial design adequate; i.e., did it include blinding, endpoint assessment, and statistical analyses? Of course, the most compelling evidence of validity in science is replication. If the results of a trial or study remain the same when the study is repeated, they are likely to be valid.

Generalizability
Given valid results from a clinical trial, it is equally important to determine whether the findings are generalizable.
Chapter | 2 Clinical Trials
Primary guides
Was the assignment of patients to treatment randomized?
Were all patients who entered the study properly accounted for at its conclusion?
Was follow-up complete?
Were patients analyzed in the groups to which they were randomized?

Secondary guides
Were patients, their clinicians, and study personnel blinded to treatment?
Were the groups similar at the start of the trial?
Aside from the experimental intervention, were the groups treated equally?

What were the results?
How large was the treatment effect?
How precise was the treatment effect (confidence intervals)?

Will the results help me in caring for my patients?
Does my patient fulfill the enrollment criteria for the trial? If not, how close is the patient to the enrollment criteria?
Does my patient fit the features of a subgroup in the trial report? If so, are the results of the subgroup analysis in the trial valid?
Were all the clinically important outcomes considered?
Are the likely treatment benefits worth the potential harm and costs?

FIGURE 2.1 Grid for comparing validity and generalizability in clinical trial design

clinical practice, especially with regard to dosing and expected adherence and harms.

Trade-off of validity and generalizability
A simple but useful way to conceptualize trial designs is in terms of a grid comparing the two constructs for a given trial (Fig. 2.1). In order to provide a clear answer to a conceptual question about disease mechanisms, it is often useful to limit the trial to a very narrow group of subjects in a highly controlled environment, yielding a trial that has high validity, but low generalizability. On the other hand, to test major public health interventions, it may be necessary to open up entry criteria to most patients with a general diagnosis and to place no restrictions on ancillary therapy, yielding a trial that is generalizable, but with open questions about the validity of the results according to issues such as the possibility of interactions between treatments. Of course, a trial that scores low in either characteristic would be practically useless, and the ideal would be to develop increasingly efficient tools that would allow trials to have high scores in both domains.
the results are presented as a relative risk reduction rather than as an absolute difference in outcomes (Bobbio et al., 1994; Naylor et al., 1992). This appears to happen because relative risk reductions result in larger apparent differences, even though they are reporting exactly the same clinical phenomenon. This sobering problem points out a key issue of pragmatic trials: Because they are intended to answer questions that will directly affect patient care, the audience for the results will typically far exceed the local community of experts and often will include healthcare providers with varying levels of expertise, lay people and the press. Planning is critical in order to handle these issues appropriately.

One important metric for reporting the results of pragmatic clinical trials is the number of poor outcomes prevented by the more effective treatment, per 100 or 1000 patients treated. This measure, the number needed to treat (NNT), represents the absolute benefit of therapy and translates results for the specific populations studied into public health terms by quantifying how many patients would need to be treated to create a specific health benefit. The absolute difference can be used to assess quantitative interactions, that is, significant differences in the number of patients needed to treat to achieve a degree of benefit as a function of the type of patient treated. The use of thrombolytic therapy provides an example: The Fibrinolytic Therapy Trialists (FTT) collaboration demonstrated that 37 lives are saved per 1000 patients treated when thrombolytics are used in patients with anterior ST segment elevation, whereas only 8 lives are saved per 1000 patients with inferior ST segment elevation (Fig. 2.2) (FTT Collaborative Group, 1994). The direction of the treatment effect is the same, but the magnitude of the effect is different.

Two other important aspects of the NNT calculation that deserve consideration are the duration of treatment needed to achieve the benefit and the number needed to
harm (NNH). Depending on the circumstances, saving one life per 100 patients treated over five years versus saving one life per 100 patients treated in 1 week could be more or less important. The NNH can be simply calculated, just as the NNT is calculated.

This approach, however, becomes more complex with non-discrete endpoints, such as exercise time, pain, or quality of life. One way to express trial results when the endpoint is a continuous measurement is to define the minimal clinically important difference (the smallest difference that would lead practitioners to change their practices) and to express the results in terms of the NNT to achieve that difference. Another problem with NNT and NNH occurs when the trial on which the calculation is based is not a generalizable trial that enrolled subjects likely to be treated in practice. Indeed, when clinically relevant subjects (e.g., elderly patients or those with renal dysfunction) are excluded, these simple calculations can become misleading, although the issue is usually magnitude of effect rather than direction of effect.

The relative benefit of therapy, on the other hand, is the best measure of the treatment effect in biological terms. This concept is defined as the proportional reduction in risk resulting from the more effective treatment, and it is generally expressed in terms of an odds ratio or relative risk reduction. The relative treatment effect can be used to assess qualitative interactions, which represent statistically significant differences in the direction of the treatment effect as a function of the type of patient treated. In the FTT analysis, the treatment effect in patients without ST segment elevation is heterogeneous compared with that of patients with ST segment elevation (FTT Collaborative Group, 1994). Figure 2.3 displays the calculations for commonly used measures of treatment effect.

A common way to display clinical trial results is the odds ratio plot (Fig. 2.4). Both absolute and relative differences in outcome can be expressed in terms of point estimates and confidence intervals. This type of display gives the reader a balanced perspective, since both the relative and the absolute differences are important, as well as the level of confidence in the estimate. Without confidence intervals, the reader will have difficulty ascertaining the precision of the estimate of the treatment effect. The goals of a pragmatic trial include: (1) the enrollment of a broad array of patients so that the effect of treatment in different types of patients can be assessed, and (2) the enrollment of enough patients with enough events to make the confidence intervals narrow and definitive. Using an odds ratio or risk ratio plot, the investigator can quickly create a visual image that defines the evidence for homogeneity or heterogeneity of the treatment effect as a function of baseline characteristics.

CONCEPTS UNDERLYING TRIAL DESIGN
As experience with multiple clinical trials accumulates, some general concepts deserve emphasis. These generalities may not always apply, but they serve as useful guides to the design or interpretation of trials. Failure to consider these general principles often leads to a faulty design and failure of the project.
[Figure: patients meeting enrollment criteria (N = 10 000) are randomized to treatment A or B]

Treatment    Event    No event    Total
A (E_A)      600      4400        5000
B (E_B)      750      4250        5000
Total        1350     8650        10 000
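The counts in this 2 x 2 layout are enough to work through the measures of treatment effect discussed in the text (absolute difference, NNT, relative risk reduction, odds ratio). The following is a minimal sketch in Python, not taken from the chapter: the function name `treatment_effect` and the dictionary keys are hypothetical, treatment A is assumed to be the more effective arm, and the confidence interval uses the common log odds ratio (Woolf) approximation, which the source does not specify.

```python
import math


def treatment_effect(events_a, total_a, events_b, total_b, z=1.96):
    """Summarize a two-arm trial from its 2x2 table (arm A versus arm B)."""
    risk_a = events_a / total_a
    risk_b = events_b / total_b
    arr = risk_b - risk_a            # absolute risk reduction with A
    rrr = arr / risk_b               # relative risk reduction with A
    nnt = 1 / arr                    # number needed to treat (needs arr != 0)

    no_event_a = total_a - events_a
    no_event_b = total_b - events_b
    odds_ratio = (events_a / no_event_a) / (events_b / no_event_b)

    # Approximate 95% confidence interval on the log odds ratio scale.
    se = math.sqrt(1 / events_a + 1 / no_event_a + 1 / events_b + 1 / no_event_b)
    log_or = math.log(odds_ratio)
    or_ci = (math.exp(log_or - z * se), math.exp(log_or + z * se))

    return {
        "risk_a": risk_a,
        "risk_b": risk_b,
        "arr": arr,
        "prevented_per_1000": 1000 * arr,
        "nnt": nnt,
        "rrr": rrr,
        "odds_ratio": odds_ratio,
        "or_ci": or_ci,
    }


# Counts from the table above: 600/5000 events with A, 750/5000 with B.
result = treatment_effect(600, 5000, 750, 5000)
for name, value in result.items():
    print(name, value)
```

With these counts the event rates are 12% and 15%, so the absolute difference is 3 percentage points (30 events prevented per 1000 patients treated, an NNT of about 33), while the relative risk reduction is 20%. The same data therefore look larger when reported in relative terms, which is the reporting issue the text raises.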
detrimental, could be missed. An interesting recent case was the surprise finding of an overall mortality reduction with zoledronic acid, a bisphosphonate used to prevent second fractures (Lyles et al., 2007). Importantly, the survival curves separated only after 18 months of follow-up, so that the finding was entirely missed by previous shorter-term trials.

GENERAL DESIGN CONSIDERATIONS
When designing or interpreting the results of a clinical study, the purpose of the investigation is critical to placing the outcome in the appropriate context. Researchers and clinicians who design the investigation are responsible for constructing the project and presenting its results in a manner that reflects the intent of the study. In a small phase II study an improvement in a biomarker linked to a pathophysiological outcome is exciting, but enthusiasm can easily lead the investigator to overstate the clinical import of the finding. Similarly, megatrials with little data collection seldom provide useful information about disease mechanisms unless carefully planned substudies are performed. The structural characteristics of trials can be characterized as a function of the attributes discussed in the following sections.

Pragmatic versus explanatory
Most clinical trials are designed to demonstrate a physiological principle as part of the chain of causality of a particular disease. Such studies, termed explanatory trials, need only be large enough to prove or disprove the hypothesis being tested. Another perspective is that explanatory trials are focused on optimizing validity in order to prove a point. Major problems have arisen because of the tendency of researchers performing explanatory trials to generalize the findings into recommendations about clinical therapeutics.

Studies designed to answer questions about which therapies should be used are called pragmatic trials. These trials should have a clinical outcome as the primary endpoint, so that when the trial is complete, the result will inform the practitioner and the public about whether using the treatment in the manner tested will result in better clinical outcomes than the alternative approaches. These trials generally require much larger sample sizes to arrive at a valid result, as well as a more heterogeneous population in order to be generalizable to populations treated in practice. This obligation to seriously consider generalizability in the design is a key feature of pragmatic trials.

The decision about whether to perform an explanatory or a pragmatic trial will have a significant effect on the design of the study. When the study is published, the reader must also take into account the intent of the investigators, since the implications for practice or knowledge will vary considerably depending on the type of study. The organization, goals and structure of the pragmatic trial may be understood best by comparing the approach that might be used in an explanatory trial with the approach used in a pragmatic trial (Tunis et al., 2003). These same principles are important in designing disease registries.

Entry criteria
In an explanatory trial, the entry criteria should be carefully controlled so that the particular measurement of interest will not be confounded. For example, a trial designed to determine whether a treatment for heart failure improves cardiac output should study patients who are stable enough for elective hemodynamic monitoring. Similarly, in a trial of depression, patients who are likely to return and who can provide the data needed for depression inventories are sought. In contrast, in a pragmatic trial, the general goal is to include patients who represent the population seen in clinical practice and whom the study organizers believe can make a plausible case for a benefit in outcome(s). From this perspective, the number of entry and exclusion criteria should be minimized, as the rate of enrollment will be inversely proportional to the number of criteria. In this broadening of entry criteria, particular effort is made to include patients with severe disease and comorbidities, since they are likely to be encountered in practice. An extreme version of open entry criteria is the uncertainty principle introduced by the Oxford group (Peto et al., 1995). In this scheme, all patients with a given diagnosis would be enrolled in a trial if the treating physician was uncertain as to whether the proposed intervention had a positive, neutral, or negative effect on clinical outcomes.

Thus, an explanatory trial focuses on very specific criteria to elucidate a biological principle, whereas a large pragmatic trial should employ entry criteria that mimic the conditions that would obtain if the treatment were employed in practice.

Data collection instrument
The data collection instrument provides the information on which the results of the trial are built; if an item is not included on the instrument, obviously it will not be available at the end of the trial. On the other hand, the likelihood of collecting accurate information is inversely proportional to the amount of data collected. In an explanatory trial, patient enrollment is generally not the most difficult issue, since small sample sizes are indicated. In a pragmatic trial, however, there is almost always an impetus to enroll patients as quickly as possible. Thus, a fundamental precept of pragmatic trials is that the data collection instrument should be as brief and simple as possible.

The ISIS-1 trial provides an excellent example of this principle: in this study, the data collection instrument consisted of
a single-page FAX form (ISIS-1 Collaborative Group, 1986). This method made possible the accrual of tens of thousands of patients in mortality trials with no reimbursement to the enrolling healthcare providers. Some of the most important findings for the broad use of therapies (beta blockers reduce mortality in acute myocardial infarction; aspirin reduces mortality in acute myocardial infarction; and fibrinolytic therapy is broadly beneficial in acute myocardial infarction) have resulted from this approach. Regardless of the length of the data collection form, it is critical to include only information that will be useful in analyzing the trial outcome, or for which there is an explicitly identified opportunity to acquire new knowledge.

At the other end of the spectrum, there is growing interest in patient-reported outcomes to assess quality of life and response to therapy (Weinfurt, 2003) (see also Chapter 9). While many more data items may be required, increasingly sophisticated electronic means of collecting patient subject surveys are allowing more detailed data collection while reducing burdens on research sites.

Ancillary therapy and practice
Decisions about the use of non-study therapies in a clinical trial are critical to the study's validity and generalizability. Including therapies that will interact in deleterious fashion with the experimental agent could ruin an opportunity to detect a clinically important treatment advance.

Alternatively, the goal of a pragmatic trial is to evaluate a therapy in clinical, real-world context. Since clinical practice is not managed according to prespecified algorithms and many confounding situations can arise, evaluation of the experimental therapy in the setting of such an approach is likely to yield an unrealistic approximation of the likely impact of the therapy in clinical practice. For this reason, unless a specific detrimental interaction is known, pragmatic trials avoid prescribing particular ancillary therapeutic regimens. One exception is the encouragement (but not the requirement) to follow clinical practice guidelines if they exist for the disease under investigation.

Multiple randomization
Until recently, enrolling a patient in multiple simultaneous clinical trials was considered ethically questionable. The origin of this concern is unclear, but seems to have arisen from a general impression that clinical research exposes patients to risks they would not experience in clinical practice, implying greater detriment from more clinical research and thus a violation of the principles of beneficence and justice, if a few subjects assumed such a risk for the benefit of the broader population. In specific instances, there are indeed legitimate concerns that the amount of information required to parse the balance of benefit and risk may be too great for subjects enrolled in multiple studies. Further, in some cases, there may be a known detrimental interaction of some of the required interventions in the trials. Finally, there are pragmatic considerations: in highly experimental situations, multiple enrollment may entail an unacceptable regulatory or administrative burden on the research site.

More recently, however, it has been proposed that when the uncertainty principle described above is present, patients should be randomly assigned, perhaps even to two therapies. Stimulated by the evident need to develop multiple therapies simultaneously in HIV-AIDS treatment, the concept of multiple randomization has been reconsidered. Further, access to clinical trials is increasingly recognized as a benefit rather than a burden, in part because the level of clinical care in research studies tends to be superior to that provided in general practice (Davis et al., 1985; Schmidt et al., 1999; Goss et al., 2006; Vist et al., 2007).

Factorial trial designs provide a specific approach to multiple randomizations, one that possesses advantages from both statistical and clinical perspectives. Because most patients are now treated with multiple therapies, the factorial design represents a clear means of determining whether therapies add to each other, work synergistically, or nullify the effects of one or both therapies being tested. As long as a significant interaction does not exist between the two therapies, both can be tested in a factorial design with a sample size similar to that needed for a single therapy. An increasingly common approach is to add a simple component to a trial to test a commonly used treatment, such as vitamin supplements. These trials have been critical in demonstrating the futility of vitamin supplements in many applications.

Adaptive trial designs
There is no effective way to develop therapies other than measuring intermediate physiologic endpoints in early phases and then making ongoing estimates of the value of continuing with the expensive effort of human-subjects research. However, other ways of winnowing the possible doses or intensities of therapy must be developed after initial physiological evaluation, since these physiological endpoints are unreliable predictors of ultimate clinical effects.

One such method is the "pick the winner" approach. In this design (Fig. 2.5), several doses or intensities of treatment are devised, and at regular intervals during the trial an independent data and safety monitoring committee evaluates clinical outcomes with the goal of dropping arms of the study according to prespecified criteria.

Another form of adaptive design is the use of adaptive randomization, in which treatment allocation varies as a function of accruing information in the trial (Berry and Eick, 1995). In this design, if a particular arm of a trial is coming out ahead or behind, the odds of a patient being randomized to that arm can be altered.
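One way to make the idea of adaptive randomization concrete is a small simulation-based allocation rule. The sketch below is an illustration under stated assumptions, not the method of Berry and Eick or of any particular trial: it assumes a binary outcome, an independent Beta(1, 1) prior for each arm, and sets each arm's allocation probability to the estimated posterior probability that the arm has the best success rate (a Thompson-sampling-style rule). The function names and the interim counts are hypothetical.

```python
import random


def allocation_probabilities(successes, failures, draws=10_000, seed=0):
    """Estimate the posterior probability that each arm has the best success rate.

    Assumes a binary outcome and an independent Beta(1, 1) prior per arm
    (illustrative assumptions, not taken from the chapter).
    """
    rng = random.Random(seed)
    arms = list(successes)
    wins = dict.fromkeys(arms, 0)
    for _ in range(draws):
        # Draw one plausible success rate per arm from its Beta posterior.
        sampled = {
            arm: rng.betavariate(1 + successes[arm], 1 + failures[arm])
            for arm in arms
        }
        wins[max(sampled, key=sampled.get)] += 1
    return {arm: wins[arm] / draws for arm in arms}


def assign_next_patient(probabilities, rng):
    """Randomize the next patient using the current allocation probabilities."""
    arms = list(probabilities)
    weights = [probabilities[arm] for arm in arms]
    return rng.choices(arms, weights=weights, k=1)[0]


# Hypothetical interim data: successes and failures observed so far per arm.
successes = {"A": 30, "B": 15}
failures = {"A": 20, "B": 35}
probs = allocation_probabilities(successes, failures)
next_arm = assign_next_patient(probs, random.Random(1))
print(probs, next_arm)
```

Because the arm that is currently ahead receives a larger share of new patients, the odds of assignment shift as data accrue, which is the behavior described above. How often the probabilities are updated, and any limits on how far they may move, would be matters for the trial protocol.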
PART | I Fundamental Principles
[Figure residue: trial schematic; recoverable labels are Placebo, Heparin, Phase B and N = 7000]
An additional, albeit complex, form of adaptation is adjustment of the trial endpoint as new external information accrues. Particularly as we conduct longer-term trials to compare the capability of therapeutic strategies or diagnostic tests to lower rates of serious events, the ability to measure outcomes with greater sensitivity and precision will change the playing field of endpoint measurement in clinical trials. A recent example is the new definition of myocardial infarction adopted by professional societies in North America and Europe in a joint statement, which is based on the recognition that troponin measurements with improved operating characteristics are now routine on a global basis (Thygesen et al., 2007).

LEGAL AND ETHICAL ISSUES

Medical justification
All proposed treatments in a clinical trial must be within the realm of currently acceptable medical practice for the patient's specific medical condition. Difficulties with such medical justification typically arise in two areas: (1) studies are generally performed because there is reason to believe that one therapeutic approach is better than another, and (2) many currently accepted therapies have never been subjected to the type of scrutiny that is now being applied to new treatments. These factors create a dilemma for the practitioner, who may be uncomfortable with protocols that require a change in standard practice. The subject, of course, is given the opportunity to review the situation and make a decision, but for most patients, the physician's recommendation will be a critical factor in deciding whether to participate in a study. There is no escaping the basic fact that it remains a matter of judgment as to whether a potential new therapy can be compared with a placebo or an active comparator, or whether the case is not adequate for such a trial, based on previous data.

Groups of patients versus individuals
The ethical balance typically depends on the good of larger numbers of patients versus the good of individuals involved in the trial. Examples are accumulating in which a therapy appeared to be better than its comparator based on preliminary results or small studies, but was subsequently shown to be inferior based on adequately sized studies (Lo et al., 1988). These experiences have led some authorities to argue that clinical practice should not change until a highly statistically significant difference in outcome is demonstrated (Peto et al., 1995). Indeed, the standard for acceptance of a drug for labeling by the Cardiorenal Group at the US Food and Drug Administration (FDA) is two adequate and well-controlled trials, each independently reaching statistical significance. If the alpha for each trial is set at 0.05, an alpha of 0.0025 (0.05 × 0.05) would be needed for both to be positive.

The counterargument is that the physician advising the individual patient should let that patient know which treatment is most likely to lead to the best outcome. In fact, Bayesian calculations could be used to provide running estimates of the likelihood that one treatment is better. In the typical general construct of large pragmatic trials, however, this approach is not taken: Applying the ethical principles enumerated previously, an effort is made to accrue enough negative outcomes in a trial that a definitive result is achieved with a high degree of statistical significance and narrow confidence intervals.

An area of increasing confusion lies in the distinction between clinical investigation and measures taken to improve
the quality of care as an administrative matter. The argument has been made that the former requires individual patient informed consent, while the latter falls under the purview of the process of medical care and does not require individual consent. This issue has recently led to another major confrontation between the US Office of Human Research Protection (OHRP) and major Academic Health Centers (AHCs) when consent was waived by an IRB, but OHRP retrospectively ruled that waiving consent was not the correct decision (Pronovost et al., 2006; Miller and Emanuel, 2008).

Several special situations must be considered in studies conducted in the setting of emergency medical treatment, which often does not permit sufficient time for explaining the research project in exacting detail and for obtaining informed consent. In treating acute stroke or myocardial infarction, the time to administration of therapy is a critical determinant of outcome, and time spent considering participation in a protocol could increase the risk of death. Accordingly, the use of an abbreviated consent to participate, followed by a more detailed explanation later during the hospitalization, has been sanctioned. Collins, Doll and Peto have made a compelling case that the slow, cumbersome informed consent form used in the United States in ISIS-2 actually resulted in the unnecessary deaths of a large number of patients with acute myocardial infarction (Collins et al., 1992).

An even more complex situation occurs in research concerning treatment of cardiac or respiratory arrest. Clinical investigation in this field almost came to a halt because of the impossibility of obtaining informed consent. After considerable national debate, such research is now being done only after careful consideration by the community of providers and citizens about the potential merits of the proposed research. A situation at least as complex exists for patients with psychiatric disturbances, and considerable discussion continues about the appropriate circumstances in which to obtain consent and to continue the patient in the trial as his or her clinical state changes.

Blinding
Blinding (or masking) is essential in most explanatory trials, since the opportunity for bias is substantial. In most pragmatic trials, blinding is also greatly preferred in order to reduce bias in the assessment of outcome. Single blinding refers to blinding of the patient, but not the investigator, to the therapy being given. Double blinding refers to blinding of both the patient and the investigator, and triple blinding refers to a double-blinded study in which the committee monitoring the trial is also blinded to which group is receiving which treatment. Despite the relative rarity of deceit in clinical research, examples of incorrect results due to bias in trials without blinding (Karlowski et al., 1975) and with single-blind studies reinforce the value of blinding (Henkin et al., 1976).

However, when blinding would prevent a true test of a treatment strategy, such as in surgical or behavioral interventions, other methods must be used to ensure objectivity. The clearest example is a trial of surgical versus medical therapy; in this situation, the patient and the primary physician cannot remain blinded. (Interestingly, in some circumstances, sham surgical incisions have been used successfully to ensure that high-cost, high-risk surgical procedures were being evaluated with maximum objectivity.) A similar situation exists when the administration of one therapy is markedly different than the other. In some cases, a double-dummy technique (in which the comparative therapies each have a placebo) can be used, but often this approach leads to excessive complexity and renders the proposed trial infeasible.

Given the large number of effective therapies, an increasing problem will be the lack of availability of placebo. Manufacturing a placebo that cannot be distinguished from the active therapy and that cannot affect the outcome of interest is a complex and expensive effort. Often, when a new therapy is compared with an old therapy, or two available therapies are compared, one of the commercial parties refuses to cooperate, since the manufacturer of the established therapy has nothing to gain by participating in a comparative trial with a new therapy. Since a placebo needs to mimic the active therapy sufficiently well that the blind cannot be broken, the successful performance of a placebo-controlled trial depends on the participation of the manufacturers of both therapies.

Endpoint adjudication
The accurate and unbiased measurement of study endpoints is the foundation of a successful trial's design, although many difficult issues may arise. Methods of endpoint ascertainment include blinded observers at the research sites and clinical events adjudication committees that can review objective data in a blinded manner independently of the site judgment.

Since most important endpoints (other than death) require a judgment, unbiased assessment of endpoints is essential, especially when blinding is not feasible. This point has been made vividly in trials of cardiovascular devices. In the initial Coronary Angioplasty versus Excisional Atherectomy Trial (CAVEAT), comparing directional coronary atherectomy with balloon angioplasty, the majority of myocardial infarctions were not noted on the case report form, despite electrocardiographic and enzymatic evidence of these events (Harrington et al., 1995). Even in a blinded trial, recording of endpoints such as myocardial infarction, recurrent ischemia and new or recurrent heart failure is subjective enough that independent judgment is thought to be helpful in most cases (Mahaffey et al., 1997). Increasingly, central imaging laboratories are adding an element of objectivity to the assessment of images as clinical trial endpoints (Arias-Mendoza et al., 2004; Cranney et al., 1997).
PART | I Fundamental Principles
[Table. Columns: Disease and intervention; End points (Surrogate, Clinical); Reason for failure (A, B, C, D; footnote a). Rows: Cardiologic disorder (Arrhythmia; Elevated blood pressure); Cancer (Prevention; Advanced disease); Other diseases (Osteoporosis; Chronic granulomatous disease).]
Salary and other payments for services from the institution
Source: Adapted from: Appendix C: Definition of financial interests in research. In: Protecting patients, preserving integrity, advancing health: Accelerating the implementation of COI policies in human subjects research. A report of the Association of American Medical Colleges–Association of American Universities Advisory Committee on Financial Conflicts of Interest in Human Subjects Research. February 2008. Available at: https://services.aamc.org/Publications/ (accessed March 24, 2008)
HYPOTHESIS FORMULATION
Primary hypothesis
Every clinical study should have a primary hypothesis. The goal of the study design is to develop a hypothesis that allows the most important question from the viewpoint of the investigators to be answered without ambiguity. This
Intention to treat
One of the most important concepts in the interpretation of clinical trial results is that of intention to treat (ITT). Excluding patients who were randomized into a trial leads to bias that cannot be quantified; therefore, the results of the trial cannot be interpreted with confidence.
The purpose of randomization is to ensure the random distribution of any factors, known or unknown, that might affect the outcomes of the subjects allocated to one treatment or another. Any post-randomization deletion of patients weakens the assurance that the randomized groups are at equal risk before beginning treatment. Nevertheless, there are several common situations in which it may be reasonable to drop patients from an analysis.
In blinded trials, when patients are randomized but do not receive the treatment, it is reasonable to create a study
Chapter | 2 Clinical Trials
plan that would exclude these patients from the primary analysis. The plan can call for substitution of additional patients to fulfill the planned sample size. When this happens, extensive analyses must be done to ensure that there was no bias in determining which subjects were not treated. In unblinded trials, dropping patients who do not receive the treatment is particularly treacherous and should not be allowed. Similarly, withdrawing patients from analysis after treatment has started cannot be permitted in trials designed to determine whether a therapy should be used in practice, since the opportunity to drop out without being counted does not exist when a therapy is given in practice.
PUBLICATION BIAS
Clinical trials with negative findings are much less likely to be published than those with positive results. Approximately 85% of studies published in medical journals report positive results (Dickersin and Min, 1993). In a sobering analysis, Simes (1987) found that a review of published literature showed combination chemotherapy for advanced ovarian cancer to be beneficial, whereas a review of published and unpublished trials together showed that the therapy had no significant effect. Dickersin and colleagues (Dickersin et al., 1987) found substantial evidence of negative reporting bias in a review of clinical trial protocols submitted to Oxford University and Johns Hopkins University. In particular, industry-sponsored research with negative results was unlikely to be published.
Twenty years after these studies, we now require studies of human subjects in the United States to be posted in clinical trials registries. A registry of all clinical trials, publicly or privately funded, is needed so that all evidence generated from human clinical trials will be available to the public. This issue of a comprehensive clinical trials registry has been a topic of great public interest (DeAngelis et al., 2004). The National Library of Medicine (Zarin et al., 2005) is a critical repository for this registry (www.clinicaltrials.gov), and registration with this repository will in time presumably be required for all clinical trials, regardless of funding sources.
STATISTICAL CONSIDERATIONS
Type I error and multiple comparisons
Hypothesis testing in a clinical study may be thought of as setting up a "straw man" that the effects of the two treatments being compared are identical. The goal of statistical testing is to determine whether this "straw man" hypothesis should be accepted or rejected based on probabilities. The type I error (alpha) is the probability of rejecting the null hypothesis when it is correct. Since clinicians have been trained in a simple, dichotomous mode of thinking (as if the p value was the only measure of probability), the type I error is generally designated at an alpha level of 0.05. However, if the same question is asked repeatedly, or if multiple subgroups within a trial are evaluated, the likelihood of finding a nominal p value of less than 0.05 increases substantially (Lee et al., 1980). When evaluating the meaning of a p value, clinicians should be aware of the number of tests of significance performed and the importance placed on the p value by the investigator as a function of multiple comparisons (see also Chapters 3 and 4).
Type II error and sample size
The type II error (beta) is the probability of inappropriately accepting the null hypothesis (no difference in treatment effect) when a true difference in outcome exists. The power of a study (1 − beta) is the probability of rejecting the null hypothesis appropriately. This probability is critically dependent on (1) the difference in outcomes observed between treatments, and (2) the number of endpoint observations. A common error in thinking about statistical power is to assume that the number of patients determines the power; rather, it is the number of outcomes.
The precision with which the primary endpoint can be measured also affects the power of the study; endpoints that can be measured precisely require fewer patients. An example is the use of sestamibi-estimated myocardial infarct size. Measuring the area at risk before reperfusion and then measuring final infarct size can dramatically reduce the variance of the endpoint measure by providing an estimate of salvage rather than simply of infarct size (Gibbons et al., 1994). As is often the case, however, the more precise measure is more difficult to obtain, leading to great difficulty in finding sites that can perform the study; in many cases, the time required to complete the study is as important as the number of patients needed. This same argument is one of the primary motivators in the detailed quality control measures typically employed when instruments are developed and administered in trials of behavioral therapy or psychiatry. For studies using physiological endpoints, using a continuous measure generally will increase the power to detect a difference.
A review of the New England Journal of Medicine in 1978 determined that 67 of 71 negative studies had made a significant (more than 10% chance of missing a 25% treatment effect) type II error, and that 50 of the 71 trials had more than a 10% chance of missing a 50% treatment effect (Frieman et al., 1978). Unfortunately, the situation has not improved sufficiently since that time. The most common reasons for failing to complete studies with adequate power include inadequate funding for the project and loss of enthusiasm by the investigators.
A statistical power of at least 80% is highly desirable when conducting a clinical trial; 90% power is preferable. Discarding a good idea or a promising therapy because the
study designed to test it had little chance of detecting a true difference is obviously an unfortunate circumstance. One of the most difficult concepts to grasp is that a study with little power to detect a true difference not only has little chance of demonstrating a significant difference in favor of the better treatment, but also that the direction of the observed treatment effect is highly unpredictable because of random variation inherent in small samples. There is an overwhelming tendency to assume that if the observed effect is in the wrong direction in a small study, the therapy is not promising, whereas if the observed effect is in the expected direction but the p value is insignificant, the reason for the insignificant p value is an inadequate sample size. We can avoid these problems by designing and conducting clinical trials of adequate size.
Noninferiority
The concept of noninferiority has become increasingly important in the present cost-conscious environment, in which many effective therapies are already available. Where an effective therapy exists, the substitution of a less expensive (but clinically noninferior) one is obviously attractive. In these positive control studies, substantial effort is needed to define noninferiority. Sample size estimates require the designation of a difference below which the outcome with the new therapy is noninferior to the standard comparator, and above which one therapy would be considered superior to the other. Sample sizes are often larger than required to demonstrate one therapy to be clearly superior to the other.
Clinicians must be wary of studies that are designed with a substantial type II error resulting from an inadequate number of endpoints, with the result that the p value is greater than 0.05 because not enough events accrued, as opposed to a valid conclusion that one treatment is not inferior to the other. This error could lead to a gradual loss of therapeutic effectiveness for the target condition. For example, if we were willing to accept that a therapy for acute myocardial infarction with 1% higher mortality in an absolute sense was equivalent, and we examined four new, less-expensive therapies that met those criteria, we could cause a significant erosion of the progress in reducing mortality stemming from acute myocardial infarction.
Another interesting feature of noninferiority trials is that poor study conduct can bias the result toward no difference. For example, if no subjects in either treatment group took the assigned treatment, within the boundaries of the fluctuations of random chance, the outcomes in the randomized cohorts should be identical.
Sample size calculations
The critical step in a sample size calculation, whether for a trial to determine a difference or to test for equivalence, is the estimate of the minimally important clinical difference (MID). By reviewing the proposed therapy in comparison with the currently available therapy, the investigators should endeavor to determine the smallest difference in the primary endpoint that would change clinical practice. Practical considerations may not allow a sample size large enough to evaluate the MID, but the number should be known. In some cases, the disease may be too rare to enroll sufficient patients, whereas in other cases the treatment may be too expensive or the sponsor may not have enough money. Once the MID and the financial status of the trial are established, the sample size can be determined easily from a variety of published computer algorithms or tables. It is useful for investigators to produce plots or tables to enable them to see the effects of small variations in event rates or treatment effects on the needed sample size. In the GUSTO-I trial (GUSTO Angiographic Investigators, 1993), the sample size was set after a series of international meetings determined that saving an additional 1 life per 100 patients treated with a new thrombolytic regimen would be a clinically meaningful advance. With this knowledge, and a range of possible underlying mortality rates in the control group, a table was produced demonstrating that a 1% absolute reduction (a difference of 1 life per 100 treated) or a 15% relative reduction could be detected with 90% certainty by including 10 000 patients per study arm.
META-ANALYSIS AND SYSTEMATIC OVERVIEWS
Clinicians are often faced with therapeutic dilemmas, in which there is insufficient evidence to be certain of the best treatment. The basic principle of combining medical data from multiple sources seems intuitively appealing, since this approach results in greater statistical power. However, the trade-off is the assumption that the studies being combined are similar enough that the combined result will be valid. Inevitably, this assumption rests on expert opinion. Table 2.5 provides an approach to reading meta-analyses. The most common problems associated with meta-analyses are combining studies with different designs or outcomes and failing to find unpublished negative studies. There is no question regarding the critical importance of a full literature search, as well as involvement of experts in the field of interest to ensure that all relevant information is included. Statistical methods have been developed to help in the assessment of systematic publication bias (Begg and Berlin, 1988). Another complex issue involves the assessment of the quality of individual studies included in a systematic overview. Statistical methods have been proposed for differential weighting as a function of quality (Detsky et al., 1992), but these have not been broadly adopted.
The methodology of the statistical evaluation of pooled information has recently been a topic of tremendous interest.
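
The statement that combining studies yields greater statistical power can be illustrated with a minimal sketch of fixed-effect, inverse-variance pooling. This is one standard pooling method, chosen here only for illustration; the chapter does not prescribe a particular method. The function name `pool_fixed_effect` and the three trial results are hypothetical, and the sketch assumes that all trials estimate the same underlying effect, which is exactly the similarity assumption discussed above.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Fixed-effect (inverse-variance) pooled estimate and its standard error.

    Each study is weighted by 1 / SE^2, so more precise studies count more.
    Assumes every study estimates the same true effect.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    if any(s <= 0 for s in std_errors):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / s ** 2 for s in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical log odds ratios and standard errors from three small trials.
log_or = [-0.20, -0.10, -0.30]
se = [0.20, 0.25, 0.30]

pooled, pooled_se = pool_fixed_effect(log_or, se)
low = pooled - 1.96 * pooled_se
high = pooled + 1.96 * pooled_se
print(f"pooled log OR = {pooled:.3f}, SE = {pooled_se:.3f}")
print(f"approximate 95% CI: {low:.3f} to {high:.3f}")
```

The pooled standard error is smaller than that of any single trial, which is where the added power comes from. The calculation says nothing about whether the trials were similar enough to combine, or whether unpublished negative studies are missing.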
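
The kind of table described for GUSTO-I under "Sample size calculations" can be approximated with a short sketch. It assumes the common normal-approximation formula for comparing two proportions with a two-sided alpha of 0.05, namely n per arm = (z_alpha + z_power)^2 [p1(1 − p1) + p2(1 − p2)] / (p1 − p2)^2. The function name `patients_per_arm` and the range of control-group mortality rates are illustrative assumptions, and the output is not intended to reproduce the published GUSTO-I figures.

```python
import math
from statistics import NormalDist


def patients_per_arm(control_rate, absolute_reduction, alpha=0.05, power=0.90):
    """Approximate patients per arm needed to detect an absolute risk reduction.

    Uses the normal approximation for two independent proportions with a
    two-sided type I error of `alpha` and the requested power (1 - beta).
    """
    p1 = control_rate
    p2 = control_rate - absolute_reduction
    if not (0.0 < p2 < p1 < 1.0):
        raise ValueError("rates must satisfy 0 < treated rate < control rate < 1")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    n = (z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Illustrative control-group mortality rates; the MID is 1 life per 100 treated.
mid = 0.01
print("control mortality   patients per arm (90% power)")
for rate in (0.05, 0.06, 0.07, 0.08, 0.10):
    print(f"{rate:>17.0%}   {patients_per_arm(rate, mid):>16,d}")
```

Because the variance term grows with the event rate, the required sample size for a fixed absolute difference rises as control-group mortality rises, which is why a table across plausible event rates is more informative than a single number.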
(PRAISE) trial (Packer et al., 1996), which observed a reduction in mortality with amlodipine in patients with idiopathic dilated cardiomyopathy but not in patients with ischemic cardiomyopathy. This case was particularly interesting, because this subgroup was prespecified to the extent that the randomization was stratified. However, the reason for the stratification was that the trial designers expected that amlodipine would be ineffective in patients without cardiovascular disease; the opposite finding was in fact observed. The trial organization, acting in responsible fashion, mounted a confirmatory second trial. In the completed follow-up trial (PRAISE-2), the special benefit in the idiopathic dilated cardiomyopathy group was not replicated (Cabell et al., 2004).
THERAPEUTIC TRUISMS
A review of recent clinical trials demonstrates that many commonly held beliefs about clinical practice need to be challenged based on quantitative findings. If these assumptions are to be shown to be less solid than previously believed, a substantial change in the pace of clinical investigation will be needed.
Frequently, medical trainees have been taught that variations in practice patterns are inconsequential. The common observation that different practitioners treat the same problem in different ways has been tolerated because of the general belief that these differences do not matter. Clinical trials have demonstrated, however, that small changes in practice patterns in epidemic diseases can have a sizable impact. Indeed, the distillation of trial results into clinical practice guidelines has enabled direct research into the effects of variations in practice on clinical outcomes. The fundamental message is that reliable delivery of effective therapies leads to better outcomes.
Another ingrained belief of medical training is that observation of the patient will provide evidence for instances when a treatment needs to be changed. Although no one would dispute the importance of following symptoms, many acute therapies have effects that cannot be judged in a short time, and many therapies for chronic illness prevent adverse outcomes in patients with very few symptoms. For example, in treating acute congestive heart failure, inotropic agents improve cardiac output early after initiation of therapy but lead to a higher risk of death. Beta blockers cause symptomatic deterioration acutely but appear to improve long-term outcome. Mibefradil was effective in reducing angina and improving exercise tolerance, but it also caused sudden death in an alarming proportion of patients, leading to its withdrawal from the market. Most recently, erythropoietin at higher doses seems to provide a transient improvement in quality of life, but a subsequent increase in mortal cardiac events compared with lower doses.
Similarly, the standard method of determining the dose of a drug has been to measure physiological endpoints. In a sense, this technique resembles the use of a surrogate endpoint. No field has more impressively demonstrated the futility of this approach than the arena of treatment for heart failure. Several vasodilator and inotropic therapies have been shown to improve hemodynamics in the acute phase but subsequently were shown to increase mortality. The experience with heparin and warfarin has taught us that large numbers of subjects are essential to understanding the relationship between the dose of a drug and clinical outcome.
Finally, the imperative of "do no harm" has long been a fundamental tenet of medical practice. However, most biologically potent therapies cause harm in some patients while helping others. The recent emphasis on the neurological complications of bypass surgery provides ample demonstration that a therapy that saves lives can also lead to complications (Roach et al., 1996). Intracranial hemorrhage resulting from thrombolytic treatment exemplifies a therapy that is beneficial for populations but has devastating effects on some individuals. Similarly, beta blockade causes early deterioration in many patients with heart failure, but the longer-term survival benefits are documented in multiple clinical trials. The patients who are harmed can be detected easily, but those patients whose lives are saved cannot be detected.
STUDY ORGANIZATION
Regardless of the size of the trial being contemplated by the investigator, the general principles of organization of the study should be similar (Fig. 2.6). A balance of interest and power must be created to ensure that after the trial is designed, the experiment can be performed without bias and the interpretation will be generalizable.
Executive functions
The steering committee
In a large trial, the steering committee is a critical component of the study organization, and is responsible for designing, executing and disseminating the study. A diverse steering committee, comprising multiple perspectives that include biology, biostatistics and clinical medicine, is more likely to organize a trial that will withstand external scrutiny. This same principle holds for small trials; an individual investigator, by organizing a committee of peers, can avoid the pitfalls of egocentric thinking about a clinical trial.
The principal investigator plays a key role in the function of the trial as a whole, and a healthy interaction with the steering committee can provide a stimulating exchange of ideas on how best to conduct a study. The principal trial statistician is also crucial in making final recommendations about study design and data analysis. An executive committee
[Figure: study organization diagram; recoverable labels: Steering committee, Coordinating center, Core laboratories]
can be useful, as it provides a small group to make real-time critical decisions for the trial organization. This committee typically includes the sponsor, principal investigator, statistician, and key representatives from the steering committee and the data coordinating center.
The data and safety monitoring committee
The data and safety monitoring committee (DSMC) is charged with overseeing the safety of the trial from the point of view of the participating subjects. The DSMC should include clinical experts, biostatisticians and, sometimes, medical ethicists; these individuals should have no financial interest, emotional attachment, or other investment in the therapies being studied. Committee members have access to otherwise confidential data during the course of the trial, allowing decisions to be made on the basis of information that, if made available to investigators, could compromise their objectivity. The DSMC also shoulders an increasingly scrutinized ethical obligation to review the management of the trial in the broadest sense, in conjunction with each Institutional Review Board, to ensure that patients are treated according to ethical principles.
The role of the DSMC has become a topic of significant global interest. Little has been published about the function of these groups, yet they hold considerable power over the functioning of clinical trials. The first textbook on issues surrounding DSMCs has been published only recently (Ellenberg et al., 2003).
The Institutional Review Board
The Institutional Review Board (IRB) continues to play a central role in the conduct of all types of clinical research. Approval by an IRB is generally required for any type of human subjects research, even if the research is not funded by an external source. The IRB should consist of physicians with expertise in clinical trials and non-physicians expert in clinical research, as well as representatives with expertise in medical ethics and representatives of the community in which the research is being conducted. As with the DSMC, the IRB function has come under scrutiny, especially from government agencies charged with ensuring the protection of human subjects.
Several types of studies are typically exempted from the IRB process, including studies of public behavior, research on educational practices, and studies of existing data in which research data cannot be linked to individual subjects. Surveys and interviews may also be exempted when the subjects are not identified and the data are unlikely to result in a lawsuit, financial loss, or reduced employability of the subject.
Regulatory authorities
Government regulatory authorities have played a major role in the conduct of clinical research. The FDA and other national health authorities provide the rules by which industry-sponsored clinical trials are conducted. In general, regulatory requirements include interpretation of fundamental guidelines to ensure adherence to human rights and ethical standards. The FDA and equivalent international authorities are charged with ensuring that the drugs and devices marketed to the public are safe and effective (a charge with broad leeway for interpretation). Importantly, in the United States, there is no mandate to assess comparative effectiveness or cost-effectiveness, although the advent of organizations (such as the National Institute for Clinical Excellence [NICE] in the
United Kingdom and the Developing Evidence to Inform Decisions about Effectiveness [DECIDE] Network in the United States) charged with government-sponsored technology evaluations has led to a resurgence of cost evaluation in the clinical trial portfolio.
A controversial organization that has recently become extremely powerful in the United States is the OHRP, which reports directly to the Secretary of Health and Human Services and has tremendous power to control studies and their conduct through its ability to summarily halt studies or forbid entire institutions from conducting trials. Several recent cases have caused considerable debate about whether the powers of this organization are appropriate.
knowledge of successful, real-world operations is vital to a trial's success. It is important to remember that in large studies, a small change in protocol or addition of just one more patient visit or testing procedure can add huge amounts to the study cost. The larger the trial, however, the greater the economies of scale in materials, supplies and organization that can be achieved. For example, a simple protocol amendment can take months and cost hundreds of thousands of dollars (and even more in terms of the delay) to successfully pass through review by multiple national regulatory authorities and hundreds of local IRBs. If the intellectual leaders of a trial are not in touch with the practical implications of their decisions for study logistics, the trial's potential for providing useful information can be compromised.
extrapolate from reading journal articles to making individual decisions is clearly inadequate. Recognition of this deficit has led to a variety of efforts to synthesize empirical information into practice guidelines. These guidelines may be considered as a mixture of opinion-based advice and proven approaches to treatment that are not to be considered optional for patients who meet criteria. In addition, efforts such as the Cochrane collaboration (Cochrane Handbook for Systematic Reviews of Interventions, 2006) are attempting to make available systematic overviews of clinical trials in most major therapeutic areas, an effort further enhanced by new clinical trials registry requirements.
This effort has been integrated into a conceptual framework of a "cycle of quality", in which disease registries capture continuous information about the quality of care for populations (Califf et al., 2002). Within these populations, clinical trials, of appropriate size and performed in relevant study cohorts, can lead to definitive clinical practice guidelines. These guidelines can then form the basis for performance measures that are used to capture the quality of care delivered. Ultimately, gaps in clinical outcomes in this system can help define the need for new technologies and behavioral approaches. Increasingly, the linkage of interoperable electronic health records, professional-society-driven quality efforts, and patient/payer-driven interest in improving outcomes is leading to a system in which clinical trials are embedded within disease registries, so that the total population can be understood and the implementation of findings into practice can be measured (Welke et al., 2004).
THE FUTURE
As revolutionary developments in human biology and information technology continue to unfold, and as the costs of medical therapies continue to climb, the importance of innovative, reliable and cost-effective clinical trials will only grow. During the next several years, practitioners will make increasing use of electronic health records that generate computerized databases to capture information at the point of care. Early efforts in this area, focusing on procedure reports to meet mandates from payers and quality reviewers, will be replaced by systems aimed at capturing data about the entire course of the patient's encounter with the healthcare system. Multimedia tools will allow clinicians to view medical records and imaging studies simultaneously in the clinic or in the hospital.
In order to expedite the efficient exchange of information, the nomenclature of diagnosis, treatments and outcomes is becoming increasingly standardized. In a parallel development, a variety of disease registries are evolving, in which coded information about patients with particular problems is collected over time to ensure that they receive appropriate care in a systematic fashion. This combination of electronic health records and disease registries will have dramatic implications for the conduct of clinical trials.
In many countries, the computerized management of information is an element of a coalescence of practitioners into integrated health systems. In order to efficiently care for populations of patients at a reasonable cost, practitioners are working in large, geographically linked, economically interdependent groups. This integration of health systems will enable rapid deployment of trials into the community. This includes not only trials of diagnostic strategies and therapies, but also evaluations of strategies of care using cluster randomization (randomization of practices instead of individual patients) in order to produce a refined and continuous learning process for healthcare providers, an approach dubbed the "learning health system" by the Institute of Medicine.
Although integrated healthcare systems will provide the structure for medical practice, global communications will provide mechanisms to quickly answer questions about diagnosis, prevention, prognosis and treatment of common and uncommon diseases. The ability to aggregate information about thousands of patients in multiple health systems will change the critical issues facing clinical researchers. Increasingly, attention will be diverted from attempts to obtain data, and much effort will be required to develop efficient means of analyzing and interpreting the types of information that will be available.
Ultimately, leading practitioners will band together in global networks oriented toward treating illnesses of common interest. When a specific question requiring randomization is identified, studies will be much more straightforward, because the randomization can simply be added to the computerized database, and information that currently requires construction of a clinical trials infrastructure will be immediately accessible without additional work. Information systems will be designed to provide continuous feedback of information to clinicians, supporting rational therapeutic decisions. In essence, a continuous series of observational studies will be in progress, assessing outcomes as a function of diagnostic processes and therapeutic strategies.
All of the principles elucidated above will continue to be relevant; indeed, they will evolve toward an increasingly sophisticated state, given better access to aggregate results of multiple trials. As in many other aspects of modern life, knowledge will be generated and refined at a pace that would not even have been imaginable a decade ago.
ACKNOWLEDGMENT
Substantial portions of this chapter originally appeared in: Califf, R.M. (2007) Large clinical trials and registries: clinical research institutes, in Principles and Practice of Clinical Research, 2nd edn (J.I. Gallin and F. Ognibene, eds). Burlington, MA, Academic Press (Elsevier), 2007, which was adapted and updated for the present publication with permission from the authors and publisher.
REFERENCES
Alexander, K.P. and Peterson, E.D. (2003) Evidence-based care for all patients. Am. J. Med. 114, 333–335.
ALLHAT Collaborative Research Group (2000) Major cardiovascular events in hypertensive patients randomized to doxazosin vs chlorthalidone: The Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT). JAMA 283, 1967–1975.
Angell, M. and Kassirer, J.P. (1996) Editorials and conflicts of interest [Editorial]. N. Engl. J. Med. 335, 1055–1056.
Antman, E. (1995) Randomized trial of magnesium for acute myocardial infarction: Big numbers do not tell the whole story. Am. J. Cardiol. 75, 391–393.
Arias-Mendoza, F., Zakian, K., Schwartz, A., Howe, F.A., Koutcher, J.A., Leach, M.O., Griffiths, J.R., Heerschap, A., Glickson, J.D., Nelson, S.J. et al. (2004) Methodological standardization for a multi-institutional in vivo trial of localized 31P MR spectroscopy in human cancer research. In vitro and normal volunteer studies. NMR Biomed. 17, 382–391.
Begg, C. and Berlin, J. (1988) Publication bias: A problem in interpreting medical data. J. R. Stat. Soc. A 151, 419–445.
Berkey, C.S., Hoaglin, D.C., Mosteller, F. and Colditz, G.A. (1995) A random-effects regression model for meta-analysis. Stat. Med. 14, 395–411.
Berry, D.A. and Eick, S.G. (1995) Adaptive assignment versus balanced randomization in clinical trials: a decision analysis. Stat. Med. 14, 231–246.
Cranney, A., Tugwell, P., Cummings, S., Sambrook, P., Adachi, J., Silman, A.J., Gillespie, W.J., Felson, D.T., Shea, B. and Wells, G. (1997) Osteoporosis clinical trials endpoints: candidate variables and clinimetric properties. J. Rheumatol. 24, 1222–1229.
Davidson, M.H. (2004) Rosuvastatin safety: Lessons from the FDA review and post-approval surveillance. Expert. Opin. Drug Saf. 3, 547–557.
Davis, S., Wright, P.W., Schulman, S.F., Hill, L.D., Pinkham, R.D., Johnson, L.P., Jones, T.W., Kellogg, H.B. Jr, Radke, H.M., Sikkema, W.W. et al. (1985) Participants in prospective, randomized clinical trials for resected non-small cell lung cancer have improved survival compared with nonparticipants in such trials. Cancer 56, 1710–1718.
DeAngelis, C.D., Drazen, J.M., Frizelle, F.A., Haug, C., Hoey, J., Horton, R., Kotzin, S., Laine, C., Marusic, A., Overbeke, A.J. et al. (2004) Clinical trial registration: A statement from the International Committee of Medical Journal Editors. JAMA 351, 1250–1251.
DeMets, D.L., Fleming, T.R., Rockhold, F., Massie, B., Merchant, T., Meisel, A., Mishkin, B., Wittes, J., Stump, D. and Califf, R.M. (2004) Liability issues for data monitoring committee members. Clin. Trials 1, 525–531.
Detsky, A.S., Naylor, C.D., O'Rourke, K., McGeer, A.J. and L'Abbé, K.A. (1992) Incorporating variations in the quality of individual randomized trials into meta-analysis. J. Clin. Epidemiol. 45, 255–265.
Dickersin, K., Chan, S., Chalmers, T.C., Sacks, H.S. and Smith, H. Jr (1987) Publication bias and clinical trials. Controlled Clin. Trials 8, 343–353.
Dickersin, K. and Min, Y.I. (1993) Publication bias: The problem that won't go away. Ann. NY Acad. Sci. 703, 135–146.
Bobbio, M., Demichelis, B. and Giustetto, G. (1994) Completeness of Dressman, H.K., Hans, C., Bild, A., Olson, J.A., Rosen, E., Marcom, P.K.,
reporting trial results: Effect on physicians willingness to prescribe. Liotcheva, V.B., Jones, E.L., Vujaskovic, Z., Marks, J. et al. (2006)
Lancet. 343, 12091211. Gene expression profiles of multiple breast cancer phenotypes and
Cabell, C.H., Trichon, B.H., Velazquez, E.J., Dumesnil, J.G., Anstrom, response to neoadjuvant chemotherapy. Clin. Cancer Res. 12 (3 pt 1),
K.J., Ryan, T., Miller, A.B., Belkin, R.N., Cropp, A.B., OConnor, 819826.
C.M. and Jollis, J.G. (2004) Importance of echocardiography in Ellenberg, S.S., Fleming, T.R. and DeMets, D.L. (2003) Data Monitoring
patients with severe nonischemic heart failure: the second Prospective Committees in Clinical Trials: A Practical Perspective. Chichester:
Randomized Amlodipine Survival Evaluation (PRAISE-2) echocar- John Wiley & Sons.
diographic study. Am. Heart J. 147, 151157. Fibrinolytic Therapy Trialists (FTT) Collaborative Group (1994)
Califf, R.M. and Kramer, J.M. (1998) What have we learned from the cal- Indications for fibrinolytic therapy in suspected acute myocardial
cium channel blocker controversy? Circulation 97, 15291531. infarction: Collaborative overview of early mortality and major mor-
Califf, R.M. and Kramer, J.M. (2008) The balance of benefit and safety of bidity results from all randomised trials of more than 1000 patients.
rosiglitazone: Important lessons for our system of drug development. Lancet 343, 311322.
Pharmacoepidemiol. Drug Saf. 17, 782786. Fisher, R.A. and Mackenzie, W.A. (1923) Studies of crop variation: II.
Califf, R.M., Peterson, E.D., Gibbons, R.J., Garson, A. Jr, Brindis, R.G., The manurial response of different potato varieties. J. Agric. Sci. 13,
Beller, G.A., Smith, S.C. Jr; for the American College of Cardiology; 315.
American Heart Association (2002) Integrating quality into the cycle Food and Drug Administration Amendments Act of 2007, Title VIII,
of therapeutic development. J. Am. Coll. Cardiol. 40, 18951901. Section 801 (Pub. L. No. 110-85, 121 Stat 825).
Cochrane Handbook for Systematic Reviews of Interventions, 4.2.6 Forrow, L., Taylor, W.C. and Arnold, R.M. (1992) Absolutely relative:
[updated September 2006]. (J.P.T. Higgins and S. Green, eds.) How research results are summarized can affect treatment decisions.
Chichester: John Wiley & Sons. Available at: http://www3.inter Am. J. Med. 92, 121124.
science.wiley.com/homepages/106568753/handbook.pdf (accessed 19 Frieman, J.A., Chalmers, T.C., Smith, H. Jr and Kuebler, R.R. (1978) The
December 2006). importance of beta, the type II error and sample size in the design and
Collins, R., Doll, R. and Peto, R. (1992) Ethics of clinical trials. In: interpretation of the randomized control trial: Survey of 71 negative
Introducing New Treatments for Cancer: Practical, Ethical and Legal trials. N. Engl. J. Med. 299, 690694.
Problems (C. Williams, ed.), pp. 4965. Chichester: John Wiley & Gibbons, R.J., Christian, T.F., Hopfenspirger, M., Hodge, D.O. and Bailey, K.R.
Sons. (1994) Myocardium at risk and infarct size after thrombolytic ther-
Covinsky, K.E., Fuller, J.D., Yaffe, K., Johnston, C.B., Hamel, M.B., apy for acute myocardial infarction: Implications for the design of
Lynn, J., Teno, J.M. and Phillips, R.S. (2000) Communication and randomized trials of acute intervention. J. Am. Coll. Cardiol. 24,
decision making in seriously ill patients: findings of the SUPPORT 616623.
project. The Study to Understand Prognoses and Preferences for Goss, C.H., Rubenfeld, G.D., Ramsey, B.W. and Aitken, M.L. (2006)
Outcomes and Risks of Treatments. J. Am. Geriatr. Soc. 48 (Suppl. 5), Clinical trial participants compared with nonparticipants in cystic
S187193. fibrosis. Am. J. Respir. Crit. Care Med. 173, 98104.
36
PART | I Fundamental Principles
Gross, R. and Strom, B.L. (2003) Toward improved adverse event/sus- Mietlowski, W. and Wang, J. (2007) Letter to the re Yu and Holmgren.
pected adverse drug reaction reporting. Pharmacoepidemiol. Drug Traditional endpoint of progression-free survival (PFS) may not be
Saf. 12, 8991. appropriate for evaluating cytostatic agents combined with chemo-
GUSTO Angiographic Investigators (1993) The effects of tissue plas- therapy in cancer clinical trials. Contemp. Clin. Trials. 28, 674.
minogen activator, streptokinase, or both on coronary artery patency, Miller, F.G. and Emanuel, E.J. (2008) Quality-improvement research and
ventricular function and survival after acute myocardial infarction. informed consent. N. Engl. J. Med. 358, 765767.
N. Engl. J. Med. 329, 16151622. Naylor, C.D., Chen, E. and Strauss, B. (1992) Measured enthusiasm: Does
GUSTO Investigators (1993) An international randomized trial comparing the method of reporting trial results alter perceptions of therapeutic
four thrombolytic strategies for acute myocardial infarction. N. Engl. effectiveness?. Ann. Intern. Med. 117, 916921.
J. Med. 329, 673682. Nissen, S.E. and Wolski, K. (2007) Effect of rosiglitazone on the risk of
Harrington, R.A., Lincoff, A.M., Califf, R.M., Holmes, D.R. Jr, myocardial infarction and death from cardiovascular causes. N. Engl.
Berdan, L.G., OHanesian, M.A., Keeler, G.P., Garratt, K.N., J. Med. 356, 24572471.
Ohman, E.M., Mark, D.B. et al. (1995) Characteristics and conse- Olson, C.M., Rennie, D., Cook, D., Dickersin, K., Flanagin, A., Hogan, J.W.,
quences of myocardial infarction after percutaneous coronary inter- Zhu, Q., Reiling, J. and Pace, B. (2002) Publication bias in decision
vention: Insights from the Coronary Angioplasty versus Excisional making. JAMA 287, 28252828.
Atherectomy Trial (CAVEAT). J. Am. Coll. Cardiol. 25, 16931699. Packer, M. (1990) Calcium channel blockers in chronic heart failure: The
Henkin, R.I., Schecter, P.J., Friedewald, W.T., Demets, D.L. and Raff, M. risks of physiologically rational therapy [Editorial]. Circulation 82,
(1976) A double-blind study of the effects of zinc sulfate on taste and 22542257.
smell dysfunction. Am. J. Med. Sci. 272, 285299. Packer, M., OConnor, C.M., Ghali, J.K., Pressler, M.L., Carson, P.E.,
ISIS-1 (First International Study of Infarct Survival) Collaborative Group Belkin, R.N., Miller, A.B., Neuberg, G.W., Frid, D., Wertheimer, J.H.
(1986) Randomised trial of intravenous atenolol among 16,027 et al. (1996) Effect of amlodipine on morbidity and mortality in
cases of suspected acute myocardial infarction: ISIS-1. Lancet 2, severe chronic heart failure. N. Engl. J. Med. 335, 11071114.
5766. Peto, R., Collins, R. and Gray, R. (1995) Large-scale randomized evi-
ISIS-4 (Fourth International Study of Infarct Survival) Collaborative dence: Large, simple trials and overviews of trials. J. Clin. Epidemiol.
Group (1995) ISIS-4: A randomised factorial trial assessing early oral 48, 2340.
captopril, oral mononitrate and intravenous magnesium sulphate in Phrommintikul, A., Haas, S.J., Elsik, M. and Krum, H. (2007) Mortality
48,050 patients with suspected acute myocardial infarction. Lancet and target haemoglobin concentrations in anaemic patients with
345, 669685. chronic kidney disease treated with erythropoietin: a meta-analysis.
Karlowski, T.R., Chalmers, T.C., Frenkel, L.D., Kapikian, A.Z., Lewis, T.L. Lancet. 369, 381388.
and Lynch, J.M. (1975) Ascorbic acid for the common cold: A pro- Potti, A., Dressman, H.K., Bild, A., Riedel, R.F., Chan, G., Sayer, R.,
phylactic and therapeutic trial. JAMA 231, 10381042. Cragun, J., Cottrill, H., Kelley, M.J., Petersen, R. et al. (2006)
Lau, J., Antman, E.M., Jimenez-Silva, J., Kupelnick, B., Mosteller, F. and Genomic signatures to guide the use of chemotherapeutics. Nat. Med.
Chalmers, T.C. (1992) Cumulative meta-analysis of therapeutic trials 12, 12941300.
for myocardial infarction. N. Engl. J. Med. 327, 248254. Pratt, C.M. and Moye, L. (1990) The Cardiac Arrhythmia Suppression
Lee, K.L., McNeer, J.F., Starmer, C.F., Harris, P.J. and Rosati, R.A. (1980) Trial: Implications for anti-arrhythmic drug development. J. Clin.
Clinical judgment and statistics: Lessons from a simulated rand- Pharmacol. 30, 967974.
omized trial in coronary artery disease. Circulation 61, 508515. Pronovost, P., Needham, D., Berenholtz, S., Sinopoli, D., Chu, H.,
Lilienfield, A.M. (1982) Ceteris paribus: The evolution of the clinical Cosgrove, S., Sexton, B., Hyzy, R., Welsh, R., Roth, G., Bander, J.
trial. Bull. Hist. Med. 56, 118. et al. (2006) An intervention to decrease catheter-related bloodstream
Lincoff, A.M., Tcheng, J.E., Califf, R.M., Bass, T., Popma, J.J., Teirstein, infections in the ICU. N. Engl. J. Med. 355, 27252732.
P.S., Kleiman, N.S., Hattel, L.J., Anderson, H.V., Ferguson, J.J. et al. Roach, G.W., Kanchuger, M., Mangano, C.M., Newman, M., Nussmeier, N.,
(1997) Standard versus low-dose weight-adjusted heparin in patients Wolman, R., Aggarwal, A., Marschall, K., Graham, S.H. and Ley, C.
treated with the platelet glycoprotein IIb/IIIa receptor antibody frag- (1996) Adverse cerebral outcomes after coronary bypass surgery. N.
ment abciximab (c7E3 Fab) during percutaneous coronary revascu- Engl. J. Med. 335, 18571863.
larization. PROLOG Investigators. Am. J. Cardiol. 79, 286291. Roberts, R., Rodriguez, W., Murphy, D. and Crescenzi, T. (2003) Pediatric
Lo, B., Fiegal, D., Cummins, S. and Hulley, S.B. (1988) Addressing drug labeling: Improving the safety and efficacy of pediatric thera-
ethical issues. In: Designing Clinical Research (S.B. Hulley and pies. JAMA 290, 905911.
S.R. Cummings, eds), pp. 151157. Baltimore, MD: Williams & Wilkins. Rogers, W.J., Bowlby, L.J., Chandra, N.C., French, W.J., Gore, J.M.,
Lyles, K.W., Coln-Emeric, C.S., Magaziner, J.S., Adachi, J.D., Pieper, C.F., Lambrew, C.T., Rubison, R.M., Tiefenbrunn, A.J. and Weaver, W.D.
Mautalen, C., Hyldstrup, L., Recknor, C., Nordsletten, L., Moore, K.A. (1994) Treatment of myocardial infarction in the United States (1990
et al. (2007) Zoledronic acid and clinical fractures and mortality after to 1993) Observations from the National Registry of Myocardial
hip fracture. N. Engl. J. Med. 357, 17991809. Infarction. Circulation. 90, 21032114.
Mahaffey, K.W., Granger, C.B., Tardiff, B.E. et al. (1997) For the Schmidt, B., Gillie, P., Caco, C., Roberts, J. and Roberts, R. (1999) Do
GUSTO-IIb Investigators. Endpoint adjudication by a clinical events sick newborn infants benefit from participation in a randomized clini-
committee can impact the statistical outcome of a clinical trial: cal trial? J. Pediatr. 134, 151155.
Results from GUSTO-IIb [Abstract]. J. Am. Coll. Cardiol. 29 (Suppl. Schwarz, U.I., Ritchie, M.D., Bradford, Y., Li, C., Dudek, S.M., Frye-
A), 410A. Anderson, A., Kim, R.B., Roden, D.M. and Stein, C.M. (2008)
Medical Research Council (1948) Streptomycin treatment of pulmonary Genetic determinants of response to Warfarin during initial anticoagu-
tuberculosis. Br. Med. J. 2, 769782. lation. N. Engl. J. Med. 358, 9991008.
37
Chapter | 2 Clinical Trials
Simes, R.J. (1987) Publication bias: The case for an international registry randomised controlled trials compared to similar patients receiving
of clinical trials. J. Clin. Oncol. 4, 15291541. similar interventions who do not participate. Cochrane Database Syst.
Society of Thoracic Surgeons Database. Available at www.sts.org/ Rev., MR000009.
sections/stsnationaldatabase/. Accessed 30 November 2005. Weinfurt, K.P. (2003) Outcomes research related to patient decision mak-
Thygesen, K., Alpert, J.S. and White, H.D., for the Joint ESC/ACCF/ ing in oncology. Clin. Ther. 25, 671683.
AHA/WHF Task Force for the Redefinition of Myocardial Infarction Welke, K.F., Ferguson, T.B. Jr, Coombs, L.P., Dokholyan, R.S., Murray,
(2007) Universal definition of myocardial infarction. Circulation 116, C.J., Schrader, M.A. and Peterson, E.D. (2004) Validity of the Society
26342653. of Thoracic Surgeons National Adult Cardiac Surgery Database. Ann.
Topol, E.J. (2005) Arthritis medicines and cardiovascular events House Thorac. Surg. 77, 11371139.
of coxibs. JAMA 293, 366368. Yusuf, S., Wittes, J., Probstfield, J. and Tyroler, H.A. (1991) Analysis and
Tunis, S.R., Stryer, D.B. and Clancy, C.M. (2003) Practical clinical trials: interpretation of treatment effects in subgroups of patients in rand-
Increasing the value of clinical research for decision making in clini- omized clinical trials. JAMA 266, 9398.
cal and health policy. JAMA 290, 16241632. Zarin, D.A., Tse, T. and Ide, N.C. (2005) Trial registration at
Vist, G.E., Hagen, K.B., Devereaux, P.J., Bryant, D., Kristoffersen, D.T. ClinicalTrials.gov between May and October 2005. N. Engl. J. Med.
and Oxman, A.D. (2007) Outcomes of patients who participate in 353, 27792787.