
Journal of Autism and Developmental Disorders, Vol. 30, No. 2, 2000

Assessment in Multisite Randomized Clinical Trials of Patients with Autistic Disorder: The Autism RUPP Network1
L. Eugene Arnold,2 Michael G. Aman, Andres Martin, Angie Collier-Crespin, Benedetto Vitiello, Elaine Tierney, Robert Asarnow, Felicia Bell-Bradshaw, Betty Jo Freeman, Patricia Gates-Ulanet, Ami Klin, James T. McCracken, Christopher J. McDougle, James J. McGough, David J. Posey, Lawrence Scahill,2 Naomi B. Swiezy, Louise Ritz and Fred Volkmar2

Assessment of autistic disorder (autism) symptoms, primary and secondary, poses more challenging problems than ordinarily found in multisite randomized clinical trial (RCT) assessments. For example, subjects may be uncommunicative and extremely heterogeneous in problem presentation, and current pharmacological treatments are not likely to alter most core features of autism. The Autism Research Units on Pediatric Psychopharmacology (RUPP Autism Network) resolved some of these problems during the design of a risperidone RCT in children/adolescents. The inappropriateness of the usual anchors for a Clinical Global Impression of Severity (CGI-S) was resolved by defining uncomplicated autism without secondary symptoms as a CGI-S of 3, mildly ill. The communication problems, compromising use of the patient as an informant, were addressed by several strategies, including careful questioning of care providers, rating scales, laboratory tests, and physical exams. The broad subject heterogeneity requires outcome measures sensitive to individual change over a wide spectrum of treatment response and side effects. The problems of neuropsychologically testing nonverbal, lower functioning, sometimes noncompliant subjects require careful instrument selection/adaptation and flexible administration techniques. The problem of assessing low-end IQs, neglected by most standardized test developers, was resolved by an algorithm of test hierarchy. Scarcity of other autism-adapted cognitive and neuropsychological tests and lack of standardization required development of a new, specially adapted battery. Reliability on the Autism Diagnostic Interview (currently the most valid diagnostic instrument) and other clinician instruments required extensive cross-site training (in-person, videotape, and teleconference sessions). Definition of a treatment responder required focus on individually relevant target symptoms, synthesis of possible modest improvements in many domains, and acceptance of attainable though imperfect goals. The assessment strategy developed is implemented in an RCT of risperidone (McDougle et al., 2000) for which the design and other methodological challenges are described elsewhere (Scahill et al., 2000). Some of these problems and solutions are partially shared with RCTs of other treatments and other disorders.
KEY WORDS: autism; clinical trials; risperidone.
1 The Autism/PDD RUPP (Research Units on Pediatric Psychopharmacology) risperidone study is a cooperative treatment study performed by five research teams in collaboration with the staff of the Division of Services and Intervention Research of the National Institute of Mental Health (NIMH), Rockville, Maryland. The NIMH Principal Collaborators are Benedetto Vitiello and Louise Ritz. Principal Investigators, Co-investigators, and site coordinators from the five sites are as follows: Indiana University: Christopher J. McDougle, Naomi B. Swiezy, David J. Posey, Arlene Kohburn, Vanessa Patrick; Johns Hopkins University/Kennedy Krieger Institute: Elaine Tierney, Jaswinder Ghuman, Nilda Gonzalez, Patricia Gates-Ulanet, Felicia Bell-Bradshaw; Ohio State University: Michael G. Aman, L. Eugene Arnold, Jill Hollway, Ronald Lindsay, Patricia Nash; University of California, Los Angeles: James T. McCracken, James J. McGough, Betty Jo Freeman, Pegeen Cronin, Bhvaik Shah; Yale University: Lawrence Scahill, Andres Martin, Lesley Farkas, Kathy Koenig, Fred Volkmar.
2 Address all correspondence to L. Eugene Arnold, 479 S. Galena Road, Sunbury, Ohio 43074; e-mail: Arnold.6@osu.edu. Reprints not available.

0162-3257/00/0400-0099$18.00/0 2000 Plenum Publishing Corporation

INTRODUCTION

The Research Units on Pediatric Psychopharmacology (RUPP) Autism Network is a network of five field sites formed by NIMH to investigate salient issues in pharmacotherapy of persons with autistic disorder (autism) and other pervasive developmental disorders (PDDs). The first autism RUPP study is a randomized clinical trial (RCT) of risperidone (2 months randomized placebo-risperidone comparison followed by 4 months open maintenance, then 2 months randomized placebo discontinuation), for which the rationale is described by McDougle et al. (2000) and the design and methodology by Scahill et al. (2000). In the development of this study, we encountered assessment challenges more severe than ordinarily found in RCTs with subjects who do not have mental retardation (MR) or developmental disorders. Though some can be found in milder degree in most treatment research with children, a few may be unique to autism/MR research in quality or severity. First, subjects may be uncommunicative, making direct assessment of drug effects difficult. Second, they may not be able to cooperate with assessment activities. Third, treatments are unlikely to alter most core features of autism, such as impairments of communication, social relatedness, and imagination/fantasy, and repetitive stereotyped behavior, inflexibility, and narrow focus of preoccupation. The more usual targets for psychopharmacotherapy are the secondary problems of aggression, hyperactivity, self-injurious behavior, anxiety, and agitation. Compulsive rituals and stereotypy constitute a special case: There is some question whether stereotypy and compulsions can be clearly distinguished, and this spectrum of behavior is the one core feature of autism amenable to pharmacotherapy. Because either significant improvement in one of these many symptom areas or moderate improvement in several would be clinically useful to document, it is important that assessments be sensitive to all of them.
Fourth, the broad spectrum of pathology encompassed and the wide individual variation in symptomatic expression (sample heterogeneity) and treatment response challenge the sensitivity, psychometric properties, and/or assumptions of most instruments and assessment strategies commonly used in RCTs. Both reliability and validity present problems, complicated by the fact that the patient cannot be counted on as an informant. The latter problem is only partially shared with most other RCTs, even in children, who are not as good informants as adults. Though children are poor informants about attention-deficit/hyperactivity (ADHD) and oppositional-defiant symptoms at any age and unreliable about internalizing symptoms before age 8, the informant problem is not nearly as severe as for patients with autism. This article describes some of the assessment challenges the Autism RUPP faced in designing its first RCT, the strategies developed to address them (Table I), and the resulting assessment battery (Table II).

HETEROGENEITY OF PSYCHOPATHOLOGY AND TARGET SYMPTOMS

The heterogeneity of autism is evident in the genetic, neurobiological, and clinical characteristics, as well as in the response to treatment. A high genotypic contribution is now recognized (Bailey, Le Couteur, Gottesman, & Bolton, 1995; Folstein & Rutter, 1977; Steffenburg et al., 1989), but different genetic mechanisms may play a role in higher and lower functioning subjects (Szatmari, 1999). Fragile X abnormality has been found in 2.5–5% of autism cases (Bailey et al., 1993; Fisch, 1992). About one third have elevated whole blood serotonin levels (Anderson et al., 1987), a finding of unknown clinical significance and treatment implications. About 75% have mental retardation (IQ < 70) (Freeman, Ritvo, Needleman, & Yokota, 1985a). Among institutionalized persons with autism and mental retardation, the prevalence of self-injurious behavior (SIB) is as high as 15–40% (Griffin, Williams, Stark, Altmeyer, & Mason, 1986; Tsai, 1996). Moreover, 25–40% of patients with autism have a seizure disorder (Volkmar & Nelson, 1990) and about 50% have electroencephalographic abnormalities (Minshew, 1991). Treatment response tends to be equally heterogeneous. When administered psychotropic medications such as stimulants and serotonin reuptake inhibitors, some patients may improve, others show no changes, and still others worsen (McDougle et al., 1999). This is true even for treatments documented as effective at the group level, such as serotonin reuptake inhibitors for obsessive/compulsive/stereotypic symptoms in adults. One of the research implications of this high intersubject variability in treatment response is the need to study a large number of subjects in order to test the potential efficacy of a treatment. This problem is shared to a great extent with other areas of child psychopharmacology, which is plagued with underpowered studies (perhaps because of funding limitations).
Table I. Assessment Issues in Randomized Clinical Trials in Autistic Disorder

Issue: Communication, cooperation with assessment
Problems: Communication impairments, verbal and nonverbal. Many low general functioning (low IQ). Frequent inability to cooperate.
Solutions: Caregiver ratings, careful interview of caregivers; performance tests; observations, physical exams; adaptation of instruments.

Issue: Floor of instruments
Problems: IQ test developers neglect IQs at lower end; floor varies from test to test.
Solutions: Algorithm for test choice, moving from more desirable with higher floors to those with lower floors as needed.

Issue: Clinical Global Impression-Severity (CGI-S) anchors
Problems: Most target symptoms are secondary; even excellent results unlikely to produce normality. Nature of disorder leaves significant underlying pathology.
Solutions: Arbitrary calibration of uncomplicated autistic disorder as 3 on 7-point Clinical Global Impression-Severity. Two-pronged definition of response.

Issue: Target symptoms; heterogeneity
Problems: Wide array of symptoms, primary and secondary, are possible opportunities for improvement. Important not to miss any.
Solutions: Assessment of multiple symptoms and domains of function as basis for Global Impression. Focus on relevant ABC subscale. Individual target symptom assessment.

Issue: Tracking repetitive behaviors
Problems: Common and often problematic, but not egodystonic.
Solutions: Adapted CY-BOCS to include perseverative behavior.

Issue: Sensitivity to change
Problems: Even slight improvement is important, especially if across several domains.
Solutions: Selection of measures with high sensitivity in multiple domains in individual patients.

Issue: Definition of response/responders
Problems: Regression to mean of quantified scales; Global Impression not specific to targets. Single rater could be wrong.
Solutions: Require both "much improved" on CGI-Improvement by blinded clinician and 25% improvement on target subscale (ABC Irritability) by parent.

Issue: Neuropsychological/cognitive testing
Problems: Lack of instruments adapted to autism; lack of standardization/norms in autism.
Solutions: Development of special neuropsychological battery, with plans to standardize on first sample.

Issue: Diagnostic procedure
Problems: Best validated diagnostic instrument (Autism Diagnostic Interview-Revised) is very arduous, requires certified training.
Solutions: Group training; planned statistical analyses to establish reliability and validity of shorter procedures.

Issue: Medication-specific issues
Problems: Swallowing/compliance, interaction of autistic symptoms and metabolism with side effects, toxicity, rate of benefit onset.
Solutions: Training for swallowing, monitoring of weight, EPS, metabolites, leptin, seizures. Titration schedule, phone monitoring.

No treatment has yet been developed that can correct most of the core features of autism. Impairment in reciprocal social interaction, deficits in communication, and a restricted, repetitive, and stereotypic pattern of behavior can be substantially ameliorated through intensive psychoeducational/behavioral interventions, but not, as of today, qualitatively altered and cured. The focus of neuropsychiatric treatments is usually not on core pathology (other than preoccupation and stereotypy) but on such associated symptoms as hyperactivity, agitation, inattention, aggression, anxiety, rituals, and SIB. In fact, even a modest decrease in one of these behavioral problems can result in clinically significant improvement and enhanced level of functioning. Therefore testing the efficacy of interventions aimed at the associated secondary psychopathology of autism requires being able to detect changes in the specific constellation of behaviors that each patient presents.

Treatment response is usually measured on comprehensive rating scales that include multiple items relevant to various behavioral aspects (Campbell & Palij, 1985). An individual patient may manifest only a few specific behavior problems or may show improvement in only a few during treatment. In such cases, it is important that the few changes on the rating scale not be overshadowed or diluted by the overall lack of change in the other items of the scale. In some studies, no difference between active treatment and placebo was detected on global scores of improvement, but scores of specific symptoms, such as hyperactivity, showed significant improvement with the active treatment (Campbell et al., 1993). (This raises questions about the clinical importance of the improvement and about the relevance of the instrument.) Searching for the possible treatment effect post hoc, however, would mean running multiple statistical analyses, requiring a high statistical price in Bonferroni corrections.

Table II. RUPP Risperidone RCT Assessment Battery
(1° = main/primary; 2° = secondary; BL = baseline; Tx = treatment)

1. Autism Diagnostic Interview (ADI); 2–3 hr. Informant (rater): Parent interview (staff rating). Purpose: Autism diagnosis, screen. Frequency: Screen only.
2. Vineland Adaptive Behavior Scale (Sparrow et al., 1984); 35–45 min. Informant (rater): Parent interview (staff rating). Purpose: Screen, 2° outcome measure. Frequency: BL & after 6 mo. Tx.
3. Vineland Maladaptive Behavior Subscale; 8–12 min. Informant (rater): Parent interview (staff rating). Purpose: 2° outcome measure. Frequency: BL, 2 mo., 6 mo.
4. Target symptom descriptive quantification; 2–5 min. Informant (rater): Parent interview (blinded clinician). Purpose: 2° outcome measure. Frequency: BL, 4 wk., 8 wk., 6 mo., 8 mo.
5. Aberrant Behavior Checklist (ABC) Irritability Subscale; 8–12 min. Informant (rater): Parent interview (blinded clinician). Purpose: 2° outcome measure. Frequency: BL, 4 wk., 8 wk., 6 mo., 8 mo.
6. Full ABC; 10–20 min. Informant (rater): Parent checklist. Purpose: Screen, 1° outcome measure. Frequency: Screen, BL, every 2 wk. for 2 mo., then monthly.
7. Child & Adolescent Autism Symptom Inventory (ASI); 20–40 min. Informant (rater): Parent checklist. Purpose: Screen. Frequency: Screen only.
8. Ritvo-Freeman Rating Scale adaptation for parents; 15–25 min. Informant (rater): Parent checklist. Purpose: 2° outcome measure. Frequency: BL and monthly.
9. IQ test algorithm (see text); 45 min. Informant (rater): Subject (tester). Purpose: Screen for mental age ≥ 18 mo. Frequency: Screen only.
10. Cognitive battery (non-IQ); 30–60 min. Informant (rater): Subject (tester). Purpose: 2° outcome measure. Frequency: BL, Months 1, 2, 6, 7, 8.
11. Physical exam; 15–25 min. Informant (rater): Physician. Purpose: Screen, adverse events. Frequency: Screen, 2 mo., 8 mo.
12. History; 15–25 min. Informant (rater): Parent/guardian (staff). Purpose: Screen. Frequency: Screen only.
13. Electrocardiogram; 5–15 min. Informant (rater): Objective, cardiologist. Purpose: Screen, adverse events. Frequency: Screen, 2 mo., 8 mo.
14. Drug blood level (also leptin, insulin, prolactin). Informant (rater): Objective, lab. Purpose: Screen, PK studies, level-response, compliance, adverse events. Frequency: Screen, Months 2, 6, 8.
15. Vital signs; Ht & Wt; 10–15 min. Informant (rater): Objective, staff. Purpose: Screen, side effects, other adverse events. Frequency: Every visit.
16. Side effects review; 8–12 min. Informant (rater): Parent, 1° clinician. Purpose: Monitor side effects. Frequency: Every visit except screen.
17. Abnormal Involuntary Movement Scale (AIMS); 8–12 min. Informant (rater): Primary clinician. Purpose: Monitor side effects. Frequency: Every visit except screen.
18. Simpson-Angus; 8–12 min. Informant (rater): Primary clinician. Purpose: Monitor side effects. Frequency: Every visit except screen.
19. Clinical Global Impression-Severity. Informant (rater): Blinded clinician. Purpose: Baseline for CGI, key checkpoints. Frequency: Screen, BL, 2 mo., 6 mo., 8 mo.
20. Clinical Global Impression-Improvement. Informant (rater): Blinded clinician. Purpose: 1° outcome measure. Frequency: Every visit after BL.
21. Children's Yale-Brown Obsessive-Compulsive Scale; 10–20 min. Informant (rater): Blinded clinician. Purpose: Outcome measure. Frequency: BL, 2 wk., 4 wk., 6 wk., 2 mo., then monthly.

A possible solution is to identify at baseline the critical symptoms of each patient that are the main targets of treatment and find a rating scale that contains the range of target symptoms relevant to a range of patients. This approach seems particularly fitting for trials of treatments, such as neuroleptics, that can impact various problem areas, such as hyperactivity, aggression, stereotypies, and SIB. Since the irritability subscale items of the Aberrant Behavior Checklist (ABC; Aman, Singh, Stewart, & Field, 1985) encompass most of the targets of treatment in the first RUPP RCT, that subscale was selected as one of the primary outcome measures (rather than the whole ABC). The second primary outcome measure will be the blinded Clinical Global Impression-Improvement score (CGI-I; Psychopharmacology Bulletin, 1985), based on careful examination of all areas of possible improvement. A similar strategy could be used for other drugs targeting other symptoms. To capture repetitive behavior, the compulsion scale from the Children's Yale-Brown Obsessive-Compulsive Scale (CY-BOCS; Scahill et al., 1997) will be a secondary outcome measure. Another secondary outcome measure will be a blind rating of one or two recorded target symptoms individually selected by the parent/guardian and behaviorally quantified at major assessments.
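
Table I summarizes the two-pronged definition of response: a rating of much improved on the CGI-I by the blinded clinician together with a 25% improvement on the parent-rated ABC Irritability subscale. The following is a minimal sketch of that rule, not the network's actual procedure. It assumes that a CGI-I rating of very much improved (1) also satisfies the clinician prong and that "25% improvement" means a 25% reduction from the baseline score. The names `Assessment`, `percent_improvement`, and `is_responder` are hypothetical.

```python
from dataclasses import dataclass

# Standard 7-point CGI-I anchors: 1 = very much improved, 2 = much improved,
# 3 = minimally improved, 4 = no change, 5-7 = progressively worse.
CGI_I_MUCH_IMPROVED = 2


@dataclass
class Assessment:
    """Hypothetical record of one subject's scores (field names are illustrative)."""
    abc_irritability_baseline: float  # parent-rated ABC Irritability at baseline
    abc_irritability_endpoint: float  # parent-rated ABC Irritability at endpoint
    cgi_i: int                        # blinded clinician's CGI-I rating, 1..7


def percent_improvement(baseline: float, endpoint: float) -> float:
    """Percent reduction from baseline; positive values mean fewer symptoms."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - endpoint) / baseline


def is_responder(a: Assessment, min_percent: float = 25.0) -> bool:
    """Both prongs must hold (assumed reading of Table I):
    CGI-I of much improved or better AND at least `min_percent`
    reduction on the ABC Irritability subscale."""
    clinician_prong = a.cgi_i <= CGI_I_MUCH_IMPROVED
    parent_prong = percent_improvement(
        a.abc_irritability_baseline, a.abc_irritability_endpoint
    ) >= min_percent
    return clinician_prong and parent_prong
```

Requiring agreement between two independent sources addresses the concern noted in Table I that a single rater could be wrong.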

ESTABLISHING CGI ANCHOR POINTS

A major issue was how the Severity of Illness scale should be coded on the CGI-S (Psychopharmacology Bulletin, 1985), which usually has normality as the first anchor (score of 1), constituting the goal of treatment in many areas of psychiatry. For example, a good response to antidepressant medication in a patient with major depressive disorder may be defined by normalization in mood, work performance, sleep pattern, appetite, and activity level along with abatement of morbid ideation and feelings of guilt or worthlessness. In contrast to this situation, we do not usually anticipate that the study drug will alter most core features of autism. In most RCTs in autism, the goal is to reduce common secondary accompaniments of autism, namely, highly charged emotional and disruptive behaviors. On the one hand, uncomplicated autism (even without accompanying secondary behavioral or emotional problems) is a major neuropsychiatric disorder that often results in lifelong handicap and could merit an extremely severe CGI-S rating in its own right. On the other hand, RCTs in autism usually target secondary behaviors that frequently accompany autism (e.g., irritability, rapid mood changes, anxiety, aggression, agitation, tantrums, SIB) as well as compulsive rituals, which could be considered markers of severity. These symptoms allow considerable room for improvement even if the core pathology of autism is unaltered by the treatment. If Severity of Illness were assigned in full consideration of the core autistic symptoms, it would leave a very small range on the 7-point CGI-S to show clinically meaningful treatment changes in the target secondary symptoms. Yet the core symptoms of communication, social relatedness, and so forth, should not be entirely ignored, if for no other reason than that they might coincidentally benefit from treatment. The Autism RUPP solution to this was to assign a CGI-S score of 3 (mildly ill) to children and adolescents with pure autism.

In this context, pure autism was characterized as the core features of autism (impairments in social relatedness, communication, and fantasy/imagination, with restricted, repetitive, stereotyped behavior and interests) but none of the secondary symptoms or comorbidity that often complicate management. By adopting this admittedly arbitrary approach, we were able to reserve the four highest scores on the CGI Severity dimension, ranging from 4 (moderately ill) to 7 (among the most extremely ill), for entry into the study and for reflecting change with treatment, while providing a reasonable anchor point for totally uncomplicated autism. In a reliability test, a RUPP member (F.V.) with extensive clinical experience in autism prepared vignettes describing real patients seen clinically and rated these patients on both the CGI-S and CGI-I. These were scored independently by a second RUPP member (M.G.A.) to establish adequate agreement on a gold standard before distribution to all clinical assessors. To be considered qualified to participate in the project, clinicians had to score within 1 unit of the gold standard on all four ratings in two vignettes (i.e., two CGI-S ratings and two CGI-I ratings).

ADAPTATION OF CY-BOCS TO MEASURE A CORE SYMPTOM

Frequent and prominent in autism are repetitive behaviors and preoccupations, in some ways reminiscent of compulsive rituals and obsessions. Unlike classical adult and adolescent obsessive-compulsive disorder (OCD), in which symptoms are ego-dystonic, but like some young-child OCD, the rituals and preoccupations do not seem to bother the patient with autism. Nonetheless, the rituals and preoccupations can be severely impairing through time consumption and irritable reactions to interruption. This is one of the few core autistic features that may be drug-responsive. To evaluate repetitive behavior as a secondary outcome, we adapted the compulsion scale of the CY-BOCS (Scahill et al., 1997). The adaptation expanded the symptom checklist to include repetitive behaviors associated with autism, such as spinning objects, staring, twirling, and repetition of words and phrases. The probes for resistance and control were adapted to the parent as informant. Reliability was established through training sessions and co-rating of videotaped interviews. Each clinical rater had to score within 15% of the gold standard established by an experienced rater (L.S.). For example, if the gold-standard rating total for a videotaped interview was 14, the other raters had to rate the same tape between 12 and 16 to qualify as a CY-BOCS rater; if the gold standard was 10, they had to rate between 9 and 11. Anyone who failed this test underwent further training and retook the test on a different tape until qualified.
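
The two rater-qualification rules described above (within 1 unit of the gold standard on all four CGI vignette ratings; within 15% of the gold-standard CY-BOCS total) can be sketched as follows. This is an illustrative reading, not the network's documented procedure: the text gives only the 15% figure and two worked examples, so the rule that whole-number totals must fall inside the 15% band is an assumption, chosen because it reproduces both examples (14 gives 12 to 16; 10 gives 9 to 11). All function names are hypothetical.

```python
from typing import Sequence, Tuple


def qualifies_on_cgi(ratings: Sequence[int], gold: Sequence[int]) -> bool:
    """True if every rating is within 1 unit of its gold-standard rating.

    In the text this covers four ratings: CGI-S and CGI-I on two vignettes.
    """
    if len(ratings) != len(gold):
        raise ValueError("ratings and gold must have the same length")
    return all(abs(r - g) <= 1 for r, g in zip(ratings, gold))


def cybocs_allowed_range(gold_total: int, tolerance_percent: int = 15) -> Tuple[int, int]:
    """Lowest and highest whole-number totals inside the tolerance band.

    Integer arithmetic avoids floating-point edge cases (assumed rounding rule:
    round the lower bound up and the upper bound down to whole numbers).
    """
    low = -(-gold_total * (100 - tolerance_percent) // 100)  # ceiling division
    high = gold_total * (100 + tolerance_percent) // 100     # floor division
    return low, high


def qualifies_on_cybocs(rating_total: int, gold_total: int) -> bool:
    """True if the rater's total falls inside the allowed band."""
    low, high = cybocs_allowed_range(gold_total)
    return low <= rating_total <= high
```

Under this reading, `cybocs_allowed_range(14)` gives `(12, 16)` and `cybocs_allowed_range(10)` gives `(9, 11)`, matching the two examples in the text.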
CAREFUL INTERVIEWING OF PARENTS FOR SECONDARY OUTCOME MEASURES

Since the patient is often not able to provide information about symptoms or side effects, it becomes especially important to ensure the accuracy of reporting by the available informants, including parent or guardian. Several strategies were adopted to facilitate this, including these two:

1. The Irritability scale of the ABC, filled out by the parent or guardian, is one of two main outcome measures. As a check on its accuracy, the clinician interviews the parent with the parent-filled form in view, asking for examples or other clarification of the items. The clinician does not change the parent's original marks (the scale was standardized on unretouched caregiver scores), but fills out a duplicate Clinician Irritability Scale based on the careful clinical elaboration. This companion clinician-rated Irritability Scale is a secondary outcome measure.
2. To characterize individualized target symptoms, the clinician asks the parent "What are the one or two things you are most concerned about?" and then helps the parent with exploratory questions to define the behavior qualitatively and quantitatively. Examples of quantification might be "10 hours a day," "every half-hour," "5 times a day," "5 blocks away," or "until he is completely soaked." At subsequent assessments, the parent is reminded of the target symptoms and asked to requantify them. These descriptions are then blindly rated by another clinician at the end of the study as a secondary outcome measure.

RELIABILITY OF DIAGNOSIS

The pleomorphic manifestations of pervasive developmental disorders in general and autistic disorder in particular have long subjected diagnosis to local drift, with emphasis on one or another feature as more essential. For a single-site RCT, this may not be critical as long as the sample is well described and the criteria used are explicated. However, it interferes with comparison of results across sites and becomes critical when multisite RCTs are undertaken. Only recently has significant progress been made in standardizing diagnosis suitably for cross-site studies.
The Autism Diagnostic Interview-Revised (ADI-R; Lord, Rutter, & LeCouteur, 1994) is a clinician-administered, semistructured instrument designed to aid in the diagnosis of children, adolescents, and adults in whom the possibility of autism or a related pervasive developmental disorder (PDD) is entertained. The ADI-R has been adapted from the original ADI (LeCouteur et al., 1989) such that it is closely linked to DSM-IV (American Psychiatric Association [APA], 1994) and ICD-10 (World Health Organization [WHO], 1993) diagnostic criteria. The standard (research or long) ADI-R consists of 111 items and usually takes 2–3 hours to administer, using the parent or other caregiver as primary informant. Although an abridged (short) version is available, it lacks reliability and validity data at this time. Therefore the RUPP network has adopted the standard version as the diagnostic screen even though it is costly in staff time for the lengthy administration and the rather arduous training in its use. The ADI-R follows a series of developmentally organized questions, and does not require major deviations from standard diagnostic assessment practices. Thus, it can be used to guide clinical interviews (usually the short form) without a major training commitment. However, in order to be certified in the use of the ADI-R for research purposes, investigators must undergo intensive training and attain reliability with an experienced rater. Workshops are offered at regular intervals at the University of Chicago and the Maudsley in London, where the instrument was originally developed. In certain circumstances, workshops can be arranged at alternative sites. For the RUPP network, a 3-day on-site training was organized at the Yale site in the spring of 1998. Following that workshop, each investigator had to achieve reliability to within 90% with two archived interviews and demonstrate on videotape to an expert rater the ability to elicit interview data. Results obtained for each of the ADI-R individual items are scored on a 0–3 scale (usually rating behaviors from absent to constantly present), and onset of target behaviors is coded to the nearest month. A 41-item algorithm is completed based on these results. The algorithm is divided into subscales corresponding to the three domains centrally affected in the PDDs: (a) Qualitative impairments in reciprocal social interaction; (b) Communication; and (c) Repetitive behaviors and stereotyped patterns.
Interrater reliability on individual and overall ADI-R scores has been excellent (Lord et al., 1997), and cutoff values for the diagnosis of autistic disorder (autism) have been established for each subscale and overall score. Cutoffs for PDDs other than autism have not yet been established, contributing to the observation that the ADI-R is less useful at distinguishing atypical cases of PDD, or certain subtypes of PDD (e.g., autism vs. Asperger syndrome), than in differentiating autism from non-PDD cases (Mahoney et al., 1998). The ADI-R has excellent internal consistency within its three domains (Lord et al., 1997) and is a good instrument to differentiate mentally retarded individuals with and without autism. However, it may be overinclusive and less able to differentiate these groups in children with an overall mental age below 18 months or IQ below 20 (Lord, 1995). The cutoff mental age of 18 months selected as an exclusionary criterion by the RUPP network is partly based on this floor effect of the ADI-R. Based on its clear links to DSM-IV and ICD-10 and to the domains affected in autism, the ADI-R is an instrument well suited to confirm the initial clinical diagnosis of autistic disorder. However, its inability to reflect change makes it unsuitable as an outcome measure. Therefore the ADI-R is used solely for screening diagnostic purposes. Of course, entry also requires clinical diagnosis of autistic disorder by DSM-IV criteria.

105 sented visually (with flip cards showing the stimuli). Past experience in the Ohio State University laboratory has shown that this adaptation is helpful to children experiencing difficulty in remembering the stimuli. The Spatial Learning Test, sometimes called the Dot Test, is interesting because of its simplicity and because in studies with schizophrenic patients, it documented that risperidone (the drug to be used in the first autism RUPP RCT) caused enhanced performance (Keefe, Lees-Roitman, & Dupre, 1997). The subjects task is to remember the positions of the dots on sheets of paper and to mark them on blank pages. The blank pages are presented simultaneous with or 10 seconds after each stimulus. In the standard adult version, subjects are asked to read lists of words between the stimulus sheets and test sheets to prevent rehearsal. For children with autism, we eliminated the word lists and substituted computer clip art of common objects, which subjects are asked to name during the delay interval. The test of attention span is the Cancellation Task (Aman & Turbott, 1986; Barkley, 1991; Geldmacher, 1998). It uses five geometric stimuli (squares, triangles, stars, diamonds, circles) randomly positioned in rows on the test forms. The task is for the subjects to place a slash through as many of the squares as possible in the allotted time. Errors of omission (missed squares), errors of commission (incorrectly canceled figures), and total number correctly canceled serve as the dependent variables. By working with common geometric figures, we have been able to devise a test that is appropriate to a large percentage of our subjects with autism. Work in children with ADHD has shown the Cancellation Task to be among the most ecologically valid and sensitive measures of drug effects (Barkley, 1991). To assess eyehand coordination, we are using the standard Purdue Pegboard and test materials (Lezak, 1995) but with modified instructions. 
Pilot work with children having a variety of handicaps revealed that such youngsters could not correctly position many pins in the allotted time period. We extended the time by 33% (from 30 to 40 seconds). The number of pins correctly placed and the number of drops are recorded. The extension of time provides for larger and more variable scores, and it is less likely to prove frustrating for subjects. The test of persistence is the Analogue Classroom Task. It uses mathematics problems because these can easily be scored as correct or not. We generated a large pool of problems that range in difficulty level from preschool level to the sixth-grade level. The easiest problems simply assess number concepts. On each assessment, subjects are presented with more problems than can be completed in the allotted time. The outcome

COGNITIVE MEASURES One of the key handicaps in autistic disorder is impaired cognitive functioning. Yet, cognitive functioning is one of the more difficult domains to assess. Children with autism are often difficult to engage. Given the target symptoms (aggression, hyperactivity, and related problems) for the first RUPP study, we anticipated lower functioning subjects, for whom the usual standardized neuropsychological/cognitive tests are too difficult. M.G.A., R.A., and A.K. reviewed the relevant domains, available cognitive batteries, and the study population, and identified the following cognitive-motor domains as of high relevance for the study: (a) attention span, (b) memory (both verbal and spatial), (c) eyehand coordination, and (d) persistence on academic tasks. Initially we examined several computerized systems, such as the CANTAB battery (Fray, Robbins, & Sahakian, 1996; Robbins & Sahakian, 1994) and the Iron Psyche battery developed in the Netherlands for subjects with epilepsy (Alpherts & Aldenkamp [online]; Alpherts & Aldenkamp, 1990). However, these batteries did not seem easy enough to enable testing a majority of the subjects in the RUPP project. Eventually we chose a variety of paper-and-pencil tests (Table III). Because they explore poorly charted territory, their reliability and validity across age and IQ remain largely undetermined; we expect that one of the secondary outcomes of this study will be data on internal consistency and testretest reliability. They are all timed tests, with the times indicated in Table III. The modified California Verbal Learning Task (MCVLT) is based on the California Verbal Learning Task (Delis, Kramer, Kaplan, & Ober, 1994), but it uses simple stimuli that appear at the earliest age levels within the Peabody Picture Vocabulary Test (Dunn & Dunn, 1981). We simplified the task so that there are five short-delay free recall trials, one long-delay free recall trial, and one recognition memory trial. 
Stimuli can be presented orally but if necessary, can also be pre-

Table III. Cognitive/Neuropsychological Test Battery for the Autism RUPP RCT

Measure | Average minutes (range) | Function assessed | Source
Modified CA Verbal Learning Task | 11 (7–14) | Verbal memory | Custom developed for this and related studies, adapted from Delis et al., 1994
Spatial Learning Task | 5 (4–7) | Spatial memory | Keefe et al., 1997
Purdue Pegboard | 5 (4.75–7.5) | Hand steadiness | Lezak, 1995
Analogue Classroom Task | 8 (7.5–10) | Persistence/motivation | Handen et al., 1990; Douglas et al., 1986; Pelham et al., 1985
Cancellation Task | 6 (6–9) | Attention & distractibility | Dykman, R. A., 1985

measures are the number of problems attempted and the number correctly solved. The intent is not to assess mathematics ability or achievement, but to see how well subjects persist with a boring task. Once again, this is a task that has proven to be very sensitive to treatment effects in the ADHD literature (Douglas et al., 1986; Handen et al., 1990; Pelham et al., 1985). The challenge was to develop a broad enough range of difficulty to encompass all subjects.
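The two outcome measures of the Analogue Classroom Task can be illustrated with a minimal scoring sketch. The record layout (`ProblemAttempt`) and the function name are hypothetical and are not part of the study materials; the sketch assumes only that each attempted problem has a given answer and a known correct answer.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ProblemAttempt:
    """One mathematics problem the subject attempted (hypothetical layout)."""
    given_answer: int
    correct_answer: int


def score_analogue_classroom(attempts: List[ProblemAttempt]) -> Tuple[int, int]:
    """Return (number attempted, number correctly solved) for one timed session."""
    attempted = len(attempts)
    correct = sum(1 for a in attempts if a.given_answer == a.correct_answer)
    return attempted, correct


# Illustrative session: three problems attempted, two answered correctly.
session = [ProblemAttempt(4, 4), ProblemAttempt(7, 8), ProblemAttempt(3, 3)]
print(score_analogue_classroom(session))
```

Because more problems are presented than can be completed in the allotted time, the number attempted reflects persistence rather than a ceiling on the worksheet.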

IQ ASSESSMENT

There were two goals in assessing IQ. First, we wanted to describe the sample accurately so that readers can identify the population to whom the findings can be generalized. More important, we wanted to conduct exploratory analyses to see whether IQ or mental age is a useful predictor of clinical response to the study drug. These goals proved more difficult to attain than originally anticipated.

One of the great ironies of the IQ assessment industry is that test developers have largely abandoned the population with low IQs. Several of the most popular IQ tests have floors that exclude many individuals with mental retardation. For example, the Wechsler Intelligence Scale for Children-Third Edition (WISC-III; Wechsler, 1991) has a floor of 45, and the Stanford-Binet Intelligence Scale-4th Edition (Terman & Merril, 1982) has a floor of 36. Moreover, these scales lose discriminative power as these lower limits are approached. IQ test developers appear to be focusing their efforts on school administrative assignments. These tests are excellent at differentiating gifted students from average-ability students, and average students from those with borderline IQ and mild mental retardation, but not good at differentiating moderate, severe, and profound mental retardation. We believe these measurement issues are important to pediatric psychopharmacological research because IQ level and mental age may be predictors of drug response. For example, Aman, Marks, Turbott, Wilsher, and Werry (1991) found that ADHD children with IQs less than 45 and mental ages less than 4.5 years were less likely to respond to methylphenidate than higher functioning children.

Our solution was to adopt a hierarchy of three tests, depending on the child's ability level. The WISC-III (Wechsler, 1991) is to be used preferentially when possible. Children for whom the WISC-III is unfeasible will try the Leiter International Test of Intelligence-Revised (Leiter-R; Roid & Miller, 1997). Finally, if the Leiter-R proves unfeasible, we use the Mullen Scales of Early Learning (Mullen, 1995). Though achieving the goal of describing our sample's intellectual level, this strategy has drawbacks. First, these tests have different standard deviations, which means that identical scores are not strictly comparable. Second, not all of these tests provide a mental age, a measure that may prove helpful for a variety of exploratory analyses. A third problem is discontinuity across tests in the constructs taken to constitute intelligence. For example, the WISC-III (age range 6 years 0 months to 16 years 11 months) assesses Verbal IQ, Performance IQ, and Full Scale IQ. The WISC-III also contains subtests that measure perceptual organization, freedom from distractibility, and processing speed. In contrast, the Leiter-R (age range 2 years 0 months to 20 years 11 months) emphasizes performance IQ and contains subtests that measure abilities such as visualization, attention, and memory. This strategy for measuring IQ is practical and perhaps the only one currently possible, but because of differences in concept and standard deviations it is not the ideal resolution of the problem. A more adequate and comprehensive IQ test is needed.
In some respects, the old Stanford-Binet Intelligence Scale, Form L-M (Terman & Merril, 1982) was an excellent example of what the developmental disabilities field needs now, in that it assessed all levels of functioning (from profound mental retardation through gifted) and children across a very wide age range (2 years through 18 years). Unfortunately, the norms are now out of date, with current administrations scoring about 5 points too high. Other good intelligence tests considered, with parenthetical reasons for not adopting each, included the Kaufman Assessment Battery for Children (Kaufman & Kaufman, 1983; the K-ABC General Cognitive Index is not strictly comparable to an IQ), the Differential Ability Scale (does not provide a well-recognized measure for deriving an IQ), and the current Stanford-Binet Intelligence Scale (tends to be verbally loaded). Because tests heavily loaded with verbal items are not well adapted to the autistic population, the main weakness of the Leiter, its paucity of verbal items, actually becomes a strength for testing this population. We might hope that the current market vacuum will lure one or more test developers to address this neglected population. Otherwise, there may be a need for research agencies to designate this a priority area.
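The three-test hierarchy described in this section, and the comparability problem created by differing standard deviations, can be sketched as follows. This is an illustrative sketch only: the `is_feasible` callback stands in for the examiner's judgment that the child can give valid responses, and the means and standard deviations in the usage lines are hypothetical values chosen for the arithmetic, not the published norms of any test.

```python
from typing import Callable, Optional

# Order of preference described in the text.
TEST_HIERARCHY = ["WISC-III", "Leiter-R", "Mullen"]


def choose_iq_test(is_feasible: Callable[[str], bool]) -> Optional[str]:
    """Return the first test in the hierarchy that is feasible for the child.

    `is_feasible` is a hypothetical stand-in for the examiner's judgment
    that the child gives valid responses on the named test.
    """
    for test in TEST_HIERARCHY:
        if is_feasible(test):
            return test
    return None  # no test in the hierarchy was feasible


def standard_score_to_z(score: float, mean: float, sd: float) -> float:
    """Express a standard score in SD units so that scores from tests
    with different standard deviations can be placed on one metric."""
    return (score - mean) / sd


# A child who cannot complete the WISC-III but can complete the Leiter-R.
print(choose_iq_test(lambda test: test != "WISC-III"))

# Hypothetical norms: the same distance below the mean (2 SD) corresponds
# to different raw standard scores when the SDs differ.
print(standard_score_to_z(70, mean=100, sd=15))
print(standard_score_to_z(68, mean=100, sd=16))
```

Converting to SD units addresses only the first drawback noted above; it does not supply a mental age or reconcile the different constructs the tests measure.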

DEFINITION OF CLINICAL RESPONDER

To the best of our knowledge, there has been remarkably little discussion of what, in general, constitutes a clinical response in psychopharmacology. Should the patient be cured (i.e., asymptomatic), or should clinical response be defined as an improvement marked by a certain percentage decline in symptom scores or percentage improvement in function? Because different investigators have adopted different standards of what constitutes a positive response, it is difficult to determine across studies what is the true rate of clinical response. For example, Aman (1996), studying children with both mental retardation and ADHD, suggested that true responders should show at least a 35% decrease in ADHD symptoms relative to a placebo comparison, but other investigators use different percentage decreases or some global clinical judgment. Research teams appear to be adopting idiosyncratic definitions of clinical response.

Many teams have used the CGI scale (Psychopharmacology Bulletin, 1985) as the main outcome measure. Despite its common use, it has not been well studied. A clinical responder is often defined as any subject who is rated 1 or 2 on the global improvement item of the CGI (CGI-I), that is, "very much improved" or "much improved," respectively. Although helpful in allowing generalization across studies, this approach leaves room for uncertainty. For example, it can be difficult to know what a given clinician is weighing when making such a judgment. Is it the subject's or parent's feedback about symptomatology, the lack of side effects, behavior manifested in the clinic, external reports (e.g., teacher reports), or all of the above? Is equal emphasis given to these various sources, or is the clinician giving particular emphasis to certain sources of information?

The Autism RUPP Network chose a two-prong strategy for defining clinical responders: the child must receive a rating of 1 or 2 on the CGI-I, and the child must show a 25% reduction, as rated by the parent, on the Irritability subscale of the ABC (Aman et al., 1985). (In order to qualify for the study, children must be rated moderately high on the Irritability subscale at screening.) This two-prong definition has certain advantages over the use of a single criterion. First, it defines the main domain of behavior being used to gauge clinical response. Clinicians are asked to rate global improvement, but some direction is given regarding the main target behaviors. Second, the two-prong strategy protects against placebo/expectancy effects, practice effects, or regression to the mean. When subjects are chosen for being extreme on a given dimension (in this case, the Irritability subscale), it is common to see spontaneous decreases over time (regression to the mean) that are unrelated to a true drug effect. The provision that the clinician also believes that the child is much or very much improved guards against classifying subjects as responders on the basis of statistical artifact, while still using the objective quantification of a multi-item rating scale.

DIRECT OBSERVATIONS

One way of compensating for a patient's poor ability to communicate is direct observation, which has a long tradition in autism treatment research (Feldman, Kolmen, & Gonzaga, 1999; Freeman, Ritvo, Yokota, & Ritvo, 1985b). Observational instruments such as the Ritvo-Freeman Real-Life Rating Scale (Freeman et al., 1985a, 1985b) can be used to measure the core symptoms of autism. Although it has not been used extensively in clinical trials, it has successfully measured improvement in behavioral symptoms (McDougle et al., 1996, 1998). Therefore we planned to use the Ritvo-Freeman as a secondary outcome measure. However, even with extensive and repeated training and standardization of the probes during the observation, we were unable to attain acceptable cross-site reliability. Then we considered videotaping the half-hour Ritvo-Freeman observation protocol for later coding/rating by reliable raters. However, the validity of videotaping live examinations as a way of recording a representative sample of the patient's behavior seemed questionable. Sanchez et al. (1995) found no concordance between live and videotape ratings of the same examinations by trained raters. Treatment effects of haloperidol were detected in the live ratings but not in the videotape ratings, suggesting that critical information present in the live exam is lost in the recording. Therefore we reluctantly gave up the direct observations for this study but have been working on a more reliable system for future studies. Meanwhile, we are using the Ritvo-Freeman scale (in adapted format) as a parent rating scale supplementing other parent scales, and are using direct observation/exam to enhance side-effect assessment.

DETECTION OF SIDE EFFECTS

Because of their impaired communicative abilities, subjects with autism may experience adverse effects of the study medication but not be able to convey this information to their parents or the researchers. Yet a better understanding of potential side effects is one of the study aims. Three strategies address this problem: (a) The parents will be systematically asked about possible adverse effects they may have noticed or suspected. (b) Structured side-effect instruments for direct observation/examination, the Abnormal Involuntary Movement Scale (AIMS; Rapoport, Connors, & Reatig, 1985) and the Simpson-Angus scale (Simpson & Angus, 1970), will be used regularly, in addition to monitoring of vital signs, height, and weight. (c) Thorough laboratory testing for possible metabolic or cardiovascular problems will be performed at the beginning and end of the protocol (or earlier if the subject terminates early). Subjects taking an anticonvulsant or showing signs indicating a possible medical complication will have additional laboratory monitoring.
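The two-prong responder definition given under DEFINITION OF CLINICAL RESPONDER (a CGI-I rating of 1 or 2 together with a 25% reduction in the parent-rated ABC Irritability score) can be written as a short rule. This is a minimal sketch, not the Network's analysis code: the function and parameter names are hypothetical, and treating the 25% criterion as "at least 25%" is an assumption about how the threshold is applied.

```python
def percent_reduction(baseline: float, endpoint: float) -> float:
    """Percentage decrease from baseline (positive values mean improvement)."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - endpoint) / baseline


def is_responder(cgi_i: int,
                 irritability_baseline: float,
                 irritability_endpoint: float,
                 min_reduction_pct: float = 25.0) -> bool:
    """Two-prong rule: both the global and the scalar criterion must be met.

    Assumption: a reduction of exactly 25% counts as meeting the criterion.
    """
    global_ok = cgi_i in (1, 2)  # very much improved or much improved
    scalar_ok = percent_reduction(
        irritability_baseline, irritability_endpoint) >= min_reduction_pct
    return global_ok and scalar_ok


# Illustrative scores only (not study data).
print(is_responder(cgi_i=2, irritability_baseline=28, irritability_endpoint=20))
print(is_responder(cgi_i=2, irritability_baseline=28, irritability_endpoint=24))
print(is_responder(cgi_i=3, irritability_baseline=28, irritability_endpoint=10))
```

Requiring both prongs means that a large drop on the rating scale alone (for example, through regression to the mean) or a favorable global impression alone is not enough to classify a subject as a responder.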
Laboratory tests will include prolactin (known to be elevated by antipsychotics and important for growth regulation and feedback to other neuroendocrine circuits); leptin (possibly related to excessive weight gain); and genotyping of polymorphisms in cytochrome P450 2D6 and 2C19 and in the 5HT2A and 5HT2C receptor subtypes (to detect slow metabolism as a cause of side effects, fast metabolism as a cause of nonresponse, and receptor variation as a cause of response differences).

DRUG-SPECIFIC ASSESSMENT ISSUES

The agent chosen for the first Autism RUPP trial is risperidone (McDougle et al., 2000). Adult clinical trials of newer atypical antipsychotic drugs (clozapine, risperidone, olanzapine, quetiapine, and ziprasidone) suggest a lower liability for acute extrapyramidal symptoms (EPS) than that associated with conventional antipsychotics such as haloperidol. The relative liability of each of the newer agents to cause EPS is not known because they have not been available long enough for direct comparative studies, particularly among children and adolescents. Evidence is accumulating that patients exhibiting acute EPS are at greater risk of developing tardive dyskinesia (TD), which raises the hope that the atypical agents, with fewer EPS, may also be associated with less TD in the longer term (Barnes & McPhillips, 1998). Although TD associated with risperidone and other newer agents has been described, its frequency is not known. Further, it is not known whether the neurological vulnerability of autism would make such patients more likely to develop EPS or TD (or seizures) with the newer agents. Thus it is important to document carefully the neurological effects of risperidone in this special population. Two complementary scales, the AIMS (Rapoport et al., 1985) and the Simpson-Angus (Simpson & Angus, 1970), will be used to monitor neurological side effects, in addition to interviews of caregivers.

Though the risks of EPS and TD remain undetermined, other side effects have been more frequently noted. Weight gain in particular has emerged as a major limitation to long-term tolerability of the newer antipsychotics. Clozapine-induced weight gain is so common and predictable that it has been associated with treatment responsiveness in adults with psychotic disorders (Leadbetter et al., 1992). Overall, atypical antipsychotics have been described as producing significantly greater weight gain in adults than traditional agents (Allison et al., 1998). Little is known about differences in weight gain liability between adult and pediatric populations, between different diagnostic groups, or among the various newer agents. Available data and clinical experience suggest important developmental and age differences.
For example, in a pooled sample of 424 adults with a variety of diagnoses treated for at least 1 year, risperidone led to an average weight gain of 3.3 kg (Amery et al., 1997), as compared to weight gains of 8.6 kg during a 6-month follow-up among 18 adolescents in a residential treatment setting (Kelly, Conley, Love, Shorn, & Uschak, 1998), or 10–20 kg weight gains in two open studies of autistic children and adolescents treated with risperidone for as short as 2–6 months (McDougle et al., 1997; Perry, Pataki, Munoz-Silva, Armenteros, & Silva, 1997). In addition, a case study of children with schizophrenia treated with risperidone described two males who developed liver dysfunction putatively associated with accelerated weight gain (Kumra, Herion, Jacobsen, Briguglia, & Grothe, 1997). Of course, depending on age/stage of growth, part of the weight gain could be explained by normal maturation, but the magnitude of gain noted seems to exceed normal growth expectations. The problem is especially complicated in children with autism because of peculiar and erratic eating habits associated with the disorder even when unmedicated.

Mechanisms underlying neuroleptic-associated weight gain and appetite dysregulation are poorly understood. An exponential relationship between agents' H1 receptor affinity and their weight gain liability has been described (Wirshing, Marder, Goldstein, & Wirshing, 1997). No similar relationship was found with 5HT2C receptor affinity, an important negative observation given the interesting finding of marked weight gain, appetite dysregulation, and seizures in mice lacking the same receptor (Tecott et al., 1995). Others have suggested alpha-1 antagonism as the underlying effect leading to weight gain (Benvenga & Leander, 1997). Given the likely proliferation of atypical antipsychotic use in pediatric, especially developmentally disabled, populations, careful study of these agents' tendency to induce weight gain is warranted. Of particular interest are differences between agents and the clinical, demographic, developmental, and laboratory predictors of weight gain. Elevations in leptin levels, for example, previously described in association with clozapine treatment (Bromel et al., 1998), may prove to be a useful biological marker predicting longer term weight gain. Therefore, in addition to careful monitoring of weight and dietary intake, blood leptin levels will be checked.
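The adult and adolescent weight gain figures cited in this section cover different durations, so a rough per-month rate makes the contrast easier to see. The sketch below is a crude illustration only: it assumes that "at least 1 year" can be taken as 12 months (which gives an upper bound on the adult monthly rate), it treats gain as linear over time, and it ignores differences in sample, diagnosis, and normal growth.

```python
def monthly_gain_kg(total_gain_kg: float, months: float) -> float:
    """Average weight gain per month, assuming gain is linear over the period."""
    return total_gain_kg / months


# Adults: 3.3 kg over at least 1 year; using 12 months gives an upper bound.
adult_rate = monthly_gain_kg(3.3, 12)

# Adolescents in residential treatment: 8.6 kg over a 6-month follow-up.
adolescent_rate = monthly_gain_kg(8.6, 6)

print(f"adults:      <= {adult_rate:.2f} kg/month")
print(f"adolescents:    {adolescent_rate:.2f} kg/month")
print(f"ratio:       >= {adolescent_rate / adult_rate:.1f}")
```

Under these assumptions the adolescent rate is several times the adult rate, which is consistent with the developmental differences suggested in the text but does not by itself separate drug effect from normal maturation.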

SUMMARY/CONCLUSION

Designing the first RUPP Autism Network RCT generated many adaptations and other solutions for assessment challenges. The heterogeneity of symptoms/complaints was addressed by a combination of scalar and global primary outcome measures, with the scale (Irritability Scale of the ABC) chosen to include items reflecting the range of target symptoms. CGI anchor points were adapted to the fact that the basic, severely disabling core pathology is not the main target of treatment, but may be incidentally helped. The important clinical characterization as a responder or nonresponder was addressed by requiring both scalar (25% reduction on Irritability Scale) and global (CGI-I of 1 or 2) improvement to be counted a responder. A standard scale (CY-BOCS) was adapted to measure the one core symptom (stereotypy/rituals) likely to be treatment responsive. The diagnostic screening challenge was addressed by undertaking the arduous training necessary for administration of the ADI-R. In the absence of cognitive tests practical for use in children with autism, a battery of five timed paper-and-pencil tests was developed by adapting existing technology established in related fields. The inappropriateness of current intelligence tests for the full range of IQs expected was resolved by an algorithm of test choice based on the subject's ability or inability to give valid responses, moving from the WISC-III to the Leiter-Revised to the Mullen Scale as needed. The poor ability of the subject to report side effects was addressed by thorough examination, including standardized instruments, systematic parent interviews, and thorough laboratory testing. Insofar as the assessment challenges of this RCT of risperidone in autism are shared with the larger field of child/adolescent psychopharmacology in general, the solutions described here may be applicable also to RCTs of other drugs and other disorders.

REFERENCES

Alpherts, W., & Aldenkamp, B. (1997, December 4). FePsy, a neuropsychological computerized battery [On line]. Available: http://www.euronet.nl/users/fepsy.
Alpherts, W., & Aldenkamp, A. P. (1990). Computerized neuropsychological assessment of cognitive functioning in children with epilepsy. Epilepsia, 31(Suppl. 4), S35–S40.
Allison, D., Mentore, J., Heo, M., Weiden, P., Cappelleri, J., & Chandler, L. (1998). Weight gain associated with conventional and newer antipsychotics (poster). Scientific Proceedings of the American Psychiatric Association's Annual Meeting, Toronto, Ontario, NR 497.
Aman, M. G. (1996). Stimulant drugs in the developmental disabilities revisited. Journal of Developmental and Physical Disabilities, 8, 347–365.
Aman, M. G., Marks, R. E., Turbott, S. H., Wilsher, C. P., & Werry, S. N. (1991). Methylphenidate and thioridazine in the treatment of intellectually subaverage children: Effects on cognitive-motor performance. Journal of the American Academy of Child and Adolescent Psychiatry, 30, 816–824.
Aman, M. G., Singh, N. N., Stewart, A. W., & Field, C. J. (1985). The Aberrant Behavior Checklist: A behavior rating scale for the assessment of treatment effects. American Journal of Mental Deficiency, 89, 485–491.
Aman, M. G., & Turbott, S. H. (1986). Incidental learning, distraction, and sustained attention in hyperactive and control subjects. Journal of Abnormal Child Psychology, 14, 441–445.
American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
Amery, W., Zuiderwijk, P., Brecher, M., Lemmens, P., & Van Baelen, B. (1997). Safety profile of risperidone. Scientific Abstracts of the American College of Neuropsychopharmacology 36th Annual Meeting, Waikoloa, HI, p. 188.
Anderson, G. M., Freedman, D. X., Cohen, D. J., Volkmar, F. R., Hoder, E. L., & McPhedran, P. (1987). Whole blood serotonin in autistic and normal subjects. Journal of Child Psychology and Psychiatry, 28, 885–900.
Ando, H., & Yoshimura, I. (1979). Effects of age on communication skill levels and prevalence of maladaptive behaviors in autistic

and mentally retarded children. Journal of Autism and Developmental Disorders, 9, 83–93.
Bailey, A., Bolton, P., Butler, L., Le Couteur, A., Murphy, M., Scott, S., Webb, T., & Rutter, M. (1993). Prevalence of the fragile X anomaly amongst autistic twins and singletons. Journal of Child Psychology and Psychiatry, 34, 673–688.
Bailey, A., Le Couteur, A., Gottesman, I., & Bolton, P. (1995). Autism as a strongly genetic disorder: Evidence from a British twin study. Psychological Medicine, 25, 63–77.
Barkley, R. A. (1991). The ecological validity of laboratory and analogue assessment methods of ADHD symptoms. Journal of Abnormal Child Psychology, 19, 149–178.
Barnes, T. R., & McPhillips, M. A. (1998). Novel antipsychotics, extrapyramidal side effects and tardive dyskinesia. International Clinical Psychopharmacology, 13(Suppl. 3), S49–57.
Benvenga, M. J., & Leander, J. D. (1997). Increased food consumption by clozapine, but not by olanzapine, in satiated rats. Drug Development Research, 41, 48–50.
Bromel, T., Blum, W. F., Ziegler, A., et al. (1998). Serum leptin levels increase rapidly after initiation of clozapine therapy. Molecular Psychiatry, 3(1), 76–80.
Campbell, M., & Palij, M. (1985). Behavioral and cognitive measures used in psychopharmacological studies of infantile autism. Psychopharmacology Bulletin, 21, 1047–1053.
Campbell, M., Anderson, L. T., Small, A. M., Adams, P., Gonzales, N. M., & Ernst, M. (1993). Naltrexone in autistic children: Behavioral symptoms and attentional learning. Journal of the American Academy of Child and Adolescent Psychiatry, 32, 1283–1291.
Delis, C., Kramer, J., Kaplan, E., & Ober, B. (1994). California verbal learning test manual. San Antonio, TX: The Psychological Corp.
Douglas, V. I., Barr, R. G., O'Neill, M. E., et al. (1986). Short term effects of methylphenidate on the cognition, learning and academic performance of children with attention deficit disorder in the laboratory and the classroom. Journal of Child Psychology and Psychiatry, 27, 191–211.
Dunn, L. M., & Dunn, L. M. (1981). Peabody Picture Vocabulary Test-Revised. Circle Pines, MN: American Guidance Service.
Feldman, H. M., Kolmen, B. K., & Gonzaga, A. M. (1999). Naltrexone and communication skills in young children with autism. Journal of the American Academy of Child and Adolescent Psychiatry, 38, 587–93.
Fisch, G. S. (1992). Is autism associated with the fragile X syndrome? American Journal of Medical Genetics, 43, 47–55.
Folstein, S., & Rutter, M. (1977). Infantile autism: A genetic study of 21 twin pairs. Journal of Child Psychology and Psychiatry, 18, 297–321.
Fray, P. J., Robbins, T. W., & Sahakian, B. J. (1996). Neuropsychiatric applications of CANTAB (University of Canterbury). International Journal of Geriatric Psychiatry, 11, 329–336.
Freeman, B. J., Ritvo, E. R., Needleman, R., & Yokota, A. (1985a). The stability of cognitive and linguistic parameters in autism: A 5-year study. Journal of the American Academy of Child and Adolescent Psychiatry, 24, 459–464.
Freeman, B. J., Ritvo, E. R., Yokota, A., & Ritvo, A. (1985b). A scale of rating symptoms of patients with the syndrome of autism in real life settings. Journal of the American Academy of Child and Adolescent Psychiatry, 25, 130–136.
Geldmacher, D. S. (1998). Stimulus characteristics determine processing approach on random array letter-cancellation tasks. Brain & Cognition, 36, 346–354.
Griffin, J. C., Williams, D. E., Stark, M. T., Altmeyer, B. K., & Mason, M. (1986). Self-injurious behavior: A state-wide prevalence survey of the extent and circumstances. Applied Research in Mental Retardation, 7, 105–116.
Handen, B. L., Breaux, A. M., Gosling, A., et al. (1990). Efficacy of ritalin among mentally retarded children with ADHD. Pediatrics, 86, 922–930.

Kaufman, A. S., & Kaufman, N. L. (1983). K-ABC: Kaufman Assessment Battery for Children. Circle Pines, MN: American Guidance Service.
Keefe, R. S. E., Lees-Roitman, S. E., & Dupre, R. L. (1997). Performance of patients with schizophrenia on a pen and paper visuospatial working memory task with short delay. Schizophrenia Research, 26(1), 9–14.
Kelly, D. L., Conley, R. R., Love, R. C., Shorn, D. S., & Uschak, C. M. (1998). Weight gain in adolescents treated with risperidone and conventional antipsychotics over six months. Journal of Child and Adolescent Psychopharmacology, 8(3), 151–159.
Kumra, S., Herion, D., Jacobsen, L., Briguglia, C., & Grothe, D. (1997). Case study: Risperidone-induced hepatotoxicity in pediatric patients. Journal of the American Academy of Child and Adolescent Psychiatry, 36, 701–705.
Leadbetter, R., Shutty, M., Pavalonis, D., Viewveg, V., Higgins, P., & Downs, M. (1992). Clozapine-induced weight gain: Prevalence and clinical incidence. American Journal of Psychiatry, 149, 68–72.
Le Couteur, A., Rutter, M., Lord, C., Rios, P., Robertson, S., Holdgrafer, & McLennan, J. (1989). Autism diagnostic interview: A standardized investigator-based instrument. Journal of Autism and Developmental Disorders, 19, 363–387.
Lezak, M. D. (1995). Neuropsychological assessment (3rd ed.). New York: Oxford University Press.
Lord, C. (1995). Follow-up of two-year-olds referred for possible autism. Journal of Child Psychology and Psychiatry, 36, 1365–1382.
Lord, C., Rutter, M., Goode, S., Heemsbergen, J., Jordan, H., Mawhood, L., & Schopler, E. (1989). Autism Diagnostic Observation Schedule: A standardized observation of communicative and social behavior. Journal of Autism and Developmental Disorders, 19, 185–212.
Lord, C., Rutter, M., & Le Couteur, A. M. (1994). Autism Diagnostic Interview-Revised: A revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. Journal of Autism and Developmental Disorders, 24, 659–685.
Lord, C., Pickles, A., McLennan, J., Rutter, M., Bregman, J., Folstein, S., Fombonne, E., Leboyer, M., & Minshew, N. (1997). Diagnosing autism: Analyses of data from the Autism Diagnostic Interview. Journal of Autism and Developmental Disorders, 27, 501–517.
Mahoney, W. J., Szatmari, P., MacLean, J. E., Bryson, S. E., Bartolucci, G., Walter, S. D., Jones, M. B., & Zwaigenbaum, L. (1998). Reliability and accuracy of differentiating pervasive developmental disorder subtypes. Journal of the American Academy of Child and Adolescent Psychiatry, 37, 278–285.
McDougle, C. J., Scahill, L., McCracken, J. T., Aman, M. G., Tierney, E., Arnold, L. E., Freeman, B. J., Martin, A., McGough, J. J., Cronin, P., Posey, D. J., Riddle, M. A., Ritz, L., Swiezy, N. B., Vitiello, B., Volkmar, F. R., Votolato, N. A., & Walson, P. (2000). Research Units on Pediatric Psychopharmacology (RUPP) Autism Network: Background and rationale for an initial controlled study of risperidone. Child and Adolescent Psychiatric Clinics of North America, 9(1), 201–224.
McDougle, C., Holmes, J., Bronson, M., Anderson, G., Volkmar, F., Price, L., & Cohen, D. (1997). Risperidone treatment of children and adolescents with pervasive developmental disorders: A prospective, open-label study. Journal of the American Academy of Child and Adolescent Psychiatry, 36, 685–693.
McDougle, C. J., Holmes, J. P., Carlson, D. C., Pelton, G. H., Cohen, D. J., & Price, L. H. (1998). A double-blind, placebo-controlled study of risperidone in adults with autistic disorder and other pervasive developmental disorders. Archives of General Psychiatry, 55, 633–641.
McDougle, C. J., Naylor, S. T., Cohen, D. J., Volkmar, F. R., Heninger, G. R., & Price, L. (1996). A double-blind, placebo-controlled study of fluvoxamine in adults with autistic disorder. Archives of General Psychiatry, 53, 1001–1008.



Minshew, N. (1991). Indices of neuronal function in autism: Clinical and biological implications. Pediatrics, 31, 774–780.
Mullen, E. M. (1995). Mullen Scales of Early Learning. Circle Pines, MN: American Guidance Service.
Pelham, W. E., Bender, M. E., Caddell, J., et al. (1985). Methylphenidate and children with attention deficit disorder. Archives of General Psychiatry, 42, 941–952.
Perry, R., Pataki, C., Munoz-Silva, D. M., Armenteros, J., & Silva, R. R. (1997). Risperidone in children and adolescents with pervasive developmental disorder: Pilot trial and follow-up. Journal of Child and Adolescent Psychopharmacology, 7(3), 167–179.
Psychopharmacology Bulletin. (1985). Special feature: Rating scales and assessment instruments for use in pediatric psychopharmacology research. Psychopharmacology Bulletin, 21(4).
Rapoport, J., Connors, C., & Reatig, N. (1985). Rating scales and assessment instruments for use in pediatric psychopharmacology research. Psychopharmacology Bulletin, 21, 713–1111.
Robbins, T. W., & Sahakian, B. J. (1994). Computer methods of assessment of cognitive function. In J. R. M. Copeland, M. T. Abou-Saleh, & D. G. Blazer (Eds.), Principles and practice of geriatric psychiatry (pp. 205–209). Chichester: Wiley.
Roid, ??. & Miller, ??. (1997). Leiter International Test of Intelligence-Revised. Dale, IL: Stoeling.
Sanchez, L. E., Adams, P. B., Uysal, S., Hallin, A., Campbell, M., & Small, A. M. (1995). A comparison of live and videotape ratings: Clomipramine and haloperidol in autism. Psychopharmacology Bulletin, 31, 371–378.
Scahill, L., Riddle, M. A., McSwiggin-Hardin, M., Ort, S. I., et al. (1997). Children's Yale-Brown Obsessive Compulsive Scale: Reliability and validity. Journal of the American Academy of Child and Adolescent Psychiatry, 36, 844–852.

Scahill, L., et al. (2000). Autism RUPP Network randomized clinical trial of risperidone: Design and methodological challenges. In preparation.
Simpson, G. M., & Angus, J. W. (1970). A rating scale for extrapyramidal side effects. Acta Psychiatrica Scandinavica, 212, 11–19.
Sparrow, S., Balla, D., & Cicchetti, D. (1984). Vineland Adaptive Behavior Scales: Interview Edition survey form. Circle Pines, MN: American Guidance Service.
Steffenburg, S., Gillberg, C., Hellgren, L., Anderson, L., Gillberg, I., Jakobsson, G., et al. (1989). A twin study of autism in Denmark, Finland, Iceland, Norway, and Sweden. Journal of Child Psychology and Psychiatry, 30, 405–416.
Szatmari, P. (1999). Heterogeneity and the genetics of autism. Journal of Psychiatry and Neuroscience, 24, 159–165.
Tecott, L. H., Sun, L. M., Akana, S. F., et al. (1995). Eating disorder and epilepsy in mice lacking the 5-HT2c serotonin receptors. Nature, 374, 542–546.
Terman, L. M., & Merril, M. A. (1982). Stanford-Binet Intelligence Scale-Form L-M (4th ed.). Boston: Houghton Mifflin.
Volkmar, F., & Nelson, I. (1990). Seizure disorder in autism. Journal of the American Academy of Child and Adolescent Psychiatry, 29, 127–129.
Wechsler, D. (1991). Wechsler Intelligence Scale for Children (3rd ed.). San Antonio, TX: Psychological Corp.
Wirshing, D. A., Marder, S., Goldstein, D., & Wirshing, W. C. (1997). Novel antipsychotics: Comparison of weight gain liabilities. Scientific Abstracts of the American College of Neuropsychopharmacology, 36th Annual Meeting, Waikoloa, HI, p. 184.
World Health Organization. (1993). International classification of diseases: Diagnostic criteria for research (10th ed.). Geneva: Author.
