
Journal of Substance Use, June 2004; 9(3–4): 120–126

ORIGINAL ARTICLE

Three problems with the ASI composite scores

HANS OLAV MELBERG


J Subst Use Downloaded from informahealthcare.com by Norwegian Knowledge Cntr Health Svcs on 10/20/10

Norwegian Institute for Alcohol and Drug Research (SIRUS), Oslo, Norway

Abstract
This article discusses three problems relating to the use and interpretation of the Addiction Severity
Index (ASI) composite scores. First, the lack of standardized scores makes it difficult both to
interpret and compare individual scores. Second, it is difficult to interpret a change in a composite
score and to know whether this is a large or a small change. Finally, one might question the objective
validity of some of the composite scores because some of the questions that go into the calculation of
the composite scores invite subjective responses. Moreover, the argument that the validity of the ASI
composite scores is assured by high Cronbach’s alphas is rejected as largely irrelevant.
For personal use only.

Keywords: EuropASI, composite scores, substance abuse treatment.

This article presents three practical problems encountered when using the European
version of the Addiction Severity Index (EuropASI) composite scores to compare different
programmes for treating drug abuse (for more on the background and details on the
project, see the paper by Ravndal). First, there is a practical problem involved in
comparing the values on the different indices since the values have not been standardized.
Second, it is difficult to judge whether a change in a composite score after treatment is
large or small and to compare the changes on different indices. Finally, a close
examination of the questions that go into the calculation of the composite scores and the
way these are weighted leads to questions about both the validity and the objectivity of the
indices. This section includes a criticism of tests of validity based on Cronbach’s alpha
values. These tests, I argue, are of limited use since high alpha values are neither necessary
nor sufficient for validity. In addition to simply presenting the general problems and some
examples, I also discuss causes, consequences and possible solutions.

A note on previous research


In February 2003, the Alcohol and Drug Abuse Institute (ADAI) Library at the University
of Washington had more than 60 references in its overview of literature on the ASI
(http://depts.washington.edu/adai/lib/bibs/dx_120.htm). Some of the topics mentioned above
have been touched on by this literature, but some, like the criticism of claims of validity
based on Cronbach’s alpha, have to my knowledge not been discussed previously.

Correspondence: Hans Olav Melberg, SIRUS, Box 565 Sentrum, N-0105 Oslo, Norway. Tel: +47 22 34 04 41. Fax: +47 22 34 04 01. E-mail: hom@sirus.no

ISSN 1465-9891 print/ISSN 1475-9942 online © 2004 Taylor & Francis

DOI: 10.1080/14659890410001720887

How to compare the values on different indices


Imagine that an interview with a drug client yields the ASI scores illustrated in Figure 1.
Presented with this figure, individuals unfamiliar with the details of the calculation behind
the indices often reason as follows:

The person apparently has no serious medical problems (a score of zero) and serious
economic problems (scoring a maximum of 1 here). Moreover, the economic
problems seem to be worse than the psychological problems. Hence, when treating
this client we should give more attention than usual to measures that might help with
the economic problems.

Unfortunately, the reasoning is misleading. A higher value on, say, the composite score for
economic problems than the composite score for psychiatric problems does not mean that
the client has more than average problems with the former. More generally, one cannot
compare the absolute values for the different ASI composite scores and conclude that the
scores with the highest values indicate the area in which the client has the most problems.
The reason for this is simple: it is easier to get a high score on some indices than on
others. For instance, the composite score for the economic situation consists of answers
to three questions, while the composite score for drugs consists of 16 questions. More
important than the number of questions, the nature of the questions is such that it is
relatively easier to get a high problem score in the area of economic situation than
on the drug index. All you need to get a high score for economic problems is to have been
unemployed in the last month (which usually results in low income and a need for
economic assistance). To get a high score in the area of drug problems you need to use
all kinds of drugs every day in addition to getting drunk (every day). This is physically
impossible, and for this reason the empirical upper limit on the drug index is far below
its theoretical limit of 1. In short, it is much easier to get a high score on some
composite scores than on others, and for this reason we should not compare the absolute
values and conclude that whichever area has the highest score is also the area with
the “most serious” problems.

Figure 1. ASI composite scores for one client

The difference in the ease with which the composite scores in the different areas can reach
their theoretical maximum of 1 may also point to a cross-cultural problem with the ASI.
Ideally, we would want to interpret a similar composite score for two individuals from
different countries (or cultures) as implying that the seriousness of the problem for the
two individuals is about the same. However, if the nature of the questions is such that it is
easier to get a high score in one culture than in another, then it would be misleading to
interpret “same score” as “same level of problem”.


One might object that the problem above is not really an inherent flaw in the ASI
composite scores. To some extent, this is a valid objection since it is the interpreter who
makes the mistake of comparing the scores on different indices without adjusting for the
fact that it is easier to get high scores in some areas than others. It is not the ASI manual
that tells us to make these kinds of comparisons and inferences. Still, it is too easy to
blame the interpreter. If a tool is frequently used incorrectly, then it is probably more
fruitful to redesign the tool instead of blaming the user. At the very least the tool should
be equipped with an explicit warning label.
Is it possible to reduce the problem? If the cause of the problem is that it is easier to get
higher values on some indices than others, then one might try to adjust for this. One way
of doing so would be to rescale the composite scores relative to some standardized mean
value. Many other instruments do this; for instance, the Millon Clinical Multiaxial
Inventory (MCMI) or the Childhood Trauma Questionnaire (CTQ). These instruments
have been used on samples of the population to establish the limits that denote “high” or
“low” values. In the context of drug abuse, one would like to have a table of mean
results on the composite scores for the drug-using population (or treatment-seeking
population). For instance, one might imagine that the average score with respect to the
economic situation was 0.6, while the average score for drug use was 0.3. Knowing this
would make it less problematic to make comparative statements like:

In this client it seems that we need to focus more on his psychiatric problems than we
normally do, because his score on this composite score is high relative to his
drug score when we compare both to the means for the population.

Of course, one could always compare individual scores to the mean of the sample itself.
Although an advance, this is not good enough because one of the claimed advantages of
the ASI is its use in promoting results that can be compared across samples and different
studies. To do so everybody should use the same “limiting values” as reference points. If
people start using their own samples as reference points, they will all get different values,
and a person who is “above the mean” in one sample might have been below it had he been
included in a different sample. There has been some research in this direction (Alterman
et al., 1998), but this has taken the twin road of both creating slightly new scales (called
evaluation indices or dimensions) and standardizing these. There are, so far, no
standardized thresholds for the ASI composite scores.
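
The rescaling idea discussed in this section can be sketched in a few lines of Python. The reference means for the economic and drug scores are the imagined values from the text (0.6 and 0.3); the standard deviations, the client's scores and all names in the code are hypothetical and chosen only for illustration, since no standardized norms exist for the ASI composite scores.

```python
# Hypothetical reference norms: (mean, standard deviation) per composite score.
# The means 0.6 and 0.3 are the imagined values from the text; the standard
# deviations are assumptions made only for this illustration.
REFERENCE_NORMS = {
    "economic": (0.60, 0.25),
    "drugs": (0.30, 0.10),
}


def standardize(raw_scores, norms=REFERENCE_NORMS):
    """Return each raw composite score as a z-score relative to the norms."""
    return {
        area: (score - norms[area][0]) / norms[area][1]
        for area, score in raw_scores.items()
    }


# A hypothetical client whose raw economic score is higher than the drug score.
client = {"economic": 0.70, "drugs": 0.50}
z_scores = standardize(client)

for area, z in z_scores.items():
    print(f"{area}: raw={client[area]:.2f}, z={z:+.1f}")
```

Under these assumed norms the raw scores suggest that the economic problems are the larger ones, while the standardized scores place the drug score much further above what is typical. If every study drew its norms from its own sample instead, the same client could end up above the mean in one study and below it in another.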

How to interpret changes in composite scores


Imagine that after a survey of clients in two different treatments, A and B, we find that
treatment A reduced the composite score for drug use from 0.30 to 0.15 and the
composite score for psychiatric problems from 0.50 to 0.25. By contrast, treatment B
reduced the score for drug use from 0.20 to 0.08 and psychiatric problems from 0.70 to
0.40 (Table I).
Before moving on, one should note that the treatment that has the highest absolute
improvement in a composite score in this example is not the same as the treatment with
the largest change measured by percentage reduction from the starting point. Hence, the
question of “which treatment is the best” would receive different answers depending on
whether one compared absolute or percentage changes. This, however, is not a criticism
of the ASI since it is a standard problem with all kinds of measures of change.
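
The difference between the two rankings can be checked directly. The following minimal Python sketch uses the before and after values listed in Table I; the variable and function names are illustrative only.

```python
# Before/after composite scores as listed in Table I.
TABLE_I = {
    ("A", "drugs"): (0.30, 0.15),
    ("A", "psychiatric"): (0.50, 0.25),
    ("B", "drugs"): (0.20, 0.08),
    ("B", "psychiatric"): (0.70, 0.40),
}


def changes(before, after):
    """Return (absolute reduction, percentage reduction) for one score."""
    absolute = before - after
    percent = 100 * absolute / before
    return absolute, percent


results = {key: changes(*scores) for key, scores in TABLE_I.items()}

for (treatment, area), (absolute, percent) in results.items():
    print(f"{area} ({treatment}): absolute={absolute:.2f}, percent={percent:.0f}")
```

For drug use, treatment A shows the larger absolute reduction while treatment B shows the larger percentage reduction; for psychiatric problems the ordering is reversed.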
A slightly more worrying problem is that of interpreting the numbers. If you
report that the average number of days with heroin use each month has been reduced from
21 to 5, this is easy to understand. By contrast, a change in the ASI composite score for
drug use from 0.30 to 0.15 is much harder to grasp intuitively. Is this a small or a large change?
In fact, if one examines the questions that enter the ASI composite score for drug use,
one finds that even a large reduction in the number of days on heroin last month
need not have a very large impact on the composite score for drug use. All other things
being constant, a change from 30 days to 15 days will reduce the index by about 0.06.
Hence one needs to be aware that even small changes in some of the ASI composite scores
may indicate large changes when interpreted, say, in numbers of days.
This, however, only applies to some of the composite scores. Those with few items can
be changed more easily. For instance, the composite score for the economic situation can
change relatively easily when the answer to one of the items changes since there are only
three items overall in that composite score.
Hence, not only does the ease with which one gets a high score differ between the composite
scores, the ease with which the scores change differs as well. This implies that it is difficult to
conclude from Table I that treatment A was relatively better at treating drug problems than
psychiatric problems.
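
The point that scores built from few items move more easily than scores built from many can be illustrated with a deliberately simplified calculation. The sketch below assumes, purely for illustration, that a composite score is the equally weighted mean of items scaled from 0 to 1. The actual EuropASI formulas differ, so the sketch does not attempt to reproduce the figure of about 0.06 quoted above. The function names and the value 0.5 for the unchanged items are hypothetical; the item counts 3 and 16 are those mentioned earlier for the economic and drug scores.

```python
def composite(items):
    """Equally weighted mean of item scores already scaled to 0..1.

    Simplifying assumption: the actual EuropASI formulas weight and scale
    items differently, so the numbers below are illustrative only.
    """
    return sum(items) / len(items)


def effect_of_one_item(n_items, old, new, others=0.5):
    """Change in the composite when one item moves and the others stay fixed."""
    before = composite([old] + [others] * (n_items - 1))
    after = composite([new] + [others] * (n_items - 1))
    return after - before


# One item drops from 1.0 to 0.5 (e.g. from 30 days to 15 days out of 30).
three_item_shift = effect_of_one_item(3, 1.0, 0.5)
sixteen_item_shift = effect_of_one_item(16, 1.0, 0.5)

print(f"3 items:  {three_item_shift:+.3f}")
print(f"16 items: {sixteen_item_shift:+.3f}")
```

Under this assumption the same change in a single answer moves a three-item score more than five times as far as a sixteen-item score, which is why equal changes in two composite scores need not reflect equal changes in the underlying problems.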

In what sense are the composite scores objective and valid?


It is sometimes claimed that the ASI composite scores are “objective” and that this
objectivity is an advantage for research purposes. For instance, the EuropASI manual
explicitly does “not recommend that the severity ratings be used as outcome measures”
(p. 11). Instead, the claim is that one should use the “more objective” and
“mathematically based composite scores” for research purposes.
The claims above are true in the sense that everybody should get the same composite
score based on identical interviews. The calculations are based on mathematical formulas
and the interviewer cannot influence the calculation by his or her subjective opinions. The
composite scores are, however, subjective in the sense that the answers that go into
creating the composite index involve terms that the clients interpret differently. They are also
subjective in the sense that one might disagree with both the questions that go into (or are
left out of) the making of the composite score and their relative importance.

Table I. Changes in composite scores before and after treatment.

                   Before treatment   After treatment   Absolute change   % change
Drugs (A)                0.30              0.15              0.15            50
Psychiatric (A)          0.50              0.25              0.25            50
Drugs (B)                0.20              0.08              0.12            60
Psychiatric (B)          0.70              0.40              0.30            43

As an illustration, consider the composite score for psychiatric problems. Table II shows
the mean results for a sample of Norwegian drug users in different kinds of treatment
institutions. Rather surprisingly, it shows that the clients enrolled in methadone treatment
had the lowest mean composite score for psychiatric problems (0.21). A researcher with
little clinical experience may then conclude that the ASI shows that the clients in
methadone treatment were in better psychiatric shape than the other clients.
Clinical experience, however, suggests that it is misleading to use the ASI composite
score for psychiatric problems in this way. The reason methadone clients score so well is
probably not that they are in much better psychiatric shape than the other clients. On the
contrary, they are often in worse shape, but the poor overall state is momentarily forgotten
when they are allowed to receive methadone. At the time this study was conducted,
methadone treatment was relatively new in Norway and there were long waiting lists.
Some users therefore entertained beliefs like “Once I get methadone everything is going to
become better”, and for this reason their mood was very good when they were admitted to
treatment. Since the first interview was conducted in this honeymoon period, they
probably revealed fewer problems than one would expect.
There are, however, two additional causes of relatively low scores among the
methadone clients which are worth exploring. First, when asked about lifetime events
like child abuse, sexual abuse and so on, methadone clients were more likely than the
others to have experienced such problems. This is, however, not reflected in the
composite score since it is based only on questions about the clients’ experiences during
the past month. One might argue that there should be a high correlation between
lifetime problems and current problems, but empirical studies reveal that this is not always
the case. For instance, in our Norwegian sample of 482 drug users in treatment, there was
no statistically significant correlation between the various CTQ scores (emotional neglect,
physical abuse and neglect, sexual abuse) and the ASI composite score for psychiatric
problems (all with p-values between 0.281 and 0.859). This leads to the question of
whether the exclusion of all items relating to lifetime events reduces the validity of the
composite score. Statistical tests of this, for instance by Alterman, Bovasso, Cacciola, and
McDermott (2001, p. 161, emphasis added), also conclude that “both lifetime and recent
events appear to be useful in predicting long-term outcomes from baseline data”.
Second, the answers that go into the composite score are in fact highly subjective. It is
true that the composite score itself is objective in the sense that there is no scope for
subjective adjustment by the person calculating the score. It is, however, subjective in the
sense that the answers that go into creating the composite index involve terms that the
clients interpret differently. For instance, the composite score for psychiatric problems
includes questions like “For how many days during the past 30 days have you experienced
serious depression [or serious anxiety problems]?” The problem here is that clients may
differ in the extent to which they label the same experience as a “serious” or “less
serious” problem. In this sense, the composite score is based on subjective evaluations and
there may be systematic misrepresentation. For instance, one may speculate whether
methadone clients tend to be more careful about using the label “serious” for problems
because they are older and more experienced. A person who has experienced many
problems may raise the threshold for labelling something a “serious” problem. What
seems to others to be an abnormal and serious experience may for him or her be a
common experience that is not labelled “serious”. In this sense the composite score is
subjective and may, in turn, lead to misleading conclusions about the psychiatric problems in
a sub-group. In short, one reason why methadone clients score lower than others may be
that they belong to a world in which they have raised the threshold for labelling a
problem as serious, not that they really do have fewer problems. Moreover, the
problem arises because the EuropASI includes questions for which subjective evaluation is
important. This may be unavoidable, but it is still a problem for the validity of the
composite score.

Table II. Mean EuropASI psychiatric composite score for clients in different treatments.

Treatment                            EuropASI psychiatric composite score      n
Residential                                          0.25                     158
Psychiatric youth teams                              0.33                      72
Methadone-assisted rehabilitation                    0.21                      48
Youth (residential)                                  0.31                      19

To defend the validity of the ASI composite score, it is sometimes claimed that
statistical tests revealing high Cronbach’s alphas indicate that the ASI composite scores
are valid. A typical example is the following: “Using Nunnally’s (1967) criterion of a
minimum alpha of .60, all seven domains show good evidence of internal consistency”
(Leonhard, Mulvey, Gastfriend, & Schwartz, 2000, pp. 131–132). Somewhat worryingly, it is
also easy to find examples of the opposite conclusion in the literature. After a brief review
of several studies reporting alpha values relating to the ASI composite scores, one article
concludes that “in as much as a value of .70 is the conventional threshold for acceptability
(Nunnally, 1978), these reliability levels are unsatisfactory” (Alterman et al., 1998,
p. 234). The basic problem, however, is that high alpha values are neither sufficient nor
necessary for validity. In fact, they are not very relevant to the issue of validity at all!
There are two ways to get high Cronbach’s alpha values. The first, increasing the
number of items in the scale, has little to do with validity. Second, if you have a high
correlation between the answers to the questions that go into the index, you will get a
higher alpha value. To some extent this may reveal something about the reliability of a
scale. If you asked the same question in slightly different ways, you would want the
answers to be correlated. If not, you might suspect that the respondents simply answer at
random or at least are not really paying attention to the questions. But reliability (getting
the same answer to the same kind of questions) is not the same as validity (whether the
instrument or the scale really captures the theoretical concept we want it to measure).
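
How alpha responds to the correlation between items can be made concrete with the standardized form of the coefficient, alpha = k r / (1 + (k - 1) r), where k is the number of items and r is the mean inter-item correlation. The sketch below is a minimal Python illustration with made-up values of k and r; none of the numbers come from ASI data.

```python
def standardized_alpha(n_items, mean_correlation):
    """Standardized Cronbach's alpha from item count and mean correlation."""
    k, r = n_items, mean_correlation
    return k * r / (1 + (k - 1) * r)


# Same mean inter-item correlation, different numbers of items.
for k in (3, 7, 16):
    print(f"k={k:2d}, r=0.30 -> alpha={standardized_alpha(k, 0.30):.2f}")

# Same number of items, different mean inter-item correlations.
for r in (0.10, 0.30, 0.50):
    print(f"k= 7, r={r:.2f} -> alpha={standardized_alpha(7, r):.2f}")
```

In this form alpha rises both with the number of items and with the mean correlation between them. Neither quantity says anything about whether the items capture the concept the scale is meant to measure.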
An example might make this more obvious. The ASI composite score for drug abuse
adds together answers for different drugs in order to say something about the overall level
of drug abuse for a client. If it is always true that those who use much heroin also use
much cocaine (and so on for all the other drugs), then this scale would have a high
Cronbach’s alpha. But the scale can be a perfectly useful (and valid) measure of
overall drug problems even if those who use much of one drug do not use much of
another. It all depends on what we want the scale to measure, and sometimes (as
with drug use) the sub-components may not be correlated, but it may still be useful to add
up the answers to questions that are uncorrelated. If the concept we want to capture,
such as “overall level of drug use”, really consists of many sub-groups, then it is perfectly
valid to add the answers in an index even if the answers are uncorrelated.
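
The heroin and cocaine example can be checked numerically. The sketch below computes Cronbach's alpha from raw item answers using the usual variance formula, alpha = k/(k - 1) * (1 - sum of item variances / variance of the total). The four clients and their days of use are invented for illustration, and the function name is hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns (one list per item)."""
    k = len(items)
    totals = [sum(values) for values in zip(*items)]
    item_variance = sum(pvariance(column) for column in items)
    return k / (k - 1) * (1 - item_variance / pvariance(totals))


# Hypothetical days of use in the past 30 days for four clients.
heroin = [0, 0, 20, 20]
cocaine_same_pattern = [0, 0, 20, 20]  # heavy heroin users also use much cocaine
cocaine_unrelated = [0, 20, 0, 20]     # cocaine use unrelated to heroin use

alpha_correlated = cronbach_alpha([heroin, cocaine_same_pattern])
alpha_uncorrelated = cronbach_alpha([heroin, cocaine_unrelated])

# The summed index still orders clients by total days of use.
index_uncorrelated = [h + c for h, c in zip(heroin, cocaine_unrelated)]

print(alpha_correlated, alpha_uncorrelated, index_uncorrelated)
```

With perfectly correlated items alpha reaches its maximum, and with uncorrelated items it drops to zero, yet in both cases the summed index records exactly how many days of drug use each client reported. A low alpha therefore does not by itself show that the index fails to measure the overall level of drug use.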

Conclusion
Although the focus in this paper has been on the weaknesses of the ASI composite scores,
it should not lead to the conclusion that the author is against the ASI. For a researcher,
the great advantage of the widespread use of the ASI lies in the possibility of
comparison. Hence, this paper should be interpreted as friendly input to an ongoing
discussion about how to improve the ASI. The three suggestions would then be to
introduce some kind of standardization, to consider the introduction of lifetime questions
into some of the scales and to avoid too much emphasis on Cronbach’s alpha when
evaluating the validity of the composite scores.

References
Alterman, A. I., Bovasso, G. B., Cacciola, J. S., & McDermott, P. A. (2001). A comparison of the predictive
validity of four sets of baseline ASI summary indices. Psychology of Addictive Behaviors, 15, 159–162.
Alterman, A. I., McDermott, P. A., Cook, T. G., Metzger, D., Rutherford, M. J., Cacciola, J. S., et al. (1998).
New scales to assess change in the Addiction Severity Index for the opioid, cocaine, and alcohol dependent.
Psychology of Addictive Behaviors, 12, 233–246.
Leonhard, C., Mulvey, K., Gastfriend, D. R., & Schwartz, M. (2000). The Addiction Severity Index: a field
study of internal consistency and validity. Journal of Substance Abuse Treatment, 18, 129–135.
