Entitled
By
Tara M. Hill
Counselor Education
May 2009
UMI Number: 3364311
Copyright 2009 by
Hill, Tara M.
Copyright © 2009, Tara M. Hill

This document is copyrighted material. Under copyright law, no part of this document may be reproduced without the express written permission of the author.

University of Toledo
May 2009
Abstract

The Substance Abuse Subtle Screening Inventory-3 (SASSI-3; Miller & Lazowski, 1999) is a screening instrument designed to identify individuals who may be substance dependent. Many researchers have reported reliability and validity results for this instrument, with mixed findings that at times have contradicted those published by the authors of the instrument. This study is the first to apply the Rasch measurement model to the SASSI-3's psychometric properties, and it includes a discussion of the methods used to evaluate the instrument. The results demonstrated that the whole SASSI-3 meets fundamental measurement properties and can discriminate groups of people from high to low on the substance dependency variable. However, the face valid scales continue to demonstrate higher functioning when used independently of the subtle items. Based on these results, future research recommendations include combining the Face Valid Alcohol and Face Valid Other Drug scales to determine the functioning of these two scales together.
Acknowledgment
Thank you Sarah Richards; your love and support were essential and I could not have
done any of this without you. Sue Nagy, Beth and Jim Hill, and Shaunda Jennings;
thanks for always believing in me. John Laux; your guidance, feedback, and humor were essential.
The rest of my committee, Paula Dupuy, Holly Harper, and Greg Stone; your support,
edits, and feedback were essential and appreciated. Thank you to Megan Mahon and
Table of Contents
Abstract iii
Acknowledgment v
Table of Contents vi
List of Tables x
Organization of Chapters 10
Substance Dependence 11
SASSI Psychometrics 22
Reliability 22
Validity 24
Response Validation 45
Unidimensionality 48
Independence 48
Summary 49
Overview 50
Participants 52
Instrument - The Substance Abuse Subtle Screening Inventory-3 (SASSI-3) 54
Variable 58
Procedures 59
Limitations 62
Random Answering Pattern (RAP) 138
Implications 194
Limitations 197
Conclusion 197
References 199
List of Tables
Table 16 - Dichotomous SASSI-3 Group 1 Paired Aligned Items Fit Statistics 146
Table 21 - Summary of Collapsing Strategy for Whole SASSI-3 Face Valid Response
Options 168
Table 22 - Summary of Collapsing Strategy for Whole SASSI-3 Group 2 Face Valid
Table 23 - Summary of Person and Item Separation Findings and RPCA's for Direct
List of Figures
Figure 1 - Response Option 0123 Output for Face Valid Alcohol Group 1 67
Figure 19 - Item Map FAM Group 1 128
Figure 24 - Corrected Response Option Curve 0112 Dichotomous SASSI-3 Group 1.. 144
Figure 27 - Corrected Response Option Curve 0112 Dichotomous SASSI-3 Group 2.. 152
Figure 29 - Response Options Curves 0123 Dichotomous Whole SASSI-3 Group 1.... 166
Figure 30 - Corrected Response Options Curve 0112 Whole SASSI-3 Group 1 169
Figure 33 - Corrected Response Options Curve 0112 Whole SASSI-3 Group 2 174
Chapter One
Introduction
Substance dependency and abuse are expensive problems in the United States of
America and have negative impacts on its citizens (Substance Abuse and Mental Health
Services Administration [SAMHSA], 2008). In addition to the loss of life, there is a loss
in work productivity, reduction in days attended at school, money spent for medical care,
and convictions and prison sentences due to alcohol and drug problems (SAMHSA,
2008). Based on this information, it is important for people who struggle with alcohol and
drug abuse to get proper diagnosis and treatment. Part of the diagnostic process can
involve mental health professionals' use of substance use screening instruments. Due to
the clinical implications of the assessment process, it is necessary that substance abuse
screening instruments demonstrate sound psychometric properties. There
are four substance dependence screens that are most frequently selected by professional
addiction counselors as aids in their diagnostic processes (Juhnke, Vacc, Curtis, Coll, & Paredes,
2003). These are the Substance Abuse Subtle Screening Inventory (SASSI-3; Miller &
Lazowski, 1999), the Michigan Alcoholism Screening Test (MAST; Selzer, 1971), the
Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher, Dahlstrom, Graham, Tellegen, &
Kaemmer, 1989), and the Addiction Severity Index (ASI; McLellan, Luborski, Cacciola,
Griffith, McGranhan, & O'Brien, 1992). Of these four, the SASSI-3 (Miller & Lazowski,
1999) was identified by professional addiction counselors as being the most important
(Juhnke et al., 2003) for the following reasons: a) it measures alcohol dependence as well
as other drug dependence; b) it contains checks on profile validity
(e.g., defensiveness and random answering); and c) it is scored and interpreted according
to objective decision rules.
The SASSI-3 (Miller & Lazowski, 1999) is a paper-and-pencil, self-administered,
two-sided substance dependence screen that includes 67 true-false items on the front
and two columns of items on the back, each of which presents the respondent with
four rating scale choices of never, once or twice, several times, and repeatedly. The two
columns of items on the back of the screening are labeled Face Valid Alcohol (12 items)
and Face Valid Other Drug (14 items), respectively. These two groups of items directly
ask the respondent to identify the extent of his or her use and the impact the use has had on
his or her life. The items on the front, however, are meant to be more subtle in nature and
therefore elicit less defensiveness, a response commonly identified among clients who
abuse substances.
The SASSI-3's items form ten scales. Seven of these scales, either independently
or in combination, are used for clinical decision making regarding the probability of a
client's substance dependence. This final disposition is made through nine decision rules.
If any of these decision rules are affirmative then the respondent is likely to be substance
dependent. The seven scales used in the decision rules include the Face Valid Alcohol
scale (FVA), the Face Valid Other Drugs scale (FVOD), the Symptoms scale (SYM), the
Obvious Attributes scale (OAT), the Subtle Attributes scale (SAT), the Supplemental
Addiction Measure scale (SAM), and the Defensiveness scale (DEF). A check of profile
validity is provided by way of the Random Answering Pattern scale (RAP). If the RAP
score is greater than one then the decision rules may be invalid due to the likelihood that
the respondent did not answer the items in a meaningful way. The final two scales are the
Correctional (COR) and the Family vs. Controls scales (FAM). These final scales lend
additional clinical information which may be included in treatment goals for the
respondent. All of the SASSI-3's scales will be discussed in greater detail in subsequent chapters.
Reliability and validity test results have been published on the SASSI-2 and -3.
The results of these investigations vary in their degree of agreement with what is
found in the SASSI-3 Manual (Miller & Lazowski, 1999). For instance, the reliability
findings published in the SASSI-3 Manual (Miller & Lazowski) identified high internal
consistency scores for the individual scales. However, these results have yet to be fully
replicated by other researchers whose findings were as much as seven to twenty points
lower (Clements, 2001; Laux, Salyers, & Kotova, 2005; Myerholtz & Rosenberg, 1998).
Only moderate agreement was found between the SASSI-3 and other instruments
purporting to measure similar constructs (Laux, Salyers, & Kotova, 2005; Myerholtz &
Rosenberg, 1998). An independent investigation (Gray, 2001), performed using factor analysis, failed to render the same ten factor
solution as reported by Miller and Lazowski in the SASSI-3 Manual (1999). However,
the factor structure of two of the SASSI-3's scales, the Face-Valid Alcohol scale and the
Face-Valid Other Drugs scale, did concur with the SASSI-3 Manual's data regarding
these two scales (Laux, Perera-Diltz, Smirnoff, & Salyers, 2005; Laux, Salyers, &
Kotova, 2005). Finally, the SASSI-3 Manual (Miller & Lazowski) reports high overall
accuracy, sensitivity, and specificity rates when comparing the SASSI-3's classification
decisions against clinical diagnoses.
However, these high accuracy, sensitivity, and specificity rates have not been replicated
by independent researchers (Arneth et al., 2001; Clements, 2002; Svanum & McGrew,
1995).
Taken together, the findings of independent researchers (e.g., Clements, 2002; Feldstein &
Miller, 2007; Gray, 2001; Laux, Salyers, & Kotova, 2005; Svanum & McGrew, 1995)
appear to question the SASSI-3's reliability and validity in the context of that which is
published by the SASSI Institute. However, there has been no discussion in the literature
of whether the SASSI-3 meets the fundamental requirements of measurement: unidimensionality,
linearity, invariance, and independence. These terms will be introduced and explained as
they apply to the SASSI-3 investigation. Unidimensionality means that an instrument is
evaluating just one construct (Bond & Fox, 2007). In this study, the instrument of interest
is the SASSI-3 and the construct that it purports to measure is substance dependence
(Miller & Lazowski, 1999). The authors of the SASSI-3 acknowledge that it was not their
intention to create a unidimensional instrument; rather, their purpose in developing the
SASSI-3 was to advance an instrument that could discriminate between those who have a
high probability of substance dependence and those who do not (Miller & Lazowski).
Linearity means that the measured construct can be conceptualized in terms of a yardstick (Bond & Fox). A hierarchy of items is created
according to level of difficulty, with easy items on one end and difficult items at the other.
For the SASSI-3, easier items sit at the bottom, requiring little of the trait to
answer, and harder items sit at the top, requiring a greater degree of
substance dependence to answer. Just as a yardstick measures height, where the taller one is the
more height he or she has, for the SASSI-3 the more items a person endorses the more
likely he or she is to be substance dependent. Invariance means that the items will be
stable across samples; that is, regardless of the sample being measured, the alignment of the items on the
instrument will not vary. The items will align in equal-interval levels like the markings on a yardstick.
Statement of the Problem
Juhnke, Vacc, Curtis, Coll, and Paredes (2003) reported that one of the screening
instruments most frequently used by addictions counselors is the Substance Abuse Subtle
Screening Inventory-3 (SASSI-3; Miller & Lazowski, 1999). The SASSI-3 has been
used in a variety of settings including but not limited to community mental health
agencies, college counseling centers, prisons, and alcohol and drug treatment facilities
(Miller & Lazowski, 1999). The SASSI-3's psychometric properties have been studied by several
independent researchers (Arneth, Bogner, Corrigan, & Schmidt, 2001; Clements, 2002;
Feldstein & Miller, 2007; Gray, 2001; Laux, Perera-Diltz, Smirnoff, & Salyers, 2005;
Laux, Salyers, & Kotova, 2005; Lazowski, Miller, Boye, & Miller, 1998; Peters et al.,
2000; Svanum & McGrew, 1995), whose results have been found to differ, at times significantly, from
those reported in the SASSI-3 Manual (Miller & Lazowski). These differences may be
related to the traditional methods of testing reliability and validity used by researchers.
related to the traditional methods of testing reliability and validity used by researchers.
However, what is unclear is whether the SASSI-3 meets the fundamental requirements of
measurement. And, if there is doubt about whether the SASSI-3 meets the fundamental
requirements of measurement, then there is also doubt about the implications of the
diagnoses it informs and the subsequent treatment recommendations that are prescribed.
It is therefore important to examine the fundamental measurement
properties of the SASSI-3, as this may lead to improvement in the instrument's accuracy.
Purpose of the Study
The purpose of this study is to evaluate the SASSI-3's compliance
with the fundamental principles of measurement as represented using the Rasch model.
Specifically, this study will examine the
measurement properties of the entire instrument and the individual scales, evaluate the
reliability of the response options by identifying whether the participants are utilizing the
response scales as intended by the authors of the SASSI-3, and assess the linearity,
invariance, and independence of the instrument's items.
Research Hypothesis 1: An analysis of the SASSI-3's items will
produce a unidimensional factor structure that accounts for 60% or more of the items'
total variance.
Research Hypothesis 2: An analysis of item fit will produce infit and outfit
statistics indicative of low item error for the SASSI-3's scales.
Research Question 3: Are measures from the SASSI-3's ten scales reliable enough to
discriminate between those who are substance dependent and those who are not?
Research Hypothesis 5: An analysis of the whole SASSI-3 will
produce a unidimensional factor structure that accounts for 60% or more of the whole
instrument's total variance.
Research Question 6: Does the whole SASSI-3 adequately measure the substance
dependence construct?
Research Hypothesis 6: An analysis of item fit will produce infit and outfit
statistics indicative of low item error for the SASSI-3 instrument as a whole.
Research Question 7: Are measures from the whole SASSI-3 reliable for clinical screening
purposes?
Research Hypothesis 7a: Rasch reliability statistics will demonstrate acceptable levels
of person and item reliability.
Research Hypothesis 7b: The holistic SASSI-3 construct (as evidenced in the
person and item separation statistics) will be able
to clearly discriminate between those who are substance dependent and those who are
not.
Accurate screening saves time and money. In the current state of the economy, with budget cuts, mental health and drug
treatment benefits being reduced, and alcohol and drug facilities closing due to funding issues,
any improvement in a screening instrument's accuracy will save time and money. An improvement to
the SASSI-3's accuracy rates would allow counselors to make better informed diagnostic and treatment
recommendations. With improved accuracy rates,
the right clients will receive treatment, which leads to higher treatment success rates,
and which may in turn build public support for levies and additional funding for drug treatment programs.
In order to clarify the term "substance dependence," the following definition will
be used throughout this study. The Diagnostic and Statistical Manual of Mental Disorders IV Text Revision (DSM-IV-TR; American
Psychiatric Association, 2000) defines substance dependence as a maladaptive
pattern involving substance use, within the past twelve months, which leads "to clinically
significant impairment or distress" (p. 197). Three of the following seven criteria must be
met for an individual to be considered substance dependent: 1) The individual has
developed tolerance to the substance; 2) The individual continues
use of the substance to avoid withdrawal symptoms; 3) The individual uses more
than intended; 4) The individual tries, to no avail, to control or reduce the substance use despite
cravings; 5) The individual spends excessive time in substance seeking, use, or
recovering behaviors; 6) The individual often neglects social, work, or other obligations
in favor of the substance use; and 7) Despite negative consequences, both physical and
psychological, the individual continues to use.
Organization of Chapters
Chapter one introduced the problem and provided a rationale for the study. Chapter
two reviews the relevant literature. Chapter three presents the methodology to be used in
this study. Chapter four will consist of the results from the analysis, and Chapter five will
discuss those results, along with the study's implications and limitations.
Chapter Two
This chapter will review the literature regarding substance
dependence and its impact on society and the Substance Abuse Subtle Screening Inventory-3
(SASSI-3; Miller & Lazowski, 1999). Specifically, the review will begin with a
discussion of substance dependence, diagnosis, and screening, and a review of the SASSI's
development and scales. A discussion of an alternative measurement framework,
namely the Rasch model, will follow. Finally, the chapter will close with a summary of
the reviewed literature.
Substance Dependence
Substance dependence and abuse in the United States has a negative impact on
society. For example, the latest Substance Abuse and Mental Health Services
Administration (SAMHSA, 2008) report details the costs that occur as a
result of substance dependence. According to the 2006 SAMHSA report, the number of
visits to an emergency department due to drug abuse increased roughly four percent,
while the US population only increased roughly three percent. Additionally, between
2004 and 2006, visits related to non-legitimate use of prescription medications increased
38 percent. The National Highway Traffic Safety Administration (2007) reports that
someone is killed roughly every 40 minutes by a drunk driver. In addition to the loss of
life, there is a loss in work productivity, reduction in days attended at school, money
spent for medical care, and convictions and prison sentences due to alcohol and drug
problems (SAMHSA, 2008). Substance dependence has a great economic toll on society.
Often, persons with substance use problems are sent for screening assessments to
determine whether a full substance abuse evaluation is necessary (Adger & Werner, 1994). There are several screening instruments available
for this purpose.
Juhnke, Vacc, Curtis, Coll, and Paredes (2003) surveyed professional addiction
counselors to determine which screening instruments were used most frequently. The
results of this survey suggest that there are four instruments that are most frequently
employed. These were the Substance Abuse Subtle Screening Inventory (SASSI-3; Miller
& Lazowski, 1999), the Michigan Alcoholism Screening Test (MAST; Selzer, 1971),
the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher, Dahlstrom, Graham, Tellegen, &
Kaemmer, 1989), and the Addiction Severity Index (ASI; McLellan, Luborski, Cacciola,
Griffith, McGranhan, & O'Brien, 1992). These professional addiction counselors also
identified the SASSI-3 as the most important assessment instrument (Juhnke et al., 2003).
In addition to the findings of Juhnke et al. (2003), and because of the mixed findings in the
literature regarding the instrument's reliability and validity, the SASSI-3 will serve as the
focus of this study.

The SASSI-3 is a screening instrument designed to discriminate between people who have a high probability of being substance
dependent and those with a low probability of having a substance dependence disorder,
even when respondents do not openly acknowledge their
"symptoms" (Lazowski, Miller, Boye, & Miller, 1998, p. 115; Miller & Lazowski, 1999).
People with substance abuse and dependence disorders often deny the existence and
extent of the problem. The original SASSI was uniquely created to address the problem
of client denial through the use of both direct, or content-obvious, and indirect, or content-subtle, items (Miller & Lazowski,
1999). Introduced in 1988, the SASSI has gone through two revisions; the current version,
the SASSI-3, was published in 1999. The conversion from the
SASSI-2 to -3 was driven by a desire to reduce the rate of false positives, which was 15.5
percent (Miller & Lazowski, 1999). The conversion process included the creation of a
new seven-item scale, the Symptoms scale, and the elimination of two items whose
wording was deemed to be objectionable. The seven items forming the Symptoms scale
were unused items already included among the SASSI-2's item pool (Gray, 2001;
Lazowski et al., 1998). Gray asserted that the differences between the SASSI-2 and -3 were
minor and as such, the literature base supporting the SASSI-2 could "readily be
generalized" to the SASSI-3 (p. 104). For the purpose of this study, both reliability and
validity findings for the SASSI-2 will be reviewed alongside those for the SASSI-3.
The current SASSI-3 instrument is printed on one page, front and back. The front
consists of 67 true-false items. The items on side one are typically referred to as indirect
or subtle as most, but not all, of the items do not directly inquire about the impact of
drinking or drug related behaviors. These 67 items make up eight of the SASSI-3's total
ten scales. The authors recommend that side one be administered first as the items on this
side are less likely to elicit defensiveness than those on side two, which directly ask about
substance use.
Side two of the SASSI-3 includes the face valid items which inquire directly
about alcohol and drug use, behaviors, and the impact thereof. Because the items on side
two are obvious in their intent to measure substance use, there is a potential that
respondents might fake-good or minimize their substance use, if any (Miller & Lazowski,
1999). The response choices for the items on side two are placed along a four-point
Likert-type scale with the options of "never" (0), "once or twice" (1), "several times" (2),
and "repeatedly" (3). For each scale, the score of each item is summed to produce a total
score. It takes approximately fifteen minutes to complete, score, and interpret the SASSI-
3. Counselors use a transparent overlay to calculate raw scores for each of the SASSI-3's
scales. These raw scores are then transferred to a profile sheet, which can be used to
approximate the individual's T-scores and percentile scores. A discussion of the scoring
rules used for clinical decision making appears later in this chapter.
The SASSI-3 has ten scales, three of which are worded in such a way as to inquire
directly about the respondent's use of drugs and alcohol and the impact of that use.
The directly worded scales are the Face Valid Alcohol (FVA), Face Valid Other Drug
(FVOD), and the Obvious Attributes (OAT) scales. The other seven scales are stated in a
subtle manner. The subtle scales are the Subtle Attributes (SAT), Supplemental Addiction Measure (SAM),
Symptoms (SYM), Defensiveness (DEF), Family vs. Controls
(FAM), Correctional (COR), and the Random Answering Pattern (RAP) scales. All of the
scales are said to discriminate statistically between those who are and who are not
substance dependent (Miller & Lazowski, 1999). The FVA, FVOD, OAT, SAT, SAM,
SYM, and DEF are used in clinical decision making. This means that these scales
contribute to the decision rules for the clinician to further assess for treatment needs. The
RAP provides an indicator of how closely the respondent paid attention to the content of
the items, and the FAM and COR are experimental in nature and not used in the clinical
decision making process. Further discussion of the dichotomous clinical decision making
Prior to engaging in clinical decision making regarding whether a respondent is
likely to be substance dependent, counselors must first check the respondent's score on
the Random Answering Pattern (RAP) scale. The RAP scale is a measure of random or
careless answering. In this regard the RAP scale is a global measure of the validity of the
respondent's approach to the process, and not the content, per se. The RAP scale is
typically reviewed first to verify whether the respondent completed the instrument in an
appropriate manner (Miller & Lazowski, 1999). This scale is a "measure of response
validity" (Laux, Salyers, & Kotova, 2005, p. 43). Either random responding or a
misunderstanding of the directions is sufficient to
cause doubt about the validity of the SASSI-3's results. The SASSI-3 Manual
recommends that if the RAP scale score is 2 or greater, the screener should "interpret
with caution" due to the possibility that the respondent did not answer the questions in a
meaningful way or did not understand the directions (Miller & Lazowski, 1999, p. 11).
The RAP scale consists of six true-false items that produce a range of scores from 0-6. If
the RAP scores suggest that the respondent did not answer in a random manner, the
counselor moves forward with the interpretation of the remaining SASSI-3 scales.
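The RAP check just described (sum six true-false items and flag any raw score of 2 or greater) can be sketched in code. This is an illustrative sketch only: the function name and input format are hypothetical and are not part of any SASSI scoring software.

```python
def rap_is_valid(rap_items):
    """Check the Random Answering Pattern (RAP) validity scale.

    rap_items: six 0/1 endorsements from the RAP's true-false items,
    producing a raw score in the range 0-6. Per the manual's guidance,
    a raw score of 2 or greater suggests random or careless responding,
    so the profile should be interpreted with caution.
    """
    assert len(rap_items) == 6, "the RAP scale has exactly six items"
    rap_score = sum(rap_items)
    return rap_score < 2  # True: proceed with interpreting the remaining scales

# One endorsed RAP item is still interpretable:
print(rap_is_valid([1, 0, 0, 0, 0, 0]))  # True
# Two or more endorsements cast doubt on the profile:
print(rap_is_valid([1, 1, 0, 0, 1, 0]))  # False
```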
The first of the two scales derived from the items on the face valid or second side
is the Face Valid Alcohol scale (FVA). The response choices are arranged along a four-
point Likert-type scale with the options of "never" (0), "once or twice" (1), "several
times" (2), and "repeatedly" (3). The raw score range is 0-24 for the FVA scale. The
FVA scale consists of twelve questions inquiring directly of alcohol use behavior and the
impacts of use. Examples of item content include alcohol consumption with noon meals
and suicide attempts while under the influence of alcohol. As the reader can plainly see,
these items are face-valid in that the intent of the items to measure alcohol use is obvious.
High FVA scores represent intentional recognition and admission of alcohol use. Low
FVA scores may reflect an absence of alcohol use, or they could be the product of efforts
The second face valid scale is the Face Valid Other Drug (FVOD) scale. The
FVOD items consist of fourteen questions inquiring directly of drug use behavior and the
impact of use. The response choices for the FVOD items are a four-point Likert-type
scale with the options of "never" (0), "once or twice" (1), "several times" (2), and
"repeatedly" (3) with a total raw score range of 0-27. Examples of item content include
whether the respondent has had legal trouble as a result of drug use and used drugs to
avoid withdrawal symptoms. Miller and Lazowski (1999) reported that the higher the
scores on the FVA and FVOD the more progressed the substance dependence disorder.
The Symptoms (SYM) scale is purported to measure the signs, symptoms and
correlates of substance dependence in a direct manner (Miller & Lazowski, 1999). There
are eleven items on this scale with dichotomous response options, "true" and "false" and
a raw score range of 0-11. Examples of item content include inquiring of the respondent
whether he or she has concern regarding memory loss and family history of alcohol or
drug use.
The Obvious Attributes Scale (OAT) scale also utilizes direct items. High OAT
scores have been shown to indicate a willingness to "admit symptoms" (Myerholtz &
Rosenberg, 1998, p. 440), recognize "problematic behaviors" (Miller & Lazowski, 1999,
p. 15), and "personal limitations" (Laux, Salyers & Kotova, 2005, p. 43) frequently
associated with substance dependence. While these items are direct in that they ask the
respondent to admit to personal foibles, they do not require the respondent to make the
connection that their foibles are associated with any particular source. Consequently, a
respondent could produce an elevated OAT score without elevating one of the Face-Valid
scales. An interpretation of this arrangement of scores might be that the respondent was
aware that problems were occurring without understanding that these problems were a
consequence of personal substance use. Examples of OAT item content include behaviors
such as impulse control problems and low tolerance for frustration. There are twelve
items on the OAT scale with a raw score range of 0-12. Examples of item content include
whether responsibilities have been avoided or forgotten as a result of substance use and
The Subtle Attributes scale (SAT) consists of eight criterion-keyed items with a
range of raw scores between 0-8. Examples of item content include inquiries of whether
the respondent obeys laws and has excessive energy with a decreased need for sleep.
These eight items were selected solely on the basis of their ability to statistically distinguish
substance dependent from non-dependent individuals; their manifest content is
inconsequential. The advantage to such items is that people who may be motivated to
conceal their substance use or those who are "in denial" about the extent to which they
may have a problem have no way to intentionally manipulate these items. Thus, they tend
to answer these questions differently than those who do not have a substance dependence
disorder (Laux, Salyers, & Kotova, 2005). The SAT scale is purported to measure the
predisposition of the respondent to developing a substance dependence disorder
(Myerholtz & Rosenberg, 1998). Additionally, this scale has been able to discriminate
between substance abusers and non-abusers, regardless of their attempts to fake good or
fake bad.
The Defensiveness scale (DEF) consists of eleven criterion-keyed items and, like
the RAP scale, is used as a validity scale. The DEF scale measures "denial or deliberate
concealment of problems" (Myerholtz & Rosenberg, 1998) and is used in the decision
rules. As a result of the DEF scale being developed to discriminate between respondents
using the standard versus fake-good instructions, a respondent who has high scores on the
DEF scale may be making an effort to present him or herself in a positive way (Laux,
Salyers, & Kotova, 2005). Likewise, respondents achieve low DEF scores by endorsing a
high number of personal faults and foibles. Consequently, the DEF scale is also viewed
as an indirect measure of self esteem, depression, and, at the very lowest range, potential
suicidal ideation. The range of raw scores is 0-11 with scores of eight or higher
representing significant enough denial as to call the SASSI-3's results into question.
Examples of item content include inquiring about the amount of dangerous activities in
which the respondent has engaged and whether he or she is a restless person.
The Supplemental Addiction Measure (SAM) scale consists of fourteen
criterion-keyed items and has a range of scores between 0-14. Examples of item content
include whether the respondent feels worn out and whether he or she has experienced
periods of memory loss. This is the SASSI-3's third and final validity scale. The SAM is
purported to discriminate between substance dependent respondents who are
defensive and those who are non-substance dependent with a more pervasive
defensiveness characteristic (Laux, Salyers, & Kotova, 2005; Miller & Lazowski, 1999).
The SAM scale is used to tease out whether elevated DEF scores reflect substances
specific defensiveness (high SAM score) or defensiveness due to some other reason (low
SAM score).
The Family versus Control Subjects (FAM) scale consists of fourteen criterion-
keyed items with a range of scores between 0-14. Examples of item content include
inquiry regarding whether the respondent would like more self-control and whether he or
she has ever broken the law. There are several potential interpretations of the FAM scale.
The SASSI-3 authors designed the FAM scale to assess the amount of focus a respondent
has on others (Miller & Lazowski, 1999). Myerholtz and Rosenberg (1998) reported that
the FAM scale identifies co-dependency. Still other researchers say it discriminates
between those who experienced substance abuse in their family of origin versus those
who did not (Laux, Salyers, & Kotova, 2005). The FAM is not used in the screening
decision rules but can be used to assess possible additional clinical issues that need to be
addressed in treatment.
The Correctional (COR) scale consists of fifteen criterion-keyed items with a
range of scores between 0-15. Examples of item content include whether the respondent
has wanted to leave his or her residence and whether he or she would like to hit another
person. Respondents with a high score on the COR scale endorse items in a similar
pattern as those who have extensive criminal histories and legal involvement (Miller &
Lazowski, 1999). This scale is purported to assess the level of treatment or supervision
needed by the respondent, if there is evidence of a criminal history (Miller &
Lazowski). This scale is also not part of the screening decision rules. The reader is
cautioned that there is no published data to suggest that the COR scale predicts future
illegal behavior.
The SASSI-3 uses nine decision rules to arrive at a decision
about the respondent's likelihood of having a substance dependence disorder. Each of the
nine rules has between one and five criteria. These criteria are cutoff scores for seven of
the ten scales. If the cutoff score is met or exceeded, the rule is indicated as "yes". If
unmet, the rule is indicated as "no". Rules 1 and 2 are based solely on the FVA and
FVOD scales, respectively. Rules 3, 4, and 5 are based solely on data from the SYM,
OAT, and SAT scales respectively. The remaining rules 6-9, are based on a combination
of the various scales, both direct and indirect. Decision rule 6 requires a score of seven or
more on the OAT and five or more on the SAT to be a "yes". Decision rule 7 includes
two criteria. The first criterion is an FVA score of nine or more or an
FVOD score of fifteen or more. The second criterion is a SAM score of eight or more. If both
criteria are met, then Decision rule 7 is a "yes". Decision rule 8 requires a score of five or
more on the OAT, eight or more on the DEF, and eight or more on the SAM to be a "yes".
Decision rule 9 includes four criteria. The first criterion is an FVA score
of fourteen or more or an FVOD score of eight or more. The remaining criteria are two or more on
the SAT, four or more on the DEF, and four or more on the SAM. If all four criteria are met,
then Decision rule 9 is a "yes". An indication of "yes" on any of the nine decision rules
indicates a high probability of substance dependence. If all decision rules are answered
"no," the respondent has a low probability of being substance dependent. However, if a
respondent has a low probability of being substance dependent but scored eight or
more on the DEF scale, the counselor is cautioned that the results may be a false
negative.
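The combined rules (6 through 9) described above amount to simple cutoff logic, sketched below. This is an illustration only: the function name and input format are hypothetical, the cutoffs are those stated in the text, and rules 1 through 5 (single-scale cutoffs given in the manual) are omitted because their exact values are not reproduced here.

```python
def combined_rules(scores):
    """Evaluate SASSI-3 decision rules 6-9 as described in the text.

    scores: dict of raw scale scores, e.g. {"FVA": 9, "FVOD": 3, ...}.
    Returns True when any combined rule fires ("yes"), indicating a
    high probability of substance dependence; rules 1-5 are omitted.
    """
    # Rule 6: OAT of seven or more AND SAT of five or more.
    rule6 = scores["OAT"] >= 7 and scores["SAT"] >= 5
    # Rule 7: (FVA of nine or more OR FVOD of fifteen or more) AND SAM of eight or more.
    rule7 = (scores["FVA"] >= 9 or scores["FVOD"] >= 15) and scores["SAM"] >= 8
    # Rule 8: OAT of five or more AND DEF of eight or more AND SAM of eight or more.
    rule8 = scores["OAT"] >= 5 and scores["DEF"] >= 8 and scores["SAM"] >= 8
    # Rule 9: (FVA of fourteen or more OR FVOD of eight or more) AND
    # SAT of two or more AND DEF of four or more AND SAM of four or more.
    rule9 = ((scores["FVA"] >= 14 or scores["FVOD"] >= 8)
             and scores["SAT"] >= 2
             and scores["DEF"] >= 4
             and scores["SAM"] >= 4)
    return any([rule6, rule7, rule8, rule9])

profile = {"FVA": 10, "FVOD": 2, "OAT": 4, "SAT": 1, "SAM": 8, "DEF": 3}
print(combined_rules(profile))  # True: rule 7 fires (FVA >= 9 and SAM >= 8)
```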
SASSI Psychometrics
The following section will present and critique the available data regarding the
SASSI's reliability and validity. Initially, the researcher will provide the psychometric
data provided in the SASSI-3 Manual (Miller & Lazowski, 1999). Then, the data
provided by independent researchers will be introduced. This section will conclude with a
critique of the available literature as well as a recommendation for a new approach to the
question of the SASSI-3's psychometrics. To begin, the researcher will provide a brief
definition of each of these psychometric concepts in the following sections.
Reliability. Reliability means that an instrument yields stable results for a given
sample (Bartholomew, 1996; Mark, 1996; Traub, 1994). It is important to understand that
data about an instrument's reliability is sample specific (Gray, 2001). That is, reliability
is an attribute of and is specific to the sample and its data, rather than a characteristic of
the instrument. There are several methods of assessing the reliability of an instrument's
data. These methods are the test-retest, internal consistency, split-half, and inter-rater
reliability tests. Split-half and inter-rater reliability tests are not appropriate tests of
reliability for a screen such as this and thus have not been used to evaluate the SASSI-3.
Correlational statistics are used to evaluate reliability and include Cronbach's alpha,
often referred to as the alpha coefficient (Cronbach, 1951), the Pearson product-moment
correlation, and t-tests.
Test-retest reliability is used to explore the stability of the results from a given
instrument over a brief time period (Sproll, 1995). A scale is said to have test-retest
reliability when the value it assigns to a trait does not fluctuate between the pretest and
the posttest. To assess it, researchers administer the instrument once and then a second
time following a two- or four-week time delay. A correlation coefficient between the first
and second administrations is calculated, and the instrument is said to be reliable if the
test-retest yields stable scores across the time delay.
Internal consistency reliability is the extent to which the items on an instrument are
measuring a similar construct (Sproll, 1995). To obtain the internal consistency estimate,
the instrument is administered once, after which a statistical procedure reports the overall
mean correlation of each item's variance with every other item on the instrument. The
instrument is reported to be reliable if the items are strongly correlated with one another
(Reis & Judd, 2000). The internal consistency of an instrument is commonly referred to
as the alpha coefficient. The statistics used to evaluate internal consistency are
Cronbach's Alpha (Cronbach, 1951) and the Kuder-Richardson 20 and 21 (KR-20 & KR-
21; Kuder & Richardson, 1937). Each of these formulas measures internal consistency;
however, they are used in different circumstances. Cronbach's alpha can be used with
instruments employing any type of response option scale (i.e., from two-choice scales to
scales with more than two response options, such as Likert-type scales). The KR-20 and
KR-21 were designed specifically and exclusively for dichotomously scored items. Internal
consistency only answers the question of whether or not an instrument provides consistent results.
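The alpha coefficient described above can be computed directly; the following is a minimal standard-library sketch, with invented item scores for illustration:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha: items is a list of per-item score lists, each the
    same length (one score per respondent). With dichotomous (0/1) items
    this value coincides with KR-20."""
    k = len(items)
    item_variances = sum(pvariance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    return k / (k - 1) * (1 - item_variances / pvariance(totals))

# Two perfectly consistent hypothetical items yield an alpha of 1.0.
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
```

The formula compares the sum of the individual item variances to the variance of respondents' total scores: the more the items covary, the closer alpha is to 1.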
Validity. Validity has been considered in the past as the most fundamental property of an
instrument: the degree to which it measures what it purports to measure (Mark, 1996).
This means that if an instrument is reported as being able to assess substance dependence,
the instrument will indeed measure substance dependence and not self-esteem, anxiety, or
depression. While reliability can be investigated empirically using statistics and formulas,
"validity is more of a theoretical issue" (2000, p. 300). There are three ways to explore an
instrument's validity: content validity, construct validity, and criterion-referenced validity.
Content validity is sometimes referred to as "face validity", but the two concepts
are not synonymous. Face validity is defined by Mosier (1947) as "the extent to which
the items appear to measure a construct that is meaningful to lay persons or typical
examinees" (Cited in Cocker & Algina, 1986, p. 223). Content validity refers to the
extent to which the items in the instrument accurately reflect the domain of interest
(Bartholomew, Henderson, & Marcia, 2000). Due to concerns about denial and
defensiveness among people with substance abuse and dependence disorders, there is
some debate about the appropriateness of using face valid items to screen for these
disorders.
Content validity is the first evaluation used to classify an instrument as valid and
is typically done in the earliest stages of test development (Bartholomew, Henderson, &
Marcia, 2000). Several steps are followed to assess an instrument's content validity. The
first step is to establish the researcher's intent with regard to the instrument and develop a
pool of items. These items are then evaluated by content expert judges to ascertain their
degree of agreement with the objectives of the instrument (Crocker & Algina, 1986). The
correlation of these matches among judges is evaluated for congruence. Highly congruent
matches between judges, items, and objectives mean that the instrument has content
validity.
Construct validity is the degree to which an instrument measures the theoretical construct
for which it is responsible (Mark, 1996). Wallen and Fraenkel (1991) outlined a three step process
useful in identifying whether an instrument is high in construct validity. The steps are 1)
to create a clear definition of the variable, 2) based on a theory underlying the variable,
develop hypotheses which are formed about how people who possess a "lot" vs. a "little"
of the variable will respond to a particular situation, and 3) test the hypotheses both
logically and empirically - that is, by collecting additional information (Wallen &
Fraenkel, p. 95). For example, a researcher might ask: Is multicultural awareness the only
quality being measured by this instrument? In this manner, the instrument's structure is
evaluated for underlying constructs which may be present (Sproll, 1995, p. 77). To
investigate construct validity, the researcher developing the above-mentioned instrument
would want to compare it to the individual's level of cultural awareness as assessed by
another source of data, hoping for a high correlation. The researcher may also compare the individual's level of
racism (an opposing trait), using multiple sources of data, hoping for a low correlation.
These two types of correlations are used to demonstrate two types of construct validation,
convergent validity and divergent validity, respectively. Convergent validity is the degree
to which an instrument's results agree with other measures of the same construct
(Mark, 1996). Using the example above, the researcher can correlate the instrument's
scores with another measure of multicultural awareness. A high level of positive
correlation would mean that the instrument has convergent validity. Divergent validity,
in contrast, is demonstrated through low or negative correlations with opposing constructs.
Using the example above, the researcher can interview an individual regarding his or her beliefs
about racially charged political events and qualitatively analyze the results hoping for a
negative correlation between the results of the interview and the instrument. Or, the
researcher can evaluate two differing groups of people hoping to differentiate between
them using correlations with the instrument. A negative correlation would demonstrate
divergent validity.
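The convergent and divergent correlations described in this example can be illustrated with hypothetical data (all scores below are invented for illustration, not drawn from any study):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores: instrument vs. a related trait (convergent check)
# and vs. an opposing trait (divergent check).
awareness_scale = [10, 14, 18, 22, 30]
cultural_sensitivity = [12, 15, 17, 25, 28]  # expect strong positive r
racism_measure = [30, 24, 20, 15, 9]         # expect strong negative r

convergent_r = pearson_r(awareness_scale, cultural_sensitivity)
divergent_r = pearson_r(awareness_scale, racism_measure)
```

A high positive `convergent_r` and a negative `divergent_r` together constitute the two-sided evidence of construct validity described above.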
Construct validity can be evaluated using several methods. The construct validity
exploration of the SASSI-3 has only been completed using the convergent, divergent,
differentiation analysis, and factor analysis methods; therefore, only those methods are
discussed here. Convergent validity testing usually involves the use of alternative methods
of measurement if the primary construct is evaluated via a survey instrument (Litwin, 1995).
Convergent validity can be evaluated with correlation coefficients; a correlation of .70 or
more is acceptable in social science research (Nunnally, 1978). There is less agreement as
to the acceptable level of the kappa coefficient as there are several different
interpretations which distinguish the levels of kappa (see Altman, 1991; Carletta, 1996;
Landis & Koch, 1977; Viera & Garrett, 2005). Altman (1991) adapted the Landis and
Koch (1977) interpretation table for kappa indicating that a kappa score of less than .20 is
poor agreement, .21-.40 is fair agreement, .41-.60 is moderate agreement, .61-.80 is good
agreement, and .81-1.00 is very good agreement. Factor analysis is another method of
exploring construct validity. It allows researchers to identify the structure of the instrument and
validate whether it is measuring a common factor (Sproll, 1995). Using the factor
analytic method of validity testing, researchers compute a correlation matrix between the
subjects and items and then conduct a reduction technique to identify the number of
underlying constructs accounting for the variation in the variables (Crocker & Algina,
1986).
Criterion referenced validity is the degree of agreement the instrument has with a
'gold standard' for "assessing the same variable" (Litwin, 1995, p. 37). This 'gold
standard', the criteria against which the instrument is compared, is regarded as the best
measure of the construct (Litwin). The instrument being tested may be a more efficient,
cost effective, quicker or shorter method of evaluating the same construct. Criterion
referenced validity tests have a five step design as identified by Crocker and Algina
(1986). Those steps include 1) identifying a construct and a method to evaluate it; 2)
selecting a sample; 3) collecting and maintaining the data for future evaluation; 4) when
available, obtaining data on the comparison construct for each participant; and 5) using a
statistical procedure to determine the strength of the relationship between the instrument
and the criterion. Criterion-referenced validity takes two forms, predictive and concurrent.
An instrument is said to have predictive validity if it can predict, through the use of a
correlation, a future second variable (Sproll, 1995). A common example of this type of
validity test involves pre-college entrance examinations such as the Scholastic Aptitude
Test (SAT). Admissions departments often base their determinations on SAT scores
among other criteria because the SAT is said to predict future performance in college.
Concurrent validity, in contrast, is the degree to which an instrument agrees with a criterion
construct that is present at the time of the evaluation (Crocker & Algina, 1986; Mark,
1996). For the SASSI-3, the relevant criterion is a clinical diagnosis using the Diagnostic
and Statistical Manual of Mental Disorders-IV-Text Revision (DSM-IV-TR) criteria,
rendered by a licensed mental health professional.
A useful screening instrument will correctly identify those who meet the 'gold standard'
criteria and correctly identify those who do not meet the 'gold standard' criteria. The
terms used to describe these two conditions are sensitivity and specificity, respectively
(Altman, 1991).
With regard to this study, sensitivity refers to the SASSI-3's ability to correctly identify
persons with a substance use disorder, and specificity is the ability to correctly identify
those who do not have a substance dependence disorder. These two concepts are closely
related to the concepts of false positives and false negatives. False positive is when a
screen incorrectly identifies someone as having a substance use disorder. False negative
is when a screen incorrectly says that a person does not have a substance use disorder. If
an instrument is high in sensitivity and specificity, it is low in false positives and false
negatives.
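These four quantities have a direct computational relationship, sketched below with invented screening counts for illustration:

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of truly dependent respondents the screen flags."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Proportion of truly non-dependent respondents the screen clears."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical outcomes against a diagnostic gold standard:
# 85 true positives, 15 false negatives, 90 true negatives, 10 false positives.
sens = sensitivity(85, 15)  # false-negative rate is 1 - sens
spec = specificity(90, 10)  # false-positive rate is 1 - spec
```

As the text notes, raising sensitivity lowers the false-negative rate and raising specificity lowers the false-positive rate, since each pair sums to 1.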
SASSI-3 reliability from the SASSI-3 Manual. The authors of the SASSI-3 Manual
report that the two-week test-retest stability coefficients are 1.0 for the face valid scales
and between .92 and .97 for the clinical scales for the sample taken from voluntary
programs, and a sexual offender treatment program across the United States (Miller &
Lazowski, 1999). They report the alpha coefficient as .93 for the entire instrument.
However, the reported alphas by scale are low with the exception of the face valid scales.
The FVA and FVOD scales' alphas are .93 and .95 respectively. The SYM, OAT, and
SAT scales' alphas progressively decrease from .79 to .69 to .27. The DEF scale
alpha is .63 and is followed again by a decrease in values for the SAM and FAM scales'
alphas of .37 and .33 respectively. The COR scale has a .71 alpha. These varying alpha
values are explained by the authors by identifying that the instrument was not developed
to be unidimensional and therefore, the alpha findings are "not a primary consideration"
(Miller & Lazowski, 1999, p. 26). Support for their findings has been mixed. Independent
investigations have found the SASSI-3 data to be at varying levels of reliability, with
inconsistent findings when compared to the results found by the originators of the
instrument. Consistent with Miller and Lazowski (1999), several researchers have
identified that stability coefficients are the most meaningful reliability test because the
SASSI-3 was not constructed to be a unidimensional measure (Lazowski et al., 1998;
Miller & Rosenberg, 1998). However, this assertion has been both supported and
contradicted, producing inconsistent findings (Feldstein & Miller, 2007). In their study
investigating the efficacy
of the SASSI-3, Lazowski, Miller, Boye, and Miller (1998) utilized a two-week
test-retest to explore reliability with a similar population as that reported in the SASSI-3
Manual. They found SASSI-3 score stability to be between 1.0 for the face valid scales
and between .92 and .97 for the subtle scales. These findings are consistent with the
findings reported in the SASSI-3 Manual which is 1.0 for both the FVA and FVOD
scales (Miller & Lazowski, 1999). With a college sample, Laux, Salyers and Kotova
(2005) also found high stability scores in a two-week test-retest reliability investigation
of the face valid scales. Myerholtz and Rosenberg (1998) tested the reliability of the
SASSI-2 using the test-retest method with
college students. Using several subsamples, these researchers found that the two-week
stability coefficient using the Pearson product-moment correlation coefficients for the
FVA and FVOD scales were .82 and .89 respectively. This demonstrates a moderately
high level of correlation indicating that the set of scores remained relatively stable.
However, other studies have found higher correlation coefficients (i.e., Laux, Salyers, &
Kotova [2005] found .94 for the FVA). The face valid scale stability findings range from
1.0-.97 (Lazowski et al., 1998; Miller & Lazowski, 1999). In the social sciences, it is
generally acceptable if the stability is above .70 (Nunnally, 1978). The clinical scales
indicate a more widely spread correlation coefficient across the scales. According to
Myerholtz and Rosenberg (1998), the stability coefficients ranged from .78 to .54,
averaging .71 across the six clinical scales. This indicates less stability than reported by
Miller and Lazowski (1999) but moderate stability in the set of scores between testing times.
With respect to overall classification, a significant but rarely reported finding was that
some participants' classifications changed between the two administrations: some initially
classified as chemically dependent were later classified as non-dependent, and three were
found to be chemically dependent after initially being classified as non-chemically dependent.
In the 4-week test-retest condition with college students, Myerholtz and Rosenberg (1998)
found that the stability results were mixed. The Pearson
product moment correlation coefficients for the Face Valid (FVA and FVOD) scales were
.76 and .93. This demonstrates a moderate and high level of correlation between time one
and two, indicating that the set of scores remained stable across the scales. Myerholtz and
Rosenberg found for the clinical scales the correlation coefficient for the 4-week test-
retest group ranged from .78 to .42, averaging .63 across the six clinical scales. This
indicates less stability in the set of scores between testing times for the clinical scales. In
addition, of the 47 participants, nine (19%) were found to have a change in classification
from the first to the second testing time four weeks later indicating "poor" reliability
(Myerholtz & Rosenberg, 1998, p. 441). Four (10.5%) of the 38 participants initially
found to be non-chemically dependent for test 1 were classified as chemically dependent
four weeks later on the retest. Five (56%) of the nine participants initially found to be
chemically dependent were classified as non-chemically dependent four weeks later on the
second administration. These 4-week stability estimate
results cannot be placed in the context of the SASSI-3 authors' findings as the SASSI-3
Manual only reports 2-week correlation coefficients (Miller & Lazowski, 1999).
The Myerholtz and Rosenberg (1998) findings indicate that there is higher
stability in scores for test-retest for the SASSI-2 direct scales and poor stability for the
clinical scales. They conclude that because the SASSI-2 purports to screen for an
"enduring trait of chemical dependency", the inventory should have more robust clinical
scales and fewer changes in status over testing situations (Myerholtz & Rosenberg, p.
445).
Four studies have investigated the internal consistency of the SASSI-3 or its
scales. The face valid scales have been found to have high internal consistency. The
reported subtle scales' internal consistency varies from good to poor. The coefficient
alpha for the FVA was .92 in a study comparing the SASSI-3 to other substance abuse
screening instruments with a college population (Laux, Salyers, & Kotova, 2005). The
coefficient alpha for the FVOD was .95 in a study which supports the psychometric
properties of the scale using a college student population (Laux, Perrera-Diltz, Smirnoff,
& Salyers, 2005). These two coefficient alpha findings for the FVA and FVOD scales are
consistent with those of Clements (2002) and Miller and Lazowski (1999). The decision
rule findings for SASSI-3 produced a .49 coefficient alpha (Clements, 2001). This means
that the items included in the scales used by the decision rules to classify a person as
substance dependent have low internal consistency as a set. Clements also found that the
three direct scales had the highest coefficient alphas and the subtle scales the lowest.
SASSI-3 validity data from the SASSI-3 Manual. While the SASSI-3 Manual
(Miller & Lazowski, 1999) identifies two scales as "face valid," content validity is not
reported for those scales, any other scale, or the instrument as a whole. Stating the
obvious, however, the FVA and FVOD scales appear to be meant to reflect their
content validity. In a study conducted by Lazowski, Miller, Boye, and Miller (1998), the
researchers explored previous research that compared the SASSI-3 to the
MMPI-2 Addiction Potential Scale (Weed et al.), the MAC-R (MacAndrew, 1965), the MAST
(Selzer, 1971), and the Millon Clinical Multiaxial Inventory-II (MCMI-II) Alcohol
Dependence Scale and Drug Dependence Scale (Millon, 1987). They found that people
who scored positive for substance dependence on the SASSI-3 had higher mean scores, and
all of those who scored non-dependent on the SASSI-3 had lower mean scores, on the
comparison measures.
Miller and Lazowski report in the SASSI-3 Manual (1999) that, when using clinical
diagnosis as the criterion, the SASSI-3 achieved 94.2 percent specificity. Again comparing
the SASSI-3 to clinical diagnosis, Lazowski, Miller, Boye, and Miller (1997) found a high
overall accuracy rate.
SASSI validity data from independent researchers. In the current SASSI literature,
researchers have compared the SASSI to other survey instruments measuring the same
construct (Laux, Salyers, & Kotova, 2005; Lazowski, Miller, Boye, & Miller, 1998;
Myerholtz & Rosenberg, 1998). When comparing the SASSI-2 to other instruments
which also purport to screen for alcohol and drug problems, Myerholtz and Rosenberg
(1998) found that the SASSI-2 had less than acceptable (.61) convergent validity with the
CAGE (Ewing, 1984; Mayfield et al., 1974). "CAGE" is an acronym, the letters of which
represent the following alcohol-related traits and behaviors: C- have you ever felt you
should cut down on your drinking, A- have people annoyed you by criticizing your
drinking, G- have you ever felt bad or guilty about your drinking, and, E- have you ever
had a drink first thing in the morning to steady your nerves or to get rid of a hangover?
Laux, Salyers, and Kotova (2005) compared the SASSI-3's classification
agreement with the MAST, CAGE, and MAC-R (see Table 1). Using the Altman
approach to kappa interpretation, Laux, Salyers, and Kotova (2005) identified that the
agreement between the SASSI-3 and the CAGE and MAST is in the "high-moderate"
range.
Table 1

Kappa Coefficients Between the SASSI and Other Screening Instruments

Myerholtz & Rosenberg (1998): .61 (good), .58 (moderate), .34 (fair), .22 (fair)
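The Altman (1991) interpretation bands applied in Table 1 can be expressed as a small helper function; this is a sketch of the published band boundaries, not part of any SASSI scoring procedure:

```python
def altman_kappa_band(kappa):
    """Altman's (1991) adaptation of the Landis and Koch (1977) bands:
    <.21 poor, .21-.40 fair, .41-.60 moderate, .61-.80 good, >.80 very good."""
    if kappa < 0.21:
        return "poor"
    if kappa < 0.41:
        return "fair"
    if kappa < 0.61:
        return "moderate"
    if kappa < 0.81:
        return "good"
    return "very good"
```

Running the Table 1 coefficients through this helper reproduces the labels reported there.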
A factor analytic evaluation of the SASSI-2 was published by Gray (2001) who
found through confirmatory factor analysis and exploratory factor analysis that the ten
factor solution suggested by the ten scales identified in the SASSI-3 Manual was not a
good fit for his data. In fact, a two factor solution, with items mostly representing the
FVA and the FVOD scales, accounted for up to 53 percent of the variance (Gray). The
subtle items did not organize into the scales as identified by the SASSI-3 Manual and
were found to be "multivocal" (Gray, p. 109). This dimensionality was confirmed later in
two studies exploring the FVA and FVOD scales, respectively (Laux, Salyers, & Kotova,
2005; Laux, Perrera-Diltz, Smirnoff, & Salyers, 2005). Arneth et al. (2001)
examined the SASSI-3's predictive validity among patients with traumatic brain injury
(TBI). These authors compared the TBI patients' SASSI-3 results with their blood
alcohol level (BAL) at the time of their injury in an effort to see which would better
predict chemical dependency. They found that the SASSI-3's results and BAL were
equally predictive.
Researchers (Arneth et al., 2001; Clements, 2002; Peters et al., 2000; Svanum &
McGrew, 1995) investigating the SASSI's criterion referenced validity have used
diagnoses by licensed mental health professionals using the DSM-IV-TR criteria as the
gold standard. The results of these studies are mixed and are not consistent with the
results published in the SASSI-3 Manual (Miller & Lazowski, 1999). For example, in a
study of the SASSI-2, Svanum and McGrew (1995) found a sensitivity of 33 percent.
With a different population, Peters et al. (2000) found an overall accuracy rate for the SASSI-2 of 69.4
percent with a sensitivity of 73.3 percent and a specificity of 62.2 percent. A similarly
lower finding came from a study of TBI patients using the SASSI-3 and diagnostic
criteria (Arneth et al., 2001). The accuracy rate was found to be 69.2 percent, sensitivity
rate equaled 70.8 percent, and specificity was 68.5 percent. All of which were reported to
be statistically significantly different than the normative sample at the p<.001 level
(Arneth et al.). Using a sample of college students, Clements (2002) also found lower
hypothesized that if the cutoff scores were lower for the college population, the SASSI-3
may have higher sensitivity. Upon further investigation, Clements found that adjusting the
cutoff scores changed the instrument's classification accuracy. In sum, independent
researchers have been unable to replicate many of the sensitivity, specificity, and overall
accuracy findings reported by the instrument's authors.
The SASSI-3 is one of the most frequently used substance dependence screening
instruments used by counselors and has been identified as the "most important"
instrument of its kind (Juhnke et al., 2003). Unfortunately, there is significant question
about the instrument's reliability and validity. This may be due to differences in the
methods researchers use to evaluate these properties, or to error from several sources.
Those sources, which may contribute to changes in scores from the initial to the
follow-up test, include: (a) the individual's attempts to recall what was previously asked or
how they answered, (b) changes in the characteristic being assessed, and (c) changes in
the conditions or environment and the interaction between the individual and those conditions.
The major limitation in testing the internal consistency of the SASSI-3 is
the fact that many of the instrument's scales were not developed to measure one
construct. Rather, their test construction and item selection were guided by the criterion
of discriminating between people who are substance dependent and those who are not, regardless of the
item's content (Miller & Lazowski, 1999). Therefore, internal consistency is a less
relevant reliability measure for the SASSI-3 (Miller & Lazowski, 1999; John & Benet-
Martinez, 2000).
bias" (Brewer, 2000, p. 9) which can reduce validity. Researchers can use two different
methods to evaluate the construct which will aide in eliminating mono-operational bias.
construct, he or she should use a method other than a survey instrument to evaluate a
validity". While these concepts can compare two measures and varying sources of data,
they in fact are fundamentally different in their intentions. Construct validity involves
SASSI with other substance dependence screening instruments, the findings result in
mixed outcomes at best. These mixed outcomes may be the result of a lack of clarity and
validity are sample specific. If the sample changes, the results of the reliability and
validity investigations will change as well (Keeves & Masters, 1999). The performance
of the person is dependent on the instrument in classical test theory because there is an
interaction between the instrument and the sample (Keeves & Masters). As a
consequence of this interaction, no inferences can be made about the performance of any
one person on any particular item. Instead, all that can be known is the individual's
performance on the test as a whole (John & Benet-Martinez, 2000). Additionally, there is
no way to empirically evaluate the quality of any individual item (Kagee & deBruin,
2007). Nor is there a way to empirically evaluate the response scales of the instrument in
classical test theory (Keeves & Masters). Finally, researchers often assign numbers to an
ordinal scale and then assume that those numbers are interval and mean the same for each
item in order to use them in statistical analyses (Keeves & Masters). Each of these
limitations can be addressed through the use of measurement models in which a person's
performance and the items are independently scaled along a continuous, interval-level dimension.
Rasch Measurement
The Rasch model offers a means of evaluating an instrument's reliability and validity
(Fox & Jones, 1998). This method of evaluating an instrument rests on fundamental
principles of measurement (Thurstone, 1927). These principles are the same principles
utilized when measuring the height of a house, the weight of a baby, or the volume of a
container of liquid. These principles, (a) unidimensionality; (b) linearity; (c) invariance;
and (d) independence, can be applied to instruments which are designed to measure
psychological constructs in humans (Stone, 2007). Each of the
Thurstonian principles will be described in detail and will include an example of its
application to a popular measure of general distress, the Symptom Checklist-90-Revised
(SCL-90-R; Derogatis, 1975), as evaluated using the Rasch method by Elliott et al. (2006).
Unidimensionality means that an instrument measures only one characteristic of an object
at a time (Bond & Fox, 2007). For instance, a scale only measures weight, not height. A
ruler only measures length and not temperature. In counseling research, this means that an
instrument should only measure one trait or construct at a time. Elliot et al. (2006) used
the Rasch model to evaluate the psychometric properties of the SCL-90-R. The researchers found that
the instrument measured the construct of "general clinical distress" as evidenced by the
measurement principal components analysis finding that the instrument accounted for 78
percent of the total variance (Elliot et al., p. 359). This means that the SCL-90-R is
measuring one construct; the items on this instrument aligned in a hierarchical fashion
according to difficulty.
Linearity implies that an object of measurement has more or less of the construct.
For instance, a person has more height or less height than another person, more weight or
less weight than another. In counseling research, an example is that an instrument should
measure more or less of a construct such as more or less anxiety, or more or less
depression in a person. This is evident in the SCL-90-R because the analysis using the
above principles of measurement found increasing levels of severity both among the
items and the people (Elliot et al., 2006). There was a continuum of items from more to
less severe and a corresponding range of agreeability for people from "non-clinical" to
"extreme distress," indicating a hierarchical structure.
Invariance means that the unit of measurement remains constant across samples
(Stone, 2007). For instance, five inches is equal to five inches regardless
of where on the ruler one begins to measure or what one is measuring. In counseling
research, this means that an instrument regardless of whether one starts measuring with
the low end units or the middle units of the "ruler" will result in measuring the same size
unit. In the psychometric analysis of the SCL-90-R, Elliott et al. (2006) found that the
instrument could be used to measure people at the high end of the ruler, demonstrating
that individuals were experiencing extreme clinical distress, and at the low end of the
ruler, demonstrating this part of the sample was experiencing non-clinical distress.
Independence means that a measuring instrument "must not be seriously affected in its
measuring function by the object of measurement" (as cited in Wright, 1960, p. ix). For
instance, whether a person is weighing apples at the produce
market, a baby at birth, or gold, the scale is an instrument used to measure and ounces are
the unit of measurement regardless of the item being measured. In addition, the scale
does not measure the color of the apples, the length of the baby, or the karats of gold. In
counseling research, the same standard should hold. Bond and Fox (2007) observe that in
a world where machines weigh produce, print tickets, and dispense medications, people
have come to rely on systems of calibrated measurement, and they ask why we "change
our definition of and standards for measurement when the human condition is the focus
of our attention" (Bond & Fox, p. 1).
The Rasch model operationalizes these principles of measurement (Fox & Jones, 1998).
Many researchers have explored the psychometric properties of
several different psychological constructs and instruments using the Rasch model. Some
of those investigations include hostility (Strong, Kahler, Greene, & Schinka, 2005), the
Symptom Checklist-90-Revised (Elliott et al., 2006), school readiness (Banerji, Smith, &
Dedrick, 1997), detainees distress (Kagee & de Bruin, 2007), and evidenced based
practices in the criminal justice system (Henderson, Taxman, & Young, 2008).
The Rasch model is user friendly for instrument development and for instrument
evaluation. Winsteps is the computer program used for this evaluation. Winsteps provides
researchers easy-to-read
tables, charts, and graphs. The variable, scales, items, and people can be represented
through clear pictorial representations such as the person-item map and the response
probability curves. These charts and graphs will be referred to throughout this section and
those that follow.
Elliot et al. (2006) used the Rasch model to explore psychometric properties of
the SCL-90-R. Their method outlined the process by which other researchers can
evaluate instruments. This method, to be described below, involves the following steps:
1) evaluate the separation and reliability for the entire instrument, 2) validate the response
scale, 3) analyze the item fit, 4) conduct the construct analysis, 5) evaluate the instrument for
unidimensionality by reviewing the fit statistics and the principal components analysis,
and 6) investigate whether the items function the same with a different sample. After
each step, the person and item separation and reliabilities will be evaluated for changes.
Separation and Reliability. Classical test theory's internal consistency is analogous to the
Rasch model's person separation and item separation
reliabilities (Fox & Jones, 1998). The separation statistic assists in identifying the number
of distinct groups among the items and people (Elliot et al., 2006). From the separation
statistic the strata index can be determined (Bond & Fox, 2007). The strata inform
researchers of the statistically distinct groups of people and items found. It is suggested
that a separation of two is the minimum acceptable standard (Wright & Masters, 1982 as
cited in Elliot et al.). A separation of two or greater creates three or more distinct groups
of items or people. The output, known as item map, is another indication of person and
item separation as they can be visually distinguishable on this diagram (Elliot et al.).
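The relationship between separation and strata stated above can be sketched with the Wright and Masters strata formula; the formula itself is an assumption here, since the text gives only the separation-of-two benchmark:

```python
def strata(separation):
    """Strata index H = (4G + 1) / 3, where G is the separation statistic
    (attributed to Wright & Masters); a separation of 2 yields 3 strata."""
    return (4 * separation + 1) / 3
```

With a separation of 2 the formula returns 3, matching the text's statement that a separation of two or greater creates three or more distinct groups of items or people.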
The first step in evaluating an instrument is to review the separation and
reliability (Elliot et al. 2006). The Rasch outputs offer two sets of statistics for separation
and reliability, one is for the items and the other is for the participants which is called
"person". The second step is to evaluate the separation and reliability for the subscales.
The separation and reliability statistics will be the basis upon which the researcher will
compare any changes made to the response scales or elimination of misfitting items. For
example, if the researcher eliminates a misfitting item, this may affect the separation and
reliability statistics. If it increases these two statistics, then the outcome of the change is
positive. If it decreases these two statistics, then the change may limit the information
available from the instrument.
Response Validation. Researchers can use the Rasch model to determine whether
participants utilized the rating scale as established by the developers. This process is
known as rating scale analysis; the analysis can reveal that the response scales may not be
working as the researchers intended (Elliot et al., 2006).
Completing a rating scale analysis allows researchers to test their hypotheses regarding
whether the rating scale was clear, had the correct amount of response choices, and
whether the participants were using the scale as developed (Fox & Jones, 1998).
Conducting an analysis of the rating scale also allows researchers to evaluate whether the
instrument's items function unidimensionally (Elliot et al.). For this analysis, the
commonly accepted rule is that the distance between two adjacent response options
(threshold) should be more than 1.4 but not more than 5 logits (Linacre, 1999). A logit is
a unit of measurement that is arranged on an equal interval log scale (Bond & Fox, 2007).
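Because the logit scale underlies both the threshold rule and the item calibrations, a generic sketch of the dichotomous Rasch model may help; the ability and difficulty values below are invented for illustration and are not estimates from this study:

```python
import math

def p_endorse(ability: float, difficulty: float) -> float:
    """Dichotomous Rasch model: P = exp(B - D) / (1 + exp(B - D)),
    with person ability B and item difficulty D both in logits."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# For the same person, an easier item (lower D) is more likely endorsed.
print(round(p_endorse(0.0, -1.0), 3))  # 0.731 for the easier item
print(round(p_endorse(0.0, 2.0), 3))   # 0.119 for the harder item
```

When ability equals difficulty, the probability of endorsement is exactly .50, which is what allows persons and items to be placed on the same equal-interval logit ruler.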
A second way to evaluate the rating scales is to visually inspect the response probability
curves output. For each item, a probabilistic curve is created from the data. This curve
demonstrates the likelihood of each response option being chosen by the sample. If any
response option curve does not exceed 50 percent probability of being selected or the
threshold falls below 1.4 or exceeds 5 logits, test developers should consider re-evaluating the rating scale, redefining the options, or logically collapsing two response options into one. Analyzing the rating scale in this way assists in scale development and individual diagnosis. In the SCL-90-R Rasch analysis,
the researchers found that the respondents were not using the response scale as expected,
and therefore, for the instrument's response scales to function as intended and maintain
separation and reliability, it became necessary to collapse the five point Likert-type scale
to a three point scale (Elliot et al.). This means that the original scale, i.e., (1) not at all, (2) a little bit, (3) moderately, (4) quite a bit, (5) extremely, needed to be adjusted because individuals did not respond to these categories in five distinct ways; instead, individuals responded in three distinct ways: (1) not at all, (2) a little bit and moderately, and (3) quite a bit and extremely. By collapsing the rating scale in this manner, the instrument's separation and reliability were maintained.
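The collapsing just described is a simple recoding of categories; a hypothetical sketch (the category mapping follows the text, the responses are invented):

```python
# Collapse the 5-point SCL-90-R scale into the 3 categories respondents
# actually distinguished (mapping from the text; sample data invented).
COLLAPSE = {1: 1,        # not at all
            2: 2, 3: 2,  # a little bit / moderately
            4: 3, 5: 3}  # quite a bit / extremely

def collapse_responses(responses):
    return [COLLAPSE[r] for r in responses]

print(collapse_responses([1, 2, 3, 4, 5]))  # [1, 2, 2, 3, 3]
```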
Item Fit Analysis. Item fit is similar to construct validation from a factor analysis
point of view. The purpose of item fit analysis is to investigate whether any item is
measuring "something qualitatively different" than the construct of focus (Elliot et al.,
2006, p. 362). Fit statistics are sensitive to "unexpected variance in response patterns"
(Henderson, Taxman & Young, 2008, p. 166). Bond and Fox (2007) identify the in-fit
mean square cutoff as 1.4 for items that are measuring something different. If an item is
over 1.4, the researcher should consider that the item in question is not measuring the
construct of interest. To explore item redundancy, the same criterion is applied along with the standardized residual correlations, which function like tests of significance for each item. Items with high standardized residual correlations are redundant and do not contribute to the information provided by the data. Items that are considered redundant have an out-fit z-standard score under 0.7 and are among the highest standardized residual correlations; eliminating them should not negatively impact the separation and reliability statistics. Fit statistics are thus a test of unidimensionality.
Construct Analysis. The item map is useful in the evaluation of instruments when conducting construct analysis. The Rasch model
analyzes linearity by allowing for item ordering along a continuum ranked by difficulty
(Elliot et al., 2006). A way to conceptualize Rasch construct analysis is to consider a flag
pole as the variable of substance dependence, from less, (i.e., low on the pole) to more
(i.e., high on the pole). Flags on the left of the pole are items. Items are arranged from
difficult to endorse at the top, to easy to endorse at the bottom of the pole. People are
arranged on the right of the pole from possessing more substance dependence at the top
to less substance dependence at the bottom of the pole. In considering the FVA scale of
the SASSI-3, the items inquire about the behaviors of respondents involving alcohol such
as drinking with lunch or suicide attempts when drinking. Respondents are more likely to
endorse the item regarding drinking with lunch than the item inquiring about suicidal behavior when drinking. Therefore, the second item is considered more difficult. This
item continuum allows researchers to compare the order of items to clinical and theoretical expectations (Banerji, Smith, & Dedrick, 1997). Two methods are used to investigate an instrument's unidimensionality. The first method is to investigate the fit statistics, described above, in which a misfitting item may indicate a secondary dimension (Elliot et al., 2006). The second method is the Rasch principal components analysis (RPCA), which identifies the explained variance for the instrument (Elliot et al.). In the study investigating the SCL-90-R, the researchers found that while the instrument is not completely measuring a unidimensional construct, the additional dimensions are trivial in comparison to the overall distress construct identified in the RPCA, which demonstrated that the measure accounted for at least 78 percent of the variance.
A final way to evaluate an instrument's validity and reliability is to compare two samples to verify the consistency
of the measure (Elliot et al., 2006). This analysis allows researchers to assess whether a
measure maintains its meaning across different samples. This is the measurement
property of invariance or specific objectivity (Bartholomew, 1996; Bond & Fox, 2007).
Items should line up according to their level of difficulty, regardless of the population measured. In the study investigating the SCL-90-R, Elliot et al. identified that there was no meaningful or statistical difference between the clinical and non-clinical samples on the item hierarchy.
Summary
This chapter reviewed the published evidence of the SASSI's reliability and validity. The SASSI authors' and independent researchers' results varied in
many ways. Limitations of the traditional approaches taken were highlighted and a
different method was introduced, the Rasch model. The Rasch model has been used to
successfully investigate the quality of a measure of general clinical distress, the SCL-90-R (Elliot et al., 2006). If the SASSI-3 can be found to work as a single "ruler" of substance dependence, then the instrument can be said to meet fundamental measurement properties.
Chapter Three
Methods
Overview
Chapter Three presents the research methodology that was used to answer the research questions.
The participants were samples collected from two previous research investigations. The
demographic information of these samples is presented in this chapter. The SASSI-3 will
be reviewed and the procedure by which it was analyzed via the Rasch model will be
outlined.
The purpose of this study was to evaluate the SASSI-3 using the Rasch model of measurement. The research questions are as follows:
Research Hypothesis 1: A Rasch principal components analysis will
produce a unidimensional factor structure that accounts for 60% or more of the items'
total variance.
Research Question 2: Do the SASSI-3's scales adequately measure the construct?
Research Hypothesis 2: An analysis of item fit will produce infit and outfit statistics indicative of low item error.
Research Question 3: Are measures from the SASSI-3 reliable for diagnostic
classification purposes?
Hypothesis 3a: The SASSI-3's measures will demonstrate acceptable levels of internal consistency.
Hypothesis 3b: The SASSI-3 decision rule scales (as evidenced in the person-item map) will be reliable for classification purposes.
Research Question 4: Does the SASSI-3 clearly discriminate between those who are substance dependent and those who are not?
Research Hypothesis 5: A Rasch principal components analysis will produce a unidimensional factor structure that accounts for 60 percent or more of the items' total variance.
Research Question 6: Does the whole SASSI-3 adequately measure the substance
dependence construct?
Research Hypothesis 6: An analysis of item fit will produce infit and outfit
statistics indicative of low item error for the SASSI-3 instrument as a whole.
Research Question 7: Are measures from the whole SASSI-3 reliable for diagnostic classification purposes?
Research Question 8: Is the whole SASSI-3 able to clearly discriminate between those who are substance dependent and those who are not?
Participants
The participants in this study consist of a total of 358 adults from two previous
research investigation samples collected from the greater Toledo Area (see Laux, Salyers,
& Kotova 2005 for the study involving the first sample). Institutional Review Board
approval was granted for the first study involving a sample of 230 students, men
accounted for 21.2 percent of the sample (n=49), and women accounted for 78.8 percent
(n=181). The students attended a large Midwestern university and were enrolled in social work or counseling courses (mean number
of years in college was 3.5, SD = 2, range = 0-10, median=4). The sample self-identified
ethnicity included 62.6 percent (n=144) European American, 24.8 percent (n=57) African
American, 3 percent (n=7) Native American, 2.6 percent (n=6) biracial, 1.7 percent (n=4) Hispanic, .4 percent (n=1) Asian American, and 4.8 percent (n=11) did not report (Laux et al., 2005). The mean age for this sample was 28.1 years (SD=10.4, range=18-59,
median=26).
The second sample was drawn from a community agency and court cooperative program designed to assist in reunifying drug
and alcohol abusing parents with their children. The data was collected by the
professionals involved in the daily administration of the program and provided to the Substance Abuse and Mental Health Services Administration (SAMHSA). This second sample contained a total of 235 adults with 20.9 percent (n=49) men, 77.0 percent
(n=181) women, and 2.1 percent (n=5) did not report. The sample self-identified ethnicity
included 61.3 percent (n=144) European Americans, 24.3 percent (n=57) African
Americans, 3 percent (n=7) Native Americans, 1.7 percent (n=4) Hispanics, 2.6 percent
(n=6) biracial, 0.4 percent (n=1) Asian American, and 6.8 percent (n=16) did not report.
The mean age for this sample was 28 years (SD=11, range 19-59, median=23).
The samples were combined and selected, by utilizing a random numbers table, to
create two groups. The samples were combined to ensure that a portion of each of the
groups contains individuals with problems related to substance abuse necessitating some
therapeutic intervention. If the SASSI-3 functions as a measure, these samples should
represent a wide range on the substance dependent ruler. The first group was used for the
initial validation of the SASSI-3; the first purpose of this study. The second group was
used to evaluate the SASSI's independence against the first sample; the second purpose
of this study.
The Substance Abuse Subtle Screening Inventory-3 (Miller & Lazowski, 1999)
was developed to identify individuals who had a high probability of being substance
dependent (Miller & Lazowski). The instrument was first published in 1988 and has been revised twice since, with the SASSI-3 as the current version.
The SASSI-3 is a paper-and-pencil screening instrument printed on both sides of one
page. It is brief, easy to administer and score, and is economical. The front consists of 67
true and false items. The back has 26 items with rating scale choices of 0-3 indicating
never, once or twice, several times, and repeatedly. The front side includes subtle items,
which purportedly indirectly inquire about substance abuse related issues. However,
several of these items directly pertain to past alcohol and drug use.
The developers of the SASSI-3 identified ten scales upon which to measure
individuals for the probability of substance dependence (Miller & Lazowski, 1999).
These ten scales include the Face Valid Alcohol scale (FVA), the Face Valid Other Drug
scale (FVOD), the Symptom scale (SYM), the Obvious Attributes scale (OAT), the
Subtle Attributes scale (SAT), the Defensiveness scale (DEF), the Supplemental Addiction
Measure scale (SAM), the Family vs. Control Subjects scale (FAM), the Correctional
scale (COR), and the Random Answering Pattern scale (RAP). The FVA and FVOD
scales' items directly question the respondent about his or her alcohol and other drug use.
The SYM scale assesses respondents' symptoms and consequences of drug and alcohol
use. Obvious traits associated with substance use are measured through the OAT scale,
while subtle traits are measured through the SAT scale. The DEF scale is a validity scale
which measures respondents' defensiveness to the SASSI-3's items. The SAM scale is
meant to discriminate between persons whose high DEF scores are due to substance
specific defensiveness from those whose elevated DEF scales are due to some other
source of defensiveness. The FAM scale evaluates the amount that the respondent
focuses his or her own feelings or thoughts on herself or himself versus the feelings or
thoughts of others. The COR scale reports on the similarity of a respondent's scores to a
group of persons known to have a history of criminal behavior. Finally, the RAP is a
validity scale which determines whether a respondent was answering in a random pattern.
If a respondent's RAP score is greater than one, then the respondent's screening
may be invalid. Therefore, prior to scoring, the RAP scale should be reviewed. The
scoring procedures include nine decision rules which are used to determine the likelihood
of substance dependence for the respondent. For each of the scoring rules, should the respondent's scores meet the rule's criterion, the respondent is classified as having a high probability of being substance dependent.
The results reported by independent researchers (Arneth, Bogner, Corrigan, & Schmidt, 2001; Clements, 2002; Feldstein & Miller, 2007; Gray,
2001; Laux, Perera-Diltz, Smirnoff, & Salyers, 2005; Laux, Salyers & Kotova, 2005;
Lazowski, Miller, Boye, & Miller, 1998; Svanum & McGrew, 1995; Sweet & Saules,
2003) have been mixed when compared to the results reported by the authors of the
SASSI (Miller & Lazowski, 1999). Often the findings have not reflected the high levels reported by the SASSI's authors. Test-retest stability is one commonly reported index of an instrument's reliability (Bartholomew, 1996; Mark, 1996; Traub, 1994). The two-week test-retest reliability found for the SASSI by the authors (Miller & Lazowski, 1999) was 1.0 for the FVA and FVOD scales and between .92 and .97 for the clinical scales. This
finding was supported by Laux, Salyers, and Kotova (2005) but challenged by Myerholtz
and Rosenberg (1998). Myerholtz and Rosenberg found .82 and .89 for the FVA and FVOD scales in one administration interval; in another, they found the FVA and FVOD scales to be .76 and .93, respectively.
With regard to internal consistency, Miller and Lazowski (1999) found that the SASSI
had a .93 coefficient alpha. While the internal consistency finding is less meaningful
because the SASSI was not developed to be a unidimensional instrument, this provides some supporting evidence for the instrument. Independent internal consistency findings for the face valid scales have been consistent with Miller and Lazowski. However, Clements (2001) produced only a .49 coefficient alpha for the instrument. The SASSI's validity has been evaluated in several ways, including content, construct, and criterion-referenced
approaches. Lazowski, Miller, Boye, and Miller (1999) found that people who score high
on the SASSI also score high on other instruments measuring the same construct such as
the MAST and the MMPI-2 Addiction Potential Scale. Likewise, people who scored low on the SASSI scored low on these other instruments. However, comparisons of the SASSI to other instruments produced mixed results. For example, the
overall classification agreement findings for the SASSI and CAGE agreement was .49
(Laux, Salyers, & Kotova, 2005) and .61 (Myerholtz & Rosenberg, 1998). When the
SASSI was compared to a modified CAGE, the agreement rate dropped to .58 (Myerholtz
& Rosenberg). The SASSI and MAC agreement was lower still with a .22 agreement
(Myerholtz & Rosenberg) but in another study had a higher agreement rate result at .52
(Laux, Salyers, & Kotova). However, when the SASSI was compared to the MAC-R, the agreement rates again differed across studies.
Based on an exploratory factor analysis, the authors of the SASSI (Miller &
Lazowski, 1999) identified a ten factor solution; however, the only other study to
investigate the factor structure of the SASSI was unable to replicate this finding (Gray,
2001). Gray's data factor analysis identified a two factor solution comprised of mostly
the FVA and FVOD items, which accounted for 53 percent of the SASSI-3's total
variance. Two studies have also confirmed the factor structure of the FVA and FVOD
scales, respectively (Laux, Salyers, & Kotova, 2005; Laux, Perera-Diltz, Smirnoff, &
Salyers, 2005).
The SASSI's criterion-related validity has been evaluated by comparing its classifications with diagnoses made by a professional using the criteria from the Diagnostic and Statistical Manual IV Text Revision. Using this criterion, Lazowski, Miller, Boye, and Miller (1997) reported a high accuracy rate for the SASSI. Later, in the SASSI-3 Manual, Miller and Lazowski (1999) reported a lower but still
acceptable accuracy rate for the SASSI to be 93 percent, sensitivity to be 93.3 percent,
and specificity to be 94.2 percent. However, the results from independent researchers
have again been mixed. Using the same gold standard, Svanum and McGrew (1995)
found the sensitivity to be 33 percent and specificity to be 87 percent for their college
student sample. Five years later using an incarcerated population the results improved
with an overall accuracy rate of 69.4 percent, sensitivity of 73.3 percent, and specificity
of 62.2 percent. Using a traumatic brain injury sample, the overall accuracy rate was
again lower than that found by the SASSI authors at 69.2 percent, sensitivity of 70.8
percent and specificity of 68.5 percent (Arneth et al., 2001). Finally, Clements also found accuracy rates lower than those reported by the SASSI's authors.
Variable
A Rasch analysis requires a clearly defined variable being investigated. For this study, the variable being evaluated is substance dependence. The SASSI-3 was designed to discriminate between those who are likely to be substance dependent from those who are not (Miller & Lazowski). The authors of the SASSI-3 also asserted that it was not their intention to
develop a unidimensional instrument; however, they comment that the scales measuring homogeneous traits have higher internal consistencies. The high coefficient alpha findings identified by several authors suggest such homogeneity within scales (Laux, Salyers, & Kotova, 2005; Laux, Perera-Diltz, Smirnoff, & Salyers, 2005; Miller & Lazowski). Independent evidence of unidimensionality has been reported for the FVA and FVOD. The FVA, FVOD, SAT, OAT, and SYM scales and the SAM have been associated with the substance dependence construct itself, whereas
constructs other than substance dependence, such as validity and additional clinical
issues, are being measured with the FAM, DEF, COR and RAP scales (Miller &
Lazowski, 1999).
Procedures
One of the many advantages of using the Rasch model is that the outputs from the
analysis are in the form of easy to read "pictures". The pictures are graphs and charts
which demonstrate visually the response scales and the "ruler" upon which the items and
people can be aligned. The pictures will be described below as they apply to each step in
the procedure. The following method includes the steps used to evaluate the SASSI-3's
measurement properties.
Steps in conducting a Rasch Analysis. This study will follow the process of Rasch
analysis using the example set by Elliot et al. (2006). When conducting a Rasch analysis,
at each step described below, the person and item separations and reliabilities will be
reviewed for changes and improvements as a guide to determine whether the change was
effective.
Step one- Response validation. The purpose of exploring the response validity
first is to establish whether the participants are using the response scales as intended by
the authors of the SASSI-3 (Elliot et al., 2006). In addition, response validation is the first step in determining whether the items function unidimensionally (Elliot et al.). There are
two ways the response options will be validated. The first is by visually reviewing the
probability curves. Each response option should have over .50 probability of being
chosen. The second is by examining the thresholds. Each response option (1 to 2, or, 2 to
3, etc.) threshold should be between 1.4 and 5 units in distance from the next response
option. If the threshold is less than 1.4 or greater than 5 and the probability of being chosen is less than .50, then it is recommended that the response options be revised.
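A minimal sketch of these two checks (criteria as stated above; the option statistics passed in are invented examples, not study values):

```python
def option_ok(peak_probability: float, threshold_gap) -> bool:
    """An option passes if its probability curve peaks at .50 or above and
    the distance to the adjacent threshold, when one exists (None = N/A),
    is between 1.4 and 5 logits."""
    if peak_probability < 0.50:
        return False
    if threshold_gap is not None and not (1.4 <= threshold_gap <= 5.0):
        return False
    return True

print(option_ok(0.95, None))  # True: lowest option, no prior threshold
print(option_ok(0.40, 5.41))  # False: fails both criteria
```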
Step two - Item fit analysis. Item fit analysis is a form of construct validation and
a test of unidimensionality. Using a z-standardized score cutoff of 2.0, any item over this value, or any item with a negative point-biserial value, is likely either redundant or measuring a construct other than the one intended. As such, items failing to meet these standards will be considered for elimination. The fit analysis will also be conducted for the people in the
Step three - Construct analysis. Construct analysis is a test of the Thurstonian ordering of items, conducted through a review of the person-item map output. A way to conceptualize the Rasch construct analysis output is to consider a
flag pole as the variable of substance dependence, from less, low on the pole, to more
high on the pole. Flags on the left of the pole are items. Items are arranged from difficult
to endorse at the top to easy to endorse at the bottom of the pole. People are arranged on
the right of the pole from possessing more substance dependence at the top to less
substance dependence at the bottom of the pole. The linear measure construct item map is
the variable of substance dependence extrapolated from the instrument. In this way one
can see the degrees of separation along the variable and where the separations are.
Step four - Assess for unidimensionality. The primary way to evaluate unidimensionality is through the Rasch principal components analysis (RPCA). RPCA and traditional principal components procedures differ, however, in that the RPCA approach not only provides first order
factor results, but additionally provides the researcher with evidence of the presence of
unsuspected secondary variables, if they exist (Bond & Fox, 2007). If the RPCA is over
60 percent and the remaining residuals do not explain greater than five percent of the variance, the instrument can be considered unidimensional.
Step five - Test of independence. An appropriate and comparable sample is one in which the researcher would expect to find a wide range of
the construct being measured. For example, if one was interested in measuring the
construct of intelligence, the researcher would generally need a sample that included persons ranging from low to high intelligence in order to determine whether or not the instrument included items at all points along this continuum. In this
study an appropriate and comparable sample would be composed of persons whose use of
substances ranges from none at all to those whose use has progressed to the point where
they are experiencing significant consequences in their lives. Tests of independence will
inform the researcher of whether the meaning of the instrument and the item hierarchy,
ranging from easy to difficult, remains consistent. The resulting person-item map from
the first group is visually compared against the second group's person-item map. These
maps are evaluated for consistency by observing the arrangement of items. That is, if the
items fall in relatively the same point along the hierarchy difficulty continuum for both
groups, then the researcher can conclude that the measure is sample independent.
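As a rough sketch of that visual comparison (item names and logit difficulties invented; a real check would also tolerate small calibration shifts between samples):

```python
def same_hierarchy(difficulties_g1: dict, difficulties_g2: dict) -> bool:
    """True if both groups order the items identically by difficulty."""
    order = lambda d: sorted(d, key=d.get)  # easiest to hardest
    return order(difficulties_g1) == order(difficulties_g2)

g1 = {"lunch": -1.2, "argued": 0.3, "suicide": 2.1}
g2 = {"lunch": -0.9, "argued": 0.5, "suicide": 1.8}
print(same_hierarchy(g1, g2))  # True: the easy-to-difficult order matches
```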
The SASSI-3's ability to discriminate between substance dependence and non-dependence (specific research questions six and seven) was evaluated through the use of the person-item map. The individuals in the second group were coded according to the traditional SASSI-3 decision rule (high probability vs. low probability of substance dependence). Using the person-item map, these people were then inspected to determine whether the two classifications separated along the measure.
Limitations
Broadly, the limitations of this study are associated with self-report instruments as well as with the Rasch model itself.
One limitation of this study is that the SASSI-3 is a self report instrument. Self
report, generally speaking, is one of the easiest ways to collect information on a construct
of interest. However, respondents often answer items in socially desirable ways (Donaldson & Grant-Vallone, 2002) or in a manner that artificially minimizes (faking good) or maximizes (faking bad) the severity
of their presenting issues. Such response styles may be a particular concern regarding
substance dependence screening due to possible secondary gains from results that are
positive or negative. Although the SASSI-3's authors purport to have limited the impact of faking good or bad, initial evidence suggests that, when instructed to do so, college
students can fake-good and fake-bad on this instrument (Burck, Laux, Harper, & Ritchie,
2009). As such, self-report may be a potential limitation for this study in that the participants may not have responded truthfully.
A second limitation is the utilization of the Rasch model. Despite its multiple uses
and high reputation for instrument validation, some do offer critiques against the Rasch
model. These critics state that Rasch model analysis is not a theory building method as is
factor analysis and that the Rasch model theory is too simplistic (Bond & Fox, 2007).
According to the Rasch model, theory drives the development of the instrument. Critics also state that it is ineffective to utilize Rasch analysis for multidimensional instruments, as the Rasch model only works for unidimensional instruments (Kubinger, 2005). Since the SASSI was developed using the
Diagnostic and Statistical Manual (APA, 2000) criteria and was based on the substance dependence construct, theory did guide its development. In addition, although, according to the SASSI-3 Manual (Miller & Lazowski, 1999), the SASSI-3 was not intended as a unidimensional measure, as seen above in the reliability studies, the instrument or some of its scales may nonetheless function unidimensionally.
Another criticism involves the SASSI-3's scoring procedures. When two
individuals' raw scores are compared, some researchers may report a person's ability in
an invalid manner (Kubinger, 2005). This can happen when two people have the same
raw total score yet one person (person A) correctly answered the first ten easiest questions on a 25-question test but the second person (person B) correctly answered the
ten hardest questions on the same test. Both raw scores equal ten, yet, person B was able
to answer more difficult questions than person A. Therefore, the scoring may not
necessarily be correct. One way researchers using the Rasch model can rectify this problem is by carefully analyzing response patterns rather than raw totals prior to publishing or interpreting results.
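Kubinger's point can be made concrete with an invented 25-item illustration:

```python
# Items indexed from easiest (0) to hardest (24); a person's record is the
# set of items answered in the keyed direction (all values invented).
person_a = set(range(10))      # person A: the ten easiest items
person_b = set(range(15, 25))  # person B: the ten hardest items

print(len(person_a) == len(person_b))  # True: identical raw totals of 10
print(max(person_b) > max(person_a))   # True: B reached much harder items
```

Identical raw totals thus conceal very different response patterns, which is exactly the information a Rasch calibration preserves.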
In Rasch analysis, the data must fit the model as opposed to factor analysis in
which researchers can adjust the model to fit the data. An additional limitation will be
that the data do not fit the model. This means that the instrument is not a measure of
substance dependence. However, this limitation will be unknown until after the analysis.
Finally, as Bond and Fox (2007) point out, "critics argue that we can't physically
align bits of the human psyche together to produce measures, as we can with centimeters
to make meters" (p. 6). This means that a string of substance dependence units cannot be physically laid end to end the way units of length can.
Chapter Four
Results
This chapter presents the results of the Rasch analysis on the archived data from a
study of the SASSI-3 on two samples of adults. The samples were a combination of two
samples from previous research. One sample was taken from a study involving a
community cooperative program with child protective services and family court, and the
other sample was taken from a study involving adults from a large metropolitan
university. Using a random numbers table, each sample was divided into two halves, and one half from each sample was combined with a half from the other to create two mixed groups. Each group therefore contained part of the sample from the community cooperative program with child protective services and family court study and part of the sample from the university study. These groups were referred to as
Group 1 and Group 2 throughout the course of the study. Group 1 consisted of 174
participants, men accounted for 23.6 percent (n=41) of the sample, and women accounted
for 76.4 percent (n=133). This sample's self-identified ethnicity included 58 percent (n=101) European American, 26.4 percent (n=46) African American, 1.7 percent (n=3) Native American, 2.3 percent (n=4) biracial, and 2.3 percent (n=4) Hispanic. Group 2
consisted of 175 participants, men accounted for 20 percent (n=35) of the sample, women
accounted for 79.4 percent (n=139), and one person did not report his/her sex. This sample's self-identified ethnicity included 63.4 percent (n=111) European American, 22.3 percent
(n=39) African American, 2.3 percent (n=4) Native American, 2.3 percent (n=4) biracial,
3.4 percent (n=6) Hispanic, and 1.1 percent (n=2) Asian American.
The initial person and item separation and reliabilities for the FVA scale were 2.51/.86 and 7.47/.98, respectively. The minimum acceptable standard for separation is
2.0 (Wright & Masters, 1982). A separation of 2.0 translates statistically into 3 strata.
This means that the FVA scale's reliability is excellent and its ability to distinguish differences in the people is good. In this case, the FVA can be said to be a linear measure. To determine whether improvements could be made, the researcher conducted analyses of the FVA
scale's response options, items, and underlying factor structure. Step one of the Rasch
analysis involved evaluating how the respondents were using the response options. Each
of the FVA scale's twelve items include four choices of responses to which respondents
can indicate the frequency to which they engage in the item's behaviors. These response
options and corresponding point values are: 0-Never, 1-Once or Twice, 2-Several Times, and 3-Repeatedly. A visual inspection of the probability curves (Figure 1) revealed that the respondents used all response options as expected by the authors of the SASSI-3, with the exception of option 1-Once or Twice, which did not reach the .50 probability criterion.
Figure 1
FVA G1 category probability curves (probability of each response option, 0-3, plotted against the person-minus-item measure).
This failure suggested that the sample did not reliably distinguish between option 1-Once
or Twice and the next adjacent category, 2-Several times. However, because all of the
other response options appeared to work as intended and because no improvements were
found in the person and item separation and reliabilities after collapsing strategies were
attempted, no changes to the response scale were made at this time (see Table 2 for the collapsing strategies evaluated).
Table 2
Rating Scale Analysis for the FVA Scale, Group 1

Rating Scale   Probability Curve(1)              Threshold(2)                    PS&R       IS&R       RPCA
0,1,2,3        0=0.95, 1=0.40, 2=0.60, 3=0.90    0-1=N/A, 1-2=5.41, 2-3=22.67    2.65/.87   7.81/.98   94.8%
0,1,1,2        0=0.95, 1=0.80, 2=0.95            0-1=N/A, 1-2=24.00              2.48/.86   7.50/.98   95.3%
0,0,1,2        0=0.95, 1=0.80, 2=0.95            0-1=N/A, 1-2=44.00              1.99/.80   5.56/.97   92.4%
0,1,2,2        0=0.95, 1=0.40, 2=0.95            0-1=N/A, 1-2=6.76               2.40/.85   7.99/.98   97.3%

Note. (1) >= .5 is acceptable. (2) >= 1.4 is acceptable. PS&R = Person Separation & Reliability. IS&R = Item Separation & Reliability. RPCA = Rasch Principal Components Analysis (percent of variance explained).
In an effort to further improve the FVA's separation and reliability results, the
researcher inspected the FVA items and respondents for fit. Item and people are
considered to fit if the z-standardized score is less than 2.0 and the point-biserial is not
negative. If items or people are outside of these cutoffs, they are considered to be misfits
and should be considered for possible elimination. This inspection led to a final iterative
elimination of twelve people. No items failed to meet the standards set forth for item fit.
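The fit screen applied in this step (z-standardized score below 2.0 and a non-negative point-biserial) can be sketched as follows; the entry names and statistics are invented for illustration and do not reproduce the study's values:

```python
Z_CUTOFF = 2.0

def misfits(fit_stats: dict) -> list:
    """Flag entries whose z-standardized fit is 2.0 or more, or whose
    point-biserial correlation is negative."""
    return [name for name, (zstd, pt_biserial) in fit_stats.items()
            if zstd >= Z_CUTOFF or pt_biserial < 0]

example = {"item_a": (0.4, 0.55),   # fits: low zstd, positive point-biserial
           "item_b": (2.3, 0.40),   # misfits on the zstd criterion
           "item_c": (1.1, -0.05)}  # misfits on the point-biserial criterion
print(misfits(example))  # ['item_b', 'item_c']
```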
The elimination of misfitting people resulted in the final person and item separations and
reliabilities of 2.65/.87 for persons and 7.81/.98 for items on the FVA scale. These separations and
reliabilities are improvements from the initial findings, which suggested a well defined
linear construct that accurately measures the people. The FVA scale is divided into ten levels of difficulty and discriminates among nearly four groups of people ranging from low to high on the substance dependence variable.
The third step in the analysis involved a review of the person-item map (Figure 2).
Figure 2
FVA G1 item map (174 persons, 12 items, 4 categories). Items from most to least difficult to endorse: FVA12 (commit suicide), FVA9 (effects recur), FVA11 (nervous/shakes), FVA6 (trouble), FVA10 (relationship), FVA8 (argued), FVA3 (energy), FVA7 (depressed), FVA1 (lunch), FVA2 (feelings), FVA5 (physical problems), FVA4 (intended).
The analysis produced a hierarchy of items ordered from most difficult to least difficult to endorse. When two items are aligned at the same place on the hierarchy, the items are either theoretically redundant or simply at the same level of difficulty. Overfitting items can be eliminated if the infit mean-square is below 0.6 and the z-standardized score is -2.0 or less. Although the aligned pair FVA3/FVA7 appeared to measure the same theoretical content and sat at the same level of difficulty, both items were retained because neither misfit these standards.
Group 1's FVA hierarchy is displayed visually in Figure 2. The initial Rasch principal components analysis (RPCA) indicated that 91.9 percent of the total variance was explained by the instrument. With the elimination of twelve misfitting people, the RPCA increased to 95.1 percent of the total variance explained by the instrument,
which demonstrated improvement in the FVA. However, the item/person map means and
standard deviations were separated by nearly a standard deviation indicating that the
items were more difficult to endorse than the people were able to agree to them. An
example of this may be like a spelling bee. In this scenario the students would be third
grade level spellers and the words would be tenth grade spelling words. The words would
In the final step of the analysis, the extrapolated variable was compared using the
data from a second comparable group using the same process. The FVA scale, using
Group 2 data, demonstrated similar person and item separation and reliability results as
were produced in the analyses of the first data set. While no changes were needed to the
response options, option 1-Once or Twice, as was reported in the analysis of the first data set, only met the probability curve at 0.4 (see Figure 3 for the response curve and Table 3 for the threshold values).
Figure 3
FVA Group 2 category probability curves (Winsteps output): the probability of each response category (0-3) plotted against the person-minus-item measure.
Table 3

Rating Scale   Probability Curve¹          Threshold²             PS & R     IS & R     RPCA
0,1,2,3        0=0.90, 1=0.45,             0-1=N/A, 1-2=15.15,    2.78/.89   7.76/.98   98.6%
               2=0.65, 3=0.95              2-3=19.12
0,1,1,2        0=0.90, 1=0.90, 2=0.90      0-1=N/A, 1-2=57.32     2.71/.88   7.50/.98   99.4%
0,0,1,2        0=0.95, 1=0.60, 2=0.95      0-1=N/A, 1-2=29.38     2.19/.83   5.52/.97   99.7%
0,1,2,2        0=0.90, 1=0.50, 2=0.90      0-1=N/A, 1-2=11.88     2.38/.85   8.45/.98   97.9%

Note. ¹ ≥ .5 is acceptable. ² ≥ 1.4 is acceptable. PS & R = Person Separation & Reliability. IS & R = Item Separation & Reliability. RPCA = Rasch Principal Components Analysis.
After the iterative elimination of 23 misfitting people, the final person and item separation and reliability findings were 2.78/.89 and 7.76/.98, respectively. No items failed to meet the cutoff for item fit; therefore, none were eliminated. Two items (FVA1 and FVA2) were aligned at the same place on the hierarchy, and neither met the statistical standards for item overfit. Therefore, they measure unique qualities and neither could be eliminated. The final RPCA for the scale was also comparable, at 98.6 percent of the variance being accounted for by the items. As was presented for the FVA for Group 1, the hierarchy of FVA item endorsement difficulty for Group 2 is presented in Figure 4.
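The item-fit screening rules quoted throughout this chapter can be sketched as a pair of simple predicates. This is a minimal illustration using hypothetical fit statistics, not the actual SASSI-3 output: an item is flagged as overfitting (a candidate for elimination as redundant) when its infit mean-square is below 0.6 and its z-standardized fit is -2.0 or less, and flagged as misfitting when the z-standardized fit exceeds 2.0 or the point-biserial correlation is negative.

```python
def overfits(infit_mnsq: float, zstd: float) -> bool:
    """Overfit rule quoted in the text: infit mean-square < 0.6 and ZSTD <= -2.0."""
    return infit_mnsq < 0.6 and zstd <= -2.0

def misfits(zstd: float, point_biserial: float) -> bool:
    """Misfit rule quoted in the text: ZSTD > 2.0 or a negative point-biserial."""
    return zstd > 2.0 or point_biserial < 0.0

# Hypothetical items: (name, infit mean-square, z-standardized fit, point-biserial)
items = [("FVA1", 0.95, 0.4, 0.61),   # acceptable fit
         ("FVA2", 0.55, -2.3, 0.70),  # overfits (candidate redundancy)
         ("FVA3", 1.60, 2.8, -0.05)]  # misfits (candidate elimination)

for name, mnsq, zstd, pbis in items:
    status = ("overfit" if overfits(mnsq, zstd)
              else "misfit" if misfits(zstd, pbis) else "acceptable")
    print(name, status)
```

The fit statistics above are invented for illustration; in the analyses reported here they would come from the Winsteps item and person fit tables.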
Figure 4
FVA Group 2 item map (figure not reproduced).
A side-by-side comparison of the Groups' respective item-endorsement difficulty, presented on Table 4, indicated that ten of the twelve items remained constant on the hierarchy across Groups. Of the three items found to be the most difficult to endorse by both Groups, the second and third most difficult were interchanged. Again, the items fell into ten levels of difficulty, and the scale distinguished nearly four groups of people.
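The counts of "levels" and "groups" quoted throughout this chapter follow from the separation index. A minimal sketch, assuming the standard Rasch formulas relating separation G, reliability, and strata (reported reliabilities may differ slightly, since Winsteps computes them from model error rather than this identity); the numeric inputs are the final FVA Group 1 values reported in the text:

```python
def reliability_from_separation(g: float) -> float:
    """Reliability implied by a separation index: R = G^2 / (1 + G^2)."""
    return g ** 2 / (1 + g ** 2)

def strata(g: float) -> float:
    """Number of statistically distinct levels: H = (4G + 1) / 3."""
    return (4 * g + 1) / 3

person_sep, item_sep = 2.65, 7.81   # final FVA Group 1 separations
print(round(reliability_from_separation(person_sep), 2))  # ~0.88
print(round(strata(person_sep), 1))  # ~3.9 -> "nearly four groups of people"
print(round(strata(item_sep), 1))    # ~10.7 -> about ten item levels
```

This is why a person separation near 2.65 is read as distinguishing nearly four groups, while an item separation near 7.81 yields roughly ten levels of item difficulty.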
Table 4
Item Hierarchy
Group 1 Group 2
Difficult to endorse
Easy to endorse
Face Valid Other Drug Scale (FVOD)
The initial person and item separation and reliabilities for the FVOD scale were 2.50/.86 and 4.69/.96, respectively. Additionally, the initial RPCA was 83.9 percent. Viewing these findings collectively, the FVOD can be said to be a linear construct. To improve the scale's functioning where possible, the response options, items, and the underlying factor structure were explored. At each step the person and item separation and reliability were reviewed as a way to track changes in the scale's functioning. Like the FVA scale, the FVOD scale has the same four response options: 0-Never, 1-Once or Twice, 2-Several Times, 3-Repeatedly. Unlike the FVA scale, the FVOD response scale did not seem to function as well. Inspection of the probability curve and thresholds (Figure 5) indicated that options 1-Once or Twice and 2-Several Times were not functioning as intended.
Figure 5
FVOD Group 1 category probability curves (Winsteps output): the probability of each response category (0-3) plotted against the person-minus-item measure.
Specifically, the calibration thresholds reflected the respondents' misuse of response options 1 and 2, which were both below 0.2 on the probability curve. Respondents were not reliably distinguishing between option 1-Once or Twice and 2-Several Times. Considering the logic behind the options and the statistical evidence provided by the thresholds, the researcher decided to collapse the middle two response categories in an effort to improve the FVOD's functioning. That is, the researcher reanalyzed the data with options 1 and 2 combined, leaving three response options (0, 1, 2).
This change produced an improvement in the person separation and reliability and a minor decrease in the item separation and reliability, with scores of 2.61/.87 and 4.43/.95, respectively. The RPCA conducted after response options 1 and 2 were combined decreased from 83.9 percent to 76.8 percent of the total variance accounted for. Despite this decline, the final value was still above the minimum accepted standard. Further evaluation demonstrated that the three-option response scale seemed to work better in this model: every response option exceeded the statistical cutoffs, reaching at least .50 on the probability curves and sitting more than 1.4 units from the adjacent option's threshold, despite the decrease in item separation and reliability and in the RPCA (see Figure 6).
Figure 6
FVOD Group 1 category probability curves after collapsing options 1 and 2 (Winsteps output): the probability of each response category (0, 1, 2) plotted against the person-minus-item measure.
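The collapsing strategy of combining the middle two response categories amounts to a simple recode of the raw responses before reanalysis. A minimal sketch, using a hypothetical response vector rather than actual SASSI-3 data:

```python
# Recode the original 0,1,2,3 rating scale to the 0,1,1,2 structure:
# 0-Never stays 0; 1-Once or Twice and 2-Several Times merge into 1;
# 3-Repeatedly becomes 2.
COLLAPSE_0112 = {0: 0, 1: 1, 2: 1, 3: 2}

def collapse(responses, mapping=COLLAPSE_0112):
    """Apply a category-collapsing recode to a vector of raw responses."""
    return [mapping[r] for r in responses]

raw = [0, 3, 1, 2, 2, 0, 3, 1]   # hypothetical FVOD item responses
print(collapse(raw))             # [0, 2, 1, 1, 1, 0, 2, 1]
```

The recoded data would then be resubmitted to the Rasch estimation, which is how the revised separation, reliability, and RPCA figures reported above were obtained.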
No other collapsing strategy met the threshold and probability statistical standards for the FVOD scale (see Table 5).
Table 5

Rating Scale   Probability Curve¹          Threshold²             PS & R     IS & R     RPCA
0,1,2,3        0=0.95, 1=0.30,             0-1=N/A, 1-2=1.27,     2.68/.88   5.05/.96   88.3%
               2=0.35, 3=0.95              2-3=4.36
0,1,1,2        0=0.95, 1=0.60, 2=0.95      0-1=N/A, 1-2=20.46     2.78/.89   4.77/.96   84.3%
0,0,1,2        0=0.95, 1=0.35, 2=0.95      0-1=N/A, 1-2=0.02      2.27/.84   4.11/.94   82.5%
0,1,2,2        0=0.95, 1=0.30, 2=0.95      0-1=N/A, 1-2=2.16      2.56/.87   5.24/.96   88.9%

Note. ¹ ≥ .5 is acceptable. ² ≥ 1.4 is acceptable. PS & R = Person Separation & Reliability. IS & R = Item Separation & Reliability. RPCA = Rasch Principal Components Analysis.
Step two of the Rasch instrument validation analysis involved reviewing the person and item fit. Inspection of the items and persons led to a final iterative elimination of twelve people whose responses were inconsistent. No items failed to meet the standards set forth for item fit. The item fit standards include the z-standard score being below 2.0 and a positive point-biserial. The elimination of these twelve people resulted in final person and item separations and reliabilities of 2.78/.87 for persons and 4.77/.96 for the FVOD scale. These findings suggest a well-defined linear construct that reliably distinguishes differences among the people. The FVOD scale can be divided into about six levels of item difficulty. The third step in the analysis was to review the person-item map (Figure 7) to explore the extrapolated construct.
Figure 7
FVOD Group 1 person-item map (Winsteps output). Items from most to least difficult to endorse: FVOD9 (doctor), FVOD7 (trouble w/law), FVOD3 (more aware), FVOD12 (avoid withdrawal), FVOD14 (treatment program), FVOD4 (sex), FVOD8 (really stoned), FVOD1 (improve thinking), FVOD5 (help), FVOD10 (activities), FVOD13 (life), FVOD6 (forget), FVOD2 (feel better), FVOD11 (aod).
The hierarchy of items ran from the most difficult items to endorse to the least difficult items to endorse. When two items are aligned at the same place on the hierarchy, the items are either theoretically redundant or at the same level of difficulty. In Rasch this means the items overfit. Overfitting items can be eliminated if the infit mean-square is below 0.6 and the z-standardized score is -2.0 or less. Despite appearing to measure the same theoretical content, one of the items from each of the aligned groups could have been eliminated because all of the items in these combinations fall within the item fit standards and are at the same level of difficulty. Group 1's item hierarchy is visually displayed on Table 7. The final Group 1 RPCA indicated that 84.3 percent of the total variance was explained by the scale. This improvement was achieved by adjusting the response scale and eliminating the twelve misfitting people. In addition, the item/person map means and standard deviations (Figure 6) were close in proximity, indicating that the difficulty of the items was similar to the ability of the people. It should be noted that while the means were close in proximity, only the most extreme people were identified on the FVOD scale, while the majority of the sample sat at the bottom of the scale. This may be because only a small number of people in the sample endorsed the drug-use items.
In the final step of the analysis, the extrapolated variable constructed from Group 2's data was compared to the variable constructed from Group 1's data, following the same process. The FVOD scale, using Group 2 data, demonstrated similar person and item separation and reliability findings. As with Group 1, the response options were not being used as intended by the authors of the SASSI-3. By reviewing the thresholds and probability curves, the following collapsing strategy was developed (see Table 6): the two middle response options, 1-Once or Twice and 2-Several Times, were combined. This allowed for a better functioning response scale and an increase in the person separation and reliability.
Table 6

Rating Scale   Probability Curve¹          Threshold²             PS & R     IS & R     RPCA
0,1,2,3        0=0.95, 1=0.30,             0-1=N/A, 1-2=1.24,     2.82/.89   4.77/.96   88.7%
               2=0.45, 3=0.95              2-3=13.94
0,1,1,2        0=0.95, 1=0.65, 2=0.95      0-1=N/A, 1-2=28.18     2.97/.90   4.19/.95   84.3%
0,0,1,2        0=0.95, 1=0.45, 2=0.95      0-1=N/A, 1-2=11.36     2.49/.86   4.21/.95   89.8%
0,1,2,2        0=0.95, 1=0.30, 2=0.95      0-1=N/A, 1-2=3.18      2.65/.88   4.83/.96   90.5%

Note. ¹ ≥ .5 is acceptable. ² ≥ 1.4 is acceptable. PS & R = Person Separation & Reliability. IS & R = Item Separation & Reliability. RPCA = Rasch Principal Components Analysis.
After the iterative elimination of thirteen misfitting people, the final person and item separation and reliability findings were 2.97/.90 and 4.19/.95, respectively. No items failed to meet the cutoff for item fit; therefore, no items were eliminated. The final RPCA for the scale was also comparable to that of Group 1's RPCA, as 84.3 percent of the variance was accounted for by the FVOD scale's items. This means that the FVOD scale, using 14 items, can be separated into about six groups of items and identifies about four groups of people reliably. As was reported for the FVOD for Group 1, the hierarchy of FVOD item endorsement difficulty for Group 2 is presented in Figure 8.
Figure 8
FVOD Group 2 person-item map (Winsteps output). Items from most to least difficult to endorse: FVOD9 (doctor), FVOD7 (trouble w/law), FVOD12 (avoid withdrawal), FVOD3 (more aware), FVOD14 (treatment program), FVOD4 (sex), FVOD5 (help), FVOD13 (life), FVOD8 (really stoned), FVOD10 (activities), FVOD11 (aod).
Four pairs of items were aligned on the variable: FVOD12/FVOD3, FVOD14/FVOD4, FVOD13/FVOD8, and FVOD1/FVOD6. All of the items met the statistical standards for item fit and appear to measure different content; therefore, the items in each pair were retained. A side-by-side comparison of the Groups' respective item-endorsement difficulty indicated that eight of the scale's fourteen items remained constant on the hierarchy across Groups (see Table 7). The two items found most difficult to endorse by both Groups and the three items found easiest to endorse by both Groups remained consistent. However, the items around the means were not aligned across Groups.
Table 7
Item Hierarchy
Group 1 Group 2
Difficult to endorse
Easy to endorse
Symptoms Scale (SYM)
The initial review of the person and item separations and reliabilities findings indicated 1.28/.62 for persons and 4.90/.96 for items of the SYM scale, respectively. This means that the scale can be divided into six groups of items in terms of difficulty, but it does not measure the people in a reliable way. While the person separation does not meet the standard of 2.0, a person separation of 1.28 suggests that we may marginally distinguish between two groups of people, with significant error. The purpose of the SYM scale is to distinguish between two groups: those who have a high probability of substance dependence and those who do not.
Typically, the standard first step in conducting a Rasch analysis is to evaluate the
scale's items' range of responses. However, the SYM scale's items, as well as those of all
the other SASSI-3 scales, have only two response choices: true or false. Dichotomous
response options have an equal probability of being selected. Therefore, the review of the
response scales for the SYM and all subsequent scales was unnecessary. Step two of the
Rasch instrument validation analysis involved reviewing the person and item fit. Further evaluation was conducted in an effort to improve this scale's separation and reliability results. The researcher reviewed the fit statistics for the items and persons. This review led to a final iterative elimination of twelve people and two items that failed to meet the standards set forth for fit. These eliminations resulted in a decrease in function for the scale, as evidenced by the final person and item separations and reliabilities of 1.16/.57 for persons and 6.65/.98 for the remaining eight items on the SYM scale. This suggests a reasonably well-defined linear construct, but the construct does not do a good job of reliably measuring the people.
The third step in the analysis involved a review of the person-item map to explore the extrapolated construct. The resulting hierarchy of items ran from most difficult to endorse to least difficult to endorse (Figure 9).
Figure 9
SYM Group 1 person-item map (Winsteps output). Items shown from most to least difficult to endorse include Q55 (morning), Q35 (memory), Q54 (neglected), and Q56 (teenager).
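The person-item maps place persons and items on the same interval scale; under the dichotomous Rasch model used for the SYM and all other true/false SASSI-3 scales, the probability of endorsing an item depends only on the difference between the person's measure and the item's difficulty. A minimal sketch (the maps report rescaled units rather than raw logits, so the second example below uses logits purely for illustration):

```python
import math

def p_endorse(person_measure: float, item_difficulty: float) -> float:
    """Dichotomous Rasch model: P(X=1) = exp(B - D) / (1 + exp(B - D))."""
    return 1.0 / (1.0 + math.exp(-(person_measure - item_difficulty)))

# A person located at the same point as an item endorses it 50% of the time,
# which is why items far above the person distribution are rarely endorsed.
print(round(p_endorse(50.0, 50.0), 2))   # 0.5
# An item one logit above the person is endorsed less often.
print(round(p_endorse(0.0, 1.0), 2))     # 0.27
```

This is why the gap between the person mean and item mean on the maps translates directly into items being "more difficult to endorse than the people were able to agree to them."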
Group 1's item hierarchy is visually demonstrated in Table 8. The initial Rasch principal components analysis (RPCA) indicated that 63.7 percent of the total variance was explained by the instrument. With the elimination of one item (Q60 [Drink away from home]) from the SYM scale, the RPCA increased to 92.9 percent of the total variance explained by the instrument. This demonstrated improvement in the scale through the elimination of misfitting items. In addition, the item/person map means and standard deviations were close in proximity, indicating that the items were about as difficult to endorse as the people were able to agree to them.
The extrapolated variable was compared using the data from Group 2 by
following the same process in the analysis of the measurement function of the scale.
After the iterative elimination of sixteen misfitting people and two items, the final person and item separation and reliability findings increased to 1.33/.64 and 7.23/.98, respectively. No other items failed to meet the cutoff for item fit; therefore, no further items were eliminated. The final RPCA for the scale was also comparable, at 96.6 percent of the variance being accounted for by the items. This means that the SYM scale items can be divided into roughly ten groups. Because the separation of people is not greater than 2.0, the SYM scale does not reliably distinguish between those with a high probability of substance dependence and those without. As was presented for the SYM for Group 1, the hierarchy of SYM item endorsement difficulty for Group 2 is also presented on
Figure 10.
Figure 10
SYM Group 2 person-item map (Winsteps output). Items from most to least difficult to endorse: Q58 (into trouble), Q54 (neglected), Q59 (family problems), Q56 (teenager), Q42 (too often), Q40 (remember).
No items were aligned on the item map for Group 2. A side-by-side comparison of the Groups' respective item-endorsement difficulty indicated that seven of the scale's nine items remained constant on the hierarchy across Groups (see Table 8).
Table 8
Item Hierarchy
Difficult to endorse
Number
Easy to endorse
Eliminated items:
Q60 Drink away from home Q60 Drink away from home
Obvious Attributes Scale (OAT)
The initial review of the person and item separations and reliabilities findings indicated 1.16/.57 for persons and 4.89/.96 for items of the OAT scale, respectively. Like
the SYM, the OAT scale has only true and false possible response options. Therefore,
these response options had an equal probability of being selected. Based on this, the
review of the response scales was unnecessary. Step two of the Rasch instrument
validation analysis involved reviewing the person and item fit. Further evaluation was
conducted in an effort to improve this scale's separation and reliability results. Inspection
of the items and persons led to a final iterative elimination of seven people whose
responses were inconsistent. No items failed to meet the standards set forth for item fit,
and therefore, none were eliminated. The elimination of these seven people resulted in
the final person and item separations and reliabilities of 1.20/.59 for persons and 5.15/.96 for the twelve items on the OAT scale. These findings suggested a reasonably well-defined linear construct which can be divided into seven groups of items. However, the construct does not reliably distinguish any characteristic differences among the people.
The third step in the analysis involved a review of the person-item map to explore the extrapolated construct. The resulting hierarchy of items ran from most difficult to endorse to least difficult to endorse. When two items are aligned at the same place on the hierarchy, the items are either theoretically redundant or at the same level of difficulty. This means the items overfit. By way of reminder, overfitting items can be eliminated if the infit mean-square is below 0.6 and the z-standardized score is -2.0 or less. Despite appearing to measure the same theoretical content, one of the items from the aligned group of Q20 and Q54 could be eliminated because each item in this combination falls within the item fit standards and is at the same level of difficulty. A
visual representation of the item hierarchy can be viewed in Table 9. The initial Rasch principal components analysis (RPCA) indicated that 53.6 percent of the total variance was explained by the instrument. However, there was also an indication of three contrasts, which may point to the construct being multidimensional. With the elimination of misfitting people from the OAT scale, the RPCA increased to 60.3 percent of the total variance explained by the instrument, with the three underlying contrasts remaining. This demonstrated a minimal improvement in the OAT scale, which is just within the RPCA range of acceptability. In addition, on the item/person map the person and item means were separated by three items, with one item separating the upper standard deviations and two items separating the lower standard deviations of the persons and items. These distances indicate that the items were more difficult to endorse than the people were able to agree to them (see Figure 11).
Figure 11
OAT Group 1 person-item map (Winsteps output). Items from most to least difficult to endorse: Q23 (clever), Q17 (respectful), Q20 (disapproval), Q52 (resentful), Q7 (not lived), Q48 (punished), Q4 (police).
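The iterative elimination of misfitting people applied to each scale can be sketched as a loop: fit the model, flag persons whose fit statistics breach the standards, drop them, and refit until none remain. A minimal sketch with hypothetical fit statistics standing in for an actual Rasch estimation step:

```python
# Each person is (id, z-standardized fit, point-biserial); a person misfits
# when the ZSTD exceeds 2.0 or the point-biserial is negative, matching the
# person-fit standards quoted in this chapter.
def misfitting(person):
    _, zstd, pbis = person
    return zstd > 2.0 or pbis < 0.0

def iterative_elimination(persons, refit):
    """Repeatedly drop misfitting persons, recomputing fit after each pass."""
    while True:
        if not any(misfitting(p) for p in persons):
            return persons
        persons = refit([p for p in persons if not misfitting(p)])

# Hypothetical sample; refit is the identity here, whereas the real analysis
# would re-estimate measures and fit statistics in Winsteps after each pass.
sample = [(1, 0.3, 0.55), (2, 2.6, 0.40), (3, 1.1, -0.10), (4, -0.8, 0.62)]
kept = iterative_elimination(sample, refit=lambda ps: ps)
print([pid for pid, _, _ in kept])   # [1, 4]
```

Because removing a person changes everyone else's estimates, the real procedure re-estimates after each pass, which is why the eliminations reported in this chapter are described as iterative.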
The extrapolated variable was compared using the data from a second comparable group, following the same process in the final step of the analysis. The OAT scale, using Group 2 data, demonstrated similar person and item separation and reliability findings. After the iterative elimination of eight misfitting people and one item which failed to meet the standards for item fit, the final person and item separation and reliability findings increased to 1.21/.62 and 5.83/.97, respectively. Item Q48 (Rarely punished) and Q7 (Not lived) were aligned at the same place on the variable, which implied item redundancy. Item Q7 overfit, meaning it met the statistical standards for item elimination. However, this elimination reduced the person separation and reliability findings to 1.09/.54, while the item separation and reliability remained relatively constant at 5.81/.91. Because of this reduction in the person separation and reliability findings, Q7 was retained. The final RPCA for the scale was also comparable at 73.5 percent of the variance being accounted for by the items. The variance accounted for in the RPCA for Group 2 was substantially higher than the RPCA for Group 1 (a difference of 13.2%). This means that the OAT scale items can be divided into eight groups of difficulty, but the scale cannot reliably distinguish any differences among the people. As was presented for the OAT for Group 1, the hierarchy of OAT item endorsement difficulty for Group 2 is presented in Figure 12.
Figure 12
OAT Group 2 person-item map (Winsteps output). Items from most to least difficult to endorse: Q23 (clever), Q17 (respectful), Q53 (responsibilities), Q20 (disapproval), Q52 (resentful), Q39 (law), Q19 (leave home), Q48 (punished), Q7 (not lived), Q4 (police), Q11 (sitting still).
A side-by-side comparison of the two Groups' respective item-endorsement difficulty indicated that six of the scale's twelve items remained constant on the hierarchy across Groups (see Table 9). Five of the six consistent items were found by both Groups to be among the more difficult items to endorse.
Table 9
Item Hierarchy
Difficult to endorse
Easy to endorse
Eliminated items:
Subtle Attributes Scale (SAT)
Upon initial review of the person and item separations and reliabilities, the findings indicated .45/.17 for persons and 7.52/.98 for items of the SAT scale, respectively. The SAT items can be reliably divided into ten levels of difficulty. However, the SAT scale distinguishes no differences among the people. Because the SAT scale has only dichotomous response options, the review of the response scales was unnecessary. Step two of the Rasch instrument validation analysis involved reviewing the person and item fit. Further evaluation was conducted in an effort to improve this scale's separation and reliability results. Inspection of the items and persons led to a final iterative elimination of seven people whose fit statistics did not meet the 2.0 z-standardized criterion or had negative point-biserial values; twelve people in all were eliminated for misfitting. No items failed to meet the standards set forth for item fit, and therefore, none were eliminated. This resulted in the final person and item separations and reliabilities of .49/.20 for persons and 6.72/.98 for the eight items on the SAT scale. These findings represented a slight increase in person separation and reliability but a decrease in item separation. While these results suggest a reasonably well-defined linear construct, the construct fails to distinguish differences among the people.
The third step in the analysis involved a review of the person-item map to explore the extrapolated construct. The resulting hierarchy of items ran from most difficult to endorse to least difficult to endorse (Figure 13).
Figure 13
SAT Group 1 person-item map (Winsteps output). Items from most to least difficult to endorse: Q61, Q18, Q50, Q49, Q6, Q44, Q28.
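The RPCA "percent of variance explained" statistics reported throughout this chapter compare the variance reproduced by the Rasch measures with the total observed variance, the remainder being residual variance available for extra contrasts. A conceptual sketch only (this simplifies the actual Winsteps computation, and the observed and expected scores below are hypothetical):

```python
def percent_variance_explained(observed, expected):
    """Percent of total observation variance reproduced by the model-expected scores."""
    n = len(observed)
    mean_obs = sum(observed) / n
    total = sum((x - mean_obs) ** 2 for x in observed)          # total variance
    residual = sum((x - e) ** 2 for x, e in zip(observed, expected))  # unexplained
    return 100.0 * (total - residual) / total

# Hypothetical dichotomous observations and Rasch-expected scores.
obs = [0, 1, 0, 1, 1, 0, 1, 1, 0, 1]
exp = [0.1, 0.8, 0.2, 0.9, 0.7, 0.3, 0.8, 0.9, 0.1, 0.6]
print(round(percent_variance_explained(obs, exp), 1))   # 79.2
```

Removing misfitting people shrinks the residual term, which is why the RPCA figures quoted in this chapter rise after each round of eliminations.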
The initial Rasch principal components analysis (RPCA) indicated that 68.2 percent of the total variance was explained by the instrument. When the misfitting people were eliminated from the SAT, the RPCA increased to 92.8 percent of the total variance explained by the instrument, with no underlying contrasts remaining. This change in the variance accounted for demonstrated an improvement in the SAT scale. In addition, the item/person map means and standard deviations were separated by several items. This separation indicates a great distance between the endorsability of the items and the agreeability of the people. However, the items appeared to span the entire range of the variable.
In the final step of the analysis of the scale, the extrapolated variable was compared using the data from a second comparable group and the same process. The SAT scale, using Group 2 data, demonstrated similar person and item separation and reliability findings. After the iterative elimination of eleven misfitting people, the final person and item separation and reliability findings increased to .62/.28 and 8.32/.99, respectively. No items failed to meet the cutoff for item fit; therefore, no items were eliminated. The final RPCA for the scale was also comparable at 92.9 percent of the variance being accounted for by the items. As was presented for the SAT for Group 1, the hierarchy of SAT item endorsement difficulty for Group 2 is presented in Figure 14.
Figure 14
SAT Group 2 person-item map (Winsteps output). Items from most to least difficult to endorse: Q32, Q18, Q50, Q49, Q6, Q44, Q28.
A side-by-side comparison of the Groups' respective item-endorsement difficulty indicated that six of the scale's eight items remained constant on the hierarchy across Groups (see Table 10). The two items found to be the most difficult to endorse by both Groups were interchanged. This means that the SAT scale can be divided into eleven levels of difficulty. However, the scale does not discriminate differences among the people in Group 2.
Table 10
Item Hierarchy
Group 1 Group 2
Difficult to endorse
Easy to endorse
Supplemental Addiction Measure (SAM)
The initial review of the person and item separations and reliabilities findings indicated 1.06/.53 for persons and 3.01/.90 for items of the SAM scale, respectively. This means that the scale does a good job of distinguishing the items into four levels of difficulty, but it does not meet the minimum standard of a separation of 2.0 for people. The scale does not differentiate any groups of people in this Group. Due to the SAM having only true and false possible responses, the review of the response scales was unnecessary. Step two of the Rasch instrument validation analysis involved reviewing the person and item fit. Further evaluation was conducted in an effort to improve this scale's separation and reliability results. Inspection of the items and persons led to a final iterative elimination of 21 people and one item that failed to meet the standards set forth for person and item fit. This elimination process resulted in the final person and item separations and reliabilities of 1.32/.64 for persons and 4.17/.95 for the remaining ten items on the SAM scale. This suggests a reasonably well-defined linear construct. However, the construct does not reliably discriminate among the people in this Group.
The third step in the analysis involved a review of the person-item map to explore the extrapolated construct. The resulting hierarchy of items ran from most difficult to endorse to least difficult to endorse (Figure 15).
Figure 15
SAM Group 1 person-item map (Winsteps output). Items shown from most to least difficult to endorse include Q54 (neglected), Q46 (undesirable), and Q13 (worn out).
When two items are aligned at the same place on the hierarchy, the items are either theoretically redundant or at the same level of difficulty. In Rasch this means the items overfit. Overfitting items can be eliminated if the infit mean-square is below 0.6 and the z-standardized score is -2.0 or less. Despite appearing to measure the same theoretical content, one of the items from each of the aligned groups of Q16/Q9, Q42/Q7, and Q29/Q40 could be eliminated because all of the items in these combinations fall within the item fit standards and are at the same level of difficulty. A visual representation of the hierarchy for Group 1's SAM scale items is demonstrated in Table 11. The initial Rasch principal components analysis (RPCA) indicated that 26.8 percent of the total variance was explained by the instrument. Additionally, five underlying contrasts were indicated. With the elimination of 21 people and one item (Q5 [Made mistakes]) from the SAM scale, the RPCA increased to 47.4 percent of the total variance explained by the instrument, and the remaining contrasts were reduced to four. This demonstrated some improvement in the scale due to the elimination of the misfitting items and people. However, the minimal accepted standard for RPCA is greater than or equal to 60 percent. Even with the improvements, the SAM does not appear to function as a linear construct.
The extrapolated variable was compared using the data from a second comparable
group using the same process in the final step of the analysis. The SAM scale, using
Group 2 data, demonstrated similar person and item separation and reliability findings.
After the iterative elimination of twenty-three misfitting people and two items (Q16 and Q5) that failed to meet the standards for item fit, the final person and item separation and
reliability findings increased to 1.42/.69 and 5.45/.97, respectively. The final RPCA for the scale was also comparable at 71.6 percent of the variance being accounted for by the items, with no underlying contrasts. As was presented for the SAM for Group 1, the hierarchy of SAM item endorsement difficulty for Group 2 is also presented in Figure 16.
Figure 16
SAM Group 2 person-item map (Winsteps output). Items from most to least difficult to endorse: Q9 (daydream), Q39 (broken law), Q54 (neglected), Q48 (punished), Q42 (too often), Q4 (trouble), Q40 (couldn't remember), Q13 (worn out).
The item combination of Q46/Q40 was aligned at the same place on the variable. Neither of the items met both statistical standards for item elimination, and neither seemed to be measuring the same theoretical content. In addition, elimination of Q40, which had the highest overfit statistics, did not improve the scale: this change decreased the person separation and reliability and the RPCA to 1.32/.64 and 70.2 percent, respectively. Therefore, item Q40 remained in the hierarchy. However, the removal of item Q46 increased the scale's person and item separation and reliability findings as well as the RPCA (1.46/.68, 5.65/.97, and 77.5 percent, respectively). Therefore, item Q46 was removed due to redundancy and the improvement in the scale. A side-by-side comparison of the Groups' respective item-endorsement difficulty indicated that several of the scale's fourteen items remained constant on the hierarchy across Groups (Table 11). The item found to be the most difficult to endorse and the two items found to be the least difficult to endorse by both Groups were consistent across both scales. Yet the hierarchy developed from Group 1's data comprised thirteen items, while the hierarchy developed from Group 2's data comprised eleven items. This means that the SAM scale items can be divided into seven levels of difficulty, but the scale does not reliably discriminate differences among the people.
Table 11
Item Hierarchy
Group 1 Group 2
Difficult to endorse
Q4 Police trouble
Table 11 (Continued)
Easy to endorse
Eliminated items:
Q16 Wasn't up to it
Defensiveness Scale (DEF)
The review of the initial person and item separations and reliabilities findings indicated .84/.41 for persons and 5.64/.97 for items of the DEF scale, respectively. This
means that while the DEF scale items can be divided into seven levels of difficulty, they
do not reliably discriminate differences among the people. Because of the dichotomous
nature of the response options (true and false) both options have an equal probability of
being selected. Therefore, the review of the response scales was unnecessary.
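The separation, reliability, and "levels of difficulty" figures quoted throughout follow fixed Rasch relationships, so the DEF values above can be checked directly. This is an illustrative sketch for the reader, not part of the original analysis:

```python
import math

def reliability_from_separation(g):
    """Rasch relationship: reliability R = G^2 / (1 + G^2)."""
    return g ** 2 / (1 + g ** 2)

def separation_from_reliability(r):
    """Inverse relationship: G = sqrt(R / (1 - R))."""
    return math.sqrt(r / (1 - r))

def strata(g):
    """Statistically distinct levels along the variable: H = (4G + 1) / 3."""
    return (4 * g + 1) / 3

# DEF scale values reported above: items 5.64/.97, persons .84/.41
print(round(reliability_from_separation(5.64), 2))  # 0.97
print(round(reliability_from_separation(0.84), 2))  # 0.41
print(int(strata(5.64)))                            # 7 levels of item difficulty
```

The same arithmetic reproduces the other statements in this chapter: an item separation of 5.64 yields (4 × 5.64 + 1)/3 ≈ 7.9, truncated to the seven difficulty levels reported, while a person separation of .84 yields a reliability too low to distinguish groups of people.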
Step two of the Rasch instrument validation analysis involved reviewing the
person and item fit. Further evaluation was conducted in an effort to improve this scale's
separation and reliability results. Inspection of the items and persons led to a final
iterative elimination of ten people whose responses were inconsistent and two items (Q8
[Friendly] and Q25 [Dangerous]) which failed to meet the standards set forth for item fit.
This resulted in the final person and item separation and reliability findings of .93/.47 for
persons and 6.62/.98 for the remaining ten items on the DEF scale. While this suggests a
reasonably well defined linear construct, which can be divided into nine levels of
difficulty with high reliability (.98), the variable does not reliably discriminate any
differences among the people.
The third step in the analysis involved a review of the person-item map to explore
the extrapolated construct (Figure 17).
Figure 17
[Person-item map for the DEF scale, Group 1. Items shown include Q65 Restless, Q64 Happy, and Q31 No good; each '#' represents 3 persons.]
The resulting hierarchy of items resulted in a pattern from most difficult to endorse to
least difficult to endorse. The initial Rasch principal components analysis (RPCA)
indicated that 42.8 percent of the total variance was explained by the instrument.
Additionally, five underlying contrasts were indicated. Following the elimination of ten
people and two items from the DEF scale, the RPCA increased to 71.6 percent of the total
variance having been explained by the instrument and the number of remaining contrasts
was reduced to one. This increase in the RPCA demonstrated the improvement in the
DEF scale by eliminating misfitting items and people. In addition, the item/person map
means and standard deviations were separated by several items but span the length of the
variable. This indicates that the items were marginally as difficult to endorse as the
people were able to agree to them.
The extrapolated variable was compared using the data from a second comparable
group using the same process. The DEF scale, using Group 2 data, demonstrated similar
person and item separation and reliability findings as were found using Group 1's data.
After the iterative elimination of thirteen misfitting people and one item that failed to
meet the standards for item fit, the final person and item separation and reliability
findings improved. Although this improved the scale, the DEF scale still did not
distinguish the people in a reliable manner.
The final RPCA for the scale was also comparable at 80.5 percent of the variance being
accounted for by the items. As was presented for the DEF for Group 1, the hierarchy of
DEF item endorsement difficulty for Group 2 is also presented on Table 12. Items Q25
and Q9 were aligned on the variable for the data provided by Group 2. Further evaluation
of this alignment indicated that neither of the items met the statistical standards for
elimination, and the items appeared to measure different content areas, Q25 (Dangerous)
and Q9 (Don't like to daydream). It should be noted that in the hierarchy produced by the
data from Group 1, item Q25 was eliminated for misfitting. Eliminating item Q25 from
the hierarchy produced by the data from Group 2 improved the item separation and
reliability findings and the RPCA for the DEF scale to 7.45/.98 and 83.9 percent,
respectively. See Figure 18 for the DEF item map.
Figure 18
[Person-item map for the DEF scale, Group 2.]
A side-by-side comparison of the two Groups' respective item-endorsement difficulty
indicated that six of the scale's twelve items remained constant on the hierarchy across
Groups (Table 12). Three of the six consistent items were found by both Groups to be the
most difficult to endorse. The other three items were found by both Groups to be the least
difficult to endorse.
Table 12
Item Hierarchy
Group 1 Group 2
Difficult to endorse
Q9 Daydream Q8 Friendly
Q31 No good
Easy to endorse
Eliminated items:
Q8 Friendly Q1 Lie
Q25 Dangerous
Family versus Control Scale (FAM)
The initial review of the FAM scale's person and item separation and reliability
findings indicated .71/.33 for persons and 5.03/.96 for items, respectively. This means
that the FAM scale items can be divided into seven levels of difficulty but the scale did
not reliably distinguish any differences among Group 1. Since the FAM scale's items are
limited to true and false responses, both options have an equal probability of being
selected. Therefore, the review of the response scales was unnecessary. Step two of the
Rasch instrument validation analysis involved reviewing the person and item fit. Further
evaluation was conducted in an effort to improve this scale's separation and reliability
results. Inspection of the items and persons led to a final iterative elimination of fourteen
people whose responses were inconsistent and three items (Q27 [Too much], Q63 [Loss
for words], and Q8 [Friendly]) which failed to meet the standards set forth for item fit.
This resulted in the final person and item separation and reliability findings of 1.00/.50 for
persons and 5.71/.97 for the FAM scale's remaining thirteen items. These separation and
reliability findings suggest a reasonably well defined linear construct which can be
divided into seven levels of difficulty, but the construct still fails to reliably distinguish
differences among the people.
The third step in the analysis involved a review of the person-item map to explore
the extrapolated construct. The resulting hierarchy of items resulted in a pattern from
most difficult to endorse to least difficult to endorse. When two items are aligned at the
same place on the hierarchy, the items are either theoretically redundant or at the
same level of difficulty. In Rasch terms this means the items overfit. Overfitting items can be
eliminated if the infit mean-square is below 0.6 and the z-standardized score is -2.0 or
less. Despite not appearing to be measuring the same theoretical content and all of the
items in these combinations falling within the item fit standards, one of the items from
each of the aligned groups of Q25/Q9 and Q23/Q55 can be eliminated because they are at
the same level of difficulty. The initial Rasch principal components analysis (RPCA)
indicated that 35.9 percent of the total variance was explained by the instrument.
Additionally, three underlying contrasts were indicated. With the elimination of fourteen
people and three items from the FAM scale, the RPCA increased to 78.1 percent of the
total variance having been explained by the instrument with no remaining contrasts. This
increase demonstrated the improvement in the FAM scale achieved by eliminating
misfitting items and people. In addition, the item/person map means and standard deviations were separated by several
items but span the length of the variable. This indicates that the items were easier to
endorse than the people were able to agree to them (see Figure 19).
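The elimination rule for overfitting items stated earlier (infit mean-square below 0.6 with a z-standardized score of -2.0 or less) can be expressed as a simple predicate. The item statistics below are hypothetical, for illustration only:

```python
def is_overfit(infit_mnsq, infit_zstd):
    """Overfit rule used in this analysis: infit MNSQ < 0.6 AND ZSTD <= -2.0.
    Overfitting items are candidates for elimination as redundant."""
    return infit_mnsq < 0.6 and infit_zstd <= -2.0

# hypothetical statistics for two aligned items
print(is_overfit(0.55, -2.3))  # True: eligible for elimination
print(is_overfit(0.72, -1.4))  # False: retained
```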
Figure 19
[Person-item map for the FAM scale, Group 1. Items shown include Q1 Lie, Q54 Neglected, Q65 Restless, Q39 Broken law, Q50 Energy, Q23 Crooks, Q55 Steady, and Q61 Antacid; each '#' represents 3 persons.]
The extrapolated variable was compared using the data from a second comparable
group using the same process in the final step of the analysis. The FAM scale, using
Group 2 data, demonstrated similar person and item separation and reliability findings.
After the iterative elimination of six misfitting people, the final person and item
separation and reliability findings changed to .43/.16 and 5.41/.97, respectively. No items
failed to meet the cutoff for item fit; therefore, no items were eliminated. The final RPCA
for the scale was also comparable at 40 percent of the variance being accounted for by the
items. As was presented for the FAM for Group 1, the hierarchy of FAM item
endorsement difficulty for Group 2 is also presented on Table 13. Items Q25 and Q9 were
again aligned on the same variable for the hierarchy created by the data from Group 2,
which indicates redundancy. In addition, the item combination of Q39 and Q54 was also
aligned on the hierarchy. Item Q54 fit statistics indicated that the item overfit. However,
elimination of the item drastically decreased the person separation and reliability findings
while only narrowly increasing the item separation and reliability findings and RPCA, to
.19/.04, 5.54/.97, and 40.4 percent, respectively. Therefore, the item Q54 remained in the
scale, as its removal was found to drastically reduce the ability of the instrument to
discriminate between the people, and the item also appeared to measure a different
content area (Figure 20).
Figure 20
[Person-item map for the FAM scale, Group 2. Items shown include Q23 Crooks and Q61 Antacid; each '#' represents 4 persons.]
A side-by-side comparison of the Groups' respective item-endorsement difficulty
indicated that six of the scale's fifteen items remained constant on the hierarchy across
Groups (Table 13). The two items found by both Groups to be the most difficult to
endorse (Q25 [Dangerous] and Q9 [Daydream]) and the item least difficult to endorse
(Q61 [Antacid]) were consistent, as was a set of three items (Q3 [Go along with], Q38
[Feel sure], and Q50 [Full of energy]) which were clustered around the mean for both
hierarchies. This means that the FAM scale fails to work in terms of discriminating
differences among the people and also fails to account for an acceptable portion of the
variance.
Table 13
Item Hierarchy
Group 1 Group 2
Difficult to endorse
Q61 Antacid
Easy to endorse
Eliminated items:
Table 13 (Continued)
Correctional Scale (COR)

The initial review of the person and item separation and reliability findings
indicated 1.10/.55 for persons and 5.35/.97 for items of the COR scale, respectively. The
COR scale items can be divided into seven levels of difficulty but it cannot reliably
discriminate differences among the people. The COR scale has true and false responses.
Therefore, the review of the response scales was unnecessary. Step two of the Rasch
instrument validation analysis involved reviewing the person and item fit. Inspection of
the items and persons led to a final iterative elimination of six people and one item (Q1
[Lie]) which failed to meet the standards set forth for item fit. This resulted in the final
person and item separation and reliability findings of 1.13/.56 for persons and 5.84/.97 for
the remaining eleven items on the COR scale. While these separation and reliability
findings suggest a reasonably well defined linear construct, the variable does not reliably
distinguish differences among the people.
The third step in the analysis involved a review of the person-item map to explore
the extrapolated construct. The resulting hierarchy of items resulted in a pattern from
Figure 21
[Person-item map for the COR scale, Group 1. Items shown include Q18 Obey, Q31 No good, Q24 Teachers, Q39 Never broken law, Q19 Leave home, and Q42 Too often; each '#' represents 3 persons.]
The initial RPCA indicated that 60.1 percent of the total variance was explained by the
instrument. Additionally, two underlying contrasts were indicated. With the elimination
of six people and one item from the COR scale, the RPCA increased to 85.1 percent of
the total variance having been explained by the instrument with no remaining contrasts.
This demonstrated improvement in this scale by eliminating misfitting items and people.
In addition, the item/person map means and standard deviations were separated by
several items but span the length of the variable. This indicates that the items were more
difficult to endorse than the people were able to agree to them.
The extrapolated variable was compared using the data from a second comparable
group using the same process in the last step of the analysis. The COR scale, using Group
2 data, demonstrated similar person and item separation and reliability findings. After the
iterative elimination of three misfitting people and one item (Ql), the final person and
item separation and reliability findings increased to 1.28/.62 and 6.17/.97, respectively.
Using the data from Group 2, two combinations of items aligned at the same place on the
hierarchy, Q42/Q7 and Q36/Q40. By examining the item fit statistics for the aligned
items, only one item appeared to overfit statistically, and all of the items appeared to
measure different content. There was no improvement for this scale despite the removal
of the overfitting item (Q42): the person and item separation and reliability findings and
RPCA declined to 1.12/.56, 6.16/.97, and 80.6 percent, respectively. Therefore, item Q42
remained in the hierarchy. The final RPCA for the scale was also comparable at 82.8
percent of the variance being accounted for by the items. As was presented for the COR
for Group 1, the hierarchy of COR item endorsement difficulty for Group 2 is also
presented on Table 14.
Figure 22
[Person-item map for the COR scale, Group 2. Items shown include Q18 Obey, Q31 No good, Q24 Teachers, Q39 Never broken law, Q19 Leave home, Q42 Too often, Q7 Trouble, Q36 Hit people, Q40 Couldn't remember, and Q41 Think; each '#' represents 3 persons.]
A side-by-side comparison of the Groups' respective item-endorsement difficulty
indicated that seven of the scale's remaining eleven items remained constant on the
hierarchy across Groups (Table 14). All seven of these items were among those found to
be among the most difficult to be endorsed by both Groups. This means that the COR
scale items can be divided into eight difficulty levels, but the scale does not do an
adequate job of discriminating differences among the people.
Table 14
Item Hierarchy
Group 1 Group 2
Difficult to endorse
Table 14 (Continued)
Easy to endorse
Eliminated items:
Q1 Lie Q1 Lie
Random Answering Pattern Scale (RAP)

The initial review of the person and item separation and reliability findings
indicated 0.00/0.00 for persons and 3.63/.93 for items of the RAP scale, respectively. This
means that the RAP scale does not distinguish any differences among Group 1, but
initially the items can be divided into five levels of difficulty. The RAP has true and false
response options, so both have an equal probability of being
selected. Therefore, the review of the response scales was unnecessary. Step two of the
Rasch instrument validation analysis involved reviewing the person and item fit.
Inspection of the items and persons led to a final iterative elimination of twelve
misfitting people who failed to meet the statistical standards for fit. No items failed to
meet the standards set forth for item fit. This resulted in the final person and item
separations and reliabilities of 0.00/0.00 for persons and 0.00/0.00 for the RAP scale.
This indicated no change for items and a decrease in reliability and separation for
persons. This suggests that the RAP scale is functioning as developed, which is in a
random manner. The third step in the analysis involved a review of the person-item map
to explore the extrapolated construct. There was no resulting hierarchy of items due to the
nature of the scale being random. The initial RPCA indicated that 48 percent of the total
variance was explained by the instrument. Additionally, two underlying contrasts were
indicated. With the elimination of twelve misfitting people from the RAP scale, the
RPCA decreased to 1.4 percent of the total variance having been explained by the
instrument, with two remaining contrasts which accounted for 98.6 percent of the
unexplained variance. The RPCA also indicated that the RAP scale was functioning as
designed, in a random manner.
The extrapolated variable was compared using the data from a second comparable
group using the same process in the last step of the analysis. The RAP scale, using Group
2 data, demonstrated similar person and item separation and reliability findings. After the
iterative elimination of fourteen misfitting people, the final person and item separation
and reliability findings decreased to .00/.00 and .00/.00, respectively. The final RPCA for
the scale was also comparable at 18.6 percent of the variance being accounted for by the
items with three additional contrasts. As with the RAP scale using data from Group 1, the
RAP scale using data from Group 2 had similar results. This means that the RAP scale
functions in a random manner, as designed.
Dichotomous SASSI-3
The dichotomous SASSI-3 scoring rubric classifies the respondent as either having or not
having a substance dependence disorder. This rubric requires the clinician to reference
the respondent's scores on eight of the SASSI-3's ten subscales. The first five of these nine
steps require the scorer to reference the individual SASSI-3 subscales. The remaining four
steps are a function of two or more subscales used in combination. In total, these nine steps,
involving eight of the ten subscales, employ only 70 of the SASSI-3's total of 93 items.
The initial person and item separation and reliability findings for the dichotomous SASSI-3 were
3.54/.93 and 5.60/.97, respectively. Additionally, the initial RPCA was 59.7 percent with one
contrast accounting for more than 5 percent of additional variance. Viewed cumulatively,
all of these findings suggest that the dichotomous SASSI-3, according to the Rasch
analysis, can be said to be a logical linear construct in which the items can be divided into
seven levels of difficulty and which discriminates five levels of differences among the
people. However, it is just under the required 60 percent cutoff of explained variance for
unidimensionality. To determine whether improvements could be made, the researcher
conducted analyses of the dichotomous SASSI-3 scale's properties.
The first step in a Rasch analysis was to evaluate the response scales. Because the
dichotomous SASSI-3 involves items from both the front and back of the instrument,
with both true/false and Likert-type response options, it is important to evaluate the
response scales for validity. Inspection of the probability curve and thresholds indicated that response
options 1-Once or Twice and 2- Several times did not meet the standards for cutoffs for
the face valid response scales as identified earlier (see Figure 23).
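The probability curves being inspected derive from the Rasch model; for the dichotomous (true/false) items the model reduces to a single logistic expression. A minimal sketch with illustrative logit measures:

```python
import math

def rasch_prob(person_measure, item_difficulty):
    """Dichotomous Rasch model: P(X = 1) = exp(B - D) / (1 + exp(B - D)),
    where B is the person measure and D the item difficulty (in logits)."""
    return math.exp(person_measure - item_difficulty) / (
        1 + math.exp(person_measure - item_difficulty))

# a person located exactly at an item's difficulty endorses it half the time
print(round(rasch_prob(1.0, 1.0), 2))  # 0.5
# a person one logit above the item endorses it about 73% of the time
print(round(rasch_prob(2.0, 1.0), 2))  # 0.73
```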
Figure 23
[Probability curves and thresholds for the dichotomous SASSI-3 face valid response options, Group 1.]
The collapsing strategy was applied to the response scales of the face valid scales as a
whole rather than to the FVA and FVOD scales separately. Therefore, employing the
same collapsing strategy for both the FVA and FVOD response scales resulted in a
positive increase in the item separation and reliability to 5.72/.97 (see Table 15). No
additional examination was warranted because the response options for the remaining
scales are true and false.
Table 15
Summary of Collapsing Strategy for Dichotomous Group 1 Face Valid Response Options

Rating Scale   Probability Curve1                 Threshold2                 PS&R      IS&R      RPCA
0,1,2,3        0 = 0.95, 1 = 0.20,                0-1 = N/A, 1-2 = 9.29,     3.54/.93  5.60/.97  59.7%
               2 = 0.30, 3 = 0.95                 2-3 = 3.61
0,1,1,2        0 = 0.95, 1 = 0.40, 2 = 0.95       0-1 = N/A, 1-2 = 6.04      3.35/.92  5.72/.97  49.3%
0,0,1,2        0 = 0.95, 1 = 0.20, 2 = 0.95       0-1 = N/A, 1-2 = 11.44     3.06/.90  5.75/.97  58.1%
0,1,2,2        0 = 0.95, 1 = 0.20, 2 = 0.95       0-1 = N/A, 1-2 = 14.49     3.43/.92  5.43/.97  53.1%

Note. 1 = ≥ .5 is acceptable. 2 = ≥ 1.4 is acceptable. PS&R = Person Separation & Reliability. IS&R = Item Separation & Reliability.
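The collapsing strategies compared in Table 15 amount to recoding the four face valid response options before re-estimation; for example, the 0,1,1,2 strategy merges 1-Once or twice with 2-Several times into a single middle category. A sketch of the recode, on hypothetical responses:

```python
# 0,1,1,2 strategy: the two middle options collapse into one category
COLLAPSE_0112 = {0: 0, 1: 1, 2: 1, 3: 2}

# hypothetical raw face valid responses for one person
raw = [0, 1, 2, 3, 2, 0]
recoded = [COLLAPSE_0112[r] for r in raw]
print(recoded)  # [0, 1, 1, 2, 1, 0]
```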
Figure 24 depicts the corrected response scale in which the middle two response
options were collapsed.
Figure 24
[Corrected category probability curves for the dichotomous SASSI-3 face valid items, Group 1, after collapsing the middle two response options.]
Step two of the Rasch instrument validation analysis involved reviewing the
person and item fit. Further evaluation was conducted for the dichotomous SASSI-3 to
determine whether the separation and reliability results could be improved. Inspection of
the items and persons led to a final iterative elimination of 25 people whose responses
were inconsistent and eighteen items which failed to meet the standards set forth for item
fit. This resulted in the final person and item separations and reliabilities of 3.32/.92 and
5.50/.92, respectively, for the dichotomous SASSI-3. This suggests that the resulting
items formed a well defined linear construct that does a good job in measuring the
people.
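The "standards set forth for item fit" rest on infit and outfit mean-squares computed from standardized residuals. A sketch of the standard definitions, using hypothetical dichotomous responses and model probabilities rather than the study data:

```python
import numpy as np

def fit_mean_squares(observed, expected):
    """Outfit: unweighted mean of squared standardized residuals.
    Infit: information-weighted mean, with weights = model variance p(1 - p)."""
    variance = expected * (1 - expected)
    z_squared = (observed - expected) ** 2 / variance
    outfit = z_squared.mean()
    infit = (z_squared * variance).sum() / variance.sum()
    return infit, outfit

# hypothetical responses and Rasch expected probabilities for one item
obs = np.array([1, 0, 1, 1, 0], dtype=float)
exp = np.array([0.8, 0.3, 0.6, 0.9, 0.2])
infit, outfit = fit_mean_squares(obs, exp)
print(round(infit, 2), round(outfit, 2))  # 0.4 0.34
```

Values near 1.0 indicate responses consistent with the model; values well below the 0.6 cutoff flag overfit, and large values flag misfit.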
The third step in the analysis of the dichotomous SASSI-3 involved a review of
the person-item map to explore the extrapolated construct. The resulting hierarchy of
items produced a pattern from most difficult to endorse to least difficult to endorse,
which is available for visual review on Table 19. Items aligned at the same position
on the variable may imply redundancy. Therefore, the item fit statistics for the pairs of
items were reviewed to identify which items fit best. The least best fitting item of each pair
was eliminated. In addition, because the items from the face valid scales were found to
discriminate differences among people, these items were analyzed last for elimination as
they seem to contribute the most to the effectiveness of the instrument. Table 16 lays out
the fit statistics for the aligned item pairs.
Table 16

      Infit           Outfit
      MNSQ   ZSTD     MNSQ   ZSTD
Q4    1.06   .7       1.04   .3

Note. * = better fitting item of the pair. Items with no * were eliminated. MNSQ = mean-square. ZSTD = z-standardized.
The elimination of items Q17, Q35, Q42, Q29, and Q48 resulted in an increase in person
and item separation and reliability findings of 3.21/.91 and 5.62/.97, respectively, and a
RPCA of 97.8 percent. The above process was repeated until no further improvements
were made in person and item separation and reliabilities. Finally, with the elimination of
29 misfitting items and 25 misfitting people, the resulting 41-item dichotomous SASSI-3
scale had person and item separation and reliability findings of 3.06/.90 and 5.63/.97,
respectively.
Figure 25
[Person-item map for the 41-item dichotomous SASSI-3, Group 1. Items at the top (most difficult to endorse) include FVA10 Relationship, FVA11 Nervous/shakes, FVA6 Trouble, FVOD12 Avoid withdrawal, FVOD3 More aware, and Q55.]
The improvement was achieved in this scale by adjusting the response scale and
eliminating misfitting people and items. In addition, the item/person map means and
standard deviations were separated by nearly one standard deviation, indicating that the
items were more difficult to endorse than the people were able to agree to them.
In the final step of the analysis, the extrapolated variable was compared using the
data from a second comparable group using the same process. The dichotomous SASSI-3,
using Group 2 data, demonstrated similar person and item separation and reliability
findings. As with Group 1, the response options were not being used as intended (see
Figure 26).
Figure 26
[Category probability curves for the dichotomous SASSI-3 face valid items, Group 2, showing the original response options.]
A collapsing strategy was developed by reviewing the thresholds and probability curves
(see Table 17). This strategy led the researcher to combine the two middle response
options: 1-Once or Twice and 2-Several times. This combination allowed for a better
functioning response scale and an increase in the item separation and reliability findings.
Table 17
Summary of Collapsing Strategy for Dichotomous Group 2 Face Valid Response Options

Rating Scale   Probability Curve1                 Threshold2                 PS&R      IS&R      RPCA
0,1,2,3        0 = 0.95, 1 = 0.20,                0-1 = N/A, 1-2 = 9.64,     3.53/.93  6.15/.97  62.2%
               2 = 0.30, 3 = 0.95                 2-3 = 6.29
0,1,1,2        0 = 0.95, 1 = 0.45, 2 = 0.95       0-1 = N/A, 1-2 = 8.32      3.39/.92  6.22/.97  56.7%
0,0,1,2        0 = 0.95, 1 = 0.25, 2 = 0.95       0-1 = N/A, 1-2 = 8.66      3.11/.91  6.22/.97  61.4%
0,1,2,2        0 = 0.95, 1 = 0.20, 2 = 0.95       0-1 = N/A, 1-2 = 14.44     3.42/.92  5.96/.97  56.7%

Note. 1 = ≥ .5 is acceptable. 2 = ≥ 1.4 is acceptable. PS&R = Person Separation & Reliability. IS&R = Item Separation & Reliability.
The corrected response option curves for Group 2 are presented in Figure 27.
Figure 27
[Corrected category probability curves and structure for the dichotomous SASSI-3 face valid items, Group 2.]
There were 35 misfitting people and 18 items that failed to meet the standards for
item fit. The elimination of these people and items increased the final person and item
separation and reliability results to 3.45/.92 and 6.11/.97, respectively. The item map
identified five pairs of aligned items. These items were considered for elimination (see
Table 18).
Table 18
Infit Outfit
While Group 2's final RPCA for the scale increased to 85.5 percent of the variance being
accounted for by the items, the person and item separation and reliability findings
decreased to 3.17/.91 and 5.78/.97, respectively. Therefore, these items remained in the
instrument. As was presented for the Dichotomous SASSI-3 for Group 1, the hierarchy of
Figure 28
[Person-item map for the dichotomous SASSI-3, Group 2.]
A side-by-side comparison of the Groups' respective item-endorsement difficulty
indicated that the scale maintained some of its consistency on the hierarchy of item
difficulty across groups (see Table 19). This means that the dichotomous SASSI-3's
items can be divided into eight levels of difficulty with high (.97) reliability. Further,
these items distinguish four levels of differences among the groups with high (.91)
reliability.
Table 19
Item Hierarchy
Group 1 Group 2
Difficult to endorse
Table 19 (Continued)
Table 19 (Continued)
Q7 Not lived
Q4 Police trouble
Easy to endorse
Eliminated items:
Q1 Lie Q1 Lie
Table 19 (Continued)
Q8 Friendly Q9 Daydream
Q49 Cigarettes
Table 19 (Continued)
Q52 Resentful
Q61 Antacid
Q64 Happy
Q65 Restless
The RPCA analyses indicated that the following scales were unidimensional in
structure (i.e., each scale accounted for equal to or greater than 60 percent of the scale's
total variance): FVA, FVOD, SYM, OAT, SAT, DEF, FAM, COR, and the dichotomous
SASSI-3 scale. The SAM and the RAP scales' RPCA failed to meet the minimum 60
percent criterion. Therefore, the researcher rejected Research Hypothesis 1.
The following scales' item fit produced infit and outfit statistics indicative of low
item error: FVA, FVOD, SAM, SYM, OAT, SAT, DEF, FAM, COR, and the
dichotomous SASSI-3 scale. The RAP scale's items did not meet the acceptable
standards for item fit. Therefore, the researcher rejected Research Hypothesis 2.
The following scales produced reliability statistics indicative of acceptable
internal consistency: FVA, FVOD, OAT, SAT, SAM, DEF, FAM, COR, and the
dichotomous SASSI-3 scale. The RAP scale did not produce acceptable reliability
statistics for internal consistency. Therefore, the researcher rejected Research Hypothesis
3a.
The following scales remained reliably defined across samples: FVA, FVOD,
SYM, OAT, SAT, SAM, DEF, FAM, COR, RAP, and dichotomous SASSI-3. Therefore,
the researcher did not reject this hypothesis.
The following scales demonstrated high discriminatory ability: FVA, FVOD, and
the dichotomous SASSI-3 scale. The SYM, OAT, SAT, SAM, DEF, FAM, COR, and
RAP did not demonstrate discriminatory ability. Therefore, the researcher rejected
Research Hypothesis 4.
Whole SASSI-3
The SASSI-3 has a total of 93 items. Eleven of the SASSI-3's 93 items are not
used on any of the ten scales. Twenty-six of these 93 items load on more than one scale.
While the 26 shared items each have dichotomous response options, nine are keyed true
on at least one of the scales and false on another (see Table 20). Items that do not fall in
the same direction or cannot be coded as such are deemed to be misfitting. While there is
a key indicating the expected or "correct" response as identified by the authors of the
SASSI-3, twenty items either have opposite correct answers on two different scales or
have no correct answer listed. Because this creates interdependence and artificially high
intercorrelations, it was expected that many of these items would appear to be redundant
or misfit.
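The interdependence created by shared, oppositely keyed items can be illustrated with simulated data: a single item keyed true on one scale and false on another builds a spurious negative component into otherwise independent scale totals. This is a sketch with hypothetical data, not the SASSI-3 items themselves:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000  # simulated respondents

shared = rng.integers(0, 2, size=n)                    # one shared item
scale_a_rest = rng.integers(0, 2, size=(n, 5)).sum(axis=1)
scale_b_rest = rng.integers(0, 2, size=(n, 5)).sum(axis=1)

scale_a = scale_a_rest + shared        # item keyed true on scale A
scale_b = scale_b_rest + (1 - shared)  # same item keyed false on scale B

# the scales share no real construct, yet correlate through the shared item
r = np.corrcoef(scale_a, scale_b)[0, 1]
print(round(r, 2))
```

With independent remaining items, the shared item alone drives the observed correlation, which is the interdependence the text expects to surface as redundancy or misfit in the Rasch results.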
Table 20
Table 20 (Continued)
Note. * = item's response options are true on at least one scale and false on another.
The initial person and item separation and reliability findings for the SASSI-3 were
3.29/.92 and 5.73/.97, respectively. Additionally, the initial RPCA was 54.6 percent with
only one contrast that accounted for more than 5 percent of additional variance.
Combined, these findings suggest that the SASSI-3 is a logical linear construct that meets
the minimum standards of distinguishing differences among the sample. However, the
instrument is multidimensional and accounts for less than the accepted level of 60 percent
of the total variance. This suggests that the SASSI-3 is measuring more than just one
construct, and whatever additional constructs are being measured account for a portion of
the remaining variance. Further analyses were conducted to determine whether
augmentations could be made that would render the SASSI-3 a unidimensional
instrument. Because the SASSI-3 involves items from both the front and back of the
instrument, with both true/false and Likert-type response options, it is important to
evaluate the response scales for validity. Inspection of the probability curves and
thresholds indicated that response options 1-Once or twice and 2-Several times (Figure 29)
did not meet the standards for cutoffs for the face valid response scales as identified
earlier.
Figure 29
[Category probability curves and structure for the whole SASSI-3 face valid response options, Group 1.]
Table 21
Summary of Collapsing Strategy for Whole SASSI-3 Face Valid Response Options

Rating Scale   Probability Curve1                 Threshold2                 PS&R      IS&R      RPCA
0,1,2,3        0 = 0.95, 1 = 0.20,                0-1 = N/A, 1-2 = 10.43,    3.29/.92  5.72/.97  77.8%
               2 = 0.25, 3 = 0.95                 2-3 = 2.44
0,1,1,2        0 = 0.95, 1 = 0.40, 2 = 0.95       0-1 = N/A, 1-2 = 3.84      3.00/.90  5.74/.97  49.7%
0,0,1,2        0 = 0.95, 1 = 0.20, 2 = 0.95       0-1 = N/A, 1-2 = 13.48     2.78/.89  5.71/.97  53%
0,1,2,2        0 = 0.95, 1 = 0.20, 2 = 0.95       0-1 = N/A, 1-2 = 16.54     3.12/.91  5.70/.97  52.1%

Note. 1 = ≥ .5 is acceptable. 2 = ≥ 1.4 is acceptable. PS&R = Person Separation & Reliability. IS&R = Item Separation & Reliability.
Therefore, the collapsing strategy for the face valid response scales positively increased
the item separation and reliability to 5.74/.97 (Figure 30). As stated above, because the
response options for the other scales are true and false, no additional examination was
warranted.
Figure 30
[Corrected category probability curves for the whole SASSI-3 face valid items, Group 1, after collapsing the response scales.]
Step two of the Rasch instrument validation analysis involved reviewing the
person and item fit. Inspection of the items and persons led to a final iterative
elimination of 22 people and 20 items which failed to meet the standards set forth for
item fit. This resulted in the final person and item separation and reliability findings of 3.82/.94
for persons and 5.63/.97 for the SASSI-3 with an RPCA of 69.2 percent. This suggests
that the remaining items form a well defined linear construct that can be divided into
seven levels of difficulty, and it does a reliable (.97) job in discriminating five different
groups among the people from low to high agreeability on the hierarchy of items.
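The RPCA percentages quoted here (such as the 69.2 percent just reported) express the share of total observed variance captured by the Rasch measures, with the residual variance examined for contrasts. A toy sketch of the computation on simulated data, not the SASSI-3 responses:

```python
import numpy as np

rng = np.random.default_rng(0)

# simulated data: a linear item-difficulty structure plus random noise
measures = np.tile(np.linspace(-2, 2, 10), (200, 1))    # modeled component
observed = measures + rng.normal(0, 1, size=(200, 10))  # observed scores
residuals = observed - measures

# variance explained by measures = 1 - (residual variance / total variance)
explained = 1 - residuals.var() / observed.var()
print(f"{explained:.1%}")
```

When the residual variance shrinks, as it does after eliminating misfitting people and items, the explained share rises toward the 60 percent unidimensionality criterion used in this study.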
The third step in the analysis involved a review of the person-item map to explore the extrapolated construct. The resulting hierarchy produced a pattern of items ranging from most difficult to endorse to least difficult to endorse (see Table 22). No items shared the same position on the scale. Group 1's final RPCA indicated that 69.2 percent of the total variance was explained by the SASSI-3's remaining 73 items. This improvement was achieved by adjusting the response scale and eliminating misfitting people and items. In addition, the item and person means on the item/person map were separated by nearly one standard deviation (Figure 31). This separation indicates that the items were, on average, more difficult to endorse than the sample's average level on the variable.
Figure 31

[Person-item map (Wright map) for the whole SASSI-3, Group 1: persons and items are arrayed on a common logit scale from most difficult to endorse at the top (e.g., FVA8-argued, Q31 no good, Q53 responsibilities, FVA7-depressed, FVOD8-really stoned) to least difficult to endorse at the bottom (e.g., Q27 drunk too much, Q14 moving), with M, S, and T marking the mean and one and two standard deviations of each distribution. Each '#' represents two persons.]
In the final step of the analysis, the extrapolated variable was compared using the data from a second comparable group and the same process. The SASSI-3, using Group 2 data, demonstrated similar person and item separation and reliability findings, 3.57/.93 and 5.26/.97, respectively, and an RPCA of 60.9 percent. As with Group 1, the response scale's category structure was reviewed first (Figure 32).
Figure 32

[Winsteps category structure output for the original Group 2 face valid response scale: category label, measure, and standard error, with score-to-measure zones, 50% cumulative probability, coherence (M->C, C->M), and estimated discrimination for each category.]
The researcher developed a collapsing strategy after reviewing the thresholds and probability curves. The two middle response options, 1-Once or Twice and 2-Several Times, were not clearly distinguished by respondents and were collapsed (Figure 33).
Figure 33

[Winsteps category structure and probability curves for the collapsed Group 2 face valid response scale: the probability of each response category (0, 1, 2) is plotted against the person-minus-item measure. The observed average reported in the output is the mean of measures in each category, not a parameter estimate.]
This allowed for a better functioning response scale and an increase in the person and item separation and reliability findings (Table 21).
Table 21
Summary of Collapsing Strategy for Whole SASSI-3 Group 2 Face Valid Response Options

Scale     Rating Probability Curve¹                         Threshold²                              PS&R       IS&R       RPCA
0,1,2,3   0 = 0.95; 1 = 0.20; 2 = 0.30; 3 = 0.95            0-1 = N/A; 1-2 = 9.84; 2-3 = 5.16       3.57/.93   5.26/.97   60.9%
0,1,1,2   0 = 0.95; 1 = 0.40; 2 = 0.95                      0-1 = N/A; 1-2 = 6.92                   3.32/.92   5.25/.97   58%
0,0,1,2   0 = 0.95; 1 = 0.25; 2 = 0.95                      0-1 = N/A; 1-2 = 9.54                   3.11/.91   5.23/.96   59.9%
0,1,2,2   0 = 0.95; 1 = 0.20; 2 = 0.95                      0-1 = N/A; 1-2 = 15.34                  3.37/.92   5.25/.97   60.6%
Note. 1 = ≥ .5 is acceptable. 2 = ≥ 1.4 is acceptable. PS&R = Person Separation & Reliability. IS&R = Item Separation & Reliability. RPCA = Rasch principal components analysis.
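The collapsing strategies summarized in Table 21 (and in the corresponding Group 1 summary) amount to recoding the original four response categories into fewer ordered categories before re-estimating the model. A minimal sketch of that recoding step; the response data are invented, and the category labels follow the text (0 = Never, 1 = Once or Twice, 2 = Several Times, 3 = the most frequent option):

```python
# Each strategy maps the original categories 0-3 onto fewer ordered categories.
# "0,1,1,2" means: 0 -> 0, 1 -> 1, 2 -> 1, 3 -> 2 (the two middle options merge).
STRATEGIES = {
    "0,1,2,3": {0: 0, 1: 1, 2: 2, 3: 3},  # original scale, no collapsing
    "0,1,1,2": {0: 0, 1: 1, 2: 1, 3: 2},
    "0,0,1,2": {0: 0, 1: 0, 2: 1, 3: 2},
    "0,1,2,2": {0: 0, 1: 1, 2: 2, 3: 2},
}

def collapse(responses, strategy):
    """Recode a list of 0-3 responses under a named collapsing strategy."""
    mapping = STRATEGIES[strategy]
    return [mapping[r] for r in responses]

# Invented responses to one face valid item
raw = [0, 1, 2, 3, 2, 1]
recoded = collapse(raw, "0,1,1,2")  # -> [0, 1, 1, 2, 1, 1]
```

The recoded data are then refit, and the thresholds, probability curves, and separation statistics in the table are compared across strategies to pick the best-functioning scale.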
After the iterative elimination of the 13 misfitting people and the 22 items that failed to meet the standards for fit, Group 2's final person and item separation and reliability findings improved to 4.12/.94 and 5.10/.96, respectively. The final RPCA for the scale was also comparable, with 74.8 percent of the variance accounted for by the items. This means that the items can be divided into seven levels of difficulty that discriminate between five groups of people in the sample, with high reliability (.94 and .96, respectively). As was reported for Group 1, the hierarchy of item endorsement difficulty for Group 2 is also presented in Table 22. A side-by-side comparison of the groups' respective item-endorsement difficulties indicated that the scale maintained some of its consistency in the hierarchy across groups. Of the 20 items deleted from Group 1's hierarchy, all but one were also deleted from Group 2's hierarchy.
Table 22
Item Hierarchy

[Side-by-side listing of the Group 1 and Group 2 item hierarchies, ordered from most difficult to endorse to least difficult to endorse; the body of the table spans several pages in the original. The eliminated items listed include Q1 Lie (both groups), Q64 Happy, and Q65 Restless.]
The whole SASSI-3's RPCA indicated that greater than 60 percent (74.8%) of the variability is accounted for by the instrument. Based on this finding, the researcher failed to reject the corresponding research hypothesis.
The SASSI-3's remaining 63 items' infit and outfit statistics fall below the 2.0 z-standardized cutoff and have positive point-biserial correlations. Based on this finding, the researcher failed to reject the corresponding research hypothesis.
The SASSI-3 maintains its item consistency, as the items align in the same general area on the variable across samples. Based on this finding, the researcher failed to reject the corresponding research hypothesis.
The SASSI-3 discriminates five different groups (person separation = 3.82) among the sample with high reliability (.94). Based on this finding, the researcher failed to reject Research Hypothesis 8.
Summary
This study had two general research questions. General Research Question 1 concerned the measurement properties of the SASSI-3's individual scales; General Research Question 2 was: Does modern measurement theory assist in improving the SASSI-3 instrument holistically? Based on the results reported in this chapter, the researcher failed to reject both general research questions. Generally, the evidence supports that the face valid scales meet fundamental measurement properties and the subtle scales do not. Additionally, the face valid scales improve the instrument when combined with the subtle scales, but they perform best when used independently.
Chapter Five
Discussion
Substance dependence is a significant problem in America and one that has a negative impact on its citizens (Substance Abuse and Mental Health Services Administration [SAMHSA], 2008). Substance dependence is associated with untimely deaths, loss in work productivity, reduction in days attended at school, increased costs due to substance dependence-associated medical care, and criminal activity (SAMHSA, 2008). It is important for people who struggle with alcohol and drug dependency to get a proper diagnosis and treatment to help reduce and eliminate these consequences. Proper diagnosis depends in part on the accuracy of the tools used in formulating a diagnosis. As such, given the clinical decisions these tools inform, it is critical that they are psychometrically sound and accurately measure the behaviors they are designed to measure: substance abuse.
A number of substance abuse screening instruments are available to assist in this process. A study of master's-level addictions counselors revealed four substance abuse screens that these counselors most frequently select as aids in their diagnostic processes (Juhnke, Vacc, Curtis, Coll, & Paredes, 2003). These four screens are the Substance Abuse Subtle Screening Inventory-3 (SASSI-3; Miller & Lazowski, 1999), the Michigan Alcoholism Screening Test (MAST; Selzer, 1971), the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) MacAndrew Scale-Revised (MAC-R; MacAndrew, 1965), and the Addiction Severity Index (ASI; McLellan, Luborski, Cacciola, Griffith, McGranhan, & O'Brien, 1992). Of these four, the Substance Abuse Subtle Screening Inventory-3 (Miller & Lazowski, 1999) was identified by these counselors as being the most important (Juhnke et al., 2003) for the following reasons: the SASSI-3, unlike the other three, screens for both alcohol and other drug abuse, and it provides several measures of response bias (e.g., defensiveness and random answering patterns) that aid in interpreting respondents' data.
A robust but conflicting literature base has developed to address the SASSI-3's reliability and validity, with varying degrees of agreement with what is published in the SASSI-3 Manual (Miller & Lazowski, 1999). In fact, research conducted by investigators not associated with the SASSI-3's publishers appears to question the SASSI-3's reliability and validity. Despite this well-developed body of literature, nothing is known about the SASSI-3's alignment with the fundamental principles of measurement (Thurstone, 1927). Psychometric concepts central to fundamental measurement include unidimensionality, linearity, and invariance. Unidimensionality refers to an instrument's evaluating just one construct (Bond & Fox, 2007). In this study, the construct purportedly measured by the SASSI-3 is substance dependence (Miller & Lazowski, 1999). Linearity refers to an ever-increasing level of difficulty across an instrument's items (Bond & Fox): easier-to-answer items fall on one end of the spectrum and harder-to-answer items fall on the other. An item that is easy to answer affirmatively about one's substance use might include the following: "I can drink one or two drinks without passing out." Most persons who consume alcohol could very likely answer that item affirmatively. A more difficult item could be "I experience delirium tremens when I stop drinking." It is likely that fewer persons' substance dependence has progressed to this level; consequently, it is harder for most people to answer this question affirmatively. Additionally, invariance means that the items will be aligned on an equal interval scale; that is, for example, that the distance between "sometimes" and "frequently" is constant. Finally, an instrument that is invariant will demonstrate equal alignment of the items' response options, regardless of the sample in which it is used.
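The notion of easier and harder items can be made concrete with the dichotomous Rasch model, in which the probability of an affirmative answer depends only on the difference between a person's level on the construct and the item's difficulty. The two item texts below are the hypothetical examples from the paragraph above; the difficulty values in logits are illustrative assumptions, not estimates from the study's data:

```python
import math

def endorse_probability(theta, difficulty):
    """Dichotomous Rasch model: P(X=1) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

# Illustrative item difficulties in logits
easy_item = -2.0   # "I can drink one or two drinks without passing out."
hard_item = 3.0    # "I experience delirium tremens when I stop drinking."

theta = 0.0  # a person at the mean of the construct
p_easy = endorse_probability(theta, easy_item)  # ~0.88
p_hard = endorse_probability(theta, hard_item)  # ~0.05
```

The same person is far more likely to endorse the easy item than the hard one, which is exactly the ordering a linear, unidimensional hierarchy of items expresses.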
Despite its popularity among addictions counselors (Juhnke, Vacc, Curtis, Coll, & Paredes, 2003) and its use in a wide range of settings, the SASSI-3's psychometric properties have been found to differ (Arneth, Bogner, Corrigan, & Schmidt, 2001; Clements, 2002; Feldstein & Miller, 2007; Gray, 2001; Laux, Perera-Diltz, Smirnoff, & Salyers, 2005; Laux, Salyers, & Kotova, 2005; Lazowski, Miller, Boye, & Miller, 1998; Peters et al., 2000; Svanum & McGrew, 1995), at times significantly, from those reported in the Manual (Miller & Lazowski, 1999). These differences may be related to the traditional methods of testing reliability and validity used by researchers. However, what remains unclear is whether the SASSI-3 measures what it purports to measure. If there is doubt about what the SASSI-3 is measuring, then there is also doubt about the implications of the diagnoses it informs and the subsequent treatment decisions. This investigation therefore examined the SASSI-3 using the Rasch model (Rasch, 1960, 1980). Specifically, it focused on the unidimensionality of the entire instrument and of the individual scales. Additionally, it evaluated the rating scales by identifying whether the participants utilized them as the authors of the SASSI-3 intended, and it assessed the linearity and invariance of the instrument.
This study explored the measurement properties of the SASSI-3 in three parts. The first part was to look at each scale individually. The SASSI-3 authors identified a factor structure which resulted in ten scales (Miller & Lazowski, 1995). Those ten scales included the Face Valid Alcohol (FVA), Face Valid Other Drug (FVOD), Symptoms (SYM), Obvious Attributes (OAT), Subtle Attributes (SAT), Supplemental Addiction Measure (SAM), Defensiveness (DEF), Family vs. Controls (FAM), Correctional (COR), and Random Answering Pattern (RAP) scales. The second part involved exploring together all of the items that contribute to the dichotomous decision of likelihood of substance dependence or not; this included the face valid scales and the OAT, SAT, SAM, SYM, and DEF scales only. The third part of the investigation involved exploration of the entire instrument, including all 93 items. The following summarizes these findings in this order: each SASSI-3 scale, the dichotomous SASSI-3, and the whole SASSI-3.
The FVA scale includes twelve items. Each item is accompanied by a four-point Likert-type rating scale response option. The respondent is directed to identify the number of times he or she has engaged in the particular behavior listed in the item. The results of this investigation indicate that the FVA was unidimensional because its RPCA was above 60 percent and it had no underlying contrasts. After adjusting the rating scale for improved functioning and eliminating misfitting people, it was found that the FVA scale's items could be divided into ten levels of difficulty. These ten levels discriminated between nearly four groups of people ranging from low to high agreeability on the items.
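The unidimensionality decision rule applied throughout this chapter (Rasch dimension explaining more than 60 percent of the variance, with no underlying contrast accounting for more than 5 percent) can be expressed as a simple check. This sketch assumes the variance figures are given in percent; the example values echo the RPCA findings described in the text:

```python
def is_unidimensional(variance_explained, contrast_shares,
                      threshold=60.0, contrast_cap=5.0):
    """Decision rule as described in the text: the Rasch dimension must
    explain more than `threshold` percent of variance, and no residual
    contrast may account for more than `contrast_cap` percent."""
    return (variance_explained > threshold
            and all(c <= contrast_cap for c in contrast_shares))

is_unidimensional(69.2, [3.1, 2.4])        # True  (whole-instrument style result)
is_unidimensional(60.3, [6.0, 5.5, 5.2])   # False (OAT-style: contrasts too large)
```

Note that the contrast shares in the calls above are hypothetical; the study reports only that the OAT scale had three contrasts exceeding 5 percent.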
The FVOD scale includes fourteen items. As with the FVA, each of the FVOD's items is accompanied by a four-point Likert-type rating scale response option. The respondent is requested to identify the number of times he or she has engaged in the particular behavior listed in the item. The FVOD was unidimensional because its RPCA was above 60 percent and it had no underlying contrasts. After adjusting the rating scale for improved functioning and eliminating misfitting people, it was found that the FVOD scale's items could be divided into six levels of difficulty. These six levels discriminated between nearly four groups of people ranging from low to high agreeability on the items.
The SYM scale includes ten items, each with a dichotomous true-false response
option. After eliminating two items, the RPCA indicated that 92.9 percent of the variance
could be explained by the scale. However, despite the remaining eight items being
divided into as many levels of difficulty, the scale did not distinguish any differences
among the people in the sample. Therefore, this scale failed to meet fundamental
measurement properties.
The OAT scale includes twelve items, each with a dichotomous true-false
response option. The final RPCA indicated 60.3 percent of the total variance was
accounted for by the OAT scale with three underlying contrasts accounting for greater
than 5 percent of the variance. This implied that the OAT scale possibly had multiple
dimensions. Additionally, while the items were divided into seven levels of difficulty, the scale did not distinguish any differences among the group. Therefore, this scale failed to meet the fundamental measurement properties.
The SAT scale includes eight items, each with a dichotomous true-false response option. The final RPCA indicated that 92.8 percent of the variance was explained by the SAT scale. Additionally, while the scale's items divided into as many levels of difficulty, the SAT scale did not distinguish any differences among the group. Therefore, this scale failed to meet the fundamental measurement properties.
The SAM scale includes fourteen items, each with a dichotomous true-false
response option. While the items could be divided into four levels of difficulty, the final
RPCA for Group 1 indicated that 47.4 percent of the variance was accounted for by the
SAM scale. In contrast, the RPCA for Group 2 indicated that 80.8 percent of the variance
was accounted for by the scale. Neither group's person separation met or exceeded the 2.0 standard. Additionally, the SAM scale did not distinguish any differences among the group. Therefore, this scale failed to meet the fundamental measurement properties.
The DEF scale includes twelve items, each with a dichotomous true-false
response option. The final RPCA indicated that 71.6 percent of the total variance could
be explained by the DEF scale. Additionally, the items, while dividing into nine groups, did not distinguish any differences among the group. Therefore, this scale failed to meet the fundamental measurement properties.
The FAM scale included fourteen items, each with a dichotomous true-false
response option. After the elimination of three misfitting items, the final RPCA indicated
that 78.1 percent of the variance was explained by the FAM scale. However, while the
FAM scale's items could be divided into seven levels of difficulty, they could not discriminate any differences among the people. Therefore, the FAM scale failed to meet the fundamental measurement properties.
The COR scale included twelve items, each with a dichotomous true-false response option. After the elimination of one misfitting item, the final RPCA indicated that 85.1 percent of the variance was explained by the COR scale. However, while the items could be divided into seven levels of difficulty, the COR scale did not distinguish any differences among the group. Therefore, the COR scale failed to meet the fundamental measurement properties.
The RAP scale included six items, each with a dichotomous true-false response option. The final RPCA indicated that 1.4 percent of the variance was explained by the scale. The items could not be divided into any levels of difficulty, and no distinctions could be made among the people. In addition, the scale had no reliability (.00). This scale did not meet the fundamental measurement properties. However, this may well reflect the original intention of the SASSI-3 authors, as the scale is meant to identify people who respond to the instrument in a random way.
The dichotomous SASSI-3 includes 70 items with both a four-point Likert-type response scale and a dichotomous true-false response scale. After adjusting the four-point Likert-type scale for maximum meaning and eliminating 29 misfitting items, the final RPCA accounted for 81 percent of the variance explained, with no underlying contrasts. The items were divided into four levels of difficulty. These levels discriminated seven different groups of people ranging from high to low on the variable. Therefore, the dichotomous SASSI-3 met the fundamental measurement properties.
The whole SASSI-3 included all 93 items, including both the four-point Likert-type response scales and the dichotomous true-false response scales. After adjusting the four-point Likert-type response scale for maximum meaning and eliminating 20 items, the RPCA indicated that the instrument functioned as a unidimensional measure, with 69.2 percent of the variance explained and no evidence of underlying contrasts. The items were divided into seven levels of difficulty, which could discriminate five different groups among the people from high to low on the variable. Therefore, the whole SASSI-3 instrument can work as a unidimensional instrument, used to distinguish people high on the variable from those low on the variable.
The SASSI-3 authors purport that the unique integration of subtle items with direct items provides additional information that is often difficult to assess due to the clinical denial often present in people dealing with substance dependence issues (Miller & Lazowski, 1985). However, in their review of the empirical SASSI-3 literature, Feldstein and Miller (2007) concluded that the SASSI-3's subtle scales have fair to poor psychometric properties, and that in the studies they reviewed no substantiation was found for the claims about the unique contribution of the subtle items to the detection of substance dependence (p. 49). The findings of the present study are supportive of Feldstein and Miller's summary conclusions. The subtle scales in this study did not function as unidimensional measures and failed to meet fundamental measurement properties. In addition, the face valid scales had higher person and item separation and reliability findings and RPCAs than the dichotomous or whole SASSI-3 (Table 23).
Table 23
Summary of Person and Item Separation Findings and RPCAs for Direct Versus Direct

[Table body not reproduced in this copy.]

Note. PS&R = Person Separation & Reliability. IS&R = Item Separation & Reliability. RPCA = Rasch principal components analysis.
In 2006, Tellegen et al. introduced a revised version of the MMPI-2. Tellegen and his co-authors noted that many of the MMPI-2's items loaded on two or more of the MMPI-2's Clinical scales. They concluded that these multi-item overlaps reduced specificity among the eight Clinical scales. In an effort to improve these basic scales' specificity, these authors published a newer version of the MMPI called the MMPI-2 Restructured Clinical (RC) scales. This reduction and restructuring of the Clinical scales resulted in RC scales that have higher validity and reliability estimates (Nichols, 2006; Rogers, Sewell, Harrison, & Jordan, 2006). As noted earlier, many of the SASSI-3's items load on one or more of the dichotomous scales. Employing the types of analyses presented in this study has the potential to produce the same results for the Substance Abuse Subtle Screening Inventory-3. Specifically, the findings of the present study were supportive of this assertion: a reduction in the number of items improved the reliability and separation findings.
Finally, the hierarchies that were established for the SASSI-3 items, whether from the face valid scales alone, the dichotomous SASSI-3, or the whole SASSI-3, maintained the same general positions across all four measures. For example, FVA12 (Suicide) was more difficult to endorse on each of the scales, and FVA4 (More than intended) was less difficult to endorse on each of the scales. These consistent patterns of item difficulty are indicative of the linearity of the SASSI-3 measure. That is, the SASSI-3 measures less to more of the variable of substance dependence consistently and reliably across samples. This is not unlike intelligence tests, the purpose of which is to measure less to more of the variable of intelligence consistently and reliably across samples. The more difficult the item, the more of the quality or characteristic one possesses.
Implications
The first recommendation is to reduce the number of scales. The SASSI-3 as a whole meets the fundamental measurement properties needed to screen for substance dependency; this means that the SASSI-3 can be made more efficient.
A second recommendation is to reduce the number of items. Eliminating multivocal items (items that are true on one scale and false on another) and items that are not on any scale may have a broader effect on the instrument's measurement properties, because these added to the misfitting items (see Table 20). These deleted items misfit or overfit the instrument in a consistent manner, and deleting them improved the instrument's person and item separation and reliability findings.
The respondents failed to utilize the response options as the SASSI-3 authors intended. The standards for a functioning rating scale include a probability curve peak of .5 or better and a threshold distance of greater than or equal to 1.4 units between two adjacent response choices. It appeared from the data reviewed in this study that the respondents did not distinguish between response option 1-Once or Twice and response option 2-Several Times. A review of the response options is therefore warranted. One possible revision to the response options would be to vary the weights assigned to each level of behavior acknowledgment. For example, to respond "frequently" to the question of how often one consumes alcohol with lunch has different clinical implications than responding "frequently" to a question about attempting suicide while consuming alcohol. Under the current SASSI-3 scoring system, each of these responses is scored a "3," even though a person who frequently attempts suicide while consuming alcohol is of much greater clinical concern than someone who frequently consumes alcohol with lunch.
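The weighting idea proposed above could take the form of item-specific scoring weights rather than the uniform 0-3 scoring. A minimal sketch under that assumption; the item names, weights, and categories below are hypothetical and are not drawn from the SASSI-3 scoring key:

```python
# Hypothetical item-specific weights: the same response category carries
# more weight on a clinically severe item than on a mild one.
WEIGHTS = {
    "drinks_with_lunch":      {0: 0, 1: 1, 2: 2, 3: 3},
    "suicide_while_drinking": {0: 0, 1: 4, 2: 8, 3: 12},  # severe item weighted up
}

def weighted_score(responses):
    """Sum item-specific weights for a dict of item -> response category (0-3)."""
    return sum(WEIGHTS[item][category] for item, category in responses.items())

# "Frequently" (3) on each item no longer contributes equally to the total:
total = weighted_score({"drinks_with_lunch": 3, "suicide_while_drinking": 3})  # 15
```

Under such a scheme, a "frequently" on the severe item dominates the total score, reflecting its greater clinical weight.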
These analyses indicate that the subtle scales do not contribute in a meaningful way to the instrument. This was evidenced by the fact that when the face valid scales were used independently, the person and item separation and reliability findings, as well as the RPCAs, were higher than when the face valid scales were combined with the subtle scale items. Therefore, the subtle items could be removed without losing any of the measurement properties.
Recommendations for future research include combining each of the FVA and the FVOD with the subtle items to investigate the resulting measurement properties. These new instruments may produce an alcohol-only and a drug-only screening instrument. However, it is important to explore whether the face valid scales have higher measurement properties with or without the subtle items. Combining the face valid scales into one scale is also an area of research to investigate. While the results of the Rasch analysis demonstrated that the FVOD and FVA scales function independently, investigating whether they function together, with some modification in the wording of the items to make them universal to substances instead of to drugs or alcohol exclusively, may benefit the SASSI-3. Finally, reworking the response options for the face valid scales may contribute to the functioning of the instrument.
Limitations
Despite the Rasch model's multiple uses and high reputation for instrument validation, critiques of the model are put forward by individuals who are solely committed to the use of factor analysis. Bond and Fox (2007) report that these critics state that Rasch analysis is not a theory-building method, as factor analysis is, and that the Rasch model's theory is too simplistic. In Rasch, the theory drives the development of the instrument; if the construct of interest is not unidimensional, Rasch analysis will prove ineffective, as the Rasch model only works for unidimensional constructs.
A specific limitation of this study is the assumption made by the researcher that the sample drawn from the data gathered from the community family court project included people with a higher likelihood of substance dependence. This assumption was made primarily due to the respondents' involvement with the project. However, just because a respondent was involved with the project did not necessarily imply a higher likelihood of substance dependence.
Conclusion
The purpose of this study was to investigate the measurement properties of the SASSI-3 as a measure of substance dependence. This study produced two major findings. The first involves the SASSI-3's dimensionality: although the SASSI-3, as written, is not intended to be unidimensional, it can function as a unidimensional instrument with minor adjustments to the response options and elimination of some misfitting and redundant items. The second major finding of this study is that the subtle scales and subtle items do not appear to contribute to the functioning of the instrument. The implications of these findings are that changing the response scale and eliminating multivocal items (items that are true on one scale and false on another), items with no scale, and other items that misfit or are redundant will improve the functioning of the instrument. A shorter instrument may also improve time management and save money for community agencies and drug and alcohol treatment providers, and it may only improve on the effectiveness of the instrument. However, more research is needed to confirm the findings of this study. As has been suggested for the MMPI-2 RC, immediate change to a new instrument without research to confirm and validate these findings would be premature.
References
Adger, H., & Werner, M. J. (1994). The pediatrician. Alcohol Health and Research World.
(Eds.), Test validity (pp. 19-32). Princeton, NJ: Lawrence Erlbaum Associates, Inc.
Arneth, P. M., Bogner, J. A., Corrigan, J. D., & Schmidt, L. (2001). The utility of the Substance Abuse Subtle Screening Inventory-3 for use with individuals with brain injury.
Banerji, M., Smith, R. M., & Dedrick, R. F. (1997). Dimensionality of an early childhood
Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Burck, A. M., Laux, J. M., Harper, H. L., & Ritchie, M. (2008). Detecting college student
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A., & Kaemmer, B. (1989). Minnesota Multiphasic Personality Inventory-2 (MMPI-2): Manual for administration and scoring. Minneapolis, MN: University of Minnesota Press.
Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. Fort Worth, TX: Harcourt Brace Jovanovich.
Elliott, R., Fox, C. M., Beltyukova, S. A., Stone, G. E., Gunderson, J., & Zhang, X.
Ewing, J. A. (1984). Detecting alcoholism: The CAGE questionnaire. JAMA, 252, 1905-1907.
Feldstein, S. W., & Miller, W. R. (2007). Does subtle screening for substance abuse work? A review of the Substance Abuse Subtle Screening Inventory (SASSI). Addiction, 102, 41-50.
Gray, B. T. (2001). A factor analytic study of the Substance Abuse Subtle Screening Inventory (SASSI). Educational and Psychological Measurement, 61, 102-118.
Henderson, C. E., Taxman, F. S., & Young, D. W. (2007). A Rasch model analysis of
Juhnke, G. A., Vacc, N. A., Curtis, R. C., Coll, K. M., & Paredes, D. M. (2003). Assessment instruments used by addictions counselors. Journal of Addictions and Offender Counseling.
Kagee, A., & deBruin, G. P. (2007). The South African former detainees distress scale:
Keeves, J. P., & Masters, G. N. (1999). Introduction. In G. N. Masters & J. P. Keeves
Kubinger, K. D. (2005). Psychological test calibration using the Rasch model: Some critical suggestions on traditional approaches. International Journal of Testing, 5(4), 377-394.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.
Laux, J. M., Perera-Diltz, D., Smirnoff, J. B., & Salyers, K. M. (2005). The SASSI-3 face
Laux, J. M., Salyers, K. M., & Kotova, E. (2005). Psychometric evaluation of the SASSI-3
Lazowski, L. E., Miller, F. G., Boye, M. W., & Miller, G. A. (1998). Efficacy of the Substance Abuse Subtle Screening Inventory-3 (SASSI-3) in identifying substance dependence disorders in clinical settings. Journal of Personality Assessment, 71, 114-128.
Linacre, J. M. (1999). Investigating rating scale category utility. Journal of Outcome Measurement, 3, 103-122.
Litwin, M. (1995). How to measure survey reliability and validity. Thousand Oaks, CA: SAGE Publications.
Mayfield, D., McLeod, G., & Hall, P. (1974). The CAGE questionnaire: Validation of a new alcoholism screening instrument. American Journal of Psychiatry, 131, 1121-1123.
Miller, W. R., & Lazowski, L. (1999). Adult SASSI-3 Manual. Springfield, IN: SASSI Institute.
Miller, W. R., & Feldstein, S. W. (2007). SASSI: A response to Lazowski & Miller.
Millon, T. (1987). Manual for the Millon Clinical Multiaxial Inventory-II (MCMI-II).
Myerholtz, L., & Rosenberg, H. (1998). Screening college students for alcohol problems: Psychometric assessment of the SASSI-2. Journal of Studies on Alcohol, 59, 439-446.
National Highway Traffic Safety Administration (2006). 2006 annual assessment
Nichols, D. S. (2006). The trials of separating bath water from baby: A review and critique of the MMPI-2 Restructured Clinical Scales.
Peters, R. H., Greenbaum, P. E., Steinberg, M. L., Carter, C. R., Ortiz, M. M., Fry, B. C., 75, 349-358.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research.
Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests (Expanded ed.). Chicago, IL: The University of Chicago Press.
Rogers, R., Sewell, K. W., Harrison, K. S., & Jordan, M. J. (2006). The MMPI-2 Restructured Clinical Scales: A paradigmatic shift in scale development.
Salins, P. (2008). Does the SAT predict college success? Retrieved 1/23/09 from ml.
Selzer, M. L. (1971). The Michigan Alcohol Screening Test: The quest for a new diagnostic instrument. American Journal of Psychiatry, 127, 1653-1658.
Sproll, N. L. (1995). Handbook of research methods: A guide for practitioners and students in the social sciences (2nd ed.). Metuchen, NJ, & London, England: Scarecrow Press.
Stevens, J. (1996). Applied multivariate statistics for the social sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Strong, D. R., Kahler, G. W., Greene, R. L., & Schinka, J. (2005). Isolating a primary
Substance Abuse and Mental Health Services Administration (2008). Drug abuse
Substance Abuse and Mental Health Services Administration (2008). Results from the 2007 national survey on drug use and health: National findings. Retrieved on
Svanum, S., & McGrew, J. (1995). Prospective screening of substance dependence: The
Sweet, R. I., & Saules, K. K. (2003). Validity of the Substance Abuse Subtle Screening
Tellegen, A., Ben-Porath, Y. S., Sellbom, M., Arbisi, P. A., McNulty, J. L., & Graham, J. R. (2006). Further evidence on the validity of the MMPI-2 Restructured Clinical (RC) Scales: Addressing questions raised by Rogers, Sewell, Harrison and Jordan (2006).
Traub, R. (1994). Reliability for the social sciences: Theory and applications (Measurement Methods for the Social Sciences Series). Thousand Oaks, CA: SAGE Publications.
Viera, A. J., & Garrett, J. M. (2005). Understanding interobserver agreement: The kappa statistic. Family Medicine, 37, 360-363.
Wallen, N. E., & Fraenkel, J. R. (1991). Educational research: A guide to the process.
Weed, N. C., Butcher, J. N., McKenna, T., & Ben-Porath, Y. S. (1992). New measures for assessing alcohol and drug abuse with the MMPI-2: The APS and AAS. Journal of Personality Assessment, 58, 389-404.
intelligence and attainment tests (pp. ix-xix). Chicago, IL: The University of Chicago Press.