ISAKOS Scientific Committee Report

Scoring Systems for the Functional Assessment of the Shoulder

Alexandra Kirkley,† M.D., M.Sc., F.R.C.S.C., Sharon Griffin, C.S.S., and Katie Dainty, M.Sc., C.R.P.C.

Abstract: A number of instruments have been developed to measure the quality of life in patients
with various conditions of the shoulder. Older instruments appear to have been developed at a time
when little information was available on the appropriate methodology for instrument development.
Much progress has been made in this area, and currently an appropriate instrument exists for each of
the main conditions of the shoulder. Investigators planning clinical trials should select modern
instruments that have been developed with appropriate patient input for item generation and
reduction, and established validity and reliability. Among the other factors discussed in this review,
responsiveness of an instrument is an important consideration as it can serve to minimize the sample
size for a proposed study. The shoulder instruments reviewed include the Rating Sheet for Bankart
Repair (Rowe), ASES Shoulder Evaluation Form, UCLA Shoulder Score, The Constant Score,
Disabilities of the Arm, Shoulder and Hand (DASH), the Shoulder Rating Questionnaire, the Simple
Shoulder Test (SST), the Western Ontario Osteoarthritis of the Shoulder Index (WOOS), the Western
Ontario Rotator Cuff Index (WORC), the Western Ontario Shoulder Instability Index (WOSI),
Rotator Cuff Quality of Life (RC-QOL), and the Oxford Shoulder Scores (OSS). Key Words:
Quality of life—Shoulder outcomes—Outcome development.
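
The link between responsiveness and sample size noted in the abstract can be illustrated with a minimal sketch. It assumes a two-group comparison of mean change scores with equal group sizes, a two-sided test, and the usual normal approximation; the function name and the effect sizes used below are hypothetical and are not taken from any instrument reviewed here. A more responsive instrument yields a larger standardized effect for the same true change, and therefore a smaller required sample.

```python
from math import ceil
from statistics import NormalDist


def per_group_sample_size(effect_size, alpha=0.05, power=0.80):
    """Approximate patients needed per group (normal approximation).

    effect_size is the expected between-group difference in change scores
    divided by the standard deviation of those change scores.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = standard_normal.inv_cdf(power)           # 1 - type II error
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Hypothetical standardized effects for a less and a more responsive instrument.
print(per_group_sample_size(0.5))  # 63
print(per_group_sample_size(0.8))  # 25
```

Under these assumptions, moving from a standardized effect of 0.5 to 0.8 reduces the requirement from 63 to 25 patients per group.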

From the Fowler Kennedy Sport Medicine Clinic, London, Ontario, Canada.
†Deceased.
Address correspondence and reprint requests to Katie Dainty, M.Sc., C.R.P.C., Fowler Kennedy Sport Medicine Clinic, 3M Centre, University of Western Ontario, London, Ontario N6A 3K7, Canada. E-mail: kdainty@uwo.ca
© 2003 by the Arthroscopy Association of North America
0749-8063/03/1910-3893$30.00/0
doi:10.1016/j.arthro.2003.10.030
Arthroscopy: The Journal of Arthroscopic and Related Surgery, Vol 19, No 10 (December), 2003: pp 1109-1120

In a previous article in this series, the methodology for the development and evaluation of a disease-specific quality of life instrument was described. We will now discuss each of the most commonly used shoulder scoring systems, commenting on their strengths and weaknesses. The shoulder instruments reviewed in this article include the Rating Sheet for Bankart Repair (Rowe), UCLA Shoulder Score, The Shoulder Pain and Disability Index (SPADI), ASES Shoulder Evaluation Form, The Constant Score, Disabilities of the Arm, Shoulder and Hand (DASH), the Shoulder Rating Questionnaire, the Western Ontario Osteoarthritis of the Shoulder Index (WOOS), the Western Ontario Rotator Cuff Index (WORC), the Western Ontario Shoulder Instability Index (WOSI), Rotator Cuff Quality of Life (RC-QOL), and the Oxford Shoulder Scores (OSS).

THE RATING SHEET FOR BANKART REPAIR

In 1978, Carter Rowe published a classic article evaluating the long-term results of the Bankart repair [1]. It was in this article that he introduced a new rating system for the postoperative assessment of patients undergoing anterior stabilization. This system scores patients based on 3 separate areas (stability, motion, and function), with 1 item for each of these areas. The weighting is such that stability accounts for 50 points, motion for 20 points and function for 30 points, giving a total possible score of 100 points.

Unfortunately, there are no published reports on the
development or testing of this instrument. It is likely that the items used in the questionnaire were selected without direct patient input. There are a number of problems that can be identified with this instrument.

Each of the 3 domains in this instrument contains “double-barreled” questions, i.e., the subject is asked to consider more than 1 question at the same time, each of which may be answered differently. For example, in responding to the stability domain, the subject must choose the best response considering dislocations, subluxations, and apprehension. The motion domain includes 3 different motions (external rotation, forward elevation, and internal rotation) and the function domain includes both functional limitation and pain. Some subjects may choose the response option only if all conditions are met while others may choose based on the 1 condition they think is the most important.

It is unknown why the developers of this instrument assigned the various weights to the 3 items (stability accounts for 50%, motion 20%, function 30%). While not necessarily incorrect, it is unsupported. Similarly, it is unknown why the items have been assigned what appear to be random scores. For instance, a total of 30 points are assigned to the function domain. The difference in score from no limitation to mild limitation is 5 points, whereas the difference from mild to moderate limitation is 15 points. Although this is not necessarily incorrect, it is unconventional and is fundamentally arbitrary.

It is not clear whether apprehension is to be measured by asking the patient whether they have apprehension or by examining the patient and doing an apprehension test (putting the arm in a position of extreme abduction and external rotation and monitoring the sensation of instability). This is important to define because many patients will deny apprehension for day-to-day activities but if put in a provocative position will feel apprehensive.

The evaluation of motion is not defined as active or passive nor does the instrument describe whether the scapulothoracic joint is to be stabilized. Further, since for this instrument, motion is based on a percentage of the normal shoulder, it is not clear how one would score a patient who does not have a contralateral normal shoulder.

This instrument combines 2 items of subjective evaluation with 1 item of physical examination. As these items are measuring fundamentally different attributes it is probably not meaningful to combine them for a total score as is meant to be done in this instrument.

THE UCLA SHOULDER SCORE

The University of California at Los Angeles Shoulder Rating scale was first published in 1981 in a paper by H. C. Amstutz et al. [2] The instrument was intended to be used in studies of patients undergoing total shoulder arthroplasty for arthritis of the shoulder. Since then, however, it has been used for patients with other shoulder conditions including rotator cuff disease [3] and shoulder instability [4].

This instrument assigns a score to patients based on 5 separate domains: pain, function, active forward flexion, strength of forward flexion, and overall satisfaction. There is 1 item for each of these areas. The weighting is such that pain accounts for 10 points, function for 10 points, forward flexion for 5 points, strength for 5 points, and overall satisfaction for 5 points, giving a total of 35 points.

There are no publications available on the development or testing of this instrument. It is likely that the items on this instrument were also selected by the authors without direct patient input, similar to the Rowe instrument. Several problems can be identified with this tool.

The items in the pain and function domains are “double-barreled.” As an example, when measuring pain, the patient is asked to comment on frequency, severity, and the type and amount of medication that is required to relieve the pain. This certainly presents difficulties in choosing an appropriate response when the patients will be unlikely to find a perfect match from the response options available.

Again similar to the Rowe, it is unknown why the developers of this instrument assigned the various weights to the 5 domains (Pain 28.6%, Function 28.6%, Range of Motion 14.3%, Strength 14.3%, Satisfaction 14.3%). While not necessarily incorrect, it is unsupported.

The overall satisfaction item only allows for the instrument to be used after an intervention and not before and after as would be ideal in most clinical trials. In addition, it is not clear how a subject would respond if his or her condition were unchanged.

This instrument also combines 2 items of subjective evaluation of function with 1 item of physical examination. As these are measuring fundamentally different attributes it is probably not meaningful to combine them for a total score.

Clearly both of these first 2 instruments were developed before the advent of modern measurement development methodology. The problems identified with these instruments may lead to poor reliability,
validity, and responsiveness, and therefore they may or may not be ideal choices for evaluating patients in the research or clinical environment.

THE SHOULDER PAIN AND DISABILITY INDEX (SPADI)

In 1991, Roach et al. published the development and evaluation of the SPADI [5]. The authors state that “the SPADI was developed to provide a self-administered instrument that would reflect the disability and pain associated with the clinical syndrome of painful shoulder.” It was designed as both a discriminative and evaluative instrument. The majority of the item generation and reduction was carried out by a panel of 3 rheumatologists and a physical therapist without direct patient input. Further items were eliminated based on poor test-retest reliability or a low correlation with shoulder range of motion (ROM) on the involved side. Eliminating items based on poor reliability is logical for a discriminative instrument but not necessarily ideal for an evaluative instrument, as an item can have poor reliability but be important to patients and therefore be highly responsive. Eliminating items based on poor correlation with range of motion may have a negative impact on the final tool make-up, as range of motion has been found to rarely correlate more than modestly with patients’ estimation of their subjective functioning. There is no report of formal pretesting.

The instrument has 13 items divided into 2 subscales: pain (5 items) and disability (8 items). The response format selected for the instrument was the 10-cm VAS anchored verbally at each end. However, in distinction to the usual method of scoring a VAS in which the slash on the line is measured from the left anchor in millimeters, the authors describe dividing the horizontal line into 12 segments of equal length. A number ranging from 1 to 11 is attached to the segment to produce a score for each item.

The scores for the individual items are given equal weight within their domain and the domain scores are reported by converting to a score out of 100, with a score of 0 being perfect and a score of 100 being the worst score possible. The total score for the instrument is determined by averaging the scores for the two domains of pain and disability.

The reliability of the instrument has been evaluated by measuring test-retest reliability over several days in 23 subjects who represented a subset of 37 male patients presenting to an ambulatory care clinic with a complaint of shoulder pain and who were subsequently enrolled in a randomized clinical trial. The ages of the subjects ranged from 23 to 76 years, with a mean age of 58 years. Of the patients included, 27 had shoulder pain of musculoskeletal origin. The majority of the remaining subjects had shoulder pain of neurogenic or undetermined origin. The intraclass correlation coefficient (ICC) for the SPADI total score (.65) and for the pain and disability subscales (.64 and .64) may be falsely elevated by the short time interval of several days, which may not be long enough for subjects to have forgotten their original score. The diverse population on which it was tested may also have an effect on the overall outcome. Despite this, the scores of reliability are only modest. The authors explain that the subjects may have actually improved over the short time frame as most were started on an analgesic at the original visit. As a global rating of change score was not administered concurrently, it is unknown if this is the case or not. The authors also report on the internal consistency of the instrument using Cronbach’s alpha (total score .95, pain subscale .86, disability subscale .93). Further, factor analysis found that most of the items loaded onto 1 factor supporting the conclusion that the SPADI measures 1 construct. Varimax rotation provided limited support for 2 subscales.

Limited validation testing has been described. Construct validation consisted of testing the hypotheses that the SPADI would correlate with baseline shoulder active ROM (flexion, abduction, extension, external rotation) measurements as an indicator of discriminative function and over time using change scores after an intervention (as a measure of evaluative function). The tool was administered to all 37 of the previously described subjects. Correlations with baseline range of motion ranged from .55 to .80. Correlations with range of motion change scores ranged from .50 to .70. The responsiveness has not been formally tested. No report of the minimally important difference has been provided.

THE AMERICAN SHOULDER AND ELBOW SURGEONS EVALUATION FORM (ASES)

In 1993, the Society of the American Shoulder and Elbow Surgeons developed a standardized form for the assessment of shoulder function [6]. The purpose was to facilitate communication between investigators and to permit and encourage multicenter trials. The members felt that the required attributes of any new tool were ease of use, a method of assessing activities of daily living and inclusion of a patient self-evaluation
section. The research committee of the ASES reviewed all published forms available at the time and based on those and their own ideas developed a prototype instrument. It is not stated how the committee selected the items for this instrument. The prototype instrument was distributed to the members, who were encouraged to use the instrument and then offer constructive criticism. More than 70 suggestions to improve the instrument were made after distribution of the prototype. Following review by the research committee a second prototype was distributed in the summer of 1992. A further 15 suggestions were made and further revisions resulted in the final instrument. The instrument consists of a physician assessment section and a patient self-evaluation section.

The physician assessment section includes physical examination and documentation of range of motion, strength, and instability, and demonstration of specific physical signs. No score is derived for this section of the instrument. The patient self-evaluation section has 11 items that can be used to generate a score. These are divided into 2 areas: pain (1 item) and function (10 items). The response to the single pain question is marked on a 10-cm visual analog scale (VAS), which is divided into 1-cm increments and anchored with verbal descriptors at 0 and 10 cm.

The 10 items in the function area of the ASES include activities of daily living such as managing toileting, putting on a coat, etc. There are more demanding activities such as lifting 10 pounds above shoulder height and throwing a ball overhand. Finally, there are 2 general items: doing usual work and doing usual sport. There are 4 categories for response options from 0 (unable to do) to 3 (not difficult). Because of this, the responsiveness of the individual items is likely poor, especially in higher functioning patients. As an example, if a patient found an activity somewhat difficult prior to treatment he or she would have to have no difficulty whatsoever after treatment to improve by 1 category.

The final score is tabulated by multiplying the pain score (maximum 10) by 5 (therefore total possible 50) and the cumulative activity score (maximum 30) by 5/3 (therefore, a total possible 50) for a total of 100. No rationale has been presented for the weighting scheme of this instrument. While not necessarily incorrect, it is unsupported. No published data are available on the testing of this instrument.

Three of these instruments, the Rowe, UCLA score, and ASES score, have been compared in a group of 52 patients with shoulder instability undergoing surgical stabilization [4]. The 3 scales provided remarkably different categorization of patients and correlated poorly with each other. The authors of this study concluded, “the most commonly used scoring systems for shoulder conditions yield varying results when used to evaluate shoulder instability outcomes in our patient population. We urgently need a well-accepted shoulder system based on the patient’s functional status to critically assess our management of various shoulder conditions.” [4]

THE CONSTANT SCORE

The Constant Score [7] has become the most widely used shoulder evaluation instrument in Europe. This scoring system combines physical examination tests with subjective evaluations by the patients. The subjective assessment consists of 35 points and the remaining 65 points are assigned for the physical examination assessment.

The subjective assessment includes a single item for pain (15 points) and 4 items for activities of daily living (work 4, sport 4, sleep 2, and positioning the hand in space 10 points).

The objective assessment includes: range of motion (forward elevation, 10 points; lateral elevation, 10 points; internal rotation, 10 points; external rotation, 10 points) and power (scoring based on the number of pounds of pull the patient can resist in abduction to a maximum of 25 points). The total possible score is therefore 100 points.

The publication by Constant [7] in which he describes the instrument does not include methodology for how it was developed and more specifically, the rationale for item selection and relative weighting of the items.

The strength of this instrument is that the method for administering the tool is quite clearly described, which is an improvement on pre-existing tools.

It is unknown why the developers of this instrument assigned various weights to the items (pain 15%, function 20%, range of motion 40%, strength 25%). While not necessarily incorrect, it is unsupported.

This instrument combines 4 items of function with 5 items of physical examination. As these are measuring fundamentally different attributes, they should be measured separately as opposed to being combined for a total score.

This instrument is weighted heavily on range of motion (40%) and strength (25%). Although this may be useful for discriminating between patients with significant rotator cuff disease or osteoarthritis, it is not useful for patients with instability. In fact, in one study all the patients with instability of the shoulder
scored nearly perfectly (95-100) despite having problems of sufficient magnitude to request surgical intervention. The reliability of this measurement tool has been evaluated on a limited basis [8]. Although the methodology is not described in detail, Constant [7] states that when the instrument was used to assess 100 abnormal shoulders by 3 different observers, the interobserver error was an average of 3% ranging from 0% to 8%. Conboy et al. [8] measured the reliability on 25 patients with varying diagnoses of shoulder syndromes. They demonstrated that the 95% confidence limit between observers was 27.7 points and within observers was 16 points. No data on the formal testing of validity nor the responsiveness of this instrument has been published.

THE DISABILITIES OF THE ARM, SHOULDER AND HAND (DASH)

Recently, the American Academy of Orthopaedic Surgeons (AAOS) along with the Institute for Work & Health (Toronto, Ontario, Canada) developed an outcome tool to be used for patients with any condition of any joint of the upper extremity. This instrument, called the Disabilities of the Arm, Shoulder and Hand Measurement tool or DASH, is made available by the AAOS. A brief description of the methodology for the item generation and the initial item reduction phases has been published [9]. In 1999, the AAOS and Institute for Work & Health developed and published a User’s Manual for the DASH outcome measure [10]. The complete development and testing of the instrument is detailed in this manual.

The DASH is a 30-item questionnaire designed to evaluate “upper extremity-related symptoms and measure functional status at the level of disability.” Disability is defined as “difficulty doing activities in any domain of life (the domains typical for one’s age/sex group) due to a health or physical problem.” [11] Concepts covered by the DASH include symptoms (pain, weakness, stiffness, and tingling/numbness), physical function (daily activities, house/yard chores, shopping, errands, recreational activities, self-care, dressing, eating, sexual activities, sleep, and sport/performing art), social function (family care occupation, socializing with friends/family) and psychological function (self-image).

Item generation was carried out by first reviewing the literature. Thirteen scales were combined to produce an initial pool of 821 items. Item reduction was carried out in 2 steps. Three members of the collaborative development group reviewed the original items. The items were stripped of scaling and attribution to a specific disorder. Items that were repetitive or obviously unrelated to the upper extremity were eliminated. The reduced list was then sent to clinician “content experts” for their input as to content/face validity and the importance of the items (5 point scale: 2 = definitely yes, to -2 = definitely no). This allowed for reduction of 821 potential items to a 67 item questionnaire.

The 67 items were reformatted into a questionnaire suitable for field testing. This questionnaire was pretested on 20 patients with upper limb problems to ensure readability, absence of ambiguity, and understanding of scaling and content, as well as to confirm that an adequate number/type of response items were available. In this publication the authors state, “. . . further item reduction will be carried out after field testing of the questionnaire on 420 patients in Canada, Australia, and the United States. Frequency of endorsement and internal consistency will be assessed using the data generated by the field testing. Items with a very high or low endorsement rate or excessively high correlations with other items in the same scale will be eliminated. Factor analysis will also be used to empirically validate the aggregation of items into subscales.” This testing was actually completed by Marx et al. in 1996.

The major criticism of this tool is that the item-generation phase did not include interviews with patients with the conditions of interest. It has been well documented that physicians are poor judges of patient status [12,13] and likely are poor judges of what is important to patients. The initial item reduction was done by clinicians, although it has been reported that item impact, as determined from patient input, was used for the remainder of the item reduction.

There are several examples in the DASH where one item is a more specific version of another item. For example, item #24 “pain in the arm, shoulder, or hand when performing any specific activity” is a more specific version of item #23, which asks about arm, shoulder, or hand pain in general. It is unclear why they would choose 2 items where the more specific one would make up part or all of the response to the more general one. Similarly, the 4 questions relating to sports or playing an instrument would appear to have considerable overlap. Item #32, “difficulty playing your musical instrument or sport as well as you would like” would have a large contribution from item #31 “difficulty playing your musical instrument or sport because of pain.” Although it is not technically incorrect, it builds in considerable redundancy into the
tool, which has the effect of attributing more weight or value to these items.

This instrument is intended for patients with any condition of any joint of the upper extremity. This makes it attractive for use in the clinical setting where patients present in an undifferentiated fashion. The patients can complete the questionnaire before a diagnosis is established. There is also much more information currently available on scoring of the DASH now that the DASH User’s Manual is available. This is a very useful resource for clinicians interested in properly implementing the DASH as an outcome tool in their practice.

Unfortunately the broader scope of this instrument makes it less attractive for use in a clinical trial. Many of the items may seem irrelevant to patients with specific conditions. In addition, this instrument has been shown to be less responsive than other shoulder specific and shoulder condition specific instruments making it less efficient as a research tool [14-16].

THE SHOULDER RATING QUESTIONNAIRE

In 1997, L’Insalata et al. published the Shoulder Rating Questionnaire “a self-administered questionnaire for the assessment of symptoms and function of the shoulder.” [18] It is unknown how the items on the instrument were generated or selected. It is simply stated that “A preliminary questionnaire was developed.” The preliminary questionnaire was administered to 30 patients and a subset of those patients were interviewed to identify clinical relevance, relative importance, and ease of completion and grading. This allowed for modifications to be made to produce a revised questionnaire. An “assessment” of the questionnaire was said to have been completed, after which “questions that had poor reliability, substantially reduced the total or subset internal consistency, or contributed little to the clinical sensitivity of the over-all instrument were eliminated to produce the final questionnaire.”

The final instrument includes 6 separately scored domains: global assessment, pain, daily activities, recreational and athletic activities, work, and satisfaction. A final, nongraded domain allows the patient to select 2 areas in which he or she believes improvement is most important. The global assessment domain consists of a single VAS. Each of the other scored domains consists of a series of multiple-choice questions with 5 response categories from 1 (poorest) to 5 (best). Each domain is scored separately by averaging the scores of the completed questions and multiplying by two. Thus, the possible score for each domain ranges from 2 (poorest) to 10 (best). Further, the investigators suggest a weighting scheme based on “consultation with several shoulder surgeons and patients regarding the relative importance of each of the domains.” The weighting is as follows: global assessment 15%, pain 40%, daily activities 20%, recreational and athletic activities 15%, work 10%. Therefore, the total possible score ranges from 17 to 100.

Testing of this instrument has been described by the developers. Test-retest reliability was evaluated in 40 patients with a wide variation of characteristics (age, gender, shoulder disease, and severity) at variable time intervals within 1 week of initial administration (mean of 3 days, range 1 to 7 days). They reported the Spearman Rank Correlation Coefficient for the overall instrument (.96) and each of the domains (range .81-.96). A criticism of this approach is that the values may have been falsely elevated for 2 reasons. First, 3 days is unlikely to be long enough for patients to forget their previous responses, making it more likely that they could reproduce their original score. Second, because reliability is a measure of the between-person variance to the total variance, testing reliability in such a diverse population increases the numerator, giving a higher reliability than one might get in a population more representative of a typical study population where all the patients have only 1 condition. To date, the responsiveness for this tool has not been compared with any other existing shoulder instruments.

The investigators indicate that a difference of 12 points for the total score and 2 points for each domain score compared with pretreatment scores is clinically important although the rationale for the selection of these values is not described.

The validation described consisted of correlating scores on the Shoulder Rating Questionnaire with comparable domains of the Arthritis Impact Measurement Scales 2. No a priori predictions were made and no interpretation of the observed correlations (ranging from .56 to .89) is described. A second construct was tested: that patients who selected a particular domain as an important area for improvement would score lower on that domain than patients who did not select it as an important area. A significant difference was found for each of the 4 domains tested (pain, daily activities, recreation/athletic activities, and work). Construct validation through correlations between this instrument and other measures of shoulder function have not been determined.
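
The domain scoring and weighting described for the Shoulder Rating Questionnaire can be sketched as follows. This is an illustration under stated assumptions, not the developers' published scoring code: it assumes the global assessment VAS is recorded on a 0 to 10 scale (the scaling that reproduces the stated minimum total of 17), and the number of questions per domain shown in the example is hypothetical. The satisfaction domain is left out because it does not appear in the quoted weighting.

```python
# Percent weights quoted in the text for the five weighted domains.
WEIGHTS = {"global": 15, "pain": 40, "daily": 20, "recreation": 15, "work": 10}


def domain_score(responses):
    """Average the completed 1-5 responses (None = skipped) and double it."""
    completed = [r for r in responses if r is not None]
    if not completed:
        raise ValueError("no completed questions in this domain")
    return 2 * sum(completed) / len(completed)  # ranges from 2 to 10


def total_score(global_vas, domain_responses):
    """Weighted total; global_vas is assumed to be on a 0-10 scale."""
    scores = {"global": global_vas}
    for name, responses in domain_responses.items():
        scores[name] = domain_score(responses)
    # Each domain tops out at 10, so weight/10 scales it to its share of 100.
    return sum(WEIGHTS[name] / 10 * scores[name] for name in WEIGHTS)


# Hypothetical patient; the question counts per domain are illustrative only.
example = {
    "pain": [3, 4, None, 2],
    "daily": [4, 4, 5],
    "recreation": [2, 3],
    "work": [3, None, 4],
}
print(total_score(6.0, example))
```

With every item at its poorest value and a global VAS of 0 the total is 17, and with every item at its best and a global VAS of 10 it is 100, matching the range given in the text.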

THE SIMPLE SHOULDER TEST (SST)

In 1992 Lippitt, Harryman, and Matsen reported on the development and testing of the Simple Shoulder Test (SST) [19]. The purpose of the instrument is stated to be a means of documenting the functional improvement resulting from a specified procedure performed by a specific surgeon in response to a given diagnosis and to characterize the severity of the condition.

The SST consists of 12 questions with “yes or no” response options. The instrument combines subjective items and items that actually require the patient to perform a physical function. For example, the patient is asked “Does your shoulder allow you to sleep comfortably?” which is subjective and “Can you lift 8 pounds to the level of your shoulder without bending your elbow?” which requires the patient to perform the maneuver.

Item generation and reduction was based on Neer’s evaluation [20], the ASES evaluation [21], and observation of complaints of patients by the instrument developers. It is not clear how the final 12 items were actually selected. The tool was administered to 49 subjects between the ages of 60 and 70 with (1) no history of shoulder disease, injury, or surgery, (2) no shoulder symptoms, and (3) a normal shoulder ultrasound to rule out silent rotator cuff tears. Essentially, all patients obtained a perfect score (3% unable to place 8 lb at head level, 2% unable to carry 20 lb at the side, 5% incapable of throwing 20 yards). The tool has been administered to 250 patients with different diagnoses (osteoarthritis, rheumatoid arthritis, avascular necrosis, subacromial impingement, rotator cuff tears, frozen shoulder, traumatic anterior instability, and multidirectional instability). The instrument is able to distinguish between patients with these conditions and normal shoulder function. The authors noted distinct patterns between groups of patients with the different conditions, indicating that the instrument might be helpful in establishing a diagnosis. Some data on the SST following patients after rotator cuff repair indicates that the instrument can be used to determine what functional improvement the average patient obtains post treatment. The authors provide no report of formal testing of reliability of this instrument. The responsiveness has not been evaluated nor compared with other measures of shoulder function. The SST is unlikely to be sensitive to small but clinically important [...]tive function to differentiate between patients with varying severity of the same condition.

THE WESTERN ONTARIO SHOULDER TOOLS

In 1998, Kirkley et al. published the first in a series of disease-specific quality of life measure tools for the shoulder, The Western Ontario Shoulder Instability Index (WOSI) [14]. This instrument was developed and evaluated using the methodology as described by Kirschner and Guyatt [22]. The stated purpose of the instrument was for use as the primary outcome measure in clinical trials evaluating treatments for patients with shoulder instability.

In 2001, the second in the series of disease-specific quality of life instruments for the shoulder, The Western Ontario Osteoarthritis of the Shoulder Index (WOOS), was published [23]. The authors state that the instrument was developed and evaluated using similar methodology as was used in the development of the WOSI [14]. The WOOS is meant for use as the primary outcome measure in clinical trials evaluating patients with symptomatic primary osteoarthritis of the shoulder.

Most recently, in 2003, the third instrument in the series, the Western Ontario Rotator Cuff Index (WORC) was accepted for publication as a primary outcome measure in clinical trials evaluating treatments for patients with degeneration of the rotator cuff [15]. The WORC was also developed and evaluated using the methodology as described by Kirschner and Guyatt [22].

Item generation was carried out in 3 steps for all 3 of the tools, which included a review of the literature and existing instruments, interviews with clinician experts, and interviews with 33 patients (sampled to redundancy), representing the full spectrum of patient characteristics. Item reduction was carried out using the frequency importance product (impact) from a survey of 100 patients representing the full spectrum of patient characteristics and a correlation matrix to eliminate redundant items. The response format selected for the instrument was the 10-cm VAS anchored verbally at each end. The prototype instrument was pretested on 2 consecutive groups of 10 patients. The items were assigned equal weight based on the uniformly high impact scores.

A database of patients meeting the inclusion/exclu-
tant changes in patient function because of the dichot- sion criteria for symptomatic shoulder instability from
omous response options (yes or no). For the same all the clinically relevant categories with the exception
reason, the instrument is likely have poor discrimina- of fixed dislocations was established. A database of

patients meeting the inclusion/exclusion criteria for symptomatic rotator cuff disease including rotator cuff tendinitis, rotator cuff tendinosis with no tear, partial-thickness rotator cuff tears, full-thickness rotator cuff tears (small to massive) and rotator cuff arthropathy was established. Similarly, a database of patients of all ages with a diagnosis of primary osteoarthritis of the shoulder was defined and established.

The Western Ontario instruments are constructed as shown in Table 1. Each instrument includes instructions to the patients, a supplement with an explanation of each item, and detailed instructions for the clinician on scoring. The authors recommend using the total score for the primary outcome in clinical trials but also recommend reporting individual domain scores. The scores can be presented in their raw form or converted to a percent score. The best possible total score is 100% (raw score = 0) and signifies that the patient has no decrease in shoulder-related quality of life. The worst possible score is 0% (raw score = 2,100 in the WOSI and the WORC and 1,900 in the WOOS) and signifies that the patient has an extreme decrease in shoulder-related quality of life.

TABLE 1. The Western Ontario Instruments

WOSI (21 items)                            WORC (21 items)                WOOS (19 items)
Physical Symptoms (10 items)               Physical Symptoms (6 items)    Physical Symptoms (6 items)
Sport/Recreation/Work Function (4 items)   Sport/Recreation (4 items)     Sport/Recreation/Work Function (5 items)
Lifestyle Function (4 items)               Work Function (4 items)        Lifestyle Function (5 items)
Emotional Function (3 items)               Lifestyle Function (4 items)   Emotional Function (3 items)
                                           Emotional Function (3 items)

Validity has been assessed through construct validation by making a priori predictions of how the instrument would correlate with other measures of health status at 1 time point, as an indicator of discriminative function, and over time using change scores after an intervention of known effectiveness, as a measure of evaluative function (Table 2).

TABLE 2. The Western Ontario Indexes

Western Ontario Index   Correlated Measures
WOSI                    ASES, UCLA, Constant, DASH, Rowe, SF-12 (Physical & Mental domains), and range of motion
WORC                    ASES, UCLA, Constant, DASH, Global Rating of Change, Sickness Impact Profile Total Scale, SF-36 (Bodily Pain/Physical, Social Function/Lifestyle, Physical Role Limitation/Work and Mental Health domains), and range of motion
WOOS                    ASES, UCLA, Constant, Global Rating of Change, McGill Pain Questionnaire, McGill VAS, SF-12 (Physical & Mental domains), and range of motion

THE WESTERN ONTARIO SHOULDER INSTABILITY INDEX (WOSI)

The reliability of the WOSI has been evaluated in 51 stable patients at 2 weeks and 3 months in conjunction with a global rating of change score. The patient population tested was only briefly described as patients with shoulder instability who were stable and it is not clear how diverse a population this was. The ICCs at 2 weeks and 3 months for the total score (.95 and .91) and individual domain scores (range .72 to .94) are reported.

The instrument was administered to 47 patients undergoing surgical repair for anterior instability. All correlations were within .2 of the predicted values. As predicted, the WOSI correlated best with the DASH as both a discriminative and evaluative instrument (r = .77, r = .76) and showed poor correlations with the SF-12 mental score (r = .115 discriminative; r = .12 evaluative).

The responsiveness has been evaluated using the Standardized Response Mean and compared with the other measures of shoulder function in the same 47 patients used for the validation testing. The WOSI was more responsive than the others tested (in order of responsiveness: WOSI, Rowe, DASH, Constant Score, ASES, Range of Motion, UCLA, SF-12 physical, and SF-12 mental).

The minimally important difference was estimated in the same group of 47 patients.25 The patients were administered the WOSI concurrently with a 5-point global rating of change score. Patients were asked whether, after treatment, they were better, worse, or the same. If they indicated that they were better or

worse, they were asked to quantify their change on a 5-point scale (1 to 5, very little different to a great deal different). Patients with 1 or 2 points change were considered minimally different, 3 or 4 points change moderately different, and those with 5 points change a great deal different. The estimates were as follows: MID change in total score of 220 (10.4%), moderate difference change in total score of 469 (22.3%), and large difference change in total score of 527.46 (25%). The confidence intervals around these estimates were large because of the small number of patients involved in the determination. Further testing is needed to make more accurate estimates.

The WOSI is more responsive than other tools for shoulder instability. Richards et al.16 reported on the results of treatment of posterior shoulder instability and determined that the WOSI was more responsive than the SPADI, DASH, Constant, and ASES. The results of a randomized clinical trial evaluating the treatment of patients with a first anterior dislocation of the shoulder showed that the WOSI was more responsive than other instruments tested.26 (In order of responsiveness: WOSI, Rating Sheet for Bankart Repair, DASH, Constant, ASES, ROM, UCLA, SF-12 physical score, and SF-12 mental score).

THE WESTERN ONTARIO OSTEOARTHRITIS OF THE SHOULDER INDEX (WOOS)

The reliability of the WOOS instrument has been evaluated in 58 stable patients at 3 months in conjunction with a global rating of change score. The patient population tested was described as meeting the inclusion criteria of primary osteoarthritis of the shoulder. The ICC was calculated based on the 22 subjects who remained stable over the 3 months. The ICC for the total score was .96 and for each of the domains ranged from .87 to .95. This number may be falsely decreased by the long test-retest interval of 3 months.

The instrument was administered to 41 patients selected from the database undergoing treatment for osteoarthritis of the shoulder. All correlations were within .2 of the predicted values. As predicted, the WOOS correlated best with the Constant Score as both a discriminative and evaluative instrument (r = .69, r = .73).

The responsiveness was evaluated using the Standardized Response Mean and compared with the other measures of shoulder function in 41 patients involved in a randomized clinical trial of hemiarthroplasty versus total shoulder arthroplasty.15 The WOOS was more responsive than the others tested (in order of responsiveness: WOOS, McGill VAS, UCLA, ASES, McGill Pain, Constant Score, SF-12 physical, ROM, and SF-12 mental).

No estimate of the minimally important difference or the responsiveness data has been reported for the WOOS. This instrument has been translated and validated in French, Spanish, and German.

THE WESTERN ONTARIO ROTATOR CUFF INDEX (WORC)

The reliability and validity of the WORC were assessed in patients who were being treated for rotator cuff tendinosis with no or a small full-thickness cuff tear. Patients completed the WORC and other measures of health as well as a global rating of change score. Those that indicated they had not changed at 2 weeks were used for the analysis of reliability. The ICC was calculated based on the 50 subjects who remained stable over the 2 weeks. The ICC for the total score was .96 and for each of the domains ranged from .63 for the emotional well being domain to .91 for the physical symptoms domain.

The instrument was administered to 110 patients with rotator cuff tendinopathy or small full-thickness cuff tears who were undergoing active treatment (injections, physiotherapy, or arthroscopy and subacromial decompression). All correlations were within .2 of the predicted values. The WORC correlated best with the ASES and DASH as a discriminative instrument (r = .73, r = .69) and with the ASES and UCLA as an evaluative instrument (r = .75, r = .65). Data on the responsiveness of the WORC tool has not been reported.

The minimally important difference was calculated using 44 patients meeting specific inclusion/exclusion criteria for chronic cuff tendinosis without tear undergoing treatment with subacromial injection. They were prospectively evaluated at baseline and 3 months after injection using a global rating of change and the WORC Index. Patients were asked whether, after treatment, they were better, worse, or the same. If they indicated that they were better or worse, they were asked to quantify their change on a 5-point scale (1 to 5, very little different to a great deal different). Patients with 1 or 2 points change were considered minimally different, 3 or 4 points change moderately different, and those with 5 points change a great deal different. The estimates were as follows: MID change in total score of 245.26 (11.7%), moderate difference change in total score of 371.3 (17.68%), and large

difference change in total score of 773.4 (36.82%). The confidence intervals around these estimates were large because of the small number of patients involved in the determination. Further testing is needed to make more accurate estimates. This instrument has also been translated into French and German.

THE ROTATOR CUFF QUALITY-OF-LIFE MEASURE (RC-QOL)

In October 2000, Hollinshead et al. published a paper reporting on the 6-year follow-up of large and massive rotator cuff tears.27 In the article, they introduced a new disease-specific quality of life instrument for patients with rotator cuff disease. The instrument was developed and tested using similar methodology to that described by Guyatt et al.28 This instrument is indicated for use as an outcome tool in patients with the “full spectrum of rotator cuff disease.”

Item generation was carried out in 3 steps, including a review of the literature and existing outcome tools, discussions with clinician experts, and “direct input from a set of patients with a full spectrum of rotator cuff disease ranging from primary impingement tendinopathy to massive rotator cuff defects.” It is not stated how many patients were interviewed nor how many items were generated at this phase. A preliminary questionnaire was formulated using 10-cm VAS response format. The preliminary questionnaire was pretested on 20 patients with documented rotator cuff disease. Patients underwent a structured interview consisting of 5 questions pertaining to whether the items were semantically appropriate, whether the patient considered the items important to his or her quality of life, whether the patient could comprehend the question, and whether the patient would suggest any modifications to the questionnaire. A revised 55-item questionnaire was then developed. The authors describe further item reduction, from 55 to 34 items, but do not provide details on the methodology. They state “On the basis of qualitative and quantitative criteria, reduction of this 55-item instrument to a smaller, more manageable questionnaire was considered.” The qualitative criteria included the importance of each item in demonstrating a quality-of-life issue, the importance of each item to patients and the elimination of redundancy or ambiguity. The quantitative criterion was based on reliability testing. Items that had an average difference score of 15% or greater were eliminated from the tool. Although eliminating items based on poor reliability is logical for a discriminative instrument, it is not necessarily ideal for an evaluative instrument as an item can have poor reliability but be important to patients (high impact) and be highly responsive.

The instrument has 34 items with 5 domains: Symptoms and Physical Complaints (16 items), Sport/Recreation (4 items), Work-Related Concerns (4 items), Lifestyle Issues (5 items), and Social and Emotional Issues (5 items). The instrument does provide instructions to the patients. It asks the patients to consider the last 3 months when answering questions, which may be too long for most patients’ recall. Some of the items are double barreled as they ask the subject to consider pain and difficulty at the same time. The response options are written such that the best score is 100 mm and the worst score is 0 mm. However, because the items are asking about symptoms, this requires the patient to consider the amount of the symptom from right to left as opposed to the traditional left to right. It is unknown if this presents any difficulty to patients.

The reliability of the instrument was evaluated in 30 consecutive patients with an interval of 2 weeks. The patient population tested was not described other than they had documented rotator cuff disease. The authors report “average difference in score” as a measure of reliability. The average difference for the total score was 5.05%. The reliability of each of the domains is not reported. The ICC values are not reported.

Some validation of the discriminative function of the RC-QOL has been performed. The RC-QOL has been correlated with other measures of shoulder function and measures of health status (Functional Shoulder Elevation Test, ASES, and SF-36) at final follow-up (average, 42 months; range, 25 to 71 months) in 70 patients undergoing surgical treatment for large and massive rotator cuff tears. The authors do not comment on the surprisingly high correlations between it and the generic health profile, the global shoulder tool and the functional test. The RC-QOL correlated very highly with the SF-36 (.78), the ASES (.84), and the FSET (.84). In addition, the hypothesis that the RC-QOL should be able to distinguish between patients with large and massive rotator cuff tears as further indication of its discriminative function is described. The RC-QOL, ASES, and the FSET were all able to distinguish between patients with large and massive cuff tears in this sample of 73 shoulders (17 large and 56 massive cuff tears) at final follow-up. No validation of the instrument’s evaluative function has been reported. The responsiveness and determination of the minimally important difference have also not been reported.
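As a rough illustration of how an RC-QOL total could be expressed out of 100, the sketch below assumes the 34 items are each read off a 100-mm VAS (0 = worst, 100 = best) and are weighted equally, so the raw total spans 0 to 3,400. The function name and the equal weighting are assumptions made for this example and are not taken from the instrument's scoring manual.

```python
# Hypothetical helper: names and equal item weighting are assumptions for illustration.
RC_QOL_ITEMS = 34   # number of items reported for the RC-QOL
VAS_MAX_MM = 100    # each item is marked on a 100-mm line; 100 mm = best


def rc_qol_percent(item_scores_mm):
    """Convert 34 VAS readings (in mm) into a score out of 100 (100 = best)."""
    if len(item_scores_mm) != RC_QOL_ITEMS:
        raise ValueError(f"expected {RC_QOL_ITEMS} item scores")
    for score in item_scores_mm:
        if not 0 <= score <= VAS_MAX_MM:
            raise ValueError("each item score must lie between 0 and 100 mm")
    raw_total = sum(item_scores_mm)  # ranges from 0 to 3,400
    return 100.0 * raw_total / (RC_QOL_ITEMS * VAS_MAX_MM)


if __name__ == "__main__":
    # A patient marking every line at its midpoint scores half of the maximum.
    print(rc_qol_percent([50] * RC_QOL_ITEMS))
```

Because higher values mean a better state here, no reversal step is needed, unlike instruments whose best raw score is 0.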

For example:

Question: With any prolonged activity how much pain or discomfort do you experience in your shoulder?

0                                  100
Severe Pain             No pain at all

The authors recommend converting the raw scores (0 to 3,400; 0 = worst score, 3,400 = best score) to a percentage score, i.e., presenting scores out of 100.

OXFORD SHOULDER SCORES (OSS)

Similar to Kirkley et al., Dawson, Fitzpatrick and Carr29 have published 2 questionnaires that deal with the perceptions of patients about shoulder surgery. The first, the Oxford Shoulder Score (OSS), was published in 1996 and is for patients having shoulder operations other than stabilization. The second questionnaire was published in 1999 and is meant for the group of patients who had been excluded from the original questionnaire, those presenting with shoulder instability.30 Both are 12-item questionnaires with each item scored from 1 to 5, from least to most difficulty or severity, combined to produce a single score ranging from 12 (best score) to 60 (worst score).

The Oxford Shoulder Instability Questionnaire was developed by interviewing 20 patients referred to an outpatient clinic with shoulder instability. It is unknown whether these patients represented all types of shoulder instability categories, age, gender, and treatment experiences. Based on the interviews, an 18-item instrument was drafted and then pared down to 12 questions following pretesting on a further 2 groups of 20 patients. It is not stated by what method the items were selected or discarded. The instrument has been tested for test-retest reliability in 34 patients with a 24-hour recall period. The ICC was not calculated; however, it is likely that the Pearson Correlation Coefficient closely approximates it. The r value was reported as .97. The recall period was very short for this type of assessment, raising the possibility that patients were able to remember their previous scores, artificially increasing the r value. Construct validity has been determined through prospective studies in which both instruments have been compared to other outcome tools as discriminative instruments (1 point in time both before treatment and at 6 months after treatment). Although predictions as to how the instruments should correlate were not made, the results, which show modest correlation with the other shoulder instruments and the appropriate domains of the global tools, seem appropriate. Finally, responsiveness or sensitivity to change was measured by comparing the effect sizes of the new questionnaires and the SF-36 scores, as well as the HAQ, Constant, and Rowe scores in patients undergoing surgical stabilization. The results show that the instruments were more sensitive than the generic instruments. In addition, the new questionnaire was compared with the other instruments for the ability to distinguish between patients who reported the most positive change in their shoulder from all other patients on 3 separate questions of patient perception of overall success of treatment, room for improvement, and perception of improvement in shoulder problems following treatment.

Medium-term results have also been reported for the OSS.31 Once again comparisons were made with the SF-36 and the Constant Shoulder score. In addition to these measures, patients were also asked to assess the success of their surgery and to judge the degree of change in the symptoms arising from their shoulder. Ninety-three patients were assessed preoperatively, and at 6 months and 4 years postoperatively. The correlation coefficients between the absolute scores of the OSS, the Constant assessment and the relevant dimensions of the SF-36 were generally high (r > .5) and highly significant. Comparisons between mean change scores, grouped responses, and the patient satisfaction question further strengthened support for the OSS questionnaire. Patients reported considerable differences in mean change scores 6 months postoperatively on the Constant, OSS, and relevant domains of the SF-36. Similar results at the 4-year assessment were shown for the OSS and the pain dimension of the SF-36. Interestingly, differences at this stage for the Constant barely approached significance, mean differences being significantly reduced between the 6-month and 4-year assessment points. This suggests that the reliability and sensitivity of the Constant Score relative to the OSS were significantly reduced over the long term. However, in reporting all of these medium-term results, the authors acknowledge that only 66% of the original sample underwent a clinical assessment at the 4-year mark and that this variation in the period of follow-up may have affected the clinical validity of the investigation.

It would seem from the publications that these questionnaires have been tested and should provide reliable, valid, and responsive information. The authors of the present article have no experience with these particular tools and perhaps in the future, more information regarding their effectiveness will be available.
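The Oxford scoring rule described above (12 items, each scored 1 to 5, summed to a total between 12 and 60) can be sketched as follows. This is a minimal illustration under those stated assumptions; the function name is hypothetical and the handling of missing or invalid responses is a choice made for this example, not something the questionnaires' authors specify here.

```python
# Hypothetical helper for illustration; rejects incomplete or out-of-range input.
OXFORD_ITEMS = 12


def oxford_total(item_scores):
    """Sum 12 item scores (1 = least difficulty, 5 = most) into a 12-60 total."""
    if len(item_scores) != OXFORD_ITEMS:
        raise ValueError(f"expected {OXFORD_ITEMS} item scores")
    for score in item_scores:
        if score not in (1, 2, 3, 4, 5):
            raise ValueError("each item must be an integer from 1 to 5")
    return sum(item_scores)  # 12 = best possible, 60 = worst possible


if __name__ == "__main__":
    # Lowest difficulty on every item gives the best total; highest gives the worst.
    print(oxford_total([1] * OXFORD_ITEMS), oxford_total([5] * OXFORD_ITEMS))
```

Note that, unlike the percentage scores used for the other instruments, a lower total indicates a better state on this scale.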

CONCLUSION

In summary, older instruments designed for evaluating shoulder conditions were developed at a time when little information was available or little attention was paid to the appropriate methodology for such endeavors. However, there now exist a number of instruments that are excellent for specific conditions of the shoulder. Much work remains to be done to evaluate these instruments in specific patient populations, to determine values for the minimally clinically important difference for each of these tools, and to develop valid translations such that they can be used internationally. It is clear that much progress has already been made in this area of orthopaedic surgery and that currently there exists an appropriate instrument for each of the main conditions of the shoulder. Investigators planning clinical trials should select a modern instrument developed with appropriate patient input for item generation and reduction, and with established validity and reliability. All things being equal, the most responsive instrument available should be used in order to minimize the sample size for the proposed study.

REFERENCES

1. Rowe CR, Patel D, Southmard WW. The Bankart procedure: A study of late results. J Bone Joint Surg Am 1977;59:122.
2. Amstutz HC, Sew Hoy AL, Clarke IC. UCLA anatomic total shoulder arthroplasty. Clin Orthop 1981;155:7-20.
3. Ellman H, Hanker G, Bayer M. Repair of rotator cuff. Factors influencing reconstruction. J Bone Joint Surg Am 1986;68:1136-1144.
4. Romeo AA, Bach BR Jr, O’Halloran KL. Scoring systems for shoulder conditions. Am J Sports Med 1996;24:472-476.
5. Roach KE, Budiman-Mak E, Songsiridej N, Lertratanakul Y. Development of a shoulder pain and disability index. Arthritis Care Res 1991;4:143-149.
6. Richards RR, An K-N, Bigliani LU, Friedman RJ, Gartsman GM, Gristina AG, Iannotti JP, Mow VC, Sidles JA, Zuckerman JD. A standardized method for the assessment of shoulder function. J Shoulder Elbow Surg 1994;3:347-352.
7. Constant CR, Murley AHG. A clinical method of functional assessment of the shoulder. Clin Orthop 1987;214:160-164.
8. Conboy VB, Morris RW, Kiss J, Carr AJ. An evaluation of the Constant-Murley Shoulder Assessment. J Bone Joint Surg Br 1996;78:229-232.
9. Hudak PL, Amadio PC, Bombardier C. Development of an upper extremity outcome measure: The DASH (disabilities of the arm, shoulder and hand) [corrected]. The Upper Extremity Collaborative Group (UECG). Am J Ind Med 1996;29:602-608.
10. Solway S, Beaton DE, McConnell S, Bombardier C. The DASH outcome measure user’s manual. Toronto, Ontario: Institute for Work & Health, 2002.
11. Verbrugge LM, Jette AM. The disablement process. Soc Sci Med 1994;38:1-14.
12. Haworth RJ, Hopkins J, Ells P, Ackroyd CE, Mowat AG. Expectations and outcome of total hip replacement. Rheumatol Rehabil 1981;20:65-70.
13. Lieberman JR, Dorey F, Shekelle P, Schumacher L, Thomas BJ, Kilgus DJ, Finerman GA. Differences between patients’ and physicians’ evaluations of outcome after total hip arthroplasty. J Bone Joint Surg Am 1996;80:835-838.
14. Kirkley A, Griffin S, McLintock H, Ng L. The development and evaluation of a disease-specific quality of life measurement tool for shoulder instability: The Western Ontario Shoulder Instability Index (WOSI). Am J Sports Med 1998;26:764-772.
15. Kirkley A, Griffin S, Alvarez C. The development and evaluation of a disease-specific quality of life measurement tool for rotator cuff disease: The Western Ontario Rotator Cuff Index (WORC). Clin J Sport Med 2003;13:84-92.
16. Richards RR, Harniman E. A long-term follow-up of posterior shoulder stabilizations for recurrent posterior glenohumeral instability. London, Ontario: Canadian Orthopaedic Association, #74, 2001.
17. Barrack RL, Skinner HB. The sensory function of knee ligaments. In: Daniel DM, Akeson WH, O’Connor JJ, eds. Knee ligaments: Structure, function, injury and repair. New York: Raven, 1990.
18. L’Insalata JC, Warren RF, Cohen SB, Altchek DW, Peterson MG. A self-administered questionnaire for assessment of symptoms and function of the shoulder. J Bone Joint Surg Am 1997;79:738-748.
19. Lippitt SB, Harryman DT II, Matsen FA III. A practical tool for evaluating function: The Simple Shoulder Test. In: Matsen FA, Fu FH, Hawkins RJ, eds. The shoulder: A balance of mobility and stability. Rosemont, IL: American Academy of Orthopaedic Surgeons, 1992;501-518.
20. Rowe CR. Evaluation of the shoulder. In: The shoulder. New York: Churchill-Livingstone, 1988;631-637.
21. Barrett WP, Franklin JL, Jackins SE, Wyss CR, Matsen FA III. Total shoulder arthroplasty. J Bone Joint Surg Am 1987;69:865-872.
22. Kirshner B, Guyatt G. A methodological framework for assessing health indices. J Chronic Dis 1985;38:27-35.
23. Lo IKY, Griffin S, Kirkley A. The development and evaluation of a disease-specific quality of life measurement tool for osteoarthritis of the shoulder: The Western Ontario Osteoarthritis of the Shoulder Index (WOOS). Arthritis Cartilage 2001;9:771-778.
24. Juniper EF, Guyatt GH, Jaeschke R. How to develop and validate a new health-related quality of life instrument. In: Spilker B, ed. Quality of life and pharmacoeconomics in clinical trials. Philadelphia: Lippincott-Raven, 1996;49-56.
25. Kirkley A. The development and evaluation of a disease-specific quality of life measurement tool for shoulder instability: The Western Ontario Shoulder Instability Index (WOSI). 1-89. Thesis/Dissertation, McMaster University, Hamilton, Ontario, Canada, 2001.
26. Kirkley A, Griffin S, Richards C, Miniaci A, Mohtadi N. Prospective randomized clinical trial comparing the effectiveness of immediate arthroscopic stabilization versus immobilization and rehabilitation in first traumatic anterior dislocation of the shoulder. Arthroscopy 1998;15:507-514.
27. Hollinshead RM, Mohtadi NG, Vande Guchte RA, Wadey VM. Two 6-year follow-up studies of large and massive rotator cuff tears: Comparison of outcome measures. J Shoulder Elbow Surg 2000;9:373-381.
28. Guyatt GH, Townsend M, Berman LB, Keller JL. A comparison of Likert and visual analogue scales for measuring change in function. J Chronic Dis 1987;40:1129-1133.
29. Dawson J, Fitzpatrick R, Carr A. Questionnaire on the perceptions of patients about shoulder surgery. J Bone Joint Surg Br 1996;78:593-600.
30. Dawson J, Fitzpatrick R, Carr A. The assessment of shoulder instability. J Bone Joint Surg Br 1999;81:420-426.
31. Dawson J, Hill G, Fitzpatrick R, Carr A. The benefits of using patient-based methods of assessment: medium-term results of an observational study of shoulder surgery. J Bone Joint Surg Br 2001;83:877-882.