
Journal of Clinical Epidemiology 57 (2004) 1096–1103

Agreement between self-report questionnaires and medical record data was substantial for diabetes, hypertension, myocardial infarction and stroke but not for heart failure
Yuji Okuraᵃ, Lynn H. Urbanᵇ, Douglas W. Mahoneyᵇ, Steven J. Jacobsenᶜ, Richard J. Rodehefferᵃ,*
ᵃ Division of Cardiovascular Diseases and Internal Medicine, Department of Health Science Research, Mayo Clinic and Foundation, 200 First St SW, Rochester, Minnesota 55905, USA
ᵇ Section of Biostatistics, Department of Health Science Research, Mayo Clinic and Foundation, 200 First St SW, Rochester, Minnesota 55905, USA
ᶜ Section of Clinical Epidemiology, Department of Health Science Research, Mayo Clinic and Foundation, 200 First St SW, Rochester, Minnesota 55905, USA
Accepted 14 April 2004

Abstract
Objectives: Questionnaires are used to estimate disease burden. Agreement between questionnaire responses and a criterion standard is important for optimal disease prevalence estimates. We measured the agreement between self-reported disease and medical record diagnosis of disease.
Study Design and Setting: A total of 2,037 Olmsted County, Minnesota residents ≥45 years of age were randomly selected. Questionnaires asked if subjects had ever had heart failure, diabetes, hypertension, myocardial infarction (MI), or stroke. Medical records were abstracted.
Results: Self-report of disease showed >90% specificity for all these diseases, but sensitivity was low for heart failure (69%) and diabetes (66%). Agreement between self-report and medical record was substantial (kappa 0.71–0.80) for diabetes, hypertension, MI, and stroke but not for heart failure (kappa 0.46). Factors associated with high total agreement by multivariate analysis were age <65 years, female sex, education >12 years, and zero Charlson Index score (P < .05).
Conclusion: Questionnaire data are of greatest value in life-threatening, acute-onset diseases (e.g., MI and stroke) and chronic disorders requiring ongoing management (e.g., diabetes and hypertension). They are more accurate in young women and better-educated subjects.
© 2004 Elsevier Inc. All rights reserved.
Keywords: Cardiovascular diseases; Epidemiologic methods; Questionnaires; Recall; Reliability

1. Introduction
Epidemiologic studies and surveys often rely on self-administered questionnaires to obtain information on subject
health status [1–11]. The agreement between questionnaire
data and a criterion standard, such as the medical record, is
critical for obtaining meaningful estimates of disease
prevalence.
A number of studies have attempted to assess the value of a self-reported disease by comparing self-reports with a criterion standard, such as the medical record, for a range of cardiovascular conditions [1–19]. When comparing results among different studies, it is important to bear in mind that the characteristics of the cohort and study methodology vary widely. Some studies did not evaluate both men and women [3,7–9]. Other investigations limited the target population to those who had been hospitalized due to the disease of interest [13,16]. Most studies imposed a limit on the recall period for a particular diagnosis [6–9,13]. Questionnaire instruments have also varied; in many studies subjects have been asked if they have had a condition but have not been asked whether a medical professional had informed them of the diagnosis [1–5,15]. Moreover, information obtained by interview [12–19] may differ from that gained by self-administered questionnaire [1–11]. The form of analysis and presentation of findings also varies among studies: some authors focused on subjects who reported the presence of disease and those who denied disease [1–6,8–12,14,15,18,19], whereas other authors limited their investigation to those who reported the presence of disease [7,13,16,17]. Limiting evaluation of a criterion standard (e.g., chart abstraction) only to persons who report the presence of disease may result in underestimation of disease prevalence because false-negative reports are not taken into account.

* Corresponding author. Tel.: 507-284-0294; fax: 507-284-4200.
E-mail address: rodeheffer.richard@mayo.edu (R.J. Rodeheffer).
0895-4356/04/$ – see front matter © 2004 Elsevier Inc. All rights reserved.
doi: 10.1016/j.jclinepi.2004.04.005

Criterion standards for the presence of disease have also
been inconsistent. Some investigators have used physicians'
diagnosis in medical records, whereas others have required
the presence of a morbidity endpoint at the time of clinical
measurement or evaluation [2,19]. Harlow pointed out that
the lack of consistency in analytic methods to measure
agreement makes comparison across studies difficult and suggested that investigations should incorporate standardized
methodologies to enable comparison of results across studies
[20]. However, study limitations, such as referral bias, insufficient sample size, nonuniform archival style, short archival
period, or incomplete archive, may limit the ability to achieve
interpretable results.
The purpose of this study was to measure the agreement
between self-reported cardiovascular disease and extensive
medical record documentation of disease in a well-characterized, population-based cohort with a long record archival
period. We also sought to determine the subject characteristics that are associated with agreement.

2. Methods

2.1. Study setting
The study was conducted in Olmsted County, Minnesota, using the resources of the Rochester Epidemiology Project. In 1990, the population of Olmsted County was 106,470 (96% white). The proportions of the population over age 45, 65, and 75 years were 28%, 11%, and 5%, respectively. Other characteristics of this population and the unique aspects of population-based epidemiologic research in Olmsted County have been previously described [21–24].

2.2. Population sampling, subject recruitment, and enrollment
This study was approved by the Mayo Clinic Institutional Review Board. A prospective study of cardiac function using a representative sample of persons in Olmsted County was initiated in 1998. Residents of Olmsted County ≥45 years of age were eligible to participate. A sampling fraction of 7% was applied within each sex and age (5-year) specific stratum. The sampling procedure has been described in detail elsewhere (M. Redfield, S. Jacobsen, J.J. Burnett, unpublished observations). The sample size of this cohort was established based on a goal of estimating the prevalence of echocardiographically determined asymptomatic left ventricular dysfunction within acceptable confidence limits. A total of 2,037 persons (47% of invited subjects) participated. Comparison of the participants with 511 nonparticipants showed no difference in age, gender, cardiac hospitalization, history of heart failure, or history of diabetes; chronic obstructive pulmonary disease was noted in 49.7% of nonparticipants compared with 29.3% of participants (P < .01) [25].

2.3. Questionnaire information
The questionnaire was developed with the assistance of the Survey Research Center of the Mayo Department of Health Services Research. A printed questionnaire was mailed to each subject with an invitation to participate in the study. The invitation letter described the goals of the study and the information to be obtained (questionnaire, electrocardiogram, echocardiogram, spirometry, physical examination, and phlebotomy) during a 4-hour visit to Mayo Clinic. The questionnaire contained the following inquiry: "Has a medical provider ever told you that you had any of the following conditions?" The disease conditions of interest in this study were heart failure, diabetes, borderline diabetes, hypertension, myocardial infarction (MI) (referred to as "heart attack" in the questionnaire), or stroke including transient ischemic attack. Three response choices were provided for each question: "yes," "no," or "I don't know." "I don't know" answers in the questionnaire were regarded as "no" responses. Participants were also asked to indicate if they completed the questionnaires alone or if they required assistance.

2.4. Medical record review
Population-based research in Olmsted County is feasible because all care providers have maintained a unified medical record whereby all data collected on an individual patient are assembled in one dossier record. The unique characteristics of this comprehensive medical record system have been documented previously [23]. These records are indexed and maintained by the Rochester Epidemiology Project [22,23]. Since 1907, every Mayo Clinic patient has been assigned a unique identifier, and all information from every Mayo Clinic contact (including hospital inpatient or outpatient care, office visits, emergency room and nursing home care, and death certificate and autopsy information) is contained within a single dossier chart for each patient. The diagnoses assigned and surgical procedures performed at each visit are coded and entered into continuously updated computer files. Consequently, comprehensive assessment of agreement of affirmative and negative questionnaire responses could be carried out without limiting recall period. Because of the completeness of the medical record and the long archival period for this cohort (see Results), a diagnosis in the record was considered to be the criterion standard for the presence of the disease.
Individual information concerning education, number of hospitalizations, and archival length (duration of available record) was obtained from the database. Medical records for all 2,037 participants were reviewed by trained chart abstractors (four registered nurses with an average of 8 years of abstracting experience) using specified criteria. For heart failure, a physician's diagnosis of congestive heart failure (CHF) had to include documentation of the criteria developed by the Framingham Heart Study (Table 1) [26]. For diabetes, the diagnosis of diabetes mellitus had to be explicit in the record. For hypertension, a physician's diagnosis of hypertension and the use of antihypertensive medication or the words "borderline" or "labile" used in reference to blood pressure with documentation of two blood pressure measurements (consecutive, but possibly at subsequent visits) ≥140 systolic or ≥90 diastolic within a 12-month period was required. For heart attack, an MI had to be documented in the medical record by ECG, cardiac enzymes, and chest pain according to the Gillum or WHO criteria [27]. For stroke, loss of neurologic function caused by an ischemic event had to be diagnosed in the medical record. Diagnosis of a transient ischemic attack required a documented history of focal abnormalities in vision, speech, sensation, or motor function lasting no longer than 24 hours.

Table 1
Framingham criteria for the clinical diagnosis of CHF

Major criteria
  Paroxysmal nocturnal dyspnea
  Orthopnea
  Elevated jugular venous pressure
  Pulmonary rales
  Third heart sound
  Cardiomegaly on chest radiograph
  Pulmonary edema on chest radiograph
Minor criteria
  Peripheral edema
  Night cough
  Dyspnea on exertion
  Hepatomegaly
  Pleural effusion
  Heart rate ≥120/min
  Weight loss ≥4.5 kg in 5 daysᵃ

ᵃ Weight loss ≥4.5 kg in 5 days is considered a major criterion if it occurred in response to therapy for CHF. A patient was considered to have validated CHF if two major criteria were present or one major and two minor criteria were present concurrently.

To assess the influence of coexisting disease on agreement between questionnaire and medical record, we measured comorbidity with the Charlson index [28]. This weighted index takes into account the number and seriousness of 17 comorbid conditions. With each increased level of comorbidity index, a significant stepwise increase in cumulative mortality is observed. This index estimates illness burden (defined as risk of death) attributable to comorbid disease. Using the Rochester Epidemiology Project database, we estimated each participant's extent of comorbidity as described previously [29,30] and compared questionnaire response between those subjects with a Charlson index of zero (no comorbid conditions) and those with a Charlson index of ≥1. In measurement of the Charlson index, scores for the disease in question were not counted. For example, heart failure was counted as a comorbidity for diabetes, whereas heart failure was not counted as a comorbidity in the question on heart failure.
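
The exclusion rule for the Charlson index, in which the condition being validated does not contribute to its own comorbidity score, can be sketched as follows. This is only an illustration under stated assumptions: the dictionary holds a small subset of the published Charlson weights, the condition keys and the function name `charlson_excluding` are hypothetical, and the study itself derived scores from the Rochester Epidemiology Project database [29,30].

```python
# Illustrative subset of Charlson weights (an assumption for this sketch;
# the full index covers more conditions than are listed here).
CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1,
    "heart_failure": 1,
    "cerebrovascular_disease": 1,
    "diabetes": 1,
    "hemiplegia": 2,
    "metastatic_solid_tumor": 6,
}


def charlson_excluding(conditions, index_disease):
    """Sum the weights of a subject's conditions, skipping the disease in question."""
    return sum(CHARLSON_WEIGHTS[c] for c in conditions if c != index_disease)


# Hypothetical subject whose only recorded condition is heart failure.
subject = {"heart_failure"}

# Heart failure counts as a comorbidity when the diabetes question is evaluated ...
score_for_diabetes = charlson_excluding(subject, "diabetes")
# ... but not when the heart failure question itself is evaluated.
score_for_heart_failure = charlson_excluding(subject, "heart_failure")

print(score_for_diabetes, score_for_heart_failure)
```

Under this rule the same subject falls in the "Charlson index ≥1" stratum for the diabetes analysis and in the "Charlson index 0" stratum for the heart failure analysis.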
2.5. Statistical methods
The sensitivity (correctly reported positive medical records/all positive medical records), specificity (correctly
reported negative medical records/all negative medical records), positive predictive value (PPV) (correctly reported
positive medical records/all positives reported by the questionnaire), negative predictive value (NPV) (correctly reported negative medical records/all negatives reported by
the questionnaire), and total agreement (correctly reported
positive medical records and negative medical records/total
reports or records) were estimated.
Kappa coefficients were calculated to determine the chance-corrected agreement between self-reported questionnaire data and medical records [31]. A kappa value of ≤0.40 was considered poor-to-fair agreement, a kappa value of 0.41 to 0.60 was considered moderate agreement, a kappa value of 0.61 to 0.80 was considered substantial agreement, and a kappa value of 0.81 to 1.00 was considered excellent agreement, as suggested by Landis and Koch [32]. Stratified analyses of kappa coefficients and total agreement were performed for each condition by age, sex, education level, number of hospital admissions, archival length, and Charlson index. Differences across strata were examined using the χ² test. Each of the above stratification factors was also evaluated as an unadjusted and adjusted covariate within a logistic regression model with questionnaire and medical record agreement (yes/no) as the response. A two-sided alpha level of 0.05 was considered statistically significant.
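
The measures defined in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the counts are the heart failure figures given in the Discussion (35 true positives, 60 false positives, 16 false negatives), with the true-negative count derived here as 2,037 − 35 − 60 − 16 = 1,926.

```python
def agreement_stats(tp, fp, fn, tn):
    """Validity measures and Cohen's kappa for a 2x2 table.

    Rows are questionnaire answers, columns are medical record findings.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, computed from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "total_agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


def landis_koch_band(kappa):
    """Map kappa to the agreement labels used in this section."""
    if kappa <= 0.40:
        return "poor-to-fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "excellent"


# Heart failure counts; tn is derived from the cohort size, not reported directly.
hf = agreement_stats(tp=35, fp=60, fn=16, tn=2037 - 35 - 60 - 16)
for name, value in hf.items():
    print(f"{name}: {value:.3f}")
print(landis_koch_band(hf["kappa"]))
```

Rounded, these values match the heart failure row of Table 3 (sensitivity 68.6%, specificity 97.0%, PPV 36.8%, NPV 99.2%, total agreement 96.3%, kappa 0.46), which falls in the moderate band.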

3. Results
The characteristics of the cohort are described in Table 2. Of 2,037 participants, 1,950 (95.7%) filled out the questionnaire without assistance. The median participant age was 61 years, and the median length of the patient medical record archive was 36 years.

Table 2
Characteristics of the cohort (N = 2,037)

                                 Number      %
Male                                981   48.2
Female                            1,056   51.8
Education ≤12 y                     757   39.7
Education >12 y                   1,152   60.4
Heart failure                        51    2.5
Diabetes                            150    7.4
Hypertension                        768   37.7
MI                                  105    5.2
Stroke                               74    3.6
Charlson index 0                    991   48.7
Charlson index 1                    397   19.5
Charlson index ≥2                   649   31.9

                                 25th percentile   Median   75th percentile
Age (y)                                       53       61                70
Medical record duration (y)                   25       36                49
Number of hospital admissions                  1        3                 5

Abbreviation: MI, myocardial infarction.

The agreement between self-report and the medical record for heart failure, diabetes, hypertension, MI, and stroke is presented in Table 3. Self-report was most sensitive and specific (89.5% and 98.2%, respectively) for MI. In contrast, a self-report of heart failure was 68.6% sensitive and 97.0% specific for the diagnosis of heart failure (Table 3). Kappa ranged from 0.71 to 0.80 for diabetes, hypertension, MI, and stroke, indicating substantial agreement. In contrast, only moderate agreement (kappa 0.46) was found for heart failure.
Heart failure had a high questionnaire false-positive rate and a low PPV (36.8%), with questionnaire responses indicating a nearly two-fold higher heart failure prevalence than was recorded in the medical record (4.7% and 2.5%, respectively). For diabetes, there was a high questionnaire false-negative rate and low questionnaire sensitivity (66.0%). Of 51 false-negative responders, 37 reported borderline diabetes, whereas their medical records contained a diagnosis of diabetes rather than borderline diabetes. As a result, the questionnaire responses indicated a lower prevalence of diabetes than did medical records (5.2% versus 7.4%).
The prevalence of hypertension based on the questionnaire responses was close to that based on diagnosis in the medical records (35.8% and 37.7%, respectively). However, a considerable number of false-positive and false-negative responses reduced specificity, NPV, and total agreement.
In contrast to the other diagnoses, MI had a small number of false-negative responses, resulting in the highest sensitivity (89.5%), specificity (98.2%), and kappa (0.80) among the five diseases. Although stroke showed high total agreement, a considerable number of false-positive responses reduced the PPV.
The association of participant characteristics with strength of agreement between self-reports and medical records is shown in Table 4. Female sex, younger age, higher education, fewer hospital admissions, shorter archival length, and zero Charlson score were associated with greater total agreement for all diseases except hypertension.
The amount of agreement varied by age, sex, education, length of medical record, and Charlson index, particularly for heart failure, diabetes, and MI. When considered simultaneously in a multiple logistic regression model, all variables except length of medical record were found to be associated with total agreement (Table 5). In particular, younger age was associated with better agreement in all diseases except hypertension. Female sex was associated with greater agreement in heart failure and MI. After adjustment for other factors, greater attained educational level was associated with greater agreement in only two diseases, heart failure and diabetes. The Charlson index was an independent predictor of poor agreement in heart failure, diabetes, and MI; that is, the presence of a greater number of comorbidities that influenced participants' prognosis was associated with weaker total agreement.

4. Discussion
Self-administered health-status questionnaires continue to be important tools in epidemiology and public health research. This study of agreement between questionnaire responses and medical record diagnosis was conducted in a population-based cohort of persons ≥45 years of age. It showed that there was substantial agreement between questionnaire responses and medical records for diabetes, hypertension, MI, and stroke but not for heart failure. Factors associated with higher agreement were age <65 years, female sex, education >12 years, and the absence of comorbid conditions on the Charlson index.
Our current study of cardiac function in Olmsted County offered an opportunity to overcome some of the limitations encountered in previous studies [33]. We used a sizable population-based sample and a comprehensive inpatient and outpatient medical archive with uniform style, a long archival period (median 36 years in this cohort), and relative completeness of the archive for residents of Olmsted County. Almost all medical care received by local residents is provided by the Mayo Clinic together with its two affiliated hospitals or by Olmsted Medical Center with its affiliated hospital. Of those who sought medical assessment from a physician, 90% were evaluated in these facilities, and 96% of Olmsted County residents selected one of these providers when they had a major medical problem [34].

Table 3
Sensitivity and specificity of self-reported cardiovascular disease for medical record-documented disease in 2,037 men and women in Olmsted County, Minnesota

                Sensitivity (%)  Specificity (%)  PPV (%)  NPV (%)  Kappa  95% CI     Total agreement (%)
Heart failure   68.6             97.0             36.8     99.2     0.46   0.36–0.56  96.3
Diabetes        66.0             99.7             94.3     97.4     0.76   0.70–0.82  97.2
Hypertension    82.0             92.2             86.4     89.4     0.75   0.72–0.78  88.4
MI              89.5             98.2             73.4     99.4     0.80   0.74–0.85  97.8
Stroke          78.4             98.6             67.4     99.2     0.71   0.63–0.79  97.8

Abbreviations: PPV, positive predictive value; NPV, negative predictive value; CI, confidence interval; MI, myocardial infarction.
Total agreement (%) = [(MR+QN+) + (MR−QN−)]/number of total participants × 100, where MR+QN+ are participants whose affirmative reports were confirmed in their medical records (true positive) and MR−QN− are participants who neither reported diseases nor had diseases confirmed in their medical history (true negative).

Table 4
Association of participant characteristics with strength of agreement between self-report and the medical record

                        Heart failure              Hypertension               MI                         Diabetes                   Stroke
Group                   Sens     Spec     Agree    Sens     Spec     Agree    Sens     Spec     Agree    Sens     Spec     Agree    Sens     Spec     Agree
Gender
  Male                  75.8     94.9     94.3**   81.1     91.5     87.7     92.3     97.2     96.8**   66.7     99.5     96.4*    83.3     98.7     98.1
  Female                55.6     98.8     98.1     82.9     92.9     89.0     81.5     99.1     98.7     64.9     99.8     97.9     71.9     98.4     97.6
Age
  45–64 y               83.3     98.5     98.4**   85.3     91.1     89.5     96.4     99.4     99.3**   77.1     99.7     98.4**   82.4     99.3     99.0**
  65–96 y               64.1     94.5     93.1     79.5     94.7     86.7     87.0     96.3     95.4     56.3     99.7     95.4     77.2     97.5     96.0
Years of education
  ≤12 y                 81.5     94.4     93.9     81.4     91.5     86.9     89.5     97.3     96.7*    57.3     99.6     95.4**   73.0     97.6     96.4**
  >12 y                 52.2     98.5     97.6     83.2     92.6     89.4     88.9     98.6     98.3     74.6     99.8     98.3     82.9     99.0     98.5
Number of admissions
  0                     44.4     98.5     97.6**   80.2     89.1     86.2     84.6     99.2     98.8*    60.0     100      97.6*    73.3     99.2     98.4**
  1–3                   71.4     97.5     97.2     79.5     93.6     89.3     88.5     98.7     98.3     72.5     99.9     98.3     91.7     99.3     99.2
  4–7                   61.5     96.9     96.0     83.7     93.2     89.0     94.1     97.6     97.3     60.0     99.0     96.0     87.5     97.8     97.3
  ≥8                    81.8     93.2     92.3     85.0     92.5     88.7     87.5     96.6     95.7     70.0     100      96.0     65.2     97.1     94.7
Length of record
  ≤36 y                 63.2     97.9     97.2*    83.2     91.9     89.3     88.6     98.6     98.1     71.9     100      98.3**   77.3     99.1     98.7*
  >36 y                 71.9     96.2     95.4     81.2     93.0     87.6     90.0     98.0     97.5     62.4     99.4     96.1     78.4     98.0     97.1
Charlson index
  0                     50.0     99.6     99.3*    78.4     92.4     88.7     92.3     99.4     99.3*    68.2     99.8     99.1*    81.8     99.0     98.8**
  ≥1                    71.1     94.3     93.3     83.9     91.9     88.0     89.1     97.0     96.3     65.6     99.5     95.3     77.8     98.1     96.8

Abbreviations: MI, myocardial infarction; Sens, sensitivity (%); Spec, specificity (%); Agree, total agreement (%).
* P < .05.
** P < .01.

Table 5
Patient characteristics associated with questionnaire and medical record agreement in a logistic regression model (N = 2,037)

                          Heart failure      Diabetes           Hypertension       MI                 Stroke
Analysis variables        OR    95% CI       OR    95% CI       OR    95% CI       OR    95% CI       OR    95% CI
Unadjusted
  Age ≥65 y               0.22  0.13–0.37*   0.35  0.20–0.60*   0.76  0.58–1.00    0.14  0.06–0.30*   0.24  0.12–0.47*
  Female gender           3.14  1.87–5.27*   1.74  1.01–2.99*   1.14  0.87–1.50    2.43  1.28–4.59*   0.82  0.45–1.49
  Education >12 y         2.60  1.61–4.19*   2.74  1.57–4.79*   1.27  0.96–1.69    1.93  1.07–3.51*   2.47  1.34–4.56*
  Medical record >36 y    0.60  0.37–0.97*   0.42  0.23–0.75*   0.85  0.65–1.12    0.75  0.41–1.38    0.45  0.23–0.87*
  Any hospital admission  0.58  0.31–1.08    0.84  0.44–1.59    1.31  0.97–1.77    0.48  0.20–1.14    0.70  0.32–1.51
  Charlson index ≥1       0.10  0.05–0.21*   0.18  0.09–0.36*   0.94  0.72–1.23    0.18  0.08–0.41*   0.36  0.18–0.69*
Adjustedᵃ
  Age ≥65 y               0.30  0.17–0.54*   0.53  0.29–0.99*   0.81  0.59–1.11    0.17  0.08–0.39*   0.34  0.17–0.71*
  Female gender           2.77  1.62–4.74*   1.63  0.93–2.87    1.09  0.81–1.45    2.38  1.24–4.59*   0.77  0.42–1.43
  Education >12 y         1.85  1.11–3.08*   2.02  1.13–3.61*   1.25  0.93–1.68    1.52  0.80–2.86    1.87  0.99–3.53
  Medical record >36 y    1.25  0.73–2.15    0.76  0.40–1.45    0.92  0.67–1.26    1.76  0.91–3.40    0.85  0.42–1.71
  Any hospital admission  0.72  0.37–1.38    1.02  0.51–2.02    1.40  1.01–1.95*   0.45  0.17–1.16    0.97  0.44–2.13
  Charlson index ≥1       0.18  0.08–0.39*   0.28  0.13–0.59*   1.09  0.81–1.46    0.34  0.15–0.78*   0.50  0.25–1.01

Abbreviations: OR, odds ratio; CI, confidence interval; MI, myocardial infarction.
ᵃ Adjusted for all variables.
* P < .05.

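
As a reading aid for the odds ratios in Table 5, the following sketch recovers an approximate standard error and z statistic from a reported odds ratio and its 95% CI. It assumes the interval is a Wald interval, symmetric on the log-odds scale; the paper does not state how its intervals were computed, so this is an assumption, and the function name `wald_z_from_ci` is hypothetical.

```python
import math


def wald_z_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate SE of log(OR) and the Wald z statistic from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    return se, z


# Adjusted heart failure estimate for the age variable in Table 5:
# OR 0.30, 95% CI 0.17 to 0.54.
se, z = wald_z_from_ci(0.30, 0.17, 0.54)
print(f"SE = {se:.3f}, z = {z:.2f}")

# An interval that excludes 1 corresponds to |z| > 1.96, i.e. P < .05.
print(abs(z) > 1.96)
```

An odds ratio below 1 with an interval that excludes 1, as here, indicates lower odds of questionnaire and record agreement in that group.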
In spite of heterogeneous methodology, there have been
some common findings among previously published studies.
In general, self-reports on medical conditions that are well
defined and relatively easily diagnosed have good
agreement, in contrast to conditions characterized by complex nonspecific symptoms [7]. Our results support this concept, showing high questionnaire agreement for MI, diabetes,
hypertension, and stroke.
MI tends to present as a dramatic, easily recalled event
involving pain and the need for intensive care hospitalization. It usually includes the need for outpatient follow-up
evaluation and risk factor management, and it may be associated with other diseases (hypertension, diabetes, etc.) that
call for frequent interactions with the medical care system.
These factors enhance agreement between self-report and
the medical record for MI.
Although diabetes and hypertension are usually not precipitous in onset, they are chronic and require ongoing
repeated engagement with the medical care system. These
factors reinforce agreement between self-report and the medical record. Nevertheless, our questionnaire data on diabetes
contained a high false-negative rate and low sensitivity, perhaps because patients tended to minimize the severity of
their glucose intolerance, opting for the designation "borderline diabetes" when their medical records described them
as having diabetes.
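
A rough sense of how much the "borderline diabetes" responses matter can be had from the counts given in the Results (150 record-confirmed cases, 51 false negatives, 37 of whom reported borderline diabetes). The sketch below is a hypothetical what-if, not an analysis from the study: it recomputes sensitivity as if those 37 answers had been scored as positive. Specificity would also change under that scoring, but the counts needed to recompute it are not reported.

```python
record_positive = 150      # diabetes diagnoses in the medical record
false_negative = 51        # record-positive subjects who did not report diabetes
reported_borderline = 37   # false negatives who reported borderline diabetes

true_positive = record_positive - false_negative
sensitivity = true_positive / record_positive

# Hypothetical rescoring: count "borderline diabetes" answers as positive.
sensitivity_if_rescored = (true_positive + reported_borderline) / record_positive

print(f"{sensitivity:.1%} -> {sensitivity_if_rescored:.1%}")
```

Under this assumption sensitivity would rise from 66.0% to roughly 91%, which suggests that most of the shortfall comes from the wording subjects chose rather than from unawareness of the condition.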
Stroke, like MI, often has an abrupt onset and is frequently
associated with permanent functional impairment. Although
our data show good agreement between questionnaire and
medical record documentation of stroke, a relatively high
rate of false-positive questionnaire responses was observed. This may be a consequence of patient confusion about which neurologic symptoms constitute a stroke and patient difficulty in distinguishing between stroke as a possible diagnosis and stroke as a definite diagnosis.
In contrast to MI, stroke, diabetes, and hypertension,
patient self-reports of the diagnosis of CHF showed poorer
agreement with chart-confirmed diagnosis. This is significant
given the apparent increase in the prevalence of heart failure
in the aging American population. In our cohort, a self-report of the diagnosis of CHF was characterized by frequent
false positives (60 persons) and false negatives (16 persons),
compared with true positives (35 persons). As a consequence, there was poor agreement between self-report and
medical record for CHF. This may be a consequence of
heart failure being a less familiar and less easily understood
diagnosis or may be a consequence of its nonspecific and intermittent symptoms. Heart failure shares symptoms with
some pulmonary and renal diseases, thus increasing the
possibility of false-positive self-report for heart failure. Our
relatively strict criteria for a chart diagnosis of heart failure
may also have contributed to false-positive questionnaire
responses for heart failure. For heart failure to be confirmed
in the medical record it had to have been recognized as such
by the physician and had to have fulfilled the Framingham
heart failure criteria. Some true heart failure cases may
not have had adequate documentation of Framingham criteria
in their medical records. In addition, heart failure may remain
asymptomatic, especially in elderly patients with sedentary
activity levels, or may have a fluctuating course with asymptomatic and symptomatic periods. The ability to reduce or
abolish symptoms with medical therapy may result in patient
uncertainty about the diagnosis and, hence, increased false-negative responses. In practical terms, these data suggest that survey questionnaire information on heart failure may be relatively unreliable and underscore the need for more effective methods of identifying persons with significant ventricular dysfunction.
Few investigators have examined the influence of sociodemographic factors on the agreement between self-reports
of cardiovascular disease and a criterion diagnostic standard,
and no consensus on this issue has emerged [1,2,13–15]. In
the present study, aging and comorbidity (as measured by the
Charlson index) were independently associated with reduced
agreement between self-report and the medical record report.
Female gender and higher education were associated with
increased agreement. These associations were observed in
heart failure, diabetes, and MI but not in hypertension and
stroke. Haapanen et al. [1] pointed out that the importance of respondent-related characteristics was greater for a disease with
low agreement compared with a disease with high
agreement. Their finding can be applied to the diagnosis of
heart failure, which manifested relatively low agreement.
We also observed a significant impact of respondent characteristics in MI and diabetes, which showed high agreement.
The reasons why agreement between self-report and medical record diagnosis of hypertension was unlikely to be
influenced by age, sex, or education level may relate to
the high level of recognition given to the disease in the
community and to the ease of diagnosis. Stroke, which is
often associated with sensory, memory, and communicative
deficits, may also attenuate the effect of age, sex, and education. Haapanen et al. [1] reported increased false-positive self-reports of cardiovascular disease among subjects who had
frequent contact with the health care system. Our analysis
using the Charlson index supports their finding and suggests that more frequent interactions with the healthcare
system because of the presence of comorbidities may result
in an exaggerated patient awareness of actual or possible
diseases.
This study is subject to potential limitations. The Olmsted
County population is predominantly white and middle class
and may not be representative of other populations. Although
access to medical care in Olmsted County, Minnesota may
be greater than in some United States populations, this is
difficult to quantify. Although participation bias could have
been operative in our study, preliminary review of the clinical
characteristics of participants and nonparticipants did not
reveal significant differences [25]. The process of chart abstraction is subject to inaccuracy. Therefore, to validate our
chart abstraction, 100 randomly selected charts were reabstracted by the same nurse abstractors who were blinded to
the results of the first record review 4 years earlier. The
percent agreement between the two chart abstractions for
MI, stroke, CHF, and diabetes was 100%, 99%, 95%, and
100%, respectively. A short archival period for some subjects
could increase the rate of false-positive questionnaire responses. However, the median archival period among 2,037
participants was 36 years and age-adjusted archival period
showed no significant difference between participants with
an archival period over 36 years and those with an archival
period under 36 years. Finally, the medical record is an imperfect criterion standard for the presence of disease. However, despite this objection, the comprehensiveness of the Rochester Epidemiology Project medical record system, the long archival period, and the application of uniform disease definitions by highly trained nurse abstractors make the medical record the best available criterion standard for this population-based study.
The results of this study suggest that the agreement of
self-reported cardiovascular disease with medical record diagnosis was good for diabetes, hypertension, stroke, and MI
but was less so for heart failure. Furthermore, agreement
was enhanced with younger age, female sex, high education
level, and the absence of comorbid conditions. These factors
need to be taken into account when interpreting self-administered questionnaire data on cardiovascular disease.

Acknowledgments
We thank Tammy Burns for expert preparation of this
manuscript for publication. This study was funded by
grants from the Public Health Service NIH HL-55502
(R.J.R.) and NIH AR-30582 (S.J.J.), by a Merck-Banyu fellowship award (Y.O.), and by the Mayo Foundation.

References
[1] Haapanen N, Miilunpalo S, Pasanen M, Oja P, Vuori I. Agreement between questionnaire data and medical records of chronic diseases in middle-aged and elderly Finnish men and women. Am J Epidemiol 1997;145:762–9.
[2] Engstad T, Bonaa KH, Viitanen M. Validity of self-reported stroke: the Tromso Study. Stroke 2000;31:1602–7.
[3] Tretli S, Lund-Larsen PG, Foss OP. Reliability of questionnaire information on cardiovascular disease and diabetes: cardiovascular disease study in Finnmark county. J Epidemiol Community Health 1982;36:269–73.
[4] Midthjell K, Holmen J, Bjorndal A, Lund-Larsen G. Is questionnaire information valid in the study of a chronic disease such as diabetes? The Nord-Trondelag diabetes study. J Epidemiol Community Health 1992;46:537–42.
[5] O'Mahony PG, Dobson R, Rodgers H, James OF, Thomson RG. Validation of a population screening questionnaire to assess prevalence of stroke. Stroke 1995;26:1334–7.
[6] Paganini-Hill A, Chao A. Accuracy of recall of hip fracture, heart attack, and cancer: a comparison of postal survey data and medical records. Am J Epidemiol 1993;138:101–6.
[7] Colditz GA, Martin P, Stampfer MJ, Willett WC, Sampson L, Rosner B, Hennekens CH, Speizer FE. Validation of questionnaire information on risk factors and disease outcomes in a prospective cohort study of women. Am J Epidemiol 1986;123:894–900.
[8] Lampe FC, Walker M, Lennon LT, Whincup PH, Ebrahim S. Validity of a self-reported history of doctor-diagnosed angina. J Clin Epidemiol 1999;52:73–81.
[9] Walker MK, Whincup PH, Shaper AG, Lennon LT, Thomson AG. Validation of patient recall of doctor-diagnosed heart attack and stroke: a postal questionnaire and record review comparison. Am J Epidemiol 1998;148:355–61.
[10] Olsson L, Svardsudd K, Nilsson G, Ringqvist I, Tibblin G. Validity of a postal questionnaire with regard to the prevalence of myocardial infarction in a general population sample. Eur Heart J 1989;10:1011–6.

[11] Bush TL, Miller SR, Golden AL, Hale WE. Self-report and medical record report agreement of selected medical conditions in the elderly. Am J Public Health 1989;79:1554–6.
[12] Heliovaara M, Aromaa A, Klaukka T, Knekt P, Joukamaa M, Impivaara O. Reliability and validity of interview data on chronic diseases: the Mini-Finland Health Survey. J Clin Epidemiol 1993;46:181–91.
[13] Bergmann MM, Byers T, Freedman DS, Mokdad A. Validity of self-reported diagnoses leading to hospitalization: a comparison of self-reports with hospital records in a prospective study of American adults. Am J Epidemiol 1998;147:969–77.
[14] Kehoe R, Wu SY, Leske MC, Chylack LT Jr. Comparing self-reported and physician-reported medical history. Am J Epidemiol 1994;139:813–8.
[15] Kriegsman DM, Penninx BW, van Eijk JT, Boeke AJ, Deeg DJ. Self-reports and general practitioner information on the presence of chronic diseases in community dwelling elderly: a study on the accuracy of patients' self-reports and on determinants of inaccuracy. J Clin Epidemiol 1996;49:1407–17.
[16] Rosamond WD, Sprafka JM, McGovern PG, Nelson M, Luepker RV. Validation of self-reported history of acute myocardial infarction: experience of the Minnesota Heart Survey Registry. Epidemiology 1995;6:67–9.
[17] Burgess AM, Martel MU, Wyman DK. Validation of interview-based disease classifications: a mail survey of physicians. J Chronic Dis 1971;24:455–9.
[18] Bowlin SJ, Morrill BD, Nafziger AN, Jenkins PL, Lewis C, Pearson TA. Validity of cardiovascular disease risk factors assessed by telephone survey: the Behavioral Risk Factor Survey. J Clin Epidemiol 1993;46:561–71.
[19] Vargas CM, Burt VL, Gillum RF, Pamuk ER. Validity of self-reported hypertension in the National Health and Nutrition Examination Survey III, 1988-1991. Prev Med 1997;26:678–85.
[20] Harlow SD, Linet MS. Agreement between questionnaire data and medical records: the evidence for accuracy of recall. Am J Epidemiol 1989;129:233–48.
[21] Senni M, Tribouilloy CM, Rodeheffer RJ, Jacobsen SJ, Evans JM, Bailey KR, Redfield MM. Congestive heart failure in the community: a study of all incident cases in Olmsted County, Minnesota, in 1991. Circulation 1998;98:2282–9.
[22] Melton LJ III. History of the Rochester Epidemiology Project. Mayo Clin Proc 1996;71:266–74.
[23] Kurland LT, Molgaard CA. The patient record in epidemiology. Sci Am 1981;245:54–63.
[24] Jacobsen SJ, Bergstralh EJ, Guess HA, Katusic SK, Klee GG, Oesterling JE, Lieber MM. Predictive properties of serum-prostate-specific antigen testing in a community-based setting. Arch Intern Med 1996;156:2462–8.
[25] Jacobsen SJ, Mahoney DW, Redfield MM, Bailey KR, Burnett JC Jr, Rodeheffer RJ. Participation bias in a population-based echocardiography study. Ann Epidemiol, in press.
[26] McKee PA, Castelli WP, McNamara PM, Kannel WB. The natural history of congestive heart failure: the Framingham study. N Engl J Med 1971;285:1441–6.
[27] Gillum RF, Fortmann SP, Prineas RJ, Kottke TE. International diagnostic criteria for acute myocardial infarction and acute stroke. Am Heart J 1984;108:150–8.
[28] Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis 1987;40:373–83.
[29] Gabriel SE, Crowson CS, O'Fallon WM. A comparison of two comorbidity instruments in arthritis. J Clin Epidemiol 1999;52:1137–42.
[30] Gabriel SE, Crowson CS, O'Fallon WM. Comorbidity in arthritis. J Rheumatol 1999;26:2475–9.
[31] Thompson WD, Walter SD. A reappraisal of the kappa coefficient. J Clin Epidemiol 1988;41:949–58.
[32] Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.
[33] Redfield MR, Jacobsen SJ, Burnett JC, Mahoney DW, Bailey KR, Rodeheffer RJ. Burden of systolic and diastolic ventricular dysfunction in the community. JAMA 2003;289:194–202.
[34] O'Brien P. A random survey of Olmsted County, Minnesota, 1973. Technical report series, no. 48. Rochester (MN): Section of Biostatistics, Mayo Clinic; 1991.
