
Cognition, 8 (1980) 111-143

© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands

Polite responses to polite requests*

HERBERT H. CLARK and DALE H. SCHUNK
Stanford University

Abstract
Indirect requests vary in politeness; for example, Can you tell me where
Jordan Hall is? is more polite than Shouldn't you tell me where Jordan Hall
is? By one theory, the more the literal meaning of a request implies personal
benefits for the listener, within reason, the more polite is the request. This
prediction was confirmed in Experiment 1. Responses to indirect requests
also vary in politeness. For Can you tell me where Jordan Hall is?, the
response Yes, I can - it's up the street is more polite than It's up the street.
By an extension of that theory, the more attentive the responder is to all of
the requester's meaning, the more polite is the response. This prediction was
confirmed in Experiments 2, 3 and 4. From this evidence, we argued that
people ordinarily compute both the literal and the indirect meanings of
indirect requests. They must if they are to recognize when the speaker is
and isn't being polite, and if they are to respond politely, impolitely, or
even neutrally.
When people make requests, they tend to make them indirectly. They
generally avoid imperatives like Tell me the time, which are direct requests,
in preference for questions like Can you tell me the time? or assertions like
I'm trying to find out what time it is, which are indirect requests. The
curious thing about indirect requests is that they appear to have one meaning
too many. Can you tell me the time?, as a request, has the indirect meaning
I request you to tell me the time. Yet it also possesses the literal meaning
I ask you whether you have the ability to tell me the time. If the speaker
is merely requesting the time, why the extraneous question about ability?
How does it figure in the listener's understanding of that request? It was
these two questions that prompted the present study.

*This research was supported in part by Grant MH-20021 from the National
Institute of Mental Health, the Center for Advanced Study in the Behavioral
Sciences, and a National Endowment for the Humanities Fellowship. We thank
Eve V. Clark and Ellen M. Markman for their helpful advice in the writing of
this paper and Susan L. Lyte for carrying out most of the experiments.
Dale H. Schunk is now at the University of Houston. Requests for reprints
should be sent to Herbert H. Clark, Department of Psychology, Stanford
University, Stanford, CA 94305, U.S.A.

These questions suggest two general kinds of processes by which an
indirect request might be understood. The first kind, which we will call
idiomatic processes, creates one and only one meaning - the indirect meaning.
Can you tell me the time?, used as a request, would be understood directly
and solely as Please tell me the time. At no point would the listener create
and use the literal meaning Do you have the ability to tell me the time?
The second kind of process, which we will call multiple-meaning processes,
creates both the literal and the indirect meanings, though not necessarily one
after the other. By this kind of process Can you tell me the time? would be
understood as involving both a question (Do you have the ability?) and a
request (Please tell me the time).
Each kind of process is needed in certain clear cases. An idiomatic process
is probably required for How do you do?, which is a question indirectly used
as a greeting. Although the historical vestiges of the literal question (How
are you?) are still present, the question no longer has any force; it isn't
answered sensibly by Fine, thank you. On the other hand, a multiple-meaning
process is probably required for the use of It's late, isn't it? to request the
time. There seems to be no way of figuring out the request without knowing
what the speaker meant literally. However, on the continuum from frozen
idioms like How do you do? to novel requests like It's late, isn't it?, there are
intermediate cases in which a sentence is conventionally used for an indirect
purpose. For these, either kind of process might apply.
For conventional indirect requests like Can you tell me the time?, which
kind of process is used? Within linguistics, the earliest proposals by Sadock
(1970) required an idiomatic process, but more recent ones, by Searle
(1975) and Morgan (1978) for example, require a multiple-meaning process.
Within psychology, Schweller (1978) and Gibbs (1979) have proposed
idiomatic processes, but Clark & Lucy (1975) and Clark (1979) have
proposed two different processes of the multiple-meaning variety. Thus,
there is an issue here to be resolved.
The feature that makes the multiple-meaning processes distinctive is their
assumption that literal meaning plays a role in comprehension. But if it does,
what is that role? For indirect requests, one answer has been offered by
Lakoff (1973, 1977) and by Brown & Levinson (1978): The literal meaning
is important in conveying politeness. As requests for the time, May I ask you
what time it is? is ordinarily more polite than Won't you tell me what time it
is? Since the two requests have the same indirect meaning, the reason must
lie in their literal meanings. The literal meaning of the first, roughly I
request permission to ask you what time it is, presumes very little on the
requestee and offers him the power to grant permission. The literal meaning
of the second, roughly I ask you if you do not intend to tell me what time
it is, presumes a good deal on the requestee and expresses a not-so-hidden
criticism. By this logic, conventional indirect requests get their politeness
rather directly from the literal meanings.
In a roundabout way, responses to indirect requests may get their politeness
from the literal meanings too. When Ann asks Bob Can you tell me the
time?, Bob might ordinarily respond with a single move, It's six. But if he
wanted to be especially polite, it is our intuition that he would add a first
move, as in Yes, I can - it's six. Let us call Yes, I can the literal move, and
It's six the indirect move. If we assume that Bob couldn't give the literal
move without computing the literal meaning, then he must have taken in
Ann's request by a multiple-meaning process. But are responses with both
moves actually more polite, and if so, why?
In this paper, then, we will investigate two issues jointly. The first is
comprehension. Does literal meaning play a role in the understanding of indirect
requests, and if so, what? The second issue is politeness: What makes some
indirect requests, and some responses, more polite than others? In the first
half of the paper, we will take up the politeness of indirect requests, and in
the second half, the politeness of their responses.

The politeness of indirect requests


In a request and its response, two people coordinate an exchange of goods.
For convenience, let us assume the requestor is a woman called A, and the
requestee a man called B. In her turn, A requests B to do something for
her, and in his turn, B commits himself, or refuses to commit himself, to do
what she wanted. When she requests information, as in all the requests we
will consider, B ordinarily gives the information instead of merely
committing himself to give it.
The problem with requests is that, on the surface, they are inequitable.
While A benefits from the information she receives, it costs B some effort to
give it to her. In Goffman's (1955, 1967) terms, requests threaten B's face.
For Goffman, face is the positive social value people claim for themselves.
It consists of two particular wants - the want to be unimpeded, free from
imposition by others, and the want to be approved of in certain respects.
People ordinarily act to maintain or gain face and to avoid losing face.
Clearly A's requests, by imposing on B, are potentially threatening to B's
face. Brown and Levinson (1978), following up work by Lakoff (1973,
1977), have incorporated this idea in a general theory of politeness whose
basic tenet is this: people are polite to the extent that they enhance, or
lessen the threat to, another's face. In our case, A will be polite to the extent
that she can reduce or eliminate the threat to B's face caused by her request.
We will look at only a few of the linguistic devices by which A could
reduce or eliminate the threat to B's face - for example, Can you, or
Couldn't you, or Will you tell me the time? These devices differ in how
much they benefit or cost B. Ordinarily, if a device benefits B, it
simultaneously costs A, although the benefit to B may not equal the cost to A. For
simplicity, we will assume that the benefit or cost to B actually does equal
the cost or benefit to A. So A will be polite to the extent that the linguistic
device she selects benefits B or lowers the cost to B (at least within limits).

Table 1.  Examples of the 18 request types used in Experiments 1, 3, and 4

Descriptive category   Request type

1. Permission          May I ask you where Jordan Hall is?
                       Might I ask you where Jordan Hall is?
                       Could I ask you where Jordan Hall is?

2. Imposition          Would you mind telling me where Jordan Hall is?
                       Would it be too much trouble to tell me where
                         Jordan Hall is?

3. Ability             Can you tell me where Jordan Hall is?
                       Could you tell me where Jordan Hall is?
                       Can't you tell me where Jordan Hall is?
                       Do you know where Jordan Hall is?

4. Memory              Have I already asked you where Jordan Hall is?
                       Did I ask you where Jordan Hall is?
                       Have you told me where Jordan Hall is?
                       Do I know where Jordan Hall is?

5. Commitment          Will you tell me where Jordan Hall is?
                       Would you tell me where Jordan Hall is?
                       Won't you tell me where Jordan Hall is?
                       Do you want to tell me where Jordan Hall is?

6. Obligation          Shouldn't you tell me where Jordan Hall is?

The linguistic devices we have selected are ones in which A asks B a
literal question answerable by yes or no, and by virtue of that question she
requests from him a relatively slight piece of information. Example: Will
you tell me who is coming to dinner tonight? From the literature on
indirect requests (e.g., Gordon & Lakoff, 1971; Green, 1975; Heringer,
1972; Sadock, 1972, 1974; Searle, 1975), we selected the 18 types listed
in Table 1. These requests vary from polite to impolite; some of them take a
literal yes answer for compliance, and others take a no. We will use the first
few words of each request as its abbreviation, like May I ask you? for May
I ask you where Jordan Hall is?
Since all 18 requests have the same indirect meaning, their differences lie
in the literal meanings. Indeed, these requests can be ordered, on a priori
intuitive grounds, for how much their literal meanings, if taken seriously,
would benefit B or reduce the costs to B. Note that all of them have one
cost in common. They impose on B by asking a question he must answer
with yes or no. Otherwise, the requests can be sorted into six broad categories (see Gordon & Lakoff, 1971; Searle, 1975), as shown in Table 1.
These categories can be ordered approximately for their benefit to B.
1. Permission. With the literal meaning of May I ask you where Jordan
Hall is?, A is offering B the authority to grant her permission to make her
request. This is obviously a great benefit to B. He now has a higher status,
or authority, than he had the moment before, and the status entitles him
to give permission to A even to make a rather trivial request. Such a benefit
makes this and the other two requests in this category particularly polite.
2. Imposition. With the literal meaning of Would you mind telling me
where Jordan Hall is?, A is no longer offering B the full authority to permit
her to ask him for the wanted information. Still, she is offering him the
authority to say that her request imposes too much. This benefits B. A is
thereby admitting that she is imposing on him, and the admission benefits
B too. So Would you mind? should be relatively polite too, although not as
polite as May I ask? and its kind. The authority to grant permission, on the
face of it, benefits B more than the mere chance to say that the task is too
imposing.
3. Ability. When A says Can you tell me where Jordan Hall is?, she is
literally asking B to say whether or not he has the ability to tell her where
Jordan Hall is. By giving him the opportunity to deny this ability, the
question both benefits and costs B a little bit. It benefits him by allowing
him to avoid the embarrassment of being asked a request he couldn't comply
with. But it costs him a little by suggesting that he may not be competent to
comply. Compared to May I ask? and Would you mind? with their great
benefits to B, Can you tell me? should be less polite. In so far as the other
three ability requests reflect the same rationale, they should be similar in
politeness. We will take up this qualification later.
4. Memory. The literal meaning of Have I already asked you where Jordan
Hall is? makes a subtle demand on B. It asks him whether or not he can
remember whether A asked him earlier for the location of Jordan Hall.
Most of the time he won't find this literal demand easy to fulfill, and
anyway, why should he be expected to keep track of what he has told her
when she is in as good a position to remember as he is? So this question, if
anything, costs B something, which works against politeness. The same goes
for the other three requests in this category, especially Do I know? These
requests should be less polite, generally, than those of permission,
imposition, or ability.
5. Commitment. With the literal meaning of Will you tell me where Jordan
Hall is?, A is asking B whether or not he will commit himself to tell her the
wanted information. Commitments, of course, are quite the opposite of
permissions. In commitments, B obligates himself to A to carry out an action.
This gives her the authority later to demand the fulfillment of his obligation,
and that puts him in a position inferior to her. This should cost B a great
deal - probably as much as or more than the memory requests. If so,
Will you tell me? and its kind should be less polite even than the memory
requests.
6. Obligation. The last request, Shouldn't you tell me where Jordan Hall
is?, should be the least polite of all. By using should, A is literally asking
B whether or not he is under some obligation to tell her the wanted
information. By using shouldn't, she further implies that B has failed in his
obligation. Her request, then, costs B in two ways. It implies that he is obligated
to tell her something; he has no choice in the matter. The obligation here is
more severe than in the commitment requests. And it scolds him for already
having failed in his duties. With such onerous costs to B, this request should
be relatively impolite.
As this discussion shows, the ways in which the literal meaning can be
used to benefit and cost B involve many factors. The ordering of these six
categories of requests is our best judgment of how these factors combine for
a net amount of politeness. Yet three factors that cut across these six
categories and lead to subsidiary predictions are conditionality, negativity,
and strength.
The difference between May I ask you? and Might I ask you? is one of
conditionality. The subjunctive might ordinarily indicates that what is
being said is conditional on something. For Might I ask?, Brown and
Levinson (1978), among others, speculate that the implicit condition is if you
please. If so, might should benefit B and increase the politeness of the
request, since it makes explicit that B can do as he pleases. The same
contrast is found between Can you tell me? and Could you tell me?, and
between Will you tell me? and Would you tell me? In each case, the
conditional request should be the more polite of the two.
The second factor is negativity, the difference between can and can't
and between will and won't. The literal question Can you tell me? doesn't
express any opinion pro or con about what the answer is likely to be. Can't
you tell me?, however, does (Bolinger, 1975, pp. 528-529). In some
contexts, it indicates that A expects a yes answer, supposing that B really
can tell her the information. This is the so-called conducive reading. In other
contexts, it indicates that A supposes that B cannot tell her the information
and what she is questioning is whether or not her supposition is correct. This
is the so-called plain reading. Either interpretation should be costly to B. The
first presumes on B since it indicates that A already knows what his answer
will be. And the second expresses a negative opinion about B - he doesn't
have the ability to tell her the wanted information. Similar arguments go
through for Will you tell me? and Won't you tell me? In both pairs, the
negative should lead to less politeness.
The final factor is strength. Compare I will go and I want to go. Although
they differ in other ways too, they differ in the strength of the implied
desire to go. Will indicates an intention to go; want indicates a more positive
desire. For A to ask B to want to tell her something is therefore to ask for
a stronger commitment. Since that is more costly to B, Do you want to tell
me? should be less polite than Will you tell me? Also, there is a difference in
strength of imposition implied between Would you mind? and Would it be
too much trouble? With the first, A doesn't suggest that her imposition on B
is very great, whereas with the second, she does - it may be too much
trouble. Since the second benefits B more than the first, it should be more
polite.
These predictions assume requests among peers who are acquainted but
not intimate. Among other people, the same factors should come into play
but with different consequences. It would be very odd for a general to ask a
private May I ask you what time it is? That would put the general in an
inferior position that is inconsistent with his rank. The literal meaning still
benefits B. It is just that it is inappropriate for a general to defer to a private.
This suggests that politeness, as defined by costs and benefits, can be studied
somewhat independently of appropriateness, whether or not it is appropriate
to be so polite, or impolite. In this paper we will avoid this complication and
stick to politeness among acquainted but not intimate equals.

In all our experiments we used Stanford University undergraduates, who are
drawn from all over the United States. While there may be dialectal variations
in the phenomena we are studying, our data should be fairly representative
of middle class American speech. In any case, our general conclusions,
especially those about comprehension, shouldn't be affected by any
variations that do exist.

Experiment 1

Method
Thirty Stanford University undergraduate students rated the politeness
of 54 requests, three of each of the 18 types of requests in Table 1.
The 54 sentences used each requested different information. The information
was ordinary but fictitious everyday information of a relatively
simple kind about who someone was, what something was, or where or
when something happened. There was one each of these three kinds of
content for each of the 18 types of requests. Examples: May I ask you where
you bought your jacket? and Did you tell me who went to the party last
night? These 54 requests were typed in random order, 18 to a page, on three
mimeographed sheets, which were stapled in random order for each student.
The students wrote their ratings next to each request.
The students were instructed to rate each request on the following scale:
1 - very polite; 2 - fairly polite; 3 - somewhat polite; 4 - neither polite
nor impolite; 5 - somewhat impolite; 6 - fairly impolite; and 7 - very
impolite. They were either paid $2.50 or given credit for a course
requirement, and were the same students who participated in Experiment 4. They
completed Experiment 4 first and then Experiment
1, all within an hour.
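
The assembly of the rating booklets described in this Method section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not say how the random orders were produced, and the function names, the seeds, and the labels for the three kinds of content are hypothetical. The sketch types the 54 requests once in a random order, 18 to a sheet, and then shuffles the order of the three sheets separately for each student.

```python
import random

N_TYPES = 18                                   # request types in Table 1
CONTENT_KINDS = ("who", "what", "where/when")  # hypothetical labels: one request of each kind per type
PER_SHEET = 18

def typed_sheets(seed=0):
    """Type the 54 requests once, in random order, 18 to a sheet (seed is an assumption)."""
    rng = random.Random(seed)
    requests = [(t, kind) for t in range(1, N_TYPES + 1) for kind in CONTENT_KINDS]
    rng.shuffle(requests)
    return [requests[i:i + PER_SHEET] for i in range(0, len(requests), PER_SHEET)]

def booklet_for(student_id, sheets):
    """Staple the same three sheets in a random order for one student."""
    rng = random.Random(student_id)   # seeding by student id is an assumption
    order = list(sheets)              # copy, so the typed sheets stay untouched
    rng.shuffle(order)
    return order

sheets = typed_sheets()
booklet = booklet_for(student_id=7, sheets=sheets)
print(len(booklet), [len(page) for page in booklet])
```

Every student sees the same 54 items in the same within-sheet order; only the order of the sheets differs from student to student.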
Results
The ratings of politeness turned out very much as predicted. This can be seen
in Table 2, which lists the mean rating for each type of request and for each
category. These means were submitted to an analysis of variance in which
both subjects and items were random effects (Clark, 1973). It showed that
the means differed reliably from one another, F (17,71) = 15.66, p < 0.001.
The mean ratings for the six categories of requests were expected to order
themselves from permission to obligation, and except for a minor reversal,
they did: 2.16, 3.04, 3.85, 3.80, 4.20 and 5.77. These ratings are
significantly correlated with the predicted rank order (Abelson & Tukey, 1963),
F (1,71) = 166.08, p < 0.001. The predicted rank order accounts for 57% of
the variance among the 18 means. If instead of taking all the means we
consider only the two most polite forms within each category, the ordering
is still as predicted, except for a different minor reversal: 1.94, 3.04, 2.92,
3.50, 3.82, and 5.77.
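
The category means quoted above can be recomputed from the request-type means listed in Table 2. The sketch below does that and also computes the mean of the two most polite forms in each category. As a rough summary of how closely the observed order follows the predicted one, it adds a Spearman rank correlation; this is a swapped-in illustration, not the Abelson and Tukey (1963) contrast used in the paper. The variable and function names are hypothetical, and the single obligation request is entered as 5.77, the category mean reported in the text.

```python
from statistics import mean

# Mean ratings per request type, transcribed from Table 2
# (1 = very polite, 7 = very impolite). The one obligation request is
# taken as 5.77, the category mean given in the text.
TABLE_2 = {
    "Permission": [2.00, 1.87, 2.62],
    "Imposition": [3.31, 2.77],
    "Ability":    [3.22, 2.63, 5.58, 3.98],
    "Memory":     [3.48, 3.51, 3.99, 4.24],
    "Commitment": [4.24, 3.39, 4.41, 4.76],
    "Obligation": [5.77],
}

def category_means(table):
    """Mean rating of all request types in each category."""
    return {name: mean(values) for name, values in table.items()}

def top_two_means(table):
    """Mean of the two most polite (lowest-rated) forms in each category."""
    return {name: mean(sorted(values)[:2]) for name, values in table.items()}

def spearman_rho(xs, ys):
    """Spearman rank correlation for two sequences without ties."""
    def ranks(seq):
        order = sorted(range(len(seq)), key=lambda i: seq[i])
        result = [0] * len(seq)
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))

means = category_means(TABLE_2)
top_two = top_two_means(TABLE_2)
predicted_order = list(range(1, len(TABLE_2) + 1))  # permission ... obligation
rho = spearman_rho(predicted_order, list(means.values()))

for name in TABLE_2:
    print(f"{name:<11} all: {means[name]:.2f}   two most polite: {top_two[name]:.2f}")
print(f"Spearman rho (predicted vs. observed order) = {rho:.3f}")
```

The only departure from the predicted order in the full set of means is the ability and memory pair, which is the minor reversal mentioned in the text.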

Table 2.  Mean politeness ratings for 18 types of requests (Experiment 1)

Category      Request type                 Mean    Category mean

Permission    May I ask you?               2.00    2.16
              Might I ask you?             1.87
              Could I ask you?             2.62

Imposition    Would you mind?              3.31    3.04
              Would it be too much?        2.77

Ability       Can you tell me?             3.22    3.85
              Could you tell me?           2.63
              Can't you tell me?           5.58
              Do you know?                 3.98

Memory        Have I already asked you?    3.48    3.80
              Did I ask you?               3.51
              Have you told me?            3.99
              Do I know?                   4.24

Commitment    Will you tell me?            4.24    4.20
              Would you tell me?           3.39
              Won't you tell me?           4.41
              Do you want to tell me?      4.76

Obligation    Shouldn't you tell me?       5.77    5.77

Note - 1 is very polite, and 7 is very impolite.

The three subsidiary predictions were also generally upheld. Conditional
modal verbs raised politeness an average of 0.54 units, F (1,71) = 5.87,
p < 0.001. The increase was 0.17 units for may/might, 0.59 units for
can/could, and 0.85 units for will/would. As for negativity, an added negative
lowered politeness an average of 1.26 units, F (1,71) = 23.32, p < 0.001.
The decrease was 2.36 units for can/can't, although only 0.17 units for
will/won't, so this finding isn't nearly as consistent. Finally, strength was
important. Will you? was 0.50 units more polite than Do you want?, and
Would it be too much trouble? 0.54 units more polite than Would you
mind?, together F (1,71) = 4.06, p < 0.05. If we combine the rank order of
the six categories, conditionality, negativity, and strength, we account for
80% of the variance among the 18 means with only 4 degrees of freedom.
The variance left over, however, is sizable and significant, F (13,71) = 7.04,
p < 0.001, suggesting that we haven't identified all of the factors that affect
politeness.
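
As a small worked check, the negativity contrast can be recomputed from the Table 2 means. The sketch below is illustrative only; the dictionary and function names are hypothetical, and it covers just the two affirmative/negative pairs named in the text.

```python
# Request-type means transcribed from Table 2 (1 = very polite, 7 = very impolite).
TYPE_MEANS = {
    "Can you tell me?": 3.22,
    "Can't you tell me?": 5.58,
    "Will you tell me?": 4.24,
    "Won't you tell me?": 4.41,
}

# (affirmative form, negative form) pairs for the negativity contrast.
NEGATIVITY_PAIRS = [
    ("Can you tell me?", "Can't you tell me?"),
    ("Will you tell me?", "Won't you tell me?"),
]

def paired_differences(pairs, means):
    """Rating of the second form minus the rating of the first, per pair.

    Higher ratings mean less politeness, so a positive difference says
    that the second form was judged less polite than the first.
    """
    return {f"{a} -> {b}": means[b] - means[a] for a, b in pairs}

diffs = paired_differences(NEGATIVITY_PAIRS, TYPE_MEANS)
average_drop = sum(diffs.values()) / len(diffs)

for label, diff in diffs.items():
    print(f"{label}: {diff:+.2f}")
print(f"average drop in politeness: {average_drop:.3f}")
```

The same helper can be applied to any other pair of forms in Table 2 by extending the two data structures.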
Discussion
The costs and benefits theory of politeness is strongly supported by these
results. It says that the more A's request benefits B, within limits, the more
polite A is. On this basis we identified six broad categories of requests, and
they were ordered in politeness as predicted. And we identified three other
factors that should affect politeness - conditionality, negativity, and
strength - and they turned out roughly as predicted.
But are these requests understood by an idiomatic process, or by a
multiple-meaning process? About this question, the results are less clear. At
first, they appear to offer incontrovertible evidence for a multiple-meaning
process. Since all 18 requests have the same indirect meaning, by an
idiomatic process they should be identical in politeness. Since they weren't,
they must have been handled by a multiple-meaning process. This makes
good sense. To judge politeness, people had to figure out the costs and
benefits of each request. These were present only in the literal meaning, and
so people must have computed both meanings.
The idiomatic processes could be saved, however, if we assumed that the
18 requests weren't really identical in their indirect meanings. We could
assume, rather, that each request had an indirect meaning with two parts:
I request you to tell me where Jordan Hall is and I am hereby being
polite to degree p. Each request in Table 1 would have a different politeness
value p conventionally associated with it. This p would be conventional
in the sense that it would be a permanent value associated with the request's
form itself and would not be computed from the literal meaning. Crudely
put, May I ask you? would have a p of 2.00, and Do I know? a p of 4.24.
When people judged politeness, they would merely retrieve these p's and
select the corresponding scale values. In this view, the politeness of each
request is conventional. It is retrieved, not computed, each time the request
is understood.
The mystery in this position is why there is such a tight fit between the
benefits and costs implied by the literal meaning and the conventional
politeness values, the p's. The fit could hardly have come about by accident.
One explanation might be historical. At one time, people computed the
politeness of May I ask you? from its literal meaning, just as the theory
claims. Over the years, however, its particular value, say 2.00, became
dissociated from the literal meaning and began to be learned as a conventional
and therefore arbitrary value. This is not entirely implausible. Morgan (1978)
has traced just such a historical process for such expressions as goodbye, and
Clark and Clark (1979) have done so for such denominal verbs as in to
boycott grapes.

There are at least two problems with this historical explanation. First, the
fit between literal meaning and politeness seems altogether too tight. In the
cases Morgan, and Clark and Clark, brought up, there were certain quirks
of meaning. As the meaning of an expression became partially or fully
dissociated from its historical origins, it became partly or fully specialized, or it
changed altogether. There is little evidence of that sort of specialization in
the requests of Table 1.
The more serious drawback is that there would have to be too many p's.
For an idiomatic process to work right, May I ask you? would have to have a
lower p than Won't you? regardless of context. Yet, as offers, May I ask
you to take a piece of cake? appears to be less polite than Won't you take a
piece of cake? If this is so, May I ask you? would require one p for its use as
a request and another p for its use as an offer. Each of the other forms
would have two p's too. By the multiple-meaning hypothesis, on the other
hand, this inversion is quite predictable. Requests are for things B didn't
intend to do, and offers, for things B wants to do, so it is more imposing on
B the more obligated he is to carry out a request, but less imposing the
more obligated he is to accept the offer. It is more parsimonious to assume
that the politeness of these forms is based on the relation between the literal
meaning and what is being requested or offered. By this argument, a
multiple-meaning process is necessary after all.

The politeness of responses

Just as there are many ways of making requests, so there are many ways of
responding to them. For A's request Can you tell me the time?, B could
respond in any of these ways, among others: six; six o'clock; it's six; it's
six o'clock; yes, six; yes, it's six; sure, it's six; and yes, I can, it's six. How
does B choose? One way is by the seriousness of A's literal meaning (Clark,
1979). If B understands A to have intended the literal meaning of her
request to be taken seriously, then to be cooperative he should include a
literal move such as yes or sure or yes, I can. If the literal meaning was
intended merely pro forma, he needn't include such a move. Another way
is by how polite he wants to be. Some of these responses seem more polite
than others. These differences, we propose, reflect the costs and benefits
theory of politeness as applied to responses. The more B's response raises the
benefits or lowers the costs to A, within limits, the more polite B is. The
question is how A is benefitted by B's response.
We propose an attentiveness hypothesis: The more attentive B is to all
aspects of A's request, within reason, the more polite B is. For indirect
requests for information, there are at least four ways B can benefit A.
(1) Precision: B should provide the requested information as precisely as
required. In the time example, It's six would be more polite in most
contexts than It's late afternoon. (2) Clarity: B should express the requested
information clearly. It's six o'clock, for example, is clearer without being
unnecessarily wordy or redundant than Six, where ellipsis could interfere
with A's comprehension of the information. (3) Completeness: B should
take seriously the literal meaning, as well as the indirect meaning. Ordinarily,
that means including a literal move, making Yes, it's six more polite than a
mere It's six. Other times, including a literal move may lead to less
politeness, as we shall show. (4) Informality: B should put A at ease by not being
too formal, or too informal, for the occasion. In casual conversations among
acquainted peers, Sure, it's six might well be more polite than Yes, it's six.
B should ordinarily be much less polite when he doesn't comply with A's
request. To be attentive to A's request is, ideally, to comply with it. There
are, however, several ways in which B can mitigate the negative consequences
of not complying. (5) Apologies: B should apologize for not complying. In
the time example, I'm sorry, I can't would be more polite than a simple
I can't. (6) Explanations: B should explain why he is not complying.
Responses that contain a good reason, like I can't, I don't have a watch,
would be more polite than ones without, like I can't. Apologies and
explanations benefit A in different ways. Apologies place B in a deferential position
and give A the benefit of increased status. Explanations tell A that B isn't
refusing to comply merely to snub, put down, or otherwise do in A.
Explanations lower the cost to A of B's refusal.
Experiments 2, 3, and 4 test several aspects of the attentiveness hypothesis.
Experiment 2 explores the range of factors involved, while Experiments
3 and 4 examine more closely how politeness is related to literal meaning.

Experiment 2

Method
Students were asked to rank order for politeness three to five alternative
responses to each of eight requests. The eight requests are shown in Table 3.
For each we composed two sets of three to five responses. One set consisted
of compliant responses, and the other set of refusals to comply. These sets
are also listed in Table 3. In composing the responses we tried to find ones
that sounded as natural as possible.
We constructed
two different
questionnaires.
Each one contained
the
eight requests typed four to a page in random order on two mimeographed
sheets. Under each request were three to five responses also in random order.
For one questionnaire,
four of the requests were followed by compliant
responses, and the other four by non-compliant
responses. For the other
questionnaire,
that assignment wasreversed. For each response set separately,

Table 3. Mean politeness ranks for alternative responses to indirect requests
(Experiment 2)

Request and responses                                     Mean rank

1. Can you tell me who the guest speaker will be?
     Yes, it's Tom James.                                   1.63
     Yes, I can. It's Tom James.                            1.94
     It's Tom James.                                        2.56
     Tom James.                                             3.75

     No, I'm sorry, I can't. I don't know.                  1.07
     No, I can't. I don't know.                             1.93
     I don't know.                                          3.07
     No.                                                    3.93

2. Can you direct me to the Lost and Found?
     Certainly. It's around the corner.                     1.13
     Yes, I can. It's around the corner.                    2.00
     Yes. It's around the corner.                           2.87
     It's around the corner.                                4.00

     No, I'm sorry, I can't.                                1.00
     No, I can't.                                           2.00
     No.                                                    3.00

3. Can you lend me $5.00?
     Sure, here.                                            1.81
     Yes, I can. Here it is.                                2.19
     Yes, here it is.                                       2.31
     Here it is.                                            3.94
     Here.                                                  4.75

     Sorry, I don't have any money.                         1.60
     No, I'm sorry, I can't.                                1.60
     No, I can't.                                           2.93
     No.                                                    3.87

4. Could you tell me who will be here for dinner tonight?
     Sure, Tom and Janet.                                   1.67
     Yes, I could. Tom and Janet.                           2.27
     Yes, Tom and Janet.                                    2.33
     Tom and Janet.                                         3.73

     No, I'm sorry. I couldn't.                             1.25
     No, I couldn't.                                        1.94
     No.                                                    2.81

5. Could you tell me what time you close?
     Yes, I could. We close at 9:00.                        1.87
     Yes, at 9:00.                                          2.07
     We close at 9:00.                                      2.07
     9:00.                                                  3.80

     No, I don't know.                                      1.13
     No, I couldn't.                                        2.00
     No.                                                    2.88

6. Would you tell me your name?
     Yes, my name is Sheila King.                           1.40
     Yes, I am Sheila King.                                 1.87
     Sheila King.                                           2.13

     No, I wouldn't.                                        2.00
     No, I won't.                                           2.00
     No.                                                    2.06

7. Would you mind telling me where the bathroom is?
     No, not at all. It's around the corner.                1.07
     Sure, it's around the corner.                          2.20
     No, it's around the corner.                            2.93
     It's around the corner.                                3.80

     No, I'm sorry. I don't know where it is.               1.06
     No, I don't know where it is.                          2.19
     I don't know where it is.                              2.81
     No.                                                    3.94

8. Do you have the time?
     Yes, I do. It's 6:10.                                  1.69
     Sure, it's 6:10.                                       1.81
     Yes, it's 6:10.                                        2.50
     It's 6:10.                                             3.81

     No, I'm sorry, I don't.                                1.07
     No, I don't.                                           2.07
     I don't.                                               3.33
     No.                                                    3.53

the students ranked each response for politeness by writing 1 next to the most polite response, 2 next to the next most polite response, and so on down to, at most, 5. They were not to give ties. One questionnaire was completed by 15 students and the other by 16 students, all Stanford University undergraduates who were either paid or given course credit. The task took less than 15 minutes.

Results
The mean rank for each response is shown in Table 3. Within each set the responses are listed from most to least polite. The differences within each set were tested by the Friedman analysis of variance by ranks (Siegel, 1956). Of the 16 analyses, 14 were significant at the 0.001 level and one at the 0.01 level. The only set not significant was the set of noncompliant responses to Would you tell me your name? We will take up the most robust of these findings without further statistical justification and leave the more subtle comparisons to Experiments 3 and 4.
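When ties are not allowed, the Friedman statistic can be recovered from the mean ranks alone. The following is a minimal sketch for illustration, not the authors' computation: the function name is hypothetical, and the group size of 15 is an assumption, since the text reports groups of 15 and 16 students without saying which group ranked which response set.

```python
# Sketch: Friedman chi-square from published mean ranks (untied ranks assumed).

def friedman_chi_square(mean_ranks, n_raters):
    """Friedman statistic for k responses ranked by n_raters judges."""
    k = len(mean_ranks)
    # Rank sums are the mean ranks scaled back up by the number of judges.
    rank_sums = [n_raters * m for m in mean_ranks]
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 * sum_sq / (n_raters * k * (k + 1)) - 3.0 * n_raters * (k + 1)

# Compliant responses to "Can you direct me to the Lost and Found?" (Table 3).
lost_and_found = [1.13, 2.00, 2.87, 4.00]
chi2 = friedman_chi_square(lost_and_found, n_raters=15)  # n = 15 is assumed

# Standard chi-square critical value for df = 3 at the 0.001 level.
CRITICAL_DF3_P001 = 16.27
print(f"chi-square(3) = {chi2:.2f}; significant at 0.001: {chi2 > CRITICAL_DF3_P001}")
```

Under these assumptions the statistic for this set falls well beyond the 0.001 criterion, in line with the pattern of significance reported for most of the 16 analyses.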
The factor of completeness turned out to be highly influential. The compliant responses were of two types. The first, called answer-plus-information responses, included a literal move like Sure or Yes, I can or Certainly, and the second type, called information-only responses, did not. The answer-plus-information responses averaged 1.98 ranks, and the information-only responses 3.54 ranks, suggesting that the literal move added a full 1.56 ranks' worth of politeness. Its influence appears even more substantial if we compare wherever possible each answer-plus-information response with the information-only response that was identical in every respect except for the lack of the literal move. Then the literal move added 1.66 ranks' worth of politeness. Within each response set, every answer-plus-information response was ranked more polite than every information-only response, except for one tie.
Clarity was an important factor too. This can be seen first in the information-only responses. They were sometimes expressed as complete sentences, like It's Tom James, and sometimes in elliptical sentences, like Tom James. For Requests 1, 3, and 5, where these two forms could be compared, the complete responses were judged more polite by an average of 1.24 ranks. Clarity also showed up in the literal moves. They were sometimes expressed as full answers, like Yes, I can, and other times as half answers, like Yes. For 12 of the response sets, there were pairs of responses that differed only in whether they contained full or half answers. In all 12 sets, the full answer was judged more polite than the half answer. The average difference in ranks was 0.58.
Another factor, informality, showed up too. Among the compliant responses, the literal move sometimes contained Yes and other times the less formal Certainly or Sure (see Clark, 1979, Experiment 2). Three pairs of responses differed in this respect alone, and for each the more informal response was more polite. Informality won out by an average of 1.02 ranks.
In the refusals the additional factors of apologies and explanations were both influential. There were six pairs of responses that differed only in that one contained the apology I'm sorry. For all six pairs, the apologetic response was more polite, an average difference of 1.00 ranks. As for explanations, every response with an explanation was rated more polite within its set than every response without one. Note that the full literal moves are often explanations themselves. For Can you direct me to the Lost and Found?, the response No, I can't explains briefly that B doesn't have the requisite ability. This response was more polite than the simple No, which can readily be taken as a refusal even to consider the request. In five such comparisons, the explanatory responses were always more polite, and by an average of 1.03 ranks. When the two other pairs of responses with and without explanations are included in this comparison, explanations had an edge of 1.25 ranks.
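The paired comparisons in this section are simple differences between mean ranks in Table 3. As an illustration, the sketch below recomputes the apology effect. The choice of pairs is an assumption: they are the six refusal pairs in Table 3 that appear to differ only in the presence of I'm sorry, and the dictionary labels are ours.

```python
# Sketch: average politeness gain from an apology, using Table 3 mean ranks.
# Each entry is (mean rank with "I'm sorry", mean rank without it).
# Lower rank = more polite. Pair selection is an assumption (see lead-in).
APOLOGY_PAIRS = {
    "1. guest speaker": (1.07, 1.93),
    "2. Lost and Found": (1.00, 2.00),
    "3. lend $5.00": (1.60, 2.93),
    "4. dinner tonight": (1.25, 1.94),
    "7. bathroom": (1.06, 2.19),
    "8. the time": (1.07, 2.07),
}

# Positive difference means the apologetic refusal was ranked more polite.
differences = [without - with_apology
               for with_apology, without in APOLOGY_PAIRS.values()]
mean_difference = sum(differences) / len(differences)

print(f"apology advantage in every pair: {all(d > 0 for d in differences)}")
print(f"mean advantage: {mean_difference:.2f} ranks")
```

With these pairs the mean advantage comes out at the 1.00 ranks reported in the text.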

Discussion
The attentive response, these data tell us, is a polite response. For Can you tell me what time it is?, B could reply simply Six. He will be more polite, however, if he: (1) makes his information clearer with It's six; (2) answers the literal question with Yes, or more clearly with Yes, I can; and (3) softens the formality of this literal answer with Sure. If he intends not to comply, he will be more polite if he: (4) apologizes with I'm sorry; and (5) gives an explanation with I don't have a watch. Each added move signals more concern with A's full request. Some of them are attentive to the indirect meaning, and others to the literal meaning.
If to be polite B has to be attentive to A's literal meaning, then he must be computing both the literal and the indirect meaning. He must be using a multiple-meaning process, not an idiomatic process. Is this conclusion justified? Not completely. It might be argued that just as there are conventional ways of making indirect requests, there are conventional ways of responding to them politely. The link between the two is historically based but by now entirely conventional. By this argument, B could be using an idiomatic process. However, in Experiment 1, we found reasons for doubting such an idiomatic hypothesis for indirect requests, and the same reasons should make us suspect the idiomatic hypothesis for responses. Experiments 3 and 4 were designed to dissect this argument more incisively.

Experiment 3

The politeness of a response need not work the same way for every indirect request. For example, while a literal move may add politeness for one indirect request, it may not do so for another. In this experiment we will take up two factors that should affect response politeness. We will use the 18 request types in Table 1.
The first factor is conventionality. Indirect requests, according to Clark (1979), Morgan (1978), and Searle (1975), differ in how conventionally they are used for making requests. Although Can you tell me the time? and Is your watch still working? can both be used in the right circumstances for requesting the time, the ordinary, usual, or conventional form for that purpose is Can you? and not Is your watch? These two indirect requests differ in conventionality, and so do the 18 requests in Table 1.
The politeness of a response should depend on conventionality. According to Clark (1979), the conventionality of an indirect request is one piece of information B uses in deciding whether or not to take that utterance as a request. Because Can you? is highly conventional as a request, B can be fairly confident that it is indeed being used to request the time and not merely to ask a question, and hence that he is expected to comply. By the attentiveness hypothesis, it would be impolite of him not to comply. But because Is your watch? is not conventional as a request, he cannot be so confident that it is being used as a request and that he is expected to comply. This utterance may not be a request at all, so it wouldn't be so impolite to answer it literally and do nothing more. The prediction, therefore, is this: The more conventional the indirect request, the more polite B is to provide the requested information. This prediction is tested in Experiment 3.
The second factor is the politeness of the literal move of the response. For each request in Experiment 2, a response with a literal move (e.g., Yes, I can) was more polite than a response without. But how much politeness should a literal move add? That depends, we propose, on what the literal move asserts. Compare Can you tell me? and May I ask you? from Table 1. In response to the first, the literal move Yes, I can is really an abbreviation of the assertion I can tell you where Jordan Hall is. In response to the second, the literal move Yes, you may is an abbreviation for You may ask me where Jordan Hall is. Of these two assertions, the first would ordinarily be more polite among peers. The second presumes B has the authority to permit or forbid A's asking where Jordan Hall is, whereas the first doesn't presume much at all. When the literal moves to the 18 requests in Table 1 are each spelled out this way, they will vary in how polite they are judged as assertions. We propose that the more polite the assertion, the more politeness that literal move should add to the response as a whole. This prediction is also tested in Experiment 3.
Experiment 3 is therefore divided into three parts. In Experiment 3a, people were asked to rate the 18 requests in Table 1 for conventionality. In Experiment 3b, other people were asked to rate the assertions corresponding to the literal moves in responses to these same requests for politeness. And in Experiment 3c, still other people rated the full responses themselves for politeness.

Experiment 3a
The 18 requests in Table 1 were each typed on a separate file card with Candlestick Park in place of Jordan Hall. The deck of cards was shuffled and presented to each of ten Stanford University students with the instruction: "On each card there is a different way of asking where Candlestick Park is. Some of these requests represent usual, ordinary, and conventional ways of asking for information, while others represent ways that do not seem usual, ordinary, or conventional. We would appreciate your rank ordering these 18 requests from most to least conventional. Just put the cards in the order you think is most to least conventional."

Table 4. Mean ranks of 18 requests judged for conventionality

Category      Request type                  Mean rank   Category mean

Permission    May I ask you?                   8.6           8.2
              Might I ask you?                 8.5
              Could I ask you?                 7.6
Imposition    Would you mind?                  7.2           8.4
              Would it be too much?            9.6
Ability       Can you tell me?                 2.2           5.4
              Could you tell me?               2.5
              Can't you tell me?              13.3
              Do you know?                     3.8
Memory        Have I already asked you?       15.0          14.3
              Did I ask you?                  11.3
              Have you told me?               13.7
              Do I know?                      17.3
Commitment    Will you tell me?                6.8           8.8
              Would you tell me?               3.4
              Won't you tell me?              12.4
              Do you want?                    12.6
Obligation    Shouldn't you tell me?          15.2          15.2

Note - Rank 1 is most conventional, and rank 18 least conventional.

The mean ranks of the 18 requests are listed in Table 4. The student raters were highly consistent in their rankings. Kendall's coefficient of concordance W was 0.76, p < 0.001. There was an average rank order correlation of 0.73 between any two student raters.
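Both agreement statistics can be recovered from the mean ranks, because with untied rankings W depends only on how far the mean ranks spread around their grand mean, and the average pairwise rank correlation is a fixed function of W and the number of raters. The sketch below is an illustration under those standard formulas; the function names are ours, and the value 7.2 for Would you mind? is the one implied by that category's mean of 8.4.

```python
# Sketch: Kendall's W and the mean pairwise Spearman correlation,
# computed from the Table 4 mean ranks (untied rankings assumed).
TABLE_4_MEAN_RANKS = [
    8.6, 8.5, 7.6,            # permission
    7.2, 9.6,                 # imposition (7.2 implied by the category mean 8.4)
    2.2, 2.5, 13.3, 3.8,      # ability
    15.0, 11.3, 13.7, 17.3,   # memory
    6.8, 3.4, 12.4, 12.6,     # commitment
    15.2,                     # obligation
]
N_RATERS = 10

def kendalls_w(mean_ranks):
    """W = 12 * sum((mean rank - grand mean)^2) / (n^3 - n) for n ranked items."""
    n = len(mean_ranks)
    grand_mean = (n + 1) / 2
    squared_dev = sum((r - grand_mean) ** 2 for r in mean_ranks)
    return 12 * squared_dev / (n ** 3 - n)

def mean_pairwise_spearman(w, n_raters):
    """Average rank correlation between all pairs of raters, given W."""
    return (n_raters * w - 1) / (n_raters - 1)

w = kendalls_w(TABLE_4_MEAN_RANKS)
rho = mean_pairwise_spearman(w, N_RATERS)
print(f"W = {w:.2f}, mean pairwise rho = {rho:.2f}")
```

Run on the Table 4 values, this gives W = 0.76 and a mean pairwise correlation of 0.73, matching the figures reported above.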
The most conventional of the requests in Table 4 are Can you?, Could you?, Would you?, and Do you know?, in which the category of ability dominates. These requests are of middling politeness in Experiment 1. This suggests that even though these mean ranks correlate 0.51 with the politeness ratings of Experiment 1, conventionality is distinct from politeness. Recall that in Experiment 1 our hypothesis about the order of the six categories correlated 0.75 with politeness. Once that factor is partialled out, the correlation between conventionality and politeness is 0.28, which accounts for less than 8% of the variance. In short, conventionality appears to have a somewhat independent status.

Experiment 3b
Corresponding to the literal moves in the responses to the 18 requests in Table 4 are the 13 assertions in Table 5. As we stipulated in Experiment 3c, May I? and Might I? both had the literal move Yes, you may; Can you?, Could you?, and Can't you? all had Yes, I can; and Will you?, Would you?, and Won't you? all had Yes, I will. That is why there are five fewer assertions than requests. Each assertion was typed on a separate file card, and the deck was shuffled and presented to each of ten Stanford University students with these instructions: "On each card there is a different statement a person might make in the middle of an ordinary conversation. Some of these statements are polite things to say to someone in the middle of a conversation and others are not so polite. We would appreciate your rank ordering these 13 statements from most to least polite. Just put the cards in the order you think is most to least polite to say to someone in the middle of a conversation."
Table 5. Mean ranks of 13 assertions judged for politeness

Category      Assertion                                                   Mean rank

Permission    You may ask me where CP is.                                   10.5
              You can ask me where CP is.                                    9.6
Imposition    I wouldn't mind telling you where CP is.                       3.8
              It wouldn't be too much trouble to tell you where CP is.       1.5
Ability       I can tell you where CP is.                                    3.6
              I know where CP is.                                            6.1
Memory        You haven't yet asked me where CP is.                          6.1
              You didn't ask me where CP is.                                11.4
              I haven't told you where CP is.                                3.2
              You don't know where CP is.                                   12.1
Commitment    I will tell you where CP is.                                   1.6
              I want to tell you where CP is.                                7.1
Obligation    I should tell you where CP is.                                 7.8

Note - Rank 1 is most polite, and rank 13 least polite.

The mean ranks of the 13 assertions are listed in Table 5. The raters were highly consistent in their rankings. Kendall's coefficient of concordance W was 0.73, p < 0.001; there was an average rank order correlation of 0.70 between any two students.
These rank orders make good sense. The more an assertion benefits and doesn't cost A, the more polite it ought to be. So when B says that he has the ability to provide the wanted information, or that it wouldn't be difficult for him to do so, that should benefit A a great deal without any cost. These indeed were the two most polite categories. On the other hand, telling A that he intends to give the information regardless of her wishes, or that he is obligated to give it to her, or that she has his permission to ask him for it, or that she has forgotten to ask for it - all these cost A, and the assertions should be correspondingly less polite. Indeed, they were.

Experiment 3c

Method
Thirty students were each given 54 pairs of requests and responses and were asked to rate the politeness of each response on a 1 to 7 scale.
The 54 requests were the same as those used in Experiment 1, with three examples for each of the 18 types of requests in Table 1. For each request we composed three plausible responses. One had a full literal move followed by the requested information; a second had only a half literal move, either yes or no, whichever was appropriate for compliance; and a third consisted of the requested information alone. The three responses to Could I ask you who ate all the eggs? were: (1) Yes, you can. It was my boyfriend. (2) Yes. It was my boyfriend. (3) It was my boyfriend. These will be called the full, half, and null literal responses, respectively. As mentioned earlier, we used the indicative can, will, and may instead of the subjunctive could, would, and might for the literal moves, except for Would you mind? and Would it be too much trouble?, where we retained would.
The 54 responses each student rated consisted of one full, one half, and one null literal response to each of the 18 types of request in Table 1. The assignment of the full, half, and null responses to the 54 requests was counterbalanced in a Latin square design over three groups of ten subjects each. The 54 requests paired with their responses were typed in random order 18 to a page, the request on one line and its response on the next, and the pages were shuffled for each student.
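The counterbalancing described here can be pictured as a 3 x 3 Latin square applied within each request type. The sketch below shows one way such an assignment could be generated; the cyclic rotation rule and all names are assumptions made for illustration, since the text does not say which square was used.

```python
# Sketch: Latin square assignment of response types to request examples.
# 18 request types x 3 examples = 54 requests; 3 subject groups.
RESPONSE_TYPES = ("full", "half", "null")
N_REQUEST_TYPES = 18
EXAMPLES_PER_TYPE = 3

def latin_square_assignment(group):
    """Map (request_type, example) -> response type for subject group 0, 1, or 2.

    Rotating by the group index is a hypothetical choice of square.
    """
    return {
        (t, e): RESPONSE_TYPES[(e + group) % 3]
        for t in range(N_REQUEST_TYPES)
        for e in range(EXAMPLES_PER_TYPE)
    }

assignments = [latin_square_assignment(g) for g in range(3)]

# Each group rates 54 responses, with one full, one half, and one null
# response for every request type.
print(len(assignments[0]))
print(sorted(assignments[0][(0, e)] for e in range(EXAMPLES_PER_TYPE)))
```

Under this scheme every student sees each request type with all three response types, and every individual request is paired with each response type in exactly one group.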

The 30 students, Stanford University undergraduates, were told to think of each request as having been made by Speaker A and its response as having been made by Speaker B. They were to rate the politeness of B's response. They used the same rating scale as in Experiment 1, on which 1 was very polite, 4 neither polite nor impolite, and 7 very impolite.
Results
The politeness ratings came out much as predicted. They are listed in Table 6 by request type and response type. There are two main findings of interest, the differences among the request types and the politeness added by the literal move.
Table 6. Mean politeness ratings for responses to 18 types of requests
(Experiment 3c)

                                              Response type
Category      Request type                 Full    Half    Null    Means   Category means

Permission    May I ask you?               2.61    3.30    3.83    3.18        3.19
              Might I ask you?             2.80    2.90    3.63    3.11
              Could I ask you?             2.93    3.27    3.60    3.27
Imposition    Would you mind?              2.80    3.57    4.03    3.47        3.38
              Would it be too much?        2.70    3.20    4.00    3.30
Ability       Can you tell me?             2.53    3.30    3.90    3.16        3.31
              Could you tell me?           2.83    3.13    4.20    3.39
              Can't you tell me?           2.87    3.20    4.13    3.40
              Do you know?                 2.87    3.27    3.73    3.29
Memory        Have I already asked you?    3.17    3.57    4.30    3.68        3.68
              Did I ask you?               3.23    3.40    4.10    3.58
              Have you told me?            2.93    3.63    3.93    3.50
              Do I know?                   4.07    3.67    4.13    3.96
Commitment    Will you tell me?            2.90    3.17    3.60    3.22        3.27
              Would you tell me?           2.80    3.03    3.67    3.17
              Won't you tell me?           3.10    3.03    3.80    3.31
              Do you want?                 3.10    3.17    3.90    3.39
Obligation    Shouldn't you tell me?       3.27    3.33    4.10    3.57        3.57

Overall means                              2.98    3.26    3.92    3.38

As predicted, the mean response politeness for the 18 request types (column 4 in Table 6) correlated very highly with the mean conventionality for the same 18 requests (Table 4). The correlation was 0.72, min F'(1,76) = 19.40, p < 0.001. The variance in response politeness not accounted for by conventionality was not significant, min F'(16,76) = 1.13. Although the correlation between response politeness and request politeness (Table 2) was a moderate 0.42, when conventionality was partialled out, this correlation reduced to a negligible 0.09. There was virtually no correlation, 0.19, between response politeness and the politeness of the literal assertion (Table 5). The main predictor of response politeness was conventionality: the more conventional the request, the more polite it was for B to provide the wanted information.
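The partialling out mentioned in this paragraph follows the standard first-order partial correlation formula. The sketch below is illustrative: it assumes that the three zero-order correlations involved are the ones reported in the text, namely 0.42 (response politeness with request politeness), 0.72 (response politeness with conventionality), and the 0.51 from Experiment 3a (request politeness with conventionality). The function name is ours.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialled out of both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# x = response politeness, y = request politeness, z = conventionality.
# Which correlations entered the original analysis is an assumption.
r_partial = partial_correlation(r_xy=0.42, r_xz=0.72, r_yz=0.51)
print(f"partial r = {r_partial:.2f}")
```

With these inputs the partial correlation rounds to the 0.09 reported above, which suggests that most of the 0.42 is carried by conventionality.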
Overall, the half and full literal moves - for example, Yes and Yes, I can - each added politeness to the response with no literal move. The half literal moves added an average of 0.67 units, and the full literal moves another 0.29 units. Both increases were significant, min F'(1,75) = 16.91, p < 0.001, and 2.97, p < 0.05, respectively. These data reinforce Experiment 2 in showing that the more complete the literal move in general, the more polite the response.
The politeness added by the full literal move, however, varied from 0.06 units for Do I know? to 1.37 units for Can you tell me? and Could you tell me? As predicted, this variation was highly correlated with the politeness of the assertion made by the literal move (see Table 5). The correlation was 0.73, which is highly significant, F(1,17) = 19.39, p < 0.001. The conventionality of the request, however, was also moderately correlated, 0.43, with the increase in politeness from the literal move, F(1,17) = 3.48, ns. With both assertion politeness and conventionality as predictors, the multiple correlation is 0.81.
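For a single predictor, the F ratio attached to a correlation is a direct function of r and the error degrees of freedom. The sketch below applies that textbook relation to the 0.73 correlation, taking the 17 error degrees of freedom from the text as given; the function name is ours.

```python
def f_from_correlation(r, df_error):
    """F(1, df_error) for testing a single correlation: r^2 * df / (1 - r^2)."""
    r_squared = r * r
    return r_squared * df_error / (1 - r_squared)

# Correlation between added politeness and assertion politeness.
f_value = f_from_correlation(r=0.73, df_error=17)
print(f"F(1,17) = {f_value:.2f}")
```

This reproduces the reported F(1,17) = 19.39, and it also shows that the correlation accounts for roughly 53% of the variance in added politeness (0.73 squared).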
Which part of the full literal move accounts for these variations in added politeness - the affirmation or denial yes or no, or the elliptical assertion I can, You may, or whatever? Let us call these two parts yes/no and assertion fragment. The increase from the yes/no alone correlated a negligible 0.22 with assertion politeness. But the increase from the assertion fragment correlated 0.70 with assertion politeness. This correlation is only slightly less than the 0.73 correlation for the increase from the full literal move. The correlations for conventionality follow the same pattern, being 0.12 and 0.42, respectively. It is the assertion fragment, then, that seems to account for how much politeness is added by the full literal moves.
Discussion
According to these results, the politeness of responses to indirect requests fits the attentiveness hypothesis. First, the more conventionally a sentence is used for making requests, the clearer it should be that A wants certain information, and the more polite B should be to provide it. That was confirmed. For example, giving the requested information was more polite for the conventional Can you tell me? than for the less conventional Have I already asked you? Second, the more polite it is to assert what is literally being asked, the more polite it should be to add the literal move. This too was confirmed. Adding a pleasant Yes, I can in response to Can you tell me? increased politeness more than did adding an insulting No, you don't in response to Do I know?
Literal moves like Yes, I can and No, you don't, we noted, divide into two parts - the yes/no and the assertion fragment. It was largely the assertion fragment that governed how much politeness was added. There are two possible reasons for this. The most obvious is that I can and You don't are clearer than the bare yes or no about what B is asserting with the literal move. A less obvious reason is that yes and no alone may be ambiguous. Yes in response to Can you tell me? might indicate either Yes, I can tell you, which is the assertion fragment, or Yes, I'll tell you if you like, which is not. The second sense indicates a mere intention to comply, which shouldn't vary so much from one request to the next.
These findings implicate literal meaning even more than before. If B wants to respond to A's indirect request politely, he must hear at least the literal form of her request. Without that, he has no way of figuring out which literal move to include. But to account for Experiment 3, he must truly understand her literal meaning. He needs this in order to decide whether or not it would be polite to include the literal move. In short, he is required to use a multiple-meaning rather than an idiomatic process.

Experiment 4

What we have shown so far is that B's response to A's indirect request will ordinarily be judged more polite when it contains a literal move - a move that deals explicitly with the literal meaning of the request. How much politeness is added depends on what that move means as an assertion. But do people trying to make themselves polite think of using this device, the literal move? This was the question that led to Experiment 4, in which people were given a request together with a response with no literal move, like Do you know where Jordan Hall is? and Up the street, and were asked to revise the response - Up the street - to make it more polite. By examining these revisions, we could test certain hypotheses about the conventionality of the request, the politeness of the literal move, and the elliptical nature of the response.
For certain requests, B is expected to include the literal move. According to the Clark (1979) proposal, when A uses a conventional form for making a request, like Can you tell me?, she is very likely signalling that she doesn't intend the literal meaning to be taken seriously - it is merely pro forma - and so B isn't expected to deal with it explicitly. But when she uses a less conventional form, like Have I already asked you?, she may well intend the literal meaning to be taken seriously, and if B is to be polite, he ought to deal with it explicitly. This theory leads to a straightforward prediction: The less conventional the request, all other things being equal, the more likely B will take the literal meaning seriously and the more likely he will include the literal move.
But as we showed in Experiment 3, it isn't always so polite to include the literal move, since this may make B sound presumptuous or superior. It wouldn't be particularly polite to tell A that she doesn't know where Jordan Hall is, which is what the literal move for Do I know? would do. Accordingly, the more polite the literal move is, the more likely it should be included. But these considerations come into play when B is thinking of including the literal move anyway. That is, the predictions based on politeness of the literal move should merely modify the predictions based on conventionality that we just presented.
Finally, there is the ellipsis of the response. A complete sentence like It is up the street is ordinarily deemed more polite than an incomplete one like Up the street (see Experiment 2). If people trying to be polite know this, then they ought to turn incomplete sentences like Up the street into complete ones like It is up the street.
Method
Thirty Stanford University undergraduates were each given 54 requests paired with responses that provided only the information requested. Example:
A. Can you tell me where your parents are sitting?
B. They're in the front row.
For half the students, all of B's responses were expressed in complete sentences, as in this example. For the other half, all of them were expressed in fully appropriate but incomplete sentences, such as In the front row. The students were asked simply to revise each response to make it more polite and to write their revision on the blank line below B's response. The 54 requests were the same as those used in Experiments 1, 3a, and 3c. They were typed, in the format just given, six to a page on nine mimeographed sheets in random order, and the nine pages were given to each student in a random order.

Results and Discussion

The most obvious outcome was that there was an almost universal tendency to fill out the information requested. Fully 92% of the incomplete sentences given to the one group of students were turned into complete sentences. And although the complete sentences given to the other group of students could have been turned into perfectly acceptable incomplete sentences (by revising, for example, They're in the front row to In the front row), only 2% of them were. Indeed, the sentences for both groups of students tended to be filled out with material that was redundant with the request. Pronouns tended to be turned into complete noun phrases, as when They're in the front row was revised to My parents are in the front row, and missing verb phrases tended to be filled in, as when My roommate did was revised to My roommate cut my hair. There was a strong consensus that to be more polite, one should be clearer and more explicit about the information provided. Otherwise, the two groups of students didn't differ reliably, and so for the remaining discussion they will be lumped together.
Table 7. The most frequent literal moves and the percentage of people
supplying a literal move in responding to 18 types of requests (Experiment 4)

                                           Most frequent literal moves           Percentage
Category      Request type                 Half           Full                   literal moves

Permission    May I ask you?               Sure.          Yes, you may.               49
              Might I ask you?             Sure.          Yes, you may.               56
              Could I ask you?             Yes.           Yes, you can.               41
Imposition    Would you mind?              Not at all.    No, I wouldn't.             51
              Would it be too much?        Not at all.    Of course, it wouldn't.     82
Ability       Can you tell me?             Sure.          Sure I can.                 48
              Could you tell me?           Yes.           Yes, I can.                 33
              Can't you tell me?           Sure.          Sure I can.                 68
              Do you know?                 Yes.           Yes, I do.                  52
Memory        Have I already asked you?    No.            No, you haven't.            64
              Did I ask you?               No.            No, you didn't.             66
              Have you told me?            No.            No, I haven't.              61
              Do I know?                   Yes.           Yes, you do.                54
Commitment    Will you tell me?            Yes.           Yes, I will.                41
              Would you tell me?           Sure.          Sure, I could tell you.     48
              Won't you tell me?           Sure.          Sure, I'll tell you.        52
              Do you want?                 Sure.          Yes, I do.                  56
Obligation    Shouldn't you tell me?       Yes.           Yes, I should.              59

Although the bare responses presented to the students did not contain literal moves, many of their revisions did. Each of the 1620 revisions was checked for this feature, and the percentage for each request type is shown in Table 7. These percentages provide rather striking confirmation of our predictions. First, there was a 0.57 correlation between the percentages of literal moves in Table 7 and the conventionality ranks of each request type from Experiment 3a (Table 4). This correlation accounted for a highly significant proportion of the variance among the percentages in Table 7, F(1,42) = 11.72, p < 0.005. Second, there was a -0.24 correlation between these percentages and the politeness ratings of the corresponding literal moves from Experiment 3b (Table 5). This correlation, however, is spuriously low because of the correlation between conventionality and politeness themselves. With conventionality partialled out, as our prediction requires, the correlation between the percentages in Table 7 and the politeness ratings of the literal move rises to -0.50. This too accounts for a significant proportion of the variance, F(1,42) = 6.08, p < 0.05. The variance not accounted for by these two factors is not significant, F(15,42) = 1.23. In short, the less conventional the request, the more literal moves were added, and then the more polite the literal move, the more often it was added.
There was other evidence that the students were sensitive to the literal meanings of the requests, some of it so obvious that it hardly needs to be pointed out. In Table 7 are listed the most frequent half and full literal moves that turned up in the revisions. These show that the literal moves the students selected were selected because they were appropriate to the literal meanings of the requests. Consider the half moves first. Most of the requests - 13 of them - were answered with yes or sure. The five that were answered no were just the ones for which a negative answer was appropriate. And among these five, only Would you mind? and Would it be too much trouble? were provided with Not at all, which wouldn't have been appropriate as literal answers to the other three. Then consider the full moves. In them the use of can, may, will, do, didn't, haven't, wouldn't, and shouldn't was always appropriate to the literal question asked. May I ask you? was answered with you may and not I will, while Will you tell me? was answered with I will and not you may. Yet the auxiliary verb in the question - can, may, haven't, and the like - is not always appropriate for a literal move of compliance. Accordingly, Might I ask you? was answered with you may, not you might, and Would you tell me? with I will, not I would. The students didn't turn the literal questions into answers by a mechanical algorithm. They chose literal moves appropriate to what they intended to convey.


This conclusion is even more evident in the literal moves not listed in
Table 7. Consider those for the permission requests. Generally, it isn't
terribly polite to assert You may ask me where Jordan Hall is. To soften
its authoritarian tone, the students used marks of reassurance - of course,
certainly, and sure - fully 64% of the time. Nor is it very polite, for the
memory requests, to assert I haven't told you where Jordan Hall is. To
soften this move, the students often used such hedges as I may have
forgotten to, I don't think I have, and I'm not sure. These relieve the
implicit criticism that is otherwise conveyed by a bald no. For the imposition
requests, on the other hand, it is all right to assert It wouldn't be too much
trouble to tell you where Jordan Hall is, but even better to be more
insistent, as many students were in such moves as No trouble at all, Certainly
not, and Of course not. The critical point is that there are several ways of
hedging, softening, and strengthening literal moves, and they are not
interchangeable. Which way is appropriate depends on the meaning of that
particular literal move.
These findings argue even further for a multiple-meaning
process, since
the literal meaning of the request was used in so many ways. It was used
initially by the students in deciding whether or not to make a literal move.
Then it was used in selecting the right form of that move and in deciding
how to strengthen or soften that move appropriately.
It seems difficult to
account for this constellation
of decisions with a process that used the
indirect meaning and nothing more.

General Discussion
It is time now to draw out the three main threads that have been running
through these experiments: the politeness of requests, the politeness of
responses to requests, and understanding indirect requests.

The politeness of indirect requests

The politeness of an indirect request, we have argued, springs principally
from its literal meaning. The theory we have drawn on, Brown and Levinson's
face-work theory of politeness, predicts that a request is polite to the
extent that it increases the benefits, or lowers the costs, to B. The request
itself costs B something, since he is being asked to do something for A. A
can compensate by various symbolic means. She can subordinate herself to
B by asking permission to make her request, as in May I ask you? She can
offer B the authority to say that the request is too imposing, as in Would
you mind? She can give B the chance to say that he is unable to carry out
the request, as in Can you tell me? And so on. These devices are graded in
their costs and benefits, and their politeness follows suit.²
This neat picture is complicated by conventionality. If literal meaning
were the sole determinant of politeness, then Can you tell me? and Are you
able to tell me?, whose literal meanings are roughly synonymous, ought to
be equally polite. But they aren't. While both of them ask B whether or not
he has the ability to give the wanted information, Are you able to tell me?
signals that A more likely intends the question to be taken seriously and
expects B to respond with a literal move (Clark, 1979, Experiment 3). Its
literal meaning is a deliberate request for another piece of information,
which should cost B something. So Are you able to tell me? should be
slightly less polite than Can you tell me? Similar logic applies to the other
categories of request types too.
In an informal experiment similar to Experiment 1, we asked ten students
to rank order for politeness the following indirect requests (each of which
was completed with where Candlestick Park is):

1. May I ask you? (2.2)
2. Will you permit me to ask you? (3.4)
3. Would you mind telling me? (2.3)
4. Would you object to telling me? (4.7)
5. Can you tell me? (3.5)
6. Are you able to tell me? (4.9)
7. Shouldn't you tell me? (7.0)
8. Aren't you obligated to tell me? (8.0)

The mean ranks, shown in parentheses next to each request, confirm that
conventionality matters: 1 was more polite than 2; 3 more polite than 4;
5 more polite than 6, and 7 more polite than 8. For the last three pairs,
nine out of ten students agreed on the ordering; for the first pair, seven
out of ten did. As predicted, Can you? was more polite than Are you able?
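The pairwise comparisons above can be checked mechanically. The sketch below uses the mean ranks listed for the eight requests and, as an added illustration, a one-tailed sign test on the agreement counts. The authors report no significance test for this informal experiment, so the sign test and the helper names are assumptions made here, not part of the original analysis.

```python
from math import comb

# Mean politeness ranks for requests 1-8 (lower rank = more polite).
mean_rank = {1: 2.2, 2: 3.4, 3: 2.3, 4: 4.7, 5: 3.5, 6: 4.9, 7: 7.0, 8: 8.0}

# Each pair is (more conventional form, less conventional form).
pairs = [(1, 2), (3, 4), (5, 6), (7, 8)]


def conventional_is_more_polite(pair):
    """True if the conventional member of the pair has the lower mean rank."""
    conventional, unconventional = pair
    return mean_rank[conventional] < mean_rank[unconventional]


def sign_test_one_tailed(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5): chance of k or more agreements."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


all_pairs_confirm = all(conventional_is_more_polite(p) for p in pairs)
p_nine_of_ten = sign_test_one_tailed(9, 10)    # last three pairs
p_seven_of_ten = sign_test_one_tailed(7, 10)   # first pair

print(all_pairs_confirm, round(p_nine_of_ten, 4), round(p_seven_of_ten, 4))
```

On this reading, the ordering holds in all four pairs, while agreement by seven of ten students is, taken alone, much weaker evidence than agreement by nine of ten.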
So in the limited domain in which we have been working, politeness is
determined by at least two factors: (1) the literal meaning of the indirect
request, and (2) the seriousness with which that literal meaning was intended.
Although seriousness is determined in our last examples by how conventional

² The request forms we used, of course, can take on ironic, sarcastic, or even
impudent meanings when uttered in just the right contexts. In assuming requests
among acquainted peers, the students in our experiments appear also to have
assumed ordinary contexts in which the requests have their usual meanings. It is
an important question, however, when and how these requests take on ironic,
sarcastic, or impudent meanings.

the request is, it is more generally determined by a number of factors of
which conventionality is only one (Clark, 1979).

The politeness of responses

The politeness of a response to a request, we have argued, is governed by the
attentiveness hypothesis, which is itself derived from Brown and Levinson's
face-work theory. It is this: The more attentive B is to all aspects of A's
request, within reason, the more polite he is. The two main aspects he should
be attentive to are the indirect meaning and the literal meaning.
The indirect meanings we have examined have all been requests for
information, like I request you to tell me where Jordan Hall is. To be
particularly polite B should do these things. (1) Precision. He should give as
precise information as A requires, as in Up the street instead of Nearby. This
is a factor we didn't study. (2) Clarity. B should express this information
fully enough to be comprehended with certainty. Complete sentences like It's up
the street are generally more polite than incomplete ones like Up the street
(Experiments 2 and 4). On the same grounds, fully spelled out expressions,
as in Jordan Hall is up the street, are generally more polite than their
abbreviated forms, at least within reason (Experiment 4). (3) Seriousness. B
should be more certain to supply the wanted information the clearer it is
that A is making a request - that is, the more conventional a form the
request takes (Experiment 3). (4) Apologies. If B won't provide the
information, he should apologize, as with I'm sorry (Experiment 2). (5) Reasons.
If B won't provide the information, he should explain why (Experiment 2).
All these, and there are probably more, are ways B can show his concern
with what he is actually being requested to do.
It is the literal meaning that we have been most concerned with. When
A makes her request with, say, Do you know where Jordan Hall is?, she
literally means I ask you whether or not you know where Jordan Hall is.
To be particularly polite then, B should do these things. (1) Completeness.
He should deal explicitly with the literal meaning too, as in Yes, it's up the
street (Experiments 2 and 3). (2) Clarity. He should express this literal move
clearly, to show that he is explicitly responding to the literal meaning, as
in Yes, I do - it's up the street. (3) Seriousness. He should give the literal
meaning more attention, responding to it oftener, the more clearly A
intended it to be taken seriously, as when she uses a less conventional
form of request (Experiment 4). (4) Implications. Nevertheless, he should
make the literal move less often, or he should soften or hedge it more often,
the more it would cost A if he made it (Experiments 3 and 4). In response to
Do I know where Jordan Hall is?, he will be more polite if he omits the literal
move, as in It's up the street, or if he hedges it, as in Oh! I forgot to tell
you - it's up the street.
Clark (1979), in a study of indirect requests, proposed a model of how B
selects his response to a particular request. According to that model, B's
choice depends on how conventional the form of the request is, how
transparent what is being requested is, whether special markers like please
are present, how plausible the literal meaning is, and what A's plans and
goals are thought to be. The factors we have just introduced are meant to
complement this model.
Understanding indirect requests

What about understanding indirect requests? In the introduction we laid out
two broad classes of comprehension processes - the idiomatic processes,
which create the indirect meaning and nothing more, and the multiple-meaning
processes, which create both the literal and the indirect meaning.
The indirect meaning is computed in both types, so the question was
whether the literal meaning is computed. Mounting evidence suggests that it
is, at least in a significant proportion of situations.
The first evidence turned up in Experiment 1. There politeness varied
from request to request, not arbitrarily, but according to the literal meaning
as predicted by the face-work theory. It might be proposed, as an alternative,
that associated with the form of each request, as part of its indirect meaning,
there is a conventional value for politeness. This alternative isn't plausible
for several reasons. First, the fit between politeness and literal meaning
seems too exact. Second, offers that take the same form as our requests
appear to convey quite different amounts of politeness.
The rest of our evidence, in Experiments 2, 3, and 4, was that people
consistently took account of literal meaning in judging or composing responses
to indirect requests. In Experiment 2, they preferred as polite responses ones
that included literal moves. In Experiment 3, they generally preferred literal
moves that were explicit over ones that were incomplete - full over half
literal moves. However, they modulated these judgments by what the literal
moves - responding to the literal meaning - would actually mean when
asserted. In Experiment 4, to be polite, they created literal moves, but held
back on them, or hedged them, when they would exact too much cost from
the requester. In all three experiments, people kept close track not merely of
the literal form of the indirect request, but also of its literal meaning.
Not all of this evidence, however, seems to require a multiple-meaning
process on each and every occasion. In Experiment 4, it could be argued that
the revisions without literal moves - 45% of the total - were at least
sometimes composed by people who had not computed the literal meaning. On
these occasions, the requests were understood in the same idiomatic way we
suggested How do you do? is ordinarily understood.
The critical question for indirect requests, then, is under what conditions
could an idiomatic process be used. Such a process requires two things.
First, it requires the form of the indirect request to be conventional enough
to be recognized as a request. This requirement is satisfied by many indirect
requests (see Clark, 1979). Indeed, the same requirement
is needed in a
multiple-meaning
process to account for how seriously the literal meaning is
to be taken. Second, it requires that, on the occasion on which the request is
uttered, politeness and other things associated with the literal meaning do
not matter to the listener. For indirect requests, it isn't obvious whether
this second requirement is ever satisfied.
Politeness almost always matters - if only by default. In our experiments,
it mattered a great deal since that was what the students were asked to judge.
But in ordinary circumstances, it matters too. People appear to have strong
expectations in each kind of circumstance about the forms of request A
would ordinarily use. When asked for the time, for example, B might expect
the highly conventional Can you tell me the time?, which asks about his
abilities. When A uses a form he does not expect, regardless of how
conventional it is, he takes her as signalling, by her contrast in form, a
contrast in meaning. If she had used Would you tell me the time?, querying his
conditional intentions instead, he should see that she had perhaps expected him
to tell her the time and was wondering why he hadn't. Unlike the contrast in
meaning between the idioms Hi and How do you do?, the contrast here is
signalled by the difference in literal meaning. Our conjecture is this: Any
contrast with the default, or expected, form of request indicates a contrast
in meaning; if B is ever to recognize that contrast, it must be on the basis
of the literal meaning via a multiple-meaning process.
Even aside from politeness, highly conventional
forms of indirect requests
are not interchangeable
from one situation to the next. In asking B for his
middle name, for example, A could use the highly conventional
Could you
tell me your middle name? but not the equally conventional
Do you know
your middle name? The second request is odd because of its literal meaning,
which supposes that B might not know his middle name. There are probably
subtle contrasts like this between virtually any two indirect requests that
can be made in a particular circumstance.
To show that B uses an idiomatic
process in any of these circumstances,
we would have to show that he is
indifferent
to subtle distinctions
conveyed by the literal meanings - for
example, that he isn't stopped for even the slightest moment by the oddness
of Do you know your middle name? Such a hypothesis should be difficult to
prove.


Thus, the idiomatic processes, however promising they look at the outset,
should not be assumed too readily. In one field experiment (Clark, 1979,
Experiment 1), 50 merchants were telephoned and asked Could you tell me
the time you close tonight? Only four of them, or 8%, included a literal
move in their response. One might be tempted to conclude that the other
92% had used an idiomatic process. Yet in another field experiment (Munro,
1977), students on the UCLA campus were approached and asked Could you
tell me the time?, virtually the same request. Of these, 57% included a literal
move, presumably because the face-to-face situation led them to be more
polite. One might now be tempted to conclude that people use an idiomatic
process except when they anticipate they will have to be particularly polite.
But if politeness is an inherent part of every interchange of this sort, as it
seems to be, it is more parsimonious to conclude that people use a
multiple-meaning process regardless.

References
Abelson, R. P., & Tukey, J. W. (1963) Efficient utilization of non-numerical information in quantitative analysis: General theory and the case of simple order. Annals of Mathematical Statistics, 34, 1347-1369.
Bolinger, D. L. (1975) Aspects of language (2nd ed.). New York, Harcourt Brace Jovanovich.
Brown, P., & Levinson, S. (1978) Universals in language usage: Politeness phenomena. In E. Goody (Ed.), Questions and politeness. Cambridge, Cambridge University Press, pp. 56-324.
Clark, E. V., & Clark, H. H. (1979) When nouns surface as verbs. Lang., 55, 767-811.
Clark, H. H. (1973) The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. J. verb. Learn. verb. Behav., 12, 335-359.
Clark, H. H. (1979) Responding to indirect speech acts. Cog. Psychol., 11, 430-477.
Clark, H. H., & Lucy, P. (1975) Understanding what is meant from what is said: A study in conversationally conveyed requests. J. verb. Learn. verb. Behav., 14, 56-72.
Gibbs, R. W. (1979) Contextual effects in understanding indirect requests. Discourse Processes, 2, 1-10.
Goffman, E. (1955) On face-work: An analysis of ritual elements in social interaction. Psych., 18, 213-231.
Goffman, E. (1967) Interaction ritual: Essays on face-to-face behavior. Garden City, NY, Anchor Books.
Gordon, D., & Lakoff, G. (1971) Conversational postulates. In Papers from the Seventh Regional Meeting, Chicago Linguistic Society, pp. 63-84.
Green, G. M. (1975) How to get people to do things with words: The whimperative question. In P. Cole and J. L. Morgan (Eds.), Syntax and semantics, Vol. 3: Speech acts. New York, Seminar Press, pp. 107-141.
Heringer, J. (1972) Some grammatical correlates of felicity conditions and presuppositions. Working Papers in Linguistics (The Ohio State University), 11, 1-110.
Lakoff, R. (1973) The logic of politeness; or minding your p's and q's. In Papers from the Ninth Regional Meeting, Chicago Linguistic Society, pp. 292-305.
Lakoff, R. (1977) What you can do with words: Politeness, pragmatics, and performatives. In A. Rogers, B. Wall, and J. P. Murphy (Eds.), Proceedings of the Texas Conference on Performatives, Presuppositions, and Implicatures. Arlington, Va., Center for Applied Linguistics, pp. 79-105.
Morgan, J. L. (1978) Two types of convention in indirect speech acts. In P. Cole (Ed.), Syntax and semantics, Vol. 9: Pragmatics. New York, Academic Press, pp. 261-280.
Munro, A. (1977) Speech act understanding in context. Unpublished doctoral dissertation, University of California at San Diego.
Sadock, J. (1970) Whimperatives. In J. Sadock and A. Vanek (Eds.), Studies presented to Robert B. Lees by his students. Edmonton, Ill., Linguistic Research, Inc., pp. 223-238.
Sadock, J. (1972) Speech act idioms. In Papers from the Eighth Regional Meeting, Chicago Linguistic Society, pp. 329-339.
Sadock, J. (1974) Toward a linguistic theory of speech acts. New York, Academic Press.
Schweller, K. G. (1978) The role of expectation in the comprehension and recall of direct and indirect requests. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.
Searle, J. R. (1975) Indirect speech acts. In P. Cole and J. L. Morgan (Eds.), Syntax and semantics, Vol. 3: Speech acts. New York, Seminar Press, pp. 59-82.
Siegel, S. (1956) Nonparametric statistics for the behavioural sciences. New York, McGraw-Hill.

Résumé
Indirect requests can be formulated more or less politely. For example, Can you
tell me where Jordan Hall is? is more polite than Shouldn't you tell me where
Jordan Hall is? One theoretical approach proposes that the more the literal
meaning of the request implies personal benefits for the listener, within
reasonable limits, the more polite the request is. This prediction is confirmed
by Experiment 1.
Responses to indirect requests also vary in politeness. For Can you tell me
where Jordan Hall is?, the response Yes, I can - it's up the street is more
polite than It's up the street. An extension of the theory predicts that the
more attention the responder pays to all the meanings implied by the request,
the more polite the response is. Experiments 2, 3 and 4 confirm this prediction.
With this evidence, we propose that people compute both the direct and the
indirect meanings of indirect requests. This is necessary in order to recognize
when the speaker is or is not being polite, and in order to be able to respond
politely, impolitely, or neutrally.

Cognition, 8 (1980) 145-174
© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands

How big is big?*
Relative and absolute properties in memory

LANCE J. RIPS
University of Chicago

WILLIAM TURNBULL
Simon Fraser University

Abstract
Previous studies of semantic memory have overlooked an important distinction
among so-called property statements. Statements with relative adjectives
(e.g., Flamingos are big) imply a comparison to a standard or reference
point associated with an immediate superordinate category (a flamingo is big
for a bird), while the truth of statements with absolute adjectives (e.g.,
Flamingos are pink) is generally independent of such a standard. To examine
the psychological consequences of this distinction, we asked subjects in
Experiment 1 to verify sentences containing either relative or absolute
adjectives embedded in either predicate-adjective (PA) constructions (e.g., A
flamingo is big (pink)) or predicate-noun (PN) constructions (e.g., A flamingo
is a big (pink) bird), where the predicate noun was the immediate
superordinate. Reaction times (RTs) and errors for relative sentences decreased
when the superordinate was specified, but remained constant for absolute
sentences. These data also suggest that the truth value of relative sentences
depends, not just on the superordinate, but also on a more global standard
for everyday, human-oriented objects. Experiment 2 extends these results in
showing that ratings of the truth of relative sentences are a function of the
difference in size between an instance and its superordinate standard (e.g.,
between the size of a flamingo and that of an average bird) and the difference
between the instance and the standard for everyday objects. Experiment 3
replicated these findings using reaction time as the dependent measure.

*We would like to thank J. Angiolillo-Bent, F. Conrad, J. Galambos, G. Garvey,
M. Hickmann, J. Huttenlocher, G. Kahn, I. Lanin, S. Leehey, F. Lui, J. McCawley,
S. Schacht, and D. Stephens for their advice and assistance. We also acknowledge
support from National Science Foundation Grant No. BNS76-03377 and Public Health
Service Grant No. K02MH00236. Correspondence should be addressed to Lance Rips,
Department of Behavioral Sciences, University of Chicago, 5848 S. University
Avenue, Chicago, IL 60637.


Two major views have evolved about the way we remember the properties
of common objects. On one hand, most theories of semantic memory
(e.g., Anderson, 1976; Collins & Loftus, 1975; and Kintsch, 1974) represent
properties as unitary mental predicates. According to such theories, we are
able to recall property information - for example, that flamingos are pink -
because a predicate for pink is attached to the concept flamingo in long-term
memory. A predicate may not be stored directly with every concept to
which it applies, and in these cases, recall of the property may require
memory search. Nevertheless, the predicates themselves are atomic, having
no underlying semantic structure.
On the other hand, theories of mental comparison (e.g., Moyer, 1973;
Paivio, 1975) imply that property information is calculated rather than
simply stored and retrieved as a unit. For example, these theories claim that
in order to decide whether a flamingo is larger than an eagle, we compare
their respective values along a mental scale for size. While we could store
the complex predicate is-larger-than-an-eagle with flamingo, this possibility
is seen as unlikely for both empirical and theoretical reasons. (Consider the
enormous number of comparative statements we know to be true - flamingos
are also larger-than-turnips, larger-than-clothes-pins, and so on.) Although
some comparison theories allow relations to be stored intact (see Banks,
1977), property information is generally computed rather than pre-stored
(in the terminology of Smith, 1978).
There are several factors that could account for this difference in the way
properties are characterized. First, different constructions are involved since
in semantic memory experiments subjects are asked to verify sentences
containing simple (one-place) predicates such as Flamingos are pink, while in
mental comparison experiments subjects verify two-place relations (Flamingos
are larger than eagles). It may be that one-place predicates are pre-stored,
and two-place predicates computed. But a second, and perhaps more
important, difference is the type of properties that have been employed.
Semantic memory has focused on absolute adjectives (e.g., those denoting
color, such as pink), while mental comparison has employed relative
adjectives such as large (Katz, 1972, pp. 254-261).
To see why the relative/absolute distinction might be important, we begin
by describing some linguistic and logical differences between these adjective
types. Next, we consider possible psychological mechanisms for representing
this distinction. Finally, we report three experiments whose goal is to
examine these mechanisms. While the experiments are solidly in the semantic
memory tradition in using sentences of the form S-V-Adj, we explore the
differences between relative and absolute properties in Experiment 1 by
varying the adjective (e.g., Flamingos are pink versus Flamingos are large).
In Experiments 2 and 3 we show that symbolic distance effects, like those
found in studies of mental comparison, can also be obtained in a semantic
memory context.

Relative versus Absolute Adjectives


Large and pink are representative of two broad classes of English adjectives.
Relative adjectives include those like large, small, wide, narrow, tall, short,
thick, and thin that are based on an underlying ratio scale of physical
measurement (volume, width, height, and so on), as well as those like safe,
dangerous, strong, weak, happy, and sad that depend on ordinal judgments
(Huttenlocher & Higgins, 1971). Absolute adjectives, by contrast, describe
more qualitative properties of their referents such as color (red, blue, or
pink), shape (square, round, or triangular), physical composition (metallic,
wooden, or plastic), nationality (Chinese, African, or Canadian), and many
others. In discussing these adjective types, we will use large and small as
typical relative adjectives and color terms like pink and green as typical
absolute adjectives.
The distinction between relative and absolute adjectives appears most
clearly when we examine inferences from sentences that contain them. For
example, consider the logical relations between (1a) and (1b) and between
(2a) and (2b):

(1) a. A grasshopper is a large insect.
    b. A grasshopper is a large animal.
(2) a. A grasshopper is a green insect.
    b. A grasshopper is a green animal.

Despite the fact that grasshoppers are insects and insects animals, being a
large insect does not mean being a large animal; the attribution of a relative
property
like large does not automatically
generalize
to superordinates
(Vendler,
1968, p. 96). However, absolute adjectives like green do permit
generalization
of this sort, so that (2a) entails (2b) and, in addition, any
other sentence
in which a more inclusive superordinate
(e.g., object) is
substituted for the predicate noun.
One possible way to explain this difference is to assume that relative
adjectives convey an implicit reference to a norm or standard associated
with the modified noun. (This idea is traceable to Leibniz - see Wierzbicka,
1972 - and appears in the work of many modern semanticists, e.g., Bierwisch,
1967; Fillmore, 1971; Katz, 1972; Langford, 1942; Ross, 1930; Sapir, 1944;
Vendler, 1968). Large insect in (1a) means that the designated insect is larger
than some normal size for insects. Since what is the normal size for insects
will be different from the normal size of animals and other objects, a
creature that's a large insect (i.e., large for insects) may be small relative to
other animals or objects. For this reason, (1b) does not follow from (1a).
On the other hand, standards for absolute adjectives do not shift (or do not
shift as much) from one noun class to another. So while objects may be
more or less green, what is green about insects will be green with respect to
other things as well. This means that (2b) will be true on the basis of (2a).
For convenience in discussing relative adjectives, let's call the implicit
norm the reference point, and the associated category the reference
class with respect to the adjective in question. For example, in the phrase
large insect, insect provides the reference class and the normal size of insects
provides the reference point from which largeness is determined.
We note that the reference class is not always explicitly mentioned when a
relative adjective is used. To see how the reference class is determined in
such situations, we can examine the sentences in (3) and (4):

(3) a. This insect is small.
    b. Insects are small.
(4) a. This insect is a small insect.
    b. Insects are small animals.

When a singular term is the sentence subject, as in (3a), the noun itself provides the appropriate
reference class for the predicate adjective, so that (3a)
is synonymous
with (4a). This is not true, however, when the subject is an
unmodified
plural noun, as in (3b), as is clear from the fact that (3b) is not
equivalent to Insects are small insects. According to a proposal by Bierwisch
(1971) and Katz (1972), the appropriate
reference class in this situation is
the immediate
superordinate
of the subject. Assuming animal to be the
immediate superordinate
of insect, we predict that (3b) means the same as
(4b). Since this proposal for determining
the implicit reference class will be
important
in what follows, we will label it the Immediate
Superordinate
hypothesis.
This account of relative and absolute adjectives skims over many details
that would be required by a formal semantic theory (see R. Clark, 1970;
Cresswell, 1976; Kamp, 1975; Parsons, 1972; Wallace, 1972; and Wheeler,
1972, for attempts at such a theory). For example, it may be an
oversimplification to assume a strict dichotomy between relative and absolute
adjectives, since it is difficult to tell for many items in which class they
belong (extreme adjectives like gigantic and minuscule provide examples - see
Higgins, 1976; Huttenlocher & Higgins, 1971). Intuitively, there appears
to be a continuum between relative and absolute types (Miller &
Johnson-Laird, 1976, pp. 356-357). But while we believe that this intuition is
correct, there are nevertheless enough clear cases of relative and absolute
adjectives to allow us to pursue the differences between them. We return to
the more subtle borderline cases in the General Discussion where we can
bring some new facts to bear on them.

Relative Adjectives and Semantic Memory


There seem to be two ways to handle relative properties in semantic memory.
On one alternative, relative facts are pre-stored directly with the concepts
to which they apply; on the other, relative properties are not stored as such,
but are computed from more basic data. We need to explore these Pre-storage
and Computation theories in more detail, and in doing so, we will
pay particular attention to the way they account for verification of sentences
like Insects are small.
In the simplest type of Pre-storage model, concepts
like insect are
associated with a list of predicates denoting both relative properties (e.g.,
small, light-weight)
and absolute properties
(e.g., six-legged, egg-laying).
Verifying
sentences
like Insects are small or Insects are six-legged is
accomplished
by locating the corresponding
predicate in the property list
of the subject concept. However, in this unelaborated
form, the Pre-storage
model runs into problems in connection
with inference rules of the type
discussed above. As the pairs in (1) and (2) illustrate, absolute predicates
can participate
in inferences that relative predicates cannot, and we must
therefore
find a way to mark this relative-absolute
difference in order to
avoid drawing obviously fallacious conclusions like (1b). While there are
several ways to do this in a Pre-storage framework, we will simply assume
for the time being that predicates can be tagged as relative or absolute and
that the inference
rule embodied
in (1) and (2) contains a restriction
limiting its application to absolute predicates.
The main alternative to the Pre-storage idea is a Computational
model in
which no relative information is stored at all. Along these lines, let's assume
that concepts can include an indication of the normal size (height, weight,
etc.) of their instances (Smith, Rips, & Shoben, 1974). For example, insect
might be marked as having a size of, say, a quarter inch and animal as, say,
12 inches. We can then determine the truth of the sentence Insects are small
by retrieving the size of insect, retrieving the size of its immediate superordinate (animal), and comparing these two values. Of course, the indication
of size need not be numeric. Any analogue or digital quantity will do as
long as these quantities fall on a common scale that allows direct comparison.

Verification of sentences with absolute adjectives (e.g., Insects are six-legged)
will proceed as in the Pre-storage model, that is, by retrieval of predicates
stored with the subject concept.
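
The two verification routes just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary names and the helpers `verify_relative` and `verify_absolute` are hypothetical, and the only values taken from the text are the quarter-inch insect and the 12-inch animal.

```python
# Hypothetical memory contents; only the two sizes echo the example in the text.
NORMAL_SIZE_INCHES = {"insect": 0.25, "animal": 12.0}
IMMEDIATE_SUPERORDINATE = {"insect": "animal"}
STORED_PREDICATES = {"insect": {"six-legged", "egg-laying"}}


def verify_relative(subject, adjective):
    """Computation route: two retrievals followed by a comparison."""
    own_size = NORMAL_SIZE_INCHES[subject]               # step 1: instance value
    superordinate = IMMEDIATE_SUPERORDINATE[subject]
    reference_size = NORMAL_SIZE_INCHES[superordinate]   # step 2: reference value
    if adjective == "small":                             # step 3: comparison
        return own_size < reference_size
    if adjective == "large":
        return own_size > reference_size
    raise ValueError(f"unsupported relative adjective: {adjective}")


def verify_absolute(subject, adjective):
    """Single retrieval: look the predicate up in the stored property list."""
    return adjective in STORED_PREDICATES[subject]


print(verify_relative("insect", "small"))       # True
print(verify_absolute("insect", "six-legged"))  # True
```

Nothing like "small" is stored with insect in this sketch; the relative judgment exists only as the outcome of the comparison.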
The Computational
model also handles the problem raised in connection
with (1) and (2) by restricting inferences of this kind to stored properties.
The inference goes through for stored size information just as it does for
predicates like green, since it follows from Grasshoppers are two-inch insects
that Grasshoppers are two-inch animals. However, the deduction will be
blocked for predicates like large in (1) since such higher order properties
are computed, not stored.
An experimental
comparison of the Pre-storage and Computation
models
can be based on differences in the way they verify sentences containing the
two adjective types, and in this respect, the Computation
model provides
the key prediction.
As we have just noticed, this model requires a three-step process (retrieving the value of the instance, retrieving the value of the
superordinate, and comparing the values) before it can confirm a sentence
with relative adjectives. However, only a single retrieval step is needed for
sentences with absolute adjectives, so the Computation model should predict
longer verification times for Sentence (5a) than for (5b):
(5) a. An insect is small.
b. An insect is six-legged.

The Pre-storage model, on the other hand, verifies both sentences in the
same way (by retrieving predicates of insect) and therefore does not predict
a difference in time to confirm them.
Unfortunately,
though, differences
in frequency,
imageability,
and the
like confound the comparison of absolute and relative adjectives in sentences
such as (5a) and (5b), so a more indirect approach is necessary. One possible
test of the models that gets around this problem makes use of sentences of
the form An S is an Adj P, where the predicate noun P is the immediate
superordinate
of the subject noun S. For example, corresponding
to the
predicate-adjective
sentences in (5), we have the following predicate-noun
sentences:
(6) a. An insect is a small animal.
b. An insect is a six-legged animal.

To see why sentences of this type are helpful, consider first the Computation model. According to this approach, verifying both (5a) and (6a) means
retrieving the reference class animal. The two sentences differ only in that
(6a) specifies this class explicitly, while (5a) does not. Reading time for (6a)
will be longer than for (5a) because of this extra word. But this disadvantage
for (6a) may be offset if mentioning the reference class decreases the time
needed to access it. However, verification of (5b) or (6b) does not require a
reference class since the adjective six-legged is absolute. Adding the superordinate in (6b) merely increases the number of words to be processed,
so (6b) provides no advantage over (5b). Thus, the Computation model predicts an interaction
between the syntactic form of the sentences (predicate-adjective versus predicate-noun)
and adjective type (relative versus absolute).
By way of comparison,
the Pre-storage model does not predict an interaction for the sentences in (5) and (6). According to this model, retrieval
of the superordinate
is not needed to determine the truth of either (5a) or
(5b) since both the relative predicate (small) and the absolute predicate
(six-legged)
are stored with insect. Consequently,
adding the superordinate
in (6a) and (6b) is redundant
and should slow processing by an equal
amount.
Since these predictions
are independent
of factors like frequency,
they
seem worth testing, and we proceed to do so in the following experiment.

Experiment 1

To provide a test of our predictions, we employed two groups of subjects:
a PA group who verified predicate-adjective sentences (e.g., (5a-b)) and
a PN group who verified the corresponding predicate-noun sentences (e.g.,
(6a-b)).
Before we could construct
the sentence stimuli, though, two
preliminary
studies were needed to select the subject and predicate noun
pairs and to establish the truth value of the resulting sentences.
A fair test of the Computation
model requires that the predicate nouns
are indeed the immediate
superordinates
of the subject nouns. For this
reason, we gave a separate group of subjects a set of nouns and asked them
to produce
the superordinate
category for each. On the basis of these
responses, we chose superordinates
for which there was substantial agreement among subjects, and we assume that these superordinates
provide the reference class for relative sentences like (5a). Although they may not be the
most direct superordinates in a scientific taxonomy, nevertheless, they
appear to be the ones subjects would naturally use in verifying relative statements. This assumption seems to us to preserve the spirit of the Immediate
Superordinate hypothesis.

¹More explicitly, let a equal the base reaction time for reading a predicate-adjective sentence
containing a relative adjective, executing a response, and other background tasks. Let a′ be the corresponding base time for absolute predicate-adjective
sentences. Predicate-noun sentences will require an extra time increment, b, for reading the final word. Lastly, verification
of relative sentences will entail time to access the superordinate, which should be faster when the superordinate is provided (c msec)
than when it must be inferred (c + d msec). Total time for relative sentences will then be a + c + d
msec for predicate-adjective constructions and a + b + c msec for predicate-noun constructions.
Similarly, absolute adjectives will take a′ msec when they occur in predicate-adjective sentences and
a′ + b msec in predicate-noun sentences. The interaction follows when b (superordinate reading time)
is small with respect to the remaining parameters.

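
The additive account in Footnote 1 can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, `a_abs` is our label for the base time of absolute sentences, and the millisecond values are invented for the example rather than estimated from any data.

```python
def predicted_rt(adjective_type, form, a, a_abs, b, c, d):
    """Predicted verification time (msec) under the additive account.

    a     : base time, relative predicate-adjective sentences
    a_abs : base time, absolute predicate-adjective sentences
    b     : time to read the extra superordinate word (PN only)
    c     : time to access a superordinate that is provided
    d     : extra access time when the superordinate must be inferred
    """
    if form not in ("PA", "PN"):
        raise ValueError(f"unknown syntactic form: {form}")
    if adjective_type == "relative":
        return a + c + d if form == "PA" else a + b + c
    if adjective_type == "absolute":
        return a_abs if form == "PA" else a_abs + b
    raise ValueError(f"unknown adjective type: {adjective_type}")


def interaction_contrast(**params):
    """(PA - PN) for relative sentences minus (PA - PN) for absolute ones."""
    relative_gap = (predicted_rt("relative", "PA", **params)
                    - predicted_rt("relative", "PN", **params))   # d - b
    absolute_gap = (predicted_rt("absolute", "PA", **params)
                    - predicted_rt("absolute", "PN", **params))   # -b
    return relative_gap - absolute_gap                            # d


# Invented parameter values, in msec.
params = dict(a=1500, a_abs=1450, b=20, c=200, d=150)
print(interaction_contrast(**params))  # 150
```

Whatever values are chosen, the contrast reduces to d, while the PA disadvantage for relative sentences alone is d - b. This is why the predicted pattern is clearest when b is small.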
To decide whether such sentences were true or false, we need to know
whether the normal value of the subject noun exceeds that of the superordinate with respect to the given adjective. We determined this by asking
another group of subjects to compare the referents of the two nouns along
a set of relative properties including size, width, thickness, height and length.

Method
Superordinate generation task

In the first preliminary study, we presented subjects with a list of nouns and
asked them to write below each a one-word category in which objects of that
type belonged. Subjects were told that for the word water they might write
liquid, and for steak, meat or food (neither of these examples appeared in
the experimental
list).
To compose the lists, we began with a set of 426 nouns, most of them
drawn from Battig and Montague's (1969) category norms. Nouns were
chosen from 24 of the Battig-Montague categories (e.g., birds, flowers,
vehicles, etc.) that could plausibly be modified by both absolute and relative
adjectives denoting physical properties. In addition to the items from the
norms, we used nouns from four categories of our own: building (e.g., skyscraper), car (e.g., Cadillac), rodent (e.g., rat), and road (with the instances
drawn from the local area). We sampled from 3 to 31 items from each category, attempting
to eliminate unfamiliar or ambiguous items and to maximize the range of properties
among the items represented. Because of the
large number of items, we divided them randomly into two lists of 213 each.
The items on each list were themselves randomly ordered and typed in an
eight-page booklet. A blank line appeared beneath each item on which the
subject wrote his response.
We tested 22 subjects in a single group, half of them receiving one of the
booklets and half the other. Subjects were asked to complete the task in an
hour, and to help them keep pace, a signal was given after each eighth of an
hour had elapsed. The subjects were recruited by an advertisement in the
University of Chicago student newspaper and were undergraduate or
graduate students or nonstudents of comparable age. All of them were native
speakers of English, and each received two dollars for his participation.
(Subjects in the remaining parts of the experiment were drawn from the same
subject pool, but none was involved in more than one part.)

Rating task

As a second step, we asked a new group of subjects to compare a series of
items (e.g., grasshopper, mosquito) to the average member of its superordinate category (insect) on certain physical dimensions. The items and
their superordinates were taken from the results of the production task, and
all items were ones for which at least nine of eleven subjects produced the
same superordinate term (or produced close synonyms of the same superordinate category - e.g., fabric and cloth). In all, we selected 173 items
from 15 of the original categories. Each of these categories was then
assigned to a pair of polar relative adjectives for purposes of the rating
task: tree and flower to tall-short; vegetable, fish, and cloth to thick-thin;
bird, weapon, and vehicle to long-short; city, insect, and animal to big-little; fruit and (musical) instrument to large-small; and tool and state (of
the U.S.) to wide-narrow.
Subjects decided whether each item was greater than or less than the
average member of its superordinate
category with respect to the assigned
dimension.
For example,
the subjects determined
for each insect item
whether it was bigger or littler than the average insect. These judgments
were given as ratings on an 11-point scale, with 0 designated as much littler
(narrower, thinner, etc.) than average, 5 as average, and 10 as much bigger
(wider, thicker) than average. Items from a given category were listed in
random order on a separate sheet along with instructions
specifying the
superordinate
category and the dimension on which they were to base their
decision. The 15 sheets were then assembled in a booklet, using a new
random order of the sheets for each subject. Thirteen subjects participated
in a group in this phase of the experiment.
One hour was allotted for the
ratings, and subjects received two dollars for their time.

Sentence verification task

On the basis of the ratings, we constructed a set of 96 predicate-adjective
(PA) sentences, which were presented in a tachistoscope to the PA group for
verification. An equal number of predicate-noun (PN) sentences were formed
by adding the appropriate superordinates, and these were presented to the
PN group.

The experimental procedure was identical for the PA and PN groups. A
subject initiated a trial by pressing the central button of a three-button
response panel with his right index finger. This button-press
brought a
fixation point into view on the left side of the tachistoscope
field where it
remained for two seconds. At the end of this interval, the stimulus sentence
appeared automatically
with its first letter in the position previously occupied by the fixation point. We had instructed the subject to read the sentence and to decide whether it was true or false. To register his decision, he
pressed one of the two outer buttons of the response panel with his right
index finger. For half the subjects in each group, the right-most button was
labeled True and the left-most button False, while for the remaining subjects these positions were reversed. Subjects were instructed
to execute
their response as quickly as possible, but without making any mistakes. The
second button-press
ended the sentence presentation
and stopped a clock
that had been activated by the onset of the sentence. In the interval between
trials (approximately
10 seconds), the experimenter
informed the subject of
his reaction time and of the accuracy of his response. The experimenter
then
recorded this information,
replaced the stimulus card, and signaled that the
subject could begin the next trial. At the very beginning of the experimental
session, the subject was given 12 practice trials (6 true and 6 false ones) to
acquaint him with the procedure. The practice sentences were of a variety
of syntactic
types, some similar to those of the experimental
sentences;
however, there was no overlap in the semantic content of the experimental
and practice sentences.
Half the PA and half the PN sentences contained relative adjectives, taken
from the six pairs of polar adjectives listed above. For each pair (e.g., large-small),
we selected two of the superordinates (e.g., instrument and fruit)
that had been rated with them, and for each superordinate, two items (e.g.,
flute and xylophone for instrument, and plum and grapefruit for fruit),
one of which had been rated greater than the average and the other less
than the average category member. We created an octet of sentences by
combining these items with the two adjectives in both PA and PN form
(e.g., A plum is (a) small (fruit), A plum is (a) large (fruit), A grapefruit is
(a) small (fruit), and A grapefruit is (a) large (fruit)). There were a total of
12 octets, and within each, four of the sentences were PA and four were
PN. Within these two syntactic types, two of the sentences were true and
two were false. On the 11-point scale, the mean rating for the greater-than-average items was 6.5 and the mean for the less-than-average items was 3.5,
SE = 0.10 (recall that 5.0 had been designated as the size of an average
member of the category). Median word frequency was 3.5 and 2.5 tokens
per million words for the greater-than-average and less-than-average items,
and 212 tokens per million for the relative adjectives (Kučera & Francis,
1967).
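
The octet design can be made concrete with a short sketch. The helper names `build_octet` and `_article` are hypothetical, and the truth values simply follow the rule stated above: an item rated above the category average makes the "greater" adjective true and the "lesser" adjective false, and the reverse holds for an item rated below average.

```python
from itertools import product


def _article(word):
    """Pick 'a' or 'an' from the first letter (a rough spelling heuristic)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def build_octet(category, greater_item, lesser_item, greater_adj, lesser_adj):
    """Return eight (sentence, form, truth) triples for one category."""
    octet = []
    for item, adj in product((greater_item, lesser_item), (greater_adj, lesser_adj)):
        # True when the item's rated direction matches the adjective's pole.
        is_true = (item == greater_item) == (adj == greater_adj)
        subject = f"{_article(item).capitalize()} {item}"
        octet.append((f"{subject} is {adj}.", "PA", is_true))
        octet.append((f"{subject} is {_article(adj)} {adj} {category}.", "PN", is_true))
    return octet


for sentence, form, truth in build_octet("fruit", "grapefruit", "plum", "large", "small"):
    print(form, truth, sentence)
```

Each call yields four PA and four PN sentences, with two true and two false sentences in each syntactic form, matching the counts given in the text.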
The remaining sentences were formed in a similar way from the absolute
adjective pairs fragrant-odorless, airborne-flightless, dark-pale, curved-straight,
shiny-lusterless, and hilly-flat. The same 12 superordinate categories were
employed with the absolute adjectives as with the relative adjectives, two of
them being assigned to each adjective pair (e.g., tool and instrument were
assigned to curved-straight). As before, two items were chosen from each
category so that one item was true of the first member of the adjective pair
and the other item was true of the second (e.g., pliers was chosen as the
curved tool and screwdriver as the straight tool, while tuba was the curved
instrument and piccolo the straight one). Octets of sentences were again
generated by combining the two instances in each category with the two
adjectives (e.g., A piccolo is (a) straight (instrument), A piccolo is (a) curved
(instrument), A tuba is (a) straight (instrument), and A tuba is (a) curved
(instrument)). Though drawn from the same categories, the individual items
were different from those used with the relative adjectives. The median word
frequency of these nouns was 5.5 tokens per million words, and the median
frequency for absolute adjectives was 5.0.
Each sentence was typed in lower case Orator letters on a white 6 × 9
inch card. The length of the PA sentences varied from 32 to 65 mm (2.0 to
4.1 degrees of visual angle) while the length of the PN sentences varied from
55 to 85 mm (3.4 to 5.3 degrees). The sentences measured 3 mm (0.2
degrees) vertically. PA and PN sentences were separately randomized at the
beginning of the experiment,
and each set was reshuffled after it had been
presented to a subject.
Forty-eight
subjects participated
in the sentence verification
task, half
in the PA and half in the PN group. All of the subjects were right-handed
members of the subject pool described above. The experiment
took about
45 minutes to complete (including a short break after the first 48 sentences),
and subjects were paid two dollars each.

Results and Discussion


Relative versus absolute adjectives

Our principal interest is in mean correct RTs and error rates for relative and
absolute adjectives. Figure 1 presents these data separately for the PA and
PN groups. We note first of all the large error rates in this experiment,
averaging 19.3%. An error rate of this size is, of course, unlikely to be caused
merely by low-level processing mistakes, and we therefore use the term
error advisedly. As we will see, these responses provide an important clue
as to how subjects comprehend relative adjectives. For purposes of analysis,
RTs from trials on which subjects made an error were replaced according to
the procedure described by Winer (1971, p. 487).

Figure 1. Reaction time and error rate as a function of syntactic form and adjective type, Experiment 1.

The Computation
model predicts that the difficulty in verifying sentences
with relative adjectives should be greater for PA than PN constructions.
In
line with this prediction,
error rates increased from 21.3% for relative PN
sentences to 28.6% for relative PA sentences. At the same time, errors were
very nearly constant for absolute adjectives across the PN condition (13.3%)
and the PA condition (14.1%). The RTs exhibited a similar trend. With relative adjectives, subjects took 1920 msec to verify PN sentences, but 1980
msec to verify PA sentences. However, with absolute adjectives, RTs were
about equal: 1958 msec for PN sentences and 1955 msec for PA sentences.
To assess these effects, we carried out analyses of variance on the errors and
RTs with both subjects and sentence octets serving as random effects (H.
Clark, 1973; Winer, 1971, p. 375). In these analyses, the interaction between
syntactic form and adjective type was significant in the error data, but not
in the RTs. (For errors, SE = 1.5%, F(1,31) = 5.35, p < 0.05, where F is
the quasi-F ratio, Winer's F. For RTs, SE = 31 msec, F(1,24) = 1.36, p
> 0.10.)
Error rates were larger for sentences with relative adjectives (24.9%) than
for those with absolute adjectives (13.7%), SE = 1.6%, F(1,13) = 12.96,
p < 0.01. The Computation model easily accommodates this difference since
it employs a more complex (and, presumably, a more error-prone) process in
handling relative properties. However, as we remarked earlier, this difference
is confounded by imagery and other variables. Moreover, no comparable
difference appeared in the RTs. Subjects took 1950 msec to verify relative
adjectives and 1956 msec to verify absolute adjectives, SE = 30 msec,
F < 1. There was no significant main effect of syntactic form in either the
error data (SE = 1.9%, F(1,51) = 2.51, p > 0.10) or the RTs (SE = 85 msec,
F < 1). Neither dependent measure showed a reliable effect of the sentences'
truth (SE = 0.9%, F < 1 for the errors, and SE = 13 msec, F(1,27) = 1.70,
p > 0.10 for the RTs) nor any interaction of truth with syntactic form or
adjective type.
Although the Computation
model is consistent with these data, the high
error rates are grounds for suspicion, and they prompt us to take a closer
look at the relative items, where most of the errors arise. One cause of the
errors becomes apparent if we put ourselves in the place of subjects verifying
the following sentences:
(7) a. A spruce is tall.
b. A dogwood is tall.
c. A poinsettia is tall.
d. A petunia is tall.
Since a spruce is taller than the average tree and a poinsettia taller than the
average flower (as determined by our ratings), (7a) and (7c) should be true
according to Bierwisch's and Katz's theory (the Immediate Superordinate
hypothesis), which we used to construct our stimuli. Similarly, since dogwoods are shorter than average trees and petunias are shorter than average
flowers, (7b) and (7d) should be false. But while this analysis seems quite
reasonable for (7a) and (7d), there is something odd about affirming that
poinsettias are tall while denying that dogwoods are tall. To put this
another way, the Immediate Superordinate hypothesis stipulates that the
truth value of the examples in (7) should be the same as that of the corresponding PN sentences in (8):
(8) a. A spruce is a tall tree.
b. A dogwood is a tall tree.
c. A poinsettia is a tall flower.
d. A petunia is a tall flower.
But intuitively, the truth value of (8b) and (8c) is more clear-cut than their
counterparts in (7b) and (7c).

Consistency effects and the anthropomorphic norm

These observations suggest that the large number of errors for relative PA
sentences may have been due to faulty linguistic analysis rather than to
subjects' mistakes. One possible source of difficulty for (7b) is that while
dogwoods are shorter than average trees, they're nevertheless taller than the
size of most objects with which people typically interact (including the size
of people themselves). In the same way, poinsettias in (7c) are tall flowers,
but are short for most everyday objects. If subjects apply this alternative
reference point in deciding on the truth of (7b-c), we would expect their
decision to differ from the experimentally defined answer. In (8b-c), however, the reference class is explicitly provided, and subjects' responses should
coincide with our appointed answer. Use of a human reference point for
relative adjectives has been discussed by Suzuki (1970), who notes that a
sentence like Giraffes have long necks is often understood to mean that
giraffes have longer necks than people. A similar anthropomorphic
standard may apply to the relative adjectives in our own PA sentences (see also
Miller & Johnson-Laird, 1976, p. 324). In the remainder of the paper, we
refer to this standard as the object reference point, meaning the normal
size of everyday, human-oriented objects.
Sentence sets like (7) allow us to examine our data for the effects of this
alternative reference point. For Sentences (7a) and (7d), a decision based on
the immediate
superordinate
will be the same as one based on average
objects, since spruces are tall and petunias are short with respect to both
standards. We will therefore label such instances as consistent
items. However, as we have seen, dogwoods and poinsettias are tall with respect to one
reference point and short with respect to the other, and we will call instances
of this type inconsistent
items.
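
The consistent/inconsistent distinction amounts to asking whether an item falls on the same side of both reference points. A minimal sketch follows; the function name `consistency` is hypothetical and the heights (in feet) are invented for illustration, not taken from the ratings.

```python
def consistency(item_size, superordinate_norm, object_norm):
    """Label an item by whether both reference points give the same verdict."""
    exceeds_superordinate = item_size > superordinate_norm
    exceeds_object = item_size > object_norm
    return "consistent" if exceeds_superordinate == exceeds_object else "inconsistent"


# Invented heights in feet, for illustration only.
TREE_NORM, FLOWER_NORM, OBJECT_NORM = 40.0, 1.5, 5.5
items = {
    "spruce": (60.0, TREE_NORM),
    "dogwood": (20.0, TREE_NORM),
    "poinsettia": (3.0, FLOWER_NORM),
    "petunia": (1.0, FLOWER_NORM),
}

for name, (height, norm) in items.items():
    print(name, consistency(height, norm, OBJECT_NORM))
# spruce consistent
# dogwood inconsistent
# poinsettia inconsistent
# petunia consistent
```

With these assumed values the sketch reproduces the pattern in (7): spruce and petunia agree across the two standards, whereas dogwood and poinsettia do not.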
For sentences containing relative adjectives, 11 of 12 octets in our experiment contained one consistent and one inconsistent item (where consistency
was determined by ratings to be described in Experiment 2). Figure 2
exhibits RTs and error rates for these relative octets, with consistent and
inconsistent items plotted separately. Looking first at the error rates, we find
an increase for inconsistent items from 24.8% in PN sentences to 36.9%
in PA sentences. Errors on consistent items increase only slightly from 17.6%
for PN sentences to 19.9% for PA sentences. Although this interaction is
not significant (SE = 3.3%, F(1,22) = 2.59, p > 0.10), these data suggest
that the PA-PN difference observed in Figure 1 for relative adjectives is
largely attributable to inconsistent items. The same conclusion can be drawn
from the RTs. Inconsistent items exhibit an increase from 1934 msec (PN
sentences) to 2074 msec (PA sentences), while consistent items increase
from 1896 msec (PN) to only 1905 msec (PA), SE = 40 msec, F(1,49) =
3.17, 0.05 < p < 0.10.

²Feedback during the experiment may have caused subjects to change their response criteria in
the direction of the Immediate Superordinate hypothesis. Thus, the obtained error rates may be conservative estimates of the proportion
of trials on which subjects' judgments differed from the experimentally defined correct
answers. Underestimates, however, are unlikely to affect the conclusions that we draw from these data. (See Experiment
3 for a different approach to the feedback problem.)

Figure 2. Reaction time and error rate for sentences containing relative adjectives as a function of syntactic form (predicate adjective versus predicate noun) and consistency (consistent versus inconsistent items), Experiment 1.

Implications for the models

In the light of the consistency effects, our brief for the Computation model
appears weaker than at first. Relative PA sentences were indeed difficult to
verify in this experiment, but the difficulty was largely due to inconsistent
items. Sentences with consistent items resemble sentences with absolute
adjectives in showing little, if any, difference in errors or RTs between their
PN and their PA versions (compare the slopes for absolute sentences in
Figure 1 to those for consistent relative sentences in Figure 2). The
Computation model has trouble explaining this resemblance. Of course,
these results are also no comfort to the Pre-storage model, although the
problem with this theory is of the opposite sort. Since this model treats
relative and absolute adjectives identically, it can predict the data for consistent relative and absolute sentences, but founders in explaining the inconsistent items.
Our results suggest that a revised Computation model should incorporate
two comparisons in handling relative sentences, one to the superordinate and
the other to the object reference point. The simplest assumption is that both
comparisons are carried out for PA sentences, while only the superordinate
comparison is executed for PN sentences. But in most serial or parallel
models, this would predict faster RTs for PN than for PA sentences, a difference that amounts to only 9 msec for consistent items in Figure 2. A
second possibility is that both comparisons are performed for PA and PN
sentences alike. With respect to consistent items, these two comparisons
would produce the same outcome (a consistent item, by definition, exceeds
both reference points or falls short of both), and no further processing
would be needed for either syntactic form. But for inconsistent items, the
outcomes differ, forcing a subject to choose between them or to combine
them according to some decision rule. We can imagine that such a decision
is easier for a PN sentence, since its superordinate signals that the result of
the object comparison can be ignored. If so, this would account for the
PA-PN difference for inconsistent items. Of course, this modified Computation model is little more than a description of the data, but as we shall see,
it yields predictions that will prove useful in the following experiment.³
Multiple reference points can also help us bail out the Pre-storage model.
Along these lines, we can continue to assume that the property list of a
consistent item contains a single predicate for each relative property (e.g.,
petunia will be labeled short). However, inconsistent items will possess two
such properties, marked with respect to the alternative reference points:
for example, poinsettia will be listed as tall-for-flowers and short-for-average-objects.
Both predicates will have to be checked in verifying inconsistent
PA sentences, and a choice made between them, yielding slow and error-prone responses. But with PN sentences, only the superordinate reference
point is considered, producing faster, error-free decisions. This modified
Pre-storage model, like its Computation rival, can therefore account for the
results of Figures 1 and 2, but at the expense of some ad hoc assumptions.
In short, our findings are sufficient to reject both of the models in their
original form. However, both can be revived by including the distinction
between consistent and inconsistent items. To differentiate these modified
theories, we need to explore variables other than syntactic form.

³The consistent items are something of a problem for this second model as well. We earlier assumed
that the superordinates in PN sentences facilitate access to the corresponding concept. To explain the
data for consistent items, we must either assume that this advantage is canceled by superordinate
reading time (b = d > 0 in Footnote 1) or that both effects are nil (b = d = 0). Alternatively, we could
posit a compromise Computation-Pre-storage model in which absolute and consistent relative information is stored and inconsistent
relative information computed. But if anything, this latter model is
more ad hoc than the one outlined above, and we stick to the former alternative in the discussions
that follow.

Experiment 2
The main feature that distinguishes the Computation
model from the Pre-storage model is its extra comparison step. Previous studies have identified
factors that affect this step, and if we can show that these factors also affect
verification of sentences with relative adjectives, we will have obtained some
prima facie support for the Computation
model. This is the strategy that we
pursue in the experiments
reported below, using symbolic distance as the
critical factor. In Experiment
2 we look for evidence of this effect in ratings
of the truth of relative sentences, while in Experiment
3 we use reaction
time data.
Symbolic distance predictions

Symbolic distance is the subjective difference in the size of two objects
(Moyer & Bayer, 1976), and in general, it determines the speed with which
objects can be mentally compared, with greater distance producing faster
comparison
times. For example, subjects take less time to decide that a
horse is larger than a rabbit than that a horse is larger than a deer (Banks
& Flora, 1977; Holyoak,
1977; Jamieson & Petrusic, 1975; Moyer, 1973;
and Paivio, 1975). The mechanism responsible for this effect is a matter of
current dispute (see Banks, 1977, and Moyer & Dumais, 1978, for reviews),
but in most of the theories proposed to date, the size values of the two
objects are retrieved from the relevant concepts in semantic memory and are
compared to determine which is larger.
This process is clearly similar to that of the Computation
model and
suggests that we look for symbolic distance effects in relative sentences.

However, in this case, the critical distance will be between a subordinate
item and its superordinate category, rather than between two coordinate
categories. For example, consider the true PN sentences Airplanes are large
vehicles and Trucks are large vehicles. According to the Computation
model,
these sentences are verified by comparing the size of an airplane or truck to
that of a normal-sized vehicle. Assuming that the symbolic distance between
airplane and vehicle is greater than that between truck and vehicle, it should
be easier to confirm the first of the two sentences above.
This symbolic distance prediction can be elaborated in view of the results
of Experiment
1. In explaining the consistency findings of that experiment,
we were led to assume that the Computation
model performs two comparisons in verifying relative sentences. If this hypothesis
is correct, we
should also expect to find two symbolic distance effects: one of them will
depend on the difference in size between the subject category and the superordinate reference
point, while the other will depend on the difference
between the subject category and the object reference point. To put this
a bit more precisely, let I denote the normal (subjective) size of instances in
the subject category, S the reference point for the immediate superordinate,
and O the reference point for objects. The difference I - S then represents
the amount by which instances of the subject category exceed the superordinate reference point, and I - O represents the amount by which the
instances exceed the object reference point. For a given subject category,
we will call I - S its superordinate size and I - O its object size.
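The two quantities just defined can be written out as a minimal Python sketch. The class name, the field names, and the numeric sizes are illustrative assumptions, not values from the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizeJudgment:
    """Subjective sizes on an arbitrary common scale (hypothetical values)."""
    instance: float           # I: normal size of instances in the subject category
    superordinate_ref: float  # S: reference point for the immediate superordinate
    object_ref: float         # O: reference point for objects in general

    @property
    def superordinate_size(self) -> float:
        """I - S: amount by which instances exceed the superordinate reference point."""
        return self.instance - self.superordinate_ref

    @property
    def object_size(self) -> float:
        """I - O: amount by which instances exceed the object reference point."""
        return self.instance - self.object_ref


# Made-up numbers: both nouns share the superordinate "vehicle".
airplane = SizeJudgment(instance=9.0, superordinate_ref=6.0, object_ref=5.0)
truck = SizeJudgment(instance=7.0, superordinate_ref=6.0, object_ref=5.0)
```

With these invented values the airplane has the larger superordinate size, which is the situation in which the Computation model expects Airplanes are large vehicles to be the easier sentence to confirm.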
In these terms, the Computation
model predicts that the perceived
truth of a sentence containing an adjective like big will increase as superordinate size increases. For example, Airplanes are big (vehicles) should be
given higher truth ratings than Trucks are big (vehicles) since airplanes are
bigger vehicles than trucks. In the same vein, the model predicts that rated
truth will increase with increasing object size. For although both airplanes
and elephants are large members of their respective categories, Airplanes
are big (vehicles) should receive higher ratings than Elephants are big
(animals) since airplanes are bigger than elephants. These effects, however,
may depend on whether or not the superordinate
is specified. The superordinates vehicle and animal indicate that the truth of the sentence should
be judged relative to these categories. So while the effect of superordinate
size may be greater for predicate-noun
than predicate-adjective
constructions, the effect of object size should be greater for the predicate adjectives.
Of course, all of these predictions are peculiar to the Computation
model.
Since the Pre-storage model has no comparison
stage, it does not predict
variations in ratings with changes in distance.

Method
We began with 172 of the items from Experiment
1 for which there had
been good agreement about immediate superordinates
(one noun from the
original set was inadvertently
omitted). For each of these items, separate
groups of subjects were asked to provide ratings of the following variables:
(a) the size of the items with respect to an average member of their immediate superordinate
(e.g., the size of apples with respect to the average
fruit); (b) the size of the items with respect to average objects; (c) the truth
of PA sentences of the form Is are big (e.g., Apples are big); (d) the truth of
PN sentences of the form Is are big Ss (e.g., Apples are big fruits). The first
two of these measures were used to determine symbolic distance. The truth
ratings in (c) and (d) serve as dependent variables.
For the size ratings of Task (a), we followed the procedure described in
Experiment
1 (see the section entitled Rating Task). The procedure for
Task (b) was somewhat similar; however, in this case, subjects received a
dittoed set of instructions together with a computer-generated
list consisting
of noun-adjective
pairs (e.g., apple-big).
The instructions asked subjects to
compare each item to an object of average size with respect to the indicated
property. The subject used an 11-point scale for his response, with 0 designated much less than average, 5 average, and 10 much more than average.
All of the 172 nouns were paired with the adjective big, but a number of
these nouns were repeated with other relative adjectives. These additional
pairs were used to examine the consistency of the items in Experiment
1,
as described in the previous Results and Discussion section. Altogether the
list contained 306 pairs. Order of the pairs on the list was randomized in a
new order for each subject. (This was true as well for the lists associated with
Tasks (c) and (d) below.)
In the remaining tasks, subjects evaluated the degree of truth for a set of
PA sentences (Task (c)) or PN sentences (Task (d)). All 172 nouns appeared
with the adjective big, but as in Task (b), some of the nouns were repeated
with other adjectives. The ratings were made on the usual 11-point scale,
with 0 denoting definitely false and 10 denoting definitely true.
Forty-eight subjects provided these ratings, twelve in each task. The subjects were part of the same population as those in Experiment
1, but had not
taken part in the earlier study. They were tested in groups of from 1 to 12
individuals and were paid $2.00 for an hour-long session.

Results and Discussion


In order to assess our predictions statistically, we performed a regression
analysis on the mean truth ratings with superordinate size, object size, and
sentence construction as independent variables. This choice of method was
motivated by the continuous nature of the two size variables and the correlation between them (r = 0.65 for these data). All effects were estimated
using the procedure for repeated measures described by Cohen and Cohen
(1975, Ch. 10). The truth and size ratings were first re-expressed to compensate for the upper and lower bounds of the scale using the logit transformation
Y = log X - log(10 - X), where X is the rating on the original
0 to 10 scale and Y is the transformed variable (Mosteller & Tukey, 1977,
Ch. 5). The construction factor was coded +1 for predicate-noun sentences
and -1 for predicate-adjective sentences.
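The re-expression and coding steps can be illustrated with a short Python sketch. The function names are hypothetical, and the treatment of ratings exactly at 0 or 10 (rejected here, because the transform is undefined at the bounds) is an assumption; the text does not say how such values were handled.

```python
import math


def logit_rating(x: float, upper: float = 10.0) -> float:
    """Return log(x) - log(upper - x) for a rating strictly between 0 and upper."""
    if not 0.0 < x < upper:
        raise ValueError("rating must lie strictly between 0 and the scale maximum")
    return math.log(x) - math.log(upper - x)


def code_construction(kind: str) -> int:
    """Effect-code the construction factor: +1 for PN, -1 for PA sentences."""
    codes = {"PN": 1, "PA": -1}
    return codes[kind]
```

The scale midpoint maps to zero, and ratings equally far above and below the midpoint map to values of equal magnitude and opposite sign, which is what stretches the compressed ends of the bounded scale.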
The results of this analysis were in good agreement with the predictions
of the Computation model. First, the rated truth value of a sentence increased as the superordinate size of the subject noun increased (b = 0.70,
SE = 0.034, F(1, 169) = 424), indicating that subject judgments were
sensitive to average size within the immediate superordinate category. This
effect is larger for predicate-noun constructions, where the superordinate is
mentioned, than for predicate-adjective sentences where it is not (b = 0.28,
SE = 0.021, F(1, 169) = 180). More interestingly,
object size had an independent effect on truth ratings (b = 0.44, SE = 0.040, F(1, 169) = 116),
suggesting that subjects also made use of a standard associated with everyday objects. Object size also interacted with sentence construction, exerting
a larger influence on predicate-adjective than predicate-noun sentences
(b = -0.19, SE = 0.025, F(1, 169) = 59). So while superordinate
size dominates for predicate-noun constructions, object size is more important for
predicate-adjective items.
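To see how the four coefficients combine, the following Python sketch computes the slope of each size variable separately for the two constructions. It assumes that each interaction coefficient multiplies the product of a size variable and the +1/-1 construction code, so that a slope is the main-effect coefficient plus or minus the interaction coefficient; the constant and function names are illustrative.

```python
# Coefficients reported in the text (logit-transformed ratings).
B_SUPERORDINATE = 0.70
B_SUPERORDINATE_X_CONSTRUCTION = 0.28
B_OBJECT = 0.44
B_OBJECT_X_CONSTRUCTION = -0.19


def simple_slopes(code: int) -> dict:
    """Slopes of the two size variables at one construction code (+1 PN, -1 PA)."""
    return {
        "superordinate": B_SUPERORDINATE + B_SUPERORDINATE_X_CONSTRUCTION * code,
        "object": B_OBJECT + B_OBJECT_X_CONSTRUCTION * code,
    }


pn = simple_slopes(+1)  # predicate-noun sentences
pa = simple_slopes(-1)  # predicate-adjective sentences
```

Under this assumption the predicate-noun slopes come out at roughly 0.98 for superordinate size and 0.25 for object size, and the predicate-adjective slopes at roughly 0.42 and 0.63, which matches the verbal summary: superordinate size dominates for predicate-noun constructions and object size for predicate-adjective items.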
Superordinate
size, object size, and their interactions
with sentence type
together account for 85.8% of the variance among the mean truth ratings,
and the success of the Computation
model in predicting these ratings encourages us to look for similar effects in a reaction time task like that of
Experiment
1. The ratings collected in the present study stand us in good
stead in this regard, since they allow us to partition the stimulus instances
into those high and low along both size continua.
Moreover, the truth
ratings can be used to assign a truth value to individual sentences rather than
relying on the now suspect Immediate Superordinate
hypothesis.

Experiment 3

The predictions of the Computation model are slightly more complicated for
reaction times than for truth ratings. The perceived truth of the sentence Xs
are big should increase as the Xs increase in size, but verification time
for such a sentence should instead follow an inverted U-shaped function.
As the Xs get larger, Xs are big goes from being definitely false (e.g.,
Hummingbirds are big) to dubiously false (e.g., Sparrows are big) to dubiously
true (Pigeons are big) to definitely true (Flamingos are big), with slower verification time for the dubious cases. This means that our superordinate size and
object size variables should interact with the truth of the stimulus sentences.
For true sentences, verification time should decrease with superordinate
(or object) size, while for false sentences, verification time should increase
with size. The results of Experiment
2 also prompt us to expect an interaction of the size variables with sentence construction, superordinate size
having a larger effect on predicate-noun sentences and object size having a
larger effect on predicate-adjective sentences.
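The inverted U-shaped prediction can be made concrete with a toy Python function. Everything here is a hypothetical illustration: the exponential form, the criterion of 5 on a 0 to 10 size scale, and the millisecond constants are assumptions chosen only to show the shape of the prediction, not quantities estimated in the paper.

```python
import math


def predicted_rt(size: float, criterion: float = 5.0,
                 base_ms: float = 700.0, peak_ms: float = 400.0,
                 scale: float = 2.0) -> float:
    """Toy verification time: slowest at the criterion, faster farther from it."""
    distance = abs(size - criterion)
    return base_ms + peak_ms * math.exp(-distance / scale)


def is_true(size: float, criterion: float = 5.0) -> bool:
    """Xs are big counts as true when the Xs exceed the criterion."""
    return size > criterion
```

In this sketch, sizes above the criterion (true sentences) give times that fall as size grows, while sizes below it (false sentences) give times that rise as size grows, which is the size by truth interaction described above.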

Method
We again used the adjective big to test the above predictions. To one group
of subjects (analogous to the PA group of Experiment
1) we presented plural
nouns (e.g., apples) singly on a CRT. On each trial, the subject was to decide
whether the sentence frame ____ are big would be true if the noun was substituted in the blank. A second, PN, group viewed the same nouns, this time
accompanied by their immediate superordinates (e.g., apples-fruits), and
they were asked to determine the truth of the sentence ____ are big ____
when the instance filled the first slot and the superordinate the second (the
frames themselves did not appear during the trial).
To select the stimuli, we employed the ratings collected in Experiments 1
and 2. From the pool of 172 nouns used in the second experiment, we
selected 112 according to the following criteria. First, for each item, both
the predicate-adjective and predicate-noun sentences that contained it
received a mean truth rating greater than 5.00 (True items) or both received
a mean rating less than 5.00 (False items). This rule was adopted to simplify
the analysis of the results, since the truth of a given item is fixed across PA
and PN groups. The True items and False items were then separately classified as large or small with respect to rated superordinate and object size.
This classification produces eight categories (e.g., true, small superordinate
size, large object size; false, large superordinate size, small object size; etc.),
and the final set of items was chosen with an equal number of instances
(viz., 14) in each category. For the True instances, mean superordinate
size
was 8.24 for large items and 6.26 for small ones; mean object size was 6.52
for large and 4.72 for small items on the 0 to 10 scale (SE = 0.24). For False
instances, mean superordinate
size was 4.38 for large and 2.52 for small
items, while object size was 3.56 for large and 1.77 for small items (SE =
0.23). Median word frequency
for the True instances was 9.5 tokens per
million words for small superordinate
size, 5.5 for large superordinate
size,
4.5 for small object size, and 26 for large object size. The corresponding
frequencies for False instances were 0, 2, 0, and 3.5 tokens per million. To
these critical items we added 34 fillers so that for most (9 of 16) superordinate categories, half of the instances in each were True and half were
False. Over the entire set of 146 items there were also an equal number of
True and False instances.
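The selection rule described above can be sketched in Python as follows. The median split used to separate large from small items is an assumption (the text does not state the cut-off), and the nouns and ratings in the example are invented.

```python
from statistics import median


def truth_class(pa_rating, pn_rating, midpoint=5.0):
    """'true' or 'false' when both constructions agree; None for mixed items."""
    if pa_rating > midpoint and pn_rating > midpoint:
        return "true"
    if pa_rating < midpoint and pn_rating < midpoint:
        return "false"
    return None


def classify(items):
    """Sort items into (truth, superordinate size, object size) cells.

    Large versus small is decided by a median split within the True items
    and, separately, within the False items (an assumed cut-off).
    """
    cells = {}
    for truth in ("true", "false"):
        group = [it for it in items if truth_class(it["pa"], it["pn"]) == truth]
        if not group:
            continue
        sup_cut = median(it["sup"] for it in group)
        obj_cut = median(it["obj"] for it in group)
        for it in group:
            key = (truth,
                   "large" if it["sup"] > sup_cut else "small",
                   "large" if it["obj"] > obj_cut else "small")
            cells.setdefault(key, []).append(it["noun"])
    return cells


# Invented ratings on the 0 to 10 scales; "pony" has mixed truth ratings.
items = [
    {"noun": "whale", "pa": 9, "pn": 9, "sup": 9, "obj": 9},
    {"noun": "oak", "pa": 8, "pn": 7, "sup": 6, "obj": 8},
    {"noun": "pony", "pa": 6, "pn": 4, "sup": 5, "obj": 6},
    {"noun": "ant", "pa": 1, "pn": 1, "sup": 2, "obj": 1},
    {"noun": "plum", "pa": 2, "pn": 3, "sup": 4, "obj": 2},
]
cells = classify(items)
```

Items whose two truth ratings fall on opposite sides of 5.00 drop out, so each remaining item has the same truth value for the PA and PN groups.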
This set of instances was presented to subjects four times in successive
blocks of trials, two during a first day's session and two on a following day,
with stimulus order randomized anew at each presentation. The procedure
during a trial was similar to that of Experiment
1, but with a few minor
changes. The subject was seated this time at a CRT terminal with a response
apparatus that consisted of a button at his left and three buttons about 18
mm apart on his right. He initiated the trial by pressing the left-hand button
with his left index finger, and for a 2 sec interval thereafter he saw the word
ready presented on the screen about 400 mm away. At the end of the
warning interval, the ready signal was replaced by either a single instance
(for the PA group) or a superordinate-instance
pair (for the PN group) with
the superordinate
just above the instance. The subject made his true or false
decision by moving his right index finger from the center of the three
buttons on the right to one of the neighboring buttons. The position of the
True button was at the right of center for half the subjects in each group
and at the left for the other half. The response terminated the display and
was followed by a 2 sec period in which the reaction time for that trial (but
no indication of accuracy) was presented to the subject as feedback. At the
end of a session (i.e., two blocks of trials) the experimenter
informed the
subject of both his mean reaction time and error rate. Delaying accuracy
feedback
until the end of the session was intended
to discourage rote
learning of the assigned truth value, while at the same time encouraging
correct responses. The experiment was preceded by a 20-trial practice session
during which subjects were asked to press the appropriate button in response
to the word true or false.
The PA and PN groups consisted of eight subjects each. These subjects
were right handed and belonged to the same subject pool as those of Experiments 1 and 2; however, none of them had been involved in the previous
experiments. They received $4.00 apiece for participating, plus a 50 cent
bonus for each block in which their error rate was less than 10 per cent.
Subjects received an average of $4.47.

Results and Discussion


The main reaction time and error data are shown in Table 1. Perhaps the
most obvious fact about them is the very fast times for the PN group, a
difference that may be due to the way we presented the stimuli. PN subjects
saw the superordinate noun above and slightly before the instance,
and the superordinate may have given them a headstart in processing the
following word (Meyer & Schvaneveldt, 1971).
In most respects the data bear out the predictions of the Computation
model, and to see this, consider first the results of the superordinate size
variation. If we combine, for the moment, data from the PA and PN
groups, we find that mean correct verification time for True items decreases
from 906 msec for small instances to 840 msec for large ones. However,
verification time for False items increases with size from 861 to 913 msec,
producing the predicted interaction of superordinate size and truth; SE(subjects)
= 21 msec, SE(items) = 22 msec, min F'(1, 41) = 4.79, p < 0.05
(for the min F' statistic, see H. Clark, 1973). The error rates also conform
to this pattern, decreasing with size from 21.5% to 9.2% for True items
and increasing with size from 9.2% to 15.2% for False items (SE(subjects)
= 1.3%, SE(items) = 2.0%, min F'(1, 46) = 11.33, p < 0.01).
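For readers unfamiliar with the statistic cited from H. Clark (1973), the sketch below shows how min F' is usually computed from a by-subjects F ratio (F1) and a by-items F ratio (F2). The function and parameter names are hypothetical, and the F values in the example are invented, since the text reports only the resulting min F' values and standard errors.

```python
def min_f_prime(f1: float, f2: float, df_err1: float, df_err2: float):
    """Return (min F', denominator df) from the subjects and items analyses.

    f1, df_err1: F ratio and error degrees of freedom with subjects as random.
    f2, df_err2: F ratio and error degrees of freedom with items as random.
    """
    value = (f1 * f2) / (f1 + f2)
    # Note the crossed pairing: F1 squared is divided by the items error df.
    df_denominator = (f1 + f2) ** 2 / (f1 ** 2 / df_err2 + f2 ** 2 / df_err1)
    return value, df_denominator


# Invented inputs, for illustration only.
value, df = min_f_prime(10.0, 10.0, 20, 20)
```

Because min F' can never exceed the smaller of F1 and F2, an effect must be reasonably reliable over both subjects and items to reach significance by this test.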
The effect of object size is at least equally strong in Table 1. The interaction with truth value is evidenced by decreasing verification time with size
for True items (from 912 msec for small instances to 834 msec for large
ones) and increasing times for False items (from 828 msec to 946 msec).
Error rates again echo the reaction times, dropping from 19.4% for small
True items to 11.2% for large Trues, and rising from 7.0% for small False
items to 17.5% for large False ones. This interaction is once again significant
for reaction times (SE(subjects) = 19 msec, SE(items) = 22 msec, min F'(1, 63)
= 10.61, p < 0.01) and for error rates as well (SE(subjects) = SE(items)
= 2.0%, min F'(1, 52) = 11.2, p < 0.01).
We can also check the way the above effects differ for the PA and PN
groups. On the basis of Experiment 2, we would expect larger effects of
superordinate size for predicate-noun constructions, but larger object size
effects for the predicate-adjective items. In the context of the present experiment, these predictions imply that the interaction of superordinate size and
truth, examined above, should be larger for the PN than for the PA group,
while the object size by truth interaction should show the reverse effect.
Turning to the data, we find that the relevant difference in reaction time for
superordinate size is not reliable, though it shows a trend in the predicted
direction (SE(subjects) = 29 msec, SE(items) = 24 msec, min F' < 1). The
errors show a somewhat stronger effect, with the interaction increasing from
4.6% for the PA group to 13.7% for the PN group; however, this effect is
only marginally significant by the min F' test (SE(subjects) = 2.6%,
SE(items) = 2.0%, min F'(1, 34) = 3.73, 0.05 < p < 0.10). The object size
predictions, however, are clearly confirmed since the size of the interaction
is 172 msec for the PA group and only 22 msec for the PN group (SE(subjects) = 27 msec, SE(items) = 24 msec, min F'(1, 38) = 9.32, p < 0.01).
The error rates show a parallel difference, though in this case not a significant one (SE(subjects) = 2.8%, SE(items) = 2.0%, min F'(1, 31) = 2.15, p >
0.10).

Table 1. Mean Reaction Time (msec) and Percent Errors (in Parentheses) for Predicate
Adjective (PA) and Predicate Noun (PN) Sentences in Experiment 3, by
Truth, Superordinate Size, and Object Size.

                 True PA Sentences   False PA Sentences  True PN Sentences   False PN Sentences
                 Object Size         Object Size         Object Size         Object Size
Superordinate
Size             Small     Large     Small     Large     Small     Large     Small     Large

Small            1175      1003      922       1142      735       712       691       688
                 (31.0)    (9.6)     (2.0)     (18.5)    (27.2)    (18.1)    (3.8)     (12.7)
Large            1080      988       984       1189      656       634       716       763
                 (13.6)    (11.6)    (5.6)     (17.8)    (6.0)     (5.6)     (16.5)    (21.0)
Taken together, the above results provide rather strong support for the
Computation
model. Moreover, by partitioning the items, we have been able
to show effects of both superordinate
and object size when these factors
vary orthogonally.
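The interaction sizes quoted above can be recovered from the cell means of Table 1, as the following Python sketch shows. Defining the size of an interaction as the average of the drop for True items and the rise for False items is an assumption; it is adopted here because it reproduces the reported 172 and 22 msec for object size. The dictionary layout and function names are illustrative.

```python
# Mean correct reaction times (msec) from Table 1, keyed by
# (group, truth, superordinate size, object size).
RT = {
    ("PA", "true", "small", "small"): 1175, ("PA", "true", "small", "large"): 1003,
    ("PA", "true", "large", "small"): 1080, ("PA", "true", "large", "large"): 988,
    ("PA", "false", "small", "small"): 922, ("PA", "false", "small", "large"): 1142,
    ("PA", "false", "large", "small"): 984, ("PA", "false", "large", "large"): 1189,
    ("PN", "true", "small", "small"): 735, ("PN", "true", "small", "large"): 712,
    ("PN", "true", "large", "small"): 656, ("PN", "true", "large", "large"): 634,
    ("PN", "false", "small", "small"): 691, ("PN", "false", "small", "large"): 688,
    ("PN", "false", "large", "small"): 716, ("PN", "false", "large", "large"): 763,
}


def cell_mean(group, truth, factor, level):
    """Average RT for one level of a size factor, collapsing over the other."""
    position = {"superordinate": 2, "object": 3}[factor]
    values = [rt for key, rt in RT.items()
              if key[0] == group and key[1] == truth and key[position] == level]
    return sum(values) / len(values)


def interaction(group, factor):
    """Mean of the True-item drop and the False-item rise from small to large."""
    true_drop = (cell_mean(group, "true", factor, "small")
                 - cell_mean(group, "true", factor, "large"))
    false_rise = (cell_mean(group, "false", factor, "large")
                  - cell_mean(group, "false", factor, "small"))
    return (true_drop + false_rise) / 2
```

On this definition the superordinate size interaction is also somewhat larger for the PN group than for the PA group, in line with the unreliable trend noted in the text.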

General Discussion

Computation versus Pre-Storage Models

The Computation model evolved from the basic idea that the truth of sentences with relative adjectives is determined by mental comparison. For the
sentence Spruces are tall, this would mean comparing the stored height of
spruces with that of its immediate superordinate
tree. However, the results
of Experiment
1 led us to modify this assumption by suggesting that two
comparisons
were involved - one to the normal value of the superordinate
and the other to a normal value for everyday objects. Experiments
2 and 3
lent some support to this prospect. RTs, errors, and truth ratings all showed
effects of symbolic distance to both the superordinate
value, and to the
object value as well.
The Pre-storage model stacks up less well against the evidence. While it
was able to explain the results of the first experiment on the assumption that
two relative properties are stored, it ran into difficulties in accounting for
the symbolic distance effects in Experiments
2 and 3. Of course, our results
do not imply that relative properties are never pre-stored; what the evidence
rules out is pre-storage for all relative properties of common object concepts.
Although the results favor a Computation
approach, there are a number of
residual problems with such a model that we should consider carefully. One
of these concerns the inefficiency
of Computation,
for it seems redundant
to calculate the truth of a relative sentence in the elaborate manner that the
model dictates. Why not store the result of an initial computation
once and
for all so that it can be referred to as needed? The question of efficiency,
however, depends on the relative costs attached to storage and processing.
If storage consumes a large share of the system's resources, it may prove
more efficient
to store a minimal amount of information.
By analogy,
mental arithmetic would be computationally
easier if one memorized
the
multiplication
table for all pairs of numbers less than 100. The fact that few
of us do so indicates that computational
simplicity must trade against
storage economy. Furthermore,
while storage is not out of the question for
the kinds of sentences considered here, we should remember that relative
information
is used in other ways as well - for example, to compare two
instances (A spruce is taller than a refrigerator) or to compare an instance to
a metric reference point (A spruce is more than six feet tall) or to a contextually
established reference point (Spruces are the tallest trees on this
block). Since there is an unlimited number of such propositions,
not all of
them can be pre-stored. Given that computation
is needed in these cases,
it would not be surprising if a similar process were applied to sentences such
as those considered here.
One can grant the plausibility of a computation
process, however, and still
object to the model outlined above. In particular, the idea of two distinct
comparisons seems odd, since in functional terms a single comparison would
be easier to perform and would simplify communication
about relative
facts. It may be possible to formulate the Computation
model in a way
that omits the object comparison and that is still consistent with the experimental results. For example, we can suppose that instead of the double
comparisons,
a subject weights the result of a single superordinate
comparison by the absolute size of the instance, with instances at the extremes of
the size continuum
receiving high weight. But of course in these terms, the
question then becomes why any weighting is needed in determining the sentence's truth value. Note, too, that something akin to an object reference
point is still required in this alternative model to decide what constitutes
the upper extreme of a dimension like size that is unbounded above.
A second possibility
is that the object comparison
be explained away
as an artifact of the experimental
situation. In all of the studies reported
here, subjects received a randomized set of instances drawn from a variety
of categories, and the range of instances may itself provide a context against
which any given instance will appear big (tall, thick, and so on). We can
think of this as a type of adaptation
level that could be absent in more
naturalistic settings.
While we have no firm evidence against the adaptation theory, Experiment
3 provides some suggestive data. If the effect of object size is due to adaptation, we should find that this effect increases over the four blocks of trials.
But in fact, the opposite trend appears in the results: the crucial interaction
of object size and truth decreases steadily (though not significantly)
across
blocks. Moreover,
there are independent
reasons why an object reference
point might be important.
First, for very atypical instances, the immediate
superordinate
may be uncertain or inaccessible. Second, even if the immediate superordinate
is obvious, its reference point may not be. For example,
the superordinate
size of a category like weapon will vary greatly depending
on which instances we are willing to include in this category (the size will be
quite large if such things as missiles are included). Both problems can be
avoided by using the object reference point.
Properties in Semantic Memory

While our experiments have tried to determine the status (pre-stored or computed) of relative properties, we have simply assumed that absolute properties are pre-stored. However, it is possible to challenge this assumption, and
in fact, there are several good reasons for doing so.
First, if we consider properties like being non-pink or being-a-resident-of-a-state-beginning-with-l,
then it becomes clear that not all absolute properties can be pre-stored. Non-pink is an absolute property if pink is, but it is
unlikely that concepts such as grass and snow contain non-pink in their property lists. While we could memorize the fact that grass is non-pink, we need
not do so, but can infer it from other sources of information.
Second, it is easy to imagine how even common absolute properties could
be computed
rather than stored. In answering the question Is a banana
yellow? we may compare the hue of bananas with some prototypical
yellow
in much the same way as we would compare the size of a banana in determining whether it is big. Te Linde and Paivio (1979) have obtained clear distance effects when subjects must determine the similarity between color chips
and a named color. Stephens (Note 1) has also found distance effects for
absolute properties of named objects (by asking questions like Which is more
yellow - a lemon or a banana?) that parallel those for relative properties.
These possibilities suggest that the substantive difference between relative
and absolute adjectives may depend, not on whether they are computed or
pre-stored,
but on the kind of computation
involved. In this respect, the
notion of two reference points provides one way that this difference might
be framed. As a first approximation,
we can suppose that adjectives vary in
the importance
attached to the superordinate
and object points during the
comparison.
Relative adjectives would depend most on the superordinate
point but, for reasons described above, would be influenced to a lesser extent by the
object point. Absolute adjectives, on the other hand, would be dominated
mainly by the object point so that judgments would be indifferent
to category membership
of the modified noun (cf. Wheeler, 1972). In this way,
we can account for the logical distinctions
proposed by Katz (1972) and
Vendler (1968) and, at the same time, explain our intuition of a continuum
between absolute and relative adjectives, as discussed earlier.
However, viewing adjectives in this way leads us to a number of difficult
questions. Clearly, not all properties can be computed, since if this were true
there would be nothing for the comparison process to operate on. But while
some core of data must be present to make the computations
possible, it
appears to be a very difficult task to get at these core properties. Perhaps
there is some underlying level of analysis in which all properties are pre-stored. But it is equally possible that pre-storage occurs with just a few landmark instances. For example, in determining
whether an object X is big,
we could try to recall its relation to some other object Y that we have
already determined
to be big. If X and Y share the same superordinate,
and
if we can show that X contains Y, or that Y is a part of X, or that X completely occludes Y when X is immediately in front of Y, then we can deduce
that X is also big. Such a process may be less elegant than a simple comparison, but it is not out of the running (see Banks, 1977).
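The landmark procedure suggested above amounts to a small rule-based inference, sketched here in Python. The data structure, the function name, and the example facts are all hypothetical; a return value of False means only that bigness cannot be deduced from the stored landmarks, not that the object is small.

```python
from dataclasses import dataclass, field


@dataclass
class Landmarks:
    superordinate: dict = field(default_factory=dict)  # noun -> superordinate
    known_big: set = field(default_factory=set)        # landmarks already judged big
    contains: set = field(default_factory=set)         # (x, y): x contains y
    part_of: set = field(default_factory=set)          # (y, x): y is a part of x
    occludes: set = field(default_factory=set)         # (x, y): x completely occludes y


def can_deduce_big(x: str, kb: Landmarks) -> bool:
    """True if x is a landmark or outsizes a big landmark in its own category."""
    if x in kb.known_big:
        return True
    category = kb.superordinate.get(x)
    if category is None:
        return False
    for y in kb.known_big:
        if kb.superordinate.get(y) != category:
            continue  # the rule applies only within one superordinate
        if (x, y) in kb.contains or (y, x) in kb.part_of or (x, y) in kb.occludes:
            return True
    return False


# Invented facts: a ferry and a garage can each hold a bus.
kb = Landmarks(
    superordinate={"bus": "vehicle", "ferry": "vehicle", "garage": "building"},
    known_big={"bus"},
    contains={("ferry", "bus"), ("garage", "bus")},
)
```

Here the ferry is deduced to be big because it shares a superordinate with the landmark bus and contains it, whereas nothing follows for the garage, which belongs to a different category.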
Another question concerns the ultimate grounds for the distinction
between relative and absolute adjectives. Why, for example, are color terms
absolute and dimensional adjectives relative? The difference apparently
does not lie in our ability to distinguish variation in the corresponding
qualities, for we can certainly discern degrees of yellowness. Number of underlying dimensions is also immaterial since big, which depends on three dimensions, is no less a relative adjective than tall, which depends on one. One
possibility is that the difference has less to do with the type of attribute than
with its distribution
among objects. For relative adjectives, variability of the
corresponding
property
may be greater between superordinate
classes than
within them, so that a comparison to the superordinate
reference point will
convey valuable information.
For absolute adjectives, variability
may be
equally great within as between superordinates,
so that such a comparison
is irrelevant. This question is far from settled, however, and the distinction
may depend also on the integrality
of the property
(Garner, 1974), the
salience of its component
dimensions (Kamp, 1975), or the way in which
the reference point changes with exposure to new instances (Wheeler, 1972).
Finally, it is important to realize that absolute and relative adjectives
do not exhaust the range of adjectives in English. For example, we have not
considered fictionalizing adjectives like mythical that map real entities
into imaginary ones or negators like fake or pseudo that signal non-membership
in a given category (R. Clark, 1970). Adjectives like these
probably call for a very different kind of analysis than the one offered
above. However, these items take us further from the traditional view of
adjectives as properties stored with the nouns they modify, and in this way,
they echo the message of the preceding studies.

References
Anderson, J. R. (1976) Language, memory, and thought. Hillsdale, N.J., Lawrence Erlbaum Associates.
Banks, W. P. (1977) Encoding and processing of symbolic information in comparative judgments. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 11). New York, Academic Press.
Banks, W. P., and Flora, J. (1977) Semantic and perceptual processes in symbolic comparisons. J. exper. Psychol.: Human Perception and Performance, 3, 278-290.
Battig, W. F., and Montague, W. E. (1969) Category norms for verbal items in 56 categories: A replication of the Connecticut category norms. J. exper. Psychol. Mono., 80 (3, Pt. 2).
Bierwisch, M. (1967) Some universals of German adjectivals. Found. Lang., 3, 1-36.
Bierwisch, M. (1971) On classifying semantic features. In D. D. Steinberg and L. A. Jakobovits (Eds.), Semantics: An interdisciplinary reader in philosophy, linguistics, and psychology. Cambridge, Cambridge University Press.
Clark, H. H. (1973) The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. J. verb. Learn. verb. Behav., 12, 335-359.
Clark, R. (1970) Concerning the logic of predicate modifiers. Nous, 4, 311-355.
Cohen, J., and Cohen, P. (1975) Applied multiple regression/correlation analysis for the behavioral sciences. Hillsdale, N.J., Lawrence Erlbaum Associates.
Collins, A. M., and Loftus, E. F. (1975) A spreading-activation theory of semantic processing. Psychol. Rev., 82, 407-428.
Cresswell, M. J. (1976) The semantics of degree. In B. H. Partee (Ed.), Montague grammar. New York, Academic Press.
Fillmore, C. J. (1971) Entailment rules in a semantic theory. In J. F. Rosenberg and C. Travis (Eds.), Readings in the philosophy of language. Englewood Cliffs, N.J., Prentice-Hall.
Garner, W. R. (1974) The processing of information and structure. Potomac, Md., Lawrence Erlbaum Associates.
Higgins, E. T. (1976) Effects of presuppositions on deductive reasoning. J. verb. Learn. verb. Behav., 15, 419-430.
Holyoak, K. J. (1977) The form of analog size information in memory. Cog. Psychol., 9, 31-51.
Huttenlocher, J., and Higgins, E. T. (1971) Adjectives, comparatives, and syllogisms. Psychol. Rev., 78, 487-504.
Jamieson, D. G., and Petrusic, W. M. (1975) Relational judgments with remembered stimuli. Percep. Psychophys., 18, 373-378.
Kamp, J. A. W. (1975) Two theories about adjectives. In E. L. Keenan (Ed.), Formal semantics of natural language. Cambridge, Cambridge University Press.
Katz, J. J. (1972) Semantic theory. New York, Harper and Row.
Kintsch, W. (1974) The representation of meaning in memory. Hillsdale, N.J., Lawrence Erlbaum Associates.
Kucera, H., and Francis, W. N. (1976) Computational analysis of present-day American English. Providence, R.I., Brown University Press.
Langford, C. H. (1942) Moore's notion of analysis. In P. A. Schilpp (Ed.), The philosophy of G. E. Moore. Chicago, Northwestern University Press.
Meyer, D. E., and Schvaneveldt, R. W. (1971) Facilitation in recognizing pairs of words: Evidence of a dependence between retrieval operations. J. exper. Psychol., 90, 227-234.
Miller, G. A., and Johnson-Laird, P. N. (1976) Language and perception. Cambridge, Mass., Belknap Press.
Mosteller, F., and Tukey, J. W. (1977) Data analysis and regression. Reading, Mass., Addison-Wesley.
Moyer, R. S. (1973) Comparing objects in memory: Evidence suggesting an internal psychophysics. Percep. Psychophys., 13, 180-184.
Moyer, R. S., and Bayer, R. H. (1976) Mental comparison and the symbolic distance effect. Cog. Psychol., 8, 228-246.
Moyer, R. S., and Dumais, S. T. (1978) Mental comparisons. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 12). New York, Academic Press.
Paivio, A. (1975) Perceptual comparisons through the mind's eye. Mem. Cog., 3, 635-647.
Parsons, T. (1972) Some problems concerning the logic of grammatical modifiers. In D. Davidson and G. Harman (Eds.), Semantics of natural language. Dordrecht, Holland, D. Reidel.
Ross, W. D. (1930) The right and the good. Oxford, Clarendon Press.
Sapir, E. (1944) Grading: A study in semantics. Philos. Sci., 11, 93-116.
Smith, E. E. (1978) Theories of semantic memory. In W. K. Estes (Ed.), Handbook of learning and
cognitive processes (Vol. 6). Hillsdale, N.J., Lawrence Erlbaum Associates.
Smith, E. E., Rips, L. J., and Shoben, E. J. (1974) Semantic memory and psychological semantics. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 8). New York, Academic Press.
Suzuki, T. (1970) An essay on the anthropomorphic norm. In R. Jakobson and S. Kawamoto (Eds.), Studies in general and oriental linguistics. Tokyo, TEC Company.
Te Linde, J., and Paivio, A. (1979) Symbolic comparisons of color similarity. Mem. Cog., 7, 141-148.
Vendler, Z. (1968) Adjectives and nominalizations. The Hague, Mouton.
Wallace, J. (1972) Positive, comparative, superlative. J. Philos., 69, 773-782.
Wheeler, S. C. (1972) Attributives and their modifiers. Nous, 6, 310-334.
Wierzbicka, A. (1972) Semantic primitives. Frankfurt, Athenäum.
Winer, B. J. (1971) Statistical principles in experimental design. New York, McGraw-Hill.


Reference Notes
1. Stephens, D. (1978) Processing of pictures versus words in a comparative judgment task. Unpublished manuscript, University of Chicago.

Previous studies of semantic memory have overlooked an important distinction among what have been called property statements. Statements with relative adjectives (e.g., flamingos are large) imply a comparison with a reference point or standard norm associated with the superordinate category (e.g., a flamingo is large for a bird). The truth value of statements with absolute adjectives (e.g., flamingos are pink) is generally independent of this kind of reference. The psychological consequences of this distinction were studied in Experiment 1. Subjects were asked to verify sentences containing either absolute or relative adjectives in predicate-adjective constructions (e.g., a flamingo is large (pink)) or in predicate-noun constructions (e.g., a flamingo is a large bird (pink bird)). In the latter case the predicate noun is the immediate superordinate. Reaction times and errors are lower for relative sentences when the superordinate term is specified. For absolute sentences there is no difference. These data suggest that the truth value of relative sentences depends not only on the superordinate term but also on more general norms for familiar objects. Experiment 2 shows that the ranking of the truth values of relative sentences is a function of the difference between the given instance and the superordinate standard (for example, the size of a flamingo relative to that of an ordinary bird), and of the difference between the instance and the norm for familiar objects. Experiment 3 replicates these results using reaction time as the dependent measure.

Cognition, 8 (1980) 175-185
© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands

3

Very long term memory for tacit knowledge*

RHIANON ALLEN**
The Graduate Center of CUNY

ARTHUR S. REBER
Brooklyn College of CUNY

Abstract
Very long term memory for abstract materials was examined by recalling
subjects who had served in a synthetic grammar learning experiment two
years earlier. In that study (Reber & Allen, 1978) we differentiated among
several cognitive modes of acquisition, their resultant memorial representations, and their associated decision processes. Two years later and without
any opportunity for rehearsal or relearning, subjects still retain knowledge
of these grammars to a remarkable degree. Although some differences have
become blurred with the passage of time, the form and structure of that
knowledge and the manner in which it is put to use remain strikingly similar
to the original. That is, differences traceable to acquisition mode and conditions of initial training can still be observed. As in the original study, these
results are discussed within the general context of a functionalist approach
to complex cognitive processes.
This paper is a report of rather remarkably persistent long term memory for
highly abstract and complex materials; specifically,
the knowledge of the
grammatical structure of two artificial languages after a two year hiatus.
In researching the area of very long term memory we were struck by the
lack of attention which has been paid to memories of this kind. For the most
part, the study of long term memory has dealt with real world knowledge which is both highly codable and likely to be either rehearsed or refreshed by day to day activities and events. The few studies that we found which used arbitrary stimuli, however, suggested that human memory is certainly quite robust (Wickelgren, 1972; Burtt, 1941; Kolers, 1976).

*This research was conducted while the senior author was supported by a doctoral fellowship from the Social Sciences and Humanities Research Council of Canada.
**Requests for reprints should be sent to Rhianon Allen, Developmental Psychology Program, The Graduate Center of CUNY, 33 West 42nd Street, New York, NY, 10036, USA.
1 The original study was reported two years ago in this journal (Reber & Allen, 1978). Although we provide a synopsis below of the major findings of that experiment, the interested reader should refer to that paper for details on procedure and results as well as a full discussion of the theoretical issues which underlie the learning of complexly structured, rule-governed stimulus domains.

In this experiment
we take these notions of arbitrariness and nonrehearsability of the stimulus materials to previously unexplored
extremes. First,
we are focusing on very long term memory for knowledge of a complex
stimulus domain which was specifically selected to be as remote as possible
from normal day to day activities. Second, the knowledge of grammatical
structure which resulted from the original learning was largely unconscious
so we shall be looking at the longevity of implicit knowledge, not explicit.
Third, during the original experiment
the subjects did not know that there
was to be this later follow-up and we can be quite confident that our subjects have not rehearsed the material in the interim. Indeed, it is far from
clear, given our understanding
of memorial strategies, how one can rehearse
abstract information
which is tacitly coded.3
Before proceeding,
it seems prudent to review briefly some of the basic
findings from the original study so that the kinds of memorial residues we
are looking for can be specified. In that experiment,
subjects learned about
the underlying
grammatical
structure
of two different artificial languages
under two different training conditions. On one occasion a paired associate
(PA) procedure
was used where exemplary letter strings from one artificial
language were paired with the names of cities; on the other occasion an
observation (OBS) procedure was used in which the same subjects attended
to a series of exemplars from the other language. Knowledge of each synthetic language was assessed using a well-formedness
test in which subjects
had to judge the grammaticality
of a large number of novel letter strings.
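
As an illustration only, the well-formedness test can be sketched as a membership check against a finite-state grammar. The grammar below is a hypothetical toy, not one of the two grammars used in the study, and the names `TRANSITIONS`, `is_grammatical` and `proportion_correct` are assumptions introduced for this sketch.

```python
# Hypothetical toy finite-state grammar (NOT one of the study's grammars).
# Each state maps a permitted letter to the next state; state 3 is the exit.
TRANSITIONS = {
    0: {"T": 1, "V": 2},
    1: {"P": 1, "T": 2},
    2: {"X": 2, "S": 3},
}
START = 0
ACCEPTING = {3}


def is_grammatical(string):
    """Return True if the letter string can be generated by the toy grammar."""
    state = START
    for letter in string:
        state = TRANSITIONS.get(state, {}).get(letter)
        if state is None:  # no legal transition for this letter
            return False
    return state in ACCEPTING


def proportion_correct(judgments):
    """judgments: list of (string, said_yes) pairs from one subject.

    A response is correct when the yes/no judgment matches the
    string's actual grammatical status.
    """
    correct = sum(is_grammatical(s) == said_yes for s, said_yes in judgments)
    return correct / len(judgments)
```

Under this toy grammar, "TPPTXS" and "VS" are well-formed, while "TXS" (illegal second letter) and "TT" (stops before the exit state) are not.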
The results revealed that subjects have available three basic cognitive
modes for acquiring knowledge
of such complex stimulus environments.
2 The decision to use stimulus materials whose structure was dictated by finite-state grammars was motivated by theoretical issues concerning acquisition of tacit knowledge. They seemed a reasonable choice because they are arbitrary and can be made arbitrarily complex; they are organized and deeply so; and they have structural forms that are most unlikely to be amenable to the typical subject's bank of heuristic devices for learning about rule-governed systems. These points are discussed in more detail in Reber and Glick (forthcoming).
3 This issue of unconscious rehearsal or unconscious work has received some attention in the area of problem solving. The so-called incubation period during which solutions to problems are often achieved certainly suggests that some kind of long term unconscious cogitation takes place. A nice discussion of a number of mechanisms which could be operating to produce the incubation effect may be found in Posner (1973). However, we suspect that there are fundamental differences here since the problem solver is aware that he is a problem solver; our subjects were unaware that they were later to be memorially responsible for the material learned.


Each acquisition mode results in a particular form of memorial representation and an attendant set of operations for making decisions. Let us review
each.
(a) Explicit rule induction
This procedure consists of the overt formation and testing of hypotheses
about aspects of letter order and the establishment of consciously held rules.
We found this mode appearing to a limited extent in both the PA and OBS
training procedures.
Typically,
these rules were correct reflections
of the
letter order constraints although they were not particularly
sophisticated.
They consisted almost entirely of relatively simple notions about short letter
groups (primarily bigrams) which occur in initial and terminal positions of
letter strings. Even at this simple level, however, subjects reported using
them on only about 40% of the test trials. Generally speaking, this explicit
mode can be identified with the phenomena
which have been extensively
studied in the literature
on concept formation,
problem solving, pattern
learning, etc.
(b) Individuated memory and the analogic strategy
This procedure consists of attending to and memorizing specific items and/or
discriminable differences between items during learning - operations which
result in a fairly concrete memorial space. The PA task, by its very nature,
invited such a mode and hence it was associated almost entirely with that
learning procedure.
Decisions about the acceptability
of test strings tended
to be made by searching for an analogy between the to-be-judged item and
the contents of this individuated memory (see Brooks, 1978, for a general
theoretical discussion of this procedure).
Not surprisingly, this strategy led to high accuracy in the assignment of
grammaticality
to the few old test items which had also been part of the
learning set. It was, however, also associated with a relatively poor knowledge of structure and a high rate of erroneous rejection of novel grammatical test strings. These omission errors were frequently based on a tendency
to reject any item for which no acceptable analogy could be recalled.
(c) Implicit learning and the abstraction strategy
This acquisition mode consists of the unconscious abstraction of the underlying rule system inherent in the exemplars presented during learning. Characteristic of this mode is that little or no specific concrete information about the actual learning items is retained, and decisions about the well-formedness of test strings are made largely on an intuitive basis. Although there was evidence that some learning of this kind accompanied the PA procedure, the abstraction strategy was strongly associated with the OBS training procedure which, unlike PA, has no specific task demands. The advantage in dealing with old strings found with the analogy strategy was totally absent here; all strings from the learning set are dealt with as if they were novel strings.
In this study, then, we are looking for evidence with respect to three
important
issues in the study of very long term memory. First, can a body
of unconscious knowledge be retained for an extended period of time without the opportunity
for rehearsal? Second, how important is the mode of
acquisition
of original knowledge in determining
what is retained? Third,
how closely does the form of two-year-old
knowledge
resemble that of
original knowledge?

Method
Subjects

Of the ten subjects in the original experiment, we were able to recall eight.
The two unavailable ones were typical of the group as a whole and, since
each was from a different order condition in the original design, there are
no reasons to suspect that any systematic biases were introduced
by the
failure to corral all ten. For reasons explained in the original paper, these
subjects were hand-picked
advanced undergraduates
and graduate students
who agreed to serve without pay or other remuneration.
Stimulus materials

The stimuli used were the letter strings from the two tests for well-formedness in the original study. In that experiment,
the knowledge of grammatical
structure acquired during learning was evaluated by presenting each subject
with a set of 100 strings of letters (actually only 50 distinct items were used,
each being presented twice), one-half of which conformed with the rules for
letter order (the grammatical strings) and one-half of which contained one or
more violations of those rules (the nongrammatical
strings). Details of these
test items are given in Reber and Allen (1978). For our purposes here note
that five of the grammatical strings had been used as part of the original learning stimuli (the old items) and the other 20 grammatical strings occurred only during testing (the novel items).


Procedure
Prior to testing, subjects were told which grammar they would be responsible for and asked in all cases to respond yes or no depending upon
whether or not, as best as they could recall, each item conformed to the
rules of that grammar. All subjects were reminded that half of the items
were acceptable and half were not. There was no opportunity
for relearning
or refamiliarization
with the materials. No other information
about the
materials or the task was given; no mention was made of the repetition of
test items or about the existence of the old items; no feedback about the
correctness
of their responses was given; and no reference was made to the
fact that these test strings were the same ones which had been used two
years ago. Both grammars were tested in exactly the same manner, each
time reminding the subject about the procedure used to learn that particular
grammar two years ago. After completing the well-formedness
test, subjects
were asked to provide an estimate of how well they thought they had done
by estimating how many of the 100 items they classified correctly.
Counterbalancing and notation
The order of running was counterbalanced
with four of the subjects first
tested with the strings based on the grammar learned by the PA procedure
two years earlier (denoted as PA-1st subjects) and the remaining four subjects beginning testing with the one learned with the OBS procedure (OBS-1st subjects). Following testing on the first grammar, subjects proceeded
directly to the task for the other grammar (denoted as PA-2nd and OBS-2nd). Note that subjects referred to as OBS-2nd are the same subjects as
PA-1st and similarly for the PA-2nd and OBS-1st subjects.
All subjects were run in the same order condition
as two years ago.
For example, PA-1st subjects here are the same subjects as those who were
described as PA-1st in Reber and Allen (1978). This point will be important
later since we will report on some effects that can be traced back to the
order of running in the initial training sessions.


Results
Introspections
At the outset, only one or two subjects thought that they were now capable
of performing
above chance on this task. However, as testing continued all
reported that they were, to their surprise, becoming more and more aware of
their ability to make accurate decisions and all but one of the subjects
estimated their performance
to be above chance. However, unlike two years
ago, the overall correlation between estimated and actual performance was not significantly different from zero: our subjects knew they were performing above chance but they had no accurate sense of just how well they were doing.
The pattern of justifications
offered two years ago had revealed some
strong differences
in the types of reasons given for making decisions following the two learning procedures. Here, no differences were observed. For
both grammars we received a mixture of justifications like "I'm just guessing," "This one somehow feels right (or wrong)," "I think I remember this one," and so forth. However, the frequency of such justifications was
very low. Unlike two years ago, when roughly 40% of all responses could be
justified, a concrete reason for a decision in this follow-up was a relatively
rare event. This apparent loss of conscious contact with at least some sense
of what is known probably accounts for the lack of confidence that subjects
had in their knowledge and the generally poor ability to assess actual performance.
Finally here, virtually all subjects felt that the task became easier as
testing proceeded and they thought their performance
improved consistently.
Although there was a trend in this direction over the full course of testing,
it failed to reach significance, F(3, 21) = 1.68. The sense of increased performance
over trials probably has more to do with a refamiliarization
with
the task than with an actual increase in the amount of recalled knowledge.
Probability of a correct response (Pc)
Table 1 gives the mean Pc values for the grammatical and non-grammatical items for the grammars learned under each condition in both the original experiment and the follow-up. The single most interesting value in this experiment is the overall Pc for the follow-up of 0.667. With chance at 0.5 and Pc > 0.6 needed for significance for an individual subject, this value demonstrates that sufficient knowledge of these grammars has survived the two year hiatus for our subjects to reliably distinguish well-formed from non-well-formed strings. However, it is also clear that there has been a decrease in overall performance; the difference between the Pc values from the original and follow-up testing sessions is significant,4 F(1,7) = 26.2, p < 0.005.
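
The individual-subject criterion of Pc > 0.6 can be illustrated with an exact binomial tail. This is a sketch under stated assumptions, not necessarily the authors' procedure: it treats the 100 test responses as independent trials with chance at 0.5 (an idealization, since each of the 50 items was presented twice), and the function names are introduced here for illustration.

```python
from math import comb


def upper_tail(k, n, p=0.5):
    """Exact binomial probability of k or more correct out of n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_correct_for_significance(n=100, alpha=0.05, two_sided=True):
    """Smallest number correct (above chance) whose p-value falls below alpha."""
    for k in range(n // 2, n + 1):
        tail = upper_tail(k, n)
        p_value = min(1.0, 2 * tail) if two_sided else tail
        if p_value < alpha:
            return k
    return None
```

With these assumptions a two-sided test at the 0.05 level requires 61 correct responses out of 100, which agrees with the stated criterion of Pc > 0.6; a one-sided test would require 59.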
Table 1. Probability of a Correct Response (Pc) on the Two Well-formedness Tasks

                   Original Task                            Follow-up Task
Item Status        Observation  Paired Associates  Means    Observation  Paired Associates  Means
Grammatical        0.845        0.710              0.778    0.678        0.650              0.664
Nongrammatical     0.775        0.780              0.778    0.690        0.650              0.670
Means              0.808        0.740              0.778    0.684        0.650              0.667

In the original experiment the OBS procedure produced better overall performance than PA and there was an item status by training procedure interaction.
These effects are no longer statistically detectable. Although on
the surface this suggests that there has been a significant loss of knowledge
of the grammar learned using the OBS procedure relative to the amount lost
from the PA acquired grammar, the interactions which would reflect such an
effect (the training procedure by time of testing and the training procedure
by time of testing by grammatical status) were not significant, F( 1,7) = 4.88
and 5.43 respectively, ps > 0.05.
Two years ago subjects were better at detecting non-grammatical items which contained multiple errors than those with but a single violation. This effect emerges intact two years later, Pc = 0.80 and 0.65 for multiple and single letter violations respectively. Two years ago subjects were also better at detecting non-grammatical
items with single violations in the initial position than in any other position. This effect is no longer present; no particular violation location shows an advantage over any other. This result is
probably due to the loss of explicit knowledge about letter position constraints. As mentioned above, a large proportion
of the justifications
which
subjects supplied in the initial testing concerned initial letters and initial
bigrams. Once this concrete information
is lost from memory the detection
advantage accruing to first letter violations goes with it.

4 Wherever statistical comparisons are drawn between the original and follow-up studies, the data from the two subjects not recalled have been discarded. All tests to follow, therefore, utilize a completely within subjects design with eight subjects. The deletion of these two original subjects seems not to have resulted in any systematic loss of data.


Representativeness

The issue here concerns the extent to which subjects' knowledge of structure is an accurate reflection of grammatical structure as displayed in the original learning stimuli. We had noted two kinds of non-representativeness in the original experiment: the explicit rule induction strategy occasionally led subjects to articulate rules which were simply incorrect, and the analogy strategy often led subjects to consistently misclassify items on the grounds that candidates for analogy-by-similarity were not in memory. The existence of non-representativeness is determined by analyzing the pattern of responses to the two presentations of each test item, comparing (by a χ² test) the number of repeated misclassifications (EE) to the number of single misclassifications (CE and EC), where C denotes a correct response and E an error. Table 2 shows all four possible patterns from each learning procedure and order of running.
Table 2. Patterns of Responding to Successive Presentations on the Follow-up Well-formedness Task

           Training Condition
           Observation           Paired Associates
Pattern    Run 1st    Run 2nd    Run 1st    Run 2nd
CC         109        104        93         106
CE         24         35         20         29
EC         36         26         32         40
EE         31         35         55         25

The overall EE rate here is significantly higher than would be expected if errors were simply a result of guessing when an item's grammatical status is not determinable given a representative knowledge base, χ²(3) = 12.0, p < 0.01. This effect, as it was at the time of original testing, is contained entirely in the PA-1st subjects, χ²(1) = 10.43, p < 0.01.
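
The text does not spell out the exact form of the χ² computation, so the following is only one plausible reading: if errors come from pure guessing, the EE count should equal the mean of the CE and EC counts, and the two are compared against their common expected value. The function name and the dictionary layout are assumptions made for this sketch; the counts are those of Table 2. On this reading the values come out close to, but not identical with, the reported statistics.

```python
# Counts from Table 2 (follow-up test), keyed by training condition and order.
TABLE_2 = {
    "OBS-1st": {"CC": 109, "CE": 24, "EC": 36, "EE": 31},
    "OBS-2nd": {"CC": 104, "CE": 35, "EC": 26, "EE": 35},
    "PA-1st": {"CC": 93, "CE": 20, "EC": 32, "EE": 55},
    "PA-2nd": {"CC": 106, "CE": 29, "EC": 40, "EE": 25},
}

# Standard critical value of chi-square with 1 degree of freedom, alpha = 0.05.
CRITICAL_05_DF1 = 3.841


def repeated_error_chi_square(ee, ce, ec):
    """Chi-square (assumed form) comparing EE with the mean of CE and EC."""
    single = (ce + ec) / 2        # mean count of single misclassifications
    expected = (ee + single) / 2  # equal counts expected under pure guessing
    return ((ee - expected) ** 2 + (single - expected) ** 2) / expected


def excess_repeated_errors(counts):
    """True when EE exceeds the single-error mean beyond the 0.05 criterion."""
    stat = repeated_error_chi_square(counts["EE"], counts["CE"], counts["EC"])
    single = (counts["CE"] + counts["EC"]) / 2
    return counts["EE"] > single and stat > CRITICAL_05_DF1
```

Applied to Table 2, only the PA-1st group shows an excess of repeated errors under this reading, which matches the pattern the text describes.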
This result raises the question of whether subjects are still using the same inappropriate strategies and hence consistently misclassifying the same items on the follow-up. Since exactly the same test items were used both times, the overall pattern of classification responses can be easily traced through all four separate presentations. The proper test here is to compare the EEEE rate with the mean of the other 14 combinations of correct and error responding to ensure that the most conservative test is being applied. As before, only the PA-1st subjects show a significant tendency to commit repeated errors, χ²(1) = 6.13, p < 0.01, with three of the four subjects in this condition reaching significance. Interestingly, fully 88% of these consistent misclassifications were rejections of items which were actually grammatical. This is the reverse of all other conditions where the modal error was to accept non-grammatical items. The tendency for PA-1st subjects to consistently err on grammatical items thus seems to result from their persistent use of the analogic strategy - a strategy which gets one into difficulty when a test item does not sufficiently resemble the letter strings in memory.
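
The count of 14 can be checked by enumeration. With four presentations there are 16 possible sequences of correct (C) and error (E) responses; setting aside EEEE and, on the reading assumed here, the all-correct sequence CCCC leaves 14 mixed sequences. The helper names and the example counts in the usage note are hypothetical.

```python
from itertools import product


def response_patterns(presentations=4):
    """All sequences of C (correct) and E (error) over the presentations."""
    return ["".join(p) for p in product("CE", repeat=presentations)]


def mixed_patterns(presentations=4):
    """Sequences containing at least one correct and one error response."""
    return [p for p in response_patterns(presentations) if "C" in p and "E" in p]


def eeee_versus_mixed_mean(counts):
    """Return (EEEE count, mean count over the mixed patterns).

    counts: dict mapping pattern strings to frequencies; missing patterns
    are treated as zero.
    """
    mixed = mixed_patterns(4)
    mean_mixed = sum(counts.get(p, 0) for p in mixed) / len(mixed)
    return counts.get("EEEE", 0), mean_mixed
```

For example, `eeee_versus_mixed_mean({"EEEE": 10, "CCEE": 7, "ECEC": 7})` returns the pair `(10, 1.0)`, since the 14 mixed patterns share 14 responses between them.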
Old versus novel grammatical test items
After original learning all subjects had performed equally effectively when assigning grammatical status to old test items and to novel grammatical items. In the follow-up, however, we now observed a significant learning procedure by old/new status interaction, F(1,7) = 10.62, p < 0.025. Specifically, PA subjects now perform significantly more poorly on novel grammatical items than they do on old items, F(1,7) = 6.15, p < 0.05; OBS subjects show no significant difference. This result clearly indicates that there is retention of specific learning set materials after a two year lag, and that it is associated with the acquisition mode that most strongly directed subjects' attention to the physical features of the stimulus material.
In summary, there is no doubt that knowledge of these grammars has
survived remarkably
well. Some of it is in an abstract form and some in
reasonably
concrete
form, and these memorial
forms correspond
quite
closely with the memory systems of two years ago. Moreover, as indicated
by the analysis of the response patterns to old and novel items and by the
emergence of non-representativeness
in the PA-1st subjects, both the beneficial and detrimental
impacts of these memorial forms can still be felt.

Discussion
To return to the original questions of robustness, form and mode of acquisition, it seems quite remarkable that information
gained over the course of
a 10 to 15 minute exposure to an artificial language can be retained for as
long as two years without intervening exposure or rehearsal. Even two years
after learning, all subjects are significantly above chance at assigning grammatical status to test items. But it is not the case that all types of knowledge are
equally robust. Explicit, conscious knowledge in particular appears to be
relatively fragile in nature.5 From a levels of processing point of view as put forward by Craik and his co-workers (Craik & Lockhart, 1972; Craik & Tulving, 1975) one is led to the surprising implication that knowledge gained from conscious, analytic procedures is less deeply processed than knowledge achieved by alternative means.

5 Rather, we should say that explicit knowledge is fragile without rehearsal. It seems an obvious point that if a rule (e.g., a chess move) is rehearsed periodically, it will be remembered, perhaps indefinitely. The important notion here is that the other two modes are robust without rehearsal.

Knowledge acquired in an implicit mode, on the other hand, can still be
detected
after the two year hiatus; subjects continue to be able to make
accurate judgments in the absence of verbalizable knowledge. What is known
here is still abstract in nature; no advantage accrues to old items and it
remains an accurate reflection of the underlying structure of the grammar.
While some blurring of structure knowledge comes with time, and subjects
report that immediate intuitive apprehension of grammaticality
is somewhat
harder to come by, knowledge gained in the implicit mode is persistent in
both form and quality.
A surprising result was the persistence of individuated
memories in the
PA-1st subjects. Although few could consciously
recall learning set items,
they continued to perform at a high rate on old letter strings. While these
subjects perform
well, their reliance on concrete
memory and analogy
carries some disadvantages.
First, holes in memory are not going to be
patched in the course of time. Consequently,
there is a high likelihood that
items initially rejected on the grounds that they do not resemble anything
in concrete memory will be repeatedly
rejected. Second, the individuated
memory space seems to be established at the expense of structural knowledge, resulting in subjects emerging from the PA training session with little
aside from concrete memories of letter strings and fragments thereof.
From the functionalist
point of view which we favor, the level of processing necessary for very long term memory can be attained by either
implicit processing or memorization
of exemplars. That is, both abstract
structural
knowledge
and concrete
individuated
memories are processed
deeply enough to result in knowledge
that is resistant to the passage of
time. Yet, pragmatic distinctions
can be drawn between these two modes.
The abstraction
strategy
encouraged
by the OBS procedure
confers an
advantage in identification
of underlying grammaticality,
in the recognition
of that which is structurally
regular. The memorize-and-analogize
strategy
optimized
by the PA procedure
yields an advantage in identification
of
specific stimuli, in the recognition
of that which has been confronted
previously.
Our original data suggested that strategies of acquisition are tailored to
immediate task demands, task expectations,
and stimulus parameters. These
learning strategies resulted in distinctive types of memorial representations
which are still detectable two years after learning. Very long term memory
appears not to be uniform in nature. That is, knowledge can be represented in either abstract or concrete form and it seems to maintain its form from the time of initial entry or formation. While two of the memorial forms and their attendant decision processes are remarkably robust, they may find themselves adaptive to different kinds of ecological niches when it comes to their application and deployment.

References
Brooks, L. R. (1978) Nonanalytic concept formation and memory for instances. In E. Rosch and B. B. Lloyd (Eds.), Cognition and categorization. Hillsdale, N.J., Lawrence Erlbaum Associates.
Burtt, H. E. (1941) An experimental study of early childhood memory. J. genet. Psychol., 58, 435-439.
Craik, F. I. M. and Lockhart, R. S. (1972) Levels of processing: A framework for memory research. J. verb. Learn. verb. Beh., 11, 671-684.
Craik, F. I. M. and Tulving, E. (1975) Depth of processing and the retention of words in episodic memory. J. exper. Psychol.: General, 104, 268-294.
Kolers, P. (1976) Pattern analyzing memory. Science, 191, 1280-1281.
Posner, M. I. (1973) Cognition: An introduction. Glenview, Ill., Scott, Foresman and Co., Chap. 7.
Reber, A. S. and Allen, R. (1978) Analogic and abstraction strategies in synthetic grammar learning: A functionalist interpretation. Cog., 6, 189-221.
Reber, A. S. and Glick, J. A. Implicit learning and stage theory. Int. J. Beh. Devel., in press.
Wickelgren, W. A. (1972) Trace resistance and the decay of long term memory. J. math. Psychol., 9, 418-455.

This research concerns long term memory for abstract material. The subjects of the experiment had participated, two years earlier, in a synthetic grammar learning experiment. In that study (Reber and Allen, 1978) several modes of cognitive acquisition were identified, along with the memory representations they induced and the decision processes associated with them. Two years later, with no opportunity for rehearsal or relearning, subjects remembered these grammars remarkably well. Although some distinctions had faded with time, the form and structure of the knowledge and the ways in which it was used remained closely comparable to the originals. The differences noted in acquisition mode and in initial training could still be observed. As in the first study, these results are discussed within the general context of a functionalist approach to complex cognitive processes.

Cognition, 8 (1980) 187-207
© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands

4

The acquisition of homonymy


ANN M. PETERS
University of Hawaii

ERAN ZAIDEL
University of California, Los Angeles and California Institute of Technology

Abstract
The growth in children's ability to perform the task of separating the sounds
of words from their meanings was investigated by asking children between
3;3 and 6;3 to select homonyms from pictures. The results show a growth in
ability with age, with a jump at 4;4. An investigation of the developmental
changes in the strategies employed shows that the task is cognitively
complex. Performance in the younger children is more hampered by a
resource-limited inability to cope with many cognitive factors all at once
than by lack of ability to do the linguistic aspects of the task. These cognitive factors include access to vocabulary, rehearsal of intermediate results,
and implementation of a search strategy.

Introduction
In English, with its phonologically-based writing system (as opposed, for example, to the Chinese ideographically-based system), reading readiness must depend in part on an ability to separate the sounds of words from their meanings. At what point in their linguistic development are English-speaking children able to effect this separation? Is there a clearly marked point at which such an ability appears? In order to be able to separate the sound of a word from its meaning, a child has to be able to operate on language metalinguistically. That is, both cognitive and linguistic development must have progressed to the point where the child is able to manipulate the pieces of language as if they were objects unrelated to any immediate need to communicate. Even very young children can, to some extent, separate the sounds of words from their meanings, but the circumstances under which this can happen seem to be very limited. Thus, children as young as 2;4 years have been observed to play with the sounds of their language in noncommunicative contexts (e.g., Chao, 1951; Keenan, ms.), and Iwamura (1977) has observed 3-year-olds discussing the pronunciation of words in the context of an ongoing conversation.

*We thank Deborah Burke for advice on test design, Leslie L. Wolcott for drawing of the test materials, Susan Fischer and Danny Steinberg for help in statistics, and the All Saints Day Care Center for making subjects and facilities available. Thanks are also due to H. and V. Wayland for support of the first author during this study. This work was also supported by NIMH grant MH-03372 and NSF grant BNS76-01629 to Prof. R. W. Sperry, by USPH awards MH-00179 and RR07003, and by NSF grant BNS78-24729 to E. Zaidel.

If, however, children are to have enough control of their phonological systems to make use of them, for instance, in learning to read, they must be able to do such metalinguistic tasks whenever it is necessary and not just when the circumstances are optimal. It seems that this ability is still developing past age 3. Thus, it has been shown that although 4-year-olds seem to be able to use phonemic information to recall labels of pictures (they recall more rhyming labels than non-rhyming ones) (Locke, 1971), for 3-year-olds the semantic aspects of labels are still more important, since they recall significantly more words in semantically similar ensembles than in phonemically similar control lists (Locke and Locke, 1971). The Lockes observe that their young Ss "were respectful of the symbolic value of language. They treated words as words, as units of reference and meaning, rather than nonrepresentational phonemic strings." (ibid, p. 189)
In our culture, phonological awareness seems first to be introduced to children through the vehicle of rhyme, especially through nursery rhymes and children's jingles. That is, words are (more or less consciously) juxtaposed which share a partial phonological similarity: they have the same sounds at the ends. Reading readiness exercises go on to introduce another kind of partial phonological similarity: the idea of words which begin with the same sound (alliteration). Total similarity of sound between two words (homonymy) is, however, rarely explicitly brought to children's attention. Whether this is because homonymy is rarer in the language than rhyme and alliteration, because it is considered too confusing, or because homonymy does not allow access to single phonological units in the way that partial similarity does is unclear, and yet it seems as if it should be a simpler task to determine whether there is total phonological similarity (homonymy) between two words than only partial similarity (rhyme or alliteration).

In this study, therefore, we ask whether there is an age before which children cannot (in general) separate the sound of a word from its meaning, as measured by their ability to find pictures representing two homonymous words from a given set of four pictures. We further ask how the linguistic strategies which children bring to bear in solving such a problem develop with age. In particular, we will look at the errors they make to see whether younger children tend to make more semantically-based errors while older children make more phonologically-based mistakes.

Method
Subjects

Thirty middle class children of normal intelligence attending a private day care center in Pasadena, California, participated in the study: there were 5 boys and 5 girls from each of three age groups. The mean ages for these groups were 3;10 years (range 3;3 to 4;5, s.d. 4 mo.), 4;9 years (range 4;3 to 5;1, s.d. 3 mo.), and 5;8 years (range 5;1 to 6;3, s.d. 5 mo.).
Materials

Twelve sets of picturable homonym pairs were chosen with a fairly wide range of vocabulary difficulty. Three sets of homonyms were reserved for training; the other 9 pairs were used for testing. For each of the test sets, four picturable distractor items were found: two semantic associates (one for each member of the pair), a rhyme, and an alliteration, thus making sets of six items. See Table 1 for the entire list of words depicted. For each such set of 6, eight line drawings were made, similar to those in the Peabody Picture Vocabulary Test: one for each of the four distractor items and two different pictures for each of the homonyms. The pictures were arranged in sets of four, each set containing a target word, its homonym, its semantic associate, and either its rhyme or alliteration (chosen randomly). An attempt was made to pick the easier meaning of a homonym pair as the first target word, and these first sets were presented on the first pass (underlined items in Table 1). The second set (with semantic focus on the other word of the pair and using the other phonological associate) was presented on a second pass. The pictures were arranged in rectangular formation and the positions of the four elements were varied so that the homonym pair fell equally often in each of the 6 possible positions, with the positions of the other elements also being randomized. Either all 4 pictures of a set were colored or none was. For example:

first pass                      repeat pass
flying-bat      mitt            baseball-bat    flying-bat
baseball-bat    hat             back            spider

Figure 1 illustrates these two test sets.

Vocabulary difficulty was not easy to estimate ahead of time, partly because most of the homonym pairs are also homographs and the relative frequencies of the two meanings are not separated out in e.g., Thorndike-Lorge, and partly because with the spoken language of preschool children, Thorndike-Lorge and similar sources which are based on written materials seem inappropriate anyway. There was indeed a clear range of vocabulary difficulty which can be inferred from the children's performance on the tests, but it does not correspond to the Thorndike-Lorge vocabulary ratings, nor does it uniquely predict ability to recognize homonymy.

Table 1. Homonym sets (a)

Order      Semantic   Homonym             Homonym           Semantic    Rhyme   Allit       Repeat order
Training   necklace   ring (jewel)        ring (bell)                   swing
           cups       glasses (drink)     glasses (specs)                       glad/girl
           hammer     nail (metal)        nail (finger)
1.         mitt       bat (baseball)      bat (mammal)      spider      hat     back        2.
2.         gu         bow (arrow)         bow (ribbon)      knot        hoe     bone        6.
3.         drum       horn (instrument)   horn (animal)     tusk        corn    horse       5.
4.         hippo      trunk (elephant)    trunk (chest)     suitcases   skunk   train       7.
5.         jacket     tie (cravat)        tie (package)     sew         cry     tire        1.
6.         lion       bear                bare              clothes     pear    barrel      4.
7.         day        night               knight            queen       kite    knife       3.
8.         foot       palm (hand)         palm (tree)       bush        bomb    P_OJ        9.
9.         screw      spring (metal)      spring (season)   fall        ring                8.

(a) The order indicated on the left is the original order of presentation. The order indicated on the right is the order for the repetitions. The underlined items appeared in the first presentations, the other items (plus the homonyms) in the repeats.
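
The counterbalancing of picture positions described in the Materials section rests on simple counting: four positions admit exactly six unordered pairs of places for the two homonym pictures. A minimal Python sketch follows; the numbering of positions 0-3 is an assumption made for illustration, since the article does not say how positions were labelled.

```python
from itertools import combinations

# Positions in the rectangular array, numbered 0-3 (hypothetical numbering).
POSITIONS = range(4)

def homonym_position_pairs():
    """Return every unordered pair of positions the two homonym pictures can occupy."""
    return list(combinations(POSITIONS, 2))

pairs = homonym_position_pairs()
print(len(pairs))  # 6
print(pairs)
```

Cycling through these six placements equally often across test sets is one way to keep position from cueing the correct answer.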

Testing Procedures

The children were tested one at a time in a small room or office at the
school (whichever happened to be free at the time). Sessions took varying
lengths of time depending on the ages of the children: some of the younger
children took 45 minutes while some of the older children finished in 15. All
sessions were tape recorded.

Figure 1. Sample items from the homonym test: Find two pictures that sound the same but mean different kinds of things. Left: first pass; right: repeat pass (see text).

1. Prenaming

Since many of the pictures could be labelled in a number of different ways (e.g., horn/trumpet, palm/tree), an attempt was made to associate the desired label with each picture by means of a preliminary prenaming task. Thus, a second set of pictures, identical to those made for the homonym test but excluding the repeat pictures of the homonym words, was again arranged in groups of 4, with care now taken that pictures from any given homonym set did not appear in the same group. The child was asked "Can you point to X?" for each of the 4 labels depicted in each set of pictures. On some of the less obvious items (e.g., bare, finger-nail), the child was warned "This one is tricky." If the child couldn't find a particular picture, the investigators pointed it out and made sure the child could recognize it, also giving a verbal association such as "bare like bare feet" or "nail like on your finger." Any vocabulary difficulties, including hesitations, were noted. This task, then, in addition to helping to associate the desired label with each picture, also gave an estimate of each child's passive (receptive) vocabulary.

2. Homonyms

a. Training. Three sets of homonyms were reserved for pretraining (see Table 1). The child was shown the first set of 4 pictures and told, "This is a game about words that have the same sound but mean different kinds of things. I want you to show me two pictures that sound just the same but mean different kinds of things. Like this: ring, ring. A ring that you put on your finger and ring the bell. They sound exactly the same: ring, ring. But they mean different kinds of things." The child was then given two practice sets to do before testing was begun.

b. Testing: first pass. The first 9 homonym sets were then presented one at a time. All pointing responses were recorded on a preprinted score sheet along with response times as measured by a stopwatch. The child was first asked (Task 1), "Find two pictures that sound just the same but mean different kinds of things." For each pair that the child pointed to, s/he was asked "What's the word?" if the word was not spontaneously given. If the wrong pair was pointed to, the child was encouraged to continue searching. If a rhyme or alliteration was chosen the child was asked, "Do those sound exactly the same?", whereas when a semantically associated pair was chosen, the investigator said, "Yes, but that's the same kind of thing. I want two pictures that sound the same but mean different kinds of things." If the child gave up or the right pair was not found after several responses or about 30 seconds on Task 1, the investigator pointed to one of the homonym pictures and asked (Task 2), "Can you find another picture that sounds exactly the same as this one?" If s/he still could not find the homonym pair, or seemed to have found it on Task 1 or Task 2 but refused to say the word, s/he was asked (Task 3), "Can you point to X? And can you point to another kind of X?" If the child did this correctly after having silently pointed to the right pair in Task 1 or Task 2, s/he was given credit for knowing the homonym passively.

c. Repeat set. After the first 9 homonym sets, the child was shown the prenaming pictures for the distractor items that would appear in the repeat set of homonyms. ("Now we'll play the first pointing game a little more.") As mentioned above, the homonym pictures were also changed on the repeat sets but the new pictures were not shown in either prenaming. The purpose of administering the repeat set was to see whether the children transferred learning from the original task when new pictures depicting the same concepts were shown, or whether performance on the repeats was indistinguishable from that on the original presentations. The repeat sets were administered in a different order (see Table 1) with the placement of the target pairs changed for each homonym. Otherwise, the administration was the same as on the original pass.

Scoring

As soon as possible after testing, the tapes from each session were reviewed and any verbal comments made by the child were transcribed onto a new score sheet along with a copy of the pointing responses and timing information noted at the time of testing. Each child's responses were scored according to the following rules:
1. Correct responses.
a) overt: if the correct pair was indicated and the child could say the word.
b) passive: if the correct pair was pointed to and, although the child would not say the word, s/he did Task 3 correctly.
2. Errors
a) semantic (S)
b) phonological (P), including
(1) rhyme (R)
(2) alliteration (A)
c) random association (X), if the child pointed to a pair that was neither correct nor S nor P (i.e., association between the phonological and semantic distractor items, or between the non-target homonym and the semantic distractor).
d) no response (-), when the child refused to point to a pair and either said nothing or said "I can't" or "I don't know."
e) phonological inventions (I), where the children either tried to invent rhymes or alliterations that were not words used in the prenaming or else tried to force homonymy through neutralization or by brute force relabelling (see Discussion under Development of strategies for finding homonyms).
3. Errors were scored as originally designed unless a verbalization indicated that some other strategy was being used, e.g., knight (with sword), knife was scored as A (alliteration), but if the child said knife, sword, it was scored as S (semantic).
4. No response (-) was counted as an error only on the first request for each task, but refusal to make another try after a child had made at least one response was not counted as a further error (since it was assumed that after one overt attempt, no further response simply indicated that the child had no better guess to offer).
5. Any given pair was only counted once even if it was pointed to more than once.
6. If a child indicated a pair but rejected this choice himself, it was not counted.
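
The design-based part of these scoring rules (rules 1 and 2a-c) can be sketched as a lookup on which two pictures were chosen. The role codes below ("T" for the target homonym, "H" for its homonym, "S" for the target's semantic associate, "P" for the rhyme or alliteration) are labels assumed here for illustration. Treating the non-target homonym paired with the phonological associate as a phonological error is likewise an assumption, since the rules as stated do not name that pair. Verbalization overrides (rule 3), refusals, and inventions are not modelled.

```python
def classify_pair(pair, phon_kind="R"):
    """Classify an unordered pair of picture roles by the design-based rules.

    Roles (hypothetical codes): "T" target homonym, "H" its homonym,
    "S" semantic associate of the target, "P" phonological associate.
    phon_kind is "R" if the set used a rhyme, "A" if an alliteration.
    """
    chosen = frozenset(pair)
    if chosen == {"T", "H"}:
        return "correct"
    if chosen == {"T", "S"}:
        return "S"
    # Assumption: either homonym paired with the phonological associate is a P error.
    if "P" in chosen and chosen & {"T", "H"}:
        return phon_kind
    # Remaining pairs: P with S, or non-target homonym with S.
    return "X"

for pair in [("T", "H"), ("T", "S"), ("T", "P"), ("H", "P"), ("H", "S"), ("P", "S")]:
    print(pair, classify_pair(pair))
```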
There are several possible ways of assessing each child's basic homonym ability due to the facts that (1) each child was encouraged to keep searching for each homonym pair until either s/he found it or gave up, (2) when a child did fail at Task 1 the problem was made easier by shifting to Task 2, and (3) passive answers were noted. The measures that have been used in this study are:
H0 = number of overt homonyms found on Task 1, first tries only.
H1 = number of overt homonyms found on Task 1, all tries.
H2 = number of overt homonyms found on Tasks 1 and 2, all tries.
HP = number of overt and passive homonyms found on Tasks 1 and 2, all tries.
Thus, H0 gives a very conservative estimate of homonym finding ability, being restricted to overt first tries only. H1 shows how well a child did on Task 1 while H2 reflects performance on both homonym finding tasks. HP is the most generous estimate of homonym ability since it also includes passively correct answers.
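
Because each measure adds a further class of successes to the one before it, the four scores are cumulative counts. The sketch below assumes a hypothetical coding of each of a child's 18 items as one of five outcomes; the outcome names and the example child are illustrative, not taken from the study's score sheets.

```python
from collections import Counter

def homonym_scores(outcomes):
    """Compute the four cumulative homonym scores from per-item outcomes.

    Each outcome is one of (hypothetical codes):
      "task1_first" - overt, found on the first try of Task 1
      "task1_later" - overt, found on a later try of Task 1
      "task2"       - overt, found only after the shift to Task 2
      "passive"     - pointed to correctly but credited only via Task 3
      "none"        - not found
    """
    counts = Counter(outcomes)
    h0 = counts["task1_first"]
    h1 = h0 + counts["task1_later"]
    h2 = h1 + counts["task2"]
    hp = h2 + counts["passive"]
    return {"H0": h0, "H1": h1, "H2": h2, "HP": hp}

# A made-up child with 18 items (9 sets x 2 passes).
child = (["task1_first"] * 9 + ["task1_later"] * 2 + ["task2"] * 3
         + ["passive"] * 2 + ["none"] * 2)
print(homonym_scores(child))  # {'H0': 9, 'H1': 11, 'H2': 14, 'HP': 16}
```

By construction H0 <= H1 <= H2 <= HP for every child, which is why the scorings run from most conservative to most generous.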

Results and Discussion


Homonym Performance and Age

1. Results

The group means for the four homonym scorings are given in Table 2. Significant main effects for age were found when separate 2 × 3 (sex × age) analyses of variance were run for each scoring. Post hoc correlated-sample t-tests showed that the differences between scores on first tries only (H0) and all tries (H1) on Task 1 reached the greatest level of significance for the middle group (p < 0.001), a lesser level for the oldest group (p < 0.01) and were not significant at all for the youngest group. Increases in scores from Task 1 (H1) to Task 2 (H2) were significant for all three groups, whereas adding in passive scores made significant differences only for the two younger groups (p < 0.01) (see Table 2).

Table 2. Age group means for the 4 homonym scorings (maximum for each = 18). Significant differences between scorings computed from correlated-sample t-test.

            H0          H1          H2          HP
Oldest      10.7   *    13.7   *    15.6        16.5
Middle       9.0   **   10.5   **   13.4   *    15.1
Youngest     2.5         3.5   **    6.6   *     8.4
All          7.4   **    9.2   **   11.9   **   13.3

*  p < 0.01.
** p < 0.001.
H0 = overt homonyms found on Task 1, first tries.
H1 = overt homonyms on Task 1, all tries.
H2 = overt homonyms on Tasks 1 and 2, all tries.
HP = overt and passive homonyms on Tasks 1 and 2, all tries.

Even though most of the youngest children could find at least one or two homonyms, there was a clear jump in ability at the boundary between the youngest and middle age groups (4;4 years). This was indicated by a maximum in the value of F [F = 42.6, df = (1,28)], signalling a maximum of certainty in a score difference, when the children were ranked by age and one-way analyses of variance were performed on the H, scores of older versus younger age groups as the boundary between the two groups was systematically increased. A second maximum value for F [F = 19.2, df = (1,28)] occurred with the boundary between the two groups set at 5;2 years, setting off the 8 oldest children as the most able group.
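
The boundary search just described can be sketched as follows: rank the children by age, split the ranked scores at each possible boundary, compute a one-way (two-group) F at each split, and look for the split where F peaks. The scores below are invented for illustration, and the minimum group size is an assumption; with the study's 30 children the within-groups degrees of freedom would be 28, matching the reported df = (1,28).

```python
def f_two_groups(a, b):
    """One-way ANOVA F statistic for two groups (df = 1, len(a) + len(b) - 2)."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    grand = (sum(a) + sum(b)) / (n_a + n_b)
    ss_between = n_a * (mean_a - grand) ** 2 + n_b * (mean_b - grand) ** 2
    ss_within = (sum((x - mean_a) ** 2 for x in a)
                 + sum((x - mean_b) ** 2 for x in b))
    return ss_between / (ss_within / (n_a + n_b - 2))

def best_boundary(scores_by_age, min_group=2):
    """Slide the younger/older boundary through age-ranked scores.

    Returns (k, F) for the split scores_by_age[:k] vs scores_by_age[k:]
    with the largest F. min_group is an assumed lower limit on group size.
    """
    splits = range(min_group, len(scores_by_age) - min_group + 1)
    results = [(k, f_two_groups(scores_by_age[:k], scores_by_age[k:]))
               for k in splits]
    return max(results, key=lambda r: r[1])

# Hypothetical scores, ordered from youngest to oldest child.
scores = [1, 2, 3, 2, 9, 10, 11, 10]
k, f_value = best_boundary(scores)
print(k, round(f_value, 1))  # 4 192.0
```

A sharp peak in F at one boundary, as in this toy data, is what a jump in ability at a particular age would look like; a gradual improvement would give a flatter profile.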
Since a one-way analysis of variance on the differences in scores between
Pass 1 and Pass 2 (originals versus repeats) was not significant [F = 1.43,
df = (1,58)],
these scores have been combined into a total score for each
child (maximum = 18). The lack of change from originals to repeats shows
that the children evoked the names of the concepts rather than having
learned to associate them by rote with specific pictures. The children were
not told that the second set of homonyms involved the same words as the
first set.
2. Discussion
Since the children were always asked to verbalize the homonyms for each
pair they chose, it was very clear whether or not they really had found a pair
and thus their scores were not compared to chance. Even for the children
who had the hardest time, most of them were able to find at least one or two
homonym pairs and in each case it was very clear that they understood what
their goal was and were aware that they had solved that particular problem.
The expected increase with age in ability to find homonyms was clearly
indicated by every statistical test we made. More interesting is the relationship between age and scoring that can be seen in Table 2 and which was
supported by the t-tests: the older the children, the better they did on their first try (H0), while the younger the children, the more they benefited from more tries and passive scoring. In particular, the oldest children did best within Task 1 with their biggest increase in scores from H0 to H1, while the younger two groups profited most by moving to Task 2 with their biggest increase from H1 to H2.
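
The pattern of gains can be checked directly from the group means in Table 2. The short sketch below takes those twelve means as given and reports, for each age group, the step between successive scorings that produced the largest increase; the dictionary and function names are illustrative.

```python
# Group means from Table 2, in the order H0, H1, H2, HP.
TABLE2_MEANS = {
    "oldest":   [10.7, 13.7, 15.6, 16.5],
    "middle":   [9.0, 10.5, 13.4, 15.1],
    "youngest": [2.5, 3.5, 6.6, 8.4],
}
STEPS = ["H0->H1", "H1->H2", "H2->HP"]

def largest_gain(means):
    """Return the step with the biggest increase and the list of all gains."""
    gains = [round(b - a, 1) for a, b in zip(means, means[1:])]
    best = max(range(len(gains)), key=gains.__getitem__)
    return STEPS[best], gains

for group, means in TABLE2_MEANS.items():
    print(group, largest_gain(means))
```

The oldest group gains most from H0 to H1 (3.0 items), while the middle and youngest groups gain most from H1 to H2 (2.9 and 3.1 items), in line with the discussion above.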
Although no significant interaction between homonym ability and sex was found for any score separately, it is worth noting that while the girls did better than the boys on Task 1, particularly on first tries (H0), on Task 2 (H2) their performance was almost identical; and when passives were counted the boys did better than the girls.

Development of Strategies for Finding Homonyms

1. Results: By Age Group

Three 3-way analyses of variance were performed to look at the effects of age and sex on the strategies used by the children, as reflected by the types of errors they made. In the first analysis, the errors examined were limited to the most common types: phonological and semantic (P and S). In the second, the phonological errors were further investigated by separating them into rhymes and alliterations (R, A, S). Finally, in the third analysis, random choices and refusals to answer were added (P, S, X, -). In all three analyses, significant main effects were found for age (p < 0.01) and errors (p < 0.05 for the first two analyses and p < 0.01 for the third), as well as a significant interaction between age and errors (p < 0.05 for the first and third analyses and p < 0.01 for the second). The interaction effects are the result of each group of children having clear strategy preferences: the youngest children used S and X more than the older two groups, the middle group used P the most, and the oldest children used no response (-) the least. When P was broken down into R and A, it was found that while all three groups used rhymes about equally, the oldest children used alliterations only about half as much as either of the younger two groups. The lack of any effect of sex on strategies was taken as a justification for combining the boys with the girls in the subsequent analyses on strategies.

Table 3 summarizes for each of the age groups the means for each of the possible responses on first tries for Task 1. Since only first tries are tabulated, each row sums to 180. (CP = number passively correct.) Post hoc t-tests on the group mean scores show that the differences in use of individual strategies between the oldest and middle groups were never significant, but the youngest group differs significantly from both the middle and oldest groups in number correct (p < 0.001), number of passive homonyms (p < 0.05), number of semantically related choices (p < 0.01), and number of unrelated pairs (p < 0.05 for middle versus youngest, p < 0.01 for oldest versus youngest) (see Table 3). The percentages of each strategy are graphed in the top of Figure 2, giving a strategy profile on first tries for each age group.
The differences between strategies by age groups change very little from first tries to all tries. At first glance, it seems reasonable that, although the youngest children make many more S and X tries, neither on first tries alone nor on all tries is there a significant difference between age groups in phonologically-based guesses (P). This, however, may be artifactual and due to ceiling effects in the oldest group.

Table 3. Age group means for strategies on Task 1, first tries. Significant differences in each strategy are shown between youngest and middle groups and youngest and oldest groups (bottom row) as computed by t-tests.

                          H0           CP     P     S     X     -
Oldest (5;1-6;3)          107 (59%)
Middle (4;3-5;1)           90 (50%)
  middle vs. youngest     **           (*)          *     (*)
Youngest (3;3-4;5)         25 (14%)
  oldest vs. youngest     **           (*)          *     *

[The remaining cells are illegible in the scanned source. Legible values whose column cannot be determined: 39 (22%), 42 (23%), 43 (24%), 12 (7%), 22 (12%).]

*p < 0.01.
**p < 0.001.
(*)p < 0.05.
H0 = number of overt homonyms found.
CP = number found passively.
P = phonological errors.
S = semantic errors.
X = random associations.
- = no response.
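
The percentages reported for Table 3 follow from the design: each age group has 10 children and each child sees 18 items, so every row of first-try responses sums to 180. A minimal sketch of the conversion, using counts that are legible in the table, is given below; the constant and function names are illustrative.

```python
N_CHILDREN_PER_GROUP = 10
N_ITEMS_PER_CHILD = 18          # 9 homonym sets x 2 passes
ROW_TOTAL = N_CHILDREN_PER_GROUP * N_ITEMS_PER_CHILD  # 180

def as_percent(count, total=ROW_TOTAL):
    """Express a response count as a whole-number percentage of the row total."""
    return round(100 * count / total)

for count in (107, 25, 39, 12):
    print(count, f"{as_percent(count)}%")
```

The same arithmetic links Table 3 to Table 2: the oldest group's mean H0 of 10.7 per child corresponds to 107 overt first-try successes across its 10 children.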

2. Results: By Ability Groups


During the testing, it became evident that homonym finding ability did not vary strictly with age: some of the children in the middle group were clearly much better at the task than some of the older children. Not only did they find the homonyms quickly and efficiently (using few tries) but the few errors they did make seemed to be qualitatively different from those of the older children who had more difficulty. In order to investigate this observation, the children were divided into three ability groups based on their H0 scores. The most able group (A) comprised 7 children from the oldest group and 3 from the middle group (including one of the youngest from that group, aged 4;5). The second group (B) contained 7 children from the middle age group, 2 from the oldest, and 1 from the youngest. The least able group (C) contained the remaining 9 of the youngest children and one child from the oldest group. The ability group means for each strategy on Task 1, first tries, are summarized in Table 4, while the bottom of Figure 2 gives a strategy profile for each ability group.

Comparing the top and bottom of Figure 2, we see that the strategy differences between ability groups are more marked than those between age groups. The original observation that ability did not vary strictly with age was verified: the middle group (B) used 2.4 times as many extra guesses after the first try as did the best group (A). In addition, we see that with this grouping there is a significant difference in phonologically-based choices between A and B. In fact, group B made 3 to 4 times as many such errors as group A. Semantically-based guesses remain significantly greater for the bottom group as do passively correct guesses. While refusal to respond (-) is fairly evenly distributed across all groups, random associations (X) tend to be concentrated in the bottom groups.

Figure 2. Percent responses for each strategy used on Task 1, first tries. Top: by age group (Oldest, Middle, Youngest); bottom: by ability group (Best (A), Middle (B), Worst (C)). Symbols as in Table 3.

Table 4. Ability group means for strategies on Task 1, first tries. Significant differences in each strategy are shown between best and middle groups, middle and worst groups, and best and worst groups (bottom row) as computed by t-tests. Symbols as in Table 3.

                        H0           CP         P           S           X     -
A: best (4;5-6;3)       129 (72%)
  A vs. B               **
B: middle (4;3-5;6)      69 (38%)               58 (32%)    19 (11%)
  B vs. C               **
C: worst (3;3-5;2)       24 (13%)    12 (7%)    49 (27%)    46 (26%)
  A vs. C               **

[The remaining cells and significance markers are illegible in the scanned source. One legible value whose column cannot be determined: 7 (4%).]

3. Discussion
There were several other readily observable strategy differences between the groups. In both the youngest (Fig. 2, top) and least able (Fig. 2, bottom) groups, passive responses were extremely common: not only when these children picked correct pairs, but also when they picked phonologically and semantically associated pairs, they tended not to want to say any words aloud when asked "What's the word?" In fact in Task 1, 81% of the passive responses were made by children in the youngest age group. (For further discussion of passive vocabulary, see below.) Another clear developmental difference was a shift in the type of semantic responses given. The youngest children not only indicated many semantically associated pairs for which they refused to verbalize, but when they did say the words, they tended to label the individual members of a class rather than giving a single superordinate class label (for instance, pointing to lion and bear and saying lion, bear rather than animal, which would at least have used the same word for both pictures). 77% of such class membership responses were made by the youngest group whereas superordinate class responses were quite evenly distributed across the age groups (youngest 31%, middle 35%, oldest 35%).

A cognitively-based difference that separated one group from another was apparent in the type of searching strategy employed. Thus, while the most able children tended to scan each array of 4 pictures silently (though often subvocalizing, as evidenced by lip movement), smile, point to the right pair and say the words, the youngest seemed to just pick two pictures. If these were wrong, they often picked the other two pictures and then gave up. The intermediate children, however, seemed to be on the way to developing a systematic search strategy without having quite gotten there. First, they tended to want to name all 4 pictures aloud without making any choices. Then they often seemed to pick out one picture which served as a focus for their comparisons and would systematically pair it with each of the other 3, indicating that each such pairing was a guess at the right answer. If they happened to choose one of the homonym pictures as their focus, this strategy was often successful. If, however, they picked a non-homonym as focus, they often could not find the homonym even though they applied the correct labels to the pictures: although they said the correct words aloud, they seemed not to be able to carry the sounds over from one comparison to the next. A shift to Task 2, however, in which one of the homonyms was indicated by the investigator, seemed to help these children get unstuck from that first choice of focus. This phenomenon was much more common among the older two groups of children, occurring only rarely among the youngest. It may reflect local rigidity associated with flexibility in another cognitive locus. It is as if the child has a limited resource for flexible open-ended search which s/he can apply to the search for focus or to the search for identical labels but not simultaneously to both. (See Norman and Bobrow, 1975, for a discussion of resource limitations.)
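
The difference between the fixed-focus strategy and a full scan can be made concrete with a toy sketch. It is only an illustration of the two search procedures as described, not a model of the children's cognition; the function names and the example labels are assumptions.

```python
from itertools import combinations

def focus_search(labels, focus):
    """Pair the focus picture with each other picture in turn.

    Returns (found, guesses): found is True if some other picture shares
    the focus picture's spoken label; guesses lists the pairs tried.
    """
    guesses = []
    for other in range(len(labels)):
        if other == focus:
            continue
        guesses.append((focus, other))
        if labels[other] == labels[focus]:
            return True, guesses
    return False, guesses

def exhaustive_search(labels):
    """Compare every unordered pair; return the first pair with identical labels."""
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            return (i, j)
    return None

labels = ["bat", "mitt", "bat", "hat"]   # spoken labels of the four pictures
print(focus_search(labels, focus=0))     # (True, [(0, 1), (0, 2)])
print(focus_search(labels, focus=1))     # (False, [(1, 0), (1, 2), (1, 3)])
print(exhaustive_search(labels))         # (0, 2)
```

With a homonym as focus the fixed-focus search succeeds; with a distractor as focus it exhausts its three guesses without ever comparing the two homonyms, which is the situation that the shift to Task 2 resolves by supplying a homonym as the focus.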
An increase in ability to deal with the phonological nature of the problem was also evident, being more pronounced in the ability grouping than in the age groups. The children acted very much as if there was a hierarchy of strategies at their disposal, and if a higher strategy didn't work they would fall back on a lower one. The apparent sequence was: get it right, make a phonological choice, make a semantic choice of the inclusive kind, make a semantic choice of the associative kind, guess randomly. (Giving up could occur at any point - how soon a child refused to try any more seemed to depend on the individual's personality.) The older children had more control over the higher end of the sequence - the most able children almost always found the right answers and when they had trouble they would fall back on P or S almost equally often (see Fig. 2, bottom). The least able children, who had great difficulty, seemed also to use P and S about equally often, but the middle group used P much more often than S (again, see Fig. 2, bottom). This is because, aware of the phonological nature of the problem, some of them used every trick they could muster to find two words that sounded alike, including hunting for rhymes (by means of both real words and invented nonsense words) and forcing identity of sound between two words (invention (I) errors). For example, K.B. (5;1) was a prolific rhymer, suggesting arrow/brarrow, drum/turn, horse/morse and mitten/kitten, among others.

There were two different ways in which identity of sound was forced: through brute force relabelling and through phonological neutralization. Brute force relabelling occurred when a child pointed to two non-homonymous pictures and applied the word for one of them to both. E.N. (4;3) did this some 10 times, e.g., pointing to hoe, bow-and-arrow, but saying arrow, arrow.

A somewhat more subtle strategy involved taking advantage of the near phonological identity of some of the phonological associates and pronouncing such pairs halfway between so that the phonological contrast was neutralized. Thus A.J. (5;1) asserted that pear and bear sounded exactly the same by devoicing the /b/ in bear, producing [pʰɛr], [pɛr]. She also pronounced palm and bomb identically. And A.D. (4;0) tried to pronounce bat and back the same, producing batk. A developing ability to manipulate the phonological aspects of words, divorced from their meanings, is thus apparent among the intermediate children.
There is now substantial
evidence that the left cerebral hemisphere is
specialized
for processing
phonological
information
in speech (Zaidel,
1978). If it also controls the recognition
of homonymy,
we would have
evidence for a rather early onset of cortical lateralization
of language, at
about 4;6. Consequently,
we wanted a developmental
estimate of the
abilities of the adult right and left hemispheres to recognize homonymy.
In
a separate study (Zaidel and Peters, 1979), we administered
an extended
version of the homonym test separately to the right (RH) and then the left
(LH) hemispheres
of two patients who had undergone complete cerebral
commissurotomy
to alleviate intractable epilepsy (Bogen and Vogel, 1975).
First tries of Task 1 are precisely comparable across these two studies. The
LHs obtained perfect scores, far superior to the corresponding
RH scores
which themselves fit quite well within the developmental
progression found
for the children. Thus, the RH of patient N.G. (a 45-year-old woman who
had surgery at age 30 and first signs of epilepsy at age 17) had scores quite
similar to those of the lowest ability children (Table 4) with 11% correct
responses (13% for the children), and with the 36% phonological errors only
slightly outnumbering
the 34% semantic errors (27% and 26%, respectively,
for the children). The RH of patient L.B. (a 25-year-old
man who had
surgery at age 13 and first epileptic symptoms at age 3) scored similarly to
the middle ability group: 59% correct first tries on Task 1 (38% for the
children) and many more phonological
than semantic errors (about a 4 to 1
ratio for L.B., 3 to 1 ratio for the children). Comparison of the adult with
the child data thus suggests a rather early LH specialization for phonological

encoding and individual differences in RH processing ability. Furthermore,
the data are consistent with the hypothesis of a developmental
arrest for the
RH in the acquisition of the skill. This is not a universal result - other tasks
show slightly higher equivalent
mental ages for RH competence
(e.g., in
receptive syntax) and divergent error patterns as well as performance
styles
between the RHs and children who had obtained the same total score on the
test (Zaidel, 1978).
Vocabulary Difficulty and Homonym Performance

Since the vocabulary items involved in the various pairs were of varying
degrees of difficulty, it seemed likely that some children were better at
finding homonyms because they had a greater vocabulary proficiency.
Therefore, two prenaming scores were calculated for each child based on the
number of items for which difficulty was encountered in the prenaming
task: (1) a total prenaming score, Pt, based on the 54 items used in the 9 test
sets, and (2) a homonym prenaming score, Ph, based on the subset of 18
homonym words used in the test sets. When calculated for the whole group,
correlations between Pt and each of the 4 homonym scorings (H0, H1, H2,
HP) were significant at the 0.001 level, as were correlations between Ph and
each of the 4 homonym scorings (see Table 5). When, however, these
correlations were calculated for the individual age groups, prenaming scores
turned out to be the most highly correlated with homonym performance for
the oldest group, significantly correlated only with Task 1 performance for
the middle group, and not correlated at all for the youngest group except for
Pt with one of the homonym scorings (Table 5). Thus, although prenaming
proficiency has something to do with homonym finding ability, it does not
tell the whole story, especially for the youngest children. It is as if
vocabulary proficiency releases resources for searching and matching. When
all of the component prerequisites for the task (searching, matching,
vocabulary proficiency) are mature enough, growth in ability in any one area
releases cognitive resources to improve performance in the whole task.

Table 5. Group correlations between prenaming scores and homonym scores.
[Table body not legible in this copy. Columns: Pt and Ph for the Whole Group
and for the Oldest, Middle and Youngest groups; rows: H0, H1, H2, HP.]
Significant differences: **p < 0.001; *p < 0.01; (*) p < 0.05.
Pt = prenaming score on total set of pictures.
Ph = prenaming score on set of homonym pictures.
Other symbols as in Table 2.
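As an illustration of the kind of correlation reported in Table 5, the following is a minimal sketch, assuming a Pearson product-moment coefficient (the text does not say which coefficient was used). The data are hypothetical and are not taken from the study; they only show why a prenaming score that counts difficulties should correlate negatively with a homonym score.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five children (NOT data from the study):
# number of items causing difficulty at prenaming, and homonym pairs found.
prenaming_difficulty = [2, 5, 9, 14, 20]
homonyms_found = [9, 8, 6, 4, 1]

r = pearson_r(prenaming_difficulty, homonyms_found)
print(f"r = {r:.2f}")  # negative: more prenaming difficulty, fewer homonyms found
```

Because the prenaming scores count items on which difficulty was encountered, a child with better vocabulary has a lower score, which is consistent with the negative signs in the table.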
It also seemed likely that if a child did not know one or both of a given
pair of homonym words at the prenaming stage, s/he would have difficulty
finding that particular pair in the homonym test. Therefore, we looked at
how well the three age groups did at finding homonyms contingent upon
whether they did or did not have prenaming success with the homonym
words. This analysis showed that the oldest children did quite well even
when they had vocabulary
difficulty (91% of the homonyms in this case).
Both the oldest and middle groups did well when they had no vocabulary
difficulty (95% and 96%, respectively,
of these homonyms).
The youngest
children only got 58% of items where they had no vocabulary
problems,
42% when such problems existed. And again, when there was no vocabulary
difficulty,
the older 2 groups had few passives (1% and 4%) while the
youngest had 19%. When, however, there had been vocabulary problems, the
middle group went up to 17% passives and the youngest to 24%. Thus, the
youngest children seem to be relatively unable to take advantage of exposure
to difficult vocabulary
items at the prenaming stage as shown by their
increase in homonym errors for just those words (23% to 34%). The middle
children, on the other hand, could utilize at least some of the prenaming
information
as evidenced by the increase in passive responses (4% to 17%).
And the oldest children seem to have taken such good advantage of their prenaming problems that their homonym performance dropped very little when
they had vocabulary
difficulty (96% to 91%). The fact that the youngest
children did find 42% of these homonyms
where prenaming difficulty
occurred shows that mastery of vocabulary as measured by success in prenaming is by no means necessary for success in homonym finding.
Although the prenaming scores do correlate fairly well with the homonym
scores, the interaction
between the two tasks seems much more complex.
Indeed, the prenaming task was designed to associate particular labels with
particular pictures in the minds of the children before they were confronted
with the homonym sets, and judging from the childrens homonym scores on
those items for which they had vocabulary
difficulty,
the prenaming task
seems to have functioned
much as it was intended to (although it did not
work perfectly
since the children did not always remember
the desired
labels). In particular,
any vocabulary
items which a child knew to some
extent but had temporarily
forgotten were likely to be reinforced,
often to
the point where finding the homonym
was a possibility, passively if not
actively. In addition, the pressure to perform well probably further enhanced
this reinforcing effect.
An interesting phonological difficulty arose for some of the children when
the alliteration happened to phonologically contain the whole target word
as its first part. This happened with the words tie and tire, and bear/bare and
barrel. Somehow these were much more confusing than minimal pairs such
as bat and back, horn and horse, or night/knight and knife.
A final question that needs to be discussed with respect to the effects of
vocabulary on homonym performance
is that of homonymy versus polysemy.
That is, is there any evidence that any of these pairs of words were stored in
the childrens lexicons as two sub-meanings to a single entry rather than as
two separate entries which happened to sound alike? Of all the homonym
pairs, only tie (a string) and necktie seemed to be at all polysemous.
(One
child, age 4; 10, spontaneously
remarked,
You tie something around your
neck anyou tie OH your shoe, too.) This did not, however, seem to be the
case for all the children.
The ability to find homonym
pairs depends then, not only on an understanding of the nature of the task involved, but also on having access to the
phonological
representations
of the critical words in order to be able to compare them for identity.
Active (productive)
versus passive (only receptive)
knowledge of words probably has its effect here - in the case of passively
correct choices the children seemed to be able to hear enough of the relevant words in their heads to make their decisions but were not sure enough
of the words to want to say them out loud. The tendency of the middle
children to want first to name all 4 pictures aloud before making any choices
also seems to relate to the need to be able to hear the words in order to
compare them. When a child's control over pronunciation is not fully
developed, it is unclear whether his difficulties with pronunciation will
tend to carry over into his phonological comparisons or not. The child who
had the least success in finding homonyms was a boy (4;0) whose phonological
development was very slow. According to his teacher, this trait ran in his
family and was always eventually outgrown. How much of his difficulty with
homonyms was due to this developmental characteristic is unclear, but
probably the effect was not negligible.
The homonym test calls for the coordination of a number of cognitive
prerequisites. These include the ability (1) to understand the task, i.e.,
what "sound the same" means, (2) to conduct an exhaustive search through
the set of alternative pictures, (3) to access the phonological representations
of the critical words, (4) to rehearse a label while searching for others to
match with it, (5) phonologically to match two labels once found, (6) to
cycle through alternative labels for a picture in cases of phonological
mismatch. Inefficient processing or immaturity in any of the component
processes or in the ability to coordinate them could result in failure to
perform the task. Maturation of some component processes can release
resources for processing others. Thus, the younger children were particularly
limited by mastery of vocabulary - a problem which hardly affected the older
children.
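The component abilities listed above can be pictured as steps in a simple search procedure. The following is only an illustrative sketch, not a model proposed in this paper: the function name, the picture set, the candidate labels and the rough phonological spellings are all hypothetical.

```python
from itertools import combinations


def find_homonym_pair(pictures, phonology):
    """Return (index_a, label_a, index_b, label_b) for the first two pictures
    whose labels share a phonological form, or None if no pair is found.

    pictures  -- list of pictures, each a list of candidate labels
    phonology -- dict mapping a label to its (hypothetical) phonological form
    """
    # (2) exhaustive search through every pair of pictures
    for (i, labels_i), (j, labels_j) in combinations(enumerate(pictures), 2):
        # (6) cycle through alternative labels for the focus picture
        for label_i in labels_i:
            # (3) access the phonological representation; skip unknown words
            form_i = phonology.get(label_i)
            if form_i is None:
                continue
            # (4) form_i is held ("rehearsed") while the other labels are tried
            for label_j in labels_j:
                # (5) phonological match of the two labels
                if phonology.get(label_j) == form_i:
                    return (i, label_i, j, label_j)
    return None


# Hypothetical test set: a homonym pair, an alliterative distractor (knife)
# and a semantic distractor (sword). Transcriptions are rough and invented.
pictures = [["knight", "soldier"], ["moon", "night"], ["knife"], ["sword"]]
phonology = {
    "knight": "nait", "night": "nait", "knife": "naif",
    "moon": "mun", "soldier": "soldjer", "sword": "sord",
}

print(find_homonym_pair(pictures, phonology))
```

In this sketch a child who labels the second picture only as moon would fail to find the pair, which corresponds to the need to cycle through alternative labels in step (6); a missing entry in the lookup corresponds to a lack of lexical access in step (3).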
That improvement
in ability to find homonyms is a function of maturation rather than learning is shown by the fact that exposure to one exemplar
of a homonym (Pass 1) did not result in improved performance
on exposure
to a second exemplar of the same homonym pair (Pass 2 - viz. the fact that
the overall scores on the two passes were not significantly different). And
yet there is a sharp improvement
in recognizing homonyms
at age 4;4
without any special training. Thus, the resource limitations affecting performance on this task would seem to be biologically determined rather than
learning-dependent.

Summary
In our investigation of pre-school children's ability to find homonyms, we
have found not only that children over 4;4 years of age had considerably
more success than their juniors, but also that success at solving this problem
depended on a complex interaction of cognitive and linguistic development.
Thus, even though children were able to deal with the linguistic aspects of
the problem, the fact that they had not yet developed an efficient search
strategy could, if they were unlucky in their choice of a focus for
comparisons, cause insurmountable problems. And, on the other hand, even if a
search strategy was well developed, linguistic problems could cause a
particular pair to be missed. The youngest children had both cognitive and
linguistic problems; the middle children were learning to deal with both -
sometimes difficulties arose in one area, sometimes in the other. The
most able children had their searching strategies well developed and only
rarely had linguistic difficulties.
a As noted in Homonym Performance and Age, even the children who had the
hardest time were able to find one or two homonym pairs and in these cases it
was clear that they knew they had solved the problem and found two words that
sounded the same. Thus, when they had difficulties with the other pairs, it
was not because of problems with component (1) alone, but rather mainly with
the cognitive components of search (2) and rehearsal (4) and/or the linguistic
components of access to and phonological representation of vocabulary (3),
(5), and (6).


The linguistic abilities needed for finding these homonyms were of two
kinds: lexical and phonological.
If a child had no lexical access to a
particular vocabulary
item, s/he could not use it in the task. If such access
was only passive (receptive), it might be sufficient to allow the child to find
the homonym
but insufficient
for the child to want to risk producing the
word. Such passive success was most common among the youngest children.
The oldest children were the most lexically facile - if they happened to
forget the particular label associated with a picture at prenaming, they were
able to try out several names for each picture.
Phonological ability here refers to the capability of separating sound from
symbol and then manipulating
that sound by comparing it with the sounds
of other words. The youngest children showed relatively little evidence of
having developed such abilities - they tended to fall back on semantic association as a criterion for similarity. The intermediate
children, however, had
developed
a fair repertory
of phonological
manipulations
they could perform. Since they were not as efficient as the most able group, they made
numerous
guesses, looking for rhymes and alliterations,
inventing them if
they had to, or trying in some way to force identity of sound.
The ability to recognize
phonological
similarity would seem to be a
necessary if not sufficient prerequisite
for learning to read via phonological
decoding. Indeed, the disconnected
left hemisphere
is proficient
in both
recognizing homonymy
and in translating graphemes to phonemes, whereas
the right hemisphere is not proficient in either. The improvement
in ability
to recognize
homonyms
between 4 and 6 years apparently
reflects left
hemisphere maturation
(Zaidel and Peters, 1979) - if so, then age 5 seems a
natural biological (rather than purely cultural) starting point for learning to
read. And yet the fact that the oldest group in our experiment
did not
precisely
consist of the most able homonym
finders should be kept in
mind: some children simply had their act together
(both cognitive and
linguistic) at an earlier age than others.

References
Bogen, J. E. and Vogel, P. J. (1975) Neurologic status in the long term following complete cerebral
commissurotomy. In F. Michel and B. Schott (Eds.), Les Syndromes de Disconnexion Calleuse
chez l'Homme. Lyon, Hôpital Neurologique.
Chao, Y. R. (1951) The Cantian idiolect: an analysis of the Chinese spoken by a twenty-eight-months-old
child. Reprinted in A. Bar-Adon and W. Leopold (Eds.), Child Language: A Book of
Readings. Englewood Cliffs, New Jersey, Prentice-Hall.
Iwamura, S. J. (1977) Games and other Routines in the Conversation of Preschool Children.
Unpublished Ph.D. dissertation, University of Hawaii.
Keenan, E. O. (n.d.) Evolving discourse - the next step. Ms.
Locke, J. L. (1971) Phonetic mediation in four-year-old children. Psychon. Sci. 24, 409.
Locke, J. L. and Locke, V. L. (1971) Recall of phonetically and semantically similar words by
3-year-old children. Psychon. Sci. 24, 189.
Norman, D. A. and Bobrow, D. G. (1975) On data-limited and resource-limited processes. Cog.
Psychol. 7, 44-64.
Thorndike, E. L. and Lorge, I. (1944) The Teacher's Word Book of 30,000 Words. New York, Teachers
College Press.
Zaidel, E. (1978) Lexical organization in the right hemisphere. In P. Buser and A. Rougeul-Buser
(Eds.), Cerebral Correlates of Conscious Experience. Amsterdam, Elsevier.
Zaidel, E. and Peters, A. M. (1979) Phonological encoding and ideographic reading by the
disconnected right hemisphere: Two case studies. Submitted for publication.

The authors study the development of children's ability to dissociate the
sounds and the meanings of words. The task requires children aged 3;3 to 6;3
to choose homonyms from drawings. The results show that the development of
this ability accelerates sharply at 4;4. The longitudinal study of the
strategies used indicates a cognitively complex task. The performance of the
young children is limited more by their fundamental inability to cope with
several cognitive factors at once than by an inability to handle the
linguistic aspects of the task. The cognitive factors include access to
vocabulary, the enumeration of intermediate results and the establishment of
a search strategy.

Cognition, 8 (1980) 209-225
© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands

Discussion

The ATN and the Sausage Machine: Which one is baloney?

ERIC WANNER*
Sussex University

In a recent issue of Cognition, Lyn Frazier and Janet Dean Fodor proposed
a new two-stage parsing model, dubbed the Sausage Machine (Frazier and
Fodor, 1978). One of the major results which Frazier and Fodor bring
forward in support of their proposal concerns a parsing strategy which,
following Kimball (1973), they call Right Association. The center-piece
of
their argument concerns an interaction
between this parsing strategy and
another
one, which they call Minimal Attachment.
Frazier and Fodor
(henceforth
FF) provide interesting evidence that the language user makes
tacit use of both strategies to resolve temporary syntactic ambiguities that
arise during parsing. FF then proceed to argue that the existence of these
strategies, as well as the apparent interaction
between them, can be fully
explained if we assume that the language user's parsing system is configured
along the lines of the Sausage Machine. In FF's view, the Augmented
Transition Network (ATN) runs a very poor second to the Sausage Machine, for
according to FF's argument, it is impossible even to describe the two parsing
strategies within the ATN framework. In effect then, FF are claiming that
the Sausage Machine achieves explanatory adequacy in this case while the
ATN fails to reach the level of descriptive adequacy.
These are strong and potentially
important
claims. If correct,
they
obviously provide grounds for pursuing parsing models built along the lines
of the Sausage Machine rather than the ATN. However, when FFs arguments
are examined at close range, the comparison between parsing systems comes
out rather differently
than they claim. In particular, it appears that the
Sausage Machine explanation
of Right Association and its interaction with
Minimal Attachment
is empirically incorrect. The inadequacy of this explanation completely
cancels the Sausage Machine's ability to describe the interaction between
strategies that FF have observed. This follows because FF aspire to an
explanation that renders independent description of the parsing strategies
unnecessary. The Sausage Machine contains no apparatus for describing
strategies. Hence, the failure to achieve explanatory adequacy automatically
entails descriptive failure as well. In contrast, and in contradiction of
FF's negative claim, the ATN can provide a perfectly general description for
each strategy in terms of scheduling principles that constrain the order in
which arcs in an ATN grammar are attempted. Moreover, when these scheduling
principles are coupled with an ATN version of the grammar FF tacitly employed
to generate their pivotal cases, FF's observations about the interactions
between strategies are completely accounted for. Thus, although the ATN
framework does not provide an explanation for either parsing strategy, it
appears to achieve descriptive adequacy. Moreover, the descriptive framework
of the ATN makes it possible to discern just what phenomena require
explanation and to speculate in a reasonable way about the explanatory
principles that underlie the parsing strategies FF have discovered.

*Reprint requests should be sent to Eric Wanner, Harvard University Press,
79 Garden Street, Cambridge, Mass. 02138, U.S.A.

The Sausage Machine


As advertized, the Sausage Machine has two very distinct stages. According
to Frazier and Fodor's proposal, "... the human sentence parsing device
assigns phrase structure to word strings in two steps. The first stage parser
(called the PPP) assigns lexical and phrasal nodes to substrings of roughly six
words. The second stage parser (called the SSS) then adds higher nodes to
link these phrasal packages together into a complete phrase marker" (p. 291).
Although FF do not provide a detailed characterization of how the Sausage
Machine works, they do supply the following sketch: The PPP has a
viewing window which shifts continuously through the sentence and
accommodates perhaps half a dozen words (p. 305). The PPP uses the rules
of the grammar to assign each input string within the window its lower
lexical and phrasal nodes (p. 296). It is important to understand that in
making these structural assignments, the PPP can only take account of the
six words within its current window plus any low level structure it may have
already assigned to the words within the window. Given the severe
shortsightedness of the PPP, the SSS can survey the whole phrase marker for
the sentence as it is computed, and it can keep track of dependencies
between items that are widely separated in the sentence and of long term
structural commitments which are acquired as the analysis proceeds (p. 292).
The SSS works only on the output of the PPP. The low level phrasal
packages assembled by the PPP are deposited in the path of the SSS which
is sweeping through the sentence behind it (p. 306). As it sweeps along,
the SSS also uses the grammar to assemble the phrases left to it by the PPP
into a complete phrase marker for the input sentence.
Although this description is somewhat vague, it is precise enough for FF's
purposes. According to their argument, there are only three features of the
Sausage Machine which provide its explanatory power. These are also the
features which most notably distinguish it from the ATN:
(A) The existence of 2 separate stages of parsing.
(B) The PPP's limitation to a six word viewing window.
(C) The SSS's ability to appraise the whole phrase marker as it develops and
therefore to make decisions contingent upon the geometry of the entire
parse tree.

Can the Sausage Machine Cut the Mustard?


In FF's terms, a parsing strategy is a rule that governs situations in which the
grammar permits the parser to attach a constituent in more than one possible
way to the developing parse tree. So, for example, both sentence (1) and (2)
are ambiguous because the final word in each can be attached at two possible
points in the phrase marker:

(1) Tom said that Bill had taken the cleaning out yesterday.
(2) Joe called the friend who had smashed his new car up.

In (1), yesterday can be attached as an adverbial modifier either to the
topmost S in the phrase marker (Tom said ...) or to the embedded S (Bill had
taken ...). Similarly, in (2), up can be attached as a particle to the verb in the
topmost S (called) or to the verb in the embedded S (smashed). In both
sentences, the lower of the two possible attachments seems to be preferred
by most people and Frazier (1978) has provided experimental evidence for
the reliability of this preference.
According to FF, this type of bias can be adequately described by
Kimball's principle of Right Association, which dictates that an ambiguous
constituent should be attached into the phrase marker as a right sister to
existing constituents and as low in the tree as possible (p. 294). The Right
Association strategy applies in the obvious way to make the correct
predictions about the language user's preferences in sentences (1) and (2). But
what explains the existence of this particular strategy? Why should the
language user be uniformly biased toward low right attachment as opposed
to (say) high right attachment? According to FF, the Sausage Machine can
supply the answer. Their story begins with the observation that the tendency
towards low right association of an incoming constituent sets in only
when the word is at some distance from the other daughter constituents of
the higher node to which it might have been attached (p. 299). Sentences
(3) and (4) provide the evidence for FF's claim that Right Association sets
in only ... at some distance.
(3) Joe bought the book that I had been trying to obtain for Susan.
(4) Joe bought the book for Susan.

In (3) there are two possible attachments for the final prepositional phrase
for Susan: it can be attached either to the object noun phrase (the book that
I had been trying to obtain for Susan) or the main clause verb phrase
(bought the book for Susan). Right Association correctly predicts the
preference for the first of these attachments, which is at the lower right
margin of the phrase marker. Notice, however, that in sentence (4), this
preference seems to be reversed. The preferred attachment is to the verb
phrase, not the noun phrase; and as phrase markers (5) and (6) demonstrate,
this is clearly the higher of the two possible attachments:

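(5) [Phrase marker for Joe bought the book for Susan, with for Susan attached to the verb phrase.]

(6) [Phrase marker for Joe bought the book for Susan, with for Susan attached within the object noun phrase.]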

FF argue that the preference for (5) over (6) is a special case of the general
parsing strategy they call Minimal Attachment. This strategy also governs
situations where the grammar permits more than one possible attachment
for a given constituent and it stipulates that the ambiguous item is to be
attached into the phrase marker with the fewest possible number of
nonterminal nodes linking it with the nodes that are already present (p. 320).
Comparison of (5) and (6) will show that noun phrase attachment involves
one more non-terminal node than verb phrase attachment; hence the
Minimal Attachment principle correctly predicts the language user's
preference for (5). But why does Minimal Attachment prevail over Right
Association in sentence (4)? And why does Right Association appear to set
in only at a distance? Here FF offer an ingenious explanation based
exclusively on the architecture of the Sausage Machine:
Let us suppose for the sake of argument that the first stage parser has the capacity
to retain six words of the sentence, together with whatever lexical and phrasal
nodes it has assigned to them. Then in processing (4), it will still be able to see
the verb when it encounters for Susan. It will know that there is a verb phrase
node to which the prepositional phrase could be attached, and also that this particular verb is one which permits a for-phrase. But in sentence (3), where a long
noun phrase follows the verb bought, the first stage parser will have lost access to
bought by the time for Susan must be entered into the structure; the only possible
attachment will be within the long noun phrase, as a modifier to trying to obtain
(p. 300).
Notice that according to this account, there need be no independent
statement of Right Association anywhere in the Sausage Machine. The PPP
simply makes whatever attachments it can. In long sentences like (3) the low
right attachment of for Susan is the only attachment the PPP can make
because its limited window prevents it from seeing the higher attachment
possibility. Note also that this account automatically explains why Minimal
Attachment prevails over Right Association in (4). Since there is no
independent statement of Right Association in the parser there is no conflict
to be explained. In short sentences like (4), the PPP will see both attachment
possibilities. Therefore, there will be no bias towards low right attachment
and the Minimal Attachment strategy prevails by default. On the basis of
this demonstration, FF claim to have achieved, at least in one important
instance, their announced goal of showing that the parser's decision
preferences can be seen as an automatic consequence of its structure
(p. 297).
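The window account just described can be made concrete with a small sketch. It is an illustration only, under simplifying assumptions that are not FF's: the window is taken to hold exactly six words (FF say "roughly six"), it is taken to end at the incoming word, and the function names are hypothetical.

```python
def visible_window(words, position, size=6):
    """Words visible to a first-stage parser when the word at `position`
    arrives, assuming a strict window of `size` words ending at that word."""
    start = max(0, position - size + 1)
    return words[start:position + 1]


def verb_visible(sentence, verb, incoming, size=6):
    """True if `verb` is still inside the window when `incoming` is read.
    The first occurrence of `incoming` in the sentence is used."""
    words = sentence.split()
    position = words.index(incoming)
    return verb in visible_window(words, position, size)


sentence_3 = "Joe bought the book that I had been trying to obtain for Susan"
sentence_4 = "Joe bought the book for Susan"

# In (4) the verb is still in view when "for" arrives; in (3) it is not.
print(verb_visible(sentence_4, "bought", "for"))
print(verb_visible(sentence_3, "bought", "for"))
```

On these assumptions the verb phrase attachment is available in (4) but not in (3), which is the asymmetry that FF's explanation relies on.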

FF also offer a structural account for Minimal Attachment which is quite
irrelevant to the interaction between the two strategies. Here it is
sufficient to note that on FF's account, Minimal Attachment is insensitive
to distance effects in the manner putatively characteristic of Right
Association. Hence, Minimal Attachment continues to operate in contexts
where Right Association does not.

There are, however, serious problems with this claim. If the preference for
low right attachment
sets in . .. at some distance just because of the PPPs
limitation
to a six word window, then this limitation
ought to operate
uniformly
in all cases. Just as the preference for low right attachment
dissolves as sentence (3) is shortened into sentence (4), so it should also dissolve as sentences (1) and (2) are shortened. But it does not. Sentence sets
(7) and (8) represent
progressive
shortenings
of sentences (1) and (2):
(7) (a) Tom said that Bill had taken the cleaning out yesterday.
    (b) Tom said that Bill had taken it out yesterday.
    (c) Tom said that Bill had taken it yesterday.
    (d) Tom said that Bill took it yesterday.
    (e) Tom said that Bill died yesterday.
    (f) Tom said Bill died yesterday.
(8) (a) Joe called the friend who had smashed his new car up.
    (b) Joe called the friend who had smashed his car up.
    (c) Joe called the friend who had smashed it up.
    (d) Joe called the friend who smashed it up.
    (e) Joe called everyone who smashed it up.
    (f) Joe called everyone who smashed up.

Notice that as these sentences shrink, there is no noticeable tendency for the
preference
for low right attachment
to diminish. Indeed, informants
to
whom I have given just the (f) versions uniformly
report a preference
in
favor of the analysis in which the final word is attached to the lower of the
two clauses.* But neither (f) version is more than six words long. Both (f)
sentences can fit comfortably
within the PPPs window. Hence the PPP
could readily see both clauses as candidates
for possible attachment.
Therefore,
the structure of the PPP cannot provide any explanation
of the
language users continued preference for low right attachment in these short
sentences.3
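The length claim behind this argument is easy to check mechanically. The sketch below simply counts words, assuming a strict six-word window; the function name is hypothetical and the check says nothing about which attachment informants prefer.

```python
WINDOW_SIZE = 6  # assumption: a strict reading of FF's "roughly six words"


def fits_in_window(sentence, size=WINDOW_SIZE):
    """True if the whole sentence is no longer than the assumed window."""
    return len(sentence.split()) <= size


sentence_7a = "Tom said that Bill had taken the cleaning out yesterday"
sentence_7f = "Tom said Bill died yesterday"
sentence_8f = "Joe called everyone who smashed up"

for label, sentence in [("7a", sentence_7a), ("7f", sentence_7f), ("8f", sentence_8f)]:
    print(label, len(sentence.split()), fits_in_window(sentence))
```

Since both (f) sentences fit entirely within the assumed window, both attachment sites are in view, and a window limitation alone gives no reason to expect a preference for the lower one.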
*Some informants find the higher attachment in (8f) ungrammatical, presumably
because it requires an intransitive interpretation of smashed. However, these
informants all prefer the low right attachment in (8e), where there is no
possible confounding from ungrammaticality of either attachment.
3 The same sort of argument can be brought to bear upon some of FF's other
arguments for the explanatory power of the PPP's limited window. For example,
FF argue that the multiple embedded sentence (a) is easier than the
identically embedded sentence (b) because its major constituents (marked here
by brackets) are approximately the length of the PPP's window:
(a) [The very beautiful young woman] [the man the girl loved] [met on a cruise ship in Maine]
    [died of cholera in 1972].
(b) The woman the man the girl loved met died.
But again it is possible to construct an equivalent sentence which is short
enough to fall entirely within the PPP's window yet is very difficult to
comprehend:
(c) Women men girls love meet die.


One might hope to save the Sausage Machine by somehow incorporating
the Right Association strategy within the PPP itself. It might be possible to
stipulate, for example, that the PPP tries to fashion the longest possible
phrases from the words within its window. But this move would leave us
without an explanation of why Minimal Attachment appears to prevail
over Right Association in sentence (4). Moreover, it would necessarily
entail the abandonment of FF's goal of explaining Right Association
exclusively in terms of Sausage Machine architecture. For as FF point out
themselves, there is nothing about the division of labor between the PPP and
SSS which might explain why the PPP should strive to build maximally long
phrases:
Trying to squeeze extra words into the current package could also be
counterproductive, for it might happen that the limits of the PPP's capacity are reached
at a point which is not a natural phrasal break in the sentence. In such
circumstances it would have been better for the PPP to terminate the current package
a word or two sooner, and start afresh with a new phrase as a new package (p. 312).

To summarize, it now appears that contrary to the Sausage Machine prediction,
Right Association is not limited to cases of distant attachment. Moreover, the
Sausage Machine offers no explanation of why the language user appears to
follow the Right Association strategy in some short sentences (7f and 8f) but
not others (4). Accordingly, it seems clear that the Sausage Machine's putative
explanation of the behavior of the Right Association strategy is simply
incorrect. There is nothing about FF's observations which would require a
parser with properties (A) and (B). However, it remains to be seen whether a
parser like the ATN, which has neither two stages nor a limited input window,
can give a satisfactory account of the behavior of Right Association and
Minimal Attachment, as well as their somewhat puzzling interaction.

Is the ATN in the Same Pickle?


According to FF, Minimal Attachment and Right Association cannot be described
within the ATN framework. The problem, as they see it, is that the ATN lacks
property (C) - the ability to make structural assignments contingent on the
geometry of the developing phrase marker. In FF's words,
An ATN parser could certainly be designed so that it would make exactly the same
decisions at choice points as the Kimball parser. But because its decisions are determined by the ranking of arcs for specific word and phrase types, rather than in


terms of concepts like lowest rightmost node in the phrase marker, the parser's
structural preferences would have to be built in separately for each type of
phrase and each sentence context in which it can appear. Evidence that the
human sentence parser exhibits general preferences based on the geometric
arrangement of nodes in the phrase marker indicates that its executive
component does have access to the results of its prior computations. Its input
at each choice point must consist of both the incoming lexical string and the
phrase marker (or some portion thereof) which it has already assigned to
previous lexical items (p. 294).
It is difficult to determine, in general, whether the ATN will eventually
require the addition of something like property (C). However, it is quite clear
that no such property is required to give a perfectly general description of
the two parsing strategies that FF have proposed. The structural preferences
involved in these strategies would not have to be built in separately for each
type of phrase and each sentence context. On the contrary, it appears to be
possible to formulate scheduling principles for the ATN that completely capture
the structural preferences involved and that do so without explicit appeal to
the geometry of the phrase marker. Moreover, when these principles are combined
with an ATN grammar for FF's crucial sentences, the residual mysteries
concerning the interaction between Right Association and Minimal Attachment are
completely resolved.
To see this, recall first that a scheduling rule in an ATN, as described by
Kaplan (1975, 1972) and by Wanner and Maratsos (1978), is essentially a
specification of the order in which the ATN processor considers the arcs
leaving a state in an ATN grammar network. Recall also that the ATN network
includes at least 5 types of arcs:4
- WORD arcs that analyze specific grammatical morphemes such as that or to;
- CAT arcs that analyze grammatical categories such as Noun (N) or Verb (V);
- SEEK arcs that analyze whole phrases or clauses such as NP, VP, or S;
- SEND arcs which terminate a network;
- JUMP arcs which provide a free transition between states, thus expressing
the optionality of certain sub-paths through a network.
Given this enumeration of arc types, we can formulate two general constraints
on ATN scheduling rules which provide a general description of Right
Association and Minimal Attachment:

4For a more detailed discussion of these arc types see the ATN sources cited above.


(9) Right Association: Schedule all SEND arcs and all JUMP arcs after every
other arc type. (Since SEND arcs and JUMP arcs never leave the same state,
there is no ambiguity here with respect to the relative ordering of these two
arc types.)
(10) Minimal Attachment: Schedule all CAT arcs and WORD arcs before all SEEK
arcs.

Consider Minimal Attachment first. Basically this strategy stipulates that the
parser should never add an additional non-terminal node to the parse tree
unless it is forced to by the grammar. Scheduling rule (10) enforces this
strategy by providing that any input element will be analysed as a category or
a word of the current phrase before any SEEK to a lower phrase is attempted.
Suppose, for example, that our ATN grammar includes the following network level
that analyzes X phrases (XP):
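[Diagram of the XP network not recoverable from the source. As the surrounding
text describes it, arc 1 (CAT Y), arc 2 (SEEK ZP) and arc 3 (JUMP) leave the
first XP state, and arc 4 (CAT Q) and arc 5 (SEND) leave state XPfinal.]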

Note first that because the CAT Y arc is ordered before the SEEK ZP arc, the
constituent XP will always be completed by means of categorical nodes rather
than phrasal nodes, if such a completion is possible. This ordering guarantees
Minimal Attachment. To see this, imagine that our hypothetical ATN also
includes a network for Z phrases (ZP), one path through which begins with a
CAT Y arc, as in:
[Diagram of the ZP network not recoverable from the source; surviving arc
labels: CAT Y, CAT R.]

Suppose that the parser is in state XP, at the moment that it encounters a
word in the input string that belongs to the syntactic category Y. At this
point, two analyses of Y are possible, roughly those corresponding
to the
following attachment possibilities:


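(11), (12) [Tree diagrams not recoverable from the source. Per the surrounding
text, in (11) Y is attached directly to XP; in (12) Y is attached to a ZP node
within XP.]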

Obviously, (11) is the minimal attachment of Y to XP and it is just this
possibility which will be tried first so long as arc 1 is ordered before arc 2.
Now consider Right Association. Basically, this strategy requires the parser to
add as many nodes to the current constituent as possible. In an ATN this is
enforced by postponing the network-final SEND arc as long as possible. To see
this, suppose that the parser is in state XPfinal when it encounters a word in
the input string that can be categorized as Q. So long as arc 4 is always
ordered before arc 5 at this state, as it must be in order to obey the Right
Association scheduling principle (9), the Q node will be attached as the right
sister of the current constituent XP. To see how this guarantees Right
Association, suppose our ATN also includes a network for U phrases (UP) which
contains a path including a SEEK XP arc followed by a CAT Q arc, as in:
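[Diagram of the UP network not recoverable from the source. Per the surrounding
text, it contains arc 9 (SEEK XP) followed by arc 10 (CAT Q).]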

Suppose also that the SEEK XP on arc 9 has been issued and the parser has
completed the partial path through the XP network to state XPfinal by finding a
Y in the input. Now suppose the next word falls in the Q category. Here again,
two attachments are possible. Either Q can be attached directly to the X
constituent via arc 4 to yield (13) or Q can be attached to the UP constituent
via arcs 5 and 10 to yield (14).
(13), (14) [Tree diagrams not recoverable from the source.]

Obviously Right Association favors (13), and so long as arc 4 precedes arc 5,
this is the analysis that the ATN will favor as well.


Finally, notice that the JUMP on arc 3 must be ordered last since it leads
to the SEND on arc 5. If the JUMP were ordered earlier at state XP,, it
could lead the parser to violate Right Association by executing the SEND at
arc 5 before trying the CAT and SEEK on arcs 1 and 2.
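
As a rough illustration, scheduling principles (9) and (10) can be read as a sort key over the arcs leaving a state. The Python sketch below is not part of the ATN formalism or of any published implementation: the `Arc` class, the `RANK` table and the `schedule` function are hypothetical names introduced here, and the arc numbers are those of the hypothetical XP network discussed in the text.

```python
from dataclasses import dataclass

# Scheduling rank implied by principles (9) and (10):
# CAT and WORD arcs first, SEEK arcs next, SEND and JUMP arcs last.
RANK = {"CAT": 0, "WORD": 0, "SEEK": 1, "JUMP": 2, "SEND": 2}


@dataclass(frozen=True)
class Arc:
    number: int      # arc label as used in the text
    kind: str        # "CAT", "WORD", "SEEK", "JUMP" or "SEND"
    label: str = ""  # category, word or phrase type; empty for JUMP and SEND


def schedule(arcs):
    """Return the arcs leaving one state in the order they would be tried."""
    # sorted() is stable, so arcs of equal rank keep their listed order.
    return sorted(arcs, key=lambda arc: RANK[arc.kind])


# Arcs of the hypothetical XP network, deliberately listed in the "wrong"
# order to show that scheduling reorders them.
xp_first_state = [Arc(3, "JUMP"), Arc(2, "SEEK", "ZP"), Arc(1, "CAT", "Y")]
xp_final_state = [Arc(5, "SEND"), Arc(4, "CAT", "Q")]

print([arc.number for arc in schedule(xp_first_state)])  # [1, 2, 3]
print([arc.number for arc in schedule(xp_final_state)])  # [4, 5]
```

Under this reading, no reference to the shape of the phrase marker is needed: the ordering depends only on the type of each arc.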
Given the ATN restatement of Right Association and Minimal Attachment provided
by scheduling principles (9) and (10), we can now consider the way in which
these principles apply to FF's crucial cases. Figure 1 presents an ATN grammar
which will handle sentences (1), (4), and (7a-7f). The grammar was constructed
by restating in ATN terms all the phrase structure rules that FF implicitly
used to construct the phrase markers given in their paper. Corresponding to
every context free phrase structure rule in FF's generative grammar, there is a
level of the ATN network which expresses the identical analysis of each phrase.
For simplicity, however, we have ignored irrelevant grammatical details
pertaining to verbal auxiliaries, verb particles, and deleted complementizers
in the grammar of Figure 1. None of these omissions has any bearing upon the
interesting aspects of the ATN analysis of FF's sentences.
To illustrate the way in which principle (9) captures Kimball's principle of
Right Association, consider first the analysis of sentence (7e), repeated here
alongside the arc sequence (15) which gives its analysis path through the
grammar of Figure 1:
(7e) Tom said that Bill died yesterday.
(15) 1(17, 22), 2(5, 8(13, 14(1(17, 22), 2(5, 9, 11, 12), 3, 4), 15), 11, 12), 4.

5In adopting this principle for constructing the ATN grammar, I am following
Bresnan's (1978) proposal for relating ATN and phrase structure grammars. ATN
grammars constructed according to this principle provide a well formed labelled
bracketing of the input sentence directly by means of the sequence of
transitions made in accepting the sentence. This avoids the use of LISP
functions to build phrase markers, thereby reducing the expressive power of the
ATN. A limited set of additional actions is required to label grammatical
functions and handle moved constituents. However, with one exception (the HOLD
action) I have deleted these actions from the Figure 1 grammar since they play
no part in the description of Minimal Attachment and Right Association.

Figure 1. An ATN grammar for sentences (3), (4) and (7).
[Network diagram not recoverable from the source; surviving arc labels: SEEK
VP, SEEK NP, CAT PRONOUN, CAT PREP, SEEK NP.]

In constructing arc sequence (15), the arcs in the analysis sub-path that
fulfill a SEEK have been listed in parentheses after the number of the SEEK arc
that caused the SEEK to be attempted. Thus the analysis of sentence (7e) begins
with a SEEK for a NP on arc 1, which is completed when the proper noun Tom is
analyzed on arc 17 and control returned to arc 1 by the SEND on arc 22. This
arc sequence is represented in (15) as 1(17, 22). Following the analysis from
this point, we see that said is analyzed on arc 5, and a SEEK for the
complement S is issued on arc 8. The complementizer that is analyzed on arc 13,
followed by a SEEK S on arc 14. The SEEK S is pursued along arc 1 which
analyzes the subject noun phrase Bill and arc 2 which analyzes the verb phrase
died. At this point there is a choice between analyzing the adverb yesterday as
a modifier of the complement clause on arc 3 or terminating the complement
clause via the SEND on arc 4. However,


since the SEND arc must be ordered after all other arcs according to principle
(9), this choice must be resolved in favor of arc 3. Hence, yesterday is
attached to the complement clause, thus insuring the low right attachment of
the adverb. Notice also that this attachment will be preferred no matter how
long or short the complement clause is. So long as the complement clause
terminates at state Sfinal, the fact that arc 3 precedes arc 4 will guarantee
low right attachment. Since each complement clause in sentence set (7)
terminates at just this state, the ATN successfully captures our intuitions
that the lower attachment of yesterday is preferred throughout the entire set
of sentences in (7). (Anyone with sufficient scepticism and stamina can prove
this by tracing through the ATN analysis of the full set.)
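
For readers who want to trace such analyses mechanically, the nested notation of the arc sequences can be read by a small parser. The Python sketch below is only an illustration: the function names are hypothetical, the input is assumed to be well formed, and the string `SEQ_15` is a cleaned-up reading of arc sequence (15), which is itself an assumption about how the printed sequence should be bracketed.

```python
import re


def parse_arc_sequence(text):
    """Parse e.g. '1(17, 22), 4' into a list of (arc, sub_path) pairs.

    A sub_path is the (possibly empty) list of arcs traversed while
    fulfilling the SEEK issued on that arc. Input must be well formed.
    """
    tokens = re.findall(r"\d+|[()]", text)
    pos = 0

    def parse_path():
        nonlocal pos
        path = []
        while pos < len(tokens) and tokens[pos] != ")":
            arc = int(tokens[pos])
            pos += 1
            sub_path = []
            if pos < len(tokens) and tokens[pos] == "(":
                pos += 1                 # consume "("
                sub_path = parse_path()
                pos += 1                 # consume ")"
            path.append((arc, sub_path))
        return path

    return parse_path()


def count_arcs(path):
    """Total number of arcs traversed, including those inside SEEKs."""
    return sum(1 + count_arcs(sub_path) for _, sub_path in path)


# Assumed reading of arc sequence (15) for "Tom said that Bill died yesterday."
SEQ_15 = ("1(17, 22), 2(5, 8(13, 14(1(17, 22), 2(5, 9, 11, 12), 3, 4), 15), "
          "11, 12), 4")

parsed = parse_arc_sequence(SEQ_15)
print([arc for arc, _ in parsed])  # top-level arcs: [1, 2, 4]
print(count_arcs(parsed))          # 22
```

In this reading, arc 3 appears inside the sub-path of the SEEK on arc 14, which is exactly the low right attachment of yesterday to the complement clause.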
Principle (9) also insures the low right attachment of for Susan in sentence
(3), although here the principle applies at state VP3, where the JUMP on arc 11
must be ordered after the SEEK PP on arc 10. To see the effect of this
ordering, consider the following analysis path for sentence (3) by the grammar
of Figure 1:
(3) Joe bought the book that I had been trying to obtain for Susan.
(16) 1(17, 22), 2(5, 7(19(18, 20, 22), 26(13, 14(1(16, 22), 2(5, 6, 27(5, 7(28, 22), 10(23, 24(17, 22), 25), 12), 12), 4), 15), 22), 11, 12), 4.

The parser works its way to state VP3 by SEEKing an S at arc 26 in order to
process the relative clause (that I had been trying...). As this SEEK is
executed, the head noun phrase (the book) is put on HOLD in accordance with the
ATN procedure for processing relative clauses (for details, see Wanner and
Maratsos (1978)). The relative clause is then processed as an ordinary
declarative clause until the parser reaches state VP,, having just analyzed the
infinitival complement to obtain. Since obtain requires an object noun phrase,
the book will be removed from HOLD at this point and assigned as direct object
on arc 7. Then, at state VP3, the parser must attempt to find a prepositional
phrase on arc 10 to complete the complement clause. Since for Susan is
available at this point in the input string, it is automatically attached as
the indirect object of the complement clause. This is, of course, just the low
right attachment that language users prefer in this case. The only way for the
ATN to make the higher attachment would be to reverse the order of arcs 10 and
11 in violation of principle (9).
But now what about sentence (4)? Why doesn't Right Association operate there as
well and come into conflict with Minimal Attachment? This is a natural question
if one considers just the geometry of the alternative phrase markers for (4):
in phrase marker (5), for Susan is minimally attached to the VP; in (6), for
Susan is low right attached to the NP. However, the question


disappears given the ATN formulation of the two parsing strategies. When the
two parsing strategies are implemented by ordering arcs according to principles
(9) and (10), the left-to-right operation of the ATN automatically establishes
the priority of the Minimal Attachment analysis of sentence (4). One way to see
this is to trace through the analysis of sentence (4) which is given below in
arc sequence (17):
(4) Joe bought the book for Susan.
(17) 1(17, 22), 2(5, 7(18, 20, 22), 10(23, 24(17, 22), 25), 12), 4.

The subject noun phrase Joe is analyzed [...] successfully apply to the book.
At this point, the book has been minimally attached to the verb phrase. In
effect the CAT-before-SEEK arc ordering at state NP, has selected partial
structure (18) over structure (19):

Once this structure has been selected, there is only one possible attachment
for the prepositional phrase for Susan and that is the direct attachment to the
verb phrase. The ATN accomplishes this attachment on arc 10 once control
returns to state VP3 after the successful SEEK for an NP on arc 7. Notice that
the conflict between the minimal attachment and the low right attachment for
Susan never arises in the ATN analysis because the ATN never considers
structure (19), and it is only in terms of a comparison between structures (18)
and (19) that there appears to be a conflict between Minimal Attachment and
Right Association. Therefore, it appears that the ATN resolves the apparent
conflict in (4) between Minimal Attachment and Right Association in the
psychologically appropriate way specifically


because it does not appraise the geometry of the two possible parse trees. This
is, of course, just the reverse of FF's claimed advantage for the Sausage
Machine's ability to survey the structural details of the developing phrase
marker.
To summarize then, I hope to have shown, contrary to FF's claims, first that
the ATN can provide a general statement of Minimal Attachment and Right
Association; second, that the ATN can do so without explicit appeal to the
geometry of the developing phrase marker; third, that a careful formulation of
the two parsing strategies coupled with a detailed ATN grammar can account for
the otherwise puzzling interactions between parsing strategies noted by FF
simply by appeal to left-to-right processing and without any assumption of a
limited input window.

Explanation
If the ATN analysis of the interaction of Minimal Attachment and Right
Association in sentence (4) is correct, then there is no need to explain why
Right Association sets in only at a distance in some cases (4) but not others
(7 and 8). Evidently, Right Association operates uniformly across all sentences
although it may be preempted by another strategy if the preemptive strategy
operates at an earlier point in the sentence and eliminates the opportunity for
low right attachment. Thus the ATN analysis explains FF's observations about
the interaction of the two strategies but it leaves us with the problem of
explaining the strategies themselves. Why does the parser employ these
strategies as opposed to others?
To answer this question within the ATN framework will require a theory of
scheduling which provides some means of selecting psychologically appropriate
scheduling principles and dismissing psychologically inappropriate scheduling
principles. Although it is difficult to specify such a theory at present, it is
possible to speculate, given the results in hand, about the eventual character
of such a theory. The basic idea is that a scheduling theory should choose
scheduling principles which minimize computation during parsing. In different
ways both the Minimal Attachment principle (10) and the Right Association
principle (9) appear to have this effect on ATN processing. Thus, by ordering
CAT arcs before SEEK arcs, the Minimal Attachment principle guarantees that the
attachment requiring the fewest arcs will be tried first. This follows because
a CAT arc makes an attachment directly, while a SEEK arc requires the
implementation of at least one additional arc (within the network invoked by
the SEEK) to complete the attachment. Minimal Attachment also minimizes the
number


of SEEKS per parse and this also reduces the memory demands involved in
implementing SEEKS (for details see Wanner and Maratsos, 1978).
Right Association minimizes computation in a different way.6 This strategy
guarantees that the parser will continue to include input elements within the
scope of the current constituent for as long as possible. Shifts between
constituents will be minimized. Since syntactic structure is generally more
predictable within constituents than across constituent boundaries, this
strategy should insure that the parser will minimize garden paths. In the ATN,
garden paths inevitably inflate the number of arcs which the parser must
traverse while pursuing dead-ended analysis paths.7 Therefore, Right
Association, like Minimal Attachment, will have the effect of reducing to a
minimum the average number of arcs traversed for any large and representative
set of sentences input to the parser. This suggests that a theory of scheduling
might ultimately be constructed around a metric based on ATN performance
measured in arc counts. Such a theory would rank scheduling principles
according to their effects on average arc count, with the highest ranking
scheduling principles producing the lowest average count. The theory could be
tested by determining whether the highest ranking principles were also those
employed by the language user.*
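
The proposed metric can be stated compactly. The Python sketch below is only a schematic rendering of the idea, under the assumption that arc counts per sentence have somehow been obtained for each candidate set of scheduling principles; the function name `rank_schedules`, the schedule labels and every number in `TOY_COUNTS` are placeholders invented for illustration, not measurements of any parser.

```python
from statistics import mean


def rank_schedules(arc_counts):
    """Rank candidate schedules by average arcs traversed, lowest first.

    arc_counts maps a schedule label to the list of arc counts observed
    when parsing the same set of sentences under that schedule.
    """
    averages = {name: mean(counts) for name, counts in arc_counts.items()}
    return sorted(averages.items(), key=lambda item: item[1])


# Placeholder numbers only: made up to exercise the function.
TOY_COUNTS = {
    "schedule A": [17, 22, 40],
    "schedule B": [21, 26, 47],
    "schedule C": [19, 30, 52],
}

for name, average in rank_schedules(TOY_COUNTS):
    print(name, round(average, 2))
```

On such a ranking, the empirical question raised in the text is whether the top-ranked schedule coincides with the one the language user actually follows.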
These brief suggestions are obviously far from the sort of explanatory theory
of scheduling that is required to account for FF's observations about parsing
strategies. Nevertheless, they should serve to demonstrate that such a theory
is by no means unobtainable within an ATN framework.
References
Bresnan, J. (1978) A realistic transformational grammar. In M. Halle, J. Bresnan and G. A. Miller (Eds.) Linguistic Theory and Psychological Reality, MIT Press, Cambridge, Mass.
Frazier, L. (1978) On comprehending sentences: syntactic parsing strategies. Unpublished doctoral dissertation, University of Connecticut.
Frazier, L. and Fodor, J. D. (1978) The sausage machine: a new two-stage parsing model. Cog., 6, 291-325.

6I am indebted to Ronald Kaplan for pointing out this property of Right Association.
7See Wanner, Kaplan and Shiner (1975) for evidence that garden paths have a
measurable effect on comprehension time.
*This procedure for ranking scheduling principles need not be identified with
the child's procedure for learning scheduling principles. The child might
reconstruct the ranking by means of tacit statistics performed over its own
parsing history. But it might also be innately equipped with biases in favor of
the highest ranking strategies. Conceivably, such biases might evolve to permit
the child to perform efficient parsing from the outset. Note, however, that if
so, the explanation of the preferred scheduling principles does not lie in the
child's innate bias but in the efficiency ranking to which that bias conforms.
Explanation need not always be rooted in constraints on acquisition as is
sometimes assumed.


Kimball, J. (1973) Seven principles of surface structure parsing in natural language. Cog., 2, 15-47.
Kaplan, R. (1972) Augmented transition networks as psychological models of sentence comprehension. Artific. Intell., 3, 77-100.
Wanner, E. and Maratsos, M. (1978) An ATN approach to comprehension. In M. Halle, J. Bresnan and G. A. Miller (Eds.) Linguistic Theory and Psychological Reality, MIT Press, Cambridge, Mass.

Reference Notes
Kaplan, R. (1975) Transient processing load in sentence comprehension. Unpublished doctoral dissertation, Harvard University.
Wanner, E., Kaplan, R. and Shiner, S. (1975) Garden paths in relative clauses, unpublished paper, Harvard University.
