1093/ijl/eci042
Abstract
This article attempts to synthesise recent advances in collocational theory into a
coherent framework for lexicological theory and lexicographic practice. By posing a
number of fundamental questions related to the definition of collocation, it critically
reviews frequency-based, semantic and pragmatic approaches to collocation. It is found,
among other things, that two types of collocation, namely long-distance collocation
and collocation between semantic features, have suffered almost total neglect. This leads
to suggestions for a new division of the collocational spectrum and for a revised
definition of collocation based on the notions of usage norm (Steyer 2000) and
holisticity (Siepmann 2003). It is argued that this new view of collocation considerably
widens the dictionary maker's brief, since future lexicography will have to provide a full
account of both structurally simple and structurally complex units, including fixed
expressions of regular syntactic-semantic composition (see Part II of this article, to be
published in the March issue of this journal).
1. Introduction
According to modern science, there is no such thing as independent existence;
at least since the advent of chaos theory, there has been full recognition that
all forms of life and material phenomena, whether at the micro-level or at the
macro-level, are interdependent. In linguistics, this realization has found its
fittest expression in the idea of linguistic rather than literary intertextuality,
whereby the meaning of one text and its constituent elements depends on
millions of other texts using similar or identical elements. Textual meaning is
thus created by the interplay of two types of repetition, viz. (a) collocation
(in the largest possible sense, including colligation1 and phraseology) and
(b) cohesion. It turns out that one instance of collocation and the entire
language are mutually illuminating, since the instance is understood in terms of
International Journal of Lexicography, Vol. 18 No. 4
2005 Oxford University Press. All rights reserved. For permissions,
please email: journals.permissions@oxfordjournals.org
Dirk Siepmann
the whole, and the whole in terms of the instance (cf. Hunston 2001: 31); taking
this a bit further, we might say that not only is each pattern necessary for
comprehending the sum total of similar patterns, but each pattern is
also a miniature version of that sum total, as shown by the fact that the
meaning of individual patterns (e.g. German sonniges Gemüt ['sunny
disposition', i.e. 'irrepressible high spirits'] vs sonnige Lage ['sunny location']),
even if shorn of any context, is evident to the native speaker.
This relatively recent view of meaning creation (Hoey 1991, 1998, 2000,
Feilke 1994, 1996) seems much more in keeping with speakers' intuitive
knowledge about language than was the case in earlier structuralist theories.
The latter tended to assume that expressions such as sonnige Lage have
both a compositional, literal meaning and a non-compositional, figurative
meaning (Feilke 1996: 128). In an intertextual or socially-based view of
meaning creation, the compositional meaning is exposed for what it is, namely
an abstraction of the linguist which has no base in the native speaker's
mental lexicon; the expression sonnige Lage is then considered to be a
holistic sign that is irreducible to the sum of its parts. In a related
development, computational and cognitive linguists have used corpus-linguistic
insights to work out models of language grounded in actual usage rather
than abstract general rules (Chandler 1993, Croft and Cruse 2003, Skousen
1989). In these models word or clause formation is by analogy with existing
exemplars, and it will be seen that such models can also be applied to
collocation.
This article reviews, one by one, the various criteria that have been
invoked over the last half century to define the notion of collocation,
pursuing a dual objective: (a) to show that none of these criteria applies in
all cases, so that we can at best give a prototypical definition of collocation,
and (b) to demonstrate that the problems associated with the definition
of collocation stem from the mechanistic, old-paradigm view of language
embodied in structuralist theories which try to impose theoretical abstractions
on an infinitely complex reality arising from communicative interaction and
the institutional practices such interaction puts in place. This will then allow
us to provide a more secure and more broadly based underpinning for the
treatment of colligation and collocation in lexicography. With the exception
of Steyer (2000), no such model has as yet been proposed.
The subject of collocation has been approached from two main angles:
on one side are the semantically-based approaches (e.g. Benson 1986, Mel'čuk
1998, González-Rey 2002, Hausmann 2003, Grossmann and Tutin 2003) which
assume a particular meaning relationship between the constituents of a
collocation; on the other is the frequency-oriented approach (e.g. Jones and
Sinclair 1974, Sinclair 1991, Sinclair 2004, Kjellmer 1994) which looks at
statistically significant cooccurrences of two or more words. This theoretical
distinction is paralleled by a geographical divide: the semantic approach has its
[Table 1. Columns: Corpus (Abbreviation), Type, Content, Word Count, Baseline Year. Corpora listed: Corpus of Academic English (CAE), Corpus of Academic French (CAF), Corpus of Academic German (CAG), Corpus of English Fiction (FE), Corpus of French Fiction (FF), Corpus of German Fiction (FG), Corpus of English Motoring (CME). Content is given as full-text or as full-text and samples; word counts range from 30 million to 100 million; baseline years range from 1980 to 1995.]
[Figure: combinations of semantically autonomous items (he likes money; look at the sea!; he prefers fish to meat) contrasted with collocations containing a semantically dependent item (money + withdraw; decision + take; clouds + scudding).]
constituents (nouns, verbs and adjectives) on one side, and words with a
morphological or syntactic function (articles, prepositions, etc.) on the other.
Other scholars assume that the boundary cuts across different parts of speech;
according to them, a noun such as scholar is semantically autonomous, whilst
a noun like member is semantically dependent on its linguistic environment
(e.g. party member, family member). Yet others (e.g. Lutzeier 1981) go so far as
to claim that there are no criteria at all allowing us to differentiate between
words that have lexical content and those that do not. Indeed, words that
have been intuited as semantically dependent by collocation scholars may, on
inspection, turn out to be semantically autonomous (see 3.2.3 below).
3.2.2 Collocations of regular syntactic-semantic composition. As seen in Section 1, the
collocational character of seemingly free combinations such as accepter des
pièces ('take coins') only comes to light if the wider context is taken into
consideration. Similar considerations hold true for other types of combinations
involving items with the same or a similar semiotactic status; here are a few
typical examples:
(6) I've got grease all over my shirt. (FE)
regarde où tu vas! (FF) (= pass auf, wo du hintrittst; 'watch where you are
going/stepping!', 'watch where you put your feet!')
I didn't bring the car (FE)
look at the time! (FE)
From the perspective of structuralist linguistics, such sentences would be
considered composite units whose meaning is the sum total of the literal
meaning of its constituents; in other words, they would be viewed as falling
within the scope of the open-choice principle. On inspection, however, they
turn out to be semi-phrasemes (i.e. collocations). Three main reasons can be
advanced for this: firstly, they are clearly not idioms, since they are
immediately comprehensible to anyone who is familiar with their basic
constituents; thus, the first example can be analysed as follows: [subject] have
got [object] [locative]. Secondly, it is evident that the literal meaning of
the first sentence could only be construed as referring to a shirt every square
inch of which was entirely smeared with grease, but, of course, this is not what
the sentence means to a native speaker, who will take it to mean that only
part of the shirt's surface has been stained.5 Thirdly, the same meaning could
be expressed quite differently in another language such as German: ich habe
mein Hemd mit Fett beschmiert/mein Hemd ist voller Fett/mein Hemd ist ganz
fettig. What we are dealing with, then, is an instance of a collocational
framework (Renouf and Sinclair 1991) or, more precisely, a type of colligation,
that is, a recurrent grammatical pattern that is lexically restrained: have
got [liquid, crumbs, etc.] on/all over [item of clothing, body, body part].
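The slot structure of this colligational pattern can be made concrete with a small sketch. What follows is only an illustration under stated assumptions: the two filler lists are invented for the purpose (a corpus-driven study would derive them from concordance data), and the function name match_have_got is hypothetical.

```python
import re

# Hypothetical slot fillers, invented for illustration only.
SUBSTANCES = ["grease", "mud", "crumbs", "paint", "ink"]
TARGETS = ["shirt", "hands", "face", "trousers", "coat"]

# have got + [liquid, crumbs, etc.] + on/all over + [item of clothing, body part]
PATTERN = re.compile(
    r"\b(?:have|has|'ve|'s)\s+got\s+"
    r"(?P<substance>" + "|".join(SUBSTANCES) + r")\s+"
    r"(?P<prep>all over|on)\s+"
    r"(?:my|your|his|her|our|their|the)\s+"
    r"(?P<target>" + "|".join(TARGETS) + r")\b",
    re.IGNORECASE,
)


def match_have_got(sentence):
    """Return (substance, preposition, target) if the pattern is found, else None."""
    m = PATTERN.search(sentence)
    if m is None:
        return None
    return (
        m.group("substance").lower(),
        m.group("prep").lower(),
        m.group("target").lower(),
    )


print(match_have_got("I've got grease all over my shirt."))
print(match_have_got("I didn't bring the car"))
```

The point of the sketch is that the grammatical frame is fixed while each open slot accepts only members of a restricted lexical set, which is what distinguishes a lexically restrained colligation from a free syntactic pattern.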
Collocation/idiom
colligation + collocation: sein Versuch, [Berg] zu bezwingen / [Rekord] zu brechen
colligation: N à ses heures [poète, peintre, cuisinier à ses heures]
Similar observations can be made for the second example, where the
interlingual equivalents clearly show that the phrase is idiomatically
constrained. The standard German translation uses two entirely different
and more specific verbs (regarder → aufpassen (= 'pay attention'), aller →
hintreten (= 'step [somewhere]')).
This kind of finding links up with Hausmann's (1997) claim that everything
in language is idiomatic and with Hunston's (2001) investigation into
colligation, which shows that even grammatical strings of a fairly random
nature may carry a particular semantic prosody. Thus, the sequence NP
may not be a(n) NP is used as a signal of concession commonly followed by
a contrasting clause introduced by but (Hunston 2001: 24).
This is also obvious from such interlingual correspondences as those given
in Table 3.
These examples show that translational equivalence can usually be achieved
at the level of constructions (in the sense of Fillmore). Probably the most
frequent case is the rendition of one construction type by the same type in
another language (e.g. espionner, c'est attendre; to spy is to wait; spionieren
heißt warten); it is by no means uncommon, however, to find one construction
type translated by another. Thus, equivalences 1–3 of Table 3 can be accounted
for in terms of a shift from an English complex and schematic construction,
whose rules of semantic composition are fairly general, to a German complex
and substantive construction, whose rules of semantic composition are more
specialized (for a listing of construction types, see Table 4). The French phrase
sur (une) autoroute dégagée (example 2), where the indefinite article is
optional, shows how increased use may result in greater fixity and brevity,
in other words, in phraseologicization (cf. German Porsche fahren alongside
einen Porsche fahren, or French sur chaussée mouillée alongside sur une
chaussée mouillée). Equivalence 4 is remarkable as demonstrating that mainly
schematic constructions in one language may correspond to combinations of
schematic and substantive constructions in another. Even stronger support
for the notion of different construction types comes from such equivalences
as 5, where a complex but bound construction in German corresponds to a
complex and schematic construction in French.
[Table 4. Columns: Traditional name, Examples. Traditional names listed: syntax; subcategorization frame; idiom; morphology; syntactic category; word/lexicon.]
3.2.3 Contingent meaning. The autonomous/dependent distinction presupposes
that, in the words of Mel'čuk (1998: 31), 'the problem of the lexicographic
description of lexical units is an independent problem that has to be
solved . . . prior to any discussion of phraseology'. Thus, Mel'čuk seems to
assume that the meaning of the adjective rancid, which occurs in the noun-adjective
collocation rancid butter, can only be defined with reference to butter. This
assumption is, however, belied by even the briefest corpus enquiry; it is found
that the adjective itself has a wide combinatorial range, which divides into
two separate meaning groups, viz. (a) food, butter, bacon, milk, meat, cream,
fat, grease, flour, wheat, oil, chocolate; smell, odour, aroma; socks, sweat; water
and (b) atmosphere, sentiment, academics, affair, show, humour, prune. This
shows that the adjective has at least two metonymically related meanings of its
own which might be glossed respectively as '(of food) having a rank smell or
taste as the result of decomposition or chemical change' and '(of people or
things) having vile, revolting, obnoxious qualities'; these two meanings would
have to be recorded in the dictionary. Similar analyses have been proposed for
other seemingly unique collocations of Mel'čuk's type 2(b), such as schütteres
Haar ('thin hair'; Steyer 2003: 107), with the same results. Another reason why
lexical entries cannot simply be presupposed as given is that some nouns
do not have any meaning in isolation. One example cited by Feilke is German
Lage ('situation'), and the same goes for its standard English and French
equivalents. The French collocation situation + faire (la situation faite aux
protestants) could therefore be said to consist of two semantically empty
items, and yet the combination of the two yields a meaningful collocation.
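The kind of corpus enquiry appealed to above for rancid can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the toy sentences are invented, the function names are hypothetical, and the two word sets simply reproduce the meaning groups (a) and (b) listed in the text.

```python
import re
from collections import Counter

# Meaning groups (a) and (b) as listed in the text above.
GROUP_A = {"food", "butter", "bacon", "milk", "meat", "cream", "fat", "grease",
           "flour", "wheat", "oil", "chocolate", "smell", "odour", "aroma",
           "socks", "sweat", "water"}
GROUP_B = {"atmosphere", "sentiment", "academics", "affair", "show",
           "humour", "prune"}


def right_collocates(sentences, node):
    """Count the word immediately following each occurrence of `node`."""
    counts = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        for left, right in zip(tokens, tokens[1:]):
            if left == node:
                counts[right] += 1
    return counts


def group_collocates(counts):
    """Sort collocates into meaning group 'a', 'b' or 'other'."""
    groups = {"a": Counter(), "b": Counter(), "other": Counter()}
    for word, n in counts.items():
        key = "a" if word in GROUP_A else "b" if word in GROUP_B else "other"
        groups[key][word] = n
    return groups


# Invented toy sentences standing in for real corpus lines.
TOY_CORPUS = [
    "The rancid butter had to be thrown away.",
    "A smell of rancid fat hung in the kitchen.",
    "He disliked the rancid atmosphere of the department.",
    "Rancid butter ruins a cake.",
]

groups = group_collocates(right_collocates(TOY_CORPUS, "rancid"))
print(groups["a"], groups["b"])
```

On real data the grouping would of course be the outcome of the analysis rather than its input; the sketch merely shows that the adjective's collocates fall into more than one set, so that its meaning cannot be tied to butter alone.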
3.2.4 Collocation of semantically autonomous items. Even if we assume that a
sharp line can be drawn between content words and delexical words, there
remain numerous examples of collocations made up of two semantically
autonomous items (printed in bold below), some of which have interlingual
relevance:
(7) an empty parking space (or: a vacant parking space) → un emplacement
libre → ein freier Parkplatz (cf. ein leerer Parkplatz 'an empty/deserted
car park')
a quiet drink (hypallage) → (cf. prendre un verre en toute tranquillité) →
(cf. the idiom: in Ruhe einen trinken)
(have) cold feet (in the non-figurative sense) → (avoir) les pieds gelés /
glacés (cf. also: avoir froid aux pieds) → kalte Füße (haben)
to stop for petrol (for coffee, for a pee) → (free combination: s'arrêter pour
faire le plein) → (free combination: anhalten um zu tanken)
to tell a joke → faire une blague → einen Witz erzählen
The first example shows that English distinguishes between free (= 'free
of charge') and empty (= 'unoccupied') parking spaces. The second example
illustrates a case of frozen hypallage: the semantic features of the adjective
quiet are incompatible with the noun drink; it is the situational context in which
the drink is taken that would normally be described as quiet.6 The third
example demonstrates that French cannot use the adjective froid attributively
when reference is made to parts of the body. The fourth example illustrates
equivalences between seemingly free combinations in German and fixed
expressions in English. Although there is a small number of variants in
evidence, we cannot assume compositionality here. The fifth example is
interesting in that there are synonymic collocations where the verb would be
regarded as semantically contingent on the noun: crack/make a joke.
3.4 Directionality
A related problem is the assumption of directionality (Hausmann 1979)
or of a hierarchical relationship between the constituents of the collocation
(González-Rey 2002), whereby the selection of the collocate is contingent on
the prior selection of the base. While this is more or less obvious with items
such as table + lay/set or money + withdraw, we have already seen above that
examples such as road + hold cast serious doubt upon the validity of the theory.
Hartenstein (1996: 95) cites counterexamples of the type hère + pauvre ('poor
wretch') where the noun cannot be viewed as semantically autonomous since
it has no referent in present-day French. In similar vein, Scherfer (2002) notes
that even such textbook examples of collocational theory as célibataire
endurci ('confirmed bachelor') may be viewed as bidirectional, since the
adjective endurci combines with any noun carrying the semantic feature [figé
dans son comportement]: criminel, catholique, Parisien, etc.; it is monosemous,
semantically autonomous and just as clearly defined as the noun célibataire.
Similar considerations hold for adjectives such as crowded or busy in
combination with nouns like street, road or square. Another example of this
is the French adjective sauf, as witness the concordance given in Table 5
(cf. Siepmann 2003).
mordre (sur) (1)
subject: un car, un bus, une voiture
verb: mord (sur) (1)
prepositional object (semantic field: part of the road): le côté, la ligne blanche, la voie opposée
mordre (sur) (2)
subject (semantic field: region): sa bordure méridionale
verb: mord (sur) (2)
prepositional object (semantic field: region): Jérusalem; les départements de la Loire, de l'Ain et de l'Isère; le continent africain
(2) requires both the subject and object slots to be filled by items denoting
areas (mainly geographical areas or parts of the body). The question
then arises whether the relationship between subject and object can be
best captured in terms of selectional restrictions inherent in the verb or in
terms of collocational restrictions operating across the entire phrase
(verb + two nouns).
To resolve this question, we may turn to Cruse's (1986: 278–279) distinction
between selectional and collocational restrictions. Cruse defines selectional
restrictions as being logically necessary: according to him, it is logically
necessary for the subject of the verb die to carry the semantic traits organic,
alive and mortal. It is different with kick the bucket, which, although
identical in meaning to die, arbitrarily requires a human rather than an animal
subject (*the horse kicked the bucket vs the horse died ). Following Cruse, we
would be entitled to consider the above example as an instance of collocational
rather than selectional restriction. Firstly, there are no logical constraints on
the subjects of mordre sur (1) and (2), whose meaning is simply glossed as
empiéter sur ('overlap into, eat into') in the Trésor de la Langue Française;
indeed, mordre sur occurs with a wide range of subjects and objects in a more
general sense:
(10) les luttes politiques, religieuses et morales, les activités de parti, l'agitation
électorale, le fait que les associations croissent de façon excessive, tout
ceci . . . mord sur le temps de détente ('all this . . . takes up a lot of our
spare time')
je ne voudrais pas mordre sur le temps des questions (heard in a lecture)
('I don't want to take up any of the time reserved for questions')
plus nous vivons dans les signes, et moins les choses mordent sur nous
('less things will affect us')
sans jamais leur (aux lois, D.S.) permettre de mordre sur son esprit ('never
allowing them to affect one's mental state')
le nazisme a mordu sur une large tranche du prolétariat ('many working-class
people were drawn to Nazi ideology')
une abstraction qui mord sur le réel ('an abstraction which is close to
reality')
(all examples except the second from NF)
Secondly, there is a mutual dependency between the subject noun phrase
and the object noun phrase in that (e.g.) a subject noun phrase denoting
a vehicle will entail an object noun phrase designating a part of a road,
and vice versa. We are thus dealing with collocation between certain semantic
properties rather than between specific lexical items. Again, as with the
example of autoroute + filer + locative discussed above, we have a three-slot
collocation mixing collocational attraction and valency: vehicle + mordre
(sur) + locative (part of a road).7 Valency theory does not make allowances
for collocational constraints of such a specific nature, as it posits only three
levels of semantic restrictions, the highest of which is selectional restrictions
of the type [ human] (cf. Blank 2001: 238). Collocation thus turns out
to have a paradigmatic as well as a syntagmatic dimension, with an entire
semantic set (body part, region) - rather than a clearly delimited lexical set
(tousled 1. hair 2. mane) - sharing the same syntagmatic environment.9
The case for collocation between semantic features is strengthened further
when we look at adjectival collocations. A fine example is provided by
cooccurrences of the adverb beautifully with participial adjectives such as
carved, draped, drawn, restored, etc. The verbs on which these participial
adjectives are based share a common semantic feature in describing artwork
or craftwork. Thus, there is a lexical dependency between a specific semantic
feature and a lexeme.10
The list of such examples could be lengthened. To take but one more case,
the adjective bad and the adverb badly co-occur significantly with a semantic
feature which can be glossed as physical imperfection; thus, we have:
(11) I never had a bad chest
he's had a bad concussion
Never had a bad cough, not even a sniffle.
He had a bad heart. Hole in the left ventricle.
He stuttered badly. (all examples from FE)
Note that a distinction could be made between two types of collocation here,
viz. (a) words which share the semantic feature bad (concussion, cough, stutter,
limp) and (b) words which require the adjective to add the notion of badness
(chest, heart).
It is important to reiterate that many such collocations between semantic
features and lexemes are bidirectional. With a collocation such as beautifully
carved it is perfectly conceivable that speakers begin by encoding the type of
craftwork involved, but it is equally likely that they are awe-struck by the sheer
beauty of a painting or other work of art, and the first thing that comes to their
minds is an adverbial expression of the concept of beauty. This latter
hypothesis is also borne out by the high frequency of the unspecific collocation
beautifully done, which does not specify the type of work involved. The notion
of beauty would seem to be just as semantically or cognitively autonomous as
that of craftwork, so that the collocation should be regarded as bidirectional
or even as one conceptual unit.
Similar but less regular collocational dependencies have been observed by
Grossmann and Tutin (forthcoming), Mel'čuk and Wanner (1996) and
L'Homme (2003). These authors prefer to analyse such regularities in terms
of semantic classes. In weighing the two analyses, my judgement is that the
assumption of semantic features is more consistent, especially if long-distance
collocations (Siepmann 2003; 2005) are taken into account.
By long-distance collocations are meant lexical dependencies which
manifest themselves over considerable stretches of text. A convenient
illustration is provided by the topic initiator turning to, which is commonly
followed at some distance by informers such as I/we find/see/note or it
appears that:
(12) Turning to the use of semi-auxiliary is to/are to in if-clauses, we find that a
fifth of the instances in the sample (and 1340 in the corpus as a whole)
appear in this syntactic environment.
In this respect the speech of younger British speakers appears to be following
the lead of American English. Turning to the speech of older speakers,
we note some words which are suggestive of hesitation, uncertainty or turn
manipulation: well, mm, er.
The corresponding Middle High German forms are fuoss, fuesse; mus, muse.
Modern German Fuss: Fusse, Maus: Mause are the regular developments
of these medieval forms. Turning to Anglo-Saxon, we find that our modern
English forms correspond to fot, fet; mus, mys.
Turning to requirements involving both age plus service, it appears there has
been an increase in the propensity of participants to have normal retirement
available at age 62 with a combination of years of service. (all examples
from CAE)
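A long-distance dependency of this kind can be searched for with a very simple pattern. The sketch below is offered only as an illustration under stated assumptions: it restricts the search to a single sentence (a simplification, since the dependency may in principle span longer stretches of text), the informer list is limited to the forms mentioned above, and the function name long_distance_hits is hypothetical.

```python
import re

# Informers named in the text: I/we find/see/note, or it appears.
INFORMER = r"\b(?:(?:I|we)\s+(?:find|see|note)|it\s+appears)\b"

# Topic initiator, then a gap that stays within the sentence, then an informer.
PATTERN = re.compile(
    r"\bTurning to\b(?P<gap>[^.!?]*?)(?P<informer>" + INFORMER + r")"
)


def long_distance_hits(text):
    """Return (informer, gap length in whitespace-separated tokens) per hit."""
    hits = []
    for m in PATTERN.finditer(text):
        hits.append((m.group("informer"), len(m.group("gap").split())))
    return hits


sample = (
    "Turning to the speech of older speakers, we note some words which are "
    "suggestive of hesitation. Turning to requirements involving both age "
    "plus service, it appears there has been an increase."
)
for informer, gap in long_distance_hits(sample):
    print(informer, gap)
```

Recording the gap length alongside the informer makes it possible to see how far apart the two elements typically stand, which is precisely what a fixed-span collocation window would miss.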
A similar phenomenon can be observed with the marker of
comparison any more than. This marker, which introduces a
arguments can be broadly classified into two variants, viz. the argument from
syntax and the argument from semantics.
First, let us look at the argument from syntax. It has been repeatedly claimed
by theoretical linguists that a sharp boundary can be drawn between
collocations and fixed expressions by resorting to standardised tests such as
passivization or pronominalisation (Gross 1996, Scherfer 2002). Thus, a fixed
expression such as prendre la tangente can indeed be neither passivized nor
pronominalised (or rather, it is not normally passivized or pronominalised):
*la tangente a été prise par lui
*la tangente, il l'a prise
Detailed observation of real language use, however, leaves the theoreticians
without a leg to stand on. As Moon (1998), Partington (1998), Burger (1998)
and Siepmann (2003) have shown, modification of standard citation forms
of phrasemes is almost the rule rather than the exception, and we find
numerous instances of passivization or relativization where we might not have
expected it. A few examples will suffice:
(19) jeter un pavé dans la mare → ce pavé dans la mare était lancé par
quelqu'un qui . . .
découvrir le pot aux roses → le pot aux roses a été découvert
cracher dans la soupe → la soupe dans laquelle peu osent cracher
avaler des couleuvres → en compensation des couleuvres qu'elle a dû
avaler (all examples from NF)
Our linguistic competence invariably allows us to modify previous
utterances, and this seems to occur quite commonly with phrasemes.
The argument from syntax is spurious for another reason, namely that, just
like phrasemes, collocations (in the traditional sense defined by Hausmann and
Mel'čuk) may also be syntactically or otherwise restricted. One such restricted
collocation is the French noun + verb combination situation [ensemble des
circonstances dans lesquelles une personne (un pays, une collectivité) se
trouve] + faire (cf. Siepmann 2003: 244–245). In this construction faire
invariably introduces a participial relative clause:
(20) la situation faite aux protestants
la situation faite aux immigrants
la situation faite aux prisonniers guinéens (all examples from NF)
A construction of the type on a fait une situation (ADJ) aux protestants
appears to run counter to the norms of French prose. Such examples could
be multiplied (e.g. la confiance qui l'habite; see Siepmann 2003); they show that
Endocentric (collocations)
Tiens!
Quand le chat n'est pas là, les souris dansent.
poivre et sel (= gris)
un panier percé
similar borderline cases, such as krummer Hund, where it must be assumed that
Hund has the langue-meaning person if it is to be considered the base of
the collocation.
It is also doubtful whether deletability can serve as a valid defining
criterion. Counterexamples are not far to seek; thus, it is quite common to find
the second part of an idiom, especially a proverb, deleted, as in speak of the
devil, . . . or quand le chat n'est pas là, . . . .
Feilke (1994, 1996, 2003) was the first to discern the root cause of such
classificatory problems with full conceptual clarity. Recognizing that linguistic
expressions can be idiomatic while at the same time being syntactically
and semantically well-formed, he advocates the theoretical decoupling of
idiomaticity and syntactic-semantic compositionality (Feilke 2003: 60).
According to him, it is the context and the participants placed in that context
which, via a figure-ground relationship, bestow meaning on such collocations
as the landscape rushes past or lu et approuvé. This is all the more convincing
since some words (e.g. Lage [situation]) have no distinctive meaning
components, so that it is impossible to attribute a summative meaning to
such expressions as sonnige Lage (sunny location).
6. Are collocations monosemous and monoreferential? Are there synonymic
collocations?
According to González-Rey (2002: 117), collocations are monoreferential
and do not allow synonymic variation:
L'unité ne peut se constituer comme variante, exprimée sous la forme de
périphrase, d'un mot déjà établi, ni admettre d'autres variations pour
le même référent, à moins d'en créer des sous-catégories. (González-Rey
2002: 117, my emphasis)
Although this statement is generally correct, here too it is relatively easy
to find a number of counterexamples, such as to stick to/keep to the speed limit;
Verbrechen begehen / verüben11; parvenir/arriver à un compromis; la pluie baisse /
baisse d'intensité / diminue / se calme, etc. It is often claimed that such synonyms
differ in some aspects of their meaning, especially according to style level, but
this line of argument clearly does not apply to the first two examples just cited.
It is also interesting to note that one collocation may take on several
meanings, a factor that has been neglected both in lexicological theory and in
dictionary making. A simple example of a polysemous collocation is English
avoid an accident:
(21) s.o. avoids an accident (1) → qqn évite un accident → j-m vermeidet
einen Unfall
The first sentence of the second group, for example, could be translated as
follows:
Saudi-Arabia is an example of a modern Islamic state.
Saudi-Arabien stellt ein Beispiel für einen modernen islamischen Staat dar.
The above considerations also hold true for noun-adjective combinations
such as heures creuses (literally 'hollow hours'). Heures creuses is a semi-technical term which occurs in at least four different fields: power generation,
rail transport, road transport and telecommunications:
(22) Les radiateurs à accumulation nécessitent la mise en oeuvre d'un
asservissement aux heures creuses EDF.
la SNCF renforce les trains aux heures creuses entre Paris et Combs-la-Ville
0,075 ou 0,105 (Bouygues) aux heures creuses (all examples from NF)
Such collocational polysemy is also apparent from the paradigmatic
relations entered by heures creuses. Thus, whereas in telephony the antonym
of heures creuses is heures pleines, in road transport it is heures de pointe.
Somewhat counterintuitively, collocational polysemy is particularly
common in special-purpose language. Thus, some French noun-(relational)
adjective combinations of the type roue intérieure can usually be disambiguated
in context only, since at least one of their meanings arises from the deletion
of an intermediate element: roue (à denture) intérieure (Forner 2000: 180ff.).
7. Conclusion: A redefinition of collocation for lexicographic purposes
It should have become clear that previous definitions of collocations have
relied too heavily on introspection rather than corpus evidence. This has
prevented linguists from realizing that what has traditionally been known as
collocation or phraseology is only one aspect of idiomatic language use,
and that the boundaries between the two are hazy and uncertain. The only way
out of this dilemma is a rigorously corpus-driven approach to the study
of lexis and grammar, and this is the approach that has been taken in the
present study.
Our discussion suggests that even the most sophisticated structuralist
definitions cannot adequately capture the phenomenon of habitual
co-occurrences, and that the frequency-based approach to collocation cannot
account for the collocation of semantic features. We would therefore be
justified in loosening the definition of collocation to a considerable extent;
collocation could be defined pragmatically with reference to the notions of
Gebrauchsnorm, or usage norm (Steyer 2000: 108), reflected in concepts
(cf. Bolinger 1965: 570–571). Both operations have been shown to be governed
by collocation, thus providing further evidence for Hoey's claim that
collocation is indeed one of the central mechanisms involved in meaning
creation (see introduction).
It thus appears that both structurally simple (i.e. [bound] morphemes,
lexemes) and structurally complex units (i.e. collocations/colligational
patterns) are linguistic signs. If the dictionary is meant to be a record of
such signs, the task of the lexicographer is to gather together evidence of both
types of sign. So far it has been lexemes, non-compositional idioms and
morphemes that have received the bulk of lexicographic attention, but the
future clearly belongs to collocation and colligation in the widest possible
sense.
In the second part of this article, I shall discuss some of the implications
that such a change in perspective (not to say paradigm shift) has for the making
of encoding dictionaries.
Notes
1
For those readers who are not yet familiar with the relatively recent notion of
colligation (a term originally coined by Firth), here is how Hoey (1998) defines
colligation:
- the grammatical company a word keeps (or avoids keeping) either within its own
group or at a higher rank.
- the grammatical functions that the word's group prefers (or avoids).
- the place in a sequence that a word prefers (or avoids).
2
Even a superficial glance at lexical functions shows that they disregard contextual
relationships. Thus, the adverb drop-dead may intensify the adjective beautiful with
reference to women, but not with reference to buildings.
3
On an alternative construal, the German sequence might be viewed as a
colligational pattern or schematic construction (Croft and Cruse 2003): eine ADJ
Straßenlage haben, but this seems problematic to the extent that very few adjectives can
fill the slot.
4
I use the term concept more or less in its standard terminological sense to
refer to a unit of thought constituted through abstraction on the basis of properties
common to a set of objects or phenomena.
5
Clearly, then, the notion of literal meaning turns out to be a linguistic abstraction
(see also the introduction to this article).
6
A point of criticism that might be raised is that we are here dealing with an instance
of regular polysemy. The meaning of drink could be glossed as referring to an occasion
where people have a drink, and the same reasoning would apply to cases such as quiet
dinner/breakfast/lunch/tea. I would argue that such apparent regularities are in fact
more or less accidental; as Blank (2001) and Grossmann and Tutin (forthcoming) have
shown, nouns belonging to the same semantic class may share some of their collocations
or colligations, but not all of them (e.g. nach der Schule gingen die Schüler nach Hause vs
*nach dem Parlament gingen die Abgeordneten nach Hause).
7
Or a three-item construction in the sense of Croft and Cruse (2003).
8
Blank is not unaware of the fact that verbs may also be associated with particular
circumstantial complements (Zirkumstanten) which may themselves carry selectional
restrictions, but he considers these two levels to be of lesser importance. As our analysis
has shown, however, it is often the particular collocation that determines the verb
pattern (l'autoroute file quelque part). Put another way, valency and collocation appear
to shade off into each other; speakers have semantically and syntactically prepatterned
collocations or constructions (Fillmore) at their disposal.
9
Interestingly, the distinction we have just established between selectional and
collocational restrictions has a parallel in theories of formal grammar, such as
Head-driven Phrase Structure Grammar, where selection refers to the process whereby
a head selects its complements and an adjunct selects its head. Using the example of the
German verb fackeln, whose linguistic environment invariably comprises a durational
modifier (most commonly nicht lange), Sailer and Richter (2002) show that the
durational modifier cannot be interpreted as a complement of the head verb, but rather
as an adjunct. Therefore, they argue, the relationship between the head verb and the
relational modifier is one of collocation rather than selection.
10
An alternative, cognitive-linguistic explanation might take the conceptual background as its starting point. Since paintings, carvings, etc. are often perceived
as aesthetically pleasing, the adjective beautiful readily springs to mind to describe
them. Collocations incorporating the adverb beautifully could then be regarded as
being derived from the original collocation (beautiful carving → beautifully carved).
The problem with this explanation is that such derivation is not always possible.
11
Dieter Wirth, personal communication.
References
A. Dictionaries
Ilgenfritz, P. et al. 1989. Langenscheidts Kontextwörterbuch Französisch-Deutsch.
Ein neues Wörterbuch zum Schreiben, Lernen, Formulieren. Munich: Langenscheidt.
(LKFD)
B. Other literature
Benson, M. 1986. Lexicographic Description of English. Amsterdam: Benjamins.
Biber, D. et al. 1999. Longman Grammar of Spoken and Written English. London:
Longman.
Blank, A. 2001. Einführung in die lexikalische Semantik für Romanisten. Tübingen:
Niemeyer.
Bolinger, D. 1965. The atomization of meaning. Language 41: 555–573.
Brauße, U. 1992. Funktionswörter im Wörterbuch in U. Brauße and D. Viehweger,
Lexikontheorie und Wörterbuch: Wege der Verbindung von lexikologischer Forschung
und lexikographischer Praxis. Tübingen: Niemeyer, 1–88.
Burger, H. 1998. Phraseologie: Eine Einführung am Beispiel des Deutschen. Berlin:
Schmidt.
Chandler, S. 1993. Are Rules and Modules Really Necessary for Explaining Language?
Journal of Psycholinguistic Research 22: 593–606.
Croft, W. and Cruse, A. D. 2003. Cognitive Linguistics. Cambridge: Cambridge
University Press.
Cruse, A. D. 1986. Lexical Semantics. Cambridge: Cambridge University Press.
Sinclair, J. M. 2004. Trust the Text. Language, Corpus and Discourse. New York/
London: Routledge.
Sinclair, J. M. 1996/2004. The Search for Units of Meaning in J. M. Sinclair, Trust the
Text. Language, Corpus and Discourse. New York/London: Routledge, 24–48.
Sinclair, J. M. 1998/2004. The Lexical Item in J. M. Sinclair, Trust the Text. Language,
Corpus and Discourse. New York/London: Routledge, 131–148.
Skousen, R. 1989. Analogical Modelling of Language. Dordrecht: Kluwer.
Steyer, K. 2000. Usuelle Wortverbindungen des Deutschen. Linguistisches Konzept
und lexikografische Möglichkeiten. Deutsche Sprache 2: 101–125.
Steyer, K. 2003. Kookkurrenz. Korpusmethodik, linguistisches Modell, lexikografische
Perspektiven in K. Steyer (ed.), Wortverbindungen – mehr oder weniger fest.
(Jahrbuch des Instituts für deutsche Sprache.) Berlin: De Gruyter, 87–116.