):
Word Length Studies
and Related Issues.
In print.
organization, the level of its morphemic units. So far there have been only sporadic
word/morpheme studies, covering just three languages: German (Gerlach, 1982),
Turkish (Hřebíček, 1995) and Russian (Polikarpov, 2000, 2000a). Meanwhile, it
is natural to expect that the regularities of units at a basic level of language
organization determine, to a significant extent, the regularities of all higher
levels. A linguistic theory (including a quantitatively oriented theory of the
length of syntactic and suprasyntactic units) therefore cannot, in principle, be
built without a solid basis of explained regularities for the most elementary,
ultimate sign units of a language, its morphemes. Two things are needed. First,
the theoretical foundations of a possible ontological mechanism that gives rise
to length regularities of language units must be considered. Second, the work
should begin at the elementary, morphemic level of the language system:
extensive and multi-aspectually characterized data on the morphemic structures
of words in various languages must be gathered and analyzed.
This paper is a step towards a theory of the word/morpheme relationship. It
widens the empirical basis for testing a model of word/morpheme length
regularities, using Russian language data.
same direction in the history of any word (as well as of any other linguistic
sign): (1) by the inescapable gradual drift of a word’s meaning over time, in
each speech act, mainly towards greater abstractness and subjectivity; (2) by
the predominant tendency of new word meanings to be more abstract than their
parent meanings. Correspondingly, this is the direction in which the integral
semantics of any word drifts over time.
According to the principle that the lexical and categorial semantics of words
must correspond closely (Polikarpov, 1998), lexical semantics that is becoming
more abstract “seeks” a correspondingly more abstract categorial (part-of-speech)
form that suits it better, and “finds” it in acts of word-formation: the
production of derivatives whose lexical and categorial semantics are in greater
concord than those of their word-bases at that moment.
Derivatives of more grammatical word categories are usually produced, at each
next step of word-formation, by adding relatively more abstract, more
grammatical suffixes to the corresponding word-bases. As time goes on, a
formerly derived “new word” becomes semantically more abstract than it was
initially. It therefore loses the semantic-grammatical concord it had initially
obtained and tends to give birth to a new derivative that is categorially still
more abstract than the word itself now is. Repeated acts of word-formation
(gradually declining in intensity over time) eventually lead, in some nests, to
the formation of purely grammatical (function) words.
• suffixal units which are more distant to their root (are at more remote
position to it) should be proportionally more grammatical, more frequent,
and, finally, shorter than less distant ones;
• prefixal units which are more distant to their root, on the contrary, should
be proportionally less grammatical, less frequent, and, finally, longer than
less distant ones;
So the overall picture of how morpheme length changes as morphemes are added
step by step to word-bases is not homogeneous. It consists of at least three
components, which should be distinguished. Prefixes and suffixes follow two
different, even opposite tendencies in how their functional and structural
properties depend on their positional number. Roots should follow yet another,
specific law: their length changes as a function of the growth of the affixal
chains within the word-forms that contain a given root.
In sum, our model predicts a negative correlation between the length of suffixes
and their growing positional number within word-forms, and a positive
correlation between the length of prefixes and their growing positional number.
Correspondingly, we predict a positive correlation between the length of
prefixes and their overall quantity within word-forms, and a negative one
between the length of suffixes and their overall quantity. So when we consider
how average affix length depends on the overall number of affixes (suffixes and
prefixes together), we apparently do not obtain a homogeneous dependence. This
fact has not yet been mentioned in any Menzerathian study, because of the overly
abstract approach to
Explaining Basic Menzerathian Regularity 5
the problem in many of them, which does not take into account the real basic
mechanisms of the word-formational process.
The specific role and specific dynamics of roots within lengthening word-forms
have also gone unnoticed so far. The positional numbers of prefixes and suffixes
are therefore of primary interest to anyone trying to model the process.
Positional numbers are oriented to the root as the center of the
word-formational process and the zero point of the static word-formational
structure. Positional features constitute a basic system of coordinates for the
object under study and should be taken into account from the very beginning. The
overall number of morphemes in a word (usually taken as the main determinant of
“part/whole” length relations in a word) is no more than a combined (mixed)
parameter. The exact form of this parameter’s influence on average morpheme
length still has to be carefully considered and derived analytically from three
more fundamental dependencies: the positional dependencies that act separately
on the length of suffixes and of prefixes, and a Menzerathian-like dependence
for the length of roots (which depends on the maximum number of those
positions).
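The root-centred coordinate system described above can be sketched in a few lines of Python. This is only an illustration: the function name `positional_lengths` is hypothetical, the segmentation of the example word is an assumption made for demonstration, and length is counted in letters.

```python
def positional_lengths(prefixes, root, suffixes):
    """Map each positional number (root = 0) to the morpheme's letter length.

    Prefixes are listed left to right, so the outermost prefix receives the
    most negative number; suffixes are numbered 1, 2, ... away from the root.
    """
    table = {}
    n_prefixes = len(prefixes)
    for i, prefix in enumerate(prefixes):
        table[i - n_prefixes] = len(prefix)
    table[0] = len(root)
    for j, suffix in enumerate(suffixes, start=1):
        table[j] = len(suffix)
    return table


# Illustrative segmentation (an assumption, not data from the dictionary):
# пере- (prefix) + пис (root) + -ыва- (suffix) + -ть (suffix)
example = positional_lengths(["пере"], "пис", ["ыва", "ть"])
print(example)
```

Keying the result by positional number rather than by running index keeps prefixes, root and suffixes apart, which is what allows the three dependencies to be examined separately.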
If the growth in the number of prefixes in a word-form is correlated with the
growth in the number of suffixes, and if the corresponding changes in the
average length of suffixes and of prefixes (as their chains grow longer) are
also correlated, a better-founded conclusion becomes possible about the actually
more sophisticated dependence of affix length on the overall number of affixes
and, correspondingly, on the overall number of morphemes in a word-form. Even
so, this dependence still has to be integrated analytically with the
corresponding dependence of root length on the growing length of affix chains
(and of morpheme chains as a whole) within some complex equation of the
Menzerathian type.
All in all, the so-called “Menzerath’s Law” for word/morpheme relations is a
mixed result of three different, more elementary laws (affecting prefix, root,
and suffix length differently). These should be considered one by one before any
eventual decision about integrating them into a complex law.
Even more fundamental for understanding the phenomenon is the study of the
length/age relationship, which follows from the Model of sign’s life cycle. Data
on these fundamental features are presented below.
3 Source of Data
This paper analyses data on the morphemic structures of root and affixally
derived Russian words (50,747 different words)
6 A.A. Polikarpov
y = a · ln(x + c) + b, (1)
where
y – average length of affixes being in some numbered position in their word-
forms;
x – positional number of affixes;
a – coefficient of proportionality;
b – average length of affixes in the initial (-3rd) position within word-forms
present in the analyzed dictionary;
c – coefficient for converting of a negative-positive scale into a pure positive
one (c is here maximum ordinal number of prefixes plus one in words of any
given dictionary).
a = 0,3953
b = 2,5473
c = 4
Parameters of the equation have been calculated on the basis of the data pre-
sented in the summarizing Table 1 – for a detailed presentation of the data see
Table 4 in the appendix (pp. 19ff.).
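As a minimal sketch, equation (1) can be evaluated directly. The function name `predicted_affix_length` is hypothetical, and the parameter values are copied exactly as printed above (with decimal commas converted to points, and the sign of a taken at face value); nothing here is fitted or re-estimated.

```python
import math


def predicted_affix_length(x, a, b, c):
    """Evaluate y = a * ln(x + c) + b for an affix at positional number x."""
    return a * math.log(x + c) + b


# Parameter values as printed in the text (assumed to be transcribed correctly).
A, B, C = 0.3953, 2.5473, 4

# Positions -3..-1 are prefixal and 1..7 are suffixal; 0 is the root, which
# equation (1) does not describe, so it is skipped.
positions = [p for p in range(-3, 8) if p != 0]
predictions = {p: round(predicted_affix_length(p, A, B, C), 2) for p in positions}

for position, value in predictions.items():
    print(position, value)
```

At x = -3 the argument of the logarithm is x + c = 1, so the logarithmic term vanishes and y equals b, which matches the definition of b as the average length in the initial (-3rd) position.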
The length of morphemes is measured by the number of letters in them. Owing to
specific features of the Russian alphabet, there is a very close (almost
one-to-one) correspondence between Russian letters and phonemes, so either kind
of unit can be used without noticeable difference.
Table 1: Dependence of Lengths of Morphemes of Different Type
Suffixes in words: 0 1 2 3 4 5 6 7 total
[Figure: average letter length (vertical axis, 0 to 3) against position of morphemes in words (horizontal axis, -3 to 7).]
[Figure: average letter length (vertical axis, 0 to 5) against position of morphemes in words (horizontal axis, -3 to 7); series labelled 0, 1, 2, 3, 4, 5, 6, 7 and total.]
provides additional opportunities for specifying (if necessary) the denoted
feature through the possible additional use of its attributes and predicates
and of object (circumstantial) determinants around it.
Suppose that in the majority of cases the word of the initial, zero degree of
derivation within a word-formational nest is a noun (usually referring to a
physical object and having no affixal “clothes”, i.e. being a pure root). Then
the first suffixation step (suffixal position 1, immediately after the root) can
move mainly in the direction of noticeably greater categorial abstractness,
using a relatively shorter suffix. The second suffixation step can go in either
of two directions: (1) towards greater or (2) towards lower categorial
abstractness. In the majority of cases it goes in the second, categorially
concretizing direction and, correspondingly, the suffix used at this step is
longer. The reason is the strong negative correlation between the quality of
contiguous derivation steps within any nest. This substantivizing “revenge”,
however, prepares additional abstractivizing opportunities for the words that
were substantivized at the previous step. So the third step, by the same
negative correlation between the directions of quality change at contiguous
derivation steps, should again go mainly towards greater categorial abstractness
of derivatives relative to their word-bases and, correspondingly, towards
suffixes that are shorter on average. This in turn gives the next step
additional opportunities for substantivization of derivatives (as compared to
their word-bases), for relative categorial and semantic specification of the
suffixes used, and, correspondingly, for relative growth of their length. The
fourth step repeats the logic of the second one, and so on.
All in all, the picture should be one of a general shortening of average suffix
(and prefix) length along the positional scale of affixes. In addition, the
process is modulated by oscillations, rhythmic (regularly repeating) plus and
minus deviations, at least within the suffixal zone. The prefixal zone seems to
be influenced only by the general tendency, without oscillations, but this last
statement still needs additional theoretical and empirical support.
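One possible way to write down the alternation described above is given here purely as an illustrative formalization, not as a formula from the source. The symbols are assumptions: \bar{l}(k) is the average length of suffixes at suffixal position k, T(k) is a decreasing general trend, and d_k is the non-negative amplitude of the deviation at step k.

```latex
\bar{l}(k) = T(k) + (-1)^{k}\, d_k ,
\qquad k = 1, 2, 3, \ldots, \qquad d_k \ge 0, \qquad T(k+1) \le T(k)
```

With this sign convention the odd steps (k = 1, 3, ...) fall below the trend, corresponding to the shorter suffixes of the abstractivizing steps, and the even steps (k = 2, 4, ...) rise above it, corresponding to the longer suffixes of the substantivizing steps.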
The exact form of this dependence needs further study. For now it can at least
be said that the dependence is less homogeneous for roots than for any other
kind of morpheme (bearing in mind that chains of suffixes are affected by
oscillations).
Qmrf age 1 age 2 age 3 age 4 age 5 age 6 age 7 all ages
1 3,28 4,35 4,87 4,96 4,75 4,86 4,34 4,58
2 2,15 2,68 2,92 3,08 3,11 3,18 3,50 3,01
3 1,88 2,25 2,39 2,45 2,58 2,66 2,74 2,53
4 2,02 2,15 2,21 2,29 2,34 2,43 2,27
5 1,84 2,04 2,07 2,17 2,22 2,23 2,17
6 1,58 1,94 1,98 2,08 2,10 2,19 2,10
7 1,87 1,88 2,02 2,08 2,06 2,04
8 1,84 1,88 1,98 2,01 1,97 1,98
9 2,11 2,00 1,83 1,99 2,04 2,01
10 2,10 1,95 2,00 2,00
Total 2,52 2,36 2,19 2,30 2,30 2,33 2,37 2,31
erties for taking into account regular age modifications of the considered
dependencies.
7 Conclusion
The phenomenon revealed (predicted and proven) in our study, a negative
(possibly logarithmic) dependence of average affix length on the unified ordinal
number of a morpheme’s position within a word, is in our opinion a result of the
functional determination of the structural features of words and morphemes.
For a deeper understanding of the observed features it is also necessary to take
into account the age features of words and morphemes.
The oscillation phenomenon is of primary importance for further studies of
Menzerathian regularities. As was shown, the general tendency towards relatively
greater categorial abstractness of derivatives at each next step of a
word-formational chain is modified by oscillations, which result from the
interplay of the main tendency (production of a new word of a relatively more
abstract category, for instance in the course of derivational movement from
nouns to adjectives: friend –
[Figure: average letter length of morphemes (vertical axis, 1 to 4) by age period (horizontal axis, 1 to 7); separate series for prefixes, suffixes, affixes and roots.]
8 References
Fenk, A.; Fenk-Oczlon, G. (1993): “Menzerath’s Law and the Constant Flow of
Linguistic Information.” In: R. Köhler; B.B. Rieger (eds.), Contributions to
Quantitative Linguistics. Dordrecht (NL). (11-32).
Polikarpov, A.A. (1993): “On the Model of Word Life Cycle.” In: R. Köhler;
B.B. Rieger (eds.), Contributions to Quantitative Linguistics. Dordrecht
(NL). (53-66).
Appendix
Table 4 (cont.)
M-Pos.: PREF 2
L   Word length (in number of suffixes in them)
           0             1             2             3
        m.s    l.s    m.s    l.s    m.s    l.s    m.s    l.s
0      2697          5241         21502         14761
1        44     44     35     35    221    221    153    153
2        53    106     71    142    475    950    345    690
3        22     66     57    171    485   1455    229    687
4         4     16      4     16     72    288     38    152
Σ       123    232    167    364   1253   2914    765   1682
x̄        1,89          2,18          2,33          2,20

           4             5             6             7
        m.s    l.s    m.s    l.s    m.s    l.s    m.s    l.s
0      3430           499            59             6
1        53     53     10     10      5      5      2      2
2        77    154     11     22      4      8      2      4
3        87    261     11     33      2      6      0
4        16     64      4     16      0             0
Σ       233    532     36     81     11     19      4      6
x̄        2,28          2,25          1,73          1,50
M-Pos.: PREF 1
L   Word length (in number of suffixes in them)
           0             1             2             3
        m.s    l.s    m.s    l.s    m.s    l.s    m.s    l.s
0      1463          3115          8122          5085
1       265    265    481    481   3041   3041   2421   2421
2       624   1248   1125   2250   7200  14400   5215  10430
3       372   1116    630   1890   4208  12624   2699   8097
4        95    380     55    220    179    716    101    404
Σ      1356   3009   2291   4841  14628  30781  10436  21352
x̄        2,22          2,11          2,10          2,05

           4             5             6             7
        m.s    l.s    m.s    l.s    m.s    l.s    m.s    l.s
0      1455           220            16
1       613    613     96     96     20     20      7      7
2      1072   2144    151    302     20     40      1      2
3       500   1500     60    180      9     27      1      3
4        20     80      8     32      5     20      1      4
Σ      2205   4337    315    610     54    107     10     16
x̄        1,97          1,94          1,98          1,60
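The arithmetic behind the x̄ row can be checked with a short sketch. It rests on an interpretation that the table itself does not spell out and that is assumed here: m.s is read as the number of morphemes of length L, and l.s as the total number of letters they contribute (L × m.s). The figures are those of the PREF 1 block above, in the column for words with 0 suffixes.

```python
# (m.s, l.s) pairs for L = 1..4, copied from the PREF 1 block, 0-suffix column.
rows = [(265, 265), (624, 1248), (372, 1116), (95, 380)]

# Under the assumed reading, each l.s value should equal L * m.s.
for length, (morphemes, letters) in enumerate(rows, start=1):
    assert letters == length * morphemes

total_morphemes = sum(morphemes for morphemes, _ in rows)
total_letters = sum(letters for _, letters in rows)
mean_length = total_letters / total_morphemes

print(total_morphemes, total_letters, round(mean_length, 2))
```

The totals agree with the sum row of the table (1356 and 3009), and the quotient rounds to the tabulated mean of 2,22.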