Jonathan Levin
Minerva@hisown.com
E.g. Nominative (subject) and Accusative (direct object) for neuter, as well as Dative (indirect object) and Ablative (means) for feminine, and many others.

It is important to emphasize that while the language-dependent modules are currently implemented in Latin (1,2) and English/French (4), we surmise there is no insurmountable challenge in adapting them to other languages as well. In this sense, our claim can be expanded to say that the semantic analysis and disambiguation is so language-agnostic that the system should ideally be able to deduce senses and context irrespective of the languages involved.

3.1 Morphological Analysis

Minerva performs morphological analysis by following an XML tree that describes the Latin syntax. The tree is organized according to endings, and its hierarchical structure leads directly to the possible meanings of each ending. Rather than follow a greedy approach, all possible endings are considered (including partial ones), and a base form candidate is proposed from each. A dictionary lookup ensues, and if the candidate is found, it is added to the possible meanings of the word.

Additionally, Minerva maintains separate lists of exceptional verb and noun forms, to handle the numerous (yet finite) cases wherein words are declined in non-standard or alternate forms. This stage is fully debugged and tested, and is thus on par with "Perseus", and even exceeds it, as it correctly recognizes many forms the former does not.

The output of this phase is an array of all the possible meanings of a given word. Meanings are disambiguated at the grammatical level only, e.g.

rules. The Grammatical Analysis module thus implements a classical rule-based system.

Grammar rules are defined in a simple, yet effective language that makes use of conditionals and word attributes to form pattern matching expressions. Ambiguities in word senses are handled by reducing multiple senses to a simple character representation, upon which a regular expression may be applied and tested.

Rules are defined in one of several classes (mandatory, common, unusual), and evaluated in order. Additionally, when a given meaning of a word is discarded due to the application of a rule, the system keeps track of its decision.

A "mandatory" rule class is one wherein the system will try to enforce the rule, eliminating any possible senses of the word that fail to match it. In a way, it mimics human expectation. Much as in English one would expect certain parts of speech to follow others (say, nouns/adjectives after determiners), so too does Minerva anticipate certain declensions to follow prepositions, and such. If a mismatched mandatory rule results in the elimination of the last possible sense of a given word, Minerva rejects the sentence as ungrammatical.

A "common" rule class is one wherein the system expects a common, yet not necessarily strict pattern to occur in a sentence. These rules are "nice-to-have", yet can be violated.
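To make the mandatory and common rule classes concrete, the following is a minimal Python sketch, not Minerva's actual rule language or implementation. It assumes each grammatical sense is reduced to a single hypothetical character code (P = preposition governing the accusative, N = nominative, A = accusative, V = verb), that a rule is a pair of regular expressions (one over the preceding word's codes, one over each candidate sense of the current word), and that a violated common rule simply leaves the word untouched. The names `Rule`, `apply_rules`, `Ungrammatical`, and `RULES` are illustrative only.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    rule_class: str  # "mandatory" or "common" (assumed class names)
    trigger: str     # regex over the previous word's combined sense codes
    allowed: str     # regex that a single sense code of the current word must match


class Ungrammatical(Exception):
    """Raised when a mandatory rule eliminates the last sense of a word."""


def apply_rules(words, rules):
    """Prune word senses; words is a list of (token, set of sense codes).

    Returns the pruned list and a log of (token, dropped_sense, rule_name).
    """
    words = [(tok, set(senses)) for tok, senses in words]
    log = []
    order = {"mandatory": 0, "common": 1}
    # Stable sort: mandatory rules first, original order kept within a class.
    for rule in sorted(rules, key=lambda r: order[r.rule_class]):
        for i in range(1, len(words)):
            # Reduce the previous word's senses to a character string.
            prev_repr = "".join(sorted(words[i - 1][1]))
            # Fires only if the whole string matches, e.g. an unambiguous "P".
            if not re.fullmatch(rule.trigger, prev_repr):
                continue
            tok, senses = words[i]
            keep = {s for s in senses if re.fullmatch(rule.allowed, s)}
            dropped = senses - keep
            if not dropped:
                continue
            if not keep:
                if rule.rule_class == "mandatory":
                    raise Ungrammatical(f"{rule.name}: no sense of {tok!r} survives")
                continue  # a common rule may be violated; leave the word as is
            words[i] = (tok, keep)
            # Keep track of every discarded sense and the rule responsible.
            log.extend((tok, s, rule.name) for s in sorted(dropped))
    return words, log


# Hypothetical rules: a preposition expects an accusative; an accusative is
# commonly (but not necessarily) followed by a verb.
RULES = [
    Rule("prep+acc", "mandatory", "P", "A"),
    Rule("acc-then-verb", "common", "A", "V"),
]
```

Under these assumptions, calling `apply_rules([("ad", {"P"}), ("bella", {"N", "A"})], RULES)` keeps only the accusative reading of "bella" and logs the discarded nominative, whereas a word with only a nominative reading after "ad" raises `Ungrammatical`. The "unusual" class is omitted from the sketch.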