
Phonix

A Phonological Transformation Language

Jesse Bangs
Copyright © 2009-2010 Jesse Bangs.
This manual and Phonix itself are licensed under the BSD license.
Phonix version 0.8.1

Table of Contents

1 Quick Start Guide

2 How To Use Phonix

3 The Phonix Language
3.1 Features
3.1.1 Basic Features
3.1.2 Unary Features
3.1.3 Binary Features
3.1.4 Scalar Features
3.1.5 Node Features
3.2 Feature Matrices
3.2.1 Feature Macros
3.3 Symbols
3.3.1 Basic symbols
3.3.2 Diacritic Symbols
3.3.3 Symbol Macros
3.4 Rules
3.4.1 Parts of a rule
3.4.2 Basic transformations
3.4.3 Inserting and deleting
3.4.4 Rule variables
3.4.5 Node feature values
3.4.6 Assimilation and gemination
3.4.7 Scalar feature value operators
3.4.7.1 Comparison operators
3.4.7.2 Addition and subtraction operators
3.4.8 Optional and repeated segments
3.4.8.1 Differences from regular expressions
3.4.9 Rule parameters
3.4.9.1 direction
3.4.9.2 filter
3.4.9.3 persist
3.4.9.4 applicationRate
3.5 Syllables
3.5.1 Syllable templates
3.5.2 Syllable rules
3.5.3 Syllable parameters
3.5.3.1 onsetRequired and codaRequired
3.5.3.2 nucleusPreference
3.5.3.3 persist
3.5.4 Using syllable elements in rules
3.6 Imports
3.6.1 Importing files
3.6.2 std.features
3.6.3 std.symbols
3.6.4 std.symbols.diacritics
3.6.5 std.symbols.ipa
3.6.6 std.symbols.ipa.diacritics
3.7 Comments
3.8 Strings

4 Examples
4.1 Romanian
4.1.1 ‘romanian.phonix’
4.1.2 Input and output

Appendix A License

1 Quick Start Guide


You are a busy person. You have sound changes to apply and phonologies to crunch and
you do not have time to read a whole manual.
This section is for you.
Writing your first phonology in Phonix is very easy: all that you need to do is open up
a text file and type the following:
import std.features
import std.symbols
The first line above automatically adds the standard feature set to your phonology.
The second line adds the standard symbol set. Together, these provide you with a set
of contrastive features and phonetic symbols that allow you to express whatever phonological
processes you desire. You can find out exactly what these imports contain by reading
Section 3.6.2 [std.features], page 22 and Section 3.6.3 [std.symbols], page 23.
But you’ve only started. Your phonology contains rules and processes that convert
underlying forms to surface forms, and the first and most important rule in your phonology
is I-Nasal. This rule is expressed in English as: "every front vowel must be /i/ if the next
segment is /n/". (This is, perhaps, not a very realistic rule, but we’re doing this for the
sake of simplicity.) In Phonix, you write this as follows:
rule i-nasal
[-cons +fr] => i / _ n
A well-educated phonologist such as yourself will immediately recognize the notation
being used here. The string [-cons +fr] is a feature matrix that specifies non-consonantal
front segments, a.k.a. front vowels, and the notation => i may be read as "becomes /i/".
The portion following the / is the context, and the context for this rule is _ n, which we
read as "when followed by /n/".
If you didn’t understand that, you should read Section 3.4 [Rules], page 9, which has
a much fuller explanation. You may also look at Section 4.1 [Romanian], page 35 for
examples of other Phonix rules.
Save your phonology in a file called ‘my.phonix’. Let’s assume that you already have a
file ‘lexicon.txt’ with a list of underlying forms for your transformations, with one word
on each line. Conveniently, this is exactly the format that Phonix uses. You can then apply
your rules to your lexicon by typing the following on the command line:
phonix my.phonix -i lexicon.txt
This will print the output of your sound changes to the screen, one word on each line.
Should you wish to have Phonix save them to a file, simply provide the filename with ‘-o’:
phonix my.phonix -i lexicon.txt -o output.txt
This barely scratches the surface of what Phonix can do, but it should give you an idea
of how Phonix works and what you can do with it. There are many, many more features in
Phonix which the rest of this manual will describe.

2 How To Use Phonix


Phonix refers to the Phonix programming language for modeling phonological processes,
and informally to the program ‘phonix’ that interprets Phonix language files and applies
the rules contained therein to a lexicon.
In order to use Phonix, you need two things:
1. A Phonix file that contains the feature, symbol, and rule declarations of the phonology
you wish to model. Phonix files conventionally have the extension ‘.phonix’ or ‘.phx’.
2. The lexicon for your phonology—the underlying or ancestral forms before any rules or
sound-changes are applied to them—written with the symbols defined in your Phonix
file. Phonix can read your lexicon from standard input or from a file.
The Phonix language is described in detail in Chapter 3 [The Phonix Language], page 4.
This section describes how to invoke ‘phonix’ to run your phonology files.
A basic invocation of Phonix looks like this:
phonix phonix-file arguments
The only required argument is the phonix-file, which is a file containing a phonology
definition. The following arguments are optional.
‘-i filename’
‘--in filename’
The lexicon file from which underlying/ancestral forms are read to have the
Phonix rules applied to them. If this argument is not given, Phonix reads from
standard input.
‘-o filename’
‘--out filename’
The file to which Phonix writes the output forms after the rules have been
applied. If this argument is not given, Phonix writes to standard output.
‘-q’
‘--quiet’
When this option is present, Phonix doesn’t print any errors except for fatal
error messages.
‘-d’
‘--debug’
When this option is given, Phonix writes a variety of diagnostic messages to
the standard error stream to aid in debugging your phonology.
‘-v’
‘--verbose’
When this option is given, Phonix logs verbosely to the standard error stream.
This creates all of the output generated by ‘-d’, plus more.
‘-w’
‘--warn-fatal’
When this option is given, certain conditions that normally produce a warning
will cause Phonix to immediately exit.
Phonix requires some version of the Common Language Runtime, commonly known on
Windows as .NET. Through the Mono project, this runtime is widely available on almost all
operating systems. However, the specific way that you invoke ‘phonix’ will depend slightly
on your setup.
1. If you are using any version of Windows and ‘phonix.exe’ is in the current directory
or any directory in %PATH%, then you can simply call ‘phonix’ or ‘phonix.exe’. Phonix
requires .NET Framework 2.0 or later, which is standard on all recent versions of
Windows.
2. If you are using Linux, you can invoke Phonix with ./phonix.exe, provided that you
have the Mono runtime installed. If you installed using the .deb package, then the
prerequisites have already been installed for you. If you are building from source, there
is a make install target that will install a ‘phonix’ script without the .exe extension
in /usr/local/bin.
3. If you are using Mac OS X or a POSIX system other than Linux, you can invoke Phonix
with mono phonix.exe. You must have the Mono runtime installed, and you need to
explicitly call mono in order to run Phonix.
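If you drive Phonix from a script, the options described in this chapter can be assembled programmatically. The following Python sketch only builds the argument list. The helper name `build_phonix_command` and its parameters are hypothetical and not part of Phonix; the flags are the ones documented above, and the `launcher` tuple stands for whichever invocation style your platform needs.

```python
def build_phonix_command(phonix_file, lexicon=None, output=None,
                         quiet=False, debug=False, verbose=False,
                         warn_fatal=False, launcher=("phonix",)):
    """Return an argument list for running Phonix.

    `launcher` is the part that depends on your setup, for example
    ("phonix",), ("./phonix.exe",) or ("mono", "phonix.exe").
    """
    command = list(launcher) + [phonix_file]
    if lexicon is not None:
        command += ["-i", lexicon]      # otherwise Phonix reads standard input
    if output is not None:
        command += ["-o", output]       # otherwise Phonix writes standard output
    if quiet:
        command.append("-q")
    if debug:
        command.append("-d")
    if verbose:
        command.append("-v")
    if warn_fatal:
        command.append("-w")
    return command


print(build_phonix_command("my.phonix", "lexicon.txt", "output.txt"))
print(build_phonix_command("my.phonix", debug=True,
                           launcher=("mono", "phonix.exe")))
```

The resulting list can be handed to a process launcher such as Python's `subprocess.run`; the sketch deliberately stops short of running anything.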

3 The Phonix Language


This section covers the syntax and meaning of all aspects of a Phonix file in detail. The
Phonix language is simple but powerful, and it should be very easy for people who are
familiar with basic phonological concepts and rule notation. If you are not familiar with
the basics of generative phonology, you may find this section tough going.
The Phonix language is declarative: you use Phonix to create a description of your
phonology, and phonix reads your description and applies the rules contained therein to
the lexicon. There are three fundamental elements to a Phonix file: features, symbols,
and rules. Each of these is built up from the previous parts: features are the most basic
element, symbols are composed of features, and rules are composed from symbols and
feature matrices.

3.1 Features
Phonological features are the contrastive elements that compose the phonemes in a
phonology. Features must be declared before they can be used in symbols or rules.

3.1.1 Basic Features


The simplest feature declaration looks like this:
feature feature-name
The name of a feature can generally be anything, so long as it doesn’t contain spaces or
characters with a special meaning in Phonix. For detailed rules, see Section 3.8 [Strings],
page 33. For example:
feature voice
The preceding example creates voice as a binary feature, meaning that it may have the
values +voice and -voice. (There is also a third value, *voice, which is discussed
shortly.) Phonix supports three types of basic features: unary, binary, and scalar, as well
as node features, which group other features. By default
all features are binary. To declare a feature of a different type, you must use the extended
syntax:
feature feature-name (parameter-name=parameter-value)
Note that the parentheses around the parameter list are part of the syntax and cannot
be omitted. Currently the only parameters allowed on feature declarations are type, which
may have the values unary, binary, scalar, or node, and children, which is discussed
under Section 3.1.5 [Node Features], page 5.
These feature types are discussed in detail in the following sections.

3.1.2 Unary Features


The following example declares a unary feature ‘un’:
feature un (type=unary)
Unary features are either present or absent. A unary feature which is present is written
with the feature name and no other decoration: un. A unary feature which is absent is
preceded by ‘*’: *un.
Unary features are also called "privative features" in linguistic literature.

3.1.3 Binary Features


The following example declares a binary feature ‘bn’:
feature bn (type=binary)
Binary features have both positive and negative values, written as +bn and -bn
respectively. Additionally, a binary feature may be absent, which is written the same way as
for unary features: *bn. For example, the features ‘anterior’ and ‘distributed’ are not
meaningful for labials, so a labial segment would have the value *anterior.

3.1.4 Scalar Features


The following example declares a scalar feature:
feature sc (type=scalar)
Scalar features may have any non-negative integer value, and are written as sc=value.
For example, sc=0, sc=1, and sc=123456789 are all valid scalar feature values. However,
negative values such as sc=-1 and strings such as sc=foo are not valid. Scalar features may
also be absent, in which case they are written *sc.
Scalar features may take two additional parameters min and max to specify the minimum
and maximum value for the feature, as in the following example:
feature sc (type=scalar min=1 max=5)
Note that if you specify either min or max, you must specify both.
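The constraints on scalar values can be summarized in a few lines of Python. This is only a sketch of the rules stated above, not Phonix's own validation code: the function name `check_scalar` is hypothetical, and treating an out-of-range value as invalid is an assumption about how min and max are enforced.

```python
def check_scalar(value, minimum=None, maximum=None):
    """Return True if `value` is acceptable for a scalar feature.

    `value` is the text after the '=' sign, e.g. "3" in sc=3.
    """
    if (minimum is None) != (maximum is None):
        raise ValueError("min and max must be given together")
    # Only plain non-negative integers are allowed: no sign, no letters.
    if not (value.isascii() and value.isdigit()):
        return False
    number = int(value)
    if minimum is not None and not (minimum <= number <= maximum):
        return False   # assumption: out-of-range values are rejected
    return True


for text in ["0", "1", "123456789", "-1", "foo"]:
    print(text, check_scalar(text))
print("3 within 1..5:", check_scalar("3", 1, 5))
print("6 within 1..5:", check_scalar("6", 1, 5))
```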

3.1.5 Node Features


Node features are used to represent feature trees, which are used in many phonological
models. A node feature represents a collection of child features, and the value of a node
feature is the set of values of all of its children.
A node feature is declared with the parameter type=node and the children parameter.
For example:
feature nd (type=node children=a,b,c)
The children parameter is required for node features, and forbidden on all other feature
types. Its value is a list of the child features, separated by commas. You cannot put spaces
after the commas unless you put the whole list in quotes, i.e. either children=a,b,c or
children="a, b, c". The features that are named as children must be declared before you
declare the node feature.
Feature nodes can have any kind of feature as a child, including other nodes. This
allows you to create multi-level hierarchies. For an example of such a feature tree, see
Section 3.6.2 [std.features], page 22.
Feature nodes are present if any of their child features are present (i.e. if any child
feature has a value other than *feature), otherwise they are absent. Feature nodes are
written the same way as unary features, so that nd indicates that a node is present and *nd
that it is absent. Note that there are some restrictions on where you can use node feature
values:
• Node feature values can’t be used at all in symbol declarations. Symbols can only
contain non-node features.
• Node feature values that test for the presence of a node (such as nd) can only be used
in the match and context portions of a rule. This is because it isn’t meaningful to
assign a "present" value directly to a node.
• Node feature values that test for the absence of a node (such as *nd) can be used
anywhere within a rule. Assigning a null value to a node within a rule implies assigning
a null value to all children of that node.
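The presence test for nodes is recursive, which a short Python sketch makes explicit. The names `is_present`, `ABSENT`, and the two-level tree below are hypothetical and chosen for illustration; only the rule itself (a node is present if any child has a value other than the absent one) comes from the text above.

```python
ABSENT = None  # stands for the *feature value

def is_present(feature, values, children):
    """Return True if `feature` is present in the segment `values`.

    `values` maps non-node feature names to values (missing or ABSENT
    means absent); `children` maps each node name to its child features.
    """
    if feature in children:
        # A node is present if any child, at any depth, is present.
        return any(is_present(child, values, children)
                   for child in children[feature])
    return values.get(feature, ABSENT) is not ABSENT


# Hypothetical two-level tree, modeled on the declaration style above.
children = {"nd": ["a", "b", "inner"], "inner": ["c"]}

print(is_present("nd", {"a": "+"}, children))      # True
print(is_present("nd", {"c": "-"}, children))      # True, via the nested node
print(is_present("nd", {"a": ABSENT}, children))   # False
```

Note that a negative value such as -c still counts as present; only the absent value leaves a node empty.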

3.2 Feature Matrices


A feature matrix is a set of features and their values, enclosed in ‘[]’. The following line
represents a feature matrix.
[+cons -son -cont -str -vc -ro]
Features that are present in your phonology but which are not listed in a feature matrix
are assumed to have the absent value. In the feature matrix above, the vocalic feature fr
(front) is unspecified, so it has the default value of [*fr].
Feature matrices are used to define symbols and rules. However, only unary, binary, and
scalar feature values can appear in symbols. Node feature values and variable feature values
cannot be used in symbol declarations, though they can be used in rules.
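To make the "unlisted means absent" convention concrete, here is a minimal Python sketch that reads a feature matrix into a dictionary. The function names are hypothetical, and the parser is intentionally simplified: it handles the plain value forms described so far (+name, -name, *name, name=number, and a bare name) and ignores macros, variables, and quoted strings.

```python
def parse_matrix(text):
    """Parse a matrix such as "[+cons -son sc=2 un *fr]" into a dict.

    Binary values become "+" or "-", scalar values become ints, a bare
    name becomes True (present), and *name becomes None (absent).
    """
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("a feature matrix must be enclosed in []")
    values = {}
    for item in text[1:-1].split():
        if item[0] in "+-":
            values[item[1:]] = item[0]
        elif item[0] == "*":
            values[item[1:]] = None
        elif "=" in item:
            name, number = item.split("=", 1)
            values[name] = int(number)
        else:
            values[item] = True
    return values


def value_of(matrix, feature):
    """Features that are not listed count as absent (None)."""
    return matrix.get(feature)


matrix = parse_matrix("[+cons -son -cont -str -vc -ro]")
print(value_of(matrix, "cons"))   # +
print(value_of(matrix, "fr"))     # None, i.e. *fr
```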

3.2.1 Feature Macros


Sometimes when you declare symbols or write rules, you find that certain clusters of feature
values occur together frequently. To reduce repetition of these frequently occurring sets
of feature values, you can create a feature macro. A feature macro is declared with the
feature keyword followed by the macro name in brackets, followed by a feature matrix:
feature [V] [-cons +son +syll]
Once it is declared, a feature macro can be used anywhere inside a feature matrix. A
feature macro is enclosed in square brackets [] both when it is declared and when it is used.
# These feature values are common to all vowels
feature [V] [-cons +son +syll]

# The feature macro [V] can be used inside symbol declarations
symbol a [[V] -hi +lo -fr +bk]
symbol e [[V] -hi -lo +fr -bk]
# etc...

# You can also use [V] inside a rule
rule lower-all-vowels
[[V]] => [-hi +lo]
Note that you must include the second set of brackets around a feature macro, as shown
in the example above.
Feature macros are expanded when your Phonix file is parsed. If you run your file
in debug mode, Phonix will show you what your rules look like after macros have been
expanded.
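Macro expansion amounts to textual substitution inside the enclosing matrix. The Python sketch below illustrates the idea under that assumption; `expand_feature_macros` is a hypothetical helper, and the output format is not claimed to match what Phonix prints in debug mode.

```python
import re

def expand_feature_macros(matrix, macros):
    """Replace each [NAME] inside `matrix` with the macro's feature values.

    `macros` maps a macro name to the contents of its matrix, without
    the surrounding brackets.
    """
    inner = matrix.strip()[1:-1]          # drop the outer [ ]

    def substitute(found):
        return macros[found.group(1)]

    expanded = re.sub(r"\[([^\[\]\s]+)\]", substitute, inner)
    return "[" + " ".join(expanded.split()) + "]"


macros = {"V": "-cons +son +syll"}
print(expand_feature_macros("[[V] -hi +lo -fr +bk]", macros))
# [-cons +son +syll -hi +lo -fr +bk]
print(expand_feature_macros("[[V]]", macros))
# [-cons +son +syll]
```

The second call shows why the double brackets are needed: the outer pair delimits the matrix and the inner pair marks the macro.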

3.3 Symbols
A symbol in Phonix is a string of text that corresponds to a given feature matrix.
Symbols in Phonix are used for three things:
1. To define the phonetic symbols represented by the strings in your lexicon files.
2. To provide a convenient shortcut for writing rules that refer to specific phonemes.
3. To define the strings used to represent feature matrices in output.

3.3.1 Basic symbols


A basic symbol declaration looks like this:
symbol symbol-string feature-matrix
For example, this is the declaration of the symbol ‘s’ in the standard symbol set:
symbol s [+cons -son +cont +str -vc +ant -dist]
Having this symbol declaration does three things:
1. Allows you to use the string s in your input to Phonix.
2. Allows you to use s as a shorthand for [+cons -son +cont +str -vc +ant -dist] in
your rules.
3. Causes the feature matrix [+cons -son +cont +str -vc +ant -dist] to be translated
to s in phonix output.
Every character you use in your lexicon file must be defined as a symbol, or else
Phonix will quit with an error message. If your rules result in an output feature matrix for
which there is no symbol, Phonix will print a warning and insert the dummy symbol [?]
as a placeholder in your output.
Any Unicode character that doesn’t have a special meaning in Phonix can be used as
a symbol. So if you want to define ☃ (Unicode snowman) as a symbol in your phonology,
you’re welcome to do so.
Symbols can have any number of characters in them. Phonix always maps input strings
to the longest matching symbol, however. For example, say you define the following symbols:
symbol s [...]
symbol k [...]
symbol sk [...]
If your input or rules contain the characters ‘sk’, these will be mapped to the symbol
/sk/, and not to the sequence /s/ followed by /k/.
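Longest-match mapping can be pictured as a greedy scan over the input. The following Python sketch illustrates the behavior described above and is not taken from Phonix's source; the function name `tokenize` and the extra symbol a are assumptions added so that the example words are complete.

```python
def tokenize(word, symbols):
    """Split `word` into symbols, always preferring the longest match."""
    longest = max(len(s) for s in symbols)
    result = []
    position = 0
    while position < len(word):
        for length in range(min(longest, len(word) - position), 0, -1):
            candidate = word[position:position + length]
            if candidate in symbols:
                result.append(candidate)
                position += length
                break
        else:
            # No symbol starts here; Phonix quits with an error in this case.
            raise ValueError(f"undefined symbol at position {position} in {word!r}")
    return result


symbols = {"s", "k", "sk", "a"}
print(tokenize("aska", symbols))   # ['a', 'sk', 'a']
print(tokenize("saka", symbols))   # ['s', 'a', 'k', 'a']
```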
For many examples of symbol definitions, see Section 3.6.3 [std.symbols], page 23.

3.3.2 Diacritic Symbols


Phonix allows you to define certain symbols as diacritics, which are modifiers applied to
other symbols. Diacritics are defined like this:
symbol diacritic-string (diacritic) feature-matrix
For example, the following is the declaration of the nasality diacritic in the standard
diacritic set:
symbol ~ (diacritic) [+nas]
Diacritics differ from basic symbols in several ways:
1. A diacritic is not a segment of its own, but rather adds its features to the symbol that
it modifies. Take the symbol /s/ defined above, together with the nasality diacritic
/~/. The combination of these symbols /s~/ results in a single feature matrix with the
features [+cons -son +cont +str -vc +ant -dist +nas]. These are the feature values
for /s/, with the +nas value of /~/ added to them.
2. Diacritics must follow a base symbol in your input. If /s/ is a base symbol and /~/
is a diacritic, then s is a valid input string, as is s~. However, ~ and ~s are not valid
input strings. Phonix will quit with an error if it encounters such a string.
3. When representing the output of your rules with symbols, Phonix will first look for
a base symbol that matches the output feature matrix. It then adds diacritics as
necessary. Phonix will try to find the fewest number of diacritics necessary to represent
your output.
For many examples of diacritic definitions, see Section 3.6.4 [std.symbols.diacritics],
page 27.
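The first two properties of diacritics can be sketched in Python as follows. The helper `read_segments` is hypothetical, and for simplicity it assumes that every symbol is a single character; it is meant to illustrate how a diacritic merges into the preceding base symbol, not to reproduce Phonix's input handling.

```python
def read_segments(word, base_symbols, diacritics):
    """Turn `word` into a list of feature dicts, one per segment.

    For simplicity every symbol here is a single character.
    """
    segments = []
    for character in word:
        if character in diacritics:
            if not segments:
                raise ValueError("a diacritic must follow a base symbol")
            # A diacritic adds its features to the preceding segment.
            segments[-1].update(diacritics[character])
        elif character in base_symbols:
            segments.append(dict(base_symbols[character]))
        else:
            raise ValueError(f"undefined symbol {character!r}")
    return segments


base_symbols = {"s": {"cons": "+", "son": "-", "cont": "+", "str": "+",
                      "vc": "-", "ant": "+", "dist": "-"}}
diacritics = {"~": {"nas": "+"}}

print(read_segments("s~", base_symbols, diacritics))
```

The input s~ yields one segment carrying the features of /s/ plus +nas, while ~s raises an error because the diacritic has no base symbol to attach to.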

3.3.3 Symbol Macros


When applying phonological rules, Phonix matches symbols strictly. If you use a symbol in
the match portion of a rule (see Section 3.4.1 [Parts of a rule], page 9), an input segment
must have exactly the same feature values as the symbol. Likewise, if you use a symbol
in the action portion of a rule, the output will have exactly the feature values of that
symbol.
However, sometimes one wants to match a symbol less strictly, allowing a symbol to
match input segments that consist of the same base symbol but have some diacritics. For
this case, you can use a symbol macro. A symbol macro may occur anywhere inside a
feature matrix, and is formed by putting a symbol inside parentheses (): [(symbol)].
Consider the following example:
symbol a [-cons +son +syll -hi +lo -fr +bk]
symbol e [-cons +son +syll -hi -lo +fr -bk]

symbol ’ (diacritic) [+stress]

rule match-strictly
a => e

rule match-with-macros
[(a)] => [(e)]
Here, the rule match-strictly will only match the input a. It will not match a’, where
a carries the stress diacritic. Likewise, if the input segment matches a, then it will be
replaced with exactly e, preserving none of the original values from the input segment.
The rule match-with-macros, on the other hand, will match any input segment that
has the features which are specified for a, i.e. [-cons +son +syll -hi +lo -fr +bk]. Since
[+stress] is not specified for a, both a and a’ can match this rule. In the same way, only
the features specified by the symbol e will be replaced by this rule action, leaving the value
for stress unchanged.
You can freely combine symbol macros with regular feature values in a feature matrix:
rule match-with-stress
[(a) +stress] => [(e)]
Symbol macros are expanded when your Phonix file is parsed. If you run your file
in debug mode, Phonix will show you what your rules look like after macros have been
expanded.
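The difference between a bare symbol and a symbol macro comes down to equality versus subset comparison. The Python sketch below models segments as dictionaries to show this; all function names are hypothetical, and the model is a simplification of the behavior described in this section rather than Phonix's implementation.

```python
A = {"cons": "-", "son": "+", "syll": "+", "hi": "-", "lo": "+", "fr": "-", "bk": "+"}
E = {"cons": "-", "son": "+", "syll": "+", "hi": "-", "lo": "-", "fr": "+", "bk": "-"}

def present(values):
    """Drop absent (None) entries so that unlisted and absent compare equal."""
    return {name: value for name, value in values.items() if value is not None}

def matches_strictly(segment, symbol):
    # A bare symbol: every feature value must be identical.
    return present(segment) == present(symbol)

def matches_macro(segment, symbol):
    # A symbol macro: only the features the symbol specifies are checked.
    return all(segment.get(name) == value for name, value in symbol.items())

def replace_strictly(segment, symbol):
    return dict(symbol)                   # nothing of the input survives

def replace_macro(segment, symbol):
    return {**segment, **symbol}          # unspecified features are kept


stressed_a = {**A, "stress": "+"}         # the input a' from the example

print(matches_strictly(stressed_a, A))          # False
print(matches_macro(stressed_a, A))             # True
print(replace_macro(stressed_a, E)["stress"])   # +
```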

3.4 Rules
The rule is the most complex and the most variable object in Phonix. A rule describes the
conditions under which one phonological unit is transformed into another.

3.4.1 Parts of a rule


A complete rule has the following syntax:
rule name (parameters) match => action / context // excluded-context
• The name of the rule can be any Phonix string. See Section 3.8 [Strings], page 33.
• The parameters are optional and define characteristics of how the rule is applied. If
you don’t specify any parameters, you can omit the parentheses.
• The match is a sequence of feature matrices or symbols that matches the segments that
the transformation works on.
• The action is a sequence of feature matrices or symbols that defines the transformations
applied to the match.
• The context defines the conditions under which the transformation occurs. These are
the adjacent segments which aren’t themselves altered, but which are required to trigger
the rule. If the rule is unconditional and does not depend on the surrounding segments,
you can omit the context and the excluded context.
• The excluded-context defines conditions under which the rule should not apply. These
are exceptions to the conditions in the context.
These are described in more detail in the following sections.

3.4.2 Basic transformations


The simplest transformation is one that has no context or parameters and acts only on a
single phoneme, as in the following example.
rule s-to-z
s => z
The match here is the segment /s/, and the action is to turn all such segments into
/z/. Since /s/ and /z/ differ only by voice, we can also write this rule like this (using the
abbreviation vc for voice, as in the standard feature set):
rule s-voicing
s => [+vc]
Here the action is a feature matrix rather than a segment. The effect of this is to take
whichever value for vc the input previously had and replace it with +vc. You can also
use a feature matrix in the match portion of the rule to match against a class of segments:
rule continuant-voicing
[+cont] => [+vc]
The matrix [+cont] causes this rule to match all segments that have the feature +cont,
and applies the feature +vc to them.
Most rules do not apply everywhere, however, but have some context. The context in
Phonix consists of, at minimum, the character _ with any number of feature matrices or
symbols before or after it. The _ character stands for the segment(s) of the match/action.
Regardless of how many segments are matched or transformed, you must write only one
underscore. For example, let’s modify our previous rule to only apply after a nasal:
rule postnasal-voicing
[+cont] => [+vc] / [+nas] _
In this case, the context / [+nas] _ indicates that the segments matched must be
preceded by a segment with the feature +nas.
To indicate word boundaries we use $. The $ character can stand for either the beginning
or the end of a word. It can only appear as the first or the last character of the rule context
(or both), but it cannot appear internal to a rule. Suppose that we wish to further restrict
our rule to only apply at the end of a word. In this case, we write this:
rule final-postnasal-voicing
[+cont] => [+vc] / [+nas] _ $
If we wish to add an exception to this rule, we use the excluded context. This is indicated
by a double-slash // after the context. If the rule includes an excluded context, then phonix
will check both that the context matches the input and that the excluded context does not
match. For example:
rule postnasal-voicing-with-exception
[+cont] => [+vc] / [+nas] _ // N _
Here we voice continuants after nasals, except after /N/ (the velar nasal).
If you wish, you can also indicate the excluded context without the context:
rule voicing-with-exception
[+cont] => [+vc] // $ _
This voices all continuants unless they are the first segment in the word. Note that we
could accomplish the exact same thing with:
rule voicing-with-exception
[+cont] => [+vc] / [] _
Here, we specify that voicing occurs when the continuant is preceded by any segment.
(A feature matrix with no values, indicated by [], acts as a match for any segment.) This
illustrates that an excluded context can usually be indicated by a properly constructed
matching context and vice-versa. The excluded context construct is provided only to make
rules clearer and easier to understand.
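The equivalence between the last two rules can be checked with a small Python model. This is a sketch under simplifying assumptions: segments are dictionaries with only two features, the feature values given to s and a are hypothetical, and the two functions hard-code the rules rather than parsing Phonix syntax.

```python
def matches(segment, matrix):
    """An empty matrix matches any segment."""
    return all(segment.get(name) == value for name, value in matrix.items())

def voice_with_excluded_context(word):
    """[+cont] => [+vc] // $ _   (blocked right after the word boundary)."""
    result = []
    for index, segment in enumerate(word):
        at_word_start = index == 0
        if matches(segment, {"cont": "+"}) and not at_word_start:
            segment = {**segment, "vc": "+"}
        result.append(segment)
    return result

def voice_with_context(word):
    """[+cont] => [+vc] / [] _   (requires any preceding segment)."""
    result = []
    for index, segment in enumerate(word):
        preceded = index > 0 and matches(word[index - 1], {})
        if matches(segment, {"cont": "+"}) and preceded:
            segment = {**segment, "vc": "+"}
        result.append(segment)
    return result


# Hypothetical, heavily reduced segments for the word "sasa".
s = {"cont": "+", "vc": "-"}
a = {"cont": "+", "vc": "+"}
word = [s, a, s, a]

print(voice_with_excluded_context(word) == voice_with_context(word))   # True
print([seg["vc"] for seg in voice_with_context(word)])   # ['-', '+', '+', '+']
```

Both formulations leave the word-initial /s/ untouched and voice the medial one.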

3.4.3 Inserting and deleting


Insertion and deletion rules are written with the help of the special character *, which you
should think of as "nothing". To write an insertion rule, specify that "nothing" becomes
something:
rule e-epenthesis
* => e / $ _ s[-cont]
This is a familiar rule found in many Romance languages, which adds an epenthetic /e/
before a word-initial cluster of /s/ plus a non-continuant.
The opposite of insertion is deletion, in which something becomes nothing. E.g.:
rule final-cons-deletion
[-son] => * / _ $
Here, the "something" is any non-sonorant (-son), which becomes "nothing" when
followed by the word boundary.
Our nothing character is important when writing rules in which multiple segments
become one. For example, to express that /s/+/k/ becomes /S/, you cannot write the
following:
# DOES NOT WORK - Phonix will not compile this rule
rule sk-coalescence
sk => S
The problem with this is that you must have the same number of segments both before
and after the => symbol. To accomplish what you want, just specify that one of the input
segments is deleted:
rule sk-coalescence
sk => S*
This rule will compile and function as expected.
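The equal-length requirement, and the way the nothing character satisfies it, can be modeled in Python. The sketch below works on plain symbol lists and covers only context-free rewrites with deletion slots; `apply_rule` and `NOTHING` are hypothetical names, and insertion rules, which need a context, are left out.

```python
NOTHING = None   # plays the role of * in a rule

def apply_rule(word, match, action):
    """Rewrite every occurrence of `match` in `word` with `action`.

    `word`, `match` and `action` are lists of symbols. `match` and
    `action` must have the same length; use NOTHING to pad the action.
    """
    if not match:
        raise ValueError("the match must contain at least one segment")
    if len(match) != len(action):
        raise ValueError("match and action must have the same number of segments")
    result = []
    position = 0
    while position < len(word):
        if word[position:position + len(match)] == match:
            # Keep the real output segments and drop the NOTHING slots.
            result.extend(s for s in action if s is not NOTHING)
            position += len(match)
        else:
            result.append(word[position])
            position += 1
    return result


print(apply_rule(list("aska"), ["s", "k"], ["S", NOTHING]))   # ['a', 'S', 'a']
```

Calling `apply_rule(list("aska"), ["s", "k"], ["S"])` raises an error, mirroring the rule that does not compile.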

3.4.4 Rule variables


It is common for linguistic rules to operate when two segments share the same value for some
feature, regardless of what the particular value of that feature is. The linguistic literature
traditionally indicates such values with Greek letters to represent the variables. Phonix
allows you to indicate such rules without the Greek using variable feature values, which are
indicated by $ preceding a feature name.
Consider the following rule:
rule cluster-spirantization
[$vc -son] => [+cont] / _ [$vc -son]
This rule in English reads as "spirantize any non-sonorant when followed by another
non-sonorant of the same voicing". The feature values -son and +cont are ordinary feature
values. The feature value $vc is different: it doesn’t stipulate +vc or -vc, but rather requires
that the value for vc be the same in both the match and context segments. This rule will
change apta into afta since both /p/ and /t/ have the same voicing, and it will change
abda into avda since /b/ and /d/ have the same voicing. However, it won’t change abta at
all, because the segments in that word are of different voicing.
Variables may also occur in rule actions. The following illustrates a nasal assimilation
rule:
rule nasal-assimilate
[-cont] => [$nas $son] / _ [-cont $nas $son]
This rule reads as "any non-continuant takes the nasality and sonority of a following
non-continuant segment". The values $nas $son in the rule action have the effect of setting
nas and son in the matched segment to whatever value they have in the context segment.
This will cause, for example /d/+/m/ to become /n/+/m/, while /n/+/b/ becomes /d/+/b/.
In order for a variable feature value to be meaningful, it has to occur at least twice in a
rule. Consider the following modified example:
rule nasal-assimilate-mod
[-cont] => [$nas] / _ [-cont $nas $son]
This rule parses and executes without any warning, but the value $son in the rule context
has no effect, since there is no other instance of $son to match against.
The following example, however, will generate a warning message:
rule nasal-assimilate-mod2
[-cont] => [$nas $son] / _ [-cont $nas]
In this case we are trying to set the value of son in the rule action, but the value for
son is not defined anywhere else in the rule. Since this usually indicates a mistake, Phonix
issues a warning and leaves the value of son unchanged.
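The three nasal-assimilate variants differ only in where each variable occurs. As a rough illustration, the Python sketch below sorts the variables of a rule into those that would draw a warning and those that are silently inert. The function classify_variables is a hypothetical helper, not part of Phonix, and it only models the behavior described in this section.

```python
from collections import Counter


def classify_variables(matching_matrices, action):
    """Return (unbound, inert) variable names for one rule.

    matching_matrices: one set of variable names per match or context matrix.
    action: the set of variable names used in the rule action.
    """
    counts = Counter(name for matrix in matching_matrices for name in matrix)
    # Used in the action but defined nowhere else: Phonix warns about these.
    unbound = sorted(name for name in action if counts[name] == 0)
    # Seen once while matching and never used again: no warning, no effect.
    inert = sorted(
        name for name, seen in counts.items() if seen == 1 and name not in action
    )
    return unbound, inert


# nasal-assimilate:      [-cont] => [$nas $son] / _ [-cont $nas $son]
print(classify_variables([set(), {"nas", "son"}], {"nas", "son"}))
# nasal-assimilate-mod:  [-cont] => [$nas] / _ [-cont $nas $son]
print(classify_variables([set(), {"nas", "son"}], {"nas"}))
# nasal-assimilate-mod2: [-cont] => [$nas $son] / _ [-cont $nas]
print(classify_variables([set(), {"nas"}], {"nas", "son"}))
```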

3.4.5 Node feature values


Node features are features with one or more child features (see Section 3.1.5 [Node Features],
page 5), and they have some special properties.
A node feature is present if one or more of its children is present. Node features don’t
have any value of their own. Within a rule you can use the same syntax that is used for
unary values to test for the presence of a node:
rule postlabial-centralization
[+fr -bk] => [-fr] / [Labial] _
In this case we are testing for the presence of the Labial node. This is especially
convenient because it allows us to capture both +ro segments (rounded vowels) and -ro
segments (labial consonants), which otherwise can’t be represented in a natural class.
Conversely, you can test for the absence of a particular node by preceding the node
name with *, as with other features. For example, the following rule centralizes vowels after
non-Coronal segments.
rule noncoronal-centralization
[+fr -bk] => [-fr] / [*Coronal] _
This rule will apply whenever the action segment is preceded by something that has
no non-null values for features that occur under the Coronal node. You could also write
[*Coronal] as [*ant *dist], since ant and dist are the only two features under Coronal.
These two forms have exactly the same meaning, though the former is clearer and easier to
understand.
Within the action portion of a rule, you can use this syntax to set an entire group of
features to their absent values. For example, the following rule reduces all stops in final
position to /?/ by removing every feature under the Place node.
rule final-glottalization
[-cont] => [*Place] / _ $
As this rule illustrates, you may use the *Node syntax in a rule action, but you cannot
use the Node syntax in an action, because it makes no sense to assign a value directly to a
node.
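Node presence and node removal can be modeled in a few lines of Python. This is a sketch under stated assumptions rather than Phonix's own code: a segment is a dict holding only its non-null features, CHILDREN copies part of the std.features hierarchy listed later in this manual, and leaves, node_present, and clear_node are hypothetical helper names.

```python
# Part of the std.features hierarchy, as listed in this manual.
CHILDREN = {
    "Place": ["Labial", "Coronal", "Dorsal"],
    "Labial": ["ro"],
    "Coronal": ["ant", "dist"],
    "Dorsal": ["hi", "lo", "bk", "fr"],
}


def leaves(feature):
    """Return every leaf feature under a node (or the feature itself)."""
    if feature not in CHILDREN:
        return [feature]
    result = []
    for child in CHILDREN[feature]:
        result.extend(leaves(child))
    return result


def node_present(segment, node):
    """Model of [Node]: true if any feature under the node is non-null."""
    return any(leaf in segment for leaf in leaves(node))


def clear_node(segment, node):
    """Model of [*Node] in an action: null out every feature under the node."""
    dropped = set(leaves(node))
    return {name: value for name, value in segment.items() if name not in dropped}


# /t/ as defined in std.symbols: [+cons -son -cont -vc +ant -dist]
t = {"cons": "+", "son": "-", "cont": "-", "vc": "-", "ant": "+", "dist": "-"}
print(node_present(t, "Coronal"), node_present(t, "Labial"))
print(clear_node(t, "Place"))
```

Clearing Place from /t/ leaves [+cons -son -cont -vc], which is the matrix that std.symbols assigns to /?/.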

3.4.6 Assimilation and gemination


Node features (see Section 3.1.5 [Node Features], page 5) are especially useful for
assimilation and gemination rules, as we’ll see.
A common phonological process is nasal place assimilation, by which nasals in coda
position take on the place of the following segment. These rules can be written with a flat
feature set by writing a complex rule that assigns every feature value corresponding to the
notion of "place". However, this can get very cumbersome: to write a total
assimilation rule, you would need to include every single feature as a variable!
However, these processes are considerably simplified by using node features. For example,
using Section 3.6.2 [std.features], page 22, we can write a nasal place assimilation rule
as follows:
rule nasal-place-assimilation
[+nas] => [$Place] / _ [+cons $Place]
This rule is greatly simplified by using the Place node, which has as its children all of
the features that contribute to place of articulation. The effect of the $Place variable is
the same as any other rule variable: it copies the value of Place from the context to the
action segment. However, because Place is a node, every single feature value under Place
in the feature hierarchy is copied as well.
You can extend this even further for a gemination rule that duplicates all of the features
for a given segment:
rule intervocalic-gemination
* => [$ROOT] / [-cons][+cons $ROOT] _ [-cons]
Here a consonantal segment is doubled when surrounded on either side by non-
consonantal segments. The feature ROOT has as its children every feature in std.features,
and so the inserted segment copies every value from the preceding segment.
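To make the effect of $Place concrete, here is a small Python model of copying a whole node from one segment to another. It is a sketch only. The name copy_node is hypothetical, segments are dicts of non-null features, and the sketch assumes that features which are null in the source also become null in the target.

```python
# Leaf features under Place in std.features.
PLACE_LEAVES = ["ro", "ant", "dist", "hi", "lo", "bk", "fr"]


def copy_node(target, source, node_leaves=PLACE_LEAVES):
    """Give target the same values as source for every leaf under the node."""
    # Drop the target's own values under the node (assumption: nulls copy too).
    result = {name: value for name, value in target.items() if name not in node_leaves}
    # Then take over whatever the source has under the node.
    result.update(
        {name: value for name, value in source.items() if name in node_leaves}
    )
    return result


# Feature matrices as given in std.symbols.
n = {"cons": "+", "son": "+", "cont": "-", "nas": "+", "lat": "-", "ant": "+", "dist": "-"}
k = {"cons": "+", "son": "-", "cont": "-", "vc": "-", "bk": "+", "lo": "-", "hi": "+"}
velar_nasal = {"cons": "+", "son": "+", "cont": "-", "nas": "+", "lat": "-",
               "bk": "+", "lo": "-", "hi": "+"}

print(copy_node(n, k) == velar_nasal)
```

Under these assumptions, /n/ before /k/ ends up with exactly the matrix that std.symbols gives for the velar nasal N.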

3.4.7 Scalar feature value operators


There are a few special operators that can be used with scalar feature values to perform integer
operations and comparisons. Scalar feature values are unlike other feature value types in
that they represent a range of integer values, and so it’s often useful to perform simple
numeric operations on them.

3.4.7.1 Comparison operators


Scalar comparison operators can be used anywhere within a feature matrix that does
matching, i.e. in the match portion of the rule, or to the left of the arrow in the action. The most
basic scalar comparison is for equality, which has the same format as a regular scalar value:
[sc=2]
This feature matrix will match any segment for which the scalar feature sc has the value
2.
You can also check for inequality with <>:
[sc<>2]
This matrix will match any segment which has a value for sc other than 2. This includes
values like sc=3, sc=1, and even null values like *sc.
There are also operators for greater than, greater than or equal, less than, and less than
or equal, which have the familiar numeric notation:
sc>2 # matches sc=3, sc=4, etc.
sc>=2 # matches sc=2, sc=3, etc.
sc<2 # matches sc=1, sc=0
sc<=2 # matches sc=2, sc=1, sc=0
These numeric comparison operators will only match if the segment they’re matching
against has a defined value for sc. They will always return false when comparing against
*sc.
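The way null values interact with these comparisons is easy to get wrong, so here is a small Python model of the behavior described above. It is a sketch, not Phonix's implementation: scalar_matches is a hypothetical name, None stands for a null value such as *sc, and the result of testing a null value for equality is an assumption.

```python
import operator

COMPARATORS = {
    "=": operator.eq,
    "<>": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def scalar_matches(value, op, operand):
    """Return True if a segment's scalar value satisfies `op operand`.

    value is an int, or None when the feature is null (*sc).
    """
    if value is None:
        # A null value is "other than 2", but it fails every numeric comparison.
        return op == "<>"
    return COMPARATORS[op](value, operand)


print(scalar_matches(3, ">", 2))
print(scalar_matches(None, "<>", 2))
print(scalar_matches(None, ">=", 2))
```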

3.4.7.2 Addition and subtraction operators


You can use addition and subtraction operators to add to or subtract from a scalar feature
value. These operators can only be used in a rule action, on the right-hand side of the
arrow.
The =+ operator is used to add to a scalar feature value. (Note that the equal sign comes
first, unlike most programming languages.)
rule add-two
[sc>0] => [sc=+2]
This rule matches any segment that has a value for sc greater than zero, and adds two
to that value. An input segment with sc=1 would have sc=3 after this rule applied.
The subtraction operator =- works much the same way:
rule subtract-one
[sc>2] => [sc=-1]
This rule matches any segment that has a value for sc greater than two, and subtracts
one from that value. An input segment with sc=3 would have sc=2 after this rule applied.
These operators may generate non-fatal warnings under a few conditions:
• If the scalar feature value is null (e.g. *sc) and you attempt to apply the addition or
subtraction operator, Phonix will issue a warning and leave the value unchanged.
• If the scalar feature has a minimum and a maximum (see Section 3.1.4 [Scalar Features],
page 5) and applying the addition or subtraction operator causes the scalar to go out
of range, Phonix will issue a warning and leave the value unchanged.
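The two warning conditions can be summarized in a short Python model. This is only a sketch of the behavior described above: apply_delta and its parameters are hypothetical names, None stands for a null value, and the warning texts are invented for the example.

```python
import warnings


def apply_delta(value, delta, minimum=None, maximum=None):
    """Model of [sc=+N] (positive delta) and [sc=-N] (negative delta)."""
    if value is None:
        warnings.warn("scalar value is null; leaving it unchanged")
        return value
    result = value + delta
    too_low = minimum is not None and result < minimum
    too_high = maximum is not None and result > maximum
    if too_low or too_high:
        warnings.warn("result is out of range; leaving the value unchanged")
        return value
    return result


print(apply_delta(1, +2))   # the add-two rule applied to sc=1
print(apply_delta(3, -1))   # the subtract-one rule applied to sc=3
```

In both warning cases the function returns the original value, mirroring the "leave the value unchanged" behavior.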

3.4.8 Optional and repeated segments


You often need to make some segments in your rule optional. In traditional linguistic
notation, this is indicated by enclosing the optional segment(s) in parentheses, and the
same convention is followed by Phonix.
rule optional-r
a => i / k(r) _
This rule will take the sequences /ka/ and /kra/ and convert them to /ki/ and /kri/
respectively. The (r) in the rule context indicates that the /r/ optionally follows the /k/.
This syntax can be extended to indicate that a segment is repeated zero or more times or
one or more times. These are indicated by using the operators * or + respectively, together
with the parentheses.
The operator * following a parenthesized segment indicates that the segment may appear
zero or more times.
rule multiple-r
a => i / k(r)* _
This rule will match against /ka/ and /kra/, as the one above, but it also matches /krra/
or /krrrrrrrrrra/ or any number of other repetitions of /r/.
The operator + following a parenthesized segment indicates that the segment may appear
one or more times.
rule at-least-one-r
a => i / k(r)+ _
This rule, unlike the ones above, does not match /ka/. It does, however, match /kra/,
/krra/, and any other input with at least one /r/.
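The three contexts k(r), k(r)*, and k(r)+ differ only in how many copies of /r/ they accept. The Python sketch below models that difference with a minimum and a maximum repeat count. It is an illustration rather than Phonix's matching engine, and preceded_by_k_r and apply_rule are hypothetical names.

```python
def preceded_by_k_r(word, i, min_r, max_r):
    """True if word[:i] ends in 'k' followed by min_r..max_r copies of 'r'.

    max_r=None means "no upper limit". The copies of 'r' are consumed
    greedily and the matcher never backtracks.
    """
    j = i - 1
    count = 0
    while j >= 0 and word[j] == "r" and (max_r is None or count < max_r):
        j -= 1
        count += 1
    return count >= min_r and j >= 0 and word[j] == "k"


def apply_rule(word, min_r, max_r):
    """Model of a => i / k(r)... _ for the given repeat bounds."""
    return "".join(
        "i" if ch == "a" and preceded_by_k_r(word, i, min_r, max_r) else ch
        for i, ch in enumerate(word)
    )


print(apply_rule("kra", 0, 1))       # optional-r:     k(r)
print(apply_rule("krra", 0, None))   # multiple-r:     k(r)*
print(apply_rule("ka", 1, None))     # at-least-one-r: k(r)+
```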

3.4.8.1 Differences from regular expressions


The * and + operators are similar to the equivalent regular expression operators. However,
Phonix is not a regular expression engine, and there are important differences between
regular expressions and the Phonix engine.
• The parentheses around segment(s) preceding the * or + are not optional.
• Phonix does not do backtracking, so a rule context involving multiple * or + expressions
may not work as expected.
• Phonix matches are always greedy (taking in as many segments as they can). There
are no non-greedy match operators in Phonix.
Many rules using multiple match segments can be expressed more succinctly by using a
filter instead. See Section 3.4.9.2 [filter], page 16.

3.4.9 Rule parameters


Rule parameters alter the execution of your rule in some way that can’t be indicated in
the rule itself. Parameters take the form param-name=param-value, and are enclosed in
parentheses after the rule name, with spaces between them if there is more than one. The
following rule example specifies the direction and filter parameters:
rule leftward-example (direction=right-to-left filter=[+cons]) s => z
There are two different parameters given in this example: direction=right-to-left
and filter=[+cons]. Note that the parameters are separated by a space, not by a comma.
Supported rule parameters are discussed below.

3.4.9.1 direction
The direction parameter defines the direction in which the rule scans for matching
contexts. This can be important if one application of a rule may create the context
for further applications. To specify the direction for a rule, specify the parameter
direction=direction-value. Valid direction values are left-to-right and
right-to-left, with left-to-right as the default.
A good example for a direction-sensitive rule is voice agreement. Suppose we have a
rule that all consonants in a cluster must agree with the voice of the last consonant. The
following rule does not accomplish that:
# This doesn’t work as expected
rule voicing-agreement
[+cons] => [+vc] / _ [+cons +vc]
Given an input word like astga, the output of this rule will be asdga—only the middle
consonant in the cluster is affected! This is because the default rule direction is left-to-right.
First the /s/ is evaluated, but this /s/ is followed by /t/, so the rule does not apply. Then
/t/ followed by /g/ is evaluated. This does match the required context, so /t/ becomes
/d/. However, the /s/ is never reevaluated, leaving asdga as the result. To fix this, specify
that the rule applies from the right to the left:
rule voicing-agreement (direction=right-to-left)
[+cons] => [+vc] / _ [+cons +vc]
This rule will transform astga into azdga as expected. First /t/ followed by /g/ is
evaluated, transforming the /t/ into /d/. Then /s/ followed by /d/ is evaluated, and the
/s/ becomes /z/.
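The effect of the scan direction can be reproduced with a few lines of Python. This is a toy model of the voicing-agreement rule only, not of Phonix itself. The symbol tables and the voicing_agreement function are assumptions made for the example.

```python
VOICED_COUNTERPART = {"s": "z", "t": "d", "k": "g"}
CONSONANTS = set("stkzdg")
VOICED_CONSONANTS = set("zdg")


def voicing_agreement(word, direction="left-to-right"):
    """Model of [+cons] => [+vc] / _ [+cons +vc] with a chosen scan direction."""
    out = list(word)
    positions = range(len(out) - 1)
    if direction == "right-to-left":
        positions = reversed(positions)
    for i in positions:
        # The context is checked against the current, possibly changed, word.
        if out[i] in CONSONANTS and out[i + 1] in VOICED_CONSONANTS:
            out[i] = VOICED_COUNTERPART.get(out[i], out[i])
    return "".join(out)


print(voicing_agreement("astga"))
print(voicing_agreement("astga", direction="right-to-left"))
```

Scanning from the right lets the newly voiced /d/ serve as the context for /s/, which is why only that direction voices the whole cluster.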

3.4.9.2 filter
Rule filters are a powerful way to express rules that only apply to certain classes of segments
and ignore intervening segments. A familiar real-world example of this kind of rule is vowel
harmony, which typically works on vowels regardless of intervening consonants. To add a
filter to a rule, specify the parameter filter=filter-value, where the filter-value may be
a feature matrix or a symbol.
When you apply a filter, the rule acts as if those segments not matching the filter were
not present at all in the input. Take the following vowel harmony rule:
rule vowel-harmony (filter=[+syll])
[-fr] => [+fr -bk] / _ [+fr]
This rule turns back vowels into front vowels when followed by other front vowels, and will
work correctly regardless of how many non-vowel segments intervene between the vowels.
The filter [+syll] effectively removes all non-syllabic segments from the input for the
duration of the rule. Therefore, the context of the rule is written as if the front vowel that
triggers vowel harmony immediately follows the matching vowel.
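One way to picture a filter is as a list of the positions that the rule is allowed to see. The Python sketch below models the vowel-harmony rule that way. It is an illustration, not Phonix's implementation: the vowel inventory is deliberately tiny, the X-SAMPA pairs u/y and o/2 are taken from std.symbols, and vowel_harmony is a hypothetical name.

```python
VOWELS = set("ieuoy2")            # toy inventory for this sketch
FRONT = set("iey2")               # [+fr] vowels
FRONTED = {"u": "y", "o": "2"}    # back vowel -> front counterpart


def vowel_harmony(word):
    """Model of [-fr] => [+fr -bk] / _ [+fr] with filter=[+syll]."""
    out = list(word)
    # The filter: only these positions exist as far as the rule is concerned.
    visible = [i for i, ch in enumerate(out) if ch in VOWELS]
    for here, following in zip(visible, visible[1:]):
        if out[here] in FRONTED and out[following] in FRONT:
            out[here] = FRONTED[out[here]]
    return "".join(out)


print(vowel_harmony("tuki"))
print(vowel_harmony("tunktri"))
print(vowel_harmony("tuko"))
```

Because adjacency is computed over the filtered positions, the number of consonants between the two vowels makes no difference.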

3.4.9.3 persist
A persistent rule is a rule that applies at all stages of the phonology, rather than at a
specific point in the derivation of the output form. Persistent rules in Phonix are indicated
with the rule parameter persist, which does not take any value. The usual purpose of a
persistent rule is to ensure that redundant feature values are maintained. For example,
in most languages all segments that are +syll are also +vc. If we wanted to enforce this
redundancy, we could do so with the following persistent rule:
rule voiced-syllabics (persist)
[+syll] => [+vc]
Of course, this rule only serves some purpose if there are other processes which might
make a voiceless segment syllabic.
Persistent rules are applied once at the start of rule application, and thereafter every
time that some other rule changes the input. This is true regardless of where in your Phonix
file the persistent rule is defined: persistent rules ignore normal rule ordering. However, the
ordering of persistent rules relative to each other is maintained.
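The scheduling just described can be written as a short loop. The Python sketch below is a model of that description, not Phonix's code: rules are plain functions from a word to a word, and run_rules is a hypothetical name.

```python
def run_rules(word, rules, persistent):
    """Apply ordinary rules in order, re-running persistent rules as needed."""

    def apply_persistent(current):
        # Persistent rules keep their order relative to each other.
        for rule in persistent:
            current = rule(current)
        return current

    # Once at the start of rule application.
    word = apply_persistent(word)
    for rule in rules:
        changed = rule(word)
        # And again whenever some other rule changed the input.
        if changed != word:
            word = apply_persistent(changed)
    return word


def a_to_b(word):
    return word.replace("a", "b")


def b_to_c(word):
    return word.replace("b", "c")


print(run_rules("ab", [a_to_b], [b_to_c]))
print(run_rules("ab", [a_to_b], []))
```

With b_to_c persistent, every /b/ is rewritten, including the one that a_to_b creates later. Without it, that /b/ survives.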

3.4.9.4 applicationRate
A sporadic rule is a rule which does not apply to every word in the input lexicon, but
rather applies randomly to some words and not others. Phonix allows you to create this
kind of rule with the rule parameter applicationRate=ratio. The ratio parameter to
applicationRate must be a decimal between 0 and 1. Examples:
# This rule deletes /b/ 25% of the time
rule sometimes-delete (applicationRate=0.25) b => *

# With applicationRate=1, the rule applies 100% of the time. This is
# equivalent to omitting the applicationRate parameter entirely.
rule always-delete (applicationRate=1) a => *

# With applicationRate=0, the rule applies 0% of the time. This is
# equivalent to leaving the rule out.
rule never-delete (applicationRate=0) c => *
Setting the value for applicationRate to less than 0 or more than 1 will cause Phonix
to print an error message and quit.
Note that if you create sporadic rules in this fashion, Phonix’s output will no longer be
predictable or consistent from run to run. There is no way to ensure that a sporadic rule
will always apply to a given word, nor can you predict which words will be affected by which
rules in any given invocation.
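A sporadic rule can be pictured as a coin toss per word. The Python sketch below makes that assumption explicit: it decides once per word whether the rule applies, which is one reading of the description above and not a statement about how Phonix draws its random numbers. The names apply_sporadically and delete_b are hypothetical.

```python
import random


def apply_sporadically(words, rule, application_rate, rng=None):
    """Apply rule to each word with probability application_rate."""
    if not 0 <= application_rate <= 1:
        # Phonix prints an error and quits; the sketch raises instead.
        raise ValueError("applicationRate must be between 0 and 1")
    rng = rng or random.Random()
    # rng.random() lies in [0, 1), so a rate of 1 always applies
    # and a rate of 0 never does.
    return [rule(word) if rng.random() < application_rate else word for word in words]


def delete_b(word):
    return word.replace("b", "")


lexicon = ["aba", "bob", "cab"]
print(apply_sporadically(lexicon, delete_b, 1))
print(apply_sporadically(lexicon, delete_b, 0))
print(apply_sporadically(lexicon, delete_b, 0.25))  # differs from run to run
```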

3.5 Syllables
Many languages use syllables as an important phonological unit. For this reason Phonix
provides special syntax both for dividing input words into syllables with syllable templates
and for creating phonological rules that depend on syllable structure.

3.5.1 Syllable templates


The syllable declaration is used to describe the valid templates that are used for
syllabification. The form of the syllable template is:
syllable
onset onset-segments
nucleus nucleus-segments
coda coda-segments
The entries onset-segments, nucleus-segments, coda-segments are sequences of segments
(feature matrices or symbols) as used in the match portion of a rule. These define the
segments that are allowed at every position in the syllable. As a full example, consider the
following:
syllable onset [+cons] nucleus [-cons +son] coda [+son]
This allows any [+cons] segment in syllable onsets, any [-cons +son] segment as a
nucleus, and only [+son] segments as the coda. This creates (C)V(C) syllables, since
onsets and codas are optional by default. Onset clusters and coda clusters are not allowed
by this template, since only one segment is allowed in each of the onset and the coda.
If we wanted to allow an optional liquid to follow a stop in the onset cluster, we can
rewrite the syllable template this way:
syllable
onset [+cons]
onset [+cons -cont -son][+son +cont]
nucleus [-cons +son]
coda [+son]
Here onset appears twice, giving two alternate templates for syllable onsets, one which
allows any single [+cons] segment, and another which allows a stop ([+cons -cont -son])
followed by a liquid ([+son +cont]). Phonix will attempt to create syllables matching
either of the given templates.
Any of the syllable parts onset, nucleus, and coda can be repeated as many times as
desired. You can also abbreviate optional segments in a syllable template with (), as in the
following example that allows an optional sonorant to precede a non-sonorant in a syllable
coda:
syllable
onset [+cons]
nucleus [-cons +son]
coda ([+son])[+cons -son]
However, () is the only regex-like operator that can be used in a syllable template. The
extended operators ()+ and ()* are not allowed, and will result in parsing errors.
To specify that a syllable template should not allow codas (or, more rarely, allow no
onsets), the coda or onset element should be omitted:
syllable
onset [+cons]
nucleus [-cons +son]
This creates (C)V syllables and does not allow codas. The nucleus element may not be
omitted.

3.5.2 Syllable rules


A syllable template is actually a special kind of rule. All rules, including syllable rules,
apply only once, and apply in the order that they are declared. This means that:
• Not every segment will necessarily be assigned to a syllable if there is no syllable
template that can cover every segment. Phonix does not do anything with unsyllabified
segments by default. If you want to avoid this, you need to write your own rule to delete,
add an epenthetic segment, or otherwise modify the unsyllabified segments.
• The syllabification built by a syllable rule persists after the syllable rule occurs, even
if segments are inserted or deleted.
• Insertions, deletions, and other phonological modifications do not automatically trigger
resyllabification. This means that an inserted segment is not automatically added to a
syllable, and deleted segments may leave behind incomplete syllables.
• If you want to force a word to be resyllabified after every change, you may use the
persist syllable parameter (see Section 3.5.3 [Syllable parameters], page 19). However,
since insertions and deletions often accompany changes in the syllable template, you
may want to follow an insertion or deletion rule with a new syllable rule.
If you use the -d or --debug flags to ‘phonix’, then when a syllable rule applies, the
new syllabification of the word is printed. Every syllable is written inside angle brackets
(<>), with a double colon (::) between the onset and the nucleus, and a single colon (:)
between the nucleus and the coda. For example, the syllabification of a hypothetical word
like abarno with a (C)V(C) template is presented as:
<a> <b :: a : r> <n :: o>
The spacing is used to avoid confusing the syllable separators with the colon symbol.

3.5.3 Syllable parameters


Syllable parameters affect the way that the syllable template is evaluated and the
syllable rule is applied. Syllable parameters appear in parentheses immediately after
syllable, e.g.:
syllable (parameter-list)
onset onset-segments
nucleus nucleus-segments
coda coda-segments

3.5.3.1 onsetRequired and codaRequired


The parameters onsetRequired and codaRequired cause the onset and coda portions of
the syllable template respectively to be required in all complete syllables. By default the
onset and the coda are optional. Because of this, the following syllable declaration with no
parameters creates (C)V(C) syllables:
syllable
onset [+cons]
nucleus [-cons +son]
coda [+cons]
Given an input string such as alona, this rule will present the syllabification <a> <l
:: o> <n :: a>, in which the first syllable has neither onset nor coda, and the remaining
syllables have no coda.
However, we can make onsets obligatory by adding onsetRequired:
syllable (onsetRequired)
onset [+cons]
nucleus [-cons +son]
coda [+cons]
Given the input alona, this syllabifies a <l :: o> <n :: a>, leaving the initial a
unsyllabified, since no onset can be found for that nucleus. Only the latter two syllables are
constructed, since only those syllables can have onsets as required.
Likewise, we can require both onsets and codas:
syllable (onsetRequired codaRequired)
onset [+cons]
nucleus [-cons +son]
coda [+cons]
This will syllabify alona as a <l :: o : n> a, leaving both the initial and the final a
unsyllabified, since the medial lon is the only syllable that can be constructed with both
an onset and a coda.
Finally, in the unusual case that we wish to require codas but not onsets, we can create
the following:
syllable (codaRequired)
onset [+cons]
nucleus [-cons +son]
coda [+cons]
This syllabifies alona as <a : l> <o : n> a, with the first two syllables having no onsets,
and the final a unsyllabified since no coda can be found for it.
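The four outcomes for alona can be reproduced with a very small syllabifier. The Python sketch below is a toy written for this (C)V(C) template only. It is not the algorithm Phonix uses: it assumes a left-to-right scan, treats every non-vowel as [+cons], and prefers to save a consonant as the next onset unless a coda is required. The names syllabify and VOWELS are hypothetical.

```python
VOWELS = set("aeiou")


def syllabify(word, onset_required=False, coda_required=False):
    """Return a debug-style syllabification for a (C)V(C) template."""
    parts = []
    i, n = 0, len(word)
    while i < n:
        onset, j = "", i
        # Take one consonant as onset only if a nucleus follows it.
        if word[j] not in VOWELS and j + 1 < n and word[j + 1] in VOWELS:
            onset, j = word[j], j + 1
        if word[j] not in VOWELS or (onset_required and not onset):
            parts.append(word[i])   # leave this segment unsyllabified
            i += 1
            continue
        nucleus, k, coda = word[j], j + 1, ""
        has_consonant = k < n and word[k] not in VOWELS
        next_is_vowel = k + 1 < n and word[k + 1] in VOWELS
        # An optional coda is skipped when the consonant can start the next syllable.
        if has_consonant and (coda_required or not next_is_vowel):
            coda, k = word[k], k + 1
        if coda_required and not coda:
            parts.append(word[i])
            i += 1
            continue
        text = nucleus
        if onset:
            text = onset + " :: " + text
        if coda:
            text = text + " : " + coda
        parts.append("<" + text + ">")
        i = k
    return " ".join(parts)


print(syllabify("alona"))
print(syllabify("alona", onset_required=True))
print(syllabify("alona", onset_required=True, coda_required=True))
print(syllabify("alona", coda_required=True))
```

The four calls correspond, in order, to the four syllable declarations discussed in this section.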

3.5.3.2 nucleusPreference
The nucleusPreference parameter controls whether syllable nuclei prefer to fall to the
right or to the left in situations where there is ambiguity as to where the nucleus should
fall. It takes two values, right and left. The value right is the default.
This in turn controls whether rising diphthongs or falling diphthongs are preferred in
situations where either one is allowed by the syllable template. Consider an underlying
vowel sequence /ui/ with allophonic rules that allow /u/ to be both [u] and [w], and /i/ to
be [i] and [j]. We’ll represent this by the following extremely simplified syllable template:
syllable
onset []([+son])
nucleus [-cons +son]
coda []
This template requires vowels in the nuclear position, but allows anything into onsets
and codas. Onsets may also form clusters with sonorants as the second element. Given this
template, the input string bui could be syllabified as <bu :: i> (i.e. [bwi]) or <b :: u : i>
(i.e. [buj]).
The default case is to produce <bu :: i>, which is equivalent to nucleusPreference=right.
To force the nucleus to the left, use nucleusPreference=left:
syllable (nucleusPreference=left)
onset []([+son])
nucleus [-cons +son]
coda []
This template will produce syllables with the nuclei aligned to the left whenever possible,
creating <b :: u : i> ([buj]) as the syllabification of bui.

3.5.3.3 persist
The persist parameter has the same effect as the persist parameter on rule declarations:
it causes the syllabification rule to apply after every other rule, ensuring that words are
automatically resyllabified after every change.

3.5.4 Using syllable elements in rules


To write a rule that is triggered by a particular syllable position, you can include the syllable
position inside angle brackets (<>) inside a feature matrix, as if it were a special feature
value. The recognized syllable features are <syllable>, <onset>, <nucleus>, and <coda>.
For example, the following rule devoices non-sonorant consonants in coda position:
rule coda-devoice
[-son +cons <coda>] => [-vc]
In the same vein, the following rule ensures that all nuclear segments are marked as
+syll:
rule nucleus-syllabic
[<nucleus>] => [+syll]
The syllable features <onset>, <nucleus>, and <coda> work exactly as expected,
matching any segment which is assigned to the onset, nucleus, and coda portions of the syllable,
respectively. The <syllable> feature matches any segment that has been assigned to any
syllable, and fails to match only when the segment in question is unsyllabified.
There are also features that match segments not in a particular syllable portion, marked
with an asterisk (*) inside the angle brackets, as with null feature values: <*onset>,
<*nucleus>, <*coda>, and <*syllable>. The following rule deletes segments which could
not be assigned to any syllable:
rule delete-unsyllabified
[<*syllable>] => *
Likewise, the following inserts an epenthetic /i/ before unsyllabified consonants:
rule epenthetic-i
* => i / _ [+cons <*syllable>]

3.6 Imports
You can include Phonix files from other Phonix files using the import command. There are
also several built-in resources that you can use, which are covered in this section.

3.6.1 Importing files


You may import one Phonix file into another with the following declaration:
import filename
The filename given to the import command follows the same rules as other Phonix
strings, which means that you need to quote it if it contains certain characters that have
special meaning for Phonix. For details, see Section 3.8 [Strings], page 33. Examples:
import other-phonology.phonix
import "/home/linguist/latin.phonix"
import "C:\greek.phonix"
If the file name that you give after import is not an absolute path, Phonix looks for the
file you imported in the following places:
1. In the current directory.
2. In the same directory as the Phonix file currently being parsed (if that directory is
different from the current directory).
3. In a named resource.
The named resources that Phonix looks for are special "filenames" that represent built-
in feature or symbol sets meant to simplify common linguistic tasks. The following sections
describe these resources.
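The search order for relative import names can be sketched in Python as follows. This is a model of the three steps listed above and not Phonix's own lookup code: resolve_import is a hypothetical name, and the "resource:" prefix is invented for the example. The resource names themselves are the ones documented in this section.

```python
from pathlib import Path

NAMED_RESOURCES = {
    "std.features",
    "std.symbols",
    "std.symbols.diacritics",
    "std.symbols.ipa",
}


def resolve_import(filename, current_dir, importing_file):
    """Return where an import would be found, following the documented order."""
    path = Path(filename)
    if path.is_absolute():
        return path
    candidates = [
        Path(current_dir) / path,             # 1. the current directory
        Path(importing_file).parent / path,   # 2. next to the importing file
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    if filename in NAMED_RESOURCES:           # 3. a named resource
        return "resource:" + filename
    raise FileNotFoundError(filename)


print(resolve_import("std.features", ".", "main.phonix"))
```

The final call assumes that no file named std.features exists in the current directory. If one does, step 1 finds it first.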

3.6.2 std.features
The standard feature set contains the commonly used features that you probably learned
about in your introductory Phonology class. You include the standard feature set by writing
import std.features. All of the features in the standard set are binary features given in
their abbreviated forms. The features are gathered into a feature tree following this naming
convention: leaf features are written in lower case, node features are written in initial caps,
and the ROOT feature is written in all caps. The tree has the following structure:
+ROOT
  +Place
    +Labial
      -ro
    +Coronal
      -ant
      -dist
    +Dorsal
      -hi
      -lo
      -bk
      -fr
  +Glottal
    -vc
    -sg
    -cg
  +Manner
    -cont
    -nas
    -str
    -lat
    -dr
  +Class
    -cons
    -syll
    -son
The std.features file itself looks like this:
# Segment classes
feature cons # consonantal
feature son # sonorant
feature syll # syllabic
feature Class (type=node children=cons,son,syll)

# Glottal features
feature vc # voice
feature sg # spread glottis
feature cg # constricted glottis
feature Glottal (type=node children=vc,sg,cg)

# Manner of articulation
feature cont # continuant
feature nas # nasal
feature str # strident
feature lat # lateral
feature dr # delayed-release
feature Manner (type=node children=cont,nas,str,lat,dr,Class)

# Labial features
feature ro # round
feature Labial (type=node children=ro)

# Coronal features
feature ant # anterior
feature dist # distributed
feature Coronal (type=node children=ant,dist)

# Dorsal features
feature hi # high
feature lo # low
feature bk # back
feature fr # front
feature Dorsal (type=node children=hi,lo,bk,fr)

# The Place node governs all place-of-articulation features
feature Place (type=node children=Labial,Coronal,Dorsal)

# The ROOT node governs all features
feature ROOT (type=node children=Place,Glottal,Manner,Class)

3.6.3 std.symbols
The standard symbol set contains over 100 phonetic symbols that you can use for input
and output. In order to use the standard symbol set, you have to first import the standard
feature set std.features, since the standard symbols depend on the features defined in
that set. You include the standard symbols by writing import std.symbols. The standard
symbol set uses only 7-bit ASCII characters and is based on the X-SAMPA IPA encoding.
### CONSONANTS ###

## Obstruents ##

# Labial #

symbol p [+cons -son -cont -vc -ro]
symbol p\ [+cons -son +cont -str -vc -ro]
symbol f [+cons -son +cont +str -vc -ro]
symbol b [+cons -son -cont +vc -ro]
symbol B [+cons -son +cont -str +vc -ro]
symbol v [+cons -son +cont +str +vc -ro]

# Dental #

symbol T [+cons -son +cont -str -vc +ant +dist]
symbol D [+cons -son +cont -str +vc +ant +dist]

# Alveolar #

symbol t [+cons -son -cont -vc +ant -dist]
symbol s [+cons -son +cont +str -vc +ant -dist]
symbol d [+cons -son -cont +vc +ant -dist]
symbol z [+cons -son +cont +str +vc +ant -dist]

# Palatal #

symbol c [+cons -son -cont -vc -ant +dist]
symbol C [+cons -son +cont -str -vc -ant +dist]
symbol S [+cons -son +cont +str -vc -ant +dist]
symbol J\ [+cons -son -cont +vc -ant +dist]
symbol j\ [+cons -son +cont -str +vc -ant +dist]
symbol Z [+cons -son +cont +str +vc -ant +dist]

# Retroflex #

symbol t` [+cons -son -cont -vc -ant -dist]
symbol s` [+cons -son +cont +str -vc -ant -dist]
symbol d` [+cons -son -cont +vc -ant -dist]
symbol z` [+cons -son +cont +str +vc -ant -dist]

# Velar #

symbol k [+cons -son -cont -vc +bk -lo +hi]
symbol x [+cons -son +cont -str -vc +bk -lo +hi]
symbol g [+cons -son -cont +vc +bk -lo +hi]
symbol G [+cons -son +cont -str +vc +bk -lo +hi]

# Uvular #

symbol q [+cons -son -cont -vc +bk -lo -hi]
symbol X [+cons -son +cont -str -vc +bk -lo -hi]
symbol G\ [+cons -son -cont +vc +bk -lo -hi]
symbol R [+cons -son +cont -str +vc +bk -lo -hi]

# Pharyngeal #

symbol X\ [+cons -son +cont -str -vc +bk +lo -hi]
symbol ?\ [+cons -son +cont -str +vc +bk +lo -hi]

# Glottal #

symbol ? [+cons -son -cont -vc]
symbol h [+cons -son +cont -str -vc]
symbol h\ [+cons -son +cont -str +vc]

## Sonorants ##

# Labial #

symbol m [+cons +son -cont +nas -lat -ro]
symbol B\ [+cons +son +cont -nas -lat -ro]

# Alveolar #

symbol n [+cons +son -cont +nas -lat +ant -dist]
symbol l [+cons +son -cont -nas +lat +ant -dist]
symbol 4 [+cons +son -cont -nas -lat +ant -dist]
symbol r [+cons +son +cont -nas -lat +ant -dist]
symbol r\ [-cons +son +cont -nas -lat +ant -dist]

# Retroflex #

symbol n` [+cons +son -cont +nas -lat -ant -dist]
symbol l` [+cons +son -cont -nas +lat -ant -dist]

# Postalveolar #

symbol J [+cons +son -cont +nas -lat -ant +dist]
symbol L [+cons +son -cont -nas +lat -ant +dist]

# Velar #

symbol N [+cons +son -cont +nas -lat +bk -lo +hi]
symbol L\ [+cons +son -cont -nas +lat +bk -lo +hi]

# Uvular #

symbol N\ [+cons +son -cont +nas -lat +bk -lo -hi]
symbol R\ [+cons +son +cont -nas -lat +bk -lo -hi]

### Vocoids ###

## Vowels ##

# High tense #

symbol i [-cons +son +syll +fr -bk +hi -lo +str]
symbol y [-cons +son +syll +fr -bk +hi -lo +ro +str]
symbol 1 [-cons +son +syll -fr -bk +hi -lo +str]
symbol } [-cons +son +syll -fr -bk +hi -lo +ro +str]
symbol M [-cons +son +syll -fr +bk +hi -lo +str]
symbol u [-cons +son +syll -fr +bk +hi -lo +ro +str]

# High lax #

symbol I [-cons +son +syll +fr -bk +hi -lo -str]
symbol Y [-cons +son +syll +fr -bk +hi -lo +ro -str]
symbol I\ [-cons +son +syll -fr -bk +hi -lo -str]
symbol U\ [-cons +son +syll -fr -bk +hi -lo +ro -str]
symbol U [-cons +son +syll -fr +bk +hi -lo +ro -str]

# High mid #

symbol e [-cons +son +syll +fr -bk -hi -lo +str]
symbol 2 [-cons +son +syll +fr -bk -hi -lo +ro +str]
symbol @\ [-cons +son +syll -fr -bk -hi -lo +str]
symbol 8 [-cons +son +syll -fr -bk -hi -lo +ro +str]
symbol 7 [-cons +son +syll -fr +bk -hi -lo +str]
symbol o [-cons +son +syll -fr +bk -hi -lo +ro +str]

# Schwa #

symbol @ [-cons +son +syll -fr -bk -hi -lo -str]

# Open mid #

symbol E [-cons +son +syll +fr -bk -hi -lo -str]
symbol 9 [-cons +son +syll +fr -bk -hi -lo +ro -str]
# Omitted: /3/, which is identical, feature-wise to /@/
symbol 3\ [-cons +son +syll -fr -bk -hi -lo +ro -str]
symbol V [-cons +son +syll -fr +bk -hi -lo -str]
symbol O [-cons +son +syll -fr +bk -hi -lo +ro -str]

# Open lax #

symbol { [-cons +son +syll -fr -bk -hi +lo -str]
symbol 6 [-cons +son +syll -fr +bk -hi +lo -str]

# Open #

symbol a [-cons +son +syll -fr -bk -hi +lo +str]
symbol & [-cons +son +syll -fr -bk -hi +lo -ro +str]
symbol A [-cons +son +syll -fr +bk -hi +lo +str]
symbol Q [-cons +son +syll -fr +bk -hi +lo -ro +str]

## Semivowels ##

symbol j [-cons +son -syll +fr -bk +hi -lo +str]
symbol H [-cons +son -syll +fr -bk +hi -lo +ro +str]
symbol w [-cons +son -syll -fr +bk +hi -lo +ro +str]

3.6.4 std.symbols.diacritics
The standard diacritic set includes diacritics intended for use with std.symbols. These
diacritics are 7-bit ASCII characters based on the X-SAMPA IPA encoding.
## Consonant place modifiers ##

# Labialized
symbol '_w' (diacritic) [-ro]

# Linguolabial
symbol '_N' (diacritic) [-ro +ant]

# Dental
symbol '_d' (diacritic) [+ant +dist]

# Palatalized
symbol '_j' (diacritic) [-ant +dist]

# Velarized
symbol '_G' (diacritic) [+bk +hi]

# Pharyngealized
symbol '_?\' (diacritic) [+bk +lo]

## Vocoid modifiers ##

# Advanced
symbol '_+' (diacritic) [+fr]

# Retracted
symbol '_-' (diacritic) [+bk]

# Lowered
symbol '_o' (diacritic) [+lo]

# Raised
symbol '_r' (diacritic) [+hi]

# Centralized
symbol '_"' (diacritic) [-hi -lo -fr -bk]

## Manner modifiers ##

# Voiceless
symbol '_0' (diacritic) [-vc]

# Voiced
symbol '_v' (diacritic) [+vc]

# Syllabic
symbol '=' (diacritic) [+syll]

# Non-syllabic
symbol '_^' (diacritic) [-syll]

# Nasalized
symbol ~ (diacritic) [+nas]

3.6.5 std.symbols.ipa
The IPA symbol set contains the same notional symbols as std.symbols, but it uses IPA
Unicode characters instead of ASCII X-SAMPA. To use the IPA symbol set, write import
std.features and import std.symbols.ipa at the top of your file.
Unfortunately, not all IPA symbols used in this set are present in the fonts used in this
PDF, so many of these symbols may appear to be blank.
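As a minimal sketch, a file that uses the IPA set might begin as follows. The rule final-devoicing is hypothetical and is not part of Phonix or its standard library; it assumes only the son and vc features supplied by std.features and the word-boundary marker $ used in the examples elsewhere in this manual.

```phonix
import std.features
import std.symbols.ipa

# Hypothetical rule, for illustration only: devoice obstruents word-finally
rule final-devoicing
[-son +vc] => [-vc] / _ $
```

The full contents of the set follow.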
### CONSONANTS ###

## Obstruents ##

# Labial #

symbol p [+cons -son -cont -vc -ro]


symbol [+cons -son +cont -str -vc -ro]
symbol f [+cons -son +cont +str -vc -ro]
symbol b [+cons -son -cont +vc -ro]
symbol [+cons -son +cont -str +vc -ro]
symbol v [+cons -son +cont +str +vc -ro]

# Dental #

symbol [+cons -son +cont -str -vc +ant +dist]


symbol [+cons -son +cont -str +vc +ant +dist]

# Alveolar #

symbol t [+cons -son -cont -vc +ant -dist]


symbol s [+cons -son +cont +str -vc +ant -dist]
symbol d [+cons -son -cont +vc +ant -dist]
symbol z [+cons -son +cont +str +vc +ant -dist]

# Palatal #

symbol c [+cons -son -cont -vc -ant +dist]


symbol ç [+cons -son +cont -str -vc -ant +dist]
symbol [+cons -son +cont +str -vc -ant +dist]
symbol [+cons -son -cont +vc -ant +dist]
symbol [+cons -son +cont -str +vc -ant +dist]
symbol [+cons -son +cont +str +vc -ant +dist]

# Retroflex #

symbol [+cons -son -cont -vc -ant -dist]


symbol [+cons -son +cont +str -vc -ant -dist]
symbol [+cons -son -cont +vc -ant -dist]
symbol [+cons -son +cont +str +vc -ant -dist]

# Velar #

symbol k [+cons -son -cont -vc +bk -lo +hi]


symbol x [+cons -son +cont -str -vc +bk -lo +hi]
symbol g [+cons -son -cont +vc +bk -lo +hi]
symbol [+cons -son +cont -str +vc +bk -lo +hi]

# Uvular #

symbol q [+cons -son -cont -vc +bk -lo -hi]


symbol [+cons -son +cont -str -vc +bk -lo -hi]
symbol [+cons -son -cont +vc +bk -lo -hi]
symbol [+cons -son +cont -str +vc +bk -lo -hi]

# Pharyngeal #

symbol [+cons -son +cont -str -vc +bk +lo -hi]


symbol [+cons -son +cont -str +vc +bk +lo -hi]

# Glottal #

symbol [+cons -son -cont -vc]


symbol h [+cons -son +cont -str -vc]
symbol [+cons -son +cont -str +vc]

## Sonorants ##

# Labial #

symbol m [+cons +son -cont +nas -lat -ro]


symbol [+cons +son +cont -nas -lat -ro]

# Alveolar #

symbol n [+cons +son -cont +nas -lat +ant -dist]


symbol l [+cons +son -cont -nas +lat +ant -dist]
symbol [+cons +son -cont -nas -lat +ant -dist]
symbol r [+cons +son +cont -nas -lat +ant -dist]
symbol [-cons +son +cont -nas -lat +ant -dist]

# Retroflex #

symbol [+cons +son -cont +nas -lat -ant -dist]


symbol [+cons +son -cont -nas +lat -ant -dist]

# Postalveolar #

symbol [+cons +son -cont +nas -lat -ant +dist]


symbol [+cons +son -cont -nas +lat -ant +dist]

# Velar #

symbol [+cons +son -cont +nas -lat +bk -lo +hi]


symbol [+cons +son -cont -nas +lat +bk -lo +hi]

# Uvular #

symbol [+cons +son -cont +nas -lat +bk -lo -hi]


symbol [+cons +son +cont -nas -lat +bk -lo -hi]

### Vocoids ###

## Vowels ##

# High tense #

symbol i [-cons +son +syll +fr -bk +hi -lo +str]


symbol y [-cons +son +syll +fr -bk +hi -lo +ro +str]
symbol [-cons +son +syll -fr -bk +hi -lo +str]
symbol [-cons +son +syll -fr -bk +hi -lo +ro +str]
symbol [-cons +son +syll -fr +bk +hi -lo +str]
symbol u [-cons +son +syll -fr +bk +hi -lo +ro +str]

# High lax #

symbol [-cons +son +syll +fr -bk +hi -lo -str]


symbol [-cons +son +syll +fr -bk +hi -lo +ro -str]
symbol [-cons +son +syll -fr -bk +hi -lo -str]
symbol [-cons +son +syll -fr -bk +hi -lo +ro -str]
symbol [-cons +son +syll -fr +bk +hi -lo +ro -str]

# High mid #

symbol e [-cons +son +syll +fr -bk -hi -lo +str]


symbol ø [-cons +son +syll +fr -bk -hi -lo +ro +str]
symbol [-cons +son +syll -fr -bk -hi -lo +str]
symbol [-cons +son +syll -fr -bk -hi -lo +ro +str]
symbol [-cons +son +syll -fr +bk -hi -lo +str]
symbol o [-cons +son +syll -fr +bk -hi -lo +ro +str]

# Schwa #

symbol [-cons +son +syll -fr -bk -hi -lo -str]

# Open mid #

symbol [-cons +son +syll +fr -bk -hi -lo -str]


symbol œ [-cons +son +syll +fr -bk -hi -lo +ro -str]
# Omitted: //, which is identical, feature-wise to //
symbol [-cons +son +syll -fr -bk -hi -lo +ro -str]
symbol [-cons +son +syll -fr +bk -hi -lo -str]
symbol [-cons +son +syll -fr +bk -hi -lo +ro -str]

# Open lax #

symbol æ [-cons +son +syll -fr -bk -hi +lo -str]


symbol [-cons +son +syll -fr +bk -hi +lo -str]

# Open #

symbol a [-cons +son +syll -fr -bk -hi +lo +str]


symbol [-cons +son +syll -fr -bk -hi +lo -ro +str]
symbol [-cons +son +syll -fr +bk -hi +lo +str]
symbol [-cons +son +syll -fr +bk -hi +lo -ro +str]

## Semivowels ##

symbol j [-cons +son -syll +fr -bk +hi -lo +str]


symbol [-cons +son -syll +fr -bk +hi -lo +ro +str]
symbol w [-cons +son -syll -fr +bk +hi -lo +ro +str]

3.6.6 std.symbols.ipa.diacritics
This set contains IPA diacritics for use with std.symbols.ipa. The diacritics contained
here are the same as those in std.symbols.diacritics, but rendered in IPA Unicode.
## Consonant place modifiers ##

# Labialized
symbol (diacritic) [-ro]

# Linguolabial
symbol (diacritic) [-ro +ant]

# Dental
symbol (diacritic) [+ant +dist]

# Palatalized
symbol (diacritic) [-ant +dist]

# Velarized
symbol (diacritic) [+bk +hi]

# Pharyngealized
symbol (diacritic) [+bk +lo]

## Vocoid modifiers ##

# Advanced
symbol (diacritic) [+fr]

# Retracted
symbol (diacritic) [+bk]

# Lowered
symbol (diacritic) [+lo]

# Raised
symbol (diacritic) [+hi]

# Centralized
symbol (diacritic) [-hi -lo -fr -bk]

## Manner modifiers ##

# Voiceless
symbol (diacritic) [-vc]

# Voiced
symbol (diacritic) [+vc]

# Syllabic
symbol (diacritic) [+syll]

# Non-syllabic
symbol (diacritic) [-syll]

# Nasalized
symbol (diacritic) [+nas]

3.7 Comments
In Phonix the comment character is #. Everything from # to the end of a line is a comment
and is silently ignored by Phonix (unless the # character is embedded in a string). For
example:
# This is a comment
feature ex # this is also a comment
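The exception for strings can be sketched as follows, assuming (as the Strings section states) that a quoted string may contain any character other than its own quote. The feature name a#b is hypothetical.

```phonix
# The whole line is a comment
feature "a#b"   # only this trailing text is a comment; the feature is named a#b
```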

3.8 Strings
Any time that Phonix expects you to provide a string, as for feature names, symbols, or
rule names, you can type almost anything you want. The only exception is if the contents
of the string somehow confuse the Phonix compiler. Don’t do that. Everything else is fine.
The long, boring, technical version follows.
Phonix recognizes two types of strings: bare strings and quoted strings. Most of the
time you can use bare strings, which keeps your Phonix file nice and uncluttered. The rules
for bare strings are extremely forgiving, to minimize the situations where you must use a
quoted string:
• A bare string cannot contain whitespace. More specifically, whitespace is always
interpreted as a token delimiter outside of a quoted string, so it can never
be part of a bare string.
• A bare string cannot contain any of the characters []()<>$= at all.
• A bare string cannot begin with any of the characters +-*/#_"'. However, these
characters can appear in the middle of strings.
This should be enough for almost all cases, but in case it’s not, you can also create a
quoted string by surrounding a string with single quotes (') or double quotes ("). If you are
using a quoted string in a rule or a symbol declaration, the opening quote must be preceded
by a space. Anything at all can appear inside a quoted string, except for the quote character
that delimits it. There is no mechanism for escaping quotes, so if you need to include a single
quote then surround the string with double quotes, and vice versa.
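For instance, the following declarations sketch how the two quote styles complement each other. The feature names are hypothetical and chosen only to show the quoting; the quotes are the plain ASCII characters.

```phonix
feature "it's"       # contains a single quote, so it is wrapped in double quotes
feature 'say "ah"'   # contains double quotes and a space, so it is wrapped in single quotes
```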
The following are valid strings that could be used for feature names, rule names, or
symbols:
foo
foo-bar
foo+
bar?
!
b*a*z
blu"
"+++"
'_'
'"'
The following are NOT valid strings and cannot be used for feature names, rule names,
or symbols:
-foo # looks like a feature value
_foo # ambiguous in rule context
foo() # cannot use parentheses
foo$ # cannot use dollar sign
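Under the quoting rules described earlier in this section, each of these names should become usable once it is quoted. This is a sketch that follows from the stated rules rather than something checked against the Phonix compiler, and the names are simply the ones from the list.

```phonix
feature '-foo'    # quoted, so the leading - is not read as a feature value
feature '_foo'    # quoted, so the leading _ is no longer ambiguous
feature 'foo()'   # parentheses are allowed inside quotes
feature 'foo$'    # the dollar sign is allowed inside quotes
```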

4 Examples
The following examples illustrate common use cases for Phonix.

4.1 Romanian
The following files define a phonology for contemporary Romanian and illustrate how most
of the surface alternations in Modern Romanian can be derived. Note that the example is
somewhat incomplete: the input represents an internal reconstruction of Proto-Romanian,
not Proto-Romance or Vulgar Latin, and there may be a few alternations that have slipped
by. However, the majority of the active phonological processes in Romanian should be
represented.

4.1.1 ‘romanian.phonix’
import std.features

#
# We add a feature for stress, which is important for many rules
#
feature stress

#
# We create our own symbol set, defining only Romanian phonemes
#

# Unstressed vowels
symbol a [+syll -cons +son -hi +lo -fr -bk -stress]
symbol e [+syll -cons +son -hi -lo +fr -bk -stress]
symbol o [+syll -cons +son -hi -lo -fr +bk +ro -stress]
symbol @ [+syll -cons +son -hi -lo -fr -bk -stress]
symbol i [+syll -cons +son +hi -lo +fr -bk -stress]
symbol u [+syll -cons +son +hi -lo -fr +bk +ro -stress]
symbol 1 [+syll -cons +son +hi -lo -fr -bk -stress]

# Stress diacritic
symbol "'" (diacritic) [+stress]

# Non-syllabic vocoids
symbol j [-syll -cons +son +hi -lo +fr -bk -stress]
symbol w [-syll -cons +son +hi -lo -fr +bk +ro -stress]

# Non-syllabic diacritic
symbol ` (diacritic) [-syll]

# Labials
symbol p [+cons +ro -son]
symbol b [+cons +ro -son +vc]
symbol f [+cons +ro -son +cont]


symbol v [+cons +ro -son +cont +vc]
symbol m [+cons +ro +son +nas]

# Dentals
symbol t [+cons +ant -son]
symbol ts [+cons +ant -son +dr]
symbol d [+cons +ant -son +vc]
symbol dz [+cons +ant -son +dr +vc]
symbol s [+cons +cont -son +ant]
symbol z [+cons +cont -son +ant +vc]
symbol n [+cons +ant +son +nas]
symbol l [+cons +ant +son +lat]
symbol r [+cons +ant +son +cont]

# Palatals
symbol tS [+cons -son -ant +dist +dr]
symbol dZ [+cons -son -ant +dist +vc +dr]
symbol S [+cons -son -ant +cont +dist]
symbol Z [+cons -son -ant +cont +dist +vc]

# Velars
symbol k [+cons -son +hi]
symbol g [+cons -son +hi +vc]
symbol h [+cons -son +hi +cont]

#
# Here we begin with the rules
#

# Reduce i/u to semivowels where appropriate


rule make-semivowels
[+hi +syll] => [-syll] / [+syll] _

# Stress all penult syllables. This rule illustrates two interesting
# techniques. First, it specifies a filter of [+syll], which means that the
# rule only "sees" [+syll] segments. Second, it uses an empty pair of
# brackets [] to match "any segment".
rule stress-penult (filter=[+syll])
[] => [+stress] / _ [] $

# When the antepenult syllable is lexically stressed, remove stress on the
# penult syllable.
rule stress-antepenult (filter=[+syll])
[+stress] => [-stress] / [+stress] _

# Break word-initial /e/


rule initial-iotacization
* => j / $ _ [+fr -hi -lo]

# Palatalize velars before front vowels


rule palatalize-velars
[+cons +hi] => [-ant +dist +dr *hi] / _ [+fr]

# Palatalize dental fricatives before high front vowels


rule palatalize-dentals
[+cons +ant +cont -son] => [-ant +dist] / _ [+hi +fr]

# Assimilate palatalization
rule assimilate-palatals
[+cons -son *ro] => [-ant +dist] / _ [-ant +dist]

# Affricate dental stops before /i/


rule affricate-dentals
[+cons +ant -son] => [+dr] / _ i

# Simplify /dz/
rule simplify-dz
dz => z

# Simplify /StS/ everywhere


rule simplify-StS
StS => St

# Change velar+/l/ to a palatalized velar


rule l-palatalize
l => j / [+cons +hi] _

# Raise stressed central vowels before /n/


rule prenasal-raising
[-hi -lo +stress] => [+hi] / _ n

# Centralize front non-high vowels after labials


rule postlabial-centralize
[-hi +fr] => [-fr] / [+ro] _

# Front central vowels when the next vowel is a front vowel


rule front-assimilation (filter=[-cons])
[-fr -bk -lo] => [+fr] / _ [+fr]

# Raise unstressed /o/


rule o-raising (filter=[+syll])
o => u / _ [+stress]

# Drop word-final /u/


rule drop-final-u
u => * / _ $

# Desyllabify word-final /i/


rule desyllabify-final-i
i => j / _ $

# Centralize unstressed /a/, except when /a/ is the first segment


rule centralize-a
a => @ / [] _

# Centralize /a/ in the 1pl verbal ending


rule centralize-1pl-a
a' => @' / _m $

# Break stressed mid vowels into semivowel + a when the next syllable contains
# a non-high vowel
rule breaking (filter=[+syll])
[-hi -lo +stress] * => [-syll -stress] a' / _ [-hi *ro]

# Assimilate /ea/ followed by /e/


rule assimilate-ea (filter=[-cons])
e`a' => e' * / _ e

# Assimilate /oa/ followed by /w/


rule assimilate-oa
o`a' => o' * / _ w

# Assimilate semivowels after a palatal


rule assimilate-postpalatal-semivowel
[-syll +fr] => * / [+dr +dist] _

# Assimilate consecutive semivowels of the same frontness


rule assimilate-consecutive-semivowel
[-syll $fr] => * / [-syll $fr] _

# Assimilate coronal sonorants before /j/


rule assimilate-sonorants
[+cons +ant +son *cont] => * / _ j

# Drop non-syllabic central vowels


rule drop-central-vocoids
[-fr -bk -syll] => *

4.1.2 Input and output


You can generate the output file with the following command line:
phonix romanian.phonix -i romanian.input -o romanian.output
The following table shows the input and the generated output together with the output
form in Romanian orthography.

Input       Output         Orthography

vedu        v@'d           văd
vedi        ve'zj          vezi
vede        ve'de          vede
vedemu      vede'm         vedem
vedeti      vede'tsj       vedeţi
vedu        v@'d           văd
veda        va'd@          vadă
vezutu      v@zu't         văzut
venu        vi'n           vin
veni        vi'j           vii
vene        vi'ne          vine
venimu      veni'm         venim
veniti      veni'tsj       veniţi
venu        vi'n           vin
vena        vi'n@          vină
venitu      veni't         venit
esu         je's           ies
esi         je'Sj          ieşi
ese         je'se          iese
esimu       jeSi'm         ieşim
esiti       jeSi'tsj       ieşiţi
esu         je's           ies
esa         ja's@          iasă
esitu       jeSi't         ieşit
potu        po't           pot
poti        po'tsj         poţi
pote        po`a'te        poate
potemu      pute'm         putem
poteti      pute'tsj       puteţi
potu        po't           pot
pota        po`a't@        poată
potutu      putu't         putut
keru        tSe'r          cer
keri        tSe'rj         ceri
kere        tSe're         cere
ke'remu     tSe'rem        cerem
ke'reti     tSe'retsj      cereţi
keru        tSe'r          cer
kera        tSa'r@         ceară
kerutu      tSeru't        cerut
portu       po'rt          port
porti       po'rtsj        porţi
porta       po`a'rt@       poartă
portamu     purt@'m        purtăm
portati     purta'tsj      purtaţi
porta       po`a'rt@       poartă
porte       po`a'rte       poarte
portatu     purta't        purtat
freku       fre'k          frec
freki       fre'tS         freci
freka       fre`a'k@       freacă
frekamu     frek@'m        frecăm
frekati     freka'tsj      frecaţi
freka       fre`a'k@       freacă
freke       fre'tSe        frece
klemu       kje'm          chem
klemi       kje'mj         chemi
klema       kja'm@         cheamă
klemamu     kjem@'m        chemăm
klemati     kjema'tsj      chemaţi
klema       kja'm@         cheamă
kleme       kja'm@         cheamă
            [Note that this is the regular outcome of the sound changes, but
            this form (the 3rd person subjunctive) has been analogically
            restored to "cheme" in the standard language.]
kalu        ka'l           cal
kali        ka'j           cai
udu         u'd            ud
uda         u'd@           udă
ude         u'de           ude
udi         u'zj           uzi
pro'spetu   pro`a'sp@t     proaspăt
pro'speta   pro`a'sp@t@    proaspătă
pro'speti   pro`a'spetsj   proaspeţi
pro'spete   pro`a'spete    proaspete
osu         o's            os
ose         o`a'se         oase
oie         o`a'je         oaie
oi          oj             oi
ou          ow             ou
oue         o'w@           ouă
peske       pe'Ste         peşte
peski       pe'Stj         peşti
oklu        o'kj           ochi
okli        o'kj           ochi
klaru       kja'r          chiar

Appendix A License
This is the license for the Phonix Phonological Transformation Language, the program
‘phonix.exe’ in source and binary forms, and for the Phonix manual.
Copyright (C) 2009 Jesse Bangs All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are
permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of
conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list
of conditions and the following disclaimer in the documentation and/or other materials
provided with the distribution.
* Neither the name of Jesse Bangs nor the names of its contributors may be used
to endorse or promote products derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
