Computer Science Logic
8th Workshop, CSL '94
Kazimierz, Poland, September 25-30, 1994
Selected Papers
Springer
Series Editors
Gerhard Goos, Universität Karlsruhe, Germany
Volume Editors
Leszek Pacholski
Institute of Computer Science, Wrocław University
Przesmyckiego 20, 51-151 Wrocław, Poland
Jerzy Tiuryn
Institute of Informatics, Warsaw University
Banacha 2, 02-097 Warsaw, Poland
The 1994 Annual Conference of the European Association for Computer Science
Logic, CSL '94, was held in Kazimierz (Poland) from September 25 through
September 30, 1994. CSL '94 was the eighth in the series of workshops and the third
to be held as the Annual Conference of the European Association for Computer
Science Logic.
The workshop was attended by 100 participants from over 15 countries. Invited
lectures were given by M. Ajtai, M. Baaz, H. Barendregt, J.-P. Jouannaud,
V. Orevkov, P. Pudlák, and A. Tarlecki. Moreover, 39 contributed talks selected from
151 submissions were presented. The selection and nomination of invited speakers
was done by the Program Committee consisting of E. Börger, M. Dezani, N. Jones,
P. Kolaitis, J. Krajíček, J.-L. Krivine, L. Pacholski, A. Pitts, A. Razborov, and
J. Tiuryn (Chair). We would like to express our gratitude to the Program Committee
for the time and effort they contributed to the task of selecting the best
papers from the unexpectedly large number of submissions. We are also thankful to
the over 200 referees who helped the Program Committee.
The conference was organized by Warsaw University. We would very much like to
thank Dr. Igor Walukiewicz for the splendid work he contributed to the success
of the meeting. Special thanks go to R. Maron and A. Schubert for their help with
organizing and running the conference office. We also wish to thank M. Benke
and G. Grudziński for taking care of the computers used during the conference.
We gratefully acknowledge the generous sponsorship by the following institutions:
- Office of Naval Research, under the Grant Number N00014-94-J9001
- Polish Committee for Scientific Research (KBN)
- Warsaw University
- Wrocław University
- Mathematical Institute of the Polish Academy of Sciences.
Due to the financial support of ONR and KBN we were able to offer a number
of grants to participants who otherwise could not have afforded to attend the conference.
The topics covered by the talks at the conference addressed all important as-
pects of the methods of mathematical logic in computer science: finite model theory,
lambda calculus, type theory, modal logics, nonmonotonic reasoning, decidability
problems, and the interplay between complexity theory and logic.
The order of the papers in the proceedings, unlike in the previous ones, follows
more closely the order in which they were presented during the conference. They are
grouped according to their subjects.
Following the traditional procedure for CSL volumes, papers were collected after
their presentation at the conference, and after a regular reviewing process 38 papers
were selected for publication. We thank the referees of the final versions; without
them it would have been impossible to prepare this volume. Finally, we would
like to thank W. Charatonik for his help in collecting the papers for the proceedings.
Type systems for current programming languages provide only coarse distinctions
amongst data values: Real, Bool, String, etc. Constructive type theories for program
specification can provide very fine distinctions such as {x ∈ Nat | Prime(x)}, but
often terms contain non-computational parts, or else type-checking is undecidable.
We want to study type systems in between where terms do not contain unnecessary
codes and, ideally, type-checking is decidable. When types express requirements for
data values more accurately, it can help to eliminate more run-time errors and to
increase confidence in program transformations which are type-preserving.
Singleton types express the most stringent requirement imaginable. Suppose fac
stands for the expression:
μf. λx. if x = 0 then 1 else x * (f(x − 1))
Then {fac} is a specification of the factorial function, and

fac : {fac}

says that fac satisfies the specification {fac}. This is an instance of the principal
assertion for singleton types, M : {M}. But syntactic identity is too stringent; we
can write the factorial function in other ways, and it would be useful if, when fac′ is an
implementation of the factorial function, we also have fac′ : {fac}. This suggests that
we let {M} stand for the collection of terms equal to M in some theory of equality,
so {M} denotes an equivalence class of terms, rather than a singleton set.
Although we want types to be more expressive, this should not sacrifice the
usability of the type system. More types can lead to more polymorphism: a term may
possess several types, and the type system should recognize this and allow the pro-
grammer as much flexibility as possible. Subtyping systems provide flexibility by
allowing a term of some type A to be used where one of a 'larger' type B is expected.
The characteristic rule for subtyping is known as subsumption, which captures this
kind of polymorphism:
M : A    A ≤ B
-------------- (SUB)
M : B
Subsumption at ground types suggests subtyping at higher types. For example, a
function defined on Int may be used where one defined only on Nat is needed, because
Nat ≤ Int means that every natural number will be a suitable argument. So we expect
that Int → Int is a subtype of Nat → Int.
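This expectation is an instance of the standard contravariant/covariant subtyping rule for function types. As a sketch (written here with the Π-type notation used later in the paper), the general rule and the instance above are:

```latex
\frac{\Gamma \vdash A' \le A \qquad \Gamma, x{:}A' \vdash B \le B'}
     {\Gamma \vdash \Pi x{:}A.\,B \;\le\; \Pi x{:}A'.\,B'}
\qquad\text{so}\qquad
\frac{\mathsf{Nat} \le \mathsf{Int} \qquad \mathsf{Int} \le \mathsf{Int}}
     {\mathsf{Int}\to\mathsf{Int} \;\le\; \mathsf{Nat}\to\mathsf{Int}}
```

Note the direction: the argument type is refined (Nat ≤ Int appears contravariantly), while the result type may be enlarged covariantly.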
Subtyping leads to a stratified notion of equality. Because terms may have many
types, the equality of two terms can be different at different types. Indeed, consider
two different functions on integers:
(λx:Int. if x > 0 then x else 2*x) ≠ (λx:Int. x) : Int → Int
which have equal values at every natural:
(λx:Int. if x > 0 then x else 2*x) = (λx:Int. x) : Nat → Int
If in some context only arguments of type Nat are supplied, these functions are
interchangeable; this is useful, perhaps, for program transformation during compilation.
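To make the stratified-equality example concrete, here is a minimal Python sketch (an illustration only; the two functions model the two lambda-terms above):

```python
# A concrete (hypothetical) rendering of the two lambda-terms above as
# Python functions on machine integers.
def f(x: int) -> int:
    # corresponds to: lambda x:Int. if x > 0 then x else 2*x
    return x if x > 0 else 2 * x

def g(x: int) -> int:
    # corresponds to: lambda x:Int. x
    return x

# Distinct at type Int -> Int: they disagree on negative arguments.
assert f(-3) == -6 and g(-3) == -3

# Equal at type Nat -> Int: they agree on every natural number tested.
assert all(f(n) == g(n) for n in range(100))
```

A compiler that knows only Nat arguments reach a call site may substitute one function for the other, which is exactly the equality at type Nat → Int.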
This view of equality influences our treatment of singleton types. Because equality
can vary at different types, we think of {M} as a family of equivalence classes indexed
by a type. We attach a tag to the singleton, which denotes the type at which we "view"
the term. The introduction rule for singletons is:
M : A
---------- ({}-I)
M : {M}_A
In fact, the type tag can be important for another reason: the type A might affect
the interpretation of M, as well as its equivalence class of terms (although this is not
the case for the semantics we give later). Imagine a model in which the integers are
constructed using pairs of naturals: the pair (m, n) codes the integer (m - n). Then
the interpretation ⟦3 : Int⟧ is quite different from ⟦3 : Nat⟧, and the semantic types have
different associated equality relations. (Of course, there is an obvious coercion from
⟦Nat⟧ to ⟦Int⟧.) To allow for typed interpretations we need to know the type given
to a term in a singleton, but unless it is recorded somehow it cannot be determined
from a typing derivation.
There is no typing elimination rule for singleton types, but we have a subtyping
rule that says that a singleton type is a subtype of its type tag:
M : A
---------- (SUB-{})
{M}_A ≤ A

which allows us to deduce M : A from M : {N}_A via (SUB).
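Spelled out, given N : A the deduction composes the two rules as follows:

```latex
\dfrac{M : \{N\}_A \qquad \dfrac{N : A}{\{N\}_A \le A}\;(\text{SUB-}\{\})}
      {M : A}\;(\text{SUB})
```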
For us, singleton types have a non-informative flavour. In other words, we have
no term operators corresponding to singleton introduction and elimination. This con-
trasts with constructive type theories utilising propositions-as-types, where singletons
might be treated akin to a propositional equality type and given a powerful elimina-
tion operator. In our approach, membership of singletons corresponds to definitional
equality, which is usually decidable. A technical side-effect of non-informative types
is that the meta-theory of our system is harder to deal with, because the rules are
less syntax-directed.¹ Notice that the presence of (SUB) already means that the typing
rules are not syntax-directed.
¹A set of rules is called syntax-directed if the last rule used in a derivation of any statement J is
uniquely determined by the structure of J.
The theory of equality we choose to incorporate in singleton types is a natural
typed equational theory for the terms. The typing assertion M : {N}_A asserts that
M and N are equal at type A, so instead of axiomatizing a separate judgement form
Γ ⊢ M = N : A, we use typing rules of the form Γ ⊢ M : {N}_A directly. The
usual rule of β-equality is admissible. This formulation is nicer to deal with than one
defined using a rule of untyped β-conversion.
Γ ⊢ A
------------ (SUB-REFL)
Γ ⊢ A ≤ A

Γ ⊢ M : A
---------------- (SUB-{})
Γ ⊢ {M}_A ≤ A

Γ ⊢ M = N : A    Γ ⊢ A ≤ B
--------------------------- (SUB-EQ-SYM)
Γ ⊢ {N}_A ≤ {M}_B

Γ ⊢ M : A
------------------------- (SUB-EQ-ITER)
Γ ⊢ {M}_A ≤ {M}_{{M}_A}
Singleton Types and Equality. Singleton types are formed by the rule (FORM-{})
and terms of singleton type are introduced by the equality rules, principally reflexivity
(EQ-REFL), which is the singleton introduction rule ({}-I) shown before under a different
guise. Symmetry and transitivity are derived using the subtyping rules shown
below. We also have the usual rules for equality of λ-abstractions (EQ-λ) and applications
(EQ-APP). The rule (EQ-λ) is more flexible than usual: it allows one to derive
equalities between functions by examining only a restricted domain. (This rule is
forced when one has untagged singletons and the usual equal-domains equality, which
was the inspiration for adding it.) It leads to the admissibility of a correspondingly
stronger typing rule, via (SUB) and (SUB-{}):

Γ ⊢ A′ ≤ A    Γ, x:A ⊢ M : B    Γ, x:A′ ⊢ M : B′
------------------------------------------------
Γ ⊢ λx:A. M : Πx:A′. B′

which lets us give a more refined type for a function, given a more refined type for its
argument. This was used in the example in Section 2.
One might wonder whether we can have deduction from arbitrary hypotheses of
equations between terms, assumed via iterated subscripts in the syntax. For example,
it holds that Γ, x : {M}_{{N}_A} ⊢ M = N : A. In fact, we have only a pure theory of
equality, since this judgement presupposes that Γ ⊢ M = N : A. This is because the
rule (ADD-HYP) requires that types in contexts must be well-formed; Proposition 4.1
below establishes this formally.
Subtyping Singletons. Subtyping of singleton types is provided by three rules.
First, we have the rule (SUB-{}) shown earlier, which asserts that a singleton is a
subtype of the type it is tagged with.
The rule (SUB-EQ-SYM) combines two principles. The first is monotonicity of
equality with respect to subtyping: if two terms are equal at a type, say M = N : A,
and A ≤ B, then M = N : B also. We can express this via subtyping of singleton
types. Generally, as we pass from subtype to supertype, the equivalence class of any
particular term gets larger, so {N}_A ≤ {N}_B and M = N : B via subsumption.
The second principle is symmetry of equality, and again we can express the typing
rule by a subtyping one (an economy, since we want both). If M = N : A then the
equivalence classes of M and N at A must be the same; in particular {N}_A ≤ {M}_A.
These are combined to get the single rule (SUB-EQ-SYM).
The third subtyping rule for singletons, (SUB-EQ-ITER), deals with the case when a
singleton type is tagged with another singleton type. Observe that we can repeat the
operation of taking singletons in the syntax, forming {M}_A, {M}_{{M}_A}, .... We shall
consider these types as equal, because singleton types are already the smallest non-empty
types we are interested in. And because {M}_A inherits equality from A, the
equality on terms in {M}_A and {M}_{{M}_A} is the same. We have {M}_{{M}_A} ≤ {M}_A
already by (SUB-{}); for the other inclusion we need (SUB-EQ-ITER).
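Spelled out, both inclusions are available whenever M : A: the first is the instance of (SUB-{}) at the tag {M}_A, and the second is (SUB-EQ-ITER) itself:

```latex
\dfrac{\dfrac{M : A}{M : \{M\}_A}\;(\{\}\text{-I})}
      {\{M\}_{\{M\}_A} \le \{M\}_A}\;(\text{SUB-}\{\})
\qquad
\dfrac{M : A}{\{M\}_A \le \{M\}_{\{M\}_A}}\;(\text{SUB-EQ-ITER})
```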
Proposition 4.1 (Contexts and Substitution)
1. Context formation. Suppose Γ Context, where Γ ≡ x1 : A1, ..., xn : An. Then
   (a) Γ ⊢ Ai for 1 ≤ i ≤ n.
   (b) If Γ ⊢ J, then FV(J) ⊆ {x1, ..., xn}.
2. Weakening. If Γ ⊢ J and Γ ⊆ Γ′ with Γ′ Context, then Γ′ ⊢ J.
3. Substitution. If Γ, x : A, Γ′ ⊢ J and Γ ⊢ N : A, then Γ, Γ′[N/x] ⊢ J[N/x].
4. Bound narrowing. If Γ, x : A, Γ′ ⊢ J and Γ ⊢ A′ ≤ A, then Γ, x : A′, Γ′ ⊢ J.
The next proposition shows some implications between judgements. It is common
practice in the presentation of type theories to simply require the consequences of these
as premises in the rules to begin with (often implicitly). Here our rules have fewer
premises and we show the implications afterwards, but our approach does make some
proofs on derivations slightly harder, because sometimes we cannot apply an induction
hypothesis directly. This can be circumvented by considering term structure instead,
or subderivations of the premises.
Proposition 4.2 (Implied Judgements)
1. If Γ ⊢ J then Γ Context.
2. If Γ ⊢ J and J ≡ M : A, A ≤ B or B ≤ A, then Γ ⊢ A.
3. If Γ ⊢ J and J ≡ M = N : A, N = M : A, {M}_A ≤ B or {M}_B ≤ A, then
   Γ ⊢ M : A.
Generation principles are important for meta-theoretic analysis. They allow us to de-
compose a derived judgement into further derivable judgements concerning subterms
from the first judgement. Typically a generation principle expresses the general way
in which a judgement form may be constructed; for the context and type formation
judgements, the generation principles are merely inversions of the rules. The following
generation result for the subtyping judgement allows us to show generation for the
typing judgement. It also reveals the "structural" nature of subtyping we mentioned
before, except in the case of singletons: {M}_A can be a subtype of a type B which is
not itself a singleton. There is a case according to each syntactic form on either side
of the subtyping symbol.
Proposition 4.3 (Subtyping Generation)
1. If Γ ⊢ P ≤ B then B is also an atomic type, say P′, and P ≤_prim P′.
2. If Γ ⊢ Πx:A. B ≤ C then for some A′, B′, we have C ≡ Πx:A′. B′, such that (a)
   Γ ⊢ A′ ≤ A, (b) Γ, x:A′ ⊢ B ≤ B′, and (c) Γ, x:A ⊢ B.
3. If Γ ⊢ {M}_A ≤ B, then Γ ⊢ M : B.
4. If Γ ⊢ A ≤ P′ where A is not a singleton, then A is also an atomic type, say P,
   and P ≤_prim P′.
5. If Γ ⊢ C ≤ Πx:A′. B′ where C is not a singleton, then for some A, B, we have
   C ≡ Πx:A. B, such that (a) Γ ⊢ A′ ≤ A, (b) Γ, x:A′ ⊢ B ≤ B′, and (c)
   Γ, x:A ⊢ B.
6. If Γ ⊢ C ≤ {N}_B then for some A, M, we have C ≡ {M}_A.
Parts 3 and 6 of this proposition are rather weak; in particular, nothing is said about
the relation between the types A and B. This will be rectified later.
The generation principle for the typing judgement F F- M : A looks unusual,
because we must account for the possibility that A is a singleton type.
Proposition 4.4 (Typing Generation)
1. If Γ ⊢ x : A, then Γ ⊢ {x}_{Γ(x)} ≤ A.
2. If Γ ⊢ λx:A. M : C, then for some A′, B, B′, we have (a) Γ ⊢ A′ ≤ A, (b)
   Γ, x:A′ ⊢ M : B′, (c) Γ, x:A ⊢ M : B, and (d) Γ ⊢ {λx:A. M}_{Πx:A′. B′} ≤ C.
3. If Γ ⊢ M N : C, then for some A, B, we have that (a) Γ ⊢ M : Πx:A. B, (b)
   Γ ⊢ N : A, and (c) Γ ⊢ {M N}_{B[N/x]} ≤ C.
In specific instances, the consequence of typing generation can be further broken down
using the subtyping generation principle, and so on.
Admissible equality rules. We mention a few important admissible rules of λ≤{}.
The symmetry and transitivity of equality

Γ ⊢ M = N : A
---------------- (EQ-SYM)
Γ ⊢ N = M : A

Γ ⊢ L = M : A    Γ ⊢ M = N : A
------------------------------ (EQ-TRANS)
Γ ⊢ L = N : A

are derived via (SUB-EQ-SYM) and (SUB-TRANS), using Proposition 4.2.
The usual rule for β-equality is (perhaps surprisingly) derivable. This is because
λx:A. M can be given the tight dependent type Πx:A. {M}_B using ({}-I), and so
together with (λ) and (APP) we can derive:

Γ, x:A ⊢ M : B    Γ ⊢ N : A
---------------------------------- (EQ-β)
Γ ⊢ (λx:A. M) N = M[N/x] : B[N/x]

This rule is used to show β subject reduction.
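As a sketch of the derivation: the premise Γ, x:A ⊢ M : B gives Γ, x:A ⊢ M : {M}_B by ({}-I), so the abstraction receives the tight type and the application instantiates it at N:

```latex
\dfrac{\Gamma, x{:}A \vdash M : \{M\}_B}
      {\Gamma \vdash \lambda x{:}A.\,M : \Pi x{:}A.\,\{M\}_B}
\qquad
\dfrac{\Gamma \vdash \lambda x{:}A.\,M : \Pi x{:}A.\,\{M\}_B \qquad \Gamma \vdash N : A}
      {\Gamma \vdash (\lambda x{:}A.\,M)\,N : \{M[N/x]\}_{B[N/x]}}
```

Membership in {M[N/x]}_{B[N/x]} is exactly the equality (λx:A. M) N = M[N/x] : B[N/x].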
Removing Singletons. We mentioned that two parts of the subtyping generation
principle (Proposition 4.3) are rather weak. If {M}_A ≤ B, we would like to find a
relationship between the types A and B. When B is not a singleton type, we expect
(from the rules) that A ≤ B. However, when B ≡ {N}_C for some N and C, we
may have A ≤ C or vice-versa, because of the rules (SUB-EQ-SYM) and (SUB-EQ-ITER).
A generation lemma covering these cases is untidy to state, and difficult to prove
directly because of the rule (SUB-TRANS).
Here we define an operation (−)⁻ which derives a non-singleton type from a type
by repeatedly taking the type tag of a singleton type. A proposition relates a type to
its singleton-deleted form; this sufficiently strengthens the generation result to give us
a tool to show the admissibility of subject reduction and the minimal type property.
Definition 4.5 (Singleton Removal)
P⁻ = P
(Πx:A. B)⁻ = Πx:A. B
({M}_A)⁻ = A⁻
Minimal types are not unique, and the minimal types given by the simple definition
of min_Γ(M) are not necessarily the simplest syntactically. For example, if
Γ ≡ z:α, f:β→{z}_α, x:β then min_Γ(f x) ≡ {f x}_{{z}_α}. But Γ ⊢ f x : {z}_α too, and
we can show that Γ ⊢ {z}_α ≤ {f x}_{{z}_α}.
5 A PER Interpretation of λ≤{}
Subtyping calculi have two basic kinds of model. We may choose a typed value space
where subsumption is modelled using coercion maps between types. In some sense
this is the most general setting, but it requires some way of relating coercion maps to
the syntax: either we forgo (SUB) and introduce coercions explicitly into the syntax
[CL91], or we reconstruct coercions by some translation process [BTCGS91]. Either
route requires a coherence property of the interpretation, because of the possibility of
different ways of deriving or expressing the analogue of a coercion-free statement, by
permuting the positions of coercions. This property can be quite tricky to establish in
a general form, and has yet to be demonstrated in a subtyping calculus more complex
than F≤ (see [CG92]). Here we follow the alternative untyped approach, based on a
global value space from which types are carved out. Coercion maps are unnecessary
since subtyping amounts to inclusion between types, and the interpretation of a term
does not depend on its type. The need for coherency properties can be avoided by
defining the interpretation by induction on the structure of raw (coercion-free) terms
rather than typing derivations.
The PER model. Recall that a partial equivalence relation (PER) on a set D is a
symmetric and transitive relation R ⊆ D × D. The domain of R, dom(R), is the set
{d | d R d}, but we often write d ∈ R instead of d ∈ dom(R). The equivalence class
{d′ | d′ R d} of d in R is written [d]_R. Subtyping will be interpreted as inclusion of
PERs, which is simply subset inclusion on D × D.
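As a finite illustration (not part of the paper's construction), the definition can be checked directly in Python, with a PER represented as a set of pairs:

```python
# A PER on a finite carrier, represented as a set of pairs (all names ours).
def is_per(R):
    # symmetric: a R b implies b R a
    sym = all((b, a) in R for (a, b) in R)
    # transitive: a R b and b R c implies a R c
    trans = all((a, d) in R for (a, b) in R for (c, d) in R if b == c)
    return sym and trans

def dom(R):
    # dom(R) = {d | d R d}; by symmetry and transitivity, a R b implies a R a
    return {a for (a, b) in R}

def eq_class(R, d):
    # the equivalence class [d]_R = {d' | d' R d}
    return {a for (a, b) in R if b == d}

# Example: R relates 0 and 1; the element 2 lies outside dom(R).
R = {(0, 0), (0, 1), (1, 0), (1, 1)}
assert is_per(R)
assert dom(R) == {0, 1}
assert eq_class(R, 0) == {0, 1}
```

Subtyping-as-inclusion is then literally `R1 <= R2` on these sets of pairs.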
The construction is mostly standard (see e.g., [CL91, BL90]), but incorporates
type-term dependency. We make use of a model of the untyped λ-calculus to interpret
terms and to build PERs over.
Definition 5.1 (Lambda Model [HL80])
A lambda model is a triple D = (D, ·, ⟦ ⟧), where D is a set, · is a binary operation
on D, and for untyped lambda-terms M, the interpretation of M in an environment
ρ : Var → D is ⟦M⟧ρ ∈ D, such that:

⟦x⟧ρ = ρ(x) (VAR)
⟦M N⟧ρ = ⟦M⟧ρ · ⟦N⟧ρ (APP)
⟦λx. M⟧ρ = ⟦λy. M[y/x]⟧ρ (α)
(∀d ∈ D. ⟦M⟧ρ[x ↦ d] = ⟦N⟧ρ[x ↦ d]) ⟹ ⟦λx. M⟧ρ = ⟦λx. N⟧ρ (ξ)
(∀x ∈ FV(M). ρ(x) = ρ′(x)) ⟹ ⟦M⟧ρ = ⟦M⟧ρ′ (FV)
∀d ∈ D. ⟦λx. M⟧ρ · d = ⟦M⟧ρ[x ↦ d] (β)

From the above axioms (except β), we also have:

⟦M[N/x]⟧ρ = ⟦M⟧ρ[x ↦ ⟦N⟧ρ] (SUBSTITUTE)
An environment ρ′ extends another ρ, written ρ ⊆ ρ′, if for all variables x, if ρ(x) is
defined then ρ′(x) is defined and ρ(x) = ρ′(x). Fix a lambda model D with domain
D. Terms are interpreted as elements of D as usual (we leave the erase operation,
which deletes type information, implicit) and types are interpreted as PERs on D.
Pairing and projection operations in the model are defined by:

⟨a, b⟩ = ⟦λf. f x y⟧[x ↦ a, y ↦ b]
π₁ p = p · ⟦λx. λy. x⟧
π₂ p = p · ⟦λx. λy. y⟧
We first show some constructions for building PERs, and then the interpretation
proper.
Definition 5.2 (PER Constructions)
We define PERs to interpret the types of λ≤{}, as follows:
- For each primitive type P, we assume a PER R_P such that P ≤_prim P′ implies
  R_P ⊆ R_{P′}.
- Let R be a PER and S(a) be a PER for all a ∈ dom(R), such that S(a) = S(b)
  whenever a R b.
  Define the PER Π(R, S) by:
  f Π(R, S) g iff ∀a, b. a R b ⟹ f · a S(a) g · b
  Define the PER Σ(R, S) by:
  ⟨a1, b1⟩ Σ(R, S) ⟨a2, b2⟩ iff a1 R a2 and b1 S(a1) b2
- Let R be a PER. Define the PER [p]_R by:
  m [p]_R n iff m R n and m R p
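The singleton construction [p]_R can likewise be illustrated on a finite PER (a hypothetical example; representation and names are ours):

```python
# m [p]_R n  iff  m R n and m R p: cut R down to the equivalence class of p.
def singleton_per(R, p):
    return {(m, n) for (m, n) in R if (m, p) in R}

# R identifies 0 with 1, and separately relates 2 to itself.
R = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)}
S = singleton_per(R, 0)

# Only the class of 0 survives; 2 is cut out since 2 is not related to 0.
assert S == {(0, 0), (0, 1), (1, 0), (1, 1)}
# [p]_R is a sub-PER of R, matching subtyping-as-inclusion.
assert S <= R
```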
The interpretation ⟦Γ⟧ of a context Γ is a PER. The interpretation of a type in some
context is a family of PERs ⟦Γ ⊢ A⟧g indexed by elements g ∈ ⟦Γ⟧ that is invariant
under the choice of representative of equivalence class in ⟦Γ⟧.
Definition 5.3 (Interpretation of Contexts and Types)
For each context Γ, we define a PER ⟦Γ⟧ by:

⟦∅⟧ = D × D
⟦Γ, x:A⟧ = Σ(⟦Γ⟧, λg. ⟦Γ ⊢ A⟧g)

For each context Γ and type A, we define a PER ⟦Γ ⊢ A⟧g, for each g ∈ dom(⟦Γ⟧):

⟦Γ ⊢ P⟧g = R_P
⟦Γ ⊢ Πx:A. B⟧g = Π(⟦Γ ⊢ A⟧g, λa. ⟦Γ, x:A ⊢ B⟧⟨g, a⟩)
⟦Γ ⊢ {M}_A⟧g = [⟦M⟧g^Γ]_{⟦Γ ⊢ A⟧g}

Notice that ⊢ is just used as a place-holder here; it does not signify a judgement
derivation. The symbol λ stands for lambda-abstraction at the meta-level, and
g^Γ : Var → D is the environment defined by projections on g:

g^∅(y) undefined, for all y
g^{Γ, x:A}(y) = π₂(g), if y ≡ x; (π₁(g))^Γ(y), otherwise
The following theorem establishes the main soundness property of the model construction.
It also shows well-definedness of the interpretation of contexts ⟦Γ⟧ and of
types ⟦Γ ⊢ A⟧, whenever Γ is a context and Γ ⊢ A. The parts of the theorem need
to be proven simultaneously because of type dependency.
References
[Asp95] David R. Aspinall. Algebraic specification in a type-theoretic setting.
Forthcoming PhD thesis, Department of Computer Science, University
of Edinburgh, 1995.
[BL90] Kim B. Bruce and Giuseppe Longo. A modest model of records, in-
heritance and bounded quantification. Information and Computation,
87:196-240, 1990.
[BTCGS91] Val Breazu-Tannen, Thierry Coquand, Carl A. Gunter, and Andre Sce-
drov. Inheritance as implicit coercion. Information and Computation,
93:172-221, 1991.
[Car88a] Luca Cardelli. A semantics of multiple inheritance. Information and
Computation, 76:138-164, 1988.
[Car88b] Luca Cardelli. Structural subtyping and the notion of power type. In
Fifteenth Annual ACM Symposium on Principles of Programming Lan-
guages, 1988.
[CG92] Pierre-Louis Curien and Giorgio Ghelli. Coherence of subsumption, min-
imum typing and type-checking in F≤. Mathematical Structures in Com-
puter Science, 2:55-91, 1992.
[CL91] Luca Cardelli and Giuseppe Longo. A semantic basis for Quest. Journal
of Functional Programming, 1(4):417-458, 1991.
[Hay94] Susumu Hayashi. Singleton, union and intersection types for program
extraction. Information and Computation, 109, 1994.
[HL80] R. Hindley and G. Longo. Lambda calculus models and extensionality.
Z. Math. Logik Grundlag. Math., 26:289-310, 1980.
[HP91] Robert Harper and Robert Pollack. Type checking with universes. The-
oretical Computer Science, 89:107-136, 1991.
[KST94] Stefan Kahrs, Donald Sannella, and Andrzej Tarlecki. The definition of
Extended ML. Technical Report ECS-LFCS-94-300, LFCS, Department
of Computer Science, University of Edinburgh, 1994.
[SP94] Paula Severi and Erik Poll. Pure Type Systems with Definitions. In
Logical Foundations of Computer Science, LFCS'94, Lecture Notes in
Computer Science 813, pages 316-328. Springer-Verlag, 1994.
[SST92] Donald T. Sannella, Stefan Sokołowski, and Andrzej Tarlecki. Toward
formal development of programs from algebraic specifications: Parame-
terisation revisited. Acta Informatica, 29:689-736, 1992.
[SW83] Donald Sannella and Martin Wirsing. A kernel language for algebraic
specification and implementation. In Proceedings of International Con-
ference on Foundations of Computation Theory, Borgholm, Sweden, Lec-
ture Notes in Computer Science 158. Springer-Verlag, 1983.
A Subtyping for the Fisher-Honsell-Mitchell
Lambda Calculus of Objects
Abstract. Labeled types and a new relation between types are added
to the lambda calculus of objects as described in [6]. This relation is
a trade-off between the possibility of having a restricted form of width
subtyping and the features of the delegation-based language itself. The
original type inference system allows both specialization of the type of an
inherited method to the type of the inheriting object and static detection
of errors, such as 'message-not-understood'. The resulting calculus is an
extension of the original one. Type soundness follows from the subject
reduction property.
1 Introduction
When the message m is sent to ⟨e1 ←∘ m=e2⟩, the result is obtained by applying
e2 to ⟨e1 ←∘ m=e2⟩ (similarly for ⟨e1 ← m=e2⟩).
This form of self-application allows one to model the special symbol self of object-oriented
languages directly by lambda abstraction. Intuitively, the method body
e2 must be a function, and the first actual parameter of e2 will always be the
object itself. The type system of this calculus allows methods to be specialized
appropriately as they are inherited.
We consider the type of an object as the collection of the types of its methods.
The intuitive definition of width subtyping is then: σ is a subtype of τ if σ has
more methods than τ. The standard subsumption rule allows one to use an object
of type σ in any context expecting an object of type τ. In the original object
calculus of [6], no width subtyping is possible, because the addition of the method
m to the object e is allowed if and only if m does not occur in e. So, the object e
could not be replaced by an object e′ that already contains m.
Moreover, it is not possible to have depth subtyping, namely, to generalize
the types of methods that appear in the type of the object, because with method
override we could then give a type to an expression that produces run-time errors (a nice
example of [1] is translated into the original object calculus in [8]).
In this paper, we introduce a restricted form of subtyping, informally written
as σ ⊑ τ. This relation is a width subtyping, i.e., a type of an object is a subtype
of another type if the former has more methods than the latter. Subtyping is
constrained by one restriction: σ is a subtype of another type τ if and only if we
can assure that the methods of σ that are not methods of τ are not referred to
by the methods also in τ. The restriction is crucial to prevent the methods of τ
from referring to the forgotten methods of σ, causing a run-time error. The subtyping
relation allows one to forget methods in the type without changing the shape of the
object; it follows that we can type programs that accept as actual parameters
objects with more methods than could be expected. The information on which
methods are used is collected by introducing labeled types. A first consequence
of this relation is that it is possible to have an object in which a method
is, via a new operation, added more than once. For this reason, we introduce a
different symbol to indicate the method addition operation on objects, namely
⟨e1 ←∘ m=e2⟩.
The operation ←∘ behaves exactly as the method addition of [6], but it can be
used to add the same method more than once. In such an object,
the first addition of the method m is forgotten by the type inference system via
a subsumption rule. Our extension gives the following (positive) consequences:
- objects with extra methods can be used in any context where an object with
  fewer methods might be used,
- our subtyping relation does not cause the shortcomings described in [1].
The untyped lambda calculus enriched with object-related syntactic forms is
defined as follows:

e ::= x | c | λx.e | e1 e2 | ⟨⟩ | ⟨e1 ←∘ m=e2⟩ | ⟨e1 ← m=e2⟩ | e ⇐ m | e ↼ m | err,

where x is a term variable, c belongs to a fixed set of constants, and m is a
method name. The object forms are:
⟨⟩ the empty object;
⟨e1 ←∘ m=e2⟩ extends object e1 with a new method m having body e2;
⟨e1 ← m=e2⟩ replaces the body of method m in e1 by e2;
e ⇐ m sends message m to the object e;
e ↼ m searches for the body of the message m in the object e;
err the error object.
Notice that the last two object forms are not present in the original calculus
of [6]. Let ←∗ denote ←∘ or ←. The description of an object via ←∗ operations
is intensional, and the object corresponding to a sequence of ←∗ can be
extensionally defined as follows.
To send message m to the object e means applying the body of m to the object
itself. In fact, the body of m is a lambda abstraction whose first bound variable
will be substituted by the full object in the next step of β-reduction.
The problem that arises in the calculus of objects is how to extract the appropriate
method out of an object. The most natural way is to move the required
method into an accessible position (the most external one). This means treating
objects as sets of methods. Unfortunately, this approach is not possible: in fact,
the typing rules of objects depend on the order of ←∗ operations. For instance,
the typing of e3 in the object expression ⟨⟨e1 ←∗ m=e2⟩ ←∗ n=e3⟩ depends on
the typing of the "subobjects" e1 and ⟨e1 ←∗ m=e2⟩.
The approach chosen in [6] to solve the problem of method order is to add
to the →_eval relation a bookkeeping relation →_book. This relation leads to a standard
form, in which each method is defined exactly once (with the extension operation),
using some "dummy" bodies, and redefined exactly once (with the override
operation), giving it the desired body.
In our system the notion of standard form is of no use, since the subject reduction
property does not hold for the →_book part of the evaluation rules. On the other
hand, we can use the extra information contained in types to correctly type the
extraction of the bodies of methods from the objects (it will be clear how in
paragraph 3.2). Therefore, we propose the following operational semantics. We
list here only the most meaningful reduction rules. Appendix 1 contains the full
set of rules, which includes rules of error propagation. The evaluation relation is
the least congruence generated by these rules.
(β) (λx.e1) e2 →_eval [e2/x] e1
(⇐) e ⇐ m →_eval (e ↼ m) e
(succ ↼) ⟨e1 ←∗ m=e2⟩ ↼ m →_eval e2
(next ↼) ⟨e1 ←∗ n=e2⟩ ↼ m →_eval e1 ↼ m
(fail ⟨⟩) ⟨⟩ ↼ m →_eval err
(fail abs) λx.e ↼ m →_eval err
To send message m to the object e still means applying the body of m to the
object itself. The difference is that in our semantics the body of the method is
recursively searched for by the ↼ operator without modifying the shape of the full
object; if such a method does not exist, the object evaluates to error. Observe
that the rule (fail var) x ↼ m →_eval err is unsound, since the variable x could
be substituted (by applying a β-reduction) by an object containing the method
m.
The central part of the type system of an object-oriented language consists of
the types of objects. In [6], the type of an object is called a class-type. It has the
form:

class t.⟨⟨m1:τ1, ..., mk:τk⟩⟩,
where ⟨⟨m1:τ1, ..., mk:τk⟩⟩ is called a row expression. This type expression describes
the properties of any object e that can receive messages mi, (e ⇐ mi),
producing a result of type τi, for 1 ≤ i ≤ k. The bound variable t may appear in
τi, referring to the object itself. Thus, a class-type is a form of recursively-defined
type. As a simple example of types in [6], we consider the following point object:

point ≜ ⟨x=λself.0, mvx=λself.λdx.⟨self ← x=λs.(self ⇐ x) + dx⟩⟩,

with the following class-type: class t.⟨⟨x:int, mvx:int→t⟩⟩.
A significant aspect of this type system is that the type (int→t) of method
mvx does not change syntactically if we perform a method addition of a method
color to build a colored_point object from point. Instead, the meaning of the
type changes, since, before the color addition, the bound variable t referred to
an object of type point, and afterwards t refers to an object of type colored_point.
So the type of a method may change when a method is inherited: the authors
of [6] called this property method specialization (also called mytype specialization
in [8]). The typing rules assure that every possible type for an added method
will be correct, and this is done via a sort of implicit higher-order polymorphism.
To allow subtyping, we add a new sort of types, the labeled types, which carry
information about the methods used to type a certain method body.
This information is given by a subscript which is a set of method names. The
methods used to type a body are roughly the method names which occur in the
body itself. For example, suppose that the object e1 has a method m with body
e2, and that in e2 a message n is sent to the bound variable self and a method n' (of
e1) is overridden. Then the type τ of e2 is subscripted by the set {n, n'}, since e2
uses n and n'. These labeled types are written inside the row of the class-type and
they do not appear externally. Therefore, in our system the object point will
have the following class-type: class t.⟨⟨x:int{}, mvx:(int→t){x}⟩⟩.
By subtyping, we can forget those methods that are not used by the other meth-
ods in the object, i.e., a method is forgettable if and only if it does not appear in
the labels of the types of the remaining methods. This dependency is correctly
handled in the typing rules for adding and overriding methods (i.e., (obj ext)
and (obj over)), where the labels of types are created. We refer to Section 3.4 for
some meaningful examples.
The type expressions include type constants, type variables, function types and
class-types. In this paper, a term will be an object of the calculus, or a type, or
a row, or a kind.

The symbols σ, τ, ρ, ... are metavariables over types; ι ranges over type
constants; t, self, ... range over type variables; Δ ranges over labels; α, β, ...
range over labeled types; R and r range over rows and row variables respectively;
m, n, ... range over method names, and κ ranges over kinds. The symbols a, b,
c, ... range over term variables or constants; u, v, ... range over type and row
variables; U, V, ... range over type, row, and kind expressions, and, finally, A,
B, C, ... range over terms. All symbols may appear indexed. The sets of types,
rows and kinds are mutually defined by the following grammar:
Types          τ ::= ι | t | τ→τ | class t.R
Labels         Δ ::= { } | Δ ∪ {m}
Labeled Types  α ::= τΔ
Rows           R ::= r | ⟨⟨⟩⟩ | ⟨⟨R | m:α⟩⟩ | λt.R | Rτ
Kinds          κ ::= T | T^n→[m1, ..., mk]   (n ≥ 0, k ≥ 0).
We say that τ is the type and Δ is the label of the labeled type τΔ.
The row expressions appear as subexpressions of class-type expressions, with
rows and types distinguished by kinds. Intuitively, the elements of kind [m1, ..., mk]
are rows that do not include the method names m1, ..., mk. We need this information
to guarantee statically that methods are not multiply defined. In what follows,
we use the notation m̄:τ̄ as short for m1:τ1, ..., mk:τk and m̄:ᾱ as short for
m1:α1, ..., mk:αk.

We say that a row R' is a subrow of a row R if and only if R = ⟨⟨R' | m̄:ᾱ⟩⟩, for
suitable m̄ and ᾱ.
The set S(R) of labels of a row R is inductively defined by:

S(⟨⟨⟩⟩) = S(r) = { },  S(⟨⟨R | m:τΔ⟩⟩) = S(R) ∪ Δ,  S(λt.R) = S(R),  and  S(Rτ) = S(R).
In our system, the contexts are defined as follows:

Γ ::= ε | Γ, x:τ | Γ, t:T | Γ, r:κ | Γ, t1 ≼ t2,

and the judgement forms are:

Γ ⊢ ∗    Γ ⊢ R:κ    Γ ⊢ τ:T    Γ ⊢ τ1 ≼ τ2    Γ ⊢ e:τ.

The judgement Γ ⊢ ∗ can be read as "Γ is a well-formed context". The meaning
of the other judgements is the usual one.
In this subsection we discuss all the typing rules which are new with respect to
[6], except for the subsumption rule, which will be discussed in the next section.
More precisely, we present the rules for extending an object with a new method
or for re-defining an existing one with a new body, the rule for searching method
bodies and the rule for sending messages. The remaining rules of the type system
are presented in Appendix 2.

We can assume, without loss of generality, that the order of methods inside
rows can be arbitrarily modified: this assumption allows us to write any method as
the last method listed in the class-type. A formal definition of type equality is
given in Appendix 2.
The (obj ext) rule performs a method addition, producing the new object
⟨e1 ←∘ n=e2⟩. This rule always adds the method to the syntactic object, in case
the latter is not present, or in case it is present in the object but was previously
forgotten in the type by an application of the subtyping rule (sub ≼). Another
task performed by this rule is to build the labeled type τ{m̄} for the new method
n, where the label {m̄} represents the set of all methods of e1 that are useful to
type n's body.
with respect to the ≼ relation). The full subtyping system is given in Appendix
2. Let two class-types τ1 and τ2 be given, such that the judgement Γ ⊢ τ1 ≼ τ2
is derivable and the object e is of type τ1. The (sub ≼) rule says that we can
also derive type τ2 for e. It follows that the object e can be used in any context
in which an object of type τ2 is required. The possibility of giving more types to
the same object makes our calculus more expressive than the original one.

          Γ ⊢ e:τ1    Γ ⊢ τ1 ≼ τ2
(sub ≼)   ───────────────────────
                Γ ⊢ e:τ2.

Using the (sub ≼) rule we can obtain judgements of the shape Γ ⊢ e : class t.R,
where n is a method of e but Γ ⊢ R : [n]. In this case we say that this rule forgets
the method n. It is important to remark that, when a method is forgotten in the
type of an object, it is as if it had never been added to the object.
3.4 Examples

In this section we present two examples: the first shows how our subtyping
relation works on a critical example of [1]. The second example gives a simple
object, typable in our calculus, but not typable in the original calculus.

Example 1. Consider the following objects:

p1 =def ⟨x=λself.0, mvx=λself.λdx.⟨self ← x=λs.(self ⇐ x) + dx⟩⟩
p2 =def ⟨x=λself.1, y=λself.0, mvx=λself.λdx.⟨self ← x=λs.(self ⇐ x) + dx⟩,
         mvy=λself.λdy.⟨self ← y=λs.(self ⇐ y) + dy⟩⟩,

we can derive ⊢ p1 : P1 and ⊢ p2 : P2, where

P1 =def class t.⟨⟨x:int, mvx:(int→t){x}⟩⟩
P2 =def class t.⟨⟨x:int, y:int, mvx:(int→t){x}, mvy:(int→t){y}⟩⟩,

and int stands for int{}. It is easy to verify that in our system P2 ≼ P1. This
relation between P2 and P1 is the one we want to have, since it is the intuitive
relation between a one-dimensional point and a two-dimensional point. If we
modify p1 and p2 as follows:
p1' =def ⟨p1 ← mvx=λself.λdx.self⟩
p2' =def ⟨⟨p2 ← x=λself.(((self ⇐ mvx) 1) ⇐ y)⟩ ← mvx=λself.λdx.self⟩,

we can derive ⊢ p1' : P1' and ⊢ p2' : P2', where

P1' =def class t.⟨⟨x:int, mvx:(int→t)⟩⟩
P2' =def class t.⟨⟨x:int{mvx,y}, y:int, mvx:(int→t), mvy:(int→t){y}⟩⟩.
Now P2' ⋠ P1', because we cannot forget the y method, since the x method
uses it. Therefore, we are unable to assign type P1' to the object p2'. In this
way, we avoid the so-called message-not-understood error. In fact, if we allowed
P2' ≼ P1', we would get ⊢ p2' : P1' by subtyping. Then, it would be possible to
override the mvx method of p2' by a body that has an output of type P1'. Since
the x method of p2' uses y, this would produce a run-time error. Let us formalize
this situation (the original pattern appears in [1], paragraph 5.4). Suppose we
override the object p2' as follows:

p2'' =def ⟨p2' ← mvx=λself.λdx.p1'⟩.

If we send message x to p2'', then an error occurs, since the body of x sends
the message y to the object ((self ⇐ mvx) 1), but this object does not have any y
method.
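The label bookkeeping behind these two examples can be sketched concretely. The representation below (rows as Python dictionaries mapping a method name to a pair of a type string and a label set) is ours, not the paper's; can_forget implements the criterion that a set of methods may be forgotten by subtyping iff none of its names occurs in the labels of the remaining methods.

```python
# Hypothetical encoding of rows: {method: (type, label_set)}.

def labels(row):
    out = set()                      # S(R): union of all labels in the row
    for _ty, lab in row.values():
        out |= lab
    return out

def can_forget(row, ms):
    # ms can be forgotten iff no remaining method's label mentions it
    rest = {n: a for n, a in row.items() if n not in ms}
    return not (ms & labels(rest))

P2  = {"x": ("int", set()), "y": ("int", set()),
       "mvx": ("int->t", {"x"}), "mvy": ("int->t", {"y"})}
P2p = {"x": ("int", {"mvx", "y"}), "y": ("int", set()),
       "mvx": ("int->t", set()), "mvy": ("int->t", {"y"})}

assert can_forget(P2, {"y", "mvy"})       # hence P2 <= P1
assert not can_forget(P2p, {"y", "mvy"})  # x still uses y: P2' !<= P1'
```

Note that y alone is not forgettable from P2 either, since mvy's label mentions it; y and mvy must be forgotten together, which is exactly what happens in P2 ≼ P1.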
Example 2. Consider the object draw that can receive two messages: figure,
which describes a geometrical figure, and plot, which, given a point, colors it black
or white, depending on the position of the point with respect to the figure. The
object draw accepts as input both a colored point and a point. This would be
impossible in the original system of [6], since there one would have to write two
different objects, one for colored points and one for points, with different bodies
for the method plot. In fact, for colored points we need an override instead of
an extension. For the object draw:

draw =def ⟨figure=λself.λdx.λdy.(dy=f(dx)),
           plot=λself.λp.if (self ⇐ figure)(p ⇐ x)(p ⇐ y)
                then ⟨p ←∘ col=λself.black⟩ else ⟨p ←∘ col=λself.white⟩⟩,

we can derive ⊢ draw : DR, where
In this section we will show that our extension has all the good properties of
the original system. We follow the same pattern as [7]: first we introduce some
substitution lemmas and then the notion of derivation in normal form, which
simplifies the proofs of the technical lemmas and the proof of the subject reduction
theorem.

The following lemma is useful both to show a substitution property on type
and kind derivations and to specialize class-types with additional methods. Let
U ◦ V stand for U : V or U ≼ V.
4.1 Normal Form
It is well known that equality rules in proof systems usually complicate deriva-
tions, and make theorems and lemmas more difficult to prove. These rules intro-
duce many unessential judgement derivations. In this subsection, we introduce
the notions of normal form derivation and of type and row in normal form, re-
spectively denoted by ⊢N and τnf as in [7]. Although it is not possible to derive
all judgements of the system by means of these derivations, we will show that all
judgements whose rows and type expressions are in τnf are ⊢N-derivable. Using
this, we can prove the subject reduction theorem using only ⊢N derivations.
4.2 Technical Lemmas
We are going to state some technical lemmas, necessary to prove some parts of
the subject reduction theorem. They essentially say that each component of a
judgement is well-formed.

The following lemma shows that the contexts are well-formed in every judge-
ment and allows us to treat contexts, which are lists, more like sets. Moreover, it
enables us to build well-formed row expressions.
Proof. It is enough to prove that each of the basic evaluation steps preserves the
type of the expression being reduced. We show the derivation for the left-hand
side of each rule (considering the most difficult cases, in which the (sub ≼) rule
is applied after each other rule) and then we build the correct derivation for the
right-hand side. For the rules (succ ←↩) and (next ←↩), we consider the ←∘ cases
only. The ← cases are similar.

• (β) (λx.e1)e2 →eval [e2/x]e1. This case follows from Lemma 9.

• (⇐) e ⇐ m →eval (e ←↩ m)e. This case follows from Lemma 8(2) and
Lemma 7(4).
• (succ ←↩) ⟨e1 ←∘ n=e2⟩ ←↩ n →eval e2. In this case the left-hand side is:

  Γ ⊢N e1 : class t.⟨⟨R | m̄:ᾱ⟩⟩    Γ, t:T ⊢N R : [m̄,n]    n ∉ S(⟨⟨R | m̄:ᾱ⟩⟩)
  Γ, r:T→[m̄,n] ⊢N e2 : [class t.⟨⟨rt | m̄:ᾱ, n:τ{m̄}⟩⟩/t](t→τ)    r not in τ
  ──────────────────────────────────────────────────────────── (obj ext)
  Γ ⊢N ⟨e1 ←∘ n=e2⟩ : class t.⟨⟨R | m̄:ᾱ, n:τ{m̄}⟩⟩
  ──────────────────────────────────────────────────────────── (sub ≼)
  Γ ⊢N ⟨e1 ←∘ n=e2⟩ : class t.⟨⟨R'' | m̄:ᾱ, n:τ{m̄}⟩⟩    Γ, t:T ⊢N R'' : [m̄,n]
  ──────────────────────────────────────────────────────────── (m search)
  Γ ⊢N ⟨e1 ←∘ n=e2⟩ ←↩ n : [class t.⟨⟨R'' | m̄:ᾱ, n:τ{m̄}⟩⟩/t](t→τ)
  ──────────────────────────────────────────────────────────── (sub ≼)
  Γ ⊢N ⟨e1 ←∘ n=e2⟩ ←↩ n : σ,

where (m search) abbreviates (meth search). Observe that it is not possible to
forget n in the first application of the (sub ≼) rule, since afterwards we have to type
its search. Moreover, it is not possible to forget any of the m̄ methods, since n
uses them.
From Γ, t:T ⊢N R'' : [m̄,n], by applying the (rowfn abs) rule, we get Γ ⊢N λt.R'' :
T→[m̄,n]. This, together with Γ, r:T→[m̄,n] ⊢N e2 : [class t.⟨⟨rt | m̄:ᾱ, n:τ{m̄}⟩⟩/t](t→τ),
implies, by Lemma 2(4), Γ ⊢ e2 : [class t.⟨⟨(λt.R'')t | m̄:ᾱ, n:τ{m̄}⟩⟩/t](t→τ). So,
by Lemma 6, we can conclude Γ ⊢N e2 : [class t.⟨⟨R'' | m̄:ᾱ, n:τ{m̄}⟩⟩/t](t→τ). Then
we build a derivation for the right-hand side as follows:

  Γ ⊢N e2 : [class t.⟨⟨R'' | m̄:ᾱ, n:τ{m̄}⟩⟩/t](t→τ)
  ──────────────────────────────────────────── (sub ≼)
  Γ ⊢N e2 : σ.

This proof shows the usefulness of labeled types. Thanks to the label {m̄} (which
contains all the methods used by n) it is possible to reconstruct the correct type
of e2 in the derivation of the left-hand side of the rule, and therefore the correct
type of the right-hand side.
• (next ←↩) ⟨e1 ←∘ m=e2⟩ ←↩ n →eval e1 ←↩ n.

There are two possible cases, according to whether the labeled type of m
contains n or not. We consider the first case, the second being similar. The
left-hand side is:

  D    Γ, t:T ⊢N R'' : [m̄, n]
  ──────────────────────────────────────────────────────── (meth search)
  Γ ⊢N ⟨e1 ←∘ m=e2⟩ ←↩ n : [class t.⟨⟨R'' | m̄:ᾱ, n:τ{m̄'}⟩⟩/t](t→τ)
  ──────────────────────────────────────────────────────── (sub ≼)
  Γ ⊢N ⟨e1 ←∘ m=e2⟩ ←↩ n : σ,

where D is the following derivation:

  Γ ⊢N e1 : class t.⟨⟨R | m̄:ᾱ, n:τ{m̄'}⟩⟩    Γ, t:T ⊢N R : [m̄,n,m]
  Γ, r:T→[m̄,m,n] ⊢N e2 : [class t.⟨⟨rt | m̄:ᾱ, n:τ{m̄'}, m:ρ{p̄,n}⟩⟩/t](t→ρ)
  m ∉ S(⟨⟨R | m̄:ᾱ, n:τ{m̄'}⟩⟩)    r not in ρ
  ──────────────────────────────────────────────────────── (obj ext)
  Γ ⊢N ⟨e1 ←∘ m=e2⟩ : class t.⟨⟨R | m̄:ᾱ, n:τ{m̄'}, m:ρ{p̄,n}⟩⟩
  ──────────────────────────────────────────────────────── (sub ≼)
  Γ ⊢N ⟨e1 ←∘ m=e2⟩ : class t.⟨⟨R' | m̄:ᾱ, n:τ{m̄'}⟩⟩.

The correctness of the application of the (sub ≼) rule in D implies that ⟨⟨R' | m̄:ᾱ, n:τ{m̄'}⟩⟩
is a subrow of ⟨⟨R | m̄:ᾱ, n:τ{m̄'}, m:ρ{p̄,n}⟩⟩. Moreover, the condition m ∉ S(⟨⟨R |
m̄:ᾱ, n:τ{m̄'}⟩⟩) implies m ∉ {m̄'}. Therefore, ⟨⟨R | m̄:ᾱ, n:τ{m̄'}⟩⟩ = ⟨⟨R''' | m̄:ᾱ, n:τ{m̄'}⟩⟩,
for a suitable R'''. So we can build the derivation for the right-hand side:

  Γ ⊢N e1 : class t.⟨⟨R''' | m̄:ᾱ, n:τ{m̄'}⟩⟩    Γ, t:T ⊢N R'' : [m̄,n]
  ──────────────────────────────────────────── (meth search)
  Γ ⊢N e1 ←↩ n : [class t.⟨⟨R'' | m̄:ᾱ, n:τ{m̄'}⟩⟩/t](t→τ)
  ──────────────────────────────────────────── (sub ≼)
  Γ ⊢N e1 ←↩ n : σ.
Clearly, the left-hand sides of all other evaluation rules cannot be typed. □
The subject reduction proof shows the power of our typing system. Labeled
types not only allow a restricted form of subtyping that enriches the set of types
of typable objects, but they also lead us to a simpler and more natural
operational semantics, in which no transformations on the objects are necessary
to get the body of a message. In fact, the typing rule for the ←↩ operation
is strictly based on the information given by the labels. Moreover, since an evaluation
produces the object err when a message m is sent to an expression which does not
define an object with a method m, the type soundness follows from Theorem 10.

Theorem 11. If Γ ⊢N e : τ is derivable for some Γ and τ, then the evaluation
of e cannot produce err, i.e. e does not evaluate to err. □
Notice that in [7] the type soundness was proved by introducing a structural
operational semantics and showing suitable properties.
5 Conclusions
This paper extends the delegation-based calculus of objects of [6] with a subtyp-
ing relation between types. This new relation states that two types of objects
with different numbers of methods can be subsumed under certain restrictions.
This restricted form of subsumption is conservative with respect to the features
of delegation-based languages which are present in the original system.

Among the other proposals we have studied that allow width subsumption,
one solution is to explicitly coerce an object with more methods into an object
with fewer methods, by expanding each call of a method that does not belong to
the smaller type with the proper body of that method. This solution requires a
new subsumption rule that performs this job explicitly; this rule creates a quite
different object but allows us to eliminate the labels and the restrictions on them.
Further goals of our research are:
- finding a denotational model for the calculus,
- adding some mechanism to model encapsulation,
- determining if the type-checking of this calculus is decidable.
Acknowledgements
The authors wish to thank Mariangiola Dezani-Ciancaglini for her invaluable
technical support and encouragement, Furio Honsell and Kathleen Fisher for
their helpful comments, and Steffen van Bakel and Paola Giannini for their careful
reading of the preliminary versions of the paper.
References
1. Abadi, M., Cardelli, L., A Theory of Primitive Objects. Manuscript, 1994. Also in
Proc. Theoretical Aspects of Computer Software, LNCS 789, Springer-Verlag, 1994,
pp. 296-320.
2. Bellè, G., Some Remarks on Lambda Calculus of Objects. Internal Report, Dipar-
timento di Matematica ed Informatica, Università di Udine, 1994.
3. Bono, V., Liquori, L., A Subtyping for the Fisher-Honsell-Mitchell Lambda
Calculus of Objects. Full version, available as ftp://pianeta.di.unito.it/pub
/LAMBDA/liquori/subtyping.ps.Z, 1994.
4. Borning, A.H., Ingalls, D.H., A Type Declaration and Inference System for
Smalltalk. In Proc. ACM Symp. Principles of Programming Languages, ACM
Press, 1982, pp. 133-141.
5. Ellis, M., Stroustrup, B., The Annotated C++ Reference Manual. Addison-Wesley,
1990.
6. Fisher, K., Honsell, F., Mitchell, J. C., A Lambda Calculus of Objects and Method
Specialization. In Proc. 8th Annual IEEE Symposium on Logic in Computer Sci-
ence, IEEE Computer Society Press, 1993, pp. 26-38.
7. Fisher, K., Honsell, F., Mitchell, J. C., A Lambda Calculus of Objects and Method
Specialization. Nordic Journal of Computing, 1(1), 1994, pp. 3-37.
8. Fisher, K., Mitchell, J. C., Notes on Typed Object-Oriented Programming. In Proc.
Theoretical Aspects of Computer Software, LNCS 789, Springer-Verlag, 1994, pp.
844-885.
Appendix 1: Evaluation rules

(β)        (λx.e1)e2 →eval [e2/x]e1          (err abs)  λx.err →eval err
(⇐)        e ⇐ m →eval (e ←↩ m)e             (err appl) err e →eval err
(succ ←↩)  ⟨e1 ←∘ m=e2⟩ ←↩ m →eval e2        (err ←↩)   err ←↩ m →eval err
(next ←↩)  ⟨e1 ←∘ n=e2⟩ ←↩ m →eval e1 ←↩ m   (fail ⟨⟩)  ⟨⟩ ←↩ m →eval err
(err ←)    ⟨err ← m=e⟩ →eval err             (fail abs) λx.e ←↩ m →eval err
Appendix 2: Typing rules

General Rules / Rules for type expressions

(start)  ─────        (type var)  Γ ⊢ ∗    t ∉ dom(Γ)
         ε ⊢ ∗                    ───────────────────
                                  Γ, t:T ⊢ ∗

Rules for assigning types to terms

(exp var)  Γ ⊢ τ:T    x ∉ dom(Γ)        (empty obj)  Γ ⊢ ∗
           ─────────────────────                     ────────────────────
           Γ, x:τ ⊢ ∗                                Γ ⊢ ⟨⟩ : class t.⟨⟨⟩⟩

(obj over)
  Γ ⊢ e1 : class t.⟨⟨R | m̄:ᾱ, n:τΔ⟩⟩
  Γ, r:T→[m̄,n] ⊢ e2 : [class t.⟨⟨rt | m̄:ᾱ, n:τ{m̄}⟩⟩/t](t→τ)    r not in τ
  ────────────────────────────────────────────────────────────
  Γ ⊢ ⟨e1 ← n=e2⟩ : class t.⟨⟨R | m̄:ᾱ, n:τ{m̄}⟩⟩

Rules of subtyping

(width ≼)  Γ ⊢ class t.⟨⟨R | m̄:ᾱ⟩⟩ : T    m̄ ∩ S(R) = { }
           ──────────────────────────────────────────────
           Γ ⊢ class t.⟨⟨R | m̄:ᾱ⟩⟩ ≼ class t.R
Torben Braüner
BRICS**
Department of Computer Science
University of Aarhus
Ny Munkegade
DK-8000 Aarhus C, Denmark
Internet: tor@daimi.aau.dk
1 Introduction
Linear Logic was discovered by J.-Y. Girard in 1987 and published in the now fa-
mous paper [Gir87]. In the abstract of this paper, it is stated that "a completely
new approach to the whole area between constructive logics and computer sci-
ence is initiated". Since then, a lot of work has been done to corroborate this
claim. This paper deals with a Curry-Howard interpretation of the intuitionistic
fragment of Linear Logic, appropriate for recursion.
The original Curry-Howard isomorphism, [How80], relates the natural deduc-
tion formulation of Intuitionistic Logic to the λ-calculus; formulas correspond to
types, proofs to terms, and normalisation of proofs to reduction of terms. The
fundamental idea of categorical logic is that formulas are interpreted as objects,
proof-rules as natural operations on maps, and proofs as maps. We can give a
sound categorical interpretation of Intuitionistic Logic in a cartesian closed cat-
egory, so given the above mentioned Curry-Howard isomorphism, this induces
a sound categorical interpretation of the λ-calculus. In the present paper this
interpretation will be extended to deal with the λ-calculus with an additional
rule for recursion, the λrec-calculus.
* A full version of this paper is available as Technical Report BRICS-RS-95-13.
** Basic Research in Computer Science,
Centre of the Danish National Research Foundation.
f : A × B → B

Now, one can show that a cartesian closed category has a fixpoint operator
iff every map f : A × B → B has a fixpoint, but we will show here a more
informative result:
2.3 Linear Fixpoints

f : !A ⊗ B → B

f† : !A → B

such that f† is a linear fixpoint of f, and such that the operation is natural in
!A with respect to maps in the image of γ.
The definitions of linear fixpoints and (external) linear fixpoint operators can
be generalised to an arbitrary number of parameters; however, the definitions
presented here are appropriate for the purpose of this article, so we will not
pursue this further. One can show that a closed !-category has a linear fixpoint
operator iff every map f : !A ⊗ B → B has a linear fixpoint, but we will show
here a more informative result:

Proposition 2.11 There is a bijection between external linear fixpoint operators
and linear fixpoint operators in a closed !-category.
The definition of a linear fixpoint in C is the definition of a fixpoint in the
category of free coalgebras stated in terms of maps in C, which entails that:

Lemma 2.12 Let C be a !-category. Given maps of coalgebras
h : (!A, δ) → (!B, δ) and f : (!A, δ) ⊗ (!B, δ) → (!B, δ), we have that h is a fixpoint
of f iff ε(h) : !A → B is a linear fixpoint of ε(f) : !A ⊗ !B → B.

Proof. The following calculation proves the result:
Theorem 2.14 Let C be a closed !-category such that the category of free coalge-
bras is closed under finite products. There is a bijection between fixpoint operators
in the category of free coalgebras and linear fixpoint operators in C.

An example of a closed !-category with finite products, finite sums, and a linear
fixpoint operator is the category of CPOs and strict continuous functions. This
category has a linear fixpoint operator because the induced category of free
coalgebras is equivalent to the category of CPOs and continuous functions, a
cartesian closed category with a fixpoint operator.

In [Bra94b] another example of a closed !-category with finite products and
a linear fixpoint operator is given, namely the category of dI domains and join
preserving stable functions.
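For the CPO example, the fixpoint operator can be pictured by Kleene iteration: the least fixpoint of a continuous function is the limit of its iterates starting from the bottom element. The sketch below is our illustration, not part of the paper; bottom is approximated by an everywhere-undefined function and the limit by a fixed number of iterations.

```python
# Kleene iteration: fix(F) is the limit of F^n(bottom) in a CPO of
# partial functions, approximated here by finitely many iterations.

def fix(F, n=50):
    f = lambda x: None            # bottom: the everywhere-undefined function
    for _ in range(n):
        f = F(f)                  # next Kleene approximant F^(k+1)(bottom)
    return f

# factorial as the least fixpoint of its defining functional
fact = fix(lambda f: lambda n: 1 if n == 0 else n * f(n - 1))
assert fact(5) == 120
```

Each approximant is defined on one more level of recursion than the previous one, mirroring the chain whose least upper bound the fixpoint operator picks out.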
3 The λrec-Calculus

3.1 Definition of the λrec-Calculus

One can show that the λrec-calculus satisfies the Substitution Property, entailing
that the rule satisfies Subject Reduction, that is, typing is preserved by an
application of the reduction rule. Instead of equipping the λrec-calculus with the
mentioned reduction rules, one could define an operational semantics in natural
semantics style. This is done in [Win93].
4 The Linear λrec-Calculus

where "t, ..., t" means a sequence of n occurrences of t. Rules for assignment
of types to terms are given in Appendix B. Type assignments have the form
of sequents x1 : A1, ..., xn : An ⊢ u : A where x1, ..., xn are pairwise distinct
variables. Note that the definition of sequents implicitly restricts use of the rules.
It is for example not possible to use the (⊗E) rule if Γ and Δ have common
variables. The terms together with the typing rules for the fragment without
recursion will be called the linear λ-calculus. The linear λ-calculus is essentially
the same as the calculus given in [BBdPH92]. The extension with recursion will
be called the linear λrec-calculus.

In what follows, the expression v̄ means v1, ..., vn, "copy v̄ as z̄, ȳ in u"
means copy v1 as z1, y1 in (...copy vn as zn, yn in u...), and "discard v̄ in u"
means discard v1 in (...discard vn in u...).
The reduction rules for terms of the linear λ-calculus, [BBdPH92], can be
extended with a reduction rule for the term corresponding to recursion:

One can show that the linear λrec-calculus satisfies the Substitution Property,
entailing that the rule satisfies Subject Reduction. Instead of equipping the linear
λrec-calculus with the mentioned reduction rules, one could define an operational
semantics in natural semantics style. This is dealt with in [Bra94a].
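The implicit restriction on sequents, that a context can only be split into disjoint parts, amounts to each context variable being used exactly once in a term of the linear λ-calculus. A small sketch of such a linearity check follows; the term representation is an ad-hoc one of ours, and shadowing of bound variables is ignored for brevity.

```python
# Terms: ("var", x), ("app", t, u), ("lam", x, t).  A sequent
# x1:A1, ..., xn:An |- u : A is linear iff each xi occurs exactly once in u.

from collections import Counter

def occurrences(t, counts):
    tag = t[0]
    if tag == "var":
        counts[t[1]] += 1
    elif tag == "app":
        occurrences(t[1], counts)
        occurrences(t[2], counts)
    elif tag == "lam":
        occurrences(t[2], counts)   # body only; shadowing ignored here
    return counts

def linear(free_vars, t):
    counts = occurrences(t, Counter())
    return all(counts[x] == 1 for x in free_vars)

swap = ("app", ("var", "f"), ("app", ("var", "y"), ("var", "x")))
dup  = ("app", ("var", "f"), ("app", ("var", "x"), ("var", "x")))

assert linear({"f", "x", "y"}, swap)
assert not linear({"f", "x"}, dup)   # x used twice: needs copy / !A
```

Duplicating or discarding a variable is only possible through the explicit copy and discard constructs, which is where the exponential !A enters.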
4.2 The Curry-Howard Isomorphism (Extended with Recursion)
Γ1° ⊗ ... ⊗ Γn° → !A1° ⊗ ... ⊗ !An° → !(!B° ⊸ B°) → B°
It can be shown that the interpretation is sound with respect to the reduction
rule for the term corresponding to recursion. This is so because the reduction rule
is essentially a syntactic restatement of the defining equation of a (generalised)
linear fixpoint. The interpretation is also sound with respect to the reduction
rules for terms of the linear A-calculus, as shown in [BBdPH92].
5.2 Soundness

If C is a closed !-category with finite products and a linear fixpoint operator, then
the category of free coalgebras is a cartesian closed category with a fixpoint op-
erator, as the previous results show. We can therefore interpret types and derivable
sequents in the λrec-calculus as objects and arrows in the category of free coal-
gebras. It turns out that the interpretation of a type A in the λrec-calculus can
be written in a simple way using the Girard Translation at the level of formulas,
namely [A] = (![A°], δ). Note that A, a type of the λrec-calculus, is interpreted
in the category of free coalgebras, and that A°, a type of the linear λrec-calculus,
is interpreted in C. Let the following composite be denoted by lin:

C′([A1] × ... × [An], [B]) = C′((![A1°], δ) × ... × (![An°], δ), (![B°], δ))
                           ≅ C′((![A1°], δ) ⊗ ... ⊗ (![An°], δ), (![B°], δ))
                           → C(![A1°] ⊗ ... ⊗ ![An°], B°)

where C′ denotes the category of free coalgebras. We are now ready to state a result
showing that the extended Girard Translation is sound with respect to the above
mentioned categorical interpretations. The result essentially says that the extended
Girard Translation corresponds to the adjunction between the category of free
coalgebras and C, or to be precise, to the function lin. Recall that (x1 : A1, ..., xn :
An ⊢ u : B)° is a derivable sequent in the linear λrec-calculus of the shape
x1 : !A1°, ..., xn : !An° ⊢ u° : B°.
Theorem 5.1 (Soundness) Let C be a closed !-category with finite products and
a linear fixpoint operator. If x1 : A1, ..., xn : An ⊢ u : B is a derivable sequent in
the λrec-calculus, then

lin([x1 : A1, ..., xn : An ⊢ u : B]) = [(x1 : A1, ..., xn : An ⊢ u : B)°].
6 Remarks on Possible Extensions and Future Work

6.2 (Linear) Fixpoint Objects
Let C be a closed !-category with finite products. The category of free coalgebras
is cartesian closed as shown above, and moreover, it can be shown to have a
strong monad (T, η, μ, t) where the functor T is the composite of the forgetful
and free functors. This enables us to define a fixpoint object, [CP90].

With this definition, we can get a result saying that C has a linear fixpoint object
iff the category of free coalgebras has a fixpoint object. But if the category of
free coalgebras has a fixpoint object, then it has a fixpoint operator, which is the
same as having a linear fixpoint operator in C. We conclude that if C has a linear
fixpoint object, then it has a linear fixpoint operator. Another way to show this
would be to prove the existence of a linear fixpoint operator directly from the
existence of a linear fixpoint object in C, without leaving the linear world.
The λ-calculus can be extended with recursion at the level of types, that is, we
have additional types μX.A, where X is a type variable, typing rules

Γ ⊢ t : A[μX.A/X]          Γ ⊢ t : μX.A
─────────────────          ──────────────────────
Γ ⊢ abs(t) : μX.A          Γ ⊢ rep(t) : A[μX.A/X]

and the reduction rule rep(abs(t)) ⇝ t. We are then able to define recursion
at the level of terms as in the λrec-calculus. Let Γ, z : B ⊢ u : B be given. If C is
an abbreviation for the type μX.(X ⇒ B), and Γ ⊢ a : C is an abbreviation for
Γ ⊢ abs(λx.((λz.u)(rep(x)x))) : C, then Γ ⊢ rep(a)a : B satisfies the reduction
rule for recursion in the λrec-calculus. This is essentially a syntactic restatement
of a result in [Law69] saying that if f : C → C ⇒ B is a weakly point-surjective
map in a cartesian closed category, then every endomap on B has a fixpoint.

Similarly, we are able to define recursion at the level of terms as in the linear
λrec-calculus when the linear λ-calculus is extended with the above mentioned
rules for recursion at the level of types. Note that the !-free fragment of this
system is strongly normalising because the underlying proof of a term shrinks
during each reduction step.
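With abs and rep erased (they are identities at runtime), the construction above is ordinary recursion by self-application, as in Lawvere's argument. The sketch below is our illustration, not part of the calculus; since Python is call-by-value, the self-application is eta-expanded to delay evaluation.

```python
# rec z.u as self-application: a = lam x. (lam z. u)(x x), result a(a).
# The eta-expansion (lambda v: x(x)(v)) delays x(x) so the definition
# does not diverge under call-by-value evaluation.

def rec(u):
    # u : B -> B (as a functional); returns its fixpoint by self-application
    a = lambda x: u(lambda v: x(x)(v))
    return a(a)

fact = rec(lambda f: lambda n: 1 if n == 0 else n * f(n - 1))
assert fact(6) == 720
```

The untyped self-application x x is exactly what the recursive type μX.(X ⇒ B) makes typable via abs and rep.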
References
[Abr90] S. Abramsky. Computational interpretations of linear logic. Technical
Report 90/20, Department of Computing, Imperial College, 1990.
[BBdPH92] N. Benton, G. Bierman, V. de Paiva, and M. Hyland. Term assignment
for intuitionistic linear logic. Technical Report 262, Computer Laboratory,
University of Cambridge, 1992.
[Bie94] G. Bierman. On Intuitionistic Linear Logic. PhD thesis, Computer Labo-
ratory, University of Cambridge, 1994.
[Bra94a] T. Braüner. A general adequacy result for a linear functional language.
Technical Report BRICS-RS-94-22, BRICS, Department of Computer Sci-
ence, University of Aarhus, August 1994. Manuscript presented at MFPS '94.
[Bra94b] T. Braüner. A model of intuitionistic affine logic from stable domain the-
ory. In Proceedings of ICALP '94, LNCS, volume 820. Springer-Verlag,
1994.
[CP90] R. L. Crole and A. M. Pitts. New foundations for fixpoint computations.
In 5th LICS Conference. IEEE, 1990.
[Gir87] J.-Y. Girard. Linear logic. Theoretical Computer Science, 50, 1987.
[GLT89] J.-Y. Girard, Y. Lafont, and P. Taylor. Proofs and Types. Cambridge
University Press, 1989.
[How80] W. A. Howard. The formulae-as-type notion of construction. In J. R.
Hindley and J. P. Seldin, editors, To H. B. Curry: Essays on Combinatory
Logic, Lambda Calculus and Formalism. Academic Press, 1980.
[HP90] H. Huwig and A. Poigne. A note on inconsistencies caused by fixpoints in
a cartesian closed category. Theoretical Computer Science, 73, 1990.
[Law69] F. W. Lawvere. Diagonal arguments and cartesian closed categories. In
P. Hilton, editor, Category Theory, Homology Theory and their Applica-
tions II, LNM, volume 92. Springer-Verlag, 1969.
[Mac91] I. Mackie. Lilac : A Functional Programming Language Based on Linear
Logic. M.Sc. thesis, Imperial College, 1991.
[MRA93] I. Mackie, L. Román, and S. Abramsky. An internal language for au-
tonomous categories. Journal of Applied Categorical Structures, 1, 1993.
[Mul92] P. S. Mulry. Strong monads, algebras and fixed points. In M. P. Four-
man, P. T. Johnstone, and A. M. Pitts, editors, Applications of Categories
in Computer Science, volume 177. London Mathematical Society Lecture
Note Series, 1992.
[Plo93] G. D. Plotkin. Type theory and recursion (extended abstract). In 8th
LICS Conference. IEEE, 1993.
[Wad91] P. Wadler. There's no substitute for linear logic. Manuscript, 1991.
[Win93] G. Winskel. The Formal Semantics of Programming Languages. The MIT
Press, 1993.
(Ax)  x1 : A1, ..., xn : An ⊢ xq : Aq

       Γ, x : B ⊢ u : B
(Rec)  ─────────────────
       Γ ⊢ rec x^B.u : B

(Ax)  x : A ⊢ x : A

       Δ ⊢ u : !A
(Der)  ────────────────────
       Δ ⊢ derelict(u) : A
- (A ∧ B)° = A° & B°
- (A ⇒ B)° = !A° ⊸ B°

At the level of proofs, the Girard Translation translates a proof of A1, ..., An ⊢ B
into a proof of !A1°, ..., !An° ⊢ B° by induction on the proof of A1, ..., An ⊢ B.
Special cases of rules will be used in the definition when appropriate (for example
in the case of (Rec)). A double bar means a number of applications of a rule.
(Ax):

                                    !Aq° ⊢ !Aq° (Ax)
                                    ───────────────── (Der)
  A1, ..., An ⊢ Aq (Ax)    ↦       !Aq° ⊢ Aq°
                                    ═════════════════ (Weak) and (Ex)
                                    !A1°, ..., !An° ⊢ Aq°

(∧I):

  Γ ⊢ A    Γ ⊢ B                    !Γ° ⊢ A°    !Γ° ⊢ B°
  ────────────── (∧I)      ↦       ───────────────────── (&I)
  Γ ⊢ A ∧ B                         !Γ° ⊢ A° & B°

(∧E1):

  Γ ⊢ A ∧ B                         !Γ° ⊢ A° & B°
  ───────── (∧E1)          ↦       ───────────── (&E1)
  Γ ⊢ A                             !Γ° ⊢ A°

(∧E2):

  Γ ⊢ A ∧ B                         !Γ° ⊢ A° & B°
  ───────── (∧E2)          ↦       ───────────── (&E2)
  Γ ⊢ B                             !Γ° ⊢ B°

(⇒I):

  Γ, A ⊢ B                          !Γ°, !A° ⊢ B°
  ───────── (⇒I)           ↦       ──────────────── (⊸I)
  Γ ⊢ A ⇒ B                         !Γ° ⊢ !A° ⊸ B°

(⇒E):

                                                      !Γ° ⊢ A°
                                                      ───────── (!)
  Γ ⊢ A ⇒ B    Γ ⊢ A                !Γ° ⊢ !A° ⊸ B°    !Γ° ⊢ !A°
  ────────────────── (⇒E)  ↦       ──────────────────────────── (⊸E)
  Γ ⊢ B                             !Γ°, !Γ° ⊢ B°
                                    ═════════════ (Con) and (Ex)
                                    !Γ° ⊢ B°

(Rec):

  Γ, B ⊢ B                          !Γ°, !B° ⊢ B°
  ──────── (Rec)           ↦       ───────────── (Rec)
  Γ ⊢ B                             !Γ° ⊢ B°
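The formula-level clauses of the Girard Translation given earlier can be sketched as a recursive function. The representation below is ours, and the clause that an atomic formula translates to itself is an assumption, since the atom clause is not shown in this excerpt.

```python
# Formulas as tuples: ("and", A, B) for conjunction, ("imp", A, B) for
# implication, strings for atoms.  The translation maps conjunction to
# "with" (&) and implication to !A -o B.

def girard(A):
    if isinstance(A, str):
        return A                                  # atom: A° = A (assumed)
    tag, left, right = A
    if tag == "and":                              # (A ∧ B)° = A° & B°
        return ("with", girard(left), girard(right))
    if tag == "imp":                              # (A ⇒ B)° = !A° -o B°
        return ("lolli", ("bang", girard(left)), girard(right))
    raise ValueError(tag)

assert girard(("imp", "A", ("and", "B", "C"))) == \
       ("lolli", ("bang", "A"), ("with", "B", "C"))
```

The "!" is introduced only on the left of implications, matching the proof-level translation where premises become !-marked assumptions.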
Decidability of Higher-Order Subtyping
with Intersection Types
Adriana B. Compagnoni*
Abstract
The combination of higher-order subtyping with intersection types yields
a typed model of object-oriented programming with multiple inheritance
[11]. The target calculus, F^ω_∧, a natural generalization of Girard's system
F^ω with intersection types and bounded polymorphism, is of independent
interest, and is our subject of study.
Our main contribution is the proof that subtyping in F^ω_∧ is decidable.
This yields as a corollary the decidability of subtyping in F^ω_≤, its inter-
section-free fragment, because the F^ω_∧ subtyping system is a conservative
extension of that of F^ω_≤.

The calculus presented in [8] has no reductions on types. In the F^ω_∧ sub-
typing system the presence of β∧-conversion, an extension of β-conversion
with distributivity laws, drastically increases the complexity of proving
the decidability of the subtyping relation. Our proof consists of, firstly,
defining an algorithmic presentation of the subtyping system of F^ω_∧; sec-
ondly, proving that this new presentation is sound and complete with
respect to the original one; and, finally, proving that the algorithm always
terminates.

Moreover, we establish basic structural properties of the language of
types of F^ω_∧ such as strong normalization and the Church-Rosser property.
Among the novel aspects of the present solution is the use of term
rewriting techniques to present intersection types, which clearly splits
the computational semantics (reduction rules) from the syntax (inference
rules) of the system. Another original feature is the use of a choice operator
to model the behavior of variables during subtype checking.
1 Introduction
The system F^ω_∧ (F-omega-meet) was first introduced in [11], where it was shown
to be rich enough to provide a typed model of object-oriented programming with
multiple inheritance. F^ω_∧ is an extension of F^ω [17] with bounded quantification
and intersection types, and can be seen as a natural generalization of the
type disciplines present in the current literature, for example in [14, 21, 22,
8]. Systems including subtyping, intersection types, or both have been
widely studied for many years. What follows is not intended to be an exhaustive
description, but a framework for the present work.
2 The System

The kinds of F^ω_∧ are those of F^ω: the kind * of proper types and the
kinds K1→K2 of functions on types (sometimes called type operators). The lan-
guage of types of F^ω_∧ is a straightforward higher-order extension of F_≤, Cardelli
and Wegner's second-order calculus of bounded quantification. Like F_≤, it in-
cludes type variables (written X), function types (T→T′), and polymorphic
types (∀X≤T:K.T′), in which the bound type variable X ranges over all sub-
types of the upper bound T. Moreover, like F^ω, we allow types to be abstracted
on types (ΛX:K.T) and applied to argument types (T T′). Finally, we allow
arbitrary finite intersections (∧K[T1..Tn]), where all the Ti's are members of the
same kind K. The empty intersection at kind K is written ⊤K. We drop the
maximal type Top of F_≤, since its role is played here by ⊤*.

The reduction →β∧ on types consists of the β-reduction and distributivity
rules for each constructor associated with intersections.
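To make the flavour of →β∧ concrete, here is a small sketch (our own encoding, not the paper's definition: it implements only β together with two representative distributivity rules, for → and for application over ∧, and assumes bound type-variable names never clash):

```python
def subst(t, x, u):
    """Substitution t[X := u]; bound names are assumed fresh (no capture)."""
    tag = t[0]
    if tag == 'var':
        return u if t[1] == x else t
    if tag == 'abs':
        return t if t[1] == x else ('abs', t[1], subst(t[2], x, u))
    if tag == 'app':
        return ('app', subst(t[1], x, u), subst(t[2], x, u))
    if tag == 'arrow':
        return ('arrow', subst(t[1], x, u), subst(t[2], x, u))
    return ('meet', [subst(ti, x, u) for ti in t[1]])   # tag == 'meet'

def norm(t):
    """Normalize a type under beta plus two distributivity rules."""
    tag = t[0]
    if tag == 'var':
        return t
    if tag == 'abs':
        return ('abs', t[1], norm(t[2]))
    if tag == 'meet':
        return ('meet', [norm(ti) for ti in t[1]])
    if tag == 'app':
        f, a = norm(t[1]), norm(t[2])
        if f[0] == 'abs':                               # beta: (ΛX.T) U ⇒ T[X := U]
            return norm(subst(f[2], f[1], a))
        if f[0] == 'meet':                              # (∧[T1..Tn]) U ⇒ ∧[T1 U .. Tn U]
            return norm(('meet', [('app', ti, a) for ti in f[1]]))
        return ('app', f, a)
    s, u = norm(t[1]), norm(t[2])                       # tag == 'arrow'
    if u[0] == 'meet':                                  # S → ∧[T1..Tn] ⇒ ∧[S→T1 .. S→Tn]
        return norm(('meet', [('arrow', s, ti) for ti in u[1]]))
    return ('arrow', s, u)
```

The full system also distributes ∀ and Λ over intersections and tracks kinds; the sketch conveys only the rewriting character of the presentation.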
DEFINITION 2.1

5. (∧K1→K2[T1..Tn]) U →β∧ ∧K2[T1 U .. Tn U]
The rules in F^ω_∧ are organized as proof systems for four interdependent judge-
ment forms:

    Γ ⊢ ok       well-formed context        Γ ⊢ T ∈ K    well-kinded type
    Γ ⊢ S ≤ T    subtype                    Γ ⊢ e ∈ T    well-typed term.
2.1.2 Subtyping

The rules defining the subtype relation are a natural extension of familiar calculi
of bounded quantification. Aside from some extra well-formedness conditions,
the rules S-TRANS, S-TVAR, and S-ARROW are the same as in the usual, second-
order case. F^ω's rule of type conversion (i.e. if Γ ⊢ e ∈ T and T =β T′ then
Γ ⊢ e ∈ T′) is captured here as the subtyping rule S-CONV, which also gives
reflexivity as a special case. Rules S-MEET-G and S-MEET-LB specify that an
intersection of a set of types is their order-theoretic greatest lower bound.

    Γ ⊢ S ∈ K    Γ ⊢ T ∈ K    S =β∧ T
    ──────────────────────────────────    (S-CONV)
    Γ ⊢ S ≤ T
3 Decidability of subtyping

In this section we show that the subtyping relation of F^ω_∧ is decidable. The solu-
tion is divided into two main parts. First, we develop a normal subtyping system,
NF^ω_∧, in which only types in normal form are considered. We prove that deriva-
tions in NF^ω_∧ can be normalized by eliminating transitivity and simplifying re-
flexivity. This simplification yields an algorithmic presentation, AlgF^ω_∧, whose
rules are syntax directed. Moreover, we prove that AlgF^ω_∧ is indeed an alterna-
tive presentation of the F^ω_∧ subtyping relation. Formally, Γ ⊢ S ≤ T if and only
if Γ^nf ⊢Alg S^nf ≤ T^nf (proposition 3.3.3).
In the solution for the second order lambda calculus presented in [21], the
distributivity rules for intersection types are not considered as rewrite rules. For
that reason, new syntactic categories have to be defined (composite and indi-
vidual canonical types) and an auxiliary mapping (flattening) transforms a type
into a canonical type. Our solution needs neither new syntactic categories
nor elaborate auxiliary mappings, since the role played there by canonical types
is performed here by types in normal form.
Independently, Steffen and Pierce proved a similar result for F^ω_≤ [23]. There
are several differences between our work and the proof of decidability of sub-
typing in [23]. First, our result is for a stronger system which also includes
intersection types. Our proof of termination has the novel idea of using a choice
operator to model the behavior of type variables during subtype checking. A
second major difference is the choice of the intermediate subtyping system. We
define the normal system NF^ω_∧, which is not only the key to proving decidability
of subtyping but also helped in understanding the fine structure of subtyping,
yielding the algorithm AlgF^ω_∧. In [23] the intermediate system, called a reducing
system, leads to a much more complicated proof which involves dealing with
several notions of reduction and further reformulations of the intermediate system.
    Γ ⊢ ok
    Γ ⊢ X ≤ ΛY:K.Y              (S-TVAR)
    Γ ⊢ X W ≤ (ΛY:K.Y) W        (S-OAPP)
    Γ ⊢ (ΛY:K.Y) W ≤ W          (S-CONV, since (ΛY:K.Y) W =β∧ W)
    Γ ⊢ X W ≤ W                 (S-TRANS).
This example shows that S-TRANS erases information obtained by S-CONV that
is not present in the conclusion. A first step towards an algorithm to check the
subtyping relation is to design a set of rules in which the derivable judgements
contain all the information about their derivations. For that we define a set of
rules (NF^ω_∧) in which conversion is reduced to a minimum and transitivity can
be eliminated (lemma 3.1.7). Both results are proved with a standard cut-
elimination argument. This yields a syntax-directed subtyping relation (AlgF^ω_∧)
which constitutes a decision procedure for the original system.
In this section, we present the subtyping system NF^ω_∧, which uses the context
and type formation rules of F^ω_∧. We prove a generation lemma for subtyping
(proposition 3.1.8) and define an algorithmic presentation, AlgF^ω_∧ (see definition
3.3.1). Finally, we show that there is an equivalence between subtyping in F^ω_∧
and subtyping in AlgF^ω_∧, which is essential to prove the decidability of subtyping
in F^ω_∧ (see section 3.3).

We now define the normal subtyping system, NF^ω_∧. Subtyping statements in
NF^ω_∧ are written Γ ⊢n S ≤ T, where S, T, and all types appearing in Γ are in
β∧-normal form.
We now define lubΓ(S), and we prove in lemma 3.2.1 and corollary 3.2.9
that, when defined, it is the smallest type above S with respect to Γ;
since well-kinded types are strongly normalizing, its normal form exists.
The rules S-MEET-LB and S-MEET-G are replaced by NS-∧, NS-∀, and NS-∀∧.

7. Γ ⊢n A ≤ ∧K[B1..Bn] implies that for each i∈{1..n}, Γ ⊢n A ≤ Bi and
Γ ⊢ A ∈ K.

8. Γ ⊢n ∧K[A1..Am] ≤ ∧K[B1..Bn] implies that for each i∈{1..n} there exists
j∈{1..m} such that Γ ⊢n Aj ≤ Bi, and for all k∈{1..m}, Γ ⊢ Ak ∈ K.
Moreover, given a normal proof of any of the antecedents, the proofs of the
consequents are proper subderivations.
2. Γ ⊢ S ≤ lubΓ(S).

PROPOSITION 3.2.2 (Soundness) If Γ ⊢n S ≤ T, then Γ ⊢ S ≤ T.

LEMMA 3.2.3
1. Γ ⊢ ok implies Γ^nf ⊢ ok.
2. Γ ⊢ T ∈ K implies Γ^nf ⊢ T ∈ K.
3. Γ ⊢ S ≤ T implies Γ^nf ⊢ S ≤ T.
4. Let Γ1, Γ2 ⊢ ok. Then Γ1^nf, Γ2 ⊢ T ∈ K implies Γ1, Γ2 ⊢ T ∈ K.
5. Let Γ1, Γ2 ⊢ ok. Then Γ1^nf, Γ2 ⊢ S ≤ T implies Γ1, Γ2 ⊢ S ≤ T.
6. Let Γ ⊢ S, T ∈ K. Then Γ^nf ⊢ S^nf ≤ T^nf if and only if Γ ⊢ S ≤ T.

This substitution lemma is the key result we use in proving that S-OAPP has
a corresponding admissible rule in NF^ω_∧.

LEMMA 3.2.6 Let Γ ⊢ S U ∈ K. Then Γ ⊢n S ≤ T implies Γ ⊢n (S U)^nf ≤ (T U)^nf.

PROPOSITION 3.2.7 (Completeness) Γ ⊢ S ≤ T implies Γ^nf ⊢n S^nf ≤ T^nf.
DEFINITION 3.3.1 We define the algorithmic system AlgF^ω_∧ from NF^ω_∧ by re-
moving NS-TRANS and replacing NS-REFL by

    Γ ⊢ X ∈ K                           Γ ⊢ T S ∈ K
    ─────────────    (AS-TVARREFL)      ─────────────────    (AS-OAPPREFL)
    Γ ⊢Alg X ≤ X                        Γ ⊢Alg T S ≤ T S
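To convey the syntax-directed character of the algorithmic presentation, the following toy checker is our own drastic simplification, not the paper's AlgF^ω_∧: kinds, bounded quantifiers, ⊤, and β∧-normalization are all omitted, and the declared bounds are assumed to be well-founded. It decides a subtype relation over variables with bounds, arrow types, and intersections, with exactly one applicable case per shape of the goal:

```python
def subtype(ctx, s, t):
    """Decide s ≤ t; ctx maps a type-variable name to its declared upper bound."""
    if t[0] == 'meet':                          # S ≤ ∧[T1..Tn] iff S ≤ every Ti
        return all(subtype(ctx, s, ti) for ti in t[1])
    if s == t and s[0] == 'var':                # reflexivity on variables
        return True
    if s[0] == 'meet':                          # ∧[S1..Sm] ≤ T if some Sj ≤ T
        return any(subtype(ctx, sj, t) for sj in s[1])
    if s[0] == 'var':                           # promote a variable to its bound
        return subtype(ctx, ctx[s[1]], t)
    if s[0] == 'arrow' and t[0] == 'arrow':     # contravariant domain, covariant codomain
        return subtype(ctx, t[1], s[1]) and subtype(ctx, s[2], t[2])
    return False
```

Note that a variable on the left is compared against itself before being promoted, mirroring the role of the reflexivity rules above.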
Next, we define a measure for subtyping statements such that, for any sub-
typing rule, the measure of each hypothesis is smaller than that of the conclusion.
Most measures for showing the well-foundedness of a relation defined by a set
of inference rules involve a clever assignment of weights to judgements, often in-
volving the number of symbols. We need a more sophisticated measure, since in
NS-OAPP it is not necessarily the case that the size of the hypothesis is smaller
than the size of the conclusion.

We introduce a new mapping from types to types in the extended language
in order to define a new measure on subtyping statements. To motivate the
definition of this new measure, we analyze the behavior of type variables during
subtype checking. Assume that we want to check whether Γ ⊢Alg S ≤ T, where S is a
variable or a type application. It can be the case that the judgement is obtained
by an application of NS-TVAR or NS-OAPP, in which case we have to consider
a new statement Γ ⊢Alg S′ ≤ T, where S′ is obtained from S by replacing a
variable by its bound (and possibly normalizing). However, we do not replace
every variable by its bound, as this would constitute an unsound operation with
respect to subtyping. This fact is illustrated in the following example.

EXAMPLE 4.2 Two unrelated variables may have the same bound.

Our new mapping, plus, includes in each type expression this nondetermin-
istic behavior of its type variables.
PROOF: The height of the derivation of the kinding judgements of the arguments
strictly decreases in each recursive call. □
LEMMA 4.11 (Substitution for plus) Let Γ1, X≤S:K1, Γ2 ⊢ T2 ∈ K2 and Γ1 ⊢
T1 ∈ K1. Then

    plusΓ1, X≤S:K1, Γ2(T2)[X ← plusΓ1(T1)] →→β∧+ plusΓ1, Γ2[X←T1](T2[X ← T1]).
LEMMA 4.12 (Monotonicity of plus with respect to →β∧) Let Γ ⊢ T ∈ K. Then

PROOF: Item 1 is a particular case of the previous lemma (lemma 4.13), and
item 2 is a consequence of lemma 4.13 and the monotonicity of plus with respect
to →β∧+ (lemma 4.12(2)). □
Finally, we can define our measure.
Pairs are ordered lexicographically. Note that ⟨0, 0⟩ is the least weight.
5 Acknowledgements
I want to express my gratitude to Mariangiola Dezani-Ciancaglini for her scien-
tific and moral support which made the present work possible. I am also grateful
for discussions with Benjamin Pierce, Henk Barendregt, and Ugo de'Liguoro.
References
[1] H. P. Barendregt, M. Coppo, and M. Dezani-Ciancaglini. A filter lambda model
and the completeness of type assignment. Journal of Symbolic Logic, 48(4):931-
940, 1983.
[2] Kim Bruce and John Mitchell. PER models of subtyping, recursive types and
higher-order polymorphism. In Proceedings of the Nineteenth ACM Symposium
on Principles of Programming Languages, Albuquerque, NM, January 1992.
[3] Peter Canning, William Cook, Walter Hill, Walter Olthoff, and John Mitchell. F-
bounded quantification for object-oriented programming. In Fourth International
Conference on Functional Programming Languages and Computer Architecture,
pages 273-280, September 1989.
[4] Luca Cardelli. A semantics of multiple inheritance. Information and Computa-
tion, 76:138-164, 1988. Preliminary version in Semantics of Data Types, Kahn,
MacQueen, and Plotkin, eds., Springer-Verlag LNCS 173, 1984.
[5] Luca Cardelli. Notes about F^ω_<:. Unpublished manuscript, October 1990.
[6] Luca Cardelli and Peter Wegner. On understanding types, data abstraction, and
polymorphism. Computing Surveys, 17(4), December 1985.
[7] Felice Cardone and Mario Coppo. Two extensions of Curry's type inference sys-
tem. In Odifreddi [20], pages 19-76.
[8] Giuseppe Castagna and Benjamin Pierce. Decidable bounded quantification. In
Proceedings of the Twenty-First Annual ACM Symposium on Principles of Program-
ming Languages, Portland, OR. ACM, January 1994.
[9] Adriana Compagnoni. Higher-Order Subtyping with Intersection Types. PhD the-
sis, University of Nijmegen, The Netherlands, January 1995.
[10] Adriana B. Compagnoni. Subtyping in F~ is decidable. Technical Report ECS-
LFCS-94-281, LFCS, University of Edinburgh, January 1994.
[11] Adriana B. Compagnoni and Benjamin C. Pierce. Multiple inheritance via inter-
section types. Mathematical Structures in Computer Science, 1995. To appear.
Preliminary version available as University of Edinburgh technical report ECS-
LFCS-93-275 and Catholic University Nijmegen computer science technical report
93-18, Aug. 1993.
[12] William R. Cook, Walter L. Hill, and Peter S. Canning. Inheritance is not sub-
typing. In Seventeenth Annual ACM Symposium on Principles of Programming
Languages, pages 125-135, San Francisco, CA, January 1990. Also in [18].
[13] M. Coppo and M. Dezani-Ciancaglini. A new type-assignment for λ-terms. Archiv
Math. Logik, 19:139-156, 1978.
[14] Pierre-Louis Curien and Giorgio Ghelli. Coherence of subsumption: Minimum
typing and type-checking in F≤. Mathematical Structures in Computer Science,
2:55-91, 1992. Also in [18].
[15] Giorgio Ghelli. Proof Theoretic Studies about a Minimal Type System Integrating
Inclusion and Parametric Polymorphism. PhD thesis, Università di Pisa, March
1990. Technical report TD-6/90, Dipartimento di Informatica, Università di Pisa.
[16] Giorgio Ghelli, January 1994. Message to the Types mailing list.
[17] Jean-Yves Girard. Interprétation fonctionnelle et élimination des coupures de
l'arithmétique d'ordre supérieur. PhD thesis, Université Paris VII, 1972.
[18] Carl A. Gunter and John C. Mitchell. Theoretical Aspects of Object-Oriented
Programming: Types, Semantics, and Language Design. The MIT Press, 1994.
[19] John C. Mitchell. Toward a typed foundation for method specialization and inher-
itance. In Proceedings of the 17th ACM Symposium on Principles of Programming
Languages, pages 109-124, January 1990. Also in [18].
[20] Piergiorgio Odifreddi, editor. Logic and Computer Science. Number 31 in APIC
Studies in Data Processing. Academic Press, 1990.
[21] Benjamin C. Pierce. Programming with Intersection Types and Bounded Poly-
morphism. PhD thesis, Carnegie Mellon University, December 1991. Available as
School of Computer Science technical report CMU-CS-91-205.
[22] Benjamin C. Pierce and David N. Turner. Simple type-theoretic foundations for
object-oriented programming. Journal of Functional Programming, 4(2):207-247,
April 1994.
[23] Martin Steffen and Benjamin Pierce. Higher-order subtyping. In IFIP Working
Conference on Programming Concepts, Methods and Calculi (PROCOMET), June
1994. An earlier version appeared as University of Edinburgh technical report
ECS-LFCS-94-280 and Universität Erlangen-Nürnberg Interner Bericht IMMD7-
01/94, February 1994.
A λ-calculus Structure Isomorphic to
Gentzen-style Sequent Calculus Structure

Hugo Herbelin *

LITP, University Paris 7, 2 place Jussieu, 75252 Paris Cedex 05, France
INRIA-Rocquencourt, B.P. 105, 78153 Le Chesnay Cedex, France
Hugo.Herbelin@inria.fr
1 Introduction
2.1 The Sequent Calculus LJ
    A ::= X | A → A

where X ranges over VF, an infinite set whose elements are called
propositional variable names. In the sequel, we reserve the letters A, B, C,
... to denote formulas.

Sequents of LJ have the form Γ ⊢ A. To avoid the need of a structural rule,
we define Γ as a set. To avoid confusion between multiple occurrences of the
same formula, this set is a set of named formulas. We assume the existence of an
infinite set whose elements are called names. Then, a named formula
is just the pair of a formula and a name. Usually, we do not mention the names
of formulas (in any case, no ambiguity occurs in the sequents we consider here).
Under the condition that A, with its name, does not belong to Γ, the notation
Γ, A stands for the set-theoretic union of Γ and {A}.
To avoid the need of a weakening rule, we admit irrelevant formulas in axioms.
The rules of LJ are:
Ax F,A,A~" C
F,A b A F, A t" C Cont
F t - A F, B t - C P,A~-B
F , A - - - , B F C IL F ~ - A - * B IR
F~- A I",AF B
Cut
FI-B
    Γ, x:A ⊢ x : A    (Ax)

    Γ, x:A, y:A ⊢ u : C
    ──────────────────────    (Cont)
    Γ, x:A ⊢ u{y := x} : C

    Γ ⊢ u : A    Γ, y:B ⊢ v : C
    ────────────────────────────    (IL)
    Γ, x:A→B ⊢ v{y := (x u)} : C

    Γ, x:A ⊢ u : B
    ─────────────────    (IR)
    Γ ⊢ λx.u : A → B

for which v{x := u} denotes the term v in which each occurrence of x has
been replaced by u.
However, different proofs may be associated to the same λ-term. For instance:

    A, C ⊢ A  (Ax)    A, C, B ⊢ B  (Ax)
    ─────────────────────────────────    (IL)
    A→B, A, C ⊢ B
    ─────────────────    (IR)
    A→B, A ⊢ C→B

and

                      A, C, B ⊢ B  (Ax)
                      ─────────────    (IR)
    A ⊢ A  (Ax)       A, B ⊢ C→B
    ─────────────────────────────    (IL)
    A→B, A ⊢ C→B

are both associated to the Church-style typed λ-term λz:C.(x y) : C→B for
a context in which x : A→B and y : A.
We decide to restrict LJ in order to get a bijective correspondence between
normal simply-typed λ-terms and cut-free proofs. For this purpose, we restrict
the use of the IR rule in order to forbid the second proof. The calculus we obtain
has two kinds of sequents. We call it LJT, since it appears as the intuitionistic
fragment of a calculus called LKT defined by Danos, Joinet, and Schellinx [2].

A sequent of LJT has either the form Γ; ⊢ A or the form Γ; A ⊢ B. In both
cases, Γ is defined as a set of named formulas. The semi-colon delimits a place on
its right. A uniform notation for sequents of LJT is the following one: Γ; Π ⊢ B,
where Π is a notation to say that the place on the right of the semi-colon may
be either empty or filled with one (not named) formula. The idea of using these
kinds of sequents comes from Girard [5], who called "stoup" the special place
between the symbols ";" and "⊢".
    Γ; A ⊢ A    (Ax)

    Γ, A; A ⊢ B
    ───────────    (Cont)
    Γ, A; ⊢ B

    Γ; ⊢ A    Γ; B ⊢ C
    ──────────────────    (IL)
    Γ; A→B ⊢ C

    Γ, A; ⊢ B
    ──────────    (IR)
    Γ; ⊢ A→B
Remarks: 1) With these rules, the first proof above is not directly a proof in
the restriction: the axiom rule of LJ has to be encoded in the restriction by an
axiom rule followed by a contraction rule.

2) This calculus also appears in Danos et al. [2] with a slight difference in
the treatment of structural rules. Like its classical version LKT, it has been
considered by Danos et al. for its good behaviour w.r.t. embedding into linear
logic. The calculus LJT also appears as a fragment of ILU, the intuitionistic
neutral fragment of unified logic described by Girard in [6]. The calculus ILU
is itself a form of LJ constrained with a stoup, for which Girard pointed out
that "the formula [in the stoup] (if there is one) is the analogue of the familiar
head-variable for typed λ-calculi".

Recently, Mints defined in [9] a notion of normal form for cut-free proofs of
LJ which also coincides with the notion of cut-freeness in LJT.

We also have to mention the definition of a cut-free sequent calculus similar
to the cut-free LJT in the paper of Howard [12] on the interpretation of natural
deduction as a λ-calculus. Howard mentions that the proofs of this cut-free
calculus are in one-to-one correspondence with the normal simply-typed λ-terms.
    Γ; ⊢ up : Cp    Γ; y:B ⊢ y : B  (Ax)
    ─────────────────────────────────    (IL)
    Γ; yp : Cp→B ⊢ (yp up) : B
Remark: The construction of the applicative part of the term starts from up and
ends with u1, in contrast with the usual way of building a term (u u1 ... up) in λ-
calculus. This is why we do not have an exact correspondence with the substitution
operator when we consider the cut rule.
According to the place of the cut formula (in the stoup or not), there are two
kinds of cut rules in LJT:
    Γ; ⊢ v : A    Γ, x:A; Π ⊢ u : B
    ────────────────────────────────    (CM)
    Γ; Π ⊢ u[x := v] : B
A standard way to eliminate cuts is to apply rewriting rules to proofs in
order to propagate the cuts towards smaller proofs. Here is an example of such
a rewriting rule (we let C = A2→...→An→B):

    Γ, x:A; ⊢ u1 : A1    Γ, x:A; y:C ⊢ (y u2 ... un) : B
    ────────────────────────────────────────────────────    (IL)
    Γ; ⊢ v : A    Γ, x:A; y:A1→C ⊢ (y u1 ... un) : B
    ────────────────────────────────────────────────    (CM)
    Γ; y:A1→C ⊢ (y u1 ... un)[x := v] : B
reduces to
    Γ; ⊢ v:A    Γ, x:A; ⊢ u1:A1          Γ; ⊢ v:A    Γ, x:A; y:C ⊢ (y u2 ... un):B
    ─────────────────────────── (CM)     ──────────────────────────────────────── (CM)
    Γ; ⊢ u1[x := v]:A1                   Γ; y:C ⊢ (y u2 ... un)[x := v]:B
    ───────────────────────────────────────────────────────────────── (IL)
    Γ; y:A1→C ⊢ (y u1[x := v] ... un[x := v]):B

More generally, the translation of LJ proofs into LJT proofs maps each rule of
LJ to a derived rule of LJT: an axiom Γ, A ⊢ A becomes an axiom Γ, A; A ⊢ A
followed by (Cont); an (IL) inference becomes an (IL) inference whose right
premise is closed by an axiom, followed by (Cont); an (IR) inference from
Γ, A ⊢ B to Γ ⊢ A→B becomes an (IR) inference from Γ, A; ⊢ B to Γ; ⊢ A→B;
and a (Cut) yielding Γ ⊢ B becomes a (CM) yielding Γ; ⊢ B.
3 The λ̄-calculus

We assume the existence of an infinite set V of which the elements are called
term variable names, here denoted by the letters x, y, z, ...

The set of λ̄-expressions, including the λ̄-terms (or shortly terms) and the
lists of arguments, are mutually defined by the following grammar, for which
x ranges over V:

    t ::= (x l) | λx.t | (t l) | t[x := u]
    l ::= [ ] | [t :: l] | l @ l' | l[x := u]

The syntax t[x := u] stands for an operator of explicit substitution in terms
(a "let x = u in t" operator) and l[x := u] stands for an operator of explicit
substitution in lists of arguments.
We usually abbreviate an argument list [t1 :: [... :: [tn :: [ ]]...]] by [t1; ...; tn].
Terms such as ((...(t t1) ...) tn) are abbreviated (t t1 ... tn). Sometimes (x [ ]) is
shortened into x. Also, the expressions (λx.t), (t[x := u]) and (l @ l') may be
written respectively λx.t, t[x := u] and l @ l' when there is no ambiguity.

Subexpressions of λ̄-expressions are defined as usual, but, in our case, by a
simultaneous recursion on terms and argument lists.

Bound variables are defined as usual. We say that two λ̄-expressions are
α-equal if they differ only in the names (assumed pairwise distinct) of bound
variables. This notion of equality does not affect the structure of expressions
and, in the sequel, we consider λ̄-expressions up to this α-equality.
3.2 Normal λ̄-expressions

    t ::= (x l) | λx.t
    l ::= [ ] | [t :: l]
3.3 Reduction Rules

The presence of explicit substitution and concatenation operators entails the
presence of appropriate reduction rules:

- β-reduction

    ((λx.t) [u :: l]) → (t[x := u] l)

- propagation of substitutions

    [ ][x := v] → [ ]                                    (S[])
    [u :: l][x := v] → [u[x := v] :: l[x := v]]          (Scons)
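Read operationally, a substitution [x := v] is pushed through an argument list element by element. The sketch below is our own meta-level encoding, not the calculus's explicit operators: terms are ('hd', y, l) for a variable-headed application (y l) and ('lam', y, t) for λy.t, argument lists are Python lists, bound names are assumed fresh (no capture), and β fires when a substituted head variable turns into an abstraction:

```python
def subst_list(l, x, v):
    """[][x := v] → []   and   [u :: l][x := v] → [u[x := v] :: l[x := v]]."""
    if not l:
        return []
    return [subst_term(l[0], x, v)] + subst_list(l[1:], x, v)

def subst_term(t, x, v):
    if t[0] == 'lam':
        y, body = t[1], t[2]
        return t if y == x else ('lam', y, subst_term(body, x, v))
    y, l = t[1], t[2]                  # t = ('hd', y, l), i.e. (y l)
    l2 = subst_list(l, x, v)
    if y != x:
        return ('hd', y, l2)
    return apply_term(v, l2)           # (x l)[x := v] reduces to (v l[x := v])

def apply_term(t, l):
    """Beta: ((λy.u) [w :: l]) → (u[y := w] l); a variable head absorbs l."""
    if not l:
        return t
    if t[0] == 'lam':
        return apply_term(subst_term(t[2], t[1], l[0]), l[1:])
    return ('hd', t[1], t[2] + l)      # list concatenation, playing the role of @
```

The concatenation in the last line corresponds to the operator @ of the grammar above.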
4 Cut-elimination in the Calculus LJT
We say that two proofs are equal if they differ only by the names of formulas in
the proved sequents or by the addition of irrelevant formulas to the left part of the
proved sequents. We consider proofs up to this notion of equality. In particular,
if p is a proof of Γ; Π ⊢ A, then, for any named formula B not in Γ, p is a proof
of Γ, B; Π ⊢ A, even if it becomes necessary to change the name of another
similarly named occurrence of B throughout p.
4.1 Cut-elimination
Reduction of CH Rules.

- logical counterpart of β-reduction

    Γ, A; ⊢ B
    ───────── (IR)    ──────────────── (Ax)
    Γ; ⊢ A→B          Γ; A→B ⊢ A→B                Γ, A; ⊢ B
    ────────────────────────────── (CH)    ⇝      ───────── (IR)
    Γ; ⊢ A→B                                      Γ; ⊢ A→B

    ──────── (Ax)
    Γ; A ⊢ A      Γ; A ⊢ C
    ────────────────────── (CH)    ⇝    Γ; A ⊢ C
    Γ; A ⊢ C
Reduction of CM Rules.

- logical counterpart of propagation of substitutions through weakly normal
terms

    Γ, A; A ⊢ C                               Γ; ⊢ A    Γ, A; A ⊢ C
    ─────────── (Cont)                        ───────────────────── (CM)
    Γ; ⊢ A    Γ, A; ⊢ C                       Γ; ⊢ A    Γ; A ⊢ C
    ─────────────────── (CM)    ⇝             ────────────────── (CH)
    Γ; ⊢ C                                    Γ; ⊢ C
    Γ, A, B; B ⊢ C                              Γ, B; ⊢ A    Γ, A, B; B ⊢ C
    ────────────── (Cont)                       ─────────────────────────── (CM)
    Γ, B; ⊢ A    Γ, A, B; ⊢ C          ⇝        Γ, B; B ⊢ C
    ───────────────────────── (CM)              ─────────── (Cont)
    Γ, B; ⊢ C                                   Γ, B; ⊢ C

Note that, if B already occurs with the same name somewhere in the proof of
Γ; ⊢ A, then this latter name has to be changed throughout the proof.
    ──────────── (Ax)
    Γ; ⊢ A    Γ, A; B ⊢ B
    ───────────────────── (CM)    ⇝    Γ; B ⊢ B    (Ax)
    Γ; B ⊢ B
    Γ, A; ⊢ B    Γ, A; C ⊢ D                Γ; ⊢ A  Γ, A; ⊢ B       Γ; ⊢ A  Γ, A; C ⊢ D
    ──────────────────────── (IL)           ───────────────── (CM)  ─────────────────── (CM)
    Γ; ⊢ A    Γ, A; B→C ⊢ D        ⇝        Γ; ⊢ B                  Γ; C ⊢ D
    ─────────────────────── (CM)            ──────────────────────────────── (IL)
    Γ; B→C ⊢ D                              Γ; B→C ⊢ D
6 Strong Termination

By the isomorphism, the strong termination of cut-elimination for LJT (using the
above rewriting system) and the strong termination of reduction for typable λ̄-
expressions are equivalent. We show hereafter the strong termination for typable
λ̄-expressions.
    λx.u → u        [u :: l] → u        [u :: l] → l

where u ranges over the set of λ̄-terms and l over the set of argument lists.
Lemma 2. If the λ̄-term u and the argument list l are SEN then λx.u, (x l) and
[u :: l] are SEN.

Proof. By induction on the proof that u is SEN, then by induction on the proof
that l is SEN. Let us treat the case [u :: l]. If [u :: l] → e′ then either e′ is u
or l, in which case, by hypothesis, e′ is SEN, or e′ is [u′ :: l] with u → u′, or
[u :: l′] with l → l′, in which cases e′ is SEN by induction hypothesis. Therefore,
in any case, e reduces to a SEN λ̄-expression. This implies that e is itself SEN.
Lemma 3. Let e and u be SEN λ̄-expressions. If, for all SEN l, the typability of
(u l) implies that (u l) is SEN, then also the typability of e[x := u] implies that
e[x := u] is SEN.

- The case where e is (x l′), in which case w denotes (u l′[x := u]), is the
more delicate one. But since e → l′, the proof of SEN for l′ is smaller than the
one for e. Therefore, by induction hypothesis, l′[x := u] is SEN. And since we
have assumed that for all SEN l, (u l) was SEN, we infer that (u l′[x := u])
is SEN.
Thus, whatever the form of e, the reducts of e[x := u] are all SEN. This is
enough to say that e[x := u] is SEN.

- The more delicate case is when e has the form λx.u while l has the form
[v :: l′]. In this case, the type of A has the form B → C, the λ̄-term v is
typable by B and w denotes (u[x := v] l′). Since B is smaller than A, by
induction hypothesis, the typability of (v l″) implies that it is SEN whatever
SEN l″. It is then possible to use lemma 3 in order to infer that u[x := v]
is SEN. But this latter is typable by C, which is also smaller than A. By
induction hypothesis, again, (u[x := v] l′) is SEN.
- If e is (x l′) then w denotes (x (l′ @ l)). But (x l′) → l′; therefore, by
induction hypothesis, (l′ @ l) is SEN. By lemma 2, w is SEN.
- If e is λx.u and l denotes [ ] then w is e which, by hypothesis, is SEN.
- If e is [ ] then w denotes l which is directly SEN.
- If e is [v :: l′] then w denotes [v :: (l′ @ l)]. But [v :: l′] → l′; therefore, by
induction hypothesis, l′ @ l is SEN. As for v, it is also SEN by induction
hypothesis. Then, by lemma 2, w is SEN.
Remarks: 1) A similar proof has been given by Dragalin [3] for the system of
reduction rules given in the seminal paper of Gentzen on the cut-elimination
theorem for LK. The difference is that Dragalin's proof does not work by struc-
tural induction on the proof of strong E-normalisability, but rather by induction
on the length of these proofs. Our proof was done independently, extending
a proof by Coquand that the elimination of cuts according to an outermost
strategy of reduction terminates.

Note that this kind of strong cut-elimination proof also applies to non-
confluent systems of reduction rules (as is the case for Gentzen's system of re-
duction rules) but not to systems including rules affecting the order of cuts. This
is in contrast with the cut-elimination procedures that Zucker or Pottinger
have considered.
2) An interesting result would be to prove the strong normalisation of the
simply-typed λ-calculus with the additional reduction rule ((λx.t) u)[y := v] →
((λx.t)[y := v] u[y := v]). As a corollary of this result, we would get the strong
normalisation of the usual simply-typed λ-calculus and even the strong normal-
isation for the simply-typed λ-calculus with an explicit "let _ in _"-like substitu-
tion operator (see for instance Lescanne [8]).
Conclusion
Acknowledgements
Simplifications in the proof of strong normalisation are due to Thierry Coquand.
I also thank the Paris 7 computer science logic group, Phil Wadler and Viviana
Bono for feedback on this work.
References
1. V. Breazu-Tannen, D. Kesner, L. Puel: "A typed pattern calculus", IEEE Symposium
on Logic in Computer Science, Montreal, Canada, June 1993, pp 262-274.
2. V. Danos, J-B. Joinet, H. Schellinx: "LKQ and LKT: Sequent calculi for second
order logic based upon dual linear decompositions of classical implication", in
Proceedings of the Workshop on Linear Logic, Cornell, edited by J-Y. Girard, Y.
Lafont, L. Regnier, 1993.
3. A. G. Dragalin: Mathematical Intuitionism: Introduction to Proof Theory, Trans-
lations of mathematical monographs, Vol 67, Providence, R.I.: American Mathe-
matical Society, 1988.
4. J. Gallier: "Constructive logics, part I: A tutorial on proof systems and typed
λ-calculi", Theoretical Computer Science, Vol 110, 1993, pp 249-339.
5. J.-Y. Girard: "A new constructive logic: classical logic", Mathematical Structures
in Computer Science, Vol 1, 1991, pp 255-296.
6. J-Y. Girard: "On the Unity of Logic", Annals of Pure and Applied Logic, Vol 59,
1993, pp 201-217.
7. G. Huet: "Confluent Reductions: Abstract Properties and Applications to Term
Rewriting Systems", Journal of the Association for Computing Machinery, Vol 27,
1980, pp 797-821.
8. Z. Benaissa, D. Briaud, P. Lescanne, J. Rouyer-Degli: "λυ, a calculus of explicit
substitutions which preserves strong normalisation", submitted to Journal of Func-
tional Programming, 1995.
9. G. Mints: "Normal forms for sequent derivations", Private communication, 1994.
10. G. Pottinger: "Normalization as a homomorphic image of cut-elimination", Annals
of Mathematical Logic, Vol 12, 1977, pp 323-357.
11. D. Prawitz: Natural Deduction, a Proof-Theoretical Study, Almquist and Wiksell,
Stockholm, 1965, pp 90-91
12. W. A. Howard, "The Formulae-as-Types Notion of Construction", in J.P. Seldin
and J.R. Hindley Eds, To H.B. Curry: Essays on Combinatory Logic, Lambda
Calculus and Formalism, Academic Press, 1980 (unpublished manuscript of 1969).
13. P. Wadler: "A Curry-Howard isomorphism for sequent calculus", Private commu-
nication, 1993.
14. J. I. Zucker: "Correspondence between cut-elimination and normalization, part I
and II", Annals of Mathematical Logic, Vol 7, 1974, pp 1-156.
Usability: formalising (un)definedness
in typed lambda calculus
Jan Kuper
1 Introduction
2 The calculus
3 Solvability
(a) strongly solvable, if there is a type σ such that
    ∃N (λx.M)N = I_σ,
(b) medium solvable, if there is a type σ such that for all terms L of type σ,
    ∃N (λx.M)N = L,
(c) weakly solvable, if there is (a type σ and) a term L (of type σ), L in normal
    form, such that
    ∃N (λx.M)N = L. □
Example 1.
Clearly, the items (a), (b), (c) from definition 2 correspond to (a), (b), (c) from
lemma 1. In the untyped lambda calculus we have (a) ⇔ (b) ⇔ (c), whereas, as
can be seen from the examples above, in Λ we only have (a) ⇒ (b) ⇒ (c).
Lemma 3. In Λ we have
(a) M is weakly solvable
If C[_] is a strict context, then
(i) f(C[_]),
(ii) (C[_])M,
(iii) λx.C[_],
(iv) μx.C[_]
are strict contexts.
(b) In the untyped lambda calculus the definition of strict context is obtained
from part (a) by removing clauses (i) and (iv). □
We remark that part (a) of this definition remains unchanged if product types
are added to Λ. That is to say, π_i(C[_]) is a strict context whenever C[_] is.
However, (C[_], M) and (M, C[_]) are not strict contexts.
(i) [_],
(ii) f(C[_])M1 ... Mn, n ≥ 0,
(iii) (C[_])M1 ... Mn, n ≥ 1, C[_] not of form (ii) or (iii),
(iv) λx1 ... xn.C[_], n ≥ 1, C[_] not of form λx.C'[_],
(v) μx1 ... xn.C[_], n ≥ 1, C[_] not of form μx.C'[_].
(b) In the untyped lambda calculus a strict context is of form (i), (iii), or (iv).
The next definition works for any calculus in which strict contexts can be defined.
Definition 6 (Usability).
Example 2.
L e m m a 7.
Proof. Most of these properties are easy to prove, and left to the reader. With
respect to (ix) we remark that [_]x is a strict context. Property (xi) can be
proved using (iv), (viii) and (ix).

For (xiii), notice that there is a normal form N such that M >> N. Hence N
does not contain a μ-term. By (x) we may assume that N is closed. Now let L
be a sequence of closed terms in normal form such that NL is of ground type.
Since the μ-free fragment of Λ is strongly normalising, it follows that NL →→ c
for some constant c. □
Proof. (a) "⇒": If M is weakly solvable, then there are sequences x, L such
that (λx.M)L has a normal form, N say. Since (λx.[_])L is a strict context, it
follows that M >> N. Hence, M is usable.

(a) "⇐": Tedious. We have to prove that M is weakly solvable. It is sufficient
to prove that there is a (strict) context C[_] = (λx.[_])L such that C[M] has a
normal form.

If M is usable, then there is a strict context C0[_] such that C0[M] has a
normal form. Without loss of generality we may assume that C0[_] is constructed
without applying clause (iv) of definition 4: since C0[M] has a normal form, a
"subcontext" of the form μx.C'[_] can be replaced by (λx.C'[_])(μx.C'[M]),
which is also strict (M is given).
Define a "measure function" q on strict contexts as follows:

q([_]) = 0
q(f(C[_])) = q(C[_]) + 1
q((C[_])M) = q(C[_])
q(λx.C[_]) = q(C[_])      if C[_] is of the form λx.C′[_]
q(λx.C[_]) = q(C[_]) + 1  otherwise

So q counts the number of applications of clause 4(i) and the number of sequences of consecutive "leading" lambdas. We proceed by induction on q(C0[_]).
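As a sanity check, the measure q can be sketched as executable code over a toy datatype of contexts; the constructor names below (Hole, ConstApp, App, Lam) are an illustrative encoding, not the paper's notation, and the μ clause is omitted since the proof assumes clause (iv) is not used.

```python
class Hole:                      # [_]
    pass

class ConstApp:                  # f(C[_])
    def __init__(self, body): self.body = body

class App:                       # (C[_])M  (the argument M is irrelevant to q)
    def __init__(self, body): self.body = body

class Lam:                       # lambda x. C[_]
    def __init__(self, body): self.body = body

def q(c):
    if isinstance(c, Hole):
        return 0
    if isinstance(c, ConstApp):          # clause 4(i): count it
        return q(c.body) + 1
    if isinstance(c, App):               # applications leave q unchanged
        return q(c.body)
    # a lambda adds 1 only when it starts a run of leading lambdas
    return q(c.body) if isinstance(c.body, Lam) else q(c.body) + 1

# e.g. q(Lam(Lam(ConstApp(Hole())))) == 2: one lambda-run plus one f
```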
Basic case: q(C0[_]) = 0. Then C0[_] = [_]M1⋯Mn, n ≥ 0, i.e., C0[_] is of the required form already.
Induction case: q(C0[_]) > 0. Notice that C0[_] ≢ [_] (since q([_]) = 0). Hence, by lemma 5, C0[_] is of one of the following three forms:
1. C0[_] ≡ λx1⋯xn.C1[_], n ≥ 1, C1[_] not of the form λx.C′[_]. Since C0[M] has a normal form, C1[M] has a normal form too. Since
q(C1[_]) < q(C0[_]),
the result follows by the induction hypothesis.
2. C0[_] ≡ f(C1[_])L1⋯Ln, n ≥ 0. Since C0[M] has a normal form, it follows that C1[M] has a normal form too. Since q(C1[_]) < q(C0[_]), the result follows by the induction hypothesis.
3. C0[_] ≡ (C1[_])L1⋯Ln, n ≥ 1, C1[_] not of form (ii) or (iii); since q(C0[_]) > 0, C1[_] is not [_], so C0[_] ≡ (λx.C2[_])L. Since C0[M] has a normal form, we may add terms in normal form to the right of L, and the result will have a normal form again; hence we may assume that the length of L is not smaller than the length of x. The following subcases for C2[_] remain:
i. C2[_] ≡ [_]. Then C0[_] is of the required form.
ii. C2[_] ≡ f(C3[_])N1⋯Nk, k ≥ 0. Then
C0[M] ≡ (λx.f(C3[M])N)L.
Since the length of L is not smaller than the length of x, it is easily seen that there are P, Q such that
C0[M] = f((λx.C3[M])P)Q.
Since C0[M] has a normal form, it follows that (λx.C3[M])P has a normal form. Since
q((λx.C3[_])P) < q(C0[_]),
the result follows by the induction hypothesis.
iii. C2[_] ≡ (C3[_])N1⋯Nk, k ≥ 1. Then
C0[M] ≡ (λx.(C3[M])N)L.
As before, there is a sequence Q such that
C0[M] = (λx.C3[M])Q.
Clearly, C3[_] is not of form (ii) or (iii) from lemma 5. So two cases remain:
A. C3[_] ≡ [_]. Then (λx.C3[_])Q is of the required form.
B. C3[_] ≡ λz.C4[_]. Then
q((λx.λz.C4[_])Q) < q(C0[_])
and the result follows by the induction hypothesis. □
In the untyped lambda calculus the solvable terms are precisely the terms with a head normal form. However, in λ the usable terms cannot be characterized in this way. Consider the following terms.
Notice that the restriction of this definition to the untyped lambda calculus
yields the standard definition of hnf. For a comparison with other definitions of
hnf's in typed lambda calculi, cf. (Kuper 1994, chapter 6).
For the proof of the next lemma, see (Kuper 1994, lemma 6.2.6). Compare
also (Barendregt 1984, section 8.3).
Lemma 10.
(a) λx.M has a hnf ⇔ M has a hnf,
(b) M[x := N] has a hnf ⇒ M has a hnf,
(c) MN has a hnf ⇒ M has a hnf. □
Now we come to the main proposition of this section. Notice that from the
examples above it follows that the converse arrows do not hold.
Proposition 11.
If product types are added to λ, then it depends on the precise definition of head normal form whether "⇐" of this proposition will still hold. For example, ⟨0, Ω⟩ is usable. However, this term is usually not considered as a head normal form, but as a weak head normal form.
We mention two corollaries of proposition 11.
Corollary 12. Ω is not usable.
6 Genericity
In section 1 we called a term meaningful if it can contribute to a terminating computation. This conception of meaningfulness motivated the notion of usability. Now we make this conception of meaningfulness precise in a different way:
We remark that the generic terms are the operationally least defined terms in the sense of (Plotkin 1977; Berry et al. 1985): a term M is operationally less defined than N, if
C[M] has a normal form ⇒ C[N] has the same normal form.
Now we come to the main result of this section.
In the remaining part of this section we prove the Genericity Lemma and some of its variants (see lemma 30 and its corollaries). In order to do so, we need an extension λ̲ of λ. Informally, the terms of λ̲ are the terms of λ in which subterms can be underlined, but no subterm is underlined more than once.
Definition 16 (Terms in λ̲).
Definition 17 (Removal of underlinings).
- |A| ≡ A if A is underline-free,
- |A̲| ≡ A,
- |AB| ≡ |A||B|,
- |λx.A| ≡ λx.|A|,
- |μx.A| ≡ μx.|A|. □
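A minimal executable sketch of this definition, under an assumed tuple encoding of λ̲-terms in which ('u', A) marks an underlined subterm (the encoding is illustrative, not the paper's syntax):

```python
def erase(t):
    tag = t[0]
    if tag == 'var':                       # variables are underline-free
        return t
    if tag == 'u':                         # |underlined A| = A (then recurse)
        return erase(t[1])
    if tag == 'app':                       # |AB| = |A||B|
        return ('app', erase(t[1]), erase(t[2]))
    if tag in ('lam', 'mu'):               # |lam x.A| = lam x.|A|, likewise mu
        return (tag, t[1], erase(t[2]))
    raise ValueError('unknown term: %r' % (t,))
```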
Definition 19 (Reduction in λ̲).
(i) The β-, μ- and δ-rules are identical to the corresponding rules in λ, e.g., (λx.M)N → M[x:=N], where M, N are λ̲-terms,
(ii) If A → B in λ, then A̲ → B̲ in λ̲,
(iii) There are four underlining rules:
A̲B → (AB)̲,
f(A̲) → (f(A))̲,
λx.A̲ → (λx.A)̲,
μx.A̲ → (μx.A)̲. □
Proof. Straightforward. □
Lemma 26.
- If M, N are λ-terms, then M ⊑ N ⇔ M ≡ N,
- M ⊑ N,
- MN ⊑ L iff there are M′, N′ such that L ≡ M′N′, and M ⊑ M′, N ⊑ N′,
- λx.M ⊑ L iff there is an M′ such that L ≡ λx.M′, and M ⊑ M′,
- μx.M ⊑ L iff there is an M′ such that L ≡ μx.M′, and M ⊑ M′.
Proof. Straightforward. □
Lemma 27. If M ⊑ M′ and N ⊑ N′, then
M[x := N] ⊑ M′[x := N′]. □
Notation. The set of all equations P = Q for which P, Q have the same type and P, Q are unusable, is denoted by S.
Theorem 32. λ + S is consistent.
λ ⊢ true = C1[P1].
λ ⊢ true = C1[Q1].
Since
C1[Q1] = C2[P2],
it follows that
λ ⊢ true = C2[P2],
which is of type Nat, and notice that λ + Ω=c is inconsistent for every constant c of type Nat (we remark that c is restricted to the given constants of λ. Clearly, it would be possible to introduce, for example, a constant ⊥ with the rule Ω → ⊥ but no rules for ⊥, i.e., ⊥ is a normal form. Then Ω = ⊥ does not lead to inconsistencies).
Clearly, for type Bool there is also such a term, which we will denote by Ω too. Hence, for every type σ there is a term
Ω_σ = λx.Ω
λ + S + M = P ⊢ M = Ω_σ.
Since M is usable, it follows by lemma 8(a) that there are sequences y, N such that (λy.M)N has a normal form. Without loss of generality we may assume that this term is closed. It follows that there is a sequence L and a constant c of ground type such that (λy.M)NL = c.
c = (λy.M)NL
  = (λy.Ω_σ)NL
  = (λy.λx.Ω)NL
  = Ω.
Acknowledgements
I thank Henk Barendregt and Maarten Fokkinga for the interesting and valuable
discussions. I also thank one of the anonymous referees for his or her detailed
corrections.
References
Abramsky, S. (1990), The Lazy Lambda Calculus, in: Turner, D.A. (editor), Research Topics in Functional Programming Languages, Addison-Wesley, Reading, Massachusetts.
Barendregt, H.P. (1971), Some Extensional Term Models for Combinatory Logics and Lambda Calculi, Ph.D. Thesis, Utrecht.
Barendregt, H.P. (1975), Solvability in lambda calculi, Colloque International de Logique, Clermont-Ferrand, 209-219.
Barendregt, H.P. (1984), The Lambda Calculus - Its Syntax and Semantics (revised edition), North-Holland, Amsterdam.
Berry, G., P.-L. Curien, J.-J. Lévy (1985), Full abstraction for sequential languages: state of the art, in: Nivat, M. and J.C. Reynolds (editors), Algebraic Methods in Semantics, Cambridge University Press, Cambridge, 89-132.
Kuper, J. (1994), Partiality in Logic and Computation - Aspects of Undefinedness, Ph.D. Thesis, Enschede.
Ong, C.-H.L. (1988), The Lazy Lambda Calculus: an Investigation into the Foundations of Functional Programming, Ph.D. Thesis, Imperial College, London.
Plotkin, G.D. (1977), LCF considered as a programming language, Theoretical Computer Science 5, 223-255.
Lambda Representation of Operations
Between Different Term Algebras 1
Marek Zaionc
Instytut Informatyki, Uniwersytet Jagielloński,
Nawojki 11, 30-072 Kraków, Poland 2
email: zaionc@ii.uj.edu.pl
Introduction
As a contribution to the ongoing research on computing over general algebraic structures, we consider recurrence over free algebras (compare [BöB85], [Lei89], [Lei90], [Zai89]). As a model for computing, a simple typed lambda calculus is employed. The lambda calculus introduced by Church is a calculus of expressions which naturally describes the notion of computable function. Functionals are considered dynamically, as rules, rather than as set-theoretic graphs. The lambda calculus mimics the procedure of computation of a program by the process called beta reduction. There is a natural way of expressing objects such as numbers, words, trees and other syntactic entities in the lambda calculus. All those objects are of considerable value for computer scientists. Dynamic operations on objects of this kind can be described by terms of the lambda calculus. Therefore lambda terms may be considered as algorithms or programs working
1 This research was supported by KBN Grant 0384/P4/93.
2 This paper was partially prepared while the author was visiting the Computer Science Department at the State University of New York at Buffalo, USA.
on those syntactic objects and producing as a result a new object, not necessarily of the same type. It is a well-known result by Church and Kleene that all partial computable numerical functions can be related to lambda terms. Of course, the notion of partial recursive function can be naturally extended to other structures such as words, trees, etc. It is natural to expect that the Church-Kleene theorem can be extended to hold for these structures.
The typed version of the lambda calculus is obtained by imposing simple types on the terms of the lambda calculus. The problem of representing structures is basically the same in the typed lambda calculus; however, the rigid type structure imposed on the syntax of the lambda calculus dramatically reduces the expressiveness of functions on these structures. Interestingly enough, the solution to the representability problem varies for different structures.
The first results concerning representability in the typed λ calculus were proved by Schwichtenberg in 1975 and independently by Statman (see [Sch75], [Sta79]). Schwichtenberg studied numerical functions represented in the typed lambda calculus and proved the following characterization: the lambda definable functions are exactly those generated by composition from the constants 0 and 1 and the operations of addition, multiplication and conditional (the extended polynomials). A similar result for word operations was obtained by Zaionc in [Zai87]. The word functions represented in the typed lambda calculus are exactly those generated by composition from the constant Λ (empty word) and the operations append, substitution and cut. The results of Schwichtenberg and Zaionc were extended to the structure of binary trees [Zai90]. It was shown that the λ definable tree operations are those obtained from initial functions by composition and a limited version of primitive recursion. Leivant [Lei89] showed that recursion is essential and cannot be removed from this characterization. A similar result was obtained for λ definable operations on an arbitrary homogeneous free algebra.
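Schwichtenberg's class of extended polynomials can be sketched concretely. The exact form of the conditional below (cond(x, y, z) = y if x = 0, else z) is the usual choice and an assumption here; [Sch75] may state it in a slightly different but equivalent form.

```python
# Numerical functions generated by composition from 0, 1, addition,
# multiplication, and the conditional: the extended polynomials.

def add(x, y): return x + y
def mul(x, y): return x * y
def cond(x, y, z): return y if x == 0 else z

# An extended polynomial built by composition:
def f(x, y):
    return cond(x, add(1, y), mul(x, y))
```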
In this paper we examine the situation when the input and output algebras are generally different. The proof of the main result is obtained by inductive decomposition of a closed term which represents a function between two different algebras. While the decomposed terms are generally simpler according to some measure of complexity, they represent operations between algebras definitely different from the algebras we started with. Therefore, the problem must be presented in a more general setting in which we consider functions from a product of several, not necessarily identical, algebras.
1. Free Algebras
An algebra A given by a signature SA = [α1, ..., αn] has n constructors a1, ..., an of arities α1, ..., αn respectively. Expressions of the algebra A are defined by induction as the minimal set such that if αi = 0 then ai is an expression, and if αi > 0 and t1, ..., tαi are expressions then ai(t1, ..., tαi) is an expression. We may assume that at least one αi is equal to 0; otherwise the set of expressions is empty. By A we also mean the set of all expressions in the algebra given by the signature SA = [α1, ..., αn]. For simplicity we are going to write A = [α1, ..., αn] to say that A is an algebra given by the signature [α1, ..., αn]. If A1, ..., An are algebras
Definition 1.5 Let A = [α1, ..., αn]. Let a1, ..., an be all constructors in the algebra A such that the arity of ai is αi. A function h : A × B → C is defined by recursion from functions f1 : C^α1 × B → C, ..., fn : C^αn × B → C if for all i ≤ n the following equations hold:
h(ai(t1, ..., tαi), b) = fi(h(t1, b), ..., h(tαi, b), b).
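A minimal sketch of this recursion schema for the concrete signature [2, 0] (binary trees), with an assumed tuple encoding of expressions:

```python
# Signature A = [2, 0]: a binary constructor a1 and a nullary e.
# Given f1 : C^2 x B -> C and f2 : B -> C (the arity-0 case), the schema
# reads h(a1(t1, t2), b) = f1(h(t1, b), h(t2, b), b) and h(e, b) = f2(b).

def make_rec(f1, f2):
    def h(t, b):
        if t == 'e':                        # nullary constructor
            return f2(b)
        _, t1, t2 = t                       # t = ('a1', t1, t2)
        return f1(h(t1, b), h(t2, b), b)
    return h

# Example instance: the size of a tree, ignoring the parameter b.
size = make_rec(lambda y, z, b: y + z + 1, lambda b: 1)
```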
2. Extended Typed λ Calculus
Our language is derived from Church's [Chu40] simple theory of types. Every term possesses a unique type which indicates its position in a functional hierarchy. Let TYPE be the set of types, defined as follows: 0 is a type, and if τ and μ are types then τ → μ is a type. For any type τ we define the numbers rank(τ) and arg(τ) as follows: arg(0) = rank(0) = 0, arg(τ → μ) = 1 + arg(μ), and rank(τ → μ) = max(rank(τ) + 1, rank(μ)). Associated with each type τ is a denumerable set of variables V(τ). Any variable of type τ is a term of type τ. If T is a term of type τ → μ and S is a term of type τ, then TS is a term of type μ. If T is a term of type μ and x is a variable of type τ, then λx.T is a term of type τ → μ. If T is a term of type τ we write T ∈ τ. We shall use the notation λx1...xn.T for the term λx1.(λx2.(...(λxn.T)...)) and TS1...Sn for (...(TS1)...)Sn. If T is a term and x is a variable of the same type as a term S, then T[x/S] denotes the substitution of the term S for each free occurrence of x in T. The axioms of equality between terms have the form of βη conversions, and convertible terms are written T =βη S. By Cl(τ) we mean the set of all closed terms (without free variables) of type τ. A term T is in long normal form if T = λx1...xn.yT1...Tk, where y is an xi for some i ≤ n or y is a free variable, the Tj for j ≤ k are in long normal form, and yT1...Tk is a term of type 0. Long normal forms exist and are unique up to βη conversion.
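The functions arg and rank defined above can be sketched as code, with an assumed pair encoding of arrow types:

```python
# Types are encoded as 0 for the ground type and a pair (tau, mu)
# for the arrow type tau -> mu.

def arg(t):
    return 0 if t == 0 else 1 + arg(t[1])

def rank(t):
    if t == 0:
        return 0
    tau, mu = t
    return max(rank(tau) + 1, rank(mu))

# e.g. the type (0 -> 0) -> 0, encoded ((0, 0), 0), has arg 1 and rank 2
```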
The language over tuple types is built in a similar way. If M is a finite string of terms (M1, ..., Mn) of types τ1, ..., τn respectively, then M is a tuple term of the type τ1 × ⋯ × τn. We use the same notation M ∈ τ to say that Mi is a term of type τi for i ≤ n. This definition may be iterated, so we may consider strings of strings of terms, and so on. The empty tuple of terms is denoted by Ω, of type ω. In the case when all Mi are the same we use M^n instead of (M, ..., M), with Ω^n = Ω. Term formation is as follows. If M ∈ τ1 × ⋯ × τn → μ and N ∈ τ1 × ⋯ × τn, then by MN we mean MN1...Nn of type μ, with MΩ = M when τ1 × ⋯ × τn = ω (n = 0). If M ∈ τ → μ1 × ⋯ × μm and N ∈ τ, then by MN we mean (M1N, ..., MmN) of type μ1 × ⋯ × μm, with ΩN = Ω in the case m = 0. If M ∈ μ and x is a variable of type τ1 × ⋯ × τn (which means that x has the form of a tuple term (x1, ..., xn)), then by λx.M we mean λx1...xn.M of type τ1 × ⋯ × τn → μ, with λx.M = M for a variable x of type ω. If M ∈ μ1 × ⋯ × μm and x is a variable of type τ, then by λx.M we mean (λx.M1, ..., λx.Mm) of type τ → μ1 × ⋯ × μm, with λx.Ω = Ω when m = 0. If M is a term of type (τ1 → μ1) × ⋯ × (τn → μn) and N is a term of type τ1 × ⋯ × τn, then by the parallel application M ◇ N we mean the term (M1N1, ..., MnNn) of type μ1 × ⋯ × μn. If x = (x1, ..., xn) is a variable of type τ1 × ⋯ × τn and M ∈ μ1 × ⋯ × μn, then by the parallel abstraction λx • M we mean the term (λx1.M1, ..., λxn.Mn) of type (τ1 → μ1) × ⋯ × (τn → μn). We may summarize all those definitions by:
Having proved the existence of the long normal form in the ordinary typed λ calculus, we can show the same for tuple terms by induction. If M is a term of type μ1 × ⋯ × μn, then by Mi we mean the i-th coordinate of M; therefore we have
3. Representability
If A is an algebra given by a signature SA = [α1, ..., αa], then by τA we mean the type (0^α1 → 0) → ... → (0^αa → 0) → 0. By τiA for i ≤ a we mean the i-th component of the type τA, i.e., τiA = 0^αi → 0. We will see that closed terms of this type reflect constructions in algebra A. Assuming that at least one αi is 0, we have that τA is not empty. τA is a simple type for any algebra A. There is a natural 1-1 isomorphism between expressions of algebra A and closed terms of type τA. Let a1, ..., aa be all constructors of the term algebra, of arities α1, ..., αa respectively. If ai is a 0-ary constructor in A, then the closed term λx1...xa.xi represents ai. If αi > 0 and t1, ..., tαi are expressions in A represented by closed terms T1, ..., Tαi of type τA, then the expression ai(t1, ..., tαi) is represented by the term λx1...xa.xi(T1x1...xa)...(Tαi x1...xa). Thus, we have a 1-1 correspondence between closed terms of type τA and expressions of algebra A. The unique (up to βη conversion) term of type τA which represents an expression t in algebra A is denoted by t̄. Let A1, ..., An and B be algebras. A function h : A1 × ... × An → B is represented by a closed term H of type τA1 → ... → τAn → τB if for all expressions t1 ∈ A1, ..., tn ∈ An the following terms are βη convertible:
H t̄1 ... t̄n =βη s̄, where s = h(t1, ..., tn).
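This correspondence can be sketched for the concrete signature [2, 0], modelling closed terms of type τA as functions of the constructor interpretations (the tuple encoding of expressions is an illustrative assumption):

```python
# A = [2, 0]: one binary constructor a1 and one nullary constructor a2.
# An expression t is represented by a closed term of type
# (0^2 -> 0) -> 0 -> 0, here a Python function of x1, x2.

def rep(t):
    if t == 'e':                                  # a2 |-> lam x1 x2. x2
        return lambda x1, x2: x2
    _, l, r = t                                   # t = ('a1', l, r)
    return lambda x1, x2: x1(rep(l)(x1, x2), rep(r)(x1, x2))

# Applying a representative to interpretations of the constructors
# evaluates the expression, e.g. counting the leaves of a tree:
leaves = rep(('a1', 'e', ('a1', 'e', 'e')))(lambda a, b: a + b, 1)
```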
leftmost(e) = 0
leftmost(a(t1, t2)) = leftmost(t1) + 1
The function leftmost is obtained by the recursion schema from the functions f1(y, z) = y + 1 and f2 = 0. Since f1 and f2 are λ functions which are nonterminal trees in N, the function leftmost is also a λ function (see definition 1.6).
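The leftmost example as a direct sketch, with an assumed tuple encoding of binary trees:

```python
# Binary trees built from a nullary constructor e and a binary
# constructor a; leftmost is the recursion instance with
# f1(y, z) = y + 1 and f2 = 0.

def leftmost(t):
    if t == 'e':                       # leftmost(e) = 0
        return 0
    _, t1, t2 = t                      # t = ('a', t1, t2)
    return leftmost(t1) + 1            # f1(y, z) = y + 1; z is not used
```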
The next four lemmas, 3.5, 3.6, 3.7 and 3.8, are concerned with type checking of particular terms and will be used afterwards in lemmas 3.10, 3.13 and 3.14, as well as in theorem 4.6.
Lemma 3.11 The collection κA (see definition 3.4) of terms is just the set of representatives of nonterminal trees in A.
Proof. By induction on the construction of elements from κA. The constructors are represented (see definition 3.3). Projections are represented. Let f : A^n → A be a function represented by F ∈ κA. Let functions g1 : A^k → A, ..., gn : A^k → A be represented by terms G1, ..., Gn from κA. The function h defined by h(e1, ..., ek) = f(g1(e1, ..., ek), ..., gn(e1, ..., ek)) is represented by the term H = λT.F(G1T)...(GnT). By simple induction on the construction of F we can check that H ∈ κA.
λx.G(λz.Px)x
  =βη λx.((λy.Sy)(λz.Px)x)    (definition of G)
  =βη λx.(S(λz.Px)x)          (β conversion 2.13)
  =βη λx.((λz.Px)x)           (formulas 2.18 and 2.17)
  =βη λx.(Px)                 (β conversion 2.13)
  =βη P                       (η conversion 2.15)
  =βη GP                      (definition of G)
Let D be a closed term of type (τA)^p → (τA)^a′ such that every Dj ∈ κA for j ≤ a′. For induction we assume that every Dj satisfies the lemma, which means that
λx.E(λz.Px)x
  =βη λx.xi((λy.D(λz.Py)y)x)    (definition of E and β conversion 2.13, twice)
  =βη λx.xi(DPx)                (inductive assumption for λy.D(λz.Py)y)
  =βη EP                        (definition of E)
K ⇒ λxq.q1
K ⇒ λxq.qa
K ⇒ λxq.xj  when γj = 0
K ⇒ λxq.xj(Kxq)⋯(Kxq)  when γj > 0  (γj times)
nonterminal trees. Let G = (G1, ..., Gn) and K = (K1, ..., Kγ). So we have G = λXτx.Kx(X ◇ x). Let us check that the theorem also holds for K′ = λxq.xj(Kxq). Let G′ be λXτx.K′x(X ◇ x).
4. Main Result
Lemma 4.1 Let A be an algebra based on the signature [α1, ..., αa]. Let B be a product of algebras. Let C be an algebra. Let F be a closed term of type ((τC)^α1 → τB → τC) × ⋯ × ((τC)^αa → τB → τC) representing the system of functions f1, ..., fa. Let h : A × B → C be a function defined by recursion from the functions f1, ..., fa (see definition 1.5). Let H be a closed term of type τA → τB → τC. The following two statements are equivalent:
1. Term H represents h
such that fi : C^αi × B → C for all i ≤ a. Let us assume that for every b ∈ B the functions (f1/b) : C^α1 → C, ..., (fa/b) : C^αa → C are nonterminal trees. Let h : A × B → C be a function defined by primitive recursion from f1, ..., fa. Let the system f1, ..., fa be represented by a closed term F of type ((τC)^α1 → τB → τC) × ⋯ × ((τC)^αa → τB → τC). For every b ∈ B we define functions g1^b, ..., ga^b by gi^b(x) = fi(x, b). The function gi^b is represented by the term Gi^b = λXi.FXib̄ for i ≤ a. Therefore the tuple term G^b = (G1^b, ..., Ga^b) is given by λX • FXb̄. Since the gi^b = fi/b for i ≤ a are nonterminal trees in C, according to lemma 3.11 the term Gi^b belongs to κC. Let H be a closed term of type τA → τB → τC defined by λSTx.S(λR • (F(λz.R))Tx) (see lemma 3.9). By lemma 3.13 it holds that for every T and Y, (G ◇ (C ◇ Y))T =βη (F ◇ ((G ◇ Y)T))T. According to lemma 4.1, H represents h. It means that the function h is λ definable.
p(λTx.xi(ITx)) = 0 if ITx =βη x
In the next theorem we are going to design a procedure which reduces the problem of representability of a closed term to a few possibly "simpler" problems. To guarantee the termination of this procedure we are going to investigate a quasi-order on terms. Let us consider the set of pairs of natural numbers well-ordered in the ordinal ω × ω. For every closed term T of type τB → τC where B is
Definition 4.4 Let B and B′ be two products of algebras. Let C and C′ be algebras. Let P be a term of the type τB → τC and P′ be a term of the type τB′ → τC′. We call the term P′ "simpler" than P if (p(P′), ξ(P′)) < (p(P), ξ(P)) in the ordinal ω × ω. It means that p(P′) < p(P), or if p(P′) = p(P) then ξ(P′) < ξ(P).
constructor of the algebra C of arity γ1. The function p, as the composition of λ functions, is a λ function.
(Case 2) Let T = λTx.xi(ITx) for some closed term I of the type τB → ((τC)^γ1 → τC) × ⋯ × ((τC)^γa → τC), where A is the i-th algebra in the product B, given by the signature [α1, ..., αa]. Let ITx ≠βη x. Let F be a closed term of type ((τC)^γ1 → τB → τC) × ⋯ × ((τC)^γa → τB → τC) defined by F = λY.(λTx.((ITx) ◇ (Yx))). By lemma
4.5, p(Fj) < p(P) for all j ≤ a. Since any term Fj is "simpler" than the term P, all Fj represent λ functions. Let f1, ..., fa be the system of λ functions represented by F, where fj : C^γj × B → C. By lemma 3.14 it holds that for every T, the closed term G defined by λX • (I ◇ X)T belongs to κC. Since λX • (I ◇ X)T belongs to κC, according to lemma 3.11 all functions fj/b for j ≤ a are nonterminal trees in C. Let h : A × B → C (where A is Bi) be a function defined by limited primitive recursion from the system f1, ..., fa. Let H be a closed term of type τA → τB → τC defined by λSTx.S(ITx). By lemma 3.10 we know that for every T and Y, (H ◇ (C ◇ Y))T =βη (I ◇ ((H ◇ Y)T))T. Thanks to lemma 4.1 it means that H represents h. Therefore h is a λ function. The function h is represented by λSTx.S(ITx), but the function p is represented by λTx.xi(ITx). Therefore the following relation between the functions h and p holds: p(b1, ..., bk) = h(bi, b1, ..., bk) for all expressions b1, ..., bk. Since the class of λ functions is closed under composition, it holds that p is a λ function.
References.
[BöB85] Corrado Böhm and Alessandro Berarducci, Automatic synthesis of typed λ programs on term algebras, Theoretical Computer Science 39 (1985), 135-154.
[Lei89] Daniel Leivant, Subrecursion and lambda representation over free algebras, in: S. Buss and P. Scott (eds.), Feasible Mathematics (Proceedings of the June 1988 Workshop at Cornell).
[Mad91] Madry, M., On the λ definable functions between numbers, words and trees, Fundamenta Informaticae, 1991.
[Sch75] Schwichtenberg, H., Definierbare Funktionen im λ-Kalkül mit Typen, Arch. Math. Logik Grundlagenforsch. 17 (1975-76), pp. 113-114.
[Sta79] Statman, R., Intuitionistic propositional logic is polynomial-space complete, Theoretical Computer Science 9 (1979), 67-72.
[Zai87] Zaionc, M., Word operations definable in the typed λ calculus, Theoretical Computer Science 52 (1987), pp. 1-14.
[Zai90] Zaionc, M., A Characteristic of λ definable Tree Operations, Information and Computation 89, No. 1 (1990), 35-46.
[Zai91] Zaionc, M., λ definability on free algebras, Annals of Pure and Applied Logic 51 (1991), pp. 279-300.
Semi-Unification and Generalizations of a
Particularly Simple Form
1 Introduction
When dealing with the generalization of complex terms in short proofs, one of the first questions is: Have the innermost parts of a sufficiently complex term in the end-formula any influence on the proof? Or, more formally: Is it possible to transform a given proof of A(t) into a proof of A(t′), where t′ is the result of replacing sufficiently deep subterms of t by corresponding variables? We call this type of generalization generalization of a particularly simple form. There are calculi which admit this type of generalization trivially, without changing the logical structure of derivations. Take for example first-order resolution calculi: the generalizations are provided by the so-called lifting lemmas (cf. [CL 73], Lemma 5.1). Other calculi admit this form of generalization after adequate transformations; for LK this means elimination of cuts (cf. [KP 88], Chapter 2).
In this paper we concentrate on schematic³ theories within usual logical deduction systems. It is known from the literature that schematic theories which are identical in the sense of model theory may behave in a completely different manner with respect to the generalization principle mentioned above. E.g., for every finitely axiomatized number theory Z augmented by the least number principle
∃yα(y) ⊃ ∃x(α(x) ∧ ∀y(α(y) ⊃ x ≤ y))  (LP)
restricted to purely existential formulas there is a function g such that, for all k and n,
(Z + LP(Σ1) ⊢k A(s^n(0)) and n ≥ g(k, A(a)))  ⟹  ∃m(Z + LP(Σ1) ⊢ ∀xA(s^m(x)))  (*)
* Wiedner Hauptstr. 8/E118-2, A-1040 Wien, Austria. Email: baaz@logic.tuwien.ac.at
** Karlsplatz 13/E185-2, A-1040 Wien, Austria. Email: salzer@logic.tuwien.ac.at
³ Throughout this paper a schema is a formula which, in addition to ordinary function and predicate symbols, may contain predicate variables. An instance of a schema is a first-order formula obtained by replacing the atomic second-order semi-formulas α(t1, ..., tn) by first-order formulas of corresponding type. A schematic theory is a finite set of schemata.
2 (Semi-)Term Bases
T(Π, A(a1, ..., an)) = {(t1, ..., tn) | Π′ proves A(t1, ..., tn) for some Π′ ≡ Π}
where ≡ is some equivalence relation over proofs. We are interested in generalizations where
(A) T(Π, A(a1, ..., an)) is the set of all instances of some n-tuple of terms.
(B) ≡ is chosen in a way such that for all k ∈ ω there are only finitely many equivalence classes of proofs of length ≤ k. We will concentrate on the case where Π ≡ Π′ iff Π and Π′ have the same extended proof matrix (see Section 7).
Conditions (A) and (B) imply the existence of a term basis for A(a1, ..., an) and all k ∈ ω in a theory T.
Definition 1 (Term Basis). A finite set of n-tuples of terms (t1^i, ..., tn^i), i = 1, ..., l, is a term basis for A(a1, ..., an) and k iff
1. T ⊢ A(t1^i, ..., tn^i) for 1 ≤ i ≤ l, and
2. if T ⊢k A(s1, ..., sn) for an n-tuple of variable-free terms then there is a substitution σ such that (s1, ..., sn) = (t1^i, ..., tn^i)σ for some i, 1 ≤ i ≤ l.
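Clause 2 asks whether a tuple of variable-free terms is a substitution instance of a basis tuple; this is plain first-order matching, sketched here under an assumed term encoding (('var', name) for variables, (fname, arg1, ..., argk) for function terms):

```python
def match(t, s, sigma):
    if t[0] == 'var':                       # variable: bind or check binding
        x = t[1]
        if x in sigma:
            return sigma[x] == s
        sigma[x] = s
        return True
    if t[0] != s[0] or len(t) != len(s):    # clash of function symbols
        return False
    return all(match(a, b, sigma) for a, b in zip(t[1:], s[1:]))

def is_instance(ts, ss):
    """True iff the tuple ss equals ts under a single substitution sigma."""
    sigma = {}
    return len(ts) == len(ss) and all(match(t, s, sigma) for t, s in zip(ts, ss))
```

Note that one substitution must work across the whole tuple, which is why sigma is shared between the components.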
We extend this concept to semi-terms bound by strong quantifiers.⁴ For a bound variable x, let STx be the set of semi-terms containing x and function symbols, but no constants or other variables. STc denotes the set of closed terms. For a formula A(a1, ..., an) with free variables a1, ..., an, let a binding assignment γ be a function that assigns to each ai either one of the variables that are bound by a strong quantifier occurrence in A or the symbol c. Let
Again, if T(Π, A(a1, ..., an), γ) consists of all instances of some tuple of terms consistent with γ, we obtain the concept of semi-term bases.
Definition 2 (Semi-Term Basis). A finite set of n-tuples of pure semi-terms (t1^i, ..., tn^i), i = 1, ..., l, is called a semi-term basis for A(a1, ..., an), γ and k ∈ ω in a theory T if the following holds:
1. T ⊢ A′(t1^i, ..., tn^i) for 1 ≤ i ≤ l, where A′ is obtained from A by replacing the strong quantifier occurrence Qx by Qx∀ū if γ(aj) = x and ū are the variables in tj.
2. For all n-tuples (s1, ..., sn), where sj ∈ STγ(aj), T ⊢k A(s1, ..., sn) implies that there is an i, 1 ≤ i ≤ l, and a substitution σ such that sj = tj^i σ for all j, 1 ≤ j ≤ n.
⁴ An occurrence of ∃ (∀) is weak in a formula if it is in the scope of an even (uneven) number of negation signs, and strong if it is in the scope of an uneven (even) number of negation signs; sub-formulas A ⊃ B are treated as ¬A ∨ B.
For the following we assume that the theories under consideration are consistent
and prove the usual axioms of equality. An almost immediate consequence of
the existence of term bases that usually receives much attention is a version of
Kreisel's conjecture.
Proof. We prove the following, somewhat more general form of Kreisel's conjecture:
Let TB be a term basis for A(a1, ..., ar) and k. Since TB is finite there is a bound h such that dp(t) ≤ h for all t occurring in TB. All tuples (t1, ..., tr) where ti = s^g(0) for some g ≤ h or ti = s^{h+1}(bi) for pairwise distinct free variables bi are substitution instances of tuples in the term basis. It follows that
T ⊢ A(t1, ..., tr).
We now derive consequences of the fact that large semi-terms can be generalized.
Theorem 4. If a theory T admits semi-term bases then for every formula A(a, a1, ..., ar)
Proof. Let SB be a semi-term basis for ∃xA(x, a1, ..., ar) and k. Since SB is finite we can choose terms s^{n1}(y), ..., s^{nr}(y) such that all ni as well as |ni − nj| (for i ≠ j and 1 ≤ i, j ≤ r) are greater than the maximal depth of terms in SB. Because of
T ⊢ ∃x∀yA(x, s^{n1}(y), ..., s^{nr}(y)),
(s^{n1}(y), ..., s^{nr}(y)) is an instance of some tuple (s^{m1}(z1), ..., s^{mr}(zr)) in SB, where z1, ..., zr are pairwise distinct variables. By definition,
⊢ ∃x∀y1...∀yr A(x, s^{m1}(y1), ..., s^{mr}(yr)). □
Another application of semi-term bases shows: if the existence of a bound beyond
which a statement holds can be shown by a short proof then this bound can be
made explicit within the theory.
Theorem 5. If a theory T admits semi-term bases and proves ∀x∃y(x<y) as well as ∀x(s^n(0)<x ⊃ ∃y(x=s^{n+1}(y))) for all n ∈ ω, then for every formula A(a)
T ⊢ ∀y(s^h(0)<y ⊃ A(y))
Proof. Obvious from the laws for shifting quantifiers in first-order logic. []
Theorem 7. Let T be a function-free schematic theory proving ∀x(h(x)=x) for some monadic function symbol h.
(a) If all schemata S = S[(Q1x1⋯α(x1)⋯), ..., (Qkxk⋯α(xk)⋯)] in T are of a form such that no strong quantifier occurrence Qixi is in the scope of any weak quantifier occurrence, and if T admits term bases, then T is splittable.
(b) If T admits semi-term bases then T is splittable.
Proof.
(a) Let T consist of the schema
∃xα(x) ∧ ∃x¬α(x) ∧ ∀x∀y(α(x) ∧ ¬α(y) ⊃ x<y)
⊃ ∃x∀z((α(z) ⊃ z≤x) ∧ (¬α(z) ⊃ x≤z))  (bLD)
is splittable, too.
Now instantiate u=v for α(u, v). Shifting the quantifiers we get
We now have to fix our concepts of proof (T ⊢ A and T ⊢k A) such that
1. the usual notions of proofs are captured, and
2. the proofs, or rather their transformed forms, admit most general, term-minimal proofs within finitely many equivalence classes of proofs. Every term-minimal proof provides one element of the (semi-)term basis.
We shall work with LK in one of its usual formalizations.
The length len(Π) of a proof Π (also: its number of steps) is defined as the number of applications of inference rules. We write T ⊢k A for a schematic theory T if there is an LK-proof of a sequent ∀x̄1T1, ..., ∀x̄mTm → A of length ≤ k, where T1, ..., Tm are instances of schemata in T and the quantifiers ∀x̄1, ..., ∀x̄m bind the parameters of these instances.
Theorem 8. Let T be a schematic theory. T ⊢_k A(t_1,...,t_n) implies that there
is a proof Π of a sequent ∀x̄_1 T_1 ... ∀x̄_m T_m → A such that
(a) Π is cut-free⁵ and uses only atomic initial sequents and weakenings, and
(b) len(Π) ≤ φ(k, A(a_1,...,a_n)) for a recursive φ.
Proof. Using the argument of [Pari 73, KP 88], all formulas in the proof can
be restricted to formulas of a certain logical depth depending on T, k and
A(a_1,...,a_n), and therefore the usual bounds on cut-elimination can be used.
□
⁵ An alternative to cut-elimination is the addition of the schema α ⊃ α, which is
splittable, and the replacement of all cuts by ⊃:left. The cut-free proof, however,
admits stronger generalizations than the proof using instances of α ⊃ α. [Kreis 94]
6 Semi-Unification
A skeleton of an LK-derivation is a finite tree where all inner nodes are labelled
with the name of the inferences to be applied. Quantifier inferences are augmented
by their bound variables. Additionally, predicate symbols are attached to the
initial nodes and to the nodes labelled with 'weakening:right' or 'weakening:left';
the atomic axioms and weakenings are supposed to be based on these predicate
symbols. The skeleton obviously determines the logical structure of the proof. Let
γ* be a mapping assigning to each argument position in a predicate occurrence
a free variable, a bound variable or a constant; γ* denotes the center of the
monadic terms. Every cut-free LK-derivation in a monadic language starting
from atomic axioms and weakenings determines uniquely a skeleton S and a
mapping γ*.⁶
⁶ Clearly, this definition makes sense only for languages consisting only of monadic
function symbols.
An extended proof matrix M(S, γ*) is constructed as follows. First add initial
sequents of the form P(a_1,...,a_n) → P(a_1,...,a_n), where P is the predicate
symbol according to S and a_1,...,a_n are pairwise distinct fresh variables. Then
add sequents to the inner nodes of the skeleton by traversing it downwards as
follows:
1. If the node is labelled with ∧:right then its upper sequents are of the form
Π → Δ, B and Π′ → Δ′, C, respectively,
where the sequences Π(Δ) and Π′(Δ′) are equal up to naming of variables.
Unify those formulas and apply the unifier to all sequents above the node.
Then apply the rule ∧:right to obtain the lower sequent, i.e., the sequent
labelling this node.
If the node is labelled with an exchange, a contraction, a cut or another
propositional inference rule, proceed analogously.
2. If the inference is a weakening then an atomic formula P(a_1,...,a_n) is introduced, where P is determined by S and (a_1,...,a_n) is an n-tuple of pairwise
distinct fresh variables.
3. If the inference is ∃:right with bound variable x and the upper sequent is
Π → Δ, B(a_1,...,a_n)
then add as lower sequent
8 Construction of an Adequate Semi-Unification Problem
Let the following information be given:
(A1) a cut-free extended proof matrix M(S, γ*);
(A2) a sequence H_1,...,H_m of function-free schemata;
(A3) a formula A(a_1,...,a_n) with a binding assignment γ coinciding with γ*.
All bound variables in items A2 and A3 are assumed to be different.
We construct a semi-unification problem U together with a basic substitution ξ to be applied to the matrix such that
(B1) every proof of A(t_1,...,t_n), where t_i ∈ ST_B(a_i), represents a solution for U;
(B2) every solution σ for U can be applied to the substituted proof matrix
M(S, γ*)ξ, which then can be extended to a proof of
∀x̄_1 T_1 ... ∀x̄_m T_m → A*(a_1σ,...,a_nσ)
where
(Example: an extended proof matrix built from atomic initial sequents of the
form P(u_1, u_2) → P(u_1, u_2), ending in a sequent of the form
∀y_2(P(u_1, u_2) ∨ P(u_3, u_4) ∨ P(u_5, u_6)) → ∃xP(v_1, v_2),
together with a derivation of ∃xP(v_1, v_2) → ∃z_1∃z_2P(z_1, z_2).)
where c ≠ d are arbitrary constants. We first unify the end sequent of the matrix
with ∀x̄_1 T'_1 ... ∀x̄_m T'_m → A(a_1,...,a_n), where ∀x̄_i T'_i is obtained from the corresponding schema H_i by replacing the different occurrences of schema variables
by different propositional variables and by taking ∀x̄_i from the given matrix; the
resulting unifier initializes ξ. We proceed by traversing the matrix bottom-up.
(C1) The inference is a propositional or structural inference: nothing to do.
(C2) The inference is ∀:right, the lower sequent is of the form
Π → Γ, ∀xB(x, a_{i_1},...,a_{i_r}),
where ∀xB(x, a_{i_1},...,a_{i_r}) is a subformula of A(a_1,...,a_n) and γ(a_{i_j}) = x.
In this case unify the upper sequent with
Π → Γ, B(e, b_1,...,b_r),
where e is the eigenvariable associated with x by γ* and b_1,...,b_r are new
variables. Apply the most general unifier to the matrix as well as to ξ.
Extend U by the two pairs
⟨(e, b_1,...,b_r), (x, a_{i_1},...,a_{i_r})⟩ and ⟨(x, a_{i_1},...,a_{i_r}), (e, b_1,...,b_r)⟩
(C3) The inference is ∀:right, the lower sequent is of the form
Π → Γ, ∀xB,
where ∀xB will become a part of a schema instance. Let B* be the formula
occurring immediately above ∀xB in the proof matrix, and let p_1,...,p_r be
the positions in B for which γ* yields x. Identify the terms t_i occurring in
position p_i in B, and likewise the terms u_i occurring in position p_i in B*.
Extend U by the two pairs
⟨(e, u_1,...,u_r), (x, t_1,...,t_r)⟩ and ⟨(x, t_1,...,t_r), (e, u_1,...,u_r)⟩,
where e is the eigenvariable associated with x by γ*.
(C4) The inference is ∃:right, the lower sequent is of the form
Π → Γ, ∃xB
P(a, r_3) → P(a, r_3)        P(s(a), s(r_3)) → P(s(a), s(r_3))
P(a, r_3) → ∃xP(t_1, t_2)    P(s(a), s(r_3)) → ∃xP(t_1, t_2)    P(0, p) → P(0, p)
P(a, r_3) ∨ P(s(a), s(r_3)) → ∃xP(t_1, t_2)                     P(0, p) → ∃xP(t_1, t_2)
P(a, r_3) ∨ P(s(a), s(r_3)) ∨ P(0, p) → ∃xP(t_1, t_2)
∀y_2(P(a, y_2) ∨ P(s(a), s(y_2)) ∨ P(0, p)) → ∃xP(t_1, t_2)
∃y_1∀y_2(P(y_1, y_2) ∨ P(s(y_1), s(y_2)) ∨ P(0, p)) → ∃xP(t_1, t_2)

P(r_1, r_2) → P(r_1, r_2)
P(r_1, r_2) → ∃z_2P(r_1, z_2)
P(r_1, r_2) → ∃z_1∃z_2P(z_1, z_2)
∃xP(t_1, t_2) → ∃z_1∃z_2P(z_1, z_2)
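The pairs collected in U in steps (C2) and (C3) form a semi-unification problem: a substitution σ solves a pair ⟨s, t⟩ when σ(t) is a syntactic instance of σ(s), i.e. some matching substitution ρ sends σ(s) to σ(t) (cf. [Baaz 93], [KTU 93]). The solution check can be sketched as follows, over a hypothetical nested-tuple term representation; the function names and encoding are illustrative, not from the paper:

```python
# Terms: a variable is a string, an application of a monadic function
# symbol f to a term t is the pair (f, t).

def apply_subst(sigma, t):
    """Apply a substitution (a dict mapping variables to terms)."""
    if isinstance(t, str):
        return sigma.get(t, t)
    f, arg = t
    return (f, apply_subst(sigma, arg))

def match(pattern, target, rho=None):
    """Return rho with apply_subst(rho, pattern) == target, else None."""
    rho = dict(rho or {})
    if isinstance(pattern, str):
        if pattern in rho and rho[pattern] != target:
            return None
        rho[pattern] = target
        return rho
    if isinstance(target, str) or pattern[0] != target[0]:
        return None
    return match(pattern[1], target[1], rho)

def solves(sigma, pairs):
    """Check that sigma solves every pair <s, t> of the instance."""
    return all(
        match(apply_subst(sigma, s), apply_subst(sigma, t)) is not None
        for s, t in pairs)
```

For example, the pair ⟨x, s(x)⟩ is solved by the empty substitution (ρ maps x to s(x)), while ⟨s(x), x⟩ has no solution; by [KTU 93], solvability of such instances is undecidable in general.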
9 Conclusion
The characterization results obtained in this paper can be used to demonstrate
that model-theoretically equivalent schemata cannot always be directly trans-
formed into each other. For example, successor induction SI cannot be directly
derived from the least number principle LP, even if finitely many arbitrarily
strong consistent formulas are added. In this context, 'directly derived' means
that all instances are derived within bounded depth. The analysis of the rela-
tion between different model-theoretically equivalent schemata is part of a joint
project with P. Wojtylak.
The main interest for computer science, however, seems to be the charac-
terization of schematic theories where term minimal proofs are obtainable, i.e.,
where lifting lemmas in the widest sense are possible.
References
[Baaz 93] Baaz, M. Note on the existence of most general semi-unifiers. In Arithmetic,
Proof Theory and Computational Complexity, P. Clote and J. Krajíček, editors, pp. 20-29. Oxford University Press, 1993.
[BP 93] Baaz, M. and P. Pudlák. Kreisel's conjecture for L∃₁. In Arithmetic, Proof
Theory and Computational Complexity, P. Clote and J. Krajíček, editors,
pp. 30-60. Oxford University Press, 1993.
[CL 73] Chang, C.-L. and Lee, R.C.-T. Symbolic Logic and Mechanical Theorem
Proving. Academic Press, 1973.
[Farm 88] Farmer, W. M. A unification algorithm for second-order monadic terms.
Ann. Pure Appl. Logic, 39 (1988), 131-174.
[KP 88] Krajíček, J. and P. Pudlák. The number of proof lines and the size of proofs
in first order logic. Arch. Math. Logic, 27 (1988), 69-84.
[Kreis 94] Kreisel, G. Generalizing Proofs: Implications for Generalizing Theorems
Proved in Relatively Few Lines, I. Unpublished manuscript.
[KTU 93] Kfoury, A., J. Tiuryn, and P. Urzyczyn. The undecidability of the semi-
unification problem. Information and Computation, 102 (1993), 83-101.
[Maka 77] Makanin, G. S. The problem of solvability of equations in a free semigroup. Mat. Sb., 103(2), 147-236. English translation in Math. USSR Sb.,
32 (1977).
[Pari 73] Parikh, R. J. Some results on the length of proofs. Trans. Am. Math. Soc.,
177 (1973), 29-36.
[Rich 74] Richardson, D. Sets of theorems with short proofs. J. Symbolic Logic, 39(2)
(1974), 235-242.
A Mixed Linear and Non-Linear Logic:
Proofs, Terms and Models
(Extended Abstract)*
P. N. Benton†
University of Cambridge
Abstract
Intuitionistic linear logic regains the expressive power of intuitionistic logic through
the ! ('of course') modality. Benton, Bierman, Hyland and de Paiva have given a term
assignment system for ILL and an associated notion of categorical model in which the
! modality is modelled by a comonad satisfying certain extra conditions. Ordinary
intuitionistic logic is then modelled in a cartesian closed category which arises as a
full subcategory of the category of coalgebras for the comonad.
This paper attempts to explain the connection between ILL and IL more directly
and symmetrically by giving a logic, term calculus and categorical model for a system
in which the linear and non-linear worlds exist on an equal footing, with operations
allowing one to pass in both directions. We start from the categorical model of ILL
given by Benton, Bierman, Hyland and de Paiva and show that this is equivalent
to having a symmetric monoidal adjunction between a symmetric monoidal closed
category and a cartesian closed category. We then derive both a sequent calculus
and a natural deduction presentation of the logic corresponding to the new notion of
model.
1 Introduction
This paper concerns a variant of the intuitionistic fragment of Girard's linear logic [7].
Linear logic does not contain the structural rules of weakening and contraction, but these
are reintroduced in a controlled way via a unary operator !. The rules for ! allow ordinary
intuitionistic logic to be interpreted within intuitionistic linear logic.
In [5, 4], Benton, Bierman, Hyland and de Paiva formulated a natural deduction pre-
sentation of the multiplicative/exponential fragment of ILL, together with a term calculus
(extending the propositions as types analogy to linear logic) and a categorical model (a lin-
ear category). In that work, the multiplicative (i.e. ⊗) part of the logic is modelled in a
symmetric monoidal closed category (SMCC). That much is standard and well-understood.
The ! modality is then modelled by a monoidal comonad on the SMCC which is required
to satisfy certain extra (and non-trivial) conditions. These extra conditions are sufficient
*A considerably longer version of this paper is available as a University of Cambridge technical report
[2].
†Author's address: University of Cambridge, Computer Laboratory, New Museums Site, Pembroke
Street, Cambridge CB2 3QG, United Kingdom. Email: Nick.Benton@cl.cam.ac.uk. Research supported by
a SERC Fellowship and the EU Esprit project LOMAPS.
to ensure that the category of coalgebras for the comonad contains a full subcategory
which is cartesian closed and thus models the interpretation of IL in ILL.
Whilst the view that linear logic is primary and that ordinary logic is merely a part of
linear logic is appealing, it is not necessarily always the best way of seeing the situation.
This paper tries to present a more symmetric view of the relationship between IL and ILL
and it seems worth trying to give some motivation for why this might be worth doing.
From a practical point of view, there are a number of reasons why the standard linear
term calculus (LTC) of [5] might be considered unsuitable as the basis of a linear functional
programming language. Firstly, linear functional programming is verbose and unnatural -
whilst the LTC might well be a useful intermediate language for a compiler, it is not very
appropriate as a language for everyday programming. If linearity is to be made visible
to the programmer at all, it appears preferable to have some extension of a traditional
non-linear language in which one could write the occasional linear function in order to
deal with I/O, in-place update or whatever.
A more fundamental problem is that, despite considerable research effort, the precise
way in which a linear language can help with what we have deliberately referred to rather
vaguely as 'I/O, in-place update or whatever' is still not clear. Most published proposals
for using linear types to control or describe intensional features of functional programs
are either unconvincing or use type systems which are only loosely inspired by linear
logic. Systems in the last category can, pragmatically, be extremely successful; the most
obvious example being the language CLEAN. The type system of CLEAN [1] incorporates
a 'uniqueness' operator for (roughly) making non-linear types linear. This is in some
sense dual to the ! of linear logic, which allows linear types to be treated non-linearly.
Unique types in CLEAN are used to add destructive updates and I/O to the language in
a referentially transparent way.
One (somewhat speculative) aim of the work described here is to provide a sound
mathematical and logical basis for a type system like that of CLEAN. We are encour-
aged not only by the similarities between CLEAN and the calculus to be presented here
(the LNL term calculus), but also by the fact that other researchers looking at practical
implementations of linear languages have come up with systems which include aspects
of the LNL term calculus. For example, Lincoln and Mitchell's linear variant [10] of the
'three instruction machine' divides memory into two spaces corresponding to linear and
non-linear objects. Similarly, Wadler's 'active and passive' type system [14] separates lin-
ear from non-linear types. Jacobs [9] has also described how a sequent calculus inspired
by CLEAN's uniqueness types may be interpreted using the linear categories of [4] under
some extra simplifying assumptions.
From a more logical point of view, there has recently been much interest in Girard's
system LU [8] and related systems in which the (multi)sets of formulae occurring in sequents
are split into different zones. Formulae in some zones are treated classically, whilst those
in other zones are treated linearly. Intuitionistic logics inspired by LU have been proposed
by Plotkin [12] and by Wadler [15]. It is desirable to study the proof and model theory
of such systems directly, rather than treating them as syntactic sugar for, for example,
ordinary linear logic (if only to verify that it is possible to treat them as such syntactic
sugar). The logic of this paper should turn out to be equivalent to a subsystem of LU,
though there are some superficial differences of presentation.
From the categorical perspective, it seems natural to explore the more symmetric
situation where one starts from an SMCC and a CCC with (adjoint) functors between
them, rather than an SMCC with sufficient extra structure to ensure the existence of such
123
a CCC. This is particularly true in the light of the fact that the definition of a linear
category in [4] was arrived at mostly from the proof theory of linear logic, but also (and
this was something of a 'hidden agenda') from a desire to have enough structure to be able
to derive an appropriate CCC from the model. 1 It is also fair to say that the definition
of a linear category is surprisingly complicated, so looking for simpler models, or simpler
presentations of the same models, is a good idea.
The initial motivation for the present work comes from the categorical picture sketched
in the previous paragraph. Once the definition has been made a little more precise, we
shall show that such a situation leads to a comonad on the linear part of the model which
automatically satisfies all the extra conditions required of a linear category, and thus
gives a sound model of ILL including the ! operator. Furthermore, the converse holds:
every linear category gives rise to such a pair of categories. Thus we have an alternative,
simpler, definition of what constitutes a model for ILL. This can be seen as giving a purely
category-theoretic reconstruction of !, in that a linear category (a model for ILL with !) is
exactly what one obtains if one attempts directly to model an interpretation of IL in ILL
without the !.
Another interesting feature of the model is that it gives rise to a strong monad on the
CCC part. Thus one obtains a model not just of the lambda calculus, but of Moggi's
computational lambda calculus [11].
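Concretely, in the cpo-style example of Section 2.3 the induced monad T = G∘F on the cartesian side is lifting, and its strength is what interprets Moggi-style sequencing. The following sketch works under that reading; BOT is a hypothetical token for the added bottom element and the encoding is illustrative:

```python
# Lifting monad T X = X_bot on the cartesian side of the model.
# A defined value x is tagged as ("up", x); BOT stands for the
# freshly added least element.

BOT = object()

def unit(x):
    """eta : X -> T X, include X as the defined elements."""
    return ("up", x)

def bind(tx, f):
    """Kleisli extension of f : X -> T Y; strict in its argument."""
    if tx is BOT:
        return BOT
    _, x = tx
    return f(x)

def strength(x, ty):
    """t : X x T Y -> T (X x Y), the strength making T a strong monad."""
    if ty is BOT:
        return BOT
    return ("up", (x, ty[1]))
```

With bind one can read off the interpretation of Moggi's let: "let y <= e in e'" becomes bind applied to the meaning of e and the function y ↦ meaning of e'.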
Section 3 then looks at the logic and term calculus which are associated with our
new notion of model. We formulate a sequent calculus presentation which satisfies cut
elimination and then give an equivalent natural deduction system. This then gives, by
the Curry-Howard correspondence, an interesting term calculus which combines the usual
simply-typed lambda calculus with a linear lambda calculus. We also consider translations
in both directions between this new term calculus and the linear calculus of [5].
1This is not to say that there is anything in the model which is not justifiable in terms of the proof
theory (given a proper proof-theoretic account of T-rules), but merely that, given that a translation of IL
proofs into ILL proofs exists, any correct model for ILL must be able to reflect the translation semantically.
124
3. a pair of symmetric monoidal functors (G, n) : L → C and (F, m) : C → L between
them which form a symmetric monoidal adjunction with F ⊣ G.
Proposition 1 In an LNL model (in fact for any monoidal adjunction), the maps m_{X,Y}
are the components of a natural isomorphism with inverses p_{X,Y} and, furthermore, the
map m is an isomorphism with inverse p:
F(X) ⊗ F(Y) ≅ F(X × Y)
I ≅ F(1)
□
d_A =def p_{GA,GA} ∘ F(Δ_{GA})
Theorem 3 For any LNL model, e and d as defined above satisfy all the conditions given
in part 3 of Definition 2. In other words, any LNL model is a linear category.
Proof. This involves checking that a fairly large collection of diagrams all commute.
Although this is a lot of work, none of them are very difficult. Proposition 1 plays an
important role in several of them. Further details may be found in [2]. □
We now sketch the proof of the converse to Theorem 3. Whilst this is largely a matter
of recalling results which were proved in [4], by doing this carefully we obtain a slightly
better understanding of the situation.
Assume that L is a linear category as in Definition 2. We need to show that this gives
rise to a CCC C and a symmetric monoidal adjunction between L and C as in Definition 1.
Recall that the comonad on L gives rise to two categories of coalgebras:
- The Eilenberg-Moore category L^!. This has as objects all the !-coalgebras (A, h_A :
A → !A) and as morphisms all the coalgebra morphisms.
- The (co-)Kleisli category L_!. This is the full subcategory of L^! which has as objects
all the free !-coalgebras (!A, δ_A : !A → !!A).
Each of these categories comes with a pair of adjoint functors F ⊣ G where G : A ↦
(!A, δ_A) and F : (A, h_A) ↦ A.
Lemma 4 If L is a linear category then L^! has finite products, with the terminal object
given by (I, q : I → !I) and the product of (A, h_A) and (B, h_B) by (A ⊗ B, q_{A,B} ∘ (h_A ⊗ h_B)).
□
Lemma 5 In L^!, all the free coalgebras are exponentiable. That is, there is an internal hom into any free coalgebra (!B, δ_B). Furthermore, the internal hom is itself a free
coalgebra. □
Now, notice that in any cartesian category, if an object X is exponentiable then so is
[Y, X] for any Y, since we can take [Z, [Y, X]] to be [Z × Y, X]. Furthermore, the product
of two exponentiable objects X and Y is exponentiable since we can take [Z, X × Y] to
be [Z, X] × [Z, Y]. Taking this together with the previous lemma, we have:
Proof. We have already seen that both the choices for C are cartesian closed so it just
remains to check that F and G form a symmetric monoidal adjunction, which is straightforward. □
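The exponentiability argument above rests on the currying bijection [Z × Y, X] ≅ [Z, [Y, X]], which can be sketched for ordinary functions (an illustrative analogue, not the categorical construction itself):

```python
# Currying bijection underlying the exponentiability argument:
# maps Z x Y -> X correspond to maps Z -> [Y, X].

def curry(f):
    """Turn f : Z x Y -> X into a function Z -> (Y -> X)."""
    return lambda z: (lambda y: f((z, y)))

def uncurry(g):
    """Inverse direction of the bijection."""
    return lambda zy: g(zy[0])(zy[1])
```

The two directions are mutually inverse, which is exactly what lets [Z, [Y, X]] serve as the internal hom [Z × Y, X].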
the additive connective 'with' (&). The functor G preserves limits because it is a right
adjoint, and in particular
G(A&B) ≅ GA × GB and G1 ≅ 1
Taking this together with Proposition 1, we obtain the following natural isomorphisms:
These isomorphisms were central to Seely's proposed model of ILL [13], which also pro-
posed interpreting IL in the Kleisli category. See [6] for a critique of Seely's semantics;
here we merely note the following:
Proposition 9 If a linear category has products then the Kleisli category L_! is cartesian
closed.
Proof. One shows that L having products implies that the product of two free !-coalgebras
is a free coalgebra. This means that L_! coincides with the category shown to be cartesian
closed by Lemma 7. □
The correspondence between linear categories and LNL models extends trivially to
one between linear categories with finite products and LNL models with products on the
SMCC part. Coproducts are slightly more problematic. Whilst the appropriate extension
of an LNL model seems obvious (just require both L and C to have finite coproducts),
this does not correspond quite as simply as one might hope to linear categories with
coproducts.
The difficulty is that, whilst an LNL model with coproducts certainly gives rise to
a linear category with coproducts, the converse does not appear necessarily to be true.
Assume L is a linear category with finite coproducts; then L^! also has finite coproducts, as
we can define the coproduct of (A, h_A) and (B, h_B) to be
and this is easily checked to satisfy the appropriate conditions. There seems no general
reason, however, why either of the two CCCs which we have already identified as arising
from L should be closed under this coproduct.
Fortunately, something can be salvaged. There are weak finite coproducts in the
Kleisli category, obtained by defining
ψ_{A,B} : A ⊗ TB → T(A ⊗ B)
2.3 Examples
Let L be the category of pointed ω-cpos (ω-cocomplete partial orders with a least element)
and strict (bottom preserving) continuous maps. This is a symmetric monoidal closed
category with tensor product given by the so-called smash product, the identity for the
tensor by the one-point space (which is also a biterminator) and internal hom by the strict
continuous function space. In fact, L also has binary products and coproducts, given by
cartesian product and coalesced sum respectively.
Given this choice of L, there are a couple of obvious choices for the CCC C which give
an LNL model. One is to take C to be the category of pointed ω-cpos and continuous
(not necessarily strict) maps, G to be the inclusion functor and F to be the lifting functor
F : X ↦ X_⊥. The monoidal structure m on F is given by the evident isomorphism
X_⊥ ⊗ Y_⊥ ≅ (X × Y)_⊥. In this case, C is (equivalent to) the Kleisli category of the lifting
comonad on L. Note that the cartesian closure of the Kleisli category follows from the
fact that L has products. There are strong coproducts in L but only weak ones in C.
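The isomorphism X_⊥ ⊗ Y_⊥ ≅ (X × Y)_⊥ is easy to see concretely: in the smash product, any pair with a bottom component is identified with bottom, so both sides have the same defined elements. A sketch of the smash-product pairing, with BOT a hypothetical token for the least element and ("up", x) tagging a defined value (encoding illustrative):

```python
# Smash product of lifted spaces: a pair is bottom as soon as
# either component is.

BOT = object()

def smash(tx, ty):
    """Form the element tx (x) ty of X_bot (x) Y_bot."""
    if tx is BOT or ty is BOT:
        return BOT
    return ("up", (tx[1], ty[1]))
```

On this encoding the monoidal map m is just the identity: a defined smash pair ("up", (x, y)) is already an element of (X × Y)_⊥.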
An alternative choice of C is the category of (not necessarily pointed) ω-cpos (these
are sometimes called predomains) and continuous maps, again with inclusion and lifting
functors. This is equivalent to the Eilenberg-Moore category of the lift comonad on L, so
it has products and coproducts by our previous general arguments, but it also turns out
to be cartesian closed.
A different example arises from taking L to be the category of Abelian groups and group
homomorphisms. This is symmetric monoidal closed with A ⊗ B the Abelian group generated by the set of tokens {a ⊗ b | a ∈ A, b ∈ B} subject to the relations
(a + a') ⊗ b = a ⊗ b + a' ⊗ b and a ⊗ (b + b') = a ⊗ b + a ⊗ b'
3 LNL Logic
LNL-models are, of course, supposed to be models of a logical system. Theorem 3 says
that they are models for intuitionistic linear logic as defined by Girard, but the form of
the definition of LNL-model suggests an interesting alternative presentation of the logic.
The idea is that one starts with two independent logics, corresponding to the categories
L and C, and then adds operators which correspond in some way to the adjunction.
In keeping with our earlier conventions for naming objects of L and C, we will use
A, B, C to range over linear propositions and X, Y, Z for conventional ones. We shall use
Γ and Δ to range over linear contexts (finite multisets of linear propositions) and Θ and Φ
for non-linear ones. We also decorate turnstiles with L or C to indicate which subsystem
they belong to. Finally, if Θ is X_1,...,X_n then FΘ means FX_1,...,FX_n, and similarly
for GΓ. The two classes of propositions with which we shall be dealing are defined by the
following grammar:
A, B ::= A_0 | I | A ⊗ B | A ⊸ B | GX
X, Y ::= X_0 | 1 | X × Y | X → Y | FA
where A0 (resp. X0) ranges over some unspecified set of atomic linear (resp. non-linear)
propositions.
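The two classes of propositions can be rendered as a pair of mutually embedding datatypes; the constructor names below are illustrative, not the paper's notation:

```python
# A hypothetical AST for LNL propositions.  Linear propositions may
# embed non-linear ones via G, and vice versa via F.
from dataclasses import dataclass
from typing import Any

# Linear propositions A, B
@dataclass
class LAtom:            # A0, atomic linear
    name: str

@dataclass
class I:                # tensor unit
    pass

@dataclass
class Tensor:           # A (x) B
    left: Any
    right: Any

@dataclass
class Lolli:            # A -o B
    left: Any
    right: Any

@dataclass
class G:                # G X embeds a non-linear proposition
    body: Any

# Non-linear propositions X, Y
@dataclass
class CAtom:            # X0, atomic non-linear
    name: str

@dataclass
class One:              # unit type 1
    pass

@dataclass
class Times:            # X x Y
    left: Any
    right: Any

@dataclass
class Arrow:            # X -> Y
    left: Any
    right: Any

@dataclass
class F:                # F A embeds a linear proposition
    body: Any
```

For instance, the linear proposition A_0 ⊗ G(X_0 × 1) is Tensor(LAtom("a"), G(Times(CAtom("x"), One()))).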
3.1 Sequent Calculus
The two logics with which we start are very familiar, viz. the exponential-free, multiplicative fragment of propositional intuitionistic linear logic and the ×, → fragment of ordinary
intuitionistic logic. These both have very well-behaved sequent presentations. How should
the systems be enriched and combined to give LNL logic? There are (at least) two natural
answers, neither of which satisfies cut elimination. Fortunately, there is a presentation
of the logic which has a good proof theory. The trick is to allow conventional non-linear
formulae to appear in the assumptions of a linear sequent. A typical linear sequent looks,
therefore, like this:
X_1,...,X_m, A_1,...,A_n ⊢_L B
which is interpreted as a morphism in L of the form
FX_1 ⊗ ... ⊗ FX_m ⊗ A_1 ⊗ ... ⊗ A_n → B
Non-linear sequents are still constrained to have purely non-linear antecedents and are
interpreted as morphisms in C in the usual way. We abuse notation by writing linear
Figure 1 (sequent rules for LNL logic; premises / conclusion):

Θ, X, X; Γ ⊢_L A  /  Θ, X; Γ ⊢_L A   (L-contraction)
Θ, X, X ⊢_C Y  /  Θ, X ⊢_C Y   (C-contraction)
Θ; Γ ⊢_L A  /  Θ, X; Γ ⊢_L A   (L-weakening)
Θ ⊢_C Y  /  Θ, X ⊢_C Y   (C-weakening)
Θ ⊢_C X   and   X, Φ; Γ ⊢_L A  /  Θ, Φ; Γ ⊢_L A   (CL-cut)
Θ ⊢_C X   and   X, Φ ⊢_C Y  /  Θ, Φ ⊢_C Y   (CC-cut)
Θ; Γ ⊢_L A   and   Φ; Δ, A ⊢_L B  /  Θ, Φ; Γ, Δ ⊢_L B   (LL-cut)
Θ, X ⊢_C Z  /  Θ, X × Y ⊢_C Z   (C-×-left1)        Θ, Y ⊢_C Z  /  Θ, X × Y ⊢_C Z   (C-×-left2)
Θ, X; Γ ⊢_L A  /  Θ, X × Y; Γ ⊢_L A   (L-×-left1)   Θ, Y; Γ ⊢_L A  /  Θ, X × Y; Γ ⊢_L A   (L-×-left2)
Θ ⊢_C X   and   Φ ⊢_C Y  /  Θ, Φ ⊢_C X × Y   (×-right)        ⊢_C 1   (1-right)
Θ; Γ ⊢_L A  /  Θ; Γ, I ⊢_L A   (I-left)        ⊢_L I   (I-right)
Φ; B, Γ ⊢_L A  /  Φ, GB; Γ ⊢_L A   (G-left)        Θ ⊢_L A  /  Θ ⊢_C GA   (G-right)

Figure 2 (part of the interpretation):
(F-right)  from X_1 × ... × X_n → X obtain FX_1 ⊗ ... ⊗ FX_n ≅ F(X_1 × ... × X_n) → FX
(G-left)   from FΘ ⊗ B ⊗ Γ → A obtain FΘ ⊗ FGB ⊗ Γ → FΘ ⊗ B ⊗ Γ → A
sequents in the form Θ; Γ ⊢_L A, even though there is no need for the ';' since linear and
non-linear formulae can never be confused. Figure 1 shows the sequent rules for LNL logic.
The interpretation of LNL logic in an LNL model is fairly straightforward. We omit the
interpretation of the standard logical connectives and just give details of the interpretation
of one cut rule and some of the rules for F and G in Figure 2.
Proof. This follows the broad outline of most cut elimination proofs, showing that proofs
may be simplified by a (non-deterministic) succession of local rewrites which percolate the
cuts upwards. Again, see the full version of the paper for details. □
Theorem 11 The cut elimination procedure for LNL logic is modelled soundly in any
LNL model.
Proof. One shows that whenever one proof is simplified to another then the interpreta-
tions of those two proofs are equal morphisms in the model.
There are many possible variations on the sequent rules for LNL logic. One of the most
natural is to treat the non-linear parts of antecedents as additive rather than multiplicative.
This yields an equivalent logic containing rules such as
Θ; Γ ⊢_L A   and   Θ; Δ ⊢_L B  /  Θ; Γ, Δ ⊢_L A ⊗ B   (⊗-right)
One can also present the purely multiplicative version of the logic in a concise way by
using some new metavariables: let P, Q range over either linear or non-linear propositions
Θ ⊢_C s : X   and   Θ ⊢_C t : Y  /  Θ ⊢_C (s, t) : X × Y        Θ ⊢_C () : 1
Θ ⊢_C s : X × Y  /  Θ ⊢_C fst(s) : X        Θ ⊢_C s : X × Y  /  Θ ⊢_C snd(s) : Y
Θ ⊢_L * : I        Θ; Γ ⊢_L e : I   and   Θ; Δ ⊢_L f : A  /  Θ; Γ, Δ ⊢_L let * = e in f : A
Θ ⊢_L e : A  /  Θ ⊢_C G(e) : GA        Θ ⊢_C s : GA  /  Θ ⊢_L derelict(s) : A
and T over mixed contexts. Then we can, for example, capture both →-left rules in the
one rule
Θ ⊢ X   and   Y, T ⊢ P  /  Θ, X → Y, T ⊢ P   (→-left)
This gives a set of rules which are essentially the same as those given by Jacobs in [9]
(which also contains a good account of some concrete categorical models). Jacobs gives a
rather different account of the semantics and there are also some subtle differences in the
proof rules.
fst(s, t) →_β s
snd(s, t) →_β t
let * = * in e →_β e
derelict(G(e)) →_β e
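The four β-rules listed above can be read off directly as a one-step reducer on a small term representation; the tagged-tuple encoding is illustrative, not the paper's syntax:

```python
# One-step beta-reduction for the four listed rules.  Terms are
# tagged tuples: ("pair", s, t), ("fst", s), ("snd", s), ("star",),
# ("let_star", e, f), ("G", e), ("derelict", s).

def beta_step(term):
    """Return the reduct if a rule applies at the root, else None."""
    tag = term[0]
    if tag == "fst" and term[1][0] == "pair":
        return term[1][1]                      # fst(s, t) -> s
    if tag == "snd" and term[1][0] == "pair":
        return term[1][2]                      # snd(s, t) -> t
    if tag == "let_star" and term[1] == ("star",):
        return term[2]                         # let * = * in e -> e
    if tag == "derelict" and term[1][0] == "G":
        return term[1][1]                      # derelict(G(e)) -> e
    return None
```

A full normaliser would apply beta_step congruently at subterms; here only root redexes are contracted.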
It is easy to check that terms code derivations uniquely and that the natural deduction
system is equivalent to the sequent calculus. The proof of the equivalence uses the impor-
tant lemmas that substitution and weakening are admissible rules in the natural deduction
system. Linear variables a, b in the context occur free exactly once in a well-typed term,
whereas non-linear variables x, y may occur any number of times, including 0. Note also
that there is no explicit syntax for weakening or contraction. We omit the details of the
interpretation of natural deductions in LNL models.
The fundamental kind of normalisation step on natural deductions is the removal of a
'detour' in the deduction, which consists of an introduction rule immediately followed by
the corresponding elimination. For reasons of space, we omit the details of the reductions
on proofs but merely list the induced β-reductions on terms in Figure 4.
There is also a secondary class of reductions, the commuting conversions, of which
there are 12 in total. The following is a typical term reduction induced by a commuting
conversion:
let a ⊗ b = (let * = e in f) in g  →_c  let * = e in (let a ⊗ b = f in g)
The reduction relation →_{βc}, which is the precongruence closure of the union of →_β and
→_c, is easily checked to preserve types. We also have (cf. Theorem 11):
Theorem 12 Both the β-reductions and the commuting conversions are soundly modelled
by the interpretation of the natural deduction system in any LNL model.
- If Θ; Γ ⊢_L e : A and e →_{βc} e' then [[Θ; Γ ⊢_L e : A]] = [[Θ; Γ ⊢_L e' : A]]
- If Θ ⊢_C s : X and s →_{βc} s' then [[Θ ⊢_C s : X]] = [[Θ ⊢_C s' : X]]
□
We can define translations in both directions between LNL logic and ILL. If A is an
ILL proposition, define the linear LNL proposition A° inductively as follows:
A_0° = A_0 (A_0 atomic)        (A ⊗ B)° = A° ⊗ B°
(A ⊸ B)° = A° ⊸ B°            I° = I
(!A)° = FG(A°)
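The (−)° translation is a straightforward structural recursion in which ! becomes the composite FG. A sketch over a tagged-tuple representation of ILL propositions (the encoding and the function name are illustrative):

```python
# Translation of ILL propositions into linear LNL propositions.
# ILL propositions are tagged tuples: ("atom", name), ("I",),
# ("tensor", a, b), ("lolli", a, b), ("bang", a).

def deg(a):
    """The (-)degree translation: ! A becomes F(G(A degree))."""
    tag = a[0]
    if tag == "atom":
        return a
    if tag == "I":
        return ("I",)
    if tag in ("tensor", "lolli"):
        return (tag, deg(a[1]), deg(a[2]))
    if tag == "bang":
        return ("F", ("G", deg(a[1])))
    raise ValueError("unknown proposition " + tag)
```

Note that the translation is the identity on the exponential-free part of the language; only the ! clause does real work.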
It is easy to see that for any ILL judgement Γ ⊢ A, translating Γ° ⊢ A° back into ILL gives
the original judgement. Thus Γ ⊢ A is provable in ILL iff Γ° ⊢_L A° is provable in LNL logic. This
extends to proofs in the following way:
Theorem 15 If Γ ⊢ e : A in LTC, then not only is Γ ⊢ e°* : A provable, but e ≅ e°* where ≅
is the categorical equality relation on LTC terms given in [4]. □
References
[1] E. Barendsen and S. Smetsers. Conventional and uniqueness typing in graph rewrite systems.
Technical Report CSI-R9328, Katholieke Universiteit Nijmegen, December 1993.
[2] P. N. Benton. A mixed linear and non-linear logic: Proofs, terms and models (preliminary
report). Technical Report 352, Computer Laboratory, University of Cambridge, September
1994.
[3] P. N. Benton. Strong normalisation for the linear term calculus. Journal of Functional Programming, 1995. To appear. Also available as Technical Report 305, University of Cambridge
Computer Laboratory, July 1993.
[4] P.N. Benton, G. M. Bierman, J. M. E. Hyland, and V. C. V. de Paiva. Linear lambda calculus
and categorical models revisited. In E. Börger et al., editor, Selected Papers from Computer
Science Logic '92, volume 702 of Lecture Notes in Computer Science. Springer-Verlag, 1993.
[5] P. N. Benton, G. M. Bierman, J. M. E. Hyland, and V. C. V. de Paiva. A term calculus
for intuitionistic linear logic. In M. Bezem and J. F. Groote, editors, Proceedings of the
International Conference on Typed Lambda Calculi and Applications, volume 664 of Lecture
Notes in Computer Science. Springer-Verlag, 1993.
[6] G. M. Bierman. On intuitionistic linear logic (revised version of PhD thesis). Technical Report
346, Computer Laboratory, University of Cambridge, August 1994.
[7] J.-Y. Girard. Linear logic. Theoretical Computer Science, 50:1-102, 1987.
[8] J.-Y. Girard. On the unity of logic. Annals of Pure and Applied Logic, 59:201-217, 1993.
[9] B. Jacobs. Conventional and linear types in a logic of coalgebras. Preprint, University of
Utrecht, April 1993.
[10] P. Lincoln and J. C. Mitchell. Operational aspects of linear lambda calculus. In Proceedings of the 7th Annual Symposium on Logic in Computer Science. IEEE, 1992.
[11] E. Moggi. Notions of computation and monads. Information and Computation, 93:55-92,
1991.
[12] G. D. Plotkin. Type theory and recursion (abstract). In Proceedings of 8th Conference on
Logic in Computer Science. IEEE Computer Society Press, 1993.
[13] R. A. G. Seely. Linear logic, *-autonomous categories and cofree coalgebras. In Conference on Categories in Computer Science and Logic, volume 92 of AMS Contemporary Mathematics, 1989.
[14] P. Wadler. There's no substitute for linear logic (projector slides). In G. Winskel, editor,
Proceedings of the CLICS Workshop (Part I), March 1992, Aarhus, Denmark, May 1992.
Available as DAIMI PB-397-I Computer Science Department, Aarhus University.
[15] P. Wadler. A taste of linear logic. In A. M. Borzyszkowski and S. Sokolowski, editors,
Proceedings of the 18th International Symposium on Mathematical Foundations of Computer
Science, number 711 in Lecture Notes in Computer Science, pages 185-210, 1993.
Cut Free Formalization of Logic with Finitely Many Variables. Part I.
L. Gordeev
WSI für Informatik, Math. Logik, Auf der Morgenstelle 10,
D-72076 Tübingen, Germany
gordeew@informatik.uni-tuebingen.de
(S)  A = B / B = A
(K)  A = B, F = G / A∨F = B∨G     A = B, F = G / A∧F = B∧G     A = B / ∃xA = ∃xB     A = B / ∀xA = ∀xB
(T)  A = C, C = B / A = B
1.2.5 It is easily verified that EPCω is complete and hence equivalent to MPCω modulo provability and the canonical interpretation (see below).
1.2.6 Canonical translation. In the language ℒn¹ set
• T := ⊥⊃⊥,  ¬A := A⊃⊥,  A∨B := ¬A⊃B,  A∧B := ¬(A⊃¬B),  ∃xA := ¬∀x¬A
In the language ℒn² set
• A⊃B := ¬A∨B
1.2.7 Canonical interpretation. Keeping 1.2.6 in mind, for any formula A let EQU(A) denote the equation A = T. Conversely, for any equation A = B let FOR(A = B) denote the corresponding formula A ↔ B, i.e. (A⊃B)∧(B⊃A).
1.2.8 THEOREM. If MPCn proves a formula A then EPCn proves EQU(A). Conversely, if EPCn proves an equation A = B then MPCn proves FOR(A = B). Moreover both proof transformations are carried out by polynomial-weight algorithms (i.e. the weight of the output is polynomial in the weight of the input; the weight of a formula/equation/proof etc. is the total number of its symbol-occurrences).
PROOF. These proof transformations are defined by straightforward recursion on the length/depth of proofs in MPCn and EPCn respectively. (The associative law for ∧, ∨ is derivable from (1)e–(5)e, see [H].) (4)m is proved by (7)e, (5)m by (6)e and (8)e, (G) by (8)e, (MP) by (T). This is routine. □
1.3 Sequential formalism.
1.3.1 In the language ℒn² define sequents, SEQ (abbr.: Γ, Δ, Σ, Π), as finite strings of formulas (possibly empty), i.e. expressions D₁,...,D_k, k ≥ 0. In particular, FOR ⊆ SEQ.
1.3.2 The corresponding sequent calculus SPCn is as follows.
1.3.2.1 Axioms of SPCn:
(1)s  Γ, T, Σ
(2)s  Γ, L, Δ, ¬L, Σ
1.3.2.2 Deduction rules of SPCn:
(D)  Γ, A, B, Σ / Γ, A∨B, Σ
(C)  Γ, A, Σ   Γ, B, Σ / Γ, A∧B, Σ
(W)  Γ, Σ / Γ, A, Σ
Having this, the axioms (1)e–(5)e are readily derivable in SPCn−(CUT). Furthermore, (6)e is derivable by (C) and (D) followed by (U), (D) and (E). (7)e and (8)e are derivable by (D), (C) and (E). The rules (S) and (K) are derivable by (C), (D) and (W) (see above). The transitivity rule (T) is derivable by (CUT); this is the only passage which requires (CUT). □
1.3.6 Recall that (CUT) is superfluous in SPCω (Gentzen's Hauptsatz), although its counterparts (MP) and (T) are vital for MPCω and EPCω. In contrast, for each finite n, SPCn−(CUT) is decidable and hence, for n ≥ 3, dramatically weaker than SPCn, MPCn and/or EPCn, as shown by the following theorem (see also 1.1.5 and Theorem 1.3.5).
1.3.7 THEOREM.
(1) For n < ω, the set of formulas provable in SPCn−(CUT) is decidable.
(2) There is no effective upper bound on the number of distinct variables occurring in cut free SPCω-proofs of valid formulas containing just three distinct variables.
PROOF. Consider the canonical proof search tree, T, of a given formula A in SPCn, n ≤ ω. So A is provable in SPCn iff every (linear) branch of T contains T or both L and ¬L for some literal L. For every branch of T consider the set of all formulas occurring in it (call it a clause). If only finitely many distinct variables occur in T then there are finitely many different clauses as well, since any formula in any clause arises by carrying out substitutions [x/a] in a certain subformula of A. Moreover, we can redefine T such that the variables form an initial segment of VAR. As a result, the (finite) set of all clauses of T is effectively determined by A and the number of variables involved. This proves (1). (2) follows by the completeness of SPCω and the undecidability of 3-VAR restricted validity (the latter holds e.g. by the 3-VAR translation of the Post Correspondence Problem; see also [TG]). □
1.3.8 6) Theorem 1.3.7 holds in various extended variants obtained by adding new "direct" rules such as e.g. (U⁺) and/or (S), which guarantee that SPCn−(CUT) is closed under contraction and/or specification, respectively:
(U⁺)  Ξ, A[x/y], Π / Ξ, ∀xA, Π   [y ∉ Fr(Γ, ∀xA, Σ), Ξ ⊆ Γ, Π ⊆ Σ]
(S)   Γ, ∀xA, Σ / Γ, A[x/a], Σ
§2 REDUCTION CALCULI. That (CUT) is vital for SPCn for finite n is due to the deterministic mode of sequent calculi, under which formulas are evaluated in the "prescribed" reduction order such that subformulas are treated later than formulas in which they occur. Now I switch to term rewriting, as it provides mutually independent reductions at all subformula levels. 7)
Instead of eigen-variables (y from (U) in 1.3.2.2) a different idea of "variable
deletions" is employed. The resulting formalisms are called reduction calculi.
2.1 In ℒn², define operations A[†x] and A[x†y] of "variable deletion":
• ⊥[†x] := ⊥ and T[†x] := T
• L[†x] := L, if x ∉ Fr(L);  L[†x] := ⊥, if x ∈ Fr(L)
• (B∨C)[†x] := B[†x]∨C[†x] and (B∧C)[†x] := B[†x]∧C[†x]
• ∃yB[†x] := ∃yB, if x = y;  ∃yB[†x] := ∃y(B[†x]), if x ≠ y
• ∀yB[†x] := ∀yB, if x = y;  ∀yB[†x] := ∀y(B[†x]), if x ≠ y
• A[x†y] := A, if x = y;  A[x†y] := A[†y], if x ≠ y
For any Σ = G₁,...,G_k ∈ SEQ let Σ[x†y] := G₁[x†y],...,G_k[x†y] ∈ SEQ.
2.2 Below, for brevity, parentheses in left-associative disjunctions are dropped. Now let Γ∨Σ := Γ∨G₁∨G₂∨...∨G_k, Δ∨Σ := H₁∨...∨H_l∨G₁∨...∨G_k, etc., where Σ = G₁, G₂,...,G_k and Δ = H₁,...,H_l. So e.g. A∨B∨Σ∨C actually stands for (...((A∨B)∨G₁)∨...)∨G_k)∨C. Let E range over FOR* := FOR ∪ {∅}.
2.3 8) Reduction calculus RPCn. The language of RPCn is ℒn².
2.3.1 Rewriting rules of RPCn:
(1)r  E∨T∨Σ → T, if E∪Σ ≠ ∅
(2)r  E∨L∨Δ∨¬L → T
(3)r  A∧T → A and T∧A → A
(4)r  A∨(B∨C) → A∨B∨C
(5)r  E∨(A∧B)∨Σ → (E∨A∨Σ)∧(E∨B∨Σ), if E∪Σ ≠ ∅
(6)r  ∃xA → ∃xA∨A[x/a]
(7)r  E∨∀xA∨Σ → E[†x]∨∀xA∨A[†x]∨Σ[†x]
2.3.2 Let G be any subformula-occurrence in F, and let G → H be any rewriting rule (i)r, 1 ≤ i ≤ 7, which is sometimes indicated by G i→ H. Let F[G/iH], or just F[G/H], arise by replacing G by H in F. This operation is called reduction and denoted by F →→ F[G/iH], or just F →→ F[G/H]. Any sequence F₀ →→ F₁ →→ ... →→ F_k, k ≥ 0, is called a reduction chain. Let A →→...→→ B express that A reduces to B by some chain A = F₀ →→ F₁ →→ ... →→ F_k = B.
2.3.3 A is provable in RPCn (abbr.: RPCn ⊢ A) if A →→...→→ T holds. The correlated chain A = F₀ →→ F₁ →→ ... →→ F_k = T is called a proof of A in RPCn.
2.4 THEOREM. If RPCn proves A then EPCn proves EQU(A), i.e. A = T. The correlated proof transformation is carried out by a polynomial-weight algorithm. (The weight of a reduction chain is the total number of its symbol-occurrences.)
LEMMA. If EPCn proves G = H then EPCn proves F = F[G/H]. Furthermore, EPCn proves G = H for every reduction G i→ H, 1 ≤ i ≤ 7. The correlated proof and "equation → proof" transformations are carried out by polynomial-weight algorithms.
(16)r  A∧A → A
(17)r  E∨B∨Δ∨Σ → E∨B∨Δ∨B∨Σ
(18)r  A∨B → B∨A
(19)r  (A∧C)∨(B∧C) → (A∨B)∧C and (C∧A)∨(C∧B) → C∧(A∨B)
PROOF. (9)r–(12)r, (14)r–(15)r, (18)r: The proof runs by straightforward induction on the proof length, i.e. the length of the reduction chain F →→...→→ T (see 2.6.3 and 3.2). (Note that (18)r follows directly from (4)r, (10)r and (17)r.) (16)r: Argue analogously by using (14)r–(15)r in the case of A being the "major" formula of (5)r. (13)r: Straightforward induction on the complexity of A, also using (9)r–(12)r. (17)r: analogous. Note that if B = ∀xA is the "major" left-hand formula of (7)r, then the required contraction is obtained by applying (7)r twice. (19)r follows directly from (5)r, (10)r and (15)r. □
3.4 LEMMA. The rewriting rules (20)r–(22)r are admissible in RPCn.
(20)r  A → ∃xA
(21)r  ∃xA → A[x/a]
(22)r  ∀xA → A[†x]
PROOF. (The admissibility of) (20)r is readily seen by induction on the proof length. (21)r, (22)r are consequences of (6)r, (7)r together with (10)r. □
3.5 Denote by A ≡ B that A and B are equal modulo 1-1 renaming of bound variables (without affecting the free variables involved).
LEMMA. The rewriting rules (23)r–(24)r are admissible in RPCn.
(23)r  ∀xA → ∀y(A[x/y]), if y ∉ Fr(A)
(24)r  B → C, if B ≡ C
PROOF. (23)r follows directly from (7)r and (10)r. (24)r is proved by double induction on (first) the ∃-depth of B and (second) the proof length. Here by definition the ∃-depth of B is the maximal length of strings B₁, ..., B_i, B_{i+1}, ... such that each B_i is an ∃-formula (i.e. begins with ∃), each B_{i+1} is a proper subformula of B_i, and B₁ is a subformula of B. So no reduction can increase the ∃-depth. Note that if the ∃-depth of B is 0 then (24)r follows from (23)r. □
3.6 Let Fr*(A) denote the subset of Fr(A) that is obtained by not counting (hereditarily) those free variables y from B[x/y] which appear in disjunctions together with ∃xB (to the effect that the rewriting rule (6)r does not increase Fr*(−) of the right-hand formula). Let A =_d B express that A equals B modulo associativity and commutativity of ∨. Let A ⊑ B abbreviate that A is a subformula-occurrence in B. Let A ⊑₀ B denote A ⊑ B provided that A appears in the boolean part of B, i.e. outside any quantifier-domain.
3.7 Let (a)r be the following list of rewriting rules.
• A → B, if A =_d B
• A∨∀x(B[A/⊥]) → ∀xB, if x ∉ Fr*(A) and A ⊑₀ B ("splitting")
The a-reducibility, G a→→ H, is defined by analogy to 2.3.2 with respect to the rewriting rules (a)r instead of (1)r–(7)r. Note that every rule from (a)r is admissible in RPCn. This follows from the previous lemmas and, in the case of the splitting rule, from (7)r, by using extra (6)r when dealing with Fr*(−). Thus if B a→→ C, then RPCn ⊢ F implies RPCn ⊢ F[C/B].
3.8 LEMMA. Let B = A₀∨∀x₁A₁∨Δ₁∨...∨∀x_kA_k∨Δ_k a→→ C where C ≡ F, Bᵒ = A₀∨A₁∨Δ₁∨...∨A_k∨Δ_k and RPCn ⊢ F. Then RPCn ⊢ F[C/Bᵒ].
provable in n-VAR logic (cf. [TG] for references). In this chapter I show that nevertheless, for each n ≥ 3, n-VAR logic is complete modulo interpretation. More precisely, there is a polynomial-weight interpretation, μ*, of the ω-VAR language into the 3-VAR language such that A is valid iff μ*(A) is provable in MPCn, EPCn and/or RPCn for any fixed n ≥ 3. This generalizes the corresponding results of [TG] and [M3]. 10) For brevity, here only the language of binary relations is considered. (This language is universal by the canonical translation into the language of set theory; by familiar interpretations, this also applies to the ones with functional symbols and/or equality.)
4.1 Let ℒ_{R,n} be the language ℒn² (schema) provided a_i = 2 for all i ≤ p (cf. 1.1.1.1, 1.2.1.1 above). For every particular language ℒ_ω ∈ ℒ_{R,ω}, let ℒ₃ ∈ ℒ_{R,3} be its 3-VAR fragment, and let ℒ_ω⁺ ∈ ℒ_{R,ω} and ℒ₃⁺ ∈ ℒ_{R,3} be their expansions by the four new binary relations V_{p+1}, V_{p+2}, V_{p+3}, V_{p+4} (cf. 1.1.1.1, 1.2.1.1 above), which are denoted by U, P, Q and R, respectively. (P, Q express the basic projections as A, B from [TG]; R, U the appropriate "congruence" and "universe".) The basic variables v₀, v₁ and v₂ are denoted by x, y and z, respectively. As usual, xZy or x¬Zy stands for Zx,y or ¬Zx,y. For any formula A of ℒ_{R,ω}, set IFr(A) = {i : v_i ∈ Fr(A)}. In view of 1.2.6–1.2.8, 2.5, 2.7, 3.1, both formalisms, MPCn and EPCn, and their languages are used simultaneously when dealing with the n-VAR logic.
4.2 11) In ℒ₃⁺, let 𝔘 ∈ FOR be the universal conjunction of (1)–(8):
(1) ∃x(xUx)
(2) z¬Px ∨ z¬Py ∨ xPy
(3) z¬Qx ∨ z¬Qy ∨ xQy
(4) ∃z(zPx ∧ zRy)
(5) xRx
(6) x¬Ry ∨ yRx
(7) x¬Ry ∨ y¬Rz ∨ xRz
(8) x¬Ry ∨ ¬L ∨ L[x/y], for all L = v_iV_kv_j and v_i¬V_kv_j (i, j ≤ 2, k ≤ p+4)
4.3 In ℒ₃⁺, define the following abbreviations.
xP⁺y := xPy ∨ (xRy ∧ ∀y(x¬Py)) and xQ⁺y := xQy ∨ (xRy ∧ ∀y(x¬Qy))
P_i := (Q⁺)^{(i)}∘P⁺, i.e. xP₀y := xP⁺y and xP_{i+1}y := ∃z(xQ⁺z ∧ zP_iy)
(e.g. xP₃y = ∃z(xQ⁺z ∧ ∃x(zQ⁺x ∧ ∃z(xQ⁺z ∧ zP⁺y))))
4.4 Let K be any finite set of natural numbers. Define the relation ≈_K by
• x ≈_K y := ⋀_{i∈K} ∃z(xP_iz ∧ yP_iz), if K ≠ ∅;  x ≈_K y := T, if K = ∅
4.5 Let μ : FOR(ℒ_ω⁺) → FOR(ℒ₃⁺) be the auxiliary mapping defined recursively as follows. Note that Fr(μ(A)) ⊆ {x} holds for each A. (μ corresponds to the mapping M_{AB} from [TG].)
• μ(T) := T and μ(⊥) := ⊥.
• μ(L) := L, if IFr(L) = ∅.
• μ(L) := ∃y(xP_iy ∧ L[v_i/y]), if IFr(L) = {i}.
• μ(L) := ∃y∃z(xP_iy ∧ xP_jz ∧ L[v_i/y][v_j/z]), if IFr(L) = {i, j}, i ≠ j.
• μ(A∨B) := μ(A)∨μ(B) and μ(A∧B) := μ(A)∧μ(B)
• μ(∃v_iA) := ∃y(x ≈_K y ∧ μ(A)[x/y]), if K = IFr(∃v_iA).
• μ(∀v_iA) := ∀y(x ¬≈_K y ∨ μ(A)[x/y]), if K = IFr(∀v_iA).
4.6 LEMMA. The formulas (9)–(19) and (9)–(20) are provable in MPC₄ + 𝔘 and MPC_ω + 𝔘, respectively, for any i ∈ ℕ and finite I, J ⊆ ℕ.
(9) xRy → x ≈_I y
(10) x ≈_I x
(11) x ≈_I y → y ≈_I x
1) Cf. [H], p.30: I do not know of any other method of finding proofs in the Propositional Calculus than the method of "trial and error".
2) The unification approach does not essentially change the situation. It merely provides a more economical enumeration by deleting some repetitions.
3) There are known sequent calculi for relation algebras, which include a cut rule as the counterpart of modus ponens, see [M1]. Their cut free fragments are
1 Introduction
sentence, then the tautologies have polynomial size proofs in the system R. A superpolynomial lower bound on the size of proofs in the proof system R would imply independence of NP =? coNP from R. Thus even partial results in this approach to the problem NP =? coNP may have interesting consequences.
The most important class of propositional proof systems is called Frege sys-
tems. This concept was defined by Cook and Reckhow [7,6] and was intended
to capture the properties of the most common propositional proof systems. For-
mally, a Frege system is determined by a complete finite basis of connectives and
a finite set of rules
A₁, ..., A_k / A₀   (1)
which form a sound and implicationally complete system. Let us note that when
applying the rules we use their substitutional instances, but the general rule of
substitution is not allowed. A typical representative Frege system is based on
finitely many axiom schemas (zero premise rules) and the Modus Ponens rule.
There are two measures of complexity that one uses for such proofs: the size of the proof (which includes the sizes of the formulas in it) and the number of steps
(which counts only the number of formulas used in the proof). The concept of
the Frege system is very robust with respect to each measure: every two systems
polynomially simulate each other. Moreover they are equivalent in this sense to sequent calculi with the cut rule. For their associated theories of bounded
arithmetic, see [5,3,13]. So far, superpolynomial lower bounds have been proved
only for more restricted systems, see e.g. [1,2].
In this paper we introduce two frameworks for proving lower bounds for
Frege systems and their restricted versions. First we shall define an interactive
way of proving propositional tautologies. This game is well-known, however the
relation of the length of the game to the lengths of Frege proofs is (as far as
we know) new. Namely, the minimal number of rounds in the game is propor-
tional to the logarithm of the minimal number of proof steps in a Frege proof.
The inspiration of this game comes from some lower bound techniques in com-
plexity theory, the so-called adversary arguments and certain game-theoretical
characterizations of circuit complexity measures. We shall show that it is triv-
ial that most tautologies require interactive games of at least log n rounds (all
logarithms in this paper are base two). However, we shall also prove a lower
bound of log n + log log n - O(log log log n) on the number of rounds in interac-
tive games for some tautologies consisting of randomly chosen permutations of
conjunctions. Here, n is the number of distinct subformulas of the tautology.
The second approach for lower bounds on Frege proofs is based on valuations
in boolean algebras. This has actually been used implicitly in [2], and other proofs
can be interpreted in such a way.
For the reader who wishes to get a deeper knowledge about lower bounds in
propositional calculus we recommend the forthcoming book by Krajíček [9] and a forthcoming survey by the first author [17].
force the value φ₁ ↦ 1. Also he needs only a constant number of rounds of questions to get φ_k ↦ 0, since φ_k = φ. Then he uses binary search to find an i such that φ_i ↦ 1 and φ_{i+1} ↦ 0. This takes O(log k) rounds. Another constant number of rounds suffices to get φ_{i+1} ↦ 0. Suppose φ_{i+1} was derived from φ_{j₁}, ..., φ_{j_t}, j₁, ..., j_t ≤ i. For each of these premises φ_{j_s} it takes only O(log i) rounds to force φ_{j_s} ↦ 1 (or to get an elementary contradiction), since Prover can force φ_i ↦ 1 in O(log n) rounds using binary search again. Once the premises get the value 1 and the conclusion the value 0, Prover needs only a constant number of questions to force an elementary contradiction.
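The binary-search step of Prover's strategy can be sketched abstractly. Here `value(i)` is a stand-in for the value Adversary is forced to assign to φ_i; the function and its name are illustrative assumptions, not notation from the paper.

```python
def find_switch(value, k):
    """Given value(i) in {0, 1} with value(1) == 1 and value(k) == 0,
    binary-search an index i with value(i) == 1 and value(i + 1) == 0.
    Uses O(log k) queries, mirroring Prover's O(log k) rounds."""
    lo, hi = 1, k          # invariant: value(lo) == 1 and value(hi) == 0
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if value(mid) == 1:
            lo = mid       # switch lies strictly above mid
        else:
            hi = mid       # switch lies at or below mid
    return lo              # value(lo) == 1 and value(lo + 1) == 0
```

Since a proof starts with true axioms and ends with the formula Adversary claims false, such a "switch" i must exist, and the invariant pins it down in logarithmically many queries.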
2. Let a winning strategy for Prover be given, and suppose it has r rounds in the worst case. We construct a sequent calculus proof of φ of size 2^{O(r)}. It is well-known that a sequent proof can be transformed into a Frege proof with at most polynomial increase.
Consider a particular play P, and let α₁, ..., α_t, t ≤ r, be the formulas asserted to have value 1 by Adversary in response to Prover's questions, where we have added (or removed) negations if Adversary answered 0. (In particular α₁ is ¬φ.) Thus α₁ ∧ ... ∧ α_t is false, hence → ¬α₁, ..., ¬α_t is a true sequent. Moreover, as easily seen, it has a proof with t + O(1) number of lines, since there is a simple contradiction in the statements α₁, ..., α_t. The proof of φ is constructed by taking proofs of all such sequents and then using cuts eliminating successively all formulas except φ. This is possible due to the structure of the possible plays. Namely,
Namely,
For each play P with questions α₁, ..., α_i, and each j ≤ i, there is another play P′ in which the first j questions are the same as in P and in which the first j−1 answers are the same and the j-th answer is different.
Finally observe that the number of such sequents is at most 2^r, which gives the bound. □
Let us note that the proof constructed from the game is in a tree form, except
possibly for constant size pieces at the leaves, which can be easily changed into
such a form. Thus we get:
Proof. Let an arbitrary Frege proof of size n be given. First transform it into the Prover-Adversary game; thus we get a game with O(log n) rounds. Transforming it into a sequent proof we get a proof of size 2^{O(log n)} = n^{O(1)}. Then one can check that the translation of this proof into a Frege proof can be done so that the tree form is preserved. □
This corollary is not surprising, since the main idea of the first part of the proof of Proposition 2 is the same as in Krajíček's proof. In fact the number of rounds characterizes more precisely the log of the minimal number of steps of a proof in a tree form. Using these transformations we get, however, still a little
It is conceivable that the answers to both questions are NO. Thus tree-like
monotone proofs are an interesting class of proofs on which we can try lower
bound methods. Let us stress that we do not have superpolynomial lower bounds
even for such proofs. In Prover-Adversary game this means that the following is
open.
Question 6. Are there monotone true sequents which cannot be proved in the monotone Prover-Adversary game using O(log n) rounds, where n is the size of the sequent?
Note that t_{2,n} is always a tautology; this is a well-known example for which one can prove a linear lower bound on the number of steps in Frege proofs [4,11]. We shall show an Ω(log n) lower bound for the number of rounds in the game. This, of course, follows from the cited result and Proposition 2. Still the proof is interesting, because it is different; it is not just a translation. The direct translation only gives us a proof that Prover cannot win in r rounds for some r = o(log n). To get a winning strategy for Adversary in r rounds we have to refer to the finiteness of the game. Thus the direct translation does not give the winning strategy explicitly, so it is not really an "adversary argument".
4 A nonconstructive lower bound
In this section we prove a slightly larger lower bound log n + log log n − O(log log log n) on the number of rounds in the Prover-Adversary game. (Note that this is larger than the previous bound only if we do not count the size of indices of variables.) We
do not construct the formulas explicitly, but use a counting argument to show
that they exist. Although counting arguments sometimes easily give exponen-
tial lower bounds in circuit complexity [19], it seems that for the propositional
calculus we cannot get such strong bounds.
We consider the following formulas s_{n,π}.
Lemma 9. Let S be the skeleton of some winning strategy for s_{n,π}, π any permutation of {1, ..., n}. Then S and n uniquely determine the permutation π.
v = …, where v is the variable of the node to which an edge labeled by o is pointing, etc.
Next we state and give a quick sketch of a theorem originally proved by Orevkov [15] and later rediscovered by the authors. This gives an Ω(n log n) lower bound on the length of tree-like Frege proofs of the tautologies s_{n,π}.
Theorem 10. For every Frege system there exists a positive constant ε such that for every n there exists a permutation π of {1, ..., n} such that every tree-like proof of s_{n,π} has at least εn log n steps.
Lemma 11. Let S be the skeleton of some Frege proof of s_{n,π}. Then S and n uniquely determine the permutation π.
Proof of Theorem 10. To prove the theorem it suffices to compare the number of tree-skeletons with a given number of vertices and the number of permutations on n elements. To estimate the number of skeletons we can use well-known estimates on the number of trees, but we can also estimate it easily directly. A tree-skeleton can be represented as a term where we have a function symbol for each rule and a single (constant) symbol c which we use for all leaves. Using Polish notation we can even avoid parentheses. Thus we can code tree-skeletons with ≤ L vertices by words of length L in an alphabet of size r + 1, where r is the number of rules of the Frege system. If all tautologies s_{n,π} have proofs with at most L steps, then
(r + 1)^L ≥ n!,
which gives L = Ω(n log n). □
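The final inequality is easy to check numerically. This small sketch (the function name is an assumption, not from the paper) computes the smallest L with (r+1)^L ≥ n!, which indeed grows like n log n.

```python
import math

def min_steps_lower_bound(n, r):
    """Smallest L with (r + 1)**L >= n!. If every tree-like proof of the
    s_{n,pi} had <= L steps, the skeleton encoding would give at most
    (r + 1)**L distinct permutations, so L >= log(n!) / log(r + 1)."""
    # math.lgamma(n + 1) is the natural log of n!, avoiding huge integers
    return math.ceil(math.lgamma(n + 1) / math.log(r + 1))
```

For instance, with r = 2 rules and n = 4 the bound is ⌈ln 24 / ln 3⌉ = 3, and doubling n more than doubles the bound, reflecting the n log n growth.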
We shall discuss another method for proving lower bounds on the lengths of proofs. This method has been successfully applied in the case of proofs where the formulas have bounded depth [2]. (Here the restriction means that we use only the De Morgan basis and the number of alternations of different connectives is bounded by a constant; e.g. CNFs and DNFs are of depth ≤ 3.) Ajtai [1] and Riis [18] use in fact a different approach, an approach based on forcing, but their results can be interpreted using the boolean values method.
In model theory we use boolean values to prove independence results as fol-
lows. We take a suitable boolean algebra and assign suitable values to formulas.
If a sentence gets value different from 1, then it is not provable, since we can
collapse the boolean algebra to a two-element boolean algebra and get a model,
where the sentence is false. In propositional calculus we are interested in lower
bounds on the length of proofs of tautologies. A tautology gets value 1 in any
boolean algebra, so we cannot use a single boolean algebra. Our approach is based
on assigning boolean algebras to every small subset of a given set of formulas in
a consistent way. An equivalent approach has been proposed by Krajíček [12], which is based on assignments in a single partial boolean algebra.
The concept of a homomorphism is defined for boolean algebras. We extend it to mappings of sets of formulas into boolean algebras. Namely, let a set of formulas L and a boolean algebra B be given. A mapping λ : L → B will be called a homomorphism if it is consistent w.r.t. connectives. For instance
λ(φ ∨ ψ) = λ(φ) ∨_B λ(ψ), if φ, ψ, φ ∨ ψ ∈ L.
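As a toy illustration of this consistency condition (assuming the two-element boolean algebra {0, 1} and a tuple encoding of formulas, neither of which is fixed by the text), the check can be done mechanically:

```python
# Formulas as nested tuples: ("var", p), ("or", a, b), ("and", a, b),
# ("not", a). L is a set of formulas, lam a dict mapping each formula
# in L to an element of the boolean algebra {0, 1}.
def is_homomorphism(L, lam):
    """True iff lam is consistent w.r.t. connectives: whenever a compound
    formula and its immediate subformulas all lie in L, the value of the
    compound is computed from the values of the parts."""
    ops = {"or": lambda a, b: a | b, "and": lambda a, b: a & b}
    for f in L:
        if f[0] in ops and f[1] in L and f[2] in L:
            if lam[f] != ops[f[0]](lam[f[1]], lam[f[2]]):
                return False
        if f[0] == "not" and f[1] in L:
            if lam[f] != 1 - lam[f[1]]:
                return False
    return True
```

Note that the condition only constrains formulas whose subformulas also belong to L, which is exactly what makes assignments on small subsets of formulas flexible.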
Proof. Let a proof (φ₁, ..., φ_m), m ≤ n, of φ (= φ_m) be given. We shall show that the assumption of the proposition fails for Φ = {φ₁, ..., φ_m}. Suppose that we have a system of homomorphisms as required in the proposition, except possibly for the last condition. We shall show that all φ ∈ Φ get λ_{{φ}}(φ) = 1, thus the last condition is not satisfied.
First observe that B_{{φ}} is embedded in all B_S where φ ∈ S, hence λ_S(φ) = 1 for one S iff it holds for all such S. Let φ be an instance of a logical axiom ψ(x₁, ..., x_k), i.e. φ = ψ(χ₁, ..., χ_k) for some formulas χ₁, ..., χ_k. Let S be the set of formulas θ(χ₁, ..., χ_k), where θ runs over all subformulas of ψ. By the assumption, |S| is at most the degree of the Frege system, hence we have a boolean algebra B_S and a homomorphism λ_S : S → B_S. Since ψ is a tautology, it must get value 1 for any assignment of boolean values. Thus
λ_S(φ) = λ_S(ψ(χ₁, ..., χ_k)) = ψ(λ_S(χ₁), ..., λ_S(χ_k)) = 1.
Suppose that φ_i is obtained in the proof from some φ_{j₁}, ..., φ_{j_t}, j₁, ..., j_t < i, by a Frege rule, and suppose that φ_{j₁}, ..., φ_{j_t} all get the value 1 in their algebras. Then applying the same argument as for an axiom (namely, a Frege rule is sound in any boolean algebra), we conclude that φ_i also gets the value 1. Thus, by induction, all formulas φ₁, ..., φ_m get value 1. □
As an example, we shall describe the form of boolean algebras that one can
use for proving a superpolynomial lower bound on the lengths of bounded depth
proofs of the Pigeon Hole Principle, using the combinatorial arguments of Ajtai
[1]. The Pigeon Hole Principle is the statement that there is no bijection between
References
Bruno Courcelle
Introduction
We now explain the links between Questions 1 and 2. It is known that every MS-definable set is recognizable. Let 𝒞 be a class of graphs, let F be the set of graph operations on 𝒞 involved in the notion of recognizability (cf. [4]), and let us also assume that every graph in 𝒞 is the value of an F-expression, i.e., of an algebraic expression over F. Assume finally that for every graph G in 𝒞 we can construct "in G" an F-expression that defines this graph. Then, if L is a recognizable subset of 𝒞, there exists a finite tree-automaton recognizing the set of F-expressions the value of which is in L. Given a graph G we can express that G belongs to L by means of an MS formula that works as follows:
(1) it defines "in G" an F-expression, the value of which is G,
(2) it checks whether the tree-automaton accepts this expression (this is possible by Doner's theorem).
Graphs
All graphs will be finite, directed (unless otherwise stated), and simple (no two edges have the same ordered pair of vertices). A graph will be given as a pair G = ⟨V_G, edg_G⟩ where V_G is the set of vertices and edg_G ⊆ V_G × V_G is the edge relation. If X ⊆ V_G we denote by G[X] the induced subgraph of G with set of vertices X. A path is a sequence of pairwise distinct vertices (x₁, ..., x_n) such that (x_i, x_{i+1}) ∈ edg_G for every i = 1, ..., n−1. It connects x₁ to x_n. It is empty if n = 1. A cycle is like a path except that x₁ = x_n and n > 1. A graph is a path if its vertices form a path (x₁, ..., x_n) and all edges of the graph are in the path, i.e., are of the form (x_i, x_{i+1}) for some i. A discrete graph is a graph without edges. We let Suc_G(x) := {y / (x, y) is an edge} and we call it the set of successors of x. We say that x is a predecessor of y if y is a successor of x. The outdegree of G is the maximal cardinality of the sets Suc_G(x). A dag is a (directed) acyclic graph; a tree is a dag such that every vertex is reachable by a unique path from a (necessarily unique) vertex called the root. A forest is a dag, each connected component of which is a tree; hence, a forest that is not a tree has several roots. A vertex without successors is called a leaf. The transitive closure of a graph G is a graph denoted by G⁺. If G is a dag, the relation edg_G* (the reflexive and transitive closure of the successor relation) is a partial order on V_G. Two vertices x and y are comparable if x edg_G* y or y edg_G* x; otherwise they are incomparable and we write this x ⊥_G y. The reduction of a dag G is the least subgraph H of G such that H⁺ = G⁺. It is unique and denoted by red(G); it is the Hasse diagram of the order edg_G*. We say that a graph G is linear if it is a dag and any two vertices are linked by an edge; its reduction is a path and the order edg_G* is linear.
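The definitions of G⁺ and red(G) translate directly into a small sketch. Representing edge relations as sets of pairs and using a Warshall-style closure are illustrative choices of this sketch, not the paper's formalism.

```python
def transitive_closure(edges, vertices):
    """The edge relation of G+: close the set of pairs under composition
    (Warshall-style, iterating over intermediate vertices z)."""
    reach = set(edges)
    for z in vertices:
        for x in vertices:
            for y in vertices:
                if (x, z) in reach and (z, y) in reach:
                    reach.add((x, y))
    return reach

def red(edges, vertices):
    """Transitive reduction of a dag: drop every edge (x, y) implied by
    a path of length >= 2, leaving the least H with H+ = G+."""
    closure = transitive_closure(edges, vertices)
    return {(x, y) for (x, y) in edges
            if not any((x, z) in closure and (z, y) in closure
                       for z in vertices if z != x and z != y)}
```

On a dag, dropping such edges never changes the closure, which is why red(G) is the least subgraph with the same G⁺ (its Hasse diagram).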
Recognizable sets
Let 𝒮 be a possibly infinite set of sorts. An 𝒮-signature is a set of function symbols F such that each f in F has a type of the form s₁×s₂×...×s_n → s where s₁, ..., s_n, s are sorts. An F-algebra is an object M = ⟨(M_s)_{s∈𝒮}, (f_M)_{f∈F}⟩, where, for each s in 𝒮, M_s is a set called the domain of sort s of M, and for each f ∈ F of type s₁×s₂×...×s_n → s, f_M is a total mapping: M_{s₁} × M_{s₂} × ... × M_{s_n} → M_s. We denote by T(F) the F-algebra of finite terms (algebraic expressions) over F and by h_M the unique homomorphism: T(F) → M that associates with a term its value. We shall say that t is a term (or an expression) denoting h_M(t). An F-algebra A is locally finite if each domain A_s, s ∈ 𝒮, is finite. Let M be an F-algebra and s ∈ 𝒮. A subset B of M_s is recognizable if there exists a locally finite F-algebra A, a homomorphism h : M → A, and a (finite) subset C of A_s such that B = h⁻¹(C).
Locally ordered dags
Every dag has a topological sorting, i.e., an ordering ≤ of the vertices such that if there is an edge from x to y then x ≤ y. A vertex of a dag G having no predecessor is called a root and Root_G denotes the set of roots of G. (Since graphs are finite, if a dag G is nonempty, then Root_G is nonempty.) A partial order α on V_G locally orders G if the sets Root_G and Suc_G(x) for every x ∈ V_G are linearly ordered by α; we let Paths(G) denote the set of paths in G starting from a root; Paths(G) is linearly ordered by ≤_α, where ≤_α is the lexicographical order on sequences of vertices associated with α. For each x ∈ V_G, we let π(x) denote the unique ≤_α-minimal path from a root to x. For x, y ∈ V_G, we let x ≤_α y iff π(x) ≤_α π(y). Hence, ≤_α is a linear order on V_G. The enumeration of V_G in increasing order with respect to ≤_α is called the α-depth-first traversal of G. It is nothing but the order in
which the vertices of G are visited during the depth-first search of G where, in case of choice, the α-smallest vertex is chosen. We let P be the binary relation on V_G such that
(x, y) ∈ P iff x is just before y on the path π(y).
The graph F(G, α) := ⟨V_G, P⟩ is the α-depth-first spanning forest. In particular, since G is a dag, an edge (x, y) of G can be of only 3 types, by [2, Lemma (5.6)]:
(1) either it is a tree edge, i.e., an edge of F(G, α),
(2) or it is a forward edge, i.e., x is an ancestor of y in some tree of F(G, α), but (x, y) is not an edge of F(G, α),
(3) or it is a cross edge, i.e., x and y are incomparable in F(G, α) and y <_α x.
Finally, we define the α-canonical traversal of G as the α-depth-first traversal of F(G, α⁻¹), where α⁻¹ is the opposite order of α (i.e., x <_{α⁻¹} y iff y <_α x). It orders G locally iff α does.
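The α-depth-first traversal and its spanning forest can be sketched as follows. This is our own illustration, in which α is taken to be the natural order on integer vertex names, so that "in case of choice, the α-smallest vertex is chosen".

```python
# Sketch: alpha-depth-first traversal of a dag and the spanning forest
# F(G, alpha), with alpha = natural order on integer vertex names.

def alpha_dfs(g):
    """Return (visit order, parent map of the alpha-depth-first spanning forest)."""
    order, parent, seen = [], {}, set()
    def visit(v):
        seen.add(v)
        order.append(v)
        for w in sorted(g[v]):        # alpha-smallest successor first
            if w not in seen:
                parent[w] = v
                visit(w)
    roots = sorted(v for v in g if all(v not in g[u] for u in g))
    for r in roots:                    # roots taken in alpha-increasing order
        if r not in seen:
            visit(r)
    return order, parent

g = {0: {1, 2}, 1: {3}, 2: {3}, 3: set()}
order, parent = alpha_dfs(g)
print(order)   # [0, 1, 3, 2]
print(parent)  # {1: 0, 3: 1, 2: 0}
```

In this example the edge (2, 3) is a cross edge: 2 and 3 are incomparable in the forest and 3 is visited before 2.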
Corollary 3: For each d ∈ ℕ some depth-first traversal of trees of outdegree at most d is MS-definable.
General graphs.
Can one extend Theorem 2 to directed graphs with cycles? One cannot of course define a topological sorting but one may want to define a linear order. Let us assume that G is directed and has an origin, namely a vertex r from which every vertex is reachable by a directed path; let us also assume that α is a partial order on V_G that orders linearly each set Suc_G(x). For every x there is a ≤_α-smallest path in G from r to x, denoted by π(x), and the definition of F(G, α) extends. The α-depth-first traversal of the tree F(G, α) is a linear order of G (but not always a topological sorting). However, we do not know how to define it in MS. The reduction red(H) is not uniquely defined for graphs H with cycles (take for example a complete directed graph with 3 vertices) and we do not know how to define an alternative formula with the same meaning as φ(y, X). Actually, no such formula
does exist: otherwise one could express in MS that a directed graph has a Hamiltonian cycle, and this is not expressible (see [6]).
Recognizability versus MS-definability.
We now consider classes of graphs of which the recognizable sets are MS-definable. We let C_k be the class of dags G such that there exists a chain partition (X1, ..., Xk) of V_G and a topological sorting S such that: for each edge (x, y) of G, if x ∈ Xi then x is the last vertex in Xi that precedes y with respect to S.
We conclude this section with an application to partial k-paths (see [6,8,13] for definitions).
It is conjectured in [5] that a set of graphs of treewidth at most k, for any fixed k, is recognizable iff it is the set of finite models of an MS formula, where one can also use special quantifiers expressing that sets have cardinality that is a multiple of a fixed number. This result
Traces
Proposition 10 ([1]) For any two words u, v ∈ A*, u ≡ v iff dep(u) and dep(v) are isomorphic, iff v is a topological sorting of dep(u).
It follows that the graph H as above is actually associated with the equivalence class of u, i.e., with a trace t. We shall denote by dep(t) the abstract graph that is the isomorphism class of dep(u) where u is any member of t. (The numbering of the vertices of dep(u) depends on u but is irrelevant in dep(t).) If L is a set of traces, we let Dep(L) := {dep(t) | t ∈ L}.
In particular every dependency graph G belongs to the class of dags C_k where k is the size of the alphabet: the sets of vertices with a same label form an appropriate chain partition. For such a graph G, we let min(G) = min(dep⁻¹(G)) = the unique ≤-minimal word in the trace dep⁻¹(G) ⊆ A*.
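The dependency graph dep(u) of a word can be sketched concretely. The dependence relation D and the three-letter alphabet below are our own illustrative assumptions.

```python
# Sketch: dep(u) for a word u over an alphabet with a reflexive,
# symmetric dependence relation D. Vertices are positions of u,
# with an edge i -> j whenever i < j and u[i], u[j] are dependent.

def dep(u, D):
    edges = {(i, j) for i in range(len(u)) for j in range(len(u))
             if i < j and (u[i], u[j]) in D}
    return list(u), edges

# Illustrative dependence: every letter depends on itself and on c;
# a and b are independent of each other.
D = {(x, y) for x in "abc" for y in "abc" if x == y or "c" in (x, y)}
labels, edges = dep("abc", D)
print(edges)  # {(0, 2), (1, 2)} -- a and b independent, so no (0, 1) edge
```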
References
[1] AALBERSBERG I., ROZENBERG G., Theory of traces, Theoret. Comput. Sci. 60 (1988) 1-82.
[2] AHO A., HOPCROFT J., ULLMAN J., The design and analysis of computer algorithms, Addison-Wesley, 1974.
[3] CORI R., MÉTIVIER Y., ZIELONKA W., Asynchronous mappings and asynchronous cellular automata, Information and Computation 106 (1993) 159-203.
[4] COURCELLE B., The monadic second-order logic of graphs I: Recognizable sets of finite graphs, Information and Computation 85 (1990) 12-75.
[5] COURCELLE B., The monadic second-order logic of graphs V: On closing the gap between definability and recognizability, Theoret. Comput. Sci. 80 (1991) 153-202.
[6] COURCELLE B., The monadic second-order logic of graphs VI: On several representations of graphs by relational structures, Discrete Applied Mathematics 54 (1994) 117-149.
[7] COURCELLE B., The monadic second-order logic of graphs VIII: Orientations, Annals of Pure and Applied Logic 72(2) (1995).
[8] COURCELLE B., The monadic second-order logic of graphs X: Linear orderings, http://www.labri.u-bordeaux.fr/-courcell/ActSci.html, May 1994.
[9] COURCELLE B., Monadic second-order definable graph transductions: a survey, Theoret. Comput. Sci. 126 (1994) 53-75.
[10] COURCELLE B., Recognizable sets of graphs: equivalent definitions and closure properties, Math. Str. Comp. Sci. 4 (1994) 1-32.
[11] HOOGEBOOM H., ten PAS P., Recognizable text languages, MFCS 1994, LNCS 841 (1994) 413-422.
[12] OCHMANSKI E., Regular behaviour of concurrent systems, Bull. of EATCS 27 (1985) 56-67.
[13] PROSKUROWSKI A., Separating subgraphs in k-trees: cables and caterpillars, Discrete Maths 49 (1984) 275-285.
[14] THOMAS W., Automata on infinite objects, in "Handbook of Theoretical Computer Science, Volume B", J. van Leeuwen ed., Elsevier, 1990, pp. 133-192.
[15] THOMAS W., On logical definability of trace languages, Proceedings of a workshop held in Kochel in October 1989, V. Diekert ed., Report of Technische Universität München I-9002, 1990, pp. 172-182.
First-Order Spectra with One Binary Predicate
1 Introduction
Let φ be a first-order sentence (i.e. with no free variable). The spectrum of φ, denoted by Sp(φ), is the set of cardinalities of finite structures which satisfy the sentence φ. Let Spectra denote the class of spectra of all first-order sentences. It has been proved that
NP = Spectra
holds for the unary representation of integers (see [11, 3]).
Such a result establishes a connection between computational complexity and finite model theory and permits us, for any property, to study its "complexity" either in terms of the machine which recognizes it or in terms of the formula which characterizes it.
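For a small, self-contained illustration of the notion of spectrum (our own example, not one from the paper): the spectrum of the sentence "f is an involution without fixed points" (a single unary function symbol) is the set of even positive integers, which can be checked by brute force on small domains.

```python
# Sketch: the spectrum of a sentence is the set of cardinalities of its
# finite models. Brute-force model search for "f(f(x)) = x and f(x) != x".

from itertools import product

def has_model(n):
    """Is there f : n -> n with f(f(x)) = x and f(x) != x for all x?"""
    return any(all(f[f[x]] == x and f[x] != x for x in range(n))
               for f in product(range(n), repeat=n))

print([n for n in range(1, 7) if has_model(n)])  # [2, 4, 6]
```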
So, as soon as we have a good measure of complexity for formulas, we are able to translate some computational complexity results into finite model theory. For example, Pudlak [14] uses Cook's hierarchy theorem (see [2]) in order to show that there is a strict hierarchy for spectra. Of course, the inverse translation is possible, although we do not know any result in computational complexity obtained in this way.
Let Sp(d∀) be the class of spectra of sentences with at most d universal quantifiers. Let S be a set of positive integers. We have, for all d ≥ 1 (see [8, 10]):
where n is the input integer. This result closely links the degree of the nondeterministic time complexity class which recognizes a given property P and the number of universal quantifiers required for a formula which expresses P. If we use NRAMs (with only successor as operator and uniform cost measure) instead of TMs, it yields, for d ≥ 1 (see [9, 10]):
Func^1 = BIN^{1,bd}
We will only give the proof of the theorem for spectra (see the remark at the end of the paper for the case of generalized spectra). Talking about spectra, we have as an immediate corollary:
Corollary 1.2
NTIME(n log n) ⊆ BIN^1,
where n is the input integer.
In particular, this is another way to show that the set of primes belongs to BIN^1 (see also [16]).
We divide our work into three parts. In Section 3, we prove the following proposition.
Then, in Section 4, we prove the converse result (which is easier than the previous one).
2 Definitions
We will use the usual notation in first-order logic and in model theory.
A type τ is a finite set of relation and function symbols {V1, ..., Vk}. A formula is of type τ if all its relation and function symbols are in τ.
M = (Dom, V1, ..., Vk) denotes a structure consisting of a nonempty set Dom called the domain and of relations and functions defined on Dom. For convenience, our notation will not distinguish between a relation or function symbol and its interpretation.
The cardinality of a structure is the cardinality of its domain. In this paper, we will only consider finite structures.
Let φ be a sentence. Let A(x) be a formula with only one free variable x. We define the relativization φ^A of φ by induction on the construction of formulas as follows: if φ is atomic, then φ^A = φ; else (¬φ)^A = ¬φ^A, (φ1 ∧ φ2)^A = φ1^A ∧ φ2^A; (∃x φ)^A (also denoted (∃x A(x)) φ) becomes ∃x(A(x) ∧ φ^A) and (∀x φ)^A (also denoted (∀x A(x)) φ) becomes ∀x(A(x) → φ^A).
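The inductive definition of relativization translates directly into a syntax-tree transformation. A minimal sketch (the tuple-based AST encoding is our own assumption):

```python
# Sketch: relativization of a formula to a unary predicate A, on a tiny
# AST where formulas are tuples such as ("forall", "x", body).

def relativize(phi, A):
    op = phi[0]
    if op == "atom":
        return phi
    if op == "not":
        return ("not", relativize(phi[1], A))
    if op == "and":
        return ("and", relativize(phi[1], A), relativize(phi[2], A))
    if op == "exists":            # (Ex phi)^A = Ex (A(x) and phi^A)
        _, x, body = phi
        return ("exists", x, ("and", ("atom", A, x), relativize(body, A)))
    if op == "forall":            # (Ax phi)^A = Ax (A(x) -> phi^A)
        _, x, body = phi
        return ("forall", x, ("implies", ("atom", A, x), relativize(body, A)))
    raise ValueError(op)

phi = ("forall", "x", ("atom", "P", "x"))
print(relativize(phi, "A"))
# ('forall', 'x', ('implies', ('atom', 'A', 'x'), ('atom', 'P', 'x')))
```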
"~80
Let us begin with the main result, which is also the most difficult one:
fig.1: the arrows (both full lines and dotted lines) represent the relation R.
Ψ2: ∀z (∃α A(α))(∃β B(β))(∀α' A(α'))(∀β' B(β'))
[(R(z, α') ∧ R(z, β')) → (α' = α ∧ β' = β)]
We set Ψ4 = ⋀_{i=1}^{k} Ψ_i where:
"each element x of the domain is associated by R with exactly one pair (u, v) of U × V".
fig.2
..., u1, v1 are in Dom. Fig.3 describes the unique meaning of each arrow R(x, y) according to the respective sets (U1, V1, ..., A, B, C) of its endpoints x and y. We will also distinguish the case where y is one of the constants u1, v1, ..., a, b, c.
fig.3: a table giving, for each pair of endpoint sets (column x ∈ {U1, V1, U2, V2, A, B, C, Rem}, row y ∈ {U1, V1, U2, V2, A, B, c}), the function symbol interpreting the arrow R(x, y): f-symbols for the rows U1, V1, U2, pr1 for the row A, pr2 for the row B, and bij for the row c.
3.2 Proof of Proposition 1.3
Proof: easy. □
Corollary 3.2
Func^1 ⊆ BIN^{1,bd} ⊆ BIN^1.
4 A converse result
We have to prove the following proposition, which is easier than the previous one.
37w, --+
Let G ⊨ φ; we build G' to be a model of φ' as follows: edges of R' are given by those of R together with edges (x, γ) where x is of outdegree 0 (for R) and γ is some fixed element. Conversely, if G' ⊨ φ', then the structure G such that "R(a, b) holds iff R'(a, b) ∧ ¬Z(a)" is a model of φ (by construction of φ') and if G' has outdegree bounded by k then so has G.
From now on we transform R' and Z into unary functions. We only have to replace in φ' each subformula of the form R'(x, y) with f1(x) = y ∨ ... ∨ fk(x) = y, where the fi's are new unary function symbols, and to replace each subformula Z(x) by the formula f0(x) = z, where f0 is a unary function and z is a new variable. We denote by φ''(z) the resulting sentence.
The idea is to "label" the (at most) k edges R'(x, y1), R'(x, y2), ..., R'(x, yk) starting from any x by respective arrows f1 : x ↦ y1, ..., fk : x ↦ yk. The reader should be easily convinced that the following equivalence holds for |Dom| ≥ 2:
there exists G' = (Dom, R', Z) which satisfies φ' and where each vertex has an outdegree (for R') between 1 and k
iff
there exists F = (Dom, f0, f1, ..., fk) (on the same domain) which satisfies ∃z φ''(z).
□
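The labelling of the at most k outgoing R'-edges by unary functions can be sketched as follows. The padding convention for vertices of outdegree below k is our own choice; any choice that leaves the recovered relation unchanged would do.

```python
# Sketch: replacing a binary relation R' of outdegree in [1, k] by k
# unary functions f1, ..., fk, as in the proof: the edges leaving x are
# labelled f1 : x -> y1, ..., fk : x -> yk.

def relation_to_functions(Rp, dom, k):
    """Rp: set of pairs, each x in dom with between 1 and k successors."""
    fs = {i: {} for i in range(1, k + 1)}
    for x in dom:
        succs = sorted(y for (a, y) in Rp if a == x)
        assert 1 <= len(succs) <= k
        for i, y in enumerate(succs, start=1):
            fs[i][x] = y
        for i in range(len(succs) + 1, k + 1):
            fs[i][x] = succs[0]        # padding: repeats keep R' unchanged
    return fs

dom = {0, 1, 2}
Rp = {(0, 1), (0, 2), (1, 0), (2, 2)}
fs = relation_to_functions(Rp, dom, k=2)
# R'(x, y) iff f1(x) = y or f2(x) = y:
assert Rp == {(x, fs[i][x]) for x in dom for i in (1, 2)}
print(fs)
```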
Corollary 4.2
BIN^{1,bd} ⊆ Func^1.
NTIME(n log n) ⊆ BIN^{1,bd} ⊆ BIN^1.
Proof
Corollary 4.4 The set of primes and the set of perfect numbers are in BIN^1.
Notice that this corollary can also be proved using results of Woods [16]. Let P be a k-ary predicate (on integers). We say that P is rudimentary if it can be defined by a first-order formula Φ in a language containing only the equality (x = y), addition (x + y = z) and multiplication (x·y = z) predicates, and whose variables are bounded by the variables of P. For example, it is easy to see that the set of primes is a (unary) rudimentary predicate.
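As a sketch of why primality is definable in this bounded style (our own illustration): every quantified variable ranges over values bounded by the argument x.

```python
# Sketch: x is prime iff x > 1 and no y, z with 1 < y, z <= x satisfy
# y * z = x. All quantified variables are bounded by the argument x,
# in the spirit of a rudimentary definition.

def is_prime(x):
    return x > 1 and not any(y * z == x
                             for y in range(2, x + 1)
                             for z in range(2, x + 1))

print([x for x in range(2, 20) if is_prime(x)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```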
In his thesis [16], Woods shows that every rudimentary set of positive integers is the spectrum of a sentence involving only one binary relation symbol (then, of course, Corollary 4.3 follows). Let RUD denote the class of rudimentary sets. In fact, our opinion is that the following inclusions hold:
RUD ⊆ Func^1 ⊊ BIN^1.
This "contrasts" with the result by Fagin and de Rougemont (see [4, 7, 15]) that connectedness is not definable by a monadic second-order sentence even in the presence of an underlying successor relation.
5 Conclusion
By an extension of the method of this paper, we hope to give soon the same kind of result where, instead of a simple binary relation, we consider a more restricted one such as a symmetric binary relation or a partial ordering.
In [6], Fagin raises the problem of the existence of spectra (resp. generalized spectra about graphs) which are not in BIN^1 (respectively in BIN^1({R})). Usually, logical undefinability results (in a given language) concern natural problems (about graphs, words, numbers). But, as far as we know, most natural problems about graphs (resp. words, numbers) are either in BIN^1({R}) or in Func^1({R}) (which, according to this paper, is also in BIN^1({R})). Consequently, a positive answer to the above question seems to lie in the construction of artificial problems. This explains, in some way, why such a positive answer seems to be very hard to obtain.
Acknowledgements. We would like to thank Professor Etienne Grandjean for the many ideas he suggested to us and for the attention he has given to this work. We are grateful to Nadia Creignou for her helpful advice, which improved readability.
References
1. M. Ajtai. Σ¹₁-formulae on finite structures. Ann. Pure Appl. Logic, 24:pp.1-48, 1983.
2. S.A. Cook. A hierarchy for nondeterministic time complexity. J. Comput. Systems Sci., vol.7:pp.343-353, 1973.
3. R. Fagin. Generalized first-order spectra and polynomial-time recognizable sets. Complexity of Computations, vol.7:pp.43-73, 1974.
4. R. Fagin. Monadic generalized spectra. Z. Math. Logik Grundlag. Math., 21:pp.89-96, 1975.
5. R. Fagin. A spectrum hierarchy. Z. Math. Logik Grundlag. Math., (21):pp.123-134, 1975.
6. R. Fagin. Finite-model theory - a personal perspective. Theoretical Computer Science, (116):pp.3-31, 1993.
7. R. Fagin, L.J. Stockmeyer, and M.Y. Vardi. On monadic NP vs monadic co-NP. IBM Research Report, 1993.
8. E. Grandjean. The spectra of first-order sentences and computational complexity. SIAM J. Comput., vol.13:pp.356-373, 1984.
9. E. Grandjean. Universal quantifiers and time complexity of random access machines. Math. Systems Theory, vol.18:pp.171-187, 1985.
10. E. Grandjean. First-order spectra with one variable. J. Comput. Systems Sci., vol.40(2):pp.136-153, 1990.
11. N.D. Jones and A.L. Selman. Turing machines and the spectra of first-order formulas with equality. J. Symb. Logic, vol.39:pp.139-150, 1974.
12. J.F. Lynch. Complexity classes and theories of finite models. Math. Systems Theory, vol.15:pp.127-144, 1982.
13. F. Olive. Personal communication.
14. P. Pudlak. The observational predicate calculus and complexity of computations. Comment. Math. Univ. Carolin., vol.16, 1975.
INTRODUCTION
What is the relation between the computational complexity of a decision¹ problem and the logical complexity of the language required to describe it? This question was first formulated by Immerman [Im3] (see also [Gu1, Gu2, Im1, Im2, Va]). In complexity theory any problem is identified with a language L ⊆ Σ*. Solving the problem means deciding the associated language, i.e., deciding for each word w whether it belongs to L or not. Now, a word w on a finite alphabet is easily identified with a finite structure A_w. The most common encoding consists in identifying each w ∈ {0, 1}* with the finite structure ⟨n, X, succ⟩ defined by:
• n = {0, 1, ..., length(w) − 1};
• X ⊆ n and ∀i ∈ n : X(i) ⟺ the i-th letter of w is 1;
• succ is the successor relation on n.
Such an encoding being chosen, the purpose is the following: try to associate to every complexity class C a class of formulas of a given logical language, F, in such a way that: a language L ⊆ Σ* is in C iff there exists a sentence φ ∈ F s.t. ∀w ∈ Σ* : w ∈ L ⟺ A_w ⊨ φ.
¹ This question is also studied for optimisation problems. See [KoTh] for instance.
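The encoding above can be sketched as:

```python
# Sketch: the standard encoding of a word w in {0,1}* as the finite
# structure <n, X, succ>.

def word_to_structure(w):
    n = set(range(len(w)))
    X = {i for i in n if w[i] == "1"}               # positions carrying a 1
    succ = {(i, i + 1) for i in range(len(w) - 1)}  # successor relation
    return n, X, succ

n, X, succ = word_to_structure("0110")
print(sorted(X))     # [1, 2]
print(sorted(succ))  # [(0, 1), (1, 2), (2, 3)]
```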
Among the results which were proved in that framework, let us mention those of Büchi and Fagin, often quoted in this way:
regular languages = monadic SO = monadic SO(∃) [Bü]; NP = SO(∃) [Fa],
where, whenever C is a complexity class and F is a class of formulas, C = F means that for every language L on Σ, L is in C iff there exists a formula φ ∈ F such that: ∀w ∈ Σ* : w ∈ L ⟺ A_w ⊨ φ. We would expect this correspondence between computational complexity and logical definability to allow us:
1- to use the flexibility of complexity-theory tools (models of computation for instance) to get results on logical definability;
2- to export complexity questions into the field of logical definability.
An important example of the second item is the following: in order to prove that a given problem belongs to a given complexity class, it is enough to build an algorithm of that complexity which decides this problem. But of course, there is no guarantee that this algorithm is the "best" one. In other words, the difficulty is to obtain lower bounds of complexity.
In [Ly2], Lynch focuses on this precise question. He writes: "there are hundreds of known NP-complete problems, but until recently, not one of them had a provable nontrivial lower bound."² This is the reason why Lynch wants to compare the class NTIME(n) (i.e. the class of languages recognized in nondeterministic linear time on Turing machines) with some class of logical formulas F. He hopes then to obtain results such as "H ∉ NTIME(n)" for some natural NP-complete problems H, by proving their non-definability by formulas of F. Actually, the "inclusion" NTIME(n) ⊆ F is enough to get such a result, because it still allows the implication "H non-definable in F ⟹ H ∉ NTIME(n)". The main result of [Ly1,2] may be stated as follows:
Concluding his paper [Ly2], Lynch asks: "Can the (above) theorem be extended to random access models of computation, where the memory elements can be read and written in any order?" The question is justified by the specificity of his proof, which uses a discrete analogue of the Intermediate Value Theorem for Turing machines, that is: if the head is at position c at time t and at c' at time t' > t, then for every position c'' between c and c', there is some time t'' between t and t' such that the head is at c'' at time t''. Of course, there
² There is a noteworthy exception with the problem "Reduction of Incompletely Specified Automata" (RISA), which has been proved non-solvable in deterministic linear time in [Gr2].
Remark. k only depends on the cardinality of Σ (e.g. for |Σ| = 2, each k > 1 suits).
³ In this paper, log n denotes the logarithm of n in base 2.
1.2 Reading Decomposition of a Word w ∈ Σ*
In what follows, we assume that Σ is the dyadic alphabet (Σ = {1, 2}) and we denote by k an integer greater than or equal to 2. In this section, we describe a decomposition of the words of Σ* suited to the NRAM definition given above. This decomposition, called "reading decomposition", will allow us to encode every word w ∈ Σ* as a finite structure of type ⟨m, f, <, suc, 0⟩, where f denotes a unary function on m = {0, 1, ..., m − 1}, < is the usual order and suc is the successor function on the domain m. This encoding is used in an essential manner in the logical characterization of NLIN given by [Gr3].
Let w ∈ Σ*. Let us suppose n = length(w), l = ⌈(log n)/k⌉, m = ⌈n/l⌉. Then we can break down w into m words w0, w1, ..., w_{m−1}, all of which have length l exactly, except the last one, which may be of length less than or equal to l. We have w = w0·w1·...·w_{m−1}, and if w_i^d denotes the integer whose dyadic writing is w_i, it is easy to show that m = O(n/log n) and w_i^d < m for large enough n.
We can now identify w ∈ {1, 2}*, with reading decomposition w = w0·...·w_{m−1}, with the finite structure ⟨m, f_w, <, 0⟩ defined by:
- m = {0, 1, ..., m − 1}
- f_w : m → m, f_w(i) = w_i^d
- < is the natural linear order on m and 0 is the "real" 0 of m.⁴
This exact characterization of NLIN will allow us to prove our main result, that is
⁴ Compare this with the usual identification of a word w ∈ {0, 1}^n with the finite structure ⟨n, X, <⟩, where X is the unary relation on n such that X(i) holds iff the i-th letter of w is 1 (see [Ly1]).
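The reading decomposition can be sketched as follows. The exact block length l = ⌈(log₂ n)/k⌉ is our assumption, chosen to match the stated asymptotics m = O(n/log n); only the general shape (split into blocks, read each block as a dyadic integer) is taken from the text.

```python
# Sketch: reading decomposition of w in {1,2}* into m blocks of length l,
# each read as a dyadic integer, giving the unary function f_w : m -> m.

import math

def dyadic_value(block):
    """Dyadic reading: x = x0 + x1*2 + ... with digits in {1, 2},
    least significant digit rightmost."""
    return sum(int(d) * 2**j for j, d in enumerate(reversed(block)))

def reading_decomposition(w, k=2):
    n = len(w)
    l = max(1, math.ceil(math.log2(n) / k))     # assumed block length
    blocks = [w[i:i + l] for i in range(0, n, l)]
    m = len(blocks)
    f = {i: dyadic_value(b) for i, b in enumerate(blocks)}
    return m, f

m, f = reading_decomposition("12211221", k=2)
print(m, f)  # 4 {0: 4, 1: 5, 2: 4, 3: 5}
```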
We know that each positive integer x may be coded by a word of {0, 1}* of length ⌊log(x) + 1⌋ (binary representation of integers). In particular, if E is a finite set of integers, we can identify every x ∈ E with a word B(x) ∈ {0, 1}* of length exactly L = Max{⌊log(x) + 1⌋ : x ∈ E}, which means padding with 0's the binary representations of the elements of E. Equality between two integers of E is equivalent to equality of their representative words. Namely, if for x ∈ E we denote by B(x)_t the bit of B(x) of rank t (0 ≤ t < L, with the convention that the bit of rank 0 is the least significant bit), we have:
x = y iff (∀t < L)(B(x)_t = B(y)_t),
hence
f(x) = g(y) iff (∀t < L)((B_f)_{xL+t} = (B_g)_{yL+t}) (*)
Eventually, we know that a word of length mL is naturally associated with the unary predicate X ⊆ mL defined by:
language, has a cost: the domain of finite structures considered must be expanded (we pass from unary functions on domain m = O(n/log n) to unary relations on domain mL, which is O(m log m) = O(n)), and the predefined symbols <, suc, 0 must be replaced by the much more expressive predicate +.
The study of this flattening of functions into unary predicates constitutes the core of this paper. Combined with the above-mentioned characterization of a language L ∈ NLIN:
(w ∈ L) iff (⟨m, f_w, <, 0⟩ ⊨ ∃g1 ... ∃gp ∀x ψ(x, f, g1, ..., gp, <, 0)),
it allows us to prove that for every L ∈ NLIN, there exists Φ ∈ SO(∃, mon, +) (i.e. Φ is a monadic existential second-order formula with addition), such that:
(w ∈ L) ⟺ (⟨n, X, +⟩ ⊨ Φ),
where n = length(w) and X is the unary predicate on n coding w.
More precisely, the paper is organized as follows:
In part 2 we first specify the above-mentioned identification between a unary function f : m → m and a unary predicate F ⊆ mL (L = ⌊log(m) + 1⌋). Actually, for technical reasons, the predicate F is built on a domain cn (for a fixed constant c) slightly bigger than mL. We will call F the "flattening of f on cn". Then, we state precisely the passage from functional language to relational language: for each formula Ψ ∈ FO(∀*∃*) on signature {f, g1, ..., gp, <, 0}, where the gi's are unary function symbols, there exists Φ ∈ FO(∀*∃*), on signature {F, G1, ..., Gq, +}, where F and the Gi's are monadic predicate symbols, such that if f : m → m and if F is the flattening of f on cn then
⟨m, f, <, 0⟩ ⊨ ∃g̅ Ψ ⟺ ⟨cn, F, +⟩ ⊨ ∃G̅ Φ.
Then we combine this result with the NLIN characterization of Theorem 1 to prove: for every L ⊆ {1, 2}* in NLIN, there exists a second-order formula Φ such that for every w ∈ {1, 2}*:
(w ∈ L) ⟺ (⟨cn, X, +⟩ ⊨ Φ),
where X is a unary predicate which codes w on cn.
Part 3 focuses on a technical result which makes it possible to give the preceding result a more standard form. We first prove that if R1, ..., Rp are predicates with respective arities r1, ..., rp, on a domain of the form cn, and if φ is a first-order formula on signature {R1, ..., Rp}, then there exists a division of each Ri into a series R*_i of ri-ary predicates on n, and a first-order formula φ* on signature {R*_1, ..., R*_p}, such that:
(⟨cn, R1, ..., Rp⟩ ⊨ φ) ⟺ (⟨n, R*_1, ..., R*_p⟩ ⊨ φ*).
Combined with the last equivalence seen in part 2, this result allows us to conclude: if L ⊆ {1, 2}* belongs to NLIN, then there exists Φ ∈ SO(∃, mon, +) such that for every w ∈ {1, 2}*:
(w ∈ L) ⟺ (⟨n, X, +⟩ ⊨ Φ),
where X is the unary predicate naturally coding w.
2.1 Notations
Remark. In the present paper, both the binary and the dyadic⁵ representations of positive integers are used. The binary representation is used when padded, uniform-length notations are helpful. The dyadic one allows us to identify any nonempty word of {1, 2}* with a positive integer, in a one-one way.
For this whole paragraph, m, L, and m' denote three integers such that L = ⌊log(m) + 1⌋ and m' ≥ mL. We show here that, using a simple coding, each function h : m → m can be identified with a unary relation H on m'.
Let h : m → m. For every i < m, let us denote b_i = B_L(h(i)). The image of each i < m being thus coded by a word of {0, 1}^L, h can be coded by a word w_h, the length of which is exactly m': w_h = b_0·b_1·...·b_{m−1}·0^{m'−mL}.
⁵ Recall that the dyadic representation consists in encoding each positive integer x by the word D(x) = x_p ... x_1 x_0 ∈ {1, 2}*, where (x_0, x_1, ..., x_p) is the only tuple on {1, 2} such that x = x_0 + x_1·2 + ... + x_p·2^p.
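The one-one character of the dyadic representation can be checked on a small sketch (the helper names are ours):

```python
# Sketch: dyadic representation D(x) over the alphabet {1, 2}. Unlike
# binary, it is a bijection between positive integers and nonempty
# words of {1, 2}*.

def dyadic(x):
    """D(x) = x_p ... x_1 x_0 with x = sum of x_j * 2^j, digits in {1, 2}."""
    digits = []
    while x > 0:
        d = 2 if x % 2 == 0 else 1   # the unique digit in {1,2} with d = x mod 2
        digits.append(str(d))
        x = (x - d) // 2
    return "".join(reversed(digits))

def undyadic(w):
    return sum(int(d) * 2**j for j, d in enumerate(reversed(w)))

print([dyadic(x) for x in range(1, 8)])  # ['1', '2', '11', '12', '21', '22', '111']
assert all(undyadic(dyadic(x)) == x for x in range(1, 200))
```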
Remark. If for x < m and t < L, b_x^t denotes the t-th bit of b_x, then:
h(x) = Σ_{t<L} b_x^t · 2^t = Σ_{t<L, b_x^t=1} 2^t.
Every function from m to m is thus represented in a single way by a word of {0, 1}^{mL}·0^{m'−mL}. Besides, one usually represents a word v ∈ {0, 1}* of length q by the unary predicate V ⊆ q defined as follows:
∀i < q : V(i) ⟺ the i-th bit of v is 1.
Composition of these two codings makes it possible to associate to each unary function h : m → m a unary predicate H ⊆ m' defined as follows:
∀α < m' : H(α) ⟺ the α-th bit of w_h is 1,
i.e.: H(α) ⟺ α = x_α L + t_α, with x_α < m, t_α < L, and b_{x_α}^{t_α} = 1,
which implies the following:
∀x < m, ∀t < L : H(xL + t) ⟺ b_x^t = 1.
In fact, to make the notations simpler, we will denote this predicate "Bit", the pair (m, m') to which it refers being implicitly given by the context.
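The flattening of h into the predicate Bit can be sketched as follows (the function and the value of m' below are illustrative assumptions):

```python
# Sketch: flattening a unary function h : m -> m into the unary
# predicate Bit on a domain m' >= mL, with L = floor(log2(m)) + 1
# bits per image value; Bit(xL + t) iff the t-th bit of h(x) is 1.

def flatten(h, m, m_prime):
    L = m.bit_length()                  # L = floor(log m) + 1
    assert m_prime >= m * L
    Bit = set()
    for x in range(m):
        for t in range(L):
            if (h[x] >> t) & 1:         # t-th (least significant) bit of h(x)
                Bit.add(x * L + t)
    return Bit, L

h = {0: 2, 1: 0, 2: 1}                  # an illustrative function 3 -> 3
Bit, L = flatten(h, m=3, m_prime=10)
print(L, sorted(Bit))  # 2 [1, 4]
```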
m ⊨ f(x) < g(y) ⟺ cn ⊨ (∃t < L)(∀t' < L) [(t' > t → (F(xL + t') ↔ G(yL + t'))) ∧ G(yL + t) ∧ ¬F(xL + t)]
Proof. We just have to recall the connection between a function h : m → m, its associated word B_L(h(0))·...·B_L(h(m − 1)) ∈ {0, 1}* and its flattening on cn to see that these three equivalences express the following facts:
- f(x) = g(y) iff the words B_L(f(x)) and B_L(g(y)) are equal, i.e. coincide on their L successive bits.
- f(x) < g(y) iff B_L(f(x)) and B_L(g(y)) are in the same order for the reverse lexicographic order on {0, 1}^L.
□
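The bitwise reading of f(x) < g(y) used in the equivalence above can be checked exhaustively on small values. In this sketch, plain integers stand for the L-bit words B_L(f(x)) and B_L(g(y)): a < b iff there is a position t where b's bit is 1, a's bit is 0, and all higher-order bits coincide.

```python
# Sketch: comparing two L-bit values by the position of the most
# significant differing bit, as in the lemma's formula.

def less_by_bits(a, b, L):
    return any(
        ((b >> t) & 1) and not ((a >> t) & 1)
        and all(((a >> tp) & 1) == ((b >> tp) & 1) for tp in range(t + 1, L))
        for t in range(L)
    )

L = 5
assert all(less_by_bits(a, b, L) == (a < b)
           for a in range(2**L) for b in range(2**L))
print("bitwise order agrees with < on [0, 32)")
```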
where the αi's and βi's are new function symbols. Thus, if we still denote by g̅ the tuple g1, ..., gr made up of g1, ..., gp together with as many new function symbols as necessary, this allows us to write Ψ in the following equivalent form:
Ψ = ⋀ α∘β(x) = γ(x) ∧ ⋀ ⋁ α(u) ∝ β(v), with α, β, γ ∈ {f, g̅}, α, β ∈ {f, g̅, Id}, ∝ ∈ {=, ≠, <, ≮}.
Besides, replacing each subformula α∘β(x) = γ(x) of Ψ by ∀y[β(x) = y → α(y) = γ(x)], and writing the obtained formula in prenex conjunctive normal form, we get for ∀x Ψ an equivalent formula of the following form: ∀y̅ ⋀ ⋁ α(u) ∝ β(v), with α, β ∈ {f, g̅, Id}, u, v ∈ {0, y̅}, ∝ ∈ {=, ≠, <, ≮}, which can be written more precisely: ∀y̅ ⋀_i ⋁_j B_{ij}, where each B_{ij} is one of the formulas α(u) = β(v), α(u) ≠ β(v), α(u) < β(v), α(u) ≮ β(v), for some α, β ∈ {f, g̅, Id} and some u, v ∈ {0, y̅}.
At last, we observe that the formula α(u) < β(v) can be expressed with the others (namely, those of the form α(u) ≠ β(v), α(u) ≮ β(v)) by a universal formula using only the connectives ∧ and ∨. Indeed we have:
α(u) < β(v) ⟺ ¬[β(v) < α(u)] ∧ ¬[β(v) = α(u)].
The prenex conjunctive normal form of the formula produced by this last rewriting is the sought formula. □
We can achieve a stronger result by showing that the previous lemma still holds when using + as the only arithmetical predefined symbol. Recall that for k, c two fixed integers and for n ∈ ℕ, we denote: l = ⌈(log n)/k⌉, m = ⌈n/l⌉, L = ⌊log(m) + 1⌋.
The following result, essentially due to Lynch, states that the constants k, c, n, l, m, L, as well as the predicates <, ×, Bit, are definable in cn by existential monadic second-order formulas on a signature including the predefined predicate +. Moreover, this definability preserves the ∀*∃* quantifier pattern of the formulas. More precisely, we have:
Proof. Definability of the predicates <, ×, Bit has been proved in [Ly1]. In the first paragraph of the appendix, we recall the main steps of this proof and we show the definability of the arithmetical constants k, c, n, l, m, L, taking specific care with the quantifier prefixes of the formulas used. □
Remark. From now on, we will still use subformulas like x < m, Bit(yL + t), ..., but they have to be interpreted as abbreviations of existential monadic formulas on {+}.
Our purpose being to get back to a finite structure in which the word w is encoded by its naturally associated predicate X, we make explicit, in this paragraph, the link between F and X. Let w be a word of {1, 2}^n and let us denote

w = d_0^0 … d_0^(l-1)  d_1^0 … d_1^(l-1)  …  d_(m-2)^0 … d_(m-2)^(l-1)  d_(m-1)^0 … d_(m-1)^(s-1),   0 < s ≤ l,

its reading decomposition into m words w_i = d_i^0 … d_i^(l-1) for i ≤ m - 2 and w_(m-1) = d_(m-1)^0 … d_(m-1)^(s-1). The function f_w : m → m is then defined by:

∀i < m,  f_w(i) = Σ_(j=0)^(l-1) d_i^j 2^j  if i < m - 1,  and  f_w(m-1) = Σ_(j=0)^(s-1) d_(m-1)^j 2^j.

Every integer f_w(i) may be coded by its reverse binary representation of length L: b_i^0 … b_i^(L-1). Hence f_w may be described by a binary word:

W' = b_0^0 … b_0^(L-1)  b_1^0 … b_1^(L-1)  …  b_(m-1)^0 … b_(m-1)^(L-1)  0^(cn-mL)  ∈ {0, 1}^cn.
The relationship between the words w, w' and the function f_w is therefore expressed: for all i < m,

d_i^0 … d_i^(l-1) is the reverse dyadic representation of f_w(i) (replace l by s for w_(m-1));
b_i^0 … b_i^(L-1) is the reverse binary representation of length L of f_w(i).

We deduce: ∀x < m, ∀t < L:

(*)  [b_x^t = 1]  iff  …

With the chosen notations, the predicate X naturally coding w on n and the flattening F of f_w on cn are defined by:

∀x < m, ∀t < l: [n ⊨ X(xl + t)] ⟺ [d_x^t = 2]
∀x < m, ∀t < L: [cn ⊨ F(xL + t)] ⟺ [b_x^t = 1]

The equivalence (*) can therefore be written: ∀x < m, ∀t < L:

F(xL + t)  iff  …
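To make the coding concrete, here is a small Python sketch (ours, not from the paper) of the passage from w to f_w and to the binary word W': the word over {1, 2} is cut into blocks of length l, each block is read as a reverse dyadic representation, and each value is re-coded in reverse binary of length L. The function name `encode` and the choice of L as the minimal sufficient length are our own conventions.

```python
def encode(w, l):
    """Cut w (a string over {1, 2}) into blocks of length l (the last
    block may be shorter), read each block d^0 ... d^(l-1) as the integer
    sum_j d^j * 2^j (reverse dyadic representation), and code each value
    by its reverse binary representation of length L."""
    blocks = [w[i:i + l] for i in range(0, len(w), l)]
    f = [sum(int(d) << j for j, d in enumerate(b)) for b in blocks]
    L = max(f).bit_length()          # a length sufficient for every value
    # bit j of f(i) comes first: reverse the usual binary string, pad with 0
    W = "".join(format(v, "b")[::-1].ljust(L, "0") for v in f)
    return f, W

f, W = encode("2112", 2)  # blocks "21", "12" -> values 4 and 5
print(f, W)               # [4, 5] 001101
```

Note that every block value is strictly positive because the digits are taken from {1, 2}, which is exactly why the dyadic (rather than binary) representation makes the decomposition unambiguous.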
Remark. The assertion Φ ∈ FO(∀*∃*) has not been justified above. Since Φ ≡ D ∧ Ψ, and since Ψ ∈ FO(∀*∃*), it is sufficient to prove that D ∈ FO(∀*∃*). But D ≡ ∀α[F(α) ⟺ (∃x < m)(∃t < L){(α = xL + t) ∧ φ}] can also be written: D ≡ ∀α[F(α) ⟺ (∀x < m)(∀t < L){(α = xL + t) → φ}]. Using these two writings and observing that:

- α = xL + t can be replaced by a formula in FO(∀*∃*);
- φ can be written with either a ∀*∃* or an ∃*∀* quantifier pattern,

the reader will easily convince himself that the result holds.
3 From cn to n
This part states without proof a technical result which is essentially due to Lynch (see [Ly1] or [Gr2]). Let us suppose that n and c are fixed in ℕ. If R is an r-ary predicate on cn, we call the "division" of R on n the set of r-ary predicates on n, R* = {R_(i1…ir) : i1, …, ir < c}, defined, for all i1, …, ir < c, by:

∀(y1, …, yr) ∈ n^r :  R_(i1…ir)(y1, …, yr)  ⟺  R(i1·n + y1, …, ir·n + yr).

The bijectivity of the mapping which associates to every r-ary predicate on cn its division R* = {R_(i1…ir) : ij < c} on n allows us to affirm:
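The division operation is easy to state operationally. The following Python sketch is our illustration (not from the paper), with sets of tuples standing in for predicates; it computes R* from an r-ary predicate R on {0, …, cn-1}:

```python
from itertools import product

def division(R, n, c, r):
    """Divide an r-ary predicate R on {0,...,c*n-1} (a set of r-tuples)
    into the family R* of c**r predicates on {0,...,n-1}:
    R*[i1,...,ir] holds at (y1,...,yr) iff R holds at
    (i1*n + y1, ..., ir*n + yr)."""
    Rstar = {}
    for idx in product(range(c), repeat=r):
        Rstar[idx] = {ys for ys in product(range(n), repeat=r)
                      if tuple(i * n + y for i, y in zip(idx, ys)) in R}
    return Rstar

# example: the binary predicate "x + 1 = y" on {0,...,5}, with n = 3, c = 2
R = {(x, x + 1) for x in range(5)}
Rstar = division(R, 3, 2, 2)
print(Rstar[(0, 1)])  # {(2, 0)}: the successor pair (2, 3) straddles the blocks
```

The straddling pair (2, 3) reappearing only in the component R*_(0,1) is precisely the bookkeeping that the sentence Φ* of the lemma below has to perform syntactically.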
Lemma 10. Let Φ be a first-order sentence on signature {R_1, …, R_s, P_1, …, P_t}, where each R_i is an r_i-ary predicate symbol and each P_i is a p_i-ary relation on cn. Then there exists Φ*, a first-order sentence on signature {R*_1, …, R*_s, P*_1, …, P*_t}, where each R*_i is a set of c^(r_i) r_i-ary predicate symbols and the P*_i's are the respective divisions of the P_i's, such that:

(cn, P_1, …, P_t) ⊨ ∃R_1 … ∃R_s Φ(R_1, …, R_s, P_1, …, P_t)
iff
(n, P*_1, …, P*_t) ⊨ ∃R*_1 … ∃R*_s Φ*(R*_1, …, R*_s, P*_1, …, P*_t)

Moreover, if Φ ∈ FO(∀*∃*), Φ* can be required to be in FO(∀*∃*).
Theorem A immediately follows from Lemma 9, Lemma 10, and a simple study of the divisions of + (addition on cn) and X (trivial extension of the predicate naturally coding w) on n:

Theorem A. Let 𝓛 be a language on Σ = {1, 2}. If 𝓛 ∈ NLIN, then there exists a sentence Θ ∈ FO(∀*∃*) on signature {X, U_1, …, U_s, +}, where X and the U_i's are unary predicate symbols and + is a ternary predicate symbol, such that

∀w ∈ Σ⁺ :  [w ∈ 𝓛]  ⟺  [(n, X, +) ⊨ ∃U_1 … ∃U_s Θ(X, Ū, +)],

where X is the predicate naturally coding w on n, and + is the predefined addition on n.
4 Conclusion
References
[Bü] J. R. BÜCHI, Weak second order arithmetic and finite automata, Z. Math. Logik Grundlagen Math. 6 (1960), pp. 66-92.
[Fa] R. FAGIN, Generalized first-order spectra and polynomial-time recognizable sets, in Complexity of Computations, R. Karp, ed., SIAM-AMS Proc. 7, 1974, pp. 43-73.
[Gl] E. GRÄDEL, On the notion of linear time computability, International J. of Foundations of Computer Science, No 1 (1990), pp. 295-307.
[Gr1] E. GRANDJEAN, A natural NP-complete problem with a nontrivial lower bound, SIAM J. Comput., 17 (1988), pp. 786-809.
[Gr2] E. GRANDJEAN, A nontrivial lower bound for an NP problem on automata, SIAM J. Comput., 19 (1990), pp. 438-451.
[Gr3] E. GRANDJEAN, Linear time algorithms and NP-complete problems, Proc. CSL '92, Lecture Notes Comput. Sci. 702 (1993), pp. 248-273; also to appear in SIAM J. on Computing.
[Gr4] E. GRANDJEAN, Sorting, linear time and the satisfiability problem, to appear in special issue of Annals of Math. and Artificial Intelligence, 1995.
[GuSh] Y. GUREVICH and S. SHELAH, Nearly linear time, Lecture Notes Comput. Sci. 363 (1989), Springer-Verlag, pp. 108-118.
[Gu1] Y. GUREVICH, Toward logic tailored for computational complexity, in Computation and Proof Theory (M. M. Richter et al., eds.), Lecture Notes in Math. Vol. 1104, pp. 175-216, Springer-Verlag, New York/Berlin, 1984.
[Gu2] Y. GUREVICH, Logic and the challenge of computer science, in Current Trends in Theoretical Computer Science (E. Börger, ed.), pp. 1-55, Computer Science Press, Rockville, MD, 1986.
[Im1] N. IMMERMAN, Languages which capture complexity classes, 15th ACM Symp. on Theory of Computing, 1983, pp. 347-354; SIAM J. Comput., 16, No. 4 (1987), pp. 760-778.
[Im2] N. IMMERMAN, Relational queries computable in polynomial time, 14th ACM STOC Symp., 1982, pp. 147-152; also appeared in revised form in Information and Control, 68 (1986), pp. 86-104.
[Im3] N. IMMERMAN, Descriptive and computational complexity, in J. Hartmanis, ed., Computational Complexity Theory, Proc. of AMS Symposia in Appl. Math. 38 (1989), pp. 75-91.
[Ka] R. M. KARP, Reducibility among combinatorial problems, in Complexity of Computer Computations (IBM Symp. 1972), Plenum Press, New York, 1972.
[KoTh] P. G. KOLAITIS and M. N. THAKUR, Logical definability of NP-optimization problems, Technical report UCSC-CRL-90-48, Computer and Information Sciences, University of California, Santa Cruz, 1990.
[Ly1] J. F. LYNCH, Complexity classes and theories of finite models, Math. Systems Theory, 15 (1982), pp. 127-144.
[Ly2] J. F. LYNCH, The quantifier structure of sentences that characterize nondeterministic time complexity, Comput. Complexity, 2 (1992), pp. 40-66.
[Sc] C. P. SCHNORR, Satisfiability is quasilinear complete in NQL, J. ACM, 25 (1978), pp. 136-145.
[Va] M. VARDI, Complexity of relational query languages, 14th ACM Symp. on Theory of Computation, 1982, pp. 137-146.
Logics For Context-Free Languages

Clemens Lautemann¹  Thomas Schwentick¹  Denis Thérien²
1 Introduction
In the following sections, we will use string logic. We consider strings over
some a l p h a b e t A = { a l , . . . , a t } as structures over the signature < A , < > :=
< Q ~ I , ' " , Q~r, < > in the usual way: a string w = wl ... w~ E A + is identified
with the s t r u c t u r e 3 < { 1 , . . . , n } , Q ~ l , . . . ,Q~,, < > , where i E Q ~ iff wi = 6tj.
For the sake of clarity we will also use obvious abbreviations such as rain and
m a x (denoting the smallest and largest element, repectively). W i t h CFL(A) we
denote the class of context-free languages over the alphabet A.
2 Matchings
(Figure: the example string a b a b b a b a a b b a a, with positions 1 to 13.)
3 Whenever appropriate, we use the same letters to denote predicate symbols and their
interpretations in the structure.
With quantification over matchings, the Dyck language D_k over the alphabet A_k := {α_1, β_1, α_2, …, β_k} can be defined as follows:
We will now show that first order logic plus existential quantification over matchings is enough to define every context-free language, and, moreover, no other languages can be defined that way.
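For intuition, the matching-based definition can be checked mechanically. The sketch below is ours (not from the paper) and uses the bracket pairs ( ) and [ ] in place of α_i, β_i; it verifies that a set of arches witnesses membership in the Dyck language: every position lies on exactly one arch, no two arches cross, and each arch joins a matched bracket pair.

```python
def is_matching_witness(w, M):
    """Check that the arches M (pairs (i, j), 0-based, i < j) witness
    membership of w in the Dyck language D_2."""
    opening = {"(": ")", "[": "]"}     # brackets stand in for the alpha_i / beta_i
    ends = [p for arch in M for p in arch]
    if sorted(ends) != list(range(len(w))):
        return False                   # some position unmatched or reused
    for (i, j) in M:
        if w[i] not in opening or opening[w[i]] != w[j]:
            return False               # an arch must join alpha_i to beta_i
    return all(not (i < k < j < l)     # noncrossing: arches nest or are disjoint
               for (i, j) in M for (k, l) in M)

# "([])()" is in D_2, witnessed by the arches below
print(is_matching_witness("([])()", {(0, 3), (1, 2), (4, 5)}))  # True
```

The existential matching quantifier of the text corresponds to asking whether *some* set of arches passes this check.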
This process will eventually terminate, since the substitution in step 2 will either make the production terminal, or it will increase its length. Thus we conclude:
(Figure: a derivation tree of a string over {a, b}, with an internal node labeled X4.)
We show now that there is a formula φ over ⟨A, <, M⟩, which holds for a string w with matching M if, and only if, there is a G-derivation tree T of w such that M = M_T. It follows that there is a matching M on w with ⟨w, M⟩ ⊨ φ iff w can be derived in G.
Every arch (i, j) ∈ M defines a substring, w_i … w_j. We say that an arch (k, l) ∈ M (i < k ≤ l < j) lies at the surface of (i, j) if there is no other arch between it and (i, j), i.e., if there is no arch (r, s) with i < r < k < s. Similarly, a position k (i ≤ k ≤ j) which is not the endpoint of an arch other than (i, j) lies at the surface of (i, j) if there is no other arch between it and (i, j), i.e., if there is no arch (r, s) with i < r < k < s. The string of surface symbols, with surface arches replaced by the symbol [, is called the pattern of (i, j), cf. Figure 3.
χ_q(x, y) = Q_a(x) ∧ Q_b(y) ∧ ∃x_1∃y_1 … ∃x_s∃y_s : (x < x_1 < y_1 < x_2 < … < y_s < y)
    ∧ (Φ_(v0)(x, x_1) ∧ Φ_(v1)(y_1, x_2) ∧ … ∧ Φ_(vs)(y_s, y))
    ∧ (M(x_1, y_1) ∧ … ∧ M(x_s, y_s)).

For every X ∈ N, let χ_X(x, y) be the disjunction of all those χ_q(x, y) for which X is the left-hand side of the production q. Then, for p as above, the following formula Ω_p(x, y) expresses that (x, y) corresponds to p and that the surface arches (x_1, y_1), …, (x_s, y_s) correspond to productions with left-hand sides X_1, …, X_s, respectively.
2.1.3 Lemma. [9] A language L over A is context-free if, and only if, there is a recognisable tree language 𝒯, with leaf alphabet A, such that every w ∈ A⁺ belongs to L iff it is the yield of some tree T ∈ 𝒯.
⁵ This is first order logic with additional (existential and universal) quantification over unary predicates.
⁶ For the notion of a recognisable tree language, see e.g. [8].
w has a matching M with ⟨w, M⟩ ⊨ φ if and only if w is the yield of a tree which satisfies Φ.
The formula Φ will be of the form Ψ ∧ Υ, where Υ describes a class of trees which can be obtained from strings with matchings in a certain way that will be explained below. The formula Ψ corresponds to φ and holds for those trees which are obtained from strings with matchings that satisfy φ.
Let w = w_1 … w_n be a string, and M a matching on w. We construct a tree T_(w,M) in two stages as follows:
1. Define a tree T'_(w,M) the nodes of which are the positions of w and the arches of M:
   - for every i ≤ n, i is a leaf, labeled w_i;
   - every arch (i, j) ∈ M is an internal node, labeled ⊗; its children are all those positions k and arches (k, l) which are at the surface of (i, j), in the order in which they appear in ⟨w, M⟩.
   If n > 1 and (1, n) ∉ M, add a root, labeled ⊗, whose children are all those positions and arches which are not underneath any arch.
2. Whenever an internal node has more than two children, we distribute them over binary subtrees,⁷ using additional nodes with label ⊕. This results in a tree T_(w,M) with yield w and with internal nodes labeled ⊗ or ⊕.
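The first stage of this construction can be prototyped directly. The sketch below is our illustration (not from the paper): it uses 1-based positions as in the text, represents a matching as a set of arch pairs, and adds a virtual root arch (0, n+1) standing in for the extra root node. It computes the children relation of the intermediate tree T'_(w,M).

```python
def surface_tree(w, M):
    """Build the intermediate tree T'_{w,M}: every arch of the
    (noncrossing) matching M becomes an internal node whose children are
    exactly the positions and arches at its surface; positions are leaves."""
    n = len(w)
    arches = sorted(M | {(0, n + 1)})              # (0, n+1) is the virtual root
    children = {a: [] for a in arches}
    # the parent of an arch is the smallest arch strictly enclosing it
    for a in arches:
        encl = [b for b in arches if b[0] < a[0] and a[1] < b[1]]
        if encl:
            children[max(encl, key=lambda b: b[0])].append(a)
    # the parent of a position is the smallest arch containing it
    for pos in range(1, n + 1):
        cont = [b for b in arches if b[0] <= pos <= b[1]]
        children[max(cont, key=lambda b: b[0])].append(pos)
    for a in children:  # order children as they appear in <w, M>
        children[a].sort(key=lambda x: x if isinstance(x, int) else x[0])
    return children

print(surface_tree("abab", {(1, 2), (3, 4)}))
```

Stage 2 (binarisation with ⊕-nodes) would then split any children list of length greater than two, in any convenient way, as footnote 7 permits.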
(Figure: the trees T'_(w,M) and T_(w,M) for an example string over {a, b}.)
Thus our trees have leaf alphabet A and two binary labels, ⊗ and ⊕. Among all such trees, the ones that can be obtained in the way described above are characterised by the following property:
⁷ We don't need this construction to be deterministic; any tree T_(w,M) obtained this way will do.
This property can be expressed in a monadic second order formula over the tree signature ⟨Q_(a1), …, Q_(ar), Q_⊗, Q_⊕, <_1, <_2⟩ as Υ = Υ_L ∧ Υ_R, where Υ_L deals with the leftmost, Υ_R with the rightmost path, respectively. For Υ_L, e.g., we can write: …
(i, j) ∈ M_T iff c, the common ancestor of i and j in T, has label ⊗, and i lies on a leftmost, j on a rightmost path from c.
- restricting all quantifiers to range over leaves only; to this end, we replace every subformula of the form ∃x χ by ∃x : Lf(x) ∧ χ, and every occurrence of ∀x χ by ∀x : Lf(x) → χ;
- replacing x < y by the monadic second order formula …
Then there is a matching M on w such that ⟨w, M⟩ ⊨ φ if, and only if, w = yield(T) for some T such that T ⊨ Υ ∧ Ψ. Hence the set of those w for which there is such a matching is context-free. □
The construction in the second half of the proof of Theorem 2.1 still yields a monadic second order tree formula Ψ if the string formula φ is monadic second order rather than first order. Therefore, the theorem remains true with f.o. replaced by m.s.o.
3 Variations
- S → X_1 … X_k,  X_1, …, X_k ∈ N \ {S};
- X → uYv,  u, v ∈ A*, X ∈ N, Y ∈ N \ {S};
There are other classes of binary relations which can be used to characterise context-free languages. E.g., the proof of Theorem 2.1 can easily be modified to show CFL(A) = ∃𝓑 f.o.(A, <), where the class 𝓑 consists of those binary relations B that satisfy the following noncrossing condition:
Here we allow several arches to share the same left or right endpoint; note, however, that no position can serve both as a left and a right endpoint.
As there are only 2^O(n) binary trees with n leaves, not all linear orders can have this property. Let TDO denote the class of tree definable linear orderings.
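The counting argument is concrete: there are only Catalan-many binary trees with n leaves, i.e. 2^O(n), while there are n! linear orders on {1, …, n}. A quick numeric check (ours, not from the paper):

```python
from math import comb, factorial

def binary_trees_with_leaves(n):
    """Number of binary trees with n leaves: the Catalan number C(n-1),
    i.e. comb(2(n-1), n-1) / n."""
    return comb(2 * (n - 1), n - 1) // n

# 2^O(n) many trees, but n! linear orders, so for large n some linear
# order is defined by no tree at all
for n in (5, 10, 15):
    print(n, binary_trees_with_leaves(n), factorial(n))
```

Already for n = 5 there are 14 trees against 120 orders, and the gap grows super-exponentially.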
Proof (sketch):
Let ⟨w, ≺⟩ ⊨ φ, and let T be a tree defining the order relation ≺ on {1, …, n}.
(Figure: a tree with subtrees T_1, T_2, T_3.)
Here T_1, T_2, T_3 are trees defining the order relations derived from the restrictions of M to {1, …, i-1}, {i+1, …, j-1}, and {j+1, …, n}, respectively. □
(min_≺ = min_<) ∧ ∀x∀y : (y = suc_≺(suc_≺(x)) → y = suc_<(x)).
⁹ It is only for the sake of conciseness that we use a function symbol suc_≺ here to express the direct successor in the order relation ≺. Of course, the successor can easily be expressed in first order logic without this symbol.
4 Discussion
Of course, our main result does not teach us anything new about context-free languages. However, it supports, in a rather satisfying mathematical way, the intuition that the essence of context-freeness is contained in the notion of a matching relation. And since context-free languages are well understood, it tells us something about our logic ∃Match f.o., e.g., that it is not closed under negation. We believe that semantically restricted logics can be used to characterise not only the class of context-free languages, and some of its subclasses, but also many other interesting classes. We are convinced that a more general study of these logics will prove worthwhile also in the context of general finite structures, instead of strings. Here, as opposed to the string case, the class monNP defined by existential monadic second order logic is not closed under complementation. In fact, the set of connected graphs, although expressible by means of a universal monadic second order sentence, is not contained in monNP [6], a result which has since been refined and extended in a number of ways [4, 1, 7, 11, 12]. On the other hand, the expressive power of sentences with one binary existential second order quantifier is not well understood. We believe that studying semantically restricted versions of this latter logic will help us understand the limitations and the expressive power of binary existential second order quantification.
References
1. M. Ajtai and R. Fagin. Reachability is harder for directed than for undirected finite graphs. Journal of Symbolic Logic, 55(1):113-150, 1990.
2. D. A. M. Barrington, K. Compton, H. Straubing, and D. Thérien. Regular languages in NC¹. Journal of Computer and System Sciences, 44:478-499, 1992.
3. J. R. Büchi. Weak second order arithmetic and finite automata. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 6:66-92, 1960.
4. M. de Rougemont. Second-order and inductive definability on finite structures. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 33:47-63, 1987.
5. J. Doner. Tree acceptors and some of their applications. Journal of Computer and System Sciences, 4:406-451, 1970.
6. R. Fagin. Monadic generalized spectra. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 21:89-96, 1975.
7. R. Fagin, L. J. Stockmeyer, and M. Y. Vardi. On monadic NP vs. monadic co-NP. In Proc. 8th Annual Conference Structure in Complexity Theory, pages 19-30, 1993. To appear in Information and Computation.
8. F. Gécseg and M. Steinby. Tree Automata. Akadémiai Kiadó, 1984.
9. J. Mezei and J. B. Wright. Algebraic automata and context-free sets. Information and Control, 11:3-29, 1967.
10. A. Salomaa. Formal Languages. Academic Press, 1987.
11. T. Schwentick. Graph connectivity and monadic NP. In Proc. 35th IEEE Symp. on Foundations of Computer Science, pages 614-622, 1994.
12. T. Schwentick. Graph connectivity, monadic NP and built-in relations of moderate degree. In Proc. 22nd International Colloq. on Automata, Languages, and Programming, 1995. To appear.
Abstract
We extend recent work about the relationship between logically defined classes of NP optimization problems and the asymptotic growth of optimal solutions on random inputs. We consider the syntactic class MIN F+Π2(1) of minimization problems. Kolaitis and Thakur proved that every problem in this class is log-approximable. We show that for every problem Q in the class MIN F+Π2(1) there exist polynomials g(n) and h(n) such that the optimal solution of Q almost surely grows like O(g(n) + h(n)·log n). Applying this result we show, without using any complexity theoretical assumptions, that the problem MIN GRAPH COLOURING is not in MIN F+Π2(1). With the same method we get a similar criterion for each class MIN F+Π2(k). Using the fact that on a random graph with n nodes the chromatic number is almost surely n/(2 log n), we prove that MIN GRAPH COLOURING is not in MIN F+Π2, resolving a conjecture by Kolaitis and Thakur.
1 Introduction
Many NP-complete decision problems come from combinatorial optimization problems by putting a bound on the cost of a desired solution. Assuming P ≠ NP, we cannot find polynomial-time algorithms which solve these optimization problems exactly. This does not exclude finding nearly optimal solutions, that is, solutions whose relative error compared to an optimal solution is smaller than a constant. For many problems their approximation properties are known. Some have efficient approximation algorithms; others, like the TSP, are as hard to approximate as to solve exactly. It would be interesting to know more about the structural reasons for these differences.
Motivated by Fagin's characterization of NP by existential second-order logic on finite structures [4], Papadimitriou and Yannakakis introduced in [11] methods for defining NP optimization problems using logic. They showed that in some important cases there is a relation between the logical representation of an optimization problem and its approximation properties. They defined two syntactic classes of optimization problems, MAX SNP and MAX NP, and
proved that for all problems Q in these classes there exists a polynomial-time algorithm that approximates opt_Q up to a constant factor. Many natural maximization problems like MAX CUT and MAX SAT are contained in MAX NP. Surprisingly, the analogous definitions for minimization problems do not lead to a similar result (see [7] and [8] for details). However, Kolaitis and Thakur introduced in [7] a different syntactic class, called MIN F+Π1, and proved that all problems in this class are approximable in polynomial time up to a constant factor. MIN F+Π1 consists of all minimization problems Q whose input instances 𝔄 are finite structures such that the optimum can be expressed as
Definition 1.1 For every k ∈ ℕ, the class MIN F+Π2(k) consists of all minimization problems Q whose optimum can be expressed as
Kolaitis and Thakur proved that every problem in MIN F+Π2(1) is log-approximable. An example for this class is MIN DOMINATING SET. Given a graph, the problem asks for the cardinality of the smallest subset S of vertices such that every vertex either is in S or is adjacent to a vertex in S. In fact, MIN DOMINATING SET is complete for MIN F+Π2(1). It is definable by the expression
Other complete problems in this class are MIN SET COVER and MIN HITTING SET.
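The definition of MIN DOMINATING SET can be made concrete with a brute-force sketch. This is our illustration only (exponential time; the function name and edge-list encoding are our own conventions), not an algorithm from the paper:

```python
from itertools import combinations

def min_dominating_set(vertices, edges):
    """Brute-force the domination number: the size of the smallest S such
    that every vertex is in S or adjacent to a vertex in S."""
    adj = {v: {v} for v in vertices}   # each vertex dominates itself
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for size in range(len(vertices) + 1):
        for S in combinations(vertices, size):
            if all(adj[v] & set(S) for v in vertices):
                return size
    return len(vertices)

# a path on 5 vertices has domination number 2, e.g. S = {1, 3}
print(min_dominating_set(range(5), [(0, 1), (1, 2), (2, 3), (3, 4)]))  # 2
```

The inner check "adj[v] ∩ S nonempty for every v" is exactly the universally quantified first-order condition in the defining expression above.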
It is not known whether MIN F+Π2 is in APX, the class of all NP optimization problems that are approximable up to a constant factor. In fact, this is unlikely, since Lund and Yannakakis showed in [10] that there exists a constant c such that MIN DOMINATING SET cannot be approximated within ratio c·log n unless NP is contained in DTIME[n^(poly log n)].
For an overview of NP optimization problems see [3] or [6].
Showing that a problem is in a logically defined class is usually done by giving the desired logical formula. On the other hand, showing that a specific problem does not belong to a class is more difficult. In [2] Behrendt, Compton and Grädel
2 Preliminaries
Definition 2.1 An NP optimization problem is a tuple Q = (𝓘_Q, Y_Q, f_Q, opt_Q) such that
- 𝓘_Q is the set of input instances.
- Y_Q(I) is the set of feasible solutions for input I. Here "feasible" means that the size of the elements S ∈ Y_Q(I) is polynomially bounded in the size of I and that the set {(I, S) : S ∈ Y_Q(I)} is recognizable in polynomial time.
- f_Q : {(I, S) : S ∈ Y_Q(I)} → ℕ is a polynomial-time computable function, called the cost function.
- opt_Q is one of the two functions defined below, with 𝓘_Q as domain and positive integers as values:
  opt_Q(I) = min_S f_Q(I, S)  or  opt_Q(I) = max_S f_Q(I, S).
In the first case we say Q is a minimization problem and in the second case we say Q is a maximization problem.
We see that the most important of these classes is MIN F+Π2(1). We show our main result for this class and then we extend it to MIN F+Π2(k) for each k. With some simple syntactical transformations we get
Moreover we need some definitions and results from the theory of asymptotic probabilities and from the theory of graphs.
As usual, for functions f, g : ℕ → ℝ, we write f = O(g) if there are constants c, d > 0 such that c·g(n) ≤ f(n) ≤ d·g(n) for sufficiently large n, and we write f ~ g to indicate that lim_(n→∞) f(n)/g(n) = 1.
Definition 2.4 Let σ be a finite relational vocabulary and x_1, …, x_k be a sequence of distinct variables. A maximal consistent set t of σ-atoms and negated σ-atoms (including equalities and inequalities) in x_1, …, x_k is called an atomic σ-type in x_1, …, x_k. Since such a set is always finite, we can form in first-order logic the conjunction of the formulae in t; by abuse of notation, we denote this conjunction by t(x_1, …, x_k). On every σ-structure 𝔄, the type t defines the set of realizations t^𝔄 = {ā ∈ A^k : 𝔄 ⊨ t(ā)}, where A is the universe of 𝔄.
A ∅-type is called an equality type e. So e is a maximal consistent set of equalities x_i = x_j and inequalities x_i ≠ x_j, where 1 ≤ i < j ≤ k, which defines on every structure 𝔄 the set e^𝔄 = {ā ∈ A^k : 𝔄 ⊨ e(ā)}.
For each atomic type t let r_t be the unique natural number of pairwise distinct components of each tuple of type t.
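Equality types and the number r_t are easily computed for concrete tuples; a small sketch (ours, purely to illustrate the definition):

```python
def equality_type(tup):
    """Return the equality type of a tuple (the set of coordinate pairs
    i < j with tup[i] == tup[j]) together with r_t, the number of
    pairwise distinct components of any tuple of that type."""
    k = len(tup)
    eq = {(i, j) for i in range(k) for j in range(i + 1, k)
          if tup[i] == tup[j]}
    return eq, len(set(tup))

print(equality_type((7, 3, 7)))  # ({(0, 2)}, 2): first = third, two distinct values
```

Two tuples realize the same equality type exactly when this set of pairs agrees, and r_t depends only on the type, not on the particular tuple.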
… for all but an exponentially decreasing fraction of structures 𝔄 when the size of 𝔄 tends to infinity.
3 Main Result
In this section we establish a necessary criterion for the membership of a problem Q in the class MIN F+Π2(1). We will give a probabilistic estimation of opt_Q on a random structure 𝔄. First we need a normal form for the first-order formula that defines Q.
Let Q be an optimization problem whose input instances are finite structures over a fixed vocabulary σ.
opt_Q(𝔄) = min_S { |S| : (𝔄, S) ⊨ ∀x̄ [ φ(x̄) → ∃z̄ ( ψ(x̄, z̄) ∧ S z̄ ) ] },

where z̄ is of length m.
PROOF. By Lemma 3.1 we can express the optimum of every Q ∈ MIN F+Π2(1) as:
By Theorem 2.5 there exists a finite number of atomic types u_i(x̄) and t_j(x̄, z̄) such that
For each i there exists at least one j such that t_j is an extension type of u_i. That means we can decompose the formula t_j(x̄, z̄) in the following way,
and t(x̄, z̄) = u(x̄) ∧ v(z̄) ∧ α(x̄, z̄) ∧ β(x̄, z̄). The optimum of Q is the cardinality of the smallest subset S ⊆ v^𝔄 such that for every x̄ ∈ u^𝔄 there exists a z̄ ∈ S such that t(x̄, z̄) becomes TRUE. Let r_u and r_v be the natural numbers from Definition 2.4, the numbers of pairwise distinct components of x̄ and z̄ respectively. From these r_v pairwise distinct components of z̄, let m be the number of components that are equal to some component of x̄. We get that m ≤ min{r_u, r_v}.
Each value of these m components defines a subset u^𝔄_j of u^𝔄. So u^𝔄 is the disjoint union of n(n - 1)⋯(n - m + 1) sets u^𝔄_j, and the sets
opt_Q(𝔄) = O( g_0(n) + … )

for a randomly chosen σ-structure 𝔄, where n = |𝔄|.
First we need a generalization of Lemma 2.6 to (k+1)-hypergraphs. A (k+1)-hyperdominating set is a set of vertices such that for every vertex there exist k vertices in this set such that these k+1 vertices form a (k+1)-hyperedge. The hyperdomination number is the cardinality of the smallest hyperdominating set.
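As with ordinary domination, this definition can be checked by brute force. The sketch below is ours (not from the paper); hyperedges are encoded as vertex tuples, and the function name is our own:

```python
from itertools import combinations

def hyperdomination_number(vertices, hyperedges, k):
    """Smallest set D such that every vertex v forms a (k+1)-hyperedge
    together with some k vertices of D (brute force, for illustration)."""
    H = {frozenset(e) for e in hyperedges}
    def dominates(D):
        return all(any(frozenset(c) | {v} in H
                       for c in combinations(D, k))
                   for v in vertices)
    for size in range(len(vertices) + 1):
        for D in combinations(vertices, size):
            if dominates(D):
                return size
    return None

H3 = [(0, 1, 2), (1, 2, 3), (0, 2, 3), (0, 1, 3)]  # all triples on 4 vertices
print(hyperdomination_number(range(4), H3, 2))  # 3
```

Note that a vertex v needs k set vertices *distinct from itself*: the union frozenset(c) | {v} only has k+1 elements, and hence can be a hyperedge, when v is not among the chosen k.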
(1 - (1-p)^(r(n)^k))^(a(n)) = (1 - q^(r(n)^k))^(a(n)) = (1 - 1/a(n))^(a(n)),

and this tends to 1/e if n tends to infinity. We infer that γ(H) ≤ k·log_(1/q) a(n) almost surely.
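The limit used here is the standard one; a quick numeric check (ours):

```python
import math

# (1 - 1/a)^a approaches 1/e from below as a grows; this is the step
# that bounds the probability that a fixed vertex stays undominated
for a in (10, 100, 10000):
    print(a, (1 - 1 / a) ** a, math.exp(-1))
```

For a = 10000 the two values already agree to four decimal places.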
Now let f(n) = ((1 - ε)·…) for any ε > 0. There are … different choices for D in B. Define the indicator random variable X_D by

X_D(G) := 1 if D is a dominating set of G, and X_D(G) := 0 otherwise.

lim_(n→∞) E(X) ≤ lim_(n→∞) b_1(n)^(a(n)) ≤ lim_(n→∞) b(n)^(a(n)) ≤ lim_(n→∞) e^(ln b(n)·k·a(n) - a(n)) = 0
PROOF (of Theorem 4.1). By Lemma 3.1 we can express the optimum of every Q ∈ MIN F+Π2(k) as:

opt_Q(𝔄) = min_S { |S| : (𝔄, S) ⊨ ∀x̄ [ φ(x̄) → ∃z̄_1 … ∃z̄_k ( ψ(x̄, z̄) ∧ ⋀_(i=1)^k S z̄_i ) ] }.

Analogously to the proof of Theorem 3.2, we decompose the formula into atomic σ-types and consider the special case where

opt_Q(𝔄) = min_S { |S| : (𝔄, S) ⊨ ∀x̄ [ u(x̄) → ∃z̄_1 … ∃z̄_k ( t(x̄, z̄) ∧ ⋀_(i=1)^k S z̄_i ) ] }.

With the same arguments as in [5] we can show that the criterion remains true if we use logics more powerful than first-order, for example fixed point logic.
References
[1] N. Alon and J. Spencer. The Probabilistic Method. Wiley, 1991.
[2] Th. Behrendt, K. Compton, and E. Grädel. Optimization problems: Expressibility, approximation properties, and expected asymptotic growth of optimal solutions. In E. Börger, G. Jäger, H. Kleine Büning, S. Martini, and M. M. Richter, editors, Computer Science Logic, 6th Workshop, CSL '92, San Miniato 1992, Selected Papers, volume 702 of LNCS, pages 43-60. Springer-Verlag, 1993.
[3] P. Crescenzi and V. Kann. A compendium of NP optimization problems. Unpublished, 1994.
[4] R. Fagin. Generalized first-order spectra and polynomial-time recognizable sets. In R. M. Karp, editor, Complexity of Computation, SIAM-AMS Proceedings, Vol. 7, pages 43-73, 1974.
[5] E. Grädel and A. Malmström. Approximable minimization problems and optimal solutions on random input. In E. Börger, Y. Gurevich, and K. Meinke, editors, Computer Science Logic, 7th Workshop, CSL '93, Swansea 1993, Selected Papers, volume 832 of LNCS, pages 139-149. Springer-Verlag, 1994.
[6] V. Kann. On the Approximability of NP-complete Optimization Problems. PhD thesis, Royal Institute of Technology, Stockholm, 1992.
[7] Ph. Kolaitis and M. Thakur. Logical definability of NP-optimization problems. Technical Report CRL-90-48, University of California, Santa Cruz, October 1990.
[8] Ph. Kolaitis and M. Thakur. Approximation properties of NP minimization classes. In Proc. 6th IEEE Symp. on Structure in Complexity Theory, pages 353-366, 1991.
[9] Ph. Kolaitis and M. Vardi. Infinitary logics and 0-1 laws. Information and Computation, 98:258-294, 1992.
[10] C. Lund and M. Yannakakis. On the hardness of approximating minimization problems. In Proc. 25th ACM Symp. on Theory of Computing, pages 286-293, New York, 1993. ACM.
[11] C. Papadimitriou and M. Yannakakis. Optimization, approximation and complexity classes. Journal of Computer and System Sciences, 43:425-440, 1991.
Convergence and 0-1 Laws for L^k_(∞ω) under Arbitrary Measures
Monica McArthur
Department of Mathematics
University of California
Los Angeles, CA 90095
1 Introduction
recently obtained a 0-1 law for L^ω_(∞ω) for random graphs with edge probability p(n) = n^(-α), α irrational, 0 < α < 1. The first results in this field, however, are due to Kolaitis and Vardi [9]. They showed that the class of all structures with uniform measure (for any given signature) has an L^ω_(∞ω) 0-1 law, and also characterized the existence of an L^ω_(∞ω) 0-1 law on a class with arbitrary measure.
It is this theorem which is the starting point for the work in this paper. We show that the existence of L^k_(∞ω) (and L^ω_(∞ω)) 0-1 laws on a class of structures is equivalent to various conditions on the L^k types of the first-order random theory (the set of sentences of first-order logic which have probability 1 on the class), and, in the process, we also provide another proof of the L^ω_(∞ω) 0-1 law characterization theorem of Kolaitis and Vardi (Theorem 1).
We first state some definitions regarding types; these can be found in any standard book on model theory, such as Pillay [12], except for L^k types, which can be found in [3].
Definition 2. We say that a type p(x̄) is realized in a model if there is some tuple ā in the model such that ā satisfies every formula in p(x̄).
Note that the Stone space of L^k types, like the Stone spaces for k-types, is compact.
Proof. The proof is done in round-robin fashion: we show (1) ⇒ (2) ⇒ (3) ⇒ (4) ⇒ (5) ⇒ (6) ⇒ (1). Most of these implications are already known: (1) ⇒ (2), for instance, was proved by Dawar, Lindell, and Weinstein [3]; (2) ⇒ (3) is just due to the compactness of the Stone space of L^k types; (3) ⇒ (4) and (4) ⇒ (5) are similar to one of the proofs given by Kolaitis and Vardi [9] that the class of all structures has an L^ω_(∞ω) 0-1 law; (5) ⇒ (6) is the "easy half" of the characterization theorem of Kolaitis and Vardi [9].
(6) ⇒ (1) is the only new part of the proof. We prove the contrapositive. Let p be an L^k type of T which is not realizable in any finite model. By compactness, for each n there is some formula φ_n(x̄) in p which implies the existence of at least n distinct elements. Let ψ_n(x̄) be ⋀_(j≤n) φ_j(x̄); we do this so that ψ_(n+1)(x̄) → ψ_n(x̄). Clearly, for each n, ∃x̄ ψ_n(x̄) must have probability 1, since for each n ∃x̄ φ_n(x̄) has probability 1, because φ_n is in p and p is consistent with T. For each n, let m_n be such that for all m > m_n, μ_m(∃x̄ ψ_n(x̄)) > 1 - 1/n. Pick some n_0 > 0, and then define f(n) recursively as follows:
(1) f(0) = n_0
(2) f(n + 1) = m_(f(n)) + 1
For m = m_(f(2i)), clearly μ_m(θ) is less than 1/f(2i): we only need consider the conjunct ∃x̄ ψ_(f(2i))(x̄) → ∃x̄ ψ_(f(2i+1))(x̄), in which the hypothesis has probability greater than 1 - 1/f(2i) on structures of size m, and the conclusion is always false, since there cannot be f(2i+1) = m_(f(2i)) + 1 elements in a structure of size m_(f(2i)). But for m = m_(f(2i+1)), μ_m(θ) is greater than 1 - 1/f(2i+1): any of the conjuncts with f(2n), n > i, in the hypothesis can be disregarded, since all of their hypotheses must be false (we cannot have f(2i+2) = m_(f(2i+1)) + 1 elements in a structure of size m_(f(2i+1))), and ∃x̄ ψ_(f(2i+1))(x̄) has probability greater than 1 - 1/f(2i+1) on structures of size m and implies all of the remaining conjuncts. Thus this sentence does not have a probability (i.e. lim_(m→∞) μ_m(θ) is not defined), and so C does not have an L^k_(∞ω) 0-1 law.
This proof relies on the following proposition, which, as stated here, is slightly
more general than, but similar in proof to, m a n y others which reduce the sen-
tences of any countable language to first-order logic (e.g., McColm's proof, as
mentioned in [7], that McColm's second conjecture is true for sentences).
We will be considering abstract logics ℒ = (L, ⊨_ℒ), where L is a set of
objects, called sentences, and ⊨_ℒ is a relation between structures and sentences
in L (the satisfaction relation). (For a brief discussion of abstract logics, see [1].)
The only requirements we will impose on these logics are that the satisfaction
relation is preserved under isomorphism of structures, and that they are
closed under negation, that is, for every sentence θ in L there is a sentence ¬θ
in L such that for all structures A, A ⊨_ℒ ¬θ if and only if it is not the case that
A ⊨_ℒ θ. The definition we use requires a
set S ⊆ T, |S| = ℵ_0, such that for each sentence θ in T, there is a sentence φ
in S which implies θ under ℒ for all structures in C, that is, if A ∈ C is such
that A ⊨_ℒ φ, then A ⊨_ℒ θ.
This is basically the same as the usual definition; however, we require that for
each sentence θ in T there be a single sentence in S which implies θ, since ℒ may
not be closed under conjunction.
(where ρ_n is the first-order sentence stating that there exist more than n distinct
elements). Again, the RHS is clearly first-order, and we are done.
The "hard direction" of the Kolaitis-Vardi theorem (Theorem 1) is an imme-
diate corollary of this proposition.
Since the RHS does not hold, we know that a < 1. Fix ε > 0 very small (less
than (1 − a)/2 will suffice; since a < 1, this is still greater than 0). Let U
be a finite set of L^k-equivalence classes such that μ(U) > a − ε/2. Then we have,
by finite additivity, that any finite set of L^k-equivalence classes disjoint from U
will have probability < ε/2.
We will now construct a sequence of finite sets of L^k-equivalence classes X_i,
Y_i, and integers m_i, n_i, which satisfy the following conditions:
3. μ_{m_i}(U ∪ X_i) > 1 − ε.
4. μ_{n_i}(U ∪ X_i) < a + ε/2.
We let m_0 be such that μ_i(U) > a − ε/2 for all i ≥ m_0 and let X_0 be A_{m_0} \ U.
Clearly μ_{m_0}(X_0 ∪ U) = 1, so (3) is satisfied, and certainly X_0 is disjoint from
U. Let n_0 be such that μ_i(X_0 ∪ U) < a + ε/2 for all i ≥ n_0 (n_0 is guaranteed
to exist, since U and X_0 are both finite, so μ(X_0 ∪ U) ≤ a), thus satisfying (4).
Let Y_0 be A_{n_0} \ (X_0 ∪ U). Y_0 clearly satisfies (2).
Given X_i, Y_i, m_i, n_i, we construct their successors in similar fashion. We let
m_{i+1} be such that μ_{m_{i+1}}(X_i ∪ Y_i ∪ U) < a + ε/2 (and thus μ_{m_{i+1}}(Y_i) < ε). We
let X_{i+1} = X_i ∪ A_{m_{i+1}} \ (Y_i ∪ U). Then X_{i+1} satisfies (1) and (3), and is disjoint
from U. We let n_{i+1} be such that μ_i(Y_i ∪ X_{i+1} ∪ U) < a + ε/2 for all i ≥ n_{i+1},
so μ_{n_{i+1}}(X_{i+1}) < ε and (4) is satisfied, and let Y_{i+1} = Y_i ∪ A_{n_{i+1}} \ (X_{i+1} ∪ U).
Then Y_{i+1} satisfies the disjointness conditions (1) and (2).
To complete the proof, we consider the property U ∪ ⋃ X_i, which is definable
in L^ω_{∞ω} by some sentence θ [9]. By (3), μ_{m_i}(θ) > 1 − ε for all m_i, and by (4)
μ_{n_i}(θ) < a + ε/2 for all n_i. Thus θ has no asymptotic probability.
(⇐): Assume that the RHS holds. Let {X_i} be a sequence of finite sets of
equivalence classes such that lim_{i→∞} μ(X_i) = 1. Replacing X_i by X_i ∪ ⋃_{j<i} X_j, we
may assume that X_{i+1} ⊇ X_i.
Let θ be a sentence of L^ω_{∞ω}, and consider μ_n(θ). We certainly have that
(5) p̄(i) ≥ lim sup μ_n(θ) ≥ lim inf μ_n(θ) ≥ p(i)
for each i.
Since X_i ⊆ X_{i+1}, p(i) is increasing as i increases, and p̄(i) is decreasing with
i. Thus lim_{i→∞} p(i) and lim_{i→∞} p̄(i) both exist. In fact, since p(i) + (1 − p̄(i)) =
μ(X_i), we have lim_{i→∞}(p(i) + (1 − p̄(i))) = lim_{i→∞} μ(X_i) = 1, so lim_{i→∞} p(i) =
lim_{i→∞} p̄(i). Thus, by (5), lim inf μ_n(θ) = lim sup μ_n(θ), so μ(θ) exists.
This theorem has several immediate corollaries. For the first corollary, we
need to define the graph associated with a given structure with an arbitrary
signature:
5 Some Applications to Particular Classes of Finite Structures
Corollary 12 can be immediately applied to get some negative results about the
existence of L^ω_{∞ω} convergence laws. The first two of these have already been
stated by Tyszkiewicz [15], and the third is a weaker version (but with a shorter
proof) of another result due to Tyszkiewicz [14].
1. The class of graphs with edge probability p(n), n^{−1−ε} ≪ p ≪ n^{−1} for all
ε > 0, does not have an L^ω_{∞ω} convergence law. This can be read off from the
analysis in Shelah and Spencer [13]: the random theory asserts the existence
of copies of all finite trees, including those of arbitrarily large diameter.
2. The class of graphs with edge probability p(n), n^{−1} ≪ p ≪ n^{−1} log n, does
not have an L^ω_{∞ω} convergence law. This is also immediate from Shelah and
Spencer [13].
3. The class of graphs with edge probability p(n), n^{−1} log n ≪ p ≪ n^{−1+ε}, does
not have an L^ω_{∞ω} convergence law. This is also immediate from Shelah and
Spencer [13].
4. The class of all unary functions with uniform measure does not have an L^ω_{∞ω}
convergence law. This is immediate from Lynch [10], where it is shown that
this class has arbitrarily large diameter with probability 1.
Proposition 14. Let C be a class with uniform measure such that, with proba-
bility 1, every point in every structure in C is of bounded degree. Then C has an
L^ω_{∞ω} convergence law only if there is a polynomial p(x) such that the exponential
generating function G(x) for C (as defined by Compton [2]) converges for all x
and has G(x) ≤ e^{p(x)} for all x ≥ 0.
Proof. We first note that, by Corollary 12, if C has an L^ω_{∞ω} convergence law
then C must contain a set of measure 1 which has bounded diameter; since C
also has a set of measure 1 which has bounded degree, C must have a set S
of measure 1 (the intersection of the two above-mentioned sets) in which every
connected component of every structure is of bounded size. Let A be the set
of all structures which are connected components of some structure in S. Now,
there is a bound n on the size of structures in A, and there are at most finitely
many structures of size ≤ n, so A is finite.
So assume C has a set S of measure 1 such that the set A of all structures
which are connected components of some structure in S is finite. Then the ex-
ponential generating function associated with A is some polynomial q(x). Now,
S is clearly a subset of the set A* of all structures whose components are in A,
and the exponential generating function associated with A* is e^{q(x)}. Thus the
generating function H(x) associated with S is term-by-term less than e^{q(x)}, and
so, since each coefficient of the series is nonnegative (one cannot have a nega-
tive number of structures), H(x) clearly converges and is ≤ e^{q(x)} for all x ≥ 0.
Since H(x) is a power series, we also have that H(x) converges for all x, since
it converges for all x > 0.
Now, G(x), the generating function of C, is clearly very close to H(x), since S
is of measure 1; in fact, for x > 0, G(x) must be strictly less than 2H(x) + r(x)
for some polynomial r(x), since there must be some m such that for all n > m,
less than half of the structures of C of size n are not in S; otherwise S would
not have measure 1. Thus each term g_n x^n of G(x), n > m, is less than 2 h_n x^n,
where h_n x^n is the corresponding term of H(x). We add r(x) in to take account of
all the things that may happen in C for structures of size ≤ m. Thus we have,
for x > 0,
G(x) ≤ 2H(x) + r(x) ≤ 2e^{q(x)} + r(x) ≤ e^{p(x)}
where p(x) can be constructed from q(x) and r(x).
In particular, if C is of bounded degree and has (n − c)! structures of size n
for each n and some constant c, then C does not have an L^ω_{∞ω} convergence law,
because the exponential generating function for C will either fail to converge or
will not be bounded by e^{p(x)} for any polynomial p(x). Thus we get the following
results:
1. The class of all unary 1-1 functions with uniform measure does not have an
L^ω_{∞ω} convergence law, as it has n! structures of size n.
2. The class of all graphs of bounded degree with uniform measure does not
have an L^ω_{∞ω} convergence law, as it has at least (n − 1)! structures (the
chains of length n) of size n.
Definition 17.
Let T_k be the conjunction of all extension axioms (in the language with one
binary relation) with ≤ k variables. Let b(k) be a recursive function such that
for all m > b(k), there is a model of T_k of cardinality m. We can construct such
a b(k) using Fagin's proof of the 0-1 law for first-order logic [5]. Now, we define
a sequence {A_i} of connected directed graphs, |A_i| = i, as follows: let A_1 be the
(unique) graph with one vertex. For i < b(3), let A_i be the chain {1, 2, ..., i}.
For i = b(3), let A_i be the first directed graph (in some recursive enumeration)
of size i which satisfies T_3. We know that such an A_i exists by the definition of
b(k); A_i will be connected since one of the 3-variable extension axioms states
that every two points not connected by a path of length 1 are connected by a
path of length 2. In general, for k > 3, b(k − 1) < i ≤ b(k), we let A_i be the first
structure of size i which satisfies T_{k−1}. As in the specific case i = b(3), we know
that A_i is connected.
It is clear that the set A = {A_i} is recursive. Taking the obvious probability
distribution (μ_i(A_i) = 1, and μ_i(A) = 0 otherwise), we clearly get a class with a
recursive measure. It is also clear that the probability of each extension axiom is
1. So then we can apply the following theorem of Tyszkiewicz to conclude that
A does not have an MSO convergence law:
1. Now, any two structures which satisfy this property are L^k-equivalent, since
any two copies of each A_i, i ≤ b(k), are certainly L^k-equivalent, and any two A_j,
A_l, j, l > b(k) are L^k-equivalent since any two models of T_k are L^k-equivalent.
Thus any two structures which satisfy this property both have more than k copies
of the same L^k-equivalence classes, and thus are L^k-equivalent by a theorem
of Kolaitis [8]. Thus, for each k, there is an L^k-equivalence class which has
probability 1, and thus A* has an L^ω_{∞ω} 0-1 law by Theorem 1. Also, it is clear
that each equivalence class of MSO with k quantifiers occurs more than m times
for any m as n gets bigger, so by a similar argument (as in Compton's proof [2]
that the class of all equivalence relations has an MSO 0-1 law), we can conclude
that A* has an MSO 0-1 law.
However, it is fairly easy to see that MSO does not reduce to first-order on
any subset of A* of measure 1. To see this, let θ be a sentence of MSO with no
probability on A (such a sentence must exist, since A does not have an MSO
convergence law). Let σ(x) be a formula which says "there is a connected set
U with diameter 2 such that x ∈ U and θ relativized to U is true". This is
clearly expressible in MSO. It is also clear that the truth of σ(x) depends only
on the isomorphism class of the connected component that x is in. We will show
that σ(x) cannot be equivalent to any first-order formula on any subset of A*
of measure 1. We will need the following lemma:
A_i ⊨ ∃x σ(x) ⟺ A_i ⊨ θ.
Proof. This is most easily seen by using Ehrenfeucht–Fraïssé games [4], [6]. As-
sume the quantifier rank of σ is n. Let B_i be sufficiently large (large enough to
be in S and to have at least n + 1 copies of each connected component of size
≤ n + 1), and let x and y in B_i be such that σ(x) holds but σ(y) does not (i.e. x
is in a connected component of B_i on which θ is true and y is in a connected
component on which θ is false). Let C_x and C_y denote the connected components
of x and y respectively.
Add a new constant c to the language, and consider the structures (B_i, x)
and (B_i, y), in which x and y, respectively, are the interpretations of c. (B_i, x)
and (B_i, y) will differ on a sentence of quantifier rank n, namely σ(c). So Player I
can always win the Ehrenfeucht–Fraïssé game of length n on (B_i, x) and (B_i, y).
But whenever Player I plays on a connected component other than C_x in (B_i, x)
or C_y in (B_i, y), Player II can easily match the move by playing on an isomorphic
component of B_i which is not C_y or C_x respectively (here we use the fact that
B_i has at least n + 1 copies of each connected component). So Player I only
needs to play on C_x and C_y. And clearly Player II will still lose whether or not
he only plays on C_x and C_y. Thus we have that (C_x, x) and (C_y, y) differ on a
sentence of quantifier rank at most n.
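The game argument can be made concrete. The following naive, exponential-time implementation of the n-round Ehrenfeucht–Fraïssé game on finite directed graphs is our illustration, not the paper's; Duplicator (Player II) wins the n-round game exactly when the two structures agree on all first-order sentences of quantifier rank n.

```python
def partial_iso(EA, EB, pairs):
    """pairs: tuple of (a, b) choices so far; check it is a partial isomorphism."""
    for i, (a1, b1) in enumerate(pairs):
        for a2, b2 in pairs[i:]:
            if (a1 == a2) != (b1 == b2):          # equalities must agree
                return False
            if ((a1, a2) in EA) != ((b1, b2) in EB):  # edges must agree
                return False
            if ((a2, a1) in EA) != ((b2, b1) in EB):
                return False
    return True

def duplicator_wins(UA, EA, UB, EB, n, pairs=()):
    """Does Duplicator win the n-round EF game on (UA, EA) vs (UB, EB)?"""
    if not partial_iso(EA, EB, pairs):
        return False
    if n == 0:
        return True
    # Spoiler may pick in either structure; Duplicator must answer in the other.
    for a in UA:
        if not any(duplicator_wins(UA, EA, UB, EB, n - 1, pairs + ((a, b),))
                   for b in UB):
            return False
    for b in UB:
        if not any(duplicator_wins(UA, EA, UB, EB, n - 1, pairs + ((a, b),))
                   for a in UA):
            return False
    return True

# Triangle vs. 3-chain: Spoiler needs only 2 moves (the chain's two endpoints).
K3 = {(i, j) for i in range(3) for j in range(3) if i != j}
P3 = {(0, 1), (1, 0), (1, 2), (2, 1)}
print(duplicator_wins(range(3), K3, range(3), P3, 1))  # True
print(duplicator_wins(range(3), K3, range(3), P3, 2))  # False
```

On the triangle versus the chain, one round reveals nothing, but in two rounds Spoiler picks the chain's non-adjacent endpoints and Duplicator cannot answer with a non-edge in the complete graph.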
This argument works for any x and y; thus if C and D are two connected
components such that θ is true in C but not in D, and x ∈ C and y ∈ D, then
(C, x) and (D, y) must have different first-order theories up to quantifier rank
n. But there are only finitely many equivalence classes of first-order logic up to
quantifier rank n, each of which is expressible by a single sentence of quantifier
rank n. Let S_θ be the (finite) set of sentences which describe the first-order
theories up to quantifier rank n for each (C, x) such that θ holds in C and
x ∈ C. Let S_θ(x) be S_θ with each occurrence of the constant c in each sentence
replaced by x. Then we have
θ ↔ ∃x ⋁ S_θ(x)
on all A_i.
References
1. Chang, C. C., and Keisler, H. J., Model Theory, North-Holland, Amsterdam, 1990.
2. Compton, K. J., 0-1 Laws in Logic and Combinatorics, in I. Rival, ed., Proc. NATO
Advanced Study Institute on Algorithms and Order, Reidel, Dordrecht (1988),
pp. 353-383.
3. Dawar, A., Lindell, S., and Weinstein, S., Infinitary Logic and Inductive Definabil-
ity over Finite Structures, University of Pennsylvania Tech. Report IRCS 92-20
(1992).
4. Ehrenfeucht, A., An Application of Games to the Completeness Problem for For-
malized Theories, Fund. Math. 49 (1961), pp. 129-141.
5. Fagin, R., Probabilities on Finite Models, J. Symbolic Logic 41 (1976), pp. 50-58.
6. Fraïssé, R., Sur quelques classifications des systèmes de relations, Publ. Sci. Univ.
Alger. Sér. A 1 (1954), pp. 35-182.
7. Gurevich, Y., Immerman, N., and Shelah, S., McColm's Conjecture, Proc. of the
9th IEEE Symposium on Logic in Computer Science, 1994.
Introduction
The question put in the title of this paper was originally motivated by certain
questions asked by Moshe Vardi after a talk by the first author, and further
crystallized during the author's discussions with Yiannis Moschovakis. We call
this "Moschovakis's Problem".
Let us begin with the context for this question.
It is well known that the language FO of first-order logic fails to express
certain simple (say, computable in P) properties of finite models. This remains
true even in the presence of linear order (see [2]).
Immerman [11] and Vardi [20] showed that the class P can be characterized
in the presence of linear order by the language of least fixpoint logic.³
On the other hand, if we are going to evaluate a fixed first-order formula in
a finite model, it seems natural that we would need to run a search for every
single quantifier in the formula, in other words, the complexity of this problem
* This work has been partially supported by NSF Grant CCR 9403809.
A part of this research was carried out while the author was visiting UCLA and
supported in part by NSF Grant CCR 9403809.
3 Many other characterizations of this complexity class in logical terms have been
proposed [6, 14, 17, 18].
243
Corollary 2. If P = PSPACE, then for any signature there would exist a k such
that for any FO formula, checking its validity would be in DTIME(n^k).
Definition 3 (Class S). S is the class of finite models of the linear order <, the
successor function ′, and of two constants 0 and LAST, such that < is a linear
order of the universe, ′ gives the next element (in the <-ordering), while 0 and
LAST are, respectively, the first and last elements of the universe (w.r.t. <).
Note 4. The class S is finitely axiomatizable in the class of all finite models by
means of universal formulas.
(LAST′ = LAST ∧ (∀x)(∀y)(∀z)(x′ ≠ 0 ∧
(x = x′ → x = LAST) ∧
((y′ = x ∧ z′ = x) → (y = LAST ∨ y = z ∨ z = LAST)) ∧
Q.E.D.
Theorem 6. If the hierarchy LFP_i collapses in the class S, then for any signa-
ture that contains one binary predicate symbol, and for any k, there exists an FO
formula φ_k in the signature that defines a global predicate not in DTIME(n^k).
⁴ Often, definitions of LFP use arbitrary LFP formulas in fixpoint operators; however,
this does not increase the expressive power of the logical languages (see [16]).
⁵ As before, this restriction doesn't actually restrict the expressive power.
PROOF: Suppose the contrary, that is, that LFP_i collapses, say, at the level l,
but for any signature σ there exists an m_σ such that any FO formula φ defines
a global predicate of time complexity DTIME(n^{m_σ}).
Let σ̂ be an extension of the signature of S by a new predicate symbol of
arity l and l constant symbols. It is now easy to see that the time complexity of
any fixpoint operator of dimension l (or less) in the class S is DTIME(n^{2l+m_σ̂}).
Indeed, to compute the predicate defined by the fixpoint operator, we will need
to make at most n^l iterations, and at each iteration, we will need to check n^l
l-tuples, and each such check will take (by assumption) n^{m_σ̂} steps.
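The counting in this paragraph can be seen in a toy bottom-up evaluator. The sketch below is ours, not the paper's: transitive closure as the standard dimension-2 fixpoint operator, evaluated naively, with at most n² passes, each pass re-checking all n² pairs (the per-tuple body check here is trivial, i.e. the n^{m} factor is constant).

```python
def lfp_transitive_closure(n, edges):
    """Naive bottom-up evaluation of TC = lfp R. E(x,y) or exists z (E(x,z) and R(z,y))."""
    R = set()
    iterations = 0
    while True:
        iterations += 1
        new = {(x, y) for x in range(n) for y in range(n)
               if (x, y) in edges or any((x, z) in edges and (z, y) in R
                                         for z in range(n))}
        if new == R:          # fixed point reached
            break
        R = new
    return R, iterations

chain = {(i, i + 1) for i in range(5)}        # 0 -> 1 -> ... -> 5
tc, its = lfp_transitive_closure(6, chain)
print(len(tc), its)   # 15 closed pairs in 6 passes, well under the 6^2 bound
```

Each pass adds at least one tuple until stabilization, so the number of passes is bounded by the number of l-tuples, which is the source of the n^l iteration factor.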
Because the fixpoint operators define predicates of arity l (or less), the time
complexity of checking any LFP_1 (and thus any LFP) sentence in the class S is
then DTIME(n^{2l+2m_σ̂}).
But this contradicts the well-known fact ([11, 20]) that in this situation
LFP is capable of defining any S-global predicate in P.
Now this proves that if the hierarchy of LFP_i collapses, then for a certain
signature σ̂, FO is in DTIME(n^m) for no m.
However, models of this signature⁵ can be translated into models of one
binary relation (see e.g. [9]), and in this translation the models grow in size
only polynomially.
Hence the theorem. Q.E.D.
2 On a normal form for ID
Note that the definition of implicit definability in [13], although different in ap-
pearance, defines the same class of global predicates.
Corollary 11. For any k there exists a PT_{2k+3} S-global predicate that is not in
DTIME(n^k).
PROOF: By the time hierarchy theorem ([10]; see also [1, Exercise 11.8]), there
exists a deterministic Turing machine in DTIME(n^{k+1}) \ DTIME(n^k). This ma-
chine accepts a certain language and can be thought of as defining a 0-ary global
predicate (essentially, a subset of S).
The global predicate that this machine computes can be defined, according
to the proof of Theorem 9, by an ID_{∀∃} formula that implicitly defines a new
predicate of arity 2(k + 1) + 1 = 2k + 3.
By its very construction, this predicate is in PT_{2k+3}, which proves the corol-
lary. Q.E.D.
References
1. Aho, Alfred V., John E. Hopcroft, and Jeffrey D. Ullman, "The design and analy-
sis of computer algorithms", Addison-Wesley Publ. Co., 1974, 1-470.
2. Aho, Alfred V., and Jeffrey D. Ullman, Universality of data retrieval languages,
in: "Proceedings of the 6th ACM Symp. on Principles of Programming Languages
(POPL)", 1979, 110-117.
3. Behmann, Heinrich, Beiträge zur Algebra der Logik, insbesondere zum Entschei-
dungsproblem, Math. Ann., 86, 1922, 163-229.
4. Chandra, Ashok K., and David Harel, Structure and complexity of relational
queries, J. Comput. Syst. Sci., 25, 1980, 156-178.
5. Grohe, Martin, Bounded-arity hierarchies in fixed-point logics, in: "Proceedings
of CSL '93, 1993 Annual Conference of the European Association for Computer
Science Logic, Swansea, UK, 13-17 Sept. 1993" (Börger, E., Gurevich, Y., Meinke,
K., Eds.), Springer-Verlag: Berlin, 1994, 150-164.
6. Gurevich, Yuri, Algebras of feasible functions, in: "24th Symp. on Foundations of
Computer Science (FOCS)", IEEE Computer Society Press, 1983, 210-214.
7. Gurevich, Yuri, Logic and the challenge of computer science, in: "Current Trends
in Theoretical Computer Science" (Egon Börger, Ed.), Computer Science Press:
Rockville, Md., 1987, 1-57.
8. Enderton, Herbert B., "A mathematical introduction to logic", Academic Press:
New York, 1972.
9. Ershov, Yuri L., Igor A. Lavrov, Asan D. Taimanov, and Michael A. Taitslin, Ele-
mentary theories, Russian Mathematical Surveys, 20:4, 1965, 35-105.
10. Hennie, F. G., and R. E. Stearns, Two tape simulation of multitape machines,
J. ACM, 13:4, 1966, 533-546.
11. Immerman, Neil, Relational queries computable in polynomial time, in: "14th
ACM Symp. on Theory of Computing (STOC)", ACM, 1982, 147-152.
12. Immerman, Neil, Languages that capture complexity classes, SIAM J. Computing,
16, 1987, 760-778.
13. Kolaitis, Phokion G., Implicit definability on finite structures and unambiguous
computations, in: "5th IEEE Annu. Syrup. on Logic in Computer Science (LICS)",
IEEE Computer Society Press: Los Alamitos, CA, 1990, 168-180.
14. Livchak, Alexander B., The relational model for process control, in: "Automatic
Documentation and Mathematical Linguistics 4", Moscow, 1983, 27-29.
15. Maltsev, Anatoly I., Regular products of models, Izv. Akad. Nauk SSSR,
Ser. Mat., 23, 1959, 489-502.
16. Moschovakis, Yiannis N., "Elementary induction on abstract structures", North-
Holland/Elsevier: Amsterdam/New York, 1974, 218pp.
17. Sazonov, Vladimir Yu., Polynomial computability and recursivity in finite domains,
Elektronische Informationsverarbeitung und Kybernetik, 16, 1980, 319-323.
18. Stolboushkin, Alexei P., and Michael A. Taitslin, Dynamic logics, in: "Cybernetics
and Computing Technology" (V.A. Mel'nikov, Ed.), Moscow: Nauka, 1986, 180-
230.
19. Valiant, L., Relative complexity of checking and evaluating, Information Process-
ing Letters, 5, 1976, 20-23.
20. Vardi, Moshe Y., Complexity of relational query languages, in: "14th ACM Symp.
on Theory of Computing", ACM, 1982, 137-146.
Logic Programming in Tau Categories
Stacy E. Finkelstein          Peter Freyd
McGill University             University of Pennsylvania
sef@triples.math.mcgill.ca    pjf@saul.cis.upenn.edu
James Lipton
Wesleyan University
lipton@allegory.cs.wesleyan.edu
Abstract
Many features of current logic programming languages are not captured by
conventional semantics. Their fundamentally non-ground character, and the
uniform way in which such languages have been extended to typed domains sub-
ject to constraints, suggest that a categorical treatment of constraint domains,
of programming syntax and of semantics may be closer in spirit to declarative
programming than conventional set theoretic semantics.
We generalize the notion of a (many-sorted) logic program and of a resolution
proof by defining them both over a (not necessarily free) τ-category, a category
with products enriched with a mechanism for canonically manipulating n-ary
relations. Computing over this domain includes computing over the Herbrand
Universe, and over equationally presented constraint domains as special cases.
We give a categorical treatment of the fix-point semantics of Kowalski and van
Emden, which establishes completeness in a very general setting.
1 Introduction
Tau categories are categories enriched with a mechanism for canonically manipulating
n-ary relations. This τ-structure creates, through canonical limits, an ideal framework
for formalizing an abstract syntax for logic programming with constraints. By defining
the notions of resolution and unification over such a category, one is able to capture
a quite general notion of constraint logic programming and establish a completeness
theorem, based on a non-ground categorical generalization of fixed-point semantics.
Connections between categories and logic programming were used to describe uni-
fication in [18]. A. Corradini and U. Montanari [5] focus on the abstract computations
of logic programs and the view of logic programs as structured transition systems.
A. Corradini and A. Asperti [4] build on [5] and define a model for a logic program
as a family of monoidal categories indexed by sets of variables. In both works, the
category theoretic setting is semantical and is not used to describe the resolution pro-
cess. Asperti and Martini [2] give a categorical treatment of the syntactic mechanisms
of resolution along with the interpretation of logic programs based on the concept of
using projections as variables [14], and was a starting point for our work. The categor-
ical framework is here extended to cover more general notions of programming syntax
and resolution and to produce a cleaner semantics that exploits and strengthens the
fixed-point approach of Kowalski and van Emden [12].
Categorical interpretations are often limited by the fact that the main constructs,
such as products and pullbacks, are only defined up to isomorphism, forcing arbitrary
choices of representatives of a given isomorphism class. τ-categories, introduced by
Peter Freyd [7], are a uniform setting in which to work with n-ary relations and thus
describe logic programs. They are finite limit categories enriched with τ-structure,
a distinguished class of diagrams which provides canonical choices for limits and
guarantees associative equality and a strict unit for the product.
We begin by describing a categorical syntax for logic programs in an arbitrary
τ-category, as in [6].
2 τ-categories
This section describes a τ-categorical syntax for logic
programming; it is given in greater detail in the dissertation of Stacy E. Finkelstein
[6].
The basic translation of the first-order language ℒ into the τ-category follows the
usual interpretation as in [14], slightly augmented with the canonical choices allowed
by the τ-structure. The relevant definitions are given in Table 2. We first define
an interpretation of terms, inductively, as arrows in A_τ, relative to a sequence of
distinct variables containing all of the free variables in the term. We then extend this
interpretation to predicates.
If x_j is a short column, then ⟨T; x_1, ..., x̂_j, ..., x_n⟩ will denote the table obtained by removing
the arrow x_j from the sequence x_1, ..., x_n.
Definition 2.3 Given tables ⟨T; x_1, ..., x_m⟩ and ⟨T′; y_1, ..., y_n⟩ where x_j : T → T′, we define
their composition at j as
3.1 Substitution and Unification in A_τ
Definition 3.2 For a term t of sort ρ, having all of its free variables among x̄, M_x̄(t) is defined
to be a morphism M(x̄) → M(ρ) in A_τ as follows:
t := x_i   M_x̄(x_i) is defined to be the canonical projection M(x̄) → M(ρ_i)
t := f t_1 ⋯ t_m   If each t_i is of sort σ_i, then M_x̄(f t_1 ⋯ t_m) is defined as the composition
M(f) ∘ ⟨M_x̄(t_1), ..., M_x̄(t_m)⟩
Predicates
Definition 3.3 For every predicate symbol R in ℒ with arity n and sorts σ_1, ..., σ_n, define
M(R) to be a τ-monic M(R) ↣ M(σ̄). For a formula φ with all its free variables among x̄,
M_x̄(φ) is defined as follows:
φ := P t_1 ⋯ t_m   If each t_i is of sort σ_i, then M_x̄(P t_1 ⋯ t_m) is defined to be the
τ-arrow p which is the canonical pullback of M(P) ↣ M(σ̄) along the arrow
⟨M_x̄(t_1), ..., M_x̄(t_m)⟩.
φ := ⋀_{i=1}^m A_i   If each A_i is an atom which has been interpreted as described above, then
M_x̄(φ) is defined to be the m-tuple of τ-monic arrows M_x̄(A_i) ↣ M(x̄).
In this case the pair (θ₁, θ₂) is said to unify the formulae φ and ψ. Note that
the first condition implies in particular that n = m, M(σ̄) = M(ρ̄), and M(P) = M(R).
Also note that the occur check is internal to this setting. Since all interpretations are
made with respect to a list of variables and, in particular, the substitution arrows
are made with respect to an ostensibly different list of variables than the terms to
be unified, there can be no substitution of a variable with a term containing that
variable unless a separate diagram specifies these to be equal. In many constraint
domains (e.g. the set of closed terms in the lambda calculus), terms may be unifiable
without having mgu's. But if unifiers exist in a category, it is straightforward to add
the appropriate pullback squares and form a new r-category in which mgu's exist.
This is discussed in the last section of the paper.
3.2 Resolution
A definite τ-logic program Pr is a finite set of definite clause diagrams M_x̄(C).
Let G be the definite goal B_1 ∧ B_2 ∧ ... ∧ B_m (for m ≥ 1). Then a definite goal
diagram M_x̄(G) will be an m-tuple of τ-monic arrows M_x̄(B_i) ↣ M(x̄) to the
base M(x̄).
3.3 Models
1. A conjunction M_x̄(φ) is valid in 𝒟 iff H[l], where l : Lim → M(x̄) is the canon-
ical limit of the diagram M_x̄(φ), is an isomorphism.
2. A definite clause diagram M_x̄(A_1, A_2, ..., A_n ⊢ B) is valid in 𝒟 (under H) iff
there exists a monic arrow m : H[Lim] ↣ H[M_x̄(B)] over H[M(x̄)],
where l is the limit of the diagram M_x̄(φ) for φ = A_1 ∧ A_2 ∧ ... ∧ A_n.
3. A logic program Pr is valid in 𝒟 iff every clause of Pr is valid in 𝒟. In this case
we say that 𝒟 (under H) is a model of Pr.
4 Completeness of τSLD-Resolution
Let ℒ be the language of the Horn-clause program P and ℛ_n = {R_1, ..., R_n} the
set of relation symbols in the language, of sorts {σ̄_1, ..., σ̄_n}. In order to establish
completeness we begin with a category C with terminator and products, in which all
sorts, constants and function symbols of ℒ have been interpreted by a C-structure
M. On this base category of constraints we create a syntactic category into which we
translate logic programs and over which SLD resolution will take place. The syntactic
category (previously denoted A_τ in preceding sections) is obtained by freely adjoining
indeterminate monics X_i ↣ B to C, so as to associate to each predicate in ℛ_n an
indeterminate monic, through which no arrows from C representing terms will factor.
In general, for any category C with terminator and products, and any object B in
C, the category C_B (also called C[X_i] below) obtained by adjoining an indeterminate
subobject X_i ↣ B is defined as follows. The objects of C_B are pairs (A, s) where
A is an object of C and s is a finite set of morphisms A → B of C. We let the
morphisms (A, s) → (A′, s′) in C_B be those morphisms f : A → A′ of C such that for
every morphism g : A′ → B in s′ it is the case that f; g : A → B is in s.
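For intuition, the morphism condition of C_B can be checked mechanically in a toy case. This sketch is ours, with C taken to be finite sets and functions; a function with domain {0, ..., n−1} is encoded as a tuple of images, and f; g denotes composition in diagram order.

```python
def compose(f, g):
    """f : A -> A', g : A' -> B, both as tuples over range-domains; return f;g : A -> B."""
    return tuple(g[f[a]] for a in range(len(f)))

def is_CB_morphism(f, s, s_prime):
    """f : (A, s) -> (A', s') is a C_B-arrow iff f;g lies in s for every g in s'."""
    return all(compose(f, g) in s for g in s_prime)

# A = {0, 1}, A' = {0, 1, 2}, B = {0, 1}
f = (0, 2)          # f(0) = 0, f(1) = 2
g = (0, 1, 1)       # g : A' -> B
print(is_CB_morphism(f, {compose(f, g)}, {g}))  # True: f;g was placed in s
print(is_CB_morphism(f, set(), {g}))            # False: f;g missing from s
```

The identity on (A, s) always satisfies the condition, since id; g = g ∈ s for every g ∈ s, matching the fact that C_B is a category.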
C_B has a terminal object and products for every pair of objects. It is also a
τ-category with τ-structure inherited from C. In addition, the embedding C ↪ C_B
which sends f : A → A′ in C to f : (A, ∅) → (A′, ∅) in C_B is full and faithful
and preserves existing limits. The category C_B also has a generic subobject id :
(B, {id}) ↣ (B, ∅). This is our indeterminate monic X ↣ ι(B), and it is important
to note that all pullbacks of this monic along arrows t from C (so-called "term arrows")
exist in C_B and are of the form id : (A, {t}) ↣ (A, ∅), which cannot be an isomorphism
in C_B.
To emphasize the fact that we are interpreting predicate letters as indeterminate
monics, we use the notation C[X̄] = C[X_1, ..., X_n] for the category obtained from
C by freely adjoining indeterminate monics X_i ↣ M(σ̄_i), one for each R_i ∈ ℛ_n. We
extend M to an interpretation (also called M) of the predicates in ℒ by defining
M(R_i) = X_i (that is to say, M(R_i) is the monic X_i ↣ M(σ̄_i)). Such a category,
along with such a structure map, corresponds to the category A_τ and structure map
(also denoted M) over which the proof theory has been discussed in previous sections.
We shall call the category C[X̄] with its structure map a logic programming category
(an LP-category). We now use the notation Pr ⊢_C G to mean there is a τSLD-proof of
G from Pr in the category C[X̄].
Our ambient semantic category will be the topos Ĉ given by the contravariant
Yoneda embedding from C (i.e. C ↪ Set^{C^op}). In Ĉ we are able to take unions and
images. Hence, in particular, any predicate defined by resolution over C is repre-
sentable as a monic in Ĉ. We will use this framework to define initial models for logic
programs over C, as subcategories of Ĉ.
We will need to make use of a few elementary facts about the Yoneda embedding,
whose proofs can be found in e.g. [9, 7]. The embedding shown above is full and
faithful. The objects B in C appear in Ĉ as representable functors Hom(−, B), al-
though we will often abuse notation and refer to them by their original names B. All
representable functors are coprime and projective. Thus, in particular, arrows from
representables into unions of functors in Ĉ must factor through one of them. Also, up
to isomorphism, we may take the subobject relation A ⊆ B between subobjects A, B
of some object C to be pointwise inclusion of functors. That is to say, given a chain
of subobjects of b in Ĉ such as the diagram on the left,
[diagram]
there are isomorphic subobjects F_i ≅ E_i such that the diagram on the right com-
mutes, with all arrows the canonical inclusions induced by pointwise set-
theoretic containment of functors: for each object X in C, F_i(X) ⊆ F_{i+1}(X).
To show completeness, we shall develop a category-theoretic generalization of
the Kowalski–van Emden T_P operator ([12]). We will define an operator T_{P,C} on
semantic categories 𝒟 which builds a new category from 𝒟 generated by the clauses
of the program. By a process of iteration we construct a fixed point for T_{P,C}, an initial
model for the program from which we may show completeness.
Throughout, we will denote by (f)# : Sub(a) → Sub(b) the functor from the
lattice of subobjects of a in Ĉ to that of b, induced by pulling back along f : b → a,
and by ∃(f) its left adjoint in Ĉ. Notice that ∃(id)(g) is the image of g in Ĉ.
Note that one may show, using a straightforward induction on terms, that for any
term t, ⟦t⟧ computed in C agrees with ⟦t⟧ computed in C[X]. Thus we will often drop the superscripted categories from our
notation when discussing interpretations of terms.
Given a program P, let us denote by cl a clause A1, A2, …, An ⊢ B(t). We will then
mean by hd_cl the predicate symbol B (the head of cl), by tm_cl the term t (associated
with hd_cl), and by tl_cl the list A1, …, An (the tail of cl). We then make the following
categorical notational definitions.
Definition 4.3 If ⟦t⟧ : M(σ) → M(τ) and ⟦r⟧ : M(σ') → M(τ) are terms, then
⟦t⟧ is said to factor through ⟦r⟧ in D if there exists an arrow θ in D such that
the arrows id and θ are pullbacks of the arrows ⟦t⟧ and ⟦r⟧ respectively.
Van Emden and Kowalski introduced an elegant semantics for Prolog in terms of
fixed points of a continuous operator Tp on the power set of the Herbrand base Bp
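The classical ground T_P operator recalled here can be sketched in a few lines of Python (a minimal rendering of our own; the clause encoding as (head, body-list) pairs is illustrative, not from the paper):

```python
# Sketch of the van Emden-Kowalski T_P operator for ground (propositional)
# programs: a clause is (head, [body atoms]); T_P(I) collects the head of
# every clause whose body atoms are already all in the interpretation I.
def t_p(program, interp):
    return {head for head, body in program if set(body) <= interp}

def least_fixpoint(program):
    """Iterate T_P from the empty interpretation up to its least fixed point."""
    interp = set()
    while True:
        nxt = interp | t_p(program, interp)
        if nxt == interp:
            return interp
        interp = nxt
```

For a program such as [("p", []), ("q", ["p"]), ("r", ["q", "s"])] the iteration stabilizes at {"p", "q"}: r is never derived because s has no clause.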
Proof. Follows from the fact that filtered colimits commute with finite limits in a
topos (e.g. [3]). □
⟦B⟧* = ⟦B⟧⁰ ∪ ⋃_{cl ∈ P, hd_cl = B} ∃(⟦tm_cl⟧)(⟦tl_cl⟧*)     (1)
for each predicate letter B ∈ L. It will suffice to show that for each clause cl such that
hd_cl = B, ∃(⟦tm_cl⟧)(⟦tl_cl⟧*) ⊆ ⟦B⟧*. Then since ∃(⟦tm_cl⟧) is the left adjoint of
(⟦tm_cl⟧)#, it suffices to show ⟦tl_cl⟧* ⊆ (⟦tm_cl⟧)# ⟦B⟧*. We know by the definition
of ⟦B⟧^{k+1} that ∃(⟦tm_cl⟧)(⟦tl_cl⟧^k) ⊆ ⟦B⟧^{k+1}. So since (−)# is also a left adjoint, we
have that for each k

⟦tl_cl⟧^k ⊆ (⟦tm_cl⟧)# ⟦B⟧^{k+1} ⊆ (⟦tm_cl⟧)# ⟦B⟧*
By Lemma 4.7 we know that ⟦tl_cl⟧* = lim→ {⟦tl_cl⟧^k}. Thus we may conclude by the
universal property of colimits that

⟦tl_cl⟧* ⊆ (⟦tm_cl⟧)# ⟦B⟧*     (2)
In fact, for any semantic category D, being a fixed point of
T_{P,C} is equivalent to being a model of P. In the proof above, one sees that to show
C* a fixed point of T_{P,C} (equation (1) for each predicate letter B), one may equivalently
show that for each clause such that hd_cl = B, C* is a model of P (equation (2)). □
Proof. We want to show that for any model D of P (i.e. any semantic category D which
is a fixed point of T_{P,C}), validity in C* implies validity in D. In particular, it
suffices to show that for any predicate letter B ∈ L, ⟦B⟧* ⊆ ⟦B⟧^D.
We first show by induction on k that for all k, ⟦B⟧^k ⊆ ⟦B⟧^D. Since ⟦B⟧⁰ is
the unique map 0 → ⟦b⟧ for b = sort of B, it is clear that ⟦B⟧⁰ ⊆ ⟦B⟧^D. Now
suppose that ⟦B⟧^k ⊆ ⟦B⟧^D. By Lemma 4.9 and the fact that D is a fixed point of
T_{P,C}, ⟦B⟧^{k+1} = E(⟦ ⟧^k, B) ⊆ E(⟦ ⟧^D, B) = ⟦B⟧^D.
It now follows by the definition of ⟦B⟧* that ⟦B⟧* = lim→ {⟦B⟧^k} ⊆ ⟦B⟧^D. □
Proof. Consider the following model D of the program P. For each predicate letter
Q ∈ L, adjoin to C the arrows (in Ĉ): ⋃_{cl ∈ P, hd_cl = Q} ∃(id)(⟦tm_cl⟧), and close under finite
limits. Let D be the full subcategory of Ĉ generated by this. Now let the extended
structure ⟦ ⟧^D be defined by ⟦Q⟧^D = ⋃_{cl ∈ P, hd_cl = Q} ∃(id)(⟦tm_cl⟧), where by convention,
if Q is not the head of any clause of P then ⟦Q⟧^D (the empty union) shall be the unique arrow 0 → ⟦b⟧ for
b = sort of Q.
The category D with interpretation H : C[X] → D induced by evaluation
H(X_i) = ⟦R_i⟧^D is a model of the program P, since if A1, A2, …, Am ⊢ B(r) is a
clause cl of P, then ⟦r⟧ factors through ∃(id)(⟦r⟧) and so through ⟦B⟧^D. Thus
⟦B(r)⟧^D is an isomorphism through which ⟦tl_cl⟧^D factors automatically.
⟦Q⟧^n ∪ ⋃_{cl ∈ P, hd_cl = Q} {∃(⟦tm_cl⟧)(⟦tl_cl⟧^n)}
But then, since M(σ) (or, more precisely, its image in Ĉ) is coprime, it must be
the case that ⟦t⟧ factors through one of the members of the union. Since we
have assumed that ⟦t⟧ does not factor through ⟦Q⟧^n, it must be the case that
for some clause cl ∈ P such that hd_cl = Q (say the clause A1, A2, …, Am ⊢ Q(r)),
(⟦t⟧)# ∃(⟦r⟧)(⟦tl_cl⟧) is an isomorphism. In fact, by Lemmas 4.12 and 4.10, there
is a substitution arrow θ in C such that
(a commuting square in which ⟦t⟧ factors through ⟦r⟧ via the substitution θ : M(σ) → M(σ'))
Thus, we have that for each i, ⟦A_i θ⟧ is an isomorphism. But then, by our induction
hypothesis, there are rSLD-proofs of M^{C[X]}(A_i θ) for each i.
We may then build an rSLD-proof of M^{C[X]}(Q(t)) as follows. Using the clause
diagram on the left below, we may take one rSLD-derivation step using (id, θ) as
mgu. Then our new goal would be the diagram on the right below. But, as mentioned
above, each of these has a (finite) rSLD-proof.
(clause diagram with tail M^{C[X]}(A1), …, M^{C[X]}(Am) over head M^{C[X]}(Q(r)) on the left; the instantiated goal diagram M^{C[X]}(A1 θ), …, M^{C[X]}(Am θ) on the right)
□
Proof. Let C' as described above have clause diagrams M^{C[X]}(cl_1), …, M^{C[X]}(cl_m),
and let G be a definite goal diagram (i.e. a diagram as at left below) such that G is true in every
model of P'.
(the goal diagram M^{C[X]}(G1), …, M^{C[X]}(Gr) with limit γ on the left; its image ⟦G1⟧*, …, ⟦Gr⟧* with limit H⟦γ⟧ on the right)
In particular then, for the model C* of P', we have that H⟦γ⟧, where γ is the limit of
the diagram G at left above, is an isomorphism. Since by definition H⟦γ⟧ is the limit
of the diagram at right above, if it is an isomorphism then g_j is an isomorphism for
each j. But now by Lemma 4.14, g_j is an isomorphism for each j, so by Lemma 4.13
there is an rSLD-proof of M^{C[X]}(G_j) for each j.
Let Σ be the language of a one-sorted Horn clause program. Define H, the free
algebraic theory for Σ, to be the category with objects the natural numbers and arrows:
projection maps π_nm : n → m and diagonal arrows δ_nm : m → n for each nonzero pair
of objects m, n with m ≤ n and, for each function symbol f of arity n in Σ, an
arrow f : n → 1. These arrows satisfy the equations for associativity as well as
π_mk π_nm = π_nk and π_nm δ_nm = id_m (in particular δ_mm = π_mm).
This category has products and τ-structure. There is a one-to-one correspondence
between arrows in H and (tuples of) terms in the Herbrand universe over Σ. Give
H the canonical interpretation that associates with each f in Σ the arrow f, and let
H[X1, …, Xn] be the category obtained by adjoining indeterminates for each of the
predicate letters in Σ; then H[X1, …, Xn] captures logic programming in the sense
made precise in the theorem below. Terminology is from [12].
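The arrow/term correspondence can be made concrete in a small sketch (our own illustration, with terms encoded as nested tuples and variables numbered 1..n): an arrow n → m is an m-tuple of terms in n variables, and composition is simultaneous substitution.

```python
# Sketch: arrows n -> m of a free algebraic theory represented as m-tuples of
# terms over variables 1..n; composition g . f is simultaneous substitution.
def subst(term, args):
    """Replace variable i (an int) by args[i-1]; a compound term is a
    (function_symbol, subterm-tuple) pair."""
    if isinstance(term, int):
        return args[term - 1]
    f, ts = term
    return (f, tuple(subst(t, args) for t in ts))

def compose(g, f):
    """f : n -> m and g : m -> k as tuples of terms; the result is n -> k."""
    return tuple(subst(t, f) for t in g)
```

For example, with f = (s(x), x) : 1 → 2 and g = (f(x, y)) : 2 → 1, compose(g, f) is the one-tuple (f(s(x), x)), mirroring how composites of theory arrows are tuples of substituted terms.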
6 Conclusion
References
[1] A. Bossi, M. Gabbrielli, G. Levi, and M. Martelli. The s-semantics approach: Theory and
applications. Journal of Logic Programming, 19-20, 1994.
[2] A. Asperti and S. Martini. Projections instead of variables, a category theoretic interpretation
of logic programs. In Proc. 6th ICLP, pages 337-352. MIT Press, 1989.
[3] Michael Barr and Charles Wells. Toposes, Triples, and Theories. Springer-Verlag, 1985.
[4] A. Corradini and A. Asperti. A categorical model for logic programs: Indexed monoidal
categories. In Proceedings REX Workshop '92. Springer Lecture Notes in Computer Science,
1992.
[5] A. Corradini and U. Montanari. An algebraic semantics of logic programs as structured tran-
sition systems. In Proceedings of the North American Conference on Logic Programming
(NACLP '90). MIT Press, 1990.
[6] Stacy E. Finkelstein. Tau Categories and Logic Programming. PhD thesis, University of
Pennsylvania, 1994.
[7] Peter Freyd and Andre Scedrov. Categories, Allegories. North-Holland, 1990.
[8] Joxan Jaffar and Michael Maher. Constraint logic programming: A survey. Journal of Logic
Programming, 19/20, 1994.
[9] J. Lambek and P.J. Scott. Introduction to Higher Order Categorical Logic. Cambridge, 1986.
[10] Saunders Mac Lane and Ieke Moerdijk. Sheaves in Geometry and Logic. Springer-Verlag, 1992.
[11] James Lipton and Paul Broome. Combinatory logic programming. In Proc. ILPS'94. MIT,
1994.
[12] J. W. Lloyd. Foundations of Logic Programming. Springer-Verlag, New York, 1987.
[13] M. Alpuente, M. Falaschi, M. Gabbrielli, and G. Levi. The semantics of equational logic pro-
gramming as an instance of CLP. In Logic Programming Languages. MIT Press, 1993.
[14] Michael Makkai and Gonzalo Reyes. First Order Categorical Logic, volume 611 of Lecture
Notes in Mathematics. Springer-Verlag, 1977.
[15] Dale Miller, Gopalan Nadathur, Frank Pfenning, and Andre Scedrov. Uniform proofs as a
foundation for logic programming. Annals of Pure and Applied Logic, 1990.
[16] P. Panangaden, V. Saraswat, P.J. Scott, and R.A.G. Seely. A hyperdoctrinal view of constraint
systems. In Lecture Notes in Computer Science 666. Springer-Verlag, 1993.
[17] A. Poigné. Algebra categorically. In Category Theory and Computer Programming. Springer,
1986.
[18] D.E. Rydeheard and R.M. Burstall. A categorical unification algorithm. In Category Theory
and Computer Programming, 1985.
Reasoning and Rewriting with Set-Relations I:
Ground Completeness
1 Introduction
In the present paper we use the same set-relations but introduce a new rea-
soning system. It is less general than the earlier one (we study only
the ground case) but it is much more amenable to automation. Rewriting with
non-congruence relations is also becoming an issue of increasing importance. The
set-relations we are considering are not even equivalences: equality is symmetric
and transitive (but not reflexive), inclusion is reflexive and transitive (but not
symmetric), and intersection is reflexive and symmetric (but not transitive). We
study rewriting proofs in the presence of these relations, generalizing sev-
eral classical notions (critical pair, confluence, rewriting proof) to the present
context. Our results on rewriting extend bi-rewriting [LA93] in that we con-
sider three different set-relations. We also take a step beyond the framework of
[BG93] in that we study more general composition of relations than chaining of
transitive relations.
Section 2 defines the syntax and the multialgebraic semantics of the lan-
guage and lists some basic properties of the set-relations. Section 3 introduces
the reasoning system I, the ordering of words, and the maximal literal proof
strategy for using I. Section 4 discusses term rewriting with the introduced
set-relations. In Section 5 we discuss the main theorem, refutational ground
completeness of I with the maximal literal strategy and, as a simple corollary,
ground completeness of rewriting.
The present paper is an improved and shortened version of the report [KW94].
Because of space limitations, all proofs had to be omitted from the present
version.
2 Specifications of Set-Relations
Specifications are written using a finite set of function symbols F with arity
ar : F → ℕ.3 A symbol f ∈ F₀ is called a constant. Only the ground case is
considered here, and we do not introduce any variables. We denote by T(F)
the set of all (ground) terms. There are only three atomic forms, using binary
predicates: equation s ≈ t, inclusion s ≺ t, and intersection s ≬ t. A specification
is a set of clauses, i.e. finite sets of literals, where a literal is an atom or a negated
atom, written ¬a. (In [WM95, Wal93] a restricted language is used, allowing
only negated intersections, positive inclusions and equations in clauses.) We will
usually write negated atoms explicitly as s ≉ t, s ⊀ t and s ≬̸ t, and assume
¬(¬a) = a. By words we will mean the union of the sets of terms, literals and
clauses. We will write u[s]_p to denote that a term s is a subterm of a term u at
a position p. Often the position will be omitted for the sake of simplicity.
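The subterm notation u[s]_p can be rendered concretely as follows (a small sketch of our own; terms are (symbol, children) pairs and a position is a tuple of child indices):

```python
# Sketch: ground terms as (symbol, children) pairs; a position p is a tuple of
# child indices. subterm(u, p) extracts u|_p; replace(u, p, t) builds u[t]_p.
def subterm(u, p):
    for i in p:
        u = u[1][i]
    return u

def replace(u, p, t):
    if not p:                      # empty position: replace the whole term
        return t
    f, ts = u
    i = p[0]
    return (f, ts[:i] + (replace(ts[i], p[1:], t),) + ts[i + 1:])
```

For instance, in u = f(a, g(b)) the subterm at position (1, 0) is b, and replacing it yields f(a, g(c)).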
Syntactic expressions of the language are interpreted in multialgebras [Kap88,
Hus92, Wal93].

Definition 2.1 An F-multialgebra A is a tuple (S^A, F^A), where S^A is a non-empty
carrier set and F^A is a set of set-valued functions f^A : (S^A)^{ar(f)} → P⁺(S^A),
with P⁺(S^A) the set of non-empty subsets of S^A.
3 We are treating only the unsorted case - extension to many sorts is straightforward.
The following relation expresses equality of term value sets, and is the usual
interpretation of equality in the set-valued approach to nondeterminism [Hes88,
Kap88]:

s ≐ t ⇔ s ≺ t ∧ s ≻ t.     (1)

As can be expected, it does not increase expressibility and therefore is not used
in the language. For a discussion of the intended meaning of, and difference between,
'≈' and '≐' in the context of nondeterminism see [WM95, Wal93].
The positive, resp. negative, relations are totally ordered by strength:

u ≈ v ⇒ u ≐ v ⇒ u ≺ v ⇒ u ≬ v   and   u ≉ v ⇐ u ≐̸ v ⇐ u ⊀ v ⇐ u ≬̸ v     (2)

The following two lemmas present the subterm replacement and composition
(chaining) properties. Replacement of "equals by equals" occurs only in the case
of two of the four relations. Nevertheless these properties will allow us later to
develop techniques of term rewriting.
(Table 1: composition of set-relation atoms — for a first atom s ⊕ t (rows s ≈ t, s ≺ t, s ≻ t, s ≬ t) and a second atom t ⊗ u (columns), each entry gives the strongest relation s ⊚ u that follows for all terms.)
For convenience we will write the partial function coded in this table as ⊕ ∘ ⊗ =
⊚, meaning that ⊚ is the strongest relation obtained by composing ⊕ and ⊗ for
any terms, i.e.: ⊕ ∘ ⊗ = ⊚ ⇔ (s ⊕ t ∧ t ⊗ u ⇒ s ⊚ u) for any terms s, t, u. Note
that the table defines only the strongest composite of the arguments. Because
of the ordering (2), the fact that, for instance, ≈ ∘ ≺ = ≺ implies that ≬
can also be obtained from composing ≈ and ≺.
Composition of negative and positive atoms is symmetric to the composition
of the positive and the negative ones given in the table. Composition of two
negative atoms does not allow one to draw any conclusion and is therefore not
mentioned at all.
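The content of such a composition table can be illustrated by computing strongest composites directly from the intended set semantics (a brute-force sketch of our own, under the multialgebra reading: '≈' equal singletons, '≺' inclusion, '≬' non-empty intersection; the enumeration runs over a small universe, so entries are only checked there, not proved in general):

```python
# Sketch: derive the strongest composite of two positive set-relations by
# enumerating all non-empty value sets over a small universe.
from itertools import combinations

def nonempty_subsets(universe):
    xs = list(universe)
    return [frozenset(c) for r in range(1, len(xs) + 1)
            for c in combinations(xs, r)]

REL = {
    'eq':  lambda s, t: s == t and len(s) == 1,  # s ≈ t: equal singletons
    'inc': lambda s, t: s <= t,                  # s ≺ t: set inclusion
    'int': lambda s, t: bool(s & t),             # s ≬ t: overlapping sets
}
STRENGTH = ['eq', 'inc', 'int']  # strongest first, cf. the ordering (2)

def strongest_composite(r1, r2, universe=range(3)):
    """Strongest r such that s r1 t and t r2 u imply s r u on this universe."""
    sets = nonempty_subsets(universe)
    pairs = [(s, u) for s in sets for t in sets for u in sets
             if REL[r1](s, t) and REL[r2](t, u)]
    for r in STRENGTH:
        if all(REL[r](s, u) for s, u in pairs):
            return r
    return None  # no positive conclusion derivable
```

The enumeration confirms, e.g., that inclusion chains with inclusion, that chaining two intersection atoms yields no conclusion, and that composing an inclusion with an equation strengthens the result to an equation.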
The next lemma is an easy corollary of the two previous lemmas, but it is
important because it describes the situation known from term rewriting as a
critical peak [DJ90] and is related to the generation of critical pairs.
The two tables differ in predicate signs at four places, 1:3 - 1:6 (row 1,
columns 3 through 6), where the relation resulting from Table 1 is stronger than
the one from Table 2. These cases must be distinguished when the superposition
rule (see below) is applied. Therefore we introduce the function Sup(s, t, ⊕, ⊗),
which selects the appropriate table. Its value is ⊕ ∘ ⊗ in the case s = t, and
Repl(⊕, ⊗) otherwise.
Superposition:  from C, s ⊕ t and D, u[s]_p ⊗ v derive C, D, u[t]_p ⊚ v, where ⊚ = Sup(s, u[s]_p, ⊕, ⊗)
3.1 Ordering of Words
Various orderings of terms and atoms are used extensively in the study of auto-
mated deduction. We will apply such an ordering to define a more specific proof
strategy for the system I, to study the possibility of rewriting wrt. the introduced
predicates and, finally, to define the model in the completeness proof. We assume
the existence of a simplification ordering '>' [DJ90] on ground terms which is
total (∀s ≠ t ∈ T(F) : s > t ∨ t > s), well-founded (∀t ∈ T(F) : {s : s < t}
is finite), monotone (∀u, s, t ∈ T(F) : s > t ⇒ u[s] > u[t]) and increasing
(∀s ∈ T(F) : u[s] ≠ s ⇒ u[s] > s). The ordering of other words is defined by the
multiset extension [DM79] of this ordering. Let M(T) denote the set of all finite
multisets of elements from T. Each element of M(T) can be represented by a
function β : T → ℕ such that β ≡ 0 except for some finite number of elements
of T; β(d) is the number of copies of d in the multiset β.
In the particular case of a total ordering of T, which is the only one considered here,
α ≻_m β means that there is some c ∈ T such that α(c) > β(c) ∧ ∀d > c : α(d) =
β(d). This is a lexicographic ordering comparing biggest components first. In the
general case it is known [DM79] that '≻_m' is total if '>' is total, and '≻_m' is
well-founded if '>' is well-founded.
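In the total case, the "compare biggest components first" reading of '≻_m' can be sketched directly (our own minimal rendering, with multisets as Python Counters):

```python
# Sketch of the multiset extension of a total order: scan the elements from
# biggest to smallest and decide at the first multiplicity disagreement.
from collections import Counter

def multiset_greater(a: Counter, b: Counter) -> bool:
    """a >_m b: some c has a(c) > b(c) while a and b agree on every d > c."""
    for c in sorted(set(a) | set(b), reverse=True):  # biggest component first
        if a[c] != b[c]:
            return a[c] > b[c]
    return False  # equal multisets are not strictly ordered
```

For example, {3, 1} ≻_m {2, 2, 2} because the biggest disagreeing element is 3, even though the right multiset has more elements.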
Writing a literal s ⊕ t, we indicate that s ≥ t. This explains why both signs
'≺' and '≻' are used. This rule, of course, is not applied to the conclusions of
the proof rules. We assume that any term is bigger than any predicate symbol.
A stronger positive predicate is bigger than a weaker one, the order between
negative predicates is reversed, and all negative predicates are bigger than the
positive ones:

≬̸ > ⊀ > ≉ > ≈ > ≺ > ≬
The literals mentioned explicitly in the premises of the proof rules are called
active. Various ways of selecting the active literals will lead to different proof
strategies. The maximal literal strategy requires that the active literals in the
premise clauses are the ones which are maximal wrt. the ordering defined above.
Stated explicitly, the strategy amounts to the following restrictions on the appli-
cation of the rules:
Reflexivity resolution: the literal s ⊀ s is maximal in the premise clause.
The restriction on the last rule is the only case where some active atom
(s ⊗ u) is not maximal in its clause. However, it is almost maximal, because the
maximal term s of the clause occurs in it. Another reason why this weakening
of the strategy is not essential is that the second clause in the premise merely
provides the context allowing application of the rule, and in fact the atom s ⊗ u is
not so "active". A particular consequence of this restriction is that the rule can
be applied only when its second premise is a non-Horn clause.
4 Rewriting Proofs
In the next section we show that if the empty clause cannot be deduced using the
maximal literal strategy, then a model exists satisfying the initial set of clauses.
The model is constructed from an appropriate set of ground atoms which force
all the initial clauses to be true. The notion of forcing requires the construction
of a deductive closure of a given set of literals. This section investigates the
rewriting proofs in which ground literals are rewritten to ground literals. The
obtained results will serve as a basis for the construction of the forcing set in the
completeness proof.
Although, eventually, only atoms will be used in the model construction, we
give a more general account: our definitions and lemmas apply to rewriting
with both negative and positive literals.
Rewriting of literals with the set-relations is based on the fact that the re-
lations satisfy the replacement properties from Lemma 2.6. For example, the impli-
cation s ≈ t ⇒ u[s]_p ≬ u[t]_p means that the atom u[s]_p ≬ u[t]_p can be derived
by applying the rule s ≈ t to the term u[s]_p. The following definition states what
kind of literals can be derived directly by applying the replacement property to
some set A of ground literals (also called axioms).
function '∘': ⊚ = ((⊕₁ ∘ ⊕₂)⁻¹ ∘ …)⁻¹ ∘ ⊕ₙ. The next definition puts all such
literals into the rewriting closure of A. This closure also contains atoms that are
trivially true.
(Display: the admissible valley-proof shapes — three alternative forms of rewriting sequence s ⇒ u ⇐ t for each of the four conclusions s ≬ t, s ≺ t, s ≻ t, and s ≈ t.)
Lemma 2.6 describes how peaks can be eliminated from rewriting sequences.
Let us take, for example, one implication from this lemma: s ≈ t ∧ u[s] ≻ v ⇒ u[t]
≻ v. The premise can be interpreted as the possibility of having a peak u[t] ⇐ u[s] ≻ v
in proofs, if both atoms s ≈ t and u[s] ≻ v are axioms. This peak can be "cut
down" by replacing it with the consequence u[t] ≻ v, if it is also among the axioms.
The following notions are commonly used in similar situations.
Since we have allowed both negative and positive literals to occur in one set of
axioms A, the unpleasant situation where both an atom a and its negation ¬a
belong to A* is possible. The set A is consistent if no such atom exists. A set
containing only atoms is obviously consistent. Although such a set will be of
main importance in the following section, we again formulate stronger results,
taking into account the general situation of possibly inconsistent sets of literals.
The next lemma, to be used in the completeness proof to construct confluent systems
incrementally, characterizes the rules that can be added to a confluent and consis-
tent system while preserving both these properties; it serves as a basis for the
construction of the forcing set in the next section. Here we again have a situation
different from the usual equational reasoning, because the rule s ≈ t overlaps
itself and produces the critical atom t ≈ t, which need not always be true.
Lemma 4.7 For a confluent and consistent system R and a rule r ∉ R*, the
system R ∪ {r} is confluent and consistent iff
(i) r does not have the form s ⊕ s, where ⊕ ∈ {≐, ≺, ≬}, and
(ii) for any critical literal l formed by any r' ∈ R ∪ {r} overlapping (or overlapped
by) r, l ∈ R*.
The proof system I is used to derive a clause from a given specification S "by
contradiction": to prove that a clause C = {a₁, …, aₙ} follows from S, one takes
the negation of C, namely the set of unit clauses neg(C) = {¬a₁; …; ¬aₙ},
adds it to S, and tries to derive the empty clause from the resulting set of clauses.
Proving refutational completeness, we have to show that if some set of ground
clauses S has no model, then the empty clause is derivable using the rules from I.
The usual way to prove this is to show that a model satisfying all
the clauses from S exists if the empty clause is not derivable from S. In our proof
we follow the ideas of [S-A92] and [BG93], which, in turn, develop the ideas
of [Bez90]. A similar proof using forcing is given in [PP91]. All these works are
concerned with first-order predicate calculus with equality. In [BG93], a similar
proof method is used for transitive relations.
Our construction proceeds in two main steps. Given a consistent set S of
clauses, we select a set of atoms R (Section 5.1) and show that R is a forcing
set for the clauses from S. Then (Section 5.2) we show that R can be used to
construct a multimodel which satisfies S.
We call a set of clauses S consistent if it does not contain the empty clause.
The redundancy of clauses in S will be defined during the model construction.
The redundancy notion was developed by Bachmair and Ganzinger [BG91] to cover
simplification techniques commonly used in theorem provers. Referring to this
notion, we fix a set S and assume it is consistent and relatively closed, meaning
that any application of a rule from I with premises from S produces a clause
that is in S or is redundant in S. The main result is
In the following sections we merely indicate the main steps and results needed
in the proof of this main theorem.
In the last case we say that A is a forcing set for S. We write A ⊩ w if A
forces w. For a consistent set S of ground clauses we will construct a set R of
ground atoms forcing S. All such atoms can be oriented into rules because of our
assumption about an ordering of terms; therefore we can treat R as a rewrite
system.
In rewriting proofs no term is bigger than the maximal term of the lit-
eral being proved. This admits an incremental construction of the model, starting
with A₀ and removing redundant literals.
Redundancy of clauses is defined relative to two sets: one set of clauses S
and one of ground atoms A. This is an intermediate notion; the final one refers
only to S. We have already fixed the set of clauses S to shorten our formulations.
For a given literal l and a set s of literals, the set s_l = {a ∈ s : a < l} contains
all the literals from s that are smaller than l.
Definition 5.3 A clause C ∈ S with max(C) = l is redundant in a set of
ground atoms A if either
- A_l ⊩ C, or
- S contains another clause C' < C with max(C') = l, such that A_l ⊩ C'.
The nature of the second condition of the definition may not be very clear, but
thanks to this condition the whole definition is a negated assertion about some
minimality of a clause. Statements of this kind are very convenient in inductive
proofs, like our proof of completeness. The redundancy of literals is based on
the redundancy of clauses and Lemma 4.7.
Definition 5.4 A ground literal l is redundant in a set of ground atoms A if
either
- l = s ⊕ s, where ⊕ ∈ {≐, ≺, ≬}, or
- A ∪ {l} contains a rule r overlapping l and forming the critical literal a, such
that A ⊩ a, or
- every clause C ∈ S with max(C) = l is redundant in A.
We write red(A, w) to indicate that w is redundant in A. Observe that Defi-
nitions 5.4 and 5.2 of redundancy and forcing are so related that all negative
literals that are not forced are redundant. Since any forced literal makes all
clauses containing it redundant, every negative literal turns out to be redundant.
After all the preliminary definitions, the definition of the forcing set is quite
short. The set is defined as the limit of a decreasing sequence of sets, which begins
with A₀ defined in (4). Succeeding sets are obtained by removing minimal redundant
literals. Suppose Aᵢ is already known, and let lᵢ be the minimal redundant literal
in Aᵢ:

A_{i+1} = A_i \ {l_i},    R = ⋂_{i∈ℕ} A_i     (5)
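The decreasing construction can be sketched generically (a toy rendering of our own: the redundancy test and the literal ordering are supplied as parameters, and only the finite case is treated, where the limit is reached after finitely many removals):

```python
# Toy sketch of construction (5): repeatedly remove the minimal redundant
# literal until none remains; the stable remainder plays the role of R.
def forcing_set(a0, is_redundant, key):
    """a0: finite set of literals; is_redundant(l, a): hypothetical redundancy
    test; key: the literal ordering. Returns the stable subset."""
    a = set(a0)
    while True:
        red = sorted((l for l in a if is_redundant(l, a)), key=key)
        if not red:
            return a
        a.remove(red[0])  # drop the minimal redundant literal and repeat
```

With literals modelled as integers and "even means redundant" as a stand-in test, the construction strips the even elements one at a time, smallest first.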
The next lemma shows that redundancy is preserved when taking the limit in
the definition of R, and that redundancy of a word in some Aᵢ is equivalent to
its redundancy in R.
Corollary 5.6 R is confluent.
With every atom a from R there is associated a clause from S which causes a to
be included in R. In [S-A92] such clauses are called regular, in [BG91] productive,
because they produce the atoms included in the forcing set. In [PP91], where
the forcing method is presented, no special notion is used for clauses of this kind.
Thus, we have shown that for a consistent and relatively closed set S of ground
clauses, the set of ground atoms R is a model of S in the sense that it forces
all the clauses from S. To complete the proof of Theorem 5.1 we need to show
that the existence of such an R implies the existence of a multialgebra A which
satisfies all the atoms from R* and only these. Then, from the definition of
forcing, it follows that A also satisfies all the clauses of S.
The rewriting closure R* defines a reflexive transitive relation '≺' on the set
of terms T(F). In multialgebras this partial (pre)order '≺' on terms is interpreted
as set inclusion: an atom s ≺ t means that ⟦s⟧^A ⊆ ⟦t⟧^A. The two other predicates of
our language also have a natural interpretation in partial-order terms:
- s ≈ t means that the sets ⟦s⟧^A and ⟦t⟧^A are equal minimal elements in
the partial order of nonempty sets, i.e., they denote the same set with one
element;
- s ≬ t means that there exists some minimal element a such that a ∈ ⟦s⟧^A
and a ∈ ⟦t⟧^A.
The relation '≺' on the set T(F) is a partial preorder, because different terms
may have the same value set. To turn it into a partial order we have to take the
quotient of R* modulo '≐', which was defined in (1). Since ≐ denotes set equality, it
is obviously a congruence: reflexivity and transitivity follow from the analogous
properties of inclusion '≺', symmetry from (1), and the replacement property
is given in (2.3).
First, given a set R of ground atoms, we construct a partially ordered set
PO(R) = (C, ⊑). Then we extend the signature with new constants and R with
new atoms, so that PO(R) defines a multialgebra satisfying all the atoms in R.
Let C = R*/≐ be the quotient of R* modulo '≐'; [t] = {s ∈ T(F) : s ≺ t ∈
R* ∧ t ≺ s ∈ R*} denotes the equivalence class in C of a term t, and ⊑ =
{⟨[s], [t]⟩ : s ≺ t ∈ R*} is a partial order on C ('⊏' is the irreflexive part of this
relation). (That ⟨C, ⊑⟩ is well defined, i.e. that all atoms in one equivalence class
stand in the same relations, follows from Table 1: for any atom s ≐ t, any atom t ≺ u
or t ≻ u is enough to derive also the corresponding atom s ≺ u or s ≻ u.) Let D = {[s] : s ∈ T(F), s ≈ s ∈ R*}
be the set of deterministic elements in C, D(T) = {S ∈ D : S ⊑ T} be the
set of deterministic elements which are smaller than or equal to T ∈ C, and
M = {S ∈ C : there is no T ∈ C with T ⊏ S} be the set of minimal elements in C. M(S)
denotes the set of elements from M that are smaller than or equal to S. We will
write M[t] (or D[t]) instead of M([t]) (respectively, D([t])).
The set D is a subset of M, because from s ≈ s and s ≻ u it follows that
s ≈ u, meaning that no elements lie below the class [s] if [s] ∈ D. The set
D is a "candidate" to be the carrier of a multialgebra, and D(T) should be an
interpretation mapping defining the multialgebra. Definition 2.2 tells us what
properties D[_] should have in order to be an interpretation mapping:
PO1. D[f(t₁, …, tₙ)] = ⋃ {D[f(a₁, …, aₙ)] : aᵢ ∈ D[tᵢ]}
for any f ∈ Fⁿ, {t₁, …, tₙ} ⊆ T(F);
PO2. D[s] = {[s]} ⇔ s ≈ s ∈ R*;
PO3. D[s] ⊆ D[t] ⇔ s ≺ t ∈ R*;
PO4. D[s] ∩ D[t] ≠ ∅ ⇔ s ≬ t ∈ R*.
If D[_] satisfies PO1-PO4, then a multialgebra A satisfying R can be defined:
MA1. the carrier S^A = D;
MA2. for any constant c ∈ F: c^A = D[c];
MA3. for any f ∈ Fⁿ and {[d₁], …, [dₙ]} ⊆ D:
f^A([d₁], …, [dₙ]) = D[f(d₁, …, dₙ)].
In this case ⟦_⟧^A = D[_], which follows from PO1, and we say that R defines
the multialgebra A. The next result is important, because the multialgebra is
constructed from positive atoms only, while clauses from S may also contain
negative atoms. We must be sure that the multialgebra makes true only the atoms
derivable from the forcing set R.

Lemma 5.9 If R defines a multialgebra A then, for any atom a: a ∈ R* ⇔
⟦a⟧^A is true.
However, the mapping D[_] defined from the forcing set R may violate any of
the requirements PO1-PO4. We might therefore consider the mapping M[_] in-
stead, but this one too may violate these requirements. To overcome these problems
we have to extend C (and the mapping D[_]) with new minimal elements. For in-
stance, there may be no (deterministic) element validating the atom s ≬ t ∈ R*,
or even no such element included in a given term t (which would therefore denote
the empty set). We do not give here the rather elaborate details of this extension.
Its result is that the signature is extended with new constants, and new atoms
are added imposing the required properties on these new elements. In particu-
lar, all new elements are deterministic, which makes the mappings D[_] and M[_]
coincide. The crucial property of this extension is that it is conservative.
Definition 5.10 Let F ⊆ F₁ be two signatures and R, R₁ two sets of atoms
over the signatures F, resp. F₁. R₁ is a conservative extension of R if for any F-atom
a, a ∈ R₁* ⇔ a ∈ R*.
Thus, we can complete the set R with the elements and atoms needed to construct
a multialgebra satisfying exactly the atoms which are members of R*. Since
R ⊩ S, this means that we obtain a multialgebraic model of S. This last step of
the construction is expressed in:

Theorem 5.11 For any atom set R there exists an atom set R₁ that is a con-
servative extension of R and defines a multialgebra.
This ends the proof of the completeness theorem which also yields:
6 Conclusion
References
[Bez90] M. Bezem. Completeness of Resolution Revisited. TCS, 74, pp.227-237, (1990).
[BG91] L. Bachmair, H. Ganzinger. Rewrite-Based Equational Theorem Proving with
Selection and Simplification. Technical Report MPI-I-91-208, Max-Planck-
Institut f. Informatik, Saarbrücken, (1991).
[BG93] L. Bachmair, H. Ganzinger. Rewrite Techniques for Transitive Relations. Tech-
nical Report MPI-I-93-249, Max-Planck-Institut f. Informatik, Saarbrücken,
(1993). [To appear in LICS'94.]
[DJ90] N. Dershowitz, J.-P. Jouannaud. Rewrite systems. In: J. van Leeuwen (ed.)
Handbook of Theoretical Computer Science, vol. B, chap. 6, pp.243-320. Amster-
dam: Elsevier, (1990).
[DM79] N. Dershowitz, Z. Manna. Proving termination with multiset orderings. Com-
munications of the ACM, 22:8, pp.465-476, (1979).
[DO92] A. Dovier, E. Omodeo, E. Pontelli, G.-F. Rossi. Embedding finite sets in a logic
programming language. LNAI 660, pp.150-167, Springer-Verlag, (1993).
[Hes88] W.H. Hesselink. A Mathematical Approach to Nondeterminism in Data Types.
ACM ToPLaS 10, pp.87-117, (1988).
[Hus92] H. Hussmann. Nondeterministic algebraic specifications and nonconfluent
term rewriting. Journal of Logic Programming, 12, pp.237-235, (1992).
[Hus93] H. Hussmann. Nondeterminism in Algebraic Specifications and Algebraic Pro-
grams. Birkh~user Boston~ (1993).
[Jay92] B. Jayaraman. Implementation of Subset-Equational Programs. Journal of
Logic Programming, 12:4, pp.299-324, (1992).
[Kap88] S. Kaplan. Rewriting with a Nondeterministie Choice Operator. TCS, 56:1,
pp.37-57, (1988).
[KW94] V. Kriau~iukas, M. Walicki Reasoning and Rewriting with Set-Relations I:
Ground-Completeness. Technical Report no.96, Dept. of Informatics, Univer-
sity of Bergen (1994).
[LA93] J. Levy, J. Agusti. Bi-rewriting, a term rewriting technique for monotonic or-
der relations. In RTA '93, LNCS, 690, pp.17-31. Springer-Verlag, (1993).
[PP91] J. Pais, G.E. Peterson. Using Forcing to Prove Completeness of Resolution and
Paramodulation. Journal of Symbolic Computation, 11:(1/2), pp.3-19, (1991).
[S-A92] R. Socher-Ambrosius. Completeness of Resolution and Superposition Cal-
culi. Technical Report MPI-I-92-224, Max-Planck-Institut f. Informatik,
Saarbrficken, (1992).
[SD86] J. Schwartz,R. Dewar,E. Schonberg,E. Dubinsky. Programming with sets, an
introduction to SETL. Springer Verlag, New York, (1986).
[Sto93] F. Stolzenburg. An Algorithm for General Set Unification. Workshop on Logic
Programming with Sets, ICLP'93, (1993).
[Wa193] M. Walicki. Algebraic Specifications of Nondeterminism. Ph.D. thesis, Insti-
tute of Informatics, University of Bergen, (1993).
[WM95] M. Walicki, S. Meldal. A Complete Calculus for Multialgebraic and Func-
tional Semantics of Nondeterminism. [to appear in ACM ToPLaS, (1995).]
[WM95b] M. Walicki, S. Meldal. Multialgebras, Power Algebras and Complete Calculi
of Identities and Inclusions, Recent Trends in Data Type Specification, LNCS,
906, (1995).
Resolution Games and Non-Liftable Resolution
Orderings
Hans de Nivelle,
Department of Mathematics and Computer Science,
Delft University of Technology,
Julianalaan 132, 2628 BL, the Netherlands,
email: nivelle@cs.tudelft.nl
Abstract
We prove the completeness of the combination of ordered resolution and
factoring for a large class of non-liftable orderings, without the need for
any additional rules such as saturation. This is possible because of a new proof
method which avoids making use of the standard ordered lifting theorem.
This proof method is based on resolution games.
1 Introduction
Resolution was introduced in ([Robins65]) and is still among the most successful
methods for automated theorem proving in first-order logic (see [ChangLee73]).
Although resolution is efficient, it is not efficient enough. Therefore so-called
refinements of resolution have been designed, which can improve efficiency quite
a lot without losing completeness. In this paper we consider ordering re-
finements. Ordering refinements are a restriction of the resolution rule. With
refinements two types of improvement can be gained: First, resolution refine-
ments simply improve efficiency, which means that less memory will be used
and less time will be spent on finding a proof if one exists. Second, it can be
shown that certain resolution refinements are terminating on certain clause sets
for which unrestricted resolution would be non-terminating. Thus it is possible
to obtain decision procedures with resolution. This approach was initiated by
([Joy76]) and ([Zam72]); ([FLTZ93]) contains an overview of the results reached
in this field. The general strategy for proving the completeness of an ordering
refinement is as follows: (1) Prove the completeness of the refinement at the
ground level. (2) Then show that a refutation of a certain set of ground clauses
can be lifted to the non-ground level. For the first part it has been shown
that resolution with every ordering on ground literals is complete. For the sec-
ond part, the ordering must have a special property which is called liftability:
It has been proven in [Robins65] that there exists an algorithm that takes as input
two atoms (or literals), computes a most general unifier if they are unifiable,
and reports failure otherwise.
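The unification algorithm referred to above can be sketched as follows. This is a textbook-style recursive sketch, not Robinson's original presentation; the term encoding (variables as bare strings, compound terms and constants as `(name, args)` tuples) is an assumption of this illustration.

```python
def substitute(term, subst):
    """Apply a substitution (dict: variable -> term) to a term."""
    if isinstance(term, str):  # variable
        return substitute(subst[term], subst) if term in subst else term
    name, args = term
    return (name, tuple(substitute(a, subst) for a in args))

def occurs(var, term, subst):
    """Occur check: does var occur in term under subst?"""
    term = substitute(term, subst)
    if isinstance(term, str):
        return term == var
    return any(occurs(var, a, subst) for a in term[1])

def mgu(s, t, subst=None):
    """Return a most general unifier of s and t, or None on failure."""
    if subst is None:
        subst = {}
    s, t = substitute(s, subst), substitute(t, subst)
    if s == t:
        return subst
    if isinstance(s, str):
        if occurs(s, t, subst):
            return None                       # occur check fails
        return {**subst, s: t}
    if isinstance(t, str):
        return mgu(t, s, subst)
    (f, sa), (g, ta) = s, t
    if f != g or len(sa) != len(ta):
        return None                           # clash of function symbols
    for a, b in zip(sa, ta):
        subst = mgu(a, b, subst)
        if subst is None:
            return None
    return subst
```

For instance, unifying p(X, f(Y)) with p(a, f(b)) yields {X ↦ a, Y ↦ b}, while unifying X with f(X) fails by the occur check.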
Note that, because ⊐ is an order and clauses are finite, every non-empty clause
has at least one maximal element.
Resolution is a refutation method: if one wants to prove a formula, one
has to try to refute its negation.
This property ensures that if a literal AiΘ is maximal in a clause {A1Θ, ..., ApΘ},
then its uninstantiated counterpart Ai in {A1, ..., Ap} is also maximal.
This makes lifting possible. The next theorem is the standard ordered resolu-
tion theorem.
Theorem 1.7 Ordered resolution with ordered factoring is complete for any
liftable L-order.
L-orders are a slight generalization of the better-known A-orders. An A-
order is an order on atoms, which is extended to literals by the rule A ⊐ B ⟹
A ⊐ ¬B, ¬A ⊐ B, ¬A ⊐ ¬B. Although every extension of an A-order is
an L-order, the converse is not true. For example, P ⊐ Q ⊐ ¬Q ⊐ ¬P is
an L-order, but not the extension of an A-order. It has been known since
([KH69]) that A-ordered resolution and factoring is complete.
2 Non-Liftable Orderings
We will now give the two completeness theorems for non-liftable orderings. For
the proof we develop resolution games in the next section. After that we
prove the two completeness theorems in Section 5.
Theorem 2.1 Let ⊐ be an L-order such that
REN If A ⊐ B, then for all renamings AΘ1 of A and BΘ2 of B, we must have
AΘ1 ⊐ BΘ2,
SUBST For every A and strict instance AΘ of A, it must be that AΘ ⊐ A.
Then the combination of ⊐-ordered resolution and factoring is complete.
Theorem 2.1 implies the completeness of resolution with any relation that is in-
cluded in an order satisfying the conditions. An example is the ordering defined
by L1 ⊐ L2 iff #L1 > #L2. Another possibility is an alphabetic, lexicographic
ordering on term structure.
Theorem 2.2 Let ⊐ be an order such that
REN If A and B contain exactly the same variables, and A ⊐ B, then for all
substitutions Θ1 and Θ2 such that (1) AΘ1 is a renaming of A, (2) BΘ2
is a renaming of B, (3) AΘ1 and BΘ2 have exactly the same variables,
we have AΘ1 ⊐ BΘ2.
Then ⊐-ordered resolution with factoring is complete for every set of decom-
posed clauses.
It is impossible that p(X, Y) ⊐ q(X, Y) and q(Y, X) ⊐ p(X, Y). This would
imply p(X, Y) ⊐ p(Y, X), which in turn would imply p(X, Y) ⊐ p(X, Y).
The ordering <, together with the E+-class, defined in ([FLTZ93]), p. 82,
satisfies the conditions mentioned here. There is no room for details here, but
the check is easy.
3 Resolution Games
In this section we define resolution games and give a completeness result for
resolution games. The proof is given in the next section. We need precise con-
trol over the factoring rule. Therefore it is necessary to define clauses as multisets
instead of ordinary sets. So we define:
Definition 3.1 A multiset is a set which is able to distinguish how often an el-
ement occurs in it. We write [A1, ..., Ap] for the multiset containing A1, ..., Ap.
Unlike in the set {A1, ..., Ap}, it is meaningful to repeat elements in the list.
The union of two multisets S1 ∪ S2 is obtained by summing the number of occur-
rences of each element. The difference S1\S2 of two multisets is obtained by
subtracting, for each element, the number of occurrences in S2 from the number
of occurrences in S1. If this results in a negative number, then the number of
occurrences is set to 0.
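The multiset operations of Definition 3.1 correspond closely to Python's `collections.Counter`, which can serve as a quick sanity check of the definition; the use of `Counter` here is of course an illustration, not part of the paper.

```python
from collections import Counter

# Multisets per Definition 3.1, sketched with collections.Counter.
# Union sums occurrence counts; difference truncates at zero.
s1 = Counter(["a", "a", "b"])        # the multiset [a, a, b]
s2 = Counter(["a", "b", "b", "c"])   # the multiset [a, b, b, c]

union = s1 + s2   # sums counts: [a, a, a, b, b, b, c]
diff = s1 - s2    # subtraction truncated at 0: [a]

assert union == Counter({"a": 3, "b": 3, "c": 1})
assert diff == Counter({"a": 1})
```

Note that `Counter` subtraction discards elements whose count would become negative, exactly as the definition requires.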
• M is a set of attributes,
Definition 3.3
Resolution Let c1 and c2 be two clauses such that (1) c1 can be written as
c1 = [r : R1] ∪ [a1 : A1, ..., ap : Ap], and c2 can be written as
c2 = [¬r : R2] ∪ [b1 : B1, ..., bq : Bq], (2) r : R1 is ≺-maximal in c1, and
¬r : R2 is ≺-maximal in c2. Then [a1 : A1, ..., ap : Ap] ∪ [b1 : B1, ..., bq :
Bq] is a resolvent of c1 and c2.
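At the ground level, the resolution rule of Definition 3.3 can be sketched directly. The encoding below (clauses as lists of `(literal, attribute)` pairs, a literal as an `(atom, polarity)` pair, and `greater` as an assumed strict order on indexed literals) is an illustration of the rule, not the paper's notation.

```python
# A ground sketch of the resolution rule of Definition 3.3.

def is_maximal(elem, clause, greater):
    """elem is maximal iff no member of the clause is strictly greater."""
    return not any(greater(other, elem) for other in clause)

def resolvents(c1, c2, greater):
    """All resolvents of c1 and c2 on a maximal complementary pair."""
    out = []
    for i, ((atom1, pol1), a1) in enumerate(c1):
        if not is_maximal(((atom1, pol1), a1), c1, greater):
            continue
        for j, ((atom2, pol2), a2) in enumerate(c2):
            if atom1 == atom2 and pol1 != pol2 and \
               is_maximal(((atom2, pol2), a2), c2, greater):
                # remainder of c1 joined with remainder of c2
                out.append(c1[:i] + c1[i+1:] + c2[:j] + c2[j+1:])
    return out
```

For example, with literals ordered by their numeric attributes, resolving [r : 2, a : 1] against [¬r : 3, c : 1] on the maximal pair r : 2 / ¬r : 3 gives the resolvent [a : 1, c : 1].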
There are two sets G and N. The set G contains all the derived clauses, and N
contains the clauses of the last generation. The game starts with G = ∅ and
N = C. Then:
2. Now the opponent can compute any ordered resolvent, or ordered factor,
of clauses in G. The result is put in N. He can derive as many clauses as
he wants in one turn, but he cannot use the new clauses because they are
in N. When he is finished, the defender is on turn again.
We have defined the resolution game in such a way that the defender can only
affect newly derived clauses. We could also have defined the resolution game
in such a way that the defender is allowed to reduce any clause. In that case
Theorem 3.5 still holds.
The resolution game is different from lock or indexed resolution [Boyer71], be-
cause in lock resolution the resolvent inherits the indices from the parent clauses
without any changes. We have the following theorem:
We call the first part of the theorem completeness, and the second part sound-
ness. The proof of soundness is not difficult: all the actions of the opponent
are semantically sound. The defender can play in such a manner that his actions
are sound, by never deleting a literal. This guarantees that the empty clause
We will show that the completeness of game Γ′ implies the completeness of Γ.
It is for this reason that it is sufficient to consider games in which the order is
total. An opponent of a set of clauses C playing Γ can simultaneously play a
game using game Γ′ as defender. He will copy the moves from the opponent of
Γ′ to Γ, and copy the moves from the defender of Γ to Γ′. This goes as follows:
The opponent of a set of clauses C with game Γ starts a simultaneous game as
defender of C using game Γ′. Then he proceeds as follows:
2. After this he can imitate the reductions made by the defender on Γ onto
Γ′. This is possible because of COPY2.
Because the opponent of Γ′ will derive the empty clause if the initial clause
set is unsatisfiable, the opponent of Γ will derive the empty clause and win
the resolution game. So it is sufficient to prove the completeness of resolution
games for those resolution games in which ≺ is a well-order. We will do this in
the next section. We will end with an example:
The clauses are sorted according to ≺, so each last literal is the selected literal.
If the defender does not make any reductions then the resolvent [¬a : 2, c : 2] is
possible. This clause can be reduced to, for example, [¬a : 0, c : 0], [c : 1, ¬a : 2],
or [¬a : 0, c : 2]. The defender can also replace the initial clause [c : 2, ¬b : 2]
by [¬b : 1, c : 2]. In that case the only possible resolvent is [¬b : 1]. Whatever
reductions the defender makes, the empty clause can always be derived.
4 Completeness of Resolution Games
In this section we give the completeness proof of the resolution game. For this
proof we need the following notion:
We will prove completeness of resolution games by showing that every closed set
that does not contain the empty clause is satisfiable. This implies completeness.
Suppose that this holds, while resolution games are not complete. Then there is a
clause set C of a resolution game G such that C is unsatisfiable, and whatever
the opponent does, the defender can block derivation of the empty clause. Then,
when the opponent produces all possible clauses in each move, the union
of the successive generations C′ = ⋃ i≥0 Gi is a closure of C. By assumption this
set does not contain the empty clause. Then C′ must be satisfiable, and this
implies that C is satisfiable. This is a contradiction.
We use an adaptation of a proof in [Bezem90] of the completeness of A-ordered
hyperresolution. The proof is perhaps a bit disappointing, because it does
not use the game structure, but it involves less technicality than the proof in
([Nivelle94b]), which is based on games. We adapt the proof in two steps for
the clarity of the presentation. We first give the proof for the case in which the
defender never makes a reduction; in that case we have proven the completeness
of a variant of lock resolution. After that we make some more adaptations to
obtain the completeness of the full resolution game.
We will show that every closed set of clauses has a formal model, and that this
implies that every closed set of clauses has a model.
Lemma 4.5 Let C be a closed set (in which resolvents and factors are never
reduced), such that the empty clause is not in C. Then there exists an intersection
set I of C that satisfies MAXUNIQUE.
1. I0 = ∅,
End of proof
1. I0 = ∅,
End of proof
We will give two examples demonstrating that resolution games are not complete
when (1) the condition that A : a′ ≺ A : a in reductions, or (2) the condition
that ≺ is well-founded, is dropped. (So, for example, replacing a : 1 by a : 2 is a
valid reduction in the first case.)
Example 4.8 Define G = (P, A, ≺) with P = {a, b}, A = ℕ, and
l1 : n1 ≺ l2 : n2 iff n1 < n2. The clause set C =
5 Application of Resolution Games
We are now in the position to prove Theorems 2.1 and 2.2. For both theorems
the strategy is the same. Each unsatisfiable clause set has a finite set Cg of
ground instances which is unsatisfiable. From this unsatisfiable set we con-
struct the resolution game, by taking all the ground literals in Cg. We use the
attributes of the resolution game to indicate the non-ground literals by which
the ground literals are represented. Then it is possible for the defender to make
his moves in such a manner that the resulting game corresponds to the be-
haviour of the non-liftable ordering. Because the empty clause will be derived
in the game, the empty clause will be derived with the non-liftable ordering.
We begin with Theorem 2.1. Assume that a set of clauses C is unsatisfiable.
By Herbrand's theorem there exists a finite set Cg = {c̄1, ..., c̄n} of ground
instances of clauses in C, such that Cg is unsatisfiable. Let Cused = {c1, ..., cn} ⊆ C
be the set of clauses for which each c̄i is an instance of ci. (Here Cused is written
with possible repetitions.)
Now construct the following resolution game. Define G = (P, A, ≺), where
We will show that G is a valid resolution game. For this we have to show that
≺ is an order on L × A, and that ≺ is well-founded on L × A. The first follows
trivially from the fact that ⊐ is an order.
For the second, let us define a1 : A1 ≡ a2 : A2 if a1 = a2 and A1 is equivalent
with A2 (i.e., they are instances of each other).
This is an equivalence relation with only a finite number of equivalence classes,
and ≺ does not distinguish elements of these classes. Then, because every
descending sequence of ≺ must be finite, ≺ is well-founded.
We will now describe how the resolution game is played.
• The resolution game starts with the following set of clauses: For every
ci = {A1, ..., Ap}, the initial set Cgame contains a clause
[A1Θ : A1, ..., ApΘ : Ap]. Here Θ is a substitution such that ciΘ = c̄i. In
his first move the defender does not affect the indices. Now we have:
• ≺ is defined by: (a1 : A1) ≺ (a2 : A2) if one of the following holds
We will show that this is a valid resolution game. In order to show that the
relation ≺ is a well-founded order, it is sufficient to show that the relations
mentioned under (1) and (2) are well-founded orders. It is easily seen that the
relation under (1) is a well-founded order. For (2) it is easily checked that (2)
defines an order.
It remains to show that the order defined under (2) is well-founded. In the
same way as in the proof of Theorem 2.1, an equivalence relation ≡ can be
defined such that ≺ does not distinguish indexed literals that are equivalent
under this relation. Now ≡ has only a finite number of equivalence classes.
Because of this the ordering defined under (2) is well-founded, and hence the
composition of (1) and (2) is well-founded.
Now the resolution game proceeds in exactly the same way as in the proof of
Theorem 2.1, and a ⊐-ordered refutation of C can be extracted from this game
in the same manner.
7 Acknowledgements
I would like to thank Trudie Stoute for her advice on English and Tanel Tammet
for his improvements in the formulation of Theorem 2.1.
References
[Baum92] P. Baumgartner, An ordered theory calculus, in LPAR '92, Springer
Verlag, Berlin, 1992.
[BG90] L. Bachmair, H. Ganzinger, On restrictions of ordered paramodu-
lation with simplification, CADE 10, pp. 427-441, Kaiserslautern,
Germany, Springer Verlag, 1990.
[Bezem90] M. Bezem, Completeness of resolution revisited, Theoretical Com-
puter Science 74, pp. 227-237, 1990.
[Boyer71] R.S. Boyer, Locking: A restriction of resolution, Ph.D. Thesis,
University of Texas at Austin, Texas, 1971.
[ChangLee73] C.-L. Chang, R. C.-T. Lee, Symbolic logic and mechanical theorem
proving, Academic Press, New York, 1973.
[FLTZ93] C. Fermüller, A. Leitsch, T. Tammet, N. Zamov, Resolution meth-
ods for the decision problem, Springer Verlag, 1993.
[Joy76] W.H. Joyner, Resolution strategies as decision procedures, J.
ACM 23, 1 (July 1976), pp. 398-417.
[KH69] R. Kowalski, P.J. Hayes, Semantic trees in automated theorem
proving, Machine Intelligence 4, B. Meltzer and D. Michie (eds.),
Edinburgh University Press, Edinburgh, 1969.
[Nivelle94b] H. de Nivelle, Resolution games and non-liftable resolution or-
derings, Internal report 94-36, Department of Mathematics and
Computer Science, Delft University of Technology, 1994.
[Robins65] J.A. Robinson, A machine oriented logic based on the resolution
principle, Journal of the ACM, Vol. 12, pp. 23-41, 1965.
[Tamm94] T. Tammet, Separate orderings for ground and non-ground literals
preserve completeness of resolution, unpublished, 1994.
[Zam72] N.K. Zamov, On a bound for the complexity of terms in the
resolution method, Trudy Mat. Inst. Steklov 128, pp. 5-13, 1972.
On Existential Theories of List Concatenation
Klaus U. Schulz
1 Introduction
of equations and disequations between terms with concatenated lists. For effi-
ciency reasons, however, satisfiability is only tested in a rather approximate
way. Colmerauer introduces a non-standard "naive" concatenation on a compli-
cated "extended domain" to explain the precise answer behaviour of the solver
declaratively. The question arises whether satisfiability of equations and dise-
quations between terms with concatenated lists is decidable.
Approximating the formal model of PROLOG III, we consider the algebra
of finite trees with lists and the algebra of rational trees with lists. In both
domains, concatenation is interpreted as a partial operation acting on lists only,
free function symbols are interpreted as tree constructors. In view of the results of
Quine and Büchi-Senger we only consider the existential fragment of the theories
of these two structures. The syntax is more or less identical to the syntax of
PROLOG III for constraints over lists. The "list constraint systems" that will
be considered are finite sets of equations and disequations between terms with
concatenated lists. Arbitrary existential sentences correspond to disjunctions of
list constraint systems.
The paper is structured as follows. Section 2 starts with central definitions.
In Section 3 we show that solvability of list constraint systems over the algebra
of finite trees with lists is decidable. This implies that the existential theory of
this structure is decidable. The decision procedure is based on a decomposition
technique that was introduced in [2] in the context of disunification in the union
of disjoint equational theories. A variant of Makanin's algorithm [7] deciding
solvability of word equations is needed.
In Section 4 we consider the algebra of rational trees with lists as solution
domain. It is shown that solvability of equational list constraint systems is decid-
able. Thus the positive existential theory of this algebra is decidable. We sketch
how the problem of solvability of arbitrary list constraint systems over the alge-
bra of rational trees with lists may be traced back to the following problem: given
a word equation with variables x1, ..., xn, and given a finite set of constraints of
the form |xi| = |xj| demanding that the lengths of the (words to be substituted
for the) variables xi and xj have to be the same, decide if the word equation has
a solution that satisfies these restrictions. Decidability of word equations with
these length constraints seems to be a deep problem. G.S. Makanin (personal
communication) has shown that a primitive recursive decision procedure would
give a primitive recursive algorithm for deciding solvability of equations in free
groups. It is known that Makanin's algorithm for free groups [8] is not primitive
recursive [5].
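The problem of word equations with length constraints can be made concrete with a small bounded search. The sketch below is purely illustrative and mine, not the paper's: it enumerates candidate words up to a fixed length over a two-letter alphabet, so it can exhibit solutions but is of course not a decision procedure.

```python
from itertools import product

def bounded_solve(lhs, rhs, variables, length_pairs, max_len=3):
    """Search for a solution of a word equation lhs = rhs (each side a
    sequence of letters and variable names) satisfying the length
    constraints |x| = |y| for (x, y) in length_pairs.  Bounded search
    over words up to max_len; returns an assignment dict or None."""
    alphabet = "ab"

    def expand(side, assign):
        return "".join(assign.get(sym, sym) for sym in side)

    # candidate words of length 0 .. max_len
    words = [""]
    for n in range(1, max_len + 1):
        words += ["".join(w) for w in product(alphabet, repeat=n)]

    for choice in product(words, repeat=len(variables)):
        assign = dict(zip(variables, choice))
        if any(len(assign[x]) != len(assign[y]) for x, y in length_pairs):
            continue  # violates a length constraint |x| = |y|
        if expand(lhs, assign) == expand(rhs, assign):
            return assign
    return None
```

For example, the equation x·a = a·y with the constraint |x| = |y| has solutions within the bound, while x = x·a has none at any bound.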
List constraint systems
Following the syntax of PROLOG III we shall use an infinite set of list construct-
ing symbols for representing lists. For each natural number k, let [ ]k denote a
function symbol of arity k. Let Σ_L := {[ ]k ; k ≥ 0}. Let Σ_F denote a finite set of
free function symbols, disjoint from Σ_L, containing at least one constant and one
non-constant function symbol. The complete signature that we shall use con-
tains binary concatenation "∘" and all symbols from Σ_LF := Σ_L ∪ Σ_F. X is
a countably infinite set of variables. In the sequel, possibly subscripted symbols
x, y, z, ... always denote variables.
The set of all (F- and L-) terms is recursively defined as follows:
Terms [ ]n(t1, ..., tn) will be written in the form [t1, ..., tn]. Since the infix
symbol "∘" is interpreted as concatenation, we omit brackets in expressions
l1 ∘ ... ∘ ln. For n = 0, an expression l1 ∘ ... ∘ ln denotes the empty list [ ]0. Of
course many "natural" expressions (such as those using "cons" and "conc", or
Prolog-style [t | l]) are not treated as terms. It is simple to see that for all these
expressions there are terms which behave in the same way, in any relevant sense.
In order to keep proofs simple we have chosen a compact syntax which captures
all conventional constructions for a combination of terms with lists.
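The term syntax above, with list construction and concatenation as a partial operation defined on lists only, can be modelled directly. The tagged-tuple representation below is a choice of this illustration, not the paper's.

```python
# A sketch of the list term syntax: [t1, ..., tn] built with the
# symbols [ ]_n, plus binary concatenation "o" restricted to lists.

def make_list(*elems):
    """Build the list term [t1, ..., tn]; make_list() is the empty list."""
    return ("list", list(elems))

def conc(s, t):
    """Concatenation, a partial operation defined on lists only."""
    if s[0] != "list" or t[0] != "list":
        raise TypeError("o is only defined on lists")
    return ("list", s[1] + t[1])
```

Concatenating [a] with [b, c] yields [a, b, c], while applying "∘" to a non-list term (e.g. a tree built from a free function symbol) is rejected, mirroring the partiality of concatenation in the two algebras considered.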
A list constraint system is a finite set Γ of equations and disequations of the
form
{s1 ≐ t1, ..., sn ≐ tn, sn+1 ≠ tn+1, ..., sn+m ≠ tn+m}
- (Z_V, Z_C) is a VC-declaration of Z ⊆ X,
- Γ is a finite set of equations and disequations between terms in
T(Σ ∪ Z_C, Z_V), and
- < is a linear ordering on Z.
Flat pure list constraint systems with linear constant restriction
A flat pure list constraint system with linear constant restriction is a quadruple
(Γ_L, Z_V, Z_C, <) where
- (Z_V, Z_C) is a VC-declaration of Z ⊆ X,
- Γ_L is a finite set of disequations of the form x ≠ y (x, y ∈ Z_V) and of
equations of the form x = l1 ∘ ... ∘ ln (n ≥ 0) where x ∈ Z_V and the li have
the form z ∈ Z_V or the form [y] with y ∈ Z = Z_V ∪ Z_C,
- < is a linear ordering on Z.
Let M be a set. With L^M_nested,fin we denote the set of all finite, possibly nested
lists whose elements that are not themselves lists are in M. This domain contains
only finite trees.
A solution of (Γ_L, Z_V, Z_C, <) is a mapping σ which assigns to every variable
x ∈ Z_V an element x^σ ∈ L^X_nested,fin such that the canonical extension of σ on pure
L-terms solves all equations and disequations of Γ_L and the constant c ∈ Z_C
does not occur in x^σ for all x < c (x ∈ Z_V). The solution σ is called compatible
with < if x1^σ is never a proper subtree of x2^σ for x2 < x1 (x1, x2 ∈ Z_V). Note
that, by definition, every solution σ is restrictive in the sense that x^σ cannot be
a variable (x ∈ Z_V).
In the third step of the algorithm, systems of this type are created. Nested
lists as solution values may be necessary since variables may occur among the
elements of lists in equations. This is the important distinction to the following
type of system.
Shallow pure list constraint systems with linear constant restriction
Let (Γ_L, Z_V, Z_C, <) be a flat pure list constraint system with linear constant
restriction. The shallow version (Γ̇_L, Z_V, Z_C ∪ Ż_V, <̇) of (Γ_L, Z_V, Z_C, <) is ob-
tained by
(1) introducing the new set of constants Ż_V := {ẋ ; x ∈ Z_V},
(2) replacing every term [x] in Γ_L with an embedded occurrence of a variable
x ∈ Z_V by an expression [ẋ],
(3) using the linear ordering <̇ which is the extension of < on Z_V ∪ Z_C ∪ Ż_V
where each constant ẋ is the immediate successor of x with respect to <̇
(x ∈ Z_V).
The second components of the output pairs will have this form. The domain
L^{X∪Ż_V}_flat contains all lists of the form [l1, ..., ln] (n ≥ 0) with elements li ∈ X ∪ Ż_V.
A solution of (Γ̇_L, Z_V, Z_C ∪ Ż_V, <̇) is a mapping σ which assigns to every
x ∈ Z_V a value x^σ ∈ L^{X∪Ż_V}_flat such that the canonical extension⁴ of σ on terms
in Γ̇_L solves all equations and disequations of Γ̇_L and the constant c ∈ Z_C ∪ Ż_V
does not occur in x^σ for all x <̇ c (x ∈ Z_V). As for flat pure systems, each solution
is restrictive by definition.
³ Where c^σ := c for c ∈ Z_C, [l1, ..., ln]^σ = [l1^σ, ..., ln^σ], and (l1 ∘ l2)^σ is the concatenation
of l1^σ and l2^σ.
⁴ Defined as above, with ẋ^σ = ẋ for ẋ ∈ Ż_V. Note that the canonical extension of σ
assigns to both sides of each equation of Γ̇_L again values in L^{X∪Ż_V}_flat since there are
no variables in element positions.
Proof. (Sketch.) Suppose that Γ̇_L has m disequations. It is first shown that
solvability of (Γ̇_L, Z_V, Z_C ∪ Ż_V, <̇) may be tested in a domain L^{X0∪Ż_V}_flat where
X0 ⊆ X has 2m + 1 elements and X0 ∩ Z_C = ∅. Now we have a finite solution
alphabet, and the method of Büchi and Senger ([3]) may be used to compute
an equivalent finite set of systems with equations only. These latter systems are
like word unification problems with linear constant restriction, where solvability
is known to be decidable (see [1]). (More details of all steps can be found in [2],
where the almost identical case of associative disunification with linear constant
restriction has been treated.) □
Step 1: variable identification. Consider all partitions of the set of all variables
occurring in Γ0 such that distinct variables x, y are in the same class of the
partition if the system contains the equation x ≐ y, and distinct variables x, y
are in distinct classes of the partition if the system contains the disequation
x ≠ y. Each of these partitions yields one of the new systems Γ1 as follows.
The variables in each class of the partition are "identified" with each other by
choosing an element of the class as representative, and replacing in the system
all occurrences of variables of the class by this representative. Afterwards, trivial
equations x ≐ x are erased. In addition, we add a disequation x ≠ y for every
pair x, y of distinct representatives to the system if this disequation is not already
present. Systems that are trivial now are excluded.
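The partition enumeration of Step 1 can be sketched directly: generate all partitions of the variable set and keep those that respect the equations and disequations. The tiny constraint format below is an assumption of this illustration, not the paper's notation.

```python
def partitions(items):
    """Enumerate all partitions of a list of items (as lists of classes)."""
    if not items:
        yield []
        return
    head, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):        # put head into an existing class
            yield part[:i] + [[head] + part[i]] + part[i+1:]
        yield [[head]] + part             # or into a class of its own

def admissible(part, eqs, diseqs):
    """x = y forces x, y into one class; x != y forces them apart."""
    cls = {v: i for i, c in enumerate(part) for v in c}
    return all(cls[x] == cls[y] for x, y in eqs) and \
           all(cls[x] != cls[y] for x, y in diseqs)

variables = ["x", "y", "z"]
eqs = [("x", "y")]      # the system contains x = y
diseqs = [("y", "z")]   # the system contains y != z
ok = [p for p in partitions(variables) if admissible(p, eqs, diseqs)]
```

Here only the partition {x, y}, {z} survives; each admissible partition would then be turned into a system Γ1 by rewriting with one representative per class.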
In each system Γ1, the right-hand side of every equation is either an F-term
or an L-term (but not a variable). Thus, we may speak about F-equations and
L-equations.
Step 2: choose ordering, type variables. For a given system Γ1, consider all pos-
sible strict linear orderings < on the variables of the system. Guess a type assign-
ment which maps every variable x to an element type(x) of {F, L}, satisfying
the following restrictions: if x has type L with respect to Γ1, or if Γ1 contains an
equation x ≐ t where t is a non-variable L-term (resp. F-term), then type(x) = L
(resp. type(x) = F). Each pair (<, type) yields one of the new systems obtained
from the given one.
For a system Γ2 obtained by Step 2, let X3,F (X3,L) denote the set of variables
of type F (L) occurring in Γ2. Let X3 = X3,F ∪ X3,L. Now the left-hand sides of F
(L) equations are in X3,F (X3,L).
Step 3: split systems. A given system Γ2 is divided into two systems Γ2 = Γ3,F ∪
Γ3,L. The "free" subsystem Γ3,F contains all F-equations of Γ2, the "L"-subsystem
Γ3,L contains all L-equations of Γ2. Disequations with at least one variable of
type F are added to the free subsystem; the other disequations are added to Γ3,L.
Now (Γ3,F, X3,F, X3,L, <) is a free disunification problem with linear constant
restriction and (Γ3,L, X3,L, X3,F, <) is a flat pure list constraint system with
linear constant restriction.
Step 4: dot embedded variables. In this last step we compute the shallow version
(Γ̇3,L, X3,L, X3,F ∪ Ẋ3,L, <̇) of the flat pure list constraint system with linear
constant restriction, (Γ3,L, X3,L, X3,F, <), obtained in the previous step.
Terms of Γ̇3,L have the form l1 ∘ ... ∘ lm (m ≥ 0) where the subterms li are
variables x ∈ X3,L or lists [t] where t ∈ X3,F ∪ Ẋ3,L is a constant.
Note that Steps 1 and 2 are non-deterministic. The output of the algorithm
consists of all pairs
((Γ3,F, X3,F, X3,L, <), (Γ̇3,L, X3,L, X3,F ∪ Ẋ3,L, <̇)).
The proofs use translations between the domains L_nested,fin and L_flat that act
elementwise on lists: [l1, ..., ln] ↦ [μ(l1), ..., μ(ln)].
Lemma 9. If a dotted system (Γ̇3,L, X3,L, X3,F ∪ Ẋ3,L, <̇) obtained after Step 4
has a solution, then the original system (Γ3,L, X3,L, X3,F, <) has a solution.
Lemma 10. If there exists a pair ((Γ3,F, X3,F, X3,L, <), (Γ3,L, X3,L, X3,F, <))
reached after Step 3 such that (Γ3,F, X3,F, X3,L, <) has a restrictive solution
and (Γ3,L, X3,L, X3,F, <) has a solution, then Γ0 has a solution.
of the output pairs have the effect of a partial occur check, excluding cyclic de-
pendencies between values of F- and L-variables. If we now ask for rational-tree
solutions, cyclic dependencies are acceptable and may be necessary. Accordingly,
constant restrictions are not used in Algorithm 2.
σ̇ : X3,L → L^{X∪X3,F∪Ẋ3,L}_flat
[ẏ^σ̇] = [ẏ^σ′] =i [y]^σ = lj. Thus x^σ =i x^σ′ = (l′1 ∘ ... ∘ l′n)^σ′ =i (l1 ∘ ... ∘ ln)^σ
for i > 1, and σ solves x ≐ l1 ∘ ... ∘ ln. □
Proposition 16. If there exists a pair ((Γ3,F, X3,F, X3,L), (Γ3,L, X3,L, X3,F))
that is reached after Step 3 such that (Γ3,F, X3,F, X3,L) has a restrictive solution
and (Γ3,L, X3,L, X3,F) has a solution, then Γ0 has a solution.
and σ solves x ≐ l1 ∘ ... ∘ ln. This shows that σ solves all equations of Γ1. Thus
σ is a solution of Γ1. It is now trivial to extend σ to a solution of Γ0. □
Theorem 17.⁶ If a typed flat constraint system Γ has a solution, then the sys-
tem Γχ(Γ) has a solution that is obtained from Γ by replacing every disequation
x ≠ y of Γ by a bounded disequation x ≠χ(Γ) y. Here χ(Γ) = 2^{n_emb} · n_dis + 1,
where n_emb is the number of embedded variables of Γ and n_dis is the number of
disequations of Γ.
⁶ A list constraint system Γ is typed if every variable occurring in Γ has type F or
type L. A tree assignment σ solves a bounded disequation x ≠k y if the trees x^σ and
y^σ have a distinct label in depth j < k. An occurrence of a variable x in a term t
of Γ of the form [x] or f(..., x, ...) (f ∈ Σ_F) is called an embedded occurrence of x
in Γ.
References
Tanel Tammet
1 Introduction
The motivation for this work is to devise efficient automated theorem proving
strategies for the first-order theorem proving tasks arising in the formal deriva-
tion of programs from specifications. The specific aim of the paper is to present
completeness results for certain simple, relatively well-known program synthesis
algorithms.
One of the standard approaches to automated program construction is to use
intuitionistic logic with a suitable realizability interpretation to derive programs
from proofs (see [5], [11], [8]). The programs derived in this way always enjoy an
intuitionistic correctness proof.
Another approach (see [2], [6], [7], [1]) is to use classical logic instead, with
additional restrictions guaranteeing that the proof contains a single definite
substitution t into a certain existentially bound variable, and this t is further-
more in a signature where all the function and predicate symbols are assumed
to represent computable functions. The derived programs thus always have a
classical correctness proof, although they may lack an intuitionistic one.
The following summarizes our motivation for using the second approach (clas-
sical logic) for program construction.
The known realizability interpretations for intuitionistic logic often give programs which contain computationally irrelevant parts. For example, the realization of the formula ∀x∃y(x = y & y = x) is a term λx.(x, (id, id)). The
A-resolution gives a term λx.x as a program to compute y.
Some formulas which admit a proof by A-resolution (and hence give a program) are not provable in intuitionistic logic. For example, A-resolution gives
a program λx.x for computing y for the intuitionistically unprovable formula
∀x∃y((A ∨ ¬A) & y = x).
The standard resolution method with Skolemization and/or conversion to a
conjunctive normal form (CNF) cannot be used for intuitionistic logic, although
there exist special resolution methods without Skolemization and CNF ([9], [10])
and a tableaux method with partial dynamic Skolemization ([15]) for intuitionis-
tic logic. Also, there exists a sizeable amount of theory for the resolution method,
which can be used for program derivation by A-resolution.
2 ANS-method and the D-calculus
2.1 ANS-method
Example 1. Consider the formula F ≡ ((P(a) ∨ P(c)) ⇒ ∃yP(y)) and the main
variable y in F. Skolemization gives (P(a) ∨ P(c)) ⇒ P(y). The clause form S
of F: {{P(a), P(c)}, {¬P(y)}}. The result of adding the answer literals A is
the clause set S′: {{P(a), P(c)}, {¬P(y), A(y)}}. Resolution derives the answer
clause {A(a), A(c)} from S′, thus the set of substitutions for y is {a, c}.
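The resolution steps of this example can be sketched programmatically. The following is a minimal illustration only, not the general calculus: clauses are ground-instantiated over the constants {a, c}, a literal is a (predicate, argument, polarity) triple, and we saturate under ground resolution until a clause containing only answer literals appears.

```python
def resolve(c1, c2):
    """Yield resolvents of two ground clauses; literals are (pred, arg, positive)."""
    for (pred, arg, pos) in c1:
        if (pred, arg, not pos) in c2:
            yield frozenset((c1 - {(pred, arg, pos)}) | (c2 - {(pred, arg, not pos)}))

def answer_clauses(clauses):
    """Saturate by ground resolution; return clauses built of answer literals only."""
    seen = set(clauses)
    frontier = list(clauses)
    while frontier:
        c = frontier.pop()
        for d in list(seen):
            for r in resolve(c, d):
                if r not in seen:
                    seen.add(r)
                    frontier.append(r)
    return {c for c in seen if c and all(p == 'A' for (p, _, _) in c)}

# S' of Example 1, ground-instantiated over {a, c}:
#   {P(a), P(c)}, {¬P(y), A(y)} with y ∈ {a, c}
S = [frozenset({('P', 'a', True), ('P', 'c', True)}),
     frozenset({('P', 'a', False), ('A', 'a', True)}),
     frozenset({('P', 'c', False), ('A', 'c', True)})]
ans = answer_clauses(S)
print(ans)  # contains the answer clause {A(a), A(c)}
```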
Example 2. For the formula F ≡ (((P(a) ∨ P(c)) & P(d)) ⇒ ∃yP(y)) with the
main variable y there exist both the result set {a, c} and the result set {d}, the
latter giving the definite answer d.
2.2 D-calculus
The D-calculus is used for finding a definite substitution t for the main variable
of a formula F, such that t is built of the function symbols in F. The D-calculus
is a weaker version of the forthcoming A-calculus, with the difference being in
that the A-calculus allows the substitution term t to contain the case splitting
function "if", whereas the D-calculus does not.
where A(t) and A(t′) are answer literals, on the condition that both the atoms
L and L′ are unifiable, as well as the terms tσ and t′σ.
As the Skolemized forms of formulas in the Simple Class do not contain any
nonparametric Skolem functions, any term given by the D-calculus for the main
variable of these formulas can be used as a program. In the general case we can
ensure usability of the terms in the derived definite answer clauses by using the
following restricted form of the D-calculus.
Definition 8. A computable predicate R on (possibly non-ground) terms is
called a liftable term restriction iff it has the following property:
∀t∀σ. R(tσ) ⇒ R(t)
where t is a term and σ is a substitution. R is a predicate on the meta-level, not
in the object language of resolution.
For example, we can define a certain liftable term restriction R_S(t) as "t does
not contain function symbols from the set S". It is easily seen that for any set S,
R_S is indeed a liftable term restriction. E.g. the set of nonparametric Skolem
functions in a clause is typically taken as the set S.
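The liftability of R_S can be seen concretely: instantiating a term can only add function symbols, so if the restriction holds of an instance tσ it must hold of t. The sketch below uses a hypothetical encoding of terms as nested tuples with variables as plain strings; the names are ours, not the paper's.

```python
def symbols(t):
    """All function/constant symbols occurring in a term (variables are strings)."""
    if isinstance(t, str):
        return set()
    return {t[0]}.union(*([symbols(s) for s in t[1:]] or [set()]))

def apply_subst(t, sigma):
    """Apply a substitution (dict var -> term) to a term."""
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply_subst(s, sigma) for s in t[1:])

def R_S(t, S):
    """The restriction: t contains no function symbol from S."""
    return not (symbols(t) & S)

S = {'sk'}                      # e.g. the nonparametric Skolem functions
t = ('f', 'x', ('g', 'y'))      # hypothetical term f(x, g(y))
sigma = {'x': ('sk',), 'y': ('a',)}
# Instantiation only adds symbols, so R_S(t sigma) implies R_S(t):
assert (not R_S(apply_subst(t, sigma), S)) or R_S(t, S)
print(R_S(t, S), R_S(apply_subst(t, sigma), S))  # True False
```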
The definition of D(R)-completeness is obtained from the definition of D-completeness by requiring that the terms t and t′ satisfy the criterion R. We
prove D(R)-completeness of the D(R)-calculus. The proof is a subcase of
Theorem 1.
3 A(R)-calculus
The D(R)-calculus fails to find a proof for a large class of formulas which admit
a proof in intuitionistic logic. The reason for this, roughly speaking, is that
intuitionistic logic assumes subformulas of any formula F to have an associated
program (realization of the formula), whereas the D(R)-calculus assumes only
the function symbols in F to have an associated program.
314
Example 4. Consider the formula F ≡ ((P(a) ∨ P(b)) ⇒ ∃yP(y)). F does not
admit a proof by the D(R)-calculus. However, F is provable in intuitionistic
logic.
The A(R)-calculus can be used for finding programs for the main variables
of formulas in the same way as the D(R)-calculus.
Example 5. Consider the formula F ≡ ((P(a) ∨ P(b)) ⇒ ∃yP(y)) and let the term
restriction R hold for all terms (thus we assume P, a and b to be computable).
The clause form S′ of F (after adding the answer literals): {1:{P(a), P(b)},
2:{¬P(y), A(y)}}. The derivation of the answer clause in the A(R)-calculus: 1
and 2 give 3: {P(b), A(a)}, 3 and 2 give an answer clause: {A(if(P(b), b, a))}.
Thus the A(R)-calculus gives a program (without arguments) if(P(b), b, a) for
computing the value of y.
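The answer-combination step of this derivation can be mimicked as follows. This is an illustrative reconstruction of the single derivation above, not an implementation of the A(R)-calculus; the clause and literal encodings are ad hoc.

```python
def a_resolve(c1, t1, c2, t2, lit):
    """Resolve c1 (containing lit) against c2 (containing ('not', lit)).
    When both parents carry an answer term, combine them into a conditional:
    if lit holds, the answer of the ¬lit-parent applies, otherwise the other.
    This mirrors only the answer-combination idea, not the full calculus."""
    resolvent = (c1 - {lit}) | (c2 - {('not', lit)})
    if t1 is None:
        ans = t2
    elif t2 is None:
        ans = t1
    else:
        ans = ('if', lit, t2, t1)
    return resolvent, ans

# Example 5, ground-instantiated (answer terms kept beside the clauses):
c1, t1 = frozenset({'P(a)', 'P(b)'}), None            # 1: {P(a), P(b)}
c2a, t2a = frozenset({('not', 'P(a)')}), 'a'          # 2 with y=a: {¬P(a), A(a)}
c2b, t2b = frozenset({('not', 'P(b)')}), 'b'          # 2 with y=b: {¬P(b), A(b)}

c3, t3 = a_resolve(c1, t1, c2a, t2a, 'P(a)')          # 3: {P(b)} with answer a
c4, t4 = a_resolve(c3, t3, c2b, t2b, 'P(b)')          # answer clause
print(c4, t4)  # frozenset() ('if', 'P(b)', 'b', 'a')
```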
Suppose that the algorithm we have for computing the predicate P is defined
only on a. In that case we define the restriction R(t) as "any subterm of t with
the leading symbol P has either a form P(a) or P(x) for some variable x". Then
the A(R)-calculus cannot derive the answer clause {A(if(P(b), b, a))}; the only
answer clause it can derive is {A(if(P(a), a, b))}.
Definition 12. A type Boolean is the set of two logical constants True and False.
A Boolean function is a function taking n (0 ≤ n) arguments of the Boolean type
and returning a value of the Boolean type. We consider "if" to be a polymorphic
function: an occurrence of if is Boolean iff all its arguments are Boolean. A term
t is of a Boolean type if either t is a logical constant, a literal, or has a form
f(t_1, ..., t_n) where f is a Boolean function and all the t_i (1 ≤ i ≤ n) are of the
Boolean type.
We will introduce the construction S[t/y] for clause sets, similar to the ordinary substitution S{t/y}. The difference is that in the newly introduced construction the term t may contain literals and the function if, thus we will use
the algorithm AF to "flatten out", so to say, any literals containing the term t
after direct substitution.
Definition 18. Let t be a ground B-correct term, possibly containing literals
and the function if. Consider the clause set S to be a conjunction of disjunctions
of literals. Build a new construction S_t by replacing the variable y everywhere
in S by the term t. Build the formula S_AF(t) by replacing all the literals L in S_t
containing t by the formula computed by AF(L). S[t/y] is obtained by bringing
the formula S_AF(t) to the clause form again and removing all the tautologous
clauses.
Observe that for the earlier mentioned "Simple Class" (the class where all Skolem
functions are parametric) the A(R)-calculus is complete even in the standard
classical sense: since all the Skolem functions are parametric, we define R to
return True for every term t. Then if a formula F in that class is provable, the
A(R)-calculus derives from the clause form S of F a definite answer clause which
is either empty or has a form A(g) such that S[g/y] is provable.
The following completeness theorem for the general case is a main result of
the paper.
(with the tautologous element clauses missing due to tautologies being removed
by the construction of S[t/y]).
Since S[t/y] is unsatisfiable, there must be some unsatisfiable finite Herbrand
expansion S[t/y]E of the set S[t/y]. Recall that the finite Herbrand expansion
{C_1σ_1, ..., C_1σ_m, ..., C_nσ_1, ..., C_nσ_m}
where each C_iσ_j is ground and contains only predicate, function and constant
symbols from the set {C_1, ..., C_n} (plus a single new constant symbol, in case
{C_1, ..., C_n} contains none).
Unless it is explicitly said otherwise, we will in the following use ordinary resolution (not the D- or A-calculus) which is restricted in the following
completeness-preserving manner. We introduce the following ordering of ground
literals in S[t/y]E: all the literals in S[t/y]E which do not occur in the set I are
preferred for resolution over the literals occurring in I. We restrict the resolution method by allowing resolution upon a literal L in a clause C only if C does
not contain any literal R preferred over L. This restriction is a case of so-called
ordered semantic resolution, see [3] or [1]. We will restrict resolution further
by prohibiting the derivation of tautologies (clauses containing some literal L
and its negation ¬L). This restriction preserves completeness for the semantic
resolution.
We build the clause set S[t/y]EA from the clause set S[t/y]E by adding an
answer literal A(g_i) to each clause C_j{g_i/y} ∪ A_i (1 ≤ j ≤ k, 1 ≤ i ≤ l) built
from some clause C_j in S_y.
to hold and the clause set S[t/y]E is assumed to be unsatisfiable, the A(R)-
calculus derives from S[t/y]EA a definite answer clause with a term satisfying
R. Let D be such a derivation.
Finally, consider the original clause set S. Add an answer literal A(y) to each
clause containing the variable y. We get the following clause set S′:
For each clause C_i in the set S[t/y]EA there is a clause C′_i in the set S′ subsuming
C_i (in general, several clauses in S[t/y]EA may map to one clause in S′).
We will now lift the derivation D from the clause set S[t/y]EA to the derivation of a definite answer-clause from the set S′. We note that the standard lifting
lemma is not true for the A(R)-calculus due to the D-resolution rule. However,
we can show that D can be assumed to have a special form such that the standard
lifting lemma is applicable. Namely, whenever there is a derivation of a definite
answer-clause from the final clauses, then there is also a derivation without the
use of the D-resolution rule (since literals in final clauses satisfy the R-restriction,
D-resolution inferences can be replaced by A-resolution inferences). Considering
D-resolution inferences from the non-final clauses, we observe that the answer
literals in the figure do not contain if and thus standard lifting is applicable.
Lifting: transform the derivation D to a derivation D′ by replacing each input
clause C_i in S[t/y]EA by the subsuming clause C′_i in S′ and each clause inferred
in D by the correspondingly inferred subsuming clause. Remove the resolution
inferences which become impossible (it is possible to remove those in lifting since
for such figures the lifted consequent is the same as the lifted premiss).
Conclusion: since S[t/y] is assumed to be provable, S[t/y]E is an arbitrary
finite unsatisfiable Herbrand expansion of S[t/y], the term with the required
properties was derivable from the clause set S[t/y]EA by the ordered A(R)-resolution, and the term restriction R is strongly liftable, the term with the
required properties is also derivable from the clause set S′ by the unrestricted
A(R)-resolution.
Example 7. Let ≻_G be defined as "literals with the predicate G are preferred over
all the other literals". The ordering ≻_G is an instance of the semantic resolution
and is thus known to preserve completeness of resolution. Consider the clause
set S: {{¬G(b, y), P(a), A(y)}, {G(y, a), P(b), A(y)}, {¬P(y), A(y)}}. We
define R(t) as "t does not contain the predicate symbol P". Then there is no
A(R)-derivation of a definite answer clause from S satisfying either the ordering
≻_G or the hyperresolution-compatible semantic ordering, whereas there is an
A(R)-derivation of the definite answer clause {A(if(G(b, a), a, b))} from S using
the unrestricted A(R)-calculus.
to find a member of a list satisfying P, under the assumption that the list contains
such a member. The specification is
∀x((∃y(m(y, x) & P(y))) ⇒ (∃z(m(z, x) & P(z))))
and we want to find a program to compute a value for z for any list-type value
of x. We define R(t) as "t does not contain the Skolem function for y".
First, an attempt to derive a definite answer clause with a term satisfying
R fails, if we do not use induction. Conversion of the whole problem to the
resolution form (clauses axiomatizing equality are skipped, overlined variables
like ȳ represent Skolem functions, the first four clauses come from the definition
of m, A is the answer predicate to collect substitutions):
1) {¬m(x, nil)}
2) {¬m(x, c(y, z)), x = y, m(x, z)}
3) {x ≠ y, m(x, c(y, z))}
4) {¬m(x, z), m(x, c(y, z))}
5) {m(ȳ, x̄)}
6) {P(ȳ)}
7) {¬m(z, x̄), ¬P(z), A(z)}
The only derivable definite answer clause is derived in the following way:
5, 6 and 7 give 8) {A(ȳ)}
We get successful derivations of a definite answer clause by using one struc-
tural induction over x.
4.2 Induction Step
References
1. C.L. Chang, R.C.T. Lee. Symbolic Logic and Mechanical Theorem Proving. Academic Press, 1973.
2. C.L. Chang, R.C.T. Lee, R. Waldinger. An Improved Program-Synthesizing Algorithm and its Correctness. Comm. of ACM, (1974), V17, N4, 211-217.
3. C. Fermüller, A. Leitsch, T. Tammet, N. Zamov. Resolution Methods for the Decision Problem. Lecture Notes in Artificial Intelligence 679, Springer Verlag, 1993.
4. C. Green. Application of theorem-proving to problem solving. In Proc. 1st Internat. Joint Conf. Artificial Intelligence, pages 219-239, 1969.
5. S.C. Kleene. Introduction to Metamathematics. North-Holland, Amsterdam, 1952.
6. Z. Manna, R. Waldinger. A deductive approach to program synthesis. ACM Trans. Programming Languages and Systems, (1980), N2(1), 91-121.
7. Z. Manna, R. Waldinger. Fundamentals of Deductive Program Synthesis. IEEE Transactions on Software Engineering, (1992), V18, N8, 674-704.
8. G. Mints, E. Tyugu. Justification of the structural synthesis of programs. Sci. of Comput. Program., (1982), N2, 215-240.
9. G. Mints. Gentzen-type Systems and Resolution Rules. Part I. Propositional Logic. In COLOG-88, pages 198-231. Lecture Notes in Computer Science vol. 417, Springer Verlag, 1990.
10. G. Mints. Gentzen-type Systems and Resolution Rules. Part II. Predicate Logic. In Logic Colloquium '90.
11. B. Nordström, K. Petersson, J.M. Smith. Programming in Martin-Löf's Type Theory. Clarendon Press, Oxford, 1990.
12. G. Peterson. A technique for establishing completeness results in theorem proving with equality. SIAM J. of Comput. (1983), N12, 82-100.
13. J.A. Robinson. A Machine-oriented Logic Based on the Resolution Principle. Journal of the ACM 12, 1965, pp 23-41.
14. U.R. Schmerl. A Resolution Calculus Giving Definite Answers. Report Nr 9108, July 1991, Fakultät für Informatik, Universität der Bundeswehr München.
15. N. Shankar. Proof Search in the Intuitionistic Sequent Calculus. In CADE-11, pages 522-536, Lecture Notes in Artificial Intelligence 607, Springer Verlag, 1992.
Subrecursion as a Basis for a Feasible
Programming Language
Paul J. Voda
1 Introduction
The motivation for this research comes from our search for a good programming
language where we are constructing computable functions over some inductively
presented domain. The domain of LISP, i.e. S-expressions, is an example of a
simple, yet amazingly powerful, domain specified as words. We have designed
and implemented two practical declarative programming languages Trilogy I
and Trilogy II based on S-expressions with a single atom 0 (Nil) [9, 1]. Since
the domain of S-expressions with a single atom is denumerable it seems natural
to identify it with the set of natural numbers. Functions of our programming
language will become recursive functions. The identification is obtained by means
of a suitable pairing function.
Quite a few people have investigated properties of S-expressions but to our
knowledge nobody has done it in the context of subrecursion. Yet, a feasible
programming language should restrict itself to functions computable by binary
coded Turing machines in polynomial time. This class is a subclass of elemen-
tary functions which is a small subclass of primitive recursive functions. Hence
it seems natural to study the connection between a pairing-function-based pre-
sentation of primitive recursive function hierarchies with the usual presentation
based on the successor function s(x) = x + 1. The relation to Grzegorczyk-based
hierarchies should be central. The connection between recursive classes of functions (based both on the successor recursion and on the recursion on notation)
and classes of computational complexity is quite well understood now (see for
instance [10]). We investigate this connection by recursion based on pairing.
Sect. 2 introduces the pairing function P. We order the presentation
of primitive pair recursive functions in Sect. 3 in such a way that we can quickly
2 Pairing functions
All functions and predicates in this paper are total over the domain of natural
numbers N. It is well known that in the presence of a pairing function we can
restrict our attention to the unary functions and predicates. Unless we explicitly
mention the arity of our functions and predicates they will be understood to be
unary.
A binary function (·, ·) is a semi-suitable pairing function if it is (P1): a
bijection from N² onto N \ {0}, and we have (P2): (x, y) > x and (x, y) > y.
The condition (P1) assures the pairing property that from (a, b) = (c, d) we get
a = c and b = d, and the property that 0 is the only atom, i.e. the only number
not of the form (x, y).
We will abbreviate (x, (y, z)) to (x, y, z) and when discussing only unary
functions we will write x, y for (x, y). Thus '·, ·' can be viewed as an infix pairing
operator with a lowest precedence where, for instance, x+y, z stands for (x+y), z.
Thms. 1 and 2 guarantee that for a semi-suitable pairing function (a): every
number x is either 0 or it can be uniquely written in the form x_1, x_2, ..., x_n, 0
for some n ≥ 1 and numbers x_i. Thus every number codes a single finite sequence
of numbers (codes of finite sequences are called lists in computer science),
and vice versa. (b): There exist unique pair size |x| and length Len(x) functions
such that |0| = 0, |x, y| = |x| + |y| + 1, and Len(0) = 0 and Len(x, y) = Len(y) + 1.
The function Len(x) gives the length of the finite sequence coded by x. We have
Len(x) ≤ |x|.
There are many semi-suitable pairing functions. For instance we can offset
the standard recursion-theoretic pairing function J [3] by one: J(x, y) = ((x +
y)·(x + y + 1)) ÷ 2 + x + 1. However, the function J is not good for our purposes
as it is not a suitable pairing function satisfying the additional condition (P3):
|x| = O(log(x)). We will see in Sect. 5 that a suitable pairing function gives
rise to the classes of functions with very desirable properties of computational
complexity.
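The conditions (P1) and (P2) for the offset pairing J are easy to check mechanically on a finite grid. This is a sketch of such a check, not a proof:

```python
def J(x, y):
    """Standard pairing offset by one: a bijection from N^2 onto N \\ {0}."""
    return (x + y) * (x + y + 1) // 2 + x + 1

N = 60
values = {J(x, y) for x in range(N) for y in range(N)}
# (P1): no collisions on the grid, 0 is never hit, small positives all are.
assert len(values) == N * N
assert 0 not in values
assert set(range(1, N)) <= values
# (P2): (x, y) > x and (x, y) > y.
assert all(J(x, y) > x and J(x, y) > y for x in range(N) for y in range(N))
print("J is a semi-suitable pairing on the tested range")
```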
Let us temporarily assume that there is a binary function P(x, y) = x, y such
that the following sequence of pair representation terms enumerates all natural
numbers in the natural order:
0; 0, 0; 0, 0, 0; (0, 0), 0; 0, 0, 0, 0; 0, (0, 0), 0; (0, 0), (0, 0); (0, 0, 0), 0; ((0, 0), 0), 0; ...
(1)
The sequence is obtained by letting the term x precede the term y iff |x| < |y|
or |x| = |y|, x = x_1, x_2, y = y_1, y_2 and either x_1 precedes y_1 or else x_1 = y_1 and
x_2 precedes y_2. It should be clear that P (if it does exist) is unique and it is a
semi-suitable pairing function. The closed interval
The function Ro(x) = x − σ|x| gives the offset of the number x from the least
number of the same pair size. If for two numbers x and y we set n = |x| + |y| + 1
we can see that P(x, y) = P(x, σ|y|) + Ro(y) because the number P(x, y) occurs
at the offset Ro(y) in the interval I_n with the lower bound P(x, σ|y|). Using (6)
we get
We can conclude that if the function P exists then (7) must hold. Vice versa,
when we define P by (7) where |x| is defined by (5) we can see that the sequence
(1) consists of all natural numbers in the increasing order. So the function P
does exist and it is a semi-suitable pairing function strictly monotone in both
arguments. We now show that P satisfies also the condition (P3). We clearly
have C(n) ≤ σ(n + 1), so for |x| ≥ 2 we get 2^{|x|−2} ≤ C(|x| − 1) ≤ σ|x| ≤ x.
Hence for all x
2^{|x|} ≤ 4·x + 1 .  (8)
Each of the intervals I_0 through I_{σ(n)−1} is non-empty and so σ(n) ≤ C(n) which
holds also for n = 0. Thus
Let us denote by d(x) the binary size function yielding the number of bits in the
binary representation of x (d(0) = 0). We clearly have d(x) = O(log(x)). From
(8) we get |x| + 1 = d(2^{|x|}) ≤ d(4·x + 1) = d(x) + 2, i.e. |x| ≤ d(x) + 1. From
(9) we get d(x) ≤ d(2^{2|x|+2}) = 2·|x| + 3. Thus the condition (P3) holds and
we have:
The development of function classes discussed in this section does not require
the property (P3). Any primitive recursive semi-suitable pairing function p can
be used instead of P provided we are able to derive the successor function and
unary iteration in a way similar to the derivation of arithmetic below.
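The enumeration (1) can be reproduced mechanically, which lets one check the bound (8) on an initial segment. The sketch below assigns numbers to pair terms first by size and then lexicographically by the numbers of the components; the function and variable names are ours, not the paper's.

```python
def build_pairing(max_size):
    """Number all pair terms up to a given pair size, in the order of (1)."""
    size_of = {0: 0}             # number -> its pair size |x|
    by_size = {0: [0]}           # pair size -> numbers of that size, in order
    next_num = 1
    for n in range(1, max_size + 1):
        # all pairs (a, b) with |a| + |b| + 1 = n, ordered lexicographically
        # by the numbers of the two components
        pairs = sorted((a, b) for i in range(n)
                       for a in by_size[i] for b in by_size[n - 1 - i])
        by_size[n] = []
        for (a, b) in pairs:
            size_of[next_num] = n
            by_size[n].append(next_num)
            next_num += 1
    return size_of

size_of = build_pairing(8)
# sizes follow the sequence (1): |1| = 1, |2| = |3| = 2, |4| .. |8| = 3, ...
assert [x for x, s in size_of.items() if s == 2] == [2, 3]
# Condition (8): 2^|x| <= 4x + 1 for every numbered x.
assert all(2 ** s <= 4 * x + 1 for x, s in size_of.items())
print(len(size_of))  # 2056 numbers receive a size up to pair size 8
```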
The identity function is I(x) = x. The zero function is Z(x) = 0. The first (H)
and second (T) projection functions are such that H(0) = T(0) = 0, H(x, y) =
x, and T(x, y) = y. The conditional function D is such that D(0, x, y) = y,
D((v, w), x, y) = x, and D(0) = D(x, 0) = 0. The function f is obtained from
the functions g and h by (unary) composition if f(x) = g h(x) and by pairing
if f(x) = g(x), h(x). Functions simple (in a class of functions ℱ) are generated
from (ℱ), I, Z, and D by composition and pairing.
f((v, w), x) = 0
f(0, x) = h(w, v) ← g(x) = v, w
f(0, x) = 0 ← g(x) = 0
f(0) = c
The clauses, when read as inverted implications, express properties of the defined
function. The first and third clauses can be omitted as the defined function
then obtains the value 0 by default when no clause can be satisfied for a given
argument. Clauses can be given in any order (we have listed the clauses for f
in the order obtained by removing the conditionals from the definition (10)).
Clauses must be presented in such a way that the conversion from a clausal
definition to its initial form should be always possible.
We do not have the space to dwell on the details of our clausal language and
hope that the clausal definitions given in this paper will be simple enough so the
reader can reconstruct the initial form of the definition. In the case of explicit
definitions the initial form is always f(x) = a. We permit the clausal language
to be used with inductive definitions where the initial form will be a schema of
inductive definition. The development of the clausal language should be viewed
from the perspective of its eventual implementation on computers. We offer here
no details on how one can go about the automatic compilation.
We now present three schemas of pair-based inductive definitions which can
be used as initial forms of clausal definitions. The function f is defined by the
unary iteration of g if f(0) = 0, f(0, y) = y, and f(s(x), y) = g f(x, y). We
write g^x(y) as an abbreviation for the application f(x, y). The function f is
defined by the pair iteration of g if f(0) = 0 and f(x, y) = g^{|x|}(y). The function
f is defined by pair recursion from g and h if f(0) = 0, f(0, y) = g(y), and
f((v, w), y) = h(v, w, y, f(v, y), f(w, y)).
The classes PR₁, PR₂, and PR₃ are generated from I, H, and T by composition, pairing, and respectively by pair recursion, pair iteration, and unary
iteration. We call the functions of PR₁ primitive pair recursive functions.
Proof. Clearly, it is sufficient to show that the classes PRᵢ are simple in themselves, i.e. that each of them contains the functions Z and D. □
We can now use the clausal language with any of the three inductive schemas.
The functions ⊕, ⊗, ↑, written in infix notation, e.g. x ⊕ y instead of ⊕(x, y), are
obtained in PR₁ by pair recursion:
0 ⊕ y = y
(v, w) ⊕ y = v, (w ⊕ y)
Rt(0) = 0
Rt(v, w) = 0, (Rt(v) ⊕ Rt(w))
x ⊗ y = f₁(Rt(x), y)
f₁(0, u) = 0
f₁((0, w), u) = u ⊕ f₁(w, u)
x ↑ y = f₂(Rt(y), x)
f₂(0, x) = 1
f₂((0, w), x) = x ⊗ f₂(w, x)
The function ⊕ is the list concatenation function. The function Rt is obtained by
parameterless pair recursion for which PR₁ is easily seen closed. Rt(x) yields a
right-leaning number of the same pair size as x: Rt(x) = σ|x|. We have |Rt(x)| =
Len Rt(x) = |x|. By pair induction we can derive |x ⊕ y| = |x| + |y|, |x ⊗ y| =
|x|·|y|, and |x ↑ y| = |x|^{|y|}.
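The pair recursion schema and the concatenation function ⊕ can be sketched over any semi-suitable pairing. Below we use the offset pairing J rather than the canonical P of Sect. 2 (so size bounds differ), which suffices to illustrate the schema and the list-concatenation behaviour; all names here are ours.

```python
import math

def J(x, y):
    """A semi-suitable pairing (the offset pairing J of the text)."""
    return (x + y) * (x + y + 1) // 2 + x + 1

def unJ(z):
    """Inverse of J on N \\ {0}."""
    z -= 1
    s = (math.isqrt(8 * z + 1) - 1) // 2
    x = z - s * (s + 1) // 2
    return x, s - x

def pair_rec(g, h):
    """Pair recursion: f(0) = 0, f(0, y) = g(y),
    f((v, w), y) = h(v, w, y, f(v, y), f(w, y))."""
    def f(z):
        if z == 0:
            return 0
        x, y = unJ(z)
        if x == 0:
            return g(y)
        v, w = unJ(x)
        return h(v, w, y, f(J(v, y)), f(J(w, y)))
    return f

# Concatenation per the clauses above: 0 ⊕ y = y, (v, w) ⊕ y = v, (w ⊕ y)
concat = pair_rec(lambda y: y, lambda v, w, y, fv, fw: J(v, fw))

def enc(lst):
    """Code a Python list as a pair-coded list: enc([]) = 0."""
    out = 0
    for a in reversed(lst):
        out = J(a, out)
    return out

assert concat(J(enc([1, 2]), enc([3, 4]))) == enc([1, 2, 3, 4])
```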
The function Lt(x) = σ(|x| + 1) − 1 yielding a left-leaning number of the
same pair size is obtained as primitive pair recursive by Lt(x) = f Rt(x) where
f(0) = 0 and f(v, w) = f(w), 0.
We identify a predicate R with its characteristic function R∗(x) = y ⇔
R(x) ∧ y = 1 ∨ ¬R(x) ∧ y = 0. For a function class ℱ we denote by ℱ∗ the
class of 0, 1-valued functions of ℱ. We say that the predicate R is from ℱ, and
sometimes write R ∈ ℱ, if R∗ ∈ ℱ∗. The predicate Left(x) true of left-leaning
numbers is defined primitive pair recursively by a clausal definition
Left(0)
Left(v, w) ← w = 0 ∧ Left(v).
Clausal definitions of predicates should be viewed as abbreviations for clausal
definitions of their characteristic functions. If we replace Left(x) by Left∗(x) =
1 above we get such a definition. Note that the clauses for ¬Left(x), i.e. for
Left∗(x) = 0, are obtained by default. We can similarly define the primitive pair
recursive predicate Right holding of right-leaning numbers.
We now turn to the arithmetic functions. The successor and predecessor
functions are obtained by the case analysis of the enumeration of pairs (1). We
derive the successor function in PR₁ by a parameterless pair recursion:
s(0) = 1
s(v, w) = v, s(w) ← ¬Left(w)
s(v, w) = s(v), Rt(w) ← Left(w) ∧ ¬Left(v)
s(v, w) = 0, Rt(v, w) ← Left(w) ∧ Left(v) ∧ w = 0
s(v, w) = Rt(0, v), Rt(w₁) ← Left(w) ∧ Left(v) ∧ w = w₁, w₂
f₁(0, x, y) = x, y
f₁((v, w), x, y) = 0, y₁ ← f₁(w, x, y) = 0, y₁
f₁((v, w), x, y) = p(v₁, w₁), g(y₁) ← f₁(w, x, y) = (v₁, w₁), y₁ .
The auxiliary function f₁ accepts as a parameter and yields numbers of the form
c, a where c is a counter of iterations and a is an accumulator to which g is repeatedly
applied. This works provided the length of recursion is at least x. By (9) we
have
Len Rt(9 ↑ (0, x)) = |9 ↑ (0, x)| = |9|^{|0,x|} = 4^{|x|+1} > x . □
4 Pair-Based Hierarchies
We assume that the reader is familiar with the Grzegorczyk hierarchy ℰᵢ [5]
which we will need from the multiplicative stage ℰ₂ onwards. Rose in [7] gives a
good discussion of the topic. The hierarchy functions generating the classes ℰᵢ are
E₂(x) = x² + 2 and E_{i+3}(x) = Eˣ_{i+2}(2). The reader will note that in order to
get Eᵢ ∈ ℰᵢ we have increased the indices of Rose's functions by one. Functions
Eᵢ are strictly monotone, E_{i+1} bounds (dominates) Eᵢ, i.e. Eᵢ(x) ≤ E_{i+1}(x) for
all x. We also have Eᵢˣ(y) ≤ E_{i+1}(x + y).
The subexponential stage 2.5 between the multiplicative and elementary (exponential) stage 3 is a stage inserted into the Grzegorczyk hierarchy. We augment
the Grzegorczyk hierarchy by adding the class ℰ₂.₅ generated in the usual way
from the binary subexponential hierarchy function E₂.₅(x, y) = x^{d(y)} (alternatively, we can use the 'smash' function x # y = 2^{d(x)·d(y)}). It is easy to see that
all functions f(x₁, ..., xₙ) in ℰ₂.₅ are bounded by 2^{p(d(x₁),...,d(xₙ))} for a polynomial p(x₁, ..., xₙ) and so the class ℰ₂.₅ lies strictly between the classes ℰ₂ and
ℰ₃.
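The defining property of the smash function, d(x # y) = d(x)·d(y) + 1, is easy to check numerically; output sizes are polynomial in the input sizes, which is what places # between the multiplicative and the exponential stage. A sketch, with d the binary size function:

```python
def d(x):
    """Binary size: number of bits of x, with d(0) = 0."""
    return x.bit_length()

def smash(x, y):
    """The 'smash' function x # y = 2^(d(x)*d(y))."""
    return 2 ** (d(x) * d(y))

# d(x # y) = d(x)*d(y) + 1, since d(2^k) = k + 1.
for x in (1, 5, 12, 1000):
    for y in (1, 7, 255):
        assert d(smash(x, y)) == d(x) * d(y) + 1
print(d(smash(1000, 255)))  # 10 * 8 + 1 = 81
```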
Pair hierarchy functions Fᵢ generate our pair-based hierarchies. We set F₂ = I,
F₂.₅ = ⊗, F₃(x) = g^{|x|}(2) where g(x) = (x ⊕ x) ⊗ 2, and F_{i+4}(x) = Fˣ_{i+3}(2). We
have Fᵢ ∈ PR₁. The functions F_{i+3} have been chosen in such a way that we
have |F_{i+3}(x)| = E_{i+3}|x|.
A function operator yielding the function f is limited if f is bounded by
a function j which is an argument of the limited operator. To every operator
introduced so far we clearly have a limited version obtained by adding a new
argument function j.
We will write ℱᵢ when we mean a class at the i-th stage of any of the three
pair hierarchies. We did not define the classes ℱᵢ of pair hierarchies below the
multiplicative stage ℱ₂, i.e. for i = 0 and i = 1. As is well known, no pairing
function can be introduced in these stages. We only note here that by limiting
the pairing operator we can extend the pairing hierarchies to the stages zero and
one. We now state mostly without proofs a number of lemmas leading to Thms.
20 and 22.
Lemma 9. ⊕ ∈ ℱ₂.
f(0, y) = 0, y
f((v, w), y) = w, v, y .
A computer programmer will recognize that the function Reva is the so-called
accumulator version of the list reversal function Rev. We have Reva(x, y) =
Rev(x) ⊕ y, i.e. Rev(x) = Reva(x, 0). The pair iteration of f is limited as we
have |f^n(z)| = |z|. It should be clear that we can replace the limited
pair iteration by the limited unary iteration because |x| ≤ x. □
Lemma 14. ↑ ∈ ℱ₃.
Lemma 15. ℱ₂ ⊆ ℱ₂.₅ ⊆ ℱ₃ ⊆ ℱ₄ ⊆ ⋯ .
Lemma 16. The classes ℱᵢ for i ≥ 2.5 are closed under limited pair recursion.
The classes ℱ₂ are closed under limited pair recursion when the bounding condition is independent of the parameter: |f(x, y)| ≤ |j(x)|. This, clearly, includes
limited parameterless pair recursion.
It is unknown whether for i < 2.5 any of the inclusions are strict.
Theorem 20. The union of each of the hierarchies Pᵢ, PMᵢ, and POᵢ is the
class of primitive pair recursive functions.
Proof. POᵢ ⊆ ℰᵢ: By Lemma 21 the classes ℰᵢ are closed under pairing and
contain I, H, and T. The function D can be easily derived in ℰᵢ and so the
classes are closed under explicit clausal definitions. We can now easily show ℰᵢ
closed under both limited iteration and pair iteration with the help of limited
recursion. It remains to derive Fᵢ ∈ ℰᵢ. For that we derive ⊕ in ℰ₂ just as in
Lemma 9. As in Lemma 13 we derive ⊗ ∈ ℰ₂.₅ with the bound obtained from
(9) as x ⊗ y ≤ 4^{|x ⊗ y|+1} = 4^{|x|·|y|+1} ≤ 4^{(d(x)+1)·(d(y)+1)+1}. The functions F_{i+3} are
obtained by pair iteration in ℰ_{i+3} with the bounds F_{i+3}(x) ≤ σ(|F_{i+3}(x)| + 1) =
σ(E_{i+3}|x| + 1).
For the converse inclusion we show by induction on the construction of ℰᵢ
a stronger claim that both the unary functions and the unary equivalents of
functions in ℰᵢ are in POᵢ. This is certainly true of Z, s, and Eᵢ. The unary
equivalents of the projection functions Uⁿᵢ(x₁, ..., xₙ) = xᵢ are obtained by
clausal definitions of the same form. The closure under n-ary composition and
limited recursion which is simulated by limited unary iteration is left to the
reader. □
Theorem 23 (Characterization of primitive pair recursive functions).
The classes PRᵢ consist exactly of unary primitive recursive functions.
The reader will see that in our special case there is no problem with bounds. □
We say that a Turing machine M with three tape symbols: blank, 0 and 1
computes an n-ary function f(x₁, ..., xₙ) when, after giving it the words coding
the arguments in binary and separated by blanks as input, the machine stops
with a word coding the number f(x₁, ..., xₙ) in binary as output. In order to convert
between numbers and codes of such words we will need two functions B and
B⁻¹. The function B(x) yields a list of 1's and 2's called the binary code of the
number x > 0, i.e. the list coding the word with the binary representation of
x (note that the bits 0 and 1 are coded in our Turing simulation by 1 and 2
respectively). We set B(0) = 0. Any function B⁻¹ such that B⁻¹B(x) = x is a
binary inverse of B.
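The coding functions B and B⁻¹ can be sketched directly, using host-language lists in place of pair-coded lists (so B(0) is the empty list here rather than the number 0):

```python
def B(x):
    """Binary code of x as a list of 1's and 2's (bit b coded as b + 1),
    most significant bit first; B(0) is the empty list here."""
    if x == 0:
        return []
    return [int(b) + 1 for b in bin(x)[2:]]

def B_inv(code):
    """A binary inverse of B: decode a list of 1's and 2's back to a number."""
    x = 0
    for c in code:
        x = 2 * x + (c - 1)
    return x

assert all(B_inv(B(x)) == x for x in range(200))
print(B(13))  # [2, 2, 1, 2]  since 13 is 1101 in binary
```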
Proof. This is similar to the proof of Lemma 21 but we have to make sure that
all iterations are short. □
Lemma 26. B, B⁻¹ ∈ P₂.
Theorem 27. The classes P₂, PO₂, P₂.₅ and PO₂.₅ consist exactly of unary
functions computed by Turing machines in linear space/polynomial time, linear
space, polynomial time, and polynomial space respectively.
Such Turing machines can also pair iterate such functions in polynomial time
(for P₂ and P₂.₅ functions) because the time is polynomial in d(x) which is on the
order of |x|. □
Now we see that the class P_2 contains +, −, ·, etc. The Turing machine
characterization of PO_2 can also be proved from Thm. 22 by the result of Ritchie
[6] that E^2 consists of the n-ary functions Turing computable in linear space.
Cobham [2, 7] was the first to characterize the n-ary functions computable in
polynomial time by an inductively defined class based on recursion on notation
with the length of iteration d(x).
We now characterize the class PM_2.5. The class of languages PH = ∪_i S_i,
where {S_i} is the polynomial time hierarchy [8], is such that S_0 = P and S_{i+1} =
NP(S_i). We have P ⊆ NP = S_1, coNP ⊆ S_2.
Theorem 28. The languages over {0, 1} in PH are exactly the languages
computable by PM_2.5 predicates.
where S ∈ P_2. The rest of the proof then follows from the well-known presentation
of PH by means of alternating quantifiers. □
We probably cannot derive limited pair iteration from bounded unary minimum
in the class PM_2. The relations of this class can be characterized by
∪_i PL_i, where {PL_i} is a hierarchy similar to PH with
Clearly, F_1 = F_2 implies (F_1)_* = (F_2)_*. So it remains to show the
converse, where we assume (F_1)_* = (F_2)_* for the above three pairs of relation
classes. We have F_1 ⊆ F_2. Let us denote by (a)_i the P_2 function yielding the
i-th element (0 ≤ i) of the list a, or 0 if i ≥ Len(a). Take any function f ∈ F_2. The
function f is bounded by a function j ∈ F_1. The relation R(x, i) ↔ (B f(x))_i = 1
is in (F_2)_* and hence also in (F_1)_*. Now, Len B f(x) ≤ Len B j(x), and we
can derive by pair iteration a function k ∈ F_1 such that k(x) = B f(x), by
assembling the bits of the binary code with the help of the predicate R. Hence
f(x) = B^{-1} k(x) is in F_1 and we have F_1 = F_2. □
6 Conclusions
Although the central results of this paper are contained in the characterization
theorems 22, 27, 28 and 29, we would like to remind the reader that this research
was started as a search for a feasible declarative programming language based
on subrecursion. We plan to develop and implement our clausal language on
computers in the near future.
References
1 Motivation
such as LISP, Scheme and ML. But although both of the latter are defined
formally [17, 25], neither definition includes the I/O operations.
We address this longstanding but still pertinent problem by supplying both
an operational and a denotational semantics for I/O effects. We work with a
call-by-value PCF-like language, O, equipped with interactive I/O operations
analogous to those of LISP 1.5. We can think of O as a tiny higher-order
imperative language, with an applicative syntax making it a fragment of ML. We
adopt CCS-style bisimilarity as the natural operational equivalence on O programs.
Our first theorem is congruence of bisimilarity, via Howe's method [14],
justifying operationally based equational reasoning about O programs.
The denotational semantics is specified in two stages. First, we give a denotational
semantics to a metalogic M in the category Cppo of cppos and Scott
continuous functions. Second, we give a formal translation of the types and
expressions of O into those of M. M is based on the equational fragment of Crole
and Pitts' FIX-logic [5], but contains a single parameterised recursive datatype
which is used to model computations engaged in I/O, and does not (explicitly)
contain a fixpoint type. Following Plotkin's use of a metalogic to study
object languages [24], we equip the programs (closed expressions) of M with
an operational semantics. Our second theorem shows the 'good fit' between the
domain-theoretic semantics of M and its operational semantics: we prove that
the denotational semantics is sound and adequate with respect to the operational
semantics.
To complete our study, we establish a close relationship between the operational
semantics of each O program and that of its denotation. Hence we prove our
third theorem: that if the denotations of two O programs are provably equal in
the metalogic, the programs are in fact operationally equivalent. The proof is
by co-induction: we show that the relation between O programs of equal
denotations is in fact a bisimulation, and hence contained in bisimilarity.
We overcame two principal difficulties in this study. First, although it is fairly
straightforward to write down operational semantics rules for side-effects, the
essential problem is to develop a useful operational equivalence. Witness the great
current interest in ML plus concurrency primitives: there are many operational
semantics [2, 13] but few if any developed notions of operational equivalence.
Holmström [13] pioneered a stratified approach to mixing applicative and
imperative features, in which a CCS-style labelled transition system for the side-effects
was defined in terms of a 'big-step' natural semantics for the applicative part of
the language. But Holmström's approach fails for the languages of interest here,
in which side-effects may be freely mixed with applicative computation. Instead,
we solve the problem of finding a suitable operational equivalence by expressing
both the applicative and the side-effecting aspects of O in a single labelled
transition system, a family (→^α | α ∈ Act) of binary relations on O programs
indexed by a set of actions, Act. The actions correspond to the atomic
observations one can make of an O program. Milner's classical definition of (strong)
bisimilarity from CCS [16] generates a natural operational equivalence, which
subsumes both Abramsky's applicative bisimulation [1] and the stratified
equivalences suggested by Holmström's semantics [10, 11]. The second main difficulty
was the construction of formal approximation relations in the proof of adequacy
for M. Proof of their existence is complicated by the presence in M of a
parameterised recursive type needed to model O computations engaged in I/O. Our
proof makes use of recent work on algebraic completeness by Freyd [9] and Pitts
[21].
As usual, we identify phrases of syntax up to alpha-conversion, that is, renaming
of bound variables. We write φ ≡ ψ to mean that phrases φ and ψ are alpha-
convertible. We write φ[ψ/x] for the substitution of phrase ψ for each variable x
free in phrase φ. A context, C, is a phrase of syntax with one or more holes.
A hole is written as [·] and we write C[φ] for the outcome of filling each hole in
C with the phrase φ. If R is a relation, R⁺ is its transitive closure, and R* its
reflexive and transitive closure.
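For a finite relation these two closures can be computed directly; a small Python sketch using naive fixpoint iteration, adequate for illustration only:

```python
def transitive_closure(r):
    """R+: the smallest transitive relation containing r (a set of pairs)."""
    closure = set(r)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def refl_trans_closure(r):
    """R*: R+ together with the identity on every element mentioned in r."""
    elems = {x for pair in r for x in pair}
    return transitive_closure(r) | {(x, x) for x in elems}
```

For instance, from {(1, 2), (2, 3)} the transitive closure adds (1, 3), and the reflexive-transitive closure also adds the identity pairs.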
(λx. e) v → e[v/x]        δ(i, j) → i ⊕ j
Ω → Ω
fst(u, v) → u        snd(u, v) → v
if true then p else q → p        if false then p else q → q
E[p] → E[q]    (whenever p → q)
where E is an experiment, a context specified by the grammar
The rules for δ and Ω introduce the possibility of non-termination into O. One
can easily verify that the relation → is a partial function, and that it preserves
types in the expected way. A communicator is a program ready to engage in
I/O, that is, one of the form E[read()] or E[write n], where E is an evaluation
context, a context made up of zero or more experiments. More precisely, such
contexts are given by the grammar E ::= [·] | E[C]. If we let the set of active
programs, Active, ranged over by a and b, be the union of the communicators
and the values, we can easily show that the active programs are the normal forms
of →, that is:
Lemma 1. Active = {p | ¬∃q(p → q)}.
Our behavioural equivalence is based on a set of atomic observations, or actions,
that may be observed of a program. This set, ranged over by α, is given by
where Msg, a set of messages, represents I/O effects. Let Msg, ranged over by
μ, be Msg ≝ {?n, !n | n ∈ N}, where ?n represents input of a number n and !n
output of n.
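On a finite labelled transition system, Milner-style strong bisimilarity can be computed by refining the all-pairs relation until it is a fixed point. The following Python sketch is illustrative only: the LTS over O programs is infinite, and the three-state example with the single action '!1' is our invention:

```python
def bisimilarity(states, actions, step):
    """Greatest strong bisimulation on a finite labelled transition system.
    step(s, a) returns the set of a-successors of state s."""
    rel = {(p, q) for p in states for q in states}
    changed = True
    while changed:
        changed = False
        for (p, q) in list(rel):
            # every a-move of p must be matched by q, and vice versa
            ok = all(any((p2, q2) in rel for q2 in step(q, a))
                     for a in actions for p2 in step(p, a)) \
             and all(any((p2, q2) in rel for p2 in step(p, a))
                     for a in actions for q2 in step(q, a))
            if not ok:
                rel.discard((p, q))
                changed = True
    return rel

# A three-state example: states 0 and 2 both output !1 and stop; state 1 is stuck.
trans = {(0, '!1'): {1}, (2, '!1'): {1}}
sim = bisimilarity({0, 1, 2}, {'!1'}, lambda s, a: trans.get((s, a), set()))
```

Here (0, 2) lands in the bisimilarity while (0, 1) does not, since state 1 cannot match the output action.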
3 The metalogic
We outline a Martin-Löf style type theory which will be used as a metalogic, M,
into which O may be translated and reasoned about. It is based on ideas from
the FIX-logic [5, 6], though M does not explicitly contain a fixpoint type. The
(simple) types of M are given by
in which any type U(σ) occurring in the σ_i is of the form U(X_0), and each function
type in any σ_i has the form σ → σ'_⊥ (thus the function types in the body of the
recursive type are required to be partial). The use of these types in the modelling
of O is essentially standard, but note that the single recursive datatype will be
used in Section 4 to model I/O. The collection of (raw) expressions of M is given
by the grammar in Figure 1,
E ::= x (variable)
    | () (unit value)
    | ⌊n⌋ (literal value)
    | E ⌊op⌋ E (arithmetic)
    | If E then E else E (conditional)
    | (E, E) (pair)
    | Split E as (x, y) in E (projection)
    | c(E) (recursive data)
    | Case E of c(x) → E | ⋯ | c(x) → E (case analysis)
    | λx:σ. E (abstraction)
    | E E (application)
    | Lift(E) (lifted value)
    | Drop E to x in E (sequential composition)
    | Rec x in E (recursion)
Most of the syntax of M is standard [6, 7]. The types are either a type variable,
a unit type, Booleans, integers, products, exponentials, liftings, or a single,
parameterised recursive datatype whose body consists of a disjoint sum of
instances of the latter types. Here, the expressions Lift(E) and Drop E_1 to x in E_2
give rise to an instance of (the type theory corresponding to) the lifting
computational monad [19]. A closed type σ is one in which there are no occurrences of
the type variable X_0, and we omit the easy formal definition. We define a type
assignment system [4] for M which consists of rules for generating judgements
of the form Γ ⊢ E:σ, called proved expressions, where σ is a closed type, and
the environment Γ is a finite list x_1:σ_1, ..., x_n:σ_n of (variable, closed type)
pairs. Most of the rules for generating these judgements are fairly standard [6].
We give a few example rules in Figure 2.
Finally, we equip the syntax of M with an operational semantics. This is specified
in two ways: first in the style of natural semantics 'big-step' reduction
relations, and second in 'small-step' reduction relations. The former is specified
via judgements of the form P ⇓ V where ⇓ ⊆ Prog_M × Value_M. The latter
reductions take the form P_1 → P_2, with P_1 and P_2 both M programs. We omit
most of the rules for generating the operational semantics, except those associated
with recursion and the recursive datatype, which appear in Figure 4. Given
any program P, we write P⇓ to mean that there is a value V for which P ⇓ V.
As usual for a deterministic language, we can prove that P ⇓ V iff P →* V.
Theorem 2.
(1) If P ∈ Prog^M_σ and P → P', then P' ∈ Prog^M_σ and moreover P = P':σ is a
theorem of M.
(2) If P = Lift(P'):σ_⊥ is a theorem of M then there exists a value V for which
V ∈ Value^M_{σ_⊥} and P ⇓ V.
It is easy to prove the first part by rule induction on P → P'. A corollary is that
whenever P ⇓ V, P = V:σ is a theorem of M.
In order to prove the second part, we first give a denotational semantics to M
in the category Cppo of complete pointed posets (cppos) and Scott continuous
functions. For us, a cppo is a poset which is complete in the sense of having joins
of all ω-chains, and pointed in the sense of having a bottom element. Closed types
will be modelled by cppos, and the proved terms by Scott continuous functions.
In order to set up the denotational semantics, we define a set of functors, each
functor being of the form
where Cppo_⊥ is the category of cppos and strict continuous functions. These
functors are introduced to provide convenient machinery for specifying the
semantics of types, and for inducing functions which arise when we later prove
the existence of certain logical relations. The cppos X⁻ and X⁺ will model the
• F_Unit(X⁻, X⁺, Q⁻, Q⁺) ≝ 1_⊥,
• F_{σ→σ'} ≝ F_σ(X⁺, X⁻, Q⁺, Q⁻) → F_{σ'}(X⁻, X⁺, Q⁻, Q⁺),
• F_{U(X_0)}(X⁻, X⁺, Q⁻, Q⁺) ≝ Q⁺,
• F_{X_0}(X⁻, X⁺, Q⁻, Q⁺) ≝ X⁺, and
• F(X⁻, X⁺, Q⁻, Q⁺) ≝ (Σ_i F_{σ_i}(X⁻, X⁺, Q⁻, Q⁺))_⊥ (where Σ denotes a
coproduct of cpos, itself a cppo).
The remaining clauses are omitted. The definition of the semantics of the closed
types σ, written ⟦σ⟧, is as the reader expects, except possibly for a recursive
type U(σ). There is, for each pair (X⁻, X⁺) of cppos, a functor
then
⟦Γ ⊢ Case E of c_1(x_1) → E_1 | ⋯ | c_n(x_n) → E_n : σ'⟧(ρ)
  ≝ ⊥ if ⟦E⟧(ρ) = ⊥
  ≝ ⟦E_j⟧(ρ, d_j) if ⟦E⟧(ρ) = in_j(d_j).
To prove Theorem 2, we shall show that there is a type-indexed family of relations
◁_σ ⊆ ⟦σ⟧ × Prog^M_σ satisfying certain conditions. Such formal approximation
relations are fairly standard (see for example [7, 22, 24]) so we simply give these
conditions at lifted and recursive types:
d ◁_{U(σ)} P iff d = ⊥, or ∃P_j. P ⇓ c_j(P_j) and ∃d_j. d = in_j(d_j) and d_j ◁_{σ_j[σ/X_0]} P_j.
ℛ(X) ≝
{ R ∈ 𝒫(X × Π) | for each P:σ, {x | (x, P:σ) ∈ R} ⊆ X is chain complete }.
Write D ≝ D(1, 1). We then define (monotone) functions, at each closed type
σ, where F_σ: ℛ(D)^op × ℛ(D) → ℛ(F_σ(1, 1, D, D)), by inductive clauses such as
F_{σ→σ'}(R, S) ≝ {(f, P:σ→σ') | f = ⊥, or
P ⇓ λx. E' and ∀(d, P_1:σ) ∈ F_σ(S, R). (f(d), E'[P_1/x]:σ') ∈ F_{σ'}(R, S)}.
4 The translation of O into M
Val: σ → Tσ
Let: Tσ → (σ → Tσ') → Tσ'.
(Strictly speaking, these are type schemes, and Val and Let are type-indexed
families of combinators.) The idea behind this monadic translation is that Val
and Let correspond to immediate termination and sequential composition
respectively. We can define Val ≝ λx. Return(x), and Let has a recursive definition
that roughly speaking stitches together the strings of I/O operations denoted by
its two arguments,
where Fix: (σ → σ'_⊥) → (σ → σ'_⊥) is a fixpoint combinator defined from Rec [10].
(Note that let, w, o and their primed variants are simply M variables.)
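The computation type Tσ can be pictured as a resumption datatype: a computation either returns a value, requests an input, or produces an output and continues. The following Python sketch mirrors Val and Let under that reading; the concrete representation and the cont/rest field names are ours, not the paper's:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Return:          # immediate termination with a value
    value: Any

@dataclass
class Read:            # input request; cont consumes the number read
    cont: Callable[[int], Any]

@dataclass
class Write:           # output of msg, then the rest of the computation
    msg: int
    rest: Any

def Val(x):
    """Immediate termination."""
    return Return(x)

def Let(c, f):
    """Sequential composition: run c, pass its value to f, stitching
    together the strings of I/O operations of both computations."""
    if isinstance(c, Return):
        return f(c.value)
    if isinstance(c, Read):
        return Read(lambda n: Let(c.cont(n), f))
    return Write(c.msg, Let(c.rest, f))
```

Usage: Let(Read(Val), lambda n: Write(n + 1, Val(n))) first reads a number, then writes its successor before terminating.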
We simultaneously define the translation (−)° of arbitrary O expressions to M
expressions, and an auxiliary translation |−|° of O value expressions. Here are
the rules for value expressions:
|x|° ≡ x
|()|° ≡ ()
|n|° ≡ ⌊n⌋
|⊕|° ≡ λx. Split x as (y, y') in Drop y ⌊⊕⌋ y' to z in Val z
|fst|° ≡ λx. Split x as (y, z) in Val y
|snd|° ≡ λx. Split x as (y, z) in Val z
|f|° ≡ Fix(λf. λx. (e)°)   where fun f(x:σ):σ' ≝ e
|(v, u)|° ≡ (|v|°, |u|°)
|λx:σ. e|° ≡ λx:(σ)°. (e)°
|read|° ≡ λx:Unit. Read(Val)
|write|° ≡ λx:Int. Write(x, Val ())
and here are the rules for arbitrary expressions, where Let x ⇐ E in E' is an
abbreviation for Let (E, λx. E').
(v)° ≡ Return(|v|°) (*)
(Ω)° ≡ Rec x in x
(if e_1 then e_2 else e_3)° ≡ Let x ⇐ (e_1)° in If x then (e_2)° else (e_3)°
(v e')° ≡ Let x ⇐ (e')° in |v|° x (**)
(e e')° ≡ Let f ⇐ (e)° in Let x ⇐ (e')° in f x
((v, e'))° ≡ Let y ⇐ (e')° in Val(|v|°, y) (**)
((e, e'))° ≡ Let x ⇐ (e)° in Let y ⇐ (e')° in Val(x, y)
Rules marked (*) and (**) take precedence over later rules.
Lemma 5 (Correspondence).
(1) Let x ⇐ Return(P) in E →⁺ E[P/x]
Proof. Suppose that p S q and that p →^α p'. By Lemma 2 there is a with p →* a
and a →^α p'. By Lemma 5 we have (p)° →* (a)° and therefore (p)° = (a)°
by Theorem 2. By transitivity (q)° = (a)° is derivable, so by Theorem 2 and
Lemma 6 we have (q)°⇓. Hence q⇓ by Lemma 7, that is, there is b with q →* b.
By Lemma 5 and Theorem 2 we have (q)° = (b)° and so (a)° = (b)° by
transitivity. Hence by Lemma 8 there is q' with b →^α q' and (p')° = (q')°.
Altogether we have q →^α q' and p' S q'. A symmetric argument shows that q
can match any action of p, hence S is a bisimulation. □
The soundness of metalogical reasoning follows by co-induction, Lemma 3.
Theorem 3 (Soundness). (p)° = (q)° implies p ∼ q.
5 Discussion
Acknowledgements. We thank Simon Gay, Andrew Pitts and Eike Ritter for
useful discussions. Roy Crole was supported by a Research Fellowship from the
EPSRC. Andrew Gordon was funded by the Types BRA. This work was partially
supported by the CLICS BRA.
References
1. Samson Abramsky and Luke Ong. Full abstraction in the lazy lambda calculus.
   Information and Computation, 105:159-267, 1993.
2. Dave Berry, Robin Milner, and David N. Turner. A semantics for ML concurrency
   primitives. In 19th POPL, pages 119-129, 1992.
3. Gérard Boudol. Towards a lambda-calculus for concurrent and communicating
   systems. In TAPSOFT'89, Springer LNCS 351, 1989.
4. Roy L. Crole. Categories for Types. CUP, 1993.
5. Roy L. Crole and A. M. Pitts. New foundations for fixpoint computations: FIX
   hyperdoctrines and the FIX-logic. Information and Computation, 98:171-210, 1992.
6. Roy L. Crole. Programming Metalogics with a Fixpoint Type. PhD thesis,
   University of Cambridge, 1992.
7. Roy L. Crole and Andrew D. Gordon. Factoring an adequacy proof (preliminary
   report). In Functional Programming, Glasgow 1993, Springer, 1994.
8. Marcello P. Fiore and Gordon D. Plotkin. An axiomatisation of computationally
   adequate domain theoretic models of FPC. In 9th LICS, 1994.
9. P. Freyd. Algebraically complete categories. In 1990 Como Category Theory
   Conference, Springer Lecture Notes in Mathematics, 1991.
10. Andrew D. Gordon. Functional Programming and Input/Output. CUP, 1994.
11. Andrew D. Gordon. An operational semantics for I/O in a lazy functional
    language. In FPCA'93, pages 136-145, 1993.
12. Andrew D. Gordon. A tutorial on co-induction and functional programming. In
    Functional Programming, Glasgow 1994, Springer Workshops in Computing.
13. Sören Holmström. PFL: A functional language for parallel programming. Report
    7, Chalmers PMG, 1983.
14. Douglas J. Howe. Equality in lazy computation systems. In 4th LICS, 1989.
15. John McCarthy et al. LISP 1.5 Programmer's Manual. MIT Press, 1962.
16. Robin Milner. Communication and Concurrency. Prentice-Hall, 1989.
17. R. Milner, M. Tofte and R. Harper. The Definition of Standard ML. MIT Press, 1990.
18. Peter D. Mosses. Denotational semantics. In Jan van Leeuwen, editor, Handbook
    of Theoretical Computer Science, pages 575-631. Elsevier, 1990.
19. Eugenio Moggi. Notions of computation and monads. TCS, 93:55-92, 1989.
20. Andrew M. Pitts. Evaluation logic. In IVth Higher Order Workshop, Banff 1990,
    pages 162-189. Springer, 1991.
21. Andrew M. Pitts. Relational properties of domains. Technical Report 321,
    University of Cambridge Computer Laboratory, December 1993.
22. Andrew M. Pitts. Computational adequacy via 'mixed' inductive definitions. In
    MFPS IX, New Orleans 1993, pages 72-82, Springer LNCS 802, 1994.
23. Gordon D. Plotkin. Pisa notes on domains, June 1978.
24. Gordon D. Plotkin. Denotational semantics with partial functions. Stanford CSLI, 1985.
25. Jonathan Rees and William Clinger. Revised³ report on the algorithmic language
    Scheme. ACM SIGPLAN Notices, 21(12):37-79, December 1986.
26. Philip Wadler. The essence of functional programming. In 19th POPL, 1992.
27. John H. Williams and Edward L. Wimmers. Sacrificing simplicity for convenience:
    Where do you draw the line? In 15th POPL, 1988.
An Intuitionistic Modal Logic
with Applications to the
Formal Verification of Hardware
1 University of Sheffield
Department of Computer Science
Regent Court, Sheffield S1 4DP, UK
Email: m.fairtlough@dcs.shef.ac.uk
2 (Corresponding author)
Technical University of Denmark
Department of Computer Science
Building 344, DK-2800 Lyngby, Denmark
Email: mvm@id.dtu.dk
1 Motivation
tion at the gate level. It is convenient to reason about the static behaviour of a
combinational circuit in terms of high or low voltage, and to abstract away from
propagation delays. This makes it possible to analyse large circuits by classical
propositional logic and standard Boolean techniques. In this 'ideal' abstract
setting the behaviour of the circuit shown in Fig. 1 may be specified by "if input
A is 0 then output C is 0," formally
A = 0 ⊃ C = 0.
weakened by the stability constraint 'B stable' to rule out the glitch, and by the
timing constraint 'after some delay' to account for the input/output propagation
delay. The trouble is that both constraints refer to time and thus belong to the
concrete level of timed signals. Thus the dominant formalism of Boolean algebra,
or propositional logic for that matter, is not adequate to capture correctness of
the timing abstraction. No problem, one might say, since of course at the low
level one can make things precise, say by
Though this is the state of the art in formal hardware verification, it cannot be
the answer, as it jeopardizes the crucial distinction between the abstraction levels.
We are throwing overboard the abstractness of the Boolean approach, and we
are back where we started, entangled in the nitty-gritty details of exact timing
verification.
Fortunately, there is a middle way of tackling the problem of approximative
and incomplete abstractions: we employ a weakened notion of correctness,
viz. correctness-up-to-constraints, and formalize it as a logical modality. The
example circuit's behaviour would then be specified by
In this paper we shall present a concrete formal calculus, Propositional Lax Logic,
conceived along the lines set out in [12]. The term 'lax logic' is chosen to indicate
the 'looseness' associated with the notion of correctness up to constraints.
Propositional Lax Logic, PLL, is an intuitionistic propositional calculus with a
single modality O. The intuitive interpretation of OM is "for some constraint
c, formula M holds under c". Clearly, different notions of constraint will have
different properties, and thus will give rise to different axioms for O. The generic
interpretation leads to the following three axioms:
OR : M ⊃ OM
OM : OOM ⊃ OM
OF : (M ⊃ N) ⊃ (OM ⊃ ON).
Axiom OR says "if M holds outright then it holds under a (trivial) constraint";
OM says "if, under some constraint, M holds under another constraint, then
M holds under a (combined) constraint"; finally, OF says "if M implies N,
then if M holds under a constraint, N holds under a (the same) constraint."
However innocent each of these axioms may appear in this informal reading,
their combination results in a rather strange modality. Indeed, O has a flavour
of both possibility and of necessity without being one or the other. Axioms OR
and OM are typical of possibility, while OF is typical of necessity. On the other
hand, in standard systems, say Lewis' modal system S4 [2], the axiom OR is
never adopted for necessity and OF never for possibility; in fact they would
trivialize the modalities.
The second noteworthy feature of PLL is that it is an intuitionistic rather than
a classical logic, which so far has been the basis of the great majority of approaches
in the area of hardware verification. In [11, 12] the intuitionistic nature has
been exploited to extract constraints from proofs. Yet dropping the Excluded
Middle is not merely for pragmatic reasons: PLL is essentially intuitionistic in
the sense that assuming the Excluded Middle and the axiom ¬O false trivializes
O, i.e. OM becomes provably equivalent to M. This is another indication of the
'strangeness' of O in the context of standard classical modal logics. Of course,
we hope this paper will convince the reader that the O modality is actually a
very natural one.
So, just what kind of modality is O? Why should it be interesting at all, and how
does it relate to correctness-up-to-constraints? In [11, 12] O is motivated by a
proof-theoretic interpretation. The present paper attempts to justify the axioms
and their constraint interpretation by model-theoretic means. In this paper we
will show that PLL has a natural class of Kripke models for which it is sound
and complete. Two concrete subclasses of such models will be presented, yielding
two concrete constraint interpretations of O. These concrete models verify
that PLL has nontrivial expressiveness and illustrate the benefit of dropping
Excluded Middle and ¬O false. But before we get to the technical results it may
be appropriate to give some general justification of O.
(1) Consider the simplest constraint interpretation, viz. OM ≡ C ⊃ M,
where C is an arbitrary but fixed constraint. Under this encoding all three axioms
OR, OM, OF become tautologies of (intuitionistic) propositional logic. With
modification, this generalizes to a set 𝒞 of constraints: OM ≡ ∃C ∈ 𝒞. C ⊃ M
(see [11]). The single constraint interpretation is precisely Curry's system LJZ
[3]. PLL itself appears to have occurred for the first time in Curry's 1948 Notre
Dame lectures on A Theory of Formal Deducibility [4]. These lectures contain
some sketchy remarks on an O modality endowed with axiom schemata that are
essentially equivalent to the ones we are adopting for O. The present paper may
be seen as giving a model-theoretic account of Curry's proof-theoretic O and in
particular of LJZ.
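The claim for the single-constraint encoding can be checked mechanically. The Python sketch below verifies, by exhaustive truth tables, that OR, OM and OF become classical tautologies under the encoding O M := C ⊃ M; the paper's claim is the stronger intuitionistic one, which a truth table does not by itself establish:

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    return (not a) or b

def O(m: bool, c: bool) -> bool:
    # the encoding O M := C -> M for a fixed constraint C
    return implies(c, m)

for C, M, N in product([False, True], repeat=3):
    assert implies(M, O(M, C))                                 # OR
    assert implies(O(O(M, C), C), O(M, C))                     # OM
    assert implies(implies(M, N), implies(O(M, C), O(N, C)))   # OF
```

The loop raises no assertion error, confirming all three schemata for every valuation of C, M and N.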
(2) A second motivation for O can be drawn from general type theory. The formal
properties of O viewed as a unary type constructor are precisely the data of
a strong closure operator, or strong monad, familiar from category theory. In
fact, the propositions-as-types principle, which yields an equivalence between
Intuitionistic Propositional Logic (IPC) and bi-Cartesian closed categories, can be
extended to an equivalence between PLL and bi-Cartesian closed categories with
a strong monad. This categorical structure is also known as the computational
lambda calculus λc [13]. The application of λc as a calculus of proofs for PLL
has been investigated by Benton et al. [1] (there the logic is called CL).
(3) The third motivation for O is the possibility of a timing analysis of
combinational circuits. In an equivalent presentation of PLL we can replace OF by the
axiom
OS : (OM ∧ ON) ⊃ O(M ∧ N)
and the additional inference rule "from M ⊃ N infer OM ⊃ ON". One may
now establish a direct correspondence between the axioms used in verifying the
functional behaviour of a combinational circuit and the computation of a data-
dependent timing constraint: OR corresponds to a wire, which involves zero
delay 0; OM deals with the sequential composition of circuits, which involves the
addition of delays +; and OS effects the parallel composition of circuits, which
amounts to the maximum operation max on delays. In other words, by systematic
translation of proofs in PLL into a term over the delay algebra (Nat, 0, +, max)
we can extract verification-driven (= data-dependent) timing information. This
is essentially an interpretation, in the sense of (2), in a concrete λc-calculus.
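This reading of the axioms suggests a tiny delay interpreter over the algebra (Nat, 0, +, max). The circuit representation below, tagged tuples with 'gate' carrying an assumed primitive delay, is our illustration and not the paper's notation:

```python
def delay(circuit) -> int:
    """Data-dependent delay of a circuit description over (Nat, 0, +, max)."""
    tag = circuit[0]
    if tag == 'wire':   # OR: a wire contributes zero delay
        return 0
    if tag == 'gate':   # a primitive component with a given delay
        return circuit[1]
    if tag == 'seq':    # OM: sequential composition adds delays
        return delay(circuit[1]) + delay(circuit[2])
    if tag == 'par':    # OS: parallel composition takes the maximum delay
        return max(delay(circuit[1]), delay(circuit[2]))
    raise ValueError(f"unknown circuit form: {tag}")
```

For instance, two gates of delays 2 and 3 in parallel, followed by a gate of delay 1, yield a total delay of max(2, 3) + 1 = 4.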
3 Results
(OR)  Γ ⊢ M  /  Γ ⊢ OM        (OL)  Γ, M ⊢ ON  /  Γ, OM ⊢ ON
The complete set of rules is listed in Fig. 2. One can verify that the Hilbert and
Gentzen systems for PLL are equivalent, i.e. for all formulas M, ⊢_PLL M iff ⊢ M
is derivable.
The deduction theorem does not hold for ordinary modal logics. For instance, in
K, T, S4 [2] we have M ⊢ □M but ⊬ M ⊃ □M, and M ⊃ N ⊢ ◊M ⊃ ◊N but
⊬ (M ⊃ N) ⊃ (◊M ⊃ ◊N).
Logical Rules
(∨L)   Γ, M ⊢ Δ    Γ, N ⊢ Δ  /  Γ, M ∨ N ⊢ Δ
(∨R1)  Γ ⊢ M  /  Γ ⊢ M ∨ N
(∨R2)  Γ ⊢ N  /  Γ ⊢ M ∨ N
(⊃R)   Γ, M ⊢ N  /  Γ ⊢ M ⊃ N
(⊃L)   Γ ⊢ M    Γ, N ⊢ Δ  /  Γ, M ⊃ N ⊢ Δ
(¬R)   Γ, M ⊢  /  Γ ⊢ ¬M
(¬L)   Γ ⊢ M  /  Γ, ¬M ⊢
(OR)   Γ ⊢ M  /  Γ ⊢ OM
(OL)   Γ, M ⊢ ON  /  Γ, OM ⊢ ON
Structural Rules
(id)     M ⊢ M
(cut)    Γ ⊢ M    Γ, M ⊢ Δ  /  Γ ⊢ Δ
(weakL)  Γ ⊢ Δ  /  Γ, M ⊢ Δ
(weakR)  Γ ⊢  /  Γ ⊢ M
The proof uses the same method that works for IPC [5]. One new reduction step
needs to be introduced, as shown in Fig. 3: a cut of Γ ⊢ OM (obtained by OR
from a derivation Π1 of Γ ⊢ M) against Γ, OM ⊢ ON (obtained by OL from a
derivation Π2 of Γ, M ⊢ ON) reduces to the direct cut of Π1 against Π2, which
again yields Γ ⊢ ON.
- ⊢_PLL M iff ⊨ M.
- PLL + ¬O false is sound and complete for the class of constraint models with
F = ∅.
- PLL + O(M ∨ N) ⊃ (OM ∨ ON) is sound and complete for the class of
constraint models where R_m and R_i are mutually confluent, i.e. if a R_m b
and a R_i c, then there exists d such that b R_i d and c R_m d.
Proof. We indicate the proof of the first statement, which follows traditional
lines in constructing a counter model for every formula that is not derivable. The
counter model employs a suitable generalization of the Lindenbaum construction,
in which worlds are triples
(Γ, Δ, Θ)
which falsifies all unprovable formulas. As the elements in C* we take the
maximally consistent theories (Γ, Δ, Θ). The accessibility relation R* is simply the
subset relation on the first component, i.e.
It is not hard to verify that these data indeed constitute a constraint Kripke
model.
Suppose ⊬_PLL M. Then (∅, {M}, ∅) is consistent. Take a maximally consistent
extension 𝒯; then 𝒯 ⊭ M in the constraint Kripke model C*.
A complete proof of this result may be found in [7]. □
Two examples of counter models, one falsifying ¬O false and the other falsifying
O(A ∨ B) ⊃ (OA ∨ OB), are shown in Fig. 4. Both may be seen to be constructed
from C*. The solid arrows represent R*_i and the dashed lines R*_m.
363
I I
I I
I I
I I
v.E F*
~A ~A ~B
In this paper we give two variants of concrete models for PLL. The first class
of models is characterized by formulas of the kind OM ≡ C[M], where C[_] is
one of a family of possible contexts, for example C[M] ≡ C ⊃ M where C
is an arbitrary but fixed proposition. As mentioned, this is precisely Curry's
system LJZ [3] and a special case of the quite general constraint interpretation
according to which OM means ∃γ. γ ⊃ M, where γ is taken from a predefined set of
distinguished propositions representing constraints. Other possible contexts are
C[M] ≡ C ∨ M or C[M] ≡ (M ⊃ C) ⊃ C.
Let PLL^C be the theory
PLL + OM ≡ (C ⊃ M)
and ℳ^C the class of (antisymmetric) Kripke constraint models validating PLL^C.
The PLL^C interpretation of O provides us with a class of models for which the
axiom schemata ¬O false and O(M ∨ N) ⊃ (OM ∨ ON) are unsound, in general.
The former is valid iff ⊨ ¬¬C, and the latter whenever [C] = { w | w ⊨ C } is
a principal ideal, i.e. for some z, [C] = { w | z R_i w }.
Let PLL1 and PLL2 be the theories
Lemma 8.
The second type of models to be investigated is still more concrete. They are
obtained from the dynamic behaviour of combinational circuits under explicit
modelling of propagation delays, so that OM means there exists a timing
constraint d such that the circuit stabilizes in state M after time delay d. Concretely,
the circuit models are set up such that for a propositional constant A
This allows us to retain the ideal 'static' interpretation of truth values while
safely keeping track of the offset to the real signals caused by propagation delays.
For instance, the example circuit mentioned in the introductory section (1) can
now be specified by
(B ∨ ¬B) ⊃ (¬A ⊃ O¬C).
[Timing diagram: the signals I(A) and I(B) plotted over the time points
0, t1, t2, t3, t4, t5, t6, t7, ∞.]
is constructed as follows:
- W(I) is the set of Leibniz intervals for I.
- [s, t) R_i [s', t') if [s', t') is a subinterval of [s, t).
- [s, t) R_m [s', t') if [s', t') is a final subinterval of [s, t), i.e. t = t' and s ≤ s'.
- [s, t) ∈ V(I)(A) if I(A) is constant 1 throughout [s, t), i.e. ∀x. s ≤ x < t,
I(A)(x) = 1.
- F(I) is the set of empty intervals [s, s).
The set W(I) is clearly nonempty, as it always contains the pairs [0, 0) and [0, ∞).
The other properties of a constraint Kripke model are easily verified. Also, as
this model is confluent, it satisfies the axiom O(M ∨ N) ⊃ (OM ∨ ON).
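The valuation clause for atoms, and the reading of O as stabilization on a final subinterval, can be sketched over discrete time. The paper works with real-time waveforms; the integer time points and the particular O clause below are our simplifying assumptions:

```python
def forces_atom(I_A, s: int, t: int) -> bool:
    """[s, t) is in V(I)(A): the signal I(A) is constantly 1 throughout [s, t).
    An empty interval [s, s) forces A vacuously."""
    return all(I_A(x) == 1 for x in range(s, t))

def forces_O_atom(I_A, s: int, t: int) -> bool:
    """A sketch of O A at [s, t): A holds on some final subinterval [s', t),
    i.e. the signal stabilizes to 1 before the right endpoint."""
    return any(forces_atom(I_A, s2, t) for s2 in range(s, t + 1))

# A signal that glitches low before stabilizing to 1 at time 3:
sig = lambda x: 1 if x >= 3 else 0
```

Here forces_atom(sig, 0, 6) fails because of the glitch, yet forces_O_atom(sig, 0, 6) holds, illustrating how the modality delivers safe stabilization information in the presence of glitches.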
The paper presented a novel intuitionistic modal logic, PLL, a conservative
extension of the standard intuitionistic propositional calculus by a new modal
operator O to capture the notion of 'correctness-up-to-constraints'. Algebraically,
the modality is a strong closure operator or, from a type-theoretic perspective,
a strong monad. The main result is that PLL has a natural class of two-frame
Kripke models for which it is sound and complete. This provides a satisfactory
We have given a number of concrete types of models for PLL, one of them
motivated by hardware verification. We interpret PLL over timing diagrams such
that O expresses truth up to stabilization. In the resulting theory, Circuit-PLL,
one derives safe stabilization information even in the presence of glitches, where
the standard classical reasoning is sound only under implicit stabilization
assumptions. We show that this logic is able to express nontrivial stability
behaviour.
For circuits where delays do not invalidate functional correctness, such as syn-
chronous circuits, it is often necessary or advantageous to combine functional
and timing analysis so as to derive the 'exact' data-dependent delay of com-
binational circuitry. We anticipate using PLL to do this with standard proof
extraction techniques based on a concrete computational lambda calculus as
mentioned in the introductory section 1. In this context an automatic theorem
prover for PLL will be useful. However, it is not yet clear how such extraction
techniques could be integrated with automatic proof search based on cut-free
sequent calculus presentations of the logic. We are developing an implementa-
tion based on such an approach and one of our goals is to incorporate constraint
extraction.
6 Acknowledgements
Rod Burstall and Terry Stroup have had decisive influence on the development
and presentation of this work. Michael Mendler is indebted to Rod for his
encouragement and stimulating supervision of the author's Ph.D. research, on which
this work is built.
Thanks are also due to Roy Dyckhoff for his interest and for providing useful
references to the literature. The authors are grateful to Nick Benton for discus-
sions on the computational interpretation of PLL, and to Pierangelo Miglioli for
pointing out the connection with Maksimova's intermediate logic.
Michael Mendler is supported by a Human Capital and Mobility fellowship in
the EuroForm network.
References
1. N. Benton, G. Bierman, and V. de Paiva. Computational types from a logical per-
spective I. Draft Technical Report, Computer Laboratory University of Cambridge,
U.K., August 1993.
2. B. Chellas. Modal Logic. Cambridge University Press, 1980.
3. H. B. Curry. The elimination theorem when modality is present. Journal of
Symbolic Logic, 17:249-265, 1952.
4. H. B. Curry. A Theory of Formal Deducibility, volume 6 of Notre Dame Mathem-
atical Lectures. Notre Dame, Indiana, second edition, 1957.
5. M. Dummett. Elements of Intuitionism. Clarendon Press, Oxford, 1977.
6. W. B. Ewald. Intuitionistic tense and modal logic. Journal of Symbolic Logic, 51,
1986.
7. M. Fairtlough and M. Mendler. An intuitionistic modal logic with applications to
the formal verification of hardware. Technical Report ID-TR:1994-13, Department
of Computer Science, Technical University of Denmark, 1994.
8. G. Fischer-Servi. Semantics for a class of intuitionistic modal calculi. In
M. L. Dalla Chiara, editor, Italian Studies in the Philosophy of Science, pages
59-72. Reidel, 1980.
9. M. Fitting. Proof Methods for Modal and Intuitionistic Logics. Reidel, 1983.
10. L. L. Maksimova. On maximal intermediate logics with the disjunction property.
Studia Logica, 45:69-75, 1986.
11. M. Mendler. Constrained proofs: A logic for dealing with behavioural constraints
in formal hardware verification. In G. Jones and M. Sheeran, editors, Designing
Correct Circuits, pages 1-28. Springer, 1990.
12. M. Mendler. A Modal Logic for Handling Behavioural Constraints in Formal
Hardware Verification. PhD thesis, Edinburgh University, Department of Computer
Science, ECS-LFCS-93-255, 1993.
13. E. Moggi. Computational lambda-calculus and monads. In Proceedings LICS'89,
pages 14-23, June 1989.
14. G. Plotkin and C. Stirling. A framework for intuitionistic modal logics. In Theor-
etical aspects of reasoning about knowledge, pages 399-406, Monterey, 1986.
15. A. Simpson. The Proof Theory and Semantics of Intuitionistic Modal Logic. PhD
thesis, University of Edinburgh, Department of Computer Science, 1994.
16. A. S. Troelstra and D. van Dalen. Constructivism in Mathematics, volume II.
North-Holland, 1988.
Towards Machine-checked Compiler Correctness
for Higher-order Pure Functional Languages
1 Introduction
Much of the work done previously in compiler correctness concerns restricted
subsets of imperative languages. Some studies involve machine-checked
correctness, e.g. Cohn [1], [2]. A lot of research has been devoted to the
construction of compiler-compilers, as in the work of Mosses [6], Paulson [10], and
Wand [20]. A recent attempt in this field is reported in [9].
Developing a proof of compiler correctness for a higher-order functional lan-
guage is made considerably more difficult by the need to use inclusive predicates
to relate an operational semantics (or a continuation semantics) to the direct
semantics. A complete proof of the correctness of a lazy functional language
compiler is presented in Lester [3, 4]; however, it has not been machine-checked.
Methods and important results have been published by Stoy [17, 18] and Wand
[19].
In order to present the problem in a relatively short paper, we have considered
a simplified form of the problem of compiler correctness, and its mechanized
proof. We hope subsequently to extend the work to a full compiler. Here we
discuss the use of machine-assisted proof in asserting the congruence between
two definitions of a fully-fledged language of lambda expressions. We use Isabelle
Axioms are quoted in the format

axiom_name : axiom

The major lemmas that were needed for the proof of the final result are also
quoted in this format. The usual denotational semantics notation is used (we have
pretty-printed Isabelle's axioms and theorems using a Gofer¹ script to make them more readable).
1.1 The language and its denotational semantics
Exp_ind :
[ ∀n. P (EConst n);
  ∀x y. P x ⇒ P y ⇒ P (EAp x y);
  ∀i. P (EVar i);
  ∀i e. P e ⇒ P (ELam i e);
  ∀i e. P e ⇒ P (ELamV i e) ] ⇒ ∀x. P(x)
The domains used in the definition of the direct semantics of the language are
given below. Expression values are of two kinds: basic values (natural numbers
in our particular formulation) and function values, which are restricted to be
continuous functions. For convenience the semantic domain of environments is
also given a name: U. The domain E has been defined as a new type in Isabelle.
¹ Gofer, a lazy pure functional language related to Haskell, was devised and implemented by Mark Jones at Oxford.
Definition 1.1

n ∈ B                Basic values
ε ∈ E = B + F        Expression values
φ ∈ F = [E → E]      Function values
ρ ∈ U = [Ide → E]    Environments
The domain injection (in) and projection (|) operations are defined in the
usual way. For the domain of functions, in particular:

def_F   : (φ in F) | F = φ
def_B   : (n in B) | F = ?
def_Err : ? | F = ?
def_UU  : ⊥ | F = ⊥
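The injection/projection behaviour can be animated in a few lines of Python; the tagged-pair encoding, and None and ERR standing for ⊥ and ?, are our own representation choices, not the paper's Isabelle formalization:

```python
# Sketch of E = B + F with injection (in) and projection (|F).
BOT = None
ERR = object()           # the error element '?'

def in_B(n):             # n in B
    return ("B", n)

def in_F(phi):           # phi in F
    return ("F", phi)

def proj_F(e):           # e | F
    if e is BOT:
        return BOT       # projecting bottom gives bottom
    if e is ERR:
        return ERR       # projecting the error gives the error
    tag, v = e
    return v if tag == "F" else ERR  # wrong summand: error

ident = in_F(lambda x: x)
assert proj_F(ident)(42) == 42       # (phi in F) | F = phi
assert proj_F(in_B(3)) is ERR        # a basic value is not a function
assert proj_F(BOT) is BOT
```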
Definition 1.2

eval'EConst : É[[EConst n]]ρ = n in B
eval'EAp    : É[[EAp e1 e2]]ρ = ((É[[e1]]ρ) | F) (É[[e2]]ρ)
eval'EVar   : É[[EVar i]]ρ = ρ i
eval'ELam   : É[[ELam i e]]ρ = (λε. É[[e]](ρ[i ↦ ε])) in F
eval'ELamV  : É[[ELamV i e]]ρ = (strict (λε. É[[e]](ρ[i ↦ ε]))) in F
The only point of interest is that the semantics for the call-by-value case
insists on evaluating its argument before evaluating the function body. This is specified
by the use of the function strict, which is defined by the following axioms²:
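The standard reading of strict, an assumption of this sketch, is that strict f ⊥ = ⊥ and strict f ε = f ε for ε ≠ ⊥; in Python, with None standing for ⊥:

```python
# strict forces its argument: applying a strictified function to bottom
# yields bottom, without ever calling the underlying function.
def strict(f):
    return lambda e: None if e is None else f(e)

inc = strict(lambda n: n + 1)
assert inc(4) == 5
assert inc(None) is None   # the argument is demanded before the call
```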
Definition 1.3

w ∈ W = [K → E]      Closures
k ∈ K = [E → E]      Continuations
ε ∈ E = B + F        Expression values
φ ∈ F = [W → W]      Function values
ρ ∈ U = [Ide → W]    Environments
An axiom for case analysis can be formulated for the domain E of expression
values:
Definition 1.4

E :: Exp → U → W
evalEConst : E[[EConst n]]ρ = (λk. k (n in B))
evalEAp    : E[[EAp e1 e2]]ρ = (λk. E[[e1]]ρ (λε. (ε | F) (E[[e2]]ρ) k))
evalEVar   : E[[EVar i]]ρ = (λk. ρ i k)
evalELam   : E[[ELam i e]]ρ = (λk. k ((λw. E[[e]](ρ[i ↦ w])) in F))
evalELamV  : E[[ELamV i e]]ρ = (λk. k ((λw. λk1. w (λε. E[[e]](ρ[i ↦ (λk2. k2 ε)]) k1)) in F))
The reader can observe a property of the semantics of lambda expressions which
are in weak head normal form (EConst, ELam, ELamV). The expression closure
w corresponding to such an expression is always of the form (λk. k ε), where
the expression value ε is returned to the continuation k. This property will be
used later in the congruence proof.
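The shape λk. k ε of weak-head-normal-form closures can be seen directly in a Python sketch of the continuation semantics; constructor names follow the paper, but the encoding of closures as functions from continuations to values is our own, and ELamV is omitted for brevity:

```python
# Continuation-style evaluator for a fragment of the language; a closure
# is a function taking a continuation k, and values are tagged pairs.
def EConst(n):
    return lambda rho: lambda k: k(("B", n))     # shape (lambda k. k eps)

def EVar(i):
    return lambda rho: lambda k: rho[i](k)

def ELam(i, e):
    # the function value maps argument closures to result closures
    return lambda rho: lambda k: k(("F", lambda w: e({**rho, i: w})))

def EAp(e1, e2):
    # evaluate e1, project its function part, apply it to the closure of e2
    return lambda rho: lambda k: e1(rho)(lambda eps: eps[1](e2(rho))(k))

# ((lambda x. x) 7) returns the basic value 7 to the top continuation.
prog = EAp(ELam("x", EVar("x")), EConst(7))
assert prog({})(lambda eps: eps) == ("B", 7)
```

Note that the closures built for EConst and ELam immediately hand a value to their continuation, exactly the (λk. k ε) shape used in the congruence proof.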
What we would like to prove is that the two semantic definitions are congruent
in a certain sense. To this end we define predicates (Definition 2.1) to compare
two values, one from each semantics.
As the reader may have noticed already, we use Stoy's diacritical notation
[15] to distinguish between objects belonging to the two semantics. We use an
acute accent (´) to represent an object from the direct semantics and a grave accent
(`) to represent an object from the continuation semantics.
Note that in Definition 2.1, as well as in the rest of this paper, we have not
restricted ourselves to the usual LCF notation. For example, we have not used
the conditional operator, which is predefined in Isabelle's LCF theory. Instead,
for convenience, we have preferred a style close to that of a functional language
with pattern-matching.
The predicate e compares two expression-valued objects. As we can see, the
bottom, error, and basic value cases are straightforward. The e predicate uses the
f predicate to compare two functions. The predicate f compares two functions
by extensional equality: if two functions, when applied to congruent arguments,
give congruent results, then they are congruent. The ep predicate relates two
environments by comparing the two values corresponding to each identifier.
Definition 2.1
them. Stoy [16] applies the more straightforward inclusive predicate strategy of
Milne to solving a similar problem. He uses retracts in building the domains, and
constructs the particular predicates iteratively. Reynolds's technique is system-
atic, and thus suitable for mechanization, however its applicability is restricted
to relating 'similar' domains. Milne's technique is general, but rather ad-hoc,
and thus harder to mechanize.
Mulmuley [7] proposed a systematic technique for proving the existence of
predicates, and implemented it as an extension to LCF. Central to his technique
is an algorithm which reduces the problem of existence to a set of sufficient (but
not necessary) goals. In practice, the goals produced are weak and can be proved
within LCF, very often automatically.
We have followed Stoy's approach. Not surprisingly, the equations from Def-
inition 2.1 differ from those of Reynolds.
3.1 Retracts
A retract r over a semantic domain A is a continuous function which is idempotent
(composing it with itself produces the same function):

r :: A → A;   r = r ∘ r
A retract can be constructed automatically from a domain definition by means
of a set of retract operators [15], corresponding to domain constructors.
Definition 3.1 shows how sequences of retracts can be constructed over the
domains of Definitions 1.1 and 1.3.
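A toy instance (ours, not the paper's) makes the idempotency requirement concrete: truncating a string to at most n characters is monotone on the prefix order and satisfies r = r ∘ r:

```python
# Truncation to length n as a toy retract: applying it twice is the
# same as applying it once.
def retract(n):
    return lambda s: s[:n]

r3 = retract(3)
assert r3(r3("abcdef")) == r3("abcdef")   # idempotent: r = r . r
assert r3("ab") == "ab"                   # elements below the cutoff are fixed
```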
Definition 3.1
rÉ_0    : É₀ ε = ⊥
rÉ_B    : Éₙ₊₁ (n in B) = n in B
rÉ_F    : Éₙ₊₁ (φ in F) = (F́ₙ₊₁ φ) in F
rÉ_Err  : Éₙ₊₁ ? = ?
rÉ_UU   : Éₙ₊₁ ⊥ = ⊥
rF́_0    : F́₀ φ = ⊥
rF́_succ : F́ₙ₊₁ φ = Éₙ ∘ φ ∘ Éₙ

rÈ_0    : È₀ ε = ⊥
rÈ_B    : Èₙ₊₁ (n in B) = n in B
rÈ_F    : Èₙ₊₁ (φ in F) = (F̀ₙ₊₁ φ) in F
rÈ_Err  : Èₙ₊₁ ? = ?
rÈ_UU   : Èₙ₊₁ ⊥ = ⊥
rF̀_0    : F̀₀ φ = ⊥
rF̀_succ : F̀ₙ₊₁ φ = Ẁₙ ∘ φ ∘ Ẁₙ
goal retracts_thy :
∀n. (∀ε. Éₙ (Éₙ₊₁ ε) = Éₙ ε) ∧
    (∀φ. F́ₙ (F́ₙ₊₁ φ) = F́ₙ φ)

goal retracts_thy :
∀n. (∀φ. F̀ₙ (F̀ₙ₊₁ φ) = F̀ₙ φ) ∧
    (∀ε. Èₙ (Èₙ₊₁ ε) = Èₙ ε) ∧
    (∀w. Ẁₙ (Ẁₙ₊₁ w) = Ẁₙ w) ∧
    (∀w. Ẁₙ₊₁ (Ẁₙ w) = Ẁₙ w)
Note that all the conjuncts in each goal are proved simultaneously, in one
induction step. This of course reflects the mutually recursive nature of the retract
definitions in 3.1. A further induction is needed for generalizing these properties.
As with most other theorems, the base cases of the induction are solved auto-
matically by Isabelle's simplifier. The proof of each of the lemmas below required
10 tactics to be applied.
Lemma 3.3
Finally, some simplification rules for retracts turn out to be useful in later proofs:
Theorems analogous to those above have been proved for W, E and F. Once
proven, all theorems are included in the set of Isabelle's rewrite rules to be used
by the simplifier for automating subsequent stages of the proof.
To summarize: retracts provide a convenient method of building up semantic
domains iteratively. Retracts can be constructed mechanically from (reflexive)
domain definitions, and their properties can be guaranteed. Thus automating
retract construction is not a problem.
Having built the theory of retracts, we can turn our attention back to the predicates
e and f. We give a standard formulation of these predicates (Definition 3.4),
in which e and f are defined in terms of two sequences of predicates, eₙ and fₙ.
Definition 3.4
The question is: what should eₙ and fₙ look like? We want eventually to provide
a proof that the predicates from Definition 3.4 satisfy the equations from
Definition 2.1, so eₙ and fₙ must be of the general shape of those equations.
Furthermore, such a proof will require a general condition which relates any
predicate eₙ with e by applying the nth retracts to the arguments of e. The
condition corresponds to the statement of monotonicity of a predicator in [7]. In
our particular case, the condition is as follows:
The above condition would follow by induction from a couple of simpler state-
ments which relate a predicate with its successor:
The above statement has guided our search for an appropriate definition of eₙ
and fₙ. After some thinking and several abortive attempts, we have come up
with Definition 3.5, which complies with the above statement:
Definition 3.5

eqf_0    : f₀(φ, ψ) ⇔ True
eqf_succ : fₙ₊₁(φ, ψ) ⇔ ∀ε w. eₙ(ε, w) ⇒ eₙ(φ ε, ψ w)
eqw_0    : e₀(ε, w) ⇔ True
eqw_B    : eₙ₊₁(n in B, w) ⇔ w = (λk. k (n in B))
eqw_F    : eₙ₊₁(φ in F, w) ⇔ w = (λk. k (w (λk. k))) ∧ fₙ₊₁(φ, (w (λk. k)) | F)
eqw_UU   : eₙ₊₁(⊥, w) ⇔ w = ⊥
eqw_Err  : eₙ₊₁(?, w) ⇔ w = ?
The considerations which prompted the exact form of Definition 3.5 are fully
stated in Lemma 3.6, which is proved by induction on n. Because of the mutual
dependency of our predicates, all conjuncts in the lemma are required if the
induction is to go through.
Lemma 3.6

goal predicates_thy : ∀n.
(∀φ ψ. fₙ(φ, ψ) ⇒ fₙ₊₁(F́ₙ₊₁ φ, F̀ₙ₊₁ ψ)) ∧
(∀φ ψ. fₙ₊₁(φ, ψ) ⇒ fₙ(F́ₙ φ, F̀ₙ ψ)) ∧
(∀ε w. eₙ(ε, w) ⇒ eₙ₊₁(Éₙ₊₁ ε, Ẁₙ₊₁ w)) ∧
(∀ε w. eₙ₊₁(Éₙ₊₁ ε, Ẁₙ₊₁ w) ⇒ eₙ(Éₙ ε, Ẁₙ w))
The proof uses Lemmas 3.2 and 3.3. For convenience, the proof of this theorem
was preceded by separate proofs of the four conjuncts of the induction step. Each
conjunct required the use of approximately twenty tactics.
A generalization of the above properties is easily derived by another couple
of inductions:
goal predicates_thy : ∀m n ε w. eₙ(Éₙ ε, Ẁₙ w) ⇒
    eₘ₊ₙ(Éₘ₊ₙ (Éₙ ε), Ẁₘ₊ₙ (Ẁₙ w))

goal predicates_thy : ∀m n ε w. eₘ₊ₙ(Éₘ₊ₙ ε, Ẁₘ₊ₙ w) ⇒
    eₙ(Éₙ (Éₘ₊ₙ ε), Ẁₙ (Ẁₘ₊ₙ w))
These two lemmas are summarized in the following theorem. This is the only
result from Section 3.2 used in subsequent proofs.
Theorem 3.7
3.3 Iterative predicates satisfy equations
What remains to be done is to prove that the well-defined predicates of Definition
3.4 actually satisfy our original equations from Definition 2.1. Corresponding
to the axioms for e in 2.1 are the following theorems. They are proved by
folding/unfolding the definitions of predicates (2.1 and 3.4) and retracts (3.1),
as well as by some simple manipulation of indices.
Theorem 3.8

goal exist_thy : e (φ in F, w) ⇔
    w = (λk. k (w (λk. k))) ∧ f(φ, (w (λk. k)) | F)
goal exist_thy : e (n in B, w) ⇔ w = (λk. k (n in B))
4 The congruence
where ρ[n ↦ w] is the environment ρ augmented with a mapping of the
identifier n to the expression closure w.
At last, we are able to prove the main theorem: The direct and continuation
semantics of a lambda expression are congruent, provided that the environments
of the two semantics are congruent.
Theorem 4.1
tactics. Part of the work has been done automatically by Isabelle's simplifier, as
well as by the automatic tactics for classical first-order logic, but this can be
viewed as 'small-scale' automation.
The experience we have gained so far suggests that proof construction should
be viewed as an activity closely related to 'programming': Hence, techniques and
approaches characteristic of a good programming style ought to be applied in
theorem proving as well. To name just a few, modularity, good structure and
independent levels of abstraction are essential.
As a result of our mechanization we were able to correct an error in the proof
of [4, Lemmas 3.18 and 3.19].
We intend to experiment with other methods of proving the existence of
predicates on reflexive domains. Pitts [13] has proposed a method which is easier
to apply than the usual inclusive predicate strategy. The essence of the method
is to define simultaneously two versions of the predicate - one with positive and
one with negative occurrences only - and to prove the two versions equal using
fixpoint induction.
We also intend to explore the correctness of a full compiler for a lazy func-
tional language, in the style of [8]. For this we will need to deal with the following
points:
As experience reported in [3, 4] suggests, all of the above are easy to do, once
the main congruence result has been proved.
References
1. A. Cohn. The equivalence of two semantic definitions: a case study in LCF. Tech-
nical Report CSR-76-81, Department of Computer Science, Edinburgh University,
January 1981.
2. P. Curzon. Deriving correctness properties of compiled code. Formal Methods in
System Design, 3(1/2):83-115, August 1993.
3. D.R. Lester. The G-machine as a representation of stack semantics. In G. Kahn,
editor, Proceedings of the Functional Programming Languages and Computer Ar-
chitecture Conference, pages 46-59. Springer-Verlag LNCS 274, September 1987.
4. D.R. Lester. Combinator Graph Reduction: A Congruence and its Applications.
DPhil thesis, Oxford University, 1988. Also published as Technical Monograph
PRG-73.
5. R.E. Milne. The Formal Semantics of Computer Languages and Their Implementation. PhD thesis, University of Cambridge, 1974.
Here or stands for free binary choice, μ[α, β] stands for "interleaving α and β
by the merger μ" in the obvious way, and a (strict) fair merger (following Park
[5]) is any sequence of 0's and 1's which is not ultimately constant. Note that as
operations on players, these merges remain distinct. If D has further structure,
then additional operations of this sort can be defined, such as state-dependent
fair merges, see [3].
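Merging by a merger sequence can be sketched on finite prefixes (a Python toy of our own): bit i of μ says whether the i-th output element is drawn from α (0) or β (1), and a fair merger is a bit sequence that is not ultimately constant.

```python
def merge_by(mu, alpha, beta):
    """Interleave alpha and beta as dictated by the merger bits mu."""
    a, b, out = iter(alpha), iter(beta), []
    for bit in mu:
        out.append(next(b) if bit else next(a))
    return out

fair_prefix = [0, 1] * 3                # alternating, hence fair
assert merge_by(fair_prefix, "aaa", "bbb") == list("ababab")

unfair = [0, 0, 0]                      # ultimately constant: never reads beta
assert merge_by(unfair, "aaa", "bbb") == list("aaa")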
To model non-deterministic recursion within domain theory, we must embed
Π(D) in some powerdomain D*, and not totally arbitrarily. For example,
D is embedded in Π(D) by the natural map d ↦ {d}, and we would want
to have "liftups" of the continuous functions in (D → D) to continuous functions
in (D* → D*) which respect composition, yield the correct least fixed
points, etc. Even such simple requirements seem to force undesirable consequences
about D*, however. Consider the first and most interesting powerdomain
construction plot(D) of Plotkin [6] (see also Smyth [8]) as an illustrative
example. plot(D) does not faithfully model fairness because it identifies sets in
Π(D) which are equivalent under⁴ the "observational Egli-Milner equivalence relation"
≈em. This collapses the merge and fairmerge operations on Π(Str), e.g.,
fairmerge(a∞, b∞) ≈em merge(a∞, b∞), where a∞ is the infinite string of 'a's.
In addition, the equivalence relation ≈em identifies certain unguarded recursions
with similar, but intuitively distinct, guarded recursions, e.g., see Smyth [8].
To circumvent these imperfections of the powerdomain constructions, Moschovakis
[2, 3] introduced (over some specific domains D) a model ipf(D) for
non-determinism and concurrency in which programs are interpreted by arbitrary
players, and program transformations are modeled by implemented
player functions (ipfs) on Π(D). These ipfs encode more than their values on
players: there exist distinct ipfs f and g such that f(x) = g(x) for all x ∈ Π(D).
The extra, intensional information carried by an ipf makes it possible to assign
"canonical solutions" to systems of recursive equations, so that the laws of
recursion are obeyed; we will make this precise further on. The or, merge, and
fairmerge operations introduced above are naturally modeled by certain ipfs
(and incidentally "unnaturally" modeled by others, distinct from but extensionally
equal to the natural ones).
Our principal aim here is to show that (with modest hypotheses on D) the
Plotkin powerdomain plot(D) can be recovered in a natural way from ipf(D),
while the countable powerdomains plot_ω(D) appear to represent a fundamentally
different modeling of fairness. For this, we will also introduce a refined
construction of ipf(D) (for any D) and establish precise properties of ipf(D)
⁴ The terminology for various pre-orders on Π(D) is not entirely standardized. In this
paper, we will use the lower preorder (x ⊑l y if for all a ∈ x, there exists b ∈ y
such that a ≤ b) and the upper preorder (x ⊑u y if for all b ∈ y, there exists
a ∈ x such that a ≤ b). The usual "Egli-Milner preorder" is the conjunction of
these two. However, as outlined in Smyth [8], the easiest construction of the Plotkin
powerdomain for countably algebraic D is in terms of the "observational Egli-Milner"
preorder, defined as x ∼ y if for all finite sets A of finite elements, A ⊑l x implies
A ⊑l y and A ⊑u x implies A ⊑u y. Each preorder induces an equivalence relation,
for example x ≈em y if x ∼ y and y ∼ x.
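The lower, upper, and Egli-Milner preorders are directly computable on finite subsets of a finite poset; a Python sketch (the helper names are ours):

```python
def lower(x, y, leq):    # x is below y in the lower preorder
    return all(any(leq(a, b) for b in y) for a in x)

def upper(x, y, leq):    # x is below y in the upper preorder
    return all(any(leq(a, b) for a in x) for b in y)

def egli_milner(x, y, leq):
    return lower(x, y, leq) and upper(x, y, leq)

# Flat order on {0, 1, 2} with bottom -1: a <= b iff a == b or a is bottom.
leq = lambda a, b: a == b or a == -1
assert lower({0}, {0, 1}, leq)        # every element of x is below one of y
assert not upper({0}, {0, 1}, leq)    # nothing in x lies below 1
assert egli_milner({-1, 0, 1}, {0, 1}, leq)
```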
1 The main notions
For each vocabulary (signature) τ, i.e., set of function symbols with associated
non-negative arities, the expressions of the language FLR0(τ) are given by
general, let x′ be the same as x except that every variable from x occurring as
one of the yᵢ has been replaced by a fresh variable. Then w depends only on
the A(y, x′)Eᵢ, in the same sense as the last two requirements: if these values are
equal to the A(u, x′)Mᵢ, respectively, then w = A(x)(M₀ where {u = M}). For a
standard interpretation, w must be computed by taking the least fixed point of
the system yᵢ = A(y, x′)Eᵢ for i from 1 to n, and substituting the results (which
are functions of the x′) into E₀.
An expression identity E = M is standard if it is valid for all standard
interpretations, i.e., A(x)E = A(x)M for every list x which includes all the free
variables of both E and M. The simplest example of a standard identity is
.4 = ,A),
These results will appear in a multi-authored paper, The logic of recursive equations,
now in preparation.
p(A(x)E) = A'(x)(p(E)).
2 Main results
Theorem A. For each domain D, there is a fine, fully non-deterministic powerstructure ipf(D) = (Π(D), ipf(D), A_ipf) over D.
In the construction of ipf(D), every intensional function essentially arises as
f_J for some J, so every f̄ in ipf(D) ends up being set monotone, i.e.,

x ⊆ y ⇒ f̄(x) ⊆ f̄(y), (10)

and this limits the functions on plot(D) we can represent inside ipf(D). Recall
that (for countably algebraic D), plot(D) can be defined as the quotient of
the finitely generable subsets of D (which we will denote by Π₀(D)), under the
equivalence ≈em. Therefore, each continuous function φ : plot(D) → plot(D)
is induced by some continuous function ψ : Π₀(D) → Π₀(D) on the predomain
Π₀(D), i.e.,

φ([x/≈em]) = [ψ(x)/≈em],   (x ∈ Π₀(D)). (11)

In particular, we say that φ is essentially monotone if it is induced by some
set monotone ψ. The essentially monotone functions em(D) are closed under
composition and recursion, and therefore together with plot(D) and the
standard (least-fixed-point) interpretation comprise an FLR0-structure Pl(D) =
(plot(D), em(D), A_std). This is a natural FLR0-structure associated with the
Plotkin powerdomain, and it includes all ∪-linear functions [1].
Theorem B. If D is strongly algebraic then there is an FLR0-substructure
ipf₀(D) = (Π₀(D), ipf₀(D), A₀) of ipf(D) with the following properties.
(a) Each player function f in ipf₀(D) respects the Egli-Milner preorder on
Π₀(D) and is Scott continuous, so that it induces a continuous function

p(f) = φ : plot(D) → plot(D) (12)

on the Plotkin powerdomain by the equation φ([x/≈em]) = [f̄(x)/≈em]. (By the
observation (10), φ is necessarily essentially monotone.)
(b) If we extend the map p to Π₀(D) by p(x) = [x/≈em], it becomes an
FLR0-homomorphism from ipf₀(D) to Pl(D).
(c) If φ : plot(D) → plot(D) is essentially monotone, then φ = p(f) for
some player transformation f in ipf₀(D); that is, the image of the homomorphism
p is exactly Pl(D).
No similar comparison is possible between ipf(D) and plot_ω(D), however.
The obstacle is that except for extremely simple (e.g., flat) D, plot_ω(D) cannot
be thought of as a structure on the subsets of D, or precisely:

Theorem C. For any domain D embedding (1_⊥ × N)_⊥, the free σ-semilattice
over D is not the homomorphic image of (Π(D), ⊑, ⊆) with ordinary ⊆ and any
partial order ⊑.

This means that plot_ω(D) is not technically a powerstructure in our sense,
in that it does not represent non-deterministic "programs" (FLR0 expressions)
by their set of possible "outcomes" (subsets of D), but provides some altogether
different, less concrete interpretation.
3 Details and proofs
F̄(x) = { F(X) | X : I → x },

and we think of F as an "implementation" of F̄. However, some polyfunctions
differ inessentially by the integer "tags" they use to name their arguments: we
say that G : D^J → D reduces to F : D^I → D, written G ≼ F, if there is an
injection ι : I → J such that G(p) = F(p ∘ ι) for all p ∈ D^J. Let ≈ be the
smallest equivalence relation extending ≼, and call two polyfunctions F₁ and F₂
equivalent if F₁ ≈ F₂. It is simple to verify that if F ≈ G, then F̄ = Ḡ. Finally,
a (unary) implemented player function (ipf) is a nonempty set of polyfunctions
closed under ≈. Each ipf f induces a function f̄ : Π(D) → Π(D) (its extension)
by
This x̄ ∈ Π(D) is the "canonical" ipf fixpoint of the equation x = f(x), and
it is not hard to verify that, indeed, it is a fixed point. The construction of
canonical fixpoints for systems of equations with parameters is similar but more
complicated, and still very close to [3].
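For a single polyfunction F : D^I → D on finite data, the extension F̄(x) = { F(X) | X : I → x } introduced above can be enumerated directly; a Python sketch (the list-of-tags encoding of I is ours):

```python
from itertools import product

def extension(F, I, x):
    """All values F(X) as X ranges over the assignments I -> x."""
    return {F(dict(zip(I, p))) for p in product(sorted(x), repeat=len(I))}

# A two-argument polyfunction on integers: addition of its tagged arguments.
F = lambda p: p["i"] + p["j"]
assert extension(F, ["i", "j"], {1, 2}) == {2, 3, 4}
```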
The proof of Theorem A now essentially consists of showing that the standard
identities hold in ipf(D). Armed with the axiomatization mentioned in Section
1, it suffices to verify a specific, short list of identities. This method improves
on that of [3], both in content (as we can handle arbitrary D and the refined
equivalence relation) and in simplicity.
Proof of claim. Let A(d) denote the set of finite elements less than or equal to
a given d ∈ D. Since D is algebraic, A(d) is directed and sup A(d) = d. Also let
F : C × D^ℕ → D be the continuous parametrization of f.
Suppose that a is a finite approximation to an element c ∈ f̄(x). By definition
of ipf application, c = F_α(X) for some α ∈ C and X : I → x. By continuity of F,
But a is finite, so some individual term of the right-hand sup must already be
beyond a. That is, for a particular sequence A ∈ D^I such that A(i) ∈ A(X(i)),
we have a ≤ F_α(A). Each A(i) is finite below X(i) ∈ x, and hence there is some
Y(i) ∈ y such that A(i) ≤ Y(i). Finally, by monotonicity of F_α, a ≤ F_α(Y) ∈
f̄(y). In other words, f̄(x) ⊑l f̄(y).
To finish the first part of the claim, show that f̄(x) ⊑u f̄(y): if d = F_α(Y) ∈
f̄(y), choose a map X : I → x such that X(i) ≤ Y(i) for each i ∈ I. This is
possible since x ⊑u y. Then f̄(x) ∋ c = F_α(X) ≤ F_α(Y) = d.
The second part of the claim asserting the continuity of p(f) is more delicate.
Note it suffices to show that for each point z in the Plotkin powerdomain, there
is a sequence of finite elements aₙ with supremum z such that p(f)(z) is the
supremum of p(f)(aₙ). Let pₙ : D → Dₙ be the sequence of projections witnessing
The previous claim together with the definition that p(x) = [x/≈em] yields
a map from the universe of ipf₀(D) to the standard structure over the Plotkin
powerdomain of D. The next part of Theorem B states that this map p is an
FLR0-homomorphism. To see this, it suffices to show that p preserves function
compositions and systems of recursive equations, since all FLR0 expressions
are built up from these operations. The fact that p(f) ∘ p(g) = p(f ∘ g) is
easy, because p(f) depends only on the extension f̄. For recursion, suppose
that f is a compact-analytic ipf with player fixed point x̄, and that X is
the least fixed point in the Plotkin powerdomain of p(f). We wish to prove
that X = p(x̄) = [x̄/≈em]. (The following argument will directly generalize to
systems of equations with parameters.)
First, p(f) fixes [x̄/≈em], since by definition p(f)([x̄/≈em]) = [f̄(x̄)/≈em]
and f̄ fixes x̄. Choose a compact-analytic representative x₀ of X in the Plotkin
powerdomain; since X is the least fixed point of p(f), this means that x₀ ∼ x̄.
On the other hand, we know that f̄(x₀) ≈em x₀ since the ≈em-equivalence class
X of x₀ is fixed by p(f). In particular, f̄(x₀) ∼ x₀. To finish the proof, we show
that for any compact-analytic player y, f̄(y) ∼ y implies that x̄ ∼ y:
S(α, R) = α if α ∈ R
∀u such that {ū} ⊑l R, ∀α extending u: ū ≤ S(α, R). (15)

The following construction provides one such S. Fixing an R for the moment,
find all minimal u so that no element of R is greater than or equal to ū. For
such a u, let t be its parent. Then R does contain some element greater than t̄,
which is exactly to say that {t̄} ⊑l R. Therefore, choose the "leftmost" branch
extending t whose limit is in R, and call this limit m_{R,u}.
Now define S(α, R) = m_{R,u} if α ∈ N(u), and S(α, R) = α otherwise. There
is at most one initial sequence u of α so that m_{R,u} is defined, and if there is
none then α ∈ R since R is closed under limits of approximations to itself; these
observations ensure that S is well-defined and satisfies properties (14) and (15).
Finally define, for X ∈ D^ℕ and α ∈ C,

F(α, X) = S(α, g([X[ℕ]])*),

and let f be the ipf generated by the sections F_α. We must prove both that
F is continuous and that p(f) = g. So, compute the inverse image of a generic
neighborhood N_D(a) under F, as follows. Let U be the set of all minimal u so
that ū ≥ a. By construction of S,
Since each N(u) is open in C, we need only show that the collection of all X so
that {ū} ⊑l g([X[ℕ]])* is open in D^ℕ. Fix a u and let b = ū for convenience
of notation. Note first that if {b} ⊑l R, i.e., if ∃d ∈ R s.t. b ≤ d, then there
is some finite set of finite elements A ∋ b so that A ⊑l R. That is, letting
B = {[R] : {b} ⊑l R}, then

B = ⋃ { N_plot(D)(A) : A finite, A ∋ b }.
Finishing the proof of F's continuity therefore only requires showing that any
set of the form 𝒳 = {X ∈ D^ℕ | A ⊑l X[ℕ]} for A a finite set of finite elements
is open in D^ℕ. But A ⊑l X[ℕ] just means there is some finite set of natural
numbers whose images under X meet N_D(c) for each c ∈ A. Any candidate set
of natural numbers of the appropriate size yields a finite condition on X, and so
the entire set 𝒳 is a union of basic neighborhoods in D^ℕ.
It remains to show that p(f) = g. If x ∈ Π₀ and X : ℕ → x, then X[ℕ] ⊆ x.
By the essential monotonicity of g, g([X[ℕ]])* ⊆ g([x])*. This inequality
shows that f̄(x) ⊆ g([x])*. Equality will be achieved if for some X, X[ℕ] ≈em x.
Collecting for each n and each element a of pₙ[x] some element d_a ∈ x with
a ≤ d_a produces a countable set which is ≈em-equivalent to x. Then an X that
enumerates {d_a} suffices.
This completes the proof of all three parts of Theorem B, providing a vivid
picture of the Plotkin powerdomain as a powerstructure quotient of the player
model.
Since both the ipf structure ipf(D) and the powerdomain for countable non-
determinism Plot_ω(D) seek to improve on the ability of earlier powerdomains
to model fairness, it is natural to compare them. Is there a "countable non-
determinism" analogue of Theorem B? Unfortunately, the answer is "no," for the
simple reason that Plot_ω(D) does not constitute a powerstructure. The points in
Plot_ω(D) cannot (all) be viewed as subsets of D. In other words, although it still
may be true that "direct existence [of Plot_ω(D)] along the lines of [6, 8] should
be established," as Plotkin [7] suggests, no construction which stays within the
subsets of D can accomplish this goal.
Let W be the domain (1_⊥ × ℕ)_⊥, which is just a tree with root ⊥ and
countably many branches of length 2. Concretely, we take the elements of W to
be ⊥ and all pairs of the form (0, n) or (1, n) for any n ∈ ℕ. The ordering on
W is that ⊥ < (0, n) < (1, n) for each n, and these are the only relations. W is
very close to being flat, having maximal chain length 3 as opposed to 2. Also,
say that one dcpo D embeds another E if there is a projection from D onto a
sub-dcpo D' ⊆ D such that D' and E are isomorphic.
x̄ = { sup aₙ | aₙ ∈ x, aₘ ≤ aₙ for m < n }.
Next, ⊆ must have least upper bounds of arbitrary countable sets from 𝒫.
But ⋃ₙ xₙ is the ordinary ⊆-least upper bound of {xₙ | n ∈ ℕ}, so [⋃ₙ xₙ]
is the least upper bound of {[xₙ] | n ∈ ℕ} in 𝒫.
Finally, countable union (i.e., the operation of taking the ⊆-least upper bound
of countably many arguments) must be ω₁-continuous and binary union must
be continuous, with respect to ⊑. The former is trivial because there are no
uncountable chains in 𝒫; the latter just requires that whenever x = sup xₙ, then
x ∪ y = sup (xₙ ∪ y), which is easy to check because W only has chains of finite
length.
Thus, 𝒫 is a σ-semilattice; however, it is not free over W. The singleton map
would have to be the obvious {·} : w ↦ [{w}]. Now consider the map f from W
into the σ-semilattice⁹ Q = (Π(ℕ_⊥), ⊑, ⊆) given by f(⊥) = f(0, n) = {⊥} and
f(1, n) = {n}. f is certainly continuous, and so if 𝒫 were free, f would have to
factor through the singleton map, say as F. The values of F at singletons determine all its
other values, since F must preserve countable unions and W is itself countable.
Therefore, the only possibility for F turns out to be F([x]) = ⋃ { f(w) | w ∈ x }.
But this map F is not continuous (on the "dcpo" part (Π(W), ⊑) of the semi-
lattices), which is a contradiction. To see the non-continuity, consider the sets
H_k = {(1, n) | n < k} ∪ {(0, n) | n ≥ k}. Clearly the H_k are increasing in ⊑, and
have least upper bound H = {(1, n) | n ∈ ℕ}. But F(H_k) = {⊥, 0, 1, …, k − 1},
so that sup_k F(H_k) = {⊥} ∪ ℕ, whereas F(H) = ℕ is strictly bigger than {⊥} ∪ ℕ
in Q. (Intuitively, the true free σ-semilattice over W will have to include some
"ideal" element, not corresponding to any set, to be the least upper bound of
the sequence {H_k}_{k∈ℕ}.)
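The non-continuity can be checked concretely on finite approximations. The following sketch is our own illustration, not part of the proof: W is truncated to indices 0..BOUND, and the bottom element is encoded as the string "bot".

```python
# Own illustration of the non-continuity of F (W truncated at BOUND).
BOUND = 10

def f(w):
    """f maps bottom and (0, n) to {bottom}, and (1, n) to {n}."""
    if w == "bot" or w[0] == 0:
        return {"bot"}
    return {w[1]}

def F(s):
    """F extends f by unions: F(S) is the union of f(w) for w in S."""
    return set().union(*(f(w) for w in s)) if s else set()

def H(k):
    """H_k = {(1, n) | n < k} together with {(0, n) | n >= k} (truncated)."""
    return {(1, n) for n in range(k)} | {(0, n) for n in range(k, BOUND + 1)}

H_limit = {(1, n) for n in range(BOUND + 1)}  # the lub of the chain H_k

# Every F(H_k) contains bottom, but F of the limit does not, so
# sup_k F(H_k) differs from F(lub_k H_k): F is not continuous.
print(F(H(3)))             # contains "bot" together with 0, 1, 2
print("bot" in F(H_limit))  # False
```

Each finite stage F(H_k) keeps the element "bot" contributed by the (0, n) branches, while the limit H consists only of maximal elements, so F(H) drops it; this is exactly the jump the proof exploits.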
Assume by way of contradiction that the free σ-semilattice S over W is the
homomorphic image of (Π(W), ⊑, ⊆) for some partial order ⊑. The homomor-
phism induces some equivalence relation ∼ on Π(W), so that S = (Π(W)/∼,
⊑/∼, ⊆/∼), with singleton map w ↦ {w}/∼. Since 𝒫 is a σ-semilattice and the
map w ↦ [{w}] ∈ 𝒫 is continuous, there is a σ-semilattice map φ : S → 𝒫 with
φ({w}/∼) = [{w}] for each w ∈ W.
⁹ See Plotkin [7] for a proof that this is the free σ-semilattice over ℕ_⊥; the proof that
it is simply a σ-semilattice is analogous to the proof for W.
Now, w ↦ [{w}] ∈ 𝒫 is an injective map, so the singleton map into S must
be as well, i.e., ∼ cannot identify any singletons. Since the subset relation in S
is ordinary subset (modulo ∼), the ⊆-lub of a countable set A of singletons in S
is just {w | {w} ∈ A}/∼. Since W is countable, every element of S is produced
in this way, and since φ must preserve ⊆, this means that the map φ is simply
given by
φ(x/∼) = [x] ∈ 𝒫
for any x ∈ Π(W). Thus φ is clearly surjective. Suppose that φ(x/∼) = φ(y/∼),
which means that x is equivalent to y under ≡_em. Write x as {a₀, a₁, …} and
y = {b₀, b₁, …}, possibly with repetitions, so that aᵢ ≤ bᵢ. Since the singleton
map to S is monotone, and countable union in a σ-semilattice is monotone, this
representation of x and y shows that x/∼ ⊑ y/∼. Symmetrically, y/∼ ⊑ x/∼, so
that φ is injective as well. Hence φ is an isomorphism, witnessing
S ≅ 𝒫. But 𝒫 is not free over W.
This argument proves the theorem for D = W. To extend to the case that
D embeds W, just notice that any representation of the free σ-semilattice over
D in the given form would yield such a representation of the free σ-semilattice
over W by taking a quotient under the projection from D to W. □
References
1 Specifications
We assume that the reader is familiar with basic category theory. A good intro-
ductory book, oriented towards computer scientists, is [Ba 90].
We will use the notations (X, (fᵢ)ᵢ∈I) and fᵢ : X → Yᵢ interchangeably. A source
in FinSet, the category of finite sets and functions, can be perceived as a mul-
tirelation over the sets Yᵢ: X is a multiset of I-tuples, and the fᵢ's are generalised
projections.
A source (X, (fᵢ)ᵢ∈I) is a mono-source if fᵢ ∘ x = fᵢ ∘ y for all i ∈ I implies that
x = y. A mono-source in FinSet is an ordinary relation: it is (up to isomorphism)
a subset of the cartesian product ∏ᵢ∈I Yᵢ together with (restrictions of) the
product projections.
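In concrete terms the mono-source condition says that the tuple of projection values determines the element. A small sketch of our own (sources represented as dictionaries) makes this checkable:

```python
def is_mono_source(X, projections):
    """A source (X, (f_i)) is mono iff f_i(x) = f_i(y) for all i implies x = y."""
    seen = {}
    for x in X:
        key = tuple(f[x] for f in projections)
        if key in seen and seen[key] != x:
            return False  # two distinct elements agree on every projection
        seen[key] = x
    return True

# A multirelation with a repeated tuple is not a mono-source ...
f1 = {"e1": "a", "e2": "a"}
f2 = {"e1": "b", "e2": "b"}
assert not is_mono_source(["e1", "e2"], [f1, f2])

# ... while distinct tuples give an ordinary relation.
g1 = {"e1": "a", "e2": "a"}
g2 = {"e1": "b", "e2": "c"}
assert is_mono_source(["e1", "e2"], [g1, g2])
```

In the first case the two elements of X project to the same tuple ("a", "b"), so X is a genuine multiset of tuples; in the second the tuples are distinct, so X is (isomorphic to) a subset of the product.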
More details about sources can be found in [Ad 90].
* Research Assistant of the Belgian Fund for Scientific Research
It is proved in a similar way to Theorem 16.14 on page 260 of [Ad 90]. The
construction of the reflection of a functor F is as follows:
The universal arrow from F to its reflection is the projection of F onto this quo-
tient.
1.2 Examples
In the following examples, we try to show that specifications have enough ex-
pressive power to be useful in practice.
1. If 𝒮 is a discrete category (no non-identity arrows), the models are just typed
sets. The specification:
𝒢: two isolated nodes COMPUTER and PRINTER; 𝒟 = ∅, ℳ = ∅
says that the part of the world we want to specify consists of two kinds of
entities (printers and computers), and that is all it says.
2. 𝒢: an arrow from COMPUTER to LOCATION; 𝒟 = ∅, ℳ = ∅
Since the arrow must be taken to a function in a model, this specifies that
every computer must have a location associated with it. (An entity of type
COMPUTER is always associated with an entity of type LOCATION.)
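A finite model of such a specification can be sketched directly; the following is our own illustration (the concrete entity and location names are hypothetical): one finite set per node, and a total function per arrow.

```python
# Hypothetical finite model: a set for each entity type, and a total
# function interpreting the arrow COMPUTER -> LOCATION.
computers = {"pc1", "pc2", "mac1"}
locations = {"room-a", "room-b"}
location_of = {"pc1": "room-a", "pc2": "room-a", "mac1": "room-b"}

# The arrow must be interpreted by a *total* function into LOCATION:
assert set(location_of) == computers            # defined on every computer
assert set(location_of.values()) <= locations   # values land in LOCATION
```

A partial assignment (a computer missing from `location_of`) would violate the first assertion, which is exactly what the specification rules out.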
3. A source with n arrows in the category can be seen as an n-ary multirelation:
𝒢: arrows c : CONNECTION → COMPUTER and p : CONNECTION → PRINTER;
𝒟 = ∅, ℳ = {(CONNECTION, (c, p))}
4. 𝒢: as in the previous example; 𝒟 = ∅, ℳ = {(CONNECTION, (p))}
5. 𝒢: c : CONNECTION → COMPUTER, p : CONNECTION → PRINTER,
l₁ : COMPUTER → LOCATION, l₂ : PRINTER → LOCATION;
𝒟 = {l₂ ∘ p = l₁ ∘ c}, ℳ = ∅
This specification says that computers and printers can be connected only
if they have the same location. This kind of constraint occurs very often in
practice.
1.3 Attributes
The functor A is essentially a function from the nodes of 𝒮₀ to the finite sets.
The intention is to specify that every entity of a type C must be labelled with
a value from the set A(C).
In the previous example, A could take COMPUTER and CONNECTION
to the terminal set, and PRINTER to the set { matrix, laser }. Labelling with
an element from the terminal set is of course equivalent to no labelling at all.
The straightforward way to define the model category is as follows: let
I : 𝒮₀ → 𝒮 be the inclusion, let I* : Mod(ℱ) → Fun(𝒮₀, FinSet) be the functor
of composition with I, and let A : 1 → Fun(𝒮₀, FinSet) be the functor picking
out A. Then,
Hence, the objects of the model category are models of the underlying specifi-
cation without attributes, together with a natural transformation from M ∘ I
to A, which can be perceived as a labelling of the elements of the model.
If A(C) is the terminal set 1, this means that C has no attributes, and if
A(C) = A₁ × A₂ × ⋯ × Aₙ then C has n attribute-sets A₁, …, Aₙ. The value of
the functor A is graphically denoted in the following manner:
𝒢: c : CONNECTION → COMPUTER, p : CONNECTION → PRINTER,
with PRINTER labelled { matrix, laser }   (1)
reality". It should be noted that we do not give any semantic value to the names
used in the specifications. This seems reasonable since the semantic content of
a name is usually not formal. In a similar way, you don't change an algebraic
structure by giving other names to its operations. Still, when combining existing
specifications, one should be cautious if different names are used for equivalent
subspecifications since the designer(s) may inadvertently have chosen the same
representation for two different realities. But of course, an automatic procedure
might give warnings or go interactive in such cases.
𝒢: as in (1), but with separate nodes MATRIXPRINTER and LASERPRINTER
and no attributes; 𝒟 = ∅   (2)
In this second specification, separate entity types for matrixprinters and laser-
printers are used, instead of one entity type with an attribute.
Clearly, these specifications (1) and (2) are not isomorphic.
Example: The specifications:
A —f→ B —h→ C
In fact, we will only be able to define canonical forms for a (large) subset of
specifications.
Why do we care about canonical forms of specifications? Suppose two soft-
ware engineers both specify a part of a large system, and these specifications
overlap partly. If it were guaranteed that the overlapping part is specified
isomorphically in the two specifications, it would be easy to merge them together:
just identify the isomorphic overlapping parts.
A first important result states that, in many cases, attributes can be eliminated
from a specification, without changing the model-category.
Proof. First, note that the right Kan extension exists, since 𝒮 is a finite category
and FinSet is finitely complete. The proof consists of two parts.
However, by the naturality in M of (5), these two conditions are equivalent,
and hence (I* ↓ A) and Mod(ℱ)/A^! are isomorphic.
2. Secondly, let ∫A^! be the category of elements of A^!, and let Π : ∫A^! → 𝒮
be the associated projection. We prove that Mod(ℱ)/A^! is equivalent to
Mod(ℱ') with ℱ' = (𝒮', ℳ'), where 𝒮' = ∫A^!, and M' ∈ ℳ' iff Π(M') ∈ ℳ.
It is well-known that
Since both the Kan extension and the category of elements can be effectively
computed, this proof is constructive: it describes an algorithm to eliminate at-
tributes from specifications.
Example: Let us eliminate the attributes from specification (1). The construction
of Kan extensions can be found in [Ba 85] or [Bo 94]. Since 𝒮₀ is a discrete
category, the value of the Kan extension on objects becomes a simple product:
A^!(X) = ∏_{f : X→Y} A(Y)
where the product is taken over all arrows f : X → Y out of X. A^! becomes the
following functor:
In this specification, a computer can have at most one printer connected to it.
Now, A^! is still the same functor, but it is not a model of specification (7), because
A^!(c) is not an injective function. Hence, our construction fails. And indeed, no
matter what sources we add to ℳ of specification (2), we can never specify that
a computer can have at most one printer connected to it. The best we can do is
specify that it can have at most one laserprinter, and at most one matrixprinter
connected to it. But clearly, this is not equivalent to the original specification
(7).
The question remains whether some other construction might lead to an equivalent
specification without attributes. The answer is no, since the model category of
specification (7) does not have a terminal object, while the model category of a
specification without attributes always has a terminal object.
Since a source with doubles is a mono-source iff the same source with all doubles
removed is a mono-source, only sources without doubles need be considered for
specifications.
4. The set ℳ contains all sources without doubles in 𝒮 which are taken to
mono-sources by every model.
The first condition means the category must be Cauchy-complete ([Bo 86]), while
the second condition states that it must be skeletal. Conditions 2 and 3 remove
redundancy from the specifications: if different objects in the category of the
specification are isomorphic, then at least one of them can be removed
while retaining an equivalent model-category. If the representable functors are not
models, then some of the arrows in the specification are redundant
(cf. the proof of Lemma 10). Conditions 1 and 4 can be seen as a kind of
"completion" of the specification.
K : Fun(𝒮, FinSet) ⇄ Mod(ℱ) : I,   K ⊣ I
f ∼ g iff η_{Hom(C,−)}(f) = η_{Hom(C,−)}(g)
with C the source of f and g. Let 𝒮' be 𝒮/∼, and let P : 𝒮 → 𝒮/∼ be the
projection. Then ℱ' = (𝒮', ℳ') with ℳ' = {P(M) | M ∈ ℳ} satisfies the
conditions of the lemma. □
Theorem 11. For every specification ℱ = (𝒮, ℳ), there exists a canonical spe-
cification with an equivalent model-category.
Proof. By Lemma 10, we can find ℱ' = (𝒮', ℳ') with an equivalent model-
category such that all representable functors from 𝒮' are models.
Every model of ℱ' can be extended uniquely to a model of ℱ'' = (𝒮'', ℳ''),
where 𝒮'' is the Cauchy completion of 𝒮' (cf. Theorem 1 in [Bo 86]). Let i : 𝒮' →
𝒮'' be the inclusion. If we take ℱ'' = (𝒮'', ℳ'') with ℳ'' = {i(M) | M ∈ ℳ'},
then the model-category of ℱ'' is equivalent with that of ℱ'.
Finally, let 𝒮ᶜ be the skeleton of 𝒮'', with p : 𝒮'' → 𝒮ᶜ the equivalence be-
tween 𝒮'' and 𝒮ᶜ. Let ℳ''' = {p(M) | M ∈ ℳ''}. If we take ℳᶜ to be the set
of all sources without doubles in 𝒮ᶜ which are taken to a mono-source in every
model of (𝒮ᶜ, ℳ'''), then (𝒮ᶜ, ℳᶜ) is the required canonical specification. □
If the representable functors are models, this lemma says that every model is a
colimit of Hom-models.
Proof. We have the Yoneda embedding Y : 𝒮^op → Fun(𝒮, FinSet) and the
adjunction K : Fun(𝒮, FinSet) ⇄ Mod(ℱ) : I, with K ⊣ I.
Since K is left adjoint to I, it preserves colimits. And since K ∘ I is the identity
functor on Mod(ℱ), K is surjective on objects and arrows.
Now, since every object in Fun(𝒮, FinSet) is a colimit of a diagram factoring
through Y (see page 41 of [Ms 92]), the result follows. □
I colim D = η_{colim ID} ∘ colim ID
(This follows from Proposition 13.30 on page 213 of [Ad 90].)
Also, Nat(IH, −) preserves colimits (it is an evaluation functor).
Hence:
Nat(H, colim D) = Nat(IH, I colim D)   (since I is full and faithful)
= Nat(IH, η_{colim ID} ∘ colim ID)
= Nat(IH, η_{colim ID}) ∘ Nat(IH, colim ID)
Since every arrow of the unit is epi (Proposition 4), and since Nat(IH, −) pre-
serves epis, we conclude that H belongs to the base.
Secondly, we prove that every object in the base is isomorphic to some Hom-
functor.
Suppose F ∈ Base. We know that F = colim D where all objects from D are
Hom-functors. (This follows from Lemma 12 and the fact that all Hom-functors
are models.) Hence we have the following situation:
(diagram relating colim Nat(F, D), Nat(F, D), and Nat(F, F))
References
Abstract
We propose a uniform, category-theoretic account of structural induction for
inductively defined data types. The account is based on the understanding of
inductively defined data types as initial algebras for a certain kind of endofunctor
T : 𝔹→𝔹 on a bicartesian/distributive category 𝔹. Regarding a predicate logic as
a fibration p : ℙ→𝔹 over 𝔹, we consider a logical predicate lifting of T to the total
category ℙ. Then, a predicate is inductive precisely when it carries an algebra
structure for such a lifted endofunctor. The validity of the induction principle is
formulated by requiring that the 'truth' predicate functor ⊤ : 𝔹→ℙ preserve initial
algebras. We then show that when the fibration admits a comprehension principle,
analogous to the one in set theory, it satisfies the induction principle. We also
consider the appropriate extensions of the above formulation to deal with initiality
(and induction) in arbitrary contexts, i.e. the 'stability' property of the induction
principle.
1. Introduction
Inductively defined data types are understood categorically as initial algebras for 'polyno-
mial' endofunctors T : 𝔹→𝔹 on a bicartesian/distributive category 𝔹, as in [CS91, Jac95].
The category 𝔹 is the semantic category in which types and (functional) programs are
modelled, e.g. Cpo or Set.
We will show how initiality canonically endows such data types with induction prin-
ciples to reason about them. Induction is a property of a logic over (the theory) 𝔹.
Categorically, such a logic corresponds to a fibration p : ℙ→𝔹. ℙ is the category of 'predicates' and
'proofs', over the 'types' and 'terms' of 𝔹. When p is endowed with appropriate
structure, intended to model certain logical connectives and quantifiers, ℙ is bicarte-
sian/distributive and p preserves this structure. It is then possible to 'lift' the functor
T to an endofunctor Pred(T) : ℙ→ℙ over T, i.e. p Pred(T) = T p. The key point is that,
given a T-algebra x : TX → X and a predicate P on X, i.e. pP = X, P is inductive,
meaning that it satisfies the premise of the structural induction principle for the 'type
structure' T, precisely when it has a Pred(T)-algebra structure x̃ : Pred(T)P → P with
px̃ = x. This observation leads to our definition of the induction principle relative to
the fibration p as the preservation of initial algebras by the 'truth predicate' functor
⊤ : 𝔹→ℙ, which assigns to the object (or 'type') X the 'constantly true' predicate ⊤_X.
As for the usual induction principle for the natural numbers ω in Set, we know it is
valid using the initiality of ω with respect to the inductive subset {x ∈ X | P(x)}, de-
termined by the inductive predicate P which we wish to prove. This argument depends
crucially on the fact that we can perform comprehension. In categorical terms, compre-
hension P ↦ {x ∈ X | P(x)} amounts to a right adjoint to ⊤ : 𝔹→ℙ, after [Law70].
* Computer Science Department, Aarhus University, DK-8000 Aarhus, Denmark.
e-mail: chermida@daimi.aau.dk. The author acknowledges funding from the CLICS II ESPRIT project.
** CWI, Kruislaan 413, 1098 SJ Amsterdam, The Netherlands. e-mail: bjacobsr
With our abstract formulation of induction, we will show that when p admits compre-
hension in the above sense, the induction principle holds in p, analogously to the above
situation in Set.
This last fact, that comprehension entails induction, hinges on the fact that adjunc-
tions between 𝔹 and ℙ induce adjunctions between the associated categories of algebras,
T-Alg and Pred(T)-Alg respectively, assuming some appropriate additional structure.
This is a 2-categorical property, namely the 2-functoriality of (the construction of) in-
serters: T-Alg is the inserter of T, 1_𝔹 : 𝔹→𝔹, in the sense of [Kel89]. See Theorems
2.3.1 and 4.0.8 below.
Another important aspect of the present work is the consideration of the (frequently
ignored) 'stability' of the induction principle under context weakening. This means that
we should be able to reason by induction on a given data type not only when such a type
is given on its own, but also when it occurs together with some other data, which in
turn may be subject to certain hypotheses. Technically, this amounts to the requirement
that initiality of algebras be preserved under addition of indeterminates.
The primary aim of this work is to give a technically precise categorical formulation
of a logical principle, namely structural induction. Such a formulation makes the principle
amenable to purely algebraic manipulation. There are several relevant references in the
literature, particularly [LS81, Pit93]. We would like to emphasise the following points,
which highlight differences between our work and these references:
(i) The understanding of a predicate logic as a fibration is central to the present
work. This provides not only an appropriate level of generality but also the right techni-
cal framework. In particular, the relationship between inductive predicates and logical
predicates is best presented in this setting, as logical predicates for type constructors
given by adjoints arise uniformly from an intrinsic property of adjunctions between
fibrations, cf. [Her93].
(ii) The categorical framework which we work in takes explicit account of proofs
of entailments between predicates. Thus this work can be seen as a generalisation of
induction principles from the usual proof-irrelevant setting to the type-theoretic (or
constructive) one. See Remark 2.2.1 below.
(iii) 2-categorical reasoning is essential to get conceptually uniform formulations. For
instance, just as inductive datatypes are understood as initial algebras for an endomor-
phism in Cat, the 2-category of small categories, their associated induction principles are
formulated in terms of (distinguished) initial algebras for endomorphisms in Cat^→. Sim-
ilarly for stability of data types and their associated induction principles under context
weakening: the former means preservation of initial algebras by addition of indeter-
minates in Cat while the latter amounts to the same kind of preservation in 𝓕ib, the
2-category of fibrations. See below.
Background material on fibrations can be found in [Jac91, Pav90]. Indeterminates
for fibrations, as relevant to this work, are discussed in [HJ93]. Inserters are presented
in [Kel89]; they play a purely technical role here and hence they are not essential to
understand the paper.
The material presented here is essentially an extension of [Her93], combined
with [Jac95]. A follow-up in [HJ95] deals with a dual coinduction principle (which holds
in the presence of quotients) and a mixed induction/coinduction principle for mixed
variance type-constructors, cf. [Pit93].
2. Setting
In this section we lay down the setting required for our formulation of structural induc-
tion. In §2.1 we define the kind of endofunctors whose initial algebras are understood
as inductively defined datatypes and recall how such initial algebras may be obtained
under suitable cocompleteness conditions. In §2.2 we present the basic properties on
fibrations required to give a categorical counterpart of a logic suitable to describe struc-
tural induction, including the description of logical products and logical coproducts.
A ← A × B → B
for a product diagram in 𝔹, omitting subscripts whenever convenient. Dually, we write
The set S in the above definition is called a parameter set. Its role is to specify,
via the functor M : S--*~, those objects of ~ which are parameters for the data type
specified. The examples below will make this clear. See [Jac95] for a more general type-
theoretic formulation of data types in distributive categories. The initial T-algebra of a
functor T : 𝔹→𝔹 need not exist. But it is possible to guarantee the existence of initial
T-algebras under suitable cocompleteness conditions on 𝔹 and T. As shown in [LS81],
an initial T-algebra can be obtained as the colimit of an ω-chain, when T preserves
such colimits. An ω-chain is a functor ω → 𝔹, where ω is the poset category of natural
numbers with their usual ordering. The initial T-algebra is the colimit of the following
ω-chain:
0 —ι→ T0 —Tι→ T²0 → ⋯
where ι : 0 → T0 is the unique morphism from the initial object. In Set and Cpo, any
T ∈ 𝒯_M preserves colimits of ω-chains and therefore any polynomial functor in these
categories has an initial model.
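For the natural numbers signature T(X) = 1 + X (the 1 + (_) structure used later in the paper), the ω-chain can be computed directly in Set. The following sketch is our own illustration; it iterates T on the empty set and shows that Tⁿ0 has exactly n elements, the numerals below n:

```python
def T(X):
    """The signature functor T(X) = 1 + X for the naturals: one new
    element ('zero') plus a successor tag for each element of X."""
    return [("zero",)] + [("succ", x) for x in X]

# The omega-chain 0 -> T0 -> T^2 0 -> ..., starting from the initial
# object of Set (the empty set).
chain = [[]]
for _ in range(5):
    chain.append(T(chain[-1]))

# T^n 0 has exactly n elements: the numerals 0, 1, ..., n-1.
assert [len(stage) for stage in chain] == [0, 1, 2, 3, 4, 5]
```

Each stage embeds in the next via the image of the previous stage under T, and the colimit of the chain is (an encoding of) the full set of naturals, the initial T-algebra.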
2.1.3. An important observation due to Lambek (cf. [LS81] for instance) is that for an
initial T-algebra (D, constr : TD → D), constr is an isomorphism. Thus, we can regard
D as the 'least fixed point' of T, as illustrated by the above ω-chain. The isomorphism
constr provides the 'constructors' of the data type, as the following familiar examples
illustrate.
The example of lists above shows the role of the parameter S and the functor
M : S→𝔹 in the specification of a data type; the type of lists List(A) is parameterised
by the type A of the elements of the list.
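The list case can be sketched concretely; this is our own illustration, with lists encoded as None (nil) or head/tail pairs (cons). The carrier is the least fixed point of T_A X = 1 + A × X, constr = [nil, cons] is invertible (Lambek's observation), and initiality yields the familiar fold:

```python
# Lists as the least fixed point of T_A(X) = 1 + A * X.  constr = [nil, cons]
# is invertible: every non-nil list decomposes uniquely as (head, tail).
nil = None

def cons(a, l):
    return (a, l)

def fold(e, m, l):
    """The unique T_A-algebra morphism from the initial algebra (lists)
    to an algebra (X, [e, m]): replace nil by e and cons by m."""
    return e if l is None else m(l[0], fold(e, m, l[1]))

xs = cons(1, cons(2, cons(3, nil)))
length = fold(0, lambda a, r: 1 + r, xs)  # interpret cons as "1 + rest"
total = fold(0, lambda a, r: a + r, xs)   # interpret cons as addition
assert (length, total) == (3, 6)
```

Uniqueness of `fold` for a given algebra (e, m) is exactly what initiality asserts; the two interpretations above are just two choices of T_A-algebra on the integers.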
2.2. Logic over a bicartesian category
Given a bicartesian category 𝔹 in which we model inductive datatypes, we want a cate-
gorical formulation of a logic over it, a predicate logic over the 'types' and 'terms' of 𝔹,
in order to consider induction principles. The proper categorical version of a predicate
logic over a category is embodied by the notion of fibration. We refer to [Jac91, Pav90]
for an exposition of this point of view.
Thus a predicate logic corresponds to a fibration over 𝔹, written p : ℙ→𝔹. ℙ is the
category of 'predicates' and 'proofs', over the 'types' and 'terms' of 𝔹. This can be made
precise via the internal language of a fibration, in the same vein as a cartesian closed
category has associated a simply typed λ-calculus as its internal language, cf. [LS86].
Specifically, the fibration p has associated a predicate logic as its internal language:
regarding 𝔹 as a simple type theory, with product and coproduct types (see [Jac95]), an
object P of ℙ, with pP = X, is construed as a predicate, or indexed proposition, on the
type X:
x : X ⊢ P(x) Prop
where we have written P(x) to emphasize the dependency on the variable x, although we
will usually leave this implicit. A morphism h : P→Q with ph = u : X→Y corresponds
to a (unique) vertical morphism h̄ : P→u*(Q), where u*(Q) is the domain of a cartesian
lifting of u at Q. In the predicate logic of p, this vertical morphism h̄ corresponds to a
proof of the entailment
x : X | a : P(x) ⊢ h̄ : Q(u(x))
where Q(u) is the predicate corresponding to u*(Q); reindexing in the fibration corre-
sponds to substitution in the logic:
2.2.1. REMARK. Although we usually omit the 'proof term' h in entailments, the reader
should bear in mind that our approach is truly constructive, i.e. it takes proofs into ac-
count.
2.2.2. Fibrations are organized into the 2-category 𝓕ib, whose objects are fibrations
p : 𝔼→𝔹. Morphisms p→q are commuting squares (K', K), with K' between the total
categories lying over K, where K' preserves cartesian morphisms. Given morphisms
(K', K), (L', L) : p→q, a 2-cell from (K', K) to (L', L) is a pair of natural transformations
(σ' : K'⇒L', σ : K⇒L) with σ' over σ, i.e. qσ' = σp.
𝓕ib is a sub-2-category of Cat^→, whose objects are arbitrary functors p, q, … and
whose morphisms are commuting squares as above (without any preservation properties).
2.2.5. EXAMPLES. (i) Classical logic. The fibration corresponding to classical first-
order logic is the subobject fibration cod : Sub(Set)→Set. The category Sub(Set) is the cate-
gory of subobjects: its objects are pairs (S, X), where S ⊆ X, and its morphisms
f : (S, X)→(S', X') are functions f : X→X' such that f(S) ⊆ S'. The fibration simply
'forgets' the subsets. Cartesian liftings are given by inverse images:
(f : X→X', (S', X')) ↦ (f⁻¹(S'), X)
The bicartesian structure of Sub(Set) is described below, in terms of logical predicates.
(ii) Admissible subsets. A related example is the fibration cod : ASub(Cpo)→Cpo, where
ASub(Cpo) is the category of admissible subsets: its objects are pairs (S, C) where C
is an ω-cpo and S ⊆ C is a subset containing the bottom element and closed under lubs
of ω-chains, while its morphisms are the strict continuous functions which respect the
subsets, as in the preceding example. The category ASub(Cpo) is bicartesian as it is a
reflective subcategory of the fibred category U*(Sub(Set)), obtained from the 'classical
logic' fibration by change-of-base along the forgetful functor U : Cpo→Set. See [Her93]
for further details.
• 𝔹 a bicartesian category,
• p a fibred bicartesian category, i.e. every fibre is a bicartesian category and rein-
dexing functors preserve finite products and coproducts,
• p has coreindexing functors along coproduct injections ι : I → I + J ← J : ι', for every
I, J ∈ |𝔹|.
Then, ℙ is a bicartesian category and p strictly preserves finite products and coproducts.
over the product I ← I × J → J, where ×_{I×J} is the product in the fibre ℙ_{I×J}. Dually,
their coproduct is
(ι_{I,J})!(P) +_{I+J} (ι'_{I,J})!(Q)
obtained via the cocartesian liftings P → (ι_{I,J})!(P) and Q → (ι'_{I,J})!(Q). Terminal
and initial objects are obtained similarly. □
2.2.7. REMARK. In the internal language of p, the above construction of products reads
as follows: given x : I ⊢ P Prop and y : J ⊢ Q Prop, their logical product is
that is, a predicate over I + J defined 'by cases'. This last expression of coproducts
relies on the presence of an equality predicate, satisfying certain exactness conditions,
commonly satisfied (see [Law70]). Actually, such additional structure on a fibration is
irrelevant for our arguments; the above description is given only to emphasise the logical
significance of the coproduct in ℙ. The relationship between categorical structure on ℙ
and logical predicates is further analysed in [Her93].
For example, the fibration cod : Sub(Set)→Set satisfies the hypothesis of Proposition 2.2.6. The
fibred products and coproducts are given by intersection and union, respectively. It has
cocartesian liftings along arbitrary morphisms: given S ⊆ X and f : X→X', the lifting
is the direct image f(S) ⊆ X'.
Given a square of endofunctors t : A→A, t' : B→B and a functor f : A→B with
α : ft ⇒ t'f, in which α is an isomorphism and f has a right adjoint (η, ε) : f ⊣ g, the
adjoint mate of α⁻¹, i.e. θ = gt'ε ∘ gα⁻¹g ∘ ηtg : tg ⇒ gt', induces a morphism
g-Alg : t'-Alg → t-Alg right adjoint to the morphism f-Alg : t-Alg → t'-Alg induced
by the above diagram, i.e.
3.0.2. Given a set of parameters S and functors M : S→𝔹 and M̃ : S→ℙ such that
pM̃ = M, a polynomial functor T_M : 𝔹→𝔹 induces a polynomial functor Pred(T)_M̃ : ℙ→ℙ
fibred over T_M, using the bicartesian structure of ℙ. The formal definition of Pred(T)_M̃
proceeds by induction on the construction of T ∈ 𝒯_M. For instance, given P ∈ ℙ_A,
T_A X = 1 + A × X induces Pred(T)Y = ⊤₁ + P × Y. We can then consider Pred(T)-algebras
and initial models in ℙ. We call Pred(T)_M̃ the logical-predicate lifting of T_M. We thus
get the following endomorphism in Cat^→:
Pred(T)_M̃ over T_M
In a bicartesian fibration, the fibred terminal object (the 'truth predicate') is given by
a functor ⊤ : 𝔹→ℙ, which is a (fibred) right adjoint to p : ℙ→𝔹. Such a fibred terminal
object is used to give a notion of provability in the 'logic' p. A 'predicate' P, with
pP = I, is provable when there exists a morphism h : ⊤_I→P in the fibre ℙ_I. In the
internal language of p, cf. §2.2, this amounts to a proof of the entailment
x : I | a : ⊤_I ⊢ h : P(x)
Since the functor p-Alg : Pred(T)-Alg→T-Alg has a right adjoint, it preserves initial
algebras. Hence, if Pred(T)-Alg has an initial algebra, we may assume it lies over the
initial algebra in T-Alg.
We are now in a position to state our main definition.
This definition means that for an object P in ℙ, in order to give a morphism
f : ⊤_D→P it is sufficient to endow P with a Pred(T)-algebra structure (P, h : Pred(T)P→P).
For the more general case of the definition, the condition is also necessary if we assume
𝔹 has image-factorisation for T-algebras, e.g. when 𝔹 = Set.
We illustrate the logical import of the above definition with the polynomial functors of
natural numbers and lists below. We assume the bicartesian structure of ℙ is obtained
as in Proposition 2.2.6. The internal language of p in this case includes the logical
connectives {∧, ⊤, ∨, ⊥} and the coreindexing functors along coproduct injections. To
simplify the presentation, we consider only the entailment relation ⊢ in the internal
language, disregarding the proof terms. Note that for ι : I→I + J in 𝔹, given predicates
x : I ⊢ Q(x) Prop and y : I + J ⊢ P(y) Prop, a morphism f : ι!(Q)→P corresponds under
the adjunction ι! ⊣ ι* to an entailment x : I | Q(x) ⊢ P(ι(x)).
The above corresponds to the usual induction principle on the natural numbers: to
prove P(x) for the elements x : I generated by a and m, we must prove P(a) and
P(y) ⇒ P(m y). The validity of the induction principle in p then asserts the existence
(and uniqueness) of a morphism it(f) : TN → P over it(a, m), which is the desired proof
of the previously mentioned 'validity' of P in the image of it(a, m).
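The situation in Set can be played out concretely (a sketch of ours, not the paper's fibrational formulation; the names `it`, `double` and `holds_by_induction` are invented for illustration): the initial algebra of T(X) = 1 + X is N, `it(a, m)` is the unique algebra morphism out of it, and the induction principle says a predicate closed under the zero and successor clauses holds everywhere.

```python
# A Set-level sketch (ours, not the paper's fibrational formulation): the
# initial algebra of T(X) = 1 + X is N, and it(a, m) is the unique algebra
# morphism out of it.
def it(a, m):
    """Iteration: it(a, m)(0) = a and it(a, m)(n + 1) = m(it(a, m)(n))."""
    def h(n):
        result = a
        for _ in range(n):
            result = m(result)
        return result
    return h

double = it(0, lambda x: x + 2)    # n |-> 2 * n
print(double(5))                   # 10

# The induction principle: if P(0) holds and P(y) implies P(y + 1), then P
# holds everywhere (here checked on an initial segment only).
def holds_by_induction(P, bound):
    assert P(0)                               # base case
    for y in range(bound):
        assert (not P(y)) or P(y + 1)         # inductive step, sampled
    return all(P(n) for n in range(bound + 1))

print(holds_by_induction(lambda n: double(n) % 2 == 0, 100))  # True
```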
(ii) For the polynomial functor TA X = 1 + A × X, for some A ∈ |B|, we get the
polynomial functor Pred(T)Y = ⊤ + (⊤A × Y). Let (L, [nil, cons]) be the initial T-model
and let P ∈ |PL|. Note that modulo the isomorphism [nil, cons] : 1 + A × L → L, the
predicate P corresponds to a predicate P' on 1 + A × L, i.e. x : 1 + A × L ⊢ P'(x). The
predicate P' therefore determines two predicates S and Q, with x' : 1 ⊢ S ↔ P'(nil) and
a : A, l : L ⊢ Q(a, l) ↔ P'(cons(a, l)). To give a vertical global element h : TL → P, a
proof of the property P for all lists, amounts to giving a morphism k : Pred(T)P → P over
[nil, cons] : 1 + A × L → L. It corresponds to a sequent
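In Set the two predicates S and Q are exactly the two clauses of a proof by list induction, and the initial TA-algebra is the set of A-lists with its fold (a toy illustration of ours; `fold`, `length` and the sample predicate are invented):

```python
# A Set-level sketch (ours): the initial algebra of T_A(X) = 1 + A x X is
# the set of A-lists with structure map [nil, cons]; `fold` is the unique
# algebra morphism out of it, and S, Q are the two clauses of list induction.
def fold(nil_case, cons_case):
    def h(xs):
        result = nil_case
        for x in reversed(xs):
            result = cons_case(x, result)
        return result
    return h

length = fold(0, lambda a, n: 1 + n)
print(length([7, 8, 9]))  # 3

# For the predicate P(l) = "length(l) == len(l)": the nil clause S and the
# cons clause Q, checked on a few samples.
S = (length([]) == 0)
Q = all(length([a] + l) == 1 + length(l)
        for a in range(3) for l in ([], [1], [1, 2]))
print(S and Q)  # True
```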
4.0.8. THEOREM. Let p be a bicartesian fibration which satisfies condition (1) and
admits comprehension. Then p satisfies the induction principle w.r.t. every polynomial
endofunctor on B.
Proof. Condition (1) and T ⊣ {_} give data satisfying the hypothesis of Theorem 2.3.1.
We then conclude that the functor Pred(T)-Alg → T-Alg has a right adjoint {_}-Alg and
therefore preserves initial objects. □
The import of the above theorem is that for a polynomial functor T, the functor
{_}-Alg turns a Pred(T)-algebra on a predicate P into a T-algebra on the 'extent' of the
predicate P. This is the essential role comprehension plays in showing the validity of
the induction principle in Set: given a predicate P on the natural numbers ω which is
inductive, we use the initiality of ω to conclude that the (inductive) subset {n ∈ ω | P(n)}
must be the whole of ω, and thus the predicate P is (provably) true.
So far we have considered inductive data types and their associated induction principle
in terms of initiality in the empty context. For instance, the initiality of N allows us to
define functions out of it, e.g. h : N → X, by endowing the set X with a 1 + (_)-algebra
structure. But we also want to use this method when the inductive data type occurs in
an arbitrary context, e.g. to define addition add : N × N → N by induction on the second
argument. This requires that the initiality of N be preserved when we move from the
empty context to the context n : N (for the first argument of add). This operation
is called context weakening. Technically, we say initiality is stable under addition of
indeterminates, the indeterminate being n : N.
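The role of the indeterminate n : N can be made concrete in a Set-level sketch (ours; `it` and `add` are names of our choosing): add is obtained by iteration on its second argument, with the first argument treated as a parameter supplying the zero case of the algebra.

```python
# The indeterminate n : N made concrete (a sketch of ours): add is defined
# by iteration on its second argument, with n as a parameter giving the
# zero case of the algebra.
def it(a, m):
    def h(k):
        result = a
        for _ in range(k):
            result = m(result)
        return result
    return h

def add(n, m):
    # In the context n : N, endow N with the algebra (zero |-> n, succ |-> +1).
    return it(n, lambda r: r + 1)(m)

print(add(3, 4))  # 7
```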
A similar extension is needed for the associated induction principle, since when
we perform context weakening Γ ⇒ Γ, x : I, the element x may be subject to some
(propositional) hypothesis. That is, we are generally interested in proving relative en-
tailments P ⊢ Q rather than 'absolute' assertions ⊤ ⊢ Q. For instance, we may want to
prove n : N, m : N | p : Even(m) ⊢ q : Even(add(2·n, m)) for some q, in which case we
use induction on n with m : N and p : Even(m) as parameters.
Abstractly, both extensions are instances of the same phenomenon: let K be a 2-
category with finite products and inserters, and let A be an object of K with a 'ter-
minal object' ! ⊣ 1 : 1 → A. Given any global element I : 1 → A, we can consider the
'object A with an indeterminate element x : 1 → I', A[x : I]. This object is equipped with
ηI : A → A[x : I] and a 2-cell x : ηI ∘ 1 ⇒ ηI ∘ I, and is universal among objects with such
data. Given an endomorphism T : A → A, we can consider the 'object of T-algebras'
T-Alg, namely the inserter of T and the identity on A. Similarly, since T : A → A in-
duces T[x : I] : A[x : I] → A[x : I] with T[x : I] ∘ ηI = ηI ∘ T, we can consider the object
T[x : I]-Alg and the induced morphism ηI-Alg : T-Alg → T[x : I]-Alg. Stability means
that ηI-Alg preserves 'initial objects', for every I : 1 → A. It follows from Theorem 2.3.1
that stability is guaranteed whenever the object A is functionally complete, i.e. when ηI
has a right adjoint. We spell this out in more detail for categories and fibrations in the
following subsections. Further details on indeterminates and functional completeness
can be found in [HJ93]. We refer to [Str72] for the relevant definitions of comonads and
their associated morphisms, as well as Kleisli objects for them in a 2-category; these
concepts are, however, not essential for understanding what follows.
FTrs, I o Oj ~--- 7rFJ,I Oj• I o (OJJ X I) 0 (id, ~rtFY,1} : F(id, ~rF.l,l)' o Os,l
for every object J of B. Every polynomial functor T admits such structure and hence
can be lifted to B[x : I].
We recall from [HJ93] that B is functionally complete if for every object I, the functor
ηI : B → B[x : I] has a right adjoint. For B bicartesian this is the case precisely when it
is (bi)cartesian closed. As an easy consequence of Theorem 2.3.1 we have the following
result.
5.2.1. REMARK. Although the treatment of indeterminates for fibrations to follow par-
allels that for categories in the previous subsection, there is a subtle technical difference. All the concepts
preserves initial algebras (both on the base and the total categories).
5.2.3. REMARK. The above definition could equivalently be expressed by requiring that
every fibration with an indeterminate P, p[(x, h) : P], satisfy the induction principle
w.r.t. the induced morphism (Pred(T)[h : P], T[x : pP]) : p[(x, h) : P] → p[(x, h) : P], pro-
vided the base category B admits stable initial algebras. This makes logical sense, as we
want to reason by induction in the fibration p[(x, h) : P], which has an indeterminate of
type pP satisfying the hypothesis P; this is exactly what the above formulation means.
Our aim was to give a precise abstract account of structural induction over data types,
presenting the relevant technical machinery. A pay-off of this account is the precise
relationship between logical predicates and induction. This relationship is further eluci-
dated in a sequel to the present paper [HJ95], where we give an account of coinduction
principles along the same lines as those for induction here. In that case, the 'equality
predicate' functor takes over the role of T, and the fact that such a functor preserves
the relevant structure becomes (an instance of) Reynolds' 'identity extension lemma'
[MR91]. There are also some considerations as to the extent to which the present approach can
cope with bifunctoriality, in order to obtain (co)induction principles for recursive data
types, in line with the domain-theoretic account in [Pit93].
We should mention that the approach here can be applied to formulate induction
principles for data types with equational constraints (a standard kind of algebraic spec-
ification). The categorical aspects of such data types are described in [Jac95]. Briefly
put, such data types are described by so-called distributive signatures (Σ, E), and their
models correspond to distributive functors M : Cl(Σ, E) → B, where B is a distributive
category and Cl(Σ, E) is the classifying category associated with the signature. A 'logi-
cal predicate' over such a model is then a distributive functor Pred(M) : Cl(Σ, E) → P with
References
[CS91] J.R.B. Cockett and D. Spencer. Strong categorical datatypes I. In Proceedings Category
Theory 1991. Canadian Mathematical Society, 1991.
[Her93] C. Hermida. Fibrations, logical predicates and indeterminates. PhD thesis, University
of Edinburgh, 1993. Tech. Report ECS-LFCS-93-277. Also available as Aarhus Univ.
DAIMI Tech. Report PB-462.
[HJ93] C. Hermida and B. Jacobs. Fibrations with indeterminates: Contextual and functional
completeness for polymorphic lambda calculi. In Book of Abstracts of Category Theory
in Computer Science 5, September 1993. Extended version to appear in Mathematical
Structures in Computer Science.
[HJ95] C. Hermida and B. Jacobs. Induction and coinduction via subset types and quotient
types. Presented at the CLICS/TYPES workshop, Göteborg, January 1995.
[Jac91] B. Jacobs. Categorical Type Theory. PhD thesis, Nijmegen, 1991.
[Jac95] B. Jacobs. Parameters and parameterization in specification using distributive cate-
gories. Fundamenta Informaticae, to appear, 1995.
[Kel89] G.M. Kelly. Elementary observations on 2-categorical limits. Bulletin Australian Math-
ematical Society, 39:301-317, 1989.
[Law70] F.W. Lawvere. Equality in hyperdoctrines and comprehension scheme as an adjoint
functor. In A. Heller, editor, Applications of Categorical Algebra. AMS Providence,
1970.
[LS81] D. Lehmann and M. Smyth. Algebraic specification of data types: A synthetic ap-
proach. Math. Systems Theory, 14:97-139, 1981.
[LS86] J. Lambek and P.J. Scott. Introduction to Higher-Order Categorical Logic, volume 7
of Cambridge Studies in Advanced Mathematics. Cambridge University Press, 1986.
[MR91] Q. Ma and J. C. Reynolds. Types, abstraction and parametric polymorphism 2. In
S. Brookes, editor, Math. Found. of Prog. Lang. Sem., volume 589 of Lecture Notes in
Computer Science, pages 1-40. Springer Verlag, 1991.
[Pav90] D. Pavlović. Predicates and Fibrations. PhD thesis, University of Utrecht, 1990.
[Pav93] D. Pavlović. Maps I relative to a factorisation system. Draft, Dept. of Math. and
Stat., McGill University, 1993.
[Pit93] A. Pitts. Relational properties of recursively defined domains. Tech. Report TR321,
Cambridge Computing Laboratory, 1993.
[Str72] R. Street. The formal theory of monads. Journal of Pure and Applied Algebra, 2:149-
168, 1972.
[Str73] R. Street. Fibrations and Yoneda's lemma in a 2-category. In Category Seminar,
volume 420 of Lecture Notes in Mathematics. Springer Verlag, 1973.
On the Interpretation of Type Theory in Locally
Cartesian Closed Categories
Martin Hofmann*
1 Introduction and Motivation
In the next section we define categories with attributes and sketch the stan-
dard interpretation function. Section 3 contains the main result: the construc-
tion of a cwa out of an lccc. In Section 5 we give an extension to universes which,
however, does not handle the most general case. For many lcccs arising in the
semantics of type theory, in particular sets and ω-sets and all toposes, a naturally
equivalent cwa is already known; for the case of toposes see [7, Ex. 4.3.5].
In Section 6 we give an example where this is not the case and thus provide
an application of the main result. Section 7 offers some concluding remarks and
sketches an alternative construction of an equivalent split fibration due to Power
which does not extend to Π- and Σ-types.
Some familiarity with basic category theory and dependent type theory will
be assumed. Introductory material may be found in [1] (categories) and [10]
(dependent type theory). Both subjects are also well described in [12].
and
    σ[f ∘ g] = σ[f][g]
for g : A --+ B are satisfied.
- an operation p(−) which to each σ ∈ Fam(Γ) associates a C-morphism p(σ)
  with codomain Γ, the canonical projection of σ. The domain of p(σ) is
  written Γ·σ.
- an operation q(−, −) which to each C-morphism f : B → Γ and σ ∈
  Fam(Γ) associates a morphism q(f, σ) : B·σ[f] → Γ·σ such that the square²

      B·σ[f] ---q(f, σ)---> Γ·σ
         |                    |
    p(σ[f])                 p(σ)
         v                    v
         B ---------f-------> Γ

  is a pullback and the equations
2 This and the following diagrams have been typeset using Paul Taylor's diagram
macros.
    q(idΓ, σ) = idΓ·σ

and

    q(f ∘ g, σ) = q(f, σ) ∘ q(g, σ[f])
for g : A --+ B are satisfied.
Example 1. An important example of a cwa, which also gives some intuition about
the meaning of the various ingredients, is the term model of some dependent type
theory, constructed as follows. The category C has as objects well-formed contexts
of variable declarations, and as morphisms equivalence classes of parallel substitutions
(tuples of terms of the appropriate types). If Γ is a context then Fam(Γ)
is the set of types well-formed in Γ. If f : B → Γ is a substitution then σ[f] is
the parallel substitution of the terms of f in σ. The morphism p(σ) consists of
the first |Γ| variables of Γ, x:σ, and q(f, σ) is the substitution (f, x) where x is
the last variable in B, x:σ[f].
Further examples arise from families of sets or ω-sets.
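The families-of-sets example can be sketched directly (a toy illustration with names of our choosing; sets stand in for contexts, and a family assigns a set to each element):

```python
# A toy instance of a cwa (names ours, not the paper's notation): contexts
# are sets, a family sigma over Gamma assigns a set to each element,
# sigma[f] is precomposition, Gamma.sigma is the disjoint union, p is the
# canonical projection and q(f, sigma) reindexes along f.
def subst(sigma, f):                 # sigma[f]
    return lambda b: sigma(f(b))

def comprehension(Gamma, sigma):     # the set Gamma . sigma
    return {(g, x) for g in Gamma for x in sigma(g)}

def p(pair):                         # p(sigma) : Gamma . sigma -> Gamma
    return pair[0]

def q(f, pair):                      # q(f, sigma) : B . sigma[f] -> Gamma . sigma
    b, x = pair
    return (f(b), x)

Gamma = {0, 1}
sigma = lambda g: set(range(g + 1))  # sigma(0) = {0}, sigma(1) = {0, 1}
B = {'a'}
f = lambda b: 1

# sigma[f o g] = sigma[f][g] holds definitionally (precomposition), and
# q(f, -) lies over f:
print(sorted(comprehension(B, subst(sigma, f))))  # [('a', 0), ('a', 1)]
print(p(q(f, ('a', 1))))                          # 1
```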
Provided that suitable interpretations of base types and type constructors are
given, a partial interpretation function can be defined by structural induction
in such a way that every context is interpreted as a C-object, every type is
interpreted as an element of Fam at the interpretation of its context, and finally
terms are interpreted as sections (right inverses) of the canonical projections
associated to their types. If M is a right inverse of p(σ) then by a slight abuse of
language we say that M is a section of σ. The pullback requirement for q(f, σ)
allows us to define a semantic equivalent to substitution on terms: if M is a section
of σ ∈ Fam(Γ) and f : B → Γ then there is a unique section of σ[f], written
M[f], which satisfies q(f, σ) ∘ M[f] = M ∘ f.
This interpretation is sound in the sense that the interpretation of all deriv-
able judgements is defined and that all equality judgements are validated w.r.t.
the actual equality in the model. An auxiliary property of the interpretation is
that syntactic substitution is interpreted as its semantic counterpart −[−].
What it means for a cwa to be closed under a type former can be almost directly
read off from the syntactic rules. For example, closure under Σ-types means that
- for every two families σ ∈ Fam(Γ) and τ ∈ Fam(Γ·σ) there is a family
  Σ(σ, τ) ∈ Fam(Γ)
- for every two sections M of σ and N of τ[M] there is a section (M, N) of
  Σ(σ, τ), the pairing of M and N
- for every section M of Σ(σ, τ) there is a section M.1 of σ and a section M.2
  of τ[M.1], the two projections of M
such that (M, N).1 = M and (M, N).2 = N and (optionally) (M.1, M.2) = M,
and for f : B → Γ we have Σ(σ, τ)[f] = Σ(σ[f], τ[q(f, σ)]) and similar coherence
laws for pairing and the projections. See [15, 12] for details.
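In the families-of-sets reading these closure conditions are direct, as the following toy sketch (ours) spells out: Σ(σ, τ) is a set of dependent pairs, and the combinators are literal pairing and projections on sections.

```python
# The toy families-of-sets reading (names ours): Sigma(sigma, tau)(g) is
# the set of dependent pairs, and pairing/projections act pointwise on
# sections.
def Sigma(sigma, tau):
    return lambda g: {(x, y) for x in sigma(g) for y in tau((g, x))}

def pair(M, N):        # (M, N): a section of Sigma(sigma, tau)
    return lambda g: (M(g), N(g))

def proj1(S):          # S.1
    return lambda g: S(g)[0]

def proj2(S):          # S.2
    return lambda g: S(g)[1]

sigma = lambda g: {0, 1}
tau = lambda gx: {10 * gx[1], 10 * gx[1] + 1}   # depends on the chosen x
M = lambda g: 1                                  # a section of sigma
N = lambda g: 10                                 # a section of tau[M]

P = pair(M, N)
print(proj1(P)('g'), proj2(P)('g'))      # 1 10
print(P('g') in Sigma(sigma, tau)('g'))  # True
```

The equations (M, N).1 = M and (M, N).2 = N hold by construction here, which is the point of the coherence laws above.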
Preliminaries. Let C be a category with finite limits (terminal object and pull-
backs) and Γ ∈ Ob(C). The slice category C/Γ has as objects C-morphisms
with codomain Γ, and a C/Γ-morphism from s : dom(s) → Γ to t : dom(t) → Γ
is a C-morphism a : dom(s) → dom(t) with t ∘ a = s. Notice the important
triviality that any C-morphism a with codomain dom(t) is a C/Γ-morphism
with codomain t (and domain t ∘ a). For each C-morphism f : B → Γ there is
a functor f* : C/Γ → C/B sending s : dom(s) → Γ to the left vertical arrow
of the pullback of s along f. The action of f* on morphisms is defined by the
universal property of the pullback. The functor f* has a left adjoint Σf which
sends s : dom(s) → B to the composition f ∘ s. The arrow category C→ has as
objects all morphisms of C and commuting squares as morphisms. Equivalently,
a C→-morphism from s : dom(s) → B to t : dom(t) → Γ is a C-morphism
f : B → Γ and a C/B-morphism a : s → f*t. Taking the domain of a morphism
extends to a functor dom : C→ → C.
Categories with finite limits loosely correspond to dependent type theories if
one views morphisms as families of types, a morphism denoting the projection
from the disjoint union of all fibres to the indexing type. For example, in the lccc
of sets the type of m × n matrices indexed over the set N × N would be modelled
as the function format : Mat → N × N which maps an arbitrary matrix to its
"format", a pair of natural numbers indicating the numbers of rows and columns.
Substitution then corresponds (up to isomorphism) to pullback, and composi-
tion to disjoint union. For example, we obtain the set of square matrices indexed
over N as the pullback of format along the diagonal function N → N × N,
and similarly the set of matrices with a variable number of columns, indexed
over the number of rows, as the composition of format with the first projection
N × N → N.
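The matrix example can be played out on a toy finite fragment (a sketch of ours; `Mat`, `format_of` and the sample entries are invented for illustration): pulling the format map back along the diagonal carves out exactly the square matrices.

```python
# A toy finite fragment of the matrix example: a family is presented by the
# map format : Mat -> N x N, and substitution is pullback.
Mat = [((1, 1), 'a'), ((2, 2), 'b'), ((2, 3), 'c')]   # (format, payload)

def format_of(matrix):
    return matrix[0]

def pullback(family, f, index_set):
    """The pullback of the family's format map along f : index_set -> N x N."""
    return [(i, m) for i in index_set for m in family if f(i) == format_of(m)]

diagonal = lambda n: (n, n)                 # N -> N x N
square_matrices = pullback(Mat, diagonal, range(4))
print(square_matrices)   # [(1, ((1, 1), 'a')), (2, ((2, 2), 'b'))]
```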
Using equalisers one can also model extensional identity types. In order to
have dependent product types one also needs right adjoints to pullback functors
which leads to the following definition.
Definition 1. A locally cartesian closed category (lccc) is a category with finite
limits and right adjoints Πf to every pullback functor f* : C/Γ → C/B for
f : B → Γ.
Examples of lcccs are the categories of sets and ω-sets, all toposes, and the term
model of extensional Martin-Löf type theory as constructed in [14]. For the rest
of this section assume a fixed lccc C. In order to derive an interpretation of
dependent type theory in C we construct a cwa with base category C as follows.
For Γ ∈ Ob(C) the set Fam(Γ) is defined as the set of those functors σ from
the slice category C/Γ to the arrow category C→ which map every morphism
to a pullback square and for which cod ∘ σ = dom. More precisely, σ ∈ Fam(Γ)
means that for every a : B1 → B in C/Γ (a morphism from s ∘ a to s), the square

    dom(σ(s ∘ a)) ------> dom(σ(s))
         |                    |
  σ(s∘a) |                    | σ(s)
         v                    v
         B1 --------a-------> B

is a pullback.
Example 2. The intuition behind these families is that instead of making sub-
stitution (viz. pullback) an arbitrarily chosen structure, every family comes
equipped with its own behaviour under substitution. Thus in σ(s) one should
view s as a requested substitution and σ(s) itself as the result of performing
this substitution. Indeed, given a (not necessarily split) choice of pullbacks in C
we can see that every C-morphism τ with codomain Γ induces a family τ̂ over
Γ. For s : B → Γ we put τ̂(s) := s*τ, where s* is the pullback functor defined
above. If in addition a : B' → B, we define τ̂(s, a) as the unique mediating
morphism in

    dom(τ̂(sa)) ···> dom(τ̂(s)) ------> dom(τ)
        |               |                |
   τ̂(sa)|          τ̂(s) |                | τ
        v               v                v
        B' -----a-----> B -------s-----> Γ

where the right-hand square and the outer rectangle are pullbacks. It follows
from a simple diagram chase that the resulting left-hand square is also a
pullback, as required. Since τ̂(s, a) is defined by a universal property, it is
functorial.
The morphism q(f, σ) is given by σ(idΓ, f), which indeed yields the required pullback
square. The coherence law for q(−, −) follows from the functoriality of σ.
Notice that by the definition of canonical projection a section of some family
σ is merely a right inverse to σ(id). Thus terms do not carry any intensional
information with respect to substitution. See also Section 5.
We have now constructed a cwa over C which can be shown to be equivalent
to C in some suitable 2-categorical sense. We shall content ourselves with noticing
that the hat-construction and canonical projection (p) establish an equivalence
between the category Fam(Γ), where a morphism from σ to τ is a map f with
p(τ) ∘ f = p(σ), and the slice category C/Γ, for every Γ ∈ Ob(C).
Σ-types are obtained by composition: for σ ∈ Fam(Γ) and τ ∈ Fam(Γ·σ) we put

    Σ(σ, τ)(s) := σ(s) ∘ τ(q(s, σ)),    Σ(σ, τ)(s, a) := τ(q(s, σ), σ(s, a)).
The fact that Σ(σ, τ)(s, a) = τ(q(s, σ), σ(s, a)) forms a pullback with a and the
vertical arrows follows because the vertical composition of two pullback squares is
a pullback. Functoriality follows from functoriality of σ and τ and the coherence
laws for q(−, −).
Next, we check that the thus defined Σ-type is indeed stable under substi-
tution. If f : B → Γ and s : A → B then Σ(σ, τ)[f](s) = Σ(σ, τ)(fs) = σ(fs) ∘
τ(q(fs, σ)) = σ[f](s) ∘ τ(q(f, σ) ∘ q(s, σ[f])) = σ[f](s) ∘ τ[q(f, σ)](q(s, σ[f])) =
Σ(σ[f], τ[q(f, σ)])(s) as required. For the morphism part we calculate similarly.
The pairing and projection combinators are defined as usual in an lccc: if M is
a section of σ, i.e. a right inverse of σ(idΓ), and N is a section of τ[M], i.e. a right
inverse to τ[M](idΓ) = τ(M), then we define the pairing (M, N) as q(M, τ) ∘ N,
which is a section of Σ(σ, τ) by simple equality reasoning. On the other hand,
In a similar way we can show that the cwa of families supports lists or natural
numbers if the category C supports them in a coherent way. Instead of carrying
out these (rather laborious) examples we attempt to clarify the ideas a bit further
by elaborating the conditions on C which are necessary in order that in the
associated category with attributes we can interpret an (admittedly contrived)
type former governed by the rules

    Γ ⊢ σ type
    ----------------- T-FORM
    Γ ⊢ T(σ) type

    Γ ⊢ M : σ
    ----------------------- T-INTRO
    Γ ⊢ T-intro(M) : T(σ)
and the associated congruence rules. This can in general be interpreted if there
is an operation T which to every morphism σ with codomain Γ associates a
morphism T(σ), also with codomain Γ, and to every pullback square over
f : B → Γ with right-hand vertical arrow σ another pullback square

    dom(T(σ')) ---T(f')---> dom(T(σ))
         |                      |
    T(σ')|                      |T(σ)
         v                      v
         B -----------f-------> Γ

functorial in the sense that T(id) = id and T(f' ∘ g') = T(f') ∘ T(g'). This action
on pullback squares is another way of stating that T is compatible with the cho-
sen pullbacks up to isomorphism and admits a functorial action on isomorphisms,
but of course not necessarily on arbitrary morphisms.
Moreover, for each section M of σ we need a section T-intro(M) of T(σ) in
such a way that in the above pullback situation T-intro(M') is the section of
T(σ') induced by T-intro(M), where M' is the unique section of σ' with
f' ∘ M' = M ∘ f. This is the coherence condition one would reasonably expect.
Now we can define a T-operator on families by putting

    T(σ)(s) := T(σ(s))

and

    T(σ)(s, a) := T(σ(s, a))

for σ ∈ Fam(Γ) and s : B → Γ and a : B' → B. Functoriality follows from
functoriality of σ and T(−). The operation T-intro is defined as in C. Stability
of the same under substitution follows directly from the above coherence condition
by instantiating it with the pullback square formed out of q(f, σ),
p(σ[f]) = σ(f), p(σ), and f : B → Γ.
This example shows that the described method carries over to other type con-
structors like e.g. lists or natural numbers provided they are present in C in
a coherent way. We also see that a type former need not necessarily be given
by a universal construction as is the case for H- and Z-types. The lesson to be
learned is that whenever a type former admits a functorial action on pullback
squares which is compatible with the associated structure then it may be lifted
to the cwa of families.
The general interpretation function for categories with attributes now gives rise
to a semantic function mapping contexts to objects in C, types to families over
their context, etc. Now if Γ ⊢ σ type then [[Γ ⊢ σ]](id[[Γ]]) is an object in the
slice category C/[[Γ]] which we may view as the intended interpretation of σ
in C. This intended semantics is not "compositional" since, for example, in the
interpretation of pairing we use substitutions other than the identity. A reader
familiar with the theory of functional programming may notice here some similarity
with the continuation-passing-style translation, where semantics is inductively
defined with respect to an arbitrary continuation, but in the end one is only
interested in the instance of the identity continuation.
5 Universes
    S ----f'---> T
   s|           |t
    v           v
    B ----f---> Γ

and morphism p : T → Ω we have ∀t(p) ∘ f = ∀s(p ∘ f').
Theorem 4. To every split dictos there exists an equivalent cwa with enough
structure to interpret the Calculus of Constructions.
Proof. Let us first define what it means for a cwa with Π-types to be a model
of the Calculus of Constructions. Following [15] we need a family Prop over 1
and a family Prf over 1·Prop in such a way that two morphisms s, s' : Γ →
1·Prop are equal if Prf[s] = Prf[s']. Moreover, if σ is a family over Γ and
p : Γ·σ → 1·Prop then there is a morphism ∀σ(p) : Γ → 1·Prop such that
Prf[∀σ(p)] = Π(σ, Prf[p]). One could stay even closer to the syntax but only at
the expense of clarity.
Now let a split dictos C be given. We construct a cwa with base C as follows.
The set of families over Γ is defined as the disjoint union of the set of functorial
families as defined in Section 3 and the homset C(Γ, Ω). We call the elements
of C(Γ, Ω) propositional families (over Γ). The operations of substitution and
canonical projection are extended to propositional families by defining, for
σ : Γ → Ω and f : B → Γ:

    σ[f] = σ ∘ f
    p(σ) = σ*gen
    q(f, σ) defined by the universal property as in Ex. 2

The family Prf is defined as the identity on Ω. Notice that if s : Γ → 1·Prop then Prf[s] equals
s. Therefore Prf[−] is injective as required.
For the definition of the Π-type Π(σ, τ) we first replace σ by σ̂ if σ is propo-
sitional. So let us assume that σ is functorial. Then we proceed by case distinction
on whether τ is functorial or propositional. In the former case we use the Π-
type for functorial families as defined in Section 3. If τ is propositional, i.e.
τ : Γ·σ → 1·Prop, then we define Π(σ, τ) as the propositional family ∀p(σ)(τ).
Abstraction and application are defined by suitably interspersing the isomor-
phism between ∀s(p)*gen and Πs(p*gen) assumed in the definition of a split
dictos.
By a lengthy but straightforward calculation it follows that this satisfies all
the properties of dependent products. In particular, to see that Π is stable under
substitution we instantiate the coherence property for ∀ with the pullback square
formed out of p(σ[f]), p(σ), q(f, σ), and f for some f : B → Γ.
The Σ-operator is defined in exactly the same way, using the fact that propo-
sitional families and morphisms into 1·Prop coincide.
As in Ex. 2 the hat-construction and canonical projection define an equiva-
lence between C and the constructed cwa.
Proof. We only give the required objects, leaving the verifications to the reader.
Let f : Y → X and g : Z → X. The pullback of f and g is defined as the object
W given by W_set = Σy : Y. Σz : Z. X(f(y), g(z)) and W_rel((y, z, −), (y', z', −)) =
Y(y, y') × Z(z, z'). The two pullback projections send (y, z, −) to y and z re-
spectively.
Now let f : Y → X and g : Z → Y. We define Πf(g) : W → X by
actually equivalent to the lccc of types and terms in extensional type theory
defined in [14] and presented there as the initial one. Incidentally, the precise
proof of initiality (up to natural isomorphism) of this syntactic lccc is another
field of application for our methods.
Acknowledgement
References
1. Michael Barr and Charles Wells. Category Theory for Computing Science. Inter-
national Series in Computer Science. Prentice Hall, 1990.
2. J. Bénabou. Fibred categories and the foundations of naive category theory. Jour-
nal of Symbolic Logic, 50:10-37, 1985.
3. A. Carboni. Some free constructions in realizability and proof theory. Journal of
Pure and Applied Algebra, to appear.
4. J. Cartmell. Generalized algebraic theories and contextual categories. PhD thesis,
Univ. Oxford, 1978.
5. Pierre-Louis Curien. Substitution up to isomorphism. Fundamenta Informaticae,
19:51-86, 1993.
6. Thomas Ehrhard. Dictoses. In Proc. Conf. Category Theory and Computer Sci-
ence, Manchester, UK, pages 213-223. Springer LNCS vol. 389, 1989.
7. Bart Jacobs. Categorical Type Theory. PhD thesis, University of Nijmegen, 1991.
8. Bart Jacobs. Comprehension categories and the semantics of type theory. Theo-
retical Computer Science, 107:169-207, 1993.
9. Nax P. Mendler. Quotient types via coequalisers in Martin-Löf's type theory. In
the informal proceedings of the workshop on Logical Frameworks, Antibes, May
1990.
10. B. Nordström, K. Petersson, and J. M. Smith. Programming in Martin-Löf's Type
Theory, An Introduction. Clarendon Press, Oxford, 1990.
11. Wesley Phoa. An introduction to fibrations, topos theory, the effective topos, and
modest sets. Technical Report ECS-LFCS-92-208, LFCS Edinburgh, 1992.
12. Andrew Pitts. Categorical logic. In Handbook of Logic in Computer Science (Vol.
VI). Oxford University Press, 199? to appear.
13. A. J. Power. A general coherence result. Journal of Pure and Applied Algebra,
57:165-173, 1989.
14. Robert A. G. Seely. Locally cartesian closed categories and type theory. Mathe-
matical Proceedings of the Cambridge Philosophical Society, 95:33-48, 1984.
15. Thomas Streicher. Semantics of Type Theory. Birkhäuser, 1991.
Algorithmic aspects of propositional tense logics
Alexander V. Chagrov and Valentin B. Shehtman
Tver State University,
Zhelyabova Str. 33, 170013 Tver, Russia
Institute for Problems of Information Transmission,
Ermolovoj Str. 19, 101447 Moscow, Russia
Email: shehtman@ippi.ac.msk.su
1 Introduction
It is well-known that tense logics can be treated as logics of computations if we
understand "moments of time" as "states of a computing system". In this paper
we are concerned with tense logics in the traditional Priorean language having
two non-classical connectives: G ("it will always be the case that ..."), H ("it
was always the case that ..."), and their duals F = ¬G¬, P = ¬H¬. First let
us recall the definition of the minimal tense logic (denoted by Kt¹).
Axioms: classical tautologies and the formulas G(p → q) → (Gp → Gq),
H(p → q) → (Hp → Hq), FHp → p, PGp → p.
Inference rules: Modus Ponens; Generalization (⊢ φ ⇒ ⊢ Gφ and
⊢ φ ⇒ ⊢ Hφ); Substitution (of variables by formulas).
In general, by a tense logic we mean an extension of Kt by some new axioms;
if the number of new axioms is finite, the logic is called a tense calculus. E.g.
the calculus K4t is obtained by adding Gp → GGp (this is written as follows:
K4t = Kt + Gp → GGp).
Semantics of tense logics is given by Kripke frames. The latter are non-
empty sets (of "moments of time" or of "states") with a binary relation ("earlier
than"). We get a Kripke model on a frame (W, R) if for every moment w ∈ W
and for every formula φ we say whether φ is true at w (w ⊨ φ) or not, and also
if the following holds: w ⊨ φ ∧ ψ iff w ⊨ φ and w ⊨ ψ, etc. (similarly for the
other Boolean connectives); w ⊨ Gφ iff ∀v (wRv ⇒ v ⊨ φ);
w ⊨ Hφ iff ∀v (vRw ⇒ v ⊨ φ). A formula is called true in a Kripke model if it is
true at every moment; it is called valid in a frame if it is true in every model on
this frame. It is well-known that Kt is exactly the set of formulas valid in every
frame, and K4t is the set of formulas valid in every transitive frame. For proofs and
motivations of tense logics the reader may consult [5].
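These truth conditions translate directly into a small model checker over a finite frame (a Python sketch of ours, not part of the paper); as a sanity check we verify an instance of Gp → GGp on a transitive frame.

```python
# A toy model checker for the Priorean tense language on a finite frame
# (W, R): G quantifies over R-successors, H over R-predecessors, and the
# duals are F = ~G~ and P = ~H~.
def G(R, W, truth):
    """Points all of whose R-successors lie in `truth`."""
    return {w for w in W if all(v in truth for (x, v) in R if x == w)}

def H(R, W, truth):
    """Points all of whose R-predecessors lie in `truth`."""
    return {w for w in W if all(x in truth for (x, v) in R if v == w)}

def F(R, W, truth):
    return W - G(R, W, W - truth)   # F = ~G~

def P(R, W, truth):
    return W - H(R, W, W - truth)   # P = ~H~

W = {0, 1, 2}
R = {(0, 1), (1, 2), (0, 2)}        # a transitive "earlier than" relation
p = {2}                             # valuation of the variable p

# Sanity check: on this transitive frame, Gp -> GGp holds at every point
# for this valuation.
Gp = G(R, W, p)
GGp = G(R, W, Gp)
print(Gp.issubset(GGp))             # True
print(sorted(F(R, W, p)))           # [0, 1]
```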
The main topic of our paper is undecidability. In Section 2.2 we construct
undecidable tense calculi which are axiomatized by formulas of a very simple kind
¹ These notations are traditional: "K" reminds of Kripke, "t" is an abbreviation for
("tense reduction principles"). Sections 3.2 and 3.3 are devoted to more delicate
questions about decidability of properties of tense logics. The first examples
of undecidability for standard properties of polymodal logics (such as Kripke-
completeness, the finite model property, consistency) were found by Thomason
[14]. Here we show that tabularity of tense logics is undecidable as well (in
contrast with the case of monomodal S4-logics).
The readers are supposed to be familiar with standard syntactical and se-
mantical notions referring to normal modal and tense logics. However, we recall
some necessary definitions.
2 Undecidable tense calculi axiomatized by reduction principles
A tense operator is a word (maybe empty) in the alphabet {F, P, G, H}. A tense
reduction principle is a formula of the form Γ₁p → Γ₂p, in which Γ₁, Γ₂ are tense
operators.
It is well-known that many properties of time can be expressed by formulas
of this sort. E.g. Fp → HFp (and/or Gp → GGp) corresponds to transitivity,
PFp → FPp corresponds to "confluence": ∀x, y, z (xRy ∧ xRz → ∃t (yRt ∧ zRt)),
Fp → FFp corresponds to density, GHp → p corresponds to "seriality in the fu-
ture": ∀x ∃y xRy, and so on. The class of tense reduction principles may seem
too poor for obtaining negative algorithmic (and other) results. Nevertheless, in
this section we construct an undecidable tense calculus axiomatizable by tense
reduction principles.
    RΓ = Ri1 ∘ ... ∘ Rim

(∘ denotes the composition of binary relations); for the empty word Λ, let
(⇒) follows easily from the definition of L(T) by an induction on the proof
of Γ₂p → Γ₁p.
To show the converse, assume that L(T) ⊬ Γ₂p → Γ₁p. By Theorem 2, there
exists an embedding f : M(T) → TRM(W). Let Ri = f(|oi|) for 1 ≤ i ≤ n.
Then RA = f(|A|) for any word A, and thus (since f is an embedding and by
Lemma 1) we have (for any Γ, Δ):

    (Γ, Δ) ∈ T  ⇔  |Γ| ≤ |Δ| in M(T)  ⇔  RΓ ⊆ RΔ  ⇔  (W, R1, ..., Rn) ⊨ Δp → Γp.

Thus (W, R1, ..., Rn) validates L(T), and we obtain
(W, R1, ..., Rn) ⊭ Γ₂p → Γ₁p.
² Recall that an ordered monoid is a monoid which is a partially ordered set, such that
a ≤ b ⇒ xay ≤ xby.
Every set W gives rise to the tense ordered monoid of binary relations TRM(W) = (P(W × W), id_W, ∘, ⊆, ⁻¹). A t.o. monoid is called representable iff it is embeddable into some TRM(W). Not all t.o. monoids are representable; a criterion of representability is known, but it is rather complicated [2]. For our purposes we will use a simpler sufficient condition of representability.
Call a t.o. monoid serial iff it satisfies ∀x (1 ≤ xx⁻¹) and ∀x (1 ≤ x⁻¹x).
THEOREM 4. Every serial t.o. monoid is representable.
Proof (a sketch). A cone in a t.o. monoid M is a subset U ⊆ M such that x ∈ U and x ≤ y imply y ∈ U. Consider the set W of all cones in M, and for x ∈ M let
h(x) = {(U, V) ∈ W × W | xV ⊆ U, x⁻¹U ⊆ V}.
One can check that the map h : M → TRM(W) is an embedding, provided that M is serial. Q.E.D.
Obviously, the set of all words in the alphabet
Tₙ = {D₁, …, Dₙ, D₁⁻¹, …, Dₙ⁻¹}
can be considered as a t.o. monoid in which · is the concatenation, 1 = Λ, ≤ is the equality, and
(Dᵢ)⁻¹ = Dᵢ⁻¹, (Dᵢ⁻¹)⁻¹ = Dᵢ, (N₁ … Nₘ)⁻¹ = (Nₘ)⁻¹ … (N₁)⁻¹.
Qᵢ = G^{n+3} H G^{n+1−i} H G^{i+2} H.
Then the monoid morphism f such that, for any i, f(Dᵢ) = Qᵢ and f(Dᵢ⁻¹) = Qᵢ⁻¹, is a coding.
The proof is straightforward, and we skip it. By a standard argument we also have the following
Lemma 6. Let 𝒰 be a semi-Thue system over Tₙ, and let f be the same as in Lemma 5. Consider the semi-Thue system f(𝒰) = {(f(Γ), f(Δ)) | (Γ, Δ) ∈ 𝒰}. Then for any Γ ∈ Tₙ*
g(|Γ|_𝒰) = |f(Γ)|_{f(𝒰)}.
Sublemma 7.5. Suppose (4) holds for some Γ in Tₙ and some strongly regular 𝒰. Then for any Δ, an occurrence of f(Δ) in U can be of two types:
(i) either f(Δ) occurs in some Uᵢ;
(ii) or f(Δ) = GX and, for some i, it is preceded by HG, with X ⊑ Uᵢ.
Proof (a sketch). By induction on the length of Δ. The base consists in analyzing the different possible occurrences of Qₖ; here 7.4 is used. For the induction step, suppose Δ = Δ₀Dₖ; then f(Δ) = f(Δ₀)Qₖ, and there are two possibilities:
• f(Δ₀) occurs in some Uᵢ. Then the initial segment G^{n+2} of Qₖ must also occur in Uᵢ. Now from the base of 7.5 we can conclude that the whole Qₖ occurs in Uᵢ, and thus (i) holds.
V = D₀ U₁ D₁ … Uᵢ₋₁ G H f(Δ) Uᵢ … Uₖ Dₖ,
and also
where
Pᵢ = p₁ ∧ … ∧ pᵢ₋₁ ∧ ¬pᵢ ∧ pᵢ₊₁ ∧ … ∧ p_{s+1},
Now let L be a tabular logic determined by a finite frame ℱ. Then for some n, ℱ contains no n-pseudochains and no n-pseudo-branched worlds, and thus
Figure 1: the transitive frame ℱ, with points s(β, k, l) (diagram not reproduced).
the left and then pass to the state q_j; but if the machine is in the state q_i and there are no cells to the left of the head on the first tape, then, changing nothing on both tapes, it passes to the state q_l.³ A configuration of a Minsky machine is a triple A = (q_i, m, n), where q_i is a state and m and n are the numbers of cells on the first and the second tape. The notation P : A → B means that the Minsky program P transforms a configuration A to a configuration B. It is well known that there is no algorithm deciding, for given P, A, B, whether P : A → B or not.
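Since P : A → B is undecidable in general, a simulator can only run a program for a bounded number of steps. The sketch below uses our own encoding of Minsky programs (increment / decrement-with-zero-test instructions, a standard presentation rather than the paper's exact one); configurations are triples (state, m, n) as in the text:

```python
def run(P, A, max_steps=10_000):
    """Iterate program P from configuration A; return the final
    configuration, or None if max_steps is exhausted."""
    state, m, n = A
    for _ in range(max_steps):
        if state not in P:          # halting state: no instruction
            return (state, m, n)
        op, c, *targets = P[state]
        if op == "inc":             # add one cell to tape c
            m, n = (m + 1, n) if c == 1 else (m, n + 1)
            state = targets[0]
        else:                       # "dec": remove a cell, or branch if empty
            v = m if c == 1 else n
            if v > 0:
                m, n = (m - 1, n) if c == 1 else (m, n - 1)
                state = targets[0]
            else:
                state = targets[1]  # no cells to the left of the head
    return None

# transfer the first counter onto the second, then halt in state "h"
P = {"q0": ("dec", 1, "q1", "h"), "q1": ("inc", 2, "q0")}
print(run(P, ("q0", 3, 0)))        # -> ("h", 0, 3)
```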
The technique used below was described in full detail for monomodal and intermediate logics, and a general scheme of undecidability proofs of this kind can be found in [4]. The picture shows a transitive frame ℱ akin to the one from [3], where the proof scheme was first used. Further on we suppose that the frame ℱ contains only those points s(β, k, l) for which P : (α, m, n) → (β, k, l).
The points of ℱ are defined by the following variable-free formulas:
[display formulas garbled in the source; the indices range over i ∈ {0, 1, 2} and j, k ≥ 0]
• if I = τ → (δ, 0, 1) we set
Let
AxP = ⋀_{I ∈ P} Ax I
L(P, A, B) = K4t + AxP ∧ ((¬X ∧ ◊S(α, A¹_m, A²_n) → ¬X ∧ ◊S(β, A¹_k, A²_l)) → X).
References
[1] BAKER K.A. Finite equational bases for finite algebras in congruence-distributive equational classes. Adv. Math., 1977, v.24, 207–243.
[2] BREDIKHIN D.M. Representation of ordered involuted semigroups. Izvesti-
ja vuzov, matematika, 1975, No.7, 119-129 (in Russian).
[3] CHAGROV A.V. Undecidable properties of extensions of provability logic.
I: Algebra i Logika, 1990, v.29, No.3, 350-367; II: ibid., No.5, 613-623. (In
Russian)
[4] CHAGROV A.V. , ZAKHARYASCHEV M.V. The undecidability of the
disjunction property of propositional logics and other related problems. J.
of Symb. Logic, 1993, v.58, No.3, 967-1002.
[5] GOLDBLATT R. Logic of time and computation. CSLI Lecture Notes No.7,
1987.
[6] ISARD S. A finitely axiomatizable undecidable extension of K. Theoria,
1977, v.43, No.3, 195-202.
[7] MATIJASEVICH Yu. V. Simple examples of undecidable associative calculi.
Trudy MIAN SSSR, 1967, v.93, 50-88 (in Russian).
Paweł Cholewiński
1 Introduction
2 Preliminaries
Throughout this paper we assume that every default has at least one justification. We consider formulas built over a given propositional language ℒ. For any propositional formula φ, by Var(φ) we denote the set of all propositional variables which appear in φ. For every justification βᵢ of a default
Finally, by the set of conflict variables of d we mean Var*(d) = ⋃_{i=1}^{k} Var*(βᵢ).
Now we can state the definition of a stratification function as follows.
For a given stratification function p, its lower and upper bounds will usually be denoted by γ̲ and γ̄. That is, p is a function from a given set of defaults D to the set of ordinal numbers {ξ : γ̲ ≤ ξ ≤ γ̄}. Also, for a default d, by pr(d) we will mean its prerequisite, that is, the formula α, by J(d) the set of its justifications {β₁, …, βₖ}, and by c(d) its conclusion γ.
This definition of stratification differs from the standard one in that it uses
the new concept of conflict variables. We decided for this approach as it yields the
class of stratified theories which includes normal default theories and seminormal
default theories as defined in [5].
In fact, because any constant function is a stratification function for any set of defaults, only conditions (1) and (2) of Definition 2 are essential for (D, W)
to be stratified. A stratification based on a constant function p will be called
trivial. Clearly, the situation with strong stratification is different. For a given
set of defaults a strong stratification function need not exist.
Let us recall that if S is an extension of (D, W), then the set U of all defaults applicable with respect to S satisfies the equation S = Cn(W ∪ c(U)) [13]. This set of defaults will be called a generating defaults set for S and denoted as GD(D, W, S). Also, for a given default theory (D, W) and a well-ordering ⪯ of D, if S = Cn(W ∪ c(AD_⪯)) is an extension for (D, W) then AD_⪯ = GD(D, W, S) ([11]).
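The construction of AD_⪯ used throughout this section — repeatedly choosing the ⪯-least default whose prerequisite is provable and none of whose justifications is refuted at the current stage — can be illustrated for finite propositional theories. In the sketch below (all names are ours; formulas are Python predicates over truth assignments and entailment is decided by brute force, which only stands in for a real prover):

```python
from itertools import product

VARS = ["a", "b", "c"]          # the finite propositional language

def entails(theory, phi):
    # theory |- phi: phi holds in every assignment satisfying the theory
    assignments = ({x: t for x, t in zip(VARS, bits)}
                   for bits in product([False, True], repeat=len(VARS)))
    return all(phi(v) for v in assignments
               if all(w(v) for w in theory))

def applicable(W, chosen, d):
    pre, justs, _ = d
    theory = W + [c for _, _, c in chosen]
    return (entails(theory, pre) and
            not any(entails(theory, lambda v, b=b: not b(v))
                    for b in justs))

def AD(W, defaults):            # the list order plays the well-ordering
    chosen, changed = [], True
    while changed:
        changed = False
        for d in defaults:
            if d not in chosen and applicable(W, chosen, d):
                chosen.append(d)   # the least applicable default
                changed = True
                break
    return chosen

# d1 = a : b / b   and   d2 = b : c / c   (normal defaults), W = {a}
d1 = (lambda v: v["a"], [lambda v: v["b"]], lambda v: v["b"])
d2 = (lambda v: v["b"], [lambda v: v["c"]], lambda v: v["c"])
ext = AD([lambda v: v["a"]], [d1, d2])
print(len(ext))   # -> 2: both defaults enter the generating set
```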
In this section we present the main result of the paper. We show that extensions for a stratified default theory (D, W) can be found by expanding the initial knowledge set W stratum by stratum. The method of building extensions by well-orderings presented in Definition 4 [11] will be the main tool in proving this result. First we present a technical lemma. The proof is standard and is omitted.
Lemma 5 allows us to prove that, for a stratified default theory (D, W) and a well-ordering ⪯ of D which agrees with a given stratification function p, the order in which defaults are selected in the construction of AD_⪯ (Definition 4) is not incidental. The defaults must be "inserted" into AD_⪯ according to their strata. More formally, we have the following proposition.
Proof. Let d_ξ = α_ξ : β_{ξ,1}, …, β_{ξ,l} / γ_ξ and d_η = α_η : β_{η,1}, …, β_{η,m} / γ_η. Assume that η ⪯ ξ. Since the ranks of d_ξ and d_η are different, d_ξ ≠ d_η and therefore η < ξ. Hence the set of ordinals τ, such that τ > η and p(d_τ) < p(d_η), is not empty and well-ordered. Thus, without loss of generality, we can assume that ξ is the first ordinal greater than η such that the rule chosen in the ξth step has rank less than p(d_η). From the definition of AD_⪯ (Definition 4) we have that W ∪ c(AD_{<ξ}) ⊢ α_ξ and W ∪ c(AD_{<ξ}) ⊬ ¬β_{ξ,k} for any k = 1, …, l. Since AD_{<η} ⊆ AD_{<ξ}, it must be the case that W ∪ c(AD_{<η}) ⊬ ¬β_{ξ,k} for any k = 1, …, l. Also, from Lemma 5 it follows that
W ∪ c({δ ∈ AD_{<ξ} : p(δ) ≤ p(α_ξ)}) ⊢ α_ξ.
Since ξ is the first ordinal greater than η such that the rule d_ξ has rank smaller than the rank of d_η, all rules in AD_{<ξ} \ AD_{<η} have ranks greater than or equal to p(d_η). Also, since p is a stratification function for D, all variables which appear in α_ξ appear in conclusions of only those defaults from D which have ranks at most p(α_ξ). From Definition 3 we have that p(d_η) > p(d_ξ) ≥ p(α_ξ). Hence
{δ ∈ AD_{<ξ} : p(δ) ≤ p(α_ξ)} ⊆ AD_{<η}.
Thus W ∪ c(AD_{<η}) ⊢ α_ξ, and since ⪯ agrees with the stratification p and d_ξ ⪯ d_η, it follows that d_ξ should have been chosen in the ηth step rather than d_η. That is, ξ ≤ η. Because of this contradiction the proposition is proven. □
We can apply Lemma 5 to show that every strongly stratified default theory
has an extension. First, we will prove a slightly stronger result.
Proof. Suppose that for some d ∈ AD_⪯ there is a justification β of d such that W ∪ c(AD_⪯) ⊢ ¬β. Let d = d_ξ, that is, d was chosen in the ξth step of the construction of AD_⪯. We will consider three cases: (a) d is a normal default, (b) d is a seminormal default, (c) d is a general default.
(a) If d is normal then d = α : γ / γ. By definition of AD_⪯ we have that W ∪ c(AD_⪯) ⊢ γ, and therefore W ∪ c(AD_⪯) is inconsistent. Since for consistent W and any set D of normal defaults the theory W ∪ c(AD_⪯) is consistent (Lemma 4.2 in [11]), D must also contain a non-normal default d′ with β′ ∈ J(d′) for which W ∪ c(AD_⪯) ⊢ ¬β′. So, it is enough to consider cases (b) and (c).
(b) If d is seminormal then d = α : β ∧ γ / γ. If W ∪ c(AD_⪯) ⊢ ¬(β ∧ γ) then, since W ∪ c(AD_⪯) ⊢ γ, we have that W ∪ c(AD_⪯) ⊢ ¬β. Since the stratification is strong and ⪯ agrees with rank, it follows from Lemma 5 and Proposition 6 that W ∪ c(AD_{<ξ}) ⊢ ¬β. Hence W ∪ c(AD_{<ξ}) ⊢ ¬(β ∧ γ), a contradiction.
(c) Let d = α : β₁, …, βₖ / γ and suppose that for some i, W ∪ c(AD_⪯) ⊢ ¬βᵢ. As an immediate consequence of Proposition 6 and Lemma 5 we have that W ∪ c(AD_{<ξ}) ⊢ ¬βᵢ. That is, d could not have been selected in the ξth step of the construction of AD_⪯.
We have proven the first part of the proposition, that is, that condition 2 holds. This condition implies that Cn(W ∪ c(AD_⪯)) is an extension for (D, W) (Proposition 3.65 in [11]). □
Now, the existence of extensions for strongly stratified default theories follows
immediately from Proposition 7. We state it as a corollary.
It follows directly from Definition 9 that ⪯_S agrees with rank. The ordering ⪯_S orders defaults according to their strata: inside each stratum it creates an initial segment of defaults generating S ordered by ⪯, and then the defaults from D \ GD(D, W, S) are appended at the end of their strata. In proving subsequent results we will use the following notation. When we deal with the sets AD generated for several orderings ⪯₁, ⪯₂, …, ⪯_t (t may be an arbitrary index), to avoid confusion we will denote them by AD¹, AD², …, ADᵗ respectively; similarly for AD¹_{<ξ}, AD²_{≤ξ}, etc. We state the following proposition.
P r o p o s i t i o n 10. Let (D, W) be a stratified default theory and p be a stratifica-
tion for (D, W). Each extension for (D, W) is generated by some ordering which
agrees with p.
Proof. Since S is an extension for (D, W), there exists a well-ordering ⪯ of D such that S = Cn(W ∪ c(AD_⪯)) (Proposition 3.68 in [11]). In this case AD_⪯ = GD(D, W, S), that is, AD_⪯ is the set of generating defaults for S (see Section 2). Moreover, we can assume that ⪯ orders defaults in such a way that all defaults from the generating set GD(D, W, S) precede the defaults from D \ GD(D, W, S) and ⪯ restricted to GD(D, W, S) agrees with rank [11]. We will show that also for the ordering ⪯_S described in Definition 9, S = Cn(W ∪ c(AD_{⪯_S})). To this end it is enough to show that AD_⪯ = AD_{⪯_S}. We will prove by induction that AD_ξ = AD^S_ξ. Suppose that the claim is true for all ξ′ < ξ. Then W ∪ c(⋃_{ξ′<ξ} AD_{ξ′}) and W ∪ c(⋃_{ξ′<ξ} AD^S_{ξ′}) prove exactly the same formulas.
First, we show that in the construction with respect to ⪯_S only a rule from GD(D, W, S) may be selected in the ξth step. Suppose to the contrary that some d ∈ D \ GD(D, W, S) is selected in the ξth step of the construction with respect to ⪯_S. This means that W ∪ c(⋃_{ξ′<ξ} AD^S_{ξ′}) ⊢ α, that for all β ∈ J(d), W ∪ c(⋃_{ξ′<ξ} AD^S_{ξ′}) ⊬ ¬β, and that d is the ⪯_S-least default with these properties. Let p(d) = η. From the inductive assumption it follows that W ∪ c(⋃_{ξ′<ξ} AD_{ξ′}) ⊢ α. Hence, by the definition of AD_⪯ and the fact that d ∉ AD_⪯, there must be a justification β ∈ J(d) such that W ∪ c(AD_⪯) ⊢ ¬β. So, by Lemma 5,
W ∪ c({δ ∈ AD_⪯ : p(δ) ≤ p(β)}) ⊢ ¬β.
Since p is a stratification function, p(β) ≤ p(d) = η. And since in ⪯_S all rules from AD_⪯ having ranks p less than or equal to η precede d, it must be the case that
Till now we have proven that all extensions for stratified theories can be found by examining only those orderings which agree with the stratification, and that for strongly stratified theories every such ordering generates an extension. Now we will show that extensions for stratified default theories can be computed more efficiently than in the general case, that is, by dealing with one stratum at a time. First, we recall the basic definitions of the sets used in this construction.
1. A_ξ = AD_⪯ ∩ D_ξ;
2. A_{<ξ}, A_{≤ξ}, A_{>ξ} and A_{≥ξ} are defined similarly to D_{<ξ}, D_{≤ξ}, D_{>ξ}, D_{≥ξ};
3. W_{<ξ} = W ∪ c(⋃_{ξ′<ξ} A_{ξ′}) and W_{≤ξ} = W ∪ c(⋃_{ξ′≤ξ} A_{ξ′});
4. S_ξ = Cn(W_{<ξ} ∪ c(A_ξ)).
Proof. Consider an arbitrary ξ, γ̲ ≤ ξ ≤ γ̄. The ordering ⪯_ξ defined as ⪯ restricted to D_{<ξ} is an ordering of D_{<ξ}. Also, the function p_ξ = p restricted to D_{<ξ} is a stratification for (D_{<ξ}, W). From the definition of the AD-sets and Proposition 6 it follows that AD_{⪯_ξ} = AD_⪯ ∩ D_{<ξ} = A_{<ξ}. Since Cn(W ∪ c(AD_⪯)) is an extension for (D, W), for any d ∈ AD_⪯ and for any β ∈ J(d), W ∪ c(AD_⪯) ⊬ ¬β. Moreover, if d ∈ A_{<ξ} ⊆ AD_⪯ then W ∪ c(AD_{⪯_ξ}) ⊬ ¬β. This means that Cn(W ∪ c(A_{<ξ})) is an extension for (D_{<ξ}, W) (Proposition 3.65 in [11]).
To prove that Cn(W ∪ c(A_{≤ξ})) is an extension for (D_{≤ξ}, W), it is enough to take ξ + 1 in place of ξ in the proof of (1).
To prove that S is an extension for (D_{≥ξ}, W_{<ξ}), observe that (D_{≥ξ}, W_{<ξ}) is a stratified default theory. The function p′ = p restricted to D_{≥ξ} is a stratification function for D_{≥ξ}, and ⪯′ = ⪯ restricted to D_{≥ξ} is a well-ordering of D_{≥ξ} which agrees with p′. So, from the definition of the AD-sets and Proposition 6 we have that for any ξ′ ≥ ξ
Thus AD_{⪯′} = AD_⪯ \ A_{<ξ}. Consider the theory S′ = Cn(W_{<ξ} ∪ c(AD_{⪯′})). It follows that S′ = Cn(W ∪ c(A_{<ξ}) ∪ c(AD_{⪯′})) = Cn(W ∪ c(A_{<ξ} ∪ AD_{⪯′})) = Cn(W ∪ c(AD_⪯)). So S′ = S, and since S does not prove the negation of any justification from AD_⪯, S does not prove the negation of any justification from AD_{⪯′} ⊆ AD_⪯. Hence S is an extension for (D_{≥ξ}, W_{<ξ}). Assertion (4) follows easily from (3) by using ξ + 1 in place of ξ. □
4 Properties of Extensions
However, in the case of stratified default theories we can show a weaker form of semimonotonicity. First, as an immediate consequence of Proposition 12, we can state the following result.
Corollary 15. Let (D, W) be a strongly stratified default theory, D ⊆ D′, and p be a strong stratification function for D′. If for every d ∈ D′ \ D and δ ∈ D, p(d) > p(δ), then for any extension S of (D, W) there is an extension S′ for (D′, W) such that S ⊆ S′ and GD(D, W, S) ⊆ GD(D′, W, S′). □
Lemma 16. Let (D, W) and (D′, W) be two stratified default theories. Let D′ be a set of defaults such that D ⊆ D′ and the constant function p(d) = 1 is a strong stratification function for D′. For any extension S of (D, W) there exists an extension S′ of (D′, W) such that S ⊆ S′ and GD(D, W, S) ⊆ GD(D′, W, S′).
(D, W) by p, it follows that W ∪ c({δ ∈ AD_{⪯₁} ∪ AD_{⪯₂} : p(δ) < p(βᵢ)}) ⊢ ¬βᵢ. But since p is a strong stratification, all defaults from AD_{⪯₁} ∪ AD_{⪯₂} with ranks smaller than p(βᵢ) are in ⋃_{ξ′<ξ} AD_{ξ′}. That is, {δ ∈ AD_{⪯₁} ∪ AD_{⪯₂} : p(δ) < p(βᵢ)} ⊆ ⋃_{ξ′<ξ} AD_{ξ′}, and consequently W ∪ c(⋃_{ξ′<ξ} AD_{ξ′}) ⊢ ¬βᵢ, a contradiction.
(c) The case when βᵢ is not normal and not seminormal is very similar to the previous one. First, we observe that W ∪ c(AD_{⪯₁}) ∪ c(AD_{⪯₂}) ⊢ ¬β. Now, from Lemma 5 we can conclude that W ∪ c({δ ∈ AD_{⪯₁} ∪ AD_{⪯₂} : p(δ) < p(β)}) ⊢ ¬β, and by the same argument as in (b)
we once again obtain that W ∪ c(⋃_{ξ′<ξ} AD_{ξ′}) ⊢ ¬β, which contradicts the assumption about d_ξ. □
Proof. By Corollary 8 the default theory (D, W) has an extension. Suppose that S₁ and S₂ are two distinct extensions for (D, W), generated by ⪯₁ and ⪯₂ respectively. By Proposition 18 the theory W ∪ c(AD_{⪯₁}) ∪ c(AD_{⪯₂}) is inconsistent. Hence the theory W ∪ c(D) is inconsistent, a contradiction. □
Of course, to prove the assertions of Proposition 18 and Corollary 19 it is necessary to assume that the stratification is strong. Otherwise these assertions fail, as shown in the following example.
Throughout this section we assume that the sets D and W are finite and that the range of p is 1, …, K. First, we notice that checking whether a given finite default theory is strongly stratified and finding the finest partition into strata (with respect to Definition 2) is easy: except for the consistency checking of W, all the work can be done in polynomial time. This can be done in a similar way as for other stratified reasoning formalisms such as stratified logic programs [2], stratified autoepistemic theories [7], [10] or stratified seminormal default theories [5]. We can describe all the conditions of Definition 1 using the characteristic graph G_D.
Definition 21. For any finite set of defaults D, by the characteristic graph G_D we mean a directed graph G_D = (D, A), where for any dᵢ, dⱼ ∈ D there is an edge (dᵢ, dⱼ) ∈ A if:
vertices of level K always exist. In fact, in such a case, all the leaves of T(D, W) have depth K and correspond to all extensions for (D, W). Reasoning with a stratified default theory (D, W) can be perceived as searching the tree T(D, W). Let us denote by kᵢ the number of vertices of level i − 1 in T(D, W), that is, the number of extensions for (D_{<i}, W). Let nᵢ = |Dᵢ| and mᵢ be the total number of justifications in all defaults from Dᵢ. The tree T(D, W) can be searched in many ways; for example, both depth-first search and breadth-first search can be used. In any case, whenever the search of the whole T(D, W) is needed, the required time, measured as the number of calls to a propositional provability oracle, is bounded by
∑_{i=1}^{K} kᵢ f(nᵢ, mᵢ) ≤ f(n/K, m/K)(1 + q + q² + … + q^{K−1}) ≤ 2 f(n/K, m/K) 2^{n/2}.
(a) the query can be answered positively when at least one node U of T(D, W) of level K₀ − 1 is found and φ is true in U;
(b) the query can be answered negatively if φ is not true in all nodes of T(D, W) of some level l > p(φ).
3. For any formula φ and the query "Is φ true in all extensions for (D, W)?":
(a) the query can be answered positively when φ is true in all nodes of T(D, W) of any level l, 1 ≤ l ≤ K;
(b) the query can be answered negatively if φ is not true in at least one node of T(D, W) of some level l such that l > K₀ − 1 and l > p(φ).
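The query-answering strategy above can be pictured as a search of T(D, W). The toy sketch below is our own abstraction, standing in for the default machinery: knowledge sets are plain sets of atoms, and each stratum is modelled as a function listing the possible augmentations (one child per extension of that stratum); "some"/"all extension" queries are then answered by depth-first search over the leaves:

```python
def leaves(W, strata):
    """Depth-first search of T(D, W): yield one knowledge set per leaf,
    i.e. per extension of the whole theory."""
    if not strata:
        yield W
        return
    first, rest = strata[0], strata[1:]
    for W2 in first(W):          # one child per extension of the stratum
        yield from leaves(W2, rest)

def in_some_extension(W, strata, atom):
    return any(atom in S for S in leaves(W, strata))

def in_all_extensions(W, strata, atom):
    return all(atom in S for S in leaves(W, strata))

# two strata: the first branches (two extensions), the second is forced
s1 = lambda W: [W | {"p"}, W | {"q"}]
s2 = lambda W: [W | {"r"}] if "p" in W or "q" in W else [W]
W0 = {"base"}
print(in_all_extensions(W0, [s1, s2], "r"))   # -> True
print(in_all_extensions(W0, [s1, s2], "p"))   # -> False
print(in_some_extension(W0, [s1, s2], "p"))   # -> True
```

Early-exit pruning as in items 2–3 (answering as soon as a witness node is found at the appropriate level) drops out of the generator structure for free.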
References
NILS KLARLUND*
BRICS†
DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY OF AARHUS
NY MUNKEGADE
DK-8000 AARHUS C, DENMARK.
1. OVERVIEW
An important and only partially solved problem in the theory of ω-regular languages is whether representations can be minimized. For usual regular languages, deterministic finite-state automata (DFAs) are recognizing structures that can be minimized easily in polynomial time by virtue of the Myhill–Nerode Theorem. Although ω-regular languages enjoy some of the same properties as regular languages, see [6], the lack of similar minimization algorithms is a major impediment to building verification tools for concurrent programs.
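The polynomial-time minimization alluded to here is partition refinement: starting from the accepting/non-accepting split, states are separated until no letter distinguishes their successor classes (Moore's algorithm, a standard consequence of the Myhill–Nerode Theorem; the encoding below is ours):

```python
def minimize(states, alphabet, delta, accepting):
    # coarsest partition: accepting vs non-accepting states
    part = {s: int(s in accepting) for s in states}
    while True:
        # two states stay equivalent only if they agree now and their
        # a-successors lie in the same classes for every letter a
        sig = {s: (part[s],) + tuple(part[delta[s, a]] for a in alphabet)
               for s in states}
        ids = {v: i for i, v in enumerate(sorted(set(sig.values())))}
        new = {s: ids[sig[s]] for s in states}
        if len(set(new.values())) == len(set(part.values())):
            return new           # state -> class of the minimal DFA
        part = new

# a 3-state DFA for "words over {a,b} ending in a"; s1, s2 are duplicates
states = ["s0", "s1", "s2"]
delta = {("s0", "a"): "s1", ("s0", "b"): "s0",
         ("s1", "a"): "s2", ("s1", "b"): "s0",
         ("s2", "a"): "s1", ("s2", "b"): "s0"}
classes = minimize(states, "ab", delta, {"s1", "s2"})
print(classes["s1"] == classes["s2"])   # -> True: s1 and s2 merge
```

Since refinement can only split classes, the loop terminates after at most |states| rounds, giving the polynomial bound.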
The syntactic congruences of Arnold [2] provide canonical algebraic structures for ω-regular languages. By themselves, these congruences provide no explicit acceptance criteria, just as in the situation for a regular language: the canonical right congruence, whose classes are automata states, does not define a
*The author was partially supported by a Fellowship from the Danish Research Council.
†Basic Research in Computer Science, Centre of the Danish National Research Foundation.
2. FORCs AND LFORCs
Let Σ be a finite or infinite alphabet. The empty word is denoted ε. The set of finite words is denoted Σ* and the set of infinite words is denoted Σ^ω. A right congruence ≅ on Σ* is an equivalence relation that satisfies
x ≅ y ⇒ xa ≅ ya (for every a ∈ Σ).
[Figure: the safety automaton and two progress automata over {a, b}; diagrams not reproduced.]
The first automaton defines the safety congruence ∼ by x ∼ y if and only if the last letters of x and y are the same. The congruence ≈₁ is specified by the second automaton. Each state is marked with the corresponding safety state according to the requirement (FORC). The states in Λ₁ are marked by an inner circle. The other progress congruence ≈₂ is shown as the last automaton.
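As a concrete instance of a right congruence, the last-letter safety congruence of this example can be presented by its state map; the small sketch below (our own encoding) checks the right-congruence property on a few words:

```python
# The safety congruence ~ identifies words by their last letter; as a
# right congruence it is the state map of a 3-state automaton over
# {a, b}, with one extra class "e" for the empty word.
def safety_state(x):
    return x[-1] if x else "e"

assert safety_state("abba") == safety_state("ba")     # same class
assert safety_state("ab") != safety_state("ba")
# right-congruence property: appending any v preserves equivalence
for v in ["", "a", "b", "ab"]:
    assert safety_state("abba" + v) == safety_state("ba" + v)
```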
There is another LFORC representation of the same language with a simpler
safety congruence and a more complicated progress congruence:
[Second LFORC: safety and progress automata not reproduced.] □
The size of an LFORC is the maximum index of its congruence relations. Thus
the size of the first LFORC above is three and the size of the second one is four.
(One could also have defined the size as the total number of classes, but this
number is at most quadratically bigger.)
Given L and (∼, ≈), define the natural live assignment Λ^L by letting Λ^L_s consist of the p such that some α ∈ L allows an (s, p)-factorization, i.e. such that L ∩ L(s,p) is non-empty. Then L ⊆ L(∼, ≈, Λ^L).
A language L is saturated by (∼, ≈) if for all α and β both admitting an (s, p)-factorization, it holds that α ∈ L if and only if β ∈ L or, in other words, if for all (s, p), either L(s,p) ⊆ L or L(s,p) ∩ L = ∅. Thus for α ∈ L, we may choose any factorization to determine whether α ∈ L.
We can express the saturation property of [2] as follows.
Lemma 2. (Saturation) L is recognized by (∼, ≈, Λ^L) if and only if L is saturated by (∼, ≈).
Lemma 3. (Cyclicity) If ∼ refines ≅ and x ∈ s̲ = s̲ · y, then for some i, j ≤ |s̲|∼ and some s ⊆ s̲, x · yⁱ ∈ s = s · y^j.
In particular, when α = u · v^ω, it admits a factorization
α = u · vⁱ · (v^j)^ω
in (∼, ≈) with s ⊆ s̲ and i, j ≤ |s̲|∼. We say that the former factorization induces the latter.
Proof. Note that the ∼-states of x, x · y, x · y², … are all among the |s̲|∼ different ∼-states contained in s̲. □
Proof. Since L(∼, ≈, Λ) is ω-regular, it suffices to verify that each word of the form u v^ω is accepted if and only if it is in L.
So assume x y^ω is in L. By Lemma 1(b), x y^ω can be factorized as
Proof. Define ≈ by
(1) x ≈_s y iff s·x ∼ s·y and, for all v, s = s·x·v implies u(xv)^ω ∈ L iff u(yv)^ω ∈ L, where u ∈ s.
Since ∼ refines ∼_L, the choice of u in (1) is immaterial, and it can be seen that ≈ saturates L. Also, it is not difficult to see that any other FORC with ∼ as safety congruence retracts to ≈. □
We noted before that the progress and live state conditions for a retraction involving LFORCs with the same safety congruence essentially express DFA homomorphisms. Thus for a fixed safety congruence ∼, DFA minimization that respects requirement (FORC) can be applied to obtain the ∼-canonical LFORC.
We shall next show that when the safety congruence becomes simpler, the progress congruences get more complicated, but they still essentially contain the simpler progress congruences if these are minimum with respect to their safety congruence.
Assume ≅ retracts to ≅̲. Then ≅ progress-refines ≅̲ if for s ∈ s̲,
x ≅̲ y and s·x ∼ s·y implies x ≈_s y.
4. COLLAPSED LFORCs
Let ℛ = (∼, ≈, Λ) be an LFORC and ≅ be a refinement of ∼. Define the collapsed LFORC ℛ̲ = (∼̲, ≈̲, Λ̲) by
(2) x ≈̲_s̲ y iff for all s of ∼ contained in s̲: x ≈_s y, and for all v and i ≤ |s̲|∼, s̲·x·v = s̲ implies (xv)ⁱ ≈_s (yv)ⁱ;
(3) p̲ ∈ Λ̲_s̲ iff [xⁱ]_≈ ∈ Λ_s, where x ∈ p̲, s̲·x = s̲, s·xⁱ = s, and s ⊆ s̲.
It can be seen that (2) always defines a right congruence. The definition (3) may not always make sense. The Consistency Requirement is that for all s̲ the membership of a progress state p̲ in Λ̲_s̲ is determined unambiguously by (3) for any choice of x, i, and s.
5. LOWER BOUND
We establish an exponential lower bound for minimization: there is an infinite family Lₙ of languages that can be represented by LFORCs of size O(n) but whose canonical LFORCs contain a progress congruence with n^{Ω(n)} states.
We let Σₙ = Σ_A ∪ Σ_B consist of n trigger letters Σ_A = {a₁, …, aₙ} and of n response letters Σ_B = {b₁, …, bₙ}. A word α is in Lₙ if from some point on there is a trigger letter aᵢ such that aᵢ occurs infinitely often and no other trigger letter occurs; also, the number of bᵢ's between two consecutive letters aᵢ must be exactly n. There is certainly only one safety class in this language, since a word is recognized by properties of its tail. Define a linearly big LFORC ℒₙ recognizing Lₙ by using the safety congruence represented by the automaton depicted for n = 3 as
[safety automaton for n = 3: diagram not reproduced]
Thus each state sᵢ is a sink for the letter aᵢ, and a response letter does not affect the state. Continuing our example for n = 3, we can represent the progress congruence for safety state 1 by the automaton
[progress automaton for safety state 1: diagram not reproduced]
a green light is flashed every time n occurrences of bᵢ have taken place in component i. An infinite word is accepted if the red light flashes only finitely often and the green light flashes infinitely often.
Then 𝒜ₙ has size O(n²) and accepts Lₙ. □
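Membership of an ultimately periodic word u·v^ω in Lₙ can be decided from the periodic part alone. The sketch below is our own encoding (letters are strings such as "a1", "b2"; we read the "between two consecutive aᵢ" condition as counting response letters cyclically in v, and the prefix u is irrelevant because the definition only constrains the tail):

```python
def in_Ln(u, v, n):
    # exactly one trigger letter may occur infinitely often, i.e. in v
    triggers = {x for x in v if x.startswith("a")}
    if len(triggers) != 1:
        return False
    (a,) = triggers
    pos = [i for i, x in enumerate(v) if x == a]
    # gaps between consecutive occurrences of a in the cyclic word v
    gaps = [pos[k + 1] - pos[k] - 1 for k in range(len(pos) - 1)]
    gaps.append(pos[0] + len(v) - pos[-1] - 1)   # wrap-around gap
    return all(g == n for g in gaps)

print(in_Ln(["b1"], ["a1", "b2", "b3"], 2))   # -> True
print(in_Ln(["b1"], ["a1", "b2"], 2))         # -> False: gap of 1, not 2
```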
(4) IC(≈_s)(x) = IC(≈_s)(y)
iff
x ≈_s y and, for all v and i ≤ |f⁻¹(s̲)|, s(xv)ⁱ = s ⇒ (xv)ⁱ ≈ (yv)ⁱ.
Lemma 7. For each s, an automaton IC(≈_s) exists such that (4) holds. The automaton is at most exponential in the size of ≈_s.
(5) x ≈̲_s̲ y
implies
x ( ⊗_{s ∈ S : f(s) = s̲} IC(≈_s) ) y,
where ⊗ denotes the transition system cross product, such that for all s ∈ S with f(s) = s̲ and all i ≤ |f⁻¹(s̲)|,
Proof. Clearly the safety congruence can only shrink. For the progress part, use the FDFA representation (⊗, Λ) of ℛ and assume that the canonical safety state s̲ is refined by the s of 𝒮 such that f(s) = s̲. Observe that ⊗_{s ∈ S : f(s) = s̲} IC(≈_s) refines the canonical progress automaton or congruence for s̲. Thus the canonical congruence is of size at most (n′)ⁿ, which is n^{O(n)}. □
7. CONCLUSION
We have presented algebraic homomorphism concepts based on right congruences introduced by Maler and Staiger. With these concepts, we have shown that a representation of ω-regularity exists that extends the Myhill–Nerode Theorem. However, even though LFORCs can be reduced to a canonical form and any Rabin or Streett automaton with a small acceptance condition is poly-size related to LFORCs, it is still open whether the Myhill–Nerode Theorem can be generalized to an algebraic setting that both is poly-size related to usual automata on infinite words and allows poly-time computable reductions to canonical forms.
Acknowledgements. Thanks to Dexter Kozen, Oded Maler, Ludwig Staiger, Igor Walukiewicz, and Thomas Wilke for discussions about ω-regular representations. Also, thanks to the referee for helpful comments.
REFERENCES
1. B. Alpern and F.B. Schneider. Defining liveness. Information Processing Letters, 21:181–185, Oct. 1985.
2. A. Arnold. A syntactic congruence for rational ω-languages. Theoretical Computer Science, 39:333–335, 1985.
3. O. Maler and L. Staiger. On syntactic congruences for ω-languages. Technical Report 93-13, Aachener Informatik-Berichte, 1993. A preliminary version appeared in: Proc. STACS 93, LNCS 665, Springer-Verlag, Berlin 1993, pp. 586–594.
1 Background
1.1 Subrecursion and computational complexity
Recent interest in machine-independent characterizations of computational complexity has been motivated by the wish to lend credence to the importance of complexity classes, provide insight into their nature, relate them to programming methodology, suggest new tools for separation, and offer generalizations to computing over arbitrary structures and to higher-type functionals. Machine-independent approaches fall, by and large, into three classes: proof-theoretic, descriptive (database queries), and algebraic (applicative programs).
Recurrence schemas have long been used as an applicative method for defining, characterizing, and classifying natural collections of recursive functions. Subrecursive characterizations of computational complexity classes that are relevant to computer science originate with Cobham's [1965] characterization of the poly-time functions over ℕ by "bounded recursion on notations." Unfortunately, Cobham's characterization uses ad hoc initial functions and explicit bounds on functions' growth rate. The correspondence between subrecursion and computational complexity was clarified by the use of ramified data, introduced independently by Simmons [1988], Leivant [1990b], and Bellantoni and Cook [1992]. One underlying idea is that data objects are used computationally in different guises. By explicitly separating these uses, and requiring that recurrence (i.e. primitive recursion)
respect that separation, one obtains forms of ramified recurrence that correspond
closely to major computational complexity classes. Such characterizations have
been provided, for example, for the poly-time functions [Bellantoni and Cook, 1992;
Leivant, 1993], the extended polynomials [Leivant, 1990b], the linear space func-
tions [Bellantoni, 1992; Handley, 1992; Leivant, 1993; Nguyen, 1993], NC 1 and
polylog space [Bloch, 1992], NP and the poly-time hierarchy [Bellantoni, 1994],
and the elementary functions [Leivant, 1994a]. For further background on ramified recurrence the reader is referred to [Leivant, 1994b].
We show here that the poly-space functions over W are precisely the functions defined by ramified recurrence with substitution. The schema of recurrence with (parameter) substitution is a well-known variant of recurrence, in which the parameters of a recurrence may be altered at each iteration using previously defined functions.
The role of recurrence with substitution is best understood when comparing the present result to the characterization of poly-time in [Leivant, 1993; Leivant, 1994b]. There we used recurrence over the algebra of binary words to simulate the progressing computation of deterministic abstract machines, by iterating the mapping of a configuration into its successor configuration. The length of the progression, governed by the ramification conditions, is polynomial in the size of the input, leading to a characterization of poly-time. Parameter substitution allows the representation of computation flow backwards, and not only forwards: the outcome of performing t + 1 computation steps from a given configuration is defined by performing t steps from successor configurations. This allows the simulation of parallel computing, in particular the computation of poly-time alternating machines over {0, 1}*, i.e. of poly-space.
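Recurrence with parameter substitution can be illustrated directly: the value after t + 1 steps is defined from values after t steps taken at modified parameters. The sketch below is our own toy instance (not the paper's formal schema over W): it evaluates a complete AND/OR game tree of size 2^depth while keeping only one branch in memory at a time, mirroring how parameter substitution captures alternation within small space:

```python
def f(depth, node, leaf, gate):
    """Recurrence on depth; the parameter `node` is substituted by
    node+"0" and node+"1" at each step (two substituted recursive calls,
    as in recurrence with parameter substitution)."""
    if depth == 0:
        return leaf(node)              # base of the recurrence
    l = f(depth - 1, node + "0", leaf, gate)
    r = f(depth - 1, node + "1", leaf, gate)
    return gate(depth, l, r)           # combine the two subcomputations

# leaves: parity of 1-bits in the address; alternate AND/OR by level
leaf = lambda addr: addr.count("1") % 2 == 1
gate = lambda d, l, r: (l and r) if d % 2 == 0 else (l or r)
print(f(3, "", leaf, gate))   # -> True
```

The recursion stack holds one address per level, so the space used is linear in depth even though the evaluated tree is exponentially large — the intuition behind simulating alternating poly-time, i.e. poly-space.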
by monotonic fixpoints [Vardi, 1982; Immerman, 1986]. The first such characterization for poly-space is in terms of definability by imperative programs with guarded looping ("while programs") over ordered finite structures, proved independently by Moshe Vardi [Vardi, 1982] and Yiannis Moschovakis [Moschovakis, 1983, 2C.8]. Abiteboul and Vianu [Abiteboul and Vianu, 1989] showed that the poly-space queries over ordered structures are those defined by non-monotonic fixpoint, a result which elegantly complements the characterization of poly-time by positive fixpoint [Vardi, 1982; Immerman, 1986] and inflationary fixpoint [Gurevich and Shelah, 1986; Leivant, 1990a]. A related characterization is Datalog(¬¬) [Abiteboul and Vianu, 1991]. Alternatively, poly-space is obtained by enriching first order logic with both a partial fixpoint and some nondeterministic choice operator [Abiteboul et al., 1990], or by enriching Datalog with operations that represent hypothetical reasoning [Bonner, 1988] (these characterizations hold also over unordered structures).
Several other characterizations in finite model theory use higher order logic,
following the seminal characterization of poly-time queries by existential second
order formulas [Fagin, 1974; Jones and Selman, 1974], from which it follows that
the queries expressible in second order logic form the poly-time hierarchy. Notably,
Immerman showed that the poly-space queries (without order assumption) are
obtained by enriching second order quantification with a second order transitive
closure operator; this is immediately equivalent to expressibility in SO[n^O(1)], i.e.
by a uniform sequence of second order formulas where a quantifier block is iterated a
polynomial number of times [Immerman, 1987]. For queries over ordered structures
an equivalent property is definability by an existential second order formula whose
first order matrix uses non-monotone fixpoints, or similar extensions of first order
logic (see [Abiteboul et al., 1994] for a survey).
An algebraic characterization of poly-space in finite model theory is obtained
using higher order recurrence over particular rank-3 types [Goerdt, 1992], in anal-
ogy to the characterization of the log-space queries by simple recurrence [Gurevich,
1983]. Of some relevance are also characterizations of poly-space queries in terms
of imperative computation models (without explicit reference to resource bounds),
such as program schemes with arrays [Tiuryn and Urzyczyn, 1988; Stewart, 1993].
However, in contrast to the characterizations above of poly-space queries in
Finite Model Theory, our characterization is for poly-space functions over a single
infinite algebra, namely the algebra W of binary words, i.e. {0, 1}*, the canonical
medium of Turing machine computing. While subrecursive characterizations of
poly-time and of linear-space, albeit unsatisfactory, have been known for a long
while [Cobham, 1965; Ritchie, 1963], we know of no such characterization thus far
for poly-space (except for the rather pedestrian observation that, by the character-
ization of linear space in [Ritchie, 1963], composing functions in Grzegorczyk's E^2
with λn.2^(n^k) yields poly-space [Thompson, 1972]; a category-theoretic rendition of
that result is in [Otto, 1995]).
Most results about applicative delineations of complexity classes are based, ex-
plicitly or implicitly, on data structures other than the natural numbers. For in-
stance, Cobham's "recursion on notations" is in fact recurrence over binary words
in disguise. We find it cleaner and clearer to refer directly to recurrence over free
algebras.² For the rest of the paper A will be a free algebra generated from c_1 ... c_k
(k > 0), where arity(c_i) = r_i ≥ 0. We write arity(A) for max(r_1, ..., r_k). If a ∈ A,
we write |a| for the height of (the syntax tree of) a.
Recurrence over A is a set of k templates, one for each constructor:

    f(c_i(a_1 ... a_{r_i}), x̄) = g_{c_i}(ā_1 ... ā_{r_i}, ā, x̄),  where ā_j =_df f(a_j, x̄),  i = 1 ... k.

The functions g_{c_i} above are the recurrence functions, and the argument of f dis-
played first is the recurrence argument. The r_i arguments of g_{c_i} displayed first
above are the critical arguments, and the arguments displayed as x̄ are the param-
eters. The destructors dstr_j over A are given by

    dstr_j(c_i(a_1 ... a_{r_i})) = a_j if j ≤ r_i, and = c_i(a_1 ... a_{r_i}) otherwise.

For A = W the schema reads

    f(ε, x̄) = g_ε(x̄),
    f(0w, x̄) = g_0(f(w, x̄), w, x̄),
    f(1w, x̄) = g_1(f(w, x̄), w, x̄).
²These schemas are for the most part well known: see, for example, [Venkataraman et al., 1983;
Tucker and Zucker, 1988].
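As an illustration (ours, not the paper's), the schema for W can be transcribed into Python, with words represented as strings whose leading character is the outermost constructor:

```python
# A minimal sketch (ours) of recurrence over the word algebra W = {0,1}*:
# words are Python strings, "" is epsilon.
def recurrence(g_eps, g0, g1):
    """f(eps, x) = g_eps(x)
       f(0w, x)  = g0(f(w, x), w, x)
       f(1w, x)  = g1(f(w, x), w, x)"""
    def f(word, x):
        if word == "":
            return g_eps(x)
        w = word[1:]
        r = f(w, x)
        return g0(r, w, x) if word[0] == "0" else g1(r, w, x)
    return f

# Instantiating the recurrence functions gives concatenation, concat(w, x) = w ++ x:
concat = recurrence(lambda x: x,
                    lambda r, w, x: "0" + r,
                    lambda r, w, x: "1" + r)
```

The schema itself fixes the recursion pattern; only the choice of g_ε, g_0, g_1 varies.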
    f(c_i(a_1 ... a_{r_i}), x̄) = g_{c_i}(f(a_1, x̄) ... f(a_{r_i}, x̄), ā, x̄),  i = 1 ... k
provided the tier j of the recurrence argument of f is higher than the tier t of
actually-present critical arguments of each gc~; i.e. either the recurrence is flat, or
else the recurrence argument of f is in a tier higher than the output tier of f .
The ramified PR functions over A are the functions over S(A) that are defined
from the constructors of A by ramified recurrence and explicit definitions. We write
RRec(A) for the collection of these functions. We also write f ∈ RRec(A) for a
function f over A if f is obtained from a function in RRec(A) by disregarding tiers.
Examples of ramified recurrence. (See [Leivant, 1994b] for more detail and
other examples.)
1. All functions defined by explicit definitions and flat recurrence are trivially
in RRec(A) as functions over A_0, or, more generally, as functions over A_i for
any given i.
3. Addition and multiplication over N, and concatenation over W, are all defin-
able by ramified recurrence: for each j, k with j > k we can define a copy
of addition, +_{jk} : N_j × N_k → N_k. Then, for each j, k with j > k, we obtain a
ramified multiplication function ×_{jk} : N_j × N_j → N_k.
4. Analogously to addition, we have ramified copies of the concatenation func-
tion, ⊕_{jk} : W_j × W_k → W_k (j > k). A multiplicative function ⊗ over W can
be defined, for example, by ⊗(ε, x, y) = ε, ⊗(0w, x, y) = x ⊕ ⊗(w, x, y),
⊗(1w, x, y) = y ⊕ ⊗(w, x, y). For each j, k, where j > k, we can define by
ramified recurrence a ramified version ⊗_{jk} of ⊗ from W_j × W_j to W_k.
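The clauses for ⊗ can be run directly; the sketch below (ours) ignores tiers and represents words as Python strings, with string concatenation standing in for ⊕:

```python
# The multiplicative function over W from the example above, tiers ignored.
def otimes(w, x, y):
    """otimes(eps, x, y) = eps
       otimes(0w, x, y)  = x ++ otimes(w, x, y)
       otimes(1w, x, y)  = y ++ otimes(w, x, y)"""
    if w == "":
        return ""
    return (x if w[0] == "0" else y) + otimes(w[1:], x, y)
```

In particular |otimes(w, x, x)| = |w| · |x|, so ⊗ is the word analogue of multiplication.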
5. Consider the definition above of exp by exp(n + 1) = exp(n) + exp(n). Since
the first input of ramified addition must be at a tier higher than the out-
put, this recurrence cannot be ramified. A similar argument shows that the
definition of exponentiation via multiplication cannot be ramified either.
For our generic algebra A, recurrence with (parameter) substitution [Rose, 1984] is
the schema
Here the σ̄_{ijt} are vectors of already-defined functions, the substitution functions,
and (σ_1 ... σ_p)(ū) stands for (σ_1(ū) ... σ_p(ū)). The arguments ū are the substitution
parameters.
For A = N the above reads
For example, define xp(0, u) = u, xp(sn, u) = xp(n, 2u); then xp(n, 1) = 2^n.
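The xp example can be checked directly. The iterative loop below (a sketch of ours, using ordinary Python integers) mirrors the fact that the parameter u is rewritten at each recurrence step:

```python
# xp(0, u) = u,  xp(n+1, u) = xp(n, 2*u): the parameter u is altered at
# each iteration, which is exactly parameter substitution.
def xp(n, u):
    while n > 0:
        n, u = n - 1, 2 * u
    return u
```

Thus xp(n, 1) = 2^n is computed with n iterations but no nesting of recursive calls.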
It is well known that recurrence with substitution over N is reducible to simple
recurrence, but the standard reduction uses sequence-coding by elementary func-
tions [Rose, 1984]. In the context of sub-elementary complexity classes, it is there-
fore appropriate to consider this schema in its own right. Ramified A-recurrence
with substitution is the schema of recurrence with substitution as stated above, but
applied over S(A), with the provisos that (1) the tier of the recurrence argument be
larger than the tier of the critical arguments, and (2) that the substitution parameters
have a common tier. We write RRecSbt(A) for the collection of functions generated
from the constructors of A by ramified recurrence with substitution and explicit
definitions.
Examples
    f(0, x, y) = y
    f(sn, x, y) = f(n, x, (y + x) + x)

with f : N_2 × N_1 × N_0 → N_0. The tiering condition on the recurrence argument
is satisfied here, but not the condition that the substitution parameters have
a common tier. By a trivial induction on n we have f(n, x, 0) = 2n · x for all
n ≥ 0.
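A quick check of this computation (ours, with tiers erased):

```python
# f(0, x, y) = y;  f(n+1, x, y) = f(n, x, (y + x) + x).
# The substitution parameter y gains 2*x per step.
def f(n, x, y):
    return y if n == 0 else f(n - 1, x, (y + x) + x)
```

So f(n, x, 0) = 2 · n · x, as claimed.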
3. Consider the truth-definition function for quantified boolean formulas, call
it true, which is well known to be poly-space complete. A definition by re-
currence (over the syntax of boolean formulas) has a key clause that defines
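The key clause for quantifiers evaluates the body twice, on parameter environments in which the bound variable is replaced by each truth value; this is recurrence with parameter substitution in miniature. The sketch below is our hedged reconstruction (the AST encoding and names are assumptions, not the paper's formal definition):

```python
# Formulas are nested tuples, e.g. ("all", "x", body).  The quantifier
# clause evaluates the body on *substituted* environments {x := False}
# and {x := True}.
def true(phi, env=None):
    env = env or {}
    op = phi[0]
    if op == "var":
        return env[phi[1]]
    if op == "not":
        return not true(phi[1], env)
    if op == "and":
        return true(phi[1], env) and true(phi[2], env)
    if op == "or":
        return true(phi[1], env) or true(phi[2], env)
    x, body = phi[1], phi[2]          # op is "all" or "ex"
    branches = [true(body, {**env, x: b}) for b in (False, True)]
    return all(branches) if op == "all" else any(branches)
```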
We continue our generic treatment of free algebras by using a generic machine model
for parallel computing. We enhance the genericity of the model by parameterizing
it with respect to a "combination function" α: when the computation at some state
s forks into its parallel subcomputations, the output is defined to be α applied to
s and to the outputs of the subcomputations. An alternating TM is then a special
case, with W as the algebra and with α(s, x, y) = if s is an existential state then
x ∨ y else x ∧ y.
A parallel register machine (PRM) over a free algebra A as above is a device
M consisting of:
1. a finite set S of states, including a distinguished state BEGIN;
2. a finite list Π = (π_1 ... π_m) of registers (i.e. distinct identifiers); we use
π, π_1, ... to range over Π, and we write OUTPUT for π_m;
3. a finite collection of commands, where a command is of one of the following
forms. Following each form we indicate its intended operational interpreta-
tion; a more formal semantic definition follows.
(a) [Constructor: s π_1 ... π_{r_i} c_i π' s'] (When in state s, store in register π' the
term resulting from applying the constructor c_i to the values stored in
registers π_1 ... π_{r_i}, and switch to state s'.)
(b) [p-Destructor: s π π' s'] (p ≤ arity(A)). (When in state s, store in regis-
ter π' the result of applying dstr_p to the value in π, and switch to state
s'.)
(c) [Branch: s π s_1 ... s_k] (When in state s, if the leading constructor of
the algebra term in register π is c_p, then switch to state s_p.)
(d) [Fork: s s_0 s_1] (Return the value α(s, a_0, a_1), where a_i is the value
returned by the computation that starts with state s_i and the current
register values (i = 0, 1).)
(e) [End] (Return the value in OUTPUT.)
• eval(0, s, F) is undefined.
• If com(s) is [Constructor: s π_1 ... π_{r_i} c_i π' s'] then
  eval(t+1, s, F) = eval(t, s', F'), where F' = {π' ← c_i(F(π_1), ..., F(π_{r_i}))}F;
• If com(s) is [p-Destructor: s π π' s'], then
  eval(t+1, s, F) = eval(t, s', F'), where F' = {π' ← dstr_p(F(π))}F;
• If com(s) is [Branch: s π s_1 ... s_k], then
  eval(t+1, s, F) = eval(t, s', F), where s' = case(F(π), s_1, ..., s_k);
• If com(s) is [Fork: s s_0 s_1], then
  eval(t+1, s, F) = α(s, eval(t, s_0, F), eval(t, s_1, F));
• If com(s) is [End] then eval(t+1, s, F) = F(OUTPUT).
Note that eval is defined by recurrence with parameter substitution, except for
the "undefined" base case.
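The eval clauses can be transcribed almost literally into executable form. In the Python sketch below, the command encodings, the restriction to the unary destructor (tail), and the example machine are our assumptions; words over {0,1} are strings, the register assignment F is a dict, and α is passed as a parameter:

```python
# A hedged transcription (ours) of eval for PRMs over W.
def make_eval(com, alpha):
    def eval_(t, s, F):
        if t == 0:
            return None                                  # eval(0, s, F) undefined
        c = com[s]
        if c[0] == "cons":                               # prepend a bit
            _, bit, src, dst, s2 = c
            return eval_(t - 1, s2, {**F, dst: bit + F[src]})
        if c[0] == "dstr":                               # tail of a word
            _, src, dst, s2 = c
            return eval_(t - 1, s2, {**F, dst: F[src][1:]})
        if c[0] == "branch":                             # case on leading constructor
            _, src, s_eps, s0, s1 = c
            w = F[src]
            return eval_(t - 1, s_eps if w == "" else (s0 if w[0] == "0" else s1), F)
        if c[0] == "fork":                               # combine parallel runs with alpha
            _, s0, s1 = c
            return alpha(s, eval_(t - 1, s0, F), eval_(t - 1, s1, F))
        return F["OUTPUT"]                               # [End]
    return eval_

# Example machine: fork two branches and combine their outputs with max.
com = {"begin": ("fork", "a", "b"),
       "a": ("dstr", "r0", "OUTPUT", "stop_a"),
       "b": ("cons", "1", "r1", "OUTPUT", "stop_b"),
       "stop_a": ("end",), "stop_b": ("end",)}
ev = make_eval(com, lambda s, u, v: max(u, v))
result = ev(3, "begin", {"r0": "01", "r1": "0", "OUTPUT": ""})
```

With α chosen as max or min on words, forks behave like the disjunctive and conjunctive states of an alternating machine, as in the proof below.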
Let τ : N → Π, and fix some canonical u ∈ A (e.g. 0 for N, ε for W). For M
and α as above, and for r ≤ m = the number of registers of M, M determines a
partial function [M]^α_τ : A^r → A defined by
3.2 Poly-time PRM-computability over W
Restricting attention to PRMs over the algebra W, we have a direct relation be-
tween poly-space and PRM-computability in poly-time.⁴ If f : W → W, let f_bit be
defined by f_bit(w, i) =_df the i'th bit of f(w). We have:
Proof. We shall use below only the fact that (1) implies (3). The implication (1)
⇒ (2) is well known [Chandra et al., 1981], and we prove here that (2) implies (3).
That (3) implies (1) is immediate from Theorem 4.3 below, and can also be easily
proved directly.
To see that (2) implies (3), suppose that f_bit is computable by a single-tape
alternating Turing machine M, in time O(n^k). Define a PRM R that simu-
lates M as follows. R has two registers, intended to hold the tape's contents
from the head to the right, and from the head's left neighbor to the left (in
reverse order), respectively. Without loss of generality each non-terminal state
of M is the source of two transitions. Corresponding to a disjunctive state s
of M, with transitions τ_1 and τ_2 leading to states s_1 and s_2 respectively, R
has a state s with a [Fork] transition leading to states s'_1 and s'_2, and tran-
sitions τ'_i simulating τ_i, and leading from s'_i to s_i (i = 1, 2). Finally, let R
compute with the combination function α(s, u, v) = if s is disjunctive (in M)
then max(u, v) else min(u, v). (Here max(u, v) = case(u, ε, case(v, ε, 0, 1), 1) and
min(u, v) = case(u, ε, 0, case(v, ε, 0, 1)).)
Suppose |f(w)| ≤ |w|^q. To compute f, construct a register machine R' that
on input w generates in some register π' the value 1^(|w|^q), initializes OUTPUT to ε,
and then enters a loop guarded by value(π') ≠ ε, each cycle of which: (1)
executes R with w as first input and the value of π' as the second; (2) appends the
output of R to the value in OUTPUT; and (3) decrements π'. □
LEMMA 4.1 Let A be any free algebra. If a function f over A is poly-time PRM-
computable, then f is definable by ramified A-recurrence with substitution. In
fact, that definition can be obtained using three tiers only (say A_0, A_1 and A_2).
LEMMA 4.2 Let A be a word algebra. If a function f over A is definable by ramified
recurrence with substitution, then it is poly-space.
The proof is in §4.3 below. Combining the facts above yields our main result:
THEOREM 4.3 Let f be a function over W. The following conditions are equivalent.
1. f is poly-space.
2. f is poly-time PRM-computable.
3. f is definable by ramified W-recurrence with substitution using three tiers only.
4. f is definable by ramified W-recurrence with substitution.
P r o o f . (1) implies (2) by Lemma 3.1. (2) implies (3) by Lemma 4.1. (3) implies
(4) trivially. Finally, (4) implies (1) by Lemma 4.2. □
Note that these characterizations use no initial functions other than the con-
structors, and no size-bounding condition.
LEMMA 4.4 Let A be a free algebra. If a function f over A is computable by a PRM
in constant time then f is definable by ramified A-recurrence as a function from
tier A_i to A_i (for any i ≥ 0).
SUBLEMMA 4.6 For every c, k > 0 there is a function power_k^c : W_2 → W_1 such
that |power_k^c(w)| = c · |w|^k.

    f(w) = eval(|power_k^c(w)|, s_1, F_0(w), ε, ..., ε)
By Sublemmas 4.5 and 4.6 it follows that f is definable by ramified recurrence with
substitution as a function W_2 → W_0. □
4.3 Simulation of recurrence with substitution in poly-space
assumption, g(x̄, ȳ, u) is computed in space O(n^β) + max(m, |u|) (for some β).
Thus, f(x̄, ȳ) is computed in space O(n^k) + m for k = max(α, β).
Case 3. tier(h) > tier(g). Then all arguments of tier ≥ tier(h) are among x̄, and so
h(x̄, ȳ) is computed in space O(n^α) for some α, by the induction assumption. Therefore
|h(x̄, ȳ)| is also O(n^α). Also, g(x̄, ȳ, u) is computed in space O((n + |u|)^β) + m for
some β. Thus f(x̄, ȳ) is computed in space O(n^k) + m where k = max(β, αβ).
Suppose, finally, that f is an r-ary function over W defined by ramified recur-
rence with substitution,
where all functions in the definition satisfy the lemma, tier(x_i) > tier(f) for each x_i,
tier(y_i) = tier(f) for each y_i, tier(z_i) < tier(f) for each z_i, and all the u's and σ_i are in a
common tier.⁵ By the induction assumption each g_c is computable in space O(n^α) + m
for a sufficiently large α, and each σ(ū) is computable in space max_i(|u_i|) + K for
a sufficiently large K. Since each g_c is independent of f, by the induction assumption,
we shall assume without loss of generality that ȳ is empty.
Combine the given Turing machines computing as required the functions g_c
and σ into a machine M that computes f as follows. Let T(w, ℓ) be the |w|-high
ℓ-branching tree
{i_1 ... i_q | 1 ≤ i_j ≤ ℓ, 0 ≤ q ≤ |w|}.
M has as many "principal tapes" as each one of the given machines, and in addition
• an input tape to store the initial values of w, x̄, ȳ, z̄,
• a tape τ for the (binary code of) the "currently visited" address in T(w, ℓ),
• a tape β for the last address in T(w, ℓ) "already calculated",
• a tape η that stores a stack of up to ℓ · |w| vectors of iterates of the vectors
of substitution functions, and
• a tape φ that stores a stack of up to ℓ · |w| intermediate values of f.
M visits all nodes of T(w, ℓ) in a depth-first style, calculates values for the vec-
tors of substitution parameters, stores these on η, and calculates corresponding values
of f, which it stores on φ. To motivate the general definition, consider the following
example.
Suppose the function f to be computed is defined by
Here ... = ... = {u_1, u_2}. With the instance 01ε = 0(1(ε)) for the first argument we could expand
where the nested values of f are precisely the two values being replaced.
Continuing in this fashion, we see that the computation has at most 2 vectors
of substitutions on η at any given time, and at most 3 values on φ. More generally,
with ℓ substitution vectors we would have at most |w| vectors on η, and (ℓ − 1) · |w| + 1
values on φ, as can be easily verified by induction on |w|.
Let us now describe the operation of M in the general case. Let h = |w|, and
let ε_i be the i'th bit of w. M starts by initializing τ to the root of T(w, ℓ), β to
the empty tape, and η to the empty stack. Then M runs the following loop (where we use a tape
name to denote the value on that tape):
differ from ū by at most C · |w| for some C (using the induction assumption for the
substitution functions). Using the induction assumption for the recurrence functions,
it follows that the values in φ are all of size O(|w| + m + (n + |w|)^k) for some k.
Thus the entire computation is in space O(N^(2k+1) + m), where N =_df max(n, |w|).
□
References
[Abiteboul and Vianu, 1989] S. Abiteboul and V. Vianu. Fixpoint extensions of first-order logic
and datalog-like languages. In Proceedings of the Fourth Annual Symposium on Logic in
Computer Science, pages 71-79, Washington, D.C., 1989. IEEE Computer Society Press.
[Abiteboul and Vianu, 1991] S. Abiteboul and V. Vianu. Datalog extensions for database queries
and updates. Journal of Computer and System Sciences, 43:62-124, 1991.
[Abiteboul et al., 1990] S. Abiteboul, E. Simon, and V. Vianu. Non-deterministic languages to
express deterministic transformations. In Proc. ACM SIGACT-SIGMOD-SIGART Symposium
on Principles of Database Systems, 1990.
[Abiteboul et al., 1994] S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-
Wesley, Reading, MA, 1994.
[Bellantoni and Cook, 1992] Stephen Bellantoni and Stephen Cook. A new recursion-theoretic
characterization of the poly-time functions, 1992.
[Bellantoni, 1992] Stephen Bellantoni. Predicative Recursion and Computational Complexity.
PhD thesis, University of Toronto, 1992.
[Bellantoni, 1994] Stephen Bellantoni. Predicative recursion and the polytime hierarchy. In Peter
Clote and Jeffery Remmel, editors, Feasible Mathematics II, Perspectives in Computer Science.
Birkhäuser, 1994.
[Bloch, 1992] Stephen Bloch. Functional characterizations of uniform log-depth and polylog-
depth circuit families. In Proceedings of the Seventh Annual Structure in Complexity Theory
Conference, pages 193-206. IEEE Computer Society Press, 1992.
[Bonner, 1988] A. J. Bonner. Hypothetical datalog: complexity and expressibility. In Proceedings
of the International Conference on Database Theory, volume 326 of LNCS, pages 144-160,
Berlin, 1988.
[Chandra et al., 1981] Ashok Chandra, Dexter Kozen, and Larry Stockmeyer. Alternation. Jour-
nal of the ACM, 28:114-133, 1981.
[Cobham, 1965] A. Cobham. The intrinsic computational difficulty of functions. In Y. Bar-Hillel,
editor, Proceedings of the International Conference on Logic, Methodology, and Philosophy of
Science, pages 24-30. North-Holland, Amsterdam, 1965.
[Fagin, 1974] R. Fagin. Generalized first order spectra and polynomial time recognizable sets. In
R. Karp, editor, Complexity of Computation, pages 43-73. SIAM-AMS, 1974.
[Goerdt, 1992] Andreas Goerdt. Characterizing complexity classes by higher-type primitive-
recursive definitions. Theoretical Computer Science, 100:45-60, 1992. Preliminary version
in: Proceedings of the Fourth IEEE Symposium on Logic in Computer Science, 1989, 364-374.
[Gurevich and Shelah, 1986] Y. Gurevich and S. Shelah. Fixed-point extensions of first-order
logic. Annals of Pure and Applied Logic, 32:265-280, 1986.
[Gurevich, 1983] Yuri Gurevich. Algebras of feasible functions. In Twenty Fourth Symposium on
Foundations of Computer Science, pages 210-214. IEEE Computer Society Press, 1983.
[Handley, 1992] W. G. Handley. Bellantoni and Cook's characterization of polynomial time func-
tions. Typescript, August 1992.
[Immerman, 1981] Neil Immerman. Number of quantifiers is better than number of tape cells.
Journal of Computer and System Sciences, 22:384-406, 1981. Preliminary version in Proceedings
of the 20th IEEE Symposium on Foundations of Computer Science (1979), 337-347.
[Immerman, 1982] N. Immerman. Upper and lower bounds for first-order expressibility. Journal
of Computer and System Sciences, 25:76-98, 1982. Preliminary version in FOCS'81, 74-82.
Hrant B. Marandjian
Institute for Informatics and Automation Problems of National Academy of Sciences
of Armenia, P. Sevak str. 1, Yerevan 375044, Armenia
E-mail: hm(+Ohmlh.armenia.su
1 Solution Existence
A variety of methods for solving equations can be found in many sources; see, for
example, [2], [3], [4], [5], [6], [13], [12], [14], and many others devoted to various
aspects of the question under consideration. In some sense most relevant to the
approach presented here are the articles [15] and [17].
Two well-known recursion theorems of S. C. Kleene [7], which play a great role in
algorithm theory, state, respectively, that
Here we show that by means of these theorems we can solve recursive equa-
tions of a more general form than (1.1) and (1.2). Equation systems in such terms
occur in many fields of mathematical logic, in algorithm theory and in their
applications. In this work we consider the equation systems of a general form
where the F_i and G_i are recursive terms³, containing the function symbols {f_j}.
In the first place we are interested in the problem of the existence of a solution of an
equation system of the mentioned form with respect to f_1, ..., f_n in the class of
partial recursive functions (PRF). It is easy to see that a system of general
form recursive equations does not always have a solution. Thus, choosing F_1 and G_1
so that the left- and the right-hand parts of the equation
case when the terms taking part in the equations have no 'inverses'. It appears
that some of those equations may have computable solutions.
In [10] a sufficient condition for the existence of a computable solution for
equations of such type was found. Here we shall present a necessary and sufficient
condition.
Let {f_i} be a set of (one-sorted) function symbols, arity(i) denote the arity
of f_i, and let C, PR and M be the operators of function composition, primi-
tive recursion and minimization, respectively. P_i^n, S and Z will denote the base
functions:
    P_i^n(x_1, ..., x_n) = x_i,  S(x) = x + 1,  Z(x) = 0.
Integer variables will be mainly denoted by x_1, x_2, ..., and sometimes, for
convenience, x, y, and integer constants by a, b, and so on. To be more precise
let us give the following definition.
Every recursive term defines a recursive functional and vice versa. Recursive
terms containing no occurrences of the function symbols f_i we call recursive function
terms.
Following [7, Ch. XII], we say that the expression t_1 ≃ t_2 is a conditional
equality if its left- and right-hand sides are conditionally equal for all instances
of function symbols and integer variables occurring in these terms.
To solve a system of recursive equations of the form (1.3) means to find
recursive function terms f̄_1, ..., f̄_n such that, being substituted in (1.3) instead of
the function symbols f_1, ..., f_n, they give a system of conditional identities.
Below we shall discuss some properties of systems of general form recur-
sive equations, i.e. the systems of equations having the form (1.3), where the F's
and G's are recursive terms. These equations seem to be similar to the equation
systems of S. C. Kleene [7, Chapter XI], but there is an important difference.
We do not restrict ourselves to equations having a principal function symbol
(principal function letter in the terms of [7]) and given ones. As a consequence of this,
the problem of solution existence becomes very difficult; see Theorem 7 below.
1. t_i[f_1, ..., f_n] is an i-fragment of (t, f);
2. Every term obtained from an i-fragment of (t, f) by replacing some occur-
rences of f_j(u_1, ..., u_{k_j}), 1 ≤ j ≤ n (where the u_i's are arbitrary terms), by
j-fragments of (t, f) is an i-fragment of (t, f).
    F_1 ≃ G_1
    ...
    F_k ≃ G_k
at which for every i at least once an i-fragment is replaced instead of f_i, and in
the result a system of identities
    F̄_1 ≃ Ḡ_1
    ...
In the given example it is not essential by what terms the operators F and
G are exactly presented. In any case they will appear to be coupled. To be
convinced of that, it is sufficient to choose a recursive term H so that

    H[f](x) ≃ (f([x/2]))² + (f([x/2] − 1))²,  if x is even,
              f(x − 1) + f(x − 2),            otherwise,          (1.7)

and then replace the external occurrence of f in the term representing F by
H[f].
It is easy to check that these replacements lead to an identity. It is easy to see
that as a solution to F[f] ≃ G[f] one can choose the Fibonacci sequence.
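The claim for (1.7) can be spot-checked. The indexing f(−1) = 0, f(0) = f(1) = 1 below is our assumption; the paper only states that the Fibonacci sequence solves F[f] ≃ G[f]:

```python
# With this (shifted) Fibonacci indexing, H[f] from (1.7) reproduces f.
from functools import lru_cache

@lru_cache(maxsize=None)
def f(n):
    if n == -1:
        return 0
    if n in (0, 1):
        return 1
    return f(n - 1) + f(n - 2)

def H(f, x):
    # H[f](x) = f([x/2])^2 + f([x/2] - 1)^2 if x even, else f(x-1) + f(x-2)
    if x % 2 == 0:
        return f(x // 2) ** 2 + f(x // 2 - 1) ** 2
    return f(x - 1) + f(x - 2)
```

Then H(f, x) == f(x) for all small x, confirming that the replacements lead to an identity.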
Let us consider another example.
and
    if x < y,
    G[f_1, f_2](x, y) ≃ f_2(x, f_1(x ∸ 2, y) ∸ 1),  otherwise,      (1.9)
where
    a ∸ b =_df  a − b, if b ≤ a,
                0,     otherwise.
1. in the term representing F preserve all the occurrences of f_1 and f_2, and
2. in the term representing G replace all the occurrences of f_1 and f_2 by
H_1[f_1, f_2] and H_2[f_1, f_2], respectively.
³ In the forthcoming part of this work we give examples substantially depending on
the terms in the equations.
has a partial recursive solution with respect to f_1, ..., f_n if and only if the system
{F_i, G_i} is coupled.
Proof. Let {F_i, G_i} be coupled. Then, by Definition 4, there exist recursive
terms H_1, ..., H_n and such a replacement of i-fragments of (H, f) in the equa-
tions F_i[f_1, ..., f_n] ≃ G_i[f_1, ..., f_n] instead of the corresponding occurrences of
the f_i's, that in the result we obtain the following system of conditional identities:
    F̄_1 ≃ Ḡ_1
    ... ≃ ...          (1.15)
Using the Multiple Recursion Theorem one can obtain a system of partial re-
cursive functions f_1, ..., f_n such that for every 1 ≤ i ≤ n we have H_i[f_1, ..., f_n] ≃
f_i. Using an induction argument on the definition of a fragment it is easy to see that
the system of functions f_1, ..., f_n is a system of fixed points of H[f]. Hence, every
i-fragment used to obtain the system (1.15) can be replaced by f_i preserving con-
ditional identity. However, the resulting system coincides with the given system
of equations. Hence, the system of functions f_1, ..., f_n appears to be a solution
to (1.14).
On the other hand, let a system of functions f_1, ..., f_n be a solution to (1.14).
Then the needed system of coupling terms H_i can be defined as follows:
To show how this theorem works let us consider two simple examples.
If we try to solve this equation by taking a logarithm then the solution can
be expressed as follows:
    f(n) ≃ n + 2,       if n < 2,
           2f(n − 1),   otherwise,
which is one of infinitely many fixed points of H.
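The displayed fixed point can be computed directly; the closed form f(n) = 3 · 2^(n−1) for n ≥ 1 is our observation, immediate by induction:

```python
# f(n) = n + 2 if n < 2, else 2 * f(n - 1): so f(0) = 2, f(1) = 3,
# and each further step doubles the value.
def f(n):
    return n + 2 if n < 2 else 2 * f(n - 1)
```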
Theorem 6. There exists a recursive term F such that the equation F[f](x) ≃ 1
has a solution (which is an everywhere defined function) but has no solution in
the class of partial recursive functions.
    F[f](x) ≃ F^(H[f](x))[f](x).
Note that F[f](x) is not the limit value (if any) of F^(n)[f](x), as it is used
usually. It is obvious that F is defined in such a way that it is uniform in f.
It is easy to see that if f is the characteristic function of K then F[f]
appears to be a total function identically equal to 1. For any f different from
the characteristic function of K there exists an integer m such that F[f](k) is
undefined for all k > m. Indeed, let us consider three possible cases. If f is not
total then, by Substep 1, F[f] will appear undefined for all the points larger than
the point of the first divergence of f. Hence, f must be total. If f takes the value
'1' for an input t belonging to K then, by Substep 2, F[f](t) is undefined and,
hence, is undefined for all the integers s > t. If f takes the value '0' for an input
q belonging to K then, by Substep 3, F[f](q) is undefined and, hence, is undefined
for all the integers s > q. Hence, the characteristic function of K is the unique
solution to the equation F[f](x) ≃ 1. If f differs from the characteristic function
of K then the function F[f](k) has a finite nonempty domain (as 0 ∈ K by the
above assumption). Hence no function other than the characteristic function of
K can be a solution to that equation.
2 Solution Complexity
It is easy to see from the proof that the problem of the existence of a solution of a
general form recursive equation in the class of total recursive functions is a Σ⁰₃-
complete problem too.
Above we considered the difficulty of solution existence in the class of partial
as well as total recursive functions in terms of the Arithmetical Hierarchy. On the
other hand, as Theorem 6 shows, some general form recursive equations can
have noncomputable solutions. Below we consider the difficulty of the problem
of solution existence for our equations in the class of all total N → N functions.
Denote by ℱ the family of all one-place total functions mapping N into N.
Proof. It is easy to see that the problem of general form recursive equation
solution existence in the class ℱ belongs to Σ¹₁. Now, denote by T_(1,1) the
Kleene Normal Form Predicate T ([7]) for recursive functionals and let E_1 =
{z | (∃f)(∃w)(T_(1,1)(z, f, z, w) = 0)}, as they are defined in [16].
Let us define P_z as follows:
References
5. G. Huet and D. C. Oppen. Equations and rewrite rules: A survey. In Formal Lan-
guage Theory: Perspectives and Open Problems, pages 349-405. Academic Press,
New York, 1980.
6. A. J. Kfoury, J. Tiuryn, and P. Urzyczyn. Computational consequences and partial
solutions of a generalized unification problem. In Proc. Fourth Annual Symposium
on Logic in Computer Science, pages 98-105, Washington, 1989.
7. S. C. Kleene. Introduction to Metamathematics. D. Van Nostrand Co., Inc., New
York, Toronto, 1952.
8. S. C. Kleene. Extension of an effectively generated class of functions by enumera-
tion. Colloquium Mathematicum, VI:67-78, 1958.
9. G. Kreisel. Interpretation of analysis by means of constructive functionals of finite
types. In A. Heyting, editor, Constructivity in Mathematics, Studies in Logic,
pages 101-128. North-Holland Publ. Co., Amsterdam, 1959.
10. H. B. Marandjian. On recursive equations. In COLOG-88, Papers presented at the
Int. Conf. on Computer Logic, Part II, pages 159-161, Tallinn, 1988.
11. H. B. Marandjian. Selected Topics in Recursive Function Theory in Computer
Science. DTH, Lyngby, 1990.
12. V. P. Orevkov. Complexity of Proofs and Their Transformations in Axiomatic
Theories, volume 128 of Transl. of Math. Monographs. American Mathematical
Society, Providence, RI, 1993.
13. R. Parikh, A. Chandra, J. Halpern, and A. R. Meyer. Equations between regular
terms and an application to process logic. SIAM J. Comput., 14(4):935-942, 1985.
14. T. Pietrzykowski. A complete mechanization of second-order type theory. JACM,
20(2):333-364, 1973.
15. A. Robinson. Equational logic of partial functions under Kleene equality: a com-
plete and an incomplete set of rules. JSL, LIV(2):354-362, 1989.
16. H. Rogers, Jr. Theory of Recursive Functions and Effective Computability.
McGraw-Hill Book Company, New York, 1967.
17. L. Rudak. A completeness theorem for weak equational logic. Algebra Universalis,
16:331-337, 1983.
18. D. Scott. Data types as lattices. SIAM J. Comput., 5(3):522-587, 1976.
Modal Logics Preserving Admissible for S4
Inference Rules
Vladimir V. RYBAKOV*
Mathematics Department, Krasnoyarsk State University
av. Svobodnyi 79, 660062 Krasnoyarsk, Russia
email: rybakov@math.kgu.krasnoyarsk.su
Abstract
The aim of the current paper is to provide a complete semantic descrip-
tion of modal logics with the finite model property which preserve all inference
rules admissible in S4, with the intention of meeting some of the needs of
computer science. It is shown that a modal logic λ with the fmp above S4 preserves
all inference rules admissible for S4 iff λ has the so-called co-cover property. It
turned out that there are continuum many logics of this kind. Using the
semantic criterion mentioned above, we give a precise description of all tabular logics
preserving the rules admissible for S4.
1 Introduction
Vardi [3]). Anyway, among the possible derivation rules some special ones can definitely
be singled out: the so-called admissible (or permissible) inference rules, those with
respect to which the logic is closed. The class of admissible rules consists of exactly
those inference rules that can be applied in proofs while preserving the set of theorems
of the given logic. Having at our disposal all admissible rules for a given logic λ, we possess
the full power of inference tools for derivations in λ.
Unfortunately, determining whether a rule is admissible is a serious problem as
well. Therefore, having at our disposal very few known criteria for recognizing ad-
missibility for individual logics, it is important to describe all logics which preserve
the admissible rules of certain logics of particular importance (say, those having criteria for
recognizing admissibility). This paper aims to provide a description of the modal logics
which preserve all admissible inference rules of the minimal transitive reflexive modal
logic S4. The basis for this research is the existence of semantic and algorithmic
criteria for recognizing admissibility in the modal logics S4 and Grz (cf. [9, 10]).
Note that a description of all intermediate (superintuitionistic) logics which preserve
all admissible derivation rules of intuitionistic propositional logic is given in [11].
We follow in general the line of that paper, but the modal case is technically more
difficult.
and S_n(M) is the set of all elements from M with depth not more than n. If c ∈ M then

cR^< := {x | x ∈ M, cRx, ¬(xRc)},
cR^≤ := {x | x ∈ M, cRx}.

If Y ⊆ M then

Y R^< := {x | x ∈ M, ∃y ∈ Y (yRx, ¬(xRy))},
Y R^≤ := {x | x ∈ M, ∃y ∈ Y (yRx)}.
For any frame F, λ(F) is the modal logic generated by F (that is, the set of all formulae which are true on F under any valuation of the variables). F^+ designates the modal algebra associated with F. Var(λ) denotes the variety of all modal algebras on which all theorems of λ are true.
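Truth of a modal formula on a finite frame under a given valuation, as used in the definition of λ(F), can be checked mechanically. The following Python sketch is our own illustration (the tuple encoding and all names are ours, not the paper's): formulas are nested tuples, a frame is a set of worlds W with an accessibility relation R given as a set of pairs.

```python
# Minimal sketch: truth of modal formulas on a finite Kripke frame.
# Formulas: ('var', 'p'), ('not', f), ('and', f, g), ('box', f).
# R is a set of ordered pairs over W; V maps letters to sets of worlds.

def holds(w, f, R, W, V):
    op = f[0]
    if op == 'var':
        return w in V[f[1]]
    if op == 'not':
        return not holds(w, f[1], R, W, V)
    if op == 'and':
        return holds(w, f[1], R, W, V) and holds(w, f[2], R, W, V)
    if op == 'box':  # true iff the subformula holds at every R-successor of w
        return all(holds(v, f[1], R, W, V) for v in W if (w, v) in R)
    raise ValueError(op)

# A two-element reflexive chain a -> b (a transitive reflexive S4 frame).
W = {'a', 'b'}
R = {('a', 'a'), ('b', 'b'), ('a', 'b')}
V = {'p': {'b'}}

assert holds('b', ('box', ('var', 'p')), R, W, V)      # box p at the top point
assert not holds('a', ('box', ('var', 'p')), R, W, V)  # but not at the root
```

A formula belongs to λ(F) when it holds at every world under every valuation; on small frames this can be checked by enumerating valuations.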
If W_1 := ⟨W_1, R, V⟩ and W_2 := ⟨W_2, R, V⟩ are Kripke models, W_1 ⊆ W_2, and R and V coincide on the common elements of both models, then we say W_1 is an open submodel of W_2 iff the following holds
We define the model G_{k+1} as follows: the base set of G_{k+1} is |M_k| ∪ S_1 ∪ Λ(M_k). The valuation V of any letter p_i from P_n is the set of all elements from M_k on which p_i is valid under V, united with the set of all elements from S_1 on which p_i is V-valid. The relation R on G_{k+1} is the transitive closure of the relation on M_k, the relations on the clusters from S_1, and the relation Q, where
of T(n), and T(n) = Ch_S4(n). We will often use these observations and properties of T(n) in what follows.
In order to show that Ch_λ(n) is n-characterizing for λ we need one more definition. Let K := ⟨K, R, V⟩ be a Kripke model and let W be a new valuation of some propositional variables on the frame ⟨K, R⟩. The valuation W is called expressible (or formulistic, or definable) if and only if for any letter p_i from the domain of W there exists a formula A_i such that W(p_i) := V(A_i). A subset X of K is called expressible (or formulistic, or definable) iff there exists a formula A such that

X = {x | x ∈ K, x ⊩_V A}.
Lemma 2.2 The Kripke model Ch_λ(n) is an n-characterizing model for the logic λ. Every element of this model is expressible.
3 Preservation of admissibility
The notion of preserving the admissibility of inference rules follows naturally from the essence of admissibility and describes the properties of derivations in axiomatic systems of modal logics.
Definition. A modal logic λ preserves all inference rules admissible for S4 if every rule admissible for S4 is also admissible for λ.
Now we introduce a semantic notion which reflects the preservation of admissibility. Let F := ⟨F, R⟩ be a frame and let X be an antichain of clusters from F. We say that an element c from F is a co-cover for X if cR^< = XR^≤ and {c} forms a cluster. Let F be a finite rooted frame.
Definition. A frame M is said to be a co-covers poster of F if M is obtained from F by the addition of a finite number of elements as follows. Let F_0 := F. In every step i of the construction we add to the frame F_i a single new element c in the following way: we choose some non-trivial (i.e. containing at least two clusters) antichain X of clusters in F_i which has no co-cover in F_i (if there is any) and add to F_i the new element c which is a co-cover for X. After a finite number of steps i = 1, 2, ..., n this procedure terminates, and the resulting frame is M.
Definition. We say that a logic λ has the co-cover property if for any rooted finite λ-frame F every co-covers poster of F is a λ-frame.
We say a rule r is disprovable on a frame F by a valuation V if all premises of r are valid on F with respect to V but the conclusion of r is false on F under V.
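On a finite frame the co-cover condition cR^< = XR^≤ can be tested directly. The following Python sketch is our own illustration (the helper names are not from the paper); it checks only the equality of the two sets, leaving the requirement that {c} be a singleton cluster to the surrounding construction.

```python
# Sketch: testing the co-cover condition c R^< = X R^<= on a finite frame.
# R is a transitive reflexive relation given as a set of pairs over worlds W.

def strictly_above(c, R, W):
    # cR^< : elements x with cRx but not xRc (strictly above c's cluster)
    return {x for x in W if (c, x) in R and (x, c) not in R}

def upward_closure(X, R, W):
    # XR^<= : elements reachable from some member of X (R is reflexive)
    return {x for x in W for y in X if (y, x) in R}

def is_cocover(c, X, R, W):
    return strictly_above(c, R, W) == upward_closure(X, R, W)

# A frame with root r, two incomparable points x1, x2, and a top t.
W = {'r', 'x1', 'x2', 't'}
R = ({(w, w) for w in W}
     | {('r', 'x1'), ('r', 'x2'), ('r', 't'), ('x1', 't'), ('x2', 't')})

# r sees exactly the upward closure of the antichain {x1, x2},
# so r is a co-cover for it; x1 is not a co-cover for {x2}.
assert is_cocover('r', {'x1', 'x2'}, R, W)
assert not is_cocover('x1', {'x2'}, R, W)
```

Iterating such a check over all cluster antichains is exactly what the co-covers poster construction above requires.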
Lemma 3.1 If a modal logic λ ⊇ S4 has the co-cover property and has the finite model property then λ preserves all inference rules admissible for S4.
Lemma 3.2 If a modal logic λ preserves all inference rules admissible for S4 and has the finite model property then λ has the co-cover property.
Proof. Let λ be a logic which possesses the fmp and does not have the co-cover property. This means there is a subframe F := aR^≤ of Ch_λ(n) generated by a cluster a, and a sequence F_0, ..., F_k of generated finite subframes of Ch_λ(n) with the following properties:
a) F_0 := F.
b) All antichains of clusters from F_i of depth less than i have co-covers in F_i.
c) The frame F_{i+1} is obtained from F_i, when i < k, as follows: we consider the set A_i of all non-trivial antichains of clusters from F_i having at least one cluster of depth i. If all antichains from A_i have co-covers in F_i then F_{i+1} := F_i; otherwise, for every antichain X from A_i with a missing co-cover we add to F_i a single co-cover for X.
d) There exists an antichain A of clusters from F_k which has no co-cover in Ch_λ(n) and has a cluster of depth k.
Then, in particular, all antichains of F_k with depth of elements not more than k − 1 have co-covers in F_k itself. Moreover, we can choose the sequence F_1, ..., F_k with the smallest number k of members and the properties listed above. Now we need some special modal formulae in order to express certain properties of the model Ch_λ(n). Let M := S_k(Ch_λ(n)) ∪ F_k and M_1 := S_{k+1}(Ch_λ(n)) ∪ M. Any cluster from Ch_λ(n) is a subset of M_1 (or M) or has empty intersection with M_1 (M, respectively). For each cluster i ⊆ M_1 we introduce a new propositional variable p_i (new means that p_i is not a variable from the domain of the valuation of Ch_λ(n)). Now we introduce special formulae f(i) for the clusters i of M_1 by induction on the depth of i. If i, j are clusters then iRj means that all elements of j are accessible by R from all elements of i. If a cluster i belongs to S_1(M_1) then f(i) := □p_i.
If the formulae f(i) are already introduced for all clusters i from S_t(M_1) and j is a cluster from S_{t+1}(M_1) then we put
where a is the generating element of F_0. We choose the following valuation W of the variables p_i from r on Ch_λ(n): W(p_i) := {j | iRj}. It can be verified directly, by induction on depth, that for each cluster i from M_1 and any a ∈ i, a ⊩_W f(i) holds. Also, for each cluster i from Ch_λ(n) − M_1 and every a ∈ i, a ⊩_W g is valid. Thus r is disproved in Ch_λ(n) by the valuation W, that is
Then there is b_a ∈ T(n) such that b_a ⊮_W f(a). It is easy to see from the definition of f(i) that

∀b_i ∈ T(n) (b_i ⊩_W f(i) & j ⊆ M_1 & (iRj) & ¬(jRi) ⟹ ∃b_j (b_i R b_j & b_j ⊩_W f(j)))   (3)

Suppose that b_i ⊩_W f(i), b_j ⊩_W f(j) and b_i R b_j. If ¬(iRj) then b_j ⊩_W ¬□p_i. But b_i ⊩_W f(i), therefore b_i ⊩_W □p_i. This and b_i R b_j imply b_j ⊩_W □p_i, a contradiction. Hence

∀b_i, b_j ∈ T(n) (b_i ⊩_W f(i) & b_j ⊩_W f(j) & ¬(iRj) ⟹ ¬(b_i R b_j)).   (4)
From (4) we get
Theorem 3.3 A modal logic λ with the fmp preserves all inference rules admissible in S4 if and only if λ has the co-cover property.
It seems that all modal logics with the fmp and the co-cover property have very similar classes of finite frames approximating them. So at first glance it looks very likely that the class of all such logics is not very rich and does not have many representatives. But in reality this is not the case.
Theorem 3.4 There are continuum many modal logics over the modal system Grz which preserve all inference rules admissible in S4 and have the fmp.
G^+ ∈ HS(F_1^+ × ... × F_n^+)
because G^+ is finite. Then, using the correspondence between homomorphic images, subalgebras and direct products of finite modal algebras and generated subframes, p-morphic images and disjoint unions of their corresponding frames, we have that G is a generated subframe of a frame D, where
expresses this property: for any rooted frame F, F ⊩ φ_n ⟺ F has width not more than n. We say that a logic λ has width not more than n if φ_n ∈ λ, and that it has width more than n if φ_n ∉ λ.
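On a finite frame the width condition can also be checked by brute force: the width is the size of a largest set of pairwise R-incomparable elements. A small illustrative sketch (the function name and example frame are ours, not the paper's):

```python
# Sketch: the width of a finite quasi-ordered frame, computed as the size
# of a largest antichain (pairwise R-incomparable elements), by brute force.
from itertools import combinations

def width(W, R):
    ws = list(W)
    for k in range(len(ws), 0, -1):
        for sub in combinations(ws, k):
            if all((x, y) not in R and (y, x) not in R
                   for x, y in combinations(sub, 2)):
                return k  # found an antichain of size k
    return 0

# A root below two incomparable points: width 2.
W = {'r', 'x', 'y'}
R = {(w, w) for w in W} | {('r', 'x'), ('r', 'y')}
assert width(W, R) == 2
```

This exponential search is fine for the small frames M_1, ..., M_5 used below; for the logic-level notion one quantifies over all rooted frames of the logic via φ_n.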
M_1 := ⟨{a, b_1, b_2, c_1, d}, <⟩, a < b_1, a < c_1, b_1 < b_2, c_1 < d, b_2 < d.
M_2 := ⟨{a, b_1, b_2, c_1}, <⟩, a < b_1, a < c_1, b_1 < b_2.
M_3 := ⟨{a, b_1, b_2, c_1, c_2}, <⟩, a < b_1, a < c_1, b_1 < b_2, c_1 < c_2, b_1 < c_1.
Any tabular modal logic λ always has a representation λ := λ(F_1 ⊔ ... ⊔ F_n), where F_1 ⊔ ... ⊔ F_n is the disjoint union of some rooted finite frames and the logics λ(F_i) are pairwise incomparable. We fix some such representation for any given tabular logic.
Lemma 3.5 A tabular logic λ := λ(F_1 ⊔ ... ⊔ F_n) preserves all derivation rules admissible for S4 iff each logic λ(F_i) preserves all of them.
Proof. Indeed, if some logic λ(F_i) fails to preserve a rule A/B admissible for S4, then the logic λ fails to preserve the rule □A ∨ □z / □B ∨ □z, which is admissible for S4 by the disjunction property of S4. The converse is evident. ∎
Theorem 3.7 A tabular logic λ := λ(F_1 ⊔ ... ⊔ F_n) preserves all inference rules admissible for S4 iff each F_j is proper rooted, λ has width not more than 2, and for all i, 1 ≤ i ≤ 5, λ ⊈ λ(M_i).
4 Comments on applications
Applying the results of this paper and other results concerning the admissibility of inference rules in computer science rests on some natural and simple observations. Any inference rule r can be understood as an atomic instruction in a declarative programming language. Correspondingly, any finite axiomatic system A can be interpreted as a program P which is intended to solve a task (in our particular case the task is to derive theorems). According to this interpretation, an inference rule r admissible for A is an atomic programming instruction which can be consistently added to P; that is, adjoining r to P does not radically change the behavior of P: the output states will be the same. Thinking of the admissibility of derivation rules, we can keep this interpretation in mind. This appears to be rather close to research in declarative programming languages like PROLOG.
Verification of programs in declarative programming languages also sometimes needs to clarify whether an atomic programming instruction can be consistently adjoined to the program, which again has the interpretation through admissibility mentioned above. Besides, finite reflexive transitive frames can be understood as transition systems of arbitrary nature. The logic λ(D) of some collection D of such frames is just the equational theory of the chosen transition systems in the modal language. As is easily seen, the inference rules admissible for λ(D) describe certain meta-properties of this theory. Of course, these comments only sketch the line; anyway, for some individual particular cases this approach can be properly developed.
References
[1] van Benthem J. A Manual of Intensional Logic. Lecture Notes, CSLI, Stanford, 1988, 135pp.
[2] Clarke E.M., Grumberg O., Kurshan B.P. A Synthesis of Two Approaches
for Verifying Finite State Concurrent Systems. Lecture Notes in Computer
Science, No. 363, 1989, Logic at Botik'89, Springer-Verlag, 81-90.
[3] Fagin R., Halpern J.Y., Vardi M.Y., What is an Inference Rule. J. of Symbolic Logic, 57 (1992), No. 3, 1018-1045.
[4] Goldblatt R.I. Metamathematics of Modal Logics, Reports on Mathematical
Logic, Vol. 6(1976), 41 - 78 (Part 1), 7(1976), 21- 52 (Part 2).
[5] Konolige K., On the Relation between Default and Autoepistemic Logic. Ar-
tificial Intelligence, V. 35 (1988), 343-382.
[6] Larsen K.G., Thomsen B. A Modal Process Logic, in Proceedings of Third
Annual Symposium on Logic in Computer Science, Edinburgh, 1988.
[7] Moore R.C., Semantical Considerations on Non-monotonic Logic. Artificial Intelligence, V. 25 (1985), 75-94.
[8] Rautenberg W. Klassische und nichtklassische Aussagenlogik, Braunschweig/Wiesbaden, 1979.
[9] Rybakov V.V. Problems of Admissibility and Substitution, Logical Equations and Restricted Theories of Free Algebras. Proceedings of the 8th International Congress of Logic, Methodology and Philosophy of Science. Elsevier Sci. Publ., North-Holland, Amsterdam, 1989, 121-139.
[10] Rybakov V.V. Problems of Substitution and Admissibility in the Modal System Grz and Intuitionistic Calculus. Annals of Pure and Applied Logic, V. 50, 1990, 71-106.
[11] Rybakov V.V. Intermediate Logics Preserving Admissible Inference Rules of
Heyting Calculus. Math. Logic Quart. V. 39 (1993), 403 - 415.
[12] Shvarts G.F. Gentzen Style Systems for K45 and K45D. Lecture Notes in
Computer Science, No. 363, 1989, Logic at Botik'89, Springer-Verlag, 245 -
255.
A Bounded Set Theory with Anti-Foundation Axiom and Inductive Definability*
D(G, y) = {D(G, x) : (x, y) ∈ G}, or = {D(G, x) : x →_G y}.
⟨G, a⟩ =₀ ⟨G′, a′⟩ ⇌ ∃R.(Bis(R, G, G′) & aRa′) and
⟨G, a⟩ ∈₀ ⟨G′, a′⟩ ⇌ ∃b′ →_{G′} a′. ⟨G, a⟩ =₀ ⟨G′, b′⟩.
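For a finite well-founded graph the decoration equation above can be executed literally as a recursion. The following Python sketch is our own illustration (all names are ours), with frozensets standing in for hereditarily finite sets; the anti-founded case of course needs the bisimulation-based relations =₀ and ∈₀ instead.

```python
# Sketch: the decoration D(G, y) = {D(G, x) : x -> y} of a finite
# well-founded graph, with frozensets as hereditarily finite sets.

def decorate(edges, y):
    # edges is a set of pairs (x, z) meaning x ->_G z
    return frozenset(decorate(edges, x) for (x, z) in edges if z == y)

# Graph 0 -> 1 -> 2: node 0 decorates to the empty set,
# node 1 to {emptyset}, node 2 to {{emptyset}}.
edges = {(0, 1), (1, 2)}
assert decorate(edges, 0) == frozenset()
assert decorate(edges, 2) == frozenset({frozenset({frozenset()})})
```

On a graph with cycles this recursion would not terminate, which is exactly why AFA replaces it by the existence and uniqueness of decorations characterized via bisimulation.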
Theorem 1 [Fer]. KP₀ + Power ⊢ "⟨𝔾, =₀, ∈₀⟩ ⊨ KP₀ + Power + AFA".
It was essentially used in the proof of this theorem that the Σ-relations =₀ and ∈₀ become Δ in the presence of Powerset, which seems too strong an axiom from
* Supported by the G. Soros Foundation and by the Russian Basic Research Foundation (project 93-011-16016).
² In Bounded Set Theory [Saz85, Saz87, Saz93] (exactly corresponding to PTIME-computability) the analogous notion of A. Mostowski's collapsing was used instead of the decoration operation (cf. [Bar75], Ch. I.0.13), however only for the case of well-founded graphs, where both notions are equivalent. Also, the direction of edges in our graphs is taken as in [Bar75] and in our previous papers, just converse to that in [Acz88].
Δ_R-formulas ::= a ∈ b | a = b | φ & ψ | φ ∨ ψ | ¬φ | ∀x ∈ a.φ | ∃x ∈ a.φ
Δ_R-terms ::= set-variables p, q, x, y, ... | {a, b} | ∪a |
{t(x) : x ∈ a & φ(x)} | the-least p.(p = {x ∈ a : φ(x, p)})
where a, b, t are Δ_R-terms and φ, ψ are Δ_R-formulas, the set-variables x, p are not free in a, and all occurrences of the set variable p in the Δ_R-formula φ are only of the form '· ∈ p', positive, and not inside any complex subterm of φ. The condition on p guarantees that φ is monotonic under set inclusion on this set variable, and therefore (at least in the framework of classical set theory) the least such set p must exist. In fact, we will also consider the closure of Δ_R-formulas under &, ∨, ¬ and unbounded ∀ and ∃, with the ordinary definition of the bounded quantifiers of Δ_R via unbounded ones.
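The the-least construct denotes the least solution of a monotone set equation; over a finite set a it can be computed by iterating from the empty set. The following sketch is our own illustration (the helper name and example are ours, not the paper's).

```python
# Sketch: the-least p.(p = {x in a : phi(x, p)}) for phi monotone in p,
# computed by iteration from the empty set (terminates for finite a).

def least_fixpoint(a, phi):
    p = set()
    while True:
        q = {x for x in a if phi(x, p)}
        if q == p:           # reached the fixpoint
            return p
        p = q                # monotonicity keeps the iterates growing

# Example: x enters p if x == 0 or its predecessor is already in p;
# the least solution over a = {0..5} is all of a.
a = set(range(6))
assert least_fixpoint(a, lambda x, p: x == 0 or (x - 1) in p) == a
```

The positivity restriction on p in the grammar is what guarantees, in general, the monotonicity this iteration relies on.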
529
Extensionality: a ⊆ a′ & a′ ⊆ a ⟹ a = a′,
Pair: s ∈ {a, b} ⟺ (s = a ∨ s = b),
Union: s ∈ ∪a ⟺ ∃x ∈ a.(s ∈ x),
Δ_R-Image: s ∈ {t(x) : x ∈ a & φ(x)} ⟺ ∃x ∈ a.(φ(x) & s = t(x)),
Recursive Δ_R-Separation: If p₀ = the-least p.(p = {x ∈ a : φ(x, p)}) then
p₀ = {x ∈ a : φ(x, p₀)} & ({x ∈ a : φ(x, p)} ⊆ p ⟹ p₀ ⊆ p),
x ⊆ TC(x), Tran(TC(x)),
(x ⊆ y ⟹ TC(x) ⊆ TC(y)) and (Tran(y) ⟹ TC(y) ⊆ y).
We split AFA into two axioms: AFA₁, the existence or correctness of the decoration,
Here we take, formally, "⟨𝔾, =₀, ∈₀⟩ ⊨ φ(x̄)" ⇌ ∀x̄ ∈ 𝔾.[[φ]](x̄), where [[·]], defined below, is a (Δ_R-)translation of any (Δ_D-)formula φ which will give the intended meaning of φ in the universe ⟨𝔾, =₀, ∈₀⟩. Simultaneously, we must properly define (by structural induction), for any Δ_D-term t = t(x̄), a corresponding Δ_R-term [[t]]. In particular, we will have KPR₀ ⊢ ∀x̄ ∈ 𝔾.([[t]](x̄) ∈ 𝔾). Some denotations used in the following inductive definition of [[·]] will be explained in Appendix 4.1. However, they have suitable mnemonics. Let us define here only the point-changing operation for any pg ⟨G, x⟩ by ⟨G, x⟩ * y ⇌ ⟨G, y⟩ and also elements(G, x) ⇌ {y ∈ field(G) : y →_G x}. Then let
[[{t(x) : x ∈ a & φ(x)}]] =
Here g, p are considered as arbitrary pg's in 𝔾, and g̃ denotes the pointed factor-graph of g = ⟨G, π⟩ under the largest bisimulation (equivalence) relation ~ on G ⇌ graph(g), with any (x, y) ∈ G defining an edge between the corresponding equivalence classes [x], [y], and with the point π of g defining the point [π] of g̃. Then it is clear that any such g̃ is isomorphic to some pointed "transitive set" of the universe ⟨𝔾, =₀, ∈₀⟩/=₀.
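The largest bisimulation on a finite graph, used above to form the factor-graph g̃, can be computed by the usual greatest-fixpoint refinement: start from the all-pairs relation and delete pairs whose children cannot be matched. The sketch below is our own illustration (all names are ours, not the paper's); edges (x, y) are read as x →_G y, in the paper's orientation.

```python
# Sketch: largest bisimulation on a finite graph, by greatest-fixpoint
# refinement of the all-pairs relation.

def largest_bisimulation(nodes, edges):
    def children(y):
        return {x for (x, z) in edges if z == y}

    rel = {(a, b) for a in nodes for b in nodes}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(rel):
            ca, cb = children(a), children(b)
            # each child of a must be matched by a related child of b,
            # and vice versa
            ok = (all(any((x, y) in rel for y in cb) for x in ca)
                  and all(any((x, y) in rel for x in ca) for y in cb))
            if not ok:
                rel.discard((a, b))
                changed = True
    return rel

# Two codings of {emptyset}: a -> b versus c -> e <- d.
nodes = {'a', 'b', 'c', 'd', 'e'}
edges = {('a', 'b'), ('c', 'e'), ('d', 'e')}
bis = largest_bisimulation(nodes, edges)
assert ('b', 'e') in bis      # b and e are bisimilar points
assert ('a', 'b') not in bis  # emptyset vs {emptyset} are not
```

Factoring the vertex set by this relation yields the quotient g̃ described in the text.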
We can prove in KPR₀ the congruence property of the equivalence relation =₀ with respect to the above definition of the translation semantics [[·]] by using structural induction on any Δ_D-formula φ(ḡ) and Δ_D-term t(ḡ).
Congruence Lemma 3.
Note that in the proof of this lemma, given in Appendix 4.2, it is simultaneously shown that for bounded quantifiers the equivalence
[[·]], i.e. they hold in ⟨𝔾, =₀, ∈₀⟩, provably in KPR₀. For example, the translations of the (logical) equality axioms (cf. [Men64]), for any formula φ(v) and terms a(v), t and s, are deciphered, respectively, as

[[t]] =₀ [[t]]  and  [[t]] =₀ [[s]] ⟹ ([[φ(t)]] ⟺ [[φ(s)]]) & ([[a(t)]] =₀ [[a(s)]])

and we also should use, besides the above congruence lemma, the following theorems of KPR₀, proved by induction on φ and a with any t:
This also allows us to prove the translation of the quantifier axiom ∀v.φ(v) ⟹ φ(t):

∀v ∈ 𝔾.[[φ]](v) ⟹ [[φ(t)]]
as well as all propositional axioms. We finish the logical part by noting that the logical inference rules from [Men64], just Modus Ponens, φ, φ ⟹ ψ / ψ, and Generalization, φ(x)/∀x.φ(x), also preserve validity in ⟨𝔾, =₀, ∈₀⟩. All of this means that our semantics is logically correct.
Finally, and most importantly, we can prove in KPR₀ the translations of all non-logical axioms of BSTA (cf. Appendix 4.3).
abbreviated below as P_φ (= P_φ(·) = P_φ(·, ȳ, Q̄)). Here x̄, ȳ and P, Q̄ are, respectively, all the free individual and predicate variables of φ, and the predicate variable P is required to have only positive occurrences in φ. The last condition guarantees that φ is monotonic in the predicate variable P, and therefore (at least in the framework of classical set theory) the least required value of P must exist. This new predicate construct may participate in other L_ID-formulas with any nesting, except that its predicate variable P is not allowed to appear in the corresponding body φ(x̄, P) inside any other such construct which is a part of
In particular, when φ does not depend on P, these axioms essentially give rise just to the ordinary comprehension axiom for first-order formulas.
A semantics of this language and calculus in the framework of any set-theoretic universe V is defined via fixing a second-order structure M = ⟨M, P⟩ consisting of a many-sorted first-order structure M ∈ V (e.g. with sorts being just (the sets of vertices of) some pg's ḡ = g₁, ..., g_n in 𝔾, in which case M is essentially a tuple or a join ⟨g₁, ..., g_n⟩ of one-sorted structures) together with some class P ⊆ V of many-sorted predicates over M. It is required that (the value of) P_φ = P_φ(·, ȳ, Q̄) is always in P when the parameters ȳ, Q̄ are in M, and also that both the above axioms hold in M. In the case where V satisfies KPR₀ it suffices to take as P just all the subsets ∈ V of the corresponding direct products of first-order sorts, and we will also write M instead of M. Let us call such an M, induced by M ∈ V, an inner model in V of L_ID. Bearing in mind this case, the Recursive Δ_R-Separation axiom of KPR₀ allows us to interpret the the-least construct of L_ID by the the-least construct of KPR₀. Evidently, "M ⊨ φ" is represented by a Δ_R-formula of KPR₀ for any φ ∈ L_ID. Then let "V ⊨ φ" abbreviate the (non-Δ_R-)formula ∀M ∈ V.∀̄(M ⊨ φ), where ∀̄ denotes universal quantification over all free individual and predicate variables of φ (ranging eventually over V). Evidently, this gives a natural embedding of L_ID in KPR₀ (and, therefore, also in BSTA):
which is, in fact, conservative according to Corollary 8 below. It is well known that (for variations of the language of L_ID)
The translation [[t]] of Δ_D-terms defined in Sect. 2 gives for each such term an operation over pointed graphs (λḡ ∈ 𝔾.[[t]](ḡ)) : 𝔾ⁿ → 𝔾. This operation was defined by a Δ_R-term, and its values matter only up to an isomorphism of pg's. Analogously, the translation [[φ]](ḡ) of a Δ_D-formula gives a predicate over 𝔾 which is also expressed by a Δ_R-formula.
As in the case of a well-founded universe [Saz91], we can imitate these translations in a more elementary L_ID-language (than Δ_R, which deals with "nested" sets and predicates) with binary predicate constants and individual constants, a pair for each sort, corresponding to each input pointed graph. We may just redefine the translations [[φ]](ḡ) and [[t]](ḡ), written now respectively as L_ID-formulas ((φ))(ȳ) and ((t))(x̄, ȳ, c̄), with no free individual variables in the first, with two designated lists of all free variables x̄, ȳ in the second, and with one list c̄ of constants in both (just a list of the points of the given input pg's ḡ in any order and possibly with repetitions), all of the same nonzero length (possibly different from that of ḡ). This may be done in full correspondence with the old [[·]]-translation:
all of these being provable in KPR₀ (with the isomorphism definable by a Δ_R-term).
Then it can be proved that the ((·))-translations of all Δ_D-axioms of BSTA (i.e. all except Δ_D-Collection^5), including the appropriate logical axioms for the Δ_D-language, are provable in L_ID. Also, the logical inference rules for the Δ_D-language preserve validity in the corresponding many-sorted structures under this translation. All this gives rise to
On the other hand, any model M = ⟨M, P⟩ of L_ID gives rise to a supply 𝔾_M of pointed graphs, just graphs with tuples as vertices defined by P-predicates λx̄ȳ.P(x̄, ȳ), each one endowed with a corresponding tuple of elements m̄ of M as a point. The relations =₀^M and ∈₀^M, naturally definable in L_ID, also allow us to consider the triple ⟨𝔾_M, =₀^M, ∈₀^M⟩ in almost the same way as ⟨𝔾, =₀, ∈₀⟩ in Theorem 2 (just by an imitation of the whole argument).
4 Appendix
4.1 Some technical definitions needed for [[·]]
Let us define for every pg ⟨G, a⟩

point(G, a) = a,  graph(G, a) = G,  vertices(G, a) = field(G) ∪ {a},
elements-of-elements(G, a) =
    {x ∈ vertices(G, a) : ∃y ∈ vertices(G, a).(x →_G y →_G a)},
elements*(G, a) ⇌ {x ∈ vertices(G, a) : (x →_G ... →_G a)},
new(v) = {x ∈ v : x ∉ x},  new-vertex(G, a) ⇌ new(vertices(G, a)),
∈x ⇌ (collect A in G x),
the-unique x ∈ t.φ(x) ⇌ ∪{x ∈ t : φ(x)}.
All the above constructs are easily definable by Δ_R-terms. However, only the cases of =₀, ∈₀, elements* and ~ require Recursive Δ_R-Separation.
with corresponding x and x′ and q and q′ via the given bisimulation R. Here we say that sets of vertices q and q′ correspond by R if for each v ∈ q there exists a v′ ∈ q′ such that vRv′, and vice versa. We also need to use the fact that for each set q ⊆ elements[[a]](ḡ) satisfying the inequality
Also, if q₀ is the least solution of the first (in)equality then R(q₀) is the least solution of the second. Indeed, if q′ is any solution of the second inequality then (as above) R⁻¹(q′) is a corresponding solution for q of the first, and therefore q₀ ⊆ R⁻¹(q′). Any x′ ∈ R(q₀) is related by the bisimulation R to some x ∈ q₀. It follows that x is an element of the right-hand side of the first inequality (which proves to be just an equality) with q₀ in place of q, and therefore, by monotonicity, with R⁻¹(q′) in place of q. Then the appropriate bisimulation gives rise to the analogous membership of x′ in the right-hand side of the second inequality for q′. Since this right-hand side is included in q′, we have x′ ∈ q′ and therefore R(q₀) ⊆ q′, which means that R(q₀) is the least solution q₀(ḡ′), as required. It follows also that q₀ = q₀(ḡ) is related by the required bisimulation relation to R(q₀) = q₀(ḡ′). This finally gives (collect q₀(ḡ) in graph[[a]](ḡ)) =₀ (collect q₀(ḡ′) in graph[[a]](ḡ′)).
• Extensionality follows from the following easy consideration. For any two pg's ⟨G, a⟩ and ⟨G′, a′⟩, if for any x →_G a there exist x′ →_{G′} a′ and a bisimulation relation R such that xRx′, and vice versa, then the same holds for the unique largest bisimulation relation ~ between G and G′ (extending all these R), which must also connect a and a′. It follows that ⟨G, a⟩ =₀ ⟨G′, a′⟩.
Analogously, if [[⟨G, a⟩ ⊆ ⟨G′, a′⟩]] then for some set A ⊆ elements(G′, a′) we have ⟨G, a⟩ =₀ (collect A in G′).
where all free variables such as p are supposed to run over 𝔾. To show this, let us present the above form of the Recursive Δ_R-Separation axiom for q₀:
where all free variables such as q run over arbitrary sets (of the original universe V of KPR₀). Then to prove the above [[·]]-translation of Recursive Δ_D-Separation it is sufficient to note that

∀x ∈ elements[[a]].∃y ∈ 𝔾.[[φ]]([[a]] * x, y) ⟺
∃z ∈ 𝔾.∀x ∈ elements[[a]].∃v ∈ elements(z).[[φ]]([[a]] * x, z * v).
• Consider the [[·]]-translations of the AFA axioms. First note that AFA₁ may also be rewritten as a conjunction of the following two formulas:
The first two formulas are evident. For the third formula, note that the set {u ∈ elements*(x) : ∃w ∈ elements*(y).(x * u =₀ y * w)} contains elements*(x), because it satisfies the corresponding equation for elements*(x). Analogously, the antecedent of the last formula implies that the set {w ∈ vertices(x) : ∃w′ ∈ elements(x).(x * w =₀ x * w′)} includes elements*(x), which is essentially what we need.
5 Acknowledgments
The author is grateful to anonymous referees for comments and detailed sugges-
tions on improving the presentation of this paper.
References
[Acz88] P.Aczel, Non-well-founded sets, CSLI LN N14, Stanford, 1988.
[Acz77] P.Aczel, Introduction to the theory of inductive definitions, in: Handbook of
Math. Logic, J.Barwise ed., North-Holland, Amsterdam, 1977.
[Bar75] J.Barwise, Admissible sets and structures. Berlin, Springer, 1975
[Jäg86] G.Jäger, Theories for admissible sets. A unifying approach to proof theory, Studies in Proof Theory, Lecture Notes 2, Bibliopolis, 1986.
[Fef70] S.Feferman, Formal theories for transfinite iterations of generalized induc-
tive definitions and some subsystems of analysis, In: Intuitionism and Proof
Theory, eds. A.Kino, et al., Amsterdam, North-Holland, 1970, 303-326.
[Fer] T.Fernando, A primitive recursive set theory and AFA: on the logical complexity of the largest bisimulation, Report CS-R9213, ISSN 0169-118X, CWI, P.O.Box 4079, 1009 AB Amsterdam, The Netherlands.
[Gan74] R.O.Gandy, Set-theoretic functions for elementary syntax, in: Proc. in Pure
Math., Vol 13, Part II (1974) 103-126.
[Gur83] Y.Gurevich, Algebras of feasible functions, in: FOCS'83 (1983) 210-214.
[Gur84] Y.Gurevich, Towards logic tailored for computational complexity, in: LNM
1104, Springer, Berlin (1984) 175-216.
[GS86] Y.Gurevich and S.Shelah, Fixed point extensions of first-order logic, Ann. Pure Appl. Logic 32 (1986) 265-280.
[Imm82] N.Immerman, Relational queries computable in polynomial time, in: 14th.
STOC (1982) 147-152; cf. Inform. and Control 68 (1986) 86-104.
[Jen72] R.B. Jensen, The fine structure of the constructible hierarchy, Ann.Math.
Logic 4 (1972) 229-308.
[Liv82] A.B.Livchak, Languages of polynomial queries, in: Raschet i optimizacija teplotehnicheskih ob'ektov s pomosh'ju EVM, Sverdlovsk, 1982, 41 (in Russian).
[Mos74] Y.N.Moschovakis, Elementary Induction on Abstract Structures, Amsterdam,
North-Holland, 1974.
[Men64] E.Mendelson, Introduction to Mathematical Logic, D. Van Nostrand, Princeton, 1964.
[Saz80] V.Yu.Sazonov, Polynomial computability and recursivity in finite domains.
EIK, 16, N7 (1980) 319-323.
[Saz85] V.Yu.Sazonov, Bounded set theory and polynomial computability. Conf. on
Applied Logic, Novosibirsk, 1985, 188-191. (In Russian)
[Saz85a] V.Yu.Sazonov, Collection principle and existential quantifier (in Russian). Vychislitel'nye sistemy 107 (1985) Novosibirsk, 30-39. (Cf. English translation in AMS Transl. (2) 142 (1989) 1-8.)
[Saz87] V.Yu.Sazonov, Bounded set theory, polynomial computability and Δ-programming, Vychislitel'nye sistemy 122 (1987) Novosibirsk, 110-132 (in Russian). Cf. also LNCS 278 (1987) 391-397 (in English).
[Saz91] V.Yu.Sazonov, Bounded Set Theory and Inductive Definability, Logic Collo-
quium'90, JSL, 56, Nu.3 (1991) 1141-1142.
[Saz93] V.Yu.Sazonov, Hereditarily-finite sets, data bases and polynomial-time com-
putability, TCS 119 (1993) 187-214, Elsevier.
[Var82] M.Y.Vardi, The complexity of relational query languages, STOC'82, 137-146.
Vol. 907: T. Ito, A. Yonezawa (Eds.), Theory and Prac- (Eds.), Logic Programming and Nonmonotonic Reason-
tice of Parallel Programming. Proceedings, 1995. VIII, ing. Proceedings, 1995. VIII, 417 pages. 1995. (Subseries
485 pages. 1995. LNAI).
Vol. 908: L R. Rao Extensions of the UNITY Methodol- Vol. 929: F. Morlin, A. Moreno, J.J. Merelo, P. Chac6n
ogy: Compositionality, Fairness and Probability in Paral- (Eds.), Advances in Artificial Life. Proceedings, 1995.
lelism. XI, 178 pages. 1995. XIII, 960 pages. 1995 (Subseries LNAI).
Vol. 909: H. Comon, J.-P. Jouannaud (Eds.), Term Re- Vo]. 930: J. Mira, F. Sandoval (Eds.), From Natural to
writing. Proceedings, 1993. VIII, 221 pages. 1995. Artificial Neural Computation. Proceedings, 1995. XVIII,
Vol. 910: A. Podelski (Ed.), Constraint Programming: 1150 pages. 1995.
Basics and Trends. Proceedings, 1995. XI, 315 pages. Vol. 931: P.J. Braspenning, F. Thuijsman, A.J.M.M.
1995. Weijters (Eds.), Artificial Neural Networks. IX, 295
Vot. 911 : R. Baeza-Yates, E. Goles, P. V. Poblete (Eds.), pages. 1995.
LATIN '95: Theoretical Informatics. Proceedings, 1995. Vol. 932: J. Iivari, K. Lyytinen, M. Rossi (Eds.), Advanced
IX, 525 pages. 1995. Information Systems Engineering. Proceedings, 1995. XI,
Vol. 912: N. Lavrac, S. Wrobel (Eds.), Machine Learn- 388 pages. 1995.
ing: ECML - 95. Proceedings, 1995. XI, 370 pages. 1995. Vol. 933: L. Pacholski, J. Tiuryn (Eds.), Computer Sci-
(Subseries LNAI). ence Logic. Proceedings, 1994. IX, 543 pages. 1995.
VoI. 913: W. Sch~ifer (Ed.), Software Process Technol- Vol. 934: P. Barahona, M. Stefanelli, J. Wyatt (Eds.), Ar-
ogy. Proceedings, 1995. IX, 261 pages. 1995. tificial Intelligence in Medicine. Proceedings, 1995: XI,
Vol. 914: J. Hsiang (Ed.), Rewriting Techniques and Ap- 449 pages. 1995. (Subseries LNAI).
plications. Proceedings, 1995. XII, 473 pages. 1995. Vol. 935: O. De Michelis, M. Diaz (Eds.), Application
Vol. 915: P. D. Mosses, M. Nielsen, M. I. Schwartzbach and Theory of Petri Nets 1995. Proceedings, 1995, VIII,
(Eds.), TAPSOFT '95: Theory and Practice of Software 511 pages. 1995.
Development. Proceedings, 1995. XV, 810 pages. 1995. Vol. 936: V.S. Alagar, M. Nivat (Eds.), Algebraic
Vol. 916: N. R. Adam, B. K. Bhargava, Y. Yesha (Eds.), Methodology and Software Technology. Proceedings,
Digital Libraries. Proceedings, 1994. XIII, 321 pages. I995. XIV, 591 pages. 1995.
1995. Vol. 937: Z. Galil, E. Ukkonen (Eds.), Combinatorial
Vol. 917: J. Pieprzyk, R. Safavi-Naini (Eds.), Advances Pattern Matching. Proceedings, 1995. VIII, 409 pages.
in Cryptology - ASIACRYPT '94. Proceedings, 1994. XtI, 1995.
431 pages. 1995. VoI. 938: K.P. Birman, F. Mattern, A. Schiper (Eds.),
Vol. 918: P. Baumgarmer, R. Hahnle, J. Posegga (Eds.), Theory and Practice in Distributed Systems.
Theorem Proving with Analytic Tableaux and Related Proceedings,1994. X, 263 pages. 1995.
Methods. Proceedings, 1995. X, 352 pages. 1995. Vo/, 939: P. Wolper (Ed.), Computer Aided Verification.
(Subseries LNAI). X, 451 pages. 1995.
Vol. 919: B, Hertzberger, G. Serazzi (Eds.), High-Per- Vol. 941: M. Cadoli, Tractable Reasoning in Artificial
formance Computing and Networking. Proceedings, 1995. Intelligence. XVII, 247 pages. 1995. (Subseries LNAI).
XXIV, 957 pages. 1995.
Vol. 942: G. B6ckle, Exploitation of Fine-Grain
Vol. 920: E. Balas, ,1. Clausen (Eds.), Integer Program- Parallelism. IX, 188 pages. 1995.
ruing and Combinatorial Optimization. Proceedings, 1995.
Vol. 943: W. Klas, M. Schrefl, Metactasses and Their
IX, 436 pages. 1995.
Application. IX, 201 pages. 1995.
Vol. 921: L. C. Guillou, J.-J. Quisquater (Eds.), Advances
in Cryptology - EUROCRYPT '95. Proceedings, 1995.
XIV, 417 pages. 1995.