
Lecture Notes in Computer Science 933

Edited by G. Goos, J. Hartmanis and J. van Leeuwen

Advisory Board: W. Brauer D. Gries J. Stoer


Leszek Pacholski Jerzy Tiuryn (Eds.)

Computer
Science Logic
8th Workshop, CSL '94
Kazimierz, Poland, September 25-30, 1994
Selected Papers

Springer
Series Editors
Gerhard Goos, Universität Karlsruhe, Germany

Juris Hartmanis, Cornell University, NY, USA

Jan van Leeuwen, Utrecht University, The Netherlands

Volume Editors
Leszek Pacholski
Institute of Computer Science, Wrocław University
Przesmyckiego 20, 51-151 Wrocław, Poland

Jerzy Tiuryn
Institute of Informatics, Warsaw University
Banacha 2, 02-097 Warsaw, Poland

Library of Congress Cataloging-in-Publication Data

Workshop on Computer Science Logic (8th : 1994 : Kazimierz, Puławy, Poland)
Computer science logic : 8th workshop, CSL '94, Kazimierz, Poland,
September 25-30, 1994 : proceedings / Leszek Pacholski, Jerzy Tiuryn, eds.
p. cm. - (Lecture notes in computer science ; 933)
Includes bibliographical references and index.
ISBN 3-540-60017-5 (Berlin : acid-free paper). - ISBN 0-387-60017-5 (New York : acid-free paper)
1. Computer science--Congresses. 2. Logic, Symbolic and mathematical--Congresses.
I. Pacholski, Leszek. II. Tiuryn, Jerzy. III. Title. IV. Series.
QA75.5.W64 1994
004'.01'5113--dc20   95-23973   CIP

CR Subject Classification (1991): F.4, F.3, I.2.3-4, F.1


1991 Mathematics Subject Classification: 03Bxx, 68Q05, 68Q45, 68Q50,
68Q55, 68Q60, 68T27

ISBN 3-540-60017-5 Springer-Verlag Berlin Heidelberg New York


This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
liable for prosecution under the German Copyright Law.

© Springer-Verlag Berlin Heidelberg 1995


Printed in Germany

Typesetting: Camera-ready by author


SPIN: 10486274 06/3142 - 543210 - Printed on acid-free paper
Preface

The 1994 Annual Conference of the European Association for Computer Science
Logic, CSL '94, was held in Kazimierz (Poland) from September 25 through Septem-
ber 30, 1994. CSL '94 was the eighth in the series of workshops and the third to be
held as the Annual Conference of the European Association for Computer Science
Logic.
The workshop was attended by 100 participants from over 15 countries. In-
vited lectures were given by M. Ajtai, M. Baaz, H. Barendregt, J.-P. Jouannaud,
V. Orevkov, P. Pudlák, and A. Tarlecki. Moreover, 39 contributed talks selected from
151 submissions were presented. The selection and nomination of invited speakers
was done by the Program Committee consisting of E. Börger, M. Dezani, N. Jones,
P. Kolaitis, J. Krajíček, J.-L. Krivine, L. Pacholski, A. Pitts, A. Razborov, and
J. Tiuryn (Chair). We would like to express our gratitude to the Program Com-
mittee for the time and effort they contributed to the task of selecting the best
papers from the unexpectedly large number of submissions. We are also grateful to
the more than 200 referees who helped the Program Committee.
The conference was organized by Warsaw University. We would very much like to
thank Dr. Igor Walukiewicz for the splendid work he contributed to the success
of the meeting. Special thanks go to R. Maron and A. Schubert for their help with
organizing and running the conference office. We also wish to thank M. Benke
and G. Grudziński for taking care of the computers used during the conference.
We gratefully acknowledge the generous sponsorship by the following institutions:
- Office of Naval Research, under the Grant Number N00014-94-J9001
- Polish Committee for Scientific Research (KBN)
- Warsaw University
- Wrocław University
- Mathematical Institute of the Polish Academy of Sciences.
Due to the financial support of ONR and KBN we were able to offer a number
of grants for participants who otherwise could not afford to come to the conference.
The topics covered by the talks at the conference addressed all important as-
pects of the methods of mathematical logic in computer science: finite model theory,
lambda calculus, type theory, modal logics, nonmonotonic reasoning, decidability
problems, and the interplay between complexity theory and logic.
Unlike in previous volumes, the order of the papers in these proceedings follows
more closely the order in which they were presented at the conference; the papers
are grouped according to their subjects.
Following the traditional procedure for CSL volumes, papers were collected after
their presentation at the conference, and after a regular reviewing process 38 papers
were selected for publication. We thank the referees of the final versions. With-
out them it would have been impossible to prepare this volume. Finally, we would
like to thank W. Charatonik for his help in collecting the papers for the proceedings.

May, 1995 Leszek Pacholski, Jerzy Tiuryn



SPONSORS
We gratefully acknowledge the generous sponsorship by the following
institutions:

- Office of Naval Research, under the Grant Number N00014-94-J9001


- Polish Committee for Scientific Research (KBN)
- Warsaw University
- Wrocław University
- Mathematical Institute of the Polish Academy of Sciences
Table of Contents

Subtyping with Singleton Types
    David Aspinall ..................................................... 1

A Subtyping for the Fisher-Honsell-Mitchell Lambda Calculus of Objects
    Viviana Bono, Luigi Liquori ....................................... 16

The Girard Translation Extended with Recursion
    Torben Braüner .................................................... 31

Decidability of Higher-Order Subtyping with Intersection Types
    Adriana B. Compagnoni ............................................. 46

A λ-calculus Structure Isomorphic to Gentzen-style Sequent Calculus Structure
    Hugo Herbelin ..................................................... 61

Usability: Formalising (un)definedness in Typed Lambda Calculus
    Jan Kuper ......................................................... 76

Lambda Representation of Operations Between Different Term Algebras
    Marek Zaionc ...................................................... 91

Semi-Unification and Generalizations of a Particularly Simple Form
    Matthias Baaz, Gernot Salzer ..................................... 106

A Mixed Linear and Non-Linear Logic: Proofs, Terms and Models
    Nick Benton ...................................................... 121

Cut Free Formalization of Logic with Finitely Many Variables. Part I
    Lev Gordeev ...................................................... 136

How to Lie without Being (easily) Convicted and the Lengths of Proofs in
Propositional Calculus
    Pavel Pudlák, Samuel R. Buss ..................................... 151

Monadic Second-Order Logic and Linear Orderings of Finite Structures
    Bruno Courcelle .................................................. 163

First-Order Spectra with One Binary Predicate
    Arnaud Durand, Solomampionona Ranaivoson ......................... 177

Monadic Logical Definability of NP-Complete Problems
    Etienne Grandjean, Frédéric Olive ................................ 190

Logics for Context-Free Languages
    Clemens Lautemann, Thomas Schwentick, Denis Thérien .............. 205

Log-Approximable Minimization Problems on Random Inputs
    Anders Malmström ................................................. 217

Convergence and 0-1 Laws for L^k_∞ω under Arbitrary Measures
    Monica McArthur .................................................. 228

Is First Order Contained in an Initial Segment of PTIME?
    Alexei P. Stolboushkin, Michael A. Taitslin ...................... 242

Logic Programming in Tau Categories
    Stacy E. Finkelstein, Peter Freyd, James Lipton .................. 249

Reasoning and Rewriting with Set-Relations I: Ground Completeness
    Valentinas Kriaučiukas, Michał Walicki ........................... 264

Resolution Games and Non-Liftable Resolution Orderings
    Hans de Nivelle .................................................. 279

On Existential Theories of List Concatenation
    Klaus U. Schulz .................................................. 294

Completeness of Resolution for Definite Answers with Case Analysis
    Tanel Tammet ..................................................... 309

Subrecursion as a Basis for a Feasible Programming Language
    Paul J. Voda ..................................................... 324

A Sound Metalogical Semantics for Input/Output Effects
    Roy L. Crole, Andrew D. Gordon ................................... 339

An Intuitionistic Modal Logic with Applications to the Formal Verification
of Hardware
    Matt Fairtlough, Michael Mendler ................................. 354

Towards Machine-checked Compiler Correctness for Higher-order Pure
Functional Languages
    David Lester, Sara Mintchev ...................................... 369

Powerdomains, Powerstructures and Fairness
    Yiannis N. Moschovakis, Glen T. Whitney .......................... 382

Canonical Forms for Data-Specifications
    Frank Piessens, Eric Steegmans ................................... 397

An Algebraic View of Structural Induction
    Claudio Hermida, Bart Jacobs ..................................... 412

On the Interpretation of Type Theory in Locally Cartesian Closed Categories
    Martin Hofmann ................................................... 427

Algorithmic Aspects of Propositional Tense Logics
    Alexander V. Chagrov, Valentin B. Shehtman ....................... 442

Stratified Default Theories
    Paweł Cholewiński ................................................ 456

A Homomorphism Concept for ω-Regularity
    Nils Klarlund .................................................... 471

Ramified Recurrence and Computational Complexity II: Substitution and
Poly-space
    Daniel Leivant, Jean-Yves Marion ................................. 486

General Form Recursive Equations I
    Hrant B. Marandjian .............................................. 501

Modal Logics Preserving Admissible for S4 Inference Rules
    Vladimir V. Rybakov .............................................. 512

A Bounded Set Theory with Anti-Foundation Axiom and Inductive Definability
    Vladimir Yu. Sazonov ............................................. 527

Author Index ......................................................... 543


Subtyping with Singleton Types

David Aspinall

Department of Computer Science, University of Edinburgh, U.K.
e-mail: da@dcs.ed.ac.uk

Abstract. We give syntax and a PER-model semantics for a typed λ-calculus
with subtypes and singleton types. The calculus may be seen as a minimal cal-
culus of subtyping with a simple form of dependent types. The aim is to study
singleton types and to take a canny step towards more complex dependent sub-
typing systems. Singleton types have applications in the use of type systems
for specification and program extraction: given a program P we can form the
very tight specification {P} which is met uniquely by P. Singletons integrate
abbreviational definitions into a type system: the hypothesis x : {M} asserts
x = M. The addition of singleton types is a non-conservative extension of fa-
miliar subtyping theories. In our system, more terms are typable and previously
typable terms have more (non-dependent) types.

1 Introducing Singletons and Subtyping

Type systems for current programming languages provide only coarse distinctions
amongst data values: Real, Bool, String, etc. Constructive type theories for program
specification can provide very fine distinctions such as {x ∈ Nat | Prime(x)}, but
often terms contain non-computational parts, or else type-checking is undecidable.
We want to study type systems in between, where terms do not contain unnecessary
codes and, ideally, type-checking is decidable. When types express requirements for
data values more accurately, it can help to eliminate more run-time errors and to
increase confidence in program transformations which are type-preserving.

Singleton types express the most stringent requirement imaginable. Suppose fac
stands for the expression:

    μf. λx. if x = 0 then 1 else x * (f(x - 1))

Then {fac} is a specification of the factorial function, and

    fac : {fac}

says that fac satisfies the specification {fac}. This is an instance of the principal
assertion for singleton types, M : {M}. But syntactic identity is too stringent; we
can write the factorial function in other ways, and it would be useful if, when fac' is
an implementation of the factorial function, we also have fac' : {fac}. This suggests
that we let {M} stand for the collection of terms equal to M in some theory of
equality, so {M} denotes an equivalence class of terms, rather than a singleton set.

Although we want types to be more expressive, this should not sacrifice the
usability of the type system. More types can lead to more polymorphism: a term may
possess several types, and the type system should recognize this and allow the pro-
grammer as much flexibility as possible. Subtyping systems provide flexibility by
allowing a term of some type A to be used where one of a 'larger' type B is expected.
The characteristic rule for subtyping is known as subsumption, which captures this
kind of polymorphism:

    M : A     A ≤ B
    ----------------  (SUB)
         M : B

Subsumption at ground types suggests subtyping at higher types. For example, a
function defined on Int may be used where one defined only on Nat is needed, because
Nat ≤ Int means that every natural number will be a suitable argument. So we expect
that Int → Int is a subtype of Nat → Int.
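For instance, this arrow subtyping can be read as an instance of the (SUB-Π) rule given in Section 3 (a sketch of ours, reading A → B as the non-dependent product Πx:A.B and assuming Nat ≤prim Int for the primitive types):

\[
\frac{\Gamma \vdash \mathrm{Nat} \le \mathrm{Int}
      \qquad \Gamma,\, x{:}\mathrm{Nat} \vdash \mathrm{Int} \le \mathrm{Int}
      \qquad \Gamma,\, x{:}\mathrm{Int} \vdash \mathrm{Int}}
     {\Gamma \vdash \mathrm{Int} \to \mathrm{Int} \;\le\; \mathrm{Nat} \to \mathrm{Int}}
\;(\textsc{sub-}\Pi)
\]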
Subtyping leads to a stratified notion of equality. Because terms may have many
types, the equality of two terms can be different at different types. Indeed, consider
two different functions on integers:

    (λx:Int. if x > 0 then x else 2*x)  ≠  (λx:Int. x)  :  Int → Int

which have equal values at every natural:

    (λx:Int. if x > 0 then x else 2*x)  =  (λx:Int. x)  :  Nat → Int

If in some context only arguments of type Nat are supplied, these functions are
interchangeable; useful perhaps for program transformation during compilation.

This view of equality influences our treatment of singleton types. Because equality
can vary at different types, we think of {M} as a family of equivalence classes indexed
by a type. We attach a tag to the singleton, which denotes the type at which we
"view" the term. The introduction rule for singletons is:

    M : A
    ----------  ({}-I)
    M : {M}A

In fact, the type tag can be important for another reason: the type A might affect
the interpretation of M, as well as its equivalence class of terms (although this is not
the case for the semantics we give later). Imagine a model in which the integers are
constructed using pairs of naturals: the pair (m, n) codes the integer (m - n). Then
the interpretation [[3 : Int]] is quite different from [[3 : Nat]], and the semantic types
have different equality relations associated. (Of course, there is an obvious coercion
from [[Nat]] to [[Int]].) To allow for typed interpretations we need to know the type
given to a term in a singleton, but unless it is recorded somehow it cannot be
determined from a typing derivation.

There is no typing elimination rule for singleton types, but we have a subtyping
rule that says that a singleton type is a subtype of its type tag:

    M : A
    -----------  (SUB-{})
    {M}A ≤ A

which allows us to deduce M : A from M : {N}A via (SUB).
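Spelling out that deduction (our own presentation of what the rules give): the premise Γ ⊢ N : A is available whenever the singleton {N}A is well formed, so

\[
\frac{\Gamma \vdash M : \{N\}_A
      \qquad
      \dfrac{\Gamma \vdash N : A}{\Gamma \vdash \{N\}_A \le A}\;(\textsc{sub-}\{\})}
     {\Gamma \vdash M : A}\;(\textsc{sub})
\]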
For us, singleton types have a non-informative flavour. In other words, we have
no term operators corresponding to singleton introduction and elimination. This
contrasts with constructive type theories utilising propositions-as-types, where
singletons might be treated akin to a propositional equality type and given a powerful
elimination operator. In our approach, membership of singletons corresponds to
definitional equality, which is usually decidable. A technical side-effect of
non-informative types is that the meta-theory of our system is harder to deal with,
because the rules are less syntax-directed.¹ Notice that the presence of (SUB) already
means that the typing rules are not syntax-directed.

¹ A set of rules is called syntax-directed if the last rule used in a derivation of any
  statement J is uniquely determined by the structure of J.

The theory of equality we choose to incorporate in singleton types is a natural
typed equational theory for the terms. The typing assertion M : {N}A asserts that
M and N are equal at type A, so instead of axiomatizing a separate judgement form
Γ ⊢ M = N : A, we use typing rules of the form Γ ⊢ M : {N}A directly. The
usual rule of β-equality is admissible. This formulation is nicer to deal with than one
defined using a rule of untyped β-conversion.

The system λ≤ ("lambda-sub") [Car88a] is formed by adding subtyping to the
simply-typed λ-calculus. In the remainder of the paper we shall study the addition of
singleton types to λ≤; we call the resulting system λ≤{} ("lambda-sub-singleton").
First, the next section outlines some uses for type systems with singleton types. Then
in Section 3 we present the complete definition of λ≤{} and in Section 4 we establish
some of its meta-theoretic properties. Section 5 gives a PER semantics and shows
soundness; Section 6 concludes.

2 Using Singleton Types


Singleton types as specifications. This work arose from a desire to understand
the formal system of a specification language called ASL+ [SST92, Asp95]. ASL+
extends the algebraic specification language ASL [SW83] with constructs from type
theory (principally λ-abstraction) for parameterising specifications and programs.
ASL consists of a collection of specification building operators (SBOs) which are used
for putting together specifications; ASL-level specifications form the base types of
the ASL+ λ-calculus.

One of the extensions provided by ASL+ is the ability to express specifications
of parameterised programs (like functor signatures in Extended ML [KST94]). To
do this, we need dependent function spaces to specify a function which satisfies a
specification that depends on its argument. Singletons turn a program into a very
tight specification which we can use with other SBOs in the body of the function
specification. A trivial example is an identity functor, which returns its argument:

    functor Id(X : S) : S = X

In ASL+, this could be specified by ΠX: S. {X}. A less trivial example is SORT,
which specifies a parameterised program that, given a program implementing an
order relation Ord (a type t and an ordering le on t), returns a sorting function sort
for sorting lists of elements of t according to le.

    SORT =def  Π Ord:ORD.
                 enrich {Ord}
                 by
                   sign   sort : t list → t list
                   axioms (some axioms specifying that sort(x) is a
                           sorted copy of x with respect to le)
                 end

The important thing to notice here is the use of {Ord} to require that applications
of programs satisfying SORT should be an extension of the actual parameter with a
sorting function that operates on exactly the same type t.

Singleton types and subtyping. Adding singleton types to well-known subtyping
systems such as λ≤ and its second-order variants is not a conservative extension.
More typing statements become provable, both because more terms are typable and
because terms have more (non-dependent) types.

We illustrate this with a simple example. Consider the identity function on real
numbers:

    idReal =def λx:Real. x

Suppose that Int ≤ Real. Then in λ≤:

    idReal : Real → Real
    idReal : Int → Real
    but not  idReal : Int → Int

But the third typing is perfectly reasonable; the identity function on reals certainly
maps integers to integers. It is provable in our system, via an extended rule for
λ-introduction:

    Γ ⊢ Int ≤ Real     Γ, x:Real ⊢ x : Real     Γ, x:Int ⊢ x : Int
    ---------------------------------------------------------------
                   Γ ⊢ λx:Real. x : Int → Int

The second hypothesis ensures that the λ-abstraction can be typed. The third
hypothesis allows the body to be given a more refined type based on the assumption
that the argument type is more refined, according to the first hypothesis. Although
a typing rule like this could be added to λ≤, the types Real → Real and Int → Int
are incomparable, so this would break the desirable property that every typable term
has a minimal type. On the other hand, λ≤{} does have the minimal type property,
which we show in Section 4. Not surprisingly, the minimal types are singletons.

As well as more types for previously typable terms, it follows from the example
that more (singleton-free) terms become typable when we add singleton types.
Suppose

    twiceInt→Int =def λf : Int → Int. λx : Int. f(f(x)).

Then twiceInt→Int(idReal) : Int → Int in λ≤{}, but it is untypable in λ≤.

There are more complex cases than these. We can assign a type Πx: A. {M}B
to a function λx: A. M that is as informative as the function definition itself, so we
can substitute the result of function applications during type-checking and do some
amount of equational reasoning. Similarly, we can substitute the arguments supplied
to a function into the body before type checking, which is shown next.
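As a small worked instance of the first point (our own example, treating Real as a primitive type and assuming y:Real is in Γ):

\[
\begin{array}{ll}
\Gamma,\, x{:}\mathrm{Real} \vdash x : \{x\}_{\mathrm{Real}} & \text{by } (\textsc{eq-refl})\\
\Gamma \vdash \lambda x{:}\mathrm{Real}.\,x \;:\; \Pi x{:}\mathrm{Real}.\,\{x\}_{\mathrm{Real}} & \text{by } (\lambda)\\
\Gamma \vdash (\lambda x{:}\mathrm{Real}.\,x)\,y \;:\; \{y\}_{\mathrm{Real}} & \text{by } (\textsc{app})
\end{array}
\]

so the application is recorded as being equal to y at type Real, and this fact is available during type-checking.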
Singleton types as definitions. Simple definition by abbreviation is essential for
the practical use of a type system in a programming language or proof assistant. If
M is a large expression occurring inside N several times, we may write N as

    x = M in N'

where N' is the result of replacing occurrences of M in N with the variable x. If
definitions are treated formally, they are introduced as a new concept that extends
the type theory, causing additional complication. With singleton types we get a form
of definition in the system for free, and derive rules similar to those given in
[HP91, SP94]. The typed definitions of Severi and Poll [SP94] have the form:

    x = M : A in N

This is similar to a λ-abstraction over a singleton type in λ≤{}, applied to the
trivially appropriate argument:

    (λx: {M}A. N) M

which in turn can be compared with the usual "trick" for writing definitions in
systems such as λ→ without using a special mechanism:

    (λx: A. N) M

Severi and Poll point out three reasons for introducing definitions as a fresh concept:

1. The λ-abstraction λx: A. N might not be permitted in the type system (if N is
   a type expression, for example).
2. β-reduction replaces all instances of x in N by M, whereas it is useful to be able
   to replace instances one-by-one when desired.
3. The information that x = M may lead to a typing for N that otherwise would
   not be possible.

If definitional bindings must be allowed in more places than λ-abstractions, then it
seems necessary to introduce a new concept. For the second point, Severi and Poll
introduce a new kind of reduction: a δ-reduction replaces a single instance of x with
M. We can similarly introduce a new reduction relation (called Γ-reduction) in λ≤{}:

    x →Γ M    if Γ(x) = {M}A

It is indexed by a context Γ which contains typing assumptions for variables. When
the type of a variable x is a singleton type {M}A, then we consider x as being equal
to M (at type A), so we may replace occurrences of x by M. This reduction can be
extended under λ and Π abstractions if certain care is taken, but we will not consider
it further in this paper.
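A purely illustrative instance (the names y, x, A are hypothetical):

\[
\Gamma \;\equiv\; y{:}A,\; x{:}\{y\}_A
\qquad\Longrightarrow\qquad
x \;\longrightarrow_{\Gamma}\; y,
\]

mirroring the δ-reduction of [SP94] for the definition x = y : A.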
Without adding special constructs for definitions we gain the benefit of the third
point, extended typing. Interestingly, because of the rules chosen for singleton types,
it is not necessary for the term M to be "revealed" to the function body N to make
use of the fact that x = M when typing N. The term (λx: A. N)M has exactly the
same types as (λx: {M}A. N)M, and moreover the two terms are provably equal in
our equational theory.

3  The System λ≤{}

The system λ≤{} is the addition of singleton types to λ≤. We do not restrict
λ-introduction, so we get dependent product types in place of the usual arrow types.
The combination of type-dependency, subtyping, and the non-informative aspect of
the singleton constructor makes fundamental properties like subject reduction more
difficult to establish than usual.

Pre-types A, pre-terms M, and pre-contexts Γ are given by the grammar:

    A ::= P | Πx:A.B | {M}A
    M ::= x | λx:A.M | MN
    Γ ::= ⟨⟩ | Γ, x:A

There is a set of term variables Var, ranged over by x. There are no type variables, so
we assume a set PrimTypes of primitive (or atomic) types, ranged over by P.
Subtyping between primitive types is given by a relation ≤prim on PrimTypes.
Restricting ≤prim to atomic types ensures that subtyping retains a structural
character; that is, two types related by the subtype relation will have a similar shape.
Free and bound variables are defined as usual. We identify pre-types, pre-terms, and
pre-contexts that are alpha-convertible, and reserve ≡ for this syntactic equivalence.
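As a concrete reading of this grammar, here is a minimal sketch of the raw syntax as an OCaml datatype; the constructor names (Prim, Pi, Sing, ...) and the use of strings for variables are our own choices, not part of the paper.

    (* Pre-types, pre-terms and pre-contexts of lambda-sub-singleton,
       transcribed from the grammar above.  Binding is by explicit names;
       alpha-conversion is not treated here. *)
    type ty =
      | Prim of string              (* primitive type P            *)
      | Pi   of string * ty * ty    (* dependent product Pi x:A. B *)
      | Sing of tm * ty             (* singleton type {M}A         *)

    and tm =
      | Var of string               (* term variable x             *)
      | Lam of string * ty * tm     (* abstraction  \x:A. M        *)
      | App of tm * tm              (* application  M N            *)

    (* A pre-context is a list of hypotheses x:A. *)
    type ctx = (string * ty) list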
                 -------------  (EMPTY)
                 ⟨⟩ Context

    Γ ⊢ A    x ∉ dom(Γ)
    ---------------------  (ADD-HYP)
    Γ, x:A Context

    Γ Context
    -----------  (FORM-PRIM)
    Γ ⊢ P

    Γ, x:A ⊢ B
    -------------  (FORM-Π)
    Γ ⊢ Πx:A.B

    Γ ⊢ M : A
    ------------  (FORM-{})
    Γ ⊢ {M}A

    Γ Context
    ---------------  (VAR)
    Γ ⊢ x : Γ(x)

    Γ, x:A ⊢ M : B
    ------------------------  (λ)
    Γ ⊢ λx:A.M : Πx:A.B

    Γ ⊢ M : Πx:A.B    Γ ⊢ N : A
    ------------------------------  (APP)
    Γ ⊢ MN : B[N/x]

    Γ ⊢ M : A    Γ ⊢ A ≤ B
    -------------------------  (SUB)
    Γ ⊢ M : B

    Table 1: Rules for contexts, formation, and typing.

We write Γ ⊆ Γ' if each x : A in Γ is also in Γ'. Beta-reduction is defined as usual
over terms, and extended to types compatibly with the type-constructors. Notice that
there is no application at the level of types; convertible types can only differ within
corresponding singleton components.

We use the following judgement forms:
- context formation, Γ Context
- type formation, Γ ⊢ A
- typing, Γ ⊢ M : A
- subtyping, Γ ⊢ A ≤ B

A judgement is valid iff it is derived using the rules of Tables 1 and 2. We use Γ ⊢ J
to range over valid judgements.

We also use an equality judgement as a derived form; Γ ⊢ M = N : A stands
for Γ ⊢ M : {N}A when we think of it as meaning equality between terms. Equality
between types is written Γ ⊢ A = B, which means that both Γ ⊢ A ≤ B and
Γ ⊢ B ≤ A are valid.

The four judgements are defined simultaneously, in contrast to non-dependent
schemes where the subtype relation can be separated from the typing relation. Below
we describe the rules related to singleton types; the other rules shown in the tables
are mostly standard.
    Γ ⊢ M : A
    ----------------  (EQ-REFL)
    Γ ⊢ M = M : A

    Γ ⊢ A' ≤ A    Γ, x:A' ⊢ M = M' : B'    Γ, x:A ⊢ M : B
    --------------------------------------------------------  (EQ-λ)
    Γ ⊢ λx:A.M = λx:A'.M' : Πx:A'.B'

    Γ ⊢ M = M' : Πx:A.B    Γ ⊢ N = N' : A
    ----------------------------------------  (EQ-APP)
    Γ ⊢ MN = M'N' : B[N/x]

    Note: Γ ⊢ M = N : A is short for Γ ⊢ M : {N}A.

    Γ ⊢ A
    -----------  (SUB-REFL)
    Γ ⊢ A ≤ A

    Γ ⊢ A ≤ B    Γ ⊢ B ≤ C
    -------------------------  (SUB-TRANS)
    Γ ⊢ A ≤ C

    Γ Context    P ≤prim P'
    --------------------------  (SUB-PRIM)
    Γ ⊢ P ≤ P'

    Γ ⊢ A' ≤ A    Γ, x:A' ⊢ B ≤ B'    Γ, x:A ⊢ B
    ------------------------------------------------  (SUB-Π)
    Γ ⊢ Πx:A.B ≤ Πx:A'.B'

    Γ ⊢ M : A
    ---------------  (SUB-{})
    Γ ⊢ {M}A ≤ A

    Γ ⊢ M = N : A    Γ ⊢ A ≤ B
    ------------------------------  (SUB-EQ-SYM)
    Γ ⊢ {N}A ≤ {M}B

    Γ ⊢ M : A
    --------------------------  (SUB-EQ-ITER)
    Γ ⊢ {M}A ≤ {M}{M}A

    Table 2: Rules for equality and subtyping.

Singleton Types and Equality. Singleton types are formed by the rule (FORM-{})
and terms of singleton type are introduced by the equality rules, principally
reflexivity (EQ-REFL), which is the singleton introduction rule ({}-I) shown before
under a different guise. Symmetry and transitivity are derived using the subtyping
rules shown below. We also have the usual rules for equality of λ-abstractions (EQ-λ)
and applications (EQ-APP). The rule (EQ-λ) is more flexible than usual: it allows
one to derive equalities between functions by examining only a restricted domain.
(This rule is forced when one has untagged singletons and the usual equal-domains
equality, which was the inspiration for adding it.) It leads to the admissibility of a
correspondingly stronger typing rule, via (SUB) and (SUB-{}):

    Γ ⊢ A' ≤ A    Γ, x:A ⊢ M : B    Γ, x:A' ⊢ M : B'
    ----------------------------------------------------
    Γ ⊢ λx: A. M : Πx: A'. B'

which lets us give a more refined type for a function, given a more refined type for
its argument. This was used in the idReal example in Section 2.

One might wonder whether we can have deduction from arbitrary hypotheses of
equations between terms, assumed via iterated subscripts in the syntax. For example,
it holds that Γ, x: {M}{N}A ⊢ M = N : A. In fact, we have only a pure theory of
equality, since this judgement presupposes that Γ ⊢ M = N : A. This is because the
rule (ADD-HYP) requires that types in contexts must be well-formed; Proposition 4.1
below establishes this formally.

Subtyping Singletons. Subtyping of singleton types is provided by three rules.
First, we have the rule (SUB-{}) shown earlier, which asserts that a singleton is a
subtype of the type it is tagged with.

The rule (SUB-EQ-SYM) combines two principles. The first is monotonicity of
equality with respect to subtyping: if two terms are equal at a type, say M = N : A,
and A ≤ B, then M = N : B also. We can express this via subtyping of singleton
types. Generally, as we pass from subtype to supertype, the equivalence class of any
particular term gets larger, so {N}A ≤ {N}B and M = N : B via subsumption.
The second principle is symmetry of equality, and again we can express the typing
rule by a subtyping one (an economy, since we want both). If M = N : A then the
equivalence classes of M and N at A must be the same, in particular {N}A ≤ {M}A.
These are combined to get the single rule (SUB-EQ-SYM).

The third subtyping rule for singletons, (SUB-EQ-ITER), deals with the case when
a singleton type is tagged with another singleton type. Observe that we can repeat
the operation of taking singletons in the syntax, forming {M}A, {M}{M}A, .... We
shall consider these types as equal, because singleton types are already the smallest
non-empty types we are interested in. And because {M}A inherits equality from A,
the equality on terms in {M}A and {M}{M}A is the same. We have {M}{M}A ≤ {M}A
already by (SUB-{}); for the other inclusion we need (SUB-EQ-ITER).
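Both inclusions can be read off directly from the rules (our own unfolding): given Γ ⊢ M : A,

\[
\begin{array}{ll}
\Gamma \vdash M : \{M\}_A & \text{by } (\textsc{eq-refl})\\
\Gamma \vdash \{M\}_{\{M\}_A} \le \{M\}_A & \text{by } (\textsc{sub-}\{\}) \text{ applied to the line above}\\
\Gamma \vdash \{M\}_A \le \{M\}_{\{M\}_A} & \text{by } (\textsc{sub-eq-iter})
\end{array}
\]

so Γ ⊢ {M}A = {M}{M}A.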

4  Basic meta-theory of λ≤{}

We state some basic properties of λ≤{}, which lead to proofs of admissibility of rules
such as subject β-reduction and β-equality. We then prove the existence of minimal
types. Proofs are by straightforward inductions on typing derivations unless stated.

Proposition 4.1 (Contexts and Substitution)
1. Context formation. Suppose Γ Context, where Γ ≡ x1 : A1, ..., xn : An. Then
   (a) Γ ⊢ Ai for 1 ≤ i ≤ n.
   (b) If Γ ⊢ J, then FV(J) ⊆ {x1, ..., xn}.
2. Weakening. If Γ ⊢ J and Γ ⊆ Γ' with Γ' Context, then Γ' ⊢ J.
3. Substitution. If Γ, x : A, Γ' ⊢ J and Γ ⊢ N : A, then Γ, Γ'[N/x] ⊢ J[N/x].
4. Bound narrowing. If Γ, x : A, Γ' ⊢ J and Γ ⊢ A' ≤ A, then Γ, x : A', Γ' ⊢ J.

The next proposition shows some implications between judgements. It is common
practice in the presentation of type theories to simply require the consequences of
these as premises in the rules to begin with (often implicitly). Here our rules have
fewer premises and we show the implications afterwards, but our approach does make
some proofs on derivations slightly harder, because sometimes we cannot apply an
induction hypothesis directly. This can be circumvented by considering term structure
instead, or subderivations of the premises.

Proposition 4.2 (Implied Judgements)
1. If Γ ⊢ J then Γ Context.
2. If Γ ⊢ J and J ≡ M : A, A ≤ B or B ≤ A, then Γ ⊢ A.
3. If Γ ⊢ J and J ≡ M = N : A, N = M : A, {M}A ≤ B or {M}B ≤ A, then
   Γ ⊢ M : A.
Generation principles are important for meta-theoretic analysis. They allow us to
decompose a derived judgement into further derivable judgements concerning
subterms from the first judgement. Typically a generation principle expresses the
general way in which a judgement form may be constructed; for the context and type
formation judgements, the generation principles are merely inversions of the rules.
The following generation result for the subtyping judgement allows us to show
generation for the typing judgement. It also reveals the "structural" nature of
subtyping we mentioned before, except in the case of singletons: {M}A can be a
subtype of a type B which is not itself a singleton. There is a case according to each
syntactic form on either side of the subtyping symbol.

Proposition 4.3 (Subtyping Generation)
1. If Γ ⊢ P ≤ B then B is also an atomic type, say P', and P ≤prim P'.
2. If Γ ⊢ Πx:A.B ≤ C then for some A', B', we have C ≡ Πx:A'.B', such that (a)
   Γ ⊢ A' ≤ A, (b) Γ, x:A' ⊢ B ≤ B', and (c) Γ, x:A ⊢ B.
3. If Γ ⊢ {M}A ≤ B, then Γ ⊢ M : B.
4. If Γ ⊢ A ≤ P' where A is not a singleton, then A is also an atomic type, say P,
   and P ≤prim P'.
5. If Γ ⊢ C ≤ Πx:A'.B' where C is not a singleton, then for some A, B, we have
   C ≡ Πx:A.B, such that (a) Γ ⊢ A' ≤ A, (b) Γ, x:A' ⊢ B ≤ B', and (c)
   Γ, x:A ⊢ B.
6. If Γ ⊢ C ≤ {N}B then for some A, M, we have C ≡ {M}A.

Parts 3 and 6 of this proposition are rather weak; in particular nothing is said about
the relation between the types A and B. This will be rectified later.

The generation principle for the typing judgement Γ ⊢ M : A looks unusual,
because we must account for the possibility that A is a singleton type.

Proposition 4.4 (Typing Generation)
1. If Γ ⊢ x : A, then Γ ⊢ {x}Γ(x) ≤ A.
2. If Γ ⊢ λx:A.M : C, then for some A', B, B', we have (a) Γ ⊢ A' ≤ A, (b)
   Γ, x:A' ⊢ M : B', (c) Γ, x:A ⊢ M : B, and (d) Γ ⊢ {λx:A.M}Πx:A'.B' ≤ C.
3. If Γ ⊢ MN : C, then for some A, B, we have that (a) Γ ⊢ M : Πx:A.B, (b)
   Γ ⊢ N : A, and (c) Γ ⊢ {MN}B[N/x] ≤ C.

In specific instances, the consequence of typing generation can be further broken
down using the subtyping generation principle, and so on.
Admissible equality rules. We mention a few important admissible rules of λ≤{}.
The symmetry and transitivity of equality

    Γ ⊢ M = N : A
    ----------------  (EQ-SYM)
    Γ ⊢ N = M : A

    Γ ⊢ L = M : A    Γ ⊢ M = N : A
    ----------------------------------  (EQ-TRANS)
    Γ ⊢ L = N : A

are derived via (SUB-EQ-SYM) and (SUB-TRANS), using Proposition 4.2.

The usual rule for β-equality is (perhaps surprisingly) derivable. This is because
λx: A. M can be given the tight dependent type Πx: A. {M}B using ({}-I), and so
together with (λ) and (APP) we can derive:

    Γ, x:A ⊢ M : B    Γ ⊢ N : A
    --------------------------------------------  (EQ-β)
    Γ ⊢ (λx: A. M)N = M[N/x] : B[N/x]

This rule is used to show β subject reduction.
Removing Singletons. We mentioned that two parts of the subtyping generation
principle (Proposition 4.3) are rather weak. If {M}A ≤ B, we would like to find a
relationship between the types A and B. When B is not a singleton type, we expect
(from the rules) that A ≤ B. However, when B ≡ {N}C for some N and C, we may
have A ≤ C or vice versa, because of the rules (SUB-EQ-SYM) and (SUB-EQ-ITER).
A generation lemma covering these cases is untidy to state, and difficult to prove
directly because of the rule (SUB-TRANS).

Here we define an operation (−)~ which derives a non-singleton type from a type
by repeatedly taking the type tag of a singleton type. A proposition relates a type to
its singleton-deleted form; this sufficiently strengthens the generation result to give
us a tool to show the admissibility of subject reduction and the minimal type property.

Definition 4.5 (Singleton Removal)

    P~ = P
    (Πx: A. B)~ = Πx: A. B
    ({M}A)~ = A~
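A direct transcription of Definition 4.5 as an OCaml function (a sketch of ours; the datatype repeats the one given in Section 3):

    (* Singleton removal: strip singleton tags until a non-singleton type
       remains.  As in Definition 4.5, Pi-types are left untouched. *)
    type ty =
      | Prim of string
      | Pi   of string * ty * ty
      | Sing of tm * ty
    and tm =
      | Var of string
      | Lam of string * ty * tm
      | App of tm * tm

    let rec removal (a : ty) : ty =
      match a with
      | Prim _ | Pi _ -> a              (* P~ = P,  (Pi x:A.B)~ = Pi x:A.B *)
      | Sing (_, tag) -> removal tag    (* ({M}A)~ = A~                    *)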

Proposition 4.6 (Singleton Removal Subtyping)
1. Γ ⊢ A ⟹ Γ ⊢ A ≤ A~
2. Γ ⊢ A ≤ B ⟹ Γ ⊢ A~ ≤ B~
3. If B is not a singleton, then Γ ⊢ {M}A ≤ B ⟹ Γ ⊢ A ≤ B

Proof. Part 1 is proved by induction on the structure of types, part 2 by induction
on the subtyping derivation, using 1 and Proposition 4.2; part 3 follows from 1 and 2
using the same proposition. □

Subject Reduction. We can now use the generation principles to show that
β-subject reduction holds, for both typing and subtyping. The critical lemma is the
case of a one-step outermost reduction. The syntactic proof of this is more involved
than usual.

Theorem 4.7 (Subject Reduction)
1. If Γ ⊢ M : A and M →β M', then Γ ⊢ M' : A also.
2. If Γ ⊢ A ≤ B and A →β A', then Γ ⊢ A' ≤ B also.
3. If Γ ⊢ A ≤ B and B →β B', then Γ ⊢ A ≤ B' also.

Proof. Simultaneously for a single reduction step, we use induction on the structure
of terms and types. For terms, this involves Proposition 4.4 and the equality rules,
plus Lemma 4.8 below. For types, we use Proposition 4.3 and Proposition 4.2. □

Lemma 4.8 (Outermost β-reduction)
Γ ⊢ (λx: A. M)N : C implies Γ ⊢ M[N/x] : C.

Proof. By two applications of typing generation (Proposition 4.4), there exist types
A1, B1, A2, B2 and B such that:

    Γ ⊢ λx:A.M : Πx:A1.B1,    Γ ⊢ N : A1
    Γ ⊢ {(λx:A.M)N}B1[N/x] ≤ C                                (*)
    Γ ⊢ A2 ≤ A,    Γ, x:A ⊢ M : B,    Γ, x:A2 ⊢ M : B2
    Γ ⊢ {λx:A.M}Πx:A2.B2 ≤ Πx:A1.B1

By Propositions 4.6 and 4.3 and the last of these, we have:

    Γ ⊢ Πx:A2.B2 ≤ Πx:A1.B1,    Γ ⊢ A1 ≤ A2,    Γ, x:A1 ⊢ B2 ≤ B1.

Now using (BND-NARROW) and (SUB) we have Γ, x:A1 ⊢ M : B1. So we can apply
the admissible rule (EQ-β),

    Γ, x:A1 ⊢ M : B1    Γ ⊢ N : A1
    -----------------------------------------------
    Γ ⊢ (λx: A1. M)N = M[N/x] : B1[N/x]

and by (EQ-λ) and (EQ-APP):

    Γ ⊢ A1 ≤ A    Γ, x:A1 ⊢ M = M : B1    Γ, x:A ⊢ M : B    Γ ⊢ N : A1
    ---------------------------------------------------------------------
    Γ ⊢ (λx:A.M)N = (λx:A1.M)N : B1[N/x]

By transitivity:

    Γ ⊢ (λx:A.M)N = M[N/x] : B1[N/x]                          (†)

Finally, using Proposition 4.2, (SUB-TRANS) and (SUB-EQ-SYM), we can derive:

    Γ ⊢ P = Q : D
    ----------------------------------------
    Γ ⊢ {Q}D ≤ {P}D        Γ ⊢ {P}D ≤ C
    ----------------------------------------
    Γ ⊢ Q : C

Let (†) and (*) be the premises, so P ≡ (λx:A.M)N, Q ≡ M[N/x], and D ≡ B1[N/x].
Then Γ ⊢ M[N/x] : C as required. □
Minimal Types. If a term possesses several types, it is useful both theoretically and
pragmatically if one type can always be found that is more general than the others;
in subtyping systems, it is minimal. With untagged singletons, minimal types are a
triviality: the minimal type for a term M is {M}! When type tags are added, the
issue is not so obvious. Here we show a strengthening of typing generation to give
minimal types.

The minimal type minΓ(M) of a term M in a context Γ has the form {M}A for
some A. We give a partial inductive definition of minΓ(M), which is shown in the
following lemma to be well defined on all Γ, M such that Γ ⊢ M : A for some A.

Definition 4.9 (Minimal Types)

    minΓ(x) = {x}Γ(x)
    minΓ(λx: A. M) = {λx: A. M}Πx:A. minΓ,x:A(M)
    minΓ(MN) = {MN}B[N/x]    where (minΓ(M))~ = Πx: A. B
                              and Γ ⊢ N : A
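A mechanical unfolding of the definition, for the hypothetical context Γ ≡ z:α:

\[
\min\nolimits_\Gamma(\lambda x{:}\alpha.\,z)
  \;=\; \{\lambda x{:}\alpha.\,z\}_{\Pi x{:}\alpha.\,\min_{\Gamma,\,x:\alpha}(z)}
  \;=\; \{\lambda x{:}\alpha.\,z\}_{\Pi x{:}\alpha.\,\{z\}_\alpha},
\]

which is indeed a singleton; by (SUB-{}) and (SUB-Π) it is a subtype of the more familiar typing Πx:α.α.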
Lemma 4.10 (Existence of Minimal Types)
1. Γ ⊢ M : A ⟹ Γ ⊢ M : minΓ(M)
2. Γ ⊢ M : A ⟹ Γ ⊢ minΓ(M) ≤ A

Proof. Simultaneously, by induction on the derivation of Γ ⊢ M : A. □

Minimal types are not unique, and the minimal types given by the simple definition
of minΓ(M) are not necessarily the simplest syntactically. For example, if
Γ ≡ z:α, f:β → {z}α, x:β then minΓ(fx) ≡ {fx}{z}α. But Γ ⊢ fx : {z}α too, and
we can show that Γ ⊢ {z}α ≤ {fx}{z}α.

5  A PER Interpretation of λ≤{}

Subtyping calculi have two basic kinds of model. We may choose a typed value space
where subsumption is modelled using coercion maps between types. In some sense
this is the most general setting, but it requires some way of relating coercion maps
to the syntax: either we forgo (SUB) and introduce coercions explicitly into the
syntax [CL91], or we reconstruct coercions by some translation process [BTCGS91].
Either route requires a coherence property of the interpretation, because of the
possibility of different ways of deriving or expressing the analogue of a coercion-free
statement, by permuting the positions of coercions. This property can be quite tricky
to establish in a general form, and has yet to be demonstrated in a subtyping calculus
more complex than F≤ (see [CG92]). Here we follow the alternative untyped
approach, based on a global value space from which types are carved out. Coercion
maps are unnecessary since subtyping amounts to inclusion between types, and the
interpretation of a term does not depend on its type. The need for coherence
properties can be avoided by defining the interpretation by induction on the structure
of raw (coercion-free) terms rather than on typing derivations.

The PER model. Recall that a partial equivalence relation (PER) on a set D is a
symmetric and transitive relation R ⊆ D × D. The domain of R, dom(R), is the set
{d | d R d}, but we often write d ∈ R instead of d ∈ dom(R). The equivalence class
{d' | d' R d} of d in R is written [d]R. Subtyping will be interpreted as inclusion of
PERs, which is simply subset inclusion on D × D.

The construction is mostly standard (see e.g. [CL91, BL90]), but incorporates
type-term dependency. We make use of a model of the untyped λ-calculus to interpret
terms and to build PERs over.

Definition 5.1 (Lambda Model [HL80])
A lambda model is a triple D = (D, ·, [[ ]]), where D is a set, · is a binary operation
on D and, for untyped lambda-terms M, the interpretation of M in an environment
ρ: Var → D is [[M]]ρ ∈ D, such that:

    [[x]]ρ = ρ(x)                                                      (VAR)
    [[MN]]ρ = [[M]]ρ · [[N]]ρ                                          (APP)
    [[λx. M]]ρ = [[λy. M[y/x]]]ρ                                       (α)
    (∀d ∈ D. [[M]]ρ[x ↦ d] = [[N]]ρ[x ↦ d]) ⟹ [[λx. M]]ρ = [[λx. N]]ρ   (ξ)
    (∀x ∈ FV(M). ρ(x) = ρ'(x)) ⟹ [[M]]ρ = [[M]]ρ'                       (FV)
    ∀d ∈ D. [[λx. M]]ρ · d = [[M]]ρ[x ↦ d]                              (β)

From the above axioms (except β), we also have:

    [[M[N/x]]]ρ = [[M]]ρ[x ↦ [[N]]ρ]                                    (SUBSTITUTE)

An environment ρ' extends another ρ, written ρ ⊆ ρ', if for all variables x, if ρ(x)
is defined then ρ'(x) is defined and ρ(x) = ρ'(x). Fix a lambda model D with domain
D. Terms are interpreted as elements of D as usual (we leave the erase operation,
which deletes type information, implicit) and types are interpreted as PERs on D.
Pairing and projection operations in the model are defined by:

    ⟨a, b⟩ = [[λf. f x y]][x ↦ a, y ↦ b]
    π1 p = p · [[λx. λy. x]]
    π2 p = p · [[λx. λy. y]]

We first show some constructions for building PERs, and then the interpretation
proper.

Definition 5.2 (PER Constructions)
We define PERs to interpret the types of λ≤{}, as follows:
- For each primitive type P, we assume a PER RP such that P ≤prim P' implies
  RP ⊆ RP'.
- Let R be a PER and S(a) be a PER for all a ∈ dom(R), such that S(a) = S(b)
  whenever a R b.
  Define the PER Π(R, S) by:
      f Π(R, S) g  iff  ∀a, b. a R b ⟹ f·a S(a) g·b
  Define the PER Σ(R, S) by:
      ⟨a1, b1⟩ Σ(R, S) ⟨a2, b2⟩  iff  a1 R a2 and b1 S(a1) b2
- Let R be a PER. Define the PER [p]R by:
      m [p]R n  iff  m R n and m R p

The interpretation [[Γ]] of a context Γ is a PER. The interpretation of a type in some
context is a family of PERs [[Γ ⊢ A]]g indexed by elements g ∈ [[Γ]] that is invariant
under the choice of representative of equivalence class in [[Γ]].

Definition 5.3 (Interpretation of Contexts and Types)
For each context Γ, we define a PER [[Γ]] by:

    [[⟨⟩]] = D × D
    [[Γ, x:A]] = Σ([[Γ]], Λg. [[Γ ⊢ A]]g)

For each context Γ and type A, we define a PER [[Γ ⊢ A]]g, for each g ∈ dom([[Γ]]):

    [[Γ ⊢ P]]g = RP
    [[Γ ⊢ Πx:A.B]]g = Π([[Γ ⊢ A]]g, Λa. [[Γ, x:A ⊢ B]]⟨g,a⟩)
    [[Γ ⊢ {M}A]]g = [ [[M]]g^Γ ]R    where R = [[Γ ⊢ A]]g

Notice that ⊢ is just used as a place-holder here; it does not signify a judgement
derivation. The symbol Λ stands for lambda-abstraction at the meta-level, and
g^Γ: Var → D is the environment defined by projections on g:

    g^⟨⟩(y) is undefined, for all y
    g^(Γ,x:A)(y) = π2(g),          if y ≡ x
                 = (π1(g))^Γ(y),   otherwise

The following theorem establishes the main soundness property of the model
construction. It also shows well-definedness of the interpretation of contexts [[Γ]] and
of types [[Γ ⊢ A]], whenever Γ is a context and Γ ⊢ A. The parts of the theorem need
to be proven simultaneously because of type dependency.

Theorem 5.4 (Well-Definedness and Soundness)
1. If Γ Context then [[Γ]] is a PER.
2. If Γ ⊢ A then [[Γ ⊢ A]]g1 = [[Γ ⊢ A]]g2 whenever g1 [[Γ]] g2.
3. If Γ ⊢ M : A then [[M]]g1^Γ ([[Γ ⊢ A]]g1) [[M]]g2^Γ whenever g1 [[Γ]] g2.
4. If Γ ⊢ A ≤ B then [[Γ ⊢ A]]g ⊆ [[Γ ⊢ B]]g for all g ∈ [[Γ]].

Proof. Simultaneously by induction on derivations, making use of the axioms of the
lambda model and the definitions of the PER constructions. □

Notice that a special case of part 3 is the soundness property that Γ ⊢ M : A implies
[[M]]g^Γ ∈ [[Γ ⊢ A]]g for any g ∈ [[Γ]]. The existence of PER models (generated from
λ-models and some PERs to interpret atomic types), together with this soundness
result, guarantees the consistency of the calculus.

6  Related and Further Work

We have presented the type system λ≤{}, which adds singleton types and Π-types
to the simply-typed lambda calculus with subtypes, λ≤. This gives a system with
types that depend on terms. Singleton types {M}A can be interpreted in a PER
model as the equivalence class of [[M]] in the PER [[A]].

The judgements of λ≤{} are defined simultaneously and typing embeds in
subtyping in a simple way. Noticing this, we can give alternative, shorter presentations
of the system which "wrap up" judgements in terms of each other; this can be useful
because it simplifies tedious simultaneous inductions. One way is to encode typing in
terms of subtyping, defining Γ ⊢ M : A iff Γ ⊢ {M}A ≤ B holds for some B. Another
is to encode subtyping in terms of typing, defining Γ ⊢ A ≤ B iff Γ, x : A ⊢ x : B;
we get the usual subtyping rules if a rule of η-reduction is added.

Decidability of type-checking has not yet been investigated, but we conjecture
that it holds. A typical approach would be to first study the reduction behaviour
of terms and types, for both β-reduction and Γ-reduction. We would hope to find
a normalization result and a set of syntax-directed rules for the system, with points
where normalization occurs. The final step would be to give a termination argument
for the deterministic rules, to show that they define a complete algorithm.

Research into type systems for object-oriented programming (e.g., [Car88b, BL90,
CG92]) gave inspiration for this work. Most systems in the literature do not treat
type dependency at the same time as subtyping (Cardelli's is an exception). Hayashi's
extensions of System F [Hay94] have singleton types with the same form as those
described here. His calculi have union and intersection types, so dependent products
become a derived notion. Hayashi's treatment of singletons differs; his is based on
encoding a propositional equality (which reflects his intention to create a constructive
logic), rather than a definitional one. He has to add a rule of subject reduction as
primitive, since there is a counterexample to its admissibility involving union types.

We hope that some of the development here will carry over to systems with
application at the level of types, but the situation with full dependent types is more
difficult than the simple case where each term in a type is warmly insulated with
singleton braces. The harder case is the topic of some joint work in progress between
the author and Adriana Compagnoni.

Acknowledgements. I am grateful to Benjamin Pierce, Don Sannella, and Andrzej
Tarlecki for their wisdom, help, and encouragement during the progress of this work.
Useful comments and corrections were provided by the CSL referees. I was supported
by a UK EPSRC postgraduate studentship.

References
[Asp95]     David R. Aspinall. Algebraic specification in a type-theoretic setting.
            Forthcoming PhD thesis, Department of Computer Science, University
            of Edinburgh, 1995.
[BL90]      Kim B. Bruce and Giuseppe Longo. A modest model of records, in-
            heritance and bounded quantification. Information and Computation,
            87:196-240, 1990.
[BTCGS91]   Val Breazu-Tannen, Thierry Coquand, Carl A. Gunter, and Andre Sce-
            drov. Inheritance as implicit coercion. Information and Computation,
            93:172-221, 1991.
[Car88a]    Luca Cardelli. A semantics of multiple inheritance. Information and
            Computation, 76:138-164, 1988.
[Car88b]    Luca Cardelli. Structural subtyping and the notion of power type. In
            Fifteenth Annual ACM Symposium on Principles of Programming Lan-
            guages, 1988.
[CG92]      Pierre-Louis Curien and Giorgio Ghelli. Coherence of subsumption, min-
            imum typing and type-checking in F≤. Mathematical Structures in Com-
            puter Science, 2:55-91, 1992.
[CL91]      Luca Cardelli and Giuseppe Longo. A semantic basis for Quest. Journal
            of Functional Programming, 1(4):417-458, 1991.
[Hay94]     Susumu Hayashi. Singleton, union and intersection types for program
            extraction. Information and Computation, 109, 1994.
[HL80]      R. Hindley and G. Longo. Lambda calculus models and extensionality.
            Z. Math. Logik Grundlag. Math., 26:289-310, 1980.
[HP91]      Robert Harper and Robert Pollack. Type checking with universes. The-
            oretical Computer Science, 89:107-136, 1991.
[KST94]     Stefan Kahrs, Donald Sannella, and Andrzej Tarlecki. The definition of
            Extended ML. Technical Report ECS-LFCS-94-300, LFCS, Department
            of Computer Science, University of Edinburgh, 1994.
[SP94]      Paula Severi and Erik Poll. Pure type systems with definitions. In
            Logical Foundations of Computer Science, LFCS '94, Lecture Notes in
            Computer Science 813, pages 316-328. Springer-Verlag, 1994.
[SST92]     Donald T. Sannella, Stefan Sokołowski, and Andrzej Tarlecki. Toward
            formal development of programs from algebraic specifications: Parame-
            terisation revisited. Acta Informatica, 29:689-736, 1992.
[SW83]      Donald Sannella and Martin Wirsing. A kernel language for algebraic
            specification and implementation. In Proceedings of the International
            Conference on Foundations of Computation Theory, Borgholm, Sweden,
            Lecture Notes in Computer Science 158. Springer-Verlag, 1983.

A Subtyping for the Fisher-Honsell-Mitchell
Lambda Calculus of Objects

Viviana Bono and Luigi Liquori

Dipartimento di Informatica - Università di Torino,
C.so Svizzera 185 - 10149 Torino
E-mail: bono,liquori@di.unito.it

Abstract. Labeled types and a new relation between types are added
to the lambda calculus of objects as described in [6]. This relation is
a trade-off between the possibility of having a restricted form of width
subtyping and the features of the delegation-based language itself. The
original type inference system allows both specialization of the type of an
inherited method to the type of the inheriting object and static detection
of errors, such as 'message-not-understood'. The resulting calculus is an
extension of the original one. Type soundness follows from the subject
reduction property.

1  Introduction

Object-oriented languages can be classified as either class-based or delegation-based
languages. In class-based languages, such as Smalltalk [4] and C++ [5], the
implementation of an object is specified by its class. Objects are created by
instantiating their classes. In delegation-based languages, objects are defined directly
from other objects by adding new methods via method addition and replacing old
method bodies with new ones via method override. Adding or overriding a method
produces a new object that inherits all the properties of the original one. In this
paper we consider the delegation-based axiomatic model developed by Fisher, Honsell
and Mitchell, and, in particular, we refer to the model in [6] and [7]. This calculus
offers:
- a very simple and effective inheritance mechanism,
- a straightforward mytype method specialization,
- dynamic lookup of methods, and
- easy definition of binary methods.
The original calculus is essentially an untyped lambda calculus enriched with object
primitives. There are three operations on objects: method addition (denoted by
⟨e1 ←+ m=e2⟩) to define methods, method override (⟨e1 ← m=e2⟩) to re-define
methods, and method call (e ⇐ m) to send a message m to an object e. In the
system of [6], the method addition makes sense only if the method m does not occur
in the object e, while method override can be done only if m occurs in e. If the
expression e1 denotes an object without method m, then ⟨e1 ←+ m=e2⟩ denotes a
new object obtained from e1 by adding the method body e2 for m.

When the message m is sent to ⟨e1 ←+ m=e2⟩, the result is obtained by applying e2
to ⟨e1 ←+ m=e2⟩ (similarly for ⟨e1 ← m=e2⟩).

This form of self-application allows to model the special symbol self of
object-oriented languages directly by lambda abstraction. Intuitively, the method
body e2 must be a function and the first actual parameter of e2 will always be the
object itself. The type system of this calculus allows methods to be specialized
appropriately as they are inherited.

We consider the type of an object as the collection of the types of its methods.
The intuitive definition of width subtyping then is: σ is a subtype of τ if σ has more
methods than τ. The standard subsumption rule allows to use an object of type σ in
any context expecting an object of type τ. In the original object calculus of [6], no
width subtyping is possible, because the addition of the method m to the object e is
allowed if and only if m does not occur in e. So, the object e could not be replaced
by an object e' that already contains m.

Moreover, it is not possible to have depth subtyping, namely, to generalize the
types of methods that appear in the type of the object, because with method override
we can give a type to an expression that produces run-time errors (a nice example
of [1] is translated into the original object calculus in [8]).

In this paper, we introduce a restricted form of subtyping, informally written as
σ ⊑ τ. This relation is a width subtyping, i.e., a type of an object is a subtype of
another type if the former has more methods than the latter. Subtyping is
constrained by one restriction: σ is a subtype of another type τ if and only if we can
assure that the methods of σ that are not methods of τ are not referred to by the
methods also in τ. The restriction is crucial to avoid that methods of τ refer to the
forgotten methods of σ, causing a run-time error. The subtyping relation allows to
forget methods in the type without changing the shape of the object; it follows that
we can type programs that accept as actual parameters objects with more methods
than could be expected. The information on which methods are used is collected by
introducing labeled types. A first consequence of this relation is that it is possible to
have an object in which a method is, via a new operation, added more than once.
For this reason, we introduce a different symbol to indicate the method addition
operation on objects, namely ⟨e1 ←∘ m=e2⟩.

The operation ←∘ behaves exactly as the method addition of [6], but it can be
used to add the same method more than once. For example, in the object

    ⟨⟨⟨⟩ ←∘ m=e1⟩ ←∘ m=e2⟩

the first addition of the method m is forgotten by the type inference system via a
subsumption rule. Our extension gives the following (positive) consequences:
- objects with extra methods can be used in any context where an object with
  fewer methods might be used,
- our subtyping relation does not cause the shortcomings described in [1],
- we do not lose any feature of the calculus of [6].

We also extend the set of objects and we present an alternative operational
semantics. Our evaluation rules search method bodies more directly and deal with
possible errors. This semantics was inspired by [2], where the calculus is proved to
be Church-Rosser. The typing of the operator for searching methods uses the
information given by labeled types in an essential way.

This paper is organized as follows. In Section 2 we present our language and the
evaluation strategy; in Section 3 we give the type inference rules for the calculus and
the subtyping relation. Some interesting examples, showing the power of this calculus
with respect to the original one, are illustrated. In Section 4 we prove some structural
properties of the system and we give a subject reduction theorem. Moreover, we
prove type soundness, in the sense that we show that the type system prevents
message-not-understood errors. Because of the page limit, not all the proofs are
included here; see [3] for details.

2  Untyped Calculus of Objects

The untyped lambda calculus enriched with object-related syntactic forms is defined
as follows:

    e ::= x | c | λx.e | e1 e2 | ⟨⟩ | ⟨e1 ←∘ m=e2⟩ | ⟨e1 ← m=e2⟩ | e ⇐ m | e ←↑ m | err,

where x is a term variable, c belongs to a fixed set of constants, and m is a method
name. The object forms are:

    ⟨⟩                  the empty object;
    ⟨e1 ←∘ m=e2⟩        extends object e1 with a new method m having body e2;
    ⟨e1 ← m=e2⟩         replaces the body of method m in e1 by e2;
    e ⇐ m               sends message m to the object e;
    e ←↑ m              searches for the body of the message m in the object e;
    err                 the error object.

Notice that the last two object forms are not present in the original calculus of [6].
Let ←∗ denote either ←∘ or ←. The description of an object via ←∗ operations is
intensional, and the object corresponding to a sequence of ←∗ can be extensionally
defined as follows.
Definition 1. Let m₁, ..., mₖ, and n be distinct method names. ⟨m₁=e₁, ..., mₖ=eₖ⟩
is defined as:
1. ⟨... ⟨⟨⟩ ←∘ m₁=e₁⟩ ... ←∘ mₖ=eₖ⟩ = ⟨m₁=e₁, ..., mₖ=eₖ⟩,
2. ⟨⟨m₁=e₁, ..., mᵢ=eᵢ, ..., mₖ=eₖ⟩ ←∗ mᵢ=e′⟩ = ⟨m₁=e₁, ..., mᵢ=e′, ..., mₖ=eₖ⟩,
3. ⟨⟨m₁=e₁, ..., mₖ=eₖ⟩ ←∘ n=e′⟩ = ⟨m₁=e₁, ..., mₖ=eₖ, n=e′⟩.

So ⟨m₁=e₁, ..., mₖ=eₖ⟩ is the object in which each method body corresponds
to the outermost assignment (by addition or by override) performed on the
method.
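For illustration only (this sketch is ours, not the paper's), the extensional collapse of Definition 1 can be computed over a list of addition/override operations; method bodies are kept abstract and the order of methods in the resulting record is not preserved.

  (* Collapsing a sequence of additions/overrides into the extensional
     record of Definition 1; bodies are abstract ('b). *)
  type 'b op = Addm of string * 'b | Overm of string * 'b

  let collapse (ops : 'b op list) : (string * 'b) list =
    List.fold_left
      (fun obj op ->
         match op with
         | Addm (m, e) | Overm (m, e) ->
             (* the outermost assignment on m wins *)
             (m, e) :: List.remove_assoc m obj)
      [] ops

For instance, collapse [Addm ("m", 1); Overm ("m", 2)] evaluates to [("m", 2)], as item 2 of Definition 1 prescribes.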

2.1 T h e Evaluation Rules


The operational semantics of the original calculus of [6] is mainly based on the
following evaluation rules for β-reduction and for message sending (⇐):
(β)  (λx.e₁)e₂ →_eval [e₂/x]e₁
(⇐)  ⟨e₁ ←∗ m=e₂⟩ ⇐ m →_eval e₂⟨e₁ ←∗ m=e₂⟩.

To send message m to the object e means applying the body of m to the object
itself. In fact, the body of m is a lambda abstraction whose first bound variable
will be substituted by the full object in the next step of β-reduction.
The problem that arises in the calculus of objects is how to extract the appropriate
method out of an object. The most natural way is to move the required
method into an accessible position (the outermost one). This means treating
objects as sets of methods. Unfortunately, this approach is not possible: in fact,
the typing rules of objects depend on the order of ←∗ operations. For instance,
the typing of e₃ in the object expression ⟨⟨e₁ ←∗ m=e₂⟩ ←∗ n=e₃⟩ depends on
the typing of the "subobjects" e₁ and ⟨e₁ ←∗ m=e₂⟩.
The approach chosen in [6] to solve the problem of method order is to add
to the →_eval relation a bookkeeping relation →_book. This relation leads to a standard
form, in which each method is defined exactly once (with the extension operation),
using some "dummy" bodies, and redefined exactly once (with the override
operation), giving it the desired body.
In our system the notion of standard form is not useful, since the subject reduction
property does not hold for the →_book part of the evaluation relation. On the other
hand, we can use the extra information contained in types to type correctly the
extraction of the bodies of methods from the objects (it will be clear how in
paragraph 3.2). Therefore, we propose the following operational semantics. We
list here only the most meaningful reduction rules. Appendix 1 contains the full
set of rules, which includes rules of error propagation. The evaluation relation is
the least congruence generated by these rules.
(β)        (λx.e₁)e₂ →_eval [e₂/x]e₁
(⇐)        e ⇐ m →_eval (e ↼ m)e
(succ ↼)   ⟨e₁ ←∗ m=e₂⟩ ↼ m →_eval e₂
(next ↼)   ⟨e₁ ←∗ n=e₂⟩ ↼ m →_eval e₁ ↼ m
(fail ⟨⟩)  ⟨⟩ ↼ m →_eval err
(fail abs) λx.e ↼ m →_eval err.
To send message m to the object e still means applying the body of m to the
object itself. The difference is that in our semantics the body of the method is
recursively searched for by the ↼ operator, without modifying the shape of the full
object; if such a method does not exist, the object evaluates to the error. Observe
that a rule (fail var) x ↼ m →_eval err would be unsound, since the variable x could
be substituted (by applying a β-reduction) by an object containing the method
m.
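To make the search semantics concrete, the following OCaml sketch (ours, not part of the paper) transcribes the object fragment and the rules (succ ↼), (next ↼), (fail ⟨⟩) and (fail abs); constructor and function names are invented for illustration, and β-reduction and error propagation are only partially treated.

  (* A minimal sketch of the object fragment and of the method-search rules. *)
  type term =
    | Var of string
    | Lam of string * term
    | App of term * term
    | Empty                            (* <>                      *)
    | Add of term * string * term      (* <e1 <-o m = e2>         *)
    | Over of term * string * term     (* <e1 <-  m = e2>         *)
    | Send of term * string            (* e <= m                  *)
    | Search of term * string          (* e <-' m  (body search)  *)
    | Err

  (* One step of method search. *)
  let rec search (e : term) (m : string) : term =
    match e with
    | Add (e1, n, e2) | Over (e1, n, e2) ->
        if n = m then e2                      (* (succ <-')                        *)
        else search e1 m                      (* (next <-')                        *)
    | Empty | Lam (_, _) | Err -> Err         (* (fail <>), (fail abs), (err <-')  *)
    | _ -> Search (e, m)                      (* stuck, e.g. a free variable       *)

  (* Sending a message applies the found body to the whole object. *)
  let send (e : term) (m : string) : term = App (search e m, e)

For instance, search (Add (Empty, "m", Lam ("self", Var "self"))) "m" returns the stored body, while search Empty "m" returns Err; send applies the found body to the whole object, as in the rule e ⇐ m →_eval (e ↼ m)e.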

3 Static Type System

The central part of the type system of an object oriented language consists of
the types of objects. In [6], the type of an object is called a class-type. It has the
form:
class t.⟨⟨m₁:τ₁, ..., mₖ:τₖ⟩⟩,
where ⟨⟨m₁:τ₁, ..., mₖ:τₖ⟩⟩ is called a row expression. This type expression describes
the properties of any object e that can receive messages mᵢ, (e ⇐ mᵢ),
producing a result of type τᵢ, for 1 ≤ i ≤ k. The bound variable t may appear in
τᵢ, referring to the object itself. Thus, a class-type is a form of recursively-defined
type. As a simple example of types in [6], we consider the following point object:
point ≝ ⟨x=λself.0, mvx=λself.λdx.⟨self ← x=λs.(self ⇐ x) + dx⟩⟩,
with the following class-type: class t.⟨⟨x:int, mvx:int→t⟩⟩.
A significant aspect of this type system is that the type (int→t) of method
mvx does not change syntactically if we perform a method addition of a method
color to build a colored_point object from point. Instead, the meaning of the
type changes, since, before the color addition, the bound variable t referred to
an object of type point, and afterwards t refers to an object of type colored_point.
So the type of a method may change when a method is inherited: the authors
of [6] called this property method specialization (also called mytype specialization
in [8]). The typing rules ensure that every possible type for an added method
will be correct, and this is done via a sort of implicit higher-order polymorphism.
To allow subtyping, we add a new sort of types, the labeled types, that carry
the information about the methods used to type a certain method body.
This information is given by a subscript which is a set of method names. The
methods used to type a body are roughly the method names which occur in the
body itself. For example, suppose that the object e₁ has a method m with body
e₂, and that in e₂ a message n is sent to the bound variable self and a method n′ (of
e₁) is overridden. Then the type τ of e₂ is subscripted by the set {n, n′}, since e₂
uses n, n′. These labeled types are written inside the row of the class-type and
they do not appear externally. Therefore, in our system the object point will
have the following class-type: class t.⟨⟨x:int_{}, mvx:(int→t)_{x}⟩⟩.
We can forget by subtyping those methods that are not used by other methods
in the object, i.e., a method is forgettable if and only if it does not appear in
the labels of the types of the remaining methods. This dependency is correctly
handled in the typing rules for adding and overriding methods (i.e., (obj ext)
and (obj over)), where the labels of types are created. We refer to section 3.4 for
some meaningful examples.

3.1 Types, Rows, and Kinds

The type expressions include type constants, type variables, function types and
class-types. In this paper, a term will be an object of the calculus, or a type, or
a row, or a kind.
The symbols σ, τ, ρ, ... are metavariables over types; ι ranges over type
constants; t, self, ... range over type variables; Δ ranges over labels; α, β, ...
range over labeled types; R, r range over rows and row variables respectively;
m, n, ... range over method names and κ ranges over kinds. The symbols a, b,
c, ... range over term variables or constants; u, v, ... range over type and row
variables; U, V, ... range over type, row, and kind expressions, and, finally, A,
B, C, ... range over terms. All symbols may appear indexed. The sets of types,
rows and kinds are mutually defined by the following grammar:
Types          τ ::= ι | t | τ→τ | class t.R
Labels         Δ ::= {} | Δ ∪ {m}
Labeled Types  α ::= τ_Δ
Rows           R ::= r | ⟨⟨⟩⟩ | ⟨⟨R | m:α⟩⟩ | λt.R | R τ
Kinds          κ ::= T | Tⁿ→[m₁, ..., mₖ]   (n ≥ 0, k ≥ 1).
We say that τ is the type and Δ is the label of the labeled type τ_Δ.
The row expressions appear as subexpressions of class-type expressions, with
rows and types distinguished by kinds. Intuitively, the elements of kind [m₁, ..., mₖ]
are rows that do not include the method names m₁, ..., mₖ. We need this information
to guarantee statically that methods are not multiply defined. In what follows,
we use the notation m̄:τ̄ as short for m₁:τ₁, ..., mₖ:τₖ and m̄:ᾱ as short for
m₁:α₁, ..., mₖ:αₖ.
We say that a row R′ is a subrow of a row R if and only if R = ⟨⟨R′ | m̄:ᾱ⟩⟩, for
suitable m̄ and ᾱ.
The set S(R) of labels of a row R is inductively defined by:
S(⟨⟨⟩⟩) = S(r) = {},  S(⟨⟨R | m:τ_Δ⟩⟩) = S(R) ∪ Δ,  S(λt.R) = S(R),  and  S(R τ) = S(R).
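Purely as an illustration (ours, not the paper's), the grammar above and the label set S(R) can be transcribed into OCaml datatypes as follows.

  (* Illustrative transcription of types, labels, labeled types and rows. *)
  type ty =
    | Const of string                 (* iota                 *)
    | TVar of string                  (* t                    *)
    | Arrow of ty * ty                (* tau -> tau           *)
    | Class of string * row           (* class t.R            *)
  and labeled = ty * string list      (* tau_Delta            *)
  and row =
    | RVar of string                  (* r                    *)
    | REmpty                          (* << >>                *)
    | RExt of row * string * labeled  (* << R | m : alpha >>  *)
    | RAbs of string * row            (* \t.R                 *)
    | RApp of row * ty                (* R tau                *)

  (* S(R): the union of the labels occurring in a row. *)
  let rec labels (r : row) : string list =
    match r with
    | RVar _ | REmpty -> []
    | RExt (r', _, (_, delta)) -> labels r' @ delta
    | RAbs (_, r') -> labels r'
    | RApp (r', _) -> labels r'

For example, labels (RExt (REmpty, "mvx", (Arrow (Const "int", TVar "t"), ["x"]))) evaluates to ["x"], matching S(⟨⟨mvx:(int→t)_{x}⟩⟩) = {x}.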
In our system, the contexts are defined as follows:
Γ ::= ε | Γ, x:τ | Γ, t:T | Γ, r:κ | Γ, ι₁ ≤ ι₂,
and the judgement forms are:
Γ ⊢ ∗    Γ ⊢ R:κ    Γ ⊢ τ:T    Γ ⊢ τ₁ ≤ τ₂    Γ ⊢ e:τ.
The judgement Γ ⊢ ∗ can be read as "Γ is a well-formed context". The meaning
of the other judgements is the usual one.

3.2 Typing Rules

In this subsection we discuss all the typing rules which are new with respect to
[6], except for the subsumption rule which will be discussed in the next section.
More precisely, we present the rules for extending an object with a new method
or for re-defining an existing one with a new body, the rule for searching method
bodies and the rule for sending messages. The remaining rules of the type system
are presented in Appendix 2.
We can assume, without loss of generality, that the order of methods inside
rows can be arbitrarily modified: this assumption allows us to write any method as
the last method listed in the class-type. A formal definition of type equality is
given in Appendix 2.
The (obj ext) rule performs a method addition, producing the new object
⟨e₁ ←∘ n=e₂⟩. This rule always adds the method to the syntactic object, whether
the latter is not present at all or it is present in the object but was previously
forgotten in the type by an application of the subtyping rule (sub ≤). Another
task performed by this rule is to build the labeled type τ_{m̄} for the new method
n, where the label {m̄} represents the set of all methods of e₁ that are useful to
type n's body.
Γ ⊢ e₁ : class t.⟨⟨R | m̄:ᾱ⟩⟩    Γ, t:T ⊢ R : [m̄, n]    n ∉ S(⟨⟨R | m̄:ᾱ⟩⟩)
Γ, r:T→[m̄, n] ⊢ e₂ : [class t.⟨⟨r t | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ)    r not in τ
(obj ext) ――――――――――――――――――――――――――――――――――――――――
Γ ⊢ ⟨e₁ ←∘ n=e₂⟩ : class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩.
The condition n ∉ S(⟨⟨R | m̄:ᾱ⟩⟩) prevents unmeaningful labels, since the
typings of the previously added methods cannot use n. This condition is not
derivable, since e₁ could be a term variable.
The (obj over) rule types an object in which method n is overridden, as in the
original rule of [6]. The label Δ is changed to {m̄}, because Δ represents the
dependences of the previous body of n, and these could no longer hold
for the new body.
Γ ⊢ e₁ : class t.⟨⟨R | m̄:ᾱ, n:τ_Δ⟩⟩
Γ, r:T→[m̄, n] ⊢ e₂ : [class t.⟨⟨r t | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ)    r not in τ
(obj over) ――――――――――――――――――――――――――――――――――――――――
Γ ⊢ ⟨e₁ ← n=e₂⟩ : class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩.
In the (obj ext) and the (obj over) rules we say that the method n uses all
the methods belonging to the label {m̄} associated with the labeled type of n.
The (meth search) rule asserts that the type of the extracted method body
is an instance of the type we deduced for it when the method was added (by an
(obj ext) rule) or overridden (by an (obj over) rule). Note that the labels are used
in an essential way in this rule.
Γ ⊢ e : class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩    Γ, t:T ⊢ R′ : [m̄, n]
(meth search) ――――――――――――――――――――――――――――――――――――――――
Γ ⊢ e ↼ n : [class t.⟨⟨R′ | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ).
The (meth appl) rule is a sort of unfolding rule; in fact the class-type is a
form of recursive type, and it does not need any further explanation.
Γ ⊢ e : class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩
(meth appl) ――――――――――――――――――――――――――――――――――――――――
Γ ⊢ e ⇐ n : [class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩/t]τ.

3.3 The Subtyping Relation and the Subsumption Rule


The subtyping relation is based on the information given by the labeled types
of methods in rows. Looking at rules (obj ext) and (obj over), it is clear that if
the body of the method n in e has type τ, and its typing uses the methods m̄ of
e, then the type of e will be class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩, for suitable R and ᾱ. In
other words, we label the types of the methods in rows by the sets of methods
the typing of their bodies depends on.
The (width ≤) rule says that a type is a subtype of another type if the
forgotten methods (i.e., the methods not occurring) in the second type are not in
the union of the sets of labels of the remaining methods. The condition n̄ ∉ S(R)
formally ensures that the remaining methods do not use the methods n̄.
Γ ⊢ class t.⟨⟨R | n̄:ᾱ⟩⟩ : T    n̄ ∉ S(R)
(width ≤) ――――――――――――――――――――――――――
Γ ⊢ class t.⟨⟨R | n̄:ᾱ⟩⟩ ≤ class t.R.
Clearly, we can forget groups of mutually recursive methods with this rule.
We also have the usual subtyping rules for constants, reflexivity, transitivity
and for the arrow type constructor (which behaves contravariantly in its domain
with respect to the ≤ relation). The full subtyping system is given in Appendix 2.
Let two class-types τ₁ and τ₂ be given, such that the judgement Γ ⊢ τ₁ ≤ τ₂
is derivable and the object e is of type τ₁. The (sub ≤) rule says that we can
also derive type τ₂ for e. It follows that the object e can be used in any context
in which an object of type τ₂ is required. The possibility of giving more types to
the same object makes our calculus more expressive than the original one.
Γ ⊢ e : τ₁    Γ ⊢ τ₁ ≤ τ₂
(sub ≤) ――――――――――――――――
Γ ⊢ e : τ₂.
Using the (sub ≤) rule we can obtain judgements of the shape Γ ⊢ e : class t.R,
where n is a method of e but Γ ⊢ R : [n]. In this case we say that the rule forgets
the method n. It is important to remark that, when a method is forgotten in the
type of an object, it is as if it had never been added to the object.

3.4 Examples

In this section we will present two examples: the first shows how our subtyping
relation works on a critical example of [1]. The second example gives a simple
object, typable in our calculus, but not typable in the original calculus.

Example 1. Given the following objects:
p₁ ≝ ⟨x=λself.0, mvx=λself.λdx.⟨self ← x=λs.(self ⇐ x) + dx⟩⟩
p₂ ≝ ⟨x=λself.1, y=λself.0, mvx=λself.λdx.⟨self ← x=λs.(self ⇐ x) + dx⟩,
      mvy=λself.λdy.⟨self ← y=λs.(self ⇐ y) + dy⟩⟩,
we can derive ⊢ p₁ : P₁ and ⊢ p₂ : P₂, where
P₁ ≝ class t.⟨⟨x:int, mvx:(int→t)_{x}⟩⟩
P₂ ≝ class t.⟨⟨x:int, y:int, mvx:(int→t)_{x}, mvy:(int→t)_{y}⟩⟩,
and int stands for int_{}. It is easy to verify that in our system P₂ ≤ P₁. This
relation between P₂ and P₁ is the one we want to have, since it is the intuitive
relation between a one-dimensional point and a two-dimensional point. If we
modify p₁ and p₂ as follows:
p′₁ ≝ ⟨p₁ ← mvx=λself.λdx.self⟩
p′₂ ≝ ⟨⟨p₂ ← x=λself.((self ⇐ mvx 1) ⇐ y)⟩ ← mvx=λself.λdx.self⟩,
we can derive ⊢ p′₁ : P′₁ and ⊢ p′₂ : P′₂, where
P′₁ ≝ class t.⟨⟨x:int, mvx:(int→t)⟩⟩
P′₂ ≝ class t.⟨⟨x:int_{mvx,y}, y:int, mvx:(int→t), mvy:(int→t)_{y}⟩⟩.
Now P′₂ ≰ P′₁, because we cannot forget the y method, since the x method
uses it. Therefore, we are unable to assign type P′₁ to the object p′₂. In this
way, we avoid the so-called message-not-understood error. In fact, if we allowed
P′₂ ≤ P′₁, we would get ⊢ p′₂ : P′₁ by subtyping. Then, it would be possible to
override the mvx method of p′₂ by a body that has an output of type P′₁. Since
the x method of p′₂ uses y, this would produce a run-time error. Let us formalize
this situation (the original pattern appears in [1], paragraph 5.4). Suppose we
override the object p′₂ as follows:
p″₂ ≝ ⟨p′₂ ← mvx=λself.λdx.p₁⟩.
If we send message x to p″₂, then an error occurs, since the body of x sends
the message y to the object (self ⇐ mvx 1), but this object does not have any y
method.

Example 2. Consider the object draw that can receive two messages: figure,
that describes a geometrical figure, and plot, that, given a point, colors it black
or white, depending on the position of the point with respect to the figure. The
object draw accepts as input both a colored point and a point. This would be
impossible in the original system of [6], since there one would have to write two
different objects, one for colored points and one for points, with different bodies
for the method plot. In fact, for colored points we need an override instead of
an extension. For the object draw:
draw ≝ ⟨figure=λself.λdx.λdy.(dy=f(dx)),
        plot=λself.λp. if (self ⇐ figure)(p ⇐ x)(p ⇐ y)
             then ⟨p ←∘ col=λself.black⟩ else ⟨p ←∘ col=λself.white⟩⟩,
we can derive ⊢ draw : DR, where
DR ≝ class t.⟨⟨figure:int→int→bool, plot:(P→CP)_{figure}⟩⟩
P  ≝ class t.⟨⟨x:int, y:int, mvx:(int→t)_{x}, mvy:(int→t)_{y}⟩⟩
CP ≝ class t.⟨⟨x:int, y:int, mvx:(int→t)_{x}, mvy:(int→t)_{y}, color:colors⟩⟩.

4 Properties of the System

In this section we will show that our extension has all the good properties of
the original system. We follow the same pattern as [7]: first we introduce some
substitution lemmas and, then, the notion of derivation in normal form, which
simplifies the proofs of technical lemmas and the proof of the subject reduction
theorem.
The following lemma is useful both to show a substitution property on type
and kind derivations and to specialize class-types with additional methods. Let
U ∘ V stand for U : V or U ≤ V.

Lemma 2.
1. If Γ, u₂:V₂, Γ′ ⊢ U₁ ∘ V₁ and Γ ⊢ U₂ : V₂ and Γ, [U₂/u₂]Γ′ ⊢ ∗,
   then Γ, [U₂/u₂]Γ′ ⊢ [U₂/u₂]U₁ ∘ [U₂/u₂]V₁.
2. If Γ, u₂:V₂, Γ′ ⊢ ∗ and Γ ⊢ U₂ : V₂ then Γ, [U₂/u₂]Γ′ ⊢ ∗.
3. If Γ, u₂:V₂, Γ′ ⊢ U₁ ∘ V₁ and Γ ⊢ U₂ : V₂ then Γ, [U₂/u₂]Γ′ ⊢ [U₂/u₂]U₁ ∘ [U₂/u₂]V₁.
4. If Γ, r:Tⁿ→[m̄], Γ′ ⊢ e : τ and Γ ⊢ R : Tⁿ→[m̄] then Γ, [R/r]Γ′ ⊢ e : [R/r]τ.

4.1 Normal Form
It is well known that equality rules in proof systems usually complicate derivations,
and make theorems and lemmas more difficult to prove. These rules introduce
many unessential judgement derivations. In this subsection, we introduce
the notions of normal form derivation and of type and row in normal form, respectively
denoted by ⊢_N and rnf as in [7]. Although it is not possible to derive
all judgements of the system by means of these derivations, we will show that all
judgements whose rows and type expressions are in rnf are ⊢_N-derivable. Using
this, we can prove the subject reduction theorem using only ⊢_N derivations.

Definition 3.
1. Γ ⊢_N e : τ is a normal form derivation only if the only appearance
   of an equality rule is as (row β) immediately following an occurrence
   of a (row fn appl) rule.
2. The rnf of a type and of a row are their β-normal forms.

It is easy to show that rnf satisfies the following identities:

Fact 4.  rnf(τ_Δ) = rnf(τ)_Δ,
         rnf(⟨⟨R | m:τ_Δ⟩⟩) = ⟨⟨rnf(R) | m:rnf(τ)_Δ⟩⟩,  and
         rnf(class t.R) = class t.rnf(R).
In [7], the type and row parts of the calculus are translated into the typed
λ-calculus with function types over an assigned signature Σ (called λ→(Σ)). In
Appendix 3, we present an extension of the translation function tr of [7] that
takes labeled types into account. This extension is done by deleting the labels
in labeled types.
The following theorem states that the row and type fragment of our calculus
is strongly normalizing and confluent. For the proof we can refer to [7], since the
target calculus λ→(Σ) of tr is unchanged.

Theorem 5.
1. If Γ ⊢ U : V then there is no infinite sequence of →_β reductions out of U.
2. If Γ ⊢ U₁ : V₁ and U₁ →→_β U₂ and U₁ →→_β U₃, then there exists U₄ such
   that U₂ →→_β U₄ and U₃ →→_β U₄. □
The following lemma states that, for each derivation in our system, it is possible
to find a corresponding derivation in normal form. Let A ∘ B stand for any
statement in the calculus, and let rnf(e) = e.
Lemma 6. If Γ ⊢ A ∘ B then rnf(Γ) ⊢_N rnf(A) ∘ rnf(B). □

4.2 Technical Lemmas
We are going to state some technical lemmas, necessary to prove some parts of
the subject reduction theorem. They essentially say that each component of a
judgement is well-formed.
The following lemma shows that the contexts are well-formed in every judgement
and allows us to treat contexts, which are lists, more like sets. Moreover, it
enables us to build well-formed row expressions.

Lemma 7.
1. If Γ ⊢_N A ∘ B then Γ ⊢_N ∗.
2. If Γ, Γ′ ⊢_N A ∘ B and Γ, c ∘ V, Γ′ ⊢_N ∗ then Γ, c ∘ V, Γ′ ⊢_N A ∘ B.
3. If Γ ⊢_N λt₁...tₚ.⟨⟨R | m̄:ᾱ⟩⟩ : Tᵖ→[n̄], then Γ, t₁:T, ..., tₚ:T ⊢_N τᵢ : T for
   each τᵢ in ᾱ and Γ, t₁:T, ..., tₚ:T ⊢_N R : [n̄, m̄].
4. If Γ ⊢_N class t.⟨⟨R | m̄:ᾱ⟩⟩ : T then Γ, t:T ⊢_N τᵢ : T for each τᵢ in ᾱ and
   Γ, t:T ⊢_N R : [m̄]. □
The proof of the above lemma is an easy extension of the proofs of Lemmas 4.11
and 4.12 in [7]. The last lemma of this section assures us that we can deduce
only well-formed types for objects.
Lemma 8.
1. If Γ ⊢_N σ ≤ τ then Γ ⊢_N σ : T and Γ ⊢_N τ : T.
2. If Γ ⊢_N e : τ then Γ ⊢_N τ : T. □

4.3 Subject Reduction Theorem


We are going to prove the subject reduction property for our calculus, by a case
analysis of the →_eval rules.
The next lemma is used to prove that β-reduction preserves types.

Lemma 9.
1. If Γ, x:τ₁, Γ′ ⊢_N e₂ : τ₂ and Γ ⊢_N e₁ : τ₁ then Γ, Γ′ ⊢_N [e₁/x]e₂ : τ₂.
2. If Γ ⊢_N e : τ and e →_β e′ then Γ ⊢_N e′ : τ. □
Theorem 10 (Subject Reduction). If Γ ⊢_N e : σ and e →_eval e′ then Γ ⊢_N e′ : σ.

Proof. It is enough to show that each of the basic evaluation steps preserves the
type of the expression being reduced. We show the derivation for the left-hand
side of each rule (considering the most difficult cases, in which the (sub ≤) rule
is applied after each other rule) and then we build the correct derivation for the
right-hand side. For the rules (succ ↼) and (next ↼), we consider the ←∘ cases
only. The ← cases are similar.
• (β) (λx.e₁)e₂ →_eval [e₂/x]e₁. This case follows from Lemma 9.
• (⇐) e ⇐ m →_eval (e ↼ m)e. This case follows from Lemma 8(2) and
Lemma 7(4).
• (succ ↼) ⟨e₁ ←∘ n=e₂⟩ ↼ n →_eval e₂. In this case the left-hand side is:
Γ ⊢_N e₁ : class t.⟨⟨R | m̄:ᾱ⟩⟩    Γ, t:T ⊢_N R : [m̄, n]    n ∉ S(⟨⟨R | m̄:ᾱ⟩⟩)
Γ, r:T→[m̄, n] ⊢_N e₂ : [class t.⟨⟨r t | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ)    r not in τ
(obj ext) ――――――――――――――――――――――――――――――
Γ ⊢_N ⟨e₁ ←∘ n=e₂⟩ : class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩
(sub ≤) ――――――――――――――――――――――――――――――
Γ ⊢_N ⟨e₁ ←∘ n=e₂⟩ : class t.⟨⟨R′ | m̄:ᾱ, n:τ_{m̄}⟩⟩    Γ, t:T ⊢_N R″ : [m̄, n]
(m search) ――――――――――――――――――――――――――――――
Γ ⊢_N ⟨e₁ ←∘ n=e₂⟩ ↼ n : [class t.⟨⟨R″ | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ)
(sub ≤) ――――――――――――――――――――――――――――――
Γ ⊢_N ⟨e₁ ←∘ n=e₂⟩ ↼ n : σ,
where (m search) abbreviates (meth search). Observe that it is not possible to
forget n in the first application of the (sub ≤) rule, since afterwards we have to type
its search. Moreover, it is not possible to forget any of the m̄ methods, since n
uses them.
From Γ, t:T ⊢_N R″ : [m̄, n], by applying the (row fn abs) rule, we get Γ ⊢_N λt.R″ :
T→[m̄, n]. This, together with Γ, r:T→[m̄, n] ⊢_N e₂ : [class t.⟨⟨r t | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ),
implies, by Lemma 2(4), Γ ⊢ e₂ : [class t.⟨⟨(λt.R″) t | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ). So,
by Lemma 6, we can conclude Γ ⊢_N e₂ : [class t.⟨⟨R″ | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ). Then
we build a derivation for the right-hand side as follows:
Γ ⊢_N e₂ : [class t.⟨⟨R″ | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ)
(sub ≤) ――――――――――――――――――――――――――――――
Γ ⊢_N e₂ : σ.
This proof shows the usefulness of labeled types. Thanks to the label {m̄} (which
contains all the methods used by n) it is possible to reconstruct the correct type
of e₂ in the derivation of the left-hand side of the rule, and therefore the correct
type of the right-hand side.
• (next ↼) ⟨e₁ ←∘ m=e₂⟩ ↼ n →_eval e₁ ↼ n.
There are two possible cases, according to whether the labeled type of m
contains n or not. We consider the first case, the second being similar. The
left-hand side is:
D    Γ, t:T ⊢_N R″ : [m̄, n]
(meth search) ――――――――――――――――――――――――――――――
Γ ⊢_N ⟨e₁ ←∘ m=e₂⟩ ↼ n : [class t.⟨⟨R″ | m̄:ᾱ, n:τ_{l̄}⟩⟩/t](t→τ)
(sub ≤) ――――――――――――――――――――――――――――――
Γ ⊢_N ⟨e₁ ←∘ m=e₂⟩ ↼ n : σ,
where D is the following derivation:
Γ ⊢_N e₁ : class t.⟨⟨R | m̄:ᾱ, n:τ_{l̄}⟩⟩    Γ, t:T ⊢_N R : [m̄, n, m]
Γ, r:T→[m̄, m, n] ⊢_N e₂ : [class t.⟨⟨r t | m̄:ᾱ, n:τ_{l̄}, m:ρ_{p̄,n}⟩⟩/t](t→ρ)
m ∉ S(⟨⟨R | m̄:ᾱ, n:τ_{l̄}⟩⟩)    r not in ρ
(obj ext) ――――――――――――――――――――――――――――――
Γ ⊢_N ⟨e₁ ←∘ m=e₂⟩ : class t.⟨⟨R | m̄:ᾱ, n:τ_{l̄}, m:ρ_{p̄,n}⟩⟩
(sub ≤) ――――――――――――――――――――――――――――――
Γ ⊢_N ⟨e₁ ←∘ m=e₂⟩ : class t.⟨⟨R′ | m̄:ᾱ, n:τ_{l̄}⟩⟩.
The correctness of the application of the (sub ≤) rule in D implies that ⟨⟨m̄:ᾱ, n:τ_{l̄}⟩⟩
is a subrow of ⟨⟨R | m̄:ᾱ, n:τ_{l̄}, m:ρ_{p̄,n}⟩⟩. Moreover, the condition m ∉ S(⟨⟨R |
m̄:ᾱ, n:τ_{l̄}⟩⟩) implies m ∉ {l̄}. Therefore, ⟨⟨R | m̄:ᾱ, n:τ_{l̄}⟩⟩ = ⟨⟨R‴ | m̄:ᾱ, n:τ_{l̄}⟩⟩,
for a suitable R‴. So we can build the derivation for the right-hand side:
Γ ⊢_N e₁ : class t.⟨⟨R‴ | m̄:ᾱ, n:τ_{l̄}⟩⟩    Γ, t:T ⊢_N R″ : [m̄, n]
(meth search) ――――――――――――――――――――――――――――――
Γ ⊢_N e₁ ↼ n : [class t.⟨⟨R″ | m̄:ᾱ, n:τ_{l̄}⟩⟩/t](t→τ)
(sub ≤) ――――――――――――――――――――――――――――――
Γ ⊢_N e₁ ↼ n : σ.
Clearly, the left-hand sides of all the other evaluation rules cannot be typed. □

4.4 Type Soundness

The subject reduction proof shows the power of our typing system. Labeled
types not only allow a restricted form of subtyping that enriches the set of types
of typable objects, but they also lead us to a simpler and more natural
operational semantics, in which no transformations on the objects are necessary
to get the body of a message. In fact, the typing rule for the ↼ operation
is strictly based on the information given by the labels. Moreover, since an →_eval
step produces the object err when a message m is sent to an expression which does not
define an object with a method m, the type soundness follows from Theorem 10.

Theorem 11. If Γ ⊢_N e : τ is derivable for some Γ and τ, then the evaluation
of e cannot produce err, i.e. e does not evaluate to err. □

Notice that in [7] the type soundness was proved by introducing a structural
operational semantics and showing suitable properties.

5 Conclusions

This paper extends the delegation-based calculus of objects of [6] with a subtyping
relation between types. This new relation states that two types of objects
with different numbers of methods can be subsumed under certain restrictions.
This restricted form of subsumption is conservative with respect to the features
of delegation-based languages which are present in the original system.
Among the other proposals that allow width specialization which we have studied,
one solution is to explicitly coerce an object with more methods into an object
with fewer methods, by expanding each call of a method that does not belong to
the smaller type with the proper body of that method. This solution requires a
new subsumption rule that performs this job explicitly; this rule creates a quite
different object but allows one to eliminate the labels and the restrictions on them.
Further goals of our research are:
- finding a denotational model for the calculus,
- adding some mechanism to model encapsulation,
- determining whether type-checking for this calculus is decidable.
Acknowledgements
The authors wish to thank Mariangiola Dezani-Ciancaglini for her precious
technical support and encouragement, Furio Honsell and Kathleen Fisher for
their helpful comments, Steffen van Bakel and Paola Giannini for their careful
reading of the preliminary versions of the paper.

References
1. Abadi, M., Cardelli, L., A Theory of Primitive Objects. Manuscript, 1994. Also in
   Proc. Theoretical Aspects of Computer Software, LNCS 789, Springer-Verlag, 1994,
   pp. 296-320.
2. Bellè, G., Some Remarks on Lambda Calculus of Objects. Internal Report, Dipartimento
   di Matematica ed Informatica, Università di Udine, 1994.
3. Bono, V., Liquori, L., A Subtyping for the Fisher-Honsell-Mitchell Lambda
   Calculus of Objects. Full version, available as
   ftp://pianeta.di.unito.it/pub/LAMBDA/liquori/subtyping.ps.Z, 1994.
4. Borning, A.H., Ingalls, D.H., A Type Declaration and Inference System for
   Smalltalk. In Proc. ACM Symp. Principles of Programming Languages, ACM
   Press, 1982, pp. 133-141.
5. Ellis, M.A., Stroustrup, B., The Annotated C++ Reference Manual. Addison Wesley,
   1990.
6. Fisher, K., Honsell, F., Mitchell, J. C., A Lambda Calculus of Objects and Method
   Specialization. In Proc. 8th Annual IEEE Symposium on Logic in Computer Science,
   Computer Society Press, 1993, pp. 26-38.
7. Fisher, K., Honsell, F., Mitchell, J. C., A Lambda Calculus of Objects and Method
   Specialization. Nordic Journal of Computing, 1(1), 1994, pp. 3-37.
8. Fisher, K., Mitchell, J. C., Notes on Typed Object-Oriented Programming. In Proc.
   Theoretical Aspects of Computer Software, LNCS 789, Springer-Verlag, 1994, pp.
   844-885.

Appendix 1: Evaluation Rules

(β)        (λx.e₁)e₂ →_eval [e₂/x]e₁            (err abs)   λx.err →_eval err
(⇐)        e ⇐ m →_eval (e ↼ m)e                (err appl)  err e →_eval err
(succ ↼)   ⟨e₁ ←∗ m=e₂⟩ ↼ m →_eval e₂           (err ↼)     err ↼ m →_eval err
(next ↼)   ⟨e₁ ←∗ n=e₂⟩ ↼ m →_eval e₁ ↼ m       (fail ⟨⟩)   ⟨⟩ ↼ m →_eval err
(err ←∗)   ⟨err ←∗ m=e⟩ →_eval err              (fail abs)  λx.e ↼ m →_eval err

Appendix 2: Typing Rules

General Rules

(start)  ――――――
         ε ⊢ ∗

(proj)  Γ ⊢ ∗    A ∘ B ∈ Γ            (weak)  Γ ⊢ A ∘ B    Γ, Γ′ ⊢ ∗
        ――――――――――――――                        ――――――――――――――
        Γ ⊢ A ∘ B                             Γ, Γ′ ⊢ A ∘ B

Rules for type expressions

(type var)  Γ ⊢ ∗    t ∉ dom(Γ)       (type arr)  Γ ⊢ τ₁ : T    Γ ⊢ τ₂ : T
            ――――――――――――                          ――――――――――――――
            Γ, t:T ⊢ ∗                            Γ ⊢ τ₁→τ₂ : T

(class)  Γ, t:T ⊢ R : [m̄]
         ――――――――――――――
         Γ ⊢ class t.R : T

Type and row equality
We consider α-conversion of type variables bound by λ or class, and application of the
principle ⟨⟨⟨⟨R | n:τ_Δ₁⟩⟩ | m:τ_Δ₂⟩⟩ = ⟨⟨⟨⟨R | m:τ_Δ₂⟩⟩ | n:τ_Δ₁⟩⟩ within type or row expressions,
to be conventions of syntax, rather than explicit rules of the system. Additional
equations between types and rows arise as a result of β-reduction, written →_β, or =_β.

(type β)  Γ ⊢ τ : T    τ →_β τ′       (type eq)  Γ ⊢ e : τ    τ =_β τ′    Γ ⊢ τ′ : T
          ――――――――――                             ――――――――――――――――
          Γ ⊢ τ′ : T                             Γ ⊢ e : τ′

(row β)  Γ ⊢ R : κ    R →_β R′
         ――――――――――
         Γ ⊢ R′ : κ

Rules for rows

(empty row)  Γ ⊢ ∗                    (row fn abs)  Γ, t:T ⊢ R : Tⁿ→[m̄]
             ――――――――――                             ――――――――――――――
             Γ ⊢ ⟨⟨⟩⟩ : [m̄]                          Γ ⊢ λt.R : Tⁿ⁺¹→[m̄]

(row var)  Γ ⊢ ∗    r ∉ dom(Γ)        (row fn app)  Γ ⊢ R : Tⁿ⁺¹→[m̄]    Γ ⊢ τ : T
           ――――――――――――                             ――――――――――――――
           Γ, r:Tⁿ→[m̄] ⊢ ∗                           Γ ⊢ R τ : Tⁿ→[m̄]

(row label)  Γ ⊢ R : Tⁿ→[m̄]    {n̄} ⊆ {m̄}     (row ext)  Γ ⊢ R : [m̄, n]    Γ ⊢ τ : T
             ――――――――――――――――                           ――――――――――――――
             Γ ⊢ R : Tⁿ→[n̄]                             Γ ⊢ ⟨⟨R | n:τ_Δ⟩⟩ : [m̄]

Rules for assigning types to terms

(exp var)  Γ ⊢ τ : T    x ∉ dom(Γ)    (empty obj)  Γ ⊢ ∗
           ――――――――――――                            ――――――――――――――
           Γ, x:τ ⊢ ∗                              Γ ⊢ ⟨⟩ : class t.⟨⟨⟩⟩

(exp abs)  Γ, x:τ₁ ⊢ e : τ₂           (exp app)  Γ ⊢ e₁ : τ₁→τ₂    Γ ⊢ e₂ : τ₁
           ――――――――――――――                        ――――――――――――――
           Γ ⊢ λx.e : τ₁→τ₂                      Γ ⊢ e₁e₂ : τ₂

(sub ≤)  Γ ⊢ e : τ₁    Γ ⊢ τ₁ ≤ τ₂
         ――――――――――――――
         Γ ⊢ e : τ₂

(meth appl)  Γ ⊢ e : class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩
             ――――――――――――――――――――――――――――
             Γ ⊢ e ⇐ n : [class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩/t]τ

(meth search)  Γ ⊢ e : class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩    Γ, t:T ⊢ R′ : [m̄, n]
               ――――――――――――――――――――――――――――
               Γ ⊢ e ↼ n : [class t.⟨⟨R′ | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ)

(obj ext)  Γ ⊢ e₁ : class t.⟨⟨R | m̄:ᾱ⟩⟩    Γ, t:T ⊢ R : [m̄, n]    n ∉ S(⟨⟨R | m̄:ᾱ⟩⟩)
           Γ, r:T→[m̄, n] ⊢ e₂ : [class t.⟨⟨r t | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ)    r not in τ
           ――――――――――――――――――――――――――――
           Γ ⊢ ⟨e₁ ←∘ n=e₂⟩ : class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩

(obj over)  Γ ⊢ e₁ : class t.⟨⟨R | m̄:ᾱ, n:τ_Δ⟩⟩
            Γ, r:T→[m̄, n] ⊢ e₂ : [class t.⟨⟨r t | m̄:ᾱ, n:τ_{m̄}⟩⟩/t](t→τ)    r not in τ
            ――――――――――――――――――――――――――――
            Γ ⊢ ⟨e₁ ← n=e₂⟩ : class t.⟨⟨R | m̄:ᾱ, n:τ_{m̄}⟩⟩

Rules of subtyping

(width ≤)  Γ ⊢ class t.⟨⟨R | n̄:ᾱ⟩⟩ : T    n̄ ∉ S(R)
           ――――――――――――――――――――――
           Γ ⊢ class t.⟨⟨R | n̄:ᾱ⟩⟩ ≤ class t.R

(const ≤)  Γ ⊢ ∗    Γ ⊢ ι₁ : T    Γ ⊢ ι₂ : T     (refl ≤)  Γ ⊢ σ : T
           ――――――――――――――――――                              ――――――――
           Γ, ι₁ ≤ ι₂ ⊢ ∗                                  Γ ⊢ σ ≤ σ

(trans ≤)  Γ ⊢ σ ≤ τ    Γ ⊢ τ ≤ ρ     (arrow ≤)  Γ ⊢ σ′ ≤ σ    Γ ⊢ τ ≤ τ′
           ――――――――――――                          ――――――――――――――
           Γ ⊢ σ ≤ ρ                             Γ ⊢ σ→τ ≤ σ′→τ′

Appendix 3: Translation Function tr : Row ∪ Types → λ→(Σ)

Given the following signature Σ:
Type Constants: typ, meth
Term Constants: ιᵢ : typ, for each constant type ιᵢ,        er : meth
                ar : typ→typ→typ,        cl : (typ→meth)→typ
                br_m : meth→typ→meth, for each method name m,
the function tr : Row ∪ Types → λ→(Σ) is inductively defined as follows:
tr(ιᵢ) = ιᵢ                          tr(r) = r
tr(t) = t                            tr(⟨⟨⟩⟩) = er
tr(τ₁→τ₂) = ar tr(τ₁) tr(τ₂)         tr(⟨⟨R | m:τ_Δ⟩⟩) = br_m tr(R) tr(τ)
tr(class t.R) = cl (λt:typ.tr(R))    tr(λt.R) = λt:typ.tr(R)
tr(R τ) = tr(R) tr(τ).
We extend tr to kinds and contexts in the standard way:
tr(T) = typ                          tr(Tⁿ→[m̄]) = typⁿ→meth
tr(ε) = ∅                            tr(Γ, t:T) = tr(Γ) ∪ {tr(t) : tr(T)}
tr(Γ, x:τ) = tr(Γ)                   tr(Γ, ι₁ ≤ ι₂) = tr(Γ) ∪ {tr(ι₁) ≤ tr(ι₂)}
tr(Γ, r:κ) = tr(Γ) ∪ {tr(r) : tr(κ)}.
The Girard Translation
Extended with Recursion*

Torben Braüner

BRICS**
Department of Computer Science
University of Aarhus
Ny Munkegade
DK-8000 Aarhus C, Denmark
Internet: tor@daimi.aau.dk

Abstract. This paper extends Curry-Howard interpretations of Intuitionistic
Logic and Intuitionistic Linear Logic with rules for recursion. The
resulting term languages, the λrec-calculus and the linear λrec-calculus
respectively, are given sound categorical interpretations. The embedding
of proofs of Intuitionistic Logic into proofs of Intuitionistic Linear Logic
given by the Girard Translation is extended with the rules for recursion
such that an embedding of terms of the λrec-calculus into terms of
the linear λrec-calculus is induced via the extended Curry-Howard isomorphisms.
This embedding is shown to be sound with respect to the
categorical interpretations.

1 Introduction

Linear Logic was discovered by J.-Y. Girard in 1987 and published in the now fa-
mous paper [Gir87]. In the abstract of this paper, it is stated that "a completely
new approach to the whole area between constructive logics and computer sci-
ence is initiated". Since then, a lot of work has been done to corroborate this
claim. This paper deals with a Curry-Howard interpretation of the intuitionistic
fragment of Linear Logic, appropriate for recursion.
The original Curry-Howard isomorphism, [How80], relates the natural deduction
formulation of Intuitionistic Logic to the λ-calculus; formulas correspond to
types, proofs to terms, and normalisation of proofs to reduction of terms. The
fundamental idea of categorical logic is that formulas are interpreted as objects,
proof-rules as natural operations on maps, and proofs as maps. We can give a
sound categorical interpretation of Intuitionistic Logic in a cartesian closed category,
so given the above mentioned Curry-Howard isomorphism, this induces
a sound categorical interpretation of the λ-calculus. In the present paper this
interpretation will be extended to deal with the λ-calculus with an additional
rule for recursion, the λrec-calculus.
* A full version of this paper is available as Technical Report BRICS-RS-95-13.
** Basic Research in Computer Science,
Centre of the Danish National Research Foundation.

Intuitionistic Linear Logic can be given a Curry-Howard interpretation in the


same way as the Curry-Howard interpretation of Intuitionistic Logic. In [Abr90]
the first Curry-Howard interpretation of Intuitionistic Linear Logic is introduced.
One of the rules of this system has the property that it forces ! to be isomorphic
to !! in any reasonable categorical interpretation, as pointed out in [Wad91].
In 1992 this was remedied by the authors of [BBdPH92] (and by the author
of this paper) by changing the rule in an appropriate way, and by discovering
a natural deduction formulation equivalent to the Gentzen style formulation of
Intuitionistic Linear Logic (the hitherto known natural deduction formulation,
[Mac91] did not possess that property). This work settled the question about how
to interpret Intuitionistic Linear Logic via a Curry-Howard isomorphism. The
Curry-Howard interpretation of Intuitionistic Linear Logic, the linear λ-calculus,
is given a sound categorical interpretation in [BBdPH92], which in the present
paper will be extended to deal with the linear λ-calculus with an additional rule
for recursion, the linear λrec-calculus.
The [Gir87] paper introduced the Girard Translation which embeds Intuition-
istic Logic into Intuitionistic Linear Logic. This translation works at the level of
formulas as well as at the level of proofs. The Girard Translation at the level of
proofs can be extended with rules for recursion so that a reduction preserving
embedding of terms of the λrec-calculus into terms of the linear λrec-calculus is
induced via the extended Curry-Howard isomorphisms. At the semantic level, a
categorical model for the linear λrec-calculus, that is, a closed !-category C with
finite products and a linear fixpoint operator, induces a categorical model for
the λrec-calculus, that is, a cartesian closed category with a fixpoint operator.
This is the category of free coalgebras with respect to the comonad on C. The
embedding of terms is shown to be sound with respect to the categorical inter-
pretations, that is, it is shown to correspond to the adjunction between C and
the category of free coalgebras in a precise way.
It is shown in [HP90] that a cartesian closed category with finite sums and a
fixpoint operator is inconsistent, that is, it is equivalent to the category consisting
of one object and one arrow. It is also shown that a cartesian closed category
with a natural numbers object and a fixpoint operator is inconsistent. But the
category of CPOs and strict continuous functions is a consistent closed !-category
with finite sums, a natural numbers object, and a linear fixpoint operator; so the
presence of a linear fixpoint operator in a closed !-category is consistent with the
presence of finite sums and a natural numbers object. Thus, the inconsistency of
recursion with these standard constructs vanishes when we go to a linear context,
which is in accordance with [Plo93].
In [MRA93] a different approach to recursion in a linear context is taken. The
(I, ⊗, ⊸) fragment of the linear λ-calculus is extended with natural numbers,
corresponding to a weak natural numbers object in the categorical model. The
discussion above implies that this approach is consistent with ours.
The extensions of Intuitionistic Logic and Intuitionistic Linear Logic with
rules for recursion make every formula provable, so they should not be understood
as having logical meaning. Neither should the extension of the Girard
Translation and the extensions of the Curry-Howard isomorphisms be understood
as having logical meaning. The role of the extensions is to extend the
translation from terms of the λ-calculus to terms of the linear λ-calculus to a
translation from terms of the λrec-calculus to terms of the linear λrec-calculus.
Section 2 introduces linear fixpoints and other relevant categorical constructs.
Section 3 and Section 4 introduce the λrec-calculus and the linear λrec-calculus,
respectively, and Section 5 deals with the (extended) Girard Translation. Some
remarks on possible extensions and future work can be found in Section 6.

2 The Categorical Picture

2.1 Previous Work


In what follows, we need the notion of a categorical model for the (I, ⊗, ⊸, !)
fragment of Intuitionistic Linear Logic as defined in [BBdPH92]:
Definition 2.1 A !-category is a symmetric monoidal category (C, I, ⊗) with
1. A symmetric monoidal comonad (!, ε, δ, m_I, m).
2. Monoidal natural transformations e : ! → I and d : ! → !(−)⊗!(−), where
(a) e_A and d_A are maps of coalgebras,
(b) e_A and d_A give (!A, δ_A) the structure of a cocommutative comonoid,
(c) maps between free coalgebras are maps between cocommutative comonoids.
A closed !-category is a !-category where (−) ⊗ A has a right adjoint A ⊸ (−).
The assumption that the comonad is symmetric monoidal means that ! is a
symmetric monoidal functor and ε and δ are monoidal natural transformations.
The assumption that e_A is a map of coalgebras amounts to e_A being a map from
(!A, δ_A) to (I, m_I), and the assumption that d_A is a map of coalgebras amounts
to d_A being a map from (!A, δ_A) to (!A⊗!A, (δ_A ⊗ δ_A); m_{!A,!A}).
We also need the notion of a generalised coKleisli operator γ, [BBdPH92]:
Definition 2.2 Given f : !A₁ ⊗ ... ⊗ !Aₙ → B, we define γ(f) to be the composite

!A₁⊗...⊗!Aₙ →(δ_{A₁}⊗...⊗δ_{Aₙ})→ !!A₁⊗...⊗!!Aₙ →(m_{!A₁,...,!Aₙ})→ !(!A₁⊗...⊗!Aₙ) →(!f)→ !B

Given a category C equipped with a comonad (!, ε, δ), the coEilenberg-Moore
category C^! is the category of coalgebras, and the category of free coalgebras is
the full subcategory of C^! whose objects are free coalgebras, that is, coalgebras
of the shape (!B, δ). The category of free coalgebras is equivalent to the coKleisli
category; it is straightforward to check that the comparison functor from C_! to
C^! is an equivalence of categories when considered as a functor from C_! to the
category of free coalgebras. We have an adjunction U^! ⊣ F^! between C^! and C;
the forgetful functor U^! : C^! → C simply forgets the coalgebra structure, while
the free functor F^! : C → C^! takes an object B to the free coalgebra (!B, δ). The
adjunction induces a natural bijection φ : C^!((A, h), F^!B) ≅ C(U^!(A, h), B) given
by φ(g) = g; ε_B : A → B and φ⁻¹(f) = h; !f : (A, h) → (!B, δ).
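As an aside (ours, not the paper's), the comonad data of Definition 2.1 and the one-argument case of the coKleisli operator γ of Definition 2.2 can be transcribed in OCaml as follows; the coalgebra and monoidal laws are, of course, not enforced by the types.

  (* An illustrative signature for a comonad (!, eps, delta) and the
     coKleisli lifting it induces. *)
  module type COMONAD = sig
    type 'a t                               (* the functor  !        *)
    val map    : ('a -> 'b) -> 'a t -> 'b t
    val counit : 'a t -> 'a                 (* eps   : !A -> A       *)
    val comult : 'a t -> 'a t t             (* delta : !A -> !!A     *)
  end

  module CoKleisli (W : COMONAD) = struct
    (* gamma f = delta ; !f : !A -> !B, for f : !A -> B *)
    let lift (f : 'a W.t -> 'b) : 'a W.t -> 'b W.t =
      fun wa -> W.map f (W.comult wa)

    (* coKleisli composition of f : !A -> B and g : !B -> C *)
    let compose (f : 'a W.t -> 'b) (g : 'b W.t -> 'c) : 'a W.t -> 'c =
      fun wa -> g (lift f wa)
  end

  (* A concrete instance: the product ("environment") comonad. *)
  module Env = struct
    type 'a t = int * 'a                    (* environment fixed to int *)
    let map f (e, a) = (e, f a)
    let counit (_, a) = a
    let comult (e, a) = (e, (e, a))
  end

  module EnvCoKleisli = CoKleisli (Env)

Here lift corresponds to γ(f) = δ; !f in its one-argument form, and compose to composition in the coKleisli category.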
It is shown in [Bie94] that a symmetric monoidal comonad (!, ε, δ, m_I, m) on a
symmetric monoidal category (C, I, ⊗) induces a symmetric monoidal structure
on C^!; the unit I of the tensor product is given by the coalgebra (I, m_I), and given
two coalgebras (A, k) and (B, h), their tensor product (A, k) ⊗ (B, h) is given by
the coalgebra (A ⊗ B, (k ⊗ h); m_{A,B}). If, moreover, C is a !-category, then the
symmetric monoidal structure on C^! is a finite product structure, that is, I is a
terminal object, and (A, k) ⊗ (B, h) is a binary product of (A, k) and (B, h) when
equipped with (A ⊗ (h; e_B)); ρ and ((k; e_A) ⊗ B); λ as projections. Given (A, k), a
unique map from (A, k) to I is given by k; e_A, and given maps f : (C, l) → (A, k)
and g : (C, l) → (B, h), a unique map from (C, l) to (A, k) ⊗ (B, h) making the
appropriate diagrams commute is given by l; d_C; (ε_C ⊗ ε_C); (f ⊗ g).
Now, assume that a symmetric monoidal closed category (C, I, ⊗, ⊸) has a
symmetric monoidal comonad (!, ε, δ, m_I, m). Then C^! has a symmetric monoidal
structure, as mentioned above, and moreover, it is shown in [Bie94] that every
free coalgebra is exponentiable; the internal-hom object (A, h) ⇒ (!B, δ) of (A, h)
and (!B, δ) is given by (!(A ⊸ B), δ), and we have an appropriate natural
bijection C^!((C, k) ⊗ (A, h), (!B, δ)) ≅ C^!((C, k), (A, h) ⇒ (!B, δ)).
Definition 2.3 Let C be a !-category. We will say that the category of free coalgebras
is closed under finite products iff it has a finite product structure (1, ×)
such that 1 is a terminal object in C^!, and such that (!A, δ) × (!B, δ) is a binary
product of (!A, δ) and (!B, δ) in C^!.
If C is a closed !-category such that the category of free coalgebras is closed
under finite products, then the internal-hom objects induce a cartesian closed
structure on the category of free coalgebras. The finite product structure (1, ×)
is given by the assumption that the category of free coalgebras is closed under
finite products, and the appropriate natural bijection is the composite
C^!((!C, δ) × (!A, δ), (!B, δ)) ≅ C^!((!C, δ) ⊗ (!A, δ), (!B, δ))
                              ≅ C^!((!C, δ), (!A, δ) ⇒ (!B, δ))
Now, under which circumstances is the category of free coalgebras closed under
finite products? The following observation gives a sufficient condition: if C is
a category with a comonad (!, ε, δ) and finite products (1, ×), then (!1, δ) is a
terminal object in C^!, and (!(A × B), δ) is a binary product of (!A, δ) and (!B, δ)
in C^!. This is because the free functor F^! : C → C^! is right adjoint to U^! : C^! → C,
and right adjoints preserve finite products. So if C is a !-category with finite
products, then the category of free coalgebras is closed under finite products.

2.2 Fixpoints in Cartesian Categories


The main concern of this subsection will be a parametrised version of fixpoints in
cartesian categories as introduced in [Law69]. In what follows, Δ_A : A → A × A
is the diagonal map.
Definition 2.4 Let (C, ×, 1) be a cartesian category. A map f† : A → B is a
fixpoint of f : A × B → B iff f† = Δ_A; (A × f†); f.

Note how the diagonal map is used to copy parameters.
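Read pointwise, the defining equation f† = Δ_A; (A × f†); f says f†(a) = f(a, f†(a)). The following OCaml sketch (ours, not the paper's) implements this parametrised fixpoint in a concrete setting; since OCaml is strict, the recursive occurrence is thunked, and the definition only terminates when f is suitably productive in its second argument.

  (* Parametrised fixpoint: from f : A * B -> B build fix_p f : A -> B
     satisfying  fix_p f a = f (a, fix_p f a). *)
  let rec fix_p (f : 'a * (unit -> 'b) -> 'b) : 'a -> 'b =
    fun a -> f (a, fun () -> fix_p f a)

  (* Example: factorial, where the parameter is the value returned at 0. *)
  let fact : int -> int =
    fix_p (fun (base, self) -> fun n ->
        if n = 0 then base else n * self () (n - 1))
      1

Here fact 5 evaluates to 120; the parameter base is copied into every unfolding, which is exactly the role the diagonal map plays in the equation above.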

Definition 2.5 An external fixpoint operator on a category with finite products
(C, ×, 1) is an operation taking each map

f : A × B → B   to a map   f† : A → B,

such that f† is a fixpoint of f, and such that the operation is natural in A.

Definition 2.6 A fixpoint operator on a cartesian closed category (C, ×, 1, ⇒)
is a family of maps Y_B : (B ⇒ B) → B such that curry(f); Y_B is a fixpoint of f
for any map f : A × B → B.

Now, one can show that a cartesian closed category has a fixpoint operator
iff every map f : A • B --+ B has a fixpoint, but we will show here a more
informative result:

Proposition 2.7 There is a bijection between external fixpoint operators and
fixpoint operators in a cartesian closed category.

2.3 Linear Fixpoints

We will now consider fixpoints in a linear context. The previous definition of
fixpoints cannot be used because it assumes the presence of finite products.

Definition 2.8 Let (C, I, ⊗) be a monoidal category equipped with a comonad
(!, ε, δ) and a natural transformation d : !(−) → !(−)⊗!(−). A map f† : !A → B is
a linear fixpoint of f : !A⊗!B → B iff f† = d_A; (Id ⊗ γ(f†)); f.

It is simply an extension of the definition of fixpoints in a category with finite
products to a linear context, where we have only a "diagonal map" d_A for objects
of the shape !A; we get back Definition 2.4 if we equip a cartesian category with
the identity comonad and the natural transformation induced by the diagonal
maps.

Definition 2.9 Let (C, I, ⊗) be a monoidal category equipped with a comonad
(!, ε, δ) and a natural transformation d : !(−) → !(−)⊗!(−). An external linear
fixpoint operator on C is an operation taking each map

f : !A⊗!B → B   to a map   f† : !A → B,

such that f† is a linear fixpoint of f, and such that the operation is natural in
!A with respect to maps in the image of γ.
Definition 2.10 Let (C, I, ⊗, ⊸) be a monoidal closed category equipped with
a comonad (!, ε, δ) and a natural transformation d : !(−) → !(−)⊗!(−). A linear
fixpoint operator on C is a family of maps Y_B^lin : !(!B ⊸ B) → B such that
γ(curry(f)); Y_B^lin is a linear fixpoint of f for any map f : !A⊗!B → B.

The definitions of linear fixpoints and (external) linear fixpoint operators can
be generalised to an arbitrary number of parameters; however, the definitions
presented here are appropriate for the purpose of this article, so we will not
pursue this further. One can show that a closed !-category has a linear fixpoint
operator iff every map f : !A⊗!B → B has a linear fixpoint, but we will show
here a more informative result:
Proposition 2.11 There is a bijection between external linear fixpoint operators
and linear fixpoint operators in a closed !-category.
The definition of a linear fixpoint in C is the definition of a fixpoint in the
category of free coalgebras stated in terms of maps in C, which entails that:
Lemma 2.12 Let C be a !-category. Given maps of coalgebras
h : (!A, δ) → (!B, δ) and f : (!A, δ) ⊗ (!B, δ) → (!B, δ), we have that h is a fixpoint
of f iff φ(h) : !A → B is a linear fixpoint of φ(f) : !A⊗!B → B.
Proof. The following calculation proves the result:

h is a fixpoint of f iff h = Δ; (Id ⊗ h); f
                     iff φ(h) = φ(Δ; (Id ⊗ h); f)
                     iff φ(h) = d; (Id ⊗ h); f; ε
                     iff φ(h) = d; (Id ⊗ φ⁻¹(φ(h))); φ(f)
                     iff φ(h) = d; (Id ⊗ γ(φ(h))); φ(f)
                     iff φ(h) is a linear fixpoint of φ(f). □

Theorem 2.13 Let C be a !-category such that the category of free coalgebras
is closed under finite products. There is a bijection between external fixpoint
operators in the category of free coalgebras and external linear fixpoint operators
in C.

Theorem 2.14 Let C be a closed !-category such that the category of free coalgebras
is closed under finite products. There is a bijection between fixpoint operators
in the category of free coalgebras and linear fixpoint operators in C.

Proof. Follows from Proposition 2.7, Theorem 2.13 and Proposition 2.11. □

This can also be derived from the following theorem, which follows from the
fact that the definition of a linear fixpoint operator in C is the definition of a
fixpoint operator in the category of free coalgebras stated in terms of maps in C:
Theorem 2.15 Let C be a closed !-category such that the category of free coalgebras
is closed under finite products. Given a map of coalgebras
Y : (!B, δ) ⇒ (!B, δ) → (!B, δ), then Y is a fixpoint operator in the category of
free coalgebras iff φ(Y) : !(!B ⊸ B) → B is a linear fixpoint operator in C.

2.4 Concrete Models

An example of a closed !-category with finite products, finite sums, and a linear
fixpoint operator is the category of CPOs and strict continuous functions. This
category has a linear fixpoint operator because the induced category of free
coalgebras is equivalent to the category of CPOs and continuous functions, a
cartesian closed category with a fixpoint operator.
In [Bra94b] another example of a closed !-category with finite products and
a linear fixpoint operator is given, namely the category of dI domains and join
preserving stable functions.

3 The λrec-Calculus

3.1 Definition of the λrec-Calculus

Types are given by s ::= t | s ∧ s | s ⇒ s, and terms by

t ::= x | ⟨t, t⟩ | fst(t) | snd(t) | λx^A.t | t t | rec x^A.t

Rules for assignment of types to terms are given in Appendix A. Type assignments
have the form of sequents x₁ : A₁, ..., xₙ : Aₙ ⊢ u : A where x₁, ..., xₙ
are pairwise distinct variables. The expression "Γ ⊢ u : A" can mean either the
sequent itself or a certain derivation of the sequent; the actual interpretation is
to be decided by the context. The terms together with the typing rules for the
fragment without recursion are usually called the λ-calculus, [GLT89]. The extension
with recursion will be called the λrec-calculus. The term and the typing
rule for recursion given here can also be found in [Win93].
The usual reduction rules for terms of the λ-calculus, [GLT89], can be extended
with a reduction rule for the term corresponding to recursion:

rec x.u ⇝ u[rec x.u/x]

One can show that the λrec-calculus satisfies the Substitution Property, entailing
that the rule satisfies Subject Reduction, that is, typing is preserved by an
application of the reduction rule. Instead of equipping the λrec-calculus with the
mentioned reduction rules, one could define an operational semantics in natural
semantics style. This is done in [Win93].
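As an illustration (ours, not from [Win93] or the paper), the unfolding rule can be realised as a single substitution step over a small term representation; the substitution below ignores capture, which is harmless when rec x.u is closed.

  (* Terms of the lambda-rec fragment used here. *)
  type term =
    | Var of string
    | Pair of term * term
    | Fst of term
    | Snd of term
    | Lam of string * term
    | App of term * term
    | Rec of string * term            (* rec x. t *)

  (* Naive substitution t[s/x]; capture-avoidance is not handled. *)
  let rec subst (x : string) (s : term) (t : term) : term =
    match t with
    | Var y -> if y = x then s else t
    | Pair (a, b) -> Pair (subst x s a, subst x s b)
    | Fst a -> Fst (subst x s a)
    | Snd a -> Snd (subst x s a)
    | Lam (y, b) -> if y = x then t else Lam (y, subst x s b)
    | App (a, b) -> App (subst x s a, subst x s b)
    | Rec (y, b) -> if y = x then t else Rec (y, subst x s b)

  (* The reduction rule  rec x. u  ~>  u[rec x. u / x]. *)
  let unfold_rec (t : term) : term option =
    match t with
    | Rec (x, u) -> Some (subst x t u)
    | _ -> None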

3.2 The Curry-Howard Isomorphism (Extended with Recursion)

The Curry-Howard isomorphism says that formulas correspond to types, proofs
to terms, and normalisation of proofs to reduction of terms. The relation between
formulas of Intuitionistic Logic and types of the λ-calculus is obvious. It turns
out that we get the rules for assigning types to terms in the λ-calculus if we
decorate the proof-rules of a natural deduction formulation of Intuitionistic Logic
with terms in an appropriate way. We can obviously recover the proof-rules if
we take the typing rules of the λ-calculus and remove the terms. We get the
Curry-Howard isomorphism on the level of proofs as follows: given a proof of
A₁, ..., Aₙ ⊢ A, that is, a proof of the formula A, one can inductively construct
a derivation of a sequent x₁ : A₁, ..., xₙ : Aₙ ⊢ u : A, that is, a term u of
type A. Conversely, if one has a derivable sequent x₁ : A₁, ..., xₙ : Aₙ ⊢ u : A,
there is an easy way to get a proof of A₁, ..., Aₙ ⊢ A: erase all terms in the
derivation of the type assignment. The two processes are each other's inverses
modulo renaming of variables. The Curry-Howard isomorphism on the level of
proofs can be extended to include the term and the rule for recursion, that is, we
have a bijection between proofs in Intuitionistic Logic, extended with the rule
for recursion without terms, and derivable sequents of the λrec-calculus.

3.3 Categorical Interpretation of the λrec-Calculus

Given a cartesian closed category with a fixpoint operator, we define a categorical
interpretation; a derivable sequent x₁ : A₁, ..., xₙ : Aₙ ⊢ u : B is interpreted as
a map [x₁ : A₁, ..., xₙ : Aₙ ⊢ u : B] : [A₁] × ... × [Aₙ] → [B] by induction on its
derivation cf. operations on arrows (corresponding to the typing rules) induced
by the categorical model. In what follows, we will omit the [−] brackets when
appropriate. We will just state the operation on arrows corresponding to the
only non-standard rule, namely the (Rec) rule:

Γ × B --u--> B
――――――――――――――――――――――――――――――
Γ --curry(u)--> B ⇒ B --Y_B--> B

It can be shown that the interpretation is sound with respect to the reduction
rule for the term corresponding to recursion, that is, interpretation is preserved
by an application of the reduction rule. This is so because the reduction rule
is essentially a syntactic restatement of the defining equation of a fixpoint. The
interpretation is also sound with respect to the usual reduction rules for terms
of the λ-calculus, [GLT89].

4 The Linear λrec-Calculus

4.1 Definition of the Linear λrec-Calculus

Types are given by s ::= I | s ⊗ s | T | s & s | s ⊸ s | !s and terms by

t ::= x | t ⊗ t | let t be x ⊗ y in t | ⟨t, t⟩ | fst(t) | snd(t) | λx^A.t | t t |
      let t, ..., t be x₁, ..., xₙ in !t | derelict(t) | discard t in t | copy t as x, y in t |
      let t, ..., t be x₁, ..., xₙ in rec z^{!A}.t

where "t, ..., t" means a sequence of n occurrences of t. Rules for assignment
of types to terms are given in Appendix B. Type assignments have the form
of sequents x₁ : A₁, ..., xₙ : Aₙ ⊢ u : A where x₁, ..., xₙ are pairwise distinct
variables. Note that the definition of sequents implicitly restricts the use of the rules.
It is for example not possible to use the (⊗I) rule if Γ and Δ have common
variables. The terms together with the typing rules for the fragment without
recursion will be called the linear λ-calculus. The linear λ-calculus is essentially
the same as the calculus given in [BBdPH92]. The extension with recursion will
be called the linear λrec-calculus.
In what follows, the expression w̄ means w₁, ..., wₙ, "copy v̄ as x̄, ȳ in u"
means copy v₁ as x₁, y₁ in (... copy vₙ as xₙ, yₙ in u ...), and "discard v̄ in u"
means discard v₁ in (... discard vₙ in u ...).
The reduction rules for terms of the linear λ-calculus, [BBdPH92], can be
extended with a reduction rule for the term corresponding to recursion:

let v̄ be x̄ in rec z.u ⇝ copy v̄ as x̄, ȳ in (u[let ȳ be x̄ in !(let x̄ be x̄ in rec z.u)/z])

One can show that the linear λrec-calculus satisfies the Substitution Property,
entailing that the rule satisfies Subject Reduction. Instead of equipping the linear
λrec-calculus with the mentioned reduction rules, one could define an operational
semantics in natural semantics style. This is dealt with in [Bra94a].

4.2 The Curry-Howard Isomorphism (Extended with Recursion)

Intuitionistic Linear Logic corresponds to the linear λ-calculus via a Curry-Howard
isomorphism in the same way as Intuitionistic Logic corresponds to the
λ-calculus; we get the rules for assigning types to terms in the linear λ-calculus
if we decorate the proof-rules of a natural deduction formulation of Intuitionistic
Linear Logic with terms in an appropriate way, and conversely, we recover the
proof-rules if we remove the terms from the typing rules. The Curry-Howard
isomorphism on the level of proofs can also be extended to include the term and
the rule for recursion in the Intuitionistic Linear Logic case.

4.3 Categorical Interpretation of the Linear λrec-Calculus

We define a categorical interpretation in a closed !-category with finite products
and a linear fixpoint operator; a derivable sequent x₁ : A₁, ..., xₙ : Aₙ ⊢ u : B is
interpreted as a map [x₁ : A₁, ..., xₙ : Aₙ ⊢ u : B] : [A₁] ⊗ ... ⊗ [Aₙ] → [B] by
induction on its derivation cf. operations on arrows induced by the categorical
model. We will just state the operation on arrows corresponding to the only rule
which cannot be found in [BBdPH92], namely the (Rec) rule:

Γ₁ --v₁--> !A₁, ..., Γₙ --vₙ--> !Aₙ        !A₁ ⊗ ... ⊗ !Aₙ ⊗ !B --u--> B
――――――――――――――――――――――――――――――――――――――――――――――――――――
Γ₁ ⊗ ... ⊗ Γₙ --v₁⊗...⊗vₙ--> !A₁ ⊗ ... ⊗ !Aₙ --γ(curry(u))--> !(!B ⊸ B) --Y^lin--> B

It can be shown that the interpretation is sound with respect to the reduction
rule for the term corresponding to recursion. This is so because the reduction rule
is essentially a syntactic restatement of the defining equation of a (generalised)
linear fixpoint. The interpretation is also sound with respect to the reduction
rules for terms of the linear λ-calculus, as shown in [BBdPH92].

5 The Girard Translation (Extended with Recursion)


5.1 Definition
The [Gir87] paper introduced the Girard Translation, which embeds Intuitionistic
Logic into Intuitionistic Linear Logic. A formula A is translated into a formula
A°, and a proof of A₁, ..., Aₙ ⊢ B into a proof of (A₁, ..., Aₙ ⊢ B)°, that is, a
proof of !A₁°, ..., !Aₙ° ⊢ B°. The Girard Translation can be extended to include
the rules for recursion where the terms have been removed. The (extended)
Girard Translation at the level of proofs is stated in Appendix C. The translation
induces a translation from types and derivable sequents in the λrec-calculus to
types and derivable sequents in the linear λrec-calculus cf. the extended Curry-Howard
isomorphisms; a sequent x₁ : A₁, ..., xₙ : Aₙ ⊢ u : A is translated into a
sequent (x₁ : A₁, ..., xₙ : Aₙ ⊢ u : A)° of the shape x₁ : !A₁°, ..., xₙ : !Aₙ° ⊢ u′ : A°,
where u′ encodes the translation of the proof encoded by u.
Note that the rule for recursion in the linear λrec-calculus is essentially the
image under the Girard Translation of the rule for recursion in the λrec-calculus.
Moreover, the translation preserves the reduction rule for recursion because it is
essentially the image under the translation of the reduction rule for recursion in
the λrec-calculus. The Girard Translation also preserves β-reductions, [Bie94].
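For orientation, and assuming the standard definition from [Gir87] (the paper itself states the extended translation in Appendix C, which is not reproduced here), the Girard Translation acts on the types of the λrec-calculus as

t° = t,    (A ∧ B)° = A° & B°,    (A ⇒ B)° = !A° ⊸ B°,

so that a sequent A₁, ..., Aₙ ⊢ B is indeed sent to one of the shape !A₁°, ..., !Aₙ° ⊢ B°, as stated above.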

5.2 Soundness
If C is a closed !-category with finite products and a linear fixpoint operator, then
the category of free coalgebras is a cartesian closed category with a fixpoint op-
erator, as the previous results show. We can therefore interpret types and derivable
sequents of the λrec-calculus as objects and arrows in the category of free coal-
gebras. It turns out that the interpretation of a type A of the λrec-calculus can
be written in a simple way using the Girard Translation at the level of formulas,
namely [A] = (![A°], δ). Note that A, a type of the λrec-calculus, is interpreted
in the category of free coalgebras, and that A°, a type of the linear λrec-calculus,
is interpreted in C. Let the following composite be denoted by lin:

  C!([A1] × ... × [An], [B]) = C!((![A1°], δ) × ... × (![An°], δ), (![B°], δ))
    ≅ C!((![A1°], δ) ⊗ ... ⊗ (![An°], δ), (![B°], δ))
    ≅ C(![A1°] ⊗ ... ⊗ ![An°], [B°])
We are now ready to state a result showing that the extended Girard Translation
is sound with respect to the above mentioned categorical interpretations. The
result essentially says that the extended Girard Translation corresponds to the
adjunction between the category of free coalgebras and C, or, to be precise, to
the function lin. Recall that (x1:A1, ..., xn:An ⊢ u:B)° is a derivable sequent
in the linear λrec-calculus of the shape x1:!A1°, ..., xn:!An° ⊢ u':B°.

Theorem 5.1 (Soundness) Let C be a closed !-category with finite products and
a linear fixpoint operator. If x1:A1, ..., xn:An ⊢ u:B is a derivable sequent in
the λrec-calculus, then

  lin([x1:A1, ..., xn:An ⊢ u:B]) = [(x1:A1, ..., xn:An ⊢ u:B)°]

Proof. Induction on the derivation of z1:A1, ..., zn:An ⊢ u:B. We will
disregard terms and only consider the underlying proofs. We only cover the
(∧E1) case, corresponding to the translation

  Γ ⊢ A∧B                      !Γ° ⊢ A° & B°
  ───────── (∧E1)      ↦       ────────────── (&E1)
  Γ ⊢ A                        !Γ° ⊢ A°

The following calculation shows the wanted result:

  lin([Γ ⊢ A]) = n; [Γ ⊢ A∧B]; !π1; ε
               = n; [Γ ⊢ A∧B]; ε; π1
               = lin([Γ ⊢ A∧B]); π1
               =_IH [!Γ° ⊢ A° & B°]; π1
               = [!Γ° ⊢ A°]                                        □

If we disregard recursion and consider proofs instead of derivable sequents, we


get a soundness result in the usual proof-theoretic sense, which is a categorical
version of a result in [Gir87] showing that the Girard Translation is sound with
respect to interpretation in a concrete category, namely the category of coherence
spaces and linear stable functions.

6 Remarks on Possible Extensions and Future Work

6.1 Extension with Finite Sums

We can give a categorical interpretation of the linear A-calculus extended with


finite sums using a closed !-category with finite products and finite sums. The
category of free coalgebras induced by a closed !-category with finite products
and finite sums has weak finite sums as pointed out in [Bie94], and moreover, the
weak finite sums satisfy certain additional conditions, which makes it possible
to give a sound categorical interpretation of the A-calculus extended with finite
sums. Furthermore, the Curry-Howard isomorphisms and the Girard Translation
still work in such a way that the induced embedding of terms is sound with
respect to the categorical interpretations.

6.2 (Linear) Fixpoint Objects

Let C be a closed !-category with finite products. The category of free coalgebras
is cartesian closed as shown above, and moreover, it can be shown to have a
strong monad (T, η, μ, t) where the functor T is the composite of the forgetful
and free functors. This enables us to define a fixpoint object, [CP90].

Definition 6.1 A fixpoint object in a cartesian closed category with a strong
monad is 1. an initial algebra σ : TZ → Z, and 2. a map ω : 1 → Z which is a
unique fixpoint of η_Z; σ.

It is shown in [Mul92] that there is a fixpoint operator Y : (A ⇒ A) → A for
every Eilenberg-Moore T-algebra (A, h : TA → A) in a cartesian closed category
with a strong monad and a fixpoint object. If we consider the category of free
coalgebras, then each ((!A, δ), !ε_A : T(!A, δ) → (!A, δ)) is an Eilenberg-Moore
T-algebra. We conclude that if the category of free coalgebras has a fixpoint
object, then it has a fixpoint operator.
Now, in the present paper the approach to dealing with recursion in a linear
context has been to restate the definition of a fixpoint operator in the category of
free coalgebras induced by C in terms of C itself. The same technique can be used
to restate the definition of a fixpoint object in the category of free coalgebras
induced by C in terms of C itself. We then end up with the following definition
(which is a direct restatement; it might be formulated in a cleaner way):

Definition 6.2 A linear fixpoint object in a closed !-category with finite prod-
ucts is 1. a map σ^lin : !!Z → Z such that for any map α : !!A → A there exists a
unique map h : !Z → A with the property that γ(σ^lin); h = !γ(h); α, and 2. a unique
map ω^lin : !1 → Z with the property that γ(ω^lin) is a fixpoint of ε; γ(σ^lin).

With this definition, we get a result saying that C has a linear fixpoint object
iff the category of free coalgebras has a fixpoint object. But if the category of
free coalgebras has a fixpoint object, then it has a fixpoint operator, which is the
same as having a linear fixpoint operator in C. We conclude that if C has a linear
fixpoint object, then it has a linear fixpoint operator. Another way to show this
would be to prove the existence of a linear fixpoint operator directly from the
existence of a linear fixpoint object in C, without leaving the linear world.

6.3 Recursion at the Level of Types

The λ-calculus can be extended with recursion at the level of types; that is, we
have additional types μX.A, where X is a type variable, typing rules

  Γ ⊢ t : A[μX.A/X]              Γ ⊢ t : μX.A
  ───────────────────            ──────────────────────
  Γ ⊢ abs(t) : μX.A              Γ ⊢ rep(t) : A[μX.A/X]

and the reduction rule rep(abs(t)) ⤳ t. We are then able to define recursion
at the level of terms as in the λrec-calculus: let Γ, z:B ⊢ u:B be given; if C is
an abbreviation for the type μX.(X ⇒ B), and Γ ⊢ a : C is an abbreviation for
Γ ⊢ abs(λx.((λz.u)(rep(x) x))) : C, then Γ ⊢ rep(a) a : B obeys the reduction
rule for recursion in the λrec-calculus. This is essentially a syntactic restatement
of a result in [Law69] saying that if f : C → (C ⇒ B) is a weakly point-surjective
map in a cartesian closed category, then every endomap on B has a fixpoint.
Similarly, we are able to define recursion at the level of terms as in the linear
λrec-calculus when the linear λ-calculus is extended with the above mentioned
rules for recursion at the level of types. Note that the !-free fragment of this
system is strongly normalising because the underlying proof of a term shrinks
during each reduction step.
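To make the construction above concrete, here is a minimal Haskell sketch (not
from the paper): the recursive newtype Mu plays the role of the type μX.(X ⇒ B),
abs/rep are rendered by the constructor Abs and its field rep, and recTerm
implements rec z^B.u as rep(a) a. All names are illustrative, and the sketch
relies on Haskell's laziness.

-- Mu b stands for the recursive type muX.(X => B); rep inverts the constructor.
newtype Mu b = Abs { rep :: Mu b -> b }

-- rec z.u is encoded as rep a a, where a = abs(\x -> u (rep x x)).
recTerm :: (b -> b) -> b
recTerm u = rep a a
  where a = Abs (\x -> u (rep x x))

-- Example use: a factorial obtained from a non-recursive step function.
factorial :: Integer -> Integer
factorial = recTerm (\f n -> if n == 0 then 1 else n * f (n - 1))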

A c k n o w l e d g e m e n t s . I am grateful to my supervisor, Glynn Winskel, for his


guidance and encouragement. Thanks to Gavin Bierman and Valeria de Paiva for
comments on a draft of this paper. The diagrams and proof-rules are produced
using Paul Taylor's macros.

References
[Abr90] S. Abramsky. Computational interpretations of linear logic. Technical
Report 90/20, Department of Computing, Imperial College, 1990.
[BBdPH92] N. Benton, G. Bierman, V. de Paiva, and M. Hyland. Term assignment
for intuitionistic linear logic. Technical Report 262, Computer Laboratory,
University of Cambridge, 1992.
[Bie94] G. Bierman. On Intuitionistic Linear Logic. PhD thesis, Computer Labo-
ratory, University of Cambridge, 1994.
[Bra94a] T. Braüner. A general adequacy result for a linear functional language.
Technical Report BRICS-RS-94-22, BRICS, Department of Computer Sci-
ence, University of Aarhus, Aug. 1994. Manuscript presented at MFPS '94.
[Bra94b] T. Braüner. A model of intuitionistic affine logic from stable domain the-
ory. In Proceedings of ICALP '94, LNCS, volume 820. Springer-Verlag,
1994.
[CP90] R. L. Crole and A. M. Pitts. New foundations for fixpoint computations.
In 5th LICS Conference. IEEE, 1990.
[Gir87] J.-Y. Girard. Linear logic. Theoretical Computer Science, 50, 1987.
[GLT89] J.-Y. Girard, Y. Lafont, and P. Taylor. Proofs and Types. Cambridge
University Press, 1989.
[How80] W. A. Howard. The formulae-as-type notion of construction. In J. R.
Hindley and J. P. Seldin, editors, To H. B. Curry: Essays on Combinatory
Logic, Lambda Calculus and Formalism. Academic Press, 1980.
[HP90] H. Huwig and A. Poigne. A note on inconsistencies caused by fixpoints in
a cartesian closed category. Theoretical Computer Science, 73, 1990.
[Law69] F. W. Lawvere. Diagonal arguments and cartesian closed categories. In
P. Hilton, editor, Category Theory, Homology Theory and their Applica-
tions 1I, LNM, volume 92. Springer-Verlag, 1969.
[Mac91] I. Mackie. Lilac : A Functional Programming Language Based on Linear
Logic. M.Sc. thesis, Imperial College, 1991.
[MRA93] I. Mackie, L. Román, and S. Abramsky. An internal language for au-
tonomous categories. Journal of Applied Categorical Structures, 1, 1993.
[Mu192] P. S. Mulry. Strong monads, algebras and fixed points. In M. P. Four-
man, P. T. Johnstone, and A. M. Pitts, editors, Application of Categories
in Computer Science, volume 177. London Mathematical Society Lecture
Notes Series, 1992.
[Plo93] G. D. Plotkin. Type theory and recursion (extended abstract). In 8th
LICS Conference. IEEE, 1993.
[Wad91] P. Wadler. There's no substitute for linear logic. Manuscript, 1991.
[Win93] G. Winskel. The Formal Semantics of Programming Languages. The MIT
Press, 1993.

A Appendix, the λrec-Calculus

  ──────────────────────────────── (Ax)
  x1:A1, ..., xn:An ⊢ xq : Aq

  Γ ⊢ u:A   Γ ⊢ v:B                Γ ⊢ u : A∧B              Γ ⊢ u : A∧B
  ─────────────────── (∧I)         ─────────────── (∧E1)    ─────────────── (∧E2)
  Γ ⊢ <u,v> : A∧B                  Γ ⊢ fst(u) : A           Γ ⊢ snd(u) : B

  Γ, x:A ⊢ u:B                     Γ ⊢ f : A⇒B   Γ ⊢ u:A
  ──────────────────── (⇒I)        ────────────────────── (⇒E)
  Γ ⊢ λx^A.u : A⇒B                 Γ ⊢ f u : B

  Γ, x:B ⊢ u:B
  ──────────────────── (Rec)
  Γ ⊢ rec x^B.u : B

B Appendix, the Linear λrec-Calculus

  ──────────── (Ax)
  x:A ⊢ x:A

  Γ, x:A, y:B, Δ ⊢ u:C
  ────────────────────── (Ex)
  Γ, y:B, x:A, Δ ⊢ u:C

  Γ ⊢ u:A   Δ ⊢ v:B              Δ ⊢ w : A⊗B   Γ, x:A, y:B ⊢ u:C
  ──────────────────── (⊗I)      ────────────────────────────────── (⊗E)
  Γ, Δ ⊢ u⊗v : A⊗B               Γ, Δ ⊢ let w be x⊗y in u : C

  Γ ⊢ u:A   Γ ⊢ v:B              Δ ⊢ u : A&B             Δ ⊢ u : A&B
  ──────────────────── (&I)      ──────────────── (&E1)  ──────────────── (&E2)
  Γ ⊢ <u,v> : A&B                Δ ⊢ fst(u) : A          Δ ⊢ snd(u) : B

  Γ, x:A ⊢ u:B                   Δ ⊢ f : A⊸B   Δ' ⊢ u:A
  ──────────────────── (⊸I)      ─────────────────────── (⊸E)
  Γ ⊢ λx^A.u : A⊸B               Δ, Δ' ⊢ f u : B

  Γ1 ⊢ w1 :!A1 , ..., Γn ⊢ wn :!An     x1:!A1, ..., xn:!An ⊢ u : A
  ───────────────────────────────────────────────────────────────── (!I)
  Γ1, ..., Γn ⊢ let w1, ..., wn be x1, ..., xn in !u : !A

  Δ ⊢ u :!A
  ────────────────────── (Der)
  Δ ⊢ derelict(u) : A

  Δ ⊢ w :!A   Γ, x:!A, y:!A ⊢ u : B            Δ ⊢ w :!A   Γ ⊢ u : B
  ──────────────────────────────── (Con)       ───────────────────────────── (Weak)
  Γ, Δ ⊢ copy w as x, y in u : B               Γ, Δ ⊢ discard w in u : B

  Γ1 ⊢ w1 :!A1 , ..., Γn ⊢ wn :!An     x1:!A1, ..., xn:!An, z:!B ⊢ u : B
  ─────────────────────────────────────────────────────────────────────── (Rec)
  Γ1, ..., Γn ⊢ let w1, ..., wn be x1, ..., xn in rec z^!B.u : B

C Appendix, the Girard Translation (Extended with Recursion)

At the level of formulas the Girard Translation is defined inductively as follows:
- A° = A, for an atomic type A
- (A∧B)° = A° & B°
- (A⇒B)° = !A° ⊸ B°
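For concreteness, the formula-level translation can be rendered as the following
small functional sketch (the datatypes and names are illustrative and not part of
the paper):

-- Intuitionistic formulas (conjunction and implication) and intuitionistic
-- linear formulas (&, -o and !), as used in the translation above.
data IFormula = IAtom String | IAnd IFormula IFormula | IImp IFormula IFormula

data LFormula = LAtom String | With LFormula LFormula
              | Lolli LFormula LFormula | Bang LFormula

-- Girard Translation at the level of formulas:
-- (A /\ B)° = A° & B°,  (A => B)° = !A° -o B°,  atoms unchanged.
girard :: IFormula -> LFormula
girard (IAtom a)  = LAtom a
girard (IAnd a b) = With (girard a) (girard b)
girard (IImp a b) = Lolli (Bang (girard a)) (girard b)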

At the level of proofs, the Girard Translation translates a proof of A1, ..., An ⊢ B
into a proof of !A1°, ..., !An° ⊢ B° by induction on the proof of A1, ..., An ⊢ B.
Special cases of rules will be used in the definition when appropriate (for example
in the case of (Rec)). A double bar means a number of applications of a rule.

                                        ───────────────── (Ax)
                                        !Aq° ⊢ !Aq°
  ───────────────────── (Ax)    ↦       ───────────────── (Der)
  A1, ..., An ⊢ Aq                      !Aq° ⊢ Aq°
                                        ═════════════════════════ (Weak) and (Ex)
                                        !A1°, ..., !An° ⊢ Aq°

  Γ ⊢ A   Γ ⊢ B                         !Γ° ⊢ A°   !Γ° ⊢ B°
  ──────────────── (∧I)         ↦       ───────────────────── (&I)
  Γ ⊢ A∧B                               !Γ° ⊢ A° & B°

  Γ ⊢ A∧B                               !Γ° ⊢ A° & B°
  ──────── (∧E1)                ↦       ────────────── (&E1)
  Γ ⊢ A                                 !Γ° ⊢ A°

  Γ ⊢ A∧B                               !Γ° ⊢ A° & B°
  ──────── (∧E2)                ↦       ────────────── (&E2)
  Γ ⊢ B                                 !Γ° ⊢ B°

  Γ, A ⊢ B                              !Γ°, !A° ⊢ B°
  ────────── (⇒I)               ↦       ────────────────── (⊸I)
  Γ ⊢ A⇒B                               !Γ° ⊢ !A° ⊸ B°

                                                              !Γ° ⊢ A°
                                                              ────────── (!I)
  Γ ⊢ A⇒B   Γ ⊢ A                       !Γ° ⊢ !A° ⊸ B°        !Γ° ⊢ !A°
  ──────────────── (⇒E)         ↦       ──────────────────────────────── (⊸E)
  Γ ⊢ B                                 !Γ°, !Γ° ⊢ B°
                                        ═════════════════ (Con) and (Ex)
                                        !Γ° ⊢ B°

  Γ, B ⊢ B                              !Γ°, !B° ⊢ B°
  ──────── (Rec)                ↦       ────────────── (Rec)
  Γ ⊢ B                                 !Γ° ⊢ B°
Decidability of Higher-Order Subtyping
with Intersection Types
Adriana B. Compagnoni*

Abstract
The combination of higher-order subtyping with intersection types yields
a typed model of object-oriented programming with multiple inheritance
[11]. The target calculus, Fω∧, a natural generalization of Girard's system
Fω with intersection types and bounded polymorphism, is of independent
interest and is our subject of study.
Our main contribution is the proof that subtyping in Fω∧ is decidable.
This yields as a corollary the decidability of subtyping in Fω≤, its inter-
section-free fragment, because the Fω∧ subtyping system is a conservative
extension of that of Fω≤.
The calculus presented in [8] has no reductions on types. In the Fω∧ sub-
typing system the presence of βΛ-conversion (an extension of β-conversion
with distributivity laws) drastically increases the complexity of proving
the decidability of the subtyping relation. Our proof consists of, firstly,
defining an algorithmic presentation of the subtyping system of Fω∧; sec-
ondly, proving that this new presentation is sound and complete with
respect to the original one; and finally, proving that the algorithm always
terminates.
Moreover, we establish basic structural properties of the language of
types of Fω∧ such as strong normalization and Church-Rosser.
Among the novel aspects of the present solution is the use of term
rewriting techniques to present intersection types, which clearly splits
the computational semantics (reduction rules) from the syntax (inference
rules) of the system. Another original feature is the use of a choice operator
to model the behavior of variables during subtype checking.

1 Introduction
The system Fω∧ (F-omega-meet) was first introduced in [11], where it was shown
to be rich enough to provide a typed model of object-oriented programming with
multiple inheritance. Fω∧ is an extension of Fω [17] with bounded quantification
and intersection types, which can be seen as a natural generalization of the
type disciplines present in the current literature, for example in [14, 21, 22,
8]. Systems including either subtyping or intersection types or both have been
widely studied for many years. What follows is not intended to be an exhaustive
description, but a framework for the present work.

*Laboratory for Foundations of Computer Science, Department of Computer Science, Uni-


versity of Edinburgh. United Kingdom. This research was supported by the Dutch organization
for scientific research, NWO-SION project Typed lambda calculus, 612-316-030, and by EPSRC
GRANT, GR/G 55792 Constructive logic as a basis for program development.

First-order type disciplines with intersection types have been investigated
by the group in Torino [13, 1] and elsewhere (see [7] for background and further
references). A second-order λ-calculus with intersection types was studied in [21].
Systems including subtyping were present in [6, 4]. Higher-order generalizations
of subtyping appear in [3, 12, 19, 2]. F≤ (F-sub), a second-order λ-calculus
with bounded quantification, was studied in [15], and in [21] it was proved that
subtyping in F≤ was undecidable and that undecidability was caused by the
subtyping rule for bounded quantification.
In [8] an alternative rule for subtyping quantified types was presented and the
decidability of subtyping was proved for an extension of system F with bounded
polymorphism.
Allowing bounds of functional kind forces us to introduce a conversion rule to
have invariance of subtyping under βΛ-conversion of types. Therefore, our sub-
typing relation relates types of a more expressive type system than that presented
in [8]. In fact, treating the interaction between interface refinement and encap-
sulation of objects, in object-oriented programming, has required higher-order
generalizations of subtyping: the F-bounded quantification of Canning, Cook,
Hill, Olthoff and Mitchell [3], or Cardelli and Mitchell's system Fω≤ (F-omega-
sub) [5, 19, 2].
Ghelli [16] remarked that the rule for subtyping between quantified types
presented in [8] led to a well-behaved subtyping relation but that the typing
relation fails to satisfy the minimal type property. This failure introduces serious
problems in type checking and type inference. At the moment it is not clear how
to solve them or, even more problematic, whether the typing relation is decidable.
A possible solution to overcome this problem is to replace the subtyping rule
between quantifiers by the corresponding rule of Cardelli and Wegner's kernel
fun [6].
In this paper we give a positive answer to the decidability of subtyping in the
presence of βΛ-convertible types. We prove that subtyping in Fω∧ is decidable.
This immediately gives the decidability of subtyping for the Fω≤ fragment, as the
former is a conservative extension of the latter, i.e. each subtyping statement
derivable in Fω∧ that contains no intersections other than the empty ones is also
derivable in Fω≤. The system Fω∧ satisfies the minimal type property and its
typing relation is decidable. These two results are beyond the scope of this
paper and can be found in [9].
We present a definition of Fω∧ that differs from the one introduced in [11]
in two ways. First, Castagna and Pierce's quantifier rule has been replaced by
the Cardelli and Wegner rule. Second, we introduce a richer notion of reduction
on types, and thereby the four distributivity rules become particular cases of
the conversion rule. This new reduction is shown to be confluent and strongly
normalizing. The latter simplification was motivated by structural properties of
the former presentation. Furthermore, this new presentation provides a different
view of the system that is the key to proving the decidability of subtyping.
This new perspective suggests that to prove the decidability of subtyping it is
enough to concentrate on types in normal form. Note that the solution cannot be
as simple as restricting the subtyping rules of Fω∧ to handle only types in normal
form and replacing conversion by reflexivity. In section 3.1 there is an example

illustrating the problem to be solved with a subtyping statement which is not
derivable without using conversion, i.e. without performing any β-reduction,
even when the conclusion is in normal form.
The subtyping rules of Fω∧ are not syntax directed, in the sense that the
form of a derivable subtyping statement does not uniquely determine the last
rule of its derivation (i.e. there might be more than one derivation of the same
subtyping judgement). To develop a deterministic decision procedure to check
subtyping, we need a new presentation of the subtyping relation that provides
the foundations for a deterministic subtype-checking algorithm.
Our solution is divided into two main steps. First, we develop a normal
subtyping system, NFω∧, in which only types in normal form are considered. We
prove that derivations in NFω∧ can be normalized by eliminating transitivity and
simplifying reflexivity. This simplification yields an algorithmic presentation,
AlgFω∧. Moreover, we prove that AlgFω∧ is indeed an alternative presentation of
the Fω∧ subtyping relation, that is, Γ ⊢ S ≤ T if and only if Γ^nf ⊢_Alg S^nf ≤ T^nf
(proposition 3.3.3).
The second and last step towards the decidability of subtyping in Fω∧ is to
prove that the algorithm described by AlgFω∧ terminates, which is equivalent to
showing that the definition of AlgFω∧ is well-founded. We discuss this further
in section 4.
Checking whether Γ ⊢_Alg S T ≤ A is reduced to checking whether Γ ⊢_Alg (lub_Γ(S T))^nf ≤ A,
where lub_Γ(S T) substitutes the leftmost innermost variable of S T by its bound
in Γ. Such a replacement may produce a term that is not in normal form, in which
case we normalize it afterwards. The main problem here is that the size of the
types to be examined in the recursive call does not decrease. This indicates that
the proof of termination of the algorithm is not immediate. In particular, the
proof of termination presented in [8] cannot be modified to serve our purposes,
because of the interaction between βΛ-reduction and the substitution of type
variables by their bounds in our system. We discuss this further in section 4.

2 The System
The kinds of Fω∧ are those of Fω: the kind * of proper types and the
kinds K1→K2 of functions on types (sometimes called type operators). The lan-
guage of types of Fω∧ is a straightforward higher-order extension of F≤, Cardelli
and Wegner's second-order calculus of bounded quantification. Like F≤, it in-
cludes type variables (written X), function types (T→T'), and polymorphic
types (∀X≤T:K.T'), in which the bound type variable X ranges over all sub-
types of the upper bound T. Moreover, like Fω, we allow types to be abstracted
on types (ΛX:K.T) and applied to argument types (T T'). Finally, we allow
arbitrary finite intersections (∧^K[T1..Tn]), where all the Ti's are members of the
same kind K. The empty intersection at kind K is written ⊤^K. We drop the
maximal type Top of F≤, since its role is played here by ⊤*.
The reduction →βΛ on types consists of the β-reduction and distributivity
rules for each constructor associated with intersections.
49

DEFINITION 2.1

1. (ΛX:K.T1)T2 →βΛ T1[X←T2]

2. S→∧*[T1..Tn] →βΛ ∧*[S→T1 .. S→Tn]

3. ∀X≤S:K.∧*[T1..Tn] →βΛ ∧*[∀X≤S:K.T1 .. ∀X≤S:K.Tn]

4. ΛX:K1.∧^K2[T1..Tn] →βΛ ∧^(K1→K2)[ΛX:K1.T1 .. ΛX:K1.Tn]

5. ∧^(K1→K2)[T1..Tn] U →βΛ ∧^K2[T1 U .. Tn U]

6. ∧^K[T1 .. ∧^K[S1..Sn] .. Tm] →βΛ ∧^K[T1 .. S1..Sn .. Tm]

The relation →βΛ is extended so as to become a compatible relation with respect
to type formation, ↠βΛ is the transitive and reflexive closure of →βΛ, and =βΛ
is the least equivalence relation containing →βΛ. The capture-avoiding substi-
tution of S for X in T is written T[X←S]. Substitution is extended pointwise
to contexts. The βΛ-normal form of a type S is written S^nf, and the notation is extended
pointwise to contexts.
Strictly speaking, the operational semantics of the system is captured by
rules 1 through 5. Associativity, commutativity, and idempotency of intersection
types exist in the system as derived subtyping judgements. We included rule 6
(associativity) here because it is convenient for the development of the subtyping
algorithm in section 3.1.
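As an illustration only (this is not the paper's notation), the language of types
and the reduction of Definition 2.1 can be rendered as the following Haskell
sketch; the datatype Ty, the naive substitution, and the innermost normalize are
assumptions of the sketch, and alpha-conversion is glossed over.

data Kind = Star | KArr Kind Kind
  deriving (Eq, Show)

data Ty
  = TVar String              -- X
  | Arr Ty Ty                -- T -> T'
  | All String Ty Kind Ty    -- forall X <= T : K . T'
  | Abs String Kind Ty       -- /\X : K . T   (type operator)
  | App Ty Ty                -- T T'
  | Meet Kind [Ty]           -- /\^K [T1 .. Tn]  (Top^K when the list is empty)
  deriving (Eq, Show)

-- Naive (capture-permitting) substitution T[X <- S]; fine for closed examples.
subst :: String -> Ty -> Ty -> Ty
subst x s t = case t of
  TVar y      -> if y == x then s else t
  Arr a b     -> Arr (subst x s a) (subst x s b)
  All y b k a -> if y == x then t else All y (subst x s b) k (subst x s a)
  Abs y k a   -> if y == x then t else Abs y k (subst x s a)
  App a b     -> App (subst x s a) (subst x s b)
  Meet k ts   -> Meet k (map (subst x s) ts)

-- One-step betaLambda-reduction at the root, following rules 1-6 of Definition 2.1.
step :: Ty -> Maybe Ty
step (App (Abs x _ t1) t2)         = Just (subst x t2 t1)                             -- rule 1
step (Arr s (Meet Star ts))        = Just (Meet Star [Arr s t | t <- ts])             -- rule 2
step (All x s k (Meet Star ts))    = Just (Meet Star [All x s k t | t <- ts])         -- rule 3
step (Abs x k1 (Meet k2 ts))       = Just (Meet (KArr k1 k2) [Abs x k1 t | t <- ts])  -- rule 4
step (App (Meet (KArr _ k2) ts) u) = Just (Meet k2 [App t u | t <- ts])               -- rule 5
step (Meet k ts) | any nested ts   = Just (Meet k (concatMap flat ts))                -- rule 6
  where nested (Meet k' _) = k' == k
        nested _           = False
        flat (Meet k' us) | k' == k = us
        flat u                      = [u]
step _                             = Nothing

-- Apply `step` under a congruence, innermost first; on well-kinded types this
-- terminates by the strong normalization theorem (2.1.4).
mapSub :: (Ty -> Ty) -> Ty -> Ty
mapSub f t = case t of
  TVar _      -> t
  Arr a b     -> Arr (f a) (f b)
  All x b k a -> All x (f b) k (f a)
  Abs x k a   -> Abs x k (f a)
  App a b     -> App (f a) (f b)
  Meet k ts   -> Meet k (map f ts)

normalize :: Ty -> Ty
normalize t0 =
  let t = mapSub normalize t0
  in case step t of
       Just u  -> normalize u
       Nothing -> t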

THEOREM 2.2 (Church-Rosser for →βΛ) If S ↠βΛ T1 and S ↠βΛ T2, then
there exists T3 such that T1 ↠βΛ T3 and T2 ↠βΛ T3.

The rules in Fω∧ are organized as proof systems for four interdependent judge-
ment forms:
  Γ ⊢ ok        well-formed context        Γ ⊢ T ∈ K     well-kinded type
  Γ ⊢ S ≤ T     subtype                    Γ ⊢ e ∈ T     well-typed term.

2.1 Context Formation

A context Γ is a finite sequence of typing and/or subtyping assumptions for a set
of variables and/or type variables. The formation of contexts follows the three
rules below.

  ⟨⟩ ⊢ ok  (C-EMPTY)

  Γ ⊢ T ∈ *   x ∉ dom(Γ)                 Γ ⊢ T ∈ K   X ∉ dom(Γ)
  ─────────────────────── (C-VAR)        ─────────────────────── (C-TVAR)
  Γ, x:T ⊢ ok                            Γ, X≤T:K ⊢ ok

In the type variable declaration X≤T:K, T is called the bound of X. We write
Γ(Y) to denote the bound of Y in Γ. We say that Γ is included in Γ', Γ ⊆ Γ', if
every assumption in Γ is in Γ'. The set of variables declared in Γ is called the
domain of Γ, dom(Γ).

2.1.1 Type Formation

For each type constructor, there is a rule specifying how it can be used to build
well-formed type expressions. We use the abbreviation X:K for X ≤ ⊤^K:K.

  Γ1, X≤T:K, Γ2 ⊢ ok                     Γ ⊢ T1 ∈ *   Γ ⊢ T2 ∈ *
  ─────────────────────── (K-TVAR)       ──────────────────────── (K-ARROW)
  Γ1, X≤T:K, Γ2 ⊢ X ∈ K                  Γ ⊢ T1→T2 ∈ *

  Γ, X≤T1:K1 ⊢ T2 ∈ *                    Γ, X:K1 ⊢ T2 ∈ K2
  ─────────────────────── (K-ALL)        ─────────────────────── (K-OABS)
  Γ ⊢ ∀X≤T1:K1.T2 ∈ *                    Γ ⊢ ΛX:K1.T2 ∈ K1→K2

  Γ ⊢ S ∈ K1→K2   Γ ⊢ T ∈ K1
  ───────────────────────────── (K-OAPP)
  Γ ⊢ S T ∈ K2

  Γ ⊢ ok     for each i∈{1..n}, Γ ⊢ Ti ∈ K
  ─────────────────────────────────────────── (K-MEET)
  Γ ⊢ ∧^K[T1..Tn] ∈ K

LEMMA 2.1.1  1. For any context Γ it is decidable whether Γ ⊢ ok.

2. For any context Γ, type expression T, and kind K, it is decidable whether
Γ ⊢ T ∈ K.

LEMMA 2.1.2 (Subject reduction)  1. Γ ⊢ ok and Γ →βΛ Γ' implies Γ' ⊢ ok.

2. Γ ⊢ S ∈ K and S →βΛ T implies Γ ⊢ T ∈ K.

3. Γ ⊢ S ∈ K and Γ →βΛ Γ' implies Γ' ⊢ S ∈ K.  □

PROPOSITION 2.1.3 (Well-kindedness of subtyping) If Γ ⊢ S ≤ T, then Γ ⊢
S ∈ K and Γ ⊢ T ∈ K for some K.

THEOREM 2.1.4 (Strong normalization for →βΛ) If Γ ⊢ T ∈ K, then every βΛ-
reduction sequence starting from T terminates.
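Lemma 2.1.1(2) can be read off a syntax-directed kind synthesis. A rough sketch
over the Ty datatype introduced above (illustrative only; KCtx, kindOf, and the
extra check that a bound has its declared kind are assumptions of the sketch, not
the paper's formal rules):

-- A kind environment mapping type variables to the kinds of their declarations.
type KCtx = [(String, Kind)]

kindOf :: KCtx -> Ty -> Maybe Kind
kindOf g (TVar x)  = lookup x g
kindOf g (Arr a b) = do                       -- K-ARROW: both sides at kind *
  Star <- kindOf g a
  Star <- kindOf g b
  Just Star
kindOf g (All x b k a) = do                   -- K-ALL, also checking the bound's kind
  kb   <- kindOf g b
  Star <- kindOf ((x, k) : g) a
  if kb == k then Just Star else Nothing
kindOf g (Abs x k a) = KArr k <$> kindOf ((x, k) : g) a        -- K-OABS
kindOf g (App a b) = do                       -- K-OAPP
  KArr k1 k2 <- kindOf g a
  k1'        <- kindOf g b
  if k1 == k1' then Just k2 else Nothing
kindOf g (Meet k ts)                          -- K-MEET
  | all (\t -> kindOf g t == Just k) ts = Just k
  | otherwise                           = Nothing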

2.1.2 Subtyping
The rules defining the subtype relation are a natural extension of familiar calculi
of bounded quantification. Aside from some extra well-formedness conditions,
the rules S-TRANS, S-TVAR, and S-ARROW are the same as in the usual, second-
order case. Fω's rule of type conversion (i.e. if Γ ⊢ e ∈ T and T =βΛ T' then
Γ ⊢ e ∈ T') is captured here as the subtyping rule S-CONV, which also gives
reflexivity as a special case. Rules S-MEET-G and S-MEET-LB specify that an
intersection of a set of types is their order-theoretic greatest lower bound.

  Γ ⊢ S ∈ K   Γ ⊢ T ∈ K   S =βΛ T
  ─────────────────────────────────── (S-CONV)
  Γ ⊢ S ≤ T

  Γ ⊢ S ≤ T   Γ ⊢ T ≤ U                    Γ1, X≤T:K, Γ2 ⊢ ok
  ─────────────────────── (S-TRANS)        ───────────────────── (S-TVAR)
  Γ ⊢ S ≤ U                                Γ1, X≤T:K, Γ2 ⊢ X ≤ T

  Γ ⊢ T1 ≤ S1   Γ ⊢ S2 ≤ T2   Γ ⊢ S1→S2 ∈ *
  ───────────────────────────────────────────── (S-ARROW)
  Γ ⊢ S1→S2 ≤ T1→T2

  Γ, X≤U:K ⊢ S ≤ T   Γ ⊢ ∀X≤U:K.S ∈ *
  ───────────────────────────────────── (S-ALL)
  Γ ⊢ ∀X≤U:K.S ≤ ∀X≤U:K.T

  Γ, X≤⊤^K:K ⊢ S ≤ T                       Γ ⊢ S ≤ T   Γ ⊢ S U ∈ K
  ────────────────────── (S-OABS)          ─────────────────────────── (S-OAPP)
  Γ ⊢ ΛX:K.S ≤ ΛX:K.T                      Γ ⊢ S U ≤ T U

  for each i, Γ ⊢ S ≤ Ti   Γ ⊢ S ∈ K
  ───────────────────────────────────── (S-MEET-G)
  Γ ⊢ S ≤ ∧^K[T1..Tn]

  Γ ⊢ ∧^K[T1..Tn] ∈ K
  ───────────────────── (S-MEET-LB)
  Γ ⊢ ∧^K[T1..Tn] ≤ Ti

3 Decidability of subtyping
In this section we show that the subtyping relation of Fω∧ is decidable. The solu-
tion is divided into two main parts. First, we develop a normal subtyping system,
NFω∧, in which only types in normal form are considered. We prove that deriva-
tions in NFω∧ can be normalized by eliminating transitivity and simplifying re-
flexivity. This simplification yields an algorithmic presentation, AlgFω∧, whose
rules are syntax directed. Moreover, we prove that AlgFω∧ is indeed an alterna-
tive presentation of the Fω∧ subtyping relation. Formally, Γ ⊢ S ≤ T if and only
if Γ^nf ⊢_Alg S^nf ≤ T^nf (proposition 3.3.3).
In the solution for the second-order lambda calculus presented in [21], the
distributivity rules for intersection types are not considered as rewrite rules. For
that reason, new syntactic categories have to be defined (composite and indi-
vidual canonical types) and an auxiliary mapping (flattening) transforms a type
into a canonical type. Our solution needs neither new syntactic categories
nor elaborate auxiliary mappings, since the role played there by canonical types
is performed here by types in normal form.
Independently, Steffen and Pierce proved a similar result for Fω≤ [23]. There
are several differences between our work and the proof of decidability of sub-
typing in [23]. First, our result is for a stronger system which also includes
intersection types. Our proof of termination has the novel idea of using a choice
operator to model the behavior of type variables during subtype checking. A
second major difference is the choice of the intermediate subtyping system. We
define the normal system NFω∧, which is not only the key to proving decidability
of subtyping but also helped us understand the fine structure of subtyping, yielding the
algorithm AlgFω∧. In [23] the intermediate system, called a reducing system,
leads to a much more complicated proof which involves dealing with several
notions of reduction and further reformulations of the intermediate system.

3.1 Normal Subtyping


An important property of derivation systems is the information that a derivable
judgement contains about its proofs. This information is essential to produce
results which not only state properties about the subproofs, but also help identify
ill formed judgements.

EXAMPLE 3.1.1 In Fω∧ we can prove that if Γ ≡ W:K, X≤(ΛY:K.Y):K→K,
then Γ ⊢ X W ≤ W:

  Γ ⊢ ok
  ────────────────────── S-TVAR
  Γ ⊢ X ≤ ΛY:K.Y                          (ΛY:K.Y) W =βΛ W
  ────────────────────── S-OAPP           ────────────────────── S-CONV
  Γ ⊢ X W ≤ (ΛY:K.Y) W                    Γ ⊢ (ΛY:K.Y) W ≤ W
  ──────────────────────────────────────────────────────────────── S-TRANS
  Γ ⊢ X W ≤ W.
This example shows that S-TRANS erases information obtained by S-CONV that
is not present in the conclusion. A first step towards an algorithm to check the
subtyping relation is to design a set of rules in which the derivable judgements
contain all the information about their derivations. For that we define a set of
rules (NFω∧) in which conversion is reduced to a minimum and transitivity can
be eliminated (lemma 3.1.7). Both results are proved with a standard cut-
elimination argument. This yields a syntax-directed subtyping relation (AlgFω∧)
which constitutes a decision procedure for the original system.
In this section, we present the subtyping system NFω∧ that uses the context
and type formation rules of Fω∧. We prove a generation lemma for subtyping
(proposition 3.1.8) and define an algorithmic presentation, AlgFω∧ (see definition
3.3.1). Finally, we show that there is an equivalence between subtyping in Fω∧
and subtyping in AlgFω∧, which is essential to prove the decidability of subtyping
in Fω∧ (see section 3.3).
We now define the normal subtyping system, NFω∧. Subtyping statements in
NFω∧ are written Γ ⊢n S ≤ T, and S, T, and all types appearing in Γ are in
βΛ-normal form.

NOTATION 3.1.2 A, B, and C range over types whose outermost constructor
is not an intersection.

REMARK 3.1.3 It is an immediate consequence of the βΛ-reduction rules that,
if T is in βΛ-normal form, then T is either a variable X, an arrow type S→A,
a bounded quantification ∀X≤S:K.A, an abstraction ΛX:K.A, an application
A S where A is not an abstraction, or an intersection ∧^K[A1..An].

We now define lub_Γ(S), and we prove in lemma 3.2.1 and corollary 3.2.9
that, when defined, it is the smallest type above S with respect to Γ.

DEFINITION 3.1.4 (Least strict Upper Bound)

  lub_Γ(X) = Γ(X),        lub_Γ(T S) = lub_Γ(T) S.

Note that when S is a well-kinded type with respect to Γ, and it is a type
variable or a type application in (head) normal form, then lub_Γ(S) is defined.
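As a small illustration (again, not part of the paper), lub_Γ can be written over
the Ty datatype sketched after Definition 2.1, returning Nothing when it is
undefined; the context representation Ctx is an assumption of the sketch, with
the most recently declared type variable first.

-- A context as a list of (type variable, bound) pairs, most recent first.
type Ctx = [(String, Ty)]

-- Definition 3.1.4: the lub of a variable is its bound; the lub of an
-- application is the lub of its head applied to the same argument.
lub :: Ctx -> Ty -> Maybe Ty
lub g (TVar x)  = lookup x g
lub g (App t s) = fmap (\t' -> App t' s) (lub g t)
lub _ _         = Nothing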

DEFINITION 3.1.5 (NFω∧ subtyping rules)

  Γ ⊢ S ∈ K
  ─────────────── (NS-REFL)
  Γ ⊢n S ≤ S

  Γ ⊢n Γ(X) ≤ A   X ≢ A
  ──────────────────────── (NS-TVAR)
  Γ ⊢n X ≤ A

  Γ ⊢n (lub_Γ(T S))^nf ≤ A   Γ ⊢ T S ∈ K   T S ≢ A
  ──────────────────────────────────────────────────── (NS-OAPP)
  Γ ⊢n T S ≤ A

  ∀i∈{1..m}  Γ ⊢n A ≤ Ti   Γ ⊢ A ∈ K
  ───────────────────────────────────── (NS-∀)
  Γ ⊢n A ≤ ∧^K[T1..Tm]

  ∃j∈{1..n}  Γ ⊢n Sj ≤ A   ∀k∈{1..n}  Γ ⊢ Sk ∈ K
  ────────────────────────────────────────────────── (NS-∃)
  Γ ⊢n ∧^K[S1..Sn] ≤ A

  ∀i∈{1..m}  ∃j∈{1..n}  Γ ⊢n Sj ≤ Ti   ∀k∈{1..n}  Γ ⊢ Sk ∈ K
  ────────────────────────────────────────────────────────────── (NS-∀∃)
  Γ ⊢n ∧^K[S1..Sn] ≤ ∧^K[T1..Tm]

NS-TRANS, NS-ARROW, NS-ALL, NS-OABS are as in section 2.1.2.

As we mentioned in the introduction, an important factor in developing this
system was to consider the distributivity rules of the presentation of Fω∧ in [11]
as reduction rules instead of subtyping rules. This new point of view suggested
that an algorithmic system should, to a certain extent, concentrate on normal
forms, replacing the conversion rule by reflexivity. Consequently, a derivation of
a subtyping statement should involve only types in normal form. However, as
shown by the (counter)example 3.1.1, it is not possible to perform all reductions
at once. In other words, the system does not satisfy an S-CONV postponement
property. Without using S-CONV it is not possible to derive 3.1.1. Hence, the
solution is not as simple as replacing S-CONV by NS-REFL.
In general, the interaction between S-TRANS and S-CONV can be analyzed
as follows. In S-TRANS the metavariable T of the hypothesis is not present in
the conclusion, but this is not a problem by itself (a similar situation appears
in the simply typed lambda calculus in its application rule and the system is
deterministic). The problem is that in the presence of S-CONV the vanishing T
can be βΛ-convertible to either S or U or to both S and U. What example
3.1.1 shows is that S and U may be different normal forms, which means that
searching for T is inherently nondeterministic.
We cannot eliminate transitivity completely; we still need it on type variables
and on type applications. In F≤ [15] transitivity is eliminated and hidden in a
richer variable rule in which deciding whether Γ ⊢ X ≤ T when T ≢ X is
reduced to deciding whether the bound of X is smaller than or equal to T. The
bound of X has the particular property of being the least strict upper bound of
X. This observation motivated the definition of our NS-OAPP rule, in which we
reduce the decision of whether Γ ⊢ T S ≤ A when A ≢ T S to checking whether
the least strict upper bound of T S is smaller than or equal to A (see lemma
3.2.1 and corollary 3.2.9). The least strict upper bound of T S with respect to
Γ, lub_Γ(T S), is obtained from T S by replacing its leftmost innermost variable
by the corresponding bound in Γ. Consequently, lub_Γ(T S) may be other than a
normal form. That is the reason we normalize it. The strength of the conversion
rule that is not captured by reflexivity is hidden in this normalization step. Since
T S is a well-kinded type, the free type variables of T S are included in dom(Γ).
Therefore, lub_Γ(T S) is defined. By lemma 3.2.1(1), lub_Γ(T S) is well-kinded,
and since well-kinded types are strongly normalizing, its normal form exists.
The rules S-MEET-LB and S-MEET-G are replaced by NS-∃, NS-∀, and NS-∀∃.

EXAMPLE 3.1.6 A derivation of the subtyping statement of example 3.1.1 in
NFω∧. Let Γ ≡ W:K, X≤(ΛY:K.Y):K→K.

  Γ ⊢ W ∈ K
  ─────────────────────────── NS-REFL
  Γ ⊢n ((ΛY:K.Y) W)^nf ≤ W
  ─────────────────────────── NS-OAPP
  Γ ⊢n X W ≤ W.

We say that a proof is in normal form if it has no applications of NS-TRANS
and NS-REFL is only applied to type variables and type applications.

LEMMA 3.1.7 If Γ ⊢n S ≤ T, then there exists a proof of Γ ⊢n S ≤ T in normal
form.

A consequence of the normalization of proofs is the following generation
lemma.

PROPOSITION 3.1.8 (Generation for normal subtyping)

1. Γ ⊢n X ≤ B implies X ≡ B and Γ ⊢ X ∈ K for some K, or Γ ⊢n Γ(X) ≤ B.

2. Γ ⊢n S→A ≤ B implies B ≡ T→C, Γ ⊢n T ≤ S, Γ ⊢n A ≤ C, and
Γ ⊢ S→A ∈ *.

3. Γ ⊢n ∀X≤S:K.A ≤ B implies B ≡ ∀X≤S:K.C, Γ, X≤S:K ⊢n A ≤ C,
and Γ ⊢ ∀X≤S:K.A ∈ *.

4. Γ ⊢n ΛX:K.A ≤ B implies B ≡ ΛX:K.C, and Γ, X≤⊤^K:K ⊢n A ≤ C.

5. Γ ⊢n A S ≤ B implies B ≡ A S, or Γ ⊢n (lub_Γ(A S))^nf ≤ B and
Γ ⊢ A S ∈ K.

6. Γ ⊢n ∧^K[A1..Am] ≤ B implies that there exists j∈{1..m} such that Γ ⊢n
Aj ≤ B and ∀k∈{1..m} Γ ⊢ Ak ∈ K.

7. Γ ⊢n A ≤ ∧^K[B1..Bn] implies that for each i∈{1..n} Γ ⊢n A ≤ Bi and
Γ ⊢ A ∈ K.

8. Γ ⊢n ∧^K[A1..Am] ≤ ∧^K[B1..Bn] implies for each i∈{1..n} there exists
j∈{1..m} such that Γ ⊢n Aj ≤ Bi and ∀k∈{1..m} Γ ⊢ Ak ∈ K.

Moreover, given a normal proof of any of the antecedents, the proofs of the
consequents are proper subderivations.

3.2 Equivalence of ordinary and normal subtyping

In this section, we show that a subtyping statement is derivable in Fω∧ if and only
if the corresponding normalized statement is derivable in NFω∧. This equivalence
is proved in theorem 3.2.8. As usual, we need some auxiliary properties and
definitions, among which we can highlight propositions 3.2.2 and 3.2.7.

LEMMA 3.2.1 Let lub_Γ(S) be defined. Then

1. Γ ⊢ S ∈ K implies Γ ⊢ lub_Γ(S) ∈ K.

2. Γ ⊢ S ≤ lub_Γ(S).

PROPOSITION 3.2.2 (Soundness) If Γ ⊢n S ≤ T, then Γ ⊢ S ≤ T.

LEMMA 3.2.3 1. Γ ⊢ ok implies Γ^nf ⊢ ok.

2. Γ ⊢ T ∈ K implies Γ^nf ⊢ T ∈ K.
3. Γ ⊢ S ≤ T implies Γ^nf ⊢ S ≤ T.
4. Let Γ1, Γ2 ⊢ ok. Then Γ1^nf, Γ2 ⊢ T ∈ K implies Γ1, Γ2 ⊢ T ∈ K.
5. Let Γ1, Γ2 ⊢ ok. Then Γ1^nf, Γ2 ⊢ S ≤ T implies Γ1, Γ2 ⊢ S ≤ T.
6. Let Γ ⊢ S, T ∈ K. Then Γ^nf ⊢ S^nf ≤ T^nf if and only if Γ ⊢ S ≤ T.

The following lemma states that S-TVAR is an admissible rule in NFω∧.

LEMMA 3.2.4 If Γ is a context in normal form such that Γ ⊢ ok and Y∈dom(Γ),
then Γ ⊢n Y ≤ Γ(Y).

LEMMA 3.2.5 (Substitution) If Γ ⊢ U ∈ K and Γ, X:K, Γ' ⊢n S ≤ T, then
Γ, (Γ'[X←U])^nf ⊢n (S[X←U])^nf ≤ (T[X←U])^nf.

This substitution lemma is the key result we use in proving that S-OAPP has
a corresponding admissible rule in NFω∧.

LEMMA 3.2.6 Let Γ ⊢ S U ∈ K. Then Γ ⊢n S ≤ T implies Γ ⊢n (S U)^nf ≤ (T U)^nf.

PROPOSITION 3.2.7 (Completeness) Γ ⊢ S ≤ T implies Γ^nf ⊢n S^nf ≤ T^nf.

PROOF: By induction on the derivation of Γ ⊢ S ≤ T, using lemma 3.2.6 and
the Church-Rosser theorem (2.2) for the case of S-OAPP.  □

THEOREM 3.2.8 (Equivalence of ordinary and normal subtyping) Let Γ ⊢ S ∈
K and Γ ⊢ T ∈ K. Then Γ ⊢ S ≤ T if and only if Γ^nf ⊢n S^nf ≤ T^nf.

PROOF: (⇒) By completeness (proposition 3.2.7). (⇐) By soundness (proposi-
tion 3.2.2), it follows that Γ^nf ⊢ S^nf ≤ T^nf, and, by lemma 3.2.3, it follows that
Γ ⊢ S ≤ T.  □

COROLLARY 3.2.9 Let lub_Γ(S) be defined. Then Γ ⊢ S ≤ T and T ≠βΛ S
implies Γ ⊢ lub_Γ(S) ≤ T.

3.3 A subtype checking algorithm, AlgFω∧

As it stands, NFω∧ as defined in section 3.1 is not a deterministic algorithm,
because its rules are not syntax directed. Fortunately, we are not far away from
an algorithmic presentation. In fact, lemma 3.1.7, which states that transitivity
steps can be eliminated and reflexivity steps can be simplified, is the bridge to
the algorithmic presentation of the subtyping relation, AlgFω∧. AlgFω∧ is obtained
from NFω∧ by removing NS-TRANS and restricting NS-REFL to type variables and
type applications.

DEFINITION 3.3.1 We define the algorithmic system AlgFω∧ from NFω∧ by re-
moving NS-TRANS and replacing NS-REFL by

  Γ ⊢ X ∈ K                            Γ ⊢ T S ∈ K
  ──────────────── (AS-TVARREFL)       ──────────────────── (AS-OAPPREFL)
  Γ ⊢_Alg X ≤ X                        Γ ⊢_Alg T S ≤ T S

LEMMA 3.3.2 (Equivalence of normal and algorithmic subtyping)
Let Γ ⊢ S, T ∈ K. Then Γ ⊢n S ≤ T if and only if Γ ⊢_Alg S ≤ T.

PROOF: (⇒) By lemma 3.1.7. (⇐) Immediate.  □

We have thereby proved that AlgFω∧ is indeed a sound and complete algorithm
for the Fω∧ subtyping relation. We conclude the proof of decidability of subtyping
in Fω∧ by establishing in section 4 that AlgFω∧ always terminates.
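As a rough functional rendering of the algorithm (a sketch under simplifying
assumptions, not the paper's formal system AlgFω∧), the syntax-directed check
can be read as the following Haskell function over the Ty, Ctx, lub and normalize
sketches given earlier; inputs are assumed to be well kinded and in normal form,
bounds are compared syntactically, and alpha-conversion is glossed over.

subtype :: Ctx -> Ty -> Ty -> Bool
subtype g s t
  | s == t, isVarOrApp s = True                           -- AS-TVARREFL / AS-OAPPREFL
subtype g (Meet _ ss) (Meet _ ts) =
  all (\ti -> any (\sj -> subtype g sj ti) ss) ts         -- NS-forall-exists
subtype g s (Meet _ ts) = all (subtype g s) ts            -- NS-forall
subtype g (Meet _ ss) t = any (\sj -> subtype g sj t) ss  -- NS-exists
subtype g (Arr s1 s2) (Arr t1 t2) =
  subtype g t1 s1 && subtype g s2 t2                      -- NS-Arrow (contravariant left)
subtype g (All x b1 k1 s) (All _ b2 k2 t)
  | b1 == b2, k1 == k2 = subtype ((x, b1) : g) s t        -- NS-All (equal bounds)
subtype g (Abs x k1 s) (Abs _ k2 t)
  | k1 == k2 = subtype ((x, Meet k1 []) : g) s t          -- NS-OAbs (bound is Top^K)
subtype g s t = case lub g s of                           -- NS-TVar / NS-OApp
  Just u  -> subtype g (normalize u) t
  Nothing -> False

isVarOrApp :: Ty -> Bool
isVarOrApp (TVar _)  = True
isVarOrApp (App _ _) = True
isVarOrApp _         = False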

PROPOSITION 3.3.3 (Equivalence of ordinary and algorithmic subtyping)

Let Γ ⊢ S ∈ K and Γ ⊢ T ∈ K. Then Γ ⊢ S ≤ T if and only if Γ^nf ⊢_Alg
S^nf ≤ T^nf.

PROOF: By the equivalence of ordinary and normal subtyping (theorem 3.2.8)
and the equivalence of normal and algorithmic subtyping (lemma 3.3.2).  □

4 Termination of subtype checking

The last step in proving the decidability of the subtyping relation of Fω∧ is
proving the termination, or well-foundedness, of the relation defined by the AlgFω∧
subtyping rules. We show this by reducing the well-foundedness of AlgFω∧ to the
strong normalization property of the →βΛ+ relation.
We begin by extending the language of types with the constructor + (T+T').
The reduction →βΛ+ is obtained from →βΛ by adding the reductions associated
with the choice operator +, namely S+T →βΛ+ S and S+T →βΛ+ T. We also
need to extend our kinding judgements (section 2.1.1) with a rule saying that
Γ ⊢+ S+T ∈ K whenever Γ ⊢+ S, T ∈ K. As far as we are aware, choice
operators have not been used before to analyze subtyping.
As usual, →βΛ+ is extended to become a compatible relation with respect to
type formation, ↠βΛ+ is the reflexive, transitive closure of →βΛ+, and =βΛ+ is
the least equivalence relation containing →βΛ+. If S ↠βΛ+ T in at least one
step, we write S ↠βΛ+^(>0) T.

PROPOSITION 4.1 (Strong normalization for →βΛ+) If Γ ⊢+ T ∈ K, then every
βΛ+-reduction sequence starting from T is finite.

Next, we define a measure for subtyping statements such that, for any sub-
typing rule, the measure of each hypothesis is smaller than that of the conclusion.
Most measures for showing the well-foundedness of a relation defined by a set
of inference rules involve a clever assignment of weights to judgements, often in-
volving the number of symbols. We need a more sophisticated measure, since in
NS-OAPP it is not necessarily the case that the size of the hypothesis is smaller
than the size of the conclusion.
We introduce a new mapping from types to types in the extended language
in order to define a new measure on subtyping statements. To motivate the
definition of this new measure, we analyze the behavior of type variables during
subtype checking. Assume that we want to check whether Γ ⊢_Alg S ≤ T, where S is a
variable or a type application. It can be the case that the judgement is obtained
with an application of NS-TVAR or NS-OAPP, in which case we have to consider
a new statement Γ ⊢_Alg S' ≤ T, where S' is obtained from S by replacing a
variable by its bound (and possibly normalizing). However, we do not replace
every variable by its bound, as this would constitute an unsound operation with
respect to subtyping. This fact is illustrated in the following example.

EXAMPLE 4.2 Two unrelated variables may have the same bound.

  X≤⊤*:*, Y≤⊤*:* ⊬ X ≤ Y,   but   X≤⊤*:*, Y≤⊤*:* ⊢ ⊤* ≤ ⊤*.

Our new mapping, plus, includes in each type expression this nondetermin-
istic behavior of its type variables.

DEFINITION 4.3 (plus) 1. plus_{Γ1, X≤T:K, Γ2}(X) = X + plus_{Γ1}(T),

2. plus_Γ(T→S) = plus_Γ(T)→plus_Γ(S),

3. plus_Γ(∀X≤T:K.S) = ∀X≤plus_Γ(T):K.plus_{Γ, X≤T:K}(S),

4. plus_Γ(ΛX:K.S) = ΛX:K.plus_{Γ, X:K}(S),

5. plus_Γ(S T) = plus_Γ(S) plus_Γ(T),

6. plus_Γ(∧^K[S1..Sn]) = ∧^K[plus_Γ(S1)..plus_Γ(Sn)].

EXAMPLE 4.4 plus_{X≤⊤*:*, Y≤X:*, Z≤Y:*}(Z) = Z + Y + X + ⊤*.
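A functional reading of plus (an illustrative sketch only; it assumes the Ty and
Ctx sketches from the earlier sections, extended with a hypothetical constructor
Plus Ty Ty for the choice operator, and it ignores term variable declarations):

-- Assumes:  data Ty = ... | Plus Ty Ty  as an extension of the earlier sketch.
-- Definition 4.3: a type variable is unfolded to the choice between itself and
-- the plus-image of its bound, computed in the part of the context declared
-- before the variable (Gamma_1 in clause 1); Ctx lists the most recent first.
plusTy :: Ctx -> Ty -> Ty
plusTy g (TVar x)      = case break ((== x) . fst) g of
                           (_, (_, b) : g1) -> Plus (TVar x) (plusTy g1 b)
                           _                -> TVar x
plusTy g (Arr s t)     = Arr (plusTy g s) (plusTy g t)
plusTy g (All x b k t) = All x (plusTy g b) k (plusTy ((x, b) : g) t)
plusTy g (Abs x k t)   = Abs x k (plusTy ((x, Meet k []) : g) t)  -- X:K is X <= Top^K : K
plusTy g (App s t)     = App (plusTy g s) (plusTy g t)
plusTy g (Meet k ts)   = Meet k (map (plusTy g) ts)

-- With g = [("Z", TVar "Y"), ("Y", TVar "X"), ("X", Meet Star [])], i.e. the
-- context of Example 4.4 read right to left, plusTy g (TVar "Z") yields
-- Z + (Y + (X + Top*)), matching the example up to bracketing.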

We need to show that plus is well defined on well-kinded arguments.

DEFINITION 4.5 (Size) The size of a type expression T, sizet(T), is defined as
follows.
• sizet(X) = 1,
• sizet(ΛX:K.T) = sizet(∀X≤S:K.T) = sizet(T) + 1,
• sizet(S→T) = sizet(S T) = sizet(S) + sizet(T) + 1,
• sizet(∧^K[T1..Tn]) = 1 + sizet(T1) + ... + sizet(Tn).

LEMMA 4.6 (Well-foundedness of plus) If Γ ⊢ T ∈ K, then plus_Γ(T) is defined.

PROOF: The height of the derivation of the kinding judgements of the arguments
strictly decreases in each recursive call.  □

LEMMA 4.7 If Γ ⊢ T ∈ K, then Γ ⊢+ plus_Γ(T) ∈ K.

LEMMA 4.8 (Strengthening for plus) 1. Let X ∉ FTV(Γ2) ∪ FTV(S). Then
Γ1, X≤T_X:K_X, Γ2 ⊢ S ∈ K implies plus_{Γ1, X≤T_X:K_X, Γ2}(S) = plus_{Γ1, Γ2}(S).

2. Γ1, x:T, Γ2 ⊢ S ∈ K implies plus_{Γ1, x:T, Γ2}(S) = plus_{Γ1, Γ2}(S).

LEMMA 4.9 (Weakening for plus) If Γ' ⊢ ok, Γ ⊆ Γ', and Γ ⊢ S ∈ K, then
plus_Γ(S) = plus_{Γ'}(S).

LEMMA 4.10 (Context modification) If Γ1 ⊢ U' ∈ K and E is either ok or
T ∈ K', then Γ1, X≤U:K, Γ2 ⊢ E implies Γ1, X≤U':K, Γ2 ⊢ E.

LEMMA 4.11 (Substitution for plus) Let Γ1, X≤S:K1, Γ2 ⊢ T2 ∈ K2 and Γ1 ⊢
T1 ∈ K1. Then
plus_{Γ1, X≤S:K1, Γ2}(T2)[X←plus_{Γ1}(T1)] ↠βΛ+ plus_{Γ1, Γ2[X←T1]}(T2[X←T1]).

LEMMA 4.12 (Monotonicity of plus with respect to →βΛ) Let Γ ⊢ T ∈ K. Then

1. Γ →βΛ Γ' implies plus_Γ(T) ↠βΛ+ plus_{Γ'}(T).

2. T →βΛ T' implies plus_Γ(T) ↠βΛ+^(>0) plus_Γ(T').

LEMMA 4.13 Let lub_Γ(S) be defined and Γ ⊢ S ∈ K. Then plus_Γ(S) ↠βΛ+^(>0)
plus_Γ(lub_Γ(S)).

Our measure to show the well-foundedness of AlgFω∧ considers the βΛ+-reduction
paths of the plus versions of the types in the subtyping judgements. As we men-
tioned before, in NS-TVAR and NS-OAPP the types appearing in the hypothesis
may be larger than those in their conclusions. Therefore, the well-foundedness
of the AlgFω∧ relation is not immediate. The next corollary gathers the previous
results to serve our purposes.

COROLLARY 4.14 1. If Γ ⊢ X ∈ K, then plus_Γ(X) ↠βΛ+^(>0) plus_Γ(Γ(X)).

2. If Γ ⊢ A T ∈ K, then plus_Γ(A T) ↠βΛ+^(>0) plus_Γ((lub_Γ(A T))^nf).

PROOF: Item 1 is a particular case of the previous lemma (lemma 4.13), and
item 2 is a consequence of lemma 4.13 and the monotonicity of plus with respect
to →βΛ+ (lemma 4.12(2)).  □
Finally, we can define our measure.

DEFINITION 4.15 (Weight)

1. weight(Γ ⊢_Alg S ≤ T) =
   ⟨max-red(plus_Γ(S)) + max-red(plus_Γ(T)), sizet(S) + sizet(T)⟩,

2. weight(Γ ⊢ T ∈ K) = ⟨0, 0⟩,

where max-red(S) is the length of a maximal βΛ+-reduction path starting from
S, and sizet is defined in definition 4.5.

Pairs are ordered lexicographically. Note that ⟨0, 0⟩ is the least weight.

PROPOSITION 4.16 (Well-foundedness of AlgFω∧)
If

  J1 ... Jn
  ─────────
      J

is an AlgFω∧ rule, then weight(Ji) < weight(J) for each i∈{1..n}.

Finally, we can state our main result.

THEOREM 4.17 (Decidability of subtyping in Fω∧)

For any context Γ and for any two types S and T, it is decidable whether
Γ ⊢ S ≤ T.

5 Acknowledgements
I want to express my gratitude to Mariangiola Dezani-Ciancaglini for her scien-
tific and moral support which made the present work possible. I am also grateful
for discussions with Benjamin Pierce, Henk Barendregt, and Ugo de'Liguoro.

References
[1] H. P. Barendregt, M. Coppo, and M. Dezani-Ciancaglini. A filter lambda model
and the completeness of type assignment. Journal of Symbolic Logic, 48(4):931-
940, 1983.
[2] Kim Bruce and John Mitchell. PER models of subtyping, recursive types and
higher-order polymorphism. In Proceedings of the Nineteenth ACM Symposium
on Principles of Programming Languages, Albuquerque, NM, January 1992.
[3] Peter Canning, William Cook, Walter Hill, Walter Olthoff, and John Mitchell. F-
bounded quantification for object-oriented programming. In Fourth International
Conference on Functional Programming Languages and Computer Architecture,
pages 273-280, September 1989.
[4] Luca Cardelli. A semantics of multiple inheritance. Information and Computa-
tion, 76:138-164, 1988. Preliminary version in Semantics of Data Types, Kahn,
MacQueen, and Plotkin, eds., Springer-Verlag LNCS 173, 1984.
[5] Luca Cardelli. Notes about F~:. Unpublished manuscript, October 1990.
[6] Luca Cardelli and Peter Wegner. On understanding types, data abstraction, and
polymorphism. Computing Surveys, 17(4), December 1985.

[7] Felice Cardone and Mario Coppo. Two extensions of Curry's type inference sys-
tem. In Odifreddi [20], pages 19-76.
[8] Giuseppe Castagna and Benjamin Pierce. Decidable bounded quantification. In
Proceedings of Twenty-First Annual ACM Symposium on Principles of Program-
ming Languages, Portland, OR. ACM, January 1994.
[9] Adriana Compagnoni. Higher-Order Subtyping with Intersection Types. PhD the-
sis, University of Nijmegen, The Netherlands, January 1995.
[10] Adriana B. Compagnoni. Subtyping in F~ is decidable. Technical Report ECS-
LFCS-94-281, LFCS, University of Edinburgh, January 1994.
[11] Adriana B. Compagnoni and Benjamin C. Pierce. Multiple inheritance via inter-
section types. Mathematical Structures in Computer Science, 1995. To appear.
Preliminary version available as University of Edinburgh technical report ECS-
LFCS-93-275 and Catholic University Nijmegen computer science technical report
93-18, Aug. 1993.
[12] William R. Cook, Walter L. Hill, and Peter S. Canning. Inheritance is not sub-
typing. In Seventeenth Annual ACM Symposium on Principles of Programming
Languages, pages 125-135, San Francisco, CA, January 1990. Also in [18].
[13] M. Coppo and M. Dezani-Ciancaglini. A new type-assignment for λ-terms. Arch.
Math. Logik, 19:139-156, 1978.
[14] Pierre-Louis Curien and Giorgio Ghelli. Coherence of subsumption: Minimum
typing and type-checking in F_<. Mathematical Structures in Computer Science,
2:55-91, 1992. Also in [18].
[15] Giorgio Ghelli. Proof Theoretic Studies about a Minimal Type System Integrating
Inclusion and Parametric Polymorphism. PhD thesis, Università di Pisa, March
1990. Technical report TD-6/90, Dipartimento di Informatica, Università di Pisa.
[16] Giorgio Ghelli, January 1994. Message to the Types mailing list.
[17] Jean-Yves Girard. Interprétation fonctionnelle et élimination des coupures de
l'arithmétique d'ordre supérieur. PhD thesis, Université Paris VII, 1972.
[18] Carl A. Gunter and John C. Mitchell. Theoretical Aspects of Object-Oriented
Programming: Types, Semantics, and Language Design. The MIT Press, 1994.
[19] John C. Mitchell. Toward a typed foundation for method specialization and inher-
itance. In Proceedings of the 17th ACM Symposium on Principles of Programming
Languages, pages 109-124, January 1990. Also in [18].
[20] Piergiorgio Odifreddi, editor. Logic and Computer Science. Number 31 in APIC
Studies in Data Processing. Academic Press, 1990.
[21] Benjamin C. Pierce. Programming with Intersection Types and Bounded Poly-
morphism. PhD thesis, Carnegie Mellon University, December 1991. Available as
School of Computer Science technical report CMU-CS-91-205.
[22] Benjamin C. Pierce and David N. Turner. Simple type-theoretic foundations for
object-oriented programming. Journal of Functional Programming, 4(2):207-247,
April 1994.
[23] Martin Steffen and Benjamin Pierce. Higher-order subtyping. In IFIP Working
Conference on Programming Concepts, Methods and Calculi (PROCOMET), June
1994. An earlier version appeared as University of Edinburgh technical report
ECS-LFCS-94-280 and Universität Erlangen-Nürnberg Interner Bericht IMMD7-
01/94, February 1994.
A λ-calculus Structure Isomorphic to
Gentzen-style Sequent Calculus Structure

Hugo Herbelin *

LITP, University Paris 7, 2 place Jussieu, 75252 Paris Cedex 05, France
INRIA-Rocquencourt, B.P. 105, 78153 Le Chesnay Cedex, France
Hugo.Herbelin@inria.fr

Abstract. We consider a λ-calculus for which applicative terms no
longer have the form (...((u u1) u2)... un) but the form (u [u1; ...; un]), for
which [u1; ...; un] is a list of terms. While the structure of the usual
λ-calculus is isomorphic to the structure of natural deduction, this new
structure is isomorphic to the structure of Gentzen-style sequent calculus.
To express the basis of the isomorphism, we consider intuitionistic logic
with the implication as sole connective. However, we do not consider
Gentzen's calculus LJ, but a calculus LJT which leads to a restriction of the
notion of cut-free proofs in LJ. We also need to explicitly consider, in
a simply typed version of this λ-calculus, a substitution operator and a
list concatenation operator. In this way, each elementary step of cut-
elimination exactly matches with a β-reduction, a substitution propaga-
tion step or a concatenation computation step.
Though it is possible to extend the isomorphism to classical logic and to
other connectives, we do not treat them in this paper.

1 Introduction

By the Curry-Howard isomorphism between natural deduction and simply-typed
λ-calculus, and using Prawitz's standard translation [11] of cut-free LJ into nat-
ural deduction, we get an assignment of λ-terms to LJ proofs.
Zucker [14] and Pottinger [10] have studied the relations between normali-
sation in natural deduction and cut-elimination in LJ. They were considering
normalisation without paying special attention to the computational cost of the
substitution of a proof in place of an hypothesis. But in sequent calculus, among
the different uses of the cut rule, there is one which stands for an explicit oper-
ator of substitution, and among the elementary rules for cut-elimination, there
are rules to compute the propagation of substitution. Therefore, Zucker and
Pottinger were led to consider proofs up to the equivalence generated by these
substitution propagation computation rules.
Here, we consider a λ-calculus with an explicit operator of substitution and
with appropriate substitution propagation rules. This allows us to have a more
* This research was partly supported by ESPRIT Basic Research Action "Types for
Proofs and Programs" and by Programme de Recherche Coordonnées "Mécanisation
du raisonnement".

precise correspondence with the elementary rules for cut-elimination. However,
there are two problems. The first one is that several cut-free proofs of LJ are
associated to the same normal simply-typed λ-terms. An answer to this problem
is to consider instead a restriction of LJ, called LJT, having the same structure
and same strength as LJ but for which there is a one-to-one correspondence with
normal simply-typed terms. The second problem is that Gentzen-style sequent
calculus and λ-calculus (or natural deduction) do not have the same structure.
Consequently, the reduction rules in the one and the other calculus do not match. An
answer to this second problem is to consider an alternative syntax for λ-calculus
of which, this time, the simply-typed fragment is isomorphic to LJT.
Note that a radically different approach to the computational content of
Gentzen's sequent calculus appears in Breazu-Tannen et al. [1], Gallier [4] and
Wadler [13]. Each of them interprets the left introduction rules of sequent cal-
culus as pattern construction rules.

2 A Motivated Approach to LJT and the λ̄-calculus

2.1 The Sequent Calculus LJ

We consider a version of LJ with the implication as sole connective. The for-
mulas are defined by the grammar

  A ::= X | A → A

where X ranges over VF, an infinite set of which the elements are called
propositional variable names. In the sequel, we reserve the letters A, B, C,
... to denote formulas.
Sequents of LJ have the form Γ ⊢ A. To avoid the need of a structural rule
we define Γ as a set. To avoid confusion between multiple occurrences of the
same formula, this set is a set of named formulas. We assume the existence of an
infinite set of which the elements are called names. Then, a named formula
is just the pair of a formula and a name. Usually, we do not mention the names
of formulas (anyway, no ambiguity occurs in the sequents we consider here).
Under the condition that A, with its name, does not belong to Γ, the notation
Γ, A stands for the set-theoretic union of Γ and {A}.
To avoid the need of a weakening rule, we admit irrelevant formulas in axioms.
The rules of LJ are:

  ──────────── Ax          Γ, A, A ⊢ C
  Γ, A ⊢ A                 ──────────── Cont
                           Γ, A ⊢ C

  Γ ⊢ A   Γ, B ⊢ C           Γ, A ⊢ B
  ────────────────── IL      ─────────── IR
  Γ, A→B ⊢ C                 Γ ⊢ A→B

  Γ ⊢ A   Γ, A ⊢ B
  ────────────────── Cut
  Γ ⊢ B

2.2 The Usual Interpretation of LJ Cut-free Proofs by Normal
λ-terms
There is a standard way to interpret cut-free proofs of LJ as λ-terms, see for
instance Prawitz [11] or, for a more formal presentation, Zucker [14], Pottinger
[10] or Mints [9]. To express the interpretation, it is convenient to choose the
set of λ-variable names as the set of names. We then mention explicitly the name
x of a formula A under the form x:A. The interpretation is by induction on the
proofs and we mention the associated λ-terms on the right of the symbol ⊢.

  ──────────────────── Ax       Γ, x:A, y:A ⊢ u:C
  Γ, x:A ⊢ x:A                  ──────────────────────── Cont
                                Γ, x:A ⊢ u{y := x} : C

  Γ ⊢ u:A   Γ, y:B ⊢ v:C                         Γ, x:A ⊢ u:B
  ─────────────────────────────── IL             ──────────────────── IR
  Γ, x:A→B ⊢ v{y := (x u)} : C                   Γ ⊢ λx.u : A→B

for which v{x := u} denotes the term v in which each occurrence of x has
been replaced by u.

2.3 Towards the Calculus LJT

However, different proofs may be associated to the same λ-term. For instance:

  ──────────── Ax     ──────────────── Ax
  A, C ⊢ A            A, C, B ⊢ B
  ──────────────────────────────────── IL
  A→B, A, C ⊢ B
  ──────────────────────────────────── IR
  A→B, A ⊢ C→B

and

                      ──────────────── Ax
                      A, C, B ⊢ B
  ──────────── Ax     ──────────────── IR
  A ⊢ A               A, B ⊢ C→B
  ──────────────────────────────────── IL
  A→B, A ⊢ C→B

are both associated to the Church-like typed λ-term λx:C.(z y) : C→B for
a context in which z:A→B and y:A.
We decide to restrict LJ in order to get a bijective correspondence between
normal simply-typed λ-terms and cut-free proofs. For this purpose, we restrict
the use of the IR rule in order to forbid the second proof. The calculus we obtain
has two kinds of sequents. We call it LJT, since it appears as the intuitionistic
fragment of a calculus called LKT defined by Danos, Joinet and Schellinx [2].
A sequent of LJT has either the form Γ; ⊢ A or the form Γ; A ⊢ B. In both
cases, Γ is defined as a set of named formulas. The semi-colon delimits a place on
its right. A uniform notation for sequents of LJT is the following one: Γ; Π ⊢ B,
where Π is a notation to say that the place on the right of the semi-colon may
be either empty or filled with one (not named) formula. The idea of using these
kinds of sequents comes from Girard [5] who called "stoup" the special place
between the symbols ";" and "⊢".

The rules of LJT are

  ──────────── Ax       Γ, A; A ⊢ B
  Γ; A ⊢ A              ──────────── Cont
                        Γ, A; ⊢ B

  Γ; ⊢ A   Γ; B ⊢ C           Γ, A; ⊢ B
  ─────────────────── IL      ──────────── IR
  Γ; A→B ⊢ C                  Γ; ⊢ A→B

Remarks: 1) With these rules, the first proof above is not directly a proof in
the restriction: the axiom rule of LJ has to be encoded in the restriction by an
axiom rule followed by a contraction rule.
2) This calculus appears also in Danos et al. [2] with a slight difference in
the treatment of structural rules. Like its classical version LKT, it has been
considered by Danos et al. for its good behaviour w.r.t. embedding into linear
logic. The calculus LJT appears also as a fragment of ILU, the intuitionistic
neutral fragment of unified logic described by Girard in [6]. The calculus ILU
is itself a form of LJ constrained with a stoup, for which Girard pointed out
that "the formula [in the stoup] (if there is one) is the analogue of the familiar
head-variable for typed λ-calculi".
Recently, Mints defined in [9] a notion of normal form for cut-free proofs of
LJ which also coincides with the notion of cut-freeness in LJT.
We have also to mention the definition of a cut-free sequent calculus similar
to the cut-free LJT in the paper of Howard [12] on the interpretation of natural
deduction as a λ-calculus. Howard mentions that the proofs of this cut-free
calculus are in one-to-one correspondence with the normal simply-typed λ-terms.

The proofs of the cut-free LJT are effectively in one-to-one correspondence
with the normal simply-typed λ-terms. For instance, a normal term of the form
λx1...λxn.(y u1 ... up) of type A1 → ... → An → B, in which y is of type
D = C1 → ... → Cp → B, is unambiguously associated to a proof of the form

                                               ────────────── Ax
  Γ; ⊢ up : Cp                                 Γ; y:B ⊢ y:B
  ─────────────────────────────────────────────────────────── IL
  Γ; yp : Cp→B ⊢ (yp up) : B
          ⋮
  Γ; ⊢ u1:C1        Γ; y2 : C2→...→Cp→B ⊢ (y2 u2 ... up) : B
  ─────────────────────────────────────────────────────────── IL
  Γ; y1 : D ⊢ (y1 u1 ... up) : B
  ─────────────────────────────────────────────────────────── Cont
  Γ; ⊢ (y u1 ... up) : B
  ─────────────────────────────────────────────────────────── IR
  Γ\{xn}; ⊢ λxn.(y u1 ... up) : An→B
          ⋮
  Γ\{x2, ..., xn}; ⊢ λx2...λxn.(y u1 ... up) : A2→...→An→B
  ─────────────────────────────────────────────────────────── IR
  Γ\{x1, ..., xn}; ⊢ λx1...λxn.(y u1 ... up) : A1→...→An→B

where Γ contains y:D, x1:A1, ..., xn:An.



Remark: The construction of the applicative part of the term starts from up and
ends with u1, in contrast with the usual way of building a term (u u1 ... up) in λ-
calculus. This is why we do not have an exact correspondence with the substitution
operator when we consider the cut rule.

2.4 C u t a n d R e d u c t i o n Rules: Towards t h e A-calculus

According to the place of the cut formula (in the stoup or not), there are two
kinds of cut rules in LJT:

head-cut rule mid.cut rule

F;IIb A F;Ab B F;b A F , A ; I I b B


P;IIb B CH F;II}- B CM
for w h i c h / / m e a n s one or zero formula in the stoup.
The mid-cut rule is naturally interpreted as an operator of explicit substitu-
tion:

F;bv:A F,z:A;//~-u:B
CM
F ; / 7 I- u[z := v ] : B
A standard way to eliminate cuts is to apply rewriting rules to proofs in
order to propagate the cuts towards smaller proofs. Here is an example of such
a rewriting rule (we let C = A2--+...--+A,~B):
F,z:A;t-ua:A1 F,z:A;y:Ct-(yu2...u,):B
IL
F ; b v:A r, z:A;y:A1 -..,C ~- (y ux ... u,,):B
CM
r; y:Al-~C ~ (y ,,, ... , , ) [ , := ~]:a

reduces to

F;I-v:A F,z:A;F'ul:A1 F;I-v:A F,z:A;y:Cb(yu2 ...u,O:B


CM CM
r ; ~ ~,[~ := ~]:A1 r;v:c ~ (y ~2 ... ~,)[, := ~]:B
r;y:A,--.C l- (y m[z := v] ... ,,)[z := v]:a IL

It seems "natural" that such a rewriting rule is in correspondence with a rule


of substitution propagation. But it is not the case. Indeed it corresponds to the
reduction of (y ul ... u,)[z := v] into (y ul[z := v] ... u,)[z := v] while we would
like to get ((y ul ...un-1)[z := v] u,[z := v]).
This is because the structure of a proof in sequent calculus is different from
the structure of the associated ,x-term and this suggests to consider an alternative
formalism for the ,X-calculus in which an applicative term is no longer of the form
((u ut)...u,), but of the form (u [ul; ...; u,]), i.e. considered as the application
of a function to the list of its arguments. We call ~-calculus this alternative
formalism for ),-calculus.
2.5 Digression: How to Recover LJ ?

LJT is as strengthful as LJ since a proof of a sequent F t- A in LJ can be compo-


sitionnally translated into a proof of F; V-A in LJT. To express the translation,
it is more convenient to consider a variant of LJ with the IL rule and the Cont.
rules mixed (i.e. we assume that A--, B is already in F for the second item). We
note -.~ the translation.
a x

F, A t - A Az .,., F,A;A~-A
F, A; I- A Coat

Az
9 9 F;~-A F ; B t - B
IL
: : F; A - , B ~- B
F ~- A F, B ~ C ~ Coat
FI'-C Iz F;t-- B F, B i t - C
C~t
F;~- C

F,A'~- B F, Ai~- B
F~-A--,B IR ~.~ F;~-A--~B IR
9 9 * 9

Cut CM
F t- B F; I- B

Thus we have an interpretation of LJ into LJT. However, following this in-


terpretation, cut-free proofs in LJ may no more be cut-free in LJT.

3 The A-calculus

3.1 The A-expressions

We assume the existence of an infinite set 12 of which the elements are called
t e r m v a r i a b l e s n a m e s and here denoted by the letters z, y, z, ...
The set of A-expressions, including the X-terms (or shortly t e r m s ) and the
lists o f a r g u m e n t s are mutually defined by the following grammar for which
z ranges over P

Terms: t::= 0 I I (t l) [ (t[. := t])


A r g u m e n t lists: 1::= [3 ] It ::/] I(! @ i) [l[z := t]
We use the letters t, u, v,... to denote terms and the (possibly quoted) letter
I to denote lists of arguments9
The notation [ ] stands for the empty list of arguments and It ::/] stands for
the adjunction of the term t to the list of arguments 1, while (! @ l') stands for
the explicit concatenation of the lists l and 1' of arguments.
67

The syntax (t[z := u]) stands for an operator of explicit substitution in terms
(a " l e t x=u i a t" operator) and (l[z := u]) stands for an operator of explicit
substitution in lists of arguments.
We usuMly abbreviate an argument list [tl :: [... :: [tn :: [ ]]...]] by It1; ...;in].
Terms such as ((...(t tl) ...) tn) are abbreviated (t tl ... tn). Sometimes (z []) is
shortened into z. Also, the expressions (Az.t), (t[z := u]) and (l @ i I) may be
written respectively Az.L t[z := u] and l @ l~ when there is no ambiguity.
Subexpressions of A-expressions are defined as usual, but, in our case, by a
simultaneous recursion on terms and arguments lists.
Bound variables are defined as usual. We say that two A-expressions are
a-equal if they differ only in the names (assumed distinct the one from the
others) of bound variables. This notion of equality does not affect the structure
of expressions and, in the sequel, we consider A-expressions up to this a-equality.

3.2 N o r m a l K-expressions

A A-expression is n o r m a l if and only if it does not contain any operator of


explicit concatenation or explicit substitution and if all applicative subterms are
of the form (z I) with I normal.
Otherwise said, a A-expression, is normal if it is construed using this restricted
grammar:

t::= Z)l(Ax.O
Z::= :: tl

An approximation of normality is weak normality. A A-expression is called


w e a k l y n o r m a l if it is of the form (z l) or Az.t or [] or It ::/], where t and l
denotes respectively any term and any list of arguments.
R e m a r k : Usual A-calculus can be embedded in A-calculus, since there is, in A-
calculus, the possibility to consider terms of the form (...(z lull) ... [un]) having
a structure similar to the structure of applicative terms in A-calculus. However,
such a A-term is not normal. Indeed, its normal form is (x [ul; ...; un]).

3.3 R e d u c t i o n Rules
The presence of explicit substitution and concatenation operators entails the
presence of appropriated reduction rules:

- /~-reduction

(Ax.u [v :: z]) ..L :_- ,,] z)


(A .u []) -L Ax.u ,a,z
- concatenation of the arguments of a term
- concatenation computation rules

b, :: t] ~ l, _r_,[,, :: (t ~ r)] C.o.,


[] @ t' -~ l' C..
- propagation of substitution through weakly normal terms

(~ t)[~ := ,~] ~ (v l[x := v]) sy..


(v l)[x := v] ~ (v l[~ := v]) S,,o
(av.u)[,~ := v] _r_, ,~v.(,,[x := ,~]) s~,
warning to a possible variable capture in rule Sx
- propagation of substitution through weakly normal arguments

[][~ := v] ~ [] s..
[,, :: t][~ := v] -~ [,,[~ := v] :: I[~ := vii Soo.,

If u ~ v, then u is called a r e d e x . We note 1_. the one step reduction ob-


tained from -L by congruence. Since the system of reduction rules is left linear
(if one takes an infinite family of rules Sye,, Sno and S~,, one for each possible
combination of distinct z and y in •) and without critical pairs, according to
Huet [7], 1_. is confluent. We stay unprecised about the a-equality problem stem-
ming from the rule Sx. Solutions exist, for instance by adding an extra explicit
renaming rule to the rewriting system.
R e m a r k : The absence of critical pairs may be quite restricting. For instance,
it is not possible to simulate usual/?-reduction using these rules for the reason
that substitutions are not allowed to go through/~-redexes. However, the set of
rules is enough to reach a normal form, when this one exists.

4 C u t - e l i m i n a t i o n in the Calculus L J T

We say that two proofs are equal if they differ only by the names of formula in
the proved sequent or by addition of irrelevant formulas to the left part of the
proved sequents. We consider proofs up to this notion of equality. In particular,
if p is a proof o f / ' ; / 7 1- A, then, for any named formula t3 not i n / ' , p is a proof
of /', B; H }- A, even if it becomes necessary to change the name of another
similarly named occurrence of B throughout p.

4.1 Cut-elimination

P r o p o s i t i o n 1. (Strong and confluent cut-elimination)


There ezisis a confluent system of rewriting rules which allows to derive a cut-
free proof of F; 1"I F A from any proof of the same sequent.
69

Such a system of rewriting rules is listed hereafter. It is easy to see that it


is complete, since it exhausts all possible patterns having a cut rule as head
symbol. Its confluence comes from its left linearity (if one takes one different
rule for each different variable name) and from the absence of critical pairs, as
for the system of reduction rules of the T-calculus. As for its strong termination,
the proof is done in a next section.

Reduction of C u Rules.
- logical counterpart of ]%reduction

F,A;I- B F;~ A F;B~-C F;I- A F,A;I- B


IR IL CM
F;I-- A--.,B F ; A ~ B F- C F;b B F; B t- C
Cn LJT Cu
F;F" C ' F;~" C

F,A;t- B
IR Az
F ; F- A.--*B F ; A--* B F- A--* B
CH LJT F,A;I- B
F; F A..-. B ---~ /R
F;f- A--*B

- logical counterpart of concatenation of the arguments of a term

F,B;Bt.-A F,B;BI-A F,B;At-C


Con~ CH
F,B;I- A F,B;A I- C F,B; B ~- C
CH LJT Cont
F,B;~-C ---* s

- logical counterpart of concatenation computation rules

F;F-D F ; B t - A F;Bt-A F;AF-C


IL CH
r; D--~B F-A F;A F-C F;F- D r; B F- C
CH LJT IL
F; D-* B F"C ~ F; D--* B F- C

A~
F;At-A F;AF.C
F; A t- C CH LJT
F;AI-C

Rednc(ion of CM Rules.
- logical counterpart o f p r o p a g a t i o n o f s u b s t i t u t i o n s t h r o u g h w e a k l y n o r m a l
terms
F,A;AF-C F;I-A F , A ; A I - C
F; I- A F, A; t- C Cont F; F-A F; A [- C CM
CM LJT CH
F;F- C ---* F;F- C
70

F,A,B;B~C P, B;~- A F , A , B ; B t - C
Cont CM
F,B;I- A I",A,B;I--(7 .F,B;BI-C
]',B; I--C C#t LJ...T F, B; I~C Cont

F,A,B;I-C F,B;I" A F,A,B;I-C


F;~" A T,A;I- B..-*C I~ CM
F;~- B--.C Cu LJT F,B;b C
F;~- B--.C Iu

note that, if B already occurs with the same name somewhere in the proof of
F; I- A, then this latter name has to be changed throughout the proof.

- logical counterpart of propagation of substitution through weakly normal


list of arguments

Ax
F;I.- A F , A ; B b B
F; B l- B CM LJ...T Ax.
F;BI- B

I',A;I- B F,A;CI-. D
Iz
F;b A F , A ; B ~ C I- D
F; B---,C I- D CM
F;t-A F,A;I-- B F;~ A F,A;CI-- D
F; 1- B CM F'~C I- D Cxr
L:}T F; B--- (7 1- D I~.

5 The Assignment of LJT Proofs by ~-expressions


Proofs of LJT are isomorphic to X-expressions. We show it by first assigning
),-expressions ~o proofs of L:iT. It remains just to check that, through this as-
signment, the reduction rules for A-expressions are in exact correspondence with
the rewriting rules for proofs of LJT.
To describe the assignment, we identify the set of formula names with the
set of A-term variable names and we write the named formulas under the form
z:A. It is also cumbersome to consider arguments lists as applicative contexts:
An a p p l i c a t i v e c o n t e x t is a list of arKuments written under the form (. l)
where, is a special notational symbol. Also, we call h o l e d e c l a r a t i o n a formula
written under the form. :A.
We express the assignment by judgments.
A j u d g e m e n t is something of the form F ; / 7 I- t : A. In this writing/-/is
either nothing, in which case t is a term, or a hole declaration in which case t is
an applicative context.
71

Otherwise said, in the assignment, proofs of sequents with an empty stoup


are interpreted by terms while proofs of sequents with a non empty stoup are
interpreted by applicative contexts.

Applicative context formation Term formation

Az /,,z:A;. :Ab (. I):B


/';. :AI- (. []):A Cont
F,z:A;F" (z I):B
F ; b u : A / ' ; . : B b ( . i):C F,z:A;b u:B
/';. : A ~ B } - ( . [u::/]):C IL F;~" Az.u:A--* B IR

F;.:Cb(. i):A /';.:At-(. I'):B /';~-u:A / ' ; . : A t - ( . I):B


Cn Cn
/';. : C b ( . (l@l')):B r;~-(ul):B
F;bu:A F , z : A ; . : C b ( . I):B F;bu:A F,z:A;bv:B
/,;. : e l - (. l[z:=u]):B CM /,;bv[z:=ul:B CM
R e m a r k : The rules with an non empty stoup are polymorphic in the role of the
formula in the stoup. So, there is a strong relation between a judgement

F;. :At---*...~A,~---*BF (. [Ul;...;u,~]):B


and a judgement

F ~- [ul; ...; u,] :Aa A ... A A,


where A1A... A An is defined as VB.(A1 4 . . . -+ An -+ B) -+ B (encoding of
tuples in second order A-calculus).
A T-expression e such that we have zl :A1, ...,zn:An ~- e:A for some term
variable names zx, ..., z , and for some formulas A1, ..., A,, A is said simply-
t y p e d o f t y p e A, or shortly, t y p a b l e b y A.

6 Strong Termination

By the isomorphism, the strong termination of cut-elimination for LJT (using the
above rewriting system) and the strong termination of reduction for typable X-
expressions are equivalent. We show hereafter the strong termination for typable
X-expressions.

Proof of Strong Termination. Let e be a T-expression and _RRa notion of reduc-


tion. We say that e is strongly normalisable w.r.t. ~ in the following cases:
R
- e is not reducible w.r.t.
- for all e' such that e ~ e', we have e' strongly normalisable
72

Let e be a ~-expression. If e is typable, then it is strongly normalisable w.r.t.


the reduction _+I. To prove that, we prove something stronger, the strong E-
normalisability. This latter is preserved by the various operations of~-expressions
construction.
h
We define a notion of reduction ---+ which removes the head constructor of a
~-expre~sion. The reduction h_, is defined by the following cases:

h
~ Z . U -----} U
[u::/]•
l) L l [u :: • t
where u ranges over the set of ~-terms and I over the set of argument lists.

We note 4E, and we call E-reduction, the notion of reduction defined by


e ~E e ~ either because e ---+
1 e ~ or because e ---,
h e ~ (without considering the closure
h
of --* by congruence). We say that e is s t r o n g l y E , - n o r m a l i s a b l e (shortly SEN)
E
if it is strongly normalisable w.r.t. ---~.

L e m m a 2. lfthe A-term u and the argument list I are SEN then ~z.u, (z l) and
[u :: O are SEN.

Proof. By induction on the proof that u is SEN then by induction on the proof
that l is SEN. Let us treat of the case [u ::/]. If [u ::/] ~ e' then, either e' is u
or l, in which case, by hypothesis, e' is SEN, or e ~ is [u' :: /] with u ~ u ~ , or
[u :: l'] with l ~ l' in which cases e ~ is SEN by induction hypothesis. Therefore,
in any case, e reduces to a SEN T-expression. This implies t h a t e is itself SEN.

Lemma3. Let e and u be SEN-~-ezpressions. If, for all l SEN, the typability of
(u l) implies that (u l) is SEN, then, also the typability of e[z := u] implies that
e[z := u] is SEN.

Proof. It works by induction on the proof that e is SEN then by induction on


the proof that u is SEN.
Let us assume that e[z := u] ~ w. If the reduction touchs a redex in u then
w has the form e[z := u'] With u _+1 u'. T h e proof of SEN for u' is smaller than
the one for u, thus, by induction hypothesis, e[z := u'] is SEN. Similarly, if the
reduction is in e.
It remains the case where e[z := u] is itself a redex and where it is this redex
which is reduced. We look at the different possible forms for e.

- The case where e is (z 1') - in which case w denotes (u i'[z := u]) - is the
more delicate one. But since e h_~ 1~' the proof of SEN for l' is smaller than the
one for e. Therefore, by induction hypothesis, l~[z := u] is SEN. And since we
have assumed t h a t for all l SEN, (u l) was SEN, we infer t h a t (u i'[z := u])
is SEN.
73

- If e is (y 1) then w is (y l[z := u]). Here again, l[z := u] is SEN by induction


hypothesis. Then, by lemma 2, we get that w is SEN.
- If e is the term )~y.v, up to a change of the variable name y in )~y.v - and
this does not change the structure of the proof of SEN -, we may assume
that y and z are distinct variable names. We may then affirm that w is
)~y.(v[z := n]). Since )~y.v ~-~ v, by induction hypothesis, (v[z := hi) is SEN
and by lemma 2, w is SEN.
- If e is Iv :: /] then w denotes [v[z := u] :: l[z := u]]. But we have both
[v ::/] _,E v and Iv ::/] E I. Therefore, by induction hypothesis, we have that
v[z := u] and l[z := u] are SEN. Then, by lemma 2, we get that w is also
SEN.
- If e is [ ] then w is [ ] which is directly SEN.

Thus, whatever the form of e, the reducts of e[z := u] are all SEN. This is
enough to say that e[z := u] is SEN,

Lemma4. Let A be a formula. Let e be a )~-expression, S E N and fvpable by A.


Let l be a S E N arguments list. If the expression (e 1) (if e is a -A-term) or the
expression e @ l (ire is an arguments lisQ is typable, then it is SEN.

Proof. We proceed by induction on A, then on the proof that e is SEN, then on


the proof that I is SEN.
Let us assume that ( e l ) - 1 w ( i f e is a ~ - t e r m ) o r e @ l ~ w ( i f e i s an
arguments list).
If the reduction affects a redex in e then w has the form (e' l) or e' @ i with
1 et" Since the proof of SEN for d is smaller that the one for e, by induction
e --*
hypothesis, w is SEN. Similarly if the reduction is in 1.
It may also happen that (e I) or e @ 1 is a redex and that this redex is the
reduced one.

- The more delicate case is when e has the form Az.u while 1 has the form
[v :: l']. In this case, the type of A has the form B --* C, the A-term v is
typable by B and w denotes (u[z : - v] l'). Since B is smaller than A, by
induction hypothesis, the typability of (v !") implies that it is SEN whatever
1" SEN. It is then possible to use lemma 3 in order to infer that u[z := v]
is SEN. But this latter is typable by C which is also smaller than A. By
induction hypothesis, again, (u[z := v]l') is SEN.
- If e is (z 1') then w denotes (z (1' @ 1)). But (x l') ~ l', therefore, by
induction hypothesis, (1' @ !) is SEN. By lemma 2, w is SEN.
- If e is Az.u and I denotes [ ] then w is e which, by hypothesis, is SEN.
- If e is [ ] then w denotes 1 which is directly SEN.
- If e is [v :: 1'] then w denotes [v :: (1' @ 1)]. But [v :: 1'] ~ I', therefore, by
induction hypothesis, i' @ ! is SEN. As for v, it is also SEN by induction
hypothesis. Then, by lemma 2, w is SEN.
7~

Thus, whatever reduction of (e l) or e @ l we consider, we get a SEN l -


expression. This means that (e I) (if e is a A-term) or e @ i (if e is an arguments
list) is SEN.

P r o p o s i t i o n 5 . Typable -~-expressions are SEN.

Proof. Let e be a typable X-expression. The proof works by induction on e. The


cases Av, (p l) and (v :: l) come directly from the lemma 2. The cases (u 1) and
(1 @ I') come from the lemma 4. As for the cases v[z := u] a n d / [ z := u], they
come from the lemma 3 applied to the lemma 4.

The strong E-normalisability directly implies the strong normalisability.

C o r o l l a r y 6. Simply4yped-~-ezpressions are strongly normalizable.

R e m a r k s : 1) A similar proof has been done by Dragalin [3] for the system of
reduction rules given in the seminal paper of Gentzen on the cut-elimination
theorem for LK. The difference is that Dragalin's proof does not work by struc-
tural induction on the proof of strong E-normalisability, but rather by induction
on the length of these proofs. Our proof has been done independently, extending
a proof from Coquand that the elimination of cuts according to an outermost
strategy of reduction terminates.
Note that this kind of strong cut-elimination proof applies also to non-
confluent systems of reduction rules (it is the case of Gcntzen's system of re-
duction rules) but not to system including rules affecting the order of cuts. This
is contrast with the cut-elimination procedures that Zucker or Pottinger and
have considered.
2) An interesting result would be to prove the Strong normalisation of the
simply-typed A--calculus with the additional reduction rule (Az.t u)[y := v] r
((),z.t)[y := v] u[y := v]). As a corollary of this result, we would get the strong
normalisation of the usual simply-typed A-calculus and even the strong normal-
isation for the simply-typed A-calculus with an explicit "let _ in 2'-like substitu-
tion operator (see for instance Lescanne [8]).

Conclusion

The isomorphism known as the Curry-Howard isomorphism expresses a struc-


tural correspondence between Hilbert-like axiomatic systems and combinatory
logic and between natural deduction and A-calculus. The isomorphism between
LJT and the T-calculus can be seen as the extension of this correspondence into
the framework of sequent calculi and this shows that sequent calculus is no less
related to functional features than natural deduction.
Among the different forms of sequent calculi, the calculus LJT has clearly
a special place. Since the Modus Ponens rule of intuitionistic natural deduction
can be split into a head-cut rule and an implication left introduction rule, LJT
75

can even be seen as a strict refinement of natural deduction. Similarly the ~-


calculus can be seen as a strict refinement of the usual A-calculus, but, in order
to make more precise this embedding relation, it would be necessary to extend
the strong normalisation of the simply-typed A-calculus by considering the extra
reduction rule (~z.t u)[y := v] -L ((Az.t)[y := v] u[y := v]).

Acknowledgements
Simplifications in the proof of strong normalisation are due to Thierry Coquand.
I thank also the Paris 7 computer science logic group, Phil W~dler and Viviana
Bono for echoes on this work.

References
1. V. Brea~u Tanen, D. Kesner, L. Puel: "A typed pattern calculus", IEEE Symposium
on Logic in Computer Science, Montreal, Canada, June 1993, pp 262-274.
2. V. Danos, J-B. Joiner, H. Schellinx: "LKQ and LKT: Sequent calcufi for second
order logic based upon dual linear decompositions of classical implication", in
Proceedings of the Workshop on Linear Logic, Cornell, edited by J-Y. Girard, Y.
Lafont, L. R~gnier, 1993.
3. A. G. Dragalin: Mathematical Intuitionism: Introduction to Proof Theory, Trans-
lations of mathematical monographs, Vol 67, Providence, R.I.: American Mathe-
matical Society, 1988.
4. J. Gallier: "Constructive logics, part I: A tutorial on proof systems and typed
)t-calculi", Theoretical Computer Science, Vol 110, 1993, pp 249-339.
5. J.-Y. Girard: "A new constructive logic." classical logic", Mathematical Structures
in Computer Science, Vol 1, 1991, pp 255-296.
6. J-Y. Girard: "On the Unity of Logic", Annals of Pure and Applied Logic, Vol 59,
1993, pp 201-217.
7. G. Huet: "Confluent Reductions: Abstract Properties and Applications to Term
Rewriting Systems", Journal of the Association for Computing Machinery, Vol 27,
1980, pp 797-821.
8. Z. Benaissa, D. Briaud, P. Lescanne, J. Rouyer-Degli, "~,v, a calculus of explicit
substitutions which preserves strong normalisation", submitted to Journal of Func-
tional Programming, 1995.
9. G. Mints: "Normal forms for sequent derivations", Private communication, 1994.
10. G. Pottinger: "Normalization as a homomorphic image of cut-ellmination", Annals
of mathematical logic), Vol 12, 1977, pp 323-357.
11. D. Prawitz: Natural Deduction, a Proof-Theoretical Study, Almquist and Wiksell,
Stockholm, 1965, pp 90-91
12. W. A. Howard, "The Formulae-as-Types Notion of Constructions", in J.P. Seldin
and J.R. Hindley Eds, To H.B. Curry: Essays on Combinatory Logic, Lambda
Calculus and Formalism, Academic Press, 1980 (unpublished manuscript of 1969).
13. P. Wadler: "A Curry-Howard isomorphism for sequent calculus", Private commu-
nication, 1993.
14. J. I. Zucker: "Correspondence between cut-elimination and normalization, part I
and II', Annals of mathematical logic, Vol 7, 1974, pp 1-156.
Usability: formalising (un)definedness
in typed lambda calculus

Jan Kuper

University of Twente, Department of Computer Science


P.O.Box 217, 7500 AE Enschede, The Netherlands
e-marl: jankuper @cs.utwente.nl

A b s t r a c t . In this paper we discuss usability, and propose to take that


notion as a formalisation of (un)definedness in typed lambda calculus,
especially in calculi based on PCF. We discuss some important proper-
ties that make usability attractive as a formalisation of (un)definedness.
There is a remarkable difference between usability and solvability: in the
untyped lambda calculus the solvable terms are precisely the terms with
a head normal form, whereas in typed lambda calculus the usable terms
are "between" the terms with a normal form and the terms with a (weak)
head normal form.

1 Introduction

T h e elementary form of undefinedness arises on the level of natural numbers,


when the evaluation of a (closed) t e r m M of type N a t does not terminate,
i.e., when M does not have a normal form. Such a t e r m is also often called
meaningless. However, for higher types it is not so evident which terms should
be called meaningless. Analogous to the situation for ground types, it is often felt
to be attractive to call a t e r m M meaningless, or undefined, if M does not have
a normal form. However, one of the desirable properties of meaningless terms
is, t h a t their identification m a y not lead to inconsistency. It is well known that
in general terms without a normal form can not be identified consistently. For
the untyped l a m b d a calculus, this is shown in (Barendregt 1984, section 2.2).
For t y p e d calculi (with sufficient computing power) this is immediately seen:
for example, the evaluation of infinite lists does not terminate, but clearly they
m a y not be identified. The same holds for recursively defined functions. Hence,
at second sight, it is not natural to consider terms without a normal form as
meaningless.
As an alternative, A b r a m s k y and Ong propose to take the terms with a weak
head normal form as the meaningful ones ( A b r a m s k y 1990, Ong 1988). As an
argument in favour of this proposal A b r a m s k y and Ong mention that in lazy
functional languages no evaluation takes place inside weak head normal forms.
However, this is only true since values of function type are not acceptable as
output values. If terms of function type would be acceptable as output values,
then e.g. A x . l + l would be evaluated to Ax.2, i.e., there would be an evaluation
step inside a whnf.
77

As a third alternative for representing meaningfulness we mention the notion


of solvability, introduced in (Barendregt 1971). This notion is widely accepted
as an adequate formalisation of meaningfulness in the untyped lambda calculus.
However, it turns out that the standard definition of solvability is not adequate
for typed lambda calculi (see section 3).
In this paper we introduce a generalisation of solvability, called usability,
based on the following interpretation of meaningfulness:
a term is meaningful if it can have a contribution to the outcome of a
terminating computation.
The remaining part of this paper is organised as follows. In section 2 we define
the calculus A, in sections 3 and 4 we introduce usability and compare it to
solvability, in section 5 we compare usable terms with (head) normal forms (it
turns out that in typed lambda calculus the usable terms are not precisely
the terms with a head normal form), in section 6 we formulate the Genericity
Lemma, and in section 7 we show that all unusable terms can be identified
consistently.

2 The calculus

The calculus A is an inessential variant of PCF (Plotkin 1977), i.e., it is a simply


typed lambda calculus (i.e. ~ is the only type constructor) with two ground
types: N a t and Bool. There are constants for the natural numbers (0, !, ...)
and for the truth values (true, f a l s e ) .
There are also the following constants of function type: Succ, Pred, Zero?,
if= with the obvious interpretations. Different from PCF, A has a conditional
if= of type B o o l ~ r ~ a ~ for each type a. The types of the other constants
are obvious.
Apart from variables (x, y , . . . ) and constants (c, ] , . . . ) , the calculus A has
the following terms (given that M, N are terms): M N , )~x:o'.M, #x:o'.M. The
typing rules are standard, we only mention the rule for #-terms:
x:a F- M:a
f- (#x:a.M) :
Usually, we will leave out type information from terms, and write Ax.M, #x.M.
The reduction rules are also standard. The ;3- and #-rule are
(Ax.M)N ~ M[x:=Y],
#x.M ~ M[x:=#x.M],
where M[x:=N] denotes substitution. Some examples of the ~-rules are
Succ n --~ n + l

if~ true -~ Ka,


if~ false --~ g * ,
78

where K, K* are Axy.x and Axy.y respectively.


We often write i f L t h e n M e l s e N for i f L M N . We also use standard
abbreviations like Axy.M for )~x.Ay.M. We remark, that all 5-redexes are of the
form fc, where f, c are constants. This implies that all constants of function
type are strict in their first argument (this will be essential in definition 4).
We use standard notations such as M--~N, M = N , and A F- M = N . Clearly,
A has the same computing power as PCF, since # x . M is equivalent to Y()~x.M),
where Y is a fixpoint combinator in PCF. Hence, all partial recursive functions
are X-definable. Finally we mention that X has the Church-Rosser property, and
that the standardization theorem holds in X.
On several places below we will consider an extension of A with product
types. Then we will assume that there are terms of the form (M1, M2), constants
~rl, a'2, and reduction rules tel(M1, M2) --* Mi.

3 Solvability

In the untyped lambda calculus the notion of solvability is considered as an


adequate formalisation of meaningfulness. Some reasons for this are that the
Genericity Lemma (see section 6) holds for unsolvable terms, and that all un-
solvable terms can be consistently identified. However, a direct generalisation of
the standard definition of solvability towards X does not work, since the above
properties do not hold there. In this section we will give three different char-
acterisations of solvability in the untyped lambda calculus, and show that the
reformulation of these characterisations towards X are not equivalent. In the
next section we introduce the notion of usability and show that it is equivalent
to the weakest variant of solvability.

L e m m a l ( S o l v a b i l i t y ) . Let )~x.M be a closure of M (i.e., x consists of the


free variables of M). Then in the untyped lambda calculus the following are
equivalent (notice that (a) is the standard definition of solvability):
(a) 3N ()~x.M)N = I,
(b) VL 3N ()~x.M)N = L,
(c) there exists a normal form L such that 3N ()~x.M)N = L.

P r o o f . (a) ~ (b): ()~x.M)N = I implies ()~x.M)NL = L.


(b) ~ (c): Immediate.
(c) ~ (a): If L is in normal form, then L is in head normal form, hence L is
solvable (cf. Barendregt 1984, 8.3.14). That is, there is a sequence P such that
()~y.L)P = I, where ),y.L is a closure of L. Hence,
()~y.(~x.M)N)P = I.
Clearly, this can be brought into the form (Ay.Ax.M)Q = I (if necessary, add
I's to the right of P), i.e., M is solvable. []

Based on this lemma we define three variants of solvability in A.


79

D e f i n i t i o n 2 ( S o l v a b i l i t y in A). Let s be a closure of M. Then M is

(a) strongly solvable, if there is a type a such that

3N (Ax.M)N = Ia,

(b) medium solvable, if there is a type a such that for all terms L of type

3N ()~x.M)N = L,

(c) weakly solvable, if there is (a type a and) a term L (of type a), L in normal
form, such that

3N ( s = L. []

Clearly, in the definition of weak solvability, mentioning the type of L is super-


fluous. We give some examples of the various forms Of solvability in )~.

Example 1.

1. A variable x of type a is strongly solvable: Ax.x = I~.


2. The term M - )~x:~ t. Ay:at--*~. yx is medium solvable, as can be seen as
follows. Let L be any term of type a. Then Mx()~x.L) = L. In general M is
not strongly solvable (take a =_ N a t ) .
3. If M is a constant of ground type, then M is weakly solvable (immediate),
but not medium solvable, or strongly solvable.
4. [2 is not weakly solvable.

Clearly, the items (a), (b), (c) from definition 2 correspond to (a), (b), (c) from
lemma 1. In the untyped lambda calculus we have (a) r (b) r (c), whereas, as
can be seen from the examples above, in )~ we only have (a) ~ (b) ~ (c).

L e m m a 3. In A we have

M is strongly solvable ( ~ M is medium solvable

( ~ M is weakly solvable

P r o o f . " ( ~ " : As lemma 1 "(a) =v (b)"; " ( ~ " : Immediate. []

4 Strict contexts and usability

Based on the interpretation of meaningfulness, described in section 1, we will


introduce strict contexts, and use this notion to define usability. Informally, a
context C[-] is strict, if we can be sure that M is "used" in the (leftmost)
evaluation of C[M]. If C[M] has a normal form, we may conclude that M has a
contribution to this normal form. In such a case we will call M usable.
The following definitions formalise this intuition.
80

D e f i n i t i o n 4 ( S t r i c t C o n t e x t ) . (a) In .,k a strict context C[_] is inductively


defined as follows:

- the empty context, [_], is strict


- if C[_] is a strict context, f a constant and M a term, then

(i) f(C[_]),
(ii) (C[_])M,
(iii)
(iv) ,x.c[_].
are strict contexts.

(b) In the untyped lambda calculus the definition Of strict context is obtained
from part (a) by removing clauses (i) and (iv). []
We remark that part (a) of this definition remains unchanged if product types
are added to A. That is to say, 7r~(C[_]) is a strict context whenever C[_] is.
However, (C[_], M) and (M, C[_ D are not strict contexts.

LemmaS. Let C[_] be a strict context.


(a) In A a strict context is of one of the following five forms:

(i) [_],
(ii) f(C[_])M1... M,~, n > O,
(iii) (C[_])M1... Mn, n _> 1, C[-I not of form (ii) or (iii),
(iv) Axl...xn.C[_], n > 1, C[_] not of form Ax.C'[_],
(v)#xl...x~.C[-], n>_ 1, C[_] n o t o f f o r m ~x.C'[_].

(b) In the untyped lambda calculus a strict context is of form (i), (iii), or (iv).

P r o o f . By induction on the construction of C[_]. []

The next definition works for any calculus in which strict contexts can be defined.

Definition 6 (Usability).

- A term M is usable for computing N , notation M >> N, if there is a strict


context C[_] such that C[M] --~ N. We will sometimes call M relatively
usable,
- M is usable if M >> N for some normal form N.

We call >> the usability relation. []


The notation M >> N was introduced by Barendregt in his Ph.D.-thesis (Baren-
dregt 1971, definition 3.3.2), and pronounced as " N is in the solution of M ' .
Barendregt defined >> for combinatory logic only. It did not show up in the lit-
erature again, since it was thought that it did not work for the lambda calculus.
We mention some examples.
81

Example 2.

1. Normal forms are usable,


2. a variable x:a is usable to compute any term M:T, i.e., x >~ M for every
x, M and every ~, T. As before, there is a term )~y.O_ of type a (or t r u e
instead of O_). Since i f r Zero?((~x.[_])(~y.0)y) t h e n M e l s e M is a strict
context, it follows that x >> M.
3. x[2 is usable (see example 1),
4. [2 is not usable (see corollary 12),
5. if product types are added to A, then (0, [2) is usable (since ~rl[_] is a strict
context), but (/2, ~2) is not usable.

In the next lemma we list some simple properties of (relative) usability.

L e m m a 7.

(i) C[_] is strict =v M >> C[M],


(ii) M---~ N ~ M >> N,
(iii) M >> N , N >> L ~ M >> L,
(iv) M >> N, N is usable ~ M is usable,
(v) M >> M,
(vi) M >> f M ( f a constant of function type),
(vii) M >> MN,
(viii) M >> Ax.M,
(ix) Ax.M >> M,
(x) M > M[x:=Y]
(xi) M is usable r Ax.M is usable,
(xii) )~ ~ M = N ~ (M usable r N usable),
(xiii) M is usable ca M >> c (c a constant of ground type).

P r o o f . Most of these properties are easy to prove, and left to the reader. With
respect to (ix) we remark that [_]x is a strict context. Property (xi) can be
proved using (iv), (viii) and (ix).
For (xiii) notice that there is a normal form N such that M >> N. Hence N
does not contain a #-term. By (x) we may assume that N is closed. Now let L
be a sequence of closed terms in normal form such that N L is of ground type.
Since the #-free fragment of A is strongly normalising, it follows that N L --~ c
for some constant c. []

The following lemma makes explicit that usability is indeed a generalisation of


solvability. If product types axe added to )~, then this lemma does not hold any
more. For example, (0,/2) is usable, but not (weakly) solvable (see example 2).

L e m m a 8. (a) In )~: M is usable ~=~ M is weakly solvable,


(b) In the untyped lambda calculus: M is usable ~=~ M is solvable. []
82

P r o o f . (a) " ~ " : If M is weakly solvable, then there are sequences x, L such
that (;~x.M)L has a normal form, N say. Since ()~x.[_])L is a strict context, it
follows that M >> N. Hence, M is usable.
(a) " ~ " : Tedious. We have to prove that M is weakly solvable. It is sufficient
to prove that there is a (strict) context C[_] = (~x.[_])L such that C[M] has a
normal form.
If M is usable, then there is a strict context Co[-] such that C0[M] has a
normal form. Without loss of generality we may assume that Co [-] is constructed
without applying clause (iv) of definition 4: since Co [M] has a normal form, a
"subcontext" of the form #x.C~[_] can be replaced by ()~x.Ct[_])(#x.Ct[U]),
which is also strict (M is given).
Define a "measure function" q on strict contexts as follows:

q([_]) = 0
q(.fC[_]) = q(C[_]) + 1
q((C[_])X) = q(C[_])
q(C[_]) if C[_] is of the form Ax.C'[_]
q()~x.C[_]) = q(C[-]) + 1 otherwise
So q counts the number of applications of clause 4(i) and the number of sequences
of consecutive "leading" lambda's. We proceed by induction on q(C0[-]).
Basic case: q(C0[-]) = 0. Then Co[-] = [ - ] M 1 . . . M n , n > 0, i.e., Co[-] is of the
required form already.
Induction case: q(C0[-]) > 0. Notice that Co[-] ~ [-] (since q([_]) = 0). Hence,
by lemma 5, Co[-] is of one of the following three forms:

1. Co[-] ~- )~xl . . . x n . C l [ - ] , n _> 1, C1[-] not of the form )w.C~[-]. Since C0[M]
has a normal form, CI[M] has a normal form too. Since

q(Cl[-]) <(q(C0[-]),
the result follows by the induction hypothesis.
2. Co[-] - f(Cl[-])L1." "Ln, n > O. Since C0[M] has a normal form, it follows
by that C1 [M] has a normal form too. Since

q(Cl[-l) < q(Co[_]),


the result follows by the induction hypothesis.
3. Co[-] - ( C I [ - ] ) L I ' - " L n , n > 1, C1[-] not of form (ii) or (iii) of lemma 5.
Hence, C1 [-] is of one of the following two forms:
(a) C1[-] -= [-]. Then Co[-] is of the required form.
(b) C1[-] -= Axl ...x,~.C2[-], m > 1, C2[-] not of form Ax.C~[_]. Hence, by
lemma 5, there are three possible forms for C2[-], given below as i, ii, iii.
Recapitulating,
C0[M] - (Ax.C2[MI)L
has a normal form. Without loss of generality, we may assume that the
length of L is not smaller than the length of x. This can be seen easily:
83

since C0 [M] has a normal form, we may add terms in normal form to
the right of L, and the result will have a normal form again.
i. C~ [_] = [_]. Then Co [-] is of the required form.
ii. C2[-] ~ f(C3[-])N1 " "Nk, k > 0. Then
Co[M] - (Ax..f(C3[M])N)L
Since the length of L is not smaller than the length of x, it is easily
seen that there are P, Q such that
Co [M] = ]((Ax.C3 [M])P)Q:
Since Co[M] has a normal form, it follows that (Ax.C3[M])P has a
normal form. Since
q((Ax.C3[_])P) < q(Co[-])
the result follows by the induction hypothesis.
iii. C2[-] -~ (C3[-])N1 .. "Nk, k > 1. Then
Co[M] -= (Ax.(C3[M])N)L.
As before, there is a sequence Q such that
Co [M] = (Ax.C3 [MI)Q.
Clearly, C3[-] is not of form (ii) or (iii) from lemma 5. So two cases
remain:
A. C3[-] =-[-]. Then (Ax.C3[_])Q is of the required form.
B. Ca[-]- AziC4[_]. Then
q((Ax.Az.C4[_])Q) < q(Co[-])
and the result follows by the induction hypothesis.

This completes the proof of (a).

(b) The proof of (b) is analogous. []

5 Syntactic characterization of usability

In the untyped lambda calculus the solvable terms are precisely the terms with
a head normal form. However, in A the usable terms can not be characterized in
this way. Consider the following terms.

M 1 - if Zero?(Pred x) then 0_ else [2,


M2 ---if Zero?(Succ x) then 0_else /2.

Clearly (Am.M1)! --~ 0, so MI is usable. On the other hand, M2 is not usable


since there is no constant n_ for which Z e r o ? ( S u c c _n) ~ t r u e . However, both
are in head normal form as defined below.

Definition 9 ( H e a d n o r m a l form). Let H stand for head normal form (hnf).


Then
B::=x ! BM I fB
H ::= B I Ax.H I c []
84

Notice that the restriction of this definition to the untyped lambda calculus
yields the standard definition of hnf. For a comparison with other definitions of
hnf's in typed lambda calculi, cf. (Kuper 1994, chapter 6).
For the proof of the next lemma, see (Kuper 1994, lemma 6.2.6). Compare
also (Barendregt 1984, section 8.3).
L e m m a 10.
(a) Ax.M has a hnf r M has a hnf
(b) M i x := N] has a hnf ~ M has a hnf
(c) M N has a hnf =~ M has a hnf []
Now we come to the main proposition of this section. Notice that from the
examples above it follows that the converse arrows do not hold.
P r o p o s i t i o n 11.

M has a normal form (~ M is usable

(~ M has a head normal form.

P r o o f . " ( ~ " : If M -~ N, N in normal form, then M >> N, i.e., M is usable.


" ( ~ " : If M is usable, then by lemma 8 M is weakly solvable, i.e., there are x, N
such that (Ax.M)N has a normal form. By induction on the structure of terms
it is easy to see that a normal form is also a head normal form, so (Ax.M)N has
a head normal form (by lemma 10). []

If product types are added to A, then it depends on the precise definition of head
normal form, whether " ( ~ " of this proposition will still hold. For example,
(0,/2) is usable. However, this term is usually not considered as a head normal
form, but as a weak head normal form.
We mention two corollaries of proposition 11.
C o r o l l a r y 12. /2 is not usable.

P r o o f . /2 does not have a head normal form. []


C o r o l l a r y 13. Let M be a closed term of ground type. Then M is usable iff M
has a normal form.

P r o o f . A closed term M of ground type is in normal form iff M is in head


normal form, The proof is completed by proposition 11. []

6 Genericity
In section 1 we called a term meaningful if it can have a contribution to a termi-
nating computation. This conception of meaningfulness motivated the notion of
usability. Now we make this conception of meaningfulness precise in a different
way:
85

a term M is meaningful if there is a context C[_] such that (1) C[M]


has a normal form N, and (2) there is a term M ' such that C[M'] does
not evaluate to N.
The main result of this section is that both formalisations of meaningfulness
are equivalent. An important lemma in proving this equivalence is the Generic-
ity Lemma (lemma 30). This lemma is proved by generalising a technique from
(Barendregt 1971). This proof differs strongly from the standard proof of the
Genericity Lemma for the untyped lambda calculus cf. (Barendregt 1984,.propo-
sition 14.3.24), where it is proved by a topological method.

D e f i n i t i o n 14 ( G e n e r i c ) . A term M is generic, if for all contexts C[_] we have:

C[M] has an nf ~ VX C[X] has the same nf. []

We remark that the generic terms are the operationally least defined terms in
the sense of (Plotkin 1977, Berry et al, 1985): a term M is operationally less
defined than N, if
C[M] has a normal form ~ C[N] has the same normal form.
Now we come to the main result of this section.

T h e o r e m 15. M is generic r M is not usable.

P r o o f . " ~ " : By a corollary of the Genericity Lemma (corollary 31).


" ~ " : By contraposition. Suppose M is usable. Then there is a strict context
C[_] such that C[M] has a normal form. Since ~2 is not usable, C[T2] does not
have a normal form. Hence, M is not generic. []

In the remaining part of this section we prove the Genericity Lemma and some
of its variants (see lemma 30 and its corollaries). In order to do so, we need an
extension A of A. Informally, the terms of_A are the terms of A in which subterms
can be underlined, but no subterm is underlined more than once.

D e f i n i t i o n 16 ( T e r m s in _A).

- If A is a A-term, then A and A are A-terms,


- If A, B are A--terms, then AB, Ax.A and tzx.A are A__-terms.

Terms without underlinings (i.e., A-terms) are called line free. []


The following operation removes underlinings from A--terms.

D e f i n i t i o n 17 ( R e m o v a l o f u n d e r l i n i n g s ) .
- mAt - A if A is line free,
- ]A t -- A ,
- lAB I -[ALIBI,
- .iAi,
- i, .AI -- , .iAI. []
36

D e f i n i t i o n 18 ( S u b s t i t u t i o n in _A). In addition to the properties of substitu-


tion in A, we have
A[x := B] ~ A[x := JBI]. []

D e f i n i t i o n 19 ( R e d u c t i o n in _A_).
(i) The/3-, It- and b-rules are identical to the corresponding rules in X, e.g.,
( A x . M ) N --~ M[x:=N], where M, N are X__-terms,
(ii) If A ~ B in X, then A ~ B B_ in A__,
(iii) There are four underlining rules:
A B ~ A[B[,
IA- IA,
Ax.A --* Ax.A,
~ x . A ~ ~x.A. []

Notation. One-step reduction in IX is denoted by -=-+; as expected, -~ is the


reflexive and transitive closure of ~ .

The underlining rules of _A_correspond to strict contexts as follows.


Lemma20. Let C[_] be a line free context. Then
C[_] is strict ~ VX(C[X_j -=-~C[X]).

P r o o f i " ~ " : Immediate.


" ~ " : Clearly, C[X] _-=-~C[X] by underlining rules only. The result follows by
contraposition. []
L e m m a 21. For A_-terms A, B
A - ~ B ~ ]A]--~ IB[.

P r o o f . By induction on the length of the reduction A _-~ B. []

Notation. If M is a (proper) subterm of N, we will write M C N (M C N).


Lemma22. If A_-yzB and B' C B , then there exists an A' C A such that
A' >> B'.

P r o o f . By induction on the length of the reduction A -~ B.


Basic case: We have to check all one step reductions by a case analysis. The un-
derlining rules (cf. definition 19) are easy: they follow immediately from lemma 7.
The/3-, #-, ~i-rules are tedious, but straightforward. For the details we refer the
reader to (Kuper 1994, section 7.3).
Induction case: By the transitivity of >>. []
L e m m a 2 3 . Let A --~ B in X, and let A' be such that [A'[ - A. Then there
exists a term B', IBtl - B , such that A' -=~B r.
87

P r o o f . By induction on the length of the reduction A --~ B. It is easy to see that


all one step reductions can be copied to _~, if necessary after some intermediate
applications of the underlining rules. The (straightforward, but tedious) check
of all possibilities is left to the reader. []
D e f i n i t i o n 24. We write A ~ B, if B can be obtained from A by replacing zero
or more underlined subterms of A by other underlined subterms. []
For example, for all line free terms M, N we have M ~ N, and M L ~ N L .
L e m m a 25. The relation ~_ is an equivalence relation.

P r o o f . Straightforward. []
L e m m a 26.
- if M , N are A-terms, then M - ~ N ~=~M = N ,
-M~_N,
- M N ~ L iff there are M r, N ~ such that L - M t N ~, and M~_M ~, N~_N ~,
- A x . M ~ L if] there is an M ~ such that L =- A x . M ~, and M ~ M t,
- # x . M ~ L if] there is an M ~ such that L -- # x . M r, and M~_M ~.

P r o o f . Straightforward. []
Lemma27. If M - ~ M t and N - ~ N ~, then
M [ x := N] ~ M'[x := N'].

P r o o f . By induction on the structure of M. []


Lemma28. If M - ~ N and Mr@M, then there is an N ~ _ N such that M ' - ~ N ~.

P r o o f . By induction on the length of the reduction M ~ N.


Basic case. Let X be the chosen redex in M. Notice that this implies that X is
not of the form X__~.
._~ There are two possibilities:
1. There is a P C M such that X c_ P. Then clearly M~_N. Take N' -= M ~,
then N ~ _ N , and M ~- ~ N ~.
2. There is no P C M such that X C_ P.
Suppose X _-,Y, and let M =_ C[X]. Then, N =_ C[Y]. Clearly there exist
X',C'[_] with X ' - ~ X and C'[_]~C[_], such that M ' = C'[X'].
We have to construct Y ~ Y such that X r ~Y~, by considering all possible
reduction rules by which X - , Y (details are left to the reader).
Induction case. By transitivity of ---* . []
C o r o l l a r y 29. Let N be line free, and suppose M - ~ N . Then for all M ~ with
M~@M, we have M ~ _-~N. []
Now we come to the Genericity Lemma.
L e m m a 30 ( G e n e r i c i t y L e m m a ) . Let M, N be )~-terms, M not usable, N a
normal .form. If A F F M = N , then .for all X
AFFX=N.
88

P r o o f . Since N is a normal form, it follows by the Church-Rosser property that


F M - + N. Hence F M - - * N ' for some N' with I N ' I - N (lemma 23).
Suppose L C N', then by lemma 22: M >> L. Since N is in normal form, L
is in normal form too, and so M is usable, which is a contradiction. Therefore,
N' does not contain underlined subterms, i.e., N' - N. Hence, by corollary 29,
FX-=+N for every term X. By lemma 21 it follows that F X ~ N. []
C o r o l l a r y 31. Let M, N be A-terms, M not usable, N a normal form. If A
C[M]=N, then/or all X
A k- C[XI=N.

P r o o f . Suppose x is a sequence of variables containing all variables that are


free in M or X. Let y be a fresh variable. Then
(Ay.C[yx])(Ax.M) = C[(Ax.M)x]
= C[M]
=Y.
M is not usable, hence Ax.M not usable (lemma 7). Hence, by the Genericity
Lemma (lemma 30):
c[x] =

=N. []

7 Identification of unusable terms

In this section we prove that it is consistent to identify in A all unusable terms


(respecting their type, of course). Intuitively this means that all meaningless
terms may be identified. We also prove that this identification is maximal in
the sense that identifying a usable term to an unusable term (in addition to the
identification of all unusable terms) is inconsistent.

Notation. The set of all equations P--Q for which P, Q have the same type and
P, Q are unusable, is denoted by ,9.
T h e o r e m 32. A + S is consistent.

P r o o f . By contraposition. Suppose A + S is inconsistent. We show that this


implies that A is inconsistent.
If A + S is inconsistent, then
A + S }- true=false.

Suppose that in a proof of this there are n applications of equations from 8.


T h e n this proof can be presented as follows:

true . . . . . CI[PI] -- CI[QI] . . . . . Cn[Pn] = Cn[Qn] ..... false,


89

where P~ = Q~, / = l , . . . , n , are equations from S. So, the displayed equalities


C~[P~] = C~[Q~] are proved by equations from S, all other equalities are proved
by the axioms of A.
Now proceed by induction on n. If n=0, it follows that none of the equations
from S is used, i.e., A L- t r u e = f a l s e , and we are done.
Let n>0. By the proof above we have

)~ i- t r u e = Cl[P1].

Since P1 is unusable, it follows by the Genericity Lemma (corollary 31) that

A 5 true ----Of[Q1].

Since

CI[Q1] = c2[P ],
it follows that

A P t r u e = C2[P2],

i.e., t r u e = f a l s e is proved by n - 1 applications of equations from ,9. The result


follows by the induction hypothesis. []

We prove the maximality of the set S in the sense as described above.

T h e o r e m 33. Let M be a usable term, P an unusable term, M and P have the


same type. Then A + S + M = P is inconsistent.

P r o o f . Consider the term

[~ -- ~x. if Zero? x then I else 0_,

which is of type N a t , and notice that A+U=c is inconsistent for every constant c
of type N a t (we remark that c is restricted to the given constants of A. Clearly, it
would be possible to introduce, for example, a constant J_, with rule ~2~J_, but
not • i.e., _L is a normal form. Then t~ =- _l_does not lead to inconsistencies).
Clearly, for type B o o l there is also such a term, which we will denote by tl too.
Hence, for every type a there is a term

U~ = Ax.U

of type a. By lemma 7 it follows that ~3~ is not usable.


Suppose M, P are of type a, then P=~3~ 6 S, and so

A + 8 + M = P F- M--U~.

Since M is usable, it follows by lemma 8(a) that there are sequences y, N such
that ( A y . M ) N has a normal form. Without loss of generality we may assume
that this term is closed. It follows, that there is a sequence L and a constant c
of ground type such that ()~y.M)NL = c.
90

Hence, in the theory A +,.q + M : P we can derive the following inconsistency:

c = ()w.M)NL
---- (Ay.U~)NL
= (Ay.~x.U)NL
_~.

The final equality follows by reasoning on the types of the subterms. O

Acknowledgements
I thank Henk Barendregt and Maarten Fokkinga for the interesting and valuable
discussions. I also thank one of the anonymous referees for his or her detailed
corrections.

References
Abramsky, S. (1990), The Lazy Lambda Calculus, in: Turner, D.A. (Editor), Re-
search Topics in Functional Programming Languages, Addison-Wesley, Reading,
Massachusetts.
Barendregt, tt.P. (1971), Some extensional term models for combinatory logics and
lambda calculi, Ph.D. Thesis, Utrecht.
Barendregt, H.P. (1975), Solvability in lambda calculi, Colloque International de
Logique, Clermont Ferrand, 209 - 219.
Barendregt, H.P. (1984), The Lambda Calculus - Its Syntax and Semantics (revised
edition), North-Holland, Amsterdam.
Berry, G., P.-L. Curien, J.-J. L@vy (1985), Full abstraction for sequential languages:
state of the art, in: Nivat, M. and J.C. Reynolds (Editors), Algebraic Methods in
Semantics, Cambridge University Press, Cambridge, 89 - 132.
Kuper, J. (1994), Partiality in Logic and Computation - Aspects of Undefinedness,
Ph.D.Thesis, Enschede.
Ong, C.-H.L. (1988), The Lazy Lambda Calculus: an Investigation into the Foundations
o/Functional Programming, Ph.D. Thesis, Imperial College, London.
Plotkin, G.D. (1977),LCF considered as a programming language, Theoretical Computer
Science 5, 223 - 255.
Lambda Representation of Operations
Between Different Term Algebras 1

Marek Zaionc
Instytut Informatyki, Uniwersytet Jagiellofiski,
Nawojki 11, 30-072 Krakow, Poland 2
email zaionc@ii.uj, edu.pl

A b s t r a c t . There is a natural isomorphism identifying second order types of the sim-


ple typed ), calculus with free homogeneous term algebras. Let T A and r B be types
representing algebras A and B respectively. Any closed term of the type r A "-+ v B rep-
resents a computable function between algebras A and B. The problem investigated
in the paper is to find and characterize the set of all A definable functions between
structures A and B. The problem is presented in a more general setting. If alge-
bras A1, ...,An, B are represented respectively by second order types r A z , . . . , r A n , r B
then r A1 --* ( . . . ( r A~ --* T B ) . . . ) is a type of functions from the product A1 x ... • An
into algebra B . Any closed term of this type is a representation of algorithm which
transforms the tuple of terms of types r A1, ..., v A~" respectively into a term of type r B,
which represents an object in algebra B (see [BSB85]). The problem investigated in
the paper is to find an effective computational characteristic of the A definable func-
tions between arbitrary free algebras and the expressiveness of such transformations.
As an example we will consider A definability between well known free structures such
as: numbers, words and trees. ~Fhe result obtained in the paper is an extension of the
results concerning ), definability in various free structures described in [Sch75] [Sta79]
[Lei89] [Zai87] [Zai90] and [Zai91]

Introduction
As a contribution to the ongoing research on computing over general algebraic
structures, we consider recurrence over free algebras (compare [BOB85], [Lei89],
[Lei90], [Zai89]). As a model for computing a simple typed lambda calculus is
employed. The lambda calculus introduced by Church is a calculus of expres-
sions, which naturally describes the notion of computable function. Function-
als are considered dynamically as rules rather than set theoretic graphs. The
lambda calculus mimics the procedure of computation of the program by the
process called beta reduction. There is a natural way of expressing objects such
as numbers, words, trees and other syntactic entities in the lambda calculus.
All those objects are of a considerable value for computer scientists. Dynamic
operations on objects of this kind can be described by terms of lambda calculus.
Therefore lambda terms may be considered as algorithms or programs working
1This research was supported by KBN Grant 0384/P4/93
2This paper was partially prepared while author was visiting Computer Science Department
at State University of New York at Buffalo, USA
92

on those syntactic objects and producing as a result a new object not necessar-
ily of the same type. It is well known result by Church and Kleene, relating
all partial computable numerical functions with lambda terms. Of course, the
notion of partial recursive function can be naturally extended to other structures
such as words, trees etc. It is natural that the Church-Kleene theorem might be
extended and holds for these structures.
The typed version of lambda calculus is obtained by imposing simple types on
the terms of the lambda calculus. The problem of representing structures is ba-
sically the same in the typed lambda calculus, however, the rigid type structure
imposed on the syntax of lambda calculus dramatically reduces expressiveness
of functions on these structures. Interestingly enough, the solution for repre-
sentability problems varies for different structures.
The first result concerning representability in the typed ,~ calculus have been
proved by Schwichtenberg in 1975 and independently by Statman (see [Sch75],
[StaT9]). Schwichtenberg studied numerical functions represented in the typed
lambda calculus and the following characteristic was proved: lambda definable
functions are exactly those generated by composition from the constants 0 and
1 and operations of addition, multiplication and conditional (extended poly-
nomials). The similar result for word operations was obtained by Zaionc in
[Zai8T]. The word functions represented in typed lambda calculus are exactly
those generated by composition from constant A (empty word) and operations
append, substitution and cut. The results of Schwichtenberg and Zaionc were
extended to the structure of binary trees [Zai90]. It was shown that t definable
tree operations are those obtained from initial functions by composition and a
limited version of primitive recursion. Leivant [Lei89] showed that recursion is
essential and can not be removed from this characteristic. A similar result was
obtained for ,~ definable operations on arbitrary homogeneous free algebra.
In this paper we examine the situation when the input and output algebras are
generally different. The proof of the main result is obtained by inductive de-
composition of closed term which represents a function between two different
algebras. While the decomposed terms are generally simpler according to some
measure of complexity, they represent operations between definitely different al-
gebras from algebras we started with. Therefore, the problem must be presented
in a more general setting in which we consider functions from product of several
not necessarily same algebras.

1. Free Algebras
Algebra A given by a signature SA = [o~1, ..., OLn] has n constructors al, ..., an
of arities a l , . . . a n respectively. Expressions of the algebra A are defined by
induction as the minimal set such that if ai = 0 then ai is an expression and
if ai > 0 and tl, ...,ta~ are expressions then ai(tl, ...,toni) is an expression. We
may assume that at least one ai is equal 0, otherwise the set of expressions is
empty. By A we mean the set of all expressions in algebra given by signature
SA = [c~1,..., an]. For simplicity we are going to write A = [al, ..., a~] to say
that A is an algebra given by the signature [o~1,..., eta]. If A1 .... , An are algebras
93

then by A1 • ... • An we m e a n the p r o d u c t of sets of expressions. A bar over


the n a m e (for e x a m p l e A) indicates t h a t A is the p r o d u c t of sets of expressions
of some algebras. A n is the p r o d u c t A • ... • A.

D e f i n i t i o n 1.1 If A is an algebra given by signature SA = [hi, ..., a , ] and n


is a n o n n e g a t i v e integer then A +~ is an algebra given by signature SA+~ =
[al, ..., a s , 0, ...0] with exactly n O's added.

We are going to investigate the set of functions between a r b i t r a r y t e r m algebras

D e f i n i t i o n s 1.2 Function f in A p --~ A given by f ( x l , . . . , x p ) = xi is called


p r o j e c t i o n . Any constructor a~ in algebra A can be seen as a function a~ :
A ~i ~ A including 0 - ary function (hi = 0) considered as element of A.

D e f i n i t i o n 1.3 By the set of n o n t e r m i n a l trees in A we m e a n the m i n i m a l


set of m a p p i n g s A p ---+ A for p > 0 closed for composition and containing all
constructors in A and all projections.

Definition 1.4 Let f : A • B --~ C and b E B. By f/-b we m e a n the function


( f /b) : A ~ C defined by ( f /b)(~) = f(~,-6).

D e f i n i t i o n 1.5 Let A = In1, ..., an]. Let al, ..., an are all constructors in algebra
A such t h a t arity of a~ is hi. Function h : A • B ~ C is defined by recursion
f r o m functions fl : C ~1 • B ~ C , . . . , f ~ : C ~ • B ~ C if for all i <_ n the
following equations hold:

h(ai (tl, ...,-ta,),-b) = fi (h(tl,-b),..., h(ta,, b), b)

for all expressions t l , ..., t ~ of the algebra A

Definition 1.6 T h e set of A functions is defined inductively by:

1. for all algebras A , B , C any projection in A • B • C ~ B given by


f ( x , y, z) = y is a ~ function

2. any constant function from A to B given by f ( ~ ) = t for some expression


t E B is a ~ function.

3. for any algebra A any constructor a~ : A ~ --~ A is a ~ function

4. any composition of A functions is a ~ function, i.e. if f l : A1 • ... • AN


B1, ..., fk : A1 • ... • An ~ Bk are A functions and g : B1 • ... z Bk --+ C is
a/k function then h : ml z ... • An -~ C given by h(~) = g ( f l ( ~ ) , ..., h ( ~ ) )
is a A function.

5. let algebra A be given by a signature [hi, ..., an] and B be a p r o d u c t of


algebras. If fl : C a~ x B --+ C, ..., fn : C a~ • B --~ C are A functions such
t h a t for every expression b E B functions (fl/-b) : C ~ ~ C, ..., (__f~/-b) :
C ~" --+ C are n o n t e r m i n a l trees in C then the function h : A • B ~ C
defined by recursion (see definition 1.5) f r o m f l , ..., f~ is a/k function.
94

2. E x t e n d e d T y p e d A Calculus.
Our language is derived from Church's [Chu40] simple theory of types. Every
t e r m possesses a unique type which indicates its position in a functional hierar-
chy. Let T Y P E be a set of types which are defined as follows: 0 is a type and
if r a n d / i are types then 7 --*/i is a type. For any type 7 we define numbers
rank(r) and a r g ( r ) as follows: a r g ( O ) = r a n k ( 0 ) = 0 and arg(7 --+/i) = l+arg(/i)
and rank(7 --+/i) = max(rank(v) + 1, rank(/i) ). Associated with each type r
is a denumerable set of variables V(7). Any variable of type 7 is a t e r m of type
r. If T is a t e r m of type 7 - + / i and S is a t e r m of type 7 then T S is a t e r m
of type /i 9 I f T i s a t e r m of t y p e / i and x is a variable of type 7 t h e n Ax.T
is a t e r m of type 7--* /i. I f T i s a term of type 7 we write T E 7. We shall
use the notation Axl...xn.T for term ,~xl.(Ax2.(...(,~x~.T)...)) and TS1...5;~ for
(...(TS1)...Sn). If T is a term and x is a variable of the same type as a term S,
then T[x/S] denotes the substitution of the term S for each free occurrence of
x in T. The axioms of equality between terms have the form of a/?r/ conversions
and the convertible terms are written as T =Z~ S. By the Cl(7) we mean the
set of all closed terms (without free variables) of type 7. Term T is in the long
normal form if T = )~Xl...x,.yT1...Tk where y is an xi for some i < n or y is a
free variable, Tj for j _< k are in the long normal form and yT1...Tk is a term of
type 0 . Long normal forms exist and are unique for/~r 1 conversions.

In order to easily represent and manipulate on finite strings of terms of po-


tentially different types we extend our language by adding new constructor to
form a Cartesian product of types and terms. By T Y P E * we mean the min-
imal set containing T Y P E and closed for the Cartesian product formation: if
rl, ..., 7, E T Y P E * then (rl, ..., r , ) E T Y P E * . The e m p t y tuple is denoted by
w E T Y P E * . We use the abbreviation • for string (rl, ..., 7,~). We assume
that string of e m p t y types is empty, so • = w. In the case when all 71
are the same we use notation 7" instead of xi~=17. We also extend the type
formation --* for string types as follows: x~=17i ---,/i means 7"1 --+ (72 ---* ...(7n --~
/i)...)) including w /i = /i and 7 - - + • '~ " means x~=1(7 --~/ij) including
r ~ a~ = w. Particularly we have (x~=lri) ~ ( x jm= l / i j ) = X~=l(X]=lri ~ / i j )
and r ~ --~/i = / i - Since types are usually employed for describing construction
of function spaces the above definitions are in fact typical identifications. The
first equation identifies A BxC with A Be which usually is called "Currying". The
second equates (A x B) C with A c x B C. We will call elements of T Y P E * types
and elements of T Y P E simple types. It is easy to observe that every simple type
7 admits the unique form xi=17i- --~ O where rl are again simple types called
components of 7. We can prove several equation between types, for example

(7 -'~/i) n = v ---+/in 2.1


( r~ --+/i)" = v~ --+ Iin the special case of 2.1 2.2
for simple type 7" of arity k,
r" = (• -~ O)" = • -~ O" 2.3
95

for simple type r of arity k, where ri are components of r


Xj=l v , x j = l ( X i = l V i __.+ O ) a j ~ Xj=ln ( X i k= l T i ----+
: X/k_i T/ --~ X jn= l O c~j 2.4

Language over tuple types is build in the similar way. If M is a finite string of
terms (M1, ..., Ms) of types rl , ..., v~ respectively then M is denoted by x ~i=lMi.
of the type x~=17i. We use the same notation M E r to say that Mi is a t e r m
of type ri for i < n. This definition m a y by iterated so we m a y consider strings
of strings of terms and so on. The e m p t y tuple of terms is denoted by ft of type
w. In the case when all Mi are the same we use M n instead of xr~=lM with
f~n = ft. T e r m formation is following: If M E X~=zri --* # and N E X~=lri
then by M N we mean MN1...N,~ of type # with M f t = M when X~=lri = w
(n=0). If M E r ~ xj=l,u m ~9 and N E r then by M N we mean (M1N, ..., M,~N)
or equivalently x~=IMiN of type xr~=lpj with f t N = f t in case when m = 0 . If
M E # and x is a variable of type x~=17-i (which means that x has a form of
tuple t e r m (xl, ..., x~) ) then by Ax.M we m e a n Ax~...x,~.M of type x~=l~-i ~ p
with Ax~o.M = M for variable x of type w. If M E X~=l# j and x is a variable
of type r then by Az.M we mean (Ax.M1, ...,Ax.M,~) which is identical with
xr~=lAx.Mi of type 7 --~ xj=~#3
_ m . with Ax.ft = ~ when m = 0 . If M is a t e r m of
type x~=~(vi ~ #i) and N is a t e r m of type xr~=lri then by parallel application
M<>N we m e a n the t e r m xi= n 1MiNi o f t y p e Xi=z#,.
n . I f x = (xl...x,~) is a variable
X rt
of type X~=lri and M E i=zg* then b y parallel abstraction Ax 9 M we m e a n
the t e r m (Axl.M1, ...,Ax,..Mn) of type x'~=~(ri ~ Pi). We m a y summarize all
those definitions by:

M(x'~=lNi ) = MN1...Nn with M f t = M 2.5


(x]=tMi)N = xr~=l(MiN) with f t g = ft 2.6
A(x'~=lxi).M = )~xl...x,~.M with Ax~o.M = M 2.7
Ax.(x~=lMi ) = x~=l(Ax.Mi ) with Ax.ft = ft 2.8
(x~=lMi) (> (x~=lNi) = x~=l(MiNi ) 2.9
A(x~=lxi ) 9 (x~=IM/) = XLl(aXi.Mi ) 2.10

By simple t e r m we mean a term of a simple type. If M is a simple term and x is a


variable of type x~=iri and N E x ~ i r i then M ( x / N ) denotes the t e r m obtain
by simultaneous substitution of the terms Ni for each free occurrenceg of xi
respectively in M. If M E x ik= i # i then M ( x / N ) means (M1 (x/IV), ..., Mk(x/N))
and if n = k then M ( x / / N ) m e a n s (Mi(xl/N1), ..., Mk(x~/Nn)). We define fl~
conversions and the notion of long normal form by recursion with respect to the
complexity of tuple types. Two terms M and N of the same type x~=i#i are
equal modulo conversions M =#7 N if and only if Mi =#~ Ni for all i < n.
We say t h a t term M of type x in= i # , . is in long-normal form if every Mi is in
long-normal form and M is called closed if every Mi is closed. We can prove the
following extensions of conversions
96

A x . M =#~ A y . M ( x / y ) 2.11 sequential a conversion


Ax 9 M =#,7 Ay 9 M ( x / / y ) 2.12 parallel a conversion
( A x . M ) N =#, M ( x / N ) 2.13 (sequential fl conversion)
(Ax 9 M) <>N =#~ M ( x / / N ) 2.14 parallel/3 conversion
A x . ( M x ) =#,7 M 2.15 sequential ~l conversion
Ax.(Mox)=#, M 2.16 parallel q conversion

Having proved the existence of the long-normal form in the ordinary typed A
calculus we can show the same for tuple terms by induction. If M is a term of
type Xi=z#,
'~ 9 then by Mi we mean i-th coordinate of M, therefore we have

(MN)i = MiN 2.17


(Ax.M)i = Ax.Mi 2.18
( M <>N ) i = M i N i 2.19
(Ax 9 M ) i = A x i . M i 2.20
= 2 21
fli = ~ 2.22

3. Representability
If A is an algebra given by a signature SA = [al, ..., c~a] then by r A we mean
a t y p e (O al -+ O) --+ ...--+ (O ~ -+ O) --+ O. By r/a for i_< a we m e a n i - t h
component of type r A i.e. r/A = O ~ -+ O. We will see that closed terms of this
type reflect constructions in algebra A. Assuming that at least one o~i is 0 we
have that r A is not empty, r A is the simple type for any algebra A. There is
a natural 1-1 isomorphism between expressions of algebra A and closed terms
of type r A. Let Cl, ..., ca are all constructors in term algebra, of arity Ctl, ..., o~a
respectively. If ai is an 0-ary constructor in A then the closed term ~Xl...Xn.Xi
represents ai. If ai > 0 and tl, ..., t ~ are expressions in A represented by closed
terms T1, ..., T ~ of type r A, then an expression ai (tl, ..., t ~ ) is represented by the
term A x l . . . x , . x i ( T l X l . . . x n ) . . . ( T ~ x l . . . x n ) . Thus, we have a 1-1 correspondence
between closed terms of type r a and expressions of algebra A. The unique (up to
r term of type r A which represents an expression t in algebra A is
denoted by t. Let A1, ..., A,~ and B be algebras. A function h : A1 x ... x AN --+ B
is represented by a closed term H of type r al --+ ... --+ r Am --+ r B if for all
expressions tl E A1, ..., tn E An, the following terms are/3r / convertible

H t l . . . t ~ =#~ h ( t z , . . . , t~).

Let B be a product of algebras B1 x ... • Bk. We define a r B to be the type


x ki=lr B~ . By analogy, there is a natural isomorphism between terms of type r ~-
and the product of expressions B, x ... x Bk.

E x a m p l e 3.1 The algebra N of positive integers based on the signature 5'N =


[1,0] is represented by the type T N : (0 "--+ 0) ----+ (0 ---+ 0). Every nmnber
97

n is represented by a term (Church's numerals) of the form )tsx.s(...sx). The


algebra E of binary words based on the signature Ss = [1, 1, 0] is represented by
the type r s = (0 ~ 0) ~ ((0 --* 0) --* (0 ~ 0)). For example, the word aba over
the alphabet E = {a,b} is represented by the term )~uvz.u(v(uz)). The term
,Xwsx.ws()~y.y)z of type r s ~ r n represents the function counting the number
of letters a in the given word.

E x a m p l e 3.2 U = [2, 0] is the algebra of binary trees and N = [1, 0] is the


algebra of Church's numerals. Let e be a 0-ary constructor (empty tree) and
A be a binary tree constructor in the algebra U. In this example infix notation
is used for the binary constructor A. By tl^t~ we denote the tree such that
tl and t2 are, respectively, left and right subtrees. In this example U is the
set of all binary trees and N is the set of all Church's numerals. Type r U is
(O--* (O --~ O)) --* (O ~ O) and r y = ( 0 - - * O) --~ (O --* O). Let H be
a closed term ;~Tux.T(~yz.uy)x of type ~.u _.. rN. It is easy to see that H
represents t h e function leftmost : U ~ N which computes the length of the
leftmost path of a tree.

leftmost(e)=0
leftmost (t 1At 2) ~-~leftmOst(t ] ) + 1

The function leftmost is obtained by the recursion schema from the functions
fl(y, z) = y + l and f2 = 0. Since f~ and f2 are ~ functions which are nonterminal
trees in N, the function leftmost is also a ~ function (see definition 1.6).

D e f i n i t i o n 3.3 Let A be an algebra based on signature [al .... , a~]. Let Zi be


a variable of type (TA) a' for i < a and x be a variable of type • --* O).
Let Cons A be a closed term of type • ~ --* rA), defined by Cons A =
(Cons r ...,ConsA), where Cons A, of the type (7A) ~' --, 7"A, is defined by
Cons A = AZiz.xi(Z~x). Note that Cons A represents constructor ai which can
be seen as a function ai : A ~ --+ A. Note also that when ai = 0, Cons A is a
projection ~x.xi which represents constant a~.

Let A = [al,...,aa] be an algebra. Let n A be a collection


D e f i n i t i o n 3.4
of closed terms of types (~.A)p ~ rA for all p > 0 defined by recursion as
the minimal set containing projections ;~s.si and constant functions )~s.Cons A
if ai = 0, and satisfying following property: If D is a closed term of type
(TA) p -"+ (TA) c~i for a~ > 0, such that D j E ~A for all j < c~i, then a closed term
As.ConsA(--Ds) of type (rA) p --~ r n belongs to n A.

Next four lemmas 3.5, 3.6, 3.7, 3.8 are concern with type checking of particular
terms and will be used afterward in lemmas 3.10, 3.13 and 3.14 as well as in
theorem 4.6.

L e m m a 3.5 Let A = [al, ..., aa] be an algebra. Let B be a product of algebras


represented by type r B. Let C = [71, ..., %] be an algebra. For every closed term
98

7 of the type r B --~ (x~=lvc --+ Xa:lT?) a term AY 9 (ATx.((-[Tx) <>( Y x ) ) ) is


well-formed in type xa=~((vc') ~' --+ rB--+ r e ) .

L e m m a 3.6 Let B- be a product of algebras represented by type v V. Let C =


[71,.-., 7c] be an algebra. For every closed term J- of the type v ~- -+ (~.c)7, for
i < c the closed term ATx.xi(J-Tx) is well-formed in type 7-B --~ r C.

L e m m a 3.7 Let A = [al, ..., c~] be an algebra. Let B be a product of algebras


represented by type r ~. Let C = [71, .-., 7~] be an algebra.. For every closed
term T of type r B, for every closed 7 of type • and every closed H- of
type T A ---+ T B -+ T C and T of type X a = l ( ( r C ) cei --~ T B --+ TC) the closed term
-- (l --O~i
(F<> Xi=l((H <>Y//)T))T is well-formed in type (~-C)a.

L e m m a 3.8 Let A = [0~1,..., c~a] be an algebra. Let B be a product of algebras


represented by type r ~. Let C = [7~, ..-,7c] be an algebra. For every closed
terms T of type V B , for every closed Y-of type Xa=l(Va) ~ and every closed
of type r A --* r -K --* r c the closed term (H~ <>(Cons A ~ Y ) ) T is well-formed in
type ( r e ) ~.
P r o o f . See the definition 3.3.

L e m m a 3.9 Let A = [o~1,..., aa] be an algebra. Let B be a product of algebras


represented by type r ~. Let C = [71, ...,%] be an algebra. For every closed
term Y of the type • ~' ---* r ~- ---* v c) a closed term ASTx.S(ATg 9 (T~>
(Az.I~))Tx) is well-formed in type r A ~ 7"~ ---+r C.

L e m m a 3.10 Let A = [al, ..., OZa]be an algebra. Let B be a product of algebras


represented by type 7 B. Let C = [71,-..,%] be an algebra. Let 7 be a closed
term of type 7-~ ~ (x~=lvff -+ Xa=lwA). Let T be a closed term of type
x ~ = l ( ( r c ) ~' ---* r ~- -* r C) defined by T = AY * (ATx.((TTx) o (Yx))). Let
be a closed term of type v A --~ "r~" ~ r c defined by H- = A S T x , S ( I T x ) . For
every closed term T of type r ~- and every closed term Y-of type • ~ , the
terms (Ha <>(Cons A ~ Y ) ) T and ( F <>(( x]= 1 (H~' <>Y i ) ) T ) T are fir] convertible.
P r o o f . The term Cons c is defined in 3.3. In lemmas 3.7 and 3.8 we checked
that both terms have the same type ( r e ) a. We must remember that since Y is a
a
closed term of type xai:l\{rA'~ai] , then Y h a s a form x/= 17 " , =/3n x}~=i • C~i ('Yi)j

(-~a <>(Cons, r o Y ) ) T =/3. definition 2.9


((xa=l ~) o ( xL~ Co,~r =~, definition 2.9
( X a = l ( - H ( C o n s i A ~ i ) ) w =/3. definition 2.6
xL ~-ff(Co.sf g, )T =/3; definition of H
definition 3.3 of Consr
99

• ~=~(~.((~.~(Y~))(W,))) = ~ fl conversion 2.13


• =~, definition 2.17
• =~ definition of Y~
• (~*.((~,Tx)((• =~, definition 2.6
• ~:~(~,. ((fiT,) ( • ((~,b (IT,)))) =~, definition H
xi=~(~x.(I~Tx)(• =~ definition 2.6
• ~:I(~,.(Z~T,)((• ~(Y~b)T,)) =~ definition 2.6
form of Yi
x~=~(Ax.(I~Tx)((H ' o Y~)Tr -Z,~ definition of F~
• tl -- --(~i
o ~ ) T ) T x ) =~, U conversion 2.15
• ~' o ~)T)T) =~ definition 2.9
((• o • ( ( ~ ' oY~)T))~ definition of F
(~o • o V,:)T))T

L e m m a 3.11 The collection t~ A (see definition 3.4) of terms is just the set of
representatives of nonterminal trees in A.
P r o o f . By induction of the construction of elements from hA. The con-
structors are represented (see definition 3.3). Projections are represented. Let
f : A n ~ A b e a function represented by F E nA Let functions gl : A k --*
A, ...,g,~ : A k --~ A be represented by terms G1, ...,Gn from hA. A function h
defined by h(el, ..., ek) = f ( g l ( e t , . . . , ek), .., g(el, ..., ee)) is represented by term
AT.F(G~T)...(GnT). By simple induction on the construction on F we can check
that H ~ hA.

L e m m a 3.12 Let A = [al, ...,aa] be an algebra. If G is a closed term of type


(TA) p ""+ TA_representing a nonterminal_ tree in the algebra A then for every
closed term P terms ~ x . G ( ~ z . P x ) z and G P are j3~ convertible.
P r o o f . It is easy to check that both terms have the same type r A. The proof
is by induction on the construction of term from the set hA. (see definition 3.4)
If G is ~s.s~ then

~x.G(Az.Px)x = ~ definition of G
~ . ( ( ~ . s ~ ) ( ~ z . ~ ) ~ ) =p,~ /~ conversion 2.13
~.((~z.P~)~)x) = ~ formulas 2.18 and 2.17
~.((~z.p~)~) = ~ /3 conversion 2.13
Ax.(P~x)= ~ U conversion 2.15
P~ = ~ definition of G
GP

If G is As.Cons A when a~ = 0 then

Let D be a closed term of type (TA) p ~ ('cA) a' such that every D j ~ nA for
j ~ a~. For induction we assume that every Dj satisfy lemma which means that
i O0

Ax.-D(Az.-fix)x and D P are fir] convertible. We want to prove that the l e m m a


holds for term E = Asx.xi(Dsx).

~ x . E ( A z . P x ) x =~, definition of E
/3 conversion 2.13
/? conversion 2.13
~.x,((~.~(~.~)V)~) =~, inductive assumption for Ay.D(Az.Py)y
Ax.xi(DPx) =~, definition of E
EP

L e m m a 3.13 Let A = [oq, ..., ~ ] be an algebra. Let B be a product of algebras


represented by type r ~. Let C = [71,..-, %] be an algebra. Let F be a closed
t e r m of type X~=l((Vv) ~ ---* v e ---* r C) such that for every closed t e r m T of
type r B terms Gi = A Z . F i Z T of types ( r e ) ~ ~ v c for all i _< a represents
nonterminal trees in algebra C. Let H be a closed term of type r A ~ r g - + r c
defined by A S T x . S ( A R * ( F <>(Az.R))Tx) (see ! e m m a 3.9). For every closed
t e r m T of type v B and every closed t e r m 7 of type x~i=I~G'A~'~, the terms
(-ff~ o (Cons A <>Y ) ) T and ( F o ((x]= 1 ( H ~ o ~ ) ) T ) T are fly convertible.
P r o o f . In l e m m a s 3.7 and 3.8 we checked that terms (-~a 0 (Cons A o Y ) ) T and
(F--o ((xi=~(H~ - - ~ <>Y i ) ) T ) T have the same type (TA) a. Let T and Y- be closed
terms of appropriate types. By l e m m a 3.8 the term A S T x . S ( A R * ( F o ( A z . R ) ) T x )
is well typed in r A ~ r B ~ r A.

(-~'~ o (Cons A o Y))T = , , definition 3.3


of
Cons A
((xa=IH) O ((Xa=lCOn8 A) 0 (Xa=l ]~)))T =~/ formula 2.9
formula 2.9
(( • It ( ConsAyi) ) r =~,7 formula 2.6
definition of H
and /3 conver-
sion 2.13
x L-, (~x.[(Co~.#~4)(~R 9 (T o (A~.R))T~)]) =~, definition of
ConsA~
and /? conver-
sion 2.13
x3=1 (~x.[(Ay.y,(~y))(~R 9 (To (~z.R))~x)]) = , , fl conversion
2.13
x~=l (Ax.[((AR 9 (F e (Az.-R))Tx)i(Yi(AR 9 (To (Az.R))Tx)))]) =~, formulas 2.20
and 2.17
• (A~.[((~R,.(T o (:~z.n)),Y~)(~(AR 9 (-f o (~z.R))~)))]) = ~ formulas 2.19
and 2.18
x~_~ (~.[((~t~,.(~, (:~z.R,))T~)(~(),R. (To (A~.R))T~)))]) =~, definition of
Gi
101

• ( )~x.[( ( )~Ri.-Gi( Az.Ri )x )(~i( AR 9 (-F o ( ;~z.R) )Tx) ) )]) =~ f conversion


l
2.13
x~%~(~.[(~,(~.(V~(~R 9 (Yo (~.~))T~))~]) : ~ , definition of H
x ~ % ~ ( ~ , ~ . [ ( U , ( ~ . ( ( ~ ~'' o ~)T~)~)]) =~,, lemma 3.12
with a term P
equal ( ~ i o
~ ) T when Gi
is nonterminal
tree
definition of
G~
X~=l ( T i ( ( H ~'' o Y~)T)T) formula 2.9
xi~=l (F{) o (x~=l (H~" o Y{)T)T =~, definition of T
T o (x~'=l ( H ~i * YOT)T

L e m m a 3.14 Let A = [c~1, ..., a~] be an algebra. Let B be a product of algebras


represented by type r B. Let C = [71, ...,%] be an algebra. Let ~ be a closed
t e r m of type r B --~ ( x ~ = l r j c --* x~=lr/a). Let F be a closed t e r m of type
X a = l ( ( T C ) ai ""+ T B ~ T C ) defined by F = AY 9 (ATx.((-[Tx) <>(Yx))). For every
closed t e r m T a closed term G defined as AX 9 ( F o X ) T of type x a = ~ ( ( r c ) ~' -*
r c ) represents a tuple of nonterminal trees in the algebra C.

Proof. T y p e checking for t e r m F has been done in l e m m a 3.5. We prove t h a t


Gr represents a nonterminal tree the algebra C for every r < a. Let T be
a closed term. The t e r m G~ is AX~.F~X,T =p, AXrx.IrTx(X~x). [~T is a
simple closed t e r m of type ~ c = e c --. ( o -* o). Since
rank(x$=lv ~ --* (O ~" ~ O)) < 2 then there is a finite term g r a m m a r which
produces all closed terms of this type. Consult for details [Zai87] page 4 L e m m a
c
2.3. The g r a m m a r i s following: Let q 6 0 ~ , x E xj=17JC and let K be a variable.

K :=~ )~xq.ql

K ==~ Axq.qa~
K ~ Axq.xj when 7j = 0
K ~ Axq.xj ( K x q ) . . . ( K x q ) when 7j > 0
"gj times

This g r a m m a r produces all closed terms of the type • C ~ r A. The proof


is by induction on the g r a m m a r construction of the t e r m K = L T . Case 1. I f
I~T is )~xq.% for p < a~ then G~ = ~X~x.(X~)vx = ~ AXr.(X~)v. Therefore
Gr represents a projection. Case 2. If IrT is Axq.xj for 7j = 0 then G~ =
~X~x.xj = ~ )~X~.Cons C therefore Gr represents the constant function. Case 3
inductive step. Suppose the theorem is true for closed terms K1, ...,};f~j which
means t h a t terms G t = IX~x.T(lx(X~x)...G~ i = AX~x.K~ix(Xrx) represent
~.a2

nonterminM trees. Let G = (G1, ..., Gnu) and K = (K1,..., K-r, ). So we have G =
A X r x . K x ( X ~ x ) . Let us check that the theorem also holds for K ' = A x q . x ~ ( K x q ) .
Let G' be A X r x . K ' x ( X ~ x )

G' = t X ~ x . K ' x ( X ~ x ) =~, definition of K'


definitions of G and 3.3 of

X.x.C (G conversion 2.15

L e m m a 3.15 Let B be a product of algebras B1 x ... x Bk such that a algebra


Bi is based on the signature SB~ = [/~, .../3~]. Let the product B is represented
by the type r B. Let C = [71, .-., %] be an algebra. Let T be a variable of the
type r B and x be a variable of the type • ~ c Every long-normal closed term
T of the type r B ~ ~_c is in one of three possible forms
1. P = A T x . x i if 7i = 0
2. P = A T x . x i ( J T x ) if 7/ > 0 for some closed term J of the type 7-g -~ ( r e ) "~'
3. P = A T x . T i ( I- -T x ) \ for some closed term 7 of type ~.B __+ X jC = l v C --+ Xj=lrB~b,
P r o o f . From the definition of the long-normal form.

4. M a i n R e s u l t
L e m m a 4.1 Let A be an algebra based on signature [Ctl, ...,eta]. Let B be a
product of algebras. Let C be an algebra. Let F be a closed term of type
• ~ --+ T B --+ r c) representing the system of functions fl, ..., fa. Let
h : A x B --~ C be a function defined by recursion from functions fl, ..., f~ ( see
definition 1.5.) Let H be a closed term of type r A -+ 7"~ --+ 7"C. The following
two statements are equivalent:

1. Term H represents h
a ~o(i

for every closed term T of type r B and 7 •


P r o o f . The second equation is a simple encoding in A calculus the definition of
primitive recursion.

T h e o r e m 4.2 (soundness) If f is a A function then f is A definable.


P r o o f . By induction on the construction of A functions. Trivially all projec-
tions and constant functions are represented and representability is preserved
by composition. Let A be an algebra based on signature [al,..-,a~]. Always
i - th constructor in A is represented by C o n s A (see definition 3.3). We want
to show that representability is also preserved by primitive recursion. Let B
be a product of algebras. Let C be an algebra. Let f l , . . . , fa are )~ functions
103

such that fi : C ~ x B --+ C for all i < a. Let us assume that for every
b C B functions (fl/-6) : C ~ -+ C,...,-(fa/-b) : C ~~ ~ C are nonterminal
trees. Let h : A • B ~ C be a function defined by primitive recursion from
f l , . . . , fa. Let the system f l , . . . , fa be represented by a closed term F of type
• ~' ~ 7--~ --~ 7"c). For every b E B we define functions g~,...gba by
gb(x) = f i ( x , b ) . Function gb is represented by the term G~ = A X i . F X i b for
i _< a. Therefore the tuple term G b = (G~,..., G~) is given by $ X 9 F X b . Since
gb = f i / b for i < a are nonterminal trees in C then according to lemma 3.11 the
term G~ belongs to ~c. Let H be a closed term of type T A --+ T B -'-+ T C defined
by A S T x . S ( ~ R 9 ( F ~ ( ~ z . R ) ) T x ) (see lemma 3.9). By lemma 3.13 it holds that
for every T and Y-, ( ~ a <>( C o Y ) ) T =p, ( F o (• <>~//))T)T. According
to lemma 4.1 H represents h. It means that the function h is ~ definable.

D e f i n i t i o n 4.3 (Measures of complexity) Let us introduce a complexity measure


7r for closed terms. If T is a closed term written in the long normal form and
T is a projection AXl...x~.xi then 7r(T) = 0. If T = A x l . . . x , . x i T 1 . . . T k then
7r(T) =- m a x j = l . . . k ( r + 1. In fact 7r corresponds with the height of
Bhhm tress for a term T. Let us introduce also a special measure of complexity
p which apply only to closed terms of type r y --+ v C for any product of algebras
B and for any algebra C = [71, ..., %]. Let A be the i - t h algebra in this product
B, and let A be based on signature [al, ..., aa]. Let ~ T x . X be a closed term in
the long normal form of r B --* v C type where T = [T1, ...,Tk] is a variable of
type v B, x = [xl, ..., xr is a variable of type • and X is a term of type O
By p ( ~ T x . X ) we mean a number of such occurrences of T1, ..., Tk in the long
normal form of the term X that any Tj for j < k does not occur in a context
Tjx. A formal definition is the following:

p(ATx.zi) =0 for all i < c such that 7i=0

p( Tx.x (TTx)) = E =I for 7i > 0 where Ji for j < 7i


are closed terms of the type r B ---+ r C

p( ~ T x . ~ (-[Tx) ) = 0 if I T • =~ x

p(ATx.~(-[Tx)) = 1 + E~'=I P(-FfJ) if-iT• # ~ x where I1,..., Ia are closed


terms of types v ~ --~ r C+~1 ,...,
v -~ --~ v C+~" respectively (see defini-
tion 1.1)

In the next theorem we are going to design a procedure which reduces the prob-
lem of representability of a closed term to possibly few "simpler" problems. For
reason of termination of this procedure we are going to investigate some quasi-
order of terms. Let us consider the set of pairs of natural numbers well-ordered
in the ordinal w • w. For every closed term T of type T B - ' + T C where B is
104

a product of algebras and C is an algebra we define the pair of two numbers


(p(P), 7r(P)). In the completeness theorem we will see that the procedure works
in this way that the pair is decreasing in the ordinal w x w.
m

D e f i n i t i o n 4.4 Let B and B' be two products of algebras. Let C and C '
be an algebra. Let P be a term of the type 7-~ - - ~ ~-c and ~ h e a term of
the type r B--r ~ r C'. We call the t e r m P ' "simpler" than P if (p(~-7), ~(~-~)) <
(p(P), ~(__P)) in the_ ordinal w xw. It means that p(P') < p(P) or if p ( P ' ) = p(P)
then ~ ( P ' ) < ~ ( P ) .
m

L e m m a 4.5 Let B be a product of algebras B1, ..., B~. Let C be an algebra.


Let A be the i-th algebra from the product B for i _< n. Let the algebra A have
the signature ( a l , ..., o~). Let 7 be a closed term of type r B ~ ( x ~ : l r C
x ] = l r A ) . Let T be a closed term of the type x ] = l ( ( r c ) ~' ~ r ~ ~ r c ) defined
by ~ = ;~Y 9 ()~Tx.((-[Tx) o (Yx))). Let ~ be a closed term of type r e ~ r c
defined by P = If -[Tx r x then p(~-/) < p ( P ) for all i _<n.
P r o o f . First of all we can notice that terms -[Tx and x m a y be /3r/ con-
vertible only if algebras A and C are the same. The term Fi has the form
AYiTx.hTx(Y~x). It is obvious that the number of occurrences of T's in the Fi
is less by one than the number of r ' s in aT .T ffT ). ~ew variables (Y~)~ for
J <_ 7i of type r c in the term F//are in the contexts of Y/x and therefore their
occurrences do not increase the measure p.

T h e o r e m 4.6 (completeness) If a function p is a A definable function between


the product of algebras B and the algebra C then p is a ,~ function.
P r o o f . Let P be a closed term of type r B ---,, r C representing p. We are going
to prove the theorem inductively on the complexity (p(P), ~r(P)) of the term P.
If (p(P), ~'(P)) = (0, m) then P has to be in one of two possible forms, ATx.xi
when 7i = 0 or ATx.T~(TTx)_J if I T x = ~ x. (see l e m m a 3.15 and definition
of p). If P = ATx.xi then P represents function p:B --~ C which m a p s onto
i - th constructor of the algebra C constantly. Therefore p is a ,~ function. If
-fi = A T x . T i ( I T x ) where -[Tx = ~ x then it means that the algebra C and i - th
algebra in the product B are the same and moreover P =Z~ )~Tx.Tix = ~ )~T.Ti.
In this case P represents projection p(bl, ...b~) = bi so it is a A function.
Induction step. Suppose the theorem is true for all closed terms P ' of any of
the types r B' ---+T C' such that (p(~--7), ~r(~-7)) < (p(~), ~-(~)). If p ( P ) > 0 then
according to the definition of p and l e m m a 3.15 P must be in one of two possible
forms: T = ATx.xi(JTx)_for some closed t e r m 7 or T = ATx.T~(TTx) for some
closed t e r m I such that I T x #Z~ x.
(Case 1) If T = ;~Tx.xi(TTx) for some closed t e r m 7 = (Y-~, ..., JT,) then ~r(J,--?) <
~(~) for all ~ _< 7~ and p(g-~) < p ( ~ ) for all ~ < 7~. Ai~ closed terms J-7, ..., J~,
are of the type v B --~ 7 C (see l e m m a 3.6). Therefore according to induction
assumption .lr represents function j,. : B --~ C which are A functions. We
have t h a t p(bl, ...,bk) = ci(jl(bt,...,b~), ...,jT,(bl, ...,b~)) where ci is the i - t h
105

constructor of the algebra C of arity 71. The function p as the composition of/~
functions is a ,~ function.
(Case 2) Let T = ~Tx.Ti(TTx) for some closed term 7 of the type r B --*
xj=17
c
jC -* x ja = l r r where A is the i - th algebra in the product B given
by signature [al,...,o~a]. Let I T x Cp~ x. Let T be a closed term of type
x]=l((rc) rB r C) defined by T = ~ Y . ( A T x . ( ( - [ T x ) o ( Y x ) ) ) . By lemma
4.5 p(Fj) < p(P) for all j _< a. Since any term Fj is "simpler" than the term P
then all Fj are representing ,~ functions. Let fl, ..., fa be the system of ~ functions
represented by F where fj : C ~j x B ---, C. By lemma3.14 it holds that for every
T, closed term G defined by ,~X * ( T o X ) T belongs to ~ c Since ,~X 9 ( T o X ) T
belongs to gc then according to lemma 3.11 all functions f j / b for j _< a are
nonterminal trees in C. Let h : A ---, B --+ C (where A is an Bi ) be a function
defined by limited primitive recursion from the system fl, ...,fa. Let H be a
closed term of type v A --+ v -~ ~ r C defined by ;~STx.S(TTx). By lemma 3.10
we know that for every T and g , (~a (>(C<>g))T = ~ (~o(( x]= 1( H ~ <>Yj))T)T.
Thanks to lemma 4.1 it means that H represents h. Therefore h is a A func-
tion. Function h is represented by ~STx.S(-[Tx) but function p is represented
by ;~Tx.~(-[Tx). Therefore the following relation between functions h and p
holds: p(bl, ..., bk) = h(bi, bl, ..., bk) for all expressions bl, ..., b~. Since the class
of ,~ functions is closed for compositions it holds that p is a ,~ function.

A c k n o w l e d g m e n t s . I would like to thank an anonymous referee for many


helpful suggestions and comments.

References.
[BSB85 ] Corrado Bbhm and Allessandro Berarducci, Automatic synthesis
of typed ~ programs on term algebras, Theoretical Computer Sci-
ence 39 (1985) 135-154
[Lei89 ] Daniel Leivant Subrecursion and lambda representation over free
algebras, in S. Buss and P Scott (eds.), Feasible Mathematics (Pro-
ceedings of June 1988 Workshop at Cornell)
[Mad91 ] Madry M, On the A definable functions between numbers, words
and trees Fundamenta Informaticae, 1991
[Sch75 ] Schwichtenberg H., Definierbare Funktionen im ~ -Kalkiil mit
Typen, Arch Math. Logik Grundlagenforsch 17 (1975-76) pp 113-
114.
[Sta79 ] Statman 1%. Intuitionistic propositional logic is polynomial-space
complete, Theoretical Computer Science 9, 67-72 (1979)
[Zai87 ] Zaionc M. Word operations definable in the typed )~ calculus, The-
oretical Computer Science 52 (1987) pp. 1-14
[zaig0 ] Zaionc M. A Characteristic of )~ definable Tree Operations, Infor-
mation and Computation 89 No.l, (1990) 35-46
[Zai91 ] Zaionc M. A definability on free algebras, Annals of Pure and Ap-
plied Logic 51 (1991) pp 279 -300.
Semi-Unification and Generalizations of a
Particularly Simple Form

Matthias Baaz * Gernot Salzer **

Technische Universits Wien, Austria

A b s t r a c t . This paper describes a criterion for the existence of general-


izations of a particularly simple form given complex terms in short proofs
within schematic theories: The soundness of replacing single quantifiers,
which bind variables in schema instances, by blocks of quantifiers of the
same type. The criterion is shown to be necessary in general and suffi-
cient for languages consisting of monadic function symbols and constants.
The proof is mainly based on the existence of most general solutions for
solvable semi-unification problems.

1 Introduction
When dealing with the generalization of complex terms in short proofs, one of
the first questions is: Have the innermost parts of a sufficiently complex term in
the end-formula any influence on the proof?. Or more formally, Is it possible to
transform a given proof of A(t) to A(t'), where t' is the result of replacing suffi-
ciently deep subterms oft by corresponding variables? We call this type of gen-
eralization generalization of a particularly simple form. There are calculi which
admit this type of generalization trivially without changing the logical structure
of derivations. Take for example first-order resolution calculi: the generalizations
are provided by the so-called lifting lemmas (cf. [CL 73], L e m m a 5.1). Other cal-
culi admit this form of generalization after adequate transformations; for LK this
means elimination of cuts (cf. [KP 88], Chapter 2).
In this paper we concentrate on schematic 3 theories within usual logical de-
duction systems. It is known from literature that schematic theories, which are
identical in the sense of model theory, may behave in a completely different man-
ner with respect to the generalization principle mentioned above. E.g., for every
finitely axiomatized number theory Z augmented by the least number principle
3#r D (c~(x) A x_<y)) (LP)
restricted to purely existential formulas there is a function g' such that
~k(Z + L P ( ~ I ) ~k A(s~ (0)) and n >_ e ( k , A(a)))
3 m ( Z + LP(Z1) ~- VxA(sm(x))) (*)
* Wiedner Itptstr. 8/El18-2, A-1040 Wien, Austria. Emaih baaz01ogic, tuwien, ac. at
** Karlsplatz 13/E185-2, A-1040 Wien, Austria. Emaih salzer01ogic, tuwien, ac. at
3 Throughout this paper a schema is formula which, in addition to ordinary function
and predicate symbols, may contain predicate variables. An instance of a schema is
a first-order formula obtained by replacing the atomic second-order semi-formulas
a ( t l , . . . , tn) by first-order formulas of corresponding type. A schematic theory is a
finite set of schemata.
107

holds. This implies that if for a sufficiently large n, A(s"(O)) is derivable by a


short proof then it is also derivable for all larger n (cf. [BP 93]). This is not true
even for quantifier-free successor induction
~(0) A Vx(~(x) D a(s(x))) D Vxa(x) (S 0
The argument of [Rich 74] shows that Vx(s n (0)+x=s ~ (x)) is derivable by a proof
of length independent of n from
s'~(0)+0=s~(0) and sn(O)+x=s'~(x) D s'~(O)+s(x)=sn+l(x) ,
which follow immediately from
x+O=x and y+u=v D y+s(u)=s(v)
Therefore
3kVn(SI + Z F~ 3x(x+x=s~~
This proof can be transferred to the case where s is the only function symbol.
The usual arguments identifying LP and SI in number theories depend on
the schema of identity
VxVy(x=y n (a(x) D ~(y))) (ID)
Consequently, the extension of Z + LP(~I) by ID implies the failure of (*).
We consider the class of schemata which are built from one-place schematic
variables ~ such that no function symbol occurs in an argument of a~, like LP
and ID. We call schemata of this type function-free. For these schemata we estab-
lish a simple criterion that allows to decide semantically whether generalizations
of the mentioned particularly simple form exist:
Let Ak be the following transformation for a schema S. Replace all oc-
currences Qz of quantifiers binding the occurrences ai(x) of the schema
variables in S uniformly by a string of quantifiers Q x Q x l . . . Qxk; also
replace the ai(x) by at(x, x l , . . . , xk). If this transformation is logically
correct for all k E w then we call S splittable. A schematic theory is
called splittable if all its schemata are splittable.
LP is splittable since
3x35:VyYg(A(y, Y) D (A(x, 5:) A x<_y)) (A(LP))
is correct for all A, in contrast to
3x35:VyV9(x=y D (A(x, 5:) D A(y, 9))) (ID)
For details see Example 3 and 6 in Section 4.
In this paper we show that the given criterion for function-free schemata
is necessary in general and sufficient in the case of monadic languages for the
existence of generalizations of a particularly simple form. Schematic theories in
monadic languages admit of course generalizations expressible by the monadic
analogue of arithmetic progressions [Maka 77, Farm 88], but these are not gener-
alizations of the form mentioned. The main argument in the proof is the replace-
ment of the second-order unification problems given by the realization problems
of the proof skeletons by adequate semi-unification problems.
'08

2 (Semi-)Term Bases

In this section we want to make precise the concept of 'generalizations of a


particularly simple form'. Generalizations of proofs with respect to their term
structure can be considered as sets

T-(H, A(al,..., aN)) = {(t,,... ,t~) ] II ~ proves A(t,,... ,t~) for s o m e / / ' = / / }
where = is some equivalence relation over proofs. We are interested in general-
izations where
(A) T-(H, A ( a l , . . . , aN)) is the set of all instances of some n-tuple of terms.
(B) = is chosen in way such that for all k E w there are only finitely many
equivalence classes of proofs of length _< k. We will concentrate on the case
where / / = H ' iff /7 and /7' have the same extended proof matrix (see
Section 7).
Condition (A) and (B) imply the existence of a term basis for A(al,..., a~) and
all k E w in a theory 7".
D e f i n i t i o n 1 ( T e r m B a s i s ) . A finite set of n-tuples of terms (t~, 9 " ",
t~ \z is
nti=l
a term basis for A ( a l , . . . , as) and k iff
1. 7" ~ A(ti,,..., ti~) for 1 < i < l, and
2. if 7" t-k A(sl,..., s~) for an n-tuple g of variable free terms then there is a
substitution a such that { S l , . . . , s . ) = {t~,...,ti)c~ for some i, 1 < i < 1.
We extend this concept to semi-terms bound by strong quantifiers. 4 For a bound
variable x, let ST~ be the set of semi-terms containing x and function symbols,
but no constants or other variables9 STc denotes the set of closed terms. For a
formula A(a,,..., as) with free variables a l , . . . , a,~, let a binding assignment 7
be a function that assigns to each ai either one of the variables that are bound
by a strong quantifier occurrence in A or the symbol c. Let

T=_(H,A(al,...a~),7) = { ( t l , . . . , t , ) [ I I ' proves A(tl,...t~)such that


H ' _- H and tj E ST.r(ai) }

Again, if T-(H, A(al,..., a,~), 7) consists of all instances of some tuple of terms
consistent with 7, we obtain the concept of semi-term bases.
D e f i n i t i o n 2 ( S e m i - T e r m B a s i s ) . A finite set of n-tuples of pure semi-terms
(t~, ...,tn)i= i , 1 is called a semi-term basis for A(al,...,an), 7 and k E w in a
theory 7" if the following holds:
1. 7- k- A'(ti~,...,t~) for 1 < i < l, where A' is obtained from A by replacing
the strong quantifier occurrence Qx by Qx, ~t if 7(aj) = x and fi are the
variables in tj.
2. For all n-tuples ( s l , . . . , sn), where sj E ST~(aj), 7" ~_k A(sl,..., sn) implies
that there is a n i , 1 < i < l, and asubstitutionc~ such that sj = tjc~ for
a l l j , 1 <_j<_n.
4 A n o c c u r r e n c e of 3 (\/) is weak in a f o r m u l a if it is in t h e scope of a n even ( u n e v e n )
n u m b e r of n e g a t i o n signs, a n d strong if it is in t h e scope of a n u n e v e n (even) n u m b e r
of n e g a t i o n signs; s u b - f o r m u l a s A D B are t r e a t e d as --A V B .
109

A theory 7- admits (semi)-term bases (with respect to a calculus) if they exist


for every formula A ( a l , . . . , an), every binding assignment 7, and every k E w.
Note that not all schematic theories admit (semi-)term bases. For example,
successor induction together with an adequate weak consistent number theory Z
excludes the general existence of (semi-)term bases. By definition, all (semi-)term
bases are finite; furthermore, {s n (0) I n even} is not the set of instances of a finite
set of terms. However,

3kVn (SI + Z F '~ 3 x ( x + x = s ~ ( 0 ) ) )

3 Some Consequences of the Existence of (Semi-)Term


Bases

For the following we assume that the theories under consideration are consistent
and prove the usual axioms of equality. An almost immediate consequence of
the existence of term bases that usually receives much attention is a version of
Kreisel's conjecture.

T h e o r e m 3. If a number theory 7- admits term bases and proves Vxqy(x=O V


9.. V x=sn-l(0) V x=sn(y)) for all n E w then for every formula A(a)

qkVn 7- ~_k A(s ~(0)) implies 7- F YzA(x)

Proof. We prove the following, somewhat more general form of Kreisel's conjec-
ture:

3kVnt...Vnr 7" }_k A(snl(O),..., s,~,(0)) implies 7- k VXl...Vx~A(xl,..., x , ) .

Let TB be a term basis for A(al, ...,at) and k. Since TB is finite there is a
bound h such that dp(t) < h for all t occurring in TB. All tuples ( t l , . . . , G )
where t~ = sg(0) for some g < h or ti = sh+l(bl) for pairwise distinct free
variables bi are substitution instances of tuples in the term basis. It follows that

7- F A ( t l , . . . , t,.)

Therefore we can repeatedly apply Vx3y(x=0 V . . . V x=s'~-l(0) V x=s~(y)) to


obtain
7- F V z l . . . V x , A ( x l , . . . , x,) []

We now derive consequences of the fact that large semi-terms can be generalized.
T h e o r e m 4 . If a theory 7- admits semi-term bases then for every formula
A(a, a l , . . . , a r )

3]r r ~l- ~_k 3xVyA(x, s ~1 (y),..., s~'(y)) implies


3h T F qxVyl...Vy~A(x, sh(yl),..., sh(y~))
110

Proof. Let SB be a semi-term basis for 3xA(x, al,..., at) and k. Since SB is
finite we can choose terms s ~I ( y ) , . . . , s ~ ( y ) such that all ni as well as I n i - nil
(for i r j and 1 < i, j < r) are greater than the maximal depth of terms in SB.
Because of
T F 3xVyA(x,s'*(y),...,s"r(y)) ,
( s " ( y ) , . . . , s ~ ( y ) } is an instance of some tuple (sin'(z1),.. s'~(zr)} in SB
where z l , . . . , z~ are pairwise distinct variables. By definition,

7" F ~XVZl.. "VzrA(x, Sm'(Zl),..., srn'(zr))

Now choose h = m a x { m 1 , . . . , mr}. It follows that

[- 3xVyl...VyrA(x ,sh(yl),...,sh(yr)) []
Another application of semi-term bases shows: if the existence of a bound beyond
which a statement holds can be shown by a short proof then this bound can be
made explicit within the theory.
Theoremh. If a theory T admits semi-term bases and proves Vx3y(x<y) as
well as Vx(sn(O)<x D 3y(x=s~+l(y))) for all n E r then for every formula A(a)

3k\/n ~ F k 3xVy(x<y D A(s"(y)) implies 3h 7 F Vy(sh(O)<y D A(y))

Proof. Let SB be a semi-term basis for 3xgy(x<y D A(a)), 7 and k, where


7(a) = y. As in the proofs of the theorems above, we conclude that

:r F 3xVyVz(x<y > A(,h+~(~)))


for some fixed h. This is logically equivalent to
T F 3xVy(-~(x<y) V VzA(s h+l (z)))
Now we use Vxqy(x<y) and Vx(sh(O)<x D 3y(x=sh+l(y))) to conclude

T ~ Vy(sh(0)<y D A(y))

4 The Splitting Criterion is Necessary for the Existence


of (Semi-)Term Bases
The following facts will be used in the proof.
L e m m a 6.
(a) Let A[Qxl .. . Qx,~B(xl,..., x,~)] be a formula where the Qxi are strong quan-
lifter occurrences not within the scope of any weak quantifier occurrence in A.
Then
vx~...Vx, A[B(~,..., ~,,)] I: A[Q~... Qx, B(~,...,,,)]
(b) Let Qx be a weak quantifier occurrence in A [ q x B ( x , t l , . . . , t,)] where the ti
are semi-terms containing no bound variables except x. Then
A[QxB(x, t l , . . . , t,)] ~ A [ Q x Q x l . . . Q x , B(x, x~,..., x,O]
111

Proof. Obvious from the laws for shifting quantifiers in first-order logic. []
T h e o r e m 7. Le t 7- be a f u n c t i o n - f r e e s c h e m a t i c t h e o r y p r o v i n g V x ( h ( x ) = x ) for
some monadic function s y m b o l h.
(a) ff all schemata S = S[(Qlx,..-~(xl)...),..., (Qk~k " ' ~ ( ~ ) ' " ")] in 7- are
of a f o r m such that no strong quantifier occurrence Q i x i is in the scope o f any
w e a k q u a n t i f i e r occurrence, a n d i f 7 - a d m i t s t e r m bases, t h e n 7- is splittable.
(b) I f 7-/- a d m i t s s e m i - t e r m bases t h e n 7- is splitlable.
Proof.
(a ) Let 7" consist of the schema

S[(Qtxl . . . c~(Xl)...),..., (Q,~x,~ . . . o~(x,~ ) . . .),


! !
(Q~*~ . . . ~ ( ~ i ) ) , . . . , ( Q '~ . . .' ~ ( x - ) . . . ) ] ,

where the Q i x i (Q~x~) denote the strong (weak)occurrences of quantifiers.


By instantiation we derive immediately the formula
s[..., (Qi~-.. A(~i, h~l(~),..., h ~ ( ~ ) ) . . . ) ,
. . . , ( Q~x~ . . . A ( x ~ , h n l ( x ~ ), . . . , h ' ~ ( x ~ ) ) . . .),...]

for all h i , . 9 n~ E w. Now we apply Lemma 6b to replace

Q ~ (... A(4, h~l(~),..., h~(~))...)


by
Qjxj
, , Qj
, X j,~ . . . t,~,
~ j X j ,~,I ' " .A(xl, X j,~, . .. , ~?)...)

Let S ' [ . . . , ( Q i x i . . . A ( x i , h'~(xi),..., h n k ( x i ) ) '' "),...] be the resulting for-


mula. Replace the xi by new constant symbols ci. All this requires <_ r steps,
for some r independent of k, n l , . . . , n~. Now consider the term basis T B for
S ' [ . . . , A ( a i , a/l,..., a/k),...] and r. Since term bases are finite, there is an l
such that dp(t) < l for all t occurring in T B . Since the constants ci are all
different and S'[. , A ( c i , h > Z ( c i ) , . . . , h ~ ' l ( c i ) ) , . . . ] is provable in _< r steps,
the n(k+l)-tuple i : . . , c i , h 1 "l ( c i ) , . . . , h h i" (ci),...} is represented by T B . I.e.,
there is a tuple [ E T B such that ( . . . , ci, h~(b~),..., h ' ( b ~ ) , . . . ) is an instance
of t for pairwise distinct free variables h i1, . . . , b ik . Now we replace ci by bi and
use Lemma 6b as well as the assumption 7- F V x ( h ( x ) = x ) to obtain the split-
ted schema Ak(S). The same procedure can be repeated for all k E w. If 7-
consists of more than one schema, then this procedure has to be applied
simultaneously to all S E 7-. Thus 7- is splittable.
(b) We proceed like in the first part but consider the semi-term bases for

s'[..., (Q,,~... A(~,, ~ , . . . , ~ 5 " ),---]


where the binding assignment 7 associates xi with all a~. In a similar way
as above we derive
S'[. . . , (Q~x~Q~x~ . . . Q i x ~ . . . A ( x~, h~ ( x~ ), . . . , h~ ( x~ ) ) . . .) , . . .]

in a fixed number of steps. It follows by the same argument as above that


Ak(S) can be proved and hence 7- is splittable. []
~2

The following examples describe splittable schemata.


E x a m p l e 1. Tautologies and first-order formulas are splittable schemata by def-
inition.
E x a m p l e 2. The schema of order induction

Vx(Vy(y<x D a(y)) D ~(x)) D Vz~(z) (OI)


is splittable since A k(OI) is

wvx,...w~(vyv~,, ...vy~(~,<~ D ,~(v, y~,..., v~))D ~(~, <,---, ~ i )


VzVzl ..Vzko~(z, zl, .,zk)
By shifting quantifiers this is equivalent to
Vx(Vy(y<x D VyI...Vy/~ol(y, Yl,..., Yk)) D VXl..-Va~kOZ(X, X l , . . , xk))
D VzVzl ...Vzko~(z, q , . . . , z~)
Therefore T ~ A(T) for any IT containing order induction as its only proper
schema.
E x a m p l e 3. The least number principle

~xVy(~(y) D (~(x) A x<y)) (LP)


is splittable since Ak(LP) is
3x3x,... 3x~VyVy~ ... vv,~ (~(y, y ~ , . . . , w ) > (o4x, x ~ , . . . , x~) A ~ < y ) ) ,
which is derivable from LP via the following steps:
3 x 3 x l . . . 3 x k VyVyl . . . Vyk ( ( - , a ( y , Yl , . . . , Y~ ) V ce( x, X l , . . . , Xk ) ) A
("lol(y, Y l , . . . , Yk) V x~y)) ,
3x [ ( v ~ 3 v , . . . 3y~,(y, yl,..., v/c) v 3~1 ... 3 , ~ ( x , , ~ , . . . , x/c)) A
VY('mgyl "'' 3ykoI(Y, Y l , . . . , y k ) V x__~Y)] ,
3xVv[(--,3v, ... 3y/c~(v,v,,...,y~) ... ,,~)) A
v 3x~ ... 3x/co~(x, x,,
(~3y,... 3y~,~(y, y,,...,y~) v ~_<v)],
~xVy(~yl'''~y/Co~(y, yl,...,y/C) > (3Xl"''~XkOZ(X,Xl,''-,X/c)Ax__~Y))
E x a m p l e 4. The supremum principle for real numbers

3x~(x) A 3X~B(X) D 3*(~B(X) A Vy(~B(Y) D x_<Y)) (SUP)


where a B ( X ) abbreviates Vz(o~(z) D z < x ) , is splittable: OeB(X) transforms into
a~(x) = Vz(3~a'(z, z)) D z<x) and 3XCeB(X) transforms into 3 x 3 # a ' ( z , 9). (The
axioms for ordered fields plus the existence of square roots and SUP for quantifier
free formulas gives a complete axiomatization of the theory of real closed fields.)
E x a m p l e 5. The elementary principle of Dedekind cuts

3 x , ( , ) A 3x-.~(x) A v x v y ( , ( , ) A ~ ( v ) ~ x<v)
D 3xVz(o~(z) D z<_x A -oe(z) D x<_z) (bED)
is splittable, too.
113

The most prominent example for a non-splittable schema is the schema of


identity.
Example 5. Theories based on the schema of identity
VxVy(x=y D (a(x) D a(y))) (ID)

are not splittable in general since ID splits into

VxVx'VyVy' (x=y D (a(x, x') D o~(y, y'))) (A,(ID))

Now instantiate u=v for a(u, v). Shifting the quantifiers we get

WVy(~=y D (3x'(~=x') D Vy'(y=y')))


and consequently VxVx'(x=x') from Vx(a~=x). Hence, a theory containing the
schema of identity and Vx(x=x) is not splittable, except it has only models with
one element.
Remark. If the language of a theory 7- contains only constant symbols but no
function symbols then 7- trivially admits term bases. Moreover, it is easy to see
that instead of containing Vx(h(x)=x) for a function symbol h, it is sufficient to
require 7" I-- Vx(t(x)=x) for some term t(a) containing no other variable than a.

5 Preprocessing for the Generalizations

We now have to fix our concepts of proof (7" ~- A and IT ~-~ A) such that
1. usual notions of proofs are captured, and
2. the proofs, or rather their transformed forms, should admit most general,
term minimal proofs within finitely many equivalence classes of proofs. Every
term minimal proof provides for one element of the (semi-)term basis.
We shall work with LK in one of its usual formalizations.
The length len(//) of a proof/7 (also: its number of steps) is defined as the
number of applications of inference rules. We write 7- k k A for a schematic the-
ory 7- if there is an LK-proof of a sequent Vs ...VxmTrn ---, A of length < k,
where T1, . . . , Tm are instances of schemata in T and the quantifiers V21,..., V2m
bind the parameters of these instances.
T h e o r e m 8. Let 7- be a schematic theory. 7- ~_k A ( t t , . . . , t,~) implies that there
is a proof H of a sequent V21T1 .. "V2mTm --+ A such that
(a) 17 is cut-free 5 and uses only atomic initial sequents and weakenings, and
(b) len.(//) < r k, A ( a l , . . . , an)) for a recursive ~.
Proof. Using the argument of [Part 73, KP 88], all formulas in the proof can
be restricted to formulas of a certain logical depth depending on 7", k and
A ( a l , . . . , aN), and therefore the usual bounds on cut-elimination can be used.
[]
5 An alternative to cut-elimination is the addition of the schema ce D a, which is
splittable, and the replacement of all cuts by D: I. The cut-free proof, however,
admits stronger generMizations than the proof using instances of ce D a. [Kreis 94]
6 Semi-Unification

The main algebraic tool to calculate generalizations of a particularly simple


form is semi-unification. A substitution cr is called a semi-unifier of the semi-
unification problem {(s], t l ) , . . . , (sv,t;) } iff there exist substitutions "~1, 9 9 )~p
such that s l c r = tlcrA1, . . . , 8pCr= tpr
Example 7. ~ = {f(x, f(x, x))/z} is a semi-unifier of {(/(x, z), f(x, f(x, y)))}
because f(x, z){f(x, f(x, x))/z} = f(x, f(x, y)){f(x, f(x, x))/z} {f(x, x)/y}.
Note that cr is not most general; however, c~' = {f(x', y')/z} is.
There is no semi-unifier for (f(x, y), f(x, f(x, y))), since no simultaneous
substitution makes the left-hand side a substitution instance of the right-hand
side.
Semi-unification is undecidable in general [KTU 93]. However, we have the fol-
lowing observation.
A most general semi-unifier can be computed by the following correction
procedure. Let the semi-unification problem be given by { ( s l , t l ) , . . . , (sp,tp)},
and let r/1,,~,...,~lp,n be disjoint canonical renamings of all variables occur-
ring at level n. Furthermore, let the approximations ~r~ be defined by c~0 = id
and 0"~+1 = ~r,~#,~+l, where #,~+1 is the most general simultaneous unifier of
{ ( s l ~ , t 1 ~ 1 , ~ ) , . . . , (Sp~, tv~n~,~)}.
T h e o r e m 9 (cf.[Baaz 93]). The semi-unification problem {(sl, Q ) , . . . , (sp, tp)}
is solvable if[ there is an n such that the approximations an and crn+l are equal
up to renaming when restricted to the variables occurring in s l , . . . , sv. In this
case the restriction of crn to the variables in the si is a most general semi-unifier.
The semi-unification problems needed in this paper are all of the form

{ ((s/,1,...,Si,qi),(ti,1, ...,~,i,qi)) I 1 _< i_<p}


where none of the terms contains a function symbol of arity higher than one.
Using the decidability of monadic second-order unification [Maka 77, Farm 88]
it can be easily shown that semi-unification is decidable for this restricted case.

7 Extended Proof Matrices

A skeleton of an LK-derivation is a finite tree where all inner nodes are labelled
with the names of the inferences to be applied. Quantifier inferences are augmented
by their bound variables. Additionally, predicate symbols are attached to the
initial nodes and to the nodes labelled with 'weakening:right' or 'weakening:left';
the atomic axioms and weakenings are supposed to be based on these predicate
symbols. The skeleton obviously determines the logical structure of the proof. Let
γ* be a mapping assigning to each argument position in a predicate occurrence
a free variable, a bound variable or a constant; γ* denotes the center of the
monadic terms. Every cut-free LK-derivation in a monadic language starting
from atomic axioms and weakenings determines uniquely a skeleton S and a
mapping γ*.⁶
⁶ Clearly, this definition only makes sense for languages consisting only of monadic
function symbols.

An extended proof matrix M(S, γ*) is constructed as follows. First add initial
sequents of the form P(a1, ..., an) → P(a1, ..., an), where P is the predicate
symbol according to S and a1, ..., an are pairwise distinct fresh variables. Then
add sequents to the inner nodes of the skeleton by traversing it downwards as
follows:
1. If the node is labelled with ∧:right then its upper sequents are of the form
   Π → Δ, B and Π' → Δ', C, respectively,
   where the sequences Π (Δ) and Π' (Δ') are equal up to naming of variables.
   Unify those formulas and apply the unifier to all sequents above the node.
   Then apply the rule ∧:right to obtain the lower sequent, i.e., the sequent
   labelling this node.
   If the node is labelled with an exchange, a contraction, a cut or another
   propositional inference rule, proceed analogously.
2. If the inference is a weakening then an atomic formula P(a1, ..., an) is intro-
   duced, where P is determined by S and (a1, ..., an) is an n-tuple of pairwise
   distinct fresh variables.
3. If the inference is ∃:right with bound variable x and the upper sequent is
   Π → Δ, B(a1, ..., an)
   then add as lower sequent
   Π → Δ, ∃xB(a1, ..., an){b_{i1}/a_{i1}, ..., b_{ir}/a_{ir}}
   where b_{i1}, ..., b_{ir} are fresh variables and a_{i1}, ..., a_{ir} are those variables
   among a1, ..., an with γ*(a_i) = x.
The other quantifier rules are handled similarly.
Example 8. Consider the skeleton S depicted in Fig. 1. A corresponding proof
matrix M(S, γ*) is given in Fig. 2; the values of γ* are written as superscripts
above each position.

8 Construction of an Adequate Semi-Unification Problem
Let the following information be given:
(A1) a cut-free extended proof matrix M(S, γ*);
(A2) a sequence H1, ..., Hr of function-free schemata;
(A3) a formula A(a1, ..., an) with a binding assignment γ coinciding with γ*.
All bound variables in items A2 and A3 are assumed to be different.
We construct a semi-unification problem U together with a basic substitu-
tion ξ to be applied to the matrix such that
(B1) every proof of A(t1, ..., tn), where t_i ∈ ST_γ(a_i), represents a solution for U;
(B2) every solution σ for U can be applied to the substituted proof matrix
M(S, γ*)ξ, which then can be extended to a proof of
∀x̄1 T1 ⋯ ∀x̄r Tr → A*(a1σ, ..., anσ)
where

Fig. 1. Skeleton for Example 8.

Fig. 2. Extended proof matrix of Example 8.

(a) T1, ..., Tr are instances of split variants of H1, ..., Hr, and
(b) A* is obtained from A by replacing the quantifier occurrence Qx by
Qx, x̄_i with x̄_i being the variables in a_iξσ for all a_i fulfilling γ(a_i) = x.

To avoid that free or bound variables in the range of γ* are instantiated or
identified we initialize U with the semi-unification problem

{ ((c, d), (x, y)) | x ≠ y are free or bound variables } ,



where c ≠ d are arbitrary constants. We first unify the end sequent of the matrix
with ∀x̄1 T′1 ⋯ ∀x̄r T′r → A(a1, ..., an), where ∀x̄i T′i is obtained from the corre-
sponding schema Hi by replacing the different occurrences of schema variables
by different propositional variables and by taking ∀x̄i from the given matrix; the
resulting unifier initializes ξ. We proceed by traversing the matrix bottom-up.
(C1) The inference is a propositional or structural inference: nothing to do.
(C2) The inference is ∀:right, the lower sequent is of the form
Π → Γ, ∀xB(x, a_{i1}, ..., a_{ir}),
where ∀xB(x, a_{i1}, ..., a_{ir}) is a subformula of A(a1, ..., an) and γ(a_{ij}) = x.
In this case unify the upper sequent with
Π → Γ, B(e, b1, ..., br),
where e is the eigenvariable associated with x by γ* and b1, ..., br are new
variables. Apply the most general unifier to the matrix as well as to ξ.
Extend U by the two pairs
((e, b1, ..., br), (x, a_{i1}, ..., a_{ir})) and ((x, a_{i1}, ..., a_{ir}), (e, b1, ..., br)).
(C3) The inference is ∀:right, the lower sequent is of the form
Π → Γ, ∀xB,
where ∀xB will become a part of a schema instance. Let B* be the formula
occurring immediately above ∀xB in the proof matrix, and let p1, ..., pr be
the positions in B for which γ* yields x. Identify the terms t_i occurring in
position p_i in B, and likewise the terms u_i occurring in position p_i in B*.
Extend U by the two pairs
((e, u1, ..., ur), (x, t1, ..., tr)) and ((x, t1, ..., tr), (e, u1, ..., ur)),
where e is the eigenvariable associated with x by γ*.
(C4) The inference is ∃:right, the lower sequent is of the form
Π → Γ, ∃xB(x),
where ∃xB(x) is a subformula of A(a1, ..., an). Unify the upper sequent
with Π → Γ, B(u), where u is a new variable. Apply the most general
unifier to the matrix as well as to ξ.
(C5) The inference is ∃:right, the lower sequent is of the form
Π → Γ, ∃xB,
where ∃xB will become a part of a schema instance. Let B* be the formula
occurring immediately above ∃xB in the proof matrix, and let p1, ..., pr be
the positions in B for which γ* yields x. Identify the terms t_i occurring in
position p_i in B, and likewise the terms u_i occurring in position p_i in B*.
Extend U by the pair
((d, u1, ..., ur), (x, t1, ..., tr)),
where d is a fresh variable.

The remaining dual quantifier inferences are handled in an analogous manner.


It remains to extend U and ξ in order to correct the schema instances. There
are three kinds of quantified positions which have to be treated separately:
(D1) Quantified positions related to quantifiers within the instances of the sche-
ma variables. For every bound variable x locate all instances of schema
variables, say m, containing positions for which γ* yields x. Let n be the
total number of such positions in one instance, and let t_{i,1}, ..., t_{i,n} be the
terms occurring in these positions within the i-th instance for 1 ≤ i ≤ m.
Compute the most general unifier μ such that t_{i,k}μ = t_{j,k}μ for all i, j, k.
Apply μ to the matrix as well as to ξ.
(D2) Quantified positions bound by quantifiers properly belonging to one of the
schemata Hi. For every schema instance consider every pair B1 and B2 of
schema variable instances derived from schema variables α(x1) and α(x2),
respectively. Let p1, ..., pr and q1, ..., qr be the positions for which γ*
yields x1 and x2, and let t_i and u_i be the terms at the positions p_i and q_i,
respectively. Extend U by the two pairs

((x2, u1, ..., ur), (x1, t1, ..., tr)) and ((x1, t1, ..., tr), (x2, u1, ..., ur)).

(D3) Quantified positions bound by parametric universal quantifiers. Completely
analogous to case D1.
Finally, apply ξ to U.
If there exists a proof corresponding to the initial information A1-A3 then
the previous construction is possible and the semi-unification problem U is solvable.
On the other hand, every solution σ of U induces a derivation by replacing single
quantifier inferences by block-of-quantifier inferences. Every quantifier ∀x (∃x)
is replaced by ∀x∀z1...∀zn (∃x∃z1...∃zn) where z1, ..., zn are all variables
in the scope of ∀x (∃x) within M(S, γ*)ξσ for which γ* yields x. If the original
extended proof matrix M(S, γ*) is realized by a proof, all eigenvariable conditions
are fulfilled for the proof generated by this splitting of quantifiers. We conclude
the following theorem.
Theorem 10. Let T be a function-free schematic theory in a monadic language.
If T fulfills the splitting criterion then T admits (semi-)term bases.
Proof. By Theorem 8, we only need to consider cut-free proofs from schemata
as described above. The proofs of instances of a given A(a1, ..., an) and binding
assignment γ, which are of bounded length, can be partitioned into finitely many
equivalence classes characterized by the finitely many different extended proof
matrices which are realizable by proofs. Note that there are only finitely many
different functions γ* modulo renaming of free variables, bound variables and
constants. Every realizable extended proof matrix determines a term minimal
proof and provides one element of the (semi-)term basis. □
Example 8 (continued). Let the following information be given:
(A'1) the cut-free proof matrix in Fig. 2;
(A'2) the schema H1 ≡ α ⊃ α;
(A'3) the formula ∃y1∀y2 (P(y1, y2) ∨ P(s(y1), s(y2)) ∨ P(0, p)) ⊃ ∃z1∃z2 P(z1, z2)
with γ(p) = 0.

Fig. 3. Instantiated proof matrix of Example 8.

By the construction described above we obtain the instantiated extended proof
matrix M(S, γ*)ξ given in Fig. 3. The resulting semi-unification problem U is
U = { ((0, s(0)), (v, w)) | v ≠ w, v, w ∈ {a, c, x, y1, y2, z1, z2} }
  ∪ { ((x, t1, t2), (c, r1, r2)), ((c, r1, r2), (x, t1, t2)) }
  ∪ { ((0, p), (t1, t2)), ((s(a), s(r3)), (t1, t2)), ((a, r3), (t1, t2)) }
As most general unifier we obtain the identity substitution, σ = id. Conse-
quently, M(S, γ*)ξσ can be transformed into a proof by replacing ∃x by ∃t1∃t2.
The element contributed to the (semi-)term basis is p, hence all instances are
provable.
This proof requires the splitting of ∃x. In the original proof matrix only
instances with terms s^n(0) are provable. This can be seen as follows. A proof
within the given logical form depends on a solution consistent with γ* of the
equations:
a = t1{u1/x} (1)      r3 = t2{u1/x} (2)
s(a) = t1{u2/x} (3)   s(r3) = t2{u2/x} (4)
0 = t1{u3/x} (5)      p = t2{u3/x} (6)
r1 = t1{c/x} (7)      r2 = t2{c/x} (8)
t1 = r1{x/c} (9)      t2 = r2{x/c} (10)
(1) and (5) imply t1 = x, u1 = a and u3 = 0; equation (3) implies u2 = s(a),
whereas (2) and (4) together imply s(t2){a/x} = t2{s(a)/x}, i.e., t2 = s^n(x) and
p = s^n(0) by (6). In this case we may choose r1 = c, r2 = s^n(c) and r3 = s^n(a).
□

9 Conclusion
The characterization results obtained in this paper can be used to demonstrate
that model-theoretically equivalent schemata cannot always be directly trans-
formed into each other. For example, successor induction SI cannot be directly

derived from the least number principle LP, even if finitely many arbitrarily
strong consistent formulas are added. In this context, 'directly derived' means
that all instances are derived within bounded depth. The analysis of the rela-
tion between different model-theoretically equivalent schemata is part of a joint
project with P. Wojtylak.
The main interest for computer science, however, seems to be the charac-
terization of schematic theories where term minimal proofs are obtainable, i.e.,
where lifting lemmas in the widest sense are possible.

References

[Baaz 93] Baaz, M. Note on the existence of most general semi-unifiers. In Arithmetic,
Proof Theory and Computational Complexity, P. Clote and J. Krajíček, ed-
itors, pp. 20-29. Oxford University Press, 1993.
[BP 93] Baaz, M. and P. Pudlák. Kreisel's conjecture for L∃1. In Arithmetic, Proof
Theory and Computational Complexity, P. Clote and J. Krajíček, editors,
pp. 30-60. Oxford University Press, 1993.
[CL 73] Chang, C.-L. and Lee, R.C.-T. Symbolic Logic and Mechanical Theorem
Proving. Academic Press, 1973.
[Farm 88] Farmer, W. M. A unification algorithm for second-order monadic terms.
Ann. Pure Appl. Logic, 39 (1988), 131-174.
[KP 88] Krajíček, J. and P. Pudlák. The number of proof lines and the size of proofs
in first order logic. Arch. Math. Logic, 27 (1988), 69-84.
[Kreis 94] Kreisel, G. Generalizing Proofs: Implications for Generalizing Theorems
Proved in Relatively Few Lines, I. Unpublished manuscript.
[KTU 93] Kfoury, A., J. Tiuryn, and P. Urzyczyn. The undecidability of the semi-
unification problem. Information and Computation, 102 (1993), 83-101.
[Maka 77] Makanin, G. S. The problem of solvability of equations in a free semigroup.
Mat. Sb., 103(2), 147-236. English translation in Math. USSR Sb.,
32 (1977).
[Pari 73] Parikh, R. J. Some results on the length of proofs. Trans. Am. Math. Soc.,
177 (1973), 29-36.
[Rich 74] Richardson, D. Sets of theorems with short proofs. J. Symbolic Logic, 39(2)
(1974), 235-242.
A Mixed Linear and Non-Linear Logic:
Proofs, Terms and Models
(Extended Abstract)*

P. N. Benton†
University of Cambridge

Abstract
Intuitionistic linear logic regains the expressive power of intuitionistic logic through
the ! ('of course') modality. Benton, Bierman, Hyland and de Paiva have given a term
assignment system for ILL and an associated notion of categorical model in which the
! modality is modelled by a comonad satisfying certain extra conditions. Ordinary
intuitionistic logic is then modelled in a cartesian closed category which arises as a
full subcategory of the category of coalgebras for the comonad.
This paper attempts to explain the connection between ILL and IL more directly
and symmetrically by giving a logic, term calculus and categorical model for a system
in which the linear and non-linear worlds exist on an equal footing, with operations
allowing one to pass in both directions. We start from the categorical model of ILL
given by Benton, Bierman, Hyland and de Paiva and show that this is equivalent
to having a symmetric monoidal adjunction between a symmetric monoidal closed
category and a cartesian closed category. We then derive both a sequent calculus
and a natural deduction presentation of the logic corresponding to the new notion of
model.

1 Introduction
This paper concerns a variant of the intuitionistic fragment of Girard's linear logic [7].
Linear logic does not contain the structural rules of weakening and contraction, but these
are reintroduced in a controlled way via a unary operator !. The rules for ! allow ordinary
intuitionistic logic to be interpreted within intuitionistic linear logic.
In [5, 4], Benton, Bierman, Hyland and de Paiva formulated a natural deduction pre-
sentation of the multiplicative/exponential fragment of ILL, together with a term calculus
(extending the propositions as types analogy to linear logic) and a categorical model (a lin-
ear category). In that work, the multiplicative (i.e. ⊗) part of the logic is modelled in a
symmetric monoidal closed category (SMCC). That much is standard and well-understood.
The ! modality is then modelled by a monoidal comonad on the SMCC which is required
to satisfy certain extra (and non-trivial) conditions. These extra conditions are sufficient

*A considerably longer version of this paper is available as a University of Cambridge technical report
[2].
†Author's address: University of Cambridge, Computer Laboratory, New Museums Site, Pembroke
Street, Cambridge CB2 3QG, United Kingdom. Email: Nick.Benton@cl.cam.ac.uk. Research supported by
a SERC Fellowship and the EU Esprit project LOMAPS.

to ensure that the category of coalgebras for the comonad contains a full subcategory
which is cartesian closed and thus models the interpretation of IL in ILL.
Whilst the view that linear logic is primary and that ordinary logic is merely a part of
linear logic is appealing, it is not necessarily always the best way of seeing the situation.
This paper tries to present a more symmetric view of the relationship between IL and ILL
and it seems worth trying to give some motivation for why this might be worth doing.
From a practical point of view, there are a number of reasons why the standard linear
term calculus (LTC) of [5] might be considered unsuitable as the basis of a linear functional
programming language. Firstly, linear functional programming is verbose and unnatural -
whilst the LTC might well be a useful intermediate language for a compiler, it is not very
appropriate as a language for everyday programming. If linearity is to be made visible
to the programmer at all, it appears preferable to have some extension of a traditional
non-linear language in which one could write the occasional linear function in order to
deal with I/O, in-place update or whatever.
A more fundamental problem is that, despite considerable research effort, the precise
way in which a linear language can help with what we have deliberately referred to rather
vaguely as 'I/O, in-place update or whatever' is still not clear. Most published proposals
for using linear types to control or describe intensional features of functional programs
are either unconvincing or use type systems which are only loosely inspired by linear
logic. Systems in the last category can, pragmatically, be extremely successful; the most
obvious example being the language CLEAN. The type system of CLEAN [1] incorporates
a 'uniqueness' operator for (roughly) making non-linear types linear. This is in some
sense dual to the ! of linear logic, which allows linear types to be treated non-linearly.
Unique types in CLEAN are used to add destructive updates and I/O to the language in
a referentially transparent way.
One (somewhat speculative) aim of the work described here is to provide a sound
mathematical and logical basis for a type system like that of CLEAN. We are encour-
aged not only by the similarities between CLEAN and the calculus to be presented here
(the LNL term calculus), but also by the fact that other researchers looking at practical
implementations of linear languages have come up with systems which include aspects
of the LNL term calculus. For example, Lincoln and Mitchell's linear variant [10] of the
'three instruction machine' divides memory into two spaces corresponding to linear and
non-linear objects. Similarly, Wadler's 'active and passive' type system [14] separates lin-
ear from non-linear types. Jacobs [9] has also described how a sequent calculus inspired
by CLEAN's uniqueness types may be interpreted using the linear categories of [4] under
some extra simplifying assumptions.
From a more logical point of view, there has recently been much interest in Girard's
system LU [8] and related systems in which the (multi)sets of formulae occurring in sequents
are split into different zones. Formulae in some zones are treated classically, whilst those
in other zones are treated linearly. Intuitionistic logics inspired by LU have been proposed
by Plotkin [12] and by Wadler [15]. It is desirable to study the proof and model theory
of such systems directly, rather than treating them as syntactic sugar for, for example,
ordinary linear logic (if only to verify that it is possible to treat them as such syntactic
sugar). The logic of this paper should turn out to be equivalent to a subsystem of LU,
though there are some superficial differences of presentation.
From the categorical perspective, it seems natural to explore the more symmetric
situation where one starts from an SMCC and a CCC with (adjoint) functors between
them, rather than an SMCC with sufficient extra structure to ensure the existence of such

a CCC. This is particularly true in the light of the fact that the definition of a linear
category in [4] was arrived at mostly from the proof theory of linear logic, but also (and
this was something of a 'hidden agenda') from a desire to have enough structure to be able
to derive an appropriate CCC from the model. 1 It is also fair to say that the definition
of a linear category is surprisingly complicated, so looking for simpler models, or simpler
presentations of the same models, is a good idea.
The initial motivation for the present work comes from the categorical picture sketched
in the previous paragraph. Once the definition has been made a little more precise, we
shall show that such a situation leads to a comonad on the linear part of the model which
automatically satisfies all the extra conditions required of a linear category, and thus
gives a sound model of ILL including the ! operator. Furthermore, the converse holds -
every linear category gives rise to such a pair of categories. Thus we have an alternative,
simpler, definition of what constitutes a model for ILL. This can be seen as giving a purely
category-theoretic reconstruction of !, in that a linear category (a model for ILL with !) is
exactly what one obtains if one attempts directly to model an interpretation of IL in ILL
without the !.
Another interesting feature of the model is that it gives rise to a strong monad on the
CCC part. Thus one obtains a model not just of the lambda calculus, but of Moggi's
computational lambda calculus [11].
Section 3 then looks at the logic and term calculus which are associated with our
new notion of model. We formulate a sequent calculus presentation which satisfies cut
elimination and then give an equivalent natural deduction system. This then gives, by
the Curry-Howard correspondence, an interesting term calculus which combines the usual
simply-typed lambda calculus with a linear lambda calculus. We also consider translations
in both directions between this new term calculus and the linear calculus of [5].

2 The Categorical Picture


Our aim is to present a logic/terms/categories correspondence, similar to that between
intuitionistic logic, simply-typed lambda calculus and cartesian closed categories, in which
the categorical vertex of the triangle consists of (essentially) a cartesian closed category C,
a symmetric monoidal closed category ℒ, and a pair of functors G : ℒ → C and F : C → ℒ
between them with F ⊣ G. Intuitively, the requirement that the two functors be adjoint
should be understood as saying that there is an interpretation of IL (the CCC) into ILL
(the SMCC).
We will, however, need our categorical model to satisfy some extra conditions before
we can have any hope of it modelling a logic or term calculus. It is necessary for the
two functors and the unit and counit of the adjunction to behave well with respect to the
monoidal structures of the two categories as this is used to represent the multicategorical
structure implied by commas in contexts. We do not have the space to give full definitions
of all the categorical concepts we shall need, but we can at least recall the broad outlines
of the most important ones. The longer version of this paper [2] includes the details.
Given monoidal categories (M, ⊗, I) and (M′, ⊗′, I′), a monoidal functor F : M →
M′ is a functor from M to M′ equipped with a map m_I : I′ → F(I) in M′ and a
1This is not to say that there is anything in the model which is not justifiable in terms of the proof
theory (given a proper proof-theoretic account of T-rules), but merely that, given that a translation of IL
proofs into ILL proofs exists, any correct model for ILL must be able to reflect the translation semantically.

natural transformation m_{X,Y} : F(X) ⊗′ F(Y) → F(X ⊗ Y) which satisfy some coherence
conditions. If M and M′ are symmetric monoidal, then F is a symmetric monoidal functor
if it is monoidal and in addition respects the twist maps σ and σ′.
If (F, m) and (G, n) are monoidal functors from an MC M to an MC M′, then a
monoidal natural transformation from (F, m) to (G, n) is a natural transformation f_X
from F to G which is compatible with the comparison maps in an obvious way.
If M and M′ are (symmetric) monoidal categories then a (symmetric) monoidal ad-
junction between them is an ordinary adjunction in which both of the functors are (sym-
metric) monoidal functors and both the unit and the counit of the adjunction are monoidal
natural transformations.

Definition 1 A linear/non-linear model (LNL model) consists of

1. a cartesian closed category (C, 1, ×, →);

2. a symmetric monoidal closed category (ℒ, I, ⊗, ⊸) and

3. a pair of symmetric monoidal functors (G, n) : ℒ → C and (F, m) : C → ℒ between
them which form a symmetric monoidal adjunction with F ⊣ G.

We shall usually use A, B, C to range over objects of ℒ and X, Y, Z for objects of C. We
write η and ε for, respectively, the unit and counit of the adjunction.
An important consequence of the definition of an LNL model is that as well as the
natural transformations
m_{X,Y} : FX ⊗ FY → F(X × Y)
n_{A,B} : GA × GB → G(A ⊗ B)
and their nullary versions, the maps m : I → F1 and n : 1 → GI, we have a family of
maps
p_{X,Y} : F(X × Y) → FX ⊗ FY
given by the transpose of n_{FX,FY} ∘ (η_X × η_Y):

F(X × Y) --F(η_X × η_Y)--> F(GFX × GFY) --F(n_{FX,FY})--> FG(FX ⊗ FY) --ε--> FX ⊗ FY

and a map p : F1 → I given by F1 --F(n)--> FGI --ε--> I.

Proposition 1 In an LNL model (in fact for any monoidal adjunction), the maps m_{X,Y}
are the components of a natural isomorphism with inverses p_{X,Y} and, furthermore, the
map m is an isomorphism with inverse p:

F(X) ⊗ F(Y) ≅ F(X × Y)

I ≅ F(1)
□

So F preserves the monoidal structure up to an isomorphism rather than merely up
to a comparison. That is to say, F is a strong monoidal functor. There is, of course, a lot more
interesting structure in an LNL model. To begin with, the adjunction induces a comonad
on ℒ and a monad on C. We discuss each of these below.

2.1 The Comonad and Comparison with Linear Categories


The comonad on ℒ is (FG, ε : FG → 1, δ : FG → FGFG), where ε is the counit of the
adjunction and δ has components δ_A = F(η_{G(A)}). We write ! for FG.

Lemma 2 The comonad (!, ε, δ) is symmetric monoidal, i.e. there is a natural transfor-
mation q with components q_{A,B} : !A ⊗ !B → !(A ⊗ B) and a map q : I → !I such that (!, q) is
a symmetric monoidal functor and ε and δ are monoidal natural transformations. □

In [4], we defined a model of the multiplicative/exponential fragment of intuitionistic
linear logic as follows:

Definition 2 A linear category is specified by the following data:

1. A symmetric monoidal closed category (ℒ, ⊗, I, ⊸).

2. A symmetric monoidal comonad (!, ε, δ, q) on ℒ.

3. Monoidal natural transformations with components e_A : !A → I and d_A : !A → !A ⊗ !A
such that

(a) each (!A, e_A, d_A) is a commutative comonoid,

(b) e_A and d_A are coalgebra maps, and
(c) all coalgebra maps between free coalgebras preserve the comonoid structure.

2.1.1 Linear Categories and LNL Models are Equivalent


Any LNL model includes, by definition, part 1 of Definition 2, and we have just seen
(Lemma 2) that it also satisfies part 2. Furthermore, there are plausible candidates for e_A
and d_A:
e_A := p ∘ F(*_{GA})
where *_{GA} is the unique map from GA to the terminal object 1 of C, and
d_A := p_{GA,GA} ∘ F(Δ_{GA})
where Δ_{GA} is the diagonal map from GA to GA × GA in C.

Theorem 3 For any LNL model, e and d as defined above satisfy all the conditions given
in part 3 of Definition 2. In other words, any LNL model is a linear category.

Proof. This involves checking that a fairly large collection of diagrams all commute.
Although this is a lot of work, none of them are very difficult. Proposition 1 plays an
important role in several of them. Further details may be found in [2]. □

We now sketch the proof of the converse to Theorem 3. Whilst this is largely a matter
of recalling results which were proved in [4], by doing this carefully we obtain a slightly
better understanding of the situation.
Assume that ℒ is a linear category as in Definition 2. We need to show that this gives
rise to a CCC C and a symmetric monoidal adjunction between ℒ and C as in Definition 1.
Recall that the comonad on ℒ gives rise to two categories of algebras:

• The Eilenberg-Moore category ℒ^!. This has as objects all the !-coalgebras (A, h_A :
A → !A) and as morphisms all the coalgebra morphisms.

• The (co-)Kleisli category ℒ_!. This is the full subcategory of ℒ^! which has as objects
all the free !-coalgebras (!A, δ_A : !A → !!A).

Each of these categories comes with a pair of adjoint functors F ⊣ G where G : A ↦
(!A, δ_A) and F : (A, h_A) ↦ A.

Lemma 4 If ℒ is a linear category then ℒ^! has finite products, with the terminal object
given by (I, q : I → !I) and the product of (A, h_A) and (B, h_B) by (A ⊗ B, q_{A,B} ∘ (h_A ⊗ h_B)).
□

In general, there is no reason why the Eilenberg-Moore category should be cartesian
closed, since there is no reason why it should have an internal hom for arbitrary pairs
of coalgebras. We can, however, find a full subcategory of the Eilenberg-Moore category
which is cartesian closed.

Lemma 5 In ℒ^!, all the free coalgebras are exponentiable. That is, there is an inter-
nal hom into any free coalgebra (!B, δ_B). Furthermore, the internal hom is itself a free
coalgebra. □
Now, notice that in any cartesian category, if an object X is exponentiable then so is
[Y, X] for any Y, since we can take [Z, [Y, X]] to be [Z × Y, X]. Furthermore, the product
of two exponentiable objects X and Y is exponentiable since we can take [Z, X × Y] to
be [Z, X] × [Z, Y]. Taking this together with the previous lemma, we have:

Lemma 6 The full subcategory Exp(ℒ^!) of the Eilenberg-Moore category having as objects
the exponentiable coalgebras is cartesian closed and contains the Kleisli category ℒ_!. □
Note that the Kleisli category is not, in general, itself cartesian closed, since the product
of two free coalgebras is not necessarily free. We shall consider a case in which this does
happen in Section 2.1.2. In the general case, we do have the following, however:

Lemma 7 The full subcategory of Exp(ℒ^!) consisting of finite products of free coalge-
bras is cartesian closed. □

Theorem 8 If ℒ is a linear category then, by taking C to be either the subcategory of
finite products of free coalgebras or Exp(ℒ^!), and F
and G to be the appropriate forgetful and free functors, one obtains an LNL model.

Proof. We have already seen that both choices for C are cartesian closed so it just
remains to check that F and G form a symmetric monoidal adjunction, which is straight-
forward. □

2.1.2 Additives and the Seely Isomorphisms


We now consider briefly what happens when an LNL model also has the extra structure
required to model the additive linear connectives &, ⊕ and the non-linear sum +. The
simplest case is when the SMCC part ℒ of an LNL model also has finite products, modelling

the additive connective 'with' (&). The functor G preserves limits because it is a right
adjoint, and in particular

G(A & B) ≅ GA × GB and G1 ≅ 1

Taking this together with Proposition 1, we obtain the following natural isomorphisms:

!A ⊗ !B ≅ !(A & B) and I ≅ !1

These isomorphisms were central to Seely's proposed model of ILL [13], which also pro-
posed interpreting IL in the Kleisli category. See [6] for a critique of Seely's semantics;
here we merely note the following:

Proposition 9 If a linear category has products then the Kleisli category ℒ_! is cartesian
closed.

Proof. One shows that ℒ having products implies that the product of two free !-coalgebras
is a free coalgebra. This means that ℒ_! coincides with the subcategory of finite products of
free coalgebras, which is cartesian closed by Lemma 7. □

The correspondence between linear categories and LNL models extends trivially to
one between linear categories with finite products and LNL models with products on the
SMCC part. Coproducts are slightly more problematic. Whilst the appropriate extension
of an LNL model seems obvious (just require both ℒ and C to have finite coproducts),
this does not correspond quite as simply as one might hope to linear categories with
coproducts.
The difficulty is that, whilst an LNL model with coproducts certainly gives rise to
a linear category with coproducts, the converse does not appear necessarily to be true.
Assume ℒ is a linear category with finite coproducts; then ℒ^! also has finite coproducts, as
we can define the coproduct of (A, h_A) and (B, h_B) to be

(A + B, [!inl ∘ h_A, !inr ∘ h_B])

and this is easily checked to satisfy the appropriate conditions. There seems no general
reason, however, why either of the two CCCs which we have already identified as arising
from ℒ should be closed under this coproduct.
Fortunately, something can be salvaged. There are weak finite coproducts ⊕ in the
Kleisli category, obtained by defining

(!A, δ_A) ⊕ (!B, δ_B) := (!(!A + !B), δ_{!A+!B})

with, for example, the left injection given by !inl ∘ δ_A.

2.2 The Monad and Comparison with Let-CCCs


The monad on C is (GF, η : 1 → GF, μ : GFGF → GF), where η is the unit of the
adjunction and μ has components μ_X = G(ε_{FX}). Writing T for GF, one can check that
(T, η, μ) is a symmetric monoidal monad, i.e. T is a symmetric monoidal functor and both
η and μ are monoidal natural transformations.
Cartesian closed categories with monoidal monads have recently been the focus of some
interest, as they are the models for Moggi's computational lambda calculus [11]. The

definition is usually given in terms of strong monads, where a monad T on a monoidal
category is said to be strong if it is equipped with a natural transformation τ (called the
tensorial strength) with components

τ_{A,B} : A ⊗ TB → T(A ⊗ B)

satisfying some extra conditions. A strong monad on a symmetric monoidal category
is said to be commutative if the tensorial strength behaves well with respect to the twist
maps σ. It turns out that commutative strong monads are the same as symmetric monoidal
monads (see the full paper for more details).
A model of the computational lambda calculus (a let-CCC) is a cartesian closed cat-
A model of the computational lambda calculus (a let-CCC) is a cartesian closed cat-
egory with a strong monad. The above implies that an LNL model always has a strong
monad on the CCC part of the model and thus includes a let-CCC. The monad is, how-
ever, always commutative (because T is a symmetric monoidal functor). It is not the case
that all strong monads on CCCs are commutative; indeed, some very important monads
arising in computer science are non-commutative, for example the free monoid monad
(list, [-], flatten) on the category of sets. Thus it is certainly the case that not all, or
even all interesting, let-CCC's will arise from LNL models. Having said that, many of
the most important monads arising in semantics, such as lifting and various flavours of
powerset/powerdomain, are commutative, so the theory of commutative strong monads
on CCCs is not without independent interest.

2.3 Examples
Let ℒ be the category of pointed ω-cpos (ω-cocomplete partial orders with a least element)
and strict (bottom preserving) continuous maps. This is a symmetric monoidal closed
category with tensor product given by the so-called smash product, the identity for the
tensor by the one-point space (which is also a biterminator) and internal hom by the strict
continuous function space. In fact, ℒ also has binary products and coproducts, given by
cartesian product and coalesced sum respectively.
Given this choice of ℒ, there are a couple of obvious choices for the CCC C which give
an LNL model. One is to take C to be the category of pointed ω-cpos and continuous
(not necessarily strict) maps, G to be the inclusion functor and F to be the lifting functor
F : X ↦ X⊥. The monoidal structure m on F is given by the evident isomorphism
X⊥ ⊗ Y⊥ ≅ (X × Y)⊥. In this case, C is (equivalent to) the Kleisli category of the lifting
comonad on ℒ. Note that the cartesian closure of the Kleisli category follows from the
fact that ℒ has products. There are strong coproducts in ℒ but only weak ones in C.
An alternative choice of C is the category of (not necessarily pointed) ω-cpos (these
are sometimes called predomains) and continuous maps, again with inclusion and lifting
functors. This is equivalent to the Eilenberg-Moore category of the lifting comonad on ℒ, so
it has products and coproducts by our previous general arguments, but it also turns out
to be cartesian closed.
A different example arises from taking ℒ to be the category of Abelian groups and group
homomorphisms. This is symmetric monoidal closed with A ⊗ B the Abelian group gen-
erated by the set of tokens {a ⊗ b | a ∈ A, b ∈ B} subject to the relations

(a1 + a2) ⊗ b = a1 ⊗ b + a2 ⊗ b

a ⊗ (b1 + b2) = a ⊗ b1 + a ⊗ b2

(More categorically, A ⊗ B can be defined by a homomorphism A × B → A ⊗ B which
is universal amongst bilinear maps into Abelian groups.) The unit for ⊗ is the group of
integers under addition, Z, and the internal hom A ⊸ B is the group of homomorphisms
from A to B with the multiplication inherited from B. In fact ℒ also has biproducts - the
direct sum A ⊕ B is both a product and a coproduct and the trivial group is a biterminator.
Now let C be Set, and F and G be the free and forgetful functors. It is easy to check
that this does indeed give an LNL model. The comonad on ℒ takes an Abelian group to
the free Abelian group on its underlying set; ε is 'evaluation' and δ is the insertion of generators.
In this case C is equivalent to the Kleisli category of the comonad on ℒ.

3 LNL Logic
LNL-models are, of course, supposed to be models of a logical system. Theorem 3 says
that they are models for intuitionistic linear logic as defined by Girard, but the form of
the definition of LNL-model suggests an interesting alternative presentation of the logic.
The idea is that one starts with two independent logics, corresponding to the categories
ℒ and C, and then adds operators which correspond in some way to the adjunction.
In keeping with our earlier conventions for naming objects of ℒ and C, we will use
A, B, C to range over linear propositions and X, Y, Z for conventional ones. We shall use
Γ and Δ to range over linear contexts (finite multisets of linear propositions) and Θ and Φ
for non-linear ones. We also decorate turnstiles with ℒ or C to indicate which subsystem
they belong to. Finally, if Θ is X1, ..., Xn then FΘ means FX1, ..., FXn, and similarly
for GΓ. The two classes of propositions with which we shall be dealing are defined by the
following grammar:

A, B := A0 | I | A ⊗ B | A ⊸ B | FX
X, Y := X0 | 1 | X × Y | X → Y | GA

where A0 (resp. X0) ranges over some unspecified set of atomic linear (resp. non-linear)
propositions.

3.1 Sequent Calculus
The two logics with which we start are very familiar, viz. the exponential-free, multiplica-
tive fragment of propositional intuitionistic linear logic and the ×, → fragment of ordinary
intuitionistic logic. These both have very well-behaved sequent presentations. How should
the systems be enriched and combined to give LNL-logic? There are (at least) two natural
answers, neither of which satisfies cut elimination. Fortunately, there is a presentation
of the logic which has a good proof theory. The trick is to allow conventional non-linear
formulae to appear in the assumptions of a linear sequent. A typical linear sequent looks,
therefore, like this:
X1, ..., Xm, A1, ..., An ⊢_ℒ B
which is interpreted as a morphism in ℒ of the form

FX1 ⊗ ... ⊗ FXm ⊗ A1 ⊗ ... ⊗ An → B

Non-linear sequents are still constrained to have purely non-linear antecedents and are
interpreted as morphisms in C in the usual way. We abuse notation by writing linear

A ⊢_ℒ A   (ℒ-axiom)                              X ⊢_C X   (C-axiom)

Θ, X, X; Γ ⊢_ℒ A  ⟹  Θ, X; Γ ⊢_ℒ A   (ℒ-contraction)
Θ, X, X ⊢_C Y  ⟹  Θ, X ⊢_C Y   (C-contraction)
Θ; Γ ⊢_ℒ A  ⟹  Θ, X; Γ ⊢_ℒ A   (ℒ-weakening)
Θ ⊢_C Y  ⟹  Θ, X ⊢_C Y   (C-weakening)

Θ ⊢_C X   and   X, Φ; Γ ⊢_ℒ A  ⟹  Θ, Φ; Γ ⊢_ℒ A   (Cℒ-cut)
Θ ⊢_C X   and   X, Φ ⊢_C Y  ⟹  Θ, Φ ⊢_C Y   (CC-cut)
Θ; Γ ⊢_ℒ A   and   Φ; Δ, A ⊢_ℒ B  ⟹  Θ, Φ; Γ, Δ ⊢_ℒ B   (ℒℒ-cut)

Θ, X ⊢_C Z  ⟹  Θ, X × Y ⊢_C Z   (C-×-left1)
Θ, Y ⊢_C Z  ⟹  Θ, X × Y ⊢_C Z   (C-×-left2)
Θ, X; Γ ⊢_ℒ A  ⟹  Θ, X × Y; Γ ⊢_ℒ A   (ℒ-×-left1)
Θ, Y; Γ ⊢_ℒ A  ⟹  Θ, X × Y; Γ ⊢_ℒ A   (ℒ-×-left2)
Θ ⊢_C X   and   Φ ⊢_C Y  ⟹  Θ, Φ ⊢_C X × Y   (×-right)
⊢_C 1   (1-right)

Θ; Γ, A, B ⊢_ℒ C  ⟹  Θ; Γ, A ⊗ B ⊢_ℒ C   (⊗-left)
Θ; Γ ⊢_ℒ A   and   Φ; Δ ⊢_ℒ B  ⟹  Θ, Φ; Γ, Δ ⊢_ℒ A ⊗ B   (⊗-right)
Θ; Γ ⊢_ℒ A  ⟹  Θ; Γ, I ⊢_ℒ A   (I-left)
⊢_ℒ I   (I-right)

Θ ⊢_C X   and   Y, Φ ⊢_C Z  ⟹  Θ, X → Y, Φ ⊢_C Z   (C-→-left)
Θ ⊢_C X   and   Y, Φ; Γ ⊢_ℒ A  ⟹  Θ, X → Y, Φ; Γ ⊢_ℒ A   (ℒ-→-left)
Θ, X ⊢_C Y  ⟹  Θ ⊢_C X → Y   (→-right)

Θ; Γ, A ⊢_ℒ B  ⟹  Θ; Γ ⊢_ℒ A ⊸ B   (⊸-right)
Θ; Γ ⊢_ℒ A   and   Φ; Δ, B ⊢_ℒ C  ⟹  Θ, Φ; Γ, A ⊸ B, Δ ⊢_ℒ C   (⊸-left)

Θ ⊢_C X  ⟹  Θ ⊢_ℒ FX   (F-right)
Θ, X; Γ ⊢_ℒ A  ⟹  Θ; FX, Γ ⊢_ℒ A   (F-left)
Θ; B, Γ ⊢_ℒ A  ⟹  Θ, GB; Γ ⊢_ℒ A   (G-left)
Θ ⊢_ℒ A  ⟹  Θ ⊢_C GA   (G-right)

Figure 1: Sequent calculus presentation of LNL logic



Figure 2: Categorical interpretation of LNL logic (sketch)

sequents in the form Θ; Γ ⊢_ℒ A, even though there is no need for the ';' since linear and
non-linear formulae can never be confused. Figure 1 shows the sequent rules for LNL logic.
The interpretation of LNL logic in an LNL model is fairly straightforward. We omit the
interpretation of the standard logical connectives and just give details of the interpretation
of one cut rule and some of the rules for F and G in Figure 2.

Theorem 10 (Cut Elimination) There is an algorithm which, given a proof Π of a
sequent Θ ⊢_C X or Θ; Γ ⊢_ℒ A, yields a cut-free proof Π' of the same sequent.

Proof. This follows the broad outline of most cut elimination proofs, showing that proofs
may be simplified by a (non-deterministic) succession of local rewrites which percolate the
cuts upwards. Again, see the full version of the paper for details. □

Theorem 11 The cut elimination procedure for LNL logic is modelled soundly in any
LNL model.

Proof. One shows that whenever one proof is simplified to another then the interpreta-
tions of those two proofs are equal morphisms in the model.

There are many possible variations on the sequent rules for LNL logic. One of the most
natural is to treat the non-linear parts of antecedents as additive rather than multiplicative.
This yields an equivalent logic containing rules such as
Θ; Γ ⊢_ℒ A   and   Θ; Δ ⊢_ℒ B  ⟹  Θ; Γ, Δ ⊢_ℒ A ⊗ B   (⊗-right)
One can also present the purely multiplicative version of the logic in a concise way by
using some new metavariables: let P, Q range over either linear or non-linear propositions

Θ; a: A ⊢_ℒ a: A                                  Θ, x: X ⊢_C x: X

Θ ⊢_C s: X   and   Θ ⊢_C t: Y  ⟹  Θ ⊢_C (s, t): X × Y                  Θ ⊢_C (): 1
Θ ⊢_C s: X × Y  ⟹  Θ ⊢_C fst(s): X               Θ ⊢_C s: X × Y  ⟹  Θ ⊢_C snd(s): Y

Θ; Γ ⊢_ℒ e: A   and   Θ; Δ ⊢_ℒ f: B  ⟹  Θ; Γ, Δ ⊢_ℒ e ⊗ f: A ⊗ B
Θ; Γ ⊢_ℒ e: A ⊗ B   and   Θ; Δ, a: A, b: B ⊢_ℒ f: C  ⟹  Θ; Γ, Δ ⊢_ℒ let a ⊗ b = e in f: C

Θ ⊢_ℒ *: I
Θ; Γ ⊢_ℒ e: I   and   Θ; Δ ⊢_ℒ f: A  ⟹  Θ; Γ, Δ ⊢_ℒ let * = e in f: A

Θ, x: X ⊢_C s: Y  ⟹  Θ ⊢_C (λx: X. s): X → Y
Θ ⊢_C s: X → Y   and   Θ ⊢_C t: X  ⟹  Θ ⊢_C s t: Y

Θ; Γ, a: A ⊢_ℒ e: B  ⟹  Θ; Γ ⊢_ℒ (λa: A. e): A ⊸ B
Θ; Γ ⊢_ℒ e: A ⊸ B   and   Θ; Δ ⊢_ℒ f: A  ⟹  Θ; Γ, Δ ⊢_ℒ e f: B

Θ ⊢_C s: X  ⟹  Θ ⊢_ℒ F(s): FX
Θ; Γ ⊢_ℒ e: FX   and   Θ, x: X; Δ ⊢_ℒ f: A  ⟹  Θ; Γ, Δ ⊢_ℒ let F(x) = e in f: A

Θ ⊢_ℒ e: A  ⟹  Θ ⊢_C G(e): GA                    Θ ⊢_C s: GA  ⟹  Θ ⊢_ℒ derelict(s): A

Figure 3: LNL term assignment system

and T over mixed contexts. Then we can, for example, capture both --+-left rules in the
one rule
Θ ⊢ X   and   Y, T ⊢ P  ⟹  Θ, X → Y, T ⊢ P   (→-left)
This gives a set of rules which are essentially the same as those given by Jacobs in [9]
(which also contains a good account of some concrete categorical models). Jacobs gives a
rather different account of the semantics and there are also some subtle differences in the
proof rules.

3.2 Natural Deduction and LNL Terms


There is a natural deduction formulation of LNL logic and an associated normalisation
procedure. This gives, by the Curry-Howard correspondence, a term assignment system
and a set of reduction rules, i.e. a mixed linear/non-linear lambda calculus. The natural
deduction system we present corresponds to the additive context variant of the sequent
calculus and is given in 'sequent style', complete with the term annotations, in Figure 3.

fst(s, t) →β s
snd(s, t) →β t

let a ⊗ b = e ⊗ f in g →β g[e/a, f/b]

let * = * in e →β e

(λx: X. s) t →β s[t/x]

(λa: A. e) f →β e[f/a]

let F(x) = F(s) in e →β e[s/x]

derelict(G(e)) →β e

Figure 4: Term calculus β-reductions

It is easy to check that terms code derivations uniquely and that the natural deduction
system is equivalent to the sequent calculus. The proof of the equivalence uses the impor-
tant lemmas that substitution and weakening are admissible rules in the natural deduction
system. Linear variables a,b in the context occur free exactly once in a well-typed term,
whereas non-linear variables x,y may occur any number of times, including 0. Note also
that there is no explicit syntax for weakening or contraction. We omit the details of the
interpretation of natural deductions in LNL models.
The fundamental kind of normalisation step on natural deductions is the removal of a
'detour' in the deduction, which consists of an introduction rule immediately followed by
the corresponding elimination. For reasons of space, we omit the details of the reductions
on proofs but merely list the induced fl-reductions on terms in Figure 4.
There is also a secondary class of reductions - the commuting conversions, of which
there are 12 in total. The following is a typical term reduction induced by a commuting
conversion:
let a ⊗ b = (let * = e in f) in g  →_c  let * = e in (let a ⊗ b = f in g)
The reduction relation →_{β,c}, which is the precongruence closure of the union of →_β and
→_c, is easily checked to preserve types. We also have (cf. Theorem 11):
Theorem 12 Both the β-reductions and the commuting conversions are soundly modelled
by the interpretation of the natural deduction system in any LNL model.
• If Θ; Γ ⊢_ℒ e: A and e →_{β,c} e' then [Θ; Γ ⊢_ℒ e: A] = [Θ; Γ ⊢_ℒ e': A]
• If Θ ⊢_C s: X and s →_{β,c} s' then [Θ ⊢_C s: X] = [Θ ⊢_C s': X]
□
We can define translations in both directions between LNL logic and ILL. If A is an
ILL proposition, define the linear LNL proposition A° inductively as follows:
A0° = A0 (A0 atomic)      (A ⊗ B)° = A° ⊗ B°
(A ⊸ B)° = A° ⊸ B°       I° = I
(!A)° = FG(A°)

Theorem 13 If Γ ⊢ e : A in ILL, then there is an e° such that Γ° ⊢_ℒ e°: A°. □


In the other direction, one translates the linear part of LNL logic essentially unchanged
and the non-linear part using a variant of the Girard translation. E.g.:
(A ⊗ B)* = A* ⊗ B*        (A ⊸ B)* = A* ⊸ B*
(FX)* = !(X*)             (X × Y)* = !(X*) ⊗ !(Y*)
(X → Y)* = !(X*) ⊸ Y*     (GA)* = A*
Theorem 14
1. If Θ ⊢_C s : X in LNL logic, there is an LTC term s* s.t. !Θ* ⊢ s*: X*

2. If Θ; Γ ⊢_ℒ e : A in LNL logic, there is an LTC term e* s.t. !Θ*, Γ* ⊢ e*: A*

□

It is easy to see that for any ILL judgement Γ ⊢ A, Γ°* ⊢ A°* is equal to the original
judgement. Thus Γ ⊢ A is provable in ILL iff Γ° ⊢_ℒ A° is provable in LNL logic. This
extends to proofs in the following way:
Theorem 15 If Γ ⊢ e: A in LTC, then not only is Γ ⊢ e°*: A provable, but e ≈ e°*, where
≈ is the categorical equality relation on LTC terms given in [4]. □

4 Conclusions and Further Work

We have given a new and intuitively appealing characterisation of categorical models of


intuitionistic linear logic. We then used this presentation of the models as the basis for
defining a new logic which unifies ordinary intuitionistic logic with intuitionistic linear
logic. The natural deduction presentation of the new logic then led to a mixed linear
and non-linear lambda calculus. LNL logic has a natural class of categorical models and a
well-behaved proof theory in both its sequent calculus and natural deduction formulations.
Given this, and the links with other research which were mentioned in the introduction,
LNL logic certainly seems to merit further study.
On the theoretical side, much remains to be done. We have not proved a completeness
theorem, nor have we proved that the LNL term calculus is strongly normalising. The
strong normalisation proof should be relatively easy to do via a translation argument like
that which we have previously used for the linear term calculus [3] and the computational
lambda calculus. It would be nice to have better (that is, less degenerate) examples of
concrete models and one might well find such examples by looking at some of the categories
arising in game semantics.
We should investigate further how to treat the additives. Beyond that, one could
consider adding inductive or coinductive datatypes or second-order quantification to the
logic. This seems particularly worthwhile in the light of Plotkin's work on parametricity
and recursion in a logic rather like ours [12].
On the practical side, we should investigate whether or not the LNL term calculus lends
itself more readily to efficient implementation than does the linear term calculus. The hope
is that one can arrange an implementation with two memory spaces, corresponding to the
two subsystems of LNL logic. The non-linear space would be garbage collected in the
usual way, whereas the linear space would contain objects satisfying some useful memory
invariant (such as having only one pointer to them at all times) which could be exploited
to reduce the space usage of programs. Previous experience, however, shows that turning
such intuitively plausible hopes into provably correct implementations is a non-trivial task.

References
[1] E. Barendsen and S. Smetsers. Conventional and uniqueness typing in graph rewrite systems.
Technical Report CSI-R9328, Katholieke Universiteit Nijmegen, December 1993.
[2] P. N. Benton. A mixed linear and non-linear logic: Proofs, terms and models (preliminary
report). Technical Report 352, Computer Laboratory, University of Cambridge, September
1994.
[3] P. N. Benton. Strong normalisation for the linear term calculus. Journal of Functional Pro-
gramming, 1995. To appear. Also available as Technical Report 305, University of Cambridge
Computer Laboratory, July 1993.
[4] P.N. Benton, G. M. Bierman, J. M. E. Hyland, and V. C. V. de Paiva. Linear lambda calculus
and categorical models revisited. In E. Börger et al., editors, Selected Papers from Computer
Science Logic '92, volume 702 of Lecture Notes in Computer Science. Springer-Verlag, 1993.
[5] P. N. Benton, G. M. Bierman, J. M. E. Hyland, and V. C. V. de Paiva. A term calculus
for intuitionistic linear logic. In M. Bezem and J. F. Groote, editors, Proceedings of the
International Conference on Typed Lambda Calculi and Applications, volume 664 of Lecture
Notes in Computer Science. Springer-Verlag, 1993.
[6] G. M. Bierman. On intuitionistic linear logic (revised version of PhD thesis). Technical Report
346, Computer Laboratory, University of Cambridge, August 1994.
[7] J.-Y. Girard. Linear logic. Theoretical Computer Science, 50:1-102, 1987.
[8] J.-Y. Girard. On the unity of logic. Annals of Pure and Applied Logic, 59:201-217, 1993.
[9] B. Jacobs. Conventional and linear types in a logic of coalgebras. Preprint, University of
Utrecht, April 1993.
[10] P. Lincoln and J. C. Mitchell. Operational aspects of linear lambda calculus. In Proceedings
of the 7th Annual Symposium on Logic in Computer Science. IEEE, 1992.
[11] E. Moggi. Notions of computation and monads. Information and Computation, 93:55-92,
1991.
[12] G. D. Plotkin. Type theory and recursion (abstract). In Proceedings of 8th Conference on
Logic in Computer Science. IEEE Computer Society Press, 1993.
[13] R. A. G. Seely. Linear logic, *-autonomous categories and cofree coalgebras. In Conference on
Categories in Computer Science and Logic, volume 92 of AMS Contemporary Mathematics,
June 1989.
[14] P. Wadler. There's no substitute for linear logic (projector slides). In G. Winskel, editor,
Proceedings of the CLICS Workshop (Part I), March 1992, Aarhus, Denmark, May 1992.
Available as DAIMI PB-397-I Computer Science Department, Aarhus University.
[15] P. Wadler. A taste of linear logic. In A. M. Borzyszkowski and S. Sokolowski, editors,
Proceedings of the 18th International Symposium on Mathematical Foundations of Computer
Science, number 711 in Lecture Notes in Computer Science, pages 185-210, 1993.
Cut Free Formalization of Logic with Finitely Many Variables. Part I.
L. GORDEEV
WSI f. Informatik, Math. Logik, Auf der Morgenstelle 10,
D-72076 Tübingen, Germany
gordeew@informatik.uni-tuebingen.de

§ 0. INTRODUCTION. Computer science logic and automated theorem
proving deal e.g. with computable verification of validity in the classi-
cal predicate logic of 1-order. To that end various calculi of predicate logic
and their proof search algorithms were invented and studied.
Let us consider first, for the sake of brevity, a particular case of the
propositional logic. There are known direct as well as indirect calculi. The
latter ones include e.g. the modus ponens rule of deduction ⊢ C & ⊢ C→A ⇒ ⊢ A,
or transitivity ⊢ A = C & ⊢ C = B ⇒ ⊢ A = B in the case of equational logic, both
being indirect because their "new" premise cut formula C might be more
involved (both in form and in substance) than the conclusion formula A,
resp. A and B. [A familiar "extremely indirect" example is the calculus of
Łukasiewicz whose only rule is modus ponens and whose sole axiom (schema)
is ((P→Q)→R)→((R→P)→(S→P)), cf. [Łu].] The former (direct) ones include
e.g. cut free sequent calculi which use only such (direct) rules of inference
whose premise formulas are subformulas of the correlated conclusion formulas
(subformula property). It is readily seen that direct calculi provide better
proof-search algorithms. Namely, a natural proof search in an indirect
calculus is based on a given "casual" enumeration of all available formulas to
be treated as cut formulas C (see above) at successive stages of backward
deduction of A, resp. A = B. Now these different cut formulas are (loosely
speaking) mutually incomparable, so that the information gained from one
such formula is useless to the next one. That is, searching for a proof in an
indirect calculus is like searching for a suitable finite path in an infinitely
branching tree (of all formulas).1) In contrast, the subformula property
enables more systematic proof search by using only subformulas of formulas
treated previously. In the above terms, this can be addressed as searching for
a suitable finite path in a finitely (in fact, binary) branching tree.
The case of full predicate logic is more involved. Strictly speaking, the
corresponding cut free predicate calculi don't satisfy the subformula property
anymore. This is because premise formulas may include substitutions of new
eigen-variables (from a given infinite list) not occurring in the conclusion,
whereas the length of a new variable v_i grows proportionally to i. And it
is vital for cut free sequent calculi to operate with infinitely many distinct
variables, as the following simple observation shows (Theorem 1.3.7 below).
(*) There is no algorithmical upper bound to the number of eigen-variables
which must appear in a cut free derivation of a given formula A with
just three distinct variables.
Note that the resulting different substitution-instances (copies) of premise
formulas in question are substantially incomparable, since eigen-variables

involved may serve as Gödel numbers enumerating the correlated infinite


counter-example model. (This becomes evident in the case of predicate reso-
lution approach as being based on a "casual" enumeration of the correlated
Herbrand model.) So, by (*), the corresponding predicate cut free proof
search turns out to be nearly as indirect as proof search in indirect proposi-
tional calculi.2) On the other hand, there are (clearly indirect) equational
calculi to which no cut free counterparts are known. A familiar example is
Relation Algebra of Tarski's, RA (see e.g. [TG]). 3) Now theorems of RA are
(modulo the canonical interpretation) exactly the three-variable (abbr.:
3-VAR) theorems of the correlated modus ponens predicate calculus with
equality in the presence of at most four distinct variables (see [M2]). And
yet RA is very strong both in expression and formalism, since various formal
set theories are interpretable within it (see [TG])4). Hence so is the predicate
calculus with equality and at least four distinct variables. And, by analogy to
formalizations with infinitely many variables, one can't expect the equality to
be responsible for the proof theoretical strength. So the question arises about
cut free formalization of predicate logic without equality in the presence of
finitely many distinct variables that could provide a better proof search for
the whole (1-order) predicate logic. In the present paper I introduce the
appropriate cut free reduction predicate calculi, RPC_n (see below), with at
most n distinct variables, n ≤ ω, and prove the following results.
(A) Nested Hauptsatz (Theorem 3.1) which implies (Corollary 3.1) that for
all n, RPC n is equivalent to the standard modus ponens predicate calculus
in the presence of at most n distinct variables (abbr.: n-VAR logic).
(B) Relative Completeness (Corollary 4.14) stating that for n > 3 , the whole
w-VAR logic is polynomially interpretable in the 3-VAR fragment of RPC n .
For "small" n, RPC n is almost direct, as it nearly preserves "the
subformula property, since by definition all substitutions of variables for
variables are carried out in the underlying domain v0,...,vn_ 1. Loosely
speaking, RPC n is an appropriate nested specification of GentzenVs cut free
sequent calculus (in Schiitte-Rasiowa-Sikorski form) in the restricted
variable domain v0,... , %-1. The two main features of this specification are as
follows. First, RPC n admits different proof search options, as it enables
mutually independent reductions at various subformula-levels of a given
formula. Second, the reductions of RPC n do not involve new eigen-
variables; instead, they operate with suitable "variable deletions". As a
result, the new "reduced" formulas are closer related to their origins than in
the sequent calculus proper. An appropriate implementation of RPCn, for
reasonably small n, would provide new algorithmical (semi)solutions to the
validity problem in the whole 1-order predicate logic. For example, given a
sentence ~ with 5 (say) variables one can search for a proof of ~ in RPClo
(say) and, simultaneously, for a proof in RPC4 of the 3-VAR interpretation
of ~ mentioned in (B). If ~ is valid then, by (A), there is hope to achieve a
desired "fast" solution from the first procedure in case ~ is provable in
10-VAR logic (which itself is very strong), while the second procedure will
anyway provide the corresponding "slow" solution by (B).
138

w PRELIMINARIES. In the sequel n denotes afixed positive ordinal < w.


1.1 Modus-ponens formalism.
1.1.15) The language r 1 includes the following items.
1.1.1.1 Symbols:
9 Individual variables, VAR: v0, Vl, ... , vi, ... for i < n (abbr.: x, y, z, u, w)
9 Individual constants, CON: c1 , ... , Cq (finite possibly empty list)
9 Predicate variables, PRE: VO, V1, ... , Vp (finite nonempty list; abbr.: X,
Y, Z) with the correlated dimension (arity) list a 0 ) 0, al ~ 0, ... , % )_ 0
9 Falsehood: •
9 Implication: -~
9 Universal quantifier: V
9 Parentheses: ( and )
1.1.1.2 Terms, TER: = VAR u CON (abbr.: a, b, s, t)
1.1.1.3 Formulas, FOR (abbr.: A, B, C, F, G, H), are defined as usual:
9 • E FOR
9 Xbl...b k E FOR, k being the dimension of X, b1 E TEL ... , bk E TER
9 If A E FOR and B E FOR then (A~B) E FOR
9 If A E FOR and x E VAR then VxA E FOR
(Below, for brevity, external parentheses are usually dropped.)
1.1.2 The set of free variables of A E FOR, Fr(A) C VAR, is as follows:
9 x E Fr(A) iff some occurrence of x in A is free, i.e. not bound by any
quantifier (so x can be both free and bound in A).
1.1.3 The substitution of a E TER for x E VAR in A E FOR, A[x/a], is
defined recursively:
9 If x ~ Fr(A) then A[x/a]: = A
9 If x E Fr(A) and a E CON then A[x/a] arises from A by replacing every
free occurrence of x by a
9 If x E Fr(A) and A = Xbl...b k then A[x/a] arises from A by replacing
every b i = x b y a
9 If x E rr(A) and A = B~C then A[x/a]: = B[ x/ a]-~C[ x/ a]
9 If x E Fr(A), A = VyB and y r a E VAR then A[x/a]: = Vy(S[x/a])
9 If x E Fr(A) and A=VyB then A[x/y]:=Vx(B[xVy]), where by
definition B[x~y] arises from B by interchanging x and y, i.e. by
replacing all occurrences (free or bound) of x by y and vice versa.
1.1.4 With og'n1 is correlated the modus-ponens predicate calculus M PC n .
1.1.4.1 Axioms of M P C , :
(1)m A~(B~A)
(2)m (A-~(B~C))~((A'~B)-~(A~C))
(3)m ((A~•177
(4)m VxA+A[x/a]
(5)m Vx(A~B)~(A~VxB), if x ~ Fr(A)
1.1.4.2 Deduction rules of M P C n are (MP) and (G):
A A~B
(MP) B
A
139

1.1.5 Note that M PC~ is complete as it coincides (up to the particular


substitution operator 1.1.3) with the familiar formalism of 1-order predicate
logic. Furthermore, for each n > 3 , M P C n in undecidable (see e.g. [TG]).
1.2 Equational formalism.
1.2.1 The language ~-~n2 includes the following items.
1.2.1.1 Symbols:
9 Individual variables, VAIt, as in ~L1nl above
9 IndividuM constants, CON, as in .L,Pn1 above
9 Predicate variables, PI~, as in ~'n 1 above
9 Predicate constants: • (falsehood) and T (truth)
9 Connectives: -~ (negation), ^ (conjunction) and v (disjunction)
9 Quantifiers: V (universal) and 3 (existential)
9 Parentheses: ( and )
9 Equality: =
1.2.1.2 Terms, TER: = VAR u CON (abbr.: a , b , S , t ), as in ~ n 1.
1.2.1.3 Literals, LIT (abbr.: L ):
9 Xbl...b k E LIT and ",Xbl...b k E LIT, Xbl...b k being as in ~ n 1
1.2.1.4 Formulas, FOR (abbr.: A , B , C , F , G , H ) are as follows:
9 •
9 LIT C FOR
9 If A E FOR and B C FOR then ( A ^ B ) E FOR and (AvB) E FOR
9 If A E FOR and z E VAR then VxA E FOR and 3xA E FOR
1.2.1.5 -~A is defined on FOR recursively by duality:
9 -~• : T and 7T: : &
9 ~Xbl... bk : = X b l . . . b k
9 ~(A ^B): = , A v - ~ B and -~(AvB): = ~ A ^ , B
9 -~3zA: = Vz-~A and "~VzA : = 3z-~A
1.2.1.6 Equations, BQU: A -- B for A E FOR and B E FOR
1.2.2-1.2.3 Fr(A) and A[z/a] are defined just as in J l (see 1.1.2-1.1.3
above) with respect to -~, ^, v (instead of-~) and V, ~ (instead of Y). A
subformula-occurrence in F E FOR is any distinguished positive occurrence of
G E FOR in F, i.e. a one that is not bound by negation in the case G=
Xbl...b k. (Thus Xbl...b k is not a subformula-occurrence e.g. in ~Xbl...b k v A.)
1.2.4 With ~ 2 is correlated the equational predicate calculus E P C n
being a natural extension of the familiar formalism of boolean algebra.
1.2.4.1 Axioms of EPCn:
(1)e A = A
(2)e A v B - - B v A and A h B = B h A
(3)e Av(B^C) = (AvB)A(AvC) and AA(BvC) = (AAB)v(AhC)
(4)e Av• A and A^T = A
(5)e Av'~A = T and A ^ n A = •
(6)e 3xAvVx(AvB) = 3xAvVxB
(7)e 3 z A v A [ x / a ] = 3xA
(8)e 3zA = A, if x ~ Fr(A)
1.2.4.2 Deduction rules of EPC n are (S), (K) and (T):
140

A -- B
(s) B=A
A=B F =G A=B F =G A--B A=B
(K) A vF=BvG A ^F=B^G 3zA=3zB VzA=VzB
(T) A = C A -- B C = B
1.2.5 It is easily verified that EPC~ is complete and hence equivalent to
MPC~modulo provability and the canonical interpretation (see below).
1.2.6 Canonical translation. In the language ~n 1 set
9 T : : .LqJ.., -~A:=A~.L, A v B : = ' ~ A ~ B , A^B:=-~(A~',B), 3xA: = - V x - A
In the language 0or set
9 A~B: = -~AvB
1.2.7 Canonical interpretation. Keeping 1.2.6 in mind, for any formula A
let EI~U(A) denote the equation A = T. Conversely, for any equation A = B
let F{}R(A = B) denote the corresponding formula A H B , i.e. (A~B)^(B--+A).
1.2.8 THEOREM.I f MPC. proves a formula A then EPC. proves EQU(A).
Conversely, if EPC n proves an equation A ---- B then MPC n proves
FI]R(A ~ B). Moreover both proof transformations are carried out
by polynomial-weight algorithms (: the weight of output is polynomial
in the weight of input; r weight of a formula/equation/proof etc. is
the total number of its symbol-occurrences).
PROOF. These proof thransformations are defined by straightforward recur-
sion on the length/depth of proofs in MPC n and EPC n respectively. (The
associative law for ^, v is derivable from (1)e-(5)e, see [H].) (4)m is proved
by (7)e, (5)m by (6)e and (8)e , (G) by (8)e, (Me) by (T). This is routine. []
1.3 Sequential form..alism.
1.3.1 In the language ~n 2 define sequents, SE{] (abbr.: F, A, E, II, 2),
as finite strings of formulas (possibly empty), i.e. expressions D1,...,D k ,
k > 0. In particular, FOR C SEQ.
1.3.2 The corresponding sequent calculus SPC n is as follows.
1.3.2.1 Axioms of SPC.:
(1)s F,T,~
(2)s r,L/~ ,-~L, 2
1.3.2.2 Deduction rules of SPC.:
F,A ,B,~
(D) r,AvB,r. (C) F,A,2 F,A ^B,S F,B,2

(s) r,A[x/a] ,]xA,r, E,A[x/y] ,11 I y ~ Fr(r,V xA,2)


r,SxA,S (u) F,VzA,S' if t Z fi r , II C_ S
(CUT) C,A ~C,.A_
A
1.3.3 It is known that SPCo is complete - in fact, even without (CUT)
and with E = F, II= E in (U) - and hence equivalent to both MPC~ and
EPC~ modulo provability and the canonical interpretation (see below).
1.3.4 Canonical interpretation. Keeping 1.2.6 in mind, for any given
sequent A = D o , . . . , D k let FOR(A) and E{]U(A) denote the corresponding
formula D0v...vD k and equation D0V...vD k = T, respectively.
141

1.3.5 TIIEOR.EM/ f SPCn proves a sequent A then MPC.proves FORiA)


and EEC n proves EQIJ(A). / f MPC n proves a formula A then
SPC n proves A, and if EPC n proves an equation A = B then
SEC n proves FOR(A = B). All these proof transformations are
carried out by polynomial-weight algorithms.
PROOF. In view of Theorem 1.2.8 it suffices to prove the equivalence between
SPC n and EEC n. The proof thransformations are defined by straightfor-
ward recursion on the length/depth of proofs in EEC n and SPC n respecti-
vely. We sketch the latter one showing that S P C , is not weaker than EPC n.
First note that S P C , minus (CUT) proves (2)s for any formula A instead of
L. Second, in SPC n minus (CUT) is admissible the rule of weakening

(W) r,A,2
Having this, the axioms (1)e-(5)e are readily derivable in SPCn-(CUT):
Furthermore, (6)e is derivable by (C) and (D) followed by (U), (D) and (E).
(7)e and (8)e are derivable by (D), (C) and (E). The rules (S) and (K) are
derivable by (C), (D) and (W) (see above). The transitivity rule (T) is
derivable by (CUT) - this is the only passage which requires (CUT). []
1.3.6 Recall that (CUT) is superfluous in SPC~ (Gentzen's Hauptsatz),
although its counterparts (MP) and (T) are vital for MPC~ and EPC~. In
contrast, for each finite n, SECn-(CUT ) is decidable and hence, for n>3,
dramatically weaker than SPCn, MEC n mid/or EPCn, as shows the follo-
wing theorem (see also 1.1.5 and Theorem 1.3.5).
1.3.7 TtIEO~EM.
(1) For n ~ , the set of formulas provable in S P C n - ( C U T ) is decidable.
(2) There is no effective upper bound to the number of distinct vari-
ables occurring in cut free SPC~-proofs of valid formulas contai-
ning just three distinct variables.
Ptt00F. Consider the canonical proof search tree, T, of a given formula A in
SPCn, n_< w. So A is provable in SPC n iff every (linear) branch of T con-
tains T or both L and -~L for some literal L. For every branch of T consider
the set of all formulas occurring in it (call it clause). If only finitely many
distinct variables occur in T then there are finitely many different clauses as
well, since any formula in any clause arises by carrying out substitutions
[x/a] in a certain subformula of A. Moreover, we can redefine T such that
the variables form an initial segment of VAlt. As a result, the (finite) set of all
clauses of T is effectively determined by A and the number of variables
involved. This proves (1). (2) follows by the completeness of SPC~ and the
undecidability of the 3-VAR restricted validity (the latter holds e.g. by
3-VAR translation of Post Correspondence Problem; see also [TG]). []
1.3.86) Theorem 1.3.7 holds in various extended variants obtained by
adding new "direct" rules such as e.g. (U +) and/or (S) which guarantee that
S P C n - ( C U T ) is closed under contraction and/or specification, respectively:
s ,VzA,E]
VzA,II [ygrr(r,VxA,S), ~, C_r, n C_S]
(U+) =_,A[z/y]r,VzA,s, (S) r,A[xla] ,S
w REDUCTION CALCULI. That (CUT) is vital for SPC n for finite n is
~42

due to the deterministic mode of sequent calculi under which formulas are
evaluated in the "prescribed" reduction order such that subformulas are
treated later than formulas in which they occur. Now I switch to term rewri-
ting, as it provides mutually independent reductions at all subformula levels. 7)
Instead of eigen-variables (y from (U) in 1.3.2.2) a different idea of "variable
deletions" is employed. The resulting formalisms are called reduction calculi.
2.1 In ~,~2, define operations A[:--z] and A[x§ of "variable deletion":
9 .L[§ = .L and T[§ = T
L, if xi~Yr(L)
9 L[bz] : =
• if xEFr(L)
9 (BvC)[§247247 and (BAC)[§247247
9 3yB[-rx]: = x= . VyB , if x= y
. {3yB ,if y and V y B [ v x ] : = { V y ( B [ §
3y(B[§ if z # y
,r . ~ /A ,ifx=y
9 A t x v y j : = ~ A['y], if x r y
For any E = G1,..., G k E SEQ let E[x§ Gl[x§ Gk[x§ C SE{].
2.2 Below, for brevity, parentheses in the left-associative disjunctions are
dropped. Now let FvE:=FVGlVG2v...VGk, AVE:=HlV...VHlVGIV...VGk,
etc., where E = G1, G2,...,G k and A = HI,:..,H 1. So e.g. A v B v E v C actually
stands for (...((AvB)v G1) v...)v Gk)V C. Let E range over FOR*: = FflRu { ~ } .
2.3 8) Reduction calculus RPC n . The language of RPC n is ,L/n2.
2.3.1 Rewriting rules of RPCn:
(1)r EVTVE -~ T , if E u E r
(2)r EvLvAv-~L 4 T
AAT 4 A
(3)r ThA 4 A
(4)r Av(BvC ) 4 AvBvC
(5)r EV(AhB)VE "* ( E v A v E ) ^ ( E v B v E ) , i f E v E r
(6)r 3xA -~ 3xAvA[x/a]
(7)r EvVxAvE 4 EvVxAvA[§247247247
2.3.2 Let G be any subformula-occurrence in F, G-* H be any rewriting
rule (i)r, l _ i_< 7, which is sometimes indicated by G i'* H. Let F[G/IH], or
just F[ G/H], arise by replacing G by H in F. This operation is called reduc-
tion and denoted by F-~4 F[G/iH], or just F ~ F[G/It~. Any sequence
F 0 - ~ F 1 -*-* ... ~ Fk, k > 0 , is called a reduction chain. Let A ~ B
express that A reduces to B by some chain A = F 0 -.4 F 1 ~ ... -.-, F k = B.
2.3.3 A is provable in RPC n (abbr.: RPCn ~- A) if A ~ T holds. The
correlated chain A = FO-*-*F I - ~ . . . - ~ F k = T is called a proof of A in RPC n .
2.4 THEORE~I.//RPCn proves A then F P C n proves EQU(A), i.e. A = T .
The correlated proof transformation is carried out by a polynomial-
weight algorithm. (The weight of a reduction chain is the total
number of its symbol-occurrences.)
LEMMA. / f EPC n proves G ---- H then F P C n proves F = F[ G/H].
Furthermore, F P C n proves G = H for every reduction G i-~ H,
l<_i<_Z The correlated proof and "equation ~ proof' transformations
are carried out by polynomial-weight algorithms.
143

PROOF. The theorem is an immediate consequence of the lemma.


The former claim of the lemma is a familiar property of equational calculi (it
is easily proved by induction on complexity of F), Let G i4 H be any
rewriting rule of RPC n . If i < 6 then clearly G = H is provable in boolean
algebra and hence in EPC n. Now let i>_6. Since H always includes G as
subdisjunction, it suffices to prove -~HvG = T.
(6)r: The required equation ( 89 -- T readily follows in
boolean algebra from "~A[x/a] v3xA = T which in turn follows from (8)e.
SUBLEMMA 1. EPC a proves "~B[§ v B[x/a] = T, in particular 7B[+x] v B
-- T, and hence ~B[§ vVzB = T.
The last equation of the sublemma follows from the second one by (6)e and
(8)e: The first equation is proved by straightforward induction on the
complexity of B (left to the reader).
SUBLEMMA 2. EPC n proves VxB -- Vy(B[x/y]) if y ~ Fr(VxB).
This follows from 3z-~BvVy(B[x/y]) -- r and VzBv3y-~(B[x/y]) = T in
boolean algebra, and hence, by. (8)e, from 3y3x-~BvVy(B[z/y]) = T and
VxBv 3x3y~(B[z/y]) -- r which are readily derivable by (4)e-(8)e.
(7)r: For brevity let E = O. The required equation
(-~E^3x~Ah-~A[§ ^3y(~E[§ ^-~A[x§ [x/y]))v EvVxA = T
follows in boolean algebra by Sublemmas 1-2, (6)e, (8)e and (4)e from
3y(~E[§ ^~A[x§ [x/y] ) v EvVzA -- T
3y(-~E[§ ^~A[x§ [x/y] ) v E[§ vVx(A[x§ -- T
3y(-~E[§ ^ ~A[x+y] [x/y] ) v 3yE[§ vVy(A[z§ [z/y]) --- T
~y(-~E[§ ^-~A[x§ )vVy(E[§ vA[x§ Ix/y]) -- T
whose last equation is an instance of (5)e. This completes the whole proof. []
2.5 Reduction calculus R P C Q , . It will be shown that RPC n is equivalent
to M P C n , EPC n and/or SPC n . To this end, first prove the equivalence with
respect to the expanded reduction calculus, RPCCn, that extends RPC n by
adding the following rewriting cut rule
(8)r d -* (Av C ) ^ ( A v T C )
2.6 THEOREU. If F P C n proves A = B then RPCC n proves FOR(A = B),
i.e. (TAvB)A(TBvA). The correlated proof transformation is carried
out by a polynomial-weight algorithm.
COROLLARY. MPCn, EPCn, S P C n and RPCCn are equivalent in the
sense of provability. The correlated proof transformations are carried
out by polynomial-weight algorithms.
The corollary follows by Theorem 1.3.5, since A -- (AvC)^(Av-~C) is
provable in boolean algebra.
2.6.1 LEMMA. RPC. proves EvAvAv-~AvT. (AEFOR, EEF{}R+, A,~ESEQ).
Thecorrelated "formula H proof' transformation is carried out by a
polynomial-weight algorithm.
PROOF OF LEMMA. Observe that by (1)r we can assume E= ~ = O. That
A v A v-~A reduces to T is proved by induction on the complexity of A. The
case A = L is clear by (2)r. The case A = (BhC) follows from I.H. by (4)r ,
(5)r , (2)r and (3)r. The case A = (BvC) is analogous. Let A = 3xB. Reduce
3 x B v A v V x T B by (7)r to qxBvAvVx-~Bv-~B[§ vVx(3xBVA[§ v~B), and
144

then, by (6)r (a:=x),its subformula 3xBvA[§ v-~B to 3xBvBvA[§ v~B


that is provable by I.H. So Vz(3xBv A[-: x] v-~B) ~ VXT 4 VZT v T vVxT 4 - ~ T
by (7)r and (1)r , which yields the result by (1)r. The case A = VxB is analo-
gous. The required polynomial growth is readily seen. []
PROOF OF THEOREM. Argue by induction on the length/depth of the proof
of A = B in EPC n. We must prove (-~AvB)^(-,BvA) . ~ c T, where .,.,c
and .~,.,c refer to the reducibility in RPCC n . To this end, by (3)r , it suffices
to reduce both formulas -~AvB and -~BvA to Y. For boolean axioms A = B
this follows from the lemma and (3)r-(5)r. For (6)e-(8)e we "argue
analogously by also using (6)r-(7)r (where a:=z for (6)e , (8)e). We still have
to prove that the equational rules are admissible/derivable in RPC n. I.e. if
FOR(A i = Bi) 4 - ~ c T for all premises A i = B i of a given rule (S), (T) or
(K) with the conclusion A -- B, then FOR(A "-- B) 4 - ~ c m. Now (S) and (K)
are easily derivable by (1)r-(7)r. (T) is derivable by (8)r (this is the only
passage that requires (8)r). Namely, by the assumptions, -~AvC, Bv~G,
, B v C , Av-~G, and hence -~AvBvC, AvBv-~G, -~BvAvC, -~BvAv-~G all
reduce to T. Hence by -~AvB 8"* (-AvBvC)A(-~AvBv-~C) we get - A v B
.~"*c r and similarly -~BvA .~.,c T, which yields the result by (3)r. This
completes the proof. The required polynomial growth is readily seen. []
w NESTED HAUPTSATZ. Loosely speaking, GentzenWs Hauptsatz says
that (CUT) is derivable from the appropriately chosen direct rules. In
RPCCn, the rewriting rule (8)r resembles (CUT), except that it applies to
all subformula levels. As a sequential version of (8)r could serve the following
nested cut rule (CUTN), where F[A/AvC] and F[A/Av-TC] denote repla-
cements, in F, of A by AvC and Av-~C, respectively (cf. 2.3.2 above).
(CUTN) F,F[A/AvC],E F,F[A/Av20],r,
P,F,~
By the completeness, (CUTN) is admissible in S P C ~ - ( C U T ) . The correlated
constructive proof is merely a variant of Gentzents original proof of the
Hauplsatz. The corresponding Nested Haup~salz (Theorem 3.1) says that for
all n, (8)r is admissible in RPC n . The proof (3.10 below) is more involved.
3.1 THEOREM.Any formula provable in RPCCn is provable in RPC n .
COROLLARY. MPCn, EPCn, SPCn and RPCn are equivalent modulo
provability.
The corollary obviously follows by Corollary 2.6.
3.2 A rewriting rule G "* H is called admissible in RPC n if for any F with
a subformula-occurrence H, R P C , F F implies RPC n F F[H/G]. Note that,
by definition, all basic rules (1)r-(7)r are admissible in RPC..
3.3 LEMMA. The rewriting rules (9)r-(19)r are admissible in RPCn.
(9)r E -~ .1.
(10)r Ev Av~ "* A
(ll)r A "* AAB and B -* AAB
(12)r A v B v C 4 Av(BvC)
(13)r A[x/a] 4 A[+z]
(14)r AAB "* BAA
(15)r (AAB)AC 4 A^(BAC) and AA(BAC) -* (AAB)AU
145

(16)r AAA -* A
(17)r EvBvAv~ -* E V B v A v B v ~
(18)r AvB "~ BvA
(19)r (AAC)v(BAC) -* (AvC)AC and (CAA)v(C^B) -* C^(AvB)
PROOF. (9)r-(12)r, (14)r-(15)r , (18)r: The proof runs by straightforward
induction on the proof length, i.e. the length of the reduction chain F-*-*-* T
(see 2.6.3 and 3.2). (Note that (18)r follows directly from (4)r , (10)r and
(17)r.) (16)r: Argue analogously by using (14)r-(1D)r in case A being the
"major" formula of (5)r. (13)r: Straightforward induction on the complexity
of A by also using (9)r-(12)r. (17)r: analogous. Note that if B= VxA is the
"major" left-hand formula of (7)r , then the required contraction is obtained
by applying (7)r twice. (19)r follows directly from (5)r , (10)r and (15)r. []
3.4 LEMMA.The rewriting rules (20)r-(23)r are admissible in RPC n .
(20)r - 3xz
(21)r
(22)r VxA 4 A['x]
PROOF. (The admissibility of) (20)r is readily seen by induction on the proof
length. (21)r , (22)r are consequences of (6)r , (7)r together with (10)r. []
3.5 Denote by A - B that A and B are equal modulo 1-1 renaming of
bound variables (without affecting the free variables involved).
LEMMA. The rewriting rules (23)r-(2~) r are admissible in RPC n .
(23)r Vxg -* Vy(A[x/y]) if y~Fr(d)
(24)r B -* C , i f B -
PROOF. (23)r follows directly from (7)r and (10)r. (24)r is proved by double
induction on (first) the 3-depth of B and (second) the proof length. Here by
definition the ~-depth of B is maximal length of strings B1, ..., Bi, Bi+l, ...
such that each B i is an ~-formula (i.e. begins with ~), each Bi§ 1 is proper
subformula of B i and B 1 is a subformula of B. So no reduction can increase
the 3-depth. Note that if 3-depth of B is 0 then (24)r follows from (23)r. []
3.6 Let Fr*(A) denote the subset of Fr(A) that is obtained by not
counting (hereditarily) those free variables y from B[x/y] which appear in
disjunctions together with 3xB (to the effect that the rewriting rule (6)r does
not increase Fr*(-) of the right-hand formula). Let A = d B express that A
equals B modulo associativity and commutativity of v. Let A ~ B abbreviate
taht A is a subformula-occurrence in B. Let A -~0 B denote A ~ B provided
that A appears in the boolean part of B, i.e. outside any quantifier-domain.
3.7 Let (a)r be the following list of rewriting rules.
9 A -* B , i f A = d B
9 A V V z ( B [ A / ~ ] ) 4 VzB , if z ~ Fr*(A) and A ~0 B ("splitting")
The a-reducibility, G a-*--~ H, is defined by analogy to 2.3.2 with respect to
the rewriting rules (a)r instead of (1)r-(7)r. Note that every rule from (a)r is
admissible in RPC n . This follows from the previous lemmas and, in the case
of splitting rule, from (7)r , by using extra (6)r when dealing with Fr*(-).
Thus if B a - ~ 4 C, then R P C n I- F implies RPC. I- F[C/B].
3.8 LEMMA. Let B=AoVVxlAIVAlV...VVXkAkVA k ~-~4-~ C where C "~ F,
B~=AoVAlVAlV...VAkVAk and R P C n F F. Then R P C n I- F[C~B~].
!46

In particular, the rewriting rule ~25)r is admissible in RPC~.


(25)r A -~ VxA
PROOF. The proof runs by induction on the proof length by using (admissi-
ble!) rules of Lemma 3.3 and (24)r (in order to handle structural differences,
contractions, deletions and renamings of bound variables) as well as (22)r. []
3.9 LEMMA. The following rewriting rule (26)r is admissible in RPC n .
(26)r 3x(G[+x]AA) 4 G[Sx]A3xA and 3z(AAG[§ ..* 3xAhe[§
PROOF. The proof runs by induction on the proof length; (24)r is crucial in
the case 3zA being the "major" formula of (6)r (we omit the details). 9) []
3.10 Proof of Theorem 3.1, By (5)r and (9)r , it suffices to show that the
following rewriting rule is admissibile.
(27)r • "+ C^-'C
The admissibility of (27)r is proved by induction on the complexity of C.
3.10.1 Let C C {_L,Y}. The corresponding instances of (27)r
_L --~ • and I -~ TA-I.
are obvious consequences of the admissibility of (11)r.
3.10.2 Let C E (L,-,L}. The corresponding instances of (27)r are
_ L 4 L A T L and •
Without loss of generality consider the former rule. Let LA~L ~_ F-~-*-~ T. By
using admissible "inversions" from Lemma 3.3 we rearrange this reduction
chain such that every "minor" predecessor of the occurrence LA-,L in
question has the form (Ev/[~l]...[(q] VE)h(Ev(-,L)[~I ]...[~q] vE) where [~i]
= [§ or [xj/Xk] , since literals can be properly reduced only by (2)r. (Pre-
decessors are formulas on the right-hand side of rewriting rules (1)r-(7)r.)
Now replace by • in the modified chain, all these L[~] -~ and (-,/)[~]-~. The
chain remains correct in the sense of 2.3.2 if they were not involved as the
"major" literals of (2)r. Otherwise, consider the "major" reduction of the
first (say) conjunctive component EvL[~]-~vAv(-,L)[~] -~ -~ y. Thus ~L[~] -~
appears in the second component no___~tas "minor" predecessor to be replaced
by • Hence this reduction can be "repaired" by successively applying (17)r ,
(11)r , (16)r. So this case also does not affect the correctness of the reduction
chain. This yields F[LA',L/zA• ~ T and hence the result by (ll)r. []
3.10.3 Let C E { A v B~ A h B). The corresponding instances of (27)r are
.L -* (AvB)A(-,AA-,B) and • -~ (AAB)A(-,Av-,B)
Without loss of generality consider the former rule. Its admissibility follows
by I.H. from the admissibility of (ll)r, (17)r and (19)r:
• -~ • v • -* (A A-,A) v (B A-,B) -~ (A A=A A-,B) v (BA-,A ^ ~B) -* (A v B) ^ (-,A A~B)
3.10.4 Let C E {3xA, VxA}. The corresponding instances of (27)r are
• -~ 3xAAVx-,A and • -~ VxAA3z-,A
Without loss of generality consider the former rule. Its admissibility follows
by I.H. from the admissibility of (20)r , (25)r and (26)r :
• -~ 3z• -* 3z(AA-,A) -~ 3 z ( A A V ~ A ) -~ 3zAAVz-,A. []
3.11 REMARK.As in the familiar sequential case, this nested cut elimina-
tion theorem has non-elementary growth of the proof weight (see [May]).
w RELATIVE COMPLETENESS. For each n < x, n - V A R logic is weaker
than (the whole) ~-VAR logic, i.e. there are valid n - V A R sentences not
147

provable in n-VAR logic (cf. [TG] for references). In this chapter I show
that nevertheless, for each n>3, n-VAR logic is complete modulo interpre-
tation. More precisely, there is a polynomial-weight interpretation, p.*, of
w-VAR language into 3-VAR language such that A is valid iff p,*(A) is
provable in IVlPC., EPC. and/or RPC. for any fixed n>3. This generalizes
the corresponding results of [TG] and [M3].l~ For brevity, here only the
language of binary relations is considered. (This language is universal by the
canonical translation into the language of set theory; by familiar interpreta-
tions, this Mso applies to the ones with functionM symbols and/or equMity.)
4.1 Let .L/R,n be the language ~.~n2 (schema) provided a i = 2 for all i_< p
(cf. 1.1.1.1, 1.2.1.1 above). For every particular l a n g u a g e / ~ E ~ffR,~, let ~3
E ~R,3 be its 3-VAR fragment, and let /~w+ E ~ , w and ~3 E R,3 be
their expansions by the four new binary relations Vp+l, Vp+2, Vp+3, Vp+ 4
(cf. 1.1.1.1, 1.2.1.1 above) which are denoted by U, P, Q and R, respectively.
(P, Q express the basic projections as A, B from [TG], R, U the appropriate
"congruence" and "universe".) The basic variables v0, v1 and v2 are denoted
by x, y and z, respectively. As usual, xZy or x~Zy stands for Zz, y or "~Zz,y.
For any formula A of ~ffR,~, set IFr(A)= {i: viEFr(A)}. In view of 1.2.6-
1.2.8, 2.5, 2.7, 3.1, both formalisms, MPC n and EPCn, and their languages,
are used simultaneously, when dealing with the n-VAR logic.
4.211) In ~3 +, let ~r E FOR be the universal conjunction of (1)-(8):
(1) 3x(xux)
(2) z-,Px v z-,Py v xPy
(3) z-~Qx v z~Qy v xQy
(4) 3z(zPx A zRy)
(5) xRx
(6) x-,Ry v yRx
(7) x-,Ry v y~Rz v xRz
(S) x'-Ry v ~L v L[x/y], for all L= ziVkYi and zi,VkY j (i,j< 2, k< p+4)
4.3 In ~3 +, define the following abbreviations.
xP+y: = zPy v (xay A Vy(z-,Py)) and xQ+y: = xQy v (xRy A Vy(x~Qy))
Pi: = Q+(i)| i.e. zPoy : = zP+y and zPi+lY := 3z(xQ+z A zPiy )
(e.g. xP3y : 3z(xq+z A :]x(zq+x A :Iz(xQ+z A zP+y)))) )
4.4 Let K be any finite set of natural numbers. Define the relation "~'K" by
{&ieK(~z(xPiz A yPiz)), if K r
9 x~KY:= T , ifK=O
4.5 Let I~ : FOtL(s +) H FOR(L~3+) be the auxiliary mapping defined recur-
sively as follows. Note that rr(l~(A)) C_{x} holds for each A. (t~ corresponds
to the mapping MAB from [TG] .)
9 ~ ( T ) : - - T and IJ,(.L): = _L.
9 ~(L): = L, if Irr(L) = O.
9 I.I,(L):= 3y(xPiy AL[vi/y]), if Irr(A) = {i}.
9 [~(L): = 3y3z(xPiy ^ xPjz ^ L[vi/y ][vj/z]), if Irr(A) = {i,j}, iCj.
9 [~(AvB): = [J,(A)v It(B) and [J,(AAB): = p,(A)A IJI,(B)
9 I~(qviA): = 3y(x~Ky a I~(A)[x/y]), if K = Irr(qviA).
9 I~(VviA): =Vy(x*Ky v l~(A)[x/y]), if K = IFr(VviA ).
i48

4.6 LEMMA. The formulas (9)-(19) and (9)-(20) are provable in MPC 4 + 7
and MPC~ + 7, respectively, for any iE~] and I,J Cfi n ~.
(9) xRy ~ x~ I y
(10) z~iz
(11) z,~iy -4 y ~ i x

(13) z~ Iy-* z~jy, if J C_ I


(14) (zP+x ^ zP + y) -~ zRy and (zQ+x ^ zQ + y) -~ xRy
(15) 3z(zP§ ^ zQ+y)
(16) 3y(xP+y) ^ 3y(xQ+y)
(17) ^ zr, iy) -,
(18) 3y(xPiy ) ^ 3x(xPiy )
(19) z ~ i n j y - ~ 3z(z~ix ^ z~jy)
(20) 3vj(~iei(vjPivi)), i f j ~ I
PROOF. The provability of (9)-(20) in MPC~ + 7 is clear by the comple-
teness of MPC~. That (9)-(18) are provable in MPC 4 + 7 is readily seen
by induction on i, since M PC 4 proves that relation composition of two func-
tions is a function (cf. e.g. [TG], [M1]). That (19) is provable in MPC 4 + 7
follows by induction on cardinality of I u J by using (9)-(16). []
4.7 LEMMA. The formulas (21)-(2~) are provable in MPC 4 + 7, where
I : IFr(A), J = IFr(A)~{i} and K= IFr(A)-.{i,j}.
(21) x ~ i y - ~ (p(A) r t~(A)[x/y])
(22) ( x ~ j y A YPiq) -~ (I.i'(A[vi/ej]) ~ I~(A)[x/y])
(23) ( x ~ j y ^ 3z(yPiz a xPiz)) -~ (l~(A[vi/vj]) ~ I~(A)[x/y])
(24) (x~Ky^3z(yPiz^xPiz) a3z(yPjzAxPiz))-~(IJ,(A[vi~vi])Ht~(A)[x/y])
PROOF. The proof runs by simultaneous induction on the complexity of A. If
A E {T,• then the required equivalences readily follow from 4.5 by
(9)-(13). As for the induction step, the propositional cases are readily seen
by the same token. For A = VvrB, 3vrB this is routine by heavily using (9)-
(19). (24) is used in the proof of (23). []
4.8 TIfEOR,EM. For any A of~,~ +, MPC~ F A implies MPC4 + iT F- I~(A).
PROOF. It suffices to prove in MPC 4 + 7 or in EPC4 + EI~U(7) the II,-inter-
pretation of the axioms and rules of EPC~. Boolean cases are trivial. The
rule (K) for 3xA and VxA is readily interpretable by (6)e. For (7)e this
readily follows from (22)-(23), for (8)e by (21). For (6)e this follows from
(19) and (21). Namely, to prove in EPC 4 + EQU(7) the I~-interpretation of
3viA v Vvi(AvB ) -- 3viA v VviB , it suffices to prove in MPC4 + 7 the
implication ~l,(3viA v Vvi(AvB)) ~ ~(3viA v VviB ). So let K: = IFr(VviA ),
J: = IPr(3viB), I: = IFr(3vi(AhB)). Hence I = K u J and I n J = J. By 4.5, it
suffices to prove a formula (Vy(x ~I(Y'~ ~lJ'(A )[x/y] ) A3y(x ~3 Y^ ~lJ'(B) [x/y] )) -*
3y(x~iya-,l~(A)[x/y ] V-,l~(B)[x/y]). By (19), 3y(x~ay a ~l~(B)[x/y]) implies
3y3z(z~ix ^ z ~ j y ^ x ~ j y A -~(B)[x/y]), which by (11), (13) yields
qy3z(x~iz ^ x~Kz ^ y~az A x ~ j y ^ -,~(B)[x/y]). From this we arrive at
3y3z(x~izAy~jz^-,l~(A)[x/z ] A',l~(B)[x/y]) which yields the result by (21). []
4.9 TIfE0g, EM. For any A of ~,~ +, the following are provable in MPC~ +iT,
where jJ~K= IFr(A), and ~ksi( F k = T, if K= O.
149

AHVvj(& kei((VjPkVk)~ ~(A)[x/vj]), Ar 3vj(~ keK(vjPkVk)h~,(A)[x/vj])


PROOF. Since MPC~ is complete, this easily follows by "informal" model
theoretical induction on the complexity of A from Lemmas 4.6, 4.7. []
4.10 COItOLLAX.Y.For any s A, EPC~ + EQ[/(9c) ~- A ---- p,(A).
PROOF. Since K = O, MPC~ + F k A - ~ ( A ) and MPC~ + 7 F- ~(A)~A
by (8)e and the first and the second equivalence of 4.9, respectively. []
4.11 For any A E FOR(L:~), let Q(UIA ) e FOR(s +) be the canonical rela-
tivization of A to the "universe" U (see 4.1 above). I.e. Q(U[A) arises from
A by successively replacing every subformula VxB and every subformuIa 3xB
b y Mz(x-~Uz v B) and 3x(xUx AB), respectively.
4.12 Define the desired interpretation ~*. For any sentence A of L:~, let
9 I~*(A): = -~:f v I~(~(U ] A)), i.e. ~: -* I~(~(U I A))
I~*(A) is a formula of /:3 +. It is clear that the weight of I~*(A) is merely
quadratic in the weight of A. It remains to prove that ll,* is an isomorphism
with respect to the provability in M P e w and MPC4, respectively.
4.13 TIfEOItEI[. For any s A, MPQ~ ~- A iff MPQ 4 ~- p~*(A).
PROOF. Assume MPC~ ~- A. Hence ~ A. Hence ~ Q(UIA), since U does not
occur in A. By the completeness of MPC~, MPC~ ~- Q(UIA), which by
Theorem 4.8 yields MPC 4 + ~" F- p~(Q(U I A)) , and by the deduction theorem
(which holds in modus ponens calculi with (1)m-(2)m) MPC4 F- ~*(A). Now
assume MPC 4 ~- p~*(A). In particular MPC~ ~- p~*(A). By (MP) this yields
MPC~ + 7 k II,(O(UIA)) , and by Corollary 4.10, MPC~ + :T F O(c A),
and by the deduction theorem MPC~ [- ~c ~ Q(UIA)" Hence h 7-~ Q(U A).
Therefore Q(U I A) holds in every infinite structure S= < M I V0,...,V p ,Vp+l;
Cl,...,c q >, since S extends to S+= <M I V0,...,Vp,Vp+l,Vp+2,Vp+3,Vp+4;
cl,...,c q > such that VD+I, Vp§ Vp+ 3 and Vp, 4 fulfill the requirement (1)-
(8) of Y, where Vp+I~O corresponds to U. Now O(UIA ) holds in S +, and
hence in S, since P, Q and R don't occur in Q(UIA ). Furthermore, every T=
< M I V0,...,VD;Cl,...,c a > extends to an infinite structure S, as above, such
that T= S I{xEM: xUx}. So O(U|A) holds in S. Since U does not occur in
A, A holds in T. So h A, and MPC~ k- A by the completeness of MPC~. []
4.14 C01tOLLhltu For any s A, n>3, ~ A iff RPC n ~- iJ,*(A).
PROOF. This follows from Theorems 4.13, 3.1 and Corollary 2.6, since RPC n
are obviously monotone on n. []
4.15 REMAI~K.By the appropriate canonical set theoretical interpretation,
4.13 and 4.14 are true of the particular language/:3 of "minimal" signature
of one binary predicate (and no individual constant). Thus the whole logic is
isomorphically (modulo provability) embeddable into the 3-VAR fragment of
4-VAR logic, and hence of RPC4, in the minimal language of set theory. $

1) Cf. [I~]: p.30: I do not know of any other method of finding proofs in the
Propositional Calculus than the method of "trial and error".
2) The unification approach does not essentially change the situation. It
merely provides a more economical enumeration by deleting some repetitions.
3) There are known sequent calculi for relation algebras, which include cut
rule as counterpart of modus ponens, see [M1]. Their cut free fragments are
150

dramatically weaker than the correlated modus ponens formalisms.


4) In [TG], the underlying 3-VAR logic includes equality, as well as (the
canonical translation of) the associativity of relation composition (A|174
A| not derivable in the standard 3-VAR logic, see [TG]: p.89, [M1].
5) All formalisms below can also be enriched by unary function symbols.
6) This also holds for sequent calculi restricted to analytic cuts only.
7) The motivation should be more exhaustively discussed in the context of
automated theorem proving, but this would exceed the present framework.
8) In a slightly modified form, RPC n includes the following rewriting rules
(R1)-(R6) , where ^{E} and v{~} stands for conjunction and disjunction of
all formulas (arbitrarily ordered) from ~, respectively.
(R1) V{P,,T} 4 T, if~r
t -,
(R4) v{~,AAB) 4 (v{~,A}) ^ (v{~,B}), if~:O
(Rs) 3xA -* 3xA v A[x/y]
(R6) v{~,VzA} 4 v{E,VzA,A[§247247
9) In this lemma is used the special form of the rewriting rule (7)r that
allows the renaming of bound variables. Otherwise, one could weaken (7)r by
replacing y by x and dropping [x§ in the right-hand formula.
However, the resulting calculus would not fulfill the Hauptsatz. For instance,
it would not prove the associativity of relation composition.
10) Cf. [TG]: pp.89-90, [M3]: Theorem 23. Now Theorem 4.13 shows that
the whole predicate calculus, and not merely set theory, can be formalized in
the standard 3-VAR logic without equality. Moreover, the minimal 3-VAR
logic in question can be presented in a variable-free form e.g. by setting
Pij:= viEv i for all i,j<3, E being the unique (binary) predicate involved.
Besides, all known translations of predicate language into the language of
relation algebra axe weight-exponential. Also note that the mapping Tr used
in [M3] needs both infinitely many variables and the equality.
11) (1) can be dropped if the list of individual constants of L:3 is not empty.

[H]: E.V.Huntington, New sets of independent postulates ..., Trans.


Amer. Math. Soc. 35 (1933), pp.274-304
[L]: J.Lukasiewicz, The shortest axiom of the implicational calculus
of propositions, Proc. Royal Irish Acad. 52 no. 3 (1948), pp.25-33
[M1]: R.Maddux, A sequent calculus for relation algebras, APAL 25
(1983), pp.73-101
[M2]: R.Maddux, Nonfinite aziomatizability results for cylindric and
relation algebras, JSL 54 (1989), pp.951-974
[M3]: R.Maddux, Finitary algebraic logic, Zeitschr. f. math. Logic u.
Grundl. d. Math. 35 (1989), pp.321-322
[May]: U.Mayer, Interpretationen und Wachstumsbeispiele, Doctoral
dissertation, Tiibingen University (1994)
[TG]: A.Tarski & S.Givant, A Formalization of Set Theory Without
Variables, AMS Coll. Publ. 41 (1987)
How to lie without being (easily) convicted and
the lengths of proofs in propositional calculus

Pavel Pudls .1 and Samuel 1~. Buss ~ 2

1 Mathematics Institute, Academy of Sciences of the Czech Republic, Prague


2 Department of Mathematics, University of California, San Diego

A b s t r a c t . We shall describe two general methods for proving lower


bounds on the lengths of proofs in propositional calculus and give exam-
ples of such lower bounds. One of the methods is based on interactive
proofs where one player is claiming that he has a falsifying assignment
for a tautology and the second player is trying to convict him of a lie.
The second method is based on boolean valuations. For the first method,
a log n + loglog n - O(logloglog n) lower bound is given on the lengths
of interactive proofs of certain permutation tautologies.

1 Introduction

We are interested in proving lower bounds on the lengths of proofs in proposi-


tional calculus. There are two main motivations for this research.
First of all, this question is connected with the famous open problem of
"AFT) =?coAf~P '' , since a proof system for propositional calculus can be thought
of as a nondeterministic procedure for the coAfT)-complete set of propositional
tautologies, Cook [5]. Thus proving superpolynomial lower bounds on the lengths
of proofs in increasingly stronger proof systems parallels in a sense an approach
to the problem P =?ALP, where for restricted classes of circuits superpolynomial
lower bounds are proven for the size of circuits computing AlP sets - - this is
done with the hope that eventually techniques will be found which will work for
all propositional proof systems and all boolean circuits.
The second motivation is that this seems to be the most promising way
of proving independence of interesting sentences from fragments of arithmetic.
The fragments that we have in mind are often referred to by a generic name
B o u n d e d A r i t h m e t i c . For many theories R of bounded arithmetic one can find
an associated propositional proof system R pr~ [5,8,14]. For a given theory R of
arithmetic, R pr~ is the strongest system system whose soundness is provable in
the theory R and which simulates provability in R. The simulation means that
for a certain class of universal sentences, if a sentence is provable in the t h e o r y / E
and we translate it into a sequence of tautologies expressing finite instances of the
* Partially supported by US-Czechoslovak Science and Technology Program grant No.
93025
** Partially supported by US-Czechoslovak Science and Technology Program grant No.
93025 and by NSF grant DMS92-05181
152

sentence, then the tautologies have polynomial size proofs in the system R vr~
A superpolynomial lower bound on the size of proofs in the proof system R vr~
would imply independence of A/'P =?coX7) from R. Thus even partial results in
this approach to the problem 24"7) =?coNP may have interesting consequences.
The most important class of propositional proof systems is called Frege sys-
tems. This concept was defined by Cook and Reckhow [7,6] and was intended
to capture the properties of the most common propositional proof systems. For-
mally, a Frege system is determined by a complete finite basis of connectives and
a finite set of rules

fro) (1)

which form a sound and implicationally complete system. Let us note that when
applying the rules we use their substitutional instances, but the general rule of
substitution is not allowed. A typical representative Frege system is based on
finitely many axiom schemas (zero premise rules) and the Modus Ponens rule.
There are two measures of complexity that one uses for such proofs: the size of
the proof (which include the sizes of the formulas in it) and the number of steps
(which counts only the number of formulas used in the proof). The concept of
the Frege system is very robust with respect to each measure: every two systems
polynomially simulate each other. Moreover they are equivalent in this sense
with sequent calculi with the cut rule. For their associated theories of bounded
arithmetic, see [5,3,13]. So far, superpolynomial lower bounds have been proved
only for more restricted systems, see e.g. [1,2].
In this paper we introduce two frameworks for proving lower bounds for
Frege systems and their restricted versions. First we shall define an interactive
way of proving propositional tautologies. This game is well-known, however the
relation of the length of the game to the lengths of Frege proofs is (as far as
we know) new. Namely, the minimal number of rounds in the game is propor-
tional to the logarithm of the minimal number of proof steps in a Frege proof.
The inspiration of this game comes from some lower bound techniques in com-
plexity theory, the so-called adversary arguments and certain game-theoretical
characterizations of circuit complexity measures. We shall show that it is triv-
ial that most tautologies require interactive games of at least logn rounds (all
logarithms in this paper are base two). However, we shall also prove a lower
bound of log n + log log n - O(log log log n) on the number of rounds in interac-
tive games for some tautologies consisting of randomly chosen permutations of
conjunctions. Here, n is the number of distinct subformulas of the tautology.
The second approach for lower bounds on Frege proofs is based on valuations
in boolean algebras. This has actually been used implicitly in [2], and other proofs
can be interpreted in such a way.
For the reader who wishes to get a deeper knowledge about lower bounds in
propositional calculus we recommend the forthcoming book by Kraji~ek [9] and
a forthcoming survey by the first author [17].
153

2 Interactive proofs of tautologies

We shall introduce a g a m e using a real life situation as an example. Suppose you


are a prosecutor who wishes to convict someone at a trial. What is he saying
is a blatant lie for you, but the judge, and especially the jury, need a proof
without any doubts. In particular they will not accept a long formal proof of a
contradiction in his testimony. Instead, they require you to ask the defendant
several questions that eventually force the liar to say some simple contradiction.
Let us describe this game more formally. There are two players Provgr and
Adversary, who play the roles of a prosecutor and a lying defendant, respectively.
The aim of Prover is to prove a proposition ~ and the aim of Adversary is to
pretend that, for some assignment, the formula ~ can have value 0 (=false).
The game starts with Prover's asking T and Adversary answering 0, and then
Prover asks other propositions and Adversary assigns values to them. The game
ends when there is a simple contradiction in the statements of the Adversary
which means the following. Suppose we consider propositions in a basis of con-
nectives B. Then a simple contradiction means that for some connective o 6 B,
and propositions ~ 1 , . . . , ~ , Adversary has assigned values to the k + 1 many
formulas ~ 1 , - . . , ~k, o(~1, . . . , ~k) which do not satisfy the truth table of o; e.g.
he assigned 0 to ~p, 1 to r and 1 to ~ A r
We shall call the game Prover-Adversary game. We define that a proposition
is provable in this game, if Prover has a winning strategy. Furthermore a natu-
ral measure of complexity of such proofs is the minimal number of rounds needed
~o convict any Adversary. The following is easy and follows from Proposition 2,
but it helps to understand the concept.

Propositionl. The Prover-Adversary game is a complete proof system.

Proof. To prove the soundness, suppose ~ is not a tautology. Then Adversary


can simply evaluate the propositions on an input a for which ~[a] = 0. To'prove
the completeness suppose ~ is a tautology and let Prover ask all subformulas,
including the variables, of ~. []

W h a t is more interesting is the relation of the number of rounds in the game


to the number of steps in a Frege proof.

P r o p o s i t i o n 2. The minimal number of rounds in the game needed to prove


is proportional ~o the logarithm of the minimal number of steps in a Frege proof
ofqQ.

Proof. 1. Let a Frege proof of qo be given, say ~ l , . . . , ~k, with ~k = ~. Consider


conjunctions
= ^ A...) ^
We use the notations a ~-* 1 or a F-+ 0 to denote the conditions that Adversary
has stated that a has t r u t h value 1 or 0, respectively.
If Adversary tries to be consistent as long as possible, Prover needs only a
constant number of questions to force him to assign 1 to an axiom. Thus he can
~54

force value r ~-+ 1. Also he needs only a constant number of rounds of questions
to get ek ~ 0, since ~k ~-* 0. Then he uses binary search to find an i such that
r ~-+ 1 and r w-~ 0. This takes O(logk) rounds. Another constant number
of rounds suffices to get ~i+1 ~-* 0. Suppose ~i+1 was derived from ~il, 9 9 Tit,
i a , . . . , il < i. For each of these premises ~ij it takes only O(log i) rounds to force
Tij ~-+ 1 (or to get an elementary contradiction), since Prover can force r ~-* 1
in O(log n) rounds using binary search again. Once the premises get the value 1
and the conclusion value 0, Prover needs only a constant number of questions to
force an elementary contradiction.
2. Let a winning strategy for Prover be given, suppose it has r rounds in
the worst case. We construct a sequent calculus proof of ~ of size 2~ It is
well-known that a sequent proof can be transformed into a Frege proof with at
most polynomial increase.
Consider a particular play P, let o h , . . . , at, t _< r be the formulas asserted
to have value 1 by Adversary in response to Prover's questions, where we have
added (or removed) negations if Adversary answered 0. (In particular al is - ~ ) .
Thus o~1 A ... A s t is false, hence --+ -~1,...,--'c~r is a true sequent. Moreover,
as easily seen, it has a proof with t + O(1) number of lines, since there is a
simple contradiction in the statements c~1 ... s t . The proof of ~ is constructed
by taking proofs of all such sequents and then using cuts eliminating successively
all formulas except 9. This is possible due to the structure of the possible plays.
Namely,

For each play P with questions o h , . . . , ai, and each j _< i, there is a another
play P~ in which the first j questions are the same as in P and in which the
first j - 1 answers are the same and the j - t h answer is different.

Finally observe that the number of such sequents is at most 2~, which gives the
bound. []

Let us note that the proof constructed from the game is in a tree form, except
possibly for constant size pieces at the leaves, which can be easily changed into
such a form. Thus we get:

C o r o l l a r y 3. (Kraji~ek [10]) A Frege proof can be transformed into a tree-like


Frege proof with at most polynomial increase of size.

Proof. Let an arbitrary Frege proof of size n be given. First transform it into the
Prover-Adversary game, thus we get a game with O(log n) rounds. Transforming
it into a sequent proof we get a proof of size 2 ~176 = n ~ Then one can
check that the translation of this proof into a Frege proof can be done so that
the tree form is preserved. []

This corollary is not surprising, since the main idea of the first part of the
proof of Proposition 2 is the same as in Krajieek's proof. In fact the number of
rounds characterizes more precisely the log of the minimal number of steps of
a proof in a tree form. Using these transformation we get, however, still a little
155

more information: we get a kind of a normal form of a proof - proofs which


use only cuts except for the top part (at the leaves), i.e. something dual to the
cut-free proofs.
Let us stress that we can also characterize the size of Frege proofs using
this game, if we count also the size of the queries. Furthermore we can impose
various restrictions on the form of the queries. E.g. bounded depth queries would
correspond to bounded depth Frege proofs.

A particularly interesting restriction is the restriction to monotone formu-


las which are formulas using only the connectives A and V. Since there are no
nontrivial monotone tautologies, one has to consider proofs from assumptions.
A lot of interesting tautologies can be represented in this way, e.g. the most
useful example - the Pigeon Hole Principle. (As we can take the conjunction of
all assumptions, we can confine ourselves to the case of the single assumption.)
In this case the game starts with Adversary claiming the assumptions to be true
and the conclusion to be false. In circuit complexity the restriction to monotone
circuits enabled to prove exponential lower bounds, while for nonmonotone cir-
cuits we still have only small linear lower bounds (for explicitly defined boolean
functions). Thus we hope that also in propositional calculus the monotone case
will be easier.
Proposition 2 suggests that there might be a similar relation between the
monotone version of the game and a monotone version of propositional calculus.
The most natural way to introduce the monotonicity restriction to propositional
calculus is to use the sequent calculus with monotone formulas in the sequents
in the whole proof. Thus rules for other connectives are forbidden. Part 1 of
the above proof does not work in this case; although it can be made to work
in the case of monotone tree-proofs. So Proposition 2 does apply to monotone
tree-proofs and the monotone version of the game.
It is not clear, if we really need nonmonotone formulas for short proofs of
monotone true sequents. Thus we have the following two questions.

Question 4. Can every (general) sequent proof of a monotone sequent be replaced


by at most polynomially longer monotone proof?

Question5. Can every monotone sequent proof of a monotone sequent be re-


placed by at most polynomially longer monotone tree proof?

It is conceivable that the answers to both questions are NO. Thus tree-like
monotone proofs are an interesting class of proofs on which we can try lower
bound methods. Let us stress that we do not have superpolynomial lower bounds
even for such proofs. In Prover-Adversary game this means that the following is
open.

Question 6. Are there monotone true sequents which cannot be proved in mono-
tone Prover-Adversary game using O(logn) rounds, where n the size of a se-
quent?
~56

3 An example of the adversary method

Let us consider the following formula t~

. . . . . . . (p v

Note that t2,~ is always a tautology; this is a well-known example for which one
can prove a linear lower bound on the number of steps in Frege proofs [4,11].
We shall show an f2(log n) lower bound for the number of rounds in the game.
This, of course, follows from the cited result and Proposition 2. Still the proof
is interesting, because it is different, it is not just a translation. The direct
translation only gives us a proof that Prover cannot win in r rounds for some
r = o(log n). To get a winning strategy for Adversary in r rounds we have to
refer to the finiteness of the game. Thus the direct translation does not give the
winning strategy explicitly, so it is not a really "adversary argument".

P r o p o s i t i o n 7. Any proof of t2n in the Prover-Adversary game requires log n


rounds.

Proof. For m >_ 0, let Am be an assignment of truth values to formulas defined


as follow. Let 9 -- 9 ( t / 1 , . . - , t / k , P , P l , - - . , P q ) be a formula, where the t~j's are
maximal, p stands for all other occurrences of p and P l , . . . , P q are all other
variables. First assign some values (say 0) to p and P l , . . . , Pq. Then assign values
to t/j's as follows. I f i j _> m, let tij ~-~ 1, i f i j is odd and tij ~-~ O, i f i j is even.
If ij < m, then assign values conversely. Thus if ij >_ m we assign to tij the
incorrect value and otherwise we assign the correct value. Once the values of
ti~,..., tik, p, p l , . . . , pq are set, evaluate the formula according to the rest of the
connectives correctly.
C l a i m . Let 91, . . . , 9I be the maximal proper subformulas of 9 and suppose
that the values assigned to 9, 9 1 , . . . , 9~ according some Am give an immediate
contradiction. Then 9 = tin, (hence l = 1 and 9I = tin-l).
This is easy, since if 9 is not of the form t~ for some i, then every maximal
tj's is maximal also in some proper subformula of 9-
Now we can describe a strategy for Adversary. He will keep a certain set S
of numbers between 0 and 2n in each round. He starts with S consisting of all
numbers between 0 and 2n. Suppose we are in a certain round with a set S and
P r o v e r asks formula 9- Then Adversary evaluates 9 using all Am's with m E S.
Then he chooses the value for 9 which occurs most frequently and sets new S to
consist of those m's for which he got this value. Thus the size of S decreases at
most by the factor 2. The set S has the property that the values of all queries
up to this round equal to the values obtained by applying Am to them for any
m E S. Hence, by the Claim, there cannot be an immediate contradiction in the
answers of Adversary, if S > 2. So Adversary can be consistent at least for log n
rounds. []
157

4 A n o n c o n s t r u c t i v e lower bound

In this section we prove a slightly larger lower bound log n+log log n - O ( l o g log log n)
on the number of rounds in the Prover-Adversary game. (Note that this is larger
than previous bound only if we do not count the size of indices of variables.) We
do not construct the formulas explicitly, but use a counting argument to show
that they exist. Although counting arguments sometimes easily give exponen-
tial lower bounds in circuit complexity [19], it seems that for the propositional
calculus we cannot get such strong bounds.
We consider the following formulas

sn,x =a/ P~(1) A ... Apr(~) ---+Pl A ... Apn,

where Ir is a permutation of { 1 , . . . , n}. The distribution of parentheses is not


important; for definiteness let us assume that we group the conjuncts to the
left. These formulas have been used by Orevkov [15] to prove a speedup from
~ ( n log n) to O(n) of the sequence-like proofs vs. tree-like proofs (this speedup
was rediscovered later by the authors, and we sketch its proof below). Theorem 8
does not follow from Orevkov's result since we do not have such a tight relation
between tree-like proofs and the game.

T h e o r e m 8 . There exists a sequence of permutations { 7i"n}n=l,


~o 7rn a permuta-
tion of { 1 , . . . , n } , such that any proof of s~,Tr~ in the Prover-Adversary game
requires log n + log log n - O(log log log n) rounds.

Proof. Let a winning strategy P of Prover be given. We can view P as a labeled


binary tree where the nodes are labeled by the queries of Prover and the edges
are labeled by the answers of Adversary. In particular, the root is labeled by the
proved formula and has only one edge which is labeled by 0. For each branch
there is a simple contradiction for some node labels.
The skeleton of P is defined to be the same tree, but with the node labels
replaces by information about a simple contradiction for each branch. Namely,
if ~ l , . . . , ~ k , o ( ~ 1 , . . . , ~k) is a simple contradiction for a branch b, we add
edges labeled by 1 , . . . , k pointing from the leaf to the nodes on b which were
labeled by ~ 1 , . . . , ~k and an edge labeled by o pointing to the node labeled by
O(~fll,... , (ilk).

L e m m a 9 . Let S be the skeleton of some winning strategy for s~,~, 7r any per-
mutation of { 1 , . . . , n } . Then S and n uniquely determine the permutation ~r.

Proof of Lemma. Let S and n be given. Define a unification problem as follows.


Introduce a variable for each node of S and add an equation corresponding to a
simple contradiction for each branch in S:

v =

where v is the variable of the node to which an edge labeled by o is pointing etc.
~' 5 8

Let y be the variable corresponding to s_{n,π}. We take another variable x and
add one more equation
y = x → p_1 ∧ ... ∧ p_n.
Clearly this unification problem is determined solely by S and n. Consider the
most general unifier of this problem and let t be the term assigned to x. We
claim that t is actually p_{π(1)} ∧ ... ∧ p_{π(n)}. We know that this formula can be
obtained from t by a substitution, as the proof whose skeleton S is, is a solution
of the unification problem. If t were not equal to it, then there would be at least
one p_i missing in it. Then, if we substitute, say, a new variable for the free
variables in t, we get a proof of a non-tautology, which is a contradiction. Thus
t, and hence π, is determined by the skeleton S and n. []

Proof of Theorem 8. To prove the theorem it suffices to compare the number of
skeletons of a given depth d (= number of rounds) and the number of permutations
of n elements. W.l.o.g. we can assume that each branch has length d, thus
we need only count the number of possible markings of simple contradictions.
If we have a basis B with at most k-ary connectives, then the number of possible
situations on a branch of length d is |B| · d^{k+1}. Hence the number of such skeletons
is estimated by

(|B| · d^{k+1})^{2^d} = 2^{O(2^d log d)},

while the number of permutations is n! = 2^{Θ(n log n)}. This gives the bound d =
log n + log log n − O(log log log n). []
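Unwinding the final comparison explicitly (a routine computation, with |B| and k treated as constants): if every s_{n,π} had a strategy of depth d, then
\[
2^{d}\,O(\log d)\;\ge\;\log_2 n!\;=\;\Theta(n\log n),
\qquad\text{hence}\qquad
2^{d}\;\ge\;\Omega\!\left(\frac{n\log n}{\log d}\right).
\]
Either d > 2 log n, in which case the stated bound holds trivially, or log log d ≤ log log log n + O(1), and then
\[
d\;\ge\;\log n+\log\log n-\log\log d-O(1)\;=\;\log n+\log\log n-O(\log\log\log n).
\]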

Next we state and give a quick sketch of a theorem originally proved by
Orevkov [15] and later rediscovered by the authors. This gives an Ω(n log n) lower
bound on the length of tree-like Frege proofs of the tautologies s_{n,π}.

Theorem 10. For every Frege system there exists a positive constant ε such that
for every n there exists a permutation π of {1, ..., n} such that every tree-like
proof of s_{n,π} has at least εn log n steps.

The proof of Theorem 10 is very similar to the proof of Theorem 8 and,
in the setting of proofs, is a well-known technique due to Parikh [16]. For a
Frege proof P we define the skeleton of P to be the labeled graph whose vertices
correspond to the formulas; the label of a vertex v corresponding to a formula φ
determines the rule by which φ was derived, and the edges going into v determine
from which formulas φ was derived. Furthermore the edges are ordered so that
it is clear at which positions of the rule the formulas were used. Put otherwise, a
skeleton contains all information about the proof except for the formulas. Similarly
to Lemma 9 above, we have:

Lemma 11. Let S be the skeleton of some Frege proof of s_{n,π}. Then S and n
uniquely determine the permutation π.

The proof of Lemma 11 is similar to the proof of Lemma 9 and we leave it


to the reader.

Proof of Theorem 10. To prove the theorem it suffices to compare the number of
tree-skeletons with a given number of vertices and the number of permutations of
n elements. To estimate the number of skeletons we can use well-known estimates
on the number of trees, but we can also estimate it easily directly. A tree-skeleton
can be represented as a term where we have a function symbol for each
rule and a single (constant) symbol c which we use for all leaves. Using Polish
notation we can even avoid parentheses. Thus we can code tree-skeletons with
≤ L vertices by words of length L in an alphabet of size r + 1, where r is the
number of rules of the Frege system. If all tautologies s_{n,π} have proofs with at
most L steps, then
(r + 1)^L ≥ n!,
which gives L = Ω(n log n). []
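For completeness, the last step is the computation
\[
L\;\ge\;\log_{r+1} n!\;=\;\frac{\log_2 n!}{\log_2(r+1)}\;=\;\frac{n\log_2 n-O(n)}{\log_2(r+1)}\;=\;\Omega(n\log n),
\]
where r, the number of rules, is a constant depending only on the Frege system.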

Theorems 8 and 10 are both proved by counting arguments. As a conse-


quence, the stated lower bounds apply to randomly chosen permutations; how-
ever, we do not know any particular explicitly defined permutation for which the
lower bounds hold.

5 A method based on boolean values

We shall discuss another method for proving lower bounds on the lengths of
proofs. This method has been successfully applied in the case of proofs where
the formulas have bounded depth [2]. (Here the restriction means that we use
only the De Morgan basis and the number of alternations of different connectives
is bounded by a constant; e.g. CNFs and DNFs are of depth ≤ 3.) Ajtai [1]
and Riis [18] use in fact a different approach, one based on forcing, but
their results can be interpreted using the boolean values method.
In model theory we use boolean values to prove independence results as fol-
lows. We take a suitable boolean algebra and assign suitable values to formulas.
If a sentence gets value different from 1, then it is not provable, since we can
collapse the boolean algebra to a two-element boolean algebra and get a model,
where the sentence is false. In propositional calculus we are interested in lower
bounds on the length of proofs of tautologies. A tautology gets value 1 in any
boolean algebra, so we cannot use a single boolean algebra. Our approach is based
on assigning boolean algebras to every small subset of a given set of formulas in
a consistent way. An equivalent approach has been proposed by Krajíček [12],
which is based on assignments in a single partial boolean algebra.
The concept of a homomorphism is defined for boolean algebras. We extend
it to mappings of sets of formulas into boolean algebras. Namely, let a set of
formulas L and a boolean algebra B be given. A mapping λ : L → B will be
called a homomorphism if it is consistent w.r.t. connectives. For instance

λ(¬φ) = ¬_B λ(φ) if φ, ¬φ ∈ L,

λ(φ ∨ ψ) = λ(φ) ∨_B λ(ψ) if φ, ψ, φ ∨ ψ ∈ L.

We define the degree of a Frege system F as the maximal number of subformulas
of a rule (or axiom scheme) of F. E.g. the Modus Ponens rule has three
subformulas φ, ψ and φ → ψ, so d ≥ 3.

Proposition 12. Let a Frege system F of degree d be given, let τ be an arbitrary
formula.
Suppose that for every set of formulas Φ of size at most n which contains τ
the following holds: (1) for each subset S ⊆ Φ of size at most d we can find a
boolean algebra B_S and a homomorphism λ_S : S → B_S, and (2) for every pair
T, S with T ⊆ S we can find an embedding σ_{T,S} : B_T → B_S such that the
following diagram commutes, i.e.,

σ_{T,S}(λ_T(φ)) = λ_S(φ) for every φ ∈ T.

Furthermore we require that λ_{{τ}}(τ) < 1.

Then τ does not have a proof with ≤ n steps.

Proof. Let a proof (φ_1, ..., φ_m), m ≤ n, of τ (= φ_m) be given. We shall show that
the assumption of the proposition fails for Φ = {φ_1, ..., φ_m}. Suppose that we
have a system of homomorphisms as required in the proposition, except possibly
for the last condition. We shall show that every φ ∈ Φ gets λ_{{φ}}(φ) = 1, thus the
last condition is not satisfied.
First observe that B_{{φ}} is embedded in all B_S where φ ∈ S, hence λ_S(φ) = 1
for one S iff it holds for all such S. Let φ be an instance of a logical axiom
scheme ψ(x_1, ..., x_k), i.e. φ = ψ(χ_1, ..., χ_k) for some formulas χ_1, ..., χ_k. Let S be the
set of formulas θ(χ_1, ..., χ_k), where θ runs over all subformulas of ψ. By the
assumption, |S| is at most the degree of the Frege system, hence we have a
boolean algebra B_S and a homomorphism λ_S : S → B_S. Since ψ is a tautology,
it must get value 1 for any assignment of boolean values. Thus

λ_S(φ) = ψ_{B_S}(λ_S(χ_1), ..., λ_S(χ_k)) = 1.

Suppose that φ_i is obtained in the proof from some φ_{j_1}, ..., φ_{j_l}, j_1, ..., j_l < i, by
a Frege rule, and suppose that φ_{j_1}, ..., φ_{j_l} all get the value 1 in their algebras.
Then, applying the same argument as for an axiom (namely, a Frege rule is
sound in any boolean algebra), we conclude that φ_i also gets the value 1. Thus,
by induction, all formulas φ_1, ..., φ_m get value 1. []

As an example, we shall describe the form of boolean algebras that one can
use for proving a superpolynomial lower bound on the lengths of bounded depth
proofs of the Pigeon Hole Principle, using the combinatorial arguments of Ajtai
[1]. The Pigeon Hole Principle is the statement that there is no bijection between

an (n + 1)-element set D and an n-element set R. It is represented by the following
formula

⋁_{i≠j∈D, k∈R} (p_{ik} ∧ p_{jk}) ∨ ⋁_{i≠j∈R, k∈D} (p_{ki} ∧ p_{kj}) ∨ ⋁_{i∈D} ⋀_{k∈R} ¬p_{ik} ∨ ⋁_{k∈R} ⋀_{i∈D} ¬p_{ik},

where p_{ij} determines whether the pair {i, j} is in the alleged mapping. We think
of truth assignments as bijections between D and R. There are no such real
assignments, but in some cases we can still determine what would be the value
of a formula under such assignments. For instance, PHP will get the value 0,
since it asserts that there are no such assignments. In some cases we cannot
decide the value of a formula for all such assignments, but we can decide it for
all assignments which extend some partial one-to-one mapping g : D → R. In
other cases it is not possible at all. The key combinatorial argument shows that
for small sets Φ of bounded depth formulas there exists a partial assignment h
(in fact a random h of suitable size) such that for each φ ∈ Φ its value can be
determined by certain small extensions of h.
Let us forget about h. Then the statement is roughly this. There exists a
constant size set C ⊆ D ∪ R such that the value of φ is decided by all g's whose
support contains C. Now we take the boolean algebra B_C of all subsets of partial
one-to-one mappings g whose support contains C and which are minimal with
this property (i.e. if g′ is a proper subset of g, then its support does not cover
C). The value of φ is the set of such g's which force φ to be true (the other
g's force φ to be false). For a set φ_1, ..., φ_k of formulas with the corresponding
subsets C_1, ..., C_k, we take the boolean algebra B_{C_1 ∪ ... ∪ C_k}. If C′ ⊆ C, then
there exists a natural embedding of B_{C′} into B_C. Thus we get the required set
of homomorphisms.
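For very small instances the atoms of B_C can simply be enumerated. The following Python sketch (our names) does this under the assumption, true for PHP, that D and R are disjoint; in that case minimality of g amounts to every pair of g touching C.

    from itertools import combinations, permutations

    def atoms(D, R, C):
        # minimal partial one-to-one maps g : D -> R whose support covers C
        result = []
        for k in range(len(D) + 1):
            for dom in combinations(D, k):
                for rng in permutations(R, k):
                    g = dict(zip(dom, rng))
                    support = set(g) | set(g.values())
                    minimal = all(d in C or r in C for d, r in g.items())
                    if set(C) <= support and minimal:
                        result.append(g)
        return result

    # e.g. atoms({1, 2, 3}, {'a', 'b'}, {1, 'a'}) lists the atoms of B_{{1,'a'}}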

References

1. M. Ajtai, The complexity of the pigeonhole principle, in Proceedings of the 29th Annual IEEE Symposium on Foundations of Computer Science, 1988, pp. 346-355.
2. P. Beame, R. Impagliazzo, J. Krajíček, T. Pitassi, P. Pudlák, and A. Woods, Exponential lower bounds for the pigeonhole principle, in Proceedings of the 24th Annual ACM Symposium on Theory of Computing, 1992, pp. 200-220.
3. S. R. Buss, Bounded Arithmetic, Bibliopolis, 1986. Revision of 1985 Princeton University Ph.D. thesis.
4. S. R. Buss et al., Weak formal systems and connections to computational complexity. Student-written Lecture Notes for a Topics Course at U.C. Berkeley, January-May 1988.
5. S. A. Cook, Feasibly constructive proofs and the propositional calculus, in Proceedings of the 7th Annual ACM Symposium on Theory of Computing, 1975, pp. 83-97.
6. S. A. Cook and R. A. Reckhow, On the lengths of proofs in the propositional calculus, preliminary version, in Proceedings of the Sixth Annual ACM Symposium on the Theory of Computing, 1974, pp. 135-148.
7. ——, The relative efficiency of propositional proof systems, Journal of Symbolic Logic, 44 (1979), pp. 36-50.
8. M. Dowd, Propositional representation of arithmetic proofs, in Proceedings of the 10th ACM Symposium on Theory of Computing, 1978, pp. 246-252.
9. J. Krajíček, Bounded Arithmetic, Propositional Calculus and Complexity Theory, Cambridge University Press, to appear.
10. J. Krajíček, Lower bounds to the size of constant-depth Frege proofs. To appear in Journal of Symbolic Logic.
11. ——, Speed-up for propositional Frege systems via generalizations of proofs, Commentationes Mathematicae Universitatis Carolinae, 30 (1989), pp. 137-140.
12. ——, On Frege and extended Frege proof systems. Typeset manuscript, 1993.
13. J. Krajíček and P. Pudlák, Propositional proof systems, the consistency of first-order theories and the complexity of computations, Journal of Symbolic Logic, 54 (1989), pp. 1063-1079.
14. ——, Quantified propositional calculi and fragments of bounded arithmetic, Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 36 (1990), pp. 29-46.
15. V. P. Orevkov, On lower bounds on the lengths of proofs in propositional logic (Russian), in Proc. of the All-Union Conference "Metody matem. logiki v problemach iskusstvennogo intellekta i sistematicheskoje programmirovanie", Vilnius, vol. I, 1980, pp. 142-144.
16. R. Parikh, Some results on the lengths of proofs, Transactions of the American Mathematical Society, 177 (1973), pp. 29-36.
17. P. Pudlák, The lengths of proofs. To appear in Handbook of Proof Theory, ed. S. Buss.
18. S. Riis, Independence in Bounded Arithmetic, PhD thesis, Oxford University, 1993.
19. C. Shannon, On the synthesis of two-terminal switching circuits, Bell System Technical Journal, 28 (1949), pp. 59-98.
Monadic Second-Order Logic and
Linear Orderings of Finite Structures

Bruno Courcelle

Université Bordeaux-I, LaBRI (1)
351, Cours de la Libération
33405 TALENCE Cedex, France

Abstract: We consider graphs in which it is possible to specify
linear orderings of the sets of vertices, in uniform ways, by MS
(i.e., Monadic Second-order) formulas. We also consider classes
of graphs C such that for every L ⊆ C, L is recognizable iff it is
MS-definable. Our results concern in particular dependency
graphs of partially commutative words.

Introduction

We shall consider the following question:

Question 1: In which finite graphs is it possible to specify a linear
ordering of the vertices, in a uniform way, by monadic second-order
formulas?

This is not possible for all finite graphs: take the discrete
(edgeless) graphs; they have automorphisms, so no linear order can be
defined. Even if we choose in a given discrete graph k sets of vertices
(by means of k set variables that we shall call "parameters"), we
cannot define a linear order if the graph is "too large", because discrete
graphs with at least 2^k + 1 vertices have nontrivial automorphisms
preserving k arbitrarily given subsets. This explains why the discrete
graphs cannot all be linearly ordered by a unique MS formula (MS
will stand for "Monadic Second-order"), even with parameters
denoting sets of vertices. Hence, we can only hope to order linearly the

(1) Laboratoire associé au CNRS; email: courcell@labri.u-bordeaux.fr


graphs of specific classes. In [7] we considered the similar question of
specifying by MS formulas an orientation of the edges of an undirected
graph.
The notion of a recognizable set of graphs has been introduced in
[4]. It is based on graph congruences with finitely many classes and is
relative to operations on graphs that, typically, glue two graphs
together or extend in some way a given graph. It is known from Büchi
and Doner (see Thomas [14]) that a set of words (or of binary trees) is
recognizable iff it is MS-definable. This result is fundamental for two
reasons:
first because it relates two different types of characterization
of the same sets of structures: the first one uses a logical formula
verifying that a given structure satisfies a certain characteristic
property and thus belongs to the considered set; the other is relative to
a fixed algebraic structure on the class of all structures and expresses a
property of the set of structures considered as a whole and not one of
each individual element;
second because it relates a logical description and an
algorithmic one, since recognizable sets can be handled in terms of
tree-automata, and efficient recognition algorithms can be built from
automata. We ask the following general question, already considered
in [5,7,11,15]:
Question 2: For which classes of finite graphs C is it true that, for
every L ⊆ C, L is recognizable iff it is MS-definable?

We now explain the links between Questions 1 and 2. It is known
that every MS-definable set is recognizable. Let C be a class of graphs,
let F be the set of graph operations on C involved in the notion of
recognizability (of [4]), and let us also assume that every graph in C is the
value of an F-expression, i.e., of an algebraic expression over F.
Assume finally that for every graph G in C we can construct "in G" an
F-expression that defines this graph. Then, if L is a recognizable
subset of C, there exists a finite tree-automaton recognizing the set of
F-expressions the value of which is in L. Given a graph G we can
express that G belongs to L by means of an MS formula that works as
follows:
(1) it defines "in G" an F-expression, the value of which is G,
(2) it checks whether the tree-automaton accepts this expression
(this is possible by Doner's theorem):

the graph G is in L iff the automaton accepts the expression, iff the
MS-formula holds.
In some cases, a linear ordering of the given graph helps to "parse"
it by MS-formulas: this is the link between Questions 1 and 2.
This paper is an extended abstract. Full details can be found in [8].

Graphs
All graphs will be finite, directed (unless otherwise stated), simple
(no two edges have the same ordered pair of vertices). A graph will be
given as a pair G = <V_G, edg_G> where V_G is the set of vertices and
edg_G ⊆ V_G × V_G is the edge relation. If X ⊆ V_G we denote by G[X] the
induced subgraph of G with set of vertices X. A path is a sequence of
pairwise distinct vertices (x_1, ..., x_n) such that (x_i, x_{i+1}) ∈ edg_G for
every i = 1, ..., n−1. It connects x_1 to x_n. It is empty if n = 1. A cycle is like
a path except that x_1 = x_n and n > 1. A graph is a path if its vertices
form a path (x_1, ..., x_n) and all edges of the graph are in the path, i.e., are
of the form (x_i, x_{i+1}) for some i. A discrete graph is a graph without
edges. We let Suc_G(x) := {y / (x, y) is an edge} and we call it the set of
successors of x. We say that x is a predecessor of y if y is a successor
of x. The outdegree of G is the maximal cardinality of the sets Suc_G(x).
A dag is a (directed) acyclic graph; a tree is a dag such that every vertex
is reachable by a unique path from a (necessarily unique) vertex called
the root. A forest is a dag, each connected component of which is a
tree; hence, a forest that is not a tree has several roots. A vertex
without successors is called a leaf. The transitive closure of a graph G
is a graph denoted by G^+. If G is a dag, the relation edg_G^* (the reflexive
and transitive closure of the successor relation) is a partial order on
V_G. Two vertices x and y are comparable if x edg_G^* y or y edg_G^* x;
otherwise they are incomparable and we write this x ⊥_G y. The
reduction of a dag G is the least subgraph H of G such that H^+ = G^+. It
is unique and denoted by red(G); it is the Hasse diagram of the order
edg_G^*. We say that a graph G is linear if it is a dag and any two
vertices are linked by an edge; its reduction is a path and the order
edg_G^* is linear.

Relational structures and Monadic Second-order Logic.

Let R be a finite set of relation symbols where each element r in R
has a rank ρ(r) in IN+, which will be the arity of relations denoted by r.
An R-(relational) structure is a tuple S = <D_S, (r_S)_{r ∈ R}> where D_S is a
finite (possibly empty) set, called the domain of S, and r_S is a subset of
D_S^{ρ(r)} for each r in R. We shall denote by 𝒮(R) the set of R-structures. We
refer the reader to [4-9] for monadic second-order (MS) logic and MS-definable
transductions of structures.

Recognizable sets
Let 𝒮 be a possibly infinite set of sorts. An 𝒮-signature is a set of
function symbols F such that each f in F has a type of the form s_1 × s_2 ×
... × s_n → s where s_1, ..., s_n, s are sorts. An F-algebra is an object M =
<(M_s)_{s ∈ 𝒮}, (f_M)_{f ∈ F}>, where, for each s in 𝒮, M_s is a set called the
domain of sort s of M, and for each f ∈ F of type s_1 × s_2 × ... × s_n → s,
f_M is a total mapping: M_{s_1} × M_{s_2} × ... × M_{s_n} → M_s. We denote by T(F)
the F-algebra of finite terms (algebraic expressions) over F and by
h_M the unique homomorphism: T(F) → M that associates with a term
its value. We shall say that t is a term (or an expression) denoting
h_M(t). An F-algebra A is locally finite if each domain A_s, s ∈ 𝒮, is
finite. Let M be an F-algebra and s ∈ 𝒮. A subset B of M_s is
recognizable if there exists a locally finite F-algebra A, a
homomorphism h : M → A, and a (finite) subset C of A_s such that B =
h^{-1}(C).

Propositionl" Let M and N be t w o F - a l g e b r a s , let h be a


h o m o m o r p h i s m o f N onto M: a s u b s e t L o f M s is F-recognizable iff
the s u b s e t h - l ( L ) of Ns is F-recognizable. In particular, /f N is T(F)
a n d F is finite then L is F-recognizable iff h-l(L) is a recognizable s e t
o f terms.

A graph with sources is a pair H = <G, s> consisting of a graph G
and a total one-to-one mapping s : C → V_G called its source mapping,
where C is a finite subset of IN. We say that s(C) ⊆ V_G is the set of
sources of H and that s(c) is its c-source where c ∈ C. We shall also say
that the vertex s(c) has source label c. A vertex that is not a source is
an internal vertex. The set C is called the type of H and is denoted by
τ(H). We shall denote by G_C the set of all graphs of type C. One can
define operations on sourced graphs that, typically, glue two graphs by
their sources. We shall use the set 𝒮 of finite subsets of IN as a set of
sorts. These operations form an 𝒮-signature. We refer the reader to [4-10]
for definitions.

A set of graphs, all of the same type, is recognizable if it is so with
respect to these operations. The notion of recognizability is thus
associated with certain graph operations. It is robust in the sense that
small variations on the definitions of the operations do not modify it
(this is shown in Courcelle [10]). It is proved in Courcelle [4] that every
MS-definable set of graphs is recognizable.

Monadic second-order definitions of linear orders

Let 𝒮 be a class of R-structures (it is no more difficult to give the
definition for structures than for graphs). We say that a linear order
on the structures of 𝒮 is MS-definable if there exist two MS-formulas
φ(X_1, ..., X_n) and θ(x, y, X_1, ..., X_n) such that for every S in 𝒮:
(1) S ⊨ ∃X_1, ..., X_n . φ,
(2) for all sets D_1, ..., D_n ⊆ D_S, if (S, D_1, ..., D_n) ⊨ φ, then
the binary relation P such that
(u, v) ∈ P ⟺ (S, u, v, D_1, ..., D_n) ⊨ θ
is a linear order on D_S.
The linear order is defined "uniformly", by the same formulas for
all structures of the class, and in terms of auxiliary sets D_1, ..., D_n. In
other words, there exists a definable transduction mapping any
structure S in 𝒮 into a structure S' consisting of S equipped with a
linear order of its domain. This does not mean that every linear order
on the domain of S is obtained in this way, by some choice of sets
D_1, ..., D_n.

Locally ordered dags
Every dag has a topological sorting, i.e., an ordering ≤ of the
vertices such that if there is an edge from x to y then x ≤ y. A vertex of
a dag G having no predecessor is called a root and Root_G denotes the
set of roots of G. (Since graphs are finite, if a dag G is nonempty, then
Root_G is nonempty.) A partial order α on V_G locally orders G if the
sets Root_G and Suc_G(x) for every x ∈ V_G are linearly ordered by α;
we let Paths(G) denote the set of paths in G starting from a root;
Paths(G) is linearly ordered by ≤_α where ≤_α is the lexicographical
order on sequences of vertices associated with α. For each x ∈ V_G, we
let π(x) denote the unique ≤_α-minimal path from a root to x. For x, y ∈
V_G, we let x ≤_α y iff π(x) ≤_α π(y). Hence, ≤_α is a linear order on V_G.
The enumeration of V_G in increasing order with respect to ≤_α is
called the α-depth-first traversal of G. It is nothing but the order in
which the vertices of G are visited during the depth-first search of G
where, in case of choice, the α-smallest vertex is chosen. We let P be
the binary relation on V_G such that
(x, y) ∈ P ⟺ x is just before y on the path π(y).
The graph F(G, α) := <V_G, P> is the α-depth-first spanning forest.
In particular, since G is a dag, an edge (x, y) of G can be of only 3 types,
by [2, Lemma (5.6)]:
(1) either it is a tree edge, i.e., an edge of F(G, α),
(2) or it is a forward edge, i.e., x is an ancestor of y in
some tree of F(G, α), but (x, y) is not an edge of F(G, α),
(3) or it is a cross edge, i.e., x and y are incomparable in F(G, α)
and y <_α x.
Finally, we define the α-canonical traversal of G as the α-depth-first
traversal of F(G, α^{-1}) where α^{-1} is the opposite order of α (i.e., x
≤_{α^{-1}} y iff y ≤_α x). It orders G locally iff α does.
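The depth-first description above is straightforward to make explicit. The following Python sketch (our names; alpha is a key function representing the given local order) computes the α-depth-first traversal together with the tree edges of the search, which play the role of F(G, α); the α-canonical traversal is then obtained by running it with the key reversed.

    def alpha_dfs(vertices, edges, alpha):
        succ = {v: sorted({y for (x, y) in edges if x == v}, key=alpha)
                for v in vertices}
        roots = sorted([v for v in vertices
                        if all(y != v for (_, y) in edges)], key=alpha)
        order, tree_edges, seen = [], [], set()
        def visit(v):
            seen.add(v)
            order.append(v)                      # position in the traversal
            for w in succ[v]:
                if w not in seen:
                    tree_edges.append((v, w))    # a tree edge of the search
                    visit(w)
        for r in roots:                          # roots in increasing alpha order
            visit(r)
        return order, tree_edges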

Theorem 2: Let G be a dag locally ordered by α. The α-canonical
traversal of G is a topological sorting that is MS-definable in <V_G,
edg_G, α>.

Proof: Consider any edge (x, y) of G. If it is a tree edge or a forward edge (in
F(G, α^{-1})), then x is before y in the α-depth-first traversal of F(G, α^{-1}).
Otherwise it is a cross edge, hence x >_{α^{-1}} y, and x <_α y where <_α is defined in
terms of the paths in F(G, α^{-1}); hence, x is before y in the α-depth-first
traversal of F(G, α^{-1}). Hence this traversal is a topological sorting of G. We now
formalize its definition in MS.
Claim: If G is a forest locally ordered by α then its α-depth-first traversal is
MS-definable.
It is thus enough to show that F(G, α^{-1}) is MS-definable in <V_G, edg_G,
α>, because having defined this forest (namely, its edges), we can MS-define its
α-depth-first traversal by the claim. We shall do the proof for F(G, α) in order
to simplify the notation. The result will follow since α^{-1} is definable from α.
We shall basically translate into MS-formulas the various notions involved in
the definition of F(G, α). The reduction red(H) of any dag H is the graph K
with the same vertices as H and edges defined as follows:
(x, y) ∈ edg_K ⟺ (x, y) ∈ edg_H and there is no z ≠ y such that
x edg_H z and z edg_H^+ y.
We claim the existence of formulas φ_1, ..., φ_7 with the semantics given
below, in a dag G locally ordered by α. The letter X denotes sets of vertices;
letters x, y, z, ... denote vertices of the considered dag G.
φ_1(x, y, X) ⟺ x, y ∈ X, x ≠ y, and there is a directed path in G[X] from
x to y.
φ_2(x, y, X) ⟺ x, y ∈ X, (x, y) is an edge of the graph red(G[X]).
(Write that (x, y) ∈ edg_G and that there is no z in X − {x, y}
such that x edg_G z and φ_1(z, y, X') holds, where X' = X − {x}.)
φ_3(x, y) ⟺ x ≠ y and G is a path from x to y.
(Write that φ_1(x, y, X) holds for X = V_G and that it
does not if X is any proper subset of V_G containing x and y.)
φ_4(x, y, X) ⟺ x ≠ y and X is the set of vertices of a directed path from x to y.
(Write that the graph red(G[X]) is a path from x to y.)
φ_5(x, y, z, t, X) ⟺ φ_4(x, y, X) holds, z, t ∈ X and z is the predecessor of
t on the path red(G[X]) from x to y.
(Write that φ_4(x, y, X) holds and (z, t) is an edge of red(G[X]).)
φ_6(x, y, X) ⟺ x ≠ y and X is the set of vertices of the minimal path from x to y.
(Write that φ_4(x, y, X) holds and that there do not exist z, t,
t' ∈ V_G such that φ_5(x, y, z, t, X) holds, (z, t') ∈ edg_G,
t' edg_G^* y and t' is strictly smaller than t w.r.t. α.)
φ_7(x, y) ⟺ there exist X and z such that z is a root of G,
φ_6(z, y, X) and φ_5(z, y, x, y, X) hold.
Hence φ_7(x, y) holds iff (x, y) is an edge of F(G, α). This concludes the
proof. []
In the following result, we need not use a given local ordering,
because we can define one.

Corollary 3: For each d ∈ IN, some depth-first traversal of trees of
outdegree at most d is MS-definable.

In Theorem 2, we showed how to construct some topological
sorting. In certain applications (in particular to traces), one may wish
to construct a specific one. We shall MS-define the ≤_α-minimal
topological sorting of a dag G having antichains of cardinality at
most k, where α is defined from a covering of G by chains.

Dags with small antichains

Let A_k be the class of dags G having antichains of cardinality
at most k. By Dilworth's Theorem, they are vertex-covered by the
union of k paths P_1, P_2, ..., P_k (not necessarily disjoint). A chain
partition of V_G is a partition (X_1, ..., X_k) such that each set X_i is
linearly ordered by edg_G. From the paths P_i, one gets a chain
partition by taking X_1 = P_1, X_i = P_i − (P_1 ∪ ... ∪ P_{i−1}) for i = 2, 3, ..., k (we
denote also by P_i the set of vertices of the path P_i). Conversely, if a
chain partition X_1, ..., X_k is given, then for each i one can find a path
P_i that contains X_i. Hence G is vertex covered by P_1 ∪ ... ∪ P_k, a
subgraph of G that is the union of k paths. From X_1, ..., X_k we define
as follows a linear order on V_G, denoted by α(X_1, ..., X_k):
(x, y) ∈ α(X_1, ..., X_k) iff either x ∈ X_i, y ∈ X_j and i < j,
or x, y ∈ X_i and x edg_G y.
We shall denote by S(G, X_1, ..., X_k) the ≤_α-minimal topological
sorting of G where α = α(X_1, ..., X_k).

Theorem 4: For every fixed k, S(G, X_1, ..., X_k) is MS-definable in the
structure <V_G, edg_G, X_1, ..., X_k> if G is a dag and (X_1, ..., X_k) is a chain
partition of V_G.

We need some notation and lemmas. If S is a linear order on a set
V, we denote by $S the enumeration of V in increasing order for S.
Concatenation of sequences is denoted by a big dot (·). If G is a dag and α
is a linear order on a superset of V_G, we let S(G, α) denote the ≤_α-minimal
topological sorting. If x is a vertex, we denote by G − x the
graph G[V_G − {x}].

Lemma 5: Let G be a dag and α be a linear order on a superset of V_G.
Then $S(G, α) = b · $S(G − b, α), where b is the α-smallest root of G.
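Lemma 5 immediately yields a greedy way of computing $S(G, α): output the α-smallest root, delete it, and repeat. A minimal Python sketch (our names; alpha is a key function for the given linear order):

    def minimal_topological_sorting(vertices, edges, alpha):
        remaining, out = set(vertices), []
        while remaining:
            roots = [v for v in remaining
                     if not any(x in remaining for (x, y) in edges if y == v)]
            b = min(roots, key=alpha)    # the alpha-smallest root, as in Lemma 5
            out.append(b)
            remaining.remove(b)
        return out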

Our proof of Theorem 4 will be an induction on k. The following
lemma will give the inductive step.

Lemma 6: Let G be a dag, let (Y_1, Y_2) be a partition of V_G such that Y_2
is linearly ordered by edg_G. We let α_2 be this linear order on Y_2 and
α_1 be any linear order on Y_1. Then S(G, α_1 · α_2) = S(G ∪ α'_1, α'_1 · α_2)
where α'_1 = S(G^+[Y_1], α_1).

We" d e n o t e b y al.(Z 2 t h e l i n e a r o r d e r i n g o n V G s u c h t h a t (x, y)


al.a2iffeither (x; y) e a i for s o m e i or x e Y 1 , y E Y 2 . ( H e n c e S a l . a 2 =
~al.$tz2). This lemma means that one can construct S(G ,al.a2) in
t h r e e s t e p s : o n e f i r s t c o n s t r u c t s a ' l = S ( G +[Y1], a l ) n a m e l y t h e a l -
t o p o l o g i c a l s o r t i n g o f t h e r e s t r i c t i o n to Y1 o f t h e t r a n s i t i v e c l o s u r e o f
G ; t h e n o n e lets G' c o n s i s t of G a u g m e n t e d b y t h e e d g e s (x, y ), x ~ y ,
s u c h t h a t (x, y) ~ ~'1: w e s h a l l v e r i f y t h a t G' in i n d e e d a c y c l i c , a n d
third one constructs S(G', a ' l . a 2 ) giving t h e d e s i r e d S ( G , a l . a 2 ) .

Proof of Theorem 4. We use an induction on k. Let us recall that G is a dag
and (X_1, ..., X_k) a chain partition of V_G. The case k = 1 is trivial. We consider
the case k = 2.
Claim: The relation S(G, X_1, X_2) is equal to the relation S such that
(x, y) ∈ S iff either x edg_G^* y or (x ⊥_G y, x ∈ X_1, y ∈ X_2).
The definition of S is expressible by an MS-formula. This completes the
proof of the case k = 2. We now consider the case k > 2. We let Y = X_1 ∪ ... ∪ X_{k−1}.
We have, letting β_i be the linear order equal to the restriction of edg_G to X_i:
S(G, X_1, ..., X_k) = S(G, β_1 · β_2 · ... · β_k)   by definitions
                  = S(G ∪ β', β' · β_k)            by Lemma 6
where β = β_1 · β_2 · ... · β_{k−1} and β' = S(G^+[Y], β).
By induction S(G^+[Y], β) is definable by an MS formula in terms of X_1,
..., X_{k−1}. By the case k = 2, another formula can define S(G ∪ β', β' · β_k), i.e., the
desired S(G, X_1, ..., X_k). []

General graphs.
Can one extend Theorem 2 to directed graphs with cycles? One
cannot of course define a topological sorting, but one may want to
define a linear order. Let us assume that G is directed and has an
origin, namely a vertex r from which every vertex is reachable by a
directed path; let us also assume that α is a partial order on V_G that
orders linearly each set Suc_G(x). For every x there is a ≤_α-smallest
path in G from r to x, denoted by π(x), and the definition of F(G, α)
extends. The α-depth-first traversal of the tree F(G, α) is a linear order
of G (but not always a topological sorting). However, we do not know
how to define it in MS. The reduction red(H) is not uniquely defined
for graphs H with cycles (take for example a complete directed graph
with 3 vertices) and we do not know how to define an alternative
formula with the same meaning as φ_2(x, y, X). Actually, no such formula
does exist: otherwise one could express in MS that a directed graph has
a Hamiltonian cycle and this is not expressible (see [6]).

Open questions: Is it possible to MS-define a linear order of a locally
ordered directed graph having an origin? Is it possible to MS-define a
linear order of a connected undirected graph given with a partial
order that orders linearly the set of vertices adjacent to any vertex?

Recognizability versus MS-definability.
We now consider classes of graphs of which the recognizable sets
are MS-definable. We let C_k ⊆ A_k be the class of dags G such that
there exists a chain partition (X_1, ..., X_k) of V_G and a topological
sorting S such that: for each edge (x, y) of G, if x ∈ X_i then x is the
last vertex in X_i that precedes y with respect to S.
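The defining condition of C_k is easy to test directly; the following Python sketch (our names) checks whether a given chain partition and topological sorting witness membership.

    def is_Ck_witness(edges, chains, sorting):
        pos = {v: i for i, v in enumerate(sorting)}
        chain_of = {v: i for i, X in enumerate(chains) for v in X}
        for (x, y) in edges:
            before_y = [z for z in chains[chain_of[x]] if pos[z] < pos[y]]
            # x must be the last vertex of its own chain preceding y in S
            if not before_y or max(before_y, key=pos.get) != x:
                return False
        return True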

Lemma 7: The class C_k is MS-definable and MS-formulas can define
in every graph G of C_k a chain partition and a topological sorting
witnessing that G is in C_k.

Proof: An MS-formula can express that (X_1, ..., X_k) is a chain
partition of the considered dag. From any chain partition (X_1, ..., X_k) we define a
graph G' by adding to G an edge (x, y) iff: there exist z ∈ V_G and i ∈ [k]
such that z, y ∈ X_i, (z, x) ∈ edg_G and y is the successor of z in G^+[X_i]. The
result follows since a topological sorting S of G satisfies the condition iff it is
a topological sorting of G'. []

Theorem 8: Let k ∈ IN. Every recognizable subset of C_k is MS-definable.

We conclude this section with an application to partial k-paths (see
[6,8,13] for definitions).

Corollary 9: A set of k-connected partial k-paths is recognizable iff it
is MS-definable.

It is conjectured in [5] that a set of graphs of treewidth at most k,
for any fixed k, is recognizable iff it is the set of finite models of an MS
formula, where one can also use special quantifiers expressing that
sets have cardinality that is a multiple of a fixed number. This result
is a further step towards the proof of this conjecture since partial k-paths
have treewidth at most k.

Traces

We give a new proof of a result of [12] on recognizable sets of
partially commutative words, also called traces. We recall a few
definitions and we refer the reader to [1] for more details.
A partially commutative alphabet is a pair (A, C) where A is a finite
alphabet and C a set of unordered pairs of letters of A that are said to
commute. We let ≡ denote the least congruence on A* such that ab ≡ ba for
every {a, b} ∈ C. An element of A*/≡ is called a trace. The quotient monoid
M(A, C) = A*/≡ is called the trace monoid defined by (A, C). If L ⊆ A*/≡, we
let L̄ ⊆ A* be the union of the sets t, for t ∈ L. (Each trace t is an
equivalence class, hence a subset of A*.) The notion of a recognizable subset
of M(A, C) follows immediately from the monoid structure. We shall use the
following more concrete characterization, that can be taken as a definition:
L ⊆ A*/≡ is recognizable iff L̄ is a regular language (i.e., a recognizable
subset of the free monoid A*).
Let us enumerate A in a fixed way as {a_1, ..., a_k}. Every trace t contains a
unique <-minimal element (where < is the lexicographic order associated with
the enumeration of A) that we shall denote by min(t). Ochmanski has proved
that a set L ⊆ M(A, C) is recognizable iff Min(L) (:= {min(t) / t ∈ L}) is a regular
language. We shall give a new proof of this result, based on graphs and
monadic second-order logic.
Every trace can be represented by a dag with vertices labelled by the
letters, called its dependency graph. We fix A = {a_1, ..., a_k} and C. With every
word u = b_1 b_2 ... b_n ∈ A* (with b_1, ..., b_n ∈ A) we associate a graph G
constructed as follows:
• V_G = {1, 2, ..., n} (if n = 0 then G is empty),
• each vertex i has the label b_i,
• there is a directed edge from i to j iff i < j and {b_i, b_j} ∉ C
(this is the case in particular if b_i = b_j).
Finally, we let H be the reduction of G; it will be denoted by dep(u). It
will be handled as a relational structure <V_H, edg_H, (lab_{a,H})_{a ∈ A}> where
lab_{a,H}(x) holds iff x has label a.
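This construction is entirely effective. A direct Python transcription (our names; commuting is the relation C given as a set of unordered pairs of letters):

    def dep(u, commuting):
        n = len(u)
        V = list(range(1, n + 1))
        label = {i: u[i - 1] for i in V}
        edges = {(i, j) for i in V for j in V
                 if i < j and frozenset((u[i - 1], u[j - 1])) not in commuting}
        # reachability (all edges go from smaller to larger positions)
        reach = {i: {j for (x, j) in edges if x == i} for i in V}
        for i in sorted(V, reverse=True):
            for j in list(reach[i]):
                reach[i] |= reach[j]
        # keep only the reduction: drop edges implied by a longer path
        reduced = {(i, j) for (i, j) in edges
                   if not any(j in reach[z] for z in reach[i] if z != j)}
        return V, reduced, label

For instance, with A = {a, b, c}, C = {{a, c}} and u = cab, the vertices labelled c and a are incomparable and both precede the vertex labelled b.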

Proposition 10 ([1]): For any two words u, v ∈ A*, u ≡ v iff dep(u) and
dep(v) are isomorphic, iff v is a topological sorting of dep(u).

It follows that the graph H as above is actually associated with
the equivalence class of u, i.e., with a trace t. We shall denote by dep(t)
the abstract graph that is the isomorphism class of dep(u) where u is
any member of t. (The numbering of the vertices of dep(u) depends on u
but is irrelevant in dep(t).) If L is a set of traces, we let Dep(L) := {dep(t)
/ t ∈ L}.

Lemma 11: The mapping dep from words to graphs is a definable
transduction.

Proposition 12: Let (A, C) be a partially commutative alphabet.
(1) Every dependency graph satisfies the following properties: it is
acyclic, reduced, any two adjacent vertices are labelled by
noncommuting letters and any two vertices labelled by
noncommuting letters (in particular any two vertices labelled by the
same letter) are comparable.
(2) Every directed graph with vertices labelled in A that satisfies the
above conditions is a dependency graph.

In particular every dependency graph G belongs to the class of
dags C_k where k is the size of the alphabet: the sets of vertices with a
same label form an appropriate chain partition. For such a graph G, we
let min(G) = min(dep^{-1}(G)) = the unique <-minimal word in the trace
dep^{-1}(G) ⊆ A*.

Proposition 13: The mapping min is a definable transduction from
dependency graphs to words.

Proof: Let A = {a_1, ..., a_k}. Let G be a dependency graph. Let X_i = {x ∈ V_G / lab_G(x) = a_i}. Then V_G = X_1 ∪ ... ∪ X_k where (X_1, ..., X_k) is a chain partition. The word
min(G) is nothing but the reduction of the linear graph S(G, X_1, ..., X_k), hence
min is definable since it is the composition of two definable transductions:
the one defining S(G, X_1, ..., X_k) (by Theorem 4) and the reduction.
[]
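Concretely, min(G) can also be computed greedily from the dependency graph (this is not the MS-definable construction of the proof, just an operational reading of it): at each step the available roots carry pairwise distinct letters, since roots with equal letters would be comparable, and one removes the root with the smallest letter. A Python sketch with our names:

    def min_word(vertices, edges, label, letter_order):
        remaining = set(vertices)
        preds = {v: {x for (x, y) in edges if y == v} for v in vertices}
        word = []
        while remaining:
            roots = [v for v in remaining if not (preds[v] & remaining)]
            v = min(roots, key=lambda w: letter_order.index(label[w]))
            word.append(label[v])
            remaining.remove(v)
        return "".join(word)

On the small example given after the definition of dep (u = cab, a and c commuting, enumeration a < b < c) this returns acb = min(dep(cab)).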

Theorem 14: Let (A, C) be a partially commutative alphabet. The
following properties of a subset X of A*/≡ are equivalent:
(1) X̄ is a regular language,
(2) Min(X) is a regular language,
(3) Dep(X) is an MS-definable set of graphs,
(4) Dep(X) is a recognizable set of graphs.

Proof. (1) ⇒ (2): By Proposition 13, one can construct an MS-formula θ(x, y)
such that for every word u ∈ A*, this formula defines in the relational structure
representing u a linear order that corresponds to min(u). One can build a
closed MS-formula θ' that verifies that this order coincides with the natural
order on the letters of u, i.e., that min(u) = u. The set Min(A*) of minimal words
is MS-definable, hence regular. Since Min(X) = X̄ ∩ Min(A*), we get the
desired implication.
(2) ⇒ (3): The transduction min from dependency graphs to minimal words is
definable. Since Dep(X) = min^{-1}(Min(X)) we obtain that Dep(X) is MS-definable
if Min(X) is, which is the case by Büchi's theorem if Min(X) is
regular.
(3) ⇒ (1): The transduction dep that maps a word u ∈ A* to the corresponding
dependency graph is definable. Note also that X̄ = dep^{-1}(Dep(X)). Hence, if
Dep(X) is MS-definable, so is the language dep^{-1}(Dep(X)), which is also
regular by Büchi's theorem.
(3) ⇒ (4) is a consequence of the result by Courcelle [4] saying that every MS-definable
set of graphs is recognizable.
(4) ⇒ (3): We have observed that dependency graphs belong to the class C_k
where k = card(A). The result follows from Theorem 8 saying that recognizable
subsets of C_k are MS-definable. []

The equivalence of (1) and (2) in this theorem is also proved in [12]
by a more complicated method, using rational expressions for
representing the two considered languages. The equivalence of (1) and
(3) in this theorem is also proved in [15], by using the asynchronous
cellular automata (see [3]) and the difficult result stating that these
automata define exactly the recognizable sets of traces. Our proof does
not use such complex tools: it uses only MS logic, and regular
languages are handled through MS logic and Büchi's Theorem.

References

[1] AALBERSBERG I., ROZENBERG G., Theory of traces, Theoret. Comput. Sci. 60 (1988) 1-82.
[2] AHO A., HOPCROFT J., ULLMAN J., The design and analysis of computer algorithms, Addison-Wesley, 1974.
[3] CORI R., MÉTIVIER Y., ZIELONKA W., Asynchronous mappings and asynchronous cellular automata, Information and Computation 106 (1993) 159-203.
[4] COURCELLE B., The monadic second-order logic of graphs I: Recognizable sets of finite graphs, Information and Computation 85 (1990) 12-75.
[5] COURCELLE B., The monadic second-order logic of graphs V: On closing the gap between definability and recognizability, Theoret. Comput. Sci. 80 (1991) 153-202.
[6] COURCELLE B., The monadic second-order logic of graphs VI: On several representations of graphs by relational structures, Discrete Applied Mathematics 54 (1994) 117-149.
[7] COURCELLE B., The monadic second-order logic of graphs VIII: Orientations, Annals Pure Applied Logic 72(2) (1995).
[8] COURCELLE B., The monadic second-order logic of graphs X: Linear orderings, http://www.labri.u-bordeaux.fr/-courcell/ActSci.html, May 1994.
[9] COURCELLE B., Monadic second-order definable graph transductions: a survey, Theoret. Comput. Sci. 126 (1994) 53-75.
[10] COURCELLE B., Recognizable sets of graphs: equivalent definitions and closure properties, Math. Str. Comp. Sci. 4 (1994) 1-32.
[11] HOOGEBOOM H., ten PAS P., Recognizable text languages, MFCS 1994, LNCS 841 (1994) 413-422.
[12] OCHMANSKI E., Regular behaviour of concurrent systems, Bull. of EATCS 27 (1985) 56-67.
[13] PROSKUROWSKI A., Separating subgraphs in k-trees: cables and caterpillars, Discrete Math. 49 (1984) 275-285.
[14] THOMAS W., Automata on infinite objects, in "Handbook of Theoretical Computer Science, Volume B", J. Van Leeuwen ed., Elsevier, 1990, pp. 133-192.
[15] THOMAS W., On logical definability of trace languages, Proceedings of a workshop held in Kochel in October 1989, V. Diekert ed., Report of Technische Universität München I-9002, 1990, pp. 172-182.
First-Order Spectra with One Binary Predicate

Arnaud Durand and Solomampionona Ranaivoson

LAIAC, Université de Caen, France.
Email: arnaud.durand@info.unicaen.fr, ranaivoson@info.unicaen.fr

Abstract. The spectrum, Sp(φ), of a sentence φ is the set of cardinalities
of finite structures which satisfy φ. We prove that any set of integers
which is in Func^1, i.e. in the class of spectra of first-order sentences of
type containing only unary function symbols, is also in BIN^1, i.e. in the
class of spectra of first-order sentences of type involving only a single
binary relation.
We give similar results for generalized spectra and some corollaries: in
particular, from the fact that the large complexity class ⋃_c NTIME_RAM(cn)
is included in Func^1 for unary languages (n denotes the input integer),
we deduce that the set of primes and many "natural" sets belong to
BIN^1.

1 Introduction
Let φ be a first-order sentence (i.e. with no free variable). The spectrum of φ,
denoted by Sp(φ), is the set of cardinalities of finite structures which satisfy the
sentence φ. Let Spectra denote the class of spectra of all first-order sentences.
It has been proved that

NP = Spectra

holds for the unary representation of integers (see [11, 3]).
Such a result establishes a connection between computational complexity and
finite model theory and permits us, for any property, to study its "complexity"
either in terms of the machine which recognizes it or in terms of the formula
which characterizes it.
So, as soon as we have a good measure of complexity for formulas, we are able
to translate some computational complexity results into finite model theory
ones. For example, Pudlák [14] uses Cook's hierarchy theorem (see [2]) in
order to show that there is a strict hierarchy for spectra. Of course, the inverse
translation is possible although we do not know any result in computational
complexity obtained in this way.
Let Sp(dV) be the class of spectra of sentences with at most d universal
quantifiers. Let S be a set of positive integers. We have, for all d > 1 (see
[8, 10]):

S ∈ NTIME(n^d (log n)) ⇒ S ∈ Sp(dV) ⇒ S ∈ NTIME(n^d (log n)^2),


"~78

where n is the input integer. This result links closely the degree of the
nondeterministic time complexity class which recognizes a given property P and the
required number of universal quantifiers for a formula which expresses P. If we
use NRAMs (with only successor as operator and uniform cost measure) instead
of TMs, it yields, for d > 1 (see [9, 10]):

⋃_c NTIME_RAM(c·n^d) = Sp(dV) = Sp(dV, arity d).

Those results allow us to affirm that the number of quantifiers of a sentence
is a good measure for the complexity of its spectrum. Furthermore, it gives
immediately the hierarchy theorem (by Cook's hierarchy [2]):

∀d > 1, Sp(dV) ⊊ Sp((d + 1)V).


A second way to study spectra complexity is to consider the maximal arity
of function and relation symbols of formulas. Then, for each d > l(see [9, 10]),
N T I M E ( n d log n) C_Sp(arity d), and

N T I M E ( n d) C Sp(arity d, without function symbol), for d > 2.


Unfortunately, there are not converse results. For example, we are not able to
find any non deterministic time higher bound for the class of spectra of formulas
with a single binary relation symbol.
D e f i n i t i o n A directed graph is a finite structure ~ = (Dom, I{) where R
is a binary relation. Let c be a positive integer. We say that a graph G is of
ouldegree bounded by c if for each x in the domain, the number of elements y
which satisfy/{(x, y) is bounded by c.
Now let φ be an existential second-order sentence of type τ, i.e. φ is of the
form:

∃X_1 ... ∃X_k φ',

where φ' is a first-order formula and the X_i are extra function or relation
symbols. We call generalized spectrum, denoted GenSp(φ), the set of finite structures
of type τ which satisfy the sentence φ.
We denote by BIN^1(τ) (resp. BIN^{1,bd}_c(τ), resp. Func^1(τ)) the class of
generalized spectra of formulas of type τ where the only second-order quantified
symbol is a single binary relation symbol (resp. is a single binary relation symbol
all of whose interpretations are of outdegree bounded by c, resp. are unary
function symbols). Finally, let BIN^{1,bd} = ⋃_{c>0} BIN^{1,bd}_c.
Obviously, for τ = ∅, generalized spectrum and spectrum are both the same
(in this case we write, for example, BIN^1 instead of BIN^1(∅)).
The aim of this paper is to show that on finite structures every existential
second-order sentence with the second-order quantifiers ranging over unary
functions is equivalent to an existential second-order sentence with a single
second-order quantifier ranging over binary relations.

The main result is the following one:

Theorem 1.1 With the definitions given above:

Func^1(τ) = BIN^{1,bd}(τ).

We will only give the proof of the theorem for spectra (see the remark at the
end of the paper for the case of generalized spectra). Talking about spectra, we
have as an immediate corollary:

Corollary 1.2
NTIME(n log n) ⊆ BIN^1,
where n is the input integer.

In particular, it is another way to show that the set of primes belongs to
BIN^1 (see also [16]).
We divide our work into three parts. In section 3, we prove the following
proposition.

Proposition 1.3 Let φ be a first-order sentence of type {f_1, f_2, ..., f_k} where
the f_i are unary function symbols. Then there exists a first-order sentence φ'
of type {R} where R is a binary relation symbol, and there exists an integer i_k such
that, for each positive integer n:
φ has a model F = (Dom, f_1, f_2, ..., f_k) of cardinality n
iff
φ' has a model G = (Dom, R) of cardinality n
and G's outdegree is bounded by i_k.

Then, in section 4, we prove the converse result (which is easier than the
previous one).

2 Definitions

We will use the usual notation in first-order logic and in model theory.
A type τ is a finite set of relation and function symbols {V_1, ..., V_k}. A
formula is of type τ if all its relation and function symbols are in τ.
M = (Dom, V_1, ..., V_k) denotes a structure consisting of a nonempty set
Dom called the domain and of relations and functions defined on Dom. For
convenience, our notation will not distinguish between a relation or function
symbol and its interpretation.
The cardinality of a structure is the cardinality of its domain. In this paper,
we will only consider finite structures.
Let φ be a sentence. Let Δ(x) be a formula with only one free variable
x. We define the relativization φ^Δ of φ by induction on the construction of
formulas as follows: if φ is atomic, then φ^Δ = φ; else (¬φ)^Δ = ¬φ^Δ, (φ_1 ∧ φ_2)^Δ =
φ_1^Δ ∧ φ_2^Δ, (∃x φ)^Δ (also denoted (∃x Δ(x)) φ) becomes ∃x(Δ(x) ∧ φ^Δ) and (∀x φ)^Δ
(also denoted (∀x Δ(x)) φ) becomes ∀x(Δ(x) → φ^Δ).
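For instance (a routine example in our notation), relativizing a sentence of type {R} to Δ gives
\[
\big(\forall x\,\exists y\;R(x,y)\big)^{\Delta}
\;=\;\forall x\big(\Delta(x)\rightarrow\exists y\,(\Delta(y)\wedge R(x,y))\big).
\]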
"~80

3 From unary functions to binary relation

3.1 First Lemma

Let us begin with the main result which is also the most difficult one:

Lemma 3.1 Let φ be a first-order sentence of type {f_1, f_2, ..., f_k}; then there
exist
• a first-order sentence φ' of type {R} where R is a binary relation symbol,
• two integers h_k, i_k,
such that for all n ≥ h_k:
φ has a model F = (Dom, f_1, f_2, ..., f_k) of cardinality n
iff
φ' has a model G = (Dom, R) of cardinality n
and G's outdegree is bounded by i_k.

The construction of the digraph. Assume F = (Dom, f_1, f_2, ..., f_k) where
Dom is of cardinality n. We want to construct a digraph G = (Dom, R) which
encodes F.
Let u_1, v_1, ..., u_k, v_k, a, b, c be 2k + 3 distinct fixed elements of Dom. With
these points we define respectively 2k + 3 subsets U_1, V_1, ..., U_k, V_k, A, B, C of
Dom by (e.g. for U_1):
for all x ∈ Dom \ {u_1, ..., c}, x ∈ U_1 ⟺ R(x, u_1).
We suppose that U_1, V_1, ..., U_k, V_k, A, B, C are pairwise disjoint (in fact we
shall take |U_i| = |V_i| = |A| = |B| = |C| ≥ ⌈√n⌉).

(·) We represent Dom in A × B by associating injectively by R a pair
(a_y, b_y) of A × B to each element y of Dom (arrows pr1 and pr2 in fig.1 and
fig.2).
(··) We define a bijection from each set U_1, V_1, ..., A, B to C (arrows bij
of R in fig.1 and fig.2).
Now, let us show how we encode f_i(x) = y: first, we read the representation
(a_y, b_y) of y in A × B induced by step (·). Then, from (a_y, b_y), we follow the bijections
of step (··) to a (unique) pair (u_{i,y}, v_{i,y}) of U_i × V_i (U_i, V_i are the two sets which
correspond to the function f_i). Finally, we associate x by R to u_{i,y} and v_{i,y}
(arrows f_i^1 and f_i^2 in fig.1 and fig.2).
In fig.1, we give the corresponding construction for three points x, y, z such
that f_1(x) = y and f_2(x) = z and show how B is defined by b (recall that
x ∈ B ⟺ R(x, b)).

[Figure 1 (diagram not reproduced): the arrows (both full lines and dotted lines) represent relation R.]
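To make the encoding concrete, here is a compact Python sketch of the construction (our names and block layout; it assumes n is large enough to accommodate the 2k + 3 marker elements together with 2k + 3 blocks of size m = ⌈√n⌉, and it returns the edge set of R):

    import math

    def encode(n, funcs):
        # funcs: a list of k dicts/lists, each a total unary function on range(n)
        k = len(funcs)
        m = math.isqrt(n - 1) + 1                   # block size, m * m >= n
        assert n >= (2 * k + 3) * (m + 1), "n too small for this sketch"
        dom = list(range(n))
        markers = dom[:2 * k + 3]                   # u1, v1, ..., uk, vk, a, b, c
        blocks = [dom[2*k+3 + i*m : 2*k+3 + (i+1)*m] for i in range(2 * k + 3)]
        # blocks[2i], blocks[2i+1] play the roles of U_{i+1}, V_{i+1};
        # the last three blocks play the roles of A, B, C.
        A, B, C = blocks[2*k], blocks[2*k + 1], blocks[2*k + 2]
        R = set()
        for marker, block in zip(markers, blocks):      # membership edges (def)
            R |= {(x, marker) for x in block}
        for block in blocks[:-1]:                       # bijections onto C (bij)
            R |= {(x, c) for x, c in zip(block, C)}
        proj = {y: (A[y // m], B[y % m]) for y in dom}  # injective pairing (.)
        R |= {(y, a) for y, (a, _) in proj.items()}     # pr1
        R |= {(y, b) for y, (_, b) in proj.items()}     # pr2
        for i, f in enumerate(funcs):                   # encode f_i(x) = y
            U, V = blocks[2*i], blocks[2*i + 1]
            for x in dom:
                ay, by = proj[f[x]]
                u, v = U[A.index(ay)], V[B.index(by)]   # follow bij backwards
                R |= {(x, u), (x, v)}
        return R

Each vertex then has outdegree at most 2k + 4, in agreement with Remark 2 below.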

Remark 1 If two elements x_1, x_2 of Dom have the same image y by f_i, they
will be associated by R to the same pair (u_{i,y}, v_{i,y}) of U_i × V_i.
Remark 2 It is easy to see that the digraph G has an outdegree bounded by
2k + 4 where k is the number of functions of F (outdegree 2k + 4 is obtained for
elements in U_1, V_1, ..., A, B).

B u i l d i n g t h e f o r m u l a ~p' of type { R } of L e m m a 3.1 Without loss of gen-


erality assume that all the atomic formulas have one of the forms fi(u) = v or
u = v. The formula ~' is:

~l=3u13vl...3a3b3c A(Ul,Vl,...,b,c) A~oA~IA~2A~P3A~4A~*,

where A(ul, V l , . . . , b, c) expresses that:


- ul, v l , . . . , a, b, c are pairwise disjoint,
- for all u, v belonging to {ul, v l , . . . , a, b, c} we have -~R(u, v),
and ~Po,..., ~4 are defined below.
Let Ul(x) abbreviates R ( x , u l ) , . . . , C ( x ) a b b r e v i a t e s R ( x , c).

e 0 : v= [(ul(=) - , ~ v l ( ~ ) ) ^ (ul(~) -~ ~u2(~)) A . . . A (B(~) -~ ~C(=))]


"The subsets U1, V1, . . . , A, B, C are pairwise disjoint"
Remark A(ui,vx,...,a,b,c) force{ul,vl,...,a,b,c}andU1UV1U...UAU
B U C to be disjoint.
I, 82

We set ~t1 = A kPl,X where:


XE{U1 ,Vl ..... A , B }

~l,X: (VvCh')) (~3,xX(~))


,
R(x,'y)
^(WX(~)) (3h'C(rl) R(x,~)

"There is a one-one correspondence from each set U1, V1,..., A, B to C "

~2: Vz (3aA(a))(3/3B(/3))(Va'A(a'))(Vfl'B(/3'))
[(R(~, ~') ^ n(~, ~')) ~ (~' = ~ ^ ~' = ~)]

"Each element x of the domain is associated by R to a unique pair (a, b) of


A x B"

~3: (VaA(c~)) (V/3B(fl)) Vx Vy


[(R(~, ~) A n(~, Z) A R(V, ~) A R(V, Z)) -~ ~ = V]

"The above construction ( f r o m the domain go A x B ) is an injection"

k
We set kP4 = A~ where:
i=1

el: vx(3~u~(~)) (3vV~(vl) (w'Ui(u')) (Vv'V,(v'))


[(R(x, u') ^ R(x; v')) ~ (u' = u ^ v' = v)]

"each element x of the domain is associated by l=d to exactly one pair (u, v)
of U; x 88

We obtain W* from W by replacing each sub-formula of the form f i ( x ) = y by


the following formula (*)i(x, y) (see fig.l) :

(3~A(~) (3flB(fl))(371C(71)) (372C(72)) (3uUi(u)) (3vVi(v))


[•(y, ~) A R(v,/~)
^R(~, ~1) ^ R(Z, 72)
^R(u, ~i) A R(v, 72)
^R(x, u) ^ R(x, v)].
183

S o m e r e m a r k s a b o u t ~ o , . . . , ~ 4 # 0 , . . . , #4 describe syntaxically the con-


straints of the digraph ~ of 3.1.1 . NevertheIess we have to verify there is not
hidden difficulty and those constraints are computable. First it is easy to see that
there is no "double-use" possible. T o show this, we will describe all the kinds
of edges between two points. Note that an element of A ( or B, C,..., Ui, Vi or
one of the "constants" a, b, c , . . . , ui, vi) is also an element of the domain and
then is concerned by formulas #~, #z, #4 as such an element. Let z, y be two
elements of Dora such that R(z, y) holds. Concerning y we obtain exactly one
of the following four cases :
9 If y is one of the "constants" ul, v l , . . . , a, b, c, then R(z, y) defines z as
an element of one of the subsets U], V1,... ,A, B, C, respectively (denoted def
in fig.3).
9 If y E C, then R(x, y) is an edge of bijection involved by #1 (denoted bij
in fig.3 e.g if x E A then bij is the bijection A ~ C).
9 If y E (A U B), then R(z, y) means y is one of the two projection of x in
A • B (denoted prl or pr2 in fig.3) involved by #2 A #3.
* If y E (Ui U V/) for a certain i, then R(z, y) means y is one of the two
representative projections of the image of z by f~ in Ui • ~ (denoted f] or f~
in fig.3) involved by #4.
R e m a r k Constants ul, V l , . . . , a, b, c, are also elements of Dom. Then they are
represented in A • B (arrows prl,pr2 in fig.l) and in Ui • ~ (arrows f] in fig.l).
Let us consider Ul. We can easily make a difference between the definition of the
subset U1 (edges of the form R(., Ul)) and the representation of Ul in a subset
(edges of the form R(ul, .)). Figure 2 shows, as an example, all kind of edges
which are adjacent to the subset C:

. ~ to U.i, Vi,A,B
p,,, pr ,

from U i , Vi
(bij)

i, ,; l

9 t o U i , Vi ' / ,
(f~)

to A,B
(prl, pr2)

\ ~ / /-r \ ~ fromA,B
(bij)

fig.2

Let Rem denote Dom\{U1, V1,..., A, B, C} (in particular constants a, b, c,


184

9.., ul, vi are in Tdem. Fig.3 describes the unique meaning of each arrow R(x, y)
according to the respective sets (U1, V j , . . . , A , B, C) of its endpoints x and y.
We will also distinguish the case where y is one of the constants Ul, Vl,...,a,b,c.

e def
b d~f
a def
v2 def
U2 def
Vl def
111 def
c bij bij bij bij bij bij
B pr2 pr2 pr~ pr2 pr2 pr2 pr2 pr2
A pr! prl prl prl prl prl prl prl
V2
U2 f~ f~ f~ f~ f~ f~ f12 f~
V1 f21 f~ f21 f~ f~ f~ f~ f2
U1 f~ fl f~ f~ fl f~ fl fl
I

I y, x---~ U1 V1 U2 V2 A B C Rem

fig.3

C a r d i n a l i t y c o n d i t i o n s * Conditions ~1, r ~3 (which express the existence


of bijections or injections between some sets) imply m = [A[ = [B[ = IV[ =
IUi[ = IV/[ for i=l,...,k and m 2 >_ IDoml = n.
9 On the other hand ~0 A A ( u l , v l , . . . , a , b , c ) implies:
mx(2k+3)+2k+3<n
For example those inequations are satisfied by m = [ V ~ for any n > (2k +
4) 2 .

P r o o f o f l e m m a 3.1 Let ~ = (Do,n, R) be a model o f ~ ' . Let ~ = (Dora, f l , . . . , f~)


be the functional structure on the same domain such that for i = 1 , . . . , k and
r y E Dora:

J: ~ fi(x) = y ~ (Dom, R, ut, Vl,...,b,c) ~ (*i)(x,y) (*).


185

Clearly each fi is a well-defined function because (Dora, R, Ul, Vl, . . . , b, c)


#0 A . . . A kP4 implies (Dora, R, ul, v l , . . . , b, c) ~ Vz 3!y (*i)(z, y); on the other
hand .T satisfies 9 because (Dom, R, Ul, v l , . . . , b, c) ~ ~* and because of equiv-
alence (*).
Conversely, let 5v = (Dora, f l , . . . , fk) be a structure of cardinality n such
that n > h~ = (2k + 4) 2. Let G = (Dora, R) be its associated digraph (cf. 3.1.1).
By construction (Dora, R, ul, v ~ , . . . , b, c) ~ kP0 A . . . / ~ ~4 and Equivalence (*)
holds. So if ~ ~ 9 then 6 ~ 9*. []

3.2 P r o o f o f p r o p o s i t i o n 1.3

We also need the following general result:

L e m m a 3.1.1 Let 7" be a type. I r A E Sp(7") and B is a finite set then A O B


and A - B are also in Sp(7").

Proof easy. []

Proposition 1.3 follows by applications of lemma 3.1 and lemma 3.1.1. As an


immediate corollary we have :

C o r o l l a r y 3.2
Func~ C_ B I N 1,b~ C_ B I N 1.

4 A converse result

We have to prove the following proposition which is easier than the previous
one.

P r o p o s i t i o n 4.1 Let k be a fixed positive integer and let 9 be a first-order


sentence of type {R} where R is a binary relation symbol. Then there exists a
first-order sentence 9" of type {f0, fl, f 2 , . . . , fk} such that, for each positive
integer n :
9 has a model ~ = (Dora, R) of cardinality n
and outdegree bounded by k (i.e. may be 0,1,...,k)
iF
9 I' has a model J~ = (Dora, f0, fl, f ~ , . . . , fk) of cardinalily n

P r o o t ~ We divide our proof into two parts. Let k be a positive integer.


9 Assume 9 is a first-order sentence of type {R}. Let us exhibit a first-order
sentence 9 ~ of type {R ~, Z}, where Z is a unary predicate symbol, such that for
all positive integer n:
there exists ~ = (Dom, R) of cardinality n which satisfies 9
and ~'s outdegree is bounded by k (may be 0,1,..., k)
iff
there exists ~ = (Dom, R ~, Z) of cardinality n which satisfies 9'
~86

and each vertex of G' has an outdegree for R ~ between 1 and k.


Intuitively Z is the subset containing all the elements x of Dom of outdegree
zero. We replace each atomic subformula of ~ of the form R(x, y) with R'(x, y) A
~Z(x). ~ is the conjunction of the resulting formula and of the following

37w, --+

Let G ~ 9, we build G' to be a model of 9' as follows: edges of R' are given
by those of R both with edges (x, 7) where x is of outdegree 0 (for R) and 7
is some fixed element. Conversely, if G' D ~', then the structure G such that
"/~(a, b) holds iff R'(a, b) A "-,Z(a)" is a model of p (by construction of 9') and
if G' has an outdegree bounded by k then so has G.
9 From now on we transform R' and Z into unary functions. We only have to
replace in 9' each sub-formula of the :form R'(x, y) with fl (x) = yV...Vf~ (x) = y
where the fis' are new unary function symbols and to replace each subformula
Z(x) by the formula fo(x) = z where f0 is a unary function and z is a new
variable. We denote ~"(z) the resulting sentence.
The idea is to "label" the (at most) k edges R'(x, Yl), R'(x, Y2),..., R'(x, Yk)
starting from any x by respective arrows f, : x ~ Y l , . . . , fk : x ~ y~. The reader
should be easily convinced that the following equivalence holds for IDom I >_2:
there exists G' = (Dora, R', Z} which satisfies 9' and where
each vertex has an outdegree (for R') between 1 and k
iff
there exists iT = (Dora, fo, f l , . . . , fk) (on the same domain)
which satisfies 3z~"(z)
[]

We obtain as an easy consequence of proposition 4.1:

C o r o l l a r y 4.2
B I N l'b~ C_ Func~.

Theorem 1.1 follows easily.

Now, let us give an interesting corollary. Let S be a set of positive integers.


As mentioned before, Grandjean [8, 10] shows that:

S E NTIME(ndlogn) ~ S E Sp(dV) ~ S E NTIME(nd(logn)2).


Where d > 1 and n is the input integer. Let us suppose S C {1}* (each
integer is identified to its unary notation), then:

C o r o l l a r y 4.3 For unary languages:

N T I M E ( n log n) C_ B I N 1,b~C_B I N 1.
187

Proof

N T I M E ( n log n) C_ Sp(1V, unary) (sea [i0])


C_ F u n c ~
: BIN1, b~
C BIN 1 []

All "natural" sets of integers seem to be in B I N 1 because they seem to


belong to the "large" class N T I M E ( n l o g n) (recall that n is the value of the
input integer). In particular, this implies :

C o r o l l a r y 4.4 The set of primes and the set of perfect numbers are in B I N 1.

Notice that this corollary can also be proved using results of Woods [16]. Let
P be a k-ary predicate (on integers). We say that P is rudimentary if it can
be defined by a first-order sentence q~ in a language containing only equality
(x = y), addition (x + y = z) and multiplication (x.y = z) predicates and whose
variables are bounded by the variables of P. For example, it is easy to see that
the set of primes is a (unary) rudimentary predicate.
In his thesis [16], Woods shows that every rudimentary set of positive integers
is the spectrum of a sentence involving only one binary relation symbol (then,
of course, corollary 4.3 follows). Let R U D denote the class of rudimentary sets.
In fact, our opinion was that the following inclusions hold :

R U D C_ F u n c ~ g B I N 1.

Recently, F.Olive [13] proved the first inclusion:

G e n e r a l r e m a r k The proof is similar for generalized spectra except for lemma


3.1.1, where instead of a finite set B of integers we consider a finite set B of
structures. The solution consists in describing completely each structure of B.

As a consequence of corollary 3.2 and of result in [10] (which says that


conneetedness and strong connectedness are expressible by sentences with only
unary function symbols as extra predicates), we have:

C o r o l l a r y 4.5 Conncctedness and strong connectedness are expressible by sen-


tences with a single extra binary relation.

This "contrasts" with the result by Fagin and De Rougemont ( see [4, 7, 15])
that connectedness is not definable by a monadic second-order sentence even in
the presence of an underlying successor relation.
~88

5 Conclusion

By an extension of the method of this paper, we hope to give soon the same kind
of result where instead of a simple binary relation we consider more restricted
one as a symetric binary relation or a partial ordering.
In [6], Fagin asks the problem of the existence of spectra (resp. generalized
spectra about graphs) which are not in B I N 1 (respectively in B I N 1 ({R})). Usu-
ally, logical undefinability results (in a given language) concern natural problems
(about graphs, words, numbers). But, for all we know, most of natural problems
about graphs (resp. words, numbers) are either in B I N 1 ({R}) or in F u n c ~ ( { t ~ } )
(which, according to this paper, is also in B I N 1 ({R})). Consequently, a positive
answer to the above question seems to lie in the construction of artificial prob-
lems. This explains, in some way, why such a positive answer seems to be very
hard to justify.
Aknowledgements We would like to thank Professor Etienne Grandjean for
the many ideas he suggests to us and for the attention he gives to this work. We
are grateful to Nadia Creignou for her helpful advices which improve readability.

References

1. M. Ajtal. ~l-formulae
1 on finite structures. Ann. Pure Appl. Logic, 24:pp.1-48,
1983.
2. S.A Cook. A hierarchy for nondeterministic time complexity. J. Comput. Systems
Sei., vol.7:pp.343-353, 1973.
3. R. Fagin. Generalized first-order spectra and polynomial-time recognizable sets.
Complexity of computations, vol.7:pp.43-73, 1974.
4. R. Fagin. Monadic generalized spectra. Z. Math. Logik. Grundlag. Math.,
21:pp.89-96, 1975.
5. R. Fagin. A spectrum hierarchy. Z. Math. Logik Grundlag. Math., (21):pp.123-
134, 1975.
6. R. Fagin. Finite-model theory - a personal perspective. Theoretical Computer
Science, (116):pp.3-31, 1993.
7. R. Fagin, L.J. Stockmeyer, and M.Y. Vardi. On monadic np vs monadic co - np.
IBM Research Report, 1993.
8. E. Grandjean. The spectra of first-order sentences and computationM complexity.
S I A M d. Comput., vol.13:pp.356-373, 1984.
9. E. Grandjean. Universal quantifiers and time complexity of random access ma-
chine. Math. Systems Theory, vol.18:pp.171-187, 1985.
10. E. Grandjean. First-order spectra with one variable. J. Comput. Systems Sci.,
vol.40(2):pp. 136-153, 1990.
11. N.D Jones and A.L Selman. Turing machines and the spectra of first-order formulas
with equality. J. Symb. Logic, vol.39:pp.139-150, 1974.
12. J.F. Lynch. Complexity classes and theories of finite models. Math. Systems
Theory, vol.15:pp.127-144, 1982.
13. F. Olive. Personal communication.
14. P. Pudlak. The observational predicate calculus and complexity of computations.
Comment. Math. Univ. Carolin., vol.16, 1975.
189

15. M. De Rougemont. Second-order and inductive definability on finite struc-


tures. Zeitschrift .fur Mathematische Logik und Grundlaoen der Mathematik,
vol.33:pp.47-63, 1987.
16. A.R. Woods. Some problems in logic and number theory and their connections.
PhD thesis, University of Manchester, 1981.
Monadic Logical Definability
of NP-Complete Problems

Etienne Grandjean, Fr@ddric Olive

LAIAC, Universit4 de Caen


Etienne.Gr andj ean@info.unicaen.fr, olive@logique.j ussieu, fr

A b s t r a c t . It is well known that monadic second-order logic with linear


order captures exactly regular languages. On the other hand, if addition
is allowed, then J.F.Lynch has proved that existential monadic second-
order logic captures at least all the languages in NTIME(n), and then
expresses some NP-complete languages (e.g. knapsack problem).
It seems that most combinatorial NP-complete problems (e.g. traveling
salesman, colorability of a graph) do not belong to NTIME(n). But it has
been proved that they do belong to NLIN (the similar class for RAM's).
In the present paper, we prove that existentia] monadic second-order
logic with addition captures the class NLIN, so enlarging considerably
the set of natural problems expressible in this logic. Moreover, we also
prove that this logic still captures NLIN even if first-order part of the
second-order formulas is required to be V*3*, so improving the recent
similar result of a.g.Lynch about NTIME(n).

K e y words : Computational complexity, monadic second-order logic,


finite model theory, nondeterminism, NP-complete problem, linear time,
random access machine.

INTRODUCTION
W h a t is the relation between the computational complexity of a decision 1 prob-
lem and the logical complexity of the language required to describe it? This
question was formulated first by I m m e r m a n [Ira3] (see also [Gul, Gu2, I m l , Im2,
Va]). In complexity theory any problem is identified with a language s C Z*.
Solving the problem means deciding the associated language, i.e., deciding for
each word w whether it belongs to s or not. Now, a word w on a finite alphabet
is easily identified to a finite structure Aw. The most common encoding consists
in identifying each w C {0, 1}* with the finite structure (n, X, suce) defined by :
on = {0, 1 , . . . , l e n g t h ( w ) - 1} ;
9X C n a n d g i E n : X(i) Cezthei thletter o f w i s 1;
9suce is the successor relation on n.
Such an encoding being chosen, the purpose is the following : try to associate
to every complexity class g a class of formulas of a given logical language, 9e, in
such a way that: A language s C S* is in g iff there exists a sentence 9 E 5e s.t.

VwEZ* : wEs ~.
1 This question is also studied for optimisation problems. See [KoTh] for instance.
191

Among the results which were proved in that framework, let us mention those
of Biichi and Fagin, often quoted in this way :
regular languages = monadic SO = monadic SO(~) [Bit] ; N P = S 0 ( 3 ) [Fa],
where, whenever C is a complexity class and Z" is a class of formulas, C = 3v
means that for every language L on Z:, X: is in g iff there exists a formula ~ E
such that : Vw E ,U* : w E L: r Aw ~ ~. We would expect this correspondence
between computational complexity and logical definability to allow us :
1- to use the flexibility of complexity-theory tools (models of computation for
instance) to get results on logical definability;
2- to export complexity questions in the field of logical definability.
An important example of the second item is the following : In order to prove
that a given problem belongs to a given complexity class, it is enough to build an
algorithm of that complexity which decides this problem. But of course, there is
not guarantee that this algorithm is the "best" one. In other words, the difficulty
is to obtain lower bounds of complexity.
In [Ly2], Lynch focuses on this precise question. He writes : "there are hundreds
of known NP-complete problems, but until recently, not one of them had a prov-
able nontrivial lower bound." ~ This is the reason why Lynch wants to compare
the class NTIME(n) (i.e. the class of languages recognized in nondeterminis-
tic linear time on Turing machines) with some class of logical formulas F . He
hopes then to obtain results as "H ~NTIME(n)", for some natural NP-complete
problems H, in proving their non-definability by formulas of f . Actually, the
"inclusion" NTIME(n)C ~ is enough to get such a result, because it still allows
the implication "H non-definable by f :=~ H ~NTIME(n)". The main result of
[Lyl,2] may be stated as follow :

T h e o r e m . Let L: C {0,1}* belong to NTIME(n). Then there exists a first-


order formula q5 on a signature made up, in addition to the predicate symbols
X, succ, plus of respective arity 1, 2, 3, of monadic predicate symbols U 1 , . . . , Us,
s.t.:
(i) the quantifier prefix of q~ has the form V*3*;
(it) Vw e {0, 1}n: w E F~ ~ ( n , X , suec, plus) ~ 3U--~(X,-ff , succ, plus) ,
where succ, plus and X are interpreted on n = { 0 , 1 , . . . , n - 1) by the pre-
defined relations: X ( i ) ~=~ the i th letter of w is 1; suee(i,j) ~=~ j = i + 1;
plus(i, j, k) r i + j = k.

Concluding his paper [Ly2], Lynch asks: "Can the (above) theorem be extended
to random access models of computation, where the memory elements can be
read and written in any order?" The question is justified by the specificity of his
proof, which uses a discrete analogue of the Intermediate Value Theorem for the
Turing machines, that is : if the head is at the position c at the time t and at
c' at the time t' > t, then for every position c" between c and c', there is some
time t" between t and t' such that the head is at c" at time t". Of course, there
2 There is a noteworthy exception with the problem "Reduction of Incompletely Spec-
ified Automata"(RISA), which has been proved non-solvable in deterministic linear
time in [Gr2].
!92

is not equivalent result for RAM's.


The aim of this paper is to prove that Lynch's question has nevertheless a positive
answer. More precisely, we show that the previous theorem remains true when
replacing "NTIME(n)" by "NLIN", where NLIN is a complexity class (including
NTIME(n)) elaborated by Grandjean [Gr3] to formalize nondeterministic linear
time on RAM's. The definition of this class will be recalled in 1.1. But let us
before go back over some steps of the linear time formalization story.
Although it is commonly mentioned by algorithm designers, the linear time com-
plexity notion is especially hard to formalize, because of its great sensitivity to
models of computing and to problems encodings used to describe it. This lack of
robustness, apparently inherent in linear time, leads Gurevieh and Shelah [GuSh]
and Gr~del [G1] to define two "robust closures" of linear time. Previously, Schnorr
[Scr] had similarly defined "quasilinear time" (that is time O(n(log n) ~ for
some Turing machine) and proved that many NP-complete problems belong to
nondeterministic quasilinear time and are complete for this class. Those authors
defined extensions of linear time because, as [GuSh] explains: "It is possible that
there is no universal notion of linear time and different versions of linear time
are appropriate to different applications." In [Gr3] and [Gr4], Grandjean adopts
the opposite point of view: he defines and justifies a unified, robust and power-
ful notion of linear time, both in deterministic and nondeterministic cases, with
the classes DLIN and NLIN. Moreover, in [Grl], [Gr2], [Gr3] he gives a logical
characterization of NLIN by
second-order formulas written with <, suc, 0 and unary function symbols.
This characterization will be our main tool to prove that monadic second-order
logic captures NLIN. Such a result appreciably extends the Lynch's one, since
NTIME(n)CNLIN, and since all the twenty one Karp's problems [Ka] have been
proved as belonging to NLIN, when only two of them (Knapsack and Partition)
are known as belonging to NTIME(n).

1 N L I N Characterization and Sketch of P r o o f of the


Main Result
1.1 N L I N Definition
Let /: be a language over a finite alphabet Z. L E N L I N if there exists an
integer k # 0 and an NRAM (i.e. non deterministic RAM) Tr such that :
1. R only uses the arithmetic operations suc (x H x + 1) and p r e d (x ~ x - 1),
2. Tr eeads its input w by blocks of length l = [-~ log(n + l)] 3 , where n =
length(w) (w E S'~),
3. 7~ recognizes s in time O ( n / l o g n) (where the time is the number of executed
instructions, and n is the length of the input word), or, equivalently, in time
O ( m ) , where m is the number of blocks of the input.

Remark. k only depends on S cardinality (e.g. for IS1=2, each k > 1 suits).
a In this paper, log n denotes the logarithm of n in basis 2.
193

1.2 R e a d i n g D e c o m p o s i t i o n of a W o r d w E ,U*

In what follows, we assume that Z is the dyadic alphabet ( Z = {1, 2}) and we
denote k an integer greater than or equal to 2. In this section, we describe a de-
composition of the words of ~* suited to the NRAM definition given above.This
decomposition, called "reading decomposition", will allow to encode every word
w E Z* as a finite structure of the (m, f, <, suc, 0) type, where f denotes a unary
function on m = {0, 1 , . . . , m - 1}, < is the usual order and suc is the successor
function on the domain m. This encoding is used in an essential manne? in the
logical characterization of NLIN given by [Gr3].
Let w E Z * 9
Let us s u p p o s e n = l e n g t h ( w ) , l = , r ~ k l / , m = r91"Thenwe
can break down w into m words To, w l , . . . , win-l, all of which have length l
exactly, except the last one, which may be of length less than or equal to I. We
have w = wg'w~". ~ wm_l, and if w d denotes the integer whose dyadic writing
is w~, it is easy to show that m -- O ( n / l o g n) and w d < m for large enough n.
We can now identify w E {1, 2}*, with reading decomposition w = w ~ . . . ~ win-l,
with the finite structure (m, fw, <, 0) defined by :
- m = { 0 , 1 , . . . , m - 1}
: ~ m ----+ m
- f~ ( i~.. f~(i) = w d
- < is the natural linear order on m and 0 is the "real" 0 of m. 4

1.3 N L I N Characterization and Main Result

Notations. In the following, we denote FO(Vi3 j) the class of first-order formulas


in prenex conjunctive normal form, the quantifier prefix of which has the pattern
Vi3 j. We omit the exponent i or j when it equals 1. FO(V*3*) is Ui,j F o ( V i ~ J ) 9
F O ( w q ) is the class of first-order formulas without quantifier. These notations
naturally extend to define any class of formulas logically equivalent to formulas
of the concerned class. With the same notations as in 1.2, NLIN characterization
essentially proved in [Grl], [Gr2], [Gr3] may be written:

Theorem1. Let f~ C {1,2}*. Then ~ E N L I N iff there exists a formula


E FO(wq) with only one first-order variable on signature {f, g l , . . . , gp, <, 0},
where f and the g~s denote unary functions symbols, < is a binary predicate
symbol and 0 is a constant symbol, so that:

vw e { 1 , 2 } * : (w e iJ ((m, IT, <, 0) 391...3gpvx e(x,/,y, <, 0)),


where (m, fw, <, 0) is the finite structure associated to w as above.

This exact characterization of NLIN will allow to prove our main result, that is

4 Compare this with the usual identification of a word w E {0, 1}'~ with the finite
structure (n, X, <), where X is the unary relation on n such that X(i) holds iff the
i *h letter of w is 1 (see [Lyl].
9Z.

T h e o r e m A Let s C_ {1,2}*. If s E N L I N , then there exists a sentence


0 E FO(V*3*) on signature {X, U1,..., Us, +}, where X and the Ui's are unary
predicate symbols, + is a ternary predicate symbol, such that :

w E {1, 2)" : [w L] [{n, X, +> b 3U1... 3Us O(X, U, +)],


when + is the predefined addition on n and X is the predicate naturally coding
w on n, i.e. defined by : Vi < n : X ( i ) r the letter of index i o f w is 2.

1.4 A Simplified View of the Proof of Theorem A

9 We know that each positive integer x may be coded by a word of {0, 1}* of
length [log(x) + 1] (binary representation of integers). In particular, if E is a
finite set of integers, we can identify every x E E with a word B(x) E {0, 1}*
of length L = M a x { [ l o g ( x ) - 4 - l J , x E E} exactly, does mean padding with O's
binary representations of the elements of E. Equality between two integers of E
is equivalent to equality of their representative words. Namely, if for x E E we
denote B(x)t the bit of B(x) of rank t (0 < t < L with convention: the bit of
rank 0 is the less significant bit), we have :

Vx, y E E : [x = y] iff [(Vt < n ) ( B ( x ) t = B(y)t)]

9 This encoding of integers by fixed length words can be extended to functions


in the following manner : Since every x < m is encoded by B ( x ) E {0, 1} L, where
L = [log(m) + l J, a function f : m ~ m may be identified with the word :

B y = B(f(O))~B(f(1))~.. ~ B(f(m- 1)) E {0, 1} "~L

If the t th bit o f B / is denoted ( B f ) t , we have: Vf, g : m --+ m , Vx, y < m :

f ( z ) = g(y) i f f (Vt < L)( ( B ( f ( x ) ) , = B(g(y))t )

hence
f ( x ) = 9(y) i f f (Vt < L)( (BI),,~+t = (Bg)my+, ) (*)
Eventually, we know that a word of length mL is naturally associated with the
unary predicate X C mL defined by "

vt < L: X(t) (w), = 1


If for h : m ---+m, the corresponding capital H is the predicate of m L associated
in this way with the word B h E {0, 1} mL, the equivalence (*) becomes :

Vx, y E m : [f(x) =- g(y)] r162[(Vt < n ) ( F ( m x + t) +-+G(my + t))]

Thus, we convert a unary functional formula into a monadic relational formula.


This result may be extended to a large family of formulas on a signature made
up, in addition to unary functions symbols, of the predefined symbols <, sac
and 0 (natural linear order, successor function on integers and 0). This "transla-
tion", from unary functional language to the less expressive monadic relational
195

language, has a cost: the domain of finite structures considered must be expanded
(we pass from unary functions on domain m = O(n/log n) to unary relations on
domain mL, which is O(m log m) = O(n)), and the predefined symbols <, suc, 0
must be replaced by the much more expressive predicate +.
The study of this flattening of functions in unary predicates constitutes the
core of this paper. Combined with the above-mentioned characterization of a
language Z: E N L I N :
(w E/2) iff ((m, fw, <, O) ~ 3 g l . . . 3gpVx r f ,-y, <, 0))
it allows to prove that for every /2 E N L I N , there exists O E S0(3, mon, +)
(i.e. O is a monadic existential second-order formula with addition), so that :
(w e C) r (<n, X, +) ~ O),
where n = length(w) and X is the unary predicate on n coding w.
More precisely, the paper is organized as follows :
In p a r t 2 we first specify the above-mentioned identification between a unary
function f : m ~ m and a unary predicate F C_ mn (n = Llog(m)+lJ). Actually,
for technical reasons, the predicate F is built on a domain cn (for a fixed constant
e) slightly bigger than mL. We will call F "flattening of f on cn". Then, we state
precisely the passage from functional language to relational language : for each
formula gr E FO(V*3*) on signature {f, g l , . . . , gp, <, 0}, where the gi's are unary
function symbols, there exists ~ E FO(V*3*), on signature {F, G 1 , . . . , Gq, +},
where F and the Gi's are monadic predicate symbols, such that if f : m -* m
and if F is the flattening of f on cn then
(re, f , < , 0 ) p 3 ~ r (ca, F , + ) p 3 a ~
Then we combine this result with the NLIN characterization of Theorem 1 to
prove : for every /: C {1, 2}* in NLIN, there exists a second-order formula
such that for every w E {1, 2}":
(w E c) ((en,X, +) p +)),
where X is a unary predicate which codes w on cn.
P a r t 3 focuses on a technical result which makes it possible to provide the
preceding result with a more standard form. We first prove that if R 1 , . . . , Rp
are predicates with respective arities r l , . . . , r p , on a domain of the form cn,
and if 9 is a first-order formula on signature { R 1 , . . . , Rp}, then there exists a
division of each Ri into a series R* of ri - ary predicates on n, and a first-order
formula r on signature { R ~ , . . . , R~}, such t h a t :
(<cn, R1,..., Rp> p r (<n, Rp> p
Associated to the last equivalence seen in part 2, this result allows us to conclude:
If/2 C {1, 2}* belongs to NLIN, then there exists O E S0(3, mon, +) such that
for every w E {1,2}* :
(w e c) ** (<n, x, +> p o),
where X is the unary predicate naturally coding w.
~95

2 From Unary Functional Language to Monadic


Relational Language

2.1 Notations

- N is the set of positive integers.


- log n denotes the logarithm of n in basis 2.
- For q E N , B ( q ) denotes the reversed binary representation of q (e.g. B(6) =
011).
- For i E N, 0 i is the concatenation of i 0-symbols.
- If w is a word, l g ( w ) is the length of w.
- Let q, n be integers such that L _> Llog(q)+ 1] = l g ( B ( q ) ) . We note B n ( q ) =
B ( q ) ~ O n-tg(B(q)) (e.g. B5(6) = 01]00). B L ( q ) is in fact the word coding q
in reversed binary which was forced to reach length L, possibly padding it
with O's on its right.
- For m E N, we also denote m the domain {0, 1 , . . . , m - 1}. R e m a r k that
for m and L > [log(m) + 1] fixed in N, the m a p B L : m ~ {0, 1} L, which
m a p s q to B L ( q ) , is injective.
- For m E N , I d m denotes the identity function on m.
- For w = aoal . . . a,,-1 E {0, 1}* and for t < n, we will call "t th bit of w " the
symbol at, ie. the (t + 1) th letter of the word w.
- Let w = bob1 . . . b n - 1 E {0, 1} ~. We will call " predicate naturally coding w"
the unary predicate Xw C n defined by Vt < n : X w (t) r bt = 1.
In the same way, if w = d o d l . . . d n - 1 E {1,2} n is a word on the dyadic
alphabet, the "predicate naturally coding w" will be the unary predicate
Xw on n defined by Vt < n : Xv~ (t) ~ dt = 2.

R e m a r k . In the present paper, both the binary and the dyadic ~ representation
of positive integers are used. Binary representation is used when padded and
uniform length notations are helpful. Dyadic one allows to identify any n o n e m p t y
word of {1, 2}* with a positive integer, in a one-one way.

2.2 Flattening of a Function

For this whole paragraph, m, L, and m I denote three integers such that L =
[log(m) + l] and m ' > ran. We show there that, using a simple coding, each
function h : m --* m can be identified with a unary relation H on m t.
Let h : m --~ m. For every i < m, let us denote b~. = B L ( h ( i ) ) . The image of each
i < m being thus coded by a word of {0, 1} L, h can be coded by a word Wh, the
length of which is exactly m ' : Wh = b~'b~" .. ~ b m _ l O m ' - m L

5 Recall that dyadic representation consists in encoding each positive integer x by the
word D ( x ) = x p . . . xlxo E {1, 2}*, where (x0, xl . . . . . xp) is the only tuple on {1,2}
such that : x = x0 + x12 + ... + zp2 p.
197

Remark. If for x < m and t < L, bt~ denotes the t th bit of b=, then :
L-1
h(x) = ~'~
i...~ bt~2t = E 2t
t.~-O t<L st
btz:l
Every function from m to m is thus represented in a single way by a word of
{0, 1}mL.om'-mL. Besides, one usually represents a word v E {0, 1}* of length q
by the unary predicate V C_ q defined as follows :
Yi < q : V(i) ~ the t th bit o f v is 1.
Composition of this two codings makes it possible to associate to each unary
function h : m ---* m a unary predicate H C m ~ defined as follows :
Ya < m ~ : H ( a ) r the OLth bit o f Wh i8 1
ie: H ( a ) r ~ = x a L + t ~ , with x~ < m , to < L , and bXc~
t" = 1
Which implies the following :
Vx<m,Vt<L : H(xL+t) vvbt~=l

D e f i n i t i o n 2 . From now on the predicate H C_ m ~ thus constructed from h :


m ~ m will be called the "flattening of h on m " .

Remark. The previous remark immediately results in :


2
t<L st
H(xL+t)

D e f i n i t i o n 3 . We call B z t ~ the flattening of the identity function Idm : m --~ m


9 J

on m ~. In other words, B~t~ is the unary predicate on m ~ naturally associated


to the word B L ( 0 ) ~ B L ( 1 ) ~ . . ~ B L ( m -- 1)~0 rn'-mL E {0, 1} m'. Or else:

9 m' { a = x ~ L + t ~ , with x~ < m , t~ < L ,


V~ < m' : B~t m (a) r and the t ~ bit o f BL(Xa) is 1

In fact, to make the notations simpler, we will note this predicate "Bit", the
(m, m ~) pair which it refers to being implicitly given by the context.

2.3 From F u n c t i o n a l Formulas to R e l a t i o n a l Formulas.

Let k > 1 be a fixed integer. For every integer n, we note: l = / k l,


m = [~], L = [log(m) + 1]. Given these notations, we get m = O ( n / l o g n ) and
therefore m L = O(n). This guarantees the existence of an integer c such that
V n E N * , m L < cn ( * ) . L e t c b e a fixed integer m e e t i n g ( , ) . For a l l n E N,
the (rn, cn) pair therefore verifies m L < cn, and it is possible to refer to the
flattening of a function h : m --~ m on cn.
In this paragraph, we seek to associate to each language ~: E N L I N , an existen-
tial monadic second-order formula ~ z , in such a way t h a t a word w E {0, 1}* is
in s iff a certain finite structure, in which w is encoded by the flattening of f~,
is a model of 4~c. This will be done along L e m m a s 4, 5, 6, 7 in using T h e o r e m
1 and stating successively that :
9 Flattenings are quite adapted to translate certain functional formulas, that we
-~98

could call "simple (atomic) formulas", in relational formulas, provided we add


to the language a series of predefined arithmetical symbols including +.
9 Does mean changing a little its first-order quantifier prefix, the formula which
characterizes a language s E N L I N (by Theorem 1) can be required to be a
boolean combination of these "simple formulas".
9 The conjunction of the two preceding facts allows to associate to each s E
N L I N a monadic existential second-order formula whose signature contain the
above mentioned arithmetical objects. Moreover, if the transformation is made
carefully, the first-order part of this formula can be required to belong to FO(V* 3*).
9 At last, we state that, excepted for +, all these arithmetical symbols can be
removed, without changing the structure of the formula.
All this will lead to Corollary 8, which establishes the existence of the formula

Z e m m a 4. Let f, g be two unary functions on m, F, G their flattenings on cn.


Then, for all x, y < m :
m ~ f ( z ) = g(y) r cn ~ (Vt < L ) ( F ( x L + t) ~ G(yL + t)) and

m ~ f ( x ) < g(y) r cn b (3t < L)(Vt' < L) [t' > t --+ ( F ( z L + t') ~-* G(yL + t')A]
[G(yL + t) A -,F(xL + t))

Proof. We just have to recall the connection between a function h : m ---+m, its
associated word B L ( h ( O ) ) ~ . . . ~ B L ( h ( m - 1)) E {0, 1}* and its flattening on cn
to see that these three equivalences express the following facts :

- f ( x ) : g(y) iff the words B L ( f ( x ) ) and BL(g(y)) are equal, i.e. coincide on
their L successive bits.
- f ( x ) < g(y) iff B L ( f ( x ) ) and BL(g(y)) are in the same order for the reverse
lexicographic order on {0, 1} L.
[]

Z e m m a S . Let s E ~*. If E 6 N L I N , then there exists ~o E FO(V*) on a


signature made up, beside the symbols <, O, of unary function symbols f, gl ... g~,
s.l. :
(i) V w E Z*, w G s r (m, fw,<,O> ~ 3 g l . . . 3 g r ~ ' ~
(ii) ~o has the following form : ~o _ V~ Ai Vj Aij, where each Aij can be written
a(u) = /9(v) or ~[a(u) : t3(v)] or -~[a(u) < j3(v)], for some c~,13 E {f, ~, Id}
and some u,v 6 {0,~}.

Proof. Let g' E F O ( w q ) be the formula associated to s by the Grandjean's


characterization of N L I N . So we have, for all w E ~* :
w E s ~ (m, f~, <, 0} ~ 391... 3gpVx~'(x, f, ~, <, 0). We can consider, without
loss of generality, that g' is under conjunctive normal form.
An atomic subformula of q' has the f o r m : h l o . . . o h p ( u ) oc k , o . . . o k , ( v ) ,
with hi, k~ E {f,-~, Id} (where Id is the identity function), u, v E {0, x}, and
ocE {=, #, <, ~}. But this formula is clearly equivalent to :
^i:p--2 / \
Vx [ap-l(x) = h p - l o h p ( x ) A I\~=1 c~{tx)= h/oai+l(X) A
fls- I(X) : ]r A A / =i =I s - 2" ~{(x) = k i o / 3 i + l ( x ) A cq(u) c< A?I(v)] ,
199

where the o~i's and /3i's are new function symbols. Thus, if we still denote
the tuple 91, 9 99 gr made up of gl,. 9 gp in addition to as many new function
symbols as necessary, this allows to write g' in the following equivalent form :
gs = A a~ = 7(x) A A V e(u) or r with
a, fl, 3: E {f,~}, r {f,-~,fd}, c<E {--,7~,<,~}.
Besides, replacing each subformula c~o/3(x) = 7(x) of gr by Vy[fl(x) = y ~ ~(y) =
7(x)], and writing the obtained formula under prenex conjonctive normal form,
we get for Vxg: an equivalent formula of the following form: Vg A V a(u) o< fl(v),
with a, fl E {f,-g, Id}, u,v E {0, g}, c<E { = , r 1 6 2 which can be written
more precisely : VgAiV~B#, where each Bq is one of the formulas a(u) =
/3(v), c~(u) r fl(v), a(u) < fl(v), o~(u) r /3(v), for some a,/3 E {f,~-, Id} and
some u, v E {0, g}.
At last, we observe that the formula c~(u) < fl(v) can be expressed with the
others (namely, those of the form ~(u) r /3(v), ~(u) r /3(v)) by a universal
formula using the only connectives A and V. Indeed we have :
o~(u) </3(v) ~ --,[/3(v) < c~(u)] A --,[fl(v) = a(u)]. The prenex conjunctive normal
form of the formula produced by this last rewriting is the sought formula ~.o. []

s Let 1: E Z*. If s E N L I N , then there exists ~ E FO(V*3*) on


signature {F, G1,..., Gr, +, <, x, Bit, m, L}, where F, Gi, +, <, x, Bit are pred-
icate symbols of respective arities 1, I (for every i <_ r), 3, 2, 3, and m, L are
constants symbols, s.t. :
Vw E Z*, w E s r (cn, F) p 3G1... 3G,~(F,-G, +, <, x, Bit, m, L),
where F is the flattening of fw on cn, +, <, x are respectively the predefined
addition, linear order and product on cn, Bit is lhe flattening of Idm on cn, and
m, L are the two predefined constants defined at the begining of this paragraph.

Proof. Let ~o - V~Ai Vj A~j the formula associated to /3 by Lemma 5. We


compute g) from ~o in translating each subformula Aq according to the equiv-
alences stated in Lemma 4. Namely, we have the following translations :
a(u) = fl(v) becomes (Vt < L)[A(uL + t) +-+B(vn + t)]
--[a(u) = fl(v)] becomes (3t < L)-,[A(uL + t) +-+B(vL + t)]
It' > t --+ (A(un + t') ~ B(vL + t')A]
--[a(u) < fl(v)] becomes (Vt < L)(3t' < L)-~ [ B(vL + t) A -,A(uL + t)) J
Notice that all these formulas are in FO(V*3*). Thus, rewriting the obtained
formula under prenex conjunctive normal form, we get a new formula with
the following pattern, (VE < m)(V~l < L)(3t2 < L) A V+H(_yL + t), with
H E {F,-G, Bit}, y E ~, t E {tl,t2}. This is the sought formula gr. []

We can achieve a stronger result in showing that the previous lemma still holds
when using + as the only arithmetical predefined symbol. Recall that for k, c
two fixed integers and for n E N, we denote : l = , k ,, m = [~],
L = [log(m) + lJ.
The following result, essentially due to Lynch, states that the constants k, c, n,
l, m, L, as well as the predicates <, x , Bit, are definable in cn by existential
monadic second-order formulas on a signature including the predefined predi-
200

cate +. Moreover, this definability preserves the V*3* quantifier pattern of the
formulas. More precisely, we have the :

I, e m m a 7. Let ~ro denote the set of predefined symbols { k, c, n, l, m, L, Bit, x, <


, + } . Let U be a series of unary predicate symbols. Let ~b ~ FO(V*3*) be a
sentence on the signature ~r0 U {U}. Then there exists ~' E FO(V*3*), on the
signature { + } U { U } U { D 1 , . . . , Dj}, where Di's are new unary predicate symbols,
such that :
cn ~ O(cro,-U) ,~ cn ~ 3 D 1 . . . 3 D j ~ ' ( U , + , D )

Proof. Definability of the predicates <, x , Bit, has been proved in [Lyl]. In
the first paragraph of the appendix, we recall the main steps of this proof and
we show the definability of the arithmetical constants k, e, n, l, m, L, taking a
specific care to the quantifier prefixes of the used formulas. []

C o r o l l a r y 8. Let s be a language over S = {1, 2}. If s C N L I N , then there


exists a sentence ~ E FO(V*3*) on signature {F, G 1 , . . . , G q , + } , where F and
the Gi's are unary predicate symbols, + is a ternary predicate symbol, such that:

[w 9 ,c] [r r, +) p 3c, ... 3c (F, c,,..., c,)]


where F is the flattening of rio on cn, and + is the predefined addition on cn.

Remark. From now on, we will still use subformulas like x < m, B i t ( y L +
t ) , . . . , but it will have to be interpreted as the abreviation of existential monadic
formulas on {+}.

2.4 Connection between F and X

Our purpose being to get back to a finite structure in which the word w is encoded
by its naturally associated predicate X, we make explicit, in this paragraph, the
link between F and X. Let w be a word of {1, 2} n and let us denote
w = d o 9..~0
.I-1 ~1 0 . . . .@
.. 1 d 0, ~ _ ~ . . . ~ , t-l
~ _ 20~ , ~ _ l . . . d , ~ _ l , 0 < s < _ l ,
its reading decomposition into rn words wi = d~ -1 for i < m - 2 and
0 s--1
w,~_ 1 = din-1 "'" d ~ _ l . The function fw : m ~ m is then defined by :
/-1
Ed{2 j if i < r n - 1
Vi < m, f~(i) = w/d = j=0
s--1

j=0
E'/ J 1 2j i f i = m
t~rn-- 1

Every integer w/d may be coded by its reverse binary representation of length L:
b ~ bL-1. Hence fw may be described by a binary word :
W' = boO . . . .b0
.L. - 1 0
brn_l . . . ./ Lm_l.~,
- 1 (~en-rnL E {0, 1} cn
Relationship between the words w, w ~ and the function f,o is therefore expressed:
Vi < m,
201

dO... d~-lis the reverse dyadic representation of fw(i) (replace 1 by s for win-l)
b ~ bL-lis the reverse binary representation of length L of f~ (i)
We deduce : Vx < m, Vt < L :
{b~ = 1}
(,) iff

(x < r e - 1) A {(t < 0 ^ ( 4 - 2) ^ (3t' < t)(d~' = 2)} .


{(t = 0 ^ (3t' < 0 ( 4 ' = 2)}
V (x = re 1) A [ the same, replacing l by s ]

With the choosen notations, the predicate X naturally coding w on n and the
flattening F of f~ on cn are defined by:
Vx < m, Vt < l: [n ~ X(xl + t)] r [dt~ = 2]
Vx < m, Vt < I: [ca ~ F(xL + t)] r [bt~ = 2]
The equivalence (.) can therefore be written : Vx < m, Vt < I :

F(xL + t)

iff

(x < re - 1) A {(t < 0 A x ( x l + t) ^ (3t' < t)x(~l + t')}


{(t = 0 ^ (3t' < t)x(~t + t')}
V(x=rn 1) A [ the same, r e p l a c i n g l b y s ]
More precisely, if X denotes the trivial extension of X to cn (that is, ,12 C c n and
Vi < en: X(i) ~ [i < n A X(i)]), and if ~(x, t, X) denotes the formula written
between braces in the above equivalence, then we have, Vx < re, Vt < L :
[ca ~ F(xL + t)] ~ [ca ~ O(x, t, X ) ] ,
and since X coincide with X on mL, which is the only part of cn concerned by
the formula 4~ :
[ca ~ F(xL + t)] ~ [ca ~ +(x, t, X)].
Then, a unary predicate F C cn is the flattening of f~ on cn iff cn ~ D(X, F),
where D(X, F) is the formula which makes explicit the connection between X
and the flattening of fw on cn.
i.e. D(X, F) =_W [ F ( ~ ) ++ (3x < re)(3t < L){(~ = *L + t) A +(x, t, X)}]
Since we have :
[w e E] r [(ca, F, +) p 3G1... 3Gq~(F, G ) ] ,
where F is the flattening of fw on cn
(this is Corollary 8 ) ,
we eventually deduce:
[w E ~.] r [(cn, X, 9-) ~ 3F3G1. . .3Gq( D(,.Y , F) A ~(F, G))]
(since D(X, F) characterizes the flatteningF o f f = on cn.)
Whence, denoting g~(X, F, G1,..., Gq) = D(X, F) A ~(F, G) :
[w E s ~:~ [(cn, X, +) p 3F3G1... 3aq~(X, F, a l , . . . , Gq)]
And at last, denoting Go the quantified second-order variable F, we have :
202

Lemma9. Let 12 be a language on ~ = {1,2}. I f s 9 N L I N , then there exists


a sentence ~ 9 FO(V*3*) on signature { X , Go, G 1 , . . . , Gq, +}, where X and the
G i ' s are unary predicate symbols, + is a ternary predicate symbol, such that :
[w e 12] e* [(en, X , +) p 3 G o . . . 3GqCJ(X, G o , . . . , Gq)]
when X is the trivial extension to cn of the predicate naturally coding w on n,
and + is the predefined addition on cn.

Remark. The assertion ~ 9 FO(V*3*) has not been justified above. Since ~) -_-
D A ~, and since ~ 9 FO(V*3*), it is sufficient to prove that D 9 FO(V*3*).
But D _ Vo~[F(c~) ~ (3x < m ) ( 3 t < L){(c~ = x L + t)A r can also be w r i t t e n :
D -- V a [ F ( a ) ~ (Vx < m ) ( V t < L ) { ( a - x L + t ) ~ ~]. Using this two writtings
and observing that :
9 a = x L + t can he replaced by a formula in FO(V*3*) ;
9 (P can be written with either a V*3* or an 3*V*-quantifier pattern,
the reader will easily convince himself that the result holds.

3 From cnton

This part states without proof a technical result which is essentially due to Lynch
(see [Lyl] or [Gr2]). Let us suppose that n and c are fixed in N. If R is an r-ary
predicate on cn, we call "division" of R on n the set of r-ary predicates on n,
R* = {Rfl...i~, i l , . . . , i r < c ] , defined, for all i l , . . . , i r < c, by :
v ( y ~ , . . . , Yr) 9 n r : R~I . , ( y l , . . . , ~ ) r R(iln + y l , . . . , ir~ + yr).
The bijectivity of the mapping which associates every r-ary predicate on cn to
its division R* = {Ril...~, ij < c} on n, allows us to a f f r m :
L e m m a 10. Let q5 be a first-order sentence on signature { R1, . . . , Rs, P1, . . . , Pt } ,
where each Ri is a ri-ary predicate symbol, and each Pi is a pi-ary relation on en.
Then there exists q~*, f r s t - o r d e r sentence on signature { R~ , . 9 . , R*~, P~ , 9 * ., p * ~ },
where each R* is a set of c r~ ri-ary predicates symbols, and the P* 's are the re-
spective divisions of the Pi 's, such that :
(cn, t:)1,..., Pt) ~ 3 R ~ . . . 3 R ~ ( R ~ , . . . , R~, P1, 9 Pt)
iff
(n, P*~,. .., Pt*) p3R~ . . . 3 R ~*r * ( R I*, . . . , R , , * PI*,...,Pt*)
Moreover, i r e 9 FO(V*3*), ~* can be required to be i~ FO(V*~*).
Theorem A immediately follows from Lemma 9, Lemma 10, ~nd a simple study
of the divisions of + (addition on cn) and X (trivial extension of the predicate
naturally coding w) on n :
T h e o r e m A Let 12 be a language on ~ = {1,2}. I f 12 9 N L I N , then there
exists a sentence 0 9 FO(V*3*) on signature { X , U 1 , . . . , Us, +}, where X and
the Ui's are unary predicate symbols, + is a ternary predicate symbol, such that
w 9 ~ : [~ 9 121 r [(~,x,+) p 3u~...~u~o(x, u,+)],
where X is the predicate naturally coding w on n, and + is the predefined addition
on ft.
203

Pro@ Let E, s w, X and + be like in the theorem hypothesis. Then, from


lemma 9, we have [w e E] Ca [(ca, X,+} ~ 3Go...3Gq~(X, Go,...,Gq,+)] ,
where X is the trivial extension of X to ca. This becomes, with lemma 10 :
[w C s <z [(n, X*, +*) p 3 a ; . . . 3 C q*( ~~) * ( x ,* c 0*, ..., a , *, +*)],
where X* and +* are the respective divisions of X and +, and each G* is a
set of c unary predicate symbols. Replacing { G ~ , . . . , G~} by {U1,..., Us}, with
s = (q + 1)c, and denoting kP* instead of (r we get :
[w e z] [(., x*, +*) e*(x*,ul,..,u,,+*)],
We have just to write +* and X* in a standard form to obtain the sought result.
D i v i s i o n o f + o n n : + is the ternary predicate on cn defined by :
[on p +(x, y, z)] Ca [x + y = z in ca].
+* is then the set {+ijk, i,j,k < c} made up of c3 ternary predicates on nl
defined, for each (i,j, k) C c3, by :
[n p +iJk(x, y, z)] Ca [ca p +(in + x,jn + y, kn + z)]
r [(in + x) + (jn + y) = (kn + z) in ca]
[n~ {(i+j=l~)A(x+y=z)}V ]
Ca {(i+j+l=k) A(x+y-z-l=n-1)}
Let us call r y, z) this last formula, in which + now denotes the addition on
n. Replacing the assertions "i + j = k" and "i + j + 1 = k" by their truth value
(which is perfectly defined for each triple (i, j, k)),. we have thus built, for each
formula +ijk(x, y, z), an equivalent formula r y, z) on n whose signature is
the only predicate +, namely, the addition on n.
D i v i s i o n o f X o n n : X is the trivial extension of X, i.e. the unary predicate
on cn defined b y : Vx < cn : X(x) r [x < n and X(x)]. So X* may be denoted
X* = { X ~ X~-I}, where each X i is a unary predicate on n defined by :
[ n p Xi(x)] Ca [cn p X(in + x)]
Ca [in + x < n and n ~ X(in + x)]
Ca [i = 0 and n p
In other words, ?do = X and for every i > 0, X i is the empty relation of n.
Let O be the first-order sentence obtained from r as follows : * each subformula
+qk(x, y, z) of ~* is replaced by the formula Cq~(x, y, z) described above ; 9 each
occurence of Xi(x) is replaced by X(x) i f / = 0, and by _L (the false) if0 < i < c.
Then O is the first-0rder sentence on the signature {X, U1,..., Us, +} announced
in the theorem. []

4 Conclusion

We have proved that N L I N , and then many NP-problems (the 21 NP-complete


problems of [Ka]), are definable in Lynch's logic, even with sentences in FO(V*3*).
In our opinion, this means that the Lynch's purpose announced in [Ly2], namely
proving that a natural NP-problem is not expressible in that logic, is a con-
siderably difficult problem, since it would imply that the concerned problem is
strictly harder than each of the 21 above mentioned NP-complete problems.
20z~

References

[Bii] J.R. BOCHI, Weak second order arithmetic and finite automata, Z.Math.
Logik Grundlagen Math. 6 (1960), pp.66-92.
[Fa] R.FAGIN, Generalised first-order spectra and polynomial-time recognizable
sets, in Complexity of Computations, R.Karp, ed., SIAM-AMS Proc. 7, 1974,
pp.43-73.
[G1] E.GR~.DEL, On the notion of linear time computability, International J. of
Foundations of Computer Science, No 1 (1990), pp.295-307.
[Grl] E.GRANDJEAN, A natural NP-complete problem with a nontrivial lower
bound, SIAM J.Comput.,17 (1988), pp.786-809.
[Gr2] E.GRANDJEAN, A nontrivial lower bound for an NP problem on automata,
SIAM J. Comput., 19 (1990), pp.438-451.
[Gr3] E.GRANDJEAN, Linear time algorithms and NP-complete problems, Proc.
CSL'92, Lect. Motes Comput. Sci. 702 (1993), pp. 248-273, also to appear in
SIAM J. on Computing.
[Gr4] E.GRANDJEAN, Sorting, linear time and the satisfiability problem, to ap-
pear in special issue of Annals of Math. and Artificial Intelligence, 1995.
[GuSh] Y.GUREVICH and S.SHELAH, Nearly linear time, Lect.notes Comput.Sci.
363 (1989), Springer-Verlag, pp.108-118.
[G~I] Y.GUREVICH, Toward Logic Tailored for Computational Complexity,
Computation and Proof Theory, (M.M.Richter et. al., eds.).Springer-
Verlag Lecture Notes in Math. Vol.l104, pp.175-216, Springer-Verlag, New
York/Berlin, 1984.
[Gu2] Y.GUREVICH, Logic and the challenge of computer science, Current Trends
in Theorical Computer Science, (E~Boerger Ed.), pp.l-55, Computer science,
Rockville, MD, 1986.
[Ira1] N.IMMERMAN, Languages which capture complexity classes, 15th ACM
Symp. on Theory of Computing , 1983, pp.347-354; SIAM J. Comput., 16,
No.4 (1987), 760-778.
Jim2] N.IMMERMAN, Relational Queries Computable in Polynomial Time, 14th
ACM STOC Symp., 1982, pp.147-152. Also appeared in revised form in
Information and Control, 68 (1986), pp.86-104.
[Im3] N.IMMERMAN, Descriptive and Computational Complexity, in J. Hartma-
his ed., Computational Complexity Theory, Proc. of AMS Symposia in Appl.
Math. 38 (1989), pp. 75-91.
R.M.KARP, Reducibility among combinatorial problems, IBM Symp.1972,
Complexity of Computers Computations, Plenum Press, New York, 1972.
[IKoTh] P.G.KOLAITIS, M.N.THAKUR, Logical definability of NP-optimization
problems, Technical report UCSC-CRL-90-48, Computer and Information
Sciences, University of California, Santa-Cruz, 1990.
[Lyl] J.F.LYNCH, Complexity classes and theories of finite models, Math. Systems
Theory, 15 (1982), pp.127-144.
[Ly2] J.F.LYNCH, The quantifier structure of sentences that characterize nonde-
terministic time complexity, in Comput. Complexity, 2 (1992), pp.40-66.
[Set] C.P.SCHNORR, Satisfiability is quasilinear complete in NQL, J. ACM, 25
(1978), pp.136:145.
[Va] M.VARDI, Complexity of Relational Query Languages, 14th ACM Syrup. on
Theory of Computation, 1982, pp.137-146.
Logics For Context-Free Languages

Clemens L a u t e m a n n 1 T h o m a s Schwentick 1
Denis Th~rien 2.

1 Johannes Gutenberg-Universits Mainz


2 McGill University Montreal

A b s t r a c t . We define matchings, and show that they capture the essence


of context-frceness. More precisely, we show that the class of context-
free languages coincides with the class of those sets of strings which can
be defined by sentences of the form 3 b~, where ~ is first order, b is a
binary predicate symbol, and the range of the second order quantifier
is restricted to the class of matchings. Several variations and extensions
are discussed.

1 Introduction

In descriptive complexity theory, we try to find logical characterisations of lan-


guage classes which are relevant for C o m p u t e r Science. T h e first such charac-
terisation was given by Bfichi [3], who showed t h a t a set of strings is a regular
language if, and only if, it is the set of finite models of a sentence of monadic
second order logic. Since then, m a n y i m p o r t a n t subclasses of regular languages
have been characterised by means of syntactical restrictions, or extensions of
first order logic, c.f. [2].
In this paper, we initiate a similar p r o g r a m m e for context-free languages. In
order to define any n o n - r e g u l a r languages, we have to go beyond monadic second
order logic. On the other hand, existential quantification over a single binary
relation is enough to express all context-free languages - but also some languages
which are not context-free. We therefore have to find a logic whose expressive
power lies between t h a t of monadic second order logic on the one, and of first
order logic with one binary existential second order quantifier, on the other
hand. We choose a semantical approach in which we restrict the second order
quantifier to range only over a specified class of binary relations. To be more
precise: let a be a signature, B a binary predicate symbol, and a let 13 be a class
of binary relations (such as, e.g., linear order relations, or equivalence relations).
We define the class 3 B f.o.(a) to consist of all those sets L of a - s t r u c t u r e s for
which there is a first order sentence over a<b>, such that, for every a - s t r u c t u r e
G,

G E L r there is a binary relation B E/3 on G, with < G , B > ~ ~.

* Work done while on sabbatical leave at the Universidad Politecnica di Catalunya


206

In the following sections, we will use string logic. We consider strings over
some a l p h a b e t A = { a l , . . . , a t } as structures over the signature < A , < > :=
< Q ~ I , ' " , Q~r, < > in the usual way: a string w = wl ... w~ E A + is identified
with the s t r u c t u r e 3 < { 1 , . . . , n } , Q ~ l , . . . ,Q~,, < > , where i E Q ~ iff wi = 6tj.
For the sake of clarity we will also use obvious abbreviations such as rain and
m a x (denoting the smallest and largest element, repectively). W i t h CFL(A) we
denote the class of context-free languages over the alphabet A.

2 Matchings

The essence of context-free languages is, in a sense, contained in the Dyck languages D_k, which consist of strings of properly matched parentheses with k different kinds of parentheses, c.f. [10]. Given some means of expressing that two positions in a string belong together, it is easy to describe the elements of D_k in first order logic. The binary predicates which provide this means are called matchings, and are defined as follows (c.f. Figure 1).

[Figure: the string a b a b b a b a a b b a a, positions 1–13, with arcs joining matched positions.]

Fig. 1. Example of a matching.

A binary relation B ⊆ {1, ..., n}² is called a pairing relation, if, for all i, j, k ∈ {1, ..., n} it satisfies

1. (i, j) ∈ B ⟹ i < j (B is compatible with <);
2. if (i, j) ∈ B and k ∉ {i, j} then (i, k), (k, i), (k, j), (j, k) ∉ B (elements belong to at most one pair).

A pairing relation is called a matching, if it is noncrossing, i.e., if for all i, j, k, l:

(i, j) ∈ B, (k, l) ∈ B, i < k < j ⟹ i < l < j.

If M is a matching, we call every pair (i, j) ∈ M an arch of M.


Let Match denote the class of all matchings.
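
For concreteness, the two defining conditions translate directly into the following small check (an illustration of ours, not part of the paper; the representation of a matching as a set of position pairs is an assumption).

```python
def is_pairing(pairs):
    """Conditions 1 and 2: every pair is increasing, and each position is used at most once."""
    used = set()
    for (i, j) in pairs:
        if not i < j:                     # condition 1: compatible with <
            return False
        if i in used or j in used:        # condition 2: at most one pair per position
            return False
        used.update((i, j))
    return True

def is_matching(pairs):
    """A matching is a noncrossing pairing relation."""
    if not is_pairing(pairs):
        return False
    for (i, j) in pairs:
        for (k, l) in pairs:
            if i < k < j and not (i < l < j):   # arches must be nested, never crossing
                return False
    return True

print(is_matching({(1, 4), (2, 3), (5, 8)}))   # True: properly nested arches
print(is_matching({(1, 3), (2, 4)}))           # False: the two arches cross
```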

3 Whenever appropriate, we use the same letters to denote predicate symbols and their
interpretations in the structure.

With quantification over matchings, the Dyck language D_k over the alphabet A_k := {α_1, β_1, α_2, ..., β_k} can be defined as follows:

∃M ∀x ∀y ∃z : (M(x, z) ∨ M(z, x)) ∧ (M(x, y) → ⋁_{i=1}^{k} (Q_{α_i}(x) ∧ Q_{β_i}(y)))
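
For intuition (an illustration of ours, not from the paper), membership in D_k can also be checked by the familiar stack scan; the matching quantified away in the formula above is exactly the pairing of each opening symbol with the closing symbol that pops it.

```python
def in_dyck(word):
    """Check membership in D_k, encoding α_i as ("a", i) and β_i as ("b", i)."""
    stack = []
    for kind, i in word:
        if kind == "a":
            stack.append(i)
        elif not stack or stack.pop() != i:   # β_i must close the most recent unmatched α_i
            return False
    return not stack

print(in_dyck([("a", 1), ("a", 2), ("b", 2), ("b", 1)]))   # True:  α1 α2 β2 β1
print(in_dyck([("a", 1), ("b", 2)]))                       # False: mismatched kinds
```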

We will now show that first order logic plus existential quantification over match-
ings is enough to define every context-free language, and, moreover, no other
languages can be defined that way.

2.1 Theorem. CFL(A) = ∃Match f.o.(A, <).


Proof of CFL(A) ⊆ ∃Match f.o.(A, <):
Here we have to show that, for every context-free language L ⊆ A⁺, there is a first order sentence φ over <A, <, M>, with the property that for all w ∈ A⁺: w ∈ L ⟺ there is a matching M such that <w, M> ⊨ φ.
As a first step, we derive a normal form for context-free productions which is particularly convenient for our purposes. As a starting point for our normalisation, we take the following normal form (c.f. [10], ch. VI, exercise 4).

2.1.1 Lemma. Every context-free language L ⊆ A⁺ is generated by a context-free grammar G = (A, N, S, P), in which every production is of one of the following forms:
1. X → a, X ∈ N, a ∈ A;
2. X → αuβ, with X ∈ N, α, β ∈ A, and u ∈ (N ∪ A)*.
Condition 2 ensures that every right-hand side which allows further derivations begins and ends with a terminal symbol. It is these symbols which we are going to match. In order to avoid some unpleasant technicalities, we need a slightly stronger normal form. Let p ≡ X_0 → v_0 X_1 v_1 ... v_{s−1} X_s v_s be a context-free production, where X_0, ..., X_s are nonterminals, and v_0, ..., v_s are (possibly empty) terminal strings. If s > 0, we call p a nonterminal production. Let | be a new symbol. We call the string v_0 | ... | v_{s−1} | v_s the pattern of the production p. We want the left-hand side of a nonterminal production to be uniquely determined by its pattern. This can be achieved in the following way, starting from a grammar as in 2.1.1:

1. Eliminate all productions of the form X → a, X ∈ N\{S}, a ∈ A; introduce a new production Y → uav, for every production Y → uXv.
2. Enumerate all nonterminal symbols X_1, ..., X_r. Starting with i = 2, do the following for every i: as long as there is a nonterminal production p ≡ X_i → v whose pattern also appears as the pattern of a production with left-hand side X_j, j < i, replace p by all productions which can be obtained from it by substituting one of the nonterminals in v in all possible ways.

This process will eventually terminate, since the substitution in 2 will either
make the production terminal, or it will increase its length. Thus we conclude:

2.1.2 Lemma. Every context-free language L ⊆ A⁺ is generated by a context-free grammar G = (A, N, S, P) which satisfies the following conditions:

- All productions are of one of the following forms
  1. S → a, a ∈ A, or
  2. X → αuβ, X ∈ N, α, β ∈ A, u ∈ (A ∪ N)*.
- If any two nonterminal productions have the same pattern, they have the same left-hand side.

Let L be generated by a grammar as in 2.1.2 and consider a derivation tree T of a string w ∈ L. The leftmost and the rightmost child of each internal node of T are leaves, labeled with terminal symbols. Define a pairing M_T on w by (c.f. Figure 2):

(i, j) ∈ M_T ⟺ i corresponds to the leftmost, j to the rightmost child of the same internal node of T.

Clearly, M_T is a matching.

[Figure: a derivation tree whose leftmost and rightmost leaves below each internal node are joined by arches.]

Fig. 2. The matching constructed from a derivation tree.
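
As an illustration (ours, not from the paper; the tree encoding and helper names are assumptions), the construction of M_T from a derivation tree can be sketched as follows.

```python
# A leaf is a terminal symbol (str); an internal node is (nonterminal, [children]).
def matching_from_tree(tree):
    """Return M_T: pair the leftmost and rightmost leaf positions below each internal node."""
    arches, pos = set(), [0]              # pos counts leaf positions 1, 2, ...

    def walk(node):
        if isinstance(node, str):         # leaf: a terminal symbol
            pos[0] += 1
            return pos[0], pos[0]         # first and last leaf position of this subtree
        _, children = node
        spans = [walk(c) for c in children]
        first, last = spans[0][0], spans[-1][1]
        arches.add((first, last))         # leftmost and rightmost children are leaves (Lemma 2.1.2)
        return first, last

    walk(tree)
    return arches

# Example: S -> a X b, X -> a b derives "aabb" with arches (1,4) and (2,3).
tree = ("S", ["a", ("X", ["a", "b"]), "b"])
print(matching_from_tree(tree))           # {(1, 4), (2, 3)}
```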

We show now that there is a formula φ over <A, <, M>, which holds for a string w with matching M if, and only if, there is a G-derivation tree T of w such that M = M_T. It follows that there is a matching M on w with <w, M> ⊨ φ iff w can be derived in G.
Every arch (i, j) ∈ M defines a substring w_i ... w_j. We say that an arch (k, l) ∈ M (i < k < l < j) lies at the surface of (i, j), if there is no other arch between it and (i, j), i.e., if there is no arch (r, s) with i < r < k < s. Similarly, a position k (i ≤ k ≤ j) which is not the endpoint of an arch other than (i, j) lies at the surface of (i, j), if there is no other arch between it and (i, j), i.e., if there is no arch (r, s) with i < r < k < s. The string of surface symbols, with surface arches replaced by the symbol | is called the pattern of (i, j), c.f. Figure 3.

Fig. 3. The surface of an arch, with pattern v_0 | v_1 | ... | v_s.

We say that an arch (i, j) corresponds to a production p, if their patterns are identical. This property of a pair of positions can easily be expressed by a first order formula χ_p(x, y), for every production p:
Let, for every string v, Φ_v(x, y) be a formula which expresses that the string strictly between positions x and y is v, i.e.,

<w, i, j> ⊨ Φ_v(x, y) ⟺ w_{i+1} ... w_{j−1} = v.

Then, for p ≡ X_0 → α v_0 X_1 v_1 ... v_{s−1} X_s v_s β, χ_p(x, y) can be formed as:

χ_p(x, y) = Q_α(x) ∧ Q_β(y) ∧ ∃x_1 ∃y_1 ... ∃x_s ∃y_s : (x < x_1 < y_1 < x_2 < ... < y_s < y)
            ∧ (Φ_{v_0}(x, x_1) ∧ Φ_{v_1}(y_1, x_2) ∧ ... ∧ Φ_{v_s}(y_s, y))
            ∧ (M(x_1, y_1) ∧ ... ∧ M(x_s, y_s)).
For every X ∈ N, let χ_X(x, y) be the disjunction of all those χ_q(x, y) for which X is the left-hand side of the production q. Then, for p as above, the following formula ξ_p(x, y) expresses that (x, y) corresponds to p and that the surface arches (x_1, y_1), ..., (x_s, y_s) correspond to productions with left-hand sides X_1, ..., X_s, respectively:

ξ_p(x, y) = Q_α(x) ∧ Q_β(y) ∧ ∃x_1 ∃y_1 ... ∃x_s ∃y_s : (x < x_1 < y_1 < ... < y_s < y)
            ∧ (Φ_{v_0}(x, x_1) ∧ Φ_{v_1}(y_1, x_2) ∧ ... ∧ Φ_{v_s}(y_s, y))
            ∧ (M(x_1, y_1) ∧ ... ∧ M(x_s, y_s)) ∧ (χ_{X_1}(x_1, y_1) ∧ ... ∧ χ_{X_s}(x_s, y_s)).
Now the formula φ with the property that w has a matching M such that <w, M> ⊨ φ if and only if w ∈ L, is formed as follows⁴:

φ = ⋁_{S→u ∈ P} Φ_u ∨ [(∀x ∀y : M(x, y) → ⋁_{p ∈ P} ξ_p(x, y)) ∧ (M(min, max) ∧ χ_S(min, max))].

⁴ Here Φ_u is a formula which holds for w iff w = u.

φ expresses that either S → w, or

1. for every arch (i, j) of M:
   - (i, j) corresponds to a production X_0 → α v_0 X_1 v_1 ... X_s v_s β, and
   - the surface arches (i_1, j_1), ..., (i_s, j_s) correspond to productions p_1, ..., p_s with left-hand sides X_1, ..., X_s, respectively;
2. (1, n) is in M and corresponds to a production with left-hand side S.

Since every production is uniquely determined by its pattern, the two conditions under 1 must be consistent, and we can see inductively that (i, j) ∈ M implies that w_i ... w_j can be derived from some nonterminal X. □

Proof of ∃Match f.o.(A, <) ⊆ CFL(A):
In order to prove this direction, we need some notation for trees. For our purposes, trees are rooted and ordered (in a leftmost depth-first way), and have labeled nodes, where each node label has an arity which corresponds to the out-degree of the node. A tree language is a set of trees over some finite label set. The set A of 0-ary labels is called the leaf alphabet. The leaf labels of a tree T with leaf alphabet A, concatenated according to the order relation (i.e., from left to right), form a string over A, called the yield of T.
As a relational structure, the universe of a tree is the set of its nodes; there is a unary relation Q_α for every label α, there is a binary relation C, with (i, j) ∈ C ⟺ i is a child of j, and, finally, there is the order relation <. With predicate symbols for these relations, we can easily express the following properties in first order logic:
Lf(i): i is a leaf;
Lc(i, j): i is the leftmost child of j;
Rc(i, j): i is the rightmost child of j;
and in monadic second order logic⁵
An(i, j): i is an ancestor of j;
Pt(U, i, j): An(i, j) and Lf(j) and the set U contains precisely the nodes on the path from i to j.
In what follows, we will make use of two characterisations, of context-free languages and of recognisable tree languages, respectively.⁶

2.1.3 Lemma. [9] A language L over A is context-free if, and only if, there is a recognisable tree language 𝒯 with leaf alphabet A, such that every w ∈ A⁺ belongs to L iff it is the yield of some tree T ∈ 𝒯.

2.1.4 Lemma. [5, 13] A tree language is recognisable iff it is definable in monadic second order logic.

Combining these two results, we can proceed as follows.
Given a first order sentence φ over <A, <, M> we will construct a monadic second order sentence Φ over some tree signature, in such a way that a string w has a matching M with <w, M> ⊨ φ if and only if w is the yield of a tree which satisfies Φ.
Φ will be of the form Ψ ∧ 𝒯, where 𝒯 describes a class of trees which can be obtained from strings with matchings in a certain way that will be explained below. The formula Ψ corresponds to φ and holds for those trees which are obtained from strings with matchings that satisfy φ.

⁵ This is first order logic with additional (existential and universal) quantification over unary predicates.
⁶ For the notion of a recognisable tree language, see e.g. [8].
Let w = w_1 ... w_n be a string, and M a matching on w. We construct a tree T_{w,M} in two stages as follows:

1. Define a tree T̃_{w,M} the nodes of which are the positions of w and the arches of M:
   - for every i ≤ n, i is a leaf, labeled w_i;
   - every arch (i, j) ∈ M is an internal node, labeled ⊙; its children are all those positions k and arches (k, l) which are at the surface of (i, j), in the order in which they appear in <w, M>.
   If n > 1, and (1, n) ∉ M, add a root, labeled ⊕, whose children are all those positions and arches which are not underneath any arch.
2. Whenever an internal node has more than two children, we distribute them over binary subtrees,⁷ using additional nodes with label ⊕. This results in a tree T_{w,M}, with yield w and with internal nodes labeled ⊙ or ⊕.

[Figure: a string with matching <w, M>, the intermediate tree T̃_{w,M}, and the binarised tree T_{w,M}.]

Fig. 4. The construction of the tree.
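
A rough sketch of stage 1 of this construction (ours, not from the paper; the nested-tuple tree representation is an assumption, and binarisation is omitted):

```python
def tree_from_matching(w, M):
    """Stage 1: positions become leaves, arches become internal (⊙-labeled) nodes."""
    def node(i, j, arches):
        inner = surface(i + 1, j - 1, arches)          # surface children strictly inside (i, j)
        return ("arch", [w[i - 1]] + inner + [w[j - 1]])

    def surface(lo, hi, arches):
        out, k = [], lo
        while k <= hi:
            arch = next((a for a in arches if a[0] == k), None)
            if arch is None:
                out.append(w[k - 1])                   # a surface position: a leaf
                k += 1
            else:
                i, j = arch
                strictly_inside = [a for a in arches if i < a[0] and a[1] < j]
                out.append(node(i, j, strictly_inside))
                k = j + 1
        return out

    top = surface(1, len(w), sorted(M))
    return top[0] if len(top) == 1 else ("root", top)  # ⊕-labeled root if (1, n) is not an arch

print(tree_from_matching("aabb", {(1, 4), (2, 3)}))
# ('arch', ['a', ('arch', ['a', 'b']), 'b'])
```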

Thus our trees have leaf alphabet A and two binary labels, ⊙ and ⊕. Among all such trees the ones that can be obtained in the way described above are characterised by the following property:

Both the leftmost and the rightmost path out of every ⊙-labeled node lead to leaves, without meeting any ⊕-labeled nodes.

7 We don't need this construction to be deterministic - any tree Tw,M obtained this
way will do.

This property can be expressed in a monadic second order formula over the tree signature <Q_{a_1}, ..., Q_{a_r}, Q_⊙, Q_⊕, C, <> as 𝒯 = 𝒯_L ∧ 𝒯_R, where 𝒯_L deals with the leftmost, 𝒯_R with the rightmost path, respectively. For 𝒯_L, e.g., we can write:

∀x [Q_⊙(x) → ∃y ∃X ∀s ∀t : Pt(X, x, y) ∧ ((X(s) ∧ X(t) ∧ C(s, t)) → Lc(s, t)) ∧ (X(s) → ¬Q_⊕(s))]

If M is a matching on w, there is a tree T with yield w, which satisfies 𝒯, namely T_{w,M}. On the other hand, if T satisfies 𝒯, then a matching M_T on the yield of T can be constructed by defining:

(i, j) ∈ M_T iff c, the common ancestor of i and j in T, has label ⊙, and i lies on a leftmost, j on a rightmost path from c.

We want to restrict 𝒯 so that <w, M_T> ⊨ φ. This is done by the formula Ψ, which is derived from φ by

- restricting all quantifiers to range over leaves only; to this end, we replace every subformula of the form ∃x χ by ∃x : Lf(x) ∧ χ, and every occurrence of ∀x χ by ∀x : Lf(x) → χ;
- replacing x < y by a monadic second order formula which expresses that there is a node n whose left child is x or an ancestor of x, and whose right child is y or an ancestor of y;
- replacing M(x, y) by a monadic second order formula (similar to 𝒯) which expresses that there is a node n with label ⊙, such that the leftmost path from n leads to x and the rightmost path from n leads to y.

Then there is a matching M on w such that <w, M> ⊨ φ, if, and only if, w = yield(T) for some T such that T ⊨ 𝒯 ∧ Ψ. Hence the set of those w for which there is such a matching is context-free. □
The construction in the second half of the proof of Theorem 2.1 still yields a monadic second order tree formula Ψ if the string formula φ is monadic second order rather than first order. Therefore, the theorem remains true with f.o. replaced by m.s.o.

2.1.1 Corollary. CFL(A) = ∃Match m.s.o.(A, <). □

3 Variations

One motivation for this research was an attempt to find logical characterisations for interesting subclasses of context-free languages. With our approach, this is not difficult for the class of k-linear languages, i.e., languages generated by grammars with rules of the form
213

- S → X_1 ... X_k, X_1, ..., X_k ∈ N\{S};
- X → uYv, u, v ∈ A*, X ∈ N, Y ∈ N\{S}.

These languages can be characterised by concatenations of at most k strictly nested matchings, where we call a matching M on {1, ..., n} strictly nested if there are no two pairs (i, j), (k, l) ∈ M such that i < j < k < l.
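
For concreteness, strict nestedness amounts to the following small check (an illustration of ours, not from the paper).

```python
def is_strictly_nested(M):
    """No two arches lie side by side: (i, j), (k, l) with i < j < k < l is forbidden."""
    return not any(j < k for (i, j) in M for (k, l) in M)

print(is_strictly_nested({(1, 6), (2, 5), (3, 4)}))   # True: fully nested
print(is_strictly_nested({(1, 2), (3, 4)}))           # False: two arches in sequence
```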

There are other classes of binary relations which can be used to characterise context-free languages. E.g., the proof of Theorem 2.1 can easily be modified to show CFL(A) = ∃ℬ f.o.(A, <), where the class ℬ consists of those binary relations B that satisfy the following noncrossing condition:

if j, k ∈ {i+1, ..., l−1}, and (i, j), (k, l) ∈ B then j < k.

Here we allow several arches to share the same left or right endpoint; note, however, that no position can serve both as a left and a right endpoint.

We can also employ special order relations to define context-freeness. Call a linear order relation ⊑ on {1, ..., n} tree definable if there is a binary tree T such that

- the leaves of T are, from left to right, the numbers 1, ..., n;
- the internal nodes are labeled either ℓ or r;
- i ⊑ j iff i is visited before j in the depth-first traversal of T in which, at every node with label ℓ, first the left, and at every node with label r, first the right child is visited.

As there are only 2^{O(n)} binary trees with n leaves, not all linear orders can have this property. Let TDO denote the class of tree definable linear orderings.
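
As an illustration (ours, not from the paper; the encoding of labeled binary trees is an assumption), the order defined by such a tree can be computed by the corresponding traversal.

```python
# A leaf is an int; an internal node is (label, left, right), where label "l" means
# "visit the left child first" and "r" means "visit the right child first".
def tree_defined_order(tree):
    order = []
    def visit(node):
        if isinstance(node, int):
            order.append(node)
            return
        label, left, right = node
        first, second = (left, right) if label == "l" else (right, left)
        visit(first)
        visit(second)
    visit(tree)
    return order    # order[0] is the ⊑-smallest element, and so on

# Leaves 1..4 in left-to-right position; the "r" label at the root visits the right subtree first.
print(tree_defined_order(("r", ("l", 1, 2), ("l", 3, 4))))   # [3, 4, 1, 2]
```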

3.1 Theorem. CFL(A) = ∃TDO f.o.(A, <).⁸

⁸ Note that formulas of this logic refer to two linear order relations: the given positional ordering <, and the additional (tree definable) order.

Proof (sketch):
Let <w, ⊑> ⊨ φ, and let T be a tree defining the order relation ⊑ on {1, ..., n}. On T, the relation ⊑ can be expressed by an mso formula which refers to least common ancestors; here the mso formula lca(z, x, y) expresses that z is the least common ancestor of x and y.
As in the proof of the second half of Theorem 2.1 we can construct an mso tree formula Φ such that for the tree T′ obtained from T by labeling the leaves with the symbols of w it holds that T′ ⊨ Φ, and that for every tree T with T ⊨ Φ, the order relation ⊑ defined by T satisfies <yield(T), ⊑> ⊨ φ. This shows that ∃TDO f.o.(A, <) ⊆ CFL(A).
For the other direction, let L ∈ CFL(A), and let L be defined by ∃M φ(M), according to Theorem 2.1. From a matching M on {1, ..., n} we derive a successor relation ⊲ as follows:

- If j is the right endpoint of an arch (i, j) then j ⊲ i;
- otherwise let k be the smallest number > j which is not the right endpoint of an arch.
  • If k is the left endpoint of an arch (k, l) then j ⊲ l;
  • otherwise j ⊲ k.

Then (i, j) is an arch of M iff i < j ∧ j ⊲ i. Since the defining properties of matchings are first-order definable, we can derive a f.o. formula Φ(⊑) from φ, such that <w, M> ⊨ φ ⟺ <w, ⊑> ⊨ Φ. Here ⊑ is the order relation induced by ⊲.
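
A small sketch of this derivation (ours, not from the paper), with succ[j] = i meaning j ⊲ i:

```python
def successor_from_matching(n, M):
    """Derive the successor relation of the proof of Theorem 3.1 from a matching M."""
    right_of = {i: j for (i, j) in M}     # left endpoint  -> right endpoint
    left_of = {j: i for (i, j) in M}      # right endpoint -> left endpoint
    succ = {}
    for j in range(1, n + 1):
        if j in left_of:                  # j is the right endpoint of an arch (i, j): jump back to i
            succ[j] = left_of[j]
            continue
        k = j + 1
        while k in left_of:               # smallest k > j that is not a right endpoint
            k += 1
        if k <= n:
            succ[j] = right_of.get(k, k)  # if k opens an arch, jump to that arch's right endpoint
    return succ

print(successor_from_matching(4, {(1, 4), (2, 3)}))
# {1: 3, 3: 2, 4: 1}: the derived order is 4, 1, 3, 2
```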
It remains to show that ⊑ is tree definable. This can be shown by induction on n. For the induction step let (i, j) be an outermost arch of the matching M on {1, ..., n}. Then the following tree defines M:

[Figure: a tree combining the leaves i and j with subtrees T_1, T_2, T_3.]

Here T_1, T_2, T_3 are trees defining the order relations derived from the restrictions of M to {1, ..., i−1}, {i+1, ..., j−1}, and {j+1, ..., n}, respectively. □

Theorem 3.1 implies that all context-free languages are contained in the class ∃LO f.o.(A, <), where LO stands for the class of all linear order relations, since the property of a linear order relation to be tree definable can be expressed in first order logic. However, this is not immediately obvious, and we will not prove it here. But it is not difficult to show directly that linear order relations can express all context-free languages.

3.2 Proposition. CFL(A) ⊆ ∃LO f.o.(A, <). □

This inclusion is strict, since the set L = {vv : v ∈ A⁺}, a well-known example of a non-context-free language, is contained in ∃LO f.o.(A, <). To see this, let w = w_1 ... w_{2n} be equipped with the additional order relation ≺ induced by
1 ≺ n+1 ≺ 2 ≺ n+2 ≺ ... ≺ n ≺ 2n. With this order it holds that⁹

w ∈ L ⟺ <w, ≺> ⊨ ∀x : x < suc_≺(x) → ⋀_{a∈A} (Q_a(x) ↔ Q_a(suc_≺(x))).

In combination with the following formula, which uniquely determines ≺, this formula characterises L:

(min_≺ = min_<) ∧ ∀x ∀y : (y = suc_≺(suc_≺(x)) → y = suc_<(x)).

⁹ It is only for the sake of conciseness that we use a function symbol suc_≺ here to express the direct successor in the order relation ≺. Of course, the successor can easily be expressed in first order logic without this symbol.
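
A small sketch of what this interleaved order buys us (ours, not from the paper): under ≺, the condition above amounts to checking w_i = w_{n+i} for all i ≤ n.

```python
def in_copy_language(w):
    """Check w ∈ {vv : v ∈ A⁺} via the interleaved order 1, n+1, 2, n+2, ..., n, 2n."""
    if len(w) % 2:
        return False
    n = len(w) // 2
    order = [x for i in range(1, n + 1) for x in (i, n + i)]   # the order ≺
    # every ≺-consecutive pair (x, suc_≺(x)) with x < suc_≺(x) must carry the same letter
    return all(w[x - 1] == w[y - 1] for x, y in zip(order, order[1:]) if x < y)

print(in_copy_language("abab"))   # True:  v = "ab"
print(in_copy_language("abba"))   # False
```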

4 Discussion
Of course, our main result does not teach us anything new about context-free languages. However, it supports, in a rather satisfying mathematical way, the intuition that the essence of context-freeness is contained in the notion of a matching relation. And since context-free languages are well understood, it tells us something about our logic ∃Match f.o., e.g., that it is not closed under negation. We believe that semantically restricted logics can be used to characterise not only the class of context-free languages, and some of its subclasses, but also many other interesting classes. We are convinced that a more general study of these logics will prove worthwhile also in the context of general finite structures, instead of strings. Here, as opposed to the string case, the class monNP defined by existential monadic second order logic is not closed under complementation. In fact, the set of connected graphs, although expressible by means of a universal monadic second order sentence, is not contained in monNP [6], a result which has since been refined and extended in a number of ways [4, 1, 7, 11, 12]. On the other hand, the expressive power of sentences with one binary existential second order quantifier is not well understood. We believe that studying semantically restricted versions of this latter logic will help us understand the limitations, and the expressive power, of binary existential second order quantification.

References

1. M. Ajtai and R. Fagin. Reachability is harder for directed than for undirected finite graphs. Journal of Symbolic Logic, 55(1):113-150, 1990.
2. D. A. M. Barrington, K. Compton, H. Straubing, and D. Thérien. Regular languages in NC¹. Journal of Computer and System Sciences, 44:478-499, 1992.
3. J. R. Büchi. Weak second order arithmetic and finite automata. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 6:66-92, 1960.
4. M. de Rougemont. Second-order and inductive definability on finite structures. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 33:47-63, 1987.
5. J. Doner. Tree acceptors and some of their applications. Journal of Computer and System Sciences, 4:406-451, 1970.
6. R. Fagin. Monadic generalized spectra. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 21:89-96, 1975.
7. R. Fagin, L. J. Stockmeyer, and M. Y. Vardi. On monadic NP vs. monadic co-NP. In Proc. 8th Annual Conference on Structure in Complexity Theory, pages 19-30, 1993. To appear in Information and Computation.
8. F. Gécseg and M. Steinby. Tree Automata. Akadémiai Kiadó, 1984.
9. J. Mezei and J. B. Wright. Algebraic automata and context-free sets. Information and Control, 11:3-29, 1967.
10. A. Salomaa. Formal Languages. Academic Press, 1987.
11. T. Schwentick. Graph connectivity and monadic NP. In Proc. 35th IEEE Symp. on Foundations of Computer Science, pages 614-622, 1994.
12. T. Schwentick. Graph connectivity, monadic NP and built-in relations of moderate degree. In Proc. 22nd International Colloq. on Automata, Languages, and Programming, 1995. To appear.
13. J. W. Thatcher and J. B. Wright. Generalized finite automata theory with an application to a decision problem of second-order logic. Mathematical Systems Theory, 2:57-81, 1968.
Log-Approximable Minimization Problems on
Random Inputs
Anders Malmström
Lehrgebiet Mathematische Grundlagen der Informatik
RWTH Aachen, Germany
E-mail: anders@informatik.rwth-aachen.de

Abstract
We extend recent work about the relationship between logically defined classes of NP optimization problems and the asymptotic growth of optimal solutions on random inputs. We consider the syntactic class MIN F⁺Π₂(1) of minimization problems. Kolaitis and Thakur proved that every problem in this class is log-approximable. We show that for every problem Q in the class MIN F⁺Π₂(1) there exist polynomials g(n) and h(n) such that the optimal solution of Q almost surely grows like Θ(g(n) + h(n)·log n). Applying this result we show, without using any complexity theoretical assumptions, that the problem MIN GRAPH COLOURING is not in MIN F⁺Π₂(1). With the same method we get a similar criterion for each class MIN F⁺Π₂(k). Using the fact that on a random graph with n nodes the chromatic number is n/(2 log n) almost surely, we prove that MIN GRAPH COLOURING is not in MIN F⁺Π₂, resolving a conjecture by Kolaitis and Thakur.

1 Introduction
Many NP-complete decision problems come from combinatorial optimization problems by putting a bound on the cost of a desired solution. Assuming P ≠ NP we cannot find polynomial-time algorithms which solve optimization problems exactly. This does not exclude finding nearly optimal solutions, that is, solutions whose relative error compared to an optimal solution is smaller than a constant. For many problems their approximation properties are known. Some have efficient approximation algorithms, others, like the TSP, are as hard to approximate as to solve exactly. It would be interesting to know more about the structural reasons for these differences.
Motivated by Fagin's characterization of NP by existential second-order logic on finite structures [4], Papadimitriou and Yannakakis introduced in [11] methods for defining NP optimization problems using logic. They showed that in some important cases there is a relation between the logical representation of an optimization problem and its approximation properties. They defined the two syntactic classes of optimization problems, MAX SNP and MAX NP, and

proved that for all problems Q in these classes there exists a polynomial-time algorithm that approximates opt_Q up to a constant factor. Many natural maximization problems like MAX CUT and MAX SAT are contained in MAX NP. Surprisingly, the analogous definitions for minimization problems do not lead to a similar result (see [7] and [8] for details). However, Kolaitis and Thakur introduced in [7] a different syntactic class, called MIN F⁺Π₁, and proved that all problems in this class are approximable in polynomial time up to a constant factor. MIN F⁺Π₁ consists of all minimization problems Q whose input instances 𝔄 are finite structures such that the optimum can be expressed as

opt_Q(𝔄) = min_S {|S| : (𝔄, S) ⊨ ∀x̄ φ(x̄, S)},

where S is a single predicate variable and φ(x̄, S) is a quantifier-free first-order formula in which all occurrences of S are positive.
In [8] the same authors introduced the syntactic class MIN F⁺Π₂ which is an extension of MIN F⁺Π₁.

Definition 1.1 For every k ∈ ℕ, the class MIN F⁺Π₂(k) consists of all minimization problems Q whose optimum can be expressed as

opt_Q(𝔄) = min_S {|S| : (𝔄, S) ⊨ ∀x̄ ∃ȳ φ(x̄, ȳ, S)},

where S is a single predicate variable, φ(x̄, ȳ, S) is a quantifier-free formula in disjunctive normal form (DNF), in which all occurrences of S are positive and S occurs at most k times in each disjunct. Moreover, let MIN F⁺Π₂ = ⋃_k MIN F⁺Π₂(k).

Kolaitis and Thakur proved that every problem in MIN F⁺Π₂(1) is log-approximable. An example for this class is MIN DOMINATING SET. Given a graph, the problem asks for the cardinality of the smallest subset S of vertices such that every vertex either is in S or is adjacent to a vertex in S. In fact MIN DOMINATING SET is complete for MIN F⁺Π₂(1). It is definable by the expression

opt_{MIN DS}(G) = min_S {|S| : (G, S) ⊨ ∀x ∃y (Sx ∨ (Exy ∧ Sy))}.
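
This logical definition translates directly into a brute-force computation of opt_{MIN DS} on small graphs; the sketch below is an illustration of ours (not from the paper) and is exponential in the number of vertices.

```python
from itertools import combinations

def min_dominating_set(vertices, edges):
    """Smallest |S| such that every vertex x satisfies Sx, or Exy and Sy for some y."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for size in range(len(vertices) + 1):        # exhaustive search, fine only for tiny inputs
        for S in combinations(vertices, size):
            S = set(S)
            if all(x in S or adj[x] & S for x in vertices):
                return size
    return len(vertices)

# A path on four vertices has domination number 2.
print(min_dominating_set([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)]))   # 2
```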

Other complete problems in this class are MIN SET COVER and MIN HITTING
SET.
It is not known whether MIN F⁺Π₂ is in APX, the class of all NP optimization problems that are approximable up to a constant factor. In fact this is unlikely, since Lund and Yannakakis showed in [10] that there exists a constant c such that MIN DOMINATING SET cannot be approximated within ratio c·log n unless NP is contained in DTIME[n^{poly log n}].
For an overview of NP optimization problems see [3] or [6].
To show that a problem is in a logically defined class is usually done by giving the desired logical formula. On the other hand, showing that a specific problem does not belong to a class is more difficult. In [2] Behrendt, Compton and Grädel investigated a new technique. They introduced a probabilistic method that characterizes the growth of the size of expected optimal solutions for problems that belong to MAX Σ₁. They showed that on a random input it grows like Θ(p(n)), where p(n) is a polynomial and n the cardinality of the input. In [5] we established a similar criterion for membership in MIN F⁺Π₁. Using this result we proved that MIN DOMINATING SET is not contained in MIN F⁺Π₁.
In this paper we show an analogous result for MIN F⁺Π₂(1) which looks different from the hitherto existing criteria.

Theorem 1.2 (Probabilistic criterion for MIN F⁺Π₂(1)) Let Q be a problem in MIN F⁺Π₂(1). Then there exist polynomials g(n) and h(n) such that almost surely

opt_Q(𝔄) = Θ(g(n) + h(n)·log n)

for a randomly chosen σ-structure 𝔄, where n = |𝔄|.

As a relevant example we have MIN DOMINATING SET. In [5] we proved that for a random graph G with n vertices and edge probability 1/2, almost surely opt_{MIN DS}(G) ~ log n.
Applying this result we can show, without using any complexity theoretical assumptions, that the problem MIN GRAPH COLOURING is not in MIN F⁺Π₂(1). MIN GRAPH COLOURING is the problem of computing the chromatic number of a graph. With the same method we get a similar criterion for each class MIN F⁺Π₂(k). Using the fact that on a random graph with n nodes the chromatic number is n/(2 log n) almost surely [1], we can show that MIN GRAPH COLOURING is not in MIN F⁺Π₂, resolving a conjecture by Kolaitis and Thakur [8].

2 Preliminaries
Definition 2.1 An NP optimization problem is a tuple Q = (I_Q, Y_Q, f_Q, opt) such that

- I_Q is the set of input instances.
- Y_Q(I) is the set of feasible solutions for input I. Here 'feasible' means that the size of the elements S ∈ Y_Q(I) is polynomially bounded in the size of I and that the set {(I, S) : S ∈ Y_Q(I)} is recognizable in polynomial time.
- f_Q : {(I, S) : S ∈ Y_Q(I)} → ℕ is a polynomial-time computable function, called the cost function.
- opt_Q is one of the two functions defined below with I_Q as domain and positive integers as values:
  opt_Q(I) = min_S f_Q(I, S)  or  opt_Q(I) = max_S f_Q(I, S).

In the first case we say Q is a minimization problem and in the second case we say Q is a maximization problem.

For every NP optimization problem Q, the following decision problem is in NP: Given an instance I of Q and a natural number k, is there a solution S ∈ Y_Q(I) such that f_Q(I, S) ≤ k (when opt = min), or f_Q(I, S) ≥ k (when opt = max)?
First we give the exact result about the approximation properties of the class MIN F⁺Π₂(k), k ≥ 1.

Theorem 2.2 (Kolaitis, Thakur [8]) Let Q be an optimization problem in the class MIN F⁺Π₂(k), k ≥ 1. Then there exist a polynomial-time approximation algorithm and a constant c such that for every instance 𝔄 of Q, the algorithm produces a feasible solution on which the cost function takes on a value less than or equal to c·(opt_Q(𝔄))^k · log(|𝔄|). In particular, every problem in the class MIN F⁺Π₂(1) is log-approximable.
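
Log-approximability of problems such as MIN DOMINATING SET is typically achieved by greedy covering; as an illustration (ours, not the algorithm of [8]), the classical greedy set-cover heuristic covers a universe of n elements within a factor of roughly ln n of the optimum.

```python
def greedy_set_cover(universe, sets):
    """Repeatedly pick the set covering the most uncovered elements (≈ ln n approximation)."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            raise ValueError("the given sets do not cover the universe")
        chosen.append(best)
        uncovered -= best
    return chosen

# MIN DOMINATING SET reduces to set cover: each vertex v contributes the set {v} ∪ neighbours(v).
sets = [{1, 2}, {1, 2, 3}, {2, 3, 4}, {3, 4}]
print(len(greedy_set_cover({1, 2, 3, 4}, sets)))   # 2
```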

We see that the most important of these classes is MIN F⁺Π₂(1). We show our main result for this class and then we extend it to MIN F⁺Π₂(k) for each k. With some simple syntactical transformations we get

Lemma 2.3 MIN F⁺Π₁ ⊆ MIN F⁺Π₂(1).

Moreover, we need some definitions and results from the theory of asymptotic probabilities and from the theory of graphs.
As usual, for functions f, g : ℕ → ℝ, we write f = Θ(g) if there are constants c, d > 0 such that c·g(n) ≤ f(n) ≤ d·g(n) for sufficiently large n, and we write f ~ g to indicate that lim_{n→∞} f(n)/g(n) = 1.

Definition 2.4 Let σ be a finite relational vocabulary and x_1, ..., x_k be a sequence of distinct variables. A maximal consistent set t of σ-atoms and negated σ-atoms (including equalities and inequalities) in x_1, ..., x_k is called an atomic σ-type in x_1, ..., x_k. Since such a set is always finite, we can form in first-order logic the conjunction of the formulae in t; by abuse of notation, we denote this conjunction by t(x_1, ..., x_k). On every σ-structure 𝔄, the type t defines the set of realizations t^𝔄 = {ā ∈ A^k : 𝔄 ⊨ t(ā)}, where A is the universe of 𝔄.
A ∅-type is called an equality type e. So e is a maximal consistent set of equalities x_i = x_j and inequalities x_i ≠ x_j, where 1 ≤ i < j ≤ k, which defines on every structure 𝔄 the set e^𝔄 = {ā ∈ A^k : 𝔄 ⊨ e(ā)}.
For each atomic type t let r_t be the unique natural number of pairwise distinct components of each tuple of type t.

Theorem 2.5 (Kolaitis, Vardi [9]) For every formula φ(x̄) in the infinitary logic L^ω_{∞ω} there exist atomic types t_i(x̄) such that

𝔄 ⊨ ∀x̄ (φ(x̄) ↔ ⋁_{i=1}^{k} t_i(x̄))

for all but an exponentially decreasing fraction of structures 𝔄 when the size of 𝔄 tends to infinity.

Lemma 2.6 For each n ∈ ℕ, let G = (V, E) be a random graph of the following form: V is the disjoint union of two sets A of cardinality a(n) and B of cardinality b(n), where a(n), b(n) are non-constant polynomials. A is an independent set and B is a clique of G, and the probability for every (u, v) ∈ A × B to be an edge is independently a constant p.
Then almost surely the domination number of G satisfies

γ(G) ~ log_{1/(1−p)} a(n) = Θ(log n).

PROOF. Let X_r(G) be the number of different dominating sets D of cardinality r. Thus

γ(G) = min{r ∈ ℕ : X_r(G) ≥ 1}.

Without loss of generality D ⊆ B. We estimate the random variable X_r for r = r(n). Define q = 1 − p. Let D be an arbitrary subset of B of cardinality r(n) = log_{1/q} a(n). The probability that D is a dominating set is

(1 − (1−p)^{r(n)})^{a(n)} = (1 − q^{r(n)})^{a(n)} = (1 − 1/a(n))^{a(n)},

and this tends to e^{−1} if n tends to infinity. We infer that γ(G) ≤ log_{1/q} a(n) almost surely.
Now let r(n) = (1−ε)·log_{1/q} a(n) for any ε > 0. There are \binom{b(n)}{(1−ε)\log_{1/q} a(n)} different choices for D in B. Define the indicator random variable X_D by

X_D(G) := 1 if D is a dominating set of G, and 0 otherwise.

Obviously, X_r = Σ_D X_D. By linearity of expectation it follows that

E(X_r) = Σ_D E(X_D) = \binom{b(n)}{(1−ε)\log_{1/q} a(n)} \left(1 − q^{r(n)}\right)^{a(n)}
       = \binom{b(n)}{(1−ε)\log_{1/q} a(n)} \left(1 − \frac{1}{a(n)^{1−ε}}\right)^{a(n)}.

Because lim_{n→∞} (1 − n^{−s})^{n^s} = e^{−1} and \binom{n}{m} ≤ n^m, we can infer that

lim_{n→∞} E(X_r) ≤ lim_{n→∞} b(n)^{(1−ε)\log_{1/q} a(n)} · e^{−a(n)^ε} = lim_{n→∞} e^{\ln b(n)·(1−ε)\log_{1/q} a(n) − a(n)^ε} = 0.

By Markov's inequality, P[X_r ≥ 1] ≤ E(X_r), so almost surely γ(G) > (1 − ε)·log_{1/q} a(n), and thus γ(G) ~ log_{1/q} a(n) almost surely.
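
A quick numerical illustration (ours, not from the paper; all parameters are chosen arbitrarily): for moderate n, a greedy upper bound on γ(G) for graphs of the form in Lemma 2.6 already tracks log_{1/q} a(n) quite closely.

```python
import math
import random

def sample_domination_upper_bound(a, b, p, trials=5):
    """Greedy upper bound on γ(G) for the random graph of Lemma 2.6 (A independent, B a clique)."""
    q = 1 - p
    bounds = []
    for _ in range(trials):
        # each vertex of B is identified with the random subset of A it dominates
        covers = [{u for u in range(a) if random.random() < p} for _ in range(b)]
        uncovered, size = set(range(a)), 0
        while uncovered:
            best = max(covers, key=lambda s: len(uncovered & s))
            if not uncovered & best:
                size += len(uncovered)   # isolated A-vertices would have to join S themselves
                break
            uncovered -= best
            size += 1                    # one more vertex of B; B is a clique, so B dominates itself
        bounds.append(size)
    return sum(bounds) / trials, math.log(a, 1 / q)

print(sample_domination_upper_bound(a=2000, b=200, p=0.5))   # e.g. (≈11, ≈11.0)
```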

3 Main Result
In this section we establish a necessary criterion for the membership of a problem Q in the class MIN F⁺Π₂(1). We will give a probabilistic estimation of opt_Q on a random structure 𝔄. First we need a normal form for the first-order formula that defines Q.
Let Q be an optimization problem whose input instances are finite structures over a fixed vocabulary σ.

Lemma 3.1 (Normal form for MIN F⁺Π₂(k)) Let Q ∈ MIN F⁺Π₂(k). Then there exist a universal first-order formula γ(x̄) and an existential first-order formula δ(x̄, z̄) such that for all 𝔄

opt_Q(𝔄) = min_S {|S| : (𝔄, S) ⊨ ∀x̄ [γ(x̄) → ∃z̄_1 ... ∃z̄_k (δ(x̄, z̄_1, ..., z̄_k) ∧ ⋀_{j=1}^{k} S z̄_j)]}

and each z̄_j is of length m.

PROOF. According to Definition 1.1, the optimum of Q can be expressed for all input structures 𝔄 as:

opt_Q(𝔄) = min_S {|S| : (𝔄, S) ⊨ ∀x̄ ∃ȳ φ(x̄, ȳ, S)},

where S is of arity m and φ(x̄, ȳ, S) is a quantifier-free DNF formula in which all occurrences of S are positive and S occurs at most k times in each disjunct. Define the disjunction of all disjuncts in which S does not occur by φ₀(x̄, ȳ). Without loss of generality, we can assume that in all other disjuncts S occurs exactly k times. So let us make the following logical deductions:

∀x̄ ∃ȳ φ(x̄, ȳ, S) ≡ ∀x̄ ∃ȳ [φ₀(x̄, ȳ) ∨ ⋁_i φ_i(x̄, ȳ, S)]
≡ ∀x̄ ∃ȳ [φ₀(x̄, ȳ) ∨ ⋁_i (ψ_i(x̄, ȳ) ∧ ⋀_{j=1}^{k} S ȳ_{i,j})]
≡ ∀x̄ ∃ȳ [φ₀(x̄, ȳ) ∨ ⋁_i ∃z̄_1 ... ∃z̄_k (ψ_i(x̄, ȳ) ∧ ⋀_{j=1}^{k} ȳ_{i,j} = z̄_j ∧ ⋀_{j=1}^{k} S z̄_j)]
≡ ∀x̄ [(∀ȳ ¬φ₀(x̄, ȳ)) → ∃z̄_1 ... ∃z̄_k (∃ȳ ⋁_i (ψ_i(x̄, ȳ) ∧ ⋀_{j=1}^{k} ȳ_{i,j} = z̄_j) ∧ ⋀_{j=1}^{k} S z̄_j)]

Here ψ_i(x̄, ȳ) denotes the S-free part of the i-th disjunct and ȳ_{i,j} the tuples of variables to which S is applied in it. Setting γ(x̄) := ∀ȳ ¬φ₀(x̄, ȳ) and δ(x̄, z̄_1, ..., z̄_k) := ∃ȳ ⋁_i (ψ_i(x̄, ȳ) ∧ ⋀_{j=1}^{k} ȳ_{i,j} = z̄_j) yields the claimed normal form. □

We will now consider the behaviour of opt_Q(𝔄) on a randomly chosen σ-structure 𝔄. Let Ω(n) be the probability space of all σ-structures over universe {0, ..., n−1} with uniform probability distribution. So opt_Q is a random variable on Ω(n).
We show the result for the lowest class MIN F⁺Π₂(1) and we will generalize it to MIN F⁺Π₂(k) for an arbitrary k in the next section.

Theorem 3.2 Let Q be a problem in MIN F⁺Π₂(1). Then there exist polynomials g(n) and h(n) such that almost surely

opt_Q(𝔄) = Θ(g(n) + h(n)·log n)

for a randomly chosen σ-structure 𝔄, where n = |𝔄|.

PROOF. By Lemma 3.1 we can express the optimum of every Q ∈ MIN F⁺Π₂(1) as:

opt_Q(𝔄) = min_S {|S| : (𝔄, S) ⊨ ∀x̄ [γ(x̄) → ∃z̄ (δ(x̄, z̄) ∧ S z̄)]}.

By Theorem 2.5 there exists a finite number of atomic types u_i(x̄) and t_j(x̄, z̄) such that

𝔄 ⊨ ∀x̄ (γ(x̄) ↔ ⋁_i u_i(x̄))  and  𝔄 ⊨ ∀x̄ ∀z̄ (δ(x̄, z̄) ↔ ⋁_j t_j(x̄, z̄))

for all but an exponentially decreasing fraction of structures 𝔄. So almost surely

opt_Q(𝔄) = min_S {|S| : (𝔄, S) ⊨ ∀x̄ [⋁_i u_i(x̄) → ∃z̄ (⋁_j t_j(x̄, z̄) ∧ S z̄)]}.

For each i there exists at least one j such that t_j is an extension type of u_i. That means we can decompose the formula t_j(x̄, z̄) in the following way:

t_j(x̄, z̄) = u_i(x̄) ∧ v_j(z̄) ∧ α_j(x̄, z̄) ∧ β_j(x̄, z̄),

where v_j is the type of z̄, α_j contains only equalities x_k = z_l and inequalities x_k ≠ z_l, and β_j contains only atoms and negated atoms (no equalities and inequalities) Rȳ where ȳ intersects both x̄ and z̄. In this case we call (i, j) feasible.
Let F be the finite set of all feasible pairs (i, j). We can infer that

opt_Q(𝔄) = min_S {|S| : (𝔄, S) ⊨ ∀x̄ ⋁_{(i,j)∈F} [u_i(x̄) → ∃z̄ t_j(x̄, z̄) ∧ S z̄]}.

Let us consider the special case where

opt_Q(𝔄) = min_S {|S| : (𝔄, S) ⊨ ∀x̄ [u(x̄) → ∃z̄ (t(x̄, z̄) ∧ S z̄)]}

and t(x̄, z̄) = u(x̄) ∧ v(z̄) ∧ α(x̄, z̄) ∧ β(x̄, z̄). The optimum of Q is the cardinality of the smallest subset S ⊆ v^𝔄 such that for every x̄ ∈ u^𝔄 there exists a z̄ ∈ S such that t(x̄, z̄) becomes TRUE. Let r_u and r_v be the natural numbers from Definition 2.4, the numbers of pairwise distinct components of x̄ and z̄, respectively. From these r_v pairwise distinct components of z̄ let m be the number of components that are equal to some component of x̄. We get that m ≤ min{r_u, r_v}.
Each value of these m components defines a subset u_l^𝔄 of u^𝔄. So u^𝔄 is the disjoint union of n(n−1)⋯(n−m+1) sets u_l^𝔄, and the sets

v_l^𝔄 = {z̄ : x̄ ∈ u_l^𝔄 ∧ (x̄, z̄) ∈ t^𝔄} are disjoint too.

We have reduced our problem to finding the minimal subset S_l ⊆ v_l^𝔄 such that for every x̄ ∈ u_l^𝔄 there exists a z̄ ∈ S_l such that t(x̄, z̄) becomes TRUE. But this is the same problem as finding a minimal dominating set of the graph G_l = (V_l, E_l), which is defined in the following way. Consider V_l as the disjoint union of u_l^𝔄 and v_l^𝔄, and let (a, b) ∈ E_l if and only if (a, b) ∈ v_l^𝔄 × v_l^𝔄 or ((a, b) ∈ u_l^𝔄 × v_l^𝔄 ∧ (a, b) ∈ t^𝔄).
Almost surely the size of each u_l^𝔄 is u_l(n) = Θ(n^{r_u − m}) and the size of v_l^𝔄 is v_l(n) = Θ(n^{r_v − m}).
First we consider the case where m < min{r_u, r_v}. The probability that randomly chosen x̄ ∈ u_l^𝔄 and z̄ ∈ v_l^𝔄 are in t^𝔄 is a constant p.
With Lemma 2.6 we get that γ(G_l) = Θ(log u_l(n)) almost surely.
We can infer that almost surely

opt_Q(𝔄) = opt_Q(⋃_l G_l) = Σ_l γ(G_l) = Σ_l Θ(log u_l(n)) = Θ(n^m) · Θ(log n).

If m = min{r_u, r_v} then

opt_Q(𝔄) = min{|u^𝔄|, |v^𝔄|}.

Let us go back to the general case. For each feasible pair (i, j) we find a polynomial g_{i,j}(n) such that the optimum of Q on 𝔄 restricted to the types u_i and t_j grows asymptotically like g_{i,j}(n)·log n or like g_{i,j}(n). This yields that the optimum almost surely grows like a finite sum of polynomials and polynomials multiplied by log n, and we get the assertion. □

4 Generalization and Application

We can extend this criterion to the whole class MIN F⁺Π₂.

Theorem 4.1 (Probabilistic criterion for MIN F⁺Π₂(k)) Let Q be a problem in MIN F⁺Π₂(k). Then there exist polynomials g_i(n), 0 ≤ i ≤ k, such that almost surely

opt_Q(𝔄) = Θ(g_0(n) + Σ_{i=1}^{k} g_i(n) · log^{1/i} n)

for a randomly chosen σ-structure 𝔄, where n = |𝔄|.
First we need a generalization of Lemma 2.6 to (k+1)-hypergraphs. A (k+1)-hyperdominating set is a set of vertices such that for every vertex there exist k vertices in this set such that these k+1 vertices form a (k+1)-hyperedge. The hyperdomination number γ̃ is the cardinality of the smallest hyperdominating set.

Lemma 4.2 For each n ∈ ℕ, let H = (V, E) be a random (k+1)-hypergraph of the following form: V is the disjoint union of the sets A and B_1, ..., B_k of cardinalities a(n), b_1(n), ..., b_k(n), non-constant polynomials. A is an independent set, B = ⋃_{i=1}^{k} B_i is a (k+1)-hyperclique of H, and every tuple (u, v_1, ..., v_k) ∈ A × B_1 × ... × B_k is independently a (k+1)-hyperedge with constant probability p.
Then almost surely the (k+1)-hyperdomination number of H satisfies

γ̃(H) = Θ(log^{1/k} n).

PROOF. Let X_r(H) be the number of different hyperdominating sets D of cardinality r. Thus

γ̃(H) = min{r ∈ ℕ : X_r(H) ≥ 1}.

Without loss of generality D ⊆ B. We estimate the random variable X_r for r = r(n). Define q = 1 − p. Let D_i be an arbitrary subset of B_i of cardinality ρ(n) = (log_{1/q} a(n))^{1/k}. Thus r(n) = k·ρ(n). The probability that D = ⋃_{i=1}^{k} D_i is a hyperdominating set is

(1 − (1−p)^{ρ(n)^k})^{a(n)} = (1 − q^{ρ(n)^k})^{a(n)} = (1 − 1/a(n))^{a(n)},

and this tends to e^{−1} if n tends to infinity. We infer that γ̃(H) ≤ k·(log_{1/q} a(n))^{1/k} almost surely.
Now let ρ(n) = ((1−ε)·log_{1/q} a(n))^{1/k} for any ε > 0. There are ∏_{i=1}^{k} \binom{b_i(n)}{ρ(n)} different choices for D in B. Define the indicator random variable X_D by

X_D(H) := 1 if D is a hyperdominating set of H, and 0 otherwise.

Obviously X_r = Σ_D X_D. By linearity of expectation it follows that

E(X_r) = Σ_D E(X_D) = ∏_{i=1}^{k} \binom{b_i(n)}{ρ(n)} \left(1 − \frac{1}{a(n)^{1−ε}}\right)^{a(n)}.

Because lim_{n→∞} (1 − n^{−s})^{n^s} = e^{−1} and \binom{n}{m} ≤ n^m, we can infer that

lim_{n→∞} E(X_r) ≤ lim_{n→∞} ∏_{i=1}^{k} b_i(n)^{ρ(n)} · e^{−a(n)^ε} ≤ lim_{n→∞} b(n)^{k·ρ(n)} · e^{−a(n)^ε}, where b(n) := max_i b_i(n),
              ≤ lim_{n→∞} e^{\ln b(n)·k·ρ(n) − a(n)^ε} = 0.

By Markov's inequality, P[X_r ≥ 1] ≤ E(X_r), so almost surely γ̃(H) > k·((1−ε)·log_{1/q} a(n))^{1/k}, and thus γ̃(H) ~ k·(log_{1/q} a(n))^{1/k} almost surely.
□

PROOF. [of Theorem 4.1] By Lemma 3.1 we can express the optimum of every Q ∈ MIN F⁺Π₂(k) as:

opt_Q(𝔄) = min_S {|S| : (𝔄, S) ⊨ ∀x̄ [γ(x̄) → ∃z̄_1 ... ∃z̄_k (δ(x̄, z̄) ∧ ⋀_{i=1}^{k} S z̄_i)]}.

Analogously to the proof of Theorem 3.2 we decompose the formula into atomic σ-types and consider the special case where

opt_Q(𝔄) = min_S {|S| : (𝔄, S) ⊨ ∀x̄ [u(x̄) → ∃z̄_1 ... ∃z̄_k (t(x̄, z̄) ∧ ⋀_{i=1}^{k} S z̄_i)]}

and t(x̄, z̄) = t(x̄, z̄_1, ..., z̄_k) = u(x̄) ∧ v_1(z̄_1) ∧ ... ∧ v_k(z̄_k) ∧ α(x̄, z̄_1, ..., z̄_k) ∧ β(x̄, z̄_1, ..., z̄_k) on almost all structures.
The optimum of Q is the cardinality of the smallest subset S ⊆ ⋃_{i=1}^{k} v_i^𝔄 such that for every x̄ ∈ u^𝔄 there exist z̄_1, ..., z̄_k ∈ S such that t(x̄, z̄_1, ..., z̄_k) becomes TRUE.
Similar to the proof of Theorem 3.2 we reduce Q to the hyperdominating set problem. In the case that α contains only inequalities we can apply Lemma 4.2 directly and we get that the optimum is Θ(log^{1/k} n) almost surely. In the other cases we have to split up the problem into subproblems of the desired form. □

This theorem leads us to the following conclusion.

Proposition 4.3 MIN GRAPH COLOURING and MIN CLIQUE PARTITION are not in MIN F⁺Π₂.

PROOF. By [1] the chromatic number χ(G) of a graph satisfies χ(G) ~ n/(2 log n) almost surely. Since n/(2 log n) does not grow like any function of the form admitted by Theorem 4.1, MIN GRAPH COLOURING is not in MIN F⁺Π₂(k) for any k. MIN CLIQUE PARTITION is the same problem as MIN GRAPH COLOURING on the complementary graph. □

With the same arguments as in [5] we can show that the criterion remains true if we use more powerful logics than first-order, for example fixed point logic.

References
[1] N. Alon and J. Spencer. The Probabilistic Method. Wiley, 1991.
[2] Th. Behrendt, K. Compton, and E. Grädel. Optimization problems: Expressibility, approximation properties, and expected asymptotic growth of optimal solutions. In E. Börger, G. Jäger, H. Kleine Büning, S. Martini, and M. M. Richter, editors, Computer Science Logic, 6th Workshop, CSL '92, San Miniato 1992, Selected Papers, volume 702 of LNCS, pages 43-60. Springer-Verlag, 1993.
[3] P. Crescenzi and V. Kann. A compendium of NP optimization problems. Unpublished, 1994.
[4] R. Fagin. Generalized first-order spectra and polynomial-time recognizable sets. In R. M. Karp, editor, Complexity of Computation, SIAM-AMS Proceedings, Vol. 7, pages 43-73, 1974.
[5] E. Grädel and A. Malmström. Approximable minimization problems and optimal solutions on random input. In E. Börger, Y. Gurevich, and K. Meinke, editors, Computer Science Logic, 7th Workshop, CSL '93, Swansea 1993, Selected Papers, volume 832 of LNCS, pages 139-149. Springer-Verlag, 1994.
[6] V. Kann. On the Approximability of NP-complete Optimization Problems. PhD thesis, Royal Institute of Technology, Stockholm, 1992.
[7] Ph. Kolaitis and M. Thakur. Logical definability of NP-optimization problems. Technical Report CRL-90-48, University of California, Santa Cruz, October 1990.
[8] Ph. Kolaitis and M. Thakur. Approximation properties of NP minimization classes. In Proc. 6th IEEE Symp. on Structure in Complexity Theory, pages 353-366, 1991.
[9] Ph. Kolaitis and M. Vardi. Infinitary logics and 0-1 laws. Information and Computation, 98:258-294, 1992.
[10] C. Lund and M. Yannakakis. On the hardness of approximating minimization problems. In Proc. 25th ACM Symp. on Theory of Computing, pages 286-293, New York, 1993. ACM.
[11] C. Papadimitriou and M. Yannakakis. Optimization, approximation and complexity classes. Journal of Computer and System Sciences, 43:425-440, 1991.
Convergence and 0-1 Laws for L^k_{∞ω} under Arbitrary Measures

Monica McArthur

Department of Mathematics
University of California
Los Angeles, CA 90095

Abstract. We prove some general results about the existence of 0-1 and convergence laws for L^ω_{∞ω} and L^k_{∞ω} on classes of finite structures equipped with a sequence of arbitrary probability measures {μ_n}, as well as a few results for particular classes. First, two new proofs of the characterization theorem of Kolaitis and Vardi [9] are given. Then this theorem is generalized to obtain a characterization of the existence of L^k_{∞ω} convergence laws on a class with arbitrary measure. We use this theorem to obtain some results about the nonexistence of L^ω_{∞ω} convergence laws for particular classes of structures. We also disprove a conjecture of Tyszkiewicz [16] relating the existence of L^ω_{∞ω} and MSO 0-1 laws on classes of structures with arbitrary measures.

1 Introduction

L^k_{∞ω} is the logic with arbitrarily many conjunctions and disjunctions but at most k distinct variables; L^ω_{∞ω} = ⋃_k L^k_{∞ω}. A class C of finite models is assumed to be equipped with a sequence {μ_n}, where μ_n, for each n, is a probability measure on the structures in C of size n. Then, for any property P, μ(P) is defined to be lim_{n→∞} μ_n(P), if this limit exists. A language is said to have a convergence law on a class C if μ(P) exists for every property P definable in the language, and a 0-1 law if μ(P) is 0 or 1 for every property P definable in the language. In general, we will be considering classes C with arbitrary measures. We will also, however, consider certain classes C with particular sets of measures, most often uniform measure (in which μ_n assigns equal probability to each structure of size n in C).
In the past few years there has been quite a bit of work done on 0-1 and convergence laws for L^ω_{∞ω}. Lynch [11] has shown that L^ω_{∞ω} has a convergence law in most classes of random graphs with edge probability p(n) << n^{−1}, Tyszkiewicz [15] has obtained some nice results relating convergence laws for L^ω_{∞ω} and for least-fixed-point logic, and Lynch and Tyszkiewicz together have

recently obtained a 0-1 law for L^ω_{∞ω} for random graphs with edge probability p(n) = n^{−α}, α irrational, 0 < α < 1. The first results in this field, however, are due to Kolaitis and Vardi [9]. They showed that the class of all structures with uniform measure (for any given signature) has an L^ω_{∞ω} 0-1 law, and also characterized the existence of an L^k_{∞ω} 0-1 law on a class with arbitrary measure:

Theorem 1 (Kolaitis, Vardi). A class C of finite models with arbitrary measure has an L^k_{∞ω} 0-1 law if and only if there is an L^k-equivalence class A such that μ(A) = 1.

It is this theorem which is the starting point for the work in this paper.
2 L^k Types and Infinitary 0-1 Laws

We show that the existence of L^k_{∞ω} (and L^ω_{∞ω}) 0-1 laws on a class of structures is equivalent to various conditions on the L^k types of the first-order random theory (the set of sentences of first-order logic which have probability 1 on the class), and, in the process, we also provide another proof of the L^k_{∞ω} 0-1 law characterization theorem of Kolaitis and Vardi (Theorem 1).
We first state some definitions regarding types; these can be found in any standard book on model theory, such as Pillay [12], except for L^k types, which can be found in [3].

Definition 2.

1. A k-type for a complete first-order theory T is a maximal set of formulas of first-order logic, each formula having at most k free variables, which is consistent with T.
2. An L^k type for a complete first-order theory T is a maximal set of formulas of L^k which is consistent with T.

We say that a type p(x̄) is realized in a model if there is some tuple ā in the model such that ā satisfies every formula in p(x̄).

Definition 3. For any set S of types, the Stone space of S is a topological space on S in which the basic open sets are U_φ = {p ∈ S : φ ∈ p} for each formula φ which occurs in some p ∈ S.

Note that the Stone space of L^k types, like the Stone spaces for k-types, is compact.

Definition 4. A type p(x̄) of a theory T is principal if there is a formula φ ∈ p(x̄) such that φ implies p(x̄) in all models of T.

Note that an L^k type p is principal if and only if p is an isolated point in the Stone space of L^k types (again, exactly as in the case of k-types).
One of the conditions on the L^k types of the random theory which guarantees an L^k_{∞ω} 0-1 law is similar to the condition on k-types which is equivalent to ℵ₀-categoricity by the Ryll-Nardzewski theorem (for references to this, see e.g. [1]). Recall that this theorem states that a first-order theory T is ℵ₀-categorical if and only if it has finitely many k-types for each k. We will show that a class C of finite models with an arbitrary measure has an L^ω_{∞ω} 0-1 law if and only if its first-order random theory T has finitely many L^k types for each k. Parts of this proof, involving the Stone space of L^k types and principal types, are also similar to the proof of the Ryll-Nardzewski theorem.

Theorem 5. Let C be a class of finite models which has a first-order 0-1 law and first-order random theory T. Then the following are equivalent:

1. Every L^k type of T is realizable in a finite model.
2. Every L^k type of T is principal.
3. T has only finitely many L^k types.
4. The intersection of T with L^k is finitely axiomatizable.
5. There is an L^k-equivalence class A which has probability 1 on C.
6. C has an L^k_{∞ω} 0-1 law.

Proof. The proof is done in round-robin fashion: we show (1) ⟹ (2) ⟹ (3) ⟹ (4) ⟹ (5) ⟹ (6) ⟹ (1). Most of these implications are already known: (1) ⟹ (2), for instance, was proved by Dawar, Lindell, and Weinstein [3]; (2) ⟹ (3) is just due to the compactness of the Stone space of L^k types; (3) ⟹ (4) and (4) ⟹ (5) are similar to one of the proofs given by Kolaitis and Vardi [9] that the class of all structures has an L^ω_{∞ω} 0-1 law; (5) ⟹ (6) is the "easy half" of the characterization theorem of Kolaitis and Vardi [9].
(6) ⟹ (1) is the only new part of the proof. We prove the contrapositive. Let p be an L^k type of T which is not realizable in any finite model. By compactness, for each n there is some formula φ_n(x̄) in p which implies the existence of at least n distinct elements. Let ψ_n(x̄) be ⋀_{j<n} φ_j(x̄); we do this so that ψ_{n+1}(x̄) → ψ_n(x̄). Clearly, for each n, ∃x̄ ψ_n(x̄) must have probability 1, since for each n ∃x̄ φ_n(x̄) has probability 1, because φ_n is in p and p is consistent with T. For each n, let m_n be such that for all m > m_n, μ_m(∃x̄ ψ_n(x̄)) > 1 − 1/n. Pick some n_0 > 0, and then define f(n) recursively as follows:

(1) f(0) = n_0
(2) f(n+1) = m_{f(n)} + 1

Clearly f is strictly increasing and f(k) > k, since for any n, m_n is obviously at least as large as n.

Now consider the L^k_{∞ω} sentence

θ = ⋀_i (∃x̄ ψ_{f(2i)}(x̄) → ∃x̄ ψ_{f(2i+1)}(x̄)).

For m = m_{f(2i)}, clearly μ_m(θ) is less than 1/f(2i): we only need consider the conjunct ∃x̄ ψ_{f(2i)}(x̄) → ∃x̄ ψ_{f(2i+1)}(x̄), in which the hypothesis has probability greater than 1 − 1/f(2i) on structures of size m, and the conclusion is always false, since there cannot be f(2i+1) = m_{f(2i)} + 1 elements in a structure of size m_{f(2i)}. But for m = m_{f(2i+1)}, μ_m(θ) is greater than 1 − 1/f(2i+1): any of the conjuncts with f(2n), n > i, in the hypothesis can be disregarded, since all of their hypotheses must be false (we cannot have f(2i+2) = m_{f(2i+1)} + 1 elements in a structure of size m_{f(2i+1)}), and ∃x̄ ψ_{f(2i+1)}(x̄) has probability greater than 1 − 1/f(2i+1) on structures of size m and implies all of the remaining conjuncts. Thus this sentence does not have a probability (i.e., lim_{m→∞} μ_m(θ) is not defined), and so C does not have an L^k_{∞ω} 0-1 law.

3 Yet Another Proof of the Characterization Theorem for L^k_{∞ω} 0-1 Laws

This proof relies on the following proposition, which, as stated here, is slightly more general than, but similar in proof to, many others which reduce the sentences of any countable language to first-order logic (e.g., McColm's proof, as mentioned in [7], that McColm's second conjecture is true for sentences).
We will be considering abstract logics ℒ = (L, ⊨_ℒ), where L is a set of objects, called sentences, and ⊨_ℒ is a relation between structures and sentences in L (the satisfaction relation). (For a brief discussion of abstract logics, see [1].) The only requirements we will impose on these logics are that the satisfaction relation is preserved under isomorphism of structures, and that they are closed under negation, that is, for every sentence θ in L there is a sentence ¬θ in L such that for all structures A, A ⊨_ℒ ¬θ if and only if it is not the case that A ⊨_ℒ θ.
We will need some definitions:

Definition 6. Let ℒ and ℒ' be two abstract logics. We say that ℒ is reducible to ℒ' on a class of structures C if for every sentence θ of ℒ there is a sentence θ' of ℒ' such that for all A ∈ C, A ⊨_ℒ θ ⟺ A ⊨_{ℒ'} θ'. If ℒ is reducible to ℒ' on C and ℒ' is reducible to ℒ on C, then we say that ℒ and ℒ' are equivalent on C.

Definition 7. Let ℒ be an abstract logic, T a set of sentences of ℒ, and C any class of structures. We say that T is countably axiomatizable on C if there is a set S ⊆ T, |S| = ℵ₀, such that for each sentence θ in T, there is a sentence φ in S which implies θ under ⊨_ℒ for all structures in C, that is, if A ∈ C is such that A ⊨_ℒ φ, then A ⊨_ℒ θ.

This is basically the same as the normal definition; however, we require that for each sentence θ in T there be a single sentence in S which implies θ, since ℒ may not be closed under conjunction.

Proposition 8. Let ℒ be an abstract logic. Suppose ℒ has a 0-1 law on a class of finite structures C, and, furthermore, that the set {φ ∈ L : μ(φ) = 1} is countably axiomatizable on C. Then ℒ is equivalent to first-order logic on a subset of C of measure 1.

Proof. Let S be a countable axiomatization for the set of sentences of ℒ with probability 1. Let {s_0, s_1, ...} be an enumeration of S, and let S_i be the set of the first i elements in this enumeration. Now, each sentence in S has probability 1 by definition (since S axiomatizes the probability 1 sentences). Thus each finite subset of S also has probability 1, since μ is finitely additive. Let n_1 = 0, and for each i > 1, let n_i be the least n > n_{i−1} such that μ_m(S_i) > 1 − 1/i for all m > n. Now we define

X_i = {A ∈ C : n_i < |A| ≤ n_{i+1} and A ⊨ S_i}.

The X_i are clearly disjoint, since no two of them contain structures of the same cardinality. Let X = ⋃X_i. X is of measure 1, since μ_n(X) = μ_n(X_i) > 1 − 1/i for n_i < n ≤ n_{i+1}, so lim_{n→∞} μ_n(X) = 1.
We claim that ℒ is expressible by first-order logic on X. First, suppose φ ∈ L has probability 0. Then ¬φ has probability 1, so there is some s_i in S which implies ¬φ. Since s_i holds for all X_j, j > i, we have that φ can only hold in X on structures of size ≤ n_i. So we have

φ ≡ ⋁{θ_A : |A| ≤ n_i, A ⊨_ℒ φ}  on X.

Here θ_A is the first-order sentence which describes A up to isomorphism (this sentence exists since both A and the signature are finite). Since the RHS is a finite disjunction of first-order sentences, it is clearly first-order. Next, suppose φ ∈ L has probability 1. Then φ is implied by some s_i ∈ S, so φ holds for all A in X which have cardinality larger than n_i. Thus

φ ≡ (⋁{θ_A : |A| ≤ n_i, A ⊨_ℒ φ}) ∨ ρ_{n_i}  on X

(where ρ_n is the first-order sentence stating that there exist more than n distinct elements). Again, the RHS is clearly first-order, and we are done.
The "hard direction" of the Kolaitis-Vardi theorem (Theorem 1) is an immediate corollary of this proposition.
The "hard direction" of the Kolaitis-Vardi theorem (Theorem 1) is an imme-
diate corollary of this proposition.

Corollary 9. If a class C has an L^k_{∞ω} 0-1 law, then there is an L^k-equivalence class A such that μ(A) = 1.

Proof. First, since each sentence of L^k_{∞ω} is equivalent, on finite models, to a countable disjunction of sentences of L^k, we can consider L^k_{∞ω} to be a set (rather than a class). Let T be the set of all sentences of L^k_{∞ω} with probability 1. Then ⋀T is (equivalent on finite structures to) a sentence of L^k_{∞ω}, and thus T is countably (in fact, finitely) axiomatizable. Thus L^k_{∞ω} must be equivalent to first-order logic on a set of measure 1. But this implies that there must be a set of measure 1 with only finitely many distinct L^k-equivalence classes, since a set with infinitely many distinct L^k-equivalence classes has 2^{ℵ₀} distinct L^k_{∞ω} sentences, and thus L^k_{∞ω} cannot possibly be equivalent to first-order on that set. But, since each L^k-class is definable by a sentence of L^k, it must have probability 0 or 1. Thus if there is a finite set of L^k-equivalence classes of measure 1, it must contain exactly one equivalence class of measure 1.

4 A Characterization of the Infinitary Convergence Law

The following theorem provides a complete characterization of the existence of an L^k_{∞ω} convergence law.

Theorem 10. Suppose a class C of finite models has an L^k convergence law. Then C has an L^k_{∞ω} convergence law if and only if for each ε > 0, there is a finite set X of L^k-equivalence classes such that μ(X) > 1 − ε.

Proof. Let $A_m$ be the set of all $L^k$-equivalence classes represented in $C$ by a
structure of size $m$.
($\Rightarrow$): Assume that the RHS of the theorem does not hold. Let
$$a = \sup\{\mu(X) : X \text{ is a finite set of } L^k\text{-equivalence classes}\};$$
since the RHS does not hold, we know that $a < 1$. Fix $\varepsilon > 0$ very small (less
than $(1-a)/2$ will suffice; since $a < 1$, this will still be greater than 0). Let $U$
be a finite set of $L^k$-equivalence classes such that $\mu(U) > a - \varepsilon/2$. Then we have,
by finite additivity, that any finite set of $L^k$-equivalence classes disjoint from $U$
will have probability $< \varepsilon/2$.
We will now construct a sequence of finite sets of $L^k$-equivalence classes $X_i$,
$Y_i$, and integers $m_i$, $n_i$, which satisfy the following conditions:

1. $X_{i+1} \supseteq X_i$, $Y_{i+1} \supseteq Y_i$.
2. $X_i$ and $Y_i$ are disjoint, and they are both disjoint from $U$.
3. $\mu_{m_i}(U \cup X_i) > 1 - \varepsilon$.
4. $\mu_{n_i}(U \cup X_i) < a + \varepsilon/2$.

We let $m_0$ be such that $\mu_i(U) > a - \varepsilon/2$ for all $i \ge m_0$ and let $X_0$ be $A_{m_0} \setminus U$.
Clearly $\mu_{m_0}(X_0 \cup U) = 1$, so (3) is satisfied, and certainly $X_0$ is disjoint from
$U$. Let $n_0$ be such that $\mu_i(X_0 \cup U) < a + \varepsilon/2$ for all $i \ge n_0$ ($n_0$ is guaranteed
to exist, since $U$ and $X_0$ are both finite, so $\mu(X_0 \cup U) \le a$), thus satisfying (4).
Let $Y_0$ be $A_{n_0} \setminus (X_0 \cup U)$. $Y_0$ clearly satisfies (2).

Given $X_i$, $Y_i$, $m_i$, $n_i$, we construct their successors in similar fashion. We let
$m_{i+1}$ be such that $\mu_{m_{i+1}}(X_i \cup Y_i \cup U) < a + \varepsilon/2$ (and thus $\mu_{m_{i+1}}(Y_i) < \varepsilon$). We
let $X_{i+1} = X_i \cup A_{m_{i+1}} \setminus (Y_i \cup U)$. Then $X_{i+1}$ satisfies (1) and (3), and is disjoint
from $U$. We let $n_{i+1}$ be such that $\mu_i(Y_i \cup X_{i+1} \cup U) < a + \varepsilon/2$ for all $i \ge n_{i+1}$,
so $\mu_{n_{i+1}}(X_{i+1}) < \varepsilon$ and (4) is satisfied, and let $Y_{i+1} = Y_i \cup A_{n_{i+1}} \setminus (X_{i+1} \cup U)$.
Then $Y_{i+1}$ satisfies the disjointness conditions (1) and (2).

To complete the proof, we consider the property $U \cup \bigcup X_i$, which is definable
in $L^\omega_{\infty\omega}$ by some sentence $\theta$ [9]. By (3), $\mu_{m_i}(\theta) > 1 - \varepsilon$ for all $m_i$, and by (4)
$\mu_{n_i}(\theta) < a + \varepsilon/2$ for all $n_i$. Thus $\theta$ has no asymptotic probability.
($\Leftarrow$): Assume that the RHS holds. Let $\{X_i\}$ be a sequence of finite sets of
equivalence classes such that $\lim_{i\to\infty}\mu(X_i) = 1$. Taking $X_i = X_i \cup \bigcup_{j<i} X_j$, we
may assume that $X_{i+1} \supseteq X_i$.

Let $\theta$ be a sentence of $L^\omega_{\infty\omega}$, and consider $\mu_n(\theta)$. We certainly have that

(3) $\quad \mu_n(\theta) \ge \mu_n(\{Z \in X_i : Z \models \theta\})$

(here we are using the fact that $L^k$-equivalence implies $L^\omega_{\infty\omega}$-equivalence on
finite models), since $X_i$ is a subset of all $L^k$-equivalence classes in $C$. We also
have that

(4) $\quad \mu_n(\theta) \le 1 - \mu_n(\{Z \in X_i : Z \models \neg\theta\})$

by similar reasoning. Let $\rho(i) = \mu(\{Z \in X_i : Z \models \theta\})$ and let $\zeta(i) = 1 -
\mu(\{Z \in X_i : Z \models \neg\theta\})$. (We know that these probabilities exist since we have by
assumption that $C$ has an $L^k$ convergence law, so any finite set of $L^k$-equivalence
classes must have a probability, because each $L^k$-equivalence class is definable
by a sentence of $L^k$.) We have by (3) and (4) that

(5) $\quad \zeta(i) \ge \limsup_n \mu_n(\theta) \ge \liminf_n \mu_n(\theta) \ge \rho(i)$

for each $i$.
Since $X_i \subseteq X_{i+1}$, $\rho(i)$ is increasing as $i$ increases, and $\zeta(i)$ is decreasing with
$i$. Thus $\lim_{i\to\infty}\rho(i)$ and $\lim_{i\to\infty}\zeta(i)$ both exist. In fact, since $\rho(i) + (1 - \zeta(i)) =
\mu(X_i)$, we have $\lim_{i\to\infty}(\rho(i) + (1 - \zeta(i))) = \lim_{i\to\infty}\mu(X_i) = 1$, so $\lim_{i\to\infty}\rho(i) =
\lim_{i\to\infty}\zeta(i)$. Thus, by (5), $\liminf_n \mu_n(\theta) = \limsup_n \mu_n(\theta)$, so $\mu(\theta)$ exists.

This theorem has several immediate corollaries. For the first corollary, we
need to define the graph associated with a given structure with an arbitrary
signature:

Definition 11. The graph associated with a structure $A$ is a graph on the set
of elements of $A$, where two distinct elements $a, b \in A$ have an edge between
them if for some relation $R$ in the signature, $R$ holds on a tuple from $A$ which
includes $a$ and $b$.

A structure $A$ is connected if its associated graph is connected, the degree of
an element in $A$ is its degree in the associated graph, etc. (Note that this is
equivalent to the definitions of a connected structure, etc., given in Compton [2].)
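To make Definition 11 and the quantity $M_A$ introduced just below concrete, here is a small sketch (ours, not part of the original development; the encoding of a structure as a universe plus a dictionary of relation names to tuple sets is an assumption):

```python
from itertools import combinations
from collections import deque

def associated_graph(universe, relations):
    """Graph of Definition 11: distinct elements a, b are adjacent iff some
    tuple of some relation contains both of them."""
    adj = {a: set() for a in universe}
    for tuples in relations.values():
        for t in tuples:
            for a, b in combinations(set(t), 2):
                adj[a].add(b)
                adj[b].add(a)
    return adj

def component_diameters(adj):
    """Diameter of each connected component, via breadth-first search."""
    seen, diams = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = set(), deque([start])
        while queue:                       # collect the component of `start`
            v = queue.popleft()
            if v not in comp:
                comp.add(v)
                queue.extend(adj[v] - comp)
        seen |= comp
        best = 0
        for src in comp:                   # eccentricity of every vertex
            dist, queue = {src: 0}, deque([src])
            while queue:
                v = queue.popleft()
                for w in adj[v]:
                    if w not in dist:
                        dist[w] = dist[v] + 1
                        queue.append(w)
            best = max(best, max(dist.values()))
        diams.append(best)
    return diams

def M_A(universe, relations):
    """M_A = max diameter over the connected components (universe assumed non-empty)."""
    return max(component_diameters(associated_graph(universe, relations)))

# Example: a directed edge relation on {1,2,3,4}; component {1,2,3} has diameter 2.
print(M_A({1, 2, 3, 4}, {"E": {(1, 2), (2, 3)}}))   # 2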
Let $M_A = \max\{\operatorname{diam} B : B \text{ is a connected component of } A\}$ for any struc-
ture $A$. We then have the following corollary:

Corollary 12. A class $C$ has an $L^\omega_{\infty\omega}$ convergence law only if
$$\lim_{m \to \infty} \mu(\{A \in C : M_A \le m\}) = 1.$$

Another corollary of the theorem is as follows:

Corollary 13. If $C$ is a class of finite structures which has a first-order conver-
gence law and each connected component of every structure $A \in C$ has cardinality
$\le n$ for some fixed $n$, then $C$ has an $L^\omega_{\infty\omega}$ convergence law.

5 Some Applications to Particular Classes of Finite Structures

Corollary 12 can be immediately applied to get some negative results about the
existence of $L^\omega_{\infty\omega}$ convergence laws. The first two of these have already been
stated by Tyszkiewicz [15], and the third is a weaker version (but with a shorter
proof) of another result due to Tyszkiewicz [14].

1. The class of graphs with edge probability $p(n)$, $n^{-1-\varepsilon} \ll p \ll n^{-1}$ for all
$\varepsilon > 0$, does not have an $L^\omega_{\infty\omega}$ convergence law. This can be read off from the
analysis in Shelah and Spencer [13]: the random theory asserts the existence
of copies of all finite trees, including those of arbitrarily large diameter.
2. The class of graphs with edge probability $p(n)$, $n^{-1} \ll p \ll n^{-1}\log n$, does
not have an $L^\omega_{\infty\omega}$ convergence law. This is also immediate from Shelah and
Spencer [13].
3. The class of graphs with edge probability $p(n)$, $n^{-1}\log n \ll p \ll n^{-1+\varepsilon}$, does
not have an $L^\omega_{\infty\omega}$ convergence law. This is also immediate from Shelah and
Spencer [13].
4. The class of all unary functions with uniform measure does not have an $L^\omega_{\infty\omega}$
convergence law. This is immediate from Lynch [10], where it is shown that
this class has arbitrarily large diameter with probability 1.
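The criterion of Corollary 12 can also be probed experimentally. The following Monte Carlo sketch (ours; the parameters are purely illustrative) estimates $\mu_n(\{A : M_A \le m\})$ for the random graph $G(n, p)$; for sparse $p(n)$ the estimate typically stays well below 1 for every fixed $m$, which is the failure mode exploited in the examples above.

```python
import random
from collections import deque

def max_component_diameter(n, edges):
    """Largest diameter among the connected components of an n-vertex graph."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    best = 0
    for src in range(n):
        dist, q = {src: 0}, deque([src])
        while q:
            v = q.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    q.append(w)
        best = max(best, max(dist.values()))
    return best

def estimate(n, p, m, trials=200):
    """Monte Carlo estimate of mu_n({A : M_A <= m}) for G(n, p)."""
    hits = 0
    for _ in range(trials):
        edges = [(a, b) for a in range(n) for b in range(a + 1, n)
                 if random.random() < p]
        if max_component_diameter(n, edges) <= m:
            hits += 1
    return hits / trials

# For p(n) near 1/n the estimate tends to drop as n grows, for any fixed m.
print(estimate(60, 1 / 60, m=5))
```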

For classes of bounded degree, we have the following general proposition,
which, like the facts above, uses Corollary 12. (Note that when we say that a
structure is of bounded degree we mean that its associated graph, as defined
in the previous section, is of bounded degree.) This proposition uses generating
functions to obtain a result which can then be applied immediately to some
particular classes of bounded degree.

Proposition 14. Let $C$ be a class with uniform measure such that, with proba-
bility 1, every point in every structure in $C$ is of bounded degree. Then $C$ has an
$L^\omega_{\infty\omega}$ convergence law only if there is a polynomial $p(x)$ such that the exponential
generating function $G(x)$ for $C$ (as defined by Compton [2]) converges for all $x$
and has $G(x) \le e^{p(x)}$ for all $x \ge 0$.

Proof. We first note that, by Corollary 12, if $C$ has an $L^\omega_{\infty\omega}$ convergence law
then $C$ must contain a set of measure 1 which has bounded diameter; since $C$
also has a set of measure 1 which has bounded degree, $C$ must have a set $S$
of measure 1 (the intersection of the two above-mentioned sets) in which every
connected component of every structure is of bounded size. Let $A$ be the set
of all structures which are connected components of some structure in $S$. Now,
there is a bound $n$ on the size of structures in $A$, and there are at most finitely
many structures of size $\le n$, so $A$ is finite.

So assume $C$ has a set $S$ of measure 1 such that the set $A$ of all structures
which are connected components of some structure in $S$ is finite. Then the ex-
ponential generating function associated with $A$ is some polynomial $q(x)$. Now,
$S$ is clearly a subset of the set $A^*$ of all structures whose components are in $A$,
and the exponential generating function associated with $A^*$ is $e^{q(x)}$. Thus the
generating function $H(x)$ associated with $S$ is term-by-term less than $e^{q(x)}$, and
so, since each coefficient of the series is nonnegative (one cannot have a nega-
tive number of structures), $H(x)$ clearly converges and is $\le e^{q(x)}$ for all $x \ge 0$.
Since $H(x)$ is a power series, we also have that $H(x)$ converges for all $x$, since
it converges for all $x > 0$.

Now, $G(x)$, the generating function of $C$, is clearly very close to $H(x)$, since $S$
is of measure 1, and in fact, for $x > 0$, $G(x)$ must be strictly less than $2H(x) + r(x)$
for some polynomial $r(x)$, since there must be some $m$ such that for all $n > m$,
less than half of the structures of $C$ of size $n$ are not in $S$; otherwise $S$ would
not have measure 1. Thus each term $g_n x^n$ of $G(x)$, $n > m$, is less than $2h_n x^n$,
where $h_n x^n$ is the corresponding term of $H(x)$. We add $r(x)$ in to take account of
all the things that may happen in $C$ for structures of size $\le m$. Thus we have,
for $x > 0$,
$$G(x) < 2H(x) + r(x) \le 2e^{q(x)} + r(x) \le e^{p(x)}$$
where $p(x)$ can be constructed from $q(x)$ and $r(x)$.
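The step from a finite component set $A$ with generating polynomial $q(x)$ to the exponential generating function $e^{q(x)}$ for $A^*$ can be checked mechanically. The sketch below is ours (the component counts in the example are illustrative, not taken from the paper): it expands $e^{q(x)}$ as a truncated power series and reads off the number of labelled structures of each size.

```python
from fractions import Fraction
from math import factorial

def exp_series(q, N):
    """Truncated power series of exp(q(x)) up to degree N, where q is given by its
    coefficient list and q[0] == 0.  Uses exp(q) = sum_k q^k / k!, which is exact up
    to degree N because q has no constant term."""
    result = [Fraction(0)] * (N + 1)
    result[0] = Fraction(1)
    power = [Fraction(0)] * (N + 1)   # running truncation of q^k
    power[0] = Fraction(1)
    for k in range(1, N + 1):
        new = [Fraction(0)] * (N + 1)
        for i, a in enumerate(power):
            if a:
                for j, b in enumerate(q):
                    if b and i + j <= N:
                        new[i + j] += a * Fraction(b)
        power = new
        for d in range(N + 1):
            result[d] += power[d] / factorial(k)
    return result

# Illustrative component set: one structure of size 1 (a point) and one labelled
# structure of size 2 (an edge), so q(x) = x + x^2/2.  Then e^{q(x)} is the EGF of
# the structures all of whose components are points or single edges.
q = [0, Fraction(1), Fraction(1, 2)]
G = exp_series(q, 8)
print([int(factorial(n) * G[n]) for n in range(9)])
# [1, 1, 2, 4, 10, 26, 76, 232, 764]
```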
In particular, if $C$ is of bounded degree and has $(n - c)!$ structures of size $n$
for each $n$ and some constant $c$, then $C$ does not have an $L^\omega_{\infty\omega}$ convergence law,
because the exponential generating function for $C$ will either fail to converge or
will not be bounded by $e^{p(x)}$ for any polynomial $p(x)$. Thus we get the following
results:

1. The class of all unary 1-1 functions with uniform measure does not have an
$L^\omega_{\infty\omega}$ convergence law, as it has $n!$ structures of size $n$.
2. The class of all graphs of bounded degree with uniform measure does not
have an $L^\omega_{\infty\omega}$ convergence law, as it has at least $(n-1)!$ structures (the
chains of length $n$) of size $n$.

6 Infinitary Convergence vs. MSO Convergence


Note that when a class $C$ has an $L^\omega_{\infty\omega}$ 0-1 law, then $L^\omega_{\infty\omega}$ reduces to first-order
logic (as formulas, not sentences) with probability 1 on $C$. Tyszkiewicz [16] has
conjectured that when both $L^\omega_{\infty\omega}$ and MSO 0-1 laws hold, the same is true of
MSO. There are many examples which support this conjecture:
1. The class of all structures on a finite relational signature has an $L^\omega_{\infty\omega}$ 0-1
law, but does not have an MSO convergence law (even with lenient notions
of convergence, such as Cesàro convergence).
2. The class of all graphs with edge probability $p(n)$, $1/n \ll p(n) \ll (1/n)\log n$,
has an MSO 0-1 law, but does not have an $L^\omega_{\infty\omega}$ convergence law.
3. The class of all partial orders has an $L^\omega_{\infty\omega}$ 0-1 law, but it does not have an
MSO convergence law.
4. The class of all graphs which are unions of cycles has an MSO 0-1 law, but
does not have an $L^\omega_{\infty\omega}$ convergence law.
5. The class of all equivalence relations has both $L^\omega_{\infty\omega}$ and MSO 0-1 laws, but
every formula of MSO is equivalent to some first-order formula on this class.
Tyszkiewicz's actual conjecture was slightly more restricted, since he was
considering LFP 0-1 laws instead of $L^\omega_{\infty\omega}$ 0-1 laws, so the conjecture, instead of
being stated for classes with arbitrary measure, is only stated for classes with
recursive measures, as defined below.

Definition 15 (Tyszkiewicz [15]). A sequence of measures $\{\mu_n\}$ on a class of
finite models $C$ is recursive if there is a recursive relation $R_\mu(\langle A\rangle, p, q)$ (where
$\langle A\rangle$ is the code of the structure $A$ in some recursive coding) which is true exactly
when $\mu_{|A|}(\{A\}) > p/q$.

Tyszkiewicz conjectured that if a class $C$ with a recursive measure has an $L^\omega_{\infty\omega}$
0-1 law, then it does not have an MSO convergence law unless there is a set of
probability 1 in $C$ on which each formula of MSO is equivalent to a first-order
formula with probability 1 on $C$. (Note that by Proposition 8, this conjecture is
trivially true if "formula" is replaced by "sentence".) However, this conjecture
is not true.

Proposition 16. There is a class $C$ of finite structures with a recursive measure
which has both $L^\omega_{\infty\omega}$ and MSO 0-1 laws but in which there is no subset of measure
1 on which every formula of MSO is equivalent to a first-order formula.

We show this by the following construction. Recall that an extension axiom
is defined as follows (as in [5]):

Definition 17.

1. A quantifier-free type $t(x_1,\ldots,x_k)$ is a formula which is a conjunction of
atomic formulas and negations of atomic formulas such that for every relation
$R$ of the base signature of arity $n$ and every sequence $x_{i,1},\ldots,x_{i,n}$ of length
$n$ from $x_1,\ldots,x_k$, exactly one of $R(x_{i,1},\ldots,x_{i,n})$ and $\neg R(x_{i,1},\ldots,x_{i,n})$ is
one of the conjuncts, together with the assertion that $x_i \ne x_j$ for $i \ne j$.
2. Let $s(x_1,\ldots,x_k)$ be a quantifier-free type, and let $t(x_1,\ldots,x_k,x_{k+1})$ be a
quantifier-free type which extends $s$ (that is, $s$ and $t$ make the same assertions
about the base relations on $x_1,\ldots,x_k$). Then the extension axiom $Ext(s,t)$
is the sentence
$$\forall x_1,\ldots,x_k\,[\,s(x_1,\ldots,x_k) \rightarrow \exists x_{k+1}\, t(x_1,\ldots,x_k,x_{k+1})\,].$$

Let $T_k$ be the conjunction of all extension axioms (in the language with one
binary relation) with $\le k$ variables. Let $b(k)$ be a recursive function such that
for all $m \ge b(k)$, there is a model of $T_k$ of cardinality $m$. We can construct such
a $b(k)$ using Fagin's proof of the 0-1 law for first-order logic [5]. Now, we define
a sequence $\{A_i\}$ of connected directed graphs, $|A_i| = i$, as follows: let $A_1$ be the
(unique) graph with one vertex. For $i < b(3)$, let $A_i$ be the chain $\{1, 2, \ldots, i\}$.
For $i = b(3)$, let $A_i$ be the first directed graph (in some recursive enumeration)
of size $i$ which satisfies $T_3$. We know that such an $A_i$ exists by the definition of
$b(k)$; $A_i$ will be connected since one of the 3-variable extension axioms states
that every two points not connected by a path of length 1 are connected by a
path of length 2. In general, for $k > 3$, $b(k-1) \le i < b(k)$, we let $A_i$ be the first
structure of size $i$ which satisfies $T_{k-1}$. As in the specific case $i = b(3)$, we know that $A_i$
is connected.
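A literal (and computationally naive) rendering of "the first directed graph of size $i$ which satisfies $T_3$" is sketched below. The sketch is ours; it checks only the two-point base case of the 3-variable extension axioms (for a loop-free digraph with at least two vertices the zero- and one-point cases follow), and the brute-force enumeration is only feasible for toy sizes.

```python
from itertools import product

def satisfies_T3(n, edges):
    """Check the two-point extension axioms of T_3 for a loop-free directed graph
    on vertices 0..n-1, where `edges` is a set of ordered pairs."""
    for a in range(n):
        for b in range(n):
            if a == b:
                continue
            # every quantifier-free type of a new point z over {a, b} must be realized
            for za, az, zb, bz in product((False, True), repeat=4):
                if not any(
                    z not in (a, b)
                    and ((z, a) in edges) == za and ((a, z) in edges) == az
                    and ((z, b) in edges) == zb and ((b, z) in edges) == bz
                    for z in range(n)
                ):
                    return False
    return True

def first_graph_satisfying_T3(n):
    """Enumerate loop-free digraphs on n vertices in a fixed order and return the
    edge set of the first one satisfying T_3, or None.  Purely illustrative:
    the 2^(n(n-1)) enumeration is hopeless beyond tiny n."""
    pairs = [(a, b) for a in range(n) for b in range(n) if a != b]
    for bits in product((0, 1), repeat=len(pairs)):
        edges = {p for p, bit in zip(pairs, bits) if bit}
        if satisfies_T3(n, edges):
            return edges
    return None
```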
It is clear that the set $A = \{A_i\}$ is recursive. Taking the obvious probability
distribution ($\mu_i(A_i) = 1$, and $\mu_i(A) = 0$ otherwise), we clearly get a class with a
recursive measure. It is also clear that the probability of each extension axiom is
1. So then we can apply the following theorem of Tyszkiewicz to conclude that
$A$ does not have an MSO convergence law:

Theorem 18 (Tyszkiewicz [16]). Let $C$ be a class of structures with a recur-
sive distribution. Suppose each extension axiom has probability 1 on $C$. Then $C$
does not have an MSO convergence law.

Because the probability of each extension axiom is 1, however, we can conclude,
following one of the proofs of Kolaitis and Vardi for the class of all structures,
that $A$ has an $L^\omega_{\infty\omega}$ 0-1 law [9].
Now, we define a class $A^*$ in such a way that $A^*$ has exactly one member,
$B_n$, of size $n$ for each $n$, with $B_n$ just being $B_{n-1}$ with an isolated point added,
unless $n = \sum_{i=1}^{m} i \cdot m$, in which case $B_n$ has $m$ copies of $A_i$ for each $i \le m$. Thus
each large enough structure $B_n$ has more than $k$ copies of each $A_i$, $i < b(k)$, and
more than $k$ structures taken from the set $\{A_{b(k)}, A_{b(k)+1}, \ldots\}$, with probability
1. Now, any two structures which satisfy this property are $L^k$-equivalent, since
any two copies of each $A_i$, $i < b(k)$, are certainly $L^k$-equivalent, and any two $A_j$,
$A_l$, $j, l \ge b(k)$, are $L^k$-equivalent since any two models of $T_k$ are $L^k$-equivalent.
Thus any two structures which satisfy this property both have more than $k$ copies
of the same $L^k$-equivalence classes, and thus are $L^k$-equivalent by a theorem
of Kolaitis [8]. Thus, for each $k$, there is an $L^k$-equivalence class which has
probability 1, and thus $A^*$ has an $L^\omega_{\infty\omega}$ 0-1 law by Theorem 1. Also, it is clear
that each equivalence class of MSO with $k$ quantifiers occurs more than $m$ times
for any $m$ as $n$ gets bigger, so by a similar argument (as in Compton's proof [2]
that the class of all equivalence relations has an MSO 0-1 law), we can conclude
that $A^*$ has an MSO 0-1 law.
However, it is fairly easy to see that MSO does not reduce to first-order on
any subset of $A^*$ of measure 1. To see this, let $\theta$ be a sentence of MSO with no
probability on $A$ (such a sentence must exist, since $A$ does not have an MSO
convergence law). Let $\varphi(x)$ be a formula which says "there is a connected set
$U$ with diameter 2 such that $x \in U$ and $\theta$ relativized to $U$ is true". This is
clearly expressible in MSO. It is also clear that the truth of $\varphi(x)$ depends only
on the isomorphism class of the connected component that $x$ is in. We will show
that $\varphi(x)$ cannot be equivalent to any first-order formula on any subset of $A^*$
of measure 1. We will need the following lemma:

Lemma 19. Suppose that $\varphi(x)$ is equivalent to some first-order formula $\sigma(x)$
on some subset $S$ of $A^*$ of probability 1. Then we can construct a first-order
formula $\sigma'(x)$ such that for all $A_i \in A$,
$$A_i \models \exists x\,\sigma'(x) \iff A_i \models \theta.$$

Proof. This is most easily seen by using Ehrenfeucht-Fraïssé games [4], [6]. As-
sume the quantifier rank of $\sigma$ is $n$. Let $B_i$ be sufficiently large (large enough to
be in $S$ and to have at least $n+1$ copies of each connected component of size
$\le n+1$), and let $x$ and $y$ in $B_i$ be such that $\sigma(x)$ holds but $\sigma(y)$ does not (i.e. $x$
is in a connected component of $B_i$ on which $\theta$ is true and $y$ is in a connected
component on which $\theta$ is false). Let $C_x$ and $C_y$ denote the connected components
of $x$ and $y$ respectively.

Add a new constant $c$ to the language, and consider the structures $(B_i, x)$
and $(B_i, y)$, in which $x$ and $y$, respectively, are the interpretations of $c$. $(B_i, x)$
and $(B_i, y)$ will differ on a sentence of quantifier rank $n$, namely $\sigma(c)$. So Player I
can always win the Ehrenfeucht-Fraïssé game of length $n$ on $(B_i, x)$ and $(B_i, y)$.
But whenever Player I plays on a connected component other than $C_x$ in $(B_i, x)$
or $C_y$ in $(B_i, y)$, Player II can easily match the move by playing on an isomorphic
component of $B_i$ which is not $C_y$ or $C_x$ respectively (here we use the fact that
$B_i$ has at least $n+1$ copies of each connected component). So Player I only
needs to play on $C_x$ and $C_y$. And clearly Player II will still lose whether or not
he only plays on $C_x$ and $C_y$. Thus we have that $(C_x, x)$ and $(C_y, y)$ differ on a
sentence of quantifier rank at most $n$.

This argument works for any $x$ and $y$; thus if $C$ and $D$ are two connected
components such that $\theta$ is true in $C$ but not in $D$, and $x \in C$ and $y \in D$, then
$(C, x)$ and $(D, y)$ must have different first-order theories up to quantifier rank
$n$. But there are only finitely many equivalence classes of first-order logic up to
quantifier rank $n$, each of which is expressible by a single sentence of quantifier
rank $n$. Let $S_\theta$ be the (finite) set of sentences which describe the first-order
theories up to quantifier rank $n$ for each $(C, x)$ such that $\theta$ holds in $C$ and
$x \in C$. Let $S_\theta(x)$ be $S_\theta$ with each occurrence of the constant $c$ in each sentence
replaced by $x$. Then we have
$$\theta \leftrightarrow \exists x \bigvee S_\theta(x)$$
on all $A_i$.

But then the sentence $\exists x\,\sigma'(x)$ will be true on a structure $A_i$ in $A$ if and
only if $\theta$ is true in that structure, and thus it will not have a probability. But
$A$ has an $L^\omega_{\infty\omega}$ 0-1 law, and so $\exists x\,\sigma'(x)$ must have probability either 0 or 1, a
contradiction. Thus $\varphi(x)$ must not be equivalent to any first-order formula.

Note that the first-order random theory of $A^*$, like that for equivalence
relations, has finitely many $L^k$ types for each $k$ but infinitely many normal 1-
types, and thus is not $\aleph_0$-categorical. Thus the following modification of the
original conjecture may still be true:

Conjecture 20. Let $C$ be a class of finite models (possibly restricted to have a
recursive measure) whose first-order random theory is $\aleph_0$-categorical. Then $C$
has an MSO 0-1 law if and only if there is a set of probability 1 in $C$ on which
each formula of MSO is equivalent to a first-order formula.

References

1. Chang, C. C., and Keisler, H. J., Model Theory, North-Holland, Amsterdam, 1990.
2. Compton, K. J., 0-1 Laws in Logic and Combinatorics, in I. Rival, ed., Proc. NATO
Advanced Study Institute on Algorithms and Order, Reidel, Dordrecht (1988),
pp. 353-383.
3. Dawar, A., Lindell, S., and Weinstein, S., Infinitary Logic and Inductive Definabil-
ity over Finite Structures, University of Pennsylvania Tech. Report IRCS 92-20
(1992).
4. Ehrenfeucht, A., An Application of Games to the Completeness Problem for For-
malized Theories, Fund. Math. 49 (1961), pp. 129-141.
5. Fagin, R., Probabilities on Finite Models, J. Symbolic Logic 41 (1976), pp. 50-58.
6. Fraïssé, R., Sur quelques classifications des systèmes de relations, Publ. Sci. Univ.
Alger. Sér. A 1 (1954), pp. 35-182.
7. Gurevich, Y., Immerman, N., and Shelah, S., McColm's Conjecture, Proc. of the
9th IEEE Symposium on Logic in Computer Science, 1994.

8. Kolaitis, Ph., On Asymptotic Probabilities of Inductive Queries and their Decision


Problem, in R. Parikh, ed., Logics of Programs '85, Lecture Notes in Computer
Science 193 (1985), Springer-Verlag, pp. 153-166.
9. Kolaitis, Ph., and Vardi, M., Infinitary Logics and 0-1 Laws, Information and
Computation 98 (1992), pp. 258-294.
10. Lynch, J. F., Probabilities of First-Order Sentences about Unary Functions, Trans-
actions of the AMS 287 (1985), pp. 543-568.
11. Lynch, J. F., Infinitary Logics and Very Sparse Random Graphs, Proc. of the 8th
IEEE Symposium on Logic in Computer Science, 1993.
12. Pillay, A., An Introduction to Stability Theory, Oxford University Press, New York,
1983.
13. Shelah, S., and Spencer, J., Zero-One Laws for Sparse Random Graphs, Journal
of the AMS 1 (1988), pp. 97-115.
14. Tyszkiewicz, J., Infinitary Queries and their Asymptotic Probabilities I: Properties
Definable in Transitive Closure Logic, in E. Börger et al., eds., Proc. Computer
Science Logic '91, Springer LNCS 626.
15. Tyszkiewicz, J., On Asymptotic Probabilities of Logics that Capture
DSPACE(log n) in the Presence of Ordering, Proc. CAAP '91.
16. Tyszkiewicz, J., On Asymptotic Probabilities of Monadic Second Order Properties,
Proc. Computer Science Logic '92, Pisa, Italy.
Is First Order Contained in an Initial Segment of
PTIME?

Alexei P. Stolboushkin*1 and Michael A. Taitslin**2

1 Department of Mathematics, UCLA, Los Angeles, CA 90024-1555,
aps@math.ucla.edu
2 Department of Computer Science, Tver State University, Tver, Russia 170000,
mat@mat.tvegu.tver.su

Abstract. By "initial segments of P" we mean the classes DTIME($n^k$). The
question of whether, for any fixed signature, the first-order definable pred-
icates in finite models of this signature are all in an initial segment of P
is shown to be related to other intriguing open problems in complexity
theory and logic, like P = PSPACE.
The second part of the paper strengthens the result of Ph. Kolaitis on
logical definability of unambiguous computations.

Introduction
The question put in the title of this paper was originally motivated by certain
questions asked by Moshe Vardi after a talk by the first author, and further
crystallized during the author's discussions with Yiannis Moschovakis. We call
this "Moschovakis's Problem".
Let's start from the context for this question.
It is well-known that the language FO of first-order logic fails to express
certain simple (say, computable in P) properties of finite models. This remains
true even in the presence of linear order (see [2]).
[11] and [20] showed that the class P can be characterized in the presence of
linear order by the language of least fixpoint logic³.
On the other hand, if we are going to evaluate a fixed first-order formula in
a finite model, it seems natural that we would need to run a search for every
single quantifier in the formula; in other words, the complexity of this problem
(with the formula fixed) is polynomial, but with the degree of the polynomial
seemingly unbounded (over all formulas).

* This work has been partially supported by NSF Grant CCR 9403809.
** A part of this research was carried out while the author was visiting UCLA and
supported in part by NSF Grant CCR 9403809.
³ Many other characterizations of this complexity class in logical terms have been
proposed [6, 14, 17, 18].
Of course, should the signature contain unary relational symbols only, any
formula would be equivalent to one of the quantifier depth 1 (see [3, 15]) and
thus evaluable in linear time. However, [4] shows that, in general, the hierarchy
of FO formulas by their quantifier depth does not collapse.
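The naive evaluation procedure alluded to above, one exhaustive search over the universe per quantifier, can be written down directly. The sketch below is ours (the tuple encoding of formulas is an arbitrary choice) and runs in time roughly $n^d$ for a fixed formula of quantifier depth $d$ on an $n$-element model.

```python
# Formulas as nested tuples, e.g. ("forall", "x", ("exists", "y", ("rel", "E", ("x", "y")))).

def holds(phi, structure, env):
    """Naive evaluation of a first-order formula: one pass over the universe per
    quantifier, hence time roughly n^(quantifier depth) for a fixed formula."""
    kind = phi[0]
    if kind == "rel":
        _, name, args = phi
        return tuple(env[v] for v in args) in structure["relations"][name]
    if kind == "not":
        return not holds(phi[1], structure, env)
    if kind == "and":
        return holds(phi[1], structure, env) and holds(phi[2], structure, env)
    if kind == "or":
        return holds(phi[1], structure, env) or holds(phi[2], structure, env)
    if kind == "exists":
        _, var, body = phi
        return any(holds(body, structure, {**env, var: a})
                   for a in structure["universe"])
    if kind == "forall":
        _, var, body = phi
        return all(holds(body, structure, {**env, var: a})
                   for a in structure["universe"])
    raise ValueError(f"unknown connective {kind}")

# "Every vertex has an outgoing edge" on a directed 3-cycle:
M = {"universe": [0, 1, 2], "relations": {"E": {(0, 1), (1, 2), (2, 0)}}}
phi = ("forall", "x", ("exists", "y", ("rel", "E", ("x", "y"))))
print(holds(phi, M, {}))   # True
```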
With all this known, it seemed to be obvious that the language of first-order
logic FO "spans" P in the sense that for no $r$ is FO fully contained in DTIME($n^r$),
whatever nontrivial signature one considers (notice that we could easily achieve
this effect if the signature were allowed to grow; however, we are interested in fixed
signatures).
And we want to start by expressing our deep unshaken belief in this thesis.
However, all our attempts to actually prove this have been unsuccessful, and
soon we found out why. The fact of the matter is that the truth of this thesis would
imply P $\ne$ PSPACE, a fact which, although commonly believed, has never been
proved (see Fact 1).
We also believe that P $\ne$ PSPACE would imply the positive answer to Mos-
chovakis's Problem, but for that we don't have any argument except for the fact
that truth is implied by anything (see [8]).
Seriously, we show (Theorem 6) that the positive answer to Moschovakis's
problem would be implied by the collapse of the least fixpoint hierarchy by
dimension on ordered models. But then again, although this is an open problem,
we don't honestly believe in the collapse (Martin Grohe recently proved that the
hierarchy does not collapse for unordered structures [5]).
So the first section of this paper shows what consequences certain unlikely
facts would have for Moschovakis's problem.
Among other open problems related to Moschovakis's Problem we want to
mention here the problem of the relative position of the classes AC$^0$ and P. [12]
showed that in some particular class of models, first-order formulas characterize
AC$^0$.
In the second section we establish a normal form result for implicit definabil-
ity in ordered models (with successor and two constants for the first and last
elements).
It is well-known (see [13]) that in this class of models implicit definabil-
ity captures the complexity class UP $\cap$ coUP, where UP is the class of non-
deterministic unambiguous polynomial-time Turing computations. "Unambigu-
ous" means that at most one of all the computational paths of the non-determi-
nistic Turing machine may be successful.
Then we consider only those implicit definitions that actually define predi-
cates in P, that is, deterministic (not only unambiguous) ones, and consider a dimen-
sion hierarchy of the definitions similar to the dimension hierarchy for fixpoints.
Throughout the text, we use standard definitions (like DTIME and NTIME, P
and NP), and also less standard ones such as UP, discussed above (this class was
introduced in [19]), or UNTIME($p(n)$) for those languages in UP that are in
NTIME($p(n)$) (in the sense that the unambiguous machine is of this complexity,
not just the language).

Some other widely used notions are introduced below.

Fix a class $\mathcal{K}$ of finite structures. A $\mathcal{K}$-global predicate defines a specific pred-
icate of the same fixed arity in every finite structure in $\mathcal{K}$. In particular, a 0-ary
$\mathcal{K}$-global predicate is a subset of the set of all the considered structures.
We can naturally define what the code of a finite structure is.
An $r$-ary $\mathcal{K}$-global predicate is said to be recognized by a Turing machine $T$
iff this Turing machine accepts the set of codes of the structures and codes of
$r$-tuples such that the code of $M$ together with the code of $a$ is accepted iff $a$
belongs to the global predicate in $M$.

1 On the complexity of first-order formulas

Fact 1. The following problem is PSPACE-complete for any signature:

Given a first-order sentence $\varphi$ in the signature and a model $M$ of the signa-
ture, to decide whether $M \models \varphi$.

PROOF: It is obvious that the problem is in PSPACE. Now the completeness
follows from PSPACE-completeness of QBF (see [1]), which is a special case of
our problem with a model of cardinality 2 fixed. Q.E.D.

Corollary 2. If P = PSPACE then for any signature there would exist a $k$ such
that for any FO formula, checking its validity would be in DTIME($n^k$).

PROOF: Because of Fact 1, the two-parameter problem of checking $M \models \varphi$
is in PSPACE (in the combined size of the formula and the cardinality of the
model), and since we assumed PSPACE = P, this problem is in P. Hence, it is
in DTIME($n^k$) for a certain $k$.
When we fix a formula $\varphi$ and consider the one-parameter problem (where
only the cardinality of the model varies), it clearly remains in DTIME($n^k$) for
the same $k$. Q.E.D.

Definition 3 (Class $\mathcal{S}$). $\mathcal{S}$ is the class of finite models of the linear order $<$, the
successor function $'$, and of two constants $0$ and $LAST$, such that $<$ is a linear
order of the universe, $'$ gives the next element (in the $<$-ordering), while $0$ and
$LAST$ are, respectively, the first and last elements of the universe (w.r.t. $<$).

Note 4. The class $\mathcal{S}$ is finitely axiomatizable in the class of all finite models by
means of universal formulas.

PROOF: The following formula axiomatizes the class $\mathcal{S}$ and is universal:
$$\begin{array}{l}
LAST' = LAST \;\wedge\; (\forall x)(\forall y)(\forall z)\big(\, x' \ne 0 \;\wedge\; (x = x' \rightarrow x = LAST) \\
\quad\wedge\; (x < y \vee y < x \vee x = y) \;\wedge\; ((x < y \wedge y < z) \rightarrow x < z) \\
\quad\wedge\; (x < y \rightarrow (\neg\, y < x \wedge \neg\, y = x)) \;\wedge\; (x = LAST \vee x < LAST) \;\wedge\; (x = 0 \vee 0 < x) \\
\quad\wedge\; ((y' = x \wedge z' = x) \rightarrow (y = LAST \vee y = z \vee z = LAST)) \\
\quad\wedge\; (x' = y \rightarrow ((x < y \vee x = LAST) \wedge (x < z \rightarrow (z = y \vee y < z))))\,\big)
\end{array}$$
Q.E.D.

Definition 5 (LFP). LFP is the language of first-order logic FO with one addi-
tional operator, called a fixpoint operator, defined as follows.
Let $\varphi(P, x_1, x_2, \ldots, x_r)$ be a (positive in $P$) first-order formula⁴ of the original
signature plus a new $r$-ary predicate symbol $P$, with $r$ free variables $x_1, x_2, \ldots, x_r$.
Then $LFP_r(\varphi(P, x_1, x_2, \ldots, x_r))$ is a formula of $LFP_r$ with $r$ free variables
$x_1, x_2, \ldots, x_r$, whose semantics is as follows.
Substitute the empty predicate for $P$ in $\varphi(P, x_1, x_2, \ldots, x_r)$. Let $P_1$ be the set
of all $r$-tuples $a$ of values of the domain such that $\varphi(\emptyset, a)$.
Similarly, define $P_{k+1}$ as the set of all $r$-tuples $a$ of values of the domain such
that $\varphi(P_k, a)$.
Notice that due to the positivity of $\varphi(P, x_1, x_2, \ldots, x_r)$ in $P$, $P_k \subseteq P_{k+1}$.
Finally, define $LFP_r(\varphi(P, x_1, x_2, \ldots, x_r))$ as $\bigcup_{i} P_i$. Notice that, unlike in the
general case, in finite models the process stabilizes in a finite number of steps.
We define $LFP_r$ by allowing conjunctions and disjunctions of $LFP_r$ formulas
with first-order formulas, negations of $LFP_r$ formulas, and allowing the use of
quantifiers. However, any $LFP_r$ formula is going to contain no more than one
fixpoint⁵.
Define LFP as the union of $LFP_r$ for all $r$'s.
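Read operationally, Definition 5 is the usual stage-wise fixpoint iteration. The sketch below is ours: it represents $\varphi$ semantically as a Boolean-valued function of the current stage and a tuple, iterates the stages $P_k$ until they stabilize, and the transitive-closure example is only illustrative. Note that the loop makes at most $|{\rm universe}|^r$ passes and each pass checks $|{\rm universe}|^r$ tuples, which is the counting used in the proof of Theorem 6 below.

```python
from itertools import product

def lfp(phi, universe, arity):
    """Naive least-fixpoint iteration: start with the empty r-ary predicate and
    apply the (assumed P-positive) formula until the stage stabilizes.
    phi(P, a) -> bool takes the current stage P (a set of r-tuples) and a tuple a."""
    P = set()
    while True:
        new = {a for a in product(universe, repeat=arity) if phi(P, a)}
        if new == P:          # stabilizes after at most |universe|**arity stages
            return P
        P = new

# Toy use: transitive closure of an edge relation E as the LFP of
#   phi(P, (x, y)) := E(x, y) or exists z (E(x, z) and P(z, y)).
E = {(0, 1), (1, 2), (2, 3)}
universe = range(4)
phi = lambda P, a: a in E or any((a[0], z) in E and (z, a[1]) in P for z in universe)
print(sorted(lfp(phi, universe, 2)))
# [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```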

Theorem 6. If the hierarchy $LFP_i$ collapses in the class $\mathcal{S}$, then for any signa-
ture that contains one binary predicate symbol, and for any $k$, there exists an FO
formula $\varphi_k$ in the signature that defines a global predicate not in DTIME($n^k$).

⁴ Often, definitions of LFP use arbitrary LFP formulas in fixpoint operators; however,
this does not increase the expressive power of the logical languages (see [16]).
⁵ As before, this restriction doesn't actually restrict the expressive power.

PROOF: Suppose the contrary, that is, that $LFP_i$ collapses, say, at level $l$,
but for any signature $\sigma$ there exists an $m_\sigma$ such that any FO formula $\varphi$ defines
a global predicate of time complexity DTIME($n^{m_\sigma}$).
Let $\delta$ be an extension of the signature of $\mathcal{S}$ by a new predicate symbol of
arity $l$ and $l$ constant symbols. It is now easy to see that the time complexity of
any fixpoint operator of dimension $l$ (or less) in the class $\mathcal{S}$ is DTIME($n^{2l+m_\delta}$).
Indeed, to compute the predicate defined by the fixpoint operator, we will need
to make at most $n^l$ iterations, and at each iteration, we will need to check $n^l$
$l$-tuples, and each such check will take (by assumption) $n^{m_\delta}$ steps.
Because the fixpoint operators define predicates of arity $l$ (or less), the time
complexity of checking any $LFP_l$ (and thus any LFP) sentence in the class $\mathcal{S}$ is
then DTIME($n^{2l+2m_\delta}$).
But this contradicts the well-known fact ([11, 20]) that in this situation
LFP is capable of defining any $\mathcal{S}$-global predicate in P.
Now this proves that if the hierarchy of $LFP_i$ collapses, then for a certain
signature $\delta$, FO is in DTIME($n^m$) for no $m$.
However, models of this signature $\delta$ can be translated into models of one
binary relation (see e.g. [9]), and the way they are translated, the models grow
in size only polynomially.
Hence, the theorem. Q.E.D.

2 On a normal form for ID

Definition 7 (Implicit definability (ID)). Let $\mathcal{K}$ be a class of finite models of
some signature, and let $\varphi(P)$ and $\psi(P, x)$ be, respectively, a first-order sentence
and a first-order formula of this signature enriched with a new $r$-ary predicate
symbol $P$.
It may happen that for any $M \in \mathcal{K}$ there exists a unique $r$-ary predicate
$P_M \subseteq |M|^r$ that makes the sentence $\varphi(P)$ true. Since this is a unique value of
the predicate, we have a $\mathcal{K}$-global predicate thus defined.
Then consider the $\mathcal{K}$-global predicate defined by $\psi(P, x)$ in the models of the class
$\mathcal{K}$, with $P$ interpreted by the values of this implicitly defined predicate.
This new predicate (whose arity is the number of variables in $x$) is said to
be implicitly defined by $\varphi(P)$ and $\psi(P, x)$ in the class $\mathcal{K}$.
The class of implicitly definable global predicates is denoted ID.
If $\varphi(P)$ is a universal sentence, and $\psi(P, x)$ is an existential formula, then
the above global predicate is said to be in the class $ID^r_{\forall\exists}$, with $ID_{\forall\exists}$ being the
union of $ID^r_{\forall\exists}$ for all $r$'s.

The following theorem is from [13]⁶.

⁶ Note that the definition of implicit definability in [13], although different in ap-
pearance, defines the same class of global predicates.

Theorem 8 (Kolaitis). An $\mathcal{S}$-global predicate is in UP $\cap$ coUP iff it is in ID.

Theorem 9. An $\mathcal{S}$-global predicate is in UP $\cap$ coUP iff it is in $ID_{\forall\exists}$.


PROOF: ($\Leftarrow$) follows from Theorem 8.
($\Rightarrow$). Take an $\mathcal{S}$-global predicate in UP $\cap$ coUP. This predicate is unambigu-
ously recognized by a non-deterministic Turing machine $M^+$, and its complement
is unambiguously recognized by a non-deterministic Turing machine $M^-$. Both
machines are in NP; hence, there exists a $k$ such that both these machines are
of time complexity UNTIME($n^k$). Suppose that $r$ is the arity of this predicate.
The formula that we use is going to implicitly define a new predicate of arity
$2k + 1 + r$. Essentially, this predicate will contain a full history of computation by
one of the machines ($M^+$ or $M^-$). $k$ coordinates are used to number moments
of time, from 1 to $n^k$, and the next $k$ coordinates number positions of the tape
that the machine uses. The additional coordinate deals with internal states and
positions of the heads on the tapes. Once the machine has stopped, the remaining
time positions remain the same.
The formula can easily check that any two adjacent moments of time
are connected according to the transitions of the Turing machine. Notice
that we need only universal quantifiers ($k$ of them) to implement the above "for
any". Since we have 0 and LAST and the successor $'$, we won't need any additional
quantifiers in writing the conditions (this technique has been known since [9]).
As well as that, we can write that at the last moment of time the state is
accepting, while at the first moment of time the tape contains the code of our
structure and the tuple that we are going to check. Q.E.D.
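For concreteness, the bookkeeping by which $k$ coordinates over an $n$-element ordered universe number the $n^k$ time steps (and, likewise, the tape cells) is just base-$n$ positional notation; a minimal sketch of ours:

```python
def tuple_to_index(t, n):
    """Read a k-tuple over a linearly ordered n-element universe as a base-n numeral;
    this is how k coordinates number the n^k time steps (or tape cells)."""
    i = 0
    for c in t:
        i = i * n + c
    return i

def index_to_tuple(i, n, k):
    """Inverse direction: the k-tuple of base-n digits of i, most significant first."""
    t = []
    for _ in range(k):
        i, r = divmod(i, n)
        t.append(r)
    return tuple(reversed(t))

n, k = 5, 3
assert all(tuple_to_index(index_to_tuple(i, n, k), n) == i for i in range(n ** k))
```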

Theorem 10. Any $\mathcal{S}$-global ID predicate is in $ID_{\forall\exists}$.

PROOF: follows from Theorems 8 and 9. Q.E.D.
A global predicate is said to be in $PT_r$ iff it is in $ID^r_{\forall\exists}$ and at the same
time in the complexity class P.

Corollary 11. For any $k$ there exists a $PT_{2k+3}$ $\mathcal{S}$-global predicate that is not in
DTIME($n^k$).

PROOF: By the time hierarchy theorem ([10]; see also [1, Exercise 11.8]), there
exists a deterministic Turing machine in DTIME($n^{k+1}$) \ DTIME($n^k$). This ma-
chine accepts a certain language and can be thought of as defining a 0-ary global
predicate (essentially, a subset of $\mathcal{S}$).
The global predicate that this machine computes can be defined, according
to the proof of Theorem 9, by an $ID_{\forall\exists}$ formula that implicitly defines a new
predicate of arity $2(k + 1) + 1 = 2k + 3$.
By its very construction, this predicate is in $PT_{2k+3}$, which proves the corol-
lary. Q.E.D.

References

1. Aho, Alfred V., John E. Hopcroft, and Jeffrey D. Ullman, "The design and analy-
sis of computer algorithms", Addison-Wesley Publ. Co., 1974, 1-470.
2. Aho, Alfred V., and Jeffrey D. Ullman, Universality of data retrieval languages,
in: "Proceedings of the 6th ACM Syrup. on Principles of Programming Languages
(POPL)", 1979, 110-117.
3. Behmann, Heinrich, Beiträge zur Algebra der Logik, insbesondere zum Entschei-
dungsproblem, Math. Ann., 86, 1922, 163-229.
4. Chandra, Ashok K., and David Harel, Structure and complexity of relational
queries, J. Comput. Syst. Sci., 25, 1980, 156-178.
5. Grohe, Martin, Bounded-arity hierarchies in fixed-point logics, in: "Proceedings
of CSL '93 (1993 Annual Conference of the European Association for Computer
Science Logic, Swansea, UK, 13-17 Sept. 1993)" (Börger, E.; Gurevich, Y.; Meinke,
K., Eds.), Springer-Verlag: Berlin, 1994, 150-164.
6. Gurevich, Yuri, Algebras of feasible functions, in: "24th Symp. on Foundations of
Computer Science (FOCS)", IEEE Computer Society Press, 1983, 210-214.
7. Gurevich, Yuri, Logic and the challenge of computer science, in: "Current Trends
in Theoretical Computer Science" (Egon Börger, Ed.), Computer Science Press:
Rockville, Md., 1987, 1-57.
8. Enderton, Herbert B., "A mathematical introduction to logic", Academic Press:
New York, 1972.
9. Ershov, Yuri L., Igor A. Lavrov, Asan D. Taimanov, and Michael A. Taitslin, Ele-
mentary theories, Russian Mathematical Surveys, 20:4, 1965, 35-105.
10. Hennie, F.G., and R.E. Stearns, Two tape simulation of multitape machines,
J. ACM, 13:4, 533-546.
11. Immerman, Neil, Relational queries computable in polynomial time, in: "14th
ACM Symp. on Theory of Computing (STOC)", ACM, 1982, 147-152.
12. Immerman, Neil, Languages that capture complexity classes, SIAM J. Computing,
16, 1987, 760-778.
13. Kolaitis, Phokion G., Implicit definability on finite structures and unambiguous
computations, in: "5th IEEE Annu. Symp. on Logic in Computer Science (LICS)",
IEEE Computer Society Press: Los Alamitos, CA, 1990, 168-180.
14. Livchak, Alexander B., The relational model for process control, in: "Automatic
Documentation and Mathematical Linguistics 4", Moscow, 1983, 27-29.
15. Maltsev, Anatoly I., Regular products of models, Izv. Akad. Nauk SSSR,
Ser. Mat., 23, 1959, 489-502.
16. Moschovakis, Yiannis N., "Elementary induction on abstract structures", North-
Holland/Elsevier: Amsterdam/New York, 1974, 218pp.
17. Sazonov, Vladimir Yu., Polynomial computability and recursivity in finite domains,
Elektronische Informationsverarbeitung und Kybernetik, 16, 1980, 319-323.
18. Stolboushkin, Alexei P., and Michael A. Taitslin, Dynamic logics, in: "Cybernetics
and Computing Technology" (V.A. Mel'nikov, Ed.), Moscow: Nauka, 1986, 180-
230.
19. Valiant, L., Relative complexity of checking and evaluating, Information Process-
ing Letters, 5, 1976, 20-23.
20. Vardi, Moshe Y., Complexity of relational query languages, in: "14th ACM Symp.
on Theory of Computing", ACM, 1982, 137-146.
Logic Programming in Tau Categories
Stacy E. Finkelstein                    Peter Freyd
McGill University                       University of Pennsylvania
sef@triples.math.mcgill.ca              pjf@saul.cis.upenn.edu

James Lipton
Wesleyan University
lipton@allegory.cs.wesleyan.edu

Abstract
Many features of current logic programming languages are not captured by
conventional semantics. Their fundamentally non-ground character, and the
uniform way in which such languages have been extended to typed domains sub-
ject to constraints, suggest that a categorical treatment of constraint domains,
of programming syntax and of semantics may be closer in spirit to declarative
programming than conventional set theoretic semantics.
We generalize the notion of a (many-sorted) logic program and of a resolution
proof by defining them both over a (not necessarily free) τ-category, a category
with products enriched with a mechanism for canonically manipulating n-ary
relations. Computing over this domain includes computing over the Herbrand
Universe, and over equationally presented constraint domains as special cases.
We give a categorical treatment of the fix-point semantics of Kowalski and van
Emden, which establishes completeness in a very general setting.

1 Introduction

Tau categories are categories enriched with a mechanism for canonically manipulating
n-ary relations. This τ-structure creates, through canonical limits, an ideal framework
for formalizing an abstract syntax for logic programming with constraints. By defining
the notions of resolution and unification over such a category, one is able to capture
a quite general notion of constraint logic programming and establish a completeness
theorem, based on a non-ground categorical generalization of fixed point semantics.
Connections between categories and logic programming were used to describe uni-
fication in [18]. A. Corradini and U. Montanari [5] focus on the abstract computations
of logic programs and the view of logic programs as structured transition systems.
A. Corradini and A. Asperti [4] build on [5] and define a model for a logic program
as a family of monoidal categories indexed by sets of variables. In both works, the
category theoretic setting is semantical and is not used to describe the resolution pro-
cess. Asperti and Martini [2] give a categorical treatment of the syntactic mechanisms
of resolution along with the interpretation of logic programs based on the concept of
using projections as variables [14], and was a starting point for our work. The categor-
ical framework is here extended to cover more general notions of programming syntax
and resolution and to produce a cleaner semantics that exploits and strengthens the
fixed-point approach of Kowalski and van Emden [12].

Categorical interpretations are often limited by the fact that the main constructs
such as products and pullbacks are only defined up to isomorphism, forcing arbitrary
choices of representatives of a given isomorphism class. τ-categories, introduced by
Peter Freyd [7], are a uniform setting in which to work with n-ary relations and thus
describe logic programs. They are finite limit categories enriched with τ-structure,
a distinguished class of diagrams which provides canonical choices for limits, and
guarantees associative equality and a strict unit for the product.
We begin by describing a categorical syntax for logic programs in an arbitrary
τ-category as in [6].

2 τ-categories

We begin with an outline of the theory of τ-categories, described in detail in [7]. We
depart slightly from the cited reference in that in this paper, τ-categories need only
have all finite products, not all finite limits. The τ-structure guarantees that when
finite limits do exist they are canonical. Also note that consideration of this class of
categories is not overly restrictive, since every small finite limit category is equivalent
to a τ-category ([7]).

The relevant definitions (of table, short column, composition and τ-categories)
are given in Table 1. An isomorphism class of tables is called a relation. Any table
is equivalent to a monic composed with a product table. In fact, this is a strong
equivalence with regard to the τ-structure, as a τ-monic composed with a product
τ-table is exactly its equivalent τ-table.

Axiom 1 of the definition of a τ-category says that the class τ is a set of repre-
sentatives for each isomorphism class of tables. It is this property which allows for a
canonical choice of limits. We say that the product diagram $\langle A \times B; p_1, p_2\rangle$ is canoni-
cal if it is a τ-table. Similarly, a pullback diagram is canonical if $\langle P; p_1, p_2\rangle \in \tau$ (where
$p_1 : P \to A$ and $p_2 : P \to B$ are the pullback arrows of $f : A \to C$ and $g : B \to C$ re-
spectively). These axioms also imply that the category has a unique terminal object.
Canonical limits in τ-categories enjoy several important properties ([7]). Canonical
products are strictly associative. The terminator is strictly a two-sided unit. Also, if
each of two horizontally adjacent pullback squares is a canonical pullback then so is
the outer pullback rectangle they describe.

3 Logical Structure and Interpretation

The information contained in this section describes a τ-categorical syntax for logic
programming and is given in greater detail in the dissertation of Stacy E. Finkelstein
[6].
The basic translation of the first-order language $\mathcal{L}$ into the τ-category follows the
usual interpretation as in [14], slightly augmented with the canonical choices allowed
by the τ-structure. The relevant definitions are given in Table 2. We first define
an interpretation of terms, inductively, as arrows in $\mathcal{A}^\tau$, relative to a sequence of
distinct variables containing all of the free variables in the term. We then extend this
interpretation to predicates.

Table 1: Tables and τ-categories

Definition 2.1 A table is an object $T$ with a monic finite sequence of morphisms $x_1, \ldots, x_n$
with $T$ as a common source. We will denote such a table by $\langle T; x_1, \ldots, x_n\rangle$.

Definition 2.2 Given a table $\langle T; x_1, \ldots, x_n\rangle$ we say that $x_j$ is a short column if it does not
contribute to the monic character of the table, in other words if for every $f, g : X \to T$ such
that $f x_j \ne g x_j$ there exists $i < j$ such that $f x_i \ne g x_i$.

If $x_j$ is a short column, then $\langle T; x_1, \ldots, \hat{x}_j, \ldots, x_n\rangle$ will denote the table obtained by removing
the arrow $x_j$ from the sequence $x_1, \ldots, x_n$.

Definition 2.3 Given tables $\langle T; x_1, \ldots, x_m\rangle$ and $\langle T'; y_1, \ldots, y_n\rangle$ where $x_j : T \to T'$, we define
their composition at $j$ as
$$\langle T; x_1, \ldots, x_m\rangle \circ_j \langle T'; y_1, \ldots, y_n\rangle = \langle T; x_1, \ldots, x_{j-1}, x_j y_1, \ldots, x_j y_n, x_{j+1}, \ldots, x_m\rangle.$$

Definition 2.4 A τ-category is a category with all finite limits and a distinguished class of
tables, denoted $\tau$, such that
1. Every table is isomorphic to a unique table in $\tau$.
2. $\langle T; 1_T\rangle \in \tau$, for all $T$.
3. If $\langle T; x_1, \ldots, x_m\rangle \in \tau$ and $\langle T'; y_1, \ldots, y_n\rangle \in \tau$ and $x_j : T \to T'$, then
$\langle T; x_1, \ldots, x_m\rangle \circ_j \langle T'; y_1, \ldots, y_n\rangle \in \tau$.
4. If $\langle T; x_1, \ldots, x_n\rangle \in \tau$ and $x_j$ is a short column, then $\langle T; x_1, \ldots, \hat{x}_j, \ldots, x_n\rangle \in \tau$.

In the table, we will let $\mathcal{L}$ be a first-order language, $\bar{x} = (x_1, \ldots, x_m)$ be a se-
quence of distinct variables of sorts $\rho_1, \rho_2, \ldots, \rho_m$, and let $M(\bar{x})$ denote the canonical
product $M(\rho_1) \times \cdots \times M(\rho_m)$ in $\mathcal{A}^\tau$. It is important to note in the definition of
$M_{\bar{x}}(P t_1 \cdots t_n)$ that not only is the arrow $p$ a monic arrow, by a property of pull-
backs, but it is also a τ-monic arrow since the second leg of this canonical pullback
is short.

3.1 Substitution and Unification in $\mathcal{A}^\tau$

Let $\theta = \{x_1 := t_1, \ldots, x_n := t_n\}$ be a substitution with the $x_i$'s all distinct. In
addition, let $\bar{y} = (y_1, \ldots, y_m)$ be a sequence of distinct variables containing all of the
free variables of the terms $t_i$.
Then we may define a categorical substitution $\Theta_{\bar{y}}$ in $\mathcal{A}^\tau$ to be the morphism
$\langle M_{\bar{y}}(t_1), \ldots, M_{\bar{y}}(t_n)\rangle : M(\bar{y}) \to M(\bar{x})$.
One may prove that for any term $s$, $M_{\bar{y}}(s\theta) = M_{\bar{x}}(s) \circ \Theta_{\bar{y}}$, and thus that a
categorical substitution may be applied to a term in this structure via ordinary
composition of arrows in $\mathcal{A}^\tau$. Additionally, one defines the application of a sub-
stitution to an atomic formula $\varphi$ to be the arrow $q$ which is the canonical pull-
back of $M_{\bar{x}}(\varphi) \rightarrowtail M(\bar{x})$ along the substitution arrow $\Theta_{\bar{y}}$. For non-atomic formulae
$\varphi := \bigwedge_{i=1}^n A_i$, one may extend the definition for atomic formulae by taking canonical
pullbacks along each τ-monic of $\varphi$. As in the definition of $M_{\bar{x}}(P t_1 \cdots t_m)$, $q$ will auto-
matically be a τ-monic arrow in $\mathcal{A}^\tau$. Additionally, one may show for $\varphi := P u_1 \cdots u_k$,
$M_{\bar{y}}(\varphi\theta) = M_{\bar{y}}(P(u_1\theta) \cdots (u_k\theta))$.

Table 2: Term structure in τ-categories

Definition 3.1 For any τ-category $\mathcal{A}^\tau$, an $\mathcal{A}^\tau$-structure $M$ on the language $\mathcal{L}$ is a function
$M : \mathcal{L} \to \mathcal{A}^\tau$ such that for every sort $\sigma$ in $\mathcal{L}$, $M(\sigma)$ is an object of $\mathcal{A}^\tau$, and for every operation
symbol $f$ in $\mathcal{L}$ of arity $n$, input sorts $\sigma_1, \sigma_2, \ldots, \sigma_n$, and of sort $\rho$, $M(f) : M(\bar{\sigma}) \to M(\rho)$ is a
morphism in $\mathcal{A}^\tau$, where $M(\bar{\sigma})$ denotes the canonical product $M(\sigma_1) \times \cdots \times M(\sigma_n)$. (Constants
will be thought of as functions of arity 0, and so for each constant $c$ of sort $\rho$, $M(c) : 1 \to M(\rho)$
is a morphism in $\mathcal{A}^\tau$.)

Definition 3.2 For a term $t$ of sort $\rho$, having all of its free variables among $\bar{x}$, $M_{\bar{x}}(t)$ is defined
to be a morphism $M(\bar{x}) \to M(\rho)$ in $\mathcal{A}^\tau$ as follows:
$t := x_i$ — $M_{\bar{x}}(x_i)$ is defined to be the canonical projection $M(\bar{x}) \to M(\rho_i)$.
$t := f t_1 \cdots t_m$ — if each $t_i$ is of sort $\sigma_i$, then $M_{\bar{x}}(f t_1 \cdots t_m)$ is defined as the composition
$M(f) \circ \langle M_{\bar{x}}(t_1), \ldots, M_{\bar{x}}(t_m)\rangle$.

Predicates

Definition 3.3 For every predicate symbol $R$ in $\mathcal{L}$ with arity $n$ and sorts $\sigma_1, \sigma_2, \ldots, \sigma_n$, define
$M(R)$ to be a τ-monic $M(R) \rightarrowtail M(\bar{\sigma})$. For a formula $\varphi$ with all its free variables among $\bar{x}$,
$M_{\bar{x}}(\varphi)$ is defined as follows:
$\varphi := P t_1 \cdots t_m$ — if each $t_i$ is of sort $\sigma_i$, then $M_{\bar{x}}(P t_1 \cdots t_m)$ is defined to be the
τ-arrow $p$ which is the canonical pullback of $M(P) \rightarrowtail M(\bar{\sigma})$ along the arrow
$\langle M_{\bar{x}}(t_1), \ldots, M_{\bar{x}}(t_m)\rangle$.
$\varphi := \bigwedge_{i=1}^{n} A_i$ — if each $A_i$ is an atom which has been interpreted as described above, then $M_{\bar{x}}(\varphi)$
is defined to be the tuple of τ-monic arrows $M_{\bar{x}}(A_i) \rightarrowtail M(\bar{x})$.

In a logic programming language, the unifiability of formulae depends on the
equality of the predicate symbols and the existence of a unifier, a substitution which
unifies the terms appearing in the formulae. In a logic programming τ-category, the
unifiability of formulae depends on the equality of the interpretation of the predicate
symbol and the existence of a pullback for the terms. It is this ability to define
formulae in terms of different lists of variables and then use a pullback to unify them
that removes the need for any renaming of variables to avoid clashes.

Definition 3.4 Let $\varphi := P t_1 \cdots t_n$ and $\psi := R s_1 \cdots s_m$ (interpreted relative to
variable lists $\bar{x}$ and $\bar{z}$ respectively). $\varphi$ and $\psi$ are said to be
unifiable if the τ-monic arrows $M(P) \rightarrowtail M(\bar{\sigma})$ and $M(R) \rightarrowtail M(\bar{\rho})$ are equal and
there exists a pair of substitution arrows $(\Theta_{\bar{y}}, \Phi_{\bar{y}})$ forming a commutative square,
i.e., such that $\langle M_{\bar{z}}(s_1), \ldots, M_{\bar{z}}(s_m)\rangle \circ \Phi_{\bar{y}} = \langle M_{\bar{x}}(t_1), \ldots, M_{\bar{x}}(t_n)\rangle \circ \Theta_{\bar{y}}$. Unifiers
$(\Theta_{\bar{y}}, \Phi_{\bar{y}})$ are most general unifiers (mgu's) iff the commutative square described
in Definition 3.4 is a pullback.

In this case the pair $(\Theta_{\bar{y}}, \Phi_{\bar{y}})$ are said to unify the formulae $\varphi$ and $\psi$. Note that
the first condition implies in particular that $n = m$, $M(\bar{\sigma}) = M(\bar{\rho})$, and $M(P) = M(R)$.
Also note that the occur check is internal to this setting. Since all interpretations are
made with respect to a list of variables and, in particular, the substitution arrows
are made with respect to an ostensibly different list of variables than the terms to
be unified, there can be no substitution of a variable with a term containing that
variable unless a separate diagram specifies these to be equal. In many constraint
domains (e.g. the set of closed terms in the lambda calculus), terms may be unifiable
without having mgu's. But if unifiers exist in a category, it is straightforward to add
the appropriate pullback squares and form a new τ-category in which mgu's exist.
This is discussed in the last section of the paper.
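For comparison with the pullback formulation of Definition 3.4, the ordinary syntactic computation of a most general unifier (with the occur check made explicit) looks as follows. The sketch is ours, works over the free Herbrand term algebra only, uses an ad hoc term representation, and is merely the special case that the categorical treatment generalizes.

```python
# Terms: variables are strings; compound terms and constants are tuples,
# e.g. ("f", "X", ("g", ("a",))) for f(X, g(a)), with ("a",) a constant.

def is_var(t):
    return isinstance(t, str)

def walk(t, subst):
    """Chase variable bindings in a (triangular) substitution."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    t = walk(t, subst)
    if is_var(t):
        return v == t
    return any(occurs(v, a, subst) for a in t[1:])

def unify(s, t, subst=None):
    """Most general unifier of two terms, or None if they do not unify."""
    subst = dict(subst or {})
    stack = [(s, t)]
    while stack:
        a, b = stack.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a):
            if occurs(a, b, subst):          # occur check
                return None
            subst[a] = b
        elif is_var(b):
            stack.append((b, a))
        elif a[0] == b[0] and len(a) == len(b):
            stack.extend(zip(a[1:], b[1:]))
        else:
            return None
    return subst

print(unify(("f", "X", ("g", ("a",))), ("f", ("g", "Y"), "X")))
# {'X': ('g', ('a',)), 'Y': ('a',)}
```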

3.2 Resolution

Definition 3.5 Let $C$ be the definite clause $A_1, A_2, \ldots, A_n \vdash B$ (for $n \ge 0$). Then a
definite clause diagram, $M_{\bar{x}}(C)$, is the $(n+1)$-tuple of τ-monic arrows
$$M_{\bar{x}}(A_1), \ldots, M_{\bar{x}}(A_n), M_{\bar{x}}(B) \rightarrowtail M(\bar{x}).$$
A definite τ-logic program $P^\tau$ is a finite set of definite clause diagrams $M_{\bar{x}}(C)$.
Let $G$ be the definite goal $B_1 \wedge B_2 \wedge \cdots \wedge B_m$ (for $m \ge 1$). Then a definite goal
diagram, $M_{\bar{x}}(G)$, will be an $m$-tuple of τ-monic arrows $M_{\bar{x}}(B_i) \rightarrowtail M(\bar{x})$ to the
base $M(\bar{x})$.

Definition 3.6 Let $M_{\bar{x}}(G)$ be a definite goal diagram, and $M_{\bar{z}}(C)$ be a definite
clause diagram. Then a new definite goal diagram $M_{\bar{y}}(G')$ is derived from $M_{\bar{x}}(G)$
and $M_{\bar{z}}(C)$ using $(\Theta_{\bar{y}}, \Phi_{\bar{y}})$ if the following conditions hold:

1. $M_{\bar{x}}(B_i) \rightarrowtail M(\bar{x})$ is the selected τ-monic arrow of $M_{\bar{x}}(G)$.

2. $(\Theta_{\bar{y}}, \Phi_{\bar{y}})$ is an mgu of $M_{\bar{x}}(B_i) \rightarrowtail M(\bar{x})$ and $M_{\bar{z}}(B) \rightarrowtail M(\bar{z})$.

3. $M_{\bar{y}}(G')$ is the definite goal diagram over the base $M(\bar{y})$ [diagram],
where recall that the τ-monic $M_{\bar{y}}(B_j\Theta) \rightarrowtail M(\bar{y})$ is defined as a substitution pullback
of $M_{\bar{x}}(B_j) \rightarrowtail M(\bar{x})$ against $\Theta_{\bar{y}}$.

Definition 3.7 Let $\mathcal{A}^\tau$ be a τ-category, $P^\tau$ a definite τ-logic program and $M_{\bar{x}}(G)$
a definite goal diagram. A τSLD-derivation of $P^\tau \vdash_{\mathcal{A}^\tau} M_{\bar{x}}(G)$ consists of a se-
quence (finite or infinite) $M_{\bar{x}}(G), M_{\bar{x}'}(G_1), M_{\bar{x}''}(G_2), \ldots$ of definite goal diagrams,
a sequence $M_{\bar{z}_1}(C_1), M_{\bar{z}_2}(C_2), \ldots$ of definite program clauses of $P^\tau$, and a sequence
$(\Theta^1, \Phi^1), (\Theta^2, \Phi^2), \ldots$ of most general unifiers such that each $M_{\bar{x}}(G_{i+1})$ is derived
from $M_{\bar{x}}(G_i)$ and $M_{\bar{z}}(C_{i+1})$ using $(\Theta^{i+1}, \Phi^{i+1})$. A τSLD-proof of $\{M_{\bar{x}}(G)\}$ from
$P^\tau$ is a finite τSLD-derivation of $P^\tau \vdash_{\mathcal{A}^\tau} M_{\bar{x}}(G)$ which has the empty diagram as the
last definite goal diagram in the derivation. A τ-computed answer for $P^\tau \vdash_{\mathcal{A}^\tau} M_{\bar{x}}(G)$
is the substitution arrow $\Theta = \Theta^1 \circ \Theta^2 \circ \cdots \circ \Theta^r$, where $(\Theta^1, \Phi^1), \ldots, (\Theta^r, \Phi^r)$ is the
sequence of most general unifiers used in a τSLD-proof of $\{M_{\bar{x}}(G)\}$ from $P^\tau$.
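The control structure of Definitions 3.6 and 3.7 — select an arrow of the goal, resolve it against a clause, and continue until the goal diagram is empty — is already visible in the propositional special case, where the mgu step degenerates to symbol equality. The sketch below is ours and is meant only to expose that control structure.

```python
from collections import deque

def sld_prove(program, goal):
    """Propositional SLD resolution: `program` is a list of (head, body) definite
    clauses, `goal` a list of atoms.  Returns True iff the empty goal is derivable.
    (In the paper the atoms carry terms and each step also computes an mgu; here
    that part is trivialized.  Breadth-first search; may run forever on programs
    whose goals grow without bound and have no proof.)"""
    frontier = deque([tuple(goal)])
    seen = set()
    while frontier:
        g = frontier.popleft()
        if not g:                      # empty goal diagram: success
            return True
        if g in seen:
            continue
        seen.add(g)
        selected, rest = g[0], g[1:]   # select the leftmost atom
        for head, body in program:
            if head == selected:       # "unification" is just equality here
                frontier.append(tuple(body) + rest)
    return False

P = [("q", []), ("r", ["q"]), ("p", ["q", "r"])]
print(sld_prove(P, ["p"]))   # True
```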

3.3 Models

Definition 3.8 An interpretation for a logic programming structure $\mathcal{A}^\tau$ is a carte-
sian category $\mathcal{D}$ with an associated cartesian functor $H : \mathcal{A}^\tau \to \mathcal{D}$.

In the following section we shall only be considering semantic categories $\mathcal{D}$ which
have the syntactic category as a subcategory. In this case, the functor $H$ will be the
obvious inclusion functor which preserves the logical structure.

Definition 3.9 The notion of validity in $\mathcal{D}$ (under $H$) for a (possibly open) formula
$\varphi$ is defined as follows:

1. A conjunction $M_{\bar{x}}(\varphi)$ is valid in $\mathcal{D}$ iff $H[l]$, where $l : Lim \to M(\bar{x})$ is the canon-
ical limit of the diagram $M_{\bar{x}}(\varphi)$, is an isomorphism.
2. A definite clause diagram $M_{\bar{x}}(A_1, A_2, \ldots, A_n \vdash B)$ is valid in $\mathcal{D}$ (under $H$) iff
there exists a monic arrow $m : H[Lim] \rightarrowtail H[M_{\bar{x}}(B)]$ commuting with the inclusions
into $H[M(\bar{x})]$, where $l$ is the limit of the diagram $M_{\bar{x}}(\psi)$ for $\psi = A_1 \wedge A_2 \wedge \cdots \wedge A_n$.
3. A logic program $P^\tau$ is valid in $\mathcal{D}$ iff every clause of $P^\tau$ is valid in $\mathcal{D}$. In this case
we say that $\mathcal{D}$ (under $H$) is a model of $P^\tau$.

Validity of an atomic formula means it is interpreted as an isomorphism, which in
the case of an open formula implies validity of its universal closure.

Theorem 3.10 (Soundness of τSLD-Resolution) Let $P^\tau$ be a definite τ-logic
program and $M_{\bar{x}}(G)$ a definite goal diagram. If $\Theta$ is a τ-computed answer for
$P^\tau \vdash_{\mathcal{A}^\tau} M_{\bar{x}}(G)$, then it is the case that $M_{\bar{y}}(G\Theta)$ is valid in all models of $P^\tau$.

4 Completeness of τSLD-Resolution

Let $\mathcal{L}$ be the language of the Horn-clause program $P$ and $\mathcal{R}_n = \{R_1, \ldots, R_n\}$ the
set of relation symbols in the language, of sorts $\bar{\sigma}_1, \ldots, \bar{\sigma}_n$. In order to establish
completeness we begin with a category $\mathcal{C}$ with terminator and products, in which all
sorts, constants and function symbols of $\mathcal{L}$ have been interpreted by a $\mathcal{C}$-structure
$M$. On this base category of constraints we create a syntactic category into which we
translate logic programs and over which SLD resolution will take place. The syntactic
category (previously denoted $\mathcal{A}^\tau$ in preceding sections) is obtained by freely adjoining
indeterminate monics $X_i \rightarrowtail B$ to $\mathcal{C}$, so as to associate to each predicate in $\mathcal{R}_n$ an
indeterminate monic, through which no arrows from $\mathcal{C}$ representing terms will factor.

In general, for any category $\mathcal{C}$ with terminator and products, and any object $B$ in
$\mathcal{C}$, the category $\mathcal{C}_B$ (also called $\mathcal{C}[X_i]$ below) obtained by adjoining an indeterminate
subobject $X_i \rightarrowtail B$ is defined as follows. The objects of $\mathcal{C}_B$ are pairs $(A, s)$ where
$A$ is an object of $\mathcal{C}$ and $s$ is a finite set of morphisms $A \to B$ of $\mathcal{C}$. We will let the
morphisms $(A, s) \to (A', s')$ in $\mathcal{C}_B$ be those morphisms $f : A \to A'$ of $\mathcal{C}$ such that for
every morphism $g : A' \to B$ in $s'$ it is the case that $f; g : A \to B$ is in $s$.

$\mathcal{C}_B$ has a terminal object and products for every pair of objects. It is also a
τ-category with τ-structure inherited from $\mathcal{C}$. In addition, the embedding $\mathcal{C} \hookrightarrow \mathcal{C}_B$
which sends $f : A \to A'$ in $\mathcal{C}$ to $f : (A, \emptyset) \to (A', \emptyset)$ in $\mathcal{C}_B$ is full and faithful
and preserves existing limits. The category $\mathcal{C}_B$ also has a generic subobject $id_C :
(B, \{id\}) \rightarrowtail (B, \emptyset)$. This is our indeterminate monic $X \rightarrowtail \iota(B)$, and it is important
to note that all pullbacks of this monic along arrows $t$ from $\mathcal{C}$ (so-called "term arrows")
exist in $\mathcal{C}_B$ and are of the form $(A, \{t\}) \xrightarrow{id_C} (A, \emptyset)$, which cannot be an isomorphism
in $\mathcal{C}_B$.

To emphasize the fact that we are interpreting predicate letters as indeterminate
monics, we use the notation $\mathcal{C}[X] = \mathcal{C}[X_1, \ldots, X_n]$ for the category obtained from
$\mathcal{C}$ by freely adjoining indeterminate monics $X_i \rightarrowtail M(\bar{\sigma}_i)$, one for each $R_i \in \mathcal{R}_n$. We
extend $M$ to an interpretation (also called $M$) of the predicates in $\mathcal{L}$ by defining
$M(R_i) = X_i$ (that is to say, $M(R_i)$ is the monic $X_i \rightarrowtail M(\bar{\sigma}_i)$). Such a category,
along with such a structure map, corresponds to the category $\mathcal{A}^\tau$ and structure map
(also denoted $M$) over which the proof theory has been discussed in previous sections.
We shall call the category $\mathcal{C}[X]$ with its structure map a logic programming category
(an LP-category). We now use the notation $P^\tau \vdash_{\mathcal{C}} G$ to mean there is a τSLD-proof of
$G$ from $P^\tau$ in the category $\mathcal{C}[X]$.
Our ambient semantic category will be the presheaf topos Ĉ given by the contravariant
Yoneda embedding from C (i.e. C ↪ Set^(C^op)). In Ĉ we are able to take unions and
images. Hence, in particular, any predicate defined by resolution over C is repre-
sentable as a monic in Ĉ. We will use this framework to define initial models for logic
programs over C, as subcategories of Ĉ.
We will need to make use of a few elementary facts about the Yoneda embedding,
whose proofs can be found in e.g. [9, 7]. The embedding shown above is full and
faithful. The objects B in C appear in Ĉ as representable functors Hom(_, B), al-
though we will often abuse notation and refer to them by their original names B. All
representable functors are coprime and projective. Thus, in particular, arrows from
representables into unions of functors in Ĉ must factor through one of them. Also, up
to isomorphism, we may take the subobject relation A ⊆ B between subobjects A, B
of some object C to be pointwise inclusion of functors. That is to say, given a chain
of subobjects of b in Ĉ such as the one on the left,

    E_i ↣ E_{i+1} ↣ ...          F_i ↣ F_{i+1} ↣ ...
        \   |                         \   |
          b                             b

there are isomorphic subobjects F_i ≅ E_i such that the diagram on the right com-
mutes, with all arrows the canonical inclusions induced by pointwise set-theoretic
containment of functors: for each object X in C, F_i(X) ⊆ F_{i+1}(X).
To show completeness, we shall develop a category-theoretic generalization of
the Kowalski-van Emden T_P operator ([12]). We will define an operator T_{P,C} on
semantic categories D which builds a new category from D generated by the clauses
of the program. By a process of iteration we construct a fixed point of T_{P,C}, an initial
model for the program from which we may show completeness.
Throughout, we will denote by (f)# : Sub(a) → Sub(b) the functor from the
lattice of subobjects of a in Ĉ to that of b, induced by pulling back along f : b → a,
and by ∃(f) its left adjoint in Ĉ. Notice that ∃(id)(g) is the image of g in Ĉ.

Definition 4.1 A semantic category is a full subcategory D of Ĉ which contains
C and which is equipped with an extended structure (denoted [[·]]^D) based on the struc-
ture already defined on C (previously denoted M). [[·]]^D will interpret sorts, constants
and function symbols as prescribed by the C-structure M, and will assign to each
predicate symbol a monic arrow of D.

Note that one may show, using a straightforward induction on terms, that for any
term t, [[t]]^D = [[t]]^Ĉ. Thus we will often drop the superscripted categories from our
notation when discussing interpretations of terms.
Given a program P, let us denote by cl a clause A1, A2, ..., An ⊢ B(t). We will then
mean by hd_cl the predicate symbol B (the head of cl), by tm_cl the term t (associated
with hd_cl), and by tl_cl the list A1, ..., An (the tail of cl). We then make the following
categorical notational definitions.

Definition 4.2 Let P^τ be a definite τ-logic program in a semantic category D with
extended structure [[·]]^D. Denote by [[cl]]^D the definite clause diagram associated
with A1, A2, ..., An ⊢ B(t). We define [[hd_cl]]^D to be [[B]]^D, [[tm_cl]]^D to be
[[t]]^D, and [[tl_cl]]^D to be the monic from the limit of the (definite goal) diagram
consisting of the fan of arrows [[A_i]]^D occurring in the clause diagram above.

Definition 4.3 If [[t]]^Ĉ : M(x⃗) → M(σ⃗) and [[r]]^Ĉ : M(y⃗) → M(σ⃗) are terms, then
[[t]]^Ĉ is said to factor through [[r]]^Ĉ in D if there exists an arrow θ in D such that
the arrows id and θ are pullbacks of the arrows [[t]]^Ĉ and [[r]]^Ĉ respectively.

Proposition 4.4 If [[t]]^Ĉ factors through [[r]]^Ĉ in Ĉ, then it factors through it in C.
Furthermore, if [[t]]^Ĉ factors through ∃(id)([[r]]^Ĉ) in Ĉ, then [[t]]^Ĉ factors through
[[r]]^Ĉ in C.

Van Emden and Kowalski introduced an elegant semantics for Prolog in terms of
fixed points of a continuous operator T_P on the power set of the Herbrand base B_P
of a Prolog program P [12]. The action of T_P on a subset S of B_P is given by:

    T_P(S) = S ∪ { Q(t⃗) | A1, ..., An ⊢ Q(t⃗) is a ground
                           instance of a clause in P and A1, ..., An ∈ S }.
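
The classical ground T_P iteration is easy to make concrete; the following is a minimal,
purely illustrative Python sketch (not the categorical operator defined below), in which
the encodings of atoms and clauses, and all names such as t_p and least_fixed_point,
are our own and not taken from the paper.

# A sketch of the classical ground T_P iteration on a finite ground program.
from typing import FrozenSet, List, Tuple

Atom = Tuple[str, Tuple[str, ...]]       # predicate name + ground argument terms
Clause = Tuple[List[Atom], Atom]         # (body A1..An, head Q(t))

def t_p(program: List[Clause], s: FrozenSet[Atom]) -> FrozenSet[Atom]:
    """One application of T_P: add heads of clauses whose bodies hold in s."""
    derived = {head for body, head in program if all(a in s for a in body)}
    return s | derived

def least_fixed_point(program: List[Clause]) -> FrozenSet[Atom]:
    """Iterate T_P from the empty set; terminates for a finite ground program."""
    s: FrozenSet[Atom] = frozenset()
    while True:
        nxt = t_p(program, s)
        if nxt == s:
            return s
        s = nxt

# Example ground program: p(a).  q(a) :- p(a).
prog: List[Clause] = [([], ("p", ("a",))), ([("p", ("a",))], ("q", ("a",)))]
print(least_fixed_point(prog))   # contains both p(a) and q(a)
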
The least term model of P is precisely the least fixed point of T_P, which we can think
of as building the model as a (co-)limit of a chain. If we fix a predicate letter Q, we
can think of each clause A1, ..., An ⊢ Q(t) in P as contributing to the extent of Q all
ground instances of t corresponding to terms in the interpretation of the A_i. Because
of the non-ground nature of categorical semantics we can define analogous operators
E and T_{P,C} over our more general categorical models, in many respects simpler than
their set-theoretic counterparts. The operator E adjoins finitely many arrows to the arrow
interpreting the predicate Q in a category. The fact that this is done over an arbitrary
category C with terminator and certain limits poses no difficulties. The main idea is
as follows. We assume the predicate letter B has the interpretation [[B]]^D in some
category D. Then if A1, ..., An ⊢ B(t) is a clause in P, let [[A1, ..., An]]^D be the
limit of the clause diagram for A1, ..., An in D. Then we define E by

    E([[·]]^D, B) = [[B]]^D ∪ ∃([[t]]^Ĉ)([[A1, ..., An]]^D) ∪ ...

where the remaining terms of the union, not shown, correspond to any remaining
clauses in P whose head predicate letter is B, and where, we recall, ∃([[t]]^Ĉ) is the
image functor along t. We now give this definition in its full generality.

Definition 4.5 Let D be a semantic category as defined in Definition 4.1 with ex-
tended structure [[·]]^D. Then for each predicate letter B ∈ L_R define

    E([[·]]^D, B) = [[B]]^D ∪ ⋃_{cl ∈ P, hd_cl = B} ∃([[tm_cl]]^Ĉ)([[tl_cl]]^D).

Note that these arrows are in Ĉ, but not, in general, in D.

Now let T_{P,C}(D) be the full subcategory of Ĉ generated by D, 0 ∈ Ĉ, and E([[·]]^D, B)
for each predicate letter B ∈ L_R, closed under finite limits. T_{P,C}(D) is then also
a semantic category, with extended structure defined by [[B]]^{T_{P,C}(D)} = E([[·]]^D, B)
for predicate symbols B ∈ L_R.

In particular, let C^0 be the full subcategory of Ĉ obtained by adjoining to the
original category C the coterminator 0 of Ĉ, and defining an extended interpretation
[[·]]^0 given by [[B]]^0 = the unique map 0 → [[b]], for predicate symbols B ∈ L_R, where
b = sort of B. We may then define a sequence of categories C^0, C^1, ..., C^n, ... built
by letting C^{i+1} = T_{P,C}(C^i), each with the extended interpretation [[B]]^n given in the
definition of T_{P,C}(C^i).
Construct for each predicate symbol B ∈ L_R the filtered colimit in Ĉ of the fol-
lowing diagram of inclusions

    [[B]]^0 ⊆ [[B]]^1 ⊆ ... ⊆ [[B]]^k ⊆ ...

In particular, we may take this diagram of inclusions to be pointwise containment of
functors F_n. The pointwise union of this diagram is then easily seen to be its colimit.

Definition 4.6 We define C* as the full subcategory of Ĉ generated by the categories
C^n and lim_→{[[B]]^n} for each predicate symbol B ∈ L_R, closed under finite limits. For
this category we then define an extended structure by [[B]]* = lim_→{[[B]]^n} for each
predicate symbol B ∈ L_R.

Lemma 4.7 [[tl_cl]]* = lim_→{[[tl_cl]]^n}.

Proof. Follows from the fact that filtered colimits commute with finite limits in a
topos (e.g. [3]). □

Lemma 4.8 The category C*, with interpretation H : C[X] → C* induced by the
evaluation H(X_i) = [[B_i]]*, is a model of the program P.

Proof. We show that T_{P,C}(C*) = C* and, in particular, that E preserves colimits of
subobject chains, i.e.

    [[B]]* = [[B]]* ∪ ⋃_{cl ∈ P, hd_cl = B} { ∃([[tm_cl]]^Ĉ)([[tl_cl]]*) }          (1)

for each predicate letter B ∈ L_R. It will suffice to show that, for each clause such that
hd_cl = B, ∃([[tm_cl]]^Ĉ)([[tl_cl]]*) ⊆ [[B]]*. Then, since ∃([[tm_cl]]^Ĉ) is the left adjoint
of ([[tm_cl]]^Ĉ)#, it suffices to show [[tl_cl]]* ⊆ ([[tm_cl]]^Ĉ)# [[B]]*. We know by the
definition of [[B]]^{k+1} that ∃([[tm_cl]]^Ĉ)([[tl_cl]]^k) ⊆ [[B]]^{k+1}. So since (-)# is also
a left adjoint, we have that for each k

    [[tl_cl]]^k ⊆ ([[tm_cl]]^Ĉ)# [[B]]^{k+1}
              ⊆ ([[tm_cl]]^Ĉ)# [[B]]*.

By Lemma 4.7 we know that [[tl_cl]]* = lim_→{[[tl_cl]]^k}. Thus we may conclude by the
universal property of colimits that

    [[tl_cl]]* ⊆ ([[tm_cl]]^Ĉ)# [[B]]*.          (2)

In fact it is the case that, for any semantic category D, being a fixed point of
T_{P,C} is equivalent to being a model of P. In the proof above, one sees that to show
C* a fixed point of T_{P,C} (equation (1) for each predicate letter B), one may equivalently
show that, for each clause such that hd_cl = B, C* is a model of P (equation (2)). □

Lemma 4.9 ([[·]]-monotonicity) If [[A]]^D ⊆ [[A]]^{D'} for every predicate letter A
in L_R, then for any predicate letter B ∈ L_R and any term t,
([[t]]^Ĉ)# E([[·]]^D, B) ⊆ ([[t]]^Ĉ)# E([[·]]^{D'}, B).

Lemma 4.10 C* is an initial model of P.

Proof. We want to show that for any model D of P (i.e. any semantic category D which
is a fixed point of T_{P,C}), validity in C* implies validity in D. In particular, it
suffices to show that for any predicate letter B ∈ L_R, [[B]]* ⊆ [[B]]^D.
We first show by induction on k that for all k, [[B]]^k ⊆ [[B]]^D. Since [[B]]^0 is
the unique map 0 → [[b]], b = sort of B, it is clear that [[B]]^0 ⊆ [[B]]^D. Now
suppose that [[B]]^k ⊆ [[B]]^D. By Lemma 4.9 and the fact that D is a fixed point of
T_{P,C}, [[B]]^{k+1} = E([[·]]^k, B) ⊆ E([[·]]^D, B) = [[B]]^D.
It now follows by the definition of [[B]]* that [[B]]* = lim_→{[[B]]^n} ⊆ [[B]]^D. □

Definition 4.11 A clause A1, A2, ..., Am ⊢ B(r) is said to admit Q(t) if B = Q
and [[t]]^Ĉ factors through [[r]]^Ĉ.

Lemma 4.12 If an atomic formula Q(t) is valid in all models of the program P
(i.e. is interpreted as an isomorphism), then some clause of P admits Q(t).

Proof. Consider the following model D of the program P. For each predicate letter
Q in P, adjoin to C the arrows (in Ĉ) ⋃_{cl ∈ P, hd_cl = Q} ∃(id)([[tm_cl]]^Ĉ), and close
under finite limits. Let D be the full subcategory of Ĉ generated by this. Now let the
extended structure [[·]]^D be defined by [[B]]^D = ⋃_{cl ∈ P, hd_cl = B} ∃(id)([[tm_cl]]^Ĉ),
where by convention, if B does not occur as a head in P, then [[B]]^D (the empty union)
shall be the unique arrow 0 → [[b]], for b = sort of B.
The category D with interpretation H : C[X] → D induced by the evaluation
H(X_i) = [[B_i]]^D is a model of the program P, since if A1, A2, ..., Am ⊢ B(r) is a
clause cl of P, then [[r]]^Ĉ factors through ∃(id)([[r]]^Ĉ) and so through [[B]]^D. Thus
[[B(r)]]^D is an isomorphism, through which [[tl_cl]]^D factors automatically.

Now, since Q(t) is valid in all models, [[Q(t)]]^D is an isomorphism. In other
words, [[t]]^Ĉ factors through [[Q]]^D = ⋃_{cl ∈ P, hd_cl = Q} ∃(id)([[tm_cl]]^Ĉ). But then, by
coprimeness of representable objects of Ĉ (i.e. images of objects of C under the
Yoneda embedding), [[t]]^Ĉ must factor through ∃(id)([[r]]^Ĉ) for some particular clause
A1, A2, ..., Am ⊢ Q(r) in P. By Proposition 4.4 then, [[t]]^Ĉ must factor through [[r]]^Ĉ
in C. Thus, this clause of P admits Q(t). □

Lemma 4.13 Let G be an atomic formula. If [[G]]^k is an isomorphism, then there
is a τSLD-proof of M^{C[X]}(G).

Proof. The proof proceeds by induction on k.

If k = 0, then the lemma holds vacuously, since for all predicate letters B ∈ L_R,
[[B]]^0 : 0 → [[b]], where b is the sort of B, which cannot be an isomorphism. Now
assume that the lemma is true for k = n, and suppose that [[G]]^{n+1} is an isomor-
phism, where G = Q(t) for some predicate letter Q and term t. If [[Q(t)]]^n is also
an isomorphism, then the lemma holds by the induction hypothesis. So assume that
[[Q(t)]]^n is not an isomorphism. By definition, [[Q(t)]]^{n+1} is the pullback of [[Q]]^{n+1}
along [[t]]^Ĉ. Since this is an isomorphism, [[t]]^Ĉ factors through the union

    [[Q]]^n ∪ ⋃_{cl ∈ P, hd_cl = Q} { ∃([[tm_cl]]^Ĉ)([[tl_cl]]^n) }.

But then, since M(x⃗) (or, more precisely, its image in Ĉ) is coprime, it must be
the case that [[t]]^Ĉ factors through one of the members of the union. Since we
have assumed that [[t]]^Ĉ does not factor through [[Q]]^n, it must be the case that
for some clause cl ∈ P such that hd_cl = Q (say the clause A1, A2, ..., Am ⊢ Q(r)),
([[t]]^Ĉ)# ∃([[r]]^Ĉ)([[tl_cl]]^n) is an isomorphism. In fact, by Lemmas 4.12 and 4.10,
there is a substitution arrow θ : M(x⃗) → M(y⃗) in C such that [[t]]^Ĉ = θ ; [[r]]^Ĉ,
i.e. t = rθ.
Thus we have that, for each i, [[A_i θ]]^n is an isomorphism. But then, by our induction
hypothesis, there are τSLD-proofs of M^{C[X]}(A_i θ) for each i.
We may then build a τSLD-proof of M^{C[X]}(Q(t)) as follows. Using the clause
diagram for A1, ..., Am ⊢ Q(r), we may take one τSLD-derivation step using (id, θ) as
mgu. Our new goal is then the definite goal diagram consisting of the fan
M^{C[X]}(A_1 θ), ..., M^{C[X]}(A_m θ) over M(x⃗). But, as mentioned above, each of these
has a (finite) τSLD-proof. □

Lemma 4.14 Suppose the pullback [[Bt]]* ↣ [[e]] of the monic [[B]]* ↣ b along
[[t]] : [[e]] → b, interpreting Bt in C*, is an isomorphism. Then for some k so is the
corresponding pullback [[Bt]]^k ↣ [[e]], interpreting Bt in C^k.

Proof. Suppose pulling back [[B]]* ↣ b along [[t]] : [[e]] → b in C* gives rise to an
isomorphism γ : [[Bt]]* → [[e]], and let β : [[Bt]]* → [[B]]* be the top arrow of the
pullback square. Then β ∘ γ^{-1} : [[e]] → [[B]]* is an arrow from an object in C, i.e. a
representable functor in Ĉ, to [[B]]* ≅ lim_→{[[B]]^n}. By the coprimeness of C-objects
in Ĉ, β ∘ γ^{-1} factors through some [[B]]^k as [[e]] --δ--> [[B]]^k ⊆ [[B]]*. It is then
straightforward to show that the arrows id and δ are pullbacks of the arrows [[t]]^Ĉ and
[[B]]^k ↣ b respectively. □

Theorem 4.15 (Completeness of τSLD-Resolution) Let C be a category and
C[X] a logic programming category with extended structure M : L → C[X] as described
previously. Let P^τ be a τ-logic program in the language L and G a definite goal dia-
gram. If G holds in every model of P^τ, then P^τ ⊢_C G (i.e. there is a τSLD-proof of G
from P^τ in the category C[X]).

Proof. Let P^τ, as described above, have clause diagrams M^{C[X]}(cl_1), ..., M^{C[X]}(cl_m),
and let G be a definite goal diagram, i.e. the fan of arrows M^{C[X]}(G_1), ..., M^{C[X]}(G_r)
over M(x⃗) with limit arrow ℓ, such that G is true in every model of P^τ. Its image under
H in C* is the fan of monics [[G_1]]*, ..., [[G_r]]* over M(x⃗), with legs g_1, ..., g_r.
In particular then, for the model C* of P^τ, we have that H[ℓ], where ℓ is the limit of
the goal diagram G, is an isomorphism. Since, by definition, H[ℓ] is the limit of the
image diagram, if it is an isomorphism then each g_j is an isomorphism. But now by
Lemma 4.14, for each j the corresponding pullback in some C^k is an isomorphism, so
by Lemma 4.13 there is a τSLD-proof of M^{C[X]}(G_j) for each j.

This theorem is a category-theoretic generalization of the completeness theorem
of Clark [12], which states (in essence) that the validity of the universal closure of a
goal implies its SLD-provability.

5 Some Examples of Constraint Categories

Let Σ be the language of a one-sorted Horn clause program. Define H, the free
algebraic theory for Σ, to be the category with objects the natural numbers and arrows:
projection maps π_nm : n → m and diagonal arrows δ_nm : m → n for each nonzero pair
of objects m, n with m ≤ n and, for each function symbol f of arity n in Σ, an
arrow f : n → 1. These arrows satisfy the associativity equations together with the
expected identities relating projections and diagonals (in particular δ_nm ; π_nm = id_m).
This category has products and τ-structure. There is a one-to-one correspondence
between arrows in H and (tuples of) terms in the Herbrand universe over Σ. Give
H the canonical interpretation that associates with each f in Σ the arrow f, and let
H[X_1, ..., X_n] be the category obtained by adjoining indeterminates for each of the
predicate letters in Σ; then H[X_1, ..., X_n] captures logic programming in the sense
made precise in the theorem below. Terminology is from [12].

Theorem 5.1 Let P be a Horn clause program, G a (possibly open) definite goal.
Then there is an SLD-proof P ⊢_H G over this category if and only if there is a computed
answer for P ∪ {G}.
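
The arrow/term correspondence behind Theorem 5.1 can be made concrete: an arrow
n → 1 of the free algebraic theory may be read as a term in variables x_1, ..., x_n (so
arrows 0 → 1 are the ground Herbrand terms), and composition is substitution. The
following Python sketch uses a representation of our own choosing, only to illustrate
this reading; it is not the paper's construction.

# Arrows n -> 1 represented as terms over x1..xn; composition with a tuple of
# arrows is simultaneous substitution.
from typing import Tuple, Union

Term = Union[int, Tuple]          # int i = variable x_i; tuple = (f, arg1, ..., argk)

def compose(t: Term, args: Tuple[Term, ...]) -> Term:
    """Substitute args[i-1] for each variable x_i in t (composition of arrows)."""
    if isinstance(t, int):
        return args[t - 1]
    f, *sub = t
    return (f,) + tuple(compose(s, args) for s in sub)

# f : 2 -> 1 given by the term f(x1, x2); composing with (g(x1), x1) : 1 -> 2
# yields the term f(g(x1), x1) : 1 -> 1.
f_x1_x2: Term = ("f", 1, 2)
print(compose(f_x1_x2, (("g", 1), 1)))   # ('f', ('g', 1), 1)

Composition of tuples of such terms (arrows n → m) is then simply componentwise
substitution.
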

This example is essentially the case studied in [2]. A straightforward adaptation of


this construction gives the corresponding free category for many-sorted logic.
We can modify the preceding construction to capture equational constraints, as in
[13]. Let L be a language, possibly many-sorted, and E an equational theory over L.
By taking a quotient of the Herbrand category H_L we obtain the algebraic theory
associated with E (see e.g. [17]), containing the same objects as H and congruence
classes of arrows of H as morphisms. We then add mgu's by adjoining terminal cones
(i.e. pullbacks) for every commuting square consisting of interpretations of terms in
L and their corresponding unifiers. We obtain a category in which a pullback of the
arrows corresponding to two terms exists precisely when the terms are E-unifiable, that
is to say, when there is a substitution θ such that E ⊨ sθ = tθ. SLD-resolution over
this category corresponds to finding a computed answer with equational constraints
from E.
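
For the empty theory E, the E-unifiability condition above reduces to ordinary
syntactic unifiability. The sketch below, a standard textbook unification algorithm
over a term encoding of our own (variables as strings, compounds as tuples), is
included only to make the notion of unifier concrete; it is not the paper's construction.

# Syntactic unification: returns a most general unifier or None.
from typing import Dict, Optional, Tuple, Union

UTerm = Union[str, Tuple]
Subst = Dict[str, UTerm]

def walk(t: UTerm, s: Subst) -> UTerm:
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def occurs(v: str, t: UTerm, s: Subst) -> bool:
    t = walk(t, s)
    return t == v if isinstance(t, str) else any(occurs(v, a, s) for a in t[1:])

def unify(t1: UTerm, t2: UTerm, s: Optional[Subst] = None) -> Optional[Subst]:
    s = dict(s or {})
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if isinstance(t1, str):
        return None if occurs(t1, t2, s) else {**s, t1: t2}
    if isinstance(t2, str):
        return unify(t2, t1, s)
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None
    for a, b in zip(t1[1:], t2[1:]):
        s = unify(a, b, s)
        if s is None:
            return None
    return s

print(unify(("f", "x", ("g", "y")), ("f", ("g", "z"), "x")))
# {'x': ('g', 'z'), 'y': 'z'}
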
Note that for domains with pairing or projection operations (see e.g. [11]), first-
order constraints are equivalent to (variable-free) equational set constraints using the
allegory-theoretic connectives ∩, ∪, relational composition, and converse (_)° (and
complementation in the case of classical logic). Thus first-order constraints can be
captured with many-sorted equational categories of the type just discussed.

6 Conclusion

We have shown how a general notion of many-sorted, non-ground constraint logic
programming can be captured by a categorical abstract syntax in which resolution
takes place via pullbacks in a τ-category with canonical products, monics and lim-
its. A categorical analogue of the T_P operator of Kowalski and van Emden gives a
completeness theorem for this notion in semantic categories defined via the Yoneda
embedding.

This categorical framework opens the way for a categorical approach to the se-
mantics of uniform proof systems and higher-order logic programming [15], as well
as for the formalization of control. It would be interesting to study connections with
the hyperdoctrinal constraints of [16], and with the π-domains and non-ground se-
mantics of [1]. Also worth exploring in this context is a categorical theory of abstract
interpretation based on adjunctions over the Yoneda category Ĉ.

References
[1] A. Bossi, M. Gabbrielli, G. Levi and M. Martelli. The s-semantics approach: Theory and
    applications. Journal of Logic Programming, 19-20, 1994.
[2] A. Asperti and S. Martini. Projections instead of variables, a category theoretic interpretation
    of logic programs. In Proc. 6th ICLP, pages 337-352. MIT Press, 1989.
[3] Michael Barr and Charles Wells. Toposes, Triples, and Theories. Springer-Verlag, 1985.
[4] A. Corradini and A. Asperti. A categorical model for logic programs: Indexed monoidal
    categories. In Proceedings REX Workshop '92. Springer Lecture Notes in Computer Science,
    1992.
[5] A. Corradini and U. Montanari. An algebraic semantics of logic programs as structured tran-
    sition systems. In Proceedings of the North American Conference on Logic Programming
    (NACLP '90). MIT Press, 1990.
[6] Stacy E. Finkelstein. Tau Categories and Logic Programming. PhD thesis, University of
    Pennsylvania, 1994.
[7] Peter Freyd and Andre Scedrov. Categories, Allegories. North-Holland, 1990.
[8] Joxan Jaffar and Michael Maher. Constraint logic programming: A survey. Journal of Logic
    Programming, 19/20, 1994.
[9] J. Lambek and P.J. Scott. Introduction to Higher Order Categorical Logic. Cambridge, 1986.
[10] Saunders Mac Lane and Ieke Moerdijk. Sheaves in Geometry and Logic. Springer-Verlag, 1992.
[11] James Lipton and Paul Broome. Combinatory logic programming. In Proc. ILPS'94. MIT,
    1994.
[12] J. W. Lloyd. Foundations of Logic Programming. Springer Verlag, New York, 1987.
[13] M. Alpuente, M. Falaschi, M. Gabbrielli and G. Levi. The semantics of equational logic pro-
    gramming as an instance of CLP. In Logic Programming Languages. MIT, 1993.
[14] Michael Makkai and Gonzalo Reyes. First Order Categorical Logic, volume 611 of Lecture
    Notes in Mathematics. Springer-Verlag, 1977.
[15] Dale Miller, Gopalan Nadathur, Frank Pfenning, and Andre Scedrov. Uniform proofs as a
    foundation for logic programming. Annals of Pure and Applied Logic, 1990.
[16] P. Panangaden, V. Saraswat, P.J. Scott, and R.A.G. Seely. A hyperdoctrinal view of constraint
    systems. In Lecture Notes in Computer Science 666. Springer-Verlag, 1993.
[17] A. Poigné. Algebra categorically. In Category Theory and Computer Programming. Springer,
    1986.
[18] D.E. Rydeheard and R.M. Burstall. A categorical unification algorithm. In Category Theory
    and Computer Programming, 1985.
Reasoning and Rewriting with Set-Relations I:
Ground Completeness

Valentinas Kriaučiukas¹ and Michał Walicki² *

¹ Department of Mathematical Logic

Institute of Mathematics and Informatics, Vilnius, LITHUANIA
{valentinas.kriauciukas@mlats.mii.lt}
² Department of Informatics, University of Bergen, NORWAY
{michal@ii.uib.no}

Abstract. The paper investigates reasoning with set-relations: intersection, inclusion
and identity of 1-element sets. A language is introduced which, interpreted in a multi-
algebraic semantics, allows one to specify such relations. An inference system is given
and shown sound and refutationally ground-complete for a particular proof strategy
which selects only maximal literals from the premise clauses. Since each of the introduced
set-relations satisfies only two among the three properties of equivalence relations,
we study rewriting with such non-equivalence relations and point out differences from
the equational case. As a corollary of the main ground-completeness theorem we obtain
ground-completeness of the introduced rewriting technique.

1 Introduction

Reasoning with sets becomes an important issue in different areas of com-


puter science. Its relevance can be noticed in constraint and logic programming
e.g. [SD86, DO92, Jay92, Sto93], in the algebraic approach to nondeterminism e.g.
[Hus93, Hes88, WM95], in term rewriting e.g. [LA93, Kap88, Hus93].
Our interest in the set concepts originates from an earlier study of specifica-
tions of nondeterministic operations. Such operations are naturally modelled as
set-valued functions. The semantic structures serving this purpose - multialgebras -
generalize the traditional algebras by allowing operations which, for a given
argument, return not necessarily a single value but a set of values (namely, the
set of all possible values returned by an arbitrary application of the operation).
In [WM95, Wal93] we defined a specification language using set-relations and its
multialgebraic semantics. The set-relations we considered were: inclusion, inter-
section and identity of 1-element sets. The first two are the usual set relations.
Inclusion allows one to define set equality which, for that reason, is not included
in the language. The third relation is particularly important: it provides the
syntactic means of distinguishing between sets and their elements, and is indis-
pensable for obtaining a complete reasoning system. Such a system is also given
in the above works.
* Both authors gratefully acknowledge the financial support received from the Norwe-
gian Research Council.

In the present paper we use the same set-relations but introduce a new rea-
soning system. It is less general than the earlier one - we are studying only
the ground case - but it is much more amenable to automation. Rewriting with
non-congruence relations is also becoming an issue of increasing importance. The
set-relations we are considering are not even equivalences: equality is symmetric
and transitive (but not reflexive), inclusion is reflexive and transitive (but not
symmetric), and intersection is reflexive and symmetric (but not transitive). We
study the rewriting proofs in the presence of these relations generalizing sev-
eral classical notions (critical pair, confluence, rewriting proof) to the present
context. Our results on rewriting extend bi-rewriting [LA93] in that we con-
sider three different set-relations. We also take a step beyond the framework of
[BG93] in that we study more general composition of relations than chaining of
transitive relations.
Section 2 defines the syntax and the multialgebraic semantics of the lan-
guage and lists some basic properties of the set-relations. Section 3 introduces
the reasoning system, the ordering of words, and the maximal literal proof
strategy. Section 4 discusses term rewriting with the introduced
set-relations. In section 5 we discuss the main theorem - refutational ground
completeness of the system with the maximal literal strategy - and, as a simple
corollary, ground completeness of rewriting.
The present paper is an improved and shortened version of the report [KW94].
Because of the space limitations, all the proofs had to be omitted in the present
version of the paper.

2 Specifications of Set-Relations

Specifications are written using a finite set of function symbols F having arity
ar : F → ℕ.³ A symbol f ∈ F_0 is called a constant. Only the ground case is
considered here, and we do not introduce any variables. We denote by T(F)
the set of all (ground) terms. There are only three atomic forms, using binary
predicates: the equation s ≐ t, the inclusion s ≺ t and the intersection s ⌣ t. A
specification is a set of clauses - finite sets of literals, where a literal is an atom
or a negated atom, written ¬a. (In [WM95, Wal93] a restricted language is used,
allowing only negated intersections, positive inclusions and equations in clauses.)
We will usually write negated atoms explicitly as s ≠ t, s ⊀ t and s ⌣̸ t, and assume
¬(¬a) = a. By words we will mean the union of the sets of terms, literals and
clauses. We will write u[s]_p to denote that a term s is a subterm of a term u at
a position p. Often the position will be omitted for the sake of simplicity.
Syntactic expressions of the language are interpreted in multialgebras [Kap88,
Hus92, Wal93].
Definition 2.1 An F-multialgebra A is a tuple (S^A, F^A), where S^A is a non-
empty carrier set and F^A is a set of set-valued functions f^A : (S^A)^{ar(f)} → P⁺(S^A),
where f ∈ F, and P⁺(S^A) is the power-set of S^A with the empty set excluded.

³ We are treating only the unsorted case - the extension to many sorts is straightforward.

Defining the meaning of words, we follow [WM95, Wal93].

Definition 2.2 The interpretation [[d]]^A of an expression d of the language is
defined as follows:

- [[c]]^A =def c^A, if c is a constant;
- [[f(t_1, ..., t_n)]]^A =def ⋃ { f^A(a_1, ..., a_n) : a_i ∈ [[t_i]]^A } for any f ∈ F_n and
  {t_1, ..., t_n} ⊆ T(F);
- [[s ≐ t]]^A is true if [[s]]^A = [[t]]^A = {a} for some a ∈ S^A, and false otherwise;
- [[s ≺ t]]^A is true if [[s]]^A ⊆ [[t]]^A, and false otherwise;
- [[s ⌣ t]]^A is true if [[s]]^A ∩ [[t]]^A ≠ ∅, and false otherwise;
- for an atom a, [[¬a]]^A is true if [[a]]^A is false, and is false otherwise;
- [[a_1, ..., a_n]]^A is true if some [[a_i]]^A is true, and false otherwise.
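
As an illustration of Definition 2.2, the sketch below evaluates ground terms in a toy
multialgebra and checks the three atomic relations. The concrete carrier, the operations
and all names (ops, ev, eq_dot, incl, meets) are invented for the example and only
mirror the pointwise extension above.

# Toy multialgebra: every operation is set-valued; terms are evaluated pointwise.
from itertools import product
from typing import Callable, Dict, FrozenSet, Tuple, Union

MTerm = Union[str, Tuple]     # constant name, or (f, arg1, ..., argk)
Val = FrozenSet[int]

ops: Dict[str, Callable[..., Val]] = {
    "c":    lambda: frozenset({1}),              # deterministic constant
    "d":    lambda: frozenset({1, 2}),           # nondeterministic constant
    "succ": lambda a: frozenset({a + 1}),
    "pick": lambda a, b: frozenset({a, b}),      # binary nondeterministic choice
}

def ev(t: MTerm) -> Val:
    """[[t]]^A = union of f^A(a1,...,an) over all ai in [[ti]]^A."""
    name, args = (t, ()) if isinstance(t, str) else (t[0], t[1:])
    arg_vals = [ev(a) for a in args]
    return frozenset().union(*(ops[name](*tup) for tup in product(*arg_vals)))

def eq_dot(s: MTerm, t: MTerm) -> bool:   # s ≐ t : both denote the same singleton
    vs, vt = ev(s), ev(t)
    return len(vs) == 1 and vs == vt

def incl(s: MTerm, t: MTerm) -> bool:     # s ≺ t : value-set inclusion
    return ev(s) <= ev(t)

def meets(s: MTerm, t: MTerm) -> bool:    # s ⌣ t : value sets intersect
    return bool(ev(s) & ev(t))

print(ev(("pick", "c", "d")))                                        # frozenset({1, 2})
print(eq_dot("d", "d"), incl("c", "d"), meets(("succ", "c"), "d"))   # False True True

Note that in this sketch eq_dot("d", "d") is False, which illustrates the remark
below that '≐' is not reflexive on nondeterministic terms.
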

Definition 2.2 implies that for each f ∈ F, f^A is ⊆-monotone (because it is
defined by pointwise extension). The interpretation of a constant c is, according to
the definition of multialgebra, a non-empty set. Observe also that equality is not
reflexive - t ≐ t is not true in general. A term t for which this equality is true
is called deterministic, because then it has only one possible value. The equality
is merely a symmetric and transitive relation. An inclusion s ≺ t means that the
value set of the term s is included in the value set of t. This relation
is a partial preorder - it is transitive and reflexive, but not symmetric. The
intersection is reflexive (because of the non-emptiness of term values) and symmetric,
but lacks the transitivity property. Thus each of these relations satisfies two of the
three properties of equivalence relations. We now present some other properties
of these relations.

2.1 Basic Properties of Literals

The following relation expresses equality of term value sets, and is the usual
interpretation of equality in the set-valued approach to nondeterminism [Hes88,
Kap88]:

    s ≗ t  ⇔def  s ≺ t ∧ s ≻ t.                                          (1)

As can be expected, it does not increase expressibility and is therefore not used
in the language. For a discussion of the intended meaning of, and difference between,
'≐' and '≗' in the context of nondeterminism see [WM95, Wal93].
The positive, resp. negative, relations are totally ordered by strength:

    u ≐ v ⟹ u ≗ v ⟹ u ≺ v ⟹ u ⌣ v   and   ¬(u ≐ v) ⟸ ¬(u ≗ v) ⟸ ¬(u ≺ v) ⟸ ¬(u ⌣ v)   (2)

The following two lemmas present the subterm replacement and composition
(chaining) properties. Replacement of "equals by equals" occurs only in the case
of two of the four relations. Nevertheless these properties will allow us later to
develop techniques of term-rewriting.

Lemma 2.3 The following properties hold for the introduced predicates:

    s ≐ t ⟹ u[s]_p ≗ u[t]_p,   s ≗ t ⟹ u[s]_p ≗ u[t]_p,
    s ≺ t ⟹ u[s]_p ≺ u[t]_p,   s ≻ t ⟹ u[s]_p ≻ u[t]_p.

L e m m a 2.4 The predicates satisfy the composition properties, given in Table 1.

Iii IIs ~ ~1~-< ~1s >- ~ts ~ ~1~ ~ ~ls ~ ~1~ ~ ~1s ~ ~1
s~t t~ut-~ut~ut~u t~ut~ut~ut~u
s - ~ t , t ~ - u t ~ ' . u t ~ - u t~-.u t ~ u t - ~ u t ~ u t - ~ u
s~-t t ~ u t - ~ u . . . . t~u!t~u
sat t~-ut~.u . . . . tr t-~u

Table 1. Rules for atom composition

For convenience we will write the partial function coded in this table as ⊕ ∘ ⊗ =
⊙, meaning that ⊙ is the strongest relation obtained by composing ⊕ and ⊗ for
any terms, i.e.: ⊕ ∘ ⊗ = ⊙ ⇔ (s ⊕ t ∧ t ⊗ u ⟹ s ⊙ u) for any terms s, t, u. Note
that the table defines only the strongest composite of the arguments. Because
of the ordering (2), whenever ⊙ can be obtained by composing ⊕ and ⊗, every
relation weaker than ⊙ can be obtained from the same composition as well.
Composition of a negative and a positive atom is symmetric to the composition
of a positive and a negative one given in the table. Composition of two
negative atoms does not allow one to draw any conclusion and is therefore not
mentioned at all.

L e m m a 2.5 The composition function _ o _ is transitive.

The next lemma is an easy corollary of the two previous lemmas, but it is
important because it describes the situation known from term rewriting as a
critical peak [DJ90] and is related to the generation of critical pairs.

L e m m a 2.6 The atoms satisfy the replacement rules from Table 2.

.--. t ~[t] ~ ~ u-[t] .--. v . . . . ~[t] .r ~[t] r

Table 2. Rules for term replacement in atoms

The content of this table is also encoded as a partial function: Repl(⊕, ⊗) =
⊙ ⇔ (s ⊕ t ∧ u[s]_p ⊗ v ⟹ u[t]_p ⊙ v) for any terms s, t, u, v and position p in u.

The two tables differ in predicate signs at four places - 1:3 to 1:6 (row 1,
columns 3 through 6), where the relation resulting from Table 1 is stronger than
the one from Table 2. These cases must be distinguished when the superposition
rule (see below) is applied. Therefore we introduce the function Sup(s, t, ⊕, ⊗),
which selects the appropriate table: its value is ⊕ ∘ ⊗ in the case s = t, and
Repl(⊕, ⊗) otherwise.

3 The Inference System


The following set of rules was constructed in analogy to the inference systems
for first-order predicate calculus with equality [BG91, S-A92]. However, there
are some additional restrictions, due to the composition laws, as compared with
the equational case. Very similar rules are presented in [BG93] for transitive
relations.

Reflexivity resolution:
    C, s ⊗ s
    --------        where ⊗ ∈ {⊀, ⊁, ⌣̸}
       C

Superposition:
    C, s ⊕ t        D, u[s]_p ⊗ v
    ------------------------------        where ⊙ = Sup(s, u[s]_p, ⊕, ⊗)
          C, D, u[t]_p ⊙ v

Compositionality resolution:
    C, s ⊕ t        D, s ⊗ u
    -------------------------        where ⊙ = ⊕^{-1} ∘ ¬⊗
        C, t ⊙ u, s ⊗ u

The analogous rule for equality, called equality factoring [BG91, S-A92], is a
special case of our rule when both premise clauses coincide. In [BG93] the
analogous rule is called transitivity resolution.
Let I denote the inference system consisting of the above rules.

Theorem 3.1 The inference system I is sound.

Proof. Soundness of reflexivity resolution follows from reflexivity of '≺' and
'⌣'. Soundness of superposition is a direct consequence of the replacement and
composition laws (Lemmas 2.6, 2.4). Soundness of the compositionality rule is
based on the following short deduction. Suppose that the first premise clause
and the implication s ⊕ t ∧ ¬(t ⊙ u) ⟹ s ⊗ u are both true. The latter is equivalent
to s ⊕ t ⟹ t ⊙ u ∨ s ⊗ u. A single application of the (usual) resolution rule gives
the conclusion of the rule. The second premise clause is not used in this step -
it only shows the goal atom.

3.1 Ordering of Words
Various orderings of terms and atoms are used extensively in the study of auto-
mated deduction. We will apply such an ordering to define a more specific proof
strategy for the system I, to study the possibility of rewriting w.r.t. the introduced
predicates and, finally, to define the model in the completeness proof. We assume
the existence of a simplification ordering '>' [DJ90] on ground terms which is
total (∀s ≠ t ∈ T(F) : s > t ∨ t > s), well-founded (∀t ∈ T(F) : {s : s < t}
is finite), monotone (∀u, s, t ∈ T(F) : s > t ⟹ u[s] > u[t]) and increasing
(∀s ∈ T(F) : u[s] ≠ s ⟹ u[s] > s). The ordering of other words is defined by the
multiset extension [DM79] of this ordering. Let M(T) denote the set of all finite
multisets of elements from T. Each element of M(T) can be represented by a
function β : T → ℕ such that β ≡ 0 except for some finite number of elements
of T; β(d) is the number of copies of d in the multiset β.

Definition 3.2 For an ordering '→' on a given set T, an ordering '→^m' on the
set M(T) is a multiset extension of '→' if

    β →^m γ  ⇔  ∀d ∈ T ∃c ∈ T ( β(c) > γ(c) ∧ ( β(d) ≥ γ(d) ∨ c → d ) ).

In the particular case of a total ordering of T, which is the only one considered here,
α →^m β means that there is some c ∈ T such that α(c) > β(c) ∧ ∀d → c : α(d) =
β(d). This is a lexicographic ordering comparing biggest components first. In the
general case it is known [DM79] that '→^m' is total if '→' is total and '→^m' is
well-founded if '→' is well-founded.
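
A concrete rendering of Definition 3.2 may be helpful; the Python sketch below compares
finite multisets represented by their occurrence counts, following the formula above.
The names (multiset_greater, greater) and the use of Counter are our own encoding,
not the paper's.

# Multiset extension of an order on ground terms (Definition 3.2):
# beta >^m gamma iff for every d there is a c with beta(c) > gamma(c) and
# either beta(d) >= gamma(d) or c > d.
from collections import Counter
from typing import Callable, Hashable

def multiset_greater(beta: Counter, gamma: Counter,
                     greater: Callable[[Hashable, Hashable], bool]) -> bool:
    domain = set(beta) | set(gamma)
    def witness(d) -> bool:
        return any(beta[c] > gamma[c] and (beta[d] >= gamma[d] or greater(c, d))
                   for c in domain)
    # the beta != gamma guard only rules out the degenerate empty-vs-empty case
    return beta != gamma and all(witness(d) for d in domain)

# With the usual order on integers standing in for the term ordering '>':
gt = lambda a, b: a > b
print(multiset_greater(Counter([3, 1]), Counter([2, 2, 1]), gt))   # True: 3 dominates the 2s
print(multiset_greater(Counter([2, 2]), Counter([3]), gt))          # False

For a total, well-founded ordering this coincides with the lexicographic comparison of
occurrence counts from the biggest element downward, as noted above.
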
Writing a literal s ⊕ t, we indicate that s ≥ t. This explains why both signs
'≺' and '≻' are used. This rule, of course, is not applied to the conclusions of
the proof rules. We assume that any term is bigger than any predicate symbol.
A stronger positive predicate is bigger than a weaker one, the order between
negative predicates is reversed, and all negative predicates are bigger than the
positive ones:

    ⌣̸ > ⊁ > ⊀ > ≠ > ≐ > ≻ > ≺ > ⌣                                          (3)

By analogy with the commonly used approach in equational reasoning, we iden-
tify literals with multisets. A literal s ⊕ t is represented by the multiset {{s, ⊕},
{t, ⊕^{-1}}}. The ordering of the predicates makes the negated form of an atom
bigger than the atom itself.
The ordering of literals is the twofold extension of '<', because each literal is
a multiset of two multisets. The biggest literal in a clause C w.r.t. this ordering
is denoted by max(C). Clauses are compared as multisets of literals, so their
ordering is the multiset extension of the ordering of literals (the threefold multiset
extension of '<'). Although we have here three different orderings, we will use the
same symbol '<' to denote any of them. This should not introduce any confusion,
as the sets of terms, literals and clauses are disjoint.

3.2 The Maximal Literal Proof Strategy

The literals mentioned explicitly in the premises of the proof rules are called
active. Various ways of selecting the active literals lead to different proof
strategies. The maximal literal strategy requires that the active literals in the
premise clauses are the ones which are maximal w.r.t. the ordering defined above.
Stated explicitly, the strategy amounts to the following restrictions on the appli-
cation of the rules:

Reflexivity resolution: the literal s ⊗ s is maximal in the premise clause.

Superposition: the atom s ⊕ t and the literal u[s]_p ⊗ v are maximal in their
clauses.
Compositionality resolution: the atom s ⊕ t is maximal in its clause. The
atom s ⊗ u is not maximal in the second premise clause, but the term s
is maximal in this clause, and the maximal literal in this clause is positive.
Both ⊕ and ⊗ are positive.

The restriction on the last rule is the only case where some active atom
(s ⊗ u) is not maximal in its clause. However, it is almost maximal, because the
maximal term s of the clause occurs in it. Another reason why this weakening
of the strategy is not essential is that the second clause in the premise provides
merely the context allowing application of the rule, and in fact the atom s ⊗ u is
not so "active". A particular consequence of this restriction is that the rule can
be applied only when its second premise is a non-Horn clause.

4 Rewriting Proofs

In the next section we show that if the empty clause cannot be deduced using the
maximal literal strategy, then a model exists satisfying the initial set of clauses.
The model is constructed from an appropriate set of ground atoms which force
all the initial clauses to be true. The notion of forcing requires construction
of a deductive closure of a given set of literals. This section investigates the
rewriting proofs in which ground literals are rewritten to ground literals. The
obtained results will serve as a basis for the construction of the forcing set in the
completeness proof.
Although, eventually, only atoms will be used in the model construction, we
give a more general account - our definitions and lemmas apply to rewriting
with both negative and positive literals.
Rewriting of literals with the set-relations is based on the fact that the re-
lations satisfy the replacement properties from Lemma 2.6. For example, an impli-
cation of the form s ⊕ t ⟹ u[s]_p ⊙ u[t]_p means that the atom u[s]_p ⊙ u[t]_p can be
derived by applying the rule s ⊕ t to the term u[s]_p. The following definition states
what kind of literals can be derived by directly applying the replacement property to
some set A of ground literals (also called axioms).

Definition 4.1 A literal r is a rewriting step in A if either 1) r ∈ A, or 2) r is
an atom u[s]_p ⊙ u[t]_p and s ⊗ t ∈ A, where ⊙ = ⊗ if ⊗ ≠ ≐, and ⊙ ∈ {≺, ≻}
otherwise.

The rule based directly on this kind of term-rewriting is superposition. The forc-
ing set in the completeness proof will consist of ground positive literals which can
be derived using only this one rule. The superposition rule takes two rewriting
steps and composes them into one. Consecutive applications of superposition
correspond to the composition of a finite sequence of rewriting steps (s ⊕_1 t_1,
t_1 ⊕_2 t_2, t_2 ⊕_3 t_3, ..., t_n ⊕_{n+1} t). Such a sequence is called a rewriting
sequence, and the predicate sign ⊕ of the resulting literal is computed using the
function _∘_ : ⊕ = ((⊕_1 ∘ ⊕_2)^{-1} ∘ ...)^{-1} ∘ ⊕_{n+1}. The next definition puts all
such literals into the rewriting closure of A. This closure also contains atoms that
are trivially true.

Definition 4.2 For a set A of ground literals, the rewriting closure of A is the
set of ground literals A* defined as follows:

- all atoms of the form s ≺ s or s ≻ s belong to A*;
- if an atom s ⊕ t ∈ A and a literal u[s] ⊗ v ∈ A*, then the literal u[t] ⊙ v ∈ A*,
  where ⊙ = Sup(s, u[s], ⊕, ⊗).

Primarily, we are interested in reducing rewriting sequences, i.e., sequences in which
rewriting is used to produce terms of lower complexity in some well-founded
ordering. The term ordering is used to orient literals, but it does not allow us to
orient the reflexive literals of the form s ⊕ s. However, the orientation problem of
these particular literals turns out to be inessential for the following arguments
(except the next definition), and so we allow them to have the orientation that is
appropriate for the context in which they are used. A literal s ⊕ t can be written
in the form s ⊕_> t to emphasize that s > t; it is then also called a rule. The
fact that this literal is derived by a rewriting sequence in which terms do not
increase in any step is written as s =⊕⇒ t or (the same) as t ⇐⊕^{-1}= s.

Definition 4.3 A rewriting sequence is reducing (w.r.t. an ordering of terms
<) if it does not contain a peak, i.e., a pair of consecutive rewriting steps s ⊕ t, t ⊗ u
such that s ≤ t ≥ u. A rewriting proof is a reducing rewriting sequence.

The non-strict inequalities in the last condition capture the cases of reflexive
steps. A rewriting step s ⊕ s does not form a peak in a rewriting sequence only
at a locally minimal point, i.e., in a rewriting proof where s is the smallest term.
Definition 4.3 means that any reducing proof consists of two decreasing branches,
like s ⇒ u ⇐ t, or has only one branch, s ⇒ t or s ⇐ t. The table from Lemma 2.4
can be written as a summary of all the possible combinations of the resulting
predicate signs appearing in two-branch rewriting proofs:
(s=~u.~=t v s = ~ . ~ = t v s~,~=t) ~ s ~ t.
(s=~u.~=t v s=~u4=?=t v s=:guc=~=t) ~ s-< t,
(s==%~=t v s=~=t v s=%u~=t) ~ s >- t,
(s=%u4=?=t v s=%u<~=t v s = ~ 4 4 = t ) ~ s.-.t.

Lemma 2.6 describes how the peaks can be eliminated from rewriting sequences.
Let us take, for example, one implication from this lemma: s ⊕ t ∧ u[s] ⊗ v ⟹
u[t] ⊙ v. The premise can be interpreted as the possibility of having the peak
u[t] ⇐ u[s] ⇒ v in proofs, if both atoms s ⊕ t and u[s] ⊗ v are axioms. This peak
can be "cut down", replacing it by the consequence u[t] ⊙ v, if the latter is also
among the axioms. The following notions are commonly used in similar situations.

Definition 4.4 A rule r_1 = s ⊕_> t overlaps a rule r_2 = u[s]_p ⊗_> v. In this case
the literal l = u[t]_p ⊙ v, where ⊙ = Sup(s, u[s]_p, ⊕, ⊗), is called a critical literal
formed by the rules r_1, r_2, if l is different from r_1 and r_2.

Critical literals correspond to critical pairs from equational reasoning [DJ90]. In
our case the definition is more complex, because the predicate sign is important
and replacement is not merely of "equals by equals". Also, when the rule r_1 is
reflexive (which is not necessarily a tautology in our case), then the critical literal
l may be the same as r_2. It is better to exclude such cases because they would
complicate our model construction.

Definition 4.5 A set R of ground rewriting rules is confluent if R* contains
all critical literals formed by overlapping rules from R.

In term-rewriting theory [DJ90] such systems are called locally confluent. Conflu-
ent systems have a slightly different definition, but both these notions are proved
to be equivalent. In [LA93] a similar definition introduces bi-confluent systems.
In completeness proofs like ours, fully-reduced [PP91] or left-reduced [S-A92,
BG91] rewriting systems are used. We are not able to define the analogous notion,
since deduction and reduction are not the same in our language, and will apply
Definition 4.5 instead. Its direct consequence is

Lemma 4.6 Any literal derivable by a rewriting sequence in a confluent system
R has a rewriting proof in R.

Since we have allowed both negative and positive literals to occur in one set of
axioms A, the unpleasant situation when both an atom a and its negation ¬a
belong to A* is possible. The set A is consistent if no such atom exists. A set
containing only atoms is obviously consistent. Although such a set will be of
main importance in the following section, we again formulate stronger results,
taking into account the general situation of possibly inconsistent sets of literals.
The next lemma, to be used in the completeness proof to construct confluent
systems incrementally, characterizes rules that can be added to a confluent and
consistent system preserving both these properties. Here we have again a situation
different from the usual equational reasoning, because a rule s ≐ t overlaps
itself and produces the critical atom t ≐ t, which need not be always true. The
following lemma characterizes the rules which can be added to a system preserv-
ing its consistency and confluence. It serves as a basis for the construction of the
forcing set in the next section.

Lemma 4.7 For a confluent and consistent system R and a rule r ∉ R*, the
system R ∪ {r} is confluent and consistent iff
(i) r does not have the form s ⊗ s, where ⊗ ∈ {⊀, ⊁, ⌣̸};
(ii) for any critical literal l formed by any r' ∈ R ∪ {r} overlapping (or overlapped
by) r, l ∈ R*.

5 The Completeness Theorem

The proof system I is used to derive a clause from a given specification S "by
contradiction": to prove that a clause C = {a_1, ..., a_n} follows from S, one takes
the negation of C, namely the set of unary clauses neg(C) =def {¬a_1; ...; ¬a_n},
adds it to S, and tries to derive the empty clause from the resulting set of clauses.
Proving refutational completeness, we have to show that if some set of ground
clauses S has no model, then the empty clause is derivable using the rules of I.
The usual way to prove this is to show that there exists a model satisfying all
the clauses from S if the empty clause is not derivable from S. In our proof
we follow the ideas from [S-A92] and [BG93] which, in turn, develop the ideas
of [Bez90]. A similar proof using forcing is given in [PP91]. All these works are
concerned with first-order predicate calculus with equality. In [BG93], a similar
proof method is used with respect to transitive relations.
Our construction proceeds in two main steps. Given a consistent set S of
clauses, we select a set of atoms R (section 5.1) and show that R is a forcing
set for the clauses from S. Then (section 5.2) we show that R can be used to
construct a multimodel which satisfies S.
We call a set of clauses S consistent if it does not contain the empty clause.
The redundancy of clauses in S will be defined during the model construction.
The redundancy notion was developed by Bachmair and Ganzinger [BG91] to cover
simplification techniques commonly used in theorem provers. Referring to this
notion, we fix a set S and assume it is consistent and relatively closed, meaning
that any application of a rule from I with premises from S produces a clause
that is in S or is redundant in S. The main result is

Theorem 5.1 (Ground-completeness) If a set of ground clauses S is con-
sistent and relatively closed then it has a model.

In the following sections we merely indicate the main steps and results needed
in the proof of this main theorem.

5.1 Forcing Set, Redundancy and Productive Clauses


We borrow the notion of forcing from [PP91] where it is also used in a complete-
ness proof:

Definition 5.2 A set of ground atoms A forces

- a ground atom a if a ∈ A*, and the literal ¬a if a ∉ A*;
- a clause C if it forces some literal l ∈ C;
- a set of clauses S if it forces all clauses from S.

In the last case we say that A is a forcing set for S. We write A ⊩ w if A
forces w. For a consistent set S of ground clauses we will construct a set R of
ground atoms forcing S. All such atoms can be oriented into rules because of our
assumption about an ordering of terms; therefore we can treat R as a system
of rules. Since the constructed R will be confluent, it suffices to consider only
rewriting proofs.
The starting point of the model construction is the set of maximal literals

    A_0 =def { max(C) : C ∈ S }.                                          (4)

In rewriting proofs all terms are not bigger than the maximal term of the lit-
eral being proved. This admits an incremental construction of the model, starting
with A_0 and removing redundant literals.
Redundancy of clauses is defined relative to two sets: one set of clauses S
and one of ground atoms A. This is an intermediate notion; the final one refers
only to S. We have already fixed the set of clauses S to shorten our formulations.
For a given literal l and a set s of literals, the set s_l =def {a ∈ s : a < l} contains
all the literals from s that are smaller than l.

Definition 5.3 A clause C ∈ S with max(C) = l is redundant in a set of
ground atoms A if either
- A_l ⊩ C, or
- S contains another clause C' < C with max(C') = l, such that A_l ⊩ C'.

The nature of the second condition of the definition may not be very clear but,
thanks to this condition, the whole definition is a negated assertion about some
minimality of a clause. Statements of this kind are very appropriate in inductive
proofs, like our proof of completeness. The redundancy of literals is based on
redundancy of clauses and Lemma 4.7.
Definition 5.4 A ground literal l is redundant in a set of ground atoms A if
either
- l = s ⊗ s, where ⊗ ∈ {⊀, ⊁, ⌣̸}, or
- A ∪ {l} contains a rule r overlapping l and forming the critical literal a, such
  that A ⊩ a, or
- every clause C ∈ S with max(C) = l is redundant in A.
We write red(A, w) to indicate that w is redundant in A. Observe that Defi-
nitions 5.4 and 5.2 of redundancy and forcing are so related that all negative
literals that are not forced are redundant. Since any forced literal makes all
clauses containing it redundant, every negative literal turns out to be redundant.
After all the preliminary definitions, the definition of the forcing set is quite
short. The set is defined as the limit of a decreasing sequence of sets, which begins
with A_0 defined in (4). Succeeding sets are obtained by removing minimal redundant
literals. Suppose A_i is already known, and let l_i be the minimal redundant literal
in A_i:

    A_{i+1} =def A_i \ {l_i},        R =def ⋂_{i∈ℕ} A_i.                  (5)

The next lemma shows that redundancy is preserved when taking the limit in
the definition of R, and that redundancy of a word in some A_i is equivalent to
its redundancy in R.

Lemma 5.5 For a literal l ∈ A_0 (a clause C with max(C) = l):

    ∃i, l_i > l : red(A_i, l)  ⇔  (∀j > i : red(A_j, l))  ⇔  red(R, l).

As an immediate consequence of this lemma and Lemma 4.7 we obtain

Corollary 5.6 R is confluent.

With every atom a from R there is associated a clause from S which causes a to
be included in R. In [S-A92] such clauses are called regular, in [BG91] productive,
because they produce the atoms included in the forcing set. In [PP91], where
the forcing method is presented, no special notion for clauses of this kind is used.

Definition 5.7 A clause C, l with max(C, l) = l is productive for l in a set A
if A_l ⊮ C.
The main properties of the set R are expressed in the following theorem:

Theorem 5.8 Let S be a consistent and relatively closed set of ground clauses, and
let A_0 and R be as defined by (4) and (5). Any literal l ∈ A_0 satisfies the following
conditions:
I1. if ¬red(R_l, l), then for any a ∈ R_l there exists a clause C ∈ S productive for
    a in R_l;
I2. if red(R_l, l), then for any clause C ∈ S with max(C) = l : R_l ⊩ C.

The theorem is formulated in the form of an induction statement. After taking
the limit R = ⋃_{l∈A_0} R_l, the theorem means that any atom in R has a productive
clause in S (this is an auxiliary assertion) and that R ⊩ S. This is the main
technical result of this paper.

5.2 From the Forcing Set to a Multialgebra

Thus, we have shown that for a consistent and relatively closed set S of ground
clauses, the set of ground atoms R is a model of S in the sense that it forces
all the clauses from S. To complete the proof of Theorem 5.1 we need to show
that the existence of such an R implies the existence of a multialgebra A which
satisfies all the atoms from R* and only these ones. Then, from the definition of
forcing it follows that A also satisfies all the clauses of S.
The rewriting closure R* defines a reflexive transitive relation '≺' on the set
of terms T(F). In multialgebras this partial (pre)order '≺' on terms is interpreted
as set inclusion: an atom s ≺ t means that [[s]]^A ⊆ [[t]]^A. The two other predicates
of our language also have a natural interpretation in partial-order terms:
- s ≐ t means that both sets [[s]]^A and [[t]]^A are equal minimal elements in
  the partial order of nonempty sets, i.e., they denote the same set with one
  element;
- s ⌣ t means that there exists some minimal element a such that a ∈ [[s]]^A
  and a ∈ [[t]]^A.

The relation '≺' on the set T(F) is a partial preorder, because different terms
may have the same value set. To turn it into a partial order we have to take the
quotient of R* modulo '≗', which was defined in (1). Since ≗ denotes set equality, it
is obviously a congruence: reflexivity and transitivity follow from the analogous
properties of inclusion '≺', symmetry from (1), and the replacement property
is given in (2.3).
First, given a set R of ground atoms, we construct a partially ordered set
PO(R) = (C, ⊑). Then we extend the signature with new constants and R with
new atoms, so that PO(R) defines a multialgebra satisfying all the atoms in R.
Let C =def R*/≗ be the quotient of R* modulo '≗', let [t] =def {s ∈ T(F) : s ≺ t ∈
R* ∧ t ≺ s ∈ R*} denote the equivalence class in C of a term t, and let ⊑ =def
{([s], [t]) : s ≺ t ∈ R*} be the partial order on C ('⊏' is the irreflexive part of this
relation). (That (C, ⊑) is well defined, i.e., that all atoms in one equivalence class
stand in the same relations, follows from Table 1: for any atom s ⊕ t, any t ≺ u
or t ≻ u is enough to derive also s ⊕ u.) Let D =def {[s] : s ∈ T(F), s ≐ s ∈ R*}
be the set of deterministic elements in C, let D(T) =def {S ∈ D : S ⊑ T} be the
set of deterministic elements which are smaller than or equal to T ∈ C, and let
M =def {S ∈ C : ∀T ∈ C, ¬(T ⊏ S)} be the set of minimal elements in C. M(S)
denotes the set of elements from M that are smaller than or equal to S. We will
write M[t] (or D[t]) instead of M([t]) (respectively, D([t])).
The set D is a subset of M, because from s ≐ s and s ≻ u it follows that
s ≐ u, meaning that no elements lie below the class [s] if [s] ∈ D. The set
D is a "candidate" to be a carrier of a multialgebra, and D(_) should be an
interpretation mapping defining the multialgebra. Definition 2.2 tells us what
properties D[_] should have in order to be an interpretation mapping:
PO1. D[f(t_1, ..., t_n)] = ⋃ { D[f(a_1, ..., a_n)] : a_i ∈ D[t_i] }
      for any f ∈ F_n, {t_1, ..., t_n} ⊆ T(F);
PO2. D[s] = {[s]}  ⇔  s ≐ s ∈ R*;
PO3. D[s] ⊆ D[t]  ⇔  s ≺ t ∈ R*;
PO4. D[s] ∩ D[t] ≠ ∅  ⇔  s ⌣ t ∈ R*.
If D[_] satisfies PO1-PO4, then a multialgebra A satisfying R can be defined:
MA1. the carrier S^A =def D;
MA2. for any constant c ∈ F: c^A =def D[c];
MA3. for any f ∈ F_n and {[d_1], ..., [d_n]} ⊆ D:
      f^A([d_1], ..., [d_n]) =def D[f(d_1, ..., d_n)].
In this case [[_]]^A = D[_], which follows from PO1, and we say that R defines
the multialgebra A. The next result is important, because the multialgebra is
constructed from positive atoms only, while clauses from S may also contain
negative atoms. We must be sure that the multialgebra makes true only atoms
derivable from the forcing set R.

Lemma 5.9 If R defines a multialgebra A then, for any atom a : a ∈ R*  ⇔
[[a]]^A is true.

However, the mapping D[_] defined from the forcing set R may violate any of
the requirements PO1-PO4. We might therefore consider the mapping M[_] in-
stead, but this one may also violate these requirements. To meet these problems
we have to extend C (and the mapping D[_]) with new minimal elements. For in-
stance, there may be no (deterministic) element validating an atom s ⌣ t ∈ R*,
or even no such element included in a given term t (which therefore would denote
the empty set). We do not give here the rather elaborate details of this extension.
Its result is that the signature is extended with new constants, and new atoms
are added imposing the required properties on these new elements. In particu-
lar, all new elements are deterministic, which makes the mappings D[_] and M[_]
coincide. The crucial property of this extension is that it is conservative.

Definition 5.10 Let F ⊆ F_1 be two signatures and R, R_1 two sets of atoms
over the signatures F, resp. F_1. R_1 is a conservative extension of R if for any F-atom
a, a ∈ R_1*  ⇔  a ∈ R*.

Thus, we can complete the set R with elements and atoms needed to construct
a multialgebra satisfying exactly the same atoms which are members of R. Since
R ⊩ S, this means that we obtain a multialgebraic model of S. This last step of
the construction is expressed in:

Theorem 5.11 For any atom set R there exists an atom set R_1 that is a con-
servative extension of R and defines a multialgebra.

This ends the proof of the completeness theorem which also yields:

Corollary 5.12 Any ground atom valid in all multimodels of a given set of ground atoms R has a rewriting proof in R.

6 Conclusion

Motivated by the study of nondeterministic specifications, we have introduced a system for reasoning with the set-relations: inclusion, intersection and identity of 1-element sets. The system gives rise to a rewriting technique where atoms involving these relations are rewritten to other atoms, and chaining is based on the explicitly specified composition laws for the introduced relations. We have shown that the reasoning system and, consequently, the associated rewriting are sound and complete for the ground case.
We have not addressed the analogous problem for the general case involving variables. Admitting variables ranging over sets should not present particular problems and can probably be incorporated without much difficulty into our framework. However, a significant role is played by the variables which are allowed to range only over individuals (e.g., [Hus93], [WM95b]). Unfortunately, since in general terms denote arbitrary sets, such variables do not admit unrestricted substitution. Consequently, one should not expect the possibility of a straightforward lifting of the presented results to a language containing such variables. This issue is now under investigation.

References
[Bez90] M. Bezem. Completeness of Resolution Revisited. TCS, 74, pp. 227-237, (1990).
[BG91] L. Bachmair, H. Ganzinger. Rewrite-Based Equational Theorem Proving with
Selection and Simplification. Technical Report MPI-I-91-208, Max-Planck-
Institut f. Informatik, Saarbrücken, (1991).
[BG93] L. Bachmair, H. Ganzinger. Rewrite Techniques for Transitive Relations. Tech-
nical Report MPI-I-93-249, Max-Planck-Institut f. Informatik, Saarbrücken,
(1993). [to appear in LICS'94]
[DJ90] N. Dershowitz, J.-P. Jouannaud. Rewrite systems. In: J. van Leeuwen (ed.)
Handbook of theoretical computer science, vol. B, chap. 6, pp.243-320. Amster-
dam: Elsevier, (1990).
[DM79] N. Dershowitz, Z. Manna. Proving termination with multiset orderings. Com-
munications of the ACM, 22:8,pp.465-476, (1979).
[DO92] A. Dovier, E. Omodeo, E. Pontelli, G.-F. Rossi. Embedding finite sets in a logic
programming language. LNAI 660, pp. 150-167, Springer Verlag, (1993).
[Hes88] W.H. Hesselink. A Mathematical Approach to Nondeterminism in Data Types.
ACM ToPLaS 10, pp.87-117, (1988).
[Hus92] H. Hussmann. Nondeterministic algebraic specifications and nonconfluent
term rewriting. Journal of Logic Programming, 12, pp. 237-255, (1992).
[Hus93] H. Hussmann. Nondeterminism in Algebraic Specifications and Algebraic Pro-
grams. Birkhäuser, Boston, (1993).
[Jay92] B. Jayaraman. Implementation of Subset-Equational Programs. Journal of
Logic Programming, 12:4, pp.299-324, (1992).
[Kap88] S. Kaplan. Rewriting with a Nondeterministic Choice Operator. TCS, 56:1,
pp.37-57, (1988).
[KW94] V. Kriaučiukas, M. Walicki. Reasoning and Rewriting with Set-Relations I:
Ground-Completeness. Technical Report no. 96, Dept. of Informatics, Univer-
sity of Bergen (1994).
[LA93] J. Levy, J. Agusti. Bi-rewriting, a term rewriting technique for monotonic or-
der relations. In RTA '93, LNCS, 690, pp.17-31. Springer-Verlag, (1993).
[PP91] J. Pais, G.E. Peterson. Using Forcing to Prove Completeness of Resolution and
Paramodulation. Journal of Symbolic Computation, 11:(1/2), pp.3-19, (1991).
[S-A92] R. Socher-Ambrosius. Completeness of Resolution and Superposition Cal-
culi. Technical Report MPI-I-92-224, Max-Planck-Institut f. Informatik,
Saarbrücken, (1992).
[SD86] J. Schwartz, R. Dewar, E. Schonberg, E. Dubinsky. Programming with sets, an
introduction to SETL. Springer Verlag, New York, (1986).
[Sto93] F. Stolzenburg. An Algorithm for General Set Unification. Workshop on Logic
Programming with Sets, ICLP'93, (1993).
[Wal93] M. Walicki. Algebraic Specifications of Nondeterminism. Ph.D. thesis, Insti-
tute of Informatics, University of Bergen, (1993).
[WM95] M. Walicki, S. Meldal. A Complete Calculus for Multialgebraic and Func-
tional Semantics of Nondeterminism. [to appear in ACM ToPLaS, (1995).]
[WM95b] M. Walicki, S. Meldal. Multialgebras, Power Algebras and Complete Calculi
of Identities and Inclusions, Recent Trends in Data Type Specification, LNCS,
906, (1995).
Resolution Games and Non-Liftable Resolution
Orderings
Hans de Nivelle,
Department of Mathematics and Computer Science,
Delft University of Technology,
Julianalaan 132, 2628 BL, the Netherlands,
email: nivelle@cs.tudelft.nl

Abstract
We prove the completeness of the combination of ordered resolution and
factoring for a large class of non-liftable orderings, without the need for
any additional rules like saturation. This is possible because of a new proof
method which avoids making use of the standard ordered lifting theorem.
This proof method is based on resolution games.

1 Introduction
Resolution was introduced in ([Robins65]) and is still among the most successful
methods for automated theorem proving in first order logic. (See [ChangLee73]).
Although resolution is efficient, it is not efficient enough. Therefore so-called
refinements of resolution have been designed, which can improve efficiency quite
a lot, without losing completeness. In this paper we will consider ordering re-
finements. Ordering refinements are a restriction of the resolution rule. With
refinements two types of improvement can be gained: First resolution refine-
ments simply improve efficiency, which means that less memory will be used,
and less time will be spent on finding a proof if it exists. Second it can be
shown that certain resolution refinements are terminating on certain clause sets
for which unrestricted resolution would be non-terminating. Thus it is possible
to obtain decision procedures with resolution. This approach was initiated by
([Joy76]) and ([Zam72]). ([FLTZ93]) contains an overview of the results reached
in this field. The general strategy for proving the completeness of an ordering
refinement is as follows: (1) Prove the completeness of the refinement for the
ground level. (2) Then show that a refutation of a certain set of ground clauses
can be lifted to the non-ground level. For the first part it has been shown
that resolution with every ordering on ground literals is complete. For the sec-
ond part, the ordering must have a special property which is called liftability:

A ≺ B ⇒ Aθ ≺ Bθ. This property is problematic because any ordering that satisfies it must leave many literals uncompared. For example, the literals p(X) and p(s(Y)) cannot be compared. Suppose that p(X) ≺ p(s(Y)). Then, because of liftability, ≺ is preserved by both {X := s(s(0)), Y := 0} and {X := s(0), Y := s(0)}. However, this results in p(s(s(0))) ≺ p(s(0)) and p(s(0)) ≺ p(s(s(0))), which contradicts the fact that ≺ is an ordering. In the same way p(s(Y)) ≺ p(X) is impossible. From the efficiency point of view it is desirable to compare as many literals as possible, because then as few resolution inferences as possible will be made.
It is also desirable from the decision point of view to be able to drop this liftabil-
ity property, because certain non-liftable orderings have been proven terminat-
ing for certain clause sets, but it was not known whether or not these orderings
were complete. (See [FLTZ93]). We can positively answer this question here.
It is for these reasons that we will study resolution with non-liftable orderings
here. We will prove two completeness theorems for two types of non-liftable
orderings. For these proofs we will make use of a device called the resolution game.
The resolution game can be seen as ordered resolution in which a certain
counterplayer can change the ordering at certain moments. We begin by re-
peating some basic definitions.

Definition 1.1 An order < is a relation which satisfies the following properties:
O1 For no d is it the case that d < d.
O2 For each d1, d2 and d3: if d1 < d2 and d2 < d3, then d1 < d3.
An order < is total if
O3 whenever d1 ≠ d2, then either d1 < d2 or d2 < d1.
< is well-founded if there is no infinite sequence d0, d1, d2, …, such that d0 > d1 > d2 > … > di > ….
A total, well-founded order is called a well-order.

Definition 1.2 Let F be a finite set of function symbols with arities attached to them, V be a countably infinite set of variables, and let P be a finite set of predicate symbols with arities attached to them. We define terms as follows: (1) Every variable, or function symbol in F with arity 0, is a term. (2) If f is an element of F with arity n, and t1, …, tn are terms, then f(t1, …, tn) is a term. There are no other terms than those defined by these rules.
If p is a predicate symbol with arity n, and t1, …, tn are terms, then p(t1, …, tn) is an atom. A literal is an atom p(t1, …, tn), or its negation ¬p(t1, …, tn). The complexity #t of a term t is defined by #v = 1 for a variable v, #f = 2 for a 0-ary function symbol f ∈ F, and #f(t1, …, tn) = 2 + #t1 + … + #tn for an n-ary function symbol f ∈ F.
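The complexity measure #t is used later to compare (instances of) literals, so a small illustration may help. The following sketch is ours, not from the paper; the tuple encoding of terms — ("var", name) for variables and (f, t1, ..., tn) for function applications — is an assumption of this illustration.

```python
# Illustrative sketch (our encoding, not the paper's): a variable has
# complexity 1, and a function symbol contributes 2 plus its arguments.
def complexity(t):
    if t[0] == "var":
        return 1
    return 2 + sum(complexity(arg) for arg in t[1:])

print(complexity(("s", ("var", "X"))))            # s(X): 2 + 1 = 3
print(complexity(("f", ("a",), ("var", "Y"))))    # f(a, Y): 2 + 2 + 1 = 5
```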

Definition 1.3 A substitution θ is a finite set of the form {v1 := t1, …, vn := tn}, where each vi ∈ V and each ti is a term. The effect of a substitution θ on a literal A is defined as usual: as the result of replacing simultaneously all vi in A by ti. Because of this it must be the case that for no i ≠ j we have vi = vj; otherwise the effect on literals containing vi = vj would be undefined.
A literal A is an instance of a literal B if A can be obtained from B by a substitution. We call A a renaming of B if A is an instance of B and B is an instance of A. In that case we also call A and B equivalent. We call A a strict instance of B if A is an instance of B and A and B are non-equivalent.
If A and B are literals and θ is a substitution such that Aθ = Bθ, then both θ and Aθ are called a unifier of A and B.
If θ is a unifier of A and B, then θ is called a most general unifier if for every unifier Ξ of A and B, and every literal C, it is the case that CΞ is an instance of Cθ.
If A is an instance of B and θ = {v1 := t1, …, vn := tn} is a ⊆-minimal substitution such that A = Bθ, then we define the complexity of the instantiation as #t1 + … + #tn.

It has been proven in [Robins65] that there exists an algorithm that takes as input two atoms (or literals), computes a most general unifier if they are unifiable, and reports failure otherwise.
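As a concrete reference point, here is a compact sketch of such a unification procedure on the tuple encoding introduced above; it is our own illustration of the standard algorithm, not the formulation in [Robins65].

```python
# Sketch of syntactic unification on nested-tuple terms; a substitution is a
# dict mapping variable names to terms, and None signals failure.
def apply(t, subst):
    if t[0] == "var":
        return apply(subst[t[1]], subst) if t[1] in subst else t
    return (t[0],) + tuple(apply(a, subst) for a in t[1:])

def occurs(v, t, subst):
    t = apply(t, subst)
    return t[1] == v if t[0] == "var" else any(occurs(v, a, subst) for a in t[1:])

def unify(s, t, subst=None):
    subst = {} if subst is None else subst
    s, t = apply(s, subst), apply(t, subst)
    if s[0] == "var":
        if s == t:
            return subst
        if occurs(s[1], t, subst):
            return None                       # occurs check fails
        subst[s[1]] = t
        return subst
    if t[0] == "var":
        return unify(t, s, subst)
    if s[0] != t[0] or len(s) != len(t):      # clash of function symbols
        return None
    for a, b in zip(s[1:], t[1:]):
        subst = unify(a, b, subst)
        if subst is None:
            return None
    return subst

# mgu of p(X, f(Y)) and p(a, f(b)) is {X := a, Y := b}
print(unify(("p", ("var", "X"), ("f", ("var", "Y"))),
            ("p", ("a",), ("f", ("b",)))))
```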

Definition 1.4 A clause is a finite set of literals. A clause {A1, …, Ap} should be read as the first order formula ∀x̄(A1 ∨ … ∨ Ap). Here x̄ are the variables that occur in the clause.
We call a clause decomposed if all literals in it have exactly the same variables.
An L-ordering ⊑ is an ordering on literals. If ⊑ is an L-order, then a literal L is maximal in a clause c if (1) L ∈ c, and (2) for no L' ∈ c we have L ⊑ L'.

Note that, because ⊑ is an order, and clauses are finite, every non-empty clause has at least one maximal element.
Resolution is a refutation method: if one wants to prove a formula one has to try to refute its negation.

Definition 1.5 Ordered resolution. We define the resolution rule: Let c1 and c2 be clauses, such that (1) c1 and c2 can be written as c1 = {A1} ∪ r1 and c2 = {¬A2} ∪ r2, (2) A1 is ⊑-maximal in c1, and ¬A2 is ⊑-maximal in c2, and (3) A1 and A2 are unifiable with mgu θ. Then r1θ ∪ r2θ is an ordered resolvent of c1 and c2. We write c1, c2 ⊢ r1θ ∪ r2θ.
Ordered factoring. Let c be a clause containing 2 literals A1 and A2, such that (1) A1 and A2 are unifiable with mgu θ, and (2) A1 is ⊑-maximal in c. Then cθ is an ordered factor of c. Notation: c ⊢_f cθ.
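Reusing the unify/apply sketch above, one ordered-resolution step can be pictured as follows. The clause encoding (frozensets of sign/atom pairs) and the caller-supplied comparison lt are assumptions of this illustration, not notation from the paper.

```python
# Sketch of one ordered-resolution step; literals are ("+", atom) or ("-", atom)
# and a clause is a frozenset of such literals.
def maximal(lit, clause, lt):
    # lit is maximal if no other literal of the clause lies above it w.r.t. lt
    return all(not lt(lit, other) for other in clause if other != lit)

def ordered_resolvent(c1, c2, lt):
    for (s1, a1) in c1:
        for (s2, a2) in c2:
            if s1 == "+" and s2 == "-" \
               and maximal((s1, a1), c1, lt) and maximal((s2, a2), c2, lt):
                theta = unify(a1, a2)
                if theta is not None:
                    rest = (c1 - {(s1, a1)}) | (c2 - {(s2, a2)})
                    return frozenset((s, apply(a, theta)) for (s, a) in rest)
    return None        # no ordered resolvent exists for this clause pair
```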

We have not defined unrefined resolution. Unrefined resolution can be obtained by dropping the ordering conditions in Definition 1.5.

Definition 1.6 We call ⊑ liftable if A ⊑ B ⇒ Aθ ⊑ Bθ.



This property ensures that if a literal Aiθ is maximal in a clause {A1θ, …, Apθ}, then its uninstantiated counterpart Ai in {A1, …, Ap} is also maximal. This makes lifting possible. The next theorem is the standard ordered resolution theorem.
Theorem 1.7 Ordered resolution with ordered factoring is complete for any liftable L-order.
L-orders are a slight generalization of the more well-known A-orders. An A-order is an order on atoms, which is extended to literals by the rule A ⊑ B ⇒ A ⊑ ¬B, ¬A ⊑ B, ¬A ⊑ ¬B. Although every extension of an A-order is an L-order, the converse is not true. For example P ⊑ Q ⊑ ¬Q ⊑ ¬P is an L-order, but not the extension of an A-order. It has been known that A-ordered resolution with factoring is complete since ([KH69]).

2 Non-Liftable Orderings
We will now give the two completeness theorems for non-liftable orderings. For the proof we develop the resolution games in the next section. After that we prove the two completeness theorems in Section 5.
Theorem 2.1 Let ⊑ be an L-order such that
REN If A ⊑ B, then for all renamings Aθ1 of A and Bθ2 of B, we must have Aθ1 ⊑ Bθ2,
SUBST For every A and strict instance Aθ of A it must be that Aθ ⊑ A.
Then the combination of ⊑-ordered resolution and factoring is complete.
Theorem 2.1 implies the completeness of resolution with any relation that is included in an order satisfying the conditions. An example is the ordering defined by L1 ⊑ L2 iff #L1 > #L2. Another possibility is an alphabetic, lexicographic ordering on term structure.
Theorem 2.2 Let ⊑ be an order such that
REN if A and B contain exactly the same variables, and A ⊑ B, then for all substitutions θ1 and θ2 such that (1) Aθ1 is a renaming of A, (2) Bθ2 is a renaming of B, (3) Aθ1 and Bθ2 have exactly the same variables, we have Aθ1 ⊑ Bθ2.
Then ⊑-ordered resolution with factoring is complete for every set of decomposed clauses.
It is impossible that p(X, Y) ⊑ q(X, Y) and q(Y, X) ⊑ p(X, Y). This would imply p(X, Y) ⊑ p(Y, X), which would imply p(X, Y) ⊑ p(X, Y).
The <~ order, together with the E⁺-class, defined in ([FLTZ93]), pp. 82, satisfies the conditions mentioned here. There is no space for details here, but the check is easy.

3 Resolution Games
In this section we define resolution games and give a completeness result for resolution games. The proof is given in the next section. We need precise control over the factoring rule; therefore it is necessary to define clauses as multisets instead of ordinary sets. So we define:
Definition 3.1 A multiset is a set which is able to distinguish how often an element occurs in it. We write [A1, …, Ap] for the multiset containing A1, …, Ap. Unlike in the set {A1, …, Ap}, it is meaningful to repeat elements in the list. The union of 2 multisets S1 ∪ S2 is obtained by summing the number of occurrences of each element. The difference set of 2 multisets S1\S2 is obtained by subtracting, for each element, the number of occurrences in S2 from the number of occurrences in S1. If this results in a negative number then the number of occurrences is put to 0.
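These multiset operations correspond exactly to what Python's collections.Counter provides, which may serve as a quick mental model (our illustration, not part of the paper):

```python
from collections import Counter

# Multiset union sums multiplicities; multiset difference subtracts them and
# drops elements whose count would become negative, as in Definition 3.1.
m1 = Counter({"a": 2, "b": 1})
m2 = Counter({"a": 1, "c": 3})
print(m1 + m2)   # Counter({'a': 3, 'c': 3, 'b': 1})
print(m1 - m2)   # Counter({'a': 1, 'b': 1})
```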

Definition 3.2 A (binary) resolution game is an ordered triple 𝒢 = (P, A, ≺), where
- P is a set of propositional symbols. We define a literal of 𝒢 as a propositional symbol p or its negation ¬p.
- A is a set of attributes,
- ≺ is an order on ℒ × A, where ℒ is the set of literals. It must be the case that ≺ is well-founded on ℒ × A.
An indexed literal is a pair L : a consisting of a literal L and an attribute a ∈ A. A clause of 𝒢 is a finite multiset of indexed literals of 𝒢.

The meaning of a clause is the disjunction of its literals. So the clause [a1 : A1, …, ap : Ap] has as meaning a1 ∨ … ∨ ap. Accordingly we call a set of clauses satisfiable if the set of its meanings is satisfiable. We define:

Definition 3.3
Resolution Let c1 and c2 be two clauses, such that (1) c1 can be written as c1 = [r : R1] ∪ [a1 : A1, …, ap : Ap], and c2 can be written as c2 = [¬r : R2] ∪ [b1 : B1, …, bq : Bq], (2) r : R1 is ≺-maximal in c1, and ¬r : R2 is ≺-maximal in c2. Then [a1 : A1, …, ap : Ap] ∪ [b1 : B1, …, bq : Bq] is a resolvent of c1 and c2.
Factoring Let c = [a1 : A1, …, ap : Ap] be a clause, such that (1) a1 : A1 is maximal in c, (2) a1 = ai for some i > 1. Then c\[ai : Ai] is a factor of c.
Reduction Let c = [a1 : A1, …, ap : Ap] be a clause. A reduction of c is obtained by replacing zero, one or more ai : Ai by an ai : Ai', such that ai : Ai' ≺ ai : Ai. It is also possible to delete literals in the clause. (Note that there is no maximality restriction here.)

We can now define how the game is played.

Definition 3.4 Let C be a finite set of clauses of a resolution game 𝒢. There are two players.
The opponent The opponent will try to derive the empty clause by computing factors and resolvents.
The defender The defender will try to prevent this by replacing newly derived clauses by reductions.
There are two sets G and N. The set G contains all the derived clauses, and N contains the clauses of the last generation. The game starts with G = ∅ and N = C. Then:
1. The defender can replace any clause in N by a reduction. So he can make 0, 1 or any finite number of replacements. When the defender is finished, N is added to G and N is emptied.
2. Now the opponent can compute any ordered resolvent, or ordered factor, of clauses in G. The result is put in N. He can derive as many clauses as he wants in one turn, but he cannot use the new clauses because they are in N. When he is finished the defender is on turn again.
The game ends when the opponent succeeds in deriving the empty clause. In that case the opponent is the winner. If the defender succeeds in preventing this, the defender is the winner. Unfortunately for him, he will not enjoy his victory at a finite time, because in this case the game may last forever.
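The following schematic sketch shows the shape of one run of such a game. It is our own simplification: clauses are frozensets rather than multisets, the two players are modelled as caller-supplied functions, and the run is cut off after a fixed number of rounds.

```python
# Schematic game loop: defend(c) returns a reduction of clause c, and
# attack(G) returns the set of ordered resolvents/factors of clauses in G.
def play(C, defend, attack, max_rounds=1000):
    G, N = set(), set(C)
    for _ in range(max_rounds):
        N = {defend(c) for c in N}      # the defender may reduce new clauses
        G |= N                          # the last generation is added to G
        N = attack(G) - G               # the opponent derives new clauses into N
        if frozenset() in N:
            return "opponent wins"      # the empty clause has been derived
        if not N:
            return "defender survives"  # nothing new can be derived
    return "undecided after max_rounds"
```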

We have defined the resolution game in such a way that the defender can only affect newly derived clauses. We could also have defined the resolution game in such a way that the defender is allowed to reduce any clause. In that case Theorem 3.5 still holds.
The resolution game is different from lock or indexed resolution [Boyer71], because in lock resolution the resolvent inherits the indices from the parent clauses without any changes. We have the following theorem:

Theorem 3.5 Let C be a set of clauses of a resolution game 𝒢. (1) If C is unsatisfiable, then the opponent of the resolution game can play in such a way that he is guaranteed to derive the empty clause at a finite moment. (2) If C is satisfiable then the defender can play in such a way that the opponent will not derive the empty clause.

We call the first part of the theorem completeness, and the second part soundness. The proof of the soundness is not difficult. All the actions of the opponent are semantically sound. The defender can play in such a manner that his actions are sound, by never deleting a literal. This guarantees that the empty clause will not be derived if C is satisfiable. The proof of the completeness is more difficult. We give the main part of it in the next section. Here we only show that it is sufficient to consider resolution games Γ = (P, A, ≺) in which ≺ is total. We have the following lemma.

Lemma 3.6 Every well-founded order is contained in a well-order.

So for every resolution game Γ it is possible to obtain a resolution game Γ' by replacing ≺ by a well-order ≺'. Then we have:
COPY1 Every resolvent, or factor, that can be computed with Γ' can also be computed with Γ. This is because a literal that is maximal w.r.t. ≺' will certainly be maximal w.r.t. ≺.
COPY2 Every reduction that can be made with Γ is also a reduction with Γ'.
We will show that the completeness of the game Γ' implies the completeness of Γ. It is for this reason that it is sufficient to consider games in which the order is total. An opponent of a set of clauses C playing Γ can simultaneously play a game using game Γ' as defender. He will copy the moves from the opponent of Γ' to Γ, and copy the moves from the defender of Γ to Γ'. This goes as follows: The opponent of a set of clauses C with game Γ starts a simultaneous game as defender of C using game Γ'. Then he proceeds as follows:
1. He waits for the defender on game Γ to make his reductions.
2. After this he can imitate the reductions made by the defender on Γ onto Γ'. This is possible because of COPY2.
3. Then he waits for the opponent of Γ' to compute his factors and resolvents.
4. When the opponent of game Γ' is finished he imitates his moves on Γ. This is possible because of COPY1. After this he continues at 1.
Because the opponent of Γ' will derive the empty clause if the initial clause set is unsatisfiable, the opponent of Γ will derive the empty clause, and win the resolution game. So it is sufficient to prove the completeness of resolution games for those resolution games in which ≺ is a well-order. We will do this in the next section. We end with an example:

Example 3.7 Let ≺ be defined from:
¬c:0 ≺ b:0 ≺ ¬a:0 ≺ ¬a:1 ≺ b:1 ≺ c:0 ≺ ¬c:1 ≺ a:0 ≺ c:1 ≺ ¬c:2 ≺ ¬b:0 ≺ ¬a:2 ≺ ¬b:1 ≺ b:2 ≺ c:2 ≺ ¬b:2 ≺ a:1 ≺ a:2.
Let C be the following unsatisfiable set of clauses:
[b:2, c:2, a:2]   [c:2, ¬b:2]   [¬c:2]   [¬a:2, b:2].

The clauses are sorted according to ≺, so each last literal is the selected literal. If the defender doesn't make any reductions then the resolvent [¬a:2, c:2] is possible. This clause can be reduced to, for example, [¬a:0, c:0], [c:1, ¬a:2], or [¬a:0, c:2]. The defender can also replace the initial clause [c:2, ¬b:2] by [¬b:1, c:2]. In that case the only possible resolvent is [¬b:1]. Whatever reductions the defender makes, the empty clause can always be derived.

4 Completeness of Resolution Games
In this section we give the completeness proof of the resolution game. For this proof we need the following notion:
Definition 4.1 Let C be a set of clauses. We call C closed iff
1. For every c1, c2 ∈ C, such that c1 and c2 have a resolvent d, there is a reduction d' of d in C.
2. For every c ∈ C, such that c has a factor d, there is a reduction d' of d in C.
A set C' is a closure of a clause set C if C' contains a reduction of every c ∈ C.

We will prove completeness of resolution games by showing that every closed set that does not contain the empty clause is satisfiable. This implies completeness: Suppose that this holds, while resolution games are not complete. Then there is a clause set C of a resolution game 𝒢, such that C is unsatisfiable, and whatever the opponent does, the defender can block derivation of the empty clause. Then, when the opponent produces all possible clauses in each move, the union of the successive generations C' = ⋃_{i≥0} G_i is a closed set and a closure of C. By assumption this set does not contain the empty clause. Then C' must be satisfiable, and this implies that C is satisfiable. This is a contradiction.
We use an adaptation of a proof in [Bezem90] of the completeness of A-ordered hyperresolution. The proof is probably a bit disappointing, because it does not use the game structure, but it involves less technicality than the proof in ([Nivelle94b]). The proof in ([Nivelle94b]) is based on games. We adapt the proof in two steps for the clarity of the presentation. We first give the proof for the case in which the defender never makes a reduction; in that case we have proven the completeness of a variant of lock resolution. After that we make some more adaptations to obtain the completeness of the full resolution game. We will show that every closed set of clauses has a formal model, and that this implies that every closed set of clauses has a model.

Definition 4.2 Let C be a set of clauses of a resolution game 𝒢. We define a formal model M as a set of indexed literals which (1) does not contain a complementary pair ¬a : i1 and a : i2, and (2) contains an indexed literal of every clause.

We have the following simple lemma:
Lemma 4.3 If a set of clauses C has a formal model M, then it has a model.
This can be seen by taking the interpretation I defined by: I(A) = t iff an A : i occurs in M, for each atom A.
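In code, the observation behind Lemma 4.3 is essentially a one-liner; the encoding of indexed literals as ((sign, atom), attribute) pairs is our own assumption.

```python
# A formal model M without complementary pairs induces a propositional
# interpretation: an atom is true iff it occurs positively (with any attribute).
def interpretation(M):
    return lambda atom: any(sign == "+" and a == atom for ((sign, a), _attr) in M)
```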

4.1 Completeness of Restricted Resolution Games

If we consider games in which the defender never makes a reduction, we have that d = d' in both cases of Definition 4.1.
Definition 4.4 Let C be a closed set of clauses. We define an intersection set of C as a set of indexed literals I such that I contains a literal of every c ∈ C.
We will construct an intersection set I such that
MAXUNIQUE for every A : a ∈ I, there is a clause c ∈ C such that A : a is maximal in c, A : a is not repeated in c, and there is no other indexed literal of c in I.

It is the case that if a certain set I is an intersection set of a set of clauses C and I satisfies MAXUNIQUE, then I is a formal model of C. This is seen as follows: Suppose that I contains a complementary pair A : a1 and ¬A : a2. Then there are clauses c1 and c2 such that A : a1 is maximal in c1 and c1\[A : a1] ∩ I = ∅, and ¬A : a2 is maximal in c2 and c2\[¬A : a2] ∩ I = ∅. Now because C is closed under resolution, C contains d = (c1\[A : a1]) ∪ (c2\[¬A : a2]). Then d ∩ I = ∅ and this contradicts the fact that I is an intersection set.
So what remains to show is that there exists an intersection set satisfying MAXUNIQUE. We will construct this intersection set.

Lemma 4.5 Let C be a closed set (in which resolvents and factors are never reduced), such that ∅ ∉ C. Then there exists an intersection set I of C that satisfies MAXUNIQUE.
Proof: Because ≺ is a well-order on the set of indexed literals we can use recursion. Let λ be the ordinal length of ≺. Let Lα be the α-th indexed literal, for 0 ≤ α < λ. With Iα we will denote the construction of the set I up to α. We construct the Iα as follows:
1. I0 = ∅,
2. For any limit ordinal α, let Iα = ⋃_{β<α} Iβ.
3. For any successor ordinal α, put Iα = Iα-1 if Iα-1 ∪ {Lβ | α ≤ β < λ} is an intersection set. Otherwise let Iα = Iα-1 ∪ {Lα-1}. (So at stage α we decide whether or not Lα-1 is added.)
4. Finally put I = Iλ.
We will show that I is an intersection set satisfying MAXUNIQUE.
Suppose that I is not an intersection set. Then there is a clause c ∈ C such that I ∩ c = ∅. Let α be the index of the maximal literal in c, so Lα is the maximal literal of c. Then at stage α+1 of the construction, Iα ∪ {Lα+1, Lα+2, …} is not an intersection set, and Lα would have been added to Iα. This is a contradiction.
It remains to prove that I satisfies MAXUNIQUE. Suppose it does not. Then I contains an indexed literal A : a = Lα-1 such that either
1. Lα-1 does not occur uniquely in a clause c ∈ C. Then at stage α of the construction of I, Lα-1 would not have been added.
2. Lα-1 does occur uniquely in some clauses, but nowhere as maximal element. In that case the set {Lβ | α ≤ β < λ} contains all maximal elements of clauses in which Lα-1 uniquely occurs, and Lα-1 would not have been added at stage α.
3. Lα-1 occurs uniquely and maximally in a clause c, but is repeated. In that case there is a (possibly iterated) factor of c in C in which Lα-1 is not repeated.
End of proof
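For a finite set of indexed literals the transfinite construction in this proof collapses to a simple greedy pass, which the following sketch (our own illustration, not part of the paper) makes explicit:

```python
# Finite-case sketch of the construction in Lemma 4.5: literals are processed
# in ascending <-order; L_i is skipped whenever the part already chosen plus
# all still-unprocessed literals still hits every clause.
def intersection_set(literals, clauses):
    chosen = set()
    for i, lit in enumerate(literals):
        rest = set(literals[i + 1:])
        if all(c & (chosen | rest) for c in clauses):
            continue            # L_i is not needed for hitting every clause
        chosen.add(lit)         # otherwise L_i must be added
    return chosen
```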

4.2 Completeness of Full Resolution Games

We will now adapt this proof to a completeness proof for full resolution games. The first problem that we encounter is that the argument below the definition of MAXUNIQUE does not work anymore, because the resolvent may be reduced. We have to replace Definition 4.4 by
Definition 4.6 Let C be a closed set of clauses. We define an intersection set of C as a set of indexed literals I such that
1. If A : a1 ∈ I, then for all A : a2 such that A : a1 ≺ A : a2, also A : a2 ∈ I.
2. From every clause c ∈ C, there is an element in I.
Then we can replace property MAXUNIQUE by
MAXUNIQUE2 For every A : a1 ∈ I for which there is no A : a2 ∈ I such that A : a2 ≺ A : a1, there is a clause c ∈ C such that A : a1 is maximal in c, A : a1 is not repeated in c, and there is no other indexed literal of I in c.
Now it is possible to repeat the argument below the definition of MAXUNIQUE with a few adaptations. Suppose that an intersection set I of C which satisfies MAXUNIQUE2 contains a complementary pair A : a1 and ¬A : a2. Then I contains minimal elements A : a1' and ¬A : a2' for which A : a1' ⪯ A : a1 and ¬A : a2' ⪯ ¬A : a2. For these literals there must be clauses c1 and c2, such that A : a1' is maximal in c1, ¬A : a2' is maximal in c2, and ((c1\[A : a1']) ∪ (c2\[¬A : a2'])) ∩ I = ∅. Then C contains a reduction d' of this resolvent. It must be the case that d' ∩ I = ∅, because of property 1 in Definition 4.6, and this contradicts property 2 in Definition 4.6. So it remains to show that there exists an intersection set satisfying MAXUNIQUE2.

Lemma 4.7 Let C be a closed set of clauses for which ∅ ∉ C. There exists an intersection set I of C that satisfies MAXUNIQUE2.
Proof: Let Cf ⊆ C be the set of clauses of C with non-repeated maximal elements, i.e., the set of clauses that do not have a factor. We use the same recursion as in the proof of Lemma 4.5. Let λ be the length of ≺. Let Lα be the α-th indexed literal, for 0 ≤ α < λ. Let Iα be the construction of I up to α (here 0 ≤ α < λ). The construction goes as follows:
1. I0 = ∅,
2. For any limit ordinal α, put Iα = ⋃_{β<α} Iβ.
3. For any successor ordinal α do
(a) If there is a literal A : a' ∈ Iα-1 such that Lα-1 = A : a and A : a' ≺ A : a, then Iα = Iα-1 ∪ {Lα-1}.
(b) Otherwise (if no such literal exists), then
i. if Iα-1 ∪ {Lβ | α ≤ β < λ} is an intersection set of Cf, then Iα = Iα-1,
ii. otherwise Iα = Iα-1 ∪ {Lα-1}.
4. Finally we define I = Iλ.
It is not difficult to see that I is an intersection set of C, because every intersection set of Cf is an intersection set of C. We must show that I satisfies MAXUNIQUE2. Let A : a1 ∈ I be such that there is no A : a2 ∈ I for which A : a2 ≺ A : a1 and, despite this, there is no clause c ∈ Cf such that A : a1 is maximal in c and A : a1 is the only literal of I in c. Let α be the moment at which the adding of A : a1 is decided, so A : a1 = Lα-1. There are the following possibilities:
1. Lα-1 does not occur uniquely in a clause c ∈ Cf. Then at stage α, Lα-1 would not have been added.
2. Lα-1 does occur uniquely in some clauses in Cf, but nowhere as maximal element. In that case Iα-1 ∪ {Lβ | α ≤ β < λ} is an intersection set, and Lα-1 would not have been added.
3. Lα-1 does occur uniquely and maximally in a clause c ∈ Cf, but is repeated. This is impossible because of the nature of Cf.
End of proof
We will give two examples demonstrating that resolution games are not complete when (1) the condition that A : a' ≺ A : a in reductions, or (2) the condition that ≺ is well-founded, is dropped. (So, for example, replacing a : 1 by a : 2 is a valid reduction in the first case.)

Example 4.8 Define 𝒢 = (P, A, ≺) from P = {a, b}, A = ℕ, and l1 : n1 ≺ l2 : n2 iff n1 < n2. The clause set C =
[a:0, b:1]   [¬b:0, a:1]   [¬a:0, ¬b:1]   [b:0, ¬a:1]   [a:0, ¬a:1]   [b:0, ¬b:1]
is closed and unsatisfiable, but does not contain the empty clause.
Now replace ℕ by ℤ. Then ≺ is not well-founded anymore. Let the initial clause set be equal to C. The defender can always reduce in such a manner that newly derived clauses are sorted in the same way as in C. Therefore he can block derivation of the empty clause.

5 Application of Resolution Games
We are now in the position to prove Theorems 2.1 and 2.2. For both theorems the strategy is the same. Each unsatisfiable clause set has a finite set Cg of ground instances which is unsatisfiable. From this unsatisfiable set we construct the resolution game, by taking all the ground literals in Cg. We use the attributes of the resolution game to indicate the non-ground literals by which the ground literals are represented. Then it is possible for the defender to make his moves in such a manner that the resulting game corresponds to the behaviour of the non-liftable ordering. Because the empty clause will be derived in the game, the empty clause will be derived with the non-liftable ordering.
We begin with Theorem 2.1. Assume that a set of clauses C is unsatisfiable. By Herbrand's theorem there exists a finite set Cg = {c̄1, …, c̄n} of ground instances of clauses in C, such that Cg is unsatisfiable. Let Cused = {c1, …, cn} ⊆ C be the set of clauses for which each c̄i is an instance of ci. (Here Cused is written with possible repetitions.)
Now construct the following resolution game. Define 𝒢 = (P, A, ≺), where
- P is the set of ground atoms occurring in Cg. P is finite, because Cg is finite. (We will denote the set of literals that can be formed from elements of P as ℒ.)
- We define A as the set of literals such that each L ∈ A
1. has an instance in a clause of Cg, and
2. is an instance of a literal occurring in a clause in Cused.
- ≺ is defined as follows: (a1 : A1) ≺ (a2 : A2) if A1 ⊑ A2.

We will show that 𝒢 is a valid resolution game. For this we have to show that ≺ is an order on ℒ × A, and that ≺ is well-founded on ℒ × A. The first follows trivially from the fact that ⊑ is an order.
For the second, let us define a1 : A1 ≡ a2 : A2 if a1 = a2 and A1 is equivalent with A2 (i.e., they are instances of each other). This is an equivalence relation with only a finite number of equivalence classes. ≺ does not distinguish elements of these classes. Then, because every strictly ≺-decreasing sequence must be finite, ≺ is well-founded.
We will now describe how the resolution game is played.
- The resolution game starts with the following set of clauses: For every ci = {A1, …, Ap}, the initial set Cgame contains a clause [A1Θ : A1, …, ApΘ : Ap]. Here Θ is a substitution such that ciΘ = c̄i. In his first move the defender does not affect the indices. Now we have:
INSTANCE There exists, for each initial clause [a1 : A1, …, ap : Ap], one substitution Θ such that for all ai : Ai we have AiΘ = ai.
This property will be preserved throughout the game by the defender.
- When the opponent derives a clause c = [a1 : A1, …, ap : Ap] by resolution, and p : P1 and ¬p : P2 are the literals resolved upon in the parent clauses, the defender reacts by replacing all Ai by AiΘ, where Θ is the mgu of ¬P1 and P2. After this he deletes all repeated occurrences of indexed literals in the result.
- When the opponent derives a clause c = [a1 : A1, …, ap : Ap] by factorization, and p : P1 and p : P2 are the literals factored upon, then the defender reacts by replacing all Ai by AiΘ, where Θ is the mgu of P1 and P2. After this he deletes all repeated occurrences of indexed literals in the result.
This is a valid strategy because property INSTANCE will be preserved throughout the game by the defender. The defender will lose the resolution game. From this game a ⊑-ordered refutation of C can be extracted, by replacing each clause [a1 : A1, …, ap : Ap] by the clause {A1, …, Ap}.
We can now prove Theorem 2.2. Let C be an unsatisfiable set of decomposed clauses, and let Cused and Cg be defined as in the proof of Theorem 2.1.
The resolution game will be a little different: 𝒢 = (P, A, ≺). P and A are constructed in the same way, but ≺ is constructed differently.
- ≺ is defined from: (a1 : A1) ≺ (a2 : A2) if one of the following holds:
1. The complexity of the instantiation (A1 becomes a1) is strictly less than the complexity of the instantiation (A2 becomes a2).
2. The complexity of the instantiation (A1 becomes a1) is equal to the complexity of the instantiation (A2 becomes a2), and A1 and A2 have the same number of variables. Then there exists a renaming A1Θ of A1 such that A1Θ has the same variables as A2, and it must be the case that A1Θ ⊑ A2. Note that if these conditions hold for one Θ such that A1Θ has the same variables as A2, then they hold for any such Θ', because of REN in Theorem 2.2.
We will show that this is a valid resolution game. In order to show that the relation ≺ is a well-founded order, it is sufficient to show that the relations mentioned under (1) and (2) are well-founded orders. It is easily seen that the relation under (1) is a well-founded order. For (2) it is easily checked that (2) defines an order.
It remains to show that the order defined under (2) is well-founded. In the same way as in the proof of Theorem 2.1, an equivalence relation ≡ can be defined; ≺ does not distinguish equivalent indexed literals under this relation. Now ≡ has only a finite number of equivalence classes. Because of this the ordering defined under (2) is well-founded, and hence the composition of (1) and (2) is well-founded.
Now the resolution game proceeds in exactly the same way as in the proof of Theorem 2.1, and a ⊑-ordered refutation of C can be extracted from this game in the same manner.

6 Conclusions and Future Work

We have shown that there exists a large class of non-liftable orderings with which resolution and factoring is complete. We have proven that the <~-order is complete for the E⁺-class, which was an open problem in ([FLTZ93]). We do not know to what extent the orderings are compatible with subsumption. Also we do not know what happens when condition SUBST in Theorem 2.1 is dropped; Counterexample 4.8 cannot be reproduced.

7 Acknowledgements
I would like to thank Trudie Stoute for her advice on English and Tanel Tammet for his improvements in the formulation of Theorem 2.1.

References
[Baum92] P. Baumgartner, An ordered theory calculus, in LPAR92, Springer
Verlag, Berlin, 1992.
[BG90] L. Bachmair, H. Ganzinger, On restrictions of ordered paramodu-
lation with simplification, CADE 10, pp. 427-441, Kaiserslautern,
Germany, Springer Verlag, 1990.
[Bezem90] M. Bezem, Completeness of resolution revisited, Theoretical com-
puter science 74, pp. 227-237, 1990.
[Boyer71] R.S. Boyer, Locking: A restriction of resolution, Ph.D. Thesis,
University of Texas at Austin, Texas 1971.
[ChangLee73] C-L. Chang, R. C-T. Lee, Symbolic logic and mechanical theorem
proving, Academic Press, New York 1973.
[FLTZ93] C. Fermüller, A. Leitsch, T. Tammet, N. Zamov, Resolution meth-
ods for the decision problem, Springer Verlag, 1993.
[Joy76] W.H. Joyner, Resolution Strategies as Decision Procedures, J.
ACM 23, 1 (July 1976), pp. 398-417.
[KH69] R. Kowalski, P.J. Hayes, Semantic trees in automated theorem
proving, Machine Intelligence 4, B. Meltzer and D. Michie (eds.),
Edinburgh University Press, Edinburgh, 1969.
[Nivelle94b] H. de Nivelle, Resolution games and non-liftable resolution or-
derings, Internal report 94-36, Department of Mathematics and
Computer Science, Delft University of Technology.
[Robins65] J.A. Robinson, A machine oriented logic based on the resolution
principle, Journal of the ACM, Vol. 12, pp 23-41, 1965.
[Tamm94] T. Tammet, Separate orderings for ground and non-ground literals
preserve completeness of resolution, unpublished, 1994.
[Zam72] N.K. Zamov: On a Bound for the Complexity of Terms in the
Resolution Method, Trudy Mat. Inst. Steklov 128, pp. 5-13, 1972.
On Existential Theories of List Concatenation

Klaus U. Schulz*

CIS, University of Munich, Wagmüllerstr. 23
80538 Munich, Germany
e-mail: schulz@cis.uni-muenchen.de
phone: (+49 89) 211 0667

Abstract. We discuss the existential fragments of two theories of concatenation. These theories describe concatenation of possibly nested lists in the algebra of finite trees with lists and in the algebra of rational trees with lists. Syntax and the choice of models are motivated by the treatment of lists in PROLOG III. In a recent prototype of this language, Colmerauer has integrated a built-in concatenation of lists, and the constraint-solver checks satisfiability of equations and disequations over concatenated lists. But, for efficiency reasons, satisfiability is only tested in a rather approximative way². The question arises whether satisfiability is decidable. Our main results are the following. For the algebra of finite trees with lists, the existential fragment of the theory is decidable. For the algebra of rational trees with lists, the positive existential fragment of the theory is decidable. Problems in the treatment of the existential fragment may be traced back to a difficult question about solvability of word equations with length constraints for variables.

1 Introduction

Quine [9] has shown that the theory of concatenation is undecidable. The existential fragment of the theory was shown to be decidable by Büchi and Senger [3], building up on Makanin's decidability result for solvability of word equations [7]. Concatenation, in the sense of Quine, is an operation acting on words over an alphabet of atomic letters, and the classical theory of concatenation is the theory of free monoids. In the meantime, with the development of high level programming languages, concatenation has become relevant as an operation on lists. Lists, as opposed to flat words, may contain complex objects as entries, including nested sublists, for example.
In this paper we want to discuss theories of list concatenation. We shall concentrate on two formal models that are motivated by the treatment of lists in PROLOG III. In a recent prototype of this language, Colmerauer has integrated a built-in concatenation of lists, and the constraint-solver checks satisfiability

* Supported by EC Working Group CCL, EP 6028.


² We refer to a talk by Alain Colmerauer at the third Workshop on Constraint Logic Programming, Marseille, March 1993.

of equations and disequations between terms with concatenated lists. For efficiency reasons, however, satisfiability is only tested in a rather approximative way. Colmerauer introduces a non-standard "naive" concatenation on a complicated "extended domain" to explain the precise answer behaviour of the solver declaratively. The question arises whether satisfiability of equations and disequations between terms with concatenated lists is decidable.
Approximating the formal model of PROLOG III, we consider the algebra of finite trees with lists and the algebra of rational trees with lists. In both domains, concatenation is interpreted as a partial operation acting on lists only, and free function symbols are interpreted as tree constructors. In view of the results of Quine and Büchi-Senger we only consider the existential fragment of the theories of these two structures. The syntax is more or less identical to the syntax of PROLOG III for constraints over lists. The "list constraint systems" that will be considered are finite sets of equations and disequations between terms with concatenated lists. Arbitrary existential sentences correspond to disjunctions of list constraint systems.
The paper is structured as follows. Section 2 starts with central definitions. In Section 3 we show that solvability of list constraint systems over the algebra of finite trees with lists is decidable. This implies that the existential theory of this structure is decidable. The decision procedure is based on a decomposition technique that was introduced in [2] in the context of disunification in the union of disjoint equational theories. A variant of Makanin's algorithm [7] deciding solvability of word equations is needed.
In Section 4 we consider the algebra of rational trees with lists as solution domain. It is shown that solvability of equational list constraint systems is decidable. Thus the positive existential theory of this algebra is decidable. We sketch how the problem of solvability of arbitrary list constraint systems over the algebra of rational trees with lists may be traced back to the following problem: given a word equation with variables x1, …, xn, and given a finite set of constraints of the form |xi| = |xj| demanding that the length of the (words to be substituted for the) variables xi and xj has to be the same, decide if the word equation has a solution that satisfies these restrictions. Decidability of word equations with these length constraints seems to be a deep problem. G.S. Makanin (personal communication) has shown that a primitive recursive decision procedure would give a primitive recursive algorithm for deciding solvability of equations in free groups. It is known that Makanin's algorithm for free groups [8] is not primitive recursive [5].

2 List Constraint Systems and Solutions

List constraint systems

Following the syntax of PROLOG III we shall use an infinite set of list constructing symbols for representing lists. For each natural number k, let [ ]_k denote a function symbol of arity k. Let Σ_L := {[ ]_k ; k ≥ 0}. Let Σ_F denote a finite set of free function symbols, disjoint from Σ_L, containing at least one constant and one non-constant function symbol. The complete signature that we shall use contains binary concatenation "∘" and all symbols from Σ_L&F := Σ_L ∪ Σ_F. X is a countably infinite set of variables. In the sequel, possibly subscripted symbols x, y, z, … always denote variables.
The set of all (F- and L-) terms is recursively defined as follows:
- every variable is an L-term and an F-term,
- if t1, …, tn are terms and f ∈ Σ_F is an n-ary function symbol, then f(t1, …, tn) is an F-term and [ ]_n(t1, …, tn) is an L-term (n ≥ 0),
- if l1 and l2 are L-terms, then l1 ∘ l2 is an L-term.

Terms [ ]_n(t1, …, tn) will be written in the form [t1, …, tn]. Since the infix symbol "∘" is interpreted as concatenation, we omit brackets in expressions l1 ∘ … ∘ ln. For n = 0, an expression l1 ∘ … ∘ ln denotes the empty list [ ]_0. Of course many "natural" expressions (such as those using "cons" and "conc", or Prolog-style [t1|l1]) are not treated as terms. It is simple to see that for all these expressions there are terms which behave in the same way, in any relevant sense. In order to keep proofs simple we have chosen a compact syntax which captures all conventional constructions for a combination of terms with lists.
A list constraint system is a finite set of equations and disequations Γ of the form
{s1 = t1, …, sn = tn, s_{n+1} ≠ t_{n+1}, …, s_{n+m} ≠ t_{n+m}}
where the si and ti are terms.

Example 1. Let Σ_F = {f, g, a, b} where f is binary, g is unary, and a and b are constants. Then Γ1 = {[f(z, [x] ∘ x), g(y ∘ y)] = [f(g(x ∘ y), [y] ∘ [b, a]), z], y ≠ []} and Γ2 = {[x] = x} are list constraint systems.

Two solution domains

We assume that trees (and subtrees) are formalized as usual, i.e., as sets of labelled paths. Paths (= positions) are finite sequences of positive natural numbers. A tree with lists is a tree with labels in Σ_L&F, the arity of the label giving the branching degree at the node. A tree with lists is rational if it has only a finite number of distinct subtrees.
In order to solve list constraint systems we shall consider the two domains T_fin^{Σ_L&F} and T_rat^{Σ_L&F} of all finite (resp. rational) trees with lists. Elements of these domains will often be written in the form f(t1, …, tk) or [ ]_k(t1, …, tk) = [t1, …, tk], where the ti denote subtrees. Trees of the form [t1, …, tk] will be called lists of length k.
Both domains may be turned into (partial) algebras over the mixed signature Σ_L&F ∪ {∘}: free function symbols f ∈ Σ_F and list symbols [ ]_k are interpreted as tree constructors, and the interpretation of "∘" is the partial function
∘^T : ([t1, …, tn], [t_{n+1}, …, t_{n+m}]) ↦ [t1, …, tn, t_{n+1}, …, t_{n+m}].
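A small sketch (our own encoding of trees as nested tuples with the root label first, not notation from the paper) may make the partiality of ∘^T explicit:

```python
# "∘" is only defined when both arguments are lists, i.e. trees whose root
# label is one of the list constructors [ ]_k (encoded here by the tag "[]").
def concat(t1, t2):
    if t1[0] != "[]" or t2[0] != "[]":
        return None                       # ∘ is undefined on non-lists
    return ("[]",) + t1[1:] + t2[1:]

print(concat(("[]", ("a",), ("b",)), ("[]", ("g", ("a",)))))
# -> ('[]', ('a',), ('b',), ('g', ('a',)))
print(concat(("a",), ("[]",)))            # -> None, since a is not a list
```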



Solutions and finite-tree solutions

A (partial) tree assignment is a (subset of a) mapping α : X → T_rat^{Σ_L&F}. Tree assignments will be used to associate with an arbitrary term t an interpretation t^α ∈ T_rat^{Σ_L&F}. But, since "∘^T" is a partial operation, we have to be careful. We say that x ∈ X has type L with respect to t if t has a subterm of the form x ∘ l or l ∘ x. It is not hard to see that t^α is defined if and only if x^α is a list, for every variable x which has type L with respect to t. The variable x has type L with respect to the constraint system Γ if there is a term t in Γ such that x has type L with respect to t. A partial tree assignment α is consistent for Γ if α assigns a list x^α to every variable x that has type L with respect to Γ and an arbitrary tree with lists to the remaining variables y of Γ.

Definition 1. Let Γ be a constraint system. A rational-tree solution (or simply a solution) of Γ is a partial tree assignment α which is consistent for Γ such that s^α = t^α (s^α ≠ t^α) whenever Γ contains an (dis)equation s = t (s ≠ t). A finite-tree solution is a solution α where x^α ∈ T_fin^{Σ_L&F}, for all variables x occurring in Γ.

Example 2. The assignment x ↦ [b, a], y ↦ [b, a], z ↦ g([b, a, b, a]) is a finite-tree solution of the constraint system Γ1 given in Example 1. The system Γ2 does not have a finite-tree solution. But there exists a solution α which maps x to the rational tree [[…[…]…]].

Flat and nontrivial constraint systems

A term t is called flat if t is a variable, if t has the form f(x1, …, xn) (f ∈ Σ_F), or if t has the form l1 ∘ … ∘ ln (n ≥ 0) where the arguments li are variables or terms of the form [x]. A flat constraint system is a constraint system Γ where both sides of disequations are variables and the left-hand (right-hand) sides of equations are variables (flat terms). Obviously it is possible to compute for an arbitrary list constraint system Γ a flat list constraint system Γ' that is equivalent in the sense that every (finite-tree) solution of Γ can be extended to a (finite-tree) solution of Γ' and every (finite-tree) solution of Γ' is a (finite-tree) solution of Γ. (We just have to introduce additional variables x and new equations of the form x = t in order to get rid of complex subterms.)
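A simple sketch of such a flattening pass could look as follows; it is our own simplified illustration (it only names complex subterms by fresh variables and collects the new equations), not the paper's procedure.

```python
import itertools

fresh = (f"_x{i}" for i in itertools.count())   # supply of fresh variables

def flatten_term(t, eqs):
    # t is a nested tuple (symbol, arg1, ..., argn); variables are plain strings
    if isinstance(t, str):
        return t
    args = []
    for a in t[1:]:
        a = flatten_term(a, eqs)
        if not isinstance(a, str):       # complex argument: give it a name
            v = next(fresh)
            eqs.append((v, a))
            a = v
        args.append(a)
    return (t[0],) + tuple(args)

eqs = []
eqs.append(("z", flatten_term(("f", ("g", "x"), ("[]", "y", ("a",))), eqs)))
print(eqs)   # new flat equations naming g(x), a, [y, ...] and the whole term
```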
A flat list constraint system is trivial if it contains an equation x = t, where t is a non-variable F-term, and if at the same time x has type L with respect to Γ, or x occurs in an equation x = l where l is a non-variable L-term. Obviously, triviality can be detected algorithmically, and trivial systems are unsolvable. All list constraint systems that will be considered in the following are assumed to be flat and non-trivial.

3 Decidability Result for Finite Tree Solutions

In this section we want to prove the following theorem.



Theorem 2. It is decidable if a list constraint system has a finite-tree solution.

Let Γ = {s1 = t1, …, sn = tn, s_{n+1} ≠ t_{n+1}, …, s_{n+m} ≠ t_{n+m}} be a list constraint system. It can be regarded as an existential sentence γ of the form ∃x̄((⋀_{i=1}^{n} si = ti) ∧ (⋀_{j=n+1}^{n+m} ¬ sj = tj)). Finite-tree solvability of Γ corresponds to validity of γ in T_fin^{Σ_L&F}. Obviously arbitrary existential sentences may be represented as disjunctions of list constraint systems.

Corollary 3. The existential theory of the algebra T_fin^{Σ_L&F} is decidable.

To establish Theorem 2 we shall give an algorithm that decomposes a flat non-trivial list constraint system Γ = Γ0 into a finite set of output pairs. We shall see that Γ0 is solvable iff both components of one of the output pairs are solvable. Moreover, solvability of both output components will be decidable. Before we describe the steps of the algorithm we shall explain the nature of three types of constraint systems that arise from decomposition. With T(Σ, X) we denote the set of all terms with variables in X and function symbols in Σ. A VC-declaration (VC stands for variable-constant) is a pair (Z_V, Z_C) representing a partition Z = Z_V ∪ Z_C of a finite set of variables Z ⊆ X. In the presence of a VC-declaration (Z_V, Z_C), the variables in Z_C are not instantiated in solutions, which means that they are treated as constants.

Free disunification problems with linear constant restriction

A free disunification problem with linear constant restriction is a quadruple (Γ_F, Z_V, Z_C, <) where
- (Z_V, Z_C) is a VC-declaration of Z ⊆ X,
- Γ_F is a finite set of equations and disequations between terms in T(Σ_F ∪ Z_C, Z_V), and
- < is a linear ordering on Z.
The first component of each output pair has this complex form. A solution of this problem is a T(Σ_F, X)-substitution σ, not instantiating "constants" in Z_C, which solves all equations and disequations of Γ_F such that y ∈ Z_C does not occur in x^σ for all x < y (x ∈ Z_V). A solution σ is called restrictive if x^σ ∉ X for all x ∈ Z_V.
The notion of a disunification problem with linear constant restriction and the notion of a restrictive solution have been introduced in [2] in the context of disunification modulo equational theories. There it has been shown (proof of Corollary 4.8):

Lemma 4. It is decidable whether a free disunification problem with linear constant restriction has a restrictive solution.

Flat pure list constraint systems with linear constant restriction

A flat pure list constraint system with linear constant restriction is a quadruple (Γ_L, Z_V, Z_C, <) where
- (Z_V, Z_C) is a VC-declaration of Z ⊆ X,
- Γ_L is a finite set of disequations of the form x ≠ y (x, y ∈ Z_V) and of equations of the form x = l1 ∘ … ∘ ln (n ≥ 0) where x ∈ Z_V and the li have the form z ∈ Z_V or the form [y] with y ∈ Z = Z_V ∪ Z_C,
- < is a linear ordering on Z.
Let M be a set. With ℒ_nested^M we denote the set of all finite, possibly nested lists whose elements that are not themselves lists are in M. This domain contains only finite trees.
A solution of (Γ_L, Z_V, Z_C, <) is a mapping σ which assigns to every variable x ∈ Z_V an element x^σ ∈ ℒ_nested^X such that the canonical extension of σ on pure L-terms³ solves all equations and disequations of Γ_L and the constant c ∈ Z_C does not occur in x^σ for all x < c (x ∈ Z_V). The solution σ is called compatible with < if x1^σ is never a proper subtree of x2^σ for x2 < x1 (x1, x2 ∈ Z_V). Note that, by definition, every solution σ is restrictive in the sense that x^σ cannot be a variable (x ∈ Z_V).
In the third step of the algorithm, systems of this type are created. Nested lists as solution values may be necessary since variables may occur among the elements of lists in equations. This is the important distinction to the following type of system.

Shallow pure list constraint systems with linear constant restriction

Let (Γ_L, Z_V, Z_C, <) be a flat pure list constraint system with linear constant restriction. The shallow version (Γ̇_L, Z_V, Z_C ∪ Ż_V, <̇) of (Γ_L, Z_V, Z_C, <) is obtained by
(1) introducing the new set of constants Ż_V := {ẋ ; x ∈ Z_V},
(2) replacing every term [x] in Γ_L with an embedded occurrence of a variable x ∈ Z_V by an expression [ẋ],
(3) using the linear ordering <̇ which is the extension of < on Z_V ∪ Z_C ∪ Ż_V where each constant ẋ is the immediate successor of x with respect to <̇ (x ∈ Z_V).
The second components of the output pairs will have this form. The domain ℒ^{X ∪ Ż_V}_flat contains all lists of the form [l_1, ..., l_n] (n ≥ 0) with elements l_i ∈ X ∪ Ż_V.
A solution of (Γ̇_L, Z_V, Z_C ∪ Ż_V, <̇) is a mapping σ which assigns to every x ∈ Z_V a value x^σ ∈ ℒ^{X ∪ Ż_V}_flat such that the canonical extension⁴ of σ on terms in Γ̇_L solves all equations and disequations of Γ̇_L and the constant c ∈ Z_C ∪ Ż_V does not occur in x^σ for all x <̇ c (x ∈ Z_V). As for flat pure systems, each solution is restrictive by definition.

³ Where c^σ := c for c ∈ Z_C, [l_1, ..., l_n]^σ = [l_1^σ, ..., l_n^σ], and (l_1 ∘ l_2)^σ is the concatenation of l_1^σ and l_2^σ.
⁴ Defined as above, with ẋ^σ = ẋ for ẋ ∈ Ż_V. Note that the canonical extension of σ assigns to both sides of each equation of Γ̇_L again values in ℒ^{X ∪ Ż_V}_flat since there are no variables in element positions.

Lemma 5. It is decidable whether the shallow version of a flat pure list constraint system with linear constant restriction has a solution.

Proof. (Sketch) Suppose that Γ̇_L has m disequations. It is first shown that solvability of (Γ̇_L, Z_V, Z_C ∪ Ż_V, <̇) may be tested in a domain ℒ^{X_0 ∪ Ż_V}_flat where X_0 ⊆ X has 2m + 1 elements and X_0 ∩ Z_C = ∅. Now we have a finite solution alphabet, and the method of Büchi and Senger ([3]) may be used to compute an equivalent finite set of systems with equations only. These latter systems are like word unification problems with linear constant restriction, where solvability is known to be decidable (see [1]). (More details of all steps can be found in [2], where the almost identical case of associative disunification with linear constant restriction has been treated.) □

3.1 First decomposition algorithm (Algorithm 1)

The input of Algorithm 1 is a flat and nontrivial list constraint system Γ_0.

Step 1: variable identification. Consider all partitions of the set of all variables occurring in Γ_0 such that distinct variables x, y are in the same class of the partition if the system contains the equation x = y, and distinct variables x, y are in distinct classes of the partition if the system contains the disequation x ≠ y. Each of these partitions yields one of the new systems Γ_1 as follows. The variables in each class of the partition are "identified" with each other by choosing an element of the class as representative, and replacing in the system all occurrences of variables of the class by this representative. Afterwards, trivial equations x = x are erased. In addition, we add a disequation x ≠ y for every pair x, y of distinct representatives to the system if this disequation is not already present. Systems that are now trivial are excluded.
In each system Γ_1, the right-hand side of every equation is either an F-term or an L-term (but not a variable). Thus, we may speak about F-equations and L-equations.
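The non-deterministic choice in Step 1 can be made concrete by enumerating all admissible partitions, as in the following Python sketch; the representation of equations and disequations as pairs of variable names is an assumption of this illustration, not the paper's notation.

def partitions(items):
    """Generate all set partitions of a list of items."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # put 'first' into an existing class ...
        for i in range(len(part)):
            yield part[:i] + [part[i] | {first}] + part[i + 1:]
        # ... or into a new singleton class
        yield part + [{first}]

def admissible(part, equalities, disequalities):
    """Check that equated variables share a class and disequated ones do not."""
    cls = {v: i for i, c in enumerate(part) for v in c}
    return all(cls[x] == cls[y] for x, y in equalities) and \
           all(cls[x] != cls[y] for x, y in disequalities)

def step1_partitions(variables, equalities, disequalities):
    return [p for p in partitions(list(variables))
            if admissible(p, equalities, disequalities)]

# With variables {x, y, z} and the single disequation x != y there are
# exactly three admissible partitions:
print(step1_partitions(['x', 'y', 'z'], [], [('x', 'y')]))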

Step 2: choose ordering, type variables. For a given system Γ_1, consider all possible strict linear orderings < on the variables of the system. Guess a type assignment which maps every variable x to an element type(x) of {F, L}, satisfying the following restrictions: if x has type L with respect to Γ_1, or if Γ_1 contains an equation x = t where t is a non-variable L-term (resp. F-term), then type(x) = L (resp. type(x) = F). Each pair (<, type) yields one of the new systems obtained from the given one.
For a system Γ_2 obtained by Step 2, let X_3,F (X_3,L) denote the set of variables of type F (L) occurring in Γ_2. Let X_2 = X_3,F ∪ X_3,L. Now left-hand sides of F-equations (L-equations) are in X_3,F (X_3,L).

Step 3: split systems. A given system Γ_2 is divided into two systems Γ_2 = Γ_3,F ∪ Γ_3,L. The "free" subsystem Γ_3,F contains all F-equations of Γ_2, the "L"-subsystem Γ_3,L contains all L-equations of Γ_2. Disequations with at least one variable of type F are added to the free subsystem, the other disequations are added to Γ_3,L. Now (Γ_3,F, X_3,F, X_3,L, <) is a free disunification problem with linear constant restriction and (Γ_3,L, X_3,L, X_3,F, <) is a flat pure list constraint system with linear constant restriction.

Step 4: dot embedded variables. In this last step we compute the shallow version (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L, <̇) of the flat pure list constraint system with linear constant restriction, (Γ_3,L, X_3,L, X_3,F, <), obtained in the previous step. Terms of Γ̇_3,L have the form l_1 ∘ ... ∘ l_m (m ≥ 0) where the subterms l_i are variables x ∈ X_3,L or lists [t] where t ∈ X_3,F ∪ Ẋ_3,L is a constant.
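The dotting of embedded variables can be illustrated by a small Python sketch; the representation of an equation x = l_1 ∘ ... ∘ l_n as a pair and the suffix "_dot" standing in for the dotted symbol ẋ are assumptions of this illustration (the extension of the ordering in step (3) of the shallow-version construction is not modelled).

def dot(name):
    """Stand-in for the dotted constant corresponding to a variable."""
    return name + "_dot"

def shallow_version(equations, z_v):
    """Replace every embedded occurrence [x] of a variable x in z_v by the
    constant [x_dot]; return the dotted equations and the new constants."""
    new_constants = {dot(x) for x in z_v}
    dotted = []
    for lhs, items in equations:
        new_items = []
        for item in items:
            if isinstance(item, list) and item[0] in z_v:
                new_items.append([dot(item[0])])   # [x]  ->  [x_dot]
            else:
                new_items.append(item)             # direct occurrences unchanged
        dotted.append((lhs, new_items))
    return dotted, new_constants

# Example: x = y o [z] o [c] with y, z variables and c a constant becomes
# x = y o [z_dot] o [c]:
eqs, consts = shallow_version([('x', ['y', ['z'], ['c']])], {'x', 'y', 'z'})
# eqs == [('x', ['y', ['z_dot'], ['c']])]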
Note that Steps 1 and 2 are non-deterministic. The output of the algorithm consists of all pairs

((Γ_3,F, X_3,F, X_3,L, <), (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L, <̇))

which are obtained from Γ_0 by means of the Steps 1-4. □
Theorem 2 is a direct consequence of the following proposition, using Lemmata 4 and 5.

Proposition 6. The input system Γ_0 has a finite-tree solution if and only if there exists an output pair ((Γ_3,F, X_3,F, X_3,L, <), (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L, <̇)) such that (Γ_3,F, X_3,F, X_3,L, <) has a restrictive solution and (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L, <̇) has a solution.

3.2 Correctness of Algorithm 1

In order to prove Proposition 6 we shall need four lemmata.

Lemma 7. If the input system Γ_0 is solvable, then there exists a pair

((Γ_3,F, X_3,F, X_3,L, <), (Γ_3,L, X_3,L, X_3,F, <))

reached after Step 3 such that (Γ_3,F, X_3,F, X_3,L, <) has a restrictive solution and (Γ_3,L, X_3,L, X_3,F, <) has a solution that is compatible with <.

Proof. (Sketch) Suppose that σ is a solution of Γ_0. We have to determine choices in the non-deterministic Steps 1 and 2 which lead, after Step 3, to a pair of systems as described in the proposition. In Step 1 of the algorithm two variables x, y are identified iff x^σ = y^σ. With this choice σ is a solution of Γ_1. In Step 2 of the algorithm the linear order < which we choose is an arbitrary extension of the partial order ≺ defined by

x ≺ y :⇔ x^σ is a proper subtree of y^σ.

A variable x receives type F iff the topmost label of x^σ is in Σ_F. These choices are consistent with the restrictions in Steps 1 and 2 and define a pair of systems ((Γ_3,F, X_3,F, X_3,L, <), (Γ_3,L, X_3,L, X_3,F, <)) which is reached after Step 3. It is now possible to show that these systems have solutions as described in the proposition. The full proof, which is based on methods from [2], can be found in [10]. □

Lemma 8. If a system (Γ_3,L, X_3,L, X_3,F, <) obtained as second component after Step 3 has a solution σ that is compatible with <, then the dotted system reached after Step 4, (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L, <̇), has a solution.
Proof. Let σ : X_3,L → ℒ^{X_3,F ∪ Y}_nested,fin be a solution of (Γ_3,L, X_3,L, X_3,F, <) that is compatible with <. We may assume that Y ⊆ X and X_2 are disjoint. Consider a bijection

β : Y ∪ X_3,F ∪ ℒ^{X_3,F ∪ Y}_nested,fin → Y ∪ X_3,F ∪ Ẋ_3,L

such that β(x^σ) = ẋ for all x ∈ X_3,L and β(x) = x for all x ∈ X_3,F. Such a bijection exists since Γ_3,L contains a disequation x_1 ≠ x_2 for every pair of distinct variables of type L and therefore x_1^σ ≠ x_2^σ for all x_1, x_2 ∈ X_3,L, x_1 ≠ x_2.
Now β induces a projection

π : ℒ^{X_3,F ∪ Y}_nested,fin → ℒ^{X ∪ Ẋ_3,L}_flat ;  [l_1, ..., l_n] ↦ [β(l_1), ..., β(l_n)].

We define the assignment σ̂ : X_3,L → ℒ^{X ∪ Ẋ_3,L}_flat ;  x ↦ π(x^σ).
Let x ∈ X_3,L. Then π(x^σ) = x^σ̂ by definition of σ̂. Consider a list [ẋ] where ẋ ∈ Ẋ_3,L. We have π([x]^σ) = [β(x^σ)] = [ẋ] = [ẋ]^σ̂. Finally, consider a list [x] where x ∈ X_3,F. We have π([x]^σ) = [β(x)] = [x] = [x]^σ̂.
Let us now show that σ̂ solves the equations of Γ̇_3,L. Let x = l_1 ∘ ... ∘ l_n be an equation from Γ_3,L, let x = l'_1 ∘ ... ∘ l'_n be the corresponding equation in Γ̇_3,L. We know that x^σ = (l_1 ∘ ... ∘ l_n)^σ. But then

x^σ̂ = π(x^σ) = π((l_1 ∘ ... ∘ l_n)^σ) = π(l_1^σ) ∘ ... ∘ π(l_n^σ) = (l'_1 ∘ ... ∘ l'_n)^σ̂.

Next consider a disequation x_1 ≠ x_2 of Γ̇_3,L. We have the same disequation in Γ_3,L, therefore x_1^σ ≠ x_2^σ. But, since β is a bijection, distinct lists remain distinct under the projection π, thus x_1^σ̂ ≠ x_2^σ̂.
Now let x_1 ∈ X_3,L and x_2 ∈ X_3,F. If x_1 <̇ x_2, then also x_1 < x_2. Therefore x_2 does not occur in x_1^σ. This means that x_2 does not occur in x_1^σ̂, by the choice of β and σ̂. Therefore σ̂ respects the linear constant restriction for constants in X_3,F. It remains to show that also the restrictions for constants in Ẋ_3,L are satisfied. Let x_1, x_2 ∈ X_3,L. If x_1 <̇ ẋ_2, then either x_2 = x_1 or x_1 < x_2. In the first case, x_1^σ̂ = π(x_1^σ) cannot have an occurrence of ẋ_1 since x_1^σ is not a proper subtree of x_1^σ. In the second case we have to make use of the fact that σ is compatible with <. From this we know that x_2^σ is not a proper subtree of x_1^σ, therefore ẋ_2 = β(x_2^σ) cannot occur in π(x_1^σ) = x_1^σ̂. □
Summarizing, the preceding two lemmata show that the decomposition algorithm is complete. Let us now consider soundness.

Lemma 9. If a dotted system (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L, <̇) obtained after Step 4 has a solution, then the original system (Γ_3,L, X_3,L, X_3,F, <) has a solution.

Proof. Let σ̂ be a solution of (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L, <̇). We may assume that σ̂ : X_3,L → ℒ^{Y ∪ X_3,F ∪ Ẋ_3,L}_flat where Y ⊆ X is disjoint to X_2. We shall now use the linear order <̇ in order to define a partial assignment σ : X_3,L ∪ Ẋ_3,L → ℒ^{Y ∪ X_3,F}_nested,fin such that the restriction to X_3,L, extended canonically on pure L-terms, solves (Γ_3,L, X_3,L, X_3,F, <). Let x ∈ X_3,L ∪ Ẋ_3,L and assume that z^σ has been defined for all z ∈ X_3,L ∪ Ẋ_3,L such that z <̇ x. We shall also assume (*) that z_1^σ ≠ z_2^σ for all distinct z_1, z_2 ∈ X_3,L with z_1 <̇ x and z_2 <̇ x.
If x = ż is a dotted variable, then x is the immediate successor of z with respect to <̇. We define x^σ := z^σ. If x ∈ X_3,L, then the dotted elements of the flat list x^σ̂ are smaller than x with respect to <̇. We define x^σ := (x^σ̂)^σ. Since the flat lists x^σ̂ and z^σ̂ are distinct it follows easily, by (*), that x^σ ≠ z^σ for all z <̇ x, z ∈ X_3,L.
We shall now prove that σ is a solution of (Γ_3,L, X_3,L, X_3,F, <). Let x = l_1 ∘ ... ∘ l_n be an equation of Γ_3,L with counterpart x = l'_1 ∘ ... ∘ l'_n in Γ̇_3,L. We have x^σ̂ = (l'_1 ∘ ... ∘ l'_n)^σ̂. Therefore x^σ = (x^σ̂)^σ = ((l'_1 ∘ ... ∘ l'_n)^σ̂)^σ. Each l'_i is in X_3,L or it has the form [t] where t ∈ Ẋ_3,L ∪ X_3,F. It follows easily that ((l'_1 ∘ ... ∘ l'_n)^σ̂)^σ = (l_1 ∘ ... ∘ l_n)^σ, thus σ solves x = l_1 ∘ ... ∘ l_n. Γ_3,L contains only disequations where both variables have type L. We have already seen that σ solves these disequations. Let us consider the linear constant restriction which is imposed by <. Let z ∈ X_3,F, z > x ∈ X_3,L. We know that z does not occur in any term of the form r^σ̂ for r ≤ x, r ∈ X_3,L. From this it follows easily that z does not occur in x^σ. □

Lemma 10. If there exists a pair ((Γ_3,F, X_3,F, X_3,L, <), (Γ_3,L, X_3,L, X_3,F, <)) reached after Step 3 such that (Γ_3,F, X_3,F, X_3,L, <) has a restrictive solution and (Γ_3,L, X_3,L, X_3,F, <) has a solution, then Γ_0 has a solution.

Proof. Let σ_F be a restrictive solution of the free disunification problem with linear constant restriction (Γ_3,F, X_3,F, X_3,L, <), let σ_L be a solution of the system (Γ_3,L, X_3,L, X_3,F, <). We may assume that

σ_F : X_3,F → T(Σ_F ∪ X_3,L, Y_F)
σ_L : X_3,L → ℒ^{X_3,F ∪ Y_L}_nested,fin

where the sets Y_F = {y_1,F, ..., y_m,F} ⊆ X and Y_L = {y_1,L, ..., y_n,L} ⊆ X are finite, disjoint and do not contain an element of X_3,F ∪ X_3,L. Since Σ_F contains at least one constant and one non-constant function symbol we may choose n distinct ground terms t_1, ..., t_n over this signature which are different from all terms x^σ_F for x ∈ X_3,F. Similarly we may choose m distinct nested lists l_1, ..., l_m where all labels have the form [ ]_k (k ≥ 0), each list l_i being distinct from all lists x^σ_L for x ∈ X_3,L. Let

τ_F : y_i,F ↦ l_i (1 ≤ i ≤ m),
τ_L : y_i,L ↦ t_i (1 ≤ i ≤ n).

We shall define a T^{Σ_L ∪ Σ_F}_fin-assignment σ on X_2 by induction on the linear ordering <. Assume that z^σ has been defined for all z ∈ X_2 preceding x ∈ X_2 with respect to <. We shall assume (1) that this assignment is type-conform, which means that z^σ has topmost symbol in Σ_F (of the form [ ]_k) for variables z of type F (type L), (2) that z_1^σ ≠ z_2^σ for all distinct z_1, z_2 < x, and (3) that the terms z^σ are not in {t_1, ..., t_n, l_1, ..., l_m} for z < x.
Assume that x has type i ∈ {F, L}, let i ≠ j ∈ {F, L}. Since σ_i respects the linear constant restriction of system i, the variables occurring in x^σ_i are variables z ∈ X_3,j with z < x, or variables from Y_i. Thus we may define x^σ := ((x^σ_i)^τ_i)^σ. By induction hypothesis, z^σ ∈ T^{Σ_L ∪ Σ_F}_fin for all x > z ∈ X_3,j, thus x^σ ∈ T^{Σ_L ∪ Σ_F}_fin. Since σ_F is restrictive and since σ_L ranges over lists, this assignment is type-conform and assumption (1) holds again. Assume that x^σ = z^σ for some z ∈ X_2, z < x. Then z has type i since σ is type-conform, thus x ≠ z ∈ Γ_3,i and x^σ_i ≠ z^σ_i. By assumption (1), the maximal j-subterms of ((z^σ_i)^τ_i)^σ = z^σ = x^σ = ((x^σ_i)^τ_i)^σ are exactly the σ-images of the variables of type j occurring in z^σ_i and the τ_i-images of the variables from Y_i. The former variables are smaller than x and the restriction of σ to these variables is injective, by hypothesis. By assumption (3), we obtain z^σ_i and x^σ_i back from z^σ = x^σ just by a projection which replaces these alien subterms by their unique τ_i- or σ-origins. Thus x^σ_i = z^σ_i. This is a contradiction. Therefore assumption (2) holds again. If x^σ_i contains any variable, then x^σ will have occurrences of free function symbols and of a list symbol [ ]_k. Therefore x^σ ∉ {t_1, ..., t_n, l_1, ..., l_m}. If x^σ_i is ground, x^σ = x^σ_i ∉ {t_1, ..., t_n, l_1, ..., l_m} by choice of these elements. Therefore assumption (3) holds again.
We may now show that σ solves the system Γ_2 which is reached after Step 2. Since σ is consistent for Γ_1 (see (1) and the restrictions in Step 2) it is then clear that σ can be extended to a solution of Γ_0. By our previous considerations it remains to be shown that σ solves the equations x = t of Γ_2. Assume that x = t is in Γ_3,i, where i ∈ {F, L}. Then x^σ_i = t^σ_i. It follows that x^σ = ((x^σ_i)^τ_i)^σ = ((t^σ_i)^τ_i)^σ = t^σ. For the last equality recall that σ_i and τ_i leave all y ∈ X_3,j fixed while ((y^σ_i)^τ_i)^σ = y^σ for y ∈ X_3,i. □

4 Results for Rational-Tree Solutions

Here we want to prove the following theorem.

Theorem 11. It is decidable if an equational list constraint system Γ has a rational-tree solution.

Corollary 12. The positive existential theory of the algebra T^{Σ_L ∪ Σ_F}_rat is decidable.

Before we give a second algorithm for proving these results it is instructive to reconsider Algorithm 1 and its soundness proof: we found that given solutions of the two components of an output pair can be combined to yield a solution of the input system. This solution is found by a finite recursive process along the chosen linear ordering. The linear constant restrictions imposed on the components of the output pairs have the effect of a partial occur check, excluding cyclic dependencies between values of F- and L-variables. If we now ask for rational-tree solutions, cyclic dependencies are acceptable and may be necessary. Accordingly, constant restrictions are not used in Algorithm 2.

4.1 Second decomposition algorithm (Algorithm 2)

The input is a flat and nontrivial constraint system Γ_0 without disequations. Algorithm 2 is obtained as a simplification of Algorithm 1:

- We ignore all subparts in the description of the steps of Algorithm 1 that refer to disequations.
- In Step 2 (type variables) we do not choose a linear order on the variables. Accordingly, systems obtained after Step 3 have the form (Γ_3,F, X_3,F, X_3,L) and (Γ_3,L, X_3,L, X_3,F), and from (Γ_3,L, X_3,L, X_3,F) we obtain its shallow version (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L)⁵ in Step 4.

The output consists of all pairs ((Γ_3,F, X_3,F, X_3,L), (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L)) that are obtained from Γ_0 by means of the new Steps 1-4. □
The simplification of the decomposition steps comes in parallel with a modification of the solution domains for the systems that are reached. The free system (Γ_3,F, X_3,F, X_3,L) is solved in the algebra T^{Σ_F ∪ X_3,L ∪ Y}_rat of rational trees with labels in Σ_F ∪ X_3,L ∪ Y, treating X_3,L as a set of constants. Here Y ⊆ X is an infinite set of variables that is disjoint to X_2. Solvability of equational systems over T^{Σ_F ∪ X_3,L ∪ Y}_rat is decidable (see [4, 6]). Since Σ_F contains a constant and a non-constant function symbol, solvability and restrictive solvability are equivalent.

Corollary 13. It is decidable if a system (Γ_3,F, X_3,F, X_3,L) has a restrictive solution.

System (Γ_3,L, X_3,L, X_3,F) is solved, treating X_3,F as a set of constants, in the domain ℒ^{X_3,F ∪ Y}_nested,rat of nested lists representing rational trees with labels in Σ_L ∪ X_3,F ∪ Y. System (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L) is solved in ℒ^{X ∪ Ẋ_3,L}_flat, as earlier.
Theorem 11 is a direct consequence of the following proposition, using Corollary 13 and the fact that solvability of word equations is decidable ([7]).

Proposition 14. The input Γ_0 of Algorithm 2 has a rational-tree solution if and only if there exists an output pair ((Γ_3,F, X_3,F, X_3,L), (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L)) where (Γ_3,F, X_3,F, X_3,L) has a restrictive solution and (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L) has a solution.

⁵ Defined as earlier, ignoring linear orders.
306

4.2 Correctness of Algorithm 2

Completeness of Algorithm 2 is proved in a similar manner as for Algorithm 1 (see [10]). Here we omit this part. For proving soundness let us introduce the following notation: we write t_1 =_i t_2 if the rational trees t_1 and t_2 have the same labels at all positions of length (depth) k < i. Clearly t_1 = t_2 iff t_1 =_i t_2 for all i ≥ 0.
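For finite approximations, the relation =_i is easy to compute; the following Python sketch compares trees encoded as pairs (label, list of subtrees) up to a given depth. Rational (infinite) trees would have to be compared on finite unfoldings, which this sketch does not model; the encoding is an assumption of the illustration.

def equal_up_to_depth(t1, t2, i):
    """True iff t1 and t2 carry the same labels at all positions of
    depth k < i (so any two trees are equal up to depth 0)."""
    if i <= 0:
        return True
    label1, children1 = t1
    label2, children2 = t2
    if label1 != label2 or len(children1) != len(children2):
        return False
    return all(equal_up_to_depth(c1, c2, i - 1)
               for c1, c2 in zip(children1, children2))

# f(a, g(b)) and f(a, g(c)) agree up to depth 2 but not up to depth 3:
t1 = ('f', [('a', []), ('g', [('b', [])])])
t2 = ('f', [('a', []), ('g', [('c', [])])])
assert equal_up_to_depth(t1, t2, 2) and not equal_up_to_depth(t1, t2, 3)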

Proposition 15. If a dotted system (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L) obtained after Step 4 has a solution, then the original system (Γ_3,L, X_3,L, X_3,F) has a solution.

Proof. Let σ̂ be a solution of (Γ̇_3,L, X_3,L, X_3,F ∪ Ẋ_3,L). We may assume that

σ̂ : X_3,L → ℒ^{Y ∪ X_3,F ∪ Ẋ_3,L}_flat

where Y ⊆ X and X_2 are disjoint. Let τ be the assignment which maps every dotted variable ẋ ∈ Ẋ_3,L to ẋ^τ := x^σ̂. Let σ_i := σ̂ ∘ τ^i (i ≥ 1). Obviously x^σ_i =_k x^σ_j for all i, j ≥ k and x ∈ X_3,L. There exists a unique tree t_x ∈ ℒ^{X_3,F ∪ Y}_nested,rat such that x^σ_i =_k t_x for all i ≥ k ≥ 1. We define x^σ := t_x (x ∈ X_3,L). Take an equation x = l_1 ∘ ... ∘ l_n of Γ_3,L with counterpart x = l'_1 ∘ ... ∘ l'_n in Γ̇_3,L. Let i ≥ 1. For l_j = l'_j = y ∈ X_3,L we have l'_j^σ_i =_i l_j^σ. For l_j = l'_j = [y] with y ∈ X_3,F we have l'_j^σ_i = l_j^σ = l_j. For l'_j = [ẏ] with y ∈ X_3,L we have l'_j^σ_i = [ẏ^{σ̂τ^i}] = [(y^σ̂)^{τ^{i-1}}] = [y^σ_{i-1}] =_i [y]^σ = l_j^σ. Thus x^σ =_i x^σ_i = (l'_1 ∘ ... ∘ l'_n)^σ_i =_i (l_1 ∘ ... ∘ l_n)^σ for i ≥ 1 and σ solves x = l_1 ∘ ... ∘ l_n. □

Proposition 16. If there exists a pair ((Γ_3,F, X_3,F, X_3,L), (Γ_3,L, X_3,L, X_3,F)) that is reached after Step 3 such that (Γ_3,F, X_3,F, X_3,L) has a restrictive solution and (Γ_3,L, X_3,L, X_3,F) has a solution, then Γ_0 has a solution.

Proof. Let σ_F be a restrictive solution of (Γ_3,F, X_3,F, X_3,L) and let σ_L be a solution of (Γ_3,L, X_3,L, X_3,F). We may assume that

σ_F : X_3,F → T^{Σ_F ∪ X_3,L ∪ Y}_rat
σ_L : X_3,L → ℒ^{X_3,F ∪ Y}_nested,rat

where Y = {y_1, ..., y_n} ⊆ X is finite and Y ∩ X_2 = ∅. Let σ_F&L := σ_F ∪ σ_L. Choose n distinct ground trees t_1, ..., t_n ∈ T^{Σ_L ∪ Σ_F}_rat. Let τ : y_i ↦ t_i (1 ≤ i ≤ n). We identify both σ_F&L and τ with their homomorphic extension on T^{Σ_L ∪ Σ_F ∪ X_2 ∪ Y}_rat. Let σ_1 := σ_F&L ∪ τ, and let σ_i := σ_1^i (i ≥ 1). Since σ_F is restrictive and each list x^σ_L (x ∈ X_3,L) has topmost label of the form [ ]_k we know that x^σ_i =_k x^σ_j for all i, j ≥ k (x ∈ X_2). There exists a unique tree t_x ∈ T^{Σ_L ∪ Σ_F}_rat such that x^σ_i =_k t_x for all 1 ≤ k ≤ i. We define x^σ := t_x. The restrictions in Step 2 of the algorithm guarantee that σ is consistent for Γ_1.
Let i > 1. Consider an F-equation x = f(y_1, ..., y_n) of the system reached after Step 1, Γ_1. If y_j ∈ X_3,F, then y_j^{σ_Fτσ_{i-1}} = y_j^{σ_i} =_{i-1} y_j^{σ_{i-1}}. If y_j ∈ X_3,L, then y_j^{σ_Fτσ_{i-1}} = y_j^{σ_{i-1}}. Thus

x^σ =_i x^{σ_i} = x^{σ_Fτσ_{i-1}} = f(y_1, ..., y_n)^{σ_Fτσ_{i-1}} =_i f(y_1, ..., y_n)^{σ_{i-1}} =_i f(y_1, ..., y_n)^σ.

Therefore σ solves x = f(y_1, ..., y_n). Consider an L-equation x = l_1 ∘ ... ∘ l_n (n ≥ 0) of Γ_1. If l_j = y ∈ X_3,L, then l_j^{σ_Lτσ_{i-1}} = l_j^{σ_i} =_{i-1} l_j^{σ_{i-1}}. Similarly l_j^{σ_Lτσ_{i-1}} =_{i-1} l_j^{σ_{i-1}} for l_j = [y] with y ∈ X_3,L. If l_j = [y] where y ∈ X_3,F, then l_j^{σ_Lτσ_{i-1}} = l_j^{σ_{i-1}}. Thus

x^σ =_i x^{σ_i} = x^{σ_Lτσ_{i-1}} = (l_1 ∘ ... ∘ l_n)^{σ_Lτσ_{i-1}} =_{i-1} (l_1 ∘ ... ∘ l_n)^{σ_{i-1}} =_{i-1} (l_1 ∘ ... ∘ l_n)^σ

and σ solves x = l_1 ∘ ... ∘ l_n. This shows that σ solves all equations of Γ_1. Thus σ is a solution of Γ_1. It is now trivial to extend σ to a solution of Γ_0. □

4.3 Problems with Disequations

Unfortunately, the treatment of disequations causes problems when we ask for rational-tree solutions. Here is an illustrative example. The input system Γ_0 with equations x_1 = g(y_1), x_2 = g(y_2), y_1 = [x_1], y_2 = [x_2] and disequation x_1 ≠ x_2 cannot be solved in T^{Σ_L ∪ Σ_F}_rat since every solution of the equational part will identify x_1 and x_2. If we decompose Γ_0, treating disequations as in Algorithm 1, one particular output pair with free system Γ_3,F = {x_1 = g(y_1), x_2 = g(y_2), x_1 ≠ x_2} ∪ {x_i ≠ y_j; i, j = 1, 2} (constants y_1, y_2) and with the L-component Γ̇_3,L = {y_1 = [x_1], y_2 = [x_2], y_1 ≠ y_2} (constants x_1, x_2) is generated. Both systems are solvable. Thus decomposition is no longer sound. The reason is that validity of disequations is not preserved when we recombine solutions of the output systems in order to obtain a solution of Γ_0.
Our attempts to prove decidability of the full existential theory of T^{Σ_L ∪ Σ_F}_rat have led to a partial result only. The question can be reduced to the following problem for word equations: given a word equation with variables x_1, ..., x_n, and given a finite set of constraints of the form |x_i| = |x_j| demanding that the length of the (words to be substituted for the) variables x_i and x_j has to be the same, decide if the word equation has a solution that satisfies these restrictions. The first reduction step is based on the following observation (which is very hard to prove).

Theorem 17.⁶ If a typed flat constraint system Γ has a solution, then the system Γ_χ(Γ) has a solution that is obtained from Γ by replacing every disequation x ≠ y of Γ by a bounded disequation x ≠_χ(Γ) y. Here χ(Γ) = 2^{n_emb} · n_dis + 1, where n_emb is the number of embedded variables of Γ and n_dis is the number of disequations of Γ.

⁶ A list constraint system Γ is typed if every variable occurring in Γ has type F or type L. A tree assignment σ solves a bounded disequation x ≠_k y if the trees x^σ and y^σ have a distinct label in depth j < k. An occurrence of a variable x in a term t of Γ of the form [x] or f(..., x, ...) (f ∈ Σ_F) is called an embedded occurrence of x in Γ.

In a second step, bounded disequations can be eliminated at the price of introducing length constraints of the form |x| = |y| and |x| > |y| that restrict the length of (the values of) L-variables. For each system Δ with equations and bounded disequations we obtain a finite number of systems Δ_1, ..., Δ_r with length constraints, preserving solvability in both directions. Eventually a variant of Algorithm 2 may be used to decompose the systems Δ_i in a similar way as before, taking length constraints into account. While the free output components take the same form as in the case of Algorithm 2, the L-output systems may be considered as word equations with length constraints as described above. On the level of word equations it suffices to have length constraints of the form |x| = |y|.
The remarks given in the introduction indicate that the problem of deciding solvability of word equations with length constraints might be extremely difficult. Let us make clear that our reduction only shows that decidability of solvability of word equations with length constraints would entail decidability of the existential fragment of the algebra of rational trees with lists.

References

1. F. Baader and K.U. Schulz. Unification in the union of disjoint equational theories: Combining decision procedures. In Proceedings of CADE-11, Springer LNCS 607, 1992.
2. F. Baader and K.U. Schulz, "Combination Techniques and Decision Problems for Disunification" (extended version), DFKI Research Report 93-05, German Research Center for AI, Saarbrücken, 1993. Short version in Proceedings RTA '93, Montreal, June 1993, LNCS 690, Springer, 1993, pp. 301-315.
3. J.R. Büchi, S. Senger, "Coding in the Existential Theory of Concatenation," Arch. math. Logik 26 (1986/87), pp. 101-106.
4. A. Colmerauer, "Equations and Inequations on Finite and Infinite Trees," Proceedings of the FGCS'84, pp. 85-99.
5. A. Kościelski, L. Pacholski, "Complexity of Makanin's Algorithm," Research Report, University of Wrocław (1991); preliminary version: "Complexity of Unification in Free Groups and Free Semi-Groups," Proceedings 31st Annual IEEE Symposium on Foundations of Computer Science, Los Alamitos (1990), pp. 824-829.
6. M.J. Maher, "Complete Axiomatizations of the Algebras of Finite, Rational and Infinite Trees," in Proc. LICS 3, IEEE Computer Society (1988), pp. 348-357.
7. G.S. Makanin, "The Problem of Solvability of Equations in a Free Semigroup," Math. USSR Sbornik 32, 1977.
8. G.S. Makanin, "Equations in a Free Group," Izv. Akad. Nauk SSSR Ser. Mat. 46 (1982), 1199-1273; English translation in Math. USSR Izv. 21 (1983).
9. W.V. Quine, "Concatenation as a Basis for Arithmetic," J. Symbolic Logic 11 (1946), pp. 105-114.
10. K.U. Schulz, "Constraints for Lists and Theories of Concatenation," CIS-Report 94-80, Univ. of Munich, 1994.
Completeness of Resolution for Definite Answers with Case Analysis

Tanel Tammet

Department of Computing Science
Chalmers University of Technology and Göteborg University
S-41296 Göteborg, Sweden
e-mail: tammet@cs.chalmers.se

Abstract. We investigate the problem of finding a computable witness for the existential quantifier in a formula of classical first-order predicate logic. The A-resolution calculus based on the program derivation algorithm A of C-L. Chang, R. C-T. Lee and R. Waldinger (a subsystem of the Manna-Waldinger calculus) is used for finding a definite substitution t for an existentially bound variable y in some formula F, such that F{t/y} is provable. The term t is built of the function and predicate symbols in F, plus Boolean functions and a case splitting function if, defined in the standard way: if(True, x, y) = x and if(False, x, y) = y. We prove that the A-resolution calculus is complete, i.e. if such a definite substitution exists, then the A-calculus derives a clause giving such a substitution. The result is strengthened by allowing the usage of liftable criteria R of a certain type, prohibiting the derivation of substitution terms t for which R(t) fails. This enables us to specify, for example, that the substitution t must be in some special signature or must be type-correct, without losing completeness.

1 Introduction

The motivation for this work is to devise efficient automated theorem proving strategies for the first-order theorem proving tasks arising in the formal derivation of programs from specifications. The specific aim of the paper is to present completeness results for certain simple, relatively well-known program synthesis algorithms.
One of the standard approaches to automated program construction is using intuitionistic logic with a suitable realizability interpretation to derive programs from proofs (see [5], [11], [8]). The programs derived in this way always enjoy an intuitionistic correctness proof.
Another approach (see [2], [6], [7], [1]) is to use classical logic instead, with additional restrictions guaranteeing that the proof contains a single definite substitution t into a certain existentially bound variable, and this t is furthermore in a signature where all the function and predicate symbols are assumed to represent computable functions. The derived programs thus always have a classical correctness proof, although they may lack an intuitionistic one.

The following summarizes our motivation for using the second approach (classical logic) for program construction.
The known realizability interpretations for intuitionistic logic often give programs which contain computationally irrelevant parts. For example, the realization of the formula ∀x∃y(x = y & y = x) is a term λx.(x, (id, id)). The A-resolution gives a term λx.x as a program to compute y.
Some formulas which admit a proof by A-resolution (and hence give a program) are not provable in intuitionistic logic. For example, A-resolution gives a program λx.x for computing y for the intuitionistically unprovable formula ∀x∃y((A ∨ ¬A) & y = x).
The standard resolution method with Skolemization and/or conversion to a conjunctive normal form (CNF) cannot be used for intuitionistic logic, although there exist special resolution methods without Skolemization and CNF ([9], [10]) and a tableaux method with partial dynamic Skolemization ([15]) for intuitionistic logic. Also, there exists a sizeable amount of theory for the resolution method, which can be used for program derivation by A-resolution.

1.1 Basic Definitions

We consider closed formulas in the first-order predicate logic language with function symbols. When we speak about derivability (provability), we mean derivability in classical first-order logic. We will restrict ourselves to formulas which contain at least one positive occurrence of the existential quantifier or at least one negative occurrence of the universal quantifier. The polarity of a subformula occurrence is defined in the standard way: subformulas under an odd number of negations and left argument positions of implications are negative, all the others are positive. We will refer to both the positive occurrences of existential quantifiers and the negative occurrences of universal quantifiers as essentially existential quantifiers and to all the others as essentially universal quantifiers.
As our goal is to derive programs by finding a certain definite substitution t into one of the variables bound by an essentially existential quantifier, we assume that our formulas have an associated marker for this specific variable, which will be called the main variable of the formula. We require that the quantifier occurrence Q binding the main variable must be outside the scope of other essentially existential quantifiers. The set of variables bound by the set S of all occurrences of the essentially universal quantifiers such that Q is in the scope of all the elements of S is called the set of parametric variables of the formula.
Given a formula F and its main variable y, we are looking for a proof of F such that this proof gives a term t for computing a value r for y for any set of values t_1, t_2, ..., t_n assigned to the parametric variables x_1, x_2, ..., x_n, so that a substitution instance F{t_1/x_1, t_2/x_2, ..., t_n/x_n, r/y} of the formula F would be provable in classical first-order logic. Here and elsewhere {t_1/x_1, ..., t_n/x_n} represents the substitution of each t_i (1 ≤ i ≤ n) for the variable x_i, respectively.
The terms t (representing computable functions) we are looking for are assumed to contain only the function and predicate symbols and parametric variables of F, plus Boolean functions and a case-analysis function "if" defined in the standard way: if(True, x, y) = x and if(False, x, y) = y. Since not all the predicate and function symbols in F necessarily represent computable functions, the signature of t may be further restricted to a subset of function and predicate symbols of F, representing computable functions.
The proof search is carried out in a modified resolution calculus. We will use standard resolution terminology, see e.g. [1] or [3]. The resolution rule is defined as {L, Γ} {¬L', Δ} / {Γ, Δ}mgu(L, L'). The literals L and ¬L' are called the literals resolved upon. The factorization rule: {L, L', Γ} / {L, Γ}mgu(L, L').

2 ANS-method and the D-calculus

2.1 ANS-method

Given a Skolemized formula F, there is a well-known method (we will call it the ANS-method) for finding a finite set of substitutions applied to some variable y during the refutation. We will assume that neither Skolemization nor conversion to clause form changes the names of non-Skolemized variables. Our presentation of the ANS-method differs slightly from the presentation in [4] and [1].

Definition 1. A formula, clause set, clause, literal or term is called ground iff it does not contain any variables.

Definition 2. An answer clause is either an empty clause or a clause containing only literals with the special predicate A, called the answer predicate.

Definition 3. ANS-method: Given a clause set S (S is assumed not to contain the predicate A) and a variable y, a new clause set S' is formed by adding a new literal A(y) with the answer predicate A to each clause in S containing the variable y. The refutation of S is found iff an answer clause C is derived from S'.

In case C is empty, no substitution has been applied to y and S{t/y} is unsatisfiable for arbitrary ground t. In case C = {A(t_1), ..., A(t_n)} (where 1 ≤ n), the set of substitutions for y is {t_1, ..., t_n}; thus the clause set S{t'_1/y} ∪ ... ∪ S{t'_n/y} is unsatisfiable, where t'_1, ..., t'_n are arbitrary ground instances of t_1, ..., t_n, respectively.
For the correctness and completeness proofs of the ANS-method see [4] or [1].

Example 1. Consider the formula F ≡ ((P(a) ∨ P(c)) ⇒ ∃yP(y)) and the main variable y in F. Skolemization gives (P(a) ∨ P(c)) ⇒ P(y). The clause form S of F is {{P(a), P(c)}, {¬P(y)}}. The result of adding the answer literals A is the clause set S': {{P(a), P(c)}, {¬P(y), A(y)}}. Resolution derives the answer clause {A(a), A(c)} from S', thus the set of substitutions for y is {a, c}.
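As a small illustration of Definition 3, the following Python sketch adds answer literals to the clauses of Example 1; the string-based clause representation and the naive occurrence check for the variable are simplifications made for this sketch only.

def add_answer_literals(clauses, y):
    """Add an answer literal A(y) to every clause that mentions y."""
    out = []
    for clause in clauses:
        if any(y in lit for lit in clause):      # naive occurrence check
            clause = clause | {'A(%s)' % y}
        out.append(frozenset(clause))
    return out

# The clause set S of Example 1 and its transformed version S':
S = [frozenset({'P(a)', 'P(c)'}), frozenset({'~P(y)'})]
S_prime = add_answer_literals(S, 'y')
# S_prime == [{'P(a)', 'P(c)'}, {'~P(y)', 'A(y)'}]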

There is a well-known class of formulas where the ANS-method always gives a single substitution for any unsatisfiable set of clauses: namely, the Horn class, which is the foundation of the Prolog programming language. A Prolog inference engine for queries containing variables may be seen as a special case of the ANS-method.
In the general case, however, the set of substitutions computed by the ANS-method may contain several elements and there is no direct way to use this set as a program for finding a single definite substitution t into the main variable of a formula (see the previous example). However, in some cases there exist also definite substitutions for the main variable.

Example 2. For the formula R ≡ (((P(a) ∨ P(c)) & P(d)) ⇒ ∃yP(y)) with the main variable y there exist both the result set {a, c} and the result set {d}, the latter giving the definite answer d.
2.2 D-calculus

The D-calculus is used for finding a definite substitution t for the main variable of a formula F, such that t is built of the function symbols in F. The D-calculus is a weaker version of the forthcoming A-calculus, the difference being that the A-calculus allows the substitution term t to contain the case splitting function "if", whereas the D-calculus does not.

Definition 4. The D-calculus is obtained from the ordinary resolution calculus by prohibiting the resolution rule from being applied to two premisses both containing the answer predicate A and adding the new D-resolution rule for this case:
The D-resolution rule:

{L, A(t), Γ}  {¬L', A(g), Δ}  /  {A(t), Γ, Δ}σρ,    σ = mgu(L, L'), ρ = mgu(tσ, gσ),

where A(t) and A(g) are answer literals, on the condition that both the atoms L and L' are unifiable, as well as the terms tσ and gσ.

Definition 5. A definite answer clause is either an empty clause or an answer clause containing a single literal.

Proof search by the D-calculus is completed iff a definite answer clause is found.

Definition 6. By D-completeness of a certain resolution calculus C we will mean completeness for definite answers: if there exists a ground term t such that a substitution instance S{t/y} of a clause set S is unsatisfiable, the calculus C will either derive an empty clause or a clause {A(g)} from the clause set S' obtained from S by adding a literal A(y) to every clause in S containing y, such that S{g'/y} is unsatisfiable for any ground instance g' of g.

The following D-completeness Lemma 1 is proved in [14]. As our own proof is a subcase of the main result (Theorem 1) of our paper, we will skip the proof.

Lemma 1 (D-completeness of the D-calculus). Let G be a clause set containing a variable y. Suppose that the clause set G{t/y} is unsatisfiable for some ground term t. Let G' be a clause set obtained from G by adding an answer literal A(y) to each clause of G containing the variable y. Then the D-calculus will derive a definite answer clause from G'.

Example 3. Consider the formula R ≡ (((P(a) ∨ P(c)) & P(d)) ⇒ ∃yP(y)) with the main variable y. The clause form G' of R after adding the answer literals is {{P(a), P(c)}, {P(d)}, {¬P(y), A(y)}}. The D-calculus cannot derive the answer clause {A(a), A(c)} from G', but it does derive the definite answer clause {A(d)}.

Definition 7. The formula F belongs to the Simple Class iff it contains no essentially universal quantifiers except the ones binding the parametric variables.

As the Skolemized forms of formulas in the Simple Class do not contain any nonparametric Skolem functions, any term given by the D-calculus for the main variable of these formulas can be used as a program. In the general case we can ensure usability of the terms in the derived definite answer clauses by using the following restricted form of the D-calculus.
D e f i n i t i o n 8 . A computable predicate R on (possibly non-ground) terms is
called a liflable term restriction iff it has the following property:

VtV,~.R(t,~) ~ n(t)
where t is a term and ~r is a substitution. R is a predicate on the meta-level, not
in the object language of resolution.

D e f i n l t i o n 9 . D(R)-calculus is obtained from the D-calculus by the following


restriction: it is prohibited to derive any clause C containing an answer literal
A(t) such that R(t) does not hold for a given liftable term restriction R.

For example, we can define a certain liftable term restriction R_S(t) as "t does not contain function symbols from the set S". It is easily seen that for any S, R_S is indeed a liftable term restriction. E.g. the set of nonparametric Skolem functions in a clause set is typically taken as the set S.
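As an illustration, here is a minimal Python sketch of such a restriction, with terms encoded as nested tuples and variables as ('var', name); this encoding and the names are chosen for the sketch, not taken from the paper.

def violates(term, forbidden):
    """True iff some symbol of 'forbidden' occurs in the term."""
    if term[0] == 'var':
        return False
    return term[0] in forbidden or \
           any(violates(arg, forbidden) for arg in term[1:])

def r_s(term, forbidden):
    """The restriction R_S: the term contains no symbol of the set S."""
    return not violates(term, forbidden)

# Liftability holds because instantiating a term only adds symbols: if
# R_S already fails for t, it fails for every instance of t.
t = ('f', ('sk1', ('var', 'x')))
assert not r_s(t, {'sk1'})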
The definition of D(R)-completeness is obtained from the definition of D-completeness by requiring that the terms t and g satisfy the criterion R. We prove D(R)-completeness of the D(R)-calculus. The proof is a subcase of Theorem 1.

3 A(R)-calculus

The D(R)-calculus fails to find a proof for a large class of formulas which admit a proof in intuitionistic logic. The reason for this, roughly speaking, is that intuitionistic logic assumes subformulas of any formula F to have an associated program (realization of the formula), whereas the D(R)-calculus assumes only the function symbols in F to have an associated program.

Example 4. Consider the formula F ≡ ((P(a) ∨ P(b)) ⇒ ∃yP(y)). F does not admit a proof by the D(R)-calculus. However, F is provable in intuitionistic logic.

We argue that in program specifications the following restriction (stemming from the A(R)-calculus) is natural: nonatomic formulas are never assumed to have associated programs (realizations); only some predicate and function symbols and variables may be assumed to have associated programs. Thus nonatomic formulas composed of atomic formulas by propositional connectives may only have a derived associated program. For example, if a predicate P has an associated program p, then a formula P(x) ∨ P(y) has a derived associated program λxy.Or(p(x), p(y)) where Or is the standard Boolean disjunction.
The A(R)-calculus we give (it is based on the A-resolution calculus first presented in [2] and [1], later extended to the Manna-Waldinger calculus [6], [7]) is an extension of the D(R)-calculus which allows the derivation of answer literals containing the case analysis function if and predicate symbols. The function if is defined in the standard way: if(True, x, y) = x and if(False, x, y) = y. The first arguments of the if-terms are assumed to be type-correct (according to some given computable type-checking algorithm) literals containing only computable function and predicate symbols.
Showing type-correctness of literals and computability of function and predicate symbols is outside the scope of the A(R)-calculus. For the purposes of the A(R)-calculus the "type-correctness and computability of a term t (possibly containing if and predicate symbols)" means just that R(t) holds for an explicitly given term restriction R. Observe that the check for a first-order term to be type-correct for some monomorphic type assignment to function and predicate symbols is indeed a liftable term restriction.
The notion of a liftable term restriction has to be strengthened in order for the forthcoming completeness Theorem 1 to succeed.

Definition 10. A liftable term restriction R is called strongly liftable iff

∀t(R(t) ⇒ ∀g(iarg(g, t) ⇒ R(g)))

where iarg(g, t) is true iff g is an argument of some occurrence of the function if in t.

D e f i n i t i o n l l . Given a strongly liftable term restriction R, the A(R)-calculus


is obtained from the D(R)-calculus by adding a new A-resolution rule:

{L,A(t),F} {~L',A(g),A} r mgu(L,n')


{A(/f(L,g,t)),F,A}~ =
where A(t) and A(g) are answer literals, on the condition that the atoms L and
L' are unifiable and R(/f(L, g, t)a) holds.

The A(R)-calculus can be used for finding programs for the main variables of formulas in the same way as the D(R)-calculus.

Example 5. Consider the formula F ≡ ((P(a) ∨ P(b)) ⇒ ∃yP(y)) and let the term restriction R hold for all terms (thus we assume P, a and b to be computable). The clause form G' of F (after adding the answer literals) is: {1: {P(a), P(b)}, 2: {¬P(y), A(y)}}. The derivation of the answer clause in the A(R)-calculus: 1 and 2 give 3: {P(b), A(a)}; 3 and 2 give an answer clause: {A(if(P(b), b, a))}. Thus the A(R)-calculus gives a program (without arguments) if(P(b), b, a) for computing the value of y.
Suppose that the algorithm we have for computing the predicate P is defined only on a. In that case we define the restriction R(t) as "any subterm of t with the leading symbol P has either the form P(a) or P(x) for some variable x". Then the A(R)-calculus cannot derive the answer clause {A(if(P(b), b, a))}; the only answer clause it can derive is {A(if(P(a), a, b))}.
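The derivation in Example 5 can be replayed mechanically on ground instances of clause 2, where the unifiers are trivial. The following Python sketch does this; the string-based representation, the omission of the restriction check R, and the helper names are simplifications of this illustration, not part of the calculus as defined above.

ANS = 'A'  # the answer predicate

def is_answer(lit):
    sign, atom = lit
    return atom.startswith(ANS + '(')

def resolve(c1, c2, atom):
    """Ordinary ground resolution on an atom occurring positively in c1
    and negatively in c2; answer literals are simply carried along."""
    assert ('+', atom) in c1 and ('-', atom) in c2
    return (c1 - {('+', atom)}) | (c2 - {('-', atom)})

def a_resolve(c1, c2, atom):
    """Ground A-resolution: combine the answer terms t (from c1) and g
    (from c2) into a single answer literal A(if(atom, g, t))."""
    t = next(l for l in c1 if is_answer(l))[1][2:-1]   # strip 'A(' ... ')'
    g = next(l for l in c2 if is_answer(l))[1][2:-1]
    rest = (c1 - {('+', atom), ('+', 'A(%s)' % t)}) | \
           (c2 - {('-', atom), ('+', 'A(%s)' % g)})
    return rest | {('+', 'A(if(%s,%s,%s))' % (atom, g, t))}

# Clause 1 and two ground instances of clause 2 from Example 5:
c1 = frozenset({('+', 'P(a)'), ('+', 'P(b)')})
c2a = frozenset({('-', 'P(a)'), ('+', 'A(a)')})
c2b = frozenset({('-', 'P(b)'), ('+', 'A(b)')})

c3 = resolve(c1, c2a, 'P(a)')          # {P(b), A(a)}
answer = a_resolve(c3, c2b, 'P(b)')    # {A(if(P(b),b,a))}
print(sorted(answer))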

3.1 D_if(R)-completeness of the A(R)-calculus

Definition 12. A type Boolean is the set of the two logical constants True and False. A Boolean function is a function taking n (0 ≤ n) arguments of the Boolean type and returning a value of the Boolean type. We consider "if" to be a polymorphic function: an occurrence of if is Boolean iff all its arguments are Boolean. A term t is of a Boolean type if either t is a logical constant, a literal, or has the form f(t_1, ..., t_n) where f is a Boolean function and all the t_i (1 ≤ i ≤ n) are of the Boolean type.

Definition 13. A term t is said to be B-correct iff both of the following hold: (1) each proper subterm s of t which is of a Boolean type occurs either as a first argument of an if-term or as an argument of a Boolean function, (2) the first argument of each if-term in t and all arguments of all Boolean functions in t are of a Boolean type.

We will start by defining the notion of D_if(R)-completeness. The definition is obtained from the definition of D(R)-completeness by introducing a few changes. Given a term t containing if and predicate symbols along with a clause set S and the main variable y, we cannot any more speak about the satisfiability of the construction S{t/y}, since S{t/y} is not a clause set in the standard sense. We will overcome the problem by extending the standard model-theoretic definition of satisfiability for classical first-order predicate logic by defining the semantic function for if: if(True, x, y) = x and if(False, x, y) = y. Here and in the following we will assume that all the terms in a clause set or a formula we have are B-correct. Thus the semantic value of a first argument of if can only be True or False.
Definition 14. Given a clause set S, we obtain a clause set S' by adding an answer literal A(y) to every clause in S containing y. By D_if(R)-completeness of a certain resolution calculus C we will mean the following: if there exists a B-correct ground term t such that R(t) holds and the clause set S{t/y} is unsatisfiable, then the calculus C will derive from S' either an empty clause or a clause {A(g)}, such that g is B-correct, R(g) holds and S{gσ/y} is unsatisfiable for any ground instance gσ of g. R is assumed to be a strongly liftable term restriction and σ is assumed not to contain the function if, Boolean functions or any predicate symbols.
We will present an explicit algorithm for converting constructions containing if-terms to equivalent standard first-order formulas. This algorithm is needed for the forthcoming completeness proof.

Definition 15. The algorithm A_I takes a B-correct term t possibly containing Boolean functions. It replaces all the subterms (with the forthcoming exceptions) built using Boolean functions by equivalent terms containing if, True and False instead of Boolean functions. Exceptions: subterms led by a negation which has an atomic argument are not changed, and Boolean occurrences of if are preserved.

Definition 16. The algorithm A_T. First, a function step is defined as:

step(g(x_1, ..., x_i, if(y_1, y_2, y_3), x_{i+2}, ..., x_n)) →
    if(y_1, g(x_1, ..., x_i, y_2, x_{i+2}, ..., x_n), g(x_1, ..., x_i, y_3, x_{i+2}, ..., x_n))
step(if(if(y_1, y_2, y_3), z_1, z_2)) → if(y_1, if(y_2, z_1, z_2), if(y_3, z_1, z_2))

for all i (1 ≤ i ≤ n), for all n and for all predicate and function symbols g except if.
The algorithm A_T takes an arbitrary B-correct term t and computes a tree form A_T(t) of t by repeated applications of the function step to the term computed by A_I(t), until step cannot be applied any more.

Definition 17. The algorithm A_F. A function fs is defined as:

fs(if(True, y_2, y_3)) → fs(y_2),    fs(if(False, y_2, y_3)) → fs(y_3)
fs(if(y_1, y_2, y_3)) → ((fs(y_1) ⇒ fs(y_2)) & (¬fs(y_1) ⇒ fs(y_3)))
fs(x) → x iff x is not led by if

The algorithm A_F takes an arbitrary B-correct term t of a Boolean type and computes a flattened form A_F(t) = fs(A_T(t)) of t.
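A minimal Python sketch of the function fs (only the flattening part of A_F, not the tree form A_T), assuming if-terms are encoded as tuples ('if', cond, then, else) and literals as plain strings; the encoding and output syntax are choices made for this illustration.

def fs(t):
    """Flatten a Boolean-typed term with if-terms into a formula string."""
    if isinstance(t, tuple) and t[0] == 'if':
        _, y1, y2, y3 = t
        if y1 == 'True':
            return fs(y2)
        if y1 == 'False':
            return fs(y3)
        return '((%s => %s) & (~(%s) => %s))' % (fs(y1), fs(y2), fs(y1), fs(y3))
    return t                       # not led by 'if': returned unchanged

# Example, with illustrative literals:
print(fs(('if', 'P(b)', 'Q(b)', 'Q(a)')))
# prints: ((P(b) => Q(b)) & (~(P(b)) => Q(a)))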

We will introduce the construction S[t/y] for clause sets, similar to the ordinary substitution S{t/y}. The difference is that in the newly introduced construction the term t may contain literals and the function if, thus we will use the algorithm A_F to "flatten out", so to say, any literals containing the term t after the direct substitution.

Definition 18. Let t be a ground B-correct term, possibly containing literals and the function if. Consider the clause set S to be a conjunction of disjunctions of literals. Build a new construction S_t by replacing the variable y everywhere in S by the term t. Build the formula S_{A_F(t)} by replacing all the literals L in S_t containing t by the formula computed by A_F(L). S[t/y] is obtained by bringing the formula S_{A_F(t)} to clause form again and removing all the tautologous clauses.

Example 6. Let F ≡ ((P(a) ∨ P(b)) ⇒ ∃yP(y)). The clause form of F is S: {{P(a), P(b)}, {¬P(y)}}. Let t be the term if(P(b), b, a). Then S[t/y] is {{P(a), P(b)}, {¬P(b)}, {P(b), ¬P(a)}}, which is obtained by converting the following formula to conjunctive normal form: (P(a) ∨ P(b)) & ((P(b) ⇒ ¬P(b)) & (¬P(b) ⇒ ¬P(a))), which in turn is obtained from the following construction by one application of the algorithm A_F: (P(a) ∨ P(b)) & (¬(P(if(P(b), b, a)))).

Observe that for the earlier mentioned "Simple Class" (the class where all Skolem functions are parametric) the A(R)-calculus is complete even in the standard classical sense: since all the Skolem functions are parametric, we define R to return True for every term t. Then if a formula F in that class is provable, the A(R)-calculus derives from the clause form S of F a definite answer clause which is either empty or has the form A(g) such that S[g/y] is provable.
The following completeness theorem for the general case is the main result of the paper.

Theorem 1. The A(R)-calculus is D_if(R)-complete.

Proof. Recall the definition of D_if(R)-completeness: we assume that we have a clause set S and there is a B-correct ground term t such that S{t/y} is unsatisfiable. In that case also the clause set S[t/y] is unsatisfiable.
Recall the construction of S[t/y]. We denote the set of literals in S[t/y] built from the literals in the term t by the algorithm A_F as I. Notice that since t is ground, all the literals in I are also ground. Further, each literal in I has both a positive and a negative occurrence in I.
Let G = (g_1, ..., g_l) be the sequence of all non-if-terms occurring in the tree form of t as the second and third arguments of the function if. G represents the possible choices given by t for the term to be substituted, so to say. For each element g_i in G there is a corresponding choice path in the tree form of the term t. Let Δ_i be the set of all the literals on that path, but in negated form (the element g_i is chosen by t iff all the literals in Δ_i have the truth value False). We call the set Δ_i a path-clause of the term g_i. Due to the construction of the term t, the set of the path-clauses of all elements of G is unsatisfiable. Further, for any two different path-clauses Δ_i and Δ_j (i ≠ j) we know that the clause Δ_i ∪ Δ_j is a tautology. Notice also that each literal in I has both a positive and a negative occurrence in I.
Let S_y be the set of those clauses in S which contain the variable y and let S_0 be the set of all the other clauses in S. Notice that in S[t/y] all the clauses in S_0 are preserved unchanged. S[t/y] = S_0 ∪ S_t, where S_t is built from the clauses S_y = {C_1, ..., C_k} and can be assumed to have the following form:

{C_1{g_1/y} ∪ Δ_1, ..., C_1{g_l/y} ∪ Δ_l, ..., C_k{g_1/y} ∪ Δ_1, ..., C_k{g_l/y} ∪ Δ_l}

(with the tautologous element clauses missing due to tautologies being removed by the construction of S[t/y]).
Since S[t/y] is unsatisfiable, there must be some unsatisfiable finite Herbrand expansion S[t/y]_E of the set S[t/y]. Recall that the finite Herbrand expansion

of some clause set {C_1, ..., C_n} is the set

{C_1σ_1, ..., C_1σ_m, ..., C_nσ_1, ..., C_nσ_m}

where each C_iσ_j is ground and contains only predicate, function and constant symbols from the set {C_1, ..., C_n} (plus a single new constant symbol, in case {C_1, ..., C_n} contains none).
Unless it is explicitly said otherwise, we will in the following use ordinary resolution (not the D- or A-calculus) which is restricted in the following completeness-preserving manner. We introduce the following ordering of ground literals in S[t/y]_E: all the literals in S[t/y]_E which do not occur in the set I are preferred for resolution over the literals occurring in I. We restrict the resolution method by allowing resolution upon a literal L in a clause C only if C does not contain any literal preferred over L. This restriction is a case of so-called ordered semantic resolution, see [3] or [1]. We will restrict resolution further by prohibiting the derivation of tautologies (clauses containing some literal L and its negation ¬L). This restriction preserves completeness for semantic resolution.
We build the clause set S[t/y]_EA from the clause set S[t/y]_E by adding an answer literal {A(g_i)} to each clause C_j{g_i/y} ∪ Δ_i (1 ≤ j ≤ k, 1 ≤ i ≤ l) built from some clause C_j in S_y.

Definition 19. A clause in S[t/y]_EA or derived from S[t/y]_EA is called a final clause iff it contains only answer literals and literals from I.

We will first show that if some final clause C is derived from S[t/y]_EA, it is impossible to use C for the derivation of a nonfinal clause. Consider a resolution inference with premisses being C and some other clause C'. The consequent is non-final only if C' contains literals not occurring in I. Let N be the set of all these literals in C' which do not occur in I.
Due to the construction of N and I, none of the literals in N occur in C either positively or negatively. Thus the inference step is possible only if C' contains also some literals from I. But these literals cannot be resolved upon in C' due to the ordering restriction we use.
As a second step we will show that from two non-final clauses C and C' it is impossible to derive a clause C'' such that C'' contains more than one occurrence of an answer literal. We assume that C contains some answer literal {A(g_i)} and C' contains some answer literal {A(g_j)}. Consider the case i ≠ j. Since C and C' as non-final clauses cannot have been inferred from final clauses, the consequent of the inference would be a tautology (due to the construction of S[t/y]_EA) and thus is not allowed to be inferred according to our resolution strategy. Consider the case where i = j. Then {A(g_i)} is the same as {A(g_j)}. As they are both ground, the consequent of the inference contains a single answer literal.
As a third step, notice that any clause inferred from two final clauses by an A-resolution inference step contains either no answer literals or a single ground answer literal A(d) where d is constructed from the terms in G and literals in I and the function if. Since R is assumed to be strongly liftable, R(t) is assumed

to hold and the clause set S[t/y]E is assumed to be unsatisfiable, the A(R)-
calculus derives from S[t/y]EA a definite answer clause with a term satisfying
R. Let D be such a derivation.
Finally, consider the original clause set S. Add an answer literal A(y) to each
clause containing the variable y. We get the following clause set S~:

S' - (So U {{Cl, A ( y ) } , . . . , {Ok, A(y)}})

For each clause C/in the set S[t/y]~A there is a clause C~ in the set S' subsuming
Ci (in general, several clauses in S[t/y]~A may map to one clause in S').
We will now lift the derivation D from the clause set S[t/Y]EA to the deriva-
tion of a definite answer-clause from the set S ~. We note that the standard lifting
lemma is not true for the A(R)-calculus due to the D-resolution rule. However,
we can show that D can be assumed to have a special form such that the standard
lifting lemma is applicable. Namely, whenever there is a derivation of a definite
answer-clause from the final clauses, then there is also a derivation without the
use of the D-resolution rule (since literals in final clauses satisfy the R-restriction,
D-resolution inferences can be replaced by A-resolution inferences). Considering
D-resolution inferences from the non-final clauses, we observe that the answer
literals in the figure do not contain if and thus standard lifting is applicable.
Lifting: transform the derivation D to a derivation D ~by replacing each input
clause Ci in S[t/y]EA by the subsuming clause C~ in S' and each clause inferred
in D by the correspondingly inferred subsuming clause. Remove the resolution
inferences which become impossible (it is possible to remove those in lifting since
for such figures the lifted consequent is the same as the lifted premiss).
Conclusion: since Sit~y] is assumed to be provable, S[t/y]E is an arbitary
finite unsatisfiable Herbrand expansion of Sit~y], the term with the required
properties was derivable from the clause set S[t/y]EA by the ordered A(R)-
resolution and the term restriction R is strongly liftable, the term with the
required properties is also derivable from the clause set S ~ by the unrestricted
A(R)-resolution.

By examining the proofs of the completeness theorems in the paper we can


easily see that the results hold also in case full subsumption and tautology elim-
ination are used during proof search. We will present a lemma guaranteeing
Dil (R)-completeness of a subset of ordering restrictions of resolution which pre-
serve (standard) completeness of resolution. We say that an ordering ~- of literals
preserves (standard) completeness of resolution iff for any unsatisfiable clause
set S there is a derivation of an empty clause such that a literal L in a clause C
is not resolved upon in case there is a literal L * in C such that L* ~- L. See [3]
for the detailed analysis of ordering restrictions.

L e m m a 2 Let ~- be an ordering of literals and let ~ preserve completeness of


resolution. Then ~- will preserve Di! (R)-completeness of A(R)-resolution if both
of the following hold:
- liftability: VL, L', m(L' ~ L) ~ (L'a ~ La)
320

- R-compatibility: VL.-~R(L) ~ -,3L"(L' ~- L)


The proof can be inferred by examining the proof of the main theorem above.
We remark that the given criteria can be strengthened by comparing literals in
the scope of a derivable clause instead of comparing all the possible literals.
The following is an example of Dq (R)-completeness being lost due to the
ordering not satisfying R-compatibility:

Example 7. Let ~-a be defined as "literals with the predicate G are preferred over
all the other literals". The ordering ~-a is an instance of the semantic resolution
and is thus known to preserve completeness of resolution. Consider the clause
set S: {{-~a(b,y), P(a), A(y)}, {G(y,a), P(b), A(y)}, {--P(y), A(y)}}. We
define R(t) as "t does not contain the predicate symbol P". Then there is no
A(R)-derivation of a definite answer clause from S satisfying either the ordering
~-G or the hyperresolution-compatible semantic ordering, whereas there is an
A(R)-derivation of the definite answer clause {A(if(G(b, a), a, b))} from S using
unrestricted A(R)-calculus.

4 An Example of Program Synthesis

We will present an example of program synthesis using the A(R)-algorithm. We


will assume the use of the paramodulation rule instead of an explicit axiomati-
zation of equality, see [12].
The program will take lists as input. Lists are built inductively from the
constant nil (empty list) and arbitrary objects by the pair constructor c, so that
a list c(h, x) is obtained from the list x by prepending a new element h.
We take a predicate m such that re(x, y) means: x is a member of the list y.
We take as an axiom the following formula defining rn:

(Vx(-~m(x, ,it))) ~ (Vx, y, z(m(x, c(y, z)) r (x = y V re(x, z))))

We will use structural induction over lists:


Vx2... x,,By.A{nil/xl }
VX((VX2...xn3Yl.A{x/xl, Yl/Y}) ~ (Vhx2... x,3y.A{c(h, x)/Xl}))
VX I . . . x,~3y.A
In the second premiss of the scheme above the variable yl is bound by the
essentially universal quantifier and has an interpretation as a recursive case of
the program for computing y. We will present the program extracted from the
proof of the basis and step formulas (see I11]) as two equalities, one for the
constructor nil and one for the pair constructor c.
The choice of the induction principle is not relevant for our aim of demon-
strating the A(R)-calculus proofs of the first-order tasks. The lack or presence
of additional lemmas is also irrelevant for the A(R)-calculus.
Take the previous definition of a list membership predicate m and assume
m to be decidable. Take an arbitrary decidable predicate P. Derive a program
321

to find a member of a list satisfying P, under assumption that the list contains
such a member. The specification is
Vx((3y(m(y, x) & P(y))) ::~ (3z(m(z, x) & P(z))))
and we want to find a program to compute a value for z for any list-type value
of x. We define R(t) as "t does not contain the Skolem function for y".
First, an attempt to derive a definite answer clause with a term satisfying
R fails, if we do not use induction. Conversion of the whole problem to the
resolution form (clauses axiomatizing equality are skipped, overlined variables
like 9 represent Skolem functions, first four clauses come from the definition of
m, A is the answer predicate to collect substitutions):
1) {~m(x, nil)}
2) {~m(x, c(y, z)), x = y, m(x, z)}
3) {x r y, m(x,c(y,z))}
4) {'-~m(x, z), re(x, c(y, z))}

6) {P(~)}
7) {-m(z,~), -P(z), A(z)}
The only derivable definite answer clause is derived in the following way:
5,6 and 7 give 8) {A(~)}
But this answer is discarded by R.
We get successful derivations of a definite answer clause by using one struc-
tural induction over x.

4.1 Induction Base

(3y(m(y, nil) & P(y))) =~ (3z(m(z, nil) & P(z)))


Conversion of the problem to the resolution form:
1) {-~m(x, nil)}
2) c(y, z)), x = y, m(=, z)}
3) {x r y, m(x, c(y, z))}
4) {-~m(x, z), m(x, c(y, z))}
5) {m(~, nil)}
6) {P(~)}
7) {-~m(z, nil), -~P(z), A(z)}
There is a one-step refutation: clauses 1 and 5 give contradiction. This refuta-
tion does not instantiate the variable z. Thus any substitution to variable z is
admissible, and the base case for the program is
z(nil) = t
where t is an arbitrary object. Indeed, due to the assumption the base case will
be never reached, thus the value of z on nil does not have any importance.
322

4.2 I n d u c t i o n Step

w(((3uCm(u, ~) a P(u))) ~ (3z(m(z, ~) ~ P(z))))


Vw( (:tu(m(u, c(w, x) ) & P(u) ) ) =>. (3v(m(v, c(w, x) ) ,~ P(v))))))
and we want to get a program to compute the value of v for any value of w (head
of the list) and x (tail of the list).
The following variables have interpretation: w and x as formal parameters, z
as an induction hypothesis (the program for the recursive case). We define R(t)
as "t does not contain Skolem functions for y and u".
The resolution form:
1) {'-,m(x, nil)}
2) {-~m(x, cCu, z)), x = u, m ( ~ , z ) }
3) {x • y, m(x, c(y, z))}
4) {--,~(~, z), m(,,,, ~(u, ~))}
5) {--,m(y,~), -~P(y), m(7,~)}
6) {'-,m(y,~), -~P(y), P(~)}
7) {m(~, c(~,~))}
8) {P(~)}
9) {~m(v, c(~,~), -~P(v), m(v)}
The derivation:
2,7 give lO) {~ = ~, m(~,~)}
9,3 give 11) {v # ~, -~P(v), A(v)}
9,4 give 12) {-~m(v,~), -~P(v), A(v)}
12,10 give 13) {~ = ~, -~P(~), A(~)}
10,5 give 15) {",P(~), u = w, m(~,~)}
15,8 give 16) {~ = ~, m(~,~)}
10,6 give 17) {--,P(~), u = w, P(~)}
17,8 give 18) {~ = ~, P(~)}
16,4 give 19) {~ = ~, m(7, c(y,u
19,9 give 20) {~ = ~, --,P(~), A(~)}
20,18 give 21) {~ = ~, A(2)}
21,8 give 30) {P(~), A(2)}
11, = reflexivity give 26) {--P(~), A(~)}
26,30 give: {a(if(P(~),-~,-~))}
The substitution if(P('~), ~,-~) found gives a program for the recursive case:
z(c(w, x) ) = if(P(w), w, z(x) )
As a whole the program is the following:
z(nil) = t
z(c(w, x)) = if(P(w), w, z(x))
where t is arbitrary, i.e. the end of the list is known to be never reached.
323

A c k n o w l e d g e m e n t s : We'd like to thank Jan Smith, T a r m o Uustalu and espe-


cially Grigori Mints for useful discussions, criticism, numerous ideas and sugges-
tions. We are also grateful to severM anonymous referees.

References
1. C.L.Chang, R.C.T Lee. Symbolic Logic and Mechanical Theorem Proving. Aca-
demic Press, 1973.
2. C.L.Chang, R.C.T Lee, R.Waldinger. An Improved Program-Synthesizing Algo-
rithm and its Correctness. Comm. of ACM, (1974), V17, N4, 211-217.
3. C.Ferm/iller, A.Leitsch, T.Tammet, N.Zamov. Resolution Methods for the Deci-
sion Problem. Lecture Notes in Artificial Intelligence 679, Springer Verlag, 1993.
4. C.Green. Application of theorem-proving to problem solving. In Proc. 1st Inter-
nat. Joint. Conf. Artificial Intelligence, pages 219-239, 1969.
5. S.C.Kleene. Introduction to Metamathematics. North-Holland, Amsterdam, 1952.
6. Z.Manna, R.Waldinger. A deductive approach to program synthesis. ACM Trans.
Programming Languages and Systems, (1980), N2(1), 91-121.
7. Z.Manna, R.Waldinger. Fundamentals of Deductive Program Synthesis. 1EEE
Transactions on Software Engineering, (1992), V18, N8, 674-704.
8. G.Mints, E.Tyugu. Justification of the structural synthesis of programs. Sci. of
Comput. Program., (1982),N2, 215-240.
9. G.Mints. Gentzen-type Systems and Resolution Rules. Part I. Propositional
Logic. In COLOG-88, pages 198-231. Lecture Notes in Computer Science vol.
417, Springer Verlag, 1990.
10. G.Mints. Gentzen-type Systems and Resolution Rules. Part II. Predicate Logic.
In Logic Colloquium '90.
11. B.Nordstr6m, K.Petersson, J.M.Smith. Programming in Martin-L6f's Type The-
ory. Clarendon Press, Oxford, 1990.
12. G.Peterson. A technique for establishing completeness results in theorem proving
with equality. SIAM J. of Comput. (1983), N12, 82-100.
13. J.A. Robinson. A Machine-oriented Logic Based on the Resolution Principle.
Journal of the ACM 12, 1965, pp 23-41.
14. U.R.Schmerl. A Resolution Calculus Giving Definite Answers. Report Nr 9108,
July 1991, Fakultiit fiir Informatik, Universitiit der Bundeswehr Miinchen.
15. N.Shankar. Proof Search in the Intuitionistic Sequent Calculus. In CADE.11,
pages 522-536, Lecture Notes in Artificial Intelligence 607, Springer Verlag, 1992.
S u b r e c u r s i o n as a B a s i s for a F e a s i b l e
Programming Language

Paul J. Voda

Institute of Informatics, Comenius University Bratislava,


Mlynsks dolina, 842 15 Bratislava, Slovakia.
voda@fmph.uniba.sk

A b s t r a c t . We are motivated by finding a good basis for the seman-


tics of programming languages and investigate small classes in subre-
cursive hierarchies of functions. We do this with the help of a pairing
function because in this way we can explore the amazing coding pow-
ers of S-expressions of LISP within the domain of natural numbers. We
introduce three Grzegorczyk-like hierarchies based on pairing and char-
acterize them both in terms of Grzegorczyk hierarchy and computational
complexity.

1 Introduction

The motivation for this research comes from our search for a good programming
language where we are constructing computable functions over some inductively
presented domain. The domain of LISP, i.e. S-expressions, is an example of a
simple, yet amazingly powerful, domain specified as words. We have designed
and implemented two practical declarative programming languages Trilogy I
and Trilogy II based on S-expressions with a single atom 0 (Nil) [9, 1]. Since
the domain of S-expressions with a single atom is denumerable it seems natural
to identify it with the set of natural numbers. Functions of our programming
language will become recursive functions. The identification is obtained by means
of a suitable pairing function.
Quite a few people have investigated properties of S-expressions but to our
knowledge nobody has done it in the context of subreeursion. Yet, a feasible
programming language should restrict itself to functions computable by binary
coded Turing machines in polynomial time. This class is a subclass of elemen-
tary functions which is a small subclass of primitive recursive functions. Hence
it seems natural to study the connection between a pairing-function-based pre-
sentation of primitive recursive function hierarchies with the usual presentation
based on the successor function s(x) = x + 1. The relation to Grzegorczyk-based
hierarchies shouId be central. The connection between recursive classes of func-
tions (based both on the successor recursion and on the recursion on notation)
and classes of computational complexity is quite well-understood now (see for
instance [10]). We investigate this connection by recursion based on pairing.
Section Sect. 2 introduces the pairing function P. We order the presentation
of primitive pair recursive functions in Sect. 3 in such a way that we can quickly
325

develop computational complexity classes in Sects. 4 and 5. The development


critically depends on the right choice of a pairing function. The proofs of most
lemmas in this paper are omitted. Interested reader can obtain a copy of the full
paper from the author by email.
Our contributions are (i) in the design of a clausal language for the definition
of recursive functions essentially as a usable computer programming language
(Sect. 3), (ii) in the insight that the natural measure of S-expressions, i.e. the
number of pairing operators (cons), should be tied to the size of natural numbers
via our pairing function P. This gives us a characterization of pair-based func-
tion hierarchies by means of both Grzegorczyk-based hierachies (Sect. 4) and
Turing machines (Sect. 5). Finally, we get new (iii) recursion-theoretic closure
conditions under which P = NP, P ~- P S P A C E , and P H = P S P A C E .

2 Pairing functions
All functions and predicates in this paper are total over the domain of natural
numbers N. It is well known that in the presence of a pairing function we can
restrict our attention to the unary functions and predicates. Unless we explicitly
mention the arity of our functions and predicates they will be understood to be
unary.
A binary function (., .) is a semi-suitable pairing function if it is ( P l ) : a
bijection from N 2 onto N \ {0}, and we have (P2): (x, y) > x and (x, y) > y.
The condition ( P 1 ) assures the pairing property that from (a, b) = (c, d) we get
a = c and b = d and the property that 0 is the only atom, i.e. the only number
not of the form (x, y).

T h e o r e m 1 P a i r I n d u c t i o n . If (.,-) is a semi-suitable pairing function and R


a predicate such that R(O) and VxVy(R(x) A R(y) ~ IE (x, y)) then Vx R(x).

Pro@ By complete induction on R(x). []

T h e o r e m 2 P a i r R e p r e s e n t a t i o n . If (., .) is a semi-suitable pairing function


then every natural number has a unique pair representation as a term obtained
from 0 by finitely many applications of the pairing function.

Proof. By pair induction where R(x) iff x has a unique representation. []

We will abbreviate (x, (y, z)} t o ( x , y , z) and when discussing only unary
functions we will write x, y for (x, y). Thus '., .' can be viewed as an infix pairing
operator with a lowest precedence where, for instance, x+y, z stands for (x+y), z.
Thms. 1 and 2 guarantee that for a semi-suitable pairing function (a): every
number x is either 0 or it can be uniquel[y written in the form xl, x2,..., x~, 0
for some n > 1 and numbers xi. Thus every number codes a single finite sequence
over numbers (codes of finite sequences are called lists in the computer science),
and vice versa. (b): There exist unique pair size Ixl and length t e n ( x ) functions
such that 101 = 0, Ix,yl = I x l + l y l + l , and Len(O) = 0 and Len(x,y) = Len(y)+l.
326

The function L e n ( x ) gives the length of the finite sequence coded by z. We have
Len(x) < Izl .
There are many semi-suitable pairing functions. For instance we can offset
the standard recursion-theoretic pairing function J [3] by one: J ( x , y) = ((x +
y). (x + y + 1)) + 2 + x + 1. However, the function J is not good for our purposes
as it is not a suitable pairing function satisfying the additional condition ( P 3 ) :
I~l = O ( l o g ( ~ ) ) . We will see in Sect. 5 that a suitable pairing function gives a
rise to the classes of functions with very desirable properties of computational
complexity.
Let us temporarily assume that there is a binary function P ( z , y) = x, y such
that the following sequence of pair representation terms enumerates all natural
numbers in the natural order:

0; 0, 0; o, 0, 0; (0, 0), 0; 0, 0, 0, 0; o, (0, 0), 0; (0, 0), (0, 0); (0, 0, 0), 0; ((0, 0), 0), o;...
(1)
The sequence is obtained by letting the term x to precede the term y iff Ixl < lY]
or Ixl = IY], x = Xl,X2, y = yl,y2 and either xl precedes yl or else xl = Yl and
x2 precedes y2. It should be clear that P (if it does exist) is unique and it is a
semi-suitable pairing function. The closed interval

[0, 0,...,0,0 ; (... ((0, 0), 0),..., 0), 0] (2)


where both bounds have the pair size n > 1 contains all numbers with the
pair size n. The lower bound of the interval is a right-leaning number while the
upper bound is a left-leaning number. The length of the interval is the number of
ways an expression consisting of n binary infix operators can be parenthesized.
These numbers are known as Catalan numbers C ( n ) = ~+11 . ( 2 : ) (see [4]).
By convention C(0) = 1 and this is also the length of the closed interval [0; 0]
consisting of the numbers of the pair size 0. By a straightforward manipulation
of binomial coefficients we get C ( n + 1) = ~n + 2 9 C(n). From this by a simple
induction proof we get
2 " - 1 < C(n) < 4 '~ . (3)
The function
~(n) = ~ c(i) (4)
i<rt
yields the minimal number with the pair size n. For all n the numbers in the
closed interval [cr(n); cr(n + 1) - 1] are exactly the numbers with the pair size n.
We clearly have
Lxl = ~ n < x Ix < ~(n + 1)] (5)
where # is the operator of bounded m i n i m u m (see [7]).
For n > 1 the interval (2) can be partitioned into a sequence of consecutive
intervals Z~ = [i, a ( u - 1 - N) ; i, cr(n - Ill) - 1] where 0 < i < ~(n). The length
of Zi is C ( n - 1 - Iil). Hence for x < or(n)

P(x,a(n- 1 - IxD) = ~r(n) + E C(n- 1 - 1i1) . (6)


i<a
327

The function Ro(x) = x - cr Ix] gives the offset of the number x from the least
number of the same pair size. If for two numbers x and y we set n = Ix I + lYl + 1
we can see that P(x, y) = P(x, cr lYl) + Ro(y) because the number P(x, y) occurs
at the offset Ro(y) in the interval 27= with the lower bound P(x, ~r]Yl). Using (6)
we get

P ( x , y) = (Ixl + lyl + 1) + Ro(y) + ~ C(Ixl + lyl - liJ) 9 (7)


i<x

We can conclude that if the function P exists then (7) must hold. Vice versa,
when we define P by (7) where Ixl is defined by (5) we can see that the sequence
(1) consists of all natural numbers in the increasing order. So the function P
does exist and it is a semi-suitable pairing function strictly monotone in both
arguments. We now show that P satisfies also the condition (P3). We clearly
have C(n) <_ ~r(n + 1), so for Ix] >_ 2 we get 21=1-2 _< C(Ix ] _ 1) _< (r Ix] _< x.
Hence for all x
2 I=1 _< 4 . x + 1 . (8)

Each of the intervals Z0 through 27,(,~)_1 is non-empty and so c~(n) _< C(n) which
holds also for n = 0. Thus

x < ~([x I + 1) _< C(Ixl + 1) _< 4 I~l+l = 2 21~1+2 . (9)

Let us denote by d(x) the binary size function yielding the number of bits in the
binary representation of x (d(0) = 0). We clearly have d(x) = O(log(x)). From
(8) we get Ix] + 1 = d(21~l) ~ d(4.x + 1) = d(x) + 2, i.e. Ixl _< d(x) + 1. From
(9) we get d(x) <_ d(2 21~1+2) = 2. }xI + 3. Thus the condition ( P 3 ) holds and
we have:

T h e o r e m 3. The function P defined by (7) is a suitable pairing function.

From now on (x, y) and x, y are abbreviations for P(x, y).

3 Pair-Based Functions and Classes

The development of function classes discussed in this section does not require
the property (P3). Any primitive recursive semi-suitable pairing function p can
be used instead of P provided we are able to derive the successor function and
unary iteration in a way similar to the derivation of arithmetic below.
The identityfunction is I(x) = x. The zero function is Z(x) = 0. The first (H)
and second (T) projection functions are such that H(0) = T(0) = 0, H(x, y) =
x, and T(x, y) = y. The conditional function D is such that D(0, x, y) = y,
D((v, w), x, y) = x, and D(0) = D(x, 0) = 0. The function f is obtained from
the functions g and h by (unary) composition if f(x) = g h(x) and by pairing
if f(x) = g(x), h(x). Functions simple (in a class of functions .T) are generated
from ($'), I, Z, and D by composition and pairing.
328

T h e o r e m 4 . A class of functions simple in 3c is closed under explicit definitions


of the form f(x) = a where a is a term constructed from the variable x and
numerals by pairing (b, c) and applications g(b) where the function g is a function
simple in 3r.

Pro@ By a straightforward induction on the structure of a. We may assume


that all numerals in a are in the pair representation. []

The projection functions are simple as they have explicit definitions H ( x ) =


D(1, x) and T(x) = D(O, x). An example of a function f simple in g and h is

f(x) = O(x,D(H(x),O,O(gT(x),h(Tgr(z),HgT(x)),O)),c) . (10)


Definitions like (10) are made readable by a clausal language where we define
the same f by an explicit clausal definition as

f((v,w),x) = 0
f(O, x) = h(w, v) ~ g(x) = v, w
f(O,x) = 0 ~ g(x) = 0
f(O) = c
The clauses, when read as inverted implications, express properties of the defined
function. The first and third clauses can be omitted as the defined function
then obtains the value 0 by default when no clause can be satisfied for a given
argument. Clauses can be given in any order (we have listed the clauses for f
in the order obtained by removing the conditionals from the definition (10)).
Clauses must be presented in such a way that the conversion from a clausal
definition to its initial form should be always possible.
We do not have the space to dwell on the details of our clausal language and
hope that the clausal definitions given in this paper will be simple enough so the
reader can reconstruct the initial form of the definition. In the case of explicit
definitions the initial form is always f ( x ) = a. We permit the clausal language
to be used with inductive definitions where the initial form will be a schema of
inductive definition. The development of the clausal language should be viewed
from the perspective of its eventual implementation on computers. We offer here
no details on how one can go about the automatic compilation.
We now present three schemas of pair-based inductive definitions which can
be used as initial forms of clausal definitions. The function f is defined by the
unary iteration of g if f(0) = 0, f(0, y) = y, and f(s(x), y) = g f(x, y). We
write g~(y) as an abbreviation for the application f(x,y). The function f is
defined by the pair iteration of g if f(0) = 0 and f(x, y) = gt~l(y). The function
f is defined by pair recursion from g and h if f(0) = 0, f(0, y) = g(y), and
f ( (v, w), y) = h(v, w, y, f(v, y), f ( w , y) ).
The classes "P7~1, 7)7~2, and P7~3 are generated from [, H, and T by com-
position pairing and respectively by pair recursion, pair iteration, and unary
iteration. We call the functions of 7)T~1 primitive pair recursive functions.

T h e o r e m 5. The classes 7)TQ are closed under explicit clausal definitions.


329

Proof. Clearly, it is sufficient to show that the classes PTgi are simple in them-
selves, i.e. that each of them contains the functions Z and D. []

We can now use the clausal language with any of the three inductive schemas.
The functions @, | 1", written in infix notation e.g. x 9 y instead of | y), are
obtained in 79791 by pair recursion:
O~y=y
(v,w)| v, w G y
~t(o) = o
n~(v, w) = o, Rt(v) + Rt(w)
x | v = fl(R~(x), y)
A (0, u) = o
f~(0, w), ~) = u + fl(w, u)
x , y = f~(Rt(y), x)
h ( 0 , x) = 1
A ( 0 , w), x) = x | A ( w , x)
The function @ is the list concatenation function. The function Rt is obtained by
parameterless pair recursion for which PTQ is easily seen closed. Rt(x) yields a
right-leaning number of the same pair size as x: Rt(x) = cr IX[. We have IRt(x)[ =
Len Rt(x) = Ix]. By pair induction we can derive Ix ~ y[ = ]x t + ]yl, Ix | y[ =
Ixl. lyh and Ix 1 yl = Ixl '~t.
The function Lt(x) = a(lx I + 1) - 1 yielding a left-leaning number of the
same pair size is obtained as primitive pair recursive by Lt(x) = f Rt(x) where
f(0) = 0 and f(v, w) = f(w), O.
We identify a predicate /~ with its characteristic function R,(x) = y
R(x) A y = 1 V -~R(x) A y = 0. For a function class 2- we denote by 2-, the
class of 0, 1-valued functions of .7". We say that the predicate R is from 2-, and
sometimes write R E 2-, if R, E 2",. The predicate Left(x) true of left-leaning
nmnbers is defined primitive pair recursively by a clausal definition
Left(0)
Left(v, w) ~- w = 0 A Left(v).
Clausal definitions of predicates should be viewed as abbreviations for clausal
definitions of their characteristic functions. If we replace Left(x) by Left,(x) =
1 above we get such a definition. Note that the clauses for --Left(x), i.e. for
Lefl,(x) = O, are obtained by default. We can similarly define the primitive pair
recursive predicate Right holding of right-leaning numbers.
We now turn to the arithmetic functions. The successor and predecessor
functions are obtained by the case analysis of the enumeration of pairs (1). We
derive the successor function in P7~1 by a parameterless pair recursion:
s(O) = 1
8 0 , w)= v, 8(w) ~- -,Left(w)
~(v, w) = ~(v), n t ( ~ ) ,-- Left(w) A -,~eft(~)
~(v, w) = 0, R t 0 , w) +- Left(w) A Zeft0) A w = 0
8(V, W) = Rt(0, y), ~t(wl) +--- L e f t ( w ) A L e f t ( v ) h w = Wl, w2
387

The predecessor function p(0) = 0 and ps(x) = x is derived as primitive pair


recursive in a similar way with with the help of Lt and Right. Before we can get
additional arithmetic functions we need ,07~1 closed under unary iteration:
T h e o r e m 6. The classes PT/i consist exactly of primitive pair recursive func-
tions.
Proof. 7)Tlt C_ 7)1"~2: It is clearly sufficient to show f E PP,-2 for f derived by
the pair recursion from g and h in "P7~2. Derive f in 7)7~ by pair iteration:
f ( x , y) = v ~ h~'~'~l(y, O, (O, x), O) -- yl, ( (V, Xl), 8), i
hi(y, 8, (0, 0), i) = y, ((g(y), 0), 8), i
hi(y, 8, (0, v, ~), i) = ~, 8, (0, ~), (0, v), 0, i
hi(y, ((fv, v), ( f w , w), s), O, i) -- y, ((h(v, w, y, f v , fw), v, w), s), i
hl(y,s,O) = y,s,O .
The 'second' argument s of hi is an output stack where the function values
f ( z , y) of components z of x are assembled in the form f(z, y), z. The 'third'
argument i is an input stack for breaking down the components z of x. The
components are stored in the form (0, z): The marker 0 is pushed on the input
stack in the second clause for hi to indicate that when it eventually appears at
the top of the stack in the third clause for ht the values f(v, y), v and f ( w , y), w
will be on the top of the output stack. The needed length of iteration of ht is
3- txl + 1 < Ix, x, x I as we need two iterations for each comma and one iteration
for each 0 in the pair representation of x.
727~ C_ 727~3: It is sufficient to show f E 72~a for f derived by pair iteration
of g E 1r)7~3. Derive f in 7)T/a by unary iteration:
f ( x , y) = a +-- gr'~(y, x, O) = a, s
gl(a, 0, 8) = a, 8
gl(a, (v, w), 8) = g(a), v, w, s
~ ( a , 0) = a , 0 .
The iteration of g~ goes long enough 2. ]x I + 1 = Ix, xl __%x, x to empty the
'second' argument s of gl which acts as an input stack for breaking the original
argument x into its components. The 'first' argument a is an accumulator where
g is applied to each time we have a pair on the top of the stack.
P7~3 C_ 7)T~t: Assume f defined by the unary iteration of g E P'Rt and show
f E 7)7r Derive f in P7~1 by pair recursion:

fl(O,x,y) = x,y
fl ((v, w), x, y) = O, Yl *--- f, (w, x, y) = O, Yl
fl((V, ~), ~, y) = p(Vl, w~), g(y~) ~- f l ( ~ , ~, y) = (v~, ~ ) , y, .
The auxiliary function fl accepts as a parameter and yields numbers of the form
c, a where c is a counter of iterations and a an accumulator where g is repeatedly
applied to. This works provided the length of recursion is at least x. By (9) we
ha.ve
L ~ R @ I (0, x)) = ]9 1 (0, =)1 = t911~ = 4M+, > = .EJ
331

A unary equivalent of an n-ary function f (n _> 2) is any function g such that


f ( x l , . . . , x,~) = g ( x l , . " , x,~). Unary equivalents of arithmetic functions +, -
(modified subtraction),., + (integer division), and xY can be derived in PT4i by
unary iteration simulating the usual recursion. The function d is derived in 79T4i
by a repeated halving.

4 Pair-Based Hierarchies

We assume that the reader is familiar with the Grzegorczyk hierarchy s [5]
which we will need from the multiplicative stage g 2 onwards. Rose in [7] gives a
good discussion of the topic. The hierachy functions generating the classes gi are
E2(x) = x2+2, and and Ei+3(x) = E~+2(2). The reader will note that in order to
get Ei E gi we have increased the indices of Rose's functions by one. Functions
Ei are strictly monotone, Ei+l bounds (dominates) Ei, i.e. El(x) <_ Ei+l(x) for
all x. We also have El(y) < E +l(x + y).
The subexponential stage 2.5 between the multiplicative and elementary (ex-
ponential) stage 3 is an inserted stage into the Grzegorczyk hierachy. We augment
the Grzegorczyk hierarchy by adding the class g2.5 generated in the usual way
from the binary subexponential hierarchy function E2.5(x, y) = x d(y) (alterna-
tively, we can use the 'smash' function x # y = 2d(x)'d(Y)). It is easy to see that
all functions f ( x l , . . . , xn) in g2.5 are bounded by 2p(d(xl),',d(x~)) for a polyno-
mial p(xl,..., xn) and so the class g~.5 lies strictly between the classes E2 and
ga.
Pair hierachy functions Fi generate our pair-based hieraehies. We set F2 = I,
F2.5 = | F3(x) = gl=l(2) where g(x) = (x @ x) | 2, and Fi+4(x) = F~+13(2). We
have Fi E PTr The functions Fi+3 have been chooses in such a way that we
have IFi+a(x)l = Ei+31xl.
A function operator yielding the function f is limited if f is bounded by
a function j which is an argument of the limited operator. To every operator
introduced sofar we clearly have a limited version by adding a new argument
function j.

D e f i n i t i o n 7 P a i r h i e r a r c h i e s . The classes 7)i (i = 2, 2.5, 3, 4,...) of the pair


hierarchy are generated from I, H, T, and F~, by composition, pairing, and
limited pair iteration.
The classes 7)A/[i of the pair bounded minimum hierarchy are obtained by
adding to the classes 7~i the operator of bounded unary minimum: f(O) = O,
f(y, x) = # z < y[R(z, x)] yielding the least z < y to satisfy R or 0 if there is
no such.
The classes ~PO~ of the pair order hierarchy are obtained by replacing in the
classes pi limited pair iteration by limited unary iteration.

We will write ~ i when we mean a class at the i-th stage of any of the three
pair hierarchies. We did not define the classes .T~ of pair hierarchies below the
multiplicative stage 9r2, i.e. for i = 0 and i = 1. As it is well known no pairing
332

function can be introduced in these stages. We only note here that by limiting
the pairing operator we can extend the pairing hierarchies to the stages zero or
one. We now state mostly without proofs a number of lemmas leading to Thms.
20 and 22.

L e m m a 8. The classes jci are closed under explicit clausal definitions.

When a function f is obtained by a limited operator we have f(x) <_ j(x)


from which If(x)[ < lj(x)l fonows. Vice versa, the latter condition implies f(x) <
O, j(x) = j l ( x ) where the explicitly defined function jl is in the same class as j.
As the two bounding conditions are equivalent we will be using both of them~ We
say that a function f is monotone in the pair size if Ix[ _< lyl-~ If(x)l <_ If(y)l.
Note that the pairing function P is monotone in the pair size of both arguments.

L e m m a 9. | E f i .

Proof. We derive by the pair iteration:


x * V = Reva(Reva(x, 0), y)

f(o, y) = o, y
f((v,w),y) = w , v , y .
A computer programmer will recognize that the function Reva is the so called
accumulator version of the list reversal function Rev. We have Reva(x,y) =
Rev(x) | y, i.e. Rev(x) = Reva(x, 0). The pair iteration of f is limited as we
have Ifl l( )l = I*i _< IT(n, x)l. It should be clear that we can replace the limited
pair iteration by the limited unary iteration because Ixl _< x. []

L e m m a 10. For any polynomial p ( x l , . . . , x,) we can define a function j such


that Ij(xt,..., x~)l" = p(lxl I , " ' , Ix~l)- Ifp is a linear form then j E :pi, otherwise
j is in any class 5c~ containing |

L e m m a 1 1 . Let f E f i . If i < 2.5 then there is a polynomial p(n) such that


[f(x)l < plx[. If i = 2 then p is a linear form. If i > 3 then for some c we
have lf( )l -< IFF( )I. Moreover, If(r _ [j(x)[ for a monotone in the pair size
j E 5 c~.
L e m m a 12. The classes $-i are closed under limited pair iteration. Hence p i _C
J)Mi,~)O i and 7)2 C .Ti.
L e m m a 13. If 2 < i < n then Fi E J:".

Lemma14. TE 7)3.
L e m m a 15. f ~ C f2.5 C f a C ~4 C '" ".

L e m m a 16. The classes 9ci for i >_ 2.5 are closed under limited pair recursion.
The classes ~'~ are closed under limited pair recursion when the bounding con-
dition is independent of the parameter: If(x, Y)t < [j(x)l. This, clearly, includes
limited parameterless pair recursion.
333

L e m m a 17. Rl, Lt, Left, Right, s, p E 7)2.


L e m m a 18. 7)i C_ 7).Adi C_ 7)0 i and 7)i+3 = 7)Adi+3 = 7)Oi+3.

It is unknown whether for i < 2.5 any of the inclusions are strict.

L e m m a 1 9 . We have +, : , . , - , d E 7)02 and Ei E 7)0 i where we denote the


unary equivalents of the binary functions by the same symbols (and also use the
infix notation where appropriate).

T h e o r e m 2 0 . The union of each of the hierarchies 7)i 7)3//i and 7)0 i is the
class of primitive pair recursive functions.

The pairing function P defined by (7) is in g 3 because the functions C and


o" are. An inspection~of the definition of P reveals that the elementary functions
are applied only to arguments of logarithmic size. The detour can be eliminated:

L e m m a 21. The functions P, H, T, and Ix I are in g 2.

T h e o r e m 22 C h a r a c t e r i z a t i o n of 7)0 i. The classes P O i of the pair order


hierachy consist exactly of the unary functions of the classes $ i of the augmented
Grzegorczyk hierarchy.

Proof. 7)0 i C_ gi: By Lemma21 the classes gi are closed under pairing and
contain I, H, and T. The function D can be easily derived in gi and so the
classes are closed under explicit clausal definitions. We can now easily show [ i
closed under both limited iteration and pair iteration with the help of limited
recursion. It remains to derive Fi E gi. For that we derive | in $2 just as in
Lemma 9. As in Lemma 13 we derive | E g2.5 with the bound obtained from
(9) as x N y _< 4 Ix| = 4 M'IyI+I < 4 (d(x)T1)'(d(y)+l)+t. The functions Fi+3 are
obtained by pair iteration in g i+3 with the bounds Fi+3(x) < c~(IFi+3(x)l + 1) =
o(Ei+3lxf + 1).
For the converse inclusion we show by induction on the construction of gi
a stronger claim that both the unary functions and the unary equivalents of
functions in gi are in 7)0 i. This is certainly true of Z, s, and Ei. The unary
equivalents of the projection functions U ~ ( x t , . . . , x ~ ) = xi are obtained by
clausal definitions of the same form. The closure under n-ary composition and
limited recursion which is simulated by limited unary iteration is left to the
reader. []
T h e o r e m 23 C h a r a e t e r i z a t i o n of p r i m i t i v e pair reeursive f u n c t i o n s .
The classes 7)7~i consist exactly of unary primitive recursive functions.

Proof. From Thms. 20 and 22 as ~Ji g i are primitive recursive functions. []


The operators of recursion and of unary iteration go for too long and permit
a direct jump from g2 to g3 (from 7)0 2 to 7)0 a) by iterating the multiplication.
This is why the class g2.~ is skipped in the Grzegorczyk hierarchy. Pair iteration
is logarithmically shorter and the jumps by pair iteration go from | E U2 to
| E .%-2.~ and from there to T E F 3.
334

5 Turing Machine Characterizations

We will now characterize some pair-based classes in terms of computational


complexity and relate some of the open problems in the complexity theory to
those in the small classes. For that we need to know the class ~'i into which a
simulation of a particular resource bounded Turing machine falls. So suppose
that a given Turing machine M has tape symbols ao, a l , . . . , ak (k >_ l) where
a0 is the blank symbol. M has states qo,ql,"',qm (m > 1) where q0 is the
initial state and ql the terminating state. Let us further assume that M stops
for every input word with the length n after at most t(n) steps and uses at most
p(n) squares of tape. We assume that t(n) is monotone and p(n) a polynomial.
We can assume that M is started by placing the initial word on the otherwise
blank tape and make the currently scanned square the first symbol of the word
(if any). The machine stops scanning the leftmost non-blank symbol on the tape
(if any).
We will code the tape symbol a~ by i and the state qj by j. Words will be
coded by lists consisting of codes of tape symbols. The tape will be coded by two
lists I and r where the list r codes the word from the currently scanned square to
the right and l the word to the left of the currently scanned square. The list 1 is
reversed, so the symbol immediately to the left of the currently scanned square is
the first one in I. When r = 0 the currently scanned square is a blank at the end
of the tape and a move to the right extends the tape by setting 11, rl = (0,/), 0.
Similarly for I.
It should be obvious that we can define a simple function My M such that
MvM(q,l,r) = ql,ll,rl holds if one transition of M in the state q and tape
configuration l, r results in the new state ql and tape configuration /1, rl. We
can require that rl never contains trailing blanks and that MvM(1, l, r) = 1, l, r
in the terminating state. Let Mv M E 792 be like My M but with a test that the
result does not exceed a bound given as the first argument. We set
TraM(x) = y ~- (Mv~) ~M(b(x), 0, 0, x) = b~, q~, h, Y -
We wish to define the function b in such a way that I(MvM)n(O,O,x)l
< Ib(x)[
holds. Then, since Len(x) <_ Ixl, for a given code x of an input word Tm M (x)
would yield the code of the output word computed by the machine M.
An input word coded by x contains Lee(x) symbols each of which has the
pair size not exceeding Ikl. As the length of the tape Lee(l) + Lee(r) is always
bounded by p Lee(x) we can see that for some q, 1 and r we have
I(MvM)~(0, 0, x)l = tq, l,,I <_ Ira[ + 1 + [l,r I <
]m[+ l + ( l k l + l ) . p L e n ( x ) + l < Iml+(lkl+ l ) . p l x l + 2= Ib(x)l
where the function b is obtained by Lemma 10. If p(n) is a linear form then
b E 7)2 otherwise b E 7)2.5.
T h e o r e m 2 4 . The function Tm M simulating a Turing machine M computing
in (i) linear space, polynomial time is in 792, (ii) linear space is in 7)0 2, (iii)
polynomial time is in 792.5, and (iv) polynomial space is in 7)0 2"5.
335

Proof. It should be obvious that the function Tm M will be in the particular


class provided (a): the bounding function b(x) is in the same class, and (b): we
can achieve t Ixl iterations in there. (a): In the cases (i)(ii) the polynomial p(n)
is a linear form and so b C 7)2, in the cases (iii)(iv) b e p2.5.
(b): In the cases (ii)(iv) we can assume that t has the form t(n) = (k +
1F(~). (p(n) + 1). (m + 1). This is because the expression on the right gives
a bound on the number of possible configurations of M and the machine must
stop before the bound is reached. The number of iterations of My M is t Ixl =
(k + 1)plxl. (p Ixl + 1 ) - ( m + 1). We observe that for a constant c we can define
the function f(x) = cM in P O 2 by limited pair iteration of multiplication with
the bound obtained from (8) as cM _< (21xl)d(c) _< ( 4 . x + 1) d(c). We can define
f ( x , y) = yt=l in 7)(9 2.5 by a limited pair iteration of multiplication with the
bound yM _< yd(~)+l. By an n-fold composition of f we can get f l ( x , y) = yl=l"
in 7)O z~. Thus the function tl(x) = t lxl is 7)(9 2 in the case (ii) and 7)(9 2.5 in
the case (iv). We get Tm M in the same class by limited unary iteration.
In the cases (i)(iii) the time function t(n) is a polynomial and we wish to
achieve t fxl iterations of My M. This can be done both in 7)2 (case (i) when
b E 7)2) and in 7)2.5 (case (iii) when b E p2.5) by induction on the structure
of the polynomial t(n). Let us for simplicity assume that we wish to derive
f(x, y) = gt I~1(y) by pair iteration. Ift(n) = c for a constant c we define f(x, y) =
gl~,(c)l(y). If t(n) = n we define f(x, y) = gM(y). If t(n) = tl(n) + t~(n) we set
f(x, y) = gtll=lgt2M(y). If t(n) = h ( n ) . t2(n) we set up a 'nested' pair iteration
by

f(x, y) = yl *-- f~llXl(x, y) = Xl, yl

The reader will see that in our special case there is no problem with bounds. []

We say that a Turing machine M with three tape symbols: blank, 0 and 1
computes an n-dry function f ( x l , 9 9 x,~) when, after giving it the words coding
the arguments in the binary and separated by blanks as input, the machine stops
with a word coding the number •(x) in the binary as output. In order to convert
between numbers and codes of such words we will need two functions B and
B -1. The function B(x) yields a list of l's and 2's called the binary code of the
number x > 0, i.e. the list coding the word with the binary representation of
x (note that the bits 0, and 1 are coded in our Turing simulation by 1 and 2
respectively). We set B(0) = 0. Any function B -1 such that B - 1 B ( x ) = x is a
binary inverse of B.

L e m m a 2 5 . The functions P, H, T, and Ixl can be computed by Turing ma-


chines in polynomial time and linear space.

Proof. This is similar to the proof of Lemma 21 but we have to make sure that
all iterations are short. []

Lemma26. B , B -1 C 7)~.
336

Pro@ This would be easy if we had sufficient arithmetic in 7)2 . We do have


s and p but we cannot iterate them sufficiently long in order to obtain + and
x + 2. Fortunately, we can proceed by a detour through Turing machines. Let
Mp, Mh, and Mt be the Turing machines computing the functions P, H, and
T in polynomial time and linear space (they exist by Lemma 25~. By T h m . 24
there are 792 functions P , H , and T such that P ( x , y ) = T m P(x | (O,y)),
-H(x) = rfrtMh(x), and T ( x ) = TrnMt(a). These functions accept and yield
binary codes. We observe that B(x, y) = P(B(x), B(y)). This, and B(0) = 0
constitutes a definition of B by parameterless pair recursion. We have B E 792
because
IB(x)l _< 3. Len B(x) = 3. d(x) < 6. lxl + 9 . (11)
Let R(z) hold iff z is a binary code, i.e. z = B(x) for some x. We are looking
for a function B - * such that B - l ( 0 ) = 0 and R(z) A z > 0 --+ B - l ( z ) :
B-l~(z),B-*g(z). Since # B ( v , ~ ) = B(v) and T B ( v , ~ ) = B(w), we can
then show by pair induction on x that B - 1 B ( x ) = x. Consider the derivation:
B - l ( z ) = y *-- glt(*)l(O, (O, z), O) = (y, s),i
g(~, (o, o), 0 = (o, ~), i
g(~, (o, ~), i) = ~, (o, T(~,)), (o, H(~)), o, ~ ~- z = ~1, z~
g((~, ~, ,), o, i) = ( ( < y), ~), i
g(s, 0) = s, 0 .
W h e n B -~ is called with a b i n a r y code B(z) the value 0, B(x) is pushed on
the input stack i. The iteration of the function g accumulates the result in the
output stack s by breaking down the binary codes of components z of x which
are stored on i in the form (0, B(z)) The marker 0 is placed on the input stack
to mark that the pairing operation should occur on the output stack (the third
clause for f). The iteration of g should go once for each 0 and twice for each pair
in x, i.e. altogether 3. I~1 + 1 times. The 7)2 function t is obtained by L e m m a 10
to satisfy [t(z)l = 3. lz[ + 4. We then have

3. I~1 + 1 < 3. d(~) + 4 = 3. Lea B(x) + 4 _< 3. IB(~)I + 4 = it B(~)I


It should be clear that the function B -1 satisfies the above conditions and so
it is a binary inverse of B. It remains to bound B -1 i n / ) u which we do not do
here. []

T h e o r e m 2 7 . The classes ~2, 7)0 ~, ~2.5 and 7)0 2.5 consist exactly of unary
functions computed by Turing machines in linear space/polynomial time, linear
space, polynomial time, and polynomial space respectively.

Proof. Let the function f be computed by a Turing machine M with one of


the above time/space bounds. By Thm. 24 the function Tm M simulating its
operations is in the corresponding pair class. We have f(x) = B -~ Tm M B(x).
The function f is in the same pair class by L e m m a 26.
Vice versa, Turing machines operating with the above bounds can certainly
compute the functions obtained by composition and pairing from functions corn
putable by machines with similar bounds. The latter holds because of Lemma 25.
337

Such Turing machines can also pair iterate such functions in polynomial time
(for 7)2 7)2.5 functions) because the time is polynomial in d(z) which is on the
order of Ixl. []

Now we see that the class 7)2 contains + , - , - , +, etc. The Turing machine
characterization of 7)(92 can be also proved from Thm. 22 by the result of Ritchie
[6] that E 2 consists of n-ary functions Turing computable in linear space. Cob-
ham [2, 7] was the first one to characterize the n-ary functions computable in
polynomial time by an inductively defined class based on the recursion on no-
tation with the length of iteration d(x).
We now characterize the class 7)M,"2 5. The class of languages P H = {,Ji Si
where {S~} is the polynomial *ime hierachy [8] such that So = P and S~+~ =
N P ( S i ) . We have P C_ N P = $1, c o N P C_ S2.

T h e o r e m 28. The languages over {0, 1} in P H are exactly the languages com-
putable by 7 ) M 2s predicates.

Pro@ We first reduce limited pair iteration to bounded minimum by a well-


know technique of constructing a table for the iterations by means of bounded
quantifiers. Similarly as in [7] (page 123) we then show that to every f E 7 ) M 2~
there is a formula equivalent to f ( x ) = y consisting ofT) 2 predicates, connectives,
and quantifiers with bounds Ixl <_ PM. A predicate R E 7)3-4 ~5 can be then
presented in a prenex form with alternating quantifiers:

where S E 7)2. The rest of the proof then follows from the well know presentation
of P H by means of alternating quantifiers. []

We probably cannot derive limited pair iteration from bounded unary min-
imum in the class 7)3/t 2 . The relations of this class can be characterized by
Ui P L i where { P L i } is a hierarchy similar to P H with

PLo = P L = U T I M E S P A C E ( n k ' n ) = 7)2, ,


k

and PLi+I = N P L ( L i ) . Note that L~ C_ P L i where {Li} is the linear time


hierarchy [11]. It is not known whether the inclusions are strict.

T h e o r e m 2 9 . P = N P iff 7)2.5 is closed under bounded unary minimum, i.e.


7)2.s = 7)Ad2.s p = P S P A C E iff 7)2.s is closed under limited unary iteration,
i.e. 7)2.5 = 7)0 2"5. P H = P S P A C E iff 7)3,4 z5 is closed under limited unary
iteration, i.e. 7)2M2'5 - - 7 ) 0 25.

Proof. If P = N P then the polynomial then hierachy collapses: P = P H and


so 7),~5 = 7)34,2.5 . Vice versa, if the latter identity holds then by Thms. 24, 27,
and 28 we have P = P H and hence P = N P . Similarly, P = P S P A C E iff
7),2.5 = 7)0,2.5 and P H = P S P A C E iff 7 ) M ,2"5 = 7)0,2s.
338

Clearly, from .~F1 : .)u2 follows (9vl), = (5c2),. So it remains to show the
converse where we assume (5cl), = (5r2), for the above three pairs of relation
classes. We have F1 C_ 9c2. Let us denote by (a)i the 792 function yielding the i-
th element (0 <_ i) of the list a or 0 if i >_ Len(a). Take any function f E .T2. The
function f is bounded by a function j E 5cl. The relation R(x, i) ~ (B f(x))i = 1
is in (Y'2), and hence also in (Y'l),- Now, L e n B f ( x ) <_ L e n B j ( x ) and we
can derive by pair iteration a function /~ C f l such that k(x) = B f(x) by
assembling the bits of the binary code with the help of the predicate R. Hence
f ( x ) = B -1 k(x) is in ~'1 and we h a v e / P l = :P2. [3

6 Conclusions

Although the central results of this paper are contained in the characterization
theorems 22, 27, 28, 29 we would like to remind the reader that this research
was started as a search for a feasible declarative p r o g r a m m i n g language based
on subrecursion. We plan to develop and implement on computers our clausal
language in the near future.

References

1. P. Borovansky, P. J. Voda. Types as Values Polymorphism. In proceedings of SOF-


SEM conference 1993.
2. A. Cobham. The intrinsic computational difficulty of functions. In Proc. Int. Conf.
Logic, Meth. Phil.(ed Y. Bar Hillel), 24-30, North Holland, Amsterdam, 1965.
3. M. Davis. Computability and Unsolvability, McGraw Hill, New York. 1958.
4. R. L. Graham, D. F. Knuth, O. Patashnik. Concrete mathematics. Addison-Wesley
1989.
5. A. Grzegorczyk. Some classes of recursive functions. Rozprawy Mate. No. IV, War-
saw 1953.
6. R.W. Ritchie, Classes of predictably computable functions. Trans. Am. Math. Soc.
(106), 139-73, 1963.
7. H.E. Rose. Subrecursion, Functions and Hierarchies. Clarendon Press, Oxford 1984
8. L. Stockmeyer, The Polynomial-Time Hierachy, Theor. Comp. Sci. 3, 1-22, 1977.
9. P. J. Voda. Types of Trilogy, Proceedings of the Fifth International Conference on
Logic Programming, MIT Press, Cambridge MA, 1988.
10. K. Wagner, G. Wechsung. Computational Complexity, VEB Berlin 1986.
11. C. Wrathall, Rudimentary Predicates and Relative Computation, SIAM Journ.
Comput., 1978.
A Sound Metalogical Semantics for
Input/Output Effects

Roy L. Crole 1 and Andrew D. Gordon 2

1 Dept. of Mathematics and Computer Science,


University of Leicester, University Road, Leicester LE1 7RH, United Kingdom.
rlc3~mcs, l e . ac. uk
2 University of Cambridge Computer Laboratory,
New Museums Site, Cambridge CB2 3QG, United Kingdom.
adg@cl, cam. ac. uk

A b s t r a c t . We study the longstanding problem of semantics for in-


put/output (I/O) eXpressed using side-effects. Our vehicle is a small
higher-order imperative language, with operations for interactive char-
acter I/O and based on ML syntax. Unlike previous theories, we present
both operational and denotational semantics for I/O effects. We use
a novel labelled transition system that uniformly expresses both ap-
plicative and imperative computation. We make a standard definition
of bisimilarity and prove fl; is a congruence using Howe's method.
Next, we define a metalogical type theory M in which we may give a
denotational semantics to 50. A~[ generalises Crole and Pitts' FIX-logic
by adding in a parameterised recursive datatype, which is used to model
I/O. A4 comes equipped both with judgements of equality of expres-
sions, and an operational semantics; 2~ itself is given a domain-theoretic
semantics in the category 57Y~ of cppos (bottom-pointed posets with
joins of w-chains) and Scott continuous functions. We use the 57Y~ se-
mantics to prove that the equational theory is computationally" adequate
for the operational semantics using formal approximation relations. The
existence of such relations uses key ideas from Pitts' recent work.
A monadic-style textual translation into ,~4 induces a denotational se-
mantics on 50. Our final result justifies metalogical reasoning: if the de-
notations of two O programs are equal in J~i then the 50 programs are
in fact operationally equivalent.

1 Motivation

Ever since M c C a r t h y referred to the i n p u t / o u t p u t ( I / O ) operations READ and


PRINT in LISP 1.5 [15] as "pseudo-functions," I / O effects have been viewed
with suspicion. LISP 1.5 was the original applicative language. Its core could be
explained as applications of functions to arguments, but "pseudo-functions"--
which effected "an action such as the operation of i n p u t - o u t p u t " - - c o u l d not.
Explaining pseudo-functions t h a t effect I / O is not a m a t t e r of semantic archae-
ology: although lazy functional p r o g r a m m e r s avoid unrestricted side-effects, this
style of I / O is pervasive in imperative languages and persists in applicative ones
340

such as LISP, Scheme and ML. But although both the latter are defined formally
[17, 25] neither definition includes the I/O operations.
We address this longstanding but still pertinent problem by supplying both
an operational and a denotational semantics for I/O effects. We work with a
call-by-value PCF-like language, (9, equipped with interactive I/O operations
analogous to those of LISP 1.5. We can think of dO as a tiny higher-order im-
perative language, with an applicative syntax making it a fragment of ML. We
adopt CCS-style bisimilarity as the natural operational equivalence on (9 pro-
grams. Our first theorem is congruence of bisimilarity, via Howe's method [14],
justifying operationally-based equational reasoning about O programs.
The denotational semantics is specified in two stages. First, we give a denota-
tional semantics to a metalogic ~4 in the category 57)/~ of cppos and Scott
continuous functions. Second, we give a formal translation of the types and ex-
pressions of O into those of M . M is based on the equational fragment of Crole
and Pitts' FIX-logic [5], but contains a single parameterised recursive datatype
which is used to model computations engaged in I/O, and does not (explic-
itly) contain a fixpoint type. Following Plotkin's use of a metalogic to study
object languages [24] we equip the programs (closed expressions) of J~4 with
an operational semantics. Our second theorem shows the 'good fit' between the
domain-theoretic semantics of M and its operational semantics: we prove that
the denotational semantics is sound and adequate with respect to the operational
semantics.
To complete our study, we establish a close relationship between the operational
semantics of each (9 program and that of its denotation. Hence we prove our
third theorem: that if the denotations of two (9 programs are provably equal in
the metalogic, the programs are in fact operationally equivalent. The proof is
by co-induction: we can show that the relation between (9 programs of equal
denotations is in fact a bisimulation, and hence contained in bisimilarity.
We overcame two principal difficulties in this study. First, although it is fairly
straightforward to write down operational semantics rules for side-effects, the es-
sential problem is to develop a useful operational equivalence. Witness the great
current interest in ML plus concurrency primitives: there are many operational
semantics [2, 13] but few if any developed notions of operational equivalence.
HolmstrSm [13] pioneered a stratified approach to mixing applicative and imper-
ative features in which a CCS-style labelled transition system for the side-effects
was defined in terms of a 'big-step' natural semantics for the applicative part of
the language. But HolmstrSm's approach fails for the languages of interest here,
in which side-effects may be freely mixed with applicative computation. Instead,
we solve the problem of finding a suitable operational equivalence by express-
ing both the applicative and the side-effecting aspects of (9 in a single labelled
transition system, a family (--~[ a E Act), of binary relations on (.9 programs
indexed by a set of actions, Act. The actions correspond to the atomic obser-
vations one can make of an (9 program. Milner's classical definition of (strong)
bisimilarity from CCS [16] generates a natural operational equivalence, which
341

subsumes both Abramsky's applicative bisimulation [1] and the stratified equiv-
alences suggested by Holmstrhm's semantics [10, 11]. The second main difficulty
was the construction of formal approximation relations in the proof of adequacy
for J%4. Proof of their existence is complicated by the presence in A/[ of a param-
eterised recursive type needed to model (9 computations engaged in I/O. Our
proof makes use of recent work on algebraic completeness by Freyd [9] and Pitts
[21].
As usual, we identify phrases of syntax up to alpha-conversion, that is, renaming
of bound variables. We write r = %bto mean that phrases r and r are alpha-
convertible. We write r162 for the substitution of phrase r for each variable x
free in phrase r A context, C, is a phrase of syntax with one or more holes.
A hole is written as [] and we write C[r for the outcome of filling each hole in
C with the phrase r If T/is a relation, 7~+ is its transitive closure, and 7Z* its
reflexive and transitive closure.

2 The object language O

(9 is a call-by-value version of PCF, including constants for I/O. The t y p e s of (9,


ranged over by T, consist of ground types unit, bool, i n t and compound types
T->T I and T*~', with the same intended meanings as in ML. Let Lit, ranged over
by ~, be the set {true, f a l s e } U { . . . , - 2 , - 1 , 0, 1, 2,...} of Boolean and integer
literals, and let Rator, be the set {+,-, *, =, <} of arithmetic o p e r a t o r s . Let
notations b (b = t t , f f ) , i (i 6 Z) and _@ (@ 6 { + , - , x, =, <}) range over the
sets Lit and Rator. Let k range over the set of (9 constants, equal to
{ (), f s t , snd, 6, ~, read, write} tA Lit U Rator.
Here is the grammar for (9 expressions,
e::=k r IxIAx:T, e leel (e,e) l ifetheneelsee
where x ranges over a countable set of variables. For the sake of simplicity there
is just one user-definable constant, 6, and w e assume a user-supplied declaration.
fun6(X:Th):T~ defeh.The expression f~r is one whose evaluation diverges. This
is a spartan programming language, but it suffices to illustrate the semantics of
side-effecting I/O.
The t y p e a s s i g n m e n t judgements are of the form F F e:T, where the environ-
m e n t , F, is a list of variable-type pairs, X:T1,..., X:Tn.The provable judgements
are generated by the usual monomorphic typing rules for this fragment of ML,
where F }- k r : T is provable just when k : r is an instance of one of the following
type schemes.
() : unit i : int true, false :bool
6:~ ->~ ~:T
+,*,- : int * int -> int =, < : int * int -> bool
fst:TI*T2->T1 snd : T1 *7"2 ->7-2
read :unit -> int write : int -> unit
3z,2

We assume that X:T~ ~- e~ : v~ is provable.

The set of p r o g r a m s , ranged over by p and q, is Prog ~ de=f{e [ 37 (0 ~- e : T)}.


A v a l u e e x p r e s s i o n , ve, is an expression that is either a variable, a constant
(but not f~), a lambda-abstraction or a pair of value expressions. The set of
values, Value ~ ranged over by v or u, consists of the value expressions that
are programs. Each program has a unique type, given the type annotations on
constants and lambda-abstractions, though for notational convenience we often
omit these annotations.
Before defining the labelled transition system that induces a behavioural equiv-
alence on O, we need to define the applicative reductions of O. We define a
call-by-value 'small-step' reduction relation, ~ C_ Prog ~ • Prog ~ by the follow-
ing axioms

e) v ely/x] ~(i,j_) -+ i @ j
f~-~f~
f s t ( u , v ) -+ U s n d ( u , v ) -4 v
i f t r u e t h e n p e l s e q --~ p if false thenp else q -+ q

together with the inference rule


p~q

g[P] -+ $[q]
where g is an e x p e r i m e n t , a context specified by the grammar

g ::= [ ] p l v [ ] l i f [ ] t h e n p e l s e q I ([],P) I (v,[]).

The rules for 5 and f~ introduce the possibility of non-termination into O. One
can easily verify that the relation -+ is a partial function, and that it preserves
types in the expected way. A c o m m u n i c a t o r is a program ready to engage in
I/O, that is, one of the form C[read ()] or C[write n], where g is an e v a l u a t i o n
c o n t e x t , a context made up of zero or more experiments. More precisely, such
contexts are given by the grammar g = [] I g[C]. If we let the set of a c t i v e
p r o g r a m s , Active, ranged over by a and b, be the union of the communicators
and the values, we can easily show that the active programs are the normal forms
of -~, that is:
Lemmal. Active = {P l -~3q(p --+ q)}.
Our behaviourat equivalence is based on a set of atomic observations, or a c t i o n s ,
that may be observed of a program. This set, ranged over by a, is given by

Act de__fLit U { f s t , snd, @v I v 6 Value ~ } O Msg

where Msg, a set of m e s s a g e s , represents I/O effects. Let Msg, ranged over by
#, be Msg def {?n, !n I n 6 N}, where ?n represents input of a number n and !n
output of n.
343

The l a b e l l e d t r a n s i t i o n s y s t e m is a family ( ~ ~ C_ Prog ~ x Prog ~ [ a E


Act) of relations indexed by actions. It is inductively defined by the following
rules
t fst snd
~ ~~ (u,v) ~u (u,v) ) v
@v ?n !n
u ) uv ifuv E Prog ~ r e a d () )n writen .....~ ()

P--+P" p. a ~p~ P g)q

p ~ ~p' E~v] " ~ E[q]


The last rule allows messages--but not arbitrary actions--to be observed as side-
effects of subterms. Each transition arises from reduction to an active program.

L e m m a 2. p ) q iff 3a E Active (p -+* a q).


We write p$ to mean 3q E A c t i v e ( p -+* q). Unless p$, p has no transitions. So
~, for instance, has no transitions.
We adopt bisimilarity from Milner's CCS [16] as our operational equivalence for
0 . 3 Any program p is the root of a potentially infinite d e r i v a t i o n t r e e , whose
nodes are programs and whose branches are labelled transitions. We regard two
programs as behaviourally equivalent if they have the same derivation trees. The
labels on the trees must match exactly, but we completely disregard the syntactic
structure at their nodes.
We say a relation S C_ Prog ~ • Prog ~ is a b i s i m u l a t i o n iff p S q implies:
(1) whenever p ---%,p~, there is q~ with q ~ q~ and pt S q~;
(2) whenever q --%, q~, there is p~ with p ~ ) p~ and p~ S q~.
Then b i s i m i l a r i t y , ,.~ C Prog ~ x Prog ~ is the union of all bisimulations. It is
standard to prove that bisimilarity is itself a bisimulation, and hence we have
what amounts to a principle of co-induction:
L e m m a 3. p ~, q if] there is a bisimulation 8 with p S q.
The main objective of this paper is to give a denotational semantics of O so that
our metalogic A4 may be used to establish operational equivalences. Nonetheless,
just as in CCS, the availability of co-induction means a great deal can be achieved
simply using operational methods, provided that ~ is a congruence. This is our
first main result, which we can be proved via an adaptation of Howe's method;
similar proofs can be found elsewhere [10, 12, 14].
T h e o r e m 1. Bisimilarity is a congruence.
3 In PCF-like languages, we often define two programs p and q to be observationally
equivalent iff there is no program context C such that C[p] converges and C[q] diverges,
and vice versa. This is inappropriate for our calculus because (unlike in CCS, say)
contexts cannot observe the side-effects of a program. Any two communicators are
contextually equivalent whether or not they axe bisimilax.
3z.z.

3 The metalogic

We outline a Martin-LSf style type theory which will be used as a metalogic, 34,
into which (9 may be translated and reasoned about--it is based on ideas from
the FIX-Logic [5, 6], though A4 does not explicitly contain a fixpoint type. The
(simple) types of J~4 are given by

::= Xo f Unit l B o o l l l n t [ a x a [ a - + a l a• I U(a)

together with a single top-level recursive datatype declaration

datatype U(Xo) = cl of al I " " t Cn of an

in which any type U(a) occurring in the a~ is of the form U(Xo), and each function
type in any ai has the form a -~ a~ (thus the function types in the body of the
recursive type are required to be partial). The use of these types in the modelling
of O is essentially standard, but note that the single recursive datatype will be
used in Section 4 to model I/O. The collection of (raw) expressions of A4 is given
by the grammar in Figure 1,

E ::~ x (variable)
0 (unit value)
[~J (literal value)
E [OJ E (arithmetic)
If E then E else E (conditional)
(E, E) (pair)
Split E as (x, y)in E (projection)
c(E) (recursive data)
CaseEofc(x) -+ E I " [ c(x) -~ E (case analysis)
Ax:a. E (abstraction)
EE (application)
Lift(E) (lifted value)
Drop E to x in E (sequential composition)
Rec x in E (recursion)

Fig. 1. Raw Expressions of the Metalogic .&4, ranged over by E

Most of the syntax of fv4 is standard [6, 7]. The types are either a type vari-
able, a unit type, Booleans, integers, products, exponentials, liftings, or a single,
parameterised recursive datatype whose body consists of a disjoint sum of in-
stances of the latter types. Here, the expressions Lift(E) and Drop E1 to x in E2
give rise to an instance of (the type theory corresponding to) the lifting compu-
tational monad [19]. A closed type a is one in which there are no occurrences of
345

the type variable X0, and we omit the easy formal definition. We define a type
assignment system [4] for A4 which consists of rules for generating judgements
of the form F t- E:a, called p r o v e d expressions, where a is a closed type, and
the e n v i r o n m e n t F is a finite list x l : a l , . . . ,xn:an of (variable, closed type)
pairs. Most of the rules for generating these judgements are fairly standard [6].
We give a few example rules in Figure 2.

F I- El:Bool F t- E2:a F l- E3:a P,x:a• I- E:a• F I- E:ai[cr/Xo]


F I- If E1 then E2 else Ea:a P k Recx in E:a• r e ~(E):U(~)

F I- E:U(a) F, xl:al[a/Xo] I- EI:a' ... F,x,~:an[cr/Xo] k- E,~:a'


F I- CaseEofcl(xO) ~ E1 I ' " I cn(Xo) ~ E,~ :a'

Fig. 2. Example Rules for Generating Proved Expressions in Jbl

There is an equational theory for A4. A t h e o r e m of 2k4 takes the form F P E =


E':a (where necessarily F F E:a and F t- E':a are proved expressions). The rules
for generating the theorems are also fairly standard, and are omitted except for
the example rules which are given in Figure 3. In the case that the environment
F is empty, we shall write E:a and E1 = E2:a. The set of A4 p r o g r a m s Of
t y p e a is Prog~ dej { p I ~a(P:a)} and Prog • dealUa Prog~ is the set of 34
p r o g r a m s . Canonical forms V are given by the grammar
Y ::= 01 L/J I ( E , E ) l A x : a . Z l L i f t ( E ) lc(E).

The set of f14 values of t y p e a is given by Value~M dej {V I 3a(Y:a)} and


Value ~ dJ [I
ujO.
Value~ is the set of A'[ values.

F I- E:~i F, Xl:O'l }- El:dr ... F, xn:an I" En:a


F k Caseci(E)ofcl(xl) --+ E1 I . . . I c,~(x,~) -r E,~ = E~[E/xi]:a
r~- E:U(a) r,z:U(a)t-Z':a'
F k- CaseEofcl(xl) --+ E'[Cl(Xl)/z]l... I ~.(x.) -, E'F-(x-)/z] = E'[E/z]:a '

Fig. 3. Example Rules for the Equational Theory of E

Finally, we equip the syntax of f14 with an operational semantics. This is spec-
346

ified in two ways, first in the style of natural semantics 'big-step' reduction
relations, and second in 'small-step' reduction relations. The former is specified
via judgements of the form P ~ V where ~ C_ Prog M x Value M. The latter
reductions take the form P1 ~ P2, with P1 and P2 both M programs. We omit
most of the rules for generating the operational semantics, except those associ-
ated with recursion and the recursive datatype which appear in Figure 4. Given
any program P, we write P4~ to mean that there is a value V for which P I W .
As usual for a deterministic language we can prove that P ~ V iff P -~* V.

E[Rec x in E/x ] j~ V E ~ ci(E') Ei [E~/xl] ~ V


RecxinE ~ V CaseEof cl(xl) --~ El l ... ] c,~(x,~) --~ E~ ~ V

Split (El, E2) as (x, y) in E --+ E[E1, E2/x, y]

Fig. 4. Examples of the reduction relations for Ad

Our aim is to prove the following theorem.

T h e o r e m 2.
(1) If P E P r o g ~ and P -4 P', then P' E Prog~ and moreover P = P':a is a
theorem of M .
(2) If P = Lift(P'):a• is a theorem of M then there exists a value Y for which
V E Value~J~z and P ~ V.
It is easy to prove the first part by rule induction on P --+ P~. A corollary is that
whenever P ~ V, P = V:a is a theorem of A,t.
In order to prove the second part, we first give a denotational semantics to ~4
in the category ~ of complete pointed posets (cppos) and Scott continuous
functions. For us, a cppo is a poset which is complete in the sense of having joins
of all w-chains and pointed in the sense of having a bottom element. Closed types
will be modelled by cppos, and the proved terms by Scott continuous functions.
In order to set up the denotational semantics, we define a set of functors, each
functor being of the form

Fo-:61~_p x 5'PO• x d/:'O~_p x 5"PO• ~ d]::~•


(X-,X+,Q-,Q +) ~ Fr +)

where 57X9• is the category of cppos and strict continuous functions. These
functors are introduced to provide convenient machinery for specifying the se-
mantics of types, and for inducing functions which arise when we later prove
the existence of certain logical relations. The cppos X - and X + will model the
347

parameter type variable X0, with X - modelling negative occurrences of X0 and


X + modelling positive occurrences. The cppos Q - and Q+ will play a role in
modelling the recursive datatype declaration. The reader should also note that
these functors are on the category of cppos and strict continuous functions--this
is to take advantage of the minimal invariant cppos of Freyd and Pitts [9, 21].
The functors are defined by clauses such as

9 FUnit(X_,X+,Q_,Q+) de__fl i ,
9 F,,~,~, d"=fF,~(X+,X-,Q+,Q-) --+ F~,(X-,X+,Q-,Q+),
9 FU(xo ) (X-, X +, Q-, Q+) ~f Q+,
9 Fxo (X-, X +, Q-, Q+) dej X + ' and
9 F ( X - , X +, Q - , Q+) de=f (E~F~, ( X - , X +, Q - , Q+))• (where ,U denotes a
coproduct of cpos, itself a cppo).
The remaining clauses are omitted. The definition of the semantics of the closed
types a, written [a], are as the reader expects, except possibly for a recursive
type U(a). There is, for each pair ( X - , X +) of cppos, a functor

CPO~p x 57:~x (Q-'Q+) ~ F ( X - ' X + ' Q - ' Q + ! CPO•

We can then exhibit a family of cppos (D(X+,X -) [ X + , X - E ~ • which


possess the simultaneous minimal invariant property--see [22]. In particular,
there are isomorphisms i:F(X-, X +, D ( X - , X+), D ( X - , X+)) ~- D ( X - , X +)
in ~ • and we define [U(a)] def n([a], [a]). Given a environment F we define
IF] to be the cppo which is the product of the denotations of the types appearing
in F, and we then specify a continuous function IF F E:a]:[F] -+ [a]. The
definition of these semantic functions is quite standard and omitted, but we do
give the meaning of expressions associated with recurs• types:

9 If ej de=f~F ~- Ej:aj[a/Xo]]:[F]] -+ [aj[a/Xo]], and ~ E IF], then we shall set


IF ~- cj(Ej):U((r)](~) def i([inj(ej(~))]) E [U(ff)] where one can show that
there is an isomorphism i:(E[aj[a/Xo]])z ~- [U(a)] and in is the expected
insertion function into the disjoint sum.
9 If e =def [ r e [U(o)] and

def

then
[IF }- CaseEof cl(Xl) -+ E1 I"" I Cn(Xn) --~ En:a'](~)
clef f ej(~, _L) if i-I(e(~)) = _l_
= [ if =

To prove Theorem 2, we shall show that there is a type indexed family of relations
<~ C_ [a] x Prog~ satisfying certain conditions. Such formal approximation
348

relations are fairly standard (see for example [7, 22, 24]) so we simply give these
conditions at lifted and recursive types:

e % . P iff 3d 6 [a].e = lift(d) implies 9P1.P ~ Lift(P1) and d %/)1,

r ~U(~) P iff r = _L or ~Pj.P ~ cf(Pj) and 3dj.r = inj(dj) and dj ~j[a/Xo] PJ"

We shall need the following lemma:


L e m m a 4 . Given such a family of relations, if xl:al,...,x,~:am [- E:a and
(dk ~ Pk I 1 < k < ,~) then [r F E:~ ~ ~ E[P/4.
Proof. The proof is by induction on the structure of E, which is routine and
omitted. []
In particular, it follows from Lemma 4 that [P:a] % P for any P E Prog~. We
can now complete the proof of Theorem 2. For suppose that P1 = Yal(P2):a•
Then by soundness of the semantics we have [Pl:a• = lift([P2:a]) r _l_ and
from Lemma 4 we have [Pl:a• %~ /'1. Hence, from the property of % . we
deduce P1 ~ Va](P3) for some P3 as required.
The existence of the formal approximation relations can be proved by techniques
which appear in Plotkin's CSLI notes [24]. However, it is more elegant to adapt
Pitts' method of admissible actions on relational structures. We give an outline
of the method. Set H d_ef{P:a ] P 6 Prog~, a 6 Type} and for any cppo X put

n(x) %f
{ a 6 7)(X x H) I for each P:a, {x I (x, P:cr) 6 R} C_X is chain complete}.

Write D def D(1, 1). We then define (monotone) functions, at each closed type
a, where Fa:TZ(D) ~ x TZ(D) -4 TZ(F~(1, 1, D, D)), by inductive clauses such as

t-el o')1 f = • or
P 1~ Ax. E' and V(d, Pl:a) 6 F~(S, R).(f(d), E'[P1/x]:a ') 6 Fa,(R, S)}.

Using these functions, and lifting the function

F~j(1,1, D,D) i~ F(1,1, D,D) ~ D

to a (monotone) function ~(F~j (1, 1, D, D)) -+ 7Z(D) in a similar fashion, we


arrive at a (monotone) function @:T4(D)~ xTZ(D) --+ T4(D), and by symmetrising
we can take its least fixed point ( A - , A +) 6 TZ(D) ~ x TZ(D). In fact using
the minimal invariant property associated with D, we can show A - = A+ (the
lengthy proof is omitted due to lack of space). We set

% de=f{(d,p) t(d,p:a) 6 F~(1,1, D,D)}.


349

4 T h e translation of O into

Following Plotkin [24] we induce a denotational semantics on O, via a textual


translation ( _ ) o of its types and expressions into ,~r Each O type T is sent
to an J~4 type (T) ~ that models O values of type T. We have (unit) ~ d__efUnit,
(bool) O d_efBool, ( i n t ) O d=efInt and (f 1 *7"2)O d el (7.1)O X (7.2)O. Our translation
of an (_9 function, (T1 -> T2) ~ must model the "pseudo-functions" read and
write, and so cannot simply be (T1) ~ -4 (T2)~ (as of course McCarthy realised)
but must be (T1) ~ -4 T(T2) ~ where the range is a type of c o m p u t a t i o n s [19].
If 7- is an O type, M type T(T) O is to represent the behaviour of O programs of
type T, including divergent programs and communicators as well as values. Using
an idea that dates at least to Plotkin's Pisa notes [23, Chapter 5, Exercise 4],
we set Ta d=ef(U(a))_L given the following top-level A~ declaration:

datatype U (Xo) = Crd of Int -+ U(Xo)_L


I ewr of Int x U(Xo)_L
I c ~ of Xo

We may form programs of type Ta using the following definitions:

Read(E) def: Lift(crd(E))


Write(El, E2) de=f Lift(c~r((E1,E2)))
Return(E) def: Lift(crew(E))

Roughly speaking, a computation of type T(T) O consists of potentially un-


bounded strings of Read's or Write's terminated with either _Lor a Return bearing
an element of type (T) ~ Hence T(~-)~ is a suitable semantic domain to model
the behaviour of arbitrary O programs of type 7. It better models the interleav-
ing of input and output than early denotational semantics models that passed
around a state containing input and output sequences (see Mosses [18]).
O expressions are inductively translated into M expressions following the
monadic style pioneered by Moggi [19] and Pitts [20]. The translation is pa-
rameterised by a monad (in the type-theoretic sense of Wadler [26]) (T, Val, Let)
where Val and Let are M combinators with the following types.

Vah a -4 Ta
Let: Ta -4 (a -4 Ta') -4 Ta I.

(Strictly, speaking these are type schemes, and Val and Let are type-indexed
families of combinators.) The idea behind this monadic translation is that Val
and Let correspond to immediate termination and sequential composition re-
spectively. We can define Val d_efAX. Return(x) and Let has a recursive definition
350

that roughly speaking stitches together the strings of I/O operations denoted by
its two arguments,

Let d=_ef Fix(Met. Ax. Split x as (s f ) in


Drop s to ~o in
Case w of
Crd(g) --~ Read(Ax. let (g x, f) )
cwr(x) -+ Split x as (y, s in Write(y, let (s f))

where Fix:(a -4 a~L) -~ (a --+ a~_) is a fixpoint combinator defined from Rec [10].
(Note that let, w and (o and their primed variants are simply jr4 variables.)
We simultaneously define the translation (_)o of arbitrary O expressions to AA
expressions, and an auxiliary translation ]_[o of O value expressions. Here are
the rules for value expressions
IXl O --: X

I()1 ~ - 0
I-~1~ --- ls
I~_1~ = ~x. S p l i t x a s (y,y') in DropyL| in V a l z
I~=1 ~ - Ax.Splitxas (y,z)inValy
Isndl ~ - Ax. Split x as (y, z) in Val z
161~ - Fix(~I~. ~x. (e~[fq~]) ~ where ~un~(x:~):~5 dej e~
I(~, u)I ~ - (Ivl ~ lul ~
I~=:~.el ~ - Ax:(~) ~ ~
Iread] O - Ax:Unit. Read(Val)
I,=itel ~ - A x : l n t . W r i t e ( x , Val 0 )

and here are the rules for arbitrary expressions, where Let x <= E in E' is an
abbreviation for Let (E, Ax. E').
(re) ~ -~ Return(lvel ~ (*)
(~)o = R e c x i n x
(if el then e2 else e3)0 --- Letx <= (el) O in I f x t h e n (e2) O else (e3) ~
(vee')~ - Letx <= (e') 0 in Ivel ~ x (**)
(ee')~ -- Letf 4= (e) O in L e t x 4= (e') 0 in f x
( (ve, e') ) ~ _-- Lety 4= (e') ~ in Val(Ivel~ y) (**)
((e,e')) ~ _ Letx <== (e) ~ in L e t y 4= ( d ) ~ in V a l ( x , y )

Rules marked (*) and (**) take precedence over later rules.
L e m m a 5 (Correspondence).
(1) L e t x <= R e t u r n ( P ) i n E --~+ E[P/x]
351

(2) l i r , x:r e: T' and r F ve: then (e)~176 = (e['e/x]) ~


(3) l i p -+ q then (p)O ~ + (q)O.
Proof. (1) is a routine calculation. (2) is by induction on the structure of e and
ve and (3) by induction on the derivation of p -+ q. O
Without the prioritised translation rules marked (**), part (3) would fail. If
p -+ q we would have (v,p) ~ (v,q) but not ((v,p)) ~ ~ + ((v,q)) ~ However
one can easily show that all the rules are valid up to provable equality in M .
Part (3) as proved makes the proof of Lemma 7 particularly simple.
L e m m a 6 . l i C[writen] and C[read ()] are communicators and v is a value,
(v) 0 = Return(Iv[ O)
(C[read ()])o = Read(Ax:lnt. (C[x]) ~
(C[writen_]) ~ = Write(/nJ,(C[()]) ~
are all .44 theorems.
Proof. The first equation follows by inspection. Proofs of the other two are by
induction on the number of experiments making up evaluation context C. D
L e m m a 7 ( A d e q u a c y ) . p~L i~ (p)O~.
Proof. Use the last two lemmas and Theorem 2.
L e m m a 8. If (a) ~ (b) ~ and a p there is q with b .2_+ q and (p)O (q)O
Proof. By a simple case analysis and Lemma 6.
L e m m a g . Relation S de_f{ (p, q) i (p) O -: (q)O} is a bisimulation.

Proof. Suppose that p S q and that p a ) pr. By Lemma 2 there is a with p -+* a
and a a ) p'. By Lemma 5 we have (p)O __+. (a)O and therefore (p)O = (a)O
by Theorem 2. By transitivity (q)O = (a)O is derivable, so by Theorem 2 and
Lemma 6 we have (q)O~. Hence q$ by Lemma 7, that is, there is b with q -+* b.
By Lemma5 and Theorem2 we have (q)O = (b)O and so (a) ~ = (b) 0 by
transitivity. Hence by Lemma 8 there is ql with b a ) ql and (p~)O = (qr)O.
Altogether we have q a ~ q~ and p~ S q~. A symmetric argument shows that q
can match any action of p, hence S is a bisimulation. []
The soundness of metalogical reasoning follows by co-induction, Lemma 3.
T h e o r e m 3 ( S o u n d n e s s ) . (p)O = (q)O implies p ~ q.

5 Discussion

By consolidating prior work on operational semantics, bisimulation equivalence


and metalogics for denotational semantics, we have presented the most compre-
hensive study yet of I/O via side-effects. Previous work has treated denotational
352

or operational semantics in isolation. Our study combines the two to admit


proofs of programs based either on direct operational calculations (Theorem 1)
or metalogical inference (Theorem 3).
Williams and Wimmers' paper [27] is perhaps the only other to consider an
equational theory for a strict functional language with what amounts to side-
effecting I/O, but they do not consider operational semantics. Similarly, the
semantic domains for I/O studied in early work in the Scott-Strachey tradition
of @notational semantics [18, 23] were not related to operational semantics.
In his CSLI lecture notes, Plotkin [24] showed how Scott-Strachey denotational
semantics could be reconciled with operational semantics by equipping his meta-
language (analogous to our M ) with an operational semantics. He showed for a
given object language (analogous to (9) that the adequacy proof for the object
language (analogous to Lemma 7) could be factored into an adequacy result for
the metalanguage (analogous to Theorem 2) together with comparatively routine
calculations about the operational semantics. Moggi [19] pioneered a monadic
approach to modularising semantics. In an earlier study [7] we reworked Plotkin's
framework in a monadic setting, for a simple applicative language.
We have made two main contributions to Plotkin's framework. First, by adapting
recent advances in techniques for showing the existence of formal approximation
relations we have a relatively straightforward proof of computational adequacy
for a type theory with a parameterised recursive type. This avoids the direct con-
struction of formal approximation relations using the limit/colimit coincidence
(see for example [8]). Instead we use the minimal invariant property which char-
acterises the (smallest) coincidence. Second, we use the adequacy result for O
(Lemma 7) and co-induction to prove the soundness of metalogical reasoning
with respect to operational equivalence (Theorem 3).
The idea of using a labelled transition system for a functional language, to-
gether with co-inductively defined bisimilarity, is perhaps the most important
but the least familiar in this paper. It appears earlier in Boudol's concurrent 7-
calculus [3], but Boudol does not establish whether bisimilarity on his calculus is
a congruence. Abramsky's applicative bisimulation [1] is another co-inductively
defined equivalence on functional languages but based on a 'big-step' natural
semantics. Labelled transitions better express I/O, and hence are preferable to
natm'al semantics for defining languages with I/O.

Acknowledgements We thank Simon Gay, Andrew Pitts and Eike Ritter for use-
ful discussions. Roy Crole was supported by a Research Fellowship from the
EPSRC. Andrew Gordon was funded by the Types BRA. This work was par-
tially supported by the CLICS BRA.

References

1. Samson Abramsky and Luke Ong. Full abstraction in the lazy lambda calculus.
Information and C o m p u t a t i o n 105:159-267, 1993.
353

2. Dave Berry, Robin Milner, and David N. Turner. A semantics for ML concurrency
primitives. In 19th P O P L , pages 119-129, 1992.
3. G~rard Boudol. Towards a lambda-calculus for concurrent and communicating
systems. In T A P S O F T ' 8 9 , Springer LNCS 351, 1989.
4. Roy. L. Crole. Categories for Types. CUP, 1993.
5. Roy. L. Crole and A. M. Pitts. New foundations for fixpoint computations: FIX
hyperdoctrines and the FIX-logic. I n f o r m a t i o n and C o m p u t a t i o n , 98:171-210,
1992.
6. Roy L. Crole. P r o g r a m m i n g Metalogics with a Fixpoint T y p e . PhD thesis,
University of Cambridge, 1992.
7. Roy L. Crole and Andrew D. Gordon. Factoring an adequacy proof (preliminary
report). In Functional Programming~ Glasgow 1993, Springer 1994.
8. Marcello P. Fiore and Gordon D. Plotkin. An Axiomatisation of Computationally
Adequate Domain Theoretic Models of FPC. In 9th LICS, 1994.
9. P. Freyd. Algebraically complete categories. In 1990 C o m o C a t e g o r y T h e o r y
Conference, Springer Lecture Notes in Mathematics, 1991.
10. Andrew D. Gordon. Functional P r o g r a m m i n g a n d I n p u t / O u t p u t . CUP,
1994.
11. Andrew D. Gordon. An operational semantics for I/O in a lazy functional lan-
guage. In F P C A ' 9 3 , pages 136-145. 1993.
12. Andrew D. Gordon. A tutorial on co-induction and functional programming. In
F u n c t i o n a l Programming~ Glasgow 1994. Springer Workshops in Computing.
13. SSren HolmstrSm. PFL: A functional language for parallel programming. Report
7, Chalmers PMG. 1983.
14. Douglas J. Howe. Equality in lazy computation systems. In 4 t h LICS, 1989.
15. John McCarthy et al. LISP 1.5 P r o g r a m m e r ' s Manual. MIT Press, 1962.
16. Robin Milner. C o m m u n i c a t i o n and C o n c u r r e n c y . Prentice-Hall, 1989.
17. R. Milner, M. Torte and R. Harper. T h e Definition of SML. MIT Press, 1990.
18. Peter D. Mosses. Denotational semantics. In Jan Van Leeuven, editor, H a n d b o o k
of T h e o r e t i c a l C o m p u t e r Science, pages 575-631. Elsevier 1990.
19. Eugenio Moggi. Notions of computations and monads. TCS, 93:55-92, 1989.
20. Andrew M. Pitts. Evaluation logic. In I V t h H i g h e r O r d e r W o r k s h o p , B a n f f
1990, pages 162-189. Springer 1991.
21. Andrew M. PittS. Relational properties of domains. Tech. Report 321, University
of Cambridge Computer Laboratory, December 1993.
22. Andrew M. Pitts. Computational adequacy via 'mixed' inductive definitions. In
M F P S IX, N e w Orleans 1993, pages 72-82, Springer LNCS 802, 1994.
23. Gordon D. Plotkin. Pisa notes on domains, June 1978.
24. Gordon D. Plotkin. Denotational semantics with partial functions. Stanford CSLI
1985.
25. Jonathan Rees and William Clinger. Revised a report on the algorithmic language
scheme. A C M S I G P L A N Notices, 21(12):37-79, December 1986.
26. Philip Wadler. The essence of functional programming. In 19th P O P L , 1992.
27. John H. Williams and Edward L. Wimmers. Sacrificing simplicity for convenience:
Where do you draw the line? In 15th P O P L , 1988.
A n Intuitionistic M o d a l Logic
with Applications to the
Formal Verification of Hardware

Matt Fairtlough 1 and Michael Mendler 2

1 University of Sheffield
Department of Computer Science
Regent Court, Sheffield $1 4DP, UK
Emaih m.fair tlough@dcs.shef.ac.uk
2 (Corresponding author)
Technical University of Denmark
Department of Computer Science
Building 344, DK-2800 Lyngby, Denmark
Emaih mvm@id.dtu.dk

A b s t r a c t . We investigate a novel intuitionistic modal logic, called Pro-


positional Lax Logic, with promising applications to the formal verific-
ation of computer hardware. The logic has emerged from an attempt to
express correctness 'up to' behavioural constraints - - a central notion
in hardware verification - - as a logical modality. The resulting logic is
unorthodox in several respects. As a modal logic it is special since it
features a single modal operator O that has a flavour both of possibility
and of necessity. As for hardware verification it is special since it is an
intuitionistic rather than classical logic which so far has been the basis
of the great majority of approaches. Finally, its models are unusual since
they feature worlds with inconsistent information and furthermore the
only frame condition is that the O-frame be a subrelation of the D-frame.
We provide the motivation for Propositional Lax Logic and present sev-
eral technical results. We investigate some of its proof-theoretic proper-
ties, and present a cut-efimination theorem for a standard Gentzen-style
sequent presentation of the logic. We further show soundness and com-
pleteness for several classes of fallible two-frame Kripke models. In this
framework we present a concrete and rather natural class of models from
hardware verification such that the modality O models correctness up to
timing constraints.

1 Motivation

It is good engineering practice to think of the synthesis of a hardware device


as proceeding through numerous levels of abstraction. The attempt to form-
alize this process in mathematical logic for the purpose of formal verification,
however, faces a major obstacle: behavioural abstractions are genuine mathem-
atical abstractions only up to behavioural constraints, i.e. under certain restric-
tions imposed on the device's environment. Timing constraints on input signals
355

form an important class of such restrictions. Since correctness across abstraction


levels is only correctness up to constraints, rather than correctness proper, the
ambient formalism no longer provides a (uniform) notion of correctness across
the abstraction boundary. A concrete example is the common timing abstrac-

Fig. 1. A Simple Combinational Circuit

tion at the gate level. It is convenient to reason about the static behaviour of a
combinational circuit in terms of high or low voltage and to abstract away from
propagation delays. This makes it possible to analyse large circuits by classical
propositional logic and standard Boolean techniques. In this 'ideal' abstract set-
ting the behaviour of the circuit shown in Fig. 1 may be specified by "if input
A is 0 then output C is 0," formally

A=ODC=O.

This abstract formulation elides, of course, much concrete-level detail. As a spe-


cification of static behaviour it is valid only as long as gate delays can be ignored.
This is, however, not always the case. In the design of real circuits delays can
cause considerable grief. Races, hazards, or glitches are delay-related phenomena
that may render simple boolean reasoning unsound. In the real circuit corres-
ponding to Fig. 1 for instance, a 1 ~ 0 transition on input B may produce a
0 ~ 1 ~ 0 glitch on C even if A is constant 0, viz. whenever the delay through
the inverter is sufficiently small compared to that through the or-gate. Thus, in
the presence of propagation delays, the above specification is unsound.
So, how can we justify the 'ideal' abstract description in terms of the more
concrete level? Well, the best we can expect is that instead of the original spe-
cification the 'real' behaviour satisfies but an approximation like

B stable D (A = 0 ~ (after some delay D C -- 0)),

weakened by the stability constraint 'B stable' to rule out the glitch, and by the
timing constraint 'after some delay' to account for the input/output propagation
delay. The trouble is that both constraints refer to time and thus belong to the
concrete level of timed signals. Thus, the dominant formalism of Boolean algebra,
or propositional logic for that matter, is not adequate to capture correctness of
356

the timing abstraction. No problem, one might say, since of course at the low
level one can make things precise, say by

((Vr.B(r)=0)V(Vr.B(r)=5)) D ( V r . A ( r ) = 0 ) D ( V r . r _ > A D C(r)=O).

Though the state of the art in formal hardware verification this cannot be the
answer as it jeopardizes the crucial distinction between the abstraction levels.
We are throwing overboard the abstractness of the Boolean approach, and we
are back where we started, entangled in the nitty-gritty details of exact timing
verification.
Fortunately, there is a middle way of tackling the problem of approximat-
ive and incomplete abstractions: we employ a weakened notion of correctness,
viz. correclness-up-to-consiraints, and formalize it as a logical modality. The ex-
ample circuit's behaviour then would be specified by

O(A = 0 D O(C = 0)),

to be read as "under some constraint, if A is low, then under some constraint,


C is 0". Thus, the modality O is used to account for the stability and timing
constraints, which are no longer part of the specification but of the semantic
model and of the correctness proofs. The technical advantages of this idea have
been worked out in [11, 12]. Given the great variety of uses that the notion of
'constraint' finds in hardware engineering, a general advantage of this framework
is to provide a precise definition of constraint correctness that permits more or
less arbitrary instantiation while yielding useful metamathematical results.

2 Propositional Lax Logic

In this paper we shall present a concrete formal calculus, Propositional Lax Logic,
conceived along the lines set out in [12]. The term 'lax logic' is chosen to indic-
ate the 'looseness' associated with the notion of correctness up to constraints.
Propositional Lax Logic, PLL, is an intuitionistic propositional calculus with a
single modality O. The intuitive interpretation of OM is "for some constraint
c, formula M holds under c". Clearly, different notions of constraint will have
different properties, and thus will give rise to different axioms for O. The generic
interpretation leads to the following three axioms:

OR : M D O M
OM : OOM D OM
OF : ( M D N ) D ( O M D O N ) .

Axiom OR says "if M holds outright then it holds under a (trivial) constraint";
OM says "if under some constraint, M holds under another constraint, then
M holds under a (combined) constraint"; finally, O F says "if M implies N
then if M holds under a constraint, N holds under a (the same) constraint."
However innocent each of these axioms may appear in this informal reading,
their combination results in a rather strange modality. Indeed, O has a flavour
357

of both possibility and of necessity without being one or the other. Axioms OR
and OM are typical of possibility while O F is typical for necessity. On the other
hand, in standard systems, say Lewis' modal system $4 [2], the axiom OR is
never adopted for necessity while OF never for possibility, and in fact they would
trivialize the modalities.
The second noteworthy feature of PLL is that it is an intuitionistic rather than
classical logic which so far has been the basis of the great majority of approaches
in the area of hardware verification. In [11, 12] the intuitionistic nature has
been exploited to extract constraints from proofs. Yet, dropping the Excluded
Middle is not merely for pragmatic reasons: PLL is essentially intuitionistic in
the sense that assuming the Excluded Middle and the axiom -10 false trivializes
O, i.e. O M becomes provably equivalent to M. This is another indication for the
'strangeness' of O in the context of standard classical modal logics. Of course:
we hope this paper will convince the reader that the O modality is actually a
very natural one.
So, just what kind of modality is O? Why should it be interesting at all and how
does it relate to correctness-up-to constraints? In [11, 12] O is motivated by a
proof-theoretic interpretation. The present paper attempts to justify the axioms
and their constraint interpretation by model-theoretic means. In this paper we
will show that PLL has a natural class of Kripke models for which it is sound
and complete. Two concrete subclasses of such models will be presented obtain-
ing two concrete constraint interpretations of O. These concrete models verify
that PLL has nontrivial expressiveness and illustrate the benefit of dropping
Excluded Middle and ~Ofalse. But before we get to the technical results it may
be appropriate to give some general justification of O.
(1) Consider the most simple constraint interpretation, viz. O M - C D M ,
where C is an arbitrary but fixed constraint. Under this encoding all three axioms
OR, OM, OF become tautologies of (intuitionistic) propositional logic. With
modification, this generalizes to a set g of constraints: OM - Z C E g. C D M
(see [11]). The single constraint interpretation is precisely Curry's system LJZ
[3]. PLL itself appears to have occurred for the first time in Curry's 1948 Notre
Dame lectures on A Theory of Formal Deducibility [4]. These lectures contain
some sketchy remarks on a O modality endowed with axiom schemata that are
essentially equivalent to the ones we are adopting for O. The present paper may
be seen as giving a model-theoretic account of Curry's proof-theoretic O and in
particular of LJZ.
(2) A second motivation for O can be drawn from general type theory. The formal
properties of O viewed as an unary type constructor are precisely the data of
a strong closure operator, or strong monad familiar from category theory. In
fact, the propositions-as-types principle which yields an equivalence between In-
tuitionistic Propositional Logic (IPC) and bi-Cartesian closed categories can be
extended to an equivalence between PLL and bi-Cartesian closed categories with
a strong monad. This categorical structure is also known as the computational
lambda calculus Ac [13]. The application of Ac as a calculus of proofs for PLL
has been investigated by Benton et al. [1] (there the logic is called CL).
358

(3) The third motivation for O is the possibility of a timing analysis of combin-
ational circuits. In an equivalent presentation of PLL we can replace O F by the
axiom
OS:(OM A ON) D O(M A N)
and the additional inference rule "from M D N infer OM D ON". One may
now establish a direct correspondence between the axioms used in verifying the
functional behaviour of a combinational circuit and the computation of a data-
dependent timing constraint: OR corresponds to a wire, which involves zero
delay 0; OM deals with the sequential composition of circuits, which involves the
addition of delays +, and OS effects the parallel composition of circuits, which
amounts to the maximum operation max on delays. In other words, by systematic
translation of proofs in PLL into a term over the delay algebra (Nat, O, +, max)
we can extract verification-driven (= data-dependent) timing information. This
is essentially an interpretation, in the sense of (2), in a concrete A, calculus.

3 Results

The formulas of PLL are generated by the grammar

M::= A ] true I false I M A M I M VM ] MDM ] --,M ] O M

where A ranges over a countably infinite set of propositional constants


{p0, p l , . . . } . We will sometimes also use - to abbreviate biimplication.
The Hilbert system of PLL takes as axiom schemata all theorems of IPC, plus
the axiom schemata OR, 9 OF, and Modus Ponens as the only inference
rule. The finitary deduction relation induced by this axiom system is denoted
by [-PLL.
The Gentzen-style calculus for PLL is presented in terms of ordinary sequents
s b A, where F is a finite, possibly empty, list of hypotheses and z5 a finite list
of assertions with length 0 or 1. The inference rules for deriving sequents are
the standard ones for IPC plus two special rules OR and 9 which capture the
properties of 9 :

F ~ - M OR F,M~-ON 9
F ~- OM F, OM F- ON

The complete set of rules is listed in Fig. 2. One can verify that the Hilbert and
Gentzen .systems for P LL are equivalent, i.e. for all formulas M, ~-PLLM iff ~- M
is derivable.

Lemma 1 Deduction Theorem. F, M ~'PLL N implies 1" ~-PLLM D N.

The deduction theorem does not hold for ordinary modal logics. For instance in
K, T, $4 [2] we have M ~- ~ M but ~/M D QM, and M D N b OM D ON but
k/(M D N) D (OM D <~U).
359

Logical Rules

FFM FFN F,M,N b A


^R ^L
FbMAN F, M A N t - A

F, M F A F,N~- A
VL
F, M V N F A

FI--M FI-N
VR1 VR2
I~}-MVN FbMVN
F, M F N FF M F, N F A
DR DL
FF M D N F, M D N b A

F, M I - - R FF M ,L
F t- --M F,-',M b

F I ' - M OR F, M F O N OL
F F OM F, O M I-- ON

Structural Rules

id F ~- M F, M F A cut
M b M FFA

s FI-
weakL - - weakR
F, M F AI F~- M

F, M, M I- A contr F, M, N, F' l- A exch


F , M t- ~ F , N , M , U t- A

Fig. 2. Gentzen Rules for PLL.

T h e o r e m 2 S t r o n g C o n s e r v a t i v i t y . Let M be a theorem of PLL. Then the


f o r m u l a M I, where M ~ is obtained f r o m M by removing all occurrences of O, is
a theorem of IPC.

Another way of translating theorems of PLL into theorems of I P C is obtained


by replacing all subformulas prefixed by O by true. Both results are special
instances of the more general result that the translation O M - C D M preserves
provability; for the first take C - true and for the second take C -- false.
From the latter translation we m a y conclude, for instance, that -~ O false is not
a theorem of PLL. This ensures t h a t PLL is not a trivial extension of I P C in
the sense that it is not possible to transform a theorem of I P C into a theorem of
PLL by arbitrarily introducing Os. Note also that both ~<>false and -~ D false
are theorems of standard modal logics with reflexive accessibility such as $4.

Theorem 3 Cut Elimination. I f t- A is derivable, then it is derivable without


the cut rule.
360

The proof uses the same method that works for IPC [5]. One new reduction step
needs to be introduced, as shown in Fig. 3.

HI II2
FF-M F, M F - o N
OR 9L ~zl ~z2
F F- O M 17, O M ~- O N F ~- M F, M ~- O N
cut reduce cut
F ~- o N ==# 1~ b o N .

Fig. 3. Primitive Cut Reduction Step

A direct consequence of cut-elimination is the decidability of PLL. Other con-


sequences are the disjunction and the subformula property, and the admissibility
of the rule ~- O M =~ ~- M, which is the inverse of the necessitation rule of stand-
ard modal logics.
There is a natural class of Kripke models for which PLL is shown to be sound and
complete. The models are two-frame Kripke models with a single frame relation
and fallible worlds. Kripke-style analyses have been given for other intuitionistic
modal logics, for instance by Simpson [15] and Plotkin and Stirling [14] for
system IK, by Fischer-Servi [8] for the class of (*)-IC systems, and by Ewald
[6] for an intuitionistic tense logic. The approach taken here most closely follows
[14] in using one set of worlds but two separate frame relations to interpret O
and D.
D e f i n i t i o n 4 K r i p k e C o n s t r a i n t M o d e l . A (Kripke) constraint model for
PLL is a quintuple g = (W, Rm, Ri, V, F), where W is a non-empty set, R,~, Ri
are binary relations on W, F C W, and V is a map that assigns to every pro-
positional constant A of PLL a subset V ( A ) C W . These data are subject to the
following conditions:
- Rm, Ri are preorders, i.e. reflexive and transitive relations, and Rm C Ri,
- F and V are hereditary w.r.t. Ri, i.e. i f a R i b , then a E F implies b C F, and
a E V ( A ) implies b E V(A),
- V is full on F, i.e. F C_ V(A).

If aR,~b then we say that b is a constraining of a, or b is reachable from a under


a constraint. Elements of F are fallible worlds and if aR,~b and b E F, then
intuitively the constraint leading to b is inconsistent with world a. Models with
fallible worlds are not a new concept. They have been introduced previously
to admit intuitionistic recta-theory for intuitionistic logic, see e.g. [16, 5]. As
we will show, in our context, fallible worlds arise naturally from the constraint
interpretation.
D e f i n i t i o n 5 V a l i d i t y . Let C = (W, Rm, Ri, V, F ) be a constraint model for
PLL. Given a formula M and c E W, M is valid at c in g, written C, c ~ M iff
361

- M is a propositional constant A and c E V(A);


- MisNAKandbothC, c~NandC, c~K;
- MisNVKandC, c~NorC, c~K;
- M is true; or M is false and c E F;
- M is N D K and for all a E W such that cRia, C, a ~ N implies C, a ~ K;
- M is of form O N and for all b E W, crib, there exists a E W with bRma
such that C, a ~ N.

A formula M is valid/n C, written C ~ M, if for all c E W, M is valid at c in


C; M is valid, written ~ M, if M is valid in any constraint model C.

Validity behaves as in ordinary intuitionistic logic, viz. it is hereditary w.r.t, the


accessibility relations. Formally, If c ~ M and crib, then b ~ M. Since Rm is
a subrelation of Ri, validity is hereditary w.r.t. R,~ too. Worlds w, v with wRiv
and vRiw validate the same formulas and can thus be identified. Hence, it is no
restriction to assume that the relation R~ is a proper ordering,/, e. antisymmetric.
Some remarks concerning our definition of validity are in order. First, one notes
that fallible worlds validate all formulas and that -10 false is not valid in general.
Second, the clause for validity of ON is a V3 statement. This endows O with
properties of both possibility and of necessity. A direct consequence is that O
is hereditary w.r.t, the intuitionistic frame Ri without further imposing a con-
fluence frame condition as in the models for IK [14]. Another consequence is
that this semantics of 9 does not validate the scheme O ( M V N) D O M V ON.
Both this scheme and -1 9 are generally adopted for modality ~, even for
intuitionistic logics such as IK and apparently also by the class (*)-IC of logics
considered by Fischer-Servi in [8]. We will present concrete constraint models
falsifying as well as validating these axioms. Finally notice that there is no point
in defining a 'necessity' modality, in contrast to IK. Its definition

w ~ DM iffVw', v'. wRiw' ~ w'R,nv' ~ v' ~ M

yields nothing new because of the frame condition Rm C_ Ri.

Theorem 6 Soundness, Completeness.

--~'PLL M iff ~ M.
- PLL +-~ Ofalse is sound and complete for the class of constraint models with
F=0.
- PLL + O(M V N) D (OM V ON) is sound and complete for the class of
constraint models where Rm and Ri are mutually confluent, i.e. if aRmb
and aRic, then there exists d such that bRid and cRmd.

Proof. We indicate the proof of the first statement which follows traditional
lines in constructing a counter model for every formula that is not derivable. The
counter model employs a suitable generalization of the Lindenbaum construction,
in which worlds are triples
(r,~,O)
362

of sets of formulas, called theories, subject to an abstract consistency condition


which reflects the semantical r61e of its components (cf. [9]).
The model is set up so that at a world w - (F, A, O) the formulas in F are
validated at w, the formulas in A are falsified at w, and the formulas in O are
falsified at every world R,n reachable from w. The sets O are a special feature
of our completeness proof and of PLL. They are introduced to make up for the
fact that falsity of a formula OM cannot be expressed by including M (or a
subformula of M) in F or A. Thus, we need to keep track of these separately.
Another special feature of the proof is the notion of consistency. A theory
( F , A , O ) is consistent if for no choice of formulas N1,...,Nn E A, and
K 1 , . . . , K ~ E O, such that n + k > 1,
F [-PLL g l V . . . V N,~ V O(K1 V.-. V Kk)
This definition is somewhat weaker than one might expect as it excludes the
case k = n = 0. The disjunction on the right must always be nonempty, with
the effect that the theories (F, 0, 0), for any choice of F, are consistent for trivial
reasons. The point here is that we take the empty disjunction to be the empty
formula rather than false.
A consistent theory is maximally consistent if there is no proper consistent ex-
tension, under component-wise subset ordering. For instance, the distinguished
theory (A_,0, 0), where .1_ denotes the set of all formulas is maximally consistent.
Observe that if (P, A, O) is maximally consistent, then false E F iff A = @ = O.
We can now proceed to define a generic Kripke constraint model
c* = (c',RL, R~,V
* * , Y *)

which falsifies all unprovable formulas. As the elements in C* we take the max-
imally consistent theories (F, A O). The accessibility relation R* is simply the
subset relation on the first component, i.e.

( r , n , O ) R~ ( r ' , n ' , O ' ) ~ r C r ' .


Constraint accessibility R m is given by

(r,n,o)RL(r',n',o') ~ rc_r' ~ 3Mer'.oM~r & oc_o'.


Valuation V* and fallible nodes F* are defined such that
d]
V*(A) - { (F,A,O) I A e F }

It is not hard to verify that these data indeed constitute a constraint Kripke
model.
Suppose VPLL M. Then (O, {M}, O) is consistent. Take a maximally consistent
extension "/', then 9- ~= M in the constraint Kripke model g*.
A complete proof of this result may be found in [7]. []
Two examples of counter models, one falsifying -~ Ofalse and the other falsifying
O(AV B) D (OAVOB), are shown in Fig. 4. Both may be seen to be constructed
from g*. The solid arrows represent R* and the dashed lines R*.
363

V=-~oSalse l~ O(A V B) D (OA V OB)

I I
I I
I I
I I

v.E F*
~A ~A ~B

Fig. 4. Two Counter Models

4 Concrete Constraint Models

In this paper we give two variants of concrete models for PLL. The first class
of models is characterized by formulas of the kind O M = C[M] where C[_] is
one of a family of possible contexts, for example dIM] _= C D M where C
is an arbitrary but fixed proposition. As mentioned this is precisely Curry's
system LJZ [3] and a special case of the quite general constraint interpretation
according to which O M means 3' D M, where 7 is taken from a predefined set of
distinguished propositions representing constraints. Other possible contexts are
C[M] ~_ C V M or C[M] ~. (M D C) D C.
Let PLL c be the theory

PLL + O M = ( C D M )

and .AAc the class of (antisymmetric) Kripke constraint models validating PLL C.

Lemma 7 Curry Models.

- JUtC is characterized by the frame conditions (i) Vw. 3u. wRmu ~ u ~ C,


and (ii) Vw, u. (w ~ C 8~ wRmu) ::~ w = u.
- PLL C is complete for M e.

The PLL C interpretation of 9 provides us with a class of models for which the
axiom schemata --10false and o ( m V N) D (OM V ON) are unsound, in general.
The former is valid iff ~ -~-~C and the latter whenever [C] = { w I w ~ C } is
a principal ideal, i.e. for some z, [[C] = { w ] zRiw }.
Let PLL1 and PLL2 be the theories

PLLx := PLL + O M = (CV M)


PLL2 := PLL + O M ~ ((M D false) D false),
and A/li the classe of (antisymmetric) Kripke constraint models validating PLLi,
i = 1,2.
364

L e m m a 8.

-JP[1 is characterized by the frame conditions (i) Vw. w ~ C


3u. wR~u ~ u e F , and(ii) Vw, u . ( w V : C ~ wR,,u) ~ w = u .
- .A42 is characterized by the frame condition Vw, u. wRmu iff wRiu & (u E
F~wEF).
Some of these conditions are not pure 'frame' conditions as they involve the
validity of C and thus the valuation. In these cases, by 'characterized' we mean
that the class of models satisfying the given conditions is the largest class of
models for PLL c closed under any change of valuation that does not modify the
validity of C.

4.1 Combinational Circuits

The second type of models to be investigated are still more concrete. They are
obtained from the dynamic behaviour of combinational circuits under explicit
modelling of propagation delays, so that 9 means there exists a timing con-
straint d such that the circuit stabilizes in state M after time delay d. Concretely,
the circuit models are set up such that for a propositional constant A

A means 'A is stable at 1'


-~A 'A is stable at 0'
OA means 'A is going to stabilize to 1'
O-~A 'A is going to stabilize to 0.'

This allows us to retain the ideal 'static' interpretation of truth values while
safely keeping track of the offset to the real signals caused by propagation delays.
For instance, the example circuit mentioned in the introductory section (1) can
now be specified by
(B V-~B) D (-,A D O~C).

The standard way of interpreting propositional logic on circuits is to associate


propositional constants with input and output signals of combinational gates, so
that the truth values correspond to high and low voltage. Signals are conceived
as Boolean-valued functions over the time domain N of natural numbers, the
Boolean values IB being denoted by 1 and 0. A circuit interpretation of PLL then
is given by a map I, called a timing diagram, assigning to each propositional
constant A a function I(A) : N -+ 1~.
Given a timing diagram I we will construct a constraint Kripke model AdI,
so that the induced semantics for PLL complies with the informal reading of
formulas given above. The worlds of JtdI are closed-open time intervals obtained
from breaking the signal waveform I into pieces. We adopt Leibniz' view of time
which takes the process of time to be given by events, i.e. state changes. This
means that the only intervals we can form for a given I are the [s,t), where
both s and t mark a signal change. A time t + 1 marks a signal change if there
exists A such that I(A)(t) r I(A)(t + 1). Let us call these intervals the Leibniz
365

intervals of I. As special cases of Leibniz intervals [s, t) we allow s, t = 0, t = co


and empty intervals with s = t. An example can be seen in Fig. 5. It depicts
two signals I(A) and I(B) with their signal changes at times to = 0, t l , . . . , t 7 .
The Leibniz intervals, then, are Its, t j), and [ti, oo), i, j = 0 , . . . 7, i <_ j. Given a

1 1
I I I II
H
I(A) L I
I
t
J q
I

I I I I
I i ! t
I

I(B) I • I I / I
T i ~ ~
J I I I I I I
I I I I I I I

0 t~ t2 t3 t4 L5 t~t7 c~

Fig. 5. An Example Interpretation

timing diagram I, a constraint Kripke model

.s d=] (W(I), Ri(I), Rm(I), V(I), F(I))

is constructed as follows:
- W(I) is the set of Leibniz intervals for I.
Is, t)R~[s', t') if Is', t') is a subinterval of Is, t),
-

- [s,t)Rm[s',t') if [s',t') is a final subinterval of [s,t), i.e. t = t' and s < s'.
- [s,t) E V(I)(A) if I(A) is constant 1 throughout [s,t), i.e. Vx. s < x < t,
I(A)(x) = 1.
- F(I) is the set of empty intervals [s, s).

The set W(I) is clearly nonempty, as it always contains the pairs [0, 0) and [0, r
The other properties of a constraint Kripke model are easily verified. Also, as
this model is confluent, it satisfies the axiom O ( M V N) D ( O M V ON).

L e m m a 9. Let A be an atomic proposition.


- ~ A iff I(A) is constant 1.
- ~ -,A iff I(A) is constant O.
- ~ O A iff I(A) stabilizes eventually to 1, i.e. there is a k > s such that
Vx > k. I(A)(x) = 1.
- ~ O-~A iff I(A) stabilizes eventually to O.

Thus, the semantics of the basic modalities is as anticipated at the beginning of


this section. In particular, we notice that in this interpretation the intuitionistic
nature of PLL is intimately tied up with delay behaviour: ~ A V --1A iff I(A) is
stable.
366

In analyzing the meaning of formulas it is helpful to realize that t < ~ implies


Is, t) ~ OM for any M, i.e. finite intervals validate any O-formula. This is
a consequence of the fact that from finite intervals there is always the empty
final subinterval reachable through Rm(I). Intuitively, a finite Leibniz interval
carries inconsistent stability information. It represents an intermediate phase of
the circuit's execution with a potential for glitches.
With this in mind we may unroll the semantics of some formulas to find that
we can express various stabilization behaviour: O(A V --A) says that the par-
ticular signal I(A) stabilizes eventually, O-,Ofalse comes out as the second-
order statement "eventually all signals are stable". We can express oscillation:
(A V ~A) D Ofalse is valid iff I(A) oscillates indefinitely, and a form of strong
termination: OA D -~B is valid iff whenever B switches to 1 all signals have
become stable for good and A rests at 0. This amounts to specifying a global
termination signal indicating a particular termination state.
It can be seen that if the circuit stabilizes completely at some time s, then both
[s, ∞) ⊨ A ∨ ¬A and [s, ∞) ⊨ ◯A ≡ A for all A. Thus, after stabilization, the
theory reduces to ordinary classical Boolean algebra, which is what we expected.
Aiming at a concrete application we might specify the falling output transition
of an inverter by the formula A ⊃ ◯¬B, "if I(A) becomes stable 1 for good then
eventually I(B) becomes stable 0 for good". Similarly, ¬A ⊃ ◯B would capture
the rising output transition. Given this axiomatization we might consider a ring
circuit consisting of an odd number of inverters. Then, if A represents any one of
the signals within the ring, our logic would derive the formula (A ⊃ ◯¬A) ∧ (¬A ⊃
◯A), which says nothing but that I(A) oscillates. In contrast, the classical theory
of the inverter ring would lead to A ≡ ¬A, which is plainly inconsistent.
One can show that the ◯-free fragment also allows us to specify nontrivial dy-
namic behaviour: it is possible to specify state and transition invariants, say that
two signals may never be 1 at the same time, or in a certain state never switch
at the same time. We do know that an upper bound on the expressiveness of the
◯-free fragment of Circuit-PLL is given by the regular languages closed under
subsequence and insensitive to a form of local reversal. As a consequence, the
◯-free fragment of Circuit-PLL is decidable. This fragment is equivalent to the
stable form of Maksimova's LH [10].
The full expressive power of the language, which ranges from describing a set of
states with transitions of arbitrarily high level down to no transitions at all, will
be reported on in a subsequent paper.

5 Conclusion and future work

The paper presented a novel intuitionistic modal logic, PLL, a conservative ex-
tension of the standard intuitionistic propositional calculus by a new modal op-
erator ◯ to capture the notion of 'correctness-up-to-constraints'. Algebraically,
the modality is a strong closure operator or, from a type-theoretic perspective,
a strong monad. The main result is that PLL has a natural class of two-frame
Kripke models for which it is sound and complete. This provides a satisfactory

model-theoretic account of the modality ◯ in an intuitionistic setting. On the
proof-theoretic side it is shown that PLL, despite being a modal logic, inherits
many of the properties of intuitionistic logic, viz. the deduction theorem, a simple
cut-free sequent calculus, and the disjunction property.

We have given a number of concrete types of models for PLL, one of them mo-
tivated by hardware verification. We interpret PLL over timing diagrams such
that ◯ expresses truth up to stabilization. In the resulting theory, Circuit-PLL,
one derives safe stabilization information even in the presence of glitches, where
standard classical reasoning is sound only under implicit stabilization assump-
tions. We show that this logic is able to express nontrivial stability behaviour.

The advantage of the framework we present here is that it provides a precise


definition of constraint correctness that permits more or less arbitrary instanti-
ation while enjoying an intriguing yet tractable meta-theory.

The full characterization of expressibility in Circuit-PLL and the existence of


finite or complete axiomatizations are left as open problems for future work. A
finite axiomatization would establish decidability of and ideally give rise to a
cut-free sequent calculus for full Circuit-PLL.

For circuits where delays do not invalidate functional correctness, such as syn-
chronous circuits, it is often necessary or advantageous to combine functional
and timing analysis so as to derive the 'exact' data-dependent delay of com-
binational circuitry. We anticipate using PLL to do this with standard proof
extraction techniques based on a concrete computational lambda calculus as
mentioned in the introductory section 1. In this context an automatic theorem
prover for PLL will be useful. However, it is not yet clear how such extraction
techniques could be integrated with automatic proof search based on cut-free
sequent calculus presentations of the logic. We are developing an implementa-
tion based on such an approach and one of our goals is to incorporate constraint
extraction.

6 Acknowledgements

Rod Burstall and Terry Stroup have had a decisive influence on the development
and presentation of this work. Michael Mendler is indebted to Rod for his en-
couragement and stimulating supervision of the author's Ph.D. research, on which
this work is built.
Thanks are also due to Roy Dyckhoff for his interest and for providing useful
references to the literature. The authors are grateful to Nick Benton for discus-
sions on the computational interpretation of PLL, and to Pierangelo Miglioli for
pointing out the connection with Maksimova's intermediate logic.
Michael Mendler is supported by a Human Capital and Mobility fellowship in
the EuroForm network.

References
1. N. Benton, G. Bierman, and V. de Paiva. Computational types from a logical per-
spective I. Draft Technical Report, Computer Laboratory University of Cambridge,
U.K., August 1993.
2. B. Chellas. Modal Logic. Cambridge University Press, 1980.
3. H. B. Curry. The elimination theorem when modality is present. Journal of Sym-
bolic Logic, 17:249-265, 1952.
4. H. B. Curry. A Theory of Formal Deducibility, volume 6 of Notre Dame Mathem-
atical Lectures. Notre Dame, Indiana, second edition, 1957.
5. M. Dummett. Elements of Intuitionism. Clarendon Press, Oxford, 1977.
6. W. B. Ewald. Intuitionistic tense and modal logic. Journal of Symbolic Logic, 51,
1986.
7. M. Fairtlough and M. Mendler. An intuitionistic modal logic with applications to
the formal verification of hardware. Technical Report ID-TR:1994-13, Department
of Computer Science, Technical University of Denmark, 1994.
8. G. Fischer-Servi. Semantics for a class of intuitionistic modal calculi. In
M. L. Dalla Chiara, editor, Italian Studies in the Philosophy of Science, pages
59-72. Reidel, 1980.
9. M. Fitting. Proof Methods for Modal and Intuitionistic Logics. Reidel, 1983.
10. L. L. Maksimova. On maximal intermediate logics with the disjunction property.
Studia Logica, 45:69-75, 1986.
11. M. Mendler. Constrained proofs: A logic for dealing with behavioural constraints
in formal hardware verification. In G. Jones and M. Sheeran, editors, Designing
Correct Circuits, pages 1-28. Springer, 1990.
12. M. Mendler. A Modal Logic for Handling Behavioural Constraints in Formal Hard-
ware Verification. PhD thesis, Edinburgh University, Department of Computer
Science, ECS-LFCS-93-255, 1993.
13. E. Moggi. Computational lambda-calculus and monads. In Proceedings LICS'89,
pages 14-23, June 1989.
14. G. Plotkin and C. Stirling. A framework for intuitionistic modal logics. In Theor-
etical aspects of reasoning about knowledge, pages 399-406, Monterey, 1986.
15. A. Simpson. The Proof Theory and Semantics of Intuitionistic Modal Logic. PhD
thesis, University of Edinburgh, Department of Computer Science, 1994.
16. A. S. Troelstra and D. van Dalen. Constructivism in Mathematics, volume II.
North-Holland, 1988.
Towards Machine-checked Compiler Correctness
for Higher-order Pure Functional Languages

David Lester and Sara Mintchev

Functional Programming Group,
Department of Computer Science, Manchester University,
Oxford Road, Manchester M13 9PL, UK.
{dlester, mintchev}@cs.man.ac.uk

Abstract. In this paper we show that the critical part of a correct-
ness proof for implementations of higher-order functional languages is
amenable to machine-assisted proof. An extended version of the lambda
calculus is considered, and the congruence between its direct and con-
tinuation semantics is proved. The proof has been constructed with the
help of a generic theorem prover, Isabelle.
The major part of the problem lies in establishing the existence of predi-
cates which describe the congruence. This has been solved using Milne's
inclusive predicate strategy [5]. The most important intermediate re-
sults and the main theorem as derived by Isabelle are quoted in the
paper.

Keywords: Compiler Correctness, Theorem Prover, Congruence Proof, Deno-
tational Semantics, Lambda Calculus

1 Introduction
Much of the work done previously in compiler correctness concerns restricted
subsets of imperative languages. Some studies involve machine-checked
correctness, e.g. Cohn [1], [2]. A lot of research has been devoted to the con-
struction of compiler-compilers, as in the work of Mosses [6], Paulson [10], and
Wand [20]. A recent attempt in this field is reported in [9].
Developing a proof of compiler correctness for a higher-order functional lan-
guage is made considerably more difficult by the need to use inclusive predicates
to relate an operational semantics (or a continuation semantics) to the direct
semantics. A complete proof of the correctness of a lazy functional language
compiler is presented in Lester [3, 4]; however, it has not been machine-checked.
Methods and important results have been published by Stoy [17, 18] and Wand
[19].
In order to present the problem in a relatively short paper, we have considered
a simplified form of the problem of compiler correctness, and its mechanized
proof. We hope subsequently to extend the work to a full compiler. Here we
discuss the use of machine-assisted proof in asserting the congruence between
two definitions of a fully-fledged language of lambda expressions. We use Isabelle

[12], a generic theorem prover written in Standard ML. It has a built-in
parser/pretty-printer generator and a type checker. The inference mechanism
uses higher-order resolution. Due to the flexibility of the theorem prover, various
object logics have been defined in Isabelle. In our proof we have used the logic
of computable functions (LCF) [11], a formalization of polymorphic predicate
lambda calculus (PPλ).
In the course of the proof construction, LCF has been extended with new
theories, each with its new types, constants and axioms. In the following sections
we present the Isabelle axioms needed for the formulation of the problem at hand.
Axioms are shown in the format:

axiom_name : axiom

The major lemmas that were needed for the proof of the final result are also
quoted in the format

goal theory_name : theorem

The usual denotational semantics notation is used (we have pretty-printed Is-
abelle's axioms and theorems using a Gofer 1 script to make them more readable).

1.1 The language and its denotational semantics

The language under consideration is the lambda calculus with constants and
two alternative calling mechanisms: call-by-name and call-by-value. A lambda
expression can be a constant (EConst), a variable (EVar), a function application
(EAp), a non-strict lambda abstraction (ELam) or a strict lambda abstraction
(ELamV). The induction axiom Exp_ind for the type of lambda expressions is
given below:

Exp_ind :
  [ ∀n. P (EConst n);
    ∀x y. P x ⇒ P y ⇒ P (EAp x y);
    ∀i. P (EVar i);
    ∀i e. P e ⇒ P (ELam i e);
    ∀i e. P e ⇒ P (ELamV i e) ]  ⇒  ∀x. P(x)

The domains used in the definition of the direct semantics of the language are
given below. Expression values are of two kinds: basic values (natural numbers
in our particular formulation) and function values, which are restricted to be
continuous functions. For convenience the semantic domain of environments is
also given a name: U. The domain E has been defined as a new type in Isabelle.

1 Gofer, a lazy pure functional language related to Haskell, was devised and imple-
mented by Mark Jones at Oxford.

Definition 1.1

        n ∈ B                    Basic values
    ε ∈ E = B + F                Expression values
    φ ∈ F = [E → E]              Function values
    ρ ∈ U = [Ide → E]            Environments

The domain injection (in) and projection (|) operations are defined in the
usual way. For the domain of functions, in particular:

    def'F'  : (φ in F) | F = φ
    def'B'  : (n in B) | F = ?
    def'Err : ? | F = ?
    def'UU  : ⊥ | F = ⊥

We wish to distinguish between two different ways in which a program can
fail. Firstly, it might not terminate; we will use ⊥ (as usual) to represent this
situation. Secondly, the program might terminate with an error (e.g. division by
zero). In this case we will say that the program returns ? (error).
We have supplied Isabelle with an axiom for case analysis of expression val-
ues:

E'cases : [ P ⊥; P ?; ∀n. P (n in B); ∀φ. P (φ in F) ]  ⇒  ∀x. P(x)

With these definitions in place, the standard semantic function ℰ' is defined
next:

Definition 1.2

    eval'EConst : ℰ'⟦EConst n⟧ ρ = n in B
    eval'EAp    : ℰ'⟦EAp e1 e2⟧ ρ = ((ℰ'⟦e1⟧ ρ) | F) (ℰ'⟦e2⟧ ρ)
    eval'EVar   : ℰ'⟦EVar i⟧ ρ = ρ i
    eval'ELam   : ℰ'⟦ELam i e⟧ ρ = (λε. ℰ'⟦e⟧ (ρ[i ↦ ε])) in F
    eval'ELamV  : ℰ'⟦ELamV i e⟧ ρ = (strict (λε. ℰ'⟦e⟧ (ρ[i ↦ ε]))) in F

The only point of interest is that the semantics for the call-by-value case
insists on evaluating its argument before evaluating the function. This is specified
by the use of the function strict, which is defined by the following axioms 2:

    strict_UU : [ x = ⊥ ∨ x = ? ]      ⇒  strict φ x = x
    strict_z  : [ ¬(x = ⊥ ∨ x = ?) ]   ⇒  strict φ x = φ x
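
For readers who prefer running code, here is a small Haskell rendering (ours, not the paper's Isabelle formalization) of the expression type and the direct semantics of Definition 1.2; the names Exp, Value and eval', and the treatment of errors through Haskell's own partiality rather than a separate ? element, are simplifying assumptions.

    type Ide = String

    data Exp = EConst Int        -- constant
             | EVar   Ide        -- variable
             | EAp    Exp Exp    -- application
             | ELam   Ide Exp    -- call-by-name abstraction
             | ELamV  Ide Exp    -- call-by-value (strict) abstraction

    data Value = B Int | F (Value -> Value)   -- E = B + F

    type Env = Ide -> Value

    eval' :: Exp -> Env -> Value
    eval' (EConst n)  _ = B n
    eval' (EVar i)    r = r i
    eval' (EAp e1 e2) r = case eval' e1 r of
                            F f -> f (eval' e2 r)    -- projection to F
                            B _ -> error "applied a basic value"
    eval' (ELam i e)  r = F (\v -> eval' e (extend r i v))
    eval' (ELamV i e) r = F (strict (\v -> eval' e (extend r i v)))

    -- strict forces its argument before applying the function.
    strict :: (Value -> Value) -> Value -> Value
    strict f v = v `seq` f v

    extend :: Env -> Ide -> Value -> Env
    extend r i v j = if i == j then v else r j

With these definitions, applying ELam "x" (EConst 42) to a diverging argument returns B 42, whereas the same application with ELamV loops, which is exactly the distinction the two abstraction forms are meant to capture.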

Next we define a different semantics of the language of lambda-expressions.


Note (for SML readers) that this is not the standard definition of SML strictness.
372

1.2 The continuation semantics

Since the continuation semantics captures some of the details of an operational
semantics (in particular, it fixes the order of evaluation of subexpressions), it is
not too surprising that its semantic domains are more complicated. The domain
of expression values, E, has been introduced as a new type in Isabelle.

Definition 1.3

    w, z ∈ W = K → E            Closures
       k ∈ K = E → E            Continuations
       ε ∈ E = B + F            Expression values
       φ ∈ F = [W → W]          Function values
       ρ ∈ U = [Ide → W]        Environments

An axiom for case analysis can be formulated for the domain E of expression
values:

E_cases : [ P ⊥; P ?; ∀n. P (n in B); ∀φ. P (φ in F) ]  ⇒  ∀x. P(x)


The continuation semantics function ℰ comes next. Unlike the direct se-
mantics definition, the definition of ℰ does not refer to the function strict: the
call-by-value mechanism is described directly in terms of continuations.

Definition 1.4

    ℰ :: Exp → U → W
    evalEConst : ℰ⟦EConst n⟧ ρ = (λk. k (n in B))
    evalEAp    : ℰ⟦EAp e1 e2⟧ ρ = (λk. ℰ⟦e1⟧ ρ (λε. (ε | F) (ℰ⟦e2⟧ ρ) k))
    evalEVar   : ℰ⟦EVar i⟧ ρ = (λk. ρ i k)
    evalELam   : ℰ⟦ELam i e⟧ ρ = (λk. k ((λw. ℰ⟦e⟧ (ρ[i ↦ w])) in F))
    evalELamV  : ℰ⟦ELamV i e⟧ ρ = (λk. k ((λw. λk1.
                      w (λε. ℰ⟦e⟧ (ρ[i ↦ (λk2. k2 ε)]) k1)) in F))

The reader can observe a property of the semantics of lambda expressions which
are in weak head normal form (EConst, ELam, ELamV). The expression clo-
sure w corresponding to such an expression is always of the form (λk. k ε), where
the expression value ε is returned to the continuation k. This property will be
used later in the congruence proof.
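
The continuation semantics of Definition 1.4 can be rendered in the same style; the sketch below (ours) reuses Exp and Ide from the previous fragment, and CVal, Closure and evalC are illustrative names.

    data CVal    = CB Int | CF (Closure -> Closure)   -- E = B + F
    type Cont    = CVal -> CVal                        -- K = E -> E
    type Closure = Cont -> CVal                        -- W = K -> E
    type CEnv    = Ide -> Closure

    evalC :: Exp -> CEnv -> Closure
    evalC (EConst n)  _ = \k -> k (CB n)
    evalC (EVar i)    r = \k -> r i k
    evalC (EAp e1 e2) r = \k -> evalC e1 r (\e -> case e of
                              CF f -> f (evalC e2 r) k
                              CB _ -> error "applied a basic value")
    evalC (ELam i e)  r = \k -> k (CF (\w -> evalC e (extendC r i w)))
    evalC (ELamV i e) r = \k -> k (CF (\w k1 ->
                              w (\e' -> evalC e (extendC r i (\k2 -> k2 e')) k1)))

    extendC :: CEnv -> Ide -> Closure -> CEnv
    extendC r i w j = if i == j then w else r j

    -- A closed program is run by supplying the identity continuation.
    runC :: Exp -> CVal
    runC e = evalC e (\_ -> error "unbound") id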

2 A formulation of the congruence

W h a t we would like to prove is that the two semantic definitions are congruent
in a certain sense. To this end we define predicates (Definition 2.1) to compare
two values, one from each semantics.

As the reader may have noticed already, we use Stoy's diacritical notation
[15] to distinguish between objects belonging to the two semantics. We use an
acute accent (´) to represent an object from the direct semantics and a grave accent
(`) to represent an object from the continuation semantics.
Note that in Definition 2.1, as well as in the rest of this paper, we have not
restricted ourselves to the usual LCF notation. For example, we have not used
the conditional operator, which is predefined in Isabelle's LCF theory. Instead,
for convenience, we have preferred a style close to that of a functional language
with pattern-matching.
The predicate e compares two expression-valued objects. As we can see, the
bottom, error, and basic value cases are straightforward. The e predicate uses the
f predicate to compare two functions. The predicate f compares two functions
by extensional equality: if two functions, when applied to congruent arguments,
give congruent results, then they are congruent. The ep predicate relates two
environments by comparing the two values corresponding to each identifier.

Definition 2.1

    e_B   : e(n in B, w) ⟺ w = (λk. k (n in B))
    e_F   : e(φ́ in F, w) ⟺ w = (λk. k (w (λk. k))) ∧ f(φ́, (w (λk. k)) | F)
    e_UU  : e(⊥, w) ⟺ w = ⊥
    e_Err : e(?, w) ⟺ w = ?

    f_def : f(φ́, φ̀) ⟺ (∀έ z. e(έ, z) ⇒ e(φ́ έ, φ̀ z))

    p_def : ep(ρ́, ρ̀) ⟺ (∀I. e(ρ́ I, ρ̀ I))
Unfortunately, however, the equations in Definition 2.1 do not actually define
the predicates! e and f are mutually recursive, reflecting the fact that the do-
mains E and F are reflexive. Because of the use of implication in f_def, the
existence of a solution is not guaranteed by monotonicity. So we must find a way
of demonstrating that such predicates exist.
The whole proof has been constructed within several Isabelle theories (all
built on top of a theory of the natural numbers and the LCF theory of Isabelle):
retracts_thy, predicates_thy, exist_thy, and congr_thy. A separate (sub)section
is devoted to each of these in the rest of the paper.

3 The Existence of the Predicates

The importance of the problem of existence of predicates was demonstrated


by Mulmuley [7]. He provided examples of plausible recursive definitions which
do not always have solutions, i.e. do not define predicates. Furthermore, as he
pointed out, there isn't a rich enough language having only valid predicates as
sentences. Thus existence must be proved for every recursive predicate definition.
Reynolds [14] tackles the problem of existence by constructing explicitly
the recursive domains, and defining generalized directed-complete relations on

them. Stoy [16] applies the more straightforward inclusive predicate strategy of
Milne to solving a similar problem. He uses retracts in building the domains, and
constructs the particular predicates iteratively. Reynolds's technique is system-
atic, and thus suitable for mechanization, however its applicability is restricted
to relating 'similar' domains. Milne's technique is general, but rather ad-hoc,
and thus harder to mechanize.
Mulmuley [7] proposed a systematic technique for proving the existence of
predicates, and implemented it as an extension to LCF. Central to his technique
is an algorithm which reduces the problem of existence to a set of sufficient (but
not necessary) goals. In practice, the goals produced are weak and can be proved
within LCF, very often automatically.
We have followed Stoy's approach. Not surprisingly, the equations from Def-
inition 2.1 differ from those of Reynolds.

3.1 Retracts

A retract A over a semantic domain A is a continuous function which is idem-
potent (composing it with itself produces the same function):

    A :: A → A;    A = A ∘ A

A retract can be constructed automatically from a domain definition by means
of a set of retract operators [15], corresponding to domain constructors.
Definition 3.1 shows how sequences of retracts can be constructed over the
domains of Definitions 1.1 and 1.3.
Definition 3.1

    rE'0    : É_0 ε = ⊥
    rE'B    : É_{n+1} (c in B) = c in B
    rE'F    : É_{n+1} (φ in F) = (F́_{n+1} φ) in F
    rE'Err  : É_{n+1} ? = ?
    rE'UU   : É_{n+1} ⊥ = ⊥

    rF'0    : F́_0 φ = ⊥
    rF'succ : F́_{n+1} φ = É_n ∘ φ ∘ É_n

    rW_def  : Ẁ_n w = È_n ∘ w ∘ (λg. È_n ∘ g ∘ È_n)

    rE_0    : È_0 ε = ⊥
    rE_B    : È_{n+1} (c in B) = c in B
    rE_F    : È_{n+1} (φ in F) = (F̀_{n+1} φ) in F
    rE_Err  : È_{n+1} ? = ?
    rE_UU   : È_{n+1} ⊥ = ⊥

    rF_0    : F̀_0 φ = ⊥
    rF_succ : F̀_{n+1} φ = Ẁ_n ∘ φ ∘ Ẁ_n

The range of each retract is a subset of the appropriate domain. A sequence of
retracts gives rise to a sequence of improving 'approximations' of the domain.
For example, the retract sequences É_n and F́_n give rise to:

    É_0 = {⊥}                      Expression values
    É_{n+1} = B + F́_{n+1}
    F́_0 = {⊥}                      Function values
    F́_{n+1} = [É_n → É_n]

The limit of a retract sequence is in fact the identity function on the appropriate
domain 3.
The basic idea of the proof is to define a sequence of predicates e_n and f_n (of
the form of Definition 2.1) on the sequence of domain approximations induced by
the retracts É_n, F́_n, È_n and F̀_n. No circularity will be involved in the definitions
of e_n and f_n, so they will necessarily be well defined. Then predicates e and
f will be defined in terms of e_n and f_n in a standard way. Thus the problem
of existence will amount to proving that the newly defined e and f satisfy the
equations of Definition 2.1.
We have used Isabelle to prove the properties of retracts from Definition 3.1:
i) idempotency and ii) the fact that the n-th retract is 'weaker' than the (n+1)-st
one. Each property is proved by a straightforward numerical induction on n.
Lemma 3.2

goal retracts_thy :
    ∀n. (∀ε. É_n (É_n ε) = É_n ε) ∧ (∀φ. F́_n (F́_n φ) = F́_n φ)

goal retracts_thy :
    ∀n. (∀φ. F́_n (F́_{n+1} φ) = F́_n φ) ∧
        (∀ε. É_n (É_{n+1} ε) = É_n ε) ∧
        (∀φ. F́_{n+1} (F́_n φ) = F́_n φ) ∧
        (∀ε. É_{n+1} (É_n ε) = É_n ε)
Note that all the conjuncts in each goal are proved simultaneously, in one in-
duction step. This of course reflects the mutually recursive nature of retract
definitions 3.1. A further induction is needed for generalizing these properties.
As with most other theorems, the base cases of the induction are solved auto-
matically by Isabelle's simplifier. The proof of each of the lemmas below required
10 tactics to be applied.
Lemma 3.3

goal retracts_thy : ∀m n ε. É_{m+n} (É_n ε) = É_n ε

goal retracts_thy : ∀m n ε. É_n (É_{m+n} ε) = É_n ε

3 Continuity guarantees that such a limit indeed exists.
376

Finally, some simplification rules for retracts turn out to be useful in later proofs:

goal retracts_thy : É_n (F́_{n+1} φ ε) = F́_{n+1} φ ε

goal retracts_thy : F́_{n+1} φ (É_n ε) = F́_{n+1} φ ε

Theorems analogous to those above have been proved for Ẁ, È and F̀. Once
proven, all theorems are included in the set of Isabelle's rewrite rules to be used
by the simplifier for automating subsequent stages of the proof.
To summarize: retracts provide a convenient method of building up semantic
domains iteratively. Retracts can be constructed mechanically from (reflexive)
domain definitions, and their properties can be guaranteed. Thus automating
retract construction is not a problem.
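
As an informal illustration (ours, not the paper's), the same pattern can be seen on a much simpler 'domain': truncating a lazy Haskell list at depth n is an idempotent map, each truncation is weaker than the next, and the limit is the identity, mirroring the properties proved for the retract sequences in Lemmas 3.2 and 3.3.

    -- trunc n plays the role of the n-th retract; [] stands in for the undefined tail.
    trunc :: Int -> [a] -> [a]
    trunc 0 _        = []
    trunc _ []       = []
    trunc n (x : xs) = x : trunc (n - 1) xs

    -- Idempotency and the 'weaker than' properties, checkable on finite data:
    checkRetract :: Int -> [Int] -> Bool
    checkRetract n xs =
      trunc n (trunc n xs)       == trunc n xs &&     -- idempotency
      trunc n (trunc (n + 1) xs) == trunc n xs &&     -- n-th is weaker than (n+1)-st
      trunc (n + 1) (trunc n xs) == trunc n xs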

3.2 The Iterative Predicates

Having built the theory of retracts, we can turn our attention back to the predi-
cates e and f. We give a standard formulation of these predicates (Definition 3.4),
in which e and f are defined in terms of two sequences of predicates, e_n and f_n.

Definition 3.4

    eaw_def : e(έ, w) ⟺ (∀n. e_n(É_n έ, Ẁ_n w))

    eaf_def : f(φ́, φ̀) ⟺ (∀n. f_n(F́_n φ́, F̀_n φ̀))

The question is: what should e_n and f_n look like? We want eventually to pro-
vide a proof that the predicates from Definition 3.4 satisfy the equations from
Definition 2.1, so e_n and f_n must be of the general shape of those equations.
Furthermore, such a proof will require a general condition which relates any
predicate e_n with e by applying the n-th retracts to the arguments of e. The
condition corresponds to the statement of monotonicity of a predicator in [7]; in
our particular case it is the statement established below as Theorem 3.7.
This condition would follow by induction from a couple of simpler state-
ments which relate a predicate with its successor:

    (∀έ w. e_n(É_n έ, Ẁ_n w) ⇒ e_{n+1}(É_n έ, Ẁ_n w)) ∧
    (∀έ w. e_{n+1}(É_{n+1} έ, Ẁ_{n+1} w) ⇒ e_n(É_n έ, Ẁ_n w))

The above statement has guided our search for an appropriate definition of e_n
and f_n. After some thinking and several abortive attempts, we have come up
with Definition 3.5, which complies with the above statement:

Definition 3.5

    eqf_0    : f_0(φ́, φ̀) ⟺ True
    eqf_succ : f_{n+1}(φ́, φ̀) ⟺ (∀έ w. e_n(έ, w) ⇒ e_n(φ́ έ, φ̀ w))

    eqw_0    : e_0(έ, w) ⟺ True
    eqw_B    : e_{m+1}(n in B, w) ⟺ w = (λk. k (n in B))
    eqw_F    : e_{m+1}(φ́ in F, w)
               ⟺ w = (λk. k (w (λk. k))) ∧ f_{m+1}(φ́, (w (λk. k)) | F)
    eqw_UU   : e_{m+1}(⊥, w) ⟺ w = ⊥
    eqw_Err  : e_{m+1}(?, w) ⟺ w = ?

The considerations which prompted the exact form of Definition 3.5 are fully
stated in Lemma 3.6, which is proved by induction on n. Because of the mutual
dependency of our predicates, all conjuncts in the lemma are required if the
induction is to go through.

Lemma 3.6

goal predicates_thy : ∀n.
    (∀φ́ φ̀. f_n(φ́, φ̀) ⇒ f_{n+1}(F́_n φ́, F̀_n φ̀)) ∧
    (∀φ́ φ̀. f_{n+1}(φ́, φ̀) ⇒ f_n(F́_n φ́, F̀_n φ̀)) ∧
    (∀έ w. e_n(É_n έ, Ẁ_n w) ⇒ e_{n+1}(É_{n+1} έ, Ẁ_{n+1} w)) ∧
    (∀έ w. e_{n+1}(É_{n+1} έ, Ẁ_{n+1} w) ⇒ e_n(É_n έ, Ẁ_n w))
The proof uses Lemmas 3.2 and 3.3. For convenience, the proof of this theorem
was preceded by separate proofs of the four conjuncts of the induction step. Each
conjunct required the use of approximately twenty tactics.
A generalization of the above properties is easily derived by another couple
of inductions:

goal predicates_thy : ∀m n έ w. e_n(É_n έ, Ẁ_n w) ⇒
                          e_{m+n}(É_{m+n}(É_n έ), Ẁ_{m+n}(Ẁ_n w))

goal predicates_thy : ∀m n έ w. e_{m+n}(É_{m+n} έ, Ẁ_{m+n} w) ⇒
                          e_n(É_n(É_{m+n} έ), Ẁ_n(Ẁ_{m+n} w))
These two lemmas are summarized in the following theorem. This is the only
result from Section 3.2 used in subsequent proofs.

Theorem 3.7

goal predicates_thy : ∀n έ w. e_n(É_n έ, Ẁ_n w) ⟺
                          (∀m. e_m(É_m(É_n έ), Ẁ_m(Ẁ_n w)))

3.3 Iterative predicates satisfy equations
What remains to be done is to prove that the well defined predicates of Defini-
tion 3.4 actually satisfy our original equations from Definition 2.1. Correspond-
ing to the axioms for e in 2.1 are the following theorems. They are proved by
folding/unfolding the definitions of predicates (2.1 and 3.4) and retracts (3.1),
as well as by some simple manipulation of indices.
Theorem 3.8

goal exist_thy : e(φ́ in F, w) ⟺
                   w = (λk. k (w (λk. k))) ∧ f(φ́, (w (λk. k)) | F)

goal exist_thy : e(n in B, w) ⟺ w = (λk. k (n in B))

goal exist_thy : e(⊥, w) ⟺ (∀m. Ẁ_{m+1} w = ⊥)

goal exist_thy : e(?, w) ⟺ (∀m. Ẁ_{m+1} w = ?)

The theorem which mirrors f_def from 2.1 presents a somewhat harder problem;
this is where Theorem 3.7 is used. The two implications of the equivalence (⟺)
are proved separately.

Theorem 3.9

goal exist_thy : (∀έ z. e(έ, z) ⇒ e(φ́ έ, φ̀ z)) ⇒ f(φ́, φ̀)

goal exist_thy : f(φ́, φ̀) ⇒ (∀έ z. e(έ, z) ⇒
                   (∀m. e(F́_{m+1} φ́ έ, F̀_{m+1} φ̀ z)))
The second part of Theorem 3.9 is still not what we want. Instead of congruence
for all possible retract pairs (F́_{m+1}, F̀_{m+1}) in e(F́_{m+1} φ́ έ, F̀_{m+1} φ̀ z),
we would like congruence for the limit retracts, which we will denote by (F́_∞, F̀_∞).
Achieving that involves applying fixpoint induction (predefined in Isabelle's LCF
theory) to the second part of Theorem 3.9 in order to derive:

goal exist_thy : ∀φ́ φ̀ έ z. (∞ = FIX (+1)) ⇒
                   (∀m. e(F́_{m+1} φ́ έ, F̀_{m+1} φ̀ z)) ⇒ e(F́_∞ φ́ έ, F̀_∞ φ̀ z)
where (+1) is the natural-number successor function and FIX is the fixpoint
operator. The proof of the above proposition requires that our iteratively defined
predicates be inclusive. The inclusivity is proved using Isabelle's own axioms
and the following lemma, in which isB' (isF') is a predicate which is true iff its
argument is a basic (function) expression:

goal exist_thy : ∀ε. P ε ⟺
                   (ε = ⊥ ∧ P ⊥) ∨ (ε = ? ∧ P ?) ∨
                   (isB' ε ∧ P ((ε | B) in B)) ∨ (isF' ε ∧ P ((ε | F) in F))

4 The congruence

Having established the existence of predicates satisfying 2.1, we turn to the
original problem, i.e. the proof of the congruence of the two semantic definitions
(1.2 and 1.4) of the lambda expression language. This turns out to be a fairly
simple task in comparison with the problems from the preceding sections. Due
to the similar nature of the two semantics, the congruence can be proved in one
go by structural induction on lambda expressions. However, a couple of lemmas
must be proved in advance. The first one concerns function application: if two
expression values έ1, έ2 ∈ É are congruent with closures w1, w2 ∈ Ẁ respectively,
then the application of the function (έ1 | F) to έ2 is congruent with the
continuation-semantics application of w1 to w2. More accurately:

goal congr_thy : ∀έ1 έ2 w1 w2. e(έ1, w1) ⇒ e(έ2, w2) ⇒
                   e((έ1 | F) έ2, (λk. w1 (λε. (ε | F) w2 k)))
The second lemma concerns the relationship between environments. If two con-
gruent environments ρ́ and ρ̀ are augmented with a mapping of an identifier I
to the congruent expressions έ and w respectively, then the augmented envi-
ronments are still congruent. In formal terms:

goal congr_thy : ∀n έ w ρ́ ρ̀. ep(ρ́, ρ̀) ⇒ e(έ, w) ⇒ ep(ρ́[n ↦ έ], ρ̀[n ↦ w])

where ρ[n ↦ ε] is the environment ρ augmented with a mapping of the
identifier n to the expression ε.
At last, we are able to prove the main theorem: the direct and continuation
semantics of a lambda expression are congruent, provided that the environments
of the two semantics are congruent.

Theorem 4.1

goal congr_thy : ∀E ρ́ ρ̀. ep(ρ́, ρ̀) ⇒ e(ℰ'⟦E⟧ ρ́, ℰ⟦E⟧ ρ̀)

The theorem is proved by induction on lambda expressions (the axiom Exp_ind


from Section 1.1). The base cases (constants and variables) of the induction
require the use of 3-4 tactics; the hardest case (strict lambda abstractions)
requires approximately 70 tactics, but probably a better solution can be found.

5 Conclusion and Further Work

We have outlined the solution of a non-trivial problem involving the semantics of


a higher-order language. Our aim has been to construct a proof for the problem
with the help of a theorem prover, rather than to fine-tune a theorem prover to
solve the problem 'automatically'. We have not tried to build new tactics and
tacticals; instead, we have constructed the proof using readily available Isabelle

tactics. Part of the work has been done automatically by Isabelle's simplifier, as
well as by the automatic tactics for classical first-order logic, but this can be
viewed as 'small-scale' automation.
The experience we have gained so far suggests that proof construction should
be viewed as an activity closely related to 'programming'. Hence, techniques and
approaches characteristic of a good programming style ought to be applied in
theorem proving as well. To name just a few, modularity, good structure and
independent levels of abstraction are essential.
As a result of our mechanization we were able to correct an error in the proof
of [4, Lemmas 3.18 and 3.19].
We intend to experiment with other methods of proving the existence of
predicates on reflexive domains. Pitts [13] has proposed a method which is easier
to apply than the usual inclusive predicate strategy. The essence of the method
is to define simultaneously two versions of the predicate - one with positive and
one with negative occurrences only - and to prove the two versions equal using
fixpoint induction.
We also intend to explore the correctness of a full compiler for a lazy func-
tional language, in the style of [8]. For this we will need to deal with the following
points:

- Using an operational model of the implementation (e.g. Plotkin's structural


operational semantics, or a concrete abstract machine) in place of a contin-
uation semantics.
- Lazy data structures and in particular the problem of infinite streams of
output.
- Graph reduction, as opposed to the tree reduction we used in this paper.
- Code generation [19] from an operational model of graph reduction.

As experience reported in [3, 4] suggests, all of the above are easy to do, once
the main congruence result has been proved.

References

1. A. Cohn. The equivalence of two semantic definitions: a case study in LCF. Tech-
nical Report CSR-76-81, Department of Computer Science, Edinburgh University,
January 1981.
2. P. Curzon. Deriving correctness properties of compiled code. Formal Methods in
System Design, 3(1/2):83-115, August 1993.
3. D.R. Lester. The G-machine as a representation of stack semantics. In G. Kahn,
editor, Proceedings of the Functional Programming Languages and Computer Ar-
chitecture Conference, pages 46-59. Springer-Verlag LNCS 274, September 1987.
4. D.R. Lester. Combinator Graph Reduction: A Congruence and its Applications.
Dphil thesis, Oxford University, 1988. Also published as Technical Monograph
PRG-73.
5. R.E. Milne. The Formal Semantics of Computer Languages and Their Implemen-
tation. PhD thesis, University of Cambridge, 1974.

6. P.D. Mosses. SIS - semantics implementation system. Technical Report DAIMI


MD-30, Computer Science Department, Aarhus University, 1979.
7. K. Mulmuley. Full Abstraction and Semantic Equivalence. MIT Press, Cambridge,
Massachusetts, 1987. ACM Doctoral Dissertation Award 1986.
8. F. Nielson and H.R. Nielson. Two-level Functional Languages. Number 34 in Cam-
bridge Tracts in Theoretical Computer Science. Cambridge University Press, 1992.
9. J. Palsberg. A Provably Correct Compiler Generator. PhD thesis, Computer Sci-
ence Department, Aarhus University, January 1992. Also published as Technical
Report DAIMI PB - 382.
10. L.C. Paulson. A semantics-directed compiler generator. In Ninth Symposium on
Principles of Programming Languages, pages 224-233, 1982.
11. L.C. Paulson. Logic and Computation: Interactive proof with Cambridge LCF.
Cambridge University Press, 1987.
12. L.C. Paulson. Introduction to Isabelle. Technical report, Computer Laboratory,
University of Cambridge, 1992.
13. A.M. Pitts. Relational properties of recursively defined domains. In Proc. 8th
Annual Symposium on Logic in Computer Science, pages 86-97, Washington, 1993.
IEEE Computer Soc. Press.
14. J.C. Reynolds. On the relation between direct and continuation semantics. In
Proceedings of the Second Colloquium on Automata, Languages and Programming,
pages 141-156, Saarbrucken, 1974. Springer-Verlag.
15. J.E. Stoy. Denotational Semantics: The Scott-Strachey Approach to Programming
Language Theory. The MIT Press Series in Computer Science. MIT Press, Cam-
bridge, Massachusetts, 1977.
16. J.E. Stoy. The congruence of two programming language definitions. Theoretical
Computer Science, 13(2):151-174, February 1981.
17. J.E. Stoy. Semantic models. In M. Broy and G. Schmidt, editors, Theoretical
Foundations of Programming Methodology. Lecture notes of an International Sum-
mer School, directed by F.L. Bauer, E. W. Dijkstra and C.A.R. Hoare, pages 293-
324, Boston, Massachusetts, 1982. NATO Advanced Study Institute Series, C91,
D. Reidel Publishing Co.
18. J.E. Stoy. Some mathematical aspects of functional programming. In
J. Darlington, P. Henderson, and D.A. Turner, editors, Functional Programming
and its Applications: An Advanced Course, pages 217-252. Cambridge University
Press, Cambridge, England, 1982.
19. M. Wand. Deriving target code as a representation of continuation semantics.
ACM Transactions on Programming Languages and Systems, 4(3):496-517, July
1982.
20. M. Wand. A semantic prototyping system. In Proceedings of the ACM SIG-
PLAN'84 Symposium on Compiler Construction, pages 213-221, 1984.
Powerdomains, Powerstructures and Fairness

Yiannis N. Moschovakis 1 and Glen T. Whitney 2.

1 Dept. of Math, UCLA, Los Angeles, CA 90024; ynm@math.ucla.edu


Dept. of Math, University of Michigan, Ann Arbor, MI 48109; gwhitney@umich.edu

Abstract. We introduce the framework of powerstructures for compar-
ing models of non-determinism and concurrency, and we show that in
this context the Plotkin powerdomain plot(D) [6] naturally occurs as
a quotient of a refined and generalized player model ipf(D), following
Moschovakis [2, 3]. On the other hand, Plotkin's domains for countable
non-determinism plot_ω(D) [7] are not comparable with these structures,
as they cannot be realized concretely in the powerset of D.

If, as usual, we let the programs of a deterministic programming language L
denote points in some directed-complete poset (dcpo) D, then programs in non-
deterministic extensions of L should naturally correspond to non-empty subsets
of D, members of the set of players 3

    II = II(D) =df {x ⊆ D | x ≠ ∅}.                                    (1)


This idea immediately encounters a problem with non-deterministic recursive
definitions. In the deterministic case, the open terms of L (its program trans-
formations) denote (Scott) continuous functions on D. Their least fixed points
(which exist precisely because D is a dcpo) provide a means of interpreting recur-
sion. On II(D), which does not carry a natural, complete partial ordering, how
are we to interpret non-deterministic program transformations so that they still
have "canonical" fixed points? No known semantics solves this basic problem in
the modeling of non-determinism in an entirely satisfactory way.
For a concrete example, let Str be the dcpo of integer streams, where (follow-
ing [5]) a stream is a finite or infinite sequence, or a finite sequence of the form
a1 a2 ... an t, where the terminator t is some fixed non-integer witnessing "termi-
nation." Now II(Str) is the set of non-deterministic integer streams, and many
of the usual, non-deterministic constructs are naturally interpreted by functions
on II(Str) as follows:

    x or y =df x ∪ y                                                   (2)
    merge(x, y) =df {μ[α, β] | α ∈ x, β ∈ y, μ: N → {0, 1}}            (3)
    fairmerge(x, y) =df {μ[α, β] | α ∈ x, β ∈ y, μ a fair merger}.     (4)
* During the preparation of this paper, Moschovakis was partially supported by a NSF
Grant and Whitney was supported by a Fellowship from the Fanny and John Hertz
Foundation.
3 The term derives from the original construction in Moschovakis [2, 3] which was cast
in game theoretic terms, for a specific domain D of partial strategies.

Here or stands for free, binary choice, μ[α, β] stands for "interleaving α and β
by the merger μ" in the obvious way, and a (strict) fair merger (following Park
[5]) is any sequence of 0's and 1's which is not ultimately constant. Note that as
operations on players, these merges remain distinct. If D has further structure,
then additional operations of this sort can be defined, such as state-dependent
fair merges, see [3].
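
A merger sequence and the interleaving μ[α, β] of equations (3) and (4) are easy to render executably; in the Haskell sketch below (ours), finite lists stand in for terminated streams, the names are illustrative, and an exhausted stream simply contributes nothing further.

    interleaveBy :: [Int] -> [a] -> [a] -> [a]
    interleaveBy _        []       bs       = bs
    interleaveBy _        as       []       = as
    interleaveBy []       as       bs       = as ++ bs
    interleaveBy (m : mu) (a : as) (b : bs)
      | m == 0    = a : interleaveBy mu as (b : bs)
      | otherwise = b : interleaveBy mu (a : as) bs

    -- merge ranges over all merger sequences, fairmerge only over those that
    -- are not ultimately constant; cycle [0,1] is one such fair merger.
    -- interleaveBy (cycle [0,1]) [1,2,3] [10,20,30] == [1,10,2,20,3,30]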
To model non-deterministic recursion within domain theory, we must em-
bed II(D) in some powerdomain D*, and not totally arbitrarily. For example,
D is embedded in II(D) by the natural map d ↦ {d}, and we would want
to have "liftups" of the continuous functions in (D → D) to continuous func-
tions in (D* → D*) which respect composition, yield the correct least fixed
points, etc. Even such simple requirements seem to force undesirable conse-
quences about D*, however. Consider the first and most interesting powerdo-
main construction plot(D) of Plotkin [6] (see also Smyth [8]) as an illustrative
example. plot(D) does not faithfully model fairness because it identifies sets in
II(D) which are equivalent under 4 the "observational Egli-Milner equivalence re-
lation" ∼em. This collapses the merge and fairmerge operations on II(Str), e.g.,
fairmerge(a^∞, b^∞) ∼em merge(a^∞, b^∞), where a^∞ is the infinite string of 'a's.
In addition, the equivalence relation ∼em identifies certain unguarded recursions
with similar, but intuitively distinct, guarded recursions, e.g., see Smyth [8].
To circumvent these imperfections of the powerdomain constructions, Mos-
chovakis [2, 3] introduced (over some specific domains D) a model ipf(D) for
non-determinism and concurrency in which programs are interpreted by ar-
bitrary players, and program transformations are modeled by implemented
player functions (ipfs) on II(D). These ipfs encode more than their values on
players: there exist distinct ipfs f and g such that f(x) = g(x) for all x ∈ II(D).
The extra, intensional information carried by an ipf makes it possible to as-
sign "canonical solutions" to systems of recursive equations, so that the laws of
recursion are obeyed; we will make this precise further on. The or, merge, and
fairmerge operations introduced above are naturally modeled by certain ipfs
(and incidentally "unnaturally" modeled by others, distinct from but extension-
ally equal to the natural ones).
Our principal aim here is to show that (with modest hypotheses on D) the
Plotkin powerdomain plot(D) can be recovered in a natural way from ipf(D),
while the countable powerdomains plot_ω(D) appear to represent a fundamen-
tally different modeling of fairness. For this, we will also introduce a refined
construction of ipf(D) (for any D) and establish precise properties of ipf(D)
4 The terminology for various pre-orders on II(D) is not entirely standardized. In this
paper, we will use the lower preorder (x ⊑_l y if for all a ∈ x, there exists b ∈ y
such that a ≤ b) and the upper preorder (x ⊑_u y if for all b ∈ y, there exists
a ∈ x such that a ≤ b). The usual "Egli-Milner preorder" is the conjunction of
these two. However, as outlined in Smyth [8], the easiest construction of the Plotkin
powerdomain for countably algebraic D is in terms of the "observational Egli-Milner"
preorder, defined as x ≲ y if for all finite sets A of finite elements, A ⊑_l x implies
A ⊑_l y and A ⊑_u x implies A ⊑_u y. Each preorder induces an equivalence relation;
for example x ∼em y if x ≲ y and y ≲ x.

which make it a suitable structure for modeling non-deterministic programs and


program transformations. We will rely heavily on an axiomatization of the "stan-
dard" laws of recursive equations, and on a somewhat novel approach to the de-
velopment of intensional semantics for formal languages, which has applications
beyond its present use. These ideas are described in Section 1.

1 The main notions

For each vocabulary (signature) τ, i.e., set of function symbols with associated
non-negative arities, the expressions of the language FLR0(τ) are given by

    E ::= x | f(E1, ..., En) | E0 where {x1 = E1, ..., xn = En},

where x is a variable (from some fixed, infinite set of variables) and f ∈ τ.
Intuitively, FLR0 has notation just for function application and for solution of
simultaneous recursion equations. The where operator binds the variables x1
through xn; all other variable occurrences are free. A closed expression is one
containing no free variables, e.g., f(g()) or x where {x = f(x)} if f is unary and
g is nullary.
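
As an aside, the abstract syntax has a direct rendering as a Haskell datatype; the sketch below (ours) also computes free variables, with the where operator binding exactly its recursion variables. The constructor names are illustrative.

    type Var    = String
    type FunSym = String

    data FLR0
      = V Var                       -- a variable x
      | App FunSym [FLR0]           -- f(E1, ..., En)
      | Where FLR0 [(Var, FLR0)]    -- E0 where {x1 = E1, ..., xn = En}

    freeVars :: FLR0 -> [Var]
    freeVars (V x)         = [x]
    freeVars (App _ es)    = concatMap freeVars es
    freeVars (Where e0 bs) = filter (`notElem` bound)
                                    (concatMap freeVars (e0 : map snd bs))
      where bound = map fst bs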
In the standard semantics for FLR0, we have a dcpo D together with some
continuous functions on D to interpret the function symbols, and with each
FLR0 expression E and each assignment r : Variables → D, we associate a
point value(E, r) ∈ D. If E is an open FLR0 expression and the list of variables
x = x1, ..., xn includes all the free variables of E, then

    A(x) E = λ(y1, ..., yn) value(E, {x1 := y1, ..., xn := yn})         (5)

is the n-ary function defined by E and x. The more general, intensional semantics
for FLR0 needed here are defined directly in terms of a given interpretation A,
making (5) a theorem rather than a definition in the standard case.
The universe of an interpretation A is a set Φ of objects with associated
integer arities; Φn comprises the n-ary objects of Φ, and the nullary objects
in Φ0 are its individuals. In a standard interpretation, Φ0 = D is a dcpo and
Φn consists of all the continuous, n-ary functions on D. The interpretation A
assigns to each expression E and list of variables x = x1, ..., xn including all
free variables of E an object A(x) E in Φn, so that the following basic conditions
of compositionality hold:
(1) A(x) xi depends only on the length of x and on i; in the standard case,
this must be the usual projection function from D^n to D by the i-th component.
(2) If A(x) Mi = A(y) Mi' for i = 1, ..., n, then A(x) f(M1, ..., Mn) =
A(y) f(M1', ..., Mn'). In the standard case, these must be computed by ordinary
function application of the interpretation of f on the given values.
(3) If A(y) E(y1, ..., yn) = A(z) E'(z1, ..., zn) and the substitutions E[M/y]
and E'[M/z] are free, then A(x) E(M1, ..., Mn) = A(x) E'(M1, ..., Mn).
(4) If w = A(x) E0 where {y1 = E1, ..., yn = En}, suppose first that
no yi occurs in x. Then w depends only on A(y, x) Ei for i from 0 to n. In

general, let x' be the same as x except that every variable from x occurring as
one of the yi has been replaced by a fresh variable. Then w depends only on
A(y, x') Ei, in the same sense as the last two requirements: if these values are
equal to A(u, z') Mi, respectively, then w = A(z) M0 where {u = M}. For a
standard interpretation, w must be computed by taking the least fixed point of
the system yi = A(y, x') Ei for i from 1 to n, and substituting the results (which
are functions of the x') into E0.
An expression identity E = M is standard if it is valid for all standard
interpretations, i.e., A(x) E = A(x) M for every list x which includes all the free
variables of both E and M. The simplest example of a standard identity is

    f(x where {x = f(x)}) = x where {x = f(x)}                          (6)

which asserts that "the least fixed point of f is a fixed point of f". Others
include the Bekič-Scott rules which relate simultaneous and iterated recursion,
the reduction of explicit definition to recursion, etc. It can be shown that the
class of standard identities (on a recursive, countable vocabulary) is decidable,
simply (and usefully) axiomatizable, and the same as the class of identities valid
for all interpretations with individuals D0, the set of all streams of '0's. 5 This
robustness of the standard identities suggests that they truly codify the laws of
recursive equations (the rules we use unthinkingly when we manipulate recur-
sive definitions), and we look for modelings of non-determinacy and concurrency
among FLR0 interpretations which satisfy them.
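
The identity (6) has a familiar functional-programming reading; the Haskell fragment below (ours) uses general recursion as the least-fixed-point interpretation of where, so applying f once more to the recursively defined value changes nothing.

    -- whereRec f plays the role of "x where {x = f(x)}".
    whereRec :: (a -> a) -> a
    whereRec f = f (whereRec f)

    -- Identity (6), instantiated with f = (1 :) on lazy integer streams:
    lhs, rhs :: [Int]
    lhs = (1 :) (whereRec (1 :))   -- f(x where {x = f(x)})
    rhs = whereRec (1 :)           -- x where {x = f(x)}
    -- take 5 lhs == take 5 rhs == [1,1,1,1,1]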
An (abstract, intensional) FLR0-structure is a triple

    𝒜 = (Φ0, {Φn}n≥1, A),

where Φ = ∪ Φn is a universe and A is an interpretation of FLR0(Φ) into Φ
which satisfies the standard identities and also

    A(x1, ..., xn) f(x1, ..., xn) = f   for each f ∈ Φn.                (7)

Notice that here we view Φ as both a vocabulary and a universe, each f ∈ Φn
being an n-ary function symbol naming itself as enforced by (7). Each dcpo D
gives rise to a standard FLR0-structure, in which Φ0 = D, Φn consists of the
n-ary continuous functions from D to D, and A is the standard interpretation
as described above.
In an arbitrary FLR0-structure, we think of the elements f of Φ as intensional
functions on A = Φ0, and every n-ary f determines an actual function f̄: A^n → A
via

    f̄(a1, ..., an) =df A() f(a1, ..., an).                              (8)

We say that f̄ is the extension of f, or that f covers f̄.
A homomorphism p : 𝒜 → ℬ from 𝒜 (as above) to ℬ = (Ψ0, {Ψn}n≥1, A') is
any arity-preserving map from Φ to Ψ = ∪ Ψn which respects the interpretations,

5 These results will appear in a multi-authored paper, The logic of recursive equations,
now in preparation.

as follows: extend p (by substitution) so that it takes arbitrary expressions of
FLR0(Φ) to expressions of FLR0(Ψ); then it must satisfy

    p(A(x) E) = A'(x)(p(E)).

Thus, homomorphisms preserve all possible compositions and recursions.


A p o w e r s t r u c t u r e over a dcpo D is an FLR0-structure "P = (P,~n>I,A)
such that there is an injective FLR0-homomorphism p from the standard FLR0-
structure over D to 7) satisfying the following two finite non-determinism con-
ditions:
(1) The map {d} ~ p(d) on the singletons of D extends to a surjective map
7r: Sp --~ P, where S e is a subset of II(D) closed under continuous images and
finite unions.
(2) Similarly, for each arity n, the map {F} ~ p(F) on the singletons of
continuous functions extends to a map ~" which takes each finite set J of n-dry
continuous functions to some ~r(J) E Cn, so that:

~'(J)(~rxl,..., 7rxn) - ~ { F ( d l , . . . , d,) I F E J, di e xi}. (9)

If for a particular powerstructure 7) both occurrences of '~nite" in these condi-


tions may be replaced by "countable" or "arbitrary", then the powerstructure
is called countably non-deterministic or fully non-deterministic, respectively. We
also say that 7) is fine, if the map ~" on individuals is actually a bijection, so that
P can be identified with a set of players.
The second condition applied to singletons {F} implies that each continuous
F : D^n → D has a lift-up f, such that

    f̄(πx1, ..., πxn) = π{F(d1, ..., dn) | d1 ∈ x1, ..., dn ∈ xn}.

In addition, if U = {F1, F2} where F1(d, e) = d and F2(d, e) = e, then the corre-
sponding intensional function τ(U) covers the ("quotient" of the) binary union
operation (2). If 𝒫 is fine and fully non-deterministic over Str and M is the set
of all functions of the form Fμ(a0, a1) = μ[a0, a1] with μ a fair merger, then
τ(M) covers fairmerge as defined above (4). Thus, fine, fully non-deterministic
powerstructures can provide powerful and faithful models of "fair concurrency."
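
A finite toy version (ours) of the lift-up equation and of the binary-union cover may help fix ideas; finite lists stand in for players and all names are illustrative.

    import Data.List (nub)

    -- lift-up of a continuous unary F to players
    liftUp :: Eq b => (a -> b) -> [a] -> [b]
    liftUp f x = nub [ f d | d <- x ]

    -- U = {F1, F2} with F1(d,e) = d and F2(d,e) = e covers binary union:
    coverUnion :: Eq a => [a] -> [a] -> [a]
    coverUnion x y = nub ([ d | d <- x, _ <- y ] ++ [ e | _ <- x, e <- y ])
    -- coverUnion [1,2] [3] == [1,2,3], i.e. x `or` y from equation (2).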
Note that plot(D) together with the continuous functions on it is a power-
structure, but not a fine one: S_P is the collection of finitely generable subsets
of D and π identifies Egli-Milner equivalent sets. Neither is plot(D) fully non-
deterministic. The powerdomains plot_ω(D) enjoy the intermediate property of
countable non-determinism, which can be used to define fairmerge, although not
in the direct way described above, for there are uncountably many fair mergers
μ. It is not clear that countable non-determinism provides a rich enough model
to handle the many extant notions of fairness; in particular, we do not expect
to be able to define natural state-dependent fair merges or the fair merge of
countably many streams using only countable non-determinism.

2 Main results

Theorem A. For each domain D, there is a fine, fully non-deterministic pow-
erstructure ipf(D) = (II(D), ipf(D), A_ipf) over D.
In the construction of ipf(D), every intensional function essentially arises as
f_J for some J, so every f̄ in ipf(D) ends up being set monotone, i.e.,

    x ⊆ y ⇒ f̄(x) ⊆ f̄(y),                                               (10)

and this limits the functions on plot(D) we can represent inside ipf(D). Recall
that (for countably algebraic D), plot(D) can be defined as the quotient of
the finitely generable subsets of D (which we will denote by II_0(D)), under the
equivalence ∼em. Therefore, each continuous function φ : plot(D) → plot(D)
is induced by some continuous function ψ : II_0(D) → II_0(D) on the predomain
II_0(D), i.e.,

    φ([x/ ∼em]) = [ψ(x)/ ∼em],    (x ∈ II_0(D)).                        (11)

In particular, we say that φ is essentially monotone if it is induced by some
set monotone ψ. The essentially monotone functions em(D) are closed un-
der composition and recursion, and therefore together with plot(D) and the
standard (least-fixed-point) interpretation comprise an FLR0-structure Pl(D) =
(plot(D), em(D), A_std). This is a natural FLR0-structure associated with the
Plotkin powerdomain, and it includes all ∪-linear functions [1].
Theorem B. If D is strongly algebraic then there is an FLR0-substructure
ipf_0(D) = (II_0(D), ipf_0(D), A_0) of ipf(D) with the following properties.
(a) Each player function f in ipf_0(D) respects the Egli-Milner preorder on
II_0(D) and is Scott continuous, so that it induces a continuous function

    p(f) = φ : plot(D) → plot(D)                                        (12)

on the Plotkin powerdomain by the equation φ([x/ ∼em]) = [f̄(x)/ ∼em]. (By the
observation (10), φ is necessarily essentially monotone.)
(b) If we extend the map p to II_0(D) by p(x) = [x/ ∼em], it becomes an
FLR0-homomorphism from ipf_0(D) to Pl(D).
(c) If φ : plot(D) → plot(D) is essentially monotone, then φ = p(f) for
some player transformation f in ipf_0(D); that is, the image of the homomor-
phism p is exactly Pl(D).
No similar comparison is possible between ipf(D) and plot_ω(D), however.
The obstacle is that, except for extremely simple (e.g., flat) D, plot_ω(D) cannot
be thought of as a structure on the subsets of D, or precisely:

Theorem C. For any domain D embedding (1_⊥ × N)_⊥, the free σ-semilattice
over D is not the homomorphic image of (II(D), ⊑, ⊆) with ordinary ⊆ and any
partial order ⊑.

This means that plot_ω(D) is not technically a powerstructure in our sense,
in that it does not represent non-deterministic "programs" (FLR0 expressions)
by their sets of possible "outcomes" (subsets of D), but provides some altogether
different, less concrete interpretation.

3 Details and proofs

To prove Theorem A, we need to define the class ipf(D) of implemented player
functions on an arbitrary dcpo D, specify suitable operations of composition and
recursion on this class, and then show that the resulting structure ipf(D) =
(II(D), ipf(D), A_ipf) is a fully non-deterministic powerstructure. The complete
construction is quite long, but not very different from that given in detail and
with many motivating examples in [3], for a specific D. Here we confine ourselves
to a brief sketch, highlighting the differences arising in the general case; [9]
contains a full treatment.
A unary polyfunction on D is a monotone function F: D^I → D, where the
index set I is an arbitrary set of integers and D^I is the dcpo of maps from I to
D under the pointwise ordering. Each polyfunction induces a function on II(D)

    F̂(x) = {F(X) | X : I → x},

and we think of F as an "implementation" of F̂. However, some polyfunctions
differ inessentially by the integer "tags" they use to name their arguments: we
say that G: D^J → D reduces to F: D^I → D, written G ≼ F, if there is an
injection t : I → J such that G(p) = F(p ∘ t) for all p ∈ D^J. Let ≈ be the
smallest equivalence relation extending ≼, and call two polyfunctions F1 and F2
equivalent if F1 ≈ F2. It is simple to verify that if F ≈ G, then F̂ = Ĝ. Finally,
a (unary) implemented player function (ipf) is a nonempty set of polyfunctions
closed under ≈. Each ipf f induces a function f̄: II(D) → II(D) (its extension)
by

    f̄(x) = ∪_{F ∈ f} F̂(x) = {F(X) | F ∈ f, X: I → x where F: D^I → D}.  (13)

The members of f are called its implementations, and a set ℱ of polyfunctions
generates f, written f = ⟨ℱ⟩, if f is the closure of ℱ under ≈, i.e.,

    G ∈ f  ⟺  (∃F ∈ ℱ) such that G ≈ F.

It is not difficult to see that a generating set of implementations suffices to
determine the extension of f as per (13). For n-ary ipfs we use polyfunctions
F: D^{I1} × ... × D^{In} → D and proceed similarly.
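
A finite analogue (ours) of a unary polyfunction and of the extension (13) it induces: a "polyfunction" below reads a tuple of k arguments drawn from a finite player, and its extension collects every possible outcome. The names Player, PolyFun and extension are illustrative.

    import Data.List (nub)

    type Player a  = [a]        -- a finite, non-empty set of outcomes
    type PolyFun a = [a] -> a   -- one implementation over index set {0, ..., k-1}

    extension :: Eq a => Int -> PolyFun a -> Player a -> Player a
    extension k f x = nub [ f args | args <- tuples k x ]
      where tuples 0 _  = [[]]
            tuples n xs = [ v : vs | v <- xs, vs <- tuples (n - 1) xs ]

    -- extension 2 head [1,2] == [1,2]    (a free binary choice)
    -- extension 2 sum  [1,2] == [2,3,4]  (adds its two reads)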
Polyfunctions generalize the infinitary behavior functions of [3], where, how-
ever, only one index set was allowed, I = N. A more essential difference is the
present choice of polyfunction equivalence, which is less coarse than that of [3]
and produces more natural modelings in the specific examples. 6 This choice of
equivalence requires some extra care in the correct definition of ipf composition
and ipf recursion, but these constructions are quite similar to those of [3] and
we will skip them. We mention the one technical notion needed in the proofs
below, to set notation.

6 The desirability of this refinement was discussed briefly in Footnote 8 of [3], but the
methods of that paper were not strong enough to prove the main results with the
present, more natural equivalence relation.

An implementation system for a single ipf equation of the form x = f(x)
is a labeled infinite tree Γ, whose vertices are the set N* of finite sequences of
natural numbers. Each vertex is labeled with an implementation F_r of f, and so
Γ determines an infinite system of recursive equations over D,

    X_r = F_r(λ(i ∈ I_r) X_{ri})     (r ∈ N*),

where ri is the result of appending i to the end of r. We let {X̄_r | r ∈ N*} be
the set of mutual fixed points of this system, and put

    x̄ = {X̄_⟨⟩ : the X̄_r are the simultaneous least fixed points of some Γ}.

This x̄ ∈ II(D) is the "canonical" ipf fixpoint of the equation x = f(x), and
it is not hard to verify that, indeed, it is a fixed point. The construction of
canonical fixpoints for systems of equations with parameters is similar but more
complicated, and still very close to [3].
The proof of Theorem A now essentially consists of showing that the standard
identities hold in ipf(D). Armed with the axiomatization mentioned in Section
1, it suffices to verify a specific, short list of identities. This method improves
on that of [3], both in content (as we can handle arbitrary D and the refined
equivalence relation) and in simplicity.

3.1 Comparing ipf(D) and plot(D)

Turning to Theorem B, we first need to define ipf_0(D), which is most eas-
ily done topologically. So, place the usual Scott topology on all dcpos; note
that for algebraic D, a base of this topology is given by the collection of sets
N_D(e) = {d ∈ D | e ≤ d} for e finite. Let C be the Cantor set (all infinite binary
sequences) with its usual topology, and call a subset X of D compact-analytic if it
is the continuous image of C, i.e., if there is a (topologically) continuous function
F: C → D such that F[C] = X. Since C is homeomorphic to the direct product
of countably many copies of itself, it is not hard to see that compact-analytic
sets are closed under countable direct products and continuous images.
Now restrict attention for the remainder of this section to strongly alge-
braic D. These are the "SFP objects" of Plotkin [6]; we need just the follow-
ing properties: If D is strongly algebraic, then there is an increasing sequence
D0 ⊆ D1 ⊆ D2 ⊆ ... of finite sets of finite elements of D whose union is all finite
elements of D. Furthermore, there is a family of projections p_n: D → D_n such
that p_{n+1} ∘ p_n = p_n and for all d ∈ D, d = sup_n p_n(d).
It is not difficult to check that for strongly algebraic D, the compact-analytic
subsets of D coincide w i t h the finitely generable ones. 7 Therefore, think of
ipf0(D ) as a structure on the compact-analytic subsets of D. To provide the
transformations ipfo(D), call an ipf f compact-analytic s if it is generated by a
7 This statement in fact holds for all algebraic D, but the proof requires considerably
more work; see [9].
8 See also [4].

family of polyfunctions of the form { F_α : D^ℕ → D | α ∈ C } where the function
F : C × D^ℕ → D via F(α, p) = F_α(p) is continuous. These ipfs have implementations continuously parametrized by the Cantor set, which one can think of as a
space of "oracles" for the corresponding non-deterministic function. The closure
properties of compact-analytic sets guarantee that such an f takes compact-analytic players to compact-analytic players, as f̂(x) = F[C × x^ℕ].
It is a fact that the compact-analytic ipfs ipf_0(D) and players Π_0(D) are
closed under composition and recursion, which means that Π_0(D), ipf_0(D), and
the (restriction of) the usual ipf interpretation form an FLR_0-structure ipf_0(D).
The proof of this fact is not difficult from the definitions, and is similar to the
portions of Theorems 8.2 and 8.4 of [3] which state that ipf recursion preserves
"type."

Lemma. For x, y finitely generable, x ⊑ y is equivalent to the conjunction of
x ⊑_u y and x ⊑_l y, where x ⊑_l y if for every c ∈ x and every finite a ∈ D such
that a ≤ c, there exists d ∈ y such that a ≤ d.

Intuitively, x ⊑_l y means that every finite approximation to x is also an
approximation to y. This form of ⊑ will be most useful in the following proofs.
Rephrasing Theorem B, part (a), we now wish to show

Claim. For any ipf f and players x and y, all compact-analytic, we have

(1) If x ⊑ y, then f̂(x) ⊑ f̂(y).
This condition means that f takes ≈_em-equivalent players to ≈_em-equivalent
ones, so it induces a monotone function on the Plotkin powerdomain.
(2) The induced function p(f) is continuous on the Plotkin powerdomain.

Proof of claim. Let A(d) denote the set of finite elements less than or equal to
a given d ∈ D. Since D is algebraic, A(d) is directed and sup A(d) = d. Also let
F : C × D^ℕ → D be the continuous parametrization of f.
Suppose that a is a finite approximation to an element c ∈ f̂(x). By definition
of ipf application, c = F_α(X) for some α ∈ C and X : I → x. By continuity of F,

   a ≤ c = sup { F_α(A) | ∀i, A(i) ∈ A(X(i)) }.

But a is finite, so some individual term of the right-hand sup must already be
beyond a. That is, for a particular sequence A ∈ D^I such that A(i) ∈ A(X(i)),
we have a ≤ F_α(A). Each A(i) is finite below X(i) ∈ x, and hence there is some
Y(i) ∈ y such that A(i) ≤ Y(i). Finally, by monotonicity of F_α, a ≤ F_α(Y) ∈
f̂(y). In other words, f̂(x) ⊑_l f̂(y).
To finish the first part of the claim, show that f̂(x) ⊑_u f̂(y): if d = F_α(Y) ∈
f̂(y), choose a map X : I → x such that X(i) ≤ Y(i) for each i ∈ I. This is
possible since x ⊑_u y. Then x ∋ c = F_α(X) ≤ F_α(Y) = d.
The second part of the claim asserting the continuity of p(f) is more delicate.
Note it suffices to show that for each point z in the Plotkin powerdomain, there
is a sequence of finite elements a_n with supremum z such that p(f)(z) is the
supremum of p(f)(a_n). Let p_n : D → D_n be the sequence of projections witnessing

that D is strongly algebraic. Choose the set-maximal representative Z* of z
from Π_0(D), constructed in section 7 of Smyth [8]. Unsurprisingly, the players
A_n = p_n[Z*] have ⊑-supremum Z*; we shall verify that the supremum of the f̂(A_n)
is f̂(Z*). The key property of the choice of representative Z* is that if {b_i}_{i∈ℕ}
is any increasing sequence with b_i ∈ A_i, then the supremum of the b_i is in Z*.
Actually, Z* is a supremum of the A_n in the preorders ⊑_l and ⊑_u, individually; so we can check that f̂(Z*) is the supremum of the f̂(A_n) in these two preorders
individually, as well. For ⊑_l, the argument goes similarly to the monotonicity
argument above, but using the stronger approximation

   X = sup_{I' ⊆ I, |I'| < ∞}  sup { A | A(i) ∈ A(X(i)) and A(i) = ⊥ for i ∉ I' },

for a function X : I → Z*, which results from the pointwise ordering on D^I.
To show that f̂(Z*) is the ⊑_u-supremum of the f̂(A_n), let Y be any ⊑_u-upper bound of the f̂(A_n), and choose any member d of Y. Then for each n,
there is some α_n and map B_n : ℕ → A_n such that f̂(A_n) ∋ F(α_n, B_n) ≤ d. Now,
C is a compact topological space, and every point has a countable neighborhood
base, so C is sequentially compact. Therefore we may safely assume that the
α_n converge as n increases, to some α ∈ C. Next notice that the projections
p_t "push down" so that F(α_n, p_t ∘ B_n) ≤ d for any t and n as well. By similar
compactness arguments (using Tychonoff's theorem to see that D^ℕ is compact),
one can choose a subsequence B_{n_t} so that for each k ∈ ℕ, the sequence of values
p_t(B_{n_t}(k)) ∈ A_t is eventually monotone as t increases. Hence for each k, this
sequence converges to Z(k) ∈ Z*, so that

   F(α_{n_t}, p_t ∘ B_{n_t}) → F(α, Z) = c ∈ f̂(Z*)  as t → ∞.

Since each element in the left hand sequence is ≤ d, the limit c ≤ d. Hence every
element of Y has an element of f̂(Z*) below it, as desired. □

The previous claim together with the definition that p(x) = [x/≈_em] yields
a map from the universe of ipf_0(D) to the standard structure over the Plotkin
powerdomain of D. The next part of Theorem B states that this map p is an
FLR_0-homomorphism. To see this, it suffices to show that p preserves function
compositions and systems of recursive equations, since all FLR_0 expressions
are built up from these operations. The fact that p(f) ∘ p(g) = p(f ∘ g) is
easy, because p(f) depends only on the extension f̂. For recursion, suppose
that f(x) is a compact-analytic ipf with player fixed point x̄, and that X is
the least fixed point in the Plotkin powerdomain of p(f). We wish to prove
that X = p(x̄) = [x̄/≈_em]. (The following argument will directly generalize to
systems of equations with parameters.)
First, p(f) fixes [x̄/≈_em], since by definition p(f)([x̄/≈_em]) = [f̂(x̄)/≈_em]
and f̂ fixes x̄. Choose a compact-analytic representative x_0 of X in the Plotkin
powerdomain; since X is the least fixed point of p(f), this means that x_0 ⊑ x̄.
On the other hand, we know that f̂(x_0) ≈_em x_0 since the ≈_em-equivalence class
X of x_0 is fixed by p(f). In particular, f̂(x_0) ⊑ x_0. To finish the proof, we show
that for any compact-analytic player y, f̂(y) ⊑ y implies that x̄ ⊑ y:

Suppose f̂(y) ⊑ y. We shall show separately that x̄ ⊑_l y and x̄ ⊑_u y. Let
a ≤ x̄_F ∈ x̄ be a finite approximation to an element of x̄, where x̄_F is the top-level fixed point for some implementation system F of f. By induction on k, each
iterate c^(k) of the corresponding fixed point equations satisfies that the singleton
{c^(k)} ⊑_l y. Since a is a finite approximation to x̄_F, it must be less than or equal
to some top-level iterate c_∅^(k) of F, which is in turn less than some d ∈ y, as
desired.
For x̄ ⊑_u y, let d be any element of y. Since f̂(y) ⊑_u y, choose some c ∈ f̂(y)
such that c ≤ d. Now c must be of the form F(d_1, d_2, ...) for some F ∈ f and
d_1, d_2, ... ∈ y. Proceeding by induction, once d_r ∈ y is chosen, find some F_r ∈ f
and d_{r1}, d_{r2}, ... ∈ y so that d_r ≥ F_r(d_{r1}, d_{r2}, ...). In this way construct
d_r and F_r for all finite sequences from ℕ.
The F_r of course form an implementation system F for the ipf equation
x = f(x). This system has least fixed points d̄_r so that the top-level d̄_∅ ∈ x̄.
But since d_r ≥ F_r(d_{r1}, ...), it must be that d_r ≥ d̄_r for all r, by the general
least-fixed-point properties of the d̄_r. In particular, y ∋ d ≥ d̄_∅ ∈ x̄ as desired. □
Only part (c) of Theorem B remains, which says that the homomorphism p
produces every continuous, essentially monotone function on the Plotkin powerdomain. To prove this part, we must take an arbitrary essentially monotone
g : plot(D) → plot(D) and construct a compact-analytic ipf f so that p(f) = g.
Restated, the goal is to find a continuous function F : C × D^ℕ → D such that
F[C × x^ℕ] ≈_em g([x/≈_em]) for any x ∈ Π_0(D). One reasonable approach is to
arrange that if X : ℕ → x is surjective and so enumerates x, then F(α, X) will
pick out all members of g([x])* as α varies over C.
This approach calls for a "selection function" S which will take an α ∈ C
and some set R of the form R = Y* and return an element of R, so that

   { S(α, R) | α ∈ C } = R,  for every R.   (14)

The following labeled tree σ which "generates" D will help to construct S. As
before, let p_n : D → D_n be the projections which witness the fact that D is
strongly algebraic. The root of σ is labeled with ⊥; if n − 1 levels have been
constructed, then the new children of a current leaf u should consist of a set of
nodes labeled with each element of D_n greater than or equal to σ_u.
We may assume that σ is actually a perfect, infinite binary tree by replacing
each node having k children by a small binary tree with k leaves. Therefore,
identify the nodes of σ with finite binary sequences, and the set of branches of
σ with the Cantor set. We denote by σ_α the supremum of the labels along the
αth branch of σ. By construction, σ_α varies over all of D as α ranges over C.
Moreover, an analogue of this property holds at every node. For any node u, let
N(u) denote the set of branches which extend u. Then as α varies over N(u),
σ_α varies over all elements of D greater than or equal to σ_u, i.e., N_D(σ_u).
Let 𝓡 = { Y* : Y ∈ Π_0 } be the range of the *-operation. For the purposes of

our proof, any selection function S : C × 𝓡 → D satisfying (15) will do:

   S(α, R) = σ_α  if σ_α ∈ R;
   ∀u such that {σ_u} ⊑_l R, ∀α extending u, σ_u ≤ S(α, R).   (15)

The following construction provides one such S. Fixing an R for the moment,
find all minimal u so that no element of R is greater than or equal to σ_u. For
such a u, let t be its parent. Then R does contain some element greater than σ_t,
which is exactly to say that {σ_t} ⊑_l R. Therefore, choose the "leftmost" branch
extending t whose limit is in R, and call this limit m_{R,u}.
Now define S(α, R) = m_{R,u} if α ∈ N(u), and S(α, R) = σ_α otherwise. There
is at most one initial sequence u of α so that m_{R,u} is defined, and if there is
none then σ_α ∈ R since R is closed under limits of approximations to itself; these
observations ensure that S is well-defined and satisfies properties (14) and (15).
Finally define for X ∈ D^ℕ, α ∈ C,

   F(α, X) = S(α, g([X[ℕ]])*),

and let f be the ipf generated by the sections F_α. We must prove both that
F is continuous and that p(f) = g. So, compute the inverse image of a generic
neighborhood N_D(a) under F, as follows. Let U be the set of all minimal u so
that σ_u ≥ a. By construction of S,

   S^{-1}(N_D(a)) = ⋃_{u∈U} N(u) × { R | {σ_u} ⊑_l R }.

Since each N(u) is open in C, we need only show that the collection of all X so
that {σ_u} ⊑_l g([X[ℕ]])* is open in D^ℕ. Fix a u and let b = σ_u for convenience
of notation. Note first that if {b} ⊑_l R, i.e., if ∃d ∈ R s.t. b ≤ d, then there
is some finite set of finite elements A ∋ b so that A ⊑ R. That is, letting
B = { [R] : {b} ⊑_l R }, then

   B = ⋃_{A finite, A ∋ b} N_plot(D)(A),

and so B is open in plot(D). This means that g^{-1}(B) is also open in plot(D),
and hence is of the form

   g^{-1}(B) = ⋃_{A ∈ 𝒜'} N_plot(D)(A),   (16)

for some collection 𝒜' of finite sets of finite elements of D.

Unfortunately, saying that the range X[ℕ] of a function from ℕ to D is
bigger than a given finite set of finite elements A in the order ⊑ is not an open
condition on X, because every value of X must then be greater than or equal to
some element of A. The following argument which converts (16) into a collection
of neighborhoods in the order ⊑_l is therefore the crux of the continuity proof.
Suppose y is given so that A ⊑_l y for some A ∈ 𝒜'. Clearly one can choose
some subset z ⊆ y such that A ⊑_u z as well, so that in fact A ⊑ z. Then

g([A]) ⊑ g([z]), so {b} ⊑_l g([A])* ⊑_l g([z])*. But g is essentially monotone, so
g([z])* ⊆ g([y])*, and hence {b} ⊑_l g([y])*, which is to say that g([y]) ∈ B.
Conversely, if there is no A ∈ 𝒜' such that A ⊑_l y, then certainly no A ∈ 𝒜'
has A ⊑ y, and so g([y]) ∉ B. Thus equation (16) can be improved to

   g^{-1}(B) = ⋃_{A ∈ 𝒜'} { [y] | A ⊑_l y }.

Finishing the proof of F's continuity therefore only requires showing that any
set of the form 𝒳 = { X ∈ D^ℕ | A ⊑_l X[ℕ] } for A a finite set of finite elements
is open in D^ℕ. But A ⊑_l X[ℕ] just means there is some finite set of natural
numbers whose images under X meet N_D(c) for each c ∈ A. Any candidate set
of natural numbers of the appropriate size yields a finite condition on X, and so
the entire set 𝒳 is a union of basic neighborhoods in D^ℕ.
It remains to show that p(f) = g. If x ∈ Π_0 and X : ℕ → x, then X[ℕ] ⊆ x.
By the essential monotonicity of g, g([X[ℕ]])* ⊆ g([x])*. This inequality with

   f̂(x) = ⋃_{X : ℕ→x} F[C × {X}] = ⋃_{X : ℕ→x} g([X[ℕ]])*

shows that f̂(x) ⊆ g([x])*. Equality will be achieved if for some X, X[ℕ] ≈_em x.
Collecting for each n and each element a of p_n[x] some element d_a ∈ x with
a ≤ d_a produces a countable set which is ≈_em-equivalent to x. Then an X that
enumerates {d_a} suffices.
This completes the proof of all three parts of Theorem B, providing a vivid
picture of the Plotkin powerdomain as a powerstructure quotient of the player
model.

3.2 Countable non-determinism

Since both the ipf structure ipf(D) and the powerdomains for countable non-determinism plot_ω(D) seek to improve on the ability of earlier powerdomains
to model fairness, it is natural to compare them. Is there a "countable non-determinism" analogue of Theorem B? Unfortunately, the answer is "no," for the
simple reason that plot_ω(D) does not constitute a powerstructure. The points in
plot_ω(D) cannot (all) be viewed as subsets of D. In other words, although it still
may be true that "direct existence [of plot_ω(D)] along the lines of [6, 8] should
be established," as Plotkin [7] suggests, no construction which stays within the
subsets of D can accomplish this goal.
Let W be the domain (1_⊥ × ℕ)_⊥, which is just a tree with root ⊥ and
countably many branches of length 2. Concretely, we take the elements of W to
be ⊥ and all pairs of the form (0, n) or (1, n) for any n ∈ ℕ. The ordering on
W is that ⊥ < (0, n) < (1, n) for any n, and these are the only relations. W is
very close to being flat, having maximal chain length 3 as opposed to 2. Also,
say that one dcpo D embeds another E if there is a projection from D onto a
subdcpo D' ⊆ D such that D' and E are isomorphic.
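To make the order on W easy to check against, here is a small hedged Haskell rendering; the names W and leq are illustrative assumptions, and the code only records the ordering just described.

-- Hedged sketch of W = (1_bot x N)_bot: bottom plus pairs (0,n) and (1,n),
-- ordered by Bot < (0,n) < (1,n) and nothing else (besides reflexivity).
data W = Bot | Low Int | High Int deriving (Eq, Show)

leq :: W -> W -> Bool
leq Bot      _        = True          -- Bot is below everything
leq (Low n)  (Low m)  = n == m        -- reflexivity on the middle level
leq (Low n)  (High m) = n == m        -- (0,n) below (1,n) only
leq (High n) (High m) = n == m        -- reflexivity on the top level
leq _        _        = False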

Proof of Theorem C. We begin by showing that 𝒫 = (Π(W)/≈_em, ⊑, ⊆) is a
σ-semilattice, where ⊆ is the ordinary subset relation (modulo ≈_em). First note
that since W has only finite elements, ⊑ is just the conjunction of ⊑_u and ⊑_l
on W. Since every ⊑-chain in Π(W) has countable length, (Π(W), ⊑) is a dcpo
exactly if it contains a least upper bound for each sequence. Let x_0 ⊑ x_1 ⊑ ...
be such a sequence, and check that its supremum is

   x = { sup a_n | a_n ∈ x_n, a_m ≤ a_n for m < n }.

Next, ⊆ must have least upper bounds of arbitrary countable sets from 𝒫.
But ⋃_n x_n is the ordinary ⊆-least upper bound of { x_n | n ∈ ℕ }, so [⋃_n x_n/≈_em]
is the least upper bound of { [x_n] | n ∈ ℕ } in 𝒫.
Finally, countable union (i.e., the operation of taking the ⊆-least upper bound
of countably many arguments) must be ω_1-continuous and binary union must
be continuous, with respect to ⊑. The former is trivial because there are no
uncountable chains in 𝒫; the latter just requires that whenever x = sup x_n, then
x ∪ y = sup (x_n ∪ y), which is easy to check because W only has chains of finite
length.
Thus, 𝒫 is a σ-semilattice; however, it is not free over W. The singleton map
would have to be the obvious {·} : w ↦ [{w}]. Now consider the map f from W
into the σ-semilattice^9 𝒬 = (Π(ℕ_⊥), ⊑, ⊆) given by f(⊥) = f(0, n) = {⊥} and
f(1, n) = {n}. f is certainly continuous, and so if 𝒫 were free, f would have to
factor through the singleton map. The values of F at singletons determine all its
other values, since F must preserve countable unions and W is itself countable.
Therefore, the only possibility for F turns out to be

   F(A) = { n | (1, n) ∈ A }            if A ∩ ({⊥} ∪ { (0, n) | n ∈ ℕ }) = ∅,
   F(A) = { n | (1, n) ∈ A } ∪ {⊥}      otherwise.

But this map F is not continuous (on the "dcpo" part (Π(W), ⊑) of the semilattices), which is a contradiction. To see the non-continuity, consider the sets
H_k = { (1, n) | n < k } ∪ { (0, n) | n ≥ k }. Clearly the H_k are increasing in ⊑, and
have least upper bound H = { (1, n) | n ∈ ℕ }. But F(H_k) = {⊥, 0, 1, ..., k−1}
so that sup_k F(H_k) = {⊥} ∪ ℕ, whereas F(H) = ℕ is strictly bigger than {⊥} ∪ ℕ
in 𝒬. (Intuitively, the true free σ-semilattice over W will have to include some
"ideal" element, not corresponding to any set, to be the least upper bound of
the sequence {H_k}_{k∈ℕ}.)
Assume by way of contradiction that the free σ-semilattice S over W is the
homomorphic image of (Π(W), ⊑, ⊆) for some partial order ⊑. The homomorphism induces some equivalence relation ∼ on Π(W), so that S = (Π(W)/∼, ⊑/∼, ⊆/∼), with singleton map w ↦ {w}/∼. Since 𝒫 is a σ-semilattice and the
map w ↦ [{w}] ∈ 𝒫 is continuous, there is a σ-semilattice map φ : S → 𝒫 so

9 See Plotkin [7] for a proof that this is the free σ-semilattice over ℕ_⊥; the proof that
it is simply a σ-semilattice is analogous to the proof for W.

that the following diagram commutes:

   (diagram: the singleton maps out of W into S and into 𝒫 commute with φ : S → 𝒫)

Now, w ↦ [{w}] ∈ 𝒫 is an injective map, so the singleton map into S must
be as well, i.e., ∼ cannot identify any singletons. Since the subset relation in S
is ordinary subset (modulo ∼), the ⊆-lub of a countable set A of singletons in S
is just { w | {w} ∈ A }/∼. Since W is countable, every element of S is produced
in this way, and since φ must preserve ⊆, this means that the map φ is simply
given by

   φ(x/∼) = [x/≈_em] ∈ 𝒫

for any x ∈ Π(W). Thus φ is clearly surjective. Suppose that φ(x/∼) = φ(y/∼),
which means that x is equivalent to y under ≈_em. Write x as {a_0, a_1, ...} and
y = {b_0, b_1, ...} possibly with repetitions so that a_i ≤ b_i. Since the singleton
map to S is monotone, and countable union in a σ-semilattice is monotone, this
representation of x and y shows that x/∼ ⊑ y/∼. Symmetrically, y/∼ ⊑ x/∼, so
that φ is injective as well. Hence φ is an isomorphism of σ-semilattices witnessing
S ≅ 𝒫. But 𝒫 is not free over W.
This argument proves the theorem for D = W. To extend to the case that
D embeds W, just notice that any representation of the free σ-semilattice over
D in the given form would yield such a representation of the free σ-semilattice
over W by taking a quotient under the projection from D to W. □

References

1. M. Hennessy and G. D. Plotkin, "Full abstraction for a simple parallel programming language," Proceedings of MFCS, Lecture Notes in Computer Science 74,
ed. J. Becvar, Berlin: Springer-Verlag (1979) 108-120.
2. Y. N. Moschovakis, "A game-theoretic modeling of concurrency," Extended abstract, Proceedings of the fourth annual symposium on Logic in Computer Science,
IEEE Computer Society Press (1989) 154-163.
3. Y. N. Moschovakis, "A model of concurrency with fair merge and full recursion,"
Information and Computation 93 (1991) 114-171.
4. Y. N. Moschovakis, "Computable, concurrent processes," Theoretical Computer
Science (to appear).
5. D. Park, "On the semantics of fair parallelism," Proc. Copenhagen Winter School,
Lecture Notes in Computer Science 104, Berlin: Springer-Verlag (1980) 504-526.
6. G. D. Plotkin, "A powerdomain construction," SIAM J. of Comput. 5 (1976) 452-487.
7. G. D. Plotkin, "A powerdomain for countable non-determinism," Automata, Languages, and Programming, 9th Colloquium, Lecture Notes in Computer Science
140, eds. M. Nielsen and E. Schmidt, Berlin: Springer-Verlag (1982) 418-428.
8. M. B. Smyth, "Power Domains," J. Comput. System Sci. 16 (1978) 23-36.
9. G. T. Whitney, "Recursion Structures for Non-Determinism and Concurrency,"
Ph.D. Thesis, University of California, Los Angeles (1994).
Canonical Forms for Data-Specifications

Frank Piessens* and Eric Steegmans

Department of Computer Science, Katholieke Universiteit Leuven,


Celestijnenlaan 200A, B-3001 Belgium

Semantic data-specifications, like Chen's Entity-Relationship specification


mechanism ([Ch 76]) have been used for many years in the early stages of
database design. More recently, they have become key ingredients of object-
oriented software development methodologies ([Co 90, VB 91]). Since most of
the data-specification mechanisms used in practice have only weak expressive
power, we may hope to find an algorithm to decide whether two specifications are
equivalent, in the sense that they have essentially the same models.
In this paper, we present a categorical data-specification mechanism, and we
show that, for an important subset of these specifications, equivalence is indeed
decidable: we show that every specification (in the abovementioned subset) can
be transformed to a canonical form such that two specifications are equivalent
iff they have isomorphic canonical forms.
This property is of utmost importance if reuse of data-specifications among
software engineers is ever to be put in practice: it ensures us that two specifica-
tions of different parts of a given reality can always be combined by identifying
the overlapping parts of their canonical forms.

1 Specifications

1.1 Definition and First Properties

We assume that the reader is familiar with basic category theory. A good intro-
ductory book, oriented towards computer scientists, is [Ba 90].

Definition 1. A source in a category S is a pair (X, (f_i)_{i∈I}) consisting of an
object X of S and a family of morphisms f_i : X → Y_i of S, indexed by some set I.

We will use the notations (X, (f_i)_{i∈I}) and f_i : X → Y_i interchangeably. A source
in FinSet, the category of finite sets and functions, can be perceived as a multirelation over the sets Y_i: X is a multiset of I-tuples, and the f_i's are generalised
projections.
A source (X, (f_i)_{i∈I}) is a mono-source if f_i ∘ x = f_i ∘ y for all i ∈ I implies that
x = y. A mono-source in FinSet is an ordinary relation: it is (up to isomorphism)
a subset of the cartesian product ∏_{i∈I} Y_i together with the (restrictions of) the
product projections.
More details about sources can be found in [Ad 90].
* Research Assistant of the Belgian Fund for Scientific Research

Definition 2. A specification ℱ is a pair (S, 𝓜), where S is a finite category,
and 𝓜 is a finite set of sources in S.

In practice, the category S will often be presented as a graph G together with a
set 𝒟 of diagrams (or equations) in G. The category S can then be computed as
the quotient of the free category on G by the congruence relation generated by
the diagrams ([Ba 90]). In the examples in this paper, every specification will be
given as a graph G, a set of equations 𝒟 and a set of sources 𝓜.
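As an illustration of this presentation style only (a hedged Haskell sketch; the type and field names, and the example value, are assumptions rather than notation from the paper), such a presentation can be recorded as plain finite data; the example value corresponds to the computer/printer connection specification used in the examples below.

-- Hedged sketch: a specification presented as graph + equations + sources.
type Node  = String
type Arrow = (String, Node, Node)      -- (arrow name, source node, target node)
type Path  = [String]                  -- a composable list of arrow names

data Spec = Spec
  { graph     :: ([Node], [Arrow])     -- the graph G
  , equations :: [(Path, Path)]        -- the diagrams/equations D
  , sources   :: [(Node, [String])]    -- the sources M, each given as (X, (f_i))
  }

-- The ordinary CONNECTION relation between computers and printers:
connectionSpec :: Spec
connectionSpec = Spec
  { graph     = ( ["COMPUTER", "PRINTER", "CONNECTION"]
                , [ ("c", "CONNECTION", "COMPUTER")
                  , ("p", "CONNECTION", "PRINTER") ] )
  , equations = []
  , sources   = [ ("CONNECTION", ["c", "p"]) ]
  }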

Definition 3. A model of a specification ℱ = (S, 𝓜) is a functor from S to
FinSet, which takes every M ∈ 𝓜 to a mono-source.

The model-category Mod(ℱ) is the full subcategory of Fun(S, FinSet), containing only the models.
Specifications are a special kind of Finite Limit or Left Exact sketches as
defined in [Ba 90, Ba 85], since the condition that a source must be mono can
be phrased as requiring a certain cone in the category to be a limit cone.
The following proposition (that we will need later) is easy to prove:

Proposition 4. The model-category of a specification (S, 𝓜) is an epi-reflective
full subcategory of Fun(S, FinSet).

It is proved in a similar way as Theorem 16.14 on page 260 of [Ad 90]. The
construction of the reflection of a functor F is as follows:

1. Define a relation R on the elements of F where x R y iff there exists some
f_i : X → Y_i ∈ 𝓜 such that f_i(x) = f_i(y) for all i.
2. Construct the smallest congruence relation on the elements of F, containing R.
3. Take the quotient of F by this congruence relation.

The universal arrow from F to its reflection is the projection of F on this quotient.
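The three steps above are directly executable on finite data. The following hedged Haskell sketch (all type and function names are assumptions) computes the congruence, and hence the reflection, for a functor presented by finite sets and finite functions.

import qualified Data.Map as M
import qualified Data.Set as S

type Obj       = String
type Elem      = (Obj, Int)                       -- an element of F(X), tagged with X
type ArrowName = String

data FunData = FunData
  { elems  :: M.Map Obj [Int]                     -- F(X) as a finite set
  , arrows :: [(ArrowName, Obj, Obj)]             -- the arrows of S
  , action :: M.Map ArrowName (M.Map Int Int)     -- F(g) as a finite function
  }

type Source = (Obj, [ArrowName])                  -- a source (X, (f_i)) in M

apply :: FunData -> ArrowName -> Int -> Int
apply fd g a = M.findWithDefault M.empty g (action fd) M.! a  -- assumes well-formed input

-- Step 1: x R y iff some source (f_i) in M identifies them.
step1 :: FunData -> [Source] -> S.Set (Elem, Elem)
step1 fd srcs = S.fromList
  [ ((x, a), (x, b))
  | (x, fs) <- srcs
  , let dom = M.findWithDefault [] x (elems fd)
  , a <- dom, b <- dom
  , all (\g -> apply fd g a == apply fd g b) fs ]

-- Step 2: close R under symmetry, transitivity and the functor action
-- (reflexive pairs can be omitted: untouched elements are singleton classes).
step2 :: FunData -> S.Set (Elem, Elem) -> S.Set (Elem, Elem)
step2 fd = fixpoint grow
  where
    fixpoint h r = let r' = h r in if r' == r then r else fixpoint h r'
    grow r = S.unions
      [ r
      , S.map (\(a, b) -> (b, a)) r
      , S.fromList [ (a, c) | (a, b) <- S.toList r, (b', c) <- S.toList r, b == b' ]
      , S.fromList [ ((t, apply fd g a), (t, apply fd g b))
                   | ((x, a), (_, b)) <- S.toList r
                   , (g, s, t) <- arrows fd, s == x ]
      ]

-- Step 3: the quotient sends each element to a representative of its class;
-- the universal arrow to the reflection is this projection.
representative :: S.Set (Elem, Elem) -> Elem -> Elem
representative cong e = minimum (e : [ b | (a, b) <- S.toList cong, a == e ])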

1.2 Examples

In the following examples, we try to show that specifications have enough ex-
pressive power to be useful in practice.

1. If S is a discrete category (no non-identity arrows), the models are just typed
sets. The specification:

   G = ( COMPUTER   PRINTER )  (two objects, no arrows),   𝒟 = ∅,   𝓜 = ∅

says that the part of the world we want to specify consists of two kinds of
entities (printers and computers), and that is all it says.

2. Arrows in the category specify existential dependencies. Consider for in-


stance:
   G = ( COMPUTER → LOCATION ),   𝒟 = ∅,   𝓜 = ∅

Since the arrow must be taken to a function in a model, this specifies that
every computer must have a location associated with it. (An entity of type
COMPUTER is always associated with an entity of type LOCATION.)
3. A source with n arrows in the category can be seen as an n-ary multirelation:

   G = ( c : CONNECTION → COMPUTER, p : CONNECTION → PRINTER ),   𝒟 = ∅,   𝓜 = ∅

This specification says that connection is a multirelation between computers
and printers: every entity of type CONNECTION is associated with a couple
of entities (x, y) with x of type COMPUTER and y of type PRINTER. It
is possible that two different entities of type CONNECTION are associated
with the same couple. Hence CONNECTION is a multi-relation over COMPUTER and PRINTER. The following model of the specification models the
situation where there is one computer and two printers, and one of the two
printers is connected twice to the computer (for instance, the printer could
be connected to the computer by a serial cable and by a parallel cable). The
other printer is not connected:

   (figure: a model with one computer, two printers, and two CONNECTION entities mapped to the same computer-printer pair)

4. A multi-relation is an ordinary relation (no duplicates allowed) if and only


if the corresponding source is a mono-source. The specification:
   G = ( c : CONNECTION → COMPUTER, p : CONNECTION → PRINTER ),   𝒟 = ∅,   𝓜 = {(CONNECTION, (c, p))}

says that CONNECTION is an ordinary relation over COMPUTER and
PRINTER. The model of the previous example is not a model of this example, because the source (CONNECTION, (c, p)) is not taken to a mono-source.
5. Connectivity constraints on relations (as used in ER-modelling) can also be
specified:

   G = ( c : CONNECTION → COMPUTER, p : CONNECTION → PRINTER ),   𝒟 = ∅,   𝓜 = {(CONNECTION, (p))}

Requiring p : CONNECTION → PRINTER to be mono means that a printer


can be connected to at most one computer. In ER-modelling, one says that
CONNECTION is a 1-n-relation. In a similar way, we can specify that rela-
tions must be n-1 or 1-1.
6. The examples so far have shown that any Entity-Relationship specification
without attributes can be translated to one of our specifications. The cate-
gory S corresponding to an ER-specification has a different object for each
entity-class and for each relation, and for each binary relation, there are
two arrows (which we will call projections) in the category, going from the
relation-object to the two objects corresponding to the two participating
entity-classes. 𝓜 must contain at least one source for every relation: for n-n
relations, this will be a source containing the two projections, for 1-n and
n-1 relations, this source will contain only one of the projection arrows, and
for 1-1 relations, two sources, both containing one of the projection arrows,
must be added to 𝓜.
Relations of arity higher than 2 can also be handled in the obvious way.
Attributes of ER-specifications will be considered in more detail later.
7. By requiring certain equations to be valid in S, we can express equality
constraints:

   G = ( c : CONNECTION → COMPUTER, p : CONNECTION → PRINTER,
         l1 : COMPUTER → LOCATION, l2 : PRINTER → LOCATION ),
   𝒟 = { l2 ∘ p = l1 ∘ c },   𝓜 = ∅

This specification says that computers and printers can be connected only
if they have the same location. This kind of constraint occurs very often in
practice.

1.3 Attributes

An important ingredient of data-specifications that we have not considered yet


are attributes. In ER-diagrams, attributes allow one to specify that entities
of a certain type are labelled with elements of a given set. For example, entities
of type PRINTER could be labelled with elements of an attribute-set { matrix,
laser }, specifying that every entity of type PRINTER is either a matrixprinter
or a laserprinter.
We will extend our definition of specifications to allow for attribute-sets, but
later on we will show that in many cases such a specification with attributes can
be reduced to an equivalent one without attributes.

Definition 5. A specification with attributes consists of a specification (S, 𝓜),
together with a functor A : S_0 → FinSet, where S_0 is the discrete category containing all the objects of S.

The functor A is essentially a function from the nodes of S to the finite sets.
The intention is to specify that every entity of a type C must be labelled with
a value from the set A(C).
In the previous example, A could take COMPUTER and CONNECTION
to the terminal set, and PRINTER to the set { matrix, laser }. Labelling with
an element from the terminal set is of course equivalent to no labelling at all.
The straightforward way to define the model category is as follows: Let
I : S_0 → S be the inclusion, let I* : Mod(ℱ) → Fun(S_0, FinSet) be the functor
of composition with I and let A : 1 → Fun(S_0, FinSet) be the functor picking
out A. Then,

Definition 6. The model category of (S, 𝓜, A) is the comma-category (I* ↓ A).

Hence, the objects of the model category are models of the underlying specification without attributes, together with a natural transformation from M ∘ I
to A, which can be perceived as a labelling of the elements of the model.
If A(C) is the terminal set 1, this means that C has no attributes and if
A(C) = A_1 × A_2 × ... × A_n then C has n attribute-sets A_1 ... A_n. The value of
the functor A is graphically denoted in the following manner:

   (1)  G = ( c : CONNECTION → COMPUTER, p : CONNECTION → PRINTER ),   𝒟 = ∅,   𝓜 = ∅,
        with A(COMPUTER) = A(CONNECTION) = 1 and A(PRINTER) = { matrix, laser }

Two specifications (with or without attributes) are (syntactically) isomorphic iff they are identical up to a renaming. Two specifications are (semantically)
equivalent iff their model-categories are equivalent. Hence, two specifications are
equivalent iff they have essentially the same models. This definition of equivalence seems to capture very well the intuitive notion of "describing the same
alence seems to capture very well the intuitive notion of "describing the same

reality". It should be noted that we do not give any semantic value to the names
used in the specifications. This seems reasonable since the semantic content of
a name is usually not formal. In a similar way, you don't change an algebraic
structure by giving other names to its operations. Still, when combining existing
specifications, one should be cautious if different names are used for equivalent
subspecifications since the designer(s) may inadvertently have chosen the same
representation for two different realities. But of course, an automatic procedure
might give warnings or go interactive in such cases.

2 Canonical Form for Specifications


2.1 Introduction

Certain real-world situations can be specified in a number of different (non-


isomorphic) ways. In other words: equivalence of specifications does not imply
isomorphism of specifications.
Example: Specification (1) of COMPUTERS, PRINTERS and CONNECTIONS
given above, with an attribute set { matrix, laser } attached to PRINTER, is
equivalent to the following specification without attributes:

   (2)  G = ( MCONNECTION → MATRIXPRINTER, MCONNECTION → COMPUTER,
              LCONNECTION → COMPUTER, LCONNECTION → LASERPRINTER ),
        𝒟 = ∅,   𝓜 = ∅

In this second specification, separate entity types for matrixprinters and laser-
printers are used, instead of one entity type with an attribute.
Clearly, these specifications (1) and (2) are not isomorphic.
Example: The specifications:

   (3)  G = ( f, g : A → B, h : B → C ),   𝒟 = { h ∘ f = h ∘ g },   𝓜 = {(B, (h))}

and

   (4)  G = ( f : A → B, h : B → C ),   𝒟 = ∅,   𝓜 = {(B, (h))}

are another example of two non-isomorphic specifications describing the same
reality: the constraints in the first specification (h must be taken to an injective
function and h ∘ f = h ∘ g) are such that in every model, f and g must be the
same. Hence, there is no need to distinguish them in the specification.
We want to define a notion of canonical specification with the following properties:

1. For every specification, there must exist a canonical specification with an


equivalent model-category. Moreover, we would like to have an algorithm to
convert a specification to its canonical form.

2. Two canonical specifications have equivalent model-categories iff they are


isomorphic.

In fact, we will only be able to define canonical forms for a (large) subset of
specifications.
Why do we care about canonical forms of specifications? Suppose two soft-
ware engineers both specify a part of a large system, and these specifications
overlap partly. If it would be guaranteed that the overlapping part is specified
isomorphically in the two specifications, it would be easy to merge them together:
just identify the isomorphic overlapping parts.

2.2 Attribute Elimination

A first important result states that, in many cases, attributes can be eliminated
from a specification, without changing the model-category.

Theorem 7. Let (S, 𝓜, A) be a specification with attributes, and let A^! : S →
FinSet be the right Kan extension of A along the inclusion I : S_0 → S. If A^! is a
model of (S, 𝓜), then (S, 𝓜, A) is equivalent to a specification (S', 𝓜') without
attributes.

Proof. First, note that the right Kan extension exists, since S is a finite category,
and FinSet is finitely complete. The proof consists of two parts.

1. First we show that (I* ↓ A) is isomorphic to the slice-category Mod(ℱ)/A^!.
From the universal property of the right Kan extension, we get the natural
isomorphism:

   Nat(I*(M), A) ≅ Nat(M, A^!)   (5)

Objects of (I* ↓ A) are couples (M, λ) with M a model of ℱ and λ : I*(M) →
A a natural transformation. Objects of Mod(ℱ)/A^! are couples (M, μ) with
M a model of ℱ and μ : M → A^! a natural transformation. By (5), there is
a bijection between the objects of both categories.
Arrows in both categories are natural transformations between models of
ℱ. In (I* ↓ A) they must satisfy commutativity of:

   (diagram: the triangle over A formed by I*(M_1) → I*(M_2) and the two transformations into A)

In Mod(ℱ)/A^! they must satisfy commutativity of:

   (diagram: the triangle over A^! formed by M_1 → M_2 and the two transformations into A^!)

However, by the naturality in M of (5), these two conditions are equivalent,
and hence (I* ↓ A) and Mod(ℱ)/A^! are isomorphic.
2. Secondly, let ∫A^! be the category of elements of A^!, and let Π : ∫A^! → S
be the associated projection. We prove that Mod(ℱ)/A^! is equivalent to
Mod(ℱ') with ℱ' = (S', 𝓜') where S' = ∫A^!, and μ' ∈ 𝓜' iff Π(μ') ∈ 𝓜.
It is well-known that

   Fun(S, FinSet) is equivalent with FinDof(S)   (6)

where FinDof(S) is the category of finite discrete opfibrations over S, a full
subcategory of FinCat/S. A functor F : S → FinSet corresponds under
this equivalence with the discrete opfibration (dof) Π_F : ∫F → S. We investigate under what conditions the functor corresponding to a dof Ψ : E → S
is a model of (S, 𝓜).
Let μ = f_i : X → Y_i ∈ 𝓜 and suppose B and B' are nodes of E such
that Ψ(B) = Ψ(B') = X. The arrow-lifting property of dofs ensures us
the existence of two unique sources g_i : B → Z_i and g'_i : B' → Z'_i such that
Ψ(g_i) = Ψ(g'_i) = f_i. We say that B ∼_μ^Ψ B' iff Z_i = Z'_i for all i. This situation
is illustrated in the following picture:

   (figure omitted)

It is clear that the functor corresponding to Ψ takes μ to a mono-source iff

   B ∼_μ^Ψ B'  ⟹  B = B'.

By (6):

   Fun(S, FinSet)/A^! is equivalent with FinDof(S)/Π

and it is well-known that:

   FinDof(S)/(Π : ∫A^! → S) is equivalent with FinDof(∫A^!)

This last equivalence maps a dof Φ : E → ∫A^! to Π ∘ Φ : E → S. Hence,
it only remains to prove that Φ corresponds to a model of ℱ' iff Π ∘ Φ
corresponds to a model of ℱ.
(a) Suppose Φ corresponds to a model of ℱ'. If B ∼_μ^{Π∘Φ} B', then Φ(B) ∼_μ^Π
Φ(B'), and hence (since Π corresponds to a model) Φ(B) = Φ(B'). But
that means that B ∼_{μ'}^Φ B' for some μ' in S' with Π(μ') = μ. Since Φ
corresponds to a model of ℱ', it follows that B = B', and hence Π ∘ Φ
corresponds to a model of ℱ.
(b) Suppose Π ∘ Φ corresponds to a model of ℱ. If B ∼_{μ'}^Φ B', then B ∼_μ^{Π∘Φ} B'
where μ = Π(μ'), and since Π ∘ Φ corresponds to a model of ℱ, it follows
that B = B'. Hence, Φ corresponds to a model of ℱ'.

In part 1 of the proof, we have shown that:

   Mod(S, 𝓜, A) is isomorphic to Mod(S, 𝓜)/A^!

In part 2, we proved that:

   Mod(S, 𝓜)/A^! is equivalent to Mod(S', 𝓜')

Hence, we may conclude that:

   Mod(S, 𝓜, A) is equivalent to Mod(S', 𝓜')

Since both the Kan extension and the category of elements can be effectively
computed, this proof is constructive: it describes an algorithm to eliminate attributes from specifications.
Example: Let us eliminate the attributes from specification (1). The construction
of Kan extensions can be found in [Ba 85] or [Bo 94]. Since S_0 is a discrete
category, the value of the Kan extension on objects becomes a simple product:

   A^!(X) = ∏_{f : X → Y} A(Y)

where the product is taken over all arrows f : X → Y out of X. A^! becomes the
following functor:
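As a worked illustration (assuming the reading of specification (1) given above, where the only non-identity arrows are c and p out of CONNECTION), the product formula gives

   A^!(COMPUTER)   = A(COMPUTER) = 1,
   A^!(PRINTER)    = A(PRINTER) = { matrix, laser },
   A^!(CONNECTION) = A(CONNECTION) × A(COMPUTER) × A(PRINTER) ≅ { matrix, laser }.

In the category of elements, PRINTER therefore contributes two objects (one for matrix, one for laser) and CONNECTION likewise splits in two, which is exactly the shape of specification (2).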

Its category of elements is isomorphic to (2), and hence we obtain specifica-


tion (2) if we eliminate attributes from specification (1).
If A^! is not a model, the construction fails. Suppose, for instance, that we
modified specification (1) in the following way:

   (7)  G = ( c : CONNECTION → COMPUTER, p : CONNECTION → PRINTER ),   𝒟 = ∅,   𝓜 = {(CONNECTION, (c))}

In this specification, a computer can have at most one printer connected to it.
Now, A^! is still the same functor, but it is no model of specification (7), because
A^!(c) is not an injective function. Hence, our construction fails. And indeed, no
matter what sources we add to 𝓜 of specification (2), we can never specify that
a computer can have at most one printer connected to it. The best we can do is
specify that it can have at most one laserprinter, and at most one matrixprinter
connected to it. But clearly, this is not equivalent to the original specification
(7).
The question remains if some other construction might lead to an equivalent
specification without attributes. The answer is no, since the model category of
specification (7) does not have a terminal object, while a model category of a
specification without attributes always has a terminal object.

2.3 Canonical Specifications


Definition 8. A source without doubles is a source in which no arrow occurs
twice or more.

Since a source with doubles is a mono-source iff the same source with all doubles
removed is a mono-source, only sources without doubles must be considered for
specifications.

Definition 9. A specification ℱ = (S, 𝓜) is canonical iff:

1. In the category S, every idempotent arrow is split.
2. In the category S, no two different objects are isomorphic.
3. All representable functors from S to FinSet are models.

4. The set 𝓜 contains all sources without doubles in S which are taken to
mono-sources by every model.

The first condition means the category must be Cauchy-complete ([Bo 86]), while
the second condition states that it must be skeletal. Conditions 2 and 3 remove
redundancy from the specifications: if different objects in the category of the
specification are isomorphic, this means that at least one of them can be removed
while retaining an equivalent model-category. If the representable functors are no
models, this means that some of the arrows in the specification are redundant
(cfr. the proof of Lemma 10). Conditions 1 and 4 can be seen as a kind of
"completion" of the specification.

Existence of Canonical Forms. First, we prove that every specification can
be transformed to a canonical one with an equivalent model-category.

Lemma 10. For every specification ℱ = (S, 𝓜), there exists a specification
ℱ' = (S', 𝓜') such that:

1. The model-categories of ℱ and ℱ' are equivalent.
2. The representable functors from S' are models for ℱ'.

Proof. We have the following situation:

   Fun(S, FinSet) ⇄ Mod(ℱ)   (I the inclusion, K its left adjoint)

with I the inclusion and K left adjoint to I (existence of K follows from
Proposition 4). Now suppose Hom(C, −) is not a model. Then Hom(C, −) ≠
I K Hom(C, −). Hence the arrow η_{Hom(C,−)} of the unit is an epi which is not
an isomorphism. Hence η_{Hom(C,−)} identifies some of the arrows out of C. Since
Nat(Hom(C, −), IM) = Nat(I K Hom(C, −), IM), we can conclude that these
identified arrows must be taken to the same function in every model. Hence, if
we identify these arrows in S, we don't change the model-category. More formally:
Let ∼ be the congruence relation on arrows of S with:

   f ∼ g  iff  η_{Hom(C,−)}(f) = η_{Hom(C,−)}(g)

with C the source of f and g. Let S' be S/∼ and let P : S → S/∼ be the
projection. Then ℱ' = (S', 𝓜') with 𝓜' = { P(M) | M ∈ 𝓜 } satisfies the
conditions of the lemma. □

Example: Consider specification (3). The hom-functor Hom(A, −) is no model
of this specification, because Hom(A, h) is not injective: it takes f and g to the
same arrow h ∘ f = h ∘ g. The following figure gives a graphic representation of
Hom(A, −) and of its reflection along the adjunction mentioned in the lemma:

   (figure: Hom(A, −) and its reflection)

The reflection was computed by the construction given after Proposition 4.
Clearly, η_{Hom(A,−)} identifies f and g. Hence, we have f ∼ g, and we must
divide S by this congruence relation. Doing this gives us the specification (4).

Theorem 11. For every specification ℱ = (S, 𝓜), there exists a canonical specification with an equivalent model-category.

Proof. By Lemma 10, we can find ℱ' = (S', 𝓜') with an equivalent model-category such that all representable functors from S' are models.
Every model of ℱ' can be extended uniquely to a model of ℱ'' = (S'', 𝓜''),
where S'' is the Cauchy completion of S' (cfr. Theorem 1 in [Bo 86]). Let i : S' →
S'' be the inclusion. If we take ℱ'' = (S'', 𝓜'') with 𝓜'' = { i(M) | M ∈ 𝓜' },
then the model-category of ℱ'' is equivalent with that of ℱ'.
Finally, let S^c be the skeleton of S'', with p : S'' → S^c the equivalence between S'' and S^c. Let 𝓜''' = { p(M) | M ∈ 𝓜'' }. If we take 𝓜^c to be the set
of all sources without doubles in S^c which are taken to a mono-source in every
model of (S^c, 𝓜'''), then (S^c, 𝓜^c) is the required canonical specification. □

Since construction of the Cauchy-completion and the skeleton is trivial (and


hardly ever needed in practice since idempotent arrows and isomorphic entity-
classes are very rare in specifications) we do not give an example here.
Again, we emphasize that the proof of the previous theorem is construc-
tive: the congruence relation mentioned in Lemma 10, as well as the Cauchy-
completion or the skeleton of a finite category can easily be computed. By com-
bining all these constructions, one finds an algorithm to compute the canonical
form of a specification.
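Assembled end to end, the algorithm has the following shape; the Haskell below is a hedged, compile-only outline in which every helper is a stub standing for the corresponding construction above (the names are assumptions, not the paper's).

-- Hypothetical outline of the canonical-form algorithm; all helpers are stubs.
data SpecA = SpecA   -- placeholder: a specification with attributes
data Spec  = Spec    -- placeholder: a specification without attributes

eliminateAttributes :: SpecA -> Maybe Spec     -- Theorem 7; Nothing when A^! is not a model
eliminateAttributes _ = Nothing

quotientByCongruence, cauchyComplete, skeletize, saturateSources :: Spec -> Spec
quotientByCongruence = id   -- Lemma 10: divide S by the congruence f ~ g
cauchyComplete        = id   -- split every idempotent arrow
skeletize             = id   -- keep one object per isomorphism class
saturateSources       = id   -- add all double-free sources mono in every model

canonicalForm :: SpecA -> Maybe Spec
canonicalForm =
  fmap (saturateSources . skeletize . cauchyComplete . quotientByCongruence)
  . eliminateAttributes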

Uniqueness of Canonical Forms. To prove that two canonical specifications
have equivalent model-categories iff they are isomorphic, we need a few lemmas.

Lemma 12. Every model M : S → FinSet of a specification ℱ = (S, 𝓜) is a
colimit of a diagram that factors through K ∘ Y where Y : S^op → Fun(S, FinSet)
is the Yoneda-embedding and K : Fun(S, FinSet) → Mod(ℱ) is the left-adjoint
to the inclusion I : Mod(ℱ) → Fun(S, FinSet).

If the representable functors are models, this lemma says that every model is a
colimit of Hom-models.

Proof. We have:

   S^op –Y→ Fun(S, FinSet) –K→ Mod(ℱ)   (with I : Mod(ℱ) → Fun(S, FinSet) the inclusion)

Since K is left adjoint to I, it preserves colimits. And since K ∘ I is the identity
functor on Mod(ℱ), K is surjective on objects and arrows.
Now, since every object in Fun(S, FinSet) is a colimit of a diagram factoring
through Y (see page 41 of [Ma 92]), the result follows. □

Definition 13. Let T be a model-category of a specification. We define the base
category of T as the full subcategory of T which contains all the objects O of
T, for which the arrow α : colim Nat(O, D) → Nat(O, colim D), induced by the
universality of colimits, is an epi for every diagram D.

Lemma 14. Let ℱ = (S, 𝓜) be a canonical specification. Then Base(Mod(ℱ))
is equivalent with S.

Proof. First we prove that all the Hom-functors belong to the base. Since ℱ is
canonical, all representable functors are models. Let IH be such a Hom-functor.
We observe that:

   I colim D = η_{colim ID} ∘ colim ID

(This follows from Proposition 13.30 on page 213 of [Ad 90].)
Also, Nat(IH, −) preserves colimits (it is an evaluation functor).
Hence:

   Nat(H, colim D) = Nat(IH, I colim D)   (since I is full and faithful)
                   = Nat(IH, η_{colim ID} ∘ colim ID)
                   = Nat(IH, η_{colim ID}) ∘ Nat(IH, colim ID)

Since every arrow of the unit is epi (Proposition 4), and since Nat(IH, −) preserves epi's, we conclude that H belongs to the base.
Secondly, we prove that every object in the base is isomorphic to some Hom-functor.
Suppose F ∈ Base. We know that F = colim D where all objects from D are
Hom-functors. (This follows from Lemma 12 and the fact that all Hom-functors
are models.) Hence we have the following situation:

   (diagram: the canonical arrow α : colim Nat(F, D) → Nat(F, F))

with α an epi. So some object H of Nat(F, D) contains an a such that α ∘ a =
Id_F. Hence, F is a splitting of an idempotent of a Hom-functor. But since every
idempotent in S is split, F must be isomorphic to a Hom-functor. □
We are now ready to prove the main theorem:

Theorem 15. Let ℱ_1 = (S_1, 𝓜_1) and ℱ_2 = (S_2, 𝓜_2) be two canonical specifications. Then ℱ_1 and ℱ_2 are isomorphic iff their model-categories are equivalent.

Proof. If the model-categories are equivalent categories, then their bases are
equivalent categories, and hence by Lemma 14 and by transitivity of equivalence,
the categories S_1 and S_2 are equivalent. But since ℱ_1 and ℱ_2 are canonical, S_1
and S_2 are skeletal. Hence, the categories S_1 and S_2 are isomorphic.
Since ℱ_1 and ℱ_2 have essentially the same models, their sets of sources without doubles which are taken to a mono-source in every model must be isomorphic
also. □

3 Conclusion and Related Work


We have presented categorical data-specifications with similar expressive power
as the data-specification mechanisms used in practice. The main restriction is the
fact that attribute sets must be finite. For a large subset of these specifications,
one can compute so-called canonical forms which have the nice property that
equivalence of semantics implies isomorphism of syntax. Hence, for that same
subset of specifications, equivalence of semantics is decidable.
The construction of the canonical form does not work for specifications where
the connectivity constraints and the attributes interact in an unclean way: in
that case, it was impossible to eliminate the attributes from the specification.
It remains an open question if equivalence of these specifications is decidable or
not.
The results in this paper are a refinement (especially in the area of attributes)
of work presented in a lecture at the BCS FACS Christmas Workshop on Formal
Aspects of Object Oriented Systems, held at Imperial College, London, December
1993.
[Is 93] gives a categorical underpinning of relational database schemas, and
shows how the glueing together of different schemas can be defined rigorously.
[Vi 92] proposes geometric logic as a language for specifying databases. In [Vi 94],
a Z-like notation for geometric logic is proposed.
The data-specifications proposed in our paper have less expressive power,
but have the advantage of decidable equivalence. Since our specifications are
essentially a special kind of Finite Limit sketches ([Ba 85, Ba 90]), they are in
fact a fragment of geometric logic. We are currently investigating if decidable
equivalence is possible for larger fragments.
Another interesting issue for future work, is investigating the practical use
of our specifications: Are they understandable by database designers as well as
users? Do they have enough expressive power to specify non-trivial, real-world
databases?

References

[Ad 90] J. Adamek, H. Herrlich, G. E. Strecker. Abstract and Concrete Categories.
Wiley-Interscience publications, 1990.
[Ba 85] M. Barr, C. Wells. Toposes, Triples and Theories. Springer, New York, 1985.
[Ba 90] M. Barr, C. Wells. Category Theory for Computing Science. Prentice Hall International Series in Computer Science, 1990.
[Bo 86] F. Borceux, D. Dejean. "Cauchy Completion in Category Theory." Cahiers de
Topologie et Géométrie Différentielle Catégoriques, Vol. XXVII-2, pp. 133-146, 1986.
[Bo 94] F. Borceux. Handbook of Categorical Algebra I. Cambridge University Press, 1993.
[Ch 76] P. P. Chen. "The Entity-Relationship Model - Towards a Unified View of
Data." ACM Transactions on Database Systems, Vol. 1, No. 1, 1976, pp. 9-36.
[Co 90] P. Coad, E. Yourdon. Object-Oriented Analysis. Yourdon Press, Englewood
Cliffs, New Jersey, 1990.
[Is 93] A. Islam, W. Phoa. "Categorical models of relational databases I: fibrational
formulation, schema integration," preprint.
[Ma 92] S. Mac Lane, I. Moerdijk. Sheaves in Geometry and Logic: A First Introduction
to Topos Theory. Springer, New York, 1992.
[VB 91] S. Van Baelen, J. Lewi, E. Steegmans, H. Van Riel. "EROOS: An Entity-Relationship based Object-Oriented Specification Method." Technology of
Object-Oriented Languages and Systems TOOLS 7 (eds. G. Heeg, B. Magnusson,
B. Meyer), Prentice Hall, 1991, pp. 103-117.
[Vi 92] S. Vickers. "Geometric Theories and Databases," in Applications of Categories
in Computer Science, London Mathematical Society Lecture Note Series 177,
pp. 288-314.
[Vi 94] S. Vickers. "Geometric Logic as a Specification Language," to appear in Theory and Formal Methods 1994: Proceedings of the Second Imperial College,
Department of Computing, Workshop on Theory and Formal Methods, probably Springer Verlag.
An algebraic view of structural induction
Claudio Hermida* Bart Jacobs**

Abstract
We propose a uniform, category-theoretic account of structural induction for
inductively defined data types. The account is based on the understanding of
inductively defined data types as initial algebras for certain kind of endofunctors
T : B → B on a bicartesian/distributive category B. Regarding a predicate logic as
a fibration p : P → B over B, we consider a logical predicate lifting of T to the total
category P. Then, a predicate is inductive precisely when it carries an algebra
structure for such lifted endofunctor. The validity of the induction principle is
formulated by requiring that the 'truth' predicate functor ⊤ : B → P preserve initial
algebras. We then show that when the fibration admits a comprehension principle,
analogous to the one in set theory, it satisfies the induction principle. We also
consider the appropriate extensions of the above formulation to deal with initiality
(and induction) in arbitrary contexts, i.e. the 'stability' property of the induction
principle.

1. Introduction
Inductively defined data types are understood categorically as initial algebras for 'polynomial' endofunctors T : B → B on a bicartesian/distributive category B, as in [CS91, Jac95].
The category B is the semantic category in which types and (functional) programs are
modelled, e.g. Cpo or Set.
We will show how initiality canonically endows such data types with induction principles to reason about them. Induction is a property of a logic over (the theory) B.
Categorically, such a logic corresponds to a fibration over B, written as p : P → B. P is the category of 'predicates' and
'proofs', over the 'types' and 'terms' of B. When p is endowed with appropriate
structure, intended to model certain logical connectives and quantifiers, P is bicartesian/distributive and p preserves this structure. It is then possible to 'lift' the functor
sian/distributive and p preserves this structure. It is then possible to 'lift' the functor
T to an endofunctor Pred(T) : ~---~IFover T, i.e. pPred(T) = Tp. The key point is that,
given a T-algebra T X ~ ~-X and a predicate P on X , i.e. pP = X, P is inductive,
meaning that it satifies the premise of the structural induction principle for the 'type
structure' T, precisely when it has a Pred(T)-algebra structure P r e d ( T ) P ~ ~ P with
p~ = z. This observation leads to our definition of the induction principle relative to
the fibration p as the preservation of initial algebras by the 'truth predicate' functor
T : B---+I?, which assigns to the object (or 'type') X the 'constantly true' predicate T x .
As for the usual induction princip!e for the natural numbers w in Set, we know it is
valid using the initiality of w with respect to the inductive subset {~ E X I P ( z ) } , de-
termined by the inductive predicate P which we wish to prove. This argument depends
crucially on the fact that we can perform comprehension. In categorical terms, compre-
hension P ~-* {x e X I P(x)} amounts to a right adjoint to T : ~---+P, after [Law70].
* Computer Science Department, Aarhus University, DK-8000 Denmark.
e-mail: chermida@daimi.aau.dk. The author acknowledges funding from the CLICS II ESPRIT project.
** CWI, Kruislaan 413, 1098 SJ Amsterdam, The Netherlands. e-mail: bjacobsr

With our abstract formulation of induction, we will show that when p : P → B admits comprehension in the above sense, the induction principle holds in p, analogously to the above
situation in Set.
This last fact that comprehension entails induction hinges on the fact that adjunctions between B and P induce adjunctions between the associated categories of algebras,
T-Alg and Pred(T)-Alg respectively, assuming some appropriate additional structure.
This is a 2-categorical property, namely the 2-functoriality of (the construction of) inserters: T-Alg is the inserter of T, 1_B : B → B, in the sense of [Kel89]. See Theorems
2.3.1 and 4.0.8 below.
Another important aspect of the present work is the consideration of the (frequently
ignored) 'stability' of the induction principle under context weakening. This means that
we should be able to reason by induction on a given data type not only when such type
is given on its own, but also when it occurs toghether with some other data, which in
turn may be subject to certain hypotheses. Technically, this amounts to the requirement
that initiality of algebras be preserved under addition of indeterminates.
The primary aim of this work is to give a technically precise categorical formulation
of a logical principle, namely structural induction. Such formulation makes the principle
amenable to a purely algebraic manipulation. There are several relevant references in the
literature, particularly [LS81, Pit93]. We would like to emphasise the following points,
which highlight differences between our work and these references:
(i) The understanding of a predicate logic as a fibration is central to the present
work. This provides not only an appropriate level of generality but also the right technical framework. In particular, the relationship between inductive predicates and logical
predicates is best presented in this setting, as logical predicates for type constructors
given by adjoints arise uniformly from an intrinsic property of adjunctions between fibrations, cf. [Her93].
(ii) The categorical framework which we work in takes explicit account of proofs
of entailments between predicates. Thus this work can be seen as a generalisation of
induction principles from the usual proof-irrelevant setting to the type-theoretic (or
constructive) one. See Remark 2.2.1 below.
(iii) 2-categorical reasoning is essential to get conceptually uniform formulations. For
instance, just as inductive datatypes are understood as initial algebras for an endomorphism in Cat, the 2-category of small categories, their associated induction principles are
formulated in terms of (distinguished) initial algebras for endomorphisms in Cat^→. Similarly for stability of data types and their associated induction principles under context
weakening: the former means preservation of initial algebras by addition of indeterminates in Cat while the latter amounts to the same kind of preservation in Fib, the
2-category of fibrations. See below.
Background material on fibrations can be found in [Jac91, Pav90]. Indeterminates
for fibrations, as relevant to this work, are discussed in [HJ93]. Inserters are presented
in [Kel89]; they play a purely technical role here and hence they are not essential to
understand the paper.
The material presented here is essentially an extension of [Her93] combined
with [Jac95]. A follow-up in [HJ95] deals with a dual coinduction principle (which holds
in the presence of quotients) and a mixed induction/coinduction principle for mixed
variance type-constructors, cf. [Pit93].
2. Setting

In this section we lay down the setting required for our formulation of structural induction. In §2.1 we define the kind of endofunctors whose initial algebras are understood as inductively defined datatypes and recall how such initial algebras may be obtained under suitable cocompleteness conditions. In §2.2 we present the basic properties of fibrations required to give a categorical counterpart of a logic suitable to describe structural induction, including the description of logical products and logical coproducts.

2.1. Inductive data types in a bicartesian category

Following [CS91, Jac95], we will consider inductive data types in a bicartesian category B, i.e. a category with finite products and coproducts. Actually, these references assume distributivity in addition, but this is irrelevant until we consider 'stability', that is, the preservation of initial algebras by weakening to arbitrary contexts, in §5.
We write !_A : A → 1 for the unique morphism into the terminal object 1, and
A ← A × B → B   (with projections π_{A,B} and π'_{A,B})
for a product diagram in B, omitting subscripts whenever convenient. Dually, we write
A → A + B ← B   (with injections ι_{A,B} and ι'_{A,B})
for a coproduct diagram.
Clearly, categories like Set and Cpo (with strict continuous functions) are bicartesian.

2.1.1. Inductive data types in a bicartesian category are specified by endofunctors, which give the 'signature' of the type. Given an endofunctor T : B → B, we write T-Alg for the category whose objects, called T-algebras, are pairs (X, x : TX → X), and whose morphisms f : (X, x) → (Y, y) are morphisms f : X → Y in B such that f ∘ x = y ∘ Tf. (T-Alg is the inserter Ins(T, 1_B); see §2.3 below.)
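To fix intuitions, here is a minimal Haskell sketch (an illustration of ours, not part of the paper; it assumes B is modelled by Haskell types and functions): a T-algebra is a carrier together with a structure map, and the initial algebra, when it exists, is the least fixed point of T, with initiality witnessed by the fold.

  type Algebra t x = t x -> x          -- a T-algebra (X, x : T X -> X)

  newtype Fix t = In (t (Fix t))       -- the initial T-algebra, when it exists

  -- initiality: fold alg is the unique algebra morphism (Fix t, In) -> (x, alg),
  -- i.e. fold alg . In = alg . fmap (fold alg)
  fold :: Functor t => Algebra t x -> Fix t -> x
  fold alg (In t) = alg (fmap (fold alg) t)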

2.1.2. DEFINITION. Let B be a distributive category, S a finite set and M : S → B a functor, regarding S as a discrete category.
(i) Let T_M ⊆ |Cat(B, B)| be the least set of endofunctors on B such that
• the identity functor is in T_M;
• for any I ∈ S, the constant functor X ↦ M(I), written K_{M(I)}, is in T_M;
• the constant functors K_0 and K_1 are in T_M;
• if T_1 and T_2 are in T_M, so are T_1 × T_2 and T_1 + T_2, i.e. (X ↦ T_1(X) × T_2(X)) and (X ↦ T_1(X) + T_2(X)) respectively. These operations are the product/coproduct in the functor category Cat(B, B).
(ii) An inductive data type specification in B is given by a functor M : S → B and a functor (T : B → B) ∈ T_M. We write T_M for such a specification and refer to it simply as a polynomial functor.
(iii) A model for a T_M is a T_M-algebra.
(iv) The initial model for a specification T_M is the initial T-algebra.
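As an aside (our illustration, using names from Haskell's base library rather than the paper's notation), the closure properties of the definition correspond to the standard functor combinators: constants, the identity functor, and pointwise products and coproducts. The signatures of the natural numbers and of A-lists from the examples below are then built as follows.

  import Data.Functor.Const    (Const)
  import Data.Functor.Identity (Identity)
  import Data.Functor.Product  (Product)
  import Data.Functor.Sum      (Sum)
  import Data.Void             (Void)

  type K0   = Const Void        -- the constant functor at 0
  type K1   = Const ()          -- the constant functor at 1
  type KM a = Const a           -- the constant functor at a parameter M(I)

  type NatSig    = Sum K1 Identity                     -- T X = 1 + X
  type ListSig a = Sum K1 (Product (KM a) Identity)    -- T_A X = 1 + A x X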

The set S in the above definition is called a parameter set. Its role is to specify, via the functor M : S → B, those objects of B which are parameters for the data type specified. The examples below will make this clear. See [Jac95] for a more general type-theoretic formulation of data types in distributive categories. The initial T-algebra of a functor T : B → B need not exist, but it is possible to guarantee the existence of initial T-algebras under suitable cocompleteness conditions on B and T. As shown in [LS81], an initial T-algebra can be obtained as the colimit of an ω-chain, when T preserves such colimits. An ω-chain is a functor ω → B, where ω is the poset category of natural numbers with their usual ordering. The initial T-algebra is the colimit of the ω-chain
0 → T0 → T²0 → ···
whose first map ι : 0 → T0 is the unique morphism from the initial object and whose subsequent maps are Tι, T²ι, and so on. In Set and Cpo, any T ∈ T_M preserves colimits of ω-chains and therefore any polynomial functor in these categories has an initial model.

2.1.3. An important observation due to Lambek (cf. [LS81] for instance) is that for an initial T-algebra (D, constr : TD → D), constr is an isomorphism. Thus, we can regard D as the 'least fixed point' of T, as illustrated by the above ω-chain. The isomorphism constr provides the 'constructors' of the data type, as the following familiar examples illustrate.

2.1.4. EXAMPLES. Let B be a bicartesian category.
(i) Natural numbers object: Consider the polynomial functor TX = 1 + X, with empty parameter set. A T-algebra (A, [c, f] : TA → A) is given by an object A, the 'carrier' of the type, and morphisms c : 1 → A and f : A → A. An initial model for T is precisely a natural numbers object (N, [z, s]) in Lawvere's sense, see [LS86, Part I]. In Set, it is the set of natural numbers ω, with the usual zero and successor operations. Initiality means that there is an 'iterator', which given c and f as above produces a unique morphism h : N → A such that h ∘ z = c and h ∘ s = f ∘ h. In Set, h corresponds to the function defined from c and f by primitive recursion, given by n ↦ f^n(c). We write it(c, f) for this h.
(ii) Lists: For an object A ∈ |B|, consider the polynomial functor T_A X = 1 + A × X, for a singleton parameter set, i.e. A : {*} → B. A T-algebra is given by an object B and morphisms c : 1 → B and t : A × B → B. An initial model in Set is precisely the set List(A) of finite lists of elements of A, with the usual operations nil : 1 → List(A), the empty list, and cons : A × List(A) → List(A), which given a ∈ A and a list l returns the list with the element a appended to its head.

The example of lists above shows the role of the parameter set S and the functor M : S → B in the specification of a data type: the type of lists List(A) is parameterised by the type A of the elements of the list.
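Continuing the illustration (assumed names Fix, fold from the sketch in 2.1.1), the two examples read as follows in Haskell: the constructors are the components of Lambek's isomorphism, and the iterator it(c, f) is the fold.

  {-# LANGUAGE DeriveFunctor #-}

  data NatF x    = Z | S x          deriving Functor   -- signature 1 + X
  data ListF a x = NilF | ConsF a x deriving Functor   -- signature 1 + A x X

  type Nat    = Fix NatF
  type List a = Fix (ListF a)

  zero :: Nat
  zero = In Z

  suc :: Nat -> Nat
  suc = In . S

  nil :: List a
  nil = In NilF

  cons :: a -> List a -> List a
  cons a l = In (ConsF a l)

  -- the iterator: it(c, f) is the unique h with h zero = c and h . suc = f . h
  iter :: x -> (x -> x) -> Nat -> x
  iter c f = fold alg where
    alg Z     = c
    alg (S x) = f x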

2.2. Logic over a bicartesian category

Given a bicartesian category B in which we model inductive datatypes, we want a categorical formulation of a logic over it, a predicate logic over the 'types' and 'terms' of B, in order to consider induction principles. The proper categorical version of a predicate logic over a category is embodied by the notion of fibration. We refer to [Jac91, Pav90] for an exposition of this point of view.
Thus a predicate logic corresponds to a fibration over B, written p : P → B. P is the category of 'predicates' and 'proofs', over the 'types' and 'terms' of B. This can be made precise via the internal language of a fibration, in the same vein as a cartesian closed category has an associated simply typed λ-calculus as its internal language, cf. [LS86]. Specifically, the fibration p has an associated predicate logic as its internal language: regarding B as a simple type theory, with product and coproduct types (see [Jac95]), an object P of P with pP = X is construed as a predicate, or indexed proposition, on the type X:
x : X ⊢ P(x) Prop
where we have written P(x) to emphasize the dependency on the variable x, although we will usually leave this implicit. A morphism h : P → Q with ph = u : X → Y corresponds to a (unique) vertical morphism h : P → u*(Q), where u*(Q) is the domain of a cartesian lifting of u at Q. In the predicate logic of p, this vertical morphism h corresponds to a proof of the entailment
x : X | a : P(x) ⊢ h : Q(u(x))
where Q(u) is the predicate corresponding to u*(Q); reindexing in the fibration corresponds to substitution in the logic:
(x : X ⊢ u(x) : Y,  y : Y ⊢ Q(y) Prop)  ↦  x : X ⊢ Q(u(x)) Prop
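In a proof-irrelevant, set-level reading (an illustration of ours: predicates over a Haskell type modelled as Boolean-valued functions), reindexing is just precomposition, which makes the correspondence with substitution plain:

  type Pred x = x -> Bool

  -- reindexing u*(Q) along u : X -> Y, i.e. the substituted predicate Q(u(x))
  reindex :: (x -> y) -> Pred y -> Pred x
  reindex u q = q . u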

2.2.1. REMARK. Although we usually omit the 'proof term' h in entailments, the reader should bear in mind that our approach is truly constructive, i.e. it takes proofs into account.

2.2.2. Fibrations are organized into the 2-category Fib, whose objects are fibrations p : E → B. A morphism from p : E → B to q : D → A is a commuting square, i.e. a pair of functors K' : E → D and K : B → A with q ∘ K' = K ∘ p, where K' preserves cartesian morphisms. Given morphisms (K', K), (L', L) : p → q, a 2-cell from (K', K) to (L', L) is a pair of natural transformations (σ' : K' ⇒ L', σ : K ⇒ L) with σ' over σ, i.e. qσ' = σp.
Fib is a sub-2-category of Cat→, whose objects are arbitrary functors p, q, ... and whose morphisms are commuting squares as above (without any preservation properties).

The analysis of structural induction in §3 below depends crucially on the relationship between the 'logical' structure of the fibration p and the categorical structure of the 'total' category P. Specifically, we want to lift an endofunctor T : B → B belonging to T_M to one on P. Since the functors in T_M are essentially those expressible by the bicartesian structure of B, we need the same structure on P. This leads us to consider the following kind of fibrations.

2.2.3. DEFINITION. A bicartesian fibration is a fibration p : P → B over a bicartesian category B, such that P is bicartesian and p strictly preserves this structure.

2.2.4. REMARK. A bicartesian fibration is a bicartesian object in Fib, the 2-category of fibrations described in 2.2.2 above. See [Her93] and the references there for details on such matters.
We assume a choice of cartesian liftings, i.e. we assume the fibration is cloven. Such a choice is always possible if we appeal to the axiom of choice.
2.2.5. EXAMPLES. (i) Classical logic. The fibration corresponding to classical first-order logic is the subobject fibration cod : Sub(Set) → Set. The category Sub(Set) is the category of subobjects: its objects are pairs (S, X), where S ⊆ X, and its morphisms f : (S, X) → (S', X') are functions f : X → X' such that f(S) ⊆ S'. The fibration simply 'forgets' the subsets. Cartesian liftings are given by inverse images:
(f : X → X', (S', X'))  ↦  (f⁻¹(S'), X)
The bicartesian structure of Sub(Set) is described below, in terms of logical predicates.
(ii) Admissible subsets. A related example is the fibration cod : ASub(Cpo) → Cpo, where ASub(Cpo) is the category of admissible subsets: its objects are pairs (S, C) where C is an ω-cpo and S ⊆ C is a subset containing the bottom element and closed under lubs of ω-chains, while its morphisms are the strict continuous functions which respect the subsets, as in the preceding example. The category ASub(Cpo) is bicartesian as it is a reflective subcategory of the fibred category U*(Sub(Set)), obtained from the 'classical logic' fibration by change-of-base along the forgetful functor U : Cpo → Set. See [Her93] for further details.

2.2.1. Logical predicates


In order to convey the logical significance of the bicartesian structure of P we recall, from [Her93], how such finite products and coproducts are induced by the fibred ones and the ones in the base.

2.2.6. PROPOSITION. Given p : P → B with
• B a bicartesian category,
• p a fibred bicartesian category, i.e. every fibre is a bicartesian category and reindexing functors preserve finite products and coproducts,
• p has coreindexing functors along the coproduct injections I → I + J ← J, for every I, J ∈ |B|,
then P is a bicartesian category and p strictly preserves finite products and coproducts.

Proof. Given objects P and Q of P, with pP = I and pQ = J, their product in P is
P ← π*_{I,J}(P) ← π*_{I,J}(P) ×_{I×J} (π'_{I,J})*(Q) → (π'_{I,J})*(Q) → Q
over the product diagram I ← I × J → J, where ×_{I×J} is the product in the fibre P_{I×J}; the outer arrows are the cartesian liftings of the projections and the inner ones are the fibre projections. Dually, their coproduct is
P → (ι_{I,J})_!(P) → (ι_{I,J})_!(P) +_{I+J} (ι'_{I,J})_!(Q) ← (ι'_{I,J})_!(Q) ← Q
where the outer arrows are cocartesian liftings of the injections and the inner ones are the fibre injections. Terminal and initial objects are obtained similarly. □
2.2.7. REMARK. In the internal language of p, the above construction of products reads as follows: given x : I ⊢ P Prop and y : J ⊢ Q Prop, their logical product is
x : I, y : J ⊢ P(x) ∧ Q(y) Prop
and their logical coproduct is
z : I + J ⊢ (∃x : I. ι(x) = z ∧ P(x)) ∨ (∃y : J. ι'(y) = z ∧ Q(y)) Prop
that is, a predicate over I + J defined 'by cases'. This last expression of the coproduct relies on the presence of an equality predicate, satisfying certain exactness conditions, commonly satisfied (see [Law70]). Actually, such additional structure on a fibration is irrelevant for our arguments; the above description is given only to emphasise the logical significance of the coproduct in P. The relationship between categorical structure on P and logical predicates is further analysed in [Her93].

For example, the fibration cod : Sub(Set) → Set satisfies the hypothesis of Proposition 2.2.6. The fibred products and coproducts are given by intersection and union, respectively. It has cocartesian liftings along arbitrary morphisms: given S ⊆ X and f : X → X', the lifting is the direct image f(S) ⊆ X'.
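For instance (our illustration, reusing Pred from the sketch in §2.2 and staying proof-irrelevant), in the subobject fibration over Set the logical product of Remark 2.2.7 lives over I × J and the logical coproduct over I + J, defined 'by cases' along the injections:

  logProd :: Pred i -> Pred j -> Pred (i, j)
  logProd p q (x, y) = p x && q y

  logCoprod :: Pred i -> Pred j -> Pred (Either i j)
  logCoprod p _ (Left x)  = p x
  logCoprod _ q (Right y) = q y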

2.3. Adjunctions between categories of algebras

We present the main technical tool we need to deal with adjunctions between categories of algebras, induced by adjunctions between the base categories. The result holds in any 2-category which admits inserters, as categories of algebras are an instance of such limits, cf. [CS91]: for a given T : A → A, T-Alg ≅ Ins(T, 1_A), the inserter of T and the identity functor on A.

2.3.1. THEOREM. Given, in a 2-category K, endomorphisms t : A → A and t' : B → B, a morphism f : A → B and a 2-cell α : t'f ⇒ ft, in which α is an isomorphism and f has a right adjoint, η, ε : f ⊣ g, the adjoint mate of α⁻¹, i.e. θ = gt'ε ∘ gα⁻¹g ∘ ηtg : tg ⇒ gt', induces a morphism g-Alg : t'-Alg → t-Alg
(B, b : t'B → B)  ↦  (gB, g(b) ∘ θ_B : t(gB) → gB)
right adjoint to the morphism f-Alg : t-Alg → t'-Alg induced by the above data, i.e.
(A, a : tA → A)  ↦  (fA, f(a) ∘ α_A : t'(fA) → fA)

3. Induction principle for inductive data types relative to a fibration

3.0.2. Given a set of parameters S and functors M : S → B and M' : S → P such that p ∘ M' = M, a polynomial functor T_M : B → B induces a polynomial functor Pred(T)_{M'} : P → P fibred over T, using the bicartesian structure of P. The formal definition of Pred(T)_{M'} proceeds by induction on the construction of T ∈ T_M. For instance, given P ∈ |P_A|, T_A X = 1 + A × X induces Pred(T)Y = ⊤_1 + P × Y. We can then consider Pred(T)-algebras and initial models in P. We call Pred(T)_{M'} the logical-predicate lifting of T_M. We thus get an endomorphism (Pred(T)_{M'}, T_M) : p → p in Cat→.
In a bicartesian fibration, the fibred terminal object (the 'truth predicate') is given by a functor ⊤ : B → P which is a (fibred) right adjoint to p : P → B. Such a fibred terminal object is used to give a notion of provability in the 'logic' p. A 'predicate' P with pP = I is provable when there exists a morphism h : ⊤_I → P in the fibre P_I. In the internal language of p, cf. §2.2, this amounts to a proof of the entailment
x : I | a : ⊤_I ⊢ h : P(x)
We usually omit a : ⊤_I on the left-hand side of a sequent.

3.0.3. Given a polynomial functor T_M in B, with M : S → B as in Def. 2.1.2, we can consider the logical-predicate lifting of T using the functor ⊤ ∘ M : S → P. We write Pred(T) : P → P for the functor Pred(T)_{⊤∘M} so obtained. We thus have an endomorphism (Pred(T), T) : p → p in Cat→, and, writing (1, 1) for the identity on p, we can consider the inserter Ins((Pred(T), T), (1, 1)) in Cat→. We write p-Alg : Pred(T)-Alg → T-Alg for the fibration so obtained. Here Pred(T)-Alg is the category of Pred(T)-algebras of the endofunctor Pred(T) on P, in agreement with our convention in 2.1.1. Furthermore, the adjunction p ⊣ ⊤ : B → P induces an adjunction p-Alg ⊣ ⊤-Alg : T-Alg → Pred(T)-Alg, by Theorem 2.3.1. In elementary terms, the functor ⊤-Alg : T-Alg → Pred(T)-Alg acts as follows:
(I, i : TI → I)  ↦  (⊤_I, ⊤(i) ∘ ρ_I : Pred(T)⊤_I → ⊤_I)
where ρ_I : Pred(T)⊤_I → ⊤_{TI} is the unique vertical morphism into ⊤_{TI}, which is terminal in the fibre P_{TI}; the ρ_I are the components of a canonical 2-cell ρ : Pred(T)⊤ ⇒ ⊤T.
Since the functor p-Alg : Pred(T)-Alg → T-Alg has a right adjoint, it preserves initial algebras. Hence, if Pred(T)-Alg has an initial algebra, we may assume it lies over the initial algebra in T-Alg.
We are now in a position to state our main definition.

3.0.4. DEFINITION (Induction principle in a fibration). Let p : P → B be a bicartesian fibration, and let T_M : B → B be a polynomial functor, for a parameter set S and a functor M : S → B. The fibration p satisfies the induction principle w.r.t. T if the functor ⊤-Alg : T-Alg → Pred(T)-Alg preserves initial algebras, i.e. whenever (D, constr) is an initial T-model, ⊤-Alg(D, constr) is an initial Pred(T)-model.

This definition means that for an object P in P, in order to give a morphism f : ⊤_D → P it is sufficient to endow P with a Pred(T)-algebra structure (P, h : Pred(T)P → P).
Note that if P is a predicate over D, the condition is also necessary, as a morphism f : ⊤_D → P gives a Pred(T)-algebra with structure map the composite
f ∘ ⊤(constr) ∘ ! : Pred(T)P → ⊤_{TD} → ⊤_D → P
For the more general case of the definition, the condition is also necessary if we assume B has image factorisation for T-algebras, e.g. when B = Set.
We illustrate the logical import of the above definition with the polynomial functors of natural numbers and lists below. We assume the bicartesian structure of P is obtained as in Proposition 2.2.6. The internal language of p in this case includes the logical connectives {∧, ⊤, ∨, ⊥} and the coreindexing functors along coproduct injections. To simplify the presentation, we consider only the entailment relation ⊢ in the internal language, disregarding the proof terms. Note that for ι : I → I + J in B, given predicates x : I ⊢ Q(x) Prop and y : I + J ⊢ P(y) Prop, a morphism f : ι_!(Q) → P corresponds under the adjunction ι_! ⊣ ι* to a morphism f' : Q → ι*(P), which amounts to an entailment
x : I | Q(x) ⊢ P(ιx).

3.0.5. EXAMPLES. Let p be as in Proposition 2.2.6.
(i) For the polynomial functor TX = 1 + X in B, the corresponding polynomial functor Pred(T) in P is Pred(T)H = ⊤_1 + H. Let P ∈ |P_I| and let (N, [z, s]) be the initial T-model in B. To give a Pred(T)-algebra (P, f : Pred(T)P → P) amounts to giving a T-algebra (I, [a, m] : TI → I) (which induces a morphism it(a, m) : N → I) and a vertical morphism f : Pred(T)P → [a, m]*(P). Let us examine this vertical morphism in the internal language of p: it corresponds to a sequent
x : 1 + I | ι_!(⊤_1) ∨ ι'_!(P) ⊢ P([a, m]x)
which can be decomposed into two sequents
x : 1 + I | ι_!(⊤_1) ⊢ P([a, m]x)    and    x : 1 + I | ι'_!(P) ⊢ P([a, m]x)
which in turn correspond to the sequents
x' : 1 | ⊤_1 ⊢ P(a)    and    y : I | P(y) ⊢ P(my)
The above corresponds to the usual induction principle on the natural numbers: to prove P(x) for the elements x : I generated by a and m, we must prove P(a) and P(y) ⊢ P(my). The validity of the induction principle in p then asserts the existence (and uniqueness) of a morphism it(f) : ⊤_N → P over it(a, m), which is the desired proof of the previously mentioned 'validity' of P in the image of it(a, m).
(ii) For the polynomial functor T_A X = 1 + A × X, for some A ∈ |B|, we get the polynomial functor Pred(T)Y = ⊤_1 + ⊤_A × Y. Let (L, [nil, cons]) be the initial T-model and let P ∈ |P_L|. Note that modulo the isomorphism [nil, cons] : 1 + A × L → L, the predicate P corresponds to a predicate P' on 1 + A × L, i.e. x : 1 + A × L ⊢ P'(x). The predicate P' therefore determines two predicates S and Q, with x' : 1 ⊢ S ↔ P'(nil) and a : A, l : L ⊢ Q(a, l) ↔ P'(cons(a, l)). To give a vertical global element h : ⊤_L → P, a proof of the property P for all lists, amounts to giving a morphism k : Pred(T)P → P over [nil, cons] : 1 + A × L → L. It corresponds to a sequent
x : 1 + A × L | ι_!(⊤_1) ∨ ι'_!(⊤_A × P) ⊢ P'([nil, cons]x)
which can be decomposed into two sequents
y : 1 | ⊤_1 ⊢ S    and    a : A, l : L | P(l) ⊢ Q(a, l)
where we have simplified the antecedent of the second sequent using ⊤_A ∧ P(l) ↔ P(l). We thus get the usual structural induction principle for finite lists.
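In the proof-irrelevant Haskell reading used in the earlier sketches (assumed names Nat, List, iter, fold, NatF, ListF), a Pred(T)-algebra structure on a predicate is exactly a base case together with an induction step, and the induction principle computes the predicate at every element by folding that evidence:

  inducNat :: Bool             -- evidence for P(zero)
           -> (Bool -> Bool)   -- evidence that P(y) entails P(suc y)
           -> Nat -> Bool
  inducNat base step = iter base step

  inducList :: Bool                 -- evidence for P(nil)
            -> (a -> Bool -> Bool)  -- evidence that a : A and P(l) entail P(cons a l)
            -> List a -> Bool
  inducList base step = fold alg where
    alg NilF         = base
    alg (ConsF a ih) = step a ih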

4. Validity of the induction principle in the presence of comprehension
We now set out to show that, as in ordinary set theory, if the logic admits comprehension then the induction principle is valid in it. First, let us make an important remark.

4.0.6. REMARK. Since the structure map constr : TD → D of an initial algebra (D, constr) is an isomorphism, cf. 2.1.3, it follows from Definition 3.0.4 that if p satisfies the induction principle, the following condition must hold:
ρ_D : Pred(T)⊤_D → ⊤_{TD} is an isomorphism.    (1)
Notice that the above morphism is the instance at D of the 2-cell ρ : Pred(T)⊤ ⇒ ⊤T from 3.0.3. Given that T ∈ T_M, the condition that ρ be an isomorphism amounts to requiring:
• For 0, the initial object of B, the initial object in P_0 is terminal, that is, the fibre P_0 is the terminal category 1.
• For any pair of objects I, J of B, (ι_{I,J})_!(⊤_I) +_{I+J} (ι'_{I,J})_!(⊤_J) ≅ ⊤_{I+J}.
This last condition essentially means that the union of the images of the coproduct injections 'covers' the object I + J. We note that these conditions are satisfied, for instance, when
• we consider internal logic fibrations, i.e. fibrations in which the predicates are subobjects of the base category, in which coproduct injections are monic;
• more generally, in the presence of comprehension, as ⊤ preserves coproducts because it has a right adjoint;
• we consider the logic relative to a stable factorisation system, as in [Pav93], where predicates are interpreted as (equivalence classes of) formal monos.
From now on we will assume condition (1) is satisfied. We recall from [Jac91] the definition of comprehension in a fibration (which is essentially the same as that given in [Law70] for hyperdoctrines).

4.0.7. DEFINITION. A fibration p : P → B with a fibred terminal object ⊤ : B → P admits comprehension if ⊤ has a right adjoint, ⊤ ⊣ {_}. For an object P over X, we write {X | P} for the value of {_} at P.
The above definition means that, given a morphism f : Y → X in B and a predicate P ∈ |P_X|, P(f) is provable iff the 'image' of f lies in {X | P}. In Set, comprehension is the usual operation P ↦ {x ∈ X | P(x)}. Clearly, the fibrations of Examples 2.2.5 admit comprehension; in the second case, notice that an admissible subset of a cpo is itself a cpo.
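In Set (a small illustration of ours, with a finite carrier modelled as a Haskell list and Pred as in the earlier sketches), comprehension is exactly filtering, and its defining property reads: a map f : Y → X lands in {X | P} iff P(f y) holds for every y.

  comprehension :: Pred x -> [x] -> [x]        -- P |-> { x in X | P(x) }
  comprehension = filter

  landsIn :: (y -> x) -> Pred x -> [y] -> Bool -- is the image of f contained in {X | P}?
  landsIn f p = all (p . f)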
4.0.8. THEOREM. Let p : P → B be a bicartesian fibration which satisfies condition (1) and admits comprehension. Then p satisfies the induction principle w.r.t. every polynomial endofunctor on B.

Proof. Condition (1) and ⊤ ⊣ {_} give data satisfying the hypothesis of Theorem 2.3.1. We then conclude that ⊤-Alg has a right adjoint {_}-Alg and therefore preserves initial objects. □

The import of the above theorem is that for a polynomial functor T_M, the functor {_}-Alg turns a Pred(T)-algebra on a predicate P into a T-algebra on the 'extent' of the predicate P. This is the essential role comprehension plays in showing the validity of the induction principle in Set: given a predicate P on the natural numbers ω which is inductive, we use the initiality of ω to conclude that the (inductive) subset {n ∈ ω | P(n)} must be the whole of ω, and thus the predicate P is (provably) true.

5. Stability of initial algebras under weakening of context

So far we have considered inductive data types and their associated induction principle in terms of initiality in the empty context. For instance, the initiality of N allows us to define functions out of it, e.g. h : N → X, by endowing the set X with a 1 + (_)-algebra structure. But we also want to use this method when the inductive data type occurs in an arbitrary context, e.g. to define addition add : N × N → N by induction on the second argument. This requires that the initiality of N be preserved when we move from the empty context to the context n : N (for the first argument of add). This operation is called context weakening. Technically, we say initiality is stable under addition of indeterminates, the indeterminate being n : N.
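In the Haskell sketch of §2.1 (assumed names Nat, suc, iter), this is the familiar definition of addition by iteration on the second argument, with the first argument n supplied as a parameter by the ambient context:

  add :: Nat -> Nat -> Nat
  add n = iter n suc     -- add n zero = n;  add n (suc m) = suc (add n m)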
A similar extension is needed for the associated induction principle, since when we perform context weakening Γ ↦ Γ, x : I the element x may be subject to some (propositional) hypothesis. That is, we are generally interested in proving relative entailments P ⊢ Q rather than 'absolute' assertions ⊤ ⊢ Q. For instance, we may want to prove n : N, m : N | p : Even(m) ⊢ q : Even(add(2·n, m)) for some q, in which case we use induction on n, with m : N and p : Even(m) as parameters.
Abstractly, both extensions are instances of the same phenomenon: let K be a 2-category with finite products and inserters and let A be an object of K with a 'terminal object' ! ⊣ 1 : 1 → A. Given any global element I : 1 → A, we can consider the 'object A with an indeterminate element x : 1 → I', A[x : I]. This object is equipped with η_I : A → A[x : I] and a 2-cell x : η_I 1 ⇒ η_I I, and is universal among objects with such data. Given an endomorphism T : A → A, we can consider the 'object of T-algebras' T-Alg, namely the inserter of T and the identity on A. Similarly, since T : A → A induces T[x : I] : A[x : I] → A[x : I] with T[x : I] ∘ η_I = η_I ∘ T, we can consider the object T[x : I]-Alg and the induced morphism η_I-Alg : T-Alg → T[x : I]-Alg. Stability means that η_I-Alg preserves 'initial objects', for every I : 1 → A. It follows from Theorem 2.3.1 that stability is guaranteed whenever the object A is functionally complete, i.e. when η_I has a right adjoint. We spell this out in more detail for categories and fibrations in the following subsections. Further details on indeterminates and functional completeness can be found in [HJ93]. We refer to [Str72] for the relevant definitions of comonads and their associated morphisms, as well as Kleisli objects for them in a 2-category; these concepts are not essential to understand what follows.
5.1. Stability of initial algebras in a distributive category

The material in this subsection is based on [Jac95], although the formulations and proofs are different. It is just a preliminary to the treatment of stability of induction principles in §5.2.
Given a bicartesian category B and an object I, B[x : I] denotes the universal bicartesian category η_I : B → B[x : I] which has a global element of 'type' I, i.e. a morphism x : 1 → η_I I. Universality means (at the 1-dimensional level) that given a bicartesian category C, a functor F : B → C preserving finite products and coproducts, and a morphism a : F1 → FI, there is a unique functor F' : B[x : I] → C preserving finite products and coproducts such that F' ∘ η_I = F and F'(x) = a.
The category B[x : I] can be characterised as the Kleisli category of the comonad (_) × I, written B//I, when B is distributive, i.e. when _ × I preserves finite coproducts.
Logically, we think of B[x : I] as the theory with the same types as B, whose terms have a 'parameter' of type I, i.e. they are terms of the form Γ, x : I ⊢ t : J in B. This interpretation is obtained by considering the internal language of the Kleisli category of the comonad _ × I on B.
A functor T : B → B lifts to a functor T//I : B//I → B//I such that (T//I) ∘ η_I = η_I ∘ T, whenever it is endowed with appropriate additional structure. Technically, this structure is exactly what makes T a morphism of comonads; it is essentially the same as requiring T to be strong, although this latter formulation often leads to misleading considerations of enrichment. More specifically, we require a natural transformation θ : T(_) × I ⇒ T(_ × I) satisfying, for every object J of B, the coherence conditions
T(π_{J,I}) ∘ θ_J = π_{TJ,I}    and    θ_{J×I} ∘ (θ_J × I) ∘ ⟨id, π'_{TJ,I}⟩ = T⟨id, π'_{J,I}⟩ ∘ θ_J
Every polynomial functor T admits such structure and hence can be lifted to B//I.

5.1.1. DEFINITION. Given a distributive category B and a polynomial functor T : B → B, B admits stable initial T-algebras if it admits an initial T-algebra and, for every object I, the functor (η_I)-Alg : T-Alg → (T//I)-Alg preserves initial objects.

We recall from [HJ93] that B is functionally complete if for every object I the functor η_I : B → B[x : I] has a right adjoint. For B bicartesian this is the case precisely when it is (bi)cartesian closed. As an easy consequence of Theorem 2.3.1 we have the following result.

5.1.2. PROPOSITION. Let B be a functionally complete distributive category (or, equivalently, a bicartesian closed one). Whenever B has initial T-algebras, they are stable.

5.2. Stability of initial algebras in a distributive fibration

Just as we require inductive data types to be stable under addition of indeterminates in order to use their initial algebra property in an arbitrary context, we must require an analogous stability of their associated induction principles. In order to express such stability we consider, for a given fibration (logic), an associated one with 'parameters' both in the base and in the total category.

5.2.1. REMARK. Although the treatment of indeterminates for fibrations to follow parallels that for categories in §5.1, there is a subtle technical difference. All the concepts previously defined by universal properties in Cat should be considered in their bicategorical variants in Fib, i.e. up to equivalence rather than up to isomorphism. This is because the pseudo-functorial nature of the cleavages of fibrations allows only the existence of the bicategorical cocompleteness properties required (e.g. Kleisli objects), rather than the 2-categorical versions previously mentioned. The strict 2-categorical version does apply if we restrict attention to split fibrations and splitting-preserving morphisms.

Given a bicartesian fibration p : P → B and an object P of P, the fibration with an indeterminate of P, written p[(x, h) : P] : P//(P) → B[x : pP], is the universal fibration morphism (η', η) : p → p[(x, h) : P] equipped with a global element x : 1 → η(pP) in B[x : pP] and a global element h : ⊤_1 → x*(η'(P)) in the fibre of P//(P) over 1. Universality means that given a bicartesian fibration q, a morphism (H, K) : p → q preserving finite products and coproducts, and global elements a : K1 → K(pP) and b : H⊤_1 → a*(HP), there is a unique (up to isomorphism) morphism (H', K') : p[(x, h) : P] → q preserving finite products and coproducts such that
(H', K') ∘ (η', η) ≅ (H, K),    K'x = a,    φ ∘ H'h = b
where φ : H'(x*(η'(P))) → (K'x)*(HP) is the canonical comparison isomorphism in the fibration q.
It is easy to extend Proposition 2.2.6 to make P a distributive category when the base and the fibres are so and when the coreindexing functors satisfy the Beck-Chevalley condition and Frobenius reciprocity, as formulated in [Law70]. We call such a p a distributive fibration. In this case, we can characterise p[(x, h) : P] as a Kleisli fibration p//(P) for the comonad ((_ × P), (_ × pP)) on p (in Fib), cf. [HJ93].
Logically, we think of the fibration p[(x, h) : P] as a logic with the same types and propositions as those of p, but whose terms have a 'parameter' of type pP, i.e. are of the form Γ, x : pP ⊢ t : J, and whose entailment relation allows an additional hypothesis P(x), i.e. the entailments have the form
Γ, x : pP | Θ, h : P(x) ⊢ q : Q(x)


That is, we are assuming the presence of an additional element x of type pP, and a predicate of that type whose instance at x is provably true. Both these elements represent the additional data, with their associated properties, forming the context in which we are working, for instance when carrying out an inductive proof. Semantically, such an interpretation of p[(x, h) : P] can be obtained via the internal language of the Kleisli fibration of the comonad ((_ × P), (_ × pP)) on p.
A polynomial morphism (Pred(T), T) : p → p as considered in §3 induces an endomorphism (Pred(T)[h : P], T[x : pP]) : p[(x, h) : P] → p[(x, h) : P] such that
(Pred(T)[h : P], T[x : pP]) ∘ (η', η) ≅ (η', η) ∘ (Pred(T), T)
So we get a morphism (η', η)-Alg : (Pred(T), T)-Alg → (Pred(T)[h : P], T[x : pP])-Alg,


where for an endomorphism (H, K) : p → p in Cat→, with p a fibration, (H, K)-Alg is the fibration obtained as the inserter of (H, K) and the identity on p; its base category is K-Alg and its total one is H-Alg. Now we can formulate stability of the induction principle for an inductive data type.

5.2.2. DEFINITION. Given a polynomial functor T : B → B, a distributive fibration p satisfies the stable induction principle w.r.t. T if ⊤-Alg : T-Alg → Pred(T)-Alg preserves initial algebras and moreover, for every P ∈ |P|, the morphism
(η', η)-Alg : (Pred(T), T)-Alg → (Pred(T)[h : P], T[x : pP])-Alg
preserves initial algebras (both in the base and in the total categories).

5.2.3. REMARK. The above definition could equivalently be expressed by requiring that every fibration with an indeterminate P, p[(x, h) : P], satisfy the induction principle w.r.t. the induced morphism (Pred(T)[h : P], T[x : pP]) : p[(x, h) : P] → p[(x, h) : P], provided the base category B admits stable initial algebras. This makes logical sense, as we want to reason by induction in the fibration p[(x, h) : P], which has an indeterminate of type pP satisfying the hypothesis P; this is exactly what the above formulation means.

In analogy to ordinary categories, we say that the fibration p is functionally complete when, for every P ∈ |P|, (η', η) : p → p[(x, h) : P] has a right adjoint (in Fib). This holds for instance when p admits (or models) universal quantifiers ∀ and implication ⇒ (as a model of first-order logic). Then we can apply Theorem 2.3.1 to show the following.

5.2.4. THEOREM. If p is a functionally complete distributive fibration which satisfies the induction principle w.r.t. a polynomial endofunctor T, then it satisfies the stable induction principle w.r.t. T.

The fibrations of Examples 2.2.5 are functionally complete: cod : Sub(Set) → Set is so because it models ∀ and ⇒, while cod : ASub(Cpo) → Cpo is functionally complete although it does not model ⇒. Thus the above abstract formulation seems to capture this kind of example better than a purely syntactic approach would. Functional completeness (at the level of the logic) is implicitly used in [LS86] to show the validity of induction over the natural numbers in a topos.

6. Conclusions and further work

Our aim was to give a precise abstract account of structural induction over data types, presenting the relevant technical machinery. A pay-off of this account is the precise relationship between logical predicates and induction. This relationship is further elucidated in a sequel to the present paper [HJ95], where we give an account of coinduction principles along the same lines as those for induction here. In that case, the 'equality predicate' functor takes over the role of ⊤, and the fact that such a functor preserves the relevant structure becomes (an instance of) Reynolds' 'identity extension lemma' [MR91]. There are also some considerations as to the extent to which the present approach can cope with bifunctoriality, in order to obtain (co)induction principles for recursive data types, in line with the domain-theoretic account in [Pit93].
We should mention that the approach here can be applied to formulate induction principles for data types with equational constraints (a standard kind of algebraic specification). The categorical aspects of such data types are described in [Jac95]. Briefly put, such data types are described by so-called distributive signatures (Σ, E), and their models correspond to distributive functors M : Cl(Σ, E) → B, where B is a distributive category and Cl(Σ, E) is the classifying category associated with the signature. A 'logical predicate' over such a model is then a distributive functor Pred(M) : Cl(Σ, E) → P with
p ∘ Pred(M) = M, cf. [Her93]. Induction can be stated by requiring that ⊤M : Cl(Σ, E) → P preserve initial models. This is the case when p admits comprehension. Furthermore, in [Jac95] parametrized specifications correspond to morphisms φ : (Σ_0, E_0) → (Σ, E), which semantically are interpreted as the functor sending a model M : Cl(Σ_0, E_0) → B to the (distributive) left Kan extension along Cl(φ) : Cl(Σ_0, E_0) → Cl(Σ, E). At the logical level, we expect a similar action on logical predicates as a suitable counterpart of induction for parametrized specifications. Once again, postcomposition with ⊤ preserves (distributive) left Kan extensions in the presence of comprehension. This generalisation is a pay-off of the 2-categorical approach taken here. Of course, this topic requires further investigation to assess its suitability for applications in program development.
Further development of the ideas in this paper should account for some semantic features missing in the present treatment, notably partiality and type dependency.

References
[CS91] J.R.B. Cockett and D. Spencer. Strong categorical datatypes I. In Proceedings Category Theory 1991. Canadian Mathematical Society, 1991.
[Her93] C. Hermida. Fibrations, logical predicates and indeterminates. PhD thesis, University of Edinburgh, 1993. Tech. Report ECS-LFCS-93-277. Also available as Aarhus Univ. DAIMI Tech. Report PB-462.
[HJ93] C. Hermida and B. Jacobs. Fibrations with indeterminates: Contextual and functional completeness for polymorphic lambda calculi. In Book of Abstracts of Category Theory in Computer Science 5, September 1993. Extended version to appear in Mathematical Structures in Computer Science.
[HJ95] C. Hermida and B. Jacobs. Induction and coinduction via subset types and quotient types. Presented at the CLICS/TYPES workshop, Göteborg, January 1995.
[Jac91] B. Jacobs. Categorical Type Theory. PhD thesis, Nijmegen, 1991.
[Jac95] B. Jacobs. Parameters and parameterization in specification using distributive categories. Fundamenta Informaticae, to appear, 1995.
[Kel89] G.M. Kelly. Elementary observations on 2-categorical limits. Bulletin of the Australian Mathematical Society, 39:301-317, 1989.
[Law70] F.W. Lawvere. Equality in hyperdoctrines and comprehension scheme as an adjoint functor. In A. Heller, editor, Applications of Categorical Algebra. AMS, Providence, 1970.
[LS81] D. Lehmann and M. Smyth. Algebraic specification of data types: A synthetic approach. Math. Systems Theory, 14:97-139, 1981.
[LS86] J. Lambek and P.J. Scott. Introduction to Higher-Order Categorical Logic, volume 7 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, 1986.
[MR91] Q. Ma and J.C. Reynolds. Types, abstraction and parametric polymorphism 2. In S. Brookes, editor, Math. Found. of Prog. Lang. Sem., volume 589 of Lecture Notes in Computer Science, pages 1-40. Springer Verlag, 1991.
[Pav90] D. Pavlović. Predicates and Fibrations. PhD thesis, University of Utrecht, 1990.
[Pav93] D. Pavlović. Maps I: relative to a factorisation system. Draft, Dept. of Math. and Stat., McGill University, 1993.
[Pit93] A. Pitts. Relational properties of recursively defined domains. Tech. Report TR321, Cambridge Computing Laboratory, 1993.
[Str72] R. Street. The formal theory of monads. Journal of Pure and Applied Algebra, 2:149-168, 1972.
[Str73] R. Street. Fibrations and Yoneda's lemma in a 2-category. In Category Seminar, volume 420 of Lecture Notes in Mathematics. Springer Verlag, 1973.
On the Interpretation of Type Theory in Locally
Cartesian Closed Categories

Martin Hofmann*

Department of Computer Science, University of Edinburgh
JCMB, KB, Mayfield Rd., Edinburgh EH9 3JZ, Scotland
* The author is supported by a European Union HCM fellowship; contract number ERBCHBICT930420.

Abstract. We show how to construct a model of dependent type theory (category with attributes) from a locally cartesian closed category (lccc). This allows us to define a semantic function interpreting the syntax of type theory in an lccc. We sketch an application which gives rise to an interpretation of extensional type theory in intensional type theory.

1 Introduction and Motivation

Interpreting dependent type theory in locally cartesian closed categories (lcccs), and more generally in (non-split) fibrational models like the ones described in [7], is an intricate problem. The reason is that in order to interpret terms associated with substitution, like pairing for Σ-types or application for Π-types, one needs a semantic equivalent of syntactic substitution. To clarify the issue let us have a look at the "naive" approach described in Seely's seminal paper [14], which contains a subtle inaccuracy.
Assume some dependently typed calculus like the one defined in [10] and an lccc C (a category with finite limits and right adjoints to every pullback functor, in order to interpret dependent product types).
The idea is to interpret contexts as objects in C, types in context Γ as morphisms with codomain the interpretation of Γ, and terms as sections (right inverses) of the interpretations of their types. Now the empty context gets interpreted as the terminal object and a context Γ, x : σ gets interpreted as the domain of the interpretation of Γ ⊢ σ type. A Σ-type Σx : σ.τ in context Γ gets interpreted as the composition s ∘ t, where s is the interpretation of σ and t is the interpretation of τ in context Γ, x : σ. This "typechecks" because the codomain of t is the interpretation of Γ, x : σ, which is the domain of the interpretation of σ. The problem appears when we try to interpret pairing. Assume Γ ⊢ M : σ is a term of type σ and Γ ⊢ N : τ[x := M] is a term of type τ with x replaced by M. We want to interpret their pairing Γ ⊢ (M, N) : Σx : σ.τ. Let m and n be the interpretations of the former and the latter. The morphism m is a section of s and n is a section of the interpretation of τ[x := M], which a priori has nothing to do with t, the interpretation of τ. Seely argues that substitution
should be interpreted as a pullback, so that the interpretation of τ[x := M] becomes the pullback of t along m. One might then interpret the pair (M, N) as the composition m' ∘ n, where m' is the upper arrow of this pullback.
The subtle flaw of this idea is that the interpretation of τ[x := M] is already fixed by the clauses of the interpretation, and there is no reason why it should equal the chosen pullback of t along m.
Curien [5] addresses the problem by making substitution a syntactic operator which may then be interpreted as (chosen) pullback. However, this changes the calculus and also results in a quite complicated interpretation function, for, as explained in [5], type equality must be modelled by isomorphism instead of actual semantic equality.
On the other hand, interpretation of type theory is relatively straightforward if one has a model equipped with a semantic substitution operation which commutes with composition and all semantic type and term formers. In this case one can show that syntactic and semantic substitution do agree. The technique of interpreting type theory in such a model has been worked out by Streicher [15] in great detail. See also Pitts' forthcoming survey article [12].
Unfortunately, however, it seems impossible to endow an arbitrary lccc with a pullback operation which would satisfy these coherence requirements. For example, the natural choice of pullbacks in the category of sets does not work. Indeed, if f : A → B, g : B → C, and h : D → C are set-theoretic functions then (according to the canonical choice) the pullback of h along g ∘ f is the set {(a, d) | a ∈ A ∧ d ∈ D ∧ g(f(a)) = h(d)}, whereas the iterated pullback of h first along g and then along f gives the set {(a, (b, d)) | a ∈ A ∧ b ∈ B ∧ d ∈ D ∧ f(a) = b ∧ g(b) = h(d)}, which is equipollent, but not equal, to the former. It seems to be open whether there exists another choice of pullbacks in the category of sets which commute with composition (and the type formers).
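A concrete way to see this failure (our illustration, with finite sets modelled as Haskell lists): the canonical pullback construction below returns pairs (a, d) when h is pulled back along g ∘ f in one step, but elements of the shape (a, (b, d)) when the pullback is iterated, so the two results are merely in bijection, not equal.

  pullback :: Eq c => [a] -> [d] -> (a -> c) -> (d -> c) -> [(a, d)]
  pullback as ds f h = [ (a, d) | a <- as, d <- ds, f a == h d ]

  -- pullback of h along the composite g . f, in one step
  onePull :: Eq c => [a] -> [d] -> (a -> b) -> (b -> c) -> (d -> c) -> [(a, d)]
  onePull as ds f g h = pullback as ds (g . f) h

  -- pullback of h first along g, then along f: note the different element type
  twoPulls :: (Eq b, Eq c)
           => [a] -> [b] -> [d] -> (a -> b) -> (b -> c) -> (d -> c) -> [(a, (b, d))]
  twoPulls as bs ds f g h = pullback as (pullback bs ds g h) f fst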
In this paper we propose another solution under which a type is not merely interpreted as a morphism, but as a whole family of morphisms indexed over possible substitutions. More abstractly, we describe a construction which turns an arbitrary lccc into an equivalent category with attributes (cwa), a "split" notion of model introduced by Cartmell [4], see also [12], for which an interpretation function is readily available. The method we use is a very general procedure due to Bénabou (see [2] and [7, Prop. 1.3.6]) which turns an arbitrary fibration into an equivalent split fibration. Our contribution consists of the observation that the cwa obtained thus has not merely a split substitution operation, but is closed under all type formers the original lccc supported. In particular the resulting cwa has Π-types, Σ-types, and (extensional) identity types. Phoa [11, p. 14] has considered this as an open problem. Locally cartesian closed categories play the role of a running example here; the arguments immediately carry over to the more general notions of model studied by Jacobs [7, 8] and other authors.
On a more elementary level the method computes additional information along with the inductive definition of the interpretation which allows us to identify the interpretation of a substituted type τ[x := M] as a pullback of the interpretation of τ, albeit not the previously chosen one.
In the next section we define categories with attributes and sketch the standard interpretation function. Section 3 contains the main result, the construction of a cwa out of an lccc. In Section 5 we give an extension to universes which, however, does not handle the most general case. For many lcccs arising in the semantics of type theory, in particular sets and ω-sets and all toposes, a natural equivalent cwa is already known. For the case of toposes see [7, Ex. 4.3.5]. In Section 6 we give an example where this is not the case and thus provide an application of the main result. Section 7 offers some concluding remarks and sketches an alternative construction of an equivalent split fibration due to Power which does not extend to Π- and Σ-types.
Some familiarity with basic category theory and dependent type theory will be assumed. Introductory material may be found in [1] (categories) and [10] (dependent type theory). Both subjects are also well described in [12].

2 Categories with attributes

A category with attributes (cwa) is given by the following data:
- a category C with terminal object 1. The unique morphism from an object Γ into 1 is written !_Γ;
- a functor Fam : C^op → Sets with morphism part written Fam(f)(σ), abbreviated σ[f]. More elementarily, this means that Fam(Γ) is a set for each Γ ∈ Ob(C), and if σ ∈ Fam(Γ) and f : B → Γ then σ[f] ∈ Fam(B) and the two coherence conditions
σ[id_Γ] = σ
and
σ[f ∘ g] = σ[f][g]
for g : A → B are satisfied;
- an operation p(−) which to each σ ∈ Fam(Γ) associates a C-morphism p(σ) with codomain Γ, the canonical projection of σ. The domain of p(σ) is written Γ·σ;
- an operation q(−, −) which to each C-morphism f : B → Γ and σ ∈ Fam(Γ) associates a morphism q(f, σ) : B·σ[f] → Γ·σ such that the square² with top edge q(f, σ), vertical edges p(σ[f]) and p(σ), and bottom edge f : B → Γ is a pullback and the coherence conditions
q(id_Γ, σ) = id_{Γ·σ}
and
q(f ∘ g, σ) = q(f, σ) ∘ q(g, σ[f])
for g : A → B are satisfied.
² This and the following diagrams have been typeset using Paul Taylor's diagram macros.
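To fix the shape of these data, here is a minimal Haskell sketch (an illustration of ours, not Hofmann's formalisation), with objects, morphisms and families kept abstract; the coherence equations and the pullback requirement cannot be expressed in the types and are recorded as comments.

  data Cwa obj mor fam = Cwa
    { reindexFam :: fam -> mor -> fam  -- sigma[f];  sigma[id] = sigma,
                                       --            sigma[f . g] = sigma[f][g]
    , extend     :: obj -> fam -> obj  -- Gamma . sigma
    , proj       :: fam -> mor         -- p(sigma) : Gamma.sigma -> Gamma
    , qOp        :: mor -> fam -> mor  -- q(f, sigma) : B.sigma[f] -> Gamma.sigma;
                                       -- q(id, sigma) = id,
                                       -- q(f . g, sigma) = q(f, sigma) . q(g, sigma[f]);
                                       -- the square (q(f, sigma), p(sigma[f]), p(sigma), f)
                                       -- is required to be a pullback
    }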

Example 1. An important example of a cwa, which also gives some intuition about the meaning of the various ingredients, is the term model of some dependent type theory, constructed as follows. The category C has as objects well-formed contexts of variable declarations, and as morphisms equivalence classes of parallel substitutions (tuples of terms of the appropriate types). If Γ is a context then Fam(Γ) is the set of types well-formed in Γ. If f : B → Γ is a substitution then σ[f] is the parallel substitution of the terms of f in σ. The morphism p(σ) consists of the first |Γ| variables of Γ, x : σ, and q(f, σ) is the substitution (f, x) where x is the last variable in B, x : σ[f].
Further examples arise from families of sets or ω-sets.

Provided that suitable interpretations of base types and type constructors are given, a partial interpretation function can be defined by structural induction in such a way that every context is interpreted as a C-object, every type is interpreted as an element of Fam at the interpretation of its context, and finally terms are interpreted as sections (right inverses) of the canonical projections associated to their types. If M is a right inverse of p(σ) then, by a slight abuse of language, we say that M is a section of σ. The pullback requirement for q(f, σ) allows us to define a semantic equivalent of substitution on terms: if M is a section of σ ∈ Fam(Γ) and f : B → Γ then there is a unique section of σ[f], written M[f], which satisfies q(f, σ) ∘ M[f] = M ∘ f.
This interpretation is sound in the sense that the interpretation of all derivable judgements is defined and that all equality judgements are validated w.r.t. the actual equality in the model. An auxiliary property of the interpretation is that syntactic substitution is interpreted as its semantic counterpart −[−].
What it means for a cwa to be closed under a type former can be almost directly read off from the syntactic rules. For example, closure under Σ-types means that
- for every two families σ ∈ Fam(Γ) and τ ∈ Fam(Γ·σ) there is a family Σ(σ, τ) ∈ Fam(Γ),
- for every two sections M of σ and N of τ[M] there is a section (M, N) of Σ(σ, τ), the pairing of M and N,
- for every section M of Σ(σ, τ) there is a section M.1 of σ and a section M.2 of τ[M.1], the two projections of M,
such that (M, N).1 = M and (M, N).2 = N and (optionally) (M.1, M.2) = M, and for f : B → Γ we have Σ(σ, τ)[f] = Σ(σ[f], τ[q(f, σ)]) together with similar coherence laws for pairing and the projections. See [15, 12] for details.
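Extending the record sketch above (again our illustration; the equations remain comments), closure under Σ-types adds the following operations:

  data SigmaStructure mor fam = SigmaStructure
    { sigmaTy :: fam -> fam -> fam   -- Sigma(sigma, tau), for tau over Gamma.sigma
    , pairOp  :: mor -> mor -> mor   -- (M, N), a section of Sigma(sigma, tau)
    , proj1   :: mor -> mor          -- M.1,    a section of sigma
    , proj2   :: mor -> mor          -- M.2,    a section of tau[M.1]
    -- laws: (M,N).1 = M, (M,N).2 = N, optionally (M.1, M.2) = M,
    --       and Sigma(sigma, tau)[f] = Sigma(sigma[f], tau[q(f, sigma)])
    }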
3 From lcccs to categories with attributes


Our aim in this section is to construct a category with attributes supporting H-
and Z-types, and extensional identity types from a given locally cartesian closed
category.

Preliminaries. Let C be a category with finite limits (terminal object and pull-
backs) and F E Oh(C). The slice category C / F has as objects C-morphisms
with codomain F and a C / F - m o r p h i s m from s : dora(s) ~ F to t : dora(t) --* 1"
is a C-morphism a : dom(s) --~ dora(t) with t o a = s. Notice the important
triviality that any C-morphism a with codomain dora(t) is a C / F - m o r p h i s m
with codomain t (and domain t 0 a.) For each C-morphism f : B ~ F there is
a functor f* : C / F ~ C / B sending s : dom(s) --~ F to the left vertical arrow
of the pullback of s along f . The action of f* on morphisms is defined by the
universal property of the pullback. The functor f* has a right adjoint Z ] which
sends s : dom(s) ---* B to the composition f 0 s. The arrow category C -~ has as
objects all morphisms of C and commuting squares as morphisms. Equivalently,
a C--*-morphism from s : dora(s) ---+ B to t : dora(s) --~ F is a C-morphism
f : B ~ F and a C / B - m o r p h i s m a : s --~ f*t. Taking the domain of a morphism
extends to a functor dom : C ~ --* C.
Categories with finite limits loosely correspond to dependent type theories if
one views morphisms as families of types the morphisms denoting the projection
from the disjoint union of all fibres to the indexing type. For example in the lccc
of sets the type of m, n-matrices indexed over the set N • N would be modelled
as the function f o r m a t : Mat ~ N • N which maps an arbitray matrix to its
"format" a pair of natural numbers indicating the numbers of rows and columns.
Substitution then corresponds (up to isomorphism) to pullback and composi-
tion to disjoint union. For example we obtain the set of square matrices indexed
over N as the pullback of f o r m a t along the diagonal function N ~ N x N
and similarly the set of matrices with variable number of columns indexed
over the number of rows as the composition of f o r m a t with the first projection
N x N---~ N.
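As a small illustration of ours (finite fragments of the indexing sets as lists, reusing pullback from the sketch in the introduction; Mat and format are hypothetical names), square matrices arise by pulling format back along the diagonal, and indexing by the number of rows alone is given by composing with the first projection:

  newtype Mat = Mat [[Double]]            -- a matrix as a list of rows

  format :: Mat -> (Int, Int)             -- its format: (rows, columns)
  format (Mat rows) = (length rows, if null rows then 0 else length (head rows))

  -- "substitution" along the diagonal n |-> (n, n): square matrices of each size n
  squareMatrices :: [Int] -> [Mat] -> [(Int, Mat)]
  squareMatrices ns ms = pullback ns ms (\n -> (n, n)) format

  -- composition with the first projection: matrices indexed by their row count
  rowsOf :: Mat -> Int
  rowsOf = fst . format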
Using equalisers one can also model extensional identity types. In order to have dependent product types one also needs right adjoints to pullback functors, which leads to the following definition.

Definition 1. A locally cartesian closed category (lccc) is a category with finite limits and right adjoints Π_f to every pullback functor f* : C/Γ → C/B, for f : B → Γ.

Examples of lcccs are the categories of sets and ω-sets, all toposes, and the term model of extensional Martin-Löf type theory as constructed in [14]. For the rest of this section assume a fixed lccc C. In order to derive an interpretation of dependent type theory in C we construct a cwa with base category C as follows.
For Γ ∈ Ob(C) the set Fam(Γ) is defined as the set of those functors σ from the slice category C/Γ to the arrow category C→ which map every morphism to a pullback square and for which cod ∘ σ = dom. More precisely, σ ∈ Fam(Γ)
associates to every morphism s : B → Γ a C-morphism σ(s) with codomain B, and to every a : B' → B a morphism σ(s, a) such that the square with top edge σ(s, a) : dom(σ(s ∘ a)) → dom(σ(s)), vertical edges σ(s ∘ a) and σ(s), and bottom edge a : B' → B is a pullback. Moreover, the assignment of the morphism σ(s, a) is functorial in the sense that σ(s, id_B) = id_{dom(σ(s))} and σ(s, a ∘ β) = σ(s, a) ∘ σ(s ∘ a, β) for β : B'' → B'. An element of the thus defined set Fam(Γ) is called a functorial family over Γ.

Example 2. The intuition behind these families is that instead of making substitution (viz. pullback) an arbitrarily chosen structure, every family comes equipped with its own behaviour under substitution. Thus in σ(s) one should view s as a requested substitution and σ(s) itself as the result of performing this substitution. Indeed, given a (not necessarily split) choice of pullbacks in C, we can see that every C-morphism σ with codomain Γ induces a family σ̂ over Γ. For s : B → Γ we put σ̂(s) := s*σ, where s* is the pullback functor defined above. If in addition a : B' → B, we define σ̂(s, a) as the unique mediating morphism into the chosen pullback of σ along s induced by the chosen pullback of σ along s ∘ a; in the resulting diagram the lower right trapezium and the outer square are pullbacks. It follows from a simple diagram chase that the resulting lower left trapezium is also a pullback, as required. Since σ̂(s, a) is defined by a universal property it must be functorial.

We continue with the definition of the cwa of functorial families. If σ ∈ Fam(Γ) then the canonical projection p(σ) is defined as σ(id_Γ). Thus Γ·σ = dom(σ(id_Γ)). If in addition f : B → Γ, we define the substitution σ[f] by σ[f](s) := σ(f ∘ s) for s : A → B and by σ[f](s, a) := σ(f ∘ s, a) for a : A' → A. Since this substitution is defined by composition, the functor laws for Fam are immediate. Finally, the morphism q(f, σ) is given by σ(id_Γ, f), which indeed yields the required pullback square. The coherence law for q(−, −) follows from the functoriality of σ.
Notice that by the definition of canonical projection a section of some family σ is merely a right inverse to σ(id). Thus terms do not carry any intensional information with respect to substitution. See also Section 5.
We have now constructed a cwa over C which can be shown to be equivalent to C in some suitable 2-categorical sense. We shall content ourselves with noticing that the hat-construction and canonical projection (p) establish an equivalence between the category Fam(Γ), where a morphism from σ to τ is a map f with p(τ) ∘ f = p(σ), and the slice category C/Γ, for every Γ ∈ Ob(C).

Theorem 2. The category with attributes constructed above admits Σ-types, Π-types, and extensional identity types.

Proof. We give the full proof for Σ-types, which conveys the idea, and sketch the interpretation of Π-types and identity types. Let σ ∈ Fam(Γ) and τ ∈ Fam(Γ·σ). The family Σ(σ, τ) is defined by Σ(σ, τ)(s) := σ(s) ∘ τ(q(s, σ)) and Σ(σ, τ)(s, a) := τ(q(s, σ), σ(s, a)). Thus to obtain the value of Σ(σ, τ) at some substitution s : B → Γ, we first perform the substitution inside σ, yielding σ(s), and inside τ, yielding τ(q(s, σ)), and then calculate the sum of the resulting morphisms in C as usual by composition.
(Diagram: Σ(σ, τ)(s, a) = τ(q(s, σ), σ(s, a)) lies over σ(s, a), which in turn lies over a : B' → B; each of the two stacked squares is a pullback.)
The fact that Σ(σ, τ)(s, a) = τ(q(s, σ), σ(s, a)) forms a pullback with a and the vertical arrows follows because the vertical composition of two pullback squares is a pullback. Functoriality follows from the functoriality of σ and τ and the coherence laws for q(−, −).
Next, we check that the thus defined Σ-type is indeed stable under substitution. If f : B → Γ and s : A → B then Σ(σ, τ)[f](s) = Σ(σ, τ)(f ∘ s) = σ(f ∘ s) ∘ τ(q(f ∘ s, σ)) = σ[f](s) ∘ τ(q(f, σ) ∘ q(s, σ[f])) = σ[f](s) ∘ τ[q(f, σ)](q(s, σ[f])) = Σ(σ[f], τ[q(f, σ)])(s), as required. For the morphism part we calculate similarly.
The pairing and projection combinators are defined as usual in an lccc: if M is a section of σ, i.e. a right inverse of σ(id_Γ), and N is a section of τ[M], i.e. a right inverse of τ[M](id_Γ) = τ(M), then we define the pairing (M, N) as q(M, τ) ∘ N, which is a section of Σ(σ, τ) by simple equality reasoning. On the other hand,
434

if M is a section of Z(cr, r) then M.1 := p(r) o M is a section of M and M.2


is the unique section of r[M.1] with q(M.1, r ) o M . 2 = M determined by the
universal property of the pullback. Now we have (M.1, M.2) = M by definition,
(M, N ) . I = M by equational reasoning, and (M, N).2 = N by uniqueness of the
second projection.
It remains to show that these operations are stable under substitution. We
do the calculation for pairing, the two other cases may be verified similarly or
can be deduced from the case of pairing and the fland rl-equations. Let M and
N be as above in the definition of pairing and f : B -+ s Our aim is to show
that
(M, N)[f] = (M[f], N[f])
The participating sections are defined uniquely by the equations
q(f, Z(cr, 7-)) o(M, N)[f] = (M, N) o f
q(f,~r) 0 M[f] = M o f
q(/, 7-[M]) o N[I] = No f

Now in view of the unique charaeterisation of (M, N)[f] stability follows if we


can show
q(f, Z(cr, r)) 0(M[f], N[f]) -- (M, N) o f
Here the left hand side e q u a l s

q(q(f, ~r), 7-)o q(M[f], 7[q(f, c0])0 N[f]

by expanding the definitions of ~ and pairing. This in turn equals

q(q(f, or) 0 M[f], 7-)0 N[f]


using the coherence law for q ( - , - ) . Now using the defining equation for M[f]
and applying the coherence law in the other direction we arrive at

q(M, 7-)0 q(f, 7-[M]) o Nil]


Using the defining equation for N[f] and the definition of (M, N) we arrive at
the right hand side.
The type constructors H and identity type are defined in a similar fashion.
For families ~, 7- as above the value of the family H(c~, 7-) at substitution s : B
/~ is//o(~)(7-(q(s, g))). We leave the messy, but essentially forced definitions of
the morphism part and the associated combinators to the reader.
For ~r E Faro(F) and M , N sections of p(cr) we define the identity type
Eqr N) at s as the (chosen) equaliser of M[s] and N[s] where M[s] is the
unique section of ~(s) for which q(8, g) o M[s] = M o s. Compatibility of substitu-
tion of the associated combinators requires again some lengthy calculation which
in the case of H basically amount to reproving the Beck-Chevalley condition for
lcccs [14].
It is worth pointing out that a certain choice of pullbacks and equalisers
albeit not a split one is required to interpret identity types which are the basic
source of type dependency.
435

In a similar way we can show that the cwa of families supports lists or natural
numbers if the category C supports them in a coherent way. Instead of carrying
out these (rather laborious) examples we a t t e m p t to clarify the ideas a bit further
by elaborating the conditions on C which are necessary in order t h a t in the
associated category with attributes we can interpret an (admittedly contrived)
type former governed by the rules

F F o"type
T-FORM
r ~- T(r type

F}-M:a
T-INTRO
F F T-intro(M): T(~r)
and the associated congruence rules. This can in general be interpreted if there
is an operation T which to every m o r p h i s m ~r with codomain P associates a
m o r p h i s m T(c 0 also with codomain P and to every pullback square another
pullback square

f' D
T<r) 9

B
f
9 B
f
,F
t
functorial in the sense that T(id) : id and T ( f ' o g') : T ( f ' ) o T(g'). This action
on pullback squares is another way of stating that T is compatible with the cho-
sen pullback up to isomorphism and admits a functorial action on isomorphisms,
but of course not necessarily on arbitrary morphisms.
Moreover, for each section M of r we need a section T-intro(M) of T ( r r ) i n
such a way t h a t in the above pullback situation we have

T ( f ' ) o T-intro(M') : T-intro(M) o f

where M ' is the unique section of ~r' with f ' o M ' : M o f . This is the coherence
condition one would reasonably expect.
Now we can define a T-operator on families by putting

T(rr)(s) : : T(rr(s))

and
T(rr)(s, c0 : : T(rr(s, r~))

for (r 6 Fam(F) and s : B --+ F and c~ : B ' --~ B. Functoriality follows from
functoriality of rr and T ( - ) . The operation T-fntro is defined as in C. Stability
436

of same under substitution follows directly from the above coherence condition
by instantiating with the pullback square

B . f[o-] = dom(cr(f)) q(f, cr)= cr(idv, f ! V . a = dom(o'(idv))


/

p(~r[f]) = cr(f)l

B ,r'
f
This example shows that the described method carries over to other type con-
structors like e.g. lists or natural numbers provided they are present in C in
a coherent way. We also see that a type former need not necessarily be given
by a universal construction as is the case for H- and Z-types. The lesson to be
learned is that whenever a type former admits a functorial action on pullback
squares which is compatible with the associated structure then it may be lifted
to the cwa of families.

4 The interpretation in the family model

The general interpretation function for categories with attributes now gives rise
to a semantic function mapping contexts to objects in C, types to families over
their context, etc. Now if F k- ~ type then I F k- o'~(id[rl) is an object in the
slice category C / [ F ] which we may view as the intended interpretation of
in C. This intended semantics is not "compositional" since for example in the
interpretation of pairing we use substitutions other than the identity. A reader
familiar with theory of functional programming may notice here some similarity
with the continuation-passing-style translation where semantics is inductively
defined with respect to an arbitrary continuation, but in the end one is only
interested in the instance of the identity continuation.

5 Universes

In the construction described above types get interpreted as functions associ-


ating substitutions to morphisms. Terms, however, get interpreted simply as
sections and do not carry any intensional information about their behaviour un-
der substitution, it being forced upon by the universal property of the pullbacks
associated with families. This implies that our construction does not carry over
to universes (which mix terms and types) unless the universe was "split" in the
first place. What this means is exemplified by the following definition specialising
the notion of model for the Calculus of Constructions given in [6].

D e f i n i t i o n 3 . A split dictos is an lccc C with a morphism gen : T --+ ~2 and an


operation which to every two morphisms s : S ---+ F and p : S -+ t'2 associates a
437

m o r p h i s m Vs(p) : F --*/2 such t h a t Vs(p)*gen and Hs(p*gen) are isomorphic in


C / F and for every pullback square
!

S
f ,T

B ,F
f
and m o r p h i s m p : T --+ ~2 we have Vt(p) 0 f = Vs(p0 f ' ) .

In [6] the last requirement is weakened to isomorphism in C / F of the morphisms


associated by -*gen. The stricter condition imposed here means that the V
operator is stable under substitution up to equality. The two most prominent
examples of dictoses, namely the category of sets with /2 = {0, 1} and the
category of w-sets with f2 equal the set of partial equivalence relations on w are
split dictoses. In split dictoses we can interpret the Calculus of Constructions

Theorem4. To every split dictos there exists an equivalent cwa with enough
structure to interpret the Calculus of Constructions.

Proof. Let us first define what it means for a cwa with H - t y p e s to be a model
of the Calculus of Constructions. Following [15] we need a family Prop over 1
and a family Prf over 1 9 Prop in such a way that two morphisms s, s ~ : F --*
1 9 Prop are equal if Prf[s] = Prf[s']. Moreover, if cr is a family over F and
p : F 9c~ ~ 1 9 Prop then there is a m o r p h i s m VG(p) : F ~ 1 9 Prop such t h a t
Prf[Vo(s)] = H(cr, Prf[s]). One could stay even closer to the syntax but only at
the expense of clarity.
Now let a split dictos C be given. We construct a cwa with base C as follows.
The set of families over F is defined as the disjoint union of the set of functorial
families as defined in Section 3 and the homset C(F, D). We call the elements
of C ( F , ~2) propositional families (over F). The operations of substitution and
canonical projection are extended to propositional families by defining for (r :
/~--+/2andf:B~F:

f[~] = ~oI
p(~) =~*gen
q(f, cr) defined by universal property like in Ex. 2

It follows by straightforward calculation that this is a cwa. Every propositional


family cr : /1 ~ /2 induces a functorial family 5 defined by applying the hat-
construction from Ex. 2 to ~r*gen. We m a y then extend Z - t y p e s and other pos-
sible type formers except H - t y p e s to propositional families by precomposition
with'. We have a functorial family Prop over 1 defined by Prop = !a. W.l.o.g.
we m a y identify 1 9Prop with X?. A propositional family Prf over 1 9Prop is then
438

defined as the identity on ~2. Notice that if s : F ~ 1 9 Prop then Prf[s] equals
s. Therefore Prf[-] is injective as required.
For the definition of the H - t y p e H(a, r) we first replace ~r by 5" if cr is propo-
sitional. So let's assume that z is functorial. Then we proceed by case distinction
on whether r is functorial or propositional. In the former case we use the H-
type for functorial families as defined in Section 3. If r is propositional, i.e.
7" : F . ~r ---+1. Prop then we define H(0-, v) as the propositional family Vp(o)(7").
Abstraction and application are defined by suitably interspersing the isomor-
phism between Vs(p)*gen and Hs(p*gen) assumed in the definition of a split
dictos.
By lengthy but straightforward calculation it follows that this satisfies all
the properties of dependent products. In particular to see t h a t / / i s stable under
substitution we instantiate the coherence property for V with the pullback square
formed out of p(~[f]), p(o'), q(f, cr), and f for some f : B --+ F.
The Y-operator is defined in exactly the same way using the fact that propo-
sitional families and morphisms into 1. Prop coincide.
As in Ex. 2 the hat-construction and canonical projection define an equiva-
lence between C and the constructed cwa.

It deserves attention that the coherence requirement imposed on the Y-operation


was crucial for the definition of H-types by case distinction and that the methods
described in this paper do not seem to generalize to arbitrary dictoses or more
generally lcccs which support universes in a non split way.

6 Application: A category of setoids

As mentioned in the Introduction for many lcccs an equivalent cwa is known


already. However, there is an interesting example motivated by a construction in
[3] for which the construction described in this paper seems to be the only viable
way. Consider the syntax of intensional Martin-LSf type theory with natural
numbers as described e.g. in [10]. We write F t- cr true to mean that there exists
a term F ~- M : ~r. We write x and + for the nondependent special cases of
and H, resp. A category C of "setoids" (Types with equivalence relations) is
formed as follows.

An object of C is a quintuple X = (Xs~t, Xrel, r, s, t) such that the following


hold.
i) X ~ t is a closed type.
ii) x, x': X ~ t ~- Xr~t(x, x') type.
iii) x: X,~t ~- r ( x ) : Xr~l(x, x).
iv) x, x': X,~t, p: Xr~,(x, x') ~ s(p): X~,(x', x).
v) x, x', x": Xs~t, p: X,.~t(x, x'), q: X~,,(x', x") ~- t(p, q): X~ez(x, x").
So r , s , t are "proofs" that X ~ is an equivalence relation on X,~t. If no
confusion can arise the subscripts sa and ~el may be omitted.
439

- A morphism from X to Y is a term x: X~,t t- f(x) : Y~t such that

x, x' : Xset , - : Xrel(X, x') I- Yrel(f(x), f(x') ) true

Moreover, two morphisms f and f~ are identified if

9 : x~o~ ~ Y~.(f(=), f'(~))true


It is easy to check that equality on morphisms is an equivalence relation and
that morphisms are closed under composition and contain the identity so that
indeed a category has been defined. Essentially, this construction is the same
as the one described in [3] although there the category is defined categorically
rather than syntactically and one starts out with an lccc in the first place. By
mimicking the argument given there we obtain the following proposition.

Propositlonh. The category C of setoids is locally cartesian closed and con-


tains a natural numbers object.

Proof. We only give the required objects leaving the verifications to the reader.
Let f : Y -~ X and g : Z --* X. The pullback of f and g is defined as the object
W given by W,~t = Zy: Y.Zz: Z . X ( f ( y ) , g(z)) and Wrel((y, Z, --), (y', Z', --)) =
Y(y, y') X Z(z, z'). The two pullback projections send (y, z , - ) to y and z re-
spectively.
Now let f : Y --~ X and g : Z --* Y. We define [If (g) : W --~ X by

w~e, := ~ : X.~u: nv: z . x ( f ( v ) , ~) -~ ~ z : z . Y (g(z), v).nesp(u)


W~.((~, u,-), (x', ~', -)) :=
X(x,x') X IIy:Y.IIp:X(f(y),x).IIp':X(f(y),x').Z(u y p, u' y p')
where
Resp(u) :=
r[v,v':y.~p:x(f(v),~).Hp':x(f(v'),~).y(v,~') ~ z ( ~ y p,~ ~' p')
The morphism II](g) itself is then the first projection from W to X. Finally the
natural numbers object has as underlying type the type of natural numbers and
the intensional identity type [dy a s relation.

Unfortunately, this proof only shows the existence of pullbacks a n d / / and no


canonical choice arises from it because the constructions given in the proof are
not independent of the particular representatives chosen for the involved mor-
phisms. Therefore, in order to endow the category of setoids with chosen struc-
ture the use of the axiom of choice seems unavoidable. However, given such a
choice we can use the method described in this paper to obtain an interpreta-
tion of extensional type theory in the category of setoids and thus in a certain
sense in intensional type theory. The category of setoids is a worthwhile object
for further study. In particular it appears to have coequalisers of equivalence
relations and thus provides a model for the extensional quotient types studied
by Mendler in [9]. Moreover, we believe that the full subcategory of the category
of setoids consisting of those objects taken on by the interpretation function is
42.9

actually equivalent to the lccc of types and terms in extensional type theory
defined in [14] and presented there as the initial one. Incidentally, the precise
proof of initiality (up to natural isomorphism) of this syntactic lccc is another
field of application for our methods.

7 Summary and Concluding Remarks

We have described a method for obtaining an equivalent category with attributes


from a locally cartesian closed category. This solves the problem of interpreting
(at least first order) dependent type theory in lcccs. The method consists of
applying B4nabou's construction of a split fibration from an arbitrary one to the
particular case of the codomain fibration associated to an lccc. The observation
that the thus obtained cwa is closed under various type operators is to our
knowledge original.
Incidentally, for another somewhat dual construction of split fibrations due
to Power [13] this is not the case. Using it Faro(F) would be the set of pairs
(s, cr) where s and cr are morphisms with common codomain and dora(s) = F.
The associated canonical projection to such a family is the morphism s*cr with
codomain F. If f : B ~ F then we define (s,~r)[f] as (s0.f, cr). This gives
rise to a cwa which, however, is not even closed under X-types in a natural
way. Intuitively, the reason is that (s, cQ can be viewed as a type ~ together
with a delayed substitution which is meant to be carried out upon taking the
canonical projection. But if two types have different associated substitutions
we cannot compute their sum or product without performing the substitutions
which destroys the split property.
Power's very elegant result applies to much more general coherence prob-
lems than the one considered here; in fact it requires some effort to extract the
above concrete description from the general construction. The aim of the pre-
vious paragraph is by no means to critieise his beautiful work, but to pinpoint
the particular properties of B4nabou's construction which make our result go
through.
In view of the lack of generality with respect to universes pointed out in
Section 5 one might want to endow the meaning of terms with behaviour under
substitution, too. Then, however, the framework of cwas is no longer sufficient
and more generally no model in which substitution on terms is modelled by a
universal property could work. We do not know of any notion of model where
this is not the case, so maybe some further research into the abstract semantics
of dependent types is called for.
In conclusion we may remark that a certain gap in the literature has been
filled, but that the practical usefulness of the result remains unclear until more
examples like the one from Section 6 are found and investigated.

Acknowledgement

My thanks go to Thomas Streicher for explaining B6nabou's construction to me.


441

References

1. Michael Barr and Charles Wells. Category Theoryfor Computing Science. Inter-
national Series in Computer Science. Prentice Hall, 1990.
2. J. B6nabou. Fibred categories and the foundations of naive category theory. Jour-
nal of Symbolic Logic, 50:10-37, 1985.
3. A. Carboni. Some free constructions in realizability and proof theory. Journal of
Pure and Applied Algebra, to appear.
4. J. Cartmell. Generalized algebraic theories and contextual categories. PhD thesis,
Univ. Oxford, 1978.
5. Pierre-Louis Curien. Substitution up to isomorphism. Fundamenta Informaticae,
19:51-86, 1993.
6. Thomas Ehrhard. Dictoses. In Proc. Conf. Category Theory and Computer Sci-
ence, Manchester, UK, pages 213-223. Springer LNCS vol. 389, 1989.
7. Bart Jacobs. Categorical Type Theory. PhD thesis, University of Nijmegen, 1991.
8. Bart Jacobs. Comprehension categories and the semantics of type theory. Theo-
retical Computer Science, 107:169-207, 1993.
9. Nax P. Mendler. Quotient types via coequalisers in Martin-L6f's type theory, in
the informal proceedings of the workshop on Logical Frameworks, Antibes, May
1990.
10. B. Nordstr6m, K. Petersson, and J. M. Smith. Programming in Martin-Lb'f's Type
Theory, An Introduction. Clarendon Press, Oxford, 1990.
11. Wesley Phoa. An introduction to fibrations, topos theory, the effective topos, and
modest sets. Technical Report ECS-LFCS-92-208, LFCS Edinburgh, 1992.
12. Andrew Pitts. Categorical logic. In Handbook of Logic in Computer Science (Vol.
VI). Oxford University Press, 199? to appear.
13. A. J. Power. A general coherence result. Journal of Pure and Applied Algebra,
57:165-173, 1989.
14. Robert A. G. Seely. Locally cartesian closed categories and type theory. Mathe-
matical Proceedings of the Cambridge Philosophical Society, 95:33-48, 1984.
15. Thomas Streicher. Semantics of Type Theory. Birkhs 1991.
Algorithmic aspects of propositional tense logics
Alexander V. Chagrov and Valentin B. Shehtman
Tvef State University,
Zhelyabova Str. 33, 170013 Tvef, Russia
Insitute for Problems of Information Transmission,
Ermolovoj Str. 19, 101447 Moscow, Russia
Email: shehtman@ippi.ac.msk.su

1 Introduction
It is well-known that tense logics can be treated as logics of computations if we
understand "moments of time" as "states of a computing system". In this paper
we are concerned with tense logics in the traditional Priorean language having
two non-classical connectives: G ("it will always be the case that . . . "), H ("it
was always the case that ... "), and their duals: F = -~G-~, P = ~ H - . First let
us recall the definition of the minimal lense logic (denoted by K t 1).
A x i o m s : Classical tautologies and the formulas G(p --+ q) ~ (Gp ~ Gq),
H(p --+ q) ~ (Hp ~ gq), F g p ~ p, PGp ~ p.
I n f e r e n c e r u l e s : Modus Ponens; Generalization (t- 9 ~ F- G 9 and
~- 9 ==~~- H g ) ; Substitution (of variables by formulas).
In general, by a lense logic we mean an extension of K t by some new axioms;
if the number of new axioms is finite, the logic is called a tense calculus. E.g.
the calculus K 4 t is obtained by adding Gp --+ GGp (this is written as follows:
K 4 t = K t + Gp --~ GGp).
Semantics of tense logics is given by Kripke frames. The latter is a non-
empty set (of "moments of time" or of "states") with a binary relation ("earlier
than"). We get a Kripke model on a frame (W, R) if for every moment w E W
and for every formula 9 we say whether 9 is true at w (w ~ 9) or not, and also
if the following holds: w ~ 9 A r iff w ~ 9 and w ~ r etc. (similarly for the
other Boolean connectives); w ~ G 9 iff Vv(wRv ~ v ~ 9);
w ~ H 9 iffVv(vRw ~ v ~ 9). A formula is called true in a Kripke model if it is
true at every moment; it is called valid in a frame if it is true in every model on
this frame. It is well-known that K t is exactly the set of formulas valid in every
frame, K 4 t is the set of formulas valid in every transitive frame. For proofs and
motivations of tense logics the reader may consult [5].
The main topic of our paper is undecidability. In Section 2.2 we construct
undecidable tense calculi which are axiomatized by formulas of a very simple kind
] T h e s e n o t a t i o n s are t r a d i t i o n a l : "K" r e m i n d s of K r i p k e , "t" is a n a b b r e v i a t i o n for
443

("tense reduction principles"). Sections 3.2, 3.3 are devoted to more delicate
questions about decidability of properties of tense logics. The first examples
of undecidability for standard properties of polymodal logics (such as Kripke-
completeness, the finite model property, consistency) were found by Thomason
[14]. Here we show that tabularity of tense logics is undecidable as well (in
contrast with the case of monomodal S4-1ogics).
The readers are supposed to be familiar with standard syntactical and se-
mantical notions referring to normal modal and tense logics. However we recall
some necessary definitions.

2 U n d e c i d a b l e t e n s e calculi a x i o m a t i z e d by re-
d u c t i o n principles
A tense operatoris a word(maybe empty) in the alphabet {F, P, G, H}. A tense
reduction principle is a formula of the form Flp --~ F2p, in which F1, 1"2 are tense
operators.
It is well-known that many properties of time can be expressed by formulas
of this sort. E.g. Fp ~ H F p (and/or Gp --+ GGp) corresponds to transitivity,
P F p ~ F P p corresponds to "confluence": Vx, y, z( x Ry A x Rz --+ 3t ( y Rt A z Rt ) ),
Fp --~ F F p corresponds to density, GHp --* p corresponds to "seriality in fu-
ture": Vx3yxRy, and so on. The class of tense reduction principles may seem
too poor for obtaining negative algorithmic (and other) results. Nevertheless in
this Section we construct an undecidable tense calculus, axiomatizable by tense
reduction principles.

2.1 Undecidable polymodal logics


We begin with the polymodal case. Recall that normal n-modal logics are for-
mulated in the language including necessity operators Di besides classical con-
nectives; as usual Oi abbreviates -~Di--. Minimal normal n-modal logic Kn is
axiomatized by the formulas D~(p --+ q) ~ (Dip ~ D~q) in addition to classical
tautologies, together with the rules: Modus Ponens, Substitution and General-
ization (F- ~ ~F- Di~). An n-modal Kripke frame is a structure (W, R 1 , . . . , Rn)
in which Ri's are binary relations in a non-empty set W. The truth and va-
lidity are defined similarly to the monomodal case; in particular, w ~ Di~ iff

Thus K t is equivalent to the logic K2 + ~ID2p "-+ p + O2[:31p ---+p if G, H


are replaced by D1, D2. Respectively, Kripke frames (W, R) for K t correspond
to 2-modal frames (W, R, R - l ) .
A necessity reduction principle is a formula of the form Flp ---+F2p, in which
F1, F2 are words in the alphabet En = {D1,..., Dn}.
Given an n-modal frame (W, R 1 , . . . , Rn), and F = Dil ... Dim, let

RF----Ril o...oRi,~
4.L~4.

(o denotes the composition of binary relations); for the empty word A, let

•A = idw = {(x, x)l c W}


L e m m a 1. (W, t~1,..., Rn) ~ Flp -+ F2p ~ Rr~ c_ Rrl.
The straightforward proof is easy (for modal logicians we can observe that
Lemma 1 is a particular case of Sahlqvist's Theorem [9]).
Recall that a semi-Thue system over the alphabet {[:]1,..., Dn} can be de-
fined as a binary relation in the set of words E*. A semi-Thue system T gives
rise to the deducibility relation ~-T which is the reflexive and transitive closure of
{ ( X r E X A Y ) I (r, A ) ~ T ; X , YeE~}. 7- is called decidable if ~-T is decidable.
By factorizing the set E* through the equivalence relation

Y --:r A r (F b-~- A) and (A ~-~- F ) ,

we obviously obtain the ordered monoid 2, with the unit 1 = IAI


(IFI, or IFIz-denotes the equivalence class of F under ----T), in which the multipli-
cation and the order are defined as follows:

Irl. IAI = Ir l, Irl _< lal ca r a.


We denote this ordered monoid by M(T).
On the other hand, every set W gives rise to the ordered monoid of binary
relations R M ( W ) = ( P ( W • W), idw, o, C_). An ordered monoid is called rep-
resentable iff it is embeddable into some R M ( W ) . The following fact ("Cayley's
Theorem") is well-known [11]:
T H E O R E M 2. Every ordered monoid is representable.
Now for any semi-Thue system T we can construct a logic

n(7-) = K , + {Ap -~ Vpl (r, A) e 7-}

T H E O R E M 3. If T is undecidable then L(T) is undecidable.


P r o o f . It is sufficient to show that for any words F1, F2

F1 t-7- F2 Ca L(T) ~- F2p ---+Flp 9

(:*) follows easily from the definition of L(T) by an induction on the proof
of F~.
To show the converse, assume that L(7-) ~ r2p ~ Flp. By Theorem 2, there
exists an embedding f : M(7-) ~ R M ( W ) . Let Ri = f(Ioil) for 1 < i < n.
Then RA = f(IAI) for any word A, and thus (since f is an embedding and by
Lemma 1) we have (for any F, A):

(r, A) ~ 7- ~ Irl < I/,I in M ( T ) ca RF c_ R/, r (W, R1, . . ., R,~} ~ Ap --+ Fp.
Thus (W, R 1 , . . . , R,) validates L(7-), and we obtain:
(W,/~,..., R.) ~ r 2 p -+ r~p.
2 R e c a l l t h a t a n ordered raonoid is a m o n o i d w h i c h is a p a r t i a l l y o r d e r e d set, s u c h t h a t
a <_ b =t, ~ a y <_ xby.
445

Hence Rr2CRrl by Lemma 1, and therefore FI~'rF2 (since f is an embed-


ding). Q.E.D.
E x a m p l e . The logic obtained by adding the following axioms to K5 is
undecidable:

DIVIap ~ DaDIp, r'llN4P ~ V14Dlp , D2Dap ~ D3r12p ,

D2D4p _= v14112p, D5DaDIp ~ DaD5p,


Vq5VI4V12p ~--- V14V15p i [ t a D 4 D 3 D l p ~_~ D3V14VlaflSp.

This follows from Theorem 3 and the paper [7].


R e m a r k . Theorem 3 and the Embedding Theorem reducing polymodal logics
to monomodal logics [13] imply the existence of undecidable normal monomodal
calculi. The first (and more complicated) example of such calculus was con-
structed in [6].

2.2 Undecidable tense logics


Now let us transfer the previous results to tense logics axiomatized by tense
reduction principles. The main difficulty here is the connection between G
and H in K t and the corresponding restriction to 2-modal frames of the form
(W, R, R - ' > only.
An ordered monoid with the additional unary operation x ~-+ z-1 is called a
tense ordered (t.o.) monoid iff it satisfies the conditions:
(xy) -1 = y-ix-1, (x-l) - 1 = 9 , x _< ~ ~ x-1 _< y - 1 , 1 - 1 = 1.

Every set W gives rise to the tense ordered monoid of binary relations
TI~M(W) = ( P ( W x W), idw, o, ___,-1 ). A t.o. monoid is called representable iff
it is embeddable into some T R M ( W ) . Not all t.o. monoids are representable;
a criterion of representability is known, but it is rather complicated [2]. For our
purposes we will use a simpler sufficient condition of representability.
Call a t.o. monoid serial iff it satisfies Vx (1 < x x - ] ) , Vx (1 _ x - i x ) .
T H E O R E M 4. Every serial t.o. monoid is representable.
P r o o f (a sketch). A cone in a t.o. monoid M is a subset U C M such that
x E M ~ x < y ::> y E M. Consider the set W of all cones in M, and for
x E M let
h(x) -- {(U,V) E W • W l,~V C_ U , x - l v C_ V } .
One can check that the map h : M ---+T R M ( W ) is an embedding, provided
that M is serial. Q.E.D.
Obviously, the set of all words in the alphabet
T~ = {D1,..., D~, D~-I,..., D~ 1}
can be considered as a t.o. monoid in which 9 is the concatenation, 1 = A, _< is
the equality, and
(Oi) -1 = O~-1 , ([::171)-1 = Di, ( N 1 . . . N m ) - ' = (Nm)-i ...(N1) -1
~.2.~

where Ark E T* for all k.


A semi-Thue system T over Tn is called tense if

(r, A) E T =:>"[,-1 t._T A-1.

In this case M ( T ) becomes a t.o. monoid if we set: Ir1-1 = Ir-~l. An arbitrary


semi-Thue system T over E , can be extended to a tense semi-Thue system T t
over Tn :
~t = ~r u {(r -~, ~-~)I (r, A) c ~}.
Now let us encode the words in T,~ as the words in the two-letter alphabet
T1 = {G, H}. (Keeping to the tense-logical notation, we write G instead of []1
and H instead of mi-1. )
L e m m a 5. Let n >_ 2 and let for any i < n,

Qi = Gn+aHGn+I-iHGi+2H.

Then the monoid morphism f : T n ---+ T~ such that for any i, f ( n i ) = Qi,
f ( m i - 1 ) = Q~I, is a coding.
The proof is straightforward, and we skip it. By a standard argument we
have also the following
L e m m a 6. Let U be a semi-Thue system over Tn, and let f be the same as in
Lemma 5. Consider the semi-Thue system f(/,/) = {(f(P), f ( A ) ) [ ( r , A) ~ U}.
Then for any Pl, F2 E T*

rl ~-u r2 r f(r~) ~-f(u) f(r~).

M a i n L e m m a 7. Let T be a semi-Thue system over E , , such that r r A,


A ~ A whenever ( F , A ) E "/-. Then there exists a tense semi-Thue system ,9
over T1 such that M ( S ) is serial, and M(cJ-) is embeddable in M ( S ) .
P r o o f (a sketch). Take the coding f from Lemmas 5 and 6, and let

S = f(7"t) U {(A, GH), (A, H G ) } .

Due to L e m m a 6, the equality

g(Irlr) = If(r)ls 9

(for any F C Tn) correctly defines a morphism of t.o. monoids


g: M(7-) ---+M ( 8 ) .
Our goal is to prove that g is an embedding, and it is sufficient to show that

f ( r l ) ~-s f(r2) ~ Pl ~-r r2. (1)

We call a word U in T1 regular if it has no occurrences of H 2 and strongly


regular if it has no occurrences of H G H and HG2H either. One can see from
the definitions that every word in the range of f is strongly regular.
447

Consider also the two fragments of S:

S+ = f ( T ) U {(A, GH), (h, HG)}, D = {(A, GH), (A, HG)}.

We skip the inductive proofs of the following two sublemmas.


S u b l e m m a 7.1. If U }-s V and V is regular, then U is regular.
S u b l e m r a a 7.2. If U t-s+ V and V is strongly regular, then U is strongly
regular.
By Sublemma 7.1 we obtain that

f(rl) ~-s f(I'2) ~ f(rl) ~-s+ f(r2)


because every derivation of f(r~) from f(r~) contains only regular words, and
thus rules of the form (f(r-~), f(a-~)) (with I" 7t A) cannot be applied there.
So to check (1) it remains to show that

f(r~) ~-s+ f(r2) ~ f(r~) ~-:(r) f(r~) (2)


since
f(F1) I"$(T) f(]P2) =:~ r l ~'T F2
by Lemma 5.
S u b l e m m a 7.3. If V F~ U and U is strongly regular then there exist
U 1 , . . . , U k , D 0 , . . . , D k such that D 1 , . , . , D k - 1 E {GH, HG},
Do, Dk E {GH, HG, A} and

V = U1...Uk, U = DoU1D1 ...UkDk 9 (3)

This is also proved by an induction over the derivation of U. Observe that


if a single application of the rule (A, GH) or (A, HG) to U written in the form
(3) gives a strongly regular word, then none of non-void Di can become longer.
Further on X -< Y denotes that the word X is an initial segment of the word
Y.
S u b l e m m a 7.4. Suppose Qi Fv U and U is strongly regular. If Qj ~ U then
Qi = Q~ = g .
P r o o f (a sketch). By Sublemma 7.3, we can write:

Qi = U1 ...Uk, U = DoUtD1 ...UkDk

for some D 1 , . . . , D k - 1 E {GH, HG};Do, Dk E {GH, HG, A}. It follows that


Do = A since G n+3 -4 Qj ~_ U. After examining different possibilities for
V1 ~ Qi, it turns out that !/1 = Qi (since U is strongly regular), and thus
U = QiD1. Then (again from the strong regularity of U) we obtain that D1 is
void. Q.E.D.
Now suppose f(F)FvU for some strongly regular U. By 7.3, there exist
U I , . . . , U k , D 0 , . . . , D k such that D1,...,Dk-1 E {GH, HG},
Do, Dk E {GH, HG, A} and

f(F) = U1...Uk, U -= DoU1D1...UkDk. (4)


a48

S u b l e m m a 7.5. Suppose (4) holds for some P in T,~ and some strongly regular
U. Then for any A, an occurrence of f ( A ) in U can be of two types:
(i) either I(A) occurs in some Ui;
(ii) or f ( A ) : G X , and for some i, Di-~ = HG, X ~_ Ui).
P r o o f (a sketch). By an induction over the length of A. The base consists in
analyzing different possible occurrences of Qk; here 7.4 is used. To make the
step, suppose A = A0C?~; then f ( A ) = f(A0)nk, and there are two possibilities:

* f(A0) occurs in some Ui. Then the initial segment G ~+2 of Qk also must
occur in Ui. Now from the base of 7.5 we can conclude that the whole Qk
occurs in V~, and thus (i) holds.

* f ( A o ) = G X , Di-1 = HG, X "4 Ui. In this case the initial segment


G '~+2 of Qi occurs in ld, and similarly to the previous case, it follows that
X Q k -4 Ui, i.e. (ii) holds. Q.E.D.

S u b l e m m a 7.6. If f ( r ) f-v U t-S(m) V is strongly regular, and Gn+aH -4 V


then for some A
/(r) ~-s(m)/(A) t-v V.

P r o o f (a sketch). It is sufficient to consider the case when V is obtained from U


by applying a single rule, i.e. by replacing an occurrence of some f((P) by some
f ( ~ ) . By 7.2, U is strongly regular, and thus (4) holds for F, U. Due to 7.4, we
can consider two possibilities9
(i) If f((I)) is a subword of some Ui then after applying the same rule
(f(4), f ( ~ ) ) to f ( r ) = g l . . . gk, we get some f ( a ) because f is a coding. Then
the D-rules deriving U from f ( r ) can be applied to f(A), and t h u s / ( A ) t-.D V.
(ii) f((I)) = G X , X ~_ Ui, Oi-1 = HG. Then i r 1 (since G~+3H ~ V,
G'+2H ~_ X ) . Since H 2 does not occur in U, G is the last letter of Ui-1. Let
U~- I = Ui_ t G. Then
I !

U = DoUID1 ... Ui_IGHGXUi 9 .UkD~,

V = DoU, D1 . . . U i _ I G H f ( ~ ) U ~ 9 9UkDk,
and also

f(r) - u: ...UT_lf(<~)ui ...u, Vs(~) w = u: ..uT_, . . . u , ~ , v .


But then W must be of the form f ( A ) since the set { Q 1 , . . . , Q,~} is free. Q.E.D.
Now to show that (2) holds, we can reorganize the S+-derivation of f(F2)
from f(F1) in such a way that all D-rules were applied after f(T)-rules (due to
7.6):
/(F1) I-f(T) f(A) I-~/(F~).
Let us show that the last D-derivation is void. Suppose not. Then it leads
to the situation described by (4), with some Di being non-void. Then by 7.5 the
449

occurrence of H in D~ is not touched by any subword of the form f ( r ) . This


contradiction yields f(rl) ~s(:r) f(r2). Q.E.D.
A tense necessity reduction principle is a formula of the form r l --+ F~, with
r l , r~ from T~. For any tense semi-Thue system 8 over TW1 we can consider
the logic
Lt(8) = g t + {Ap -+ r p I (r, A) e S}.
Given a frame {W,R), let R e = R, R ~ / = R -1 and for F = N1 . . . N m
(Nk E {G, H}), let R r =/~N1 o ... o RN,~; let also RA = idw =
= x) lx e w} above.
The following claim is proved similarly to Lemma 1:
L e m m a 8. (W, R) ~ rip ~ r2p r Rp2 c Rr~.
T H E O R E M 9. Let 8 be an undecidable tense Semi-Thue system, such that
M ( 8 ) is serial. Then nt(8) is undecidable.
P r o o f . Similarly to Theorem 3, by applying Lemma 8 and Theorem 4.
T H E O R E M 10 (cf. [12]). There exist undecidable tense calculi axiomatized by
tense necessity reduction pinciples.
P r o o f . Consider an undecidable semi-Thue system 7" over E= satisfying the
condition of Lemma 7 (e.g. 7" can be taken from the example at the end of 2.1).
From Lemma 7 we obtain 8 satisfying the conditions of Theorem 9. Q.E.D.

3 Undecidability of tabularity of tense calculi


3.1 Tabularity criterion
Recall that a (poly)modal logic of a single finite Kripke frame is called tabular.
To formulate a criterion of tabularity, consider the following formulas ~s and f , :

9 c~, is the conjunction of all formulas of the form

-n(~ 1 A M1(~2 A M2(~3 A . . . A M,~-lp,~)...);

9 fs is the conjunction of all formulas of the form

A.../x

where
Pi = Pl A ... A pi-1 A'-,pi A pi+l A ... Aps+l,

and Mi E {F, P} for all i.


L e m m a 11. (i)A logic is tabular iff it contains a formula ~,~ A f~ for some n.
(it) There exists a recursive function f ( n ) s.t. every generated frame validat-
ing an A fin has at most f ( n ) worlds.
P r o o f (a sketch). A sequence of distinct worlds x l , . . . , x~ in a frame (W, R) is
called an n-pseudochain iff Vi < n (xiRxi+l V Xi+lRXi). We say that x C W is
n-pseudo-branched iff there exist distinct worlds xl .... , x~ such that
gi < n ( x R x i V x i R x ) . Then we have:
~-50

9 x ~= a , (under some valuation) iff there exists an n-pseudochain beginning


at x;

9 x ~ fin iff for some m < n there exists an m-pseudochain beginning at x


and ending at some n-pseudo-branched world.

Now let L be a tabular logic determined by a finite frame •. Then for some
n, F contains no n-pseudochains and no n-pseudo-branched worlds, and thus

Conversely, let L ~- an A fin, and suppose L ~/ ~. Then x ~: ~ for some


world x in the canonical Kripke model # of the logic L. Then every substitution
instance of an A/~n is true at x, and this allows to show that any pseudochain
beginning at x is of the length less than n and contains no n-pseudo-branched
worlds. Therefore ~ is refuted in the generated submodel p5 whose cardinality
is _< f(n) = ~ kn-1
= 0 (n - 1) t. Q.E.D.
R e m a r k , We say that a normal tense logic is of a finite length if it contains
K 4 t and at least one of the formulas an 9 Every tense logic of a finite length
is locally tabular (and therefore has the tim.p); this can be proved using the
canonical model technique and the same arguments as in the proof of L e m m a
11.
The following theorem is known for a very large class of non-classical logics
with lattice-theoretic connectives and the implication satisfying Modus Ponens,
and with finitely many additional algebraic connectives [1]. The latter paper
uses a highly developed algebraic technique. Here we present a simpler proof
which fits for normal polymodal logics as well.
T H E O R E M 12. Every normal tense tabular logic is finitely axiomatizable.
P r o o f . Consider a tabular logic L. By Lemma ll(i), L is an extension of the
finitely axiomatizable logic K t + an A fin for some n. By L e m m a 11 (ii) the latter
logic has finitely many extensions, and they are all tabular. Thus L is obtained
from K t + an A fin by a finite sequence of immediate extensions. To complete
the proof, observe that the finite axiomatizability is inherited by immediate
extensions. Q.E.D.

3.2 Simulating Minsky machines and undecidability oftab-


ularity
Recall that a Minsky machine (see [8]) has two left-bounded tapes, the machine
heads (one on each tape) write or erase nothing, and the information on a tape
is the number of its cells.
A program for a Minsky machine is a finite set of instructions of the form

-~ (f, 1, 0) , ~ -~ (/3, 0, 1) ,

o) o, o)), (f, o,-1) o, o)).


The last of them, for instance, means: if the machine is in the state q~ and there
are cells to the left of the head on the first tape, then move this head one cell to
451

1~
.4

s(t,'i,0
Figure 1:

the left and then pass to the state qz; but if the machine is in the state q~ and
there are no cells to the left of the head on the first tape, then, changing nothing
on both tapes, pass to the stage q7.3 A configuration of a Minsky machine is a
triple A = (~, m, n), where q~ is a state, m and n are numbers of cells on the
first and the second tape. The notation P : A --+ B means that the Minsky
program P transforms a configuration A to a configuration B. It is well-known
that there is no Mgorithm deciding, for given P, A, B, whether P : A ~ B or
not.
The technique used below was described in full details for monomodal and
intermediate logics and a general scheme of undecidability proofs of this kind
can be found in [4]. The picture shows a transitive frame ~ akin to the one from
[3] where the proof scheme was used at first. Further on we suppose that the
frame jc contains only such points s(fl, k, l) that P : (~, m, n) ~ (fi, k, l).
The points of 9c are defined by the following variable-free formulas:

A=OTAnOT, B=QI, C=OTAnOI,


D = OT A n o i , E1 = OOT A O n D i , E2 = OOOT A n o n n i ,

F1 = O C A -~OOC A --,OO, F2 = O O C A --,OOOC A -~OD,


A ~ = <>C A <>D A --O<>C A -~O<>D,
A~ - OE1 A OF1 A -~<><>E1 A -~<><>F1,
A02 = <>E2 A <>F~ A --<><>E2 A -~<><>F2,
3There are other modifications of Minsky machines, of course. Sometimes Minsky machines
we use here are cosidered as register machines with two registers.
z'.52

-,OA0~,
i#k=O
i c { o , 1 , 2 } , j k O.

For/3, k, 1 _ 0 and for any formulas ~2 and r let

S(/3, ~, r = A OA? A -,OA~+, A 9 A -~OO~ A O r A 7 0 0 r


i=0
then the formulas S(fl, k, l) = S(/3, Ak,
1 A t2) define the point s(/3, k, l).
Also let

Q:t = (OA 1 v 4 ) A -~OAo~ A ---,OAo~ A pj_ A ~Opl,

Q2 = OA~ A-~OAo~ A---,OAo2 A Op, A mOOpl,


R, = (9 ~ V Ao2) A -,OAo~ A - " 0 4 A P2 A ~Op~,
R2 = OAo2 A --OA ~ A -~OA~ A Op2 A -,O9

D~, = ((OA01V A01) A --OAo~ A-~OAo2 --~ OkA01)A


((oA~ v A0~) A ~<>A0~ A < > 4 -~ o % ~ )
The proof of the subsequent lemma is straightforward.
L E M M A 13. The following formulas are K4t-provable:

S(fl, Qi, Rj)(Dkt/p) ~ S(/3, k + i - 1, l + j - 1),

s ( A A~, R~)ogkdp) ,--+S(A O, l) ,


S03, Q1, A~)(DkJp) ~ S(fl, k, 0).

Now let X be an arbitrary formula refutable in 5c. To simulate the instruc-


tions of a two-tape Minsky machine we use the following formulas (where C)~'
abbreviates 9169 V O - 1 9 V O~ V O - 1 ~ V F; easily to see that in a model on
O P is true in all moments iff ~ is true in some moment, i.e. the modality 0
play a role of "omniscience"):
9 if I = 7 -+ {5,1, 0) w e s e t

A x I = -~X A 0 S ( % Q1, R1) ~ -~)/A 0 S ( 5 , Q2, RI)

9 if I = 7 - + ( 5 , 0 , 1 } we set

A x I = -~X A O S ( 7 , Q1, ):~1) "-+ "rex A 000(5, Q1, R2)


453

9 if I = 3' -'~ (~1, - 1 , 0) ((~2, 0, 0)) we set

A~I = (-~X A 9 Q~, n~) ~ ~X A 9 Q,, n~)) A


(-~X A O S ( 7 , A~, R1) -+ -~X A OS(~2, A01, R1)) ;

9 if I = 3' --* ( ~ , 0,.-1) ((~2, O, 0)) we set

AxI = (-'X A OS(7, Q1, R2) ~ -'X A O S ( 6 , , Q1, R~)) A


(~X A OS(7, Q,, Ao2) --* ~X A OS(f2, Q,, A~)).

Let
AxP = A AxI
IEP

Now for a Minsky machine P and two configurations A = (a, m, n), B =


(/~, k, l) we define a logic

L(P,A,B) =
1 2
K 4 t 4- A x P A ((-~X A O S ( a , Am, A~) -+ -~X A OS(fl, A~, A~)) -+ X).

L e m m a 14. If P : A 7z+ B then :1: ~ n(P, A, B).


We omit the proof, which is routine, but without any difficulties.
L e m m a 15. If P : A --+ B then L(P, A, B) = K 4 t 4- X.
P r o o f (a sketch). The inclusion L(P, A, B) C_ K 4 t 4- X follows easily from the
definition. The converse is a consequence of the following fact: if P : A --+ B
then L(P, A, B) t- -~X A O S ( a , A~, A~) -+ -~X A OS(t 3, A~, A 2) which is proved
by an induction over the length of the calculation. Q.E.D.
All the formulas an Afl~ (Lemma 12) are refuted in ~ and thus L(P, A, B) is
non-tabular whenever P : A 7~ B . Now we can choose the formula X in different
ways. E.g. X may be a formula axiomatizing a tabular tense logic. Then by
applying Lemma 15 we obtain
T H E O R E M 16. The problem whether a finitely axiomatizable K4t-logic is
tabular, is undecidable.
THEOREM 17. For any tabular K4t-logic L, the problem whether a finitely
axiomalizable K4t-logic equals to L, is undecidable.In particular; it is undecid-
able whether a given logic is consistent.
R e m a r k . Theorem 17 becomes false for normal K4-1ogics; in this case, for
any tabular logic, the problem of equivalence is decidable.

3.3 Antitabularity and co-antitabularity


A normal tense logics without finite frames called antitabular. This notion has
no analogues for the normal monomodal case.
It is well-known that there exists a continuum of Post-complete (or maximal
cosistent tense logics; thus there exists a continuum of antitabular ones.
T H E O R E M 18. The problem whether a finitely axiomatizable K4t-logic is
antitabular, is undecidable.
L-54

P r o o f (a sketch). The construction of L(P,A,B) from the previous section


should be somewhat modified; take X = 2 and add two axioms: O A1 and
OQ1 ~ OQ2- Then Lemmas 14, 15 still hold. Due to the new axioms, the
logic contains all the formulas OAi0 (because all OA~ ~ OA~+I are obtained
from OQ1 ~ OQ2 by Substitution, cf. Lemma 13) and thus it can be valid
only in an infinite frame. Therefore L(P, A, B) is antitabular if P : A ~ B, and
is inconsistent otherwise. Q.E.D.
We call a logic co-antitabular iff it has generated frames of arbitrarily large
finite cardinalities, but no infinite generated frames. Two following examples of
co-antitabular tense logics can be found in [10]:
,, the logic L i n T G r z determined by finite frames ({1,..., n}, <);
9 the logic L i n W determined by finite frames ({1,..., n}, <).
On the other hand, it can be proved that there are no co-antitabular logics
among normal extensions of K4.
T H E O R E M 19. The problem whether a finitely axiomatizable K4t-logic is
co-antitabular, is undecidable.
P r o o f (a sketch). Again we use the construction from 2.2.
If P : A --* B then the frame 2- is infinite and connected, and thus the
logic L(P, A, B) is not co-antitabular. Now take X to be a formula axiomatizing
L i n T G r z (or LinW) and apply Lemma 17. Q.E.D.

References
[1] BAKER K.A. Finite equational bases for finite algebras in a congruence
distributive equational classes. Adv. Math., 1977, v.24,207-243.
[2] BREDIKHIN D.M. Representation of ordered involuted semigroups. Izvesti-
ja vuzov, matematika, 1975, No.7, 119-129 (in Russian).
[3] CHAGROV A.V. Undecidable properties of extensions of provability logic.
I: Algebra i Logika, 1990, v.29, No.3, 350-367; II: ibid., No.5, 613-623. (In
Russian)
[4] CHAGROV A.V. , ZAKHARYASCHEV M.V. The undecidability of the
disjunction property of propositional logics and other related problems. J.
of Symb. Logic, 1993, v.58, No.3, 967-1002.
[5] GOLDBLATT R. Logic of time and computation. CSLI Lecture Notes No.7,
1987.
[6] ISARD S. A finitely axiomatizable undecidable extension of K. Theoria,
1977, v.43, No.3, 195-202.
[7] MATIJASEVICH Yu. V. Simple examples of undecidable associative calculi.
Trudy MIAN SSSR, 1967, v.93, 50-88 (in Russian).
455

[8] MINSKY M. L. Recursive unsolvability of Post's problem of "Tag" and


other topics in the theory of Turing machines. Annals of Mathematics, 1961,
v. 74, pp. 437-455.
[9] SAHLQVIST H. Completeness and correspondence in the first and second
order semantics for modal logic. Studies in Logic and Found. Math., 1975,
v.82, 110-143.
[10] SEGERBERG K. Modal logics with linear alternative relations. Theoria,
1970, v.36, No.3,301-322.
[11] SHAIN B.M. Representation of ordered semigroups. Matematicheskij
Sbornik, 1965, No.2, 188-197 (in Russian).
[12] SHEHTMAN V.B. Undecidable propositional calculi. In: Neklassicheski-
je logiki i ih prilozhenija. Ed. by All-Union Council on Cybernetics.
Moscow,1982, 74-116 (In Russian).
[13] THOMASON S.K. Reduction of tense logic to modal logic. I: 3. of Symb.
Logic, 1974, v.39,549-551. II: Theoria, 1975, v. 41, 151-169.
[14] THOMASON S.K. Undecidability of the completeness problem of modal
logic. Universal algebra and applications, 1982, v.9,341-345. Banach Center
Publications, Warsaw.
Stratified Default Theories

Pawet Cholewifiski

Department of Computer Science, University of Kentucky, Lexington, KY 40506

A b s t r a c t . Default logic is a nonstandard formal system, especially suit-


able for knowledge representation and commonsense reasoning. In this
paper we study a class of propositional default theories for which compu-
tation of extensions simplifies. We introduce the notions of stratification
and strong stratification. We investigate properties of stratified default
theories. We show how to determine whether a given default theory is
stratified or strongly stratified and how to find the finest partition into
strata. We present algorithms for computing extensions for stratified de-
fault theories and analyze their complexity.

1 Introduction

Nonmonotonic reasoning systems allow us to formalize reasoning with incomplete


information and other forms of commonsense reasoning. They model a behavior
of an agent who constructs its knowledge (or belief) sets by reasoning not only
from what is known to him but also from what is possible or consistent to assume.
Default logic of Reiter [13] proved to be one of the most successful nonmonotonic
reasoning systems. Its applications range from databases to expert systems and
planners.
Default logic can be regarded as a proof system obtained from propositional
logic by adding nonstandard inference rules-called defaults. The difference be-
tween a default and a standard inference rule is the presence of two kinds of
premises - prerequisites and justifications. The role of prerequisites of defaults
is the same ks the role of premises in standard inference rules - they have to be
proven before we can to apply a default. But prerequisites of the second type
-justifications - do not need a proof. T h e y need to be shown possible in order
for a default to be applicable. Any default theory represents a family (possibly
empty) of all possible knowledge (belief) sets, which are called extensions. Due
to the fact that justifications do not need proofs but only have to be possible, de-
fault reasoning is defeasible. T h a t is, some inferences may became invalid when
more facts become known.
Default reasoning problems have high computational complexity. For in-
stance, the problem of deciding if a given formula ~ belongs to at least one
extension of a finite default theory is r ~ - c o m p l e t e and deciding whether ~p be-
longs to all extensions is UP-complete [8], [12]. In fact, these complexity results
hold even if we restrict to normal default theories with all defaults prerequisite
free. Even if we restrict to the class of disjunctive-free theories these problems
are still NP-hard [9].
457

To improve efficiency of reasoning with default logic it is worth to check if a


given theory belongs to a subclass for which reasoning can be performed faster.
A common technique is to stratify a theory. Stratification consists of partitioning
a given theory into a sequence of smaller theories for which extensions can be
computed faster. This approach was widely studied in the cases of logic program-
ming and autoepistemic logic ([1], [2], [3], [7], [14]) Some results of applying this
approach to default logic were obtained by Etherington [6] and Kautz and Sel-
man [9]. Their results were based on restricting the syntactic forms of formulas
used in default theories.
In [5], we studied the class of seminormal default theories. We did not impose
there any syntactic restrictions on formulas appearing in defaults, but more re-
strictive conditions on dependencies between defaults were required. We showed
that every strongly stratified seminormal default theory has an extension, and
that each ordering of defaults which agrees with a strong stratification generates
an extension. Conversely, each extension for such a theory is generated by some
ordering which agrees with stratification. Moreover, extensions of a stratified
seminormal default theory can be found by considering the theory stratum by
stratum.
In this paper we study general default theories. That is, no syntactic re-
strictions on the form of defaults are required. We generalize the concept of
stratification for seminormal default theories introduced in [5] to the case of ar-
bitrary default theories. Under a very weak additional assumption (each default
must have at least one justification) all results from [5] emend to this case. In
particular it is possible to find extensions by considering the theory stratum by
stratum which often yields significant speedups in reasoning algorithms.
We also introduce a concept of an extension tree. For a default theory (D, W)
with D stratified into D 1 , . . . , Dk, an extension tree is a tree with theory W in its
root and nodes on level i being all the extensions of the default theory (D1 U... U
Di, W). We use extension trees to construct algorithms for computing extensions
and query processing for default theories. Finally we show that checking if a given
default theory is stratified and computing the finest partition into strata is easy
and can be done in polynomial time.

2 Preliminaries

Throughout this paper we assume that every default has at least one justifi-
cation. We consider formulas built over a given propositional language/:. For
any propositional formula ~ by Var(~) we denote the set of all propositional
variables which appear in ~. For every justification/~i of a default

d = ~ : flt ' ' ' ' ' f l k


3'
we define the set of its conflict variables, denoted Var*(/~), as follows:

1. if/~i = 7 (normal justification) then Var*(/3i) = 0;


L58

2. i f / ~ = / ~ A ~f (seminormal justification) then Var*(~) = Var(B~);


3. in all other cases Var*(fli) = Var(~i).

Finally, by the set of conflict variables of d we mean Var* (d) =U~=Ii=k Var*(~)
Now we can state the definition of a stratification ]unction as follows.

D e f i n i t i o n 1. Let D be a set of defaults. A function p assigning an ordinal


number to every default from D is a stratification function for D if for any
d,d'cD, w h e r e d = ~:~ .....zh a n d d ' = ~'"Z[~,
..... ~ ' the following three conditions
hold:

1. if Var('y) n Var('/) ~ 0 then p(d) = p(d'), and


2. if Var*(d) n Var('y') ~ 0 then p(d) >_p(d'), and
3. if Vat(a) n Var(~/) ~ ~ then p(d) > p(d').
Stratification p is called strong if p(d) > p(d'), whenever Var*(d) N Var(7' ) ~ 0.
p(d) is also called the rank of d. A

For a given stratification function p, its lower and upper bound will be usually
denoted by y_ and 7. T h a t is, p is a fuction from a given set of defaults D to
set of ordinal numbers {~ : _~ _< ~ < ~}. Also, for a default d by p(d) we will
mean its prerequisite, t h a t is the formula a, by J(d) the set of its justifications
{/~1,..., ilk} and by c(d) its conclusion 7-

D e f i n i t i o n 2. A default theory (D, W) is stratified (strongly stratified) if


1. W is consistent, and
2. Yar(W) n Yar(c(D)) = 0, and
3. there is a stratification (strong stratification) function for D.

This definition of stratification differs from the standard one in that it uses
the new concept of conflict variables. We decided for this approach as it yields the
class of stratified theories which includes normal default theories and seminormal
default theories as defined in [5].

Example1. Consider the default theory (D,W), where W = {p V q}, D =


{dl,d2,d3}, and dl = ~r V s ' d2 = r:uA(-pv~s)
u '
d3 = : ~ p , ~qD~z s , ~ " This is an
example of a general default theory, (D, W) is not seminormal, nor disjunction-
free and results from [9] and from [5] do not apply to it, but this theory is
strongly stratified in the sense of Definition 2. One can easily check that func-
tion p(dl) = 1, p(d2) = 2, p(d3) = 3 is a strong stratification.

In fact, because any constant function is a stratification function for any set
of defaults only conditions (1) and (2) of Definition 2 are essential for (D, W)
to be stratified. A stratification based on a constant function p will be called
trivial. Clearly, the situation with strong stratification is different. For a given
set of defaults a strong stratification function need not exist.
459

Example 2. Let (D, W) be a default theory, where W = {q}, D = { ~ }. There


is no strong stratification function for D, condition 2 of Definition 1 immediately
implies p(dl) < p(dl). A

General default theories, that admit nontrivial stratifications arise naturally


in various encodings of many problems, especially those taken from graph the-
ory or combinatorics in general. Several examples of such problems and corre-
sponding encodings are presented in [4]. The goal of this paper is to show that
extensions for such theories can be found or tested easier than in the general
case.
We will use the terminology and notation introduced in [11]. In particular,
we use a method for building extensions given an ordering __ and the definition
of a set of defaults AD~_ introduced in [11]. For the reader's convenience we give
these definitions here.

D e f i n i t i o n 3. Let D be a set of defaults and p : D --+ {~ : _~ < ~ < 7} a


stratification function for D. We define:

1. Sets D~ = {d E D : p(d) ~} called strata.


2. D<~ = U~,<~ D~,, (sets D<~, D>~, D>_~ are defined analogously).
3. For any propositional letter p p(p) = ~ if p 9 Var(c(D~)), and p(p) = 0
otherwise.
4. For any formula ~ 9 s if ~ contains propositional variables then
p(~) ----max{p(p): p 9 Var(~)}, otherwise p(~) --- 0.
5. For any well-ordering _ of D we say that _ agreeswith p if for any d, d~ 9 D,
p(d) < p(d') implies that d ~ d'. A
D e f i n i t i o n 4 . [11] Let (D, W) be a default theory and ~ a well-ordering of D.
We define an ordinal y___,sets of defaults AD~ , ~ < y___and defaults d~, ~ < ~5,
and a set of default rules AD~ as follows: If the sets AD~, ~ < a, has been
already defined but y___is not yet defined then

1. If there is no default rule d 9 D \ Ur AD~ such that W U c(U~<o ADr F


p(d) and W U c(Ur o AD~) ~ ~ , for every f l 9 J(d) (by ~- we denote
propositional provability relation), then y~ = a.
2. Otherwise, define do to be the ___-least default rule d 9 D \ U~<o AD~ such
that W U c(U~< o AD~) F p(d) and W U c(Ur o AD~) ~/-~fl, for every fl E
J(d). Set ADo = U~<~ ADr U {do}.
3. Put AD~_ = U~<v~ AD~

The theory Cn(W Uc(AD~_)) will be called generated by the well-ordering _. A

Let us recall that if S is an extension of (D, W) then the set of all defaults
applicable with respect to S satisfies the equation S = Cn(W U c(U)) [13].
This set of defaults will be called a generating defaults set for S and denoted as
GD(D, W, S). Also, for a given default theory (D, W) and a well-ordering __ of
D, if S = Cn(WUc(AD~_)) is an extension for (D, W) then AD~_ = GD(D, W, S)
([11]).
4.60

3 Stratification and Well-Orderings

In this section we present the main result of the paper. We show that extensions
for a stratified default theory (Di W) can be found be expanding the initial
knowledge set W stratum by stratum. The method of building extensions by
well-orderings presented in Definition 4111] will be the main tool in proving this
result. First we present a technical lemma. The proof is standard and is omitted.

L e m m a 5. Let (D, W) be a stratified default theory, and let p be a stratification


for (D, W) Let Z C D and ~ E s If W U c(Z) is consistent then W U c(Z) b
if and only i f W Uc({d E Z : p(d) g p(~)}) t- ~. []

Lemma 5 allows us to prove that for a stratified default theory (D, W) and
a well-ordering _ of D which agrees with a given stratification function p the
order in which defaults are selected in the construction of AD~_ (Definition 4)
is not incidental. The defaults must be "inserted" into AD 5 according to their
strata. More formally we have the following proposition.

P r o p o s i t i o n 6. Let (D, W) be a stratified default theory, ~ be a well-ordering


o l D . If ~ agrees with the stratification p o l D and d~,d v E AD~_ are defaults
chosen in ~th. and ~th step then
if p(d~) < p(dv) then ~ < 7/. (1)

Proof. Let d~ = ~:Z~.~3~r.....Zk.~ and du = %:~1.,2r .....~ " . Assume that ~ -<- ~. Since
the ranks of d~ and dv are different, d~ ~ d v and therefore y < ~. Hence, the
set of ordinals % such that T > y and p(d~) < p(dv) , is not empty and well-
ordered, Thus, without loss of generality, we can assume that ~ is the first ordinal
greater then ~/ such that the rule chosen in ~th step has rank less then p(dv).
From the definition of AD Z (Definition 4) we have that W U c(AD<~) ~- a~ and
W U c(AD<~) ~/-~i,r for any k = 1 , . . . , I. Since AD<n C_ AD<r it must be the
case that W U c(AD<v ) ~/-~i,~ for any k = 1 , . . . , l . Also, from Lemma 5 it
follows that
W U c({5 E AD<r : p(5) < p(ar F- a~.
Since ~ is the first ordinal greater than q such that the rule d~ has rank smaller
than the rank of q, all rules in AD<~ \ AD<v have ranks greater or equal p(q).
Also, since p is a stratification function for D, all variables which appear in a t ,
appear in conclusions of only those defaults from D which have ranks at most
p((). From Definition 3 we have that, p(du) > p(d~) > p(a~). Hence
{6 E AD<~ : p(~) _< p(a~)} _CAD<v.
Thus AD<n F- a~ and since ___ agrees with the stratification p and d~ _ d v it
follows that dr should have been chosen in y,h step rather then d n. That is,
_< ~/. Because of the contradiction the proposition is proven. []

We can apply Lemma 5 to show that every strongly stratified default theory
has an extension. First, we will prove a slightly stronger result.
461

P r o p o s i t i o n T . Let ( D , W ) be a strongly stratified default theory and ~_ be a


well-ordering of D. If -~ agrees with the strong stratification p of D then the
condition
for every/~ e J(AD~_), W U c(AD~_) ~/-~/~, (2)

is satisfied and C n ( W U e(AD~_)) is an extension for (D, W).

Proof. Suppose that for some d E AD.< there is a justification j3 of d such that
W U c(AD~_) ~ ~ . Let d = dr that is, d was chosen in the ~th step of the
construction of AD~_. We will consider three cases: (a) d is a normal default, (b)
d is a seminormal default, (c) d is a general default.
(a) If d is normal then d = ~ By definition of AD< we have that W U
c(AD~_) F V and therefore W U c(AD~_) is inconsistent. Since for consistent W
and any set D of normal defaults the theory W U c(AD~_) is consistent (Lemma
4.2 in [11]) D must contain also a not normal default d' with t3' E J(d') for
which W U c(AD~_) F -~'. So, it is enough to consider cases (b) and (c).
(b) If d is seminormal then d = ~:~^x.
,y If W U c(AD~_) ~- -~(fl A V) then since
W U e(AD~ F ~ we have that W U e(AD~_) F -~. Since the stratification is
strong and _ agrees with rank it follows from Lemma 5 and Proposition 6 that
W U c(AD<e) F ~j3. Hence W U c(AD<~) F -~03 A 7) a contradiction.
(c) Let d = ~:~1"r.....Z~ and suppose that for some i, W U c(AD-<) F -~fli.
As the immediate consequence of Proposition 6 and Lemma 5 we have that
W U c(AD<~) ~- -~t3~. That is, d could not have been selected in the ~h step of
the construction of AD-<.
We have proven the first part of the proposition that is, that the condition
2. This condition implies that C n ( W U c(AD~)) is an extension for (D, W)
(Proposition 3.65 in [11]). []

Now, the existence of extensions for strongly stratified default theories follows
immediately from Proposition 7. We state it as a corollary.

C o r o l l a r y 8. If (D, W) is a strongly stratified default theory then (D, W) has


an extension. In fact, for each well-ordering ~_ of (D, W) which agrees with a
strong stratification of D ~ S = Cn( W U c( AD ~ ) ) is an extension for (D , W ). A

The assumption that (D, W) is strongly stratified is crucial in the proof of


Proposition 7. As we noted before every set of defaults has a stratification (not
strong) and therefore every default theory (D, ~) can be considered as a stratified
default theory. However, it is a well known fact that not every such theory has
an extension.
Now, we will show the "converse" result. That is, that all extensions for a
stratified default theory can be generated by orderings which agree with some
fixed stratification p. Suppose that we have a stratified default theory (D, W),
its stratification function p and its extension S. To prove that S can be generated
by some ordering which agrees with rank we introduce the following definition.
462

D e f i n i t i o n 9. Let (D, W) be a default theory, p be a stratification function for


D and S be an extension for (D, W) generated by ordering ~ . We define an
ordering ~ s as follows: for every two defaults d, d' E D, d ~ s d I if
1. p(d) < p(d'), or
2. d e GD(D, W, S), d' E GD(D, W, S), p(d) = p(d') and d _ d', or
3. d E GD(D, W, S), d' r GD(D, W, S) and p(d) = p(d'), or
4. d ~ GO(D, W, S), d' r GD(D, W, S), p(d) = p(d') and d ~ d'. A

tt follows directly from Definition 9 that ___s agrees with rank. The ordering
___s orders defaults according to their strata and inside each stratum creates an
initial segment of defaults generating S ordered by _ and then defaults from
D \ GD (D, W, S) are appended at the end of their strata. In proving subsequent
results we will use the following notation. When we deal with ADr gener-
ated for several orderings _~1, ~ 2 , - . . , __.t (t may be an arbitrary index) to avoid
confusion we will denote them as AD~, AD~,..., AD~ respectively. Similarly for
AD<r AD<_r etc. We state the following proposition.
P r o p o s i t i o n 10. Let (D, W) be a stratified default theory and p be a stratifica-
tion for (D, W). Each extension for (D, W) is generated by some ordering which
agrees with p.
Proof. Since S is an extension for (D, W) there exists a well-ordering _ of D
such that S = Cn(W U c(AD~_)) (Proposition 3.68 in [11]). In this case AD~_ =
GD(D, W, S), that is, AD Z is the set of generating defaults for S (see Section 2).
Moreover we can assume that _ orders defaults in such a way that all defaults
from the generating set GD(D, W, S) precede the defaults from D\GD(D, W, S)
and ~ IGD(D,W,S) agrees with rank [11]. We will show that also for an ordering
__s described in Definition 9, S = Cn(W U c(AD~_s)). To this end it is enough
to show that AD~_ = ADds. We will prove by induction that AD[ = ADr
Suppose that the claim is true for all ~* < ~. Then W U c(Ur ADr and
W U c(U~,<~ AD~,) prove exactly the same formulas.
First, we show that in the construction with respect to ~ s only a rule from
GD(D, W,S) may be selected in ~th step. Suppose to the contrary that some
d E D \ GD(D, W, S) is selected in the ~th step of the construction with respect
to _ s . This means that W U c(Ur AD~,) ~ a, and for all fl E J(d) W U
c(U ,< AD~,) V -~fl and d is the "~s-least default with such properties Let
p(d) = ~. From the inductive assumption it follows that W U c(U ~,<r AD~, ) ~- a.
Hence, by definition of AD• and the fact that d ~. AD~_, we have that there
must be a justification f l e J(d) such that W U c(AD__.) t- --ft. So, by Lemma 5
W Uc({5 E AD~_: p(5) < p(~)} F ~fl.
Since, p is a stratification function, p(~) < p(d) = ~. And since in _ s all rules
from AD~ having ranks p less or equal than 7/precede d it must be the case that

W Uc({S E AD~_ : p(5) <p(fl)}C_{SeAD~_:p(5) <_~) C U AD~,.


~'<~
463

Thus by the inductive assumption W U c(Ur162 AD~,) h -~3 and a contradiction


follows. So in ~h step both ___and ~ s will select a rule from GD(D, W, S). Since
~- ]GD(D,W,S):~--S ]GD(D,W,S) the same rule must be selected, so AD~_ = AD~_s.
[]

Till now we have proven that all extensions for stratified theories can be
found by examining only those orderings which agree with stratification and
that for strongly stratified theories every such ordering generates an extension.
Now we will show that extensions for stratified default theories can be computed
more efficiently than in the general case, that is by dealing with one stratum at
a time. First, we recall basic definitions of sets used in such construction.

Definition 11. Let (D, W) be a stratified default theory, p : D --+ {~ : ~ <


< ~} its stratification and ~ an ordering of D, which agrees with p. For every
ordinal ~ _>~ let us define:

1. Ar = AD~_ N D~;
2. A<r A_<r A>r and A>~ are defined similarly as D<r D_<~, D>r D>r
3. W<~ = W U c(Ur Af,) and W<~ = W U c(U~,<r A~,);
4. = cn(w< u c(&)).

P r o p o s i t i o n 12. Let (D, W) be a stratified default theory with stratification p :


D -+ {~ : ~ <_ ~ < "~), ~_ be an ordering of D, which agrees with rank and
generates an extension S = Cn(W U c(AD~_)). For any ~ > ~_ :

1. Cn(W U c(A<~)) is an extension for (D<r W);


2. Cn(W U c(A<_~)) is an extension for (D_<~, W);
3. S is an extension for (D>_r W<r
4. S is an extension for (D>r W_<~).

Proof. Consider arbitrary ~,_~ < ~ < 7. The ordering ~ defined as __]D<r is an
ordering of D<r Also, the function p~ = PID<~ is a stratification for (D<~, W).
From the definition of AD-sets and Proposition 6 it follows that AD~_r = AD~_N
D<~ = A<~. Since Cn(W U c(AD~_)) is an extension for (D, W), for any d E
AD~_ and for any/~ E J(d) W U c(AD~_) [/-1t3. Moreover, if d E A<~ C AD~_
then W U c(AD~_r V -~I~. This means that Cn(W U c(A<~)) is an extension for
(D<~, W) (Proposition 3.65 in [11]).
To prove that Cn(W U c(A<~)) is an extension for (D<~, W) it is enough to
take ~ + 1 in place of ~ in the proof of (1).
To prove that S is an extension for (D>~, W<~) observe that (D>~, W<~) is a
stratified default theory. The function p~ -- PID>e is a stratification function for
D>~ and ~ = _ ID>~ is a well-ordering of D>~ which agrees with p~o So, from
the definition of AD-sets and Proposition 6 we have that for any ~ >

WUc( U ADn)=WUc(A<r A D ~ ) = W < ~ O c ( U AD~).


~<~+~' n<~, n<~,
4.64.

Thus AD~, = AD~_ \ A<r Consider the theory S' = Cn(W<~ U e(AD~_,.~)). It
follows that $2 = Cn(W U e(A<r U c(AD~, )) = Cn(W U c(A<( U AD~_, )) =
Cn(W U c(AD~_). So, $2 = S, and since S does not prove negation of any
justification from AD~_, S does not prove negation of any justification from
AD~_, C_ AD~_. Hence, S is an extension for (D_>r W<r Assertion (4) follows
easily from (3) by using ~ + 1 in place of ~. []

C o r o l l a r y 13. Let (D, W) be a stratified default theory, p be its stratification


and ~_ an ordering olD, which agrees with p and generates an extension S. Under
the notation introduced above, for any ~ from range of p, S~ is an extension for
(De,W<~).
Proof. Observe that So = Cn(WUc(A<_r is an extension for (D<_r W) (Propo-
sition 12 point (2)). By the same proposition point (3) So is also an exten-
sion for: (D_<r N D>r = (Dr162 and since So = Cn(W U e(A<g)) =
Cn(W U c(A<r U c(Ar ) = Cn(W<r U c(Ar ) = Sr the corollary is proven. []

The moral of Corollary 13 is that to generate extensions for a stratified theory


it is enough to consider one stratum at a time. That is, first find extension $1 of
W by defaults of the first stratum and append conclusions of defaults generating
$1 to W. Then find extension of "updated" W by defaults from second stratum
and so on. Notice also that in the case of strong stratification p the assumption
of Proposition 12 and Corollary 13 that __dgenerates an extension is redundant.
As it was shown (Corollary 8) in such a case every ordering which agrees with
stratification generates an extension.

4 Properties of Extensions

In this section we analyze properties of stratified and strongly stratified default


theories. In particular, we compare them to well-known properties of normal
default theories [13]. We have already shown that every strongly stratified default
theory has an extension~ This is also a property for normal default theories.
Another property of normal default theories is their semimononicity in D. That
is, for every extension S = Cn(W U c(U)) for (D, W) and for every D' such that
D C D', there exists an extension S' = Cn(W U c(UO) for (D ~, W) and U C U'.
In general, this property is not preserved by strongly stratified default theories.

Example3. Let D = { ~ } U { : . Vz, z } and W = 0. Theory (D,W) is strongly


stratified and has only one extension Cn({x, z}). Let D' = D U { :~-~y}. The theory
(D', W) is also strongly stratified and has a unique extension Cn({x, --y}) which
does not contain the previous one. A

However, in the case of stratified default theories we can show a weaker form
of semimononicity. First, as an immediate consequence of Proposition 12 we can
state the following result.
465

Corollary 14. Let (D, W) be a stratified default theory, D C D' and p be a


stratification function for D ~. If for every d E D ~ \ D and 5 E D, p(d) > p(5),
then for any extension S ~ for (D ~, W) there is an extension S of (D, W) such
that S C S', and GD(D, W, S) C GD(D', W, S').
I
Proof. There must exist an ordinal ~ such that, D = De. The claim follows
from Proposition 12 applied to the default theory (D ~, W). Just observe that,
GD(D, W, S) = AD~ = Ar C_ AD~ = GD(D', W, S'). []

Also, if the theory (D ~, W) in Corollary 14 is strongly stratified then the


existence of extensions is guaranteed and the assertion can be reversed in the
following way (Corollary 8).

Corollary 15. Let (D, W) be a strongly stratified default theory, D C_ D' and
p be a strong stratification function for D'. If for every d E D' \ D and 5 E D,
p(d) > p(5), then for any extension S of (D, W) there is an extension S I for
(D',W) such that S C_ S', and G D ( D , W , S ) C_ GD(D',W,S'). /~

Corollaries 14 and 15 state that semimonionicity for stratified theories holds


when new strata are added. In the case of strongly stratified theories we can also
prove that the same holds when we add new defaults to the 'old' stratum of the
highest rank (if it exists). First, we prove an auxiliary lemma which describes
the case of trivial stratification.

L e m m a 16. Let (D, W) and (D', W) be two stratified default theories. Let D'
be a set of defaults such that D C_ D' and the constant ]unction p(d) = 1 is a
strong stratification function for D'. For any extension S of (D, W) there exists
an extension S' of (D', W) such that S c_ S' and GD(D, W, S) C_ GD(D', W, S~).

Proof. Consider an arbitrary extension S for (D, W). There is an ordering ~ of


D such that all defaults from GD(D, W, S) form an initial segment in (D, _)
and also for any k <_ GD(D, W, S) the k *h default in (D, _) is selected in the
k th step of AD-construction ([11], Lemma 3.72).
Take any ordering ___~of D ~ such that the defaults from D are ordered by
_ form an initial segment in _'. From Definition 4 and Corollary 8 it follows
that ___~will generate an extension S ~ such that S _ S ~ and GD(D, W, S) c_
GD(D', W, S'). []

Now, from Corollary 14 and Lemma 16 it follows that semimononicity for


strongly stratified default theories holds when new defaults are added to the
highest stratum or new strata of higher ranks are added to the theory. More
formally we state the following proposition.

Proposition 17. Let (D, W) be a strongly stratified default theory, D C_ D'


and p be a strong stratification function f o r D ~. If for any d E D t \ D p(d) >_
sup{p(5): 5 E D}, then for any extension S of (n, W) there is an extension S'
of (D', W) such that: S C_ S', and GD(D, W, S) c_ GD(D', W, S'). []
466

The assumption that the stratification is strong is essential in Proposition


17. Otherwise, since every default theory (D, ~) has a stratification function (for
example, the trivial one), we would receive that every such default theory has
an extension and this is not true. This is also evident when looking at Example
3.
Another well-known property of extensions for normal default theories is their
orthogonality. This property is preserved by strongly stratified default theories.

P r o p o s i t i o n 18. If a strongly stratified default theory ( D , W ) has distinct ex-


tensions S~ and $2 then S] U $2 is inconsistent.

Proof. Let p be a strong stratification of (D, W). From Proposition 10 it follows


that there are orderings ~1 and -~2 which agree with p, such that $1 = Cn(W U
c(AD~_I)) and $2 = Cn(W U c(AD~_2)). Suppose that $1 U 5:2 is consistent.
Then also W U c(AD___I)U c(AD~_2) must be consistent. Since extensions are
always incomparable AD~_I \ AD~_2 r ~ and AD~_~ \ AD~_I • 0. Let dl be the
-~l-least rule in AD~I \ AD~j. Let d2 be the _2-1east rule in AD~_~ \ AD~_I.
Define then dr as dl if p(dl) <_ p(d2) and as d2 otherwise. Without loss of
generality we can assume that d~ = dl. That is~ d~ E AD~_I and no rule d t exists
in AD-<2 \ AD-<1 such that p(s < p(dr Assume that d~ = ~:B1..... ~ . Then
W U c((.Jr162 AD~,) F a and for all i~ 1 < i < k W U c (U~,<~ AD~,) 1 k~-~/3i. Since
Ur AD~, C_AD-<2 it follows that WUc(AD.<2) F a. Thus it must be the case
that W U c(AD~_2)k- -~/3i. We will consider three cases: (a) /3i is normal, (b) /3i
is seminormal and (c) /3i is a general type justification.
(a) If/3~ is normal, that is ~ = ~, then immediately we have that 7 E $1 USe
and -.7 E $1 U $2, so $1 U $2 is inconsistent.
(b) If/3~ is seminormal, that is fli = ~ A % then since 7 E $1, we have
that W Uc(AD~_I) Uc(AD~_~) t- --I /3~. ! From Lemma 5 and strong stratification of

(D, W) by p it follows that WUc({5 ~ AD~_~UAD~_: : p(5) < p(/3~)}) t- -"/3~. But
since p is a strong stratification all defaults from AD~_~UAD~_: with ranks smaller
than/3~ are in Ur AD~,. That is, WUc({5 e AD~_~UAD~_~ : p(5) < p(/3~)}) C_
Ur <~dD~,, and consequently W U c(U ~, <~ dD~,) F -~/3~ a contradiction.
(c) The case when/3i is not normal and not seminormal is very similar to the
previous one. First, we observe that W U c(AD~_~) U c(AD~_:) F -"/3. Now, from
Lemma 5 we can conclude that W U c ( { ~ ~ AD~_I U A D ~ : p((i) < p(/3~)}) k -"/3
and by the same argument as in (b)

W U c({(f e AD~_I U AD~_2: p(5) < p(/3)}) C_ U AD~,.


~,<~

Thus, once again we receive that W U c(U ~,<4 AD~, ) t- -./3 which contradicts the
assumption about de. []

C o r o l l a r y 19. Let (D, W) be a strongly stratified default theory such that W U


c(D) is consistent. Then (D, W) has a unique extension.
467

Proof. By Corollary 8 the default theory (D, W) has an extension. Suppose that
$1 and $2 are two distinct extensions for (D, W), generated by "<1 and ___2re-
spectively. By Proposition 18 the theory WUc(AD~_I) Uc(AD~_2) is inconsistent.
Hence the theory W U c(D) is inconsistent. A contradiction. []
Of course, to prove the assertions of Propositions 18 and Corollary 19 it is
necessary to assume that the stratification is strong. Otherwise these assertions
fail as shown in the following example.

Example $. Let W = 0 and D = :~~'Y, :-"~'=~. The default theory (D, W) is


stratified (trivially) and has two extensions Cn({x}) and Cn({y}), which are
not orthogonal. Moreover the set W U c(D) = {x, y} is consistent. A
Finally, we will describe the analog of the following property of normal de-
faults: the number of extensions is monotone non-decreasing under the addition
of new defaults. Such a strong result cannot be proven for stratified default the-
ories. However, in the case of strongly stratified theories we can show a weaker
version of this property: the number of extensions is monotone non-decreasing
under the addition of new strata or addition of new defaults to the stratum of
thehighest rank. We state the following corollary.
C o r o l l a r y 20. Let (D, W), D E D' and (D', W) be a strongly stratified default
theory with a stratification function p. If for every d 9 D' \ D, p(d) > sup{p(~) :
9 D}, then for any two distinct extensions $1 and $2 for (D,W) the theory
(D', W) has distinct extensions S~ and S~ such that S ] c S~ and $2 c_ S~.
Proof. Let $1 and $2 be two different extensions for (D, W). From Proposition 18
it follows that S]US2 is inconsistent. Let $1 and $2 be two extensions for (D', W)
such that Si C S~, i = 1, 2. The existence of Sx and $2 follows from Proposition
17. If S~ = S~ then $1 U $2 C_ S~ and therefore S~ is inconsistent (Proposition
18). Since a stratified default theory (D, W) must have consistent W it cannot
have inconsistent extensions ([13]). Hence, we obtained a contradiction. []

5 Reasoning with Stratified Theories

Throughout this section we assume that sets D and W are finite and that the
range of p is 1 , . . . , K. First, we notice that checking whether a given finite
default theory is strongly stratified and finding the finest partition into strata
(with respect to Definition 2) is easy and except for consistency checking of W
all the work can be done in polynomial time. This can be done in a similar way
as for other stratified reasoning formalisms such as stratified logic programs [2],
stratified autoepistemic theories [7], [10] or stratified seminormal default theories
[5]. We can describe all the conditions of Definition 1 using the characteristic
graph GD.
D e f i n i t i o n 21. For any finite set of defaults D by the characteristic graph GD
we mean a directed graph GD = (D,A) , where for any d~,dj E D there is an
edge (dl, dj) 9 A if:
468

1. the conclusions of di, dj have a common variable, or


2. the conclusion of di contains a variable which appears in the set of conflict
variables for dj (an arc (d~, dj) E A for which this condition is satisfied is
called strong), or
3. the conclusion of di contains a variable which appears in the prerequisite for
dj. A
Now, we present a description of strongly stratified default theories in terms of
their characteristic graphs. The proof is straightforward and is omitted.
P r o p o s i t i o n 22. A finite default theory (D, W) is strongly stratified if and only
if W is consistent, Vat(W) A Var(c(D)) = ~, and no strong component of the
characteristic graph GD contains a strong arc. []
It is easy to see that the partition of D defined by partitioning of GD into
its strong components is the finest possible partition. Definitions I and 21 imply
that the defaults belonging to the same strong component of GD must have
equal ranks. The finest possible partition into strata of an arbitrary default the-
ory with consistent W and with Wnc(D) = ~ can be found in the same way. In
this case we just construct GD, compute its strong components and topological
sort. If no strong component of GD contains a strong arc then the received strat-
ification is strong. Otherwise, a strong stratification for D does not exist and
the computed ranks form only the finest possible stratification ("weak"). This
approach defines a direct method of constructing the finest stratification for a
given set of defaults. Checking whether W and c(D) have a common variable
can be done in O(l log l) time where l is the length of the theory (D, W). Clearly
checking whether W is consistent is NP-complete. Graph GD , its strong com-
ponents and the topological ordering of G~ can be computed in O(l ~) time. This
method of computing stratification can be improved. It is not necessary to use
an algorithm for finding strong components of an arbitrary directed graph. One
fine algorithm for computing stratification in the case of autoepistemic theories
was described in [10]. This algorithm works in O(1) time and uses the specific
properties of a characteristic graph. It can be adopted to the case of default
theories.
Now, we will discuss the complexity of reasoning with stratified default the-
ories.

D e f i n i t i o n 23. Let (D, W) be a finite stratified default theory and function p


of range 1 , . . . , K be its stratification. An extension tree T(D, W) has subsets of
s as vertices and is defined as follows:
1. W is the root of T(D, W) (level 0 vertex),
2. for any i, 1 < i < K - 1, every vertex U of level i has as its children all the
extensions for the default theory (D~+l, U). A

As a direct consequence of Proposition 12 and Corollary 13 we see that the


vertices of T(D, W) of level K are exactly the extensions for (D, W). Also,
from Corollary 8 it follows that in the case of strongly stratified default theory
469

vertices of level K always exist. In fact, in such a case, all the leaves of T(D, W)
have depth K and correspond to all extensions for (D, W). Reasoning with a
stratified default theory (D, W) can be perceived as searching the tree T(D, W).
Let us denote by k~ the number of vertices of level i - 1 in T(D, W), that is the
number of extensions for (D<~, W). Let n~ =1 Di I and mi be the total number of
justifications in all defaults from D~. The tree T(D, W) can be searched in many
ways. For example both depth-first-search and breadth first search can be used.
In any case, whenever the search of whole T(D, W) is needed the required time
measured as the number of calls to a propositional provability oracle is bounded
by

where f(n, m) is the maximal number of calls to propositional provability rou-


tine needed to compute all extensions for a theory with n defaults having m
justifications. Known algorithms for this problem (reduet based or order based
methods, see [11] for fine presentation) are based on f(n, m) = 2'~w(n,m) for
some w(n, m) _< m n + n 2, in the most general case. Also notice that if the poly-
nomial hierarchy does not collapse then f(n, m) may not be polynomial~ The
performance of algorithms based on searching T(D, W) to compute all exten-
sions or detect whether at least one extension exists strongly depends on the
numbers ki, that is on the properties of a particular input theory. Let us analyze
some cases.

- If K = 1 (trivial stratification) then we have the same complexity as in the


general case O(f(n, m)). For example, for a straightforward reduct based
algorithm this is O(n(n + m)2~).
- If K -- n then all ki = 1 and ni = 1. Each default is tested at most once
to find whether it is applicable and therefore n + m calls to a propositional
provability routine are enough to complete the job.
- If D is partitioned into K > 1 strata of approximately same size of n/K
defaults with m / K justifications. Then each stratum has up to q -- 2n/2g
extensions. So, kl -- 1 and ki ~ k i _ l 2n/2K for i : 2 , . . . , K , and

K
Z kJ(ni, mi) <_f(n/K, m/K)(l+q+q2+...+q K-l) <_2f(n/K, re~K)2 '~/2.
i=1

So, for f (n, m) = n(n + m) 2~ we receive the bound vO(~Kn__{~K


n 4-
-- m~onl2+n/K~
K J= J"
In particular for n = m the speedup ratio is 2'~/u K--~2
2 "

We conclude the paper with a corollary summarizing major results in terms


of possible speedups in query answering.

Corollary24. For a default theory (D, W) with D stratified into D1,..., DK


and for Ko such that (DKo U... U DK, W) is strongly stratified or Ko -- K:
L70

1. Query "Has (D, W ) an extension?" can be answered positively when at least


one node o f T ( D , W ) of level Ko - 1 is found.
2. For any formula ~ and query "Is ~ true in at least one extension of (D, W )
".

(a) the query can be answered positively when at least one node U o f T ( D , W )
of level Ko - 1 is found and ~ is true in U.
(b) the query can be answered negatively if ~ is not true in all nodes of
T ( D , W ) of some level I > p(~).
3. For any formula ~ and query "Is ~ true in all extensions f o r (D, W ) ?":
(a) the query can be answered positively when ~p is true in all nodes of (D, W)
of any level l, 1 < l < K .
(b) the query can be answered negatively if ~ is not true in at least one node
of (D, W ) of some level 1 such that 1 > Ko - 1 and I > p(~).
A

References

1. K. Apt and H.A. Blair. Arithmetical classification of perfect models of stratified


programs. Fundamenta Informaticae , 12:1 - 17, 1990.
2. K. Apt, H.A. Blair, and A. Walker. Towards a theory of declarative knowledge.
In J. Minker, editor, Foundations of Deductive Databases and Logic Programming,
pages 89-142, Los Altos, CA, 1987. Morgan Kaufmann.
3. N. Bidoit and Ch. Froidevaux. General logical databases and programs: Default
logic semantics and stratification. Information and Computation, 91:15-54, 1991.
4. P. Cholewifiski, W. Marek, A. Mikitiuk and M. Truszczyfiski. Experimenting with
default logic In Proceedings of ICLP-95, MIT Press, to appear.
5. P. Cholewifiski. Seminormal stratified default theories. Technical Report 238-93,
University of Kentucky, Lexington, 1993.
6. D. W. Etherington. Reasoning with Incomplete Information. Pitman, London, 1988.
7. M. Gelfond. On stratified autoepistemic theories. In Proceedings of AAAI-87, pages
207-211, Los Altos, CA., 1987. American Association for Artificial Intelligence, Mor-
gan Kaufmann.
8. G. Gottlob. Complexity results for nonmonotonic logics. Journal of Logic and
Computation, 2:397-425, 1992.
9. H.A. Kautz and B. Selman. Hard problems for simple default logics. In Principles
of Knowledge Representation and Reasoning, pages 189-197, San Mateo, CA., 1989.
Morgan Kaufmann.
10. W. Marek and M. Truszczyfiski. Autoepistemic logic. Journal of the ACId, 38:588
- 619, 1991.
11. W. Marek and M. Truszczyfiski. Nonmonotonie Logics; Context-Dependent Rea-
soning. Springer-Verlag, 1993.
12. I. Niemel~. On the decidability and complexity of autoepistemic reasoning. Fun-
damenta Informaticae, 1992.
13. R. Reiter. A logic for default reasoning. Artificial Intelligence, 13:81-132, 1980.
14. M. Truszczyfiski. Stratified modal theories and iterative expansions. Technical
Report 159-90, Department of Computer Science, University of Kentucky, 1990.
A HOMOMORPHISM CONCEPT FOR w-REGULARITY

NILS KLARLUND*
BRICS t
DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY OF AARHUS
NY MUNKEGADE
DK-8000 AARHUS C, DENMARK.

ABSTRACT. The Myhill-Nerode Theorem (that for any regular lan-


guage, there is a canonical recognizing device) is of paramount im-
portance for the computational handling of many formalisms about
finite words.
For infinite words, no prior concept of homomorphism or struc-
tural comparison seems to have generalized the MyhiU-Nerode The-
orem in the sense that the concept is both language preserving and
in a natural correspondence to automata.
In this paper, we propose such a concept based on Families of
Right Congruences [3], which we view as a recognizing structures.
We also establish an exponential lower and upper bound on the
change in size when a representation is reduced to its canonical form.

1. OVERVIEW
An i m p o r t a n t and only partially solved problem in the theory of ~o-regular
languages is whether representations can be minimized. For usual regular lan-
guages, deterministic finite-state a u t o m a t a (DFAs) are recognizing structures
t h a t can be minimized easily in polynomial time by virtue of the Myhill-Nerode
Theorem. Although w-regular languages enjoy some of the same properties as
regular languages, see [6], the lack of similar minimization Mgorithms is a m a j o r
i m p e d i m e n t to building verification tools for concurrent programs.
The syntactic congruences of Arnold [2] provide canonical algebraic struc-
tures for w-regular languages. By themselves, these congruences provide no
explicit acceptance criteria just as in the situation for a regular language: the
canonical right congruence, whose classes are a u t o m a t a states, does not define a

*The author was partially supported by a Fellowship from the Danish Research Council.
~Basic Research in Computer Science, Centre of the Danish National Research
Foundation.
472

language--unless certain states are designated as being final. Similarly, Arnold's


congruences have only the ability to recognize, which is a property called satura-
tion. Arnold's congruences can be extended so that acceptance becomes explicit
and thus a language preserving homomorphism concept arises. But, unlike the
Myhill-Nerode Theorem, which is based on right congruences, canonicity in [2]
is obtained for full congruences, which are usually exponentially bigger than
one-sided congruences.
Maler and Staiger [3] focus on the canonical right congruence =L on finite
words of a language L of infinite words. This congruence is defined by x =L Y if
and only if for all infinite a, x-a C L if and only if y.c~ E L. (We use x, y, u, v, w to
denote finite words and c~, fl to denote infinite words). The concept of a Family
of Right Congruences (FORC) suggested in [3] is there used to characterize
w-regular languages that are accepted by their canonical right congruence --=L
extended to a Muller automaton.
FORCs are also not language recognizing. But they do enjoy canonical prop-
erties with respect to saturation as we prove in this paper. Similarly, the right
binoids of Wilke [8] are algebraic devices that characterize regular sets of fi-
nite and infinite words based on a saturation concept embedded in a notion of
recognition by homomorphism.
Other algebraic approaches to w-regularity include the semigroup approach
of [5] and the right congruences proposed in [4]. The latter congruences are
defined in terms of language recognizing automata, but only some w-regular
languages are recognizable by their canonical automaton defined by their lan-
guage [7].

I n t h i s p a p e r . In this paper we regard FORCS as language accepting devices


rather than as the transition structures of underlying Muller automata. Then
FORCS may be viewed as separating the characterization of the topological clo-
sure of the language from that of the dense part.
The closure corresponds to the canonical right congruence. The classes of
this relation for which there is an infinite suffix that makes words in the class
belong to L describe the closure of L: an infinite word is in the closure if and
only if all of its prefixes belong to these classes. The closure is also called a safety
property in the theory of concurrent systems. A FORC represents the closure
by what we here call a safety congruence, which is a refinement of the natural
right congruence. (The results of [3] show under which conditions this safety
congruence may be used with a Muller condition to accept languages that are
not necessarily closed.)
The dense part of L is described by a collection of right congruences, here
called progress congruences, that specify the cyclic behavior any word even-
tually exhibits according to Ramsey's Theorem about finite partitions of the
natural numbers. Thus it is natural to view these congruences as an algebraic
formalization of progress towards the dense part, known as a liveness property
in concurrency [1] .
We show that a Myhill-Nerode Theorem exists that declares a unique min-
imum representation of an w-regular language under a structural comparison
473

that is language preserving. Also, we clarify the notion of refinement of FORCs


presented in [3].
Our representation is that of a FORC extended by explicit enumeration of
accepting progress states. We call such a device an LFORC, since it is Language
accepting. Under the automata-theoretic view, an LFORC is a Family of DFAs
(FDFA).
We introduce a concept of retraction between LFORCs and show that it
is language preserving. We also formulate a retraction under the automata-
theoretic view as an FDFA homomorphism. From a given FDFA, the homomor-
phism involves implicitly formed product state spaces that may be exponentially
larger than the FDFA itself.
Our main result is that among all LFORCs recognizing a language L there
is a canonical or minimum one under our notion of retraction. Thus all such
LFORCs retract to this minimum LFORC.
The canonical LFORC was already defined, as a FORC, in [3]. It was implied
there that with respect to saturation this FORC is canonical for a straightforward
notion of refinement. This result, however, does hold only in certain situations.
We provide a simple counter-example for the general case.
The primary consequence of our generalization of the Myhill-Nerode Theorem
is that minimization of ~-regular representations is reducible to calculations
involving only regularity or usual finite-state automata. We show how any FDFA
can be retracted to the minimum FDFA by structural operations that do not
refer to acceptance of infinite words.
The minimization of FDFAs may yield an exponential blow-up in size. We
establish both the lower and the upper bound. This blow-up can occur only
for the progress congruences, whose number of equivalence classes may grow
exponentially. The safety congruence, however, can only shrink.
We also show that a kind of inverse refinement holds for the progress congru-
ences: for any FORC, every progress congruence that is minimized with respect
to the safety congruence is refined by the product of the safety congruence and
the canonical progress congruence. Thus during minimization, the progress con-
gruences become finer in a sense whereas the safety congruence becomes coarser.

A p p l i c a t i o n s to m i n i m i z a t i o n . From [3], it follows that there are polynomial


translations fl'om FORCs to deterministic Rabin or Streett automata (with a
number of acceptance pairs that is roughly logarithmic in the state space size).
This is unlike the situation for Arnold's congruence that may be exponentially
bigger than its automaton representation.
But if minimization is involved, there need not be an exponential gain in using
LFORCs instead of Arnold's syntactic congruences, since during minimization
LFORCs may blow up whereas Arnold's congruences can only shrink, i.e. become
coarser. It appears though that FORCs grow drastically in size only if the
progress part is more involved than the safety part.
474'-

2. F O R C s AND L F O R C s
Let E be a finite or infinite alphabet. T h e e m p t y word is denoted e. T h e set
of finite words is denoted E* and the set of infinite words is denoted E ~. A right
c o n g r u e n c e ~ on E* is an equivalence relation t h a t satisfies

x -,~ y implies for all a, x a ,,~ ya.

T h e n each u E E* can be c o n c a t e n a t e d to the right on any equivalence class s


by s u = s I, where s ~ is defined as [xu] with x any m e m b e r of s.
A F O R C ~ = (,,~, o ) consists of right congruences of finite index on E*.
We call the relation ,,~ the s a f e t y congruence. An equivalence class s is also
called a s a f e t y state. T h e safety state of u E E, i.e. the s such t h a t u E s, is
denoted [u]. To each safety state s is associated the right congruence ~ called
the progress congruence of s. An equivalence class p of ~s is called a progress
state. T h e progress state of u with respect to s is denoted [u]s. T h e following
r e q u i r e m e n t m u s t hold:

(FORC) x o y implies sx = sy.

A n o n - e m p t y word x such t h a t s = s . x is called s - c y c l i c .


By ( F O R C ) , an o p e r a t i o n of right-concatenating a progress state p of ,-~s to
s is defined by
s'p-~ s'X,

where x is chosen so t h a t x E p. A progress state p such t h a t s - p = s is called


cyclic. T h u s the progress state according to ~ of an s-cyclic word is cyclic.
An ( s , p ) - f a c t o r i z a t i o n of a word a E E ~, where s is a safety state and p is a
progress state of o is a collection v0, vl, v 2 , . . , of n o n - e m p t y f a c t o r s such t h a t
a = v o v l v 2 . . , and for all i > 0, v 0 . . . v i E s = svi and vi E p. If in addition,
p = pvi (for all i > 0), then the factorization is said to be progress cyclic.
T h e following l e m m a s u m m a r i z e s results in [2] and [3].

Lemma 1. (Factorization) Given a F O R C ~ = (,,~, ~ ) .


(a) Every a E E ~ a d m i t s a cyclic (s, p)-factorization for some (s, p).
(b) Moreover, if a = x y ~, then it a d m i t s s o m e (s, p)-factorization v0 = x y m
and vi = y~, i > 0 for some m, n > 0. This factorization is also denoted

o~ --=-x y ~ ( y ~ ~
s p

(c) Every oL = xy ~~ a d m i t t i n g an (s, p)-factorization has a factorization

O ~ V0 ~ V w

s P

(These factorizations m a y even be assumed progress cyclic.)

For (s, p) define L(~,p) to be the set of words a d m i t t i n g an (s, p)-factorization.


475

A live assignment A associates to each s a subset A~ of progress states of o .


A FORC (,,~, o ) together with a live assignment h is called an LFORC (for
Language recognizing FORC) and denoted ,~ = (,,~, o, h). The language recog-
nized by (,,% o, A) is the union of L(s,p) for p E A~ and is denoted L(,~, o , A).
Thus it consists of the words that allow some (s, p)-factorization with p E A~.

2.0.1. Example. An LFORC is perhaps best understood as a familyof automata.


The w-regular language Z*(a w t3 b~), where E = {a, b}, can be represented by
an LFORC specified by three automata:
a a b

-~a ~b 1 b 2 a

b b a ~ b

The first automaton defines the safety congruence ,,~ as x ,~ y if and only if the
last letter in x and in y are the same. The congruence o 1 is specified by the
second automaton. Each state is marked with the corresponding safety state
according to the requirement (FORC). The states in A1 are marked by an inner
circle. The other progress congruence ~ is shown as the last automaton.
There is another LFORC representation of the same language with a simpler
safety congruence and a more complicated progress congruence:
a

a,b a,b

0 b

[]
The size of an LFORC is the maximum index of its congruence relations. Thus
the size of the first LFORC above is three and the size of the second one is four.
4.76

(One could also have defined the size as the total number of classes, but this
number is at most quadratically bigger.)
Given L and (--~, 0 ) , define the natural live a s s i g n m e n t A by letting A L consist
of the p such that some a C L allows an (s,p)-factorization, i.e. such that
L M L(s,p) is non-empty. Then L C_ L ( ~ , 4 , AL).
A language L is saturated by (--%o ) if for all a and • both admitting an
(s, p)-faetorization, it holds that a E L if and only if • E L or, in other words, if
for all ( s , p ) , either L(,,p) C L or L(s,p) M L = O. Thus for a E L, we may choose
any factorization to determine whether c~ C L.
We can express the saturation property of [2] as follows.

L e m m a 2. (Saturation)
L is recognized by (--%o , A n) if and only if L is saturated by (~, o ) .

3. I:{EFINEMENTS AND RETRACTIONS

We say that ~ refines ~ if x ,-~ y implies x ~ y. Then for s_ an equivalence


class of _-, Is_I~ is the number of equivalence classes of ,-~ contained in s. More-
over, if s is an equivalence class of ,,% then [s]~_ is the equivalence class of ~ that
contains s.
In the following, we always assume that A is the natural live assignment.

L e m m a 3. (Cyclicity) If,~ refines ~ and x E s_ = s_.y, then for some i, j < Is_L
and some s C s_, x . yi E s = s . yJ.
In particular, when
O~ ----- ~t Vw

is a factorization in (~_, o ) , then there is a factorization

= u v i .( v j )~
s p

in (~, o ) with s C s_ and i, j < is_l~. We say that the former factorization induces
the latter.

Proof. Note that the u-states of x, x . y , x . y 2 , . . . are all among the Is_l~ different
~-states contained in s_. []

LFORC ~ = (~, o , A) retracts to LFORC ~ = ('% o , A) if


(R-S) x~yimpliesx-~y
(R-P) for all s_ of ~,
if for all s of ~ contained in s,
x~
and
for all v and all i < IsI~,
s ( x v ) i = s implies ( x v ) i o (yv)i,
then x --s_
o Y-
477

(R-A) for all s of _~, all x such that s = sx,


all s of--~ contained in s_, and all i _< Is_I~,
if s = sx ~, then [xi]s E As iff [x]~ E A~_.
The condition (R-S) expresses that ~ safety-refines ~_, i.e. that the safety con-
gruence of ~ refines the safety congruence of ~.
Unfortunately, it is not sufficient to formulate a similarly simple requirement
for the progress congruences. In fact, Example 2.0.1 shows that the minimum
progress congruence may become more complicated as the safety congruence is
made coarser!
Instead, condition (R-P) expresses that the product of all ~ where s is
contained in s, augmented with a condition about finite iterations, refines --~.
The intuition is that when .-~ is collapsed to ~, an s_-cyclic word z in _~ may
induce an s-cycle in ,~ only when repeated a number of times that is at most
Is_I~. Requirement (R-P) stipulates that if x and y are equivalent with respect of
all such repetitions for s a subset of s, then x and y are equivalent with respect
to progress for s_.
Finally, condition (R-A) expresses that acceptance in/~ is matched by ac-
ceptance in ~ in the following sense. Let x be an s_-cyclic word such that x i is
s-cyclic, where s is contained in s_. Then x i is in a state of As_ if and only if z is
in a state of As.
Note that in the case -.~ = _~, (R-P) simply states that x,.%~y implies x~
and (R-A) states that [x~]~ E As if and only if [x]~ E A~. Thus if 0~ and
0
_~ are regarded as usual DFAs with final states As and As_, then (R-P) and
(R-A) expresses that a usual automaton homomorphism exists from the former
a u t o m a t o n to the latter.
Saturation by FORCs characterizes w-regularity [3]. Similarly, we have

L e m m a 4. The class of languages recognized by LFORCS is the class of w-


regular languages.

Proof. The acceptance criterion of an LFORC can easily be encoded by a non-


deterministic Biichi automaton that guesses the factorization.
Vice versa, it can be seen that any deterministic automaton with the Streett
acceptance condition gives rise to an LFORC. Recall that a Streett acceptance
condition consists of a list of pairs of subsets of states, called "red" and "green"
states. A run is accepted if it holds for each pair that if green states of the pair
occurs infinitely often, then red states of the pair occurs infinitely often. The
progress information along a cycle in the automaton consists of recording which
"red" and "green lights" have been seen since the beginning of the cycle. The
acceptance condition for the progress automaton is that for each pair for which
a green state has been encountered, also a red state has been encountered. []

Given a language L, Maler and Staiger define a canonical FORC (,.~L, o n )


and show that it saturates L. The corresponding canonical LFORC ,~L = (,..~L
, o L , A L) of any w-regular language L is then defined as follows:
9 z ,,~L y if for all a, xa E L i f f y a E L, and
z~78

. ~L is the right congruence ~ defined as x O ~ y iff


(~ s x = sy, and
(~ for all v, if u E s = s x v = s y v then u ( x v ) ~ E L iff u ( y v ) ~ C L
9 A L is the natural live assignment.
L e m m a 5. ~L recognizes L.

Proof. Since L(2, L) is w-regular it suffices to verify that each word of the form
uvco is accepted if and only if it is in L.
So assume xy~ is in L. By Lemma l(b), x y ~ can be factorized as

s p

The progress state p is then in A~ by definition of the natural live assignment.


It follows that x y ~~ E L ( s
Vice versa, if xyco is in L(~,L), then it admits a factorization of the above
form with p is in As. Thus there is some word uvco in L that also has a (s, p)
factorization. By L e m m a l(c); u v W has a factorization
U! V! co

s p

By definition of (~L, ~ ) , it follows that x y ~ E L if and only if u'v'co C L. Thus


x y ~~ E L. []

The canonicity of 12L is explained by the following result.


T h e o r e m 1. (Canonicity) Any ~ recognizing L retracts to ~L.
T h e o r e m 2. (Language Preservation) If s recognizes L and s retracts to ~,
then _~ recognizes L.
3 = ("% o ) is ,,~-canonical for L if (~, o ) saturates L and any other FORC
with safety congruence ~ and saturating L retracts to 3.
P r o p o s i t i o n 1. (~-Canonicity) If ~ refines -VL, then a ,-~-canonical ~ exists.

Proof. Define ~ by

(l~~ if s x ~ s y and
for all v, s = s x v implies u ( x v ) ~ C L iff u ( y v ) ~ E L, where u E s.

Since ~ refines ~L, the choice of u in (1) is immaterial and it can be seen that
saturates L. Also, it is not difficult to see that any other FORC with ,-~ as
safety congruence retracts to 3. []

We noted before that the progress and live state conditions for a retraction
involving LFORCs with the same safety congruence essentially express DFA
homomorphisms. Thus for a fixed safety congruence ,-% DFA minimization that
respects requirement (FORC) can be applied to obtain the ~-canonical LFORC.
479

We shall next show that when the safety congruence becomes simpler, the
progress congruences get more complicated but they still essentially contain the
simpler progress congruences if these are minimum with respect to their safety
congruence.
Assume ~ retracts to ~_. Then ~ progress-refines ~ if for s E s__,
0
x~_~y and s x ,,~ s y implies x,'~
0 sy

Thus the product of o and --~ refines o.

T h e o r e m 3. (Progress Refinement) If ~ is ,-~-canonical and ~ retracts to ~,


then ~ progress-refines ~.
0
Proof. Assume x~_~_y and s x ,~ sy. To prove that x~ it suffices to prove
that a = u ( x v ) ~ E L i f f f l = u ( y v ) ~ E L whenever u E s = s x v . Now since
x v ~_~_y v and u E s = s x v , both a and/~ admit a (s,p_)-factorization in ~, where
p_ = [xv]~ = [yv]~_. Since ~ retracts to ~ and ~ saturates L, also ~_ saturates L.
ThusaELiffflEL. []

4. COLLAPSED L F O R C S
Let ~ = (,,% o , A) be an LFORC and ~ be a refinement of ~. Define the
collapsed LFORC ~ = (~-, o , A__)by

(2) ~
x~-~y iff for all s of ~ contained in
s, x ~ and
for all v and i < Is_I~,
s_xv = s implies (xv) i ~ (yv)~
(3) p_EA~_ iff [xi]~Ehs, wherexEp_,
s_x = s_, s x ~ = s, a n d s _ s

It can be seen that (2) always defines a right congruence. The definition (3)
may not always make sense. The C o n s i s t e n c y R e q u i r e m e n t is that for all s the
membership of a progress state pp in As_ is determined unambiguously by (3) for
any choice of x, i, and s.

L e m m a 6. The collapsed LFORC ~ recognizes L(~) if the Consistency Re-


quirement holds. If the Consistency Requirement does not hold, then ~ does
not refine ,,~L.

Proof. If the Consistency Requirement holds then ~ is a refinement of ~.


If the Consistency Requirement does not hold, then there are s, _pof o s, s'
of ,-~ contained in s_, x~ where x , y E p, and i , j ~ IsI~ with s_x = s = s__y,
s x i = s, s' = s ' y i. such that [xi]~ E As and [YJ]s' ~ As,. Thus if u E s and
v E s', then u x ~ E L and v y W ~ L. By (2) and since x~ - - x i ~ $ yi . Thus
u y ~ E L. But then u and v are not equivalent with respect to ,,~L. []
Z.80

5. L o w ~ a BOUND
We establish an exponential lower bound for minimization, that is, there is an
infinite family Ln of languages that can be represented by LFORCs of size O(n)
but whose canonical LFORCs contain a progress congruence with n ~ states.
We let E~ = E A U E B consist of n trigger letters E A = { a l , . . . ,aN} and
of n response letters E B = { b l , . . . ,b~}. A word a is in L,~ if from some point
on there is a trigger letter ai such that ai occurs infinitely often and no other
trigger letter occurs; also, the number of bis between two consecutive letters ai
must be exactly n. There is certainly only one safety class in this language since
a word is recognized by properties of its tail. Define a linearly big L F O R C 12n
recognizing Ln by using the safety congruence represented by the a u t o m a t o n
depicted for n = 3 as

a2 a3

a~, ~ a3, C3B

a2

Thus each state sl is a sink for the letter al, and a response letter does not
affect the state. Continuing our example for n -- 3, we can represent the progress
congruence for safety state 1 by the a u t o m a t o n
481

~-~1 ~1 lo~

(i 2
I
::,,( 1 , ~ . . . - . - ~ - ~

a2~a3
\
32
a2-1~a3
t
1.~~,

\
32
I
.~,]

a2-1~
b2,b3
bl

32 3

l
al

a3, r,~

al

This congruence respects the safety congruence. Any factorization of a ac-


cording to s and a progress state of this automaton yield the right answer as to
whether a is in L,~.
Thus it can be seen that for any n, the progress automaton has 2n + 1 states.
We conclude that L,~ can be represented by an LFORC of size O(n).
For the lower bound, assume that the progress congruence o f ~L~ has less
than ( n + l ) '~ states. If we say that the signature of a word over E B is determined
by the number of his for each i, then there are (n + 1) '~ signatures, where for
all i, there are at most n bis. Thus there are two words x and y over E B with
different such signatures that lead to the same progress state. For some i, x
and y contain a different number of bis and it is then possible to find a word
u E b* .ai such that (xu) ~ E Ln and (yu)~ g~ L,~. This contradicts that ~n~
accepts L,~. Thus ~s;~ has more than (n + 1) '~ states.

P r o p o s i t i o n 2. There is an infinite family of LFORCS J~,~ of size O(n) whose


corresponding canonical LFORCS have size at least n n.
Note that the languages L,~ can also be recognized by deterministic a u t o m a t a
of polynomial size with a pairs condition.
P r o p o s i t i o n 3. There is an infinite family of deterministic a u t o m a t a A,~ of size
O(n 2) whose corresponding canonical LFORCS have size at least n n.

Proof. Construct An as follows. To each ai corresponds an ith component that


counts occurrences of bi. If an ai occurs at the wrong time--either because the
current component is i and the number of b~s encountered is wrong or because
an aj with i r j occurs when the current component is / - - t h e n a red light is
flashed (and in the latter case the automaton moves to component j). Also a
482

green light is flashed every time n occurrences of bi has taken place in component
i. An infinite word is accepted if the red light flashes only finitely often and the
green light flashes infinitely often.
Then An has size O(n 2) and accepts Ln. []

6. THE AUTOMATA-THEORETIC VIEW AND UPPER BOUND


As indicated in Example 2.0.1, an LFOIKC ~ can be represented by a family
of automata. The safety relation ~- is represented by a safety automaton ~ =
(S, s ~ 6) with state space S, initial state s ~ and deterministic transition function
6 : E ~ S ---* S such that

x ,-, y iff ~ ( x ) = ~(y),

where ~ ( x ) denotes the state of ~ upon reading x. Thus we may continue to


identify each safety class s with a state s E S. For each such s, we represent s
and As as an automaton ~3s = (Ps, p0, 6s, PF), where P y is a set of final states.
Here each p represents a progress state such that x ~ y if only if ~ s (x) = ~38 (y)
and [x]8 E As if and only if ~ s ( x ) E p F . The family of automata or FDFA so
defined is denoted ( 6 , ~3).
To formulate a retraction as a homomorphism, we need an operation I C
that transforms ~8 into an automaton I c J ( ~ ) that represents the iterative
condition of (R-P) as follows:

(4) I c ( v s ) ( x ) = Ic(vs)(y)
iff

x ,-is y and for all v and i <_ If-l(s)l, s(xv) i = s ~ (xv) i ~ (yv) i

L e m m a 7. For s and j, an automaton IC(~8) exists such that (4) holds. The
a u t o m a t o n is at most exponential in size of ~38.

Pro@ The proof consists of defining an automaton 92 that is able to distinguish


words according to or even more strictly than the criterion (4). The automaton
IC(~38) is then a coarsest refinement of 92.
Note that a transition relation 5 : E --+ P ---+ P can be extended to a function,
also denoted 6, of type E* -+ P --* P by defining 5 ( a o . . - a n ) = 5(a,~)o...oh(ao).
By the standard technique for obtaining syntactic monoids, let 92 constructed
from q3e be an automaton whose state space consists of the functions 5s(x) such
that q = 92(x) is the function 5s(x). This automaton is exponential in size of
~38.
Each q determines a function q : P -+ P, where q(p) is the only p~ such that
the entry (p,p~) is 1.
Since each q also determines the state p = q(pO) reached from the initial
state p0, we may define an operation q - v, which denotes a state in P, namely,
q . v = 58(v)p. Thus if 92(x) = q, then q . v is simply 6s(xv). Moreover, we may
483

even define an operation (q. v) i so that if P./(x) = q, then (q. v) i is 5~((xv)i).


This is done by Jetting (q. v) i be ~ ( v ) o q o . - . o 5~(v) o q o 5~(v)(q(p~ where
6s (v) o q is repeated i - 1 times.
9./does not quite calculate what is needed in (4), but satisfied the weaker
requirement:

(5) =

implies

x o y and for all v and i ~ I~1, = s o

To see this, assume 9.1(x) = 9fl(y). Then ~ , ( x ) = ~ ( y ) = q(p0), whence x ~


Moreover, 5s((xv) ~) = (q . v) i = 5s((yv)i). Thus in particular, it holds that if
i_< I f - l ( s ) l , then s(xv) i = s ~ (xv)i~ (yv) i.
Note that by (FORC), there is a subset P~ of progress states such that s = sx
if and only if ~ (x) 9 P : .
To make the other direction of (4) hold, we will shrink 9fl according to the
following characterization of states:

x(q) = (q(p0), {(p, i, L) I i < ]f-l(s)l and L = {v I (q" v) i = P 9 P:})

The Ls are all regular languages and so X Can be computed by operations on


usual finite-state automata. The function X induces a partition of P.I. Let I C ( ~ s )
be the automaton corresponding to the coarsest partition of ~ that refines the
one induced by X- We also use qs to denote the states of this automaton. If
q = I C ( ~ s ) ( x ) , then x(q) = (~3s(x), { ( p , i , L ) [ i <_ j and L = {v I ( x . v ) i = p 9
P:}).
Thus (4) is satisfied. []

An FDFA homomorphism h : (~, ~ ) i-+ (2, ~ ) consists of

9 a transition system homomorphism (i.e. a mapping respecting right con-


catenation) f : ~ -+ 2 ; and
9 for each s C _S, a transition system homomorphism

| • %,
seA:f(s)=A
where | denotes the transition system cross product, such that for all
s 9 S with f(s) = s_ and all i < If-l(s_)l,

V1L = L(V~) N L, where


L =

and L~(~) = {x 5(x)(s) = s, i.e. sx = s} and r denotes the language


{x Ix i E L}.
484

L e m m a 8. Let (8, 9 ) be the automata representation of ~ and let (~,__~) be


the automata representation of ~. Then ~ retracts to _~ if and only if there is
an FDFA homomorphism from (8, ~ ) to (~, ~ ) .

Proof. Requirement (R-S) and (R-P) correspond to the existence of f and g.


Requirement (R-A) is then encoded correctly as shown above since n = L~_(~) N
defines the x such that s = sx and s = sx i. []

Let ( 8 L, ~ ) be the FDFA representation of ~n. We can now restate The-


orem 1 and Theorem 2 as theorems about FDFAs and their homomorphisms.
T h e o r e m 1 ~. (Canonicity) Any FDFA recognizing L allows a homomorphism
to (8 L, ~).
T h e o r e m 2'. (Language Preservation) If ( 8 , 9 ) recognizes L and there is a
homomorphism from (8, ~ ) to (2, ~ ) , then (~, ~ ) recognizes L.

P r o p o s i t i o n 4. (Upper Bound) If ~ has size n and retracts to s then the


size of s is at most n ~ .

Pro@ Clearly the safety congruence can only shrink. For the progress part, use
the FDFA representation (| ~ ) of ~ and assume that the canonical safety state
s_ is refined by the s of 8 such that f(s) = _s. Observe that @,es./(s)-s IC(~s)
refines the canonical progress automaton or congruence for s. Thus the-canonical
congruence is of size at most ( n ' ) ", which is n ~ . []

7. CONCLUSION
We have presented algebraic homomorphism concepts based on right congru-
ences introduced by Maler and Staiger. With these concepts, we have shown that
a representation of w-regularity exists that extend the Myhill-Nerode Theorem.
However, even as LFORCS can be reduced to a canonical form and any Rabin
or Streett automaton with a small acceptance condition is poly-size related to
LFORCS, it is still open whether the Myhill-Nerode Theorem can be generalized
to an algebraic setting that both is poly-size related to usual automata on infinite
words and allows poly-time computable reductions to canonical forms.
A c k n o w l e d g e m e n t s . Thanks to Dexter Kozen, Oded Maler, Ludwig Staiger,
Igor Walukiewicz, and Thomas Wilke for discussions about w-regular represen-
tations. Also, thanks to the referee for helpful comments.

REFERENCES
1. B. Alpern and F.B. Schneider. Defining liveness. In]ormation Processing Letters,
21:181-185, Oct. 1985.
2. A. Arnold. A syntactic congruence for rational w-languages. Theoretical Computer
Science, 39:333-335, 1985.
3. O. Maler and L. Staiger. On syntactic congruences for w-languages. Technical Re-
port 93-13, Aachener Informatik-Berichte, 1993. A preliminary version appeared in:
Proc. STACS 93, LNCS 665, Springer-Verlag, Berlin 1993, pp. 586-594.).
485

4. B. Le Sac. Saturating right congruences. Informatique Thgorique et Applications,


24:545-560, 1990.
5. B. Le Sac, J.-E. Pin, and P. Weil. Semigroups with idempotent stabilizers and
applications to automata theory. International Journal of Algebra and Computation,
1(3):291-314, 1991.
6. W. Thomas. Automata on infinite objects. In J. van Leeuwen, editor, Handbook of
Theoretical Computer Science, volume B, pages 133-191. MIT Press/Elsevier, 1990.
7. D. L. Van, B. Le Sac, and I. Litovsky. Characterizations of rational w-languages by
means of right congruences. Theoritical Computer science, ?(?), 1995. To appear.
8. T. Wilke. An Eilenberg theorem for co-languages. In Proc. 18th Inter. Coll. on Au-
tomata, Languages, and Programming, LNCS 510, pages 588-599. Springer Verlag,
1991.
Ramified recurrence and
computational complexity II:
Substitution and poly-space

Daniel Leivant Jean-Yves Marion


Department of Computer Science Universit@ Nancy I
Indiana University CRIN - CNRS - INRIA Lorraine
Bloomington, IN 47405 54506 Vandoeuvre-l@s-NancyCedex
United States France
leivant @cs. indiana, edu Jean-Yves.Marion@loria. f r

A b s t r a c t . We prove an applicative characterization of poly-space as the set


of functions over W = {0, 1}* defined by ramified W-recurrence with param-
eter substitution. Intuitively, parameter substitution allows re-use of space in
ways disallowed by ramified recurrence without substitution: it permits cap-
turing by recurrence the flow of computation backwards from accepting config-
urations, thereby enabling the simulation of parallel (alternating) computing.
Conversely, parameter substitution can be simulated by a computation that
can repeatedly bifurcate into subcomputations, i.e. by parallelism that can be
captured in poly-space.

1 Background
1.1 Subrecursion and computational complexity
Recent interest in machine-independent characterizations of computational com-
plexity has been motivated by the wish to lend credence to importance of complex-
ity classes, provide insight into their nature, relate them to programming method-
ology, suggest new tools for separation, and offer generalizations to computing
over arbitrary structures and to higher type functionals. Machine-independent ap-
proaches fall, by and large, into three classes: proof-theoretic, descriptive (database
queries), and algebraic (applicative programs).
Recurrence schemas have been used for long as an applicative method for defin-
ing, characterizing, and classifying natural collections of recursive functions. Sub-
recursive characterizations of computational complexity classes that are relevant to
computer science originate with Cobham's [1965] characterization of the poly-time
functions over l~ by "bounded recursion on notations." Unfortunately, Cobham's
characterization uses ad hoc initial functions and explicit bounds on functions'
growth rate. The correspondence between sub-recursion and computational com-
plexity was clarified by the use of ramified data, introduced independently by Sim-
mons [1988], Leivant [1990b], and Bellantoni and Cook [1992]. One underlying
idea is that data objects are used computationally in different guises. By explic-
itly separating these uses, and requiring that recurrence (i.e. primitive recursion)
487

respect that separation, one obtains forms of ramified recurrence that correspond
closely to major computational complexity classes. Such characterizations have
been provided, for example, for the poly-time functions [Bellantoni and Cook, 1992;
Leivant, 1993], the extended polynomials [Leivant, 1990b], the linear space func-
tions [Bellantoni, 1992; Handley, 1992; Leivant, 1993; Nguyen, 1993], NC 1 and
polylog space [Bloch, 1992], NP and the poly-time hierarchy [Bellantoni, 1994],
and the elementary functions [Leivant, 1994a]. For further backgound on ramified
recurrence the reader is referred to [Leivant, t994b].
We show here that the poly-space functions over W are precisely the functions
defined by ramified recurrence with substitution. The schema of recurrence with
(parameter) substitution is a well known variant of recurrence, where the param-
eters of a recurrence may be altered at each iteration using previously defined
functions.
The role fo recurrence with substitution is best understood when comparing
the present result to the characterization of poly-time in [Leivant, 1993; Leivant,
1994b]. There we used recurrence over the algebra of binary words to simulate
the progressing computation of deterministic abstract machines, by itrerating the
mapping of a configuration into its successor configuration. The length of the pro-
gression, goverened by the ramification conditions, is polynomial in the size of the
input, leading to a characterization of poly-time. Parameter substitution allows the
representation of computation flow backwards, and not only forward: the outcome
of performing t + l computation steps from a given configuration is defined by per-
forming t steps from successor configurations. This allows the simulation of parallel
computing, in particular the computation of poly-time alternating machines over
{0, i)*, i.e. of poly-space.

1.2 Previous machine-independent characterizations ofpoly-


space

Machine-independent characterizations of poly-space in Finite Model Theory have


been known for some time. They seem to originate in Immerman's early work,
where he goes beyond the weak class of first order queries to queries defined by
uniform sequences of formulas. [Immerman, 1981] defines FO[t(n)] as the class of
queries defined by a fixed quantifier block repeated t(n) times and followed by a
fixed quantifier-free formula. 1 Immerman showed that a query over finite ordered
structures is poly-space iff it is FO[2'~~ [Immerman, 1982]. In [Immerman, 1991]
he showed that poly-space is equivalent to uniform definability by a sequence of
formulas with a fixed number of distinct variables; more precisely, a query as above
is in DSpace[n ~] iff it is definable by a sequence of first order formulas using k+ 1
distinct variables (k > 1).
Instead of uniform sequences of formulas, one may consider extending first order
queries by augmenting the language, emulating the characterization of poly-time
1The formulas in FO[t(n)] use a fixed number of variables. At the time Immerman used the
notation Var~ Sz[O(1).t(n)], but he later switched to FO[t(n)].
~-88

by monotonic fixpoints [Vardi, 1982; Immerman, 1986]. The first such charac-
terization for poly-space is in terms of definability by imperative programs with
guarded looping ("while programs") over ordered finite structures, proved inde-
pendently by Moshe Vardi [Vardi, 1982] and Yiannis Moschovakis [Moschovakis,
1983, 2C.8]. Abiteboul and Vianu [Abiteboul and Vianu, 1989] showed that the
poly-space queries over ordered structures are those defined by non-monotonic fix-
point, a result which elegantly complements the characterization of poly-time by
positive fixpoint [Vardi, 1982; Immerman, 1986] and inflationary fixpoint [Gure-
vich and Shelah, 1986; Leivant, 1990a]. A related characterization is Datalog(-~-~)
[Abiteboul and Vianu, 1991]). Alternatively, poly-space is obtained by enriching
first order logic with both a partial fixpoint and some nondeterministic choice
operator [Abiteboul e~ al., 1990], or by enriching Datalog with operations that
represent hypothetical reasoning [Bonnet, 1988] (these characterizations hold also
over unordered structures).
Several other characterizations in finite model theory use higher order logic,
following the seminal characterization of poly-time queries by existential second
order formulas [Fagin, 1974; Jones and Selman, 1974], from which it follows that
the queries expressible in second order logic form the poly-time hierarchy. Notably,
Immerman showed that the poly-space queries (without order assumption) are
obtained by enriching second order quantification with a second order transitive
closure operator; this is immediately equivalent to expressibility in SO[n~ i.e.
by a uniform sequence of second order formulas where a quantifier block is iterated a
polynomiM number of times [Immerman, 1987]. For queries over ordered structures
an equivalent property is definability by an existential second order formulas whose
first order matrix uses non-monotone fixpoints, or similar extensions of first order
logic (see [Abiteboul e~ al., 1994] for a survey).
An algebraic characterization of poly-space in finite model theory is obtained
using higher order recurrence over particular rank-3 types [Goerdt, 1992], in anal-
ogy to the characterization of the log-space queries by simple recurrence [Gurevich,
1983]. Of some relevance are also characterizations of poly-space queries in terms
of imperative computation models (without explicit reference to resource bounds),
such as program schemes with arrays [Tiuryn and Urzyczyn, 1988; Stewart, 1993].
However, in contrast to the characterizations above of poly-space queries in
Finite Model Theory, our characterization is for poly-space functions over a single
infinite algebra, namely the algebra W of binary words, i.e. {0, 1}*, the canonical
medium of Turing machine computing. While subrecursive characterizations of
poly-time and of linear-space, albeit unsatisfactory, have been known for a long
while [Cobham, 1965; Ritchie, 1963], we know of no such characterization thus far
for poly-space (except for the rather pedestrian observation that, by the character-
ization of linear space in [Ritchie, 1963], composing functions in Grzegorczyk's E ~
with )~n.2"k yields poly-space [Thompson, 1972]; a category-theoretic rendition of
that result is in [Otto, 1995]).
489

2 Ramified recurrence with substitution


2.1 Recurrence over free algebras

Most results about applicative delineations of complexity classes are based, ex-
plicitly or implicitly, on data structures other than the natural numbers. For in-
stance, Cobham's "recursion on notations" is in fact recurrence over binary words
in disguise. We find it cleaner and clearer to refer directly to recurrence over free
algebras. ~- For the rest of the paper & will be a free algebra generated from cl ... c~
(k > 0) where a r i t y ( c i ) = r i ~_ O. W e write a r i t y ( A ) f o r m a x ( f t . . , rk). If a E &,
we write lal for the height of (the syntax tree of) a.
Recurrence over & is a set of k templates, one for each constructor:

. . a r , ) , i ) = g c , ( 1 a, '!" a r , , Et , E ) ,
9 i
/(ci(al whereaj=dff(aj,E), i=l...k.

The functions gr above are the recurrence functions, and the argument of f dis-
played first is the recurrence argument 9 The ri arguments of ge~ displayed first
above are the critical arguments, and the arguments displayed as ~ are the param-
eters.

An instance of recurrence is flat if all critical arguments are missing, and it


is closed if it refers to no parameters. For example, the definition exp (0) =
1, exp(sn) = e x p ( n ) + e x p ( n ) is a closed recurrence which is not flat. Typical
examples of flat recurrence are the definitions of the branching and destructor
functions. Branching is defined by

case ( c i ( a l . . . at,), x l . . . Xr) -- xl,

and the destructor functions dStr l , . . . , dstr r ( r = arity ( A ) ), are defined by

aj if j _< ri
dstrj(ci(al...ar~))= ci(~) otherwise.

As an example of an algebra other than 1~ consider the free algebra W generated


from a constant e and two unary functions 0,1. Then recurrence over W takes the
form:

f(e,E) =
f(O o, e) = go(/(", e), w,
f(1,o,e) =

Since W is isomorphic to {0, 1}*, the canonical medium of computational complex-


ity theory, it is the most important algebra in relating recurrence to computational
complexity.

2These schemas are for the most part well known: see, for example, [Venkataraman e~ M., 1983;
Tucker and Zucker, 1988].
490

2.2 Ramified recurrence

Ramified recurrence was discovered independently by Simmons [1988], Leivant


[1990b], and Bellantoni and Cook [1992]. A systematic use of data tiering and
ramified recurrence was introduced in [Leivant, 1993]. We recall here the essen-
tials, and direct the reader to [Leivant, 1994b] for a more detailed account.
Let S(A) be the many-sorted structure with copies .&0, A1 ..- of the algebra A
as universes. We refer to these copies as tiers. Let c~ be the constructor ci in
&j (we omit the superscript, though, when in no danger of confusion). Ramified
recurrence allows the definition of a function f : &i x ,4 --~ At (where A is the
product of some Am's), by

f(4(al...ar,),~) -- g c , ( f ( a l , x ) . . . f ( a r , , x ) , ~ , x ) i = 1...k
provided the tier j of the recurrence argument of f is higher than the tier t of
actually-present critical arguments of each gc~; i.e. either the recurrence is flat, or
else the recurrence argument of f is in a tier higher than the output tier of f .
The ramified PR functions over A are the functions over ,9(A) that are defined
from the constructors of A by ramified recurrence and explicit definitions. We write
RRec(A) for the collection of these functions. We also write f E RRec(A) for a
function f over A if f is obtained from a function in RRec(A) by disregarding tiers.
E x a m p l e s o f r a m i f i e d r e c u r r e n c e . (See [Leivant, 1994b] for more detail and
other examples.)

1. All functions defined by explicit definitions and flat recurrence are trivially
in RRec(A) as functions over A0, or, more generally, as functions over Ai for
any given i.

2. Given a free algebra A as above, and j, m where j > m, define a coercion


function ~r : Aj--* Am, by

3. Addition and multiplication over N, and concatenation over Y~], are all defin-
able by ramified recurrence: For each j, k with j > k we can define a copy
of addition, +jk : l~lj xN~ --*Nk. Then, for each j, k with j > k, we obtain a
ramified multiplication function xjk : N j x Nj --*Nk.
4. Analogously to addition, we have ramified copies of the coneatenation func-
tion, (3j~ 9 Wj xWj--~Wk, (j > k). A multiplicative function | over W can
be defined, for example, by | z, y) = e, | e, x, y) = 9 (3 | z, y),
| e, z, y) = y (3 | z, y). For each j, k, where j > k, we can define by
ramified recurrence a ramified version | of | from Wj • Wj to Wk.
5. Consider the definition above of ezp by ezp (n + 1) = ex.__Ep(n) + ezi0 (n). Since
the first input of ramified addition must be at a tier higher than the out-
put, this recurrence cannot be ramified. A similar argument shows that the
definition of exponentiation via multiplication cannot be ramified either.
49]

2.3 Recurrence with substitution

For our generic algebra & recurrence with (parameter) substitution [Rose, 1984] is
the schema

f(ci(al...a~,),ff,~) = gc,(A~...A~,,~,ff,s } { i=X...k


where A'j = ( f ( a j , Y l j l ( f f ) , g ) . . . f ( a j , Y l j l , j ( 1 ) , ~ ) } j = 1 ...ri

Here 5ijt are vectors of already-defined functions, the substitution functions and
{al...~p)(1) stands for {al(ff)...c~p(ff)}. The arguments ff are the substitution
parameters.
For A = l~l the above reads

f(0, i,~') = g0(if, x)


f(s(n), i , ~) -- gs(f(n, ~l(ff), ~')--. f(n, ~,(ff), ~), n , i ) .

For example, define zp(O, u) = u, xp(sn, u) = xp(n, 2u); then xp(n, 1) = 2n.
It is well known that recurrence with substitution over N is reducible to simple
recurrence, but the standard reduction uses sequence-coding by elementary func-
tions [Rose, 1984]. In the context of sub-elementary complexity classes, it is there-
fore appropriate to consider this schema in its own right. Ramified &-recurrence
with substitution is the schema of recurrence with substitution as stated above, but
applied over S(&), with the provisos that (1) the tier of the recurrence argument be
larger than the tier of critical arguments, and (2) that the substitution parameters
have a common tier. We write RRecSbt(&) for the collection of functions generated
from the constructors of & by ramified recurrence with substitution and explicit
definitions.

Examples

1. The definition of zp cannot be ramified, because it is not a correct definition


over S(N): the tier of 2z is lower than the tier of z under any ramified
definition of ~z.2z, and so the tier of the second argument ofxp is not well-
defined.
2. Consider the recurrence over 1~

f(O,x,y) -- y
f(sn, x,y) -- f(n,x,(y+~)+x)

with f : Na xN1 xN0 ~ N0. The tiering condition on the recurrence argument
is satisfied here, but not the condition that the substitution parameters have
a common tier. By a trivial induction on n we have f ( n , x, O) = 2 n 9x for all
n>O.
3. Consider the truth-definition function for quantified boolean formulas, call
it true, which is well-known to be poly-space complete. A definition by re-
currence (over the syntax of boolean formulas) has a key clause that defines
492

the truth of a universally-quantified formula of length < n + l in terms of the


truth of formulas of length < n: writing # ( 9 ) for a canonical symbolic rep-
resentation of the formula 9, and using a conjunction-representing function
conj, as well as the function
i st:
(where { r is the result of appropriately substituting r for all free occur-
rences o f p in ~), the more significant recurrence clause for the truth-function
true reads:
~rue (sn, #(Vp 9)) = conj(true (n, ins, (#(Vp~)), # t r u e ) ,
*rue (n, insr (#(Vp~)), #false)).
It is easy to see that the functions conj and inst mentioned above are
definable by ramified recurrence over W, and that ~rue is therefore defin-
able by ramified recurrence with substitution from conj and inst. Thus
*rue e RRecSbt(V4). But, by [Bellantoni and Cook, 1992; Leivant, 1993],
lrue ~ RRec(W), unless poly-space -=- poly-time.

3 Parallel register m a c h i n e s over free algebras


3.1 A generic model of parallel computing

We continue our generic treatment of free algebras by using a generic machine model
for parallel computing. We enhance the genericity of the model by parameterizing
it with respect to a "combination function" a: when the computation at some state
s forks into its parallel subcomputations, the output is defined to be ~ applied to
s and to the outputs of the subcomputations. An alternating TM is then a special
case, with W as the algebra and with a(s, x, y) = if s is an existential state then
x V y else z A y.
A parallel register machine (PRM) over a free algebra A as above is a device
M consisting of:
1. a finite set S of states, including a distinct state BEGIN;
2. a finite list II = (~rl...~r "~) of registers (i.e. distinct identifiers); we use
7r, 7rl... to range over H, and we write OUTPUT for 7rm;
3. a finite collection of commands, where a command is of one of the following
forms. Following each form we indicate its intended operational interpreta-
tion; a more formal semantic definition follows.
(a) [Constructor: sTrl ... 7rr,ciTr's'] (When in state s, store in register 7r' the
term resulting from applying the constructor ci to the values stored in
registers r l ... 7rr~, and switch to state #.)
(b) [p-Destructor: szcTr's'] (p < arity(&)). (When in state s, store in regis-
ter 7r' the result of applying dstrp to the value in 7r, and switch to state
S t .)
493

(c) [Branch: sTrsl ...sk] the algebra term in register 7r is % then switch to
state sp .)
(d) [Fork: srsosl] (Return the value c~(s, ao, al), where ai is the value
returned by the computation that starts with state si and the current
register values (i = 0, 1).)
(e) [End] (Return the value in OUTPUT.)

T h e proviso is t h a t for every s ~ S there is exactly one c o m m a n d t h a t starts


with s. 3 We write corn(s) for the c o m m a n d t h a t starts with s. If the c o m m a n d
[Fork] is not used, then the machine is said to be a regisier machine (RM)
over A.

The semantics of a P R M as above is defined as follows. An environment for M


is a m a p p i n g F : II ~ &. We write [ u l . . . u r n ] for the environment F such that
F(~ri) = u ~ , and { r o , - - - a } F for the environment ATr. if rr is r0 t h e n a else
F(~r). A configuration of M is a pair (s, F), where s E S and F is an environment
for M . A combination function for M is a function a : S x &2._.~.&.
Given M and a as above we define a semantic partial-function eval = evalM, ~ :
1~ x S x &m ~ A, t h a t m a p s a "time bound" t and an initial configuration (s, F )
to an output value. T h e definition is by recurrence on t, and by eases on com(s).

* eval(O, s, F) is undefined.
* If corn(s) is [Constructor: sTrl...r~,eiTr's'] then
eval(t+l, s, F) = eval(t, s', F'), where F ' = {rr' ,---ei(F(rrl),..., F ( r ~ , ) ) } F;
* If com(s) is [p-Destructor: sirTr's'], then
eval(t+l, s,F) = eval(t, s', F'), where F ' = {r',--- dstrp(F(r))} F;
9 If corn(s) is [Branch: s~rsl...s~], then
eval(t+ 1, s, F) = eval(t, s', F), where s' = case (F(rr), s l , . . . , s~);
* If corn(s) is [Fork: ssosl], then
eval(t + l, s, F) = a(s, eval(t, So, F), eval(t, sl, F));
* If com(s) is [End] then eval(t + 1, s, F ) = F(OUTPUT).
Note t h a t eva_~lis defined by recurrence with p a r a m e t e r substitution, except for
the "undefined" base case.
Let T : I~---~H, and fix some canonical u E A (e.g. 0 for 1~, e for W). For M
and a as above, and for r < m = the number of registers of M , M determines a
partial function [M]~ a : A ~ --* A defined by

[M]~(al... a , ) =dr evalM,a(T(n), BEGIN, In1... a,, u . . . u]), where n = m a x la~l.


3

3Allowing s o m e staten n o t to originate a c o m m a n d is not more general, because we can intro-


duce for each such state additional corrtmaxtds and auxiliary states that would spasm, a looping
c o m p u t a t i o n sequence.
494

We say that f : Ar ---, A is computable by M modulo ~ in time T if f is [M]rr~O~.


Clearly, if eval(t, s, F) is defined then eval(t+l, s, F) = eva.!(t, s, F), so if f is total
and computable by M modulo a in time T for some T, then f is computable by
M modulo cr in time T', for all functions T I that dominate T.
We say that f is PRM-computable in time T if it is computed in time T by a
PRM modulo a combination function a computable in constant time by a RM.

3.2 P o l y - t i m e P R M - c o m p u t a b i l i t y over w

Restricting attention to PRM's over the algebra W, we have a direct relation be-
tween poly-space and PRM-computability in poly-time. 4 If f : W--*W let f b i t be
defined by fuit(w, i) =dr the i'th bit of f(w). We have:

PROPOSITION 3.1 Let f : W---+~r$. The following conditions are equivalent.

1. f is computable in polynomial apace.


2. fbit is computable by an alternating 7bring machine in polynomiM time, and
If(w)[ is polynomial in ]w[;

3. f is PRM-computable in polynomial time.

P r o o f . We shall use below only the fact that (1) implies (3). The implication (1)
=# (2) is well known [Chandra et al., 1981], and we prove here that (2) implies (3).
That (3) implies (1) is immediate from Theorem 4.3 below, and can also be easily
proved directly.
To see that (2) implies (3), suppose that f b i t is computable by a single-tape
alternating Turing machine M, in time O(n~). Define a PRM R that simu-
lates M as follows. R has two registers, intended to hold the tape's contents
from the head to the right, and from the hand's left neighbor to the left (in
reverse order), respectively. Without loss of generality each non-terminal state
of M is the sources of two transitions. Corresponding to a disjunctive state s
of M, with transitions rl and r~ leading to states 81 and s~ respectively, R
has a state s with a [Fork] transition leading to states s t and s~, and tran-
sitions r~ simulating ri, and leading from s~ to si (i = 1,2). Finally let R
compute with the combination function a(s, u, v) = if s is disjunctive (in M)
then max(u, v) else min(u, v). (Here max(u, v) - .cqse(u, e, case(v, e, O, 1), 1) and
min(u, v) = c.ase(u, e, O, case(v, E, O, 1)).)
Suppose If(w)[ ~ Iw[q. To compute f, construct a register machine R' that
on input w generates in some register r ~ the value llwFe, initialize O U T P U T to e,
and then enter into a loop guarded by value (~r') r e, and whose each cycle: (1)
executes R with w as first input and the value of ~rI as the second; (2) appends the
output of R to the value in OUTPUT; and (3) decrements zff. []

4A more detailed a n d general account of the computational power of P R M ' s will a p p e a r


elsewhere.
495

4 A subrecursive characterization of poly-space

4.1 Statement of the results

LEMMA 4.1 Let & be any free algebra. Ira function f over & is poly-time PRM-
computable, then f is definable by ramified &-recurrence with substitution. In
fact, that definition can be obtained using three tiers only (say Ao, &l and &2).

The proof is in ~4.2 below.

LEMMA 4.2 Let & be a word algebra. Ira function f over & is definable by ramified
recurrence with substitution, then it is poly-space.

The proof is in w below. Combining the facts above yields our main result:

THEOREM 4.3 Let f be a function over W. The following conditions are equivalent.
1. f is poly-space.
2. f is poly-time PRM-computable.

3. f is definable by ramified recurrence with substitution, using tiers W0, W1


and W2 only.
4. f is definable by ramified recurrence with substitution.

P r o o f . (1) implies (2) by Lemma 3.1. (2) implies (3) by Lemma 4.1. (3) implies
(4) trivially. Finally, (4) implies (1) by Lemma 4.2. O
Note that these characterizations use no initial functions other than the con-
structors, and no size-bounding condition.

4.2 Simulation of PRMs by ramified recurrence with substi-


tutions

LEMMA 4.4 Let & be a free algebra. If a function f over & is computable by a R M
in constant time then f is definable by ramified A recurrence as a function from
tier Ai to Ai (for any i >_ 0).

P r o o f . . See [Leivant, 1994b, Theorem 2.2]. rn


P r o o f o f L e m m a 4.1. We prove the lemma for A = W. The general case is
similar. Code N in W by fi =dr 0n(r In particular, let #si =df 3, i.e. a code
in W for st. A configuration (s, [Wl...wm]) of M can then be identified with
(#s, wl . . . win) E W rn+l. For a parallel register machine M, let t.eval = *.eValM,c,
be defined like evalM,c, , except that (1) the first argument is in W (with clauses
for 0 and 1 identical to the clause for s), and (2) the base case of the recurrence
is t.eval(O, s, F) = F(OUTPUT) (for all s, F). Thus t.eval is a total function, and
~.eval(w, s, F) = eva]([w[, s, F) whenever eval(t, s, F) is defined.
496

SUBLEMMA 4.5 Let M be a P R M over W, and a a function over W computable


by a R M in constant time. Then, for every j > i > 0 the function t.eValM, a is
definable by ramified recurrence with substitution, as a function from Wj x W~ni+1
to Wi.

P r o o f . Consider a definition by recurrence of t.eval which is like the definition of


eval above, with the base case modified appropriately. This is a ramified recurrence
with substitution, where the functions used in the definiens are all definable by flat
recurrence and therefore definable as tier-preserving functions: case, the construc-
tors, and the destructor are defined by flat recurrence by the examples above;
and since a is computable in constant time, it is definable by flat recurrence, by
L e m m a 4.4. Thus the recursive definition of .t.eval is a ramified recurrence with
substitution, provided the first argument is of tier higher than the output's tier. []

SUBLEMMA 4.6 For every c, k > 0 there is a function Powerk : W2 --* W l SUCh
that Ipowerc~(w)l = c. lwl k

P r o o f . See [Leivant, 1994b, Lemma 1.4].


P r o o f o f L e m m a 4.1 - - C o n c l u d e d . Suppose that f is computed by a PRM
M relative to c~, where M runs on input w in time c. [w[k. Then

f(w) = eval(lpower k ( w ) h s l , ~ o ( w ) , e . . . e )
= ~.evat(powe~. (w~, ~ , ~2o(~), ~ . . . ~)
. . . . . . r 1 6 3 -

By Sublemmas 4.5 and 4.6 it follows that f is definable by ramified recurrence with
substitution as a function W~--* W0. []

4.3 S i m u l a t i o n of r e c u r r e n c e w i t h s u b s t i t u t i o n in p o l y - s p a c e

We prove L e m m a 4.2 as a special case of the following.

LEMMA 4.7 If a function f : Wil x ... x Wip -* Wi is definable by ramified


recurrence with substitution, then f is computable (on a Turing machine without
input or output tapes) in space O(nk) + m for some k, where n is the maximal size
of arguments of tier > lier(f), m is the maximal size of arguments of tier = tier(f),
and disregarding arguments of tier < lier(f).

P r o o f . The proof is by induction on the definition of f. The cases where f is a


constructor or a projection are trivial.
Suppose that f is defined by composition: f ( x l . . , x v, Y l . . . Yq, z-) =
g(s g, ~', h(2", if, z')), where tier(xi) > tier(g), tier(yj) = lier(g), tier(zl) < tier(g),
and where n = max~(l~d) a n d m = max(lwil).
C a s e 1. tier(h) < tier(g). Then g is independent of its last argument, f = g, and
the l e m m a holds of f since it holds of g by induction assumption.
C a s e 2. tier(h) = tier(g). By induction assumption h(~, y-') is computed in space
O(n ~) + rn for some a, and so Ih(e,g)l < c . n ~ + m for some c. By induction
497

assumption, g(g, if, u) is computed in space O(n ~) + max(m, lul) (for some fl).
Thus, f ( g , y-') is computed in space O(n ~) + m for k = max(a,/?).
Case 3. iicr(h) > ~ier(g). Then all arguments of tier > tier(h) are among ~, and so
h(g, ~ is computed in space O(n ~) for some (~, by induction assumption. Therefore
Ih(~, Y")Iis also O(n('). Also, g(~, ~7,u) is computed in space O((n + lul) ~) + m for
some/3. Thus f(g, ~7,~ is computed in space O(n ~) + m where k = max(~, a~).
Suppose, finally, that f is an r-ary function over W defined by ramified recur-
rence with substitution,

f(~,~,~.,ff,z~ ---- ge(~,~',y.z-")


/(cw, = go(/(w,
c=0,1

where all functions in the definition satisfy the lemma, ~ier(zi) > tier(f) for each x~,
tier(yi) = tier(f) for each yi, ~ier(zi) < ~ier(f) for each zi, and all us and cri are in a
common tier. 5 By induction assumption each ga is computable in space O(n ~) + rn
for a sufficiently large a, and each crd(g ) is computable in space maxi(luil)+K for
a sufficiently large K. Since each gc is independent of f, by induction assumption,
we shall assume without loss of generality that F is empty.
Combine the given Turing machines computing as required the functions gd
and crd into a machine M that computes f as follows. Let T(w, e) be the [w]-high
/-branching tree
{il...iql l <ij <g, O < q < l w [ } .
M has as many "principal tapes" as each one of the given machines, and in addition
9 an input tape t to store the initial values of w, if, g, if,

9 a tape r for the ( binary code of) the "currently visited" address in T(w, g),
9 a tape/~ for the last address in T(w, t) "already calculated",
9 a tape r/that stores a stack of up to l . Iwl vectors of iterates of the vectors
of substitution functions, and
9 a tape ~o that stores a stack of up to g. Iwl intermediate values of f.
M visits all nodes of T(w, g) in a depth-first style, calculates values for the vec-
tors of substitution parameters, stores these on ~, calculates corresponding values
of f, which it stores on ~. To motivate the general definition, consider the following
example.
Suppose the function f to be computed is defined by

f(e, Ul, U2, x) = | Ul,U2)


f(0w, ul, u2, x) ---- | f(w, Ul, u2, x), f(w, u2, Ul, x))
f ( l w , Ul, U2, X) -: @(T.,f ( w , Ul, 0U2, ;g), f ( w , l U l , U2, Z))
' W e might have a different s for go and #1, in which case we take the larger of the two and
pad the other gi with redundant arguments.
498

Here = = =
{lui, u2). With the instance01e - 0(1(e)) for the first argumentwe couldexpand

f(Ole, if, x) -- | f ( l ~ , S0,if, x), f ( l e , @02if,x))


= | |
| (x, f(e, elx~o2ff,x), f(e, e12Yo2ff,x)))
To compute the vMue above, the Turing machinecomputing f visits the tree
T ( w , s = T(01e, 2) - {A, 1, 2, 11,12, 21, 22} and computes corresponding iterates
of the substitution functions, as well as values of f . On its way to 11 through A
and 1, M calculates crll~r01u by pushing on ~ successively if, ~01ff, and trlla01u.
Then f ( e , ~11~01ff, z) is computed and the result is pushed on ~. On its way to
the leaf 12 of T ( w , 2) the machine replaces the top value on r] by ~12~01g, pushes
this new vector on r], and proceeds to compute f ( e , tr12aolu, x), which it pushes
in turn on ~. Next, M backs up to the interior node 1, eliminates the top vector
from r/, and replaces the top two values on ~ by

where the nested values of f are precisely the two values being replaced.
Continuing in this fashion, we see that the computation has at most 2 vectors
of substitutions on ~ at any given time, and at most 3 values on ~o. More generally,
with g substitution vectors we would have at most ]w I vectors on 7, and ( e - 1).lwl-t-1
values on ~, as can be easily verified by induction on ]w].
Let us now describe the operation of M in the general case. Let h = [wl, and
let ci be the i'th bit of w. M starts by initializing rr to the root of T ( w , s fl to
the empty tape, and r/to ft. Then M runs the following loop (where we use a tape
name to denote the value on that tape):

1. If/3 is the root, output the top of ~ and terminate;

2. Otherwise, if ~r = pl .. -Ph is a leaf of T ( w , g), update/3 to r and r/to if, and


push the value go(if, ~, if) on ~.

3. Otherwise, if/3 is empty or is to the left in T ( w , g ) of 7r = Pl ...Pq (i.e.


lexicographically precedes rr), update 7r to the leftmost child Pl ..-Pq 1 of rr.

4. Otherwise, if/3 = Pl 9 9.pqi is a child of r = Pl ...Pq, but i < g, update ;r to


pt . . . p q ( i + 1).
5. Otherwise, if fl = p~ ...p~g is the last child of 7r = p~ ...pq, update/3 to r ,
and replace on ~ the top g values b by gq (b, w, if, ~, y').

6. Otherwise, if~r -- fl, update rr = p t . . . p ~ to its parent, p~ ...pq_~, and replace


the top vector g on r/by ~ g.
The computation will use only addresses of length < Iwh so these have binary
codes of length Iwl. logp.(g). By induction on Iwl it is seen that r/holds at most ]wl
vectors at any time, that ~ holds at most g. Iwl + 1 values, and that the values in
499

differ from ff by at most C. [w[ for some C (using induction assumption for the
substitution functions). Using induction assumption for the recurrence functions,
it follows that the values in ~ are all of size O(Iwl + m + (n + [w[) ~) for some k.
Thus the entire computation is in space O(N ~k+l + m), where N =df max(n, ]w[).
[3

References
[Abiteboul and Vianu, 1989] S. Abiteboul and V. Vianu. Fixpoint extensions of first-order logic
and datalog-like languages. In Proceedings o] the Fourth Annual Symposium on Logic in
Computer Science, pages 71-79, Washington, D.C., 1989. IEEE Computer Society Press.
[Abiteboul and Vianu, 1991] S. Abiteboul and V. Vianu. Datalog extensions for database queries
and updates. Journal of Computer and System Sciences, 43:62-124, 1991.
[Abiteboul et aL, 1990] S. Abiteboul, E. Simon, and V. Vianu. Non-determlnistic languages to
express determluistic transformations. In Proe. ACM SIGACT-SIGMOD-SIGARTSymposium
on Principles o] Database Systems, 1990.
[Abiteboul et aL, 1994] S. Abiteboul, R. Hull, and V. Vianu. Foundations o] Databases. Addison-
Wesley, Reading, MA, 1994.
[Bellantoni and Cook, 1992] Stephen BeUantoni and Stephen Cook. A new recursion-theoretic
characterization of the poly-time functions, 1992.
[Bellantoni, 1992] Stephen Bellantoni. Predicative Recursion and Computational Complexity.
PhD thesis, University of Toronto, 1992.
[Bellantoni, 1994] Stephen Bellantoni. Predicative recursion and the polytime hierarchy. In Peter
Clote and Jeffery Remmel, editors, Feasible Mathematics II, Perspectives in Computer Science.
Birkhgnser, 1994.
[Bloch, 1992] Stephen Bloch. Functional characterizations of uniform log*depth and polylog-
depth circuit families. In Proceedings of the Seventh Annual S~ructure in Complexity Theory
ConJerence, pages 193-206. IEEE Computer Society Press, 1992.
[Bonner, 1988] A.J. Bonner. Hypothetical datalog: complexity and expressibility. In Proceedings
o] the International Con]erence on Database Theory, volume 326 of LNCS, pages 144-160,
Berlin, 1988.
[Chandra et al., 1981] Ashok Chandra, Dexter Kozen, and Larry Stockmeyer. Alternation. Jour-
nal o] the AC~f, 28:114-133, 1981.
[Cobham, 1965] A. Cobham. The intrinsic computational difficulty of functions. In Y. Bar-Hillel,
editor, Proceedings o] the International ConJerence on Logic, Methodology, and Philosophy o]
Science, pages 24-30. North-Holland, Amsterdam, 1965.
[Fagln, 1974] R. Fagin. Generalized first order spectra and polynomial time recognizable sets. In
R. Karp, editor, Complexity o] Computation, pages 43-73. SIAM-AMS, 1974.
[Goerdt, 1992] Andreas Goerdt. Characterizing complexity classes by higher-type primitive-
recursive definitions. Theoretical Computer Science, 100:45-60, 1992. Preliminary version
in: Proceedings of the Fourth IEEE Symposium on Logic in Computer Science, 1989, 364-374.
[Gurevich and Shelah, 1986] Y. Gurevich and S. Shelah. Fixed-point extensions of first-order
logic. Annals of Pure and Applied Loyic, 32:265-280, 1986.
[Gurevich, 1983] Yuri Gurevich. Algebras of feasible functions. In Twenty Fourth Symposium on
Foundations of Computer Science, pages 210-214. i E E E Computer Society Press, 1983.
[Handtey, 1992] W.G. Handley. Bellantoni and Cook's characterization of polynomial time func-
tions. Typescript, August 1992.
[Immerman, 1981] Nell Immerman. Number of quantifiers is better than number of tape cells.
Journal of Computer and System Sciences, 22:384-406,1981. Preminary version in Proceedings
of the 20th IEE Symposium on Foundations of Computer Science (1979), 337-347.
[Immerman, 1982] N. Immerman. Upper and lower bounds for first-order expressibillty. Journal
of Computer and System Sciences, 25:76-98, 1982. Preliminary version in loesS1, 74-82.
500

[Immerman, 1986] N. Immerman. Relational queries computable in polynomial time. Information


and Control, 68:86-104, 1986. Preliminary report in Fourteenth ACM Symposium on Theory
of Computing, 1982, pp. 147-152.
[Immerman, 1987] N. Immerman. Languages that capture complexity classes. SIAM Journal of
Computing, 16:760-778, 1987.
[Immerman, 1991] Nell Immerman. Dspace[n k] = var[k + 1]. In Proe. 6th IEEE Syrup. on
Structure in Complexity Theory, pages 334-340, 1991.
[Jones and Selman, 1974] N. G. Jones and A. L. Selman. Turing machines and the spectra of
first order formulas. Journal of Symbolic Logic, 39:139-150, 1974.
[Leivant, 1990a] Daniel Leivant. Inductive definitions over finite structures. Information and
Computation, 89:95-108, 1990.
[Leivant, 1990b] Daniel Leivant. Snbrecursion and lambda representation over free algebras.
In Samuel Buss and Philip Scott, editors, Feasible Mathematics, Perspectives in Computer
Science~ pages 281-291. Birkhauser-Boston, New York, 1990.
[Leivant, 1993] Daniel Leivant. Stratified functional programs and computational complexity. In
Conference Record of the Twentieth Annual A C.~I Symposium on Principles of Programming
Languages, New York, 1993. ACM.
[Leivant, 1994a] Daniel Leivant. Predicative recurrence in finite type. In A. Nerode and Yu.V.
Matiyasevich, editors, Logical Foundations of Computer Science, LNCS ~813, pages 227-239,
Berlin, 1994. Springer-Verlag. Proceedings of the Third LFCS Symposium, St. Petersburg.
[Leivant, 1994b] Daniel Leivant. Ramified recurrence and computational complexity I: Word
recurrence and poly-time. In Peter Clote and Jeffrey Remmel, editors, Feasible Mathematics
IL Birkhauser-Boston, New York, 1994.
[Moschovakis, 1983] Yiannis Mosehovakis. Abstract recursion as the foundation for a theory of
algorithms. In Computation and Proof Theory (Proceedings of the ASL Meeting in Aachen,
1983), Lecture notes in Mathematics, Berlin, 1983. Springer Verlag.
[Nguyen, 1993] Anldan P. Nguyen. A formal system for linear space reasoning. PhD thesis,
University of Toronto, 1993. Master of Science Thesis.
[Otto, 1995] J. Otto. V-comprehensions and P space. Preprint, 1995.
[Ritchie, 1963] R.W. Ritchie. Classes of predictably computable functions. Trans. A.M.S.,
106:139-173, 1963.
[Rose, 1984] H.E. Rose. Subrecursion. Clarendon Press (Oxford University Press), Oxford, 1984.
[Simmons, 1988] Harold Simmons. The realm of primitive recursion. Archive .for hlathematieal
Logic, 27:177-, 1988.
[Stewart, 1993] Ialn A. Stewart. Logical and semantic characterization of complexity classes.
Aeta Informatica, 30:61-87, 1993.
[Thompson, 1972] D.B. Thompson. Subrecursiveness: machine independent notions of com-
putability in restricted time and storage. Math. System Theory, 6:3-15, 1972.
[Tiuryn and Urzyczyn, 1988] J. Tiuryn and P. Urzyczyn. Some relationships between logics of
programs and complexity theory. Theoretical Computer Science, 60:83-108, 1988.
[Tucker and Zucker, 1988] J.V. Tucker and J.I. Zucker. Program Correctness over Abstract Data
Types, with Error-State Semantics. CWI Monographs No. 6. North-Holland and the Centre
for Mathematics and Computer Science, Amsterdam, 1988.
[Vardi, 1982] M. Vardi. Complexity and relational query languages. In Fourteenth Symposium
on Theory of Computing, pages 137-146. ACM, New York, 1982.
[Venkataraman et at., 1983] K.N. Venkataraman, Ann Yasuhara, and Frank M. Hawrusik. A view
of computability on term algebras. Journal of Computer and System Sciences, 26(2):410-471,
June 1983.
General Form Recursive Equations I

H r a n t B . M a r a n d j i a n ,+

[nstitut, e tbr tnformatics attd Automation Problems of Nat, ionM Academy of Sciences
o[ Armenia P.Sevak str.1, Yerevan 375044, Armenia
E-mail: hm(+Ohmlh.armenia.su

A b s t r a c t . In t.his +trticle the general form recursive equations (GFRE)


are considered. A necessary and sufficient condition for these equations
to have. a solution in the family of partial recursive functions is found.
We show that there exists such a G F R E that, in contrast with usual case,
it has a no~>computable solution but. has no solution in the class 50 of
partial reeursive functions. The problem of solution existence to G F R E
is shown to l>e s e and L'}-complete in the classes 5~ and the
class of total re'cursive fu notions, respectively.

1 Solution Existence

A l m o s t any mathell~atical tield is to this or t h a t e x t e n t c o n n e c t e d with e q u a t i o n


s o l u t i o n . T h e s e e q u a t i o n s m a y have ditti'+rent forms, a n d their solving m e t h o d s
also m a y vary. In some cases one can find a solution of general, a n a l y t i c a l f o r m
these cases are t~ot n u m e r o u s . Ot~en it is possible to find only a p p r o x i m a t e , or
n u m e r i c a l solutions, and in some p r o b l e m s researcher needs an i n t e g r a l solution.
All the m e t h o d s of e q u a t i o n n u m e r i c a l s o l u t i o n s w i t h o u t of a n y e x c e p t i o n l e a d
to the So called r e c u r r e n t e q u a t i o n s . T h e s t r i c t m a t h e m a t i c a l definition, w h a t
r e c u r r e n c e is, is include.d in the c o n c e p t of recursion. Thus, a n u m e r i c s o l u t i o n of
any recnrsive equal~ion is r e d u c e d to the m m m r i c s o l u t i o n of recursive e q u a t i o n s .
T h e s t u d y trend, p r o p o s e d here, is c o n n e c t e d with

T h e se~t'ch o[' a c c u r a t e i n t e g r a l s o l u t i o n s of general f o r m recursive e q u a t i o n s ,


p a r t i a l l y th~ t,r a n s c e n d e n t a l e q u a t i o n s for which one tries to find i n t e g r a l
s o l u t i o n s u, ~q('. I'artially, we would like to a p p r e c i a t e , in t e r m s of recursion
theory, t,he difficulty of solution search for differential and i n t e g r a l e q u a t i o n s
r e w r i t t e n in the tBrm of a finite difference e q u a t i o n s s y s t e m .
- Solutions t,o general form recursive equation arising when using no-counterexample
int;ecpretation of G. Kreisel [9]. Here we have r e s t r i c t e d ourselves to the case
of rcc u rsivc ['u nc tionals and leave the c o n s i d e r a t i o n of f u n c t i o n a l s of finite
t y p e for n n o l h e r tinw..

We shall c<msider here the a s p e c t s t h a t are m o r e specific for the second of


the not, ed aims.
* rFhis research was supported b y A U A / E R ( ; grant
2 This rcstriclion ~urely is not principal, but lea,ds to more transparent exposition.
502

Variety of methods of solving equations can be find in many sourc.es, see, for
example, [2], [3], [4], [5] , [6], [13], [12], [14], and many others, devoted to various
aspects of the question under consideration. In some sense most relevant to the
approach presented here are articles [15], and [17].
Two welt known recursion theorems of S.C.Kleene [7], playing a great role in
the algorithm theory, state correspondingly, that

- for any recursive operator F the equation

f "- F[f] (1.1)


is solvable with respect to f in the class of partial recursive functions, and
- in any G6del numbering {~i} of partial recursive functions for any total
recursive function ~ there exists such natural number a, that

~,,(o) ~ ~o. (1.2)

llere we show that by means of these theorems we can solve recursive equa-
tions of more general form, than (1.1) and (1.2). The equation systems in terms
occur in many fields of mathematical logic, in algorithm theory and in their
applications. In this work we consider the equation systems of a general form

F l [ f l , . . . , f,~] "" G t [ f l , . . . , f,~];


(1.3)
Fk[fl,...,f~] ~ Gk[fl,.:.,A],

where Fi and Gi are recursive terms 3, containing the function symbols {fj}.
In the first place we are interested in a problem of a solution existence of an
equation system of the mentioned form with respect to fl,-.-, f,~ in the class of
partially recursive functions (PRF). It is easy to see, that a system of the general
form recursive equations not always has a solution. Thus, choosing F1 and G1
so. that the ]eft and the right hand parts of the equation

F 1 [f] ~ G 1 [f] (1.4)


depend on f fictitiously, and e.g.

Fl[f] "" ~x.[x], and Gl[f] ~ ~x.[0],


we are convinced that the equation (1.4) has no solution as against the cases (1.1)
and (1.2) which ahvays have a solution. If in the equation of the form (1.4) one
of the terms, say, F, represents a mapping, having inverse, then such equation
results in (1.1) and thus has a solution in the class of PRF, which, as we know,
can be found effectively at given F and G. Consequently, we are interested in the
3 The notion of a recursive term will be made more precise below.
503

case when the terms, taking part in the equations, have no 'inverses'. It appears
that some of those equations may have computable solutions.
In [10] a sufficient condition of the computable solution existence for the
equations of such type was found, ttere we shall present a necessary and sufficient
condition.
Let {f~} be a set of (one-sorted) function symbols, arity(j) denote the arity
of fi, and let C, P R and M be the operators of function composition, primi-
tive rgcursion and minimization, respectively. P~', S and Z will denote the base
functions:

P ~ ( a l , ..., a,~) ~ a~;


S(a) _~ a + 1, and

= o.
lnteger variables will be mainly denoted by xl, x 2 , . . . , and sometimes, for
convenience, x, y, and integer constants by a, b, and so on. To be more precise
let us give the fbl]owing definition.

D e f i n i t i o n 1. The class of recursive terms is the smallest class T s u c h that

1. Every integer constant is in T. 4


2. Every integer variable is in T.
3. S(z~) ~T.
4. Z(x,:) ET.
5. P ~ ( X l , . . . , x , ~ ) E T
6. If f~ is a fimction symbol (i.e. function variable) and arity(fi) "~ n then
fi ( x l , . . . , x , ~ ) E T
7. Let u E T a n d t ~ , . . . , t ~ , Eq':. Then any term, obtained by means of simul-
taneous substitution of terms t l , . . . , t r ~ instead of all the occurrences of
corresponding integer variables in u, belongs to T. The result of such a
substitution will be denoted further by u ( t l / x l , . . . , tn/xn) or, shortly, by

8. Let tl E T a n d t2 C T a n d any integer variable occurring in tl occurs in


t2 and there are exactly two extra integer variables occurring in t2. Then
e R [ t l , t2] CT.
9. If t E T a n d x is an integer variable. Then Mx[t] ET.

To emphasize that a term t contains occurrences of some integer variables


x l , . . . , x,~ and function symbols f l , ..., fk, we write t[fl, ..., fk](xl,..., in).
In the following we shall use, without loss of validity, more wide spectra
of term descriptions than in the previous Definition, for example, definition by
cases, to save room. We shall omit sub- and superscripts in any case when it
does not lead to an ambiguity.

4 It is, of course, a redundant, but useful condition.


50~-

Every recursive term defines a recursive functional and vice versa. Recursive
terms containing no occurrences of function symbols fi we call recursfve function
terms.
Following [7, Ch. XII], we say that the expression tl -~ t2 is a conditional
equality if the left and the right hands of it are conditionally equal for all instances
of function symbols and integer variables occurring in these terms.
To solve a system of recursive equations of the form (1.3) means to find
such rectirsive function terms f l , . . . , fn that being substituted i n (1.3) instead of
function symbols f l , . 9 f~ give a system of conditional identities.
Below we shall discuss some properties of systems of the general form recur-
sire equations, e.g. the systems of equations having the form (1.3), where F's
and G's are recursive terms. These equations seem to be similar to equation
systems by S.C.Kleene [7, Chapter XI], but there is an important difference.
We do not restrict ourselves with equations having the principal function symbol
(principal function letter in terms of [7]) and given ones. As a consequence of this
the problem of solution existence becomes very difficult, see Theorem 7 below.

D e f i n i t i o n 2 . Let q and t be recursive terms and an occurrence of the func-


tion symbol fi in t be fixed: A f i ( u l , . . . , u ~ ) B . Then we say that the term
A q ( u l / x ~ , . , . , u~/x~)B is the result of replacement of the fixed occurrence of fi
in t by term q.

The notion of a fragment defined below sometimes makes recursive equation


solution search easier.

D e f i n i t i o n 3 . L e t t ~ , . . . , t ~ and f l , - - - , f . be given. Then we say that

1. t i [ f ~ , . . . , f n ] is an i-fragmeutof (t,f);
2. Every term obtained from an i-fragment of (t, f) by replacing some occur-
rences of fj(u~,...,ukj),l <_ j < n (where u~'s are arbitrary terms), by
j-fragments of (t, f) is an i-fragment of (t, f).

D e f i n i t i o n 4 . Let a finite system {Fi, GI} of pairs of recursive terms be given.


Let there exist such recursive terms H 1 , . . . , H n that at some replacement of
i-fl'agments of (H, f) in the system of equations

~ . . . .

"" Gk
at which for every i at least once an /-fragment is replaced instead of fi and in
the result a system of identities

{ #1 all,

is obtained. Then we say that the system {FI, Gi} is coupled.


505

Example 1. Let F and G be such that

F[f](x) _ f(2f(x)) (1.5)


G[f](x) ~_ (f(f(x))) 2 + ( f ( f ( x ) - 1)) 2. (1.6)

In the given example it is not essential by what terms the operators F and
G are exactly presented. In any ease they will appear to be coupled. To be
convinced of that, it is sufficient to choose a recursive term H so that

H[f](x)
f (.f([z/2])) 2 + (f([zl2] - 1)) 2, if z is even,
(1.7)
t f(x- 1) + f ( x - 2), otherwise.
and then replace the external occurrence of f in the term representing F by
H[/].
It is easy to check, that these replacements lead to an identity. Easy to see
that as a solution to F[f] ~ G[f] one can choose Fibonacci sequence.
Let consider an another example.

Example 2. Let F and G be such that


a, if x < y,
F[fl,f2](x,y) ~ f l ( x + 1,f2(x "-- 1,y) "-- 1), otherwise, (1.8)

and
if x < y,
G[fa,f2](x,y) ~- f2(x, fl(x--" 2, y) "-- 1), otherwise, (1:9)
where
a .__b...{a'b, ifb<a,
- 0, otherwise.

In this example, as in previous one, it is not essential by what terms the


operators F and G are exactly presented, s in any case they will appear to be
coupled. To be convinced of that, it is sufficient to choose recursive terms H1
and H2 in such a way that

H l [ f l , f : ] ( x , y ) ~ f2(x + 1, y); (1.10)

H2[fl, f~](x, y) ~ f l ( x + 1,y); (1.11)


and then

1. in the term representing F preserve all the occurrences of fl and f2, and
2. in the term representing G replace all the occurrences of f l and f2 by
H1 [.fa, f2] and H2[fl, f2], respectively.
s In the h)rthcoming part of this work we give examples substantially depending on
terms in equations.
506

It is easy to check, that these replacements lead to an identity. In the first


strings it can be seen immediately, and the second strings give

f l ( x + 1, f2(x "-- 1, y) "-- 1), (1.12)


and
H2[fl, f2](z, H i [ k , f2](z "-- 2, y)), (1.13)
respectively.
Substituting the definitions (1.10) and (1.11) of H1 and H2 instead of func-
tion symbols fa and f2 in (1.13) respectively, we obtain (1.12). Thus, in this
example the system F, G is coupled.
A necessary and sufficient conditions of the existence of general form recursive
equation system solution were obtained in [10], [11]. Here we give another variant
of these conditions, more flexible for usage (although equivalent to the previous
ones) and make some of definitions precise.

T h e o r e m 5 . Let F1,...,F~, and G 1 , . . . , G n be arbitrary recursive terms. The


system of equations
F1 ~- G1,
9. . _ ..-, (1.14)

has a parl~al recursive solution wiih respec* ~ofl,. 9 fn if and only if the system
{Fi, Gi} is coupled.

Proof. Let us {Fi, Gi} be coupled. Then, by Definition 4, there exist recursive
terms H1 . . . . . H,~ and such a replacement o f / - f r a g m e n t s of (H, f) in the equa-
tions Fi[J'l . . . . , f , ] ~- G i [ f l , . . . , fn] instead of the corresponding occurrences of
f/-s, that in the result we obtain the following system of conditional identities:

{ $'1 = 0 1
...=... (1.15)

Using the Multiple Recursion Theorem one can obtain a system of partial re-
cursive functions fl . . . . , fn such that for every 1 < i < n we have H i [ f 1 , . . . , fn] ~-
fi. Using induction argument on the definition of a fragment it is easy to see that
the system of functions f l , . - . , fn is a system of fixed points of H[f]. Hence, every
i-fi'agrnent used to obtain the system (1.15) can be replaced by fl preserving con-
ditional identity. However, the resulting system coincides with the given system
of equations, tIence, the system of functions f l , . . - , f n appears to be a solution
to (1.14).
On the other hand, let a system of functions f l , . . - , fn be a solution to (1.14).
Then tile needed system of coupling terms Hi can be defined as follows:

H ; [ f l , . . . , Yk](xl,..., x,~) ",. ~//(Xl,... , aTn),


where each ~ is a recursive term representing the partial recursive function fi.
507

To show how this theorem works let us consider two simple examples.

Ezample 3. Let consider the following transcendental equation:

(f(n)) 2J(n-1) "" (2/(n -- 2)) -f(n+l) (1.16)

If we try to solve this equation by taking a logarithm then the solution can
be expressed as tbllows:

f ( n + I) = 2f(n- i). Ig(f(n))


lg(2f(n - 2))
Computing f by the iterative path naturally originated from this formula we
obtain noninteger values. Now try to solve the equation in the following way.
Let
H[f](n) ~_ 2 f ( n - 2)
Replacing in (1.16) f ( n ) by H[f](n) and f ( n + 1) by H[f](n + 1) we obtain the
identity

(2f(n - 2)) 2J('~-:) = (2f(n - 2)) 2](n-:)


So, the pair of left and the right parts of the equation (1.16) are coupled. Hence,
using Theorem 5 we conclude, that every fixed point of H will be a solution to
the equation (1.16). Particularly, as a solution we can choose the function

n+2, .ifn<2,
f(n) ~_ 2 f ( n - 1), otherwise.
which is-one of infinitely many fixed points of H.

Example/~. Now consider the following transcendental equation:

[f(n)/2] + (.f(n))21('~-D ~_ (2f(n - 2)) I("+I) + f(n - 2) (1.17)

In this case taking of a logarithm does not lead to an analytical solution.


Another obstacle is the presence of the operation of the integer part. While, as
it is not ditficult to notice, according to the above described method a solution
can be found by replacing the underlined occurrences of f in

[/(n)/2] + _-, ( 2 f ( . - + _ 2) (1.18)

by H[f] defined as in the previous example. As a result, after trivial transforms


we obtain the needed identity:

f(n - 2) + 2/(n - 2) 21("-I) _~ 2f(n - 2) 21("-I) + f(n - 2),

hence tile fixed point of H from the Example 2 is a solution to (1.18).


508

In some cases we may be interested in the existence of an equatipn solution


of the form (1.3) not in the class of recursive function terms, but in more limited
classes, particularly, in different classes of general recursive functions. In such
cases an appropriate numbering, say, as the numberings with the corresponding
modification of a 'fixed point' as in [8], may provide with a stronger concept of
coupling, under which the solution may, from the practical point of view, appear
to belong to a more preferable class of functions.
If the equations of the forms (1.1) and (1.2) are always solvable in the class
of partial recursive function, then the situation with the equations of the form
(1.8) is much more difficult as it is shown in the following theorem.
We denote by Wi the domain of a partil recursive function with index i in
a G6del numbering of partial recursive functions. Let us imagine that a fixed
'machine' is simultaneously enumerating all the members of a sequence {Wi},
this sequence being an enumeration of all recursively enumerable sets. The finite
set of numbers enumerated in Wi by the end of step n is denoted by Wi,n.

T h e o r e m 6 . There exist such recursive term F that the equation F[f](z) " 1
has a solution (which is an everywhere defined function) but has no solution in
the class of partial recursive functions.

Proof. Let I4 be the standard complete set {x [ x E W~}. Without loss of


generality let 0 E K. To avoid the exhausting term construction let express
F recursively by means of its functional behavior. Define functional F in the
following way, where F(~)[f](a) denotes the value (if any) of F[f](a) by the end
of step n of computing.
Define F[I](0) _~ 1. To compute F(~)[f](a), (a < n), we do the following three
substeps of computations. In this process, F(n)[f](a) can take tentative values
't' and 'undefined'. Some of 'l'-s (or, partially, all of them) can be changed or
become permanent in future steps.
Substep 1. Try to compute F(")[f](b) for each b < a. If F(~)[f](b) is defined
for every b < a then go to Substep 2, else F(n)[f](a) is undefined.
Subslep 2. If there exists some b <: a such that b E Wb,~&f(b) ~-- 1 then
F('~)[J'](a) is undefined, else go to Substep 3.
Substep 3. If for every b <_ a, f(b) "~ 0 implies b E Wb,~ then define
F(n)[f](a) "" 1, else F(n)[f](a) is undefined.
Easy to see that given f, for each pair of integers a, n we can uniformly solve
whether we have F(~)[f](a) ~ 1, or F(")[f](a) is undefined (maybe, yet). Let
H[I](x) ___(#n)(F(~)[f](x) ~ 1). Obviously, for each x H[f](x) gives (if any) the
least number m such that F[f](x) is defined. Now, define F as follows:

F[f](x) ~ F(H[Y](x))[f](x).
509

Note, that F[f](x) is not the limit value (if any) of F(n)[f](x), as it is used
usually. It is obvious, that F is defined in such a way that it is uniform in f."
It is easy to see that if f is the characteristic function s of K then F[f]
appears to be a total function identically equal to 1. For any f different from
the characteristic function of K there exists an integer m such that P[f](k) is
undefined for all k > m. Indeed, let us consider three possible cases. If fis not
total then, by Subslep 1, F[f]will appear undefined for all the points larger than
the point of the first, divergence of f. Hence, f m u s t be total, if ftakes the value
'1' for an input tbelonging to Kthen, by Subslep 2, F[f](t)is undefined and,
hence, is undefinite for all the integers s > t. If f takes the value '0' for an input
qbelonging to K then, by Subslep 3, F[f](q)is undefined and, hence, is undefined
for all the integers s > q. Hence, the characteristic function of Kis the unique
solution to the equation F[f](x) _. 1. If f differs from the characteristic function
of K then the function F[f](k) has a finite nonempty domain (as 0 E K by the
above assumption). Hence no function other that the characteristic function of
K can be a solution to that equation.

2 Solution Complexity

T h e o r e m 7. The problem of solution existence ~o a general form recursive equa-


lion in lhe class of parIial recursive funclions is a Z~ problem.

Proof. It is known, that the set R = {x I W~ is recursive} is ~W~ [16].


Let us define a family of recursive functions {ai}ieN in the following way:

(~u(X, Y, Z) ..~ J"[ 2z,


-
1, if z E W=,y,
otherwise,
and the family of recursive functionals Fu in the following way:

F~[f](z,y) ~_ c~u(x,y,f(z,y+ 1)).


Now let us consider the family of equations of the form

If(x, y) - F,,[f](z, Y)I -~ 0 (2.1)


The right hand side of the equation (2.1) is total. Hence, if (2.1) has a partial
recursive solution, then that solution must be total too by definition of __. If
u E R then the characteristic function of Wu is total recursive and hence there
exists a total recursive function f such that (2.1) holds (and this solution is
unique). If u ~ R then the existence of a recursive solution to (2.1) leads to a
contradiction, because it enables one to decide recursively the set Wu. The one-
to-one correspondence between R and the recursive equations is shown. On the
other hand, it is easy to see that the last problem is Z'~ Q.E.D.
Here characteristic function is equal to 0 for integers belonging to K and 1, otherwise.
of K.
5.0

It is easy to see from the proof that the problem of the general form recursive
equation solution existence in the class of total recursive functions is a ~o_
complete problem too.
Above we considered the difficulty of solution existence in the class of partial
as well as total recursive functions in terms of Arithmetical Hierarchy. On the
other hand, as shows Theorem 6, some general form recursive equations can
have noncomputable solutions. Below we consider the difficulty of the problem
of solution existence to our equations in the class of all total N ~-~ N functions.
Let denote by .T" the family of all one-place total functions mapping N into N.

T h e o r e m 8. The problem of the general form recursive equation solution exis-


tence in the class .~ is a Z~ -complete problem.

Proof. Easy to see that the problem of the general form recursive equation
solution existence in the class .T- belongs to Z1. Now, denote by T~a,1 ) the
Kleene Normal Form PredicateT([7]) for recursive functionals and let E 1 =
{z](3J')(3w)(T~1,1)(z, f, z, w) = 0)} as they are defined in [16].
Let us define Pz as follows:

Fz[f] _~ ((pw)(Til,1)(z , f, z, w) = 0). 0). (2.2)


Easy to see that for every integer n, the equation Fn[f] ~- 0 has a solution in
the family Y'if and only if n E E 1 9 As it is known from [16], E 1 is Z~-complete.
Hence, the problem of the general form recursive equation solution existence in
the class .T" is a Z~-complete problem.

Theorems, similar to Theorem 5 can also b e p r o v e d for A-calculus [1], con-


tinuous lattices of D.Scott [18], and a series of other theories.
A c k n o w l e d g m e n t s . The author acknowledges Professor Yiannis Moschovakis
for stimulating discussion of the problems considered here and an anonymous
referee for helpful comments.

References

1. H. P. Barendregt The Lambda Calculus. Its Syntax and Semantics. North-Holland


Publ. Comp., Amsterdam, 1981.
2. D. B. Benson and I. Guessarian Algebraic solution to recursion schemata. Journal
of Computer and System Sciences, 35:365-400, 1987.
3. N. Dershowitz and J.-P. JouannaudRewrite systems. In Jan van Leeuwen, editor,
Handbook of Theoretical Computer Science, volume B, chapter 6, pages 243-320.
Elsevier - MIT Press, Amsterdam, New York, Oxford, Tokyo, 1990.
4. J. P. Gallier and W. Snyder Designing unification procedures using transforma-
tions: A survey. Bulletin of the European Association for Theoretical Computer
Science, 40:273-326, February 1990.
Here we take no difference between predicates and their characteristic functions.
511

5. G. Huet and D. C. Oppen Equations and rewriting rules: A survey. In Formal Lqn-
guage Theory: Perspectives and Open Problems, pages 349-405. Academic Press,
New York, 1980.
6. A. J. Kfoury, J. Tiuryn, and P. Urzyczyn Computational consequences and partial
solutions of a generalized unification problem. In Proc. Fourth Annual Symposium
on Logic in Computer Science, pages 98-105, Washington, 1989.
7. S. C. Kleene Introduction to Metamathematics. D. Van Nostrand Co., Inc., New
York, Toronto, 1952.
8. S. C. Kleene Extension of an effectively generated class of functions by enumera-
tion. Colloquium Mathematicum, VI:67-78, 1958.
9. G. Kreisel Interpretation of analysis by means of constructive functionMs of finite
types. In A. Heyting, editor, Constructivity in Mathematics, Studies in Logic,
pages 101-t28. North-Holland Publ. Co., Amsterdam, 1959.
10. H. B. Marandjian On recursive equations. In COLOG-88, Papers presented at the
Int. Co@on Computer Logic, Part II, pages 159-161, Tallinn, 1988.
11. H. B. Marandjian Selected Topics in Recursive Function Theory in Computer
Science. DTH, Lyngby, 1990.
12. V. P. Orevkov Complexity of Proofs and Tbeir Transformations in Aziomatic
Theories, volume 128 of Transl. of Math. Monographs. American Mathematical
Society, New York, 1993.
13. R. Parikh, A. Chandra, J. H~tlpern, and A. R. Meyer Equations between regular
terms and all application to process logic. SIAM J. Comput., 14(4):935-942, 1985.
14. T. Pietrzykowski A complete mechanization of second-order type theory. JACM,
20(2):333-364, 1973.
15. A. Robiuson. Equational logic of partial functions under Kleene equality: a com-
plete and an incomplete set of rules. JSL, LIV(2):354-362, 1989.
16. If. Rogers, Jr. Theory of Recursive Functions and Effective Computability.
McGraw-Hill Book Company, New York, 1967.
17. L. Rudak. A completeness theorem for weak equational logic. Algebra Universalis,
16:331-337, 1983.
18. D. Scott Data types as lattices. SIAM J. Comput., 5(3):522 - 587, 1976.
Modal Logics Preserving Admissible for $4
Inference Rules
Vladimir V. RYBAKOV*
Mathematics Department, Krasnoyarsk State University
av. Svobodnyi 79, 660062 Krasnoyarsk, Russia
email: rybakov@math.kgu.krasnoyarsk.su

Abstract
The aim of the current paper is to provide a complete semantics descrip-
tion for modal logics with finite model property which preserve all admissible
in $4 inference rules, with the intention of meeting some of the needs of
computer science. It is shown a modal logic A with fmp above $4 preserves
all admissible for $4 inference rules iff A has so-called co-cover property. In
turned out there are c6ntinuously many logics of this kind. Using mentioned
above semantics criterion we give some precise description of all tabular logics
preserving admissible for $4 rules.

1 Introduction

Investigations in computer science nowadays actively involve the syntactic and


semantics tools of propositional modal logic. The results obtained in this way
are useful and look very attractive. Say, Computation Tree Logic (E.M.Clark, O.
Grumberg, R.P. Kurshan [2]) can be considered as a special system of modal logic.
Modal Process Logic (K.Larsen, B.Thomsen [6]) is intended for specification of
nondeterministic and concurrent processes. The using of modal logic in knowledge
representation is well known also (cf. a.E.Moore [7], K.Konolige [5], G.F.Shvarts
[12]). Studying of inference rules for modal logics is some special domain which
is especially developed for constructing Gentzen-style systems. Inference rules are,
potentially, even more important than axioms because only they are involved in
iactions/, during proving. Preserving the set of theorems for every given modal
logic (as essence of the logic), we can vary the set of axioms and inference rules in
order to make the axiomatic system more powerful and convenient.
Regarding axioms all is more or less clear: any axiom must be a theorem for
given logic. As to inference rules the things are less clear. Perhaps the question
iwhat is suitable inference rule/, has really no unique answer (cf. Fage, Halpern,
*Sponsored by the A l e x a n d e r yon H u m b o l d t F o u n d a t i o n (Bonn, G e r m a n y ) d u r i n g a stay at
t h e M a t h e m a t i c s I n s t i t u t e (2) in Free University of Berlin ( J a n u a r y - J u n e 1993)
513

Vardi [3]). Anyway among possible derivation rules some special once can definitely
be separated. It is so-called admissible (or permissible) inference rules: those with
respect to which the logic is closed. The class of admissible rules consists of exactly
all inference rules that can be applied in proofs with preserving the set of theorems
for given logic. Having in disposal all admissible rules for a given logic A we possess
all power of inference tools for derivations in A.
Unfortunately to determine whether a rule is admissible is a serious problem as
well. Therefore, having in disposal very few known criterions for recognizing ad-
missibility for individual logics, it is important to describe all logics which preserve
admissible rules of some logics of particular importance (say, having criterions for
recognizing admissibility). This paper aims to provide description of modal logics
which preserve all admissible inference rules of minimal transitive reflexive modal
logic $4. The base for this research is the existence of semantics and algorithmic
kind criteria for recognizing admissibility in modal logics $4, Grz (cf. [9, 10]).
Note that description of all intermediate (superintuitionistic) logics which preserve
all admissible derivation rules of intuitionistic propositional logic is given in [11].
We follow in general the line of this paper, but the modal case is more difficult
technically.

2 Definitions, Preliminary Results

We presuppose an acquaintance of the reader with technique of modal propositional


logic. In general we follow the conventions in modal logic and Kripkesemantics as
laid down in, say, [1, 4, 8]. Trough this paper a modal logic A is a set of propositional
modal formulae containing the set of all theorems of $4 and closed with respect to
modus ponens, normalization rule ~7~zand substitutions. A rule
..., Z,), ...,
/3(=1, ...,=,)
is said to be admissible in a logic )~ iff for every tuple of formulae 61 .... , ~, the
following holds Vj(aj(61, ..., 6n) e ,k) => (l~(61, ..., 6,~) e A).
Being interested in admissibility of rules for modal logics, we can consider only
inference rules with single premise. Indeed, for any given modal logic )~ above
$4 (more precisely, for any modal logic based on classical propositional calculus),
A A B E ~ ~ A E ~&B E ),. A frame iT := (F,/~) is a pair where F is some
non-empty set and R is a binary relation on F. The basic set of a frame and this
frame itself will often be denoted by the same symbol. Because we will consider
only modal logics over $4, trough this paper a frame is a transitive reflexive frame.
Let ~" := (F, R) be a frame, and C is a subset of F. We say C is cluster of iT if C
consists of all mutually comparable by R elements of F. The depth of an element x
in a frame having no infinite ascending chains of clusters is the maximal number of
clusters in ascending chains of clusters which begin with the cluster containing x.
For any frame J~4 :-" (M, R) (or Kripke model .h4 := (M, R, V)) Sl,(.h4) denotes
the set of all elements from M which have depth n (it is so called n-slice of M ) ,
51L.

and Sn(A4) is the set of all elements from M with depth not more than n. Ifc E M
then
cR< := {z [ z E M, cRx,-~(xRc)},

c n<- := {x Ix ~ M, cRx}.

If Y C M then

y R < := {x 19 e M, 3v e

y n < := {z I z E M, 3y E V(yRx)}.

For any frame F, A(F) is the modal logic generated by F (that is the set of all
formulae which are true on F under any valuation of the variables). F + designates
the modal algebra associated with F. Vat(A) denotes the variety of all modal
algebras on which all theorems of A are true.
If W1 := (W1, R, V}, W2 := (W2, R, V} are Kripke models, W1 C W2, R and Y
in both models on the same elements coincide then we say W1 is an open submodel
of W2 iff the following holds

Va E W1, Vb E W2(aRb --~ b E W1).


The main property of open submodel is: for any formula a,

(Va E I/V1)[(a ]~- va) in W1 r (a [[- vc~) in W2].

A Kripke model Kn is called n-characterizing model for a modal logic A if


the domain of the valuation V for Ifn is the set P which consists of n different
propositional variables and the following holds. For any formula a which is build
up on some variables from P a E A ~ Kn ][- a. Using of n-characterizing models
we can give a simple criterion for admissibility.

L e m m a 2.1 Let If~,n E N be a sequence of n-characterizing models for some


modal logic A. Any inference rule A / B is admissible for A iff for every n E N and
each valuation S of variables from A / B by members of K+(V(pl), ..., V(pn)) (where
V is the valuation of K,~) the following holds: if S(A) = K,~ then S(B) = If,~.

Proos Suppose an inference rule A(xi)/B(xi) is not admissible in ,~. Then


there are formulae Ci such that A(Ci) E ~ and B(Ci) q~ )~. Suppose the number of
propositional variables having occurrences in the mentioned above formulae is n.
Model Kn is n-characterizing for ,~ therefore we have

If, 1[- v A(C~)&K,J~v B(C~).


Then the valuation S(zi) :--- V(Ci) disproves A / B on If,~: S(A) = Kn, meanwhile
S(B) # Kn. Conversely. Suppose there are formulae Ci such that the follow-
ing valuation W(xi) = V(Ci) has properties: W(A) = If,~ and W(B) # Kn.
Then V(A(CI)) = If, and V(B(Ci)) # K , . This implies K~ [[-vA(Ci) and
515

KJ~B(Ci). But K,~ is n-characterizing for A. Therefore inference rule A/B is


not admissible in the logic A. 9
Some sequences of n-characterizing models for modal systems $4, Grz, are con-
structed in [9, 10]. Now we extend those constructions on all logics A with finite
model property and introduce models Ch~(n) below.
CONSTRUCTION OF Ch~,(n)
Let A be a modal logic with finite model property. We set P~ := {pi ] 1 < / < n}.
We define 81 to be the set of all models such that: (a) any model is just a cluster
C with the valuation V of all letters from P,, (b) any cluster C has no elements
with the same valuation of letters from Pn with respect to V, (c) any model C is
based on the cluster which is A-frame. The model .~41 is the model obtained from
$1 as disjoint union of all models contained in J'l. Note that the frame of A/f1 is a
A-frame.
Suppose we have already constructed all models J~41, ... , A~k and (a) all they
are based on some finite A-frames, (b) any )v/i has depth not more than i, (c)
any 2di is an open submodel of Jvfi+l. Consider M~ which has the valuation If
of letters from Pn. We take the set A(Mk) of all possible antichains of clusters
(arbitrary collections of clusters which are mutually incomparable by the relation R
in A~lk ) from A/tk which have at least one cluster of depth k. The set $1 |
is given by

$1 | A(A4k) := {< C,A > 1 C E S1,A E A(Mk), if IA[ = 1


then C is not a submodel of A }.

We define the model Gk+l as follows: the base set of Gk+l is ]A/Ik[ tA$1 | A(~/k).
The valuation l / o f any letter Pi from Pn is the set of all elements from )r on
which Pi is valid under If united with the set of all elements from $1 on which pl
is If-valid. The relation R on Gk+l is the transitive closure of the relation in )r
the relations on clusters from ,s 1 and the relation Q, where

aQb r 3 < C, A >~ S1 | A(ek4k)(a E C, b E A).


The model ]~4k+1 is obtained from ~k+l by deleting all clusters C of depth k q- 1
such that C R< is not a A-frame. Clear 2~4k is an open submodel of Adk+l and the
frame of 2k4~+1 is a A-frame.
This way we construct all sequence .L4k, k E N. We define the model Ch;~(n) as
follows Ch~(n) := Uk~N .~k. Because any Adk is an open submodel of the model
.Mk+l this definition is correct.
The n-characterizing model 7-(n) for modal logic $4 was constructed in [9].
This constructions was given quite similar to the general construction of Chx(n)
here. The difference is single: during the construction of "T(n) in [9] there was
no checking if cluster-models and models are based on A-frames. Therefore no
clusters and elements was rejected. But it correspond to the our description for
Chs4(n) here because any frame is S4-frame (remind we consider only reflexive and
transitive frames). Therefore the model Chx(n) for any A is some open submodel
51~

of 7"(n) and T(n) = Chs4(n). We will often use this observations and properties
of T(n) furthermore.
In order to show Ch~(n) is n-characterizing for A we need some more definition.
Let K; : = < K, R, V > be a Kripke model and let W be a new valuation of some
propositional variables on the frame < K, R >. The valuation W is called express-
ible (or formulistic or definable) if and only if for any letter pi from the domain of
W, there exists a formula Ai such that W(pi) := V(Ai). A subset X of K is called
expressible (or formulistic or definable) iff there exists a formula A such that

X = {x [ x e K, xlF- vA}.

L e m m a 2.2 The Kripkc model ChA(n) is some n-characterizing for the logic A.
Every element of this model is expressible.

Proof. All theorems of A are true on Ch~(n) by construction of this model. If c~


is a formula with not more than n propositional variables and c~ ~ A then by finite
model property there is a finite Kripke model , ~ := (M, R, V) such that < M, P >
is a frame for A and ~ is not true in A4. We take a maximal by R cluster of M with
an element a which disproves o~. Then generated submodel an-< of A4 disproves a
as well. Now we make a collapsing of the model an-< in the following way. First we
identify in every cluster all elements having similar valuation of variables. Then
we contract clusters of depth 1 which are isomorphic as models.
The resulting model F1 is a p-morphic image of the original one. Therefore F1
disproves o~ as well. Also it is clear that SI(F1) forms an generated-submodel of
the model Ch~(n). Suppose that we have already constructed the model Fi from
a n< such that: the frame of Fi is a A-frame, the depth of Fi is not more then
depth of .M, Fi disproves c~ and Si(Fi) forms a gene}ated submodel of Ch~(n).
We construct the model Fi+l from Fi in the following way. First we contract all
clusters C from Sli+l (Fi), which have only one immediate successor cluster C1 and
C is a submodel of C1, with corresponding sumbodels of clusters C1. Then we
contract clusters from Sli(Fi) that are isomorphic as models and have the same
sets of immediate successors. The resulting model, which is a p-morphic image of
Fi we denote by Fi+a. Therefore we get that the frame of ~+1 is a A-frame, and
the frame Fi+l disproves the formula a. It can be easily seen also that Sli+l(Fi+l)
is a generated submodel of Ch:~(k).
We continue this transformation and on some step m, which is less or equal the
depth of.h4, obtain some model which rejects a and is also a generated submodel
of the Ch~(n) . Thus Ch~(n) disproves a. ~u have noted Ch~(n) is an open
submodel of the model 7"(n). It is known that all elements of 7-(n) are expressible
( L e m m a 7 [9]). Therefore every element of Ch~(n) is also expressible. 9
Note that using n-characterizing models we can describe the free algebras from
Var(A). ~u denote the free algebra of rank n from Vat(A) by Fn(A). It is easy
to verify directly that Fn(A) is isomorphic to the generated by V(p~), 1 < i < n
subalgebra of Ch~(n) +. Also it is well known (and can easily be verified) that
F,~(A), n E N describe admissible inference rules for the logic A:
517

L e m m a 2.3 A rule A / B is admissible in A iff the quasi-identity A = 1 ::~ B = 1


is true in Fn(A) for every n.

3 Preserving of admissibility
The notion of preserving admissibility for inference rules naturally follows from the
essence of admissibility and describes the properties of derivations for axiomatic
systems of modal logics.
D e f i n i t i o n . A modal Iogic A preserves all admissible for $4 inference rules if
every admissible for $4 rule is also admissible for A.
Now we introduce a semantics notion which will reflect the preserving of admis-
sibility. Let F := (F, R) be a frame and let X be an antichain of clusters from F.
We say that an element e from F is a co-cover for X if cR< = X R-< and {c} forms
the cluster. Let F be a finite rooted frame.
D e f i n i t i o n A frame J~4 is said to be co-covers poster of F if A~ is obtained
from F by adding of finite number of elements as follows. Let F0 := F. In every
step i of the adding we add to the frame Fi a single new element c in the following
way: we choose some non-trivial (having at least two clusters) clusters antichain X
in Fi which has no co-cover in Fi (if there is any) and add to Fi the new element c
which is co-cover for X. After a finite number of steps i = 1,2, ..., n this procedure
terminates and the resulting frame is .h4.
D e f i n i t i o n We say that a logic A has the co-cover property if for any rooted
finite A-frame F every co-covers poster of F is an A-frame.
We say a rule r is disprovable on a frame F by valuation V if all premises of r
are valid on F with respect to V but the conclusion of r is false on F under V.

L e m m a 3.1 If a modal logic A 2 $4 has the co-cover property and has the finite
model property then A preserves all admissible for $4 inference rules, admissible
for $4.

Proof. Suppose A / B is an admissible for $4 inference rule which is not ad-


missible in A. Then there exist formulae Ci such that A(Ci) E )~ but B(Ci) ~ A.
Formulas Ci contain a finite number of propositional letters, say n letters. The
model Ch~(n) is n-characterizing for A according to L e m m a 2.2, hence

Ch~(n) [I-- A(Ci), Ch~(n)J~ B(Ci).


Thus the valuation S(xi) := V(Ci) disproves the inference rule A / B on some
subframe a R-< of Chx(n) generated by some a, moreover we can assume the cluster
containing a is maximal with respect to R with this property.
Let Fo := aR<-a n d Fm+l is obtained from Fm as follows. We take every non-
trivial antichain X of clusters from Fm which has no co-cover in Fm and add to
Fm single reflexive element which is co-cover for X. By construction Fm+l is some
co-covers poster for Fro. Logic ~ has co-cover property. Therefore each Fm can be
considered as a generated subframe of Ch~(n) and Fm is a generated subframe of
Frn+l 9
5!8

Thus F := Um<oo Fm is a generated subframe of Ch~(n) and the rule A / B is


false on F with respect to the valuation S. Moreover, F has by its construction
a co-cover for every finite non-trivial antichain of arbitrary clusters. Note that by
construction the model T(n) (el. [9]) has co-covers for all non-trivial antichains of
clusters, and we have seen Ch;~(n) is an open submodel of T(n). From this and
property of F to have co-covers for all non-trivial finite antichains of clusters it is
not hard to see that there exists a p-morphism of the frame T(n) onto the frame
F.
Using this p-morphism we can transfer the valuation S onto the frame T(n).
Because p-morphisms preserve truth of formulae, we have S disproves A / B on
T(n). Theorem 17 from [9] says that an inference rule r is admissible in $4 iff r
is valid on the frame of T ( n ) for every n and every valuation. Therefore we get
A / B is not admissible in $4, a contradiction. Thus A preserves all admissible in
$4 rules. ,,

L e m m a 3.2 If a modal logic A preserves all inference rules admissible for $4 and
has finite model property then A has co-cover properly.

Proof. Let A be a logic which possesses fmp and does not have the co-cover
property. This means there is a subframe F := a R-< of Ch~(n) generated by a
cluster a and a sequence F0, ..., Fk of generated finite subframes of Ch~(n) which
has the following properties:

a) F0 :=F.
b) All antichains of clusters from Fi of depth less than i have in Fi
co-covers.
e) The frame Fi+l is obtained from Fi when i < k as follows: We
consider the set Ai of all non-trivial antichains of clusters from Fi
having at least one cluster of depth i. If all anticains from Ai have
co-covers in Fi then Fi+l := F~, otherwise for every antichain X
from At with missing co-cover we add to Fi single co-cover for X.
d) There exists an antichain A of clusters from F~ which has no co-
cover in Ch~(n) and has a cluster with depth k.

Then, in particular, all antichains of Fk with depth of elements not more than
k - 1 have co-covers in Fk itself. Moreover, we can choose the sequence F1, ..., Fk
with smallest number k of members and pointed above properties. Now we need
some special modal formulae in order to express some properties of the model
Ch~(n). Let M := S~(Ch:~(n)) U Fk and M1 := Slk+l(Ch:~(n)) U M. Any cluster
from Chx (n) is a subset of MI (or M) or has with M1 ( M correspondingly) empty
intersection. For each cluster i C M1 we introduce the new propositional variable Pi
(inewL means Pi is not a variable from the domain of the valuation of Ch~(n)). Now
we introduce special formulae f(i) for clusters i of M1 by induction on the depth
of i. If i, j are some clusters then iRj means that all elements of j are accessible
by R from all elements of i. If a cluster i belongs to S~(M~) then f(i) := Opi.
519

If formulae f(i) are already introduced for all clusters i from St(M1) and j is a
cluster from Slt+l(M1) then we put

f(j) : - DP~ A A{-',ap, l i # j, i c_ SIt+a(M1)} A A ~


jRi,',,,(iRj)

{~Of(i) [ ~(jRi), i C_St(M1)} A A ~nPi A n([3pi A I i # j,


-,(iRj)

i C SIt+l(M1)) A A Of(i) A A --,f(i)) V V f(i)))


jRi,",(iRj) jRi,",(iRj) jRi,",(iRj)

We introduce now the inference rule v as follows.


V{f(i) li_c M1}Vg
g:= A ~/(OA V Of(O'r:=
Jell'Ix ifiSlk+,(Ml) ~f(a) '

where a is the generating element for F0. We choose the following valuation W
of variables pi from r on Ch~,(n): W(pi) := {j I iRj}. It can be verified directly
by induction on depth, that for each cluster i from M1 and any a E i, a II-"wf(i)
holds. Also for each cluster i from (Ch;~(n)-M1) and every a E i, a I[- wg is valid.
Thus r is disproved in Ch~,(n) by the valuation W, that is

W( V f(i) v g) = Ch~,(n), W(~(f(a)) # Ch~,(n) (1)


iCMa

According to Lemma 2.2 every element of Chx(n) is expressible. Therefore the


valuation W assigns to propositional letters pi some expressible subsets of Ch~,(n).
Then by Lemma 2.1 inference rule r is not admissible in A.
We now intend to show that r is admissible in $4. Suppose that r is not
admissible in $4. Then according to Lemma 2.1 (which in particular holds for $4)
there is an expressible valuation W on some n-characterizing model T(n) for $4
(cf. [9], remind T(n) = Chs4(n) ) such that

W( V f(i) A g)) = T(n)&W(D-,f(a)) ~s T(n) (2)


iEMa

Then there is b,, E T(n) such that ba I~ wf(a). It is easy to see from definition of
f(i) that
Vbi E T(n)(bi IF wf(i)&j c_ MI&
&(iRj)&--,(jRi) ==r 3bj(biRbj&bj I~ w f(j))) (3)
Suppose that bl H- w f(i), bj I['- w f(j) and bi1:gbj. If-',(iRj) then bj I~- w-"rnpi. But
bi I~- wf(i) therefore bi I~ wrnpl. This and biRbj imply bj I}- wrnpi - a contradic-
tion. Hence
Vbi, bj E T(n)(bi I~- wf(i)&bj I~- wf(j)&",(iRj) ~ ~(binbj)). (4)
520

From (3) and (4) we obtain immediately

b~ I}- wf(i)&(iRj)&-(jni) ~ 3bj((biRbj)&~(bjnbi)&(bj It- wf(j))). (5)

From(4) we get

bi I[- wf(i)&bj I[" wf(j)&b~Rbj ~ iRj. (6)


From (4) we infer that

(bi I[- wf(i))&(bj I[- wY(j))&('(inj)&~(JR i) ~ ~(biRbj)&-(bj1:tbi) (7)

We immediately derive from the definition of f(i) that


(bi I}'- wf(i))&(biRd) ==r 3j((iTtj)&(d I}- wf(j))) (s)
Moreover, from this and (5) we have

(Vj)(aRj) ~ 3bj((banbj) h (b It- wf(j))) (9)


Let Y be a non-triviM antichain of clusters from M of depth not more than k.
Suppose there exists a collection of elements bi E T(n) such that bi ]~- wf(i), for
all / E Y. Then by (7) clusters ci including elements hi, i E Y form an antichain of
clusters in T(n). According to the construction of the model T(n) (which is equal
to Chs4(n)), there is a co-cover v for the antichain of cl in T(n). The relation (2)
implies that v I1-"wf(j) for some j E M or v ][" wg. We claim that

3j E Ml(v ][- wf(j)). (10)


Indeed, suppose otherwise. Then we have v I~- wg and v ][- w<)f(x), where x C
Slk+!(M1) and v I~ w-~f(j) for all j C M1. So there exists bx such that vRb~:,
~b,Rv and b, I~- wf(x). Then biRb, for some i from Y. From this and (7) we
infer iRx, which contradicts with x E Slk+I(M'I). So (10) holds. If j C_ M1 is a
cluster and v It-" wf(j) then
j is a co-cover for Y in M1 (11)

Indeed, by supposition, we have that Vi E Y(bi I[- wf(i)~vRbi)~(biRv). This,


by (6), yields that Vi E Y(jRi). Suppose a cluster I is a strict successor o f j in M1.
Then, by (5), there exists a bt from T(n) such that (bl ]l- wf(l))&(vRbz&~(biRv)).
Hence biRbt for some i E Y. This implies iRl by (6). Therefore we have that j is
a co-cover for antichain Y and (11) is proved. Applying (3), (9), (10), (11), we get
that for any cluster x

(x C Sk(fk)) ~ 3b~ E 7(n)(b= I}- w f(x)) (12)


Let A be an antichain of clusters from Fk (having clusters of depth k) which has
no co-cover in Ch~(n). Let D := {bx ] x E A}, where bz ]}- wf(x) (we use (12)).
521

According to (7) clusters containing elements from D form an antichain in T(n).


Each antichain from T(n) has a co-cover according to the construction of 7-(n)
from [9]. Let w be a co-cover for D. By (10) and (11) we have there exists j which
is a co-cover in M1 and Ch~(n) for A which contradicts to our assumption. 9
From Lemmas 3.1, 3.2 we immediately infer

T h e o r e m 3.3 In order for a modal logic A with fmp to preserve all admissible in
$4 inference rules it is necessary and sufficient that A to have co-cover property.

It seems that all modal logics with fmp and co-cover property have very similar
classes of finite frames which approximate these logics. So from first glance it
looks very likely that the class of all such logics is not so reach and has not many
representatives. But in real it does not a ease.

T h e o r e m 3.4 There are continuously many modal logics over modal system Grz
which preserve all admissible in $4 inference rules and have fmp.

Proos We introduce the sequence of partially ordered frames An, n E N as


follows. We set A1 is the three-elements antichain. Every An+l is obtained from A,~
by adding a single co-cover for each non-trivial antichain from An having elements
of depth n. We define F,, as the rooted by an partially ordered frame, where
(F~ - {an}) = A,, and we put Gn is the rooted by bn partially ordered frame,
where (G,~-{b~}) = Fn. For a n y Y C N we set A(Y) is the logic A({Fn ] n e
N} U {Gn [ n E Y}). Then obviously all logics A(Y) are extensions of the modal
logic Grz. Moreover all logics A(Y) have the co-cover property. In order to show
this we take any finite rooted frame B for A(Y) (which means B ]}- A(Y)). First
we note that B must be a poset.
Furthermore, the frame B has not more than three-maximal elements because
all frames from Y and all frames F~ have exactly three maximal elements, and
property to have not more than three maximal elements can be expressed by a
modal formula. If G is a co-covers poster of B then G is a generated subframe of
some p-morphic image of an appropriate finite direct disjoint union Fn U ... U Fn
of the frame Fn for some n. This can be easily shown taking into account the
structure of frames Fn. Because taking of direct disjoint unions, p-morphic images
and generated subframes preserves truth of formulae, we have ~ I}- A(Y) which we
need. Thus each A(Y) has co-cover property.
Every logic A(Y) has finite model property by the definition. Therefore all A(Y)
preserve all admissible in $4 inference rules according to Theorem 3.3. It remains
to show that all A(Y) are pairwise different. It is a. consequence of the following
proposition,
Vn(A(g - {n}) q: A(G,)). (13)
To prove (13) we suppose 3n(A(g - {n}) C )~(Gn)). Then according to the cor-
respondence between varieties of modal algebras and modal logics and Birkhoff's
representation theorem we have that
G+ = H S P ( { G + ] m e N - { n } } U { F +]keN})
522

It can be easily seen that

G + = H S ( G +, x...• xF+ •
because G + is finite. Then, using the correspondence between homomorphic im-
ages, subalgebras and directs products of finite modal algebras and generated sub-
frames, p-morphic images and disjoint unions of theirs corresponding frames we
have that G~ is a generated subframe of a frame D, where

79 --'--/(Gin, U ... U am~ UFm, U ... IA Fro,),

and f is a corresponding p-morphism. The frame G~ is rooted therefore the equality


above implies there is p-morphism f from a rooted by a subframe F(a) of the frame
G, where O = O , n or O = Fmz for some l, onto G~. Any p-morphism does not
increase the depth. Therefore n < l and a has depth not less than n. Moreover,
f maps SI(F(a)) one to one onto SI(G,). If S is an antichain from SI(F(a)) and
as is its co-cover from F(a) then f maps as into some co-cover in Gn for f - i m a g e
of S. Conversely, let bD be a co-cover in Gn for some antichain D from SI(Gn).
Then there exists a co-cover cn for f - l ( D ) in F(a) that f maps cn in bD. Thus f
is some isomorphism from S2(H) onto S2(Gn). By the same reasoning, we show
that f is an isomorphism from Si(G) onto Si(G,~), for each i E {1 .... , n}. Then
every co-cover aE in G for Sl,~(G) is mapped by f into am.
Suppose that G = F~. Then there are no elements in G which can be mapped by
f into generating element bn of Gn - a contradiction. Suppose that G = Fr, where
n < I. Then the co-cover of a two-elements antichain (for instance) from Sln (H)
should be m a p p e d by f in some co-cover in G~ for f-image of this antichain. But
Gn has no such co-cover - a contradiction. The remaining case G = Gt, n < 1 can
be considered quite similar and we will have a contradiction as well. Thus (13)
holds and the theorem is proved, n
Being interested in applications, we now focus our attention on tabular modal
logics, those that are logics of some finite Kripke frames. We will apply Theo-
rem 3.3 to tabular logics in order to clarify which tabular modal logics preserve all
admissible for $4 rules.
D e f i n i t i o n A frame F has width n if F has no antichains of clusters with more
than n dusters but has an antichain with n dusters.
As well known, the property to have width not more than n for rooted frames
can be expressed by some modal propositional formula. More precisely, the formula
Pn, where

:= V (>(;, A (pj v opt))


O<_i<n i,j=O,l,...,n,i#j

expresses this property: for any rooted frame F, F I[- ~n ~ F has width not
more than n. We say that a logic A has width not more than n if ~ E A and that
it has width more than n if ~n ~ A.
523

We need a special collection of finite reflexive transitive frames which is de-


scribed below:

M1 := ({a, bl, b2, cl, d}, <), a < bl, a < cl, bt < b2, cl < d, b_, < d.

M2 : - ({a, bl,b2, cl}, <),a < bl,a < Cl,bl < b2.

/1//3 := ({a, bl,b2, Cl,C2}, <),a < bl,a < cl,bl < b2, cl < c2,bl < cl.

M4 := ({a, bl, b2, cl, c2, d}, R), aRbl, aRb2,


bl Rb2, b2-~bl, bl Rcl, bt Re2, b2Rcl, b2Rc2, cl Rd, c2Rd.

Ms := ({a, bl, b2, cl, c2}, R), aRbl, aRb~,


bl Rb2, b2Rbl , bl Rc l , bl Rc2 , b2Rc l , b2Rc2 .

T h e structure of these frames is designated on Figure 1:

Figure 1: Structure of Frames M1 - M5

Any tabular modal logic A always has representation: A := A(F1 U ... U Fn),
where F1 U ... IJ F,~ is the disjoint union of some rooted finite frames and logics
A(FI) are incomparable. We fixe some such representation for any given tabular
logic.

L e m m a 3.5 A tabular logic A := A(F1 U ... II Fn) preserves all admissible for $4
derivation rules iff each logic A(Fi) preserves all they.
524

Proof. Indeed, if some logic )t(Fi) preserves no an admissible for $4 rule A / B then
logic )t preserves no rule t3A V Dz/tDB V rqz which is admissible for $4 according
to disjunction property of $4. The converse is evident. 9

L e m m a 3.6 If a labular logic )t has co-cover property and a finite frame F is a


co-covers poster for some rooted )t-frame then F has width not more than 2.

Proof. Assume that F has a 3-element antichain of clusters. According to the


co-cover property, taking consequently co-covers posters of F, we obtain that all
theorems of)t are true on some frame Mn with depth n for each n. But )t is tabular,
i.e., )t = A(Q) for some finite frame Q which has (say) depth k. Then Dk E ,~,
where Dk is the formula which expresses the property that a frame is of depth not
more than k. This contradicts with the truth of ,~ on the above mentioned frames
Mn for each n. 9
As we have seen above, for any tabular logic )t, A := )t(F1 H ... H Fn), where
F1 U ... IAFn is the disjoint union of some rooted finite frames, and all logics A(Fi)
are incomparable. We say that F~ is proper rooted if the root has only single
immediate successor cluster or the root is single-element cluster. Now we are ready
to get concluding description theorem for tabular logics.

T h e o r e m 3.7 A tabular logic A : : A(F1 kJ ... kJ F,), preserves all admissible for $4
inference rules iff each Fj is proper rooted, )~ has width not more than 2 and for
all i, 1 < i < 5, )t g A( Mi ).

Proof. Necessity. Suppose ), preserves all admissible for $4 rules. Then by


Theorem 3.3 and Lemma 3.5 every )t(Fj) has co-cover property. According to
L e m m a 3.6 A has width 2 or lower. Suppose A has width not more than 2 but
is included in some logic A(Mi), 1 _< i < 3. Then-there exists a p-morphism
from some rooted by cluster C generated subframe Qj of some frame Fj onto Mi.
Moreover we can suppose that C is a maximal cluster with respect to accessibility
relation R on Fj such that there is a p-morphism f from C n<- onto Mi. Then,
in particular, all strict successors for C are not mapped by f into the root of Mi.
Hence there are two clusters C1, C2 which are immediate successor clusters for C.
Then f maps C1, C2 into immediate successors al, a2 (respectively) of the root of
Mi. By definition of Mi there exists an immediate successor b of a l (or a2) such
that b is not a successor of a2 (al respectively). Therefore there is an immediate
successor cluster B for the cluster C1 which is mapped by f into b and which is
not a successor for C2.
Thus the set of two clusters antichains X := {C1, C2} from Fj such that one
cluster, say C~, has an immediate successor cluster B which is not a successor for
another cluster C2 is nonempty. We choose X with minimal (with respect to /~)
possible clusters C1 and C=. There is no co-cover for B, C2 in Fi, otherwise X
would be not minimal. Therefore co-covers poster F obtained from Fj by adding
the co-cover c for B, C2 differs with Fj. Now we take the co-covers poster Q of F
obtained from F by adding the co-cover d for the root of Fj and c. Because )t(Fi)
525

has co-cover property, Q must be a ~(Fj)-frame, e.g., ,\(Fj) C_A(Q). Moreover, Fj


is a generated subframe of Q, hence 2(Q) c_ A(Fj). Therefore
c c (Q),
where the first inclusion is proper because the depth oh Q is strictly more than the
depth of Fj. Thus ,~(Q) C ,~(Q), a contradiction.
Suppose now every Fj is proper rooted and A is included in A(Mi), where i = 4
or i = 5, and that for i E {1, 2, 3} the last does not holds. Then again there exists a
p-morphism f from some rooted generated subframe Qj of some Fj onto Mi. The
set X of clusters from of Gj, which f maps into immediate successors cl, e2 of the
two-elements cluster from Mi, cannot have in Fj co-cover (because for k = 1, 2, 3
(A ~ ~(Mk)) ). Now we take co-covers posters for Mj by adding co-cover c for
X and then for c and the root of Mi, and obtain a contradiction as above. At
last suppose some Fj is not proper rooted. Then it has the root cluster with two
or more elements and two immediate successor clusters C1, C2 9 We obtain from
this a contradiction straightforward by taking a co-cover c for C1, C2 and then the
co-cover for c and the root.
We now turn to the sufficiency. Suppose that A has width not more then two
and A ~ A(Mi) 1 < i < 5 and all Fj are proper rooted. Then every logic A(Fj) has
the same property. In order to show that A preserves all admissible for $4 inference
rules it is sufficient by L e m m a 3.5 to check that all A(Fi) preserve admissibility.
For this it is sufficient according to Theorem 3.3 to prove that all A(Fj) haxe co-
cover property. We show this below. Indeed, the root of Fj will be a co-cover
for immediate successors because Fj is proper rooted. From this and A ~ )t(Mi)
for every i, 1 < i < 5 it is not hard to derive that every non-trivial antichain o f
clusters from Fj has some co-cover inside itself. Therefore if a Q is a rooted finite
~(Fj)-frame which has no co-covers for some non-trivial antichain Y inside itself
then the following holds. There exists a p-morphic image Q1 of a rooted generated
subframe Dj of Fj that Q is a generated subframe of Q1 and Q1 has some co-covers
for all non-trivial antichains of clusters. Therefore A(Fj) has co-cover property..,.

4 Comments on applications
Applying results from this paper and other results concerning admissibility of in-
ference rules in computer science bases on some natural and simple observations.
Any inference rule r can be understood as an atomic instruction in a declarative
programming language. Any finite axiomatic system ,4 correspondingly can be
interpreted as a program P which is intended to solve a task (in our particular case
the task is to derive theorems). According to this interpretation an admissible for
,4 inference rule r is an atomic programming instruction which can be consistently
added to the P, that is adjoining r to P does not change cardinally the behavior of
P: the output states will be the same. Thinking of admissibility derivation rules we
can have in mind this interpretation. This appears to be rather closure to research
in declarative programming languages like PROLOG.
Verification of programs in declarative programing languages also sometimes
needs to clarify if an atomic programming instruction can be consistently adjoined
526

to the program, what again has mentioned above interpretation trough admissi-
bility. Besides finite reflexive transitive frames can be understood as transition
systems of arbitrary nature. The logic ),(~) of some collection l) of such frames
is just equational theory of such chosen transition systems in modal language. As
it is easily seen admissible inference rules for ,k(T~) describe some meta-properties
of this theory. Of course this comments only sketch the line, anyway for some
individual particular cases this approach can be properly developed.

References
[1] van Benthem :]. A Manual of Intensional Logic. Lecture Notes, CSLI, Stanford,
1988, 135pp.
[2] Clarke E.M., Grumberg O., Kurshan B.P. A Synthesis of Two Approaches
for Verifying Finite State Concurrent Systems. Lecture Notes in Computer
Science, No. 363, 1989, Logic at Botik'89, Springer-Verlag, 81-90.
[3] Fagin R., Halpern J.Y., Vardi M.Y., What is an Inference Rule. 3. of Symbolic
Logic, 57(1992), No 3, 1018 - 1045.
[4] Goldblatt R.I. Metamathematics of Modal Logics, Reports on Mathematical
Logic, Vol. 6(1976), 41 - 78 (Part 1), 7(1976), 21- 52 (Part 2).
[5] Konolige K., On the Relation between Default and Autoepistemic Logic. Ar-
tificial Intelligence, V. 35 (1988), 343-382.
[6] Larsen K.G., Thomsen B. A Modal Process Logic, in Proceedings of Third
Annual Symposium on Logic in Computer Science, Edinburgh, 1988.
[7] Moore R.G., Semantical Consideration on Non-monotonic Logic. Artificial In-
telligence, V.25(1985), 75-94.
[8] Rautenberg W. Klassische und nichtklassische Aussagenlogik,
Braunsehweig/Wiesbaden, 1979.
[9] Rybakov V.V. Problems of Admissibility and Substitution, Logical Equations
and Restricted Theories of Free Algebras. Proced. of the 8-th International.
Congress of Logic Method. and Phil. of Science. Elsevier Sci. Publ., North.
Holland, Amsterdam, 1989, 121 - 139.
[10] Rybakov V.V. Problems of Substitution and Admissibility in the Modal Sys-
tem Grz and Intuitionistic Calculus. Annals of Pure and Applied logic V.50 ,
1990, 71 - 106.
[11] Rybakov V.V. Intermediate Logics Preserving Admissible Inference Rules of
Heyting Calculus. Math. Logic Quart. V. 39 (1993), 403 - 415.
[12] Shvarts G.F. Gentzen Style Systems for K45 and K45D. Lecture Notes in
Computer Science, No. 363, 1989, Logic at Botik'89, Springer-Verlag, 245 -
255.
A B o u n d e d Set T h e o r y w i t h A n t i - F o u n d a t i o n
Axiom and Inductive Definability*

Vladimir Yu. Sazonov

Program Systems Institute, Pereslavl-Zalessky 152140, Russia


e-mail: sazonov@logic.botik.yaroslavl.su

1 Introduction. Kripke-Platek Set Theory with


Anti-Foundation Axiom
Let KP0 be Kripke-Platek Set Theory [Bar75] with omitted Foundation Axiom.
Denote by 1) any universe satisfying theory KP0. Anti-Foundation Axiom, AFA,
of P.Aczel [Acz88] says that there exists the unique set-theoretic decoration op-
eration D(G, x) over graphs G G ~ (as any sets of pairs (x, y) = {{x}, {x, y}})
satisfying the L.Gordeev's identity 2

D(G,y)={D(G,x):(x,y)CG}, or = { D ( G , x ) : x - + G y } .

Following originally to P.Aczel [Acz88] and to T.Fernando [Fer] in some de-


tails we define in terms of KP0, a universe G C ]2, just a A0-class in KP0 of
all pointed graphs (pg's) g = (G,a) (with G a graph and a a distinguished
vertex or point) together with two Z-relations over this class: E0 and a con-
gruence =0 with respect to E0 (and to many other predicates and operations
over pg's). We say that a class R C ]3 is bisimulation relation between graphs
G and G ~, shortly Bis(R, G, G~), if/~ is a class of pairs and always bRb ~ implies
(Vx --+a b3y --~a' b'. xRy) and (Vy ~ a , b'3x '-+a b. xRy). Let '~a,a' be the
largest such R. Then define

(G, a) =0 (G', a' I ~- a "~a,a' a' (iff 3R e ]).(Bis(t~, G, G') ~ are')) and
(G, a) Eo (G', a'> = 3b' --+G' a'.(G, a> =o (C', b') .

Theoreml [Fer]. KPo + Power I- "(g, =o, Eo) k KPo + Power + AFA".

It was essentially used in the proof of this theorem that ~-relations =0 and
Ee become A in the presence of Powerset which seems too strong axiom from
* Supported by G.Soros Foundation and by Russian Basic Research Foundation
(project 93-011-16016).
2 In a Bounded Set Theory [Saz85, Saz87, Saz93] (exactly corresponding to PTIME-
computability) it was used, instead of the decoration operation, analogous notion
of A.Mostowski's collapsing (cf. [Bar75], Ch.I.0.13), however, only for the case of
well-]ounded graphs where both these notions are equivalent. Also, our direction of
edges in graphs is taken as in [Bar75] and in our previous papersi just converse to
that in [Acz88].
528

various points of view. For example, Powerset is non-tractable operation giving


rise to Kalmar elementary rather than to polynomial-time computability.
Our aim is to show in Sect. 2 with the details in the Appendix that anal-
ogous effect for BSTA, a Bounded Set Theory [Saz85, Saz87; Saz93] with Anti-
Foundation Axiom (which extends KP0 and also a Basic Set Theory [Gan74])
may be reached by using, instead of Power, a Recursive A-Separation axiom
(added to KP0 in op.cit.).
Another aim is to compare in Sect. 3 proof-theoretic and expressive power
of BSTA with a Logic of Inductive Definitions, LID, whose relation to Gener-
alized Computability was investigated in [Mos74], whose extension by arith-
metical axioms is known rather as the system(s) IDi [FefT0, Acz77] investi-
gated also in [JSg86] in connection with admissible sets and whose interpretation
in finite linear-ordered models was shown to exactly correspond to PTIME in
[Imm82, Liv82, Var82]; cf. also related works [SazS0, Gut83, Gur84, GS86]. Sec-
tion 3 is written much more sketchy, in contrast to Sect. 2.
In the case of the version of Bounded Set Theory previously considered in
[Saz85, Saz87, Saz93], with Bounded Foundation Axiom (or with Bounded ∈-
Induction Axiom), interpreted in the ordinary universe HF of hereditarily-finite
sets, it was possible to define in the language of BST a natural linear order on
HF. Therefore that version corresponded exactly to PTIME-computability
(with respect to the acyclic, or well-founded, finite pointed graph representation of
HF-sets), unlike the present version of BSTA, for which this remains an open ques-
tion. Only PTIME-computability of the provably-total Σ-definable operations of
BSTA can be shown.
We postpone some additional related aims to future research. Also, we will
not touch here possible applications to Concurrent Transition Systems, Situation
Logic and "Nested" Data Bases. (For the connection of the latter subject with
BST cf. [Saz93].)

2 Interpretation of the set theory BSTA in KPR0

Let us consider a set theory KPR0 based on the following Δ_R-language:

Δ_R-formulas ::= a ∈ b | a = b | φ & ψ | φ ∨ ψ | ¬φ | ∀x ∈ a.φ | ∃x ∈ a.φ
Δ_R-terms ::= set-variables p, q, x, y, ... | {a, b} | ∪a |
  {t(x) : x ∈ a & φ(x)} | the-least p.(p = {x ∈ a : φ(x, p)})

where a, b, t are Δ_R-terms and φ, ψ are Δ_R-formulas, the set-variables x, p are not
free in a, and all occurrences of the set variable p in the Δ_R-formula φ are only of the
form '· ∈ p', positive and not inside any complex subterm of φ. The condition
on p guarantees that φ is monotonic under set inclusion on this set variable
and therefore (at least in the framework of classical set theory) the least such
set p must exist. In fact, we will also consider the closure of Δ_R-formulas under
&, ∨, ¬ and unbounded ∀ and ∃, with the ordinary definition of the bounded
quantifiers of Δ_R via the unbounded ones.
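As a hypothetical illustration over finite sets (an addition of this edition; the helper name least_fixed_point is ours, not the paper's), the monotonicity guaranteed by the positivity condition means that the value of the the-least construct can be reached by simple iteration from the empty set:

```python
# Hypothetical sketch: the-least p.(p = {x in a : phi(x, p)}) for finite a.
# Positivity of p in phi makes the operator monotone, so iteration from the
# empty set is increasing and stops at the least fixed point.

def least_fixed_point(a, phi):
    p = set()
    while True:
        q = {x for x in a if phi(x, p)}
        if q == p:
            return p
        p = q

# Example: the vertices with a path to 1 in the graph with edges 2 -> 1, 3 -> 2,
# i.e. the least p with p = {x in a : x -> 1, or x -> y for some y in p}.
edges = {(2, 1), (3, 2)}
a = {1, 2, 3}
print(least_fixed_point(a, lambda x, p: (x, 1) in edges
                        or any((x, y) in edges for y in p)))   # -> {2, 3}
```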
Non-logical axioms of KPR0 (which also explain the meaning of the above Δ_R-
constructs) are the following ones:

Extensionality: a ⊆ a′ & a′ ⊆ a ⇒ a = a′ ,
Pair: t ∈ {a, b} ⇔ t = a ∨ t = b ,
Union: t ∈ ∪a ⇔ ∃y ∈ a.(t ∈ y) ,
Δ_R-Image: s ∈ {t(x) : x ∈ a & φ(x)} ⇔ ∃x ∈ a.(φ(x) & s = t(x)) ,
Recursive Δ_R-Separation: If p0 = the-least p.(p = {x ∈ a : φ(x, p)}) then

p0 = {x ∈ a : φ(x, p0)} & ({x ∈ a : φ(x, p)} ⊆ p ⇒ p0 ⊆ p) ,

Δ_R-Collection: ∀x ∈ a ∃y.φ(x, y) ⇒ ∃z ∀x ∈ a ∃y ∈ z.φ(x, y) ,

where a, b, t, s are any Δ_R-terms, φ is any Δ_R-formula, and a ⊆ b ⇌ ∀x ∈
a.(x ∈ b). Note that Δ_R-Collection is the only axiom of KPR0 which is not itself
a Δ_R-formula.
Let us call Δ_D the extension of the Δ_R-language by the decoration and tran-
sitive closure operations D(g, p) and TC(s) for any sets (or Δ_D-terms) g, p, s.
Then the corresponding theory BSTA, called Bounded Set Theory with Anti-
Foundation Axiom, consists of all axioms of KPR0 reformulated for the language
Δ_D (with Recursive Δ_D-Separation and Δ_D-Collection, etc.) plus the axioms AFA
and TC formulated as Δ_D-formulae.
So, the axioms for TC are as follows (where Tran(y) ⇌ ∀u ∈ y.(u ⊆ y)):

x ⊆ TC(x),  Tran(TC(x)) ,
(x ⊆ y ⇒ TC(x) ⊆ TC(y))  and  Tran(y) ⇒ TC(y) ⊆ y .

We split AFA into two axioms: AFA1, existence or correctness of the decoration,

∀z.(z ∈ D(g, p) ⇔ ∃q ∈ field(g).((q, p) ∈ g & z = D(g, q))) ,

(where field(R) ⇌ ∪∪R = {x, y : (x, y) ∈ R} for any set of pairs R) and AFA2,
the uniqueness of the decoration, which is equivalent, as in [Acz88], to the strong
extensionality property for any set of pairs R ∈ 𝒱:

Bis_∈(R) & xRy ⇒ x = y, where

Bis_∈(R) ⇌ ∀x, y ∈ field(R).(xRy ⇒
  (∀x′ ∈ x ∃y′ ∈ y. x′Ry′) & (∀y′ ∈ y ∃x′ ∈ x. x′Ry′)) .
In this paper let Δ-formulas and Δ-terms be those defined as Δ_R ones with
the construct the-least omitted. (They define predicates and operations also
known as basic [Gan74] or rudimentary [Jen72] ones.) Note that the provably-
total Σ-definable operations in KP0 coincide with those definable by Δ-terms
[Saz85, Saz85a, Saz87]. Therefore, we may use the name KP0 also for the sub-
theory of KPR0, based on this Δ-language, which involves neither the term
construct the-least nor the corresponding axiom. Analogously, Δ_R(Δ_D)-terms
and Δ_R(Δ_D)-formulas give all provably-total Σ-definable operations of KPR0
(BSTA) and provably-Δ-predicates (i.e. predicates over 𝒱 represented simulta-
neously both by a Σ- and a Π-formula which are provably equivalent). It follows
that the Δ-, Δ_R- and Δ_D-languages are complete in this sense for the theories KP0,
KPR0 and BSTA, respectively.
Another result from op. cit. is the conservativeness, relative to Δ-formulas, of our
version of KP0 over KP0 minus the Δ-Collection axiom, i.e. over KP0|Δ, and, more-
over [Saz87], over (Extensionality axiom)|Δ0 in A. Levy's language of Δ0-
formulas (i.e. Δ-formulas with variables as terms). Also KPR0 is conservative
over KPR0|Δ_R, and the same holds for BSTA and BSTA|Δ_D.
It is easy to represent in the Δ_R(Δ_D)-language a construct dual to the-least,

the-largest p.(p = {x ∈ a : φ(x, p)}) ⇌ a \ the-least q.(q = {x ∈ a : ¬φ(x, a \ q)}) ,

where a \ b ⇌ {x ∈ a : x ∉ b} and t ∈ a \ q must be replaced everywhere in φ by
t ∈ a & ¬(t ∈ q). So the formula ¬φ(x, a \ q) will involve q satisfying the same
(positivity) requirement as p in φ(x, p).
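A small sketch of the same duality over finite sets, reusing the hypothetical least_fixed_point helper from the earlier sketch (our names, not the paper's):

```python
# Hypothetical sketch: the-largest via the complement trick of the text.
# If phi(x, p) is monotone in p, then so is (x, r) |-> not phi(x, a - r),
# and the greatest fixed point is the complement of the corresponding least one.

def largest_fixed_point(a, phi):
    q = least_fixed_point(a, lambda x, r: not phi(x, a - r))  # helper from the earlier sketch
    return a - q
```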
Then the largest bisimulation relation ~_{G,G′} ∈ 𝒱 between any two graphs
(sets of pairs) G, G′ ∈ 𝒱 is expressible as a Δ_R-term of the two variables G, G′.
Therefore =0 and ∈0, defined with the help of ~_{G,G′} ∈ 𝒱, are also in Δ_R. Evidently,
=0 is an equivalence relation which is a congruence with respect to ∈0. (We will
need even more; cf. the Congruence Lemma 3 below.) This allows us to prove, by
a lengthy induction argument,

Theorem 2. KPR0 ⊢ "(𝒢, =0, ∈0) ⊨ BSTA".

Here we take formally "(𝒢, =0, ∈0) ⊨ φ(x̄)" ⇌ ∀x̄ ∈ 𝒢.[[φ]](x̄), where [[φ]], defined
below, is a (Δ_R-)translation of any (Δ_D-)formula φ which gives the in-
tended meaning of φ in the universe (𝒢, =0, ∈0). Simultaneously, we must prop-
erly define (by structural induction), for any Δ_D-term t = t(x̄), the corresponding
Δ_R-term [[t]]. In particular, we will have KPR0 ⊢ ∀x̄ ∈ 𝒢.([[t]](x̄) ∈ 𝒢). Some
denotations used in the following inductive definition of [[·]] will be explained in
Appendix 4.1. However, they have suitable mnemonics. Let us define here only
the point-changing operation, for any pg (G, z), by (G, z) * y ⇌ (G, y), and also
elements(G, x) ⇌ {y ∈ field(G) : y →_G x}. Then let

[[a ∈ b]] ⇌ [[a]] ∈0 [[b]];   [[a = b]] ⇌ [[a]] =0 [[b]];
[[φ & ψ]] = [[φ]] & [[ψ]];   [[φ ∨ ψ]] = [[φ]] ∨ [[ψ]];   [[¬φ]] = ¬[[φ]];
[[∀x ∈ a.ψ(x)]] = [[∀x.(x ∈ a ⇒ ψ(x))]] ⇌ ∀x ∈ elements[[a]].[[ψ]]([[a]] * x);
[[∃x ∈ a.ψ(x)]] = [[∃x.(x ∈ a & ψ(x))]] ⇌ ∃x ∈ elements[[a]].[[ψ]]([[a]] * x);
[[∀x.φ(x)]] ⇌ ∀x ∈ 𝒢.[[φ]](x), if φ(x) does not have the form x ∈ a ⇒ ψ;
[[∃x.φ(x)]] ⇌ ∃x ∈ 𝒢.[[φ]](x), if φ(x) does not have the form x ∈ a & ψ;
[[set-variable]] ⇌ the same variable;
[[{a, b}]] ⇌ Σ{[[a]], [[b]]} (cf. Appendix 4.1);
[[∪a]] ⇌ (collect elements-of-elements[[a]] in graph[[a]]) (cf. Appendix 4.1);
[[{t(x) : x ∈ a & φ(x)}]] ⇌ Σ{[[t]]([[a]] * x) : x ∈ elements[[a]] & [[φ]]([[a]] * x)};
[[the-least p.(p = {x ∈ a : φ(x, p)})]] ⇌ (collect q0 in graph[[a]]), where
  q0 = the-least q.(q = {x ∈ elements[[a]] : [[φ]]([[a]] * x, (collect q in graph[[a]]))}) .
To be correct, in this last use of the construct the-least we should analyze all oc-
currences in [[φ]]([[a]] * x, (collect q in graph[[a]])) of the term (collect q in graph[[a]]),
or shortly (collect q in G). Each such occurrence has the form t ∈0 (collect q in G)
or v ∈ elements(collect q in G) (the latter case arises from the [[·]]-translation of a
bounded quantifier Qv ∈ q in φ(x, q), if any, with v a quantified variable) and is
evidently positive. Moreover, these formulas immediately prove to be equivalent,
respectively, to ∃v ∈ q. t =0 (G, v) and to v ∈ q (by using the definition of the collect
construct in Appendix 4.1), so that the variable q remains positive, as required.
Finally, the translations of the operations TC and D in terms of the original
Δ_R-language are as follows (cf. also Appendix 4.1):

[[TC]](g) ⇌ (collect elements*(g) in graph(g)) ,

[[D]](g, p) ⇌ (hgraph(ĝ), the-unique x ∈ hvertices(ĝ). ĝ * x =0 p) .

Here g, p are considered as any pg's in 𝒢, and ĝ denotes a pointed factor-
graph of g = (G, π) under the largest bisimulation (equivalence) relation ~
on G ⇌ graph(g), with any (x, y) ∈ G defining an edge between the corresponding
two equivalence classes [x], [y], and with the point π of g defining the point [π]
of ĝ. Then it is clear that any such ĝ is isomorphic to some pointed "transitive
set" of the universe (𝒢, =0, ∈0)/=0.
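For finite pg's this factor-graph can be computed directly; the following hypothetical sketch (our names, reusing largest_bisimulation and vertices from the earlier sketch) builds the quotient used in the translation of D:

```python
# Hypothetical sketch: the pointed factor-graph g-hat of g = (G, point) under
# the largest (self-)bisimulation on G; classes become vertices, edges are
# induced edgewise, and the class of the point becomes the new point.

def quotient(g):
    (G, point) = g
    V = vertices(G, point)
    R = largest_bisimulation(G, V, G, V)   # an equivalence relation in this case
    cls = {v: frozenset(w for w in V if (v, w) in R) for v in V}
    return ({(cls[x], cls[y]) for (x, y) in G}, cls[point])
```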
We can prove in KPR0 the congruence property of the equivalence relation
=0 with respect to the above definition of the translation semantics [[·]], by using
structural induction on any Δ_D-formula φ(g) and Δ_D-term t(g).

Congruence Lemma 3.

g =0 g′ ⇒ ([[φ]](g) ⇔ [[φ]](g′)) & ([[t]](g) =0 [[t]](g′)) .

Note that in the proof of this lemma given in Appendix 4.2 it is simul-
taneously shown that also for bounded quantifiers the equivalence

[[Qx ∈ a(g).ψ(x, g)]] ⇔ QX ∈0 [[a]](g).[[ψ]](X, g)

holds, the right-hand side being in general not a Δ_R-formula even if ψ is. So this
is not just a definition. (Unfortunately, it is unclear how to prove this equivalence
more directly; otherwise the validity in (𝒢, =0, ∈0) of the equality axioms considered
after this paragraph could be deduced simply from their partial cases, for ∈ and =
in place of φ and for a variable v in place of a, without using the previous lemma.)
This, together with the definition of [[·]], means that for all logical connectives and
quantifiers [[·]] behaves as required. Also all logical axioms remain true under
[[·]], i.e. they hold in (𝒢, =0, ∈0), provably in KPR0. For example, the translations of
the (logical) equality axioms (cf. [Men64]),

[[t = t]] and [[t = s ⇒ (φ(t) ⇔ φ(s)) & (a(t) = a(s))]]

for any formula φ(v) and terms a(v), t and s, are deciphered, respectively, as

[[t]] =0 [[t]] and [[t]] =0 [[s]] ⇒ ([[φ(t)]] ⇔ [[φ(s)]]) & ([[a(t)]] =0 [[a(s)]])

and we also should use, besides the above Congruence Lemma, the following the-
orems of KPR0, proved by induction on φ and a with any t:

[[φ(t)]] ⇔ [[φ]]([[t]]) and [[a(t)]] =0 [[a]]([[t]]) .

This also allows us to prove the translation of the quantifier axiom ∀v.φ(v) ⇒ φ(t):

∀v ∈ 𝒢.[[φ]](v) ⇒ [[φ]]([[t]]) .

Another one, ∀v.(ψ ⇒ φ(v)) ⇒ (ψ ⇒ ∀v.φ(v)) (with v not free in ψ), is translated

to its own partial case

∀v ∈ 𝒢.([[ψ]] ⇒ [[φ]](v)) ⇒ ([[ψ]] ⇒ ∀v ∈ 𝒢.[[φ]](v)) ,

as are all propositional axioms. We finish the logical part by noting that
the logical inference rules from [Men64], just Modus Ponens, φ, φ ⇒ ψ / ψ, and
Generalization, φ(x)/∀x.φ(x), also preserve validity in (𝒢, =0, ∈0). All of this
means that our semantics is logically correct.
Finally, and most importantly, we can prove in KPR0 the translations of all non-
logical axioms of BSTA (cf. Appendix 4.3).

3 Interpretation of BSTA|Δ_D in LID

Let us consider a Logic of Inductive Definitions, LID. The corresponding Lan-
guage of Inductive Definitions goes back to Y. Moschovakis; cf. [Mos74] and also
[Imm82, Var82, Gur84, GS86]. It is sometimes called FO+LFP and is defined here
as many-sorted First-Order Logic with predicate variables and with the (posi-
tive) Least Fixed Point construct

[the-least P.∀x̄.(P(x̄) ⇔ φ(x̄, ȳ, P, Q̄))]

abbreviated below as P_φ (= P_φ(·) = P_φ(·, ȳ, Q̄)). Here x̄, ȳ and P, Q̄ are, respec-
tively, all the free individual and predicate variables of φ, and the predicate variable
P is required to have only positive occurrences in φ. The last condition guar-
antees that φ is monotonic in the predicate variable P, and therefore (at least
in the framework of classical set theory) the least required value of P must
exist. This new predicate construct may participate in other LID-formulas with
any nesting, except that its predicate variable P is not allowed to appear in the
corresponding body φ(x̄, P) inside any other such construct which is a part of
this body (essentially the same restriction as in the Δ_R-language³). In general, LID-formulas
are constructed from atomic formulas of the kind R(x̄), with R a predicate (a
variable, or a constant, or the binary equality relation =, or, inductively, the above
the-least construct) and x̄ individuals (variables or constants), by using the ordi-
nary first-order connectives &, ∨, ¬, ⇒, ⇔, ∀, ∃, with the mentioned restrictions on the shape
of the the-least construct. No second-order quantification is allowed in
LID-formulas; however, n-ary predicates are considered as comprising a second-
order sort.
LID as a logic extends the ordinary First-Order Logic with equality also by
the following axiom schemes (for φ and P_φ as above):

∀x̄.(φ(x̄, P_φ) ⇒ P_φ(x̄)) and ∀x̄.(φ(x̄, P) ⇒ P(x̄)) ⇒ ∀x̄.(P_φ(x̄) ⇒ P(x̄)).

In particular, when φ does not depend on P these axioms essentially give rise
just to the ordinary comprehension axiom for first-order formulas.
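Over a finite structure the construct P_φ can be evaluated by the same kind of iteration as the the-least construct of KPR0; the sketch below is a hypothetical illustration of this edition (the names lfp, E, U are ours, not part of LID).

```python
# Hypothetical sketch: evaluating P_phi over a finite universe.  P ranges over
# sets of tuples of a fixed arity; positivity of P in phi gives monotonicity,
# so iteration from the empty predicate reaches the least fixed point.

from itertools import product

def lfp(universe, arity, phi):
    P = set()
    while True:
        Q = {xs for xs in product(universe, repeat=arity) if phi(xs, P)}
        if Q == P:
            return P
        P = Q

# Example: P(x, y) <-> E(x, y) or exists z.(E(x, z) & P(z, y)),
# i.e. the transitive closure of an edge relation E.
U = {1, 2, 3}
E = {(1, 2), (2, 3)}
tc = lfp(U, 2, lambda xs, P: xs in E
         or any((xs[0], z) in E and (z, xs[1]) in P for z in U))
print(sorted(tc))   # [(1, 2), (1, 3), (2, 3)]
```

Each stage only adds tuples, so on a finite structure the iteration stops after at most |universe|^arity rounds; this is the standard observation behind the PTIME characterization recalled in Theorem 4 below.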
A semantics of this language and calculus in the framework of any set-theoretic
universe 𝒱 is defined via fixing any second-order structure 𝓜 = (M, 𝒫),
consisting of a many-sorted first-order structure M ∈ 𝒱 (e.g. with sorts being
just (the sets of vertices of) some pg's ḡ = g1, ..., gn in 𝒱⁴, in which case M is
essentially a tuple or a join (g1, ..., gn) of one-sorted structures) together with
some class 𝒫 ⊆ 𝒱 of many-sorted predicates over M. It is required that
(the value of) P_φ = P_φ(·, ȳ, Q̄) is always in 𝒫 when the parameters ȳ, Q̄ are in 𝓜, and
also that both the above axioms hold in 𝓜. In the case that 𝒱 satisfies KPR0|Δ_R, it
suffices to take as 𝒫 just all the subsets ∈ 𝒱 of the corresponding direct products
of first-order sorts, and then we will also write M instead of 𝓜. Let us call such an
𝓜 induced by M ∈ 𝒱 an inner model in 𝒱 of LID. Bearing in mind this case,
the Recursive Δ_R-Separation axiom of KPR0 allows us to interpret the the-least construct of
LID by the the-least construct of KPR0. Evidently, "M ⊨ φ" is represented by a Δ_R-
formula of KPR0 for any φ ∈ LID. Then let "𝒱 ⊨ φ" abbreviate the (non-Δ_R-)formula
∀M ∈ 𝒱.∀(M ⊨ φ), where ∀ denotes universal quantification over all free indi-
vidual and predicate variables of φ (ranging eventually over 𝒱). Evidently, this
gives a natural embedding of LID in KPR0 (and, therefore, also in BSTA):

LID ⊢ φ implies KPR0 ⊢ "𝒱 ⊨ φ" (with 𝒱 = {x : x = x}),

which is, in fact, conservative according to Corollary 8 below. It is well known
that (for variations of the language of LID)

Theorem 4. ([Imm82, Liv82, Var82]; cf. also the related results [Saz80, Gur83,
Gur84]) If LID has in its signature a binary predicate for a linear order, then
the expressive power of this language in finite linearly ordered structures corresponds
exactly to PTIME-computability.
³ Otherwise we could not embed LID into KPR0 below. This restriction on LID seems
also essential for infinite models and sets, which are not excluded by KPR0 or BSTA.
⁴ Any pg g may be considered as a one-sorted structure with one binary relation and
one individual constant, the point of the graph.

The translation [[t]] of Δ_D-terms defined in Sect. 2 gives for each such term
an operation over pointed graphs (λḡ ∈ 𝒢.[[t]](ḡ)) : 𝒢^n → 𝒢. This operation was
defined by a Δ_R-term, and its values matter only up to an isomorphism of
pg's. Analogously, the translation [[φ]](ḡ) of a Δ_D-formula gives a predicate over 𝒢
which is also expressed by a Δ_R-formula.
As in the case of a well-founded universe [Saz91], we can imitate these
translations by a more elementary LID-language (than Δ_R, which deals with "nested"
sets and predicates) with binary predicate constants and individual constants,
one pair for each sort, corresponding to each input pointed graph. We may just
redefine the translations [[φ]](ḡ) and [[t]](ḡ), written now respectively as LID-formulas
((φ))(c̄) and ((t))(x̄, ȳ, c̄), with no free individual variables in the first, with
two designated lists of all free variables x̄, ȳ in the second, and with one list c̄ of
constants in both (just a list of the points of the given input pg's ḡ in any order
and possibly with repetitions), all of the same nonzero length (which may be different
from that of ḡ). This may be done in full correspondence with the old [[·]]-translation:

1. ḡ ⊨ ((φ))(c̄) (i.e. ((φ))(c̄) is true in the given many-sorted input structure ḡ ∈ 𝒢)
iff [[φ]](ḡ) is true in 𝒢 for the same input graphs ḡ, and
2. the graph defined by λx̄ȳ."ḡ ⊨ ((t))(x̄, ȳ, c̄)" is isomorphic to that defined by
[[t]](ḡ), with the tuple c̄ in the role of the corresponding point,

all of this being provable in KPR0 (with the isomorphism definable by a Δ_R-
term).
Then it turns out that the ((·))-translations of all Δ_D-axioms of BSTA (i.e. of all of them, except
Δ_D-Collection⁵), including appropriate logical axioms for the Δ_D-language, are prov-
able in LID. Also, the logical inference rules for the Δ_D-language preserve validity
in the corresponding many-sorted structures under this translation. All this gives
rise to

Theorem 5. LID ⊢ ((BSTA|Δ_D, i.e. all Δ_D-theorems of BSTA)).⁶

On the other hand, any model 𝓜 = (M, 𝒫) of LID gives rise to a supply 𝒢^𝓜
of pointed graphs, just graphs with tuples as vertices defined by 𝒫-predicates
λx̄ȳ.P(x̄, ȳ), each one endowed with a corresponding tuple of elements m̄ of M as
a point. The relations =0^𝓜 and ∈0^𝓜, naturally definable in LID, also allow us to consider
the triple (𝒢^𝓜, =0^𝓜, ∈0^𝓜) almost in the same way as (𝒢, =0, ∈0) in Theorem 2
(just by an imitation of the whole argument).

Theorem 6. (𝒢^𝓜, =0^𝓜, ∈0^𝓜) ⊨ BSTA|Δ_D.


⁵ Remember that the full version of BSTA with Δ_D-Collection is a conservative ex-
tension of BSTA|Δ_D; cf. Sect. 2 and [Saz85, Saz85a, Saz87].
⁶ Note that Δ_D-formulas and terms are formally defined so that each always involves
a non-empty list of free variables ḡ corresponding, as above, to a many-sorted structure
consisting of pg's, with length(x̄) = length(c̄). Otherwise we would have to consider here
also 0-sorted structures, i.e. structures with no sorts at all.

Theorem 7 (on a strong completeness of LID). If any theory Th based on
the logic LID is formally consistent, then it has a model 𝓜 for which in
the corresponding universe (𝒢^𝓜, =0^𝓜, ∈0^𝓜) there exists an inner model which
is isomorphic to 𝓜.
Therefore, by taking here Th = {¬φ} and by using also Theorem 6, the conser-
vativity of BSTA over BSTA|Δ_D and the above-mentioned embedding of LID in
KPR0 and BSTA, we obtain

Corollary 8. LID is conservatively embedded in BSTA, i.e. for all LID-formulas φ,
LID ⊢ φ iff BSTA ⊢ "𝒱 ⊨ φ".
We can also characterize in terms of LID the operations and predicates in the
universe (𝒢, =0, ∈0) (of Theorem 2) definable in the Δ_D-language (up to =0).
Theorem 9. (i) The Δ_D-definable (or, equivalently, provably-total Σ-definable in
BSTA) operations and predicates over (𝒢, =0, ∈0) coincide (up to =0) with those
operations and predicates definable in the language LID for which =0 is a con-
gruence relation.
(ii) For 𝒢 based on 𝒱 = HF, each such operation over graphs is polynomial-
time computable (cf. Theorem 4) with respect to the number of vertices of the
graphs in ḡ.
One direction of (i) is based just on the ((·))-translation of the Δ_D-language. Con-
versely, any LID-definable operation g′ = F(g) in (𝒢, =0, ∈0) over pg's (as sorts
for LID) which preserves =0 may be expressed in Δ_D as a composition of the
corresponding Δ_D-term λg ∈ 𝒢.t_F(g) (as the second factor of the composition),
directly imitating the given LID-definition of F (up to an isomorphism of the
output graph g′), with λx ∈ 𝒱.(∈↾TC({x}), x) (as the first factor) and with D (as
the third factor).
The converse to (ii) (or, more precisely, whether any PTIME-computable transfor-
mation over finite pg's preserving =0 is Δ_D-definable) is an open question. Un-
fortunately, it is also unknown to the author how to Δ_D-define, if possible at all,
any (natural or not) linear order on the whole (or even on each set of the) universe
(𝒢, =0, ∈0)/=0 based on 𝒱 = HF. (Note that in the universe HF itself a natural
linear order is definable in the language Δ_R + TC [Saz85, Saz87, Saz93].) Other-
wise it would be possible to prove the converse of clause (ii) of this theorem
for such a universe (𝒢, =0, ∈0)/=0 and therefore to identify PTIME-computability
over such a (𝒢, =0, ∈0)/=0 with Δ_D-expressibility (as was done for the universe
HF and Δ_C ⇌ Δ_R + TC + C in op. cit., where C is the collapsing operation, the
same as D except that it is nontrivially defined only on well-founded graphs).

4 Appendix

4.1 Some technical definitions needed for [[·]]
Let us define for every pg (G, a):

point(G, a) ⇌ a,  graph(G, a) ⇌ G,  vertices(G, a) ⇌ field(G) ∪ {a} ,
elements-of-elements(G, a) ⇌
  {x ∈ vertices(G, a) : ∃y ∈ vertices(G, a).(x →_G y →_G a)} ,
elements*(G, a) ⇌ {x ∈ vertices(G, a) : (x →_G ... →_G a)} ,
new(v) ⇌ {x ∈ v : x ∉ x},  new-vertex(G, A) ⇌ new(vertices(G, A)) .

Here, in the definition of elements*, the the-least construct must actually be involved.
Evidently, by using the idea of B. Russell's paradox, new(v) ∉ v for any set v.
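For instance (a hypothetical sketch of this edition, reusing least_fixed_point from the sketch in Sect. 2), over finite graphs elements*(G, a) is exactly such a least fixed point:

```python
# Hypothetical sketch: elements*(G, a) = all x with a nonempty path x -> ... -> a,
# i.e. the least p with p = {x : x -> a, or x -> y for some y in p}.

def elements_star(G, a):
    V = {v for e in G for v in e} | {a}
    return least_fixed_point(V, lambda x, p: (x, a) in G
                             or any((x, y) in G for y in p))
```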
For any pg x = (G, a) we will need to consider its isomorphic copy copy(x) =
copy(G, a), so that for different x and y the sets of vertices of copy(x) and copy(y)
do not intersect. This can be done by labelling each vertex of the pg x just by
x itself. Given any graph G with a set A of its distinguished vertices, define the
corresponding pg with A as the set of its elements:

(collect A in G) ⇌ (G ∪ {(x, new-vertex(G, A)) : x ∈ A}, new-vertex(G, A)) .
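A hypothetical finite sketch of the collect construct (our names; the fresh vertex is produced here by a simple integer trick instead of the Russell-style new-vertex of the text):

```python
# Hypothetical sketch: (collect A in G) adds one fresh vertex whose elements
# are exactly the members of A, and makes it the point of the resulting pg.

def collect(A, G):
    used = {v for e in G for v in e} | set(A)
    fresh = max(used, default=0) + 1      # assumes integer vertices; stands in
                                          # for new-vertex(G, A) of the text
    return (G | {(x, fresh) for x in A}, fresh)

# Example: with G = {(1, 2)} and A = {2}, the result is a pg whose point has
# the single element denoted by (G, 2).
print(collect({2}, {(1, 2)}))             # -> ({(1, 2), (2, 3)}, 3), up to set ordering
```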

Also, given any set of pg's X we can define its sum

ΣX ⇌ (collect A in G_X) ,

where G_X ⇌ ∪_{x∈X} graph(copy(x)) and A ⇌ {point(copy(x)) : x ∈ X}. We will
also need the abbreviation

the-unique x ∈ t.φ(x) ⇌ ∪{x ∈ t : φ(x)} .

Then for x0 = the-unique x ∈ t.φ(x),

∃!x ∈ t.φ(x) ⇒ x0 ∈ t & φ(x0) and ¬∃x ∈ t.φ(x) ⇒ x0 = ∅ .

For example, for the-unique q ∈ hvertices(ĝ).(ĝ * q =0 p) in the definition of
[[D]](g, p), one of these two alternatives holds. In the case that the required q does
not exist, ∅ = the-unique ... will be an (improper) isolated point of the pg [[D]](g, p).
The so-called "horizontal" notions hpair, etc. are defined as follows:

hpair((G, π), (x, y)) ⇌ ∃u, v ∈ field(G).
  (x →_G u →_G π & x →_G v →_G π & y →_G v &
   ∀z ∈ field(G).((z →_G u ⇒ z = x) &
   (z →_G v ⇒ z = x ∨ z = y) &
   (z →_G π ⇒ z = u ∨ z = v))) ,

hgraph(g) ⇌ {(x, y) ∈ (vertices(g))² : ∃π ∈ elements(g).hpair(g * π, (x, y))} ,

hvertices(g) ⇌ {x ∈ vertices(g) : ∃y ∈ vertices(g) ∃π ∈ elements(g).
  (hpair(g * π, (x, y)) ∨ hpair(g * π, (y, x)))} .

All the above constructs are easily definable by Δ_R-terms. However, only the
cases of =0, ∈0, elements* and ĝ require Recursive Δ_R-Separation.

4.2 Proof of the Congruence Lemma


Consider the following cases of the induction step (according to the form of tp
or t).

• Let φ(g) be ∀x ∈ a(g).ψ(x, g). Then

[[φ(g)]] ⇌ ∀x ∈ elements[[a]](g).[[ψ]]([[a]](g) * x, g) ⇔ ∀X ∈0 [[a]](g).[[ψ]](X, g)

by the induction hypothesis for [[ψ]] and because [[a]](g) * x ∈0 [[a]](g) for all
x ∈ elements[[a]](g). By using the induction hypothesis [[a]](g) =0 [[a]](g′) for all g =0
g′ and the easily provable congruence property of =0 with respect to ∈0, we also
obtain, as required, that

[[φ(g)]] ⇔ ∀X ∈0 [[a]](g).[[ψ]](X, g) ⇔ ∀X ∈0 [[a]](g′).[[ψ]](X, g′) ⇔ [[φ(g′)]] .
• Let t(g) be {a(g), b(g)}. Then [[t(g)]] is Σ{[[a]](g), [[b]](g)}. By the induction hy-
pothesis for a and b we have, as required,

Σ{[[a]](g), [[b]](g)} =0 Σ{[[a]](g′), [[b]](g′)} for all g =0 g′

by constructing a bisimulation from those between [[a]](g) and [[a]](g′) and, re-
spectively, between [[b]](g) and [[b]](g′).

• Let t(g) be ∪a(g). Then [[t(g)]] is (collect elements-of-elements[[a]](g) in graph[[a]](g)), and

(collect elements-of-elements[[a]](g) in graph[[a]](g)) =0
(collect elements-of-elements[[a]](g′) in graph[[a]](g′)) for all g =0 g′

by constructing a bisimulation from the one between [[a]](g) and [[a]](g′).

• Let t(g) be {s(x, g) : x ∈ a(g) & φ(x, g)}. Then [[t(g)]] is

Σ{[[s]]([[a]](g) * x, g) : x ∈ elements[[a]](g) & [[φ]]([[a]](g) * x, g)} =0
Σ{[[s]]([[a]](g′) * x′, g′) : x′ ∈ elements[[a]](g′) & [[φ]]([[a]](g′) * x′, g′)} ,

as required, for all g =0 g′, by constructing an appropriate bisimulation relation
from the given bisimulations, respectively, between [[a]](g) and [[a]](g′), between
[[s]]([[a]](g) * x, g) and [[s]]([[a]](g′) * x′, g′), and the equivalence between the statements
[[φ]]([[a]](g) * x, g) and [[φ]]([[a]](g′) * x′, g′) with corresponding x and x′, via the
given bisimulation between [[a]](g) and [[a]](g′).

• Let t(g) be the-least p.(p = {x ∈ a(g) : φ(x, p, g)}). Then [[t(g)]] =0 [[t(g′)]] is just

(collect q0(g) in graph[[a]](g)) =0 (collect q0(g′) in graph[[a]](g′)) , where

q0(g) = the-least q.(q = {x ∈ elements[[a]](g) :
  [[φ]]([[a]](g) * x, (collect q in graph[[a]](g)), g)}) and
q0(g′) = the-least q′.(q′ = {x′ ∈ elements[[a]](g′) :
  [[φ]]([[a]](g′) * x′, (collect q′ in graph[[a]](g′)), g′)}) .

Suppose g =0 g′. Then by the induction hypothesis for a and φ we have a bisim-
ulation relation R witnessing [[a]](g) =0 [[a]](g′), and an equivalence between the
statements

[[φ]]([[a]](g) * x, (collect q in graph[[a]](g)), g) and
[[φ]]([[a]](g′) * x′, (collect q′ in graph[[a]](g′)), g′)

with corresponding x and x′ and q and q′ via the given bisimulation R. Here we
say that sets of vertices q and q′ correspond by R if for each v ∈ q there exists
a v′ ∈ q′ such that vRv′, and vice versa. We also need to use the fact that for each
set q ⊆ elements[[a]](g) satisfying the inequality

{x ∈ elements[[a]](g) : [[φ]]([[a]](g) * x, (collect q in graph[[a]](g)), g)} ⊆ q

the set R(q) = {v′ ∈ elements[[a]](g′) : ∃v ∈ q.vRv′} corresponds to q via R and
satisfies the corresponding inequality for q′:

{x′ ∈ elements[[a]](g′) : [[φ]]([[a]](g′) * x′, (collect q′ in graph[[a]](g′)), g′)} ⊆ q′ .

Also, if q0 is the least solution of the first (in)equality then R(q0) is the least
solution of the second. Indeed, if q′ is any solution of the second inequality then
(as above) R⁻¹(q′) is the corresponding solution of the first, and therefore q0 ⊆
R⁻¹(q′). Any x′ ∈ R(q0) is related by the bisimulation R with some x ∈ q0. It follows
that x is an element of the right-hand side of the first inequality (which proves
to be just an equality) with q0 in place of q and therefore, by monotonicity, with
R⁻¹(q′) in place of q. Then an appropriate bisimulation gives rise to the analogous
membership of x′ in the right-hand side of the second inequality for q′. Since this
right-hand side is included in q′, we have x′ ∈ q′ and therefore R(q0) ⊆ q′, which
means that R(q0) is the least solution q0(g′), as required. It follows also that
q0 = q0(g) is related by the required bisimulation relation with R(q0) = q0(g′).
This finally gives (collect q0(g) in graph[[a]](g)) =0 (collect q0(g′) in graph[[a]](g′)).

• We leave to the reader the remaining cases of &, ∨, ¬, ∃, TC and D.

4.3 Provability in KPR0 of [[non-logical axioms of BSTA]]

• The case of the axioms Pair, Union and Image is straightforward.

• Extensionality follows from the following easy consideration. For any two pg's
(G, a) and (G′, a′), if for any x →_G a there exist an x′ →_{G′} a′ and a bisimulation
relation R such that xRx′, and vice versa, then the same holds for the unique
largest bisimulation relation ~_{G,G′} between G and G′ (extending all these
R), which must also connect a and a′. It follows that (G, a) =0 (G′, a′).
Analogously, if [[(G, a) ⊆ (G′, a′)]] then for some set A ⊆ elements(G′, a′) we
have (G, a) =0 (collect A in G′).

• The Recursive Δ_D-(and Δ_R-)Separation axiom may be equivalently rewritten, for
p0 = the-least p.(p = {x ∈ a : φ(x, p)}), as

∀x ∈ a.(φ(x, p0) ⇒ x ∈ p0) & (∀x ∈ a.(φ(x, p) ⇒ x ∈ p) & p ⊆ a ⇒ p0 ⊆ p) .

Its [[·]]-translation looks as follows. If q0 is Δ_R-defined as

q0 = the-least q.(q = {x ∈ elements[[a]] : [[φ]]([[a]] * x, (collect q in graph[[a]]))})

with [[p0]] = [[the-least p.(p = {x ∈ a : φ(x, p)})]] = (collect q0 in graph[[a]]) (cf. the
definition of [[·]]), then

∀x ∈ elements[[a]].([[φ]]([[a]] * x, [[p0]]) ⇒ [[a]] * x ∈0 [[p0]]) &
(∀x ∈ elements[[a]].([[φ]]([[a]] * x, p) ⇒ [[a]] * x ∈0 p) & [[p ⊆ a]] ⇒ [[p0 ⊆ p]]) ,

where all free variables such as p are supposed to run over 𝒢. To show this, let
us present the above form of the Recursive Δ_R-Separation axiom for q0:

∀x ∈ elements[[a]].([[φ]]([[a]] * x, (collect q0 in graph[[a]])) ⇒ x ∈ q0) &
(∀x ∈ elements[[a]].([[φ]]([[a]] * x, (collect q in graph[[a]])) ⇒ x ∈ q) &
  q ⊆ elements[[a]] ⇒ q0 ⊆ q) ,

where all free variables such as q run over arbitrary sets (of the original universe
𝒱 of KPR0). Then to prove the above [[·]]-translation of Recursive Δ_D-Separation
it is sufficient to note that

1. x ∈ q0 ⇒ [[a]] * x ∈0 (collect q0 in graph[[a]]) = [[p0]],
2. [[p ⊆ a]] is equivalent to the existence of q ⊆ elements[[a]] such that p =0 (collect q
in graph[[a]]) and, moreover, x ∈ q ⇔ [[a]] * x ∈0 (collect q in graph[[a]]) =0 p, for
all x and for this q, and
3. q0 ⊆ q ⇒ [[(collect q0 in graph[[a]]) ⊆ (collect q in graph[[a]])]] ⇔ [[p0 ⊆ p]] .

• The Δ_D-Collection axiom is translated to

∀x ∈ elements[[a]] ∃y ∈ 𝒢.[[φ]]([[a]] * x, y) ⇒
∃z ∈ 𝒢 ∀x ∈ elements[[a]] ∃v ∈ elements(z).[[φ]]([[a]] * x, z * v) .

According to the original Δ_R-Collection, the antecedent of this formula implies

∃Z ∀x ∈ elements[[a]] ∃y ∈ Z.[[φ]]([[a]] * x, y) .

It remains to take z ⇌ ΣZ.

• Consider the [[·]]-translations of the AFA axioms. First note that AFA1 may also be rewrit-
ten as the conjunction of the following two formulas:

∀z ∈ D(g, p) ∃(q, p) ∈ g.(z = D(g, q)) and ∀(q, p) ∈ g.(D(g, q) ∈ D(g, p))

or, in more detail, for the first formula,

∀z ∈ D(g, p) ∃w ∈ g ∃u, v ∈ w ∃q ∈ u.
  (p, q ∈ v & ∀x ∈ u.(x = q) &
   ∀x ∈ v.(x = q ∨ x = p) & ∀x ∈ w.(x = u ∨ x = v) & z = D(g, q)) ,

and analogously for the second formula. Their translations,

∀z ∈ elements([[D]](g, p))
∃w ∈ elements(g) ∃u, v ∈ elements(g * w) ∃q ∈ elements(g * u).
  (p, g * q ∈0 g * v &
   ∀x ∈ elements(g * u).(g * x =0 g * q) &
   ∀x ∈ elements(g * v).(g * x =0 g * q ∨ g * x =0 p) &
   ∀x ∈ elements(g * w).(g * x =0 g * u ∨ g * x =0 g * v) &
   [[D]](g, p) * z =0 [[D]](g, g * q)) ,

and analogously for the second formula, are provable in KPR0 straightforwardly
(just take, for the first formula, w, u, v, q such that q ∈ z, i.e. z = [q]).
[[AFA2]] is [[Bis_∈(R)]] & [[xRy]] ⇒ x =0 y, where [[Bis_∈(R)]] is equivalent to

∀x, y ∈ 𝒢.([[xRy]] ⇒
  (∀x′ ∈ elements(x) ∃y′ ∈ elements(y).[[x * x′ R y * y′]]) &
  (∀y′ ∈ elements(y) ∃x′ ∈ elements(x).[[x * x′ R y * y′]]))

and implies that, for any x, y ∈ 𝒢, the relation λx′y′.[[x * x′ R y * y′]] is a bisimulation
relation between graph(x) and graph(y). It follows that also KPR0 ⊢ [[AFA2]].

• Finally, the transitive closure axioms are translated as

∀v ∈ elements(x).(x * v ∈0 [[TC(x)]]) ,
∀u ∈ elements[[TC(x)]] ∀v ∈ elements([[TC(x)]] * u).([[TC(x)]] * v ∈0 [[TC(x)]]) ,
(∀v ∈ elements(x).(x * v ∈0 y)) ⇒ (∀u ∈ elements[[TC(x)]].([[TC(x)]] * u ∈0 [[TC(y)]])) and
∀u ∈ elements(x) ∀v ∈ elements(x * u).(x * v ∈0 x) ⇒ [[TC(x) ⊆ x]] .

The first two formulas are evident. For the third formula note that the set {u ∈
elements*(x) : ∃w ∈ elements*(y).(x * u =0 y * w)} contains elements*(x), because
it satisfies the corresponding equation for elements*(x). Analogously, the antecedent of
the last formula implies that the set {w ∈ vertices(x) : ∃w′ ∈ elements(x).(x *
w =0 x * w′)} includes elements*(x), which is essentially what we need.

5 Acknowledgments

The author is grateful to anonymous referees for comments and detailed sugges-
tions on improving the presentation of this paper.

References
[Acz88] P. Aczel, Non-well-founded Sets, CSLI Lecture Notes No. 14, Stanford, 1988.
[Acz77] P. Aczel, Introduction to the theory of inductive definitions, in: Handbook of
Math. Logic, J. Barwise, ed., North-Holland, Amsterdam, 1977.
[Bar75] J. Barwise, Admissible Sets and Structures, Springer, Berlin, 1975.
[Jäg86] G. Jäger, Theories for Admissible Sets. A Unifying Approach to Proof Theory,
Studies in Proof Theory, Lecture Notes 2, Bibliopolis, 1986.
[Fef70] S. Feferman, Formal theories for transfinite iterations of generalized induc-
tive definitions and some subsystems of analysis, in: Intuitionism and Proof
Theory, A. Kino et al., eds., North-Holland, Amsterdam, 1970, 303-326.
[Fer] T. Fernando, A primitive recursive set theory and AFA: on the logical com-
plexity of the largest bisimulation, Report CS-R9213, ISSN 0169-118X, CWI,
P.O. Box 4079, 1009 AB Amsterdam, The Netherlands.
[Gan74] R. O. Gandy, Set-theoretic functions for elementary syntax, in: Proc. in Pure
Math., Vol. 13, Part II (1974) 103-126.
[Gur83] Y. Gurevich, Algebras of feasible functions, in: FOCS'83 (1983) 210-214.
[Gur84] Y. Gurevich, Towards logic tailored for computational complexity, in: LNM
1104, Springer, Berlin (1984) 175-216.
[GS86] Y. Gurevich and S. Shelah, Fixed point extensions of first-order logic, Ann.
Pure Appl. Logic 32 (1986) 265-280.
[Imm82] N. Immerman, Relational queries computable in polynomial time, in: 14th
STOC (1982) 147-152; cf. Inform. and Control 68 (1986) 86-104.
[Jen72] R. B. Jensen, The fine structure of the constructible hierarchy, Ann. Math.
Logic 4 (1972) 229-308.
[Liv82] A. B. Livchak, Languages of polynomial queries, in: Raschet i optimizacija tep-
lotehnicheskih ob'ektov s pomosh'ju EVM, Sverdlovsk, 1982, 41 (in Russian).
[Mos74] Y. N. Moschovakis, Elementary Induction on Abstract Structures, North-Holland,
Amsterdam, 1974.
[Men64] E. Mendelson, Introduction to Mathematical Logic, D. Van Nostrand, Prince-
ton, 1964.
[Saz80] V. Yu. Sazonov, Polynomial computability and recursivity in finite domains,
EIK 16, No. 7 (1980) 319-323.
[Saz85] V. Yu. Sazonov, Bounded set theory and polynomial computability, Conf. on
Applied Logic, Novosibirsk, 1985, 188-191 (in Russian).
[Saz85a] V. Yu. Sazonov, Collection principle and existential quantifier (in Russian), Vy-
chislitel'nye sistemy 107 (1985), Novosibirsk, 30-39. (Cf. English translation
in AMS Transl. (2) 142 (1989) 1-8.)
[Saz87] V. Yu. Sazonov, Bounded set theory, polynomial computability and Δ-prog-
ramming, Vychislitel'nye sistemy 122 (1987), Novosibirsk, 110-132 (in Rus-
sian). Cf. also LNCS 278 (1987) 391-397 (in English).
[Saz91] V. Yu. Sazonov, Bounded Set Theory and Inductive Definability, Logic Collo-
quium '90, JSL 56, No. 3 (1991) 1141-1142.
[Saz93] V. Yu. Sazonov, Hereditarily-finite sets, data bases and polynomial-time com-
putability, TCS 119 (1993) 187-214, Elsevier.
[Var82] M. Y. Vardi, The complexity of relational query languages, STOC'82, 137-146.
Author Index

Aspinall .................... 1
Baaz ...................... 106
Benton .................... 121
Bono ....................... 16
Braüner .................... 31
Buss ...................... 151
Chagrov ................... 442
Cholewiński ............... 456
Compagnoni ................. 46
Courcelle ................. 163
Crole ..................... 339
de Nivelle ................ 279
Durand .................... 177
Fairtlough ................ 354
Finkelstein ............... 249
Freyd ..................... 249
Gordeev ................... 136
Gordon .................... 339
Grandjean ................. 190
Herbelin ................... 61
Hermida ................... 412
Hofmann ................... 427
Jacobs .................... 412
Klarlund .................. 471
Kriaučiukas ............... 264
Kuper ...................... 76
Lautemann ................. 205
Leivant ................... 486
Lester .................... 369
Lipton .................... 249
Liquori .................... 16
Malmström ................. 217
Marandjian ................ 501
Marion .................... 486
McArthur .................. 228
Mendler ................... 354
Mintchev .................. 369
Moschovakis ............... 382
Olive ..................... 190
Piessens .................. 397
Pudlák .................... 151
Ranaivoson ................ 177
Rybakov ................... 512
Salzer .................... 106
Sazonov ................... 527
Schulz .................... 294
Schwentick ................ 205
Shehtman .................. 442
Steegmans ................. 397
Stolboushkin .............. 242
Taitslin .................. 242
Tammet .................... 309
Thérien ................... 205
Voda ...................... 324
Walicki ................... 264
Whitney ................... 382
Zaionc ..................... 91
