CS6005 Advanced Database System UNIT IV PDF

Sec. 8.1 Datalog
Facts and Rules

A relational database about students and courses

student
Name         Major  Year
"Joe Doe"    cs     senior
"Jim Jones"  cs     junior
"Jim Black"  ee     junior

took
Name         Course  Grade
"Joe Doe"    cs123   2.7
"Jim Jones"  cs101   3.0
"Jim Jones"  cs143   3.3
"Jim Black"  cs143   3.3
"Jim Black"  cs101   2.7
The same fact base for Datalog

student("Joe Doe", cs, senior).
student("Jim Jones", cs, junior).
student("Jim Black", ee, junior).
took("Joe Doe", cs123, 2.7).
took("Jim Jones", cs101, 3.0).
took("Jim Jones", cs143, 3.3).
took("Jim Black", cs143, 3.3).
took("Jim Black", cs101, 2.7).
Zaniolo, Ceri, Faloutsos, Snodgrass, Subrahmanian, Zicari. All Rights Reserved.
Advanced Database Systems, Morgan Kaufmann. Copyright © 1997.

Rules
How to express logical conjunction:
Find the names of junior-level students who have taken both cs101 and cs143:
firstreq(Name) ← student(Name, Major, junior),
                 took(Name, cs101, Grade1),
                 took(Name, cs143, Grade2).
A rule consists of a rule head (left of ←) and a rule body (right of ←).
Variables begin with an upper-case letter, constants with a lower-case letter; the underscore _ denotes an anonymous variable.
The commas in the body represent logical conjunction.
Junior-level students who took course cs131 or course cs151 with a grade better than 3.0:
scndreq(Name) ← took(Name, cs131, Grade), Grade > 3.0,
                student(Name, Major, junior).
scndreq(Name) ← took(Name, cs151, Grade), Grade > 3.0,
                student(Name, _, junior).
Queries
A closed query: the answer to such a query is either yes or no. For instance,
?firstreq("Jim Black")
An open query:
?firstreq(X)
and its answer:
firstreq("Jim Jones")
firstreq("Jim Black")
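
To make the closed and open queries concrete, the following is a minimal Python sketch that evaluates the firstreq rule over the fact base given earlier. The set-of-tuples encoding and the names `open_answer` and `closed_answer` are illustrative assumptions, not part of the original slides.

```python
# Fact base copied from the student/took tables; relations are sets of tuples.
student = {
    ("Joe Doe", "cs", "senior"),
    ("Jim Jones", "cs", "junior"),
    ("Jim Black", "ee", "junior"),
}
took = {
    ("Joe Doe", "cs123", 2.7),
    ("Jim Jones", "cs101", 3.0),
    ("Jim Jones", "cs143", 3.3),
    ("Jim Black", "cs143", 3.3),
    ("Jim Black", "cs101", 2.7),
}


def firstreq():
    """firstreq(Name) <- student(Name, _, junior), took(Name, cs101, _), took(Name, cs143, _)."""
    return {
        name
        for (name, _major, year) in student
        if year == "junior"
        for (n1, c1, _g1) in took
        if n1 == name and c1 == "cs101"
        for (n2, c2, _g2) in took
        if n2 == name and c2 == "cs143"
    }


open_answer = firstreq()                    # ?firstreq(X): all bindings for X
closed_answer = "Jim Black" in open_answer  # ?firstreq("Jim Black"): yes or no
```

Each goal in the rule body becomes one loop over a relation, and a shared variable (Name) becomes an equality test between loops.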

The Relational Model vs Datalog

The terminology of Datalog versus that of the relational model:

Datalog            Relational Model
Base predicate     Table or relation
Derived predicate  View
Fact               Row or tuple
Argument           Column

Most of the power comes from cascading derived predicates. For example, both previous requirements must be satisfied to enroll in cs298:
req_cs298(Name) ← firstreq(Name), scndreq(Name).

Negation in Datalog
Only negated goals are allowed; negated heads are not.
Junior-level students who did not take course cs143:
hastaken(Name, Course) ← took(Name, Course, Grade).
lacks_cs143(Name) ← student(Name, _, junior),
                    ¬hastaken(Name, cs143).

Universal quantification by double negation
Find the senior students who completed all the requirements for the cs major: ?all_req_sat(X)
The first step is to formulate the complementary query: find the students who did not take some of the courses required for a cs major.
We can then re-express the original query as: find the senior students who are NOT missing any requirement.
req_missing(Name) ← student(Name, _, senior),
                    req(cs, Course), ¬hastaken(Name, Course).
all_req_sat(Name) ← student(Name, _, senior),
                    ¬req_missing(Name).
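
The double-negation pattern can be sketched in Python as set difference: compute the seniors missing some requirement, then subtract them from all seniors. The slides do not list any req(Major, Course) facts, so the `req` argument below is a hypothetical input supplied by the caller; the encoding is an assumption for illustration only.

```python
student = {
    ("Joe Doe", "cs", "senior"),
    ("Jim Jones", "cs", "junior"),
    ("Jim Black", "ee", "junior"),
}
took = {
    ("Joe Doe", "cs123", 2.7),
    ("Jim Jones", "cs101", 3.0),
    ("Jim Jones", "cs143", 3.3),
    ("Jim Black", "cs143", 3.3),
    ("Jim Black", "cs101", 2.7),
}


def all_req_sat(req):
    """Seniors who are not missing any cs requirement.

    req is a set of (major, course) pairs; its contents are hypothetical,
    since the slides give no req facts.
    """
    # hastaken(Name, Course) <- took(Name, Course, Grade).
    hastaken = {(name, course) for (name, course, _grade) in took}
    seniors = {name for (name, _major, year) in student if year == "senior"}
    # req_missing(Name) <- student(Name, _, senior), req(cs, Course), not hastaken(Name, Course).
    req_missing = {
        name
        for name in seniors
        for (major, course) in req
        if major == "cs" and (name, course) not in hastaken
    }
    # all_req_sat(Name) <- student(Name, _, senior), not req_missing(Name).
    return seniors - req_missing
```

With the hypothetical requirement set `{("cs", "cs123")}` the only senior, Joe Doe, qualifies; adding `("cs", "cs143")` removes him from the answer.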

Sec. 8.3 Relational Algebra

Additional Operators
Additional operators of frequent use can be derived from these. For instance, we have join, semijoin, intersection, division and generalized projection.
The join operator can be constructed using Cartesian product and selection. In general, a join has the form $R \bowtie_F S$, where $F = \$i_1 \theta_1 \$j_1 \wedge \ldots \wedge \$i_k \theta_k \$j_k$; $i_1, \ldots, i_k$ are columns of $R$; $j_1, \ldots, j_k$ are columns of $S$; and $\theta_1, \ldots, \theta_k$ are comparison operators. Then, if $R$ has arity $m$, we define
$F' = \$i_1 \theta_1 \$(m + j_1) \wedge \ldots \wedge \$i_k \theta_k \$(m + j_k)$.
Therefore,
$R \bowtie_F S = \sigma_{F'}(R \times S)$
The intersection of two relations can be constructed either by taking the equijoin of the two relations in every column (and then projecting out duplicate columns) or by using the following property:
$R \cap S = R - (R - S) = S - (S - R)$.
The generalized projection of a relation $R$ is denoted $\pi_L(R)$, where $L$ is a list of column numbers and constants. Unlike ordinary projection, components might appear more than once, and constants are permitted as components of the list $L$ (e.g., $\pi_{\$1, c, \$1}$ is a valid generalized projection).

Relational Operators (cont.)
Selection. $\sigma_F R$ denotes the selection on $R$ according to the selection formula $F$, where $F$ obeys one of the following patterns:
(i) $\$i \,\theta\, C$, where $i$ is a column of $R$, $\theta$ is an arithmetic comparison operator, and $C$ is a constant, or
(ii) $\$i \,\theta\, \$j$, where $\$i$ and $\$j$ are columns of $R$, and $\theta$ is an arithmetic comparison operator, or
(iii) an expression built from terms such as those described in (i) and (ii) above and the logical connectives $\vee$, $\wedge$, and $\neg$.
Then,
$\sigma_F R = \{ t \mid t \in R \wedge F' \}$
where $F'$ denotes the formula obtained from $F$ by replacing $\$i$ and $\$j$ with $t[i]$ and $t[j]$.
For example, if $F$ is "$\$2 = \$3 \wedge \$1 = bob$", then $F'$ is "$t[2] = t[3] \wedge t[1] = bob$". Thus:
$\sigma_{\$2 = \$3 \wedge \$1 = bob} R = \{ t \mid t \in R \wedge t[2] = t[3] \wedge t[1] = bob \}$.
All previous operators, except set difference, are monotonic.

Relational Operators
Cartesian product. The Cartesian product of $R$ and $S$ is denoted $R \times S$.
$R \times S = \{ t \mid (\exists r \in R)(\exists s \in S)(t[1, \ldots, n] = r \wedge t[n+1, \ldots, n+m] = s) \}$
If $R$ has $n$ columns and $S$ has $m$ columns, then $R \times S$ contains all the possible $(m+n)$-tuples whose first $n$ components form a tuple in $R$ and whose last $m$ components form a tuple in $S$. Thus, $R \times S$ has $m + n$ columns and $|R| \times |S|$ tuples, where $|R|$ and $|S|$ denote the respective cardinalities of the two relations.
Projection. Let $R$ be a relation with $n$ columns, and let $L = \$1, \ldots, \$n$ be a list of the columns of $R$. Let $L'$ be a sublist of $L$ obtained by (1) eliminating some of the elements, and (2) reordering the remaining ones in an arbitrary order. Then the projection of $R$ on columns $L'$, denoted $\pi_{L'}$, is defined as follows:
$\pi_{L'} R = \{ r[L'] \mid r \in R \}$

Relational Algebra (RA)

A family of operators on relations that have the closure property: they take relations as arguments and return relations as results.
Set operators:
Union. The union of relations $R$ and $S$, denoted $R \cup S$, is the set of tuples that are in $R$, or in $S$, or in both. Thus, it can be defined using TRC as follows:
$R \cup S = \{ t \mid t \in R \vee t \in S \}$
This operation is defined only if $R$ and $S$ have the same number of columns.
Set difference. The difference of relations $R$ and $S$, denoted $R - S$, is the set of tuples that belong to $R$ but not to $S$. Thus, it can be defined as follows ($t = r$ denotes that both $t$ and $r$ have $n$ components and $t[1] = r[1] \wedge \ldots \wedge t[n] = r[n]$):
$R - S = \{ t \mid t \in R \wedge \neg\exists r (r \in S \wedge t = r) \}$
This operation is defined only if $R$ and $S$ have the same number of columns (arity).

Sec. 8.2 Relational Calculi

Commercial DB Languages
The actual query languages of commercial RDBMSs are largely based on the formal query languages just discussed. For instance:
Query-By-Example (QBE) is a visual query language based on DRC.
Languages such as QUEL and SQL are instead based on TRC.
In QUEL and SQL, the notations t.Name and t.Course are used instead of t[1] and t[2]; also, existential quantification is replaced by the constructs RANGE (in QUEL) and FROM (in SQL).
Relational algebra (RA) provides a good basis for the efficient implementation of these relational languages.

Relational DB Languages
The differences between the various languages defined so far do not really affect their ultimate expressive power.
TRC and DRC are equivalent, and there are mappings that transform a formula in one language into an equivalent one in the other.
Also, for each TRC or DRC expression there is an equivalent nonrecursive Datalog program. The converse is also true, since a nonrecursive Datalog program can be mapped into an equivalent DRC query.
Query languages that achieve the level of expressive power shared by these languages are called relationally complete.
Another language that is equivalent to these, and thus relationally complete, is relational algebra (RA). Relational algebra is an operator-based language, and thus provides a useful link to the concrete implementation of these logic-based languages.

Tuple Relational Calculus (TRC)

In TRC, variables range over the tuples of a relation. For instance, the TRC expression for the query ?firstreq(N) is:
$\{ (t[1]) \mid \exists u \exists s (took(t) \wedge took(u) \wedge student(s) \wedge t[2] = cs101 \wedge u[2] = cs143 \wedge t[1] = u[1] \wedge s[3] = junior \wedge s[1] = t[1]) \}$
The variables $t$ and $s$, respectively, denote tuples ranging over took and student.
$t[1]$ denotes the first component in $t$ (corresponding to Name); $t[2]$ denotes the second component (i.e., the Course value of this tuple).
Let $j_1, \ldots, j_n$ denote columns of a relation $R$, and $t \in R$. Then the notation $t[j_1, \ldots, j_n]$ will be used to denote the $n$-tuple $(t[j_1], \ldots, t[j_n])$.
TRC requires an explicit statement of equality (e.g., $s[1] = t[1]$), while in DRC equality is denoted implicitly by the presence of the same variable in different places.

Explicit Quantifiers

DRC presents several syntactic differences w.r.t. Datalog:
set definition by abstraction (rather than rules),
conjunctions and disjunctions in the same formula,
nesting of parentheses, and
explicit quantifiers.
Existential and universal quantification are both allowed in DRC.
A query such as ?all_req_sat(N) can be expressed either
(i) using double negation (and only existential quantifiers), or
(ii) directly using the universal quantifier, as shown in the following example (find the seniors who completed all cs requirements):
$\{ (N) \mid \exists M (student(N, M, senior)) \wedge \forall C (req(cs, C) \rightarrow \exists G (took(N, C, G))) \}$   (1)
The implication sign $\rightarrow$: $p \rightarrow q$ is just a shorthand for $\neg p \vee q$.

Domain Relational Calculus

Relational calculus comes in two main flavors:
1. in the Domain Relational Calculus (DRC) the variables denote values of attributes,
2. in the Tuple Relational Calculus (TRC) variables denote whole tuples.
For instance, the query "Find the names of junior-level students who have taken both cs101 and cs143" (i.e., the Datalog query ?firstreq(N), top of page 165) can be expressed as follows:
$\{ (N) \mid \exists G_1 (took(N, cs101, G_1)) \wedge \exists G_2 (took(N, cs143, G_2)) \wedge \exists M (student(N, M, junior)) \}$
The query ?scndreq(N) can be expressed as follows:
$\{ (N) \mid \exists G, \exists M (took(N, cs131, G) \wedge G > 3.0 \wedge student(N, M, junior)) \vee \exists G, \exists M (took(N, cs151, G) \wedge G > 3.0 \wedge student(N, M, junior)) \}$

Sec. 8.4 From Safe Datalog to RA

Equivalence of RA and Safe Nonrecursive Datalog
Theorem: Let P be a safe Datalog program without recursion or function symbols. Then, for each predicate in P, there exists an equivalent relational algebra expression.

Mapping with Negated Goals

Say that the body of some rule contains a negated goal, such as the following body:
r : ... ← b1(a, Y), b2(Y), ¬b3(Y).
Then we consider a positive body, i.e., one constructed by dropping the negated goal,
rp : ... ← b1(a, Y), b2(Y).
and a negative body, i.e., one obtained by removing the negation sign from the negated goal:
rn : ... ← b1(a, Y), b2(Y), b3(Y).
The two bodies so generated are safe and contain no negation, so we can transform them into equivalent relational algebra expressions as per Step 2 of the mapping algorithm; let $Body_{rp}$ and $Body_{rn}$ be the RA expressions so obtained. Then the body expression to be used in Step 3 of said algorithm is simply
$Body_r = Body_{rp} - Body_{rn}$.

Mapping (cont.)
Step 2. The body of a rule r is translated into the RA expression $Body_r$. $Body_r$ consists of the Cartesian product of all the base or derived relations in the body, followed by a selection $\sigma_F$, where $F$ is the conjunction of the following conditions:
(i) inequality conditions for each such goal (e.g., Z > 24.3),
(ii) equality conditions between columns containing the same variable,
(iii) equality conditions between a column and the constant occurring in such a column.
For the example at hand, (i) the condition Z > 24.3 translates into the selection condition $5 > 24.3, (ii) the equality between the two occurrences of X translates into $1 = $2, while the equality between the two Ys maps into $3 = $4, and (iii) the constant in the last column of p maps into $6 = a. Thus we obtain:
$Body_r = \sigma_{\$1 = \$2,\; \$3 = \$4,\; \$6 = a,\; \$5 > 24.3}(Q \times P)$
Step 3. Each rule r is translated into an extended projection on $Body_r$, according to the patterns in the head of r. For the rule at hand we obtain:
$S = \pi_{\$5, b, \$5}(Body_r)$
Step 4. Multiple rules with the same head are translated into the union of their equivalent expressions.
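
Steps 2 and 3 can be sketched in Python for the example rule s(Z, b, Z) ← q(X, X, Y), p(Y, Z, a), Z > 24.3. This is only an illustration: the function name `translate_rule` and the sample tuples in `Q` and `P` are made up, since the slides give no data for q and p.

```python
from itertools import product


def translate_rule(Q, P):
    """S = pi_{$5, b, $5}( sigma_{$1=$2, $3=$4, $6=a, $5>24.3}(Q x P) )."""
    # Step 2: Cartesian product; columns $1..$3 come from Q, $4..$6 from P.
    body = [q + p for q, p in product(Q, P)]
    # Step 2: selection; Python indices are zero-based, so $i is t[i - 1].
    selected = [
        t for t in body
        if t[0] == t[1]      # $1 = $2  (two occurrences of X)
        and t[2] == t[3]     # $3 = $4  (two occurrences of Y)
        and t[5] == "a"      # $6 = a   (constant in the last column of p)
        and t[4] > 24.3      # $5 > 24.3
    ]
    # Step 3: generalized projection with the constant b in the middle.
    return {(t[4], "b", t[4]) for t in selected}


# Hypothetical sample relations for q(X, X, Y) and p(Y, Z, a).
Q = {(1, 1, "y1"), (1, 2, "y1"), (3, 3, "y2")}
P = {("y1", 30.0, "a"), ("y1", 10.0, "a"), ("y2", 50.0, "c"), ("y2", 25.0, "a")}
S = translate_rule(Q, P)
```

A real engine would push the selections into joins rather than materialize the full product; the sketch follows the textbook mapping literally.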

From Safe Rules to RA

Mapping a safe, nonrecursive Datalog program P into RA
Step 1. P is transformed into an equivalent program P' that does not contain any equality goal, by replacing equals with equals and removing the equality goals. For example:
r : s(Z, b, W) ← q(X, X, Y), p(Y, Z, a), W = Z, W > 24.3.
is translated into:
r : s(Z, b, Z) ← q(X, X, Y), p(Y, Z, a), Z > 24.3.

Safe Datalog
The following is an inductive definition of safety for a program P:
1. Safe predicates. A predicate q of P is safe if
(i) q is a database predicate, or
(ii) every rule defining q is safe.
2. Safe variables. A variable X in rule r is safe if
(i) X is contained in some positive goal $q(t_1, \ldots, t_n)$, where the predicate $q(A_1, \ldots, A_n)$ is safe, or
(ii) r contains some equality goal X = Y, where Y is safe.
3. Safe rules. A rule r is safe if all its variables are safe.
4. The goal $?q(t_1, \ldots, t_n)$ is safe when the predicate $q(A_1, \ldots, A_n)$ is safe.
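
As an illustration of conditions 2(i) and 2(ii), here is a small Python sketch that reports the unsafe variables of a single rule. The rule encoding (sets and lists of variable names) and the function name `unsafe_variables` are assumptions made for this example; the sketch also assumes that every positive goal uses a safe predicate and that an equality goal X = Y may be read in either direction.

```python
def unsafe_variables(rule_vars, positive_goals, equalities):
    """Return the variables of one rule that conditions 2(i)-2(ii) do not make safe.

    rule_vars:      every variable occurring anywhere in the rule
    positive_goals: one list of variables per positive goal (predicates assumed safe)
    equalities:     (X, Y) pairs for equality goals X = Y
    """
    # Condition 2(i): variables appearing in positive goals are safe.
    safe = {v for goal in positive_goals for v in goal}
    # Condition 2(ii): propagate safety through equality goals until nothing changes.
    changed = True
    while changed:
        changed = False
        for x, y in equalities:
            for a, b in ((x, y), (y, x)):  # assumption: X = Y is read symmetrically
                if b in safe and a not in safe:
                    safe.add(a)
                    changed = True
    return set(rule_vars) - safe


# bettergrade(G1) <- took("Joe Doe", cs143, G), G1 > G.
bettergrade = unsafe_variables({"G1", "G"}, [["G"]], [])
# s(Z, b, W) <- q(X, X, Y), p(Y, Z, a), W = Z, W > 24.3.
s_rule = unsafe_variables({"X", "Y", "Z", "W"}, [["X", "X", "Y"], ["Y", "Z"]], [("W", "Z")])
```

A rule is safe exactly when the returned set is empty; comparison goals such as G1 > G bind nothing, which is why G1 stays unsafe in the first rule.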

Safety
In practical languages, it is desirable to allow only safe formulas, which avoid the problems of infinite answers and loss of domain independence.
But the problems of domain independence and finiteness of answers are undecidable even for nonrecursive queries. Therefore, necessary and sufficient syntactic conditions that characterize safe formulas cannot be given in general.
In practice, therefore, sufficient conditions are defined that might be more restrictive than necessary.

Unsafe Rules
For instance, to find grades better than the grade Joe Doe got in cs143, a user might write the following rule:
bettergrade(G1) ← took("Joe Doe", cs143, G), G1 > G.
This rule presents the following peculiar traits:
1. Infinite answers. Assuming that, say, Joe Doe got the grade of 3.3 (i.e., B+) in course cs143, there are infinitely many numbers that satisfy the condition of being greater than 3.3.
2. Lack of domain independence. A query formula is said to be domain independent when its answer depends only on the database and the constants in the query, but not on the domain of interpretation. The set of values for G1 satisfying the rule above depends on what domain we assume for numbers: e.g., integer, rational or real. Thus there is no domain independence.
3. No relational algebra equivalent. Only database relations are allowed as operands of relational algebra expressions. These relations are finite, and so is the result of every RA expression over these relations. Therefore, there cannot be any RA expression over the database relations that is equivalent to the rule above.

Sec. 8.6 Stratification

Stratification
By sorting on pdg(P), the nodes of P can be partitioned into a finite set of n strata 1, ..., n, such that, for each rule r ∈ P, the predicate name of the head of r belongs to a stratum that
(i) is ≥ each stratum containing some positive goal, and also
(ii) is strictly > each stratum containing some negated goal.
Programs that are stratifiable always have a clear meaning; but programs that are not stratifiable might be ill-defined from a semantic viewpoint (see the examples in Chapter 10).
A stratification of a program will be called strict if every stratum contains either a single predicate or a set of predicates that are mutually recursive.
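
Conditions (i) and (ii) suggest a simple way to compute stratum numbers: start every predicate at stratum 1 and raise heads until both conditions hold. The Python sketch below does this for the howsoon program used in these slides; the function name `stratify` and the triple encoding of rules are assumptions for illustration, and the result is one valid stratification, not necessarily a strict one.

```python
def stratify(rules):
    """rules: list of (head, positive_goal_names, negated_goal_names).

    Returns {predicate: stratum}, or None if the program is not stratifiable.
    """
    preds = set()
    for head, pos, neg in rules:
        preds |= {head, *pos, *neg}
    stratum = {p: 1 for p in preds}
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            # (i) head >= each positive goal; (ii) head > each negated goal.
            need = max(
                [stratum[head]]
                + [stratum[g] for g in pos]
                + [stratum[g] + 1 for g in neg]
            )
            if need > len(preds):
                return None  # some negative arc lies on a cycle
            if need > stratum[head]:
                stratum[head] = need
                changed = True
    return stratum


howsoon_program = [
    ("fastest", ["part_cost"], ["faster"]),
    ("faster", ["part_cost"], []),
    ("timeForbasic", ["basic_subparts", "fastest"], []),
    ("howsoon", ["timeForbasic"], ["larger"]),
    ("larger", ["timeForbasic"], []),
    ("basic_subparts", ["part_cost"], []),
    ("basic_subparts", ["assembly", "basic_subparts"], []),
]
strata = stratify(howsoon_program)
```

A stratum number can never legitimately exceed the number of predicates, so exceeding that bound is used here as the signal that negation occurs inside a recursive cycle.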

Predicate Dependency Graph

The predicate dependency graph for a program P is a graph having as nodes the names of the predicates in P. The graph contains an arc a → b if there exists a rule with goal name a and head name b. If the goal is negated, then the arc is marked as a negative arc.
[Figure: PDG for the howsoon program. Nodes: howsoon, larger, timeForbasic, fastest, faster, basic_subparts, assembly, part_cost; negative arcs are marked with ¬.]
The nodes and arcs of the strong components of pdg(P), respectively, identify the recursive predicates and recursive rules of P.
A program is said to be stratifiable when none of its negative arcs belongs to a strong component.
Programs that are stratifiable always have a clear meaning; but programs that are not stratifiable might not.

Sec. 8.5 Recursive Rules

One-at-a-Time
Set aggregates, such as count or sum in SQL, require that the elements of a set be visited one at a time. (These aggregates also require arithmetic predicates, which we will consider later.)
Counting the elements in a set modulo an integer does not require arithmetic, but still requires that the elements of the set be visited one at a time.
The parity query: is the number of tuples in the base relation br(X) even?
between(X, Z) ← br(X), br(Y), br(Z), X < Y, Y < Z.
next(X, Y) ← br(X), br(Y), X < Y, ¬between(X, Y).
next(nil, X) ← br(X), ¬smaller(X).
smaller(X) ← br(X), br(Y), Y < X.
even(nil).
even(Y) ← odd(X), next(X, Y).
odd(Y) ← even(X), next(X, Y).
br_is_even ← even(X), ¬next(X, Y).
next sorts the elements of br into an ascending chain, where the first link of the chain connects the distinguished node nil to the least element in br (third rule in the example).
This works assuming that the elements in br are totally ordered.

Negation and Recursion

For each basic part, find the least time needed for delivery:
fastest(Part, Time) ← part_cost(Part, Sup1, Cost, Time),
                      ¬faster(Part, Time).
faster(Part, Time) ← part_cost(Part, Sup2, Cost, Time),
                     part_cost(Part, Sup1, Cost1, Time1),
                     Time1 < Time.
Times required for basic subparts of the given assembly:
timeForbasic(AssPart, BasicSub, Time) ←
                     basic_subparts(AssPart, BasicSub),
                     fastest(BasicSub, Time).
The maximum time required for basic subparts of the given assembly:
howsoon(AssPart, Time) ← timeForbasic(AssPart, _, Time),
                         ¬larger(AssPart, Time).
larger(Part, Time) ← timeForbasic(Part, _, Time),
                     timeForbasic(Part, _, Time1),
                     Time1 > Time.

Subparts
All subparts: a transitive-closure query
all_subparts(Part, Sub) ← assembly(Part, Sub, _).
all_subparts(Part, Sub2) ← all_subparts(Part, Sub1),
                           assembly(Sub1, Sub2, _).
For each part, basic or otherwise, find its basic subparts. A basic part is a subpart of itself.
basic_subparts(BasicP, BasicP) ← part_cost(BasicP, _, _, _).
basic_subparts(Prt, BasicP) ← assembly(Prt, SubP, _),
                              basic_subparts(SubP, BasicP).

Relational Tables for a BoM application

PART_COST
BASIC_PART  SUPPLIER    COST   TIME
top_tube    cinelli     20.00  14
top_tube    columbus    15.00   6
down_tube   columbus    10.00   6
head_tube   cinelli     20.00  14
head_tube   columbus    15.00   6
seat_mast   cinelli     20.00   6
seat_mast   cinelli     15.00  14
seat_stay   cinelli     15.00  14
seat_stay   columbus    10.00   6
chain_stay  columbus    10.00   6
fork        cinelli     40.00  14
fork        columbus    30.00   6
spoke       campagnolo   0.60  15
nipple      mavic        0.10   3
hub         campagnolo  31.00   5
hub         suntour     18.00  14
rim         mavic       50.00   3
rim         araya       70.00   1

ASSEMBLY
PART   SUBPART     QTY
bike   frame        1
bike   wheel        2
frame  top_tube     1
frame  down_tube    1
frame  head_tube    1
frame  seat_mast    1
frame  seat_stay    2
frame  chain_stay   2
frame  fork         1
wheel  spoke       36
wheel  nipple      36
wheel  rim          1
wheel  hub          1
wheel  tire         1

Transitive Closure Queries

Transitive closure of the graph arc(X, Y), right-linear version:
path(X, Y) ← arc(X, Y).
path(X, Z) ← arc(X, Y), path(Y, Z).
Transitive closure of the graph arc(X, Y), left-linear recursive rule:
path(X, Z) ← path(X, Y), arc(Y, Z).
Transitive closure of the graph arc(X, Y), nonlinear recursive rule:
path(X, Z) ← path(X, Y), path(Y, Z).
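
A minimal Python sketch of how the first pair of rules can be evaluated bottom-up: apply the recursive rule repeatedly until no new path tuples appear. The function name and the three-arc sample graph are hypothetical and serve only as an illustration.

```python
def transitive_closure(arc):
    """Bottom-up evaluation of:
         path(X, Y) <- arc(X, Y).
         path(X, Z) <- arc(X, Y), path(Y, Z).
    """
    path = set(arc)  # exit rule
    while True:
        # One application of the recursive rule: join arc with the current path.
        new = {(x, z) for (x, y) in arc for (y2, z) in path if y == y2}
        if new <= path:  # nothing new was derived: fixpoint reached
            return path
        path |= new


# Hypothetical sample graph: a -> b -> c -> d.
sample_arcs = {("a", "b"), ("b", "c"), ("c", "d")}
closure = transitive_closure(sample_arcs)
```

The left-linear and nonlinear variants compute the same relation; they differ only in which relations are joined at each step.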

Sec. 8.7 Expressive Power and Data Complexity

The Expressive Power Hierarchy

1. Stratified safe Datalog is equivalent to RA + fixpoint (on monotonic RA).
2. Safe, stratified Datalog expresses every DB-PTIME query if we assume that there exists a total order in the databases (thus it is DB-PTIME complete).
3. Order-independence property of queries (genericity): queries are insensitive to the renaming of constants.
4. To be able to express all DB-PTIME queries under genericity, a nondeterministic construct such as choice is needed: stratified Datalog with choice is DB-PTIME-complete.
5. Safe Datalog (without function symbols) can express exponential queries, using nonstratified negation and stable model semantics (to be discussed later).
6. Function symbols and recursion are needed for Turing completeness.

Polynomial Data Complexity

1. Use Turing machines as the general model of computation and encode the database as a tape of length n.
2. Then any computable function on the database can be encoded as a Turing machine.
3. Some of these machines halt (complete their computation) in O(n) steps, others in an exponential number of steps, and others never terminate.
4. The set of machines that halt in a number of steps that is polynomial in n defines the class of DB-PTIME functions.
Are relational algebra expressions (≡ safe relational calculus ≡ safe nonrecursive Datalog) evaluable in DB-PTIME?
Yes, and in practice we use indices and query optimizers to keep exponents and coefficients small.
But these languages cannot express all DB-PTIME queries. For instance, they cannot express transitive closures or aggregates (thus the most frequently used aggregates were added to SQL in ad hoc fashion).

Expressive Power
1. Expressive power of a language: the set of functions that can be written as programs of the language.
2. Data complexity: query languages are viewed as mappings from the DB to the answer. The big O is evaluated in terms of the size of the database, which is always finite.
The following languages are equivalent w.r.t. expressive power:
1. Relational algebra expressions
2. Safe relational calculus queries (tuple or domain)
3. Datalog with safe nonrecursive rules.
Any language that is at least as powerful as these is said to be relationally complete. Query languages must be relationally complete (Codd, 1970).
But relational completeness is not enough. For instance, set aggregates are beyond relational completeness, and had to be added to SQL in ad hoc fashion.
Question: what is a reasonable requirement for a query language?
DB-PTIME completeness might be the answer.

Sec. 8.7.1 Functors and Complex Terms

Nesting a Flat Relation

Now, if we have:
ps(top_tube, cinelli).
ps(top_tube, columbus).
ps(top_tube, mavic).
How do we construct the nested relation back?
between(P, X, Z) ← ps(P, X), ps(P, Y), ps(P, Z), X < Y, Y < Z.
smaller(P, X) ← ps(P, X), ps(P, Y), Y < X.
nested(P, [X]) ← ps(P, X), ¬smaller(P, X).
nested(P, [Y|[X|W]]) ← nested(P, [X|W]), ps(P, Y), X < Y,
                       ¬between(P, X, Y).
ps_nested(P, W) ← nested(P, W), ¬nested(P, [X|W]).

Lists
[ ] is the empty list.
[Head|Tail] represents a non-empty list.
[mary, mike, seattle]
[mary, [mike, [seattle, [ ]]]]
A list-based representation for suppliers of top_tube:
part_sup_list(top_tube, [cinelli, columbus, mavic]).
Normalizing a nested relation into a flat relation:
flatten(P, S, L) ← part_sup_list(P, [S|L]).
flatten(P, S, L) ← flatten(P, _, [S|L]).
ps(Part, Sup) ← flatten(Part, Sup, _).
This program applied to the previous fact yields:
ps(top_tube, cinelli)
ps(top_tube, columbus)
ps(top_tube, mavic)
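
The flattening rules and the nesting rules can be mirrored in a short Python sketch. The function names `flatten_rows` and `nest` are hypothetical; Python lists stand in for Datalog lists, with `rest[0]` and `rest[1:]` playing the role of the [S|L] pattern.

```python
part_sup_list = [("top_tube", ["cinelli", "columbus", "mavic"])]


def flatten_rows(nested):
    """ps(Part, Sup): peel the head S off [S|L] until the list is empty."""
    ps = []
    for part, suppliers in nested:
        rest = suppliers
        while rest:
            sup, rest = rest[0], rest[1:]  # the [S|L] pattern
            ps.append((part, sup))
    return ps


def nest(ps):
    """ps_nested(P, W): one maximal list per part.

    The nested rules start from the least supplier and push each next larger
    one onto the head, so the final list is in descending order.
    """
    by_part = {}
    for part, sup in ps:
        by_part.setdefault(part, set()).add(sup)
    return [(part, sorted(sups, reverse=True)) for part, sups in sorted(by_part.items())]


flat = flatten_rows(part_sup_list)
renested = nest(flat)
```

Note that renesting recovers the same set of suppliers but not necessarily the original list order, because the nesting rules impose the order themselves.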

Functors and Complex Terms

Flat parts, their number, shape and weight, following the schema part(Part#, Shape, Weight):
part(202, circle(11), actualkg(0.034)).
part(121, rectangle(10, 20), unitkg(2.1)).
part_weight(No, Kilos) ← part(No, _, actualkg(Kilos)).
part_weight(No, Kilos) ← part(No, Shape, unitkg(K)),
                         area(Shape, Area), Kilos = K * Area.
area(circle(Dmtr), A) ← A = Dmtr * Dmtr * 3.14/4.
area(rectangle(Base, Height), A) ← A = Base * Height.
The complex terms circle(11), actualkg(0.034), rectangle(10, 20), and unitkg(2.1) are in logical parlance called functions (a functor followed by a list of arguments in parentheses).
In actual applications, these complex terms do not represent evaluable functions; they are used as variable-length sub-records.
Thus, circle(11) and rectangle(10, 20), respectively, denote that the shape of our first part is a circle with diameter 11 cm, while the shape of the second part is a rectangle with base 10 cm and height 20 cm. Any number of sub-arguments is allowed in such complex terms, recursively.
Objects of arbitrary complexity, including solid objects, can be nested and represented in this fashion.
Functors are here used as case discriminants.
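
The role of functors as case discriminants can be illustrated with tagged tuples in Python, where the first element of each tuple plays the part of the functor name. The encoding and the function names below are assumptions made for this sketch.

```python
# Complex terms as tagged tuples: ("circle", 11) stands for circle(11).
parts = [
    (202, ("circle", 11), ("actualkg", 0.034)),
    (121, ("rectangle", 10, 20), ("unitkg", 2.1)),
]


def area(shape):
    """area(circle(Dmtr), A) and area(rectangle(Base, Height), A)."""
    if shape[0] == "circle":
        dmtr = shape[1]
        return dmtr * dmtr * 3.14 / 4
    if shape[0] == "rectangle":
        return shape[1] * shape[2]
    raise ValueError(f"unknown shape functor: {shape[0]}")


def part_weight(no):
    """The two part_weight rules, selected by the functor of the Weight term."""
    for number, shape, weight in parts:
        if number == no:
            if weight[0] == "actualkg":
                return weight[1]
            if weight[0] == "unitkg":
                return weight[1] * area(shape)
    raise KeyError(no)
```

Part 202 carries its actual weight, so no area is computed for it; part 121 carries a weight per unit of area, so its weight is 2.1 times the rectangle's area.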

Sec. 8.9 The Models of a Program

Minimal Models and Least Models

A model $M$ for a program $P$ is said to be a minimal model for $P$ if there exists no other model $M'$ of $P$ where $M' \subset M$. A model $M$ for a program $P$ is said to be its least model if $M \subseteq M'$ for every model $M'$ of $P$.
Model intersection property. Let $P$ be a positive program, and $M_1$ and $M_2$ be two models for $P$. Then $M_1 \cap M_2$ is also a model for $P$.
Theorem: Every positive program has a least model.

Models of a Program
Let $I$ be an interpretation for a program $P$. If an atom $a \in I$ we say that $a$ is true; otherwise we say that $a$ is false. The converse holds for negated atoms $\neg a$.
Satisfaction. A rule $r \in P$ is said to hold true in interpretation $I$, or to be satisfied in $I$, if every instance of $r$ is satisfied in $I$.
Model. An interpretation $I$ that makes true all rules of $P$ is called a model for $P$.
$I$ is a model for $P$ iff it satisfies all the rules in $ground(P)$.

Ground Version of a Program

Let $r$ be a rule in a program $P$.
$ground(r)$ denotes the set of ground instances of $r$ (i.e., all the rules obtained by assigning to the variables in $r$ values from the Herbrand universe $U_P$).
Example:
parent(X, Y) ← mother(X, Y).
Since there are 2 variables in this rule and $|U_P| = 3$, $ground(r)$ consists of $3 \times 3$ rules:
parent(anne, anne) ← mother(anne, anne).
parent(anne, marc) ← mother(anne, marc).
...
parent(silvia, silvia) ← mother(silvia, silvia).
The ground version of a program $P$, denoted $ground(P)$, is the set of the ground instances of its rules:
$ground(P) = \{ ground(r) \mid r \in P \}$

Sec. 8.8 Syntax and Semantics of Datalog Languages

Examples
anc(X, Y) ← parent(X, Y).
anc(X, Z) ← anc(X, Y), parent(Y, Z).
parent(X, Y) ← father(X, Y).
parent(X, Y) ← mother(X, Y).
mother(anne, silvia).
mother(anne, marc).
In this example:
$U_P = \{anne, silvia, marc\}$,
$B_P = \{parent(x, y) \mid x, y \in U_P\} \cup \{father(x, y) \mid x, y \in U_P\} \cup \{mother(x, y) \mid x, y \in U_P\} \cup \{anc(x, y) \mid x, y \in U_P\}$
There are $2^{36}$ Herbrand interpretations for this program. (There are four binary predicates, with three possible assignments for their first arguments and three for their second arguments, so $|B_P| = 4 \times 3 \times 3 = 36$. There are $2^{|B_P|}$ subsets of $B_P$.)
With an infinite universe there is an infinite number of interpretations.
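
The count above can be checked by enumerating the Herbrand base directly. The following Python sketch is only an illustration; the variable names and the (predicate, arguments) encoding of atoms are assumptions.

```python
from itertools import product

universe = ["anne", "silvia", "marc"]                          # U_P
predicates = {"parent": 2, "father": 2, "mother": 2, "anc": 2}  # name -> arity

# B_P: every predicate applied to every combination of universe elements.
herbrand_base = {
    (pred, args)
    for pred, arity in predicates.items()
    for args in product(universe, repeat=arity)
}

# Each Herbrand interpretation is a subset of B_P.
num_interpretations = 2 ** len(herbrand_base)
```

Even this tiny program has 68,719,476,736 interpretations, which is why the semantics is defined through the least model rather than by enumeration.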

Herbrand Interpretation for a Program P

The Herbrand universe for $P$, denoted $U_P$, is defined as the set of all terms that can be recursively constructed by letting the arguments of the functions be constants in $P$ or elements in $U_P$.
Then the Herbrand base of $P$, denoted $B_P$, is defined as the set of atoms that can be built by assigning elements of $U_P$ to the arguments of the predicates.
A Herbrand interpretation is defined by assigning to each $n$-ary predicate $q$ an $n$-ary relation $Q$, where $q(a_1, \ldots, a_n)$ is true iff $(a_1, \ldots, a_n) \in Q$, with $a_1, \ldots, a_n$ denoting elements in $U_P$.
Alternatively, a Herbrand interpretation of $P$ is a subset of the Herbrand base of $P$.

Positive Programs
A definite clause with an empty body is called a unit clause.
It is customary to use the notation "A." instead of the more precise notation "A ←." for such clauses.
A fact is a unit clause without variables.
A unit clause (everybody loves himself) and three facts:
loves(X, X).
loves(marc, mary).
loves(mary, tom).
hates(marc, tom).
A positive logic program is a set of definite clauses.
We will use the terms definite clause program and positive program as synonyms.

Closed Formulas and Clauses

A WFF $F$ is said to be a closed formula if every variable occurrence in $F$ is quantified.
The formula in the previous example is not closed. But the following one is:
$\forall x \forall y \forall z\, (p(x, z) \vee \neg q(x, y) \vee \neg r(y, z))$
A definite clause is a WFF that has the following properties:
it is closed,
all its variables are universally quantified,
it is a disjunction of one positive atom and zero or more negated atoms.
A definite clause is representable with the rule notation:
$\forall x \forall y \forall z\;\; p(x, z) \leftarrow q(x, y), r(y, z).$

Well-Formed Formulas (WFFs)

1. If $p$ is an $n$-ary predicate and $t_1, \ldots, t_n$ are terms, then $p(t_1, \ldots, t_n)$ is a formula (called an atomic formula or, more simply, an atom).
2. If $F$ and $G$ are formulas, then so are $\neg F$, $F \vee G$, $F \wedge G$, $F \leftarrow G$, $F \rightarrow G$ and $F \leftrightarrow G$.
3. If $F$ is a formula and $x$ is a variable, then $\forall x (F)$ and $\exists x (F)$ are formulas. When so, $x$ is said to be quantified in $F$.
Example:
$\exists G_1 (took(N, cs101, G_1)) \wedge \exists G_2 (took(N, cs143, G_2)) \wedge \exists M (student(N, M, junior))$

Syntax of First Order Logic

Its alphabet consists of:
1. Constants.
2. Variables: in addition to identifiers beginning with upper case, x, y and z also represent variables in this section.
3. Functions, such as $f(t_1, \ldots, t_n)$, where $f$ is an $n$-ary functor and $t_1, \ldots, t_n$ are the arguments.
4. Predicates.
5. Connectives. These include the basic logical connectives $\vee$, $\wedge$, $\neg$ and the implication symbols $\leftarrow$, $\rightarrow$, and $\leftrightarrow$.
6. Quantifiers. The existential quantifier $\exists$ and the universal quantifier $\forall$.
7. Parentheses and punctuation symbols, used liberally as needed to avoid ambiguities.
A term is defined inductively as follows:
(a) A variable is a term.
(b) A constant is a term.
(c) If $f$ is an $n$-ary functor and $t_1, \ldots, t_n$ are terms, then $f(t_1, \ldots, t_n)$ is a term.

Sec. 8.10 Fixpoint-Based Semantics

Computation of the Least Fixpoint
$M_P = \mathit{lfp}(T_P) = T_P^{\uparrow\omega}(\emptyset)$ yields a simple algorithm for computing the least model of a definite-clause program.
Since $T_P$ is monotonic and $T_P^{\uparrow 0}(\emptyset) \subseteq T_P^{\uparrow 1}(\emptyset)$, it follows that $T_P^{\uparrow n}(\emptyset) \subseteq T_P^{\uparrow n+1}(\emptyset)$.
Thus, the successive powers of $T_P$ form an ascending chain. Moreover:
$T_P^{\uparrow k}(\emptyset) = \bigcup_{n \le k} T_P^{\uparrow n}(\emptyset)$
If $T_P^{\uparrow n+1}(\emptyset) = T_P^{\uparrow n}(\emptyset)$, then $T_P^{\uparrow n}(\emptyset) = T_P^{\uparrow\omega}(\emptyset)$.
Thus, the least fixpoint and least model can be computed by starting from the bottom and iterating the application of $T_P$ until no new atoms are obtained and the $(n+1)$-th power is identical to the $n$-th one (if this condition never occurs, then we have an infinite computation).

Operational Semantics: Powers of $T_P$
$T_P^{\uparrow 0}(I) = I$
...
$T_P^{\uparrow n+1}(I) = T_P(T_P^{\uparrow n}(I))$
Moreover, with $\omega$ denoting the first limit ordinal, we define:
$T_P^{\uparrow\omega}(I) = \bigcup \{ T_P^{\uparrow n}(I) \mid n \ge 0 \}$
Of particular interest are the powers of $T_P$ starting from the empty set, i.e., for $I = \emptyset$.
Theorem: Let $P$ be a definite clause program. Then $\mathit{lfp}(T_P) = T_P^{\uparrow\omega}(\emptyset)$.

Least Fixpoint of $T_P$
We view a program $P$ as defining the following fixpoint equation over Herbrand interpretations:
$I = T_P(I)$
In general, a fixpoint equation might have no solution, one solution or several solutions.
Interpretations are subsets of $B_P$, i.e., elements of $2^{B_P}$, which is partially ordered by $\subseteq$. Actually, $(2^{B_P}, \subseteq)$ is a complete lattice.
Also, $T_P$ is monotonic.
Thus, by the Knaster-Tarski fixpoint theorem:
Theorem: Let $P$ be a definite clause program. There always exists a least fixpoint for $T_P$, denoted $\mathit{lfp}(T_P)$.
Theorem: Let $P$ be a definite clause program. Then $M_P = \mathit{lfp}(T_P)$.

Immediate Consequence Operator
Rules can be viewed as mappings. Recursive rules define a fixpoint equation.
The immediate consequence operator $T_P$ is defined as follows:
$T_P(I) = \{ A \in B_P \mid \exists r : A \leftarrow A_1, \ldots, A_n \in ground(P),\ \{A_1, \ldots, A_n\} \subseteq I \}$
Thus $T_P$ is a mapping from Herbrand interpretations of $P$ to Herbrand interpretations of $P$.
For the ancestor program:
With $I = \{anc(anne, marc), parent(marc, silvia)\}$, we have:
$T_P(I) = \{anc(marc, silvia), anc(anne, silvia), mother(anne, silvia), mother(anne, marc)\}$
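
The operator and its iteration from the empty set can be sketched in Python for the ancestor program. The rules are hard-coded rather than parsed, atoms are encoded as (predicate, arg1, arg2) tuples, and the names `tp` and `least_fixpoint` are assumptions made for this illustration.

```python
from itertools import product

U = ["anne", "silvia", "marc"]  # Herbrand universe of the ancestor program
FACTS = {("mother", "anne", "silvia"), ("mother", "anne", "marc")}


def tp(I):
    """Immediate consequence operator T_P for the ancestor program."""
    out = set(FACTS)  # unit clauses have an empty body, so they always fire
    for x, y in product(U, repeat=2):
        if ("parent", x, y) in I:
            out.add(("anc", x, y))       # anc(X, Y) <- parent(X, Y).
        if ("father", x, y) in I or ("mother", x, y) in I:
            out.add(("parent", x, y))    # parent(X, Y) <- father(X, Y) / mother(X, Y).
        for z in U:
            if ("anc", x, y) in I and ("parent", y, z) in I:
                out.add(("anc", x, z))   # anc(X, Z) <- anc(X, Y), parent(Y, Z).
    return out


def least_fixpoint():
    """Iterate T_P from the empty set until the n-th and (n+1)-th powers coincide."""
    I, n = set(), 0
    while True:
        nxt = tp(I)
        if nxt == I:
            return I, n
        I, n = nxt, n + 1


example = tp({("anc", "anne", "marc"), ("parent", "marc", "silvia")})
model, powers = least_fixpoint()
```

For this program the iteration stabilizes at the third power: the facts appear first, then the parent atoms, then the anc atoms.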

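Sec. 9.3 Differential Fixpoint Computation

General Rewriting
The general expression of $T_R(\delta S, S, S')$ for a recursive rule of rank $k$ is as follows:
$r : Q_0 \leftarrow c_0, Q_1, c_1, Q_2, \ldots, Q_k, c_k.$
$r_1 : \delta' Q_0 \leftarrow c_0, \delta Q_1, c_1, Q'_2, \ldots, Q'_k, c_k$
$r_2 : \delta' Q_0 \leftarrow c_0, Q_1, c_1, \delta Q_2, \ldots, Q'_k, c_k$
...
$r_j : \delta' Q_0 \leftarrow c_0, Q_1, c_1, \ldots, \delta Q_j, \ldots, Q'_k, c_k$
...
$r_k : \delta' Q_0 \leftarrow c_0, Q_1, c_1, Q_2, \ldots, \delta Q_k, c_k$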

Seminaive Fixpoint
Seminaive fixpoint is another name for the differential fixpoint just described.
Analogy with symbolic differentiation.
Performance improvements: it is typically the case that $n = |\delta S| \ll N = |S| \approx |S'|$.
The original ancs rule, for instance, requires the equijoin of two relations of size $N$; after the differentiation we need to compute two equijoins, each joining a relation of size $n$ with one of size $N$.

Nonlinear Rules
Quadratic Ancestor Rules
ancs(X, Y) <- parent(X, Y).
ancs(X, Z) <- ancs(X, Y), ancs(Y, Z).
r : δ'ancs(X, Z) <- ancs'(X, Y), ancs'(Y, Z).
r1 : δ'ancs(X, Z) <- δancs(X, Y), ancs'(Y, Z).
r2 : δ'ancs(X, Z) <- ancs(X, Y), ancs'(Y, Z).
Now, we can rewrite r2 as:
r2,1 : δ'ancs(X, Z) <- ancs(X, Y), δancs(Y, Z).
r2,2 : δ'ancs(X, Z) <- ancs(X, Y), ancs(Y, Z).
Rule r2,2 produces only `old' values, and can thus be eliminated.
All that is left is rules r1 and r2,1, below:
δ'ancs(X, Z) <- δancs(X, Y), ancs'(Y, Z).
δ'ancs(X, Z) <- ancs(X, Y), δancs(Y, Z).
For nonlinear rules, the immediate consequence operator in the
Algorithm has the more general form $\delta' S := T_R(\delta S, S, S') - S'$,
where $\delta S = S' - S$.
Symbolic Differentiation
anc, δanc, and anc', respectively, denote ancestor atoms that are
in $S$, $\delta S$, and $S' = S \cup \delta S$.
Then, to compute $\delta' S := T_R(S') - S'$ we can use a $T_R$ defined by
the following rule:
δ'anc(X, Z) <- anc'(X, Y), parent(Y, Z).
This can be rewritten as:
δ'anc(X, Z) <- δanc(X, Y), parent(Y, Z).
δ'anc(X, Z) <- anc(X, Y), parent(Y, Z).
The second rule can now be eliminated, since it produces
only atoms that were already contained in anc', i.e., in the
S' computed in the previous iteration.
Thus, in the previous Algorithm, rather than using $\delta' S := T_R(S') - S'$
we can write $\delta' S := T_R(\delta S) - S'$, to express the fact that the
argument of $T_R$ is the set of delta tuples from the previous step,
rather than the set of all tuples obtained so far. This transformation
holds for all linear recursive rules.

Differential Fixpoint
Redundant computation: the j-th iteration step also re-computes
all atoms obtained in the (j-1)-th step. Finite-difference techniques
trace the derivations over two steps:
1. $S$ is a relation containing the atoms obtained up to step j-1,
2. $\delta S = \Gamma_R(S) - S = T_R(S) - S$ denotes the atoms newly
obtained at step j (i.e., the atoms that were not in S at step j-1),
3. $\delta' S = \Gamma_R(S') - S' = T_R(S') - S'$, where $S' = S \cup \delta S$, are the atoms
newly obtained at step j+1.
The naive fixpoint algorithm can be improved as follows:
Differential fixpoint
S := M;
δS := Γ_E(M);
S' := S ∪ δS;
while (δS ≠ ∅)
{
    δ'S := T_R(S') - S';
    S := S';
    δS := δ'S;
    S' := S ∪ δS
}
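A minimal Python sketch of the differential loop, for the linear ancestor rule anc(X, Z) <- anc(X, Y), parent(Y, Z), is given here next to a naive loop for comparison. It applies $T_R$ to the delta only, which is valid for linear rules. The function names, the `joined` counter, and the four-node chain are assumptions for illustration.

```python
def t_r(anc_atoms, parent):
    """T_R for the rule anc(X, Z) <- anc(X, Y), parent(Y, Z)."""
    return {(x, z) for (x, y) in anc_atoms for (y2, z) in parent if y == y2}

def differential_fixpoint(parent):
    s = set()                 # S: atoms known before the current step
    delta = set(parent)       # delta S from the exit rule anc(X, Y) <- parent(X, Y)
    s_new = s | delta         # S' = S union delta S
    joined = 0                # how many anc tuples were fed to the join
    while delta:
        joined += len(delta)
        delta_next = t_r(delta, parent) - s_new   # linear rule: T_R(delta S) - S'
        s = s_new
        delta = delta_next
        s_new = s | delta
    return s_new, joined

def naive_fixpoint(parent):
    s, s_new, joined = set(), set(parent), 0
    while s != s_new:
        s = s_new
        joined += len(s)
        s_new = t_r(s, parent) | s                # inflationary step
    return s_new, joined

# Hypothetical parent relation: a chain a -> b -> c -> d.
parent = {("a", "b"), ("b", "c"), ("c", "d")}
semi_result, semi_joined = differential_fixpoint(parent)
naive_result, naive_joined = naive_fixpoint(parent)
```

Both loops return the same six ancestor pairs, but the differential version feeds fewer tuples to the join because each iteration joins only the newly derived atoms.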
Inflationary Fixpoint
$\Gamma_{P_j}^{\uparrow \omega}(M_{j-1})$ for stratum j of the iterated fixpoint computation is
computed as follows (to simplify the notation, $\Gamma$ stands for $\Gamma_{P_j}$
and $M$ for $M_{j-1}$).
The computation for each stratum (naive fixpoint algorithm)
S := M;
S' := Γ(M)
while (S ⊂ S')
{
    S := S';
    S' := Γ(S)
}
But $T(M) = T_E(M)$ and $\Gamma(M) = \Gamma_E(M)$, where $T_E$ denotes
the immediate consequence operator for the exit rules and $\Gamma_E$
denotes its inflationary version.
Let $T_R$ denote the immediate consequence operator for the recursive
rules and let $\Gamma_R$ be its inflationary version. Then, $\Gamma(S)$ in
the while loop can be replaced by $\Gamma_R(S)$, while $\Gamma(M)$ outside
the loop can be replaced by $\Gamma_E(M)$.
r1 : anc(X, Y) <- parent(X, Y).
r2 : anc(X, Z) <- anc(X, Y), parent(Y, Z).
$\Gamma_E$ and $\Gamma_R$ are defined by rules r1 and r2, respectively.
Semantics
For positive programs:
Theorem: Let P be a positive program stratified in n strata,
and let $M_n$ be the result produced by the iterated fixpoint
computation. Then, $M_P = M_n$, where $M_P$ is the least model
of P.
For programs with stratified negation:
Theorem: Let P be a program stratified in n strata, and let
$M_n$ be the result produced by the iterated fixpoint computation.
$M_n$ is the unique stable model for P.

Sec: 9.1,2 Operational Semantics: Bottom-Up Execution
Stratified Programs and Iterated Fixpoint
In actual systems, $T_P^{\uparrow \omega}(\emptyset)$ is computed by strata. Unless otherwise
specified, let us assume strict stratification.
Let P be a program. The inflationary immediate consequence
operator for P, denoted $\Gamma_P$, is a mapping on subsets of $B_P$
defined as follows:
$\Gamma_P(I) = T_P(I) \cup I$
The computation of $T_P^{\uparrow \omega}(\emptyset)$ is frequently called the inflationary
fixpoint computation:
$\Gamma_P^{\uparrow n}(\emptyset) = T_P^{\uparrow n}(\emptyset)$
$M_P = \mathrm{lfp}(T_P) = T_P^{\uparrow \omega}(\emptyset) = \mathrm{lfp}(\Gamma_P) = \Gamma_P^{\uparrow \omega}(\emptyset)$
Iterated Fixpoint Computation for program P stratified in n strata
Let $P_j$, $1 \leq j \leq n$, denote the rules with their head in the j-th
stratum. Then, $M_j$ is inductively constructed as follows:
1. $M_0 = \emptyset$ and
2. $M_j = \Gamma_{P_j}^{\uparrow \omega}(M_{j-1})$.

Chapter 9:
Implementation of Rules and Recursion

Sec: 9.4 Top-Down Execution
Programming in Prolog
A solution to the previous problems is to put the exit rule before
the recursive one.
anc(X, Y) <- parent(X, Y).
anc(X, Z) <- anc(X, Y), parent(Y, Z).
Prolog still loops after the generation of all the results. To make things
work, parent must be put before anc in the recursive rule. This skill is
not hard to learn.
In many cases, however, reordering rules and goals does not avoid
infinite loops.
anc(X, Y) <- parent(X, Y).
anc(X, Z) <- anc(X, Y), anc(Y, Z).
Cycles in the parent database will also cause problems: SLD-resolution
has no memory.

Prolog
Depth-first exploration of alternatives, where goals are always
chosen in a left-to-right order and the heads of the
rules are also considered in the order they appear in the
program.
The programmer is given responsibility for ordering the rules
and their goals in such a fashion as to guide Prolog into
successful and efficient searches.
The programmer must also make sure that the procedure
never falls into an infinite loop.
Example: The goal ?anc(marc, mary) on the program:
anc(X, Z) <- anc(X, Y), parent(Y, Z).
anc(X, Y) <- parent(X, Y).
This causes an infinite loop that never returns any result.

Refutation
If $S = \{F_1, \ldots, F_n\}$ is a finite set of closed formulas, then F is a
logical consequence of S iff $F_1 \wedge \ldots \wedge F_n \rightarrow F$ is valid.
Theorem: Let S be a set of closed formulas and F be a closed
formula. Then F is a logical consequence of S iff $S \cup \{\neg F\}$ is
unsatisfiable.
Thus to prove a goal G from a set of rules and facts P we simply
have to prove that $P \cup \{\leftarrow G\}$ is unsatisfiable, i.e., we have to
refute $P \cup \{\leftarrow G\}$.
Resolution theorem proving does exactly that. It refutes the goal
list.
Prolog can be viewed in that light. But in fact there is no real
refutation, just procedural composition via unification.
The term SLD stands for Selected literal Linear resolution (or
refutation) strategy over Definite clauses.

Satisfiability
Let S be a set of closed formulas. We say that
S is satisfiable if there is an interpretation which is a model for S.
S is valid if every interpretation of L is a model for S.
S is unsatisfiable if it has no models.
Theorem: Let S be a set of clauses. Then S is unsatisfiable iff
S has no Herbrand models.
Let S be a set of closed formulas and F be a closed formula of a
first order language L. We say F is a logical consequence of S if
every interpretation of L that is a model for S is also a model for F.
Note that if $S = \{F_1, \ldots, F_n\}$ is a finite set of closed formulas,
then F is a logical consequence of S iff $F_1 \wedge \ldots \wedge F_n \rightarrow F$ is
valid.

Equivalent Semantics
Theorem: The success set of a program is equal to its least
Herbrand model.
Equivalence of the three formal semantics (Least Model,
Least Fixpoint, and SLD-resolution).
SLD-resolution is a form of theorem proving (an efficient one).
In general, generation of the success set requires that all choices
are visited in a breadth-first fashion. This is too inefficient for
practical languages such as Prolog, which uses depth-first search
instead.

Success Set
SLD-derivations can be finite or infinite.
- A finite SLD-derivation can be successful or failed.
- A successful SLD-derivation is a finite one that ends in
the empty clause. This is also called an SLD-refutation.
- A failed SLD-derivation is a finite one that ends in a non-empty
goal, where the selected atom in this goal does not
unify with the head of any program clause.
Definition: Let P be a program. The success set of P is the
set of all $A \in B_P$ such that $P \cup \{\leftarrow A\}$ has an SLD-refutation
(i.e., there exists some successful derivation for it).

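SLD: Infinite Trees
Goal p(x, b) on the program:
1. p(x, z) <- q(x, y), p(y, z)
2. p(x, x)
3. q(a, b)
Figure: An infinite SLD-tree. From the root p(X, b), clause 1 leads to q(X, Y), p(Y, b) and clause 2 succeeds with X/b. Expanding p(Y, b) with clause 1 gives q(X, Y), q(Y, Z), p(Z, b), and with clause 2 gives q(X, b), which succeeds with X/a through clause 3. Expanding p(Z, b) with clause 1 gives q(X, Y), q(Y, Z), q(Z, U), p(U, b), a branch that keeps growing and is infinite; with clause 2 it gives q(X, Y), q(Y, b), which leads through clause 3 to q(X, a) and fails.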

This SLD-tree comes from the standard PROLOG computation
rule (select the leftmost atom). The selected atoms are underlined.

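The SLD-Refutation Procedure
Example: Consider the goal p(x, b) on the program
1. p(x, z) <- q(x, y), p(y, z)
2. p(x, x)
3. q(a, b)
A finite SLD-tree for this program is:
Figure: A finite SLD-tree. From the root p(X, b), clause 1 leads to q(X, Y), p(Y, b) and clause 2 succeeds with X/b. Resolving q(X, Y) with clause 3 gives p(b, b); from there, clause 1 leads to q(b, Z), p(Z, b), which fails, while clause 2 succeeds, giving the answer X/a.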

Examples of SLD-Resolution
Any realization of the top-down evaluation procedure will have to
make two choices at each step by selecting
1. the next goal from the goal list and
2. the rule whose head unifies with the selected goal.
In general, there will be more than one goal and many rules to
choose from. The choice affects the efficiency of the deduction
process and also the actual result when the search falls into an
infinite loop.
PROLOG interpreters usually choose goals in a left-to-right
order and rules in a sequential order that corresponds to
a depth-first search of the SLD-tree with backtracking when
failure occurs. Thus, PROLOG treats the goal list as a stack
onto which goals are pushed or popped, depending on success
or failure.

SLD-Resolution: Example
s(X, Y) <- p(X, Y), q(Y).
p(X, 3).
q(3).
q(4).
1. The initial goal list is
s(5, W)
2. This unifies with the head of the first rule with mgu {X/5, Y/W}.
New goal list:
p(5, W), q(W)
3. Say that we choose q(W) as a goal: it unifies with the fact
q(3), under the substitution {W/3}. The new goal list is
p(5, 3)
This unifies with the fact p(X, 3) under the substitution {X/5}.
The goal list becomes empty and we report success.
Thus, a top-down evaluation returns the answer {W/3} for
the query s(5, W) from the example program.
However, if we choose instead q(4), ...

SLD-Resolution
Given a rule r : A <- B1, ..., Bn, and
a query goal g, where r and g have no variables in common:
if there exists a most general unifier (mgu) δ for A and g, the goal list
B1δ, ..., Bnδ
is called the resolvent of r and g.
SLD-Resolution Algorithm:
Input: A first-order program P and a goal list G.
Output: A Gδ that was proved from P, or failure.
begin
    Set Res = G;
    While Res is not empty, repeat the following:
        Choose a goal g from Res;
        Choose a rule A <- B1, ..., Bn (n ≥ 0) from P
            such that A and g unify under the mgu δ
            (renaming the variables in the rule as needed);
        If no such rule exists, then
            output failure and exit,
        else Delete g from Res;
            Add B1, ..., Bn to Res;
            Apply δ to Res and G;
    If Res is empty then output Gδ
end
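The algorithm above can be sketched in Python as a depth-first search with backtracking. This is only an illustration under several assumptions: variables are strings starting with an uppercase letter, atoms and compound terms are tuples, the leftmost goal is always selected and rules are tried in textual order (the Prolog strategy), and there is no occurs check. The names `solve`, `unify`, `rename`, and `resolve` are hypothetical, and the program is the s/p/q example from these slides.

```python
import itertools

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    # Follow variable bindings until an unbound variable or a non-variable.
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Extend substitution s so that a and b become equal, or return None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def rename(t, n):
    # Give the rule fresh variables so it shares none with the goal list.
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return tuple(rename(x, n) for x in t)
    return t

def resolve(t, s):
    t = walk(t, s)
    return tuple(resolve(x, s) for x in t) if isinstance(t, tuple) else t

counter = itertools.count()

def solve(goals, program, s):
    """Yield one substitution for each successful derivation."""
    if not goals:
        yield s
        return
    goal, rest = goals[0], goals[1:]
    for head, body in program:
        n = next(counter)
        s2 = unify(goal, rename(head, n), s)
        if s2 is not None:
            yield from solve([rename(b, n) for b in body] + rest, program, s2)

program = [
    (("s", "X", "Y"), [("p", "X", "Y"), ("q", "Y")]),
    (("p", "X", 3), []),
    (("q", 3), []),
    (("q", 4), []),
]
answers = [resolve("W", s) for s in solve([("s", 5, "W")], program, {})]
```

Like Prolog, this sketch can loop forever on left-recursive programs; it does not memorize goals.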
Unification
A substitution θ is called a unifier for
two terms A and B if Aθ = Bθ.
Example: The two terms p(f(x), a) and p(y, f(w)) are not
unifiable (i.e., they cannot be made identical), because the second
arguments cannot be unified.
The two terms p(f(x), z) and p(y, a) are unifiable, since θ =
{y/f(a), x/a, z/a} is a unifier.
A unifier δ for two terms is called a most general unifier (mgu)
if for each other unifier θ, there exists a substitution γ such that
θ = δγ.
θ = {y/f(a), x/a, z/a} is not the mgu of p(f(x), z) and
p(y, a).
A most general unifier for these two is δ = {y/f(x), z/a}. Note
that θ = δ{x/a}.
There exist efficient algorithms to perform unification: such
algorithms either return a most general unifier or report that none
exists.
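One such algorithm can be sketched in Python as follows. The encoding is an assumption of this example: variables are plain strings, while constants and compound terms are tuples whose first element is the functor, so the constant a is written `("a",)`. The function names are hypothetical. The sketch includes an occurs check and returns either an mgu as a dictionary or `None`.

```python
def walk(term, subst):
    # Follow bindings of a variable until it is unbound or a non-variable.
    while isinstance(term, str) and term in subst:
        term = subst[term]
    return term

def occurs(var, term, subst):
    term = walk(term, subst)
    if term == var:
        return True
    return isinstance(term, tuple) and any(occurs(var, a, subst) for a in term[1:])

def unify(s, t, subst=None):
    """Return a substitution unifying s and t, or None if none exists."""
    subst = {} if subst is None else subst
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if isinstance(s, str):
        return None if occurs(s, t, subst) else {**subst, s: t}
    if isinstance(t, str):
        return None if occurs(t, s, subst) else {**subst, t: s}
    if s[0] != t[0] or len(s) != len(t):
        return None                      # different functor or arity
    for a, b in zip(s[1:], t[1:]):
        subst = unify(a, b, subst)
        if subst is None:
            return None
    return subst

def resolve(term, subst):
    term = walk(term, subst)
    if isinstance(term, str):
        return term
    return (term[0], *(resolve(a, subst) for a in term[1:]))

def mgu(s, t):
    subst = unify(s, t)
    return None if subst is None else {v: resolve(v, subst) for v in subst}

a = ("a",)
not_unifiable = mgu(("p", ("f", "x"), a), ("p", "y", ("f", "w")))
most_general = mgu(("p", ("f", "x"), "z"), ("p", "y", a))
```

On the two examples from this slide, the first call reports that no unifier exists and the second returns {y/f(x), z/a}.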

Composition of Substitutions
Let θ = {u1/s1, ..., um/sm} and δ = {v1/t1, ..., vn/tn} be
substitutions.
Then the composition θδ of θ and δ is the substitution obtained
from the set
{u1/s1δ, ..., um/smδ, v1/t1, ..., vn/tn}
by deleting any binding ui/siδ for which ui = siδ and deleting
any binding vj/tj for which vj ∈ {u1, ..., um}.
Example
Let θ = {x/f(y), y/z} and δ = {x/a, y/b, z/y}.
Then θδ = {x/f(b), z/y}.

θ         δ      θδ
x/f(y)    x/a    x/f(b)
y/z       y/b    y/y (deleted)
z/z       z/y    z/y
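The definition can be checked with a short Python sketch. The term encoding (variables as strings, constants and compound terms as tuples led by their functor) and the names `apply` and `compose` are assumptions of this example.

```python
def apply(term, subst):
    """Apply a substitution (dict: variable name -> term) to a term."""
    if isinstance(term, str):                 # a variable
        return subst.get(term, term)
    functor, *args = term                     # a constant or compound term
    return (functor, *(apply(arg, subst) for arg in args))

def compose(theta, delta):
    """Return the composition of theta followed by delta."""
    result = {}
    for var, term in theta.items():
        new_term = apply(term, delta)
        if new_term != var:                   # delete bindings of the form u/u
            result[var] = new_term
    for var, term in delta.items():
        if var not in theta:                  # delete v/t when theta binds v
            result[var] = term
    return result

a, b = ("a",), ("b",)
theta = {"x": ("f", "y"), "y": "z"}
delta = {"x": a, "y": b, "z": "y"}
composed = compose(theta, delta)

# Applying the composition should equal applying theta, then delta.
E = ("p", "x", "y", "z")
two_steps = apply(apply(E, theta), delta)
one_step = apply(E, composed)
```

For the example on this slide, `composed` is {x/f(b), z/y}, and both ways of applying it to p(x, y, z) agree.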

Substitutions
Substitutions: A substitution θ is a finite set of the form {v1/t1, ..., vn/tn},
where each vi is a distinct variable, and each ti is a term distinct
from vi. Each ti is called a binding for vi.
The substitution θ is called a ground substitution if every ti is a
ground term. (Then X/t is an instantiation of X to t.)
Eθ denotes the result of applying the substitution θ to E, i.e.,
of replacing each variable with its respective binding. For instance,
if E = p(x, y, f(a)) and θ = {x/b, y/x}, then Eθ =
p(b, x, f(a)). If γ = {x/c}, then Eγ = p(c, y, f(a)).
Thus variables that are not part of the substitution are left unchanged.

Passing Bindings from Goals to Heads
r1 : part_weight(No, Kilos) <- part(No, _, actualkg(Kilos)).
r2 : part_weight(No, Kilos) <- part(No, Shape, unitkg(K)),
        area(Shape, Area), Kilos = K * Area.
r3 : area(circle(Dmtr), A) <- A = Dmtr * Dmtr * 3.14/4.
r4 : area(rectangle(Base, Height), A) <- A = Base * Height.
The goal area(Shape, Area) in rule r2 can be viewed as a call to
the procedure area defined by rules r3 and r4.
Thus, when Shape is instantiated to circle(11), the call is executed
by r3; when it is instantiated to rectangle(10, 20), it is executed by r4.
Instantiated to c means "assigned the value of the constant c".
Shape/rectangle(10, 20) denotes that Shape has been instantiated
to rectangle(10, 20).
Arguments can be complex; thus the passing of parameters is
performed through a process known as unification.
For instance, if Shape = rectangle(10, 20),
this is made equal to (unified with)
the first argument of the second area rule, rectangle(Base,
Height), by setting Base = 10 and Height = 20.

Top-Down Execution of Datalog
A strict bottom-up execution strategy is frequently neither
natural nor efficient.
Pure top-down, SLD-resolution, Prolog
Mixing top-down and bottom-up in deductive databases

Sec: 9.5 Rule Rewriting Methods
Right-Linear Rules
Consider now the right-linear formulation of ancestor:
Right-linear rules for the descendants of Tom
anc(Old, Young) <- parent(Old, Young).
anc(Old, Young) <- parent(Old, Mid), anc(Mid, Young).
With these right-linear rules the query ?anc($Name, X) can no
longer be implemented by specializing the program.
Solution: turn the rules into equivalent left-linear ones!
The situation is symmetric. A query such as anc(X, $Y) cannot
be supported on the left-linear version of the program. But the
program can be transformed into the right-linear rules above,
to which specialization can apply.
Deductive database compilers do that.

Left-Linear and Right-Linear Recursion
?anc(tom, Desc).
anc(Old, Young) <- anc(Old, Mid), parent(Mid, Young).
?anc(tom, Desc)
anc(Old/tom, Young) <- parent(Old/tom, Young).
anc(Old/tom, Young) <- anc(Old/tom, Mid), parent(Mid, Young).
These are left-linear recursive rules.
Query forms are used for compilation: ?anc($Someone, Desc).

Specialization of the Original Rules
This is a partial instantiation of the program, which we will call
specialization.
The result is the same as pushing selections (equality to constants)
into the equivalent relational algebra expression.
For non-recursive rules this is simple.
For recursive rules it is complicated.

Unification at Compile Time
Blood Relations
anc(Old, Young) <- anc(Old, Mid), parent(Mid, Young).
grandma(Old, Young) <- parent(Mid, Young), mother(Old, Mid).
parent(F, Cf) <- father(F, Cf).
parent(M, Cm) <- mother(M, Cm).
Find the grandma of marc
?grandma(GM, marc)
grandma(Old, Young/marc) <- parent(Mid, Young/marc), mother(Old, Mid).
parent(F, Cf/marc) <- father(F, Cf/marc).
parent(M, Cm/marc) <- mother(M, Cm/marc).
This is a form of partial evaluation.

Sec: 9.5.4 Supplementary Magic
Supplementary Magic Sets
Only those variables that are needed for the second fixpoint
are stored in the supplementary magic relations: thus St is
not included.
The method of choice in many prototypes because of generality
and robustness.
The method works with cycles in the database.
Ability to store one-way predicates, such as X = f(Y, Z, ...).

Benefits of Memorizing
People who are of the same generation through common ancestors
who are less than 12 levels remote and always lived
in the same state
?stsg(marc, 12, Z).
stsg(X, K, Y) <- parent(XP, X), K > 0, KP = K - 1,
        born(X, St), born(XP, St),
        stsg(XP, KP, YP),
        parent(YP, Y).
stsg(X, K, X).
Since the first two arguments of stsg are bound, the supplementary
magic method for this example is:
m.stsg(marc, 12).
spm.stsg(X, K, XP, KP) <- m.stsg(X, K),
        parent(XP, X), K > 0, KP = K - 1,
        born(X, St), born(XP, St).
m.stsg(XP, KP) <- spm.stsg(X, K, XP, KP).
stsg(X, K, X) <- m.stsg(X, K).
stsg(X, K, Y) <- stsg(XP, KP, YP), spm.stsg(X, K, XP, KP),
        parent(YP, Y).

Supplementary Magic Sets
m.sg(marc).
m.sg(XP) <- m.sg(X), parent(XP, X).
spm.sg(X, XP) <- parent(XP, X), m.sg(X).
sg'(X, X) <- m.sg(X).
sg'(X, Y) <- sg'(XP, YP), spm.sg(X, XP), parent(YP, Y).
% sg'(X, Y) <- parent(XP, X), sg'(XP, YP), parent(YP, Y), m.sg(X).
?sg'(marc, Z).
In addition to the magic predicates, supplementary predicates are
used to store the pairs bound-arguments-in-head/bound-arguments-in-recursive-goal.
The magic set method and supplementary magic set method are
very similar; often the first term is used to refer to both methods.
However, there are important differences:
The magic predicate and the supplementary magic predicate are
normally written in a mutually recursive form.
m.sg(marc).
spm.sg(X, XP) <- m.sg(X), parent(XP, X).
m.sg(XP) <- spm.sg(X, XP).

Sec: 9.5.3 Same-Generation Example
The counting method: pros and cons
The counting method is often more efficient than the magic-set
method.
However, it is not as general: e.g., add the goal X ≠ Y to
ensure that you leave out marc from the people who are
of the same generation as marc. One needs to memorize.
Cycles in the database will throw it into a loop, just as in Prolog.
Many approaches to get methods that combine the strengths
of magic and counting have been developed.

The Counting Method
"People who are of the same generation as marc" is logically
equivalent to:
1. Find the ancestors of marc and their levels, where marc is
a zero-level ancestor of himself, his parents are first-generation
(i.e., first-level) ancestors, his grandparents are second-generation
ancestors, and so on.
This computation is performed by the predicate sg_up in the rules below.
2. Switch to the computation of descendants.
3. Perform the computation of descendants while decreasing
the level by one at each step.
This is performed by the predicate sg_dwn in the rules below.
4. Check when you return to level 0 to find those who are of the
same generation as marc.
Find ancestors of marc, and then their descendants
sg_up(0, marc).
sg_up(J + 1, XP) <- parent(XP, X), sg_up(J, X).
sg_dwn(J, X) <- sg_up(J, X).
sg_dwn(J - 1, Y) <- sg_dwn(J, YP), parent(YP, Y).
?sg_dwn(0, Z).
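The two phases can be sketched in Python as follows. The function name, the `parent` relation (a set of (parent, child) pairs), and the sample family are assumptions of this example. The guard `j > 0` in the downward phase is also an addition: the Datalog rule has no such test, but tuples below level 0 can never contribute to the answer of ?sg_dwn(0, Z).

```python
def same_generation_counting(parent, person):
    """Return the people of the same generation as person (acyclic data only)."""
    # Phase 1, sg_up(J, X): X is a J-th level ancestor of person.
    sg_up = {(0, person)}
    frontier = set(sg_up)
    while frontier:
        frontier = {(j + 1, xp) for (j, x) in frontier
                    for (xp, x2) in parent if x2 == x} - sg_up
        sg_up |= frontier
    # Phase 2, sg_dwn(J, X) <- sg_up(J, X), then go down one level per step.
    sg_dwn = set(sg_up)
    frontier = set(sg_dwn)
    while frontier:
        frontier = {(j - 1, y) for (j, yp) in frontier if j > 0
                    for (yp2, y) in parent if yp2 == yp} - sg_dwn
        sg_dwn |= frontier
    # ?sg_dwn(0, Z)
    return {y for (j, y) in sg_dwn if j == 0}

# Hypothetical family: joe is the parent of tom and bob, and so on.
parent = {("joe", "tom"), ("joe", "bob"),
          ("tom", "marc"), ("tom", "ann"), ("bob", "liz")}
result = same_generation_counting(parent, "marc")
```

On this data the result contains marc himself, his sibling ann, and his cousin liz. With a cycle in `parent`, the first loop would never terminate, because the level keeps increasing.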

Sec: 9.5.2 The Magic Sets Method
Computing the Magic Predicate
?sg(marc, Who).
sg(X, Y) <- parent(XP, X), sg(XP, YP), parent(YP, Y).
sg(A, A).
Binding analysis of the top-down behavior.
The first argument in the query is bound: thus X is bound and through
goal parent(XP, X) the binding is passed to XP in the recursive
goal.
The variables Y, YP remain unbound.
The rules for the magic predicates can be obtained by:
(1) using the query constant as the exit rule (a fact).
(2) using the top-down bound arguments and predicates for the
recursive rule; however, head and tail must be reversed!

Only ancestors of marc are of interest
The sg rules are linear, but not left-linear or right-linear.
Say that the predicate m.sg(X) computes the ancestors of marc.
We can add m.sg(X) to the exit rule to make it safe, and to the
recursive rule to make it more selective:
?sg'(marc, Z).
sg'(X, X) <- m.sg(X).
sg'(X, Y) <- parent(XP, X), sg'(XP, YP), parent(YP, Y), m.sg(X).
Now, m.sg(X) is called the magic predicate for sg and can be
computed from the original program as follows:
m.sg(marc).
m.sg(XP) <- m.sg(X), parent(XP, X).
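As an illustration, the rewritten program can be evaluated bottom-up with a few lines of Python. The function name, the representation of `parent` as (parent, child) pairs, and the sample family are assumptions; the second loop is a plain naive fixpoint rather than a differential one, to keep the sketch short.

```python
def magic_sg(parent, person):
    """Bottom-up evaluation of the magic-set rewriting of sg for one person."""
    # m.sg(person).   m.sg(XP) <- m.sg(X), parent(XP, X).
    magic = {person}
    frontier = {person}
    while frontier:
        frontier = {xp for (xp, x) in parent if x in frontier} - magic
        magic |= frontier
    # sg'(X, X) <- m.sg(X).
    sg = {(x, x) for x in magic}
    # sg'(X, Y) <- parent(XP, X), sg'(XP, YP), parent(YP, Y), m.sg(X).
    while True:
        new = {(x, y)
               for (xp, x) in parent if x in magic
               for (xp2, yp) in sg if xp2 == xp
               for (yp2, y) in parent if yp2 == yp} - sg
        if not new:
            break
        sg |= new
    # ?sg'(person, Z)
    return {y for (x, y) in sg if x == person}, magic

# Hypothetical family: joe is the parent of tom and bob, and so on.
parent = {("joe", "tom"), ("joe", "bob"),
          ("tom", "marc"), ("tom", "ann"), ("bob", "liz")}
answers, magic_set = magic_sg(parent, "marc")
```

Only pairs whose first component is an ancestor of marc (or marc himself) are ever produced, which is the selectivity the magic predicate is meant to provide.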

The Top-Down Computation for the same-generation
?sg(marc, Who).
sg(X, Y) <- parent(XP, X), sg(XP, YP), parent(YP, Y).
sg(A, A).
If parent(tom, marc) is in the database, the resolvent of the
query goal with the first rule is
parent(XP, marc), sg(XP, YP), parent(YP, Y).
Then, by unifying the first goal with the fact parent(tom, marc),
the new goal list becomes:
sg(tom, YP), parent(YP, Y).
The recursive call unfolds as in the previous case, yielding the
parents of tom, who are the grandparents of marc.
Thus the top-down computation generates all the ancestors
of marc using the recursive rule.
The binding has been passed from the first argument in the
head to the first argument of the recursive predicate in the
body. The computation causes the instantiation of variables
X and XP, while Y and YP remain unbound.

The Same-Generation Example
People are of the same generation if their parents are of the
same generation
?sg(marc, Who).
sg(X, Y) <- parent(XP, X), sg(XP, YP), parent(YP, Y).
sg(A, A).
This program cannot be computed in a bottom-up fashion because
the exit rule is not safe.
Even if we make it safe by adding a goal such as people(A), it
is inefficient to compute in a bottom-up fashion since all
same-generation pairs are produced, while we only want those that have
marc as their first component.

Sec: 9.6 Compilation and Optimization
Three-Way Join
p1(X, Y), p2(Y, Z), p3(Y, W)
The basic nested-loop join:
Loop 1: for each tuple in p1 do
    Loop 2: for each tuple in p2 (joining with p1) do
        Loop 3: for each tuple in p3
                (joining with p1 and p2) do
            return the computed tuple
        end Loop 3
    end Loop 2
end Loop 1.
But if p3(Y, W) fails on get-first (i.e., step (i) fails), there is no
point in going back to Loop 2, since only a new value of Y
can make it succeed!
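The shortcut can be sketched in Python as follows. The relations are plain lists of pairs, and the function name and the `p2_visited` counter are assumptions added to make the saving visible; a real system would use iterators over stored relations.

```python
def three_way_join(p1, p2, p3):
    """Nested-loop join of p1(X, Y), p2(Y, Z), p3(Y, W).

    When p3 has no tuple for the current Y, control returns to Loop 1
    instead of trying the remaining p2 tuples with the same Y.
    """
    result = []
    p2_visited = 0                       # joining p2 tuples actually examined
    for (x, y) in p1:                                    # Loop 1
        for (y2, z) in p2:                               # Loop 2
            if y2 != y:
                continue
            p2_visited += 1
            matches = [w for (y3, w) in p3 if y3 == y]   # Loop 3
            if not matches:
                break        # get-first failed: only a new Y can succeed
            result.extend((x, y, z, w) for w in matches)
    return result, p2_visited

# Hypothetical relations.
p1 = [(1, "a"), (2, "b")]
p2 = [("a", 10), ("a", 11), ("b", 20)]
p3 = [("b", 100)]
joined, visited = three_way_join(p1, p2, p3)
```

Here the second p2 tuple for Y = "a" is never examined, because p3 already failed for that value of Y.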
Existential Variables: Optimization
Existential Variables.
p(X) <- q1(X, Y), q2(Y, Z), ¬q3(W)
If q2(Y, Z) succeeds or fails for a certain value of Y, there is no need
to find all the other values of Z.
The same holds for W. Z and W are existential variables.
A tuple-oriented model of computation:
(i) Get-first tuple in relation (joining with the previous
tuples, if any)
(ii) Get-next of same, and repeat this step until there are no more
such tuples
For q1(X, Y) both steps must be performed.
For q3(W) and q2($Y, Z) only step (i) is executed.

Optimization, cont.
Ideally, the costs and benefits of different recursive methods should be
quantified and compared. But quantification is often expensive
and prediction is unreliable.
In practice, therefore, only very coarse criteria are given, e.g., use
certain goals as chain goals in the SIP.
Even for nonrecursive rules, full cost-based optimization is
problematic (many goals deeply stacked). Heuristic approaches are
used instead. E.g., Glue/Nail! uses the following heuristic: do
first the goals with more bound arguments; and between two
goals with the same number of bound arguments, select the one
which has fewer unbound arguments.
LDL++ and Prolog follow the order of goals specified by the user.

Optimization
In relational databases there are two kinds of optimizations
1. Greedy optimization: whenever a technique is applicable
apply it. E.g., always push selection and projection into
relational expressions. Computationally this is not very expensive.
2. Cost-Based Optimization: evaluate alternatives and predict
the cost. Then choose the least-expected-cost solution. This
is done for choosing a join order. Basically exponential in the
number of joins being evaluated.
Deductive database prototypes follow mostly the first approach,
e.g., in choosing the method for recursion.
1. The binding passing property is tested, and if satisfied
2. The applicability of the following methods is considered in
the order shown:
    1. left-linear or right-linear rules
    [1.5 Counting Method]
    2. magic or supplementary magic sets
    [2.5 Generalized magic sets methods]
Most systems will not bother with [1.5] or [2.5].

Generalizations
Unique binding property. Relaxing this assumption does
not require major modifications or extensions.
No Sideways Information Passing (SIP) between recursive
goals: only goals from lower strata can be used as chain goals.
This assumption can be removed, yielding the Generalized
Magic Set method.
The programs produced by this extension tend to be complex
and inefficient to execute.
In the CORAL system, not all the variables are required to
be instantiated after a goal executes.

Trivial Second Phase, cont.
After the recursive rule is dropped, along with the
condition on the first argument of the query goal, we obtain:
m.anc(tom).
m.anc(Mid) <- m.anc(Old), parent(Old, Mid).
anc'(Old, Young) <- m.anc(Old), father(Old, Young).
?anc'(_, Young).

Trivial Second Phase
The descendants of tom with right-linear rules:
?anc(tom, Desc).
anc(Old, Young) <- father(Old, Young).
anc(Old, Young) <- parent(Old, Mid), anc(Mid, Young).
Magic-set rewriting:
m.anc(tom).
m.anc(Mid) <- m.anc(Old), parent(Old, Mid).
anc'(Old, Young) <- m.anc(Old), father(Old, Young).
anc'(Old, Young) <- parent(Old, Mid), anc'(Mid, Young),
        m.anc(Old).
?anc'(tom, Young).
Observe that the recursive rule just copies the value of Young
generated by the exit rule, from the tail to the head. This value
of Young is returned as an answer if, after a few iterations, Old =
tom. But that is always true since this rule basically re-visits the
magic-set computation.
Thus the recursive rule can be dropped along with the condition
in the first argument of the query goal, yielding:

Selecting a Method for Recursion
The rewriting for the magic sets method can be
used as the basis for other methods. The magic sets method can
also be used as the basis for detecting and handling the special
cases of left-linear and right-linear rules.
For instance,
?anc(tom, Desc).
anc(Old, Young) <- anc(Old, Mid), parent(Mid, Young).
If we write the magic rules for this example we obtain:
m.anc(tom).
m.anc(Old) <- m.anc(Old).
The recursive magic rule above is trivial and can be eliminated
(trivial first phase).
The magic relation m.anc now contains only the value tom; rather
than appending the magic predicate goal to the original rules, we
can substitute this value directly into the rules.

Binding Passing Property
Algorithm for Binding Passing Analysis
1. Initially A = {q^γ}, with q^γ the initial goal, where q is a
recursive predicate and γ is not a totally free adornment.
2. For each h ∈ A, pass the binding to the heads of rules defining q.
3. For each recursive rule, determine the adornments of its
recursive goals (i.e., of q or predicates mutually recursive with q).
4. If the last step generated adornments not currently in A, add
them to A and resume from step 2. Otherwise halt.
The calling goal g is said to have the
1. binding passing property when A does not contain any
recursive predicate with totally free adornment.
2. unique binding passing property when the binding passing
property holds and A contains only one adornment pattern for each
recursive predicate.

Safety, cont.
The basic idea behind the notion of chain goals is that the binding
in the head will have to reduce the search space. Any goal
that is called with a totally free adornment will not be beneficial in
that respect. Also, there is no sideways information passing (SIP)
between two recursive goals; bindings come only from the head
through nonrecursive goals.
If q is not a recursive predicate, then safety is determined as
previously described.
If q is a recursive goal, then it belongs to a lower stratum; therefore,
safety can be determined independently using the techniques
described here for recursive predicates.
Since we have a finite number of strata, the process soon terminates.

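Binding passing analysis for recursive predicates
The Same-Generation query
Figure: binding passing for the query ?stsg^bbf. The adorned head stsg^bbf passes its bindings, through the chain goals parent^fb, >^bb, =^fb, born^bf, born^bb, to the recursive goal stsg^bbf.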
Only chain goals are used in the top-down propagation. An
adorned goal q^γ in a recursive rule r is called a chain goal when:
1. SIP independence of recursive goals: q is not a recursive
goal (i.e., not the same predicate as that in the head of r,
nor a predicate mutually recursive with it; however, recursive
predicates of lower strata can be used as chain goals).
2. Selectivity: q^γ has some argument bound (according to the
bound variables in the head of r and the chain goals to the
left of q^γ).
3. Safety: q^γ is a safe goal.

Recursive Predicates
The treatment of recursive predicates is somewhat more complex
because a choice of recursive methods must be performed along
with the binding passing analysis.
The simplest case occurs when the goal calling a recursive
predicate has no bound argument. The recursive predicate, say p,
and all the predicates that are mutually recursive with it, will be
computed in a single differential fixpoint.
The construction of the rule-goal graph for a recursive rule is the same
as for a non-recursive one, where:
1. The head of the rule is assumed to have no bound argument,
and
2. Safety analysis is performed by treating the recursive goals
(i.e., p and predicates mutually recursive with it) as safe a
priori; in fact, they are bound to the values computed in the
previous step.

Computation of Safe Nonrecursive Programs
Let us have a safe rgg(P) tree.
Every non-leaf node (a goal) with bound adornments is
basically computed in two phases.
1. In the first phase, the bound values of a goal's arguments are
passed to its defining rules, i.e., its children in the rule-goal
graph.
2. In the second phase, the goal receives the values of the
f-adorned arguments from its children.
Only the second computation takes place for goals without bound
arguments.
The computation of the heads of the rules follows the computation
of all the goals in the body. Thus, we have a strict stratification
where predicates are computed according to the postorder
traversal of the rule-goal graph.

Safety
Safe a priori:
1. For instance, base predicates are safe for every adornment.
Thus, part^fff is safe.
2. The pattern θ^bb is safe for θ denoting any comparison
operator, such as < or >.
3. Moreover, there is the special case of =^bf or =^fb where the
free argument consists of only one variable; in either case the
arithmetic expression in the bound argument can be computed
and the resulting value can be assigned to the free
variable.
(These are the basic patterns: a more sophisticated compiler
could solve more equations and accept other patterns as
safe.)
Inductive Definition for Safety: Let P be a program with
rule-goal graph rgg(P), where rgg(P) is a tree (DAGs can be reduced
to trees).
Then P is safe if the following two conditions hold:
1. Every leaf node of rgg(P) is safe a priori, and
2. Every variable in every rule in rgg(P) is bound after the last
goal.
Unification of Goal and Head: example
Goal ?g, with: g = p(f(X1), Y1, Z1, a)
Rule: r : p(X2, g(X2, Y2), Y2, W2) <- ...
Thus: h(r) = p(X2, g(X2, Y2), Y2, W2). (If g and h(r) had variables
in common, then a renaming step would be required here.)
A most general unifier for g and h(r) is:
γ = {X2/f(X1), Y1/g(f(X1), Y2), Z1/Y2, W2/a}
Yielding:
gγ = h(r)γ = p(f(X1), g(f(X1), Y2), Y2, a)
If the adorned goal is p^bffb: variables in the first argument of the
head (i.e., X1) are bound. The resulting adorned head is p^bffb,
and there is an edge from p^bffb to p^bffb.
If the adorned goal is p^fbfb: all the variables in the second
argument of the head (i.e., X1, Y2) are bound. Then the remaining
arguments of the head are bound as well. In this case, there is an
edge from the adorned goal p^fbfb to the adorned head p^bbbb.

Rule-Goal Graph for a Nonrecursive P
The graph depicts all possible top-down, left-to-right executions.
Construction of the rule-goal graph rgg(P) for a
nonrecursive program P.
1. Initial step: The query goal is adorned according to the
constants and deferred constants (i.e., the variables
preceded by $), and becomes the root of rgg(P).
2. Bindings passing from goals to rule heads: If the calling goal
g unifies with the head of the rule r, with mgu γ, then we
draw an edge (labeled with the name of the rule, i.e., r)
from the adorned calling goal to the adorned head, where
the adornments for h(r) are computed as follows: (i) all
arguments bound in g are marked bound in h(r)γ; (ii) all
variables in such arguments are also marked bound; and
(iii) the arguments in h(r)γ that contain only constants
or variables marked bound in (ii) are adorned b, while
the others are adorned f.
3. Left-to-right passing of bindings to goals:
A variable X is bound after the nth goal in a rule, if X
is among the bound head variables (as for the last step),
or if X appears in one of the goals of the rule preceding
the nth goal.
The (n + 1)th goal of the rule is adorned on the basis of
the variables that are bound after the nth goal.
c 1997
CZ
Rule-Goal Graph
The graph has as nodes rules with adorned predicate names.
The adornment of a predicate is the superscript that denotes its
bound/free arguments.
E.g., the rule-goal graph for the Flat Parts Example and the
query ?part_weight(Part, Weight):

part_weight^{ff}                         (adorned query goal)
  r1: part_weight^{ff}
        goals: part^{fff}
  r2: part_weight^{ff}
        goals: part^{fff}, area^{bf}, =^{fb}
          r3: area^{bf}
                goals: =^{fb}
          r4: area^{bf}
                goals: =^{fb}

The Flat Parts Example:

r1 : part_weight(No, Kilos) ← part(No, _, actualkg(Kilos)).
r2 : part_weight(No, Kilos) ← part(No, Shape, unitkg(K)),
                              area(Shape, Area),
                              Kilos = K * Area.
r3 : area(circle(Dmtr), A) ← A = Dmtr * Dmtr * 3.14/4.
r4 : area(rectangle(Base, Height), A) ← A = Base * Height.

Sec: 9.7 Recursive Queries SQL
Recursion in SQL and Datalog

Take the similar query:
SELECT *
FROM all_subparts
WHERE Minor = 'top tube'
expressed against the virtual view
CREATE RECURSIVE VIEW all_subparts(Major, Minor) AS
SELECT PART, SUBPART
FROM assembly
UNION
SELECT all.Major, assb.SUBPART
FROM all_subparts all, assembly assb
WHERE all.Minor = assb.PART
Here, the addition of the condition Minor = 'top tube' to the
recursive select would not produce an equivalent query.
Instead, the SQL compiler must transform the original recursive
select into its right-linear equivalent before the condition
Minor = 'top tube' can be attached to the WHERE clause.
In general, the compilation techniques usable for such
transformations are basically those previously described for Datalog.
Also, stratification w.r.t. negation and aggregates is required in
the proposed SQL3 standards.

Left-Linear and Right-Linear Recursion

Find the parts using top tube
WITH RECURSIVE all_super(Major, Minor) AS
( SELECT PART, SUBPART
FROM assembly
UNION
SELECT assb.PART, all.Minor
FROM assembly assb, all_super all
WHERE assb.SUBPART = all.Major )
SELECT *
FROM all_super
WHERE Minor = 'top tube'
This can be supported by simply adding the condition
Minor = 'top tube' to the WHERE clauses in the exit select and the
recursive select, yielding:
WITH RECURSIVE all_super(Major, Minor) AS
( SELECT PART, SUBPART
FROM assembly
WHERE SUBPART = 'top tube'
UNION
SELECT assb.PART, all.Minor
FROM assembly assb, all_super all
WHERE assb.SUBPART = all.Major
AND all.Minor = 'top tube' )
SELECT *
FROM all_super

Implementation of Recursive SQL Queries

All Parts/Subparts: transitive closure in SQL3:
CREATE RECURSIVE VIEW all_subparts(Major, Minor) AS
SELECT PART, SUBPART
FROM assembly
UNION
SELECT all.Major, assb.SUBPART
FROM all_subparts all, assembly assb
WHERE all.Minor = assb.PART
To implement the differential fixpoint improvement one only needs
to replace the recursive relation all_subparts in the FROM clause
by δall_subparts, where δall_subparts contains the new tuples
generated in the previous iteration of the differential fixpoint
algorithm.
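
The differential (semi-naive) fixpoint can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the way an SQL engine implements it: the function name all_subparts, the set-of-pairs representation, and the four sample assembly rows are chosen here for the example only.

```python
def all_subparts(assembly):
    """Transitive closure of (part, subpart) pairs, semi-naive style."""
    total = set(assembly)   # result of the exit select
    delta = set(assembly)   # tuples that were new in the previous iteration
    while delta:
        # Recursive select, with delta used in place of all_subparts.
        derived = {(major, sub)
                   for (major, minor) in delta
                   for (part, sub) in assembly
                   if minor == part}
        delta = derived - total   # keep only tuples not seen before
        total |= delta
    return total


# Hypothetical sample rows, loosely following the BoM tables.
assembly = {("bike", "frame"), ("bike", "wheel"),
            ("frame", "top tube"), ("wheel", "rim")}
closure = all_subparts(assembly)
```

Each iteration joins only the newly derived tuples with assembly, so no pair is rederived from tuples that were already processed in an earlier round.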

The WITH Construct

Since all_subparts is a virtual view, an actual query on this view
is needed to materialize the recursive relation or portions thereof.
Materialization of the recursive view from the previous example
SELECT *
FROM all_subparts
The WITH construct provides another, more direct way to express
recursion in SQL3.
Find the parts using top tube
WITH RECURSIVE all_super(Major, Minor) AS
( SELECT PART, SUBPART
FROM assembly
UNION
SELECT assb.PART, all.Minor
FROM assembly assb, all_super all
WHERE assb.SUBPART = all.Major
)
SELECT *
FROM all_super
WHERE Minor = 'top tube'

New SQL3 Standards

Relational tables for a BoM application

part_cost
BASIC_PART   SUPPLIER     COST    TIME
top tube     cinelli      20.00   14
top tube     columbus     15.00    6
down tube    columbus     10.00    6
head tube    cinelli      20.00   14
head tube    columbus     15.00    6
seat mast    cinelli      20.00    6
seat mast    cinelli      15.00   14
seat stay    cinelli      15.00   14
seat stay    columbus     10.00    6
chain stay   columbus     10.00    6
fork         cinelli      40.00   14
fork         columbus     30.00    6
spoke        campagnolo    0.60   15
nipple       mavic         0.10    3
hub          campagnolo   31.00    5
hub          suntour      18.00   14
rim          mavic        50.00    3
rim          araya        70.00    1

assembly
PART    SUBPART      QTY
bike    frame          1
bike    wheel          2
frame   top tube       1
frame   down tube      1
frame   head tube      1
frame   seat mast      1
frame   seat stay      2
frame   chain stay     2
frame   fork           1
wheel   spoke         36
wheel   nipple        36
wheel   rim            1
wheel   hub            1
wheel   tire           1

All Parts/Subparts: transitive closure in SQL3:
CREATE RECURSIVE VIEW all_subparts(Major, Minor) AS
SELECT PART, SUBPART
FROM assembly
UNION
SELECT all.Major, assb.SUBPART
FROM all_subparts all, assembly assb
WHERE all.Minor = assb.PART
This is often called a recursive union. We will say that we have
the union of an Exit Select and a Recursive Select.

Sec: 10.1 DB Updates & NonMonotonic Reasoning
Stable Model Characterization

Assume now that N is kept constant to a certain
$\overline{M} = B_P - M$ throughout the computation. Then,
Theorem: Let P be a logic program with Herbrand base $B_P$
and $\overline{M} = B_P - M$. Then, M is a stable model for P iff
$\Gamma_{P(\overline{M})}^{\uparrow\omega}(\emptyset) = M$
This theorem can be used to check whether an interpretation I is
a stable model without having first to construct $ground_P(I)$; the
two computations are in fact identical.
Furthermore, the computation of the ω-power of the positive
consequence operator has polynomial data complexity.
Thus, checking whether a given model is stable can be done
in polynomial time.
However, deciding whether a given program has a stable
model is, in general, NP-complete; thus, finding any such
model is NP-hard.

ICOs with Negated Goals

A modified version of the immediate consequence operator (ICO):
With r being a rule of P, let h(r) denote the head of r, gp(r)
denote the set of positive goals of r, and gn(r) denote the set of
negated goals of r without their negation sign.
For instance, if r : a ← b, ¬c, ¬d., then h(r) = a, gp(r) = {b},
and gn(r) = {c, d}.
Definition: Let P be a program and $I \subseteq B_P$. Then
the explicit negation ICO for P under a set of negative
assumptions $N \subseteq B_P$ is defined as follows:
$\Gamma_{P(N)}(I) = \{h(r) \mid r \in ground(P),\ gp(r) \subseteq I,\ gn(r) \subseteq N\}$
The implicit negation ICO of P, $T_P$, is defined as follows:
$T_P(I) = \Gamma_{P(\overline{I})}(I)$, where $\overline{I} = B_P - I$
Γ can also be viewed as a two-place function (on I and N). For
instance, to compute $T_P^{\uparrow\omega}(\emptyset)$ we will set
$N = \overline{I}$ at each step.
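
The explicit negation ICO and the stability test built on it can be sketched in Python. This is a hedged illustration for finite ground programs only; the rule encoding as (head, positive goals, negated goals) triples and the names gamma, gamma_omega and is_stable are assumptions made for this example.

```python
def gamma(rules, I, N):
    """One application of the explicit negation ICO: heads of ground rules
    whose positive goals are all in I and whose negated goals are all in N."""
    return {h for (h, pos, neg) in rules if pos <= I and neg <= N}


def gamma_omega(rules, N):
    """Iterate gamma from the empty set, with N held constant, to a fixpoint."""
    I = set()
    while True:
        nxt = gamma(rules, I, N)
        if nxt == I:
            return I
        I = nxt


def is_stable(rules, base, M):
    """M is stable iff iterating gamma under N = base - M yields exactly M."""
    return gamma_omega(rules, base - M) == M


# Ground program:  p <- not q.   q <- not p.
rules = [("p", frozenset(), frozenset({"q"})),
         ("q", frozenset(), frozenset({"p"}))]
base = {"p", "q"}
stable = [M for M in (set(), {"p"}, {"q"}, {"p", "q"})
          if is_stable(rules, base, M)]
```

With N held constant the operator is monotonic in I, so the loop climbs an ascending chain and stops on any finite ground program.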

Multiple Models
A program can have several stable models.
p ← ¬q.
q ← ¬p.
This has two stable models: M1 = {p} and M2 = {q}.
With multiple models, one needs to decide what the intended
semantics is: find all models, or find one?
We take the second interpretation, which leads to the concept of
NonDeterminism.
Stratified programs, however, always have a unique stable model.
Stratification is easy to check from the structure of the program,
independent of the database.
Thus stratified programs are well-suited for implementation.

Stable Models (cont.)
Every stable model for P is a minimal model for P and a minimal
fixpoint for $T_P$; however, minimal models or minimal fixpoints
need not be stable models:
M = {a} is the only model and fixpoint for this program
r1 : a ← ¬a.
r2 : a ← a.
A program can have zero stable models, one stable model or
several stable models.
The previous program has no stable model, and the barber example
has no stable model.
However, the barber program has a unique stable model after we
eliminate the fact villager(barber).
Thus, the existence of a stable model for a program might depend
on the database. Given a negative Datalog program P, deciding
whether this has a stable model is NP-complete.

Stable Models
Programs that have stable models avoid self-contradictions.
Stability Transformation. Let P be a program and $M \subseteq B_P$ be
an interpretation of P. Then $ground_M(P)$ denotes the program
obtained from ground(P) by the following transformation:
1. remove every rule having as a goal some literal ¬q with q ∈ M;
2. remove all negated goals from the remaining rules.
Example: P = ground(P)
p ← ¬q.
q ← ¬p.
Stable Models: Let P be a program with model M. M is
said to be a stable model for P when M is the least model
of $ground_M(P)$.
$ground_M(P)$ is a positive program, by construction: so, its least
model is $T^{\uparrow\omega}(\emptyset)$, where T denotes the immediate
consequence operator for $ground_M(P)$.

Paradoxes and Contradictions

In the village, the barber shaves everyone who does not shave
himself: Every villager who does not shave himself
is shaved by the barber
shaves(barber, X) ← villager(X), ¬shaves(X, X).
shaves(miller, miller).
villager(miller).
villager(smith).
villager(barber).
There is no problem with villager(miller), who shaves himself,
and therefore does not satisfy the body of the first rule.
For villager(smith), given that shaves(smith, smith) is not
in our program, we can assume ¬shaves(smith, smith);
then, shaves(barber, smith) is derived, which is consistent
with the negative assumptions made.
For villager(barber): under the assumption ¬shaves(barber, barber)
the rule yields shaves(barber, barber), which contradicts the
initial assumption.
If we do not initially assume ¬shaves(barber, barber), then we
cannot derive shaves(barber, barber) using this program and,
by the CWA, we will have to assume ¬shaves(barber, barber),
and end up with a contradiction.

Open World and Closed World
Open World: what is not part of the database or the program
is assumed to be unknown.
Closed World: what is not part of the database or the program
is assumed to be false.
Databases and other information systems adopt the Closed World
Assumption (CWA).
If p is a base predicate with n arguments, then ¬p(a1, ..., an)
iff p(a1, ..., an) is not true, i.e., it is not in the fact base.
Unique name axiom: no two constants in the database stand for
the same semantic object.
Example: The absence of coolguy("Clark Kent") from the database
implies ¬coolguy("Clark Kent"), even though the database
contains a fact coolguy("Super Man").
For positive programs, the CWA is as follows: Let P be a positive
program; then, for each atom $a \in B_P$:
1. a is true iff $a \in T_P^{\uparrow\omega}(\emptyset)$
2. ¬a is true iff $a \notin T_P^{\uparrow\omega}(\emptyset)$.
However, the CWA for general programs (i.e., programs with negated
goals) might lead to inconsistencies.

Beyond Stratified Negation
Finding classes of programs that are more powerful than the
ones with stratified negation and set aggregates is a research
topic.
The problem (at least in terms of fixpoint theory) is due to
the non-monotonic nature of the implicit negation used in
DBs and AI.
Implicit negation describes the situation where negation is
inferred from the absence of the opposite conclusion, under
the closed-world assumption.
Nonmonotonic reasoning, and knowledge representation, is a
well-established research topic in AI. The concept of
circumscription was followed by concepts such as default theories
and auto-epistemic logic; the concept of stable models is recent.

Sec 10.2 NonMonotonic Reasoning
Well-Founded Models and Locally Stratified Programs
Stratified and locally stratified programs always have a well-founded
model (and therefore a unique stable model) that can be computed
using the alternating fixpoint procedure:
Theorem: Let P be a program that is stratified or locally
stratified. Then P has a well-founded model.

Partial Well-Founded Models
p ← ¬q.
q ← ¬p.
Here:
$S_P(\emptyset) = \{p, q\}$
$A_P(\emptyset) = S_P(S_P(\emptyset)) = S_P(\{p, q\}) = \emptyset$
so that, for every k,
$A_P^{\uparrow k}(\emptyset) = \emptyset \subset S_P(A_P^{\uparrow k}(\emptyset)) = \{p, q\}$
Since the overestimates and underestimates never converge, this
program does not have a (total) well-founded model.
Indeed, this program has two stable models.
There is also the concept of partial well-founded model, defined
as having as negated atoms $M^- = lfp(A_P)$ and positive atoms
$M^+ = B_P - S_P(M^-)$; thus, the atoms in $B_P - (M^+ \cup M^-)$
are undefined in the partial well-founded model, while this set is
empty in the total well-founded model.

Total Well-founded Model
Definition: Let P be a program and W be the least fixpoint
for $A_P$. If $S_P(W) = W$, then $B_P - W$ is called the well-founded
model for P.
With $M = B_P - W$ and $\overline{M} = W$:
Now, $B_P - S_P(\overline{M}) = B_P - \overline{M} = M$.
But, $B_P - S_P(\overline{M}) = \Gamma_{P(\overline{M})}^{\uparrow\omega}(\emptyset)$.
Theorem: Let P be a program with well-founded model M.
Then M is a stable model for P, and P has no other stable
model.
The fact that M is a stable model was proven above.
If N is another stable model, then $\overline{N}$ is also a fixpoint
for $A_P$; in fact, $\overline{N} \supseteq \overline{M}$, since
$\overline{M}$ is the least fixpoint of $A_P$. Thus, $N \subseteq M$;
but $N \subset M$ cannot hold, since M is a stable model, and every
stable model is a minimal model. Thus N = M.

Alternating Fixpoint Computation
The least fixpoint $lfp(A_P)$ can be computed by (possibly
transfinite) applications of $A_P$.
Every application of $A_P$ in fact consists of two applications
of $S_P$.
Since $A_P^{\uparrow n-1}(\emptyset) \subseteq A_P^{\uparrow n}(\emptyset)$,
the even powers of $S_P$
$A_P^{\uparrow n}(\emptyset) = S_P^{\uparrow 2n}(\emptyset)$
define an ascending chain.
The odd powers of $S_P$
$S_P(A_P^{\uparrow n}(\emptyset)) = S_P^{\uparrow 2n+1}(\emptyset)$
define a descending chain.
Every element of the descending chain is ⊇ every element
of the ascending chain.
Thus, we have an increasing chain of underestimates dominated
by a decreasing chain of overestimates.
If the two chains ever meet, they define the (total)
well-founded model for the program.
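
For a finite ground program the two chains can be computed directly. The Python sketch below is an illustration under assumptions made here: rules are (head, positive goals, negated goals) triples, and the names lfp_gamma, s_p and well_founded are hypothetical. It returns the true atoms, the false atoms (the least fixpoint of the alternating operator) and the undefined atoms.

```python
def lfp_gamma(rules, N):
    """Least fixpoint of the explicit negation ICO with N held constant."""
    I = set()
    while True:
        nxt = {h for (h, pos, neg) in rules if pos <= I and neg <= N}
        if nxt == I:
            return I
        I = nxt


def s_p(rules, base, N):
    """Antimonotonic operator: atoms NOT derivable under assumptions N."""
    return base - lfp_gamma(rules, N)


def well_founded(rules, base):
    """Return (true, false, undefined) atoms via the alternating fixpoint."""
    W = set()                                   # underestimate of false atoms
    while True:
        nxt = s_p(rules, base, s_p(rules, base, W))   # one step of A_P
        if nxt == W:
            break
        W = nxt
    true_atoms = base - s_p(rules, base, W)
    return true_atoms, W, base - true_atoms - W


empty = frozenset()
# Stratified program:  b.   a <- not b.
total = well_founded([("b", empty, empty), ("a", empty, frozenset({"b"}))],
                     {"a", "b"})
# Non-stratified program:  p <- not q.   q <- not p.
partial = well_founded([("p", empty, frozenset({"q"})),
                        ("q", empty, frozenset({"p"}))],
                       {"p", "q"})
```

The first program yields a total model because the chains meet; in the second every atom stays undefined, matching the case where overestimates and underestimates never converge.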

Fixpoints for $S_P$ and $A_P$
Since it is monotonic, $A_P$ has a least fixpoint $lfp(A_P)$ (by the
Knaster-Tarski theorem).
$A_P$ might have several fixpoints.
Let $(\overline{M}, M)$ be a dichotomy of $B_P$.
LEMMA 1: M is a stable model for P iff $\overline{M}$ is a fixpoint
for $S_P$.
Proof: M is a stable model for P iff
$\Gamma_{P(\overline{M})}^{\uparrow\omega}(\emptyset) = M$.
This equality holds iff
$B_P - \Gamma_{P(\overline{M})}^{\uparrow\omega}(\emptyset) = B_P - M = \overline{M}$,
i.e., iff $S_P(\overline{M}) = \overline{M}$.
LEMMA 2: If M is a stable model for P then $\overline{M}$ is a
fixpoint of $A_P$.
Proof: Every fixpoint for $S_P$ is also a fixpoint for $A_P$.

Well-Founded Models
Much research work has been devoted to finding general approaches
for the efficient computation of nonstratified programs. The
concept of well-founded models represents a milestone in this effort.
$lfp(T_P) = T_P^{\uparrow\omega}$ is the linchpin of bottom-up semantics
and computation, but in the presence of negation $T_P$ is
nonmonotonic. So let us find a related operator that is monotonic:
$\Gamma_{P(N)}^{\uparrow\omega}(\emptyset)$ is monotonic in N.
Let us define:
$S_P(N) = B_P - \Gamma_{P(N)}^{\uparrow\omega}(\emptyset)$
This is antimonotonic in N: $S_P(N') \subseteq S_P(N)$ for $N' \supseteq N$.
The composition of an even number of applications of $S_P$
yields a monotonic mapping.
The composition of an odd number of applications yields an
antimonotonic mapping.
Let us define:
$A_P(N) = S_P(S_P(N))$
which is monotonic in N.

Local Stratification: cont.
Theorem: Every locally stratified program has a stable model
that is equal to the result of the iterated fixpoint computation
(on ground(P)).
Proof: same as for stratified programs.
Local stratification, however, behaves unlike regular stratification
from the viewpoints of computation and implementation. A program P
normally contains a small number of predicate names.
Thus, it is easy to check for strong components with negated arcs
in pdg(P) and to determine the strata needed for the iterated
fixpoint computation.
However, the question of whether a given program can be locally
stratified is undecidable when the Herbrand base of the program
is infinite.
Even when the universe is finite, the existence of a stable model
cannot be checked at compile time: it often depends on the database
content.
In the Barber example, the existence of a local stratification (and
of a stable model) depends on whether villager(barber) is in
the database.

Locally Stratified Programs
Local stratification. A program P is locally stratifiable iff
$B_P$ can be partitioned into a (possibly infinite) set of strata
$S_0, S_1, \ldots$, such that the following property holds: For each
rule r in ground(P) and each atom g in the body of r, if h(r) and g
are, respectively, in strata $S_i$ and $S_j$, then
(i) $i \ge j$ if $g \in pg(r)$, and
(ii) $i > j$ if $g \in ng(r)$.
A locally stratified program defining integers
even(0).
even(s(J)) ← ¬even(J).
Local stratification: {even(0)} = $S_0$, {even(s(0))} = $S_1$, and
so on.
This alternative definition of integers is not locally stratified
(Homework: prove it!!)
A program that is not locally stratified
even(0).
even(J) ← ¬even(s(J)).

Stratification and Stable Models
Theorem: Let P be a stratified program. Then P has a
stable model that is equal to the result of the iterated fixpoint
procedure.
1. Let Σ be a stratification for P, and let M be the result of the
iterated fixpoint computation on P according to Σ.
2. The iterated fixpoint computation on ground(P) according
to Σ also yields M.
3. Let r ∈ ground(P) be a rule used in the iterated fixpoint
computation: say that h(r) belongs to a stratum i.
If ¬g is a goal of r, then the predicate name of g belongs to
a stratum lower than i.
Let r' be r without the negated goals such as ¬g.
4. If r was used in the iterated fixpoint computation of M, then
¬g was true during the computation of stratum i. Thus $g \notin M$,
since only atoms belonging to higher strata are produced after
the computation for stratum i. Thus, with $P_{gr} = ground_M(P)$,
$r' \in P_{gr}$. Thus every rule used in computing
M is also in $ground_M(P) = P_{gr}$. Thus $lfp(P_{gr}) \supseteq M$.
5. But, since the iterated fixpoint produces a least fixpoint for
$T_P$, $T_P(M) = M$. But $T_{P_{gr}}(M) = T_P(M) = M$, so M is a
fixpoint for $P_{gr}$ and $lfp(P_{gr}) \subseteq M$.
Thus $lfp(P_{gr}) = M$.

Sec. 10.3 Temporal Reasoning
Temporal Reasoning with Datalog1S

Every query expressed in PLTL can also be expressed in propositional
Datalog1S (i.e., Datalog with only the temporal argument).
For instance, the previous query can be turned into the query
?pair_to_newcstl where
pair_to_newcstl ← newcstl(J) ∧ newcstl(J + 1).
Express p U q: p must be true at each instant in history, until the
first state in which q is true. Use recursion to reason back in time
and identify all states in history that precede the first occurrence
of q.
post_q(J + 1) ← q(J).
post_q(J + 1) ← post_q(J).
first_q(J) ← q(J), ¬post_q(J).
pre_first_q(J) ← first_q(J + 1).
pre_first_q(J) ← pre_first_q(J + 1).
fail_p_Until_q ← pre_first_q(J), ¬p(J).
p_Until_q ← pre_q(0), ¬fail_p_Until_q.
A similar approach can be used to express other operators of
temporal logic. For instance, p B q can be defined using the
predicates of the previous example and the rule
p_Before_q ← p(J), pre_first_q(J).

Other Operators
For instance, the fact that q will never be true can simply be
defined as ¬F q.
The fact that q is always true is simply described as ¬F(¬q);
the notation G q is often used to denote that q is always true.
The operator p before q, denoted p B q, can be defined as
¬((¬p) U q); that is, it is not true that p is false until q.
PLTL finds many applications, including temporal queries and
proving properties of dynamic systems. For instance, the question
"Is there a train to Newcastle that is followed by another one hour
later?" can be expressed by the following query:
?F(newcstl ∧ ○newcstl)

Temporal Operators
In addition to the usual propositional operators ∨, ∧, and ¬,
PLTL offers the following operators:
2. Next: Next p, denoted ○p, is true in history H when p
holds in history $H_1 = (S_1, S_2, \ldots)$.
Therefore, $○^n p$, $n \ge 0$, denotes that p is true in history
$(S_n, S_{n+1}, \ldots)$.
For instance,
$○^8\, newcstl \wedge ○^9\, \neg newcstl$
is true since there is a train at 8 and no train at 9.
3. Eventually: Eventually q, denoted F q, holds when, for some
n, $○^n q$.
4. Until: p until q, denoted p U q, holds if, for some n, $○^n q$,
and for every state k < n, $○^k p$.

Propositional Linear Temporal Logic (PLTL)
PLTL is based on the notion that there is a succession of states
$H = (S_0, S_1, \ldots)$, called a history.
For instance, trains to Newcastle can be modeled by a predicate
newcstl that holds true in the following states: $S_8$, $S_{10}$,
$S_{12}$, $S_{14}$, $S_{16}$, $S_{18}$, $S_{20}$, $S_{22}$, and is false
everywhere else.
Modal operators are used to define in which states a predicate p
holds true.
1. Atoms: Let p be an atomic propositional predicate. Then p
is said to hold in history H when p holds in H's initial state
$S_0$.
For instance, ¬newcstl is true in our example since it is true in
$S_0$.

Recurring Schedules
Trains for Newcastle leave daily at 800 hours and then every
two hours until 2200 hours (military time)
before22(22).
before22(H) ← before22(H + 1).
leaves(8, newcastle).
leaves(T + 2, newcastle) ← leaves(T, newcastle),
                           before22(T).

Datalog1S
Discrete time can be modeled using Datalog1S.
The discrete temporal domain consists of terms built using the
constant 0 and the unary function symbol +1 (written in postfix
notation). For the sake of simplicity, we will write n for
$\underbrace{(\ldots((0 + 1) + 1) \ldots + 1)}_{n\ \text{times}}$
If T is a variable in the temporal domain, then T, T + 1, and
T + n are valid temporal terms, where T + n denotes
$\underbrace{(\ldots((T + 1) + 1) \ldots + 1)}_{n\ \text{times}}$
The endless succession of seasons
quarter(0, winter).
quarter(T + 1, spring) ← quarter(T, winter).
quarter(T + 1, summer) ← quarter(T, spring).
quarter(T + 1, fall) ← quarter(T, summer).
quarter(T + 1, winter) ← quarter(T, fall).

Sec: 10.4 Beyond Stratification
Temporal Projection: auxiliary predicates
distinct(Frm1, To1, Frm2, To2) ← To1 ≠ To2.
distinct(Frm1, To1, Frm2, To2) ← Frm1 ≠ Frm2.
select_larger(X, Y, X) ← X ≥ Y.
select_larger(X, Y, Y) ← Y > X.

All Classical Algorithms Can be Expressed as XY-stratified programs.
A simple example: Temporal Projection.
emp_dep_sal(1001, shoe, 35000, 19920101, 19940101).
emp_dep_sal(1001, shoe, 36500, 19940101, 19960101).
represent two tuples from this relation.
Merging overlapping periods into maximal periods after a
temporal projection
e_hist(0, Eno, Frm, To) ← emp_dep_sal(Eno, D, S, Frm, To).
overlap(J + 1, Eno, Frm1, To1, Frm2, To2) ←
    e_hist(J, Eno, Frm1, To1),
    e_hist(J, Eno, Frm2, To2),
    Frm1 ≤ Frm2, Frm2 ≤ To1,
    distinct(Frm1, To1, Frm2, To2).
e_hist(J, Eno, Frm1, To) ← overlap(J, Eno, Frm1, To1, Frm2, To2),
    select_larger(To1, To2, To).
e_hist(J + 1, Eno, Frm, To) ← e_hist(J, Eno, Frm, To),
    overlap(J + 1, _, _, _, _, _),
    ¬overlap(J + 1, Eno, Frm, To, _, _),
    ¬overlap(J + 1, Eno, _, _, Frm, To).
final_e_hist(Eno, Frm, To) ← e_hist(J, Eno, Frm, To),
    ¬e_hist(J + 1, _, _, _).
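
The stage-by-stage merging performed by these rules can be mimicked in Python. This is a rough sketch under assumptions made for illustration: periods are (Eno, Frm, To) triples with integer dates, the function name coalesce is hypothetical, and each loop iteration plays the role of one value of the stage argument J.

```python
def coalesce(periods):
    """Merge overlapping (eno, frm, to) periods into maximal periods."""
    current = set(periods)                      # e_hist at stage 0
    while True:
        # overlap(J+1, ...): pairs of distinct periods of one employee
        # where the second starts inside the first.
        overlaps = {(e1, f1, t1, f2, t2)
                    for (e1, f1, t1) in current
                    for (e2, f2, t2) in current
                    if e1 == e2 and f1 <= f2 <= t1 and (f1, t1) != (f2, t2)}
        if not overlaps:
            return current                      # final_e_hist
        # Each overlapping pair yields one merged period (select_larger).
        merged = {(e, f1, max(t1, t2)) for (e, f1, t1, f2, t2) in overlaps}
        # Periods that took part in some overlap are not copied forward.
        involved = ({(e, f1, t1) for (e, f1, t1, _, _) in overlaps} |
                    {(e, f2, t2) for (e, _, _, f2, t2) in overlaps})
        current = merged | (current - involved)


history = {(1001, 19920101, 19940101), (1001, 19940101, 19960101)}
maximal = coalesce(history)
```

The loop stops at the first stage that produces no overlap, which corresponds to the negated goal in the final_e_hist rule.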

Computing XY-stratified Programs

Computing the well-founded model of an XY-stratified program P
Initialize: Set T = 0 and insert the fact counter(T).
Forever repeat the following two steps:
1. Apply the iterated fixpoint computation to the synchronized
program $P_{bis}$, and for each recursive predicate q, compute
new_q. Return the new_q atoms so computed, after adding a temporal
argument T to these atoms; the value of T is taken from counter(T).
2. For each recursive predicate q, replace old_q with
new_q, computed in the previous step. Then, replace
counter(T) with counter(T + 1).
Copy rules
When does the computation stop?

XY-stratified Programs

Let P be an XY-program. P is said to be XY-stratified when
$P_{bis}$ is a stratified program.
The bi-state program of the previous example is stratified with the
following strata: S0 = {parent, old_all_anc, old_delta_anc},
S1 = {new_delta_anc}, and S2 = {new_all_anc}. Thus, the
program of that example is locally stratified.
Theorem: Let P be an XY-stratified program. Then P is
locally stratified.

The Old and the New
r1 : delta_anc(0, marc).
r2 : delta_anc(J + 1, Y) ← delta_anc(J, X), parent(Y, X),
                           ¬all_anc(J, Y).
r3 : all_anc(J + 1, X) ← all_anc(J, X).
r4 : all_anc(J, X) ← delta_anc(J, X).
The bi-state program $P_{bis}$ is computed as follows: For each r ∈ P,
1. Rename all the recursive predicates in r that have the same
temporal argument as the head of r with the distinguished
prefix new_.
2. Rename all other occurrences of recursive predicates in r with
the distinguished prefix old_.
3. Drop the temporal arguments from the recursive predicates.
The bi-state version for the previous program is:
new_delta_anc(marc).
new_delta_anc(Y) ← old_delta_anc(X), parent(Y, X),
                   ¬old_all_anc(Y).
new_all_anc(X) ← new_delta_anc(X).
new_all_anc(X) ← old_all_anc(X).
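
The old/new alternation can be illustrated with a small Python loop. This is a sketch under assumptions made here, not the compiled form a deductive database would use: parent is a set of (Parent, Child) pairs, the function name ancestors and the sample facts are hypothetical, and the loop stops once no new delta atoms are produced.

```python
def ancestors(parent, start):
    """Stage-by-stage evaluation of the bi-state ancestor program."""
    old_delta = {start}          # delta_anc(0, start)
    old_all = {start}            # all_anc(0, X) <- delta_anc(0, X)
    while old_delta:
        # new_delta_anc(Y) <- old_delta_anc(X), parent(Y, X), not old_all_anc(Y)
        new_delta = {y for (y, x) in parent
                     if x in old_delta and y not in old_all}
        # new_all_anc(X) <- new_delta_anc(X);  new_all_anc(X) <- old_all_anc(X)
        new_all = new_delta | old_all
        # Step 2 of the computation: the new relations become the old ones.
        old_delta, old_all = new_delta, new_all
    return old_all


# Hypothetical parent facts: (Parent, Child).
parent = {("ann", "marc"), ("bob", "ann"), ("cat", "dan")}
anc = ancestors(parent, "marc")
```

As in the source program, the result contains the start individual itself, because delta_anc(0, marc) is copied into all_anc by rule r4.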

X-rules and Y-rules

r1 : delta_anc(0, marc).
r2 : delta_anc(J + 1, Y) ← delta_anc(J, X), parent(Y, X),
                           ¬all_anc(J, Y).
r3 : all_anc(J + 1, X) ← all_anc(J, X).
r4 : all_anc(J, X) ← delta_anc(J, X).
XY-programs: Let P be a set of rules defining mutually recursive
predicates. Then we say that P is an XY-program if it
satisfies the following conditions:
1. Every recursive predicate of P has a distinguished temporal
argument.
2. Every recursive rule r is either an X-rule or a Y-rule, where
r is an X-rule when the temporal argument in every recursive
predicate in r is the same variable (e.g., J), and
r is a Y-rule when (i) the head of r has as temporal argument
J + 1, where J denotes any variable, (ii) some goal
of r has as temporal argument J, and (iii) the remaining
recursive goals have either J or J + 1 as their temporal
arguments.
For instance, the program above is an XY-program where r4 is an
X-rule while r2 and r3 are Y-rules.

Stratification by the temporal Argument

Ancestors of marc and the generation gap including the
differential fixpoint improvement
r1 : delta_anc(0, marc).
r2 : delta_anc(J + 1, Y) ← delta_anc(J, X), parent(Y, X),
                           ¬all_anc(J, Y).
r3 : all_anc(J + 1, X) ← all_anc(J, X).
r4 : all_anc(J, X) ← delta_anc(J, X).
This program is locally stratified by the first argument of the anc
predicates, which serves as temporal argument.
The zeroth stratum consists of atoms of nonrecursive predicates
such as parent and of atoms that unify with all_anc(0, X) or
delta_anc(0, X), where X can be any constant in the universe.
The kth stratum contains atoms all_anc(k, X), delta_anc(k, X).
Thus, this program is locally stratified, since the heads of recursive
rules belong to strata that are one above those of their goals.

XY-Stratification
Does a program have a well-founded model? In general, the
only way to answer this question is to search for such a model
(e.g., by the alternating fixpoint).
For stratified programs, however, the question is easy to answer
at compile time, independent of the database.
XY-stratification is a particular class of locally stratified
programs for which we also have a simple compile-time check
and an efficient implementation.
In fact, XY-stratified programs are particular Datalog1S programs.

Sec 10.5 Updates and Active Rules
Axioms for a Correct History

Question. How do we define the correct behavior of a deductive
database with active rules?
Answer. By ensuring that its history satisfies the following two
axioms:
1. Completeness Axiom. The history relations in A must be
identical to the history relations in the stable model of A.
2. External Causation Axiom. Let $A_{ext}$ be the logic program
obtained from A by eliminating from the history relations all
changes but the external changes requested by users. Then,
the stable model of $A_{ext}$ and the stable model of the original
A must be identical.
Thus, (1) every rule that is enabled must be triggered, and (2)
the externally requested events plus those triggered by the active
rules will produce the complete history.
We have considered the ideal situation, where the system can
complete the firing of all the active rules before the next user
request comes in. Then there is a natural definition of what the
correct behavior of the system should be.
Concurrent requests: we can use the concept of serializability to
reduce those cases to this ideal situation.

Active Rules
A1 : If a student is added to the alumni relation, then delete
his name from the student relation, provided that this is
a senior-level student (otherwise error, using a rule not
shown here).
A2 : If a person takes a course, and the name of this person
is not in the student relation, then add that name to the
student relation, using the (null) value tba for Major
and Level.
Using the "immediately after" activation semantics (under an eager
firing policy) these rules can be modeled as follows:
A1 : student_hist(J + 1, -, Name, Major, senior) ←
         alumni_hist(J, +, Name, _, _, _),
         student_snap(J, Name, Major, senior).
A2 : student_hist(J + 1, +, Name, tba, tba) ←
         took_hist(J, +, Name, _, _),
         ¬student_snap(J, Name, _, _).
An active logic program consists of
(1) the history relations,
(2) the change predicates,
(3) the snapshot predicates, and
(4) the active rules.
The program A so defined is XY-stratified; thus it has a unique
stable model M, which defines the meaning of the program.

Traditional Queries and Deductive Rules

Snapshot predicates define the content of the database relations
at each instant J. For the student relation, for instance, we have
the following rules:
Snapshot predicates for student via frame axioms
student_snap(J + 1, Name, Major, Level) ←
    student_snap(J, Name, Major, Level),
    ¬student_hist(J + 1, -, Name, Major, Level).
student_snap(J, Name, Major, Level) ←
    student_hist(J, +, Name, Major, Level).
These rules express what are commonly known as frame axioms.
The current content of each relation is its snapshot at time J,
where J is the max value for the change counter:
The current content of the relation student
current_state(J) ← change(J), ¬change(J + 1).
student(Name, Major, Year) ←
    student_snap(J, Name, Major, Year),
    current_state(J).
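
The effect of the frame axioms can be approximated by replaying the history in counter order. The Python sketch below is an illustration under assumptions made here: the function name snapshots is hypothetical, history rows are (counter, sign, name, major, level) tuples, and snapshots are recorded only at the instants where this relation changes, since the frame axiom leaves the snapshot unchanged in between.

```python
def snapshots(history):
    """Map each change instant J to the snapshot of the relation at J."""
    by_time = {}
    for (j, sign, *row) in history:
        by_time.setdefault(j, {"+": set(), "-": set()})[sign].add(tuple(row))
    snap, result = set(), {}
    for j in sorted(by_time):
        # Frame axiom: carry over every tuple not deleted at J,
        # then add the tuples inserted at J.
        snap = (snap - by_time[j]["-"]) | by_time[j]["+"]
        result[j] = snap
    return result


# Sample history rows for one student.
history = [
    (2301, "+", "Jim Black", "ee", "freshman"),
    (4007, "-", "Jim Black", "ee", "freshman"),
    (4007, "+", "Jim Black", "ee", "sophomore"),
    (4805, "-", "Jim Black", "ee", "sophomore"),
    (4805, "+", "Jim Black", "cs", "sophomore"),
    (6300, "-", "Jim Black", "cs", "sophomore"),
    (6300, "+", "Jim Black", "cs", "junior"),
]
snaps = snapshots(history)
current = snaps[max(snaps)]    # snapshot at the largest change counter
```

Taking the snapshot at the largest counter corresponds to the current_state rule, restricted here to a single relation.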

Continuity Axioms
If the database consists of three relations:
student(Name, Major, Year); took(Name, Course, Grade);
alumni(Name, Sex, Degree, ClassOf)
(the last relation stores the alumni who graduated from college in
the previous years),
then we need three rules to keep track of all changes (one rule
per relation in the schema):
change(J) ← student_hist(J, _, _, _, _).
change(J) ← took_hist(J, _, _, _, _).
change(J) ← alumni_hist(J, _, _, _, _, _).
A violation of the continuity axiom can be expressed as follows:
bad_history ← change(J + 1), ¬change(J).
The temporal argument can only be increased by a new event,
and there is no hole in the sequence.

Extensional and Intensional Information

These terms are often used to denote, respectively, the database
facts and the rules.
Here the history of each relation becomes the (only) extensional
information; the rest is intensional information. Then, each database
relation such as
student('Jim Black', cs, junior)
must be defined by rules from its history, e.g., a history of
changes for Jim Black:
student_hist(2301, +, 'Jim Black', ee, freshman).
student_hist(4007, -, 'Jim Black', ee, freshman).
student_hist(4007, +, 'Jim Black', ee, sophomore).
student_hist(4805, -, 'Jim Black', ee, sophomore).
student_hist(4805, +, 'Jim Black', cs, sophomore).
student_hist(6300, -, 'Jim Black', cs, sophomore).
student_hist(6300, +, 'Jim Black', cs, junior).
The first column is a change counter that is global for the system;
that is, it is incremented for each change request.
Several changes can be made in the same SQL update statement:
4,007 - 2,301 changes in one year.

Updates in Logic
In general, logic-based systems have not dealt well with database
updates.
For instance, Prolog resorts to its operational semantics to
give meaning to assert and retract operations. The result
is that many different operational semantics (more than 9)
have been implemented in various systems.
Logic-based semantics for updates is also a major problem
faced by deductive database systems; however, these concentrate
on changes in the base relations, rather than facts and
rules as in Prolog.
Current DB prototypes feature a strained coexistence of declarative
and operational constructs: e.g., in GLUE/Nail!, GLUE
is an operational wrapper around the declarative Nail!: same
syntax but not same semantics.
Also, a DB system that supports updates and rules should
support active rules too ...
Desiderata:
1. providing a logical model for updates,
2. supporting the same queries that current deductive databases
do, and
3. supporting the same rules that active databases currently do.

Sec 10.6 Nondeterministic Reasoning {9{
CZ
Beyond Don't Care Non-Determinism
In many situations, we seek to satisfy a condition that holds or
does not hold depending on the choice made. Thus, we might want to
search among the choice models for one that satisfies the condition.
Alternatively, we might make a choice and then backtrack to the
next choice once we find that the condition does not hold. Thus an
exponential computation is often required.
Hamiltonian path in a graph: A graph has a Hamiltonian path iff
there is a simple path that visits all nodes exactly once.
simplepath(root, root).
simplepath(X, Y) ← simplepath(_, X), g(X, Y),
                   choice((X), (Y)), choice((Y), (X)).
nonhppath ← n(X), ¬simplepath(_, X).
q ← ¬q, nonhppath.
If nonhppath is true in M, then the rule q ← ¬q must also be
satisfied by M, which no stable model can do. Thus, M cannot be a
stable model. Thus, this program has a stable model iff there
exists a Hamiltonian path.
Thus, deciding whether a stable model exists for a program is
NP-hard.
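The choose-then-backtrack strategy mentioned on this slide can be sketched in Python. This is only an illustration of the search, not the stable-model semantics itself: the function name `hamiltonian_path` and the edge-set encoding of the g(X, Y) facts are assumptions, and the sketch simply tries every start node in place of arcs leaving root.

```python
def hamiltonian_path(nodes, edges):
    """Return a simple path visiting every node exactly once, or None.

    nodes: iterable of node names.
    edges: iterable of directed (x, y) pairs, standing in for g(X, Y).
    """
    nodes = list(nodes)
    succ = {n: [] for n in nodes}
    for x, y in edges:
        succ[x].append(y)

    def extend(path, visited):
        # The condition we want the choices to satisfy: all nodes covered.
        if len(path) == len(nodes):
            return path
        # Make a choice; if it leads nowhere, backtrack to the next one.
        for y in succ[path[-1]]:
            if y not in visited:
                found = extend(path + [y], visited | {y})
                if found is not None:
                    return found
        return None

    # Assumption: the path may start anywhere, so each start is tried.
    for start in nodes:
        found = extend([start], {start})
        if found is not None:
            return found
    return None
```

In the worst case the search explores exponentially many partial paths, which matches the remark that an exponential computation is often required.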

DB-PTIME without assuming a total order
Stratified Datalog programs with choice are also DB-PTIME complete,
without having to assume that the universe is totally ordered
(i.e., respecting the genericity assumption).
The following program defines a total order for the elements of a
set d(X) by constructing an immediate-successor relation for its
elements (root is a distinguished new symbol):
Ordering a domain
ordered_d(root, root).
ordered_d(X, Y) ← ordered_d(_, X), d(Y),
                  choice((X), (Y)), choice((Y), (X)).
Once an arc (X, Y) is generated, this is the only arc leaving the
source node X and the only arc entering the sink node Y.
Since we accept any choice model, we have don't-care nondeterminism
and the computation remains polynomial.
Here it means that we accept any order.
For certain queries, we might still obtain a deterministic result:
e.g., in the computation of aggregates that are commutative and
associative.
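As a rough illustration of the ordering program, the following Python sketch builds an immediate-successor chain by picking elements in arbitrary order. The function name `order_domain` and the list-of-arcs result are assumptions made for this example; `set.pop()` plays the role of the don't-care choice.

```python
def order_domain(d):
    """Build an immediate-successor chain over the elements of d.

    Returns a list of (X, Y) arcs; the first one is the exit fact
    (root, root), and every later arc links the previously chosen
    element to a newly chosen one.
    """
    remaining = set(d)
    arcs = [("root", "root")]
    last = "root"
    while remaining:
        y = remaining.pop()      # don't-care choice: any element will do
        arcs.append((last, y))   # one arc leaves `last`, one arc enters `y`
        last = y
    return arcs
```

Each element is handled once, so the construction is linear in the size of d. Whatever order results, a commutative and associative aggregate such as a sum over the chain gives the same answer.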
Choice in Recursion
For instance, the following program computes a spanning tree,
starting from the source node a, for a graph where an arc from
node b to d is represented by the database fact g(b, d).
Computing a spanning tree
st(root, a).
st(X, Y) ← st(_, X), g(X, Y), Y ≠ a, choice((Y), (X)).
The goal Y ≠ a ensures that, in st, the end-node for the arc
produced by the exit rule has an in-degree of one;
likewise, the goal choice((Y), (X)) ensures that the end-nodes for
the arcs generated by the recursive rule have an in-degree of one.
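One possible way to evaluate this program is sketched below in Python. The function name `spanning_tree`, the list encoding of the g facts, and the breadth-first visiting order are all assumptions of this example; the dictionary `chosen` stands in for a table of chosen atoms and enforces the FD Y → X.

```python
def spanning_tree(arcs, source="a"):
    """Compute st(X, Y) arcs of a spanning tree from g(X, Y) facts.

    arcs: list of directed (x, y) pairs; source: the start node.
    """
    chosen = {}                   # Y -> X, so each end-node has one parent
    reached = [source]            # nodes X for which st(_, X) holds
    tree = [("root", source)]     # exit rule: st(root, a)
    i = 0
    while i < len(reached):
        x = reached[i]
        i += 1
        for gx, y in arcs:
            # Goals g(X, Y) and Y != a, plus the check against `chosen`.
            if gx == x and y != source and y not in chosen:
                chosen[y] = x
                tree.append((x, y))
                reached.append(y)
    return tree
```

For the arcs `[("a", "b"), ("a", "c"), ("b", "c"), ("c", "a"), ("b", "d")]` this sketch returns `[("root", "a"), ("a", "b"), ("a", "c"), ("b", "d")]`; a different visiting order would give a different but equally acceptable tree.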

Properties
In general, the program SV(P) generated by the stable-version
transformation has the following properties:
SV(P) has one or more total stable models.
The chosen atoms in each stable model of SV(P) obey the FDs
defined by the choice goals.
The stable models of SV(P) are called choice models for P.
Stratified Datalog programs with choice are in DB-PTIME: actually,
they can be implemented efficiently by producing chosen atoms one
at a time and memorizing them in a table. The diffChoice atoms
need not be computed and stored; rather, the goal ¬diffChoice can
simply be checked dynamically against the table chosen.

Stable Version, SV(P), of a program P
For each choice rule r in P:
$r : A \leftarrow B(Z),\ \mathit{choice}((X_1), (Y_1)),\ \ldots,\ \mathit{choice}((X_k), (Y_k)).$
where $B(Z)$ denotes the conjunction of all the goals of r that are
not choice goals, and $X_i$, $Y_i$, $Z$, $1 \le i \le k$, denote vectors
of variables occurring in the body of r such that
$X_i \cap Y_i = \emptyset$ and $X_i, Y_i \subseteq Z$.
1. In P, replace r with a rule r' obtained by substituting the
choice goals with the atom $\mathit{chosen}_r(W)$:
$r' : A \leftarrow B(Z),\ \mathit{chosen}_r(W).$
where $W \subseteq Z$ is the list of all variables appearing in choice
goals, i.e., $W = \bigcup_{1 \le j \le k} X_j \cup Y_j$.
2. Add the new rule
$\mathit{chosen}_r(W) \leftarrow B(Z),\ \neg \mathit{diffChoice}_r(W).$
3. For each choice atom $\mathit{choice}((X_i), (Y_i))$ ($1 \le i \le k$),
add the new rule
$\mathit{diffChoice}_r(W) \leftarrow \mathit{chosen}_r(W'),\ Y_i \ne Y_i'.$
where (i) the list of variables $W'$ is derived from $W$ by replacing
each $A \in Y_i$ with a new variable $A' \in Y_i'$ (i.e., by priming
those variables), and (ii) $Y_i \ne Y_i'$ is true if $A \ne A'$ for some
variable $A \in Y_i$ and its primed counterpart $A' \in Y_i'$.
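The three steps can be mimicked by a small Python function that rewrites one choice rule into its stable-version rules as plain strings. This is a sketch under stated assumptions: the name `stable_version` and its parameters are hypothetical, ASCII `<-`, `not`, and `!=` stand in for ←, ¬, and ≠, and the disjunctive condition of step 3 is rendered as one rule per variable of Y_i.

```python
def stable_version(name, head, body, choices):
    """Return the rules of SV(P) contributed by one choice rule.

    name:    label r of the rule, used to subscript chosen/diffChoice
    head:    head atom A, e.g. "actual_adv(S, P)"
    body:    the non-choice goals B(Z), as one string
    choices: list of (X_i, Y_i) pairs, each a list of variable names
    """
    # W: all variables occurring in choice goals, in order of appearance.
    w = []
    for xs, ys in choices:
        for v in list(xs) + list(ys):
            if v not in w:
                w.append(v)
    w_args = ", ".join(w)

    rules = [
        # Step 1: replace the choice goals with chosen_r(W).
        f"{head} <- {body}, chosen_{name}({w_args}).",
        # Step 2: chosen_r(W) holds unless a different choice was made.
        f"chosen_{name}({w_args}) <- {body}, not diffChoice_{name}({w_args}).",
    ]
    # Step 3: Y_i != Y_i' holds as soon as some A differs from A',
    # so emit one diffChoice rule per variable A in Y_i.
    for xs, ys in choices:
        w_primed = ", ".join(v + "'" if v in ys else v for v in w)
        for a in ys:
            rules.append(
                f"diffChoice_{name}({w_args}) <- "
                f"chosen_{name}({w_primed}), {a} != {a}'."
            )
    return rules
```

Called with the head `"actual_adv(S, P)"`, the body `"student(S, Majr, Yr), professor(P, Majr)"`, and the single choice goal `(["S"], ["P"])`, the function produces three rules, the last of which is `diffChoice_r(S, P) <- chosen_r(S, P'), P != P'.`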

Choice by Negation
actual_adv(S, P) ← student(S, Majr, Yr), professor(P, Majr),
                   choice((S), (P)).
The stable version for the advisor rule
actual_adv(S, P) ← student(S, Majr, Yr), professor(P, Majr),
                   chosen(S, P).
chosen(S, P) ← student(S, Majr, Yr), professor(P, Majr),
               ¬diffChoice(S, P).
diffChoice(S, P) ← chosen(S, P'), P ≠ P'.
This program has two stable models: one in which ohm is chosen as
advisor of Jim Black, and the other where bell is chosen instead.
A program where the rules contain choice goals is called a choice
program.
The semantics of a choice program P can be defined by transforming
P into a program with negation, SV(P), called the stable version of
the choice program P.
SV(P) exhibits a multiplicity of stable models, each obeying the
FDs defined by the choice goals.
Each stable model for SV(P) corresponds to an alternative set of
answers for P and is called a choice model for P.
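To make the "multiplicity of stable models" concrete, the Python sketch below enumerates the choice models of the advisor rule by brute force. It is not a stable-model solver; the function name `choice_models` and the tuple encodings of the facts are assumptions of this example.

```python
from itertools import product


def choice_models(students, professors):
    """Enumerate every set of chosen(S, P) pairs obeying the FD S -> P.

    students:   list of (name, major, year) tuples
    professors: list of (name, major) tuples
    """
    # Eligible advisors per student: professors with the same major.
    eligible = {}
    for s, majr, _yr in students:
        eligible[s] = [p for p, pmajr in professors if pmajr == majr]
    names = [s for s in eligible if eligible[s]]

    # A choice model picks exactly one eligible professor per student.
    models = []
    for combo in product(*(eligible[s] for s in names)):
        models.append(set(zip(names, combo)))
    return models
```

With the single student `("Jim Black", "ee", "senior")` and the professors `("ohm", "ee")` and `("bell", "ee")`, the function returns two models, one choosing ohm and one choosing bell.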
Many Applications of Choice
Given the two relations boy(Bname) and girl(Gname): are there more
boys than girls in our database?
match(Bname, Gname) ← boy(Bname), girl(Gname),
                      choice((Bname), (Gname)),
                      choice((Gname), (Bname)).
matched_boy(Bname) ← match(Bname, Gname).
moreboys ← boy(Bname), ¬matched_boy(Bname).
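The matching idea can be sketched in Python as follows. The function name `more_boys` and the list inputs are assumptions; names are assumed to be distinct, and the greedy pairing is just one of the many one-to-one matchings that the two choice goals allow.

```python
def more_boys(boys, girls):
    """Pair boys with girls one-to-one; report whether a boy is left over."""
    match = {}       # Bname -> Gname, enforcing the FD Bname -> Gname
    taken = set()    # girls already matched, enforcing Gname -> Bname
    for b in boys:
        for g in girls:
            if g not in taken:
                match[b] = g
                taken.add(g)
                break
    # moreboys holds if some boy has no match.
    return any(b not in match for b in boys)
```

Because every boy can be paired with every girl, any maximal matching has the same size, so the answer does not depend on which pairs were picked.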

Choice Goals
Then, in a language such as LDL++, the goal choice((S), (P)) can be
added to force the selection of a unique advisor, out of the
eligible advisors, for a student:
Computation/selection of unique advisors by choice rules
actual_adv(S, P) ← student(S, Majr, Levl), professor(P, Majr),
                   choice((S), (P)).
More declaratively, the goal choice((S), (P)) can also be viewed as
enforcing a functional dependency (FD) S → P; thus, in actual_adv,
the second column (professor name) is functionally dependent on
the first one (student name).
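A minimal Python sketch of this FD reading is given below. The function name `actual_adv` and the pair-list input are assumptions; the sketch keeps the first eligible pair it meets for each student, and any other pick would be an equally valid result.

```python
def actual_adv(elig_pairs):
    """Keep one (student, professor) pair per student.

    elig_pairs: iterable of (student, professor) pairs that satisfy the
    non-choice goals of the rule.
    """
    chosen = {}
    for s, p in elig_pairs:
        # choice((S), (P)): accept the pair only if S has no advisor yet.
        chosen.setdefault(s, p)
    return sorted(chosen.items())
```

Since the result is built from a dictionary keyed on the student, no student can appear with two professors, which is exactly the dependency S → P.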

Nonmonotonicity and Nondeterminism
Along with the relation student(Name, Majr, Year), our university
database contains the relation professor(Name, Majr). Consider a
toy database with only the following facts:
student('Jim Black', ee, senior).
professor(ohm, ee).
professor(bell, ee).
elig_adv(S, P) ← student(S, Majr, Year), professor(P, Majr).
We obtain:
elig_adv('Jim Black', ohm).
elig_adv('Jim Black', bell).
But a student can only have one advisor.
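The derivation of the eligible advisors is a plain join on the major, which can be checked with a few lines of Python. The function name `elig_adv` and the tuple encodings of the facts are assumptions of this sketch.

```python
def elig_adv(students, professors):
    """Join student(S, Majr, Year) with professor(P, Majr) on Majr."""
    return [
        (s, p)
        for s, majr, _year in students
        for p, pmajr in professors
        if majr == pmajr
    ]
```

For the toy database, `elig_adv([("Jim Black", "ee", "senior")], [("ohm", "ee"), ("bell", "ee")])` returns `[("Jim Black", "ohm"), ("Jim Black", "bell")]`: both professors are eligible, so the rule alone cannot single out one advisor.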
