A Prolog to Prolog
John F. Sowa
3. Procedural Prolog
Programming style in Prolog is strongly influenced by the pure
Prolog subset. In fact, all of the examples so far have
used only this subset, except for comparison predicates
like ge and arithmetic operators like (X := A + B).
But to make Prolog run on conventional computers, the underlying
inference engine imposes a strict order of execution.
For purely declarative rules, that order can often be ignored.
For complex applications, however, that order can have
a major effect on performance. A Prolog programmer should write
declarative rules, but should also be aware of the way those rules are
executed by the inference engine.
Backtracking and Cuts
Two features distinguish Prolog from conventional programming
languages: the unification algorithm for pattern matching, and
the backtracking algorithm for executing rules. Backtracking
can be viewed as a
generalization of ordinary procedure calls. It allows rules that
have been called and exited to be reactivated at some later time.
The following diagram illustrates the flow through a conventional
procedure:
        +--------------------------+
        |                          |
Call -->|  Conventional Procedure  |--> Exit
        |                          |
        +--------------------------+
The arrows in this diagram indicate flow
in only one direction. The procedure
is activated by a call and terminated by an exit. After it exits, there
is no way that it can be reactivated except by another call. Many
systems allow programs to be interrupted. Then the system saves the
state of the current activation, performs some higher priority task, and
reactivates the suspended procedure. But one thing that conventional
systems never do is to reactivate a procedure that has already run to
completion and exited normally.
In Prolog, the rules and facts that define a predicate
behave like a procedure. They can be called by a goal and can exit
successfully like a conventional procedure. But backtracking
adds another way of exiting and another way of reentering.
The next diagram shows the flow through the Prolog clauses
that define some predicate:
          +-------------------------+
Call ---->|                         |----> Exit
          |  Set of Prolog Clauses  |
Fail <----|                         |<---- Redo
          +-------------------------+
The top line shows the ordinary flow through a rule that terminates
successfully. But if a goal fails, control does not pass to the next
goal to the right. Instead, the goal terminates by failure, and
the previous goal on the left is reactivated by a redo.
To illustrate the possible entries and exits, consider the following
two rules that define the predicate p:
p(X) <- q(X) & r(X) & s(X) & t(X).
p(X) <- u(X) & v(X).
The first rule is called when the head p matches some goal.
Then the predicates in its body are called from left to right. If they
all succeed, the clause is exited in the normal way. Suppose, however,
that predicate s happened to
fail. Then control does not pass to t, but to the
redo entry for r. Another choice is taken for r,
and s is called again. If s fails once more, control
passes back to r. If no more options
are available for r, then r also fails and control
passes back to a choice point for q. If all choices for q fail,
then the whole rule fails and control passes to the second rule
that defines p. If the second rule fails, then the
original goal that called p also fails.
Each option in a rule creates a choice point, a point
where control returns when a rule is reentered by a redo.
There are two ways of specifying options: multiple rules
(or facts)
with the same
head predicate or the | operator in the body of some rule.
The more basic way is by multiple rules, since every use of
the | operator can be replaced by multiple rules. The
following rule, for example,
a <- b & (c | d) & e.
is equivalent to the pair of rules
a <- b & c & e.
a <- b & d & e.
Although the | operator can always be eliminated, it may
simplify the statement of the rules. In some cases, it can also
improve their execution speed. In the pair of rules above, for
example, the predicate b
must be executed a second time if c
fails. If b had side effects such as input/output, the final
results might be different.
In the basic Prolog inference method, choice points result from
multiple rules. The | operator behaves as though it were
defined by the following two rules:
(P | Q) <- P.
(P | Q) <- Q.
The first rule says that to execute a goal of the form (P | Q),
first do P. If P succeeds, then (P | Q) succeeds without
any need to try Q. But if P fails (or some later failure forces a
redo), then go to the second rule, which tries Q. In these
rules, P and Q are metavariables, whose
values are predicates rather than ordinary data structures.
These rules are sufficient as long as there are no complications
introduced by cuts. In order to handle those complications,
IBM Prolog provides | as a built-in operator.
Sometimes the choice points may not be obvious because they
are buried in the rules for some other predicate. The next
rule, for example, has no explicit choice points:
woman(W) <- human(W) & female(W).
Although there are no alternatives in this rule, the
predicate human may be defined by a long list of facts in the
workspace. If the goal is woman(X), the first value found
by the finder goal human(W) might be W=socrates.
Since female(socrates) fails, the system backtracks to the
redo point for human to find W=cicero.
Since female(cicero) also
fails, it keeps trying until it finds some
value for W that makes both predicates true.
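For concreteness, the following facts would produce exactly the
backtracking sequence just described (the first two names are from the
text; the third, cleopatra, is invented for this sketch):

```prolog
human(socrates).
human(cicero).
human(cleopatra).
female(cleopatra).
```

With these facts in the workspace, the goal woman(X) first tries
W=socrates and then W=cicero, fails the female test for both, and
finally succeeds with X=cleopatra.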
In some cases, backtracking can be a highly efficient way of
searching. In other cases, it may cause a great deal of overhead.
Prolog provides a way of controlling the search process
by means of a built-in operator called cut,
written /. The cut operator commits Prolog to the current
path in the search process by throwing away choice points.
Thus it cuts out the remaining part of the search space that
would otherwise result from the open choice points. However, a
careless use of cuts can destroy the logical integrity of the
computation. Consider the following example:
woman(W) <- human(W) & / & female(W).
human(mary).
human(george).
female(mary).
Here, the goal woman(X) succeeds with X=mary. But
consider the next example:
woman(W) <- human(W) & / & female(W).
human(george).
human(mary).
female(mary).
Now human(W) finds W=george. Then
the cut throws away the choice point leading to human(mary).
But now female(george) fails. Since the choice point is gone,
the system cannot redo the human predicate to find another
value for W. Therefore the entire rule fails.
As this example shows, cuts can destroy the declarative
reading of a program. Following are three reasons for using cuts:
- If a goal is known to have at most one solution, a cut following the
goal may prevent unnecessary backtracking.
- Doer goals that have side effects (such as input/output) may cause
erroneous results if executed more than once; a cut can
prevent them from being repeated by backtracking.
- When an error occurs in a deeply nested computation, a cut may be
necessary as part of the recovery process.
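The first case can be sketched with a hypothetical lookup predicate
(the predicate and part names below are invented for illustration).
If every part is known to have exactly one supplier, a cut after the
finder goal discards choice points that could never yield another
answer:

```prolog
supplier(bolt, acme).
supplier(nut, ajax).

one_supplier(Part,S) <- supplier(Part,S) & /.
```

After one_supplier(bolt,S) succeeds with S=acme, a later failure
cannot force a pointless rescan of the remaining supplier facts;
the goal simply fails.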
To illustrate backtracking, the count predicate
keeps incrementing its argument each time backtracking
returns to it:
count(0).
count(I) <- count(J) & (I := J + 1).
The first line starts the count at 0. Then the recursive
rule adds 1 to the value computed by the previous call.
To see how it works, type the following goal at the keyboard:
count(I) & write(I) & fail.
At the first call, count(I) matches the first rule
with I=0. Then the write predicate prints 0
on the screen. The fail predicate takes no arguments and
always fails when it is called. The failure causes backtracking to
go to the second rule, which calls count to
compute J=0. Then I=1, and write
prints 1 on the screen. The predicate fail
forces backtracking to return to the subgoal count(J),
which computes J=1, whereupon I=2. As a result,
the system keeps printing all the integers 0, 1, ... until
the user causes an interrupt (which can be done by
typing sp.).
To keep counting from going on forever, some limit N is necessary.
But a cut is also needed to keep backtracking from redoing the
previous rules. The two-argument version of count calls
the one-argument version to do the counting, but it uses the second
argument as a limit:
count(I,N) <- count(I) & (gt(I,N) & / & fail | true).
As long as I is less than or equal to N, the second option
in this rule calls the predicate true, which always succeeds.
As soon as I becomes greater than N, the cut is executed to prevent
backtracking from redoing the one-argument count goal.
Then fail causes a failure that stops the process.
Because of the limit, the following goal only prints the integers
up to 5:
count(I,5) & write(I) & fail.
Cuts in Prolog are like go-to statements in conventional
languages: they are very powerful, but they may destroy the
structure of the program and make it difficult to read and maintain.
The best way to use cuts is to define better structured operators
in terms of them. Then use those operators instead of the more
primitive cuts. Two operators defined in terms of cuts are
the negation ^ and the if-then operator ->.
Following is a definition of negation:
^P <- P & / & fail.
^P.
The first rule says that to prove ^P, first try P.
If P succeeds, do the cut that throws away the choice point to the
second rule. Then the goal fail forces a failure.
Since the pointer to the second rule is gone, there is nothing else
to do, and the original goal ^P fails. Therefore ^P
fails if P succeeds. On the other hand, if P fails, the choice
point for the second rule is still available. The second rule does
nothing and always succeeds. Therefore ^P succeeds
if P fails. In IBM
Prolog, the ^ operator is built-in,
and it
does some additional checking to handle cases where the goal P
itself does cuts that may affect the success or failure of ^P.
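As a brief illustration, assume the usual recursive definition
of member:

```prolog
member(X, X.Tail).
member(X, Head.Tail) <- member(X, Tail).
```

With this definition, the goal ^member(d, a.b.c.nil) succeeds because
member(d, a.b.c.nil) fails, while ^member(b, a.b.c.nil) fails because
member(b, a.b.c.nil) succeeds.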
The if-then operator is represented by a right pointing
arrow. The combination (P -> Q; R) may be read "If P
then Q else R." It has the following definition:
(P -> Q; R) <- P & / & Q.
(P -> Q; R) <- R.
If P succeeds, the cut throws away the choice point
for the second rule. Then Q is executed; Q may succeed or fail,
but backtracking will never cause R to be executed after P has
succeeded. On the other hand, if P fails, the second rule is
performed to execute R.
As with the definition of ^,
the system actually does more checking than this definition shows in
order to handle cases where P itself contains cuts.
Logically, the if-then operator is not necessary, since it could
be simulated with a construction of the form (P & Q | ^P & R).
However, this form is less efficient, since it executes P twice.
It could also give different results if P had side effects.
As an example of the use of if-then, consider the notin
predicate, which is the negation of member:
notin(X,L) <- ^member(X,L).
This rule says that X is not in the list L if it is false
that X is a member of L. With the cut operator, it is possible
to give a more basic definition of notin:
notin(X,nil).
notin(X,X.Tail) <- / & fail.
notin(X,Head.Tail) <- notin(X,Tail).
The first rule says that for any X, X is not in the empty list.
The second rule says that if X matches the head of the list, discard
the choice point and force an immediate failure. Then the third rule
says that if X is not equal to Head (since the choice point
must not have been discarded), check whether X is not in Tail.
Using the if-then operator leads to a clearer definition:
notin(X,nil).
notin(X,Head.Tail) <-
(X=Head -> fail; notin(X,Tail)).
The second rule says that if X equals the head, then fail;
else check whether X is not in the tail.
Repeated if-then operators can be used to simulate a case
statement in other languages. For example, the following rule
determines pet sizes:
pet_size(P,S) <- (P=mouse -> S=small;
P=cat -> S=normal;
P=dog -> S=normal;
P=horse -> S=big;
P=elephant -> S=enormous;
S=unknown).
This predicate, however, can be defined much more simply by just
a collection of facts:
pet_size(mouse, small).
pet_size(cat, normal).
pet_size(dog, normal).
pet_size(horse, big).
pet_size(elephant, enormous).
pet_size(*, unknown).
The case construction may be useful when the test predicate is complex,
but as this example illustrates, Prolog does an implicit case check in
deciding which rule to invoke. That implicit check is usually
more efficient than a case check inside the body of a rule.
However, the case construction might be useful to avoid problems
with backtracking. Suppose, for example, that P=dog, but
the predicate were followed by some other goal that might fail.
In that case,
backtracking might lead to pet_size(*,unknown),
which would give the size unknown for the dog. To avoid that
backtracking, cuts could be added to all the clauses but the last:
pet_size(mouse, small) <- /.
pet_size(cat, normal) <- /.
pet_size(dog, normal) <- /.
pet_size(horse, big) <- /.
pet_size(elephant, enormous) <- /.
pet_size(*, unknown).
Each of the conditions has just a single cut. That cut has no effect on
the success or failure of the clause in which it occurs, but it prevents
backtracking from leading to later clauses. However,
this proliferation of cuts is not desirable. They should be avoided by
using the if-then operator if backtracking might cause a problem.
Saving Computed Values
Unification binds values to variables, but backtracking unbinds the
old values before finding new ones. The automatic backtracking, which
can be powerful for certain operations, sometimes erases results that
are needed later. There are two basic ways of preserving a result to
keep it from being erased by backtracking:
- Cause some nonlogical side effect that backtracking cannot erase.
- Use the compute or setof predicates, which
accumulate a list of results.
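As a sketch of the second way (the argument order of setof shown here
is an assumption and may differ in IBM Prolog), a single goal can
gather every pair of siblings from the family database into one list:

```prolog
setof(X.Y, sibling(X,Y), Pairs).
```

Here Pairs is bound to a list of all pairs X.Y for which sibling(X,Y)
succeeds; since the entire list is delivered at once, no individual
result is erased by backtracking.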
To illustrate a side effect for preserving intermediate results,
consider the Fibonacci series 0, 1, 1, 2, 3, 5, 8, ..., where each
number is the sum of the previous two. Following is a recursive
predicate for computing the Nth number in the series:
fib(0,0).
fib(1,1).
fib(N,X) <- fib(? N - 1, X1) & fib(? N - 2, X2)
& X := X1 + X2.
The first two facts state the starting values for the series. Then
the general rule computes X by calling itself recursively to compute
the values for (N - 1) and (N - 2). Note the use
of the evaluation operator ? to evaluate the arguments before
calling fib. Since each
call to the general rule for fib makes two recursive calls,
the amount of computation increases rapidly. In fact, it increases
as the Fibonacci series itself: for N=30, X=832040.
In principle, the value
of X for N=30 should take only 29 additions; to make
832040 recursive calls is an enormous amount of overhead. One way to
improve performance is to add an extra argument to fib that
saves the previous value (see Exercise 19). Another way is to use
the built-in addax predicate to assert a new fact:
fib(0,0).
fib(1,1).
fib(N,X) <- fib(? N - 1, X1) & fib(? N - 2, X2)
& X := X1 + X2
& addax(fib(N,X),1).
The last line of this rule adds the new fact fib(N,X) to
the workspace. The second argument of addax is used here
to specify that
the new fact should be placed in position 1 of the
clauses for the predicate fib. Without that argument, the
new fact would be placed after all of the other clauses,
and the recursive rule would still be invoked.
To
compute fib(19, 4181), the first version took 108 milliseconds
on an IBM 3090, but the version with addax took
only one millisecond. Without addax, the Fibonacci series
could not be computed for values greater than N=19 without
increasing the amount of temporary stack space for Prolog goals. But
with addax, the execution time for N=100 was 6 ms.
Once all the values have been asserted by means of addax,
the answer to each future question can be found
in less than a millisecond.
This technique is useful for mixing computation with table look-up.
For more complicated problems, there may be no simple linear
algorithm, and addax can be valuable for avoiding
duplicate computations.
Game playing programs, for example, frequently reach the same positions
through multiple paths. They can often run much faster if they save
each position that has been evaluated.
This is a typical trade-off of space versus time: chess programs on large
machines save previous values, but programs on small machines do not
have the space.
The addax predicate has a variety of uses. For example,
the predicate married is supposed to be symmetric. But it may
happen that the fact married(zeus,hera) has been asserted,
but not its converse married(hera,zeus).
The following goal searches for all nonsymmetric pairs and asserts
the missing fact:
married(X,Y) & ^married(Y,X) &
addax(married(Y,X)) & fail.
The first subgoal married(X,Y) serves as a finder goal to
retrieve some married pair. If the converse has also been asserted,
the goal ^married(Y,X) fails and forces backtracking
to find another pair. If the converse has not been
asserted, addax asserts it, and fail forces
backtracking to find the next married pair.
The process continues until all married pairs have been checked.
Side effects can also accumulate a list or set of
results that are generated by some goal. Suppose, for example,
that one wanted to find all pairs of siblings in the family database.
The goal sibling(X,Y) would find the first pair; then
a semicolon from the keyboard would find the second pair. For each
pair, a semicolon would be necessary to force the inference engine to
backtrack and find another pair. Another way to force backtracking
is to use the built-in fail predicate, as in the following
goal:
sibling(X,Y) & write(X.Y) & fail.
First, sibling finds one pair of siblings, and write
prints out that pair. Then fail forces backtracking to the
most recent choice point. Since there are no choices for write,
backtracking continues to the redo entry for sibling and
searches through the list of facts for child to find another
pair who have a common parent. Then write prints out that
pair, and fail