John F. Sowa
Prolog can be a highly efficient language when its features are used to their best advantage. But the powerful backtracking facilities can lead to a combinatorial explosion when they are not used properly. This section discusses ways of avoiding inefficiencies and improving performance. There are two basic ways of improving performance: finding a better algorithm, and reordering the search.
The choice of algorithm is the most important factor in determining performance. A microprocessor running an efficient algorithm can outperform a supercomputer running an inefficient one. As an example, consider the problem of sorting a list in alphabetical order. Several different algorithms might be considered:
The first step in writing the permutation sort is to define
the To sort a list The The The bubble sort calls the predicate
The time for the bubble sort increases as the square of the
length N. For the list of 26 letters in QWERTY order, The
Sort programs rarely sort simple lists of numbers or character
strings. Usually, they sort larger records or structures where
one part is the key and the rest is some data associated with the
key. In Prolog, such structures can be represented as lists with
the head element as the key.
Note that the only places at which the
Finally, the most efficient sort is the one that relies on the
built-in In a large class of artificial intelligence problems,
values must be assigned
to a number of variables to satisfy certain constraints.
Exercise 16 on marching bands and Exercise 17 on assigning offices
are examples of constraint satisfaction problems. Other
examples include cryptarithmetic problems that assign
digits to letters to satisfy equations like SEND+MORE=MONEY or
DONALD+GERALD=ROBERT. A database query is an example of constraint
satisfaction where the body of the query states constraints on
the desired combinations of values that are selected from the database.
Whereas a database system merely selects pre-existing values,
an expert system for design may generate new values that meet given
constraints. The R1 computer configurator, for example,
assigns values to about 1,000 different variables in determining
a complete system configuration (McDermott 1982).
A common approach to solving such problems is
generate and test: pick some value for each design
variable, then test to see
if that combination of values satisfies the constraints.
If so, the problem is solved. If not, pick a new set of values,
and try again.
The This method is the epitome of the fixed-control approach:
it fits all problems; all constraints may be stated
in pure logic; and the control is clearly separated from the
logic. Its only drawback is inefficiency: for the R1 computer
configurator, there are about 1,000 variables to be defined with
an average of 3 possible values for each one.
The number of combinations
to be tested is 3&asterisk;&asterisk;1000 — an
array of supercomputers calculating since the beginning of the universe
could not finish this computation.
To reduce the combinatorial explosion, R1 uses flexible control.
The knowledge engineer explicitly adds control information
that determines the order of assigning values and testing constraints.
With such ordering, R1 can reduce the exponential term to a nearly
linear one. As a result, it finds a satisfactory assignment of
values to the 1,000 variables in about two minutes of CPU time.
The efficiency of R1 is a strong argument for a judicious blending
of logic and control. Yet it does not show whether the blend should
be determined by a person or by an optimizing
compiler.
To illustrate the issues, consider a simple cryptarithmetic problem
with 8 variables — in this case, the puzzle SEND+MORE=MONEY.
Each letter represents a single digit; no digit is repeated; and the
initial digits S and M must be positive.
(See Raphael (1976) for an analysis of the problem BEST+MADE=MASER,
which is the same problem with some variables renamed.)
Newell
and Simon (1972)
For the sample problem, the declarative information is derived
adrian -- minor change -- 3/2/90
from the statement SEND+MORE=MONEY. Let &C1; represent
a carry from the units position;
&C2;, a carry from the tens position;
&C3;, a carry from the hundreds position;
and &C4;, a carry from the thousands position.
Then the puzzle determines five constraint equations:
To solve the problem in Prolog, the first step is to define a
nondeterministic predicate Given the As the body of the solve rule is executed,
the first choice of values is M=0, S=1, O=2, E=3, N=4,
R=5, D=6, and Y=7. When the constraint M>0 is tested,
a failure occurs, and the system backs up to the most recent choice
point — in this case, the predicate,
This rule takes about 13 seconds of CPU time on an IBM 3090.
Although 13 seconds is not a long time for a complex problem,
this problem could be solved in just a few milliseconds
with a different ordering of the predicates in the rule.
But to decide how to reorder the predicates, a fair amount of analysis
is required, and the reordered rule is less readable than the one above.
Furthermore, for dynamically changing inputs, it may not be possible
to do that analysis in advance. The next section will show how
to speed up the execution by reordering the predicates by hand.
Chapter 3, on page &c3money;, will show how to achieve close to the same
performance automatically by using special features of IBM Prolog.
The ideal shape for a search tree is a straight line that leads
directly to a solution. The search tree that corresponds to
the above program is
a bush with 1,814,400 short branches, of which only one is correct.
For the SEND+MORE=MONEY problem, there is no way to convert the
search tree into the ideal, but it is possible to prune most of the
unnecessary branches and make it more tree-like than bush-like.
The optimization follows two general principles:
:ul
To illustrate these rules, apply them systematically to
optimize the SEND+MORE=MONEY problem. Of the five constraint
equations, the one with the fewest choices is
&C4=M.. Since M is
determined by the value of &C4;, the choice space is limited to the
two possible values of &C4;.
The further constraint &M>0. eliminates
the value 0 and completely determines the values &M=1. and &C4=1..
The rule for solve can therefore begin without any nondeterminism:
When &C4; and M have values, the next most
restricted constraint is
After these choices have been made, the next most restricted
constraint is
Now the most restricted constraint is
Finally, the last constraint to be tested is
Putting the pieces together produces the following Prolog rule.
Except for the first line of the body, which simplifies parts of the
previous rule, the rest of the rule is just a reordering of
the previous one. The improvement in performance, however,
is dramatic: it finds the correct solution in only 4 milliseconds
— over 3200 times faster.
The improved program is still a form of generate and test, but a more
distributed form. Instead of having a single block that generates all
combinations followed by another block that tests all constraints,
it repeatedly generates a few combinations and immediately
tests them.
Reordering the sequence of generating and testing improves performance
in three ways:
permute(nil,nil).
permute(L,H.T) <- pick(H,L,Rem) & permute(Rem,T).
The first line says that the only permutation of the empty
list
permute(a.b.c.nil,P).
At the first call to
psort(L1,L2) <- permute(L1,L2) & ordered(L2).
If
ordered(A.B.Tail) <- le(A,B) & ordered(B.Tail).
ordered(A.nil).
ordered(nil).
Instead of trying all possible permutations,
heuristics could guide the sort program by only selecting
ones that make some "improvement"; i.e each new trial must
be slightly better ordered than the previous one. A divide and
conquer approach would split the list in two parts, sort those,
and combine the results.
bsort(L,L) <- ordered(L).
bsort(L1,L2) <- bubble(L1,L) & bsort(L,L2).
Instead of using backtracking to generate
each permutation,
bubble(A.B.Tail, L) <-
(le(A,B) -> (bubble(B.Tail,T) & L=A.T);
(bubble(A.Tail,T) & L=B.T)).
bubble(A.nil, A.nil).
bubble(nil,nil).
The recursive rule checks whether A and B are in order.
If so, it calls itself to check
partition(M,H.L,H.L1,L2) <-
le(H,M) & partition(M,L,L1,L2).
partition(M,H.L,L1,H.L2) <-
gt(H,M) & partition(M,L,L1,L2).
partition(&asterisk;,nil,nil,nil).
The first two rules compare M to H, the head of the list. Depending
on the comparison, they put H in front of
qsort(nil,nil).
qsort(M.L,S) <- partition(M,L,L1,L2) &
qsort(L1,S1) & qsort(L2,S2) &
append(S1,M.S2,S).
The recursive rule takes the head of the input list M as the
partitioning element. Then all the elements no larger than M go
to
new_le(K1.&asterisk;,K2.&asterisk;) <- le(K1,K2).
new_gt(K1.&asterisk;,K2.&asterisk;) <- gt(K1,K2).
Suppose we make these changes, and then give the goal
ADRIAN 1
qsort((4 .door).(2 .shoe).(3 .tree).(1 .bun).nil,S).
Then
(1 .bun).(2 .shoe).(3 .tree).(4 .door).nil.
For this example, the key is a number, and the associated data
is a single word. The program would work just as well with
data of varying length, such as:
(4 .door.knob.lock.key.nil).(2 .shoe.sock.nil)
.(3 .tree.leaf.nil).(1 .bun.nil).nil.
In rearranging the lists, structure sharing versions of
Prolog move pointers rather than the
actual data. Therefore
sort(nil,nil).
sort(L1,L2) <- compute(set,X,member(X,L1),nil,L2).
The Generate and Test
D + E = Y + 10 &asterisk; C1
N + R + C1 = E + 10 &asterisk; C2
E + O + C2 = N + 10 &asterisk; C3
S + M + C3 = O + 10 &asterisk; C4
C4 = M
The carries
pick(X,X.TL,TL).
pick(X,HD.TL,HD.REM) <- pick(X,TL,REM).
Reordering the Generate and Test
M=1 & C4=1
& R9=(0 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . nil)
pick(S, R9, R8)
& gt(S, 0) & (C3=0 | C3=1)
& O := S + M + C3 - 10&asterisk;C4
& pick(O, R8, R7)
pick(E, R7, R6) & (C2=0 | C2=1)
& N := E + O + C2 - 10&asterisk;C3
& pick(N, R6, R5)
(C1=0 | C1=1)
& R := 10&asterisk;C2 + E - N - C1
& pick(R, R5, R4)
pick(D, R4, R3)
& Y := D + E - 10&asterisk;C1
& pick(Y, R3, R2).
solve(S.E.N.D, M.O.R.E, M.O.N.E.Y) <-
M=1 & C4=1
& R9=(0 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . nil)
& pick(S, R9, R8)
& gt(S, 0) & (C3=0 | C3=1)
& O := S + M + C3 - 10 &asterisk; C4
& pick(O, R8, R7)
& pick(E, R7, R6) & (C2=0 | C2=1)
& N := E + O + C2 - 10 &asterisk; C3
& pick(N, R6, R5)
& (C1=0 | C1=1)
& R := 10 &asterisk; C2 + E - N - C1
& pick(R, R5, R4)
& pick(D, R4, R3)
& Y := D + E - 10 &asterisk; C1
& pick(Y, R3, R2).
Observations on the Method