Logic Programming
=========================================================================
(adapted from lecture notes by Henri Casanova and Todd Millstein)

=============================== lecture 1 ===============================

------------------------------ introduction -----------------------------

We now turn to a brand new paradigm, called "Logic Programming Languages" --
our vehicle for studying this paradigm is Prolog, whose roots are in logic
and automated theorem proving. Prolog was developed in the 1970s for AI
applications. Some such applications have a knowledge base (or database) of
facts, from which you'd like to ask queries and deduce other facts. For
example, given facts in the knowledge base like "Carnitas is Mexican" and
"Mexican food is delicious", we can deduce "Carnitas is delicious".

Prolog probably looks nothing like any language you've seen before. The
fundamental difference between Prolog and most other languages: you don't
run a Prolog program. Instead, you ask questions, and the system attempts to
answer them using the rules and facts that it has been given.

Logic programs are "declarative": you write a specification of the desired
results, rather than how to obtain them. This approach is very good at
expressing problems that involve searching a large space of possibilities.
For example, given a list of cities and the distances between them, find the
shortest route that passes through each city exactly once (the travelling
salesman problem). The philosophy of this approach is that it is often hard
to specify a search algorithm -- and in such cases, it is easier to specify
the characteristics of the solution. To do so, you specify *facts* and
*rules* for deducing new facts from old facts, and then a *query*. So you
just state what is true and then ask what (else) is true. The language
implementation figures out how to actually compute appropriate solutions.
Using the travelling salesman problem as an illustration: I state the
constraints, not how to search the space of possible solutions. Of course,
as we will see, this is a simplification, and often, for reasons of
efficiency, one has to impose constraints on the search.

The original and principal applications of Prolog are in various AI settings
such as expert databases. Examples include using Prolog-based databases to
determine when credit card fraud has occurred (Prolog is used to specify
rules that indicate when a fraud occurs), and there are projects afoot to
use ideas from Prolog to flag suspicious people from phone/communication
patterns. Note that currently AI researchers devise statistical techniques
to complement (if not replace) logical approaches for such tasks.

Another big application is as a database query language. E.g., my facts are
things like the daily stock prices of various stocks over the last year.
Queries can be things like: find me all pairs of stocks that had the same
price on the same day at least 50 times this year. Standard query languages
like SQL are inspired by, and correspond closely to, restricted subsets of
this kind of logic programming.

Of course, the reason we're studying it is that it's a radically different
way of thinking about computation: "programming as proving". You'll be
surprised at how many places this paradigm fits beautifully, or leads to
very elegant and readable systems.

--------------------------- Terms --------------------------------------

Prolog is based on (a subset of) first-order predicate logic. The entire
program comprises three kinds of elements: facts, rules, and queries. The
basic unit of each of these is the term.
Terms are Prolog's way of encoding data. They are very similar to the values
of datatypes created in ML. There are three kinds of terms: constants,
variables and compound terms.

1. Constants: The simplest kind of terms are constants.
   - integers and reals: are constants.
   - atoms: are identifiers starting with a lowercase letter. For example:
     alice, bob, charlie are all atoms. Atoms are NOT variables -- the way
     to think of them is as TAGS, or special constants, or elements of a
     giant enum datatype. They are similar to the tags used in ML datatypes:

        type person = Alice | Bob | Charlie ...

     Only in ML, the tags start with a capital letter. Atoms are
     *uninterpreted constants*: nothing is known about each tag except that
     it is equal to itself. Intuitively, Prolog knows that:

        alice = alice

     as the tags are the same. However, to it, alice = bob *never* makes
     sense, as it has a rather particular notion of equality, which we will
     see shortly. There are some built-in atoms, such as "[]", which
     signifies the empty list, and ".", which is used for list construction
     (cons), that we will see shortly.

2. Variables: Any identifier beginning with an upper case letter or an
   underscore is a variable. For example, X, Y, Head, Tail, Alfred are all
   variables. The variable "_" is like a "wildcard" variable, whose meaning
   is similar to the "_" in ML, but more on that when we see what a variable
   means. As we shall see:
     -- x = a is nonsense (as x and a are two *different* constants/tags).
     -- X = a has some sense (but isn't at all what one might think...).

   WARNING: Uppercase/lowercase is a common source of error! Also, variables
   are NOT declared before use (so be careful!).

3. Compound Terms: These are terms of the form: atom(term,term,term,...)
   where each "term" is either a constant or a variable or a compound term.
   Examples include:

        x(y,z)
        parent(alice,bob).     % here alice, bob are atoms
        parent(alice,Child).   % here alice is an atom, Child is a variable

In other words, (compound) terms are generated by the following grammar:

    atom     := [a-z][A-Za-z0-9_]*
    number   := [0-9]+ | ...
    variable := [A-Z_][A-Za-z0-9_]*
    term     := atom | number | variable | atom(term,term,term,...)

While you may be tempted to think of compound terms like: parent(alice,bob)
as function calls, they are NOT! Instead, you should think about them in the
same way as we thought of the recursive, one-of datatypes in ML:

    type term = Atom of string | Var of string | Compound of string * term list

Thus, parent(alice,bob) in Prolog is equivalent to the ML value:

    Compound ("parent", [Atom "alice"; Atom "bob"])

or equivalently represented as a tree:

        parent
        /    \
    alice    bob

and parent(alice, Charlie) in Prolog is "equivalent to" the ML value:

    Compound ("parent", [Atom "alice"; Var "Charlie"])

or represented as a tree:

        parent
        /    \
    alice    Var
              |
           Charlie

Consider a term: factorial(5). It is NOT a function (despite what it looks
like) -- there is no associated function implementation. Prolog has no idea
of the meaning you intend for this term -- to it, this is just a box
containing "5" with a label "factorial". Another way to view it is as a tree:

    factorial
        |
        5

Thus, the only thing Prolog knows is that: factorial(5) = factorial(5),
i.e. the two terms are the same. In particular, to Prolog, factorial(5) = 120
is NOT true. Thus Prolog compound terms are really just structured data --
like values of a datatype in ML.
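To make the "terms are just data" point concrete, here is a short session
sketch (the exact prompt/answer formatting depends on the Prolog
implementation, e.g. SWI-Prolog, but the outcomes are as shown):

    ?- factorial(5) = factorial(5).    % identical terms: provable
    True
    ?- factorial(5) = 120.             % different terms: not provable
    False
    ?- alice = alice.                  % an atom equals itself
    True
    ?- alice = bob.                    % two distinct atoms never match
    False

Nothing here is "evaluated" -- Prolog only compares the structure of the
terms; what '=' really means is explained in the Unification section below.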
Can also think of atoms that begin a compound term as *uninterpreted
functions*: e.g., factorial is a function about which NOTHING is known,
except that the result of applying this function to the atom x is the term
"factorial(x)". These atoms are called *function symbols*.

-------------------------------------- Facts -------------------------------

A fact is just a term, typically without any variables. You specify a fact
by writing a term followed by a '.'. For example, here are a few facts that
one might have in the system.

    % List of parent relationships
    parent(kim,holly).
    parent(margaret,kim).
    parent(herbert,margaret).
    parent(john,kim).
    parent(felix,john).
    parent(albert,felix).

Note that kim, holly, margaret, herbert, john, felix, albert are all atoms.
The Prolog interpreter maintains a collection of facts like the above --
think of it as the underlying data in the database over which queries will
be asked. You can define your own facts and add them to the database.

The function symbols beginning a fact are called *predicates*: intuitively,
they represent functions that evaluate to a boolean. Thus, (the atom)
"parent" is a predicate that, intuitively, takes two arguments and returns a
boolean -- we say that "parent" is a predicate of arity 2. The key thing to
note is that predicates have _no intrinsic meaning_. However, they are
generally designed and named so that the programmer can easily "interpret"
them. For example, as a programmer, I will decide that parent(X,Y) means
that 'X' is a parent of 'Y'. In other words, I will specify the fact:

    parent(a,b).

only if the person corresponding to atom a is a parent of the person
corresponding to atom b. Thus, the predicate is interpreted as a logical
relation between X and Y. Prolog uses these facts to answer queries, as well
as to infer new facts. Let's see how it does the first.

-------------------------- Running Prolog : Queries --------------------------

The standard interface to Prolog is an interactive shell. To run it, first,
let's put a bunch of facts into a file, and then load the file into the
shell. Suppose the list of facts is stored in a file called "facts.pl".
First, we start Prolog, and get the shell prompt:

    Prolog prompt: "?-"

At this prompt, enter something like:

    ?- consult('facts.pl').

Alternatively, you can manually add the facts one by one by typing at the
prompt:

    ?- assert(parent(margaret,kim)).

(or whatever fact you want to insert). Once this is done, the facts have
been registered with the shell, and we can query Prolog as follows:

    1. Prolog prompts you to type a query
    2. You type a query
    3. Prolog tries to prove your query
    4. It prints out the result (or 'failure')
    5. Repeat

The simplest query is a term followed by a '.' (it looks like a fact, but is
just typed at the prompt). For example, suppose you type the following query:

    ?- parent(margaret, john).

The meaning of a query is "is this fact in your database, or can it be
inferred from your database?". In other words, we are asking Prolog if it
can PROVE the fact. Prolog replies:

    False

as this is not one of the facts (we have not yet given it any rules to infer
new facts). Instead, if we were to ask:

    ?- parent(margaret,kim).

Prolog replies:

    True

as this is in the database of facts we fed in. (Tip: if you forget the
period, you can type it on the next line.)

Not bad, but not especially exciting -- we gave it a bunch of facts, and
each query is effectively asking whether the query term was one of the facts
we supplied. Things get more fun when we toss variables into the queries.
    ?- parent(margaret,X).

This is where Prolog departs radically from other paradigms. The meaning of
this query is: "What value(s) can you plug in for X such that the fact
becomes provable for that value?" Prolog replies:

    X = kim        [press enter if you're satisfied]
    True

This means that it can plug in "kim" for X, and thus, it can deduce the
fact: parent(margaret,kim). Instead, you can enter the query:

    ?- parent(X,kim).

This asks Prolog: for what values of X does the fact parent( _ , kim) hold?
In other words, this innocent query is asking Prolog -- who are the (known)
parents of kim? It replies:

    X = margaret ;    [press ';' if you want another answer]
    X = john ;
    False

Thus, it returns, one-by-one, all the "solutions" for X that make the fact
parent(X,kim) provable. We can make both the parameters variables:

    ?- parent(X,Y).

This asks Prolog -- what are the pairs X,Y such that X is (provably) the
parent of Y? It responds:

    X = kim
    Y = holly ;
    X = margaret
    Y = kim ;
    ...

Suppose you want to know if there are any strange circularities in your fact
database -- for example, does there exist any person who is their own
parent? The following query does the trick:

    ?- parent(X,X).
    False

------------------------------- Unification -----------------------------------

In most other languages, a function designed to look up things like this
would be less flexible -- it would require a tedious parentOf() method or a
childOf() method, loops, etc. With Prolog, the queries are very flexible,
and work like magic. What's going on? Turns out that Prolog's computational
heart is a fancy pattern matching technique (also, btw, at the heart of ML's
type system) borrowed from logicians, called "Unification". Regrettably, we
won't have time to go into the details of unification -- and so, let's
content ourselves with a cartoon version.

Intuitively, two terms can be unified if there is a way of assigning values
to their variables so that the terms become identical. This is really what
"=" means in Prolog -- when you ask it:

    ?- t1 = t2.

for any two terms t1 and t2, you are asking whether the terms t1 and t2 can
be unified. So, if you ask:

    ?- foo(bar) = foo(bar).
    True

because there are no variables and the terms are the same. If instead you ask:

    ?- foo(X) = foo(bar).

it replies:

    X = bar
    True

meaning that foo(X) can be unified with foo(bar) by assigning the variable X
to the term "bar". Note that we can ask this of Prolog without "declaring"
any of the above atoms. This is because everything is symbolic -- everything
is a term, which is an arbitrary notation that can encode whatever concept.
One can type the above right after starting the interpreter.

The more interesting case is when there are several variables in the terms:

    ?- p(X,dog) = p(cat,Y).
    X = cat
    Y = dog
    True

meaning that one can unify the terms p(X,dog) and p(cat,Y) by assigning
appropriate values to X and Y. However, if we were to ask:

    ?- p(cat) = p(dog).
    False

as the terms are different. And if we ask:

    ?- p(X,dog,X) = p(cat,Y,Y).
    False

is the answer: to unify, X must be "cat" and Y must be "dog", but then the
last arguments (X and Y) can never be made equal. Similarly, the unification
happens "deep" inside the terms:

    ?- a(W,foo(W,Y),Y) = a(2,foo(X,3),Z).
    W = 2
    X = 2
    Y = 3
    Z = 3
    True

Intuitively, it first matches up the first position, and so W gets 2; next,
it tries to match up the second position -- i.e. foo(W,Y) with foo(X,3).
Now, W is already 2, so X also gets 2, and Y gets 3.
Finally, it tries to match up the last position: Y is 3, and so Z gets 3.
Instead, the query:

    ?- a(W,foo(W,Y),Y) = a(2,foo(X,3),X).
    False

By using each of W, X, Y in two places, we are forcing it to find a solution
where those two places get exactly the same value. Chained together, these
constraints require W, X and Y to all get the same value. However, W must
get 2 and Y must get 3, and so there is no solution.

Thus, this innocent "pattern matching" operation actually does a lot under
the hood, and it turns out to be a surprisingly powerful and flexible way to
encode all kinds of computation! All the queries that we asked before were
answered via unification. When we ask:

    ?- parent(margaret, john).

Prolog checks if the term "parent(margaret,john)" can be unified with any of
the known facts (which are also terms). If so, it says True; as it cannot,
it replies False. When we ask:

    ?- parent(margaret,kim).

it can unify the query term with a known fact term, and so it replies True.
When we ask:

    ?- parent(margaret,X).

it tries to find all the known facts with which it can unify the query term
-- there is only one, and so it "answers the query" by returning the
substitution required for unification:

    X = kim

Similarly, when we ask:

    ?- parent(X,kim).

it attempts to find all the known terms with which this query can be unified
-- this time, there are several, and the different valid unifying
substitutions (called unifiers) yield the different parents of kim:

    X = margaret ;
    X = john ;
    False

Finally, to answer the query:

    ?- parent(X,Y).

it attempts to unify the query term with all known facts -- and the list of
resulting unifiers is exactly the set of known parent-child pairs.

------------------------------- Conjunction ------------------------------

Often, it is more useful to ask questions about several terms at once. For
example, to determine if margaret is holly's grandparent, we would like to
find out if there is some person who is both the child of margaret AND the
parent of holly. To do so, we can issue a conjunctive query, which is a list
of terms separated by commas, as follows:

    ?- parent(margaret, X), parent(X, holly).

To answer this query, Prolog attempts to find an X such that
parent(margaret,X) unifies with the set of known facts, AND parent(X,holly)
unifies with the set of known facts. Upon finding a unifier that works, it
replies:

    X = kim
    True

Thus, as kim is the intermediate parent, we can conclude that margaret is
indeed a grandparent of holly. Finally, consider the following query:

    ?- parent(X,Y), parent(Y,Z), parent(Z,kim).

It asks if there are X,Y,Z such that X is Y's parent, Y is Z's parent and Z
is kim's parent. In other words, the query determines if kim has any
"great-grandparent". Upon finding appropriate unifiers, Prolog replies:

    X = albert
    Y = felix
    Z = john
    True

------------------------------- Rules ------------------------------------

The above is quite nifty -- it allows us to quickly mine the database to
find interesting relationships. However, it gets somewhat cumbersome, as we
have to devise a complex conjunctive query every time. Instead, it would be
nice if we could define complex queries out of simpler queries. Rules serve
exactly that purpose. They allow us to specify complex queries (i.e.
predicates) using simpler ones. In general, the format of a rule is:

    head :- condition1, condition2, condition3, ...

Intuitively, it means that the "head" query is true if condition1,
condition2, condition3, ... are all true.
In other words, it tells Prolog: to prove the head query, prove conditions
1, 2, 3, ... For example, suppose we'd like to define a grandparent
relationship (predicate). We do so as:

    grandparent(GP,GC) :- parent(GP,P), parent(P,GC).

Intuitively this states: "GP is a grandparent of GC if GP is a parent of P
AND P is a parent of GC". With this definition, we can now issue the
following query:

    ?- grandparent(X,kim).

and Prolog responds with:

    X = herbert
    True

as it can find that parent(herbert,margaret) and parent(margaret,kim), and
therefore, applying the rule, grandparent(herbert,kim), with which the query
term unifies. We can use this predicate to write more predicates:

    greatgrandparent(GGP,GGC) :- parent(GGP,GP), grandparent(GP,GGC).

We can now issue the query:

    ?- greatgrandparent(X,holly).
    X = herbert
    True

Program = Facts + Rules:
------------------------
Facts and rules are the two kinds of "clauses" (intuitively, a fact is just
a rule without any conditions). Thus, a Prolog program is a set of clauses
-- partitioned into a database of facts and a set of rules for inferring new
facts.

Scope:
------
Notice that the same variable P appears twice in the grandparent rule.
Indeed, one may be tempted to reuse P across several rules. In Prolog, the
scope of a variable is the clause (rule) that contains it. Thus, there is no
connection whatsoever between variables across clauses. For example,
consider the two clauses:

    foo(P)   :- bar(P).     % There is no connection between P in
    stuff(P) :- thing(P).   % the 2 clauses.

In other words, there are no global variables; all variables are local to
the individual clauses.

------------ Multiple Clauses = Disjunction and Recursion ---------------

Suppose we want to define a predicate that is true for all those persons
that have some family -- that is, those persons who have either a parent OR
a child. We can do so as follows:

    has_family(X) :- parent(X,_).
    has_family(X) :- parent(_,X).

If we have multiple rules for the same predicate, effectively we are
specifying a disjunction. The first rule says: "X has a family if there is
some _ such that X is the parent of _". The second rule says: "X has a
family if there is some _ such that _ is the parent of X". If either of
these clauses fires, then has_family(X) becomes true. As in ML, the symbol
"_" represents a "wildcard" or don't-care variable that we will use only in
one place, and so we do not bother to name it. Thus,

    ?- has_family(holly).
    True

as the second rule fires for holly. Meanwhile,

    ?- has_family(mugatu).
    False

as neither rule fires for mugatu. For those of you who are economical with
keystrokes, there is another way to specify disjunctive rules -- via a
semicolon:

    has_family(X) :- parent(X,_) ; parent(_,X).

Suppose you want to specify a predicate ancestor(X,Y) which is true if X is
an ancestor of Y, i.e. if by following the parent relationship from Y one
eventually reaches X. Intuitively, X is an ancestor of Y either if:

    1. X is the parent of Y, or,
    2. Z is the parent of Y, and X is an ancestor of Z.

Thus, we can specify this predicate recursively as follows:

    ancestor(X,Y) :- parent(X,Y).                 % [Base case]
    ancestor(X,Y) :- parent(Z,Y), ancestor(X,Z).  % [Recursive case]

This works quite niftily:

    ?- ancestor(kim,X).
    X = holly ;
    False

i.e. holly is the only "descendant" of kim, and:

    ?- ancestor(X,kim).
    X = margaret ;
    X = john ;
    X = herbert ;
    X = felix ;
    X = albert ;
    False

i.e. kim has a long ancestry.
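For reference, here is one way to collect the clauses of this running
example into a single file (the file name family.pl is just an
illustration), so that everything above can be reproduced with
?- consult('family.pl').

    % family.pl -- the facts
    parent(kim,holly).
    parent(margaret,kim).
    parent(herbert,margaret).
    parent(john,kim).
    parent(felix,john).
    parent(albert,felix).

    % the rules built on top of the facts
    grandparent(GP,GC)        :- parent(GP,P), parent(P,GC).
    greatgrandparent(GGP,GGC) :- parent(GGP,GP), grandparent(GP,GGC).

    has_family(X) :- parent(X,_).
    has_family(X) :- parent(_,X).

    ancestor(X,Y) :- parent(X,Y).                 % base case
    ancestor(X,Y) :- parent(Z,Y), ancestor(X,Z).  % recursive case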
----------------------------- Backtracking Search ---------------------------

At this point, it's worth looking into how exactly Prolog pulls off this
trick of answering queries in this manner, as it has its limits, which one
needs to know in order to phrase queries appropriately. Turns out, there's
no real magic -- just brute force "proof" search. We can view each clause as
a "proof rule":

    goal :- subgoal_1, subgoal_2, ...

Thus, the rules for ancestor are as follows:

    ancestor(X,Y) :- parent(X,Y).                 % rule 1
    ancestor(X,Y) :- parent(Z,Y), ancestor(X,Z).  % rule 2

To Prolog, these rules mean the following -- to prove ancestor(X,Y), try to:

    1. prove the subgoal parent(X,Y), or, failing that,
    2. prove the subgoal parent(Z,Y), and then the subgoal ancestor(X,Z).

Thus, suppose we ask it the query:

    ?- ancestor(felix,holly).

To prove this query, it undertakes the following backtracking search:

    ancestor(felix,holly)?
      rule 1: parent(felix,holly)?                           NO
      rule 2: parent(Z,holly)?   Z = kim (by fact)
              ancestor(felix,kim)?
                rule 1: parent(felix,kim)?                   NO
                rule 2: parent(Z',kim)?   Z' = margaret ; Z' = john
                  Z' = margaret:
                    ancestor(felix,margaret)?
                      rule 1: parent(felix,margaret)?        NO
                      rule 2: parent(Z'',margaret)?  Z'' = herbert
                              ancestor(felix,herbert)?
                                rule 1: parent(felix,herbert)?   NO
                                rule 2: parent(Z''',herbert)?    NO
                  Z' = john:
                    ancestor(felix,john)?
                      rule 1: parent(felix,john)?            YES

Thus, it first tries the base rule, i.e. to prove the subgoal
parent(felix,holly). As it cannot unify this query with any known fact, it
fails (NO), and so it backtracks and tries the other, recursive rule. The
only Z such that parent(Z,holly) unifies with a known fact is Z=kim; thus,
it tries to prove the second subgoal ancestor(felix,kim). To do so, again it
first applies the base rule, which fails, and so it backtracks and applies
the recursive rule. This time, there are two different Z (written Z' in the
figure to distinguish them from the upper part of the tree) such that
parent(Z,kim) holds -- namely Z=margaret and Z=john. It picks margaret first
(as that is the first unification that succeeds), and tries to prove the
second subgoal ancestor(felix,margaret). As we can see from the subtree,
this search fails (as margaret's sole recorded parent is herbert, who has no
recorded parent). Thus, Prolog backtracks and tries the second choice,
Z=john, and tries to prove the second subgoal, ancestor(felix,john). This
time, the base rule works, as parent(felix,john) is a known fact, and thus
the proof search succeeds and Prolog returns:

    ?- ancestor(felix,holly).
    True

This same process is repeated for any query. When there is a variable in
the query, e.g.

    ?- ancestor(X,kim).

Prolog attempts the proof search and returns all the unifiers for X for
which the proof succeeds. Thus, Prolog is literally programming by proving.

Hint: Trace mode in Prolog shows the tree:

    ?- trace.

The subsequent query is traced; use the on-line help on the ACS Prolog
interpreter:

    ?- help(trace).

Order Matters:
--------------
The rub is that the order of the clauses and terms greatly influences the
order in which the unification and backtracking happen. This is because, to
prove a particular goal query, the different clauses are selected in order,
and further, within each clause, the subgoals are selected from
left-to-right, which affects how the unification happens. In the above
example, if we had entered the branch Z'=john before the branch Z'=margaret,
then we would have proven the query faster.
Similarly, if we had swapped the order of the conjuncts (subgoals) in the
recursive clause, we would have a rather different tree (try it as an
exercise). Thus, order matters for performance. Hint: try simple things
first! More importantly, there are cases where the program may not even work
(may not terminate), depending on the order:

    ancestor(X,Y) :- ancestor(X,Z), parent(Z,Y).
    ancestor(X,Y) :- parent(X,Y).

Now let's try the same query:

    ?- ancestor(felix,holly).
    ERROR: Out of local stack

Why? Well, if you try to build the search tree, you'll see it goes on forever:

    ancestor(felix,holly)?
        |
    ancestor(felix,Z)      % prove first subgoal, then parent(Z,holly)
        |
    ancestor(felix,Z')     % prove first subgoal, then parent(Z',Z)
        |
    ancestor(felix,Z'')
        .
        .
        .

So, to avoid this, we must place the parent subgoal first (in the recursive
rule). If this is done, unification with the base facts (parent) fixes the
possible unifiers for Z, thereby guaranteeing termination.

Let's see another example. Suppose we want to define a sibling predicate,
where sibling(X,Y) holds if X and Y have the same parent. How about:

    sibling(X,Y) :- parent(P,X), parent(P,Y).

Almost:

    ?- sibling(kim,kim).
    True

Ah, we have to ensure that X and Y are not the same. Ok, how about:

    sibling(X,Y) :- not(X=Y), parent(P,X), parent(P,Y).

Surely this works? Nope. The reason is Prolog's semantics of equality (i.e.
unification). This clause is read by Prolog as: first, find X,Y such that X
cannot be unified with Y; then find a P such that parent(P,X) and
parent(P,Y). Now the catch is that, to process the first subgoal, Prolog
finds it can always unify two unconstrained variables X and Y, by simply
assigning X to Y!

    ?- X=Y.
    X = Y
    True

    ?- not(X=Y).
    False

Thus, the very first subgoal always fails, thereby ensuring that:

    ?- sibling(X,Y).
    False

Thus, to get the rule right, we must make sure that the goal that ensures
that X and Y are not the same is fired AFTER X and Y have been unified with
appropriate atoms. We can do so by simply placing that subgoal at the end:

    sibling(X,Y) :- parent(P,X), parent(P,Y), not(X=Y).

    ?- sibling(X,Y).
    X = john
    Y = maya ;
    X = felix
    Y = dana ;
    X = dana
    Y = felix ;
    X = maya
    Y = john ;
    False

(the output above assumes a database with a few more parent facts than the
small one given earlier, so that some siblings actually exist).

This shows a major weakness: you can't just rely on the logical meaning --
you sort of need to know how things work. Oh well, nothing's perfect. We'll
see that many, many things break down the pure philosophy that says: "Just
write what you need logically".

---------------------------- Numeric Computation --------------------------

Although Prolog is mostly symbolic, there is a need for numeric computation.

- '=' is the unification operator:

    ?- X = 2+3.
    X = 2+3
    True

- 'is' evaluates arithmetic expressions before doing unification:

    ?- X is 2+3.
    X = 5
    True

When Prolog tries to solve an "is" goal, it evaluates the second argument
and then unifies, as opposed to "=", which just does the unification.

    ?- Y is X+2, X=1.
    ERROR: Arguments are not sufficiently instantiated

    ?- X=1, Y is X+2.
    X = 1
    Y = 3
    True

Again, order of evaluation matters!

Functions are Predicates:
-------------------------
Let's try to write a factorial function in Prolog. We need to somehow encode
functions as predicates. Here's the deal: whenever you have a function f(x),
you can write a predicate pred_f(X,Y) that captures the behavior of f by
being true for all those pairs X,Y where Y is f(X). Thus, we can write a
predicate capturing the input/output relationship of the factorial function
-- i.e.
a predicate factorial(X,Y), that is true for those pairs X,Y where Y is the
factorial of X:

    factorial(0,1).                                  % base case
    factorial(X,N) :- X > 0, X1 is X-1,
                      factorial(X1,N1), N is X*N1.   % recursive case

We "call" the function with a query:

    ?- factorial(0,X).
    X = 1
    True

    ?- factorial(5,X).
    X = 120

---------------------------- Data Structures: Lists --------------------------------

Let us now see how we can encode lists in Prolog. Again, it's useful to
recall how lists were encoded as a datatype in ML.

    -- There is a "base atom": [] denoting the empty list.
    -- There is a "cons"tructor, written | (just different syntax for cons).

Thus, ML's list Cons(1,Cons(2,Cons(3,Nil))) is equivalent to the Prolog term:

    [1|[2|[3|[]]]]

where (1) | is Cons, and (2) [] is Nil. Also, as they are heavily used,
Prolog lets you write the above term as: [1,2,3].

    ?- [1,2,3] = [1|[2|[3|[]]]].
    True

To "deconstruct" a list into head and tail, we use pattern matching (very
much like in ML). So: [X|Y] unifies with any non-empty list (like h::t),
with X unified to the first element (head) and Y unified to the rest of the
list (tail).

    ?- [X|Y] = [1,2,3,4,5].
    X = 1
    Y = [2,3,4,5]
    True

    ?- [X|Y] = [1].
    X = 1
    Y = []
    True

    ?- [X|Y] = [].
    False

[1|Y] unifies with any list starting with 1 (like 1::t), with Y unified to
the rest of the list. However, Prolog also lets you write: [1,2|X], which
unifies with any list that starts with 1 and then 2.

    ?- [1,2|X] = [1,2,3,4,5].
    X = [3,4,5]
    True

    ?- [1,2|X] = [1,2].
    X = []
    True

    ?- [1,2|X] = [1,3].
    False

One can place variables wherever one likes in the term, so:

    ?- [X,Y|Z] = [1,2,3].
    X = 1
    Y = 2
    Z = [3]

Ok -- how do we do interesting things with lists? For example, how might we
"append" two lists? Well, there is no "concatenation" function that sticks
lists together. In Prolog you write a predicate:

    append(X,Y,Z)

which is true if Z is the result of appending the lists X and Y. Turns out
such a predicate is built in, so let's see what it does.

    ?- append([1,2],[3,4],Z).
    Z = [1,2,3,4]
    True

It simply "solves" for the right Z that happens to be the result of
appending [1,2] and [3,4]. But wait, predicates can do more:

    ?- append(X,[3,4],[1,2,3,4]).
    X = [1,2]

Whoa! Backwards computation -- which X is such that when appended to [3,4]
you get [1,2,3,4]? And now, the full power of multiple solutions:

    ?- append(X,Y,[1,2,3]).
    X = []
    Y = [1,2,3] ;
    X = [1]
    Y = [2,3] ;
    X = [1,2]
    Y = [3] ;
    X = [1,2,3]
    Y = []
    True

Try doing that in another language. There are several such built-in
predicates (reverse, sort, append, ...), but let's try to roll our own. How
would you write append(X,Y,Z) in Prolog? Well, the base case is that if X is
empty, then Z is just Y:

    append([], Y, Y).                               % base case

The recursive case is when X is of the form [H|Tx], in which case Z must
begin with H, and the tail of Z is obtained by appending Tx to Y:

    append([H|Tx], Y, [H|Tz]) :- append(Tx,Y,Tz).   % recursive case

A very different way of thinking than in imperative languages. Let's see
what happens with the query:

    ?- append([1],[2],Z).

    Prolog tries to prove the term append([1],[2],Z)
      ---> recursive case fires:
             H unifies with 1
             Tx unifies with []
             Y unifies with [2]
             Z unifies with [1|Tz]
      ---> prove: append(Tx,Y,Tz), i.e. prove: append([], [2], Tz)
             base case fires:
      ---> Tz unifies with [2]
      ---> therefore: Z = [1,2]

But, because of the magic of pattern matching, proving and predicates, you
get the backwards computations by the same "proving" process!

Let's do a few more. Let's write a predicate tailof(X,Y) which is true if Y
is the tail of the list X:

    tailof([_|X],X).

Again, note the judicious use of the wildcard "_".
If you actually named the variable there, e.g.

    tailof([H|X],X).

the interpreter would warn you that you named a variable but used it only
once -- a "singleton variable" warning.

Let's write a predicate which is true of lists with three or more elements:

    has3orMoreElements([_,_,_|_]).

What does this predicate do?

    foo([X,_,_,_,X|_]).

One more tricky one. Let's write a predicate isin(X,L) which is true if X is
an element of the list L. How?

    base case: X is the first element of the list L.

        isin(X,[X|_]).

    recursive case: X appears in the tail of the list L.

        isin(X,[_|T]) :- isin(X,T).

Let's give it a spin:

    ?- isin(2,[1,2,3]).
    True

    ?- isin(X,[1,2]).
    X = 1 ;
    X = 2 ;
    False

    ?- isin(1,[2,3]).
    False

Let's write another predicate: mylength(L,X), which is true if X is the
length of list L.

    mylength([],0).
    mylength([_|Tail],Len) :- mylength(Tail,TailLen), Len is TailLen + 1.

    ?- mylength([1,2],L).

    does not unify with mylength([],0)
    unifies with mylength([_|Tail],Len) with the bindings:
        Tail = [2] and Len = L
    now I need to prove the two things:
        mylength([2],TailLen)  and  Len is TailLen + 1
    can I prove the first one?
        mylength([2],TailLen) does not unify with mylength([],0)
        mylength([2],TailLen) unifies with mylength([_|Tail'],Len') with the
        bindings: Tail' = [] and Len' = TailLen
        now I need to prove the two conditions:
            mylength([],TailLen'')  and  Len' is TailLen'' + 1
        can I prove the first one?
            mylength([],TailLen'') unifies with mylength([],0) with the
            binding: TailLen'' = 0
        Len' is TailLen'' + 1 then leads to the binding Len' = 1
        therefore TailLen is equal to Len', and thus to 1
    therefore Len is TailLen + 1, and thus 2
    therefore L is equal to Len, and thus to 2

    Prolog answers L = 2.

=============================== Lecture 2 ===============================

-------------------------- Accumulators ---------------------------

- There are cases in which you want to add an argument to a predicate just
  to keep track of useful information.

Example: List Reverse:
----------------------
We will now write a predicate rev(X,Y) that is true if the list Y is the
reverse of the list X. To do so, we will use an accumulator that tracks the
elements seen so far in X.

    rev(X,Y) :- acc_rev(X,Y,[]).

The third parameter is the "accumulator". We will "push" elements into it in
the order they appear in X. Thus, when we have pushed all the elements, the
third parameter is the "reversed" version of X.

    acc_rev([],Y,Y).                                  % base case
    acc_rev([H|T],Y,SoFar) :- acc_rev(T,Y,[H|SoFar]). % recursive case

    ?- rev([1,2,3],Y).
    Y = [3,2,1]
    Yes

    ?- rev(X,[3,2,1]).
    X = [1,2,3]
    Yes

The nice thing about predicates is that one can go forwards or backwards!
It's completely symmetric...

Example: Finding all solutions:
-------------------------------
Suppose we have a predicate foo(X), defined:

    foo(a).
    foo(b).

and say we want to find all the terms X such that foo(X) is true. We will
use an accumulator to define a predicate allfoos(L) that is true for a list
of terms in which every term satisfies foo (and no term is repeated).

    allfoos(L) :- listallfoos(L,[]).

listallfoos is a helper predicate, whose second argument is an accumulator
that "tracks" which terms satisfying foo are already known. We shall then
"add" those terms that satisfy foo, but are not yet in the "accumulator".

    % recursive case
    listallfoos([X|L],SoFar) :- foo(X), not(isin(X,SoFar)),
                                append(SoFar,[X],NewSoFar),
                                listallfoos(L,NewSoFar).
    % base case
    listallfoos([],_).

    ?- allfoos(A).

    must prove listallfoos(A,[])
    unifies with listallfoos([X|L],[])
    must prove four things:
        foo(X)
        not(isin(X,[]))
        append([],[X],NewSoFar)
        listallfoos(L,NewSoFar)

    foo(a) true (X is bound to a)
    not(isin(a,[])) true
    append([],[a],NewSoFar) true with NewSoFar = [a]
    must prove listallfoos(L,[a])

        unifies with listallfoos([Y|L'],[a])
        must prove four things:
            foo(Y)
            not(isin(Y,[a]))
            append([a],[Y],NewSoFar')
            listallfoos(L',NewSoFar')

        foo(a) true
        not(isin(a,[a])) false              BACKTRACK
        foo(b) true (Y is bound to b)
        not(isin(b,[a])) true
        append([a],[b],NewSoFar') true with NewSoFar' = [a,b]
        must prove listallfoos(L',[a,b])

            unifies with listallfoos([Z|L''],[a,b])
            must prove four things:
                foo(Z)
                not(isin(Z,[a,b]))
                ...
            one can see that this will fail   BACKTRACK
            unifies with listallfoos([],[a,b])

        therefore: L' unifies with []
        therefore: [Y|L'] unifies with [b]
    therefore L unifies with [b]
    therefore [X|L] unifies with [a,b]
    therefore A unifies with [a,b]
    therefore allfoos([a,b])

If you try this code and hit ';', you'll get multiple answers. Try to figure
out why (using the "trace" mode). Solution: add a cut.

    listallfoos_cut([X|L],SoFar) :- foo(X), not(isin(X,SoFar)),
                                    append(SoFar,[X],NewSoFar),
                                    listallfoos(L,NewSoFar), !.
    listallfoos_cut([],_).

What happens when you flip the order of the base case and the recursive case?

---------------------------- Puzzle Solving ----------------------------

We now have a good feel for what Prolog programs look like. Let's finish by
seeing how succinctly and elegantly Prolog allows us to write code to solve
tricky logical puzzles.

Towers of Hanoi:
---------------

           |                    |                  |
          =|=                   |                  |
        ===|===                 |                  |
      =====|=====               |                  |
    =======|=======             |                  |
    --------------------------------------------------
        Peg 1                 Peg 2              Peg 3

Puzzle: There are three pegs. On the first one, there is a stack of rings of
decreasing radius (which forms a tower). In each step, you are allowed to
move the top ring from one peg to another, but only if the rings on the new
peg still form a conical tower -- i.e. as long as the sequence of radii from
top to bottom is increasing (as shown in the figure).

- Goal: Find the sequence of moves (move top ring from peg X to peg Y) that
  moves the tower from peg 1 to peg 3.

- The basic action is to move 1 disk, with printing:

    move(A,B) :- nl, write('Move topdisk from '), write(A),
                 write(' to '), write(B).

- The main predicate is transfer(N,A,B,X): it represents "move N disks from
  peg A to peg B by using peg X as a helper".

- base case:

    transfer(1,A,B,X) :- move(A,B).

  Better written as:

    transfer(1,A,B,_) :- move(A,B).

- inductive case:

    transfer(N,A,B,X) :-
        transfer the top N-1 disks from A to X
        move the (bottom) disk from A to B
        transfer the top N-1 disks from X to B

    transfer(N,A,B,X) :- N > 1, M is N-1,
                         transfer(M,A,X,B), move(A,B), transfer(M,X,B,A).

Let's see how these work -- let's name the pegs using atoms: peg1, peg2,
peg3. Here's the sequence of moves to transfer a tower of size 3 across (you
can try this at home!).

    ?- transfer(3,peg1,peg3,peg2).
    Move topdisk from peg1 to peg3
    Move topdisk from peg1 to peg2
    Move topdisk from peg3 to peg2
    Move topdisk from peg1 to peg3
    Move topdisk from peg2 to peg1
    Move topdisk from peg2 to peg3
    Move topdisk from peg1 to peg3

Farmer / Wolf / Goat / Cabbage:
------------------------------

    *West*              *East*
    Goat     |       |            Goat eats cabbage if no farmer
    Wolf     | river |            Wolf eats goat if no farmer
    Cabbage  |       |            Only one spot on the boat (farmer + 1)

Configure the "state" of the program as a list with the location of the four
objects (farmer, wolf, goat, cabbage). There are two locations: West (w) and
East (e).
    initial state:  [w,w,w,w]
    desired state:  [e,e,e,e]

There are four kinds of moves:

    with the cabbage   (move_cabbage)
    with the goat      (move_goat)
    with the wolf      (move_wolf)
    with nothing       (move_nothing)

Here is how the "state" changes with a given move. For instance,

    [w,w,w,w]  ---"move wolf"-->  [e,e,w,w]

We encode this as:

    move([w,w,w,w],move_wolf,[e,e,w,w]).

(this just says that the state transformation above is true). We could write
all the possible moves as facts, but there would be a lot of them. However,
it is clear that when the farmer and wolf move, the goat and the cabbage do
not move, so a more general fact is:

    move([w,w,P_Goat,P_Cabbage],move_wolf,[e,e,P_Goat,P_Cabbage]).
    move([e,e,P_Goat,P_Cabbage],move_wolf,[w,w,P_Goat,P_Cabbage]).

and therefore a more general rule is:

    move([X,X,P_Goat,P_Cabbage],move_wolf,[Y,Y,P_Goat,P_Cabbage]) :- change(X,Y).

where we have:

    change(e,w).
    change(w,e).

In addition to ensuring that X and Y are different, this predicate ensures
that they cannot be any random atom -- they must be either "e" or "w". Now
we can just write the whole program:

    move([X,X,P_Goat,P_Cabbage],move_wolf,[Y,Y,P_Goat,P_Cabbage]) :- change(X,Y).
    move([X,P_Wolf,X,P_Cabbage],move_goat,[Y,P_Wolf,Y,P_Cabbage]) :- change(X,Y).
    move([X,P_Wolf,P_Goat,X],move_cabbage,[Y,P_Wolf,P_Goat,Y]) :- change(X,Y).
    move([X,P_Wolf,P_Goat,P_Cabbage],move_nothing,[Y,P_Wolf,P_Goat,P_Cabbage]) :- change(X,Y).

At this point we have encoded all the possible moves. But there is nothing
yet about moves being safe or unsafe. We need a safe predicate that takes a
state as input and is true if the state is safe (nobody eats anybody):

    "if at least one of the goat or the wolf is on the same bank as the
    farmer, AND at least one of the goat or cabbage is on the same bank as
    the farmer, then we're safe"

We define the one_equal(X,Y,Z) predicate, which is true if at least one of Y
or Z is equal to X:

    one_equal(X,X,_).
    one_equal(X,_,X).

Then we can have:

    safe([P_Farmer,P_Wolf,P_Goat,P_Cabbage]) :-
        one_equal(P_Farmer,P_Goat,P_Wolf),
        one_equal(P_Farmer,P_Goat,P_Cabbage).

This encodes the logical statement we made above. A solution is defined as a
sequence of moves such that we are either:

    -- in the target configuration, all on the east bank, or,
    -- in a state from which there is a single move which takes us into a
       safe state, from which we can get to the target configuration.

    solution([e,e,e,e],[]).
    solution(State,[FirstMove|RemainingMoves]) :-
        move(State,FirstMove,NextState),
        safe(NextState),
        solution(NextState,RemainingMoves).

The program is complete. Example run: if you just type
solution([w,w,w,w],X), we get into an infinite loop, as there are infinitely
many solutions. So:

    ?- length(X,7), solution([w,w,w,w],X).
    X = [move_goat, move_nothing, move_wolf, move_goat,
         move_cabbage, move_nothing, move_goat]

In fact, 7 steps is the shortest solution.

    ?- length(X,12), solution([w,w,w,w],X).
    No

(a solution needs an odd number of moves)

    ?- length(X,13423), solution([w,w,w,w],X).
    X = [move_goat, move_goat, move_goat, move_goat, ....., followed by the
         7-step solution]

is one of the solutions.

------------------- Cuts (Not Covered, Bonus Material, Not on Final) ---------------

- Ordering clauses and goals is a way to somewhat control the search and
  backtracking process, but it is very limited.

- There is something called a "cut" that prevents Prolog from backtracking.

- Example: Let's say we're writing a program to compute the following step
  function:

        X < 3         phi(X) = 0
        3 <= X < 6    phi(X) = 2
        6 <= X        phi(X) = 4

In Prolog we can implement this with a binary predicate, f(X,Y), which is
true if Y is the function value at point X.
For instance, f(0,0) is true, f(4,2) is true, but f(2,4) is false. Here is
the program:

    f(X,0) :- X < 3.           % [rule 1]
    f(X,2) :- 3 =< X, X < 6.   % [rule 2]   note: less-or-equal is written '=<'
    f(X,4) :- 6 =< X.          % [rule 3]

There are two sources of inefficiency in this program, which we'll see on
one example:

    ?- f(1,Y), 2 < Y.       [find a Y such that Y = f(1) and 2 < Y]
                            [we can see this is going to fail]

What does Prolog do?

    f(1,Y), 2 < Y
      rule 1 (Y = 0):   1 < 3   yes
                        2 < 0   NO
      rule 2 (Y = 2):   3 =< 1  NO
      rule 3 (Y = 4):   6 =< 1  NO

There is really no point in trying rule 2 and rule 3: since X < 3, we know
that rule 2 and rule 3 will fail. Basically, the three rules are mutually
exclusive. We know that. Prolog doesn't. So, we can "cut" the backtracking
by using the '!' operator:

    f(X,0) :- X < 3, !.
    f(X,2) :- 3 =< X, X < 6, !.
    f(X,4) :- 6 =< X.

The new execution looks like:

    f(1,Y), 2 < Y
      rule 1 (Y = 0):   1 < 3   yes
                        !       CUT (no backtracking into rules 2 and 3)
                        2 < 0   NO

Lessons: cuts can be used to prevent Prolog from going into branches of the
search tree that we know, due to our understanding and knowledge of the
problem, will not succeed anyway.

- There are many more things possible with cuts, and using them well is an
  art. A program with no cuts at all can run orders of magnitude slower than
  an equivalent program with a few '!' thrown in.

------------------------------------------------------------------------------------------
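To experiment with the cut at the prompt, one way is to put both versions of
the step function side by side in a file (the names f_nocut and f_cut are
just for illustration) and compare how much searching each query does:

    % step function without cuts
    f_nocut(X,0) :- X < 3.
    f_nocut(X,2) :- 3 =< X, X < 6.
    f_nocut(X,4) :- 6 =< X.

    % the same function with cuts: once a guard succeeds, commit to that clause
    f_cut(X,0) :- X < 3, !.
    f_cut(X,2) :- 3 =< X, X < 6, !.
    f_cut(X,4) :- 6 =< X.

    ?- f_nocut(1,Y), 2 < Y.    % tries all three clauses before failing
    False
    ?- f_cut(1,Y), 2 < Y.      % commits to rule 1 and fails right away
    False

Both queries give the same (negative) answer; the difference is only in the
amount of backtracking, which you can observe by running them under ?- trace.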