Monday, June 12, 2017

Logic Programming in Functional Style

[This was also posted on the WombatLang blog, but it is self-contained and might have wider interest.]
[N.B. There is code. In I have hand-compiled the example below to an AST, and written a pretty-printer to check it is right. Next step is to write an interpreter for the AST. How hard can that be :-).]

Wombat was designed to be a (mostly) functional programming language with some logic programming capabilities. But it turned out that you can't be half-hearted about logic programming. However the functional roots shine through, giving Wombat a familiar look for most programmers. But behind the scenes, unification is everywhere.

Procedures are also ubiquitous in Wombat. They always have exactly one input and one output. Even things that aren't procedures can act like procedures. In normal functional programming the input is specified, and the output starts out as a hole that the result gets put into. In Wombat both the output and the input are passed to the procedure. Either can be a hole. One or both might be structures which include holes. Consider

(x,y) = f(1,2)

Here "=" causes unification. One might think that the function f will be called, it will return a pair, and unification will then happen. But this is not how Wombat works. Instead (x,y) is unified with f's output, (1,2) is unified with f's input, and execution of f then starts.

Before we look at a more interesting example, some relevant features of Wombat are:
  • An identifier is preceded by backquote when used for the first time. It starts life as a hole, and like all holes it can only be filled in once. `x:Int; x=3 (Explicit typing is optional.);
  • An explicit procedure (closure) is just an expression in braces -- { x+1 } ;
  • A closure's input is $ and its output is `$. The input is commonly a tuple which is unpacked immediately, and $ is never mentioned again -- { $ = (`x,`y); x+y } ;
  • If `$ isn't explicitly unified, then it is unified with the whole expression: {$+1} means {`$=$+1}.
  • A list is given by elements in square brackets separated by spaces. The +> operator adds an element to the head of the list and is invertible.

Here is the classic list append program (using the caseP procedure, rather than the almost identical case syntactic sugar):

`append = {
   $ = (`x,`y); # 2 input lists
   caseP [
       { x=[]; y }
       { x = `hdx +> `tlx;
         hdx +> append(tlx,y) }
   ] ()

print( append([1 2],[3 4])); # [1 2 3 4]
[1 2 3 4] = append([1 2],print(`a)); # [3 4] -- print returns its argument
[1 2 3 4] = append(print(`b),[3 4]); # [1 2]

Consider the last line. Execution proceeds concurrently:
  • x is unified with print(`b) and y with [3 4];
    • print is called with its `$ set to the hole x, and its input set to the hole `b. Since it is going to have an effect it has to stall waiting for one or other to be filled. If there were any later effects they would also stall, even if ready to go, because of a lexical ordering requirement.
  • At the same time caseP is called with input set to unit (=()), and output set to the output of the whole procedure (i.e. [1 2 3 4]) since it is the last expression. Now caseP calls all procedures in its list expecting precisely one to succeed. In this case:
    • Closures execute in a sandbox where unifications with holes from outside are tentative and only make it out if the procedure doesn't fail. If the outside hole gets filled in while the closure is executing then the unification is made firm if it agrees with the tentative binding, or the closure fails if it doesn't.
    • So when we look at the first procedure in the caseP, it tentatively unifies x with [], then tries to unify y=[3 4] with `$=[1 2 3 4]. This fails, so that closure fails.
    • At the same time we start the more complex 2nd closure. The first line creates a relationship between the 3 holes: x, hd and tl. The 2nd line then unifies [1 2 3 4] with (hd+>append(tl,y)). this sets hd=1 and unifies [2 3 4] with append(tl,y). So we call append recursively with `$=[2 3 4] and $=(tl,y).
    • The following time that append is called we have `$=[3 4] and then the first closure succeeds (while the 2nd fails), so that when it returns it establishes its parent's y as [3 4], tlx=[] and hdx=2. This resolves the previous line giving x=[2].
    • When this returns the output of print(`b) is unified with [1 2] which in turns sets b to [1 2] and allows the print to proceed.
    • If we weren't going to use b subsequently we could have just written print(_) because _ is always a new hole.