Optimisation idea: Hylomorphism inference

Motivation

Consider the following inefficient variant of fib:

{-# OPTIONS_GHC -O2 -fforce-recomp #-}

module Test (fib) where

data Tree = I Tree Tree | L !Int

fibTree :: Int -> Tree
fibTree 0 = L 0
fibTree 1 = L 1
fibTree n = I (fibTree (n-1)) (fibTree (n-2))

sumTree :: Tree -> Int
sumTree (L n) = n
sumTree (I l r) = sumTree l + sumTree r

fib :: Int -> Int
fib = sumTree . fibTree

We get the following optimised Core for fib:

fib
  = \ x_a1nd ->
      case x_a1nd of { I# ww1_s1oJ ->
      case $wsumTree ($wfibTree ww1_s1oJ) of ww2_s1oE { __DEFAULT ->
      I# ww2_s1oE
      }
      }

E.g. we first build up the Tree in fibTree and then consume it in sumTree.

What a waste! For lists, we have elobarate fusion rules, but that doesn't help for user-defined data structures. I want to get rid of the intermediate tree, so that fib above is equivalent to the hand-written fib', e.g.

$wfib'
  = \ ww_s1pX ->
      case ww_s1pX of ds_X1nA {
        __DEFAULT ->
          case $wfib' (-# ds_X1nA 1#) of ww1_s1q1 { __DEFAULT ->
          case $wfib' (-# ds_X1nA 2#) of ww2_X1r3 { __DEFAULT ->
          +# ww1_s1q1 ww2_X1r3
          }
          };
        0# -> 0#;
        1# -> 1#
      }

I think this is very relevant in practice; in GHC we have the non-recursive viewCore function and offload the recursion to call sites, so that viewCore can be inlined into its call sites.

Proposal

In terms of the pretty well established literature on recursion schemes, fibTree is an "anamorphism", a function that builds up a recursive data structure, and sumTree is a "catamorphism", a function that consumes/folds over a recursive data structure.

I imagine that whether or not some function is an anamorphism or catamorphism can be identified pretty easily. For example, the recursive invocations of fibTree are exactly in the places where T expects a Tree, e.g. there is no normalisation function in between. Likewise for sumTree, we can see that it recurses into its arguments l and r without any function in-between. In both cases, those are the only recursive invocations. That makes for a pretty simple analysis.

As for the transformation: We can apply it whenever an anamorphism is followed by a catamorphism, like sumTree . fibTree. The fusion of both functions is called a "hylomorphism", hence the title of this issue.

For the example above, I'd transform:

fibTree 0 = L 0
fibTree 1 = L 1
fibTree n = I (fibTree (n-1)) (fibTree (n-2))

sumTree (L n) = n
sumTree (I l r) = sumTree l + sumTree r

fib = sumTree . fibTree

==> { make recursion explicit }

fibTree = fibTree' fibTree
fibTree' :: (Int -> Tree) -> Int -> Tree
fibTree' fibTree 0 = L 0
fibTree' fibTree 1 = L 1
fibTree' fibTree n = I (fibTree (n-1)) (fibTree (n-2))

sumTree = sumTree' sumTree
sumTree' :: (Tree -> Int) -> Tree -> Int
sumTree' sumTree (L n) = n
sumTree' sumTree (I l r) = sumTree l + sumTree r

fib = sumTree . fibTree

==> { Replace Tree with church-encoded signature functor Tree C }

-- | You can derive TreeC by replacing all occurrences of `Tree` in data constructors
-- with `r` and taking the curch encoding.
type TreeC r = forall a. (Int -> a) -> (r -> r -> a) -> a

fibTree = fibTree' fibTree L I
fibTree' :: (Int -> r) -> Int -> TreeC r
fibTree' fibTree 0 l i = l 0
fibTree' fibTree 1 l i = l 1
fibTree' fibTree n l i = i (fibTree (n-1)) (fibTree (n-2))

sumTree = sumTree' sumTree
sumTree' :: (r-> Int) -> TreeC r -> Int
sumTree' sumTree t = t (\n -> n) (\l r -> sumTree l + sumTree r)

fib = sumTree . fibTree

==> { Interlock recursions }

-- sumTree', fibTree' as before

fib = sumTree' id (fibTree' fib n)

==> { Simplification }

fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

Test it! fib = sumTree' id (fibTree' fib n) generates optimal code. Why id in the argument to sumTree'? Frankly, I've no idea yet, but the literature on recursion-schemes should answer that question. And of course I tested that it generates the correct code.

This transformation would also generalise a lot of foldr/build fusion, as this is a strictly more general framework. I think by also recognising "metamorphisms", e.g. a catamorphism followed by an anamorphism, we can fuse entire list pipelines.

I think nothing here is new, except for the idea of how to exploit it in a compiler optimisation. In particular, I think this is all due to "Deforestation: transforming programs to eliminate trees" by Phil Wadler. And reviewing that paper, I'm not even so certain that anything about this is novel at all, just unexploited.

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information