Monoidal tail call optimisation
Consider
sumOf :: Int -> Int
sumOf 0 = 42
sumOf n = n + sumOf (n-1)
This allocates even with -O2 because it is not tail recursive. A better definition would be
sumOf :: Int -> Int
sumOf n = go 0 n
where
go acc 0 = acc + 42
go acc n = go (acc + n) (n-1)
This operates in constant space and is thus much faster. I think we could do this optimisation automatically, without user intervention.
The key is to realise that the tail continuation can be factored as c n = (+) n
and c n . c m = c (n+m)
(exploiting associativity of +
), thus it is enough to track n
as an accumulator (the "closure" of the continuation) and apply c acc
once to 42 in the base case, at the end of the recursion. Note that we also need a neutral element 0
to +
to start the recursion; otherwise we don't have an initial accumulator.
The challenge of course is to identify such tail continuations where the transformation would be beneficial. Although the immediate use cases are with unboxed integers, it might be useful for boxed "continuation closures" as well; I think the only requirement is that the closure forms a monoid (which is the case as long as all free vars do) and c
is a monoid homomorphism into the monoid of endomorphisms (which is a fancy way of the requirements c (n+m) = c n . c m
and c 0 = id
).