Static arguments
For this issue I will use the foldr
function (on lists) as an example.
It can be naively defined as follows.
{-# LANGUAGE LambdaCase #-}
module Foldr where
import Prelude hiding (foldr)
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr k z = \case
[] -> z
(x:xs) = x `k` foldr k z xs
However, the base
library chooses to define it with a local recursive helper function.
{-# INLINE [0] foldr #-}
foldr k z = go
where
go [] = z
go (y:ys) = y `k` go ys
The reason for this might not be obvious to everyone, especially less advanced users.
I suspect the main reason is that GHC is able to optimize this form more.
An obvious difference is that the recursive go
function now has two arguments less which should make the recursion faster, but I would expect such an optimization to be easy and automatically done by the compiler.
No, what really gives a big performance improvement is that the foldr
function is now non-recursive and hence can be inlined.
Inlining this function basically means specializing this function to known k
and z
arguments.
My first question is: why doesn't GHC transform the naive version to the version that optimizes better?
An optimization rule could be: for any recursive function (perhaps any INLINEABLE
recursive function), identify arguments that do not change during the recursion (this might be impossible in general but it could just be done syntactically) and generate a recursive helper function that only uses the changing arguments.
But there is another way to optmize this code that relies on specialization instead of inlining.
The foldr
function can be defined as follows.
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
module FoldrC where
class FoldrAlg a b | b -> a where
k :: a -> b -> b
z :: b
{-# INLINEABLE foldrC #-}
foldrC :: FoldrAlg a b => [a] -> b
foldrC [] = z
foldrC (x:xs) = x `k` foldrC xs
Now calling the foldr function requires defining an instance for this FoldrAlg
type class, but that is very mechanical.
Here is an example.
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
import Foldr
default (Int)
newtype FoldrSum a = FoldrSum { unFoldrSum :: a }
instance Num a => FoldrAlg a (FoldrSum a) where
z = FoldrSum 0
x `k` FoldrSum xs = FoldrSum (x + xs)
main = print (unFoldrSum (foldrC [1,2,3]))
For me it is not obvious how to infer the functional dependencies of that FoldrAlg
class, but that class can be split to avoid functional dependencies.
class FoldrZ b where
z :: b
class FoldrK a b where
k :: a -> b -> b
{-# INLINEABLE foldrC #-}
foldrC :: (FoldrK a b, FoldrZ b) => [a] -> b
foldrC [] = z
foldrC (x:xs) = x `k` foldrC xs
Here is an example of using it, note the extra type applications:
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, TypeApplications #-}
import FoldrClass
newtype FoldrSum a = FoldrSum { unFoldrSum :: a }
instance Num a => FoldrZ (FoldrSum a) where
z = FoldrSum 0
instance Num a => FoldrK a (FoldrSum a) where
x `k` FoldrSum xs = FoldrSum (x + xs)
main = print (unFoldrSum @Int (foldrC @Int [1,2,3]))
The type arguments should not be a problem if this would be implemented as a core optimization.
And I believe the creation of the FoldrSum
type and these instances can be simplified in core too.
So, I feel like this optimization could be done automatically by GHC too.
Maybe the SPECIALIZE
pragma could even be extended to allow something like:
{-# SPECIALIZE foldr (+) 0 :: Num a => [a] -> a #-}
Which would automatically create that type class and the instances. This approach seems preferable to inlining because we can reuse the specialized version everywhere it is used with the same arguments. This requires determining equality of arbitrary values including functions, which is impossible in general, but I feel like again a simple analysis, purely syntactically if necessary, could be used to determine function equality.
Edit: this last note about the specialize pragma already seems to be tracked by: #5059