Optimize repeated usage of instance methods.
Motivation
Consider this code:
class C a where
op1 :: a -> a
op2 :: a -> a -> a
foo :: C a => a -> a -> a
foo x y =
op1 x `op2` op1 y
In the absence of specialization I expected op1
and op2
to be extracted once from the constraint/dictionary and to be reused for each use.
Instead we currently extract the methods once per use:
foo :: forall a. C a => a -> a -> a
[GblId,
Arity=3,
Caf=NoCafRefs,
Str=<S,U><L,U><L,U>, Unf=<irrelevant>
foo
= \ (@a_aAq)
($dC_aAs :: C a_aAq)
(x_agS :: a_aAq)
(y_agT :: a_aAq) ->
op2
@a_aAq
$dC_aAs
((op1 @a_aAq $dC_aAs) x_agS)
((op1 @a_aAq $dC_aAs) y_agT)
This isn't cheap as it causes a bunch of unknown calls and jumping back and forth. (On the plus side it doesn't allocate at least).
Even if we manually bind op1 to a name the simplifier will happily inline it again, duplicating the work performed.
Proposal
Ideally we would translate op1 x
op2 op1 y
from above into
let o = op1 @a_aAq $dC_aAs
in o x `op2` o y
The problems preventing this are:
- We desugar into the form
op1 x
op2op1 y
quite early. - Even if we don't the simplifier will inline
o
asop1
is considered cheap. - Instances currently generate rules which will match on
op1 @a_aAq $knownDict x_agS
. If the dictionary is known, the rule will replace it withop1_instance @a_aAq $knownDict x_agS
. This would break under the form above. - While the let for
o
can usually be turned into a (non-allocating) case this is not guaranteed. Allocating might be worse than the duplicated work.
I don't think this is a big problem in practice.
If performance really matters we would always want the above snippet to specialize so the overhead can disappear completely.
If performance doesn't really matter then neither does it if we do a little extra work.
So while I don't intend to work on this anytime soon it seemed reasonable to document this behaviour.