Optimize repeated usage of instance methods.
Consider this code:
class C a where op1 :: a -> a op2 :: a -> a -> a foo :: C a => a -> a -> a foo x y = op1 x `op2` op1 y
In the absence of specialization I expected
op2 to be extracted once from the constraint/dictionary and to be reused for each use.
Instead we currently extract the methods once per use:
foo :: forall a. C a => a -> a -> a [GblId, Arity=3, Caf=NoCafRefs, Str=<S,U><L,U><L,U>, Unf=<irrelevant> foo = \ (@a_aAq) ($dC_aAs :: C a_aAq) (x_agS :: a_aAq) (y_agT :: a_aAq) -> op2 @a_aAq $dC_aAs ((op1 @a_aAq $dC_aAs) x_agS) ((op1 @a_aAq $dC_aAs) y_agT)
This isn't cheap as it causes a bunch of unknown calls and jumping back and forth. (On the plus side it doesn't allocate at least).
Even if we manually bind op1 to a name the simplifier will happily inline it again, duplicating the work performed.
Ideally we would translate
op1 y from above into
let o = op1 @a_aAq $dC_aAs in o x `op2` o y
The problems preventing this are:
- We desugar into the form
op1 yquite early.
- Even if we don't the simplifier will inline
op1is considered cheap.
- Instances currently generate rules which will match on
op1 @a_aAq $knownDict x_agS. If the dictionary is known, the rule will replace it with
op1_instance @a_aAq $knownDict x_agS. This would break under the form above.
- While the let for
ocan usually be turned into a (non-allocating) case this is not guaranteed. Allocating might be worse than the duplicated work.
I don't think this is a big problem in practice.
If performance really matters we would always want the above snippet to specialize so the overhead can disappear completely.
If performance doesn't really matter then neither does it if we do a little extra work.
So while I don't intend to work on this anytime soon it seemed reasonable to document this behaviour.