Partial specialization of unfoldings stripping INLINEABLE pragmas?
I'm looking at some failures to specialise code and came across this statement from the Note [Specialising unfoldings].
Moreover, keeping the stable unfolding isn't much help, because the specialised function (probably) isn't overloaded any more.
I'm including the full note below the spoiler tag:
{- Note [Specialising unfoldings]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When we specialise a function for some given type-class arguments, we use
specUnfolding to specialise its unfolding. Some important points:
* If the original function has a DFunUnfolding, the specialised one
must do so too! Otherwise we lose the magic rules that make it
interact with ClassOps
* For a /stable/ CoreUnfolding, we specialise the unfolding, no matter
how big, iff it has UnfWhen guidance. This happens for INLINE
functions, and for wrappers. For these, it would be very odd if a
function marked INLINE was specialised (because of some local use),
and then forever after (including importing modules) the specialised
version wasn't INLINEd! After all, the programmer said INLINE.
* However, for a stable CoreUnfolding with guidance UnfoldIfGoodArgs,
which arises from INLINABLE functions, we drop the unfolding.
See #4874 for persuasive examples. Suppose we have
{-# INLINABLE f #-}
f :: Ord a => [a] -> Int
f xs = letrec f' = ...f'... in f'
Then, when f is specialised and optimised we might get
wgo :: [Int] -> Int#
wgo = ...wgo...
f_spec :: [Int] -> Int
f_spec xs = case wgo xs of { r -> I# r }
and we clearly want to inline f_spec at call sites. But if we still
have the big, un-optimised of f (albeit specialised) captured in the
stable unfolding for f_spec, we won't get that optimisation.
This happens with Control.Monad.liftM3, and can cause a lot more
allocation as a result (nofib n-body shows this).
Moreover, keeping the stable unfolding isn't much help, because
the specialised function (probably) isn't overloaded any more.
TL;DR: we simply drop the stable unfolding when specialising. It's not
really a complete solution; ignoring specialisation for now, INLINABLE
functions don't get properly strictness analysed, for example.
Moreover, it means that the specialised function has an INLINEABLE
pragma, but no stable unfolding. But it works well for examples
involving specialisation, which is the dominant use of INLINABLE.
And I wonder if this is ideal for partially specialized functions. For example consider the code below:
{-# INLINEABLE overloadedTwice #-}
overloadedTwice :: (Eq a, Monad m) => a -> m SomeType
overloadedTwice = <large>
overloadedOnce :: Monad m => TypeWithEq -> ... -> m SomeType
overloadedOnce some_a = ... overloadedTwice (@Eq TypeWithEq) some_a ...
Specialization will look at the rhs of overloadedOnce, see overloadedTwice (@Eq TypeWithEq), and create $soverloadedTwice $dMonad x = ..., which is specialized for the Eq dictionary. specUnfolding will then strip away the stable unfolding, giving $soverloadedTwice an OtherCon[] unfolding.
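To make the setup concrete, here is a minimal sketch of a module with this shape. The definitions of TypeWithEq and SomeType, the extra list argument, and the function body are all assumptions made up for illustration (the body stands in for <large>). Whether GHC actually creates the partial specialization for this exact code will depend on the GHC version and optimisation flags.

```haskell
-- Hypothetical reproduction sketch. TypeWithEq, SomeType, the list argument
-- and the function bodies are stand-ins, not taken from a real code base.
module Main (main, overloadedTwice, overloadedOnce) where

import Control.Monad (foldM)

newtype TypeWithEq = TypeWithEq Int deriving (Eq)

newtype SomeType = SomeType Int deriving (Eq, Show)

-- Overloaded in both Eq and Monad. The body stands in for <large>:
-- it counts how many elements of ys are equal to x.
{-# INLINABLE overloadedTwice #-}
overloadedTwice :: (Eq a, Monad m) => a -> [a] -> m SomeType
overloadedTwice x ys = SomeType <$> foldM step 0 ys
  where
    step acc y
      | x == y    = pure (acc + 1)
      | otherwise = pure acc

-- Fixes the Eq dictionary at TypeWithEq but stays polymorphic in m,
-- so the call to overloadedTwice is only partially specialisable here.
-- It is exported so that it has to remain overloaded.
overloadedOnce :: Monad m => TypeWithEq -> [TypeWithEq] -> m SomeType
overloadedOnce some_a rest = overloadedTwice some_a rest

main :: IO ()
main = do
  r <- overloadedOnce (TypeWithEq 1) [TypeWithEq 1, TypeWithEq 2, TypeWithEq 1]
  print r
```

Compiling something like this with -O and -ddump-spec (and -ddump-simpl for the final Core) should show which specializations the specializer creates and what unfolding each of them ends up with.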
What are the consequences of this? For call sites of overloadedOnce in another module:
- There is a chance that the overloaded version of overloadedTwice is small and hence will inline. This seems to be the main motivation for the current behavior, but it seems unlikely given a large rhs.
- If overloadedOnce is small enough to get a stable unfolding it doesn't have any further consequences as the unfolding will mention
- If overloadedOnce doesn't get an unfolding it also doesn't matter, as it won't specialize at its call sites.
- If overloadedOnce gets an unfolding then this will mention $soverloadedTwice, which will now most likely have no unfolding and hence can't specialize any further.
  - One exception to this would be compiling with -fexpose-all-unfoldings, which would allow $soverloadedTwice to get an unfolding despite its size.
So perhaps this doesn't really matter all that much. Getting everything to specialize without either lots of INLINEABLE pragmas or the use of -fexpose-all-unfoldings seems unlikely after all.
But if we really wanted to, we could check whether a binding is fully specialized before deciding whether or not to strip the INLINEABLE. But perhaps that is overkill.
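As a rough illustration of that idea, here is a toy model. It is not GHC's code: the Unfolding and Guidance types below are simplified stand-ins, the "no unfolding" case collapses OtherCon[] and friends into one constructor, and remainingDicts is a hypothetical count of the dictionary arguments the specialized function still takes.

```haskell
-- Toy model only: simplified stand-ins, not GHC's actual Unfolding type
-- or its specUnfolding function.
module Main (main) where

data Guidance = UnfWhen | UnfoldIfGoodArgs
  deriving (Eq, Show)

data Unfolding
  = DFunUnfolding
  | StableUnfolding Guidance
  | NoUnfolding
  deriving (Eq, Show)

-- Behaviour as described in the Note: DFun unfoldings and INLINE-style
-- (UnfWhen) unfoldings survive, INLINABLE-style ones are dropped.
specUnfoldingCurrent :: Unfolding -> Unfolding
specUnfoldingCurrent DFunUnfolding                      = DFunUnfolding
specUnfoldingCurrent (StableUnfolding UnfWhen)          = StableUnfolding UnfWhen
specUnfoldingCurrent (StableUnfolding UnfoldIfGoodArgs) = NoUnfolding
specUnfoldingCurrent NoUnfolding                        = NoUnfolding

-- Hypothetical variant: keep the INLINABLE-style unfolding when the
-- specialised function is still overloaded (remainingDicts > 0),
-- otherwise fall back to the current behaviour.
specUnfoldingProposed :: Int -> Unfolding -> Unfolding
specUnfoldingProposed remainingDicts unf = case unf of
  StableUnfolding UnfoldIfGoodArgs
    | remainingDicts > 0 -> unf
  _ -> specUnfoldingCurrent unf

main :: IO ()
main = do
  -- Partially specialised (one dictionary left) vs. fully specialised.
  print (specUnfoldingProposed 1 (StableUnfolding UnfoldIfGoodArgs))
  print (specUnfoldingProposed 0 (StableUnfolding UnfoldIfGoodArgs))
```

In this model only the partially specialized case changes. A fully specialized function still loses its stable unfolding, which preserves the behavior the Note argues for.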
One other thing I'm wondering about: now that we fire rules in the specializer, could the following happen? Suppose we have something like:
{-# INLINEABLE overloadedTwice #-}
overloadedTwice :: Eq a -> Monad m -> a -> m SomeType
overloadedTwice = <large>
overloadedOnce :: Monad m => TypeWithEq -> ... -> m SomeType
overloadedOnce some_a = ... overloadedTwice (@Eq TypeWithEq) some_a ...
foo = ... overloadedTwice $dEq $dMonad ...
- We partially specialize overloadedTwice to the Eq instance as above for overloadedOnce.
- We run spec on foo.
- We see overloadedTwice $dEq $dMonad and apply the spec rule for the Eq specialization.
- Now we have foo = ... $soverloadedTwice $dMonad ... but $soverloadedTwice has no unfolding.
- Since $soverloadedTwice has no unfolding we fail to fully specialize $soverloadedTwice.
The above is speculative but looking at the code for specUnfolding the above seemed possible. I will have to look at the code involved there more closely. But as I'm currently looking at a code base still using 9.2 this might have to wait a bit.
This ticket is mostly for me to keep track of this but I welcome anyone to look into this further.