Orphan specializations are pruned much too aggressively

Currently, orphan specializations generated via cross-module specialization are aggressively pruned from compiled modules. Orphan specialization rules are not considered part of the “root set” of bindings to keep, so the specialization rule is only kept if the specialized definition is directly called by something else in the final module. At first blush this seems reasonable—the specialization might have been so small that it was just inlined everywhere, in which case keeping the specialization around is unhelpful—but in fact this can often cause GHC to unnecessarily duplicate a lot of work.

For an example of how things can go wrong, consider the following pair of modules:

{-# LANGUAGE FlexibleContexts, LambdaCase #-}
module A where
import Control.Monad.State.Class

countdown :: MonadState Int m => m Int
countdown = do
  n <- get
  if n <= 0 then pure n
            else put (n - 1) >> countdown
{-# INLINABLE countdown #-}

module B where
import Control.Monad.State.Strict
import A

program :: IO ()
program = print (runState countdown 1000)

While compiling module B, GHC will generate an orphan specialization of countdown at State Int that looks like this:

$scountdown :: Int -> (Int, Int)
$scountdown n = if n <= 0 then (n, n)
                          else $scountdown (n - 1)

This is a recursive function, so it can’t possibly be completely inlined away, but nevertheless, no orphan specialization rule makes it into B.hi. Why not? Because the worker/wrapper transformation kicks in first, and we end up with this:

$w$scountdown :: Int# -> (# Int, Int #)
$w$scountdown n = if n <=# 0# then (I# n, I# n)
                              else $w$scountdown (n -# 1#)

$scountdown :: Int -> (Int, Int)
$scountdown (I# n) = case $w$scountdown n of { (# a, b #) -> (a, b) }

(See also #21917 (closed).)

Now the $scountdown wrapper is inlined into program, so there are no more references to $scountdown in the original program (other than in the orphan rule). The rule is therefore dropped… but that’s silly, since essentially the entire specialization remains in the compiled module!

This is obviously an artificial case, since countdown is very small, and re-specializing it is not very expensive. Nevertheless, the same problem can and does occur for arbitrarily-large specializations. The worker/wrapper transformation is a common culprit, but there are other ways it can happen, too. For example, we might have a function like this:

f :: C a => a -> a -> a
f = go 0 where
  go n x y = <big RHS>
{-# INLINABLE f #-}

When f is called in some other module, the specializer will kick in, generating a specialization like this:

$sgo n x y = <big RHS>
$sf = $sgo 0
{-# RULES "SPEC/M f @T" forall ($dC :: C T). f $dC = $sf #-}

But $sf is very small, so it will be quickly inlined at its use site, leaving only a reference to $sgo. Now we’re in the same boat: GHC drops the $sf orphan specialization, even though virtually all of the specialization remains in the compiled module.

This issue can cause GHC to continually recompile the same orphan specializations for no benefit, bloating compilation times and possibly bloating binary sizes. I’m not sure what the right solution is, but it seems like a pretty severe problem with the current approach.

Edited Jul 26, 2022 by Simon Peyton Jones

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information