Skip to content

GitLab

  • Menu
Projects Groups Snippets
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
  • Sign in / Register
  • GHC GHC
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
    • Locked Files
  • Issues 4,872
    • Issues 4,872
    • List
    • Boards
    • Service Desk
    • Milestones
    • Iterations
  • Merge requests 455
    • Merge requests 455
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
    • Test Cases
  • Deployments
    • Deployments
    • Releases
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Code review
    • Insights
    • Issue
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • Glasgow Haskell Compiler
  • GHCGHC
  • Issues
  • #18532
Closed
Open
Created Aug 04, 2020 by Alexis King@lexi.lambda

Orphan specializations are pruned much too aggressively

Currently, orphan specializations generated via cross-module specialization are aggressively pruned from compiled modules. Orphan specialization rules are not considered part of the “root set” of bindings to keep, so the specialization rule is only kept if the specialized definition is directly called by something else in the final module. At first blush this seems reasonable—the specialization might have been so small that it was just inlined everywhere, in which case keeping the specialization around is unhelpful—but in fact this can often cause GHC to unnecessarily duplicate a lot of work.

For an example of how things can go wrong, consider the following pair of modules:

{-# LANGUAGE FlexibleContexts, LambdaCase #-}
module A where
import Control.Monad.State.Class

countdown :: MonadState Int m => m Int
countdown = do
  n <- get
  if n <= 0 then pure n
            else put (n - 1) >> countdown
{-# INLINABLE countdown #-}
module B where
import Control.Monad.State.Strict
import A

program :: IO ()
program = print (runState countdown 1000)

While compiling module B, GHC will generate an orphan specialization of countdown at State Int that looks like this:

$scountdown :: Int -> (Int, Int)
$scountdown n = if n <= 0 then (n, n)
                          else $scountdown (n - 1)

This is a recursive function, so it can’t possibly be completely inlined away, but nevertheless, no orphan specialization rule makes it into B.hi. Why not? Because the worker/wrapper transformation kicks in first, and we end up with this:

$w$scountdown :: Int# -> (# Int, Int #)
$w$scountdown n = if n <=# 0# then (I# n, I# n)
                              else $w$scountdown (n -# 1#)

$scountdown :: Int -> (Int, Int)
$scountdown (I# n) = case $w$scountdown n of { (# a, b #) -> (a, b) }

Now the $scountdown wrapper is inlined into program, so there are no more references to $scountdown in the original program (other than in the orphan rule). The rule is therefore dropped… but that’s silly, since essentially the entire specialization remains in the compiled module!

This is obviously an artificial case, since countdown is very small, and re-specializing it is not very expensive. Nevertheless, the same problem can and does occur for arbitrarily-large specializations. The worker/wrapper transformation is a common culprit, but there are other ways it can happen, too. For example, we might have a function like this:

f :: C a => a -> a -> a
f = go 0 where
  go n x y = <big RHS>
{-# INLINABLE f #-}

When f is called in some other module, the specializer will kick in, generating a specialization like this:

$sgo n x y = <big RHS>
$sf = $sgo 0
{-# RULES "SPEC/M f @T" forall ($dC :: C T). f $dC = $sf #-}

But $sf is very small, so it will be quickly inlined at its use site, leaving only a reference to $sgo. Now we’re in the same boat: GHC drops the $sf orphan specialization, even though virtually all of the specialization remains in the compiled module.

This issue can cause GHC to continually recompile the same orphan specializations for no benefit, bloating compilation times and possibly bloating binary sizes. I’m not sure what the right solution is, but it seems like a pretty severe problem with the current approach.

Edited Aug 04, 2020 by Alexis King
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Assignee
Assign to
Time tracking