Skip to content

GitLab

  • Menu
Projects Groups Snippets
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
  • Sign in / Register
  • GHC GHC
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
    • Locked Files
  • Issues 4,860
    • Issues 4,860
    • List
    • Boards
    • Service Desk
    • Milestones
    • Iterations
  • Merge requests 454
    • Merge requests 454
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
    • Test Cases
  • Deployments
    • Deployments
    • Releases
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Code review
    • Insights
    • Issue
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • Glasgow Haskell Compiler
  • GHCGHC
  • Issues
  • #21392
Closed
Open
Created Apr 14, 2022 by Sebastian Graf@sgraf812Developer

Don't float out join points to top-level with an interesting demand

Due to a complicated interaction with FloatOut in !7599 (merged), I'm seeing roughly the following sequence of steps:

==> after DmdAnal
f = ... joinrec go [Dmd=SCS(C1(P(1L,A))),Str=<S><L>] x y = 
                   ... case <e> of x' { __DEFAULT -> go x' y' }
        in ... go ...

==> FloatOut + FloatIn strict arg `e`

go [Str=<S><L>]
go = ... go <e> y' ... -- NB: Dropped the eval

f = ... go ...

==> DmdAnal, 2nd run

go [Str=<L><L>] -- Urgh, now lazy!
go = ... go <e> y' ...

f = ... go ...

This is what happens:

  1. The first run of DmdAnal records a signature <S><L> for the join point go, assuming a demand of P(1L,A) on the join body. NB: DmdAnal currently computes better signatures for join points than for regular functions because their evaluation context is known.
  2. The fact that go is strict in its first arg allows us to drop the eval on e in Core and we get go e y' in the recursive call because go, which is easier to handle in Core2Core passes.
  3. In the meantime, go has been simplified enough for the second FloatOut pass to see that it can float to the top-level. Off it goes
  4. But on the top-level it's just a regular function where DmdAnal isn't smart enough (yet?) to figure out an accurate evaluation context. So our strictness signature gets worse, <L><L>. And now we dropped the eval in (2) but can't recover it! We made the program lazier; in CorePrep we'll let-bind instead of case-bind e.

In !7599 (merged), we made inlined versions of addListToUniqDSet lazier in GHC.Linker.Loader, to detrimental effect on MultiLayerModulesTH_OneShot, which increased by 4.8%.

In a multi-day effort I could come up with a reproducer, but it only reproduces with !7599 (merged) (because t has demand MP(1L,A) which would be MP(ML,A) today. I couldn't reproduce with a demand of 1P(1L,A) because then t will just be turned into a case and pushed into the join point):

module Lib (f) where

import Data.List
import Data.Ord

newtype Unique = U { unU :: Int }
class Uniquable u where getKey :: u -> Unique
instance Uniquable Int where getKey = U
data UMap a = UMap { unS :: ![(Unique,a)], unI :: !Int }

insertBy' f v !xs = insertBy f v xs
{-# NOINLINE insertBy' #-}

addOne :: Uniquable u => UMap a -> u -> a -> UMap a
addOne (UMap set n) x v = UMap (insertBy' (comparing (unU . fst)) (getKey x,v) set) (n+1)

newtype USet u = USet (UMap u)

insertOne :: Uniquable u => USet u -> u -> USet u
insertOne (USet s) x = USet (addOne s x x)

insertMany :: Uniquable u => USet u -> [u] -> USet u
insertMany s vs = foldl' insertOne s (reverse (reverse vs))

seq' = seq
{-# NOINLINE seq' #-}

blah s@(USet m) = unS m `seq'` s
{-# OPAQUE blah #-}

end (USet m) = unS m
{-# NOINLINE end #-}

f :: USet Int -> [Int] -> [(Unique,Int)]
f !xs ys
  | length ys == 13 = end $ blah t
  | length ys == 23 = reverse $ end $ blah t
  | otherwise       = []
  where
    t = insertMany xs (reverse $ reverse $ reverse $ reverse ys)

Welp.

My suggestion for a fix: In SetLevels.destLevel, only float join points (to the top-level) if their demand is not "interesting", e.g. not just nCn(C1(L)). Alternatively, zap their demand signatures when floating. Perhaps a combination of both.

Perhaps a strange artifact is that go's first arg is actually unlifted (in terms of StrictWorkerId) and perhaps will be eval'd in Stg2Cmm. But the thunk will be allocated nonetheless which is a waste of ressources. Maybe we could case-bind in CorePrep expressions that go in Unlifted positions. But that isn't strictly the cause of the issue here.

Edited Apr 19, 2022 by Sebastian Graf
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Assignee
Assign to
Time tracking