Skip to content
GitLab
Projects Groups Snippets
  • /
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
  • Sign in / Register
  • GHC GHC
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
    • Locked Files
  • Issues 5,407
    • Issues 5,407
    • List
    • Boards
    • Service Desk
    • Milestones
    • Iterations
  • Merge requests 602
    • Merge requests 602
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
    • Test Cases
  • Deployments
    • Deployments
    • Releases
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Code review
    • Insights
    • Issue
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • Glasgow Haskell CompilerGlasgow Haskell Compiler
  • GHCGHC
  • Issues
  • #2607
Closed
Open
Issue created Sep 18, 2008 by Simon Marlow@simonmarDeveloper

Inlining defeats selector thunk optimisation

From a post on haskell-cafe.

Lev Walkin wrote:

I wondered why would a contemporary GHC 6.8.3 exhibit such a leak? After all, the technique was known in 2000 (and afir by Wadler in '87) and one would assume Joe English's reference to "most other Haskell systems" ought to mean GHC.

Thanks for this nice example - Don Stewart pointed me to it, and Simon PJ and I just spent some time this morning diagnosing it.

Incedentally, with GHC 6.8 you can just run the program with "+RTS -hT" to get a basic space profile, there's no need to compile it for profiling - this is tremendously useful for quick profiling jobs. And in this case we see the the heap is filling up with (:) and Tree constructors, no thunks.

Here's the short story: GHC does have the space leak optimisation you refer to, and it is working correctly, but it doesn't cover all the cases you might want it to cover. In particular, optimisations sometimes interact badly with the space leak avoidance, and that's what is happening here. We've known about the problem for some time, but this is the first time I've seen a nice small example that demonstrates it.

>     -- Lazily build a tree out of a sequence of tree-building events
>     build :: [TreeEvent] -> ([UnconsumedEvent], [Tree String])
>     build (Start str : es) =
>             let (es', subnodes) = build es
>                 (spill, siblings) = build es'
>             in (spill, (Tree str subnodes : siblings))
>     build (Leaf str : es) =
>             let (spill, siblings) = build es
>             in (spill, Tree str [] : siblings)
>     build (Stop : es) = (es, [])
>     build [] = ([], [])

So here's the long story. Look at the first equation for build:

>     build (Start str : es) =
>             let (es', subnodes) = build es
>                 (spill, siblings) = build es'
>             in (spill, (Tree str subnodes : siblings))

this turns into

      x = build es
      es' = fst x
      subnodes = snd x

      y = build es'
      spill = fst y
      siblings = snd y

now, it's the "siblings" binding we're interested in, because this one is never demanded - in this example, "subnodes" ends up being an infinite list of trees, and we never get to evaluate "siblings". So anything referred to by siblings will remain in the heap.

The space-leak avoidance optimisation works on all those "fst" and "snd" bindings: in a binding like "siblings = snd y", when y is evaluated to a pair, the GC will automatically reduce "snd y", so releasing the first component of the pair. This all works fine.

But the optimiser sees the above code and spots that es' only occurs once, in the right hand side of the binding for y, and so it inlines it. Now we have

      x = build es
      subnodes = snd x

      y = build (fst x)
      spill = fst y
      siblings = snd y

Now, usually this is a good idea, but in this case we lost the special space-leak avoidance on the "fst x" expression, because it is now embedded in an expression. In fact in this case the thunk goes away entirely, because build is strict.

But now, when the program runs, the thunk for siblings retains y, which retains x, which evaluates to a pair, the second component of which evaluates to an infintely growing list of Trees (the first components is a chain of "fst y" expressions that constantly get reduced by the GC and don't take up any space).

Trac metadata
Trac field Value
Version 6.8.3
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system Unknown
Architecture Unknown
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Assignee
Assign to
Time tracking