I've tried to compile my vinyl-like records with 7.10.2-rc1 and got:
ghc: panic! (the 'impossible' happened)
  (GHC version 7.10.1.20150612 for x86_64-unknown-linux):
    Simplifier ticks exhausted
    When trying RuleFired Class op rreplace
    To increase the limit, use -fsimpl-tick-factor=N (default 100)
    If you need to do this, let GHC HQ know, and what factor you needed
    To see detailed counts use -ddump-simpl-stats
    Total ticks: 225484

Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
The attached self-contained test compiles with 7.10.1 but fails with 7.10.2-rc1 when compiled with optimizations.
Trac metadata
Trac field              Value
Version                 7.10.1-rc1
Type                    Bug
TypeOfFailure           OtherFailure
Priority                normal
Resolution              Unresolved
Component               Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
$ /opt/ghc/7.10.1/bin/ghc -O -fforce-recomp Bug.hs -fsimpl-tick-factor=50
[1 of 1] Compiling Bug ( Bug.hs, Bug.o )
$ /opt/ghc/7.10.2/bin/ghc -O -fforce-recomp Bug.hs -fsimpl-tick-factor=5000
[1 of 1] Compiling Bug ( Bug.hs, Bug.o )
ghc: panic! (the 'impossible' happened)
  (GHC version 7.10.1.20150609 for x86_64-unknown-linux):
    Simplifier ticks exhausted
    When trying UnfoldingDone $fFunctorConst_$cfmap
    To increase the limit, use -fsimpl-tick-factor=N (default 100)
    If you need to do this, let GHC HQ know, and what factor you needed
    To see detailed counts use -ddump-simpl-stats
    Total ticks: 11274000
I am still churning through this one but I think I finally have some traction. In 7.10.2 each field added to the records roughly doubles the simpl-tick-factor necessary to finish.
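To put the doubling in perspective, here is a back-of-the-envelope sketch in Haskell. It is not part of the test case. It assumes the growth is exactly a factor of two per field, which is only a rough reading of the observation above; the name estimatedFactor and the baseline of 100 (GHC's default factor) are illustrative choices, not measurements.

```haskell
module Main where

-- | Rough extrapolation only: given a tick factor that is just sufficient
-- for some record, estimate the factor needed after adding more fields,
-- ASSUMING the requirement doubles with every added field.
estimatedFactor :: Int -> Int -> Int
estimatedFactor baseline extraFields = baseline * 2 ^ extraFields

main :: IO ()
main =
  -- Hypothetical baseline: the default factor of 100 is assumed to be just
  -- enough for the starting record; print the estimate for 0 to 5 extra fields.
  mapM_ (print . estimatedFactor 100) [0 .. 5]
```

Under that assumption, raising -fsimpl-tick-factor only buys a handful of extra fields, so it is a workaround rather than a fix.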
I am now comparing the sequences of inlinings performed by GHC 7.10.1 and 7.10.2. For those following along at home, it is enlightening (albeit verbose) to compile like this,
One notices certain patterns in this output. In particular, one finds that inlinings of Data.Functor.Identity.$fFoldableIdentity2 are performed far more often in 7.10.1 than in 7.10.2. It is worth noting that 7.10.2 does inline this definition whenever it considers doing so.
Notice the second "frame" of the context: ApplyToVal nodup ((g x) ...) (7.10.1) in contrast to ApplyToVal simpl ((g x) ...) (7.10.2). I'm not yet sure whether this is significant.
I have also noticed that GHC 7.10.1 often appears to be simplifying inside of a cast when 7.10.2 is just working inside of Const. For instance, it is common to see the SimplCont of 7.10.1 terminate with,
One obvious qualitative difference between the two simplifier trajectories is that 7.10.2 emits its last SimplBind trace (for $s$crsubset_s2gQ) somewhere around 13% of the way through the log (measured in lines). In contrast, 7.10.1 emits SimplBind traces regularly throughout the log. I suppose this means that 7.10.2 is spending most of its time playing with this one top-level binding?
Well, I quickly bisected to get an idea of what I should be looking for. It seems the regression was introduced by the fix for the huge space leak in the mighty simplifier (8af219ad). If I understand correctly, this patch touched code surrounding the treatment of interesting arguments. Perhaps an inadvertent change here is now resulting in more inlining than we would like. I'll have another look after I've slept on it to see if this revelation gets us any closer to a fix.
Reading through the patch, it indeed seems that the NoDup to Simplified change that I noticed above could have arisen out of this commit. See the treatment of simplArg at https://phabricator.haskell.org/rGHC8af219adb914b292d0f8c737fe0a1e3f7fb19cf3#C60694NL1193 for how. That being said, using dup_flag instead of Simplified on line 1203 doesn't seem to help. I'll need more sleep to get any further, I'm afraid.
Another simple but perhaps unsurprising observation:
If the call to addBndrRules in simplRecBind is short-circuited (addBndrRules env _ out_id = return (env, out_id)), compilation completes. Neither of the other calls to addBndrRules has this property when short-circuited.
With both the current state of the ghc-7.10 branch and ghc-7.10.1-release. Note that this test case has been reduced to the bare minimum needed to reproduce on 7.10.2 yet not on 7.10.1. Lowering simpl-tick-factor to 3 causes it to fail on both.
Another thing I have noticed is that the good commit considers some inlining contexts to be RuleArgCtxt which the bad commit considers to be BoringCtxt. This difference may be due to the good commit's eagerness to inline $fFoldableIdentity2 and friends.
Consider, for instance, this inlining which occurs in a RuleArgCtxt (and which directly follows an inlining of $fFoldableIdentity2),
Also interesting: this inlining of $fFunctorIdentity2 is completely unchanged between the two commits other than the IdInfo of a binding in the context. Namely, both have a context entry that looks like this,
where the bad commit has the Unfolding for dt_X1PT and the good commit does not. Things of this nature appear to pop up fairly regularly later in the simplifier pass. Moreover, it appears that the context contains more bindings (all of which have unfoldings) as the simplifier iteration progresses. See, for instance, this example: https://gist.github.com/bgamari/c96e13404f00202f2902 (where dt_X1PT, dt_X1U4, dt_X1U6, dt_X1Uc, and dt_X1Ue all have unfoldings with the bad commit and none with the good commit).
The above notes are getting to be a bit of a mess. Here's something of a summary of the current state of things.
I have established that the memory leak patch is the source of the regression. Below I will refer to the commit directly before this patch (9b406cc6) as the working commit and the regression commit (8af219ad) as the failing commit. Looking at verbose-core2core output, it seems there is no difference in the compilation up to the Float out(FOS {Lam = Just 0, Consts = True, OverSatApps = False}) phase. It is apparently during this phase that the simplifier blows up.
At this point it's not clear to me whether one or more of these differences are the key to the issue or simply inconsequential fall-out due to the move away from CoreSubst. In particular the differences are,
One often sees that terms in the context of inlinings that are marked
as NoDup in 7.10.1 are marked as Simplified in 7.10.2. The new
implementation of simplArg appears to touch this logic but my tests
suggest that this isn't the cause of the difference.
$fFoldableIdentity2 and $fGeneric1Const are both inlined
This results in a rather obvious pattern in the differences between the
contexts of the inlinings performed by the two commits. For instance,
in the working commit one often sees contexts ending with,
Several inlinings further are performed in this state until eventually
$fFoldableIdentity2 or $fGeneric1Const are inlined by the bad
commit, which wipes away the difference.
One sees let bindings in inlining contexts which have unfoldings in the bad commit and no unfoldings in the good commit. (comment:17)
One occasionally sees inlinings being considered in continuations
which are classified differently by the two commits: namely, the
good commit often considers a continuation to be RuleArgCtxt while
the bad commit considers it to be BoringCtxt.
Interesting but unlikely to be the cause: the failing commit
considers (and rejects) a few inlinings that the
working commit does not. These typically look like,
Considering inlining: g_a2cv
  arg infos [TrivArg]
  interesting continuation BoringCtxt
  some_benefit True
  is exp: False
  is work-free: False
  guidance IF_ARGS [] 50 0
  discounted size = 40
  ANSWER = NO
g_a2cv is a let binding defined in several different bindings
(namely $dmrreplace, $dmrcast, and $dmrput). It always
appears to be of the form Rec ss_a1LQ -> m (Rec ss_a1LQ) with
m either Const _ or Identity depending upon the context.
It's interesting that the failing commit considers and rejects
inlining this binding in all three cases, whereas the working commit
doesn't even consider it.
There are a few other bindings (e.g. variants of lens) which also
exhibit this difference.
With both the good and bad commit. Note that this test case has been reduced to the bare minimum needed to reproduce on the bad commit yet not on the good commit. However, it is very sensitive: lowering simpl-tick-factor to 3 causes it to fail on both.
Simon has identified that the issue is the simplifier expending a great deal of effort simplifying an argument which is ultimately ignored by the callee.
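For anyone wondering how an argument can end up being ignored by the callee in this kind of code, here is a small Haskell sketch. It is not the Core from this ticket, and Pair, ageL, ageLBroken, view and set are hypothetical names made up for the illustration. It relies only on the fact that fmap for Const discards its function argument, so a van Laarhoven lens run at Const (a getter) never uses the rebuild function that the lens passes to fmap. The sketch shows this at run time; the ticket concerns the analogous wasted effort inside the simplifier at compile time.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Control.Applicative (Const (..))
import Data.Functor.Identity (Identity (..))

-- Hypothetical two-field record standing in for the Rec type of the ticket.
data Pair = Pair { name :: String, age :: Int } deriving (Eq, Show)

-- A van Laarhoven lens: the functor f is chosen by the caller.
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Lens onto the age field. The lambda handed to fmap rebuilds the record.
ageL :: Lens Pair Int
ageL g (Pair n a) = fmap (\a' -> Pair n a') (g a)

-- Getter: runs the lens at Const, whose fmap ignores its function argument.
view :: Lens s a -> s -> a
view l = getConst . l Const

-- Setter: runs the lens at Identity, whose fmap does apply the function.
set :: Lens s a -> a -> s -> s
set l x = runIdentity . l (const (Identity x))

-- Same shape as ageL, but the rebuild function is bottom. Viewing through
-- it still succeeds, because Const's fmap never looks at that argument.
ageLBroken :: Lens Pair Int
ageLBroken g (Pair _ a) = fmap (error "rebuild function was forced") (g a)

main :: IO ()
main = do
  let p = Pair "ada" 36
  print (view ageL p)        -- getter via Const
  print (set ageL 37 p)      -- setter via Identity
  print (view ageLBroken p)  -- rebuild function is never forced
```

Any work done on the rebuild function is wasted when the lens is only ever used as a getter, which is the flavour of the problem described above.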
I have merged 07a1f32e into the ghc-7.10 branch which should resolve this for 7.10.2. Simonpj is working on a more thorough fix for master. I'll leave this open until the latter fix has been merged.
Simon, your explanation raises a question: what would happen if the argument weren't thrown away? It seems that the simplifier would blow up in this case, no? Perhaps this is another issue that should be tracked?
=====> T4945(normal) 1533 of 4449 [0, 1, 0]
cd ./simplCore/should_compile && $MAKE -s --no-print-directory T4945 </dev/null > T4945.run.stdout 2> T4945.run.stderr
Actual stdout output differs from expected:
--- ./simplCore/should_compile/T4945.stdout 2015-07-06 17:16:38.859135774 -0400
+++ ./simplCore/should_compile/T4945.run.stdout 2015-07-07 03:57:30.592499506 -0400
@@ -1,7 +0,0 @@
- -> STUArray RealWorld Int Int
- (ipv3 [OS=OneShot] :: STUArray RealWorld Int Int) ->
- case ipv3 of _ [Occ=Dead] { STUArray ds5 ds6 dt ds7 ->
- (Data.Array.Base.STUArray
- (Data.Array.Base.STUArray
- (Data.Array.Base.STUArray
- (Data.Array.Base.STUArray
*** unexpected failure for T4945(normal)
perf/should_run T5113 [stat not good enough] (normal)
=====> T5113(normal) 2426 of 4449 [0, 2, 0]
cd ./perf/should_run && "/home/ben/trees/ghc/ghc-7.10/inplace/bin/ghc-stage2" -o T5113 T5113.hs -fforce-recomp -dcore-lint -dcmm-lint -dno-debug-output -no-user-package-db -rtsopts -fno-warn-tabs -fno-ghci-history -O > T5113.comp.stderr 2>&1
cd ./perf/should_run && ./T5113 +RTS -V0 -tT5113.stats --machine-readable -RTS </dev/null > T5113.run.stdout 2> T5113.run.stderr
bytes allocated value is too high:
    Expected    T5113(normal) bytes allocated: 8000000 +/-5%
    Lower bound T5113(normal) bytes allocated: 7600000
    Upper bound T5113(normal) bytes allocated: 8400000
    Actual      T5113(normal) bytes allocated: 806747568
    Deviation   T5113(normal) bytes allocated: 9984.3 %
*** unexpected stat test failure for T5113(normal)
perf/compiler T9961 [stat too good] (normal)
=====> T9961(normal) 2474 of 4449 [0, 2, 0]
cd ./perf/compiler && "/home/ben/trees/ghc/ghc-7.10/inplace/bin/ghc-stage2" -c T9961.hs -fforce-recomp -dno-debug-output -no-user-package-db -rtsopts -fno-warn-tabs -fno-ghci-history -O +RTS -V0 -tT9961.comp.stats --machine-readable -RTS > T9961.comp.stderr 2>&1
bytes allocated value is too low:
(If this is because you have improved GHC, please
update the test so that GHC doesn't regress again)
    Expected    T9961(normal) bytes allocated: 663978160 +/-5%
    Lower bound T9961(normal) bytes allocated: 630779252
    Upper bound T9961(normal) bytes allocated: 697177068
    Actual      T9961(normal) bytes allocated: 616521968
    Deviation   T9961(normal) bytes allocated: -7.1 %
*** unexpected stat test failure for T9961(normal)
Simon, I've left a comment on #5113 (closed):ticket:10527#comment:101453 describing a concern I have with your patch against ghc-7.10. When you get a chance, could you briefly explain how argument expressions are now supposed to make it to the rebuilder? Currently it appears that we merely perform substitutions on them, which means rules won't fire on them.