SpecConstr regression in NoFib's `spectral/ansi`
With a recent master GHC, I observe a perf regression in NoFib's spectral/ansi in the presence of -fspec-constr.
$ _build/stage1/bin/ghc -O Main.hs
$ ./Main 400 +RTS -t < ansi.stdout > /dev/null
<<ghc: 24321415416 bytes, 5880 GCs, 258273/364656 avg/max bytes residency (28 samples), 7M in use, 0.000 INIT (0.000 elapsed), 2.297 MUT (2.290 elapsed), 0.167 GC (0.171 elapsed) :ghc>>
$ _build/stage1/bin/ghc -O -fspec-constr Main.hs
$ ./Main 400 +RTS -t < ansi.stdout > /dev/null
<<ghc: 32211087200 bytes, 6471 GCs, 253717/401024 avg/max bytes residency (30 samples), 8M in use, 0.000 INIT (0.000 elapsed), 2.836 MUT (2.827 elapsed), 0.211 GC (0.215 elapsed) :ghc>>
(An aside: perhaps piping ansi.stdout into the program is not how the benchmark is meant to be run, but we shouldn't regress either way.)
Note that the second run allocates 33% more. This is due to SpecConstr introducing reboxing.
Having a hunch, I reverted !11689 (closed) and got the following results:
$ _build/stage1/bin/ghc -O -fspec-constr Main.hs
$ ./Main 400 +RTS -t < ansi.stdout > /dev/null
<<ghc: 22238093736 bytes, 5407 GCs, 258627/350776 avg/max bytes residency (22 samples), 7M in use, 0.000 INIT (0.000 elapsed), 2.127 MUT (2.120 elapsed), 0.125 GC (0.129 elapsed) :ghc>>
So that improved; hence !11689 (closed) introduces a regression of roughly 45% in allocations (and about 33% in mutator time) for spectral/ansi. Perhaps we should re-evaluate that patch or find a way to keep it from regressing.
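For the record, the relative increases can be recomputed from the "bytes" figures in the +RTS -t output above. This is just a throwaway sketch; the helper pctIncrease and the variable names are mine and not part of NoFib or GHC:

```haskell
module Main where

import Text.Printf (printf)

-- Relative increase of `new` over `old`, in percent.
pctIncrease :: Double -> Double -> Double
pctIncrease old new = (new - old) / old * 100

main :: IO ()
main = do
  -- Bytes allocated, copied from the three +RTS -t runs above.
  let plainO   = 24321415416  -- -O, current master
      specNew  = 32211087200  -- -O -fspec-constr, current master
      specOld  = 22238093736  -- -O -fspec-constr, !11689 reverted
  printf "vs. plain -O:        +%.1f%% bytes allocated\n" (pctIncrease plainO specNew)
  printf "vs. !11689 reverted: +%.1f%% bytes allocated\n" (pctIncrease specOld specNew)
```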
I began to diagnose.
Here's a diff of the specialisations we do according to -ddump-spec-constr, with !11689 (closed) reverted (i.e., OLD) first:
loop [Occ=LoopBreaker] :: Int -> [Char] -> Interact
[LclId,
Arity=2,
Str=<L><L>,
Unf=Unf{Src=<vanilla>, TopLvl=True,
Value=True, ConLike=True, WorkFree=True, Expandable=True,
Guidance=IF_ARGS [60 30] 585 60},
RULES: "SC:loop0"
forall (sc :: GHC.Prim.Int#).
loop (GHC.Types.I# sc) (GHC.Types.[] @Char)
= $sloop sc
"SC:loop1"
forall (sc :: Char) (sc :: [Char]) (sc :: Int).
loop sc (GHC.Types.: @Char sc sc)
= $sloop sc sc sc
"SC:loop2"
forall (sc :: GHC.Prim.Int#). loop (GHC.Types.I# sc) = $sloop sc]
and now with !11689 (closed) applied (i.e., NEW):
loop [Occ=LoopBreaker] :: Int -> [Char] -> Interact
[LclId,
Arity=2,
Str=<L><L>,
Unf=Unf{Src=<vanilla>, TopLvl=True,
Value=True, ConLike=True, WorkFree=True, Expandable=True,
Guidance=IF_ARGS [60 30] 585 60},
RULES: "SC:loop0"
forall (sc :: GHC.Prim.Int#). loop (GHC.Types.I# sc) = $sloop sc
"SC:loop1"
forall (sc :: Char) (sc :: [Char]) (sc :: Int).
loop sc (GHC.Types.: @Char sc sc)
= $sloop sc sc sc]
Apparently, we lose the specialisation
RULES: "SC:loop0"
forall (sc :: GHC.Prim.Int#).
loop (GHC.Types.I# sc) (GHC.Types.[] @Char)
= $sloop sc
which IMO is an instance of the more general rule (from OLD)
"SC:loop2"
forall (sc :: GHC.Prim.Int#). loop (GHC.Types.I# sc) = $sloop sc]
With !11689 (closed), we never generate the first one, only the second one. But the $sloop of the second one needs to rebox its I# sc, resulting in the huge regression, whereas the $sloop for the [] specialisation does not.
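To make "reboxing" concrete, here is a hand-written sketch of the shape of the problem. It is not the code from spectral/ansi and not what GHC actually generates; consume, loopBoxed, sloopUnboxed and viaSpecialised are hypothetical names I made up for illustration:

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), Int#)

-- Hypothetical consumer that wants a *boxed* Int. It stands in for whatever
-- part of a loop body passes the argument on without unpacking it.
consume :: Int -> [Char] -> Int
consume n cs = n + length cs
{-# NOINLINE consume #-}

-- Analogue of the unspecialised function: it receives the box from its
-- caller and simply passes it along, so nothing new is allocated here.
loopBoxed :: Int -> [Char] -> Int
loopBoxed n cs = consume n cs

-- Analogue of a specialisation for the call pattern `loop (I# sc)`:
-- the caller's box has been stripped away, so it has to be rebuilt
-- before calling `consume`. That fresh I# is the reboxing.
sloopUnboxed :: Int# -> [Char] -> Int
sloopUnboxed n# cs = consume (I# n#) cs

-- Analogue of the rewrite rule: a call with a visible I# constructor is
-- redirected to the specialised version.
viaSpecialised :: Int -> [Char] -> Int
viaSpecialised (I# n#) cs = sloopUnboxed n# cs

main :: IO ()
main = do
  print (loopBoxed 3 "ab")
  print (viaSpecialised 3 "ab")
```

Both paths compute the same result; the difference is only that the specialised one has to allocate a new box on every call where the original could reuse the one it was given.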
Although I'm tempted to accept this regression because it is ultimately a result of a lack of awareness of reboxing in SpecConstr, I wonder why we so easily discard the specialisation for [].