`-fprof-late` impacts optimization more than expected.
@sheaf reported that he saw a bind operator showing up unreasonably often in a core profile with `-fprof-late`.
The reason is fairly silly.

    module A where
    f x = expr

    module B where
    g = ... f ...

We currently simplify `f`, then attach the late cost centre. However, after doing so we update the unfolding for `f` again, which gives `f` the unfolding `f = \x -> {scc f} expr`.
This means that if we inline `f`, its cost centre will show up in the body of `g`, potentially inhibiting further optimization.
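
To make the effect concrete, here is a source-level approximation of what `g` ends up looking like once the annotated unfolding has been inlined. This is only a sketch: the body `x * x + 1` is a hypothetical stand-in for `expr`, `gAfterInlining` is an illustrative name, and the real transformation happens on Core rather than on source code.

```haskell
module Main (main) where

-- Hypothetical definition standing in for `f x = expr`.
f :: Int -> Int
f x = x * x + 1

-- `g` as written in the source: a plain call to `f`.
g :: [Int] -> Int
g xs = sum (map f xs)

-- Approximation of `g` after inlining the unfolding `\x -> {scc f} expr`:
-- the cost centre annotation now sits inside the body of `g`.
gAfterInlining :: [Int] -> Int
gAfterInlining xs = sum (map (\x -> {-# SCC "f" #-} (x * x + 1)) xs)

main :: IO ()
main = print (g [1 .. 10] == gAfterInlining [1 .. 10])
```

Both versions compute the same result. The difference is that the optimiser has to preserve the annotation in the second one, which is what can get in the way of later transformations.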
We can avoid this by simply making the unfolding of `f` stable after the core pipeline. This way the cost centre won't be included in the unfolding.
The downside, of course, is that the execution of `expr` inside `f` "disappears" from the profile whenever `f` is inlined. But that's the price we pay, and if desired, users can either add a cost centre to `f` manually or enable `-fprof-auto-top` for the module defining `f`.
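
As a rough illustration of the manual workaround, the sketch below attaches an explicit cost centre to `f` with an `SCC` pragma. The function bodies are hypothetical stand-ins for `expr` and for the elided body of `g`, and both definitions are placed in one module only to keep the example self-contained.

```haskell
module Main (main) where

-- Hypothetical stand-in for `f x = expr`, with a manually added cost centre
-- so that the cost of the body is still reported under "f".
f :: Int -> Int
f x = {-# SCC "f" #-} (x * x + 1)

-- Hypothetical stand-in for `g = ... f ...`.
g :: [Int] -> Int
g xs = sum (map f xs)

main :: IO ()
main = print (g [1 .. 10])
```

Assuming a profiling-enabled GHC, this can be built with `ghc -prof -fprof-late -rtsopts Main.hs` and run with `./Main +RTS -p` to get a `.prof` report. The other option mentioned above, `-fprof-auto-top`, can be passed on the command line or, as far as I know, set per module via `{-# OPTIONS_GHC -fprof-auto-top #-}`. Either way, the explicit cost centre brings back the same optimization trade-off for `f`.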
!7797 (closed) is one approach to a fix for this.