`-fprof-late` impacts optimization more than expected.
@sheaf reported that a bind operator was showing up unreasonably often in a Core profile built with `-fprof-late`.
The reason is fairly silly. Consider:
```haskell
module A where
f x = expr

module B where
g = ... f ...
```
We currently simplify `f`, then attach the late cost centre. However, after doing so we update the unfolding for `f` again, which leaves `f`'s unfolding as `f = \x -> {scc f} expr`.
This means that if we inline `f`, its cost centre shows up in the body of `g`, potentially inhibiting further optimization.
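As a rough source-level illustration of the effect (not actual Core output), the sketch below writes out by hand what `g` might look like once `f`'s unfolding carries the cost centre. The concrete bodies and the name `gInlined` are hypothetical stand-ins for `expr` and for the result of inlining; the SCC pragma is only an approximation of the `{scc f}` tick.

```haskell
module Main where

-- Hypothetical body standing in for `expr` in module A.
f :: Int -> Int
f x = x + 1

-- g as written in module B: it simply calls f.
g :: Int -> Int
g y = f y * 2

-- Hand-written approximation of g after inlining an unfolding of the
-- form `\x -> {scc f} expr`: the cost centre annotation travels with
-- the inlined body and now sits in the middle of g's own code.
gInlined :: Int -> Int
gInlined y = ({-# SCC "f" #-} (y + 1)) * 2

main :: IO ()
main = print (g 3, gInlined 3)
```

Both definitions compute the same value; the difference is only that the annotated expression in `gInlined` is something the simplifier has to work around.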
We can avoid this by simply making `f`'s unfolding stable after the core pipeline. That way the unfolding is not updated again once the late cost centre is attached, so the cost centre won't be included in it.
The downside, of course, is that the execution of `expr` inside `f` "disappears" from the profile wherever `f` is inlined. But that's the price we pay, and if desired users can either add a cost centre to `f` manually or enable `-fprof-auto-top` for the module defining `f`.
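For reference, a minimal sketch of the manual workaround, using a hypothetical body in place of `expr`: an explicit SCC pragma on the right-hand side of `f` keeps its cost attributed to `f`. The alternative mentioned above would be an `{-# OPTIONS_GHC -fprof-auto-top #-}` pragma at the top of the defining module.

```haskell
module Main where

-- Hypothetical stand-in for `f x = expr` from module A.
-- The explicit SCC pragma asks for the cost of the body to be
-- attributed to a cost centre named "f".
f :: Int -> Int
f x = {-# SCC "f" #-} (x * x + 1)

main :: IO ()
main = print (sum (map f [1 .. 10]))
```

The pragma is ignored in non-profiled builds; to see the cost centre in a report, the program would need to be compiled with `-prof` and run with `+RTS -p`.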
!7797 (closed) is one approach to a fix for this.