Draft: Track changes in DmdAnal to avoid rebuilding expressions
This introduces a Tainted
data type vendored from https://hackage.haskell.org/package/Tainted-0.1.0.2 and heavily modified:
data Tainted a = Dirty a | Clean a -- impl a bit different, but this is the external interface
The resolution #19584 ultimately being the goal, I then used this new data type to refactor DmdAnal to track changes to the syntax tree with this data type, mostly to re-use the old expression when there was no change to it. Normally, this kind of change tracking has absolutely devastating impact on code readability, but with this new data type it feels much less so (although all the applicative combinators and what not bear some overhead).
This basically supersedes !5267 (closed) with a simpler interface and implementation. Here's what it looks like for App
:
dmdAnal' env dmd e@(App fun arg)
= let
call_dmd = mkCalledOnceDmd dmd
S2 fun_ty fun' = dmdAnal env call_dmd fun
(arg_dmd, res_ty) = splitDmdTy fun_ty
S2 arg_ty arg' = dmdAnalStar env (dmdTransformThunkDmd arg arg_dmd) arg
-- this is the important line:
in S2 (res_ty `plusDmdType` arg_ty) (setWhenClean e $ App <$> fun' <*> arg')
S2
is just a new strict pair data constructor of type SPair
, the introduction of which I think was long overdue in GHC's code base.
The applicative combinators propagate Dirty
ness. So if either fun'
or arg'
was dirty (which it is if some annotation in the respective subexpression changed), then we get a Dirty (App fun' arg')
, basically. If both are Clean
, then we get a Clean (App fun' arg')
, for which we already have the equivalent expression e
, so we call setWhenClean
to replace the Clean
value and won't even allocate App
in the heap. Of course, this only begins to matter for huge expressions.
Other use cases are possible, like in CoreTidy in !5267 (closed).
I tried to translate all the bangs introduced by !5399 (closed) to the new impl, but may have failed to do so in some cases... At least it appears that ghc/alloc performance generally increases.