Avoid TidyPgm predicting what CorePrep will do
At the moment the TidyPgm pass is forced to predict, accurately but unpleasantly, some aspects of what CorePrep and Core2Stg will do. Reason:
- Each Id in the interface file records (among other things) the arity of the Id, and whether it has CAF references.
- We really only know these two things for sure after CorePrep. The conversion from Core to STG makes no structural changes.
- However the result of TidyPgm (which precedes CorePrep) is used to generate the interface file. So it has to predict the arity and CAF-ref status of each Id.
- This is not good. It restricts what CorePrep can do (notably, it must not change the arity of a top-level Id), and it leads to unsavoury code (e.g. look at the call to CorePrep.cvtLitInteger in TidyPgm.cafRefsL). It's also dangerous: an inconsistency could lead to a crash.
This is a long-standing problem. My current thought for how to unravel it is this:
- TidyPgm does not attach arity or CAF-ref info.
- Instead, run CorePrep after TidyPgm, and generate accurate arity and CAF-ref info as an auxiliary mapping.
- Then use that auxiliary mapping during the conversion from tidied program to ModIface.
I don't think this would be hard. It would mean that the tidied program and the core-prep'd program would have to exist in memory at the same time.
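To make the shape of that plan concrete, here is a minimal sketch of building interface entries from the tidied program plus an auxiliary mapping. Every type and function name in it (TidyBind, PrepInfo, PrepEnv, IfaceId, mkIfaceIds) is a hypothetical stand-in invented for illustration, not GHC's real ModIface machinery, and it assumes the containers package is available.

```haskell
import qualified Data.Map.Strict as Map

-- All names below are hypothetical stand-ins, not GHC's real types.
type Name = String

-- A tidied top-level binding, carrying no arity or CAF-ref info.
newtype TidyBind = TidyBind { tbName :: Name }

-- What we learn about a top-level binder once CorePrep has run.
data PrepInfo = PrepInfo
  { piArity   :: Int
  , piCafRefs :: Bool
  } deriving (Eq, Show)

-- The auxiliary mapping, built from the core-prep'd program.
type PrepEnv = Map.Map Name PrepInfo

-- An interface-file entry for one Id.
data IfaceId = IfaceId
  { ifName    :: Name
  , ifArity   :: Int
  , ifCafRefs :: Bool
  } deriving (Eq, Show)

-- Build interface entries from the tidied program plus the mapping.
-- A binder missing from the mapping is reported (Left) rather than guessed.
mkIfaceIds :: PrepEnv -> [TidyBind] -> Either Name [IfaceId]
mkIfaceIds env = traverse one
  where
    one (TidyBind n) =
      case Map.lookup n env of
        Nothing                -> Left n
        Just (PrepInfo a cafs) -> Right (IfaceId n a cafs)
```

The point of the sketch is that nothing on the tidy side predicts anything: the arity and CAF-ref fields come only from the mapping, so an inconsistency shows up as an explicit failure instead of a crash later on.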
An alternative would be to generate the ModIface from the tidied program sans arity and CAF-ref info, and then, after CorePrep, run over it to add arity and CAF-ref info. (You'd have to do this before generating the fingerprints.) The advantage of this is that the ModIface can be a lot smaller than the code for the entire module.
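This alternative might look roughly like the following sketch, in which the interface is first built with the two fields blank and then patched once CorePrep has run. Again the names (IfaceId, blankIface, fillIface, readyForFingerprint) are hypothetical and chosen only for illustration; the real ModIface is far richer, and the fingerprinting check here is just a stand-in for "do this before generating the fingerprints".

```haskell
import qualified Data.Map.Strict as Map
import Data.Maybe (isJust)

-- Hypothetical stand-ins, not GHC's real types.
type Name = String

-- Interface entry whose arity and CAF-ref fields start out unknown.
data IfaceId = IfaceId
  { ifName    :: Name
  , ifArity   :: Maybe Int
  , ifCafRefs :: Maybe Bool
  } deriving (Eq, Show)

-- Step 1: build entries from the tidied program, fields left blank.
blankIface :: [Name] -> [IfaceId]
blankIface ns = [ IfaceId n Nothing Nothing | n <- ns ]

-- Step 2: after CorePrep, fill in whatever it determined.
-- Entries with no information are left untouched.
fillIface :: Map.Map Name (Int, Bool) -> [IfaceId] -> [IfaceId]
fillIface env = map fill
  where
    fill d = case Map.lookup (ifName d) env of
      Just (a, c) -> d { ifArity = Just a, ifCafRefs = Just c }
      Nothing     -> d

-- Step 3: only a fully filled-in interface may be fingerprinted.
readyForFingerprint :: [IfaceId] -> Bool
readyForFingerprint = all (\d -> isJust (ifArity d) && isJust (ifCafRefs d))
```

Under this arrangement only the small interface structure and the mapping need to be live together, which is the memory advantage described above.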
Another alternative would be to ensure that after CoreTidy we treat each top-level binding one at a time, and pump them right down the pipeline individually, all the way through code generation. That way we would avoid creating the STG, or Cmm, for the entire program all at once.
A long-standing wart which needs some careful attention.
Simon