Avoid TidyPgm predicting what CorePrep will do

At the moment the TidyPgm pass is forced to predict, accurately but unpleasantly, some aspects of what CorePrep and Core2Stg will do. Reason:

Each Id in the interface file records (among other things) the arity of the Id, and whether it has CAF references.
We really only know these two things for sure after CorePrep. The conversion from Core to STG makes no structural changes.
However, the result of TidyPgm (which precedes CorePrep) is used to generate the interface file, so TidyPgm has to predict the arity and CAF-ref status of each Id.
This is not good. It restricts what CorePrep can do (notably, it must not change the arity of a top-level Id), and it leads to unsavoury code (e.g. look at the call to CorePrep.cvtLitInteger in TidyPgm.cafRefsL). It's also dangerous: an inconsistency could lead to a crash.

This is a long-standing problem. My current thought for how to unravel it is this:

TidyPgm does not attach arity or CAF-ref info.
Instead, run CorePrep after TidyPgm, and record accurate arity and CAF-ref info for each top-level Id in an auxiliary mapping.
Then use that auxiliary mapping during the conversion from tidied program to ModIface.
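
The three steps above can be sketched on a toy model. This is only an illustration of the shape of the idea: every type and function name below (TidyBind, PrepBind, PrepInfo, IfaceDecl, prepInfoMap, mkIfaceDecls) is a hypothetical stand-in, not GHC's real data structures or API, and names are plain strings rather than real Ids.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for illustration; none of these are GHC's real types.
type Name = String

-- The two facts that are only known for sure after CorePrep.
data PrepInfo = PrepInfo
  { prepArity   :: Int   -- arity of the top-level binding after CorePrep
  , prepHasCafs :: Bool  -- whether the binding has CAF references
  } deriving (Eq, Show)

-- A tidied top-level binding: no arity or CAF-ref info attached.
data TidyBind = TidyBind { tbName :: Name, tbType :: String }

-- A core-prep'd top-level binding, from which the facts can be read off.
data PrepBind = PrepBind { pbName :: Name, pbArity :: Int, pbCafRefs :: Bool }

-- A toy interface entry.
data IfaceDecl = IfaceDecl
  { ifName :: Name
  , ifType :: String
  , ifInfo :: Maybe PrepInfo  -- Nothing if CorePrep produced no entry
  } deriving (Eq, Show)

-- Build the auxiliary mapping from the core-prep'd program.
prepInfoMap :: [PrepBind] -> Map.Map Name PrepInfo
prepInfoMap bs =
  Map.fromList [ (pbName b, PrepInfo (pbArity b) (pbCafRefs b)) | b <- bs ]

-- Convert the tidied program to interface entries, consulting the mapping
-- instead of predicting what CorePrep will do.
mkIfaceDecls :: Map.Map Name PrepInfo -> [TidyBind] -> [IfaceDecl]
mkIfaceDecls info = map decl
  where
    decl b = IfaceDecl (tbName b) (tbType b) (Map.lookup (tbName b) info)
```

Note that mkIfaceDecls needs both the tidied bindings and the mapping derived from the core-prep'd bindings, which is the toy counterpart of the memory cost discussed below.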

I don't think this would be hard. It would mean that the tidied program and the core-prep'd program would have to exist in memory at the same time.

An alternative would be to generate the ModIface from the tidied program sans arity and CAF-ref info, and then, after CorePrep, run over it to add arity and CAF-ref info. (You'd have to do this before generating the fingerprints.) The advantage of this is that the ModIface can be a lot smaller than the code for the entire module.
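
A rough sketch of this alternative, again on a toy model: the interface entries are built first with the two fields left empty, and a later pass fills them in from the post-CorePrep facts. The names IfaceDecl, fillPrepInfo and toyFingerprint are hypothetical, and toyFingerprint is just a stand-in for real fingerprinting, included only to show why the fill-in pass has to come first.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for illustration; not GHC's real types.
type Name = String

-- A toy interface entry whose arity and CAF-ref fields start out unknown.
data IfaceDecl = IfaceDecl
  { ifName    :: Name
  , ifArity   :: Maybe Int
  , ifHasCafs :: Maybe Bool
  } deriving (Eq, Show)

-- After CorePrep, run over the interface entries and add the accurate facts.
-- Entries with no post-CorePrep information are left unchanged.
fillPrepInfo :: Map.Map Name (Int, Bool) -> [IfaceDecl] -> [IfaceDecl]
fillPrepInfo info = map fill
  where
    fill d = case Map.lookup (ifName d) info of
      Just (arity, cafs) -> d { ifArity = Just arity, ifHasCafs = Just cafs }
      Nothing            -> d

-- A toy "fingerprint" that depends on every field of every entry, so it
-- must be computed only after fillPrepInfo has run.
toyFingerprint :: [IfaceDecl] -> String
toyFingerprint = concatMap show
```

In this arrangement only the small interface entries and the mapping need to be kept around while CorePrep runs, not the whole tidied program.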

Another alternative would be to ensure that after CoreTidy we treat each top-level binding one at a time, and pump them right down the pipeline individually, all the way through code generation. That way we would avoid creating the STG, or Cmm, for the entire program all at once.

A long-standing wart which needs some careful attention.

Simon

Edited Mar 10, 2019 by Simon Peyton Jones
