I'm going to start on this, but I'd like to solicit some input on one design decision. Internally, GHC uses both a PrimRep type and a RuntimeRep type. The relationship between the two is well documented in Note [RuntimeRep and PrimRep] and Note [Getting from RuntimeRep to PrimRep]. The change that should be made to RuntimeRep is obvious. It is specified in the proposal. The change to PrimRep is (to me) less obvious. Currently, PrimRep is:
data PrimRep = LiftedRep | UnliftedRep | IntRep ...
I suspect that it would be a mistake to change PrimRep to:
It's worth reading Note [RuntimeRep and PrimRep] in RepType.hs, e.g.
The "representation or a primitive entity" specifies what kind of register is needed and how many bits are required. The data type TyCon.PrimRep enumerates all the possibilities.
So I think that by the time we are talking of a PrimRep we should not have any levity polymorphism nor runtime-rep polymorphism. And if we don't have polymorphism then the existing PrimRep should work just fine.
TL;DR: I suggest you start off by not changing PrimRep and see if it bites you.
I agree: leave PrimRep alone for now. It shouldn't be all that hard to keep RuntimeRep and PrimRep with (slightly) different shapes.
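To make the "different shapes" idea concrete, here is a minimal toy sketch. These are not GHC's real definitions: in GHC, RuntimeRep lives at the type level and the conversion works over Type. The names RIntRep, RepVar and toPrimRep are hypothetical, introduced only for illustration. RuntimeRep gets the parameterised BoxedRep constructor, PrimRep stays flat, and the conversion is only defined once no representation variable remains.

```haskell
module Main where

-- Toy stand-ins for the types discussed above; NOT GHC's real definitions.
data Levity = Lifted | Unlifted
  deriving (Eq, Show)

-- RuntimeRep with a single BoxedRep constructor parameterised by Levity.
-- RIntRep and RepVar are hypothetical names: RIntRep avoids a clash with
-- PrimRep's IntRep in this one-module toy, and RepVar models a
-- representation variable (i.e. runtime-rep polymorphism).
data RuntimeRep
  = BoxedRep Levity
  | RIntRep
  | RepVar String
  deriving (Eq, Show)

-- PrimRep keeps its existing flat shape.
data PrimRep = LiftedRep | UnliftedRep | IntRep
  deriving (Eq, Show)

-- Convert a RuntimeRep to a PrimRep. The result is Nothing when the
-- representation is still polymorphic, since a PrimRep must be concrete.
toPrimRep :: RuntimeRep -> Maybe PrimRep
toPrimRep (BoxedRep Lifted)   = Just LiftedRep
toPrimRep (BoxedRep Unlifted) = Just UnliftedRep
toPrimRep RIntRep             = Just IntRep
toPrimRep (RepVar _)          = Nothing

main :: IO ()
main = mapM_ (print . toPrimRep)
  [BoxedRep Lifted, BoxedRep Unlifted, RIntRep, RepVar "r"]
```

The point of the sketch is only that the two types can disagree in shape as long as the translation between them is total on monomorphic representations.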
Do shout if you run into trouble. There is a fair bit of magic in the way that RuntimeRep is wired in; you will have to update it. I don't think there are particular unexpected twists -- and there are some nice examples of parameterized constructors of RuntimeRep to use as exemplars -- but do be careful.
Fortunately, this has been more straightforward than I expected it to be. On !2249 (closed), I've run into an issue (the full error message is listed there) where the new Levity type (data Levity = Lifted | Unlifted) results in:
/tmp/ghc26284_0/ghc_3.s:138995:0: error:
     Error: symbol `ghczmprim_GHCziTypes_zdtcLevity_closure' is already defined
       |
138995 | ghczmprim_GHCziTypes_zdtcLevity_closure:
       | ^
The same error message shows up for both data constructors as well. My suspicion is that I've messed up something about how Levity is wired in, which must be causing code gen for both the definition in GHC.Types and the wired-in definition. I'm stuck on this at the moment, and if anyone would be able to look over what I've done, that would be appreciated.
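For anyone reading the assembler error, it may help to decode the symbol name. The sketch below handles only the small subset of GHC's Z-encoding that appears in this symbol (zm, zi, zd, plus zu and zz); zDecode is a hypothetical helper, not a GHC function. The decoded name ends in $tcLevity, which, as far as I know, is the naming GHC uses for the generated Typeable representation of a type constructor, so the duplicate may be that binding rather than the type itself. That reading is an assumption, not a diagnosis.

```haskell
module Main where

-- Decode a small subset of GHC's Z-encoding. Only the lowercase escapes
-- listed here are handled; anything else is passed through unchanged.
zDecode :: String -> String
zDecode ('z':'m':rest) = '-' : zDecode rest  -- zm encodes '-'
zDecode ('z':'i':rest) = '.' : zDecode rest  -- zi encodes '.'
zDecode ('z':'d':rest) = '$' : zDecode rest  -- zd encodes '$'
zDecode ('z':'u':rest) = '_' : zDecode rest  -- zu encodes '_'
zDecode ('z':'z':rest) = 'z' : zDecode rest  -- zz encodes a literal 'z'
zDecode (c:rest)       = c : zDecode rest
zDecode []             = []

main :: IO ()
main = putStrLn (zDecode "ghczmprim_GHCziTypes_zdtcLevity_closure")
-- prints: ghc-prim_GHC.Types_$tcLevity_closure
```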
On an entirely unrelated note, I was digging through old commits. In d8c64e86, @rae included a comment which has since been deleted:
but this slowed down GHC because every time we looked at *, we had to follow a bunch of pointers. When we have unpackable sums, we should go back to the stratified representation.
I had not considered that this might degrade performance, but I should try to measure this after I have the branch working. It would be nice to know empirically just how much compile times must suffer for this feature. I don't really see how unpackable sums help address this problem since in Core, you'd still have to use AppTy or TyConApp everywhere. Regardless, there may be something that can be done to help performance, although I'm more concerned with having it work at all first.
Performance is a worry here. But after flattening the RuntimeRep structure to what's in HEAD, the performance problems didn't really go away. So there are a few hacks in GHC to try to preserve Type without expanding it. See Note [Expanding synonyms during unification] in TcUnify and TcCanonical.zonk_eq_types, which were both invented (if memory serves) to deal with performance trouble once Type became a type synonym. Bottom line: these optimizations will continue to work just as well as they have, so you may get away without trouble.
In the MR, I've nearly completed the first part of the proposal (adding BoxedRep). There are documentation holes and test failures I'm working through, but I think the difficult part is behind me. The second part (changing type signatures of all the primops) should not be hard.
One issue I'm running into is that I have some changes to binary and haddock that are on personal forks, and gitlab CI cannot see these commits. What is the procedure for getting my branches of these into somewhere visible to CI?
I usually just push them to the main repository instead of my own fork and open a PR against their GH repos. That works just fine, even though they are just mirrors.
Alternatively, you can probably alter the .gitmodules file to point to your own fork.