Make the enter code for small data types panic
Think of this as a counter-proposal to #21792: it sits at the opposite end of a spectrum on which we currently reside in the middle.
The suggestion of @clyring in https://github.com/ghc-proposals/ghc-proposals/pull/530 to retain the panicking enter code for ByteArray# and friends and instead give it proper, non-zero tags gave me pause. I tried hard to follow my intuition and make it break, but wasn't able to (we'll revisit my attempts below). Thinking about what is so special about ByteArray#, I concluded "nothing" and propose the following:
Make the enter code of any taggable BoxedRep normal form crash.
(Here "taggable" refers to the architecture-specific notion of "small constructor families", i.e., those that can be given a tag.)
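To make the architecture dependence concrete, here is a minimal sketch of the "small family" check. It assumes the usual pointer-tagging scheme of 2 tag bits on 32-bit platforms and 3 tag bits on 64-bit platforms. The names tagBits, maxPtrTag and isSmallFamily are hypothetical helpers for illustration, not GHC's API.

```haskell
module Main (main) where

import Data.Bits (shiftL)

-- Assumption: 2 low pointer bits are free on 32-bit targets, 3 on 64-bit.
tagBits :: Int -> Int
tagBits wordBits = if wordBits >= 64 then 3 else 2

-- Largest non-zero tag that fits in the free bits (tag 0 means "unknown").
maxPtrTag :: Int -> Int
maxPtrTag wordBits = (1 `shiftL` tagBits wordBits) - 1

-- A constructor family is "small" (taggable) if every constructor can get
-- its own non-zero tag.
isSmallFamily :: Int -> Int -> Bool
isSmallFamily wordBits nCons = nCons <= maxPtrTag wordBits

main :: IO ()
main = mapM_ report [32, 64]
  where
    report w =
      putStrLn (show w ++ "-bit: max tag " ++ show (maxPtrTag w)
                ++ ", two-constructor family small: "
                ++ show (isSmallFamily w 2))
```

Under these assumptions, a two-constructor type such as the B used in the reproducer counts as taggable on both word sizes.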
If this is possible, then we'll have a straight win in terms of debugging; if not, we can adapt our reproducer to prove that under Proposal 530 ByteArray# could be legitimately entered.
Here's my attempt to break the suggestion:
data B = T | F deriving Show
b :: B
b = <churn>
main = b `seq` print b
Assuming nothing is inlined, the seq will need to enter b, so there are closures of type B that need to be enterable. But it turns out that we only need thunk or indirection closures to be enterable, not the values T or F. When we enter the thunk for churn, we run its code and ultimately either call another thunk, indirection, or function that returns a properly tagged value, or "see" in the code for churn that the result is plain T or F. In the latter case, the code for churn can be generated so that it never needs to enter T or F and instead makes sure that the proper tags are set.
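For anyone who wants to try this, here is a runnable variant of the reproducer. It is only a sketch: the function churn below is a hypothetical stand-in for the elided computation, and the NOINLINE pragmas are an attempt to approximate the "nothing is inlined" assumption, not a guarantee about what the optimiser does.

```haskell
module Main (main) where

data B = T | F deriving Show

-- Hypothetical stand-in for <churn>: some computation whose result is a
-- plain T or F constructor.
churn :: Int -> B
churn n = if even (sum [1 .. n]) then T else F
{-# NOINLINE churn #-}

-- Top-level binding that is not already a constructor application, so it is
-- expected to start out as a thunk that seq has to enter.
b :: B
b = churn 1000
{-# NOINLINE b #-}

main :: IO ()
main = b `seq` print b
```

Since sum [1 .. 1000] is 500500, which is even, the program prints T. If the proposal holds, it should keep doing so even when the enter code of T and F panics.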
I'm a bit hazy about indirectee pointers, especially static ones and perhaps blackholes. Consider the example above; after b has been seq'd, it will be evaluated by print again. By then, b no longer points to the thunk object for churn but to an INDirection closure, the payload pointer (indirectee pointer) of which must absolutely be properly tagged. Is that always the case?