FixedRuntimeRep: check type arguments instead of value arguments for Ids with no binding
Currently, we rule out bad representation-polymorphism in primops (and other Id
s with no binding) through tcRemainingValArgs
, which checks that the remaining value arguments all have a fixed RuntimeRep
.
However, @bgamari points out in #21868 that we also need to rule out representation-polymorphism in other places that aren't arguments, such as the return RuntimeRep
r_rep
of catch#
or keepAlive#
:
catch# :: forall (r_rep :: RuntimeRep) (r :: TYPE r_rep) w.
(State# RealWorld -> (# State# RealWorld, r #) )
-> (w -> State# RealWorld -> (# State# RealWorld, r #) )
-> State# RealWorld
-> (# State# RealWorld, r #)
keepAlive# :: forall (a_lev :: Levity) (a :: TYPE (BoxedRep lev))
(r_rep :: RuntimeRep) (r :: TYPE r_rep).
a -> State# RealWorld -> (State# RealWorld -> r) -> r
Furthermore, it is not enough to rule out representation-polymorphism: #21868 points out that the current implementation of these primops imposes further restrictions on the representation, because the passing of evaluation results may happen via the stack (e.g. in the case of an unregisterised compiler). The current implementation allows non-Void
PrimRep
s that are word-sized or smaller. This means that BoxedRep Lifted
, BoxedRep Unlifted
, WordRep
, Word8Rep
, AddrRep
for example are all OK, while FloatRep
, TupleRep '[]
and TupleRep '[ IntRep, WordRep ]
are all disallowed. This is a detail of the current implementation of these primops; this limitation could conceivably be lifted (see discussion on #21868).
Plan
Here's a plan, elaborated with @simonpj and @monoidal:
IdDetails
Step 1: store concreteness information in Id
s whose type quantifies over RuntimeRep
s which must be instantiated to concrete RuntimeRep
s should store this information in their IdDetails
:
- For primops, we can replace the "is rep-poly"
Bool
stored in thePrimOpId
constructor with a collection of type variables and associatedConcreteTvOrigin
. For example,catch#
would say that theRuntimeRep
variable needs to be instantiated to a concreteRuntimeRep
. - For wired-in
Id
s with a compulsory unfolding and representation-polymorphic arguments, add a new constructor toIdDetails
with this same information. - For unboxed tuples and sums, we know ahead of time that the
RuntimeRep
arguments should be concrete (the first half of the type arguments). - This leaves representation-polymorphic newtype constructors, for which this strategy won't work because it might be some type other than one of the quantified type variables which is required to be concrete. See § The newtype question below.
Step 2: create Concrete metavariables when instantiating
When instantiating an Id
which contains this information, use concrete metavariables instead of TauTv
s for the appropriately tagged variables.
To be specific, in tcInstFun.go1
, when we split off some type variables using tcSplitSomeForAllTyVars
, if acc
is empty (i.e. we are instantiating at the top-level, not in a nested situation) and the function being applied is an Id
with no binding, look up which variables must be concrete from the Id
and pass that information on to instantiateSigma
, saying to use newConcreteTyVarX
instead of newMetaTyVarX
for those variables.
Other than the wrinkle with representation-polymorphic unlifted newtypes, this would allow us to remove tcRemainingValArgs
, as we will be ensuring concreteness right at the source, when the type variables are created. This means we will be able to accept
type R :: RuntimeRep
type family R where { R = IntRep }
myId :: forall (a :: TYPE R). a -> a
myId = coerce
effectively implementing PHASE 2 of FixedRuntimeRep for primops without having to pay the onerous implementation cost of inserting the correct coercions when eta-expanding.
Step 3: check concreteness in Core Lint
The above steps attempt to instantiate these "MustBeConcrete" type variables to concrete types. We want to check that this is indeed the case, and that further transformations don't break the invariant. So we should also add a check to Core Lint to ensure that in any application of an Id
with no binding, the types it requires to be concrete are indeed concrete.
The newtype question
In Step 1, (4), we alluded to the observation that storing which type variables must be concrete might not be sufficient for representation-polymorphic newtype constructors. For example:
type R :: Type -> RuntimeRep
type family R a
type family T :: Type -> TYPE (R a)
type family T a
type N :: forall (a :: Type) -> TYPE (R a)
data N a = MkN (T a)
In this case, the type that needs to be concrete is R a
, which is not a type variable.
Option 1: store types instead of type variables
We could directly store which types must be concrete, instead of referring to type variables which must be concrete. Then, to adapt the plan from Step 2, in instantiateSigma
we would:
- create new meta type variables as usual,
- apply the obtained substitution to the types that must be concrete,
- ensure these substituted types are concrete.
Unfortunately it isn't clear what we would do with the evidence.
Option 2: change the newtype wrapper
We could change the newtype wrapper for MkN
to be something like:
MkN :: forall {conc :: RuntimeRep}. forall (a :: TYPE) -> R a ~# conc => T a -> N a
That is, we add an extra inferred type variable which must be concrete, and add an equality predicate between that concrete type variable and the representation of the type stored inside the newtype. We can then make use of this coercion in the compulsory unfolding of MkN
. (The type variable conc
is inferred and comes first, so as to not intefere with visible type applications.)
This brings us back to the case of type variables, which we can handle as in Step 2.
It isn't clear how much extra work this would be, as now we have a newtype worker, one of whose arguments is a coercion. This seems OK, as we already allow e.g. a newtype with a stupid theta. I think it would mean adding an eq_pred
to the Haskell98
datacon declaration however, and I don't know what knock-on consequences that might have.