FixedRuntimeRep: check type arguments instead of value arguments for Ids with no binding

Currently, we rule out bad representation-polymorphism in primops (and other Ids with no binding) through tcRemainingValArgs, which checks that the remaining value arguments all have a fixed RuntimeRep.

However, @bgamari points out in #21868 that we also need to rule out representation-polymorphism in other places that aren't arguments, such as the return RuntimeRep r_rep of catch# or keepAlive#:

catch# :: forall (r_rep :: RuntimeRep) (r :: TYPE r_rep) w.
          (State# RealWorld -> (# State# RealWorld, r #) )
       -> (w -> State# RealWorld -> (# State# RealWorld, r #) )
       -> State# RealWorld
       -> (# State# RealWorld, r #)

keepAlive# :: forall (a_lev :: Levity) (a :: TYPE (BoxedRep lev))
                     (r_rep :: RuntimeRep) (r :: TYPE r_rep).
              a -> State# RealWorld -> (State# RealWorld -> r) -> r

Furthermore, it is not enough to rule out representation-polymorphism: #21868 points out that the current implementation of these primops imposes further restrictions on the representation, because the passing of evaluation results may happen via the stack (e.g. in the case of an unregisterised compiler). The current implementation allows non-Void PrimReps that are word-sized or smaller. This means that BoxedRep Lifted, BoxedRep Unlifted, WordRep, Word8Rep, AddrRep for example are all OK, while FloatRep, TupleRep '[] and TupleRep '[ IntRep, WordRep ] are all disallowed. This is a detail of the current implementation of these primops; this limitation could conceivably be lifted (see discussion on #21868).

Plan

Here's a plan, elaborated with @simonpj and @monoidal:

Step 1: store concreteness information in `IdDetails`

Ids whose type quantifies over RuntimeReps which must be instantiated to concrete RuntimeReps should store this information in their IdDetails:

For primops, we can replace the "is rep-poly" Bool stored in the PrimOpId constructor with a collection of type variables and associated ConcreteTvOrigin. For example, catch# would say that the RuntimeRep variable needs to be instantiated to a concrete RuntimeRep.
For wired-in Ids with a compulsory unfolding and representation-polymorphic arguments, add a new constructor to IdDetails with this same information.
For unboxed tuples and sums, we know ahead of time that the RuntimeRep arguments should be concrete (the first half of the type arguments).
This leaves representation-polymorphic newtype constructors, for which this strategy won't work because it might be some type other than one of the quantified type variables which is required to be concrete. See § The newtype question below.

Step 2: create Concrete metavariables when instantiating

When instantiating an Id which contains this information, use concrete metavariables instead of TauTvs for the appropriately tagged variables.

To be specific, in tcInstFun.go1, when we split off some type variables using tcSplitSomeForAllTyVars, if acc is empty (i.e. we are instantiating at the top-level, not in a nested situation) and the function being applied is an Id with no binding, look up which variables must be concrete from the Id and pass that information on to instantiateSigma, saying to use newConcreteTyVarX instead of newMetaTyVarX for those variables.

Other than the wrinkle with representation-polymorphic unlifted newtypes, this would allow us to remove tcRemainingValArgs, as we will be ensuring concreteness right at the source, when the type variables are created. This means we will be able to accept

type R :: RuntimeRep
type family R where { R = IntRep }

myId :: forall (a :: TYPE R). a -> a
myId = coerce

effectively implementing PHASE 2 of FixedRuntimeRep for primops without having to pay the onerous implementation cost of inserting the correct coercions when eta-expanding.

Step 3: check concreteness in Core Lint

The above steps attempt to instantiate these "MustBeConcrete" type variables to concrete types. We want to check that this is indeed the case, and that further transformations don't break the invariant. So we should also add a check to Core Lint to ensure that in any application of an Id with no binding, the types it requires to be concrete are indeed concrete.

The newtype question

In Step 1, (4), we alluded to the observation that storing which type variables must be concrete might not be sufficient for representation-polymorphic newtype constructors. For example:

type R :: Type -> RuntimeRep
type family R a

type family T :: Type -> TYPE (R a)
type family T a

type N :: forall (a :: Type) -> TYPE (R a)
data N a = MkN (T a)

In this case, the type that needs to be concrete is R a, which is not a type variable.

Option 1: store types instead of type variables

We could directly store which types must be concrete, instead of referring to type variables which must be concrete. Then, to adapt the plan from Step 2, in instantiateSigma we would:

create new meta type variables as usual,
apply the obtained substitution to the types that must be concrete,
ensure these substituted types are concrete.

Unfortunately it isn't clear what we would do with the evidence.

Option 2: change the newtype wrapper

We could change the newtype wrapper for MkN to be something like:

MkN :: forall {conc :: RuntimeRep}. forall (a :: TYPE) -> R a ~# conc => T a -> N a

That is, we add an extra inferred type variable which must be concrete, and add an equality predicate between that concrete type variable and the representation of the type stored inside the newtype. We can then make use of this coercion in the compulsory unfolding of MkN. (The type variable conc is inferred and comes first, so as to not intefere with visible type applications.)

This brings us back to the case of type variables, which we can handle as in Step 2.

It isn't clear how much extra work this would be, as now we have a newtype worker, one of whose arguments is a coercion. This seems OK, as we already allow e.g. a newtype with a stupid theta. I think it would mean adding an eq_pred to the Haskell98 datacon declaration however, and I don't know what knock-on consequences that might have.

Edited Mar 21, 2023 by sheaf

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information