runExists: a type-level version of runRW

(All this started in #16646 (closed).)

Problem

In GHC.TypeLits we have

newtype SSymbol (s :: Symbol) = SSymbol String

The intent is that this type is very private to GHC.TypeLits, because it is supposed to be a singleton type. If we have two values x1, x2 :: SSymbol ty, then the string inside x1 should be the same as the string inside x2.

How do we make a SSymbol? We want to provide

withSSymbol :: String -> (forall s. SSymbol s -> r) -> r

Note the forall s, which plays a similar (existential) role to the forall s in

  runST :: (forall s. ST s a) -> a

But how can we define withSSymbol? Here is a try:

withSSymbol str cont = cont (SSymbol str)

But if we elaborate we find this means:

withSSymbol str cont = cont @(Any @Symbol) (SSymbol @(Any @Symbol) str)

What is that Any??? And consider

  withSSymbol "foo" (\s1 -> withSSymbol "bar2 (\s2 -> blah)

If we allow withSSymbol to inline we can get

  let s1 = SSymbol @(Any @Symbol) "foo"
      s2 = SSymbol @(Any @Symbol) "bar"
  in blah

Urk! s1, s2 :: SSymbol (Any @Symbol) but they carry different payloads. In conjunction with magicDict this can give rise to subtle outright bugs: see #16586 (closed) and Note [NOINLINE someNatVal] in GHC.TypeNats.

The hacky "solution" (adopted by the patch for #16586 (closed)) is "don't inline withSSymbol" but that is smelly -- and inefficient!

Solution

So what can we do? Here is one suggestion. Suppose we had

runExists :: forall k. (forall (a::k). Proxy# @k a -> r) -> r

Rather like runST, this gives you a type variable you can refer to, that is effectively existential. Now we can write

withSSymbol str cont = runExists (\(_::Proxy# s) -> cont (SSymbol @s str))

So runExists allows us to bring into scope a fresh skolem type variable.

The advantage of this is that we don't need any NOINLINE stuff for withSSymbol and friends. Only runExists is magic.

Note 1: If we had big-lambda in the source language, as Richard wants, maybe we could even get rid of the Proxy# and write

runExists :: forall k. (forall (a::k). r) -> r

withSSymbol str cont = runExists (\@s -> cont (SSymbol @s str))

which would be quite a bit nicer. End of note 1

Note 2. Actually we'd want `runExists to have a levity-polymorphic result, thus:

runExists :: forall k (r :: RuntimeRep) (o :: TYPE r).
             (forall (a::k). o) -> o

End of Note2.

Note 3. We already have something very like runExists, in the form of runRW#

runRW# :: forall (r :: RuntimeRep) (o :: TYPE r).
          (State# RealWorld -> o) -> o
-- See Note [runRW magic] in GHC.CoreToStg.Prep.
{-# NOINLINE runRW# #-}  -- runRW# is inlined manually in CorePrep
runRW# m = m realWorld#

where Note [runRW magic] in GHC.CoreToStg.Prep (reproduced below) describes various bit of magic are described that allow us to optimise around runRW. In particular:

case expressions move inside the argument
we inline runRW# in CorePrep, to eliminate the lambda. We could do the same for runExists.

End of Note 3

Question: can we somehow combine runRW# and runExists? I think they may be a bit different:

runExists brings a new skolem type variable into scope
runRW brings a value-level token into scope -- it's a value lambda

Note [runRW magic]

Reproduced from GHC.CoreToStg.Prep

{- Note [runRW magic]
~~~~~~~~~~~~~~~~~~~~~
Some definitions, for instance @runST@, must have careful control over float out
of the bindings in their body. Consider this use of @runST@,

    f x = runST ( \ s -> let (a, s')  = newArray# 100 [] s
                             (_, s'') = fill_in_array_or_something a x s'
                         in freezeArray# a s'' )

If we inline @runST@, we'll get:

    f x = let (a, s')  = newArray# 100 [] realWorld#{-NB-}
              (_, s'') = fill_in_array_or_something a x s'
          in freezeArray# a s''

And now if we allow the @newArray#@ binding to float out to become a CAF,
we end up with a result that is totally and utterly wrong:

    f = let (a, s')  = newArray# 100 [] realWorld#{-NB-} -- YIKES!!!
        in \ x ->
            let (_, s'') = fill_in_array_or_something a x s'
            in freezeArray# a s''

All calls to @f@ will share a {\em single} array! Clearly this is nonsense and
must be prevented.

This is what @runRW#@ gives us: by being inlined extremely late in the
optimization (right before lowering to STG, in CorePrep), we can ensure that
no further floating will occur. This allows us to safely inline things like
@runST@, which are otherwise needlessly expensive (see #10678 and #5916).

'runRW' has a variety of quirks:

 * 'runRW' is known-key with a NOINLINE definition in
   GHC.Magic. This definition is used in cases where runRW is curried.

 * In addition to its normal Haskell definition in GHC.Magic, we give it
   a special late inlining here in CorePrep and GHC.StgToByteCode, avoiding
   the incorrect sharing due to float-out noted above.

 * It is levity-polymorphic:

    runRW# :: forall (r1 :: RuntimeRep). (o :: TYPE r)
           => (State# RealWorld -> (# State# RealWorld, o #))
           -> (# State# RealWorld, o #)

 * It has some special simplification logic to allow unboxing of results when
   runRW# appears in a strict context. See Note [Simplification of runRW#]
   below.

 * Since its body is inlined, we allow runRW#'s argument to contain jumps to
   join points. That is, the following is allowed:

    join j x = ...
    in runRW# @_ @_ (\s -> ... jump j 42 ...)

   The Core Linter knows about this. See Note [Linting of runRW#] in
   GHC.Core.Lint for details.

   The occurrence analyser and SetLevels also know about this, as described in
   Note [Simplification of runRW#].

Other relevant Notes:

 * Note [Simplification of runRW#] below, describing a transformation of runRW
   applications in strict contexts performed by the simplifier.
 * Note [Linting of runRW#] in GHC.Core.Lint
 * Note [runRW arg] below, describing a non-obvious case where the
   late-inlining could go wrong.

Edited Aug 11, 2021 by Simon Peyton Jones

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information