runExists: a typelevel version of runRW
(All this started in #16646 (closed).)
Problem
In GHC.TypeLits we have
newtype SSymbol (s :: Symbol) = SSymbol String
The intent is that this type is very private to GHC.TypeLits, beucase
it is supposed to be a singleton type. If we have two values x1, x2 :: SSymbol ty
, then the string inside x1
should be the same as the
string inside x2
.
How do we make a SSymbol
? We want to provide
withSSymbol :: String > (forall s. SSymbol s > r) > r
Note the forall s
, which plays a similar (existential) role to the forall s
in
runST :: (forall s. ST s a) > a
But how can we define withSSymbol
? Here is a try:
withSSymbol str cont = cont (SSymbol str)
But if we elaborate we find this means:
withSSymbol str cont = cont @(Any @Symbol) (SSymbol @(Any @Symbol) str)
What is that Any
??? And consider
withSSymbol "foo" (\s1 > withSSymbol "bar2 (\s2 > blah)
If we allow withSSymbol
to inline we can get
let s1 = SSymbol @(Any @Symbol) "foo"
s2 = SSymbol @(Any @Symbol) "bar"
in blah
Urk! s1, s2 :: SSymbol (Any @Symbol)
but they carry different payloads.
In conjunction with magicDict
this can give rise to subtle outright bugs:
see #16586 (closed) and Note [NOINLINE someNatVal]
in GHC.TypeNats.
The hacky "solution" (adopted by the patch for #16586 (closed)) is "don't
inline withSSymbol
" but that is smelly  and inefficient!
Solution
So what can we do? Here is one suggestion. Suppose we had
runExists :: forall k. (forall (a::k). Proxy# @k a > r) > r
Rather like runST
, this gives you a type variable you can refer to, that is effectively existential.
Now we can write
withSSymbol str cont = runExists (\(_::Proxy# s) > cont (SSymbol @s str))
So runExists
allows us to bring into scope a fresh skolem type variable.
Note 1: If we had biglambda in the source language, as Richard wants, maybe we could even get rid of the Proxy#
and write
runExists :: forall k. (forall (a::k). r) > r
withSSymbol str cont = runExists (\@s > cont (SSymbol @s str))
which would be quite a bit nicer. End of note 1
We already have something very like runExists
, in the form of runRW#
runRW# :: forall (r :: RuntimeRep) (o :: TYPE r).
(State# RealWorld > o) > o
 See Note [runRW magic] in GHC.CoreToStg.Prep.
{# NOINLINE runRW# #}  runRW# is inlined manually in CorePrep
runRW# m = m realWorld#
where Note [runRW magic]
in GHC.CoreToStg.Prep (reproduced below) describes various bit of magic are described that allow us to optimise around runRW
. In particular:
 case expressions move inside the argument
 we inline
runRW#
in CorePrep, to eliminate the lambda. We could do the same forrunExists
.
The advantage of this is that we don't need any NOINLINE stuff for withSSymbol
and friends.
Only runExists
is magic.
Note 2. Actually we'd want `runExists to have a levitypolymorphic result, thus:
runExists :: forall k (r :: RuntimeRep) (o :: TYPE r).
(forall (a::k). o) > o
End of Note2.
Question: can we somehow combine runRW#
and runExists
? I think they may be a bit different:

runExists
brings a new skolem type variable into scope 
runRW
brings a valuelevel token into scope  it's a value lambda
Note [runRW magic]
Reproduced from GHC.CoreToStg.Prep
{ Note [runRW magic]
~~~~~~~~~~~~~~~~~~~~~
Some definitions, for instance @runST@, must have careful control over float out
of the bindings in their body. Consider this use of @runST@,
f x = runST ( \ s > let (a, s') = newArray# 100 [] s
(_, s'') = fill_in_array_or_something a x s'
in freezeArray# a s'' )
If we inline @runST@, we'll get:
f x = let (a, s') = newArray# 100 [] realWorld#{NB}
(_, s'') = fill_in_array_or_something a x s'
in freezeArray# a s''
And now if we allow the @newArray#@ binding to float out to become a CAF,
we end up with a result that is totally and utterly wrong:
f = let (a, s') = newArray# 100 [] realWorld#{NB}  YIKES!!!
in \ x >
let (_, s'') = fill_in_array_or_something a x s'
in freezeArray# a s''
All calls to @f@ will share a {\em single} array! Clearly this is nonsense and
must be prevented.
This is what @runRW#@ gives us: by being inlined extremely late in the
optimization (right before lowering to STG, in CorePrep), we can ensure that
no further floating will occur. This allows us to safely inline things like
@runST@, which are otherwise needlessly expensive (see #10678 and #5916).
'runRW' has a variety of quirks:
* 'runRW' is knownkey with a NOINLINE definition in
GHC.Magic. This definition is used in cases where runRW is curried.
* In addition to its normal Haskell definition in GHC.Magic, we give it
a special late inlining here in CorePrep and GHC.StgToByteCode, avoiding
the incorrect sharing due to floatout noted above.
* It is levitypolymorphic:
runRW# :: forall (r1 :: RuntimeRep). (o :: TYPE r)
=> (State# RealWorld > (# State# RealWorld, o #))
> (# State# RealWorld, o #)
* It has some special simplification logic to allow unboxing of results when
runRW# appears in a strict context. See Note [Simplification of runRW#]
below.
* Since its body is inlined, we allow runRW#'s argument to contain jumps to
join points. That is, the following is allowed:
join j x = ...
in runRW# @_ @_ (\s > ... jump j 42 ...)
The Core Linter knows about this. See Note [Linting of runRW#] in
GHC.Core.Lint for details.
The occurrence analyser and SetLevels also know about this, as described in
Note [Simplification of runRW#].
Other relevant Notes:
* Note [Simplification of runRW#] below, describing a transformation of runRW
applications in strict contexts performed by the simplifier.
* Note [Linting of runRW#] in GHC.Core.Lint
* Note [runRW arg] below, describing a nonobvious case where the
lateinlining could go wrong.