Overhaul ST using pseudodata
_Note: proposal currently closed - blocked by #19799 (closed) and #19800 (closed)._
Proposal
Replace GHC's current definition for ST
:
newtype ST s a = ST (State# s -> (# State# s, a #))
with:
newtype ST s a = ST (*(UO# s) -> (# a #))
where:
- the
UO#
type is a builtin variant of pseudodata as described here. - the asterisk/star-prefix of
*UO#
is a uniqueness annotation (#19799 (closed)), used to prevent the errant reuse ofUO#
values.
The various ST
-centric definitions are modified accordingly - the following are examples for returnST
and bindST
:
returnST :: a -> ST s a
returnST x = ST (\u -> case partUO# u of (# _, _ #) -> (# x #))
bindST :: ST s a -> (a -> ST s b) -> ST s b
bindST (ST m) k = ST (\u -> case partUO# u of
(# u1, u2 #) -> case m u1 of
(# x #) -> unST (k x) u2)
where:
unST :: ST s a -> *(UO# s) -> (# a #)
unST (ST m) = m
partUO# :: *(UO# s) -> *(# *(UO# s), *(UO# s) #) -- primitive
with partUO#
being generalised from the selectors left
and right
. Other primitives would be modified in similar fashion e.g:
newtype STRef s a = STRef (MutVar# s a)
newSTRef :: a -> ST s (STRef s a)
readSTRef :: STRef s a -> ST s a
writeSTRef :: STRef s a -> a -> ST s ()
newMutVar# :: monomo a . a -> *(UO# s) -> MutVar# s a
readMutVar# :: monomo a . MutVar# s a -> *(UO# s) -> a
writeMutVar# :: monomo a . MutVar# s a -> a -> *(UO# s) -> ()
newSTRef x = ST (\u -> case newMutVar# x u of r -> (# STRef r#))
readSTRef (STRef r) = ST (\u -> case readMutVar# r u of !x -> (# x #))
writeSTRef (STRef r) x = ST (\u -> case writeMutVar# r x u of () -> (# () #))
where the use of the monomo
annotation renders the type variable a
in each definition monomorphic (#19800 (closed)).
Certain unsafe... :: IO a -> a
pseudo-definitions rely on the GHC-specific definition realWorld# :: State# RealWorld
- this could be replaced by e.g. unsafeOblige :: OI#
, where:
type OI# = UO# OI
data OI = OI OI#
which also allows IO
to be defined as:
newtype IO a = IO (*OI# -> (# a #))
Alternately:
type ST# s a = *(UO# s) -> (# a #)
newtype ST s a = ST (ST# s a)
newtype IO a = IO (ST# OI a)
Nota Bene: The priority here is to avoid visible changes to the language's syntax and semantics for the standard [Haskell 2010] use of I/O. However, as the current "[...] IO
representation is deeply wired in to various parts of the system [...]", implementing this proposal without causing any such changes may prove infeasible.
Motivations
Documentation shared
If the Standard-ML version of echo
from here:
val echoML : unit -> unit
fun echoML () =
let val c = getcML () in
if c = #"\n" then
()
else
let val _ = putcML c in
echoML ()
end
end
is compared to using what is being proposed, albeit in a more direct style:
echo :: OI -> ()
echo u = let !(u1:u2:u3:_) = parts u in
let !c = getChar u1 in
if c == '\n' then
()
else
let !_ = putChar c u2
in echo u3
--
--
where:
part (OI u) = case partUO# u of (# u1, u2 #) -> (OI u1, OI u2)
getChar (OI u) = case getChar# u of !c -> C# c
putChar (C# c) (OI u) = case putChar# c u of () -> ()
-- also primitive
getChar# :: *OI# -> Char#
putChar# :: Char# -> *OI# -> ()
and:
parts u = case part u of (u1, u2) -> u1 : parts u2
we can now see just how similar the two versions really are: apart from the obvious changes of syntax, the Haskell version replaces all use of unit-values with OI
-values, and adds an extra call to parts
to provide them.
This similarity hints at the possibility of reusing in some way the various introductions, tutorials, guides, manuals, walkthroughs and other similar resources (much like these) dedicated to helping those learning Standard ML understand how to use I/O. Those learning Haskell can start with these basic elements of I/O, and then be introduced to the various abstract "control-classes": Functor
, Applicative
, Monad
, etc and how they apply to the IO
type (and more generally to I/O in Haskell).
Convenience improved
As the research into unifying threads and events shows, having a convenient alternate representation has its benefits. The relative convenience of *OI# -> (# a #)
, as opposed to State# RealWorld -> (# State# RealWorld, a #)
, is left as an exercise for interested readers...
Hocus-pocus reduced
Implementation: Standard ML of New Jersey
-
Using: https://smlnj-gforge.cs.uchicago.edu/search/advanced_search.php;
-
Search terms:
magic
,Magic
,MAGIC
; -
Selected: with one word;
-
Selected: Search All;
Files Search Results
No matches found for "magic"
Implementation: Glasgow-Haskell
# cd /tmp
# tar Jxf ~/projects/ghc/old/8.6.5/src/ghc-8.6.5-src.tar.xz
# grep -Enr 'magic|Magic|MAGIC' ghc-8.6.5/ | wc -l
613
# grep -Elr 'magic|Magic|MAGIC' ghc-8.6.5/ | wc -l
293
While running jokes have their place, the use of the term "magic" in the context of Haskell seems to have turned into something of a catch-all cliche for categorising difficult-to-explain concepts and phenomena in Haskell (including I/O):
-
from https://hackage.haskell.org/package/base-4.9.1.0/docs/Control-Monad-ST.html:
RealWorld
is deeply magical.(presumably, being plain ol' "magical" just wasn't enough...)
-
from ghc-8.6.5/compiler/basicTypes/UniqSupply.hs:
here comes THE MAGIC:
-
At Fri May 7 23:45 UTC 2021, Stack Overflow found over 400 entries matching the terms "haskell" and "magic".
(In contrast, at Fri May 7 23:49 UTC 2021, there were only 2 pages matching the terms "standard ml" and "magic".)
By defining IO
using pseudodata - complete with a convenient 5-page explanation - perhaps IO
(for GHC, at least) can lose some of the misery mystery for those who are new to Haskell (e.g. see this I/O tutorial).