Commit 4f8e86b4 authored by Simon Peyton Jones's avatar Simon Peyton Jones
Browse files

Revive 'mdo' expressions, per discussion in Trac #4148

Summary:
  - mdo expressions are enabled by RecursiveDo pragma

  - mdo expressions perform full segmentation

  - 'rec' groups inside 'do' are changed so they do *not*
    perform any segmentation.

  - Both 'mdo' and 'rec' are enabled by 'RecursiveDo'
    'DoRec' is deprecated in favour of 'RecursiveDo'
    (The 'rec' keyword is also enabled by 'Arrows', as now.)

Thanks to Levent for doing all the work
parent ad6af5fc
......@@ -448,7 +448,6 @@ data ExtensionFlag
| Opt_MonadComprehensions
| Opt_GeneralizedNewtypeDeriving
| Opt_RecursiveDo
| Opt_DoRec
| Opt_PostfixOperators
| Opt_TupleSections
| Opt_PatternGuards
......@@ -2028,9 +2027,9 @@ xFlags = [
( "ImpredicativeTypes", Opt_ImpredicativeTypes, nop),
( "TypeOperators", Opt_TypeOperators, nop ),
( "ExplicitNamespaces", Opt_ExplicitNamespaces, nop ),
( "RecursiveDo", Opt_RecursiveDo, -- Enables 'mdo'
deprecatedForExtension "DoRec"),
( "DoRec", Opt_DoRec, nop ), -- Enables 'rec' keyword
( "RecursiveDo", Opt_RecursiveDo, nop ), -- Enables 'mdo' and 'rec'
( "DoRec", Opt_RecursiveDo,
deprecatedForExtension "RecursiveDo" ),
( "Arrows", Opt_Arrows, nop ),
( "ParallelArrays", Opt_ParallelArrays, nop ),
( "TemplateHaskell", Opt_TemplateHaskell, checkTemplateHaskellOk ),
......@@ -2276,7 +2275,7 @@ glasgowExtsFlags = [
, Opt_RankNTypes
, Opt_TypeOperators
, Opt_ExplicitNamespaces
, Opt_DoRec
, Opt_RecursiveDo
, Opt_ParallelListComp
, Opt_EmptyDataDecls
, Opt_KindSignatures
......
......@@ -658,7 +658,8 @@ reservedWordsFM = listToUFM $
( "capi", ITcapiconv, bit cApiFfiBit),
( "prim", ITprimcallconv, bit ffiBit),
( "rec", ITrec, bit recBit),
( "rec", ITrec, bit arrowsBit .|.
bit recursiveDoBit),
( "proc", ITproc, bit arrowsBit)
]
......@@ -1826,8 +1827,6 @@ inRulePragBit :: Int
inRulePragBit = 19
rawTokenStreamBit :: Int
rawTokenStreamBit = 20 -- producing a token stream with all comments included
recBit :: Int
recBit = 22 -- rec
alternativeLayoutRuleBit :: Int
alternativeLayoutRuleBit = 23
relaxedLayoutBit :: Int
......@@ -1937,8 +1936,6 @@ mkPState flags buf loc =
.|. magicHashBit `setBitIf` xopt Opt_MagicHash flags
.|. kindSigsBit `setBitIf` xopt Opt_KindSignatures flags
.|. recursiveDoBit `setBitIf` xopt Opt_RecursiveDo flags
.|. recBit `setBitIf` xopt Opt_DoRec flags
.|. recBit `setBitIf` xopt Opt_Arrows flags
.|. unicodeSyntaxBit `setBitIf` xopt Opt_UnicodeSyntax flags
.|. unboxedTuplesBit `setBitIf` xopt Opt_UnboxedTuples flags
.|. datatypeContextsBit `setBitIf` xopt Opt_DatatypeContexts flags
......
......@@ -753,7 +753,7 @@ rnStmt ctxt (L _ (RecStmt { recS_stmts = rec_stmts })) thing_inside
-- Step 3: Group together the segments to make bigger segments
-- Invariant: in the result, no segment uses a variable
-- bound in a later segment
grouped_segs = glomSegments segs_w_fwd_refs
grouped_segs = glomSegments ctxt segs_w_fwd_refs
-- Step 4: Turn the segments into Stmts
-- Use RecStmt when and only when there are fwd refs
......@@ -1101,15 +1101,20 @@ addFwdRefs pairs
-- { rec { x <- ...y...; p <- z ; y <- ...x... ;
-- q <- x ; z <- y } ;
-- r <- x }
--
-- NB. June 7 2012: We only glom segments that appear in
-- an explicit mdo; and leave those found in "do rec"'s intact.
-- See http://hackage.haskell.org/trac/ghc/ticket/4148 for
-- the discussion leading to this design choice.
glomSegments :: [Segment (LStmt Name)] -> [Segment [LStmt Name]]
glomSegments :: HsStmtContext Name -> [Segment (LStmt Name)] -> [Segment [LStmt Name]]
glomSegments [] = []
glomSegments ((defs,uses,fwds,stmt) : segs)
glomSegments _ [] = []
glomSegments ctxt ((defs,uses,fwds,stmt) : segs)
-- Actually stmts will always be a singleton
= (seg_defs, seg_uses, seg_fwds, seg_stmts) : others
where
segs' = glomSegments segs
segs' = glomSegments ctxt segs
(extras, others) = grab uses segs'
(ds, us, fs, ss) = unzip4 extras
......@@ -1127,7 +1132,9 @@ glomSegments ((defs,uses,fwds,stmt) : segs)
= (reverse yeses, reverse noes)
where
(noes, yeses) = span not_needed (reverse dus)
not_needed (defs,_,_,_) = not (intersectsNameSet defs uses)
not_needed (defs,_,_,_) = case ctxt of
MDoExpr -> not (intersectsNameSet defs uses)
_ -> False -- unless we're in mdo, we *need* everything
----------------------------------------------------
......@@ -1297,9 +1304,9 @@ okParStmt dflags ctxt stmt
okDoStmt dflags ctxt stmt
= case stmt of
RecStmt {}
| Opt_DoRec `xopt` dflags -> isOK
| ArrowExpr <- ctxt -> isOK -- Arrows allows 'rec'
| otherwise -> Just (ptext (sLit "Use -XDoRec"))
| Opt_RecursiveDo `xopt` dflags -> isOK
| ArrowExpr <- ctxt -> isOK -- Arrows allows 'rec'
| otherwise -> Just (ptext (sLit "Use -XRecursiveDo"))
BindStmt {} -> isOK
LetStmt {} -> isOK
ExprStmt {} -> isOK
......
......@@ -110,7 +110,7 @@ identifiers, expressions, rules, and their operations.</strong>
</ul></tt>
<p><li> <strong>That is the end of the infrastructure. Now we get the
main layer of mdoules that perform useful work.</strong>
main layer of modules that perform useful work.</strong>
<tt><ul>
<p><li>
......
......@@ -1011,15 +1011,9 @@
<entry>dynamic</entry>
<entry><option>-XNoExplicitNamespaces</option></entry>
</row>
<row>
<entry><option>-XDoRec</option></entry>
<entry>Enable <link linkend="recursive-do-notation">recursive do notation</link>.</entry>
<entry>dynamic</entry>
<entry><option>-XNoDoRec</option></entry>
</row>
<row>
<entry><option>-XRecursiveDo</option></entry>
<entry>Enable <link linkend="mdo-notation">recursive do (mdo) notation</link>. This is deprecated; please use <link linkend="recursive-do-notation">recursive do notation</link> instead.</entry>
<entry>Enable <link linkend="recursive-do-notation">recursive do (mdo) notation</link>.</entry>
<entry>dynamic</entry>
<entry><option>-XNoRecursiveDo</option></entry>
</row>
......
......@@ -857,151 +857,195 @@ To disable it, you can use the <option>-XNoTraditionalRecordSyntax</option> flag
</title>
<para>
The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>,
that is, the variables bound in a do-expression are visible only in the textually following
code block. Compare this to a let-expression, where bound variables are visible in the entire binding
group. It turns out that several applications can benefit from recursive bindings in
the do-notation. The <option>-XDoRec</option> flag provides the necessary syntactic support.
The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>,
that is, the variables bound in a do-expression are visible only in the textually following
code block. Compare this to a let-expression, where bound variables are visible in the entire binding
group.
</para>
<para>
It turns out that such recursive bindings do indeed make sense for a variety of monads, but
not all. In particular, recursion in this sense requires a fixed-point operator for the underlying
monad, captured by the <literal>mfix</literal> method of the <literal>MonadFix</literal> class. Haskell's
<literal>Maybe</literal>, <literal>[]</literal> (list), <literal>ST</literal> (both strict and lazy versions),
<literal>IO</literal>, and many other monads have <literal>MonadFix</literal> instances. On the negative
side, the continuation monad, with the signature <literal>(a -> r) -> r</literal>, does not.
</para>
<para>
For monads that do belong to the <literal>MonadFix</literal> class, GHC provides
an extended version of the do-notation that allows recursive bindings.
The <option>-XRecursiveDo</option> (language pragma: <literal>RecursiveDo</literal>)
provides the necessary syntactic support, introducing the keyword <literal>mdo</literal>. Unlike
bindings in a <literal>do</literal> expression, those introduced by <literal>mdo</literal>
are recursively defined, much like in an ordinary let-expression. Due to the new
keyword <literal>mdo</literal>, we also call this notation the <emphasis>mdo-notation</emphasis>.
</para>
<para>
Here is a simple (albeit contrived) example:
Here is a simple (albeit contrived) example:
<programlisting>
{-# LANGUAGE DoRec #-}
justOnes = do { rec { xs &lt;- Just (1:xs) }
; return (map negate xs) }
{-# LANGUAGE RecursiveDo #-}
justOnes = mdo { xs &lt;- Just (1:xs)
; return (map negate xs) }
</programlisting>
As you can guess <literal>justOnes</literal> will evaluate to <literal>Just [-1,-1,-1,...</literal>.
</para>
<para>
The background and motivation for recursive do-notation is described in
<ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>,
by Levent Erkok, John Launchbury,
Haskell Workshop 2002, pages: 29-37. Pittsburgh, Pennsylvania.
The theory behind monadic value recursion is explained further in Erkok's thesis
<ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.
However, note that GHC uses a different syntax than the one described in these documents.
<para>
GHC's implementation the mdo-notation closely follows the original translation as described in the paper
<ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for Haskell</ulink>, which
in turn is based on the work <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion
in Monadic Computations</ulink>. Furthermore, GHC extends the syntax described in the former paper by the
additional <literal>rec</literal> keyword, as we describe next.
</para>
<sect3>
<title>Details of recursive do-notation</title>
<title>Recursive binding groups</title>
<para>
The recursive do-notation is enabled with the flag <option>-XDoRec</option> or, equivalently,
the LANGUAGE pragma <option>DoRec</option>. It introduces the single new keyword "<literal>rec</literal>",
which wraps a mutually-recursive group of monadic statements,
producing a single statement.
</para>
<para>Similar to a <literal>let</literal>
statement, the variables bound in the <literal>rec</literal> are
visible throughout the <literal>rec</literal> group, and below it.
For example, compare
The flag <option>-XRecursiveDo</option> also introduces a new keyword <literal>rec</literal>, which wraps a
mutually-recursive group of monadic statements inside a <literal>do</literal> expression, producing a single statement.
Similar to a <literal>let</literal> statement inside a <literal>do</literal>, variables bound in
the <literal>rec</literal> are visible throughout the <literal>rec</literal> group, and below it. For example, compare
<programlisting>
do { a &lt;- getChar do { a &lt;- getChar
; let { r1 = f a r2 ; rec { r1 &lt;- f a r2
; r2 = g r1 } ; r2 &lt;- g r1 }
; return (r1 ++ r2) } ; return (r1 ++ r2) }
do { a &lt;- getChar do { a &lt;- getChar
; let { r1 = f a r2 ; rec { r1 &lt;- f a r2
; ; r2 = g r1 } ; ; r2 &lt;- g r1 }
; return (r1 ++ r2) } ; return (r1 ++ r2) }
</programlisting>
In both cases, <literal>r1</literal> and <literal>r2</literal> are
available both throughout the <literal>let</literal> or <literal>rec</literal> block, and
in the statements that follow it. The difference is that <literal>let</literal> is non-monadic,
while <literal>rec</literal> is monadic. (In Haskell <literal>let</literal> is
really <literal>letrec</literal>, of course.)
In both cases, <literal>r1</literal> and <literal>r2</literal> are available both throughout
the <literal>let</literal> or <literal>rec</literal> block, and in the statements that follow it.
The difference is that <literal>let</literal> is non-monadic, while <literal>rec</literal> is monadic.
(In Haskell <literal>let</literal> is really <literal>letrec</literal>, of course.)
</para>
<para>
The static and dynamic semantics of <literal>rec</literal> can be described as follows:
<itemizedlist>
<listitem><para>
First,
similar to let-bindings, the <literal>rec</literal> is broken into
minimal recursive groups, a process known as <emphasis>segmentation</emphasis>.
For example:
<programlisting>
rec { a &lt;- getChar ===> a &lt;- getChar
; b &lt;- f a c rec { b &lt;- f a c
; c &lt;- f b a ; c &lt;- f b a }
; putChar c } putChar c
</programlisting>
The details of segmentation are described in Section 3.2 of
<ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>.
Segmentation improves polymorphism, reduces the size of the recursive "knot", and, as the paper
describes, also has a semantic effect (unless the monad satisfies the right-shrinking law).
</para></listitem>
<listitem><para>
Then each resulting <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal>.
For example, the <literal>rec</literal> group in the preceding example is desugared like this:
The semantics of <literal>rec</literal> is fairly straightforward. Whenever GHC finds a <literal>rec</literal>
group, it will compute its set of bound variables, and will introduce an appropriate call
to the underlying monadic value-recursion operator <literal>mfix</literal>, belonging to the
<literal>MonadFix</literal> class. Here is an example:
<programlisting>
rec { b &lt;- f a c ===> (b,c) &lt;- mfix (\~(b,c) -> do { b &lt;- f a c
; c &lt;- f b a } ; c &lt;- f b a
; return (b,c) })
</programlisting>
In general, the statement <literal>rec <replaceable>ss</replaceable></literal>
is desugared to the statement
As usual, the meta-variables <literal>b</literal>, <literal>c</literal> etc., can be arbitrary patterns.
In general, the statement <literal>rec <replaceable>ss</replaceable></literal> is desugared to the statement
<programlisting>
<replaceable>vs</replaceable> &lt;- mfix (\~<replaceable>vs</replaceable> -&gt; do { <replaceable>ss</replaceable>; return <replaceable>vs</replaceable> })
</programlisting>
where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>.
</para><para>
The original <literal>rec</literal> typechecks exactly
when the above desugared version would do so. For example, this means that
the variables <replaceable>vs</replaceable> are all monomorphic in the statements
following the <literal>rec</literal>, because they are bound by a lambda.
where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>.
</para>
<para>
The <literal>mfix</literal> function is defined in the <literal>MonadFix</literal>
class, in <literal>Control.Monad.Fix</literal>, thus:
Note in particular that the translation for a <literal>rec</literal> block only involves wrapping around a call
to <literal>mfix</literal>: otherwise it does not perform any other analysis on the bindings at all. The latter is the task
for the <literal>mdo</literal> notation, as we describe next.
</para>
</sect3>
<sect3>
<title>The <literal>mdo</literal> notation</title>
<para>
A <literal>rec</literal>-block tells the compiler where precisely the recursive knot should be tied. It turns out that
the placement of the recursive knots can be rather delicate: In particular, we would like the knots to be wrapped
around as minimal groups as possible. This process is known as <emphasis>segmentation</emphasis>, an is described
in detail in Secton 3.2 of <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for
Haskell</ulink>. Segmentation improves polymorphism and reduces the size of the recursive knot. Most importantly, it avoids
unnecessary interference caused by a fundamental issue with the so called <emphasis>right-shrinking</emphasis>
axiom for monadic recursion. In brief, most monads of interest (IO, strict state, etc.) do <emphasis>not</emphasis>
have recursion operators that satisfy this axiom, and thus not performing segmentation can cause unnecessary
interference, changing the termination behavior of the resulting translation.
(Details can be found in Sections 3.1 and 7.2.2 of
<ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.)
</para>
<para>
The <literal>mdo</literal>-notation removes the burden of placing explicit <literal>rec</literal> blocks in the code.
It automatically identifies minimally dependent recursive groups, treating them as if the user wrapped a
<literal>rec</literal> qualified around them. The definition of <emphasis>minimal</emphasis> in this context
is syntax oriented: Two bindings are called dependent if the latter one uses a variable defined by the former. Furthermore,
if a binding is dependent on another, then all the bindings that textually appear in between them are dependent on each other
as well. A minimally dependent group of bindings is simply a contagious group where none of the textually following
bindings depend on it. (Segments in this sense are related to <emphasis>strongly-connected components</emphasis>
analysis, with the exception that bindings in a segment cannot be reordered and has to be contagious.)
</para>
<para>
Here's an example <literal>mdo</literal>-expression, and its translation to <literal>rec</literal> blocks:
<programlisting>
mdo { a &lt;- getChar ===> do { a &lt;- getChar
; b &lt;- f a c ; rec { b &lt;- f a c
; c &lt;- f b a ; ; c &lt;- f b a }
; z &lt;- h a b ; z &lt;- h a b
; d &lt;- g d e ; rec { d &lt;- g d e
; e &lt;- g a z ; ; e &lt;- g a z }
; putChar c } ; putChar c }
</programlisting>
Note that a given <literal>mdo</literal> expression can cause the creation of multiple <literal>rec</literal> blocks.
If there are no recursive dependencies, <literal>mdo</literal> will introduce no <literal>rec</literal> blocks. In this
latter case an <literal>mdo</literal> expression is precisely the same as a <literal>do</literal> expression, as one
would expect.
</para>
<para>
In summary, given an <literal>mdo</literal> expression, GHC first performs segmentation, introducing
<literal>rec</literal> blocks to wrap over minimal recursive groups. Then, each resulting
<literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal> as described
in the previous section. The original <literal>mdo</literal>-expression typechecks exactly when the desugared
version would do so. The <literal>mfix</literal> function is defined in the <literal>MonadFix</literal>
class, in <literal>Control.Monad.Fix</literal>, thusly:
<programlisting>
class Monad m => MonadFix m where
mfix :: (a -> m a) -> m a
</programlisting>
</para>
</listitem>
</itemizedlist>
</para>
<para>
Here are some other important points in using the recursive-do notation:
<itemizedlist>
<listitem><para>
It is enabled with the flag <literal>-XDoRec</literal>.
</para></listitem>
<listitem><para>
If recursive bindings are required for a monad,
then that monad must be declared an instance of the <literal>MonadFix</literal> class.
</para></listitem>
<listitem><para>
The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class
for Haskell's internal state monad (strict and lazy, respectively).
</para></listitem>
<listitem><para>
Like <literal>let</literal> and <literal>where</literal> bindings,
name shadowing is not allowed within a <literal>rec</literal>;
that is, all the names bound in a single <literal>rec</literal> must
be distinct (Section 3.3 of the paper).
</para></listitem>
<listitem><para>
It supports rebindable syntax (see <xref linkend="rebindable-syntax"/>).
</para></listitem>
<itemizedlist>
<listitem>
<para>
It is enabled with the flag <literal>-XRecursiveDo</literal>, or the <literal>LANGUAGE RecursiveDo</literal>
pragma. (The same flag enables both <literal>mdo</literal>-notation, and the use of <literal>rec</literal>
blocks inside <literal>do</literal> expressions.)
</para>
</listitem>
<listitem>
<para>
<literal>rec</literal> blocks can also be used inside <literal>mdo</literal>-expressions, which will be
treated as a single statement. However, it is good style to either use <literal>mdo</literal> or
<literal>rec</literal> blocks in a single expression.
</para>
</listitem>
<listitem>
<para>
If recursive bindings are required for a monad, then that monad must be declared an instance of
the <literal>MonadFix</literal> class.
</para>
</listitem>
<listitem>
<para>
The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
Furthermore, the <literal>Control.Monad.ST</literal> and <literal>Control.Monad.ST.Lazy</literal>
modules provide the instances of the <literal>MonadFix</literal> class for Haskell's internal
state monad (strict and lazy, respectively).
</para>
</listitem>
<listitem>
<para>
Like <literal>let</literal> and <literal>where</literal> bindings, name shadowing is not allowed within
an <literal>mdo</literal>-expression or a <literal>rec</literal>-block; that is, all the names bound in
a single <literal>rec</literal> must be distinct. (GHC will complain if this is not the case.)
</para>
</listitem>
</itemizedlist>
</para>
</sect3>
<sect3 id="mdo-notation"> <title> Mdo-notation (deprecated) </title>
<para> GHC used to support the flag <option>-XRecursiveDo</option>,
which enabled the keyword <literal>mdo</literal>, precisely as described in
<ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>,
but this is now deprecated. Instead of <literal>mdo { Q; e }</literal>, write
<literal>do { rec Q; e }</literal>.
</para>
<para>
Historical note: The old implementation of the mdo-notation (and most
of the existing documents) used the name
<literal>MonadRec</literal> for the class and the corresponding library.
This name is not supported by GHC.
</para>
</sect3>
</sect2>
......@@ -1469,7 +1513,7 @@ the comprehension being over an arbitrary monad.
functions <literal>(>>=)</literal>,
<literal>(>>)</literal>, and <literal>fail</literal>,
are in scope (not the Prelude
versions). List comprehensions, mdo (<xref linkend="mdo-notation"/>), and parallel array
versions). List comprehensions, mdo (<xref linkend="recursive-do-notation"/>), and parallel array
comprehensions, are unaffected. </para></listitem>
<listitem>
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment