Commit 01449eb5 authored by Simon Peyton Jones

Fix desugaring of bang-pattern let-bindings

When implementing Strict Haskell, the patch 46a03fbe didn't faithfully
implement the semantics given in the manual. In particular there was
an ad-hoc case in mkSelectorBinds for "strict and no binders" that
didn't work.

This patch fixes it, curing Trac #11572.

However, it forced me to think about banged let-bindings, and I rather
think we do not have quite the right semantics yet, so I've opened
Trac #11601.
parent 27842ec1
......@@ -601,13 +601,13 @@ OR (B) t = case e of p -> (x,y)
x = case t of (x,_) -> x
y = case t of (_,y) -> y
We do (A) when
* Matching the pattern is cheap so we don't mind
doing it twice.
* Or if the pattern binds only one variable (so we'll only
match once)
* AND the pattern can't fail (else we tiresomely get two inexhaustive
pattern warning messages)
We do (A) when (test: isSingleton binders)
* The pattern binds only one variable (so we'll only match once)
OR when (test: is_simple_lpat)
* Matching the pattern is cheap so we don't mind doing it twice.
* AND the pattern can't fail (else we tiresomely get one
inexhaustive pattern warning message for each binder)
Otherwise we do (B). Really (A) is just an optimisation for very common
cases like
......@@ -633,7 +633,8 @@ mkSelectorBinds _ ticks (L _ (VarPat (L _ v))) val_expr
mkSelectorBinds is_strict ticks pat val_expr
| null binders, not is_strict
= return (Nothing, [])
| isSingleton binders || is_simple_lpat pat
| isSingleton binders || is_simple_lpat pat -- Case (A)
-- See Note [mkSelectorBinds]
= do { let pat_ty = hsLPatType pat
; val_var <- newSysLocalDs pat_ty
......@@ -661,26 +662,22 @@ mkSelectorBinds is_strict ticks pat val_expr
(err_var, Lam alphaTyVar err_app) :
binds) }
| otherwise
= do { val_var <- newSysLocalDs (hsLPatType pat)
| otherwise -- Case (B)
= do { val_var <- newSysLocalDs (hsLPatType pat)
; tuple_var <- newSysLocalDs tuple_ty
; error_expr <- mkErrorAppDs iRREFUT_PAT_ERROR_ID tuple_ty (ppr pat)
; tuple_expr
<- matchSimply (Var val_var) PatBindRhs pat local_tuple error_expr
; tuple_var <- newSysLocalDs tuple_ty
; tuple_expr <- matchSimply (Var val_var) PatBindRhs pat
local_tuple error_expr
; let mk_tup_bind tick binder
= (binder, mkOptTickBox tick $
mkTupleSelector local_binders binder
tuple_var (Var tuple_var))
-- if strict and no binders we want to force the case
-- expression to force an error if the pattern match
-- failed. See Note [Desugar Strict binds] in DsBinds.
; let force_var = if null binders && is_strict
then tuple_var
else val_var
; return (Just force_var
,(val_var,val_expr) :
(tuple_var, tuple_expr) :
zipWith mk_tup_bind ticks' binders) }
= (binder, mkOptTickBox tick $
mkTupleSelector local_binders binder
tuple_var (Var tuple_var))
tup_binds
| null binders = []
| otherwise = (tuple_var, tuple_expr)
: zipWith mk_tup_bind ticks' binders
; return ( Just val_var
, (val_var,val_expr) : tup_binds ) }
where
binders = collectPatBinders pat
ticks' = ticks ++ repeat []
......
......@@ -10482,12 +10482,42 @@ preprocessor must observe some additional restrictions:
.. _bang-patterns:
Bang patterns
=============
.. _strict-haskell:
Bang patterns and Strict Haskell
================================
.. index::
single: strict haskell
.. index::
single: Bang patterns
In high-performance Haskell code (e.g. numeric code) eliminating
thunks from an inner loop can be a huge win.
GHC supports three extensions to allow the programmer to specify
use of strict (call-by-value) evaluation rather than lazy (call-by-need)
evaluation.
- Bang patterns (:ghc-flag:`-XBangPatterns`) makes pattern matching and
let bindings stricter.
- Strict data types (:ghc-flag:`-XStrictData`) makes constructor fields
strict by default, on a per-module basis.
- Strict pattern (:ghc-flag:`-XStrict`) makes all patterns and let bindings
strict by default, on a per-module basis.
The latter two extensions are simply a way to avoid littering high-performance
code with bang patterns, which would make it harder to read.
Bang patterns and strict matching do not affect the type system in any way.
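As a minimal sketch of the kind of inner loop meant here, consider a strict accumulator. The function ``sumTo`` and its local helper ``go`` are hypothetical names chosen for illustration; they are not part of GHC. The bang on ``acc`` forces the running total on every iteration, whereas a lazy accumulator may build up a chain of thunks when the code is compiled without optimisation.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hypothetical example: sum the integers from 1 to n.
-- The bang on acc evaluates the accumulator at each step.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go !acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)
```

The type of ``sumTo`` is the same with or without the bang; only the evaluation order changes.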
.. _bang-patterns-informal:
Bang patterns
-------------
.. ghc-flag:: -XBangPatterns
Allow use of bang pattern syntax.
......@@ -10498,23 +10528,6 @@ Prime. The `Haskell prime feature
description <http://ghc.haskell.org/trac/haskell-prime/wiki/BangPatterns>`__
contains more discussion and examples than the material below.
The key change is the addition of a new rule to the `semantics of
pattern matching in the Haskell 98
report <http://haskell.org/onlinereport/exps.html#sect3.17.2>`__. Add
new bullet 10, saying: Matching the pattern ``!``\ ⟨pat⟩ against a value
v behaves as follows:
- if v is bottom, the match diverges
- otherwise, pat is matched against v
Bang patterns are enabled by the flag :ghc-flag:`-XBangPatterns`.
.. _bang-patterns-informal:
Informal description of bang patterns
-------------------------------------
The main idea is to add a single new production to the syntax of
patterns: ::
......@@ -10531,2323 +10544,2284 @@ bang it would be lazy. Bang patterns can be nested of course: ::
f2 (!x, y) = [x,y]
Here, ``f2`` is strict in ``x`` but not in ``y``. A bang only really has
an effect if it precedes a variable or wild-card pattern: ::
Here, ``f2`` is strict in ``x`` but not in ``y``.
Note the following points:
- A bang only really has
an effect if it precedes a variable or wild-card pattern: ::
f3 !(x,y) = [x,y]
f4 (x,y) = [x,y]
Here, ``f3`` and ``f4`` are identical; putting a bang before a pattern
that forces evaluation anyway does nothing.
Here, ``f3`` and ``f4`` are identical; putting a bang before a pattern
that forces evaluation anyway does nothing.
- A bang pattern is allowed in a let or where clause, and makes the binding
strict. For example: ::
let !x = e in body
let !(p,q) = e in body
There is one (apparent) exception to this general rule that a bang only
makes a difference when it precedes a variable or wild-card: a bang at
the top level of a ``let`` or ``where`` binding makes the binding
strict, regardless of the pattern. (We say "apparent" exception because
the Right Way to think of it is that the bang at the top of a binding is
not part of the *pattern*; rather it is part of the syntax of the
*binding*, creating a "bang-pattern binding".) See :ref:`Strict recursive and
polymorphic let bindings <recursive-and-polymorphic-let-bindings>` for
how bang-pattern bindings are compiled.
In both cases ``e`` is evaluated before starting to evaluate ``body``.
However, *nested* bangs in a pattern binding behave uniformly with all
other forms of pattern matching. For example ::
However, *nested* bangs in a let/where pattern binding behave uniformly with all
other forms of pattern matching. For example ::
let (!x,[y]) = e in b
is equivalent to this: ::
is equivalent to this: ::
let { t = case e of (x,[y]) -> x `seq` (x,y)
x = fst t
y = snd t }
in b
The binding is lazy, but when either ``x`` or ``y`` is evaluated by
``b`` the entire pattern is matched, including forcing the evaluation of
``x``.
The binding is lazy, but when either ``x`` or ``y`` is evaluated by
``b`` the entire pattern is matched, including forcing the evaluation of
``x``.
Bang patterns work in ``case`` expressions too, of course: ::
See :ref:`Semantics of let bindings with bang patterns <recursive-and-polymorphic-let-bindings>` for
the detailed semantics.
- A pattern with a bang at the outermost level is not allowed at the top
level of a module.
- Bang patterns work in ``case`` expressions too, of course: ::
g5 x = let y = f x in body
g6 x = case f x of { y -> body }
g7 x = case f x of { !y -> body }
The functions ``g5`` and ``g6`` mean exactly the same thing. But ``g7``
evaluates ``(f x)``, binds ``y`` to the result, and then evaluates
``body``.
.. _bang-patterns-sem:
Syntax and semantics
--------------------
We add a single new production to the syntax of patterns: ::
pat ::= !pat
The functions ``g5`` and ``g6`` mean exactly the same thing. But ``g7``
evaluates ``(f x)``, binds ``y`` to the result, and then evaluates
``body``.
There is one problem with syntactic ambiguity. Consider: ::
- There is one problem with syntactic ambiguity. Consider: ::
f !x = 3
Is this a definition of the infix function "``(!)``", or of the "``f``"
with a bang pattern? GHC resolves this ambiguity in favour of the
latter. If you want to define ``(!)`` with bang-patterns enabled, you
have to do so using prefix notation: ::
Is this a definition of the infix function "``(!)``", or of the "``f``"
with a bang pattern? GHC resolves this ambiguity in favour of the
latter. If you want to define ``(!)`` with bang-patterns enabled, you
have to do so using prefix notation: ::
(!) f x = 3
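The points above can be checked with a small sketch. The helper ``diverges`` and the functions ``nestedBangLet`` and ``topBangLet`` are hypothetical names introduced for illustration; ``diverges`` only detects bottoms that raise an exception (such as ``undefined``), not non-termination.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Exception (SomeException, evaluate, try)

-- Strict in x (nested bang), lazy in y.
f2 :: (Int, Int) -> [Int]
f2 (!x, y) = [x, y]

-- A plain variable pattern in a case does not force the scrutinee ...
g6 :: Int -> Int
g6 v = case v of { y -> 0 }

-- ... but a banged variable pattern does.
g7 :: Int -> Int
g7 v = case v of { !y -> 0 }

-- Nested bang: the binding itself stays lazy, so nothing is matched
-- unless x or y is demanded by the body.
nestedBangLet :: (Int, [Int]) -> Int
nestedBangLet e = let (!x, [y]) = e in 0

-- Bang at the top of the binding: e is evaluated before the body.
topBangLet :: (Int, Int) -> Int
topBangLet e = let !(p, q) = e in 0

-- Hypothetical helper: True if forcing the value to weak head
-- normal form throws an exception.
diverges :: a -> IO Bool
diverges v = do
  r <- try (evaluate (v `seq` ())) :: IO (Either SomeException ())
  pure (either (const True) (const False) r)
```

Under these definitions, ``f2 (undefined, 1)``, ``g7 undefined`` and ``topBangLet undefined`` raise an exception when forced, while ``f2 (1, undefined)`` (to the length of its result), ``g6 undefined`` and ``nestedBangLet undefined`` do not.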
The semantics of Haskell pattern matching is described in `Section
3.17.2 <http://www.haskell.org/onlinereport/exps.html#sect3.17.2>`__ of
the Haskell Report. To this description add one extra item 10, saying:
- Matching the pattern ``!pat`` against a value ``v`` behaves as
follows:
.. _strict-data:
- if ``v`` is bottom, the match diverges
Strict-by-default data types
----------------------------
- otherwise, ``pat`` is matched against ``v``
.. ghc-flag:: -XStrictData
Similarly, in Figure 4 of `Section
3.17.3 <http://www.haskell.org/onlinereport/exps.html#sect3.17.3>`__,
add a new case (t): ::
:since: 8.0.1
case v of { !pat -> e; _ -> e' }
= v `seq` case v of { pat -> e; _ -> e' }
Make fields of data types defined in the current module strict by default.
That leaves let expressions, whose translation is given in `Section
3.12 <http://www.haskell.org/onlinereport/exps.html#sect3.12>`__ of the
Haskell Report. In the translation box, first apply the following
transformation: for each pattern ``pi`` that is of form ``!qi = ei``,
transform it to ``(xi,!qi) = ((),ei)``, and replace ``e0`` by
``(xi `seq` e0)``. Then, when none of the left-hand-side patterns have a
bang at the top, apply the rules in the existing box.
Informally the ``StrictData`` language extension switches data type
declarations to be strict by default allowing fields to be lazy by
adding a ``~`` in front of the field.
The effect of the let rule is to force complete matching of the pattern
``qi`` before evaluation of the body is begun. The bang is retained in
the translated form in case ``qi`` is a variable, thus: ::
When the user writes ::
let !y = f x in b
data T = C a
data T' = C' ~a
The let-binding can be recursive. However, it is much more common for
the let-binding to be non-recursive, in which case the following law
holds: ``(let !p = rhs in body)`` is equivalent to
``(case rhs of !p -> body)``
we interpret it as if they had written ::
A pattern with a bang at the outermost level is not allowed at the top
level of a module.
data T = C !a
data T' = C' a
.. _assertions:
The extension only affects definitions in this module.
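A minimal sketch of the effect of ``StrictData``, written with an explicit type parameter so that the declarations compile. The helper ``diverges`` is a hypothetical name used only for this illustration; it reports whether forcing a value to weak head normal form throws an exception.

```haskell
{-# LANGUAGE StrictData #-}
import Control.Exception (SomeException, evaluate, try)

-- Under StrictData this field is strict, as if written C !a.
data T a = C a

-- The ~ restores the ordinary lazy field.
data T' a = C' ~a

-- Hypothetical helper: True if forcing the value to weak head
-- normal form throws an exception.
diverges :: a -> IO Bool
diverges v = do
  r <- try (evaluate (v `seq` ())) :: IO (Either SomeException ())
  pure (either (const True) (const False) r)
```

With these declarations, forcing ``C (undefined :: Int)`` raises an exception because the strict field is evaluated when the constructor is applied, while forcing ``C' (undefined :: Int)`` does not.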
Assertions
==========
.. index::
single: Assertions
.. _strict:
If you want to make use of assertions in your standard Haskell code, you
could define a function like the following: ::
Strict-by-default pattern bindings
----------------------------------
assert :: Bool -> a -> a
assert False x = error "assertion failed!"
assert _ x = x
.. ghc-flag:: -XStrict
which works, but gives you back a less than useful error message -- an
assertion failed, but which and where?
:implies: :ghc-flag:`-XStrictData`
:since: 8.0.1
One way out is to define an extended ``assert`` function which also
takes a descriptive string to include in the error message and perhaps
combine this with the use of a pre-processor which inserts the source
location where ``assert`` was used.
Make bindings in the current module strict by default.
GHC offers a helping hand here, doing all of this for you. For every use
of ``assert`` in the user's source: ::
Informally the ``Strict`` language extension switches functions, data
types, and bindings to be strict by default, allowing optional laziness
by adding ``~`` in front of a variable. This essentially reverses the
present situation where laziness is default and strictness can be
optionally had by adding ``!`` in front of a variable.
kelvinToC :: Double -> Double
kelvinToC k = assert (k >= 0.0) (k+273.15)
``Strict`` implies :ref:`StrictData <strict-data>`.
GHC will rewrite this to also include the source location where the
assertion was made, ::
- **Function definitions**
assert pred val ==> assertError "Main.hs|15" pred val
When the user writes ::
The rewrite is only performed by the compiler when it spots applications
of ``Control.Exception.assert``, so you can still define and use your
own versions of ``assert``, should you so wish. If not, import
``Control.Exception`` to make use of ``assert`` in your code.
f x = ...
.. index::
pair: assertions; disabling
we interpret it as if they had written ::
GHC ignores assertions when optimisation is turned on with the
:ghc-flag:`-O` flag. That is, expressions of the form ``assert pred e``
will be rewritten to ``e``. You can also disable assertions using the
:ghc-flag:`-fignore-asserts` option. The option
:ghc-flag:`-fno-ignore-asserts <-fignore-asserts>`
allows enabling assertions even when optimisation is turned on.
f !x = ...
Assertion failures can be caught, see the documentation for the
:base-ref:`Control.Exception <Control-Exception.html>` library for the details.
Adding ``~`` in front of ``x`` gives the regular lazy behavior.
.. _static-pointers:
- **Let/where bindings**
Static pointers
===============
When the user writes ::
.. index::
single: Static pointers
let x = ...
let pat = ...
.. ghc-flag:: -XStaticPointers
we interpret it as if they had written ::
:since: 7.10.1
let !x = ...
let !pat = ...
Allow use of static pointer syntax.
Adding ``~`` in front of ``x`` gives the regular lazy
behavior.
The general rule is that we add an implicit bang on the outermost pattern,
unless disabled with ``~``.
The language extension :ghc-flag:`-XStaticPointers` adds a new syntactic form
``static e``, which stands for a reference to the closed expression ⟨e⟩.
This reference is stable and portable, in the sense that it remains
valid across different processes on possibly different machines. Thus, a
process can create a reference and send it to another process that can
resolve it to ⟨e⟩.
- **Pattern matching in case expressions, lambdas, do-notation, etc**
With this extension turned on, ``static`` is no longer a valid
identifier.
The outermost pattern of all pattern matches gets an implicit bang,
unless disabled with ``~``.
This applies to case expressions, patterns in lambda, do-notation,
list comprehension, and so on.
For example ::
Static pointers were first proposed in the paper `Towards Haskell in the
cloud <http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/remote.pdf>`__,
Jeff Epstein, Andrew P. Black and Simon Peyton-Jones, Proceedings of the
4th ACM Symposium on Haskell, pp. 118-129, ACM, 2011.
case x of (a,b) -> rhs
.. _using-static-pointers:
is interpreted as ::
Using static pointers
---------------------
case x of !(a,b) -> rhs
Each reference is given a key which can be used to locate it at runtime
with
:base-ref:`unsafeLookupStaticPtr <GHC.StaticPtr.html#v%3AunsafeLookupStaticPtr>`
which uses a global and immutable table called the Static Pointer Table.
The compiler includes entries in this table for all static forms found
in the linked modules. The value can be obtained from the reference via
:base-ref:`deRefStaticPtr <GHC.StaticPtr.html#v%3AdeRefStaticPtr>`.
Since the semantics of pattern matching in case expressions is
strict, this usually has no effect whatsoever. But it does make a
difference in the degenerate case of variables and newtypes. So ::
The body ``e`` of a ``static e`` expression must be a closed expression.
That is, there can be no free variables occurring in ``e``, i.e. lambda-
or let-bound variables bound locally in the context of the expression.
case x of y -> rhs
All of the following are permissible: ::
is lazy in Haskell, but with ``Strict`` is interpreted as ::
inc :: Int -> Int
inc x = x + 1
case x of !y -> rhs
ref1 = static 1
ref2 = static inc
ref3 = static (inc 1)
ref4 = static ((\x -> x + 1) (1 :: Int))
ref5 y = static (let x = 1 in x)
which evaluates ``x``. Similarly, if ``newtype Age = MkAge Int``, then ::
While the following definitions are rejected: ::
case x of MkAge i -> rhs
ref6 = let x = 1 in static x
ref7 y = static (let x = 1 in y)
is lazy in Haskell; but with ``Strict`` the added bang makes it
strict.
.. _typechecking-static-pointers:
Similarly ::
Static semantics of static pointers
-----------------------------------
\ x -> body
do { x <- rhs; blah }
[ e | x <- rhs, blah ]
Informally, if we have a closed expression ::
all get implicit bangs on the ``x`` pattern.
e :: forall a_1 ... a_n . t
- **Nested patterns**
the static form is of type ::
Notice that we do *not* put bangs on nested patterns. For
example ::
static e :: (Typeable a_1, ... , Typeable a_n) => StaticPtr t
let (p,q) = if flob then (undefined, undefined) else (True, False)
in ...
Furthermore, type ``t`` is constrained to have a ``Typeable`` instance.
The following are therefore illegal: ::
will behave like ::
static show -- No Typeable instance for (Show a => a -> String)
static Control.Monad.ST.runST -- No Typeable instance for ((forall s. ST s a) -> a)
let !(p,q) = if flob then (undefined, undefined) else (True,False)
in ...
That being said, with the appropriate use of wrapper datatypes, the
above limitations induce no loss of generality: ::
which will strictly evaluate the right hand side, and bind ``p``
and ``q`` to the components of the pair. But the pair itself is
lazy (unless we also compile the ``Prelude`` with ``Strict``; see
:ref:`strict-modularity` below). So ``p`` and ``q`` may end up bound to
undefined. See also :ref:`recursive-and-polymorphic-let-bindings` below.
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE StaticPointers #-}
- **Top level bindings**
import Control.Monad.ST
import Data.Typeable
import GHC.StaticPtr
are unaffected by ``Strict``. For example: ::
data Dict c = c => Dict
deriving Typeable
x = factorial 20
(y,z) = if x > 10 then (True, False) else (False, True)
g1 :: Typeable a => StaticPtr (Dict (Show a) -> a -> String)
g1 = static (\Dict -> show)
Here ``x`` and the pattern binding ``(y,z)`` remain lazy. Reason:
there is no good moment to force them, until first use.
data Rank2Wrapper f = R2W (forall s. f s)
deriving Typeable
newtype Flip f a s = Flip { unFlip :: f s a }
deriving Typeable
- **Newtypes**
g2 :: Typeable a => StaticPtr (Rank2Wrapper (Flip ST a) -> a)
g2 = static (\(R2W f) -> runST (unFlip f))
There is no effect on newtypes, which simply rename existing types.
For example: ::
.. _pragmas:
newtype T a = C a
f (C x) = rhs1
g !(C x) = rhs2
Pragmas
=======
In ordinary Haskell, ``f`` is lazy in its argument and hence in
``x``; and ``g`` is strict in its argument and hence also strict in
``x``. With ``Strict``, both become strict because ``f``'s argument
gets an implicit bang.
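The rules above can be illustrated with a short sketch compiled with ``Strict``. All function names here (``constZero``, ``lazyConstZero``, ``firstOnly``, ``letStrict``, ``letLazy``, ``diverges``) are hypothetical and exist only for this example. Note that the helper ``diverges`` needs ``~`` on its own argument, because under ``Strict`` an ordinary argument pattern would force the value before ``try`` is entered.

```haskell
{-# LANGUAGE Strict #-}
import Control.Exception (SomeException, evaluate, try)

-- Function definitions: x gets an implicit bang.
constZero :: Int -> Int
constZero x = 0

-- The ~ restores the regular lazy behaviour.
lazyConstZero :: Int -> Int
lazyConstZero ~x = 0

-- Nested patterns: only the pair itself is forced, not p or q.
firstOnly :: (Int, Int) -> Int
firstOnly (p, q) = p

-- Let bindings: interpreted as let !unused = ...
letStrict :: Int -> Int
letStrict n = let unused = (error "forced" :: Int) in n

-- ... unless disabled with ~.
letLazy :: Int -> Int
letLazy n = let ~unused = (error "forced" :: Int) in n

-- Hypothetical helper: True if forcing the value to weak head
-- normal form throws an exception. The ~ keeps the argument lazy.
diverges :: a -> IO Bool
diverges ~v = do
  r <- try (evaluate (v `seq` ())) :: IO (Either SomeException ())
  pure (either (const True) (const False) r)
```

Here ``constZero undefined`` and ``letStrict 1`` raise an exception when forced, while ``lazyConstZero undefined``, ``firstOnly (1, undefined)`` and ``letLazy 1`` do not.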
.. index::
single: pragma
GHC supports several pragmas, or instructions to the compiler placed in
the source code. Pragmas don't normally affect the meaning of the
program, but they might affect the efficiency of the generated code.
.. _strict-modularity:
Pragmas all take the form ``{-# word ... #-}`` where word indicates
the type of pragma, and is followed optionally by information specific
to that type of pragma. Case is ignored in word. The various values
for word that GHC understands are described in the following sections;
any pragma encountered with an unrecognised word is ignored. The
layout rule applies in pragmas, so the closing ``#-}`` should start in a
column to the right of the opening ``{-#``.
Modularity
----------
Certain pragmas are *file-header pragmas*:
``Strict`` and ``StrictData`` only affect definitions in the module
they are used in. Functions and data types imported from other modules
are unaffected. For example, we won't evaluate the argument to
``Just`` before applying the constructor. Similarly we won't evaluate
the first argument to ``Data.Map.findWithDefault`` before applying the
function.
- A file-header pragma must precede the ``module`` keyword in the file.
This is crucial to preserve correctness. Entities defined in other
modules might rely on laziness for correctness (whether functional or
performance).
- There can be as many file-header pragmas as you please, and they can
be preceded or followed by comments.
Tuples, lists, ``Maybe``, and all the other types from ``Prelude``
continue to have their existing, lazy, semantics.
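A small sketch of this module boundary, assuming a module compiled with ``Strict``. The names ``Box``, ``importedStaysLazy``, ``localBox`` and ``diverges`` are hypothetical and chosen for illustration only.

```haskell
{-# LANGUAGE Strict #-}
import Control.Exception (SomeException, evaluate, try)
import Data.Maybe (isJust)

-- Defined in this module: Strict implies StrictData, so the
-- field is strict.
data Box = Box Int

-- Just is imported from the Prelude, so its argument is not
-- evaluated before the constructor is applied.
importedStaysLazy :: Bool
importedStaysLazy = isJust (Just (undefined :: Int))

-- Box is defined here, so constructing it forces the field.
-- (The top-level binding itself remains lazy until first use.)
localBox :: Box
localBox = Box undefined

-- Hypothetical helper: True if forcing the value to weak head
-- normal form throws an exception. The ~ keeps the argument lazy.
diverges :: a -> IO Bool
diverges ~v = do
  r <- try (evaluate (v `seq` ())) :: IO (Either SomeException ())
  pure (either (const True) (const False) r)
```

In this sketch ``importedStaysLazy`` is ``True``, whereas forcing ``localBox`` raises an exception.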
- File-header pragmas are read once only, before pre-processing the
file (e.g. with cpp).
.. _bang-patterns-sem:
.. _recursive-and-polymorphic-let-bindings:
- The file-header pragmas are: ``{-# LANGUAGE #-}``,
``{-# OPTIONS_GHC #-}``, and ``{-# INCLUDE #-}``.
Dynamic semantics of bang patterns
----------------------------------
.. _language-pragma:
The semantics of Haskell pattern matching is described in `Section
3.17.2 <http://www.haskell.org/onlinereport/exps.html#sect3.17.2>`__ of
the Haskell Report. To this description add one extra item 10, saying:
LANGUAGE pragma
---------------
- Matching the pattern ``!pat`` against a value ``v`` behaves as
follows:
.. index::
single: LANGUAGE; pragma
single: pragma; LANGUAGE
- if ``v`` is bottom, the match diverges
The ``LANGUAGE`` pragma allows language extensions to be enabled in a
portable way. It is the intention that all Haskell compilers support the