|
|
|
|
|
|
|
|
|
|
|
# Bang patterns
|
|
|
|
|
|
|
|
|
## Tickets
|
|
|
|
|
|
|
|
|
|
|
|
<table><tr><th>[\#76](https://gitlab.haskell.org//haskell/prime/issues/76)</th>
|
|
|
<td>Bang patterns</td></tr></table>
|
|
|
|
|
|
|
|
|
|
|
|
## Goal
|
|
|
|
|
|
|
|
|
|
|
|
Our goal is to make it easier to write strict programs in
|
|
|
Haskell. Programs that use 'seq' extensively are possible but clumsy,
|
|
|
and it's often quite awkward or inconvenient to increase the strictness
|
|
|
of a Haskell program. This proposal changes nothing fundamental;
|
|
|
but it makes strictness more convenient.
|
|
|
|
|
|
|
|
|
## The basic idea
|
|
|
|
|
|
|
|
|
|
|
|
The main idea is to add a single new production to the syntax of
|
|
|
patterns
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
pat ::= !pat
|
|
|
```
|
|
|
|
|
|
|
|
|
Matching an expression e against a pattern !p is done by first
|
|
|
evaluating e (to WHNF) and then matching the result against p.
|
|
|
|
|
|
|
|
|
|
|
|
Example:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
f1 !x = True
|
|
|
```
|
|
|
|
|
|
|
|
|
f1 is strict in x. In this case, it's not so bad to write
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
f1 x = x `seq` True
|
|
|
```
|
|
|
|
|
|
|
|
|
but when guards are involved the `seq` version becomes horrible:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
-- Duplicate the seq on y
|
|
|
f2 x y | g x = y `seq` rhs1
|
|
|
| otherwise = y `seq` rhs2
|
|
|
|
|
|
-- Have a weird guard
|
|
|
f2 x y | y `seq` False = undefined
|
|
|
| g x = rhs1
|
|
|
| otherwise = rhs2
|
|
|
|
|
|
-- Use bang patterns
|
|
|
f2 !x !y | g x = rhs1
|
|
|
| otherwise = rhs2
|
|
|
```
|
|
|
|
|
|
|
|
|
Bang patterns can be nested of course:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
f2 (!x, y) = [x,y]
|
|
|
```
|
|
|
|
|
|
|
|
|
f2 is strict in x but not in y. A bang only really has an effect
|
|
|
if it precedes a variable or wild-card pattern:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
f3 !(x,y) = [x,y]
|
|
|
f4 (x,y) = [x,y]
|
|
|
```
|
|
|
|
|
|
|
|
|
f3 and f4 are identical; putting a bang before a pattern that
|
|
|
forces evaluation anyway does nothing.
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
g5 x = let y = f x in body
|
|
|
g6 x = case f x of { y -> body }
|
|
|
g7 x = case f x of { !y -> body }
|
|
|
```
|
|
|
|
|
|
|
|
|
g5 and g6 mean exactly the same thing. But g7 evalutes (f x), binds y to the
|
|
|
result, and then evaluates body.
|
|
|
|
|
|
|
|
|
|
|
|
## Let and where bindings
|
|
|
|
|
|
|
|
|
|
|
|
In Haskell, let and where bindings can bind patterns. We propose to modify
|
|
|
this by allowing an optional bang at the top level of the pattern. Thus for example:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let ![x,y] = e in b
|
|
|
```
|
|
|
|
|
|
|
|
|
The "`!`" should not be regarded as part of the pattern; after all,
|
|
|
in a function argument `![x,y]` means the same as `[x,y]`. Rather, the "`!`"
|
|
|
is part of the syntax of `let` bindings.
|
|
|
|
|
|
|
|
|
|
|
|
The semantics is simple; the above program is equivalent to:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let p@[x,y] = e in p `seq` b
|
|
|
```
|
|
|
|
|
|
|
|
|
That is, for each bang pattern, invent a variable `p`, bind it to the
|
|
|
banged pattern (removing the bang) with an as-pattern, and `seq` on it
|
|
|
in the body of the `let`. (Thanks to Ben Rudiak-Gould for suggesting
|
|
|
this idea.)
|
|
|
|
|
|
|
|
|
|
|
|
A useful special case is when the pattern is a variable:
|
|
|
Similarly
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let !y = f x in b
|
|
|
```
|
|
|
|
|
|
|
|
|
means
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let y = f x in y `seq` b
|
|
|
```
|
|
|
|
|
|
|
|
|
which evaluates the `(f x)`, thereby giving a strict `let`.
|
|
|
|
|
|
|
|
|
|
|
|
A useful law is this. A *non-recursive* bang-pattern binding
|
|
|
is equivalent to a `case` expression:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let !p = <rhs> in <body>
|
|
|
is equivalent to (when the let is non-recursive)
|
|
|
case <rhs> of !p -> <body>
|
|
|
```
|
|
|
|
|
|
|
|
|
Here is a more realistic example, a strict version of partition:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
partitionS p [] = ([], [])
|
|
|
partitionS p (x:xs)
|
|
|
| p x = (x:ys, zs)
|
|
|
| otherwise = (ys, x:zs)
|
|
|
where
|
|
|
!(ys,zs) = partitionS p xs
|
|
|
```
|
|
|
|
|
|
|
|
|
The bang in the where clause ensures that the recursive
|
|
|
call is evaluated eagerly. In Haskell today we are forced to
|
|
|
write
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
partitionS p [] = ([], [])
|
|
|
partitionS p (x:xs)
|
|
|
= case partitionS p xs of
|
|
|
(ys,zs) | p x = (x:ys, zs)
|
|
|
| otherwise = (ys, x:zs)
|
|
|
```
|
|
|
|
|
|
|
|
|
which is clumsier (especially if there are a bunch of where-clause
|
|
|
bindings, all of which should be evaluated strictly), and doesn't
|
|
|
provide the opportunity to fall through to the next equation (not
|
|
|
needed in this example but often useful).
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
# Changes to the Report
|
|
|
|
|
|
|
|
|
|
|
|
The changes to the Report would be these. (Incomplete.)
|
|
|
|
|
|
|
|
|
- Section 3.17, add pat ::= !pat to the syntax of patterns.
|
|
|
We would need to take care to make clear whether
|
|
|
|
|
|
```wiki
|
|
|
f !x = 3
|
|
|
```
|
|
|
|
|
|
was a definition of the function "!", or of "f". (There is
|
|
|
a somewhat similar complication with n+k patterns; see the
|
|
|
end of 4.3.3.2 in the Report. However we probably do not
|
|
|
want to require parens thus
|
|
|
|
|
|
```wiki
|
|
|
f (!x) = 3
|
|
|
```
|
|
|
|
|
|
which are required in n+k patterns.
|
|
|
|
|
|
- Section 3.17.2: add new bullet 10, saying "Matching
|
|
|
the pattern "!pat" against a value "v" behaves as follows:
|
|
|
|
|
|
- if v is bottom, the match diverges
|
|
|
- otherwise, "pat" is matched against "v".
|
|
|
|
|
|
- Fig 3.1, 3.2, add a new case (t):
|
|
|
|
|
|
```wiki
|
|
|
case v of { !pat -> e; _ -> e' }
|
|
|
= v `seq` case v of { pat -> e; _ -> e' }
|
|
|
```
|
|
|
|
|
|
- Section 3.12 (let expressions). In the translation box, first apply
|
|
|
the following transformation: for each pattern pi that is of
|
|
|
form `!qi = ei`, transform it to `xi@qi = ei`, and and replace `e0`
|
|
|
by `(xi `seq` e0)`. Then, when none of the left-hand-side patterns
|
|
|
have a bang at the top, apply the rules in the existing box.
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
# Discussion
|
|
|
|
|
|
|
|
|
## Recursive let and where bindings
|
|
|
|
|
|
|
|
|
|
|
|
At first you might think that a recursive bang pattern
|
|
|
don't make sense: how can you evaluate it strictly if it doesn't exist
|
|
|
yet? But consider
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let !xs = if funny then 1:xs else 2:xs in ...
|
|
|
```
|
|
|
|
|
|
|
|
|
Here the binding is recursive, but the translation still makes sense:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let xs = if funny then 1:xs else 2:xs
|
|
|
in xs `seq` ...
|
|
|
```
|
|
|
|
|
|
## Top-level bang-pattern bindings
|
|
|
|
|
|
|
|
|
|
|
|
Does this make sense?
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
module Foo where
|
|
|
!x = factorial 1000
|
|
|
```
|
|
|
|
|
|
|
|
|
A top-level bang-pattern binding like this would imply that
|
|
|
the binding is evaluated when the program is started; a kind of
|
|
|
module initialisation. This makes some kind of sense, since
|
|
|
(unlike unrestricted side effects) it doesn't matter in which order
|
|
|
the module initialisation is performed.
|
|
|
|
|
|
|
|
|
|
|
|
But it's not clear why it would be necessary or useful. **Conservative
|
|
|
conclusion**: no top-level bang-patterns.
|
|
|
|
|
|
|
|
|
## Tricky point: syntax
|
|
|
|
|
|
|
|
|
|
|
|
What does this mean?
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
f ! x = True
|
|
|
```
|
|
|
|
|
|
|
|
|
Is this a definition of `(!)` or a banged argument? (Assuming that
|
|
|
space is insignificant.)
|
|
|
|
|
|
|
|
|
|
|
|
Proposal: resolve this ambiguity in favour of the bang-pattern. If
|
|
|
you want to define `(!)`, use the prefix form
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
(!) f x = True
|
|
|
```
|
|
|
|
|
|
|
|
|
Another point that came up in implementation is this. In GHC, at least,
|
|
|
*patterns* are initially parsed as *expressions*, because Haskell's syntax
|
|
|
doesn't let you tell the two apart until quite late in the day.
|
|
|
In expressions, the form `(! x)` is a right section, and parses fine.
|
|
|
But the form `(!x, !y)` is simply illegal. Solution:
|
|
|
in the syntax of expressions, allow sections without the wrapping parens
|
|
|
in explicit lists and tuples. Actually this would make sense generally:
|
|
|
what could `(+ 3, + 4)` mean apart from a tuple of two sections?
|
|
|
|
|
|
|
|
|
## Tricky point: pattern-matching semantics
|
|
|
|
|
|
|
|
|
|
|
|
A bang is part of a *pattern*; matching a bang forces evaluation.
|
|
|
So the exact placement of bangs in equations matters. For example,
|
|
|
there is a difference between these two functions:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
f1 x True = True
|
|
|
f1 !x False = False
|
|
|
|
|
|
f2 !x True = True
|
|
|
f2 x False = False
|
|
|
```
|
|
|
|
|
|
|
|
|
Since pattern matching goes top-to-bottom, left-to-right,
|
|
|
`(f1 bottom True)` is `True`, whereas `(f2 bottom True)`
|
|
|
is `bottom`.
|
|
|
|
|
|
|
|
|
## Tricky point: binding transformations
|
|
|
|
|
|
|
|
|
|
|
|
In Haskell 98, these two bindings are equivalent:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
{ p1=e1; p2=e2 } and { (~p1,~p2) = (e1,e2) }
|
|
|
```
|
|
|
|
|
|
|
|
|
But with bang patterns this transformation only holds if `p1`, `p2` are
|
|
|
not bang-patterns. Remember, the bang is part of the binding, not the pattern,
|
|
|
|
|
|
|
|
|
## Tricky point: exceptions
|
|
|
|
|
|
|
|
|
|
|
|
Consider this program
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
test = "no exception"
|
|
|
where !_ = error "top down"
|
|
|
!_ = error "bottom up"
|
|
|
|
|
|
eval_order = evaluate test `catch` \e ->
|
|
|
case e of
|
|
|
ErrorCall txt -> return txt
|
|
|
_ -> throw e
|
|
|
```
|
|
|
|
|
|
|
|
|
Depending on whether the strict bindings induced by the bang patters in `test` are evaluated top-to-bottom or bottom-to-top, you'll get different results from `eval_order`. But this is exactly the point of of the [ Imprecise Exceptions paper](http://research.microsoft.com/%7Esimonpj/Papers/imprecise-exn.htm): the semantics of `test` is a set of exception values, rather than a unique one. So this non-determinism (which can only be observed in the IO monad) is fine.
|
|
|
|
|
|
|
|
|
|
|
|
See this [ email thread](http://article.gmane.org/gmane.comp.lang.haskell.general/15751)
|
|
|
|
|
|
|
|
|
## Tricky point: nested bangs (part 1)
|
|
|
|
|
|
|
|
|
|
|
|
Consider this:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let (x, Just !y) = <rhs> in <body>
|
|
|
```
|
|
|
|
|
|
|
|
|
Is `y` evaluted before `<body>` is begun? No, it isn't. That would be
|
|
|
quite wrong. Pattern matching in a `let` is lazy; if any of the variables bound
|
|
|
by the pattern is evaluated, then the whole pattern is matched. In this example,
|
|
|
if `x` or `y` is evaluated, the whole pattern is matched, which in turn forces
|
|
|
evaluation of `y`. The binding is equivalent to
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let t = <rhs>
|
|
|
x = case t of { (x, Just !y) -> x }
|
|
|
y = case t of { (x, Just !y) -> y }
|
|
|
in <body>
|
|
|
```
|
|
|
|
|
|
## Tricky point: nested bangs (part 2)
|
|
|
|
|
|
|
|
|
|
|
|
Consider this:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let !(x, Just !y) = <rhs> in <body>
|
|
|
```
|
|
|
|
|
|
|
|
|
This should be equivalent to
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
case <rhs> of { (x, Just !y) -> <body> }
|
|
|
```
|
|
|
|
|
|
|
|
|
Notice that this meant that the **entire** pattern is matched
|
|
|
(as always with Haskell). The `Just` may fail; `x` is
|
|
|
not evaluated; but `y` **is** evaluated.
|
|
|
|
|
|
|
|
|
|
|
|
This means that you can't give the obvious alternative translation that uses
|
|
|
just let-bindings and {{{seq}}. For example, we could attempt to translate
|
|
|
the example to:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let
|
|
|
t = <rhs>
|
|
|
x = case t of (x, Just !y) -> x
|
|
|
y = case t of (x, Just !y) -> y
|
|
|
in
|
|
|
t `seq` <body>
|
|
|
```
|
|
|
|
|
|
|
|
|
This won't work, because using `seq` on `t` won't force `y`.
|
|
|
However, the semantics says that the original code is equivalent to
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let p@(x, Just !y) = <rhs> in p `seq` <body>
|
|
|
```
|
|
|
|
|
|
|
|
|
and we can desugar that in obvious way to
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let
|
|
|
t = <rhs>
|
|
|
p = case t of p@(x, Just !y) -> p
|
|
|
x = case t of p@(x, Just !y) -> x
|
|
|
y = case t of p@(x, Just !y) -> y
|
|
|
in
|
|
|
p `seq` <body>
|
|
|
```
|
|
|
|
|
|
|
|
|
which is fine.
|
|
|
|
|
|
|
|
|
|
|
|
You could also build an intermediate tuple, thus:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let
|
|
|
t = case <rhs> of p@(x, Just !y) -> (p,x,y)
|
|
|
p = sel13 t
|
|
|
x = sel23 t
|
|
|
y = sel33 t
|
|
|
in
|
|
|
t `seq` <body>
|
|
|
```
|
|
|
|
|
|
|
|
|
Indeed GHC does just this for complicated pattern bindings.
|
|
|
|
|
|
|
|
|
## Tricky point: polymorphism
|
|
|
|
|
|
|
|
|
|
|
|
Haskell allows this:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let f :: forall a. Num a => a->a
|
|
|
Just f = <rhs>
|
|
|
in (f (1::Int), f (2::Integer))
|
|
|
```
|
|
|
|
|
|
|
|
|
But if we were to allow a bang pattern, `!Just f = <rhs>`,
|
|
|
with the translation to
|
|
|
a case expression given earlier, we would end up with
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
case <rhs> of { Just f -> (f (1::Int), f (2::Integer) }
|
|
|
```
|
|
|
|
|
|
|
|
|
But if this is Haskell source, then `f` won't be polymorphic.
|
|
|
|
|
|
|
|
|
|
|
|
One could say that the translation isn't required to preserve the static
|
|
|
semantics, but GHC, at least, translates into System F, and being able
|
|
|
to do so is a good sanity check. If we were to do that, then we would
|
|
|
need
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
<rhs> :: Maybe (forall a. Num a => a -> a)
|
|
|
```
|
|
|
|
|
|
|
|
|
so that the case expression works out in System F:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
case <rhs> of {
|
|
|
Just (f :: forall a. Num a -> a -> a)
|
|
|
-> (f Int dNumInt (1::Int), f Integer dNumInteger (2::Integer) }
|
|
|
```
|
|
|
|
|
|
|
|
|
The trouble is that `<rhs>` probably has a type more like
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
<rhs> :: forall a. Num a => Maybe (a -> a)
|
|
|
```
|
|
|
|
|
|
|
|
|
…and now the dictionary lambda may get in the way
|
|
|
of forcing the pattern.
|
|
|
|
|
|
|
|
|
|
|
|
This is a swamp. **Conservative conclusion**: no generalisation (at all)
|
|
|
for bang-pattern bindings.
|
|
|
|
|
|
|
|
|
## Existentials
|
|
|
|
|
|
|
|
|
|
|
|
A strict pattern match could perhaps support existentials (which GHC currently
|
|
|
rejects in pattern bindings):
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
data T where
|
|
|
T :: a -> (a->Int) -> T
|
|
|
|
|
|
f x = let !(T y f) = x in ...
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
# A more radical proposal
|
|
|
|
|
|
|
|
|
|
|
|
A more radical proposal (from John Hughes) is to make pattern matching
|
|
|
in `let` strict. Thus
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let (x,y) = e in b
|
|
|
```
|
|
|
|
|
|
|
|
|
would be equivalent to
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
case e of (x,y) -> b
|
|
|
```
|
|
|
|
|
|
|
|
|
(in the non-recursive case) and to
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let t = let { x = fst t; y = snd t } in e
|
|
|
in case t of (x,y) -> b
|
|
|
```
|
|
|
|
|
|
|
|
|
(in the recursive case).
|
|
|
|
|
|
|
|
|
|
|
|
To recover Haskell 98 semantics for a pattern binding, use a tilde:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let ~(x,y) = e in b
|
|
|
```
|
|
|
|
|
|
|
|
|
More precisely, the rules for desugaring binding are these:
|
|
|
|
|
|
|
|
|
1. Sort into strongly-connected components.
|
|
|
1. Apply the following rules:
|
|
|
|
|
|
```wiki
|
|
|
let p1=e1 ; ... ; pn = en in e ==> let (p1,...,pn) = (e1,...,en) in e
|
|
|
|
|
|
let p = e1 in e0 ==> case e1 of p -> e0
|
|
|
where no variable in p occurs free in e1
|
|
|
|
|
|
let p = e1 in e0 ==> let p = fix (\ ~p -> e1) in e0
|
|
|
otherwise
|
|
|
```
|
|
|
|
|
|
## Why the strongly-connected component?
|
|
|
|
|
|
|
|
|
|
|
|
Consider this
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
let (y:ys) = xs
|
|
|
(z:zs) = ys
|
|
|
in ...
|
|
|
```
|
|
|
|
|
|
|
|
|
There is no real recursion here, but considered all together the
|
|
|
second binding does refer to the first. So the desugaring above would use
|
|
|
fix, and if you work it out you'll find that all the variables get bound to bottom.
|
|
|
|
|
|
|
|
|
|
|
|
Note that the *static* semantics already requires a strongly-conneted component
|
|
|
analysis.
|
|
|
|
|
|
|