Strict & StrictData
This page explains the motivation, semantics, and implementation of the new language extensions
High-performance Haskell code (e.g. numeric code) can sometimes be littered with bang patterns, making it harder to read. The reason is that laziness isn't the right default in this particular code, but the programmer has no way to say that except by repeatedly adding bang patterns. This page proposes two new language extensions,
Strict, that allow the programmer to switch the default on a per module basis.
StrictData language extension switches data type declarations to be strict by default allowing fields to be lazy by adding a
~ in front of the field.
When the user writes
data T = C a
we interpret it as if she had written
data T = C !a
Haskell doesn't allow for
~ patterns in data constructor definitions today: we'll add support for such definitions and have it give the current lazy behavior.
The extension only affects definitions in this module.
Strict language extension switches functions, data types, and bindings to be strict by default, allowing optional laziness by adding
~ in front of a variable. This essentially reverses the present situation where laziness is default and strictness can be optionally had by adding
! in front of a variable.
Function definitions. When the user writes
f x = ...
we interpret it as if she had written
f !x = ...
~in front of
xgives the old lazy behavior.
Let/where bindings. When the user writes
let x = ... let pat = ...
we interpret it as if she had written
let !x = ... let !pat = ...
~in front of
xgives the old lazy behavior. Notice that we do not put bangs on nested patterns. For example
let (p,q) = if flob then (undefined, undefined) else (True, False) in ...
will behave like
let !(p,q) = if flob then (undefined, undefined) else (True, False)
which will strictly evaluate the RHS, and bind
qto the components of the pair. But the pair itself is lazy (unless we also compile the Prelude with
Strict; see "Modularity" below). So
qmay end up bound to
undefined. See also "Recursive and polymorphic let bindings" below.
Case expressions. The patterns of a case expression get an implicit bang, unless disabled with
~. For example
case x of (a,b) -> rhs
is interpreted as
case x of !(a,b) -> rhs
Since the semantics of pattern matching in case expressions is strict, this usually has no effect whatsoever. But it does make a difference in the degenerate case of variables and newtypes. So
case x of y -> rhs
is lazy in Haskell, but with
Strictis interpreted as
case x of !y -> rhs
which evalutes x. Similarly, if
newtype Age = MkAge Int, then
case x of MkAge i -> rhs
is lazy in Haskell; but with
Strictthe added bang makes it strict.
Top level bindings are unaffected by
Strict. For example:
x = factorial 20 (y,z) = if x > 10 then True else False
xand the pattern binding
(y,z)remain lazy. Reason: there is no good moment to force them, until first use.
Newtypes. There is no effect on newtypes, which simply rename existing types. For example:
newtype T = C a f (C x) = rhs1 g !(C x) = rhs2
In ordinary Haskell ,
fis lazy in its argument and hence in
gis strict in its argument and hence also strict in
Strict, both become strict because
f's argument gets an implict bang.
The extension only affects definitions in this module. Functions and data types imported from other modules are unaffected. For example, we won't evaluate the argument to
Just before applying the constructor. Similarly we won't evaluate the first argument to
Data.Map.findWithDefault before applying the function.
This is crucial to preserve correctness. Entities defined in other modules might rely on laziness for correctness (whether functional or performance).
Maybe, and all the other types from
Prelude continue to have their existing, lazy, semantics.
Recursive and polymorphic let bindings
Consider a banged let-binding
let !pat = rhs in body
Bang patterns in let bindings today (GHC 7.8.3 and earlier) behave as described in the user manual:
- The binding cannot be recursive
- The variables bound by the pattern always get monomorphic types
- The complete pattern is matched before evaluation of
The intent was that it is valid to desugar such a binding to
case rhs of pat -> body
This currently applies even if the pattern is just a single variable, so that the
case boils down to a
Continuing with this rule would mean that
Strict would not allow recursive or polymoprhic pattern bindings at all. So instead we propose the following revised specification for bang patterns in let bindings.
Static semantics. Exactly as in Haskell, unaffected by
Strict. This is more permissive than the current rule for bang patterns in let bindings, because it supports bang-patterns for polymorphic and recursive bindings.
Dynamic semantics. Consider the rules in the box of Section 3.12 of the Haskell report. Replace these rules with the following ones, where
vstands for a variable
(FORCE). Replace any binding
!p = ewith
v = e; p = vand replace
vis fresh. This translation works fine if
pis already a variable
x, but can obviously be optimised by not introducing a fresh variable
(SPLIT). Replace any binding
p = e, where
pis not a variable, with
v = e; x1 = case v of p -> x1; ...; xn = case v of p -> xn, where
vis fresh and
xnare the bound variables of
p. Again if
eis a variable, you can optimised his by not introducing a fresh variable.
- (FORCE). Replace any binding
The result will be a (possibly) recursive set of bindings, binding only simple variables on the LHS. (One could go one step further, as in the Haskell Report and make the recursive bindings non-recursive using
fix, but we do not do so in Core, and it only obfuscates matters, so we do not do so here.)
Here are some examples of how this translation works. The first expression of each sequence is Haskell source; the subsequent ones are Core.
Here is a simple non-recursive case.
let x :: Int -- Non-recursive !x = factorial y in body ===> (FORCE) let x = factorial y in x `seq` body ===> (inline seq) let x = factorial y in case x of x -> body ===> (inline x) case factorial y of x -> body
Same again, only with a pattern binding
let !(x,y) = if blob then (factorial p, factorial q) else (0,0) in body ===> (FORCE) let v = if blob then (factorial p, factorial q) else (0,0) (x,y) = v in v `seq` body ===> (SPLIT) let v = if blob then (factorial p, factorial q) else (0,0) x = case v of (x,y) -> x y = case v of (x,y) -> y in v `seq` body ===> (inline seq, float x,y bindings inwards) let v = if blob then (factorial p, factorial q) else (0,0) in case v of v -> let x = case v of (x,y) -> x y = case v of (x,y) -> y in body ===> (fluff up v's pattern; this is a standard Core optimisation) let v = if blob then (factorial p, factorial q) else (0,0) in case v of v@(p,q) -> let x = case v of (x,y) -> x y = case v of (x,y) -> y in body ===> (case of known constructor) let v = if blob then (factorial p, factorial q) else (0,0) in case v of v@(p,q) -> let x = p y = q in body ===> (inline x,y) let v = if blob then (factorial p, factorial q) else (0,0) in case v of (p,q) -> body[p/x, q/y]
The final form is just what we want: a simple case expression.
Here is a recursive case
letrec xs :: [Int] -- Recursive !xs = factorial y : xs in body ===> (FORCE) letrec xs = factorial y : xs in xs `seq` body ===> (inline seq) letrec xs = factorial y : xs in case xs of xs -> body ===> (eliminate case of value) letrec xs = factorial y : xs in body
and a polymorphic one:
let f :: forall a. [a] -> [a] -- Polymorphic !f = fst (reverse, True) in body ===> (FORCE) let f = /\a. fst (reverse a, True) in f `seq` body -- Notice that the `seq` is added only in the translation to Core -- If we did it in Haskell source, thus -- let f = ... in f `seq` body -- then f's polymorphic type would get intantiated, so the Core -- translation woudl be -- let f = ... in f Any `seq` body ===> (inline seq, inline f) case (/\a. fst (reverse a, True)) of f -> body
When overloading is involved, the results might be slightly counter intuitive:
let f :: forall a. Eq a => a -> [a] -> Bool -- Overloaded !f = fst (member, True) in body ===> (FORCE) let f = /\a \(d::Eq a). fst (member, True) in f `seq` body ===> (inline seq, case of value) let f = /\a \(d::Eq a). fst (member, True) in body
Note that the bang has no effect at all in this case
Interaction with irrefutable patterns
~ is used to recover ordinary patterns, to build an irrefutable pattern
~(~pat) is used.
Consider a recursive group like this
letrec f : g = rhs[f,g] in <body>
Strict, we get a translation like this:
let t = /\a. letrec tm = rhs[fm,gm] fm = case t of fm:_ -> fm gm = case t of _:gm -> gm in (fm,gm) in let f = /\a. case t a of (fm,_) -> fm in let g = /\a. case t a of (_,gm) -> gm in <body>
tm is the monomorphic binding for
Strict, we want to force
tm, but NOT
tm isn't in scope in the
in <body> part.
The simplest thing is to return it in the polymoprhic
let t = /\a. letrec tm = rhs[fm,gm] fm = case t of fm:_ -> fm gm = case t of _:gm -> gm in (tm, fm, gm) in let f = /\a. case t a of (_,fm,_) -> fm in let g = /\a. case t a of (_,_,gm) -> gm in let tm = /\a. case t a of (tm,_,_) -> tm in tm `seq` <body>