|
|
# Demand analyser in GHC
|
|
|
|
|
|
This wiki page focuses on information that GHC devs need to know about demand analysis and the corresponding worker/wrapper transformation that feeds on strictness and absence info.
|
|
|
|
|
|
This page explains basics of the so-called demand analysis in GHC, comprising strictness and absence analyses. Meanings of demand signatures are explained and examples are provided. Also, components of the compiler possibly affected by the results of the demand analysis are listed with explanations provided.
|
|
|
As a first step, it is recommended to get up to speed on demand analysis and notation of demand signatures that is relevant to *users* of GHC, as explained in the [user's guide entry on `-fstrictness`](https://ghc.gitlab.haskell.org/ghc/doc/users_guide/using-optimisation.html?highlight=demand#ghc-flag--fstrictness).
|
|
|
|
|
|
- The [demand-analyser draft paper](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/03/demand-jfp-draft.pdf) is as yet unpublished, but gives the most accurate overview of the way GHC's demand analyser works.
|
|
|
Unfortunately, there isn't a single paper (yet) that describes demand analysis as a whole. The relevant sources are:
|
|
|
|
|
|
- The [demand-analyser draft paper (2017)](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/03/demand-jfp-draft.pdf) is as yet unpublished, but gives the most accurate overview of the way GHC's demand analyser works.
|
|
|
- The [cardinality paper (2014)](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.646.8707&rep=rep1&type=pdf) describes what we call usage analysis today and introduces higher-order call demands. Also described in the demand-analysis draft paper.
|
|
|
- SG wrote a more colloquial [blog post](http://fixpt.de/blog/2017-12-04-strictness-analysis-part-1.html) [series](http://fixpt.de/blog/2018-12-30-strictness-analysis-part-2.html) about strictness analysis. Uses old demand notation, unfortunately.
|
|
|
|
|
|
---
|
|
|
|
|
|
## Demand signatures
|
|
|
|
|
|
|
|
|
Let us compile the following program with `-O2 -ddump-stranal` flags:
|
|
|
|
|
|
```wiki
|
|
|
f c p = case p
|
|
|
of (a, b) -> if c
|
|
|
then (a, True)
|
|
|
else (True, False)
|
|
|
```
|
|
|
|
|
|
|
|
|
The resulting demand signature for function `f` will be the following one:
|
|
|
|
|
|
```wiki
|
|
|
Str=DmdType <S,U><S,U(UA)>m
|
|
|
```
|
|
|
|
|
|
|
|
|
This should be read as "`f` puts stricts demands on both its arguments (hence, `S`); `f` might use its first and second arguments. but in the second argument (which is a product), the second component is ignored". The suffix `m` in the demand signature indicates that the function returns **CPR**, a constructed product result (for more information on CPR see the JFP paper [Constructed Product Result Analysis for Haskell](http://research.microsoft.com/en-us/um/people/simonpj/Papers/cpr/index.htm)).
|
|
|
|
|
|
|
|
|
Current implementation of demand analysis in Haskell performs annotation of all binders with demands, put on them in the context of their use. For functions, it is assumed, that the result of the function is used strictly. The analysis infers strictness and usage information separately, as two components of a cartesian product domain. The same analysis also performs inference CPR and bottoming properties for functions, which can be read from the suffix of the signature. Demand signatures of inner definitions may also include *demand environments* that indicate demands, which a closure puts to its free variables, once strictly used, e.g. the signature
|
|
|
|
|
|
```wiki
|
|
|
Str=DmdType <L,U> {skY-><S,U>}
|
|
|
```
|
|
|
|
|
|
|
|
|
indicates that the function has one parameter, which is used lazily (hence `<L,U>`), however, when its result is used strictly, the free variable `skY` in its body is also used strictly.
|
|
|
|
|
|
### Grammar
|
|
|
|
|
|
This a simple grammar extracted from the `Outputable` instances as of GHC 8.11:
|
|
|
```python
|
|
|
DmdType := JointDmd* Divergence
|
|
|
|
|
|
# Joint strictness/usage demand
|
|
|
JointDmd := '<' StrDmd ',' UseDmd '>'
|
|
|
|
|
|
####################################
|
|
|
# Strictness demands
|
|
|
####################################
|
|
|
|
|
|
StrDmd := 'B' # HyperStr: Diverges if forced (bottom of lattice)
|
|
|
| 'C' '(' StrDmd ')' # SCall: Call demand
|
|
|
| 'S' '(' ArgStr* ')' # SProd: Product demand
|
|
|
| 'S' # HeadStr: Forced only to WHNF
|
|
|
|
|
|
# Argument strictness
|
|
|
ArgStr := 'L' # Lazy: Argument not necessarily demanded
|
|
|
| StrDmd # Strict: Places given strictness demand on argument
|
|
|
|
|
|
####################################
|
|
|
# Usage demands
|
|
|
####################################
|
|
|
|
|
|
UseDmd := 'U' # Used: Top of lattice
|
|
|
| 'U' '(' (ArgUse ',')* ')' # UProd: Used only for values of product type
|
|
|
| 'C' Count '(' UseDmd ')' # UCall: Used only for values of function type
|
|
|
| 'H' # UHead: Used, but only to WHNF; components definitely not used
|
|
|
|
|
|
# Argument usage
|
|
|
ArgUse := 'A' # Abs: Definitely unused (bottom of lattice)
|
|
|
| '1*' UseDmd # Use Once: Used with the given usage demand exactly once
|
|
|
| UseDmd # Use Many: Used with the given usage demand more than once
|
|
|
|
|
|
# Usage cardinality
|
|
|
Count := '1' # Once
|
|
|
| '' # Many times
|
|
|
|
|
|
# Divergence
|
|
|
Divergence := 'b' # Diverges: Definitely divergences but *doesn't* throw a precise exception.
|
|
|
| 'x' # ExnOrDiv: Definitely diverges or throws a precise exception.
|
|
|
| '' # Dunno: May or may not diverge
|
|
|
|
|
|
####################################
|
|
|
# Constructed Product Result types
|
|
|
####################################
|
|
|
|
|
|
CprType := Arity CprResult # The arity is the number of value arguments necessary
|
|
|
# for the expression to reduce to CprResult.
|
|
|
|
|
|
CprResult := '' # NoCPR: No CPR information (top of lattice)
|
|
|
| 'm' ConTag # ConCPR: The result is the constructor identified by ConTag
|
|
|
| 'b' # BotCPR: Evaluation bottoms (bottom of lattice)
|
|
|
```
|
|
|
|
|
|
### Demand descriptions
|
|
|
|
|
|
|
|
|
|
|
|
Strictness demands
|
|
|
|
|
|
|
|
|
- `B` -- a *hyperstrict* demand. The expression `e` puts this demand on its argument `x` if every evaluation of `e` is guaranteed to diverge, regardless of the value of the argument. We call this demand *hyperstrict* because it is safe to evaluate `x` to arbitrary depth before evaluating `e`. This demand is polymorphic with respect to function calls and can be seen as `B = C(B) = C(C(B)) = ...` for an arbitrary depth.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- `L` -- a *lazy* demand. If an expression `e` places demand `L` on a variable `x`, we can deduce nothing about how `e` uses `x`. `L` is the completely uninformative demand, the top element of the lattice.
|
|
|
|
|
|
- `S` -- a *head-strict* demand. If `e` places demand `S` on `x` then `e` evaluates `x` to at least head-normal form; that is, to the outermost constructor of `x`. This demand is typically placed by the `seq` function on its first argument. The demand `S(L ... L)` places a lazy demand on all the components, and so is equivalent to `S`; hence the identity `S = S(L ... L)`. Another identity is for functions, which states that `S = C(L)`. Indeed, if a function is certainly called, it is evaluated at lest up to the head normal form, i.e., *strictly*. However, its result may be used lazily.
|
|
|
|
|
|
- `S(s1 ... sn)` -- a structured strictness demand on a product. It is at least head-strict, and perhaps more.
|
|
|
|
|
|
- `C(s)` -- a *call-demand*, when placed on a binder `x`, indicates that the value is a function, which is always called and its result is used according to the demand `s`.
|
|
|
|
|
|
|
|
|
Absence/usage demands
|
|
|
|
|
|
- `A` -- when placed on a binder `x` it means that `x` is definitely unused.
|
|
|
|
|
|
- `U` -- the value is used on some execution path. This demand is a top of usage domain.
|
|
|
|
|
|
- `H` -- a *head-used* demand. Indicates that a product value is used itself, however its components are certainly ignored. This demand is typically placed by the `seq` function on its first argument. This demand is polymorphic with respect to products and functions. For a product, the head-used demand is expanded as `U(A, ..., A)` and for functions it can be read as `C(A)`, as the function is called (i.e., evaluated to at least a head-normal form), but its result is ignored.
|
|
|
|
|
|
- `U(u1 ... un)` -- a structured usage demand on a product. It is at least head-used, and perhaps more.
|
|
|
|
|
|
- `C(u)` -- a *call-demand* for usage information. When put on a binder `x`, indicates that `x` in all executions paths where `x` is used, it is *applied* to some argument, and the result of the application is used with a demand `u`.
|
|
|
|
|
|
|
|
|
Additional information (demand signature suffix)
|
|
|
|
|
|
- `m` -- the function returns a [constructed product result](http://research.microsoft.com/en-us/um/people/simonpj/Papers/cpr/index.htm).
|
|
|
|
|
|
- `b` -- the function definitely diverges.
|
|
|
|
|
|
- `x` -- the function catches exceptions. For instance, consider `catch undefined g`: naturally, `catch` is strict in its first argument and therefore one would usually think that this expression would bottom. However, `catch` has special semantics: it catches exceptions. Consequently we give the first argument a demand of `C(L)x` to indicate that an application of `catch` to bottom can't be assumed to be itself bottom.
|
|
|
See the info in the [user's guide entry on `-fstrictness`](https://ghc.gitlab.haskell.org/ghc/doc/users_guide/using-optimisation.html?highlight=demand#ghc-flag--fstrictness).
|
|
|
|
|
|
## Worker-Wrapper split
|
|
|
|
|
|
|
|
|
Demand analysis in GHC drives the *worker-wrapper transformation*, which exposes specialised calling conventions to the rest of the compiler. In particular, the worker-wrapper transformation implements the unboxing optimisation.
|
|
|
|
|
|
|
|
|
The worker-wrapper transformation splits each
|
|
|
function `f` into a *wrapper*, with the
|
|
|
ordinary calling convention, and a *worker*, with a specialised
|
... | ... | @@ -153,9 +28,9 @@ worker; it simply calls the worker using the specialised calling convention. |
|
|
The transformation can be expressed directly in GHC's intermediate language.
|
|
|
Suppose that `f` is defined thus:
|
|
|
|
|
|
```wiki
|
|
|
f :: (Int,Int) -> Int
|
|
|
f p = <rhs>
|
|
|
```hs
|
|
|
f :: (Int,Int) -> Int
|
|
|
f p = <rhs>
|
|
|
```
|
|
|
|
|
|
|
... | ... | @@ -164,13 +39,13 @@ and uses its components. |
|
|
What worker-wrapper split shall we make? Here is one
|
|
|
possibility:
|
|
|
|
|
|
```wiki
|
|
|
f :: (Int,Int) -> Int
|
|
|
f p = case p of
|
|
|
```hs
|
|
|
f :: (Int,Int) -> Int
|
|
|
f p = case p of
|
|
|
(a,b) -> $wf a b
|
|
|
|
|
|
$wf :: Int -> Int -> Int
|
|
|
$wf a b = let p = (a,b) in <rhs>
|
|
|
$wf :: Int -> Int -> Int
|
|
|
$wf a b = let p = (a,b) in <rhs>
|
|
|
```
|
|
|
|
|
|
|
... | ... | @@ -184,12 +59,12 @@ pass them to the worker `$wf`. Hence the need for absence |
|
|
analysis. Suppose, then, that we know that `b` is not needed. Then
|
|
|
we can transform to:
|
|
|
|
|
|
```wiki
|
|
|
f :: (Int,Int) -> Int
|
|
|
f p = case p of (a,b) -> $wf a
|
|
|
```hs
|
|
|
f :: (Int,Int) -> Int
|
|
|
f p = case p of (a,b) -> $wf a
|
|
|
|
|
|
$wf :: Int -> Int
|
|
|
$wf a = let p = (a,error "abs") in <rhs>
|
|
|
$wf :: Int -> Int
|
|
|
$wf a = let p = (a,error "abs") in <rhs>
|
|
|
```
|
|
|
|
|
|
|
... | ... | |