stg-spec.mng 6.68 KB
Newer Older
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133




% increase spacing between rules for ease of reading:



\title{The Spineless Tagless G-Machine, as implemented by GHC}




This document presents the syntax of STG, and the cost semantics utilized
for profiling.  While this document will be primarily useful for people
looking to work on profiling in GHC, the hope is that this will eventually
expanded to also have operational semantics for modern STG.

While care has been taken to adhere to the behavior in GHC, these rules
have not yet been used to perform any proofs.  There is some sloppiness
in here that probably would have to be cleaned up, especially with
respect to let-no-escape.  Some details are elided from this
presentation, especially extra annotated data in the STG data type
itself which is useful for code generation but not strictly necessary.



We will use the following metavariables:



Literals do not play a big role, so they are kept abstract:


Primitive operations and foreign calls can influence the costs of
an application, but because their behavior depends on the specific
operation in question, they are kept abstract for simplicity's sake.



Arguments in STG are restricted to be literals or variables:


\subsection{Cost centres}

Cost centres are abstract labels to which costs can be attributed.  They
are collected into cost centre stacks.  Entering a
function requires us to combine two cost-centre stacks ($\bowtie$),
while entering a SCC pushes a cost-centre onto a cost-centre stack
($\triangleright$); both of these functions are kept abstract in this
presentation.  The special current cost-centre stack ($\bullet$) occurs
only in STG and not at runtime and indicates that the lexically current
cost-centre should be used at runtime (see the cost semantics for details).
Some times we do not care about the choice of cost centre stack, in which case
we will use the don't care cost centre stack.




The STG datatype that represents expressions:


STG is a lot like Core, but with some differences:

\item Function arguments must be literals or variables (thus, function application does not allocate),
\item Constructor and primitive applications are saturated,
\item Let-bindings can only have constructor applications or closures on the right-hand-side, and
\item Lambdas are forbidden outside of let-bindings.

The details of bindings for let statements:




Closures have an update flag, which indicates whether or not they are
functions or thunks:


Details for case alternatives:


\section{Runtime productions}

In our cost semantics, we will explicitly model the heap:


Assignments on the heap are from names to heap values with an associated
cost-centre stack.  In our model, allocation produces a fresh name which
acts as a pointer to the value on the heap.



Brian Wignall's avatar
Brian Wignall committed
Execution proceeds until a return value (a literal or a variable, i.e.
Herbert Valerio Riedel's avatar
Herbert Valerio Riedel committed
pointer to the heap) is produced.  To accommodate for let-no-escape
136 137 138 139 140 141
bindings, we also allow execution to terminate with a jump to a function
application of a let-no-escape variable.


Values $v$ are functions (re-entrant closures) and constructors; thunks
Gabor Greif's avatar
Gabor Greif committed
are not considered values.  Evaluation guarantees that a value will be
143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173

Profiling also records allocation costs for creating objects on the heap:



\section{Cost semantics}

The judgment can be read as follows: with current cost centre $\textsf{ccs}$
and current heap $\Gamma$, the expression $e$ evaluates to $ret$, producing
a new heap $\Delta$ and a new current cost centre $\textsf{ccs'}$, performing
$\theta$ allocations.



\item These semantics disagree with the executable semantics presented
by Simon Marlow at the Haskell Implementor's Workshop 2012.  (ToDo:
explain what the difference is.)
\item In the \textsc{Thunk} rule, while the indirection is attributed to
$\textsf{ccs}_0$, the result of the thunk itself ($y$) may be attributed
to someone else.
\item \textsc{AppUnder} and \textsc{AppOver} deal with under-saturated
and over-saturated function application.
The implementations of \textsc{App} rules are spread across two
different calling conventions for functions: slow calls and
direct calls.  Direct calls handle saturated and over-applied
cases (\coderef{GHC/StgToCmm/Layout.hs}{slowArgs}), while slow
175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200
calls handle all cases (\textit{utils/genapply/GenApply.hs});
in particular, these cases ensure that the current cost-center
reverts to the one originally at the call site.
\item The \textsc{App} rule demonstrates that modern GHC
profiling uses neither evaluation scoping or lexical scoping; rather,
it uses a hybrid of the two (though with an appropriate definition
of $\bowtie$, either can be recovered.)  The presence of cost centre stacks is one of the primary
differences between modern GHC and Sansom'95.
\item The \textsc{AppTop} rule utilizes $\bullet$ to notate when a
function should influence the current cost centre stack.  The data type
used here could probably be simplified, since we never actually take
advantage of the fact that it is a cost centre.
\item As it turns out, the state of the current cost centre after
evaluation is never utilized.  In the original Sansom'95, this information
was necessary to allow for the implementation of lexical scoping; in
this presentation, all closures must live on the heap, and the cost centre
is thus recorded there.
\item \textsc{LneClosure} must explicitly save and reset the $\textsf{ccs}$ when the
binding is evaluated, whereas \textsc{LetClosure} takes advantage of the
fact that when the closure is allocated on the heap the $\text{ccs}$ is saved.
(ToDo: But isn't there a behavior difference when the closure is re-entrant?
Note that re-entrant/updatable is indistinguishable to a let-no-escape.)
\item Recursive bindings have been omitted but they work in the way you would expect.