dias · 4c8b6fa4
--- a/commentary/compiler/stack-areas.md
+++ b/commentary/compiler/stack-areas.md
@@ -2,16 +2,21 @@

 ## Stack Layout

-### The old approach

+The stack-layout phase decides where to spill variables. The important goals are to avoid memory traffic and to minimize the size of the stack frame. Both of these goals are accomplished by reusing stack slots.

-In the old code generator, most of the pipeline refers to variables by name. In fact, we have a phase ordering problem: no compiler phase can name a stack slot until stack layout has been fixed. But stack layout can only be fixed at the end of the pipeline. The consequence of this approach is that we have to provide special treatment for code that must refer to stack slots (e.g. parameter passing in calling conventions, or spills and reloads). In particular, we defined special instructions for CopyIn and CopyOut of function arguments, which implicitly stand for an adjustment to the stack pointer and some parallel assignments to the function parameters or return results.
+### Representing Stack Slots
+
+#### The old approach
+
+
+In the old code generator, we have a phase ordering problem: no compiler phase can name a stack slot until stack layout has been fixed. But stack layout can only be fixed at the end of the pipeline. The consequence of this approach is that most of the pipeline requires special treatment for code that must refer to stack slots (e.g. parameter passing in calling conventions, or spills and reloads). In particular, we defined special instructions for CopyIn and CopyOut of function arguments, which implicitly stand for an adjustment to the stack pointer and some parallel assignments to the function parameters or return results.


 For example, we compile a function call

 ```wiki
-x, y = f(a, b, c)
+x, y = f(a, b, c);
 ```


@@ -27,20 +32,20 @@ k:

 Every stage of the back end must cope with the CopyIn and CopyOut pseudoinstructions.

-### The new approach
+#### The new approach


 A better approach is to introduce a unique name for each stack slot, then treat the name as the addressing expression for the slot. At the end of the pipeline, we choose a stack layout, then replace each stack slot with its offset from the stack pointer. The benefit is that we break the phase-ordering problem: any phase of the compiler can name a stack slot.


-For example, if we want to spill a variable *x*, we use a regular store instruction to a stack slot at address *stack\<x\>*:
+For example, if we want to spill a variable `x`, we use a regular store instruction to a stack slot at address `stack<x>`:

 ```wiki
 m[stack<x>] := x;
 ```


-where *m\[e\]* refers to an address *e* in memory.
+where `m[e]` refers to an address `e` in memory.


 But what about parameter passing? We use a similar technique, but this time we describe the slot for each location as an offset within the area where the parameters are passed. For example, we compile a function call
@@ -60,11 +65,11 @@ into approximately the following C--:
  m[stack<k + 4>] := c;
 k:  // on entry to k, sp == stack<k+3>
  x := m[stack<k + 2>]
-  y := m[stack<k + 1>]
+  y := m[stack<k + 3>]
 ```


-Note that the semantics of the now-unnecessary CopyIn and CopyOut are reified by an assignment to the stack pointer and by a series of copy instructions. If an optimization understands copy instructions, it can improve this code -- without having to worry about the semantics of CopyIn.
+Note that the semantics of the now-unnecessary CopyIn and CopyOut are reified by an assignment to the stack pointer and a series of copy instructions. If an optimization understands copy instructions, it can improve this code -- without having to worry about the semantics of CopyIn.


 Furthermore, the job of laying out the stack is now pleasantly simple: decide where to place the slots and parameter-passing areas, then rewrite the references to stack locations. The stack-layout phase is no longer responsible for inserting stack adjustments or lowering CopyIn and CopyOut nodes to data-movement instructions.
@@ -85,6 +90,31 @@ data CmmExpr
  deriving Eq
 ```

+
+An `Area` represents space on the stack; it may use either the `RegSlot` constructor to represent a single stack slot for a register or the `CallArea` constructor to represent parameters passed to/from a function call/return. In a `CallArea`, the `BlockId` is the label of the function call's continuation, and the two integers are the sizes of the outgoing and incoming parameter-passing areas.
+
+
+To name a specific location on the stack, we represent its address with a new kind of `CmmExpr`: the `CmmStackSlot`. A `CmmStackSlot` is just an integer offset into an `Area`. If the `Area` is a `RegSlot`, it is a dynamic invariant that the offset must be `0`.
+
+
+Note: We don't have a virtual frame pointer in this story, but do we really want it? Here's a minor argument against: it requires special treatment by some analyses in Quick C-- (on the other hand, QC-- doesn't have distinguished global registers, so it might not even be an issue in GHC, which has many distinguished global registers).
+
+### Laying out the stack
+
+
+A naive approach to laying out the stack would be to give each variable its own stack slot for spilling, and allocate only the ends of the stack frame for parameter-passing areas. But this approach misses two important opportunities for optimization:
+
+- Stack slots can be reused by variables that are never on the stack at the same time
+- If a function returns a variable on the stack, we might be able to use the return location as the variable's spill slot.
+
+
+As it turns out, it is quite common in GHC that the first definition of a variable comes when its value is returned from a function call. If the value is returned on the stack, then an important optimization is to avoid copying that value to some other local location on the stack. How is that achieved? By making sure the location where the value is returned is defined as its spill slot. 
+
+## Random Thoughts
+
+
+Assignments into parameter-passing areas can't be hoisted past adjustments to the stack pointer until we know the stack layout -- otherwise we might try to write off the young end of the stack. There must be some sort of invariant here, something along the lines of: "a reference to a parameter passing area must (a) succeed the adjustment moving the stack pointer to the bottom of the area and (b) precede any other stack adjustment.
+
 ## ToDo

 - Explain the stack layout algorithm.