... | ... | @@ -48,7 +48,7 @@ We use the following types to represent stack slots and parameter-passing areas: |
|
|
```wiki
|
|
|
data Area
|
|
|
= RegSlot LocalReg
|
|
|
| CallArea BlockId Int Int
|
|
|
| CallArea BlockId
|
|
|
deriving (Eq, Ord)
|
|
|
|
|
|
data CmmExpr
|
... | ... | @@ -59,30 +59,34 @@ data CmmExpr |
|
|
```
|
|
|
|
|
|
|
|
|
An `Area` represents space on the stack; it may use either the `RegSlot` constructor to represent a single stack slot for a register or the `CallArea` constructor to represent parameters passed to/from a function call/return. In a `CallArea`, the `BlockId` is the label of the function call's continuation, and the two integers are the sizes of the outgoing and incoming parameter-passing areas.
|
|
|
An `Area` represents space on the stack; it may use either the `RegSlot` constructor to represent a single stack slot for a register or the `CallArea` constructor to represent parameters passed to/from a function call/return. In a `CallArea`, the `BlockId` is the label of the function call's continuation. Each `Area` grows down, with offset 0 pointing to the old end of the `Area`.
|
|
|
|
|
|
|
|
|
To name a specific location on the stack, we represent its address with a new kind of `CmmExpr`: the `CmmStackSlot`. A `CmmStackSlot` is just an integer offset into an `Area`. Each stack area grows down, with offset 0 pointing to the old end of the area. If we wanted to place a 4-byte object at the old end of the area, we would address it using the offset 4.
|
|
|
To name a specific location on the stack, we represent its address with a new kind of `CmmExpr`: the `CmmStackSlot`.
|
|
|
A `CmmStackSlot` is just an integer offset into an `Area`.
|
|
|
|
|
|
```STACK PICTURE HERE.```
|
|
|
|
|
|
|
|
|
Note: If the `Area` is a `RegSlot`, we might still use a non-zero offset: for example, we might want to load the low word from a long integer.
|
|
|
To address a 4-byte object at the old end of the `Area`, we use the offset 4.
|
|
|
|
|
|
|
|
|
Notice that a `CmmStackSlot` is an *address*, so we can say
|
|
|
|
|
|
```wiki
|
|
|
Sp = SS(a+0)
|
|
|
Sp = SS(a+4)
|
|
|
```
|
|
|
|
|
|
|
|
|
to make `Sp` point to an particular area. Use a `CmmLoad` to load from the stack.
|
|
|
to make `Sp` point to a particular stack slot. Use a `CmmLoad` to load from the stack slot.
|
|
|
|
|
|
|
|
|
The following image shows the layout of a `CallArea` for both the outgoing parameters (function call) and incoming results (continuation after returning from the function call). Note that the incoming and outgoing parameters may be different, and they may overlap.
|
|
|
|
|
|
[](/trac/ghc/attachment/wiki/Commentary/Compiler/StackAreas/CallArea.png)
|
|
|
|
|
|
**More detail needed about which location in a `CallArea` is numbered 0**
|
|
|
|
|
|
A `RegSlot` is laid out in the same fashion, with the offset 0 pointing off the high byte of the stack slot. To address an 8-byte double-word, we would use the offset 8. To address only the high word of the same stack slot, we would use the offset 4.
|
|
|
|
|
|
Note: We don't have a virtual frame pointer in this story, but do we really want it? Here's a minor argument against: it requires special treatment by some analyses in Quick C-- (on the other hand, QC-- doesn't have distinguished global registers, so it might not even be an issue in GHC, which has many distinguished global registers).
|
|
|
|
|
|
Currently, the intermediate code does not explicitly use a virtual frame pointer, but when we talk about offsets into the stack, we implicitly assume that there is a virtual frame pointer that points just off the oldest byte of the return address on entry to the procedures. Therefore, on entry to the procedure, the offset of the (4-byte) return address is 4.
|
|
|
|
|
|
### Laying out the stack
|
|
|
|
... | ... | @@ -94,7 +98,7 @@ The business of the stack-layout pass is to construct a mapping (fixed across a |
|
|
```
|
|
|
|
|
|
|
|
|
which assigns a virtual stack slot (i.e offset relative to the virtual frame pointer) to each `Area`. **Mutter about vfp**.
|
|
|
which assigns a virtual stack slot (i.e. offset in bytes, relative to the virtual frame pointer) to each `Area`.
|
|
|
|
|
|
|
|
|
A naive approach to laying out the stack would be to give each variable its own stack slot for spilling, and allocate only the ends of the stack frame for parameter-passing areas. But this approach misses two opportunities for optimization:
|
... | ... | @@ -105,7 +109,13 @@ A naive approach to laying out the stack would be to give each variable its own |
|
|
|
|
|
As it turns out, it is quite common in GHC that the first definition of a variable comes when its value is returned from a function call. If the value is returned on the stack, then an important optimization is to avoid copying that value to some other location on the stack. How is that achieved? By making sure the location where the value is returned is also its spill slot.
|
|
|
|
|
|
### The greedy algorithm
|
|
|
### A greedy algorithm
|
|
|
|
|
|
|
|
|
We rewrite the stack slots in two passes:
|
|
|
|
|
|
1. Walk over the graph and choose an offset for each `Area`.
|
|
|
1. Walk over the graph, keeping track of the stack pointer, and rewrite each address of a stack slot with an offset from the stack pointer.
|
|
|
|
|
|
|
|
|
One way to assign stack slots is to traverse the flow graph in a reverse post-order depth-first search, allocating a stack location to each stack slot the first time we encounter the slot. (Note: When we encounter a parameter-passing area for the first time, we allocate stack locations for the whole area.) The allocation is then reused for every reference to the stack slot.
|
... | ... | |