Commit 78a506a9 authored by Simon Marlow

Documentation on the stack layout algorithm

parent f0a7261a
......@@ -35,13 +35,95 @@ import Control.Monad (liftM)
#include "HsVersions.h"
{- Note [Stack Layout]
data StackSlot = Occupied | Empty
-- Occupied: a return address or part of an update frame
The job of this pass is to
- replace references to abstract stack Areas with fixed offsets from Sp.
- replace the CmmHighStackMark constant used in the stack check with
the maximum stack usage of the proc.
- save any variables that are live across a call, and reload them as
necessary.
Before stack allocation, local variables remain live across native
calls (CmmCall{ cmm_cont = Just _ }), and after stack allocation local
variables are clobbered by native calls.
We want to do stack allocation so that as far as possible
- stack use is minimized, and
- unnecessary stack saves and loads are avoided.
The algorithm we use is a variant of linear-scan register allocation,
where the stack is our register file.
- First, we do a liveness analysis, which annotates every block with
the variables live on entry to the block.
- We traverse blocks in reverse postorder DFS; that is, we visit at
least one predecessor of a block before the block itself. The
stack layout flowing from the predecessor of the block will
determine the stack layout on entry to the block.
- We maintain a data structure
Map Label StackMap
which describes the contents of the stack and the stack pointer on
entry to each block that is a successor of a block that we have
visited.
- For each block we visit:
- Look up the StackMap for this block.
- If this block is a proc point (or a call continuation, if we
aren't splitting proc points), emit instructions to reload all
the live variables from the stack, according to the StackMap.
- Walk forwards through the instructions:
- At an assignment x = Sp[loc]
- Record the fact that Sp[loc] contains x, so that we won't
need to save x if it ever needs to be spilled.
- At an assignment x = E
- If x was previously on the stack, it isn't any more
- At the last node, if it is a call or a jump to a proc point
- Lay out the stack frame for the call (see setupStackFrame)
- Emit instructions to save all the live variables
- Remember the StackMaps for all the successors
- Emit an instruction to adjust Sp
- If the last node is a branch, then the current StackMap is the
StackMap for the successors.
- Manifest Sp: replace references to stack areas in this block
with real Sp offsets. We cannot do this until we have laid out
the stack area for the successors above.
In this phase we also eliminate redundant stores to the stack;
see elimStackStores.
- There is one important gotcha: sometimes we'll encounter a control
transfer to a block that we've already processed (a join point),
and in that case we might need to rearrange the stack to match
what the block is expecting. This is exactly the same as in
linear-scan register allocation, except that here we have the luxury
of an infinite supply of temporary variables.
- Finally, we update the magic CmmHighStackMark constant with the
stack usage of the function, and eliminate the whole stack check
if there was no stack use. (In fact this is done as part of the
main traversal, by feeding the high-water-mark output back in as
an input. I hate cyclic programming, but it's just too convenient
sometimes.)
There are plenty of tricky details: update frames, proc points, return
addresses, foreign calls, and some ad-hoc optimisations that are
convenient to do here and effective in common cases. Comments in the
code below explain these.
-}
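The traversal described in the Note can be illustrated with a small standalone sketch. This is not the code from this commit: `Graph`, `StackMap` (reduced here to a list of saved variable names), `reversePostorder`, `layout`, and the `saves` table are all hypothetical simplifications, assumed only for illustration. It shows blocks being visited in reverse postorder while a `Map Label StackMap` records the layout each successor should expect, with the first visited predecessor deciding the layout.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Hypothetical simplifications: labels are Ints, and a StackMap is just
-- the list of variable names currently saved on the stack.
type Label    = Int
type Graph    = M.Map Label [Label]
type StackMap = [String]

-- Reverse postorder of the blocks reachable from the entry label.
-- A block is prepended to the result once all of its successors have
-- been explored, so the final list is the postorder reversed.
reversePostorder :: Graph -> Label -> [Label]
reversePostorder g entry = snd (go (S.empty, []) entry)
  where
    go (seen, acc) l
      | l `S.member` seen = (seen, acc)
      | otherwise =
          let succs         = M.findWithDefault [] l g
              (seen', acc') = foldl go (S.insert l seen, acc) succs
          in (seen', l : acc')

-- Propagate stack layouts to successors. `saves` (an assumption of this
-- sketch) lists the variables each block saves before its final node.
-- A successor that already has a StackMap keeps it; a real pass would
-- have to rearrange the stack at such a join point.
layout :: Graph -> M.Map Label [String] -> Label -> M.Map Label StackMap
layout g saves entry =
  foldl step (M.singleton entry []) (reversePostorder g entry)
  where
    step maps l =
      let inMap  = M.findWithDefault [] l maps
          outMap = inMap ++ M.findWithDefault [] l saves
          succs  = M.findWithDefault [] l g
      in foldl (\m s -> M.insertWith (\_new old -> old) s outMap m) maps succs
```

On a diamond-shaped graph (0 branches to 1 and 2, which both jump to 3), block 3 receives its layout from whichever predecessor is visited first; the other predecessor may leave the stack in a different shape, which is the join-point case the Note calls out.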
instance Outputable StackSlot where
ppr Occupied = ptext (sLit "XXX")
ppr Empty = ptext (sLit "---")
-- All stack locations are expressed as positive byte offsets from the
-- "base", which is defined to be the address above the return address
......@@ -996,6 +1078,13 @@ callResumeThread new_base id =
plusW :: DynFlags -> ByteOff -> WordOff -> ByteOff
plusW dflags b w = b + w * wORD_SIZE dflags
data StackSlot = Occupied | Empty
-- Occupied: a return address or part of an update frame
instance Outputable StackSlot where
ppr Occupied = ptext (sLit "XXX")
ppr Empty = ptext (sLit "---")
dropEmpty :: WordOff -> [StackSlot] -> Maybe [StackSlot]
dropEmpty 0 ss = Just ss
dropEmpty n (Empty : ss) = dropEmpty (n-1) ss
......
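As a worked example of the byte/word offset arithmetic in `plusW`, here is a minimal standalone sketch. It assumes `ByteOff` and `WordOff` are plain `Int` synonyms and passes the word size directly instead of reading it from `DynFlags`; `plusWSketch` is a hypothetical name, not part of this commit.

```haskell
-- Assumed for this sketch: both offset types are plain Ints.
type ByteOff = Int
type WordOff = Int

-- Hypothetical stand-in for plusW: the word size in bytes is an explicit
-- argument rather than being looked up via wORD_SIZE dflags.
plusWSketch :: Int -> ByteOff -> WordOff -> ByteOff
plusWSketch wordSize b w = b + w * wordSize
```

With an 8-byte word, `plusWSketch 8 16 2` advances a byte offset of 16 by two words, giving 32; with a 4-byte word the same call gives 24.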