Skip to content
GitLab
Projects
Groups
Snippets
Help
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Sign in / Register
Toggle navigation
Glasgow Haskell Compiler
Project overview
Project overview
Details
Activity
Releases
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Locked Files
Issues
0
Issues
0
List
Boards
Labels
Service Desk
Milestones
Iterations
Merge Requests
0
Merge Requests
0
Requirements
Requirements
List
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Security & Compliance
Security & Compliance
Dependency List
License Compliance
Operations
Operations
Incidents
Environments
Packages & Registries
Packages & Registries
Package Registry
Container Registry
Analytics
Analytics
CI / CD
Code Review
Insights
Issue
Repository
Value Stream
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
Shayne Fletcher
Glasgow Haskell Compiler
Commits
78a506a9
Commit
78a506a9
authored
Jan 13, 2014
by
Simon Marlow
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
Documentation on the stack layout algorithm
parent
f0a7261a
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
with
94 additions
and
5 deletions
+94
-5
compiler/cmm/CmmLayoutStack.hs
compiler/cmm/CmmLayoutStack.hs
+94
-5
No files found.
compiler/cmm/CmmLayoutStack.hs
View file @
78a506a9
...
...
@@ -35,13 +35,95 @@ import Control.Monad (liftM)
#
include
"HsVersions.h"
{- Note [Stack Layout]
data
StackSlot
=
Occupied
|
Empty
-- Occupied: a return address or part of an update frame
The job of this pass is to
- replace references to abstract stack Areas with fixed offsets from Sp.
- replace the CmmHighStackMark constant used in the stack check with
the maximum stack usage of the proc.
- save any variables that are live across a calll, and reload them as
necessary.
Before stack allocation, local variables remain live across native
calls (CmmCall{ cmm_cont = Just _ }), and after stack allocation local
variables are clobbered by native calls.
We want to do stack allocation so that as far as possible
- stack use is minimized, and
- unnecessary stack saves and loads are avoided.
The algorithm we use is a variant of linear-scan register allocation,
where the stack is our register file.
- First, we do a liveness analysis, which annotates every block with
the variables live on entry to the block.
- We traverse blocks in reverse postorder DFS; that is, we visit at
least one predecessor of a block before the block itself. The
stack layout flowing from the predecessor of the block will
determine the stack layout on entry to the block.
- We maintain a data structure
Map Label StackMap
which describes the contents of the stack and the stack pointer on
entry to each block that is a successor of a block that we have
visited.
- For each block we visit:
- Look up the StackMap for this block.
- If this block is a proc point (or a call continuation, if we
aren't splitting proc points), emit instructions to reload all
the live variables from the stack, according to the StackMap.
- Walk forwards through the instructions:
- At an assignment x = Sp[loc]
- Record the fact that Sp[loc] contains x, so that we won't
need to save x if it ever needs to be spilled.
- At an assignment x = E
- If x was previously on the stack, it isn't any more
- At the last node, if it is a call or a jump to a proc point
- Lay out the stack frame for the call (see setupStackFrame)
- emit instructions to save all the live variables
- Remember the StackMaps for all the successors
- emit an instruction to adjust Sp
- If the last node is a branch, then the current StackMap is the
StackMap for the successors.
- Manifest Sp: replace references to stack areas in this block
with real Sp offsets. We cannot do this until we have laid out
the stack area for the successors above.
In this phase we also eliminate redundant stores to the stack;
see elimStackStores.
- There is one important gotcha: sometimes we'll encounter a control
transfer to a block that we've already processed (a join point),
and in that case we might need to rearrange the stack to match
what the block is expecting. (exactly the same as in linear-scan
register allocation, except here we have the luxury of an infinite
supply of temporary variables).
- Finally, we update the magic CmmHighStackMark constant with the
stack usage of the function, and eliminate the whole stack check
if there was no stack use. (in fact this is done as part of the
main traversal, by feeding the high-water-mark output back in as
an input. I hate cyclic programming, but it's just too convenient
sometimes.)
There are plenty of tricky details: update frames, proc points, return
addresses, foreign calls, and some ad-hoc optimisations that are
convenient to do here and effective in common cases. Comments in the
code below explain these.
-}
instance
Outputable
StackSlot
where
ppr
Occupied
=
ptext
(
sLit
"XXX"
)
ppr
Empty
=
ptext
(
sLit
"---"
)
-- All stack locations are expressed as positive byte offsets from the
-- "base", which is defined to be the address above the return address
...
...
@@ -996,6 +1078,13 @@ callResumeThread new_base id =
plusW
::
DynFlags
->
ByteOff
->
WordOff
->
ByteOff
plusW
dflags
b
w
=
b
+
w
*
wORD_SIZE
dflags
data
StackSlot
=
Occupied
|
Empty
-- Occupied: a return address or part of an update frame
instance
Outputable
StackSlot
where
ppr
Occupied
=
ptext
(
sLit
"XXX"
)
ppr
Empty
=
ptext
(
sLit
"---"
)
dropEmpty
::
WordOff
->
[
StackSlot
]
->
Maybe
[
StackSlot
]
dropEmpty
0
ss
=
Just
ss
dropEmpty
n
(
Empty
:
ss
)
=
dropEmpty
(
n
-
1
)
ss
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment