Commit 665229e5 authored by sewardj's avatar sewardj

[project @ 2000-06-15 08:38:25 by sewardj]

Major thing: new register allocator.  Brief description follows.
Should correctly handle code with loops in, even though we don't
generate any such at the moment.  A lot of comments.  The previous
machinery for spilling is retained, as is the idea of a fast-and-easy
initial allocation attempt intended to deal with the majority of code
blocks (about 60% on x86) very cheaply.  Many comments explaining
in detail how it works :-)

The Stix inliner is now on by default.  Integer code seems to run
within about 1% of that -fvia-C.  x86 fp code is significantly worse,
up to about 30% slower, depending on the amount of fp activity.

Minor thing: lazyfication of the top-level NCG plumbing, so that the
NCG doesn't require any greater residency than compiling to C, just a
bit more time.  Created lazyThenUs and lazyMapUs for this purpose.

The new allocator is somewhat, although not catastophically, slower
than the old one.  Fixing of the long-standing NCG space leak more
than makes up for it; overall hsc run-time is down about 5%, due to
significantly reduced GC time.

--------------------------------------------------------------------

Instructions are numbered sequentially, starting at zero.

A flow edge (FE) is a pair of insn numbers (MkFE Int Int) denoting
a possible flow of control from the first insn to the second.

The input to the register allocator is a list of instructions, which
mention Regs.  A Reg can be a RealReg -- a real machine reg -- or a
VirtualReg, which carries a unique.  After allocation, all the
VirtualReg references will have been converted into RealRegs, and
possibly some spill code will have been inserted.

The heart of the register allocator works in four phases.

1.  (find_flow_edges) Calculate all the FEs for the code list.
    Return them not as a [FE], but implicitly, as a pair of
    Array Int [Int], being the successor and predecessor maps
    for instructions.

2.  (calc_liveness) Returns a FiniteMap FE RegSet.  For each
    FE, indicates the set of registers live on that FE.  Note
    that the set includes both RealRegs and VirtualRegs.  The
    former appear because the code could mention fixed register
    usages, and we need to take them into account from the start.

3.  (calc_live_range_sets) Invert the above mapping, giving a
    FiniteMap Reg FeSet, indicating, for each virtual and real
    reg mentioned in the code, which FEs it is live on.

4.  (calc_vreg_to_rreg_mapping) For virtual reg, try and find
    an allocatable real register for it.  Each real register has
    a "current commitment", indicating the set of FEs it is
    currently live on.  A virtual reg v can be assigned to
    real reg r iff v's live-fe-set does not intersect with r's
    current commitment fe-set.  If the assignment is made,
    v's live-fe-set is union'd into r's current commitment fe-set.
    There is also the minor restriction that v and r must be of
    the same register class (integer or floating).

    Once this mapping is established, we simply apply it to the
    input insns, and that's it.

    If no suitable real register can be found, the vreg is mapped
    to itself, and we deem allocation to have failed.  The partially
    allocated code is returned.  The higher echelons of the allocator
    (doGeneralAlloc and runRegAlloc) then cooperate to insert spill
    code and re-run allocation, until a successful allocation is found.
parent 1364aa0b
......@@ -15,6 +15,7 @@ module UniqSupply (
getUniqueUs, getUniquesUs,
mapUs, mapAndUnzipUs, mapAndUnzip3Us,
thenMaybeUs, mapAccumLUs,
lazyThenUs, lazyMapUs,
mkSplitUniqSupply,
splitUniqSupply
......@@ -121,6 +122,7 @@ initUs_ :: UniqSupply -> UniqSM a -> a
initUs_ init_us m = case m init_us of { (r,us) -> r }
{-# INLINE thenUs #-}
{-# INLINE lazyThenUs #-}
{-# INLINE returnUs #-}
{-# INLINE splitUniqSupply #-}
\end{code}
......@@ -135,10 +137,15 @@ thenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
thenUs expr cont us
= case (expr us) of { (result, us') -> cont result us' }
lazyThenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
lazyThenUs expr cont us
= let (result, us') = expr us in cont result us'
thenUs_ :: UniqSM a -> UniqSM b -> UniqSM b
thenUs_ expr cont us
= case (expr us) of { (_, us') -> cont us' }
returnUs :: a -> UniqSM a
returnUs result us = (result, us)
......@@ -159,13 +166,19 @@ getUniquesUs n us = case splitUniqSupply us of
\begin{code}
mapUs :: (a -> UniqSM b) -> [a] -> UniqSM [b]
mapUs f [] = returnUs []
mapUs f (x:xs)
= f x `thenUs` \ r ->
mapUs f xs `thenUs` \ rs ->
returnUs (r:rs)
lazyMapUs :: (a -> UniqSM b) -> [a] -> UniqSM [b]
lazyMapUs f [] = returnUs []
lazyMapUs f (x:xs)
= f x `lazyThenUs` \ r ->
lazyMapUs f xs `lazyThenUs` \ rs ->
returnUs (r:rs)
mapAndUnzipUs :: (a -> UniqSM (b,c)) -> [a] -> UniqSM ([b],[c])
mapAndUnzip3Us :: (a -> UniqSM (b,c,d)) -> [a] -> UniqSM ([b],[c],[d])
......
......@@ -462,13 +462,15 @@ initTyVarUnique = mkUnique 't' 0
initTidyUniques :: (Unique, Unique) -- Global and local
initTidyUniques = (mkUnique 'g' 0, mkUnique 'x' 0)
mkPseudoUnique1, mkPseudoUnique2, mkPseudoUnique3,
mkBuiltinUnique :: Int -> Unique
mkPseudoUnique1, mkPseudoUnique2, mkPseudoUnique3,
mkBuiltinUnique :: Int -> Unique
mkBuiltinUnique i = mkUnique 'B' i
mkPseudoUnique1 i = mkUnique 'C' i -- used for getUnique on Regs
mkPseudoUnique2 i = mkUnique 'D' i -- ditto
mkPseudoUnique3 i = mkUnique 'E' i -- ditto
mkPseudoUnique2 i = mkUnique 'D' i -- used in NCG for getUnique on RealRegs
mkPseudoUnique3 i = mkUnique 'E' i -- used in NCG spiller to create spill VirtualRegs
getBuiltinUniques :: Int -> [Unique]
getBuiltinUniques n = map (mkUnique 'B') [1 .. n]
......
......@@ -32,7 +32,7 @@ import CmdLineOpts
import Maybes ( maybeToBool )
import ErrUtils ( doIfSet, dumpIfSet )
import Outputable
import IO ( IOMode(..), hClose, openFile )
import IO ( IOMode(..), hClose, openFile, Handle )
\end{code}
......@@ -69,6 +69,7 @@ codeOutput mod_name tycons classes core_binds stg_binds
} }
doOutput :: (Handle -> IO ()) -> IO ()
doOutput io_action
= (do handle <- openFile opt_OutputFile WriteMode
io_action handle
......@@ -101,9 +102,9 @@ outputC flat_absC
outputAsm flat_absC ncg_uniqs
#ifndef OMIT_NATIVE_CODEGEN
= do dumpIfSet opt_D_dump_stix "Final stix code" stix_final
dumpIfSet opt_D_dump_asm "Asm code" ncg_output_d
doOutput (\ f -> printForAsm f ncg_output_d)
= do dumpIfSet opt_D_dump_stix "Final stix code" stix_final
dumpIfSet opt_D_dump_asm "Asm code" ncg_output_d
doOutput ( \f -> printForAsm f ncg_output_d)
where
(stix_final, ncg_output_d) = nativeCodeGen flat_absC ncg_uniqs
......
......@@ -14,7 +14,7 @@ import Stix
import MachMisc
import AbsCUtils ( getAmodeRep, mixedTypeLocn,
nonemptyAbsC, mkAbsCStmts, mkAbsCStmtList
nonemptyAbsC, mkAbsCStmts
)
import PprAbsC ( dumpRealC )
import SMRep ( fixedItblSize,
......@@ -54,11 +54,10 @@ We leave the chunks separated so that register allocation can be
performed locally within the chunk.
\begin{code}
genCodeAbstractC :: AbstractC -> UniqSM [[StixTree]]
genCodeAbstractC :: AbstractC -> UniqSM [StixTree]
genCodeAbstractC absC
= mapUs gentopcode (mkAbsCStmtList absC) `thenUs` \ trees ->
returnUs ([StComment SLIT("Native Code")] : trees)
= gentopcode absC
where
a2stix = amodeToStix
a2stix' = amodeToStix'
......
......@@ -18,9 +18,10 @@ import PprMach
import AbsCStixGen ( genCodeAbstractC )
import AbsCSyn ( AbstractC, MagicId )
import AbsCUtils ( mkAbsCStmtList )
import AsmRegAlloc ( runRegAllocate )
import PrimOp ( commutableOp, PrimOp(..) )
import RegAllocInfo ( mkMRegsState, MRegsState, findReservedRegs )
import RegAllocInfo ( findReservedRegs )
import Stix ( StixTree(..), StixReg(..),
pprStixTrees, pprStixTree, CodeSegment(..),
stixCountTempUses, stixSubst,
......@@ -29,7 +30,8 @@ import Stix ( StixTree(..), StixReg(..),
uniqOfNatM_State, deltaOfNatM_State )
import PrimRep ( isFloatingRep, PrimRep(..) )
import UniqSupply ( returnUs, thenUs, mapUs, initUs,
initUs_, UniqSM, UniqSupply )
initUs_, UniqSM, UniqSupply,
lazyThenUs, lazyMapUs )
import MachMisc ( IF_ARCH_i386(i386_insert_ffrees,) )
import OrdList ( fromOL, concatOL )
......@@ -87,38 +89,47 @@ So, here we go:
\begin{code}
nativeCodeGen :: AbstractC -> UniqSupply -> (SDoc, SDoc)
nativeCodeGen absC us
= let (stixRaw, us1) = initUs us (genCodeAbstractC absC)
stixOpt = map genericOpt stixRaw
insns = initUs_ us1 (codeGen stixOpt)
debug_stix = vcat (map pprStixTrees stixOpt)
in {- trace "nativeGen: begin" -} (debug_stix, insns)
\end{code}
@codeGen@ is the top-level code-generation function:
\begin{code}
codeGen :: [[StixTree]] -> UniqSM SDoc
codeGen stixFinal
= mapUs genMachCode stixFinal `thenUs` \ dynamic_codes ->
let
fp_kludge :: [Instr] -> [Instr]
fp_kludge = IF_ARCH_i386(i386_insert_ffrees,id)
static_instrss :: [[Instr]]
static_instrss = map fp_kludge (scheduleMachCode dynamic_codes)
docs = map (vcat . map pprInstr) static_instrss
-- for debugging only
docs_prealloc = map (vcat . map pprInstr . fromOL)
dynamic_codes
text_prealloc = vcat (intersperse (char ' ' $$ char ' ') docs_prealloc)
in
--trace (showSDoc text_prealloc) (
returnUs (vcat (intersperse (char ' '
$$ ptext SLIT("# ___stg_split_marker")
$$ char ' ')
docs))
--)
= let absCstmts = mkAbsCStmtList absC
(sdoc_pairs, us1) = initUs us (lazyMapUs absCtoNat absCstmts)
stix_sdocs = map fst sdoc_pairs
insn_sdocs = map snd sdoc_pairs
insn_sdoc = my_vcat insn_sdocs
stix_sdoc = vcat stix_sdocs
# if DEBUG
my_trace m x = trace m x
my_vcat sds = vcat (intersperse (char ' '
$$ ptext SLIT("# ___stg_split_marker")
$$ char ' ')
sds)
# else
my_vcat sds = vcat sds
my_trace m x = x
# endif
in
my_trace "nativeGen: begin"
(stix_sdoc, insn_sdoc)
absCtoNat :: AbstractC -> UniqSM (SDoc, SDoc)
absCtoNat absC
= genCodeAbstractC absC `thenUs` \ stixRaw ->
genericOpt stixRaw `bind` \ stixOpt ->
genMachCode stixOpt `thenUs` \ pre_regalloc ->
regAlloc pre_regalloc `bind` \ almost_final ->
x86fp_kludge almost_final `bind` \ final_mach_code ->
vcat (map pprInstr final_mach_code) `bind` \ final_sdoc ->
pprStixTrees stixOpt `bind` \ stix_sdoc ->
returnUs (stix_sdoc, final_sdoc)
where
bind f x = x f
x86fp_kludge :: [Instr] -> [Instr]
x86fp_kludge = IF_ARCH_i386(i386_insert_ffrees,id)
regAlloc :: InstrBlock -> [Instr]
regAlloc = runRegAllocate allocatableRegs findReservedRegs
\end{code}
Top level code generator for a chunk of stix code. For this part of
......@@ -154,20 +165,6 @@ genMachCode stmts initial_us
(int final_delta)
\end{code}
The next bit does the code scheduling. The scheduler must also deal
with register allocation of temporaries. Much parallelism can be
exposed via the OrdList, but more might occur, so further analysis
might be needed.
\begin{code}
scheduleMachCode :: [InstrBlock] -> [[Instr]]
scheduleMachCode
= map (runRegAllocate freeRegsState findReservedRegs)
where
freeRegsState = mkMRegsState (extractMappedRegNos freeRegs)
\end{code}
%************************************************************************
%* *
\subsection[NCOpt]{The Generic Optimiser}
......@@ -197,24 +194,26 @@ stixPeep :: [StixTree] -> [StixTree]
-- second assignment would be substituted for, giving nonsense
-- code. As far as I can see, StixTemps are only ever assigned
-- to once. It would be nice to be sure!
{-
stixPeep ( t1@(StAssign pka (StReg (StixTemp u pk)) rhs)
: t2
: ts )
| stixCountTempUses u t2 == 1
&& sum (map (stixCountTempUses u) ts) == 0
= trace ("nativeGen: stixInline: " ++ showSDoc (pprStixTree rhs))
=
# ifdef DEBUG
trace ("nativeGen: inlining " ++ showSDoc (pprStixTree rhs))
# endif
(stixPeep (stixSubst u rhs t2 : ts))
stixPeep (t1:t2:ts) = t1 : stixPeep (t2:ts)
stixPeep [t1] = [t1]
stixPeep [] = []
-}
-- disable stix inlining until we figure out how to fix the
-- latent bugs in the register allocator which are exposed by
-- the inliner.
stixPeep = id
--stixPeep = id
\end{code}
For most nodes, just optimize the children.
......
%
% (c) The AQUA Project, Glasgow University, 1993-1998
% (c) The AQUA Project, Glasgow University, 1993-2000
%
\section[AsmRegAlloc]{Register allocator}
\begin{code}
module AsmRegAlloc ( runRegAllocate, runHairyRegAllocate ) where
module AsmRegAlloc ( runRegAllocate ) where
#include "HsVersions.h"
import MachCode ( InstrBlock )
import MachMisc ( Instr(..) )
import PprMach ( pprUserReg ) -- debugging
import PprMach ( pprUserReg, pprInstr ) -- debugging
import MachRegs
import RegAllocInfo
import FiniteMap ( emptyFM, addListToFM, delListFromFM,
lookupFM, keysFM )
import FiniteMap ( FiniteMap, emptyFM, addListToFM, delListFromFM,
lookupFM, keysFM, eltsFM, mapFM, addToFM_C, addToFM,
listToFM, fmToList, lookupWithDefaultFM )
import Maybes ( maybeToBool )
import Unique ( mkBuiltinUnique )
import Util ( mapAccumB )
import OrdList ( unitOL, appOL, fromOL, concatOL )
import Outputable
import List ( mapAccumL )
import Unique ( Unique, Uniquable(..), mkPseudoUnique3 )
import CLabel ( CLabel, pprCLabel )
import List ( mapAccumL, nub, sort )
import Array ( Array, array, (!), bounds )
\end{code}
This is the generic register allocator.
This is the generic register allocator. It does allocation for all
architectures. Details for specific architectures are given in
RegAllocInfo.lhs. In practice the allocator needs to know next to
nothing about an architecture to do its job:
* It needs to be given a list of the registers it can allocate to.
* It needs to be able to find out which registers each insn reads and
writes.
* It needs be able to change registers in instructions into other
registers.
* It needs to be able to find out where execution could go after an
in instruction.
* It needs to be able to discover sets of registers which can be
used to attempt spilling.
First we try something extremely simple. If that fails, we have to do
things the hard way.
\begin{code}
runRegAllocate
:: MRegsState
-> ([Instr] -> [[RegNo]])
:: [Reg]
-> ([Instr] -> [[Reg]])
-> InstrBlock
-> [Instr]
runRegAllocate regs find_reserve_regs instrs
= case simpleAlloc of
Just simple -> simple
Nothing -> tryHairy reserves
Just simple -> --trace "SIMPLE"
simple
Nothing -> --trace "GENERAL"
(tryGeneral reserves)
where
tryHairy []
= error "nativeGen: spilling failed. Try -fvia-C.\n"
tryHairy (resv:resvs)
= case hairyAlloc resv of
tryGeneral []
= error "nativeGen: spilling failed. Workaround: compile with -fvia-C.\n"
tryGeneral (resv:resvs)
= case generalAlloc resv of
Just success -> success
Nothing -> tryHairy resvs
Nothing -> tryGeneral resvs
reserves = find_reserve_regs flatInstrs
flatInstrs = fromOL instrs
simpleAlloc = simpleRegAlloc regs [] emptyFM flatInstrs
hairyAlloc resvd = hairyRegAlloc regs resvd flatInstrs
runHairyRegAllocate
:: MRegsState
-> [RegNo]
-> InstrBlock
-> Maybe [Instr]
runHairyRegAllocate regs reserve_regs instrs
= hairyRegAlloc regs reserve_regs flatInstrs
where
flatInstrs = fromOL instrs
reserves = find_reserve_regs flatInstrs
flatInstrs = fromOL instrs
simpleAlloc = doSimpleAlloc regs flatInstrs
generalAlloc resvd = doGeneralAlloc regs resvd flatInstrs
\end{code}
Here is the simple register allocator. Just dole out registers until
we run out, or until one gets clobbered before its last use. Don't
do anything fancy with branches. Just pretend that you've got a block
of straight-line code and hope for the best. Experience indicates that
this approach will suffice for about 96 percent of the code blocks that
we generate.
Rather than invoke the heavyweight machinery in @doGeneralAlloc@ for
each and every code block, we first try using this simple, fast and
utterly braindead allocator. In practice it handles about 60\% of the
code blocks really fast, even with only 3 integer registers available.
Since we can always give up and fall back to @doGeneralAlloc@,
@doSimpleAlloc@ is geared to handling the common case as fast as
possible. It will succeed only if:
\begin{code}
simpleRegAlloc
:: MRegsState -- registers to select from
-> [Reg] -- live static registers
-> RegAssignment -- mapping of dynamics to statics
-> [Instr] -- code
-> Maybe [Instr]
simpleRegAlloc _ _ _ [] = Just []
simpleRegAlloc free live env (instr:instrs)
| null deadSrcs &&
maybeToBool newAlloc &&
maybeToBool instrs2
= Just (instr3 : instrs3)
| otherwise
= Nothing
where
instr3 = patchRegs instr (lookup env2)
(srcs, dsts) = case regUsage instr of
(RU s d) -> (regSetToList s, regSetToList d)
lookup env x = case lookupFM env x of Just y -> y; Nothing -> x
deadSrcs = [r | r@(UnmappedReg _ _) <- srcs, lookup env r `not_elem` live]
newDsts = [r | r@(UnmappedReg _ _) <- dsts, r `not_elem` keysFM env]
* The code mentions registers only of integer class, not floating
class.
newAlloc = foldr allocateNewReg (Just (free, [])) newDsts
(free2, new) = case newAlloc of Just x -> x
* The code doesn't mention any real registers, so we don't have to
think about dodging and weaving to work around fixed register uses.
env2 = env `addListToFM` new
* The code mentions at most N virtual registers, where N is the number
of real registers for allocation.
live2 = map snd new ++ [x | x <- live, x `not_elem` dsts]
If those conditions are satisfied, we simply trundle along the code,
doling out a real register every time we see mention of a new virtual
register. We either succeed at this, or give up when one of the above
three conditions is no longer satisfied.
instrs2 = simpleRegAlloc free2 live2 env2 instrs
instrs3 = case instrs2 of Just x -> x
allocateNewReg
:: Reg
-> Maybe (MRegsState, [(Reg, Reg)])
-> Maybe (MRegsState, [(Reg, Reg)])
allocateNewReg _ Nothing = Nothing
allocateNewReg d@(UnmappedReg _ pk) (Just (free, prs))
| null choices = Nothing
| otherwise = Just (free2, prs2)
where
choices = possibleMRegs pk free
reg = head choices
free2 = free `useMReg` (case reg of {IBOX(reg2) -> reg2} )
prs2 = ((d, MappedReg (case reg of {IBOX(reg2) -> reg2})) : prs)
\begin{code}
doSimpleAlloc :: [Reg] -> [Instr] -> Maybe [Instr]
doSimpleAlloc available_real_regs instrs
= let available_iregs
= filter ((== RcInteger).regClass) available_real_regs
trundle :: [( {-Virtual-}Reg, {-Real-}Reg )]
-> [ {-Real-}Reg ]
-> [Instr]
-> [Instr]
-> Maybe [Instr]
trundle vreg_map uncommitted_rregs ris_done []
= Just (reverse ris_done)
trundle vreg_map uncommitted_rregs ris_done (i:is)
= case regUsage i of
RU rds wrs
-- Mentions no regs? Move on quickly
| null rds_l && null wrs_l
-> trundle vreg_map uncommitted_rregs (i:ris_done) is
-- A case we can't be bothered to handle?
| any isFloatingOrReal rds_l || any isFloatingOrReal wrs_l
-> Nothing
-- Update the rreg commitments, and map the insn
| otherwise
-> case upd_commitment (wrs_l++rds_l)
vreg_map uncommitted_rregs of
Nothing -- out of rregs; give up
-> Nothing
Just (vreg_map2, uncommitted_rregs2)
-> let i2 = patchRegs i (subst_reg vreg_map2)
in trundle vreg_map2 uncommitted_rregs2
(i2:ris_done) is
where
isFloatingOrReal reg
= isRealReg reg || regClass reg == RcFloating
rds_l = regSetToList rds
wrs_l = regSetToList wrs
upd_commitment [] vr_map uncomm
= Just (vr_map, uncomm)
upd_commitment (reg:regs) vr_map uncomm
| isRealReg reg
= upd_commitment regs vr_map uncomm
| reg `elem` (map fst vr_map)
= upd_commitment regs vr_map uncomm
| null uncomm
= Nothing
| otherwise
= upd_commitment regs ((reg, head uncomm):vr_map)
(tail uncomm)
subst_reg vreg_map r
-- If it's a RealReg, it must be STG-specific one
-- (Hp,Sp,BaseReg,etc), since regUsage filters them out,
-- so isFloatingOrReal would not have objected to it.
| isRealReg r
= r
| otherwise
= case [rr | (vr,rr) <- vreg_map, vr == r] of
[rr2] -> rr2
other -> pprPanic
"doSimpleAlloc: unmapped VirtualReg"
(ppr r)
in
trundle [] available_iregs [] instrs
\end{code}
Here is the ``clever'' bit. First go backward (i.e. left), looking for
the last use of dynamic registers. Then go forward (i.e. right), filling
registers with static placements.
hairyRegAlloc takes reserve_regs as the regs to use as spill
From here onwards is the general register allocator and spiller. For
each flow edge (possible transition between instructions), we compute
which virtual and real registers are live on that edge. Then the
mapping is inverted, to give a mapping from register (virtual+real) to
sets of flow edges on which the register is live. Finally, we can use
those sets to decide whether a virtual reg v can be assigned to a real
reg r, by checking that v's live-edge-set does not intersect with r's
current live-edge-set. Having made that assignment, we then augment
r's current live-edge-set (its current commitment, you could say) with
v's live-edge-set.
doGeneralAlloc takes reserve_regs as the regs to use as spill
temporaries. First it tries to allocate using all regs except
reserve_regs. If that fails, it inserts spill code and tries again to
allocate regs, but this time with the spill temporaries available.
Even this might not work if there are insufficient spill temporaries:
in the worst case on x86, we'd need 3 of them, for insns like
addl (reg1,reg2,4) reg3, since this insn uses all 3 regs as input.
in the worst case on x86, we'd need 3 of them, for insns like addl
(%reg1,%reg2,4) %reg3, since this insn uses all 3 regs as input.
\begin{code}
hairyRegAlloc
:: MRegsState
-> [RegNo]
-> [Instr]
-> Maybe [Instr]
hairyRegAlloc regs reserve_regs instrs =
case mapAccumB (doRegAlloc reserve_regs)
(RH regs' 1 emptyFM) noFuture instrs of
(RH _ mloc1 _, _, instrs')
-- succeeded w/out using reserves
| mloc1 == 1 -> Just instrs'
-- failed, and no reserves avail, so pointless to attempt spilling
| null reserve_regs -> Nothing
-- failed, but we have reserves, so attempt to do spilling
| otherwise
-> let instrs_patched = patchMem instrs'
in
case mapAccumB (doRegAlloc []) (RH regs'' mloc1 emptyFM)
noFuture instrs_patched of
((RH _ mloc2 _),_,instrs'')
-- successfully allocated the patched code
| mloc2 == mloc1 -> maybetrace (spillMsg True) (Just instrs'')
-- no; we have to give up
| otherwise -> maybetrace (spillMsg False) Nothing
-- instrs''
where
regs' = regs `useMRegs` reserve_regs
regs'' = mkMRegsState reserve_regs
noFuture :: RegFuture
noFuture = RF emptyRegSet (FL emptyRegSet emptyFM) emptyFM
spillMsg success
= "nativeGen: spilling "
++ (if success then "succeeded" else "failed ")
++ " using "
++ showSDoc (hsep (map (pprUserReg.toMappedReg)
(reverse reserve_regs)))
where
toMappedReg (I# i) = MappedReg i
#ifdef DEBUG
maybetrace msg x = trace msg x
#else
maybetrace msg x = x
#endif
doGeneralAlloc
:: [Reg] -- all allocatable regs
-> [Reg] -- the reserve regs
-> [Instr] -- instrs in
-> Maybe [Instr] -- instrs out
doGeneralAlloc all_regs reserve_regs instrs
-- succeeded without spilling
| prespill_ok = Just prespill_insns
-- failed, and no spill regs avail, so pointless to attempt spilling
| null reserve_regs = Nothing
-- success after spilling
| postspill_ok = maybetrace (spillMsg True) (Just postspill_insns)
-- still not enough reserves after spilling; we have to give up
| otherwise = maybetrace (spillMsg False) Nothing
where
prespill_regs
= filter (`notElem` reserve_regs) all_regs
(prespill_ok, prespill_insns)
= allocUsingTheseRegs instrs prespill_regs
instrs_with_spill_code
= insertSpillCode prespill_insns
(postspill_ok, postspill_insns)
= allocUsingTheseRegs instrs_with_spill_code all_regs