Commit 535a88e1 authored by Andreas Klebinger's avatar Andreas Klebinger Committed by Marge Bot

Add loop level analysis to the NCG backend.

For backends maintaining the CFG during codegen
we can now find loops and their nesting level.

This is based on the Cmm CFG and dominator analysis.

As a result we can estimate edge frequencies a lot better
for methods, resulting in far better code layout.

Speedup on nofib: ~1.5%
Increase in compile times: ~1.9%

To make this feasible this commit adds:
* Dominator analysis based on the Lengauer-Tarjan Algorithm.
* An algorithm estimating global edge frequences from branch
probabilities - In CFG.hs

A few static branch prediction heuristics:

* Expect to take the backedge in loops.
* Expect to take the branch NOT exiting a loop.
* Expect integer vs constant comparisons to be false.

We also treat heap/stack checks special for branch prediction
to avoid them being treated as loops.
parent 9c11f817
Pipeline #11446 passed with stages
in 496 minutes and 26 seconds
......@@ -6,8 +6,6 @@
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# OPTIONS_GHC -fprof-auto-top #-}
--
-- Copyright (c) 2010, João Dias, Simon Marlow, Simon Peyton Jones,
-- and Norman Ramsey
......@@ -108,6 +106,7 @@ analyzeCmm
-> FactBase f
-> FactBase f
analyzeCmm dir lattice transfer cmmGraph initFact =
{-# SCC analyzeCmm #-}
let entry = g_entry cmmGraph
hooplGraph = g_graph cmmGraph
blockMap =
......@@ -169,7 +168,7 @@ rewriteCmm
-> CmmGraph
-> FactBase f
-> UniqSM (CmmGraph, FactBase f)
rewriteCmm dir lattice rwFun cmmGraph initFact = do
rewriteCmm dir lattice rwFun cmmGraph initFact = {-# SCC rewriteCmm #-} do
let entry = g_entry cmmGraph
hooplGraph = g_graph cmmGraph
blockMap1 =
......
......@@ -593,6 +593,7 @@ Library
Instruction
BlockLayout
CFG
Dominators
Format
Reg
RegClass
......
......@@ -562,7 +562,7 @@ cmmNativeGen dflags this_mod modLoc ncgImpl us fileIds dbgMap cmm count
Opt_D_dump_asm_native "Native code"
(vcat $ map (pprNatCmmDecl ncgImpl) native)
dumpIfSet_dyn dflags
when (not $ null nativeCfgWeights) $ dumpIfSet_dyn dflags
Opt_D_dump_cfg_weights "CFG Weights"
(pprEdgeWeights nativeCfgWeights)
......@@ -691,7 +691,7 @@ cmmNativeGen dflags this_mod modLoc ncgImpl us fileIds dbgMap cmm count
{-# SCC "generateJumpTables" #-}
generateJumpTables ncgImpl alloced
dumpIfSet_dyn dflags
when (not $ null nativeCfgWeights) $ dumpIfSet_dyn dflags
Opt_D_dump_cfg_weights "CFG Update information"
( text "stack:" <+> ppr stack_updt_blks $$
text "linearAlloc:" <+> ppr cfgRegAllocUpdates )
......@@ -705,8 +705,9 @@ cmmNativeGen dflags this_mod modLoc ncgImpl us fileIds dbgMap cmm count
optimizedCFG =
optimizeCFG (cfgWeightInfo dflags) cmm <$!> postShortCFG
maybe (return ())
(dumpIfSet_dyn dflags Opt_D_dump_cfg_weights "CFG Final Weights" . pprEdgeWeights)
maybe (return ()) (\cfg->
dumpIfSet_dyn dflags Opt_D_dump_cfg_weights "CFG Final Weights"
( pprEdgeWeights cfg ))
optimizedCFG
--TODO: Partially check validity of the cfg.
......
......@@ -6,6 +6,8 @@
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE FlexibleContexts #-}
module BlockLayout
( sequenceTop )
......@@ -22,7 +24,6 @@ import BlockId
import Cmm
import Hoopl.Collections
import Hoopl.Label
import Hoopl.Block
import DynFlags (gopt, GeneralFlag(..), DynFlags, backendMaintainsCfg)
import UniqFM
......@@ -41,11 +42,30 @@ import ListSetOps (removeDups)
import OrdList
import Data.List
import Data.Foldable (toList)
import Hoopl.Graph
import qualified Data.Set as Set
import Data.STRef
import Control.Monad.ST.Strict
import Control.Monad (foldM)
{-
Note [CFG based code layout]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The major steps in placing blocks are as follow:
* Compute a CFG based on the Cmm AST, see getCfgProc.
This CFG will have edge weights representing a guess
on how important they are.
* After we convert Cmm to Asm we run `optimizeCFG` which
adds a few more "educated guesses" to the equation.
* Then we run loop analysis on the CFG (`loopInfo`) which tells us
about loop headers, loop nesting levels and the sort.
* Based on the CFG and loop information refine the edge weights
in the CFG and normalize them relative to the most often visited
node. (See `mkGlobalWeights`)
* Feed this CFG into the block layout code (`sequenceTop`) in this
module. Which will then produce a code layout based on the input weights.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~ Note [Chain based CFG serialization]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......@@ -60,8 +80,8 @@ import qualified Data.Set as Set
but also how much a block would benefit from being placed sequentially after
it's predecessor.
For example blocks which are preceeded by an info table are more likely to end
up in a different cache line than their predecessor. So there is less benefit
in placing them sequentially.
up in a different cache line than their predecessor and we can't eliminate the jump
so there is less benefit to placing them sequentially.
For example consider this example:
......@@ -81,56 +101,83 @@ import qualified Data.Set as Set
Eg for our example we might end up with two chains like:
[A->B->C->X],[D]. Blocks inside chains will always be placed sequentially.
However there is no particular order in which chains are placed since
(hopefully) the blocks for which sequentially is important have already
(hopefully) the blocks for which sequentiality is important have already
been placed in the same chain.
-----------------------------------------------------------------------------
First try to create a lists of good chains.
1) First try to create a list of good chains.
-----------------------------------------------------------------------------
We do so by taking a block not yet placed in a chain and
looking at these cases:
Good chains are these which allow us to eliminate jump instructions.
Which further eliminate often executed jumps first.
We do so by:
*) Ignore edges which represent instructions which can not be replaced
by fall through control flow. Primarily calls and edges to blocks which
are prefixed by a info table we have to jump across.
*) Then process remaining edges in order of frequency taken and:
+) If source and target have not been placed build a new chain from them.
+) If source and target have been placed, and are ends of differing chains
try to merge the two chains.
*) Check if the best predecessor of the block is at the end of a chain.
If so add the current block to the end of that chain.
+) If one side of the edge is a end/front of a chain, add the other block of
to edge to the same chain
Eg if we look at block C and already have the chain (A -> B)
then we extend the chain to (A -> B -> C).
Eg if we look at edge (B -> C) and already have the chain (A -> B)
then we extend the chain to (A -> B -> C).
Combined with the fact that we process blocks in reverse post order
this means loop bodies and trivially sequential control flow already
ends up as a single chain.
+) If the edge was used to modify or build a new chain remove the edge from
our working list.
*) Otherwise we create a singleton chain from the block we are looking at.
Eg if we have from the example above already constructed (A->B)
and look at D we create the chain (D) resulting in the chains [A->B, D]
*) If there any blocks not being placed into a chain after these steps we place
them into a chain consisting of only this block.
Ranking edges by their taken frequency, if
two edges compete for fall through on the same target block, the one taken
more often will automatically win out. Resulting in fewer instructions being
executed.
Creating singleton chains is required for situations where we have code of the
form:
A: goto B:
<infoTable>
B: goto C:
<infoTable>
C: ...
As the code in block B is only connected to the rest of the program via edges
which will be ignored in this step we make sure that B still ends up in a chain
this way.
-----------------------------------------------------------------------------
We then try to fuse chains.
2) We also try to fuse chains.
-----------------------------------------------------------------------------
There are edge cases which result in two chains being created which trivially
represent linear control flow. For example we might have the chains
[(A-B-C),(D-E)] with an cfg triangle:
As a result from the above step we still end up with multiple chains which
represent sequential control flow chunks. But they are not yet suitable for
code layout as we need to place *all* blocks into a single sequence.
A----->C->D->E
\->B-/
In this step we combine chains result from the above step via these steps:
We also get three independent chains if two branches end with a jump
to a common successor.
*) Look at the ranked list of *all* edges, including calls/jumps across info tables
and the like.
We take care of these cases by fusing chains which are connected by an
edge.
*) Look at each edge and
We do so by looking at the list of edges sorted by weight.
Given the edge (C -> D) we try to find two chains such that:
* C is at the end of chain one.
* D is in front of chain two.
* If two such chains exist we fuse them.
We then remove the edge and repeat the process for the rest of the edges.
+) Given an edge (A -> B) try to find two chains for which
* Block A is at the end of one chain
* Block B is at the front of the other chain.
+) If we find such a chain we "fuse" them into a single chain, remove the
edge from working set and continue.
+) If we can't find such chains we skip the edge and continue.
-----------------------------------------------------------------------------
Place indirect successors (neighbours) after each other
3) Place indirect successors (neighbours) after each other
-----------------------------------------------------------------------------
We might have chains [A,B,C,X],[E] in a CFG of the sort:
......@@ -141,15 +188,11 @@ import qualified Data.Set as Set
While E does not follow X it's still beneficial to place them near each other.
This can be advantageous if eg C,X,E will end up in the same cache line.
TODO: If we remove edges as we use them (eg if we build up A->B remove A->B
from the list) we could save some more work in later phases.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~ Note [Triangle Control Flow]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Checking if an argument is already evaluating leads to a somewhat
Checking if an argument is already evaluated leads to a somewhat
special case which looks like this:
A:
......@@ -204,11 +247,6 @@ import qualified Data.Set as Set
neighbourOverlapp :: Int
neighbourOverlapp = 2
-- | Only edges heavier than this are considered
-- for fusing two chains into a single chain.
fuseEdgeThreshold :: EdgeWeight
fuseEdgeThreshold = 0
-- | Maps blocks near the end of a chain to it's chain AND
-- the other blocks near the end.
-- [A,B,C,D,E] Gives entries like (B -> ([A,B], [A,B,C,D,E]))
......@@ -224,40 +262,24 @@ type FrontierMap = LabelMap ([BlockId],BlockChain)
newtype BlockChain
= BlockChain { chainBlocks :: (OrdList BlockId) }
instance Eq (BlockChain) where
(BlockChain blks1) == (BlockChain blks2)
= fromOL blks1 == fromOL blks2
-- All chains are constructed the same way so comparison
-- including structure is faster.
instance Eq BlockChain where
BlockChain b1 == BlockChain b2 = strictlyEqOL b1 b2
-- Useful for things like sets and debugging purposes, sorts by blocks
-- in the chain.
instance Ord (BlockChain) where
(BlockChain lbls1) `compare` (BlockChain lbls2)
= (fromOL lbls1) `compare` (fromOL lbls2)
= ASSERT(toList lbls1 /= toList lbls2 || lbls1 `strictlyEqOL` lbls2)
strictlyOrdOL lbls1 lbls2
instance Outputable (BlockChain) where
ppr (BlockChain blks) =
parens (text "Chain:" <+> ppr (fromOL $ blks) )
data WeightedEdge = WeightedEdge !BlockId !BlockId EdgeWeight deriving (Eq)
-- | Non deterministic! (Uniques) Sorts edges by weight and nodes.
instance Ord WeightedEdge where
compare (WeightedEdge from1 to1 weight1)
(WeightedEdge from2 to2 weight2)
| weight1 < weight2 || weight1 == weight2 && from1 < from2 ||
weight1 == weight2 && from1 == from2 && to1 < to2
= LT
| from1 == from2 && to1 == to2 && weight1 == weight2
= EQ
| otherwise
= GT
instance Outputable WeightedEdge where
ppr (WeightedEdge from to info) =
ppr from <> text "->" <> ppr to <> brackets (ppr info)
type WeightedEdgeList = [WeightedEdge]
chainFoldl :: (b -> BlockId -> b) -> b -> BlockChain -> b
chainFoldl f z (BlockChain blocks) = foldl' f z blocks
noDups :: [BlockChain] -> Bool
noDups chains =
......@@ -270,19 +292,21 @@ inFront :: BlockId -> BlockChain -> Bool
inFront bid (BlockChain seq)
= headOL seq == bid
chainMember :: BlockId -> BlockChain -> Bool
chainMember bid chain
= elem bid $ fromOL . chainBlocks $ chain
-- = setMember bid . chainMembers $ chain
chainSingleton :: BlockId -> BlockChain
chainSingleton lbl
= BlockChain (unitOL lbl)
chainFromList :: [BlockId] -> BlockChain
chainFromList = BlockChain . toOL
chainSnoc :: BlockChain -> BlockId -> BlockChain
chainSnoc (BlockChain blks) lbl
= BlockChain (blks `snocOL` lbl)
chainCons :: BlockId -> BlockChain -> BlockChain
chainCons lbl (BlockChain blks)
= BlockChain (lbl `consOL` blks)
chainConcat :: BlockChain -> BlockChain -> BlockChain
chainConcat (BlockChain blks1) (BlockChain blks2)
= BlockChain (blks1 `appOL` blks2)
......@@ -311,52 +335,14 @@ takeL :: Int -> BlockChain -> [BlockId]
takeL n (BlockChain blks) =
take n . fromOL $ blks
-- | For a given list of chains try to fuse chains with strong
-- edges between them into a single chain.
-- Returns the list of fused chains together with a set of
-- used edges. The set of edges is indirectly encoded in the
-- chains so doesn't need to be considered for later passes.
fuseChains :: WeightedEdgeList -> LabelMap BlockChain
-> (LabelMap BlockChain, Set.Set WeightedEdge)
fuseChains weights chains
= let fronts = mapFromList $
map (\chain -> (headOL . chainBlocks $ chain,chain)) $
mapElems chains :: LabelMap BlockChain
(chains', used, _) = applyEdges weights chains fronts Set.empty
in (chains', used)
where
applyEdges :: WeightedEdgeList -> LabelMap BlockChain
-> LabelMap BlockChain -> Set.Set WeightedEdge
-> (LabelMap BlockChain, Set.Set WeightedEdge, LabelMap BlockChain)
applyEdges [] chainsEnd chainsFront used
= (chainsEnd, used, chainsFront)
applyEdges (edge@(WeightedEdge from to w):edges) chainsEnd chainsFront used
--Since we order edges descending by weight we can stop here
| w <= fuseEdgeThreshold
= ( chainsEnd, used, chainsFront)
--Fuse the two chains
| Just c1 <- mapLookup from chainsEnd
, Just c2 <- mapLookup to chainsFront
, c1 /= c2
= let newChain = chainConcat c1 c2
front = headOL . chainBlocks $ newChain
end = lastOL . chainBlocks $ newChain
chainsFront' = mapInsert front newChain $
mapDelete to chainsFront
chainsEnd' = mapInsert end newChain $
mapDelete from chainsEnd
in applyEdges edges chainsEnd' chainsFront'
(Set.insert edge used)
| otherwise
--Check next edge
= applyEdges edges chainsEnd chainsFront used
-- Note [Combining neighborhood chains]
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- See also Note [Chain based CFG serialization]
-- We have the chains (A-B-C-D) and (E-F) and an Edge C->E.
--
-- While placing the later after the former doesn't result in sequential
-- control flow it is still be benefical since block C and E might end
-- While placing the latter after the former doesn't result in sequential
-- control flow it is still benefical. As block C and E might end
-- up in the same cache line.
--
-- So we place these chains next to each other even if we can't fuse them.
......@@ -365,7 +351,7 @@ fuseChains weights chains
-- v
-- - -> E -> F ...
--
-- Simple heuristic to chose which chains we want to combine:
-- A simple heuristic to chose which chains we want to combine:
-- * Process edges in descending priority.
-- * Check if there is a edge near the end of one chain which goes
-- to a block near the start of another edge.
......@@ -375,14 +361,22 @@ fuseChains weights chains
-- us to find all edges between two chains, check the distance for all edges,
-- rank them based on the distance and and only then we can select two chains
-- to combine. Which would add a lot of complexity for little gain.
--
-- So instead we just rank by the strength of the edge and use the first pair we
-- find.
-- | For a given list of chains and edges try to combine chains with strong
-- edges between them.
combineNeighbourhood :: WeightedEdgeList -> [BlockChain]
-> [BlockChain]
combineNeighbourhood :: [CfgEdge] -- ^ Edges to consider
-> [BlockChain] -- ^ Current chains of blocks
-> ([BlockChain], Set.Set (BlockId,BlockId))
-- ^ Resulting list of block chains, and a set of edges which
-- were used to fuse chains and as such no longer need to be
-- considered.
combineNeighbourhood edges chains
= -- pprTraceIt "Neigbours" $
applyEdges edges endFrontier startFrontier
-- pprTrace "combineNeighbours" (ppr edges) $
applyEdges edges endFrontier startFrontier (Set.empty)
where
--Build maps from chain ends to chains
endFrontier, startFrontier :: FrontierMap
......@@ -396,14 +390,14 @@ combineNeighbourhood edges chains
let front = getFronts chain
entry = (front,chain)
in map (\x -> (x,entry)) front) chains
applyEdges :: WeightedEdgeList -> FrontierMap -> FrontierMap
-> [BlockChain]
applyEdges [] chainEnds _chainFronts =
ordNub $ map snd $ mapElems chainEnds
applyEdges ((WeightedEdge from to _w):edges) chainEnds chainFronts
applyEdges :: [CfgEdge] -> FrontierMap -> FrontierMap -> Set.Set (BlockId, BlockId)
-> ([BlockChain], Set.Set (BlockId,BlockId))
applyEdges [] chainEnds _chainFronts combined =
(ordNub $ map snd $ mapElems chainEnds, combined)
applyEdges ((CfgEdge from to _w):edges) chainEnds chainFronts combined
| Just (c1_e,c1) <- mapLookup from chainEnds
, Just (c2_f,c2) <- mapLookup to chainFronts
, c1 /= c2 -- Avoid trying to concat a short chain with itself.
, c1 /= c2 -- Avoid trying to concat a chain with itself.
= let newChain = chainConcat c1 c2
newChainFrontier = getFronts newChain
newChainEnds = getEnds newChain
......@@ -437,165 +431,299 @@ combineNeighbourhood edges chains
-- text "fronts" <+> ppr newFronts $$
-- text "ends" <+> ppr newEnds
-- )
applyEdges edges newEnds newFronts
applyEdges edges newEnds newFronts (Set.insert (from,to) combined)
| otherwise
= --pprTrace "noNeigbours" (ppr ()) $
applyEdges edges chainEnds chainFronts
= applyEdges edges chainEnds chainFronts combined
where
getFronts chain = takeL neighbourOverlapp chain
getEnds chain = takeR neighbourOverlapp chain
-- See [Chain based CFG serialization]
buildChains :: CFG -> [BlockId]
-> ( LabelMap BlockChain -- Resulting chains.
-- In the last stop we combine all chains into a single one.
-- Trying to place chains with strong edges next to each other.
mergeChains :: [CfgEdge] -> [BlockChain]
-> (BlockChain)
mergeChains edges chains
= -- pprTrace "combine" (ppr edges) $
runST $ do
let addChain m0 chain = do
ref <- newSTRef chain
return $ chainFoldl (\m' b -> mapInsert b ref m') m0 chain
chainMap' <- foldM (\m0 c -> addChain m0 c) mapEmpty chains
merge edges chainMap'
where
-- We keep a map from ALL blocks to their respective chain (sigh)
-- This is required since when looking at an edge we need to find
-- the associated chains quickly.
-- We use a map of STRefs, maintaining a invariant of one STRef per chain.
-- When merging chains we can update the
-- STRef of one chain once (instead of writing to the map for each block).
-- We then overwrite the STRefs for the other chain so there is again only
-- a single STRef for the combined chain.
-- The difference in terms of allocations saved is ~0.2% with -O so actually
-- significant compared to using a regular map.
merge :: forall s. [CfgEdge] -> LabelMap (STRef s BlockChain) -> ST s BlockChain
merge [] chains = do
chains' <- ordNub <$> (mapM readSTRef $ mapElems chains) :: ST s [BlockChain]
return $ foldl' chainConcat (head chains') (tail chains')
merge ((CfgEdge from to _):edges) chains
-- | pprTrace "merge" (ppr (from,to) <> ppr chains) False
-- = undefined
| cFrom == cTo
= merge edges chains
| otherwise
= do
chains' <- mergeComb cFrom cTo
merge edges chains'
where
mergeComb :: STRef s BlockChain -> STRef s BlockChain -> ST s (LabelMap (STRef s BlockChain))
mergeComb refFrom refTo = do
cRight <- readSTRef refTo
chain <- pure chainConcat <*> readSTRef refFrom <*> pure cRight
writeSTRef refFrom chain
return $ chainFoldl (\m b -> mapInsert b refFrom m) chains cRight
cFrom = expectJust "mergeChains:chainMap:from" $ mapLookup from chains
cTo = expectJust "mergeChains:chainMap:to" $ mapLookup to chains
-- See Note [Chain based CFG serialization] for the general idea.
-- This creates and fuses chains at the same time for performance reasons.
-- Try to build chains from a list of edges.
-- Edges must be sorted **descending** by their priority.
-- Returns the constructed chains, along with all edges which
-- are irrelevant past this point, this information doesn't need
-- to be complete - it's only used to speed up the process.
-- An Edge is irrelevant if the ends are part of the same chain.
-- We say these edges are already linked
buildChains :: [CfgEdge] -> [BlockId]
-> ( LabelMap BlockChain -- Resulting chains, indexd by end if chain.
, Set.Set (BlockId, BlockId)) --List of fused edges.
buildChains succWeights blocks
= let (_, fusedEdges, chains) = buildNext setEmpty mapEmpty blocks Set.empty
in (chains, fusedEdges)
buildChains edges blocks
= runST $ buildNext setEmpty mapEmpty mapEmpty edges Set.empty
where
-- We keep a map from the last block in a chain to the chain itself.
-- This we we can easily check if an block should be appened to an
-- buildNext builds up chains from edges one at a time.
-- We keep a map from the ends of chains to the chains.
-- This we we can easily check if an block should be appended to an
-- existing chain!
buildNext :: LabelSet
-> LabelMap BlockChain -- Map from last element to chain.
-> [BlockId] -- Blocks to place
-> Set.Set (BlockId, BlockId)
-> ( [BlockChain] -- Placed Blocks
, Set.Set (BlockId, BlockId) --List of fused edges
, LabelMap BlockChain
)
buildNext _placed chains [] linked =
([], linked, chains)
buildNext placed chains (block:todo) linked
| setMember block placed
= buildNext placed chains todo linked
-- We store them using STRefs so we don't have to rebuild the spine of both
-- maps every time we update a chain.
buildNext :: forall s. LabelSet
-> LabelMap (STRef s BlockChain) -- Map from end of chain to chain.
-> LabelMap (STRef s BlockChain) -- Map from start of chain to chain.
-> [CfgEdge] -- Edges to check - ordered by decreasing weight
-> Set.Set (BlockId, BlockId) -- Used edges
-> ST s ( LabelMap BlockChain -- Chains by end
, Set.Set (BlockId, BlockId) --List of fused edges
)
buildNext placed _chainStarts chainEnds [] linked = do
ends' <- sequence $ mapMap readSTRef chainEnds :: ST s (LabelMap BlockChain)
-- Any remaining blocks have to be made to singleton chains.
-- They might be combined with other chains later on outside this function.
let unplaced = filter (\x -> not (setMember x placed)) blocks
singletons = map (\x -> (x,chainSingleton x)) unplaced :: [(BlockId,BlockChain)]
return (foldl' (\m (k,v) -> mapInsert k v m) ends' singletons , linked)
buildNext placed chainStarts chainEnds (edge:todo) linked
| from == to
-- We skip self edges
= buildNext placed chainStarts chainEnds todo (Set.insert (from,to) linked)
| not (alreadyPlaced from) &&
not (alreadyPlaced to)
= do
--pprTraceM "Edge-Chain:" (ppr edge)
chain' <- newSTRef $ chainFromList [from,to]
buildNext
(setInsert to (setInsert from placed))
(mapInsert from chain' chainStarts)
(mapInsert to chain' chainEnds)
todo
(Set.insert (from,to) linked)
| (alreadyPlaced from) &&
(alreadyPlaced to)
, Just predChain <- mapLookup from chainEnds
, Just succChain <- mapLookup to chainStarts
, predChain /= succChain -- Otherwise we try to create a cycle.
= do
-- pprTraceM "Fusing edge" (ppr edge)
fuseChain predChain succChain
| (alreadyPlaced from) &&
(alreadyPlaced to)
= --pprTraceM "Skipping:" (ppr edge) >>
buildNext placed chainStarts chainEnds todo linked
| otherwise
= buildNext placed' chains' todo linked'
= do -- pprTraceM "Finding chain for:" (ppr edge $$
-- text "placed" <+> ppr placed)
findChain
where
placed' = (foldl' (flip setInsert) placed placedBlocks)
linked' = Set.union linked linkedEdges
(placedBlocks, chains', linkedEdges) = findChain block
--Add the block to a existing or new chain
--Returns placed blocks, list of resulting chains
--and fused edges
findChain :: BlockId
-> ([BlockId],LabelMap BlockChain, Set.Set (BlockId, BlockId))
findChain block
-- B) place block at end of existing chain if
-- there is no better block to append.
| (pred:_) <- preds
, alreadyPlaced pred
, Just predChain <- mapLookup pred chains
, (best:_) <- filter (not . alreadyPlaced) $ getSuccs pred
, best == lbl
= --pprTrace "B.2)" (ppr (pred,lbl)) $
let newChain = chainSnoc predChain block
chainMap = mapInsert lbl newChain $ mapDelete pred chains
in ( [lbl]
, chainMap
, Set.singleton (pred,lbl) )
from = edgeFrom edge
to = edgeTo edge
alreadyPlaced blkId = (setMember blkId placed)
-- Combine two chains into a single one.
fuseChain :: STRef s BlockChain -> STRef s BlockChain
-> ST s ( LabelMap BlockChain -- Chains by end
, Set.Set (BlockId, BlockId) --List of fused edges
)
fuseChain fromRef toRef = do
fromChain <- readSTRef fromRef
toChain <- readSTRef toRef
let newChain = chainConcat fromChain toChain
ref <- newSTRef newChain
let start = head $ takeL 1 newChain
let end = head $ takeR 1 newChain
-- chains <- sequence $ mapMap readSTRef chainStarts
-- pprTraceM "pre-fuse chains:" $ ppr chains
buildNext
placed
(mapInsert start ref $ mapDelete to $ chainStarts)
(mapInsert end ref $ mapDelete from $ chainEnds)
todo
(Set.insert (from,to) linked)
--Add the block to a existing chain or creates a new chain
findChain :: ST s ( LabelMap BlockChain -- Chains by end
, Set.Set (BlockId, BlockId) --List of fused edges
)
findChain
-- We can attach the block to the end of a chain
| alreadyPlaced from
, Just predChain <- mapLookup from chainEnds
= do
chain <- readSTRef predChain
let newChain = chainSnoc chain to
writeSTRef predChain newChain
let chainEnds' = mapInsert to predChain $ mapDelete from chainEnds
-- chains <- sequence $ mapMap readSTRef chainStarts
-- pprTraceM "from chains:" $ ppr chains
buildNext (setInsert to placed) chainStarts chainEnds' todo (Set.insert (from,to) linked)
-- We can attack it to the front of a chain
| alreadyPlaced to
, Just succChain <- mapLookup to chainStarts
= do
chain <- readSTRef succChain
let newChain = from `chainCons` chain
writeSTRef succChain newChain
let chainStarts' = mapInsert from succChain $ mapDelete to chainStarts
-- chains <- sequence $ mapMap readSTRef chainStarts'
-- pprTraceM "to chains:" $ ppr chains
buildNext (setInsert from placed) chainStarts' chainEnds todo (Set.insert (from,to) linked)
-- The placed end of the edge is part of a chain already and not an end.
| otherwise
= --pprTrace "single" (ppr lbl)
( [lbl]
, mapInsert lbl (chainSingleton lbl) chains
, Set.empty)
= do
let block = if alreadyPlaced to then from else to
--pprTraceM "Singleton" $ ppr block
let newChain = chainSingleton block
ref <- newSTRef newChain
buildNext (setInsert block placed) (mapInsert block ref chainStarts)
(mapInsert block ref chainEnds) todo (linked)
where