    [project @ 2000-06-15 08:38:25 by sewardj] · 665229e5
    sewardj authored
    Major thing: new register allocator.  Brief description follows.
    Should correctly handle code containing loops, even though we don't
    generate any at the moment.  A lot of comments.  The previous
    machinery for spilling is retained, as is the idea of a fast-and-easy
    initial allocation attempt intended to deal with the majority of code
    blocks (about 60% on x86) very cheaply.  Many comments explaining
    in detail how it works :-)
    The Stix inliner is now on by default.  Integer code seems to run
    within about 1% of -fvia-C code.  x86 fp code is significantly worse,
    up to about 30% slower, depending on the amount of fp activity.
    Minor thing: lazyfication of the top-level NCG plumbing, so that the
    NCG doesn't require any greater residency than compiling to C, just a
    bit more time.  Created lazyThenUs and lazyMapUs for this purpose.
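The message names lazyThenUs and lazyMapUs but does not show them. The following is a minimal sketch of the idea only, assuming a unique-supply monad represented as a plain function; Supply, UniqSM, getUnique and tag are hypothetical stand-ins, not the compiler's definitions. The point is that a `let` in the bind leaves the first computation unevaluated until its result is demanded, whereas a `case` forces it immediately.

```haskell
-- Hypothetical unique supply: just a counter.
newtype Supply = Supply Int

-- Hypothetical monad: a function from a supply to (result, remaining supply).
type UniqSM a = Supply -> (a, Supply)

returnUs :: a -> UniqSM a
returnUs x us = (x, us)

getUnique :: UniqSM Int
getUnique (Supply n) = (n, Supply (n + 1))

-- Strict bind: the case forces the first computation before continuing.
thenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
thenUs m k us = case m us of (x, us') -> k x us'

-- Lazy bind: the let binding is a thunk, evaluated only when needed.
lazyThenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
lazyThenUs m k us = let (x, us') = m us in k x us'

-- Lazy map: the head of the result is available before the tail is computed.
lazyMapUs :: (a -> UniqSM b) -> [a] -> UniqSM [b]
lazyMapUs _ []       = returnUs []
lazyMapUs f (x : xs) =
  f x `lazyThenUs` \y ->
  lazyMapUs f xs `lazyThenUs` \ys ->
  returnUs (y : ys)

-- Pair a value with a fresh unique.
tag :: a -> UniqSM (a, Int)
tag x = getUnique `lazyThenUs` \u -> returnUs (x, u)
```

Because the result list is built lazily, a consumer can process each element as it is produced instead of holding the whole list in memory; in this sketch, `take 3 (fst (lazyMapUs tag [1 ..] (Supply 0)))` terminates even though the input list is infinite.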
    The new allocator is somewhat, although not catastrophically, slower
    than the old one.  Fixing the long-standing NCG space leak more
    than makes up for it; overall hsc run-time is down about 5%, due to
    significantly reduced GC time.
    Instructions are numbered sequentially, starting at zero.
    A flow edge (FE) is a pair of insn numbers (MkFE Int Int) denoting
    a possible flow of control from the first insn to the second.
    The input to the register allocator is a list of instructions, which
    mention Regs.  A Reg can be a RealReg -- a real machine reg -- or a
    VirtualReg, which carries a unique.  After allocation, all the
    VirtualReg references will have been converted into RealRegs, and
    possibly some spill code will have been inserted.
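To make the vocabulary concrete, here is a small sketch of the data involved. Only the names Reg, RealReg, VirtualReg and MkFE come from the text above; the Insn type, its constructors, and flowEdges are hypothetical, and a plain Int stands in for a unique. It numbers instructions from zero and lists the flow edges explicitly.

```haskell
-- Hypothetical stand-ins; the real NCG types carry much more detail.
data Reg
  = RealReg Int      -- a real machine register, identified by number
  | VirtualReg Int   -- a virtual register; the Int stands in for a unique
  deriving (Eq, Ord, Show)

-- A flow edge, as in the text: control may pass from the first
-- instruction number to the second.
data FE = MkFE Int Int
  deriving (Eq, Ord, Show)

-- Toy instruction set (an assumption), just enough to have control flow.
data Insn
  = Move Reg Reg     -- falls through to the next instruction
  | Jump Int         -- unconditional jump to an instruction number
  | CondJump Int     -- either jumps to the target or falls through
  | Ret              -- no successors
  deriving (Eq, Show)

-- Number the instructions from zero and list every flow edge.
-- Edges that would leave the instruction list are dropped.
flowEdges :: [Insn] -> [FE]
flowEdges insns = concat (zipWith edgesFrom [0 ..] insns)
  where
    n = length insns
    inRange t = t >= 0 && t < n
    edgesFrom i insn = [ MkFE i t | t <- targets i insn, inRange t ]
    targets i (Move _ _)   = [i + 1]
    targets _ (Jump t)     = [t]
    targets i (CondJump t) = [i + 1, t]
    targets _ Ret          = []
```

A backward CondJump yields an edge whose second number is smaller than its first, which is how a loop shows up in this representation.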
    The heart of the register allocator works in four phases.
    1.  (find_flow_edges) Calculate all the FEs for the code list.
        Return them not as a [FE], but implicitly, as a pair of
        Array Int [Int], being the successor and predecessor maps
        for instructions.
    2.  (calc_liveness) Returns a FiniteMap FE RegSet.  For each
        FE, indicates the set of registers live on that FE.  Note
        that the set includes both RealRegs and VirtualRegs.  The
        former appear because the code could mention fixed register
        usages, and we need to take them into account from the start.
    3.  (calc_live_range_sets) Invert the above mapping, giving a
        FiniteMap Reg FeSet, indicating, for each virtual and real
        reg mentioned in the code, which FEs it is live on.
    4.  (calc_vreg_to_rreg_mapping) For each virtual reg, try to find
        an allocatable real register for it.  Each real register has
        a "current commitment", indicating the set of FEs it is
        currently live on.  A virtual reg v can be assigned to
        real reg r iff v's live-fe-set does not intersect with r's
        current commitment fe-set.  If the assignment is made,
        v's live-fe-set is union'd into r's current commitment fe-set.
        There is also the minor restriction that v and r must be of
        the same register class (integer or floating).
        Once this mapping is established, we simply apply it to the
        input insns, and that's it.
        If no suitable real register can be found, the vreg is mapped
        to itself, and we deem allocation to have failed.  The partially
        allocated code is returned.  The higher echelons of the allocator
        (doGeneralAlloc and runRegAlloc) then cooperate to insert spill
        code and re-run allocation, until a successful allocation is found.
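Phases 3 and 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this commit: Data.Map and Data.Set stand in for FiniteMap and the set types, the Reg and RegClass definitions are hypothetical, virtual regs are processed in map key order, each real reg's initial commitment is taken to be its own live range, and the function names are camel-case variants of those in the text.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set
import Data.List (find, foldl')

-- Hypothetical stand-ins for the NCG types named in the text.
data RegClass = IntClass | FloatClass deriving (Eq, Ord, Show)
data Reg = RealReg RegClass Int | VirtualReg RegClass Int
  deriving (Eq, Ord, Show)
data FE = MkFE Int Int deriving (Eq, Ord, Show)

type RegSet = Set.Set Reg
type FeSet  = Set.Set FE

regClass :: Reg -> RegClass
regClass (RealReg c _)    = c
regClass (VirtualReg c _) = c

isVirtual :: Reg -> Bool
isVirtual (VirtualReg _ _) = True
isVirtual _                = False

-- Phase 3: invert "edge -> live regs" into "reg -> edges it is live on".
calcLiveRangeSets :: Map.Map FE RegSet -> Map.Map Reg FeSet
calcLiveRangeSets liveness =
  Map.fromListWith Set.union
    [ (r, Set.singleton fe)
    | (fe, regs) <- Map.toList liveness
    , r <- Set.toList regs ]

-- Phase 4: greedy assignment. Returns the mapping and a success flag.
-- A vreg with no suitable real reg is mapped to itself and the flag
-- becomes False, mirroring "allocation failed" in the text.
calcVregToRregMapping :: [Reg] -> Map.Map Reg FeSet -> (Map.Map Reg Reg, Bool)
calcVregToRregMapping allocatable ranges = (mapping, ok)
  where
    vregs = [ (v, fes) | (v, fes) <- Map.toList ranges, isVirtual v ]
    -- Assumed initial commitments: fixed uses of real regs in the code.
    commit0 = Map.fromList
      [ (r, Map.findWithDefault Set.empty r ranges) | r <- allocatable ]
    (_, mapping, ok) = foldl' step (commit0, Map.empty, True) vregs
    step (commit, m, good) (v, fes) =
      case find (fits v fes commit) allocatable of
        Just r  -> (Map.insertWith Set.union r fes commit, Map.insert v r m, good)
        Nothing -> (commit, Map.insert v v m, False)
    -- Same class, and no overlap with the real reg's current commitment.
    fits v fes commit r =
      regClass r == regClass v
        && Set.null (Set.intersection fes (Map.findWithDefault Set.empty r commit))
```

With two allocatable integer regs and three vregs where only the first two overlap, the third vreg reuses the first real reg. With a single allocatable reg, the second vreg maps to itself and the flag is False, which is the point at which the caller would insert spill code and retry.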
MachCode.lhs 102 KB