1. 01 Mar, 2006 8 commits
  2. 28 Feb, 2006 8 commits
    • takeMVar/putMVar were missing some write barriers when modifying a TSO · 080c9600
      Simon Marlow authored
      This relates to the recent introduction of clean/dirty TSOs, and the
      consequent write barriers required.  We were missing some write
      barriers in the takeMVar/putMVar family of primops, when performing
      the take/put directly on another TSO.
      Fixes #705, and probably some test failures.
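      As a rough illustration of the kind of barrier involved, the C sketch
      below models a put performed directly on another thread's TSO.  Every
      name in it (tso_model, dirty_tso, put_directly, mut_list) is
      hypothetical and much simpler than the RTS's real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-simplified model of a thread state object. */
typedef struct {
    int  dirty;      /* 0 = clean: the GC may assume it is unmodified */
    long stack_top;  /* stands in for the thread's stack */
} tso_model;

/* Hypothetical remembered set of objects the GC must rescan. */
#define MUT_MAX 8
static tso_model *mut_list[MUT_MAX];
static size_t     mut_len = 0;

/* The write barrier: record a clean TSO before it is first modified. */
static void dirty_tso(tso_model *t)
{
    if (!t->dirty) {
        assert(mut_len < MUT_MAX);
        t->dirty = 1;
        mut_list[mut_len++] = t;
    }
}

/* A put performed directly on another, blocked thread: the value is
   written into that thread's stack, so the barrier must run first. */
static void put_directly(tso_model *blocked_taker, long value)
{
    dirty_tso(blocked_taker);
    blocked_taker->stack_top = value;
}
```

      In this model, omitting the dirty_tso call would leave a modified TSO
      marked clean, which is the class of bug the commit describes.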
    • A better x86_64 register mapping, with more argument registers. · cd0bb88b
      Simon Marlow authored
      Now that we can handle using C argument registers as global registers,
      extend the x86_64 register mapping.  We now have 5 integer argument
      registers, 4 float, and 2 double (all caller-saves).  This results in a
      reasonable speedup on x86_64.
    • filter the messages generated by gcc · 98344985
      Simon Marlow authored
      Eliminate things like "warning: call-clobbered register used as global
      register variable", which is a non-suppressible warning from gcc.
    • Allow C argument regs to be used as global regs (R1, R2, etc.) · 14a5c62a
      Simon Marlow authored
      The problem here was that we generated C calls with expressions
      involving R1 etc. as parameters.  When some of the R registers are
      also C argument registers, both GCC and the native code generator
      generate incorrect code.  The hacky workaround is to assign
      problematic arguments to temporaries first; fortunately this works
      with both GCC and the NCG, but we have to be careful not to undo this
      with later optimisations (see changes to CmmOpt).
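      The hazard and the workaround can be sketched in portable C by
      modelling the argument registers as an array.  This is only an
      illustrative model, not the code GHC generates: arg_reg, callee_sub,
      call_naive and call_with_temps are hypothetical names, and R1/R2 are
      simply aliased to the two modelled argument registers.

```c
#include <assert.h>

/* Hypothetical model: two machine argument registers that also hold the
   STG global registers R1 and R2. */
static long arg_reg[2];
#define R1 (arg_reg[0])
#define R2 (arg_reg[1])

/* The callee reads its two parameters from the argument registers. */
static long callee_sub(void)
{
    return arg_reg[0] - arg_reg[1];
}

/* Naive call of callee_sub(R2, R1): loading the first argument
   overwrites R1 before the second argument reads it. */
static long call_naive(void)
{
    arg_reg[0] = R2;
    arg_reg[1] = R1;   /* R1 was clobbered by the line above */
    return callee_sub();
}

/* Workaround: copy the problematic arguments to temporaries first. */
static long call_with_temps(void)
{
    long t0 = R2;
    long t1 = R1;
    arg_reg[0] = t0;
    arg_reg[1] = t1;
    return callee_sub();
}

/* Usage: the intended result of callee_sub(R2, R1) is R2 - R1. */
static void demo_arg_regs(void)
{
    R1 = 10; R2 = 3;
    assert(call_naive() == 0);        /* wrong: computes 3 - 3 */
    R1 = 10; R2 = 3;
    assert(call_with_temps() == -7);  /* right: computes 3 - 10 */
}
```

      An optimisation that inlined t0 and t1 back into the argument
      positions would turn call_with_temps into call_naive again, which is
      presumably why the commit mentions taking care in CmmOpt.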
    • pass arguments to unknown function calls in registers · 04db0e9f
      Simon Marlow authored
      We now have more stg_ap entry points: stg_ap_*_fast, which take
      arguments in registers according to the platform calling convention.
      This is faster if the function being called is evaluated and has the
      right arity, which is the common case (see the eval/apply paper).
      We still need the stg_ap_*_info entry points for stack-based
      application, such as overflows when a function is applied to too
      many arguments.  The stg_ap_*_fast functions actually just check for
      an evaluated function, and if they don't find one, push the args on
      the stack and invoke stg_ap_*_info.  (This might be slightly slower in
      some cases, but not the common case.)
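      The fast-path/fallback shape described above might look roughly like
      the following C model.  It is a sketch under assumptions: closure,
      ap_2_fast, ap_2_info and the explicit stack are hypothetical stand-ins
      for the real entry points, and the arity test is added here to match
      the "right arity" condition in the message.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*fun2)(long, long);

/* Hypothetical closure: either an evaluated function or something that
   still needs work (a thunk, or a function of a different arity). */
typedef struct {
    int  is_function;  /* already evaluated to a function? */
    int  arity;
    fun2 code;
} closure;

#define STACK_MAX 16
static long   stack[STACK_MAX];
static size_t sp = 0;
static int    slow_calls = 0;  /* counts trips through the generic path */

/* Stands in for a stack-based stg_ap_*_info entry point. A real one
   would evaluate thunks and handle partial or over-application; this
   model only pops the arguments and calls the code. */
static long ap_2_info(const closure *f)
{
    slow_calls++;
    long b = stack[--sp];
    long a = stack[--sp];
    return f->code(a, b);
}

/* Stands in for a stg_ap_*_fast entry point: arguments arrive as C
   parameters, modelling arguments passed in registers. */
static long ap_2_fast(const closure *f, long a, long b)
{
    if (f->is_function && f->arity == 2)
        return f->code(a, b);          /* common case: call directly */

    assert(sp + 2 <= STACK_MAX);
    stack[sp++] = a;                   /* otherwise push the args ... */
    stack[sp++] = b;
    return ap_2_info(f);               /* ... and use the generic path */
}

/* Example function to apply. */
static long add2(long a, long b) { return a + b; }
```

      Calling ap_2_fast on an evaluated two-argument closure never touches
      the stack, while any other closure pays for the push plus the
      generic call, which matches the "slightly slower in some cases"
      remark.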
    • fix live register annotations on foreign calls · 174c7f29
      Simon Marlow authored
      Fix one incorrect case, and make several more accurate.
    • Simplify the IdInfo before any RHSs · 2317c27b
      simonpj@microsoft.com authored
      Simplify (i.e. substitute) the IdInfo of a recursive group of Ids
      before looking at the RHSs of *any* of them.  That way, the rules
      are available throughout the letrec, which means we don't have to
      be careful about which function to put first.
      Before, we just simplified the IdInfo of f before looking at f's RHS,
      but that's not so good when f and g both have RULES, and both rules
      mention the other.
      This change makes things simpler, but shouldn't change performance.
  3. 27 Feb, 2006 2 commits
  4. 22 Feb, 2006 2 commits
  5. 27 Feb, 2006 2 commits
  6. 25 Feb, 2006 1 commit
  7. 26 Feb, 2006 1 commit
  8. 25 Feb, 2006 3 commits
    • NCG: Fix Typo in Register Allocator Loop Patch · 17fc6271
      wolfgang.thaller@gmx.net authored
      Fix previous patch "NCG: Handle loops in register allocator"
      Of course, I broke it when correcting a style problem just before committing.
    • NCG: Handle loops in register allocator · 34f992d3
      wolfgang.thaller@gmx.net authored
      Fill in the missing parts in the register allocator so that it can
      handle loops.
      *) The register allocator now runs in the UniqSupply monad, as it needs
         to be able to generate unique labels for fixup code blocks.
      *) A few functions have been added to RegAllocInfo:
      	mkRegRegMoveInstr -- generates a good old move instruction
      	mkBranchInstr     -- used to be MachCodeGen.genBranch
      	patchJump         -- Change the destination of a jump
      *) The register allocator now makes sure that only one spill slot is used
         for each temporary, even if it is spilled and reloaded several times.
         This obviates the need for memory-to-memory moves in fixup code.
      *) The case where the fixup code needs to cyclically permute a group of
         registers is currently unhandled. This will need more work once we come
         across code where this actually happens.
      *) Register allocation for code with loops is probably very inefficient
         (both at compile-time and at run-time).
      *) We still cannot compile the RTS via NCG, for various other reasons.
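      The "one spill slot per temporary" rule in the list above can be
      illustrated with a small C model.  It is not the allocator's actual
      code (which is Haskell); slot_of, next_slot and spill_slot_for are
      hypothetical names chosen for this sketch.

```c
#include <assert.h>

#define MAX_TEMPS 64

/* slot_of[t] is 0 while temporary t has never been spilled,
   otherwise its spill slot number plus one. */
static int slot_of[MAX_TEMPS];
static int next_slot = 0;

/* Return the spill slot for a temporary, allocating one only on the
   first spill and reusing it on every later spill or reload. */
static int spill_slot_for(int temp)
{
    assert(temp >= 0 && temp < MAX_TEMPS);
    if (slot_of[temp] == 0)
        slot_of[temp] = ++next_slot;
    return slot_of[temp] - 1;
}
```

      Because a temporary spilled along two different paths always lands
      in the same slot, fixup code at a join point never has to copy it
      from one slot to another, so no memory-to-memory move is needed.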
  9. 24 Feb, 2006 12 commits
  10. 23 Feb, 2006 1 commit