  1. 10 Sep, 2013 1 commit
  2. 12 Nov, 2012 1 commit
    • Remove OldCmm, convert backends to consume new Cmm · d92bd17f
      Simon Marlow authored
      This removes the OldCmm data type and the CmmCvt pass that converts
      new Cmm to OldCmm.  The backends (NCGs, LLVM and C) have all been
      converted to consume new Cmm.
      
      The main difference between the two data types is that conditional
      branches in new Cmm have both true/false successors, whereas in OldCmm
      the false case was a fallthrough.  To generate slightly better code we
      occasionally need to invert a conditional to ensure that the
      branch-not-taken becomes a fallthrough; this was previously done in
      CmmCvt, and it is now done in CmmContFlowOpt.
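
The inversion described above can be illustrated with a toy model. This is only a sketch: the types `Cond` and `Branch` and the function `preferFallthrough` are hypothetical names invented for illustration, not GHC's Cmm definitions or the CmmContFlowOpt code, and only integer-style comparisons are modelled.

```haskell
-- Toy model only: these types are hypothetical, not GHC's Cmm definitions.
type Label = String

data Cond = Eq | Ne | Lt | Ge
  deriving (Eq, Show)

-- A two-way conditional branch with both successors explicit:
-- condition, true target, false target.
data Branch = CondBranch Cond Label Label
  deriving (Eq, Show)

-- Negate a condition.
invert :: Cond -> Cond
invert Eq = Ne
invert Ne = Eq
invert Lt = Ge
invert Ge = Lt

-- Given the label of the block laid out immediately after this one,
-- arrange for the false successor to be that block where possible,
-- so that the branch-not-taken case can fall through.
preferFallthrough :: Label -> Branch -> Branch
preferFallthrough next b@(CondBranch c t f)
  | f == next = b                          -- already falls through
  | t == next = CondBranch (invert c) f t  -- swap targets, negate the test
  | otherwise = b                          -- neither successor is next
```

For instance, if the next block is `"L2"` and the branch jumps to `"L2"` when `Lt` holds, the sketch rewrites it to jump to the other target when `Ge` holds, leaving `"L2"` as the fallthrough.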
      
      We could go further and use the Hoopl Block representation for native
      code, which would mean that we could use Hoopl's postorderDfs and
      analyses for native code, but for now I've left it as is, using the
      old ListGraph representation for native code.
  3. 20 Sep, 2012 1 commit
    • Teach the linear register allocator how to allocate more stack if necessary · 0b0a41f9
      Simon Marlow authored
      This squashes the "out of spill slots" panic that occasionally happens
      on x86, by adding instructions to bump and retreat the C stack pointer
      as necessary.  The panic has become more common since the new codegen,
      because we lump code into larger blocks, and the register allocator
      isn't very good at reusing stack slots for spilling (see Note [extra
      spill slots]).
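
As a rough illustration of "bump and retreat", the sketch below wraps a procedure's instruction list with stack adjustments. Everything here is hypothetical: the `Instr` type, the `addExtraStack` function, and the assumption that every `Jump` leaves the procedure are inventions for this example and do not reflect the allocator's real data types.

```haskell
-- Toy model only: a hypothetical instruction type, not GHC's native code types.
data Instr
  = Op String          -- any ordinary instruction
  | Jump String        -- control transfer out of the procedure (assumed)
  | AllocStack Int     -- move the stack pointer to reserve n more bytes
  | DeallocStack Int   -- move the stack pointer back by n bytes
  deriving (Eq, Show)

-- If spilling needs 'extra' bytes beyond what is already reserved,
-- reserve them on entry and release them before every exit.
addExtraStack :: Int -> [Instr] -> [Instr]
addExtraStack extra instrs
  | extra <= 0 = instrs
  | otherwise  = AllocStack extra : concatMap fixExit instrs
  where
    fixExit j@(Jump _) = [DeallocStack extra, j]
    fixExit i          = [i]
```

The point of the design is that running out of spill slots becomes a code-size cost rather than a compiler panic.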
  4. 16 Sep, 2012 1 commit
  5. 14 Sep, 2012 1 commit
  6. 28 Aug, 2012 2 commits
  7. 21 Aug, 2012 6 commits
  8. 06 Aug, 2012 1 commit
  9. 11 Jul, 2012 2 commits
  10. 09 Jul, 2012 2 commits
    • Don't re-allocate %esi on x86. · 10575479
      Simon Marlow authored
      Recent changes have freed up %esi for general use on x86 when it is
      not being used for R1.  However, x86 has a non-uniform register
      architecture where there is no 8-bit equivalent of %esi.  The register
      allocators aren't sophisticated enough to cope with this, so we have
      to back off and treat %esi as non-allocatable for now.  (Of course,
      LLVM doesn't suffer from this problem.)
      
      One workaround would be to change the calling convention to use %rbx
      for R1; however, we can't change the calling convention now without
      patching LLVM too.
    • Fix compile failure on non x86/x86-64 (#7054). · 810f0be6
      Erik de Castro Lopo authored and Simon Marlow committed
  11. 06 Jul, 2012 1 commit
    • Allow the register allocator access to argument regs (R1.., F1.., etc.) · f857f074
      Simon Marlow authored
      This was made possible by the recent change to codeGen to attach the
      live GlobalRegs to every CmmJump, and we'll be relying on it quite
      heavily in the new code generator too.
      
      What this means essentially is that when we see
      
        x = R1
      
      the register allocator will automatically assign x to R1 and generate
      no code at all (also known as "coalescing"). It wasn't possible before
      because the register allocator had to assume that R1 was always live,
      because it didn't have access to accurate liveness information.
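
A minimal sketch of the coalescing idea follows. It is a toy: `Reg`, `Instr` and `coalesce` are hypothetical names, and the code simply assumes that the real register is not overwritten while the virtual register is live. In the allocator, that condition is what the liveness information establishes.

```haskell
-- Toy model only: hypothetical types, not GHC's register allocator.
data Reg = Virtual String | Real String
  deriving (Eq, Show)

data Instr
  = Move Reg Reg   -- Move dst src
  | Add Reg Reg    -- Add dst src  (dst := dst + src)
  deriving (Eq, Show)

-- Walk the instructions in order. A move from a real register into a
-- virtual one is dropped, and later mentions of that virtual register
-- are rewritten to the real register.
-- ASSUMPTION: the real register is not clobbered while the virtual
-- register is still live; this sketch does not check that.
coalesce :: [Instr] -> [Instr]
coalesce = go []
  where
    go _   [] = []
    go env (Move v@(Virtual _) r@(Real _) : rest) = go ((v, r) : env) rest
    go env (Move d s : rest) = Move (subst env d) (subst env s) : go env rest
    go env (Add d s : rest)  = Add (subst env d) (subst env s) : go env rest

    -- Replace a register by its coalesced partner, if it has one.
    subst env r = maybe r id (lookup r env)
```

Applied to `[Move (Virtual "x") (Real "R1"), Add (Virtual "y") (Virtual "x")]`, the move disappears and the addition reads `R1` directly.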
  12. 12 Jun, 2012 1 commit
  13. 22 Mar, 2012 1 commit
  14. 21 Mar, 2012 2 commits
  15. 19 Mar, 2012 1 commit
  16. 06 Nov, 2011 1 commit
  17. 06 Sep, 2011 2 commits
  18. 31 Aug, 2011 2 commits
  19. 30 Aug, 2011 1 commit
  20. 04 May, 2011 1 commit
  21. 30 Apr, 2011 1 commit
  22. 24 Jan, 2011 1 commit
    • Merge in new code generator branch. · 889c084e
      Simon Marlow authored
      This changes the new code generator to make use of the Hoopl package
      for dataflow analysis.  Hoopl is a new boot package, and is maintained
      in a separate upstream git repository (as usual, GHC has its own
      lagging darcs mirror in http://darcs.haskell.org/packages/hoopl).
      
      During this merge I squashed recent history into one patch.  I tried
      to rebase, but the history had some internal conflicts of its own
      which made rebase extremely confusing, so I gave up. The history I
      squashed was:
      
        - Update new codegen to work with latest Hoopl
        - Add some notes on new code gen to cmm-notes
        - Enable Hoopl lag package.
        - Add SPJ note to cmm-notes
        - Improve GC calls on new code generator.
      
      Work in this branch was done by:
         - Milan Straka <fox@ucw.cz>
         - John Dias <dias@cs.tufts.edu>
         - David Terei <davidterei@gmail.com>
      
      Edward Z. Yang <ezyang@mit.edu> merged in further changes from GHC HEAD
      and fixed a few bugs.
  23. 23 Jun, 2010 1 commit
  24. 15 Jun, 2010 1 commit
  25. 15 Feb, 2010 2 commits
  26. 05 Feb, 2010 1 commit
  27. 04 Feb, 2010 1 commit
    • Implement SSE2 floating-point support in the x86 native code generator (#594) · 335b9f36
      Simon Marlow authored
      The new flag -msse2 enables code generation for SSE2 on x86.  It
      results in substantially faster floating-point performance; the main
      reason for doing this was that our x87 code generation is appallingly
      bad, and since we plan to drop -fvia-C soon, we need a way to generate
      half-decent floating-point code.
      
      The catch is that SSE2 is only available on CPUs that support it (P4+,
      AMD K8+).  We'll have to think hard about whether we should enable it
      by default for the libraries we ship.  In the meantime, at least
      -msse2 should be an acceptable replacement for "-fvia-C
      -optc-ffast-math -fexcess-precision".
      
      SSE2 also has the advantage of performing all operations at the
      correct precision, so floating-point results are consistent with other
      platforms.
      
      I also tweaked the x87 code generation a bit while I was here; it's
      now slightly less bad than before.
  28. 02 Aug, 2009 1 commit
    • RTS tidyup sweep, first phase · a2a67cd5
      Simon Marlow authored
      The first phase of this tidyup is focussed on the header files, and in
      particular making sure we are exposing publicly exactly what we need
      to, and no more.
      
       - Rts.h now includes everything that the RTS exposes publicly,
         rather than a random subset of it.
      
       - Most of the public header files have moved into subdirectories, and
         many of them have been renamed.  But clients should not need to
         include any of the other headers directly, just #include the main
         public headers: Rts.h, HsFFI.h, RtsAPI.h.
      
       - All the headers needed for via-C compilation have moved into the
         stg subdirectory, which is self-contained.  Most of the headers for
         the rest of the RTS APIs have moved into the rts subdirectory.
      
       - I left MachDeps.h where it is, because it is so widely used in
         Haskell code.
       
       - I left a deprecated stub for RtsFlags.h in place.  The flag
         structures are now exposed by Rts.h.
      
       - Various internal APIs are no longer exposed by public header files.
      
       - Various bits of dead code and declarations have been removed.
      
       - More gcc warnings are turned on, and the RTS code is more
         warning-clean.
      
       - More source files #include "PosixSource.h", and hence only use
         standard POSIX (1003.1c-1995) interfaces.
      
      There is a lot more tidying up still to do; this is just the first
      pass.  I also intend to standardise the names for external RTS APIs
      (e.g. use the rts_ prefix consistently), and declare the internal APIs
      as hidden for shared libraries.