1. 04 Feb, 2010 1 commit
    • Implement SSE2 floating-point support in the x86 native code generator (#594) · 335b9f36
      Simon Marlow authored
      The new flag -msse2 enables code generation for SSE2 on x86.  It
      results in substantially faster floating-point performance; the main
      reason for doing this was that our x87 code generation is appallingly
      bad, and since we plan to drop -fvia-C soon, we need a way to generate
      half-decent floating-point code.
      
      The catch is that SSE2 is only available on CPUs that support it (P4+,
      AMD K8+).  We'll have to think hard about whether we should enable it
      by default for the libraries we ship.  In the meantime, at least
      -msse2 should be an acceptable replacement for "-fvia-C
      -optc-ffast-math -fexcess-precision".
      
      SSE2 also has the advantage of performing all operations at the
      correct precision, so floating-point results are consistent with other
      platforms.
      
      I also tweaked the x87 code generation a bit while I was here; it is
      now slightly less bad than before.
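The point about operations being performed "at the correct precision" can be illustrated with a small sketch. It is not GHC code: it uses Python's `struct` module to round to IEEE single precision, and a double stands in for the wider x87 register format (an assumption made only so the effect is reproducible anywhere). The helper names are hypothetical.

```python
import struct


def to_single(x: float) -> float:
    """Round a Python float (an IEEE double) to the nearest IEEE single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def rounded_each_step(a: float, b: float) -> float:
    # Every intermediate is rounded to the declared (single) precision.
    t = to_single(a + b)
    return to_single(t - a)


def excess_precision(a: float, b: float) -> float:
    # Intermediates stay in a wider format; only the final result is rounded.
    return to_single((a + b) - a)


a = to_single(16777216.0)  # 2**24, exactly representable as a single
b = to_single(1.0)

print(rounded_each_step(a, b))  # 0.0: 2**24 + 1 does not fit in 24 bits
print(excess_precision(a, b))   # 1.0: the wider intermediate keeps the 1
```

The same expression gives two different answers depending on where rounding happens, which is why results computed with excess precision can disagree with those from other platforms.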
  2. 03 Feb, 2010 6 commits
  3. 02 Feb, 2010 1 commit
  4. 30 Jan, 2010 1 commit
  5. 01 Feb, 2010 1 commit
    • Fix Trac #3831: blowup in SpecConstr · 13c66820
      simonpj@microsoft.com authored
      It turned out that there were two bugs.  First, we were getting an
      exponential number of specialisations when we had a deep nest of
      join points.  See Note [Avoiding exponential blowup]. I fixed this
      by dividing sc_count (in ScEnv) by the number of specialisations
      when recursing.  Crude but effective.
      
      Second, when making specialisations I was looking at the result of
      applying specExpr to the RHS of the function, whereas I should have
      been looking at the original RHS.  See Note [Specialise original
      body].
      
      
      There's a tantalising missed opportunity here, though.  In this
      example (recorded as a test simplCore/should_compile/T3831), each join
      point has *exactly one* call pattern, so we should really just
      specialise for that alone, in which case there's zero code-blow-up.
      In particular, we don't need the *original* RHS at all.  I need to think
      more about how to exploit this.
      
      But the blowup is now limited, so compiling terminfo with -O2 works again.
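A toy model of the first fix, for intuition only. It is not the SpecConstr implementation: `count_specialisations`, its parameters, and the assumption that every join point has the same number of call patterns are all hypothetical. It only shows how dividing a per-function budget by the number of specialisations made, before recursing, turns exponential growth into something bounded.

```python
from typing import Optional


def count_specialisations(depth: int, patterns: int, budget: Optional[int]) -> int:
    """Total specialisations for a nest of `depth` join points (toy model).

    Each join point has `patterns` call patterns, and every specialised
    copy of a body contains the next join point in the nest.
    budget=None means no limit. Otherwise at most `budget` specialisations
    are made at this level, and the budget passed down is divided by the
    number actually made.
    """
    if depth == 0:
        return 0
    made = patterns if budget is None else min(patterns, budget)
    if made == 0:
        return 0
    child_budget = None if budget is None else budget // made
    return made + made * count_specialisations(depth - 1, patterns, child_budget)


print(count_specialisations(10, 2, None))  # 2046: doubles at every level
print(count_specialisations(10, 2, 3))     # 20: grows linearly once the budget hits 1
```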
  6. 29 Jan, 2010 1 commit
  7. 28 Jan, 2010 1 commit
    • tweak the totally-bogus arbitrary stack-squeezing heuristic to fix #2797 · 990171bf
      Simon Marlow authored
      In #2797, a program that ran in constant stack space when compiled
      needed linear stack space when interpreted.  It turned out to be
      nothing more than stack-squeezing not happening.  We have a heuristic
      to avoid stack-squeezing when it would be too expensive (shuffling a
      large amount of memory to save a few words), but in some cases even
      expensive stack-squeezing is necessary to avoid linear stack usage.
      One day we should implement stack chunks, which would make this less
      expensive.
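A minimal simulation of why skipping stack-squeezing costs linear stack. It is a sketch under simplifying assumptions, not the RTS code: the stack is a Python list, every loop iteration pushes one update frame, and squeezing collapses each run of adjacent update frames into one. The names `squeeze` and `peak_depth` are hypothetical, and the real heuristic (which weighs memory moved against words saved) is not modelled.

```python
def squeeze(stack):
    """Collapse each run of adjacent update frames into a single frame."""
    out = []
    for frame in stack:
        if frame == "UPD" and out and out[-1] == "UPD":
            continue  # redundant: the frame below already covers this update
        out.append(frame)
    return out


def peak_depth(steps: int, do_squeeze: bool) -> int:
    """Largest stack depth seen after `steps` iterations."""
    stack = []
    peak = 0
    for _ in range(steps):
        stack.append("UPD")  # one more thunk under evaluation
        if do_squeeze:
            stack = squeeze(stack)
        peak = max(peak, len(stack))
    return peak


print(peak_depth(1000, do_squeeze=True))   # 1: constant stack
print(peak_depth(1000, do_squeeze=False))  # 1000: linear stack
```

If a cost heuristic declines to squeeze every time, the program behaves like the second call, which matches the symptom described for #2797.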
  8. 27 Jan, 2010 9 commits
  9. 26 Jan, 2010 4 commits
  10. 22 Jan, 2010 6 commits
  11. 19 Jan, 2010 1 commit
  12. 20 Jan, 2010 3 commits
  13. 16 Dec, 2009 1 commit
    • FIX #2615 (linker scripts in .so files) · e020e387
      howard_b_golden@yahoo.com authored
      This patch does not apply to Windows. It only applies to systems with
      ELF binaries.
      
      This is a patch to rts/Linker.c to recognize linker scripts in .so
      files and find the real target .so shared library for loading.
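The idea can be sketched as follows. This is an illustration in Python, not the code in rts/Linker.c: the function name `real_target`, the regular expression, and the sample script text are assumptions. It treats a file that lacks the ELF magic number as a possible GNU ld script and takes the first file named in a `GROUP` or `INPUT` command as the library to load. Forms such as `INPUT(-lfoo)` are not handled.

```python
import re

ELF_MAGIC = b"\x7fELF"

# First file name inside a GROUP ( ... ) or INPUT ( ... ) command.
_SCRIPT_RE = re.compile(r"\b(?:GROUP|INPUT)\s*\(\s*([^\s()]+)")


def real_target(path: str, data: bytes) -> str:
    """Return the path of the shared object that should actually be loaded."""
    if data.startswith(ELF_MAGIC):
        return path  # already a real ELF file
    text = data.decode("ascii", errors="replace")
    match = _SCRIPT_RE.search(text)
    if match is None:
        raise ValueError(f"{path}: neither an ELF file nor a recognised linker script")
    return match.group(1)


script = (
    b"/* GNU ld script */\n"
    b"OUTPUT_FORMAT(elf64-x86-64)\n"
    b"GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a )\n"
)
print(real_target("/usr/lib/libc.so", script))  # /lib/libc.so.6
```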
  14. 20 Jan, 2010 2 commits
    • Fix Trac #3813: unused variables in GHCi bindings · 85f969a6
      simonpj@microsoft.com authored
      In a GHCi stmt we don't want to report unused variables,
      because we don't know the scope of the binding, e.g.
      
      	Prelude> x <- blah
      
      Fixing this needed a little more info about the context of the stmt,
      thus the new constructor GhciStmt in the HsStmtContext type.
    • Fix Trac #3823, plus warning police in TcRnDriver · dfa43eb4
      simonpj@microsoft.com authored
      The immediate reason for this patch is to fix #3823. This was 
      rather easy: all the work was being done but I was returning
      type_env2 rather than type_env3.  
      
      An unused-variable warning would have shown this up, so I fixed all
      the other warnings in TcRnDriver.  Doing so showed up at least two
      genuine lurking bugs.  Hurrah.
  15. 19 Jan, 2010 2 commits