1. 01 Oct, 2013 12 commits
  2. 30 Sep, 2013 2 commits
  3. 29 Sep, 2013 5 commits
  4. 28 Sep, 2013 3 commits
  5. 27 Sep, 2013 5 commits
  6. 26 Sep, 2013 1 commit
  7. 24 Sep, 2013 3 commits
  8. 23 Sep, 2013 9 commits
    • parcs's avatar
      Fix build when PROF_SPIN is unset · 9c11fdb9
      parcs authored
      whitehole_spin is only defined when PROF_SPIN is set.
    • parcs's avatar
      Fix the definition of cas() on x86 (#8219) · 84dff710
      parcs authored
      *p is both read and written to by the cmpxchg instruction, and therefore
      should be given the '+' constraint modifier.
      (In GCC's extended ASM language, '+' means that the operand is both read
      and written to whereas '=' means that it is only written to.)
      Otherwise, the compiler is allowed to rewrite something like
      SpinLock lock;
      initSpinLock(&lock);       /* sets lock = 1 */
      SpinLock lock;
      because according to the asm statement, the previous value of 'lock' is
      not important.
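The constraint change described above can be sketched as follows. This is a minimal illustration, not the RTS source: `StgWord` is a stand-in typedef, the spin-lock helpers are simplified and hypothetical, and the `__sync_val_compare_and_swap` fallback for non-x86 targets is an addition so the sketch compiles elsewhere.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t StgWord;   /* stand-in for the RTS word type (assumption) */
typedef StgWord SpinLock;    /* simplified: 1 = free, 0 = held */

/* Compare-and-swap: if *p == o, store n into *p.
 * Returns the value *p held before the operation. */
static inline StgWord cas(volatile StgWord *p, StgWord o, StgWord n)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__ (
        "lock; cmpxchg %3,%1"
        : "=a"(o), "+m"(*p)   /* '+': *p is read AND written by cmpxchg */
        : "0"(o), "r"(n)
        : "cc", "memory");
    return o;
#else
    /* Fallback using a GCC/Clang builtin (not part of the commit). */
    return __sync_val_compare_and_swap(p, o, n);
#endif
}

/* Hypothetical simplified helpers, named after the ones in the message. */
static void initSpinLock(volatile SpinLock *l) { *l = 1; }

static int tryAcquireSpinLock(volatile SpinLock *l)
{
    /* Succeeds only if the lock was free (1); leaves it held (0). */
    return cas(l, 1, 0) == 1;
}

static void releaseSpinLock(volatile SpinLock *l)
{
    assert(*l == 0);          /* releasing a lock that is not held is a bug */
    *l = 1;
}
```

With `"=m"` in place of `"+m"`, the compiler would be told that the prior contents of `*p` do not matter, which is what permits dropping the store made by `initSpinLock`.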
    • Krzysztof Gogolewski's avatar
      Remove fglasgow-exts from ghci --help · 93a04b49
      Krzysztof Gogolewski authored
      It has long been deprecated and has already been removed from ghc --help.
    • Simon Marlow's avatar
      Fix linker_unload now that we are running constructors in the linker (#8291) · 19081952
      Simon Marlow authored
      See also #5435.
      Now we have to remember the StablePtrs that get created by the
      module initializer so that we can free them again in unloadObj().
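The bookkeeping described above might look roughly like the sketch below. Every name here (`ObjectCode`, `StablePtrNode`, and the `sketch_*` functions) is a hypothetical stand-in chosen for illustration, not the linker's actual API; the counter exists only so the example can check that nothing leaks.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *StablePtr;              /* hypothetical stand-in type */

typedef struct StablePtrNode {
    StablePtr             ptr;
    struct StablePtrNode *next;
} StablePtrNode;

typedef struct {
    StablePtrNode *stable_ptrs;       /* created while this object initialized */
} ObjectCode;

static int live_stable_ptrs = 0;      /* demo-only leak counter */

static StablePtr sketch_new_stable_ptr(void)
{
    StablePtr sp = malloc(1);
    if (sp == NULL) abort();
    live_stable_ptrs++;
    return sp;
}

static void sketch_free_stable_ptr(StablePtr sp)
{
    assert(live_stable_ptrs > 0);
    free(sp);
    live_stable_ptrs--;
}

/* Called whenever the module initializer creates a stable pointer:
 * record it against the object being loaded. */
static void sketch_remember(ObjectCode *oc, StablePtr sp)
{
    StablePtrNode *node = malloc(sizeof *node);
    if (node == NULL) abort();
    node->ptr = sp;
    node->next = oc->stable_ptrs;
    oc->stable_ptrs = node;
}

/* On unload, free every stable pointer recorded for this object. */
static void sketch_unload_obj(ObjectCode *oc)
{
    while (oc->stable_ptrs != NULL) {
        StablePtrNode *node = oc->stable_ptrs;
        oc->stable_ptrs = node->next;
        sketch_free_stable_ptr(node->ptr);
        free(node);
    }
}
```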
    • Simon Marlow's avatar
      Discard unreachable code in the register allocator (#7574) · f5879acd
      Simon Marlow authored
      The problem with unreachable code is that it might refer to undefined
      registers.  This happens accidentally: a block can be orphaned by an
      optimisation, for example when the result of a comparison becomes
      The register allocator panics when it finds an undefined register,
      because they shouldn't occur in generated code.  So we need to also
      discard unreachable code to prevent this panic being triggered by
      The register allocator already does a strongly-connected component
      analysis, so it ought to be easy to make it discard unreachable code
      as part of that traversal.  It turns out that we need a different
      variant of the SCC algorithm to do that (see Digraph); however, the new
      variant also generates slightly better code by putting the blocks
      within a loop in a better order for register allocation.
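To illustrate the idea of dropping orphaned blocks, here is a minimal sketch in C. It is a plain depth-first reachability walk from the entry block, not the SCC variant in Digraph that the commit refers to, and the `Block` layout, the size limits, and the function names are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 64
#define MAX_SUCCS  2

typedef struct {
    int    id;                 /* block label */
    size_t nsuccs;             /* number of successors in use */
    size_t succs[MAX_SUCCS];   /* indices of successor blocks */
} Block;

/* Mark every block reachable from `entry`; returns how many were marked. */
static size_t mark_reachable(const Block *blocks, size_t n, size_t entry,
                             bool *reachable)
{
    size_t stack[MAX_BLOCKS];  /* each block is pushed at most once */
    size_t top = 0, count = 0;

    assert(n <= MAX_BLOCKS && entry < n);
    for (size_t i = 0; i < n; i++) reachable[i] = false;

    reachable[entry] = true;
    stack[top++] = entry;
    while (top > 0) {
        size_t b = stack[--top];
        count++;
        for (size_t i = 0; i < blocks[b].nsuccs; i++) {
            size_t s = blocks[b].succs[i];
            if (s < n && !reachable[s]) {
                reachable[s] = true;
                stack[top++] = s;
            }
        }
    }
    return count;
}

/* Write the ids of the reachable blocks, in original order, to `kept`.
 * Unreachable blocks are simply never handed on to later stages. */
static size_t discard_unreachable(const Block *blocks, size_t n, size_t entry,
                                  int *kept)
{
    bool reachable[MAX_BLOCKS];
    size_t out = 0;

    mark_reachable(blocks, n, entry, reachable);
    for (size_t i = 0; i < n; i++)
        if (reachable[i]) kept[out++] = blocks[i].id;
    return out;
}
```

A block that nothing jumps to is never visited by the walk, so any undefined registers it mentions never reach the allocator.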
    • Krzysztof Gogolewski's avatar
      Typos · be3b84f3
      Krzysztof Gogolewski authored
    • gmainlan@microsoft.com's avatar
      Merge branch 'wip/simd' · 680441de
      gmainlan@microsoft.com authored
      This merge revises and extends the current SIMD support in GHC. Notable
       * Support for AVX, AVX2, and AVX-512. Support for AVX-512 is untested.
       * SIMD primops are currently LLVM-only and documented in
       * By default only 128-bit wide SIMD vectors are passed in registers, and then
         only on the X86_64 architecture. There is a "hidden" flag,
         -fllvm-pass-vectors-in-regs, that causes GHC to generate LLVM code that
         assumes all vectors are passed in registers by LLVM. This can be used with a
         suitably patched version of LLVM, and if we get LLVM 3.4 patched, we can
         consider turning it on by default for LLVM 3.4+. This would mean that we
         couldn't mix LLVM <3.4-compiled object files with LLVM >=3.4-compiled object
         files, but I don't see that as much of a problem.
       * utils/genprimcode has been hacked up to allow us to write vector operations
         once and have them instantiated at multiple vector types. I'm not thrilled
         with this solution, but after discussing with Simon PJ, what I've implemented
         seems to be the minimal reasonable solution to the problem of exploding
         primop boilerplate. The changes are documented in
       * Error handling is sub-optimal. My patch checks to make sure that vector
         primops can be compiled efficiently based on the current set of dynamic
         flags. For example, if -mavx is not specified and the user tries to use a
         primop that adds together two 256-bit wide vectors of double-precision
         elements, the user will see an error message like:
           ghc-stage2: sorry! (unimplemented feature or known bug)
             (GHC version 7.7.20130916 for x86_64-unknown-linux):
              256-bit wide floating point SIMD vector instructions require at least -mavx.