1. 01 Oct, 2013 4 commits
  2. 30 Sep, 2013 2 commits
  3. 29 Sep, 2013 5 commits
  4. 28 Sep, 2013 3 commits
  5. 27 Sep, 2013 5 commits
  6. 26 Sep, 2013 1 commit
  7. 24 Sep, 2013 3 commits
  8. 23 Sep, 2013 17 commits
    • Fix build when PROF_SPIN is unset · 9c11fdb9
      parcs authored
      whitehole_spin is only defined when PROF_SPIN is set.
    • Fix the definition of cas() on x86 (#8219) · 84dff710
      parcs authored
      *p is both read and written to by the cmpxchg instruction, and therefore
      should be given the '+' constraint modifier.
      (In GCC's extended ASM language, '+' means that the operand is both read
      and written to whereas '=' means that it is only written to.)
      Otherwise, the compiler is allowed to rewrite something like
        SpinLock lock;
        initSpinLock(&lock);       /* sets lock = 1 */
      into
        SpinLock lock;
      because according to the asm statement, the previous value of 'lock' is
      not important.
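The constraint fix described above can be illustrated with a minimal sketch. This is not the RTS source: the names `word_t` and `cas_sketch` are hypothetical, and the fallback branch uses the GCC/Clang `__sync_val_compare_and_swap` builtin so that the sketch also compiles on non-x86 targets. The point is the `"+m" (*p)` operand, which tells the compiler that the memory location is read as well as written, so an earlier store to it cannot be discarded.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the RTS word type. */
typedef uintptr_t word_t;

/* Compare-and-swap sketch: if *p == expected, store desired into *p.
 * Returns the value that *p held before the operation. */
static inline word_t cas_sketch(volatile word_t *p, word_t expected, word_t desired)
{
    assert(p != 0);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ __volatile__ (
        "lock; cmpxchg %2, %1"
        : "+a" (expected),  /* accumulator: compared on entry, old *p on exit */
          "+m" (*p)         /* '+': *p is both read and written by cmpxchg    */
        : "r" (desired)
        : "cc", "memory");
    return expected;
#else
    /* Assumption: a GCC-compatible compiler providing this builtin. */
    return __sync_val_compare_and_swap(p, expected, desired);
#endif
}
```

With `"=m" (*p)` in place of `"+m" (*p)`, the compiler would be told that the old contents of `*p` are irrelevant to the asm statement, which is what permits the rewrite shown in the commit message.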
    • Remove fglasgow-exts from ghci --help · 93a04b49
      Krzysztof Gogolewski authored
      It has long been deprecated and was already removed from ghc --help.
    • Fix linker_unload now that we are running constructors in the linker (#8291) · 19081952
      Simon Marlow authored
      See also #5435.
      Now we have to remember the StablePtrs that get created by the
      module initializer so that we can free them again in unloadObj().
    • Discard unreachable code in the register allocator (#7574) · f5879acd
      Simon Marlow authored
      The problem with unreachable code is that it might refer to undefined
      registers.  This happens accidentally: a block can be orphaned by an
      optimisation, for example when the result of a comparison becomes
      known.
      The register allocator panics when it finds an undefined register,
      because they shouldn't occur in generated code.  So we need to also
      discard unreachable code to prevent this panic being triggered by
      such code.
      The register allocator already does a strongly-connected component
      analysis, so it ought to be easy to make it discard unreachable code
      as part of that traversal.  It turns out that we need a different
      variant of the scc algorithm to do that (see Digraph), however the new
      variant also generates slightly better code by putting the blocks
      within a loop in a better order for register allocation.
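The idea of discarding orphaned blocks can be sketched as a plain reachability pass from the entry block. This is an illustration only, not the SCC variant the commit adds to Digraph: the `Block` type, the fixed-size limits, and `mark_reachable` are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16  /* hypothetical limit for the sketch */
#define MAX_SUCCS  2   /* a block ends in at most a two-way branch here */

/* Hypothetical basic block: only its successor edges matter for this pass. */
typedef struct {
    int succs[MAX_SUCCS];
    int nsuccs;
} Block;

/* Depth-first walk from the entry block. After the call, reachable[i] is
 * true exactly for blocks that can be reached from entry; the rest are
 * orphaned and can be dropped before register allocation. */
static void mark_reachable(const Block *blocks, int nblocks, int entry,
                           bool *reachable)
{
    int stack[MAX_BLOCKS];
    int top = 0;

    assert(nblocks > 0 && nblocks <= MAX_BLOCKS);
    assert(entry >= 0 && entry < nblocks);

    for (int i = 0; i < nblocks; i++)
        reachable[i] = false;

    reachable[entry] = true;
    stack[top++] = entry;

    while (top > 0) {
        int b = stack[--top];
        for (int i = 0; i < blocks[b].nsuccs; i++) {
            int s = blocks[b].succs[i];
            if (!reachable[s]) {
                /* Each block is pushed at most once, so the stack
                 * never holds more than nblocks entries. */
                reachable[s] = true;
                stack[top++] = s;
            }
        }
    }
}
```

An SCC-based traversal that starts from the entry block visits the same set of blocks, which is why the commit can fold the discard step into the analysis the allocator already performs.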
    • Typos · be3b84f3
      Krzysztof Gogolewski authored
    • Merge branch 'wip/simd' · 680441de
      gmainlan@microsoft.com authored
      This merge revises and extends the current SIMD support in GHC. Notable
      features:
       * Support for AVX, AVX2, and AVX-512. Support for AVX-512 is untested.
       * SIMD primops are currently LLVM-only and documented in
       * By default only 128-bit wide SIMD vectors are passed in registers, and then
         only on the X86_64 architecture. There is a "hidden" flag,
         -fllvm-pass-vectors-in-regs, that causes GHC to generate LLVM code that
         assumes all vectors are passed in registers by LLVM. This can be used with a
         suitably patched version of LLVM, and if we get LLVM 3.4 patched, we can
         consider turning it on by default for LLVM 3.4+. This would mean that we
         couldn't mix LLVM <3.4-compiled object files with LLVM >=3.4-compiled object
         files, but I don't see that as much of a problem.
       * utils/genprimcode has been hacked up to allow us to write vector operations
         once and have them instantiated at multiple vector types. I'm not thrilled
         with this solution, but after discussing with Simon PJ, what I've implemented
         seems to be the minimal reasonable solution to the problem of exploding
         primop boilerplate. The changes are documented in
       * Error handling is sub-optimal. My patch checks to make sure that vector
         primops can be compiled efficiently based on the current set of dynamic
         flags. For example, if -mavx is not specified and the user tries to use a
         primop that adds together two 256-bit wide vectors of double-precision
         elements, the user will see an error message like:
           ghc-stage2: sorry! (unimplemented feature or known bug)
             (GHC version 7.7.20130916 for x86_64-unknown-linux):
                256-bit wide floating point SIMD vector instructions require at least -mavx.
    • Check that SIMD vector instructions are compatible with current set of dynamic flags. · 25eeb678
      gmainlan@microsoft.com authored
      SIMD vector instructions currently require the LLVM back-end. The set of
      available instructions also depends on the set of architecture flags specified
      on the command line.
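A compatibility check of this kind might look roughly like the sketch below. Only two rules come from the commit messages in this log: the LLVM back-end requirement, and 256-bit floating point vectors needing at least -mavx. The AVX2 rule for 256-bit integer vectors and the -mavx512f rule for 512-bit vectors are assumptions based on what those instruction sets provide, and 128-bit SSE checks are left out. `SimdFlags`, `SimdCheck`, and `check_vector_op` are hypothetical names; GHC performs the real check in Haskell against its dynamic flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the relevant command-line flags. */
typedef struct {
    bool llvm_backend;  /* LLVM back-end selected */
    bool avx;           /* -mavx or better */
    bool avx2;          /* -mavx2 or better */
    bool avx512f;       /* assumption: -mavx512f gates 512-bit vectors */
} SimdFlags;

typedef enum {
    SIMD_OK,
    SIMD_NEEDS_LLVM,
    SIMD_NEEDS_AVX,
    SIMD_NEEDS_AVX2,
    SIMD_NEEDS_AVX512F
} SimdCheck;

/* Decide whether a vector operation of the given width can be compiled
 * under the given flags, and if not, which requirement is missing. */
static SimdCheck check_vector_op(const SimdFlags *f, int width_bits, bool is_float)
{
    assert(width_bits == 128 || width_bits == 256 || width_bits == 512);

    if (!f->llvm_backend)
        return SIMD_NEEDS_LLVM;
    if (width_bits == 256 && is_float && !f->avx)
        return SIMD_NEEDS_AVX;      /* rule quoted in the merge message */
    if (width_bits == 256 && !is_float && !f->avx2)
        return SIMD_NEEDS_AVX2;     /* assumption, see lead-in */
    if (width_bits == 512 && !f->avx512f)
        return SIMD_NEEDS_AVX512F;  /* assumption, see lead-in */
    return SIMD_OK;
}
```

Returning an enum rather than a message keeps the sketch from inventing error text; a caller would map `SIMD_NEEDS_AVX` to the "require at least -mavx" message quoted in the merge commit.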
    • Enable -msse to be specified by itself. · 1ed36c54
      gmainlan@microsoft.com authored
      This sets the SSE "version" to 1.0.
    • By default, only pass 128-bit SIMD vectors in registers on X86-64. · d2b95264
      gmainlan@microsoft.com authored
      LLVM's GHC calling convention only allows 128-bit SIMD vectors to be passed in
      machine registers on X86-64. This may change in LLVM 3.4; the hidden flag
      -fllvm-pass-vectors-in-regs causes all SIMD vector widths to be passed in
      registers on both X86-64 and on X86-32.
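The default and the effect of the hidden flag, as described in this commit message, can be summarised in a small decision function. This is a sketch of the stated policy only: `Arch` and `vector_passed_in_register` are hypothetical names, and the behaviour on architectures other than x86 is assumed to be "never in registers" because the message mentions only X86-64 and X86-32.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical architecture tag for the sketch. */
typedef enum { ARCH_X86_32, ARCH_X86_64, ARCH_OTHER } Arch;

/* Returns true if a SIMD vector of width_bits is passed in a machine
 * register. pass_vectors_in_regs models -fllvm-pass-vectors-in-regs. */
static bool vector_passed_in_register(Arch arch, int width_bits,
                                      bool pass_vectors_in_regs)
{
    assert(width_bits == 128 || width_bits == 256 || width_bits == 512);

    if (pass_vectors_in_regs) {
        /* With the hidden flag, every width goes in registers on x86. */
        return arch == ARCH_X86_64 || arch == ARCH_X86_32;
    }
    /* Default: only 128-bit vectors, and only on X86-64. */
    return arch == ARCH_X86_64 && width_bits == 128;
}
```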
    • Add 512-bit-wide SIMD primitives. · 7dda67b9
      gmainlan@microsoft.com authored
    • Add support for -mavx512* flags. · 03e33c92
      gmainlan@microsoft.com authored