1. 20 Jul, 2015 2 commits
  2. 27 Mar, 2015 1 commit
    • Remove some unimplemented GranSim primops · 90dd11bf
      rwbarton authored
      Summary:
      An attempt to use these resulted in an error like:
      
      [1 of 1] Compiling Main             ( p.hs, p.o )
      ghc: panic! (the 'impossible' happened)
        (GHC version 7.8.4 for x86_64-unknown-linux):
      	emitPrimOp: can't translate PrimOp  parAt#{v}
      
      Test Plan: validate
      
      Reviewers: thomie, austin
      
      Reviewed By: thomie, austin
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D758
      90dd11bf
  3. 15 Dec, 2014 1 commit
    • Changing prefetch primops to have a `seq`-like interface · f44333ea
      Carter Schonwald authored
      Summary:
      The current primops for prefetching do not work properly in pure code;
      namely, the primops are not 'hoisted' into the correct call sites based
      on when arguments are evaluated. Instead, they should use a `seq`-like
      interface, so that a prefetch is issued when the term that needs it is
      evaluated.
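
A minimal sketch of what a sequenced prefetch can look like from user code. It assumes the `State#`-threaded signature that current `GHC.Exts` exports for `prefetchValue0#`; the helpers `prefetchIO` and `sumAfterPrefetch` are hypothetical names used only for illustration, not part of this commit.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (prefetchValue0#)
import GHC.IO (IO(..))

-- Hypothetical helper: issue a prefetch hint for a value at a fixed point
-- in the IO sequence. Threading the State# token is what orders the hint
-- relative to the surrounding actions.
prefetchIO :: a -> IO ()
prefetchIO x = IO (\s -> (# prefetchValue0# x s, () #))

-- Hypothetical use: hint that the list will be needed, then consume it.
-- The prefetch is only a hint; the result does not depend on it.
sumAfterPrefetch :: [Int] -> IO Int
sumAfterPrefetch xs = do
  prefetchIO xs
  return (sum xs)
```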
      
      See #9353 for the full discussion.
      
      Test Plan: updated tests for pure prefetch in T8256 to reflect the design changes in #9353
      
      Reviewers: simonmar, hvr, ekmett, austin
      
      Reviewed By: ekmett, austin
      
      Subscribers: merijn, thomie, carter, simonmar
      
      Differential Revision: https://phabricator.haskell.org/D350
      
      GHC Trac Issues: #9353
      f44333ea
  4. 17 Sep, 2014 1 commit
    • Implement `decodeDouble_Int64#` primop · b62bd5ec
      Herbert Valerio Riedel authored
      The existing `decodeDouble_2Int#` primop is rather inconvenient to use
      (and in fact is not even used by `integer-gmp`) as the mantissa is split
      into 3 components which would actually fit in an `Int64#` value.
      
      However, `decodeDouble_Int64#` is to be used by the new `integer-gmp2`
      re-implementation (see #9281).
      
      Moreover, `decodeDouble_2Int#` performs direct bit-wise operations on the
      IEEE representation which can be replaced by a combination of the
      portable standard C99 `scalbn(3)` and `frexp(3)` functions.
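
The mantissa/exponent decomposition described above can be observed from ordinary code through the Prelude's `decodeFloat`. This is only an illustration of the decomposition, not of the primop itself; the names `mantissaExponent`, `rebuild` and `mantissaFitsInt64` are hypothetical.

```haskell
-- Split a Double into an integer mantissa and a base-2 exponent,
-- so that the value equals mantissa * 2^exponent.
mantissaExponent :: Double -> (Integer, Int)
mantissaExponent = decodeFloat

-- Rebuild the Double from its parts.
rebuild :: (Integer, Int) -> Double
rebuild (m, e) = encodeFloat m e

-- A Double carries 53 significant bits, so the mantissa magnitude
-- stays below 2^53 and fits comfortably in a 64-bit integer.
mantissaFitsInt64 :: Double -> Bool
mantissaFitsInt64 d = abs (fst (decodeFloat d)) < 2 ^ (53 :: Int)
```

For example, `mantissaExponent 1.5` evaluates to `(6755399441055744,-52)`, that is 3 * 2^51 scaled by 2^-52.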
      
      Differential Revision: https://phabricator.haskell.org/D160
      b62bd5ec
  5. 13 Sep, 2014 2 commits
    • Detabify primops.txt.pp · 2cd76c15
      Herbert Valerio Riedel authored
      2cd76c15
    • Move docstring of `seq` to primops.txt.pp · abff2ffd
      Herbert Valerio Riedel authored
      The documentation for `seq` was recently augmented via #9390 &
      cbfa1076. However, it doesn't show
      up in the Haddock generated docs because `#ifdef __HADDOCK__` doesn't
      work as expected.  Also, it's easier to just fix the problem at the
      origin (which in this case is the primops.txt.pp file).
      
      The benefit/downside of this is that now the extended documentation
      shows up everywhere `seq` is re-exported directly.
      abff2ffd
  6. 16 Aug, 2014 1 commit
    • Implement {resize,shrink}MutableByteArray# primops · 246436f1
      Herbert Valerio Riedel authored
      The two new primops with the type-signatures
      
        resizeMutableByteArray# :: MutableByteArray# s -> Int#
                                -> State# s -> (# State# s, MutableByteArray# s #)
      
        shrinkMutableByteArray# :: MutableByteArray# s -> Int#
                                -> State# s -> State# s
      
      allow resizing MutableByteArray#s in-place (when possible), and are useful
      for algorithms where memory is temporarily over-allocated. The motivating
      use-case is for implementing integer backends, where the final target size of
      the result is either N or N+1, and only known after the operation has been
      performed.
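
The over-allocate-then-trim pattern can be sketched as follows, using the two primops as exported by `GHC.Exts`. The wrappers `shrunkSize` and `grownSize` are hypothetical, and the sketch reads the size back with `getSizeofMutableByteArray#`, which assumes a GHC recent enough to provide that stateful size query.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.ST (ST(..), runST)

-- Over-allocate n+1 bytes, shrink in place to n, and report the final size.
shrunkSize :: Int -> Int
shrunkSize (I# n) = runST (ST (\s0 ->
  case newByteArray# (n +# 1#) s0 of
    (# s1, mba #) ->
      case shrinkMutableByteArray# mba n s1 of
        s2 -> case getSizeofMutableByteArray# mba s2 of
          (# s3, sz #) -> (# s3, I# sz #)))

-- Grow from n to 2n bytes. The array returned by the resize must be used
-- from then on, because the resize may or may not happen in place.
grownSize :: Int -> Int
grownSize (I# n) = runST (ST (\s0 ->
  case newByteArray# n s0 of
    (# s1, mba #) ->
      case resizeMutableByteArray# mba (n *# 2#) s1 of
        (# s2, mba' #) -> case getSizeofMutableByteArray# mba' s2 of
          (# s3, sz #) -> (# s3, I# sz #)))
```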
      
      A future commit will implement a stateful variant of the
      `sizeofMutableByteArray#` operation (see #9447 for details), since now the
      size of a `MutableByteArray#` may change over its lifetime (i.e. before
      it gets frozen or GCed).
      
      Test Plan: ./validate --slow
      
      Reviewers: ezyang, austin, simonmar
      
      Reviewed By: austin, simonmar
      
      Differential Revision: https://phabricator.haskell.org/D133
      246436f1
  7. 14 Aug, 2014 1 commit
    • Implement new CLZ and CTZ primops (re #9340) · e0c1767d
      Herbert Valerio Riedel authored
      This implements the new primops
      
        clz#, clz32#, clz64#,
        ctz#, ctz32#, ctz64#
      
      which provide efficient implementations of the popular
      count-leading-zeros and count-trailing-zeros operations, respectively
      (see testcase for a pure Haskell reference implementation).
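
A naive pure-Haskell reference in the same spirit might look like the sketch below. It is not the testcase from the commit; `clzRef`, `ctzRef` and `agrees` are hypothetical names, and the comparison uses `countLeadingZeros` and `countTrailingZeros` from `Data.Bits` (available in newer versions of base) as the library-level counterpart.

```haskell
import Data.Bits (countLeadingZeros, countTrailingZeros, testBit)
import Data.Word (Word64)

-- Naive reference: count zero bits starting from the most significant bit.
-- Yields 64 for an all-zero word.
clzRef :: Word64 -> Int
clzRef w = length (takeWhile not [testBit w i | i <- [63, 62 .. 0]])

-- Naive reference: count zero bits starting from the least significant bit.
-- Yields 64 for an all-zero word.
ctzRef :: Word64 -> Int
ctzRef w = length (takeWhile not [testBit w i | i <- [0 .. 63]])

-- True when the naive versions agree with the Data.Bits functions.
agrees :: Word64 -> Bool
agrees w = clzRef w == countLeadingZeros w && ctzRef w == countTrailingZeros w
```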
      
      On x86, NCG as well as LLVM generates code based on the BSF/BSR
      instructions (which need extra logic to make the 0-case well-defined).
      
      Test Plan: validate and successful tests on i686 and amd64
      
      Reviewers: rwbarton, simonmar, ezyang, austin
      
      Subscribers: simonmar, relrod, ezyang, carter
      
      Differential Revision: https://phabricator.haskell.org/D144
      
      GHC Trac Issues: #9340
      e0c1767d
  8. 10 Aug, 2014 1 commit
  9. 07 Aug, 2014 1 commit
    • Add has_side_effects to the raise# primop · 0957a9b0
      Simon Peyton Jones authored
      According to the definition of has_side_effects in PrimOp,
      raise# clearly has side effects!  In practice it makes little
      difference because the fact that it returns bottom is more
      important... but still it's better to say it right.
      0957a9b0
  10. 30 Jun, 2014 1 commit
    • Re-add more primops for atomic ops on byte arrays · 4ee4ab01
      tibbe authored
      This is the second attempt to add this functionality. The first
      attempt was reverted in 950fcae4, due
      to register allocator failure on x86. Given how the register
      allocator currently works, we don't have enough registers on x86 to
      support cmpxchg using complicated addressing modes. Instead we fall
      back to a simpler addressing mode on x86.
      
      Adds the following primops:
      
       * atomicReadIntArray#
       * atomicWriteIntArray#
       * fetchSubIntArray#
       * fetchOrIntArray#
       * fetchXorIntArray#
       * fetchAndIntArray#
      
      Makes these pre-existing out-of-line primops inline:
      
       * fetchAddIntArray#
       * casIntArray#
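
A small sketch of how some of these primops compose, using the signatures that current `GHC.Exts` exports. `atomicDemo` is a hypothetical name, and the single-threaded setting only shows the calling convention, not any concurrency guarantee.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.ST (ST(..), runST)

-- One Int-sized slot: start at 0, atomically add 5, atomically subtract 2,
-- then compare-and-swap 3 -> 10. Returns the value seen by the CAS and the
-- value read back afterwards.
atomicDemo :: (Int, Int)
atomicDemo = runST (ST (\s0 ->
  case newByteArray# 8# s0 of          -- 8 bytes: room for one Int slot
    (# s1, mba #) ->
      case writeIntArray# mba 0# 0# s1 of
        s2 -> case fetchAddIntArray# mba 0# 5# s2 of
          (# s3, _ #) -> case fetchSubIntArray# mba 0# 2# s3 of
            (# s4, _ #) -> case casIntArray# mba 0# 3# 10# s4 of
              (# s5, old #) -> case atomicReadIntArray# mba 0# s5 of
                (# s6, v #) -> (# s6, (I# old, I# v) #)))
```

Here the CAS expects 3 (0 + 5 - 2), so it succeeds and the slot ends up holding 10.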
      4ee4ab01
  11. 26 Jun, 2014 1 commit
  12. 24 Jun, 2014 1 commit
    • Add more primops for atomic ops on byte arrays · d8abf85f
      tibbe authored
      Summary:
      Add more primops for atomic ops on byte arrays
      
      Adds the following primops:
      
       * atomicReadIntArray#
       * atomicWriteIntArray#
       * fetchSubIntArray#
       * fetchOrIntArray#
       * fetchXorIntArray#
       * fetchAndIntArray#
      
      Makes these pre-existing out-of-line primops inline:
      
       * fetchAddIntArray#
       * casIntArray#
      d8abf85f
  13. 11 Jun, 2014 1 commit
    • Fix #9097. · 051d694f
      eir@cis.upenn.edu authored
      `Any` is now an abstract (that is, no equations) closed type family.
      051d694f
  14. 10 Jun, 2014 1 commit
  15. 04 May, 2014 1 commit
  16. 29 Mar, 2014 1 commit
    • Add SmallArray# and SmallMutableArray# types · 90329b6c
      tibbe authored
      These array types are smaller than Array# and MutableArray# and are
      faster when the array size is small, as they don't have the overhead
      of a card table. Having no card table reduces the closure size by 2
      words in the typical small array case and leads to less work when
      updating or GCing the array.
      
      Reduces both the runtime and memory allocation by 8.8% on my insert
      benchmark for the HashMap type in the unordered-containers package,
      which makes use of lots of small arrays. With tuned GC settings
      (i.e. `+RTS -A6M`) the runtime reduction is 15%.
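
For orientation, a minimal sketch of the small-array primops as exported by `GHC.Exts`; `smallDemo` is a hypothetical name and the example does not measure the performance effect described above.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.ST (ST(..), runST)

-- Build a 3-element small array filled with "a", overwrite slot 1,
-- freeze it, and return its size together with the element at index 1.
smallDemo :: (Int, String)
smallDemo = runST (ST (\s0 ->
  case newSmallArray# 3# "a" s0 of
    (# s1, marr #) ->
      case writeSmallArray# marr 1# "b" s1 of
        s2 -> case unsafeFreezeSmallArray# marr s2 of
          (# s3, arr #) -> case indexSmallArray# arr 1# of
            (# x #) -> (# s3, (I# (sizeofSmallArray# arr), x) #)))
```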
      
      Fixes #8923.
      90329b6c
  17. 28 Mar, 2014 1 commit
    • Make copy array ops out-of-line by default · e54828bf
      tibbe authored
      This should reduce code size when there's little to gain from inlining
      these primops, while still retaining the inlining benefit when the
      size of the copy is known statically.
      e54828bf
  18. 22 Mar, 2014 1 commit
    • codeGen: inline allocation optimization for clone array primops · 1eece456
      tibbe authored
      The inline allocation version is 69% faster than the out-of-line
      version, when cloning an array of 16 unit elements on a 64-bit
      machine.
      
      Comparing the new and the old primop implementations isn't
      straightforward. The old version had a missing heap check that I
      discovered during the development of the new version. Comparing the
      old and the new version would require fixing the old version, which
      in turn means reimplementing the equivalent of MAYBE_CG in StgCmmPrim.
      
      The inline allocation threshold is configurable via
      -fmax-inline-alloc-size which gives the maximum array size, in bytes,
      to allocate inline. The size does not include the closure header size.
      
      Allowing the same primop to be either inline or out-of-line has some
      implications for how we lay out heap checks. We always place a heap
      check around out-of-line primops, as they may allocate outside of our
      knowledge. However, for the inline primops we only allow allocation
      via the standard means (i.e. virtHp). Since the clone primops might be
      either inline or out-of-line the heap check layout code now consults
      shouldInlinePrimOp to know whether a primop will be inlined.
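
The benchmark case mentioned above, cloning an array of 16 unit elements, can be written roughly as below. `cloneDemo` is a hypothetical name; whether this particular call is compiled to the inline allocation path is an assumption that depends on the GHC version and on `-fmax-inline-alloc-size`.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.ST (ST(..), runST)

-- Allocate 16 unit elements, freeze, clone the whole range with a
-- statically known length, and return the size of the clone.
cloneDemo :: Int
cloneDemo = runST (ST (\s0 ->
  case newArray# 16# () s0 of
    (# s1, marr #) -> case unsafeFreezeArray# marr s1 of
      (# s2, arr #) ->
        case cloneArray# arr 0# 16# of   -- source, offset, length
          copy -> (# s2, I# (sizeofArray# copy) #)))
```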
      1eece456
  19. 16 Mar, 2014 1 commit
  20. 13 Mar, 2014 1 commit
    • Improve copy/clone array primop docs · ed2a8f07
      tibbe authored
      Clarify the order of the arguments. Also, remove any use of # in the
      comments, which would make the rest of that comment line disappear in
      the docs, due to being treated as a comment by the preprocessor.
      ed2a8f07
  21. 30 Jan, 2014 1 commit
  22. 09 Jan, 2014 1 commit
  23. 12 Dec, 2013 1 commit
  24. 09 Dec, 2013 1 commit
  25. 29 Nov, 2013 1 commit
  26. 03 Nov, 2013 1 commit
  27. 02 Nov, 2013 2 commits
  28. 02 Oct, 2013 1 commit
  29. 01 Oct, 2013 1 commit
  30. 23 Sep, 2013 4 commits
  31. 18 Sep, 2013 2 commits
    • Restore old names of comparison primops · 53948f91
      Jan Stolarek authored
      In 6579a6c7 we removed existing comparison primops and introduced new ones
      returning Int# instead of Bool. This commit (and associated commits in
      array, base, dph, ghc-prim, integer-gmp, integer-simple, primitive, testsuite and
      template-haskell) restores old names of primops. This allows us to keep
      our API cleaner at the price of not having backwards compatibility.
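
With the old names now returning `Int#`, call sites that want a `Bool` go through `isTrue#`, while the raw `Int#` result can be used arithmetically. A brief sketch, with `lessThan` and `countTrue` as hypothetical helper names:

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int(I#), isTrue#, (<#), (==#), (+#))

-- Convert the Int# result (1# or 0#) of a comparison primop to a Bool.
lessThan :: Int -> Int -> Bool
lessThan (I# a) (I# b) = isTrue# (a <# b)

-- Use the Int# results directly: count how many of the two conditions
-- (a < b) and (a == 0) hold, without going through Bool.
countTrue :: Int -> Int -> Int
countTrue (I# a) (I# b) = I# ((a <# b) +# (a ==# 0#))
```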
      
      This patch also temporarily disables the fix for #8317 (optimization of
      tagToEnum# at Core level). We need to fix #8326 first, otherwise
      our primops code will be very slow.
      53948f91
    • Trailing whitespaces · 6eec7bc5
      Jan Stolarek authored
      6eec7bc5
  32. 15 Sep, 2013 2 commits
    • Fix the type signatures of new copy primops. · bb532682
      Austin Seipp authored
      
      
      They claimed to work over 'ST RealWorld', when instead they should be
      parameterized in the state type. This fixes cgrun070.

      Signed-off-by: Austin Seipp <austin@well-typed.com>
      bb532682
    • New primops for byte range copies ByteArray# <-> Addr# · f11289f6
      Duncan Coutts authored
      
      
      We have primops for copying ranges of bytes between ByteArray#s:
       * ByteArray# -> MutableByteArray#
       * MutableByteArray# -> MutableByteArray#
      This extends it with three further cases:
       * Addr# -> MutableByteArray#
       * ByteArray# -> Addr#
       * MutableByteArray# -> Addr#
      One use case for these is copying between ForeignPtr-based
      representations and in-heap arrays (like Text, UArray etc).
      
      The implementation is essentially the same as for the existing
      primops, and shares the memcpy stuff in the code generators.
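
As an illustration of the `Addr# -> MutableByteArray#` case, the sketch below copies the bytes of a primitive string literal into a fresh in-heap array using `copyAddrToByteArray#` as exported by `GHC.Exts`. The helpers `readChars` and `copyLiteral` are hypothetical names, and a string literal stands in for memory that would normally come from a `ForeignPtr`.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.ST (ST(..), runST)

-- Read bytes i .. n-1 of the array back as 8-bit characters.
readChars :: MutableByteArray# s -> Int -> Int -> State# s
          -> (# State# s, String #)
readChars mba i@(I# i#) n s
  | i >= n = (# s, [] #)
  | otherwise =
      case readCharArray# mba i# s of
        (# s1, c #) -> case readChars mba (i + 1) n s1 of
          (# s2, rest #) -> (# s2, C# c : rest #)

-- Copy the 5 bytes of a primitive string literal (an Addr#) into a
-- MutableByteArray# at offset 0, then read them back.
copyLiteral :: String
copyLiteral = runST (ST (\s0 ->
  case newByteArray# 5# s0 of
    (# s1, mba #) ->
      case copyAddrToByteArray# "hello"# mba 0# 5# s1 of
        s2 -> readChars mba 0 5 s2))
```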
      
      Deficiencies / future directions: none of these primops (existing
      or the new ones) let one take advantage of knowing that ByteArray#s
      are word-aligned in memory, though it is unclear that any of the
      code generators would make use of this information unless the size
      to copy is also known at compile time.

      Signed-off-by: Austin Seipp <austin@well-typed.com>
      f11289f6