1. 13 Mar, 2014 1 commit
  2. 11 Mar, 2014 3 commits
    • Fix incorrect loop condition in inline array allocation · c1d74ab9
      tibbe authored
      Also make sure allocHeapClosure updates profiling counters with the
      memory allocated.
    • Refactor inline array allocation · b684f27e
      Simon Marlow authored
      - Move array representation knowledge into SMRep
      
      - Separate out low-level heap-object allocation so that we can reuse
        it from doNewArrayOp
      
      - Remove card-table initialisation; we can safely ignore the card
        table for newly allocated arrays.
    • codeGen: allocate small arrays of statically known size inline · 22f010e0
      tibbe authored
      This results in a 46% runtime decrease when allocating an array of 16
      unit elements on a 64-bit machine.
      
      In order to allow newArray# to have both an inline and an out-of-line
      implementation, cgOpApp is refactored slightly. The new implementation
      of cgOpApp should make it easier to add other primops with both inline
      and out-of-line implementations in the future.
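As a rough illustration of the kind of call this commit targets, the sketch below allocates a 16-element array of unit values with `newArray#`, passing the size as a literal so that it is statically known at the call site. The name `unitArrayLength` is hypothetical, and whether a particular GHC version actually compiles this call inline depends on the compiler and its flags.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (Int (I#), newArray#, sizeofMutableArray#)
import GHC.ST (ST (ST), runST)

-- | Allocate a 16-element array of unit values and report its length.
-- The size 16# is a literal, so it is known statically at the call site.
-- (Hypothetical helper, for illustration only.)
unitArrayLength :: Int
unitArrayLength = runST (ST (\s0 ->
  case newArray# 16# () s0 of
    (# s1, marr #) -> (# s1, I# (sizeofMutableArray# marr) #)))
```

Evaluating `unitArrayLength` in GHCi should give the array length; a size computed at run time would not be "statically known" in the sense the commit describes.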
  3. 02 Oct, 2013 1 commit
  4. 23 Sep, 2013 2 commits
    • Check that SIMD vector instructions are compatible with current set of dynamic flags. · 25eeb678
      gmainlan@microsoft.com authored
      SIMD vector instructions currently require the LLVM back-end. The set of
      available instructions also depends on the set of architecture flags specified
      on the command line.
    • SIMD primops are now generated using schemas that are polymorphic in width and element type. · 16b350a4
      gmainlan@microsoft.com authored
      
      SIMD primops are now polymorphic in vector size and element type, but
      only internally to the compiler. More specifically, utils/genprimopcode
      has been extended so that it "knows" about SIMD vectors. This allows us
      to, for example, write a single definition for the "add two vectors"
      primop in primops.txt.pp and have it instantiated at many vector types.
      This generates a primop in GHC.Prim for each vector type at which "add
      two vectors" is instantiated, but only one data constructor for the
      PrimOp data type, so the code generator is much, much simpler.
  5. 15 Sep, 2013 1 commit
    • New primops for byte range copies ByteArray# <-> Addr# · f11289f6
      Duncan Coutts authored
      We have primops for copying ranges of bytes between ByteArray#s:
       * ByteArray# -> MutableByteArray#
       * MutableByteArray# -> MutableByteArray#
      This extends it with three further cases:
       * Addr# -> MutableByteArray#
       * ByteArray# -> Addr#
       * MutableByteArray# -> Addr#
      One use case for these is copying between ForeignPtr-based
      representations and in-heap arrays (like Text, UArray etc).
      
      The implementation is essentially the same as for the existing
      primops, and shares the memcpy stuff in the code generators.
      
      Deficiencies / future directions: none of these primops (existing
      or the new ones) let one take advantage of knowing that ByteArray#s
      are word-aligned in memory. It is unclear, though, whether any of
      the code generators would make use of this information unless the
      size to copy is also known at compile time.
      Signed-off-by: Austin Seipp <austin@well-typed.com>
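The `Addr# -> MutableByteArray#` case can be sketched as follows, assuming the primop is exported from `GHC.Exts` as `copyAddrToByteArray#` with arguments (source address, destination array, destination offset, byte count). The helper names `bytesFromAddr` and `helloBytes` are hypothetical, and the caller is responsible for making sure the requested number of bytes is actually readable at the address.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
  ( Addr#, Char (C#), Int (I#), copyAddrToByteArray#, indexCharArray#
  , newByteArray#, unsafeFreezeByteArray# )
import GHC.ST (ST (ST), runST)

-- | Copy the first n bytes found at an address into a fresh in-heap
-- byte array, then read them back as 8-bit characters.
-- No bounds checking is done: n bytes must be readable at the address.
bytesFromAddr :: Addr# -> Int -> String
bytesFromAddr addr (I# n) = runST (ST (\s0 ->
  case newByteArray# n s0 of
    (# s1, mba #) ->
      -- Copy n bytes from addr into mba, starting at offset 0.
      case copyAddrToByteArray# addr mba 0# n s1 of
        s2 -> case unsafeFreezeByteArray# mba s2 of
          (# s3, ba #) ->
            (# s3, [C# (indexCharArray# ba i) | I# i <- [0 .. I# n - 1]] #)))

-- | Example input: a primitive string literal, which lives outside the heap.
helloBytes :: String
helloBytes = bytesFromAddr "hello"# 5
```

The same shape would apply when the address comes from a `ForeignPtr`-based buffer, which is the use case the commit message mentions.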
  6. 02 Sep, 2013 1 commit
  7. 20 Aug, 2013 1 commit
  8. 14 Aug, 2013 1 commit
    • Comparison primops return Int# (Fixes #6135) · 6579a6c7
      Jan Stolarek authored
      This patch modifies all comparison primops for Char#, Int#, Word#, Double#,
      Float# and Addr# to return Int# instead of Bool. A value of 1# represents True
      and 0# represents False. For a more detailed description of motivation for this
      change, discussion of implementation details and benchmarking results please
      visit the wiki page: http://hackage.haskell.org/trac/ghc/wiki/PrimBool
      
      There's also some cleanup: whitespace fixes in files that were extensively edited
      in this patch and constant folding rules for Integer div and mod operators (which
      for some reason have been left out up till now).
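A minimal sketch of what the new return type looks like from user code, assuming `isTrue#` from `GHC.Exts` is used to convert back to `Bool`. The function names `ltRaw`, `countZeros` and `lt` are hypothetical.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), isTrue#, (+#), (<#), (==#))

-- | Raw result of a comparison primop: 1 for True, 0 for False.
ltRaw :: Int -> Int -> Int
ltRaw (I# a) (I# b) = I# (a <# b)

-- | Count how many of three values are zero by adding the Int# results
-- of the comparisons directly, without going through Bool.
countZeros :: Int -> Int -> Int -> Int
countZeros (I# a) (I# b) (I# c) =
  I# ((a ==# 0#) +# ((b ==# 0#) +# (c ==# 0#)))

-- | Convert back to Bool at the boundary with isTrue#.
lt :: Int -> Int -> Bool
lt (I# a) (I# b) = isTrue# (a <# b)
```

Being able to combine comparison results arithmetically, as in `countZeros`, is one thing the `Int#` return type makes possible; the linked wiki page gives the full motivation.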
  9. 17 Jul, 2013 1 commit
  10. 11 Jun, 2013 1 commit
  11. 09 Jun, 2013 1 commit
    • Add support for byte endian swapping for Word 16/32/64. · 1c5b0511
      ian@well-typed.com authored
      * Exposes bSwap{,16,32,64}# primops
      * Add a new machop, MO_BSwap
      * Use an Stg implementation (hs_bswap{16,32,64}) for the other NCG
        back-ends.
      * Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits, bswap+shr
        instead of using xchg.
      * Generate llvm.bswap intrinsics in llvm codegen.
      
      Patch from Vincent Hanquez.
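To make the operation concrete, the sketch below compares a byte swap against a hand-written reference built from shifts and masks. It uses the `byteSwap32` wrapper from `Data.Word` in `base` rather than calling a primop directly; the commit message writes the primop names as `bSwap...#`, and treating the `Data.Word` function as the user-facing counterpart is an assumption here. `bswap32Ref` and `agreesWithReference` are hypothetical names.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word32, byteSwap32)

-- | Reference byte swap for 32-bit words using shifts and masks:
-- byte 0 moves to byte 3, byte 1 to byte 2, and so on.
bswap32Ref :: Word32 -> Word32
bswap32Ref w =
      (w `shiftL` 24)
  .|. ((w .&. 0x0000FF00) `shiftL` 8)
  .|. ((w `shiftR` 8) .&. 0x0000FF00)
  .|. (w `shiftR` 24)

-- | Check that the library byte swap agrees with the reference.
agreesWithReference :: Word32 -> Bool
agreesWithReference w = byteSwap32 w == bswap32Ref w
```

For example, swapping `0x12345678` reverses the byte order to `0x78563412`. A single `bswap` instruction or LLVM intrinsic, as the commit describes for the back-ends, replaces the four shift-and-mask steps of the reference version.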
  12. 24 Apr, 2013 1 commit
  13. 12 Mar, 2013 1 commit
  14. 18 Feb, 2013 1 commit
  15. 01 Feb, 2013 6 commits
  16. 23 Jan, 2013 1 commit
  17. 13 Dec, 2012 1 commit
  18. 01 Nov, 2012 2 commits
  19. 16 Oct, 2012 1 commit
    • Some alpha renaming · cd33eefd
      ian@well-typed.com authored
      Mostly d -> g (matching DynFlag -> GeneralFlag).
      Also renamed if* to when*, matching the Haskell if/when names
  20. 08 Oct, 2012 2 commits
    • rl@cse.unsw.edu.au's avatar
      5cff4fb0
    • Produce new-style Cmm from the Cmm parser · a7c0387d
      Simon Marlow authored
      The main change here is that the Cmm parser now allows high-level cmm
      code with argument-passing and function calls.  For example:
      
      foo ( gcptr a, bits32 b )
      {
        if (b > 0) {
           // we can make tail calls passing arguments:
           jump stg_ap_0_fast(a);
        }
      
        return (x,y);
      }
      
      More details on the new cmm syntax are in Note [Syntax of .cmm files]
      in CmmParse.y.
      
      The old syntax is still more-or-less supported for those occasional
      code fragments that really need to explicitly manipulate the stack.
      However there are a couple of differences: it is now obligatory to
      give a list of live GlobalRegs on every jump, e.g.
      
        jump %ENTRY_CODE(Sp(0)) [R1];
      
      Again, more details in Note [Syntax of .cmm files].
      
      I have rewritten most of the .cmm files in the RTS into the new
      syntax, except for AutoApply.cmm which is generated by the genapply
      program: this file could be generated in the new syntax instead and
      would probably be better off for it, but I ran out of enthusiasm.
      
      Some other changes in this batch:
      
       - The PrimOp calling convention is gone, primops now use the ordinary
         NativeNodeCall convention.  This means that primops and "foreign
         import prim" code must be written in high-level cmm, but they can
         now take more than 10 arguments.
      
       - CmmSink now does constant-folding (should fix #7219)
      
       - .cmm files now go through the cmmPipeline, and as a result we
         generate better code in many cases.  All the object files generated
         for the RTS .cmm files are now smaller.  Performance should be
         better too, but I haven't measured it yet.
      
       - RET_DYN frames are removed from the RTS, lots of code goes away
      
       - we now have some more canned GC points to cover unboxed-tuples with
         2-4 pointers, which will reduce code size a little.
  21. 16 Sep, 2012 1 commit
  22. 14 Sep, 2012 2 commits
  23. 13 Sep, 2012 2 commits
  24. 12 Sep, 2012 3 commits
  25. 10 Sep, 2012 2 commits