1. 13 Jul, 2013 1 commit
  2. 11 Jul, 2013 1 commit
  3. 10 Jul, 2013 1 commit
  4. 09 Jul, 2013 1 commit
  5. 15 Jun, 2013 1 commit
    • aljee@hyper.cx's avatar
      Allow multiple C finalizers to be attached to a Weak# · d61c623e
      aljee@hyper.cx authored
      The commit replaces mkWeakForeignEnv# with addCFinalizerToWeak#.
      This new primop mutates an existing Weak# object and adds a new
      C finalizer to it.
      
      This change removes an invariant in MarkWeak.c, namely that the relative
      order of Weak# objects in the list needs to be preserved across GC. This
      makes it easier to split the list into per-generation structures.
      
      The patch also removes a race condition between two threads calling
      finalizeWeak# on the same WEAK object at that same time.
      d61c623e
  6. 11 Jun, 2013 1 commit
  7. 09 Jun, 2013 1 commit
    • ian@well-typed.com's avatar
      Add support for byte endian swapping for Word 16/32/64. · 1c5b0511
      ian@well-typed.com authored
      * Exposes bSwap{,16,32,64}# primops
      * Add a new machops MO_BSwap
      * Use a Stg implementation (hs_bswap{16,32,64}) for other implementation
        in NCG.
      * Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits, bswap+shr
        instead of using xchg.
      * Generate llvm.bswap intrinsics in llvm codegen.
      
      Patch from Vincent Hanquez.
      1c5b0511
  8. 18 Feb, 2013 1 commit
  9. 01 Feb, 2013 6 commits
  10. 17 Jan, 2013 1 commit
    • Simon Peyton Jones's avatar
      Major patch to implement the new Demand Analyser · 0831a12e
      Simon Peyton Jones authored
      This patch is the result of Ilya Sergey's internship at MSR.  It
      constitutes a thorough overhaul and simplification of the demand
      analyser.  It makes a solid foundation on which we can now build.
      Main changes are
      
      * Instead of having one combined type for Demand, a Demand is
         now a pair (JointDmd) of
            - a StrDmd and
            - an AbsDmd.
         This allows strictness and absence to be though about quite
         orthogonally, and greatly reduces brain melt-down.
      
      * Similarly in the DmdResult type, it's a pair of
           - a PureResult (indicating only divergence/non-divergence)
           - a CPRResult (which deals only with the CPR property
      
      * In IdInfo, the
          strictnessInfo field contains a StrictSig, not a Maybe StrictSig
          demandInfo     field contains a Demand, not a Maybe Demand
        We don't need Nothing (to indicate no strictness/demand info)
        any more; topSig/topDmd will do.
      
      * Remove "boxity" analysis entirely.  This was an attempt to
        avoid "reboxing", but it added complexity, is extremely
        ad-hoc, and makes very little difference in practice.
      
      * Remove the "unboxing strategy" computation. This was an an
        attempt to ensure that a worker didn't get zillions of
        arguments by unboxing big tuples.  But in fact removing it
        DRAMATICALLY reduces allocation in an inner loop of the
        I/O library (where the threshold argument-count had been
        set just too low).  It's exceptional to have a zillion arguments
        and I don't think it's worth the complexity, especially since
        it turned out to have a serious performance hit.
      
      * Remove quite a bit of ad-hoc cruft
      
      * Move worthSplittingFun, worthSplittingThunk from WorkWrap to
        Demand. This allows JointDmd to be fully abstract, examined
        only inside Demand.
      
      Everything else really follows from these changes.
      
      All of this is really just refactoring, so we don't expect
      big performance changes, but acutally the numbers look quite
      good.  Here is a full nofib run with some highlights identified:
      
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
               expert          -2.6%    -15.5%      0.00      0.00     +0.0%
                fluid          -2.4%     -7.1%      0.01      0.01     +0.0%
                   gg          -2.5%    -28.9%      0.02      0.02    -33.3%
            integrate          -2.6%     +3.2%     +2.6%     +2.6%     +0.0%
              mandel2          -2.6%     +4.2%      0.01      0.01     +0.0%
             nucleic2          -2.0%    -16.3%      0.11      0.11     +0.0%
                 para          -2.6%    -20.0%    -11.8%    -11.7%     +0.0%
               parser          -2.5%    -17.9%      0.05      0.05     +0.0%
               prolog          -2.6%    -13.0%      0.00      0.00     +0.0%
               puzzle          -2.6%     +2.2%     +0.8%     +0.8%     +0.0%
              sorting          -2.6%    -35.9%      0.00      0.00     +0.0%
             treejoin          -2.6%    -52.2%     -9.8%     -9.9%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -2.7%    -52.2%    -11.8%    -11.7%    -33.3%
                  Max          -1.8%     +4.2%    +10.5%    +10.5%     +7.7%
       Geometric Mean          -2.5%     -2.8%     -0.4%     -0.5%     -0.4%
      
      Things to note
      
      * Binary sizes are smaller. I don't know why, but it's good.
      
      * Allocation is sometiemes a *lot* smaller. I believe that all the big numbers
        (I checked treejoin, gg, sorting) arise from one place, namely a function
        GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of
        which have several arugments.  Not w/w'ing both arguments (which is what
        we did before) has a big effect.  So the big win in actually somewhat
        accidental, gained by removing the "unboxing strategy" code.
      
      * A couple of benchmarks allocate slightly more.  This turns out
        to be due to reboxing (integrate).  But the biggest increase is
        mandel2, and *that* turned out also to be a somewhat accidental
        loss of CSE, and pointed the way to doing better CSE: see Trac
        #7596.
      
      * Runtimes are never very reliable, but seem to improve very slightly.
      
      All in all, a good piece of work.  Thank you Ilya!
      0831a12e
  11. 13 Dec, 2012 1 commit
  12. 23 Nov, 2012 3 commits
  13. 13 Nov, 2012 2 commits
  14. 15 Oct, 2012 1 commit
    • Duncan Coutts's avatar
      Add a new traceMarker# primop for use in profiling output · a609027d
      Duncan Coutts authored
      In time-based profiling visualisations (e.g. heap profiles and ThreadScope)
      it would be useful to be able to mark particular points in the execution and
      have those points in time marked in the visualisation.
      
      The traceMarker# primop currently emits an event into the eventlog. In
      principle it could be extended to do something in the heap profiling too.
      a609027d
  15. 10 Oct, 2012 1 commit
  16. 28 Aug, 2012 1 commit
  17. 28 May, 2012 1 commit
  18. 27 Apr, 2012 1 commit
  19. 21 Apr, 2012 1 commit
    • Ian Lynagh's avatar
      Add a quotRemWord2 primop · 5136d64e
      Ian Lynagh authored
      It allows you to do
          (high, low) `quotRem` d
      provided high < d.
      
      Currently only has an inefficient fallback implementation.
      5136d64e
  20. 20 Apr, 2012 1 commit
  21. 24 Feb, 2012 1 commit
  22. 23 Feb, 2012 1 commit
  23. 17 Feb, 2012 1 commit
  24. 14 Feb, 2012 1 commit
  25. 13 Jan, 2012 1 commit
  26. 12 Dec, 2011 1 commit
  27. 07 Dec, 2011 1 commit
    • chak@cse.unsw.edu.au.'s avatar
      Add new primtypes 'ArrayArray#' and 'MutableArrayArray#' · 021a0dd2
      chak@cse.unsw.edu.au. authored
      The primitive array types, such as 'ByteArray#', have kind #, but are represented by pointers. They are boxed, but unpointed types (i.e., they cannot be 'undefined').
      
      The two categories of array types —[Mutable]Array# and [Mutable]ByteArray#— are containers for unboxed (and unpointed) as well as for boxed and pointed types.  So far, we lacked support for containers for boxed, unpointed types (i.e., containers for the primitive arrays themselves).  This is what the new primtypes provide.
      
      Containers for boxed, unpointed types are crucial for the efficient implementation of scattered nested arrays, which are central to the new DPH backend library dph-lifted-vseg.  Without such containers, we cannot eliminate all unboxing from the inner loops of traversals processing scattered nested arrays.
      021a0dd2
  28. 06 Dec, 2011 1 commit
  29. 30 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Further tweaks to the ccs primops · 1fc25dfd
      Simon Marlow authored
       - add getCCSOf# :: a -> State# s -> (# State# s, Addr# #)
         (returns the CCS attached to the supplied object)
      
       - remove traceCcs# (obsoleted by getCCSOf#)
      
       - rename getCCCS# to getCurrentCCS#
      1fc25dfd
  30. 29 Nov, 2011 1 commit
  31. 16 Nov, 2011 1 commit
  32. 11 Nov, 2011 1 commit
    • dreixel's avatar
      New kind-polymorphic core · 09015be8
      dreixel authored
      This big patch implements a kind-polymorphic core for GHC. The current
      implementation focuses on making sure that all kind-monomorphic programs still
      work in the new core; it is not yet guaranteed that kind-polymorphic programs
      (using the new -XPolyKinds flag) will work.
      
      For more information, see http://haskell.org/haskellwiki/GHC/Kinds
      09015be8