1. 21 Mar, 2019 1 commit
    • jberryman's avatar
      Rough working LTO · 6f369534
      jberryman authored
      Very rough and hacky, with some hacks unnecessary but which I haven't
      taken the time to sort out yet.
      
      you need to make sure you have the lto plugin set up correctly, e.g.:
      
        $ cd /usr/lib/bfd-plugins/
        $ sudo ln -s /usr/lib/gcc/x86_64-linux-gnu/6/liblto_plugin.so .
      
      then do:
      
         $ export PATH=SHIMS_PATH:$PATH
         $ LD=gold  CC_STAGE0=gcc CC=gcc ./configure
         $ make
      
      You should adjust your path in the same way when building with this GHC,
      and you can do e.g.
      
        $ cabal new-build --disable-profiling --disable-library-profiling --with-compiler=your-ghc-source-tree/inplace/bin/ghc-stage2 --package-db=your-ghc-source-tree/inplace/lib/package.conf.d all
      
      You can see `-time` log in gcc shim file, play with recompilation
      options there.
      6f369534
  2. 02 Jun, 2018 1 commit
  3. 06 Mar, 2017 1 commit
  4. 07 Dec, 2016 1 commit
    • Simon Marlow's avatar
      Overhaul of Compact Regions (#12455) · 7036fde9
      Simon Marlow authored
      Summary:
      This commit makes various improvements and addresses some issues with
      Compact Regions (aka Compact Normal Forms).
      
      This was the most important thing I wanted to fix.  Compaction
      previously prevented GC from running until it was complete, which
      would be a problem in a multicore setting.  Now, we compact using a
      hand-written Cmm routine that can be interrupted at any point.  When a
      GC is triggered during a sharing-enabled compaction, the GC has to
      traverse and update the hash table, so this hash table is now stored
      in the StgCompactNFData object.
      
      Previously, compaction consisted of a deepseq using the NFData class,
      followed by a traversal in C code to copy the data.  This is now done
      in a single pass with hand-written Cmm (see rts/Compact.cmm). We no
      longer use the NFData instances, instead the Cmm routine evaluates
      components directly as it compacts.
      
      The new compaction is about 50% faster than the old one with no
      sharing, and a little faster on average with sharing (the cost of the
      hash table dominates when we're doing sharing).
      
      Static objects that don't (transitively) refer to any CAFs don't need
      to be copied into the compact region.  In particular this means we
      often avoid copying Char values and small Int values, because these
      are static closures in the runtime.
      
      Each Compact# object can support a single compactAdd# operation at any
      given time, so the Data.Compact library now enforces mutual exclusion
      using an MVar stored in the Compact object.
      
      We now get exceptions rather than killing everything with a barf()
      when we encounter an object that cannot be compacted (a function, or a
      mutable object).  We now also detect pinned objects, which can't be
      compacted either.
      
      The Data.Compact API has been refactored and cleaned up.  A new
      compactSize operation returns the size (in bytes) of the compact
      object.
      
      Most of the documentation is in the Haddock docs for the compact
      library, which I've expanded and improved here.
      
      Various comments in the code have been improved, especially the main
      Note [Compact Normal Forms] in rts/sm/CNF.c.
      
      I've added a few tests, and expanded a few of the tests that were
      there.  We now also run the tests with GHCi, and in a new test way
      that enables sanity checking (+RTS -DS).
      
      There's a benchmark in libraries/compact/tests/compact_bench.hs for
      measuring compaction speed and comparing sharing vs. no sharing.
      
      The field totalDataW in StgCompactNFData was unnecessary.
      
      Test Plan:
      * new unit tests
      * validate
      * tested manually that we can compact Data.Aeson data
      
      Reviewers: gcampax, bgamari, ezyang, austin, niteria, hvr, erikd
      
      Subscribers: thomie, simonpj
      
      Differential Revision: https://phabricator.haskell.org/D2751
      
      GHC Trac Issues: #12455
      7036fde9
  5. 20 Jul, 2016 1 commit
    • gcampax's avatar
      Compact Regions · cf989ffe
      gcampax authored
      This brings in initial support for compact regions, as described in the
      ICFP 2015 paper "Efficient Communication and Collection with Compact
      Normal Forms" (Edward Z. Yang et.al.) and implemented by Giovanni
      Campagna.
      
      Some things may change before the 8.2 release, but I (Simon M.) wanted
      to get the main patch committed so that we can iterate.
      
      What documentation there is is in the Data.Compact module in the new
      compact package.  We'll need to extend and polish the documentation
      before the release.
      
      Test Plan:
      validate
      (new test cases included)
      
      Reviewers: ezyang, simonmar, hvr, bgamari, austin
      
      Subscribers: vikraman, Yuras, RyanGlScott, qnikst, mboes, facundominguez, rrnewton, thomie, erikd
      
      Differential Revision: https://phabricator.haskell.org/D1264
      
      GHC Trac Issues: #11493
      cf989ffe
  6. 18 May, 2016 2 commits
    • Ben Gamari's avatar
      rts: Add isPinnedByteArray# primop · 310371ff
      Ben Gamari authored
      Adds a primitive operation to determine whether a particular
      `MutableByteArray#` is backed by a pinned buffer.
      
      Test Plan: Validate with included testcase
      
      Reviewers: austin, simonmar
      
      Reviewed By: austin, simonmar
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2217
      
      GHC Trac Issues: #12059
      310371ff
    • mlen's avatar
      Fix histograms for ticky code · f0f0ac85
      mlen authored
      This patch fixes Cmm generation required to produce histograms when
      compiling with -ticky flag, strips dead code from rts/Ticky.c and
      reworks it to use a shared constant in both C and Haskell code.
      
      Fixes #8308.
      
      Test Plan: T8308
      
      Reviewers: jstolarek, simonpj, austin
      
      Reviewed By: simonpj
      
      Subscribers: mpickering, simonpj, bgamari, mlen, thomie, jstolarek
      
      Differential Revision: https://phabricator.haskell.org/D931
      
      GHC Trac Issues: #8308
      f0f0ac85
  7. 01 May, 2016 1 commit
  8. 28 Apr, 2016 1 commit
  9. 24 Apr, 2016 1 commit
  10. 16 Apr, 2016 1 commit
  11. 01 Dec, 2015 1 commit
  12. 19 Nov, 2015 2 commits
  13. 30 Oct, 2015 1 commit
  14. 11 Sep, 2015 1 commit
  15. 06 Jul, 2015 1 commit
  16. 15 Dec, 2014 1 commit
  17. 25 Nov, 2014 1 commit
    • Simon Marlow's avatar
      Make clearNursery free · e22bc0de
      Simon Marlow authored
      Summary:
      clearNursery resets all the bd->free pointers of nursery blocks to
      make the blocks empty.  In profiles we've seen clearNursery taking
      significant amounts of time particularly with large -N and -A values.
      
      This patch moves the work of clearNursery to the point at which we
      actually need the new block, thereby introducing an invariant that
      blocks to the right of the CurrentNursery pointer still need their
      bd->free pointer reset.  This should make things faster overall,
      because we don't need to clear blocks that we don't use.
      
      Test Plan: validate
      
      Reviewers: AndreasVoellmy, ezyang, austin
      
      Subscribers: thomie, carter, ezyang, simonmar
      
      Differential Revision: https://phabricator.haskell.org/D318
      e22bc0de
  18. 12 Nov, 2014 1 commit
  19. 21 Oct, 2014 1 commit
  20. 30 May, 2014 1 commit
  21. 20 May, 2014 1 commit
  22. 19 May, 2014 1 commit
  23. 04 May, 2014 1 commit
  24. 02 May, 2014 1 commit
    • Simon Marlow's avatar
      Per-thread allocation counters and limits · b0534f78
      Simon Marlow authored
      This tracks the amount of memory allocation by each thread in a
      counter stored in the TSO.  Optionally, when the counter drops below
      zero (it counts down), the thread can be sent an asynchronous
      exception: AllocationLimitExceeded.  When this happens, given a small
      additional limit so that it can handle the exception.  See
      documentation in GHC.Conc for more details.
      
      Allocation limits are similar to timeouts, but
      
        - timeouts use real time, not CPU time.  Allocation limits do not
          count anything while the thread is blocked or in foreign code.
      
        - timeouts don't re-trigger if the thread catches the exception,
          allocation limits do.
      
        - timeouts can catch non-allocating loops, if you use
          -fno-omit-yields.  This doesn't work for allocation limits.
      
      I couldn't measure any impact on benchmarks with these changes, even
      for nofib/smp.
      b0534f78
  25. 29 Mar, 2014 1 commit
    • tibbe's avatar
      Add SmallArray# and SmallMutableArray# types · 90329b6c
      tibbe authored
      These array types are smaller than Array# and MutableArray# and are
      faster when the array size is small, as they don't have the overhead
      of a card table. Having no card table reduces the closure size with 2
      words in the typical small array case and leads to less work when
      updating or GC:ing the array.
      
      Reduces both the runtime and memory allocation by 8.8% on my insert
      benchmark for the HashMap type in the unordered-containers package,
      which makes use of lots of small arrays. With tuned GC settings
      (i.e. `+RTS -A6M`) the runtime reduction is 15%.
      
      Fixes #8923.
      90329b6c
  26. 24 Mar, 2014 1 commit
  27. 23 Mar, 2014 1 commit
  28. 13 Mar, 2014 1 commit
  29. 28 Nov, 2013 1 commit
  30. 26 Oct, 2013 1 commit
  31. 25 Oct, 2013 1 commit
  32. 23 Sep, 2013 2 commits
  33. 06 Aug, 2013 1 commit
  34. 15 Jun, 2013 2 commits
    • aljee@hyper.cx's avatar
      fe652a8b
    • aljee@hyper.cx's avatar
      Allow multiple C finalizers to be attached to a Weak# · d61c623e
      aljee@hyper.cx authored
      The commit replaces mkWeakForeignEnv# with addCFinalizerToWeak#.
      This new primop mutates an existing Weak# object and adds a new
      C finalizer to it.
      
      This change removes an invariant in MarkWeak.c, namely that the relative
      order of Weak# objects in the list needs to be preserved across GC. This
      makes it easier to split the list into per-generation structures.
      
      The patch also removes a race condition between two threads calling
      finalizeWeak# on the same WEAK object at that same time.
      d61c623e
  35. 29 Mar, 2013 1 commit
    • nfrisby's avatar
      ticky enhancements · 460abd75
      nfrisby authored
        * the new StgCmmArgRep module breaks a dependency cycle; I also
          untabified it, but made no real changes
      
        * updated the documentation in the wiki and change the user guide to
          point there
      
        * moved the allocation enters for ticky and CCS to after the heap check
      
          * I left LDV where it was, which was before the heap check at least
            once, since I have no idea what it is
      
        * standardized all (active?) ticky alloc totals to bytes
      
        * in order to avoid double counting StgCmmLayout.adjustHpBackwards
          no longer bumps ALLOC_HEAP_ctr
      
        * I resurrected the SLOW_CALL counters
      
          * the new module StgCmmArgRep breaks cyclic dependency between
            Layout and Ticky (which the SLOW_CALL counters cause)
      
          * renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL
      
        * added ALLOC_RTS_ctr and _tot ticky counters
      
          * eg allocation by Storage.c:allocate or a BUILD_PAP in stg_ap_*_info
      
          * resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and
            ALLOC_PRIM
      
          * added -ticky and -DTICKY_TICKY in ways.mk for debug ways
      
        * added a ticky counter for total LNE entries
      
        * new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE
      
          * all off by default
      
          * -ticky-allocd: tracks allocation *of* closure in addition to
             allocation *by* that closure
      
          * -ticky-dyn-thunk tracks dynamic thunks as if they were functions
      
          * -ticky-LNE tracks LNEs as if they were functions
      
        * updated the ticky report format, including making the argument
          categories (more?) accurate again
      
        * the printed name for things in the report include the unique of
          their ticky parent as well as if they are not top-level
      460abd75
  36. 14 Feb, 2013 1 commit