1. 21 Aug, 2018 1 commit
  2. 15 Jul, 2018 1 commit
  3. 04 Jul, 2018 1 commit
  4. 29 Jun, 2018 1 commit
  5. 17 Jun, 2018 1 commit
    • Use __FILE__ for Cmm assertion locations, fix #8619 · 008ea12d
      Ömer Sinan Ağacan authored
      It seems like we currently support string literals in Cmm, so we can use
      the __FILE__ CPP macro in assertion macros. This improves error messages
      that previously looked like
      
          ASSERTION FAILED: file (null), line 1302
      
      The (null) part now shows the actual file name.
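
      A minimal C sketch of the idea, assuming a hypothetical failure hook
      (`sketch_assert_fail`) and macro name (`SKETCH_ASSERT`); neither is the
      RTS's actual definition. The point is only that `__FILE__` expands to a
      string literal at the use site, so the message no longer needs to print
      `(null)`:

      ```c
      #include <stdio.h>

      /* Hypothetical failure hook: records the message in a buffer instead of
         aborting, so the formatted location can be inspected. */
      static char last_failure[256];

      void sketch_assert_fail(const char *file, unsigned int line)
      {
          snprintf(last_failure, sizeof last_failure,
                   "ASSERTION FAILED: file %s, line %u", file, line);
      }

      /* __FILE__ expands to a string literal and __LINE__ to an integer
         constant at the point where the macro is used. */
      #define SKETCH_ASSERT(pred) \
          do { if (!(pred)) sketch_assert_fail(__FILE__, __LINE__); } while (0)

      /* Example use: returns the recorded message, or "" if the check passed. */
      const char *check_positive(int x)
      {
          last_failure[0] = '\0';
          SKETCH_ASSERT(x > 0);
          return last_failure;
      }
      ```

      Calling `check_positive(-1)` yields a message that names the source file
      containing the macro use, followed by the line number.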
      
      Also inline some single-use string literals in PrimOps.cmm.
      
      Reviewers: bgamari, simonmar, erikd
      
      Reviewed By: bgamari
      
      Subscribers: rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4862
  6. 05 Jun, 2018 1 commit
    • Rename some mutable closure types for consistency · 4075656e
      Ömer Sinan Ağacan authored
      SMALL_MUT_ARR_PTRS_FROZEN0 -> SMALL_MUT_ARR_PTRS_FROZEN_DIRTY
      SMALL_MUT_ARR_PTRS_FROZEN  -> SMALL_MUT_ARR_PTRS_FROZEN_CLEAN
      MUT_ARR_PTRS_FROZEN0       -> MUT_ARR_PTRS_FROZEN_DIRTY
      MUT_ARR_PTRS_FROZEN        -> MUT_ARR_PTRS_FROZEN_CLEAN
      
      Naming is now consistent with other CLEAN/DIRTY objects (MVAR, MUT_VAR,
      MUT_ARR_PTRS).
      
      (alternatively we could rename MVAR_DIRTY/MVAR_CLEAN etc. to MVAR0/MVAR)
      
      Removed a few comments in Scav.c about FROZEN0 being on the mut_list
      because it's now clear from the closure type.
      
      Reviewers: bgamari, simonmar, erikd
      
      Reviewed By: simonmar
      
      Subscribers: rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4784
  7. 02 Jun, 2018 1 commit
  8. 20 May, 2018 1 commit
    • Add HeapView functionality · ec22f7dd
      patrickdoc authored
      This pulls parts of Joachim Breitner's ghc-heap-view library inside GHC.
      The bits added are the C hooks into the RTS and a basic Haskell wrapper
      to these C hooks. The main reason for these to be added to GHC proper
      is that the code needs to be kept in sync with the closure types
      defined by the RTS. It is expected that the version of HeapView shipped
      with GHC will always work with that version of GHC and that extra
      functionality can be layered on top with a library like ghc-heap-view
      distributed via Hackage.
      
      Test Plan: validate
      
      Reviewers: simonmar, hvr, nomeata, austin, Phyx, bgamari, erikd
      
      Reviewed By: bgamari
      
      Subscribers: carter, patrickdoc, tmcgilchrist, rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3055
  9. 19 Mar, 2018 1 commit
    • Improve accuracy of get/setAllocationCounter · 20cbb016
      Ben Gamari authored
      Summary:
      get/setAllocationCounter didn't take into account allocations in the
      current block. This was known at the time, but it turns out to be
      important to have more accuracy when using these in a fine-grained
      way.
      
      Test Plan:
      New unit test to test incrementally larger allocations.  Before I got
      results like this:
      
      ```
      +0
      +0
      +0
      +0
      +0
      +4096
      +0
      +0
      +0
      +0
      +0
      +4064
      +0
      +0
      +4088
      +4056
      +0
      +0
      +0
      +4088
      +4096
      +4056
      +4096
      ```
      
      Notice how the results aren't always monotonically increasing.  After
      this patch:
      
      ```
      +344
      +416
      +488
      +560
      +632
      +704
      +776
      +848
      +920
      +992
      +1064
      +1136
      +1208
      +1280
      +1352
      +1424
      +1496
      +1568
      +1640
      +1712
      +1784
      +1856
      +1928
      +2000
      +2072
      +2144
      ```
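
      The difference between the two outputs can be illustrated with a toy
      model, sketched here under loud assumptions: a fixed 4096-byte block,
      bump allocation, and hypothetical names (`ToyNursery`, `toy_alloc`,
      `coarse_allocated`, `precise_allocated`). It is not the RTS
      implementation, only a picture of why ignoring the current block gives
      readings that move in block-sized jumps:

      ```c
      #include <stddef.h>

      #define TOY_BLOCK_SIZE 4096  /* assumed block size for this toy model */

      typedef struct {
          size_t finished_bytes; /* bytes used in blocks already retired */
          size_t used_in_block;  /* bytes bump-allocated in the current block */
      } ToyNursery;

      /* Bump-allocate n bytes (n <= TOY_BLOCK_SIZE); retire the current block
         and start a fresh one when the request does not fit. */
      void toy_alloc(ToyNursery *nur, size_t n)
      {
          if (nur->used_in_block + n > TOY_BLOCK_SIZE) {
              nur->finished_bytes += nur->used_in_block;
              nur->used_in_block = 0;
          }
          nur->used_in_block += n;
      }

      /* Coarse reading: ignores the current block, so it only changes at
         block boundaries. */
      size_t coarse_allocated(const ToyNursery *nur)
      {
          return nur->finished_bytes;
      }

      /* Fine-grained reading: also counts the used part of the current block. */
      size_t precise_allocated(const ToyNursery *nur)
      {
          return nur->finished_bytes + nur->used_in_block;
      }
      ```

      In this model a small allocation leaves the coarse reading unchanged
      until a block is retired, while the fine-grained reading grows with
      every allocation.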
      
      Reviewers: hvr, erikd, simonmar, jrtc27, trommler
      
      Reviewed By: simonmar
      
      Subscribers: trommler, jrtc27, rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4363
  10. 09 Mar, 2018 1 commit
  11. 18 Feb, 2018 1 commit
  12. 29 Jan, 2018 1 commit
  13. 18 Jan, 2018 2 commits
  14. 15 Jan, 2018 1 commit
  15. 08 Jan, 2018 1 commit
    • Improve accuracy of get/setAllocationCounter · a1a689dd
      Simon Marlow authored
      Summary:
      get/setAllocationCounter didn't take into account allocations in the
      current block. This was known at the time, but it turns out to be
      important to have more accuracy when using these in a fine-grained
      way.
      
      Test Plan:
      New unit test to test incrementally larger allocations.  Before I got
      results like this:
      
      ```
      +0
      +0
      +0
      +0
      +0
      +4096
      +0
      +0
      +0
      +0
      +0
      +4064
      +0
      +0
      +4088
      +4056
      +0
      +0
      +0
      +4088
      +4096
      +4056
      +4096
      ```
      
      Notice how the results aren't always monotonically increasing.  After
      this patch:
      
      ```
      +344
      +416
      +488
      +560
      +632
      +704
      +776
      +848
      +920
      +992
      +1064
      +1136
      +1208
      +1280
      +1352
      +1424
      +1496
      +1568
      +1640
      +1712
      +1784
      +1856
      +1928
      +2000
      +2072
      +2144
      ```
      
      Reviewers: niteria, bgamari, hvr, erikd
      
      Subscribers: rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4288
  16. 19 Dec, 2017 1 commit
  17. 08 Nov, 2017 1 commit
  18. 02 Nov, 2017 1 commit
    • rts/PrimOps.cmm: add declaration for heapOverflow closure · 51321cf2
      Sergei Trofimovich authored
      Before the change, the UNREG GHC build failed with:
      ```
        rts_dist_HC rts/dist/build/PrimOps.o
      /tmp/ghc2370_0/ghc_4.hc: In function 'stg_newByteArrayzh':
      
      /tmp/ghc2370_0/ghc_4.hc:26:13: error:
           error: 'base_GHCziIOziException_heapOverflow_closure'
               undeclared (first use in this function)
           R1.w = (W_)&base_GHCziIOziException_heapOverflow_closure;
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |
      26 | R1.w = (W_)&base_GHCziIOziException_heapOverflow_closure;
         |             ^
      ```
      
      It's an UNREG-specific failure because the C backend always requires
      declarations to be known.
      
      Added missing declaration.
      Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
  19. 19 Oct, 2017 1 commit
    • Untag the potential AP_STACK in stg_getApStackValzh · b6204f70
      Jessica Clarke authored
      If the AP_STACK has been evaluated and a GC has run, the BLACKHOLE
      indirection will have been removed, and the StablePtr for the original
      AP_STACK referred to by GHCi will therefore now point directly to the
      value, and may be tagged. Add a hist002 test for this, and make sure
      hist001 doesn't do an idle GC, so the case when it's still a BLACKHOLE
      is definitely also tested.
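
      For readers unfamiliar with pointer tagging, here is a small C sketch of
      what "untagging" means. It assumes a 64-bit platform where the low 3
      bits of an aligned pointer are available as a tag; the helper names
      (`sketch_add_tag`, `sketch_get_tag`, `sketch_untag`) are hypothetical
      and are not the RTS macros:

      ```c
      #include <stdint.h>

      #define SKETCH_TAG_BITS 3  /* assumption: 64-bit platform, 8-byte alignment */
      #define SKETCH_TAG_MASK ((uintptr_t)((1u << SKETCH_TAG_BITS) - 1))

      /* Store a small tag in the otherwise-zero low bits of an aligned pointer. */
      void *sketch_add_tag(void *p, unsigned tag)
      {
          return (void *)((uintptr_t)p | ((uintptr_t)tag & SKETCH_TAG_MASK));
      }

      /* Read the tag back out of the low bits. */
      unsigned sketch_get_tag(void *p)
      {
          return (unsigned)((uintptr_t)p & SKETCH_TAG_MASK);
      }

      /* Clear the low bits to recover the real address. Untagging an already
         untagged pointer is a no-op, so it is safe to apply unconditionally. */
      void *sketch_untag(void *p)
      {
          return (void *)((uintptr_t)p & ~SKETCH_TAG_MASK);
      }
      ```

      Because untagging is idempotent, code that may receive either a tagged
      or an untagged pointer can simply untag before dereferencing.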
      
      Reviewers: austin, bgamari, erikd, simonmar
      
      Reviewed By: simonmar
      
      Subscribers: rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D4099
  20. 16 Oct, 2017 1 commit
    • Implement new `compareByteArrays#` primop · e3ba26f8
      Herbert Valerio Riedel authored
      The new primop
      
          compareByteArrays# :: ByteArray# -> Int# {- offset -}
                             -> ByteArray# -> Int# {- offset -}
                             -> Int# {- length -}
                             -> Int#
      
      allows comparing a subrange of the first `ByteArray#` to
      the (same-length) subrange of the second `ByteArray#` and returns a
      value less than, equal to, or greater than zero if the first range is
      found, respectively, to be byte-wise lexicographically less than, to
      match, or be greater than the second range.
      
      Under the hood, the new primop is implemented in terms of the standard
      ISO C `memcmp(3)` function. It is currently an out-of-line primop but
      work is underway to optimise this into an inline primop for a future
      follow-up Differential (see D4091).
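
      The comparison semantics can be sketched in plain C as follows. The
      function name `sketch_compare_ranges` and its argument layout are
      assumptions made for illustration, not the primop's actual out-of-line
      code; only `memcmp` itself is the standard function named above. Like
      `memcmp`, the sketch does no bounds checking, so the caller must ensure
      both subranges lie within their arrays:

      ```c
      #include <stddef.h>
      #include <string.h>

      /* Compare len bytes of a starting at off_a with len bytes of b starting
         at off_b. The result is negative, zero, or positive, following
         memcmp's byte-wise (unsigned) lexicographic ordering. */
      int sketch_compare_ranges(const unsigned char *a, size_t off_a,
                                const unsigned char *b, size_t off_b,
                                size_t len)
      {
          return memcmp(a + off_a, b + off_b, len);
      }
      ```

      For instance, comparing 5 bytes of "hello world" at offset 6 with 5
      bytes of "a world!" at offset 2 compares "world" with "world" and
      returns zero.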
      
      This primop has applications in packages like `text`, `text-short`,
      `bytestring`, `text-containers`, `primitive`, etc.  which currently
      have to incur the overhead of an ordinary FFI call to directly or
      indirectly invoke `memcmp(3)` as well as having to deal with some
      `unsafePerformIO`-variant.
      
      While at it, this also improves the documentation for the existing
      `copyByteArray#` primitive which has a non-trivial type-signature
      that significantly benefits from a more explicit description of its
      arguments.
      
      Reviewed By: bgamari
      
      Differential Revision: https://phabricator.haskell.org/D4090
  21. 26 Sep, 2017 2 commits
  22. 22 Sep, 2017 1 commit
  23. 06 Jul, 2017 1 commit
  24. 03 Jul, 2017 1 commit
  25. 08 May, 2017 1 commit
  26. 29 Apr, 2017 2 commits
  27. 05 Apr, 2017 1 commit
  28. 04 Apr, 2017 1 commit
  29. 04 Feb, 2017 1 commit
    • Fix comment (old file names) in rts/ · 31bb85ff
      Takenobu Tani authored
      [skip ci]
      
      There were some old file names (.lhs, ...) in comments.
      
      * rts/win32/ThrIOManager.c
        - Conc.lhs -> Conc.hs
      
      * rts/PrimOps.cmm
        - ByteCodeLink.lhs -> ByteCodeLink.hs
        - StgMiscClosures.hc -> StgMiscClosures.cmm
      
      * rts/AutoApply.h
        - AutoApply.hc -> AutoApply.cmm
      
      * rts/HeapStackCheck.cmm
        - PrimOps.hc -> PrimOps.cmm
      
      * rts/LdvProfile.h
        - Updates.hc -> Updates.cmm
      
      * rts/Schedule.c
        - StgStartup.hc -> StgStartup.cmm
      
      * rts/Weak.c
        - StgMiscClosures.hc -> StgMiscClosures.cmm
      
      Reviewers: bgamari, austin, erikd, simonmar
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D3075
  30. 07 Dec, 2016 1 commit
    • Overhaul of Compact Regions (#12455) · 7036fde9
      Simon Marlow authored
      Summary:
      This commit makes various improvements and addresses some issues with
      Compact Regions (aka Compact Normal Forms).
      
      The most important thing I wanted to fix was interruptibility.  Compaction
      previously prevented GC from running until it was complete, which
      would be a problem in a multicore setting.  Now, we compact using a
      hand-written Cmm routine that can be interrupted at any point.  When a
      GC is triggered during a sharing-enabled compaction, the GC has to
      traverse and update the hash table, so this hash table is now stored
      in the StgCompactNFData object.
      
      Previously, compaction consisted of a deepseq using the NFData class,
      followed by a traversal in C code to copy the data.  This is now done
      in a single pass with hand-written Cmm (see rts/Compact.cmm). We no
      longer use the NFData instances, instead the Cmm routine evaluates
      components directly as it compacts.
      
      The new compaction is about 50% faster than the old one with no
      sharing, and a little faster on average with sharing (the cost of the
      hash table dominates when we're doing sharing).
      
      Static objects that don't (transitively) refer to any CAFs don't need
      to be copied into the compact region.  In particular this means we
      often avoid copying Char values and small Int values, because these
      are static closures in the runtime.
      
      Each Compact# object can support a single compactAdd# operation at any
      given time, so the Data.Compact library now enforces mutual exclusion
      using an MVar stored in the Compact object.
      
      We now get exceptions rather than killing everything with a barf()
      when we encounter an object that cannot be compacted (a function, or a
      mutable object).  We now also detect pinned objects, which can't be
      compacted either.
      
      The Data.Compact API has been refactored and cleaned up.  A new
      compactSize operation returns the size (in bytes) of the compact
      object.
      
      Most of the documentation is in the Haddock docs for the compact
      library, which I've expanded and improved here.
      
      Various comments in the code have been improved, especially the main
      Note [Compact Normal Forms] in rts/sm/CNF.c.
      
      I've added a few tests, and expanded a few of the tests that were
      there.  We now also run the tests with GHCi, and in a new test way
      that enables sanity checking (+RTS -DS).
      
      There's a benchmark in libraries/compact/tests/compact_bench.hs for
      measuring compaction speed and comparing sharing vs. no sharing.
      
      The field totalDataW in StgCompactNFData was unnecessary.
      
      Test Plan:
      * new unit tests
      * validate
      * tested manually that we can compact Data.Aeson data
      
      Reviewers: gcampax, bgamari, ezyang, austin, niteria, hvr, erikd
      
      Subscribers: thomie, simonpj
      
      Differential Revision: https://phabricator.haskell.org/D2751
      
      GHC Trac Issues: #12455
  31. 28 Oct, 2016 1 commit
  32. 12 Sep, 2016 1 commit
    • Add hs_try_putmvar() · 454033b5
      Simon Marlow authored
      Summary:
      This is a fast, non-blocking, asynchronous, interface to tryPutMVar that
      can be called from C/C++.
      
      It's useful for callback-based C/C++ APIs: the idea is that the callback
      invokes hs_try_putmvar(), and the Haskell code waits for the callback to
      run by blocking in takeMVar.
      
      The callback doesn't block - this is often a requirement of
      callback-based APIs.  The callback wakes up the Haskell thread with
      minimal overhead and no unnecessary context-switches.
      
      There are a couple of benchmarks in
      testsuite/tests/concurrent/should_run.  Some example results comparing
      hs_try_putmvar() with using a standard foreign export:
      
          ./hs_try_putmvar003 1 64 16 100 +RTS -s -N4     0.49s
          ./hs_try_putmvar003 2 64 16 100 +RTS -s -N4     2.30s
      
      hs_try_putmvar() is 4x faster for this workload (see the source for
      hs_try_putmvar003.hs for details of the workload).
      
      An alternative solution is to use the IO Manager for this.  We've tried
      it, but there are problems with that approach:
      * Need to create a new file descriptor for each callback
      * The IO Manager thread(s) become a bottleneck
      * More potential for things to go wrong, e.g. throwing an exception in
        an IO Manager callback kills the IO Manager thread.
      
      Test Plan: validate; new unit tests
      
      Reviewers: niteria, erikd, ezyang, bgamari, austin, hvr
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2501
  33. 01 Aug, 2016 1 commit
  34. 20 Jul, 2016 1 commit
    • Compact Regions · cf989ffe
      gcampax authored
      This brings in initial support for compact regions, as described in the
      ICFP 2015 paper "Efficient Communication and Collection with Compact
      Normal Forms" (Edward Z. Yang et al.) and implemented by Giovanni
      Campagna.
      
      Some things may change before the 8.2 release, but I (Simon M.) wanted
      to get the main patch committed so that we can iterate.
      
      What documentation there is is in the Data.Compact module in the new
      compact package.  We'll need to extend and polish the documentation
      before the release.
      
      Test Plan:
      validate
      (new test cases included)
      
      Reviewers: ezyang, simonmar, hvr, bgamari, austin
      
      Subscribers: vikraman, Yuras, RyanGlScott, qnikst, mboes, facundominguez, rrnewton, thomie, erikd
      
      Differential Revision: https://phabricator.haskell.org/D1264
      
      GHC Trac Issues: #11493
  35. 10 Jun, 2016 1 commit
    • NUMA support · 9e5ea67e
      Simon Marlow authored
      Summary:
      The aim here is to reduce the number of remote memory accesses on
      systems with a NUMA memory architecture, typically multi-socket servers.
      
      Linux provides a NUMA API for doing two things:
      * Allocating memory local to a particular node
      * Binding a thread to a particular node
      
      When given the +RTS --numa flag, the runtime will
      * Determine the number of NUMA nodes (N) by querying the OS
      * Assign capabilities to nodes, so cap C is on node C%N
      * Bind worker threads on a capability to the correct node
      * Keep separate free lists in the block layer for each node
      * Allocate the nursery for a capability from node-local memory
      * Allocate blocks in the GC from node-local memory
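
      The capability-to-node assignment in the list above ("cap C is on node
      C%N") amounts to a round-robin mapping. A minimal sketch, with a
      hypothetical function name that is not taken from the RTS:

      ```c
      #include <assert.h>

      /* Map capability number cap onto one of n_nodes NUMA nodes, round-robin. */
      unsigned sketch_cap_to_node(unsigned cap, unsigned n_nodes)
      {
          assert(n_nodes > 0); /* the mapping is undefined without any node */
          return cap % n_nodes;
      }
      ```

      With 2 nodes, even-numbered capabilities map to node 0 and odd-numbered
      ones to node 1, so consecutive capabilities alternate between sockets.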
      
      For example, using nofib/parallel/queens on a 24-core 2-socket machine:
      
      ```
      $ ./Main 15 +RTS -N24 -s -A64m
        Total   time  173.960s  (  7.467s elapsed)
      
      $ ./Main 15 +RTS -N24 -s -A64m --numa
        Total   time  150.836s  (  6.423s elapsed)
      ```
      
      The biggest win here is expected to be allocating from node-local
      memory, so that means programs using a large -A value (as here).
      
      According to perf, on this program the number of remote memory accesses
      was reduced by more than 50% by using `--numa`.
      
      Test Plan:
      * validate
      * There's a new flag --debug-numa=<n> that pretends to do NUMA without
        actually making the OS calls, which is useful for testing the code
        on non-NUMA systems.
      * TODO: I need to add some unit tests
      
      Reviewers: erikd, austin, rwbarton, ezyang, bgamari, hvr, niteria
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2199
  36. 04 Jun, 2016 1 commit
  37. 18 May, 2016 1 commit