  1. 29 Apr, 2017 2 commits
  2. 23 Apr, 2017 3 commits
  3. 05 Apr, 2017 1 commit
  4. 04 Apr, 2017 1 commit
  5. 02 Apr, 2017 1 commit
    • Report heap overflow in the same way as stack overflow · 61ba4518
      Simon Marlow authored
      Now that we throw an exception for heap overflow, we should only print
      the heap overflow message in the main thread when the HeapOverflow
      exception is caught, rather than as a side effect in the GC.
      
      Stack overflows were already handled this way; I just made heap overflow
      consistent with stack overflow and did some related cleanup.
      
      Fixes broken T2592(profasm) which was reporting the heap overflow
      message twice (you would only notice when building with profiling
      libs enabled).
      
      Test Plan: validate
      
      Reviewers: bgamari, niteria, austin, DemiMarie, hvr, erikd
      
      Reviewed By: bgamari
      
      Subscribers: rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3394
  6. 28 Feb, 2017 1 commit
    • Drop copy step from the rts/ghc.mk · 3e33d334
      Moritz Angermann authored
      Recently I've used a different build system for building the
      rts (Xcode).  And in doing so, I looked through the rts/ghc.mk
      to figure out how to build the rts.
      
      In general it's quite straightforward to just compile all the
      C files with the proper flags.
      
      However there is one rather awkward copy step that copies some
      files for special handling for the rts way.
      
      I'm wondering if the proposed solution in this diff is better
      or worse than the current situation?
      
      The idea is to keep the files, but use #includes to produce
      identical files with just an additional define. It does however
      produce empty objects for non-threaded ways.
      
      Reviewers: ezyang, bgamari, austin, erikd, simonmar, rwbarton
      
      Reviewed By: bgamari, simonmar, rwbarton
      
      Subscribers: rwbarton, thomie, snowleopard
      
      Differential Revision: https://phabricator.haskell.org/D3237
  7. 23 Feb, 2017 1 commit
  8. 07 Dec, 2016 2 commits
    • Fix crashes in hash table scanning with THREADED_RTS · 9043a400
      Simon Marlow authored
      See comments.
    • Overhaul of Compact Regions (#12455) · 7036fde9
      Simon Marlow authored
      Summary:
      This commit makes various improvements and addresses some issues with
      Compact Regions (aka Compact Normal Forms).
      
      Not blocking the GC was the most important thing I wanted to fix.  Compaction
      previously prevented GC from running until it was complete, which
      would be a problem in a multicore setting.  Now, we compact using a
      hand-written Cmm routine that can be interrupted at any point.  When a
      GC is triggered during a sharing-enabled compaction, the GC has to
      traverse and update the hash table, so this hash table is now stored
      in the StgCompactNFData object.
      
      Previously, compaction consisted of a deepseq using the NFData class,
      followed by a traversal in C code to copy the data.  This is now done
      in a single pass with hand-written Cmm (see rts/Compact.cmm). We no
      longer use the NFData instances; instead the Cmm routine evaluates
      components directly as it compacts.
      
      The new compaction is about 50% faster than the old one with no
      sharing, and a little faster on average with sharing (the cost of the
      hash table dominates when we're doing sharing).
      
      Static objects that don't (transitively) refer to any CAFs don't need
      to be copied into the compact region.  In particular this means we
      often avoid copying Char values and small Int values, because these
      are static closures in the runtime.
      
      Each Compact# object can support a single compactAdd# operation at any
      given time, so the Data.Compact library now enforces mutual exclusion
      using an MVar stored in the Compact object.
      
      We now get exceptions rather than killing everything with a barf()
      when we encounter an object that cannot be compacted (a function, or a
      mutable object).  We now also detect pinned objects, which can't be
      compacted either.
      
      The Data.Compact API has been refactored and cleaned up.  A new
      compactSize operation returns the size (in bytes) of the compact
      object.
      
      Most of the documentation is in the Haddock docs for the compact
      library, which I've expanded and improved here.
      
      Various comments in the code have been improved, especially the main
      Note [Compact Normal Forms] in rts/sm/CNF.c.
      
      I've added a few tests, and expanded a few of the tests that were
      there.  We now also run the tests with GHCi, and in a new test way
      that enables sanity checking (+RTS -DS).
      
      There's a benchmark in libraries/compact/tests/compact_bench.hs for
      measuring compaction speed and comparing sharing vs. no sharing.
      
      The field totalDataW in StgCompactNFData was unnecessary.
      
      Test Plan:
      * new unit tests
      * validate
      * tested manually that we can compact Data.Aeson data
      
      Reviewers: gcampax, bgamari, ezyang, austin, niteria, hvr, erikd
      
      Subscribers: thomie, simonpj
      
      Differential Revision: https://phabricator.haskell.org/D2751
      
      GHC Trac Issues: #12455
  9. 06 Dec, 2016 2 commits
    • Overhaul GC stats · 24e6594c
      Simon Marlow authored
      Summary:
      Visible API changes:
      
      * The C struct `GCDetails` gives the stats about a single GC.  This is
        passed to the `gcDone()` callback if one is set via the
        RtsConfig. (previously we just passed a collection of values, so this
        is more extensible, at the expense of breaking the existing API)
      
      * `RTSStats` gives cumulative stats since the start of the program,
        and includes the `GCDetails` for the most recent GC.  This struct
        can be obtained via `getRTSStats()` (the old `getGCStats()` has been
        removed, and `getGCStatsEnabled()` has been renamed to
        `getRTSStatsEnabled()`)
      
      Improvements:
      
      * The per-GC stats and cumulative stats are now cleanly separated.
      
      * Inside the RTS we have a top-level `RTSStats` struct to keep all our
        stats in; previously this was just a collection of strangely-named
        variables.  This struct is mostly just copied in `getRTSStats()`, so
        the implementation of that function is a lot shorter.
      
      * Types are more consistent.  We use a uint64_t byte count for all
        memory values, and Time for all time values.
      
      * Names are more consistent.  We use a suffix `_bytes` for all byte
        counts and `_ns` for all time values.
      
      * We now collect information about the amount of memory in large
        objects and compact objects in `GCDetails`. (the latter was the reason
        I started doing this patch but it seems to have ballooned a bit!)
      
      * I fixed a bug in the calculation of the elapsed MUT time, and added
        an ASSERT to stop the calculations going wrong in the future.
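
      A minimal C sketch of the layout described above.  The struct, field
      and function names (`DemoGCDetails`, `DemoRTSStats`, `DemoTime`,
      `demo_gc_done`) are hypothetical stand-ins, not the actual RTS
      definitions.  It only illustrates per-GC details kept separate from
      cumulative totals, `uint64_t` byte counts with a `_bytes` suffix, and
      nanosecond times with an `_ns` suffix.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Hypothetical stand-in for the RTS time type: nanoseconds. */
      typedef int64_t DemoTime;

      /* Stats for a single GC (illustrative subset only). */
      typedef struct {
          uint32_t gen;                  /* generation collected */
          uint64_t live_bytes;           /* live data after this GC */
          uint64_t large_objects_bytes;  /* memory held in large objects */
          uint64_t compact_bytes;        /* memory held in compact objects */
          DemoTime elapsed_ns;           /* wall-clock time of this GC */
      } DemoGCDetails;

      /* Cumulative stats since program start, plus the most recent GC. */
      typedef struct {
          uint32_t gcs;                  /* number of GCs so far */
          uint64_t max_live_bytes;       /* high-water mark of live data */
          DemoTime gc_elapsed_ns;        /* total wall-clock time in GC */
          DemoGCDetails gc;              /* details of the most recent GC */
      } DemoRTSStats;

      /* Fold one finished GC into the cumulative totals. */
      static void demo_gc_done(DemoRTSStats *stats, const DemoGCDetails *details)
      {
          assert(details->elapsed_ns >= 0);  /* catch broken time arithmetic */
          stats->gcs += 1;
          stats->gc_elapsed_ns += details->elapsed_ns;
          if (details->live_bytes > stats->max_live_bytes) {
              stats->max_live_bytes = details->live_bytes;
          }
          stats->gc = *details;              /* keep the latest per-GC record */
      }

      /* Usage: two collections folded into one cumulative record. */
      static void demo_stats_example(void)
      {
          DemoRTSStats stats = {0};
          DemoGCDetails first  = { .gen = 0, .live_bytes = 100, .elapsed_ns = 5 };
          DemoGCDetails second = { .gen = 1, .live_bytes = 50,  .elapsed_ns = 7 };

          demo_gc_done(&stats, &first);
          demo_gc_done(&stats, &second);

          assert(stats.gcs == 2);
          assert(stats.gc_elapsed_ns == 12);
          assert(stats.max_live_bytes == 100);
          assert(stats.gc.live_bytes == 50);
      }
      ```

      Passing one struct to the callback means new fields can be added
      later without changing the callback's signature.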
      
      For now I kept the Haskell API in `GHC.Stats` the same, by
      impedance-matching with the new API.  We could either break that API
      and make it match the C API more closely, or we could add a new API
      and deprecate the old one.  Opinions welcome.
      
      This stuff is very easy to get wrong, and it's hard to test.  Reviews
      welcome!
      
      Test Plan:
      manual testing
      validate
      
      Reviewers: bgamari, niteria, austin, ezyang, hvr, erikd, rwbarton, Phyx
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2756
    • Fix x86 Windows build and testsuite · b82f71b9
      Tamar Christina authored
      Summary:
      Fix issues preventing x86 GHC from building on Windows and
      fix a segfault in the testsuite.
      
      Test Plan: ./validate
      
      Reviewers: austin, erikd, simonmar, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: #ghc_windows_task_force, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2789
  10. 29 Nov, 2016 2 commits
  11. 16 Nov, 2016 1 commit
  12. 14 Nov, 2016 1 commit
    • Remove CONSTR_STATIC · 55d535da
      Simon Marlow authored
      Summary:
      We currently have two info tables for a constructor
      
      * XXX_con_info: the info table for a heap-resident instance of the
        constructor. It has type CONSTR, or one of the specialised types like
        CONSTR_1_0
      
      * XXX_static_info: the info table for a static instance of this
        constructor, which has type CONSTR_STATIC or CONSTR_STATIC_NOCAF.
      
      I'm getting rid of the latter, and using the `con_info` info table for
      both static and dynamic constructors.  For rationale and more details
      see Note [static constructors] in SMRep.hs.
      
      I also removed these macros: `isSTATIC()`, `ip_STATIC()`,
      `closure_STATIC()`, since they relied on the CONSTR/CONSTR_STATIC
      distinction, and anyway HEAP_ALLOCED() does the same job.
      
      Test Plan: validate
      
      Reviewers: bgamari, simonpj, austin, gcampax, hvr, niteria, erikd
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2690
      
      GHC Trac Issues: #12455
  13. 29 Oct, 2016 1 commit
    • Fix a bug in parallel GC synchronisation · 4e088b49
      Simon Marlow authored
      Summary:
      The problem boils down to global variables: in particular gc_threads[],
      which was being modified by a subsequent GC before the previous GC had
      finished with it.  The fix is to not use global variables.
      
      This was causing setnumcapabilities001 to fail (again!).  It's an old
      bug though.
      
      Test Plan:
      Ran setnumcapabilities001 in a loop for a couple of hours.  Before this
      patch it had been failing after a few minutes.  Not a very scientific
      test, but it's the best I have.
      
      Reviewers: bgamari, austin, fryguybob, niteria, erikd
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2654
  14. 09 Oct, 2016 1 commit
    • Turn on -n4m with -A16m or greater · 85e81a85
      Simon Marlow authored
      Nursery chunks help reduce the cost of GC when capabilities are unevenly
      loaded, by ensuring that we use more of the available nursery.
      
      The rationale for enabling this at -A16m is that any negative effects
      due to loss of cache locality are less likely to be an issue at -A16m
      and above.  It's a conservative guess.  If we had a lot of benchmark
      data we could probably do better.
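
      The rule can be sketched in C as below.  This is an illustration
      only: `demo_default_chunk_bytes` and `DEMO_MB` are hypothetical
      names, and treating a non-zero user-supplied chunk size as
      "leave it alone" is an assumption rather than something stated in
      this message.

      ```c
      #include <assert.h>
      #include <stddef.h>

      #define DEMO_MB ((size_t)1024 * 1024)

      /* Pick the nursery chunk size (-n) from the allocation area size (-A).
       * Assumption: a non-zero user_chunk_bytes means the user chose a chunk
       * size explicitly, so it is returned unchanged.  Otherwise 4 MB chunks
       * are enabled once the allocation area reaches 16 MB. */
      static size_t demo_default_chunk_bytes(size_t alloc_area_bytes,
                                             size_t user_chunk_bytes)
      {
          if (user_chunk_bytes != 0) {
              return user_chunk_bytes;
          }
          if (alloc_area_bytes >= 16 * DEMO_MB) {
              return 4 * DEMO_MB;
          }
          return 0; /* 0 here means "no nursery chunks" */
      }

      /* Usage: the threshold is inclusive at 16 MB. */
      static void demo_chunk_example(void)
      {
          assert(demo_default_chunk_bytes(8 * DEMO_MB, 0) == 0);
          assert(demo_default_chunk_bytes(16 * DEMO_MB, 0) == 4 * DEMO_MB);
          assert(demo_default_chunk_bytes(32 * DEMO_MB, 0) == 4 * DEMO_MB);
          assert(demo_default_chunk_bytes(32 * DEMO_MB, DEMO_MB) == DEMO_MB);
      }
      ```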
      
      Results for nofib/parallel at -N4 -A32m with and without -n4m:
      
      ```
      ------------------------------------------------------------------------
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      ------------------------------------------------------------------------
         blackscholes           0.0%     -9.5%     -9.0%    -15.0%     -2.2%
                coins           0.0%     -4.7%     -3.6%     -0.6%    -13.6%
               mandel           0.0%     -0.3%     +7.7%    +13.1%     +0.1%
              matmult           0.0%     +1.5%    +10.0%     +7.7%     +0.1%
                nbody           0.0%     -4.1%     -2.9%     0.085      0.0%
               parfib           0.0%     -1.4%     +1.0%     +1.5%     +0.2%
              partree           0.0%     -0.3%     +0.8%     +2.9%     -0.8%
                 prsa           0.0%     -0.5%     -2.1%     -7.6%      0.0%
               queens           0.0%     -3.2%     -1.4%     +2.2%     +1.3%
                  ray           0.0%     -5.6%    -14.5%     -7.6%     +0.8%
             sumeuler           0.0%     -0.4%     +2.4%     +1.1%      0.0%
      ------------------------------------------------------------------------
                  Min           0.0%     -9.5%    -14.5%    -15.0%    -13.6%
                  Max           0.0%     +1.5%    +10.0%    +13.1%     +1.3%
       Geometric Mean          +0.0%     -2.6%     -1.3%     -0.5%     -1.4%
      ```
      
      Not conclusive, but slightly better.  This matters a lot more when you
      have more cores.
      
      Test Plan: validate, nofib/parallel
      
      Reviewers: niteria, ezyang, nh2, trofi, austin, erikd, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2581
      
      GHC Trac Issues: #9221
  15. 09 Sep, 2016 1 commit
    • Make start address of `osReserveHeapMemory` tunable via command line -xb · 1b5f9207
      bitonic authored
      Summary:
      We stumbled upon a case where an external library (OpenCL) does not work
      if a specific address (0x200000000) is taken.
      
      It so happens that `osReserveHeapMemory` starts trying to mmap at 0x200000000:
      
      ```
              void *hint = (void*)((W_)8 * (1 << 30) + attempt * BLOCK_SIZE);
              at = osTryReserveHeapMemory(*len, hint);
      ```
      
      This makes it impossible to use Haskell programs compiled with GHC 8
      with C functions that use OpenCL.
      
      See this example https://github.com/chpatrick/oclwtf for a repro.
      
      This patch allows the user to work around this kind of behavior outside
      our control by letting the user override the starting address through an
      RTS command line flag.
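
      A small C sketch of the address arithmetic, to show why the default
      hint lands on 0x200000000 and how an overridden base would shift it.
      `demo_heap_hint`, `DEMO_BLOCK_SIZE` (4096 here) and the convention
      that a base of 0 means "not overridden" are assumptions made for
      illustration; flag parsing is not shown.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Assumed block size for illustration; the real value comes from the RTS. */
      #define DEMO_BLOCK_SIZE ((uint64_t)4096)

      /* Default starting address: 8 * 2^30 = 2^33 = 0x200000000 (8 GiB). */
      #define DEMO_DEFAULT_HEAP_BASE ((uint64_t)8 * ((uint64_t)1 << 30))

      /* Address to try on a given attempt.
       * Assumption: user_base == 0 means the user did not override the base. */
      static uint64_t demo_heap_hint(uint64_t user_base, uint64_t attempt)
      {
          uint64_t base = (user_base != 0) ? user_base : DEMO_DEFAULT_HEAP_BASE;
          return base + attempt * DEMO_BLOCK_SIZE;
      }

      /* Usage: default base versus a user-chosen base of 16 GiB. */
      static void demo_heap_hint_example(void)
      {
          assert(demo_heap_hint(0, 0) == UINT64_C(0x200000000));
          assert(demo_heap_hint(0, 1) == UINT64_C(0x200001000));
          assert(demo_heap_hint(UINT64_C(0x400000000), 2) == UINT64_C(0x400002000));
      }
      ```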
      
      Reviewers: bgamari, Phyx, simonmar, erikd, austin
      
      Reviewed By: Phyx, simonmar
      
      Subscribers: rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2513
  16. 16 Aug, 2016 1 commit
  17. 15 Aug, 2016 1 commit
  18. 14 Aug, 2016 1 commit
  19. 03 Aug, 2016 1 commit
  20. 27 Jul, 2016 1 commit
  21. 22 Jul, 2016 1 commit
    • Fix the non-Linux build · d068220f
      Erik de Castro Lopo authored
      Summary:
      The recent Compact Regions commit (cf989ffe) builds fine on Linux
      but doesn't build on OS X or Windows.
      
      * rts/sm/CNF.c: Drop un-needed #includes.
      * Fix parenthesis usage with CPP ASSERT macro.
      * Fix format string in debugBelch messages.
      * Use stg_max() instead of a hand-rolled inline max() function.
      
      Test Plan: Build on Linux, OS X and Windows
      
      Reviewers: gcampax, simonmar, austin, bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2421
  22. 20 Jul, 2016 1 commit
    • Compact Regions · cf989ffe
      gcampax authored
      This brings in initial support for compact regions, as described in the
      ICFP 2015 paper "Efficient Communication and Collection with Compact
      Normal Forms" (Edward Z. Yang et al.) and implemented by Giovanni
      Campagna.
      
      Some things may change before the 8.2 release, but I (Simon M.) wanted
      to get the main patch committed so that we can iterate.
      
      What documentation there is is in the Data.Compact module in the new
      compact package.  We'll need to extend and polish the documentation
      before the release.
      
      Test Plan:
      validate
      (new test cases included)
      
      Reviewers: ezyang, simonmar, hvr, bgamari, austin
      
      Subscribers: vikraman, Yuras, RyanGlScott, qnikst, mboes, facundominguez, rrnewton, thomie, erikd
      
      Differential Revision: https://phabricator.haskell.org/D1264
      
      GHC Trac Issues: #11493
  23. 17 Jun, 2016 1 commit
    • NUMA cleanups · 498ed266
      Simon Marlow authored
      - Move the numaMap and nNumaNodes out of RtsFlags to Capability.c
      - Add a test to tests/rts
  24. 10 Jun, 2016 2 commits
    • Rts flags cleanup · c88f31a0
      Simon Marlow authored
      * Remove unused/old flags from the structs
      * Update old comments
      * Add missing flags to GHC.RTS
      * Simplify GHC.RTS, remove C code and use hsc2hs instead
      * Make ParFlags unconditional, and add support to GHC.RTS
    • NUMA support · 9e5ea67e
      Simon Marlow authored
      Summary:
      The aim here is to reduce the number of remote memory accesses on
      systems with a NUMA memory architecture, typically multi-socket servers.
      
      Linux provides a NUMA API for doing two things:
      * Allocating memory local to a particular node
      * Binding a thread to a particular node
      
      When given the +RTS --numa flag, the runtime will
      * Determine the number of NUMA nodes (N) by querying the OS
      * Assign capabilities to nodes, so cap C is on node C%N
      * Bind worker threads on a capability to the correct node
      * Keep separate free lists in the block layer for each node
      * Allocate the nursery for a capability from node-local memory
      * Allocate blocks in the GC from node-local memory
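
      The capability-to-node assignment (cap C on node C%N) can be
      sketched in C as follows.  The function names are hypothetical and
      the sketch makes no OS calls; it only shows that the modulo rule
      spreads capabilities round-robin across nodes.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Node for capability `cap` when there are `n_nodes` NUMA nodes. */
      static uint32_t demo_cap_to_node(uint32_t cap, uint32_t n_nodes)
      {
          assert(n_nodes > 0);
          return cap % n_nodes;
      }

      /* Count how many of the first `n_caps` capabilities land on `node`. */
      static uint32_t demo_caps_on_node(uint32_t n_caps, uint32_t n_nodes,
                                        uint32_t node)
      {
          uint32_t count = 0;
          for (uint32_t cap = 0; cap < n_caps; cap++) {
              if (demo_cap_to_node(cap, n_nodes) == node) {
                  count++;
              }
          }
          return count;
      }

      /* Usage: 24 capabilities on 2 nodes split evenly, 12 per node. */
      static void demo_numa_example(void)
      {
          assert(demo_cap_to_node(0, 2) == 0);
          assert(demo_cap_to_node(1, 2) == 1);
          assert(demo_cap_to_node(23, 2) == 1);
          assert(demo_caps_on_node(24, 2, 0) == 12);
          assert(demo_caps_on_node(24, 2, 1) == 12);
      }
      ```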
      
      For example, using nofib/parallel/queens on a 24-core 2-socket machine:
      
      ```
      $ ./Main 15 +RTS -N24 -s -A64m
        Total   time  173.960s  (  7.467s elapsed)
      
      $ ./Main 15 +RTS -N24 -s -A64m --numa
        Total   time  150.836s  (  6.423s elapsed)
      ```
      
      The biggest win here is expected to be allocating from node-local
      memory, so that means programs using a large -A value (as here).
      
      According to perf, on this program the number of remote memory accesses
      was reduced by more than 50% by using `--numa`.
      
      Test Plan:
      * validate
      * There's a new flag --debug-numa=<n> that pretends to do NUMA without
        actually making the OS calls, which is useful for testing the code
        on non-NUMA systems.
      * TODO: I need to add some unit tests
      
      Reviewers: erikd, austin, rwbarton, ezyang, bgamari, hvr, niteria
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2199
  25. 24 May, 2016 1 commit
    • Runtime linker: Break m32 allocator out into its own file · fe8a4e5d
      Erik de Castro Lopo authored
      This makes the code a little more modular and allows the removal of some
      CPP hackery. By providing dummy implementations of the `m32_*`
      functions (which simply call `errorBelch`) it means that the call sites
      for these functions are syntax checked even when `RTS_LINKER_USE_MMAP`
      is `0`.
      
      Also changes some size parameter types from `unsigned int` to `size_t`.
      
      Test Plan: Validate on Linux, OS X and Windows
      
      Reviewers: Phyx, hsyl20, bgamari, simonmar, austin
      
      Reviewed By: simonmar, austin
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2237
  26. 19 May, 2016 2 commits
  27. 17 May, 2016 1 commit
    • rts: More const correct-ness fixes · 33c029dd
      Erik de Castro Lopo authored
      In addition to more const-correctness fixes this patch fixes an
      infelicity of the previous const-correctness patch (995cf0f3) which
      left `UNTAG_CLOSURE` taking a `const StgClosure` pointer parameter
      but returning a non-const pointer. Here we restore the original type
      signature of `UNTAG_CLOSURE` and add a new function
      `UNTAG_CONST_CLOSURE` which takes and returns a const `StgClosure`
      pointer and uses that wherever possible.
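
      The const/non-const pairing can be sketched in C as below.
      `DemoClosure`, `DEMO_TAG_MASK` (low three bits, an assumption) and
      the `demo_*` functions are hypothetical stand-ins for the RTS types
      and macros; the point is only that each variant returns a pointer
      with the same constness it was given.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Hypothetical closure type, aligned so the low three bits are free. */
      typedef struct DemoClosure {
          _Alignas(8) int payload;
      } DemoClosure;

      /* Assumed for illustration: the tag lives in the low three bits. */
      #define DEMO_TAG_MASK ((uintptr_t)7)

      /* Non-const in, non-const out. */
      static inline DemoClosure *demo_untag_closure(DemoClosure *p)
      {
          return (DemoClosure *)((uintptr_t)p & ~DEMO_TAG_MASK);
      }

      /* Const in, const out: the caller cannot lose constness by untagging. */
      static inline const DemoClosure *
      demo_untag_const_closure(const DemoClosure *p)
      {
          return (const DemoClosure *)((uintptr_t)p & ~DEMO_TAG_MASK);
      }

      /* Usage: tag a pointer with 1, then recover the original address. */
      static void demo_untag_example(void)
      {
          DemoClosure c = { 42 };
          DemoClosure *tagged = (DemoClosure *)((uintptr_t)&c | 1);
          const DemoClosure *const_tagged = tagged;

          assert(demo_untag_closure(tagged) == &c);
          assert(demo_untag_const_closure(const_tagged) == &c);
          assert(demo_untag_const_closure(const_tagged)->payload == 42);
      }
      ```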
      
      Test Plan: Validate on Linux, OS X and Windows
      
      Reviewers: Phyx, hsyl20, bgamari, austin, simonmar, trofi
      
      Reviewed By: simonmar, trofi
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2231
  28. 12 May, 2016 2 commits
  29. 11 May, 2016 1 commit
    • Handle promotion failures when scavenging a WEAK (#11108) · 9363f04d
      takano-akio authored
      Previously, we ignored promotion failures when evacuating fields of
      a WEAK object. When a failure happened, this resulted in a WEAK object
      pointing to another object in a younger generation, causing crashes.
      
      I used the test case from #11746 to check that the fix is working.
      However I haven't managed to produce a test case that quickly reproduces
      the issue.
      
      Test Plan: ./validate
      
      Reviewers: austin, bgamari, simonmar
      
      Reviewed By: simonmar
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2189
      
      GHC Trac Issues: #11108
  30. 10 May, 2016 1 commit
    • Use stdint types for Stg{Word,Int}{8,16,32,64} · 260a5648
      wereHamster authored
      We can't define Stg{Int,Word} in terms of {,u}intptr_t because STG
      depends on them being the exact same size as void*, and {,u}intptr_t
      does not make that guarantee. Furthermore, we also need to define
      StgHalf{Int,Word}, so the preprocessor `#if` needs to stay. But we can at
      least keep it in a single place instead of repeating it in various
      files.
      
      Also define STG_{INT,WORD}{8,16,32,64}_{MIN,MAX} and use them in HsFFI.h,
      further reducing the need for CPP in other files.
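
      A rough C sketch of the idea, using hypothetical `Demo*` names rather
      than the real Stg definitions.  It assumes the pointer size is
      supplied as `DEMO_SIZEOF_VOID_P` by the build, falling back to the
      GCC/Clang predefined `__SIZEOF_POINTER__`; other compilers would
      need to provide the value themselves.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Fixed-width types map directly onto <stdint.h>. */
      typedef int8_t   DemoInt8;
      typedef uint8_t  DemoWord8;
      typedef int64_t  DemoInt64;
      typedef uint64_t DemoWord64;

      /* Limits are defined once, next to the types. */
      #define DEMO_WORD8_MAX UINT8_MAX
      #define DEMO_INT64_MIN INT64_MIN

      /* Assumption: pointer size comes from the build configuration. */
      #ifndef DEMO_SIZEOF_VOID_P
      #define DEMO_SIZEOF_VOID_P __SIZEOF_POINTER__
      #endif

      /* The one remaining preprocessor conditional: the native word and
       * half-word are picked from the pointer size, not from uintptr_t. */
      #if DEMO_SIZEOF_VOID_P == 8
      typedef int64_t  DemoInt;
      typedef uint64_t DemoWord;
      typedef int32_t  DemoHalfInt;
      typedef uint32_t DemoHalfWord;
      #elif DEMO_SIZEOF_VOID_P == 4
      typedef int32_t  DemoInt;
      typedef uint32_t DemoWord;
      typedef int16_t  DemoHalfInt;
      typedef uint16_t DemoHalfWord;
      #else
      #error "unsupported pointer size"
      #endif

      /* The word type must be exactly as wide as a pointer. */
      _Static_assert(sizeof(DemoWord) == sizeof(void *),
                     "DemoWord must match the size of void*");

      /* Usage: basic size and limit relationships. */
      static void demo_types_example(void)
      {
          assert(sizeof(DemoInt) == sizeof(DemoWord));
          assert(2 * sizeof(DemoHalfWord) == sizeof(DemoWord));
          assert(DEMO_WORD8_MAX == 255);
          assert(DEMO_INT64_MIN < 0);
      }
      ```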
      
      Reviewers: austin, bgamari, simonmar, hvr, erikd
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2182
  31. 04 May, 2016 1 commit