This project is mirrored from https://gitlab.haskell.org/ghc/ghc.git. Pull mirroring failed .
Repository mirroring has been paused due to too many failed attempts, and can be resumed by a project maintainer.
Last successful update .
  1. 18 Oct, 2019 2 commits
    • Ben Gamari's avatar
      rts: Implement concurrent collection in the nonmoving collector · d7017446
      Ben Gamari authored
      This extends the non-moving collector to allow concurrent collection.
      
      The full design of the collector implemented here is described in detail
      in a technical note
      
          B. Gamari. "A Concurrent Garbage Collector For the Glasgow Haskell
          Compiler" (2018)
      
      This extension involves the introduction of a capability-local
      remembered set, known as the /update remembered set/, which tracks
      objects which may no longer be visible to the collector due to mutation.
      To maintain this remembered set we introduce a write barrier on
      mutations which is enabled while a concurrent mark is underway.
      
      The update remembered set representation is similar to that of the
      nonmoving mark queue, being a chunked array of `MarkEntry`s. Each
      `Capability` maintains a single accumulator chunk, which it flushed
      when it (a) is filled, or (b) when the nonmoving collector enters its
      post-mark synchronization phase.
      
      While the write barrier touches a significant amount of code it is
      conceptually straightforward: the mutator must ensure that the referee
      of any pointer it overwrites is added to the update remembered set.
      However, there are a few details:
      
       * In the case of objects with a dirty flag (e.g. `MVar`s) we can
         exploit the fact that only the *first* mutation requires a write
         barrier.
      
       * Weak references, as usual, complicate things. In particular, we must
         ensure that the referee of a weak object is marked if dereferenced by
         the mutator. For this we (unfortunately) must introduce a read
         barrier, as described in Note [Concurrent read barrier on deRefWeak#]
         (in `NonMovingMark.c`).
      
       * Stable names are also a bit tricky as described in Note [Sweeping
         stable names in the concurrent collector] (`NonMovingSweep.c`).
      
      We take quite some pains to ensure that the high thread count often seen
      in parallel Haskell applications doesn't affect pause times. To this end
      we allow thread stacks to be marked either by the thread itself (when it
      is executed or stack-underflows) or the concurrent mark thread (if the
      thread owning the stack is never scheduled). There is a non-trivial
      handshake to ensure that this happens without racing which is described
      in Note [StgStack dirtiness flags and concurrent marking].
      Co-Authored-by: Ömer Sinan Ağacan's avatarÖmer Sinan Ağacan <omer@well-typed.com>
      d7017446
    • Ömer Sinan Ağacan's avatar
      rts/GC: Add an obvious assertion during block initialization · 697be2b6
      Ömer Sinan Ağacan authored
      Namely ensure that block descriptors are initialized with valid
      generation numbers.
      Co-Authored-By: Ben Gamari's avatarBen Gamari <ben@well-typed.com>
      697be2b6
  2. 27 Jun, 2019 1 commit
    • Matthew Pickering's avatar
      rts: Correct handling of LARGE ARR_WORDS in LDV profiler · a586b33f
      Matthew Pickering authored
      This implements the correct fix for #11627 by skipping over the slop
      (which is zeroed) rather than adding special case logic for LARGE
      ARR_WORDS which runs the risk of not performing a correct census by
      ignoring any subsequent blocks.
      
      This approach implements similar logic to that in Sanity.c
      a586b33f
  3. 02 Apr, 2019 1 commit
    • Michal Terepeta's avatar
      Improve performance of newSmallArray# · 7cf5ba3d
      Michal Terepeta authored
      This:
      - Hoists part of the condition outside of the initialization loop in
        `stg_newSmallArrayzh`.
      - Annotates one of the unlikely branches as unlikely, also in
        `stg_newSmallArrayzh`.
      - Adds a couple of annotations to `allocateMightFail` indicating which
        branches are likely to be taken.
      
      Together this gives about 5% improvement.
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      7cf5ba3d
  4. 25 Mar, 2019 1 commit
    • Takenobu Tani's avatar
      Update Wiki URLs to point to GitLab · 3769e3a8
      Takenobu Tani authored
      This moves all URL references to Trac Wiki to their corresponding
      GitLab counterparts.
      
      This substitution is classified as follows:
      
      1. Automated substitution using sed with Ben's mapping rule [1]
          Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...
          New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...
      
      2. Manual substitution for URLs containing `#` index
          Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...#Zzz
          New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...#zzz
      
      3. Manual substitution for strings starting with `Commentary`
          Old: Commentary/XxxYyy...
          New: commentary/xxx-yyy...
      
      See also !539
      
      [1]: https://gitlab.haskell.org/bgamari/gitlab-migration/blob/master/wiki-mapping.json
      3769e3a8
  5. 29 Aug, 2018 1 commit
    • David Feuer's avatar
      Finish stable split · f48e276a
      David Feuer authored
      Long ago, the stable name table and stable pointer tables were one.
      Now, they are separate, and have significantly different
      implementations. I believe the time has come to finish the split
      that began in #7674.
      
      * Divide `rts/Stable` into `rts/StableName` and `rts/StablePtr`.
      
      * Give each table its own mutex.
      
      * Add FFI functions `hs_lock_stable_ptr_table` and
      `hs_unlock_stable_ptr_table` and document them.
        These are intended to replace the previously undocumented
      `hs_lock_stable_tables` and `hs_lock_stable_tables`,
        which are now documented as deprecated synonyms.
      
      * Make `eqStableName#` use pointer equality instead of unnecessarily
      comparing stable name table indices.
      
      Reviewers: simonmar, bgamari, erikd
      
      Reviewed By: bgamari
      
      Subscribers: rwbarton, carter
      
      GHC Trac Issues: #15555
      
      Differential Revision: https://phabricator.haskell.org/D5084
      f48e276a
  6. 20 May, 2018 1 commit
    • patrickdoc's avatar
      Add HeapView functionality · ec22f7dd
      patrickdoc authored
      This pulls parts of Joachim Breitner's ghc-heap-view library inside GHC.
      The bits added are the C hooks into the RTS and a basic Haskell wrapper
      to these C hooks. The main reason for these to be added to GHC proper
      is that the code needs to be kept in sync with the closure types
      defined by the RTS. It is expected that the version of HeapView shipped
      with GHC will always work with that version of GHC and that extra
      functionality can be layered on top with a library like ghc-heap-view
      distributed via Hackage.
      
      Test Plan: validate
      
      Reviewers: simonmar, hvr, nomeata, austin, Phyx, bgamari, erikd
      
      Reviewed By: bgamari
      
      Subscribers: carter, patrickdoc, tmcgilchrist, rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3055
      ec22f7dd
  7. 16 Aug, 2017 1 commit
    • Ben Gamari's avatar
      Speed up compilation of profiling stubs · a8da0de2
      Ben Gamari authored
      Here we encode the cost centre list as static data. This means that the
      initialization stubs are small functions which should be easy for GCC to
      compile, even with optimization.
      
      Fixes #7960.
      
      Test Plan: Test profiling
      
      Reviewers: austin, erikd, simonmar
      
      Reviewed By: simonmar
      
      Subscribers: rwbarton, thomie
      
      GHC Trac Issues: #7960
      
      Differential Revision: https://phabricator.haskell.org/D3853
      a8da0de2
  8. 27 Jul, 2017 1 commit
    • Andreas Klebinger's avatar
      Initialize hs_init with UTF8 encoded arguments on Windows. · 7af0b906
      Andreas Klebinger authored
      Summary:
      Get utf8 encoded arguments before we call hs_init and use them
      instead of ignoring hs_init arguments. This reduces differing
      behaviour of the RTS between windows and linux and simplifies
      the code involved.
      
      A few testcases were changed to expect the same result on windows
      as on linux after the changes.
      
      This fixes #13940.
      
      Test Plan: ./validate
      
      Reviewers: austin, hvr, bgamari, erikd, simonmar, Phyx
      
      Subscribers: Phyx, rwbarton, thomie
      
      GHC Trac Issues: #13940
      
      Differential Revision: https://phabricator.haskell.org/D3739
      7af0b906
  9. 29 Apr, 2017 1 commit
  10. 23 Apr, 2017 1 commit
  11. 02 Apr, 2017 1 commit
    • Simon Marlow's avatar
      Report heap overflow in the same way as stack overflow · 61ba4518
      Simon Marlow authored
      Now that we throw an exception for heap overflow, we should only print
      the heap overflow message in the main thread when the HeapOverflow
      exception is caught, rather than as a side effect in the GC.
      
      Stack overflows were already done this way, I just made heap overflow
      consistent with stack overflow, and did some related cleanup.
      
      Fixes broken T2592(profasm) which was reporting the heap overflow
      message twice (you would only notice when building with profiling
      libs enabled).
      
      Test Plan: validate
      
      Reviewers: bgamari, niteria, austin, DemiMarie, hvr, erikd
      
      Reviewed By: bgamari
      
      Subscribers: rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3394
      61ba4518
  12. 07 Dec, 2016 1 commit
  13. 06 Dec, 2016 1 commit
    • Simon Marlow's avatar
      Overhaul GC stats · 24e6594c
      Simon Marlow authored
      Summary:
      Visible API changes:
      
      * The C struct `GCDetails` gives the stats about a single GC.  This is
        passed to the `gcDone()` callback if one is set via the
        RtsConfig. (previously we just passed a collection of values, so this
        is more extensible, at the expense of breaking the existing API)
      
      * `RTSStats` gives cumulative stats since the start of the program,
        and includes the `GCDetails` for the most recent GC.  This struct
        can be obtained via `getRTSStats()` (the old `getGCStats()` has been
        removed, and `getGCStatsEnabled()` has been renamed to
        `getRTSStatsEnabled()`)
      
      Improvements:
      
      * The per-GC stats and cumulative stats are now cleanly separated.
      
      * Inside the RTS we have a top-level `RTSStats` struct to keep all our
        stats in, previously this was just a collection of strangely-named
        variables.  This struct is mostly just copied in `getRTSStats()`, so
        the implementation of that function is a lot shorter.
      
      * Types are more consistent.  We use a uint64_t byte count for all
        memory values, and Time for all time values.
      
      * Names are more consistent.  We use a suffix `_bytes` for all byte
        counts and `_ns` for all time values.
      
      * We now collect information about the amount of memory in large
        objects and compact objects in `GCDetails`. (the latter was the reason
        I started doing this patch but it seems to have ballooned a bit!)
      
      * I fixed a bug in the calculation of the elapsed MUT time, and added
        an ASSERT to stop the calculations going wrong in the future.
      
      For now I kept the Haskell API in `GHC.Stats` the same, by
      impedence-matching with the new API.  We could either break that API
      and make it match the C API more closely, or we could add a new API
      and deprecate the old one.  Opinions welcome.
      
      This stuff is very easy to get wrong, and it's hard to test.  Reviews
      welcome!
      
      Test Plan:
      manual testing
      validate
      
      Reviewers: bgamari, niteria, austin, ezyang, hvr, erikd, rwbarton, Phyx
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2756
      24e6594c
  14. 15 Aug, 2016 1 commit
  15. 10 Jun, 2016 1 commit
    • Simon Marlow's avatar
      NUMA support · 9e5ea67e
      Simon Marlow authored
      Summary:
      The aim here is to reduce the number of remote memory accesses on
      systems with a NUMA memory architecture, typically multi-socket servers.
      
      Linux provides a NUMA API for doing two things:
      * Allocating memory local to a particular node
      * Binding a thread to a particular node
      
      When given the +RTS --numa flag, the runtime will
      * Determine the number of NUMA nodes (N) by querying the OS
      * Assign capabilities to nodes, so cap C is on node C%N
      * Bind worker threads on a capability to the correct node
      * Keep a separate free lists in the block layer for each node
      * Allocate the nursery for a capability from node-local memory
      * Allocate blocks in the GC from node-local memory
      
      For example, using nofib/parallel/queens on a 24-core 2-socket machine:
      
      ```
      $ ./Main 15 +RTS -N24 -s -A64m
        Total   time  173.960s  (  7.467s elapsed)
      
      $ ./Main 15 +RTS -N24 -s -A64m --numa
        Total   time  150.836s  (  6.423s elapsed)
      ```
      
      The biggest win here is expected to be allocating from node-local
      memory, so that means programs using a large -A value (as here).
      
      According to perf, on this program the number of remote memory accesses
      were reduced by more than 50% by using `--numa`.
      
      Test Plan:
      * validate
      * There's a new flag --debug-numa=<n> that pretends to do NUMA without
        actually making the OS calls, which is useful for testing the code
        on non-NUMA systems.
      * TODO: I need to add some unit tests
      
      Reviewers: erikd, austin, rwbarton, ezyang, bgamari, hvr, niteria
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2199
      9e5ea67e
  16. 18 Apr, 2016 1 commit
  17. 23 Nov, 2015 1 commit
  18. 17 Oct, 2015 1 commit
    • Ben Gamari's avatar
      Libdw: Add libdw-based stack unwinding · a6a3dabc
      Ben Gamari authored
      This adds basic support to the RTS for DWARF-assisted unwinding of the
      Haskell and C stack via libdw. This only adds the infrastructure;
      consumers of this functionality will be introduced in future diffs.
      
      Currently we are carrying the initial register collection code in
      Libdw.c but this will eventually make its way upstream to libdw.
      
      Test Plan: See future patches
      
      Reviewers: Tarrasch, scpmw, austin, simonmar
      
      Reviewed By: austin, simonmar
      
      Subscribers: simonmar, thomie, erikd
      
      Differential Revision: https://phabricator.haskell.org/D1196
      
      GHC Trac Issues: #10656
      a6a3dabc
  19. 10 Apr, 2015 1 commit
  20. 07 Apr, 2015 1 commit
  21. 10 Dec, 2014 1 commit
  22. 20 Oct, 2014 1 commit
  23. 02 Oct, 2014 1 commit
    • Edward Z. Yang's avatar
      Rename _closure to _static_closure, apply naming consistently. · 35672072
      Edward Z. Yang authored
      Summary:
      In preparation for indirecting all references to closures,
      we rename _closure to _static_closure to ensure any old code
      will get an undefined symbol error.  In order to reference
      a closure foobar_closure (which is now undefined), you should instead
      use STATIC_CLOSURE(foobar).  For convenience, a number of these
      old identifiers are macro'd.
      
      Across C-- and C (Windows and otherwise), there were differing
      conventions on whether or not foobar_closure or &foobar_closure
      was the address of the closure.  Now, all foobar_closure references
      are addresses, and no & is necessary.
      
      CHARLIKE/INTLIKE were not changed, simply alpha-renamed.
      
      Part of remove HEAP_ALLOCED patch set (#8199)
      
      Depends on D265
      Signed-off-by: Edward Z. Yang's avatarEdward Z. Yang <ezyang@mit.edu>
      
      Test Plan: validate
      
      Reviewers: simonmar, austin
      
      Subscribers: simonmar, ezyang, carter, thomie
      
      Differential Revision: https://phabricator.haskell.org/D267
      
      GHC Trac Issues: #8199
      35672072
  24. 20 Aug, 2014 1 commit
  25. 25 Oct, 2013 1 commit
  26. 11 Oct, 2013 1 commit
  27. 01 Oct, 2013 1 commit
  28. 08 Sep, 2013 2 commits
  29. 15 Jun, 2013 1 commit
  30. 05 Feb, 2013 1 commit
  31. 17 Jan, 2013 1 commit
  32. 01 Nov, 2012 1 commit
  33. 08 Oct, 2012 1 commit
    • Simon Marlow's avatar
      Produce new-style Cmm from the Cmm parser · a7c0387d
      Simon Marlow authored
      The main change here is that the Cmm parser now allows high-level cmm
      code with argument-passing and function calls.  For example:
      
      foo ( gcptr a, bits32 b )
      {
        if (b > 0) {
           // we can make tail calls passing arguments:
           jump stg_ap_0_fast(a);
        }
      
        return (x,y);
      }
      
      More details on the new cmm syntax are in Note [Syntax of .cmm files]
      in CmmParse.y.
      
      The old syntax is still more-or-less supported for those occasional
      code fragments that really need to explicitly manipulate the stack.
      However there are a couple of differences: it is now obligatory to
      give a list of live GlobalRegs on every jump, e.g.
      
        jump %ENTRY_CODE(Sp(0)) [R1];
      
      Again, more details in Note [Syntax of .cmm files].
      
      I have rewritten most of the .cmm files in the RTS into the new
      syntax, except for AutoApply.cmm which is generated by the genapply
      program: this file could be generated in the new syntax instead and
      would probably be better off for it, but I ran out of enthusiasm.
      
      Some other changes in this batch:
      
       - The PrimOp calling convention is gone, primops now use the ordinary
         NativeNodeCall convention.  This means that primops and "foreign
         import prim" code must be written in high-level cmm, but they can
         now take more than 10 arguments.
      
       - CmmSink now does constant-folding (should fix #7219)
      
       - .cmm files now go through the cmmPipeline, and as a result we
         generate better code in many cases.  All the object files generated
         for the RTS .cmm files are now smaller.  Performance should be
         better too, but I haven't measured it yet.
      
       - RET_DYN frames are removed from the RTS, lots of code goes away
      
       - we now have some more canned GC points to cover unboxed-tuples with
         2-4 pointers, which will reduce code size a little.
      a7c0387d
  34. 31 Jul, 2012 1 commit
  35. 27 Apr, 2012 1 commit
  36. 26 Apr, 2012 3 commits