1. 22 Feb, 2019 40 commits
    • Ömer Sinan Ağacan's avatar
      Remove origin arguments from write barrier functions · 6698bbad
      Ömer Sinan Ağacan authored and Ben Gamari's avatar Ben Gamari committed
      Write barriers push old values of updated field, e.g. when we have
      
          *q = p;
      
      we do
      
          updateRemembSetPushClosure(*q, q);
          *q = p;
      
      Here the second argument ("origin") is not useful because by the time we
      do the update we invalidate "origin" (`q` is no longer origin of old
      `*q`).
      
      In general it doesn't make sense to record origins in write barriers so
      we remove all origin arguments from write barriers.
      
      Fixes #170.
      6698bbad
    • Ben Gamari's avatar
      rts: Write barriers for message mutations · a4f7fc5e
      Ben Gamari authored
      a4f7fc5e
    • Ben Gamari's avatar
      a7020f94
    • Ömer Sinan Ağacan's avatar
      3582ead9
    • Ömer Sinan Ağacan's avatar
      Remove upd_rem_set_syncd · bb47beb4
      Ömer Sinan Ağacan authored and Ben Gamari's avatar Ben Gamari committed
      bb47beb4
    • Ömer Sinan Ağacan's avatar
      Fix incorrect flush calls · 505feaab
      Ömer Sinan Ağacan authored and Ben Gamari's avatar Ben Gamari committed
      These flush calls cause capabilities to flush their UpdRemSets too
      early, then set `upd_rem_set_syncd = true`. This causes NOT syncing
      UpdRemSets in nonmovingBeginFlush and losing track of some objects.
      
      Fixes #159 (return_mem_to_os)
      505feaab
    • Ömer Sinan Ağacan's avatar
      Move non-moving weaks back to oldest gen on shutdown · ec2fe95c
      Ömer Sinan Ağacan authored and Ben Gamari's avatar Ben Gamari committed
      This fixes a bug that happens when we start exit sequence while
      concurrent mark is running. Mark copies oldest gen threads to
      nonmoving_weak_ptr_list and nonmoving_old_weak_ptr_list (which also act
      as the snapshot for weaks) before releasing capabilities to allow
      concurrent minor collections (which may add weaks to oldest_gen weak
      list). We need to move these weaks back to oldest gen weak list to be
      able to run C finalizers of them in runAllCFinalizers in hs_exit_.
      ec2fe95c
    • Ömer Sinan Ağacan's avatar
      Add Note [Unintentional marking in resurrectThreads] · 8f69ec9c
      Ömer Sinan Ağacan authored and Ben Gamari's avatar Ben Gamari committed
      8f69ec9c
    • Ömer Sinan Ağacan's avatar
      Fix unintentional marking in resurrectThreads · b33f7a5e
      Ömer Sinan Ağacan authored and Ben Gamari's avatar Ben Gamari committed
      `resurrectThreads` runs a code that runs a write barrier, causing
      pushing objects to UpdRemSet. This causes marking some objects in the
      next GC cycle. We fix this by resetting UpdRemSets before releasing the
      capabilities. See also comments in the code.
      
      Fixes #142
      b33f7a5e
    • Ben Gamari's avatar
      StableName: Take stable name table lock in gcStableNameTable · 6b4870ce
      Ben Gamari authored
      Otherwise we may potentially race with the nonmoving collector. I
      believe this was the cause of #155.
      6b4870ce
    • Ben Gamari's avatar
      testsuite: Skip T15892 in nonmoving_thr_ghc · 917cb802
      Ben Gamari authored
      917cb802
    • Ben Gamari's avatar
      Omit broken tests · 104de11d
      Ben Gamari authored
      104de11d
    • Ben Gamari's avatar
      rts: Write barrier wibbles · f7805965
      Ben Gamari authored
      f7805965
    • Ben Gamari's avatar
      ec03303d
    • Ben Gamari's avatar
    • Ben Gamari's avatar
      XXX: testsuite: Skip seward-space-leak · 171b3993
      Ben Gamari authored
      171b3993
    • Ben Gamari's avatar
    • Ben Gamari's avatar
      06b26f13
    • Ben Gamari's avatar
      rts/Storage: Handle setNumCapabilities · ee58c1ed
      Ben Gamari authored
      ee58c1ed
    • Ben Gamari's avatar
      76a11e0a
    • Ben Gamari's avatar
      rts: Disallow combining of -xn and -c · e75d5636
      Ben Gamari authored
      These two collection strategies are mutually exclusive.
      Fixes #132.
      e75d5636
    • Ben Gamari's avatar
      gitlab-ci: Well-Typed CI · 1ecd298f
      Ben Gamari authored
      1ecd298f
    • Ben Gamari's avatar
      Whitespace fixes · b0d69480
      Ben Gamari authored
      b0d69480
    • Ben Gamari's avatar
      rts: Properly handle whiteholes in nonmoving collector · 82345ff5
      Ben Gamari authored
      Previously we relied on blatantly undefined behavior. Now we at very
      least use volatile loads.
      82345ff5
    • Ben Gamari's avatar
      221e18b8
    • Ben Gamari's avatar
      testsuite: Add nonmoving_thr_ghc way · 14604002
      Ben Gamari authored
      This uses the nonmoving collector when compiling the testcases.
      14604002
    • Ben Gamari's avatar
      rts: Fix MVAR write barrier in putMVar# · 37f8f6df
      Ben Gamari authored
      Previously we would fail to account for changes by putMVar#'s wakeup
      loop.
      37f8f6df
    • Ben Gamari's avatar
      rts/NonMovingMark: Allow WHITEHOLEs to be pushed as thunks · 3cb86a5b
      Ben Gamari authored
      This may happen if two threads enter the thunk at the same time.
      3cb86a5b
    • Ömer Sinan Ağacan's avatar
      Fix snapshot invariant when updating THUNK_STATICs · c255a2bc
      Ömer Sinan Ağacan authored and Ben Gamari's avatar Ben Gamari committed
      We need to push the SRT to UpdRemSet when updating THUNK_STATICs as when
      they become IND_STATIC we lose the SRT which is in the snapshot.
      
      Fixes #145, #124
      
      (cherry picked from commit 6a88e1786f0d5cb287d5247627c6ef1b69acb4aa)
      c255a2bc
    • Ben Gamari's avatar
      1665ae78
    • Ben Gamari's avatar
      testsuite: Don't run T15892 in nonmoving ways · 9942bc3d
      Ben Gamari authored
      The nonmoving GC doesn't support `+RTS -G1`, which this test insists on.
      9942bc3d
    • Ben Gamari's avatar
      710732cf
    • Ben Gamari's avatar
      rts: Introduce non-moving heap census · f8b0f594
      Ben Gamari authored and Ben Gamari's avatar Ben Gamari committed
      This introduces a simple census of the non-moving heap (not to be
      confused with the heap census used by the heap profiler). This
      collects basic heap usage information (number of allocated and free
      blocks) which is useful when characterising fragmentation of the
      nonmoving heap.
      f8b0f594
    • Ben Gamari's avatar
      rts: Tracing support for nonmoving collection events · e085f1ff
      Ben Gamari authored and Ben Gamari's avatar Ben Gamari committed
      This introduces a few events to mark key points in the nonmoving
      garbage collection cycle. These include:
      
       * `EVENT_CONC_MARK_BEGIN`, denoting the beginning of a round of
         marking. This may happen more than once in a single major collection
         since we the major collector iterates until it hits a fixed point.
      
       * `EVENT_CONC_MARK_END`, denoting the end of a round of marking.
      
       * `EVENT_CONC_SYNC_BEGIN`, denoting the beginning of the post-mark
         synchronization phase
      
       * `EVENT_CONC_UPD_REM_SET_FLUSH`, indicating that a capability has
         flushed its update remembered set.
      
       * `EVENT_CONC_SYNC_END`, denoting that all mutators have flushed their
         update remembered sets.
      
       * `EVENT_CONC_SWEEP_BEGIN`, denoting the beginning of the sweep portion
         of the major collection.
      
       * `EVENT_CONC_SWEEP_END`, denoting the end of the sweep portion of the
         major collection.
      e085f1ff
    • Ben Gamari's avatar
      testsuite: Add nonmoving_thr WAY · 2992414f
      Ben Gamari authored and Ben Gamari's avatar Ben Gamari committed
      2992414f
    • Ben Gamari's avatar
      rts: Implement concurrent collection in the nonmoving collector · 43f30cd6
      Ben Gamari authored and Ben Gamari's avatar Ben Gamari committed
      
      
      This extends the non-moving collector to allow concurrent collection.
      
      The full design of the collector implemented here is described in detail
      in a technical note
      
          B. Gamari. "A Concurrent Garbage Collector For the Glasgow Haskell
          Compiler" (2018)
      
      This extension involves the introduction of a capability-local
      remembered set, known as the /update remembered set/, which tracks
      objects which may no longer be visible to the collector due to mutation.
      To maintain this remembered set we introduce a write barrier on
      mutations which is enabled while a concurrent mark is underway.
      
      The update remembered set representation is similar to that of the
      nonmoving mark queue, being a chunked array of `MarkEntry`s. Each
      `Capability` maintains a single accumulator chunk, which it flushed
      when it (a) is filled, or (b) when the nonmoving collector enters its
      post-mark synchronization phase.
      
      While the write barrier touches a significant amount of code it is
      conceptually straightforward: the mutator must ensure that the referee
      of any pointer it overwrites is added to the update remembered set.
      However, there are a few details:
      
       * In the case of objects with a dirty flag (e.g. `MVar`s) we can
         exploit the fact that only the *first* mutation requires a write
         barrier.
      
       * Weak references, as usual, complicate things. In particular, we must
         ensure that the referee of a weak object is marked if dereferenced by
         the mutator. For this we (unfortunately) must introduce a read
         barrier, as described in Note [Concurrent read barrier on deRefWeak#]
         (in `NonMovingMark.c`).
      
       * Stable names are also a bit tricky as described in Note [Sweeping
         stable names in the concurrent collector] (`NonMovingSweep.c`).
      
      We take quite some pains to ensure that the high thread count often seen
      in parallel Haskell applications doesn't affect pause times. To this end
      we allow thread stacks to be marked either by the thread itself (when it
      is executed or stack-underflows) or the concurrent mark thread (if the
      thread owning the stack is never scheduled). There is a non-trivial
      handshake to ensure that this happens without racing which is described
      in Note [StgStack dirtiness flags and concurrent marking].
      Co-Authored-by: Ömer Sinan Ağacan's avatarÖmer Sinan Ağacan <omer@well-typed.com>
      43f30cd6
    • Ben Gamari's avatar
      testsuite: Add nonmoving WAY · 68e5616f
      Ben Gamari authored and Ben Gamari's avatar Ben Gamari committed
      This simply runs the compile_and_run tests with `-xn`, enabling the
      nonmoving oldest generation.
      68e5616f
    • Ömer Sinan Ağacan's avatar
      rts: Non-concurrent mark and sweep · 46322977
      Ömer Sinan Ağacan authored and Ben Gamari's avatar Ben Gamari committed
      
      
      This implements the core heap structure and a serial mark/sweep
      collector which can be used to manage the oldest-generation heap.
      This is the first step towards a concurrent mark-and-sweep collector
      aimed at low-latency applications.
      
      The full design of the collector implemented here is described in detail
      in a technical note
      
          B. Gamari. "A Concurrent Garbage Collector For the Glasgow Haskell
          Compiler" (2018)
      
      The basic heap structure used in this design is heavily inspired by
      
          K. Ueno & A. Ohori. "A fully concurrent garbage collector for
          functional programs on multicore processors." /ACM SIGPLAN Notices/
          Vol. 51. No. 9 (presented by ICFP 2016)
      
      This design is intended to allow both marking and sweeping
      concurrent to execution of a multi-core mutator. Unlike the Ueno design,
      which requires no global synchronization pauses, the collector
      introduced here requires a stop-the-world pause at the beginning and end
      of the mark phase.
      
      To avoid heap fragmentation, the allocator consists of a number of
      fixed-size /sub-allocators/. Each of these sub-allocators allocators into
      its own set of /segments/, themselves allocated from the block
      allocator. Each segment is broken into a set of fixed-size allocation
      blocks (which back allocations) in addition to a bitmap (used to track
      the liveness of blocks) and some additional metadata (used also used
      to track liveness).
      
      This heap structure enables collection via mark-and-sweep, which can be
      performed concurrently via a snapshot-at-the-beginning scheme (although
      concurrent collection is not implemented in this patch).
      
      The mark queue is a fairly straightforward chunked-array structure.
      The representation is a bit more verbose than a typical mark queue to
      accomodate a combination of two features:
      
       * a mark FIFO, which improves the locality of marking, reducing one of
         the major overheads seen in mark/sweep allocators (see [1] for
         details)
      
       * the selector optimization and indirection shortcutting, which
         requires that we track where we found each reference to an object
         in case we need to update the reference at a later point (e.g. when
         we find that it is an indirection). See Note [Origin references in
         the nonmoving collector] (in `NonMovingMark.h`) for details.
      
      Beyond this the mark/sweep is fairly run-of-the-mill.
      
      [1] R. Garner, S.M. Blackburn, D. Frampton. "Effective Prefetch for
          Mark-Sweep Garbage Collection." ISMM 2007.
      Co-Authored-By: Ben Gamari's avatarBen Gamari <ben@well-typed.com>
      46322977
    • Ben Gamari's avatar
      rts: Introduce debug flag for non-moving GC · f967f2b4
      Ben Gamari authored and Ben Gamari's avatar Ben Gamari committed
      f967f2b4
    • Ben Gamari's avatar
      rts: Introduce flag to enable the nonmoving old generation · e3a35f24
      Ben Gamari authored and Ben Gamari's avatar Ben Gamari committed
      This flag will enable the use of a non-moving oldest generation.
      e3a35f24