  1. Dec 12, 2019
  2. Dec 11, 2019
  3. Dec 09, 2019
    • Fix comment typos · d46a72e1
      Gabor Greif authored
      The below is only necessary to fix the CI perf fluke that
      happened in 9897e8c8:
      -------------------------
      Metric Decrease:
          T5837
          T6048
          T9020
          T12425
          T12234
          T13035
          T12150
          Naperian
      -------------------------
      d46a72e1
  4. Dec 05, 2019
    • rts/NonMovingSweep: Fix locking of new mutable list allocation · a7a4efbf
      Ben Gamari authored and Marge Bot committed
      Previously we used allocBlockOnNode_sync in nonmovingSweepMutLists
      despite the fact that we aren't in the GC and therefore the allocation
      spinlock isn't in use. This meant that sweep would end up spinning until
      the next minor GC, when the SM lock was moved away from the SM_MUTEX to
      the spinlock. This isn't a correctness issue but it sure isn't good for
      performance.
      
      Found thanks to Ward.
      
      Fixes #17539.
      a7a4efbf
    • nonmoving: Clear segment bitmaps during sweep · 69001f54
      Ben Gamari authored and Marge Bot committed
      Previously we would clear the bitmaps of segments which we are going to
      sweep during the preparatory pause. However, this is unnecessary: the
      existence of the mark epoch ensures that the sweep will correctly
      identify non-reachable objects, even if we do not clear the bitmap.
      
      We now defer clearing the bitmap to sweep, which happens concurrently
      with mutation.
      69001f54
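The epoch argument above can be illustrated with a minimal sketch. It assumes a per-segment bitmap with one byte per block that records the epoch in which the block was last marked, and an epoch that alternates between two non-zero values. The names `segment`, `mark_block`, `sweep_segment` and `next_epoch` are hypothetical and are not taken from the RTS sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCKS_PER_SEGMENT 8   /* hypothetical segment size */

/* Hypothetical segment: one bitmap byte per block, holding the epoch in
 * which the block was last marked (0 = never marked or already swept). */
struct segment {
    uint8_t bitmap[BLOCKS_PER_SEGMENT];
};

/* Mark a block as reachable in the given (non-zero) epoch. */
static void mark_block(struct segment *seg, size_t i, uint8_t epoch)
{
    assert(i < BLOCKS_PER_SEGMENT && epoch != 0);
    seg->bitmap[i] = epoch;
}

/* Sweep: a block is live only if it carries the current epoch. Entries
 * from an older epoch are treated as dead and cleared here, so no
 * separate clearing pass is needed before marking starts.
 * Returns the number of live blocks. */
static size_t sweep_segment(struct segment *seg, uint8_t epoch)
{
    size_t live = 0;
    for (size_t i = 0; i < BLOCKS_PER_SEGMENT; i++) {
        if (seg->bitmap[i] == epoch) {
            live++;
        } else {
            seg->bitmap[i] = 0;
        }
    }
    return live;
}

/* Alternate between two non-zero epochs (an assumption of this sketch). */
static uint8_t next_epoch(uint8_t epoch)
{
    return epoch == 1 ? 2 : 1;
}
```

Because a stale mark never equals the current epoch, the sweep classifies it as dead without the bitmap having been reset in advance.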
  5. Dec 02, 2019
  6. Nov 28, 2019
  7. Nov 24, 2019
  8. Nov 23, 2019
  9. Nov 20, 2019
  10. Nov 19, 2019
    • nonmoving: Drop redundant write barrier on stack underflow · 098d5017
      Ben Gamari authored and Marge Bot committed
      Previously we would push stack-carried return values to the new stack on
      a stack underflow. While the precise reasoning for this barrier is
      unfortunately lost to history, in hindsight I suspect it was prompted by
      a missing barrier elsewhere (that has been since fixed).
      
      Moreover, the redundant barrier is actively harmful: the stack may
      contain non-pointer values; blindly pushing these to the mark queue will
      result in a crash. This is precisely what happened in the `stack003`
      test. However, because of a (now fixed) deficiency in the test this
      crash did not trigger on amd64.
      098d5017
    • nonmoving: Fix handling of large object marking on 32-bit · eb7b233a
      Ben Gamari authored and Marge Bot committed
      Previously we would reset the pointer pointing to the object to be
      marked to the beginning of the block when marking a large object. This
      did no harm on 64-bit but on 32-bit it broke, e.g. `arr020`, since we
      align pinned ByteArray allocations such that the payload is 8
      byte-aligned. This means that the object might not begin at the
      beginning of the block.
      eb7b233a
    • nonmoving: Rework mark queue representation · 097f8072
      Ben Gamari authored and Marge Bot committed
      The previous representation needlessly limited the array length to
      16 bits on 32-bit platforms.
      097f8072
    • nonmoving: Fix incorrect masking in mark queue type test · deed8e31
      Ben Gamari authored and Marge Bot committed
      We were using TAG_BITS instead of TAG_MASK. This happened to work on
      64-bit platforms where TAG_BITS==3 since we only use tag values 0 and
      3. However, this broke on 32-bit platforms where TAG_BITS==2.
      deed8e31
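To see why the wrong constant went unnoticed on 64-bit, here is a small sketch that compares masking with the bit count against masking with a proper mask. The helper names `tag_with_mask` and `tag_with_bits` are hypothetical, and the tag width is passed as a parameter so that both platforms can be compared in one program.

```c
#include <assert.h>
#include <stdint.h>

/* Correct: build a mask covering the low tag_bits bits and apply it. */
static uintptr_t tag_with_mask(uintptr_t p, unsigned tag_bits)
{
    assert(tag_bits > 0 && tag_bits < 8);
    uintptr_t tag_mask = ((uintptr_t)1 << tag_bits) - 1;
    return p & tag_mask;
}

/* Buggy: uses the bit count itself as if it were the mask. */
static uintptr_t tag_with_bits(uintptr_t p, unsigned tag_bits)
{
    return p & (uintptr_t)tag_bits;
}
```

With three tag bits the count is 3 (binary 011), which recovers tag values 0 and 3 unchanged. With two tag bits the count is 2 (binary 010), so a tag of 3 is read back as 2.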
    • nonmoving: Use correct info table pointer accessor · c819c0e4
      Ben Gamari authored and Marge Bot committed
      Previously we used INFO_PTR_TO_STRUCT instead of
      THUNK_INFO_PTR_TO_STRUCT when looking at a thunk. These two happen to be
      equivalent on 64-bit architectures due to alignment considerations;
      however, they differ on 32-bit platforms. This led to #17487.
      
      To fix this we also employ a small optimization: there is only one thunk
      of type WHITEHOLE (namely stg_WHITEHOLE_info). Consequently, we can just
      use a plain pointer comparison instead of testing against info->type.
      c819c0e4
    • rts: Add missing include of SymbolExtras.h · 0418c38d
      Ben Gamari authored and Marge Bot committed
      This broke the Windows build.
      0418c38d
    • Properly account for libdw paths in make build system · 2b27cc16
      Ben Gamari authored and Marge Bot committed
      Should finally fix #17255.
      2b27cc16
    • Enable USE_PTHREAD_FOR_ITIMER also on FreeBSD · ec8a463d
      vdukhovni authored and Marge Bot committed
      If using a pthread instead of a timer signal is more reliable, and
      has no known drawbacks, then FreeBSD is also capable of supporting
      this mode of operation (tested on FreeBSD 12 with GHC 8.8.1, but
      no reason why it would not also work on FreeBSD 11 or GHC 8.6).
      
      Proposed by Kevin Zhang in:
      
          https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241849
      ec8a463d
  11. Nov 08, 2019
  12. Nov 06, 2019
  13. Nov 05, 2019
  14. Nov 04, 2019
    • rts/linker: Ensure that code isn't writable · 120f2e53
      Ben Gamari authored and Marge Bot committed
      For many years the linker would simply map all of its memory with
      PROT_READ|PROT_WRITE|PROT_EXEC. However, operating systems have been
      becoming increasingly reluctant to accept this practice (e.g. #17353
      and #12657) and for good reason: writable code is ripe for exploitation.
      
      Consequently mmapForLinker now maps its memory with
      PROT_READ|PROT_WRITE.  After the linker has finished filling/relocating
      the mapping it must then call mmapForLinkerMarkExecutable on the
      sections of the mapping which contain executable code.
      
      Moreover, to make all of this possible it was necessary to redesign the
      m32 allocator. First, we gave (in an earlier commit) each ObjectCode its
      own m32_allocator. This was necessary since code loading and symbol
      resolution/relocation are currently interleaved, meaning that it is not
      possible to enforce W^X when symbols from different objects reside in
      the same page.
      
      We then redesigned the m32 allocator to take advantage of the fact that
      all of the pages allocated with the allocator die at the same time
      (namely, when the owning ObjectCode is unloaded). This makes a number of
      things simpler (e.g. no more page reference counting; the interface
      provided by the allocator for freeing is simpler). See
      Note [M32 Allocator] for details.
      120f2e53
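The two-phase mapping described above (writable first, executable only once filled) can be sketched with plain POSIX calls. This is an illustration of the W^X pattern, not the RTS code: the helpers `map_writable` and `mark_executable` are hypothetical stand-ins and do not reproduce `mmapForLinker` or `mmapForLinkerMarkExecutable`.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical helper: map anonymous memory that is readable and
 * writable but not executable. Returns NULL on failure. */
static void *map_writable(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: once the mapping has been filled and relocated,
 * drop write permission and grant execute permission.
 * Returns 0 on success, -1 on failure (as mprotect does). */
static int mark_executable(void *p, size_t len)
{
    return mprotect(p, len, PROT_READ | PROT_EXEC);
}
```

At no point is the region both writable and executable, which is the property that stricter operating systems enforce.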
  15. Nov 02, 2019
  16. Nov 01, 2019
    • rts: Make m32 allocator per-ObjectCode · c6759080
      Ben Gamari authored and Marge Bot committed
      macOS Catalina is finally going to force our hand in forbidding writable
      executable mappings. Unfortunately, this is quite incompatible with the
      current global m32 allocator, which mixes symbols from various objects
      in a single page. The problem here is that some of these symbols may not
      yet be resolved (e.g. had relocations performed) as this happens lazily
      (and therefore we can't yet make the section read-only and therefore
      executable).
      
      The easiest way around this is to simply create one m32 allocator per
      ObjectCode. This may slightly increase fragmentation for short-running
      programs but I suspect will actually improve fragmentation for programs
      doing lots of loading/unloading since we can always free all of the
      pages allocated to an object when it is unloaded (although this ability
      will only be implemented in a later patch).
      c6759080
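The idea of one allocator per object, whose pages can all be released when the object is unloaded, resembles an arena allocator. Below is a minimal sketch under several assumptions: pages come from `calloc` rather than `mmap`, the page size is fixed at 4096 bytes, and the names `object_allocator`, `object_alloc` and `object_free_all` are hypothetical. It is not the m32 allocator itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_PAGE_SIZE 4096  /* assumed page size for this sketch */

struct page {
    struct page *next;                   /* next page owned by the same object */
    size_t used;                         /* bytes handed out from data[] */
    unsigned char data[ARENA_PAGE_SIZE];
};

/* Hypothetical per-object allocator: owns a private list of pages. */
struct object_allocator {
    struct page *pages;
};

/* Bump-allocate size bytes from the newest page, adding a page if the
 * request does not fit. Returns NULL if no memory is available. */
static void *object_alloc(struct object_allocator *a, size_t size)
{
    assert(size > 0 && size <= ARENA_PAGE_SIZE);
    size = (size + 7) & ~(size_t)7;      /* round up to a multiple of 8 */
    if (a->pages == NULL || a->pages->used + size > ARENA_PAGE_SIZE) {
        struct page *p = calloc(1, sizeof *p);
        if (p == NULL) {
            return NULL;
        }
        p->next = a->pages;
        a->pages = p;
    }
    void *result = a->pages->data + a->pages->used;
    a->pages->used += size;
    return result;
}

/* Release every page at once, as when the owning object is unloaded.
 * Returns the number of pages freed. */
static size_t object_free_all(struct object_allocator *a)
{
    size_t freed = 0;
    while (a->pages != NULL) {
        struct page *next = a->pages->next;
        free(a->pages);
        a->pages = next;
        freed++;
    }
    return freed;
}
```

Since no page is shared between objects, there is no need for per-page reference counts, and a page's protection can be changed without affecting symbols from another object.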
    • mmap: Factor out protection flags · 70b62c97
      Ben Gamari authored and Marge Bot committed
      70b62c97
  17. Oct 30, 2019
  18. Oct 26, 2019
  19. Oct 25, 2019
  20. Oct 23, 2019
    • Full abort on validate failure merging `orElse`. · 1f40e68a
      ryates@cs.rochester.edu authored and Marge Bot committed
      Previously partial roll back of a branch of an `orElse` was attempted
      if validation failure was observed.  Validation here, however, does
      not account for what part of the transaction observed inconsistent
      state.  This commit fixes this by fully aborting and restarting the
      transaction.
      1f40e68a
    • eventlog: Dump cost centre stack on each sample · 17987a4b
      Matthew Pickering authored and Marge Bot committed
      With this change it is possible to reconstruct the timing portion of a
      `.prof` file after the fact. By logging the stacks at each time point
      a more precise execution trace of the program can be observed rather
      than all identical cost centres being identified in the report.
      
      There are two new events:
      
      1. `EVENT_PROF_BEGIN` - emitted at the start of profiling to communicate
      the tick interval
      2. `EVENT_PROF_SAMPLE_COST_CENTRE` - emitted on each tick to communicate the
      current call stack.
      
      Fixes #17322
      17987a4b
    • Refactor Compact.c: · b521e8b6
      Ömer Sinan Ağacan authored and Marge Bot committed
      - Remove forward declarations
      - Introduce UNTAG_PTR and GET_PTR_TAG for dealing with pointer tags
        without having to cast arguments to StgClosure*
      - Remove dead code
      - Use W_ instead of StgWord
      - Use P_ instead of StgPtr
      b521e8b6
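As a rough illustration of tag helpers that work on plain words rather than on closure pointers, the sketch below defines word-based equivalents. The lowercase names, the `SKETCH_TAG_MASK` constant and the choice of three tag bits are assumptions made for this example. The actual `UNTAG_PTR` and `GET_PTR_TAG` definitions in Compact.c may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed tag mask for a platform with three tag bits. */
#define SKETCH_TAG_MASK ((uintptr_t)7)

/* Extract the tag stored in the low bits of a word-sized pointer. */
static inline uintptr_t get_ptr_tag(uintptr_t p)
{
    return p & SKETCH_TAG_MASK;
}

/* Strip the tag, recovering the aligned address. */
static inline uintptr_t untag_ptr(uintptr_t p)
{
    return p & ~SKETCH_TAG_MASK;
}

/* Attach a tag to an aligned, untagged address. */
static inline uintptr_t tag_ptr(uintptr_t p, uintptr_t tag)
{
    assert((p & SKETCH_TAG_MASK) == 0 && tag <= SKETCH_TAG_MASK);
    return p | tag;
}
```

Operating on words means callers holding a word-sized value do not need to cast to a closure pointer type just to read or clear the tag.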