1. 07 Apr, 2020 1 commit
    • rts: ProfHeap: Fix memory leak when not compiled with profiling · f38e8d61
      Daniel Gröber (dxld) authored
      If we're doing heap profiling on an unprofiled executable we keep
      allocating new space in initEra via nextEra on each profiler run but we
      don't have a corresponding freeEra call.
      
      We do free the last era in endHeapProfiling but previous eras will have
      been overwritten by initEra and will never get free()ed.
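
The leak pattern described above can be reproduced in isolation. The following is a minimal sketch, not the RTS code: `ToyEra`, `toy_alloc`, `toy_free` and the two `next_era_*` variants are hypothetical stand-ins, and the static pool exists only so that outstanding allocations can be counted. It assumes the fix amounts to releasing the previous era before re-initialising it.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SLOTS 16
#define SLOT_BYTES 64

/* Toy allocator (hypothetical): hands out slots from a static pool and
 * counts how many are still outstanding, so a leak shows up as a
 * non-zero count. */
static char pool[POOL_SLOTS][SLOT_BYTES];
static int next_slot = 0;
static int live_allocations = 0;

static void *toy_alloc(void) {
    assert(next_slot < POOL_SLOTS);
    live_allocations++;
    return pool[next_slot++];
}

static void toy_free(void *p) {
    if (p != NULL) {
        live_allocations--;
    }
}

/* Hypothetical stand-in for the per-era census storage. */
typedef struct { void *census; } ToyEra;

/* Overwrites era->census; whatever was stored there before is lost. */
static void init_era(ToyEra *era) { era->census = toy_alloc(); }

static void free_era(ToyEra *era) {
    toy_free(era->census);
    era->census = NULL;
}

/* Leaky variant: re-initialise without releasing the previous era. */
static void next_era_leaky(ToyEra *era) { init_era(era); }

/* Assumed shape of the fix: release the previous era first. */
static void next_era_fixed(ToyEra *era) {
    free_era(era);
    init_era(era);
}
```

With the leaky variant, every profiler run leaves one allocation behind, and only the last era is released at shutdown; with the fixed variant the count returns to zero.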
      
      Metric Decrease:
          space_leak_001
  2. 03 Apr, 2020 1 commit
    • Improve and refactor StgToCmm codegen for DataCons. · 9462452a
      Andreas Klebinger authored
      We now differentiate three cases of constructor bindings:
      
      1) Bindings which we can "replace" with a reference to
         an existing closure. Reference the replacement closure
         when accessing the binding.
      2) Bindings which we can "replace" as above, but for which
         we still generate a closure that will be referenced by
         modules importing this binding.
      3) For any other binding, generate a closure. Then reference
         it.
      
      Before this patch, 1) applied only to local bindings and we
      didn't do 2) at all.
  3. 02 Apr, 2020 2 commits
  4. 17 Mar, 2020 1 commit
    • Update sanity checking for TSOs: · 92327e3a
      Ömer Sinan Ağacan authored
      - Remove an invalid assumption about the GC checking the what_next
        field. The GC doesn't care about what_next at all: if a TSO is
        reachable then all its pointers are followed (other than
        global_link, which is only followed by compacting GC).
      
      - Remove checkSTACK in checkTSO: TSO stacks will be visited in
        checkHeapChain, or checkLargeObjects etc.
      
      - Add an assertion in checkTSO to check that the global_link field is
        sane.
      
      - Refactor to remove forward declarations in checkGlobalTSOList and
        add braces around single-statement if statements.
  5. 15 Mar, 2020 1 commit
    • Fix global_link of TSOs for threads reachable via dead weaks · cfcc3c9a
      Ömer Sinan Ağacan authored
      Fixes #17785
      
      Here's how the problem occurs:
      
      - In generation 0 we have a TSO that is finished (i.e. it has no more
        work to do or it is killed).
      
      - The TSO only becomes reachable after collectDeadWeakPtrs().
      
      - After collectDeadWeakPtrs() we switch to WeakDone phase where we don't
        move TSOs to different lists anymore (like the next gen's thread list
        or the resurrected_threads list).
      
      - So the TSO will never be moved to a generation's thread list, but it
        will be promoted to generation 1.
      
      - Generation 1 is collected via mark-compact, and because the TSO is
        reachable it is marked, and its `global_link` field, which is bogus at
        this point (because the TSO is not in a list), will be threaded.
      
      - Chaos ensues.
      
      In other words, when these conditions hold:
      
      - A TSO is reachable only after collectDeadWeakPtrs()
      - It's finished (what_next is ThreadComplete or ThreadKilled)
      - It's retained by the mark-compact collector (the moving collector
        doesn't evacuate the global_link field)
      
      We end up doing random mutations on the heap because the TSO's
      global_link field is not valid, but it still looks like a heap pointer
      so we thread it during compacting GC.
      
      The fix is simple: when we traverse the old_threads lists to resurrect
      unreachable threads, the threads that won't be resurrected currently
      stay on the old_threads lists. Those threads will never be visited
      again by MarkWeak, so we now reset their global_link fields. This way
      compacting GC does not thread pointers to nowhere.
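
The traversal described above can be sketched on a toy thread list. This is an illustration under assumptions, not the code in MarkWeak: `ToyTSO`, `TOY_END` and `resurrect_unreachable` are hypothetical names, and resetting the link to an end-of-list sentinel is an assumption about what "reset" means here.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TOY_RUNNING, TOY_COMPLETE, TOY_KILLED } ToyWhatNext;

/* Hypothetical, heavily simplified thread state object. */
typedef struct ToyTSO {
    ToyWhatNext what_next;
    struct ToyTSO *global_link; /* link within a thread list */
} ToyTSO;

/* Sentinel playing the role of an end-of-list marker (assumption). */
static ToyTSO toy_end;
#define TOY_END (&toy_end)

/* Walk a list of unreachable threads. Unfinished threads are moved to
 * the returned "resurrected" list. Finished threads are not resurrected,
 * so their link is reset instead of being left pointing into a list
 * that nobody will visit again. */
static ToyTSO *resurrect_unreachable(ToyTSO *old_threads) {
    ToyTSO *resurrected = TOY_END;
    ToyTSO *next;
    assert(old_threads != NULL);
    for (ToyTSO *t = old_threads; t != TOY_END; t = next) {
        next = t->global_link;
        if (t->what_next == TOY_COMPLETE || t->what_next == TOY_KILLED) {
            t->global_link = TOY_END; /* the reset described above */
        } else {
            t->global_link = resurrected;
            resurrected = t;
        }
    }
    return resurrected;
}
```

After the walk, every finished thread has a link that is safe to follow, so a later compacting pass that threads `global_link` never sees a stale pointer.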
      
      Testing
      -------
      
      The reproducer in #17785 is quite large and hard to build, because of
      the dependencies, so I'm not adding a regression test.
      
      In my testing the reproducer would take less than 5 seconds to run,
      and once in every ~5 runs would fail with a segfault or an assertion
      error. In other cases it also fails with a test failure. Because the
      tests never fail with the bug fix, assuming the code is correct, this
      also means that this bug can sometimes lead to incorrect runtime
      results.
      
      After the fix I was able to run the reproducer repeatedly for about an
      hour, with no runtime crashes or test failures.
      
      To run the reproducer clone the git repo:
      
          $ git clone https://github.com/osa1/streamly --branch ghc-segfault
      
      Then clone primitive and atomic-primops from their git repos and point
      to the clones in cabal.project.local. The project should then be
      buildable using GHC HEAD. Run the executable `properties` with `+RTS -c
      -DZ`.
      
      In addition to the reproducer above I run the test suite using:
      
          $ make slowtest EXTRA_HC_OPTS="-debug -with-rtsopts=-DS \
              -with-rtsopts=-c +RTS -c -RTS" SKIPWAY='nonmoving nonmoving_thr'
      
      This always enables compacting GC, both in GHC when building the test
      programs and when running the test programs, and also enables sanity
      checking when running the test programs. This set of flags is not
      compatible with all tests so there are some failures, but I got the same
      set of failures with this patch as with GHC HEAD.
  6. 14 Mar, 2020 3 commits
  7. 11 Mar, 2020 2 commits
    • Zero any slop after compaction in compacting GC · 3aa9b35f
      Ömer Sinan Ağacan authored
      In copying GC, with the relevant debug flags enabled, we release the old
      blocks after a GC, and the block allocator zeroes the space before
      releasing a block. This effectively zeros the old heap.
      
      In compacting GC we reuse the blocks and previously we didn't zero the
      unused space in a compacting generation after compaction. With this
      patch we zero the slop between the free pointer and the end of the block
      when we're done with compaction and when switching to a new block
      (because the current block doesn't have enough space for the next object
      we're shifting).
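
Zeroing the slop of a single block can be sketched as follows. This is a minimal sketch assuming a fixed-size toy block; `ToyBlock`, `TOY_BLOCK_SIZE` and `zero_slop` are hypothetical and do not mirror the RTS block descriptor layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK_SIZE 64

/* Hypothetical block: storage plus a pointer to the first unused byte. */
typedef struct {
    unsigned char start[TOY_BLOCK_SIZE];
    unsigned char *free;
} ToyBlock;

/* Zero everything between the free pointer and the end of the block.
 * Live data below the free pointer is left untouched. */
static void zero_slop(ToyBlock *bd) {
    unsigned char *end = bd->start + TOY_BLOCK_SIZE;
    assert(bd->free >= bd->start && bd->free <= end);
    memset(bd->free, 0, (size_t)(end - bd->free));
}
```

In terms of the description above, such a helper would be called both when compaction finishes and each time the compactor abandons the current block for a new one.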
    • rts: Prefer darwin-specific getCurrentThreadCPUTime · bb586f89
      Ben Gamari authored
      macOS Catalina now provides a non-POSIX-compliant version of
      clock_gettime, so we cannot use the clock_gettime codepath there.
      
      Fixes #17906.
  8. 09 Mar, 2020 4 commits
  9. 05 Mar, 2020 4 commits
  10. 04 Mar, 2020 1 commit
  11. 29 Feb, 2020 2 commits
  12. 28 Feb, 2020 1 commit
    • nonmoving: Fix marking in compact regions · f4b6b594
      Ben Gamari authored
      Previously we were tracing the object we were asked to mark, even if it
      lives in a compact region. However, there is no need to do this; we need
      only to mark the region itself as live.
      
      I have seen a segfault caused by this, where the concurrent mark saw
      an object in the process of being compacted by the mutator.
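
The change in marking behaviour can be illustrated with a toy object graph. This is a sketch under assumptions, not the nonmoving collector's mark loop: `ToyRegion`, `ToyObject` and `mark_object` are hypothetical, and each object has at most one child to keep the traversal trivial.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compact region: liveness is tracked per region. */
typedef struct { int live; } ToyRegion;

/* Hypothetical heap object with at most one outgoing pointer. */
typedef struct ToyObject {
    ToyRegion *region;       /* non-NULL if the object lives in a region */
    int marked;              /* set only when the object itself is traced */
    struct ToyObject *child;
} ToyObject;

/* Mark an object. Objects inside a compact region are never traced:
 * marking the region as live is sufficient, and it avoids looking at
 * an object the mutator may still be writing. */
static void mark_object(ToyObject *obj) {
    while (obj != NULL) {
        if (obj->region != NULL) {
            obj->region->live = 1;
            return;
        }
        if (obj->marked) {
            return;
        }
        obj->marked = 1;
        obj = obj->child;
    }
}
```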
  13. 26 Feb, 2020 1 commit
  14. 25 Feb, 2020 1 commit
  15. 14 Feb, 2020 1 commit
  16. 12 Feb, 2020 1 commit
  17. 11 Feb, 2020 3 commits
  18. 08 Feb, 2020 3 commits
  19. 04 Feb, 2020 2 commits
  20. 01 Feb, 2020 1 commit
  21. 25 Jan, 2020 3 commits
    • Fix rts allocateExec() on NetBSD · 8b726534
      PHO authored
      Similar to SELinux, NetBSD "PaX mprotect" prohibits marking a page
      mapping both writable and executable at the same time. Use libffi
      which knows how to work around it.
    • Module hierarchy: Cmm (cf #13009) · 6e2d9ee2
      Sylvain Henry authored
    • Fix chaining tagged and untagged ptrs in compacting GC · 0e57d8a1
      Ömer Sinan Ağacan authored
      Currently compacting GC has the invariant that in a chain all fields are tagged
      the same. However this does not really hold: root pointers are not tagged, so
      when we thread a root we initialize a chain without a tag. When the pointed-to
      object is evaluated and we have more pointers to it from the heap, we then add
      *tagged* fields to the chain (because pointers to it from the heap are tagged),
      ending up chaining fields with different tags (pointers from roots are NOT
      tagged, pointers from heap are). This breaks the invariant and as a result
      compacting GC turns tagged pointers into non-tagged.
      
      This later causes problems in the generated code where we do reads assuming that
      the pointer is aligned, e.g.
      
          0x7(%rax) -- assumes that pointer is tagged 1
      
      which causes misaligned reads. This caused #17088.
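
The arithmetic behind the misaligned read can be checked directly. This sketch assumes a 64-bit target with 8-byte words and a one-word header, so the first field sits at offset 8 from the untagged closure address; `field_address` and `is_word_aligned` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_MASK ((uintptr_t)7) /* assumes 8-byte words */

/* Address touched by a 0x7(%rax)-style access: pointer value plus 7. */
static uintptr_t field_address(uintptr_t ptr) {
    return ptr + 7;
}

static int is_word_aligned(uintptr_t addr) {
    return (addr & WORD_MASK) == 0;
}
```

For a closure at `0x1000`, a pointer tagged 1 is `0x1001`, and `0x1001 + 7 = 0x1008` is the word-aligned first field. If the GC has stripped the tag, the same instruction reads from `0x1007`, which is misaligned.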
      
      We fix this using the "pointer tagging for large families" patch (#14373,
      !1742):
      
      - With the pointer tagging patch the GC can know what the tagged pointer to a
        CONSTR should be (previously we'd need to know the family size -- large
        families are always tagged 1, small families are tagged depending on the
        constructor).
      
      - Since we now know what the tags should be we no longer need to store the
        pointer tag in the info table pointers when forming chains in the compacting
        GC.
      
      As a result we no longer need to tag pointers in chains with 1/2 depending on
      whether the field points to an info table pointer, or to another field: an info
      table pointer is always tagged 0, everything else in the chain is tagged 1. The
      lost tags in pointers can be retrieved by looking at the info table.
      
      Finally, instead of using tag 1 for fields and tag 0 for info table pointers, we
      use two different tags for fields:
      
      - 1 for fields that have untagged pointers
      - 2 for fields that have tagged pointers
      
      When unchaining we then look at the pointer to a field, and depending on its tag
      we either leave a tagged pointer or an untagged pointer in the field.
      
      This allows chaining untagged and tagged fields together in compacting GC.
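
The two-tag scheme can be modelled on a toy closure with a one-word header. This is a sketch under stated assumptions, not the code in `Compact.c`: `ToyClosure`, `thread_field` and `unthread` are hypothetical, only the two low pointer bits are used, and the constructor tag is passed in by the caller where the real collector would derive it from the info table.

```c
#include <assert.h>
#include <stdint.h>

#define LOW_BITS       ((uintptr_t)3)
#define FIELD_UNTAGGED ((uintptr_t)1) /* field held an untagged pointer */
#define FIELD_TAGGED   ((uintptr_t)2) /* field held a tagged pointer */

/* Hypothetical closure: the header normally holds an info pointer
 * (low bits 0); while threaded it holds the head of the chain. */
typedef struct { uintptr_t header; } ToyClosure;

/* Add a field that points at a closure to that closure's chain,
 * remembering in the chain tag whether the field's pointer was tagged. */
static void thread_field(uintptr_t *field) {
    uintptr_t q = *field;
    int was_tagged = (q & LOW_BITS) != 0;
    ToyClosure *obj = (ToyClosure *)(q & ~LOW_BITS);
    *field = obj->header; /* field now stores the previous chain head */
    obj->header = (uintptr_t)field | (was_tagged ? FIELD_TAGGED : FIELD_UNTAGGED);
}

/* Walk the chain, writing the closure's new address into every field.
 * Fields that were tagged get closure_tag back, others stay untagged. */
static void unthread(ToyClosure *obj, uintptr_t new_addr, uintptr_t closure_tag) {
    uintptr_t q = obj->header;
    while ((q & LOW_BITS) != 0) {
        uintptr_t *field = (uintptr_t *)(q & ~LOW_BITS);
        uintptr_t next = *field;
        *field = ((q & LOW_BITS) == FIELD_TAGGED) ? (new_addr | closure_tag)
                                                  : new_addr;
        q = next;
    }
    obj->header = q; /* the original info pointer, tagged 0 */
}
```

Threading an untagged root and a tagged heap field into the same chain and then unthreading leaves the root untagged and the heap field tagged, which is the mixed-chain case described above.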
      
      Fixes #17088
      
      Nofib results
      -------------
      
      Binaries are smaller because of smaller `Compact.c` code.
      
          make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" EXTRA_HC_OPTS="-with-rtsopts=-c" NoFibRuns=1
      
          --------------------------------------------------------------------------------
                  Program           Size    Allocs    Instrs     Reads    Writes
          --------------------------------------------------------------------------------
                       CS          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                      CSD          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                       FS          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                        S          -0.3%      0.0%     +5.4%     +0.8%     +3.9%
                       VS          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                      VSD          -0.3%      0.0%     -0.0%     -0.0%     -0.2%
                      VSM          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                     anna          -0.1%      0.0%     +0.0%     +0.0%     +0.0%
                     ansi          -0.3%      0.0%     +0.1%     +0.0%     +0.0%
                     atom          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                   awards          -0.2%      0.0%     +0.0%      0.0%     -0.0%
                   banner          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
               bernouilli          -0.3%      0.0%     +0.1%     +0.0%     +0.0%
             binary-trees          -0.2%      0.0%     +0.0%      0.0%     +0.0%
                    boyer          -0.3%      0.0%     +0.2%     +0.0%     +0.0%
                   boyer2          -0.2%      0.0%     +0.2%     +0.1%     +0.0%
                     bspt          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                cacheprof          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                 calendar          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                 cichelli          -0.3%      0.0%     +1.1%     +0.2%     +0.5%
                  circsim          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
                 clausify          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
            comp_lab_zift          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                 compress          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                compress2          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
              constraints          -0.3%      0.0%     +0.2%     +0.1%     +0.1%
             cryptarithm1          -0.3%      0.0%     +0.0%     -0.0%      0.0%
             cryptarithm2          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                      cse          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
             digits-of-e1          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
             digits-of-e2          -0.3%      0.0%     +0.0%     +0.0%     -0.0%
                   dom-lt          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                    eliza          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                    event          -0.3%      0.0%     +0.1%     +0.0%     -0.0%
              exact-reals          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                   exp3_8          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                   expert          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
           fannkuch-redux          -0.3%      0.0%     -0.0%     -0.0%     -0.0%
                    fasta          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                      fem          -0.2%      0.0%     +0.1%     +0.0%     +0.0%
                      fft          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
                     fft2          -0.2%      0.0%     +0.0%     -0.0%     +0.0%
                 fibheaps          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                     fish          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                    fluid          -0.2%      0.0%     +0.4%     +0.1%     +0.1%
                   fulsom          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                   gamteb          -0.2%      0.0%     +0.1%     +0.0%     +0.0%
                      gcd          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
              gen_regexps          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                   genfft          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                       gg          -0.2%      0.0%     +0.7%     +0.3%     +0.2%
                     grep          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                   hidden          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                      hpg          -0.2%      0.0%     +0.1%     +0.0%     +0.0%
                      ida          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                    infer          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
                  integer          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                integrate          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
             k-nucleotide          -0.2%      0.0%     +0.0%     +0.0%     -0.0%
                    kahan          -0.3%      0.0%     -0.0%     -0.0%     -0.0%
                  knights          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                   lambda          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
               last-piece          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                     lcss          -0.3%      0.0%     +0.0%     +0.0%      0.0%
                     life          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                     lift          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                   linear          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                listcompr          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                 listcopy          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                 maillist          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                   mandel          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                  mandel2          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                     mate          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                  minimax          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                  mkhprog          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
               multiplier          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                   n-body          -0.2%      0.0%     -0.0%     -0.0%     -0.0%
                 nucleic2          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                     para          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
                paraffins          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                   parser          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                  parstof          -0.2%      0.0%     +0.8%     +0.2%     +0.2%
                      pic          -0.2%      0.0%     +0.1%     -0.1%     -0.1%
                 pidigits          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                    power          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
                   pretty          -0.3%      0.0%     -0.0%     -0.0%     -0.1%
                   primes          -0.3%      0.0%     +0.0%     +0.0%     -0.0%
                primetest          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
                   prolog          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                   puzzle          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                   queens          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                  reptile          -0.2%      0.0%     +0.2%     +0.1%     +0.0%
          reverse-complem          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                  rewrite          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                     rfib          -0.2%      0.0%     +0.0%     +0.0%     -0.0%
                      rsa          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                      scc          -0.3%      0.0%     -0.0%     -0.0%     -0.1%
                    sched          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                      scs          -0.2%      0.0%     +0.1%     +0.0%     +0.0%
                   simple          -0.2%      0.0%     +3.4%     +1.0%     +1.8%
                    solid          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                  sorting          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
            spectral-norm          -0.2%      0.0%     -0.0%     -0.0%     -0.0%
                   sphere          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                   symalg          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                      tak          -0.3%      0.0%     +0.0%     +0.0%     -0.0%
                transform          -0.2%      0.0%     +0.2%     +0.1%     +0.1%
                 treejoin          -0.3%      0.0%     +0.2%     -0.0%     -0.1%
                typecheck          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                  veritas          -0.1%      0.0%     +0.0%     +0.0%     +0.0%
                     wang          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
                wave4main          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
             wheel-sieve1          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
             wheel-sieve2          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                     x2n1          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
          --------------------------------------------------------------------------------
                      Min          -0.3%      0.0%     -0.0%     -0.1%     -0.2%
                      Max          -0.1%      0.0%     +5.4%     +1.0%     +3.9%
           Geometric Mean          -0.3%     -0.0%     +0.1%     +0.0%     +0.1%
      
          --------------------------------------------------------------------------------
                  Program           Size    Allocs    Instrs     Reads    Writes
          --------------------------------------------------------------------------------
                  circsim          -0.2%      0.0%     +1.6%     +0.4%     +0.7%
              constraints          -0.3%      0.0%     +4.3%     +1.5%     +2.3%
                 fibheaps          -0.3%      0.0%     +3.5%     +1.2%     +1.3%
                   fulsom          -0.2%      0.0%     +3.6%     +1.2%     +1.8%
                 gc_bench          -0.3%      0.0%     +4.1%     +1.3%     +2.3%
                     hash          -0.3%      0.0%     +6.6%     +2.2%     +3.6%
                     lcss          -0.3%      0.0%     +0.7%     +0.2%     +0.7%
                mutstore1          -0.3%      0.0%     +4.8%     +1.4%     +2.8%
                mutstore2          -0.3%      0.0%     +3.4%     +1.0%     +1.7%
                    power          -0.2%      0.0%     +2.7%     +0.6%     +1.9%
               spellcheck          -0.3%      0.0%     +1.1%     +0.4%     +0.4%
          --------------------------------------------------------------------------------
                      Min          -0.3%      0.0%     +0.7%     +0.2%     +0.4%
                      Max          -0.2%      0.0%     +6.6%     +2.2%     +3.6%
           Geometric Mean          -0.3%     +0.0%     +3.3%     +1.0%     +1.8%
      
      Metric changes
      --------------
      
      While it sounds ridiculous, this change causes increased allocations in
      the following tests. We concluded that this change can't cause a
      difference in allocations and decided to land this patch. Fluctuations
      in the "bytes allocated" metric are tracked in #17686.
      
      Metric Increase:
          Naperian
          T10547
          T12150
          T12234
          T12425
          T13035
          T5837
          T6048
  22. 20 Jan, 2020 1 commit