1. 12 Jan, 2019 1 commit
    • Ömer Sinan Ağacan's avatar
      Fix negative mutator time in GC stats in prof builds · 19670bc3
      Ömer Sinan Ağacan authored
      Because garbage collector calls `retainerProfile()` and `heapCensus()`,
      GC times normally include some of PROF times too. To fix this we have
      these lines:
      
          // heapCensus() is called by the GC, so RP and HC time are
          // included in the GC stats.  We therefore subtract them to
          // obtain the actual GC cpu time.
          stats.gc_cpu_ns      -=  prof_cpu;
          stats.gc_elapsed_ns  -=  prof_elapsed;
      
      These variables are later used for calculating GC time excluding the
      final GC (which should be attributed to EXIT).
      
          exit_gc_elapsed      = stats.gc_elapsed_ns - start_exit_gc_elapsed;
      
      The problem is if we subtract PROF times from `gc_elapsed_ns` and then
      subtract `start_exit_gc_elapsed` from the result, we end up subtracting
      PROF times twice, because `start_exit_gc_elapsed` also includes PROF
      times.
      
      We now subtract PROF times from GC after the calculations for EXIT and
      MUT times. The existing assertion that checks
      
          INIT + MUT + GC + EXIT = TOTAL
      
      now holds. When we subtract PROF numbers from GC, and a new assertion
      
          INIT + MUT + GC + PROF + EXIT = TOTAL
      
      also holds.
      
      Fixes #15897. New assertions added in this commit also revealed #16102,
      which is also fixed by this commit.
      19670bc3
  2. 25 Dec, 2018 2 commits
    • Ben Gamari's avatar
      testsuite: Fix a variety of issues when building with integer-simple · 99378207
      Ben Gamari authored
       * Mark arith011 as broken with integer-simple
      
         As noted in #16091, arith011 fails when run against integer-simple with a
         "divide by zero" exception. This suggests that integer-gmp and integer-simple
         are handling division by zero differently.
      
       * This also fixes broken_without_gmp; the lack of types made the previous
         failure silent, sadly. Improves situation of #16043.
      
       * Mark several tests implicitly depending upon integer-gmp as broken
         with integer-simple. These expect to see Core coming from integer-gmp,
         which breaks with integer-simple.
      
       * Increase runtime timeout multiplier of T11627a with integer-simple
      
         I previously saw that T11627a timed out in all profiling ways when run against
         integer-simple. I suspect this is due to integer-simple's rather verbose heap
         representation. Let's see whether increasing the runtime timeout helps.
      
         Fixes test for #11627.
      
      This is all in service of fixing #16043.
      99378207
    • Ben Gamari's avatar
      testsuite: Enable T11627a on Darwin · 6d9d6f9a
      Ben Gamari authored
      The retainer profiler no longer uses the C stack for its mark stack (#14758).
      Consequently even the small C stack provided on Darwin should be sufficient to
      run this test. See #11627
      6d9d6f9a
  3. 22 Dec, 2018 1 commit
  4. 13 Jul, 2018 1 commit
    • Ömer Sinan Ağacan's avatar
      Fix processHeapClosureForDead CONSTR_NOCAF case · 2625f131
      Ömer Sinan Ağacan authored
      CONSTR_NOCAF was introduced with 55d535da as a replacement for
      CONSTR_STATIC and CONSTR_NOCAF_STATIC, however, as explained in Note
      [static constructors], we copy CONSTR_NOCAFs (which can also be seen in
      evacuate) during GC, and they can become dead, like other CONSTR_X_Ys.
      processHeapClosureForDead is updated to reflect this.
      
      Test Plan: Validates on x86_64. Existing failures on i386.
      
      Reviewers: simonmar, bgamari, erikd
      
      Reviewed By: simonmar, bgamari
      
      Subscribers: rwbarton, thomie, carter
      
      GHC Trac Issues: #7836, #15063, #15087, #15165
      
      Differential Revision: https://phabricator.haskell.org/D4928
      2625f131
  5. 20 Jun, 2018 1 commit
    • Ben Gamari's avatar
      testsuite: Skip T11627a and T11627b on Darwin · f0179e3a
      Ben Gamari authored
      Darwin tends to give us a very small stack which the retainer profiler tends to
      overflow. Strangely, this manifested on CircleCI yet not Harbormaster.
      
      See #15287 and #11627.
      f0179e3a
  6. 17 Jun, 2018 1 commit
  7. 20 May, 2018 1 commit
  8. 06 Mar, 2018 1 commit
    • Simon Marlow's avatar
      Fix interpreter with profiling · 488d63d6
      Simon Marlow authored
      This was broken by D3746 and/or D3809, but unfortunately we didn't
      notice because CI at the time wasn't building the profiling way.
      
      Test Plan:
      ```
      cd testsuite/test/profiling/should_run
      make WAY=ghci-ext-prof
      ```
      
      Reviewers: bgamari, michalt, hvr, erikd
      
      Subscribers: rwbarton, thomie, carter
      
      GHC Trac Issues: #14705
      
      Differential Revision: https://phabricator.haskell.org/D4437
      488d63d6
  9. 31 Jan, 2018 1 commit
  10. 22 Nov, 2017 1 commit
  11. 12 May, 2017 1 commit
    • David Feuer's avatar
      Automatically add SCCs to INLINABLE bindings · ab91daf2
      David Feuer authored
      Instead of excluding `isAnyInlinePragma`, just exclude
      `isInlinePragma`. This makes GHC behave as documented;
      the user's guide only indicates that GHC does not automatically
      add SCCs to `INLINE` bindings.
      
      Fixes #12962.
      
      Reviewers: austin, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: DemiMarie, osa1, Mikolaj, simonpj, rwbarton, thomie
      
      GHC Trac Issues: #12962
      
      Differential Revision: https://phabricator.haskell.org/D3550
      ab91daf2
  12. 29 Mar, 2017 1 commit
  13. 26 Feb, 2017 1 commit
    • rwbarton's avatar
      tests: remove extra_files.py (#12223) · 3415bcaa
      rwbarton authored
      The script I used is included as testsuite/driver/kill_extra_files.py,
      though at this point it is for mostly historical interest.
      
      Some of the tests in libraries/hpc relied on extra_files.py, so this
      commit includes an update to that submodule.
      
      One test in libraries/process also relies on extra_files.py, but we
      cannot update that submodule so easily, so for now we special-case it
      in the test driver.
      3415bcaa
  14. 22 Jan, 2017 1 commit
  15. 06 Jan, 2017 1 commit
    • Simon Marlow's avatar
      More fixes for #5654 · 3a18baff
      Simon Marlow authored
      * In stg_ap_0_fast, if we're evaluating a thunk, the thunk might
        evaluate to a function in which case we may have to adjust its CCS.
      
      * The interpreter has its own implementation of stg_ap_0_fast, so we
        have to do the same shenanigans with creating empty PAPs and copying
        PAPs there.
      
      * GHCi creates Cost Centres as children of CCS_MAIN, which enterFunCCS()
        wrongly assumed to imply that they were CAFs.  Now we use the is_caf
        flag for this, which we have to correctly initialise when we create a
        Cost Centre in GHCi.
      3a18baff
  16. 17 Dec, 2016 1 commit
  17. 15 Dec, 2016 1 commit
    • Simon Marlow's avatar
      Fix cost-centre-stacks bug (#5654) · 394231b3
      Simon Marlow authored
      This fixes some cases of wrong stacks being generated by the profiler.
      For background and details on the fix see
      `Note [Evaluating functions with profiling]` in `rts/Apply.cmm`.
      
      This does have an impact on allocations for some programs when
      profiling.  nofib results:
      
      ```
         k-nucleotide          +0.0%     +8.8%    +11.0%    +11.0%      0.0%
               puzzle          +0.0%    +12.5%     0.244     0.246      0.0%
            typecheck           0.0%     +8.7%    +16.1%    +16.2%      0.0%
      ------------------------------------------------------------------------
      --------
                  Min          -0.0%     -0.0%    -34.4%    -35.5%    -25.0%
                  Max          +0.0%    +12.5%    +48.9%    +49.4%    +10.6%
       Geometric Mean          +0.0%     +0.6%     +2.0%     +1.8%     -0.3%
      
      ```
      
      But runtimes don't seem to be affected much, and the examples I looked
      at were completely legitimate.  For example, in puzzle we have this:
      
      ```
      position :: ItemType -> StateType ->  BankType
      position Bono = bonoPos
      position Edge = edgePos
      position Larry = larryPos
      position Adam = adamPos
      ```
      
      where the identifiers on the rhs are all record selectors.  Previously
      the profiler gave a stack that looked like
      
      ```
        position
        bonoPos
        ...
      ```
      
      i.e. `bonoPos` was at the same level of the call stack as `position`,
      but now it looks like
      
      ```
        position
         bonoPos
         ...
      ```
      
      I used the normaliser from the testsuite to diff the profiling output
      from other nofib programs and they all looked better.
      
      Test Plan:
      * the broken test passes
      * validate
      * compiled and ran all of nofib, measured perf, diff'd several .prof
      files
      
      Reviewers: niteria, erikd, austin, scpmw, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2804
      
      GHC Trac Issues: #5654, #10007
      394231b3
  18. 01 Oct, 2016 1 commit
  19. 20 Jul, 2016 1 commit
    • Ömer Sinan Ağacan's avatar
      Support SCC pragmas in declaration context · 98b2c508
      Ömer Sinan Ağacan authored
      Not having SCCs at the top level is becoming annoying real quick. For
      simplest cases, it's possible to do this transformation:
      
          f x y = ...
          =>
          f = {-# SCC f #-} \x y -> ...
      
      However, it doesn't work when there's a `where` clause:
      
          f x y = <t is in scope>
            where t = ...
          =>
          f = {-# SCC f #-} \x y -> <t is out of scope>
            where t = ...
      
      Or when we have a "equation style" definition:
      
          f (C1 ...) = ...
          f (C2 ...) = ...
          f (C3 ...) = ...
          ...
      
      (usual solution is to rename `f` to `f'` and define a new `f` with a
      `SCC`)
      
      This patch implements support for SCC annotations in declaration
      contexts. This is now a valid program:
      
          f x y = ...
            where
              g z = ...
              {-# SCC g #-}
          {-# SCC f #-}
      
      Test Plan: This passes slow validate (no new failures added).
      
      Reviewers: goldfire, mpickering, austin, bgamari, simonmar
      
      Reviewed By: bgamari, simonmar
      
      Subscribers: simonmar, thomie, mpickering
      
      Differential Revision: https://phabricator.haskell.org/D2407
      98b2c508
  20. 28 Jun, 2016 1 commit
    • thomie's avatar
      Testsuite: mark tests expect_broken · 3fb9837f
      thomie authored
      * T7837 is still broken for prof_ways (#9406)
      * T11627b is broken on Windows for WAY=prof_hc_hb (#12236)
      * T8089 is also broken for WAY=profasm on Windows
      3fb9837f
  21. 08 Jun, 2016 1 commit
    • Ömer Sinan Ağacan's avatar
      Show sources of cost centers in .prof · d7933cbc
      Ömer Sinan Ağacan authored
      This fixes the problem with duplicate cost-centre names that was
      reported a couple of times before. When a module implements a typeclass
      multiple times for different types, methods of different implementations
      get same cost-centre names and are reported like this:
      
          COST CENTRE MODULE            %time %alloc
      
          CAF         GHC.IO.Handle.FD    0.0   32.8
          CAF         GHC.Read            0.0    1.0
          CAF         GHC.IO.Encoding     0.0    1.8
          showsPrec   Main                0.0    1.2
          readPrec    Main                0.0   19.4
          readPrec    Main                0.0   20.5
          main        Main                0.0   20.2
      
                                                  individual      inherited
          COST CENTRE  MODULE  no.     entries  %time %alloc   %time %alloc
      
          MAIN         MAIN     53          0    0.0    0.2     0.0  100.0
           CAF         Main    105          0    0.0    0.3     0.0   62.5
            readPrec   Main    109          1    0.0    0.6     0.0    0.6
            readPrec   Main    107          1    0.0    0.6     0.0    0.6
            main       Main    106          1    0.0   20.2     0.0   61.0
             ==        Main    114          1    0.0    0.0     0.0    0.0
             ==        Main    113          1    0.0    0.0     0.0    0.0
             showsPrec Main    112          2    0.0    1.2     0.0    1.2
             showsPrec Main    111          2    0.0    0.9     0.0    0.9
             readPrec  Main    110          0    0.0   18.8     0.0   18.8
             readPrec  Main    108          0    0.0   19.9     0.0   19.9
      
      It's not possible to tell from the report which `==` took how long. This
      patch adds one more column at the cost of making outputs wider. The
      report now looks like this:
      
          COST CENTRE MODULE           SRC                       %time %alloc
      
          CAF         GHC.IO.Handle.FD <entire-module>             0.0   32.9
          CAF         GHC.IO.Encoding  <entire-module>             0.0    1.8
          CAF         GHC.Read         <entire-module>             0.0    1.0
          showsPrec   Main             Main_1.hs:7:19-22           0.0    1.2
          readPrec    Main             Main_1.hs:7:13-16           0.0   19.5
          readPrec    Main             Main_1.hs:4:13-16           0.0   20.5
          main        Main             Main_1.hs:(10,1)-(20,20)    0.0   20.2
      
                                                                             individual      inherited
          COST CENTRE  MODULE        SRC                      no. entries  %time %alloc   %time %alloc
      
          MAIN         MAIN          <built-in>                53      0    0.0    0.2     0.0  100.0
           CAF         Main          <entire-module>          105      0    0.0    0.3     0.0   62.5
            readPrec   Main          Main_1.hs:7:13-16        109      1    0.0    0.6     0.0    0.6
            readPrec   Main          Main_1.hs:4:13-16        107      1    0.0    0.6     0.0    0.6
            main       Main          Main_1.hs:(10,1)-(20,20) 106      1    0.0   20.2     0.0   61.0
             ==        Main          Main_1.hs:7:25-26        114      1    0.0    0.0     0.0    0.0
             ==        Main          Main_1.hs:4:25-26        113      1    0.0    0.0     0.0    0.0
             showsPrec Main          Main_1.hs:7:19-22        112      2    0.0    1.2     0.0    1.2
             showsPrec Main          Main_1.hs:4:19-22        111      2    0.0    0.9     0.0    0.9
             readPrec  Main          Main_1.hs:7:13-16        110      0    0.0   18.8     0.0   18.8
             readPrec  Main          Main_1.hs:4:13-16        108      0    0.0   19.9     0.0   19.9
           CAF         Text.Read.Lex <entire-module>          102      0    0.0    0.5     0.0    0.5
      
      To fix failing test cases because of different orderings of cost centres
      (e.g. optimized and non-optimized build printing in different order),
      with this patch we also start sorting cost centres before printing. The
      order depends on 1) entries (more entered cost centres come first) 2)
      names (using strcmp() on cost centre names).
      
      Reviewers: simonmar, austin, erikd, bgamari
      
      Reviewed By: simonmar, bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2282
      
      GHC Trac Issues: #11543, #8473, #7105
      d7933cbc
  22. 24 May, 2016 1 commit
    • seraphime's avatar
      Fix: #12084 deprecate old profiling flags · 1956cbf1
      seraphime authored
      Change help message so it doesn't specify -auto-all.
      Make old profiling flags deprecated as they are no longer
      documented.
      Update Makefile and documentation accordingly.
      Update release notes for ghc 8.2
      
      Test Plan:
      ./verify; `ghc --help` shouldn't specify the -auto-all
      flag. Furthermore `ghc -fprof -auto-all` should emit a warning
      
      Reviewed By: thomie, austin
      
      Differential Revision: https://phabricator.haskell.org/D2257
      
      GHC Trac Issues: #12084
      
      Update submodule nofib
      1956cbf1
  23. 28 Apr, 2016 1 commit
  24. 20 Mar, 2016 1 commit
  25. 23 Feb, 2016 3 commits
  26. 27 Jan, 2016 2 commits
  27. 26 Jan, 2016 1 commit
    • thomie's avatar
      Fix segmentation fault when .prof file not writeable · 6d2bdfd8
      thomie authored
      There are two ways to do retainer profiling. Quoting from the user's guide:
        1. `+RTS -hr` "Breaks down the graph by retainer set"
        2. `+RTS -hr<cc> -h<x>`, where `-h<x>` is one of normal heap profiling
           break-down options (e.g. `-hc`), and `-hr<cc> means "Restrict the
           profile to closures with retainer sets containing cost-centre
           stacks with one of the specified cost centres at the top."
      
      Retainer profiling writes to a .hp file, like the other heap profiling
      options, but also to a .prof file. Therefore, when the .prof file is not
      writeable for whatever reason, retainer profiling should be turned off
      completely.
      
      This worked ok when running the program with `+RTS -hr` (option 1), but a
      segfault would occur when using `+RTS -hr<cc> -h<x>`, with `x!=r` (option 2).
      
      This commit fixes that.
      
      Reviewed by: bgamari
      
      Differential Revision: https://phabricator.haskell.org/D1849
      
      GHC Trac Issues: #11489
      6d2bdfd8
  28. 08 Jan, 2016 2 commits
  29. 04 Feb, 2015 1 commit
    • Simon Marlow's avatar
      Fix a profiling bug · daed18c3
      Simon Marlow authored
      Summary:
      We were erroneously discarding SCCs on function-typed variables.
      These can affect the call stack, so we have to retain them.  The bug
      was introduced during the recent SourceNote refactoring.
      
      This is an alternative to the fix proposed in D616.  I also added the
      scc005 test from that diff, which works with this change.
      
      While I was here, I also fixed up the other profiling tests, marking a
      few as expect_broken_for(10037) where the opt/unopt output differs in
      non-fatal ways.
      
      Test Plan: profiling tests
      
      Reviewers: scpmw, ezyang, austin
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D636
      
      GHC Trac Issues: #10007
      daed18c3
  30. 05 May, 2014 1 commit
  31. 14 Feb, 2013 1 commit
  32. 25 Jan, 2013 1 commit
  33. 27 Jul, 2012 1 commit
  34. 15 Jun, 2012 1 commit
  35. 07 Dec, 2011 1 commit