1. 14 Sep, 2018 5 commits
    • Mark code related symbols as @function not @object · c23f057f
      Sergei Azovskov authored
      Summary:
      This diff is part of a bigger project whose goal is to improve
      support for common profiling tools (perf) in GHC binaries.
      
      A similar job was already done and reverted in the past:
       * https://phabricator.haskell.org/rGHCb1f453e16f0ce11a2ab18cc4c350bdcbd36299a6
       * https://phabricator.haskell.org/rGHCf1f3c4f50650110ad0f700d6566a44c515b0548f
      
      Reasoning:
      
      `Perf` and similar tools build an in-memory symbol table from the .symtab
      section of the ELF file to display human-readable function names instead
      of addresses in the output. `Perf` uses only two types of symbols:
      `@function` and `@notype`, but GHC cannot produce any
      `@function` symbols, so the `perf` output is pretty useless (all the
      Haskell symbols that you can see in `perf` now are `@notype` internal
      symbols extracted by mistake/hack).
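
To make the symbol-type distinction concrete, here is a minimal sketch of how an ELF symbol's type is encoded. The constants come from the ELF specification; the `usable_for_profiling` filter is a hypothetical helper that only mirrors the description above, not perf's actual code.

```python
# ELF symbol types live in the low 4 bits of the st_info byte,
# the binding (local/global/weak) in the high 4 bits.
STT_NOTYPE = 0   # `.type sym, @notype`, or no .type directive at all
STT_OBJECT = 1   # `.type sym, @object`: data
STT_FUNC = 2     # `.type sym, @function`: code


def symbol_type(st_info: int) -> int:
    """Extract the symbol type from an ELF st_info byte."""
    return st_info & 0x0F


def symbol_binding(st_info: int) -> int:
    """Extract the binding (0 = local, 1 = global, 2 = weak)."""
    return st_info >> 4


def usable_for_profiling(st_info: int) -> bool:
    """Hypothetical filter: keep only @function and @notype symbols,
    as the commit message says perf does."""
    return symbol_type(st_info) in (STT_FUNC, STT_NOTYPE)


# A global @object symbol (binding 1, type 1) is skipped ...
assert not usable_for_profiling((1 << 4) | STT_OBJECT)
# ... while the same symbol emitted as @function is kept.
assert usable_for_profiling((1 << 4) | STT_FUNC)
```

Under this model, a code symbol emitted as `@object` never reaches the profiler's symbol table, which is why the patch changes the emitted type.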
      
      The changes:
       * mark code related symbols as @function
       * small hack to mark InfoTable symbols as code if TABLES_NEXT_TO_CODE is true
      
      Limitations:
       * The perf symbolization support is not complete after this patch but
         I'm working on the second patch.
       * Constructor symbols are not supported. To fix that we can emit extra
         local symbols which mark code sections as code and will be used
         only for debugging.
      
      Test Plan:
      tests
      any additional ideas?
      
      Perf output on stock ghc 8.4.1:
      ```
           9.78%  FibbSlow  FibbSlow            [.] ckY_info
           9.59%  FibbSlow  FibbSlow            [.] cjqd_info
           7.17%  FibbSlow  FibbSlow            [.] c3sg_info
           6.62%  FibbSlow  FibbSlow            [.] c1X_info
           5.32%  FibbSlow  FibbSlow            [.] cjsX_info
           4.18%  FibbSlow  FibbSlow            [.] s3rN_info
           3.82%  FibbSlow  FibbSlow            [.] c2m_info
           3.68%  FibbSlow  FibbSlow            [.] cjlJ_info
           3.26%  FibbSlow  FibbSlow            [.] c3sb_info
           3.19%  FibbSlow  FibbSlow            [.] cjPQ_info
           3.05%  FibbSlow  FibbSlow            [.] cjQd_info
           2.97%  FibbSlow  FibbSlow            [.] cjAB_info
           2.78%  FibbSlow  FibbSlow            [.] cjzP_info
           2.40%  FibbSlow  FibbSlow            [.] cjOS_info
           2.38%  FibbSlow  FibbSlow            [.] s3rK_info
           2.27%  FibbSlow  FibbSlow            [.] cjq0_info
           2.18%  FibbSlow  FibbSlow            [.] cKQ_info
           2.13%  FibbSlow  FibbSlow            [.] cjSl_info
           1.99%  FibbSlow  FibbSlow            [.] s3rL_info
           1.98%  FibbSlow  FibbSlow            [.] c2cC_info
           1.80%  FibbSlow  FibbSlow            [.] s3rO_info
           1.37%  FibbSlow  FibbSlow            [.] c2f2_info
      ...
      ```
      
      Perf output on patched ghc:
      ```
           7.97%  FibbSlow  FibbSlow            [.] c3rM_info
           6.75%  FibbSlow  FibbSlow            [.] 0x000000000032cfa8
           6.63%  FibbSlow  FibbSlow            [.] cifA_info
           4.98%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqIntegerzh_info
           4.55%  FibbSlow  FibbSlow            [.] chXn_info
           4.52%  FibbSlow  FibbSlow            [.] c3rH_info
           4.45%  FibbSlow  FibbSlow            [.] chZB_info
           4.04%  FibbSlow  FibbSlow            [.] Main_fibbzuslow_info
           4.03%  FibbSlow  FibbSlow            [.] stg_ap_0_fast
           3.76%  FibbSlow  FibbSlow            [.] chXA_info
           3.67%  FibbSlow  FibbSlow            [.] cifu_info
           3.25%  FibbSlow  FibbSlow            [.] ci4r_info
           2.64%  FibbSlow  FibbSlow            [.] s3rf_info
           2.42%  FibbSlow  FibbSlow            [.] s3rg_info
           2.39%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqInteger_info
           2.25%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_minusInteger_info
           2.17%  FibbSlow  FibbSlow            [.] ghczmprim_GHCziClasses_zeze_info
           2.09%  FibbSlow  FibbSlow            [.] cicc_info
           2.03%  FibbSlow  FibbSlow            [.] 0x0000000000331e15
           2.02%  FibbSlow  FibbSlow            [.] s3ri_info
           1.91%  FibbSlow  FibbSlow            [.] 0x0000000000331bb8
           1.89%  FibbSlow  FibbSlow            [.] ci4N_info
      ...
      ```
      
      Reviewers: simonmar, niteria, bgamari, goldfire
      
      Reviewed By: simonmar, bgamari
      
      Subscribers: lelf, rwbarton, thomie, carter
      
      GHC Trac Issues: #15501
      
      Differential Revision: https://phabricator.haskell.org/D4713
      c23f057f
    • Mark system and internal symbols as private symbols in asm · 64c54fff
      Sergei Azovskov authored
      Summary:
      This marks system and internal symbols as private in asm output so those
      randomly generated symbols won't appear in .symtab
      
      Reasoning:
       * internal symbols don't help debugging because their names are just random
       * the symbol style breaks perf logic
       * internal symbols can take ~75% of the .symtab. At the same time,
         .symtab can take about 20% of the binary file size
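
Combining the two rough figures above gives a back-of-the-envelope estimate of how much of the binary the internal symbols account for. The numbers are the approximate shares quoted in this message, not measurements of a particular binary.

```python
# Approximate shares quoted in the commit message.
internal_share_of_symtab = 0.75   # internal symbols as a fraction of .symtab
symtab_share_of_binary = 0.20     # .symtab as a fraction of the binary

# Fraction of the whole binary taken up by internal symbols in .symtab.
internal_share_of_binary = internal_share_of_symtab * symtab_share_of_binary

print(f"{internal_share_of_binary:.0%}")
```

So hiding internal symbols could shrink the binary by roughly 15% under these assumptions.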
      
      Notice:
      This diff mostly makes sense on top of D4713 (or similar)
      
      Test Plan:
      tests
      
      Perf from D4713
      ```
           7.97%  FibbSlow  FibbSlow            [.] c3rM_info
           6.75%  FibbSlow  FibbSlow            [.] 0x000000000032cfa8
           6.63%  FibbSlow  FibbSlow            [.] cifA_info
           4.98%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqIntegerzh_info
           4.55%  FibbSlow  FibbSlow            [.] chXn_info
           4.52%  FibbSlow  FibbSlow            [.] c3rH_info
           4.45%  FibbSlow  FibbSlow            [.] chZB_info
           4.04%  FibbSlow  FibbSlow            [.] Main_fibbzuslow_info
           4.03%  FibbSlow  FibbSlow            [.] stg_ap_0_fast
           3.76%  FibbSlow  FibbSlow            [.] chXA_info
           3.67%  FibbSlow  FibbSlow            [.] cifu_info
           3.25%  FibbSlow  FibbSlow            [.] ci4r_info
           2.64%  FibbSlow  FibbSlow            [.] s3rf_info
           2.42%  FibbSlow  FibbSlow            [.] s3rg_info
           2.39%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqInteger_info
           2.25%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_minusInteger_info
           2.17%  FibbSlow  FibbSlow            [.] ghczmprim_GHCziClasses_zeze_info
           2.09%  FibbSlow  FibbSlow            [.] cicc_info
           2.03%  FibbSlow  FibbSlow            [.] 0x0000000000331e15
           2.02%  FibbSlow  FibbSlow            [.] s3ri_info
           1.91%  FibbSlow  FibbSlow            [.] 0x0000000000331bb8
           1.89%  FibbSlow  FibbSlow            [.] ci4N_info
      ...
      ```
      
      Perf from this patch:
      ```
          15.37%  FibbSlow  FibbSlow            [.] Main_fibbzuslow_info
          15.33%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_minusInteger_info
          13.34%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqIntegerzh_info
           9.24%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_plusInteger_info
           9.08%  FibbSlow  FibbSlow            [.] frame_dummy
           8.25%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqInteger_info
           4.29%  FibbSlow  FibbSlow            [.] 0x0000000000321ab0
           3.84%  FibbSlow  FibbSlow            [.] stg_ap_0_fast
           3.07%  FibbSlow  FibbSlow            [.] ghczmprim_GHCziClasses_zeze_info
           2.39%  FibbSlow  FibbSlow            [.] 0x0000000000321ab7
           1.90%  FibbSlow  FibbSlow            [.] 0x00000000003266b8
           1.88%  FibbSlow  FibbSlow            [.] base_GHCziNum_zm_info
           1.83%  FibbSlow  FibbSlow            [.] 0x0000000000326915
           1.34%  FibbSlow  FibbSlow            [.] 0x00000000003248cc
           1.07%  FibbSlow  FibbSlow            [.] base_GHCziNum_zp_info
           0.98%  FibbSlow  FibbSlow            [.] 0x00000000003247c8
           0.80%  FibbSlow  FibbSlow            [.] 0x0000000000121498
           0.79%  FibbSlow  FibbSlow            [.] stg_gc_noregs
           0.75%  FibbSlow  FibbSlow            [.] 0x0000000000321ad6
           0.67%  FibbSlow  FibbSlow            [.] 0x0000000000321aca
           0.64%  FibbSlow  FibbSlow            [.] 0x0000000000321b4a
           0.61%  FibbSlow  FibbSlow            [.] 0x00000000002ff633
      ```
      
      Reviewers: simonmar, niteria, bgamari
      
      Reviewed By: simonmar
      
      Subscribers: lelf, angerman, olsner, rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4722
      64c54fff
    • Fix T15502 on 32-bit · ecbe26b6
      Krzysztof Gogolewski authored
      Summary:
      The expected output uses a hardcoded value for
      maxBound :: Int.
      
      This should fix one of the CircleCI failures on i386.
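
As an illustration of why a hardcoded value breaks on i386: GHC's `Int` is as wide as the machine word, so `maxBound :: Int` differs between 32-bit and 64-bit targets. The helper below is a hypothetical sketch of that arithmetic, not part of the test itself.

```python
def max_bound_int(word_bits: int) -> int:
    """Largest value of a signed two's-complement integer of the given width."""
    return 2 ** (word_bits - 1) - 1


print(max_bound_int(32))  # 2147483647
print(max_bound_int(64))  # 9223372036854775807
```

Expected output written for a 64-bit machine therefore cannot match on a 32-bit one.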
      
      Test Plan: make test TEST=T15502
      
      Reviewers: RyanGlScott, bgamari
      
      Reviewed By: RyanGlScott
      
      Subscribers: rwbarton, carter
      
      GHC Trac Issues: #15502
      
      Differential Revision: https://phabricator.haskell.org/D5151
      ecbe26b6
    • tests: increase (compile) timeout multiplier for T13701 and MultiLayerModules · 3040444d
      Alp Mestanogullari authored
      Summary:
      Those tests are currently making our i386 validation fail on CircleCI:
      
        https://circleci.com/gh/ghc/ghc/8827
      
      Test Plan: Using my Phab<->CircleCI bridge to run i386 validation for this diff.
      
      Reviewers: bgamari, monoidal
      
      Reviewed By: monoidal
      
      Subscribers: rwbarton, carter
      
      GHC Trac Issues: #15484, #15383
      
      Differential Revision: https://phabricator.haskell.org/D5103
      3040444d
    • Add support for ImplicitParams and RecursiveDo in TH · 9c6b7493
      Michael Sloan authored
      Summary:
      This adds TH support for the ImplicitParams and RecursiveDo extensions.
      
      I'm submitting this as one review because I cannot cleanly make
      the two commits independent.
      
      Initially, my goal was just to add ImplicitParams support, and
      I found that reasonably straightforward, so figured I might
      as well use my newfound knowledge to address some other TH omissions.
      
      Test Plan: Validate
      
      Reviewers: goldfire, austin, bgamari, RyanGlScott
      
      Reviewed By: RyanGlScott
      
      Subscribers: carter, RyanGlScott, thomie
      
      GHC Trac Issues: #1262
      
      Differential Revision: https://phabricator.haskell.org/D1979
      9c6b7493
  2. 13 Sep, 2018 13 commits
  3. 12 Sep, 2018 5 commits
  4. 11 Sep, 2018 3 commits
  5. 10 Sep, 2018 2 commits
  6. 08 Sep, 2018 2 commits
  7. 07 Sep, 2018 4 commits
  8. 06 Sep, 2018 2 commits
    • Fix a race between GC threads in concurrent scavenging · c6fbac6a
      Ömer Sinan Ağacan authored
      While debugging #15285 I realized that free block lists (free_list in
      BlockAlloc.c) get corrupted when multiple scavenge threads allocate and
      release blocks concurrently. Here's a picture of one such race:
      
          Thread 2 (Thread 32573.32601):
          #0  check_tail
              (bd=0x940d40 <stg_TSO_info>) at rts/sm/BlockAlloc.c:860
          #1  0x0000000000928ef7 in checkFreeListSanity
              () at rts/sm/BlockAlloc.c:896
          #2  0x0000000000928979 in freeGroup
              (p=0x7e998ce02880) at rts/sm/BlockAlloc.c:721
          #3  0x0000000000928a17 in freeChain
              (bd=0x7e998ce02880) at rts/sm/BlockAlloc.c:738
          #4  0x0000000000926911 in freeChain_sync
              (bd=0x7e998ce02880) at rts/sm/GCUtils.c:80
          #5  0x0000000000934720 in scavenge_capability_mut_lists
              (cap=0x1acae80) at rts/sm/Scav.c:1665
          #6  0x000000000092b411 in gcWorkerThread
              (cap=0x1acae80) at rts/sm/GC.c:1157
          #7  0x000000000090be9a in yieldCapability
              (pCap=0x7f9994e69e20, task=0x7e9984000b70, gcAllowed=true) at rts/Capability.c:861
          #8  0x0000000000906120 in scheduleYield
              (pcap=0x7f9994e69e50, task=0x7e9984000b70) at rts/Schedule.c:673
          #9  0x0000000000905500 in schedule
              (initialCapability=0x1acae80, task=0x7e9984000b70) at rts/Schedule.c:293
          #10 0x0000000000908d4f in scheduleWorker
              (cap=0x1acae80, task=0x7e9984000b70) at rts/Schedule.c:2554
          #11 0x000000000091a30a in workerStart
              (task=0x7e9984000b70) at rts/Task.c:444
          #12 0x00007f99937fa6db in start_thread
              (arg=0x7f9994e6a700) at pthread_create.c:463
          #13 0x000061654d59f88f in clone
              () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      
          Thread 1 (Thread 32573.32573):
          #0  checkFreeListSanity
              () at rts/sm/BlockAlloc.c:887
          #1  0x0000000000928979 in freeGroup
              (p=0x7e998d303540) at rts/sm/BlockAlloc.c:721
          #2  0x0000000000926f23 in todo_block_full
              (size=513, ws=0x1aa8ce0) at rts/sm/GCUtils.c:264
          #3  0x00000000009583b9 in alloc_for_copy
              (size=513, gen_no=0) at rts/sm/Evac.c:80
          #4  0x000000000095850d in copy_tag_nolock
              (p=0x7e998c675f28, info=0x421d98 <Main_Large_con_info>, src=0x7e998d075d80, size=513,
              gen_no=0, tag=1) at rts/sm/Evac.c:153
          #5  0x0000000000959177 in evacuate
              (p=0x7e998c675f28) at rts/sm/Evac.c:715
          #6  0x0000000000932388 in scavenge_small_bitmap
              (p=0x7e998c675f28, size=1, bitmap=0) at rts/sm/Scav.c:271
          #7  0x0000000000934aaf in scavenge_stack
              (p=0x7e998c675f28, stack_end=0x7e998c676000) at rts/sm/Scav.c:1908
          #8  0x0000000000934295 in scavenge_one
              (p=0x7e998c66e000) at rts/sm/Scav.c:1466
          #9  0x0000000000934662 in scavenge_mutable_list
              (bd=0x7e998d300440, gen=0x1b1d880) at rts/sm/Scav.c:1643
          #10 0x0000000000934700 in scavenge_capability_mut_lists
              (cap=0x1aaa340) at rts/sm/Scav.c:1664
          #11 0x00000000009299b6 in GarbageCollect
              (collect_gen=0, do_heap_census=false, gc_type=2, cap=0x1aaa340, idle_cap=0x1b38aa0)
              at rts/sm/GC.c:378
          #12 0x0000000000907a4a in scheduleDoGC
              (pcap=0x7ffdec5b5310, task=0x1b36650, force_major=false) at rts/Schedule.c:1798
          #13 0x0000000000905de7 in schedule
              (initialCapability=0x1aaa340, task=0x1b36650) at rts/Schedule.c:546
          #14 0x0000000000908bc4 in scheduleWaitThread
              (tso=0x7e998c0067c8, ret=0x0, pcap=0x7ffdec5b5430) at rts/Schedule.c:2537
          #15 0x000000000091b5a0 in rts_evalLazyIO
              (cap=0x7ffdec5b5430, p=0x9c11f0, ret=0x0) at rts/RtsAPI.c:530
          #16 0x000000000091ca56 in hs_main
              (argc=1, argv=0x7ffdec5b5628, main_closure=0x9c11f0, rts_config=...) at rts/RtsMain.c:72
          #17 0x0000000000421ea0 in main
              ()
      
      In particular, dbl_link_onto() which is used to add a freed block to a
      doubly-linked free list is not thread safe and corrupts the list when
      called concurrently.
      
      Note that thread 1 is to blame here as thread 2 is properly taking the
      spinlock. With this patch we now take the spinlock when freeing a todo
      block in GC, avoiding this race.
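
The shape of the fix can be sketched with a small analogue. This is not the RTS code: `Block`, `FreeList` and `check_sanity` are hypothetical stand-ins, and a `threading.Lock` plays the role of the spinlock. The point is only that the push onto a doubly-linked list is a read-modify-write of the head and must be serialised by every caller.

```python
import threading


class Block:
    """Stand-in for a block descriptor with forward and back links."""

    def __init__(self, ident):
        self.ident = ident
        self.link = None   # next block in the list
        self.back = None   # previous block in the list


class FreeList:
    def __init__(self):
        self.head = None
        self.lock = threading.Lock()   # plays the role of the spinlock

    def _link_onto(self, bd):
        # Unsynchronised push: a read-modify-write of self.head.
        # Two threads interleaving here can both read the same old head,
        # leaving one pushed block lost or mis-linked.
        bd.link = self.head
        bd.back = None
        if self.head is not None:
            self.head.back = bd
        self.head = bd

    def free_block(self, bd):
        # Every caller takes the lock, so pushes are serialised.
        with self.lock:
            self._link_onto(bd)


def check_sanity(free_list):
    """Walk the list, verify the back links, and return its length."""
    count, prev, cur = 0, None, free_list.head
    while cur is not None:
        assert cur.back is prev, "corrupted back link"
        count, prev, cur = count + 1, cur, cur.link
    return count


def run(n_threads=4, per_thread=1000):
    """Free blocks from several threads and return the final list length."""
    fl = FreeList()

    def worker(tid):
        for i in range(per_thread):
            fl.free_block(Block((tid, i)))

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return check_sanity(fl)
```

If any caller used `_link_onto` directly while others held the lock, the list could still be corrupted, which matches the situation described above where only one of the two threads took the spinlock.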
      
      Test Plan:
      - Tried slow validate locally: this patch does not introduce new failures.
      - circleci: https://circleci.com/gh/ghc/ghc-diffs/283 The test got killed
        because it took 5 hours but T7919 (which was previously failing on circleci)
        passed.
      
      Reviewers: simonmar, bgamari, erikd
      
      Reviewed By: simonmar
      
      Subscribers: rwbarton, carter
      
      GHC Trac Issues: #15285
      
      Differential Revision: https://phabricator.haskell.org/D5115
      c6fbac6a
    • Remove an incorrect assertion in threadPaused: · 16bc7ae8
      Ömer Sinan Ağacan authored
      The assertion is triggered when we have a loop in the program (in which case we
      see the same update frame multiple times in the stack). See #14915 for more
      details.
      
      Reviewers: simonmar, bgamari, erikd
      
      Reviewed By: simonmar
      
      Subscribers: rwbarton, carter
      
      GHC Trac Issues: #14915
      
      Differential Revision: https://phabricator.haskell.org/D5133
      16bc7ae8
  9. 05 Sep, 2018 4 commits
    • Preserve specialisations despite CSE · 3addf72a
      Simon Peyton Jones authored
      Trac #15445 showed that, as a result of CSE, a function with an
      automatically generated specialisation RULE could be inlined
      before the RULE had a chance to fire.
      
      This patch attaches a NOINLINE[2] activation to the Id, during
      CSE, to stop this happening.
      
      See Note [Delay inlining after CSE]
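
A toy model of what a NOINLINE[2] activation means, assuming the simplifier runs an initial phase followed by phases 2, 1, 0. The function names here are hypothetical and only model the phase arithmetic; they are not GHC's API.

```python
# Simplifier phases in the order they are assumed to run.
PHASES = ["initial", 2, 1, 0]


def active_after(n, phase):
    """Model of an 'active from phase n onwards' activation.

    Phases count down, so 'from phase n onwards' means phase <= n.
    Nothing with this activation is active in the initial phase.
    """
    return phase != "initial" and phase <= n


def noinline(n, phase):
    """NOINLINE[n]: inlining is blocked until phase n, then allowed."""
    return active_after(n, phase)


for phase in PHASES:
    print(phase, noinline(2, phase))
```

In this model the Id cannot be inlined during the initial phase, which leaves that phase free for the specialisation RULE to fire first.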
      
      ---- Historical note ---
      
      This patch is simpler and more direct than an earlier
      version:
      
        commit 2110738b
        Author: Simon Peyton Jones <simonpj@microsoft.com>
        Date:   Mon Jul 30 13:43:56 2018 +0100
      
        Don't inline functions with RULES too early
      
      We had to revert this patch because it made GHC itself slower.
      
      Why? It delayed inlining of /all/ functions with RULES, and that was
      very bad in TcFlatten.flatten_ty_con_app
      
      * It delayed inlining of liftM
      * That delayed the unravelling of the recursion in some dictionary
        bindings.
      * That delayed some eta expansion, leaving
           flatten_ty_con_app = \x y. let <stuff> in \z. blah
      * That allowed the float-out pass to put stuff between
        the \y and \z.
      * And that permanently stopped eta expansion of the function,
        even once <stuff> was simplified.
      
      -- End of historical note ---
      3addf72a
    • Define activeAfterInitial, activeDuringFinal · 1152a3be
      Simon Peyton Jones authored
      This is pure refactoring, just adding a couple of
      definitions to BasicTypes, and using them.
      
      Plus some whitespace stuff.
      1152a3be
    • Expose 'moduleToPkgConfAll' from 'PackageState' · e29ac2db
      Alec Theriault authored
      Summary:
      Having direct access to this field is going to enable Haddock to
      compute in batch which modules to load before looking up instances
      of external packages.
      
      Reviewers: bgamari, monoidal
      
      Reviewed By: monoidal
      
      Subscribers: rwbarton, carter
      
      Differential Revision: https://phabricator.haskell.org/D5100
      e29ac2db
    • testsuite: Use bools for booleans, not ints · ecde9546
      Ben Gamari authored
      Summary: Just as it says on the tin.
      
      Test Plan: Validate
      
      Reviewers: bgamari, osa1
      
      Reviewed By: osa1
      
      Subscribers: osa1, monoidal, rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D5010
      ecde9546