  1. Jan 08, 2024
  2. Oct 24, 2021
    • DmdAnal: Implement Boxity Analysis (#19871) · 3bab222c
      Sebastian Graf authored and Marge Bot committed
      This patch fixes some abundant reboxing of `DynFlags` in
      `GHC.HsToCore.Match.Literal.warnAboutOverflowedLit` (which was the topic
      of #19407) by introducing a Boxity analysis to GHC, done as part of demand
      analysis. This lets demand analysis accurately capture the ad-hoc unboxing
      decisions previously made in worker/wrapper, so that the boxity info can
      propagate through demand signatures.
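
To illustrate the reboxing problem, here is a minimal sketch. `Config`, `remember`, `overLimit` and `check` are hypothetical names that do not appear in GHC; `Config` merely stands in for a large record such as `DynFlags`. `check` needs only one field of `cfg` on one path but hands the whole record on along another, so unboxing `cfg` in a worker would presumably force the record to be rebuilt before the call to `remember`.

```haskell
-- Hypothetical record standing in for a large environment such as DynFlags.
data Config = Config { verbosity :: !Int, limit :: !Int }
  deriving (Eq, Show)

-- Uses the record boxed: it is stored in a list, so the heap object is needed.
remember :: Config -> [Config] -> [Config]
remember cfg seen = cfg : seen

-- Uses only a field: on its own, a good candidate for unboxing the argument.
overLimit :: Config -> Int -> Bool
overLimit (Config _ lim) n = n > lim

-- Uses a field on every path, but the boxed record on one of them.
-- If a worker received the fields of `cfg` unboxed, the call to `remember`
-- would have to allocate a fresh `Config` (reboxing).
check :: Config -> Int -> [Config] -> [Config]
check cfg n seen
  | overLimit cfg n = remember cfg seen
  | otherwise       = seen
```

Compiling a module like this with `-O -ddump-simpl` is one way to see whether the worker for `check` takes the record or its fields.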
      
      See the new `Note [Boxity analysis]`. The actual fix for #19407 is described in
      `Note [No lazy, Unboxed demand in demand signature]`, but
      `Note [Finalising boxity for demand signature]` is probably a better entry-point.
      
      To support the fix for #19407, I had to change (what was)
      `Note [Add demands for strict constructors]` a bit
      (now `Note [Unboxing evaluated arguments]`). In particular, we now take care of
      it in `finaliseBoxity` (which is only called from demand analysis) instead of
      `wantToUnboxArg`.
      
      I also had to resurrect `Note [Product demands for function body]` and rename
      it to `Note [Unboxed demand on function bodies returning small products]` to
      avoid huge regressions in `join004` and `join007`, thereby fixing #4267 again.
      See the updated Note for details.
      
      A nice side-effect is that the worker/wrapper transformation no longer needs to
      look at strictness info and other bits such as `InsideInlineableFun` flags
      (needed for `Note [Do not unbox class dictionaries]`) at all. It simply collects
      boxity info from argument demands and interprets them with a severely simplified
      `wantToUnboxArg`. All the smartness is in `finaliseBoxity`, which could be moved
      to DmdAnal completely, if it wasn't for the call to `dubiousDataConInstArgTys`
      which would be awkward to export.
      
      I spent some time figuring out why `T16197` failed prior to my
      amendments to `Note [Unboxing evaluated arguments]`. Having figured it
      out, I minimised it a bit and added `T16197b`, which simply compares computed
      strictness signatures and thus should be far simpler to eyeball.
      
      The 12% ghc/alloc regression in T11545 is because of the additional `Boxity`
      field in `Poly` and `Prod` that results in more allocation during `lubSubDmd`
      and `plusSubDmd`. I made sure in the ticky profiles that the number of calls
      to those functions stayed the same. We can bear such an increase here, as we
      recently improved it by -68% (in b760c1f7).
      T18698* regress slightly because there is more unboxing of dictionaries
      happening and that causes Lint (mostly) to allocate more.
      
      Fixes #19871, #19407, #4267, #16859, #18907 and #13331.
      
      Metric Increase:
          T11545
          T18698a
          T18698b
      
      Metric Decrease:
          T12425
          T16577
          T18223
          T18282
          T4267
          T9961
  3. Oct 14, 2020
    • Fix some missed opportunities for preInlineUnconditionally · 15d2340c
      Simon Peyton Jones authored and Marge Bot committed
      There are two significant changes here:
      
      * Ticket #18815 showed that we were missing some opportunities for
        preInlineUnconditionally.  The one-line fix is in the code for
        GHC.Core.Opt.Simplify.Utils.preInlineUnconditionally, which now
        switches off only for INLINE pragmas.  I expanded
        Note [Stable unfoldings and preInlineUnconditionally] to explain.
      
      * When doing this I discovered a way in which preInlineUnconditionally
        was occasionally /too/ eager.  It's all explained in
        Note [Occurrences in stable unfoldings] in GHC.Core.Opt.OccurAnal,
        and the one-line change adding markAllMany to occAnalUnfolding.
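
The basic idea of inlining a binding that occurs exactly once, before its right-hand side is simplified, can be sketched on a toy expression language. This is not GHC's implementation: `Expr`, `occurrences`, `subst` and `preInline` are hypothetical, the language has no lambdas, all binders are assumed to be unique (so substitution cannot capture), and the real `preInlineUnconditionally` checks further conditions such as the pragmas and stable unfoldings discussed above.

```haskell
-- A toy expression language, just large enough to have let-bindings.
data Expr
  = Var String
  | Lit Int
  | Add Expr Expr
  | Let String Expr Expr
  deriving (Eq, Show)

-- Count the free occurrences of a variable in an expression.
occurrences :: String -> Expr -> Int
occurrences x (Var y)
  | x == y    = 1
  | otherwise = 0
occurrences _ (Lit _)   = 0
occurrences x (Add a b) = occurrences x a + occurrences x b
occurrences x (Let y rhs body)
  | x == y    = occurrences x rhs            -- y shadows x in the body
  | otherwise = occurrences x rhs + occurrences x body

-- Substitute e for x. Assumes unique binders, so no capture can happen.
subst :: String -> Expr -> Expr -> Expr
subst x e (Var y)
  | x == y    = e
  | otherwise = Var y
subst _ _ (Lit n)   = Lit n
subst x e (Add a b) = Add (subst x e a) (subst x e b)
subst x e (Let y rhs body)
  | x == y    = Let y (subst x e rhs) body
  | otherwise = Let y (subst x e rhs) (subst x e body)

-- Inline every binding used exactly once, substituting the unsimplified
-- right-hand side and then continuing with the result.
preInline :: Expr -> Expr
preInline (Let x rhs body)
  | occurrences x body == 1 = preInline (subst x rhs body)
  | otherwise               = Let x (preInline rhs) (preInline body)
preInline (Add a b) = Add (preInline a) (preInline b)
preInline e         = e
```

A binding used twice is left alone here, because duplicating its right-hand side could duplicate work.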
      
      I also got confused about what NoUserInline meant, so I've renamed
      it to NoUserInlinePrag, and changed its pretty-printing slightly.
      That led to some error message wibbling, and touches quite a few
      files, but there is no change in functionality.
      
      I did a nofib run.  As expected, no significant changes.
      
              Program           Size    Allocs
      ----------------------------------------
               sphere          -0.0%     -0.4%
      ----------------------------------------
                  Min          -0.0%     -0.4%
                  Max          -0.0%     +0.0%
       Geometric Mean          -0.0%     -0.0%
      
      I'm allowing a max-residency increase for T10370, which seems
      very irreproducible. (See comments on !4241.)  There is always
      sampling error for max-residency measurements; and in any case
      the change shows up on some platforms but not others.
      
      Metric Increase:
          T10370
  4. Apr 30, 2020
  5. Mar 29, 2020
    • Demand analysis: simplify the demand for a RHS · 54250f2d
      Simon Peyton Jones authored and Marge Bot committed
      Ticket #17932 showed that we were using a stupid demand for the RHS
      of a let-binding, when the result is a product.  This was the result
      of a "fix" in 2013, which (happily) turns out to no longer be
      necessary.
      
      So I just deleted the code, which simplifies the demand analyser,
      and fixes #17932. That in turn uncovered that the anticipation
      of worker/wrapper in CPR analysis was inaccurate, hence the logic
      that decides whether to unbox an argument in WW was extracted into
      a function `wantToUnbox`, now consulted by CPR analysis.
      
      I tried nofib, and got 0.0% perf changes.
      
      All this came up when messing about with !2873 (ticket #17917),
      but is independent of it.
      
      Unfortunately, this patch regresses #4267, and we realised that it is now
      blocked on #16335.
  6. Apr 20, 2018
    • Inline wrappers earlier · 8b10b896
      Simon Peyton Jones authored
      This patch has a single significant change:
      
        strictness wrapper functions are inlined earlier,
        in phase 2 rather than phase 0.
      
      As shown by Trac #15056, this gives a better chance for RULEs to fire.
      Before this change, a function that would have inlined early without
      strictness analysis was instead inlining late. Result: applying
      "optimisation" made the program worse.
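
As a rough picture of what the phase change means, the split that worker/wrapper performs can be written by hand. This is only a sketch: `norm1` and `wnorm1` are hypothetical names (GHC names generated workers `$w...`, which cannot be written in source), and the pragmas merely imitate the activation the patch describes. Phases count down, so `INLINE [2]` makes the wrapper available from phase 2 onwards, whereas `INLINE [0]` would hold it back until the final phase.

```haskell
-- Worker: takes the components of the pair directly, so no tuple is needed.
-- NOINLINE only keeps this hand-written worker visible in the output.
{-# NOINLINE wnorm1 #-}
wnorm1 :: Int -> Int -> Int
wnorm1 a b = abs a + abs b

-- Wrapper: takes the tuple apart and calls the worker. Once it is inlined at
-- a call site, a tuple built there can be eliminated, and rewrite rules that
-- match on the exposed structure get a chance to fire.
{-# INLINE [2] norm1 #-}
norm1 :: (Int, Int) -> Int
norm1 (a, b) = wnorm1 a b
```

Changing the pragma to `INLINE [0]` and comparing `-ddump-simpl` output is one way to observe the effect of later inlining on a small example.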
      
      This does not make too much difference in nofib, but I've stumbled
      over the problem more than once, so even a "no-change" result would be
      quite acceptable.  Here are the headlines:
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
            cacheprof          -0.5%     -0.5%     +2.5%     +2.5%      0.0%
               fulsom          -1.0%     +2.6%     -0.1%     -0.1%      0.0%
                 mate          -0.6%     +2.4%     -0.9%     -0.9%      0.0%
              veritas          -0.7%    -23.2%     0.002     0.002      0.0%
      --------------------------------------------------------------------------------
                  Min          -1.4%    -23.2%    -12.5%    -15.3%      0.0%
                  Max          +0.6%     +2.6%     +4.4%     +4.3%    +19.0%
       Geometric Mean          -0.7%     -0.2%     -1.4%     -1.7%     +0.2%
      
      * A worthwhile reduction in binary size.
      
      * Runtimes are not to be trusted much but look as if they
        are moving the right way.
      
      * A really big win in veritas, described in comment:1 of
        Trac #15056; more fusion rules fired.
      
      * I investigated the losses in 'mate' and 'fulsom'; see #15056.
  7. Sep 12, 2017
  8. Mar 30, 2016
  9. Dec 15, 2015
    • Narrow scope of special-case for unqualified printing of names in core libraries · e2c91738
      Ben Gamari authored and committed
      Commit 547c5971 modifies the
      pretty-printer to render names from a set of core packages (`base`,
      `ghc-prim`, `template-haskell`) as unqualified. The idea here was that
      many of these names typically are not in scope but are well-known by the
      user and therefore qualification merely introduces noise.
      
      This, however, is a very large hammer and potentially breaks any
      consumer who relies on parsing GHC output (hence #11208). This commit
      partially reverts this change, now only printing `Constraint` (which
      appears quite often in errors) as unqualified.
      
      Fixes #11208.
      
      Updates tests in `array` submodule.
      
      Test Plan: validate
      
      Reviewers: hvr, thomie, austin
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D1619
      
      GHC Trac Issues: #11208
  10. Jun 26, 2015
    • Improve CPR behavior for strict constructors · 0696fc6d
      Simon Peyton Jones authored
      When working on Trac #10482 I noticed that we could give constructor
      arguments the CPR property if they are used strictly.
      
      This is documented carefully in
          Note [CPR in a product case alternative]
      and also
          Note [Initial CPR for strict binders]
      
      There are a bunch of interesting examples in
          Note [CPR examples]
      which I have added to the test suite as T10482a.
      
      I also added a test for #10482 itself.
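
A small example of the kind of function this is about, written for illustration only: `Pair` and `larger` are hypothetical and are not taken from the Notes or from T10482a. Both fields are strict and both are used strictly by the comparison, so worker/wrapper can be expected to pass them unboxed. The result is always one of those fields, which, as I read the commit message, is the situation in which the pattern-bound variables can be given the CPR property, letting the worker return an unboxed result as well.

```haskell
-- A product type with strict fields (hypothetical, for illustration).
data Pair = Pair !Int !Int

-- Returns one of the constructor's fields. Both x and y are evaluated by the
-- comparison, so a worker could take them unboxed; since the result is one of
-- them, it need not be boxed inside the worker either.
larger :: Pair -> Int
larger (Pair x y)
  | x > y     = x
  | otherwise = y
```

Whether GHC actually produces such a worker depends on the compiler version and flags; `-O -ddump-simpl` shows what a given build does.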