Skip to content
Snippets Groups Projects
  1. May 10, 2022
  2. May 09, 2022
  3. May 08, 2022
  4. May 06, 2022
  5. May 05, 2022
  6. May 04, 2022
  7. May 03, 2022
    • Sebastian Graf's avatar
      Assume at least one evaluation for nested SubDemands (#21081, #21133) · 15ffe2b0
      Sebastian Graf authored
      See the new `Note [SubDemand denotes at least one evaluation]`.
      
      A demand `n :* sd` on a let binder `x=e` now means
      
      > "`x` was evaluated `n` times and in any program trace it is evaluated, `e` is
      >  evaluated deeply in sub-demand `sd`."
      
      The "any time it is evaluated" premise is what this patch adds. As a result,
      we get better nested strictness. For example (T21081)
      ```hs
      f :: (Bool, Bool) -> (Bool, Bool)
      f pr = (case pr of (a,b) -> a /= b, True)
      -- before: <MP(L,L)>
      -- after:  <MP(SL,SL)>
      
      g :: Int -> (Bool, Bool)
      g x = let y = let z = odd x in (z,z) in f y
      ```
      The change in demand signature "before" to "after" allows us to case-bind `z`
      here.
      
      Similarly good things happen for the `sd` in call sub-demands `Cn(sd)`, which
      allows for more eta-reduction (which is only sound with `-fno-pedantic-bottoms`,
      albeit).
      
      We also fix #21085, a surprising inconsistency with `Poly` to `Call` sub-demand
      expansion.
      
      In an attempt to fix a regression caused by less inlining due to eta-reduction
      in T15426, I eta-expanded the definition of `elemIndex` and `elemIndices`, thus
      fixing #21345 on the go.
      
      The main point of this patch is that it fixes #21081 and #21133.
      
      Annoyingly, I discovered that more precise demand signatures for join points can
      transform a program into a lazier program if that join point gets floated to the
      top-level, see #21392. There is no simple fix at the moment, but !5349 might.
      Thus, we accept a ~5% regression in `MultiLayerModulesTH_OneShot`, where #21392
      bites us in `addListToUniqDSet`. T21392 reliably reproduces the issue.
      
      Surprisingly, ghc/alloc perf on Windows improves much more than on other jobs, by
      0.4% in the geometric mean and by 2% in T16875.
      
      Metric Increase:
          MultiLayerModulesTH_OneShot
      Metric Decrease:
          T16875
      15ffe2b0
Loading