1. 15 May, 2020 2 commits
    • GHC.Cmm.Opt: Handle MO_XX_Conv · 568d7279
      Ben Gamari authored
      This MachOp was introduced by 2c959a18
      but a wildcard match in cmmMachOpFoldM hid the fact that it wasn't
      handled. Ideally we would eliminate the match but this appears to be a
      larger task.
      
      Fixes #18141.
    • DmdAnal: Improve handling of precise exceptions · 9bd20e83
      Sebastian Graf authored
      This patch does two things: Fix possible unsoundness in what was called
      the "IO hack" and implement part 2.1 of the "fixing precise exceptions"
      plan in
      https://gitlab.haskell.org/ghc/ghc/wikis/fixing-precise-exceptions,
      which, in combination with !2956, supersedes !3014 and !2525.
      
      **IO hack**
      
      The "IO hack" (which is a fallback to preserve precise exceptions
      semantics and thus soundness, rather than some smart thing that
      increases precision) is called `exprMayThrowPreciseException` now.
      I came up with three test cases exemplifying possible unsoundness (if
      twisted enough) in the old approach:
      
      - `T13380d`: Demonstrating unsoundness of the "IO hack" when resorting
                   to manual state token threading and direct use of primops.
                   More details below.
      - `T13380e`: Demonstrating unsoundness of the "IO hack" when we have
                   Nested CPR. Not currently relevant, as we don't have Nested
                   CPR yet.
      - `T13380f`: Demonstrating unsoundness of the "IO hack" for safe FFI
                   calls.
      
      Basically, the IO hack assumed that precise exceptions can only be
      thrown from a case scrutinee of type `(# State# RealWorld, _ #)`. I
      couldn't come up with a program using the `IO` abstraction that violates
      this assumption. But it's easy to do so via manual state token threading
      and direct use of primops, see `T13380d`. Also similar code might be
      generated by Nested CPR in the (hopefully not too) distant future, see
      `T13380e`. Hence, we now have a more careful test in `forcesRealWorld`
      that passes `T13380{d,e}` (and will hopefully be robust to Nested CPR).
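
      To make the failure mode concrete, here is a minimal sketch in the
      spirit of what is described for `T13380d`: a precise exception is
      raised via `raiseIO#` while the state token is threaded by hand, so
      the case scrutinee in `f` has type `State# RealWorld` rather than
      `(# State# RealWorld, _ #)`. The names `throws` and `f` are
      illustrative assumptions and not necessarily the actual test source.

      ```haskell
      {-# LANGUAGE MagicHash, UnboxedTuples #-}

      import Control.Exception
      import GHC.Exts (RealWorld, State#, raiseIO#)
      import GHC.IO (IO (..))

      -- An "unboxed" IO action: it takes and returns a bare state token and
      -- throws a precise exception. NOINLINE keeps the raiseIO# call hidden
      -- from the call site. (Hypothetical helper.)
      throws :: State# RealWorld -> State# RealWorld
      throws s = case raiseIO# (toException (userError "What")) s of
        (# s', _ #) -> s'
      {-# NOINLINE throws #-}

      -- The first branch throws before y is ever inspected, so f must not be
      -- treated as strict in y.
      f :: Int -> Int -> IO Int
      f x y
        | x > 0     = IO (\s -> case throws s of s' -> (# s', 0 #))
        | y > 0     = return 1
        | otherwise = return 2
      {-# NOINLINE f #-}
      ```

      If precise exception semantics are preserved, `f 2 undefined` should
      raise the user error `What` without forcing its second argument; an
      analysis that wrongly considers `f` strict in `y` could evaluate
      `undefined` first.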
      
      **Precise exceptions**
      
      In #13380 and #17676 we saw that we didn't preserve precise exception
      semantics in demand analysis. We fixed that with minimal changes in
      !2956, but that was terribly unprincipled.
      
      That unprincipledness resulted in a loss of precision, which is tracked
      by these new test cases:
      
      - `T13380b`: Regression in dead code elimination, because !2956 was too
                   syntactic about `raiseIO#`
      - `T13380c`: No need to apply the "IO hack" when the IO action may not
                   throw a precise exception (and the existing IO hack doesn't
                   detect that)
      
      Fixing both issues in !3014 turned out to be too complicated and had
      the potential to regress in the future. Hence we decided to only fix
      `T13380b` and augment the `Divergence` lattice with a new middle-layer
      element, `ExnOrDiv`, which means either `Diverges` (which includes
      throwing an imprecise exception) or throws a *precise* exception.
      
      See the wiki page on Step 2.1 for more implementation details:
      https://gitlab.haskell.org/ghc/ghc/wikis/fixing-precise-exceptions#dead-code-elimination-for-raiseio-with-isdeadenddiv-introducing-exnordiv-step-21
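
      The extended lattice can be sketched as a toy model. This is not
      GHC's actual definition: the top element `Dunno`, the function
      `lubDivergence`, and the exact shape of `isDeadEndDiv` are
      assumptions made for illustration.

      ```haskell
      -- A three-point divergence lattice. Constructors are listed from
      -- bottom to top, so the derived Ord instance matches the lattice order.
      data Divergence
        = Diverges  -- definitely diverges or throws an imprecise exception
        | ExnOrDiv  -- as Diverges, or throws a precise exception
        | Dunno     -- might also return normally (assumed top element)
        deriving (Eq, Ord, Show)

      -- Least upper bound, e.g. for joining the alternatives of a case.
      lubDivergence :: Divergence -> Divergence -> Divergence
      lubDivergence = max

      -- An expression is a dead end if it can never return normally,
      -- whether or not the exception it may throw is precise.
      isDeadEndDiv :: Divergence -> Bool
      isDeadEndDiv d = d /= Dunno
      ```

      In this model both `Diverges` and `ExnOrDiv` count as dead ends, so
      code following a `raiseIO#` call can be dropped, while only
      `Diverges` would justify ignoring precise exception semantics.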
  2. 14 May, 2020 14 commits
  3. 13 May, 2020 12 commits
    • doc: Reformulate the opening paragraph of Ch. 4 in User's guide · 266310c3
      Ivan-Yudin authored
      Removes the mention of Hugs
      (it is no longer helpful for new users).
      
      Changes the wording for the rest of the paragraph.
      
      Fixes #18132.
    • testsuite: Add testcase for #18129 · 9e4b981f
      Ben Gamari authored
    • testsuite: Print sign of performance changes · 5d0f2445
      Ben Gamari authored
      Implements the minor formatting change to the tabulated performance
      changes suggested in #18135.
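
      The idea of printing an explicit sign can be sketched as follows.
      This is only an illustration of the formatting, not the testsuite
      driver's code; the function name `showSignedPercent` and the choice
      of one decimal place are assumptions.

      ```haskell
      import Numeric (showFFloat)

      -- Render a relative change (already in percent) with an explicit sign,
      -- so that increases and decreases are easy to tell apart in a table.
      showSignedPercent :: Double -> String
      showSignedPercent x = sign ++ showFFloat (Just 1) x "%"
        where
          -- showFFloat already emits '-' for negative numbers, so only
          -- non-negative values need a sign added.
          sign
            | x >= 0    = "+"
            | otherwise = ""
      ```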
    • users-guide: Add discussion of shared object naming · e34bf656
      Ben Gamari authored
      Fixes #18074.
    • CprAnal: Don't attach CPR sigs to expandable bindings (#18154) · 86d8ac22
      Sebastian Graf authored
      Instead, look through expandable unfoldings in `cprTransform`.
      See the new Note [CPR for expandable unfoldings]:
      
      ```
      Long static data structures (whether top-level or not) like
      
        xs = x1 : xs1
        xs1 = x2 : xs2
        xs2 = x3 : xs3
      
      should not get CPR signatures, because they
      
        * Never get WW'd, so their CPR signature should be irrelevant after analysis
          (in fact the signature might even be harmful for that reason)
        * Would need to be inlined/expanded to see their constructed product
        * Recording CPR on them blows up interface file sizes and is redundant with
          their unfolding. In case of Nested CPR, this blow-up can be quadratic!
      
      But we can't just stop giving DataCon application bindings the CPR property,
      for example
      
        fac 0 = 1
        fac n = n * fac (n-1)
      
      fac certainly has the CPR property and should be WW'd! But FloatOut will
      transform the first clause to
      
        lvl = 1
        fac 0 = lvl
      
      If lvl doesn't have the CPR property, fac won't either. But lvl doesn't have a
      CPR signature to extrapolate into a CPR transformer ('cprTransform'). So
      instead we keep on cprAnal'ing through *expandable* unfoldings for these arity
      0 bindings via 'cprExpandUnfolding_maybe'.
      
      In practice, GHC generates a lot of (nested) TyCon and KindRep bindings, one
      for each data declaration. It's wasteful to attach CPR signatures to each of
      them (and intractable in case of Nested CPR).
      ```
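
      For readers unfamiliar with what "WW'd" buys for `fac`, here is a
      hand-written sketch of the worker/wrapper split that the CPR property
      enables. GHC derives such a split automatically; the name `wfac` and
      the specialisation to `Int` are assumptions made for illustration.

      ```haskell
      {-# LANGUAGE MagicHash #-}

      import GHC.Exts (Int (I#), Int#, (*#), (-#))

      -- Worker: operates on unboxed integers and returns the field of the
      -- I# constructor directly, so recursive calls allocate no boxes.
      wfac :: Int# -> Int#
      wfac 0# = 1#
      wfac n  = n *# wfac (n -# 1#)

      -- Wrapper: unboxes the argument and reboxes the worker's result. It is
      -- small enough to inline, exposing the I# constructor to callers.
      fac :: Int -> Int
      fac (I# n) = I# (wfac n)
      ```

      The split is only possible if the base case is known to return a
      constructed product, which is why the floated-out `lvl = 1` binding
      has to be looked through via its unfolding.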
      
      Fixes #18154.
    • Add few cleanups of the CAF logic · cb22348f
      Ben Gamari authored
      Give the NameSet of non-CAFfy names a proper newtype to distinguish it
      from all of the other NameSets floating about.
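
      The benefit of the newtype can be sketched in a simplified setting.
      Here `Name` is modelled as a `String` and `NameSet` as a `Data.Set`,
      both stand-ins for GHC's real types; `NonCaffySet` and
      `mayHaveCafRefs` are hypothetical names chosen for this sketch.

      ```haskell
      import qualified Data.Set as Set

      -- Simplified stand-ins for GHC's Name and NameSet (assumptions).
      type Name = String
      type NameSet = Set.Set Name

      -- Wrapping the set gives it a distinct type, so an arbitrary NameSet
      -- cannot be passed by accident where non-CAFfy names are expected.
      newtype NonCaffySet = NonCaffySet NameSet

      -- A name may refer to CAFs unless it is recorded as non-CAFfy.
      mayHaveCafRefs :: NonCaffySet -> Name -> Bool
      mayHaveCafRefs (NonCaffySet nonCaffy) name =
        not (Set.member name nonCaffy)
      ```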
    • docs: Add examples for Data.Semigroup.Arg{Min,Max} · 8c0740b7
      Simon Jakobi authored
      Context: #17153
    • Ben Gamari's avatar
      8ad8dc41
    • get-win32-tarballs: Fix base URL · 670c3e5c
      Ben Gamari authored
      Revert a change previously made for testing purposes.
    • Pack some of IdInfo fields into a bit field · a03da9bf
      Ömer Sinan Ağacan authored
      This reduces the compiler's residency quite a bit on some programs.
      Example stats when building T10370:
      
      Before:
      
         2,871,242,832 bytes allocated in the heap
         4,693,328,008 bytes copied during GC
            33,941,448 bytes maximum residency (276 sample(s))
               375,976 bytes maximum slop
                    83 MiB total memory in use (0 MB lost due to fragmentation)
      
      After:
      
         2,858,897,344 bytes allocated in the heap
         4,629,255,440 bytes copied during GC
            32,616,624 bytes maximum residency (278 sample(s))
               314,400 bytes maximum slop
                    80 MiB total memory in use (0 MB lost due to fragmentation)
      
      So -3.9% residency, -1.4% bytes copied and -0.4% allocations.
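
      The percentages can be rechecked from the raw numbers with a small
      helper. The names `relChange` and `t10370` are made up for this
      sketch; the figures are copied from the statistics quoted above.

      ```haskell
      -- Relative change from a "before" to an "after" measurement, in
      -- percent. Negative values mean the metric decreased.
      relChange :: Integer -> Integer -> Double
      relChange before after =
        100 * fromIntegral (after - before) / fromIntegral before

      -- (metric, before, after) for the T10370 build.
      t10370 :: [(String, Integer, Integer)]
      t10370 =
        [ ("bytes allocated",   2871242832, 2858897344)
        , ("bytes copied",      4693328008, 4629255440)
        , ("maximum residency",   33941448,   32616624)
        ]
      ```

      In GHCi, `[(m, relChange b a) | (m, b, a) <- t10370]` lists the
      change for each metric.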
      
      Fixes #17497
      
      Metric Decrease:
          T9233
          T9675
    • rts/CNF: Fix fixup comparison function · cf4f1e2f
      Ben Gamari authored
      Previously we would implicitly convert the difference between two words
      to an int, resulting in an integer overflow on 64-bit machines.
      
      Fixes #16992
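
      The RTS code in question is C, but the class of bug can be modelled
      in Haskell. The sketch below assumes a 32-bit `int` and 64-bit words;
      `cmpTruncating` and `cmpDirect` are hypothetical names and not the
      RTS functions.

      ```haskell
      import Data.Int (Int32)
      import Data.Word (Word64)

      -- Buggy pattern: subtract two words and squeeze the difference into a
      -- 32-bit signed integer, as an implicit conversion to int would.
      cmpTruncating :: Word64 -> Word64 -> Ordering
      cmpTruncating a b = compare (fromIntegral (a - b) :: Int32) 0

      -- Robust pattern: compare the words directly and never form the
      -- difference at all.
      cmpDirect :: Word64 -> Word64 -> Ordering
      cmpDirect = compare
      ```

      For words that are close together both functions agree, but
      `cmpTruncating (2 ^ 32) 0` yields `EQ` and
      `cmpTruncating (2 ^ 31) 0` yields `LT`, although the first argument
      is larger in both cases.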
  4. 10 May, 2020 2 commits
  5. 08 May, 2020 10 commits