1. 18 Feb, 2021 1 commit
  2. 06 Feb, 2021 1 commit
  3. 02 Jan, 2021 1 commit
    • Simon Peyton Jones's avatar
      Don't use absentError thunks for strict constructor fields · a8926e95
      Simon Peyton Jones authored
      This patch fixes #19133 by using LitRubbish for strict constructor
      fields, even if they are of lifted types.  Previously LitRubbish
      worked only for unlifted (but boxed) types.
      The change is very easy, although I needed a boolean field in
      LitRubbish to say whether or not it is lifted.  (That seemed easier
      than giving it another type argument.
      This is preparing for Andreas's work on establishing the invariant
      that strict constructor fields are always tagged and evaluated
      (see #16970).
      Meanwhile, nothing was actually wrong before, so there are no tests.
  4. 23 Dec, 2020 1 commit
  5. 12 Dec, 2020 2 commits
    • Sebastian Graf's avatar
      Demand: Simplify `CU(U)` to `U` (#19005) · 3aae036e
      Sebastian Graf authored
      Both sub-demands encode the same information.
      This is a trivial change and already affects a few regression tests
      (e.g. `T5075`), so no separate regression test is necessary.
    • Sebastian Graf's avatar
      DmdAnal: Annotate top-level function bindings with demands (#18894) · 5bd71bfd
      Sebastian Graf authored
      It's useful to annotate a non-exported top-level function like `g` in
      module Lib (h) where
      g :: Int -> Int -> (Int,Int)
      g m 1 = (m, 0)
      g m n = (2 * m, 2 `div` n)
      {-# NOINLINE g #-}
      h :: Int -> Int
      h 1 = 0
      h m
        | odd m     = snd (g m 2)
        | otherwise = uncurry (+) (g 2 m)
      with its demand `UCU(CS(P(1P(U),SP(U))`, which tells us that whenever `g` was
      called, the second component of the returned pair was evaluated strictly.
      Since #18903 we do so for local functions, where we can see all calls.
      For top-level functions, we can assume that all *exported* functions are
      demanded according to `topDmd` and thus get sound demands for
      non-exported top-level functions.
      The demand on `g` is crucial information for Nested CPR, which may the
      go on and unbox `g` for the second pair component. That is true even if
      that pair component may diverge, as is the case for the call site `g 13
      0`, which throws a div-by-zero exception.
      In `T18894b`, you can even see the new demand annotation enabling us to
      eta-expand a function that we wouldn't be able to eta-expand without
      Call Arity.
      We only track bindings of function type in order not to risk huge compile-time
      regressions, see `isInterestingTopLevelFn`.
      There was a CoreLint check that rejected strict demand annotations on
      recursive or top-level bindings, which seems completely unjustified.
      All the cases I investigated were fine, so I removed it.
      Fixes #18894.
  6. 20 Nov, 2020 1 commit
    • Sebastian Graf's avatar
      Demand: Interleave usage and strictness demands (#18903) · 0aec78b6
      Sebastian Graf authored
      As outlined in #18903, interleaving usage and strictness demands not
      only means a more compact demand representation, but also allows us to
      express demands that we weren't easily able to express before.
      Call demands are *relative* in the sense that a call demand `Cn(cd)`
      on `g` says "`g` is called `n` times. *Whenever `g` is called*, the
      result is used according to `cd`". Example from #18903:
      h :: Int -> Int
      h m =
        let g :: Int -> (Int,Int)
            g 1 = (m, 0)
            g n = (2 * n, 2 `div` n)
            {-# NOINLINE g #-}
        in case m of
          1 -> 0
          2 -> snd (g m)
          _ -> uncurry (+) (g m)
      Without the interleaved representation, we would just get `L` for the
      strictness demand on `g`. Now we are able to express that whenever
      `g` is called, its second component is used strictly in denoting `g`
      by `1C1(P(1P(U),SP(U)))`. This would allow Nested CPR to unbox the
      division, for example.
      Fixes #18903.
      While fixing regressions, I also discovered and fixed #18957.
      Metric Decrease:
  7. 01 Nov, 2020 1 commit
  8. 27 Oct, 2020 1 commit
    • Sebastian Graf's avatar
      DmdAnal: Kill `is_thunk` case in `splitFV` · 28f98b01
      Sebastian Graf authored
      The `splitFV` function implements the highly dubious hack
      described in `Note [Lazy und unleashable free variables]` in
      GHC.Core.Opt.DmdAnal. It arranges it so that demand signatures only
      carry strictness info on free variables. Usage info is released through
      other means, see the Note. It's purely for analysis performance reasons.
      It turns out that `splitFV` has a quite involved case for thunks that
      produces slightly different usage signatures and it's not clear why we
      need it: `splitFV` is only relevant in the LetDown case and the only
      time we call it on thunks is for top-level or local recursive thunks.
      Since usage signatures of top-level thunks can only reference other
      top-level bindings and we completely discard demand info we have on
      top-level things (see the lack of `setIdDemandInfo` in
      `dmdAnalTopBind`), the `is_thunk` case is completely irrelevant here.
      For local, recursive thunks, the added benefit of the `is_thunk` test
      is marginal: We get used-multiple-times in some cases where previously
      we had used-once if a recursive thunk has multiple call sites. It's
      very unlikely and not a case to optimise for.
      So we kill the `is_thunk` case and inline `splitFV` at its call site,
      exposing `isWeakDmd` from `GHC.Types.Demand` instead.
      The NoFib summary supports this decision:
                  Min           0.0%     -0.0%
                  Max           0.0%     +0.0%
       Geometric Mean          -0.0%     -0.0%
  9. 10 Oct, 2020 1 commit
  10. 17 Sep, 2020 1 commit
    • Simon Peyton Jones's avatar
      Do absence analysis on stable unfoldings · 7cf09ab0
      Simon Peyton Jones authored
      Ticket #18638 showed that Very Bad Things happen if we fail
      to do absence analysis on stable unfoldings.  It's all described
      in Note [Absence analysis for stable unfoldings and RULES].
      I'm a bit surprised this hasn't bitten us before. Fortunately
      the fix is pretty simple.
  11. 12 Aug, 2020 1 commit
    • Sylvain Henry's avatar
      DynFlags: disentangle Outputable · accbc242
      Sylvain Henry authored
      - put panic related functions into GHC.Utils.Panic
      - put trace related functions using DynFlags in GHC.Driver.Ppr
      One step closer making Outputable fully independent of DynFlags.
      Bump haddock submodule
  12. 28 Jul, 2020 1 commit
    • Simon Peyton Jones's avatar
      This patch addresses the exponential blow-up in the simplifier. · 0bd60059
      Simon Peyton Jones authored
        #13253 exponential inlining
        #10421 ditto
        #18140 strict constructors
        #18282 another nested-function call case
      This patch makes one really significant changes: change the way that
      mkDupableCont handles StrictArg.  The details are explained in
      GHC.Core.Opt.Simplify Note [Duplicating StrictArg].
      Specific changes
      * In mkDupableCont, when making auxiliary bindings for the other arguments
        of a call, add extra plumbing so that we don't forget the demand on them.
        Otherwise we haev to wait for another round of strictness analysis. But
        actually all the info is to hand.  This change affects:
        - Make the strictness list in ArgInfo be [Demand] instead of [Bool],
          and rename it to ai_dmds.
        - Add as_dmd to ValArg
        - Simplify.makeTrivial takes a Demand
        - mkDupableContWithDmds takes a [Demand]
      There are a number of other small changes
      1. For Ids that are used at most once in each branch of a case, make
         the occurrence analyser record the total number of syntactic
         occurrences.  Previously we recorded just OneBranch or
         I thought this was going to be useful, but I ended up barely
         using it; see Note [Note [Suppress exponential blowup] in
         Actual changes:
           * See the occ_n_br field of OneOcc.
           * postInlineUnconditionally
      2. I found a small perf buglet in SetLevels; see the new
         function GHC.Core.Opt.SetLevels.hasFreeJoin
      3. Remove the sc_cci field of StrictArg.  I found I could get
         its information from the sc_fun field instead.  Less to get
      4. In ArgInfo, arrange that ai_dmds and ai_discs have a simpler
         invariant: they line up with the value arguments beyond ai_args
         This allowed a bit of nice refactoring; see isStrictArgInfo,
         lazyArgcontext, strictArgContext
      There is virtually no difference in nofib. (The runtime numbers
      are bogus -- I tried a few manually.)
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
                  fft          +0.0%     -2.0%    -48.3%    -49.4%      0.0%
           multiplier          +0.0%     -2.2%    -50.3%    -50.9%      0.0%
                  Min          -0.4%     -2.2%    -59.2%    -60.4%      0.0%
                  Max          +0.0%     +0.1%     +3.3%     +4.9%      0.0%
       Geometric Mean          +0.0%     -0.0%    -33.2%    -34.3%     -0.0%
      Test T18282 is an existing example of these deeply-nested strict calls.
      We get a big decrease in compile time (-85%) because so much less
      inlining takes place.
      Metric Decrease:
  13. 17 Jun, 2020 1 commit
    • Krzysztof Gogolewski's avatar
      Linear types (#15981) · 40fa237e
      Krzysztof Gogolewski authored
      This is the first step towards implementation of the linear types proposal
      It features
      * A language extension -XLinearTypes
      * Syntax for linear functions in the surface language
      * Linearity checking in Core Lint, enabled with -dlinear-core-lint
      * Core-to-core passes are mostly compatible with linearity
      * Fields in a data type can be linear or unrestricted; linear fields
        have multiplicity-polymorphic constructors.
        If -XLinearTypes is disabled, the GADT syntax defaults to linear fields
      The following items are not yet supported:
      * a # m -> b syntax (only prefix FUN is supported for now)
      * Full multiplicity inference (multiplicities are really only checked)
      * Decent linearity error messages
      * Linear let, where, and case expressions in the surface language
        (each of these currently introduce the unrestricted variant)
      * Multiplicity-parametric fields
      * Syntax for annotating lambda-bound or let-bound with a multiplicity
      * Syntax for non-linear/multiple-field-multiplicity records
      * Linear projections for records with a single linear field
      * Linear pattern synonyms
      * Multiplicity coercions (test LinearPolyType)
      A high-level description can be found at
      Following the link above you will find a description of the changes made to Core.
      This commit has been authored by
      * Richard Eisenberg
      * Krzysztof Gogolewski
      * Matthew Pickering
      * Arnaud Spiwack
      With contributions from:
      * Mark Barbone
      * Alexander Vershilov
      Updates haddock submodule.
  14. 13 Jun, 2020 1 commit
    • Simon Peyton Jones's avatar
      Trim the demand for recursive product types · 42953902
      Simon Peyton Jones authored
      Ticket #18304 showed that we need to be very careful
      when exploring the demand (esp usage demand) on recursive
      product types.
      This patch solves the problem by trimming the demand on such types --
      in effect, a form of "widening".
      See the Note [Trimming a demand to a type] in DmdAnal, which explains
      how I did this by piggy-backing on an existing mechansim for trimming
      demands becuase of GADTs.  The significant payload of this patch is
      very small indeed:
      * Make GHC.Core.Opt.WorkWrap.Utils.typeShape use RecTcChecker to
        avoid looking through recursive types.
      But on the way
      * I found that ae_rec_tc was entirely inoperative and did nothing.
        So I removed it altogether from DmdAnal.
      * I moved some code around in DmdAnal and Demand.
        (There are no actual changes in dmdFix.)
      * I changed the API of DmsAnal.dmdAnalRhsLetDown to return
        a StrictSig rather than a decorated Id
      * I removed the dead function peelTsFuns from Demand
      Performance effects:
      Nofib: 0.0% changes.  Not surprising, because they don't
             use recursive products
      Perf tests
        1% increase in compiler allocation, becuase $cto gets w/w'd.
        It did not w/w before because it takes a deeply nested
        argument, so the worker gets too many args, so we abandon w/w
        altogether (see GHC.Core.Opt.WorkWrap.Utils.isWorkerSmallEnough)
        With this patch we trim the demands.  That is not strictly
        necessary (since these Generic type constructors are like
        tuples -- they can't cause a loop) but the net result is that
        we now w/w $cto which is fine.
        16% decrease in /runtime/ allocation. The UniqSupply is a
        recursive product, so currently we abandon all strictness on
        'churn'.  With this patch 'churn' gets useful strictness, and
        we w/w it.  Hooray
      Metric Decrease:
      Metric Increase:
  15. 10 Jun, 2020 1 commit
  16. 28 May, 2020 1 commit
    • Sebastian Graf's avatar
      DmdAnal: Recognise precise exceptions from case alternatives (#18086) · 08dab5f7
      Sebastian Graf authored
      m :: IO ()
      m = do
        putStrLn "foo"
        error "bar"
      `m` (from #18086) always throws a (precise or imprecise) exception or
      diverges. Yet demand analysis infers `<L,A>` as demand signature instead
      of `<L,A>x` for it.
      That's because the demand analyser sees `putStrLn` occuring in a case
      scrutinee and decides that it has to `deferAfterPreciseException`,
      because `putStrLn` throws a precise exception on some control flow
      paths. This will mask the `botDiv` `Divergence`of the single case alt
      containing `error` to `topDiv`. Since `putStrLn` has `topDiv` itself,
      the final `Divergence` is `topDiv`.
      This is easily fixed: `deferAfterPreciseException` works by `lub`ing
      with the demand type of a virtual case branch denoting the precise
      exceptional control flow. We used `nopDmdType` before, but we can be
      more precise and use `exnDmdType`, which is `nopDmdType` with `exnDiv`.
      Now the `Divergence` from the case alt will degrade `botDiv` to `exnDiv`
      instead of `topDiv`, which combines with the result from the scrutinee
      to `exnDiv`, and all is well.
      Fixes #18086.
  17. 15 May, 2020 1 commit
    • Sebastian Graf's avatar
      DmdAnal: Improve handling of precise exceptions · 9bd20e83
      Sebastian Graf authored
      This patch does two things: Fix possible unsoundness in what was called
      the "IO hack" and implement part 2.1 of the "fixing precise exceptions"
      plan in
      which, in combination with !2956, supersedes !3014 and !2525.
      **IO hack**
      The "IO hack" (which is a fallback to preserve precise exceptions
      semantics and thus soundness, rather than some smart thing that
      increases precision) is called `exprMayThrowPreciseException` now.
      I came up with two testcases exemplifying possible unsoundness (if
      twisted enough) in the old approach:
      - `T13380d`: Demonstrating unsoundness of the "IO hack" when resorting
                   to manual state token threading and direct use of primops.
                   More details below.
      - `T13380e`: Demonstrating unsoundness of the "IO hack" when we have
                   Nested CPR. Not currently relevant, as we don't have Nested
                   CPR yet.
      - `T13380f`: Demonstrating unsound...
  18. 14 May, 2020 1 commit
    • Simon Jakobi's avatar
      Improve some folds over Uniq[D]FM · c05c0659
      Simon Jakobi authored
      * Replace some non-deterministic lazy folds with
        strict folds.
      * Replace some O(n log n) folds in deterministic order
        with O(n) non-deterministic folds.
      * Replace some folds with set-operations on the underlying
      This reduces max residency when compiling
      `nofib/spectral/simple/Main.hs` with -O0 by about 1%.
      Maximum residency when compiling Cabal also seems reduced on the
      order of 3-9%.
  19. 26 Apr, 2020 1 commit
  20. 18 Apr, 2020 1 commit
  21. 10 Apr, 2020 1 commit
    • Sebastian Graf's avatar
      DmdAnal: No need to attach a StrictSig to DataCon workers · f5212dfc
      Sebastian Graf authored
      In GHC.Types.Id.Make we were giving a strictness signature to every data
      constructor wrapper Id that we weren't looking at in demand analysis
      anyway. We used to use its CPR info, but that has its own CPR signature
      `Note [Data-con worker strictness]` then felt very out of place, so I
      moved it to GHC.Core.DataCon.
  22. 02 Apr, 2020 1 commit
  23. 29 Mar, 2020 1 commit
  24. 25 Mar, 2020 1 commit
  25. 18 Mar, 2020 1 commit
  26. 17 Mar, 2020 1 commit
  27. 22 Feb, 2020 1 commit
  28. 12 Feb, 2020 1 commit
    • Sebastian Graf's avatar
      Separate CPR analysis from the Demand analyser · 059c3c9d
      Sebastian Graf authored
      The reasons for that can be found in the wiki:
      We now run CPR after demand analysis (except for after the final demand
      analysis run just before code gen). CPR got its own dump flags
      (`-ddump-cpr-anal`, `-ddump-cpr-signatures`), but not its own flag to
      activate/deactivate. It will run with `-fstrictness`/`-fworker-wrapper`.
      As explained on the wiki page, this step is necessary for a sane Nested
      CPR analysis. And it has quite positive impact on compiler performance:
      Metric Decrease:
  29. 27 Jan, 2020 1 commit
  30. 25 Jan, 2020 1 commit
    • Sebastian Graf's avatar
      PmCheck: Formulate as translation between Clause Trees · 8038cbd9
      Sebastian Graf authored
      We used to check `GrdVec`s arising from multiple clauses and guards in
      isolation. That resulted in a split between `pmCheck` and
      `pmCheckGuards`, the implementations of which were similar, but subtly
      different in detail. Also the throttling mechanism described in
      `Note [Countering exponential blowup]` ultimately got quite complicated
      because it had to cater for both checking functions.
      This patch realises that pattern match checking doesn't just consider
      single guarded RHSs, but that it's always a whole set of clauses, each
      of which can have multiple guarded RHSs in turn. We do so by
      translating a list of `Match`es to a `GrdTree`:
      data GrdTree
        = Rhs !RhsInfo
        | Guard !PmGrd !GrdTree      -- captures lef-to-right  match semantics
        | Sequence !GrdTree !GrdTree -- captures top-to-bottom match semantics
        | Empty                      -- For -XEmptyCase, neutral element of Sequence
      Then we have a function `checkGrdTree` that matches a given `GrdTree`
      against an incoming set of values, represented by `Deltas`:
      checkGrdTree :: GrdTree -> Deltas -> CheckResult
      Throttling is isolated to the `Sequence` case and becomes as easy as one
      would expect: When the union of uncovered values becomes too big, just
      return the original incoming `Deltas` instead (which is always a
      superset of the union, thus a sound approximation).
      The returned `CheckResult` contains two things:
      1. The set of values that were not covered by any of the clauses, for
         exhaustivity warnings.
      2. The `AnnotatedTree` that enriches the syntactic structure of the
         input program with divergence and inaccessibility information.
      This is `AnnotatedTree`:
      data AnnotatedTree
        = AccessibleRhs !RhsInfo
        | InaccessibleRhs !RhsInfo
        | MayDiverge !AnnotatedTree
        | SequenceAnn !AnnotatedTree !AnnotatedTree
        | EmptyAnn
      Crucially, `MayDiverge` asserts that the tree may force diverging
      values, so not all of its wrapped clauses can be redundant.
      While the set of uncovered values can be used to generate the missing
      equations for warning messages, redundant and proper inaccessible
      equations can be extracted from `AnnotatedTree` by
      For this to work properly, the interface to the Oracle had to change.
      There's only `addPmCts` now, which takes a bag of `PmCt`s. There's a
      whole bunch of `PmCt` variants to replace the different oracle functions
      from before.
      The new `AnnotatedTree` structure allows for more accurate warning
      reporting (as evidenced by a number of changes spread throughout GHC's
      code base), thus we fix #17465.
      Fixes #17646 on the go.
      Metric Decrease:
  31. 13 Jan, 2020 1 commit
  32. 28 Nov, 2019 1 commit
  33. 01 May, 2019 1 commit
    • Sebastian Graf's avatar
      Compute demand signatures assuming idArity · 014ed644
      Sebastian Graf authored
      This does four things:
      1. Look at `idArity` instead of manifest lambdas to decide whether to use LetUp
      2. Compute the strictness signature in LetDown assuming at least `idArity`
         incoming arguments
      3. Remove the special case for trivial RHSs, which is subsumed by 2
      4. Don't perform the W/W split when doing so would eta expand a binding.
         Otherwise we would eta expand PAPs, causing unnecessary churn in the
      NoFib Results
              Program         Allocs    Instrs
       fannkuch-redux          +0.3%      0.0%
                   gg          -0.0%     -0.1%
             maillist          +0.2%     +0.2%
              minimax           0.0%     +0.8%
               pretty           0.0%     -0.1%
              reptile          -0.0%     -1.2%
                  Min          -0.0%     -1.2%
                  Max          +0.3%     +0.8%
       Geometric Mean          +0.0%     -0.0%
  34. 15 Mar, 2019 1 commit
  35. 02 Feb, 2019 1 commit
  36. 01 Feb, 2019 1 commit
  37. 12 Dec, 2018 1 commit
    • Simon Peyton Jones's avatar
      Improvements to demand analysis · d77501cd
      Simon Peyton Jones authored
      This patch collects a few improvements triggered by Trac #15696,
      and fixing Trac #16029
      * Stop making toCleanDmd behave specially for unlifted types.
        This special case was the cause of stupid behaviour in Trac
        #16029.  And to my joy I discovered the let/app invariant
        rendered it unnecessary.  (Maybe the special case pre-dated
        the let/app invariant.)
        Result: less special-case handling in the compiler, and
        better perf for the compiled code.
      * In WwLib.mkWWstr_one, treat seqDmd like U(AAA).  It was not
        being so treated before, which again led to stupid code.
      * Update and improve Notes
      There are .stderr test wibbles because we get slightly different
      strictness signatures for an argumment of unlifted type:
          <L,U> rather than <S,U>        for Int#
          <S,U> rather than <S(S),U(U)>  for Int
  38. 23 Nov, 2018 1 commit
    • Sebastian Graf's avatar
      Implement late lambda lift · b2950e03
      Sebastian Graf authored
      This implements a selective lambda-lifting pass late in the STG
      Lambda lifting has the effect of avoiding closure allocation at the cost
      of having to make former free vars available at call sites, possibly
      enlarging closures surrounding call sites in turn.
      We identify beneficial cases by means of an analysis that estimates
      closure growth.
      There's a Wiki page at
      Reviewers: simonpj, bgamari, simonmar
      Reviewed By: simonpj
      Subscribers: rwbarton, carter
      GHC Trac Issues: #9476
      Differential Revision: https://phabricator.haskell.org/D5224
  39. 15 Oct, 2018 1 commit