  1. Mar 01, 2024
    • Introduce ListTuplePuns extension · d91d00fc
      Torsten Schmits authored and Marge Bot committed
      This implements Proposal 0475, introducing the `ListTuplePuns` extension
      which is enabled by default.
      
      Disabling this extension makes it invalid to refer to list, tuple and
      sum type constructors by using built-in syntax like `[Int]`,
      `(Int, Int)`, `(# Int#, Int# #)` or `(# Int | Int #)`.
      Instead, this syntax exclusively denotes data constructors for use with
      `DataKinds`.
      The conventional way of referring to these data constructors by
      prefixing them with a single quote (`'(Int, Int)`) is now a parser
      error.
      
      Tuple declarations have been moved to `GHC.Tuple.Prim` and the `Solo`
      data constructor has been renamed to `MkSolo` (in a previous commit).
      Unboxed tuples and sums now have real source declarations in `GHC.Types`.
      Unit and solo types for tuples are now called `Unit`, `Unit#`, `Solo`
      and `Solo#`.
      Constraint tuples now have the unambiguous type constructors `CTuple<n>`
      as well as `CUnit` and `CSolo`, defined in `GHC.Classes` like before.
      
      A new parser construct has been added for the unboxed sum data
      constructor declarations.
      
      The type families `Tuple`, `Sum#` etc. that were intended to provide
      nicer syntax have been omitted from this change set due to inference
      problems, to be implemented at a later time.
      See the MR discussion for more info.
      
      Updates the submodule utils/haddock.
      Updates the cabal submodule due to new language extension.
      
          Metric Increase:
              haddock.base
      
          Metric Decrease:
              MultiLayerModulesTH_OneShot
              size_hello_artifact
      
      Proposal document: https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0475-tuple-syntax.rst
      
      Merge request: ghc/ghc!8820
      
      Tracking ticket: ghc/ghc#21294
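
The pun described in this commit message can be seen with the default settings (`ListTuplePuns` on). The following is a minimal sketch, not taken from the commit: the names `Length`, `ints` and `promotedLength` are made up for illustration. In a type position `[Int]` is the list type, while the ticked form `'[Int]` is the promoted one-element list used with `DataKinds`. According to the message above, disabling the extension would make the unticked bracket syntax denote the data constructor instead; that mode is not exercised here.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
module Main (main) where

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (Nat, natVal, type (+))

-- | Type-level length of a promoted list of types (illustrative helper).
type family Length (xs :: [Type]) :: Nat where
  Length '[]       = 0
  Length (x ': xs) = 1 + Length xs

-- Here `[Int]` is the punned list *type*: a list of Int values.
ints :: [Int]
ints = [1, 2, 3]

-- Here `'[Int]` is the promoted list *data constructor* syntax:
-- a type-level list whose single element is the type Int.
promotedLength :: Integer
promotedLength = natVal (Proxy :: Proxy (Length '[Int]))

main :: IO ()
main = do
  print (length ints)   -- term-level list with three elements
  print promotedLength  -- type-level list with one element
```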
  2. Feb 25, 2024
  3. Jan 20, 2024
  4. Dec 06, 2023
  5. Oct 28, 2023
  6. Sep 06, 2023
  7. Feb 14, 2023
  8. Jan 11, 2023
    • Misc cleanup · 083f7015
      Krzysztof Gogolewski authored and Marge Bot committed
      - Remove unused mkWildEvBinder
      - Use typeTypeOrConstraint - more symmetric and asserts that
        the type is Type or Constraint
      - Fix escape sequences in Python; they raise a deprecation warning
        with -Wdefault
  9. Oct 18, 2022
    • Fix GHCi's interaction with tag inference. · 0ac60423
      Andreas Klebinger authored and Marge Bot committed
      I had assumed that wrappers were not inlined in interactive mode,
      meaning we would always execute the compiled wrapper, which properly
      takes care of upholding the strict field invariant.
      This turned out to be wrong. So instead we now run tag inference even
      when we generate bytecode. In that case it is run for correctness
      rather than performance, although it will still benefit runtime
      in some cases.
      
      I further fixed a bug where GHCi didn't tag nullary constructors
      properly when used as arguments, which caused segfaults when calling
      into compiled functions that expect the strict field invariant to
      be upheld.
      
      Fixes #22042 and #21083
      
      -------------------------
      Metric Increase:
          T4801
      
      Metric Decrease:
          T13035
      -------------------------
  10. Sep 21, 2022
    • Don't use isUnliftedType in isTagged · 06ccad0d
      sheaf authored and Marge Bot committed
      The function GHC.Stg.InferTags.Rewrite.isTagged can be given
      the Id of a join point, which might be representation polymorphic.
      This would cause the call to isUnliftedType to crash. It's better
      to use typeLevity_maybe instead.
      
      Fixes #22212
  11. Sep 15, 2022
  12. May 09, 2022
  13. Feb 12, 2022
    • Tag inference work. · 0e93023e
      Andreas Klebinger authored and Matthew Pickering committed
      This does three major things:
      * Enforce the invariant that all strict fields must contain tagged
      pointers.
      * Try to predict the tag on bindings in order to omit tag checks.
      * Allow functions to pass arguments unlifted (call-by-value).
      
      The first is "simply" achieved by wrapping any constructor allocations with
      a case which will evaluate the respective strict bindings.
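
The strict field invariant can be observed from source Haskell. The sketch below is an illustration only, not the STG-level rewrite itself: `Lazy`, `Strict`, `mkStrictByHand` and `forcesField` are hypothetical names. `mkStrictByHand` mimics by hand the idea of wrapping the allocation in a case that evaluates the field first.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Control.Exception (ErrorCall (..), evaluate, try)

-- A constructor with a lazy field and one with a strict field.
data Lazy = MkLazy Int
data Strict = MkStrict !Int

-- | Hand-written analogue of "wrap the allocation in a case":
-- the field is evaluated before the constructor is allocated.
mkStrictByHand :: Int -> Lazy
mkStrictByHand x = case x of
  !x' -> MkLazy x'

-- | True if evaluating the value to WHNF hits the `error` stored in a field.
forcesField :: a -> IO Bool
forcesField v = do
  r <- try (evaluate v)
  pure $ case r of
    Left (ErrorCall _) -> True
    Right _            -> False

main :: IO ()
main = do
  forcesField (MkLazy (error "boom")) >>= print          -- False: field stays a thunk
  forcesField (MkStrict (error "boom")) >>= print        -- True: strict field is evaluated
  forcesField (mkStrictByHand (error "boom")) >>= print  -- True: the case forces it
```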
      
      The prediction is done by a new data flow analysis based on the STG
      representation of a program. This also helps us to avoid generating
      redundant cases for the above invariant.
      
      StrictWorkers are created by W/W directly and SpecConstr indirectly.
      See the Note [Strict Worker Ids]
      
      Other minor changes:
      
      * Add StgUtil module containing a few functions needed by, but
        not specific to the tag analysis.
      
      -------------------------
      Metric Decrease:
      	T12545
      	T18698b
      	T18140
      	T18923
              LargeRecord
      Metric Increase:
              LargeRecord
      	ManyAlternatives
      	ManyConstructors
      	T10421
      	T12425
      	T12707
      	T13035
      	T13056
      	T13253
      	T13253-spj
      	T13379
      	T15164
      	T18282
      	T18304
      	T18698a
      	T1969
      	T20049
      	T3294
      	T4801
      	T5321FD
      	T5321Fun
      	T783
      	T9233
      	T9675
      	T9961
      	T19695
      	WWRec
      -------------------------
  14. Sep 30, 2021
    • Nested CPR light unleashed (#18174) · c261f220
      Sebastian Graf authored and Marge Bot committed
      This patch enables worker/wrapper for nested constructed products, as described
      in `Note [Nested CPR]`. The machinery for expressing Nested CPR was already
      there, since !5054. Worker/wrapper is equipped to exploit Nested CPR annotations
      since !5338. CPR analysis already handles applications in batches since !5753.
      This patch just needs to flip a few more switches:
      
      1. In `cprTransformDataConWork`, we need to look at the field expressions
         and their `CprType`s to see whether the evaluation of the expressions
         terminates quickly (= is in HNF) or if they are put in strict fields.
         If that is the case, then we retain their CPR info and may unbox nestedly
         later on. More details in `Note [Nested CPR]`.
      2. Enable nested `ConCPR` signatures in `GHC.Types.Cpr`.
      3. In the `asConCpr` call in `GHC.Core.Opt.WorkWrap.Utils`, pass CPR info of
         fields to the `Unbox`.
      4. Instead of giving CPR signatures to DataCon workers and wrappers, we now have
         `cprTransformDataConWork` for workers and treat wrappers by analysing their
         unfolding. As a result, the code from GHC.Types.Id.Make went away completely.
      5. I deactivated worker/wrappering for recursive DataCons and wrote a function
         `isRecDataCon` to detect them. We really don't want to give `repeat` or
         `replicate` the Nested CPR property.
         See Note [CPR for recursive data structures] for which kind of recursive
         DataCons we target.
      6. Fix a couple of tests and their outputs.
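
To make the idea of nested constructed products concrete, here is a hand-written sketch of the kind of split the optimisation aims for. It is an assumption-laden illustration, not GHC's output: `sumProd`, `wSumProd` and `sumProd'` are made-up names. The result is a freshly built pair whose components are freshly built `Int`s, so both layers of boxing can be returned unboxed by a worker. The unboxing of the arguments shown here comes from strictness, not from CPR.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
module Main (main) where

import GHC.Exts (Int (I#), Int#, (*#), (+#))

-- | Source function: allocates a pair and two boxed Ints for its result.
sumProd :: Int -> Int -> (Int, Int)
sumProd x y = (x + y, x * y)

-- | Hand-written "worker": returns both results unboxed, without
-- allocating the pair or the Int boxes.
wSumProd :: Int# -> Int# -> (# Int#, Int# #)
wSumProd x y = (# x +# y, x *# y #)

-- | Hand-written "wrapper": unboxes the arguments and reboxes the result.
sumProd' :: Int -> Int -> (Int, Int)
sumProd' (I# x) (I# y) =
  case wSumProd x y of
    (# s, p #) -> (I# s, I# p)

main :: IO ()
main = print (sumProd 3 4, sumProd' 3 4)
```

When the wrapper is inlined at a call site that immediately takes the pair apart, the reboxing cancels out and only the worker call remains.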
      
      I also documented that CPR can destroy sharing and lead to asymptotic increase
      in allocations (which is tracked by #13331/#19326) in
      `Note [CPR for data structures can destroy sharing]`.
      
      Nofib results:
      ```
      --------------------------------------------------------------------------------
              Program         Allocs    Instrs
      --------------------------------------------------------------------------------
         ben-raytrace          -3.1%     -0.4%
         binary-trees          +0.8%     -2.9%
         digits-of-e2          +5.8%     +1.2%
                event          +0.8%     -2.1%
       fannkuch-redux          +0.0%     -1.4%
                 fish           0.0%     -1.5%
               gamteb          -1.4%     -0.3%
              mkhprog          +1.4%     +0.8%
           multiplier          +0.0%     -1.9%
                  pic          -0.6%     -0.1%
              reptile         -20.9%    -17.8%
            wave4main          +4.8%     +0.4%
                 x2n1        -100.0%     -7.6%
      --------------------------------------------------------------------------------
                  Min         -95.0%    -17.8%
                  Max          +5.8%     +1.2%
       Geometric Mean          -2.9%     -0.4%
      ```
      The huge wins in x2n1 (loopy list) and reptile (see #19970) are due to
      refraining from unboxing (:). Other benchmarks like digits-of-e2 or wave4main
      regress because of that. Ultimately there are no great improvements due to
      Nested CPR alone, but at least it's a win.
      Binary sizes decrease by 0.6%.
      
      There are a significant number of metric decreases. The most notable ones (>1%):
      ```
             ManyAlternatives(normal) ghc/alloc   771656002.7   762187472.0  -1.2%
             ManyConstructors(normal) ghc/alloc  4191073418.7  4114369216.0  -1.8%
            MultiLayerModules(normal) ghc/alloc  3095678333.3  3128720704.0  +1.1%
                    PmSeriesG(normal) ghc/alloc    50096429.3    51495664.0  +2.8%
                    PmSeriesS(normal) ghc/alloc    63512989.3    64681600.0  +1.8%
                    PmSeriesV(normal) ghc/alloc    62575424.0    63767208.0  +1.9%
                       T10547(normal) ghc/alloc    29347469.3    29944240.0  +2.0%
                      T11303b(normal) ghc/alloc    46018752.0    47367576.0  +2.9%
                       T12150(optasm) ghc/alloc    81660890.7    82547696.0  +1.1%
                       T12234(optasm) ghc/alloc    59451253.3    60357952.0  +1.5%
                       T12545(normal) ghc/alloc  1705216250.7  1751278952.0  +2.7%
                       T12707(normal) ghc/alloc   981000472.0   968489800.0  -1.3% GOOD
                       T13056(optasm) ghc/alloc   389322664.0   372495160.0  -4.3% GOOD
                       T13253(normal) ghc/alloc   337174229.3   341954576.0  +1.4%
                       T13701(normal) ghc/alloc  2381455173.3  2439790328.0  +2.4%  BAD
                         T14052(ghci) ghc/alloc  2162530642.7  2139108784.0  -1.1%
                       T14683(normal) ghc/alloc  3049744728.0  2977535064.0  -2.4% GOOD
                       T14697(normal) ghc/alloc   362980213.3   369304512.0  +1.7%
                       T15164(normal) ghc/alloc  1323102752.0  1307480600.0  -1.2%
                       T15304(normal) ghc/alloc  1304607429.3  1291024568.0  -1.0%
                       T16190(normal) ghc/alloc   281450410.7   284878048.0  +1.2%
                       T16577(normal) ghc/alloc  7984960789.3  7811668768.0  -2.2% GOOD
                       T17516(normal) ghc/alloc  1171051192.0  1153649664.0  -1.5%
                       T17836(normal) ghc/alloc  1115569746.7  1098197592.0  -1.6%
                      T17836b(normal) ghc/alloc    54322597.3    55518216.0  +2.2%
                       T17977(normal) ghc/alloc    47071754.7    48403408.0  +2.8%
                      T17977b(normal) ghc/alloc    42579133.3    43977392.0  +3.3%
                       T18923(normal) ghc/alloc    71764237.3    72566240.0  +1.1%
                        T1969(normal) ghc/alloc   784821002.7   773971776.0  -1.4% GOOD
                        T3294(normal) ghc/alloc  1634913973.3  1614323584.0  -1.3% GOOD
                        T4801(normal) ghc/alloc   295619648.0   292776440.0  -1.0%
                      T5321FD(normal) ghc/alloc   278827858.7   276067280.0  -1.0%
                        T5631(normal) ghc/alloc   586618202.7   577579960.0  -1.5%
                        T5642(normal) ghc/alloc   494923048.0   487927208.0  -1.4%
                        T5837(normal) ghc/alloc    37758061.3    39261608.0  +4.0%
                        T9020(optasm) ghc/alloc   257362077.3   254672416.0  -1.0%
                        T9198(normal) ghc/alloc    49313365.3    50603936.0  +2.6%  BAD
                        T9233(normal) ghc/alloc   704944258.7   685692712.0  -2.7% GOOD
                        T9630(normal) ghc/alloc  1476621560.0  1455192784.0  -1.5%
                        T9675(optasm) ghc/alloc   443183173.3   433859696.0  -2.1% GOOD
                       T9872a(normal) ghc/alloc  1720926653.3  1693190072.0  -1.6% GOOD
                       T9872b(normal) ghc/alloc  2185618061.3  2162277568.0  -1.1% GOOD
                       T9872c(normal) ghc/alloc  1765842405.3  1733618088.0  -1.8% GOOD
         TcPlugin_RewritePerf(normal) ghc/alloc  2388882730.7  2365504696.0  -1.0%
                        WWRec(normal) ghc/alloc   607073186.7   597512216.0  -1.6%
      
                        T9203(normal) run/alloc   107284064.0   102881832.0  -4.1%
                haddock.Cabal(normal) run/alloc 24025329589.3 23768382560.0  -1.1%
                 haddock.base(normal) run/alloc 25660521653.3 25370321824.0  -1.1%
             haddock.compiler(normal) run/alloc 74064171706.7 73358712280.0  -1.0%
      ```
      The biggest exception to the rule is T13701 which seems to fluctuate as usual
      (not unlike T12545). T14697 has a similar quality, being a generated
      multi-module test. T5837 is small enough that it similarly doesn't measure
      anything significant besides module loading overhead.
      T13253 simply does one additional round of Simplification due to Nested CPR.
      
      There are also some apparent regressions in T9198, T12234 and PmSeriesG that we
      (@mpickering and I) were simply unable to reproduce locally. @mpickering tried
      to run the CI script in a local Docker container and actually found that T9198
      and PmSeriesG *improved*. In MRs that were rebased on top of this one, like !4229,
      I did not experience such increases. Let's not get hung up on these regression
      tests, they were meant to test for asymptotic regressions.
      
      The build-cabal test improves by 1.2% in -O0.
      
      Metric Increase:
          T10421
          T12234
          T12545
          T13035
          T13056
          T13701
          T14697
          T18923
          T5837
          T9198
      Metric Decrease:
          ManyConstructors
          T12545
          T12707
          T13056
          T14683
          T16577
          T18223
          T1969
          T3294
          T9203
          T9233
          T9675
          T9872a
          T9872b
          T9872c
          T9961
          TcPlugin_RewritePerf
  15. Apr 17, 2021
    • Improve CSE in STG-land · 7bd12940
      Simon Peyton Jones authored
      This patch fixes #19717, a long-standing bug in CSE for STG, which
      led to a stupid loss of CSE in some situations.
      
      It's explained in Note [Trivial case scrutinee], which I have
      substantially extended.
  16. Apr 30, 2020
  17. Mar 08, 2019
    • Always do the worker/wrapper split for NOINLINEs · 1675d40a
      Sebastian Graf authored and Marge Bot committed
      Trac #10069 revealed that small NOINLINE functions didn't get split
      into worker and wrapper. This was due to `certainlyWillInline`
      saying that any unfoldings with a guidance of `UnfWhen` inline
      unconditionally. That isn't the case for NOINLINE functions, so we
      catch this case earlier now.
      
      Nofib results:
      
      --------------------------------------------------------------------------------
              Program         Allocs    Instrs
      --------------------------------------------------------------------------------
       fannkuch-redux          -0.3%      0.0%
                   gg          +0.0%     +0.1%
             maillist          -0.2%     -0.2%
              minimax           0.0%     -0.8%
      --------------------------------------------------------------------------------
                  Min          -0.3%     -0.8%
                  Max          +0.0%     +0.1%
       Geometric Mean          -0.0%     -0.0%
      
      Fixes #10069.
      
      -------------------------
      Metric Increase:
          T9233
      -------------------------
  18. Nov 07, 2018
    • testsuite: Save performance metrics in git notes. · 932cd41d
      davide authored
      This patch makes the following improvements:
        - Automatically records test metrics (per test environment) so that
          the programmer need not supply nor update expected values in *.T
          files.
          - On expected metric changes, the programmer need only indicate the
            direction of change in the git commit message.
        - Provides a simple python tool "perf_notes.py" to compare metrics
          over time.
      
      Issues:
        - Using just the previous commit allows performance to drift with each
          commit.
          - Currently we allow drift as we have a preference for minimizing
            false positives.
          - Some possible alternatives include:
            - Use metrics from a fixed commit per test: the last commit that
              allowed a change in performance (else the oldest metric)
            - Or use some sort of aggregate since the last commit that allowed
              a change in performance (else all available metrics)
            - These alternatives may result in a performance issue (with the
              test driver) having to heavily search git commits/notes.
        - Run locally, performance tests will trivially pass unless the tests
          were run locally on the previous commit. This is often not the case
          e.g.  after pulling recent changes.
      
      Previously, *.T files contained statements such as:
      ```
      stats_num_field('peak_megabytes_allocated', (2, 1))
      compiler_stats_num_field('bytes allocated',
                               [(wordsize(64), 165890392, 10)])
      ```
      This required the programmer to give the expected values and a tolerance
      deviation (percentage). With this patch, the above statements are
      replaced with:
      ```
      collect_stats('peak_megabytes_allocated', 5)
      collect_compiler_stats('bytes allocated', 10)
      ```
      The programmer now only enters which metrics to test and a tolerance
      deviation. No expected value is required. CircleCI will then run the
      tests per test environment and record the metrics to a git note for that
      commit and push them to the git.haskell.org ghc repo. Metrics will be
      compared to the previous commit. If they are different by the tolerance
      deviation from the *.T file, then the corresponding test will fail. By
      adding to the git commit message e.g.
      ```
       # Metric (In|De)crease <metric(s)> <options>: <tests>
      Metric Increase ['bytes allocated', 'peak_megabytes_allocated'] \
               (test_env='linux_x86', way='default'):
          Test012, Test345
      Metric Decrease 'bytes allocated':
          Test678
      Metric Increase:
          Test711
      ```
      This will allow the noted changes (letting the test pass). Note that by
      omitting metrics or options, the change will apply to all possible
      metrics/options (i.e. in the above, an increase for all metrics in all
      test environments is allowed for Test711)
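
As a rough illustration of how such annotations could be read back out of a commit message, here is a small sketch. It is not the test driver's parser (that lives in the Python testsuite code) and it handles only the simplest form, a bare `Metric Increase:` or `Metric Decrease:` header followed by test names. Metric names and options in the header are ignored, and `Direction`, `parseMetricChanges` and `trim` are hypothetical names.

```haskell
module Main (main) where

import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- | Allowed direction of a metric change.
data Direction = Increase | Decrease
  deriving (Eq, Show)

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = f . f
  where
    f = reverse . dropWhile isSpace

-- | Collect (direction, test name) pairs from a commit message.
-- A header opens a block; a blank line or a line of dashes closes it.
parseMetricChanges :: String -> [(Direction, String)]
parseMetricChanges = go Nothing . map trim . lines
  where
    go _ [] = []
    go cur (l : ls)
      | l == "Metric Increase:"      = go (Just Increase) ls
      | l == "Metric Decrease:"      = go (Just Decrease) ls
      | null l || "-" `isPrefixOf` l = go Nothing ls
      | Just d <- cur                = [(d, t) | t <- names l] ++ go cur ls
      | otherwise                    = go cur ls
    -- Test names may be separated by commas and/or whitespace.
    names = words . map (\c -> if c == ',' then ' ' else c)

-- | Sample message shaped like the commits on this page.
sample :: String
sample = unlines
  [ "Fixes #22042 and #21083"
  , ""
  , "-------------------------"
  , "Metric Increase:"
  , "    T4801"
  , ""
  , "Metric Decrease:"
  , "    T13035"
  , "-------------------------"
  ]

main :: IO ()
main = mapM_ print (parseMetricChanges sample)
```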
      
      phabricator will use the message in the description
      
      Reviewers: bgamari, hvr
      
      Reviewed By: bgamari
      
      Subscribers: rwbarton, carter
      
      GHC Trac Issues: #12758
      
      Differential Revision: https://phabricator.haskell.org/D5059
  19. Jul 27, 2018
    • Run StgCse after unarise, fixes #15300 · 3c311e50
      Ömer Sinan Ağacan authored
      Given two unboxed sum terms:
      
          (# 1 | #) :: (# Int | Int# #)
          (# 1 | #) :: (# Int | Int  #)
      
      These two terms are not equal as they unarise to different unboxed
      tuples. However, StgCse treated them as equal and replaced
      one of them with a binder to the other.
      
      To not deal with unboxed sums in StgCse we now do it after unarise. For
      StgCse to maintain post-unarise invariants we factor-out case binder
      in-scopeness check to `stgCaseBndrInScope` and use it in StgCse.
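
Why two terms that look identical can flatten differently is easier to see with a toy model. The sketch below is a deliberately simplified assumption, not GHC's actual unarise pass: `Rep`, `Slot` and `unarise` are invented names, every alternative is assumed to hold exactly one field, and the slot layout (a tag followed by one slot per distinct representation) is only meant to convey the idea.

```haskell
module Main (main) where

import Data.List (nub, sort)

-- | Simplified runtime representation of a sum alternative's field.
data Rep = PtrRep | WordRep
  deriving (Eq, Ord, Show)

-- | One component of the flattened (unarised) tuple.
data Slot
  = Tag Int           -- which alternative is present (1-based)
  | Field Rep String  -- the payload, stored in a slot of its representation
  | Unused Rep        -- a slot needed by some other alternative
  deriving (Eq, Show)

-- | Flatten alternative @alt@ of a sum whose alternatives have the given
-- representations. Alternatives with the same representation share a slot.
unarise :: [Rep] -> Int -> String -> [Slot]
unarise alts alt payload = Tag alt : map fill (nub (sort alts))
  where
    used = alts !! (alt - 1)
    fill r
      | r == used = Field r payload
      | otherwise = Unused r

main :: IO ()
main = do
  -- (# 1 | #) :: (# Int | Int# #): a pointer slot plus a word slot
  print (unarise [PtrRep, WordRep] 1 "1")
  -- (# 1 | #) :: (# Int | Int #): both alternatives share one pointer slot
  print (unarise [PtrRep, PtrRep] 1 "1")
```

In this model the two terms flatten to tuples of different width, so treating them as the same expression would be wrong even though their source syntax matches.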
      
      Also did some refactoring in SimplStg.
      
      Another way to fix this would be adding a special case in StgCse to not
      bring unboxed sum binders in scope:
      
          diff --git a/compiler/simplStg/StgCse.hs b/compiler/simplStg/StgCse.hs
          index 6c740ca4cb..93a0f8f6ad 100644
          --- a/compiler/simplStg/StgCse.hs
          +++ b/compiler/simplStg/StgCse.hs
          @@ -332,7 +332,11 @@ stgCseExpr env (StgLetNoEscape binds body)
           stgCseAlt :: CseEnv -> OutId -> InStgAlt -> OutStgAlt
           stgCseAlt env case_bndr (DataAlt dataCon, args, rhs)
               = let (env1, args') = substBndrs env args
          -          env2 = addDataCon case_bndr dataCon (map StgVarArg args') env1
          +          env2
          +            | isUnboxedSumCon dataCon
          +            = env1
          +            | otherwise
          +            = addDataCon case_bndr dataCon (map StgVarArg args') env1
                       -- see note [Case 2: CSEing case binders]
                     rhs' = stgCseExpr env2 rhs
                 in (DataAlt dataCon, args', rhs')
      
      I think this patch seems better in that it doesn't add a special case to
      StgCse.
      
      Test Plan:
      Validate.
      
      I tried to come up with a minimal example but failed. I thought a simple
      program like
      
          data T = T (# Int | Int #) (# Int# | Int #)
      
          case T (# 1 | #) (# 1 | #) of ...
      
      should be enough to trigger this bug, but for some reason StgCse
      doesn't do anything on this program.
      
      Reviewers: simonpj, bgamari
      
      Reviewed By: simonpj
      
      Subscribers: rwbarton, thomie, carter
      
      GHC Trac Issues: #15300
      
      Differential Revision: https://phabricator.haskell.org/D4962
  20. Apr 19, 2017
  21. Apr 18, 2017
  22. Apr 11, 2017
  23. Apr 10, 2017
  24. Jan 05, 2017
    • Add a CSE pass to Stg (#9291) · 19d5c731
      Joachim Breitner authored
      This CSE pass only targets data constructor applications. This is
      probably the best we can do, as function calls and primitive operations
      might have side-effects.
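
The restriction to constructor applications can be illustrated on a toy expression language. This is a sketch under simplifying assumptions, not the pass added by the commit: `Expr`, `Env` and `cse` are hypothetical, binder names are assumed to be unique (no shadowing), and constructor arguments are plain variable names.

```haskell
module Main (main) where

-- | A tiny STG-like language: variables, saturated constructor
-- applications to variables, and non-recursive lets.
data Expr
  = Var String
  | ConApp String [String]
  | Let String Expr Expr
  deriving (Eq, Show)

-- | Maps a constructor application seen so far to the binder holding it.
type Env = [((String, [String]), String)]

-- | Replace a repeated constructor application with the earlier binder.
cse :: Env -> Expr -> Expr
cse env (Let b rhs body) =
  case cse env rhs of
    -- A new constructor application: remember that `b` holds it.
    rhs'@(ConApp c args) -> Let b rhs' (cse (((c, args), b) : env) body)
    rhs'                 -> Let b rhs' (cse env body)
cse env e@(ConApp c args) = maybe e Var (lookup (c, args) env)
cse _ e@(Var _) = e

-- | let x = Just a in let y = Just a in Pair x y
example :: Expr
example =
  Let "x" (ConApp "Just" ["a"]) $
    Let "y" (ConApp "Just" ["a"]) $
      ConApp "Pair" ["x", "y"]

main :: IO ()
main = print (cse [] example)
```

On `example` the second `Just a` becomes a reference to `x`, so only one allocation remains. The toy language has no function calls or primitive operations at all, which keeps it within the scope the message describes.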
      
      Introduces the flag -fstg-cse, enabled by default with -O for now. It
      might also be a good candidate for -O2.
      
      Differential Revision: https://phabricator.haskell.org/D2871