1. 22 Feb, 2020 1 commit
  2. 21 Feb, 2020 1 commit
    • Simon Peyton Jones's avatar
      Re-implement unsafe coercions in terms of unsafe equality proofs · 74ad75e8
      Simon Peyton Jones authored
      (Commit message written by Omer, most of the code is written by Simon
      and Richard)
      
      See Note [Implementing unsafeCoerce] for how unsafe equality proofs and
      the new unsafeCoerce# are implemented.
      
      New notes added:
      
      - [Checking for levity polymorphism] in CoreLint.hs
      - [Implementing unsafeCoerce] in base/Unsafe/Coerce.hs
      - [Patching magic definitions] in Desugar.hs
      - [Wiring in unsafeCoerce#] in Desugar.hs
      
      Only breaking change in this patch is unsafeCoerce# is not exported from
      GHC.Exts, instead of GHC.Prim.
      
      Fixes #17443
      Fixes #16893
      
      NoFib
      -----
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs    Instrs     Reads    Writes
      --------------------------------------------------------------------------------
                   CS          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  CSD          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                   FS          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                    S          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                   VS          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  VSD          -0.1%      0.0%     -0.0%     -0.0%     -0.1%
                  VSM          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 anna          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 ansi          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 atom          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               awards          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               banner          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
           bernouilli          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
         binary-trees          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                boyer          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               boyer2          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 bspt          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            cacheprof          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
             calendar          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
             cichelli          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              circsim          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
             clausify          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
        comp_lab_zift          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
             compress          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            compress2          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
          constraints          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
         cryptarithm1          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
         cryptarithm2          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  cse          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
         digits-of-e1          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
         digits-of-e2          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               dom-lt          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                eliza          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                event          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
          exact-reals          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               exp3_8          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               expert          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
       fannkuch-redux          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                fasta          -0.1%      0.0%     -0.5%     -0.3%     -0.4%
                  fem          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  fft          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 fft2          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
             fibheaps          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 fish          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                fluid          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               fulsom          -0.1%      0.0%     +0.0%     +0.0%     +0.0%
               gamteb          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  gcd          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
          gen_regexps          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               genfft          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                   gg          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 grep          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               hidden          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  hpg          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  ida          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                infer          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              integer          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            integrate          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
         k-nucleotide          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                kahan          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              knights          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               lambda          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
           last-piece          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 lcss          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 life          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 lift          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               linear          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            listcompr          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
             listcopy          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
             maillist          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               mandel          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              mandel2          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 mate          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              minimax          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              mkhprog          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
           multiplier          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               n-body          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
             nucleic2          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 para          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            paraffins          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               parser          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              parstof          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  pic          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
             pidigits          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                power          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               pretty          -0.1%      0.0%     -0.1%     -0.1%     -0.1%
               primes          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            primetest          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               prolog          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               puzzle          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               queens          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              reptile          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
      reverse-complem          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              rewrite          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 rfib          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  rsa          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  scc          -0.1%      0.0%     -0.1%     -0.1%     -0.1%
                sched          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  scs          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               simple          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                solid          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              sorting          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
        spectral-norm          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               sphere          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               symalg          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  tak          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            transform          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
             treejoin          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            typecheck          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              veritas          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 wang          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            wave4main          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
         wheel-sieve1          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
         wheel-sieve2          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 x2n1          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
      --------------------------------------------------------------------------------
                  Min          -0.1%      0.0%     -0.5%     -0.3%     -0.4%
                  Max          -0.0%      0.0%     +0.0%     +0.0%     +0.0%
       Geometric Mean          -0.1%     -0.0%     -0.0%     -0.0%     -0.0%
      
      Test changes
      ------------
      
      - break006 is marked as broken, see #17833
      - The compiler allocates less when building T14683 (an unsafeCoerce#-
        heavy happy-generated code) on 64-platforms. Allocates more on 32-bit
        platforms.
      - Rest of the increases are tiny amounts (still enough to pass the
        threshold) in micro-benchmarks. I briefly looked at each one in a
        profiling build: most of the increased allocations seem to be because
        of random changes in the generated code.
      
      Metric Decrease:
          T14683
      
      Metric Increase:
          T12150
          T12234
          T12425
          T13035
          T14683
          T5837
          T6048
      Co-Authored-By: Richard Eisenberg's avatarRichard Eisenberg <rae@cs.brynmawr.edu>
      Co-Authored-By: Ömer Sinan Ağacan's avatarÖmer Sinan Ağacan <omeragacan@gmail.com>
      74ad75e8
  3. 12 Feb, 2020 1 commit
    • Sebastian Graf's avatar
      Separate CPR analysis from the Demand analyser · 059c3c9d
      Sebastian Graf authored
      The reasons for that can be found in the wiki:
      https://gitlab.haskell.org/ghc/ghc/wikis/nested-cpr/split-off-cpr
      
      We now run CPR after demand analysis (except for after the final demand
      analysis run just before code gen). CPR got its own dump flags
      (`-ddump-cpr-anal`, `-ddump-cpr-signatures`), but not its own flag to
      activate/deactivate. It will run with `-fstrictness`/`-fworker-wrapper`.
      
      As explained on the wiki page, this step is necessary for a sane Nested
      CPR analysis. And it has quite positive impact on compiler performance:
      
      Metric Decrease:
          T9233
          T9675
          T9961
          T15263
      059c3c9d
  4. 31 Jan, 2020 1 commit
    • Ömer Sinan Ağacan's avatar
      Do CafInfo/SRT analysis in Cmm · c846618a
      Ömer Sinan Ağacan authored
      This patch removes all CafInfo predictions and various hacks to preserve
      predicted CafInfos from the compiler and assigns final CafInfos to
      interface Ids after code generation. SRT analysis is extended to support
      static data, and Cmm generator is modified to allow generating
      static_link fields after SRT analysis.
      
      This also fixes `-fcatch-bottoms`, which introduces error calls in case
      expressions in CorePrep, which runs *after* CoreTidy (which is where we
      decide on CafInfos) and turns previously non-CAFFY things into CAFFY.
      
      Fixes #17648
      Fixes #9718
      
      Evaluation
      ==========
      
      NoFib
      -----
      
      Boot with: `make boot mode=fast`
      Run: `make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1`
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs    Instrs     Reads    Writes
      --------------------------------------------------------------------------------
                   CS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  CSD          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                   FS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                    S          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                   VS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  VSD          -0.0%      0.0%     -0.0%     -0.0%     -0.5%
                  VSM          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 anna          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 ansi          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 atom          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               awards          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               banner          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           bernouilli          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         binary-trees          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                boyer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               boyer2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 bspt          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            cacheprof          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             calendar          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             cichelli          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              circsim          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             clausify          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        comp_lab_zift          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             compress          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            compress2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          constraints          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         cryptarithm1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         cryptarithm2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  cse          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         digits-of-e1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         digits-of-e2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               dom-lt          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                eliza          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                event          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          exact-reals          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               exp3_8          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               expert          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       fannkuch-redux          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                fasta          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  fem          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  fft          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 fft2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             fibheaps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 fish          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                fluid          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               fulsom          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               gamteb          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  gcd          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          gen_regexps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               genfft          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                   gg          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 grep          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               hidden          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  hpg          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  ida          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                infer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              integer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            integrate          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         k-nucleotide          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                kahan          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              knights          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               lambda          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           last-piece          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 lcss          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 life          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 lift          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               linear          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            listcompr          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             listcopy          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             maillist          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               mandel          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              mandel2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 mate          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              minimax          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              mkhprog          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           multiplier          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               n-body          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             nucleic2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 para          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            paraffins          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               parser          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              parstof          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  pic          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             pidigits          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                power          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               pretty          -0.0%      0.0%     -0.3%     -0.4%     -0.4%
               primes          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            primetest          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               prolog          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               puzzle          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               queens          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              reptile          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      reverse-complem          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              rewrite          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 rfib          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  rsa          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  scc          -0.0%      0.0%     -0.3%     -0.5%     -0.4%
                sched          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  scs          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               simple          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                solid          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              sorting          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        spectral-norm          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               sphere          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               symalg          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  tak          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            transform          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             treejoin          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            typecheck          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              veritas          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 wang          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            wave4main          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         wheel-sieve1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         wheel-sieve2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 x2n1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      --------------------------------------------------------------------------------
                  Min          -0.1%      0.0%     -0.3%     -0.5%     -0.5%
                  Max          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       Geometric Mean          -0.0%     -0.0%     -0.0%     -0.0%     -0.0%
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs    Instrs     Reads    Writes
      --------------------------------------------------------------------------------
              circsim          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
          constraints          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             fibheaps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             gc_bench          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 hash          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 lcss          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                power          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           spellcheck          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      --------------------------------------------------------------------------------
                  Min          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  Max          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       Geometric Mean          -0.0%     +0.0%     -0.0%     -0.0%     -0.0%
      
      Manual inspection of programs in testsuite/tests/programs
      ---------------------------------------------------------
      
      I built these programs with a bunch of dump flags and `-O` and compared
      STG, Cmm, and Asm dumps and file sizes.
      
      (Below the numbers in parenthesis show number of modules in the program)
      
      These programs have identical compiler (same .hi and .o sizes, STG, and
      Cmm and Asm dumps):
      
      - Queens (1), andre_monad (1), cholewo-eval (2), cvh_unboxing (3),
        andy_cherry (7), fun_insts (1), hs-boot (4), fast2haskell (2),
        jl_defaults (1), jq_readsPrec (1), jules_xref (1), jtod_circint (4),
        jules_xref2 (1), lennart_range (1), lex (1), life_space_leak (1),
        bargon-mangler-bug (7), record_upd (1), rittri (1), sanders_array (1),
        strict_anns (1), thurston-module-arith (2), okeefe_neural (1),
        joao-circular (6), 10queens (1)
      
      Programs with different compiler outputs:
      
      - jl_defaults (1): For some reason GHC HEAD marks a lot of top-level
        `[Int]` closures as CAFFY for no reason. With this patch we no longer
        make them CAFFY and generate less SRT entries. For some reason Main.o
        is slightly larger with this patch (1.3%) and the executable sizes are
        the same. (I'd expect both to be smaller)
      
      - launchbury (1): Same as jl_defaults: top-level `[Int]` closures marked
        as CAFFY for no reason. Similarly `Main.o` is 1.4% larger but the
        executable sizes are the same.
      
      - galois_raytrace (13): Differences are in the Parse module. There are a
        lot, but some of the changes are caused by the fact that for some
        reason (I think a bug) GHC HEAD marks the dictionary for `Functor
        Identity` as CAFFY. Parse.o is 0.4% larger, the executable size is the
        same.
      
      - north_array: We now generate less SRT entries because some of array
        primops used in this program like `NewArrayOp` get eliminated during
        Stg-to-Cmm and turn some CAFFY things into non-CAFFY. Main.o gets 24%
        larger (9224 bytes from 9000 bytes), executable sizes are the same.
      
      - seward-space-leak: Difference in this program is better shown by this
        smaller example:
      
            module Lib where
      
            data CDS
              = Case [CDS] [(Int, CDS)]
              | Call CDS CDS
      
            instance Eq CDS where
              Case sels1 rets1 == Case sels2 rets2 =
                  sels1 == sels2 && rets1 == rets2
              Call a1 b1 == Call a2 b2 =
                  a1 == a2 && b1 == b2
              _ == _ =
                  False
      
         In this program GHC HEAD builds a new SRT for the recursive group of
         `(==)`, `(/=)` and the dictionary closure. Then `/=` points to `==`
         in its SRT field, and `==` uses the SRT object as its SRT. With this
         patch we use the closure for `/=` as the SRT and add `==` there. Then
         `/=` gets an empty SRT field and `==` points to `/=` in its SRT
         field.
      
         This change looks fine to me.
      
         Main.o gets 0.07% larger, executable sizes are identical.
      
      head.hackage
      ------------
      
      head.hackage's CI script builds 428 packages from Hackage using this
      patch with no failures.
      
      Compiler performance
      --------------------
      
      The compiler perf tests report that the compiler allocates slightly more
      (worst case observed so far is 4%). However most programs in the test
      suite are small, single file programs. To benchmark compiler performance
      on something more realistic I build Cabal (the library, 236 modules)
      with different optimisation levels. For the "max residency" row I run
      GHC with `+RTS -s -A100k -i0 -h` for more accurate numbers. Other rows
      are generated with just `-s`. (This is because `-i0` causes running GC
      much more frequently and as a result "bytes copied" gets inflated by
      more than 25x in some cases)
      
      * -O0
      
      |                 | GHC HEAD       | This MR        | Diff   |
      | --------------- | -------------- | -------------- | ------ |
      | Bytes allocated | 54,413,350,872 | 54,701,099,464 | +0.52% |
      | Bytes copied    |  4,926,037,184 |  4,990,638,760 | +1.31% |
      | Max residency   |    421,225,624 |    424,324,264 | +0.73% |
      
      * -O1
      
      |                 | GHC HEAD        | This MR         | Diff   |
      | --------------- | --------------- | --------------- | ------ |
      | Bytes allocated | 245,849,209,992 | 246,562,088,672 | +0.28% |
      | Bytes copied    |  26,943,452,560 |  27,089,972,296 | +0.54% |
      | Max residency   |     982,643,440 |     991,663,432 | +0.91% |
      
      * -O2
      
      |                 | GHC HEAD        | This MR         | Diff   |
      | --------------- | --------------- | --------------- | ------ |
      | Bytes allocated | 291,044,511,408 | 291,863,910,912 | +0.28% |
      | Bytes copied    |  37,044,237,616 |  36,121,690,472 | -2.49% |
      | Max residency   |   1,071,600,328 |   1,086,396,256 | +1.38% |
      
      Extra compiler allocations
      --------------------------
      
      Runtime allocations of programs are as reported above (NoFib section).
      
      The compiler now allocates more than before. Main source of allocation
      in this patch compared to base commit is the new SRT algorithm
      (GHC.Cmm.Info.Build). Below is some of the extra work we do with this
      patch, numbers generated by profiled stage 2 compiler when building a
      pathological case (the test 'ManyConstructors') with '-O2':
      
      - We now sort the final STG for a module, which means traversing the
        entire program, generating free variable set for each top-level
        binding, doing SCC analysis, and re-ordering the program. In
        ManyConstructors this step allocates 97,889,952 bytes.
      
      - We now do SRT analysis on static data, which in a program like
        ManyConstructors causes analysing 10,000 bindings that we would
        previously just skip. This step allocates 70,898,352 bytes.
      
      - We now maintain an SRT map for the entire module as we compile Cmm
        groups:
      
            data ModuleSRTInfo = ModuleSRTInfo
              { ...
              , moduleSRTMap :: SRTMap
              }
      
         (SRTMap is just a strict Map from the 'containers' library)
      
         This map gets an entry for most bindings in a module (exceptions are
         THUNKs and CAFFY static functions). For ManyConstructors this map
         gets 50015 entries.
      
      - Once we're done with code generation we generate a NameSet from SRTMap
        for the non-CAFFY names in the current module. This set gets the same
        number of entries as the SRTMap.
      
      - Finally we update CafInfos in ModDetails for the non-CAFFY Ids, using
        the NameSet generated in the previous step. This usually does the
        least amount of allocation among the work listed here.
      
      Only place with this patch where we do less work in the CAF analysis in
      the tidying pass (CoreTidy). However that doesn't save us much, as the
      pass still needs to traverse the whole program and update IdInfos for
      other reasons. Only thing we don't here do is the `hasCafRefs` pass over
      the RHS of bindings, which is a stateless pass that returns a boolean
      value, so it doesn't allocate much.
      
      (Metric changes blow are all increased allocations)
      
      Metric changes
      --------------
      
      Metric Increase:
          ManyAlternatives
          ManyConstructors
          T13035
          T14683
          T1969
          T9961
      c846618a
  5. 27 Jan, 2020 1 commit
  6. 06 Jan, 2020 1 commit
  7. 31 Dec, 2019 1 commit
  8. 18 Dec, 2019 1 commit
    • Sylvain Henry's avatar
      Add GHC-API logging hooks · 58655b9d
      Sylvain Henry authored
      * Add 'dumpAction' hook to DynFlags.
      
      It allows GHC API users to catch dumped intermediate codes and
      information. The format of the dump (Core, Stg, raw text, etc.) is now
      reported allowing easier automatic handling.
      
      * Add 'traceAction' hook to DynFlags.
      
      Some dumps go through the trace mechanism (for instance unfoldings that
      have been considered for inlining). This is problematic because:
      1) dumps aren't written into files even with -ddump-to-file on
      2) dumps are written on stdout even with GHC API
      3) in this specific case, dumping depends on unsafe globally stored
      DynFlags which is bad for GHC API users
      
      We introduce 'traceAction' hook which allows GHC API to catch those
      traces and to avoid using globally stored DynFlags.
      
      * Avoid dumping empty logs via dumpAction/traceAction (but still write
      empty files to keep the existing behavior)
      58655b9d
  9. 17 Dec, 2019 1 commit
  10. 28 Nov, 2019 1 commit
  11. 05 Nov, 2019 1 commit
  12. 23 Oct, 2019 1 commit
    • Andreas Klebinger's avatar
      Make dynflag argument for withTiming pure. · 6beea836
      Andreas Klebinger authored
      19 times out of 20 we already have dynflags in scope.
      
      We could just always use `return dflags`. But this is in fact not free.
      When looking at some STG code I noticed that we always allocate a
      closure for this expression in the heap. Clearly a waste in these cases.
      
      For the other cases we can either just modify the callsite to
      get dynflags or use the _D variants of withTiming I added which
      will use getDynFlags under the hood.
      6beea836
  13. 26 Jun, 2019 1 commit
    • Ben Gamari's avatar
      Don't eta-expand unsaturated primops · cac8dc9f
      Ben Gamari authored
      Previously, as described in Note [Primop wrappers], `hasNoBinding` would
      return False in the case of `PrimOpId`s. This would result in eta
      expansion of unsaturated primop applications during CorePrep. Not only
      did this expansion result in unnecessary allocations, but it also meant
      lead to rather nasty inconsistencies between the CAFfy-ness
      determinations made by TidyPgm and CorePrep.
      
      This fixes #16846.
      cac8dc9f
  14. 20 Jun, 2019 1 commit
  15. 14 Jun, 2019 1 commit
    • Andrew Martin's avatar
      Implement the -XUnliftedNewtypes extension. · effdd948
      Andrew Martin authored
      GHC Proposal: 0013-unlifted-newtypes.rst
      Discussion: https://github.com/ghc-proposals/ghc-proposals/pull/98
      Issues: #15219, #1311, #13595, #15883
      Implementation Details:
        Note [Implementation of UnliftedNewtypes]
        Note [Unifying data family kinds]
        Note [Compulsory newtype unfolding]
      
      This patch introduces the -XUnliftedNewtypes extension. When this
      extension is enabled, GHC drops the restriction that the field in
      a newtype must be of kind (TYPE 'LiftedRep). This allows types
      like Int# and ByteArray# to be used in a newtype. Additionally,
      coerce is made levity-polymorphic so that it can be used with
      newtypes over unlifted types.
      
      The bulk of the changes are in TcTyClsDecls.hs. With -XUnliftedNewtypes,
      getInitialKind is more liberal, introducing a unification variable to
      return the kind (TYPE r0) rather than just returning (TYPE 'LiftedRep).
      When kind-checking a data constructor with kcConDecl, we attempt to
      unify the kind of a newtype with the kind of its field's type. When
      typechecking a data declaration with tcTyClDecl, we again perform a
      unification. See the implementation note for more on this.
      Co-authored-by: Richard Eisenberg's avatarRichard Eisenberg <rae@richarde.dev>
      effdd948
  16. 12 Jun, 2019 1 commit
  17. 31 May, 2019 1 commit
  18. 25 Mar, 2019 1 commit
    • Takenobu Tani's avatar
      Update Wiki URLs to point to GitLab · 3769e3a8
      Takenobu Tani authored
      This moves all URL references to Trac Wiki to their corresponding
      GitLab counterparts.
      
      This substitution is classified as follows:
      
      1. Automated substitution using sed with Ben's mapping rule [1]
          Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...
          New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...
      
      2. Manual substitution for URLs containing `#` index
          Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...#Zzz
          New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...#zzz
      
      3. Manual substitution for strings starting with `Commentary`
          Old: Commentary/XxxYyy...
          New: commentary/xxx-yyy...
      
      See also !539
      
      [1]: https://gitlab.haskell.org/bgamari/gitlab-migration/blob/master/wiki-mapping.json
      3769e3a8
  19. 15 Mar, 2019 1 commit
  20. 29 Nov, 2018 1 commit
    • Chaitanya Koparkar's avatar
      Fix #15953 by consistently using dumpIfSet_dyn to print debug output · dcf1f926
      Chaitanya Koparkar authored
      Summary:
      In some modules we directly dump the debugging output to STDOUT
      via 'putLogMsg', 'printInfoForUser' etc. However, if `-ddump-to-file`
      is enabled, that output should be written to a file. Easily fixed.
      
      Certain tests (T3017, Roles3, T12763 etc.) expect part of the
      output generated by `-ddump-types` to be in 'PprUser' style. However,
      generally we want all other debugging output to use 'PprDump'
      style. `traceTcRn` and `traceTcRnForUser` help us accomplish this.
      
      This patch also documents some missing flags in the users guide.
      
      Reviewers: RyanGlScott, bgamari, hvr
      
      Reviewed By: RyanGlScott
      
      Subscribers: rwbarton, carter
      
      GHC Trac Issues: #15953
      
      Differential Revision: https://phabricator.haskell.org/D5382
      dcf1f926
  21. 26 Jun, 2018 1 commit
  22. 18 Jun, 2018 1 commit
  23. 15 Jun, 2018 1 commit
    • Sylvain Henry's avatar
      Built-in Natural literals in Core · fe770c21
      Sylvain Henry authored
      Add support for built-in Natural literals in Core.
      
      - Replace MachInt,MachWord, LitInteger, etc. with a single LitNumber
        constructor with a LitNumType field
      - Support built-in Natural literals
      - Add desugar warning for negative literals
      - Move Maybe(..) from GHC.Base to GHC.Maybe for module dependency
        reasons
      
      This patch introduces only a few rules for Natural literals (compared
      to Integer's rules). Factorization of the built-in rules for numeric
      literals will be done in another patch as this one is already big to
      review.
      
      Test Plan:
        validate
        test build with integer-simple
      
      Reviewers: hvr, bgamari, goldfire, Bodigrim, simonmar
      
      Reviewed By: bgamari
      
      Subscribers: phadej, simonpj, RyanGlScott, carter, hsyl20, rwbarton,
      thomie
      
      GHC Trac Issues: #14170, #14465
      
      Differential Revision: https://phabricator.haskell.org/D4212
      fe770c21
  24. 02 Jun, 2018 1 commit
    • Ben Gamari's avatar
      vectorise: Put it out of its misery · faee23bb
      Ben Gamari authored
      Poor DPH and its vectoriser have long been languishing; sadly it seems there is
      little chance that the effort will be rekindled. Every few years we discuss
      what to do with this mass of code and at least once we have agreed that it
      should be archived on a branch and removed from `master`. Here we do just that,
      eliminating heaps of dead code in the process.
      
      Here we drop the ParallelArrays extension, the vectoriser, and the `vector` and
      `primitive` submodules.
      
      Test Plan: Validate
      
      Reviewers: simonpj, simonmar, hvr, goldfire, alanz
      
      Reviewed By: simonmar
      
      Subscribers: goldfire, rwbarton, thomie, mpickering, carter
      
      Differential Revision: https://phabricator.haskell.org/D4761
      faee23bb
  25. 17 Jan, 2018 1 commit
  26. 03 Jan, 2018 1 commit
    • Simon Peyton Jones's avatar
      Get evaluated-ness right in the back end · bd438b2d
      Simon Peyton Jones authored
      See Trac #14626, comment:4.  We want to maintain evaluted-ness
      info on Ids into the code generateor for two reasons
      (see Note [Preserve evaluated-ness in CorePrep] in CorePrep)
      
      - DataToTag magic
      - Potentially using it in the codegen (this is Gabor's
        current work)
      
      But it was all being done very inconsistently, and actually
      outright wrong -- the DataToTag magic hasn't been working for
      years.
      
      This patch tidies it all up, with Notes to match.
      bd438b2d
  27. 26 Sep, 2017 1 commit
  28. 19 Sep, 2017 1 commit
    • Herbert Valerio Riedel's avatar
      compiler: introduce custom "GhcPrelude" Prelude · f63bc730
      Herbert Valerio Riedel authored
      This switches the compiler/ component to get compiled with
      -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all
      modules.
      
      This is motivated by the upcoming "Prelude" re-export of
      `Semigroup((<>))` which would cause lots of name clashes in every
      modulewhich imports also `Outputable`
      
      Reviewers: austin, goldfire, bgamari, alanz, simonmar
      
      Reviewed By: bgamari
      
      Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
      
      Differential Revision: https://phabricator.haskell.org/D3989
      f63bc730
  29. 01 May, 2017 1 commit
  30. 15 Mar, 2017 1 commit
    • Ben Gamari's avatar
      Introduce putLogMsg · 086b514b
      Ben Gamari authored
      This factors out the repetition of (log_action dflags dflags) and will
      hopefully allow us to someday better abstract log output.
      
      Test Plan: Validate
      
      Reviewers: austin, hvr, goldfire
      
      Subscribers: rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3334
      086b514b
  31. 09 Mar, 2017 1 commit
    • bitonic's avatar
      Allow compilation of C/C++/ObjC/ObjC++ files with module from TH · 0fac488c
      bitonic authored
      The main goal is to easily allow the inline-c project (and
      similar projects such as inline-java) to emit C/C++ files to
      be compiled and linked with the current module.
      
      Moreover, `addCStub` is removed, since it's quite fragile. Most
      notably, the C stubs end up in the file generated by
      `CodeOutput.outputForeignStubs`, which is tuned towards
      generating a file for stubs coming from `capi` and Haskell-to-C
      exports.
      
      Reviewers: simonmar, austin, goldfire, facundominguez, dfeuer, bgamari
      
      Reviewed By: dfeuer, bgamari
      
      Subscribers: snowleopard, rwbarton, dfeuer, thomie, duncan, mboes
      
      Differential Revision: https://phabricator.haskell.org/D3280
      0fac488c
  32. 08 Feb, 2017 1 commit
  33. 03 Feb, 2017 1 commit
    • Sylvain Henry's avatar
      Ditch static flags · bbd3c399
      Sylvain Henry authored
      This patch converts the 4 lasting static flags (read from the command
      line and unsafely stored in immutable global variables) into dynamic
      flags. Most use cases have been converted into reading them from a DynFlags.
      
      In cases for which we don't have easy access to a DynFlags, we read from
      'unsafeGlobalDynFlags' that is set at the beginning of each 'runGhc'.
      It's not perfect (not thread-safe) but it is still better as we can
      set/unset these 4 flags before each run when using GHC API.
      
      Updates haddock submodule.
      
      Rebased and finished by: bgamari
      
      Test Plan: validate
      
      Reviewers: goldfire, erikd, hvr, austin, simonmar, bgamari
      
      Reviewed By: simonmar
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2839
      
      GHC Trac Issues: #8440
      bbd3c399
  34. 02 Feb, 2017 1 commit
    • Ben Gamari's avatar
      Add support for StaticPointers in GHCi · eedb3df0
      Ben Gamari authored
      Here we add support to GHCi for StaticPointers. This process begins by
      adding remote GHCi messages for adding entries to the static pointer
      table. We then collect binders needing SPT entries after linking and
      send the interpreter a message adding entries with the appropriate
      fingerprints.
      
      Test Plan: `make test TEST=StaticPtr`
      
      Reviewers: facundominguez, mboes, simonpj, simonmar, goldfire, austin,
      hvr, erikd
      
      Reviewed By: simonpj, simonmar
      
      Subscribers: RyanGlScott, simonpj, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2504
      
      GHC Trac Issues: #12356
      eedb3df0
  35. 26 Jan, 2017 1 commit
    • Matthew Pickering's avatar
      COMPLETE pragmas for enhanced pattern exhaustiveness checking · 1a3f1eeb
      Matthew Pickering authored
      This patch adds a new pragma so that users can specify `COMPLETE` sets of
      `ConLike`s in order to sate the pattern match checker.
      
      A function which matches on all the patterns in a complete grouping
      will not cause the exhaustiveness checker to emit warnings.
      
      ```
      pattern P :: ()
      pattern P = ()
      
      {-# COMPLETE P #-}
      
      foo P = ()
      ```
      
      This example would previously have caused the checker to warn that
      all cases were not matched even though matching on `P` is sufficient to
      make `foo` covering. With the addition of the pragma, the compiler
      will recognise that matching on `P` alone is enough and not emit
      any warnings.
      
      Reviewers: goldfire, gkaracha, alanz, austin, bgamari
      
      Reviewed By: alanz
      
      Subscribers: lelf, nomeata, gkaracha, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2669
      
      GHC Trac Issues: #8779
      1a3f1eeb
  36. 23 Jan, 2017 1 commit
  37. 13 Jan, 2017 1 commit
  38. 06 Jan, 2017 2 commits
  39. 23 Dec, 2016 1 commit