1. 22 Feb, 2020 1 commit
  2. 13 Feb, 2020 1 commit
  3. 12 Feb, 2020 1 commit
  4. 31 Jan, 2020 2 commits
    • Andreas Klebinger's avatar
      A few optimizations in STG and Cmm parts: · 2a87a565
      Andreas Klebinger authored
      (Guided by the profiler output)
      
      - Add a few bang patterns, INLINABLE annotations, and a seqList in a few
        places in Cmm and STG parts.
      
      - Do not add external variables as dependencies in STG dependency
        analysis (GHC.Stg.DepAnal).
      2a87a565
    • Ömer Sinan Ağacan's avatar
      Do CafInfo/SRT analysis in Cmm · c846618a
      Ömer Sinan Ağacan authored
      This patch removes all CafInfo predictions and various hacks to preserve
      predicted CafInfos from the compiler and assigns final CafInfos to
      interface Ids after code generation. SRT analysis is extended to support
      static data, and Cmm generator is modified to allow generating
      static_link fields after SRT analysis.
      
      This also fixes `-fcatch-bottoms`, which introduces error calls in case
      expressions in CorePrep, which runs *after* CoreTidy (which is where we
      decide on CafInfos) and turns previously non-CAFFY things into CAFFY.
      
      Fixes #17648
      Fixes #9718
      
      Evaluation
      ==========
      
      NoFib
      -----
      
      Boot with: `make boot mode=fast`
      Run: `make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1`
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs    Instrs     Reads    Writes
      --------------------------------------------------------------------------------
                   CS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  CSD          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                   FS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                    S          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                   VS          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  VSD          -0.0%      0.0%     -0.0%     -0.0%     -0.5%
                  VSM          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 anna          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                 ansi          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 atom          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               awards          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               banner          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           bernouilli          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         binary-trees          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                boyer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               boyer2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 bspt          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            cacheprof          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             calendar          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             cichelli          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              circsim          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             clausify          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        comp_lab_zift          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             compress          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            compress2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          constraints          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         cryptarithm1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         cryptarithm2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  cse          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         digits-of-e1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         digits-of-e2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               dom-lt          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                eliza          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                event          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          exact-reals          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               exp3_8          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               expert          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       fannkuch-redux          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                fasta          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  fem          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  fft          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 fft2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             fibheaps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 fish          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                fluid          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
               fulsom          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               gamteb          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  gcd          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
          gen_regexps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               genfft          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                   gg          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 grep          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               hidden          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  hpg          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  ida          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                infer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              integer          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            integrate          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         k-nucleotide          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                kahan          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              knights          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               lambda          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           last-piece          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 lcss          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 life          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 lift          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               linear          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
            listcompr          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             listcopy          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             maillist          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               mandel          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              mandel2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 mate          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              minimax          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              mkhprog          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           multiplier          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               n-body          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             nucleic2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 para          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            paraffins          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               parser          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
              parstof          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  pic          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             pidigits          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                power          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               pretty          -0.0%      0.0%     -0.3%     -0.4%     -0.4%
               primes          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            primetest          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               prolog          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               puzzle          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               queens          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              reptile          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      reverse-complem          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              rewrite          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 rfib          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  rsa          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  scc          -0.0%      0.0%     -0.3%     -0.5%     -0.4%
                sched          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  scs          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               simple          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                solid          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              sorting          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
        spectral-norm          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               sphere          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
               symalg          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  tak          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            transform          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             treejoin          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            typecheck          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
              veritas          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 wang          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
            wave4main          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         wheel-sieve1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
         wheel-sieve2          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 x2n1          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      --------------------------------------------------------------------------------
                  Min          -0.1%      0.0%     -0.3%     -0.5%     -0.5%
                  Max          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       Geometric Mean          -0.0%     -0.0%     -0.0%     -0.0%     -0.0%
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs    Instrs     Reads    Writes
      --------------------------------------------------------------------------------
              circsim          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
          constraints          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             fibheaps          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
             gc_bench          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 hash          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 lcss          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
                power          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
           spellcheck          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
      --------------------------------------------------------------------------------
                  Min          -0.1%      0.0%     -0.0%     -0.0%     -0.0%
                  Max          -0.0%      0.0%     -0.0%     -0.0%     -0.0%
       Geometric Mean          -0.0%     +0.0%     -0.0%     -0.0%     -0.0%
      
      Manual inspection of programs in testsuite/tests/programs
      ---------------------------------------------------------
      
      I built these programs with a bunch of dump flags and `-O` and compared
      STG, Cmm, and Asm dumps and file sizes.
      
      (Below the numbers in parenthesis show number of modules in the program)
      
      These programs have identical compiler (same .hi and .o sizes, STG, and
      Cmm and Asm dumps):
      
      - Queens (1), andre_monad (1), cholewo-eval (2), cvh_unboxing (3),
        andy_cherry (7), fun_insts (1), hs-boot (4), fast2haskell (2),
        jl_defaults (1), jq_readsPrec (1), jules_xref (1), jtod_circint (4),
        jules_xref2 (1), lennart_range (1), lex (1), life_space_leak (1),
        bargon-mangler-bug (7), record_upd (1), rittri (1), sanders_array (1),
        strict_anns (1), thurston-module-arith (2), okeefe_neural (1),
        joao-circular (6), 10queens (1)
      
      Programs with different compiler outputs:
      
      - jl_defaults (1): For some reason GHC HEAD marks a lot of top-level
        `[Int]` closures as CAFFY for no reason. With this patch we no longer
        make them CAFFY and generate less SRT entries. For some reason Main.o
        is slightly larger with this patch (1.3%) and the executable sizes are
        the same. (I'd expect both to be smaller)
      
      - launchbury (1): Same as jl_defaults: top-level `[Int]` closures marked
        as CAFFY for no reason. Similarly `Main.o` is 1.4% larger but the
        executable sizes are the same.
      
      - galois_raytrace (13): Differences are in the Parse module. There are a
        lot, but some of the changes are caused by the fact that for some
        reason (I think a bug) GHC HEAD marks the dictionary for `Functor
        Identity` as CAFFY. Parse.o is 0.4% larger, the executable size is the
        same.
      
      - north_array: We now generate less SRT entries because some of array
        primops used in this program like `NewArrayOp` get eliminated during
        Stg-to-Cmm and turn some CAFFY things into non-CAFFY. Main.o gets 24%
        larger (9224 bytes from 9000 bytes), executable sizes are the same.
      
      - seward-space-leak: Difference in this program is better shown by this
        smaller example:
      
            module Lib where
      
            data CDS
              = Case [CDS] [(Int, CDS)]
              | Call CDS CDS
      
            instance Eq CDS where
              Case sels1 rets1 == Case sels2 rets2 =
                  sels1 == sels2 && rets1 == rets2
              Call a1 b1 == Call a2 b2 =
                  a1 == a2 && b1 == b2
              _ == _ =
                  False
      
         In this program GHC HEAD builds a new SRT for the recursive group of
         `(==)`, `(/=)` and the dictionary closure. Then `/=` points to `==`
         in its SRT field, and `==` uses the SRT object as its SRT. With this
         patch we use the closure for `/=` as the SRT and add `==` there. Then
         `/=` gets an empty SRT field and `==` points to `/=` in its SRT
         field.
      
         This change looks fine to me.
      
         Main.o gets 0.07% larger, executable sizes are identical.
      
      head.hackage
      ------------
      
      head.hackage's CI script builds 428 packages from Hackage using this
      patch with no failures.
      
      Compiler performance
      --------------------
      
      The compiler perf tests report that the compiler allocates slightly more
      (worst case observed so far is 4%). However most programs in the test
      suite are small, single file programs. To benchmark compiler performance
      on something more realistic I build Cabal (the library, 236 modules)
      with different optimisation levels. For the "max residency" row I run
      GHC with `+RTS -s -A100k -i0 -h` for more accurate numbers. Other rows
      are generated with just `-s`. (This is because `-i0` causes running GC
      much more frequently and as a result "bytes copied" gets inflated by
      more than 25x in some cases)
      
      * -O0
      
      |                 | GHC HEAD       | This MR        | Diff   |
      | --------------- | -------------- | -------------- | ------ |
      | Bytes allocated | 54,413,350,872 | 54,701,099,464 | +0.52% |
      | Bytes copied    |  4,926,037,184 |  4,990,638,760 | +1.31% |
      | Max residency   |    421,225,624 |    424,324,264 | +0.73% |
      
      * -O1
      
      |                 | GHC HEAD        | This MR         | Diff   |
      | --------------- | --------------- | --------------- | ------ |
      | Bytes allocated | 245,849,209,992 | 246,562,088,672 | +0.28% |
      | Bytes copied    |  26,943,452,560 |  27,089,972,296 | +0.54% |
      | Max residency   |     982,643,440 |     991,663,432 | +0.91% |
      
      * -O2
      
      |                 | GHC HEAD        | This MR         | Diff   |
      | --------------- | --------------- | --------------- | ------ |
      | Bytes allocated | 291,044,511,408 | 291,863,910,912 | +0.28% |
      | Bytes copied    |  37,044,237,616 |  36,121,690,472 | -2.49% |
      | Max residency   |   1,071,600,328 |   1,086,396,256 | +1.38% |
      
      Extra compiler allocations
      --------------------------
      
      Runtime allocations of programs are as reported above (NoFib section).
      
      The compiler now allocates more than before. Main source of allocation
      in this patch compared to base commit is the new SRT algorithm
      (GHC.Cmm.Info.Build). Below is some of the extra work we do with this
      patch, numbers generated by profiled stage 2 compiler when building a
      pathological case (the test 'ManyConstructors') with '-O2':
      
      - We now sort the final STG for a module, which means traversing the
        entire program, generating free variable set for each top-level
        binding, doing SCC analysis, and re-ordering the program. In
        ManyConstructors this step allocates 97,889,952 bytes.
      
      - We now do SRT analysis on static data, which in a program like
        ManyConstructors causes analysing 10,000 bindings that we would
        previously just skip. This step allocates 70,898,352 bytes.
      
      - We now maintain an SRT map for the entire module as we compile Cmm
        groups:
      
            data ModuleSRTInfo = ModuleSRTInfo
              { ...
              , moduleSRTMap :: SRTMap
              }
      
         (SRTMap is just a strict Map from the 'containers' library)
      
         This map gets an entry for most bindings in a module (exceptions are
         THUNKs and CAFFY static functions). For ManyConstructors this map
         gets 50015 entries.
      
      - Once we're done with code generation we generate a NameSet from SRTMap
        for the non-CAFFY names in the current module. This set gets the same
        number of entries as the SRTMap.
      
      - Finally we update CafInfos in ModDetails for the non-CAFFY Ids, using
        the NameSet generated in the previous step. This usually does the
        least amount of allocation among the work listed here.
      
      Only place with this patch where we do less work in the CAF analysis in
      the tidying pass (CoreTidy). However that doesn't save us much, as the
      pass still needs to traverse the whole program and update IdInfos for
      other reasons. Only thing we don't here do is the `hasCafRefs` pass over
      the RHS of bindings, which is a stateless pass that returns a boolean
      value, so it doesn't allocate much.
      
      (Metric changes blow are all increased allocations)
      
      Metric changes
      --------------
      
      Metric Increase:
          ManyAlternatives
          ManyConstructors
          T13035
          T14683
          T1969
          T9961
      c846618a
  5. 25 Jan, 2020 1 commit
  6. 04 Jan, 2020 1 commit
  7. 09 Sep, 2019 1 commit
    • Sylvain Henry's avatar
      Module hierarchy: StgToCmm (#13009) · 447864a9
      Sylvain Henry authored
      Add StgToCmm module hierarchy. Platform modules that are used in several
      other places (NCG, LLVM codegen, Cmm transformations) are put into
      GHC.Platform.
      447864a9
  8. 29 Aug, 2019 1 commit
    • Ömer Sinan Ağacan's avatar
      Small optimization in the SRT algorithm · 304067a0
      Ömer Sinan Ağacan authored
      Noticed by @simonmar in !1362:
      
          If the srtEntry is Nothing, then it should be safe to omit
          references to this SRT from other SRTs, even if it is a static
          function.
      
      When updating SRT map we don't omit references to static functions (see
      Note [Invalid optimisation: shortcutting]), but there's no reason to add
      an SRT entry for a static function if the function is not CAFFY.
      
      (Previously we'd add SRT entries for static functions even when they're
      not CAFFY)
      
      Using 9151b99e I checked sizes of all SRTs when building GHC and
      containers:
      
      - GHC: 583736 (HEAD), 581695 (this patch). 2041 less SRT entries.
      - containers: 2457 (HEAD), 2381 (this patch). 76 less SRT entries.
      304067a0
  9. 15 Aug, 2019 1 commit
    • James Foster's avatar
      Remove unused imports of the form 'import foo ()' (Fixes #17065) · ca71d551
      James Foster authored
      These kinds of imports are necessary in some cases such as
      importing instances of typeclasses or intentionally creating
      dependencies in the build system, but '-Wunused-imports' can't
      detect when they are no longer needed. This commit removes the
      unused ones currently in the code base (not including test files
      or submodules), with the hope that doing so may increase
      parallelism in the build system by removing unnecessary
      dependencies.
      ca71d551
  10. 13 Jul, 2019 1 commit
  11. 20 Jun, 2019 1 commit
    • John Ericson's avatar
      Move 'Platform' to ghc-boot · bff2f24b
      John Ericson authored
      ghc-pkg needs to be aware of platforms so it can figure out which
      subdire within the user package db to use. This is admittedly
      roundabout, but maybe Cabal could use the same notion of a platform as
      GHC to good affect too.
      bff2f24b
  12. 15 Nov, 2018 1 commit
    • Simon Marlow's avatar
      Fix a bug in SRT generation (#15892) · eb46345d
      Simon Marlow authored
      Summary:
      The logic in `Note [recursive SRTs]` was correct. However, my
      implementation of it wasn't: I got the associativity of
      `Set.difference` wrong, which led to an extremely subtle and difficult
      to find bug.
      
      Fortunately now we have a test case. I was able to cut down the code
      to something manageable, and I've added it to the test suite.
      
      Test Plan:
      Before (using my stage 1 compiler without the fix):
      
      ```
      ====> T15892(normal) 1 of 1 [0, 0, 0]
      cd "T15892.run" &&  "/home/smarlow/ghc/inplace/bin/ghc-stage1" -o T15892
      T15892.hs -dcore-lint -dcmm-lint -no-user-package-db -rtsopts
      -fno-warn-missed-specialisations -fshow-warning-groups
      -fdiagnostics-color=never -fno-diagnostics-show-caret -Werror=compat
      -dno-debug-output  -O
      cd "T15892.run" && ./T15892  +RTS -G1 -A32k -RTS
      Wrong exit code for T15892(normal)(expected 0 , actual 134 )
      Stderr ( T15892 ):
      T15892: internal error: evacuate: strange closure type 0
          (GHC version 8.7.20181113 for x86_64_unknown_linux)
          Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
      Aborted (core dumped)
      *** unexpected failure for T15892(normal)
      =====> T15892(g1) 1 of 1 [0, 1, 0]
      cd "T15892.run" &&  "/home/smarlow/ghc/inplace/bin/ghc-stage1" -o T15892
      T15892.hs -dcore-lint -dcmm-lint -no-user-package-db -rtsopts
      -fno-warn-missed-specialisations -fshow-warning-groups
      -fdiagnostics-color=never -fno-diagnostics-show-caret -Werror=compat
      -dno-debug-output  -O
      cd "T15892.run" && ./T15892 +RTS -G1 -RTS +RTS -G1 -A32k -RTS
      Wrong exit code for T15892(g1)(expected 0 , actual 134 )
      Stderr ( T15892 ):
      T15892: internal error: evacuate: strange closure type 0
          (GHC version 8.7.20181113 for x86_64_unknown_linux)
          Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
      Aborted (core dumped)
      ```
      
      After (using my stage 2 compiler with the fix):
      
      ```
      =====> T15892(normal) 1 of 1 [0, 0, 0]
      cd "T15892.run" &&  "/home/smarlow/ghc/inplace/test   spaces/ghc-stage2"
      -o T15892 T15892.hs -dcore-lint -dcmm-lint -no-user-package-db -rtsopts
      -fno-warn-missed-specialisations -fshow-warning-groups
      -fdiagnostics-color=never -fno-diagnostics-show-caret -Werror=compat
      -dno-debug-output
      cd "T15892.run" && ./T15892  +RTS -G1 -A32k -RTS
      =====> T15892(g1) 1 of 1 [0, 0, 0]
      cd "T15892.run" &&  "/home/smarlow/ghc/inplace/test   spaces/ghc-stage2"
      -o T15892 T15892.hs -dcore-lint -dcmm-lint -no-user-package-db -rtsopts
      -fno-warn-missed-specialisations -fshow-warning-groups
      -fdiagnostics-color=never -fno-diagnostics-show-caret -Werror=compat
      -dno-debug-output
      cd "T15892.run" && ./T15892 +RTS -G1 -RTS +RTS -G1 -A32k -RTS
      ```
      
      Reviewers: bgamari, osa1, erikd
      
      Reviewed By: osa1
      
      Subscribers: rwbarton, carter
      
      GHC Trac Issues: #15892
      
      Differential Revision: https://phabricator.haskell.org/D5334
      eb46345d
  13. 18 Sep, 2018 1 commit
  14. 23 May, 2018 2 commits
    • Ben Gamari's avatar
      Disable the SRT offset optimisation on MachO platforms · bf10456e
      Ben Gamari authored
      Unfortunately, this optimisation is infeasible on MachO platforms (e.g.
      Darwin) due to an object format limitation. Specifically, linking fails
      with errors of the form:
      
           error: unsupported relocation with subtraction expression, symbol
           '_integerzmgmp_GHCziIntegerziType_quotInteger_closure' can not be
           undefined in a subtraction expression
      
      Apparently MachO does not permit relocations' subtraction expressions to
      refer to undefined symbols. As far as I can tell this means that it is
      essentially impossible to express an offset between symbols living in
      different compilation units. This means that we lively can't use this
      optimisation on MachO platforms.
      
      Test Plan: Validate on Darwin
      
      Reviewers: simonmar, erikd
      
      Subscribers: rwbarton, thomie, carter, angerman
      
      GHC Trac Issues: #15169
      
      Differential Revision: https://phabricator.haskell.org/D4715
      bf10456e
    • Simon Marlow's avatar
      Fix a bug in SRT generation · d424d4a4
      Simon Marlow authored
      Summary:
      I had good intentions, but they were not being followed. In particular,
      this comment:
      
      ```
      ---  - we never resolve a reference to a CAF to the contents of its SRT, since
      ---    the point of SRTs is to keep CAFs alive.
      ```
      
      was not true, because we updated the srtMap after generating the SRT
      for a CAF. Therefore it was possible for another CAF to refer to an
      earlier CAF, and the reference to the earlier CAF would be shortcutted
      to refer to its SRT instead of pointing to the CAF itself.
      
      The fix is just to not update the srtMap when generating the SRT for a
      CAF, but I also refactored the code and comments around this to be a bit
      better organised.
      
      Test Plan: Harbourmaster
      
      Reviewers: bgamari, michalt, simonpj, erikd
      
      Subscribers: rwbarton, thomie, carter
      
      GHC Trac Issues: #15173, #15168
      
      Differential Revision: https://phabricator.haskell.org/D4721
      d424d4a4
  15. 17 May, 2018 1 commit
  16. 16 May, 2018 4 commits
    • Simon Marlow's avatar
      Merge FUN_STATIC closure with its SRT · 838b6903
      Simon Marlow authored
      Summary:
      The idea here is to save a little code size and some work in the GC,
      by collapsing FUN_STATIC closures and their SRTs.
      
      This is (4) in a series; see D4632 for more details.
      
      There's a tradeoff here: more complexity in the compiler in exchange
      for a modest code size reduction (probably around 0.5%).
      
      Results:
      * GHC binary itself (statically linked) is 1% smaller
      * -0.2% binary sizes in nofib (-0.5% module sizes)
      
      Full nofib results comparing D4634 with this: P177 (ignore runtimes,
      these aren't stable on my laptop)
      
      Test Plan: validate, nofib
      
      Reviewers: bgamari, niteria, simonpj, erikd
      
      Subscribers: thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4637
      838b6903
    • Simon Marlow's avatar
      Save a word in the info table on x86_64 · 2b0918c9
      Simon Marlow authored
      Summary:
      An info table with an SRT normally looks like this:
      
          StgWord64 srt_offset
          StgClosureInfo layout
          StgWord32 layout
          StgWord32 has_srt
      
      But we only need 32 bits for srt_offset on x86_64, because the small
      memory model requires that code segments are at most 2GB. So we can
      optimise this to
      
          StgClosureInfo layout
          StgWord32 layout
          StgWord32 srt_offset
      
      saving a word.  We can tell whether the info table has an SRT or not,
      because zero is not a valid srt_offset, so zero still indicates that
      there's no SRT.
      
      Test Plan:
      * validate
      * For results, see D4632.
      
      Reviewers: bgamari, niteria, osa1, erikd
      
      Subscribers: thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4634
      2b0918c9
    • Simon Marlow's avatar
      Allow CmmLabelDiffOff with different widths · fbd28e2c
      Simon Marlow authored
      Summary:
      This change makes it possible to generate a static 32-bit relative label
      offset on x86_64. Currently we can only generate word-sized label
      offsets.
      
      This will be used in D4634 to shrink info tables.  See D4632 for more
      details.
      
      Test Plan: See D4632
      
      Reviewers: bgamari, niteria, michalt, erikd, jrtc27, osa1
      
      Subscribers: thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4633
      fbd28e2c
    • Simon Marlow's avatar
      An overhaul of the SRT representation · eb8e692c
      Simon Marlow authored
      Summary:
      - Previously we would hvae a single big table of pointers per module,
        with a set of bitmaps to reference entries within it. The new
        representation is identical to a static constructor, which is much
        simpler for the GC to traverse, and we get to remove the complicated
        bitmap-traversal code from the GC.
      
      - Rewrite all the code to generate SRTs in CmmBuildInfoTables, and
        document it much better (see Note [SRTs]). This has been something
        I've wanted to do since we moved to the new code generator, I
        finally had the opportunity to finish it while on a transatlantic
        flight recently :)
      
      There are a series of 4 diffs:
      
      1. D4632 (this one), which does the bulk of the changes
      
      2. D4633 which adds support for smaller `CmmLabelDiffOff` constants
      
      3. D4634 which takes advantage of D4632 and D4633 to save a word in
         info tables that have an SRT on x86_64. This is where most of the
         binary size improvement comes from.
      
      4. D4637 which makes a further optimisation to merge some SRTs with
         static FUN closures.  This adds some complexity and the benefits
         are fairly modest, so it's not clear yet whether we should do this.
      
      Results (after (3), on x86_64)
      
      - GHC itself (staticaly linked) is 5.2% smaller
      
      - -1.7% binary sizes in nofib, -2.9% module sizes. Full nofib results: P176
      
      - I measured the overhead of traversing all the static objects in a
        major GC in GHC itself by doing `replicateM_ 1000 performGC` as the
        first thing in `Main.main`.  The new version was 5-10% faster, but
        the results did vary quite a bit.
      
      - I'm not sure if there's a compile-time difference, the results are
        too unreliable.
      
      Test Plan: validate
      
      Reviewers: bgamari, michalt, niteria, simonpj, erikd, osa1
      
      Subscribers: thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4632
      eb8e692c
  17. 06 Mar, 2018 1 commit
  18. 06 Feb, 2018 1 commit
  19. 19 Sep, 2017 1 commit
    • Herbert Valerio Riedel's avatar
      compiler: introduce custom "GhcPrelude" Prelude · f63bc730
      Herbert Valerio Riedel authored
      This switches the compiler/ component to get compiled with
      -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all
      modules.
      
      This is motivated by the upcoming "Prelude" re-export of
      `Semigroup((<>))` which would cause lots of name clashes in every
      modulewhich imports also `Outputable`
      
      Reviewers: austin, goldfire, bgamari, alanz, simonmar
      
      Reviewed By: bgamari
      
      Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
      
      Differential Revision: https://phabricator.haskell.org/D3989
      f63bc730
  20. 23 Jun, 2017 1 commit
    • Michal Terepeta's avatar
      Hoopl: remove dependency on Hoopl package · 42eee6ea
      Michal Terepeta authored
      This copies the subset of Hoopl's functionality needed by GHC to
      `cmm/Hoopl` and removes the dependency on the Hoopl package.
      
      The main motivation for this change is the confusing/noisy interface
      between GHC and Hoopl:
      - Hoopl has `Label` which is GHC's `BlockId` but different than
        GHC's `CLabel`
      - Hoopl has `Unique` which is different than GHC's `Unique`
      - Hoopl has `Unique{Map,Set}` which are different than GHC's
        `Uniq{FM,Set}`
      - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
        needed just to filter the exposed functions (filter out some of the
        Hoopl's and add the GHC ones)
      With this change, we'll be able to simplify this significantly.
      It'll also be much easier to do invasive changes (Hoopl is a public
      package on Hackage with users that depend on the current behavior)
      
      This should introduce no changes in functionality - it merely
      copies the relevant code.
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan: ./validate
      
      Reviewers: austin, bgamari, simonmar
      
      Reviewed By: bgamari, simonmar
      
      Subscribers: simonpj, kavon, rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3616
      42eee6ea
  21. 02 Jun, 2017 1 commit
    • Ryan Scott's avatar
      Use lengthIs and friends in more places · a786b136
      Ryan Scott authored
      While investigating #12545, I discovered several places in the code
      that performed length-checks like so:
      
      ```
      length ts == 4
      ```
      
      This is not ideal, since the length of `ts` could be much longer than 4,
      and we'd be doing way more work than necessary! There are already a slew
      of helper functions in `Util` such as `lengthIs` that are designed to do
      this efficiently, so I found every place where they ought to be used and
      did just that. I also defined a couple more utility functions for list
      length that were common patterns (e.g., `ltLength`).
      
      Test Plan: ./validate
      
      Reviewers: austin, hvr, goldfire, bgamari, simonmar
      
      Reviewed By: bgamari, simonmar
      
      Subscribers: goldfire, rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3622
      a786b136
  22. 05 Apr, 2017 1 commit
  23. 08 Dec, 2016 1 commit
  24. 29 Nov, 2016 1 commit
    • Michal Terepeta's avatar
      Hoopl/Dataflow: use block-oriented interface · 679ccd1c
      Michal Terepeta authored
      This introduces the new interface for dataflow analysis, where transfer
      functions operate on a whole basic block.
      
      The main changes are:
      - Hoopl.Dataflow: implement the new interface and remove the old code;
        expose a utility function to do a strict fold over the nodes of a
        basic block (for analyses that do want to look at all the nodes)
      - Refactor all the analyses to use the new interface.
      
      One of the nice effects is that we can remove the `analyzeFwdBlocks`
      hack that ignored the middle nodes (that existed for analyses that
      didn't need to go over all the nodes). Now this is no longer a special
      case and fits well with the new interface.
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan:
      validate, earlier version of the patch had assertions
      comparing the results with the old implementation
      
      Reviewers: erikd, austin, simonmar, hvr, goldfire, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: goldfire, erikd, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2754
      679ccd1c
  25. 02 Nov, 2016 1 commit
    • Michal Terepeta's avatar
      Hoopl/Dataflow: make the module more self-contained · dc4d5962
      Michal Terepeta authored
      This makes the GHC's Dataflow module more self-contained by also
      forking the `DataflowLattice` (instead of only the analysis
      algorithm). Effects/benefits:
      - We no longer need to use the deprecated Hoopl functions (and can
        remove `-fno-warn-warnings-deprecations` from two modules).
      - We can remove the unnecessary `Label` parameter of `JoinFun` (already
        ignored in all our implementations).
      - We no longer mix Hoopl's `Dataflow` module and GHC's one.
      - We can replace some calls to lazy folds in Hoopl with the strict ones
        (see `joinOutFacts` and `mkFactBase`).
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan: validate
      
      Reviewers: austin, simonmar, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2660
      dc4d5962
  26. 26 Oct, 2016 1 commit
  27. 23 Jun, 2016 1 commit
    • niteria's avatar
      Provide Uniquable version of SCC · 35d1564c
      niteria authored
      We want to remove the `Ord Unique` instance because there's
      no way to implement it in deterministic way and it's too
      easy to use by accident.
      
      We sometimes compute SCC for datatypes whose Ord instance
      is implemented in terms of Unique. The Ord constraint on
      SCC is just an artifact of some internal data structures.
      We can have an alternative implementation with a data
      structure that uses Uniquable instead.
      
      This does exactly that and I'm pleased that I didn't have
      to introduce any duplication to do that.
      
      Test Plan:
      ./validate
      I looked at performance tests and it's a tiny bit better.
      
      Reviewers: bgamari, simonmar, ezyang, austin, goldfire
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2359
      
      GHC Trac Issues: #4012
      35d1564c
  28. 12 Nov, 2015 1 commit
    • olsner's avatar
      Implement function-sections for Haskell code, #8405 · 4a32bf92
      olsner authored
      This adds a flag -split-sections that does similar things to
      -split-objs, but using sections in single object files instead of
      relying on the Satanic Splitter and other abominations. This is very
      similar to the GCC flags -ffunction-sections and -fdata-sections.
      
      The --gc-sections linker flag, which allows unused sections to actually
      be removed, is added to all link commands (if the linker supports it) so
      that space savings from having base compiled with sections can be
      realized.
      
      Supported both in LLVM and the native code-gen, in theory for all
      architectures, but really tested on x86 only.
      
      In the GHC build, a new SplitSections variable enables -split-sections
      for relevant parts of the build.
      
      Test Plan: validate with both settings of SplitSections
      
      Reviewers: dterei, Phyx, austin, simonmar, thomie, bgamari
      
      Reviewed By: simonmar, thomie, bgamari
      
      Subscribers: hsyl20, erikd, kgardas, thomie
      
      Differential Revision: https://phabricator.haskell.org/D1242
      
      GHC Trac Issues: #8405
      4a32bf92
  29. 10 Feb, 2015 1 commit
  30. 01 Aug, 2014 1 commit
  31. 15 May, 2014 1 commit
    • Herbert Valerio Riedel's avatar
      Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized, while following the convention, to
      
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      
      - Moreover, if the list of language extensions fit into a single
        `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
      
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
      23892440
  32. 03 Feb, 2014 1 commit
  33. 02 Feb, 2014 1 commit
  34. 22 Nov, 2013 1 commit
  35. 06 Apr, 2013 1 commit