1. 22 Feb, 2020 1 commit
  2. 25 Jan, 2020 1 commit
  3. 09 Sep, 2019 1 commit
    • Module hierarchy: StgToCmm (#13009) · 447864a9
      Sylvain Henry authored
      Add StgToCmm module hierarchy. Platform modules that are used in several
      other places (NCG, LLVM codegen, Cmm transformations) are put into
      GHC.Platform.
      447864a9
  4. 15 Aug, 2019 1 commit
    • Remove unused imports of the form 'import foo ()' (Fixes #17065) · ca71d551
      James Foster authored
      These kinds of imports are necessary in some cases such as
      importing instances of typeclasses or intentionally creating
      dependencies in the build system, but '-Wunused-imports' can't
      detect when they are no longer needed. This commit removes the
      unused ones currently in the code base (not including test files
      or submodules), with the hope that doing so may increase
      parallelism in the build system by removing unnecessary
      dependencies.
      ca71d551
  5. 20 Jun, 2019 1 commit
    • Move 'Platform' to ghc-boot · bff2f24b
      John Ericson authored
      ghc-pkg needs to be aware of platforms so it can figure out which
      subdir within the user package db to use. This is admittedly
      roundabout, but maybe Cabal could use the same notion of a platform as
      GHC to good effect too.
      bff2f24b
  6. 15 Mar, 2019 1 commit
  7. 30 Aug, 2018 1 commit
  8. 21 Aug, 2018 1 commit
    • Replace most occurrences of foldl with foldl'. · 09c1d5af
      Andreas Klebinger authored
      This patch adds foldl' to GhcPrelude and changes most occurrences
      of foldl to foldl'. This leads to better performance, especially
      for quick builds where GHC does not perform strictness analysis.
      
      It does change strictness behaviour when we use foldl' to turn
      an argument list into function applications. But this is only a
      drawback if code looks ONLY at the last argument but not at the first.
      And, as the benchmarks show, it leads to fewer allocations in practice
      at O2.
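      
      As a rough illustration of the pattern (a sketch only: `Expr`, `mkApps`
      and `sumStrict` are hypothetical names, not code from this patch), the
      following uses foldl' both for a plain accumulation and for turning an
      argument list into nested applications:
      
      ```haskell
      import Data.List (foldl')

      -- Toy expression type (hypothetical; GHC's real expression types are richer).
      data Expr = Var String | App Expr Expr
        deriving (Eq, Show)

      -- Turn a function and an argument list into left-nested applications.
      -- foldl' forces each intermediate accumulator to weak head normal form,
      -- whereas foldl would first build a chain of unevaluated thunks.
      mkApps :: Expr -> [Expr] -> Expr
      mkApps = foldl' App

      -- Summation is the classic case where foldl' avoids thunk build-up.
      sumStrict :: [Int] -> Int
      sumStrict = foldl' (+) 0

      main :: IO ()
      main = do
        print (mkApps (Var "f") [Var "x", Var "y"])
        print (sumStrict [1 .. 100])
      ```
      
      With lazy foldl the outermost application (holding the last argument)
      is available before any inner application is evaluated, which is the
      strictness difference mentioned above.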
      
      Compiler performance for Nofib:
      
      O2 Allocations:
              -1 s.d.                -----            -0.0%
              +1 s.d.                -----            -0.0%
              Average                -----            -0.0%
      
      O2 Compile Time:
              -1 s.d.                -----            -2.8%
              +1 s.d.                -----            +1.3%
              Average                -----            -0.8%
      
      O0 Allocations:
              -1 s.d.                -----            -0.2%
              +1 s.d.                -----            -0.1%
              Average                -----            -0.2%
      
      Test Plan: ci
      
      Reviewers: goldfire, bgamari, simonmar, tdammers, monoidal
      
      Reviewed By: bgamari, monoidal
      
      Subscribers: tdammers, rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4929
      09c1d5af
  9. 19 Mar, 2018 2 commits
    • Hoopl: improve postorder calculation · bbcea13a
      Michal Terepeta authored
      - Fix the naming and comments to indicate that we are calculating
        *reverse* postorder (and not the standard postorder).
      
      - Rewrite the calculation to avoid CPS code. I found it fairly
        difficult to understand and the new one seems faster (according to
        nofib, decreases compiler allocations by 0.2%)
      
      - Remove `LabelsPtr`, which seems unnecessary and could be *really*
        confusing. For instance, previously:
        `postorder_dfs_from <block with label X>`
        and
        `postorder_dfs_from <label X>`
        would actually mean quite different things (and give different
        results).
      
      - Change the `Dataflow` module to always use entry of the graph for
        reverse postorder calculation. This should be the only change in
        behavior of this commit.
      
        Previously, if the caller provided initial facts for some of the
        labels, we would use those labels for our postorder calculation.
        However, I don't think that's correct in general - if the initial
        facts did not contain the entry of the graph, we would never analyze
        the blocks reachable from the entry but unreachable from the labels
        provided with the initial facts. It seems that the only analysis that
        used this was proc-point analysis, which I think would always include
        the entry block (so I don't think there's any bug due to this).
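      
      A minimal sketch of a reverse postorder calculation that starts from the
      graph entry and avoids CPS. This is not the code from this patch:
      `Label`, `Graph` and `revPostorderFrom` are hypothetical stand-ins, and
      the graph is assumed to be a plain map from a block label to its
      successors.
      
      ```haskell
      import Data.List (foldl')
      import qualified Data.Map.Strict as M
      import qualified Data.Set as S

      -- Hypothetical stand-ins for block labels and a control-flow graph.
      type Label = Int
      type Graph = M.Map Label [Label]   -- block label -> successor labels

      -- Reverse postorder of the blocks reachable from the entry label.
      -- A block is prepended only after all of its successors have been
      -- visited, so the accumulated list is already in reverse postorder.
      revPostorderFrom :: Graph -> Label -> [Label]
      revPostorderFrom g entry = snd (visit (S.empty, []) entry)
        where
          visit (seen, acc) l
            | l `S.member` seen = (seen, acc)
            | otherwise =
                let succs         = M.findWithDefault [] l g
                    (seen', acc') = foldl' visit (S.insert l seen, acc) succs
                in  (seen', l : acc')

      exampleGraph :: Graph
      exampleGraph = M.fromList [(1, [2, 3]), (2, [4]), (3, [4]), (4, []), (5, [1])]

      main :: IO ()
      main = print (revPostorderFrom exampleGraph 1)
      ```
      
      Block 5 is unreachable from the entry and therefore never appears in the
      result, which mirrors the point above about always starting from the
      entry of the graph.
      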
      Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan: ./validate
      
      Reviewers: bgamari, simonmar
      
      Reviewed By: simonmar
      
      Subscribers: rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4464
      bbcea13a
    • Be more selective in which conditionals we invert · 39c74063
      Simon Marlow authored
      Test Plan: validate
      
      Reviewers: bgamari, AndreasK, erikd
      
      Reviewed By: AndreasK
      
      Subscribers: rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4398
      39c74063
  10. 06 Mar, 2018 1 commit
  11. 18 Feb, 2018 1 commit
  12. 29 Jan, 2018 1 commit
  13. 03 Nov, 2017 1 commit
    • CmmSink: Use an IntSet instead of a list · 43537568
      alexbiehl authored
      CmmProcs which have *lots* of local variables take a considerable
      amount of time in CmmSink. This was noticed by @tdammers in #7258
      while compiling files with large records (~200-400 fields).
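      
      The general idea can be sketched as follows. This is an illustration
      under assumptions, not the code changed in CmmSink: `LocalReg`, `regKey`,
      `isLiveList` and `isLiveSet` are hypothetical, and registers are assumed
      to be identified by an Int key.
      
      ```haskell
      import qualified Data.IntSet as IS

      -- Hypothetical stand-in for a local register, identified by an Int key.
      newtype LocalReg = LocalReg Int
        deriving (Eq, Show)

      regKey :: LocalReg -> Int
      regKey (LocalReg k) = k

      -- List-based membership: linear in the number of registers per query.
      isLiveList :: [LocalReg] -> LocalReg -> Bool
      isLiveList live r = r `elem` live

      -- IntSet-based membership: bounded by the number of bits in an Int.
      isLiveSet :: IS.IntSet -> LocalReg -> Bool
      isLiveSet live r = regKey r `IS.member` live

      mkLiveSet :: [LocalReg] -> IS.IntSet
      mkLiveSet = IS.fromList . map regKey

      main :: IO ()
      main = do
        let regs = map LocalReg [1 .. 400]
        print (isLiveList regs (LocalReg 400))
        print (isLiveSet (mkLiveSet regs) (LocalReg 400))
      ```
      
      With hundreds of local variables, repeated linear scans of a list add up
      quickly, while the set lookup cost stays roughly constant.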
      
      Before:
      
      ```
              Sun Oct 29 19:58 2017 Time and Allocation Profiling Report (Final)
      
                 ghc-stage2 +RTS -p -RTS
      -B/Users/alexbiehl/git/ghc/inplace/lib /Users/alexbiehl/Downloads/W2.hs
      -fforce-recomp -O2
      
              total time  =       26.00 secs   (25996 ticks @ 1000 us, 1 processor)
              total alloc = 14,921,627,912 bytes  (excludes profiling overheads)
      
      COST CENTRE     MODULE      SRC                                                 %time %alloc

      sink            CmmPipeline compiler/cmm/CmmPipeline.hs:(104,13)-(105,59)        55.7   15.9
      SimplTopBinds   SimplCore   compiler/simplCore/SimplCore.hs:761:39-74            19.5   30.6
      FloatOutwards   SimplCore   compiler/simplCore/SimplCore.hs:471:40-66             4.2    9.0
      RegAlloc-linear AsmCodeGen  compiler/nativeGen/AsmCodeGen.hs:(658,27)-(660,55)    4.0   11.1
      pprNativeCode   AsmCodeGen  compiler/nativeGen/AsmCodeGen.hs:(529,37)-(530,65)    2.8    6.3
      NewStranal      SimplCore   compiler/simplCore/SimplCore.hs:480:40-63             1.6    3.7
      OccAnal         SimplCore   compiler/simplCore/SimplCore.hs:(739,22)-(740,67)     1.5    3.5
      StgCmm          HscMain     compiler/main/HscMain.hs:(1426,13)-(1427,62)          1.2    2.4
      regLiveness     AsmCodeGen  compiler/nativeGen/AsmCodeGen.hs:(591,17)-(593,52)    1.2    1.9
      genMachCode     AsmCodeGen  compiler/nativeGen/AsmCodeGen.hs:(580,17)-(582,62)    0.9    1.8
      NativeCodeGen   CodeOutput  compiler/main/CodeOutput.hs:171:18-78                 0.9    2.1
      CoreTidy        HscMain     compiler/main/HscMain.hs:1253:27-67                   0.8    1.9
      ```
      
      After:
      
      ```
              Sun Oct 29 19:18 2017 Time and Allocation Profiling Report (Final)
      
                 ghc-stage2 +RTS -p -RTS
      -B/Users/alexbiehl/git/ghc/inplace/lib /Users/alexbiehl/Downloads/W2.hs
      -fforce-recomp -O2
      
              total time  =       13.31 secs   (13307 ticks @ 1000 us, 1 processor)
              total alloc = 15,772,184,488 bytes  (excludes profiling overheads)
      
      COST CENTRE     MODULE         SRC                                                 %time %alloc

      SimplTopBinds   SimplCore      compiler/simplCore/SimplCore.hs:761:39-74            38.3   29.0
      sink            CmmPipeline    compiler/cmm/CmmPipeline.hs:(104,13)-(105,59)        13.2   20.3
      RegAlloc-linear AsmCodeGen     compiler/nativeGen/AsmCodeGen.hs:(658,27)-(660,55)    8.3   10.5
      FloatOutwards   SimplCore      compiler/simplCore/SimplCore.hs:471:40-66             8.1    8.5
      pprNativeCode   AsmCodeGen     compiler/nativeGen/AsmCodeGen.hs:(529,37)-(530,65)    5.4    5.9
      NewStranal      SimplCore      compiler/simplCore/SimplCore.hs:480:40-63             3.1    3.5
      OccAnal         SimplCore      compiler/simplCore/SimplCore.hs:(739,22)-(740,67)     2.9    3.3
      StgCmm          HscMain        compiler/main/HscMain.hs:(1426,13)-(1427,62)          2.3    2.3
      regLiveness     AsmCodeGen     compiler/nativeGen/AsmCodeGen.hs:(591,17)-(593,52)    2.1    1.8
      NativeCodeGen   CodeOutput     compiler/main/CodeOutput.hs:171:18-78                 1.7    2.0
      genMachCode     AsmCodeGen     compiler/nativeGen/AsmCodeGen.hs:(580,17)-(582,62)    1.6    1.7
      CoreTidy        HscMain        compiler/main/HscMain.hs:1253:27-67                   1.4    1.8
      foldNodesBwdOO  Hoopl.Dataflow compiler/cmm/Hoopl/Dataflow.hs:(397,1)-(403,17)       1.1    0.8
      ```
      
      Reviewers: austin, bgamari, simonmar
      
      Reviewed By: bgamari
      
      Subscribers: duog, rwbarton, thomie, tdammers
      
      GHC Trac Issues: #7258
      
      Differential Revision: https://phabricator.haskell.org/D4145
      43537568
  14. 19 Sep, 2017 1 commit
    • compiler: introduce custom "GhcPrelude" Prelude · f63bc730
      Herbert Valerio Riedel authored
      This switches the compiler/ component to get compiled with
      -XNoImplicitPrelude, and an `import GhcPrelude` is inserted in all
      modules.
      
      This is motivated by the upcoming "Prelude" re-export of
      `Semigroup((<>))`, which would cause lots of name clashes in every
      module that also imports `Outputable`.
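      
      A single-file sketch of the mechanism, under stated assumptions: the
      explicit `import Prelude hiding ((<>))` stands in for `import GhcPrelude`,
      and `Doc`, its `(<>)` and `render` are hypothetical stand-ins for the
      pretty-printing combinators that `Outputable` provides.
      
      ```haskell
      {-# LANGUAGE NoImplicitPrelude #-}

      -- With NoImplicitPrelude nothing from Prelude is in scope unless it is
      -- imported explicitly. This import plays the role of a custom prelude
      -- that does not re-export (<>).
      import Prelude hiding ((<>))

      -- Hypothetical document type with its own (<>) combinator.
      newtype Doc = Doc String

      (<>) :: Doc -> Doc -> Doc
      Doc a <> Doc b = Doc (a ++ b)

      render :: Doc -> String
      render (Doc s) = s

      main :: IO ()
      main = putStrLn (render (Doc "foo" <> Doc "bar"))
      ```
      
      Because the implicit Prelude import is off, the locally defined `(<>)` is
      the only one in scope and no qualification is needed at use sites.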
      
      Reviewers: austin, goldfire, bgamari, alanz, simonmar
      
      Reviewed By: bgamari
      
      Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
      
      Differential Revision: https://phabricator.haskell.org/D3989
      f63bc730
  15. 23 Jun, 2017 1 commit
    • Hoopl: remove dependency on Hoopl package · 42eee6ea
      Michal Terepeta authored
      This copies the subset of Hoopl's functionality needed by GHC to
      `cmm/Hoopl` and removes the dependency on the Hoopl package.
      
      The main motivation for this change is the confusing/noisy interface
      between GHC and Hoopl:
      - Hoopl has `Label` which is GHC's `BlockId` but different than
        GHC's `CLabel`
      - Hoopl has `Unique` which is different than GHC's `Unique`
      - Hoopl has `Unique{Map,Set}` which are different than GHC's
        `Uniq{FM,Set}`
      - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
        needed just to filter the exposed functions (filter out some of the
        Hoopl's and add the GHC ones)
      With this change, we'll be able to simplify this significantly.
      It'll also be much easier to do invasive changes (Hoopl is a public
      package on Hackage with users that depend on the current behavior)
      
      This should introduce no changes in functionality - it merely
      copies the relevant code.
      Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan: ./validate
      
      Reviewers: austin, bgamari, simonmar
      
      Reviewed By: bgamari, simonmar
      
      Subscribers: simonpj, kavon, rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3616
      42eee6ea
  16. 28 Apr, 2017 1 commit
    • Improve code generation for conditionals · 6d14c148
      Simon Peyton Jones authored
      This patch is in preparation for the fix to Trac #13397.
      
      The code generator has a special case for
        case tagToEnum (a>#b) of
          False -> e1
          True  -> e2
      
      but it was not doing nearly so well on
        case a>#b of
          DEFAULT -> e1
          1#      -> e2
      
      This patch arranges to behave essentially identically in
      both cases.  In due course we can eliminate the special
      case for tagToEnum#, once we've completed Trac #13397.
      
      The changes are:
      
      * Make CmmSink swizzle the order of a conditional where necessary;
        see Note [Improving conditionals] in CmmSink
      
      * Hack the general case of StgCmmExpr.cgCase so that it uses
        NoGcInAlts for conditionals.  This doesn't seem right, but it's
        the same choice as the tagToEnum version. Without it, code size
        increases a lot (more heap checks).
      
        There's a loose end here.
      
      * Add comments in CmmOpt.cmmMachOpFoldM
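      
      One way to picture the first change above (swizzling a conditional) is
      the toy sketch below. It is an assumption about the shape of the rewrite,
      not the CmmSink code: `Expr`, `Branch` and `improveConditional` here are
      hypothetical and far simpler than the real Cmm data types.
      
      ```haskell
      -- Toy IR (hypothetical; not GHC's Cmm data types).
      data Expr
        = Reg String
        | Lit Int
        | Gt Expr Expr   -- comparison yielding 1 (true) or 0 (false)
        | Ne Expr Expr   -- "not equal", also yielding 1 or 0
        deriving (Eq, Show)

      -- Condition, label taken when true, label taken when false.
      data Branch = CondBranch Expr String String
        deriving (Eq, Show)

      isComparison :: Expr -> Bool
      isComparison (Gt _ _) = True
      isComparison (Ne _ _) = True
      isComparison _        = False

      -- If the condition is "cmp /= 1" and cmp is itself a comparison, branch
      -- on cmp directly and swap the two targets. Anything else is unchanged.
      improveConditional :: Branch -> Branch
      improveConditional (CondBranch (Ne cmp (Lit 1)) t f)
        | isComparison cmp = CondBranch cmp f t
      improveConditional b = b

      main :: IO ()
      main = print (improveConditional
                      (CondBranch (Ne (Gt (Reg "a") (Reg "b")) (Lit 1)) "e1" "e2"))
      ```
      
      In this sketch the `DEFAULT -> e1; 1# -> e2` form ends up branching on
      `a > b` itself, which is the same test the tagToEnum special case uses.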
      6d14c148
  17. 09 Feb, 2017 1 commit
  18. 25 Jan, 2017 1 commit
  19. 08 Dec, 2016 1 commit
  20. 25 Jun, 2015 1 commit
  21. 21 Oct, 2014 1 commit
    • Fixes the ARM build · 69f63612
      Moritz Angermann authored
      Summary:
      CodeGen.Platform.hs was changed with the following diff:
      
         -#endif
          globalRegMaybe _                        = Nothing
         +#elif MACHREGS_NO_REGS
         +globalRegMaybe _ = Nothing
         +#else
         +globalRegMaybe = panic "globalRegMaybe not defined for this platform"
         +#endif
      
      which causes globalRegMaybe to panic for arch ARM.
      
      This patch ensures globalRegMaybe is not called on ARM.
      Signed-off-by: Moritz Angermann <moritz@lichtzwerge.de>
      
      Test Plan: Building arm cross-compiler (e.g. --target=arm-apple-darwin10)
      
      Reviewers: hvr, ezyang, simonmar, rwbarton, austin
      
      Reviewed By: austin
      
      Subscribers: dterei, bgamari, simonmar, ezyang, carter
      
      Differential Revision: https://phabricator.haskell.org/D208
      
      GHC Trac Issues: #9593
      69f63612
  22. 30 Jun, 2014 1 commit
    • Re-add more primops for atomic ops on byte arrays · 4ee4ab01
      tibbe authored
      This is the second attempt to add this functionality. The first
      attempt was reverted in 950fcae4, due
      to register allocator failure on x86. Given how the register
      allocator currently works, we don't have enough registers on x86 to
      support cmpxchg using complicated addressing modes. Instead we fall
      back to a simpler addressing mode on x86.
      
      Adds the following primops:
      
       * atomicReadIntArray#
       * atomicWriteIntArray#
       * fetchSubIntArray#
       * fetchOrIntArray#
       * fetchXorIntArray#
       * fetchAndIntArray#
      
      Makes these pre-existing out-of-line primops inline:
      
       * fetchAddIntArray#
       * casIntArray#
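      
      A small usage sketch for some of these primops, as they are exported from
      GHC.Exts in current GHC. The `Counter` wrapper and its helper functions
      are hypothetical and exist only to make the example self-contained.
      
      ```haskell
      {-# LANGUAGE MagicHash, UnboxedTuples #-}
      import GHC.Exts
      import GHC.IO (IO (..))

      -- Hypothetical wrapper around a byte array holding a single Int.
      data Counter = Counter (MutableByteArray# RealWorld)

      -- Allocate 8 bytes (enough for one Int on 32- and 64-bit platforms)
      -- and store the initial value with an atomic write.
      newCounter :: Int -> IO Counter
      newCounter (I# n) = IO $ \s0 ->
        case newByteArray# 8# s0 of
          (# s1, arr #) -> case atomicWriteIntArray# arr 0# n s1 of
            s2 -> (# s2, Counter arr #)

      -- Atomically add to element 0; yields the value before the addition.
      fetchAdd :: Counter -> Int -> IO Int
      fetchAdd (Counter arr) (I# d) = IO $ \s0 ->
        case fetchAddIntArray# arr 0# d s0 of
          (# s1, old #) -> (# s1, I# old #)

      -- Atomically read element 0.
      readCounter :: Counter -> IO Int
      readCounter (Counter arr) = IO $ \s0 ->
        case atomicReadIntArray# arr 0# s0 of
          (# s1, v #) -> (# s1, I# v #)

      main :: IO ()
      main = do
        c   <- newCounter 10
        old <- fetchAdd c 5
        new <- readCounter c
        print (old, new)
      ```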
      4ee4ab01
  23. 26 Jun, 2014 1 commit
  24. 24 Jun, 2014 1 commit
    • Add more primops for atomic ops on byte arrays · d8abf85f
      tibbe authored
      Summary:
      Add more primops for atomic ops on byte arrays
      
      Adds the following primops:
      
       * atomicReadIntArray#
       * atomicWriteIntArray#
       * fetchSubIntArray#
       * fetchOrIntArray#
       * fetchXorIntArray#
       * fetchAndIntArray#
      
      Makes these pre-existing out-of-line primops inline:
      
       * fetchAddIntArray#
       * casIntArray#
      d8abf85f
  25. 29 Apr, 2014 1 commit
  26. 13 Apr, 2014 1 commit
  27. 04 Apr, 2014 1 commit
  28. 17 Mar, 2014 1 commit
  29. 16 Jan, 2014 1 commit
  30. 25 Oct, 2013 1 commit
    • Discard dead assignments in tryToInline · 29be1a8a
      Simon Marlow authored
      Inlining global registers and constants made code slightly larger in
      some cases.  I finally got around to looking into why, and discovered
      one reason: we weren't discarding dead code in some cases.  This patch
      fixes it.
      29be1a8a
  31. 12 Oct, 2013 1 commit
  32. 20 Sep, 2013 1 commit
  33. 14 Sep, 2013 1 commit
  34. 12 Sep, 2013 1 commit
    • Improve sinking pass · ad15c2b4
      Jan Stolarek authored
      This commit does two things:
      
        * Allows duplicating global registers and literals by inlining
          them. Previously we would only inline a global register or literal
          if it was used only once.
      
        * Changes the method of determining conflicts between a node and an
          assignment. The new method has two advantages. It relies on the
          DefinerOfRegs and UserOfRegs typeclasses, so if the set of registers
          defined or used by a node should ever change, the `conflicts` function
          will use the changed definition. This definition also catches
          more cases than the previous one (namely CmmCall and CmmForeignCall),
          which is a step towards making it possible to run the sinking pass
          before stack layout (currently this doesn't work).
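      
      The typeclass-based conflict check can be pictured with the miniature
      sketch below. Everything in it is an assumption made for illustration:
      the two classes are simplified namesakes of the ones mentioned above, the
      `Node` and `Expr` types are toys, and memory conflicts are ignored.
      
      ```haskell
      import qualified Data.Set as S

      type Reg = String

      -- Simplified, hypothetical versions of the two classes named above.
      class UserOfRegs a where
        usedRegs :: a -> S.Set Reg

      class DefinerOfRegs a where
        definedRegs :: a -> S.Set Reg

      data Expr = Lit Int | RegE Reg | Add Expr Expr
        deriving (Eq, Show)

      -- Assign target rhs, or Call result arguments.
      data Node = Assign Reg Expr | Call Reg [Expr]
        deriving (Eq, Show)

      instance UserOfRegs Expr where
        usedRegs (Lit _)   = S.empty
        usedRegs (RegE r)  = S.singleton r
        usedRegs (Add a b) = usedRegs a `S.union` usedRegs b

      instance UserOfRegs Node where
        usedRegs (Assign _ e) = usedRegs e
        usedRegs (Call _ as)  = S.unions (map usedRegs as)

      instance DefinerOfRegs Node where
        definedRegs (Assign r _) = S.singleton r
        definedRegs (Call r _)   = S.singleton r

      -- An assignment (r = rhs) cannot be moved past a node if the node
      -- defines a register that rhs reads, or if the node reads r itself.
      conflicts :: (Reg, Expr) -> Node -> Bool
      conflicts (r, rhs) node = definesUsed || usesTarget
        where
          definesUsed = not (S.null (definedRegs node `S.intersection` usedRegs rhs))
          usesTarget  = r `S.member` usedRegs node

      main :: IO ()
      main = print (conflicts ("x", Add (RegE "y") (Lit 1)) (Call "y" []))
      ```
      
      Adding a new kind of node only requires new class instances; `conflicts`
      itself stays the same, which is the first advantage described above.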
      
      This patch also adds a lot of comments that are the result of an about
      two-week-long investigation of how the sinking pass works and why it does
      what it does.
      ad15c2b4
  35. 04 Sep, 2013 1 commit
  36. 03 Sep, 2013 1 commit
  37. 02 Sep, 2013 1 commit
  38. 19 Apr, 2013 1 commit
  39. 30 Oct, 2012 1 commit