1. 28 Nov, 2017 1 commit
    • Ben Gamari's avatar
      cmm: Use LocalBlockLabel instead of AsmTempLabel to represent blocks · 048a9138
      Ben Gamari authored
      blockLbl was originally changed in 8b007abb to
      use mkTempAsmLabel to fix an inconsistency resulting in #14221. However, this
      breaks the C code generator, which doesn't support AsmTempLabels (#14454).
      
      Instead let's try going the other direction: use a new CLabel variety,
      LocalBlockLabel. Then we can teach the C code generator to deal with
      these as well.
      048a9138
  2. 22 Nov, 2017 1 commit
  3. 15 Nov, 2017 1 commit
    • John Ky's avatar
      Add new mbmi and mbmi2 compiler flags · f5dc8ccc
      John Ky authored
      This adds support for the bit deposit and extraction operations provided
      by the BMI and BMI2 instruction set extensions on modern amd64 machines.
      
      Test Plan: Validate
      
      Reviewers: austin, simonmar, bgamari, hvr, goldfire, erikd
      
      Reviewed By: bgamari
      
      Subscribers: goldfire, erikd, trommler, newhoggy, rwbarton, thomie
      
      GHC Trac Issues: #14206
      
      Differential Revision: https://phabricator.haskell.org/D4063
      f5dc8ccc
  4. 30 Oct, 2017 1 commit
  5. 26 Sep, 2017 1 commit
  6. 19 Sep, 2017 2 commits
  7. 07 Sep, 2017 1 commit
  8. 24 Aug, 2017 1 commit
  9. 22 Aug, 2017 1 commit
  10. 23 Jun, 2017 1 commit
    • Michal Terepeta's avatar
      Hoopl: remove dependency on Hoopl package · 42eee6ea
      Michal Terepeta authored
      
      
      This copies the subset of Hoopl's functionality needed by GHC to
      `cmm/Hoopl` and removes the dependency on the Hoopl package.
      
      The main motivation for this change is the confusing/noisy interface
      between GHC and Hoopl:
      - Hoopl has `Label` which is GHC's `BlockId` but different than
        GHC's `CLabel`
      - Hoopl has `Unique` which is different than GHC's `Unique`
      - Hoopl has `Unique{Map,Set}` which are different than GHC's
        `Uniq{FM,Set}`
      - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
        needed just to filter the exposed functions (filter out some of the
        Hoopl's and add the GHC ones)
      With this change, we'll be able to simplify this significantly.
      It'll also be much easier to do invasive changes (Hoopl is a public
      package on Hackage with users that depend on the current behavior)
      
      This should introduce no changes in functionality - it merely
      copies the relevant code.
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan: ./validate
      
      Reviewers: austin, bgamari, simonmar
      
      Reviewed By: bgamari, simonmar
      
      Subscribers: simonpj, kavon, rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3616
      42eee6ea
  11. 07 Mar, 2017 1 commit
  12. 17 Feb, 2017 1 commit
    • Simon Peyton Jones's avatar
      Honour -dsuppress-uniques more thoroughly · 8d401e50
      Simon Peyton Jones authored
      I found that tests
        parser/should_compile/DumpRenamedAst
      and friends were printing uniques, which makes the test fragile.
      But -dsuppress-uniques made no difference!  It turned out that
      pprName wasn't properly consulting Opt_SuppressUniques.
      
      This patch fixes the problem, and updates those three tests to
      use -dsuppress-uniques
      8d401e50
  13. 26 Jan, 2017 1 commit
  14. 08 Dec, 2016 1 commit
  15. 06 Dec, 2016 1 commit
  16. 16 Nov, 2016 1 commit
  17. 27 Jan, 2016 1 commit
    • olsner's avatar
      Restore original alignment for info tables · 0dc7b36c
      olsner authored
      This was broken in 4a32bf92, meaning
      that info tables and subsequent code are no longer guaranteed to have
      the recommended alignment.  Split up the section header and section
      alignment printers, and print an appropriate alignment directive before
      each info table.
      
      Fixes Trac #11486
      
      Reviewers: austin, bgamari, rwbarton
      
      Reviewed By: bgamari, rwbarton
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D1847
      
      GHC Trac Issues: #11486
      0dc7b36c
  18. 18 Jan, 2016 1 commit
    • Jan Stolarek's avatar
      Replace calls to `ptext . sLit` with `text` · b8abd852
      Jan Stolarek authored
      Summary:
      In the past the canonical way for constructing an SDoc string literal was the
      composition `ptext . sLit`.  But for some time now we have function `text` that
      does the same.  Plus it has some rules that optimize its runtime behaviour.
      This patch takes all uses of `ptext . sLit` in the compiler and replaces them
      with calls to `text`.  The main benefits of this patch are clener (shorter) code
      and less dependencies between module, because many modules now do not need to
      import `FastString`.  I don't expect any performance benefits - we mostly use
      SDocs to report errors and it seems there is little to be gained here.
      
      Test Plan: ./validate
      
      Reviewers: bgamari, austin, goldfire, hvr, alanz
      
      Subscribers: goldfire, thomie, mpickering
      
      Differential Revision: https://phabricator.haskell.org/D1784
      b8abd852
  19. 11 Dec, 2015 1 commit
    • eir@cis.upenn.edu's avatar
      Add kind equalities to GHC. · 67465497
      eir@cis.upenn.edu authored
      This implements the ideas originally put forward in
      "System FC with Explicit Kind Equality" (ICFP'13).
      
      There are several noteworthy changes with this patch:
       * We now have casts in types. These change the kind
         of a type. See new constructor `CastTy`.
      
       * All types and all constructors can be promoted.
         This includes GADT constructors. GADT pattern matches
         take place in type family equations. In Core,
         types can now be applied to coercions via the
         `CoercionTy` constructor.
      
       * Coercions can now be heterogeneous, relating types
         of different kinds. A coercion proving `t1 :: k1 ~ t2 :: k2`
         proves both that `t1` and `t2` are the same and also that
         `k1` and `k2` are the same.
      
       * The `Coercion` type has been significantly enhanced.
         The documentation in `docs/core-spec/core-spec.pdf` reflects
         the new reality.
      
       * The type of `*` is now `*`. No more `BOX`.
      
       * Users can write explicit kind variables in their code,
         anywhere they can write type variables. For backward compatibility,
         automatic inference of kind-variable binding is still permitted.
      
       * The new extension `TypeInType` turns on the new user-facing
         features.
      
       * Type families and synonyms are now promoted to kinds. This causes
         trouble with parsing `*`, leading to the somewhat awkward new
         `HsAppsTy` constructor for `HsType`. This is dispatched with in
         the renamer, where the kind `*` can be told apart from a
         type-level multiplication operator. Without `-XTypeInType` the
         old behavior persists. With `-XTypeInType`, you need to import
         `Data.Kind` to get `*`, also known as `Type`.
      
       * The kind-checking algorithms in TcHsType have been significantly
         rewritten to allow for enhanced kinds.
      
       * The new features are still quite experimental and may be in flux.
      
       * TODO: Several open tickets: #11195, #11196, #11197, #11198, #11203.
      
       * TODO: Update user manual.
      
      Tickets addressed: #9017, #9173, #7961, #10524, #8566, #11142.
      Updates Haddock submodule.
      67465497
  20. 18 Nov, 2015 1 commit
    • msosn's avatar
      Fix inconsistent pretty-printing of type families · c61759d5
      msosn authored
      After the changes, the three functions used to print type families
      were identical, so they are refactored into one.
      
      Original RHSs of data instance declarations are recreated and
      printed in user error messages.
      
      RHSs containing representation TyCons are printed in the
      Coercion Axioms section in a typechecker dump.
      
      Add vbar to the list of SDocs exported by Outputable.
      Replace all text "|" docs with it.
      
      Fixes #10839
      
      Reviewers: goldfire, jstolarek, austin, bgamari
      
      Reviewed By: jstolarek
      
      Subscribers: jstolarek, thomie
      
      Differential Revision: https://phabricator.haskell.org/D1441
      
      GHC Trac Issues: #10839
      c61759d5
  21. 12 Nov, 2015 1 commit
    • olsner's avatar
      Implement function-sections for Haskell code, #8405 · 4a32bf92
      olsner authored
      This adds a flag -split-sections that does similar things to
      -split-objs, but using sections in single object files instead of
      relying on the Satanic Splitter and other abominations. This is very
      similar to the GCC flags -ffunction-sections and -fdata-sections.
      
      The --gc-sections linker flag, which allows unused sections to actually
      be removed, is added to all link commands (if the linker supports it) so
      that space savings from having base compiled with sections can be
      realized.
      
      Supported both in LLVM and the native code-gen, in theory for all
      architectures, but really tested on x86 only.
      
      In the GHC build, a new SplitSections variable enables -split-sections
      for relevant parts of the build.
      
      Test Plan: validate with both settings of SplitSections
      
      Reviewers: dterei, Phyx, austin, simonmar, thomie, bgamari
      
      Reviewed By: simonmar, thomie, bgamari
      
      Subscribers: hsyl20, erikd, kgardas, thomie
      
      Differential Revision: https://phabricator.haskell.org/D1242
      
      GHC Trac Issues: #8405
      4a32bf92
  22. 31 Oct, 2015 1 commit
  23. 23 Sep, 2015 1 commit
    • Simon Marlow's avatar
      Annotate CmmBranch with an optional likely target · 939a7d63
      Simon Marlow authored
      Summary:
      This allows the code generator to give hints to later code generation
      steps about which branch is most likely to be taken.  Right now it
      is only taken into account in one place: a special case in
      CmmContFlowOpt that swapped branches over to maximise the chance of
      fallthrough, which is now disabled when there is a likelihood setting.
      
      Test Plan: validate
      
      Reviewers: austin, simonpj, bgamari, ezyang, tibbe
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D1273
      939a7d63
  24. 21 Aug, 2015 2 commits
    • thomie's avatar
      Refactor: delete most of the module FastTypes · 2f29ebbb
      thomie authored
      This reverses some of the work done in #1405, and goes back to the
      assumption that the bootstrap compiler understands GHC-haskell.
      
      In particular:
        * use MagicHash instead of _ILIT and _CLIT
        * pattern matching on I# if possible, instead of using iUnbox
          unnecessarily
        * use Int#/Char#/Addr# instead of the following type synonyms:
          - type FastInt   = Int#
          - type FastChar  = Char#
          - type FastPtr a = Addr#
        * inline the following functions:
          - iBox           = I#
          - cBox           = C#
          - fastChr        = chr#
          - fastOrd        = ord#
          - eqFastChar     = eqChar#
          - shiftLFastInt  = uncheckedIShiftL#
          - shiftR_FastInt = uncheckedIShiftRL#
          - shiftRLFastInt = uncheckedIShiftRL#
        * delete the following unused functions:
          - minFastInt
          - maxFastInt
          - uncheckedIShiftRA#
          - castFastPtr
          - panicDocFastInt and pprPanicFastInt
        * rename panicFastInt back to panic#
      
      These functions remain, since they actually do something:
        * iUnbox
        * bitAndFastInt
        * bitOrFastInt
      
      Test Plan: validate
      
      Reviewers: austin, bgamari
      
      Subscribers: rwbarton
      
      Differential Revision: https://phabricator.haskell.org/D1141
      
      GHC Trac Issues: #1405
      2f29ebbb
    • thomie's avatar
      Delete FastBool · 3452473b
      thomie authored
      This reverses some of the work done in Trac #1405, and assumes GHC is
      smart enough to do its own unboxing of booleans now.
      
      I would like to do some more performance measurements, but the code
      changes can be reviewed already.
      
      Test Plan:
      With a perf build:
      ./inplace/bin/ghc-stage2 nofib/spectral/simple/Main.hs -fforce-recomp
      +RTS -t --machine-readable
      
      before:
      ```
        [("bytes allocated", "1300744864")
        ,("num_GCs", "302")
        ,("average_bytes_used", "8811118")
        ,("max_bytes_used", "24477464")
        ,("num_byte_usage_samples", "9")
        ,("peak_megabytes_allocated", "64")
        ,("init_cpu_seconds", "0.001")
        ,("init_wall_seconds", "0.001")
        ,("mutator_cpu_seconds", "2.833")
        ,("mutator_wall_seconds", "4.283")
        ,("GC_cpu_seconds", "0.960")
        ,("GC_wall_seconds", "0.961")
        ]
      ```
      
      after:
      ```
        [("bytes allocated", "1301088064")
        ,("num_GCs", "310")
        ,("average_bytes_used", "8820253")
        ,("max_bytes_used", "24539904")
        ,("num_byte_usage_samples", "9")
        ,("peak_megabytes_allocated", "64")
        ,("init_cpu_seconds", "0.001")
        ,("init_wall_seconds", "0.001")
        ,("mutator_cpu_seconds", "2.876")
        ,("mutator_wall_seconds", "4.474")
        ,("GC_cpu_seconds", "0.965")
        ,("GC_wall_seconds", "0.979")
        ]
      ```
      
      CPU time seems to be up a bit, but I'm not sure. Unfortunately CPU time
      measurements are rather noisy.
      
      Reviewers: austin, bgamari, rwbarton
      
      Subscribers: nomeata
      
      Differential Revision: https://phabricator.haskell.org/D1143
      
      GHC Trac Issues: #1405
      3452473b
  25. 07 Jul, 2015 1 commit
  26. 16 Jun, 2015 1 commit
  27. 30 Mar, 2015 1 commit
    • Joachim Breitner's avatar
      Refactor the story around switches (#10137) · de1160be
      Joachim Breitner authored
      This re-implements the code generation for case expressions at the Stg →
      Cmm level, both for data type cases as well as for integral literal
      cases. (Cases on float are still treated as before).
      
      The goal is to allow for fancier strategies in implementing them, for a
      cleaner separation of the strategy from the gritty details of Cmm, and
      to run this later than the Common Block Optimization, allowing for one
      way to attack #10124. The new module CmmSwitch contains a number of
      notes explaining this changes. For example, it creates larger
      consecutive jump tables than the previous code, if possible.
      
      nofib shows little significant overall improvement of runtime. The
      rather large wobbling comes from changes in the code block order
      (see #8082, not much we can do about it). But the decrease in code size
      alone makes this worthwhile.
      
      ```
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
                  Min          -1.8%      0.0%     -6.1%     -6.1%     -2.9%
                  Max          -0.7%     +0.0%     +5.6%     +5.7%     +7.8%
       Geometric Mean          -1.4%     -0.0%     -0.3%     -0.3%     +0.0%
      ```
      
      Compilation time increases slightly:
      ```
              -1 s.d.                -----            -2.0%
              +1 s.d.                -----            +2.5%
              Average                -----            +0.3%
      ```
      
      The test case T783 regresses a lot, but it is the only one exhibiting
      any regression. The cause is the changed order of branches in an
      if-then-else tree, which makes the hoople data flow analysis traverse
      the blocks in a suboptimal order. Reverting that gets rid of this
      regression, but has a consistent, if only very small (+0.2%), negative
      effect on runtime. So I conclude that this test is an extreme outlier
      and no reason to change the code.
      
      Differential Revision: https://phabricator.haskell.org/D720
      de1160be
  28. 06 Jan, 2015 1 commit
  29. 16 Dec, 2014 3 commits
    • Peter Wortmann's avatar
      Add unwind information to Cmm · 711a51ad
      Peter Wortmann authored
      Unwind information allows the debugger to discover more information
      about a program state, by allowing it to "reconstruct" other states of
      the program. In practice, this means that we explain to the debugger
      how to unravel stack frames, which comes down mostly to explaining how
      to find their Sp and Ip register values.
      
      * We declare yet another new constructor for CmmNode - and this time
        there's actually little choice, as unwind information can and will
        change mid-block. We don't actually make use of these capabilities,
        and back-end support would be tricky (generate new labels?), but it
        feels like the right way to do it.
      
      * Even though we only use it for Sp so far, we allow CmmUnwind to specify
        unwind information for any register. This is pretty cheap and could
        come in useful in future.
      
      * We allow full CmmExpr expressions for specifying unwind values. The
        advantage here is that we don't have to make up new syntax, and can e.g.
        use the WDS macro directly. On the other hand, the back-end will now
        have to simplify the expression until it can sensibly be converted
        into DWARF byte code - a process which might fail, yielding NCG panics.
        On the other hand, when you're writing Cmm by hand you really ought to
        know what you're doing.
      
      (From Phabricator D169)
      711a51ad
    • Peter Wortmann's avatar
      Tick scopes · 5fecd767
      Peter Wortmann authored
      This patch solves the scoping problem of CmmTick nodes: If we just put
      CmmTicks into blocks we have no idea what exactly they are meant to
      cover.  Here we introduce tick scopes, which allow us to create
      sub-scopes and merged scopes easily.
      
      Notes:
      
      * Given that the code often passes Cmm around "head-less", we have to
        make sure that its intended scope does not get lost. To keep the amount
        of passing-around to a minimum we define a CmmAGraphScoped type synonym
        here that just bundles the scope with a portion of Cmm to be assembled
        later.
      
      * We introduce new scopes at somewhat random places, aligning with
        getCode calls. This works surprisingly well, but we might have to
        add new scopes into the mix later on if we find things too be too
        coarse-grained.
      
      (From Phabricator D169)
      5fecd767
    • Peter Wortmann's avatar
      Source notes (Cmm support) · 7ceaf96f
      Peter Wortmann authored
      This patch adds CmmTick nodes to Cmm code. This is relatively
      straight-forward, but also not very useful, as many blocks will simply
      end up with no annotations whatosever.
      
      Notes:
      
      * We use this design over, say, putting ticks into the entry node of all
        blocks, as it seems to work better alongside existing optimisations.
        Now granted, the reason for this is that currently GHC's main Cmm
        optimisations seem to mainly reorganize and merge code, so this might
        change in the future.
      
      * We have the Cmm parser generate a few source notes as well. This is
        relatively easy to do - worst part is that it complicates the CmmParse
        implementation a bit.
      
      (From Phabricator D169)
      7ceaf96f
  30. 19 Nov, 2014 1 commit
  31. 20 Oct, 2014 1 commit
  32. 19 Oct, 2014 1 commit
  33. 02 Oct, 2014 1 commit
    • Edward Z. Yang's avatar
      Place static closures in their own section. · b23ba2a7
      Edward Z. Yang authored
      Summary:
      The primary reason for doing this is assisting debuggability:
      if static closures are all in the same section, they are
      guaranteed to be adjacent to one another.  This will help
      later when we add some code that takes section start/end and
      uses this to sanity-check the sections.
      
      Part of remove HEAP_ALLOCED patch set (#8199
      
      )
      Signed-off-by: Edward Z. Yang's avatarEdward Z. Yang <ezyang@mit.edu>
      
      Test Plan: validate
      
      Reviewers: simonmar, austin
      
      Subscribers: simonmar, ezyang, carter, thomie
      
      Differential Revision: https://phabricator.haskell.org/D263
      
      GHC Trac Issues: #8199
      b23ba2a7
  34. 23 Aug, 2014 1 commit
    • rwbarton's avatar
      Add MO_AddIntC, MO_SubIntC MachOps and implement in X86 backend · cfd08a99
      rwbarton authored
      Summary:
      These MachOps are used by addIntC# and subIntC#, which in turn are
      used in integer-gmp when adding or subtracting small Integers. The
      following benchmark shows a ~6% speedup after this commit on x86_64
      (building GHC with BuildFlavour=perf).
      
          {-# LANGUAGE MagicHash #-}
      
          import GHC.Exts
          import Criterion.Main
      
          count :: Int -> Integer
          count (I# n#) = go n# 0
            where go :: Int# -> Integer -> Integer
                  go 0# acc = acc
                  go n# acc = go (n# -# 1#) $! acc + 1
      
          main = defaultMain [bgroup "count"
                                [bench "100" $ whnf count 100]]
      
      Differential Revision: https://phabricator.haskell.org/D140
      cfd08a99
  35. 14 Aug, 2014 1 commit
    • Herbert Valerio Riedel's avatar
      Implement new CLZ and CTZ primops (re #9340) · e0c1767d
      Herbert Valerio Riedel authored
      This implements the new primops
      
        clz#, clz32#, clz64#,
        ctz#, ctz32#, ctz64#
      
      which provide efficient implementations of the popular
      count-leading-zero and count-trailing-zero respectively
      (see testcase for a pure Haskell reference implementation).
      
      On x86, NCG as well as LLVM generates code based on the BSF/BSR
      instructions (which need extra logic to make the 0-case well-defined).
      
      Test Plan: validate and succesful tests on i686 and amd64
      
      Reviewers: rwbarton, simonmar, ezyang, austin
      
      Subscribers: simonmar, relrod, ezyang, carter
      
      Differential Revision: https://phabricator.haskell.org/D144
      
      GHC Trac Issues: #9340
      e0c1767d
  36. 20 Jul, 2014 1 commit