1. 06 Mar, 2019 1 commit
    • Ben Gamari's avatar
      Rip out object splitting · 37f257af
      Ben Gamari authored
      The splitter is an evil Perl script that processes assembler code.
      Its job can be done better by the linker's --gc-sections flag. GHC
      passes this flag to the linker whenever -split-sections is passed on
      the command line.
      
      This is based on @DemiMarie's D2768.
      
      Fixes Trac #11315
      Fixes Trac #9832
      Fixes Trac #8964
      Fixes Trac #8685
      Fixes Trac #8629
      37f257af
  2. 15 Feb, 2019 1 commit
    • Andreas Klebinger's avatar
      Don't wrap the entry map for LiveInfo in Maybe. · bcaba30a
      Andreas Klebinger authored
      It never really encoded a invariant.
      
      * The linear register allocator just did partial pattern matches
      * The graph allocator just set it to (Just mapEmpty) for Nothing
      
      So I changed LiveInfo to directly contain the map.
      
      Further natCmmTopToLive which filled in Nothing is no longer exported.
      Instead we know call cmmTopLiveness which changes the type AND fills
      in the map.
      bcaba30a
  3. 08 Feb, 2019 1 commit
    • Andreas Klebinger's avatar
      Allow resizing the stack for the graph allocator. · 03b7abc1
      Andreas Klebinger authored
      The graph allocator now dynamically resizes the number of stack
      slots when running into the limit.
      
      This fixes #8657.
      
      Also loop membership of basic blocks is now available
      in the register allocator for cost heuristics.
      03b7abc1
  4. 17 Nov, 2018 1 commit
    • Andreas Klebinger's avatar
      NCG: New code layout algorithm. · 912fd2b6
      Andreas Klebinger authored
      Summary:
      This patch implements a new code layout algorithm.
      It has been tested for x86 and is disabled on other platforms.
      
      Performance varies slightly be CPU/Machine but in general seems to be better
      by around 2%.
      Nofib shows only small differences of about +/- ~0.5% overall depending on
      flags/machine performance in other benchmarks improved significantly.
      
      Other benchmarks includes at least the benchmarks of: aeson, vector, megaparsec, attoparsec,
      containers, text and xeno.
      
      While the magnitude of gains differed three different CPUs where tested with
      all getting faster although to differing degrees. I tested: Sandy Bridge(Xeon), Haswell,
      Skylake
      
      * Library benchmark results summarized:
        * containers: ~1.5% faster
        * aeson: ~2% faster
        * megaparsec: ~2-5% faster
        * xml library benchmarks: 0.2%-1.1% faster
        * vector-benchmarks: 1-4% faster
        * text: 5.5% faster
      
      On average GHC compile times go down, as GHC compiled with the new layout
      is faster than the overhead introduced by using the new layout algorithm,
      
      Things this patch does:
      
      * Move code responsilbe for block layout in it's own module.
      * Move the NcgImpl Class into the NCGMonad module.
      * Extract a control flow graph from the input cmm.
      * Update this cfg to keep it in sync with changes during
        asm codegen. This has been tested on x64 but should work on x86.
        Other platforms still use the old codelayout.
      * Assign weights to the edges in the CFG based on type and limited static
        analysis which are then used for block layout.
      * Once we have the final code layout eliminate some redundant jumps.
      
        In particular turn a sequences of:
            jne .foo
            jmp .bar
          foo:
        into
            je bar
          foo:
            ..
      
      Test Plan: ci
      
      Reviewers: bgamari, jmct, jrtc27, simonmar, simonpj, RyanGlScott
      
      Reviewed By: RyanGlScott
      
      Subscribers: RyanGlScott, trommler, jmct, carter, thomie, rwbarton
      
      GHC Trac Issues: #15124
      
      Differential Revision: https://phabricator.haskell.org/D4726
      912fd2b6
  5. 21 Aug, 2018 1 commit
    • Andreas Klebinger's avatar
      Replace most occurences of foldl with foldl'. · 09c1d5af
      Andreas Klebinger authored
      This patch adds foldl' to GhcPrelude and changes must occurences
      of foldl to foldl'. This leads to better performance especially
      for quick builds where GHC does not perform strictness analysis.
      
      It does change strictness behaviour when we use foldl' to turn
      a argument list into function applications. But this is only a
      drawback if code looks ONLY at the last argument but not at the first.
      And as the benchmarks show leads to fewer allocations in practice
      at O2.
      
      Compiler performance for Nofib:
      
      O2 Allocations:
              -1 s.d.                -----            -0.0%
              +1 s.d.                -----            -0.0%
              Average                -----            -0.0%
      
      O2 Compile Time:
              -1 s.d.                -----            -2.8%
              +1 s.d.                -----            +1.3%
              Average                -----            -0.8%
      
      O0 Allocations:
              -1 s.d.                -----            -0.2%
              +1 s.d.                -----            -0.1%
              Average                -----            -0.2%
      
      Test Plan: ci
      
      Reviewers: goldfire, bgamari, simonmar, tdammers, monoidal
      
      Reviewed By: bgamari, monoidal
      
      Subscribers: tdammers, rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4929
      09c1d5af
  6. 13 Apr, 2018 1 commit
  7. 28 Nov, 2017 2 commits
  8. 19 Sep, 2017 1 commit
    • Herbert Valerio Riedel's avatar
      compiler: introduce custom "GhcPrelude" Prelude · f63bc730
      Herbert Valerio Riedel authored
      This switches the compiler/ component to get compiled with
      -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all
      modules.
      
      This is motivated by the upcoming "Prelude" re-export of
      `Semigroup((<>))` which would cause lots of name clashes in every
      modulewhich imports also `Outputable`
      
      Reviewers: austin, goldfire, bgamari, alanz, simonmar
      
      Reviewed By: bgamari
      
      Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
      
      Differential Revision: https://phabricator.haskell.org/D3989
      f63bc730
  9. 22 Aug, 2017 1 commit
  10. 11 Jul, 2017 1 commit
    • Ben Gamari's avatar
      Use correct section types syntax for architecture · 9b9f978f
      Ben Gamari authored
      Previously GHC would always assume that section types began with `@` while
      producing assembly, which is not true. For instance, in ARM assembly syntax
      section types begin with `%`. This abstracts out section type pretty-printing
      and adjusts it to correctly account for the target architectures assembly
      flavor.
      
      Reviewers: austin, hvr, Phyx
      
      Reviewed By: Phyx
      
      Subscribers: Phyx, rwbarton, thomie, erikd
      
      GHC Trac Issues: #13937
      
      Differential Revision: https://phabricator.haskell.org/D3712
      9b9f978f
  11. 23 Jun, 2017 1 commit
    • Michal Terepeta's avatar
      Hoopl: remove dependency on Hoopl package · 42eee6ea
      Michal Terepeta authored
      This copies the subset of Hoopl's functionality needed by GHC to
      `cmm/Hoopl` and removes the dependency on the Hoopl package.
      
      The main motivation for this change is the confusing/noisy interface
      between GHC and Hoopl:
      - Hoopl has `Label` which is GHC's `BlockId` but different than
        GHC's `CLabel`
      - Hoopl has `Unique` which is different than GHC's `Unique`
      - Hoopl has `Unique{Map,Set}` which are different than GHC's
        `Uniq{FM,Set}`
      - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
        needed just to filter the exposed functions (filter out some of the
        Hoopl's and add the GHC ones)
      With this change, we'll be able to simplify this significantly.
      It'll also be much easier to do invasive changes (Hoopl is a public
      package on Hackage with users that depend on the current behavior)
      
      This should introduce no changes in functionality - it merely
      copies the relevant code.
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan: ./validate
      
      Reviewers: austin, bgamari, simonmar
      
      Reviewed By: bgamari, simonmar
      
      Subscribers: simonpj, kavon, rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3616
      42eee6ea
  12. 05 Apr, 2017 1 commit
  13. 08 Feb, 2017 1 commit
    • Ben Gamari's avatar
      Generalize CmmUnwind and pass unwind information through NCG · 3eb737ee
      Ben Gamari authored
      As discussed in D1532, Trac Trac #11337, and Trac Trac #11338, the stack
      unwinding information produced by GHC is currently quite approximate.
      Essentially we assume that register values do not change at all within a
      basic block. While this is somewhat true in normal Haskell code, blocks
      containing foreign calls often break this assumption. This results in
      unreliable call stacks, especially in the code containing foreign calls.
      This is worse than it sounds as unreliable unwinding information can at
      times result in segmentation faults.
      
      This patch set attempts to improve this situation by tracking unwinding
      information with finer granularity. By dispensing with the assumption of
      one unwinding table per block, we allow the compiler to accurately
      represent the areas surrounding foreign calls.
      
      Towards this end we generalize the representation of unwind information
      in the backend in three ways,
      
       * Multiple CmmUnwind nodes can occur per block
      
       * CmmUnwind nodes can now carry unwind information for multiple
         registers (while not strictly necessary; this makes emitting
         unwinding information a bit more convenient in the compiler)
      
       * The NCG backend is given an opportunity to modify the unwinding
         records since it may need to make adjustments due to, for instance,
         native calling convention requirements for foreign calls (see
         #11353).
      
      This sets the stage for resolving #11337 and #11338.
      
      Test Plan: Validate
      
      Reviewers: scpmw, simonmar, austin, erikd
      
      Subscribers: qnikst, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2741
      3eb737ee
  14. 10 Jan, 2017 1 commit
    • Rufflewind's avatar
      Fix terminal corruption bug and clean up SDoc interface. · 22845adc
      Rufflewind authored
      - Fix #13076 by wrapping `printDoc_` so that the terminal color is
        reset even if an exception occurs.
      
      - Add `printSDoc`, `printSDocLn`, and `bufLeftRenderSDoc` to keep `SDoc`
        values abstract (they are wrappers of `printDoc_`, `printDoc`, and
        `bufLeftRender` respectively).
      
      - Remove unused function: `printForAsm`
      
      Test Plan: manual
      
      Reviewers: RyanGlScott, austin, dfeuer, bgamari
      
      Reviewed By: dfeuer, bgamari
      
      Subscribers: dfeuer, mpickering, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2932
      
      GHC Trac Issues: #13076
      22845adc
  15. 08 Dec, 2016 1 commit
  16. 29 Nov, 2016 2 commits
  17. 16 Nov, 2016 1 commit
  18. 06 Aug, 2016 1 commit
  19. 07 Jul, 2016 1 commit
    • niteria's avatar
      Document some codegen nondeterminism · 6ed7c479
      niteria authored
      Bit-for-bit reproducible binaries are not a goal for now,
      so this is just marking places that could be a problem.
      Doing this will allow eltsUFM to be removed and will
      leave only nonDetEltsUFM.
      
      GHC Trac: #4012
      6ed7c479
  20. 30 Jun, 2016 1 commit
  21. 23 Jun, 2016 1 commit
    • niteria's avatar
      Provide Uniquable version of SCC · 35d1564c
      niteria authored
      We want to remove the `Ord Unique` instance because there's
      no way to implement it in deterministic way and it's too
      easy to use by accident.
      
      We sometimes compute SCC for datatypes whose Ord instance
      is implemented in terms of Unique. The Ord constraint on
      SCC is just an artifact of some internal data structures.
      We can have an alternative implementation with a data
      structure that uses Uniquable instead.
      
      This does exactly that and I'm pleased that I didn't have
      to introduce any duplication to do that.
      
      Test Plan:
      ./validate
      I looked at performance tests and it's a tiny bit better.
      
      Reviewers: bgamari, simonmar, ezyang, austin, goldfire
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2359
      
      GHC Trac Issues: #4012
      35d1564c
  22. 18 Jan, 2016 1 commit
    • Jan Stolarek's avatar
      Replace calls to `ptext . sLit` with `text` · b8abd852
      Jan Stolarek authored
      Summary:
      In the past the canonical way for constructing an SDoc string literal was the
      composition `ptext . sLit`.  But for some time now we have function `text` that
      does the same.  Plus it has some rules that optimize its runtime behaviour.
      This patch takes all uses of `ptext . sLit` in the compiler and replaces them
      with calls to `text`.  The main benefits of this patch are clener (shorter) code
      and less dependencies between module, because many modules now do not need to
      import `FastString`.  I don't expect any performance benefits - we mostly use
      SDocs to report errors and it seems there is little to be gained here.
      
      Test Plan: ./validate
      
      Reviewers: bgamari, austin, goldfire, hvr, alanz
      
      Subscribers: goldfire, thomie, mpickering
      
      Differential Revision: https://phabricator.haskell.org/D1784
      b8abd852
  23. 31 Dec, 2015 2 commits
  24. 19 Dec, 2015 1 commit
  25. 23 Nov, 2015 1 commit
  26. 12 Nov, 2015 1 commit
    • olsner's avatar
      Implement function-sections for Haskell code, #8405 · 4a32bf92
      olsner authored
      This adds a flag -split-sections that does similar things to
      -split-objs, but using sections in single object files instead of
      relying on the Satanic Splitter and other abominations. This is very
      similar to the GCC flags -ffunction-sections and -fdata-sections.
      
      The --gc-sections linker flag, which allows unused sections to actually
      be removed, is added to all link commands (if the linker supports it) so
      that space savings from having base compiled with sections can be
      realized.
      
      Supported both in LLVM and the native code-gen, in theory for all
      architectures, but really tested on x86 only.
      
      In the GHC build, a new SplitSections variable enables -split-sections
      for relevant parts of the build.
      
      Test Plan: validate with both settings of SplitSections
      
      Reviewers: dterei, Phyx, austin, simonmar, thomie, bgamari
      
      Reviewed By: simonmar, thomie, bgamari
      
      Subscribers: hsyl20, erikd, kgardas, thomie
      
      Differential Revision: https://phabricator.haskell.org/D1242
      
      GHC Trac Issues: #8405
      4a32bf92
  27. 17 Oct, 2015 1 commit
    • Herbert Valerio Riedel's avatar
      Make Monad/Applicative instances MRP-friendly · e8ed2136
      Herbert Valerio Riedel authored
      This patch refactors pure/(*>) and return/(>>) in MRP-friendly way, i.e.
      such that the explicit definitions for `return` and `(>>)` match the
      MRP-style default-implementation, i.e.
      
        return = pure
      
      and
      
        (>>) = (*>)
      
      This way, e.g. all `return = pure` definitions can easily be grepped and
      removed in GHC 8.1;
      
      Test Plan: Harbormaster
      
      Reviewers: goldfire, alanz, bgamari, quchen, austin
      
      Reviewed By: quchen, austin
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D1312
      e8ed2136
  28. 15 Oct, 2015 1 commit
  29. 07 Oct, 2015 1 commit
  30. 23 Sep, 2015 1 commit
    • Simon Marlow's avatar
      Annotate CmmBranch with an optional likely target · 939a7d63
      Simon Marlow authored
      Summary:
      This allows the code generator to give hints to later code generation
      steps about which branch is most likely to be taken.  Right now it
      is only taken into account in one place: a special case in
      CmmContFlowOpt that swapped branches over to maximise the chance of
      fallthrough, which is now disabled when there is a likelihood setting.
      
      Test Plan: validate
      
      Reviewers: austin, simonpj, bgamari, ezyang, tibbe
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D1273
      939a7d63
  31. 03 Jul, 2015 1 commit
    • Peter Trommler's avatar
      Implement PowerPC 64-bit native code backend for Linux · d3c1dda6
      Peter Trommler authored
      Extend the PowerPC 32-bit native code generator for "64-bit
      PowerPC ELF Application Binary Interface Supplement 1.9" by
      Ian Lance Taylor and "Power Architecture 64-Bit ELF V2 ABI Specification --
      OpenPOWER ABI for Linux Supplement" by IBM.
      The latter ABI is mainly used on POWER7/7+ and POWER8
      Linux systems running in little-endian mode. The code generator
      supports both static and dynamic linking. PowerPC 64-bit
      code for ELF ABI 1.9 and 2 is mostly position independent
      anyway, and thus so is all the code emitted by the code
      generator. In other words, -fPIC does not make a difference.
      
      rts/stg/SMP.h support is implemented.
      
      Following the spirit of the introductory comment in
      PPC/CodeGen.hs, the rest of the code is a straightforward
      extension of the 32-bit implementation.
      
      Limitations:
      * Code is generated only in the medium code model, which
        is also gcc's default
      * Local symbols are not accessed directly, which seems to
        also be the case for 32-bit
      * LLVM does not work, but this does not work on 32-bit either
      * Must use the system runtime linker in GHCi, because the
        GHC linker for "static" object files (rts/Linker.c) for
        PPC 64-bit is not implemented. The system runtime
        (dynamic) linker works.
      * The handling of the system stack (register 1) is not ELF-
        compliant so stack traces break. Instead of allocating a new
        stack frame, spill code should use the "official" spill area
        in the current stack frame and deallocation code should restore
        the back chain
      * DWARF support is missing
      
      Fixes #9863
      
      Test Plan: validate (on powerpc, too)
      
      Reviewers: simonmar, trofi, erikd, austin
      
      Reviewed By: trofi
      
      Subscribers: bgamari, arnons1, kgardas, thomie
      
      Differential Revision: https://phabricator.haskell.org/D629
      
      GHC Trac Issues: #9863
      d3c1dda6
  32. 16 May, 2015 1 commit
  33. 17 Dec, 2014 2 commits
    • Peter Wortmann's avatar
      Generate DWARF info section · cc481ec8
      Peter Wortmann authored
      This is where we actually make GHC emit DWARF code. The info section
      contains all the general meta information bits as well as an entry for
      every block of native code.
      
      Notes:
      
      * We need quite a few new labels in order to properly address starts
        and ends of blocks.
      
      * Thanks to Nathan Howell for taking the iniative to get our own Haskell
        language ID for DWARF!
      
      (From Phabricator D396)
      cc481ec8
    • Peter Wortmann's avatar
      Generate .loc/.file directives from source ticks · 64678e9e
      Peter Wortmann authored
      This generates DWARF, albeit indirectly using the assembler. This is
      the easiest (and, apparently, quite standard) method of generating the
      .debug_line DWARF section.
      
      Notes:
      
      * Note we have to make sure that .file directives appear correctly
        before the respective .loc. Right now we ppr them manually, which makes
        them absent from dumps. Fixing this would require .file to become a
        native instruction.
      
      * We have to pass a lot of things around the native code generator. I
        know Ian did quite a bit of refactoring already, but having one common
        monad could *really* simplify things here...
      
      * To support SplitObjcs, we need to emit/reset all DWARF data at every
        split. We use the occassion to move split marker generation to
        cmmNativeGenStream as well, so debug data extraction doesn't have to
        choke on it.
      
      (From Phabricator D396)
      64678e9e
  34. 16 Dec, 2014 1 commit
    • Peter Wortmann's avatar
      Debug data extraction (NCG support) · f46aa733
      Peter Wortmann authored
      The purpose of the Debug module is to collect all required information
      to generate debug information (DWARF etc.) in the back-ends. Our main
      data structure is the "debug block", which carries all information we have
      about a block of code that is going to get produced.
      
      Notes:
      
      * Debug blocks are arranged into a tree according to tick scopes. This
        makes it easier to reason about inheritance rules. Note however that
        tick scopes are not guaranteed to form a tree, which requires us to
        "copy" ticks to not lose them.
      
      * This is also where we decide what source location we regard as
        representing a code block the "best". The heuristic is basically that
        we want the most specific source reference that comes from the same file
        we are currently compiling. This seems to be the most useful choice in
        my experience.
      
      * We are careful to not be too lazy so we don't end up breaking streaming.
        Debug data will be kept alive until the end of codegen, after all.
      
      * We change native assembler dumps to happen right away for every Cmm group.
        This simplifies the code somewhat and is consistent with how pretty much
        all of GHC handles dumps with respect to streamed code.
      
      (From Phabricator D169)
      f46aa733
  35. 30 Nov, 2014 1 commit
  36. 04 Nov, 2014 1 commit