1. 09 Mar, 2019 1 commit
  2. 06 Mar, 2019 1 commit
    • Ben Gamari's avatar
      Rip out object splitting · 37f257af
      Ben Gamari authored
      The splitter is an evil Perl script that processes assembler code.
      Its job can be done better by the linker's --gc-sections flag. GHC
      passes this flag to the linker whenever -split-sections is passed on
      the command line.
      
      This is based on @DemiMarie's D2768.
      
      Fixes Trac #11315
      Fixes Trac #9832
      Fixes Trac #8964
      Fixes Trac #8685
      Fixes Trac #8629
      37f257af
  3. 15 Feb, 2019 1 commit
    • Andreas Klebinger's avatar
      Don't wrap the entry map for LiveInfo in Maybe. · bcaba30a
      Andreas Klebinger authored
      It never really encoded a invariant.
      
      * The linear register allocator just did partial pattern matches
      * The graph allocator just set it to (Just mapEmpty) for Nothing
      
      So I changed LiveInfo to directly contain the map.
      
      Further natCmmTopToLive which filled in Nothing is no longer exported.
      Instead we know call cmmTopLiveness which changes the type AND fills
      in the map.
      bcaba30a
  4. 14 Feb, 2019 1 commit
    • Sylvain Henry's avatar
      NCG: fast compilation of very large strings (#16190) · 1d9a1d9f
      Sylvain Henry authored
      This patch adds an optimization into the NCG: for large strings
      (threshold configurable via -fbinary-blob-threshold=NNN flag), instead
      of printing `.asciz "..."` in the generated ASM source, we print
      `.incbin "tmpXXX.dat"` and we dump the contents of the string into a
      temporary "tmpXXX.dat" file.
      
      See the note for more details.
      1d9a1d9f
  5. 10 Feb, 2019 1 commit
  6. 09 Feb, 2019 1 commit
  7. 08 Feb, 2019 1 commit
    • Andreas Klebinger's avatar
      Allow resizing the stack for the graph allocator. · 03b7abc1
      Andreas Klebinger authored
      The graph allocator now dynamically resizes the number of stack
      slots when running into the limit.
      
      This fixes #8657.
      
      Also loop membership of basic blocks is now available
      in the register allocator for cost heuristics.
      03b7abc1
  8. 03 Feb, 2019 1 commit
  9. 31 Jan, 2019 5 commits
  10. 30 Jan, 2019 3 commits
  11. 28 Jan, 2019 1 commit
  12. 23 Jan, 2019 1 commit
  13. 17 Jan, 2019 6 commits
  14. 16 Jan, 2019 1 commit
  15. 13 Jan, 2019 1 commit
  16. 01 Jan, 2019 1 commit
  17. 30 Dec, 2018 1 commit
  18. 11 Dec, 2018 2 commits
  19. 29 Nov, 2018 1 commit
    • Simon Peyton Jones's avatar
      Taming the Kind Inference Monster · 2257a86d
      Simon Peyton Jones authored
      My original goal was (Trac #15809) to move towards using level numbers
      as the basis for deciding which type variables to generalise, rather
      than searching for the free varaibles of the environment.  However
      it has turned into a truly major refactoring of the kind inference
      engine.
      
      Let's deal with the level-numbers part first:
      
      * Augment quantifyTyVars to calculate the type variables to
        quantify using level numbers, and compare the result with
        the existing approach.  That is; no change in behaviour,
        just a WARNing if the two approaches give different answers.
      
      * To do this I had to get the level number right when calling
        quantifyTyVars, and this entailed a bit of care, especially
        in the code for kind-checking type declarations.
      
      * However, on the way I was able to eliminate or simplify
        a number of calls to solveEqualities.
      
      This work is incomplete: I'm not /using/ level numbers yet.
      When I subsequently get rid of any remaining WARNings in
      quantifyTyVars, that the level-number answers differ from
      the current answers, then I can rip out the current
      "free vars of the environment" stuff.
      
      Anyway, this led me into deep dive into kind inference for type and
      class declarations, which is an increasingly soggy part of GHC.
      Richard already did some good work recently in
      
         commit 5e45ad10
         Date:   Thu Sep 13 09:56:02 2018 +0200
      
          Finish fix for #14880.
      
          The real change that fixes the ticket is described in
          Note [Naughty quantification candidates] in TcMType.
      
      but I kept turning over stones. So this patch has ended up
      with a pretty significant refactoring of that code too.
      
      Kind inference for types and classes
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      * Major refactoring in the way we generalise the inferred kind of
        a TyCon, in kcTyClGroup.  Indeed, I made it into a new top-level
        function, generaliseTcTyCon.  Plus a new Note to explain it
        Note [Inferring kinds for type declarations].
      
      * We decided (Trac #15592) not to treat class type variables specially
        when dealing with Inferred/Specified/Required for associated types.
        That simplifies things quite a bit. I also rewrote
        Note [Required, Specified, and Inferred for types]
      
      * Major refactoring of the crucial function kcLHsQTyVars:
        I split it into
             kcLHsQTyVars_Cusk  and  kcLHsQTyVars_NonCusk
        because the two are really quite different. The CUSK case is
        almost entirely rewritten, and is much easier because of our new
        decision not to treat the class variables specially
      
      * I moved all the error checks from tcTyClTyVars (which was a bizarre
        place for it) into generaliseTcTyCon and/or the CUSK case of
        kcLHsQTyVars.  Now tcTyClTyVars is extremely simple.
      
      * I got rid of all the all the subtleties in tcImplicitTKBndrs. Indeed
        now there is no difference between tcImplicitTKBndrs and
        kcImplicitTKBndrs; there is now a single bindImplicitTKBndrs.
        Same for kc/tcExplicitTKBndrs.  None of them monkey with level
        numbers, nor build implication constraints.  scopeTyVars is gone
        entirely, as is kcLHsQTyVarBndrs. It's vastly simpler.
      
        I found I could get rid of kcLHsQTyVarBndrs entirely, in favour of
        the bnew bindExplicitTKBndrs.
      
      Quantification
      ~~~~~~~~~~~~~~
      * I now deal with the "naughty quantification candidates"
        of the previous patch in candidateQTyVars, rather than in
        quantifyTyVars; see Note [Naughty quantification candidates]
        in TcMType.
      
        I also killed off closeOverKindsCQTvs in favour of the same
        strategy that we use for tyCoVarsOfType: namely, close over kinds
        at the occurrences.
      
        And candidateQTyVars no longer needs a gbl_tvs argument.
      
      * Passing the ContextKind, rather than the expected kind itself,
        to tc_hs_sig_type_and_gen makes it easy to allocate the expected
        result kind (when we are in inference mode) at the right level.
      
      Type families
      ~~~~~~~~~~~~~~
      * I did a major rewrite of the impenetrable tcFamTyPats. The result
        is vastly more comprehensible.
      
      * I got rid of kcDataDefn entirely, quite a big function.
      
      * I re-did the way that checkConsistentFamInst works, so
        that it allows alpha-renaming of invisible arguments.
      
      * The interaction of kind signatures and family instances is tricky.
          Type families: see Note [Apparently-nullary families]
          Data families: see Note [Result kind signature for a data family instance]
                         and Note [Eta-reduction for data families]
      
      * The consistent instantation of an associated type family is tricky.
        See Note [Checking consistent instantiation] and
            Note [Matching in the consistent-instantation check]
        in TcTyClsDecls.  It's now checked in TcTyClsDecls because that is
        when we have the relevant info to hand.
      
      * I got tired of the compromises in etaExpandFamInst, so I did the
        job properly by adding a field cab_eta_tvs to CoAxBranch.
        See Coercion.etaExpandCoAxBranch.
      
      tcInferApps and friends
      ~~~~~~~~~~~~~~~~~~~~~~~
      * I got rid of the mysterious and horrible ClsInstInfo argument
        to tcInferApps, checkExpectedKindX, and various checkValid
        functions.  It was horrible!
      
      * I got rid of [Type] result of tcInferApps.  This list was used
        only in tcFamTyPats, when checking the LHS of a type instance;
        and if there is a cast in the middle, the list is meaningless.
        So I made tcInferApps simpler, and moved the complexity
        (not much) to tcInferApps.
      
        Result: tcInferApps is now pretty comprehensible again.
      
      * I refactored the many function in TcMType that instantiate skolems.
      
      Smaller things
      
      * I rejigged the error message in checkValidTelescope; I think it's
        quite a bit better now.
      
      * checkValidType was not rejecting constraints in a kind signature
           forall (a :: Eq b => blah). blah2
        That led to further errors when we then do an ambiguity check.
        So I make checkValidType reject it more aggressively.
      
      * I killed off quantifyConDecl, instead calling kindGeneralize
        directly.
      
      * I fixed an outright bug in tyCoVarsOfImplic, where we were not
        colleting the tyvar of the kind of the skolems
      
      * Renamed ClsInstInfo to AssocInstInfo, and made it into its
        own data type
      
      * Some fiddling around with pretty-printing of family
        instances which was trickier than I thought.  I wanted
        wildcards to print as plain "_" in user messages, although
        they each need a unique identity in the CoAxBranch.
      
      Some other oddments
      
      * Refactoring around the trace messages from reportUnsolved.
      * A bit of extra tc-tracing in TcHsSyn.commitFlexi
      
      This patch fixes a raft of bugs, and includes tests for them.
      
       * #14887
       * #15740
       * #15764
       * #15789
       * #15804
       * #15817
       * #15870
       * #15874
       * #15881
      2257a86d
  20. 22 Nov, 2018 3 commits
    • David Eichmann's avatar
      Fix unused-import warnings · 6353efc7
      David Eichmann authored
      This patch fixes a fairly long-standing bug (dating back to 2015) in
      RdrName.bestImport, namely
      
         commit 9376249b
         Author: Simon Peyton Jones <simonpj@microsoft.com>
         Date:   Wed Oct 28 17:16:55 2015 +0000
      
         Fix unused-import stuff in a better way
      
      In that patch got the sense of the comparison back to front, and
      thereby failed to implement the unused-import rules described in
        Note [Choosing the best import declaration] in RdrName
      
      This led to Trac #13064 and #15393
      
      Fixing this bug revealed a bunch of unused imports in libraries;
      the ones in the GHC repo are part of this commit.
      
      The two important changes are
      
      * Fix the bug in bestImport
      
      * Modified the rules by adding (a) in
           Note [Choosing the best import declaration] in RdrName
        Reason: the previosu rules made Trac #5211 go bad again.  And
        the new rule (a) makes sense to me.
      
      In unravalling this I also ended up doing a few other things
      
      * Refactor RnNames.ImportDeclUsage to use a [GlobalRdrElt] for the
        things that are used, rather than [AvailInfo]. This is simpler
        and more direct.
      
      * Rename greParentName to greParent_maybe, to follow GHC
        naming conventions
      
      * Delete dead code RdrName.greUsedRdrName
      
      Bumps a few submodules.
      
      Reviewers: hvr, goldfire, bgamari, simonmar, jrtc27
      
      Subscribers: rwbarton, carter
      
      Differential Revision: https://phabricator.haskell.org/D5312
      6353efc7
    • Andreas Klebinger's avatar
      Fixup the new code layout patch for SplitObjs. · 6c26b3f8
      Andreas Klebinger authored
      When splitting objects we sometimes generate
      dummy CmmProcs containing bottom in some fields.
      
      Code introduced in the new code layout patch looked
      at these which blew up the compiler. Now we instead
      check first if the function actually contains code.
      
      Reviewers: bgamari
      
      Subscribers: simonpj, rwbarton, carter
      
      Differential Revision: https://phabricator.haskell.org/D5357
      6c26b3f8
    • Sylvain Henry's avatar
      Rename literal constructors · 13bb4bf4
      Sylvain Henry authored
      In a previous patch we replaced some built-in literal constructors
      (MachInt, MachWord, etc.) with a single LitNumber constructor.
      
      In this patch we replace the `Mach` prefix of the remaining constructors
      with `Lit` for consistency (e.g., LitChar, LitLabel, etc.).
      
      Sadly the name `LitString` was already taken for a kind of FastString
      and it would become misleading to have both `LitStr` (literal
      constructor renamed after `MachStr`) and `LitString` (FastString
      variant). Hence this patch renames the FastString variant `PtrString`
      (which is more accurate) and the literal string constructor now uses the
      least surprising `LitString` name.
      
      Both `Literal` and `LitString/PtrString` have recently seen breaking
      changes so doing this kind of renaming now shouldn't harm much.
      
      Reviewers: hvr, goldfire, bgamari, simonmar, jrtc27, tdammers
      
      Subscribers: tdammers, rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4881
      13bb4bf4
  21. 17 Nov, 2018 1 commit
    • Andreas Klebinger's avatar
      NCG: New code layout algorithm. · 912fd2b6
      Andreas Klebinger authored
      Summary:
      This patch implements a new code layout algorithm.
      It has been tested for x86 and is disabled on other platforms.
      
      Performance varies slightly be CPU/Machine but in general seems to be better
      by around 2%.
      Nofib shows only small differences of about +/- ~0.5% overall depending on
      flags/machine performance in other benchmarks improved significantly.
      
      Other benchmarks includes at least the benchmarks of: aeson, vector, megaparsec, attoparsec,
      containers, text and xeno.
      
      While the magnitude of gains differed three different CPUs where tested with
      all getting faster although to differing degrees. I tested: Sandy Bridge(Xeon), Haswell,
      Skylake
      
      * Library benchmark results summarized:
        * containers: ~1.5% faster
        * aeson: ~2% faster
        * megaparsec: ~2-5% faster
        * xml library benchmarks: 0.2%-1.1% faster
        * vector-benchmarks: 1-4% faster
        * text: 5.5% faster
      
      On average GHC compile times go down, as GHC compiled with the new layout
      is faster than the overhead introduced by using the new layout algorithm,
      
      Things this patch does:
      
      * Move code responsilbe for block layout in it's own module.
      * Move the NcgImpl Class into the NCGMonad module.
      * Extract a control flow graph from the input cmm.
      * Update this cfg to keep it in sync with changes during
        asm codegen. This has been tested on x64 but should work on x86.
        Other platforms still use the old codelayout.
      * Assign weights to the edges in the CFG based on type and limited static
        analysis which are then used for block layout.
      * Once we have the final code layout eliminate some redundant jumps.
      
        In particular turn a sequences of:
            jne .foo
            jmp .bar
          foo:
        into
            je bar
          foo:
            ..
      
      Test Plan: ci
      
      Reviewers: bgamari, jmct, jrtc27, simonmar, simonpj, RyanGlScott
      
      Reviewed By: RyanGlScott
      
      Subscribers: RyanGlScott, trommler, jmct, carter, thomie, rwbarton
      
      GHC Trac Issues: #15124
      
      Differential Revision: https://phabricator.haskell.org/D4726
      912fd2b6
  22. 02 Nov, 2018 1 commit
    • Michal Terepeta's avatar
      Add Int8# and Word8# · 2c959a18
      Michal Terepeta authored
      This is the first step of implementing:
      https://github.com/ghc-proposals/ghc-proposals/pull/74
      
      The main highlights/changes:
      
          primops.txt.pp gets two new sections for two new primitive types for
          signed and unsigned 8-bit integers (Int8# and Word8 respectively) along
          with basic arithmetic and comparison operations. PrimRep/RuntimeRep get
          two new constructors for them. All of the primops translate into the
          existing MachOPs.
      
          For CmmCalls the codegen will now zero-extend the values at call
          site (so that they can be moved to the right register) and then truncate
          them back their original width.
      
          x86 native codegen needed some updates, since it wasn't able to deal
          with the new widths, but all the changes are quite localized. LLVM
          backend seems to just work.
      
      This is the second attempt at merging this, after the first attempt in
      D4475 had to be backed out due to regressions on i386.
      
      Bumps binary submodule.
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan: ./validate (on both x86-{32,64})
      
      Reviewers: bgamari, hvr, goldfire, simonmar
      
      Subscribers: rwbarton, carter
      
      Differential Revision: https://phabricator.haskell.org/D5258
      2c959a18
  23. 28 Oct, 2018 1 commit
    • Zejun Wu's avatar
      Fix rare undefined asm temp end label error in x86 · 3c452d0d
      Zejun Wu authored
      Summary:
      Encountered assembly error due to undefined label `.LcaDcU_info_end` for
      following code generated by `pprFrameProc`:
      
      ```
      .Lsat_sa8fp{v}_info_fde_end:
        .long .Lblock{v caDcU}_info_fde_end-.Lblock{v caDcU}_info_fde
      .Lblock{v caDcU}_info_fde:
        .long _nbHlD-.Lsection_frame
        .quad block{v caDcU}_info-1
        .quad .Lblock{v caDcU}_info_end-block{v caDcU}_info+1
        .byte 1
      ```
      
      This diff fixed the error.
      
      Test Plan:
        ./validate
      
      Also the case where we used to have assembly error is now fixed.
      Unfortunately, I have limited insight here and cannot get a small enough repro
      or test case for this.
      
      Ben says:
      
      > I think I see: Previously we only produced end symbols for the info
      > tables of top-level procedures. However, blocks within a procedure may
      > also have info tables, we will dutifully generate debug information for
      > and consequently we get undefined symbols.
      
      Reviewers: simonmar, scpmw, last_g, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: rwbarton, carter
      
      Differential Revision: https://phabricator.haskell.org/D5246
      3c452d0d
  24. 09 Oct, 2018 1 commit
    • Ben Gamari's avatar
      Revert "Add Int8# and Word8#" · d728c3c5
      Ben Gamari authored
      This unfortunately broke i386 support since it introduced references to
      byte-sized registers that don't exist on that architecture.
      
      Reverts binary submodule
      
      This reverts commit 5d5307f9.
      d728c3c5
  25. 07 Oct, 2018 1 commit
    • Michal Terepeta's avatar
      Add Int8# and Word8# · 5d5307f9
      Michal Terepeta authored
      This is the first step of implementing:
      https://github.com/ghc-proposals/ghc-proposals/pull/74
      
      The main highlights/changes:
      
      - `primops.txt.pp` gets two new sections for two new primitive types
        for signed and unsigned 8-bit integers (`Int8#` and `Word8`
        respectively) along with basic arithmetic and comparison
        operations. `PrimRep`/`RuntimeRep` get two new constructors for
        them. All of the primops translate into the existing `MachOP`s.
      
      - For `CmmCall`s the codegen will now zero-extend the values at call
        site (so that they can be moved to the right register) and then
        truncate them back their original width.
      
      - x86 native codegen needed some updates, since it wasn't able to deal
        with the new widths, but all the changes are quite localized. LLVM
        backend seems to just work.
      
      Bumps binary submodule.
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan: ./validate with new tests
      
      Reviewers: hvr, goldfire, bgamari, simonmar
      
      Subscribers: Abhiroop, dfeuer, rwbarton, thomie, carter
      
      Differential Revision: https://phabricator.haskell.org/D4475
      5d5307f9
  26. 18 Sep, 2018 1 commit
    • Andreas Klebinger's avatar
      Invert FP conditions to eliminate the explicit NaN check. · 6bb9bc7d
      Andreas Klebinger authored
      Summary:
      Optimisation: we don't have to test the parity flag if we
      know the test has already excluded the unordered case: eg >
      and >= test for a zero carry flag, which can only occur for
      ordered operands.
      
      By reversing comparisons we can avoid testing the parity
      for < and <= as well. This works since:
      * If any of the arguments is an NaN CF gets set. Resulting in a false result.
      * Since this allows us to rule out NaN we can exchange the arguments and invert the
        direction of the arrows.
      
      Test Plan: ci/nofib
      
      Reviewers: carter, bgamari, alpmestan
      
      Reviewed By: alpmestan
      
      Subscribers: alpmestan, simonpj, jmct, rwbarton, thomie
      
      GHC Trac Issues: #15196
      
      Differential Revision: https://phabricator.haskell.org/D4990
      6bb9bc7d