1. 02 Feb, 2018 1 commit
    • Michal Terepeta's avatar
      Hoopl.Collections: change right folds to strict left folds · 2974b2b8
      Michal Terepeta authored
      It seems that most uses of these folds should be strict left folds
      (I could only find a single place that benefits from a right fold).
      So this removes the existing `setFold`/`mapFold`/`mapFoldWihKey`
      replaces them with:
      - `setFoldl`/`mapFoldl`/`mapFoldlWithKey` (strict left folds)
      - `setFoldr`/`mapFoldr` (for the less common case where a right fold
        actually makes sense, e.g., `CmmProcPoint`)
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      Test Plan: ./validate
      Reviewers: bgamari, simonmar
      Reviewed By: bgamari
      Subscribers: rwbarton, thomie, carter, kavon
      Differential Revision: https://phabricator.haskell.org/D4356
  2. 19 Sep, 2017 1 commit
    • Herbert Valerio Riedel's avatar
      compiler: introduce custom "GhcPrelude" Prelude · f63bc730
      Herbert Valerio Riedel authored
      This switches the compiler/ component to get compiled with
      -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all
      This is motivated by the upcoming "Prelude" re-export of
      `Semigroup((<>))` which would cause lots of name clashes in every
      modulewhich imports also `Outputable`
      Reviewers: austin, goldfire, bgamari, alanz, simonmar
      Reviewed By: bgamari
      Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
      Differential Revision: https://phabricator.haskell.org/D3989
  3. 06 Sep, 2017 1 commit
  4. 23 Jun, 2017 1 commit
    • Michal Terepeta's avatar
      Hoopl: remove dependency on Hoopl package · 42eee6ea
      Michal Terepeta authored
      This copies the subset of Hoopl's functionality needed by GHC to
      `cmm/Hoopl` and removes the dependency on the Hoopl package.
      The main motivation for this change is the confusing/noisy interface
      between GHC and Hoopl:
      - Hoopl has `Label` which is GHC's `BlockId` but different than
        GHC's `CLabel`
      - Hoopl has `Unique` which is different than GHC's `Unique`
      - Hoopl has `Unique{Map,Set}` which are different than GHC's
      - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
        needed just to filter the exposed functions (filter out some of the
        Hoopl's and add the GHC ones)
      With this change, we'll be able to simplify this significantly.
      It'll also be much easier to do invasive changes (Hoopl is a public
      package on Hackage with users that depend on the current behavior)
      This should introduce no changes in functionality - it merely
      copies the relevant code.
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      Test Plan: ./validate
      Reviewers: austin, bgamari, simonmar
      Reviewed By: bgamari, simonmar
      Subscribers: simonpj, kavon, rwbarton, thomie
      Differential Revision: https://phabricator.haskell.org/D3616
  5. 23 May, 2017 1 commit
  6. 20 Jan, 2017 1 commit
    • takano-akio's avatar
      Allow top-level string literals in Core (#8472) · d49b2bb2
      takano-akio authored
      This commits relaxes the invariants of the Core syntax so that a
      top-level variable can be bound to a primitive string literal of type
      This commit:
      * Relaxes the invatiants of the Core, and allows top-level bindings whose
        type is Addr# as long as their RHS is either a primitive string literal or
        another variable.
      * Allows the simplifier and the full-laziness transformer to float out
        primitive string literals to the top leve.
      * Introduces the new StgGenTopBinding type to accomodate top-level Addr#
      * Introduces a new type of labels in the object code, with the suffix "_bytes",
        for exported top-level Addr# bindings.
      * Makes some built-in rules more robust. This was necessary to keep them
        functional after the above changes.
      This is a continuation of D2554.
      Rebasing notes:
      This had two slightly suspicious performance regressions:
      * T12425: bytes allocated regressed by roughly 5%
      * T4029: bytes allocated regressed by a bit over 1%
      * T13035: bytes allocated regressed by a bit over 5%
      These deserve additional investigation.
      Rebased by: bgamari.
      Test Plan: ./validate --slow
      Reviewers: goldfire, trofi, simonmar, simonpj, austin, hvr, bgamari
      Reviewed By: trofi, simonpj, bgamari
      Subscribers: trofi, simonpj, gridaphobe, thomie
      Differential Revision: https://phabricator.haskell.org/D2605
      GHC Trac Issues: #8472
  7. 19 Jan, 2017 1 commit
    • Richard Eisenberg's avatar
      Update levity polymorphism · e7985ed2
      Richard Eisenberg authored
      This commit implements the proposal in
      https://github.com/ghc-proposals/ghc-proposals/pull/29 and
      Here are some of the pieces of that proposal:
      * Some of RuntimeRep's constructors have been shortened.
      * TupleRep and SumRep are now parameterized over a list of RuntimeReps.
      * This
      means that two types with the same kind surely have the same
      Previously, all unboxed tuples had the same kind, and thus the fact
      above was
      * RepType.typePrimRep and friends now return a *list* of PrimReps. These
      functions can now work successfully on unboxed tuples. This change is
      necessary because we allow abstraction over unboxed tuple types and so
      always handle unboxed tuples specially as we did before.
      * We sometimes have to create an Id from a PrimRep. I thus split PtrRep
      * into
      LiftedRep and UnliftedRep, so that the created Ids have the right
      * The RepType.RepType type was removed, as it didn't seem to help with
      * much.
      * The RepType.repType function is also removed, in favor of typePrimRep.
      * I have waffled a good deal on whether or not to keep VoidRep in
      TyCon.PrimRep. In the end, I decided to keep it there. PrimRep is *not*
      represented in RuntimeRep, and typePrimRep will never return a list
      VoidRep. But it's handy to have in, e.g., ByteCodeGen and friends. I can
      imagine another design choice where we have a PrimRepV type that is
      with an extra constructor. That seemed to be a heavier design, though,
      and I'm
      not sure what the benefit would be.
      * The last, unused vestiges of # (unliftedTypeKind) have been removed.
      * There were several pretty-printing bugs that this change exposed;
      * these are fixed.
      * We previously checked for levity polymorphism in the types of binders.
      * But we
      also must exclude levity polymorphism in function arguments. This is
      hard to check
      for, requiring a good deal of care in the desugarer. See Note [Levity
      checking] in DsMonad.
      * In order to efficiently check for levity polymorphism in functions, it
      * was necessary
      to add a new bit of IdInfo. See Note [Levity info] in IdInfo.
      * It is now safe for unlifted types to be unsaturated in Core. Core Lint
      * is updated
      * We can only know strictness after zonking, so several checks around
      * strictness
      in the type-checker (checkStrictBinds, the check for unlifted variables
      under a ~
      pattern) have been moved to the desugarer.
      * Along the way, I improved the treatment of unlifted vs. banged
      * bindings. See
      Note [Strict binds checks] in DsBinds and #13075.
      * Now that we print type-checked source, we must be careful to print
      * ConLikes correctly.
      This is facilitated by a new HsConLikeOut constructor to HsExpr.
      Particularly troublesome
      are unlifted pattern synonyms that get an extra void# argument.
      * Includes a submodule update for haddock, getting rid of #.
      * New testcases:
      * Fixed tickets:
      * This also adds a test case for #13105. This test case is
      * "compile_fail" and
      succeeds, because I want the testsuite to monitor the error message.
      When #13105 is fixed, the test case will compile cleanly.
  8. 08 Dec, 2016 1 commit
  9. 06 Dec, 2016 1 commit
  10. 26 Oct, 2016 1 commit
  11. 19 Oct, 2016 1 commit
  12. 10 Aug, 2016 1 commit
    • Ömer Sinan Ağacan's avatar
      Remove StgRubbishArg and CmmArg · 9684dbb1
      Ömer Sinan Ağacan authored
      The idea behind adding special "rubbish" arguments was in unboxed sum types
      depending on the tag some arguments are not used and we don't want to move some
      special values (like 0 for literals and some special pointer for boxed slots)
      for those arguments (to stack locations or registers). "StgRubbishArg" was an
      indicator to the code generator that the value won't be used. During Stg-to-Cmm
      we were then not generating any move or store instructions at all.
      This caused problems in the register allocator because some variables were only
      initialized in some code paths. As an example, suppose we have this STG: (after
          Lib.$WT =
              \r [dt_sit]
                      case dt_sit of {
                        Lib.F dt_siv [Occ=Once] ->
                            (#,,#) [1# dt_siv StgRubbishArg::GHC.Prim.Int#];
                        Lib.I dt_siw [Occ=Once] ->
                            (#,,#) [2# StgRubbishArg::GHC.Types.Any dt_siw];
                  { (#,,#) us_giC us_giD us_giE -> Lib.T [us_giC us_giD us_giE];
      This basically unpacks a sum type to an unboxed sum with 3 fields, and then
      moves the unboxed sum to a constructor (`Lib.T`).
      This is the Cmm for the inner case expression (case expression in the scrutinee
      position of the outer case):
              -- look at dt_sit's tag
              if (_ciT::P64 != 1) goto ciS; else goto ciR;
          ciS: -- Tag is 2, i.e. Lib.F
              _siw::I64 = I64[_siu::P64 + 6];
              _giE::I64 = _siw::I64;
              _giD::P64 = stg_RUBBISH_ENTRY_info;
              _giC::I64 = 2;
              goto ciU;
          ciR: -- Tag is 1, i.e. Lib.I
              _siv::P64 = P64[_siu::P64 + 7];
              _giD::P64 = _siv::P64;
              _giC::I64 = 1;
              goto ciU;
      Here one of the blocks `ciS` and `ciR` is executed and then the execution
      continues to `ciR`, but only `ciS` initializes `_giE`, in the other branch
      `_giE` is not initialized, because it's "rubbish" in the STG and so we don't
      generate an assignment during code generator. The code generator then panics
      during the register allocations:
          ghc-stage1: panic! (the 'impossible' happened)
            (GHC version 8.1.20160722 for x86_64-unknown-linux):
                  LocalReg's live-in to graph ciY {_giE::I64}
      (`_giD` is also "rubbish" in `ciS`, but it's still initialized because it's a
      pointer slot, we have to initialize it otherwise garbage collector follows the
      pointer to some random place. So we only remove assignment if the "rubbish" arg
      has unboxed type.)
      This patch removes `StgRubbishArg` and `CmmArg`. We now always initialize
      rubbish slots. If the slot is for boxed types we use the existing `absentError`,
      otherwise we initialize the slot with literal 0.
      Reviewers: simonpj, erikd, austin, simonmar, bgamari
      Reviewed By: erikd
      Subscribers: thomie
      Differential Revision: https://phabricator.haskell.org/D2446
  13. 21 Jul, 2016 1 commit
    • Ömer Sinan Ağacan's avatar
      Implement unboxed sum primitive type · 714bebff
      Ömer Sinan Ağacan authored
      This patch implements primitive unboxed sum types, as described in
      Main changes are:
      - Add new syntax for unboxed sums types, terms and patterns. Hidden
        behind `-XUnboxedSums`.
      - Add unlifted unboxed sum type constructors and data constructors,
        extend type and pattern checkers and desugarer.
      - Add new RuntimeRep for unboxed sums.
      - Extend unarise pass to translate unboxed sums to unboxed tuples right
        before code generation.
      - Add `StgRubbishArg` to `StgArg`, and a new type `CmmArg` for better
        code generation when sum values are involved.
      - Add user manual section for unboxed sums.
      Some other changes:
      - Generalize `UbxTupleRep` to `MultiRep` and `UbxTupAlt` to
        `MultiValAlt` to be able to use those with both sums and tuples.
      - Don't use `tyConPrimRep` in `isVoidTy`: `tyConPrimRep` is really
        wrong, given an `Any` `TyCon`, there's no way to tell what its kind
        is, but `kindPrimRep` and in turn `tyConPrimRep` returns `PtrRep`.
      - Fix some bugs on the way: #12375.
      Not included in this patch:
      - Update Haddock for new the new unboxed sum syntax.
      - `TemplateHaskell` support is left as future work.
      For reviewers:
      - Front-end code is mostly trivial and adapted from unboxed tuple code
        for type checking, pattern checking, renaming, desugaring etc.
      - Main translation routines are in `RepType` and `UnariseStg`.
        Documentation in `UnariseStg` should be enough for understanding
        what's going on.
      - Johan Tibell wrote the initial front-end and interface file
      - Simon Peyton Jones reviewed this patch many times, wrote some code,
        and helped with debugging.
      Reviewers: bgamari, alanz, goldfire, RyanGlScott, simonpj, austin,
                 simonmar, hvr, erikd
      Reviewed By: simonpj
      Subscribers: Iceland_jack, ggreif, ezyang, RyanGlScott, goldfire,
                   thomie, mpickering
      Differential Revision: https://phabricator.haskell.org/D2259
  14. 12 Nov, 2015 1 commit
    • olsner's avatar
      Implement function-sections for Haskell code, #8405 · 4a32bf92
      olsner authored
      This adds a flag -split-sections that does similar things to
      -split-objs, but using sections in single object files instead of
      relying on the Satanic Splitter and other abominations. This is very
      similar to the GCC flags -ffunction-sections and -fdata-sections.
      The --gc-sections linker flag, which allows unused sections to actually
      be removed, is added to all link commands (if the linker supports it) so
      that space savings from having base compiled with sections can be
      Supported both in LLVM and the native code-gen, in theory for all
      architectures, but really tested on x86 only.
      In the GHC build, a new SplitSections variable enables -split-sections
      for relevant parts of the build.
      Test Plan: validate with both settings of SplitSections
      Reviewers: dterei, Phyx, austin, simonmar, thomie, bgamari
      Reviewed By: simonmar, thomie, bgamari
      Subscribers: hsyl20, erikd, kgardas, thomie
      Differential Revision: https://phabricator.haskell.org/D1242
      GHC Trac Issues: #8405
  15. 25 Jun, 2015 1 commit
    • rwbarton's avatar
      Be aware of overlapping global STG registers in CmmSink (#10521) · a2f828a3
      rwbarton authored
      On x86_64, commit e2f6bbd3 assigned
      the STG registers F1 and D1 the same hardware register (xmm1), and
      the same for the registers F2 and D2, etc. When mixing calls to
      functions involving Float#s and Double#s, this can cause wrong Cmm
      optimizations that assume the F1 and D1 registers are independent.
      Reviewers: simonpj, austin
      Reviewed By: austin
      Subscribers: simonpj, thomie, bgamari
      Differential Revision: https://phabricator.haskell.org/D993
      GHC Trac Issues: #10521
  16. 12 Jun, 2015 1 commit
  17. 30 Mar, 2015 1 commit
    • Joachim Breitner's avatar
      Refactor the story around switches (#10137) · de1160be
      Joachim Breitner authored
      This re-implements the code generation for case expressions at the Stg →
      Cmm level, both for data type cases as well as for integral literal
      cases. (Cases on float are still treated as before).
      The goal is to allow for fancier strategies in implementing them, for a
      cleaner separation of the strategy from the gritty details of Cmm, and
      to run this later than the Common Block Optimization, allowing for one
      way to attack #10124. The new module CmmSwitch contains a number of
      notes explaining this changes. For example, it creates larger
      consecutive jump tables than the previous code, if possible.
      nofib shows little significant overall improvement of runtime. The
      rather large wobbling comes from changes in the code block order
      (see #8082, not much we can do about it). But the decrease in code size
      alone makes this worthwhile.
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
                  Min          -1.8%      0.0%     -6.1%     -6.1%     -2.9%
                  Max          -0.7%     +0.0%     +5.6%     +5.7%     +7.8%
       Geometric Mean          -1.4%     -0.0%     -0.3%     -0.3%     +0.0%
      Compilation time increases slightly:
              -1 s.d.                -----            -2.0%
              +1 s.d.                -----            +2.5%
              Average                -----            +0.3%
      The test case T783 regresses a lot, but it is the only one exhibiting
      any regression. The cause is the changed order of branches in an
      if-then-else tree, which makes the hoople data flow analysis traverse
      the blocks in a suboptimal order. Reverting that gets rid of this
      regression, but has a consistent, if only very small (+0.2%), negative
      effect on runtime. So I conclude that this test is an extreme outlier
      and no reason to change the code.
      Differential Revision: https://phabricator.haskell.org/D720
  18. 09 Mar, 2015 1 commit
  19. 16 Dec, 2014 2 commits
    • Peter Wortmann's avatar
      Tick scopes · 5fecd767
      Peter Wortmann authored
      This patch solves the scoping problem of CmmTick nodes: If we just put
      CmmTicks into blocks we have no idea what exactly they are meant to
      cover.  Here we introduce tick scopes, which allow us to create
      sub-scopes and merged scopes easily.
      * Given that the code often passes Cmm around "head-less", we have to
        make sure that its intended scope does not get lost. To keep the amount
        of passing-around to a minimum we define a CmmAGraphScoped type synonym
        here that just bundles the scope with a portion of Cmm to be assembled
      * We introduce new scopes at somewhat random places, aligning with
        getCode calls. This works surprisingly well, but we might have to
        add new scopes into the mix later on if we find things too be too
      (From Phabricator D169)
    • Peter Wortmann's avatar
      Source notes (Cmm support) · 7ceaf96f
      Peter Wortmann authored
      This patch adds CmmTick nodes to Cmm code. This is relatively
      straight-forward, but also not very useful, as many blocks will simply
      end up with no annotations whatosever.
      * We use this design over, say, putting ticks into the entry node of all
        blocks, as it seems to work better alongside existing optimisations.
        Now granted, the reason for this is that currently GHC's main Cmm
        optimisations seem to mainly reorganize and merge code, so this might
        change in the future.
      * We have the Cmm parser generate a few source notes as well. This is
        relatively easy to do - worst part is that it complicates the CmmParse
        implementation a bit.
      (From Phabricator D169)
  20. 15 May, 2014 1 commit
    • Herbert Valerio Riedel's avatar
      Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized, while following the convention, to
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      - Moreover, if the list of language extensions fit into a single
        `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
  21. 11 Mar, 2014 1 commit
  22. 28 Nov, 2013 1 commit
  23. 26 Oct, 2013 1 commit
  24. 25 Oct, 2013 1 commit
  25. 19 Jun, 2013 1 commit
  26. 10 Mar, 2013 1 commit
  27. 09 Mar, 2013 1 commit
  28. 01 Feb, 2013 2 commits
    • gmainlan@microsoft.com's avatar
      Mimic OldCmm basic block ordering in the LLVM backend. · b39e4de1
      gmainlan@microsoft.com authored
      In OldCmm, the false case of a conditional was a fallthrough. In Cmm,
      conditionals have both true and false successors. When we convert Cmm to LLVM,
      we now first re-order Cmm blocks so that the false successor of a conditional
      occurs next in the list of basic blocks, i.e., it is a fallthrough, just like it
      (necessarily) did in OldCmm. Surprisingly, this can make a big performance
    • gmainlan@microsoft.com's avatar
      Always pass vector values on the stack. · 6480a35c
      gmainlan@microsoft.com authored
      Vector values are now always passed on the stack. This isn't particularly
      efficient, but it will have to do for now.
  29. 12 Nov, 2012 1 commit
    • Simon Marlow's avatar
      Remove OldCmm, convert backends to consume new Cmm · d92bd17f
      Simon Marlow authored
      This removes the OldCmm data type and the CmmCvt pass that converts
      new Cmm to OldCmm.  The backends (NCGs, LLVM and C) have all been
      converted to consume new Cmm.
      The main difference between the two data types is that conditional
      branches in new Cmm have both true/false successors, whereas in OldCmm
      the false case was a fallthrough.  To generate slightly better code we
      occasionally need to invert a conditional to ensure that the
      branch-not-taken becomes a fallthrough; this was previously done in
      CmmCvt, and it is now done in CmmContFlowOpt.
      We could go further and use the Hoopl Block representation for native
      code, which would mean that we could use Hoopl's postorderDfs and
      analyses for native code, but for now I've left it as is, using the
      old ListGraph representation for native code.
  30. 24 Oct, 2012 1 commit
  31. 08 Oct, 2012 1 commit
    • Simon Marlow's avatar
      Produce new-style Cmm from the Cmm parser · a7c0387d
      Simon Marlow authored
      The main change here is that the Cmm parser now allows high-level cmm
      code with argument-passing and function calls.  For example:
      foo ( gcptr a, bits32 b )
        if (b > 0) {
           // we can make tail calls passing arguments:
           jump stg_ap_0_fast(a);
        return (x,y);
      More details on the new cmm syntax are in Note [Syntax of .cmm files]
      in CmmParse.y.
      The old syntax is still more-or-less supported for those occasional
      code fragments that really need to explicitly manipulate the stack.
      However there are a couple of differences: it is now obligatory to
      give a list of live GlobalRegs on every jump, e.g.
        jump %ENTRY_CODE(Sp(0)) [R1];
      Again, more details in Note [Syntax of .cmm files].
      I have rewritten most of the .cmm files in the RTS into the new
      syntax, except for AutoApply.cmm which is generated by the genapply
      program: this file could be generated in the new syntax instead and
      would probably be better off for it, but I ran out of enthusiasm.
      Some other changes in this batch:
       - The PrimOp calling convention is gone, primops now use the ordinary
         NativeNodeCall convention.  This means that primops and "foreign
         import prim" code must be written in high-level cmm, but they can
         now take more than 10 arguments.
       - CmmSink now does constant-folding (should fix #7219)
       - .cmm files now go through the cmmPipeline, and as a result we
         generate better code in many cases.  All the object files generated
         for the RTS .cmm files are now smaller.  Performance should be
         better too, but I haven't measured it yet.
       - RET_DYN frames are removed from the RTS, lots of code goes away
       - we now have some more canned GC points to cover unboxed-tuples with
         2-4 pointers, which will reduce code size a little.
  32. 18 Sep, 2012 5 commits
  33. 16 Sep, 2012 2 commits