1. 23 Jun, 2017 1 commit
    • Michal Terepeta's avatar
      Hoopl: remove dependency on Hoopl package · 42eee6ea
      Michal Terepeta authored
      This copies the subset of Hoopl's functionality needed by GHC to
      `cmm/Hoopl` and removes the dependency on the Hoopl package.
      
      The main motivation for this change is the confusing/noisy interface
      between GHC and Hoopl:
      - Hoopl has `Label` which is GHC's `BlockId` but different than
        GHC's `CLabel`
      - Hoopl has `Unique` which is different than GHC's `Unique`
      - Hoopl has `Unique{Map,Set}` which are different than GHC's
        `Uniq{FM,Set}`
      - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
        needed just to filter the exposed functions (filter out some of the
        Hoopl's and add the GHC ones)
      With this change, we'll be able to simplify this significantly.
      It'll also be much easier to do invasive changes (Hoopl is a public
      package on Hackage with users that depend on the current behavior)
      
      This should introduce no changes in functionality - it merely
      copies the relevant code.
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan: ./validate
      
      Reviewers: austin, bgamari, simonmar
      
      Reviewed By: bgamari, simonmar
      
      Subscribers: simonpj, kavon, rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3616
      42eee6ea
  2. 08 Feb, 2017 2 commits
    • Ben Gamari's avatar
      Cmm: Add support for undefined unwinding statements · 3328ddb8
      Ben Gamari authored
      And use to mark `stg_stack_underflow_frame`, which we are unable to
      determine a caller from.
      
      To simplify parsing at the moment we steal the `return` keyword to
      indicate an undefined unwind value. Perhaps this should be revisited.
      
      Reviewers: scpmw, simonmar, austin, erikd
      
      Subscribers: dfeuer, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2738
      3328ddb8
    • Ben Gamari's avatar
      Generalize CmmUnwind and pass unwind information through NCG · 3eb737ee
      Ben Gamari authored
      As discussed in D1532, Trac Trac #11337, and Trac Trac #11338, the stack
      unwinding information produced by GHC is currently quite approximate.
      Essentially we assume that register values do not change at all within a
      basic block. While this is somewhat true in normal Haskell code, blocks
      containing foreign calls often break this assumption. This results in
      unreliable call stacks, especially in the code containing foreign calls.
      This is worse than it sounds as unreliable unwinding information can at
      times result in segmentation faults.
      
      This patch set attempts to improve this situation by tracking unwinding
      information with finer granularity. By dispensing with the assumption of
      one unwinding table per block, we allow the compiler to accurately
      represent the areas surrounding foreign calls.
      
      Towards this end we generalize the representation of unwind information
      in the backend in three ways,
      
       * Multiple CmmUnwind nodes can occur per block
      
       * CmmUnwind nodes can now carry unwind information for multiple
         registers (while not strictly necessary; this makes emitting
         unwinding information a bit more convenient in the compiler)
      
       * The NCG backend is given an opportunity to modify the unwinding
         records since it may need to make adjustments due to, for instance,
         native calling convention requirements for foreign calls (see
         #11353).
      
      This sets the stage for resolving #11337 and #11338.
      
      Test Plan: Validate
      
      Reviewers: scpmw, simonmar, austin, erikd
      
      Subscribers: qnikst, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2741
      3eb737ee
  3. 29 Nov, 2016 1 commit
    • Michal Terepeta's avatar
      Minor cleanup of foldRegs{Used,Defd} · da5a61eb
      Michal Terepeta authored
      This makes the two functions strict in the accumulator - it seems that
      there are only two users of those functions: `CmmLive` and `CmmSink`
      and in both cases the strict fold fits better.
      
      The commit also removes a few unused functions (`filterRegsUsed`),
      instances (for `Maybe` and `RegSet`) and gets rid of unnecessary
      inculde of `HsVersions.h`.
      
      The performance effect of avoiding unnecessary thunks is mostly
      negligible, although we do allocate a tiny bit less (nofib's section
      on compile allocations):
      ```
      -1 s.d.                -----            -0.2%
      +1 s.d.                -----            -0.1%
      Average                -----            -0.2%
      ```
      Signed-off-by: Michal Terepeta's avatarMichal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan: validate
      
      Reviewers: simonmar, austin, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2723
      da5a61eb
  4. 30 Jun, 2016 1 commit
    • niteria's avatar
      Delete Ord Unique · fb6e2c7f
      niteria authored
      Ord Unique can be a source of invisible, accidental
      nondeterminism as explained in Note [No Ord for Unique].
      This removes it, leaving a note with rationale.
      
      It's unfortunate that I had to write Ord instances for
      codegen data structures by hand, but I believe that it's a
      right trade-off here.
      
      Test Plan: ./validate
      
      Reviewers: simonmar, austin, bgamari
      
      Reviewed By: simonmar
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2370
      
      GHC Trac Issues: #4012
      fb6e2c7f
  5. 18 Jun, 2016 1 commit
  6. 22 Apr, 2016 1 commit
    • Simon Peyton Jones's avatar
      Warn about simplifiable class constraints · 9421b0c7
      Simon Peyton Jones authored
      Provoked by Trac #11948, this patch adds a new warning to GHC
      
        -Wsimplifiable-class-constraints
      
      It warns if you write a class constraint in a type signature that
      can be simplified by an existing instance declaration.  Almost always
      this means you should simplify it right now; type inference is very
      fragile without it, as #11948 shows.
      
      I've put the warning as on-by-default, but I suppose that if there are
      howls of protest we can move it out (as happened for -Wredundant-constraints.
      
      It actually found an example of an over-complicated context in CmmNode.
      
      Quite a few tests use these weird contexts to trigger something else,
      so I had to suppress the warning in those.
      
      The 'haskeline' library has a few occurrences of the warning (which
      I think should be fixed), so I switched it off for that library in
      warnings.mk.
      
      The warning itself is done in TcValidity.check_class_pred.
      
      HOWEVER, when type inference fails we get a type error; and the error
      suppresses the (informative) warning.  So as things stand, the warning
      only happens when it doesn't cause a problem.  Not sure what to do
      about this, but this patch takes us forward, I think.
      9421b0c7
  7. 23 Sep, 2015 1 commit
    • Simon Marlow's avatar
      Annotate CmmBranch with an optional likely target · 939a7d63
      Simon Marlow authored
      Summary:
      This allows the code generator to give hints to later code generation
      steps about which branch is most likely to be taken.  Right now it
      is only taken into account in one place: a special case in
      CmmContFlowOpt that swapped branches over to maximise the chance of
      fallthrough, which is now disabled when there is a likelihood setting.
      
      Test Plan: validate
      
      Reviewers: austin, simonpj, bgamari, ezyang, tibbe
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D1273
      939a7d63
  8. 30 Mar, 2015 1 commit
    • Joachim Breitner's avatar
      Refactor the story around switches (#10137) · de1160be
      Joachim Breitner authored
      This re-implements the code generation for case expressions at the Stg →
      Cmm level, both for data type cases as well as for integral literal
      cases. (Cases on float are still treated as before).
      
      The goal is to allow for fancier strategies in implementing them, for a
      cleaner separation of the strategy from the gritty details of Cmm, and
      to run this later than the Common Block Optimization, allowing for one
      way to attack #10124. The new module CmmSwitch contains a number of
      notes explaining this changes. For example, it creates larger
      consecutive jump tables than the previous code, if possible.
      
      nofib shows little significant overall improvement of runtime. The
      rather large wobbling comes from changes in the code block order
      (see #8082, not much we can do about it). But the decrease in code size
      alone makes this worthwhile.
      
      ```
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
                  Min          -1.8%      0.0%     -6.1%     -6.1%     -2.9%
                  Max          -0.7%     +0.0%     +5.6%     +5.7%     +7.8%
       Geometric Mean          -1.4%     -0.0%     -0.3%     -0.3%     +0.0%
      ```
      
      Compilation time increases slightly:
      ```
              -1 s.d.                -----            -2.0%
              +1 s.d.                -----            +2.5%
              Average                -----            +0.3%
      ```
      
      The test case T783 regresses a lot, but it is the only one exhibiting
      any regression. The cause is the changed order of branches in an
      if-then-else tree, which makes the hoople data flow analysis traverse
      the blocks in a suboptimal order. Reverting that gets rid of this
      regression, but has a consistent, if only very small (+0.2%), negative
      effect on runtime. So I conclude that this test is an extreme outlier
      and no reason to change the code.
      
      Differential Revision: https://phabricator.haskell.org/D720
      de1160be
  9. 23 Dec, 2014 1 commit
    • Simon Peyton Jones's avatar
      Eliminate so-called "silent superclass parameters" · a6f0f5ab
      Simon Peyton Jones authored
      The purpose of silent superclass parameters was to solve the
      awkward problem of superclass dictinaries being bound to bottom.
      See THE PROBLEM in Note [Recursive superclasses] in TcInstDcls
      
      Although the silent-superclass idea worked,
      
        * It had non-local consequences, and had effects even in Haddock,
          where we had to discard silent parameters before displaying
          instance declarations
      
        * It had unexpected peformance costs, shown up by Trac #3064 and its
          test case.  In monad-transformer code, when constructing a Monad
          dictionary you had to pass an Applicative dictionary; and to
          construct that you neede a Functor dictionary. Yet these extra
          dictionaries were often never used.  (All this got much worse when
          we added Applicative as a superclass of Monad.) Test T3064
          compiled *far* faster after silent superclasses were eliminated.
      
        * It introduced new bugs.  For example SilentParametersOverlapping,
          T5051, and T7862, all failed to compile because of instance overlap
          directly because of the silent-superclass trick.
      
      So this patch takes a new approach, which I worked out with Dimitrios
      in the closing hours before Christmas.  It is described in detail
      in THE PROBLEM in Note [Recursive superclasses] in TcInstDcls.
      
      Seems to work great!
      
      Quite a bit of knock-on effect
      
       * The main implementation work is in tcSuperClasses in TcInstDcls
         Everything else is fall-out
      
       * IdInfo.DFunId no longer needs its n-silent argument
         * Ditto IDFunId in IfaceSyn
         * Hence interface file format changes
      
       * Now that DFunIds do not have silent superclass parameters, printing
         out instance declarations is simpler. There is tiny knock-on effect
         in Haddock, so that submodule is updated
      
       * I realised that when computing the "size of a dictionary type"
         in TcValidity.sizePred, we should be rather conservative about
         type functions, which can arbitrarily increase the size of a type.
         Hence the new datatype TypeSize, which has a TSBig constructor for
         "arbitrarily big".
      
       * instDFunType moves from TcSMonad to Inst
      
       * Interestingly, CmmNode and CmmExpr both now need a non-silent
         (Ord r) in a couple of instance declarations. These were previously
         silent but must now be explicit.
      
       * Quite a bit of wibbling in error messages
      a6f0f5ab
  10. 19 Dec, 2014 1 commit
    • Peter Wortmann's avatar
      Some Dwarf generation fixes · f85db756
      Peter Wortmann authored
      - Make abbrev offset absolute on Non-Mac systems
      - Add another termination byte at the end of the abbrev section
        (readelf complains)
      - Scope combination was wrong for the simpler cases
      - Shouldn't have a "global/" in front of all scopes
      f85db756
  11. 16 Dec, 2014 3 commits
    • Peter Wortmann's avatar
      Add unwind information to Cmm · 711a51ad
      Peter Wortmann authored
      Unwind information allows the debugger to discover more information
      about a program state, by allowing it to "reconstruct" other states of
      the program. In practice, this means that we explain to the debugger
      how to unravel stack frames, which comes down mostly to explaining how
      to find their Sp and Ip register values.
      
      * We declare yet another new constructor for CmmNode - and this time
        there's actually little choice, as unwind information can and will
        change mid-block. We don't actually make use of these capabilities,
        and back-end support would be tricky (generate new labels?), but it
        feels like the right way to do it.
      
      * Even though we only use it for Sp so far, we allow CmmUnwind to specify
        unwind information for any register. This is pretty cheap and could
        come in useful in future.
      
      * We allow full CmmExpr expressions for specifying unwind values. The
        advantage here is that we don't have to make up new syntax, and can e.g.
        use the WDS macro directly. On the other hand, the back-end will now
        have to simplify the expression until it can sensibly be converted
        into DWARF byte code - a process which might fail, yielding NCG panics.
        On the other hand, when you're writing Cmm by hand you really ought to
        know what you're doing.
      
      (From Phabricator D169)
      711a51ad
    • Peter Wortmann's avatar
      Tick scopes · 5fecd767
      Peter Wortmann authored
      This patch solves the scoping problem of CmmTick nodes: If we just put
      CmmTicks into blocks we have no idea what exactly they are meant to
      cover.  Here we introduce tick scopes, which allow us to create
      sub-scopes and merged scopes easily.
      
      Notes:
      
      * Given that the code often passes Cmm around "head-less", we have to
        make sure that its intended scope does not get lost. To keep the amount
        of passing-around to a minimum we define a CmmAGraphScoped type synonym
        here that just bundles the scope with a portion of Cmm to be assembled
        later.
      
      * We introduce new scopes at somewhat random places, aligning with
        getCode calls. This works surprisingly well, but we might have to
        add new scopes into the mix later on if we find things too be too
        coarse-grained.
      
      (From Phabricator D169)
      5fecd767
    • Peter Wortmann's avatar
      Source notes (Cmm support) · 7ceaf96f
      Peter Wortmann authored
      This patch adds CmmTick nodes to Cmm code. This is relatively
      straight-forward, but also not very useful, as many blocks will simply
      end up with no annotations whatosever.
      
      Notes:
      
      * We use this design over, say, putting ticks into the entry node of all
        blocks, as it seems to work better alongside existing optimisations.
        Now granted, the reason for this is that currently GHC's main Cmm
        optimisations seem to mainly reorganize and merge code, so this might
        change in the future.
      
      * We have the Cmm parser generate a few source notes as well. This is
        relatively easy to do - worst part is that it complicates the CmmParse
        implementation a bit.
      
      (From Phabricator D169)
      7ceaf96f
  12. 15 May, 2014 1 commit
    • Herbert Valerio Riedel's avatar
      Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized, while following the convention, to
      
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      
      - Moreover, if the list of language extensions fit into a single
        `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
      
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
      23892440
  13. 18 Oct, 2013 1 commit
  14. 04 Sep, 2013 1 commit
  15. 03 Sep, 2013 1 commit
  16. 24 Jul, 2013 1 commit
    • Simon Marlow's avatar
      Fix a bug in stack layout with safe foreign calls (#8083) · c2348859
      Simon Marlow authored
      We weren't properly tracking the number of stack arguments in the
      continuation of a foreign call.  It happened to work when the
      continuation was not a join point, but when it was a join point we
      were using the wrong amount of stack fixup.
      c2348859
  17. 14 Apr, 2013 1 commit
  18. 06 Apr, 2013 1 commit
  19. 05 Mar, 2013 1 commit
  20. 12 Nov, 2012 1 commit
    • Simon Marlow's avatar
      Remove OldCmm, convert backends to consume new Cmm · d92bd17f
      Simon Marlow authored
      This removes the OldCmm data type and the CmmCvt pass that converts
      new Cmm to OldCmm.  The backends (NCGs, LLVM and C) have all been
      converted to consume new Cmm.
      
      The main difference between the two data types is that conditional
      branches in new Cmm have both true/false successors, whereas in OldCmm
      the false case was a fallthrough.  To generate slightly better code we
      occasionally need to invert a conditional to ensure that the
      branch-not-taken becomes a fallthrough; this was previously done in
      CmmCvt, and it is now done in CmmContFlowOpt.
      
      We could go further and use the Hoopl Block representation for native
      code, which would mean that we could use Hoopl's postorderDfs and
      analyses for native code, but for now I've left it as is, using the
      old ListGraph representation for native code.
      d92bd17f
  21. 30 Oct, 2012 1 commit
  22. 08 Oct, 2012 1 commit
    • Simon Marlow's avatar
      Produce new-style Cmm from the Cmm parser · a7c0387d
      Simon Marlow authored
      The main change here is that the Cmm parser now allows high-level cmm
      code with argument-passing and function calls.  For example:
      
      foo ( gcptr a, bits32 b )
      {
        if (b > 0) {
           // we can make tail calls passing arguments:
           jump stg_ap_0_fast(a);
        }
      
        return (x,y);
      }
      
      More details on the new cmm syntax are in Note [Syntax of .cmm files]
      in CmmParse.y.
      
      The old syntax is still more-or-less supported for those occasional
      code fragments that really need to explicitly manipulate the stack.
      However there are a couple of differences: it is now obligatory to
      give a list of live GlobalRegs on every jump, e.g.
      
        jump %ENTRY_CODE(Sp(0)) [R1];
      
      Again, more details in Note [Syntax of .cmm files].
      
      I have rewritten most of the .cmm files in the RTS into the new
      syntax, except for AutoApply.cmm which is generated by the genapply
      program: this file could be generated in the new syntax instead and
      would probably be better off for it, but I ran out of enthusiasm.
      
      Some other changes in this batch:
      
       - The PrimOp calling convention is gone, primops now use the ordinary
         NativeNodeCall convention.  This means that primops and "foreign
         import prim" code must be written in high-level cmm, but they can
         now take more than 10 arguments.
      
       - CmmSink now does constant-folding (should fix #7219)
      
       - .cmm files now go through the cmmPipeline, and as a result we
         generate better code in many cases.  All the object files generated
         for the RTS .cmm files are now smaller.  Performance should be
         better too, but I haven't measured it yet.
      
       - RET_DYN frames are removed from the RTS, lots of code goes away
      
       - we now have some more canned GC points to cover unboxed-tuples with
         2-4 pointers, which will reduce code size a little.
      a7c0387d
  23. 31 Aug, 2012 1 commit
    • Simon Marlow's avatar
      Fix a bug in foldExpDeep · 6dd55e8a
      Simon Marlow authored
      This caused the CAF analysis to occasionally miss a CAF sometimes,
      resulting in a very hard to diagnose crash.
      6dd55e8a
  24. 06 Aug, 2012 1 commit
  25. 20 Jul, 2012 1 commit
  26. 09 Jul, 2012 1 commit
  27. 05 Jul, 2012 2 commits
  28. 15 Mar, 2012 2 commits
  29. 13 Feb, 2012 1 commit
  30. 08 Feb, 2012 1 commit
    • Simon Marlow's avatar
      New stack layout algorithm · 76999b60
      Simon Marlow authored
      Also:
       - improvements to code generation: push slow-call continuations
         on the stack instead of generating explicit continuations
      
       - remove unused CmmInfo wrapper type (replace with CmmInfoTable)
      
       - squash Area and AreaId together, remove now-unused RegSlot
      
       - comment out old unused stack-allocation code that no longer
         compiles after removal of RegSlot
      76999b60
  31. 03 Feb, 2012 1 commit
  32. 30 Jan, 2012 1 commit
  33. 25 Jan, 2012 1 commit
  34. 23 Jan, 2012 1 commit
  35. 18 Jan, 2012 1 commit