1. 09 Mar, 2013 1 commit
  2. 19 Nov, 2012 1 commit
      Code-size optimisation for top-level indirections (#7308) · 7da13762
      Simon Marlow authored
      Top-level indirections are often generated when there is a cast, e.g.
      
      foo :: T
      foo = bar `cast` (some coercion)
      
      For these we were generating a full-blown CAF, which is a fair chunk
      of code.
      
      This patch makes these indirections generate a single IND_STATIC
      closure (4 words) instead.  This is exactly what the CAF would
      evaluate to eventually anyway; we're just shortcutting the whole
      process.
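As an illustration of where such a cast can come from, here is a minimal sketch. The names `Age`, `bar`, and `foo` are hypothetical and do not come from the patch. A newtype constructor has no runtime representation, so a top-level binding that only wraps another top-level value is, in Core, roughly that value under a cast.

```haskell
-- Hypothetical example, not taken from the patch.
-- A newtype shares the runtime representation of the type it wraps,
-- so applying its constructor is compiled to a coercion (a cast) in Core.
newtype Age = Age Int
  deriving (Eq, Show)

bar :: Int
bar = 42

-- In Core this is roughly: foo = bar `cast` (coercion from Int to Age).
-- The exact Core depends on optimisation settings and GHC version.
foo :: Age
foo = Age bar
```

Loading this in GHCi and evaluating `foo` shows the wrapped value. Compiling with `-ddump-simpl` prints the Core, where the cast may be visible depending on the optimisation level.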
  3. 13 Nov, 2012 1 commit
      Fix the Slow calling convention (#7192) · 4270d7e7
      Simon Marlow authored
      The Slow calling convention passes the closure in R1, but we were
      ignoring this and hoping it would work, which it often did.  However,
      this bug seems to have been the cause of #7192, because the
      graph-colouring allocator is more sensitive to having correct liveness
      information on jumps.
  4. 30 Oct, 2012 1 commit
  5. 16 Oct, 2012 1 commit
      Some alpha renaming · cd33eefd
      ian@well-typed.com authored
      Mostly d -> g (matching DynFlag -> GeneralFlag).
      Also renamed if* to when*, matching the Haskell if/when names
  6. 08 Oct, 2012 1 commit
      Produce new-style Cmm from the Cmm parser · a7c0387d
      Simon Marlow authored
      The main change here is that the Cmm parser now allows high-level cmm
      code with argument-passing and function calls.  For example:
      
      foo ( gcptr a, bits32 b )
      {
        if (b > 0) {
           // we can make tail calls passing arguments:
           jump stg_ap_0_fast(a);
        }
      
        return (x,y);
      }
      
      More details on the new cmm syntax are in Note [Syntax of .cmm files]
      in CmmParse.y.
      
      The old syntax is still more-or-less supported for those occasional
      code fragments that really need to explicitly manipulate the stack.
      However there are a couple of differences: it is now obligatory to
      give a list of live GlobalRegs on every jump, e.g.
      
        jump %ENTRY_CODE(Sp(0)) [R1];
      
      Again, more details in Note [Syntax of .cmm files].
      
      I have rewritten most of the .cmm files in the RTS into the new
      syntax, except for AutoApply.cmm which is generated by the genapply
      program: this file could be generated in the new syntax instead and
      would probably be better off for it, but I ran out of enthusiasm.
      
      Some other changes in this batch:
      
       - The PrimOp calling convention is gone; primops now use the ordinary
         NativeNodeCall convention.  This means that primops and "foreign
         import prim" code must be written in high-level cmm, but they can
         now take more than 10 arguments.
      
       - CmmSink now does constant-folding (should fix #7219)
      
       - .cmm files now go through the cmmPipeline, and as a result we
         generate better code in many cases.  All the object files generated
         for the RTS .cmm files are now smaller.  Performance should be
         better too, but I haven't measured it yet.
      
       - RET_DYN frames are removed from the RTS, so a lot of code goes away
      
       - we now have some more canned GC points to cover unboxed-tuples with
         2-4 pointers, which will reduce code size a little.
  7. 26 Sep, 2012 1 commit
      Partially fix #367 by adding HpLim checks to entry with -fno-omit-yields. · d3128bfc
      Edward Z. Yang authored
      The current fix is relatively dumb as far as where to add HpLim
      checks: it will always perform a check unless we know that we're
      returning from a closure or we are doing a non let-no-escape case
      analysis.  The performance impact on the nofib suite looks like this:
      
                  Min          +5.7%     -0.0%     -6.5%     -6.4%    -50.0%
                  Max          +6.3%     +5.8%     +5.0%     +5.5%     +0.8%
       Geometric Mean          +6.2%     +0.1%     +0.5%     +0.5%     -0.8%
      
      Overall, the executable bloat is the biggest problem, so we keep the old
      omit-yields optimization on by default. Remember that if you need an
      interruptibility guarantee, you need to recompile all of your libraries
      with -fno-omit-yields.
      
      A better fix would involve only inserting the yields necessary to break
      loops; this is left as future work.
      Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
  8. 16 Sep, 2012 1 commit
  9. 12 Sep, 2012 2 commits
  10. 31 Aug, 2012 1 commit
  11. 07 Aug, 2012 3 commits
  12. 02 Aug, 2012 1 commit
      Explicitly share some return continuations · 6ede0067
      Simon Marlow authored
      Instead of relying on common-block-elimination to share return
      continuations in the common case (case-alternative heap checks) we do
      it explicitly.  This isn't hard to do, is more robust, and saves some
      compilation time.  Full commentary in Note [sharing continuations].
  13. 30 Jul, 2012 1 commit
  14. 24 Jul, 2012 1 commit
  15. 11 Jul, 2012 2 commits
  16. 06 Jul, 2012 1 commit
  17. 04 Jul, 2012 1 commit
  18. 13 Jun, 2012 1 commit
  19. 05 Jun, 2012 1 commit
  20. 07 Mar, 2012 1 commit
      Improve the case-alternative heap checks · 65256948
      Simon Marlow authored
      The code we were generating for heap-checks in algebraic case
      alternatives wasn't working well with the common-block eliminator.  A
      small tweak to make the heap-check failure jump back to the same place
      in all branches lets the common-block eliminator squash more code.
  21. 14 Feb, 2012 1 commit
      Fix an SRT-related bug · b8172ba1
      Simon Marlow authored
      We were using the SRT information generated by the computeSRTs pass to
      decide whether to add a static link field to a constructor or not, and
      this broke when I disabled computeSRTs for the new code generator.  So
      I've hacked it for now to only rely on the SRT information generated
      by CoreToStg.
  22. 08 Feb, 2012 1 commit
      New stack layout algorithm · 76999b60
      Simon Marlow authored
      Also:
       - improvements to code generation: push slow-call continuations
         on the stack instead of generating explicit continuations
      
       - remove unused CmmInfo wrapper type (replace with CmmInfoTable)
      
       - squash Area and AreaId together, remove now-unused RegSlot
      
       - comment out old unused stack-allocation code that no longer
         compiles after removal of RegSlot
  23. 25 Jan, 2012 1 commit
  24. 02 Nov, 2011 1 commit
      Overhaul of infrastructure for profiling, coverage (HPC) and breakpoints · 7bb0447d
      Simon Marlow authored
      User visible changes
      ====================
      
      Profiling
      ---------
      
      Flags renamed (the old ones are still accepted for now):
      
        OLD            NEW
        ---------      ------------
        -auto-all      -fprof-auto
        -auto          -fprof-exported
        -caf-all       -fprof-cafs
      
      New flags:
      
        -fprof-auto              Annotates all bindings (not just top-level
                                 ones) with SCCs
      
        -fprof-top               Annotates just top-level bindings with SCCs
      
        -fprof-exported          Annotates just exported bindings with SCCs
      
        -fprof-no-count-entries  Do not maintain entry counts when profiling
                                 (can make profiled code go faster; useful with
                                 heap profiling where entry counts are not used)
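A minimal sketch of what these flags act on, assuming hypothetical function names `mean` and `sumSquares`: a cost centre can be attached by hand with an `SCC` pragma, whereas the `-fprof-*` flags above ask the compiler to add such annotations to bindings automatically.

```haskell
-- Hypothetical example functions, chosen only for illustration.

-- Manual cost centre: the SCC pragma names the parenthesised expression.
-- Without -prof the pragma is ignored and the code behaves identically.
mean :: [Double] -> Double
mean xs = {-# SCC "mean" #-} (sum xs / fromIntegral (length xs))

-- No pragma here: compiling with -prof -fprof-auto (or -fprof-top, since
-- this is a top-level binding) lets the compiler insert the SCC itself.
sumSquares :: [Double] -> Double
sumSquares xs = sum (map (\x -> x * x) xs)
```

Built with `-prof -fprof-auto` and run with `+RTS -p`, a program using these functions should list both as cost centres in the generated profile; the exact report layout depends on the GHC version.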
      
      Cost-centre stacks have a new semantics, which should in most cases
      result in more useful and intuitive profiles.  If you find this not to
      be the case, please let me know.  This is the area where I have been
      experimenting most, and the current solution is probably not the
      final version; however, it does address all the outstanding bugs and
      seems to be better than GHC 7.2.
      
      Stack traces
      ------------
      
      +RTS -xc now gives more information.  If the exception originates from
      a CAF (as is common, because GHC tends to lift exceptions out to the
      top-level), then the RTS walks up the stack and reports the stack in
      the enclosing update frame(s).
      
      Result: +RTS -xc is much more useful now - but you still have to
      compile for profiling to get it.  I've played around a little with
      adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem
      quite accurately.
      
      I plan to add more facilities for stack tracing (e.g. in GHCi) in the
      future.
      
      Coverage (HPC)
      --------------
      
       * derived instances are now coloured yellow if they weren't used
       * likewise record field names
       * entry counts are more accurate (hpc --fun-entry-count)
       * tab width is now correct (markup was previously off in source with
         tabs)
      
      Internal changes
      ================
      
      In Core, the Note constructor has been replaced by
      
              Tick (Tickish b) (Expr b)
      
      which is used to represent all the kinds of source annotation we
      support: profiling SCCs, HPC ticks, and GHCi breakpoints.
      
      Depending on the properties of the Tickish, different transformations
      apply to Tick.  See CoreUtils.mkTick for details.
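The shape of this representation can be sketched with a toy model. This is an assumption-laden simplification, not GHC's actual definitions: the binder type parameter `b` is dropped, the constructor fields are invented for illustration, and `stripTicks` and `ticksOf` are hypothetical helpers.

```haskell
-- Toy stand-ins for GHC's Core types; the real definitions carry far more.
data Tickish
  = ProfNote String      -- profiling SCC (cost-centre name, simplified)
  | HpcTick Int          -- HPC coverage tick (tick index, simplified)
  | Breakpoint Int       -- GHCi breakpoint (breakpoint index, simplified)
  deriving (Eq, Show)

data Expr
  = Var String
  | App Expr Expr
  | Tick Tickish Expr    -- a source annotation wrapped around an expression
  deriving (Eq, Show)

-- Remove every annotation, leaving the underlying expression.
stripTicks :: Expr -> Expr
stripTicks (Tick _ e) = stripTicks e
stripTicks (App f x)  = App (stripTicks f) (stripTicks x)
stripTicks e          = e

-- Collect the annotations in left-to-right order.
ticksOf :: Expr -> [Tickish]
ticksOf (Tick t e) = t : ticksOf e
ticksOf (App f x)  = ticksOf f ++ ticksOf x
ticksOf _          = []
```

Having a single `Tick` constructor means a traversal such as `stripTicks` handles all three kinds of annotation with one case, while code that cares about the kind can inspect the `Tickish` value.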
      
      Tickets
      =======
      
      This commit closes the following tickets (test cases to follow):
      
        - Close #2552: not a bug, but the behaviour is now more intuitive
          (test is T2552)
      
        - Close #680 (test is T680)
      
        - Close #1531 (test is result001)
      
        - Close #949 (test is T949)
      
        - Close #2466: test case has bitrotted (doesn't compile against current
          version of vector-space package)
  25. 02 Oct, 2011 1 commit
  26. 25 Aug, 2011 4 commits
  27. 24 Jan, 2011 1 commit
      Merge in new code generator branch. · 889c084e
      Simon Marlow authored
      This changes the new code generator to make use of the Hoopl package
      for dataflow analysis.  Hoopl is a new boot package, and is maintained
      in a separate upstream git repository (as usual, GHC has its own
      lagging darcs mirror in http://darcs.haskell.org/packages/hoopl).
      
      During this merge I squashed recent history into one patch.  I tried
      to rebase, but the history had some internal conflicts of its own
      which made rebase extremely confusing, so I gave up. The history I
      squashed was:
      
        - Update new codegen to work with latest Hoopl
        - Add some notes on new code gen to cmm-notes
        - Enable Hoopl lag package.
        - Add SPJ note to cmm-notes
        - Improve GC calls on new code generator.
      
      Work in this branch was done by:
         - Milan Straka <fox@ucw.cz>
         - John Dias <dias@cs.tufts.edu>
         - David Terei <davidterei@gmail.com>
      
      Edward Z. Yang <ezyang@mit.edu> merged in further changes from GHC HEAD
      and fixed a few bugs.
  28. 06 Nov, 2009 2 commits
      validate fixes · 374a85ae
      Ben.Lippmeier@anu.edu.au authored
      * Refactor CLabel.RtsLabel to CLabel.CmmLabel · a02e7f40
      Ben.Lippmeier@anu.edu.au authored
      The type of the CmmLabel ctor is now
        CmmLabel :: PackageId -> FastString -> CmmLabelInfo -> CLabel
        
       - When you construct a CmmLabel you have to explicitly say what
         package it is in. Many of these will just use rtsPackageId, but
         I've left it this way to remind people not to pretend labels are
         in the RTS package when they're not. 
         
       - When parsing a Cmm file, labels that are not defined in the 
         current file are assumed to be in the RTS package. 
         
         Labels imported like
            import label
         are assumed to be in a generic "foreign" package, which is different
         from the current one.
         
         Labels imported like
            import "package-name" label
         are marked as coming from the named package.
         
         This last one is needed for the integer-gmp library as we want to
         refer to labels that are not in the same compilation unit, but
         are in the same non-rts package.
         
         This should help remove the nasty #ifdef __PIC__ stuff from
         integer-gmp/cbits/gmp-wrappers.cmm
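The package-assignment rules described above can be summarised in a toy model. All types and names here (`LabelPackage`, `Origin`, `packageOf`) are hypothetical and are not GHC's `CLabel` API; the `DefinedHere` case is an assumption, since the message only spells out the rules for labels that are not defined in the current file.

```haskell
-- Hypothetical model of which package the Cmm parser assigns to a label.
data LabelPackage
  = ThisPackage          -- assumed: label defined in the file being parsed
  | RtsPackage           -- not defined here and not imported
  | ForeignPackage       -- `import label`: some package other than this one
  | NamedPackage String  -- `import "package-name" label`
  deriving (Eq, Show)

-- How a label was introduced in the .cmm file, if at all.
data Origin
  = DefinedHere
  | ImportedPlain
  | ImportedFrom String
  | NotMentioned
  deriving (Eq, Show)

-- Map the way a label was introduced to the package it is assumed to be in.
packageOf :: Origin -> LabelPackage
packageOf DefinedHere        = ThisPackage
packageOf ImportedPlain      = ForeignPackage
packageOf (ImportedFrom pkg) = NamedPackage pkg
packageOf NotMentioned       = RtsPackage
```

For instance, `packageOf (ImportedFrom "integer-gmp")` yields `NamedPackage "integer-gmp"`, which corresponds to the case the message says is needed for labels in the same non-RTS package but a different compilation unit.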
         
  29. 18 Oct, 2009 1 commit
  30. 07 Jul, 2009 1 commit
  31. 23 Mar, 2009 2 commits