1. 30 Oct, 2012 2 commits
  2. 19 Oct, 2012 1 commit
    • Simon Marlow's avatar
      Remove the old codegen · 6fbd46b0
      Simon Marlow authored
      Except for CgUtils.fixStgRegisters that is used in the NCG and LLVM
      backends, and should probably be moved somewhere else.
      6fbd46b0
  3. 16 Oct, 2012 1 commit
    • ian@well-typed.com's avatar
      Some alpha renaming · cd33eefd
      ian@well-typed.com authored
      Mostly d -> g (matching DynFlag -> GeneralFlag).
      Also renamed if* to when*, matching the Haskell if/when names
      cd33eefd
  4. 08 Oct, 2012 2 commits
    • Simon Marlow's avatar
      untab · a94144b8
      Simon Marlow authored
      a94144b8
    • Simon Marlow's avatar
      Produce new-style Cmm from the Cmm parser · a7c0387d
      Simon Marlow authored
      The main change here is that the Cmm parser now allows high-level cmm
      code with argument-passing and function calls.  For example:
      
      foo ( gcptr a, bits32 b )
      {
        if (b > 0) {
           // we can make tail calls passing arguments:
           jump stg_ap_0_fast(a);
        }
      
        return (x,y);
      }
      
      More details on the new cmm syntax are in Note [Syntax of .cmm files]
      in CmmParse.y.
      
      The old syntax is still more-or-less supported for those occasional
      code fragments that really need to explicitly manipulate the stack.
      However there are a couple of differences: it is now obligatory to
      give a list of live GlobalRegs on every jump, e.g.
      
        jump %ENTRY_CODE(Sp(0)) [R1];
      
      Again, more details in Note [Syntax of .cmm files].
      
      I have rewritten most of the .cmm files in the RTS into the new
      syntax, except for AutoApply.cmm which is generated by the genapply
      program: this file could be generated in the new syntax instead and
      would probably be better off for it, but I ran out of enthusiasm.
      
      Some other changes in this batch:
      
       - The PrimOp calling convention is gone, primops now use the ordinary
         NativeNodeCall convention.  This means that primops and "foreign
         import prim" code must be written in high-level cmm, but they can
         now take more than 10 arguments.
      
       - CmmSink now does constant-folding (should fix #7219)
      
       - .cmm files now go through the cmmPipeline, and as a result we
         generate better code in many cases.  All the object files generated
         for the RTS .cmm files are now smaller.  Performance should be
         better too, but I haven't measured it yet.
      
       - RET_DYN frames are removed from the RTS, lots of code goes away
      
       - we now have some more canned GC points to cover unboxed-tuples with
         2-4 pointers, which will reduce code size a little.
      a7c0387d
  5. 18 Sep, 2012 2 commits
  6. 16 Sep, 2012 1 commit
  7. 12 Sep, 2012 3 commits
  8. 28 Aug, 2012 2 commits
  9. 06 Aug, 2012 1 commit
  10. 30 Jul, 2012 1 commit
    • Simon Marlow's avatar
      New codegen: do not split proc-points when using the NCG · f1ed6a10
      Simon Marlow authored
      Proc-point splitting is only required by backends that do not support
      having proc-points within a code block (that is, everything except the
      native backend, i.e. LLVM and C).
      
      Not doing proc-point splitting saves some compilation time, and might
      produce slightly better code in some cases.
      f1ed6a10
  11. 24 Jul, 2012 1 commit
  12. 13 Jun, 2012 1 commit
    • Ian Lynagh's avatar
      Remove PlatformOutputable · d06edb8e
      Ian Lynagh authored
      We can now get the Platform from the DynFlags inside an SDoc, so we
      no longer need to pass the Platform in.
      d06edb8e
  13. 12 Jun, 2012 1 commit
  14. 23 Feb, 2012 1 commit
  15. 08 Feb, 2012 1 commit
    • Simon Marlow's avatar
      New stack layout algorithm · 76999b60
      Simon Marlow authored
      Also:
       - improvements to code generation: push slow-call continuations
         on the stack instead of generating explicit continuations
      
       - remove unused CmmInfo wrapper type (replace with CmmInfoTable)
      
       - squash Area and AreaId together, remove now-unused RegSlot
      
       - comment out old unused stack-allocation code that no longer
         compiles after removal of RegSlot
      76999b60
  16. 27 Jan, 2012 1 commit
  17. 10 Jan, 2012 1 commit
    • dterei's avatar
      Track STG live register information for use in LLVM · 4384e146
      dterei authored
      We now carry around with CmmJump statements a list of
      the STG registers that are live at that jump site.
      This is used by the LLVM backend so it can avoid
      unnesecarily passing around dead registers, improving
      perfromance. This gives us the framework to finally
      fix trac #4308.
      4384e146
  18. 06 Jan, 2012 2 commits
  19. 19 Dec, 2011 1 commit
    • Ian Lynagh's avatar
      Add a class HasDynFlags(getDynFlags) · 06c6d970
      Ian Lynagh authored
      We no longer have many separate, clashing getDynFlags functions
      
      I've given each GhcMonad its own HasDynFlags instance, rather than
      using UndecidableInstances to make a GhcMonad m => HasDynFlags m
      instance.
      06c6d970
  20. 29 Nov, 2011 2 commits
    • Simon Marlow's avatar
      Make profiling work with multiple capabilities (+RTS -N) · 50de6034
      Simon Marlow authored
      This means that both time and heap profiling work for parallel
      programs.  Main internal changes:
      
        - CCCS is no longer a global variable; it is now another
          pseudo-register in the StgRegTable struct.  Thus every
          Capability has its own CCCS.
      
        - There is a new built-in CCS called "IDLE", which records ticks for
          Capabilities in the idle state.  If you profile a single-threaded
          program with +RTS -N2, you'll see about 50% of time in "IDLE".
      
        - There is appropriate locking in rts/Profiling.c to protect the
          shared cost-centre-stack data structures.
      
      This patch does enough to get it working, I have cut one big corner:
      the cost-centre-stack data structure is still shared amongst all
      Capabilities, which means that multiple Capabilities will race when
      updating the "allocations" and "entries" fields of a CCS.  Not only
      does this give unpredictable results, but it runs very slowly due to
      cache line bouncing.
      
      It is strongly recommended that you use -fno-prof-count-entries to
      disable the "entries" count when profiling parallel programs. (I shall
      add a note to this effect to the docs).
      50de6034
    • Simon Marlow's avatar
      Get rid of the "safety" field of CmmCall (OldCmm) · cbe24168
      Simon Marlow authored
      This field was doing nothing.  I think it originally appeared in a
      very old incarnation of the new code generator.
      cbe24168
  21. 02 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Overhaul of infrastructure for profiling, coverage (HPC) and breakpoints · 7bb0447d
      Simon Marlow authored
      User visible changes
      ====================
      
      Profilng
      --------
      
      Flags renamed (the old ones are still accepted for now):
      
        OLD            NEW
        ---------      ------------
        -auto-all      -fprof-auto
        -auto          -fprof-exported
        -caf-all       -fprof-cafs
      
      New flags:
      
        -fprof-auto              Annotates all bindings (not just top-level
                                 ones) with SCCs
      
        -fprof-top               Annotates just top-level bindings with SCCs
      
        -fprof-exported          Annotates just exported bindings with SCCs
      
        -fprof-no-count-entries  Do not maintain entry counts when profiling
                                 (can make profiled code go faster; useful with
                                 heap profiling where entry counts are not used)
      
      Cost-centre stacks have a new semantics, which should in most cases
      result in more useful and intuitive profiles.  If you find this not to
      be the case, please let me know.  This is the area where I have been
      experimenting most, and the current solution is probably not the
      final version, however it does address all the outstanding bugs and
      seems to be better than GHC 7.2.
      
      Stack traces
      ------------
      
      +RTS -xc now gives more information.  If the exception originates from
      a CAF (as is common, because GHC tends to lift exceptions out to the
      top-level), then the RTS walks up the stack and reports the stack in
      the enclosing update frame(s).
      
      Result: +RTS -xc is much more useful now - but you still have to
      compile for profiling to get it.  I've played around a little with
      adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem
      quite accurately.
      
      I plan to add more facilities for stack tracing (e.g. in GHCi) in the
      future.
      
      Coverage (HPC)
      --------------
      
       * derived instances are now coloured yellow if they weren't used
       * likewise record field names
       * entry counts are more accurate (hpc --fun-entry-count)
       * tab width is now correct (markup was previously off in source with
         tabs)
      
      Internal changes
      ================
      
      In Core, the Note constructor has been replaced by
      
              Tick (Tickish b) (Expr b)
      
      which is used to represent all the kinds of source annotation we
      support: profiling SCCs, HPC ticks, and GHCi breakpoints.
      
      Depending on the properties of the Tickish, different transformations
      apply to Tick.  See CoreUtils.mkTick for details.
      
      Tickets
      =======
      
      This commit closes the following tickets, test cases to follow:
      
        - Close #2552: not a bug, but the behaviour is now more intuitive
          (test is T2552)
      
        - Close #680 (test is T680)
      
        - Close #1531 (test is result001)
      
        - Close #949 (test is T949)
      
        - Close #2466: test case has bitrotted (doesn't compile against current
          version of vector-space package)
      7bb0447d
  22. 25 Aug, 2011 3 commits
  23. 28 Jul, 2011 1 commit
  24. 15 Jul, 2011 1 commit
    • Ian Lynagh's avatar
      More work towards cross-compilation · f07af788
      Ian Lynagh authored
      There's now a variant of the Outputable class that knows what
      platform we're targetting:
      
      class PlatformOutputable a where
          pprPlatform :: Platform -> a -> SDoc
          pprPlatformPrec :: Platform -> Rational -> a -> SDoc
      
      and various instances have had to be converted to use that class,
      and we pass Platform around accordingly.
      f07af788
  25. 13 Jul, 2011 1 commit
  26. 07 Jul, 2011 1 commit
  27. 05 Jul, 2011 2 commits
    • batterseapower's avatar
    • batterseapower's avatar
      Refactoring: use a structured CmmStatics type rather than [CmmStatic] · 54843b5b
      batterseapower authored
      I observed that the [CmmStatics] within CmmData uses the list in a very stylised way.
      The first item in the list is almost invariably a CmmDataLabel. Many parts of the
      compiler pattern match on this list and fail if this is not true.
      
      This patch makes the invariant explicit by introducing a structured type CmmStatics
      that holds the label and the list of remaining [CmmStatic].
      
      There is one wrinkle: the x86 backend sometimes wants to output an alignment directive just
      before the label. However, this can be easily fixed up by parameterising the native codegen
      over the type of CmmStatics (though the GenCmmTop parameterisation) and using a pair
      (Alignment, CmmStatics) there instead.
      
      As a result, I think we will be able to remove CmmAlign and CmmDataLabel from the CmmStatic
      data type, thus nuking a lot of code and failing pattern matches. This change will come as part
      of my next patch.
      54843b5b
  28. 09 Jun, 2011 1 commit
    • Ian Lynagh's avatar
      Refactor SrcLoc and SrcSpan · b2bd63f9
      Ian Lynagh authored
      The "Unhelpful" cases are now in a separate type. This allows us to
      improve various things, e.g.:
      * Most of the panic's in SrcLoc are now gone
      * The Lexer now works with RealSrcSpans rather than SrcSpans, i.e. it
        knows that it has real locations and thus can assume that the line
        number etc really exists
      * Some of the more suspicious cases are no longer necessary, e.g.
        we no longer need this case in advanceSrcLoc:
            advanceSrcLoc loc _ = loc -- Better than nothing
      
      More improvements can probably be made, e.g. tick locations can
      probably use RealSrcSpans too.
      b2bd63f9
  29. 31 May, 2011 1 commit