1. 30 Jan, 2013 1 commit
  2. 17 Jan, 2013 1 commit
    • Simon Peyton Jones's avatar
      Major patch to implement the new Demand Analyser · 0831a12e
      Simon Peyton Jones authored
      This patch is the result of Ilya Sergey's internship at MSR.  It
      constitutes a thorough overhaul and simplification of the demand
      analyser.  It makes a solid foundation on which we can now build.
      Main changes are
      
      * Instead of having one combined type for Demand, a Demand is
         now a pair (JointDmd) of
            - a StrDmd and
            - an AbsDmd.
         This allows strictness and absence to be though about quite
         orthogonally, and greatly reduces brain melt-down.
      
      * Similarly in the DmdResult type, it's a pair of
           - a PureResult (indicating only divergence/non-divergence)
           - a CPRResult (which deals only with the CPR property
      
      * In IdInfo, the
          strictnessInfo field contains a StrictSig, not a Maybe StrictSig
          demandInfo     field contains a Demand, not a Maybe Demand
        We don't need Nothing (to indicate no strictness/demand info)
        any more; topSig/topDmd will do.
      
      * Remove "boxity" analysis entirely.  This was an attempt to
        avoid "reboxing", but it added complexity, is extremely
        ad-hoc, and makes very little difference in practice.
      
      * Remove the "unboxing strategy" computation. This was an an
        attempt to ensure that a worker didn't get zillions of
        arguments by unboxing big tuples.  But in fact removing it
        DRAMATICALLY reduces allocation in an inner loop of the
        I/O library (where the threshold argument-count had been
        set just too low).  It's exceptional to have a zillion arguments
        and I don't think it's worth the complexity, especially since
        it turned out to have a serious performance hit.
      
      * Remove quite a bit of ad-hoc cruft
      
      * Move worthSplittingFun, worthSplittingThunk from WorkWrap to
        Demand. This allows JointDmd to be fully abstract, examined
        only inside Demand.
      
      Everything else really follows from these changes.
      
      All of this is really just refactoring, so we don't expect
      big performance changes, but acutally the numbers look quite
      good.  Here is a full nofib run with some highlights identified:
      
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
               expert          -2.6%    -15.5%      0.00      0.00     +0.0%
                fluid          -2.4%     -7.1%      0.01      0.01     +0.0%
                   gg          -2.5%    -28.9%      0.02      0.02    -33.3%
            integrate          -2.6%     +3.2%     +2.6%     +2.6%     +0.0%
              mandel2          -2.6%     +4.2%      0.01      0.01     +0.0%
             nucleic2          -2.0%    -16.3%      0.11      0.11     +0.0%
                 para          -2.6%    -20.0%    -11.8%    -11.7%     +0.0%
               parser          -2.5%    -17.9%      0.05      0.05     +0.0%
               prolog          -2.6%    -13.0%      0.00      0.00     +0.0%
               puzzle          -2.6%     +2.2%     +0.8%     +0.8%     +0.0%
              sorting          -2.6%    -35.9%      0.00      0.00     +0.0%
             treejoin          -2.6%    -52.2%     -9.8%     -9.9%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -2.7%    -52.2%    -11.8%    -11.7%    -33.3%
                  Max          -1.8%     +4.2%    +10.5%    +10.5%     +7.7%
       Geometric Mean          -2.5%     -2.8%     -0.4%     -0.5%     -0.4%
      
      Things to note
      
      * Binary sizes are smaller. I don't know why, but it's good.
      
      * Allocation is sometiemes a *lot* smaller. I believe that all the big numbers
        (I checked treejoin, gg, sorting) arise from one place, namely a function
        GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of
        which have several arugments.  Not w/w'ing both arguments (which is what
        we did before) has a big effect.  So the big win in actually somewhat
        accidental, gained by removing the "unboxing strategy" code.
      
      * A couple of benchmarks allocate slightly more.  This turns out
        to be due to reboxing (integrate).  But the biggest increase is
        mandel2, and *that* turned out also to be a somewhat accidental
        loss of CSE, and pointed the way to doing better CSE: see Trac
        #7596.
      
      * Runtimes are never very reliable, but seem to improve very slightly.
      
      All in all, a good piece of work.  Thank you Ilya!
      0831a12e
  3. 18 Oct, 2012 1 commit
    • ian@well-typed.com's avatar
      Refactor the way dump flags are handled · d4a19643
      ian@well-typed.com authored
      We were being inconsistent about how we tested whether dump flags
      were enabled; in particular, sometimes we also checked the verbosity,
      and sometimes we didn't.
      
      This lead to oddities such as "ghc -v4" printing an "Asm code" section
      which didn't contain any code, and "-v4" enabled some parts of
      "-ddump-deriv" but not others.
      
      Now all the tests use dopt, which also takes the verbosity into account
      as appropriate.
      d4a19643
  4. 16 Oct, 2012 1 commit
    • ian@well-typed.com's avatar
      Some alpha renaming · cd33eefd
      ian@well-typed.com authored
      Mostly d -> g (matching DynFlag -> GeneralFlag).
      Also renamed if* to when*, matching the Haskell if/when names
      cd33eefd
  5. 17 Sep, 2012 1 commit
  6. 18 Jul, 2012 1 commit
  7. 13 Jul, 2012 1 commit
  8. 12 Jun, 2012 2 commits
  9. 11 Jun, 2012 2 commits
    • Ian Lynagh's avatar
      Pass DynFlags down to showSDocDump · 8685576a
      Ian Lynagh authored
      To help with this, we now also pass DynFlags around inside the SpecM
      monad.
      8685576a
    • Ian Lynagh's avatar
      Pass DynFlags to the LogAction · 5716a2f8
      Ian Lynagh authored
      A side-effect is that we can no longer use the LogAction in
      defaultErrorHandler, as we don't have DynFlags at that point.
      But all that defaultErrorHandler did is to print Strings as
      SevFatal, so now it takes a 'FatalMessager' instead.
      5716a2f8
  10. 29 May, 2012 1 commit
    • Ian Lynagh's avatar
      Replace printDump with a new Severity · 78252479
      Ian Lynagh authored
      We now use log_action with severity SevDump, rather than calling
      printDump. This means that what happens to dumped info is now under
      the control of the GHC API user, rather than always going to stdout.
      78252479
  11. 02 May, 2012 1 commit
    • Simon Peyton Jones's avatar
      Allow cases with empty alterantives · ac230c5e
      Simon Peyton Jones authored
      This patch allows, for the first time, case expressions with an empty
      list of alternatives. Max suggested the idea, and Trac #6067 showed
      that it is really quite important.
      
      So I've implemented the idea, fixing #6067. Main changes
      
       * See Note [Empty case alternatives] in CoreSyn
      
       * Various foldr1's become foldrs
      
       * IfaceCase does not record the type of the alternatives.
         I added IfaceECase for empty-alternative cases.
      
       * Core Lint does not complain about empty cases
      
       * MkCore.castBottomExpr constructs an empty-alternative case
         expression   (case e of ty {})
      
       * CoreToStg converts '(case e of {})' to just 'e'
      ac230c5e
  12. 26 Apr, 2012 1 commit
  13. 27 Nov, 2011 1 commit
  14. 11 Nov, 2011 1 commit
  15. 04 Nov, 2011 1 commit
  16. 02 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Overhaul of infrastructure for profiling, coverage (HPC) and breakpoints · 7bb0447d
      Simon Marlow authored
      User visible changes
      ====================
      
      Profilng
      --------
      
      Flags renamed (the old ones are still accepted for now):
      
        OLD            NEW
        ---------      ------------
        -auto-all      -fprof-auto
        -auto          -fprof-exported
        -caf-all       -fprof-cafs
      
      New flags:
      
        -fprof-auto              Annotates all bindings (not just top-level
                                 ones) with SCCs
      
        -fprof-top               Annotates just top-level bindings with SCCs
      
        -fprof-exported          Annotates just exported bindings with SCCs
      
        -fprof-no-count-entries  Do not maintain entry counts when profiling
                                 (can make profiled code go faster; useful with
                                 heap profiling where entry counts are not used)
      
      Cost-centre stacks have a new semantics, which should in most cases
      result in more useful and intuitive profiles.  If you find this not to
      be the case, please let me know.  This is the area where I have been
      experimenting most, and the current solution is probably not the
      final version, however it does address all the outstanding bugs and
      seems to be better than GHC 7.2.
      
      Stack traces
      ------------
      
      +RTS -xc now gives more information.  If the exception originates from
      a CAF (as is common, because GHC tends to lift exceptions out to the
      top-level), then the RTS walks up the stack and reports the stack in
      the enclosing update frame(s).
      
      Result: +RTS -xc is much more useful now - but you still have to
      compile for profiling to get it.  I've played around a little with
      adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem
      quite accurately.
      
      I plan to add more facilities for stack tracing (e.g. in GHCi) in the
      future.
      
      Coverage (HPC)
      --------------
      
       * derived instances are now coloured yellow if they weren't used
       * likewise record field names
       * entry counts are more accurate (hpc --fun-entry-count)
       * tab width is now correct (markup was previously off in source with
         tabs)
      
      Internal changes
      ================
      
      In Core, the Note constructor has been replaced by
      
              Tick (Tickish b) (Expr b)
      
      which is used to represent all the kinds of source annotation we
      support: profiling SCCs, HPC ticks, and GHCi breakpoints.
      
      Depending on the properties of the Tickish, different transformations
      apply to Tick.  See CoreUtils.mkTick for details.
      
      Tickets
      =======
      
      This commit closes the following tickets, test cases to follow:
      
        - Close #2552: not a bug, but the behaviour is now more intuitive
          (test is T2552)
      
        - Close #680 (test is T680)
      
        - Close #1531 (test is result001)
      
        - Close #949 (test is T949)
      
        - Close #2466: test case has bitrotted (doesn't compile against current
          version of vector-space package)
      7bb0447d
  17. 21 Oct, 2011 1 commit
  18. 17 Oct, 2011 1 commit
  19. 23 Sep, 2011 2 commits
    • Simon Peyton Jones's avatar
      Make a new type synonym CoreProgram = [CoreBind] · 488e21c8
      Simon Peyton Jones authored
      and comment its invariants in Note [CoreProgram] in CoreSyn
      
      I'm not totally convinced that CoreProgram is the right name
      (perhaps CoreTopBinds might better), but it is useful to have
      a clue that you are looking at the top-level bindings.
      
      This is only a matter of a type synonym change; no deep
      refactoring here.
      488e21c8
    • Simon Peyton Jones's avatar
      Add a transformation limit to the simplifier (Trac #5448) · 24a2353a
      Simon Peyton Jones authored
      This addresses the rare cases where the simplifier diverges
      (see the above ticket).  We were already counting how many simplifier
      steps were taking place, but with no limit.  This patch adds a limit;
      at which point we halt compilation, and print out useful stats. The
      stats show what is begin inlined, and how often, which points you
      directly to the problem.  The limit is set based on the size of the
      program.
      
      Instead of halting compilation, we could instead just inhibit
      inlining, which would let compilation of the module complete. This is
      a bit harder to implement, and it's likely to mean that you unrolled
      the function 1143 times and then ran out of ticks; you probably don't
      want to complete parsing on this highly-unrolled program.
      
      Flags: -dsimpl-tick-factor=N.  Default is 100 (percent).
             A bigger number increases the allowed maximum tick count.
      24a2353a
  20. 27 Jul, 2011 1 commit
  21. 21 Jul, 2011 2 commits
  22. 16 Jun, 2011 1 commit
    • Simon Peyton Jones's avatar
      Add dynamically-linked plugins (see Trac #3843) · 592def09
      Simon Peyton Jones authored
      This patch was originally developed by Max Bolingbroke, and worked on
      further by Austin Seipp.  It allows you to write a Core-to-Core pass
      and have it dynamically linked into an otherwise-unmodified GHC, and
      run at a place you specify in the Core optimisation pipeline.
      
      Main components:
        - CoreMonad: new types Plugin, PluginPass
                     plus a new constructor CoreDoPluginPass in CoreToDo
      
        - SimplCore: stuff to dynamically load any plugins, splice
          them into the core-to-core pipeline, and invoke them
      
        - Move "getCoreToDo :: DynFlags -> [CoreToDo]"
            which constructs the main core-to-core pipeline
            from CoreMonad to SimplCore
          SimplCore is the driver for the optimisation pipeline, and it
          makes more sense to have the pipeline construction in the driver
          not in the infrastructure module.
      
        - New module DynamicLoading: invoked by SimplCore to load any plugins
          Some consequential changes in Linker.
      
        - New module GhcPlugins: this should be imported by plugin modules; it
          it not used by GHC itself.
      592def09
  23. 13 Jun, 2011 1 commit
  24. 10 Jun, 2011 1 commit
  25. 02 Mar, 2011 1 commit
  26. 20 Feb, 2011 1 commit
    • chak@cse.unsw.edu.au.'s avatar
      Added a VECTORISE pragma · f2aaae97
      chak@cse.unsw.edu.au. authored
      - Added a pragma {-# VECTORISE var = exp #-} that prevents
        the vectoriser from vectorising the definition of 'var'.
        Instead it uses the binding '$v_var = exp' to vectorise
        'var'.  The vectoriser checks that the Core type of 'exp'
        matches the vectorised Core type of 'var'.  (It would be
        quite complicated to perform that check in the type checker
        as the vectorisation of a type needs the state of the VM
        monad.)
      - Added parts of a related VECTORISE SCALAR pragma
      - Documented -ddump-vect
      - Added -ddump-vt-trace
      - Some clean up
      f2aaae97
  27. 16 Nov, 2010 1 commit
  28. 07 Oct, 2010 1 commit
  29. 13 Sep, 2010 1 commit
  30. 09 Sep, 2010 1 commit
  31. 08 Sep, 2010 1 commit
  32. 07 Sep, 2010 1 commit
  33. 24 Dec, 2009 2 commits
  34. 18 Dec, 2009 1 commit
    • simonpj@microsoft.com's avatar
      Move all the CoreToDo stuff into CoreMonad · 63e3a411
      simonpj@microsoft.com authored
      This patch moves a lot of code around, but has zero functionality change.
      The idea is that the types
      
          CoreToDo
          SimplifierSwitch	
          SimplifierMode
          FloatOutSwitches
      
      and 
      
          the main core-to-core pipeline construction
      
      belong in simplCore/, and *not* in DynFlags.
      63e3a411
  35. 08 Dec, 2009 1 commit