1. 30 Jan, 2013 1 commit
  2. 09 Oct, 2012 1 commit
    • ian@well-typed.com's avatar
      Make the opt_UF_* static flags dynamic · 0a768bcb
      ian@well-typed.com authored
      I also removed the default values from the "Discounts and thresholds"
      note: most of them were no longer up-to-date.
      
      Along the way I added FloatSuffix to the argument parser, analogous to
      IntSuffix.
      0a768bcb
  3. 17 Sep, 2012 1 commit
  4. 23 Aug, 2012 1 commit
  5. 11 Jun, 2012 2 commits
  6. 02 May, 2012 1 commit
    • Simon Peyton Jones's avatar
      Allow cases with empty alterantives · ac230c5e
      Simon Peyton Jones authored
      This patch allows, for the first time, case expressions with an empty
      list of alternatives. Max suggested the idea, and Trac #6067 showed
      that it is really quite important.
      
      So I've implemented the idea, fixing #6067. Main changes
      
       * See Note [Empty case alternatives] in CoreSyn
      
       * Various foldr1's become foldrs
      
       * IfaceCase does not record the type of the alternatives.
         I added IfaceECase for empty-alternative cases.
      
       * Core Lint does not complain about empty cases
      
       * MkCore.castBottomExpr constructs an empty-alternative case
         expression   (case e of ty {})
      
       * CoreToStg converts '(case e of {})' to just 'e'
      ac230c5e
  7. 04 Mar, 2012 1 commit
  8. 17 Feb, 2012 1 commit
  9. 16 Nov, 2011 1 commit
  10. 04 Nov, 2011 1 commit
  11. 02 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Overhaul of infrastructure for profiling, coverage (HPC) and breakpoints · 7bb0447d
      Simon Marlow authored
      User visible changes
      ====================
      
      Profilng
      --------
      
      Flags renamed (the old ones are still accepted for now):
      
        OLD            NEW
        ---------      ------------
        -auto-all      -fprof-auto
        -auto          -fprof-exported
        -caf-all       -fprof-cafs
      
      New flags:
      
        -fprof-auto              Annotates all bindings (not just top-level
                                 ones) with SCCs
      
        -fprof-top               Annotates just top-level bindings with SCCs
      
        -fprof-exported          Annotates just exported bindings with SCCs
      
        -fprof-no-count-entries  Do not maintain entry counts when profiling
                                 (can make profiled code go faster; useful with
                                 heap profiling where entry counts are not used)
      
      Cost-centre stacks have a new semantics, which should in most cases
      result in more useful and intuitive profiles.  If you find this not to
      be the case, please let me know.  This is the area where I have been
      experimenting most, and the current solution is probably not the
      final version, however it does address all the outstanding bugs and
      seems to be better than GHC 7.2.
      
      Stack traces
      ------------
      
      +RTS -xc now gives more information.  If the exception originates from
      a CAF (as is common, because GHC tends to lift exceptions out to the
      top-level), then the RTS walks up the stack and reports the stack in
      the enclosing update frame(s).
      
      Result: +RTS -xc is much more useful now - but you still have to
      compile for profiling to get it.  I've played around a little with
      adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem
      quite accurately.
      
      I plan to add more facilities for stack tracing (e.g. in GHCi) in the
      future.
      
      Coverage (HPC)
      --------------
      
       * derived instances are now coloured yellow if they weren't used
       * likewise record field names
       * entry counts are more accurate (hpc --fun-entry-count)
       * tab width is now correct (markup was previously off in source with
         tabs)
      
      Internal changes
      ================
      
      In Core, the Note constructor has been replaced by
      
              Tick (Tickish b) (Expr b)
      
      which is used to represent all the kinds of source annotation we
      support: profiling SCCs, HPC ticks, and GHCi breakpoints.
      
      Depending on the properties of the Tickish, different transformations
      apply to Tick.  See CoreUtils.mkTick for details.
      
      Tickets
      =======
      
      This commit closes the following tickets, test cases to follow:
      
        - Close #2552: not a bug, but the behaviour is now more intuitive
          (test is T2552)
      
        - Close #680 (test is T680)
      
        - Close #1531 (test is result001)
      
        - Close #949 (test is T949)
      
        - Close #2466: test case has bitrotted (doesn't compile against current
          version of vector-space package)
      7bb0447d
  12. 06 Sep, 2011 1 commit
    • batterseapower's avatar
      Implement -XConstraintKind · 9729fe7c
      batterseapower authored
      Basically as documented in http://hackage.haskell.org/trac/ghc/wiki/KindFact,
      this patch adds a new kind Constraint such that:
      
        Show :: * -> Constraint
        (?x::Int) :: Constraint
        (Int ~ a) :: Constraint
      
      And you can write *any* type with kind Constraint to the left of (=>):
      even if that type is a type synonym, type variable, indexed type or so on.
      
      The following (somewhat related) changes are also made:
       1. We now box equality evidence. This is required because we want
          to give (Int ~ a) the *lifted* kind Constraint
       2. For similar reasons, implicit parameters can now only be of
          a lifted kind. (?x::Int#) => ty is now ruled out
       3. Implicit parameter constraints are now allowed in superclasses
          and instance contexts (this just falls out as OK with the new
          constraint solver)
      
      Internally the following major changes were made:
       1. There is now no PredTy in the Type data type. Instead
          GHC checks the kind of a type to figure out if it is a predicate
       2. There is now no AClass TyThing: we represent classes as TyThings
          just as a ATyCon (classes had TyCons anyway)
       3. What used to be (~) is now pretty-printed as (~#). The box
          constructor EqBox :: (a ~# b) -> (a ~ b)
       4. The type LCoercion is used internally in the constraint solver
          and type checker to represent coercions with free variables
          of type (a ~ b) rather than (a ~# b)
      9729fe7c
  13. 03 Aug, 2011 1 commit
  14. 21 Jul, 2011 1 commit
  15. 19 Apr, 2011 1 commit
    • Simon Peyton Jones's avatar
      This BIG PATCH contains most of the work for the New Coercion Representation · fdf86568
      Simon Peyton Jones authored
      See the paper "Practical aspects of evidence based compilation in System FC"
      
      * Coercion becomes a data type, distinct from Type
      
      * Coercions become value-level things, rather than type-level things,
        (although the value is zero bits wide, like the State token)
        A consequence is that a coerion abstraction increases the arity by 1
        (just like a dictionary abstraction)
      
      * There is a new constructor in CoreExpr, namely Coercion, to inject
        coercions into terms
      fdf86568
  16. 26 Jan, 2011 1 commit
    • simonpj@microsoft.com's avatar
      Fix dependencies among specialisations for imported Ids · 869984cd
      simonpj@microsoft.com authored
      This was a subtle one (Trac #4903).  See
        Note [Glom the bindings if imported functions are specialised]
      in Speclialise.
      
      Fundamentally, a specialised binding for an imported Id was being
      declared non-recursive, whereas in fact it can become recursive
      via a RULE.  Once it's specified non-recurive the OccAnal pass
      treats that as gospel -- and that in turn led to infinite inlining.
      
      Easily fixed by glomming all the specialised bindings in a Rec;
      now the OccAnal will sort them out correctly.
      869984cd
  17. 14 Jan, 2011 1 commit
    • simonpj@microsoft.com's avatar
      Fix Trac #4874: specialisation of INLINABLE things · 66dfd5c5
      simonpj@microsoft.com authored
      Johan discovered that when INLINABLE things are specialised
      bad things can happen. This patch implements a hack -- but
      it's a simple hack and it solves the problem.
      
      See Note [Inline specialisations]. 
      
      The hack part is that really INLINABLE should not cause *any* loss
      optimisation, and it does; see Note [Don't w/w INLINABLE things] in
      WorkWrap.
      66dfd5c5
  18. 10 Dec, 2010 1 commit
  19. 07 Oct, 2010 1 commit
    • simonpj@microsoft.com's avatar
      Implement auto-specialisation of imported Ids · 92267aa2
      simonpj@microsoft.com authored
      This big-ish patch arranges that if an Id 'f' is 
        * Type-class overloaded 
             f :: Ord a => [a] -> [a]
        * Defined with an INLINABLE pragma
             {-# INLINABLE f #-}
        * Exported from its defining module 'D'
      
      then in any module 'U' that imports D
      
      1. Any call of 'f' at a fixed type will generate 
         (a) a specialised version of f in U
         (b) a RULE that rewrites unspecialised calls to the
             specialised on
      
        e.g. if the call is (f Int dOrdInt xs) then the 
        specialiser will generate
           $sfInt :: [Int] -> [Int]
           $sfInt = <code for f, imported from D, specialised>
           {-# RULE forall d.  f Int d = $sfInt #-}
      
      2. In addition, you can give an explicit {-# SPECIALISE -#}
         pragma for the imported Id
           {-# SPECIALISE f :: [Bool] -> [Bool] #-}
         This too generates a local specialised definition, 
         and the corresponding RULE 
      
      The new RULES are exported from module 'U', so that any module
      importing U will see the specialised versions of 'f', and will
      not re-specialise them.
      
      There's a flag -fwarn-auto-orphan that warns you if the auto-generated
      RULES are orphan rules. It's not in -Wall, mainly to avoid lots of
      error messages with existing packages.
      
      Main implementation changes
      
       - A new flag on a CoreRule to say if it was auto-generated.
         This is persisted across interface files, so there's a small
         change in interface file format.
      
       - Quite a bit of fiddling with plumbing, to get the 
         {-# SPECIALISE #-} pragmas for imported Ids.  In particular, a
         new field tgc_imp_specs in TcGblEnv, to keep the specialise
         pragmas for imported Ids between the typechecker and the desugarer.
      
       - Some new code (although surprisingly little) in Specialise,
         to deal with calls of imported Ids
      92267aa2
  20. 15 Sep, 2010 1 commit
  21. 14 Sep, 2010 1 commit
    • Ian Lynagh's avatar
      Remove (most of) the FiniteMap wrapper · e95ee1f7
      Ian Lynagh authored
      We still have
          insertList, insertListWith, deleteList
      which aren't in Data.Map, and
          foldRightWithKey
      which works around the fold(r)WithKey addition and deprecation.
      e95ee1f7
  22. 12 Aug, 2010 1 commit
    • simonpj@microsoft.com's avatar
      Improve the Specialiser, fixing Trac #4203 · c107a00c
      simonpj@microsoft.com authored
      Simply fixing #4203 is a tiny fix: in case alterantives we should
      do dumpUDs *including* the case binder.  
      
      But I realised that we can do better and wasted far too much time
      implementing the idea.  It's described in
          Note [Floating dictionaries out of cases]
      c107a00c
  23. 09 Mar, 2010 1 commit
    • simonpj@microsoft.com's avatar
      Rule binders shouldn't have IdInfo · 435c5194
      simonpj@microsoft.com authored
      While I was looking at the rule binders generated in DsBinds for specialise pragmas,
      I also looked at Specialise.  It too was "cloning" the dictionary binders including
      their IdInfo. In this case they should not have any, but its seems better to make
      them completely fresh rather than substitute in existing (albeit non-existent) IdInfo.
      435c5194
  24. 06 Jan, 2010 1 commit
    • simonpj@microsoft.com's avatar
      Improve the handling of default methods · 77166b17
      simonpj@microsoft.com authored
      See the long Note [INLINE and default methods].  
      
      This patch changes a couple of data types, with a knock-on effect on
      the format of interface files.  A lot of files get touched, but is a
      relatively minor change.  The main tiresome bit is the extra plumbing
      to communicate default methods between the type checker and the
      desugarer.
      77166b17
  25. 24 Dec, 2009 1 commit
  26. 16 Dec, 2009 1 commit
    • simonpj@microsoft.com's avatar
      Adjust Activations for specialise and work/wrap, and better simplify in InlineRules · 76dfa394
      simonpj@microsoft.com authored
      This patch does two main things:
      
      1. Adjusts the way we set the Activation for
      
         a) The wrappers generated by the strictness analyser
            See Note [Wrapper activation] in WorkWrap
      
         b) The RULEs generated by Specialise and SpecConstr
            See Note [Auto-specialisation and RULES] in Specialise
                Note [Transfer activation] in SpecConstr
      
      2. Refines how we set the phase when simplifying the right
         hand side of an InlineRule.  See
         Note [Simplifying inside InlineRules] in SimplUtils.
      
      Most of the extra lines are comments!  
      
      The merit of (2) is that a bit more stuff happens inside InlineRules,
      and that in turn allows better dead-code elimination.
      76dfa394
  27. 11 Dec, 2009 1 commit
    • simonpj@microsoft.com's avatar
      Bottom extraction: float out bottoming expressions to top level · b84ba676
      simonpj@microsoft.com authored
        
      The idea is to float out bottoming expressions to top level,
      abstracting them over any variables they mention, if necessary.  This
      is good because it makes functions smaller (and more likely to
      inline), by keeping error code out of line. 
      
      See Note [Bottoming floats] in SetLevels.
      
      On the way, this fixes the HPC failures for cg059 and friends.
      
      I've been meaning to do this for some time.  See Maessen's paper 1999
      "Bottom extraction: factoring error handling out of functional
      programs" (unpublished I think).
      
      Here are the nofib results:
      
      
              Program           Size    Allocs   Runtime   Elapsed
      --------------------------------------------------------------------------------
                  Min          +0.1%     -7.8%    -14.4%    -32.5%
                  Max          +0.5%     +0.2%     +1.6%    +13.8%
       Geometric Mean          +0.4%     -0.2%     -4.9%     -6.7%
      
      Module sizes
              -1 s.d.                -----           -2.6%
              +1 s.d.                -----           +2.3%
              Average                -----           -0.2%
      
      Compile times:
              -1 s.d.                -----          -11.4%
              +1 s.d.                -----           +4.3%
              Average                -----           -3.8%
      
      I'm think program sizes have crept up because the base library
      is bigger -- module sizes in nofib decrease very slightly.  In turn
      I think that may be because the floating generates a call where
      there was no call before.  Anyway I think it's acceptable.
      
      
      The main changes are:
      
      * SetLevels floats out things that exprBotStrictness_maybe 
        identifies as bottom.  Make sure to pin on the right 
        strictness info to the newly created Ids, so that the
        info ends up in interface files.
      
        Since FloatOut is run twice, we have to be careful that we
        don't treat the function created by the first float-out as
        a candidate for the second; this is what worthFloating does.
      
        See SetLevels Note [Bottoming floats]
                      Note [Bottoming floats: eta expansion]
      
      * Be careful not to inline top-level bottoming functions; this 
        would just undo what the floating transformation achieves.
        See CoreUnfold Note [Do not inline top-level bottoming functions
       
        Ensuring this requires a bit of extra plumbing, but nothing drastic..
      
      * Similarly pre/postInlineUnconditionally should be 
        careful not to re-inline top-level bottoming things!
        See SimplUtils Note [Top-level botomming Ids]
                       Note [Top level and postInlineUnconditionally]
      b84ba676
  28. 02 Dec, 2009 1 commit
    • simonpj@microsoft.com's avatar
      More work on the simplifier's inlining strategies · c86161c5
      simonpj@microsoft.com authored
      This patch collects a small raft of related changes
      
      * Arrange that during 
           (a) rule matching and 
           (b) uses of exprIsConApp_maybe
        we "look through" unfoldings only if they are active
        in the phase. Doing this for (a) required a bit of 
        extra plumbing in the rule matching code, but I think
        it's worth it.
      
        One wrinkle is that even if inlining is off (in the 'gentle'
        phase of simplification) during rule matching we want to
        "look through" things with inlinings.  
         See SimplUtils.activeUnfInRule.
      
        This fixes a long-standing bug, where things that were
        supposed to be (say) NOINLINE, could still be poked into
        via exprIsConApp_maybe. 
      
      * In the above cases, also check for (non-rule) loop breakers; 
        we never look through these.  This fixes a bug that could make
        the simplifier diverge (and did for Roman).  
        Test = simplCore/should_compile/dfun-loop
      
      * Try harder not to choose a DFun as a loop breaker. This is 
        just a small adjustment in the OccurAnal scoring function
      
      * In the scoring function in OccurAnal, look at the InlineRule
        unfolding (if there is one) not the actual RHS, beause the
        former is what'll be inlined.  
      
      * Make the application of any function to dictionary arguments
        CONLIKE.  Thus (f d1 d2) is CONLIKE.  
        Encapsulated in CoreUtils.isExpandableApp
        Reason: see Note [Expandable overloadings] in CoreUtils
      
      * Make case expressions seem slightly smaller in CoreUnfold.
        This reverses an unexpected consequences of charging for
        alternatives.
      
      Refactorings
      ~~~~~~~~~~~~
      * Signficantly refactor the data type for Unfolding (again). 
        The result is much nicer.  
      
      * Add type synonym BasicTypes.CompilerPhase = Int
        and use it
      
      Many of the files touched by this patch are simply knock-on
      consequences of these two refactorings.
      c86161c5
  29. 19 Nov, 2009 1 commit
    • simonpj@microsoft.com's avatar
      Implement -fexpose-all-unfoldings, and fix a non-termination bug · 6a944ae7
      simonpj@microsoft.com authored
      The -fexpose-all-unfoldings flag arranges to put unfoldings for *everything*
      in the interface file.  Of course,  this makes the file a lot bigger, but
      it also makes it complete, and that's great for supercompilation; or indeed
      any whole-program work.
      
      Consequences:
        * Interface files need to record loop-breaker-hood.  (Previously,
          loop breakers were never exposed, so that info wasn't necessary.)
          Hence a small interface file format change. 
      
        * When inlining, must check loop-breaker-hood. (Previously, loop
          breakers didn't have an unfolding at all, so no need to check.)
      
        * Ditto in exprIsConApp_maybe.  Roman actually tripped this bug, 
          because a DFun, which had an unfolding, was also a loop breaker
      
        * TidyPgm.tidyIdInfo must be careful to preserve loop-breaker-hood
      
      So Id.idUnfolding checks for loop-breaker-hood and returns NoUnfolding
      if so. When you want the unfolding regardless of loop-breaker-hood, 
      use Id.realIdUnfolding.
      
      I have not documented the flag yet, because it's experimental.  Nor
      have I tested it thoroughly.  But with the flag off (the normal case)
      everything should work.
      6a944ae7
  30. 05 Nov, 2009 1 commit
  31. 29 Oct, 2009 1 commit
    • simonpj@microsoft.com's avatar
      The Big INLINE Patch: totally reorganise way that INLINE pragmas work · 72462499
      simonpj@microsoft.com authored
      This patch has been a long time in gestation and has, as a
      result, accumulated some extra bits and bobs that are only
      loosely related.  I separated the bits that are easy to split
      off, but the rest comes as one big patch, I'm afraid.
      
      Note that:
       * It comes together with a patch to the 'base' library
       * Interface file formats change slightly, so you need to
         recompile all libraries
      
      The patch is mainly giant tidy-up, driven in part by the
      particular stresses of the Data Parallel Haskell project. I don't
      expect a big performance win for random programs.  Still, here are the
      nofib results, relative to the state of affairs without the patch
      
              Program           Size    Allocs   Runtime   Elapsed
      --------------------------------------------------------------------------------
                  Min         -12.7%    -14.5%    -17.5%    -17.8%
                  Max          +4.7%    +10.9%     +9.1%     +8.4%
       Geometric Mean          +0.9%     -0.1%     -5.6%     -7.3%
      
      The +10.9% allocation outlier is rewrite, which happens to have a
      very delicate optimisation opportunity involving an interaction
      of CSE and inlining (see nofib/Simon-nofib-notes). The fact that
      the 'before' case found the optimisation is somewhat accidental.
      Runtimes seem to go down, but I never kno wwhether to really trust
      this number.  Binary sizes wobble a bit, but nothing drastic.
      
      
      The Main Ideas are as follows.
      
      InlineRules
      ~~~~~~~~~~~
      When you say 
            {-# INLINE f #-}
            f x = <rhs>
      you intend that calls (f e) are replaced by <rhs>[e/x] So we
      should capture (\x.<rhs>) in the Unfolding of 'f', and never meddle
      with it.  Meanwhile, we can optimise <rhs> to our heart's content,
      leaving the original unfolding intact in Unfolding of 'f'.
      
      So the representation of an Unfolding has changed quite a bit
      (see CoreSyn).  An INLINE pragma gives rise to an InlineRule 
      unfolding.  
      
      Moreover, it's only used when 'f' is applied to the
      specified number of arguments; that is, the number of argument on 
      the LHS of the '=' sign in the original source definition. 
      For example, (.) is now defined in the libraries like this
         {-# INLINE (.) #-}
         (.) f g = \x -> f (g x)
      so that it'll inline when applied to two arguments. If 'x' appeared
      on the left, thus
         (.) f g x = f (g x)
      it'd only inline when applied to three arguments.  This slightly-experimental
      change was requested by Roman, but it seems to make sense.
      
      Other associated changes
      
      * Moving the deck chairs in DsBinds, which processes the INLINE pragmas
      
      * In the old system an INLINE pragma made the RHS look like
         (Note InlineMe <rhs>)
        The Note switched off optimisation in <rhs>.  But it was quite
        fragile in corner cases. The new system is more robust, I believe.
        In any case, the InlineMe note has disappeared 
      
      * The workerInfo of an Id has also been combined into its Unfolding,
        so it's no longer a separate field of the IdInfo.
      
      * Many changes in CoreUnfold, esp in callSiteInline, which is the critical
        function that decides which function to inline.  Lots of comments added!
      
      * exprIsConApp_maybe has moved to CoreUnfold, since it's so strongly
        associated with "does this expression unfold to a constructor application".
        It can now do some limited beta reduction too, which Roman found 
        was an important.
      
      Instance declarations
      ~~~~~~~~~~~~~~~~~~~~~
      It's always been tricky to get the dfuns generated from instance
      declarations to work out well.  This is particularly important in 
      the Data Parallel Haskell project, and I'm now on my fourth attempt,
      more or less.
      
      There is a detailed description in TcInstDcls, particularly in
      Note [How instance declarations are translated].   Roughly speaking
      we now generate a top-level helper function for every method definition
      in an instance declaration, so that the dfun takes a particularly
      stylised form:
        dfun a d1 d2 = MkD (op1 a d1 d2) (op2 a d1 d2) ...etc...
      
      In fact, it's *so* stylised that we never need to unfold a dfun.
      Instead ClassOps have a special rewrite rule that allows us to
      short-cut dictionary selection.  Suppose dfun :: Ord a -> Ord [a]
                                                  d :: Ord a
      Then   
          compare (dfun a d)  -->   compare_list a d 
      in one rewrite, without first inlining the 'compare' selector
      and the body of the dfun.
      
      To support this
      a) ClassOps have a BuiltInRule (see MkId.dictSelRule)
      b) DFuns have a special form of unfolding (CoreSyn.DFunUnfolding)
         which is exploited in CoreUnfold.exprIsConApp_maybe
      
      Implmenting all this required a root-and-branch rework of TcInstDcls
      and bits of TcClassDcl.
      
      
      Default methods
      ~~~~~~~~~~~~~~~
      If you give an INLINE pragma to a default method, it should be just
      as if you'd written out that code in each instance declaration, including
      the INLINE pragma.  I think that it now *is* so.  As a result, library
      code can be simpler; less duplication.
      
      
      The CONLIKE pragma
      ~~~~~~~~~~~~~~~~~~
      In the DPH project, Roman found cases where he had
      
         p n k = let x = replicate n k
                 in ...(f x)...(g x)....
      
         {-# RULE f (replicate x) = f_rep x #-}
      
      Normally the RULE would not fire, because doing so involves 
      (in effect) duplicating the redex (replicate n k).  A new
      experimental modifier to the INLINE pragma, {-# INLINE CONLIKE
      replicate #-}, allows you to tell GHC to be prepared to duplicate
      a call of this function if it allows a RULE to fire.
      
      See Note [CONLIKE pragma] in BasicTypes
      
      
      Join points
      ~~~~~~~~~~~
      See Note [Case binders and join points] in Simplify
      
      
      Other refactoring
      ~~~~~~~~~~~~~~~~~
      * I moved endPass from CoreLint to CoreMonad, with associated jigglings
      
      * Better pretty-printing of Core
      
      * The top-level RULES (ones that are not rules for locally-defined things)
        are now substituted on every simplifier iteration.  I'm not sure how
        we got away without doing this before.  This entails a bit more plumbing
        in SimplCore.
      
      * The necessary stuff to serialise and deserialise the new
        info across interface files.
      
      * Something about bottoming floats in SetLevels
            Note [Bottoming floats]
      
      * substUnfolding has moved from SimplEnv to CoreSubs, where it belongs
      
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs   Runtime   Elapsed
      --------------------------------------------------------------------------------
                 anna          +2.4%     -0.5%      0.16      0.17
                 ansi          +2.6%     -0.1%      0.00      0.00
                 atom          -3.8%     -0.0%     -1.0%     -2.5%
               awards          +3.0%     +0.7%      0.00      0.00
               banner          +3.3%     -0.0%      0.00      0.00
           bernouilli          +2.7%     +0.0%     -4.6%     -6.9%
                boyer          +2.6%     +0.0%      0.06      0.07
               boyer2          +4.4%     +0.2%      0.01      0.01
                 bspt          +3.2%     +9.6%      0.02      0.02
            cacheprof          +1.4%     -1.0%    -12.2%    -13.6%
             calendar          +2.7%     -1.7%      0.00      0.00
             cichelli          +3.7%     -0.0%      0.13      0.14
              circsim          +3.3%     +0.0%     -2.3%     -9.9%
             clausify          +2.7%     +0.0%      0.05      0.06
        comp_lab_zift          +2.6%     -0.3%     -7.2%     -7.9%
             compress          +3.3%     +0.0%     -8.5%     -9.6%
            compress2          +3.6%     +0.0%    -15.1%    -17.8%
          constraints          +2.7%     -0.6%    -10.0%    -10.7%
         cryptarithm1          +4.5%     +0.0%     -4.7%     -5.7%
         cryptarithm2          +4.3%    -14.5%      0.02      0.02
                  cse          +4.4%     -0.0%      0.00      0.00
                eliza          +2.8%     -0.1%      0.00      0.00
                event          +2.6%     -0.0%     -4.9%     -4.4%
               exp3_8          +2.8%     +0.0%     -4.5%     -9.5%
               expert          +2.7%     +0.3%      0.00      0.00
                  fem          -2.0%     +0.6%      0.04      0.04
                  fft          -6.0%     +1.8%      0.05      0.06
                 fft2          -4.8%     +2.7%      0.13      0.14
             fibheaps          +2.6%     -0.6%      0.05      0.05
                 fish          +4.1%     +0.0%      0.03      0.04
                fluid          -2.1%     -0.2%      0.01      0.01
               fulsom          -4.8%     +9.2%     +9.1%     +8.4%
               gamteb          -7.1%     -1.3%      0.10      0.11
                  gcd          +2.7%     +0.0%      0.05      0.05
          gen_regexps          +3.9%     -0.0%      0.00      0.00
               genfft          +2.7%     -0.1%      0.05      0.06
                   gg          -2.7%     -0.1%      0.02      0.02
                 grep          +3.2%     -0.0%      0.00      0.00
               hidden          -0.5%     +0.0%    -11.9%    -13.3%
                  hpg          -3.0%     -1.8%     +0.0%     -2.4%
                  ida          +2.6%     -1.2%      0.17     -9.0%
                infer          +1.7%     -0.8%      0.08      0.09
              integer          +2.5%     -0.0%     -2.6%     -2.2%
            integrate          -5.0%     +0.0%     -1.3%     -2.9%
              knights          +4.3%     -1.5%      0.01      0.01
                 lcss          +2.5%     -0.1%     -7.5%     -9.4%
                 life          +4.2%     +0.0%     -3.1%     -3.3%
                 lift          +2.4%     -3.2%      0.00      0.00
            listcompr          +4.0%     -1.6%      0.16      0.17
             listcopy          +4.0%     -1.4%      0.17      0.18
             maillist          +4.1%     +0.1%      0.09      0.14
               mandel          +2.9%     +0.0%      0.11      0.12
              mandel2          +4.7%     +0.0%      0.01      0.01
              minimax          +3.8%     -0.0%      0.00      0.00
              mkhprog          +3.2%     -4.2%      0.00      0.00
           multiplier          +2.5%     -0.4%     +0.7%     -1.3%
             nucleic2          -9.3%     +0.0%      0.10      0.10
                 para          +2.9%     +0.1%     -0.7%     -1.2%
            paraffins         -10.4%     +0.0%      0.20     -1.9%
               parser          +3.1%     -0.0%      0.05      0.05
              parstof          +1.9%     -0.0%      0.00      0.01
                  pic          -2.8%     -0.8%      0.01      0.02
                power          +2.1%     +0.1%     -8.5%     -9.0%
               pretty         -12.7%     +0.1%      0.00      0.00
               primes          +2.8%     +0.0%      0.11      0.11
            primetest          +2.5%     -0.0%     -2.1%     -3.1%
               prolog          +3.2%     -7.2%      0.00      0.00
               puzzle          +4.1%     +0.0%     -3.5%     -8.0%
               queens          +2.8%     +0.0%      0.03      0.03
              reptile          +2.2%     -2.2%      0.02      0.02
              rewrite          +3.1%    +10.9%      0.03      0.03
                 rfib          -5.2%     +0.2%      0.03      0.03
                  rsa          +2.6%     +0.0%      0.05      0.06
                  scc          +4.6%     +0.4%      0.00      0.00
                sched          +2.7%     +0.1%      0.03      0.03
                  scs          -2.6%     -0.9%     -9.6%    -11.6%
               simple          -4.0%     +0.4%    -14.6%    -14.9%
                solid          -5.6%     -0.6%     -9.3%    -14.3%
              sorting          +3.8%     +0.0%      0.00      0.00
               sphere          -3.6%     +8.5%      0.15      0.16
               symalg          -1.3%     +0.2%      0.03      0.03
                  tak          +2.7%     +0.0%      0.02      0.02
            transform          +2.0%     -2.9%     -8.0%     -8.8%
             treejoin          +3.1%     +0.0%    -17.5%    -17.8%
            typecheck          +2.9%     -0.3%     -4.6%     -6.6%
              veritas          +3.9%     -0.3%      0.00      0.00
                 wang          -6.2%     +0.0%      0.18     -9.8%
            wave4main         -10.3%     +2.6%     -2.1%     -2.3%
         wheel-sieve1          +2.7%     -0.0%     +0.3%     -0.6%
         wheel-sieve2          +2.7%     +0.0%     -3.7%     -7.5%
                 x2n1          -4.1%     +0.1%      0.03      0.04
      --------------------------------------------------------------------------------
                  Min         -12.7%    -14.5%    -17.5%    -17.8%
                  Max          +4.7%    +10.9%     +9.1%     +8.4%
       Geometric Mean          +0.9%     -0.1%     -5.6%     -7.3%
      72462499
  32. 23 Oct, 2009 1 commit
    • simonpj@microsoft.com's avatar
      Fix Trac #3591: very tricky specialiser bug · c43c9817
      simonpj@microsoft.com authored
        
      There was a subtle bug in the interation of specialisation and floating,
      described in Note [Specialisation of dictionary functions]. 
        
      The net effect was to create a loop where none existed before; plain wrong.
        
      In fixing it, I did quite a bit of house-cleaning in the specialiser, and
      added a lot more comments.  It's tricky, alas.
      c43c9817
  33. 02 Apr, 2009 1 commit
  34. 23 Mar, 2009 1 commit
  35. 18 Mar, 2009 1 commit
    • simonpj@microsoft.com's avatar
      Add the notion of "constructor-like" Ids for rule-matching · 4bc25e8c
      simonpj@microsoft.com authored
      This patch adds an optional CONLIKE modifier to INLINE/NOINLINE pragmas, 
         {-# NOINLINE CONLIKE [1] f #-}
      The effect is to allow applications of 'f' to be expanded in a potential
      rule match.  Example
        {-# RULE "r/f" forall v. r (f v) = f (v+1) #-}
      
      Consider the term
           let x = f v in ..x...x...(r x)...
      Normally the (r x) would not match the rule, because GHC would be scared
      about duplicating the redex (f v). However the CONLIKE modifier says to
      treat 'f' like a constructor in this situation, and "look through" the
      unfolding for x.  So (r x) fires, yielding (f (v+1)).
      
      The main changes are:
        - Syntax
      
        - The inlinePragInfo field of an IdInfo has a RuleMatchInfo
          component, which records whether or not the Id is CONLIKE.
          Of course, this needs to be serialised in interface files too.
      
        - The occurrence analyser (OccAnal) and simplifier (Simplify) treat
          CONLIKE thing like constructors, by ANF-ing them
      
        - New function coreUtils.exprIsExpandable is like exprIsCheap, but
          additionally spots applications of CONLIKE functions
      
        - A CoreUnfolding has a field that caches exprIsExpandable
      
        - The rule matcher consults this field.  See 
          Note [Expanding variables] in Rules.lhs.
      
      On the way I fixed a lurking variable bug in the way variables are
      expanded.  See Note [Do not expand locally-bound variables] in
      Rule.lhs.  I also did a bit of reformatting and refactoring in
      Rules.lhs, so the module has more lines changed than are really
      different.
      4bc25e8c
  36. 16 Dec, 2008 1 commit
    • Simon Marlow's avatar
      Rollback INLINE patches · e79c9ce0
      Simon Marlow authored
      rolling back:
      
      Fri Dec  5 16:54:00 GMT 2008  simonpj@microsoft.com
        * Completely new treatment of INLINE pragmas (big patch)
        
        This is a major patch, which changes the way INLINE pragmas work.
        Although lots of files are touched, the net is only +21 lines of
        code -- and I bet that most of those are comments!
        
        HEADS UP: interface file format has changed, so you'll need to
        recompile everything.
        
        There is not much effect on overall performance for nofib, 
        probably because those programs don't make heavy use of INLINE pragmas.
        
                Program           Size    Allocs   Runtime   Elapsed
                    Min         -11.3%     -6.9%     -9.2%     -8.2%
                    Max          -0.1%     +4.6%     +7.5%     +8.9%
         Geometric Mean          -2.2%     -0.2%     -1.0%     -0.8%
        
        (The +4.6% for on allocs is cichelli; see other patch relating to
        -fpass-case-bndr-to-join-points.)
        
        The old INLINE system
        ~~~~~~~~~~~~~~~~~~~~~
        The old system worked like this. A function with an INLINE pragam
        got a right-hand side which looked like
             f = __inline_me__ (\xy. e)
        The __inline_me__ part was an InlineNote, and was treated specially
        in various ways.  Notably, the simplifier didn't inline inside an
        __inline_me__ note.  
        
        As a result, the code for f itself was pretty crappy. That matters
        if you say (map f xs), because then you execute the code for f,
        rather than inlining a copy at the call site.
        
        The new story: InlineRules
        ~~~~~~~~~~~~~~~~~~~~~~~~~~
        The new system removes the InlineMe Note altogether.  Instead there
        is a new constructor InlineRule in CoreSyn.Unfolding.  This is a 
        bit like a RULE, in that it remembers the template to be inlined inside
        the InlineRule.  No simplification or inlining is done on an InlineRule,
        just like RULEs.  
        
        An Id can have an InlineRule *or* a CoreUnfolding (since these are two
        constructors from Unfolding). The simplifier treats them differently:
        
          - An InlineRule is has the substitution applied (like RULES) but 
            is otherwise left undisturbed.
        
          - A CoreUnfolding is updated with the new RHS of the definition,
            on each iteration of the simplifier.
        
        An InlineRule fires regardless of size, but *only* when the function
        is applied to enough arguments.  The "arity" of the rule is specified
        (by the programmer) as the number of args on the LHS of the "=".  So
        it makes a difference whether you say
          	{-# INLINE f #-}
        	f x = \y -> e     or     f x y = e
        This is one of the big new features that InlineRule gives us, and it
        is one that Roman really wanted.
        
        In contrast, a CoreUnfolding can fire when it is applied to fewer
        args than than the function has lambdas, provided the result is small
        enough.
        
        
        Consequential stuff
        ~~~~~~~~~~~~~~~~~~~
        * A 'wrapper' no longer has a WrapperInfo in the IdInfo.  Instead,
          the InlineRule has a field identifying wrappers.
        
        * Of course, IfaceSyn and interface serialisation changes appropriately.
        
        * Making implication constraints inline nicely was a bit fiddly. In
          the end I added a var_inline field to HsBInd.VarBind, which is why
          this patch affects the type checker slightly
        
        * I made some changes to the way in which eta expansion happens in
          CorePrep, mainly to ensure that *arguments* that become let-bound
          are also eta-expanded.  I'm still not too happy with the clarity
          and robustness fo the result.
        
        * We now complain if the programmer gives an INLINE pragma for
          a recursive function (prevsiously we just ignored it).  Reason for
          change: we don't want an InlineRule on a LoopBreaker, because then
          we'd have to check for loop-breaker-hood at occurrence sites (which
          isn't currenlty done).  Some tests need changing as a result.
        
        This patch has been in my tree for quite a while, so there are
        probably some other minor changes.
        
      
          M ./compiler/basicTypes/Id.lhs -11
          M ./compiler/basicTypes/IdInfo.lhs -82
          M ./compiler/basicTypes/MkId.lhs -2 +2
          M ./compiler/coreSyn/CoreFVs.lhs -2 +25
          M ./compiler/coreSyn/CoreLint.lhs -5 +1
          M ./compiler/coreSyn/CorePrep.lhs -59 +53
          M ./compiler/coreSyn/CoreSubst.lhs -22 +31
          M ./compiler/coreSyn/CoreSyn.lhs -66 +92
          M ./compiler/coreSyn/CoreUnfold.lhs -112 +112
          M ./compiler/coreSyn/CoreUtils.lhs -185 +184
          M ./compiler/coreSyn/MkExternalCore.lhs -1
          M ./compiler/coreSyn/PprCore.lhs -4 +40
          M ./compiler/deSugar/DsBinds.lhs -70 +118
          M ./compiler/deSugar/DsForeign.lhs -2 +4
          M ./compiler/deSugar/DsMeta.hs -4 +3
          M ./compiler/hsSyn/HsBinds.lhs -3 +3
          M ./compiler/hsSyn/HsUtils.lhs -2 +7
          M ./compiler/iface/BinIface.hs -11 +25
          M ./compiler/iface/IfaceSyn.lhs -13 +21
          M ./compiler/iface/MkIface.lhs -24 +19
          M ./compiler/iface/TcIface.lhs -29 +23
          M ./compiler/main/TidyPgm.lhs -55 +49
          M ./compiler/parser/ParserCore.y -5 +6
          M ./compiler/simplCore/CSE.lhs -2 +1
          M ./compiler/simplCore/FloatIn.lhs -6 +1
          M ./compiler/simplCore/FloatOut.lhs -23
          M ./compiler/simplCore/OccurAnal.lhs -36 +5
          M ./compiler/simplCore/SetLevels.lhs -59 +54
          M ./compiler/simplCore/SimplCore.lhs -48 +52
          M ./compiler/simplCore/SimplEnv.lhs -26 +22
          M ./compiler/simplCore/SimplUtils.lhs -28 +4
          M ./compiler/simplCore/Simplify.lhs -91 +109
          M ./compiler/specialise/Specialise.lhs -15 +18
          M ./compiler/stranal/WorkWrap.lhs -14 +11
          M ./compiler/stranal/WwLib.lhs -2 +2
          M ./compiler/typecheck/Inst.lhs -1 +3
          M ./compiler/typecheck/TcBinds.lhs -17 +27
          M ./compiler/typecheck/TcClassDcl.lhs -1 +2
          M ./compiler/typecheck/TcExpr.lhs -4 +6
          M ./compiler/typecheck/TcForeign.lhs -1 +1
          M ./compiler/typecheck/TcGenDeriv.lhs -14 +13
          M ./compiler/typecheck/TcHsSyn.lhs -3 +2
          M ./compiler/typecheck/TcInstDcls.lhs -5 +4
          M ./compiler/typecheck/TcRnDriver.lhs -2 +11
          M ./compiler/typecheck/TcSimplify.lhs -10 +17
          M ./compiler/vectorise/VectType.hs +7
      
      Mon Dec  8 12:43:10 GMT 2008  simonpj@microsoft.com
        * White space only
      
          M ./compiler/simplCore/Simplify.lhs -2
      
      Mon Dec  8 12:48:40 GMT 2008  simonpj@microsoft.com
        * Move simpleOptExpr from CoreUnfold to CoreSubst
      
          M ./compiler/coreSyn/CoreSubst.lhs -1 +87
          M ./compiler/coreSyn/CoreUnfold.lhs -72 +1
      
      Mon Dec  8 17:30:18 GMT 2008  simonpj@microsoft.com
        * Use CoreSubst.simpleOptExpr in place of the ad-hoc simpleSubst (reduces code too)
      
          M ./compiler/deSugar/DsBinds.lhs -50 +16
      
      Tue Dec  9 17:03:02 GMT 2008  simonpj@microsoft.com
        * Fix Trac #2861: bogus eta expansion
        
        Urghlhl!  I "tided up" the treatment of the "state hack" in CoreUtils, but
        missed an unexpected interaction with the way that a bottoming function
        simply swallows excess arguments.  There's a long
             Note [State hack and bottoming functions]
        to explain (which accounts for most of the new lines of code).
        
      
          M ./compiler/coreSyn/CoreUtils.lhs -16 +53
      
      Mon Dec 15 10:02:21 GMT 2008  Simon Marlow <marlowsd@gmail.com>
        * Revert CorePrep part of "Completely new treatment of INLINE pragmas..."
        
        The original patch said:
        
        * I made some changes to the way in which eta expansion happens in
          CorePrep, mainly to ensure that *arguments* that become let-bound
          are also eta-expanded.  I'm still not too happy with the clarity
          and robustness fo the result.
          
        Unfortunately this change apparently broke some invariants that were
        relied on elsewhere, and in particular lead to panics when compiling
        with profiling on.
        
        Will re-investigate in the new year.
      
          M ./compiler/coreSyn/CorePrep.lhs -53 +58
          M ./configure.ac -1 +1
      
      Mon Dec 15 12:28:51 GMT 2008  Simon Marlow <marlowsd@gmail.com>
        * revert accidental change to configure.ac
      
          M ./configure.ac -1 +1
      e79c9ce0
  37. 05 Dec, 2008 1 commit
    • simonpj@microsoft.com's avatar
      Completely new treatment of INLINE pragmas (big patch) · d95ce839
      simonpj@microsoft.com authored
      This is a major patch, which changes the way INLINE pragmas work.
      Although lots of files are touched, the net is only +21 lines of
      code -- and I bet that most of those are comments!
      
      HEADS UP: interface file format has changed, so you'll need to
      recompile everything.
      
      There is not much effect on overall performance for nofib, 
      probably because those programs don't make heavy use of INLINE pragmas.
      
              Program           Size    Allocs   Runtime   Elapsed
                  Min         -11.3%     -6.9%     -9.2%     -8.2%
                  Max          -0.1%     +4.6%     +7.5%     +8.9%
       Geometric Mean          -2.2%     -0.2%     -1.0%     -0.8%
      
      (The +4.6% for on allocs is cichelli; see other patch relating to
      -fpass-case-bndr-to-join-points.)
      
      The old INLINE system
      ~~~~~~~~~~~~~~~~~~~~~
      The old system worked like this. A function with an INLINE pragam
      got a right-hand side which looked like
           f = __inline_me__ (\xy. e)
      The __inline_me__ part was an InlineNote, and was treated specially
      in various ways.  Notably, the simplifier didn't inline inside an
      __inline_me__ note.  
      
      As a result, the code for f itself was pretty crappy. That matters
      if you say (map f xs), because then you execute the code for f,
      rather than inlining a copy at the call site.
      
      The new story: InlineRules
      ~~~~~~~~~~~~~~~~~~~~~~~~~~
      The new system removes the InlineMe Note altogether.  Instead there
      is a new constructor InlineRule in CoreSyn.Unfolding.  This is a 
      bit like a RULE, in that it remembers the template to be inlined inside
      the InlineRule.  No simplification or inlining is done on an InlineRule,
      just like RULEs.  
      
      An Id can have an InlineRule *or* a CoreUnfolding (since these are two
      constructors from Unfolding). The simplifier treats them differently:
      
        - An InlineRule is has the substitution applied (like RULES) but 
          is otherwise left undisturbed.
      
        - A CoreUnfolding is updated with the new RHS of the definition,
          on each iteration of the simplifier.
      
      An InlineRule fires regardless of size, but *only* when the function
      is applied to enough arguments.  The "arity" of the rule is specified
      (by the programmer) as the number of args on the LHS of the "=".  So
      it makes a difference whether you say
        	{-# INLINE f #-}
      	f x = \y -> e     or     f x y = e
      This is one of the big new features that InlineRule gives us, and it
      is one that Roman really wanted.
      
      In contrast, a CoreUnfolding can fire when it is applied to fewer
      args than than the function has lambdas, provided the result is small
      enough.
      
      
      Consequential stuff
      ~~~~~~~~~~~~~~~~~~~
      * A 'wrapper' no longer has a WrapperInfo in the IdInfo.  Instead,
        the InlineRule has a field identifying wrappers.
      
      * Of course, IfaceSyn and interface serialisation changes appropriately.
      
      * Making implication constraints inline nicely was a bit fiddly. In
        the end I added a var_inline field to HsBInd.VarBind, which is why
        this patch affects the type checker slightly
      
      * I made some changes to the way in which eta expansion happens in
        CorePrep, mainly to ensure that *arguments* that become let-bound
        are also eta-expanded.  I'm still not too happy with the clarity
        and robustness fo the result.
      
      * We now complain if the programmer gives an INLINE pragma for
        a recursive function (prevsiously we just ignored it).  Reason for
        change: we don't want an InlineRule on a LoopBreaker, because then
        we'd have to check for loop-breaker-hood at occurrence sites (which
        isn't currenlty done).  Some tests need changing as a result.
      
      This patch has been in my tree for quite a while, so there are
      probably some other minor changes.
      d95ce839
  38. 30 Oct, 2008 1 commit
    • simonpj@microsoft.com's avatar
      Add (a) CoreM monad, (b) new Annotations feature · 9bcd95ba
      simonpj@microsoft.com authored
      This patch, written by Max Bolingbroke,  does two things
      
      1.  It adds a new CoreM monad (defined in simplCore/CoreMonad),
          which is used as the top-level monad for all the Core-to-Core
          transformations (starting at SimplCore).  It supports
             * I/O (for debug printing)
             * Unique supply
             * Statistics gathering
             * Access to the HscEnv, RuleBase, Annotations, Module
          The patch therefore refactors the top "skin" of every Core-to-Core
          pass, but does not change their functionality.
      
      2.  It adds a completely new facility to GHC: Core "annotations".
          The idea is that you can say
             {#- ANN foo (Just "Hello") #-}
          which adds the annotation (Just "Hello") to the top level function
          foo.  These annotations can be looked up in any Core-to-Core pass,
          and are persisted into interface files.  (Hence a Core-to-Core pass
          can also query the annotations of imported things.)  Furthermore,
          a Core-to-Core pass can add new annotations (eg strictness info)
          of its own, which can be queried by importing modules.
      
      The design of the annotation system is somewhat in flux.  It's
      designed to work with the (upcoming) dynamic plug-ins mechanism,
      but is meanwhile independently useful.
      
      Do not merge to 6.10!  
      9bcd95ba
  39. 05 Sep, 2008 1 commit
    • simonpj@microsoft.com's avatar
      More specialiser wibbles · baa26ed7
      simonpj@microsoft.com authored
      Several things
      
       * Only gather call details for local things, not imported ones
      
       * When making auxiliary dictionary bindings in specDefn, remember
         to give them an unfolding.  Otherwise we don't gather call details
         for functions applied to them.
      baa26ed7