1. 01 Oct, 2015 1 commit
  2. 06 Aug, 2015 1 commit
  3. 27 Jul, 2015 1 commit
  4. 18 Jun, 2015 2 commits
    • Simon Peyton Jones's avatar
      Refactor filterAlts into two parts · ee643698
      Simon Peyton Jones authored
      This splits filterAlts into two:
       - filterAlts
       - refineDefaultAlt
      
      No change in functionality
      ee643698
    • Simon Peyton Jones's avatar
      Care with impossible-cons in combineIdenticalAlts · 023a0ba9
      Simon Peyton Jones authored
      This was a nasty, long-standing bug exposed in Trac #10538.
      Symptoms were that we had an empty case
         case (x :: Either a) of {}
      Core Lint correctly picked this bogus code up.
      
      Here is what happened
      
      * In SimplUtils.prepareAlts, we call
              filterAlts
        then
              combineIdenticalAlts
      
      * We had    case x of { Left _ -> e1; Right _ -> e1 }
      
      * filterAlts did nothing, but correctly retuned imposs_deflt_cons
        saying that 'x' cannot be {Left, Right} in the DEFAULT branch,
        if any (there isn't one.)
      
      * combineIdentialAlts correctly combines the identical alts, to give
           case x of { DEFAULT -> e1 }
      
      * BUT combineIdenticalAlts did no adjust imposs_deft_cons
      
      * Result: when compiling e1 we did so in the belief that 'x'
        could not be {Left,Right}.  Disaster.
      
      Easily fixed.
      
      (It is hard to trigger; I can't construct a simple test case.)
      023a0ba9
  5. 01 Jun, 2015 1 commit
  6. 22 May, 2015 1 commit
    • Simon Peyton Jones's avatar
      Fix a huge space leak in the mighty Simplifier · 45d9a15c
      Simon Peyton Jones authored
      This long-standing, terrible, adn somewhat subtle bug was exposed
      by Trac #10370, thanks to Reid Barton's brilliant test case (comment:3).
      
      The effect is large on the Trac #10370 test.
      Here is what the profile report says:
      
      Before:
       total time  =       24.35 secs   (24353 ticks @ 1000 us, 1 processor)
       total alloc = 11,864,360,816 bytes  (excludes profiling overheads)
      
      After:
       total time  =       21.16 secs   (21160 ticks @ 1000 us, 1 processor)
       total alloc = 7,947,141,136 bytes  (excludes profiling overheads)
      
      The /combined/ effect of the tidyOccName fix, plus this one, is dramtic
      for Trac #10370.  Here is what +RTS -s says:
      
      Before:
        15,490,210,952 bytes allocated in the heap
         1,783,919,456 bytes maximum residency (20 sample(s))
      
        MUT     time   30.117s  ( 31.383s elapsed)
        GC      time   90.103s  ( 90.107s elapsed)
        Total   time  120.843s  (122.065s elapsed)
      
      After:
         7,928,671,936 bytes allocated in the heap
            52,914,832 bytes maximum residency (25 sample(s))
      
        MUT     time   13.912s  ( 15.110s elapsed)
        GC      time    6.809s  (  6.808s elapsed)
        Total   time   20.789s  ( 21.954s elapsed)
      
      - Heap allocation halved
      - Residency cut by a factor of more than 30.
      - ELapsed time cut by a factor of 6
      
      Not bad!
      
      The details
      ~~~~~~~~~~~
      The culprit was SimplEnv.mkCoreSubst, which used mapVarEnv to do some
      impedence-matching from the substitituion used by the simplifier to
      the one used by CoreSubst.  But the impedence-mactching was recursive!
      
        mk_subst tv_env cv_env id_env
          = CoreSubst.mkSubst in_scope tv_env cv_env (mapVarEnv fiddle id_env)
      
        fiddle (DoneEx e)          = e
        fiddle (DoneId v)          = Var v
        fiddle (ContEx tv cv id e) = CoreSubst.substExpr (mk_subst tv cv id) e
      
      Inside fiddle, in the ContEx case, we may do another whole level of
      fiddle.  And so on.  Moreover, UniqFM (which is built on Data.IntMap) is
      strict, so the fiddling is done eagerly.  I didn't wok through all the
      details but the result is a gargatuan blow-up of entirely unnecessary work.
      
      Laziness would make this go away, I think, but I don't want to mess
      with IntMap.  And in any case, the impedence matching is a royal pain.
      
      In the end I simply ceased trying to use CoreSubst.substExpr in the
      simplifier, and instead just use simplExpr.  That does mean bit of
      duplication; e.g.  new code for simplRules.  But it's not a big deal
      and it's far more direct and easy to reason about.
      
      A bit of knock-on refactoring:
      
       * Data type ArgSummary moves to CoreUnfold.
      
       * interestingArg moves from CoreUnfold to SimplUtils, and gets a
         SimplEnv argument which can be used when we encounter a variable.
      
       * simplLamBndrs, addBndrRules move from SimplEnv to Simplify
         (because they now calls simplUnfolding, simplRules resp)
      
       * SimplUtils.substExpr, substUnfolding, mkCoreSubst die completely
      
       * In Simplify some several functions that were previously pure
         substitution-based functions are now monadic:
           - addBndrRules, simplRule
           - addCoerce, add_coerce in simplCast
      
       * In case 2c of Simplify.rebuildCase, there was a pretty disgusting
         expression-substitution taking place for 'rhs'; and we really don't
         want to make that monadic becuase 'rhs' can be big.
         Solution: reduce the arity of the rules for seq.
         See Note [User-defined RULES for seq] in MkId.
      45d9a15c
  7. 10 Feb, 2015 1 commit
  8. 19 Jan, 2015 1 commit
  9. 16 Dec, 2014 1 commit
    • Peter Wortmann's avatar
      Source notes (Core support) · 993975d3
      Peter Wortmann authored
      This patch introduces "SourceNote" tickishs that link Core to the
      source code that generated it. The idea is to retain these source code
      links throughout code transformations so we can eventually relate
      object code all the way back to the original source (which we can,
      say, encode as DWARF information to allow debugging).  We generate
      these SourceNotes like other tickshs in the desugaring phase. The
      activating command line flag is "-g", consistent with the flag other
      compilers use to decide DWARF generation.
      
      Keeping ticks from getting into the way of Core transformations is
      tricky, but doable. The changes in this patch produce identical Core
      in all cases I tested -- which at this point is GHC, all libraries and
      nofib. Also note that this pass creates *lots* of tick nodes, which we
      reduce somewhat by removing duplicated and overlapping source
      ticks. This will still cause significant Tick "clumps" - a possible
      future optimization could be to make Tick carry a list of Tickishs
      instead of one at a time.
      
      (From Phabricator D169)
      993975d3
  10. 11 Dec, 2014 1 commit
    • Simon Peyton Jones's avatar
      Fix an obscure but terrible bug in the simplifier (Trac #9567) · b8392ae7
      Simon Peyton Jones authored
      The issue was that contInputType simply gave the wrong answer
      for type applications.
      
      There was no way to fix contInputType; it just didn't have enough
      information.  So I did this:
      
      * Split the ApplyTo constructor of SimplUtils.SimplCont into
            ApplyToVal
            ApplyToTy
        I used record syntax for them; we should do this for some
        of the other constructors too.
      
      * The latter carries a sc_hole_ty, which is the type of the
        continuation's "hole"
      
      * Maintaining this type meant that I had do to something
        similar for SimplUtils.ArgSpec.
      
      * I renamed contInputType to contHoleType for consistency.
      
      * I did a bit of refactoring around the call to tryRules
        in Simplify, which was jolly confusing before.
      
      The resulting code is quite nice now.  And it has the additional
      merit that it works.
      
      The tests are simply tc124 and T7891 with -O enabled.
      b8392ae7
  11. 03 Dec, 2014 1 commit
  12. 29 Aug, 2014 1 commit
  13. 28 Aug, 2014 1 commit
  14. 07 Aug, 2014 1 commit
    • Simon Peyton Jones's avatar
      Document the maintenance of the let/app invariant in the simplifier · db17d58d
      Simon Peyton Jones authored
      It's not obvious why the simplifier generates code that correctly satisfies
      the let/app invariant.   This patch does some minor refactoring, but the main
      point is to document pre-conditions to key functions, namely that the rhs
      passed in satisfies the let/app invariant.
      
      There shouldn't be any change in behaviour.
      db17d58d
  15. 15 May, 2014 1 commit
    • Herbert Valerio Riedel's avatar
      Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized, while following the convention, to
      
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      
      - Moreover, if the list of language extensions fit into a single
        `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
      
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
      23892440
  16. 24 Apr, 2014 2 commits
    • Gabor Greif's avatar
      Some typos in comments · 4ceb5dec
      Gabor Greif authored
      4ceb5dec
    • Simon Peyton Jones's avatar
      Don't eta-expand PAPs (fixes Trac #9020) · 79e46aea
      Simon Peyton Jones authored
      See Note [Do not eta-expand PAPs] in SimplUtils.  This has a tremendously
      good effect on compile times for some simple benchmarks.
      
      The test is now where it belongs, in perf/compiler/T9020 (instead of simpl015).
      
      I did a nofib run and got essentially zero change except for cacheprof which
      gets 4% more allocation.  I investigated.  Turns out that we have
      
          instance PP Reg where
             pp ppm ST_0 = "%st"
             pp ppm ST_1 = "%st(1)"
             pp ppm ST_2 = "%st(2)"
             pp ppm ST_3 = "%st(3)"
             pp ppm ST_4 = "%st(4)"
             pp ppm ST_5 = "%st(5)"
             pp ppm ST_6 = "%st(6)"
             pp ppm ST_7 = "%st(7)"
             pp ppm r    = "%" ++ map toLower (show r)
      
      That (map toLower (show r) does a lot of map/toLowers.  But if we inline show
      we get something like
      
             pp ppm ST_0 = "%st"
             pp ppm ST_1 = "%st(1)"
             pp ppm ST_2 = "%st(2)"
             pp ppm ST_3 = "%st(3)"
             pp ppm ST_4 = "%st(4)"
             pp ppm ST_5 = "%st(5)"
             pp ppm ST_6 = "%st(6)"
             pp ppm ST_7 = "%st(7)"
             pp ppm EAX  = map toLower (show EAX)
             pp ppm EBX  = map toLower (show EBX)
             ...etc...
      
      and all those map/toLower calls can now be floated to top level.
      This gives a 4% decrease in allocation.  But it depends on inlining
      a pretty big 'show' function.
      
      With this new patch we get slightly better eta-expansion, which makes
      a function look slightly bigger, which just stops it being inlined.
      The previous behaviour was luck, so I'm not going to worry about
      losing it.
      
      I've added some notes to nofib/Simon-nofib-notes
      79e46aea
  17. 08 Apr, 2014 1 commit
    • Simon Peyton Jones's avatar
      Allow a longer demand signature than arity · 848f5952
      Simon Peyton Jones authored
      See Note [Demand analysis for trivial right-hand sides] in DmdAnal.
      This allows a function with arity 2 to have a DmdSig with 3 args;
      which in turn had a knock-on effect, which showed up in the test for
      Trac #8963.
      
      In fact it seems entirely reasonable, so this patch removes the
      WARN and CoreLint checks that were complaining.
      848f5952
  18. 10 Feb, 2014 1 commit
    • Joachim Breitner's avatar
      Implement CallArity analysis · cdceadf3
      Joachim Breitner authored
      This analysis finds out if a let-bound expression with lower manifest
      arity than type arity is always called with more arguments, as in that
      case eta-expansion is allowed and often viable. The analysis is very
      much tailored towards the code generated when foldl is implemented via
      foldr; without this analysis doing so would be a very bad idea!
      
      There are other ways to improve foldr/builder-fusion to cope with foldl,
      if any of these are implemented then this step can probably be moved to
      -O2 to save some compilation times. The current impact of adding this
      phase is just below +2% (measured running GHC's "make").
      cdceadf3
  19. 12 Nov, 2013 1 commit
    • Simon Peyton Jones's avatar
      Improve eta expansion (again) · 802f4b89
      Simon Peyton Jones authored
      The presenting issue was that we were never eta-expanding
      
          f (\x -> case x of (a,b) -> \s -> blah)
      
      and that meant we were allocating two lambdas instead of one.
      See Note [Eta expanding lambdas] in SimplUtils.
      
      However I didn't want to eta expand the lambda, and then try all over
      again for tryEtaExpandRhs.  Yet the latter is important in the context
      of a let-binding it can do simple arity analysis.  So I ended up
      refactoring CallCtxt so that it tells when we are on the RHS of a let.
      
      I also moved findRhsArity from SimplUtils to CoreArity.
      
      Performance increases nicely. Here are the ones where allocation improved
      by more than 0.5%. Notice the nice decrease in binary size too.
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
                 ansi          -2.3%     -0.9%      0.00      0.00     +0.0%
                 bspt          -2.1%     -9.7%      0.01      0.01    -33.3%
                fasta          -1.8%    -11.7%     -3.4%     -3.6%     +0.0%
                  fft          -1.9%     -1.3%      0.06      0.06    +11.1%
      reverse-complem          -1.9%    -18.1%     -1.9%     -2.8%     +0.0%
               sphere          -1.8%     -4.5%      0.09      0.09     +0.0%
            transform          -1.8%     -2.3%     -4.6%     -3.1%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -3.0%    -18.1%    -13.9%    -14.6%    -35.7%
                  Max          -1.3%     +0.0%     +7.7%     +7.7%    +50.0%
       Geometric Mean          -1.9%     -0.6%     -2.1%     -2.1%     -0.2%
      802f4b89
  20. 23 Sep, 2013 1 commit
  21. 20 Aug, 2013 1 commit
  22. 02 Aug, 2013 1 commit
  23. 06 Jun, 2013 2 commits
  24. 30 May, 2013 1 commit
    • Simon Peyton Jones's avatar
      Make 'SPECIALISE instance' work again · 1ed04090
      Simon Peyton Jones authored
      This is a long-standing regression (Trac #7797), which meant that in
      particular the Eq [Char] instance does not get specialised.
      (The *methods* do, but the dictionary itself doesn't.)  So when you
      call a function
           f :: Eq a => blah
      on a string type (ie a=[Char]), 7.6 passes a dictionary of un-specialised
      methods.
      
      This only matters when calling an overloaded function from a
      specialised context, but that does matter in some programs.  I
      remember (though I cannot find the details) that Nick Frisby discovered
      this to be the source of some pretty solid performanc regresisons.
      
      Anyway it works now. The key change is that a DFunUnfolding now takes
      a form that is both simpler than before (the DFunArg type is eliminated)
      and more general:
      
      data Unfolding
        = ...
        | DFunUnfolding {     -- The Unfolding of a DFunId
          			-- See Note [DFun unfoldings]
            		  	--     df = /\a1..am. \d1..dn. MkD t1 .. tk
                              --                                 (op1 a1..am d1..dn)
           		      	--     	    	      	       	   (op2 a1..am d1..dn)
              df_bndrs :: [Var],      -- The bound variables [a1..m],[d1..dn]
              df_con   :: DataCon,    -- The dictionary data constructor (never a newtype datacon)
              df_args  :: [CoreExpr]  -- Args of the data con: types, superclasses and methods,
          }                           -- in positional order
      
      That in turn allowed me to re-enable the DFunUnfolding specialisation in
      DsBinds.  Lots of details here in TcInstDcls:
      	  Note [SPECIALISE instance pragmas]
      
      I also did some refactoring, in particular to pass the InScopeSet to
      exprIsConApp_maybe (which in turn means it has to go to a RuleFun).
      
      NB: Interface file format has changed!
      1ed04090
  25. 30 Jan, 2013 1 commit
  26. 22 Jan, 2013 1 commit
  27. 24 Dec, 2012 1 commit
    • Simon Peyton Jones's avatar
      Make combine-identical-alternatives work again (Trac #7360) · bacf7ca0
      Simon Peyton Jones authored
      Move the "combine indentical alternatives" transformation *before*
      simplifying the alternatives.  For example
           case x of y
              [] -> length y
              (_:_) -> length y }
      
      If we look *post* simplification, since 'y' is used in the
      alterantives, the case binders *might* be (see the keep_occ_info test
      in Simplify.simplAlt); and hence the combination of the two
      alteranatives does not happen.  But if we do it *pre* simplification
      there is no problem.
      
      This fixes Trac #7360.
      bacf7ca0
  28. 16 Oct, 2012 1 commit
    • ian@well-typed.com's avatar
      Some alpha renaming · cd33eefd
      ian@well-typed.com authored
      Mostly d -> g (matching DynFlag -> GeneralFlag).
      Also renamed if* to when*, matching the Haskell if/when names
      cd33eefd
  29. 09 Oct, 2012 3 commits
  30. 09 May, 2012 2 commits
  31. 04 May, 2012 1 commit
  32. 02 May, 2012 1 commit
    • Simon Peyton Jones's avatar
      Allow cases with empty alterantives · ac230c5e
      Simon Peyton Jones authored
      This patch allows, for the first time, case expressions with an empty
      list of alternatives. Max suggested the idea, and Trac #6067 showed
      that it is really quite important.
      
      So I've implemented the idea, fixing #6067. Main changes
      
       * See Note [Empty case alternatives] in CoreSyn
      
       * Various foldr1's become foldrs
      
       * IfaceCase does not record the type of the alternatives.
         I added IfaceECase for empty-alternative cases.
      
       * Core Lint does not complain about empty cases
      
       * MkCore.castBottomExpr constructs an empty-alternative case
         expression   (case e of ty {})
      
       * CoreToStg converts '(case e of {})' to just 'e'
      ac230c5e
  33. 27 Apr, 2012 2 commits