1. 05 Feb, 2015 1 commit
  2. 17 Dec, 2014 1 commit
  3. 16 Dec, 2014 1 commit
    • Peter Wortmann's avatar
      Source notes (Core support) · 993975d3
      Peter Wortmann authored
      This patch introduces "SourceNote" tickishs that link Core to the
      source code that generated it. The idea is to retain these source code
      links throughout code transformations so we can eventually relate
      object code all the way back to the original source (which we can,
      say, encode as DWARF information to allow debugging).  We generate
      these SourceNotes like other tickshs in the desugaring phase. The
      activating command line flag is "-g", consistent with the flag other
      compilers use to decide DWARF generation.
      
      Keeping ticks from getting into the way of Core transformations is
      tricky, but doable. The changes in this patch produce identical Core
      in all cases I tested -- which at this point is GHC, all libraries and
      nofib. Also note that this pass creates *lots* of tick nodes, which we
      reduce somewhat by removing duplicated and overlapping source
      ticks. This will still cause significant Tick "clumps" - a possible
      future optimization could be to make Tick carry a list of Tickishs
      instead of one at a time.
      
      (From Phabricator D169)
      993975d3
  4. 11 Dec, 2014 1 commit
    • Simon Peyton Jones's avatar
      Fix an obscure but terrible bug in the simplifier (Trac #9567) · b8392ae7
      Simon Peyton Jones authored
      The issue was that contInputType simply gave the wrong answer
      for type applications.
      
      There was no way to fix contInputType; it just didn't have enough
      information.  So I did this:
      
      * Split the ApplyTo constructor of SimplUtils.SimplCont into
            ApplyToVal
            ApplyToTy
        I used record syntax for them; we should do this for some
        of the other constructors too.
      
      * The latter carries a sc_hole_ty, which is the type of the
        continuation's "hole"
      
      * Maintaining this type meant that I had do to something
        similar for SimplUtils.ArgSpec.
      
      * I renamed contInputType to contHoleType for consistency.
      
      * I did a bit of refactoring around the call to tryRules
        in Simplify, which was jolly confusing before.
      
      The resulting code is quite nice now.  And it has the additional
      merit that it works.
      
      The tests are simply tc124 and T7891 with -O enabled.
      b8392ae7
  5. 03 Dec, 2014 1 commit
  6. 04 Nov, 2014 1 commit
  7. 14 Oct, 2014 1 commit
  8. 01 Oct, 2014 1 commit
  9. 29 Aug, 2014 1 commit
  10. 28 Aug, 2014 2 commits
    • Simon Peyton Jones's avatar
      Simple refactor of the case-of-case transform · a0b2897e
      Simon Peyton Jones authored
      More modular, less code.  No change in behaviour.
      a0b2897e
    • Simon Peyton Jones's avatar
      Refactor unfoldings · 6e0f6ede
      Simon Peyton Jones authored
      There are two main refactorings here
      
      1.  Move the uf_arity field
             out of CoreUnfolding
             into UnfWhen
          It's a lot tidier there.  If I've got this right, no behaviour
          should change.
      
      2.  Define specUnfolding and use it in DsBinds and Specialise
           a) commons-up some shared code
           b) makes sure that Specialise correctly specialises DFun
              unfoldings (which it didn't before)
      
      The two got put together because both ended up interacting in the
      specialiser.
      
      They cause zero difference to nofib.
      6e0f6ede
  11. 07 Aug, 2014 2 commits
    • Simon Peyton Jones's avatar
      Refactor the handling of case-elimination · 8367f062
      Simon Peyton Jones authored
      Mainly in Simplify.rebuildCase.  The old code wasn't wrong, but I kept
      mis-understanding it.  This patch cuts splits out "pure seq" from "strict
      let", which makes it much easier to grok.
      8367f062
    • Simon Peyton Jones's avatar
      Document the maintenance of the let/app invariant in the simplifier · db17d58d
      Simon Peyton Jones authored
      It's not obvious why the simplifier generates code that correctly satisfies
      the let/app invariant.   This patch does some minor refactoring, but the main
      point is to document pre-conditions to key functions, namely that the rhs
      passed in satisfies the let/app invariant.
      
      There shouldn't be any change in behaviour.
      db17d58d
  12. 15 May, 2014 1 commit
    • Herbert Valerio Riedel's avatar
      Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized, while following the convention, to
      
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      
      - Moreover, if the list of language extensions fit into a single
        `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
      
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
      23892440
  13. 08 May, 2014 1 commit
  14. 25 Mar, 2014 1 commit
  15. 24 Mar, 2014 2 commits
    • Simon Peyton Jones's avatar
      Eliminate redundant seq's (Trac #8900) · 0b6fa3e9
      Simon Peyton Jones authored
      This patch makes the simplifier eliminate a redundant seq like
          case x of y -> ...y....
      where y is used strictly.  GHC used to do this, but I made it less
      aggressive in
      
         commit 28d9a032 (Jan 2013)
      
      However #8900 shows that doing so sometimes loses good
      transformations; and the transformation is valid according to "A
      semantics for imprecise exceptions".  So I'm restoring the old
      behaviour.
      
      See Note [Eliminating redundant seqs]
      0b6fa3e9
    • Simon Peyton Jones's avatar
      Comments only · 5c7ced0f
      Simon Peyton Jones authored
      5c7ced0f
  16. 18 Mar, 2014 1 commit
    • Simon Peyton Jones's avatar
      Make sure we occurrence-analyse unfoldings (fixes Trac #8892) · 87bbc69c
      Simon Peyton Jones authored
      For DFunUnfoldings we were failing to occurrence-analyse the unfolding,
      and that meant that a loop breaker wasn't marked as such, which in turn
      meant it was inlined away when it still had occurrence sites.  See
      Note [Occurrrence analysis of unfoldings] in CoreUnfold.
      
      This is a pretty long-standing bug, happily nailed by John Lato.
      87bbc69c
  17. 01 Feb, 2014 1 commit
  18. 16 Dec, 2013 1 commit
  19. 22 Nov, 2013 1 commit
    • Simon Peyton Jones's avatar
      Replace (State# RealWorld) with Void# where we just want a 0-bit value · f4384647
      Simon Peyton Jones authored
      We were re-using the super-magical "state token" type (which has
      VoidRep and is zero bits wide) for situations in which we simply want
      to lambda-abstract over a zero-bit argument. For example, join points:
      
         case (case x of { True -> e1; False -> e2 }) of
            Red  -> f1
            Blue -> True
      
      ==>
      
        let $j1 = \voidArg::Void# -> f1
        in
        case x of
          True -> case e1 of
                    Red -> $j1 void
                    Blue -> True
          False -> case e2 of
                    Red -> $j1 void
                    Blue -> True
      
      This patch introduces
      
         * The new primitive type GHC.Prim.Void#, with PrimRep = VoidRep
      
         * A new global Id GHC.Prim.voidPrimId :: Void#.
           This has no binding because the code generator drops it,
           but is used as an argument (eg in the call of $j1)
      
         * A new local Id, MkId.voidArgId, which can be lambda-bound
           when you need to lambda-abstract over it.
      
      and uses them throughout.
      
      Now the State# thing is used only when we need state!
      f4384647
  20. 21 Nov, 2013 2 commits
  21. 12 Nov, 2013 1 commit
    • Simon Peyton Jones's avatar
      Improve eta expansion (again) · 802f4b89
      Simon Peyton Jones authored
      The presenting issue was that we were never eta-expanding
      
          f (\x -> case x of (a,b) -> \s -> blah)
      
      and that meant we were allocating two lambdas instead of one.
      See Note [Eta expanding lambdas] in SimplUtils.
      
      However I didn't want to eta expand the lambda, and then try all over
      again for tryEtaExpandRhs.  Yet the latter is important in the context
      of a let-binding it can do simple arity analysis.  So I ended up
      refactoring CallCtxt so that it tells when we are on the RHS of a let.
      
      I also moved findRhsArity from SimplUtils to CoreArity.
      
      Performance increases nicely. Here are the ones where allocation improved
      by more than 0.5%. Notice the nice decrease in binary size too.
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
                 ansi          -2.3%     -0.9%      0.00      0.00     +0.0%
                 bspt          -2.1%     -9.7%      0.01      0.01    -33.3%
                fasta          -1.8%    -11.7%     -3.4%     -3.6%     +0.0%
                  fft          -1.9%     -1.3%      0.06      0.06    +11.1%
      reverse-complem          -1.9%    -18.1%     -1.9%     -2.8%     +0.0%
               sphere          -1.8%     -4.5%      0.09      0.09     +0.0%
            transform          -1.8%     -2.3%     -4.6%     -3.1%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -3.0%    -18.1%    -13.9%    -14.6%    -35.7%
                  Max          -1.3%     +0.0%     +7.7%     +7.7%    +50.0%
       Geometric Mean          -1.9%     -0.6%     -2.1%     -2.1%     -0.2%
      802f4b89
  22. 23 Oct, 2013 1 commit
  23. 18 Oct, 2013 2 commits
  24. 18 Sep, 2013 2 commits
    • Jan Stolarek's avatar
      Restore old names of comparison primops · 53948f91
      Jan Stolarek authored
      In 6579a6c7 we removed existing comparison primops and introduced new ones
      returning Int# instead of Bool. This commit (and associated commits in
      array, base, dph, ghc-prim, integer-gmp, integer-simple, primitive, testsuite and
      template-haskell) restores old names of primops. This allows us to keep
      our API cleaner at the price of not having backwards compatibility.
      
      This patch also temporalily disables fix for #8317 (optimization of
      tagToEnum# at Core level). We need to fix #8326 first, otherwise
      our primops code will be very slow.
      53948f91
    • Simon Peyton Jones's avatar
      Optimise (case tagToEnum# x of ..) as in Trac #8317 · 62c40585
      Simon Peyton Jones authored
      See Note [Optimising tagToEnum#] in Simplify
      62c40585
  25. 29 Aug, 2013 1 commit
  26. 06 Jun, 2013 1 commit
  27. 30 May, 2013 1 commit
    • Simon Peyton Jones's avatar
      Make 'SPECIALISE instance' work again · 1ed04090
      Simon Peyton Jones authored
      This is a long-standing regression (Trac #7797), which meant that in
      particular the Eq [Char] instance does not get specialised.
      (The *methods* do, but the dictionary itself doesn't.)  So when you
      call a function
           f :: Eq a => blah
      on a string type (ie a=[Char]), 7.6 passes a dictionary of un-specialised
      methods.
      
      This only matters when calling an overloaded function from a
      specialised context, but that does matter in some programs.  I
      remember (though I cannot find the details) that Nick Frisby discovered
      this to be the source of some pretty solid performanc regresisons.
      
      Anyway it works now. The key change is that a DFunUnfolding now takes
      a form that is both simpler than before (the DFunArg type is eliminated)
      and more general:
      
      data Unfolding
        = ...
        | DFunUnfolding {     -- The Unfolding of a DFunId
          			-- See Note [DFun unfoldings]
            		  	--     df = /\a1..am. \d1..dn. MkD t1 .. tk
                              --                                 (op1 a1..am d1..dn)
           		      	--     	    	      	       	   (op2 a1..am d1..dn)
              df_bndrs :: [Var],      -- The bound variables [a1..m],[d1..dn]
              df_con   :: DataCon,    -- The dictionary data constructor (never a newtype datacon)
              df_args  :: [CoreExpr]  -- Args of the data con: types, superclasses and methods,
          }                           -- in positional order
      
      That in turn allowed me to re-enable the DFunUnfolding specialisation in
      DsBinds.  Lots of details here in TcInstDcls:
      	  Note [SPECIALISE instance pragmas]
      
      I also did some refactoring, in particular to pass the InScopeSet to
      exprIsConApp_maybe (which in turn means it has to go to a RuleFun).
      
      NB: Interface file format has changed!
      1ed04090
  28. 30 Jan, 2013 1 commit
  29. 24 Jan, 2013 1 commit
    • Simon Peyton Jones's avatar
      Introduce CPR for sum types (Trac #5075) · d3b8991b
      Simon Peyton Jones authored
      The main payload of this patch is to extend CPR so that it
      detects when a function always returns a result constructed
      with the *same* constructor, even if the constructor comes from
      a sum type.  This doesn't matter very often, but it does improve
      some things (results below).
      
      Binary sizes increase a little bit, I think because there are more
      wrappers.  This with -split-objs.  Without split-ojbs binary sizes
      increased by 6% even for HelloWorld.hs.  It's hard to see exactly why,
      but I think it was because System.Posix.Types.o got included in the
      linked binary, whereas it didn't before.
      
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
                fluid          +1.8%     -0.3%      0.01      0.01     +0.0%
                  tak          +2.2%     -0.2%      0.02      0.02     +0.0%
                 ansi          +1.7%     -0.3%      0.00      0.00     +0.0%
            cacheprof          +1.6%     -0.3%     +0.6%     +0.5%     +1.4%
              parstof          +1.4%     -4.4%      0.00      0.00     +0.0%
              reptile          +2.0%     +0.3%      0.02      0.02     +0.0%
      ----------------------------------------------------------------------
                  Min          +1.1%     -4.4%     -4.7%     -4.7%    -15.0%
                  Max          +2.3%     +0.3%     +8.3%     +9.4%    +50.0%
       Geometric Mean          +1.9%     -0.1%     +0.6%     +0.7%     +0.3%
      
      Other things in this commit
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~
      * Got rid of the Lattice class in Demand
      
      * Refactored the way that products and newtypes are
        decomposed (no change in functionality)
      d3b8991b
  30. 22 Jan, 2013 1 commit
  31. 17 Jan, 2013 1 commit
    • Simon Peyton Jones's avatar
      Major patch to implement the new Demand Analyser · 0831a12e
      Simon Peyton Jones authored
      This patch is the result of Ilya Sergey's internship at MSR.  It
      constitutes a thorough overhaul and simplification of the demand
      analyser.  It makes a solid foundation on which we can now build.
      Main changes are
      
      * Instead of having one combined type for Demand, a Demand is
         now a pair (JointDmd) of
            - a StrDmd and
            - an AbsDmd.
         This allows strictness and absence to be though about quite
         orthogonally, and greatly reduces brain melt-down.
      
      * Similarly in the DmdResult type, it's a pair of
           - a PureResult (indicating only divergence/non-divergence)
           - a CPRResult (which deals only with the CPR property
      
      * In IdInfo, the
          strictnessInfo field contains a StrictSig, not a Maybe StrictSig
          demandInfo     field contains a Demand, not a Maybe Demand
        We don't need Nothing (to indicate no strictness/demand info)
        any more; topSig/topDmd will do.
      
      * Remove "boxity" analysis entirely.  This was an attempt to
        avoid "reboxing", but it added complexity, is extremely
        ad-hoc, and makes very little difference in practice.
      
      * Remove the "unboxing strategy" computation. This was an an
        attempt to ensure that a worker didn't get zillions of
        arguments by unboxing big tuples.  But in fact removing it
        DRAMATICALLY reduces allocation in an inner loop of the
        I/O library (where the threshold argument-count had been
        set just too low).  It's exceptional to have a zillion arguments
        and I don't think it's worth the complexity, especially since
        it turned out to have a serious performance hit.
      
      * Remove quite a bit of ad-hoc cruft
      
      * Move worthSplittingFun, worthSplittingThunk from WorkWrap to
        Demand. This allows JointDmd to be fully abstract, examined
        only inside Demand.
      
      Everything else really follows from these changes.
      
      All of this is really just refactoring, so we don't expect
      big performance changes, but acutally the numbers look quite
      good.  Here is a full nofib run with some highlights identified:
      
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
               expert          -2.6%    -15.5%      0.00      0.00     +0.0%
                fluid          -2.4%     -7.1%      0.01      0.01     +0.0%
                   gg          -2.5%    -28.9%      0.02      0.02    -33.3%
            integrate          -2.6%     +3.2%     +2.6%     +2.6%     +0.0%
              mandel2          -2.6%     +4.2%      0.01      0.01     +0.0%
             nucleic2          -2.0%    -16.3%      0.11      0.11     +0.0%
                 para          -2.6%    -20.0%    -11.8%    -11.7%     +0.0%
               parser          -2.5%    -17.9%      0.05      0.05     +0.0%
               prolog          -2.6%    -13.0%      0.00      0.00     +0.0%
               puzzle          -2.6%     +2.2%     +0.8%     +0.8%     +0.0%
              sorting          -2.6%    -35.9%      0.00      0.00     +0.0%
             treejoin          -2.6%    -52.2%     -9.8%     -9.9%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -2.7%    -52.2%    -11.8%    -11.7%    -33.3%
                  Max          -1.8%     +4.2%    +10.5%    +10.5%     +7.7%
       Geometric Mean          -2.5%     -2.8%     -0.4%     -0.5%     -0.4%
      
      Things to note
      
      * Binary sizes are smaller. I don't know why, but it's good.
      
      * Allocation is sometiemes a *lot* smaller. I believe that all the big numbers
        (I checked treejoin, gg, sorting) arise from one place, namely a function
        GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of
        which have several arugments.  Not w/w'ing both arguments (which is what
        we did before) has a big effect.  So the big win in actually somewhat
        accidental, gained by removing the "unboxing strategy" code.
      
      * A couple of benchmarks allocate slightly more.  This turns out
        to be due to reboxing (integrate).  But the biggest increase is
        mandel2, and *that* turned out also to be a somewhat accidental
        loss of CSE, and pointed the way to doing better CSE: see Trac
        #7596.
      
      * Runtimes are never very reliable, but seem to improve very slightly.
      
      All in all, a good piece of work.  Thank you Ilya!
      0831a12e
  32. 04 Jan, 2013 1 commit
    • Simon Peyton Jones's avatar
      Make CaseElim a bit less aggressive · 28d9a032
      Simon Peyton Jones authored
      See Note [Case elimination: lifted case]:
      
      We used to do case elimination if
              (c) the scrutinee is a variable and 'x' is used strictly
      But that changes
          case x of { _ -> error "bad" }
          --> error "bad"
      which is very puzzling if 'x' is later bound to (error "good").
      Where the order of evaluation is specified (via seq or case)
      we should respect it.
      
      c.f. Note [Empty case alternatives] in CoreSyn, which is how
      I came across this.
      28d9a032
  33. 02 Jan, 2013 2 commits