1. 04 Nov, 2014 1 commit
  2. 14 Oct, 2014 1 commit
  3. 01 Oct, 2014 1 commit
  4. 29 Aug, 2014 1 commit
  5. 28 Aug, 2014 2 commits
    • Simple refactor of the case-of-case transform · a0b2897e
      Simon Peyton Jones authored
      More modular, less code.  No change in behaviour.
      a0b2897e
    • Refactor unfoldings · 6e0f6ede
      Simon Peyton Jones authored
      There are two main refactorings here
      
      1.  Move the uf_arity field
             out of CoreUnfolding
             into UnfWhen
          It's a lot tidier there.  If I've got this right, no behaviour
          should change.
      
      2.  Define specUnfolding and use it in DsBinds and Specialise
           a) commons-up some shared code
           b) makes sure that Specialise correctly specialises DFun
              unfoldings (which it didn't before)
      
      The two got put together because both ended up interacting in the
      specialiser.
      
      They cause zero difference to nofib.
      6e0f6ede
  6. 07 Aug, 2014 2 commits
    • Refactor the handling of case-elimination · 8367f062
      Simon Peyton Jones authored
      Mainly in Simplify.rebuildCase.  The old code wasn't wrong, but I kept
      mis-understanding it.  This patch splits out "pure seq" from "strict
      let", which makes it much easier to grok.
      8367f062
    • Document the maintenance of the let/app invariant in the simplifier · db17d58d
      Simon Peyton Jones authored
      It's not obvious why the simplifier generates code that correctly satisfies
      the let/app invariant.   This patch does some minor refactoring, but the main
      point is to document pre-conditions to key functions, namely that the rhs
      passed in satisfies the let/app invariant.
      
      There shouldn't be any change in behaviour.
      db17d58d
  7. 15 May, 2014 1 commit
    • Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized, while following the convention, to
      
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      
      - Moreover, if the list of language extensions fits into a single
        `{-# LANGUAGE ... #-}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... #-}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
      
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurrences by `{-# OPTIONS_GHC ... #-}` pragmas.
      23892440
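As an illustration of the layout convention described in the commit above, a module header might look like the following sketch. The extensions, the option and the `sumStrict` function are placeholders chosen for the example; they are not taken from the GHC tree.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# OPTIONS_GHC -Wall #-}

-- LANGUAGE pragmas come first, one extension per line in alphabetical
-- order; the OPTIONS_GHC line follows them.

-- Strict accumulation. BangPatterns is needed for the bangs on acc, and
-- ScopedTypeVariables lets the local signature mention the outer 'a'.
sumStrict :: forall a. Num a => [a] -> a
sumStrict = go 0
  where
    go :: a -> [a] -> a
    go !acc []       = acc
    go !acc (y : ys) = go (acc + y) ys
```

Loaded into GHCi, `sumStrict [1, 2, 3 :: Int]` evaluates to `6`.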
  8. 08 May, 2014 1 commit
  9. 25 Mar, 2014 1 commit
  10. 24 Mar, 2014 2 commits
    • Eliminate redundant seq's (Trac #8900) · 0b6fa3e9
      Simon Peyton Jones authored
      This patch makes the simplifier eliminate a redundant seq like
          case x of y -> ...y....
      where y is used strictly.  GHC used to do this, but I made it less
      aggressive in
      
         commit 28d9a032 (Jan 2013)
      
      However #8900 shows that doing so sometimes loses good
      transformations; and the transformation is valid according to "A
      semantics for imprecise exceptions".  So I'm restoring the old
      behaviour.
      
      See Note [Eliminating redundant seqs]
      0b6fa3e9
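A source-level analogue of the redundant seq may help. This is only a sketch: the commit is about Core, where `case x of y -> ...` forces `x`, and the function names below are invented for the example.

```haskell
-- The explicit seq forces x before the addition is attempted.
withSeq :: Int -> Int
withSeq x = x `seq` (x + 1)

-- (+) at type Int is already strict in x, so forcing x beforehand adds
-- nothing. This is the shape the simplifier can reach by dropping the seq.
withoutSeq :: Int -> Int
withoutSeq x = x + 1
```

Both functions return the same result for every defined argument, and both diverge when `x` diverges, which is why the seq counts as redundant here.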
    • Comments only · 5c7ced0f
      Simon Peyton Jones authored
      5c7ced0f
  11. 18 Mar, 2014 1 commit
    • Make sure we occurrence-analyse unfoldings (fixes Trac #8892) · 87bbc69c
      Simon Peyton Jones authored
      For DFunUnfoldings we were failing to occurrence-analyse the unfolding,
      and that meant that a loop breaker wasn't marked as such, which in turn
      meant it was inlined away when it still had occurrence sites.  See
      Note [Occurrence analysis of unfoldings] in CoreUnfold.
      
      This is a pretty long-standing bug, happily nailed by John Lato.
      87bbc69c
  12. 01 Feb, 2014 1 commit
  13. 16 Dec, 2013 1 commit
  14. 22 Nov, 2013 1 commit
    • Replace (State# RealWorld) with Void# where we just want a 0-bit value · f4384647
      Simon Peyton Jones authored
      We were re-using the super-magical "state token" type (which has
      VoidRep and is zero bits wide) for situations in which we simply want
      to lambda-abstract over a zero-bit argument. For example, join points:
      
         case (case x of { True -> e1; False -> e2 }) of
            Red  -> f1
            Blue -> True
      
      ==>
      
        let $j1 = \voidArg::Void# -> f1
        in
        case x of
          True -> case e1 of
                    Red -> $j1 void
                    Blue -> True
          False -> case e2 of
                    Red -> $j1 void
                    Blue -> True
      
      This patch introduces
      
         * The new primitive type GHC.Prim.Void#, with PrimRep = VoidRep
      
         * A new global Id GHC.Prim.voidPrimId :: Void#.
           This has no binding because the code generator drops it,
           but is used as an argument (eg in the call of $j1)
      
         * A new local Id, MkId.voidArgId, which can be lambda-bound
           when you need to lambda-abstract over it.
      
      and uses them throughout.
      
      Now the State# thing is used only when we need state!
      f4384647
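The join-point transformation in the commit above can be mimicked in ordinary Haskell. In this sketch the unit value `()` stands in for the zero-bit `Void#` argument, and `Colour`, `before`, `after` and `j1` are names made up for the example; real join points live in Core, not in source code.

```haskell
data Colour = Red | Blue

-- Case-of-case: the outer case scrutinises the result of the inner one.
before :: Bool -> Colour -> Colour -> Bool -> Bool
before x e1 e2 f1 =
  case (case x of { True -> e1; False -> e2 }) of
    Red  -> f1
    Blue -> True

-- After the transform the outer alternatives are pushed into each inner
-- branch. The shared Red branch becomes a join point j1 that abstracts
-- over a dummy argument, so f1 is not duplicated.
after :: Bool -> Colour -> Colour -> Bool -> Bool
after x e1 e2 f1 =
  let j1 = \() -> f1
  in case x of
       True  -> case e1 of { Red -> j1 (); Blue -> True }
       False -> case e2 of { Red -> j1 (); Blue -> True }
```

`before` and `after` agree on every combination of arguments; the dummy argument exists only so that `j1` is a function rather than a shared thunk.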
  15. 21 Nov, 2013 2 commits
  16. 12 Nov, 2013 1 commit
    • Improve eta expansion (again) · 802f4b89
      Simon Peyton Jones authored
      The presenting issue was that we were never eta-expanding
      
          f (\x -> case x of (a,b) -> \s -> blah)
      
      and that meant we were allocating two lambdas instead of one.
      See Note [Eta expanding lambdas] in SimplUtils.
      
      However I didn't want to eta expand the lambda, and then try all over
      again for tryEtaExpandRhs.  Yet the latter is important: in the context
      of a let-binding it can do simple arity analysis.  So I ended up
      refactoring CallCtxt so that it tells when we are on the RHS of a let.
      
      I also moved findRhsArity from SimplUtils to CoreArity.
      
      Performance increases nicely. Here are the ones where allocation improved
      by more than 0.5%. Notice the nice decrease in binary size too.
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
                 ansi          -2.3%     -0.9%      0.00      0.00     +0.0%
                 bspt          -2.1%     -9.7%      0.01      0.01    -33.3%
                fasta          -1.8%    -11.7%     -3.4%     -3.6%     +0.0%
                  fft          -1.9%     -1.3%      0.06      0.06    +11.1%
      reverse-complem          -1.9%    -18.1%     -1.9%     -2.8%     +0.0%
               sphere          -1.8%     -4.5%      0.09      0.09     +0.0%
            transform          -1.8%     -2.3%     -4.6%     -3.1%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -3.0%    -18.1%    -13.9%    -14.6%    -35.7%
                  Max          -1.3%     +0.0%     +7.7%     +7.7%    +50.0%
       Geometric Mean          -1.9%     -0.6%     -2.1%     -2.1%     -0.2%
      802f4b89
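To make the "two lambdas instead of one" point in the commit above concrete, here is a small source-level sketch. The function names and the body `a + b + s` are placeholders for the commit's `blah`; whether GHC actually eta-expands a given expression depends on its arity analysis.

```haskell
-- Un-expanded form: the pattern match sits between two lambdas, so
-- applying the function to x yields a second, inner lambda.
nested :: (Int, Int) -> Int -> Int
nested = \x -> case x of (a, b) -> \s -> a + b + s

-- Eta-expanded form: a single lambda takes both arguments, and the
-- pattern match happens once both are available.
expanded :: (Int, Int) -> Int -> Int
expanded = \x s -> case x of (a, b) -> a + b + s
```

Both give the same answer when fully applied, for example `nested (1, 2) 3` and `expanded (1, 2) 3` are both `6`. They differ only under partial application to a diverging pair, which is one reason the expansion needs care.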
  17. 23 Oct, 2013 1 commit
  18. 18 Oct, 2013 2 commits
  19. 18 Sep, 2013 2 commits
    • Restore old names of comparison primops · 53948f91
      Jan Stolarek authored
      In 6579a6c7 we removed existing comparison primops and introduced new ones
      returning Int# instead of Bool. This commit (and associated commits in
      array, base, dph, ghc-prim, integer-gmp, integer-simple, primitive, testsuite and
      template-haskell) restores old names of primops. This allows us to keep
      our API cleaner at the price of not having backwards compatibility.
      
      This patch also temporarily disables the fix for #8317 (optimization of
      tagToEnum# at Core level). We need to fix #8326 first, otherwise
      our primops code will be very slow.
      53948f91
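For readers unfamiliar with the Int#-returning comparison primops, the sketch below shows how one is typically used from library code. It assumes a GHC recent enough to export `isTrue#` from `GHC.Exts` (7.8 or later); `gtInt` is a name invented for the example.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), isTrue#, (>#))

-- (>#) compares two unboxed ints and returns an Int# (1# for true,
-- 0# for false). isTrue# converts that Int# into an ordinary Bool.
gtInt :: Int -> Int -> Bool
gtInt (I# a) (I# b) = isTrue# (a ># b)
```

With this definition `gtInt 3 2` is `True` and `gtInt 2 3` is `False`.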
    • Optimise (case tagToEnum# x of ..) as in Trac #8317 · 62c40585
      Simon Peyton Jones authored
      See Note [Optimising tagToEnum#] in Simplify
      62c40585
  20. 29 Aug, 2013 1 commit
  21. 06 Jun, 2013 1 commit
  22. 30 May, 2013 1 commit
    • Make 'SPECIALISE instance' work again · 1ed04090
      Simon Peyton Jones authored
      This is a long-standing regression (Trac #7797), which meant that in
      particular the Eq [Char] instance does not get specialised.
      (The *methods* do, but the dictionary itself doesn't.)  So when you
      call a function
           f :: Eq a => blah
      on a string type (ie a=[Char]), 7.6 passes a dictionary of un-specialised
      methods.
      
      This only matters when calling an overloaded function from a
      specialised context, but that does matter in some programs.  I
      remember (though I cannot find the details) that Nick Frisby discovered
      this to be the source of some pretty solid performance regressions.
      
      Anyway it works now. The key change is that a DFunUnfolding now takes
      a form that is both simpler than before (the DFunArg type is eliminated)
      and more general:
      
      data Unfolding
        = ...
        | DFunUnfolding {     -- The Unfolding of a DFunId
                              -- See Note [DFun unfoldings]
                              --     df = /\a1..am. \d1..dn. MkD t1 .. tk
                              --                                 (op1 a1..am d1..dn)
                              --                                 (op2 a1..am d1..dn)
            df_bndrs :: [Var],      -- The bound variables [a1..am],[d1..dn]
            df_con   :: DataCon,    -- The dictionary data constructor (never a newtype datacon)
            df_args  :: [CoreExpr]  -- Args of the data con: types, superclasses and methods,
          }                         -- in positional order
      
      That in turn allowed me to re-enable the DFunUnfolding specialisation in
      DsBinds.  Lots of details here in TcInstDcls:
      	  Note [SPECIALISE instance pragmas]
      
      I also did some refactoring, in particular to pass the InScopeSet to
      exprIsConApp_maybe (which in turn means it has to go to a RuleFun).
      
      NB: Interface file format has changed!
      1ed04090
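For reference, this is what a `SPECIALISE instance` pragma looks like at the source level. The `Box` type and `sameBox` function are invented for the example; the commit itself concerns instances such as Eq [Char] in the base library.

```haskell
newtype Box a = Box a

instance Eq a => Eq (Box a) where
  -- Ask GHC for a copy of this whole instance specialised to a = Char,
  -- so that callers at that type can use a dictionary of specialised
  -- methods instead of the overloaded one.
  {-# SPECIALISE instance Eq (Box Char) #-}
  Box x == Box y = x == y

-- A use at the specialised type.
sameBox :: Box Char -> Box Char -> Bool
sameBox = (==)
```

The pragma does not change results, only which dictionary is passed; `sameBox (Box 'a') (Box 'a')` is `True` either way.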
  23. 30 Jan, 2013 1 commit
  24. 24 Jan, 2013 1 commit
    • Introduce CPR for sum types (Trac #5075) · d3b8991b
      Simon Peyton Jones authored
      The main payload of this patch is to extend CPR so that it
      detects when a function always returns a result constructed
      with the *same* constructor, even if the constructor comes from
      a sum type.  This doesn't matter very often, but it does improve
      some things (results below).
      
      Binary sizes increase a little bit, I think because there are more
      wrappers.  This is with -split-objs.  Without -split-objs binary sizes
      increased by 6% even for HelloWorld.hs.  It's hard to see exactly why,
      but I think it was because System.Posix.Types.o got included in the
      linked binary, whereas it didn't before.
      
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
                fluid          +1.8%     -0.3%      0.01      0.01     +0.0%
                  tak          +2.2%     -0.2%      0.02      0.02     +0.0%
                 ansi          +1.7%     -0.3%      0.00      0.00     +0.0%
            cacheprof          +1.6%     -0.3%     +0.6%     +0.5%     +1.4%
              parstof          +1.4%     -4.4%      0.00      0.00     +0.0%
              reptile          +2.0%     +0.3%      0.02      0.02     +0.0%
      ----------------------------------------------------------------------
                  Min          +1.1%     -4.4%     -4.7%     -4.7%    -15.0%
                  Max          +2.3%     +0.3%     +8.3%     +9.4%    +50.0%
       Geometric Mean          +1.9%     -0.1%     +0.6%     +0.7%     +0.3%
      
      Other things in this commit
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~
      * Got rid of the Lattice class in Demand
      
      * Refactored the way that products and newtypes are
        decomposed (no change in functionality)
      d3b8991b
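The kind of function that CPR for sum types targets can be sketched as follows. The `Shape` type and all function names are hypothetical, and the worker/wrapper split is written by hand at the source level; GHC's generated worker would return unboxed components rather than a boxed `Int`.

```haskell
data Shape = Circle Int | Square Int
  deriving (Eq, Show)

-- Shape is a sum type, yet every branch builds its result with the
-- same constructor, Circle.
clampCircle :: Int -> Shape
clampCircle n
  | n > 0     = Circle (2 * n)
  | otherwise = Circle 0

-- Hand-written split: the worker computes only the payload ...
clampCircleWorker :: Int -> Int
clampCircleWorker n
  | n > 0     = 2 * n
  | otherwise = 0

-- ... and the wrapper re-applies the constructor that is always used.
clampCircleWrapper :: Int -> Shape
clampCircleWrapper n = Circle (clampCircleWorker n)
```

`clampCircle` and `clampCircleWrapper` return the same `Shape` for every input; the benefit of the split is that a caller which immediately takes the result apart need not allocate the `Circle` box.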
  25. 22 Jan, 2013 1 commit
  26. 17 Jan, 2013 1 commit
    • Major patch to implement the new Demand Analyser · 0831a12e
      Simon Peyton Jones authored
      This patch is the result of Ilya Sergey's internship at MSR.  It
      constitutes a thorough overhaul and simplification of the demand
      analyser.  It makes a solid foundation on which we can now build.
      Main changes are
      
      * Instead of having one combined type for Demand, a Demand is
         now a pair (JointDmd) of
            - a StrDmd and
            - an AbsDmd.
         This allows strictness and absence to be thought about quite
         orthogonally, and greatly reduces brain melt-down.
      
      * Similarly in the DmdResult type, it's a pair of
           - a PureResult (indicating only divergence/non-divergence)
           - a CPRResult (which deals only with the CPR property)
      
      * In IdInfo, the
          strictnessInfo field contains a StrictSig, not a Maybe StrictSig
          demandInfo     field contains a Demand, not a Maybe Demand
        We don't need Nothing (to indicate no strictness/demand info)
        any more; topSig/topDmd will do.
      
      * Remove "boxity" analysis entirely.  This was an attempt to
        avoid "reboxing", but it added complexity, is extremely
        ad-hoc, and makes very little difference in practice.
      
      * Remove the "unboxing strategy" computation. This was an
        attempt to ensure that a worker didn't get zillions of
        arguments by unboxing big tuples.  But in fact removing it
        DRAMATICALLY reduces allocation in an inner loop of the
        I/O library (where the threshold argument-count had been
        set just too low).  It's exceptional to have a zillion arguments
        and I don't think it's worth the complexity, especially since
        it turned out to have a serious performance hit.
      
      * Remove quite a bit of ad-hoc cruft
      
      * Move worthSplittingFun, worthSplittingThunk from WorkWrap to
        Demand. This allows JointDmd to be fully abstract, examined
        only inside Demand.
      
      Everything else really follows from these changes.
      
      All of this is really just refactoring, so we don't expect
      big performance changes, but actually the numbers look quite
      good.  Here is a full nofib run with some highlights identified:
      
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
               expert          -2.6%    -15.5%      0.00      0.00     +0.0%
                fluid          -2.4%     -7.1%      0.01      0.01     +0.0%
                   gg          -2.5%    -28.9%      0.02      0.02    -33.3%
            integrate          -2.6%     +3.2%     +2.6%     +2.6%     +0.0%
              mandel2          -2.6%     +4.2%      0.01      0.01     +0.0%
             nucleic2          -2.0%    -16.3%      0.11      0.11     +0.0%
                 para          -2.6%    -20.0%    -11.8%    -11.7%     +0.0%
               parser          -2.5%    -17.9%      0.05      0.05     +0.0%
               prolog          -2.6%    -13.0%      0.00      0.00     +0.0%
               puzzle          -2.6%     +2.2%     +0.8%     +0.8%     +0.0%
              sorting          -2.6%    -35.9%      0.00      0.00     +0.0%
             treejoin          -2.6%    -52.2%     -9.8%     -9.9%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -2.7%    -52.2%    -11.8%    -11.7%    -33.3%
                  Max          -1.8%     +4.2%    +10.5%    +10.5%     +7.7%
       Geometric Mean          -2.5%     -2.8%     -0.4%     -0.5%     -0.4%
      
      Things to note
      
      * Binary sizes are smaller. I don't know why, but it's good.
      
      * Allocation is sometimes a *lot* smaller. I believe that all the big numbers
        (I checked treejoin, gg, sorting) arise from one place, namely a function
        GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of
        which have several arguments.  Not w/w'ing both arguments (which is what
        we did before) has a big effect.  So the big win is actually somewhat
        accidental, gained by removing the "unboxing strategy" code.
      
      * A couple of benchmarks allocate slightly more.  This turns out
        to be due to reboxing (integrate).  But the biggest increase is
        mandel2, and *that* turned out also to be a somewhat accidental
        loss of CSE, and pointed the way to doing better CSE: see Trac
        #7596.
      
      * Runtimes are never very reliable, but seem to improve very slightly.
      
      All in all, a good piece of work.  Thank you Ilya!
      0831a12e
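The idea of a demand as a pair of independent components can be shown with a toy model. The type names `StrDmd`, `AbsDmd` and `JointDmd` come from the commit message, but the two-point lattices, the constructor names and the `lub` functions below are simplifying assumptions; GHC's real definitions are richer.

```haskell
-- Strictness component: is the value certainly evaluated?
data StrDmd = Lazy | Strict
  deriving (Eq, Show)

-- Absence/usage component: is the value possibly used at all?
data AbsDmd = Absent | Used
  deriving (Eq, Show)

-- A demand is just the pair of the two components.
data JointDmd = JD StrDmd AbsDmd
  deriving (Eq, Show)

-- Least upper bound, used when joining the branches of a case:
-- strict only if strict in both branches ...
lubStr :: StrDmd -> StrDmd -> StrDmd
lubStr Strict Strict = Strict
lubStr _      _      = Lazy

-- ... and absent only if absent in both branches.
lubAbs :: AbsDmd -> AbsDmd -> AbsDmd
lubAbs Absent Absent = Absent
lubAbs _      _      = Used

-- The joint operation works on each component independently.
lubDmd :: JointDmd -> JointDmd -> JointDmd
lubDmd (JD s1 a1) (JD s2 a2) = JD (lubStr s1 s2) (lubAbs a1 a2)
```

Because `lubDmd` is defined componentwise, strictness and absence never have to be reasoned about together, which is the simplification the commit describes.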
  27. 04 Jan, 2013 1 commit
    • Make CaseElim a bit less aggressive · 28d9a032
      Simon Peyton Jones authored
      See Note [Case elimination: lifted case]:
      
      We used to do case elimination if
              (c) the scrutinee is a variable and 'x' is used strictly
      But that changes
          case x of { _ -> error "bad" }
          --> error "bad"
      which is very puzzling if 'x' is later bound to (error "good").
      Where the order of evaluation is specified (via seq or case)
      we should respect it.
      
      c.f. Note [Empty case alternatives] in CoreSyn, which is how
      I came across this.
      28d9a032
  28. 02 Jan, 2013 2 commits
  29. 24 Dec, 2012 1 commit
    • Make the treatment of addAltUnfoldings handle casts · 545fd8b9
      Simon Peyton Jones authored
      This minor refactoring re-attaches Note [Add unfolding for scrutinee].
      It had become detached, which led me on a bit of a wild goose
      chase.
      
      While I was at it, I made the code work right for the case where
      the scrutinee is of form (x |> co); I don't think this is an important
      improvement.
      
      I also make simplAlt unconditionally zap occurrence information on
      case-alternative binders (see Note [Case alternative occ info]);
      it was almost always being zapped and the additional complexity seems
      not worth it.
      545fd8b9
  30. 23 Dec, 2012 1 commit
    • Make {-# UNPACK #-} work for type/data family invocations · 1ee1cd41
      Simon Peyton Jones authored
      This fixes most of Trac #3990.  Consider
        data family D a
        data instance D Double = CD Int Int
        data T = T {-# UNPACK #-} !(D Double)
      Then we want the (D Double) unpacked.
      
      To do this we need to construct a suitable coercion, and it's much
      safer to record that coercion in the interface file, lest the in-scope
      instances differ somehow.  That in turn means elaborating the HsBang
      type to include a coercion.
      
      To do that I moved HsBang from BasicTypes to DataCon, which caused
      quite a few minor knock-on changes.
      
      Interface-file format has changed!
      
      Still to do: need to do knot-tying to allow instances to take effect
      within the same module.
      1ee1cd41
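A self-contained version of the example from the commit above is sketched below. The `sumT` function is added only to show the field in use. Whether the pragma actually takes effect depends on the GHC version, and the commit notes that an instance defined in the same module was not yet handled at the time.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- A data family with one instance whose representation is two Ints.
data family D a
data instance D Double = CD Int Int

-- The strict field is a data family application. With the pragma
-- honoured, T can store the two Ints of CD directly.
data T = T {-# UNPACK #-} !(D Double)

-- Pattern matching looks the same whether or not the field is unpacked.
sumT :: T -> Int
sumT (T (CD x y)) = x + y
```

For instance, `sumT (T (CD 1 2))` evaluates to `3`; unpacking changes the heap layout, not the result.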
  31. 18 Oct, 2012 1 commit
    • Refactor the way dump flags are handled · d4a19643
      ian@well-typed.com authored
      We were being inconsistent about how we tested whether dump flags
      were enabled; in particular, sometimes we also checked the verbosity,
      and sometimes we didn't.
      
      This led to oddities such as "ghc -v4" printing an "Asm code" section
      which didn't contain any code, and "-v4" enabled some parts of
      "-ddump-deriv" but not others.
      
      Now all the tests use dopt, which also takes the verbosity into account
      as appropriate.
      d4a19643
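The design described in the commit above, a single test that folds verbosity into the dump-flag check, can be modelled in a few lines. Everything here is a toy: the `Flags` record, the two flag constructors, the choice of which flag verbosity implies, and the threshold of 4 (suggested by the "-v4" in the commit message) are assumptions, not GHC's actual `dopt`.

```haskell
data DumpFlag = DumpAsm | DumpDeriv
  deriving (Eq, Show)

data Flags = Flags
  { enabledDumps :: [DumpFlag]  -- flags switched on explicitly
  , verbosity    :: Int         -- the -v level
  }

-- Which flags high verbosity switches on implicitly. The choice made
-- here is arbitrary and only for illustration.
impliedByVerbosity :: DumpFlag -> Bool
impliedByVerbosity DumpAsm   = True
impliedByVerbosity DumpDeriv = False

-- The one place that answers "is this dump enabled?". Every caller
-- goes through it, so explicit flags and verbosity are treated the
-- same way everywhere.
dumpEnabled :: DumpFlag -> Flags -> Bool
dumpEnabled flag flags =
  flag `elem` enabledDumps flags
    || (verbosity flags >= 4 && impliedByVerbosity flag)
```

Routing every check through one function is what removes the inconsistency: no call site can forget, or separately re-implement, the verbosity rule.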
  32. 16 Oct, 2012 2 commits