  1. 12 Feb, 2016 1 commit
  2. 11 Feb, 2016 1 commit
    • sizeExpr: fix a bug in the size calculation · 51a33924
      Simon Marlow authored
      There were two bugs here:
      
      * We weren't ignoring Cast in size_up_app
      * An application of a non-variable wasn't being charged correctly
      
      The result was that some things looked too cheap.  In my case I had
      things like
      
          ((f x) `cast` ...) y
      
      which was given size 21 instead of 30, and this had knock-on effects
      elsewhere that caused some large code bloat.
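
      The two bugs can be illustrated with a toy size calculator. This is only a
      sketch: the `Expr` type, the `Mode` switch, the function names and the cost
      constants below are assumptions made for illustration, not GHC's actual
      definitions. The constants were picked so that the example reproduces the
      21-versus-30 figures quoted above.

```haskell
-- Toy model of the size calculation; this is NOT GHC's CoreExpr or CoreUnfold.
data Expr
  = Var String
  | App Expr Expr
  | Lam String Expr
  | Cast Expr          -- the coercion itself is omitted in this sketch
  deriving Show

-- Which version of the (toy) algorithm to run.
data Mode = Buggy | Fixed

-- Assumed cost of a call with n value arguments.
callSize :: Int -> Int
callSize n = 10 * (1 + n)

size :: Mode -> Expr -> Int
size _ (Var _)     = 0
size m (Cast e)    = size m e
size m (Lam _ b)   = 10 + size m b          -- hypothetical lambda cost
size m e@(App _ _) = sizeApp m e 0

-- Walk down the application spine, counting the arguments seen so far.
sizeApp :: Mode -> Expr -> Int -> Int
sizeApp m     (App f a) n = size m a + sizeApp m f (n + 1)
sizeApp _     (Var _)   n = callSize n
sizeApp Fixed (Cast e)  n = sizeApp Fixed e n               -- look through casts
sizeApp Fixed other     n = size Fixed other + callSize n   -- charge a full call
sizeApp Buggy other     n = size Buggy other + n            -- undercharges the call

-- ((f x) `cast` ...) y
example :: Expr
example = App (Cast (App (Var "f") (Var "x"))) (Var "y")
```

      Under these assumptions, `size Buggy example` evaluates to 21 and
      `size Fixed example` to 30, because the fixed spine walk sees one call
      with two arguments rather than a one-argument call plus a stray argument.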
      
      Test Plan:
      * nofib runs (todo)
      * validate
      
      Reviewers: simonpj, austin, bgamari, erikd
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D1900
      
      GHC Trac Issues: #11564
  3. 27 Jan, 2016 1 commit
  4. 18 Jan, 2016 1 commit
    • Replace calls to `ptext . sLit` with `text` · b8abd852
      Jan Stolarek authored
      Summary:
      In the past the canonical way for constructing an SDoc string literal was the
      composition `ptext . sLit`.  But for some time now we have had the function `text`,
      which does the same and also has some rules that optimize its runtime behaviour.
      This patch takes all uses of `ptext . sLit` in the compiler and replaces them
      with calls to `text`.  The main benefits of this patch are cleaner (shorter) code
      and fewer dependencies between modules, because many modules now do not need to
      import `FastString`.  I don't expect any performance benefits - we mostly use
      SDocs to report errors and it seems there is little to be gained here.
      
      Test Plan: ./validate
      
      Reviewers: bgamari, austin, goldfire, hvr, alanz
      
      Subscribers: goldfire, thomie, mpickering
      
      Differential Revision: https://phabricator.haskell.org/D1784
  5. 11 Dec, 2015 1 commit
    • Add kind equalities to GHC. · 67465497
      eir@cis.upenn.edu authored
      This implements the ideas originally put forward in
      "System FC with Explicit Kind Equality" (ICFP'13).
      
      There are several noteworthy changes with this patch:
       * We now have casts in types. These change the kind
         of a type. See new constructor `CastTy`.
      
       * All types and all constructors can be promoted.
         This includes GADT constructors. GADT pattern matches
         take place in type family equations. In Core,
         types can now be applied to coercions via the
         `CoercionTy` constructor.
      
       * Coercions can now be heterogeneous, relating types
         of different kinds. A coercion proving `t1 :: k1 ~ t2 :: k2`
         proves both that `t1` and `t2` are the same and also that
         `k1` and `k2` are the same.
      
       * The `Coercion` type has been significantly enhanced.
         The documentation in `docs/core-spec/core-spec.pdf` reflects
         the new reality.
      
       * The type of `*` is now `*`. No more `BOX`.
      
       * Users can write explicit kind variables in their code,
         anywhere they can write type variables. For backward compatibility,
         automatic inference of kind-variable binding is still permitted.
      
       * The new extension `TypeInType` turns on the new user-facing
         features.
      
       * Type families and synonyms are now promoted to kinds. This causes
         trouble with parsing `*`, leading to the somewhat awkward new
         `HsAppsTy` constructor for `HsType`. This is dispatched with in
         the renamer, where the kind `*` can be told apart from a
         type-level multiplication operator. Without `-XTypeInType` the
         old behavior persists. With `-XTypeInType`, you need to import
         `Data.Kind` to get `*`, also known as `Type`.
      
       * The kind-checking algorithms in TcHsType have been significantly
         rewritten to allow for enhanced kinds.
      
       * The new features are still quite experimental and may be in flux.
      
       * TODO: Several open tickets: #11195, #11196, #11197, #11198, #11203.
      
       * TODO: Update user manual.
      
      Tickets addressed: #9017, #9173, #7961, #10524, #8566, #11142.
      Updates Haddock submodule.
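
      A small sketch of the user-facing side of this change follows. It is an
      illustration, not code from the patch: the type names are made up, and it
      assumes a newer GHC where the `TypeInType` features are available through
      `DataKinds` and `PolyKinds`, with `Type` imported from `Data.Kind`.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}

import Data.Kind (Type)

-- An ordinary data type whose constructors are promoted to the type level.
data Nat = Z | S Nat

-- A length-indexed vector: `n` has the promoted kind Nat, `a` has kind Type.
data Vec (n :: Nat) (a :: Type) where
  Nil  :: Vec 'Z a
  Cons :: a -> Vec n a -> Vec ('S n) a

-- A kind-polymorphic tag; the kind variable `k` is written explicitly.
data Tag (a :: k) = Tag

-- The same Tag works at kind Nat and at kind Type -> Type.
zeroTag :: Tag 'Z
zeroTag = Tag

maybeTag :: Tag Maybe
maybeTag = Tag

-- Total head: the index rules out the empty vector.
vhead :: Vec ('S n) a -> a
vhead (Cons x _) = x

toList :: Vec n a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

      Here `Type` plays the role that `*` plays in the message above.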
  6. 05 Oct, 2015 1 commit
  7. 21 Aug, 2015 1 commit
    • Refactor: delete most of the module FastTypes · 2f29ebbb
      thomie authored
      This reverses some of the work done in #1405, and goes back to the
      assumption that the bootstrap compiler understands GHC-haskell.
      
      In particular:
        * use MagicHash instead of _ILIT and _CLIT
        * pattern matching on I# if possible, instead of using iUnbox
          unnecessarily
        * use Int#/Char#/Addr# instead of the following type synonyms:
          - type FastInt   = Int#
          - type FastChar  = Char#
          - type FastPtr a = Addr#
        * inline the following functions:
          - iBox           = I#
          - cBox           = C#
          - fastChr        = chr#
          - fastOrd        = ord#
          - eqFastChar     = eqChar#
          - shiftLFastInt  = uncheckedIShiftL#
          - shiftR_FastInt = uncheckedIShiftRL#
          - shiftRLFastInt = uncheckedIShiftRL#
        * delete the following unused functions:
          - minFastInt
          - maxFastInt
          - uncheckedIShiftRA#
          - castFastPtr
          - panicDocFastInt and pprPanicFastInt
        * rename panicFastInt back to panic#
      
      These functions remain, since they actually do something:
        * iUnbox
        * bitAndFastInt
        * bitOrFastInt
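
      The style this refactoring moves to can be sketched as below. The three
      helper functions are made-up examples, not code from the compiler; the
      primitives themselves (`I#`, `C#`, `+#`, `uncheckedIShiftL#`, `ord#`,
      `chr#`) are imported from `GHC.Exts`.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Char (C#), Int (I#), chr#, ord#, uncheckedIShiftL#, (+#))

-- Pattern match on I# directly instead of calling an unboxing helper.
double :: Int -> Int
double (I# n) = I# (n +# n)

-- Use the primop by name rather than through a synonym such as shiftLFastInt.
-- The shift amount is not range-checked, hence "unchecked".
shiftLeft :: Int -> Int -> Int
shiftLeft (I# x) (I# s) = I# (uncheckedIShiftL# x s)

-- ord# / chr# replace fastOrd / fastChr; 1# is an unboxed literal.
nextChar :: Char -> Char
nextChar (C# c) = C# (chr# (ord# c +# 1#))
```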
      
      Test Plan: validate
      
      Reviewers: austin, bgamari
      
      Subscribers: rwbarton
      
      Differential Revision: https://phabricator.haskell.org/D1141
      
      GHC Trac Issues: #1405
  8. 03 Aug, 2015 1 commit
  9. 22 May, 2015 1 commit
    • Fix a huge space leak in the mighty Simplifier · 45d9a15c
      Simon Peyton Jones authored
      This long-standing, terrible, and somewhat subtle bug was exposed
      by Trac #10370, thanks to Reid Barton's brilliant test case (comment:3).
      
      The effect is large on the Trac #10370 test.
      Here is what the profile report says:
      
      Before:
       total time  =       24.35 secs   (24353 ticks @ 1000 us, 1 processor)
       total alloc = 11,864,360,816 bytes  (excludes profiling overheads)
      
      After:
       total time  =       21.16 secs   (21160 ticks @ 1000 us, 1 processor)
       total alloc = 7,947,141,136 bytes  (excludes profiling overheads)
      
      The /combined/ effect of the tidyOccName fix, plus this one, is dramatic
      for Trac #10370.  Here is what +RTS -s says:
      
      Before:
        15,490,210,952 bytes allocated in the heap
         1,783,919,456 bytes maximum residency (20 sample(s))
      
        MUT     time   30.117s  ( 31.383s elapsed)
        GC      time   90.103s  ( 90.107s elapsed)
        Total   time  120.843s  (122.065s elapsed)
      
      After:
         7,928,671,936 bytes allocated in the heap
            52,914,832 bytes maximum residency (25 sample(s))
      
        MUT     time   13.912s  ( 15.110s elapsed)
        GC      time    6.809s  (  6.808s elapsed)
        Total   time   20.789s  ( 21.954s elapsed)
      
      - Heap allocation halved
      - Residency cut by a factor of more than 30.
      - Elapsed time cut by a factor of 6
      
      Not bad!
      
      The details
      ~~~~~~~~~~~
      The culprit was SimplEnv.mkCoreSubst, which used mapVarEnv to do some
      impedance-matching from the substitution used by the simplifier to
      the one used by CoreSubst.  But the impedance-matching was recursive!
      
        mk_subst tv_env cv_env id_env
          = CoreSubst.mkSubst in_scope tv_env cv_env (mapVarEnv fiddle id_env)
      
        fiddle (DoneEx e)          = e
        fiddle (DoneId v)          = Var v
        fiddle (ContEx tv cv id e) = CoreSubst.substExpr (mk_subst tv cv id) e
      
      Inside fiddle, in the ContEx case, we may do another whole level of
      fiddle.  And so on.  Moreover, UniqFM (which is built on Data.IntMap) is
      strict, so the fiddling is done eagerly.  I didn't work through all the
      details but the result is a gargantuan blow-up of entirely unnecessary work.
      
      Laziness would make this go away, I think, but I don't want to mess
      with IntMap.  And in any case, the impedance matching is a royal pain.
      
      In the end I simply ceased trying to use CoreSubst.substExpr in the
      simplifier, and instead just use simplExpr.  That does mean a bit of
      duplication; e.g.  new code for simplRules.  But it's not a big deal
      and it's far more direct and easy to reason about.
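
      One way to see how an eager, recursive conversion can blow up is the toy
      count below. It is a sketch under stated assumptions, not a reconstruction
      of what the simplifier actually did: `Entry`, `Env`, `fiddleCalls` and
      `nested` are hypothetical names, environments are plain lists, and every
      `ContEx` entry at one level shares the same inner environment.

```haskell
-- Toy stand-ins for the simplifier's substitution results.
data Entry
  = DoneEx Int        -- already-converted payload
  | ContEx Env Int    -- payload that captured another environment

type Env = [Entry]

-- Number of "fiddle" calls needed to convert an environment eagerly:
-- each ContEx entry forces a full conversion of the environment it captured.
fiddleCalls :: Env -> Integer
fiddleCalls = sum . map one
  where
    one (DoneEx _)       = 1
    one (ContEx inner _) = 1 + fiddleCalls inner

-- `width` entries per level, each capturing the (shared) level below.
nested :: Int -> Int -> Env
nested _     0     = [DoneEx 0]
nested width depth = replicate width (ContEx inner 0)
  where
    inner = nested width (depth - 1)
```

      In this model the heap holds only `width` entries per level thanks to
      sharing, yet the eager conversion count grows exponentially with depth.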
      
      A bit of knock-on refactoring:
      
       * Data type ArgSummary moves to CoreUnfold.
      
       * interestingArg moves from CoreUnfold to SimplUtils, and gets a
         SimplEnv argument which can be used when we encounter a variable.
      
       * simplLamBndrs, addBndrRules move from SimplEnv to Simplify
         (because they now call simplUnfolding, simplRules resp)
      
       * SimplUtils.substExpr, substUnfolding, mkCoreSubst die completely
      
       * In Simplify several functions that were previously pure
         substitution-based functions are now monadic:
           - addBndrRules, simplRule
           - addCoerce, add_coerce in simplCast
      
       * In case 2c of Simplify.rebuildCase, there was a pretty disgusting
         expression-substitution taking place for 'rhs'; and we really don't
         want to make that monadic because 'rhs' can be big.
         Solution: reduce the arity of the rules for seq.
         See Note [User-defined RULES for seq] in MkId.
  10. 18 Mar, 2015 1 commit
  11. 16 Dec, 2014 1 commit
    • Source notes (Core support) · 993975d3
      Peter Wortmann authored
      This patch introduces "SourceNote" tickishs that link Core to the
      source code that generated it. The idea is to retain these source code
      links throughout code transformations so we can eventually relate
      object code all the way back to the original source (which we can,
      say, encode as DWARF information to allow debugging).  We generate
      these SourceNotes like other tickishs in the desugaring phase. The
      activating command line flag is "-g", consistent with the flag other
      compilers use to decide DWARF generation.
      
      Keeping ticks from getting into the way of Core transformations is
      tricky, but doable. The changes in this patch produce identical Core
      in all cases I tested -- which at this point is GHC, all libraries and
      nofib. Also note that this pass creates *lots* of tick nodes, which we
      reduce somewhat by removing duplicated and overlapping source
      ticks. This will still cause significant Tick "clumps" - a possible
      future optimization could be to make Tick carry a list of Tickishs
      instead of one at a time.
      
      (From Phabricator D169)
  12. 03 Dec, 2014 1 commit
  13. 26 Sep, 2014 1 commit
  14. 28 Aug, 2014 2 commits
    • Make worker/wrapper work on INLINEABLE things · 9cf5906b
      Simon Peyton Jones authored
      This fixes a long-standing bug: Trac #6056.  The trouble was that
      INLINEABLE "used up" the unfolding for the Id, so it couldn't be
      worker/wrapper'd by the strictness analyser.
      
      This patch allows the w/w to go ahead, and makes the *worker* INLINEABLE
      instead, so it can later be specialised.
      
      However, that doesn't completely solve the problem, because the dictionary
      argument (which the specialiser treats specially) may be strict and
      hence unpacked by w/w, so now the worker won't be specialised after all.
      
      Solution: never unpack dictionary arguments, which is done by the isClassTyCon
                test in WwLib.deepSplitProductType_maybe
    • Refactor unfoldings · 6e0f6ede
      Simon Peyton Jones authored
      There are two main refactorings here
      
      1.  Move the uf_arity field
             out of CoreUnfolding
             into UnfWhen
          It's a lot tidier there.  If I've got this right, no behaviour
          should change.
      
      2.  Define specUnfolding and use it in DsBinds and Specialise
           a) commons-up some shared code
           b) makes sure that Specialise correctly specialises DFun
              unfoldings (which it didn't before)
      
      The two got put together because both ended up interacting in the
      specialiser.
      
      They cause zero difference to nofib.
  15. 15 May, 2014 1 commit
    • Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized, while following the convention, to
      
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      
      - Moreover, if the list of language extensions fits into a single
        `{-# LANGUAGE ... #-}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... #-}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
      
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurrences by `{-# OPTIONS_GHC ... #-}` pragmas.
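
      As an illustration of the convention, a module header might look like the
      sketch below. The extensions, the warning flag and the function are
      arbitrary choices for the example, not taken from any file touched by
      this commit.

```haskell
-- One LANGUAGE pragma per extension, alphabetically ordered,
-- placed before any OPTIONS_GHC line.
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# OPTIONS_GHC -Wall #-}

-- A small function that actually uses both extensions.
sumStrict :: forall a. Num a => [a] -> a
sumStrict = go 0
  where
    go :: a -> [a] -> a
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs
```

      Adding or removing an extension then touches exactly one line, which is
      what makes the one-per-line layout diff-friendly.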
  16. 18 Mar, 2014 1 commit
    • Make sure we occurrence-analyse unfoldings (fixes Trac #8892) · 87bbc69c
      Simon Peyton Jones authored
      For DFunUnfoldings we were failing to occurrence-analyse the unfolding,
      and that meant that a loop breaker wasn't marked as such, which in turn
      meant it was inlined away when it still had occurrence sites.  See
      Note [Occurrrence analysis of unfoldings] in CoreUnfold.
      
      This is a pretty long-standing bug, happily nailed by John Lato.
  17. 12 Nov, 2013 1 commit
    • Improve eta expansion (again) · 802f4b89
      Simon Peyton Jones authored
      The presenting issue was that we were never eta-expanding
      
          f (\x -> case x of (a,b) -> \s -> blah)
      
      and that meant we were allocating two lambdas instead of one.
      See Note [Eta expanding lambdas] in SimplUtils.
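
      At source level the transformation can be pictured as follows. This is a
      hand-written sketch with made-up names; GHC performs the rewrite on Core,
      not on source like this.

```haskell
-- Before: the inner lambda sits under a case, so applying the outer
-- function to its first argument yields a second function value.
nestedLam :: (Int, Int) -> Int -> Int
nestedLam = \x -> case x of (a, b) -> \s -> a + b + s

-- After eta expansion: one lambda binds both arguments.
expandedLam :: (Int, Int) -> Int -> Int
expandedLam = \x s -> case x of (a, b) -> a + b + s
```

      The two agree whenever both arguments are supplied and defined; they can
      differ only when the partial application itself is forced on an undefined
      pair.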
      
      However I didn't want to eta expand the lambda, and then try all over
      again for tryEtaExpandRhs.  Yet the latter is important in the context
      of a let-binding, where it can do simple arity analysis.  So I ended up
      refactoring CallCtxt so that it tells when we are on the RHS of a let.
      
      I also moved findRhsArity from SimplUtils to CoreArity.
      
      Performance increases nicely. Here are the ones where allocation improved
      by more than 0.5%. Notice the nice decrease in binary size too.
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
                 ansi          -2.3%     -0.9%      0.00      0.00     +0.0%
                 bspt          -2.1%     -9.7%      0.01      0.01    -33.3%
                fasta          -1.8%    -11.7%     -3.4%     -3.6%     +0.0%
                  fft          -1.9%     -1.3%      0.06      0.06    +11.1%
      reverse-complem          -1.9%    -18.1%     -1.9%     -2.8%     +0.0%
               sphere          -1.8%     -4.5%      0.09      0.09     +0.0%
            transform          -1.8%     -2.3%     -4.6%     -3.1%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -3.0%    -18.1%    -13.9%    -14.6%    -35.7%
                  Max          -1.3%     +0.0%     +7.7%     +7.7%    +50.0%
       Geometric Mean          -1.9%     -0.6%     -2.1%     -2.1%     -0.2%
  18. 01 Oct, 2013 1 commit
  19. 02 Sep, 2013 1 commit
    • Remove the final vestiges of InlineWrappers · e4a1d2d0
      Simon Peyton Jones authored
      Part of Nick Frisby's patch (c080f727)
      for late demand-analysis removed the over-zealous short-cut whereby
      strictness wrappers were not spelled out in detail in interface files.
      
      This patch completes the process by
       * removing InlineWrapper from UnfoldingSource
       * removing IfWrapper from IfaceUnfolding
      
      There was a tiny bit of special ad-hocery for wrappers, in OccurAnal,
      but fortunately that too turns out to be rendered irrelevant by
      the more uniform treatment, and after that there was no need
      to remember which functions are wrappers.
  20. 29 Aug, 2013 1 commit
  21. 06 Jun, 2013 2 commits
  22. 30 May, 2013 1 commit
    • Make 'SPECIALISE instance' work again · 1ed04090
      Simon Peyton Jones authored
      This is a long-standing regression (Trac #7797), which meant that in
      particular the Eq [Char] instance does not get specialised.
      (The *methods* do, but the dictionary itself doesn't.)  So when you
      call a function
           f :: Eq a => blah
      on a string type (ie a=[Char]), 7.6 passes a dictionary of un-specialised
      methods.
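
      The difference between a generic and a specialised dictionary can be
      sketched with explicit dictionary passing. Everything below is a toy:
      `EqDict` and the functions around it are hypothetical stand-ins for what
      the compiler generates, written out by hand.

```haskell
-- Explicit dictionary for an Eq-like class.
newtype EqDict a = EqDict { eq :: a -> a -> Bool }

eqChar :: EqDict Char
eqChar = EqDict (==)

-- Dictionary function: builds the list dictionary from the element
-- dictionary, calling through `eq d` for every element comparison.
eqList :: EqDict a -> EqDict [a]
eqList d = EqDict go
  where
    go (x : xs) (y : ys) = eq d x y && go xs ys
    go []       []       = True
    go _        _        = False

-- Hand-specialised dictionary for String: no element dictionary involved.
eqString :: EqDict String
eqString = EqDict (==)

-- An overloaded function (f :: Eq a => ...) with its dictionary made explicit.
member :: EqDict a -> a -> [a] -> Bool
member d x = any (eq d x)
```

      Calling `member (eqList eqChar)` corresponds to passing the unspecialised
      dictionary described above, while `member eqString` corresponds to
      passing a specialised one.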
      
      This only matters when calling an overloaded function from a
      specialised context, but that does matter in some programs.  I
      remember (though I cannot find the details) that Nick Frisby discovered
      this to be the source of some pretty solid performance regressions.
      
      Anyway it works now. The key change is that a DFunUnfolding now takes
      a form that is both simpler than before (the DFunArg type is eliminated)
      and more general:
      
      data Unfolding
        = ...
        | DFunUnfolding {     -- The Unfolding of a DFunId
                              -- See Note [DFun unfoldings]
                              --     df = /\a1..am. \d1..dn. MkD t1 .. tk
                              --                                 (op1 a1..am d1..dn)
                              --                                 (op2 a1..am d1..dn)
            df_bndrs :: [Var],      -- The bound variables [a1..am],[d1..dn]
            df_con   :: DataCon,    -- The dictionary data constructor (never a newtype datacon)
            df_args  :: [CoreExpr]  -- Args of the data con: types, superclasses and methods,
          }                         -- in positional order
      
      That in turn allowed me to re-enable the DFunUnfolding specialisation in
      DsBinds.  Lots of details here in TcInstDcls:
      	  Note [SPECIALISE instance pragmas]
      
      I also did some refactoring, in particular to pass the InScopeSet to
      exprIsConApp_maybe (which in turn means it has to go to a RuleFun).
      
      NB: Interface file format has changed!
  23. 11 Apr, 2013 1 commit
    • ignore RealWorld in size_expr; flag to keep w/w from creating sharing · af12cf66
      nfrisby authored
      size_expr now ignores RealWorld lambdas, arguments, and applications.
      
      Worker-wrapper previously removed all lambdas from a function, if they
      were all unused. Removing *all* value lambdas is no longer
      allowed. Instead (\_ -> E) will become (\_void -> E), where it used to
      become E. The previous behavior can be recovered via the new
      -ffun-to-thunk flag.
      
      Nofib notables:
      
      ----------------------------------------------------------------
              Program               O2          O2 newly ignoring RealWorld
                                                and not turning function
                                                closures into thunks
      ----------------------------------------------------------------
      
       Allocations
      
        comp_lab_zift            333090392%           -5.0%
      reverse-complem            155188304%           -3.2%
      
              rewrite             15380888%           +4.0%
               boyer2              3901064%           +7.5%
      
      rewrite previously benefited from fortunate LoopBreaker choice that is
      now disrupted.
      
      A function in boyer2 goes from $wonewayunify1 size 700 to size 650,
      thus gets inlined into rewritelemmas, thus exposing a parameter
      scrutinisation, thus allowing SpecConstr, which unfortunately involves
      reboxing.
      
      Run Time
      
       fannkuch-redux                 7.89%          -15.9%
      
                  hpg                 0.25%           +5.6%
                 wang                 0.21%           +5.8%
      
      /shrug
  24. 06 Apr, 2013 1 commit
  25. 01 Jan, 2013 1 commit
    • Refactor the invariants for ClsInsts · 5efe9b11
      Simon Peyton Jones authored
      We now have the invariant for a ClsInst that the is_tvs field
      is always completely fresh type variables. See
      Note [Template tyvars are fresh] in InstEnv.
      
      (Previously we freshened them when extending the instance environment,
      but that seems messier because it was an invariant only when the
      ClsInst was in an InstEnv.  Moreover, there was an invariant that
      the tyvars of the DFunId in the ClsInst had to match, and I have
      removed that invariant altogether; there is no need for it.)
      
      Other changes I made at the same time:
      
       * Make is_tvs into a *list*, in the right order for the dfun type
         arguments.  This removes the weird need for the dfun to have the
         same tyvars as the ClsInst template, an invariant I have always
         hated. The cost is that we need to make it a VarSet when matching.
         We could cache an is_tv_set instead.
      
       * Add a cached is_cls field to the ClsInst, to save fishing
         the Class out of the DFun.  (Renamed is_cls to is_cls_nm.)
      
       * Make tcSplitDFunTy return the dfun args, not just the *number*
         of dfun args
      
       * Make InstEnv.instanceHead return just the *head* of the
         instance declaration.  Add instanceSig to return the whole
         thing.
  26. 14 Dec, 2012 1 commit
  27. 18 Oct, 2012 1 commit
    • Refactor the way dump flags are handled · d4a19643
      ian@well-typed.com authored
      We were being inconsistent about how we tested whether dump flags
      were enabled; in particular, sometimes we also checked the verbosity,
      and sometimes we didn't.
      
      This led to oddities such as "ghc -v4" printing an "Asm code" section
      which didn't contain any code, and "-v4" enabled some parts of
      "-ddump-deriv" but not others.
      
      Now all the tests use dopt, which also takes the verbosity into account
      as appropriate.
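
      The idea of a single test that folds in the verbosity can be sketched as
      below. The types, the names, the threshold of 4 and the choice of which
      dumps a high verbosity switches on are all assumptions made for the
      example; this is not the definition of `dopt`.

```haskell
data DumpFlag = DumpAsm | DumpDeriv | DumpSimpl
  deriving (Eq, Show)

data Flags = Flags
  { verbosity :: Int          -- as set by -v<n>
  , dumpFlags :: [DumpFlag]   -- as set by -ddump-* options
  }

-- Hypothetical policy: which dumps a high verbosity level switches on.
enabledByVerbosity :: DumpFlag -> Bool
enabledByVerbosity DumpDeriv = False
enabledByVerbosity _         = True

-- The one test every call site uses, so the verbosity rule is applied
-- consistently instead of being re-implemented (or forgotten) per site.
dumpEnabled :: DumpFlag -> Flags -> Bool
dumpEnabled f flags =
  f `elem` dumpFlags flags
    || (verbosity flags >= 4 && enabledByVerbosity f)
```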
  28. 16 Oct, 2012 1 commit
    • Some alpha renaming · cd33eefd
      ian@well-typed.com authored
      Mostly d -> g (matching DynFlag -> GeneralFlag).
      Also renamed if* to when*, matching the Haskell if/when names
  29. 09 Oct, 2012 1 commit
    • Make the opt_UF_* static flags dynamic · 0a768bcb
      ian@well-typed.com authored
      I also removed the default values from the "Discounts and thresholds"
      note: most of them were no longer up-to-date.
      
      Along the way I added FloatSuffix to the argument parser, analogous to
      IntSuffix.
  30. 20 Jul, 2012 2 commits
  31. 14 Jul, 2012 1 commit
    • Implement FastBytes, and use it for MachStr · 7ae1bec5
      Ian Lynagh authored
      This is a first step on the way to refactoring the FastString type.
      
      FastBytes currently has no unique, mainly because there isn't currently
      a nice way to produce them in Binary.
      
      Also, we don't currently do the "Dictionary" thing with FastBytes in
      Binary. I'm not sure whether this is important.
      
      We can change both decisions later, but in the meantime this gets the
      refactoring underway.
  32. 27 Jun, 2012 1 commit
    • Add silent superclass parameters (again) · aa1e0976
      Simon Peyton Jones authored
      Silent superclass parameters solve the problem that
      the superclasses of a dictionary construction can easily
      turn out to be (wrongly) bottom.  The problem and solution
      are described in
         Note [Silent superclass arguments] in TcInstDcls
      
      I first implemented this fix (with Dimitrios) in Dec 2010, but removed
      it again in Jun 2011 because we thought it wasn't necessary any
      more. (The reason we thought it wasn't necessary is that we'd stopped
      generating derived superclass constraints for *wanteds*.  But we were
      wrong; that didn't solve the superclass-loop problem.)
      
      So we have to re-implement it.  It's not hard.  Main features:
      
        * The IdDetails for a DFunId says how many silent arguments it has
      
        * A DFunUnfolding describes which dictionary args are
          just parameters (DFunLamArg) and which are a function to apply
          to the parameters (DFunPolyArg).  This adds the DFunArg type
          to CoreSyn
      
        * Consequential changes to IfaceSyn.  (Binary hi file format changes
          slightly.)
      
        * TcInstDcls changes to generate the right dfuns
      
        * CoreSubst.exprIsConApp_maybe handles the new DFunUnfolding
      
      The thing that is *not* done yet is to alter the vectoriser to
      pass the relevant extra argument when building a PA dictionary.
  33. 11 Jun, 2012 1 commit
  34. 28 May, 2012 1 commit
    • Be less aggressive about the result discount · 4fa3f16d
      Simon Peyton Jones authored
      This patch fixes Trac #6099 by reducing the result discount in CoreUnfold.conSize.
      See Note [Constructor size and result discount] in CoreUnfold.
      
      The existing version is definitely too aggressive. Simon M found it an
      "unambiguous win" but it is definitely what led to the bloat. In a function
      with a lot of case branches, all returning a constructor, the discount could
      grow arbitrarily large.
      
      I also had to increase the -funfolding-creation-threshold from 450 to 750,
      otherwise some functions that should inline simply never get an unfolding.
      (The massive result discount was allowing the unfolding to appear before.)
      
      The nofib results are these, picking a handful of outliers to show.
      
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
               fulsom          -0.5%     -1.6%     -2.8%     -2.6%    +31.1%
             maillist          -0.2%     -0.0%      0.09      0.09     -3.7%
               mandel          -0.4%     +6.6%      0.12      0.12     +0.0%
             nucleic2          -0.2%    +18.5%      0.11      0.11     +0.0%
              parstof          -0.4%     +4.0%      0.00      0.00     +0.0%
      --------------------------------------------------------------------------------
                  Min          -0.9%     -1.6%    -19.7%    -19.7%     -3.7%
                  Max          +0.3%    +18.5%     +2.7%     +2.7%    +31.1%
       Geometric Mean          -0.3%     +0.4%     -3.0%     -3.0%     +0.2%
      
      Turns out that nucleic2 has a function
        Main.$wabsolute_pos =
          \ (ww_s4oj :: Types.Tfo) (ww1_s4oo :: Types.FloatT)
            (ww2_s4op :: Types.FloatT) (ww3_s4oq :: Types.FloatT) ->
            case ww_s4oj
            of _
            { Types.Tfo a_a1sS b_a1sT c_a1sU d_a1sV e_a1sW f_a1sX g_a1sY h_a1sZ i_a1t0 tx_a1t1 ty_a1t2 tz_a1t3 ->
            (# case ww1_s4oo of _ { GHC.Types.F# x_a2sO ->
               case a_a1sS of _ { GHC.Types.F# y_a2sS ->
               case ww2_s4op of _ { GHC.Types.F# x1_X2y9 ->
               case d_a1sV of _ { GHC.Types.F# y1_X2yh ->
               case ww3_s4oq of _ { GHC.Types.F# x2_X2yj ->
               case g_a1sY of _ { GHC.Types.F# y2_X2yr ->
               case tx_a1t1 of _ { GHC.Types.F# y3_X2yn ->
               GHC.Types.F#
                 (GHC.Prim.plusFloat#
                    (GHC.Prim.plusFloat#
                       (GHC.Prim.plusFloat#
                          (GHC.Prim.timesFloat# x_a2sO y_a2sS)
                          (GHC.Prim.timesFloat# x1_X2y9 y1_X2yh))
                       (GHC.Prim.timesFloat# x2_X2yj y2_X2yr))
                    y3_X2yn)
               } } }}}}},
      
              <similar>,
              <similar> )
      
      This is pretty big, but inlining it does get rid of that F# allocation.
      But we'll also get rid of it with deep CPR: Trac #2289. For now we just
      accept the change.
  35. 09 May, 2012 2 commits
    • Re-do the "function application discount" (fixes Trac #6048) · 980372f3
      Simon Peyton Jones authored
      * Undoes Max's very aggressive function-inlining change
        (see comments with Trac #6048)
      
      * Restricts function application discount to functions
        that occur just once in the body. It was the multiple
        occurrences that led to the exponential behaviour in
        Trac #6048.
      
      See Note [Function application discount] in CoreUnfold.
      
      Module binary sizes are down 2% on average, which is good.
      Allocations wobble about a bit, but only on a few benchmarks
      and not by much, so it seems a price worth paying to avoid
      exponential behaviour!
      
                               Allocs
                  Min           -1.2%
                  Max           +2.8%
       Geometric Mean           +0.0%
    • Be a little less aggressive about inlining (fixes Trac #5623) · 2112f43c
      Simon Peyton Jones authored
      When inlining, we are making a copy of the expression, so we have to
      be careful about duplicating work.  Previously we were using
      exprIsCheap for that, but it is willing to duplicate a cheap primop --
      and that is terribly bad if it happens inside some inner array loop
      (Trac #5623).  So now we use a new function exprIsWorkFree.  Even
      then there is some wiggle room:
         see Note [exprIsWorkFree] in CoreUtils
      
      This commit does make wheel-sieve1 allocate a lot more, but we decided
      that's just tough; it's more important for inlining to be robust
      about not duplicating work.
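
      The distinction between "cheap" and "work-free" can be sketched on a toy
      expression type. The definitions below are assumptions for illustration
      only and are much cruder than the real exprIsCheap and exprIsWorkFree.

```haskell
-- Toy expression type; NOT GHC's CoreExpr.
data Expr
  = Var String
  | Lit Int
  | Lam String Expr
  | PrimOp String [Expr]   -- e.g. an unboxed addition
  | Call String [Expr]     -- saturated call to an unknown function

-- Cheap (toy definition): duplicating it costs little,
-- so a primop over cheap arguments qualifies.
isCheap :: Expr -> Bool
isCheap (Var _)       = True
isCheap (Lit _)       = True
isCheap (Lam _ _)     = True
isCheap (PrimOp _ as) = all isCheap as
isCheap (Call _ _)    = False

-- Work-free (toy definition): duplicating it repeats no computation at all,
-- so even a cheap primop is rejected.
isWorkFree :: Expr -> Bool
isWorkFree (Var _)   = True
isWorkFree (Lit _)   = True
isWorkFree (Lam _ _) = True
isWorkFree _         = False
```

      A primop such as `PrimOp "+#" [Var "x", Lit 1]` passes `isCheap` but
      fails `isWorkFree`, which is the case that matters inside an inner loop.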
  36. 02 May, 2012 1 commit
    • Allow cases with empty alternatives · ac230c5e
      Simon Peyton Jones authored
      This patch allows, for the first time, case expressions with an empty
      list of alternatives. Max suggested the idea, and Trac #6067 showed
      that it is really quite important.
      
      So I've implemented the idea, fixing #6067. Main changes
      
       * See Note [Empty case alternatives] in CoreSyn
      
       * Various foldr1's become foldrs
      
       * IfaceCase does not record the type of the alternatives.
         I added IfaceECase for empty-alternative cases.
      
       * Core Lint does not complain about empty cases
      
       * MkCore.castBottomExpr constructs an empty-alternative case
         expression   (case e of ty {})
      
       * CoreToStg converts '(case e of {})' to just 'e'
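
      A source-level analogue of an empty-alternative case is sketched below.
      It relies on the `EmptyCase` extension and `Data.Void`, which are
      separate from this Core-level patch, and the function names are made up.
      The small fold shows why a `foldr1` over alternatives has to become a
      `foldr` once the list may be empty.

```haskell
{-# LANGUAGE EmptyCase #-}

import Data.Void (Void)

-- A case with no alternatives: Void has no constructors to match.
impossible :: Void -> a
impossible v = case v of {}

-- The Left branch can never be taken, so `impossible` discharges it.
fromRightOnly :: Either Void a -> a
fromRightOnly = either impossible id

-- foldr1 would fail on an empty list of alternatives;
-- foldr with an explicit base case handles it.
joinAlts :: [String] -> String
joinAlts = foldr (\alt rest -> alt ++ ";" ++ rest) ""
```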