1. 17 Mar, 2017 1 commit
    • Simon Peyton Jones's avatar
      No join-point from an INLINE function with wrong arity · a7dbafe9
      Simon Peyton Jones authored
      The main payload of this patch is NOT to make a join-point
      from a function with an INLINE pragma and the wrong arity;
      see Note [Join points and INLINE pragmas] in CoreOpt.
      This is what caused Trac #13413.
      
      But we must do the exact same thing in simpleOptExpr,
      which drove me to the following refactoring:
      
      * Move simpleOptExpr and simpleOptPgm from CoreSubst to a new
        module CoreOpt along with a few others (exprIsConApp_maybe,
        pushCoArg, etc)
      
        This eliminates a module loop altogether (delete
        CoreArity.hs-boot), and stops CoreSubst getting too huge.
      
      * Rename Simplify.matchOrConvertToJoinPoint
           to joinPointBinding_maybe
        Move it to the new CoreOpt
        Use it in simpleOptExpr as well as in Simplify
      
      * Define CoreArity.joinRhsArity and use it
      a7dbafe9
  2. 01 Feb, 2017 1 commit
  3. 20 Jan, 2017 1 commit
    • Simon Peyton Jones's avatar
      Fix a nasty bug in exprIsExpandable · 9be18ea4
      Simon Peyton Jones authored
      This bug has been lurking for ages: Trac #13155
      
      The important semantic change is to ensure that exprIsExpandable
      returns False for primop calls.  Previously exprIsExpandable used
      exprIsCheap' which always used primOpIsCheap.
      
      I took the opportunity to combine the code for exprIsCheap' (two
      variants: exprIsCheap and exprIsExpandable) with that for
      exprIsWorkFree.  Result is simpler, tighter, easier to understand.
      And correct (at least wrt this bug)!
      9be18ea4
  4. 19 Jan, 2017 1 commit
    • Richard Eisenberg's avatar
      Update levity polymorphism · e7985ed2
      Richard Eisenberg authored
      This commit implements the proposal in
      https://github.com/ghc-proposals/ghc-proposals/pull/29 and
      https://github.com/ghc-proposals/ghc-proposals/pull/35.
      
      Here are some of the pieces of that proposal:
      
      * Some of RuntimeRep's constructors have been shortened.
      
      * TupleRep and SumRep are now parameterized over a list of RuntimeReps.
      * This
      means that two types with the same kind surely have the same
      representation.
      Previously, all unboxed tuples had the same kind, and thus the fact
      above was
      false.
      
      * RepType.typePrimRep and friends now return a *list* of PrimReps. These
      functions can now work successfully on unboxed tuples. This change is
      necessary because we allow abstraction over unboxed tuple types and so
      cannot
      always handle unboxed tuples specially as we did before.
      
      * We sometimes have to create an Id from a PrimRep. I thus split PtrRep
      * into
      LiftedRep and UnliftedRep, so that the created Ids have the right
      strictness.
      
      * The RepType.RepType type was removed, as it didn't seem to help with
      * much.
      
      * The RepType.repType function is also removed, in favor of typePrimRep.
      
      * I have waffled a good deal on whether or not to keep VoidRep in
      TyCon.PrimRep. In the end, I decided to keep it there. PrimRep is *not*
      represented in RuntimeRep, and typePrimRep will never return a list
      including
      VoidRep. But it's handy to have in, e.g., ByteCodeGen and friends. I can
      imagine another design choice where we have a PrimRepV type that is
      PrimRep
      with an extra constructor. That seemed to be a heavier design, though,
      and I'm
      not sure what the benefit would be.
      
      * The last, unused vestiges of # (unliftedTypeKind) have been removed.
      
      * There were several pretty-printing bugs that this change exposed;
      * these are fixed.
      
      * We previously checked for levity polymorphism in the types of binders.
      * But we
      also must exclude levity polymorphism in function arguments. This is
      hard to check
      for, requiring a good deal of care in the desugarer. See Note [Levity
      polymorphism
      checking] in DsMonad.
      
      * In order to efficiently check for levity polymorphism in functions, it
      * was necessary
      to add a new bit of IdInfo. See Note [Levity info] in IdInfo.
      
      * It is now safe for unlifted types to be unsaturated in Core. Core Lint
      * is updated
      accordingly.
      
      * We can only know strictness after zonking, so several checks around
      * strictness
      in the type-checker (checkStrictBinds, the check for unlifted variables
      under a ~
      pattern) have been moved to the desugarer.
      
      * Along the way, I improved the treatment of unlifted vs. banged
      * bindings. See
      Note [Strict binds checks] in DsBinds and #13075.
      
      * Now that we print type-checked source, we must be careful to print
      * ConLikes correctly.
      This is facilitated by a new HsConLikeOut constructor to HsExpr.
      Particularly troublesome
      are unlifted pattern synonyms that get an extra void# argument.
      
      * Includes a submodule update for haddock, getting rid of #.
      
      * New testcases:
        typecheck/should_fail/StrictBinds
        typecheck/should_fail/T12973
        typecheck/should_run/StrictPats
        typecheck/should_run/T12809
        typecheck/should_fail/T13105
        patsyn/should_fail/UnliftedPSBind
        typecheck/should_fail/LevPolyBounded
        typecheck/should_compile/T12987
        typecheck/should_compile/T11736
      
      * Fixed tickets:
        #12809
        #12973
        #11736
        #13075
        #12987
      
      * This also adds a test case for #13105. This test case is
      * "compile_fail" and
      succeeds, because I want the testsuite to monitor the error message.
      When #13105 is fixed, the test case will compile cleanly.
      e7985ed2
  5. 23 Dec, 2016 1 commit
    • Simon Peyton Jones's avatar
      Fix a bug in ABot handling in CoreArity · f06b71ae
      Simon Peyton Jones authored
      See Note [ABot branches: use max] in CoreArity.
      
      I stumbled on this when investigating something else, and
      opened Trac #13031 to track it.
      
      It's very hard to tickle the bug, which is why it has lurked so long,
      but the test
         stranal/should_compile/T13031
      does so
      
      Oddly, the testsuite framework doesn't actually run the test; I have
      no idea why.
      f06b71ae
  6. 21 Jul, 2016 1 commit
    • Ömer Sinan Ağacan's avatar
      Implement unboxed sum primitive type · 714bebff
      Ömer Sinan Ağacan authored
      Summary:
      This patch implements primitive unboxed sum types, as described in
      https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes.
      
      Main changes are:
      
      - Add new syntax for unboxed sums types, terms and patterns. Hidden
        behind `-XUnboxedSums`.
      
      - Add unlifted unboxed sum type constructors and data constructors,
        extend type and pattern checkers and desugarer.
      
      - Add new RuntimeRep for unboxed sums.
      
      - Extend unarise pass to translate unboxed sums to unboxed tuples right
        before code generation.
      
      - Add `StgRubbishArg` to `StgArg`, and a new type `CmmArg` for better
        code generation when sum values are involved.
      
      - Add user manual section for unboxed sums.
      
      Some other changes:
      
      - Generalize `UbxTupleRep` to `MultiRep` and `UbxTupAlt` to
        `MultiValAlt` to be able to use those with both sums and tuples.
      
      - Don't use `tyConPrimRep` in `isVoidTy`: `tyConPrimRep` is really
        wrong, given an `Any` `TyCon`, there's no way to tell what its kind
        is, but `kindPrimRep` and in turn `tyConPrimRep` returns `PtrRep`.
      
      - Fix some bugs on the way: #12375.
      
      Not included in this patch:
      
      - Update Haddock for new the new unboxed sum syntax.
      
      - `TemplateHaskell` support is left as future work.
      
      For reviewers:
      
      - Front-end code is mostly trivial and adapted from unboxed tuple code
        for type checking, pattern checking, renaming, desugaring etc.
      
      - Main translation routines are in `RepType` and `UnariseStg`.
        Documentation in `UnariseStg` should be enough for understanding
        what's going on.
      
      Credits:
      
      - Johan Tibell wrote the initial front-end and interface file
        extensions.
      
      - Simon Peyton Jones reviewed this patch many times, wrote some code,
        and helped with debugging.
      
      Reviewers: bgamari, alanz, goldfire, RyanGlScott, simonpj, austin,
                 simonmar, hvr, erikd
      
      Reviewed By: simonpj
      
      Subscribers: Iceland_jack, ggreif, ezyang, RyanGlScott, goldfire,
                   thomie, mpickering
      
      Differential Revision: https://phabricator.haskell.org/D2259
      714bebff
  7. 15 Jun, 2016 1 commit
    • Simon Peyton Jones's avatar
      Re-add FunTy (big patch) · 77bb0927
      Simon Peyton Jones authored
      With TypeInType Richard combined ForAllTy and FunTy, but that was often
      awkward, and yielded little benefit becuase in practice the two were
      always treated separately.  This patch re-introduces FunTy.  Specfically
      
      * New type
          data TyVarBinder = TvBndr TyVar VisibilityFlag
        This /always/ has a TyVar it.  In many places that's just what
        what we want, so there are /lots/ of TyBinder -> TyVarBinder changes
      
      * TyBinder still exists:
          data TyBinder = Named TyVarBinder | Anon Type
      
      * data Type = ForAllTy TyVarBinder Type
                  | FunTy Type Type
                  |  ....
      
      There are a LOT of knock-on changes, but they are all routine.
      
      The Haddock submodule needs to be updated too
      77bb0927
  8. 09 Jun, 2016 1 commit
  9. 27 Apr, 2016 1 commit
    • Joachim Breitner's avatar
      Implement the state hack without modifiyng OneShotInfo · a48ebcc4
      Joachim Breitner authored
      Previously, the state hack would be implemented in mkLocalId, by looking
      at the type, and setting the OneShot flag accordingly.
      
      This patch changes this so that the OneShot flag faithfully represents
      what our various analyses found out, and the State Hack is implemented
      by adjusting the accessors, in particular isOneShotBndr and
      idStateHackOneShotInfo. This makes it easier to understand what's going
      on in the analyses, and de-clutters core dumps and interface files.
      
      I don’t expect any change in behaviour, at least not in non-fringe
      cases.
      a48ebcc4
  10. 29 Mar, 2016 1 commit
    • Joachim Breitner's avatar
      Rename isNopSig to isTopSig · e6e17a09
      Joachim Breitner authored
      to be consistent with the other uses of nop vs. top in Demand.hs. Also,
      stop prettyprinting top strictness signatures in Core dumps.
      e6e17a09
  11. 18 Jan, 2016 1 commit
    • Jan Stolarek's avatar
      Replace calls to `ptext . sLit` with `text` · b8abd852
      Jan Stolarek authored
      Summary:
      In the past the canonical way for constructing an SDoc string literal was the
      composition `ptext . sLit`.  But for some time now we have function `text` that
      does the same.  Plus it has some rules that optimize its runtime behaviour.
      This patch takes all uses of `ptext . sLit` in the compiler and replaces them
      with calls to `text`.  The main benefits of this patch are clener (shorter) code
      and less dependencies between module, because many modules now do not need to
      import `FastString`.  I don't expect any performance benefits - we mostly use
      SDocs to report errors and it seems there is little to be gained here.
      
      Test Plan: ./validate
      
      Reviewers: bgamari, austin, goldfire, hvr, alanz
      
      Subscribers: goldfire, thomie, mpickering
      
      Differential Revision: https://phabricator.haskell.org/D1784
      b8abd852
  12. 07 Jan, 2016 1 commit
    • Simon Peyton Jones's avatar
      Make demand analysis understand catch · 9915b656
      Simon Peyton Jones authored
      As Trac #11222, and #10712 note, the strictness analyser
      needs to be rather careful about exceptions.  Previously
      it treated them as identical to divergence, but that
      won't quite do.
      
      See Note [Exceptions and strictness] in Demand, which
      explains the deal.
      
      Getting more strictness in 'catch' and friends is a
      very good thing.  Here is the nofib summary, keeping
      only the big ones.
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
                fasta          -0.1%     -6.9%     -3.0%     -3.0%     +0.0%
                  hpg          -0.1%     -2.0%     -6.2%     -6.2%     +0.0%
             maillist          -0.1%     -0.3%      0.08      0.09     +1.2%
      reverse-complem          -0.1%    -10.9%     -6.0%     -5.9%     +0.0%
               sphere          -0.1%     -4.3%      0.08      0.08     +0.0%
                 x2n1          -0.1%     -0.0%      0.00      0.00     +0.0%
      --------------------------------------------------------------------------------
                  Min          -0.2%    -10.9%    -17.4%    -17.3%     +0.0%
                  Max          -0.0%     +0.0%     +4.3%     +4.4%     +1.2%
       Geometric Mean          -0.1%     -0.3%     -2.9%     -3.0%     +0.0%
      
      On the way I did quite a bit of refactoring in Demand.hs
      9915b656
  13. 17 Dec, 2015 1 commit
  14. 11 Dec, 2015 1 commit
    • eir@cis.upenn.edu's avatar
      Add kind equalities to GHC. · 67465497
      eir@cis.upenn.edu authored
      This implements the ideas originally put forward in
      "System FC with Explicit Kind Equality" (ICFP'13).
      
      There are several noteworthy changes with this patch:
       * We now have casts in types. These change the kind
         of a type. See new constructor `CastTy`.
      
       * All types and all constructors can be promoted.
         This includes GADT constructors. GADT pattern matches
         take place in type family equations. In Core,
         types can now be applied to coercions via the
         `CoercionTy` constructor.
      
       * Coercions can now be heterogeneous, relating types
         of different kinds. A coercion proving `t1 :: k1 ~ t2 :: k2`
         proves both that `t1` and `t2` are the same and also that
         `k1` and `k2` are the same.
      
       * The `Coercion` type has been significantly enhanced.
         The documentation in `docs/core-spec/core-spec.pdf` reflects
         the new reality.
      
       * The type of `*` is now `*`. No more `BOX`.
      
       * Users can write explicit kind variables in their code,
         anywhere they can write type variables. For backward compatibility,
         automatic inference of kind-variable binding is still permitted.
      
       * The new extension `TypeInType` turns on the new user-facing
         features.
      
       * Type families and synonyms are now promoted to kinds. This causes
         trouble with parsing `*`, leading to the somewhat awkward new
         `HsAppsTy` constructor for `HsType`. This is dispatched with in
         the renamer, where the kind `*` can be told apart from a
         type-level multiplication operator. Without `-XTypeInType` the
         old behavior persists. With `-XTypeInType`, you need to import
         `Data.Kind` to get `*`, also known as `Type`.
      
       * The kind-checking algorithms in TcHsType have been significantly
         rewritten to allow for enhanced kinds.
      
       * The new features are still quite experimental and may be in flux.
      
       * TODO: Several open tickets: #11195, #11196, #11197, #11198, #11203.
      
       * TODO: Update user manual.
      
      Tickets addressed: #9017, #9173, #7961, #10524, #8566, #11142.
      Updates Haddock submodule.
      67465497
  15. 25 Nov, 2015 1 commit
  16. 22 Apr, 2015 1 commit
  17. 02 Mar, 2015 1 commit
  18. 10 Feb, 2015 1 commit
  19. 16 Dec, 2014 1 commit
    • Peter Wortmann's avatar
      Source notes (Core support) · 993975d3
      Peter Wortmann authored
      This patch introduces "SourceNote" tickishs that link Core to the
      source code that generated it. The idea is to retain these source code
      links throughout code transformations so we can eventually relate
      object code all the way back to the original source (which we can,
      say, encode as DWARF information to allow debugging).  We generate
      these SourceNotes like other tickshs in the desugaring phase. The
      activating command line flag is "-g", consistent with the flag other
      compilers use to decide DWARF generation.
      
      Keeping ticks from getting into the way of Core transformations is
      tricky, but doable. The changes in this patch produce identical Core
      in all cases I tested -- which at this point is GHC, all libraries and
      nofib. Also note that this pass creates *lots* of tick nodes, which we
      reduce somewhat by removing duplicated and overlapping source
      ticks. This will still cause significant Tick "clumps" - a possible
      future optimization could be to make Tick carry a list of Tickishs
      instead of one at a time.
      
      (From Phabricator D169)
      993975d3
  20. 03 Dec, 2014 1 commit
  21. 26 Sep, 2014 1 commit
  22. 15 May, 2014 1 commit
    • Herbert Valerio Riedel's avatar
      Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized, while following the convention, to
      
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      
      - Moreover, if the list of language extensions fit into a single
        `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
      
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
      23892440
  23. 24 Apr, 2014 1 commit
    • Simon Peyton Jones's avatar
      Don't eta-expand PAPs (fixes Trac #9020) · 79e46aea
      Simon Peyton Jones authored
      See Note [Do not eta-expand PAPs] in SimplUtils.  This has a tremendously
      good effect on compile times for some simple benchmarks.
      
      The test is now where it belongs, in perf/compiler/T9020 (instead of simpl015).
      
      I did a nofib run and got essentially zero change except for cacheprof which
      gets 4% more allocation.  I investigated.  Turns out that we have
      
          instance PP Reg where
             pp ppm ST_0 = "%st"
             pp ppm ST_1 = "%st(1)"
             pp ppm ST_2 = "%st(2)"
             pp ppm ST_3 = "%st(3)"
             pp ppm ST_4 = "%st(4)"
             pp ppm ST_5 = "%st(5)"
             pp ppm ST_6 = "%st(6)"
             pp ppm ST_7 = "%st(7)"
             pp ppm r    = "%" ++ map toLower (show r)
      
      That (map toLower (show r) does a lot of map/toLowers.  But if we inline show
      we get something like
      
             pp ppm ST_0 = "%st"
             pp ppm ST_1 = "%st(1)"
             pp ppm ST_2 = "%st(2)"
             pp ppm ST_3 = "%st(3)"
             pp ppm ST_4 = "%st(4)"
             pp ppm ST_5 = "%st(5)"
             pp ppm ST_6 = "%st(6)"
             pp ppm ST_7 = "%st(7)"
             pp ppm EAX  = map toLower (show EAX)
             pp ppm EBX  = map toLower (show EBX)
             ...etc...
      
      and all those map/toLower calls can now be floated to top level.
      This gives a 4% decrease in allocation.  But it depends on inlining
      a pretty big 'show' function.
      
      With this new patch we get slightly better eta-expansion, which makes
      a function look slightly bigger, which just stops it being inlined.
      The previous behaviour was luck, so I'm not going to worry about
      losing it.
      
      I've added some notes to nofib/Simon-nofib-notes
      79e46aea
  24. 11 Mar, 2014 1 commit
  25. 05 Mar, 2014 1 commit
  26. 17 Feb, 2014 1 commit
  27. 10 Feb, 2014 1 commit
    • Joachim Breitner's avatar
      Implement CallArity analysis · cdceadf3
      Joachim Breitner authored
      This analysis finds out if a let-bound expression with lower manifest
      arity than type arity is always called with more arguments, as in that
      case eta-expansion is allowed and often viable. The analysis is very
      much tailored towards the code generated when foldl is implemented via
      foldr; without this analysis doing so would be a very bad idea!
      
      There are other ways to improve foldr/builder-fusion to cope with foldl,
      if any of these are implemented then this step can probably be moved to
      -O2 to save some compilation times. The current impact of adding this
      phase is just below +2% (measured running GHC's "make").
      cdceadf3
  28. 12 Dec, 2013 1 commit
    • Simon Peyton Jones's avatar
      Improve the handling of used-once stuff · 80989de9
      Simon Peyton Jones authored
      Joachim and I are committing this onto a branch so that we can share it,
      but we expect to do a bit more work before merging it onto head.
      
      Nofib staus:
        - Most programs, no change
        - A few improve
        - A couple get worse (cacheprof, tak, rfib)
      Investigating the "get worse" set is what's holding up putting this
      on head.
      
      The major issue is this.  Consider
      
          map (f g) ys
      
      where f's demand signature looks like
      
         f :: <L,C1(C1(U))> -> <L,U> -> .
      
      So 'f' is not saturated.  What demand do we place on g?
      Answer
              C(C1(U))
      That is, the inner C1 should stay, even though f is not saturated.
      
      I found that this made a significant difference in the demand signatures
      inferred in GHC.IO, which uses lots of higher-order exception handlers.
      
      I also had to add used-once demand signatures for some of the
      'catch' primops, so that we know their handlers are only called once.
      80989de9
  29. 09 Dec, 2013 2 commits
    • Joachim Breitner's avatar
      Replace mkTopDmdType by mkClosedStrictSig · 3cdf1251
      Joachim Breitner authored
      because it is not a top deman (see previous commit), and it is only used
      in an argument to mkStrictSig.
      3cdf1251
    • Joachim Breitner's avatar
      Rename topDmdType to nopDmdType · f64cf134
      Joachim Breitner authored
      because topDmdType is ''not'' the top of the lattice, as it puts an
      implicit absent demand on free variables, but Abs is the bottom of the
      Usage lattice.
      
      Why nopDmdType? Becuase it is the demand of doing nothing: Everything
      lazy, everything absent, no definite divergence.
      f64cf134
  30. 12 Nov, 2013 1 commit
    • Simon Peyton Jones's avatar
      Improve eta expansion (again) · 802f4b89
      Simon Peyton Jones authored
      The presenting issue was that we were never eta-expanding
      
          f (\x -> case x of (a,b) -> \s -> blah)
      
      and that meant we were allocating two lambdas instead of one.
      See Note [Eta expanding lambdas] in SimplUtils.
      
      However I didn't want to eta expand the lambda, and then try all over
      again for tryEtaExpandRhs.  Yet the latter is important in the context
      of a let-binding it can do simple arity analysis.  So I ended up
      refactoring CallCtxt so that it tells when we are on the RHS of a let.
      
      I also moved findRhsArity from SimplUtils to CoreArity.
      
      Performance increases nicely. Here are the ones where allocation improved
      by more than 0.5%. Notice the nice decrease in binary size too.
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
                 ansi          -2.3%     -0.9%      0.00      0.00     +0.0%
                 bspt          -2.1%     -9.7%      0.01      0.01    -33.3%
                fasta          -1.8%    -11.7%     -3.4%     -3.6%     +0.0%
                  fft          -1.9%     -1.3%      0.06      0.06    +11.1%
      reverse-complem          -1.9%    -18.1%     -1.9%     -2.8%     +0.0%
               sphere          -1.8%     -4.5%      0.09      0.09     +0.0%
            transform          -1.8%     -2.3%     -4.6%     -3.1%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -3.0%    -18.1%    -13.9%    -14.6%    -35.7%
                  Max          -1.3%     +0.0%     +7.7%     +7.7%    +50.0%
       Geometric Mean          -1.9%     -0.6%     -2.1%     -2.1%     -0.2%
      802f4b89
  31. 24 Oct, 2013 1 commit
    • Simon Peyton Jones's avatar
      Refactor the topNormaliseNewType story, fixing Trac #8467 · 2650da2b
      Simon Peyton Jones authored
      A bit of a mess had accumulated, with unclear invariants.
      
      * Remove splitNewTypeRepCo_maybe, in favour of topNormaliseNewType_maybe
        (which had the same signature but behaved subtly differently).
      
      * Make topNormaliseNewType_maybe guaranteed to return a non-newtype
        if it says (Just ty).  This is what was causing the loop in #8467
      
      * Apply similar tidying up to FamInstEnv.topNormaliseType
      2650da2b
  32. 08 Oct, 2013 1 commit
  33. 01 Oct, 2013 1 commit
  34. 06 Jun, 2013 1 commit
    • Simon Peyton Jones's avatar
      Add TyCon.checkRecTc, and use in in typeArity · a1a67b58
      Simon Peyton Jones authored
      This just formalises an abstraction we've been using anyway,
      namely to expand "recursive" TyCons until we see them twice.
      We weren't doing this in typeArity, and that inconsistency
      was leading to a subsequent ASSERT failure, when compiling
      Stream.hs, which has a newtype like this
      
         newtype Stream m a b = Stream (m (Either b (a, Stream m a b)))
      a1a67b58
  35. 30 Jan, 2013 1 commit
  36. 17 Jan, 2013 1 commit
    • Simon Peyton Jones's avatar
      Major patch to implement the new Demand Analyser · 0831a12e
      Simon Peyton Jones authored
      This patch is the result of Ilya Sergey's internship at MSR.  It
      constitutes a thorough overhaul and simplification of the demand
      analyser.  It makes a solid foundation on which we can now build.
      Main changes are
      
      * Instead of having one combined type for Demand, a Demand is
         now a pair (JointDmd) of
            - a StrDmd and
            - an AbsDmd.
         This allows strictness and absence to be though about quite
         orthogonally, and greatly reduces brain melt-down.
      
      * Similarly in the DmdResult type, it's a pair of
           - a PureResult (indicating only divergence/non-divergence)
           - a CPRResult (which deals only with the CPR property
      
      * In IdInfo, the
          strictnessInfo field contains a StrictSig, not a Maybe StrictSig
          demandInfo     field contains a Demand, not a Maybe Demand
        We don't need Nothing (to indicate no strictness/demand info)
        any more; topSig/topDmd will do.
      
      * Remove "boxity" analysis entirely.  This was an attempt to
        avoid "reboxing", but it added complexity, is extremely
        ad-hoc, and makes very little difference in practice.
      
      * Remove the "unboxing strategy" computation. This was an an
        attempt to ensure that a worker didn't get zillions of
        arguments by unboxing big tuples.  But in fact removing it
        DRAMATICALLY reduces allocation in an inner loop of the
        I/O library (where the threshold argument-count had been
        set just too low).  It's exceptional to have a zillion arguments
        and I don't think it's worth the complexity, especially since
        it turned out to have a serious performance hit.
      
      * Remove quite a bit of ad-hoc cruft
      
      * Move worthSplittingFun, worthSplittingThunk from WorkWrap to
        Demand. This allows JointDmd to be fully abstract, examined
        only inside Demand.
      
      Everything else really follows from these changes.
      
      All of this is really just refactoring, so we don't expect
      big performance changes, but acutally the numbers look quite
      good.  Here is a full nofib run with some highlights identified:
      
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
               expert          -2.6%    -15.5%      0.00      0.00     +0.0%
                fluid          -2.4%     -7.1%      0.01      0.01     +0.0%
                   gg          -2.5%    -28.9%      0.02      0.02    -33.3%
            integrate          -2.6%     +3.2%     +2.6%     +2.6%     +0.0%
              mandel2          -2.6%     +4.2%      0.01      0.01     +0.0%
             nucleic2          -2.0%    -16.3%      0.11      0.11     +0.0%
                 para          -2.6%    -20.0%    -11.8%    -11.7%     +0.0%
               parser          -2.5%    -17.9%      0.05      0.05     +0.0%
               prolog          -2.6%    -13.0%      0.00      0.00     +0.0%
               puzzle          -2.6%     +2.2%     +0.8%     +0.8%     +0.0%
              sorting          -2.6%    -35.9%      0.00      0.00     +0.0%
             treejoin          -2.6%    -52.2%     -9.8%     -9.9%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -2.7%    -52.2%    -11.8%    -11.7%    -33.3%
                  Max          -1.8%     +4.2%    +10.5%    +10.5%     +7.7%
       Geometric Mean          -2.5%     -2.8%     -0.4%     -0.5%     -0.4%
      
      Things to note
      
      * Binary sizes are smaller. I don't know why, but it's good.
      
      * Allocation is sometiemes a *lot* smaller. I believe that all the big numbers
        (I checked treejoin, gg, sorting) arise from one place, namely a function
        GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of
        which have several arugments.  Not w/w'ing both arguments (which is what
        we did before) has a big effect.  So the big win in actually somewhat
        accidental, gained by removing the "unboxing strategy" code.
      
      * A couple of benchmarks allocate slightly more.  This turns out
        to be due to reboxing (integrate).  But the biggest increase is
        mandel2, and *that* turned out also to be a somewhat accidental
        loss of CSE, and pointed the way to doing better CSE: see Trac
        #7596.
      
      * Runtimes are never very reliable, but seem to improve very slightly.
      
      All in all, a good piece of work.  Thank you Ilya!
      0831a12e
  37. 16 Oct, 2012 2 commits
  38. 02 May, 2012 1 commit
    • Simon Peyton Jones's avatar
      Allow cases with empty alterantives · ac230c5e
      Simon Peyton Jones authored
      This patch allows, for the first time, case expressions with an empty
      list of alternatives. Max suggested the idea, and Trac #6067 showed
      that it is really quite important.
      
      So I've implemented the idea, fixing #6067. Main changes
      
       * See Note [Empty case alternatives] in CoreSyn
      
       * Various foldr1's become foldrs
      
       * IfaceCase does not record the type of the alternatives.
         I added IfaceECase for empty-alternative cases.
      
       * Core Lint does not complain about empty cases
      
       * MkCore.castBottomExpr constructs an empty-alternative case
         expression   (case e of ty {})
      
       * CoreToStg converts '(case e of {})' to just 'e'
      ac230c5e