1. 27 Jul, 2013 3 commits
  2. 06 Jul, 2013 1 commit
  3. 19 Jun, 2013 1 commit
  4. 06 Jun, 2013 4 commits
    • Add important missing case for bothCPR · 4669c9e6
      Simon Peyton Jones authored
      If either side diverges, both do!
    • Comments about the Name Cache · 507c8970
      Simon Peyton Jones authored
    • Implement cardinality analysis · 99d4e5b4
      Simon Peyton Jones authored
      This major patch implements the cardinality analysis described
      in our paper "Higher order cardinality analysis". It is joint
      work with Ilya Sergey and Dimitrios Vytiniotis.
      The basic idea is to augment the absence-analysis part of the demand
      analyser so that it can tell when something is used
          at most once
          some other way
      The "at most once" information is used
          a) to enable transformations, and
             in particular to identify one-shot lambdas
          b) to allow updates on thunks to be omitted.
      There are two new flags, mainly there so you can do performance comparisons:
          -fkill-absence   stops GHC doing absence analysis at all
          -fkill-one-shot  stops GHC spotting one-shot lambdas
                           and single-entry thunks
      The big changes are:
      * The Demand type is substantially refactored.  In particular
        the UseDmd is factored as follows
            data UseDmd
              = UCall Count UseDmd
              | UProd [MaybeUsed]
              | UHead
              | Used
            data MaybeUsed = Abs | Use Count UseDmd
            data Count = One | Many
        Notice that UCall recurses straight to UseDmd, whereas
        UProd goes via MaybeUsed.
        The "Count" embodies the "at most once" or "many" idea.
      * The demand analyser itself was refactored a lot
      * The previously ad-hoc stuff in the occurrence analyser for foldr and
        build goes away entirely.  Before, if we had build (\cn -> ...x... )
        then the "\cn" was hackily made one-shot (by spotting 'build' as
        special).  That's essential to allow x to be inlined.  Now the
        occurrence analyser propagates info gotten from build's strictness
        signature (so build isn't special); and that strictness sig is
        in turn derived entirely automatically.  Much nicer!
      * The ticky stuff is improved to count single-entry thunks separately.
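The refactored usage types can be tried out in isolation. The following is a minimal sketch, not the code in GHC's Demand module: the three data declarations are copied from the message above, while the helpers `usedAtMostOnce` and `isOneShotCall` are hypothetical names introduced here only to show how the `Count` component might be consulted.

```haskell
-- Types as quoted in the commit message.
data Count = One | Many
  deriving (Eq, Show)

data UseDmd
  = UCall Count UseDmd   -- recurses straight to UseDmd
  | UProd [MaybeUsed]    -- goes via MaybeUsed, one entry per component
  | UHead
  | Used
  deriving (Eq, Show)

data MaybeUsed = Abs | Use Count UseDmd
  deriving (Eq, Show)

-- Hypothetical helper (not from GHC): is the value absent or used
-- at most once?  This is the property that would let a thunk's
-- update be omitted.
usedAtMostOnce :: MaybeUsed -> Bool
usedAtMostOnce Abs          = True
usedAtMostOnce (Use One _)  = True
usedAtMostOnce (Use Many _) = False

-- Hypothetical helper (not from GHC): is a function value called
-- at most once, i.e. could its lambda be treated as one-shot?
isOneShotCall :: UseDmd -> Bool
isOneShotCall (UCall One _) = True
isOneShotCall _             = False
```

For instance, `usedAtMostOnce (Use One Used)` is `True`, while `isOneShotCall (UCall Many Used)` is `False`.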
      One shortcoming is that there is no DEBUG way to spot if an
      allegedly-single-entry thunk is actually entered more than once.  It
      would not be hard to generate a bit of code to check for this, and it
      would be reassuring.  But it's fiddly and I have not done it.
      Despite all this fuss, the performance numbers are rather under-whelming.
      See the paper for more discussion.
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
             nucleic2          -0.8%    -10.9%      0.10      0.10     +0.0%
               sphere          -0.7%     -1.5%      0.08      0.08     +0.0%
                  Min          -4.7%    -10.9%     -9.3%     -9.3%    -50.0%
                  Max          -0.4%     +0.5%     +2.2%     +2.3%     +7.4%
       Geometric Mean          -0.8%     -0.2%     -1.3%     -1.3%     -1.8%
      I don't quite know how much credence to place in the runtime changes,
      but movement seems generally in the right direction.
  5. 30 May, 2013 2 commits
    • Add a primitive for coercing values into dictionaries in a special case. · ac330cb6
      Iavor S. Diatchki authored
      The details of this are described in Note [magicSingIId magic] in basicTypes/MkId.lhs
    • Make 'SPECIALISE instance' work again · 1ed04090
      Simon Peyton Jones authored
      This is a long-standing regression (Trac #7797), which meant that in
      particular the Eq [Char] instance does not get specialised.
      (The *methods* do, but the dictionary itself doesn't.)  So when you
      call a function
           f :: Eq a => blah
      on a string type (ie a=[Char]), 7.6 passes a dictionary of un-specialised
      methods.
      This only matters when calling an overloaded function from a
      specialised context, but that does matter in some programs.  I
      remember (though I cannot find the details) that Nick Frisby discovered
      this to be the source of some pretty solid performance regressions.
      Anyway it works now. The key change is that a DFunUnfolding now takes
      a form that is both simpler than before (the DFunArg type is eliminated)
      and more general:
      data Unfolding
        = ...
        | DFunUnfolding {     -- The Unfolding of a DFunId
                              -- See Note [DFun unfoldings]
                              --     df = /\a1..am. \d1..dn. MkD t1 .. tk
                              --                                 (op1 a1..am d1..dn)
                              --                                 (op2 a1..am d1..dn)
              df_bndrs :: [Var],      -- The bound variables [a1..m],[d1..dn]
              df_con   :: DataCon,    -- The dictionary data constructor (never a newtype datacon)
              df_args  :: [CoreExpr]  -- Args of the data con: types, superclasses and methods,
          }                           -- in positional order
      That in turn allowed me to re-enable the DFunUnfolding specialisation in
      DsBinds.  Lots of details here in TcInstDcls:
          Note [SPECIALISE instance pragmas]
      I also did some refactoring, in particular to pass the InScopeSet to
      exprIsConApp_maybe (which in turn means it has to go to a RuleFun).
      NB: Interface file format has changed!
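For readers unfamiliar with the pragma, here is a small user-level sketch of a 'SPECIALISE instance' declaration. The type `Wrap` and the functions `allEqual` and `sameStrings` are hypothetical names made up for this illustration; the real case discussed above is the Eq [Char] instance in the base libraries.

```haskell
-- Hypothetical list wrapper, standing in for the list instance in base.
newtype Wrap a = Wrap [a]

instance Eq a => Eq (Wrap a) where
  -- Ask GHC to also build a dictionary specialised to Char elements.
  {-# SPECIALISE instance Eq (Wrap Char) #-}
  Wrap xs == Wrap ys = xs == ys

-- An overloaded function: it receives an Eq dictionary at run time
-- unless it is itself specialised or inlined.
allEqual :: Eq a => [a] -> Bool
allEqual []       = True
allEqual (x : xs) = all (== x) xs

-- A call at the specialised type; this is the situation in which the
-- specialised dictionary can be passed instead of the generic one.
sameStrings :: Bool
sameStrings = allEqual [Wrap "ghc", Wrap "ghc"]
```

Whether the specialised dictionary is actually used depends on the optimisation level and on the compiler version, so this only shows the shape of the source code involved.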
  6. 28 May, 2013 1 commit
  7. 21 May, 2013 1 commit
    • Simplify kind generalisation, and fix Trac #7916 · ce89bdec
      Simon Peyton Jones authored
      A buglet that exposed an opportunity for some welcome refactoring
      and simplification.  Main changes
      * TcMType.zonkQuantifiedTyVars is replaced by quantifyTyVars, which
        does a bit more zonking (so that its clients do not need to)
      * TcHsType.kindGeneralise becomes a bit simpler, and hands off
        to quantifyTyVars
      * A bit of simplification of the hacky code in TcTyClsDcls.tcConDecl,
        where we figure out how to generalise the data constructor's type
      * Improve the error message from badExistential when a constructor
        has an existential type, by printing the offending type
      * Some consequential simplification in simplifyInfer.
  8. 17 May, 2013 1 commit
  9. 15 May, 2013 1 commit
  10. 25 Apr, 2013 1 commit
  11. 21 Apr, 2013 1 commit
  12. 05 Mar, 2013 1 commit
    • Ensure that isStrictDmd is False for Absent (fixes Trac #7737) · a37a7f7b
      Simon Peyton Jones authored
      The demand <HyperStr, Absent> for a let-bound value is a bit
      strange; it means that the context will diverge, but this
      argument isn't used. We don't want to use call-by-value here,
      even though it's semantically sound if all bottoms mean
      the same.
      The fix is easy; just make "isStrictDmd" a bit more perspicuous.
      See Note [Strict demands] in Demand.lhs
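The intent of the fix can be illustrated with a toy model. This is a sketch under stated assumptions, not the definition in Demand.lhs: only the names HyperStr, Absent and isStrictDmd come from the message above, and the remaining constructors (`Lazy`, `Str`, `Used`) and the `JointDmd` layout are simplified stand-ins chosen for this example.

```haskell
-- Simplified stand-ins for the strictness and absence halves of a demand.
data StrDmd = Lazy | Str | HyperStr
  deriving (Eq, Show)

data AbsDmd = Absent | Used
  deriving (Eq, Show)

-- A demand as a pair of the two halves (assumed layout).
data JointDmd = JointDmd StrDmd AbsDmd
  deriving (Eq, Show)

-- A demand counts as strict only if the value is actually used.
-- In particular <HyperStr, Absent> is reported as not strict, so a
-- let-bound value with that demand is not evaluated eagerly.
isStrictDmd :: JointDmd -> Bool
isStrictDmd (JointDmd _ Absent) = False
isStrictDmd (JointDmd Lazy _)   = False
isStrictDmd _                   = True
```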
  13. 14 Feb, 2013 1 commit
  14. 02 Feb, 2013 1 commit
  15. 30 Jan, 2013 2 commits
  16. 25 Jan, 2013 3 commits
    • Remove dead code · 9c661e07
      Simon Peyton Jones authored
    • Collapse DmdResult into CPRResult · e3426665
      Simon Peyton Jones authored
      There was no gain from PureResult; the CPRResult component
      needs a BotCPR value anyhow, so it was simply duplicate computation.
    • Refactor and improve the promotion inference · 09ff0e0d
      Simon Peyton Jones authored
      It should be the case that either an entire mutually recursive
      group of data type declarations can be promoted, or none of them.
      It's really odd to promote some data constructors of a type but
      not others. Eg
        data T a = T1 a | T2 Int
      Here T1 is sort-of-promotable but T2 isn't (because Int isn't
      This patch makes it all-or-nothing. At the same time I've made
      the TyCon point to its promoted cousin (via the tcPromoted field
      of an AlgTyCon), as well as vice versa (via the ty_con field of
      The inference for the group is done in TcTyDecls, the same place
      that infers which data types are recursive, another global question.
  17. 24 Jan, 2013 1 commit
    • Introduce CPR for sum types (Trac #5075) · d3b8991b
      Simon Peyton Jones authored
      The main payload of this patch is to extend CPR so that it
      detects when a function always returns a result constructed
      with the *same* constructor, even if the constructor comes from
      a sum type.  This doesn't matter very often, but it does improve
      some things (results below).
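To make the idea concrete, here is a hand-written sketch of the kind of function that benefits. The names `clampPositive`, `clampPositiveWorker` and `clampPositiveWrapper` are hypothetical, and the split is written out by hand at source level; the worker GHC generates would return an unboxed result and will look different.

```haskell
-- Every branch returns a value built with the same constructor (Just),
-- even though Maybe is a sum type.
clampPositive :: Int -> Maybe Int
clampPositive x
  | x > 0     = Just (x * 2)
  | otherwise = Just 0

-- Hand-written analogue of the worker: it computes only the payload.
clampPositiveWorker :: Int -> Int
clampPositiveWorker x
  | x > 0     = x * 2
  | otherwise = 0

-- Hand-written analogue of the wrapper: it re-applies the constructor.
-- Because it is small, it can be inlined at call sites, where the
-- Just may then cancel against a case expression.
clampPositiveWrapper :: Int -> Maybe Int
clampPositiveWrapper x = Just (clampPositiveWorker x)
```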
      Binary sizes increase a little bit, I think because there are more
      wrappers.  This is with -split-objs.  Without -split-objs binary sizes
      increased by 6% even for HelloWorld.hs.  It's hard to see exactly why,
      but I think it was because System.Posix.Types.o got included in the
      linked binary, whereas it didn't before.
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
                fluid          +1.8%     -0.3%      0.01      0.01     +0.0%
                  tak          +2.2%     -0.2%      0.02      0.02     +0.0%
                 ansi          +1.7%     -0.3%      0.00      0.00     +0.0%
            cacheprof          +1.6%     -0.3%     +0.6%     +0.5%     +1.4%
              parstof          +1.4%     -4.4%      0.00      0.00     +0.0%
              reptile          +2.0%     +0.3%      0.02      0.02     +0.0%
                  Min          +1.1%     -4.4%     -4.7%     -4.7%    -15.0%
                  Max          +2.3%     +0.3%     +8.3%     +9.4%    +50.0%
       Geometric Mean          +1.9%     -0.1%     +0.6%     +0.7%     +0.3%
      Other things in this commit
      * Got rid of the Lattice class in Demand
      * Refactored the way that products and newtypes are
        decomposed (no change in functionality)
  18. 18 Jan, 2013 1 commit
  19. 17 Jan, 2013 1 commit
    • Major patch to implement the new Demand Analyser · 0831a12e
      Simon Peyton Jones authored
      This patch is the result of Ilya Sergey's internship at MSR.  It
      constitutes a thorough overhaul and simplification of the demand
      analyser.  It makes a solid foundation on which we can now build.
      Main changes are
      * Instead of having one combined type for Demand, a Demand is
         now a pair (JointDmd) of
            - a StrDmd and
            - an AbsDmd.
         This allows strictness and absence to be thought about quite
         orthogonally, and greatly reduces brain melt-down.
      * Similarly in the DmdResult type, it's a pair of
           - a PureResult (indicating only divergence/non-divergence)
           - a CPRResult (which deals only with the CPR property)
      * In IdInfo, the
          strictnessInfo field contains a StrictSig, not a Maybe StrictSig
          demandInfo     field contains a Demand, not a Maybe Demand
        We don't need Nothing (to indicate no strictness/demand info)
        any more; topSig/topDmd will do.
      * Remove "boxity" analysis entirely.  This was an attempt to
        avoid "reboxing", but it added complexity, is extremely
        ad-hoc, and makes very little difference in practice.
      * Remove the "unboxing strategy" computation. This was an
        attempt to ensure that a worker didn't get zillions of
        arguments by unboxing big tuples.  But in fact removing it
        DRAMATICALLY reduces allocation in an inner loop of the
        I/O library (where the threshold argument-count had been
        set just too low).  It's exceptional to have a zillion arguments
        and I don't think it's worth the complexity, especially since
        it turned out to have a serious performance hit.
      * Remove quite a bit of ad-hoc cruft
      * Move worthSplittingFun, worthSplittingThunk from WorkWrap to
        Demand. This allows JointDmd to be fully abstract, examined
        only inside Demand.
      Everything else really follows from these changes.
      All of this is really just refactoring, so we don't expect
      big performance changes, but actually the numbers look quite
      good.  Here is a full nofib run with some highlights identified:
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
               expert          -2.6%    -15.5%      0.00      0.00     +0.0%
                fluid          -2.4%     -7.1%      0.01      0.01     +0.0%
                   gg          -2.5%    -28.9%      0.02      0.02    -33.3%
            integrate          -2.6%     +3.2%     +2.6%     +2.6%     +0.0%
              mandel2          -2.6%     +4.2%      0.01      0.01     +0.0%
             nucleic2          -2.0%    -16.3%      0.11      0.11     +0.0%
                 para          -2.6%    -20.0%    -11.8%    -11.7%     +0.0%
               parser          -2.5%    -17.9%      0.05      0.05     +0.0%
               prolog          -2.6%    -13.0%      0.00      0.00     +0.0%
               puzzle          -2.6%     +2.2%     +0.8%     +0.8%     +0.0%
              sorting          -2.6%    -35.9%      0.00      0.00     +0.0%
             treejoin          -2.6%    -52.2%     -9.8%     -9.9%     +0.0%
                  Min          -2.7%    -52.2%    -11.8%    -11.7%    -33.3%
                  Max          -1.8%     +4.2%    +10.5%    +10.5%     +7.7%
       Geometric Mean          -2.5%     -2.8%     -0.4%     -0.5%     -0.4%
      Things to note
      * Binary sizes are smaller. I don't know why, but it's good.
      * Allocation is sometimes a *lot* smaller. I believe that all the big numbers
        (I checked treejoin, gg, sorting) arise from one place, namely a function
        GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of
        which have several arguments.  Not w/w'ing both arguments (which is what
        we did before) has a big effect.  So the big win is actually somewhat
        accidental, gained by removing the "unboxing strategy" code.
      * A couple of benchmarks allocate slightly more.  This turns out
        to be due to reboxing (integrate).  But the biggest increase is
        mandel2, and *that* turned out also to be a somewhat accidental
        loss of CSE, and pointed the way to doing better CSE: see Trac
      * Runtimes are never very reliable, but seem to improve very slightly.
      All in all, a good piece of work.  Thank you Ilya!
  20. 15 Jan, 2013 1 commit
  21. 14 Jan, 2013 1 commit
    • Be willing to parse {-# UNPACK #-} without '!' · deec5b74
      Simon Peyton Jones authored
      This change gives a more helpful error message when the
      user says    data T = MkT {-# UNPACK #-} Int
      which should have a strictness '!' as well. Rather than
      just a parse error, we get
        T7562.hs:3:14: Warning:
          UNPACK pragma lacks '!' on the first argument of `MkT'
      Fixes Trac #7562
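For comparison, the form that the warning is steering users towards looks like this. It is a minimal sketch; the accessor `getInt` is a hypothetical helper added only so the module does something.

```haskell
-- The field carries both the UNPACK pragma and the strictness '!'.
-- Without the '!' the pragma has no effect, which is what the
-- warning above reports.
data T = MkT {-# UNPACK #-} !Int

-- Hypothetical accessor, just to exercise the constructor.
getInt :: T -> Int
getInt (MkT n) = n
```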
  22. 07 Jan, 2013 1 commit
  23. 02 Jan, 2013 1 commit
    • Define ListSetOps.getNth, and use it · b0c0cae7
      Simon Peyton Jones authored
      I was tracking down an error looking like
        Prelude.(!!): index too large
      which is very unhelpful.  This patch replaces at least some uses
      of (!!) in GHC with getNth, which has a more helpful error
      message (with DEBUG anyway)
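The idea of an indexing function with a more informative failure can be sketched outside GHC as follows. This is not the definition in ListSetOps: `getNthSketch` is a hypothetical stand-in that uses a `Show` constraint and always checks the index, whereas the message above says the real function gives its better message only in DEBUG builds.

```haskell
-- Hypothetical stand-in for an indexing function that reports the
-- offending index and list instead of a bare "index too large".
getNthSketch :: Show a => [a] -> Int -> a
getNthSketch xs n
  | n >= 0, (x : _) <- drop n xs = x
  | otherwise =
      error ("getNthSketch: index " ++ show n
             ++ " out of range for list " ++ show xs)
```

A call such as `getNthSketch "abc" 1` behaves like `"abc" !! 1`, while an out-of-range index raises an error that names both the index and the list.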
  24. 23 Dec, 2012 1 commit
    • Make {-# UNPACK #-} work for type/data family invocations · 1ee1cd41
      Simon Peyton Jones authored
      This fixes most of Trac #3990.  Consider
        data family D a
        data instance D Double = CD Int Int
        data T = T {-# UNPACK #-} !(D Double)
      Then we want the (D Double) unpacked.
      To do this we need to construct a suitable coercion, and it's much
      safer to record that coercion in the interface file, lest the in-scope
      instances differ somehow.  That in turn means elaborating the HsBang
      type to include a coercion.
      To do that I moved HsBang from BasicTypes to DataCon, which caused
      quite a few minor knock-on changes.
      Interface-file format has changed!
      Still to do: need to do knot-tying to allow instances to take effect
      within the same module.
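The example from the message can be turned into a small compilable module. This is only a sketch of the source-level shape: `sumFields` is a hypothetical helper, and whether the field is really unpacked depends on the compiler version and optimisation settings (the message itself notes that same-module instances were still to do).

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Declarations as in the commit message.
data family D a
data instance D Double = CD Int Int

data T = T {-# UNPACK #-} !(D Double)

-- Hypothetical helper: pattern matching sees the source-level
-- argument (a D Double), regardless of how T is represented.
sumFields :: T -> Int
sumFields (T (CD x y)) = x + y
```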
  25. 22 Dec, 2012 1 commit
    • Implement overlapping type family instances. · 8366792e
      eir@cis.upenn.edu authored
      An ordered, overlapping type family instance is introduced by 'type
      where', followed by equations. See the new section in the user manual
      ( for details. The canonical example is Boolean equality at the
      type family Equals (a :: k) (b :: k) :: Bool
      type instance where
        Equals a a = True
        Equals a b = False
      A branched family instance, such as this one, checks its equations in
      and applies only the first that matches. As explained in the note
      checking within groups] in FamInstEnv.lhs, we must be careful not to
      say, (Equals Int b) to False, because b might later unify with Int.
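The first-match behaviour can be tried with the closed type family syntax that later GHC releases provide, which is an assumption of this sketch: it does not use the 'type instance where' form described in this commit. The class `KnownBool` and the values `sameType` and `differentType` are hypothetical names introduced only to observe the result at the term level.

```haskell
{-# LANGUAGE TypeFamilies, DataKinds, PolyKinds, KindSignatures #-}

import Data.Proxy (Proxy (..))

-- Equations are tried in order; the first one that matches applies.
type family Equals (a :: k) (b :: k) :: Bool where
  Equals a a = 'True
  Equals a b = 'False

-- Hypothetical helper class to reflect a type-level Bool as a value.
class KnownBool (b :: Bool) where
  boolVal :: Proxy b -> Bool

instance KnownBool 'True where
  boolVal _ = True

instance KnownBool 'False where
  boolVal _ = False

-- Int and Int match the first equation.
sameType :: Bool
sameType = boolVal (Proxy :: Proxy (Equals Int Int))

-- Int and Bool are apart, so the first equation can never match and
-- the second one is used.
differentType :: Bool
differentType = boolVal (Proxy :: Proxy (Equals Int Bool))
```

By contrast, a use such as `Equals Int b` with `b` still a type variable is left unreduced, since `b` might later be instantiated to `Int`.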
      This commit includes all of the commits on the overlapping-tyfams
      branch. SPJ requested that I combine all my commits over the past
      several months into one monolithic commit. The following GHC repos
      are affected: ghc, testsuite, utils/haddock,
      libraries/template-haskell, and libraries/dph.
      Here are some details for the interested:
      - The definition of CoAxiom has been moved from TyCon.lhs to a
        new file CoAxiom.lhs. I made this decision because of the
        number of definitions necessary to support BranchList.
      - BranchList is a GADT whose type tracks whether it is a
        singleton list or not-necessarily-a-singleton-list. The reason
        I introduced this type is to increase static checking of places
        where GHC code assumes that a FamInst or CoAxiom is indeed a
        singleton. This assumption takes place roughly 10 times
        throughout the code. I was worried that a future change to GHC
        would invalidate the assumption, and GHC might subtly fail to
        do the right thing. By explicitly labeling CoAxioms and
        FamInsts as being Unbranched (singleton) or
        Branched (not-necessarily-singleton), we make this assumption
        explicit and checkable. Furthermore, to enforce the accuracy of
        this label, the list of branches of a CoAxiom or FamInst is
        stored using a BranchList, whose constructors constrain its
        type index appropriately.
      I think that the decision to use BranchList is probably the most
      controversial decision I made from a code design point of view.
      Although I provide conversions to/from ordinary lists, it is more
      efficient to use the brList... functions provided in CoAxiom than
      always to convert. The use of these functions does not wander far
      from the core CoAxiom/FamInst logic.
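A GADT whose type index tracks "singleton" versus "not necessarily a singleton" can be sketched as below. This is an illustration under assumptions rather than the code in CoAxiom.lhs: the constructor names, the `BranchFlag` kind, and the helpers `branchesToList` and `singleBranch` are hypothetical, and the sketch relies on the DataKinds extension for the index.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

-- Type-level label: may the list have more than one branch?
data BranchFlag = Branched | Unbranched

-- A non-empty list whose index is forced to 'Branched as soon as a
-- second element is added.
data BranchList a (br :: BranchFlag) where
  FirstBranch :: a -> BranchList a br
  NextBranch  :: a -> BranchList a br -> BranchList a 'Branched

-- Conversion to an ordinary list works for either label.
branchesToList :: BranchList a br -> [a]
branchesToList (FirstBranch x)     = [x]
branchesToList (NextBranch x rest) = x : branchesToList rest

-- Extracting "the" branch is only allowed for an Unbranched list.
-- No NextBranch case is needed: its result type is 'Branched, so the
-- type checker rules it out here.
singleBranch :: BranchList a 'Unbranched -> a
singleBranch (FirstBranch x) = x
```

With this encoding, code that assumes a singleton has to say so in its type, which is the static check the message describes.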
      BranchLists are motivated and explained in the note [Branched axioms] in
      - The CoAxiom type has changed significantly. You can see the new
        type in CoAxiom.lhs. It uses a CoAxBranch type to track
        branches of the CoAxiom. Correspondingly various functions
        producing and consuming CoAxioms had to change, including the
        binary layout of interface files.
      - To get branched axioms to work correctly, it is important to have a
        notion of type "apartness": two types are apart if they cannot unify, and no
        substitution of variables can ever get them to unify, even after type
        simplification. (This is different than the normal failure to unify
        of the type family bit.) This notion is encoded in tcApartTys, in
        Because apartness is finer-grained than unification, the tcUnifyTys
        calls tcApartTys.
      - CoreLinting axioms has been updated, both to reflect the new
        form of CoAxiom and to enforce the apartness rules of branch
        application. The formalization of the new rules is in
      - The FamInst type (in types/FamInstEnv.lhs) has changed
        significantly, paralleling the changes to CoAxiom. Of course,
        this forced minor changes in many files.
      - There are several new Notes in FamInstEnv.lhs, including one
        discussing confluent overlap and why we're not doing it.
      - lookupFamInstEnv, lookupFamInstEnvConflicts, and
        lookup_fam_inst_env' (the function that actually does the work)
        have all been more-or-less completely rewritten. There is a
        Note [lookup_fam_inst_env' implementation] describing the
        implementation. One of the changes that affects other files is
        to change the type of matches from a pair of (FamInst, [Type])
        to a new datatype (which now includes the index of the matching
        branch). This seemed a better design.
      - The TySynInstD constructor in Template Haskell was updated to
        use the new datatype TySynEqn. I also bumped the TH version
        number, requiring changes to DPH cabal files. (That's why the
        DPH repo has an overlapping-tyfams branch.)
      - As SPJ requested, I refactored some of the code in HsDecls:
       * splitting up TyDecl into SynDecl and DataDecl, correspondingly
         changing HsTyDefn to HsDataDefn (with only one constructor)
       * splitting FamInstD into TyFamInstD and DataFamInstD and
         splitting FamInstDecl into DataFamInstDecl and TyFamInstDecl
       * making the ClsInstD take a ClsInstDecl, for parallelism with
         InstDecl's other constructors
       * changing constructor TyFamily into FamDecl
       * creating a FamilyDecl type that stores the details for a family
         declaration; this is useful because FamilyDecls can appear in classes
         other decls cannot
       * restricting the associated types and associated type defaults for a
         class to be the new, more restrictive types
       * splitting cid_fam_insts into cid_tyfam_insts and cid_datafam_insts,
         according to the new types
       * perhaps one or two more that I'm overlooking
      None of these changes has far-reaching implications.
      - The user manual, section, is updated to describe the new type
  26. 21 Dec, 2012 1 commit
  27. 19 Dec, 2012 2 commits
  28. 14 Dec, 2012 3 commits
    • Major refactoring of the way that UNPACK pragmas are handled · faa8ff40
      Simon Peyton Jones authored
      The situation was pretty dire.  The way in which data constructors
      were handled, notably the mapping between their *source* argument types
      and their *representation* argument types (after seq'ing and unpacking)
      was scattered in three different places, and hard to keep in sync.
      Now it is all in one place:
       * The dcRep field of a DataCon gives its representation,
         specified by a DataConRep
       * As well as having the wrapper, the DataConRep has a "boxer"
         of type DataConBoxer (defined in MkId for loopy reasons).
         The boxer is used at a pattern match to reconstruct the source-level
         arguments from the rep-level bindings in the pattern match.
       * The unboxing in the wrapper and the boxing in the boxer are dual,
         and are now constructed together, by MkId.mkDataConRep. This is
         the key function of this change.
       * All the computeBoxingStrategy code in TcTyClsDcls disappears.
      Much nicer.
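The duality between wrapper and boxer can be illustrated by hand at source level. This is a sketch under assumptions, not what MkId.mkDataConRep produces: the source declaration is imagined to be `data T = MkT {-# UNPACK #-} !(Int, Int)`, and `TRep`, `wrapperMkT`, `boxerMkT` and `firstField` are hypothetical names standing in for the representation constructor, the wrapper, the boxer, and a pattern match.

```haskell
-- Representation-level constructor: the pair has been unpacked into
-- two separate fields.
data TRep = MkTRep Int Int
  deriving (Eq, Show)

-- Wrapper: takes the source-level argument, forces and unpacks it,
-- and builds the representation.
wrapperMkT :: (Int, Int) -> TRep
wrapperMkT (a, b) = MkTRep a b

-- Boxer: takes the representation-level bindings from a pattern
-- match and rebuilds the source-level argument.
boxerMkT :: Int -> Int -> (Int, Int)
boxerMkT a b = (a, b)

-- A source-level match "case t of MkT p -> fst p", written against
-- the representation using the boxer.
firstField :: TRep -> Int
firstField (MkTRep a b) = fst (boxerMkT a b)
```

Applying the boxer to the fields that the wrapper stored gives back the original argument, which is the sense in which the two are dual.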
      There is a little bit of refactoring left to do; the strange
      deepSplitProductType functions are now called only in WwLib, so
      I moved them there, and I think they could be tidied up further.