1. 20 Jun, 2014 1 commit
  2. 15 May, 2014 1 commit
    • Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized, while following the convention, to
      
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      
      - Moreover, if the list of language extensions fits into a single
        `{-# LANGUAGE ... #-}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... #-}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
      
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurrences by `{-# OPTIONS_GHC ... #-}` pragmas.
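
A minimal sketch of the layout convention described in this commit message: one extension per `{-# LANGUAGE #-}` line, in alphabetical order, all placed before the `{-# OPTIONS_GHC #-}` line. The extensions chosen, the `-Wall` option, and the three helper functions are illustrative assumptions and are not taken from the commit.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TupleSections #-}
{-# OPTIONS_GHC -Wall #-}

-- Hypothetical definitions; each one uses exactly one of the
-- extensions enabled above.

-- BangPatterns: force the accumulator at every step.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs

-- LambdaCase: match directly on the lambda's argument.
describe :: Maybe Int -> String
describe = \case
  Nothing -> "nothing"
  Just n  -> "just " ++ show n

-- TupleSections: a partially applied tuple constructor.
tagAll :: String -> [Int] -> [(String, Int)]
tagAll tag = map (tag,)
```

With this layout, adding or removing an extension changes a single line in a diff.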
  3. 06 Mar, 2014 1 commit
    • Make the demand on a binder compatible with type (fixes Trac #8569) · 4b355cd2
      Simon Peyton Jones authored
      Because of GADTs and casts we were getting binders whose
      demand annotation was more deeply nested than made sense
      for its type.
      
      See Note [Trimming a demand to a type], in Demand.lhs,
      which I reproduce here:
      
         Note [Trimming a demand to a type]
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         Consider this:
      
           f :: a -> Bool
           f x = case ... of
                   A g1 -> case (x |> g1) of (p,q) -> ...
                   B    -> error "urk"
      
         where A,B are the constructors of a GADT.  We'll get a U(U,U) demand
         on x from the A branch, but that's a stupid demand for x itself, which
         has type 'a'. Indeed we get ASSERTs going off (notably in
         splitUseProdDmd, Trac #8569).
      
         Bottom line: we really don't want to have a binder whose demand is more
         deeply-nested than its type.  There are various ways to tackle this.
         When processing (x |> g1), we could "trim" the incoming demand U(U,U)
         to match x's type.  But I'm currently doing so just at the moment when
         we pin a demand on a binder, in DmdAnal.findBndrDmd.
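
The trimming idea in the Note can be sketched on a toy model. The `Ty` and `Use` types and the `trimToType` function below are hypothetical simplifications, not GHC's actual definitions: a product demand is kept only where the type is itself a product of matching arity, and everything else collapses to the plain "used" demand.

```haskell
-- Simplified stand-ins for GHC's types; all names are hypothetical.
data Ty
  = TyVar String   -- an opaque type variable such as 'a'
  | TyProd [Ty]    -- a product type with the given field types
  deriving (Eq, Show)

data Use
  = Used           -- "U": used in some unknown way
  | UProd [Use]    -- "U(..)": used as a product, field by field
  deriving (Eq, Show)

-- Trim a usage demand so that it is never nested more deeply than the type.
trimToType :: Ty -> Use -> Use
trimToType (TyProd tys) (UProd us)
  | length tys == length us = UProd (zipWith trimToType tys us)
trimToType _ _ = Used  -- type is not a matching product: fall back to "U"
```

In this model, `trimToType (TyVar "a") (UProd [Used, Used])` evaluates to `Used`, which mirrors the U(U,U) demand on `x :: a` being cut back to U.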
  4. 01 Feb, 2014 1 commit
  5. 23 Jan, 2014 1 commit
    • Some polishing of the demand analyser. · 8d34ae39
      Joachim Breitner authored
      I did some refactoring of the demand analyser, because I was smelling
      some minor code smell. Most of my changes I had to undo, though,
      adding notes and testcases on why the existing code was correct after
      all.
      
      Especially the semantics of the DmdResult is confusing, as it differs in
      a DmdType and a StrictSig.
      
      I got to improve the readability of the code for lubDmdType, though.
      
      Also, dmdAnalRhs was a bit fishy in how it removed the demand on
      further arguments of the body, but used the DmdResult. This would be
      wrong if a body would return a demand type of "<L>m" (which currently
      does not happen).  This is now treated better in removeDmdTyArgs.
  6. 16 Jan, 2014 4 commits
  7. 10 Jan, 2014 2 commits
  8. 16 Dec, 2013 7 commits
  9. 12 Dec, 2013 4 commits
    • Move peelFV from DmdAnal to Demand · 6b6a30d6
      Joachim Breitner authored
    • Improve the handling of used-once stuff · 80989de9
      Simon Peyton Jones authored
      Joachim and I are committing this onto a branch so that we can share it,
      but we expect to do a bit more work before merging it onto head.
      
      Nofib status:
        - Most programs, no change
        - A few improve
        - A couple get worse (cacheprof, tak, rfib)
      Investigating the "get worse" set is what's holding up putting this
      on head.
      
      The major issue is this.  Consider
      
          map (f g) ys
      
      where f's demand signature looks like
      
         f :: <L,C1(C1(U))> -> <L,U> -> .
      
      So 'f' is not saturated.  What demand do we place on g?
      Answer
              C(C1(U))
      That is, the inner C1 should stay, even though f is not saturated.
      
      I found that this made a significant difference in the demand signatures
      inferred in GHC.IO, which uses lots of higher-order exception handlers.
      
      I also had to add used-once demand signatures for some of the
      'catch' primops, so that we know their handlers are only called once.
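
One possible reading of the `map (f g) ys` example, sketched on a toy model. The types and the `argDmdUnsaturated` function are hypothetical and much simpler than GHC's code: because the partial application `f g` may itself be called many times, only the outermost call count is weakened to "many", while the nested C1 is kept.

```haskell
-- Hypothetical, simplified usage demands.
data Count = One | Many
  deriving (Eq, Show)

data Use
  = Used             -- "U"
  | UCall Count Use  -- "C1(d)" when the count is One, "C(d)" when Many
  deriving (Eq, Show)

-- Demand placed on an argument when the enclosing application is not
-- saturated: weaken the outer call count, leave the inner demand alone.
argDmdUnsaturated :: Use -> Use
argDmdUnsaturated (UCall _ inner) = UCall Many inner
argDmdUnsaturated d               = d

-- Render a demand in the notation used in the commit message.
pretty :: Use -> String
pretty Used           = "U"
pretty (UCall One d)  = "C1(" ++ pretty d ++ ")"
pretty (UCall Many d) = "C(" ++ pretty d ++ ")"
```

Applied to the signature demand C1(C1(U)), `pretty (argDmdUnsaturated (UCall One (UCall One Used)))` yields `"C(C1(U))"`, the answer given in the message.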
    • Assign strictness signatures to primitive operations · 0558911f
      Simon Peyton Jones authored
      This patch was authored by SPJ, and extracted from "Improve the handling
      of used-once stuff" by Joachim.
    • Some refactoring of Demand and DmdAnal · 838da6fc
      Simon Peyton Jones authored
      This was authored by SPJ and extracted from the "Improve the handling of
      used-once stuff" patch by Joachim.
  10. 09 Dec, 2013 3 commits
    • Replace mkTopDmdType by mkClosedStrictSig · 3cdf1251
      Joachim Breitner authored
      because it is not a top demand (see previous commit), and it is only used
      in an argument to mkStrictSig.
    • Rename topDmdType to nopDmdType · f64cf134
      Joachim Breitner authored
      because topDmdType is *not* the top of the lattice, as it puts an
      implicit absent demand on free variables, but Abs is the bottom of the
      Usage lattice.
      
      Why nopDmdType? Because it is the demand of doing nothing: Everything
      lazy, everything absent, no definite divergence.
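
A small model of why this value is not the top of the lattice. The record layout, the association-list environment, and the names `fvDemand` and `topDemand` are assumptions made for illustration; only the name `nopDmdType` comes from the commit. A free variable missing from the environment is implicitly absent, whereas the top demand would have to say "used".

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical, simplified demand components.
data StrDmd = Lazy | Strict  deriving (Eq, Show)
data UseDmd = Absent | Used  deriving (Eq, Show)
type Demand = (StrDmd, UseDmd)

data Result = MayReturn | Diverges  deriving (Eq, Show)

data DmdType = DmdType
  { fvDmds  :: [(String, Demand)]  -- demands on free variables
  , argDmds :: [Demand]            -- demands on arguments
  , result  :: Result              -- divergence information
  } deriving (Eq, Show)

-- The demand type of doing nothing: no free-variable demands,
-- no argument demands, no definite divergence.
nopDmdType :: DmdType
nopDmdType = DmdType [] [] MayReturn

-- Variables not mentioned in the environment are lazy and absent.
fvDemand :: DmdType -> String -> Demand
fvDemand ty v = fromMaybe (Lazy, Absent) (lookup v (fvDmds ty))

-- What a genuine top element would say about any variable.
topDemand :: Demand
topDemand = (Lazy, Used)
```

Here `fvDemand nopDmdType "x"` is `(Lazy, Absent)`, which differs from `topDemand`.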
    • Do not forget CPR information after an IO action · a31cb5b0
      Joachim Breitner authored
      but do forget about certain divergence, if required. Fixes one part of
      ticket #8598.
      
      The added function (deferAfterIO) can maybe be merged with existing
      code, but given the ongoing work in the nested-cpr branch, I defer that
      work.
  11. 04 Dec, 2013 1 commit
  12. 02 Dec, 2013 2 commits
  13. 01 Oct, 2013 2 commits
    • Typos in comments · d6ccea98
      Gabor Greif authored
    • Lift an unnecessary assumption in the demand analyser (fix Trac #8329) · 9bd36664
      Simon Peyton Jones authored
      Here's the Note about the (simple) fix.  Apparently #8329 prevented all
      23 packages of the Snap framework from compiling.
      
      Note [Demand transformer for a ditionary selector]
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      If we evaluate (op dict-expr) under demand 'd', then we can push the demand 'd'
      into the appropriate field of the dictionary. What *is* the appropriate field?
      We just look at the strictness signature of the class op, which will be
      something like: U(AAASAAAAA).  Then replace the 'S' by the demand 'd'.
      
      For single-method classes, which are represented by newtypes, the signature
      of 'op' won't look like U(...), so the splitProdDmd_maybe will fail.
      That's fine: if we are doing strictness analysis we are also doing inlining,
      so we'll have inlined 'op' into a cast.  So we can bale out in a conservative
      way, returning topDmdType.
      
      It is (just.. Trac #8329) possible to be running strictness analysis *without*
      having inlined class ops from single-method classes.  Suppose you are using
      ghc --make; and the first module has a local -O0 flag.  So you may load a class
      without interface pragmas, ie (currently) without an unfolding for the class
      ops.   Now if a subsequent module in the --make sweep has a local -O flag
      you might do strictness analysis, but there is no inlining for the class op.
      This is weird so I'm not worried about whether this optimises brilliantly; but
      it should not fall over.
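
The transformer described in the Note can be sketched on a toy demand type. The `Demand` type and `dictSelDemand` are hypothetical simplifications: when the class op's signature is a product demand, the incoming demand replaces the single 'S' field, and in any other case (such as the newtype-represented single-method class) the function gives up conservatively instead of failing.

```haskell
-- Hypothetical, simplified demands.
data Demand
  = Abs            -- "A": field not used
  | Str            -- "S": marks the field selected by the class op
  | Top            -- no information (the conservative answer)
  | Prod [Demand]  -- "U(..)": one demand per dictionary field
  deriving (Eq, Show)

-- dictSelDemand d sig: demand on the dictionary when (op dict) is
-- evaluated under demand d and op has argument signature sig.
dictSelDemand :: Demand -> Demand -> Demand
dictSelDemand d (Prod fields) = Prod (map pushInto fields)
  where
    pushInto Str = d   -- the selected field receives the incoming demand
    pushInto f   = f   -- all other fields keep their demand
dictSelDemand _ _ = Top  -- not a product signature: bale out, do not crash
```

For a signature shaped like U(AASA), `dictSelDemand (Prod [Str, Abs]) (Prod [Abs, Abs, Str, Abs])` gives `Prod [Abs, Abs, Prod [Str, Abs], Abs]`, and any non-product signature gives `Top`.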
  14. 08 Sep, 2013 1 commit
  15. 06 Jun, 2013 2 commits
    • Add important missing case for bothCPR · 4669c9e6
      Simon Peyton Jones authored
      If either side diverges, both do!
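
A sketch of the rule stated in this message, on a reduced result type. The constructor names and the treatment of the non-diverging case (keep the first argument's information) are assumptions for illustration, not a copy of GHC's definition.

```haskell
-- Hypothetical, reduced CPR results.
data CPRResult
  = NoCPR   -- no constructed-result information
  | RetCon  -- returns an explicitly constructed value
  | BotCPR  -- definitely diverges
  deriving (Eq, Show)

-- Combine the results of two computations that are both performed.
bothCPR :: CPRResult -> CPRResult -> CPRResult
bothCPR BotCPR _ = BotCPR  -- left side diverges, so the whole thing does
bothCPR _ BotCPR = BotCPR  -- right side diverges, so the whole thing does
bothCPR r _      = r       -- assumed: otherwise keep the first result
```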
    • Implement cardinality analysis · 99d4e5b4
      Simon Peyton Jones authored
      This major patch implements the cardinality analysis described
      in our paper "Higher order cardinality analysis". It is joint
      work with Ilya Sergey and Dimitrios Vytiniotis.
      
      The basic idea is to augment the absence-analysis part of the demand
      analyser so that it can tell when something is used
           never
           at most once
           some other way
      
      The "at most once" information is used
          a) to enable transformations, and
             in particular to identify one-shot lambdas
          b) to allow updates on thunks to be omitted.
      
      There are two new flags, mainly there so you can do performance
      comparisons:
          -fkill-absence   stops GHC doing absence analysis at all
          -fkill-one-shot  stops GHC spotting one-shot lambdas
                           and single-entry thunks
      
      The big changes are:
      
      * The Demand type is substantially refactored.  In particular
        the UseDmd is factored as follows
            data UseDmd
              = UCall Count UseDmd
              | UProd [MaybeUsed]
              | UHead
              | Used
      
            data MaybeUsed = Abs | Use Count UseDmd
      
            data Count = One | Many
      
        Notice that UCall recurses straight to UseDmd, whereas
        UProd goes via MaybeUsed.
      
        The "Count" embodies the "at most once" or "many" idea.
      
      * The demand analyser itself was refactored a lot
      
      * The previously ad-hoc stuff in the occurrence analyser for foldr and
        build goes away entirely.  Before, if we had build (\cn -> ...x... )
        then the "\cn" was hackily made one-shot (by spotting 'build' as
        special).  That's essential to allow x to be inlined.  Now the
        occurrence analyser propagates info gotten from 'build's strictness
        signature (so build isn't special); and that strictness sig is
        in turn derived entirely automatically.  Much nicer!
      
      * The ticky stuff is improved to count single-entry thunks separately.
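
The `UseDmd` factoring listed among the big changes can be exercised with a short sketch. The three data types are copied from the commit message; the `Cardinality` type and the helpers `cardinality` and `isOneShotCall` are hypothetical additions that show how the `Count` field encodes "never", "at most once", and "some other way".

```haskell
-- Types as given in the commit message.
data Count = One | Many
  deriving (Eq, Show)

data UseDmd
  = UCall Count UseDmd
  | UProd [MaybeUsed]
  | UHead
  | Used
  deriving (Eq, Show)

data MaybeUsed = Abs | Use Count UseDmd
  deriving (Eq, Show)

-- Hypothetical: the three-way classification described in the message.
data Cardinality = Never | AtMostOnce | SomeOtherWay
  deriving (Eq, Show)

cardinality :: MaybeUsed -> Cardinality
cardinality Abs          = Never
cardinality (Use One _)  = AtMostOnce
cardinality (Use Many _) = SomeOtherWay

-- Hypothetical: a value demanded as "called at most once" is a
-- candidate for being treated as a one-shot lambda.
isOneShotCall :: UseDmd -> Bool
isOneShotCall (UCall One _) = True
isOneShotCall _             = False
```

Note how the product case stores `MaybeUsed` per field, so a field can be absent, while a call demand always carries a `UseDmd` for the result.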
      
      One shortcoming is that there is no DEBUG way to spot if an
      allegedly-single-entry thunk is actually entered more than once.  It
      would not be hard to generate a bit of code to check for this, and it
      would be reassuring.  But it's fiddly and I have not done it.
      
      Despite all this fuss, the performance numbers are rather underwhelming.
      See the paper for more discussion.
      
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
             nucleic2          -0.8%    -10.9%      0.10      0.10     +0.0%
               sphere          -0.7%     -1.5%      0.08      0.08     +0.0%
      --------------------------------------------------------------------------------
                  Min          -4.7%    -10.9%     -9.3%     -9.3%    -50.0%
                  Max          -0.4%     +0.5%     +2.2%     +2.3%     +7.4%
       Geometric Mean          -0.8%     -0.2%     -1.3%     -1.3%     -1.8%
      
      I don't quite know how much credence to place in the runtime changes,
      but movement seems generally in the right direction.
  16. 05 Mar, 2013 1 commit
    • Ensure that isStrictDmd is False for Absent (fixes Trac #7737) · a37a7f7b
      Simon Peyton Jones authored
      The demand <HyperStr, Absent> for a let-bound value is a bit
      strange; it means that the context will diverge, but this
      argument isn't used. We don't want to use call-by-value here,
      even though it's semantically sound if all bottoms mean
      the same.
      
      The fix is easy; just make "isStrictDmd" a bit more perspicuous.
      See Note [Strict demands] in Demand.lhs
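
A minimal sketch of the behaviour this fix describes, using simplified stand-in types. The constructor names are assumptions and the real definition lives in Demand.lhs: a joint demand counts as strict only if its strictness part is not lazy and its usage part is not absent.

```haskell
-- Hypothetical, simplified components of a joint demand.
data StrDmd = Lazy | Str | HyperStr  deriving (Eq, Show)
data AbsDmd = Absent | Used          deriving (Eq, Show)

data JointDmd = JD StrDmd AbsDmd
  deriving (Eq, Show)

-- Should a let-bound value with this demand be evaluated eagerly?
isStrictDmd :: JointDmd -> Bool
isStrictDmd (JD _ Absent) = False  -- never used: do not evaluate it
isStrictDmd (JD Lazy _)   = False  -- may not be needed at all
isStrictDmd _             = True
```

In this model `isStrictDmd (JD HyperStr Absent)` is `False`, so call-by-value is not chosen for the strange case discussed above.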
  17. 30 Jan, 2013 1 commit
  18. 25 Jan, 2013 1 commit
  19. 24 Jan, 2013 1 commit
    • Introduce CPR for sum types (Trac #5075) · d3b8991b
      Simon Peyton Jones authored
      The main payload of this patch is to extend CPR so that it
      detects when a function always returns a result constructed
      with the *same* constructor, even if the constructor comes from
      a sum type.  This doesn't matter very often, but it does improve
      some things (results below).
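
As an illustration of the property being detected, the hypothetical function below always returns a `Rect`, even though `Shape` is a sum type. The worker and wrapper are written by hand and use an ordinary tuple, so this is only an analogy for the split the compiler could perform, not its actual output.

```haskell
-- A hypothetical sum type with two constructors.
data Shape = Circle Double | Rect Double Double
  deriving (Eq, Show)

-- Every branch builds its result with the same constructor, Rect.
boundingBox :: Shape -> Shape
boundingBox (Circle r) = Rect (2 * r) (2 * r)
boundingBox (Rect w h) = Rect w h

-- Hand-written worker: returns only the fields of the result.
boundingBoxWorker :: Shape -> (Double, Double)
boundingBoxWorker (Circle r) = (2 * r, 2 * r)
boundingBoxWorker (Rect w h) = (w, h)

-- Hand-written wrapper: re-applies the known constructor.
boundingBoxWrapper :: Shape -> Shape
boundingBoxWrapper s = case boundingBoxWorker s of
  (w, h) -> Rect w h
```

A caller that immediately takes the result apart can then use the worker's fields without the intermediate `Rect` being allocated.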
      
      Binary sizes increase a little bit, I think because there are more
      wrappers.  This is with -split-objs.  Without -split-objs binary sizes
      increased by 6% even for HelloWorld.hs.  It's hard to see exactly why,
      but I think it was because System.Posix.Types.o got included in the
      linked binary, whereas it didn't before.
      
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
                fluid          +1.8%     -0.3%      0.01      0.01     +0.0%
                  tak          +2.2%     -0.2%      0.02      0.02     +0.0%
                 ansi          +1.7%     -0.3%      0.00      0.00     +0.0%
            cacheprof          +1.6%     -0.3%     +0.6%     +0.5%     +1.4%
              parstof          +1.4%     -4.4%      0.00      0.00     +0.0%
              reptile          +2.0%     +0.3%      0.02      0.02     +0.0%
      ----------------------------------------------------------------------
                  Min          +1.1%     -4.4%     -4.7%     -4.7%    -15.0%
                  Max          +2.3%     +0.3%     +8.3%     +9.4%    +50.0%
       Geometric Mean          +1.9%     -0.1%     +0.6%     +0.7%     +0.3%
      
      Other things in this commit
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~
      * Got rid of the Lattice class in Demand
      
      * Refactored the way that products and newtypes are
        decomposed (no change in functionality)
  20. 17 Jan, 2013 1 commit
    • Major patch to implement the new Demand Analyser · 0831a12e
      Simon Peyton Jones authored
      This patch is the result of Ilya Sergey's internship at MSR.  It
      constitutes a thorough overhaul and simplification of the demand
      analyser.  It makes a solid foundation on which we can now build.
      Main changes are
      
      * Instead of having one combined type for Demand, a Demand is
         now a pair (JointDmd) of
            - a StrDmd and
            - an AbsDmd.
         This allows strictness and absence to be thought about quite
         orthogonally, and greatly reduces brain melt-down.
      
      * Similarly in the DmdResult type, it's a pair of
           - a PureResult (indicating only divergence/non-divergence)
           - a CPRResult (which deals only with the CPR property)
      
      * In IdInfo, the
          strictnessInfo field contains a StrictSig, not a Maybe StrictSig
          demandInfo     field contains a Demand, not a Maybe Demand
        We don't need Nothing (to indicate no strictness/demand info)
        any more; topSig/topDmd will do.
      
      * Remove "boxity" analysis entirely.  This was an attempt to
        avoid "reboxing", but it added complexity, is extremely
        ad-hoc, and makes very little difference in practice.
      
      * Remove the "unboxing strategy" computation. This was an
        attempt to ensure that a worker didn't get zillions of
        arguments by unboxing big tuples.  But in fact removing it
        DRAMATICALLY reduces allocation in an inner loop of the
        I/O library (where the threshold argument-count had been
        set just too low).  It's exceptional to have a zillion arguments
        and I don't think it's worth the complexity, especially since
        it turned out to have a serious performance hit.
      
      * Remove quite a bit of ad-hoc cruft
      
      * Move worthSplittingFun, worthSplittingThunk from WorkWrap to
        Demand. This allows JointDmd to be fully abstract, examined
        only inside Demand.
      
      Everything else really follows from these changes.
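
The first change in the list, a demand as a pair of independent strictness and absence parts, can be sketched as follows. The two-point lattices and the names `lubStr`, `lubAbs`, and `lubDmd` are simplifying assumptions; the point is only that the least upper bound is computed component by component, and that a top demand can stand in where `Nothing` was used before.

```haskell
-- Hypothetical two-point lattices for each component.
data StrDmd = Lazy | Str   deriving (Eq, Show)  -- Lazy is the top
data AbsDmd = Abs  | Used  deriving (Eq, Show)  -- Used is the top

data JointDmd = JD StrDmd AbsDmd
  deriving (Eq, Show)

lubStr :: StrDmd -> StrDmd -> StrDmd
lubStr Str Str = Str
lubStr _   _   = Lazy

lubAbs :: AbsDmd -> AbsDmd -> AbsDmd
lubAbs Abs Abs = Abs
lubAbs _   _   = Used

-- Strictness and absence never interact: each side is joined separately.
lubDmd :: JointDmd -> JointDmd -> JointDmd
lubDmd (JD s1 a1) (JD s2 a2) = JD (lubStr s1 s2) (lubAbs a1 a2)

-- "No information": usable wherever a missing demand was a Nothing.
topDmd :: JointDmd
topDmd = JD Lazy Used
```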
      
      All of this is really just refactoring, so we don't expect
      big performance changes, but actually the numbers look quite
      good.  Here is a full nofib run with some highlights identified:
      
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
               expert          -2.6%    -15.5%      0.00      0.00     +0.0%
                fluid          -2.4%     -7.1%      0.01      0.01     +0.0%
                   gg          -2.5%    -28.9%      0.02      0.02    -33.3%
            integrate          -2.6%     +3.2%     +2.6%     +2.6%     +0.0%
              mandel2          -2.6%     +4.2%      0.01      0.01     +0.0%
             nucleic2          -2.0%    -16.3%      0.11      0.11     +0.0%
                 para          -2.6%    -20.0%    -11.8%    -11.7%     +0.0%
               parser          -2.5%    -17.9%      0.05      0.05     +0.0%
               prolog          -2.6%    -13.0%      0.00      0.00     +0.0%
               puzzle          -2.6%     +2.2%     +0.8%     +0.8%     +0.0%
              sorting          -2.6%    -35.9%      0.00      0.00     +0.0%
             treejoin          -2.6%    -52.2%     -9.8%     -9.9%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -2.7%    -52.2%    -11.8%    -11.7%    -33.3%
                  Max          -1.8%     +4.2%    +10.5%    +10.5%     +7.7%
       Geometric Mean          -2.5%     -2.8%     -0.4%     -0.5%     -0.4%
      
      Things to note
      
      * Binary sizes are smaller. I don't know why, but it's good.
      
      * Allocation is sometimes a *lot* smaller. I believe that all the big numbers
        (I checked treejoin, gg, sorting) arise from one place, namely a function
        GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of
        which have several arguments.  Not w/w'ing both arguments (which is what
        we did before) has a big effect.  So the big win is actually somewhat
        accidental, gained by removing the "unboxing strategy" code.
      
      * A couple of benchmarks allocate slightly more.  This turns out
        to be due to reboxing (integrate).  But the biggest increase is
        mandel2, and *that* turned out also to be a somewhat accidental
        loss of CSE, and pointed the way to doing better CSE: see Trac
        #7596.
      
      * Runtimes are never very reliable, but seem to improve very slightly.
      
      All in all, a good piece of work.  Thank you Ilya!
  21. 12 Jun, 2012 1 commit
  22. 04 Nov, 2011 1 commit