1. 04 May, 2017 3 commits
    • Deal with exceptions in dsWhenNoErrs · e7701976
      Simon Peyton Jones authored
      Gracious me.  Ever since this patch
      
        commit 37445780
        Author: Jan Stolarek <jan.stolarek@p.lodz.pl>
        Date:   Fri Jul 11 13:54:45 2014 +0200
      
            Injective type families
      
      TcRnMonad.askNoErrs has been wrong. It looked like this
      
         askNoErrs :: TcRn a -> TcRn (a, Bool)
         askNoErrs m
          = do { errs_var <- newTcRef emptyMessages
               ; res  <- setErrsVar errs_var m
               ; (warns, errs) <- readTcRef errs_var
               ; addMessages (warns, errs)
               ; return (res, isEmptyBag errs) }
      
      The trouble comes if 'm' throws an exception in the TcRn monad.
      Then 'errs_var' is never read, so any errors are simply lost.
      
      This mistake was then propagated into DsMonad.dsWhenNoErrs, where
      it gave rise to Trac #13642.
      
      Thanks to Ryan for narrowing it down so sharply.
      
      I did some refactoring, as usual.
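
      The failure mode described above can be sketched outside GHC.  This is
      a minimal model, not the actual patch: it assumes plain IO with an
      IORef standing in for the TcRn error variable, and the names Errs,
      askNoErrsBuggy, askNoErrsFixed, failing and survivingErrs are all
      hypothetical.  The repaired version catches the exception, copies the
      collected messages to the outer store, and only then rethrows.

```haskell
import Control.Exception (SomeException, throwIO, try)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for the typechecker's message store.
type Errs = IORef [String]

-- Buggy shape: if 'm' throws, the local errors are never copied back.
askNoErrsBuggy :: Errs -> (Errs -> IO a) -> IO (a, Bool)
askNoErrsBuggy outer m = do
  local <- newIORef []
  res   <- m local
  errs  <- readIORef local
  modifyIORef' outer (++ errs)
  pure (res, null errs)

-- Repaired shape: catch the exception, propagate the messages, rethrow.
askNoErrsFixed :: Errs -> (Errs -> IO a) -> IO (a, Bool)
askNoErrsFixed outer m = do
  local <- newIORef []
  r     <- try (m local)
  errs  <- readIORef local
  modifyIORef' outer (++ errs)
  case r of
    Left e    -> throwIO (e :: SomeException)
    Right res -> pure (res, null errs)

-- An action that records an error and then fails with an exception.
failing :: Errs -> IO ()
failing ref = modifyIORef' ref (++ ["type error"]) >> throwIO (userError "abort")

-- Run a wrapper against 'failing' and report what reached the outer store.
survivingErrs :: (Errs -> (Errs -> IO ()) -> IO ((), Bool)) -> IO [String]
survivingErrs wrapper = do
  outer <- newIORef []
  _ <- try (wrapper outer failing) :: IO (Either SomeException ((), Bool))
  readIORef outer
```

      With these definitions, survivingErrs askNoErrsBuggy yields an empty
      list, whereas survivingErrs askNoErrsFixed still reports the recorded
      error.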
    • Abandon typedefing the {Section,ObjectCode}FormatInfo structs · 81af480a
      Gabor Greif authored
      Summary:
      This is a follow-up to @angerman's refactoring for ELF
      that happened with e5e8646d.
      My previous commit a6675a93
      corrected a typedef redefinition issue with GCC v4.4
      (which is pervasive with RHEL 6). Now the problem has resurfaced.
      
      Instead of dancing to each compiler's tune, I decided
      to eliminate the typedefs altogether and refer to the struct
      namespace explicitly.
      
      Added a note to describe why typedefs are not
      applied on customisable structs.
      
      Reviewers: austin, bgamari, erikd, simonmar
      
      Subscribers: rwbarton, thomie, angerman
      
      Differential Revision: https://phabricator.haskell.org/D3527
    • Teach optCoercion about FunCo · 783dfa74
      Simon Peyton Jones authored
      I was seeing coercions like
      
         Nth 3 ((c2 -> c2) ; (c3 -> c4))
      
      which made me realise that optCoercion was doing a bad job
      of the (relatively new) FunCo.
      
      In particular, opt_trans_rule needs a FunCo/FunCo case,
      to go with the TyConAppCo/TyConAppCo case.  Easy.
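
      The idea of a FunCo/FunCo case can be illustrated on a toy coercion
      type.  This is only a sketch: the Co type below is a drastic
      simplification (no roles, no kinds), AxCo is a hypothetical opaque
      coercion, and optTrans is not GHC's opt_trans_rule.  It shows how
      transitivity is pushed inside matching constructors so that a later
      projection has something to select from.

```haskell
-- Toy coercion language; constructor names echo the commit message,
-- but the representation is an assumption made for illustration.
data Co
  = AxCo String            -- some opaque coercion, named for printing
  | TyConAppCo String [Co] -- T c1 .. cn
  | FunCo Co Co            -- c1 -> c2
  | TransCo Co Co          -- c1 ; c2
  deriving (Eq, Show)

-- Push transitivity through matching constructors where possible.
optTrans :: Co -> Co -> Co
optTrans (TyConAppCo t1 cs1) (TyConAppCo t2 cs2)
  | t1 == t2, length cs1 == length cs2
  = TyConAppCo t1 (zipWith optTrans cs1 cs2)
optTrans (FunCo a1 r1) (FunCo a2 r2)
  = FunCo (optTrans a1 a2) (optTrans r1 r2)
optTrans c1 c2 = TransCo c1 c2
```

      In this model (c2 -> c2) ; (c3 -> c4) becomes (c2 ; c3) -> (c2 ; c4),
      which exposes the components directly.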
      
      No behavioural change, but some coercions will get smaller.
  2. 03 May, 2017 7 commits
  3. 02 May, 2017 6 commits
    • Typos in manual and comments · b1aede61
      Gabor Greif authored
    • Fix loss-of-SpecConstr bug · 9e47dc45
      Simon Peyton Jones authored
      This bug, reported in Trac #13623, has been present since
      
        commit b8b3e30a
        Author: Edward Z. Yang <ezyang@cs.stanford.edu>
        Date:   Fri Jun 24 11:03:47 2016 -0700
      
            Axe RecFlag on TyCons.
      
      SpecConstr tries not to specialise indefinitely, and had a
      limit (see Note [Limit recursive specialisation]) that made
      use of info about whether or not a data constructor was
      "recursive".  This info vanished in the above commit, making
      the limit fire much more often -- and indeed it fired in this
      test case, in a situation where specialisation is /highly/
      desirable.
      
      I refactored the test, to look instead at the number of
      iterations of the loop of "and now specialise calls that
      arise from the specialisation".  Actually less code, and
      more robust.
      
      I also added record field names to a couple of constructors,
      and renamed RuleInfo to SpecInfo.
    • Fix a small Float-Out bug · ff239787
      Simon Peyton Jones authored
      The float-out pass uses a heuristic based on strictness info.
      But it was getting the strictness info mis-aligned; I'd forgotten
      that strictness flags only apply to /value/ arguments.
      
      This patch fixes it.  It has some surprising effects!
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
              integer          -0.1%     +9.9%     +0.2%     +0.2%     +0.0%
                 lcss          +0.0%     +0.0%    -11.9%    -11.9%     +0.0%
               queens          -0.2%    +29.0%      0.02      0.02     +0.0%
               simple          -0.1%    -22.6%    -21.7%    -21.7%     -3.6%
             treejoin          +0.0%     +0.0%    -12.3%    -12.6%     +0.0%
      --------------------------------------------------------------------------------
                  Min          -0.2%    -22.6%    -21.7%    -21.7%    -10.0%
                  Max          +3.3%    +29.0%    +19.2%    +19.2%    +50.0%
       Geometric Mean          +0.0%     +0.1%     -2.1%     -2.1%     +0.2%
      
      The 'queens' and 'integer' allocation regressions are because, just
      before let-floating, we get
          \v -> foldr k z (case x of I# y -> build ..y..)
      
      Because of Note [Case MFEs] we don't float the build; so fusion
      happens.  This increases allocation in queens because the build
      isn't shared; but actually runtime improves solidly.  Situation
      is similar in integer, although I think runtime gets a bit worse.
      
      Before this patch, foldr's type arguments threw the strictness
      info out of alignment, so the entire (case x ...) was floated;
      fusion failed, but sharing happened.
      
      This is all very artificial-benchmark-ish so I'm not losing sleep
      over it.
      
      I did see some runtime numbers increase, but I think it's noise;
      the difference went away when I tried them one by one afterwards.
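
      The alignment problem in this message (strictness flags apply only
      to /value/ arguments) can be sketched as follows.  The Arg type and
      both functions are hypothetical and are not taken from the float-out
      pass; the only fact relied on is that foldr takes two type arguments
      and is strict in its list argument.

```haskell
-- An argument is either a type argument or a value argument.
data Arg = TypeArg String | ValArg String deriving (Eq, Show)

-- Buggy shape: pair flags with arguments purely by position,
-- so type arguments consume strictness flags.
misaligned :: [Bool] -> [Arg] -> [(Arg, Bool)]
misaligned flags args = zip args (flags ++ repeat False)

-- Aligned shape: only value arguments consume a flag.
aligned :: [Bool] -> [Arg] -> [(Arg, Bool)]
aligned _ [] = []
aligned flags (a@(TypeArg _) : rest) = (a, False) : aligned flags rest
aligned (f : fs) (a@(ValArg _) : rest) = (a, f) : aligned fs rest
aligned [] (a : rest) = (a, False) : aligned [] rest

-- foldr @a @b k z xs, with flags for k, z, xs (strict only in xs).
foldrArgs :: [Arg]
foldrArgs = [TypeArg "a", TypeArg "b", ValArg "k", ValArg "z", ValArg "xs"]

foldrFlags :: [Bool]
foldrFlags = [False, False, True]
```

      Here misaligned foldrFlags foldrArgs marks k as strict and xs as
      lazy, while aligned foldrFlags foldrArgs marks only xs as strict.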
    • Join-point refactoring · 71037b61
      Simon Peyton Jones authored
      This commit has a raft of refactorings that improve the treatment
      of join points.  I wasn't aiming so much to gain performance as
      to make the code simpler.
      
      The two big things are these:
      
      * Make mkDupableCont work for SimplBind as well.  This is simpler than
        I thought and quite neat.  (Luke had already done StrictArg.)  That's
        a win in its own right. But also now /all/ continuations can be made
        dup-able.
      
      * Now that all continuations can be made dup-able, I could simplify
        mkDupableCont to return just one SimplCont, instead of two.
        That really is a worthwhile simplification!  Much easier to think
        about.
      
      Plus a bunch of smaller things:
      
      * Remove the join-arity that had been added to seIdSubst.
        It can be done more simply by putting it in DoneEx, which
        is the only constructor that actually needs it, and now we
        don't need the unsavoury isJoinIdInEnv_maybe.
      
      * Re-order the handling of join points in Simplify, so that we don't need
        the horrible resultTypeOfDupableCont
      
      * Add field names for StrictBind, StrictArg; and use them
      
      * Define SimplMonad.newJoinId, and use it
      
      * Rename the seFloats field of SimplEnv to seLetFloats
      
      Binary sizes seem to go up slightly, but allocations generally
      improve, sometimes significantly.  I don't believe the runtime numbers
      are reliable enough to draw any conclusions.
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs   Runtime   Elapsed  TotalMem
      --------------------------------------------------------------------------------
                event          +1.1%    -12.0%     -0.2%     -0.2%     -8.7%
               fulsom          +1.9%    -11.8%    -10.0%    -10.0%     +5.3%
           last-piece          +2.3%     -1.2%     -1.2%     -1.2%     +0.0%
                 mate          +0.9%     -1.4%     -0.6%     -0.7%     +0.0%
           multiplier          +1.5%     -8.3%      0.17      0.17     +0.0%
               parser          +2.0%     +1.0%      0.04      0.04     +0.0%
              parstof          +1.5%     +0.7%      0.01      0.01     +0.0%
                sched          +1.3%     -6.1%      0.03      0.03     +0.0%
               simple          +1.8%     +1.0%     +9.7%     +9.6%     +0.0%
      --------------------------------------------------------------------------------
                  Min          +0.5%    -12.0%    -10.0%    -10.0%     -8.7%
                  Max          +3.0%     +1.0%    +14.2%    +14.2%    +50.0%
       Geometric Mean          +1.4%     -0.4%     +0.3%     +0.4%     +0.5%
      
      There's also a tests/perf/compiler improvement of 20% allocation in
      T6048.  I think it's because we now generate smaller code.
    • Improve SpecConstr when there are many opportunities · c46a600f
      Simon Peyton Jones authored
      SpecConstr has -fspec-constr-count=N which limits the maximum
      number of specialisations we make for any particular function.
      But until now, if that limit was exceeded we discarded all the
      candidates!  So adding a new specialisation opportunity (by
      adding a new call site, or improving the optimiser) could result
      in less specialisation and worse performance.
      
      This patch instead picks the top N candidates, resulting in
      less brittle behaviour.
      
      See Note [Choosing patterns].
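
      The difference between the two policies can be sketched like this.
      The score function is a hypothetical ranking supplied by the caller;
      the criterion GHC actually uses is described in the Note and is not
      reproduced here.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- Old policy: exceeding the limit discards every candidate.
allOrNothing :: Int -> [a] -> [a]
allOrNothing n cands
  | length cands > n = []
  | otherwise        = cands

-- New policy: keep the n best candidates under a caller-supplied score.
topN :: Ord s => Int -> (a -> s) -> [a] -> [a]
topN n score = take n . sortOn (Down . score)
```

      With a limit of 2 and three candidates, allOrNothing returns nothing
      while topN still returns the two highest-scoring candidates, so one
      extra call site can no longer switch specialisation off entirely.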
    • testsuite: Bump allocations of T3064 · 3746f623
      Ben Gamari authored
      This seems to have regressed due to,
      
          commit 5c602d22
          Author: Reid Barton <rwbarton@gmail.com>
          Date:   Mon May 1 11:17:47 2017 -0400
      
              Avoid excessive space usage from unfoldings in CoreTidy
  4. 01 May, 2017 11 commits
  5. 30 Apr, 2017 1 commit
  6. 29 Apr, 2017 2 commits
  7. 28 Apr, 2017 10 commits
    • Improve code generation for conditionals · 6d14c148
      Simon Peyton Jones authored
      This patch is in preparation for the fix to Trac #13397
      
      The code generator has a special case for
        case tagToEnum (a>#b) of
          False -> e1
          True  -> e2
      
      but it was not doing nearly so well on
        case a>#b of
          DEFAULT -> e1
          1#      -> e2
      
      This patch arranges to behave essentially identically in
      both cases.  In due course we can eliminate the special
      case for tagToEnum#, once we've completed Trac #13397.
      
      The changes are:
      
      * Make CmmSink swizzle the order of a conditional where necessary;
        see Note [Improving conditionals] in CmmSink
      
      * Hack the general case of StgCmmExpr.cgCase so that it uses
        NoGcInAlts for conditionals.  This doesn't seem right, but it's
        the same choice as the tagToEnum version. Without it, code size
        increases a lot (more heap checks).
      
        There's a loose end here.
      
      * Add comments in CmmOpt.cmmMachOpFoldM
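
      For readers who want to see the two source shapes side by side, here
      is a small sketch using the primops from GHC.Exts.  The function
      names viaBool and viaInt are made up, and the strings stand in for
      the branches e1 and e2; nothing here shows what the code generator
      emits, only that the two forms compute the same result.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), tagToEnum#, (>#))

-- Shape 1: convert the Int# comparison result to Bool, then branch.
viaBool :: Int -> Int -> String
viaBool (I# a) (I# b) =
  case (tagToEnum# (a ># b) :: Bool) of
    False -> "e1"
    True  -> "e2"

-- Shape 2: branch directly on the Int# comparison result.
viaInt :: Int -> Int -> String
viaInt (I# a) (I# b) =
  case a ># b of
    1# -> "e2"
    _  -> "e1"
```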
    • Re-engineer caseRules to add tagToEnum/dataToTag · 193664d4
      Simon Peyton Jones authored
      See Note [Scrutinee Constant Folding] in SimplUtils
      
      * Add cases for tagToEnum and dataToTag. This is the main new
        bit.  It allows the simplifier to remove the pervasive uses
        of     case tagToEnum (a > b) of
                  False -> e1
                  True  -> e2
        and replace it by the simpler
               case a > b of
                  DEFAULT -> e1
                  1#      -> e2
        See Note [caseRules for tagToEnum]
        and Note [caseRules for dataToTag] in PrelRules.
      
      * This required some changes to the API of caseRules, and hence
        to code in SimplUtils.  See Note [Scrutinee Constant Folding]
        in SimplUtils.
      
      * Avoid duplication of work in the (unusual) case of
           case BIG + 3# of b
             DEFAULT -> e1
             6#      -> e2
      
        Previously we got
           case BIG of
             DEFAULT -> let b = BIG + 3# in e1
             3#      -> let b = 6#       in e2
      
        Now we get
           case BIG of b'
             DEFAULT -> let b = b' + 3# in e1
             3#      -> let b = 6#      in e2
      
      * Avoid duplicated code in caseRules
      
      A knock-on refactoring:
      
      * Move Note [Word/Int underflow/overflow] to Literal, as
        documentation to accompany mkMachIntWrap etc; and get
        rid of PrelRules.intResult' in favour of mkMachIntWrap
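
      The literal-shifting part of the (BIG + 3#) example can be modelled
      on a toy expression type.  This is a sketch under strong
      assumptions: Scrut, AltCon and shiftCase are hypothetical, literals
      are unbounded Integers (so the wrap-around covered by Note [Word/Int
      underflow/overflow] is ignored), and the re-binding of the case
      binder b is left out.

```haskell
-- Toy scrutinee: a variable, or a scrutinee plus a literal.
data Scrut = Var String | AddLit Scrut Integer deriving (Eq, Show)

-- Toy alternative constructors: DEFAULT or a literal pattern.
data AltCon = Default | Lit Integer deriving (Eq, Show)

-- Rewrite `case (s + k) of alts` into `case s of alts'`, subtracting k
-- from every literal pattern so each branch is still taken for the
-- same values of s.
shiftCase :: Scrut -> [(AltCon, rhs)] -> (Scrut, [(AltCon, rhs)])
shiftCase (AddLit s k) alts = shiftCase s [ (shift c, rhs) | (c, rhs) <- alts ]
  where
    shift Default = Default
    shift (Lit n) = Lit (n - k)
shiftCase s alts = (s, alts)
```

      Applied to a scrutinee BIG + 3 with alternatives DEFAULT and 6, the
      model scrutinises BIG alone and the literal alternative becomes 3.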
    • Move dataConTagZ to DataCon · 1cae73aa
      Simon Peyton Jones authored
      Just a simple refactoring to remove duplication
    • nativeGen: Use SSE2 SQRT instruction · 9ac22183
      Ben Gamari authored
      Reviewers: austin, dfeuer
      
      Subscribers: dfeuer, rwbarton, thomie
      
      GHC Trac Issues: #13629
      
      Differential Revision: https://phabricator.haskell.org/D3508
    • CSE: Fix cut and paste error · 9f9b90f1
      Ben Gamari authored
      extendCSRecEnv took the map to be extended from cs_map instead of
      cs_rec_map.  Oops!
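
      This kind of slip is easy to reproduce in a record update.  The
      sketch below borrows the field names from the message, but the
      CSEnv type and the association-list representation are assumptions
      made for illustration; the real environment is different.

```haskell
-- Hypothetical environment with two separate maps, as association lists.
data CSEnv = CSEnv
  { cs_map     :: [(String, String)]
  , cs_rec_map :: [(String, String)]
  } deriving (Eq, Show)

-- Buggy shape: writes cs_rec_map but extends the contents of cs_map,
-- silently dropping whatever cs_rec_map held before.
extendRecBuggy :: CSEnv -> String -> String -> CSEnv
extendRecBuggy env k v = env { cs_rec_map = (k, v) : cs_map env }

-- Intended shape: extend cs_rec_map with its own previous contents.
extendRecFixed :: CSEnv -> String -> String -> CSEnv
extendRecFixed env k v = env { cs_rec_map = (k, v) : cs_rec_map env }
```

      Both versions typecheck because the two fields have the same type,
      which is why the mistake went unnoticed by the compiler.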
      
      Test Plan: Validate
      
      Reviewers: simonpj, austin
      
      Subscribers: rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3510
    • Use memcpy in cloneArray · 228d4670
      Ben Gamari authored
      While looking at #13615 I noticed that there was this strange open-coded
      memcpy in the definition of the cloneArray macro. I don't see why this
      should be preferable to memcpy.
      
      Test Plan: Validate, particularly focusing on array operations
      
      Reviewers: simonmar, tibbe, austin, alexbiehl
      
      Reviewed By: tibbe, alexbiehl
      
      Subscribers: alexbiehl, rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3504
    • Make the tyvars in TH-reified data family instances uniform · b2c38d6b
      Ryan Scott authored
      It turns out we were using two different sets of type variables when
      reifying data family instances in Template Haskell. We were using the
      tyvars quantifying over the instance itself for the LHS, but using the
      tyvars quantifying over the data family instance constructor for the
      RHS. This commit uses the instance tyvars for both the LHS and the RHS,
      fixing #13618.
      
      Test Plan: make test TEST=T13618
      
      Reviewers: goldfire, austin, bgamari
      
      Reviewed By: goldfire, bgamari
      
      Subscribers: rwbarton, thomie
      
      GHC Trac Issues: #13618
      
      Differential Revision: https://phabricator.haskell.org/D3505
    • Add regression test for #12104 · 69b9b853
      Ryan Scott authored
      Commit 2f9f1f86
      (#13487) fixes #12104 as well. This adds a regression test for the
      program reported in #12104 to keep it fixed.
      
      Test Plan: make test TEST=T12104
      
      Reviewers: bgamari, austin
      
      Reviewed By: bgamari
      
      Subscribers: rwbarton, thomie
      
      GHC Trac Issues: #12104
      
      Differential Revision: https://phabricator.haskell.org/D3495
    • get-win32-tarballs: Grab perl tarball from haskell.org, not GitHub · ba597c1d
      Ben Gamari authored
      Reviewers: austin, dfeuer
      
      Reviewed By: dfeuer
      
      Subscribers: Phyx, rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3509
    • Be a bit more eager to inline in a strict context · 29d88ee1
      Simon Peyton Jones authored
      If we see f (g x), and f is strict, we want to be a bit more eager to
      inline g, because it may well expose an eval (on x perhaps) that can
      be eliminated or shared.
      
      I saw this in nofib boyer2, function RewriteFuns.onewayunify1.  It
      showed up as a consequence of the preceding patch that makes the
      simplifier do less work (Trac #13379).  We had
      
         f d (g x)
      
      where f was a class-op. Previously we simplified both d and
      (g x) with a RuleArgCtxt (making g a bit more eager to inline).
      But now we simplify only d that way, then fire the rule, and
      only then simplify (g x).  Firing the rule produces a strict
      function, so we want to make a strict function encourage
      inlining a bit.