- 04 Nov, 2009 1 commit
-
-
rl@cse.unsw.edu.au authored
The patch adds this rule:

    seq (x `cast` co) y = seq x y

This is subject to the usual treatment of seq rules. It also makes them match more often: it will rewrite

    seq (f x `cast` co) y = seq (f x) y

and allow a seq rule for f to match.
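The rewrite above can be sketched on a toy expression type. This is only an illustration: `Expr`, `dropSeqCasts` and `stripCasts` are hypothetical names, the coercion is kept as an opaque string, and none of this is GHC's real Core representation or simplifier code.

```haskell
-- A toy expression language; NOT GHC's Core type.
data Expr
  = Var String
  | App Expr Expr
  | Cast Expr String   -- e `cast` co, with the coercion as an opaque name
  | Seq Expr Expr      -- seq e1 e2
  deriving (Eq, Show)

-- Apply  seq (x `cast` co) y = seq x y  everywhere, bottom-up.
dropSeqCasts :: Expr -> Expr
dropSeqCasts (Var v)     = Var v
dropSeqCasts (App f a)   = App (dropSeqCasts f) (dropSeqCasts a)
dropSeqCasts (Cast e co) = Cast (dropSeqCasts e) co
dropSeqCasts (Seq e1 e2) = Seq (stripCasts (dropSeqCasts e1)) (dropSeqCasts e2)

-- Remove any casts wrapped directly around an expression.
stripCasts :: Expr -> Expr
stripCasts (Cast e _) = stripCasts e
stripCasts e          = e
```

Once the cast is gone, the first argument of `Seq` is a plain application of `f`, which is the shape a user-written seq rule for `f` would need in order to match.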
-
- 30 Oct, 2009 2 commits
-
-
simonpj@microsoft.com authored
-
simonpj@microsoft.com authored
* Remove trace from optCoercion
* Use simplCoercion for type arguments in the Simplifier (because they might be coercions)
-
- 29 Oct, 2009 1 commit
-
-
simonpj@microsoft.com authored
This patch has been a long time in gestation and has, as a result, accumulated some extra bits and bobs that are only loosely related. I separated the bits that are easy to split off, but the rest comes as one big patch, I'm afraid.

Note that:
* It comes together with a patch to the 'base' library
* Interface file formats change slightly, so you need to recompile all libraries

The patch is mainly a giant tidy-up, driven in part by the particular stresses of the Data Parallel Haskell project. I don't expect a big performance win for random programs. Still, here are the nofib results, relative to the state of affairs without the patch:

Program             Size   Allocs  Runtime  Elapsed
--------------------------------------------------------------------------------
Min               -12.7%   -14.5%   -17.5%   -17.8%
Max                +4.7%   +10.9%    +9.1%    +8.4%
Geometric Mean     +0.9%    -0.1%    -5.6%    -7.3%

The +10.9% allocation outlier is rewrite, which happens to have a very delicate optimisation opportunity involving an interaction of CSE and inlining (see nofib/Simon-nofib-notes). The fact that the 'before' case found the optimisation is somewhat accidental.

Runtimes seem to go down, but I never know whether to really trust this number. Binary sizes wobble a bit, but nothing drastic.

The Main Ideas are as follows.

InlineRules
~~~~~~~~~~~
When you say
    {-# INLINE f #-}
    f x = <rhs>
you intend that calls (f e) are replaced by <rhs>[e/x]. So we should capture (\x.<rhs>) in the Unfolding of 'f', and never meddle with it. Meanwhile, we can optimise <rhs> to our heart's content, leaving the original unfolding intact in the Unfolding of 'f'.

So the representation of an Unfolding has changed quite a bit (see CoreSyn). An INLINE pragma gives rise to an InlineRule unfolding.

Moreover, it's only used when 'f' is applied to the specified number of arguments; that is, the number of arguments on the LHS of the '=' sign in the original source definition. For example, (.) is now defined in the libraries like this
    {-# INLINE (.) #-}
    (.) f g = \x -> f (g x)
so that it'll inline when applied to two arguments. If 'x' appeared on the left, thus
    (.) f g x = f (g x)
it'd only inline when applied to three arguments. This slightly-experimental change was requested by Roman, but it seems to make sense.

Other associated changes

* Moving the deck chairs in DsBinds, which processes the INLINE pragmas

* In the old system an INLINE pragma made the RHS look like
      (Note InlineMe <rhs>)
  The Note switched off optimisation in <rhs>. But it was quite fragile in corner cases. The new system is more robust, I believe. In any case, the InlineMe note has disappeared

* The workerInfo of an Id has also been combined into its Unfolding, so it's no longer a separate field of the IdInfo.

* Many changes in CoreUnfold, esp in callSiteInline, which is the critical function that decides which function to inline. Lots of comments added!

* exprIsConApp_maybe has moved to CoreUnfold, since it's so strongly associated with "does this expression unfold to a constructor application". It can now do some limited beta reduction too, which Roman found was important.

Instance declarations
~~~~~~~~~~~~~~~~~~~~~
It's always been tricky to get the dfuns generated from instance declarations to work out well. This is particularly important in the Data Parallel Haskell project, and I'm now on my fourth attempt, more or less.

There is a detailed description in TcInstDcls, particularly in Note [How instance declarations are translated]. Roughly speaking we now generate a top-level helper function for every method definition in an instance declaration, so that the dfun takes a particularly stylised form:
    dfun a d1 d2 = MkD (op1 a d1 d2) (op2 a d1 d2) ...etc...

In fact, it's *so* stylised that we never need to unfold a dfun. Instead ClassOps have a special rewrite rule that allows us to short-cut dictionary selection.
Suppose
    dfun :: Ord a -> Ord [a]
    d    :: Ord a
Then
    compare (dfun a d)  -->  compare_list a d
in one rewrite, without first inlining the 'compare' selector and the body of the dfun.

To support this
a) ClassOps have a BuiltInRule (see MkId.dictSelRule)
b) DFuns have a special form of unfolding (CoreSyn.DFunUnfolding) which is exploited in CoreUnfold.exprIsConApp_maybe

Implementing all this required a root-and-branch rework of TcInstDcls and bits of TcClassDcl.

Default methods
~~~~~~~~~~~~~~~
If you give an INLINE pragma to a default method, it should be just as if you'd written out that code in each instance declaration, including the INLINE pragma. I think that it now *is* so. As a result, library code can be simpler; less duplication.

The CONLIKE pragma
~~~~~~~~~~~~~~~~~~
In the DPH project, Roman found cases where he had
    p n k = let x = replicate n k
            in ...(f x)...(g x)....
    {-# RULE f (replicate x) = f_rep x #-}

Normally the RULE would not fire, because doing so involves (in effect) duplicating the redex (replicate n k). A new experimental modifier to the INLINE pragma, {-# INLINE CONLIKE replicate #-}, allows you to tell GHC to be prepared to duplicate a call of this function if it allows a RULE to fire.

See Note [CONLIKE pragma] in BasicTypes

Join points
~~~~~~~~~~~
See Note [Case binders and join points] in Simplify

Other refactoring
~~~~~~~~~~~~~~~~~
* I moved endPass from CoreLint to CoreMonad, with associated jigglings

* Better pretty-printing of Core

* The top-level RULES (ones that are not rules for locally-defined things) are now substituted on every simplifier iteration. I'm not sure how we got away without doing this before. This entails a bit more plumbing in SimplCore.

* The necessary stuff to serialise and deserialise the new info across interface files.
* Something about bottoming floats in SetLevels Note [Bottoming floats]

* substUnfolding has moved from SimplEnv to CoreSubst, where it belongs

--------------------------------------------------------------------------------
Program             Size   Allocs  Runtime  Elapsed
--------------------------------------------------------------------------------
anna               +2.4%    -0.5%     0.16     0.17
ansi               +2.6%    -0.1%     0.00     0.00
atom               -3.8%    -0.0%    -1.0%    -2.5%
awards             +3.0%    +0.7%     0.00     0.00
banner             +3.3%    -0.0%     0.00     0.00
bernouilli         +2.7%    +0.0%    -4.6%    -6.9%
boyer              +2.6%    +0.0%     0.06     0.07
boyer2             +4.4%    +0.2%     0.01     0.01
bspt               +3.2%    +9.6%     0.02     0.02
cacheprof          +1.4%    -1.0%   -12.2%   -13.6%
calendar           +2.7%    -1.7%     0.00     0.00
cichelli           +3.7%    -0.0%     0.13     0.14
circsim            +3.3%    +0.0%    -2.3%    -9.9%
clausify           +2.7%    +0.0%     0.05     0.06
comp_lab_zift      +2.6%    -0.3%    -7.2%    -7.9%
compress           +3.3%    +0.0%    -8.5%    -9.6%
compress2          +3.6%    +0.0%   -15.1%   -17.8%
constraints        +2.7%    -0.6%   -10.0%   -10.7%
cryptarithm1       +4.5%    +0.0%    -4.7%    -5.7%
cryptarithm2       +4.3%   -14.5%     0.02     0.02
cse                +4.4%    -0.0%     0.00     0.00
eliza              +2.8%    -0.1%     0.00     0.00
event              +2.6%    -0.0%    -4.9%    -4.4%
exp3_8             +2.8%    +0.0%    -4.5%    -9.5%
expert             +2.7%    +0.3%     0.00     0.00
fem                -2.0%    +0.6%     0.04     0.04
fft                -6.0%    +1.8%     0.05     0.06
fft2               -4.8%    +2.7%     0.13     0.14
fibheaps           +2.6%    -0.6%     0.05     0.05
fish               +4.1%    +0.0%     0.03     0.04
fluid              -2.1%    -0.2%     0.01     0.01
fulsom             -4.8%    +9.2%    +9.1%    +8.4%
gamteb             -7.1%    -1.3%     0.10     0.11
gcd                +2.7%    +0.0%     0.05     0.05
gen_regexps        +3.9%    -0.0%     0.00     0.00
genfft             +2.7%    -0.1%     0.05     0.06
gg                 -2.7%    -0.1%     0.02     0.02
grep               +3.2%    -0.0%     0.00     0.00
hidden             -0.5%    +0.0%   -11.9%   -13.3%
hpg                -3.0%    -1.8%    +0.0%    -2.4%
ida                +2.6%    -1.2%     0.17    -9.0%
infer              +1.7%    -0.8%     0.08     0.09
integer            +2.5%    -0.0%    -2.6%    -2.2%
integrate          -5.0%    +0.0%    -1.3%    -2.9%
knights            +4.3%    -1.5%     0.01     0.01
lcss               +2.5%    -0.1%    -7.5%    -9.4%
life               +4.2%    +0.0%    -3.1%    -3.3%
lift               +2.4%    -3.2%     0.00     0.00
listcompr          +4.0%    -1.6%     0.16     0.17
listcopy           +4.0%    -1.4%     0.17     0.18
maillist           +4.1%    +0.1%     0.09     0.14
mandel             +2.9%    +0.0%     0.11     0.12
mandel2            +4.7%    +0.0%     0.01     0.01
minimax            +3.8%    -0.0%     0.00     0.00
mkhprog            +3.2%    -4.2%     0.00     0.00
multiplier         +2.5%    -0.4%    +0.7%    -1.3%
nucleic2           -9.3%    +0.0%     0.10     0.10
para               +2.9%    +0.1%    -0.7%    -1.2%
paraffins         -10.4%    +0.0%     0.20    -1.9%
parser             +3.1%    -0.0%     0.05     0.05
parstof            +1.9%    -0.0%     0.00     0.01
pic                -2.8%    -0.8%     0.01     0.02
power              +2.1%    +0.1%    -8.5%    -9.0%
pretty            -12.7%    +0.1%     0.00     0.00
primes             +2.8%    +0.0%     0.11     0.11
primetest          +2.5%    -0.0%    -2.1%    -3.1%
prolog             +3.2%    -7.2%     0.00     0.00
puzzle             +4.1%    +0.0%    -3.5%    -8.0%
queens             +2.8%    +0.0%     0.03     0.03
reptile            +2.2%    -2.2%     0.02     0.02
rewrite            +3.1%   +10.9%     0.03     0.03
rfib               -5.2%    +0.2%     0.03     0.03
rsa                +2.6%    +0.0%     0.05     0.06
scc                +4.6%    +0.4%     0.00     0.00
sched              +2.7%    +0.1%     0.03     0.03
scs                -2.6%    -0.9%    -9.6%   -11.6%
simple             -4.0%    +0.4%   -14.6%   -14.9%
solid              -5.6%    -0.6%    -9.3%   -14.3%
sorting            +3.8%    +0.0%     0.00     0.00
sphere             -3.6%    +8.5%     0.15     0.16
symalg             -1.3%    +0.2%     0.03     0.03
tak                +2.7%    +0.0%     0.02     0.02
transform          +2.0%    -2.9%    -8.0%    -8.8%
treejoin           +3.1%    +0.0%   -17.5%   -17.8%
typecheck          +2.9%    -0.3%    -4.6%    -6.6%
veritas            +3.9%    -0.3%     0.00     0.00
wang               -6.2%    +0.0%     0.18    -9.8%
wave4main         -10.3%    +2.6%    -2.1%    -2.3%
wheel-sieve1       +2.7%    -0.0%    +0.3%    -0.6%
wheel-sieve2       +2.7%    +0.0%    -3.7%    -7.5%
x2n1               -4.1%    +0.1%     0.03     0.04
--------------------------------------------------------------------------------
Min               -12.7%   -14.5%   -17.5%   -17.8%
Max                +4.7%   +10.9%    +9.1%    +8.4%
Geometric Mean     +0.9%    -0.1%    -5.6%    -7.3%
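The stylised dfun shape and the dictionary-selection short-cut can be modelled with explicit dictionaries. This is a hand-written sketch, not what GHC generates: `OrdD`, `cmp`, `lte`, `ordInt`, `cmpList`, `lteList` and `dfunOrdList` are all hypothetical names standing in for the class dictionary, its selectors, the per-method helpers and the dfun.

```haskell
-- Explicit dictionary for a cut-down Ord class (hypothetical names).
data OrdD a = MkOrdD
  { cmp :: a -> a -> Ordering
  , lte :: a -> a -> Bool
  }

ordInt :: OrdD Int
ordInt = MkOrdD compare (<=)

-- One top-level helper per method of the list instance.
cmpList :: OrdD a -> [a] -> [a] -> Ordering
cmpList _ []     []     = EQ
cmpList _ []     (_:_)  = LT
cmpList _ (_:_)  []     = GT
cmpList d (x:xs) (y:ys) = case cmp d x y of
  EQ    -> cmpList d xs ys
  other -> other

lteList :: OrdD a -> [a] -> [a] -> Bool
lteList d xs ys = cmpList d xs ys /= GT

-- The dictionary function only packages the helpers, so selecting a
-- method from its result can be rewritten straight to the helper.
dfunOrdList :: OrdD a -> OrdD [a]
dfunOrdList d = MkOrdD (cmpList d) (lteList d)

-- Before the rewrite: select 'cmp' out of the constructed dictionary.
viaDictionary :: [Int] -> [Int] -> Ordering
viaDictionary = cmp (dfunOrdList ordInt)

-- After the rewrite: call the helper directly.
viaShortcut :: [Int] -> [Int] -> Ordering
viaShortcut = cmpList ordInt
```

`viaDictionary` and `viaShortcut` compute the same result; the point of the stylised form is that the second can be reached from the first in a single step, without unfolding the body of `dfunOrdList`.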
-
- 26 Oct, 2009 1 commit
-
-
simonpj@microsoft.com authored
Coercion terms can get big (see Trac #2859 for example), so this patch puts the infrastructure in place to optimise them:

* Adds Coercion.optCoercion :: Coercion -> Coercion
* Calls optCoercion in Simplify.lhs

The optimiser doesn't work right at the moment, so it is commented out, but Tom is going to work on it.
-
- 11 Sep, 2009 1 commit
-
-
simonpj@microsoft.com authored
This patch fixes test failures for the profiling way for drv001. The problem was that the arity of a function was decreasing during "optimisation" because of interaction with SCC annotations. In particular

    f = /\a. scc "f" (h x)     -- where h had arity 2

and h gets inlined, led to

    f = /\a. scc "f" let v = scc "f" x in \y. <blah>

Two main changes:

1. exprIsTrivial now says True for (scc "f" x)
   See Note [SCCs are trivial] in CoreUtils

2. The simplifier eliminates nested pushing of the same cost centre:
       scc "f" (...(scc "f" e)...)  ==>  scc "f" (...e...)
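Change 2 can be sketched on a toy expression type. Everything here is hypothetical (`Expr`, `dropNestedSCC`); it is not the simplifier's code. As a conservative assumption of this sketch only, the enclosing cost centre is forgotten when going under a lambda.

```haskell
-- A toy expression language with cost-centre annotations; NOT GHC's Core.
data Expr
  = Var String
  | App Expr Expr
  | Lam String Expr
  | SCC String Expr    -- scc "name" e
  deriving (Eq, Show)

-- Drop an inner scc when the same cost centre is already being pushed
-- by an enclosing scc.
dropNestedSCC :: Expr -> Expr
dropNestedSCC = go Nothing
  where
    go _  (Var v)   = Var v
    go cc (App f a) = App (go cc f) (go cc a)
    -- Assumption of this sketch: do not look through lambdas.
    go _  (Lam x b) = Lam x (go Nothing b)
    go cc (SCC c e)
      | cc == Just c = go cc e
      | otherwise    = SCC c (go (Just c) e)
```

Applied to the term `scc "f" (h (scc "f" x))`, the inner annotation disappears and only the outer one remains.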
-
- 18 Jun, 2009 1 commit
-
-
t-peterj@microsoft.com authored
-
- 03 Jun, 2009 1 commit
-
-
simonpj@microsoft.com authored
Roman found situations where he had

    case (f n) of _ -> e

where he knew that f (which was strict in n) would terminate if n did. Notice that the result of (f n) is discarded. So it makes sense to transform to

    case n of _ -> e

Rather than attempt some general analysis to support this, I've added enough support that you can do this using a rewrite rule:

    RULE "f/seq" forall n. seq (f n) e = seq n e

You write that rule. When GHC sees a case expression that discards its result, it mentally transforms it to a call to 'seq' and looks for a RULE. (This is done in Simplify.rebuildCase.) As usual, the correctness of the rule is up to you.

This patch implements the extra stuff. I have not documented it explicitly in the user manual yet... let's see how useful it is first.

The patch looks bigger than it is, because

a) Comments; see esp MkId Note [seqId magic]

b) Some refactoring. Notably, I moved the special desugaring for seq from MkCore back into DsUtils where it properly belongs. (It's really a desugaring thing, not a CoreSyn invariant.)

c) Annoyingly, in a RULE left-hand side we need to be careful that the magical desugaring done in MkId Note [seqId magic] item (c) is *not* done on the LHS of a rule. Or rather, we arrange to un-do it, in DsBinds.decomposeRuleLhs.
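In source Haskell, such a rule might be written as below. This is a sketch under assumptions: `f` and `discardResult` are made-up names, the rule quantifies over `e` as well as `n`, and whether the rule fires depends on the GHC version and optimisation level. Note that a source-level `case f n of _ -> e` does not force `f n`, so the sketch uses `seq` to get the strict Core case the message is talking about.

```haskell
-- A hypothetical function that is strict in its argument and terminates
-- whenever its argument does, so the rule below preserves meaning.
{-# NOINLINE f #-}
f :: Int -> Int
f n = n * n + 1

-- The programmer, not GHC, is responsible for this rule being correct.
{-# RULES "f/seq" forall n e. seq (f n) e = seq n e #-}

-- The result of (f n) is discarded; only its evaluation matters.
discardResult :: Int -> String -> String
discardResult n e = f n `seq` e
```

Because the rule is semantics-preserving here, `discardResult` returns its second argument whether or not the rule fires; the only intended difference is that the call to `f` can be skipped.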
-
- 02 Apr, 2009 1 commit
-
-
simonpj@microsoft.com authored
This patch fixes a rather obscure bug, whereby it's possible for (case C a b of <alts>) to have alternatives that do not include (C a b)! See Note [Unreachable code] in CoreUtils.
-
- 25 Mar, 2009 1 commit
-
-
simonpj@microsoft.com authored
It turns out that, as a result of a change I made a few months ago to the representation of SimplCont, it's easy to solve the optimisation challenge posed by Trac #3116. Hurrah. Extensive comments in Note [Duplicating StrictArg].
-
- 23 Mar, 2009 1 commit
-
-
simonpj@microsoft.com authored
This patch makes the specialiser propagate arities a bit more eagerly, which avoids a spurious warning in the simplifier. See Note [Arity decrease] in Simplify.lhs
-
- 18 Mar, 2009 1 commit
-
-
simonpj@microsoft.com authored
This patch adds an optional CONLIKE modifier to INLINE/NOINLINE pragmas,

    {-# NOINLINE CONLIKE [1] f #-}

The effect is to allow applications of 'f' to be expanded in a potential rule match. Example

    {-# RULE "r/f" forall v. r (f v) = f (v+1) #-}

Consider the term

    let x = f v in ..x...x...(r x)...

Normally the (r x) would not match the rule, because GHC would be scared about duplicating the redex (f v). However the CONLIKE modifier says to treat 'f' like a constructor in this situation, and "look through" the unfolding for x. So (r x) fires, yielding (f (v+1)).

The main changes are:

- Syntax

- The inlinePragInfo field of an IdInfo has a RuleMatchInfo component, which records whether or not the Id is CONLIKE. Of course, this needs to be serialised in interface files too.

- The occurrence analyser (OccurAnal) and simplifier (Simplify) treat CONLIKE things like constructors, by ANF-ing them

- New function CoreUtils.exprIsExpandable is like exprIsCheap, but additionally spots applications of CONLIKE functions

- A CoreUnfolding has a field that caches exprIsExpandable

- The rule matcher consults this field. See Note [Expanding variables] in Rules.lhs.

On the way I fixed a lurking variable bug in the way variables are expanded. See Note [Do not expand locally-bound variables] in Rules.lhs. I also did a bit of reformatting and refactoring in Rules.lhs, so the module has more lines changed than are really different.
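A small module in the shape of that example might look as follows. It is a sketch under assumptions: `Box`, `f`, `r` and `shared` are invented for illustration, the pragma is written `INLINE CONLIKE [1]` rather than the `NOINLINE` form shown in the message, and `f` and `r` were chosen so that the rule is semantics-preserving regardless of whether it fires.

```haskell
newtype Box = Box Int deriving (Eq, Show)

-- CONLIKE: let rule matching look through a let-bound call to 'f',
-- even though that may duplicate the call.
{-# INLINE CONLIKE [1] f #-}
f :: Int -> Box
f v = Box v

{-# NOINLINE r #-}
r :: Box -> Box
r (Box x) = Box (x + 1)

-- Valid for these definitions: r (f v) = Box (v + 1) = f (v + 1).
{-# RULES "r/f" forall v. r (f v) = f (v + 1) #-}

-- 'x' is shared, so without CONLIKE the occurrence (r x) would not be
-- matched against the rule's left-hand side r (f v).
shared :: Int -> (Box, Box)
shared v = let x = f v in (x, r x)
```

Compiling with `-O -ddump-rule-firings` is one way to check whether "r/f" fires for a particular compiler version; the result of `shared` is the same either way.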
-
- 13 Jan, 2009 1 commit
-
-
simonpj@microsoft.com authored
This patch does two main things

a) Rewrite most of CorePrep to be much easier to understand (I hope!). The invariants established by CorePrep are now written out, and the code is more perspicuous. It is surprisingly hard to get right, and the old code had become quite incomprehensible.

b) Rewrite the eta-expander so that it does a bit of simplifying on-the-fly, and thereby guarantees to maintain the CorePrep invariants. This makes it much easier to use from CorePrep, and is a generally good thing anyway.

A couple of pieces of re-structuring:

* I moved the eta-expander and arity analysis stuff into a new module coreSyn/CoreArity.

  Max will find that the type CoreArity.EtaInfo looks strangely familiar.

* I moved a bunch of comments from Simplify to OccurAnal; that's why it looks as though there's a lot of lines changed in those modules.

On the way I fixed various things

- Function arguments are eta expanded
      f (map g)  ===>  let s = \x. map g x in f s

- Trac #2368

The result is a modest performance gain, I think mainly due to the first of these changes:

--------------------------------------------------------------------------------
Program             Size   Allocs  Runtime  Elapsed
--------------------------------------------------------------------------------
Min                -1.0%   -17.4%   -19.1%   -46.4%
Max                +0.3%    +0.5%    +5.4%   +53.8%
Geometric Mean     -0.1%    -0.3%    -7.0%   -10.2%
-
- 16 Dec, 2008 2 commits
-
-
Simon Marlow authored
rolling back:

Fri Dec 5 10:51:59 GMT 2008 simonpj@microsoft.com
* Add -fpass-case-bndr-to-join-points

See Note [Passing the case binder to join points] in Simplify.lhs

The default now is *not* to pass the case binder. There are some nofib results with the above note; the effect is almost always negligible.

I don't expect this flag to be used by users (hence no docs). It's just there to let me try the performance effects of switching on and off.

    M ./compiler/main/StaticFlagParser.hs +1
    M ./compiler/main/StaticFlags.hs +4
    M ./compiler/simplCore/Simplify.lhs -14 +73
-
Simon Marlow authored
rolling back:

Fri Dec 5 16:54:00 GMT 2008 simonpj@microsoft.com
* Completely new treatment of INLINE pragmas (big patch)

This is a major patch, which changes the way INLINE pragmas work. Although lots of files are touched, the net is only +21 lines of code -- and I bet that most of those are comments!

HEADS UP: interface file format has changed, so you'll need to recompile everything.

There is not much effect on overall performance for nofib, probably because those programs don't make heavy use of INLINE pragmas.

Program             Size   Allocs  Runtime  Elapsed
Min               -11.3%    -6.9%    -9.2%    -8.2%
Max                -0.1%    +4.6%    +7.5%    +8.9%
Geometric Mean     -2.2%    -0.2%    -1.0%    -0.8%

(The +4.6% on allocs is cichelli; see other patch relating to -fpass-case-bndr-to-join-points.)

The old INLINE system
~~~~~~~~~~~~~~~~~~~~~
The old system worked like this. A function with an INLINE pragma got a right-hand side which looked like
    f = __inline_me__ (\xy. e)
The __inline_me__ part was an InlineNote, and was treated specially in various ways. Notably, the simplifier didn't inline inside an __inline_me__ note.

As a result, the code for f itself was pretty crappy. That matters if you say (map f xs), because then you execute the code for f, rather than inlining a copy at the call site.

The new story: InlineRules
~~~~~~~~~~~~~~~~~~~~~~~~~~
The new system removes the InlineMe Note altogether. Instead there is a new constructor InlineRule in CoreSyn.Unfolding. This is a bit like a RULE, in that it remembers the template to be inlined inside the InlineRule. No simplification or inlining is done on an InlineRule, just like RULEs.

An Id can have an InlineRule *or* a CoreUnfolding (since these are two constructors from Unfolding). The simplifier treats them differently:

- An InlineRule has the substitution applied (like RULES) but is otherwise left undisturbed.

- A CoreUnfolding is updated with the new RHS of the definition, on each iteration of the simplifier.

An InlineRule fires regardless of size, but *only* when the function is applied to enough arguments. The "arity" of the rule is specified (by the programmer) as the number of args on the LHS of the "=". So it makes a difference whether you say
    {-# INLINE f #-}
    f x = \y -> e
or
    f x y = e
This is one of the big new features that InlineRule gives us, and it is one that Roman really wanted.

In contrast, a CoreUnfolding can fire when it is applied to fewer args than the function has lambdas, provided the result is small enough.

Consequential stuff
~~~~~~~~~~~~~~~~~~~
* A 'wrapper' no longer has a WrapperInfo in the IdInfo. Instead, the InlineRule has a field identifying wrappers.

* Of course, IfaceSyn and interface serialisation change appropriately.

* Making implication constraints inline nicely was a bit fiddly. In the end I added a var_inline field to HsBinds.VarBind, which is why this patch affects the type checker slightly

* I made some changes to the way in which eta expansion happens in CorePrep, mainly to ensure that *arguments* that become let-bound are also eta-expanded. I'm still not too happy with the clarity and robustness of the result.

* We now complain if the programmer gives an INLINE pragma for a recursive function (previously we just ignored it). Reason for change: we don't want an InlineRule on a LoopBreaker, because then we'd have to check for loop-breaker-hood at occurrence sites (which isn't currently done). Some tests need changing as a result.

This patch has been in my tree for quite a while, so there are probably some other minor changes.
    M ./compiler/basicTypes/Id.lhs -11
    M ./compiler/basicTypes/IdInfo.lhs -82
    M ./compiler/basicTypes/MkId.lhs -2 +2
    M ./compiler/coreSyn/CoreFVs.lhs -2 +25
    M ./compiler/coreSyn/CoreLint.lhs -5 +1
    M ./compiler/coreSyn/CorePrep.lhs -59 +53
    M ./compiler/coreSyn/CoreSubst.lhs -22 +31
    M ./compiler/coreSyn/CoreSyn.lhs -66 +92
    M ./compiler/coreSyn/CoreUnfold.lhs -112 +112
    M ./compiler/coreSyn/CoreUtils.lhs -185 +184
    M ./compiler/coreSyn/MkExternalCore.lhs -1
    M ./compiler/coreSyn/PprCore.lhs -4 +40
    M ./compiler/deSugar/DsBinds.lhs -70 +118
    M ./compiler/deSugar/DsForeign.lhs -2 +4
    M ./compiler/deSugar/DsMeta.hs -4 +3
    M ./compiler/hsSyn/HsBinds.lhs -3 +3
    M ./compiler/hsSyn/HsUtils.lhs -2 +7
    M ./compiler/iface/BinIface.hs -11 +25
    M ./compiler/iface/IfaceSyn.lhs -13 +21
    M ./compiler/iface/MkIface.lhs -24 +19
    M ./compiler/iface/TcIface.lhs -29 +23
    M ./compiler/main/TidyPgm.lhs -55 +49
    M ./compiler/parser/ParserCore.y -5 +6
    M ./compiler/simplCore/CSE.lhs -2 +1
    M ./compiler/simplCore/FloatIn.lhs -6 +1
    M ./compiler/simplCore/FloatOut.lhs -23
    M ./compiler/simplCore/OccurAnal.lhs -36 +5
    M ./compiler/simplCore/SetLevels.lhs -59 +54
    M ./compiler/simplCore/SimplCore.lhs -48 +52
    M ./compiler/simplCore/SimplEnv.lhs -26 +22
    M ./compiler/simplCore/SimplUtils.lhs -28 +4
    M ./compiler/simplCore/Simplify.lhs -91 +109
    M ./compiler/specialise/Specialise.lhs -15 +18
    M ./compiler/stranal/WorkWrap.lhs -14 +11
    M ./compiler/stranal/WwLib.lhs -2 +2
    M ./compiler/typecheck/Inst.lhs -1 +3
    M ./compiler/typecheck/TcBinds.lhs -17 +27
    M ./compiler/typecheck/TcClassDcl.lhs -1 +2
    M ./compiler/typecheck/TcExpr.lhs -4 +6
    M ./compiler/typecheck/TcForeign.lhs -1 +1
    M ./compiler/typecheck/TcGenDeriv.lhs -14 +13
    M ./compiler/typecheck/TcHsSyn.lhs -3 +2
    M ./compiler/typecheck/TcInstDcls.lhs -5 +4
    M ./compiler/typecheck/TcRnDriver.lhs -2 +11
    M ./compiler/typecheck/TcSimplify.lhs -10 +17
    M ./compiler/vectorise/VectType.hs +7

Mon Dec 8 12:43:10 GMT 2008 simonpj@microsoft.com
* White space only

    M ./compiler/simplCore/Simplify.lhs -2

Mon Dec 8 12:48:40 GMT 2008 simonpj@microsoft.com
* Move simpleOptExpr from CoreUnfold to CoreSubst

    M ./compiler/coreSyn/CoreSubst.lhs -1 +87
    M ./compiler/coreSyn/CoreUnfold.lhs -72 +1

Mon Dec 8 17:30:18 GMT 2008 simonpj@microsoft.com
* Use CoreSubst.simpleOptExpr in place of the ad-hoc simpleSubst (reduces code too)

    M ./compiler/deSugar/DsBinds.lhs -50 +16

Tue Dec 9 17:03:02 GMT 2008 simonpj@microsoft.com
* Fix Trac #2861: bogus eta expansion

Urghlhl! I "tidied up" the treatment of the "state hack" in CoreUtils, but missed an unexpected interaction with the way that a bottoming function simply swallows excess arguments. There's a long Note [State hack and bottoming functions] to explain (which accounts for most of the new lines of code).

    M ./compiler/coreSyn/CoreUtils.lhs -16 +53

Mon Dec 15 10:02:21 GMT 2008 Simon Marlow <marlowsd@gmail.com>
* Revert CorePrep part of "Completely new treatment of INLINE pragmas..."

The original patch said:

* I made some changes to the way in which eta expansion happens in CorePrep, mainly to ensure that *arguments* that become let-bound are also eta-expanded. I'm still not too happy with the clarity and robustness of the result.

Unfortunately this change apparently broke some invariants that were relied on elsewhere, and in particular led to panics when compiling with profiling on.

Will re-investigate in the new year.

    M ./compiler/coreSyn/CorePrep.lhs -53 +58
    M ./configure.ac -1 +1

Mon Dec 15 12:28:51 GMT 2008 Simon Marlow <marlowsd@gmail.com>
* revert accidental change to configure.ac

    M ./configure.ac -1 +1
-
- 08 Dec, 2008 1 commit
-
-
simonpj@microsoft.com authored
-
- 05 Dec, 2008 2 commits
-
-
simonpj@microsoft.com authored
This is a major patch, which changes the way INLINE pragmas work. Although lots of files are touched, the net is only +21 lines of code -- and I bet that most of those are comments!

HEADS UP: interface file format has changed, so you'll need to recompile everything.

There is not much effect on overall performance for nofib, probably because those programs don't make heavy use of INLINE pragmas.

Program             Size   Allocs  Runtime  Elapsed
Min               -11.3%    -6.9%    -9.2%    -8.2%
Max                -0.1%    +4.6%    +7.5%    +8.9%
Geometric Mean     -2.2%    -0.2%    -1.0%    -0.8%

(The +4.6% on allocs is cichelli; see other patch relating to -fpass-case-bndr-to-join-points.)

The old INLINE system
~~~~~~~~~~~~~~~~~~~~~
The old system worked like this. A function with an INLINE pragma got a right-hand side which looked like
    f = __inline_me__ (\xy. e)
The __inline_me__ part was an InlineNote, and was treated specially in various ways. Notably, the simplifier didn't inline inside an __inline_me__ note.

As a result, the code for f itself was pretty crappy. That matters if you say (map f xs), because then you execute the code for f, rather than inlining a copy at the call site.

The new story: InlineRules
~~~~~~~~~~~~~~~~~~~~~~~~~~
The new system removes the InlineMe Note altogether. Instead there is a new constructor InlineRule in CoreSyn.Unfolding. This is a bit like a RULE, in that it remembers the template to be inlined inside the InlineRule. No simplification or inlining is done on an InlineRule, just like RULEs.

An Id can have an InlineRule *or* a CoreUnfolding (since these are two constructors from Unfolding). The simplifier treats them differently:

- An InlineRule has the substitution applied (like RULES) but is otherwise left undisturbed.

- A CoreUnfolding is updated with the new RHS of the definition, on each iteration of the simplifier.

An InlineRule fires regardless of size, but *only* when the function is applied to enough arguments. The "arity" of the rule is specified (by the programmer) as the number of args on the LHS of the "=". So it makes a difference whether you say
    {-# INLINE f #-}
    f x = \y -> e
or
    f x y = e
This is one of the big new features that InlineRule gives us, and it is one that Roman really wanted.

In contrast, a CoreUnfolding can fire when it is applied to fewer args than the function has lambdas, provided the result is small enough.

Consequential stuff
~~~~~~~~~~~~~~~~~~~
* A 'wrapper' no longer has a WrapperInfo in the IdInfo. Instead, the InlineRule has a field identifying wrappers.

* Of course, IfaceSyn and interface serialisation change appropriately.

* Making implication constraints inline nicely was a bit fiddly. In the end I added a var_inline field to HsBinds.VarBind, which is why this patch affects the type checker slightly

* I made some changes to the way in which eta expansion happens in CorePrep, mainly to ensure that *arguments* that become let-bound are also eta-expanded. I'm still not too happy with the clarity and robustness of the result.

* We now complain if the programmer gives an INLINE pragma for a recursive function (previously we just ignored it). Reason for change: we don't want an InlineRule on a LoopBreaker, because then we'd have to check for loop-breaker-hood at occurrence sites (which isn't currently done). Some tests need changing as a result.

This patch has been in my tree for quite a while, so there are probably some other minor changes.
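The arity point can be made concrete with two definitions that differ only in where the last argument is bound. The names `compose2`, `compose3`, `incDoubleAll2` and `incDoubleAll3` are invented for this sketch; what the optimiser actually does with them depends on the GHC version and flags.

```haskell
-- Two arguments on the LHS: per the message, the INLINE unfolding is
-- eligible once 'compose2' is applied to two arguments.
{-# INLINE compose2 #-}
compose2 :: (b -> c) -> (a -> b) -> a -> c
compose2 f g = \x -> f (g x)

-- Three arguments on the LHS: eligible only when applied to three.
{-# INLINE compose3 #-}
compose3 :: (b -> c) -> (a -> b) -> a -> c
compose3 f g x = f (g x)

-- Both call sites supply only two arguments, so under the scheme
-- described above only the first would see its unfolding used.
incDoubleAll2 :: [Int] -> [Int]
incDoubleAll2 = map (compose2 (+ 1) (* 2))

incDoubleAll3 :: [Int] -> [Int]
incDoubleAll3 = map (compose3 (+ 1) (* 2))
```

The two functions always agree on results; the difference lies only in which call sites are candidates for inlining.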
-
simonpj@microsoft.com authored
See Note [Passing the case binder to join points] in Simplify.lhs The default now is *not* to pass the case binder. There are some nofib results with the above note; the effect is almost always negligible. I don't expect this flag to be used by users (hence no docs). It's just there to let me try the performance effects of switching on and off.
-
- 02 Oct, 2008 1 commit
-
-
simonpj@microsoft.com authored
-
- 20 Sep, 2008 1 commit
-
-
simonpj@microsoft.com authored
This patch does a lot of tidying up of the way that dead variables are handled in Core. Just the sort of thing to do on an aeroplane.

* The tricky "binder-swap" optimisation is moved from the Simplifier to the Occurrence Analyser. See Note [Binder swap] in OccurAnal. This is really a nice change. It should reduce the number of simplifier iterations (slightly perhaps). And it means that we can be much less pessimistic about zapping occurrence info on binders in a case expression.

* For example:
      case x of y { (a,b) -> e }
  Previously, each time around, even if y,a,b were all dead, the Simplifier would pessimistically zap their OccInfo, so that we can't see they are dead any more. As a result virtually no case expression ended up with dead binders. This wasn't Bad in itself, but it always felt wrong.

* I added a check to CoreLint to check that a dead binder really isn't used. That showed up a couple of bugs in CSE. (Only in this sense -- they didn't really matter.)

* I've changed the PprCore printer to print "_" for a dead variable. (Use -dppr-debug to see it again.) This reduces clutter quite a bit, and of course it's much more useful with the above change.

* Another benefit of the binder-swap change is that I could get rid of the Simplifier hack (working, but hacky) in which the InScopeSet was used to map a variable to a *different* variable. That allowed me to remove VarEnv.modifyInScopeSet, and to simplify lookupInScopeSet so that it doesn't look for a fixpoint. This fixes no bugs, but is a useful cleanup.

* Roman pointed out that Id.mkWildId is jolly dangerous, because of its fixed unique. So I've
  - localised it to MkCore, where it is private (not exported)
  - renamed it to 'mkWildBinder' to stress that you should only use it at binding sites, unless you really know what you are doing
  - provided a function MkCore.mkWildCase that embodies the most common use of mkWildId, and use that elsewhere
  So things are much better

* A knock-on change is that I found a common pattern of localising a potentially global Id, and made a function for it: Id.localiseId
-
- 18 Sep, 2008 1 commit
-
-
simonpj@microsoft.com authored
-
- 17 Sep, 2008 1 commit
-
-
simonpj@microsoft.com authored
This warning tests that the arity of a function does not decrease, and that it's at least as great as the strictness signature. Failing this test isn't a disaster, but it's distinctly odd and usually indicates that not enough information is getting propagated around, and hence you may get more simplifier iterations.
-
- 14 Sep, 2008 1 commit
-
-
simonpj@microsoft.com authored
-
- 05 Sep, 2008 1 commit
-
-
simonpj@microsoft.com authored
When binding x = e, we now attach an unfolding to 'x' even if it won't be used because SimplGently is on. Reason: the specialiser runs right after SimplGently, and it (now) only gathers call information for calls whose dictionary arguments are "interesting" -- i.e. have an unfolding of some kind.
-
- 03 Sep, 2008 1 commit
-
-
simonpj@microsoft.com authored
This patch significantly improves the way in which recursive groups are specialised. This turns out to be very important when specialising the bindings that (now) emerge from instance declarations.

Consider
    let rec { f x = ...g x'...
            ; g y = ...f y'.... }
    in f 'a'
Here we specialise 'f' at Char; but that is very likely to lead to a specialisation of 'g' at Char. We must do the latter, else the whole point of specialisation is lost. This was not happening before.

The whole thing is described in Note [Specialising a recursive group]

Simon
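A source-level analogue of that recursive group is a pair of mutually recursive overloaded functions called at a single type. The names `describeEven`, `describeOdd` and `atChar` are invented for this sketch, and nothing here forces or observes specialisation; it only shows the shape of program the message is about.

```haskell
-- A mutually recursive, overloaded group: each function passes the
-- Show dictionary on to the other.
describeEven :: Show a => Int -> a -> String
describeEven 0 x = "even:" ++ show x
describeEven n x = describeOdd (n - 1) x

describeOdd :: Show a => Int -> a -> String
describeOdd 0 x = "odd:" ++ show x
describeOdd n x = describeEven (n - 1) x

-- A call at Char: specialising describeEven alone would still leave
-- its calls to describeOdd going through the dictionary.
atChar :: String
atChar = describeEven 3 'a'
```

Specialising only the function that is called directly leaves the dictionary-passing version of its partner inside the loop, which is the loss the message describes.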
-
- 31 Jul, 2008 1 commit
-
-
batterseapower authored
-
- 20 Jul, 2008 1 commit
-
-
Thomas Schilling authored
-
- 17 Jun, 2008 1 commit
-
-
Simon Marlow authored
That's 1 line of new code and 38 lines of new comments
-
- 14 Jun, 2008 1 commit
-
-
simonpj@microsoft.com authored
This bug was somehow tickled by the new code for desugaring polymorphic bindings, but the bug has been there a long time. The bindings floated out in simplLazyBind, generated by abstractFloats, were getting processed by postInlineUnconditionally. But that was wrong because part of their scope has already been processed. That led to a bit of refactoring in the simplifier. See comments with Simplify.addPolyBind. In principle this might happen in 6.8.3, but in practice it doesn't seem to, so probably not worth merging.
-
- 05 Jun, 2008 1 commit
-
-
simonpj@microsoft.com authored
This patch adds to Core the ability to say

    let a = Int in <body>

where 'a' is a type variable. That is: a type-let. See Note [Type let] in CoreSyn.

* The binding is always non-recursive
* The simplifier immediately eliminates it by substitution

So in effect a type-let is just a delayed substitution. This is convenient in a couple of places in the desugarer, one existing (see the call to CoreSyn.mkTyBind in DsUtils), and one that's in the next upcoming patch.

The first use in the desugarer was previously encoded as

    (/\a. <body>) Int

rather than eagerly substituting, but that was horrid because Core Lint had to "know" that a=Int inside <body> else it would bleat. Expressing it directly as a 'let' seems much nicer.
-
- 16 May, 2008 1 commit
-
-
simonpj@microsoft.com authored
Trac #2273 showed a case in which 'seq' didn't cure the space leak it was supposed to. This patch does two things to help

a) It removes a now-redundant special case in Simplify, which switched off the case-binder-swap in the early stages. This isn't necessary any more because FloatOut has improved since the Simplify code was written. And switching off the binder-swap is harmful for seq.

However fix (a) is a bit fragile, so I did (b) too:

b) Desugar 'seq' specially. See Note [Desugaring seq (2)] in DsUtils. This isn't very robust either, since it's defeated by abstraction, but that's not something GHC can fix; the programmer should use a let! instead.
-
- 12 Apr, 2008 1 commit
-
-
Ian Lynagh authored
-
- 22 Apr, 2008 1 commit
-
-
simonpj@microsoft.com authored
The main change in this patch is this:

* The Stop constructor of SimplCont no longer contains the OutType of the whole continuation. This is a nice simplification in lots of places where we build a Stop continuation. For example, rebuildCall no longer needs to maintain the type of the function.

* Similarly StrictArg no longer needs an OutType

* The consequential complication is that contResultType (not called much) needs to be given the type of the thing in the middle. No big deal.

* Lots of other small knock-on effects

Other changes in here

* simplLazyBind does do the type-abstraction thing if there's a lambda inside. See comments in simplLazyBind

* simplLazyBind reduces simplifier iterations by keeping unfolding information for stuff for which type abstraction is done (see add_poly_bind)

All of this came up when implementing System IF, but seems worth applying to the HEAD
-
- 29 Mar, 2008 1 commit
-
-
Ian Lynagh authored
Modules that need it import it themselves instead.
-
- 22 Feb, 2008 2 commits
-
-
Ian Lynagh authored
-
Ian Lynagh authored
-
- 07 Feb, 2008 1 commit
-
-
simonpj@microsoft.com authored
This adds back in the patch

* UNDO: Be a little keener to inline

It originally broke the compiler because it tickled a Cmm optimisation bug, now fixed.

In revisiting this I have also made inlining a bit cleverer, in response to more examples from Roman. In particular

* CoreUnfold.CallCtxt is a data type that tells something about the context of a call. The new feature is that if the context is the argument position of a function call, we record both
  - whether the function (or some higher up function) has rules
  - what the argument discount in that position is
  Either of these makes functions keener to inline, even if it's in a lazy position

* There was consequential tidying up on the data type of CallCont. In particular I got rid of the now-unused LetRhsFlag
-
- 17 Jan, 2008 2 commits
-
-
twanvl authored
-
simonpj@microsoft.com authored
The add_evals code in Simplify.simplAlt had bit-rotted. Example:

    data T a = T !a
    data U a = U !a

    foo :: T a -> U a
    foo (T x) = U x

Here we should not evaluate x before building the U result, because the x argument of T is already evaluated.

Thanks to Roman for finding this.
-
- 20 Dec, 2007 1 commit
-
-
simonpj@microsoft.com authored
The ru_fn field was wrong when we moved RULES from one Id to another. The fix is simple enough. However, looking at this makes me realise that the worker/wrapper stuff for recursive newtypes isn't very clever: we generate demand info but then don't properly exploit it. This patch fixes the crash though.
-