Commits on Source (37)
-
Before, there were two distinct Notes named "Eta reduction for data families". This renames one of them to "Implementing eta reduction for data families" to disambiguate the two and fixes references in other parts of the codebase to ensure that they are pointing to the right place. Fixes #17313. [ci skip]
9b39f2e6 -
Fixes #17979. [ci skip]
7627eab5 -
Metric Decrease: T13035 T1969
0002db1b -
In #17977, we ran into the reduction depth limit of the typechecker. That was only a symptom of a much broader issue: The recursion depth of the coverage checker for trying to instantiate strict fields in the `nonVoid` test was far too high (100, the `defaultMaxTcBound`). As a result, we were performing quite poorly on `T17977`. Short of a proper termination analysis to prove emptyness of a type, we just arbitrarily default to a much lower recursion limit of 3. Fixes #17977.
7b217179 -
Fixes #17983.
3c09f636 -
The binder-swap transformation is implemented by the occurrence analyser -- see Note [Binder swap] in OccurAnal. However it had a very nasty corner in it, for the case where the case scrutinee was a GlobalId. This led to trouble and hacks, and ultimately to #16296. This patch re-engineers how the occurrence analyser implements the binder-swap, by actually carrying out a substitution rather than by adding a let-binding. It's all described in Note [The binder-swap substitution]. I did a few other things along the way * Fix a bug in StgCse, which could allow a loop breaker to be CSE'd away. See Note [Care with loop breakers] in StgCse. I think it can only show up if occurrence analyser sets up bad loop breakers, but still. * Better commenting in SimplUtils.prepareAlts * A little refactoring in CoreUnfold; nothing significant e.g. rename CoreUnfold.mkTopUnfolding to mkFinalUnfolding * Renamed CoreSyn.isFragileUnfolding to hasCoreUnfolding * Move mkRuleInfo to CoreFVs We observed respectively 4.6% and 5.9% allocation decreases for the following tests: Metric Decrease: T9961 haddock.base
b943b25d -
Fix #13380 and #17676 by 1. Changing `raiseIO#` to have `topDiv` instead of `botDiv` 2. Give it special treatment in `Simplifier.Util.mkArgInfo`, treating it as if it still had `botDiv`, to recover dead code elimination. This is the first commit of the plan outlined in !2525 (comment 260886).
42d68364 -
0a88dd11
-
5beac042
-
Not only is this a reasonable efficiency measure but it avoids making reentrant calls into ncurses, which is not thread-safe. See #17922.
88f38b03 -
GHC 8.10 ships with `Cabal-3.2.0.0`, so it would be convenient to make Hadrian supporting building against 3.2.* instead of having to rebuild the entirety of `Cabal-3.0.0.0`. There is one API change in `Cabal-3.2.*` that affects Hadrian: the `synopsis` and `description` functions now return `ShortText` instead of `String`. Since Hadrian manipulates these `String`s in various places, I found that the simplest fix was to use CPP to convert `ShortText` to `String`s where appropriate.
27740f24 -
Broken by 57b888c0
49802002 -
Two `ASSERT`s in `reifyDataCon` were always using `arg_tys`, but `arg_tys` is not meaningful for GADT constructors. In fact, it's worse than non-meaningful, since using `arg_tys` when reifying a GADT constructor can lead to failed `ASSERT`ions, as #17305 demonstrates. This patch applies the simplest possible fix to the immediate problem. The `ASSERT`s now use `r_arg_tys` instead of `arg_tys`, as the former makes sure to give something meaningful for GADT constructors. This makes the panic go away at the very least. There is still an underlying issue with the way the internals of `reifyDataCon` work, as described in #17305 (comment 227023), but we leave that as future work, since fixing the underlying issue is much trickier (see #17305 (comment 227087)).
30a63e79 -
flag to dump pretty printed contents of the .hie file Metric Increase: hie002 Because of the regression on i386: compile_time/bytes allocated increased from i386-linux-deb9 baseline @ HEAD~10: Expected hie002 (normal) compile_time/bytes allocated: 583014888.0 +/-10% Lower bound hie002 (normal) compile_time/bytes allocated: 524713399 Upper bound hie002 (normal) compile_time/bytes allocated: 641316377 Actual hie002 (normal) compile_time/bytes allocated: 877986292 Deviation hie002 (normal) compile_time/bytes allocated: 50.6 % *** unexpected stat test failure for hie002(normal)
ef7576c4 -
We now differentiate three cases of constructor bindings: 1)Bindings which we can "replace" with a reference to an existing closure. Reference the replacement closure when accessing the binding. 2)Bindings which we can "replace" as above. But we still generate a closure which will be referenced by modules importing this binding. 3)For any other binding generate a closure. Then reference it. Before this patch 1) did only apply to local bindings and we didn't do 2) at all.
9462452a -
We were mistakenly measuring program stats
f7597aa0 -
Move handling of big literal strings from CmmToAsm to StgToCmm. It avoids the use of `sdocWithDynFlags` (cf #10143). We might need to move this handling even higher in the pipeline in the future (cf #17960): this patch will make it easier.
a485c3c4 -
In !2959 we noticed that there was some redundant code (in GHC.Cmm.Utils and GHC.Cmm.StgToCmm.Utils) used to deal with `CmmStatics` datatype (before SRT generation) and `RawCmmStatics` datatype (after SRT generation). This patch removes this redundant code by using a single GADT for (Raw)CmmStatics.
cc2918a0 -
9e60273d
-
The newlines break the command on windows.
1b7e8a94 -
This patch is joint work of Alexis King and Simon PJ. It does some significant refactoring of the type-class specialiser. Main highlights: * We can specialise functions with types like f :: Eq a => a -> Ord b => b => blah where the classes aren't all at the front (#16473). Here we can correctly specialise 'f' based on a call like f @Int @Bool dEqInt x dOrdBool This change really happened in an earlier patch commit 2d0cf625 Author: Sandy Maguire <sandy@sandymaguire.me> Date: Thu May 16 12:12:10 2019 -0400 work that this new patch builds directly on that work, and refactors it a bit. * We can specialise functions with implicit parameters (#17930) g :: (?foo :: Bool, Show a) => a -> String Previously we could not, but now they behave just like a non-class argument as in 'f' above. * We can specialise under-saturated calls, where some (but not all of the dictionary arguments are provided (#17966). For example, we can specialise the above 'f' based on a call map (f @Int dEqInt) xs even though we don't (and can't) give Ord dictionary. This may sound exotic, but #17966 is a program from the wild, and showed significant perf loss for functions like f, if you need saturation of all dictionaries. * We fix a buglet in which a floated dictionary had a bogus demand (#17810), by using zapIdDemandInfo in the NonRec case of specBind. * A tiny side benefit: we can drop dead arguments to specialised functions; see Note [Drop dead args from specialisations] * Fixed a bug in deciding what dictionaries are "interesting"; see Note [Keep the old dictionaries interesting] This is all achieved by by building on Sandy Macguire's work in defining SpecArg, which mkCallUDs uses to describe the arguments of the call. Main changes: * Main work is in specHeader, which marched down the [InBndr] from the function definition and the [SpecArg] from the call site, together. * specCalls no longer has an arity check; the entire mechanism now handles unders-saturated calls fine. * mkCallUDs decides on an argument-by-argument basis whether to specialise a particular dictionary argument; this is new. See mk_spec_arg in mkCallUDs. It looks as if there are many more lines of code, but I think that all the extra lines are comments!
4291bdda -
Ömer Sinan Ağacan authored
[ci skip]
40a85563 -
Fixes #17947 When we have a ticky label for a proc, IdLabels for the ticky counter and proc entry share the same Name. This caused overriding proc CafInfos with the ticky CafInfos (i.e. NoCafRefs) during SRT analysis. We now ignore the ticky labels when building SRTMaps. This makes sense because: - When building the current module they don't need to be in SRTMaps as they're initialized as non-CAFFY (see mkRednCountsLabel), so they don't take part in the dependency analysis and they're never added to SRTs. (Reminder: a "dependency" in the SRT analysis is a CAFFY dependency, non-CAFFY uses are not considered as dependencies for the algorithm) - They don't appear in the interfaces as they're not exported, so it doesn't matter for cross-module concerns whether they're in the SRTMap or not. See also the new Note [Ticky labels in SRT analysis].
dcfe29c8 -
Issue #17151 was a very tricky example of a bug in which the specialiser accidentally constructs a recurive dictionary, so that everything turns into bottom. I have fixed variants of this bug at least twice before: see Note [Avoiding loops]. It was a bit of a struggle to isolate the problem, greatly aided by the work that Alexey Kuleshevich did in distilling a test case. Once I'd understood the problem, it was not difficult to fix, though it did lead me a bit of refactoring in specImports.
cec2c71f -
This refactors DictBinds into a data type rather than a pair. No change in behaviour, just better code
e850d14f -
If we're doing heap profiling on an unprofiled executable we keep allocating new space in initEra via nextEra on each profiler run but we don't have a corresponding freeEra call. We do free the last era in endHeapProfiling but previous eras will have been overwritten by initEra and will never get free()ed. Metric Decrease: space_leak_001
f38e8d61 -
bcd66859
-
We used to have another factor, ufKeenessFactor, which would scale the discounts before they were subtracted from the size. This was justified with the following comment: -- We multiple the raw discounts (args_discount and result_discount) -- ty opt_UnfoldingKeenessFactor because the former have to do with -- *size* whereas the discounts imply that there's some extra -- *efficiency* to be gained (e.g. beta reductions, case reductions) -- by inlining. However, this is highly suspect since it means that we subtract a *scaled* size from an absolute size, resulting in crazy (e.g. negative) scores in some cases (#15304). We consequently killed off ufKeenessFactor and bumped up the ufUseThreshold to compensate. Adjustment of unfolding use threshold ===================================== Since this removes a discount from our inlining heuristic, I revisited our default choice of -funfolding-use-threshold to minimize the change in overall inlining behavior. Specifically, I measured runtime allocations and executable size of nofib and the testsuite performance tests built using compilers (and core libraries) built with several values of -funfolding-use-threshold. This comes as a result of a quantitative comparison of testsuite performance and code size as a function of ufUseThreshold, comparing GHC trees using values of 50, 60, 70, 80, 90, and 100. The test set consisted of nofib and the testsuite performance tests. A full summary of these measurements are found in the description of !2608 Comparing executable sizes (relative to the base commit) across all nofib tests, we see that sizes are similar to the baseline: gmean min max median thresh 50 -6.36% -7.04% -4.82% -6.46% 60 -5.04% -5.97% -3.83% -5.11% 70 -2.90% -3.84% -2.31% -2.92% 80 -0.75% -2.16% -0.42% -0.73% 90 +0.24% -0.41% +0.55% +0.26% 100 +1.36% +0.80% +1.64% +1.37% baseline +0.00% +0.00% +0.00% +0.00% Likewise, looking at runtime allocations we see that 80 gives slightly better optimisation than the baseline: gmean min max median thresh 50 +0.16% -0.16% +4.43% +0.00% 60 +0.09% -0.00% +3.10% +0.00% 70 +0.04% -0.09% +2.29% +0.00% 80 +0.02% -1.17% +2.29% +0.00% 90 -0.02% -2.59% +1.86% +0.00% 100 +0.00% -2.59% +7.51% -0.00% baseline +0.00% +0.00% +0.00% +0.00% Finally, I had to add a NOINLINE in T4306 to ensure that `upd` is worker-wrappered as the test expects. This makes me wonder whether the inlining heuristic is now too liberal as `upd` is quite a large function. The same measure was taken in T12600. Wall clock time compiling Cabal with -O0 thresh 50 60 70 80 90 100 baseline build-Cabal 93.88 89.58 92.59 90.09 100.26 94.81 89.13 Also, this change happens to avoid the spurious test output in `plugin-recomp-change` and `plugin-recomp-change-prof` (see #17308). Metric Decrease: hie002 T12234 T13035 T13719 T14683 T4801 T5631 T5642 T9020 T9872d T9961 Metric Increase: T12150 T12425 T13701 T14697 T15426 T1969 T3064 T5837 T6048 T9203 T9872a T9872b T9872c T9872d haddock.Cabal haddock.base haddock.compiler
3d2991f8 -
Update Haddock submodule
255418da -
Ryan Scott authored
This changes every unused TTG extension constructor to be strict in its field so that the pattern-match coverage checker is smart enough any such constructors are unreachable in pattern matches. This lets us remove nearly every use of `noExtCon` in the GHC API. The only ones we cannot remove are ones underneath uses of `ghcPass`, but that is only because GHC 8.8's and 8.10's coverage checkers weren't smart enough to perform this kind of reasoning. GHC HEAD's coverage checker, on the other hand, _is_ smart enough, so we guard these uses of `noExtCon` with CPP for now. Bumps the `haddock` submodule. Fixes #17992.
04b6cf94 -
Instead of using `nlHsTyVar`, which hardcodes `NotPromoted`, have `typeToLHsType` pick between `Promoted` and `NotPromoted` by checking if a type constructor is promoted using `isPromotedDataCon`. Fixes #18020.
7802fa17 -
Josh Meredith authoredec82dc49
-
Josh Meredith authored9c0dba4e
-
Josh Meredith authored32c87ef3
-
Josh Meredith authored62b90707
Showing
- CODEOWNERS 20 additions, 20 deletionsCODEOWNERS
- compiler/GHC.hs 8 additions, 8 deletionscompiler/GHC.hs
- compiler/GHC/Cmm.hs 26 additions, 19 deletionscompiler/GHC/Cmm.hs
- compiler/GHC/Cmm/CLabel.hs 8 additions, 3 deletionscompiler/GHC/Cmm/CLabel.hs
- compiler/GHC/Cmm/DebugBlock.hs 1 addition, 1 deletioncompiler/GHC/Cmm/DebugBlock.hs
- compiler/GHC/Cmm/Expr.hs 1 addition, 1 deletioncompiler/GHC/Cmm/Expr.hs
- compiler/GHC/Cmm/Info.hs 6 additions, 7 deletionscompiler/GHC/Cmm/Info.hs
- compiler/GHC/Cmm/Info/Build.hs 31 additions, 4 deletionscompiler/GHC/Cmm/Info/Build.hs
- compiler/GHC/Cmm/Node.hs 1 addition, 1 deletioncompiler/GHC/Cmm/Node.hs
- compiler/GHC/Cmm/Parser.y 1 addition, 1 deletioncompiler/GHC/Cmm/Parser.y
- compiler/GHC/Cmm/Ppr/Decl.hs 6 additions, 9 deletionscompiler/GHC/Cmm/Ppr/Decl.hs
- compiler/GHC/Cmm/Utils.hs 20 additions, 15 deletionscompiler/GHC/Cmm/Utils.hs
- compiler/GHC/CmmToAsm/PPC/CodeGen.hs 4 additions, 4 deletionscompiler/GHC/CmmToAsm/PPC/CodeGen.hs
- compiler/GHC/CmmToAsm/PPC/Ppr.hs 6 additions, 5 deletionscompiler/GHC/CmmToAsm/PPC/Ppr.hs
- compiler/GHC/CmmToAsm/PPC/RegInfo.hs 2 additions, 2 deletionscompiler/GHC/CmmToAsm/PPC/RegInfo.hs
- compiler/GHC/CmmToAsm/Ppr.hs 12 additions, 20 deletionscompiler/GHC/CmmToAsm/Ppr.hs
- compiler/GHC/CmmToAsm/SPARC/CodeGen.hs 1 addition, 1 deletioncompiler/GHC/CmmToAsm/SPARC/CodeGen.hs
- compiler/GHC/CmmToAsm/SPARC/CodeGen/Gen32.hs 2 additions, 2 deletionscompiler/GHC/CmmToAsm/SPARC/CodeGen/Gen32.hs
- compiler/GHC/CmmToAsm/SPARC/Ppr.hs 9 additions, 7 deletionscompiler/GHC/CmmToAsm/SPARC/Ppr.hs
- compiler/GHC/CmmToAsm/SPARC/ShortcutJump.hs 2 additions, 2 deletionscompiler/GHC/CmmToAsm/SPARC/ShortcutJump.hs