- Jul 15, 2023
-
-
This patch finally (I hope) nails the question of whether (forall a. ty) and (forall a -> ty) are `eqType`: they aren't! There is a long discussion in #22762, plus useful Notes: * Note [ForAllTy and type equality] in GHC.Core.TyCo.Compare * Note [Comparing visiblities] in GHC.Core.TyCo.Compare * Note [ForAllCo] in GHC.Core.TyCo.Rep It also establishes a helpful new invariant for ForAllCo, and ForAllTy, when the bound variable is a CoVar:in that case the visibility must be coreTyLamForAllTyFlag. All this is well documented in revised Notes.
-
- May 19, 2023
-
-
Oleg Grenrus authored
This change makes command line argument parsing use diagnostic framework for producing warnings.
-
- May 18, 2023
-
-
Addresses #23398. The demand analyser usually does not unpack class dictionaries: see Note [Do not unbox class dictionaries] in GHC.Core.Opt.DmdAnal. This patch makes an exception for tuple dictionaries and equality dictionaries, for reasons explained in wrinkles (DNB1) and (DNB2) of the above Note. Compile times fall by 0.1% for some reason (max 0.7% on T18698b).
-
- Apr 26, 2023
-
-
In #23208 we observed that the demand signature of a binder occuring in a RULE wasn't unleashed, leading to a transitively used binder being discarded as absent. The solution was to use the same code path that we already use for handling exported bindings. See the changes to `Note [Absence analysis for stable unfoldings and RULES]` for more details. I took the chance to factor out the old notion of a `PlusDmdArg` (a pair of a `VarEnv Demand` and a `Divergence`) into `DmdEnv`, which fits nicely into our existing framework. As a result, I had to touch quite a few places in the code. This refactoring exposed a few small bugs around correct handling of bottoming demand environments. As a result, some strictness signatures now mention uniques that weren't there before which caused test output changes to T13143, T19969 and T22112. But these tests compared whole -ddump-simpl listings which is a very fragile thing to begin with. I changed what exactly they test for based on the symptoms in the corresponding issues. There is a single regression in T18894 because we are more conservative around stable unfoldings now. Unfortunately it is not easily fixed; let's wait until there is a concrete motivation before invest more time. Fixes #23208.
-
- Mar 21, 2023
-
-
CmmCgInfos is needed to write interface files, but the JavaScript backend does not generate it, causing "Name without LFInfo" warnings. This patch adds a conservative but always correct CmmCgInfos when the JavaScript backend is used. Fixes #23053
-
This patch implements a part of GHC Proposal #475. The key change is in GHC.Tuple.Prim: - data () = () - data (a,b) = (a,b) - data (a,b,c) = (a,b,c) ... + data Unit = () + data Tuple2 a b = (a,b) + data Tuple3 a b c = (a,b,c) ... And the rest of the patch makes sure that Unit and Tuple<n> are pretty-printed as () and (,,...,,) in various contexts. Updates the haddock submodule. Co-authored-by:
Vladislav Zavialov <vlad.z.4096@gmail.com>
-
- Mar 10, 2023
-
-
We should not panic in `add_demands` (now `set_lam_dmds`), because that code path is legimitely taken for OPAQUE PAP bindings, as in T22997. Fixes #22997.
-
- Feb 08, 2023
-
-
This reverts commit caced757. It seems the patch "Don't keep exit join points so much" is causing wide-spread regressions in the bytestring library benchmarks. If I revert it then the 9.6 numbers are better on average than 9.4. See ghc/ghc#22893 (comment 479525) ------------------------- Metric Decrease: MultiComponentModules MultiComponentModulesRecomp MultiLayerModules MultiLayerModulesRecomp MultiLayerModulesTH_Make T12150 T13386 T13719 T21839c T3294 parsing001 -------------------------
-
- Feb 02, 2023
-
-
This MR runs the testsuite for the JS backend. Note that this is a temporary solution until !9515 is merged. Key point: The CI runs hadrian on the built cross compiler _but not_ on the bindist. Other Highlights: - stm submodule gets a bump to mark tests as broken - several tests are marked as broken or are fixed by adding more - conditions to their test runner instance. List of working commit messages: CI: test cross target _and_ emulator CI: JS: Try run testsuite with hadrian JS.CI: cleanup and simplify hadrian invocation use single bracket, print info JS CI: remove call to test_compiler from hadrian don't build haddock JS: mark more tests as broken Tracked in #22576 JS testsuite: don't skip sum_mod test Its expected to fail, yet we skipped it which automatically makes it succeed leading to an unexpected success, JS testsuite: don't mark T12035j as skip leads to an unexpected pass JS testsuite: remove broken on T14075 leads to unexpected pass JS testsuite: mark more tests as broken JS testsuite: mark T11760 in base as broken JS testsuite: mark ManyUnbSums broken submodules: bump process and hpc for JS tests Both submodules has needed tests skipped or marked broken for th JS backend. This commit now adds these changes to GHC. See: HPC: hpc/hpc!21 Process: https://github.com/haskell/process/pull/268 remove js_broken on now passing tests separate wasm and js backend ci test: T11760: add threaded, non-moving only_ways test: T10296a add req_c T13894: skip for JS backend tests: jspace, T22333: mark as js_broken(22573) test: T22513i mark as req_th stm submodule: mark stm055, T16707 broken for JS tests: js_broken(22374) on unpack_sums_6, T12010 dont run diff on JS CI, cleanup fixup: More CI cleanup fix: align text to master fix: align exceptions submodule to master CI: Bump DOCKER_REV Bump to ci-images commit that has a deb11 build with node. Required for !9552 testsuite: mark T22669 as js_skip See #22669 This test tests that .o-boot files aren't created when run in using the interpreter backend. Thus this is not relevant for the JS backend. testsuite: mark T22671 as broken on JS See #22835 base.testsuite: mark Chan002 fragile for JS see #22836 revert: submodule process bump bump stm submodule New hash includes skips for the JS backend. testsuite: mark RnPatternSynonymFail broken on JS Requires TH: - see !9779 - and #22261 compiler: GHC.hs ifdef import Utils.Panic.Plain
-
- Jan 11, 2023
-
-
- Remove unused mkWildEvBinder - Use typeTypeOrConstraint - more symmetric and asserts that that the type is Type or Constraint - Fix escape sequences in Python; they raise a deprecation warning with -Wdefault
-
- Dec 09, 2022
-
-
... thus fixing #22549. The details are in the refurbished and no longer dead `Note [Do not strictify a DFun's parameter dictionaries]`. There's a regression test in T22549.
-
- Nov 30, 2022
-
-
See the updated `Note [Data-con worker strictness]` and the new `Note [Demand transformer for data constructors]`. Fixes #22475.
-
- Nov 29, 2022
-
-
Literals in Core were printed as e.g. 0xFF#16 :: Int16#. The proposal 451 now specifies syntax 0xFF#Int16. This change affects the Core printer only - more to be done later. Part of #21422.
-
- Nov 11, 2022
-
-
This big patch addresses the rats-nest of issues that have plagued us for years, about the relationship between Type and Constraint. See #11715/#21623. The main payload of the patch is: * To introduce CONSTRAINT :: RuntimeRep -> Type * To make TYPE and CONSTRAINT distinct throughout the compiler Two overview Notes in GHC.Builtin.Types.Prim * Note [TYPE and CONSTRAINT] * Note [Type and Constraint are not apart] This is the main complication. The specifics * New primitive types (GHC.Builtin.Types.Prim) - CONSTRAINT - ctArrowTyCon (=>) - tcArrowTyCon (-=>) - ccArrowTyCon (==>) - funTyCon FUN -- Not new See Note [Function type constructors and FunTy] and Note [TYPE and CONSTRAINT] * GHC.Builtin.Types: - New type Constraint = CONSTRAINT LiftedRep - I also stopped nonEmptyTyCon being built-in; it only needs to be wired-in * Exploit the fact that Type and Constraint are distinct throughout GHC - Get rid of tcView in favour of coreView. - Many tcXX functions become XX functions. e.g. tcGetCastedTyVar --> getCastedTyVar * Kill off Note [ForAllTy and typechecker equality], in (old) GHC.Tc.Solver.Canonical. It said that typechecker-equality should ignore the specified/inferred distinction when comparein two ForAllTys. But that wsa only weakly supported and (worse) implies that we need a separate typechecker equality, different from core equality. No no no. * GHC.Core.TyCon: kill off FunTyCon in data TyCon. There was no need for it, and anyway now we have four of them! * GHC.Core.TyCo.Rep: add two FunTyFlags to FunCo See Note [FunCo] in that module. * GHC.Core.Type. Lots and lots of changes driven by adding CONSTRAINT. The key new function is sORTKind_maybe; most other changes are built on top of that. See also `funTyConAppTy_maybe` and `tyConAppFun_maybe`. * Fix a longstanding bug in GHC.Core.Type.typeKind, and Core Lint, in kinding ForAllTys. See new tules (FORALL1) and (FORALL2) in GHC.Core.Type. (The bug was that before (forall (cv::t1 ~# t2). blah), where blah::TYPE IntRep, would get kind (TYPE IntRep), but it should be (TYPE LiftedRep). See Note [Kinding rules for types] in GHC.Core.Type. * GHC.Core.TyCo.Compare is a new module in which we do eqType and cmpType. Of course, no tcEqType any more. * GHC.Core.TyCo.FVs. I moved some free-var-like function into this module: tyConsOfType, visVarsOfType, and occCheckExpand. Refactoring only. * GHC.Builtin.Types. Compiletely re-engineer boxingDataCon_maybe to have one for each /RuntimeRep/, rather than one for each /Type/. This dramatically widens the range of types we can auto-box. See Note [Boxing constructors] in GHC.Builtin.Types The boxing types themselves are declared in library ghc-prim:GHC.Types. GHC.Core.Make. Re-engineer the treatment of "big" tuples (mkBigCoreVarTup etc) GHC.Core.Make, so that it auto-boxes unboxed values and (crucially) types of kind Constraint. That allows the desugaring for arrows to work; it gathers up free variables (including dictionaries) into tuples. See Note [Big tuples] in GHC.Core.Make. There is still work to do here: #22336. But things are better than before. * GHC.Core.Make. We need two absent-error Ids, aBSENT_ERROR_ID for types of kind Type, and aBSENT_CONSTRAINT_ERROR_ID for vaues of kind Constraint. Ditto noInlineId vs noInlieConstraintId in GHC.Types.Id.Make; see Note [inlineId magic]. * GHC.Core.TyCo.Rep. Completely refactor the NthCo coercion. It is now called SelCo, and its fields are much more descriptive than the single Int we used to have. A great improvement. See Note [SelCo] in GHC.Core.TyCo.Rep. * GHC.Core.RoughMap.roughMatchTyConName. Collapse TYPE and CONSTRAINT to a single TyCon, so that the rough-map does not distinguish them. * GHC.Core.DataCon - Mainly just improve documentation * Some significant renamings: GHC.Core.Multiplicity: Many --> ManyTy (easier to grep for) One --> OneTy GHC.Core.TyCo.Rep TyCoBinder --> GHC.Core.Var.PiTyBinder GHC.Core.Var TyCoVarBinder --> ForAllTyBinder AnonArgFlag --> FunTyFlag ArgFlag --> ForAllTyFlag GHC.Core.TyCon TyConTyCoBinder --> TyConPiTyBinder Many functions are renamed in consequence e.g. isinvisibleArgFlag becomes isInvisibleForAllTyFlag, etc * I refactored FunTyFlag (was AnonArgFlag) into a simple, flat data type data FunTyFlag = FTF_T_T -- (->) Type -> Type | FTF_T_C -- (-=>) Type -> Constraint | FTF_C_T -- (=>) Constraint -> Type | FTF_C_C -- (==>) Constraint -> Constraint * GHC.Tc.Errors.Ppr. Some significant refactoring in the TypeEqMisMatch case of pprMismatchMsg. * I made the tyConUnique field of TyCon strict, because I saw code with lots of silly eval's. That revealed that GHC.Settings.Constants.mAX_SUM_SIZE can only be 63, because we pack the sum tag into a 6-bit field. (Lurking bug squashed.) Fixes * #21530 Updates haddock submodule slightly. Performance changes ~~~~~~~~~~~~~~~~~~~ I was worried that compile times would get worse, but after some careful profiling we are down to a geometric mean 0.1% increase in allocation (in perf/compiler). That seems fine. There is a big runtime improvement in T10359 Metric Decrease: LargeRecord MultiLayerModulesTH_OneShot T13386 T13719 Metric Increase: T8095
-
Now Budget roughly tracks the combined width of all arguments after unarisation. See the changes to `Note [Worker argument budgets]`. Fixes #21737.
-
See Note [Unboxing through unboxed tuples]. Fixes #22388.
-
- Oct 17, 2022
-
-
Previously, the demand signature we computed upfront for a DataCon wrapper lacked boxity information and was much less precise than the demand transformer for the DataCon worker. In this patch we adopt the solution to look through unfoldings of DataCon wrappers during Demand Analysis, but still attach a demand signature for other passes such as the Simplifier. See `Note [DmdAnal for DataCon wrappers]` for more details. Fixes #22241.
-
- Oct 11, 2022
-
-
We were religiously keeping exit join points throughout, which had some bad effects (#21148, #22084). This MR does two things: * Arranges that exit join points are inhibited from inlining only in /one/ Simplifier pass (right after Exitification). See Note [Be selective about not-inlining exit join points] in GHC.Core.Opt.Exitify It's not a big deal, but it shaves 0.1% off compile times. * Inline used-once non-recursive join points very aggressively Given join j x = rhs in joinrec k y = ....j x.... where this is the only occurrence of `j`, we want to inline `j`. (Unless sm_keep_exits is on.) See Note [Inline used-once non-recursive join points] in GHC.Core.Opt.Simplify.Utils This is just a tidy-up really. It doesn't change allocation, but getting rid of a binding is always good. Very effect on nofib -- some up and down.
-
We want to put implicit binds into fat interface files, so the easiest thing to do seems to be to treat them uniformly with other binders.
-
- Sep 30, 2022
-
-
Sebastian Graf authored
A small refactoring in our Core Opt pipeline and some new functions for transfering argument boxities from one signature to another to facilitate `Note [Don't change boxity without worker/wrapper]`. Fixes #21754.
-
- Sep 29, 2022
-
-
Sebastian Graf authored
Justification in #22231. Short form: In a demand like `1C1(C1(L))` it was too easy to confuse which `1` belongs to which `C`. Now that should be more obvious. Fixes #22231
-
- Sep 28, 2022
-
-
This patch fixes #21286, by not unboxing dictionaries in worker/wrapper (ever). The main payload is tiny: * In `GHC.Core.Opt.DmdAnal.finaliseArgBoxities`, do not unbox dictionaries in `get_dmd`. See Note [Do not unbox class dictionaries] in that module * I also found that imported wrappers were being fruitlessly specialised, so I fixed that too, in canSpecImport. See Note [Specialising imported functions] point (2). In doing due diligence in the testsuite I fixed a number of other things: * Improve Note [Specialising unfoldings] in GHC.Core.Unfold.Make, and Note [Inline specialisations] in GHC.Core.Opt.Specialise, and remove duplication between the two. The new Note describes how we specialise functions with an INLINABLE pragma. And simplify the defn of `spec_unf` in `GHC.Core.Opt.Specialise.specCalls`. * Improve Note [Worker/wrapper for INLINABLE functions] in GHC.Core.Opt.WorkWrap. And (critially) make an actual change which is to propagate the user-written pragma from the original function to the wrapper; see `mkStrWrapperInlinePrag`. * Write new Note [Specialising imported functions] in GHC.Core.Opt.Specialise All this has a big effect on some compile times. This is compiler/perf, showing only changes over 1%: Metrics: compile_time/bytes allocated ------------------------------------- LargeRecord(normal) -50.2% GOOD ManyConstructors(normal) +1.0% MultiLayerModulesTH_OneShot(normal) +2.6% PmSeriesG(normal) -1.1% T10547(normal) -1.2% T11195(normal) -1.2% T11276(normal) -1.0% T11303b(normal) -1.6% T11545(normal) -1.4% T11822(normal) -1.3% T12150(optasm) -1.0% T12234(optasm) -1.2% T13056(optasm) -9.3% GOOD T13253(normal) -3.8% GOOD T15164(normal) -3.6% GOOD T16190(normal) -2.1% T16577(normal) -2.8% GOOD T16875(normal) -1.6% T17836(normal) +2.2% T17977b(normal) -1.0% T18223(normal) -33.3% GOOD T18282(normal) -3.4% GOOD T18304(normal) -1.4% T18698a(normal) -1.4% GOOD T18698b(normal) -1.3% GOOD T19695(normal) -2.5% GOOD T5837(normal) -2.3% T9630(normal) -33.0% GOOD WWRec(normal) -9.7% GOOD hard_hole_fits(normal) -2.1% GOOD hie002(normal) +1.6% geo. mean -2.2% minimum -50.2% maximum +2.6% I diligently investigated some of the big drops. * Caused by not doing w/w for dictionaries: T13056, T15164, WWRec, T18223 * Caused by not fruitlessly specialising wrappers LargeRecord, T9630 For runtimes, here is perf/should+_run: Metrics: runtime/bytes allocated -------------------------------- T12990(normal) -3.8% T5205(normal) -1.3% T9203(normal) -10.7% GOOD haddock.Cabal(normal) +0.1% haddock.base(normal) -1.1% haddock.compiler(normal) -0.3% lazy-bs-alloc(normal) -0.2% ------------------------------------------ geo. mean -0.3% minimum -10.7% maximum +0.1% I did not investigate exactly what happens in T9203. Nofib is a wash: +-------------------------------++--+-----------+-----------+ | || | tsv (rel) | std. err. | +===============================++==+===========+===========+ | real/anna || | -0.13% | 0.0% | | real/fem || | +0.13% | 0.0% | | real/fulsom || | -0.16% | 0.0% | | real/lift || | -1.55% | 0.0% | | real/reptile || | -0.11% | 0.0% | | real/smallpt || | +0.51% | 0.0% | | spectral/constraints || | +0.20% | 0.0% | | spectral/dom-lt || | +1.80% | 0.0% | | spectral/expert || | +0.33% | 0.0% | +===============================++==+===========+===========+ | geom mean || | | | +-------------------------------++--+-----------+-----------+ I spent quite some time investigating dom-lt, but it's pretty complicated. See my note on !7847. Conclusion: it's just a delicate inlining interaction, and we have plenty of those. Metric Decrease: LargeRecord T13056 T13253 T15164 T16577 T18223 T18282 T18698a T18698b T19695 T9630 WWRec hard_hole_fits T9203
-
- Sep 27, 2022
-
-
Sebastian Graf authored
In #21717 we saw a reportedly unsound strictness signature due to an unsound definition of plusSubDmd on Calls. This patch contains a description and the fix to the unsoundness as outlined in `Note [Call SubDemand vs. evaluation Demand]`. This fix means we also get rid of the special handling of `-fpedantic-bottoms` in eta-reduction. Thanks to less strict and actually sound strictness results, we will no longer eta-reduce the problematic cases in the first place, even without `-fpedantic-bottoms`. So fixing the unsoundness also makes our eta-reduction code simpler with less hacks to explain. But there is another, more unfortunate side-effect: We *unfix* #21085, but fortunately we have a new fix ready: See `Note [mkCall and plusSubDmd]`. There's another change: I decided to make `Note [SubDemand denotes at least one evaluation]` a lot simpler by using `plusSubDmd` (instead of `lubPlusSubDmd`) even if both argument demands are lazy. That leads to less precise results, but in turn rids ourselves from the need for 4 different `OpMode`s and the complication of `Note [Manual specialisation of lub*Dmd/plus*Dmd]`. The result is simpler code that is in line with the paper draft on Demand Analysis. I left the abandoned idea in `Note [Unrealised opportunity in plusDmd]` for posterity. The fallout in terms of regressions is negligible, as the testsuite and NoFib shows. ``` Program Allocs Instrs -------------------------------------------------------------------------------- hidden +0.2% -0.2% linear -0.0% -0.7% -------------------------------------------------------------------------------- Min -0.0% -0.7% Max +0.2% +0.0% Geometric Mean +0.0% -0.0% ``` Fixes #21717.
-
- Sep 06, 2022
-
-
Rather conservatively return Top. See Note [Untyped demand on case-alternative binders]. I also factored `addCaseBndrDmd` into two separate functions `scrutSubDmd` and `fieldBndrDmds`. Fixes #22039.
-
- Jul 25, 2022
-
-
This patch fixes #21888, and simplifies finaliseArgBoxities by eliminating the (recently introduced) data type FinalDecision. A delicate interaction meant that this patch commit d1c25a48 Date: Tue Jul 12 16:33:46 2022 +0100 Refactor wantToUnboxArg a bit make worker/wrapper go into an infinite loop. This patch fixes it by narrowing the handling of case (B) of Note [Boxity for bottoming functions], to deal only the arguemnts that are type variables. Only then do we drop the trimBoxity call, which is what caused the bug. I also * Added documentation of case (B), which was previously completely un-mentioned. And a regression test, T21888a, to test it. * Made unboxDeeplyDmd stop at lazy demands. It's rare anyway for a bottoming function to have a lazy argument (mainly when the data type is recursive and then we don't want to unbox deeply). Plus there is Note [No lazy, Unboxed demands in demand signature] * Refactored the Case equation for dmdAnal a bit, to do less redundant pattern matching.
-
- Jun 27, 2022
-
-
We used to put OtherCon unfoldings on lambda binders of workers and sometimes also join points/specializations with with the assumption that since the wrapper would force these arguments once we execute the RHS they would indeed be in WHNF. This was wrong for reasons detailed in #21472. So now we purge evaluated unfoldings from *all* lambda binders. This fixes #21472, but at the cost of sometimes not using as efficient a calling convention. It can also change inlining behaviour as some occurances will no longer look like value arguments when they did before. As consequence we also change how we compute CBV information for arguments slightly. We now *always* determine the CBV convention for arguments during tidy. Earlier in the pipeline we merely mark functions as candidates for having their arguments treated as CBV. As before the process is described in the relevant notes: Note [CBV Function Ids] Note [Attaching CBV Marks to ids] Note [Never put `OtherCon` unfoldigns on lambda binders] ------------------------- Metric Decrease: T12425 T13035 T18223 T18223 T18923 MultiLayerModulesTH_OneShot Metric Increase: WWRec -------------------------
-
- May 30, 2022
-
-
This is a large collection of changes all relating to eta reduction, originally triggered by #18993, but there followed a long saga. Specifics: * Move state-hack stuff from GHC.Types.Id (where it never belonged) to GHC.Core.Opt.Arity (which seems much more appropriate). * Add a crucial mkCast in the Cast case of GHC.Core.Opt.Arity.eta_expand; helps with T18223 * Add clarifying notes about eta-reducing to PAPs. See Note [Do not eta reduce PAPs] * I moved tryEtaReduce from GHC.Core.Utils to GHC.Core.Opt.Arity, where it properly belongs. See Note [Eta reduce PAPs] * In GHC.Core.Opt.Simplify.Utils.tryEtaExpandRhs, pull out the code for when eta-expansion is wanted, to make wantEtaExpansion, and all that same function in GHC.Core.Opt.Simplify.simplStableUnfolding. It was previously inconsistent, but it's doing the same thing. * I did a substantial refactor of ArityType; see Note [ArityType]. This allowed me to do away with the somewhat mysterious takeOneShots; more generally it allows arityType to describe the function, leaving its clients to decide how to use that information. I made ArityType abstract, so that clients have to use functions to access it. * Make GHC.Core.Opt.Simplify.Utils.rebuildLam (was stupidly called mkLam before) aware of the floats that the simplifier builds up, so that it can still do eta-reduction even if there are some floats. (Previously that would not happen.) That means passing the floats to rebuildLam, and an extra check when eta-reducting (etaFloatOk). * In GHC.Core.Opt.Simplify.Utils.tryEtaExpandRhs, make use of call-info in the idDemandInfo of the binder, as well as the CallArity info. The occurrence analyser did this but we were failing to take advantage here. In the end I moved the heavy lifting to GHC.Core.Opt.Arity.findRhsArity; see Note [Combining arityType with demand info], and functions idDemandOneShots and combineWithDemandOneShots. (These changes partly drove my refactoring of ArityType.) * In GHC.Core.Opt.Arity.findRhsArity * I'm now taking account of the demand on the binder to give extra one-shot info. E.g. if the fn is always called with two args, we can give better one-shot info on the binders than if we just look at the RHS. * Don't do any fixpointing in the non-recursive case -- simple short cut. * Trim arity inside the loop. See Note [Trim arity inside the loop] * Make SimpleOpt respect the eta-reduction flag (Some associated refactoring here.) * I made the CallCtxt which the Simplifier uses distinguish between recursive and non-recursive right-hand sides. data CallCtxt = ... | RhsCtxt RecFlag | ... It affects only one thing: - We call an RHS context interesting only if it is non-recursive see Note [RHS of lets] in GHC.Core.Unfold * Remove eta-reduction in GHC.CoreToStg.Prep, a welcome simplification. See Note [No eta reduction needed in rhsToBody] in GHC.CoreToStg.Prep. Other incidental changes * Fix a fairly long-standing outright bug in the ApplyToVal case of GHC.Core.Opt.Simplify.mkDupableContWithDmds. I was failing to take the tail of 'dmds' in the recursive call, which meant the demands were All Wrong. I have no idea why this has not caused problems before now. * Delete dead function GHC.Core.Opt.Simplify.Utils.contIsRhsOrArg Metrics: compile_time/bytes allocated Test Metric Baseline New value Change --------------------------------------------------------------------------------------- MultiLayerModulesTH_OneShot(normal) ghc/alloc 2,743,297,692 2,619,762,992 -4.5% GOOD T18223(normal) ghc/alloc 1,103,161,360 972,415,992 -11.9% GOOD T3064(normal) ghc/alloc 201,222,500 184,085,360 -8.5% GOOD T8095(normal) ghc/alloc 3,216,292,528 3,254,416,960 +1.2% T9630(normal) ghc/alloc 1,514,131,032 1,557,719,312 +2.9% BAD parsing001(normal) ghc/alloc 530,409,812 525,077,696 -1.0% geo. mean -0.1% Nofib: Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- banner +0.0% +0.4% -8.9% -8.7% 0.0% exact-reals +0.0% -7.4% -36.3% -37.4% 0.0% fannkuch-redux +0.0% -0.1% -1.0% -1.0% 0.0% fft2 -0.1% -0.2% -17.8% -19.2% 0.0% fluid +0.0% -1.3% -2.1% -2.1% 0.0% gg -0.0% +2.2% -0.2% -0.1% 0.0% spectral-norm +0.1% -0.2% 0.0% 0.0% 0.0% tak +0.0% -0.3% -9.8% -9.8% 0.0% x2n1 +0.0% -0.2% -3.2% -3.2% 0.0% -------------------------------------------------------------------------------- Min -3.5% -7.4% -58.7% -59.9% 0.0% Max +0.1% +2.2% +32.9% +32.9% 0.0% Geometric Mean -0.0% -0.1% -14.2% -14.8% -0.0% Metric Decrease: MultiLayerModulesTH_OneShot T18223 T3064 T15185 T14766 Metric Increase: T9630
-
- May 03, 2022
-
-
Sebastian Graf authored
See the new `Note [SubDemand denotes at least one evaluation]`. A demand `n :* sd` on a let binder `x=e` now means > "`x` was evaluated `n` times and in any program trace it is evaluated, `e` is > evaluated deeply in sub-demand `sd`." The "any time it is evaluated" premise is what this patch adds. As a result, we get better nested strictness. For example (T21081) ```hs f :: (Bool, Bool) -> (Bool, Bool) f pr = (case pr of (a,b) -> a /= b, True) -- before: <MP(L,L)> -- after: <MP(SL,SL)> g :: Int -> (Bool, Bool) g x = let y = let z = odd x in (z,z) in f y ``` The change in demand signature "before" to "after" allows us to case-bind `z` here. Similarly good things happen for the `sd` in call sub-demands `Cn(sd)`, which allows for more eta-reduction (which is only sound with `-fno-pedantic-bottoms`, albeit). We also fix #21085, a surprising inconsistency with `Poly` to `Call` sub-demand expansion. In an attempt to fix a regression caused by less inlining due to eta-reduction in T15426, I eta-expanded the definition of `elemIndex` and `elemIndices`, thus fixing #21345 on the go. The main point of this patch is that it fixes #21081 and #21133. Annoyingly, I discovered that more precise demand signatures for join points can transform a program into a lazier program if that join point gets floated to the top-level, see #21392. There is no simple fix at the moment, but !5349 might. Thus, we accept a ~5% regression in `MultiLayerModulesTH_OneShot`, where #21392 bites us in `addListToUniqDSet`. T21392 reliably reproduces the issue. Surprisingly, ghc/alloc perf on Windows improves much more than on other jobs, by 0.4% in the geometric mean and by 2% in T16875. Metric Increase: MultiLayerModulesTH_OneShot Metric Decrease: T16875
-
- Apr 14, 2022
-
-
This will mean T9208 when run with lint will return a lint error instead of resulting in a panic. Fixes #21117
-
- Apr 09, 2022
-
-
previously, GHC had the "let/app-invariant" which said that the RHS of a let or the argument of an application must be of lifted type or ok for speculation. We want this on let to freely float them around, and we wanted that on app to freely convert between the two (e.g. in beta-reduction or inlining). However, the app invariant meant that simple code didn't stay simple and this got in the way of rules matching. By removing the app invariant, this thus fixes #20554. The new invariant is now called "let-can-float invariant", which is hopefully easier to guess its meaning correctly. Dropping the app invariant means that everywhere where we effectively do beta-reduction (in the two simplifiers, but also in `exprIsConApp_maybe` and other innocent looking places) we now have to check if the argument must be evaluated (unlifted and side-effecting), and analyses have to be adjusted to the new semantics of `App`. Also, `LetFloats` in the simplifier can now also carry such non-floating bindings. The fix for DmdAnal, refine by Sebastian, makes functions with unlifted arguments strict in these arguments, which changes some signatures. This causes some extra calls to `exprType` and `exprOkForSpeculation`, so some perf benchmarks regress a bit (while others improve). Metric Decrease: T9020 Metric Increase: LargeRecord T12545 T15164 T16577 T18223 T5642 T9961 Co-authored-by:
Sebastian Graf <sebastian.graf@kit.edu>
-
- Apr 06, 2022
-
-
Once we are done parsing the header of a module to obtain the options, we look through the rest of the tokens in order to determine if they contain any misplaced file header pragmas that would usually be ignored, potentially resulting in bad error messages. The warnings are reported immediately so that later errors don't shadow over potentially helpful warnings. Metric Increase: T13719
-
- Apr 01, 2022
-
-
Users are supposed to import GHC.Exts rather than GHC.Prim. Part of #18749.
-
- Mar 24, 2022
-
-
These dependencies would affect the demand signature depending on various rules and so on. Fixes #21271
-
Partial FUN apps like `(->) Bool` aren't detected by `splitFunTy_maybe`. A silly oversight that is easily fixed by replacing `splitFunTy_maybe` with a guard in the `splitTyConApp_maybe` case. But fortunately, Simon nudged me into rewriting the whole `isRecDataCon` function in a way that makes it much shorter and hence clearer which DataCons are actually considered as recursive. Fixes #21265.
-
- Mar 16, 2022
-
-
Previously, we let `Unboxed` win in `lubBoxity`, which is unsoundly optimistic in terms ob Boxity analysis. "Unsoundly" in the sense that we sometimes unbox parameters that we better shouldn't unbox. Examples are #18907 and T19871.absent. Until now, we thought that this hack pulled its weight becuase it worked around some shortcomings of the phase separation between Boxity analysis and CPR analysis. But it is a gross hack which caused regressions itself that needed all kinds of fixes and workarounds. See for example #20767. It became impossible to work with in !7599, so I want to remove it. For example, at the moment, `lubDmd B dmd` will not unbox `dmd`, but `lubDmd A dmd` will. Given that `B` is supposed to be the bottom element of the lattice, it's hardly justifiable to get a better demand when `lub`bing with `A`. The consequence of letting `Boxed` win in `lubBoxity` is that we *would* regress #2387, #16040 and parts of #5075 and T19871.sumIO, until Boxity and CPR are able to communicate better. Fortunately, that is not the case since I could tweak the other source of optimism in Boxity analysis that is described in `Note [Unboxed demand on function bodies returning small products]` so that we *recursively* assume unboxed demands on function bodies returning small products. See the updated Note. `Note [Boxity for bottoming functions]` describes why we need bottoming functions to have signatures that say that they deeply unbox their arguments. In so doing, I had to tweak `finaliseArgBoxities` so that it will never unbox recursive data constructors. This is in line with our handling of them in CPR. I updated `Note [Which types are unboxed?]` to reflect that. In turn we fix #21119, #20767, #18907, T19871.absent and get a much simpler implementation (at least to think about). We can also drop the very ad-hoc definition of `deferAfterPreciseException` and its Note in favor of the simple, intuitive definition we used to have. Metric Decrease: T16875 T18223 T18698a T18698b hard_hole_fits Metric Increase: LargeRecord MultiComponentModulesRecomp T15703 T8095 T9872d Out of all the regresions, only the one in T9872d doesn't vanish in a perf build, where the compiler is bootstrapped with -O2 and thus SpecConstr. Reason for regressions: * T9872d is due to `ty_co_subst` taking its `LiftingContext` boxed. That is because the context is passed to a function argument, for example in `liftCoSubstTyVarBndrUsing`. * In T15703, LargeRecord and T8095, we get a bit more allocations in `expand_syn` and `piResultTys`, because a `TCvSubst` isn't unboxed. In both cases that guards against reboxing in some code paths. * The same is true for MultiComponentModulesRecomp, where we get less unboxing in `GHC.Unit.Finder.$wfindInstalledHomeModule`. In a perf build, allocations actually *improve* by over 4%! Results on NoFib: -------------------------------------------------------------------------------- Program Allocs Instrs -------------------------------------------------------------------------------- awards -0.4% +0.3% cacheprof -0.3% +2.4% fft -1.5% -5.1% fibheaps +1.2% +0.8% fluid -0.3% -0.1% ida +0.4% +0.9% k-nucleotide +0.4% -0.1% last-piece +10.5% +13.9% lift -4.4% +3.5% mandel2 -99.7% -99.8% mate -0.4% +3.6% parser -1.0% +0.1% puzzle -11.6% +6.5% reverse-complem -3.0% +2.0% scs -0.5% +0.1% sphere -0.4% -0.2% wave4main -8.2% -0.3% -------------------------------------------------------------------------------- Summary excludes mandel2 because of excessive bias Min -11.6% -5.1% Max +10.5% +13.9% Geometric Mean -0.2% +0.3% -------------------------------------------------------------------------------- Not bad for a bug fix. The regression in `last-piece` could become a win if SpecConstr would work on non-recursive functions. The regression in `fibheaps` is due to `Note [Reboxed crud for bottoming calls]`, e.g., #21128.
-
- Mar 13, 2022
-
-
Issue #21150 shows that worker/wrapper allocated a worker function for a function with multiple calls that said "called at most once" when the first argument was absent. That's bad! This patch makes it so that WW preserves at least one non-one-shot value lambda (see `Note [Preserving float barriers]`) by passing around `void#` in place of absent arguments. Fixes #21150. Since the fix is pretty similar to `Note [Protecting the last value argument]`, I put the logic in `mkWorkerArgs`. There I realised (#21204) that `-ffun-to-thunk` is basically useless with `-ffull-laziness`, so I deprecated the flag, simplified and split into `needsVoidWorkerArg`/`addVoidWorkerArg`. SpecConstr is another client of that API. Fixes #21204. Metric Decrease: T14683
-
- Mar 02, 2022
-
-
Don't instantiate type variables for :type in `GHC.Tc.Gen.App.tcInstFun`, to avoid inconsistently instantianting `r1` but not `r2` in the type forall {r1} (a :: TYPE r1) {r2} (b :: TYPE r2). ... This fixes #21088. This patch also changes the primop pretty-printer to ensure that we put all the inferred type variables first. For example, the type of reallyUnsafePtrEquality# is now forall {l :: Levity} {k :: Levity} (a :: TYPE (BoxedRep l)) (b :: TYPE (BoxedRep k)). a -> b -> Int# This means we avoid running into issue #21088 entirely with the types of primops. Users can still write a type signature where the inferred type variables don't come first, however. This change to primops had a knock-on consequence, revealing that we were sometimes performing eta reduction on keepAlive#. This patch updates tryEtaReduce to avoid eta reducing functions with no binding, bringing it in line with tryEtaReducePrep, and thus fixing #21090.
-
- Feb 12, 2022
-
-
This does three major things: * Enforce the invariant that all strict fields must contain tagged pointers. * Try to predict the tag on bindings in order to omit tag checks. * Allows functions to pass arguments unlifted (call-by-value). The former is "simply" achieved by wrapping any constructor allocations with a case which will evaluate the respective strict bindings. The prediction is done by a new data flow analysis based on the STG representation of a program. This also helps us to avoid generating redudant cases for the above invariant. StrictWorkers are created by W/W directly and SpecConstr indirectly. See the Note [Strict Worker Ids] Other minor changes: * Add StgUtil module containing a few functions needed by, but not specific to the tag analysis. ------------------------- Metric Decrease: T12545 T18698b T18140 T18923 LargeRecord Metric Increase: LargeRecord ManyAlternatives ManyConstructors T10421 T12425 T12707 T13035 T13056 T13253 T13253-spj T13379 T15164 T18282 T18304 T18698a T1969 T20049 T3294 T4801 T5321FD T5321Fun T783 T9233 T9675 T9961 T19695 WWRec -------------------------
-
- Feb 03, 2022
-
-
This patch implements a fix for #20817. It ensures that * The final strictness signature for a function accurately reflects the unboxing done by the wrapper See Note [Finalising boxity for demand signatures] and Note [Finalising boxity for let-bound Ids] * A much better "layer-at-a-time" implementation of the budget for how many worker arguments we can have See Note [Worker argument budget] Generally this leads to a bit more worker/wrapper generation, because instead of aborting entirely if the budget is exceeded (and then lying about boxity), we unbox a bit. Binary sizes in increase slightly (around 1.8%) because of the increase in worker/wrapper generation. The big effects are to GHC.Ix, GHC.Show, GHC.IO.Handle.Internals. If we did a better job of dropping dead code, this effect might go away. Some nofib perf improvements: Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- VSD +1.8% -0.5% 0.017 0.017 0.0% awards +1.8% -0.1% +2.3% +2.3% 0.0% banner +1.7% -0.2% +0.3% +0.3% 0.0% bspt +1.8% -0.1% +3.1% +3.1% 0.0% eliza +1.8% -0.1% +1.2% +1.2% 0.0% expert +1.7% -0.1% +9.6% +9.6% 0.0% fannkuch-redux +1.8% -0.4% -9.3% -9.3% 0.0% kahan +1.8% -0.1% +22.7% +22.7% 0.0% maillist +1.8% -0.9% +21.2% +21.6% 0.0% nucleic2 +1.7% -5.1% +7.5% +7.6% 0.0% pretty +1.8% -0.2% 0.000 0.000 0.0% reverse-complem +1.8% -2.5% +12.2% +12.2% 0.0% rfib +1.8% -0.2% +2.5% +2.5% 0.0% scc +1.8% -0.4% 0.000 0.000 0.0% simple +1.7% -1.3% +17.0% +17.0% +7.4% spectral-norm +1.8% -0.1% +6.8% +6.7% 0.0% sphere +1.7% -2.0% +13.3% +13.3% 0.0% tak +1.8% -0.2% +3.3% +3.3% 0.0% x2n1 +1.8% -0.4% +8.1% +8.1% 0.0% -------------------------------------------------------------------------------- Min +1.1% -5.1% -23.6% -23.6% 0.0% Max +1.8% +0.0% +36.2% +36.2% +7.4% Geometric Mean +1.7% -0.1% +6.8% +6.8% +0.1% Compiler allocations in CI have a geometric mean of +0.1%; many small decreases but there are three bigger increases (7%), all because we do more worker/wrapper than before, so there is simply more code to compile. That's OK. Perf benchmarks in perf/should_run improve in allocation by a geo mean of -0.2%, which is good. None get worse. T12996 improves by -5.8% Metric Decrease: T12996 Metric Increase: T18282 T18923 T9630
-
- Feb 01, 2022
-
-