- Feb 10, 2025
-
-
Closes #25693.
-
- Feb 08, 2025
-
-
In particular, use `NonEmpty` where appropriate:
- the argument of `FieldLabelString`
- the argument of `HsMultiIf`
- `grhssGRHSs`

Decreases overall compile-time allocation by about 0.1% in the benchmark suite (min -0.8%, max +0.3%).

Metric Decrease: T3294
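As a rough illustration of why `NonEmpty` helps here, the sketch below contrasts a list of guarded alternatives with a `NonEmpty` one. The `Alt` type and the function names are hypothetical stand-ins, not GHC's actual AST types.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Hypothetical stand-in for a guarded alternative: a guard and a result.
data Alt = Alt { altGuard :: Bool, altResult :: String }

-- With a plain list, the empty case must be handled (or panicked on).
firstAltList :: [Alt] -> Maybe String
firstAltList []      = Nothing
firstAltList (a : _) = Just (altResult a)

-- With NonEmpty, the same operation is total and needs no fallback.
firstAltNE :: NonEmpty Alt -> String
firstAltNE = altResult . NE.head

-- Pick the first alternative whose guard holds, if any.
-- NE.filter returns an ordinary list, since every guard may fail.
selectAlt :: NonEmpty Alt -> Maybe String
selectAlt alts = case NE.filter altGuard alts of
  []      -> Nothing
  (a : _) -> Just (altResult a)
```

The type records the invariant that at least one alternative exists, so consumers no longer need an unreachable empty-list branch.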
-
-
This cleans up a number of stylistic inconsistencies although it's still far from perfect.
-
Addresses #25452. Addresses core-libraries-committee#305.
-
Addresses part of #25452. Addresses core-libraries-committee#305.
-
Addresses part of #25452. Addresses core-libraries-committee#305.
-
Addresses part of #25452. Addresses core-libraries-committee#305.
-
Addresses part of #25452. Addresses core-libraries-committee#305.
-
Addresses part of #25452. Addresses core-libraries-committee#305.
-
- Feb 06, 2025
-
-
Profiles showed that about 0.2s was being spent constructing the keys before looking up values in the old symbol cache. The performance of this codepath is critical as it translates directly to a delay when a user evaluates a function like `main` in the interpreter. Therefore we implement a solution which keys the cache(s) by `Name` rather than by the symbol directly, so the cache can be consulted before the symbol is constructed.

Fixes #25731
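The idea can be sketched as follows. This is a minimal model, not the linker's code: `Name`, `Address`, `mkSymbol` and `lookupAddr` are hypothetical stand-ins, and the cache is assumed to be a plain `Map`.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins; GHC's real Name and address types are richer.
newtype Name = Name Int deriving (Eq, Ord, Show)
newtype Address = Address Int deriving (Eq, Show)

type SymbolCache = Map.Map Name Address

-- The step assumed to be expensive: building the symbol string.
mkSymbol :: Name -> String
mkSymbol (Name n) = "sym_" ++ show n

-- Consult the cache by Name first; build the symbol only on a miss.
lookupAddr
  :: (String -> Address)   -- resolver used on a cache miss
  -> Name
  -> SymbolCache
  -> (Address, SymbolCache)
lookupAddr resolve name cache =
  case Map.lookup name cache of
    Just addr -> (addr, cache)
    Nothing   ->
      let addr = resolve (mkSymbol name)
      in (addr, Map.insert name addr cache)
```

On a hit, neither `mkSymbol` nor the resolver runs, which is the saving the commit message describes.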
-
The profiling code had slightly bitrotted since the last time it was used. This fixes it so that toggling the INTERP_STATS macro works again and prints out the stats. Fixes #25695
-
`nameToCLabel` is called from `lookupHsSymbol` many times during bytecode linking. We can save a lot of allocations and time by directly manipulating the bytestrings rather than going via intermediate lists.

Before: 2GB allocation, 1.11s
After: 260MB allocation, 375ms

Fixes #25719

Metric Decrease: MultiLayerModulesTH_OneShot
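A small sketch of the difference, assuming a hypothetical label layout of `<unit>_<module>_<name>_<suffix>` (the real `nameToCLabel` also handles encoding details not modelled here):

```haskell
import qualified Data.ByteString.Char8 as BS

-- Via intermediate lists: every (++) allocates list cells that are
-- thrown away as soon as the result is packed.
labelViaString :: String -> String -> String -> String -> BS.ByteString
labelViaString unit modl name suffix =
  BS.pack (unit ++ "_" ++ modl ++ "_" ++ name ++ "_" ++ suffix)

-- Directly on bytestrings: the pieces are copied once into the result.
labelViaBytes
  :: BS.ByteString -> BS.ByteString -> BS.ByteString -> BS.ByteString
  -> BS.ByteString
labelViaBytes unit modl name suffix =
  BS.intercalate (BS.singleton '_') [unit, modl, name, suffix]
```

Both functions produce the same bytes; only the second avoids the `String` detour.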
-
`genericLength` is a recursive function and marked NOINLINE. It is not going to specialise. In profiles, it can be seen that 3% of total compilation time when computing bytecode is spent calling this non-specialised function. In addition, we can simplify `addListToSS` to avoid traversing the input list twice and also allocating an intermediate list (after the call to reverse). Overall these changes reduce the time spent in 'assembleBCOs' from 5.61s to 3.88s. Allocations drop from 8GB to 5.3GB.

Fixes #25706
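The shape of the `addListToSS` simplification can be sketched like this. The `SizedSeq` type below is a hypothetical model (a count plus the elements in reverse order), and the two function names are illustrative only.

```haskell
import Data.List (foldl', genericLength)

-- Hypothetical sized sequence: element count plus elements in reverse order.
data SizedSeq a = SizedSeq !Word [a] deriving (Eq, Show)

-- Two traversals (genericLength, reverse) plus an intermediate reversed list.
addListSlow :: SizedSeq a -> [a] -> SizedSeq a
addListSlow (SizedSeq n rev) xs =
  SizedSeq (n + genericLength xs) (reverse xs ++ rev)

-- One strict left fold: counts and conses in a single pass.
addListFast :: SizedSeq a -> [a] -> SizedSeq a
addListFast = foldl' step
  where
    step (SizedSeq n rev) x = SizedSeq (n + 1) (x : rev)
```

Both versions build the same value; the fold simply does the counting and the reversal in the same pass.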
-
- Feb 04, 2025
-
-
This patch removes the unused assembleOneBCO function from the bytecode assembler.
-
This is a moniker used for later 32-bit x86 implementations (Pentium Pro and later). Fixes #25691.
-
-
Ben Gamari authored
test-primops depends upon the existence of validate jobs, yet these do not exist in the context of nightly jobs, which .full-ci includes.
-
(cherry picked from commit afec4b75c2d0e9f5c462a86d9f3697acf30355c7)
Co-authored-by: Ben Gamari <bgamari.foss@gmail.com>
-
Currently the BCO_Name instruction is a bit difficult to use since the names are not qualified by the module they come from. When you have a very generic name such as "wildX4", it becomes impossible to work out which module the identifier comes from. Fixes #25694
-
- Feb 03, 2025
-
-
Cabal does not know about the different ABIs for powerpc64 and compiles StgCRunAsm.S unconditionally. The old make-based build system excluded this file from the build and it was OK to signal an error when it was compiled accidentally. With this patch we compile StgCRunAsm.S to an empty file, which fixes the build. Fixes #25700
-
The FastString table is shared between the boot compiler and interpreted compiler. Therefore it's very important the representation of `FastString` matches in both cases. Otherwise, the interpreter will read a FastString from the shared variable but place the fields in the wrong place which leads to segfaults. Ideally this state would not be shared, but for now we can always compile both with `-O2` and this leads to a working interpreter.
-
- Jan 30, 2025
-
- Jan 29, 2025
-
-
SLIDE x 0 is a no-op as it means to shift x elements of the stack by no spaces. In the interpreter, this results in a loop which copies an array element into the same place. I have instrumented GHCi to count how many of these instructions are interpreted. The workload was `ghc` compiling two simple modules.

Total no-op slides: 7793476
Total slides: 11413289
Percentage useless (of slides): 68%
Percentage useless (of total instructions): 9%
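To see why such a slide is a no-op, here is a toy model. The instruction type, the list-based stack and `dropNoopSlides` are assumptions made for illustration; the real interpreter works on a mutable stack in C.

```haskell
-- Hypothetical miniature instruction set; only SLIDE and a push are modelled.
data Instr = Slide Int Int | Push Int deriving (Eq, Show)

-- Stack as a list with the top at the head.
-- Slide n by keeps the top n entries and drops the `by` entries beneath them.
step :: [Int] -> Instr -> [Int]
step stk (Slide n by) = take n stk ++ drop (n + by) stk
step stk (Push x)     = x : stk

-- Remove slides that cannot change the stack (by == 0).
dropNoopSlides :: [Instr] -> [Instr]
dropNoopSlides = filter (not . isNoop)
  where
    isNoop (Slide _ 0) = True
    isNoop _           = False

-- Run a program from an empty stack.
run :: [Instr] -> [Int]
run = foldl step []
```

With `by == 0`, `step` reduces to `take n stk ++ drop n stk`, which is the original stack, so filtering these instructions out leaves the result unchanged.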
-
Closes #25654.
-
We currently do not support top-level unlifted data constructor applications, therefore this is a safe assertion. Pointed out by @sheaf.
-
Previously we assumed that all unlifted types were `Addr#` but this isn't true. As noted in #25638, unlifted nullary data constructor workers can also appear at the top-level and are obviously not of type `Addr#`. Note that there is more work to be done to properly handle unlifted data constructors (especially nullary; see #25636). However, this is a small step in the right direction. Closes #25641.
-
Fixes #25615.
-
-
-
At some sites, we merely panic if the `[]` or `Maybe` is empty when we convert to `NonEmpty` or `Identity`, but at least now we make it explicit. At other sites, we are able to use more precise types and avoid the partiality altogether. To do so, we redefine various functions to operate over `Traversable` arguments, so we can use the appropriate shape where known.
-
- Match the vector element list only once in `shuffleInstructions`.
- Define `isSuitableFloatingPointLit_maybe` which returns `Just` the width if the lit is indeed suitable.
-
We do so by introducing `mkLitNumberWrap'` whose ultimate codomain is `Integer` rather than `Literal`, and then use that rather than `mkLitNumberWrap` where we just need the number rather than the `Literal`.
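The split between a number-returning helper and a literal-returning wrapper might look roughly like the following. All names and the fixed-width wrapping rule are hypothetical; GHC's `mkLitNumberWrap'` takes different arguments.

```haskell
-- Hypothetical literal type: a bit width and a value.
data Literal = LitNumber Int Integer deriving (Eq, Show)

-- Wrap an Integer into the signed two's-complement range of the given
-- bit width. The result is a plain Integer, not a Literal.
wrapSigned :: Int -> Integer -> Integer
wrapSigned bits i =
  let m = 2 ^ bits
      r = i `mod` m
  in if r >= m `div` 2 then r - m else r

-- The Literal-producing variant is a thin wrapper over the numeric one,
-- so callers that only need the number never have to unpack a Literal.
mkLitWrap :: Int -> Integer -> Literal
mkLitWrap bits = LitNumber bits . wrapSigned bits
```

Callers that only need the wrapped number use `wrapSigned` and avoid a pattern match on `Literal` that could otherwise fail.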
-
We do so by changing the type of `BlockContext` to statically (in GHC) exclude the possibility of Cmm statics, and using `NonEmpty` lists of `BlockContext`s in `cmmDebugGen`.
-
Make the list of available names `Infinite`, to avoid panicking on the (now impossible) empty list case.
-
Make the list of variables to use in generated code `Infinite`, to avoid panicking on the (now impossible) empty list case.
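A minimal sketch of the idea behind an `Infinite` supply, assuming a hand-rolled type rather than GHC's own module; `freshNames` and `pickFresh` are hypothetical.

```haskell
-- Hypothetical minimal infinite list: there is no empty constructor.
data Infinite a = Inf a (Infinite a)

instance Functor Infinite where
  fmap f (Inf x xs) = Inf (f x) (fmap f xs)

iterateInf :: (a -> a) -> a -> Infinite a
iterateInf f x = Inf x (iterateInf f (f x))

-- Take a finite prefix as an ordinary list.
takeInf :: Int -> Infinite a -> [a]
takeInf n (Inf x xs)
  | n <= 0    = []
  | otherwise = x : takeInf (n - 1) xs

-- Supply of candidate variable names: x0, x1, x2, ...
freshNames :: Infinite String
freshNames = fmap (\i -> "x" ++ show i) (iterateInf (+ 1) (0 :: Int))

-- First name not already in use. No empty-supply case exists, so no
-- panic is needed (termination assumes `used` is finite).
pickFresh :: [String] -> Infinite String -> String
pickFresh used (Inf n ns)
  | n `elem` used = pickFresh used ns
  | otherwise     = n
```

Because the type has a single constructor, every pattern match on the supply is complete by construction.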
-
Also drop the losing `instance MonadFail UniqSM`. We redefine `getUniquesM` in terms of `Infinite` rather than `[]`, and define another method `getUniqueListM` for the use sites where we actually want a `[]`. Thus, at many sites, we can avoid the partiality of the empty list case. We also define `withUniques`, `withUniquesM`, and `withUniquesM'`, which traverse an arbitrary `Traversable` structure and introduce a `Unique` for each element. This allows us to redefine various functions to operate on more appropriate types than `[]` and avoid further partiality (in the form of incomplete-uni-patterns).
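The shape-preserving traversal can be illustrated with a pure counter in place of `UniqSM`. `Unique` here is a hypothetical newtype and `withUniquesFrom` is an illustrative name, not the function added by the commit.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import Data.Traversable (mapAccumL)

-- Hypothetical stand-in for GHC's Unique.
newtype Unique = Unique Int deriving (Eq, Show)

-- Pair every element with a fresh Unique, preserving the container's
-- shape. Returns the next unused counter value alongside the result.
withUniquesFrom :: Traversable t => Int -> t a -> (Int, t (Unique, a))
withUniquesFrom = mapAccumL (\n x -> (n + 1, (Unique n, x)))

-- A NonEmpty input yields a NonEmpty output, so no empty case arises.
example :: NonEmpty (Unique, Char)
example = snd (withUniquesFrom 0 ('a' :| "bc"))
```

Since the result has the same shape as the input, a caller passing a `NonEmpty` or a pair gets back exactly as many uniques as it needs, without an incomplete pattern match on a list.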
-
We now have `cloneBndrs` and `cloneRecIdBndrs`, which take a `UniqSupply` argument, and `cloneBndrsM` and `cloneRecIdBndrsM`, which instead have a `MonadUnique` constraint.
-