- May 20, 2024
-
-
This new performance test has the purpose of detecting regressions in complexity in relation to the number of modules in a project, so 1% deviation is way too small to avoid false positives.
-
Fixes #24776
-
Remove unused epAnnAnns function various cases for showAstData that no longer exist
-
- May 19, 2024
-
-
Serge S. Gulin authored
Added trivial size performance test which involves unicode general category usage via `read`. The `read` itself uses general category to detect spaces. The purpose for this test is to measure outcome of applying improvements at General Category representation in code discussed at #24789.
-
-
- May 18, 2024
-
-
Jade authored
-
-
-
- May 17, 2024
-
-
Ben Gamari authored
-
-
-
Ben Gamari authored
git-subtree-dir: utils/haddock git-subtree-mainline: 7eb9f184 git-subtree-split: a7dcf13b
-
Ben Gamari authored
Using previously-added configuration and `fourmolu -i .` Note that we exclude the test-cases (`./{hoogle,html-hypsrc,latex}-test`) as they are sensitive to formatting.
-
Ben Gamari authored
Previously the Makefile was present only for GHC's old make-based build system. Now since the make-based build system is gone we can use it for more useful ends.
-
Ben Gamari authored
-
Ben Gamari authored
-
Although we do not really need it in the interface file serialisation, as the deserialisation uses `getWithUserData`, we need to mirror the structure `getWithUserData` expects. Thus, we write essentially an empty `IfaceType` table at the end of the file, as the interface file doesn't reference `IfaceType`. (cherry picked from commit c9bc29c6a708483d2abc3d8ec9262510ce87ca61)
-
(cherry picked from commit a711607e29b925f3d69e27c5fde4ba655c711ff1)
-
Ben Gamari authored
In preparation for merge into the GHC, as proposed in #23178.
-
Ticket #24806 showed that we also need to treat dead end thunks as tagged during the analysis.
-
Closes #24759 Background. In MR !12372 we began tracking shared object files and directories sizes for dependencies. However, this broke release builds because release builds alter the filenames swapping "in-place" for a hash. This was not considered in the MR and thus broke release pipelines. Furthermore, the rts_so test was found to be wildly varying and was therefore disabled in !12561. This commit fixes both of these issues: - fix the rts_so test by making the regex less general, now the rts_so test and all other foo.so tests must match "libHS<some-lib>-<version>-<hash|'in-place>-<ghc>". This prevents the rts_so test from accidentally matching different rts variants such as rts_threaded, which was the cause of the wild swings after !12372. - add logic to match either a hash or the string in-place. This should make the find_so function build agnostic.
-
-
Fixes #24815
-
This patch fixes I/O manager compilation errors for win32 target discovered when cross-compiling to win32 using recent clang: ``` rts/win32/ThrIOManager.c:117:7: error: error: call to undeclared function 'is_io_mng_native_p'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 117 | if (is_io_mng_native_p ()) { | ^ | 117 | if (is_io_mng_native_p ()) { | ^ 1 error generated. `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1) rts/fs.c:143:28: error: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes] 143 | int setErrNoFromWin32Error () { | ^ | void | 143 | int setErrNoFromWin32Error () { | ^ 1 error generated. `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1) rts/win32/ConsoleHandler.c:227:9: error: error: call to undeclared function 'interruptIOManagerEvent'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 227 | interruptIOManagerEvent (); | ^ | 227 | interruptIOManagerEvent (); | ^ rts/win32/ConsoleHandler.c:227:9: error: note: did you mean 'getIOManagerEvent'? | 227 | interruptIOManagerEvent (); | ^ rts/include/rts/IOInterface.h:27:10: error: note: 'getIOManagerEvent' declared here 27 | void * getIOManagerEvent (void); | ^ | 27 | void * getIOManagerEvent (void); | ^ 1 error generated. `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1) rts/win32/ConsoleHandler.c:196:9: error: error: call to undeclared function 'setThreadLabel'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 196 | setThreadLabel(cap, t, "signal handler thread"); | ^ | 196 | setThreadLabel(cap, t, "signal handler thread"); | ^ rts/win32/ConsoleHandler.c:196:9: error: note: did you mean 'postThreadLabel'? | 196 | setThreadLabel(cap, t, "signal handler thread"); | ^ rts/eventlog/EventLog.h:118:6: error: note: 'postThreadLabel' declared here 118 | void postThreadLabel(Capability *cap, | ^ | 118 | void postThreadLabel(Capability *cap, | ^ 1 error generated. `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1) ```
-
Non-moving segments are 8 blocks long and need to be aligned. Previously we serviced allocations by grabbing 15 blocks, finding an aligned 8 block group in it and returning the rest. This proved to lead to high levels of fragmentation as a de-allocating a segment caused an 8 block gap to form, and this could not be reused for allocation. This patch introduces a segment allocator based around using entire megablocks to service segment allocations in bulk. When there are no free segments, we grab an entire megablock and fill it with aligned segments. As the megablock is free, we can easily guarantee alignment. Any unused segments are placed on a free list. It only makes sense to free segments in bulk when all of the segments in a megablock are freeable. After sweeping, we grab the free list, sort it, and find all groups of segments where they cover the megablock and free them. This introduces a period of time when free segments are not available to the mutator, but the risk that this would lead to excessive allocation is low. Right after sweep, we should have an abundance of partially full segments, and this pruning step is relatively quick. In implementing this we drop the logic that kept NONMOVING_MAX_FREE segments on the free list. We also introduce an eventlog event to log the amount of pruned/retained free segments. See Note [Segment allocation strategy] Resolves #24150 ------------------------- Metric Decrease: T13253 T19695 -------------------------
-
This commit fixes a small an oversight in !12148: the prefetch logic in non-moving GC may trap in debug RTS because it calls Bdescr() for mark_closure which may be a static one. It's fine in non-debug RTS because even invalid bdescr addresses are prefetched, they will not cause segfaults, so this commit implements the most straightforward fix: don't prefetch mark_closure bdescr when assertions are enabled.
-
- May 16, 2024
-
-
Closes #24786
-
This generalises the HasField class to support representation polymorphism, so that instead of type HasField :: forall {k} . k -> Type -> Type -> Constraint we have type HasField :: forall {k} {r_rep} {a_rep} . k -> TYPE r_rep -> TYPE a_rep -> Constraint
-
-
-
-
- When suggesting Language extensions, also suggest Extensions which imply them - Suggest ExplicitForAll and GADTSyntax instead of more specific extensions - Rephrase suggestion to include the term 'Extension' - Also moves some flag specific definitions out of Session.hs into Flags.hs (#24478) Fixes: #24477 Fixes: #24448 Fixes: #10893
-
- May 15, 2024
-
-
Co-Authored-By:
romes <rodrigo.m.mesquita@gmail.com>
-
Add regression tests to track how `-fwrite-if-compression` levels affect the size of `.hi` files.
-
Introduce the flag `-fwrite-if-compression=<n>` which allows to configure the compression level of writing .hi files. The motivation is that some deduplication operations are too expensive for the average use case. Hence, we introduce multiple compression levels with variable impact on performance, but still reduce the memory residency and `.hi` file size on disk considerably. We introduce three compression levels: * `1`: `Normal` mode. This is the least amount of compression. It deduplicates only `Name` and `FastString`s, and is naturally the fastest compression mode. * `2`: `Safe` mode. It has a noticeable impact on .hi file size and is marginally slower than `Normal` mode. In general, it should be safe to always use `Safe` mode. * `3`: `Full` deduplication mode. Deduplicate as much as we can, resulting in minimal .hi files, but at the cost of additional compilation time. Reading .hi files doesn't need to know the initial compression level, and can always deserialise a `ModIface`, as we write out a byte that indicates the next value has been deduplicated. This allows users to experiment with different compression levels for packages, without recompilation of dependencies. Note, the deduplication also has an additional side effect of reduced memory consumption to implicit sharing of deduplicated elements. See #24540 for example where that matters. ------------------------- Metric Decrease: MultiLayerModulesDefsGhciWithCore T16875 T21839c T24471 hard_hole_fits libdir -------------------------
-
The type `IfaceType` is a highly redundant, tree-like data structure. While benchmarking, we realised that the high redundancy of `IfaceType` causes high memory consumption in GHCi sessions when byte code is embedded into the `.hi` file via `-fwrite-if-simplified-core` or `-fbyte-code-and-object-code`. Loading such `.hi` files from disk introduces many duplicates of memory expensive values in `IfaceType`, such as `IfaceTyCon`, `IfaceTyConApp`, `IA_Arg` and many more. We improve the memory behaviour of GHCi by adding an additional deduplication table for `IfaceType` to the serialisation of `ModIface`, similar to how we deduplicate `Name`s and `FastString`s. When reading the interface file back, the table allows us to automatically share identical values of `IfaceType`. To provide some numbers, we evaluated this patch on the agda code base. We loaded the full library from the `.hi` files, which contained the embedded core expressions (`-fwrite-if-simplified-core`). Before this patch: * Load time: 11.7 s, 2.5 GB maximum residency. After this patch: * Load time: 7.3 s, 1.7 GB maximum residency. This deduplication has the beneficial side effect to additionally reduce the size of the on-disk interface files tremendously. For example, on agda, we reduce the size of `.hi` files (with `-fwrite-if-simplified-core`): * Before: 101 MB on disk * Now: 24 MB on disk This has even a beneficial side effect on the cabal store. We reduce the size of the store on disk: * Before: 341 MB on disk * Now: 310 MB on disk Note, none of the dependencies have been compiled with `-fwrite-if-simplified-core`, but `IfaceType` occurs in multiple locations in a `ModIface`. We also add IfaceType deduplication table to .hie serialisation and refactor .hie file serialisation to use the same infrastrucutre as `putWithTables`. Bump haddock submodule to accomodate for changes to the deduplication table layout and binary interface.
-
-
We add an `Ord` instance so that we can store `IfaceType` in a `Data.Map` container. This is required to deduplicate `IfaceType` while writing `.hi` files to disk. Deduplication has many beneficial consequences to both file size and memory usage, as the deduplication enables implicit sharing of values. See issue #24540 for more motivation. The `Ord` instance would be unnecessary if we used a `TrieMap` instead of `Data.Map` for the deduplication process. While in theory this is clerarly the better option, experiments on the agda code base showed that a `TrieMap` implementation has worse run-time performance characteristics. To the change itself, we mostly derive `Eq` and `Ord`. This requires us to change occurrences of `FastString` with `LexicalFastString`, since `FastString` has no `Ord` instance. We change the definition of `IfLclName` to a newtype of `LexicalFastString`, to make such changes in the future easier. Bump haddock submodule for IfLclName changes
-
-