Add deduplication table for `IfaceType`
The type `IfaceType` is a highly redundant, tree-like data structure.
While benchmarking, we realised that the high redundancy of `IfaceType` causes high memory consumption in GHCi sessions.
We fix this by adding a deduplication table to the serialisation of `ModIface`, similar to how we deduplicate `Name`s and `FastString`s.
When reading the interface file back, the table allows us to automatically share identical values of `IfaceType`.
The type `IfaceType` is a highly redundant, tree-like data structure.
While benchmarking, we realised that the high redundancy of `IfaceType` causes high memory consumption in GHCi sessions when byte code is embedded into the `.hi` file via `-fwrite-if-simplified-core` or `-fbyte-code-and-object-code`.
Loading such `.hi` files from disk introduces many duplicates of memory-expensive values in `IfaceType`, such as `IfaceTyCon`, `IfaceTyConApp`, `IA_Arg` and many more.
We improve the memory behaviour of GHCi by adding an additional deduplication table for `IfaceType` to the serialisation of `ModIface`, similar to how we deduplicate `Name`s and `FastString`s.
When reading the interface file back, the table allows us to automatically share identical values of `IfaceType`.
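The idea of a deduplication table can be sketched as follows. This is a minimal illustration, not the code from this patch: `Ty` is a hypothetical stand-in for `IfaceType`, and `intern`, `encode` and `decode` are made-up names. The sketch only deduplicates whole top-level values; whether and how nested subtrees are shared is left out.

```haskell
import Data.List (foldl', sortOn)
import qualified Data.Map.Strict as Map

-- Hypothetical, much simplified stand-in for IfaceType.
data Ty
  = TyCon String
  | TyApp Ty Ty
  deriving (Eq, Ord, Show)

-- Deduplication table: each distinct value gets an index on first occurrence.
data Table = Table
  { tableIndex :: Map.Map Ty Int
  , tableNext  :: Int
  }

emptyTable :: Table
emptyTable = Table Map.empty 0

-- Look up a value, adding it to the table if it has not been seen before.
intern :: Ty -> Table -> (Int, Table)
intern ty tbl@(Table m n) =
  case Map.lookup ty m of
    Just i  -> (i, tbl)
    Nothing -> (n, Table (Map.insert ty n m) (n + 1))

-- "Write": replace every value by its table index and emit the table
-- entries ordered by index.
encode :: [Ty] -> ([Int], [Ty])
encode tys = (reverse revIxs, entries)
  where
    (revIxs, final) = foldl' step ([], emptyTable) tys
    step (acc, tbl) ty = let (i, tbl') = intern ty tbl in (i : acc, tbl')
    entries = map fst (sortOn snd (Map.toList (tableIndex final)))

-- "Read": every index is resolved through the same table, so repeated
-- indices point at one shared heap value instead of separate copies.
decode :: ([Int], [Ty]) -> [Ty]
decode (ixs, entries) = map (tbl Map.!) ixs
  where
    tbl = Map.fromList (zip [0 ..] entries)
```

In this sketch, a list containing the same type many times is written as a short table plus a list of small indices, which is where both the file size and the residency savings would come from.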
To provide some numbers, we evaluated this patch on the agda code base.
We loaded the full library from the `.hi` files, which contained the embedded core expressions (`-fwrite-if-simplified-core`).
Before this patch:
- Load time: 11.7 s, 2.5 GB maximum residency.
After this patch:
- Load time: 7.3 s, 1.7 GB maximum residency.
This deduplication has the beneficial side effect of also greatly reducing the size of the on-disk interface files.
For example, on agda, we reduce the size of `.hi` files (with `-fwrite-if-simplified-core`):
- Before: 101 MB on disk
- Now: 24 MB on disk
This even has a beneficial side effect on the cabal store, whose size on disk is reduced:
- Before: 341 MB on disk
- Now: 310 MB on disk
Note, none of the dependencies have been compiled with `-fwrite-if-simplified-core`, but `IfaceType` occurs in multiple locations in a `ModIface`.
We also add an `IfaceType` deduplication table to `.hie` serialisation and refactor `.hie` file serialisation to use the same infrastructure as `putWithTables`.
Add run-time configurability of .hi file compression
Introduce the flag `-fwrite-if-compression=<n>`, which configures the compression level used when writing `.hi` files.
The motivation is that some deduplication operations are too expensive for the average use case. Hence, we introduce multiple compression levels that have a minimal impact on performance, but still reduce the memory residency and `.hi` file size on disk considerably.
We introduce three compression levels:
- `1`: `Normal` mode. This is the least amount of compression. It deduplicates only `Name` and `FastString`s, and is naturally the fastest compression mode.
- `2`: `Safe` mode. It has a noticeable impact on `.hi` file size and is marginally slower than `Normal` mode. In general, it should be safe to always use `Safe` mode.
- `3`: `Full` deduplication mode. Deduplicate as much as we can, resulting in minimal `.hi` files, but at the cost of additional compilation time.
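As a rough illustration of how the three levels map onto the flag value, the following sketch models them as a small enumeration. The names `CompressionLevel`, `parseCompressionLevel` and `deduplicatesBeyondNames` are hypothetical and chosen for this example; the representation used inside GHC may differ.

```haskell
-- Hypothetical model of the three levels described above.
data CompressionLevel = Normal | Safe | Full
  deriving (Eq, Ord, Show, Enum, Bounded)

-- Interpret the <n> of -fwrite-if-compression=<n>; values outside 1..3
-- are rejected in this sketch.
parseCompressionLevel :: Int -> Maybe CompressionLevel
parseCompressionLevel n = case n of
  1 -> Just Normal
  2 -> Just Safe
  3 -> Just Full
  _ -> Nothing

-- Normal deduplicates only Name and FastString; the higher levels
-- deduplicate more than that.
deduplicatesBeyondNames :: CompressionLevel -> Bool
deduplicatesBeyondNames level = level > Normal
```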
Reading `.hi` files doesn't need to know the initial compression level, and can always deserialise a `ModIface`.
This allows users to experiment with different compression levels for
packages, without recompilation of dependencies.
Note, the deduplication also has the additional side effect of reduced memory consumption due to implicit sharing of deduplicated elements. See #24540 for an example where that matters.
Add Eq and Ord instance to `IfaceType`
We add an `Ord` instance so that we can store `IfaceType` in a `Data.Map` container.
This is required to deduplicate `IfaceType` while writing `.hi` files to disk. Deduplication has many beneficial consequences for both file size and memory usage, as the deduplication enables implicit sharing of values.
See issue #24540 for more motivation.
The `Ord` instance would be unnecessary if we used a `TrieMap` instead of `Data.Map` for the deduplication process. While in theory this is clearly the better option, experiments on the agda code base showed that a `TrieMap` implementation has worse run-time performance characteristics.
As for the change itself, we mostly derive `Eq` and `Ord`. This requires us to replace occurrences of `FastString` with `LexicalFastString`, since `FastString` has no `Ord` instance.
We change the definition of `IfLclName` to a newtype of `LexicalFastString`, to make such changes in the future easier.
Bump haddock submodule for `IfLclName` newtype changes.
This PR is currently stacked on top of !12346 (closed) as it requires its refactorings.
- Adds regression test cases for interface file sizes
Closes #24540