High amounts of allocation due to Data.Map, specifically balance operations and insert
The problem
This issue raised by @simonpj in the GHC meeting (08/11/2022). That at the top of a lot of ticky profiles are:
...
2467942 104527192 0 3 i.M Data.IntMap.Internal.$winsert{v rh8I} (fun)
...
272230 13907328 0 4 ..MM Data.Map.Internal.balanceR{v r4my} (fun)
...
271260 13483536 0 4 ..MM Data.Map.Internal.balanceL{v r4mx} (fun)
Those numbers are from a ticky build of GHC.Driver.Backpack
(@mpickering you were right!). Although @simonpj also mentioned similar numbers from ticky-ing a perf ghc build (Simon please correct me if I'm mistaken. I would like to have a list of test candidates that we can ticky to work on this ticket)
I also noticed this last year when working on (#18541). Now of course insert should allocate, but the question is Is this a reasonable amount of allocation?
. Similarly, why are we balancing these trees so much? IMHO we should avoid use of a balancing data structure if we can.
For example, we have several maps in GHC.Unit.State
that are Keyed on FastString
:
type UnitInfoMap = Map UnitId UnitInfo
...
type UnusableUnits = Map UnitId (UnitInfo, UnusableUnitReason)
where:
newtype UnitId = UnitId
{ unitIdFS :: FastString
-- ^ The full hashed unit identifier, including the component id
-- and the hash.
}
deriving (Data)
Now IMHO we should not be keying a Data.Map
on any kind of String
-like thing, but especially on a FastString
for two reasons:
-
FastString
s all have correspondingUnique
s so aUniqFM
will be more performant and will never need to rebalance. - Even
FastString
comparison is slow compared toUnique
comparison because one can choose to lexically compare, like we have inGHC.State.Unit
:
instance Ord UnitId where
-- we compare lexically to avoid non-deterministic output when sets of
-- unit-ids are printed (dependencies, etc.)
u1 `compare` u2 = unitIdFS u1 `lexicalCompareFS` unitIdFS u2
So the tasks for this ticket are:
-
Get a set of programs to ticky, so that we can reproducibly observe the excessive inserts and balances. -
Find the major contributors to insert
andbalance
allocations as observed by ticky profiles -
Try to use a different data structure, or avoid insertions -
Measure to observe any difference in compile time.
Ill try to post more suspect Maps
when I find them. @mpickering @simonpj please add anything I missed. Ill be actively working on this ticket under the guise of JS backend performance work.