Skip to content

High amounts of allocation due to Data.Map, specifically balance operations and insert

The problem

This issue raised by @simonpj in the GHC meeting (08/11/2022). That at the top of a lot of ticky profiles are:

...
   2467942  104527192          0   3 i.M                  Data.IntMap.Internal.$winsert{v rh8I} (fun)
...
     272230   13907328          0   4 ..MM                 Data.Map.Internal.balanceR{v r4my} (fun)
...
     271260   13483536          0   4 ..MM                 Data.Map.Internal.balanceL{v r4mx} (fun)

Those numbers are from a ticky build of GHC.Driver.Backpack (@mpickering you were right!). Although @simonpj also mentioned similar numbers from ticky-ing a perf ghc build (Simon please correct me if I'm mistaken. I would like to have a list of test candidates that we can ticky to work on this ticket)

I also noticed this last year when working on (#18541). Now of course insert should allocate, but the question is Is this a reasonable amount of allocation?. Similarly, why are we balancing these trees so much? IMHO we should avoid use of a balancing data structure if we can.

For example, we have several maps in GHC.Unit.State that are Keyed on FastString:

type UnitInfoMap = Map UnitId UnitInfo
...

type UnusableUnits = Map UnitId (UnitInfo, UnusableUnitReason)

where:

newtype UnitId = UnitId
  { unitIdFS :: FastString
      -- ^ The full hashed unit identifier, including the component id
      -- and the hash.
  }
  deriving (Data)

Now IMHO we should not be keying a Data.Map on any kind of String-like thing, but especially on a FastString for two reasons:

  1. FastStrings all have corresponding Uniques so a UniqFM will be more performant and will never need to rebalance.
  2. Even FastString comparison is slow compared to Unique comparison because one can choose to lexically compare, like we have in GHC.State.Unit:
instance Ord UnitId where
    -- we compare lexically to avoid non-deterministic output when sets of
    -- unit-ids are printed (dependencies, etc.)
    u1 `compare` u2 = unitIdFS u1 `lexicalCompareFS` unitIdFS u2

So the tasks for this ticket are:

  • Get a set of programs to ticky, so that we can reproducibly observe the excessive inserts and balances.
  • Find the major contributors to insert and balance allocations as observed by ticky profiles
  • Try to use a different data structure, or avoid insertions
  • Measure to observe any difference in compile time.

Ill try to post more suspect Maps when I find them. @mpickering @simonpj please add anything I missed. Ill be actively working on this ticket under the guise of JS backend performance work.

Edited by sheaf
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information