Skip to content

Performance problem with TrieMap

Looking at #9805 (closed) reminded me. When investigating the performance of #9872 (closed) I found a major performance hole in TrieMap.

Suppose we have a handful H of entries in a TrieMap, each with a very large key, size K. If you fold over such a TrieMap you'd expect time O(H). That would certainly be true of an association list! But with TrieMap we actually have to navigate down a long singleton structure to get to the elements, so it takes time O(K*H).

This is pretty bad. In test T9872 I found that 90% of compile time was spent in TrieMap as we repeatedly built, folded, and then rebuilt TrieMaps. I fixed that by keeping smaller TrieMaps, but the underlying problem remains.

The point of a TrieMap is that you need to navigate to the point where only one key remains, and then things should be fast. So I think that TypeMap, say, should look like

data TypeMap a
  = EmptyTM
  | SingletonTM Type a
  | TM { tm_var   :: VarMap a
       , tm_app    :: TypeMap (TypeMap a)
       , tm_fun    :: TypeMap (TypeMap a)
       , tm_tc_app :: NameEnv (ListMap TypeMap a)
       , tm_forall :: TypeMap (BndrMap a)
       , tm_tylit  :: TyLitMap a
       }

The SingletonTM means that, once you are down to a single (key,value) pair, you stop and just use SingletonTM.

Tiresomely, because of the deBruijn thing, it'd need to be

...
  | SingletonTM (CMEnv, Type) a
...

and we'd need an auxiliary function for equality:

eqTypeModuloDeBruijn :: (CMEnv, Type) -> (CEnv, Type) -> Bool

But that's not hard. Very similar to the way RnEnv2 works.

Trac metadata
Trac field Value
Version 7.8.4
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information