Skip to content

Use TrieMaps to speed up type class instance lookup

Currently, type class instance resolution is performed by doing a map lookup by type class, and then linearly matching against every instance. This is not great, and for a while, simonpj has been keen on using the TrieMap data structure in GHC, which has been previously used to implement a map from CoreExprs to various things (e.g. for CSE). What makes this a little tricky is that instance lookup isn't intended to be an exact match: we should unify so-called template variables and provide a substitution; furthermore, there may be multiple matches.

After some prototyping, I think we should be able to make this work. The primary refactoring we have to do is *maintain the key associated with an entry in a TrieMap*. Unlike the current uses of TrieMaps, where it's acceptable to lose the underlying key associated with an entry in the TrieMap, we need to know the key when doing unification, because it may be reported in the substitution. There are a few knock-on effects of this:

  • We should add a method foldWithKeyTM :: (Key m -> a -> b -> b) -> m a -> b -> b to the TrieMap type class.
  • We need a variant of UniqFM which tracks the original key that was used. I propose we name it UniqKM (unique key map). A number of implementations of TrieMap need to be adjusted to use this instead of UniqFM.
  • Why can't we just represent keyed TrieMaps as TypeMap (Type, a)? It might be possible. An insurmountable difficulty would be if we need to know the partial key PRIOR to having finished traversing the TrieMap; however, for the parts of the unification algorithm I've implemented, this does not seem to be the case. The primary actual difficulty is that Type uses a named representation, whereas TypeMap keys are on-the-fly deBruijn numbered. There would at least be some annoying fiddliness.
  • Reversing the deBruijn numbered key into a Type is a bit goofy. Essentially, you need the reverse of the current CmEnv: a mapping from de Bruijn levels to the Var you've decided to allocate. (I've called this CfEnv in my prototype)
  • When we represent a TrieMap binder, we have to put the binder map on the OUTSIDE, as opposed to the inside as it is currently. Otherwise, we can't update the CfEnv with the new mapping before making the recursive call to fold-with-key.

I'll attach the standalone Haskell file I used to prototype this, wherein I copy-pasted a lot of Haskell from GHC's source and implemented fuzzy map on a simplified version of Type.

Trac metadata
Trac field Value
Version 7.9
Type Task
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler (Type checker)
Test case
Differential revisions
BlockedBy
Related
Blocking
CC simonpj
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information