As noted in ticket:12415#comment:123565 there are a large number of names inhabiting knownKeyNames and therefore the original name cache due to tuples and sums. It shouldn't be so hard to remove these and handle them in an ad-hoc manner:
original name cache lookups (e.g. finding out the Name associated with an OccName) can been handled by parsing the OccName's string
looking up the Name associated with a Unique (which is needed in BinIface during interface file deserialization) can be done with a set of static functions.
Trac metadata
Trac field
Value
Version
8.0.1
Type
Task
TypeOfFailure
OtherFailure
Priority
normal
Resolution
Unresolved
Component
Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items
...
Show closed items
Linked items
0
Link issues together to show that they're related or that one is blocking others.
Learn more.
I have an initial attempt (Phab:D2469) at doing this. Here I remove all of the sum and tuples names from knownKeyNames implement the codepaths listed above for handling original name lookups and Names from Uniques. However, I found that there is one wrinkle: there are actually two places where we need to lookup in the original name cache during interface file deserialization,
While deserializing the symbol table. This is the case that I had anticipated and indeed it poses no problem since we have a special encoding for known-key names, so indeed we should never need to do a name cache lookup in this case.
To lookup IfaceTopBndrs of, e.g., IfaceDecls. This case I only realized after neck-deep in the refactoring and requires a bit more work to handle correctly.
My approach: Changing IfaceTopBndr
My approach to (2) is to essentially duplicate the special encoding of known key names that we use for the symbol table for IfaceTopBndr. This is done by changing it from a synonym for OccName to a sum of,
the knock-on changes for this approach go through pretty easily.
An alternative: Using the symbol table
Simon PJ has suggested an alternative: rather than duplicating the known-key encoding, we should instead simply make IfaceTopBndr a synonym for Name; in principle this would mean that known-key names would "just work" since IfaceTopBndr would be encoding via the symbol table.
Sadly, it seems that this is too good to be true. The trouble comes from the fingerprinting logic in MkIface (namely addFingerprints) which works roughly as follows,
we look at the IfaceDecls being processed and compute strongly-connect groups of them.
we walk the list of groups in dependency order and compute a fingerprint for each. As we progress we maintain an NameEnv Fingerprint which maps each name we've looked at to its fingerprint.
The fingerprint for a group is determined by serializing it with Binary to a buffer and MD5ing the result. However, this isn't quite normal serialization: we override the putName operation to, rather than merely serializing the Name, lookup the name in the fingerprint environment and instead writing the fingerprint.
To see the issue consider what happens when we try to fingerprint a single IfaceId with IfaceTopBndr ~ Name: the very first thing we do is put the IfaceId's ifName; this will attempt to lookup the Name in the fingerprint environment but this will of course fail since we haven't yet fingerprinted it (which results in a panic exclaiming "urk! lookup local fingerprint").
"Well," you might say, "perhaps we can simply add a special case to the dummy putName logic not to attempt to lookup the Name of the IfaceDecl which we are currently fingerprinting. Sadly, however, this isn't enough since we also need the other names which the decl binds implicitly (e.g. the datacons of a IfaceData). It is indeed possible to compute these but all-in-all the situation ends up looking much messier than the relatively simple solution I proposed above.
But instead of making the distiction with a data constructur, you can use isKnownKeyName just like putName!
putIfaceTopBndr :: IfaceTopBndr -> M ()putIFaceTopBndr name | isKnownKeyName name = ...(getUnique name)... | otherwise = ...(getOccName name)...
Your proposed IfaceTopBndr lets you distinguish the known-key situation. But so does isKnownKeyName. (Indeed we could use your IfaceTopBndr sum type at every occurrence of a name, i.e. for IfaceExtName.)
Right, but the problem that I describe is actually quite orthogonal to the known-key encoding issue. The problem is that while generating a fingerprint for an IfaceDecl we override the implementation of putName in to resolve ensure that the fingerprints the things it refers to are included in its fingerprint. We do this by building up a NameEnv Fingerprint as we walk the IfaceDecls in dependency order. Previously this was fine since none of the Names we would serialize were binding occurrences.
However, now that IfaceTopBndr is also a Name we have lost the ability to distinguish binding from non-binding occurrences. This breaks the fingerprinting implementation since we then end up attempting to lookup the Name (and implicit Names) of the thing we are trying to fingerprint since its serialization contains a binding Name.
In writing this the solution here became fairly obvious: simply tell putName whether it is serializing a binding or non-binding occurrence.