
TypeRep fingerprints for arrow types are problematic

Motivation

I'm hacking together a package for type-indexed type fingerprints, for situations where a full TypeRep is too heavy and Typeable reflection isn't necessary. There are two basic approaches one might consider, both of which start out the same way.

{-# language ScopedTypeVariables, TypeInType, GADTs, RoleAnnotations, TypeApplications, RankNTypes,
      MultiParamTypeClasses, TypeOperators, TypeFamilies, FlexibleContexts, FlexibleInstances,
      UndecidableInstances, AllowAmbiguousTypes, MagicHash #-}
module Type.Fingerprint.Internal where

import GHC.Fingerprint (fingerprintFingerprints)
import qualified GHC.Fingerprint as F
import Data.Type.Equality
import Type.Reflection
import qualified Type.Reflection.Unsafe as TU
import Data.Kind
import GHC.Exts (TYPE, RuntimeRep (..))
import Unsafe.Coerce

newtype Fingerprint (a :: k) = Fingerprint F.Fingerprint
  deriving (Eq, Ord)
type role Fingerprint nominal

class Fingerprinted (a :: k) where
  fingerprint# :: Fingerprint a

fingerprint :: Fingerprinted a => Fingerprint a
fingerprint = fingerprint#

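withFingerprinted
  :: forall k (a :: k) rep (r :: TYPE rep) .
     Fingerprint a -> (Fingerprinted a => r) -> r
-- The same sort of implementation as withTypeable: wrap the continuation
-- in a newtype and coerce it to a function taking the dictionary explicitly.
-- This relies on Fingerprinted being a single-method class.
withFingerprinted fp k = unsafeCoerce k' fp
  where
    k' :: Gift a r
    k' = Gift k

-- Helper to satisfy the type checker in withFingerprinted.
newtype Gift a (r :: TYPE rep) = Gift (Fingerprinted a => r)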

From there the approaches diverge.

Lean entirely on Typeable

The easiest way to create Fingerprinted instances is to just write

instance Typeable a => Fingerprinted a where
  fingerprint# = Fingerprint . TU.typeRepFingerprint $ typeRep @a

Unfortunately, this loses a lot of non-reflection functionality compared to Typeable. In particular, if you use withFingerprinted with fingerprints of f, a, and b, you do not automatically get instances for Fingerprinted (f a) or Fingerprinted (a -> b). So I'm really not too interested in this approach.

Calculate fingerprints much like Typeable

The other major option is to calculate fingerprints by hand, and then use type family-guided instance resolution to distinguish between two cases:

  1. The type is an application: calculate fingerprints recursively and combine them using fingerprintFingerprints.
  2. The type is not an application: get the fingerprint using TU.typeRepFingerprint.
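A minimal sketch of the two cases above, assuming a closed type family to pick the instance. The names IsApp, HandFP, and handFP are made up for illustration and are not part of any existing package. The sketch returns a bare GHC.Fingerprint value rather than the indexed newtype, and it is only exercised on non-arrow types here.

```haskell
{-# LANGUAGE AllowAmbiguousTypes, DataKinds, FlexibleContexts, FlexibleInstances,
             KindSignatures, MultiParamTypeClasses, PolyKinds, ScopedTypeVariables,
             TypeApplications, TypeFamilies, UndecidableInstances #-}

import GHC.Fingerprint (Fingerprint, fingerprintFingerprints)
import Type.Reflection (Typeable, typeRep)
import Type.Reflection.Unsafe (typeRepFingerprint)

-- Hypothetical helper: does the type have the shape (f x)?
type family IsApp (a :: k) :: Bool where
  IsApp (f x) = 'True
  IsApp a     = 'False

-- Hypothetical class; the Bool index selects which instance applies.
class HandFP (isApp :: Bool) a where
  handFP' :: Fingerprint

-- Entry point: compute the index with IsApp, then dispatch on it.
handFP :: forall a. HandFP (IsApp a) a => Fingerprint
handFP = handFP' @(IsApp a) @a

-- Case 1: an application. Recurse on both halves and combine.
instance (HandFP (IsApp f) f, HandFP (IsApp x) x) => HandFP 'True (f x) where
  handFP' = fingerprintFingerprints [handFP @f, handFP @x]

-- Case 2: not an application. Fall back to the Typeable fingerprint.
instance Typeable a => HandFP 'False a where
  handFP' = typeRepFingerprint (typeRep @a)
```

With this in scope, handFP @(Maybe Int) is built from handFP @Maybe and handFP @Int without ever constructing a TypeRep for Maybe Int.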

This approach will produce a perfectly good fingerprinting scheme. There's only one problem: the fingerprints it produces will not, in general, be the same as the ones obtained by TU.typeRepFingerprint, because GHC fingerprints arrow types specially. When calculating the fingerprint of an application, it uses reflection to check whether the type being applied is a partially applied arrow. If so, it extracts the left side of the arrow and combines its fingerprint with that of the right side, leaving out the arrow itself. With just fingerprints, this is impossible: we have no way to see that a fingerprint came from applying (->) to another type, let alone a way to extract that other type.
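To make the asymmetry concrete, here is a small sketch of what reflection allows and a bare fingerprint does not. It uses the Fun pattern synonym from Type.Reflection to recover both sides of an arrow; arrowSides is a hypothetical helper name. No analogous function can be written over Fingerprint alone, since a fingerprint is just an opaque hash.

```haskell
{-# LANGUAGE GADTs, TypeApplications #-}

import GHC.Fingerprint (Fingerprint)
import Type.Reflection
import Type.Reflection.Unsafe (typeRepFingerprint)

-- Hypothetical helper: with a TypeRep in hand we can ask whether the type
-- is an arrow and, if so, get the fingerprints of its two sides.
arrowSides :: TypeRep a -> Maybe (Fingerprint, Fingerprint)
arrowSides (Fun arg res) = Just (typeRepFingerprint arg, typeRepFingerprint res)
arrowSides _             = Nothing
```

For example, arrowSides (typeRep @(Int -> Bool)) yields the fingerprints of Int and Bool, while arrowSides (typeRep @Int) yields Nothing.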

Proposal

I can think of two ways to fix this.

The easy way

Remove the special case for fingerprinting arrow types. There will be some performance regressions, but I hope they won't be intolerable.

The hard way

Use a different fingerprint-combining function for Typeable fingerprints than for other fingerprinting purposes. This would make it possible to arrange for combining the fingerprint of (->) with another fingerprint (in the appropriate order) to simply produce the other fingerprint. This should (nearly) eliminate the performance penalty.
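A rough sketch of what such a combining function might look like, under the assumption that the fingerprint of (->) is available as a known constant. Here it is passed in as a parameter, and combineTypeable is a hypothetical name, not an existing function in base.

```haskell
import GHC.Fingerprint (Fingerprint, fingerprintFingerprints, fingerprintString)

-- Hypothetical combiner for Typeable-style fingerprints. The first argument
-- is assumed to be the fingerprint of (->). Applying the arrow to a type
-- yields that type's fingerprint unchanged, so the arrow drops out.
combineTypeable :: Fingerprint -> Fingerprint -> Fingerprint -> Fingerprint
combineTypeable arrowFp f x
  | f == arrowFp = x
  | otherwise    = fingerprintFingerprints [f, x]
```

Under this sketch, combining the arrow with a and then with b gives fingerprintFingerprints [a, b], which is the combination the special case is described as producing, and it needs only fingerprints to compute. One consequence is that the partial application (->) a gets the same fingerprint as a itself.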
