Skip to content

Figure out how to make sense of Data instances for poly-kinded datatypes

Something's funny with Data.Data. In particular, what instance do you expect when you write this?

data T phantom = T                                                                        
  deriving Data

You might expect:

instance Typeable phantom => Data (T phantom) where
  ...

And indeed, it is possible to define a Data instance like this. But in practice, GHC actually derives this instance:

instance Data phantom => Data (T phantom) where
  ...
  dataCast1 f = gcast1 f

The reason is because GHC has special cases for when it derives Data instances for datatypes of kind * -> * or * -> * -> *. These special cases cause the derived Data instances to insert definitions for dataCast1 or dataCast2, respectively, and their implementations (gcast1 or gcast2, respectively) require the stronger instance context. See #4028 (closed) for a longer discussion about this.

Things get far less predictable when you throw in PolyKinds, however. If you try this instead:

data T (phantom :: k) = T
  deriving Data

Then you do //not// get instance Data phantom => Data (T phantom) as above. Instead, you get this:

instance (Typeable k, Typeable (phantom :: k)) => Data (T phantom) where
  ...
  -- No implementation for dataCast1

As it turns out, GHC's special-casing for deriving Data is not privy to datatypes of kind k -> * or k1 -> k2 -> *, just * -> * and * -> * -> *.

This is all a bit disconcerting because if you look at Data.Data, there are many instances of the flavor:

deriving instance Data (p :: *) => Data (U1 p) -- U1 :: k -> *

This makes sense in a pre-PolyKinds world, but really, the most general type for this Data instance would be:

deriving instance (Typeable k, Typeable (p :: k)) => Data (U1 p)

Even worse, choosing the less polymorphic instance Data (p :: *) => Data (U1 p) via StandaloneDeriving doesn't even cause GHC to emit a definition for dataCast1—it just restricts the kind with no benefit!

This whole situation doesn't sit well with me, especially since as we add more poly-kinded Data instances to Data.Data, we have to ask ourselves each time what the "right" Data instance is. It would be great if we could answer these questions:

  1. Should we expand the special-casing for datatypes of kind k -> * or k1 -> k2 -> *? Or do dataCast1 and dataCast2 even make sense for poly-kinded datatypes?
  2. Is there a way to modify DeriveDataTypeable such that emitting definitions for dataCast1 and dataCast2 don't require us to default some kind variables to *? Perhaps TypeInType could help us here somehow.
Trac metadata
Trac field Value
Version 8.0.1
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component libraries/base
Test case
Differential revisions
BlockedBy
Related
Blocking
CC dreixel
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information