Investigate splitting `Var` into a product type covering the common attributes and a sum distinguishing TyVar/Id/TcTyVar

In #22458 and !9351 (closed) @simonpj changed TyCon from something like:

data TyCon 
    = TyConA { tc_unique :: !Unique
            , <more_shared_fields>
            , <TyConA_fields> 
            }
    | TyConB { tc_unique :: !Unique
            , <more_shared_fields>
            , <TyConB_fields> 
            }
    | ...

Into something like this:

data TyCon 
    = TyCon { tc_unique :: !Unique
            , <more_shared_fields>
            , tc_details :: TyConDetails
            }
    
data TyConDetails 
    = TyConA { 
        <TyConA_fields>
        }
    | TyConB { 
        <TyConB_fields>
        }
    | ...

This turned out to be very good for performance.


However, this pattern is not unique to TyCon. For the crucial data type Var we have:

data Var
  = TyVar {  -- Type and kind variables
             -- see Note [Kind and type variables]
        varName    :: !Name,
        realUnique :: {-# UNPACK #-} !Int,
                                     -- ^ Key for fast comparison
                                     -- Identical to the Unique in the name,
                                     -- cached here for speed
        varType    :: Kind           -- ^ The type or kind of the 'Var' in question
 }

  | TcTyVar {                           -- Used only during type inference
                                        -- Used for kind variables during
                                        -- inference, as well
        varName        :: !Name,
        realUnique     :: {-# UNPACK #-} !Int,
        varType        :: Kind,
        tc_tv_details  :: TcTyVarDetails
  }

  | Id {
        varName    :: !Name,
        realUnique :: {-# UNPACK #-} !Int,
        varType    :: Type,
        varMult    :: Mult,             -- See Note [Multiplicity of let binders]
        idScope    :: IdScope,
        id_details :: IdDetails,        -- Stable, doesn't change
        id_info    :: IdInfo }          -- Unstable, updated by simplifier

This could similarly be changed into:

data Var
  = Var {  -- Fields shared by TyVar, TcTyVar and Id
        varName    :: !Name,
        realUnique :: {-# UNPACK #-} !Int,
                                     -- ^ Key for fast comparison
                                     -- Identical to the Unique in the name,
                                     -- cached here for speed
        varType    :: Type,          -- ^ The type or kind of the 'Var' in question
        varDetails :: !VarDetails    -- ^ Specifics of the TyVar/TcTyVar/Id variant
 }

data VarDetails
  = TyVar {  -- Type and kind variables
             -- see Note [Kind and type variables]
 }

  | TcTyVar {                           -- Used only during type inference
                                        -- Used for kind variables during
                                        -- inference, as well
        tc_tv_details  :: TcTyVarDetails
  }

  | Id {
        varMult    :: Mult,             -- See Note [Multiplicity of let binders]
        idScope    :: IdScope,
        id_details :: IdDetails,        -- Stable, doesn't change
        id_info    :: IdInfo }          -- Unstable, updated by simplifier

The motivation is much the same as for TyCon. The obvious downside is that IdDetails/IdInfo would be hidden behind one more indirection, so there is a chance it might not be worthwhile. But based on the result for TyCon it seems worth trying.
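
To make the trade-off concrete, here is a minimal, self-contained sketch of the split. It is not GHC's code: Name, Type and Mult are stood in for by String, TcTyVarDetails and IdScope are cut down to two nullary constructors each, IdDetails/IdInfo are left out, and varKey, isId, idMult and the two example values are hypothetical helpers added purely for illustration.

```haskell
-- Simplified stand-ins (assumptions, not GHC's real definitions).
type Name = String
type Type = String
type Mult = String

data TcTyVarDetails = MetaTv | SkolemTv deriving (Eq, Show)
data IdScope        = GlobalId | LocalId deriving (Eq, Show)

-- Product part: fields shared by every kind of variable.
data Var = Var
  { varName    :: !Name
  , realUnique :: {-# UNPACK #-} !Int
  , varType    :: Type
  , varDetails :: !VarDetails
  } deriving Show

-- Sum part: only what differs between the three variants.
data VarDetails
  = TyVar
  | TcTyVar { tc_tv_details :: TcTyVarDetails }
  | Id      { varMult :: Mult, idScope :: IdScope }
  deriving Show

-- Hypothetical helper: a shared field is read straight from the single
-- 'Var' constructor, without inspecting which variant we have.
varKey :: Var -> Int
varKey = realUnique

-- Hypothetical helper: variant tests now go through 'varDetails'.
isId :: Var -> Bool
isId v = case varDetails v of
  Id{} -> True
  _    -> False

-- Hypothetical helper: Id-only fields sit one indirection further away.
idMult :: Var -> Maybe Mult
idMult v = case varDetails v of
  Id { varMult = m } -> Just m
  _                  -> Nothing

exampleId, exampleTyVar :: Var
exampleId    = Var "f" 1 "Int -> Int" (Id "Many" LocalId)
exampleTyVar = Var "a" 2 "Type" TyVar
```

In this sketch, reading realUnique or varName is a plain field selection on the one Var constructor, while anything Id-specific first has to follow varDetails. That extra hop is the indirection cost described above; whether it is outweighed by cheaper access to the shared fields would have to be measured.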

Edited by Simon Peyton Jones