Commit 9513fe6b authored by Ben Gamari's avatar Ben Gamari Committed by Ben Gamari
Browse files

Clean up interaction between name cache and built-in syntax

This cleans up various aspects of the handling of built-in syntax in the
original name cache (hopefully resulting in a nice reduction in compiler
allocations),

  * Remove tuple types from original name cache: There is really no
    reason for these to be in the name cache since we already handle
    them specially in interface files to ensure that we can resolve them
    directly to Names, avoiding extraneous name cache lookups.

  * Sadly it's not possible to remove all traces of tuples from the
    name cache, however. Namely we need to keep the tuple type
    representations in since otherwise they would need to be wired-in

  * Remove the special cases for (:), [], and (##) in isBuiltInOcc_maybe
    and rename it to isTupleOcc_maybe

  * Split lookupOrigNameCache into two variants,

     * lookupOrigNameCache': Merely looks up an OccName in the original
       name cache, making no attempt to resolve tuples

     * lookupOrigNameCache: Like the above but handles tuples as well.
       This is given the un-primed name since it does the "obvious"
       thing from the perspective of an API user, who knows nothing of
       our special treatment of tuples.

Arriving at this design took a significant amount of iteration. The
trail of debris leading here can be found in #11357.

Thanks to ezyang and Simon for all of their help in coming to this
solution.

Test Plan: Validate

Reviewers: goldfire, simonpj, austin

Reviewed By: simonpj

Subscribers: thomie, ezyang

Differential Revision: https://phabricator.haskell.org/D2414

GHC Trac Issues: #11357
parent ed480981
......@@ -275,7 +275,10 @@ fromOnDiskName :: Array Int Name -> NameCache -> OnDiskName -> (NameCache, Name)
fromOnDiskName _ nc (pid, mod_name, occ) =
let mod = mkModule pid mod_name
cache = nsNames nc
in case lookupOrigNameCache cache mod occ of
in -- We use lookupOrigNameCache' here since we will never see any built-in
-- syntax in an interface file that isn't in the name cache (for instance
-- tuple TyCons). See Note [Symbol table representation of names].
case lookupOrigNameCache' cache mod occ of
Just name -> (nc, name)
Nothing ->
let (uniq, us) = takeUniqFromSupply (nsUniqs nc)
......
......@@ -6,7 +6,7 @@ module IfaceEnv (
newGlobalBinder, newInteractiveBinder,
externaliseName,
lookupIfaceTop,
lookupOrig, lookupOrigNameCache, extendNameCache,
lookupOrig, lookupOrigNameCache, lookupOrigNameCache', extendNameCache,
newIfaceName, newIfaceNames,
extendIfaceIdEnv, extendIfaceTyVarEnv,
tcIfaceLclId, tcIfaceTyVar, lookupIfaceVar,
......@@ -33,13 +33,14 @@ import Module
import FastString
import FastStringEnv
import IfaceType
import PrelNames ( gHC_TYPES, gHC_PRIM, gHC_TUPLE )
import PrelNames ( gHC_TUPLE, gHC_PRIM )
import UniqSupply
import SrcLoc
import Util
import Outputable
import Data.List ( partition )
import Data.Maybe ( isNothing )
{-
*********************************************************
......@@ -193,9 +194,10 @@ their cost we use two tricks,
having to look up their names at all while loading interface files. See
Note [Symbol table representation of names] in BinIface for details.
a. We don't include them in the Orig name cache but instead parse their
OccNames (in isBuiltInOcc_maybe) to avoid bloating the name cache with
them.
a. We don't include tuples with arity >0 in the original-name cache but
instead parse their OccNames (in isBuiltInOcc_maybe) to avoid bloating the
name cache with them. As far as I know the reasons for the arity-0
exception are purely historical.
Why is the second measure necessary? Good question; afterall, 1) the parser
emits built-in syntax directly as Exact RdrNames, and 2) built-in syntax never
......@@ -210,19 +212,33 @@ are two reasons why we might look up an Orig RdrName for built-in syntax,
(Convert.thRdrName). So, e.g. $(do { reify '(,); ... }) will
go this route (Trac #8954).
See also: Note [Known-key names] in PrelNames
-}
-- | Lookup the 'Name' associated with an 'OccName'. Note that unlike
-- 'lookupOrigNameCache\'', this function will identify tuple types not present
-- in the name cache.
lookupOrigNameCache :: OrigNameCache -> Module -> OccName -> Maybe Name
lookupOrigNameCache nc mod occ
| mod == gHC_TYPES || mod == gHC_PRIM || mod == gHC_TUPLE
| mod == gHC_TUPLE || mod == gHC_PRIM
-- See Note [Built-in syntax and the OrigNameCache]
-- Special case for tuples; there are too many
-- of them to pre-populate the original-name cache
, Just name <- isBuiltInOcc_maybe occ
= -- See Note [Known-key names], 3(c) in PrelNames
-- Special case for tuples; there are too many
-- of them to pre-populate the original-name cache
Just name
= Just name
| otherwise
= case lookupModuleEnv nc mod of
| otherwise = lookupOrigNameCache' nc mod occ
-- | Lookup the 'Name' associated with an 'OccName'. Note that this function
-- will not find names for tuple types (e.g. @(,,)@). For this you want
-- 'lookupOrigNameCache'.
--
-- For discussion of why see Note [Built-in syntax and the OrigNameCache].
lookupOrigNameCache' :: OrigNameCache -> Module -> OccName -> Maybe Name
lookupOrigNameCache' nc mod occ
= -- This function should never see built-in syntax, assert this
ASSERT(isNothing $ isBuiltInOcc_maybe occ)
case lookupModuleEnv nc mod of
Nothing -> Nothing
Just occ_env -> lookupOccEnv occ_env occ
......
......@@ -85,7 +85,8 @@ knownKeyNames
, concatMap tycon_kk_names typeNatTyCons
, concatMap (tycon_kk_names . tupleTyCon Boxed) [2..mAX_TUPLE_SIZE] -- Yuk
-- Tuple type representations
, tuple_rep_names
, cTupleTyConNames
-- Constraint tuples are known-key but not wired-in
......@@ -97,6 +98,17 @@ knownKeyNames
, basicKnownKeyNames ]
where
-- We only include the type representation bindings (for both the type and
-- promoted data constructors) for tuples, not the TyCons themselves since
-- they are handled specially in interface files and by isBuiltInOcc_maybe.
-- See Note [Built-in syntax and the OrigNameCache] and Note [Grand plan for
-- Typeable].
tuple_rep_names =
[ rep
| tc <- map (tupleTyCon Boxed) [2..mAX_TUPLE_SIZE]
, rep <- rep_names tc ++ concatMap (rep_names . promoteDataCon) (tyConDataCons tc)
]
-- All of the names associated with a known-key thing.
-- This includes TyCons, DataCons and promoted TyCons.
tycon_kk_names :: TyCon -> [Name]
......
......@@ -87,7 +87,8 @@ This is accomplished through a combination of mechanisms:
b) The known infinite families of names are specially
serialised by BinIface.putName, with that special treatment
detected when we read back to ensure that we get back to the
correct uniques.
correct uniques. See Note [Symbol table representation of names]
in BinIface.
Most of the infinite families cannot occur in source code,
so mechanisms (a,b) sufficies to ensure that they always have
......
......@@ -198,29 +198,29 @@ wiredInTyCons = [ unitTyCon -- Not treated like other tuples, because
-- that it'll pre-populate the name cache, so
-- the special case in lookupOrigNameCache
-- doesn't need to look out for it
, anyTyCon
, boolTyCon
, charTyCon
, doubleTyCon
, floatTyCon
, intTyCon
, wordTyCon
, word8TyCon
, listTyCon
, maybeTyCon
, parrTyCon
, heqTyCon
, coercibleTyCon
, typeNatKindCon
, typeSymbolKindCon
, runtimeRepTyCon
, vecCountTyCon
, vecElemTyCon
, constraintKindTyCon
, liftedTypeKindTyCon
, starKindTyCon
, unicodeStarKindTyCon
]
, anyTyCon
, boolTyCon
, charTyCon
, doubleTyCon
, floatTyCon
, intTyCon
, wordTyCon
, word8TyCon
, listTyCon
, maybeTyCon
, parrTyCon
, heqTyCon
, coercibleTyCon
, typeNatKindCon
, typeSymbolKindCon
, runtimeRepTyCon
, vecCountTyCon
, vecElemTyCon
, constraintKindTyCon
, liftedTypeKindTyCon
, starKindTyCon
, unicodeStarKindTyCon
]
mkWiredInTyConName :: BuiltInSyntax -> Module -> FastString -> Unique -> TyCon -> Name
mkWiredInTyConName built_in modu fs unique tycon
......@@ -608,6 +608,15 @@ Note [How tuples work] See also Note [Known-key names] in PrelNames
BoxedTuple/UnboxedTuple, and then we used BasicTypes.Boxity to distinguish
E.g. tupleTyCon has a Boxity argument
* Names of tuple TyCons, DataCons, and DataCon workers have a special encoding
in the interface file symbol table. This allows us to eliminate the need for a
original-name cache lookup when loading from an interface file. See
Note [Symbol table representation of names] and
Note [Built-in syntax and the OrigNameCache].
Unfortunately, Typeable type representations still do need to be included in
the name cache for tiresome reasons. See [Grand plan for Typeable].
* When looking up an OccName in the original-name cache
(IfaceEnv.lookupOrigNameCache), we spot the tuple OccName to make sure
we get the right wired-in name. This guy can't tell the difference
......@@ -641,19 +650,22 @@ decl in GHC.Classes, so I think this part may not work properly. But
it's unused I think.
-}
-- | Built in syntax isn't "in scope" so these OccNames map to wired-in Names
-- with BuiltInSyntax. However, this should only be necessary while resolving
-- names produced by Template Haskell splices since we take care to encode
-- built-in syntax names specially in interface files. See
-- Note [Symbol table representation of names].
-- | Tuple types aren't included in the original name cache to keep the size of
-- the cache down. This function is responsible for identifying tuple types and
-- mapping them to the appropriate 'Name'.
--
-- This should only be necessary while resolving names produced by Template
-- Haskell splices since we take care to encode built-in syntax names specially
-- in interface files. See Note [Symbol table representation of names].
-- This function should be able to identify everything in GHC.Tuple
isBuiltInOcc_maybe :: OccName -> Maybe Name
isBuiltInOcc_maybe occ =
case name of
"[]" -> Just $ choose_ns listTyConName nilDataConName
"[]" -> Just $ choose_ns listTyConName nilDataConName
":" -> Just consDataConName
"[::]" -> Just parrTyConName
"()" -> Just $ tup_name Boxed 0
"(##)" -> Just $ tup_name Unboxed 0
"()" -> Just $ tup_name Boxed 0
"(##)" -> Just $ tup_name Unboxed 0
_ | Just rest <- "(" `stripPrefix` name
, (commas, rest') <- BS.span (==',') rest
, ")" <- rest'
......
......@@ -85,6 +85,25 @@ There are many wrinkles:
representations for TyCon and Module. See GHC.Types
Note [Runtime representation of modules and tycons]
Note [Tuples and Typeable]
~~~~~~~~~~~~~~~~~~~~~~~~~~
Tuples are generally special in GHC since they are not placed in the original
name cache (see Note [Built-in syntax and the OrigNameCache]). This poses a
problem for Typeable: we need to serialize the Name of a type representation for
a tuple type with enough information such that the compiler will realize that
the Name is that of a tuple type representation (and thus gets the same unique
as is indicated in the wired-in TyCon) when it is loaded from an interface file.
We ensure this by only including the type representations for the type
contructor and its promoted data constructor in the original name cache. The
alternative would have been to use the same special interface file encoding as
we use for tuple TyCons for tuple type representations. Unfortunately, this is
rather tiresome to do so we are going to live with this compromise for now.
A great deal of discussion on how we came to this design can be found in #12357.
-}
-- | Generate the Typeable bindings for a module. This is the only
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment