Commit 057e3f0d authored by simonpj

[project @ 2002-03-14 15:26:53 by simonpj]

Lots of stuff about external and internal names
parent 09f96da1
......@@ -56,6 +56,7 @@
<li><a href="the-beast/basicTypes.html">The Basics</a>
<li><a href="the-beast/modules.html">Modules, ModuleNames and
Packages</a>
<li><a href="the-beast/names.html">The truth about names: Names and OccNames</a>
<li><a href="the-beast/vars.html">The Real Story about Variables, Ids,
TyVars, and the like</a>
<li><a href="the-beast/renamer.html">The Glorious Renamer</a>
......
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
<title>The GHC Commentary - The truth about names: OccNames, and Names</title>
</head>
<body BGCOLOR="FFFFFF">
<h1>The GHC Commentary - The truth about names: OccNames, and Names</h1>
<p>
Every entity (type constructor, class, identifier, type variable) has
a <code>Name</code>. The <code>Name</code> type is pervasive in GHC,
and is defined in <code>basicTypes/Name.lhs</code>. Here is what a Name looks like,
though it is private to the Name module.
<pre>
data Name = Name {
n_sort :: NameSort, -- What sort of name it is
n_occ :: !OccName, -- Its occurrence name
n_uniq :: Unique, -- Its identity
n_loc :: !SrcLoc -- Definition site
}
</pre>
<ul>
<li> The <code>n_sort</code> field says what sort of name this is: see
<a href="#sort">NameSort below</a>.
<li> The <code>n_occ</code> field gives the "occurrence name" of the Name; see
<a href="#occname">OccName below</a>.
<li> The <code>n_uniq</code> field allows fast tests for equality of Names.
<li> The <code>n_loc</code> field gives some indication of where the name was bound.
</ul>
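<p>
As a rough illustration of why the <code>n_uniq</code> field makes equality tests fast, here is a minimal sketch in which equality and ordering on <code>Name</code> look only at the unique. The type synonyms (<code>Unique</code> as <code>Int</code>, <code>OccName</code>, <code>SrcLoc</code> and <code>Module</code> as <code>String</code>) are simplifying assumptions made for this example; GHC's real types are richer, and the instances shown are not copied from <code>Name.lhs</code>.

```haskell
-- Simplified stand-ins (assumptions for this sketch, not GHC's real types).
type Unique  = Int
type OccName = String
type SrcLoc  = String
type Module  = String

data NameSort
  = External Module
  | Internal
  | System

data Name = Name
  { n_sort :: NameSort   -- What sort of name it is
  , n_occ  :: !OccName   -- Its occurrence name
  , n_uniq :: Unique     -- Its identity
  , n_loc  :: !SrcLoc    -- Definition site
  }

-- Equality consults only the unique, so it is a single comparison
-- no matter how long the occurrence name is.
instance Eq Name where
  a == b = n_uniq a == n_uniq b

-- Ordering likewise uses only the unique.
instance Ord Name where
  compare a b = compare (n_uniq a) (n_uniq b)
```

Under these instances, two names that share an occurrence name but carry different uniques compare as unequal, which is the behaviour the field descriptions above imply.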
<h2><a name="sort">The <code>NameSort</code> of a <code>Name</code></a></h2>
There are three flavours of <code>Name</code>:
<pre>
data NameSort
= External Module
| Internal
| System
</pre>
<ul>
<li> Here are the sorts of Name an entity can have:
<ul>
<li> Class, TyCon: External.
<li> Id: External, Internal, or System.
<li> TyVar: Internal, or System.
</ul>
<p><li> An <code>ExternalName</code> has a globally-unique
(module name, occurrence name) pair, namely the
<em>original name</em> of the entity,
describing where the thing was originally defined. So for example,
if we have
<pre>
module M where
f = e1
g = e2
module A where
import qualified M as Q
import M
a = Q.f + g
</pre>
then the RdrNames for "a", "Q.f" and "g" get replaced (by the Renamer)
by the Names "A.a", "M.f", and "M.g" respectively.
<p><li> An <code>InternalName</code>
has only an occurrence name. Distinct InternalNames may have the same occurrence
name; use the Unique to distinguish them.
<p> <li> An <code>ExternalName</code> has a unique that never changes. It is never
cloned. This is important, because the simplifier invents new names pretty freely,
but we don't want to lose the connection with the type environment (constructed earlier).
An <code>InternalName</code> can be cloned freely.
<p><li> <strong>Before CoreTidy</strong>: the Ids that were defined at top level
in the original source program get <code>ExternalNames</code>, whereas extra
top-level bindings generated (say) by the type checker get <code>InternalNames</code>.
This distinction is occasionally useful for filtering diagnostic output; e.g.
for -ddump-types.
<p><li> <strong>After CoreTidy</strong>: An Id with an <code>ExternalName</code> will generate symbols that
appear as external symbols in the object file. An Id with an <code>InternalName</code>
cannot be referenced from outside the module, and so generates a local symbol in
the object file. The CoreTidy pass makes the decision about which names should
be External and which Internal.
<p><li> A <code>System</code> name is for the most part the same as an
<code>Internal</code>. Indeed, the differences are purely cosmetic:
<ul>
<li>Internal names usually come from some name the
user wrote, whereas a System name has an OccName like "a", or "t". Usually
there are masses of System names with the same OccName but different uniques,
whereas typically there are only a handful of distinct Internal names with the same
OccName.
<li>
Another difference is that, when unifying, the type checker tries to
unify away type variables with System names, leaving ones with Internal names
(to improve error messages).
</ul>
</ul>
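<p>
The distinctions drawn in the list above can be summarised in a small sketch. The <code>Module</code> synonym and the helper names <code>isExternal</code>, <code>canClone</code>, <code>Linkage</code> and <code>linkageAfterTidy</code> are hypothetical and exist only for this illustration; they are not GHC's API.

```haskell
-- Simplified stand-in: GHC's real Module type is richer than a String.
type Module = String

data NameSort
  = External Module
  | Internal
  | System
  deriving (Eq, Show)

-- True only for names that carry their defining module.
isExternal :: NameSort -> Bool
isExternal (External _) = True
isExternal _            = False

-- External names keep their unique for good; Internal and System
-- names may be cloned freely.
canClone :: NameSort -> Bool
canClone = not . isExternal

-- After CoreTidy, the sort decides what kind of symbol ends up
-- in the object file.
data Linkage = ExternalSymbol | LocalSymbol
  deriving (Eq, Show)

linkageAfterTidy :: NameSort -> Linkage
linkageAfterTidy s
  | isExternal s = ExternalSymbol
  | otherwise    = LocalSymbol
```

Note that <code>Internal</code> and <code>System</code> behave identically here, which matches the remark that their differences are cosmetic.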
<h2> <a name="occname">Occurrence names: <code>OccName</code></a> </h2>
An <code>OccName</code> is more-or-less just a string, like "foo" or "Tree",
giving the (unqualified) name of an entity.
Well, not quite just a string, because in Haskell a name like "C" could mean a type
constructor or data constructor, depending on context. So GHC defines a type
<tt>OccName</tt> (defined in <tt>basicTypes/OccName.lhs</tt>) that is a pair of
a <tt>FastString</tt> and a <tt>NameSpace</tt> indicating which name space the
name is drawn from:
<pre>
data OccName = OccName NameSpace EncodedFS
</pre>
The <tt>EncodedFS</tt> is a synonym for <tt>FastString</tt> indicating that the
string is Z-encoded. (Details in <tt>OccName.lhs</tt>.) Z-encoding encodes
funny characters like '+' and '$' into alphabetic characters, like "zp" and "zd",
so that they can be used in object-file symbol tables without confusing linkers
and suchlike.
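<p>
To give a feel for Z-encoding, here is a partial sketch of an encoder. The table covers only a handful of characters, and the function names <code>zEncodeChar</code> and <code>zEncode</code> are chosen for this example; any character not listed is passed through unchanged, which is a simplification. Consult <tt>OccName.lhs</tt> for the full table.

```haskell
-- Partial Z-encoding table: a sketch, not the complete encoder.
-- 'z' and 'Z' are escape characters, so they must be doubled.
zEncodeChar :: Char -> String
zEncodeChar 'z' = "zz"
zEncodeChar 'Z' = "ZZ"
zEncodeChar '$' = "zd"
zEncodeChar '+' = "zp"
zEncodeChar '%' = "zv"
zEncodeChar '.' = "zi"
zEncodeChar '_' = "zu"
zEncodeChar c   = [c]   -- simplification: everything else passes through

-- Encode a whole string one character at a time.
zEncode :: String -> String
zEncode = concatMap zEncodeChar
```

Because 'z' is itself escaped as "zz", an ordinary identifier containing a 'z' cannot be confused with an escape sequence.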
<p>
The name spaces are:
<ul>
<li> <tt>VarName</tt>: ordinary variables
<li> <tt>TvName</tt>: type variables
<li> <tt>DataName</tt>: data constructors
<li> <tt>TcClsName</tt>: type constructors and classes (in Haskell they share a name space)
</ul>
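<p>
A minimal sketch of how the name space disambiguates a string such as "C". For simplicity the payload here is a plain <code>String</code> rather than an <tt>EncodedFS</tt>, and the derived instances are an assumption of this example rather than a copy of <tt>OccName.lhs</tt>.

```haskell
-- The four name spaces listed above.
data NameSpace = VarName | TvName | DataName | TcClsName
  deriving (Eq, Ord, Show)

-- Simplification: a plain String stands in for EncodedFS.
data OccName = OccName NameSpace String
  deriving (Eq, Ord, Show)

-- The same string in two name spaces gives two different OccNames.
dataConC, tyConC :: OccName
dataConC = OccName DataName  "C"
tyConC   = OccName TcClsName "C"
```

Comparing <code>dataConC</code> with <code>tyConC</code> yields inequality even though both print as "C" in source code.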
<!-- hhmts start -->
Last modified: Tue Nov 13 14:11:35 EST 2001
<!-- hhmts end -->
</small>
</body>
</html>
......@@ -16,31 +16,12 @@ Roughly speaking, It has the type:
<pre>
HsModule RdrName -> HsModule Name
</pre>
That is, it converts all the <tt>RdrNames</tt> to <tt>Names</tt>.
That is, it converts all the <tt>RdrNames</tt> to <a href="names.html"><tt>Names</tt></a>.
<h2> OccNames, RdrNames, and Names </h2>
<h2> RdrNames </h2>
A <tt>RdrName</tt> is pretty much just a string (for an unqualified name
like "<tt>f</tt>") or a pair of strings (for a qualified name like "<tt>M.f</tt>").
Well, not quite just strings, because in Haskell a name like "C" could mean a type
constructor or data constructor, depending on context. So GHC defines a type
<tt>OccName</tt> (defined in <tt>basicTypes/OccName.lhs</tt>) that is a pair of
a <tt>FastString</tt> and a <tt>NameSpace</tt> indicating which name space the
name is drawn from:
<pre>
data OccName = OccName NameSpace EncodedFS
</pre>
The <tt>EncodedFS</tt> is a synonym for <tt>FastString</tt> indicating that the
string is Z-encoded. (Details in <tt>OccName.lhs</tt>.)
<p>
The name spaces are:
<ul>
<li> <tt>VarName</tt>: ordinary variables
<li> <tt>TvName</tt>: type variables
<li> <tt>DataName</tt>: data constructors
<li> <tt>TcClsName</tt>: type constructors and classes (in Haskell they share a name space)
</ul>
So a <tt>RdrName</tt> is defined thus:
like "<tt>f</tt>") or a pair of strings (for a qualified name like "<tt>M.f</tt>"):
<pre>
data RdrName = RdrName Qual OccName
......@@ -54,35 +35,11 @@ So a <tt>RdrName</tt> is defined thus:
| Orig ModuleName -- This is an *original* name; the module is the place
-- where the thing was defined
</pre>
The OccName type is described in <a href="names.html#occname">"The truth about names"</a>.
<p>
The <tt>Orig</tt> variant is used internally; it allows GHC to speak of <tt>RdrNames</tt>
that refer to the original name of the thing.
<p>
On the other hand, a <tt>Name</tt>:
<ul>
<li> Contains the <em>original name</em> for the thing.
<li> Contains a <tt>Unique</tt> that makes it easy to compare names for equality quickly.
<li> Contains a <tt>SrcLoc</tt> saying where the name was bound.
</ul>
The <em>original name</em> of an entity (type constructor, class, function etc) is
the (module,name) pair describing where the thing was originally defined. So for example,
if we have
<pre>
module M where
f = e1
g = e2
module A where
import qualified M as Q
import M
a = Q.f + g
</pre>
then the RdrNames for "a", "Q.f" and "g" get replaced by the Names
"A.a", "M.f", and "M.g" respectively.
<p>
<tt>Names</tt> come in two flavours: Local and Global. The Global kind contains
both a <tt>Module</tt> and an <tt>OccName</tt>.
Not all Names are qualified. Local (e.g. lambda-bound) names are given Local Names.
<h2> Rebindable syntax </h2>
......
......@@ -26,10 +26,12 @@ represents variables, both term variables and type variables:
</pre>
<ul>
<li> The <code>varName</code> field contains the identity of the variable:
its unique number, and its print-name. The unique number is cached in the
<code>realUnique</code> field, just to make comparison of <code>Var</code>s a little faster.
its unique number, and its print-name. See "<a href="names.html">The truth about names</a>".
<p><li> The <code>Type</code> field gives the type of a term variable, or the kind of a
<p><li> The <code>realUnique</code> field caches the unique number in the
<code>varName</code> field, just to make comparison of <code>Var</code>s a little faster.
<p><li> The <code>varType</code> field gives the type of a term variable, or the kind of a
type variable. (Types and kinds are both represented by a <code>Type</code>.)
<p><li> The <code>varDetails</code> field distinguishes term variables from type variables,
......@@ -57,6 +59,7 @@ We define a couple of type synonyms:
just to help us document the occasions when we are expecting only term variables,
or only type variables.
<h2> The <code>VarDetails</code> field </h2>
The <code>VarDetails</code> field tells what kind of variable this is:
......@@ -206,24 +209,15 @@ to find calls to overloaded functions, <em>and then discards the <code>SpecPragm
So <code>SpecPragma</code> behaves like <code>Exported</code>, at least until the specialiser.
<h3>Global and Local <code>Name</code>s</h3>
<h3> ExternalNames and InternalNames </h3>
Notice that whether an Id is a <code>LocalId</code> or <code>GlobalId</code> is
not the same as whether the Id has a <code>Local</code> or <code>Global</code> <code>Name</code>:
not the same as whether the Id has an <code>ExternalName</code> or an <code>InternalName</code>
(see "<a href="names.html#sort">The truth about Names</a>"):
<ul>
<li> Every <code>GlobalId</code> has a <code>Global</code> <code>Name</code>.
<li> Every <code>GlobalId</code> has an <code>ExternalName</code>.
<li> A <code>LocalId</code> might have either kind of <code>Name</code>.
</ul>
The significance of Global vs Local names is this:
<ul>
<li> A <code>Global</code> Name has a module and occurrence name; a <code>Local</code>
has only an occurrence name.
<p> <li> A <code>Global</code> Name has a unique that never changes. It is never
cloned. This is important, because the simplifier invents new names pretty freely,
but we don't want to lose the connection with the type environment (constructed earlier).
A <code>Local</code> name can be cloned freely.
</ul>
<!-- hhmts start -->
Last modified: Tue Nov 13 14:11:35 EST 2001
......