First Class Labels
The Opportunity
Haskell's type class and instance language enables a form of logic meta-programming at the level of types. Since record system proposal can often be phrased as an instance of Qualified Types, this means that many (not all!) features of polymorphic extensible record systems can be implemented as a Haskell library. To work around the limitation of Haskell 98, this tends to require several language extensions, as implemented in GHC and Hugs.
Examples include
- Strongly typed heterogeneous collections
- attachment:ticket:92:Data.Record.hs, which demonstrates an implementation of something similar to Daan Leijen's extensible records with scoped labels (though with type predicates instead of simple type system extension; as a bonus, we have record concatenation as well)
The Problem
Independent of the specific record library, there needs to be a representation of record field labels, and as record field selection in theses libraries is usually based on types, field labels need to be distinguishable by their types. In principle, this can easily be achieved by declaring for each label a single-constructor data type:
data LabelX = LabelX deriving (Read,Show)
However, this approach quickly runs into pragmatic issues in multi-module programming:
- There needs to be a common origin for importing record field label declarations used accross several modules.
- The labels occupy the same namespace as types and data constructors, and using qualified names as record field labels is awkward at best.
Of these, 1 is the most severe. To wit, imagine that we have two modules, A and B (think …OpenGL and some GUI library), that both want to use records with fields named 'PointX'. They'd both have to declare the field label, but if we ever want to import A and B into the same module C (think OpenGL windows in a GUI), we are in trouble, because we have two "conflicting" declarations for PointX. Note that the following doesn't work!
module A where
data PointX = PointX deriving Show
main = print PointX
module B where
data PointX = PointX deriving Show
main = print PointX
module C where
import A
import B
main = print [PointX,A.PointX,B.PointX] -- conflict here! ambiguous occurrence..
Not only is the unqualified reference to PointX in C ambiguous, but qualifying the labels doesn't help either: in spite of our intentions, A.PointX and B.PointX are different types!
With current Haskell, the only way around this is to modify the imports: introduce a new common ancestor module, say PointX, that declares the label PointX, and have both A and B import that. However, this is impractical: it breaks module composition, and there is no least upper bound in the import hierarchy where we can safely place our label declarations once and for all.
The Proposal
I propose to introduce implicitly declared typed labels as first-class values into Haskell' (ticket:92).
Options
Option 1:
make label declarations unneccessary, by reserving a separate namespace for
labels and their types (to be concrete, prefix identifiers with '#', so that
we'd have #pointX :: #PointX
).
This would affect language and implementations in the same way as numeric,
character, and string literals do. In particular, every occurrence of
'#'<identifier>
would be interpreted as a value of type '#'<Identifier>
(the label type name is the capitalized label name). Apart from having their
own namespace, identified by the prefix '#', labels and their types would
just be ordinary constants and types, respectively.
With this option, the problematic example would look like this:
module A where
main = print #pointX
module B where
main = print #pointX
module C where
import A
import B
main = print [#pointX,A.#pointX,B.#pointX] -- no conflicts here!
pro: simple in use, labels have their own namespace, no conflicting imports, known to work
con: need to give up some identifiable space in the language for labels and their types
Option 2:
make type sharing expressible (something like the sharing constraints in Standard ML's module language, to allow you to say when two declarations from different imports refer to the same type).
This would have a major impact on language and implementations. Assuming a sharing declaration of the form
sharing <type1> <type2>
the implementation would have to:
- find the declarations of
type1
andtype2
and check them for structural equivalence - unify
type1
andtype2
, ie., interpret either of them as a synonym for the same underlying type
In full generality, a feature like this would help to address similar problems with other conflicting imports, and could be extended to cover classes and instances as well (though instances couldn't be named). For the current proposal, however, only a trivial part of that generality would be needed.
With this option, the problematic example would look like this:
module A where
data PointX = PointX deriving Show
main = print PointX
module B where
data PointX = PointX deriving Show
main = print PointX
module C where
import A
import B
sharing A.PointX B.PointX
main = print [PointX,A.PointX,B.PointX] -- no conflicts here!
pro: seems like a useful feature anyway
con: more complex than needed for this proposal, and would be rather verbose in use
Option 3:
introduce a common least upper bound for shared label imports. (to be
concrete: there would be a module Data.Label
, implicitly providing shared
declarations of any labels).
This would have a similarly small effect on the type system as Option 1, only
that instead of syntax, we'd use imports from the reserved module Data.Label
to identify what is a label and what is not.
Whenever encountering an import Data.Label(<identifier>)
, we interpret
Data.Label.<identifier>
as a constant of type Data.Label.<Identifier>
and
<identifier>
as a constant of type <Identifier>
. the difference to normal
imports is that the compiler/type system needs to know about Data.Label
.
In other words, Data.Label
does not exist in source or object code, but as a
hint for the compiler/type system. Any identifier imported from there is a
label of its own type, nothing else can be imported from there.
With this option, the problematic example would look like this:
module A where
import Data.Label(pointX)
main = print pointX
module B where
import Data.Label(pointX)
main = print pointX
module C where
import A
import B
main = print [pointX,A.pointX,B.pointX] -- no conflicts here!
pro: no syntax extension or separate label namespace, no problems with common imports
con: no separate label namespace, labels still need to be declared, by means of import
Related Tickets and Links
-
ticket:92 add first class labels
-
ticket:27 tweak the existing records system (adopt: none)
-
ticket:54 add overlapping or incoherent instances (adopt: probably no)
-
ticket:71 Allow Undecidable Instances (adopt: probably no)
-
ticket:36 add FunctionalDependencies (adopt: probably no)
-
ticket:49 add multi parameter type classes (adopt: probably yes)