First Class Labels

The Opportunity

Haskell's type class and instance language enables a form of logic meta-programming at the level of types. Since record system proposal can often be phrased as an instance of Qualified Types, this means that many (not all!) features of polymorphic extensible record systems can be implemented as a Haskell library. To work around the limitation of Haskell 98, this tends to require several language extensions, as implemented in GHC and Hugs.

Examples include

Strongly typed heterogeneous collections
attachment:ticket:92:Data.Record.hs , which demonstrates an implementation of something similar to Daan Leijen's extensible records with scoped labels (though with type predicates instead of simple type system extension; as a bonus, we have record concatenation as well)

The Problem

Independent of the specific record library, there needs to be a representation of record field labels, and as record field selection in theses libraries is usually based on types, field labels need to be distinguishable by their types. In principle, this can easily be achieved by declaring for each label a single-constructor data type:

data LabelX = LabelX deriving (Read,Show)

However, this approach quickly runs into pragmatic issues in multi-module programming:

There needs to be a common origin for importing record field label declarations used accross several modules.
The labels occupy the same namespace as types and data constructors, and using qualified names as record field labels is awkward at best.

Of these, 1 is the most severe. To wit, imagine that we have two modules, A and B (think …OpenGL and some GUI library), that both want to use records with fields named 'PointX'. They'd both have to declare the field label, but if we ever want to import A and B into the same module C (think OpenGL windows in a GUI), we are in trouble, because we have two "conflicting" declarations for PointX. Note that the following doesn't work!

module A where 
data PointX = PointX deriving Show
main = print PointX

module B where 
data PointX = PointX deriving Show
main = print PointX

module C where
import A
import B
main = print [PointX,A.PointX,B.PointX] -- conflict here! ambiguous occurrence..

Not only is the unqualified reference to PointX in C ambiguous, but qualifying the labels doesn't help either: in spite of our intentions, A.PointX and B.PointX are different types!

With current Haskell, the only way around this is to modify the imports: introduce a new common ancestor module, say PointX, that declares the label PointX, and have both A and B import that. However, this is impractical: it breaks module composition, and there is no least upper bound in the import hierarchy where we can safely place our label declarations once and for all.

The Proposal

I propose to introduce implicitly declared typed labels as first-class values into Haskell' (ticket:92).

Options

Option 1:

make label declarations unneccessary, by reserving a separate namespace for labels and their types (to be concrete, prefix identifiers with '#', so that we'd have #pointX :: #PointX).

This would affect language and implementations in the same way as numeric, character, and string literals do. In particular, every occurrence of '#'<identifier> would be interpreted as a value of type '#'<Identifier> (the label type name is the capitalized label name). Apart from having their own namespace, identified by the prefix '#', labels and their types would just be ordinary constants and types, respectively.

With this option, the problematic example would look like this:

module A where 
main = print #pointX

module B where 
main = print #pointX

module C where
import A
import B
main = print [#pointX,A.#pointX,B.#pointX] -- no conflicts here!

pro: simple in use, labels have their own namespace, no conflicting imports, known to work

con: need to give up some identifiable space in the language for labels and their types

Option 2:

make type sharing expressible (something like the sharing constraints in Standard ML's module language, to allow you to say when two declarations from different imports refer to the same type).

This would have a major impact on language and implementations. Assuming a sharing declaration of the form

  sharing <type1> <type2>

the implementation would have to:

find the declarations of type1 and type2 and check them for structural equivalence
unify type1 and type2, ie., interpret either of them as a synonym for the same underlying type

In full generality, a feature like this would help to address similar problems with other conflicting imports, and could be extended to cover classes and instances as well (though instances couldn't be named). For the current proposal, however, only a trivial part of that generality would be needed.

With this option, the problematic example would look like this:

module A where 
data PointX = PointX deriving Show
main = print PointX

module B where 
data PointX = PointX deriving Show
main = print PointX

module C where
import A
import B
sharing A.PointX B.PointX
main = print [PointX,A.PointX,B.PointX] -- no conflicts here!

pro: seems like a useful feature anyway

con: more complex than needed for this proposal, and would be rather verbose in use

Option 3:

introduce a common least upper bound for shared label imports. (to be concrete: there would be a module Data.Label, implicitly providing shared declarations of any labels).

This would have a similarly small effect on the type system as Option 1, only that instead of syntax, we'd use imports from the reserved module Data.Label to identify what is a label and what is not.

Whenever encountering an import Data.Label(<identifier>), we interpret Data.Label.<identifier> as a constant of type Data.Label.<Identifier> and <identifier> as a constant of type <Identifier>. the difference to normal imports is that the compiler/type system needs to know about Data.Label.

In other words, Data.Label does not exist in source or object code, but as a hint for the compiler/type system. Any identifier imported from there is a label of its own type, nothing else can be imported from there.

With this option, the problematic example would look like this:

module A where 
import Data.Label(pointX)
main = print pointX

module B where 
import Data.Label(pointX)
main = print pointX

module C where
import A
import B
main = print [pointX,A.pointX,B.pointX] -- no conflicts here!

pro: no syntax extension or separate label namespace, no problems with common imports

con: no separate label namespace, labels still need to be declared, by means of import

Comments

Please register or sign in to add a comment.

first class labels

First Class Labels

The Opportunity

The Problem

The Proposal

Options

Related Tickets and Links

Comments

The Opportunity

The Problem

The Proposal

Options

Related Tickets and Links