|
|
# Proposal: [TypeDirectedNameResolution](type-directed-name-resolution)
|
|
|
|
|
|
<table><tr><th> Ticket </th>
|
|
|
<th>[\#129](https://gitlab.haskell.org//haskell/prime/issues/129)</th></tr>
|
|
|
<tr><th> Dependencies </th>
|
|
|
<th> names of other proposals on which this one depends
|
|
|
</th></tr>
|
|
|
<tr><th> Related </th>
|
|
|
<th>http://hackage.haskell.org/trac/haskell-prime/wiki/ExistingRecords?</th></tr></table>
|
|
|
|
|
|
## Exploiting the power of the dot
|
|
|
|
|
|
|
|
|
What do I envy about object-oriented languages? Not state, not subtyping,
|
|
|
not inheritance. But I *do* envy the power of type-based name resolution.
|
|
|
Here's what I mean:
|
|
|
|
|
|
- Programers can explore a library, with the help of an IDE, by typing "`x.`", at which point a list of x's methods pops up.
|
|
|
- x's methods can have short, perspicous names. "`x.reset`" calls x's `reset` method, while "`y.reset`" calls y's. The types (ie classes) to which x and y belong may be completely different; but both classes can have a method called "`reset`" without confusion. (And similarly for fields.)
|
|
|
|
|
|
|
|
|
What I only realised recently is that I was envying a feature that has
|
|
|
a *cultural* connection to OO, but that turns out to be fully
|
|
|
compatible with a functional language. This page describes a possible
|
|
|
extension to Haskell, which I'll call **Type-directed name
|
|
|
resolution**, or **TDNR**, intended to add the power of the dot to
|
|
|
Haskell.
|
|
|
|
|
|
## The problem
|
|
|
|
|
|
|
|
|
In Haskell, when we have several similarly named functions, we reply
|
|
|
on the module system to say which one we mean. For example:
|
|
|
|
|
|
```wiki
|
|
|
module Foo where
|
|
|
import Button( Button, reset ) as B
|
|
|
import Canvas( Canvas, reset ) as C
|
|
|
|
|
|
f :: Button -> Canvas -> IO ()
|
|
|
f b c = do { B.reset b; C.reset c }
|
|
|
```
|
|
|
|
|
|
|
|
|
Here I'm using a qualified name to say which `reset` I mean. And in
|
|
|
principle an IDE could pop up a list of what's imported from B if I
|
|
|
type "`B.`".
|
|
|
|
|
|
|
|
|
This is sufficient, but it is just sufficiently inconvenient that
|
|
|
people don't use it much. It is tantalising that knowing that `b` is
|
|
|
a `Button`, there is only one `reset` that can possibly work. Indeed,
|
|
|
any red-blooded Java programmer would expect to write
|
|
|
|
|
|
```wiki
|
|
|
f :: Button -> Canvas -> IO ()
|
|
|
f b c = do { b.reset; c.reset }
|
|
|
```
|
|
|
|
|
|
|
|
|
The same story applies in spades if 'reset' is actually the name of a
|
|
|
record field. Object-oriented folk are (literally) incredulous when
|
|
|
they find that two different data types cannot use the same field
|
|
|
name! To get that effect we must define the two data types in
|
|
|
different modules, and import them as above. For example, maybe
|
|
|
module `Button` looks like this:
|
|
|
|
|
|
```wiki
|
|
|
module Button where
|
|
|
data Button = MkButton { reset :: IO (), .... }
|
|
|
```
|
|
|
|
|
|
|
|
|
(and similarly `Canvas`). In Haskell98, here is how the client looks:
|
|
|
|
|
|
```wiki
|
|
|
module Foo where
|
|
|
import Button( Button(..) ) as B
|
|
|
import Canvas( Canvas(..) ) as C
|
|
|
|
|
|
mkBut :: IO () -> Button
|
|
|
mkBut r = Button { B.reset = r, .... }
|
|
|
|
|
|
updBut :: IO () -> Button -> Button
|
|
|
updBut r b = b { B.reset = r, .... }
|
|
|
```
|
|
|
|
|
|
|
|
|
Even though it is crystal clear in `mkBut` that the only `reset` that
|
|
|
could possibly make sense is the one from B, we must still say which
|
|
|
one. It's slightly less crystal clear in `updBut`, so Haskell 98 is
|
|
|
being consistent here. Nevertheless, GHC has an extension
|
|
|
`-XDisambiguateRecordFields` that allows you to omit teh "`B.`" in
|
|
|
`mkBut`.
|
|
|
|
|
|
## The proposal
|
|
|
|
|
|
|
|
|
The goal of TDNR is identical to that of qualified names: to offer an
|
|
|
alternative way to specify which `reset` (among perhaps several that
|
|
|
are in scope) is intended by an occurrence of "`reset`" in the source
|
|
|
code. TDNR is not "overloading" in the sense of type classes.
|
|
|
|
|
|
|
|
|
The basic design is simple:
|
|
|
|
|
|
- Extend the syntax of atoms:
|
|
|
|
|
|
```wiki
|
|
|
atom ::= var_qual -- Qualified variable name
|
|
|
| var_unqual -- Unqualified variable name
|
|
|
| '(' expr ')' -- Parenthesised expression
|
|
|
| atom '.' var_unqual -- TDNR invocation
|
|
|
```
|
|
|
|
|
|
The part after the dot a simple unqualified name (not a qualified name).
|
|
|
For example, `a`, `a.f` and `a.f.g` are atoms.
|
|
|
|
|
|
- The dynamic semantcs of `a.f` is simply reverse application `(f a)`.
|
|
|
|
|
|
- Unlike normal unqualified variable occurrences, it is legal for there to be many `f`'s
|
|
|
in scope. To resolve which one is intended, find the type of `a`, and the type of all
|
|
|
of the in-scope `f`'s. If there is exactly one `f` whose type matches that of `a`,
|
|
|
that resolve the occurrence of `f`. Otherwise the program is in error.
|
|
|
|
|
|
|
|
|
That's it.
|
|
|
|
|
|
|
|
|
## Syntax
|
|
|
|
|
|
|
|
|
TDNR is driven by a new syntactic form "`a.f`" which, for lack of a
|
|
|
better term, we call a "TDNR invocation". That is what makes the
|
|
|
type-directed name resolution spring into action. Otherwise it is
|
|
|
inactive. You are free to write "`(f a)`" instead, but then you must
|
|
|
use Haskell's module system to disambiguate which `f` you mean.
|
|
|
|
|
|
|
|
|
I have deliberately used dot "`.`" for this feature, for several reasons:
|
|
|
|
|
|
- It's standard practice, and that counts for a lot.
|
|
|
- Selecting a field from a record is a particularly convenient special case,
|
|
|
and the standard notation for that is "r.f".
|
|
|
- TDNR is doing the same job as Haskell's existing qualified names,
|
|
|
so it makes sense to use the sane notatoin.
|
|
|
|
|
|
|
|
|
Of course, there is a problem: Haskell already uses dot for (a)
|
|
|
qualified names, and (b) the function composition operator.
|
|
|
|
|
|
|
|
|
Concerning function composition, TDNR uses the same solution as for
|
|
|
qualified names: if the dot is surrounded by spaces it means function
|
|
|
composition; qualified names and TDNR are both written without spaces.
|
|
|
|
|
|
|
|
|
The overlap with qualified names is a bit more tiresome. A qualified
|
|
|
name for a non-data-constructor (such as "M.a") always finishes with a
|
|
|
lower-case name, or operator. So a subsequent "`.f`" must be a use of
|
|
|
TDNR. That is why I only allow `var_unqual` (e.g. "a") or `var_qual`
|
|
|
(e.g. "M.a") to the left of the dot, but not `con_unqual` (e.g. "C")
|
|
|
or `con_unqual` (e.g. "M.C"). The latter are indistinguishable from
|
|
|
qualified names. If you want a TDNR invocation for a constructor you
|
|
|
must use parentheses, thus "©.f" or "(M.C).f". But it's rare to
|
|
|
want this.
|
|
|
|
|
|
|
|
|
Another choice I have made is that you can say `(h 3).f`, i.e. a TNDR
|
|
|
application with a function call to the left. In Java, people often
|
|
|
write strings of dotted things, like "x.f(3).g(v,w).h". In Haskell
|
|
|
with TDNR this would be rendered "((x.f 3).g v w).h". I can't say I like
|
|
|
all those stacked-up parentheses, but I can't see a better way to do it.
|
|
|
|
|
|
|
|
|
Does TDNR bind more or less tightly than function application. That is,
|
|
|
does "f x.g" mean "(f x).g" or "f (x.g)"? If TDNR is regarded as a kind of
|
|
|
extension of qualified names, it ought to be the latter.
|
|
|
|
|
|
## Discussion
|
|
|
|
|
|
### Works with any function
|
|
|
|
|
|
|
|
|
Notice that, although record field names work great with TDNR, nothing
|
|
|
requires the "x" in "a.x" to be a record field name. The function "x"
|
|
|
can be any old function whose type looks like "*t1 → t2*". So the
|
|
|
first argument of a curried function is special. For example,
|
|
|
consider a finite map module with signaturs like this:
|
|
|
|
|
|
```wiki
|
|
|
data Map k v
|
|
|
lookup :: Ord k => Map k v -> k -> Maybe v
|
|
|
insert :: Ord k => Map k v -> k -> v -> Map k v
|
|
|
elems :: Map k v -> [v]
|
|
|
```
|
|
|
|
|
|
|
|
|
Then, if `m :: Map k v`, we can write
|
|
|
|
|
|
```wiki
|
|
|
m.lookup :: k -> Maybe v
|
|
|
m.insert :: k -> v -> Map k v
|
|
|
m.elems :: [v]
|
|
|
```
|
|
|
|
|
|
|
|
|
This is good if there are several sorts of maps or association lists
|
|
|
floating around, each with their own `lookup` and `insert` functions.
|
|
|
|
|
|
### Details of name resolution
|
|
|
|
|
|
|
|
|
It's very important that whether or not a program typechecks should be
|
|
|
independent of the order in which the typechecker traverses the
|
|
|
program. Consider
|
|
|
|
|
|
```wiki
|
|
|
f x = (x.reset, x::Button)
|
|
|
```
|
|
|
|
|
|
|
|
|
In a right-to-left order the type checker discovers that x has type
|
|
|
`Button`, so it can resolve which `reset` you mean. But if it worked
|
|
|
left to right, it would know nothing about x. To maintain the same
|
|
|
behaviour in both cases, the type inference engine should generate a
|
|
|
name-resolution constraint if it can't immediately figure out the
|
|
|
answer, in order to defer the choice. Then the constraint can be
|
|
|
solve at the end.
|
|
|
|
|
|
|
|
|
In a sense, this is like type-class constraints, except that these
|
|
|
special constraints are never quantified over; they serve only to
|
|
|
defer name resolution until the type of x is known.
|
|
|
|
|
|
### Multiple records with the same field
|
|
|
|
|
|
|
|
|
So far we have assumed Haskell 98 rules, apart from the TDNR
|
|
|
extension. So it is still illegal to say
|
|
|
|
|
|
```wiki
|
|
|
data S = MkS { x,y :: Int }
|
|
|
data T = MkT { x,y :: Float }
|
|
|
```
|
|
|
|
|
|
|
|
|
because that would define two top-level functions, both called `x`.
|
|
|
(And similarly `y`.) But we probably really want to allow this. The
|
|
|
problem is that you could then *only* refer to `x` and `y` using
|
|
|
TDNR, and I rather dislike that; I would prefer TDNR to be a convenient
|
|
|
abbrevation for something one could write more verbosely. If you
|
|
|
like, what is their "original name"?
|
|
|
|
|
|
|
|
|
I can't see any decent way to be able to directly name "the `x` from `S`".
|
|
|
But it's not untenable. If you wanted to pass S's selector x to `map`,
|
|
|
say, you'd have to eta-expand:
|
|
|
|
|
|
```wiki
|
|
|
map (\s::S. s.x) ss
|
|
|
```
|
|
|
|
|
|
### Multiple values with the same name
|
|
|
|
|
|
|
|
|
Do we want to allow this?
|
|
|
|
|
|
```wiki
|
|
|
get :: S -> Int
|
|
|
get s = s.x + s.y
|
|
|
|
|
|
get :: T -> Int
|
|
|
get t = t.x + t.y
|
|
|
```
|
|
|
|
|
|
|
|
|
That is, do we want to allow two top-level functions, both called
|
|
|
`get`? Arguably yes: if we allow both S and T to be defined in the
|
|
|
same module, even though they have the same field names,
|
|
|
it'd make sene to allow their accessor functions to
|
|
|
defined too.
|
|
|
|
|
|
|
|
|
But, again I could only refer to them using TDNR and, worse, there is
|
|
|
nothing in general to force them to have types that are amenable to
|
|
|
TDNR. This latter is different to record fields (whose types are
|
|
|
inherently suitable). So I propose to allow a module to contain
|
|
|
muliple types with the same field; but not muliple functions with the
|
|
|
same name.
|
|
|
|
|
|
## Record syntax
|
|
|
|
|
|
|
|
|
TDNR can apply to record construction and pattern-matching, as indeed
|
|
|
already happens in GHC with `-XDisambiguteRecordFields`. However I
|
|
|
propose *not* to apply it to record update, a construct that is already
|
|
|
rather complicated to typecheck. I do not want to make it worse. |