|
|
# Proposal: [TypeDirectedNameResolution](type-directed-name-resolution)
|
|
|
|
|
|
|
|
|
<table><tr><th> Ticket </th>
|
|
|
<th>[\#129](https://gitlab.haskell.org//haskell/prime/issues/129)</th></tr>
|
|
|
<th> [\#129](https://gitlab.haskell.org//haskell/prime/issues/129)
|
|
|
</th></tr>
|
|
|
<tr><th> Dependencies </th>
|
|
|
<th></th></tr>
|
|
|
<th> names of other proposals on which this one depends
|
|
|
</th></tr>
|
|
|
<tr><th> Related </th>
|
|
|
<th>[ExistingRecords](existing-records)</th></tr></table>
|
|
|
|
|
|
<th> http://hackage.haskell.org/trac/haskell-prime/wiki/ExistingRecords?
|
|
|
</th></tr></table>
|
|
|
|
|
|
See also
|
|
|
|
|
|
- email thread at [ http://thread.gmane.org/gmane.comp.lang.haskell.cafe/61723](http://thread.gmane.org/gmane.comp.lang.haskell.cafe/61723)
|
|
|
- The [ publicly-editable TDNR discussion page](http://haskell.org/haskellwiki/TypeDirectedNameResolution), and straw poll
|
|
|
|
|
|
## Exploiting the power of the dot
|
|
|
|
|
|
|
|
|
|
|
|
What do I envy about object-oriented languages? Not state, not subtyping,
|
|
|
not inheritance. But I *do* envy the power of type-based name resolution.
|
|
|
Here's what I mean:
|
|
|
|
|
|
|
|
|
- Programers can explore a library, with the help of an IDE, by typing "`x.`", at which point a list of x's methods pops up.
|
|
|
- x's methods can have short, perspicous names. "`x.reset`" calls x's `reset` method, while "`y.reset`" calls y's. The types (ie classes) to which x and y belong may be completely different; but both classes can have a method called "`reset`" without confusion. (And similarly for fields.)
|
|
|
|
... | ... | @@ -31,12 +32,15 @@ extension to Haskell, which I'll call **Type-directed name |
|
|
resolution**, or **TDNR**, intended to add the power of the dot to
|
|
|
Haskell.
|
|
|
|
|
|
|
|
|
## The problem
|
|
|
|
|
|
|
|
|
|
|
|
In Haskell, when we have several similarly named functions, we reply
|
|
|
on the module system to say which one we mean. For example:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
module Foo where
|
|
|
import Button( Button, reset ) as B
|
... | ... | @@ -52,11 +56,13 @@ principle an IDE could pop up a list of what's imported from B if I |
|
|
type "`B.`".
|
|
|
|
|
|
|
|
|
Using qualified names works, but it is just sufficiently inconvenient that
|
|
|
|
|
|
This is sufficient, but it is just sufficiently inconvenient that
|
|
|
people don't use it much. It is tantalising that knowing that `b` is
|
|
|
a `Button`, there is only one `reset` that can possibly work. Indeed,
|
|
|
any red-blooded Java programmer would expect to write
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
f :: Button -> Canvas -> IO ()
|
|
|
f b c = do { b.reset; c.reset }
|
... | ... | @@ -70,6 +76,7 @@ name! To get that effect we must define the two data types in |
|
|
different modules, and import them as above. For example, maybe
|
|
|
module `Button` looks like this:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
module Button where
|
|
|
data Button = MkButton { reset :: IO (), .... }
|
... | ... | @@ -78,6 +85,7 @@ module `Button` looks like this: |
|
|
|
|
|
(and similarly `Canvas`). In Haskell98, here is how the client looks:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
module Foo where
|
|
|
import Button( Button(..) ) as B
|
... | ... | @@ -98,19 +106,20 @@ being consistent here. Nevertheless, GHC has an extension |
|
|
`-XDisambiguateRecordFields` that allows you to omit teh "`B.`" in
|
|
|
`mkBut`.
|
|
|
|
|
|
|
|
|
## The proposal
|
|
|
|
|
|
|
|
|
|
|
|
The goal of TDNR is identical to that of qualified names: to offer an
|
|
|
alternative way to specify which `reset` (among perhaps several that
|
|
|
are in scope) is intended by an occurrence of "`reset`" in the source
|
|
|
code. TDNR is not "overloading" in the sense of type classes.
|
|
|
|
|
|
|
|
|
|
|
|
The basic design is simple:
|
|
|
|
|
|
- There is a new lexeme, of the form `'.' varid`, which we call `dot_var`.
|
|
|
The part after the dot a simple unqualified name (not a qualified name).
|
|
|
|
|
|
- Extend the syntax of atoms:
|
|
|
|
... | ... | @@ -118,12 +127,13 @@ The basic design is simple: |
|
|
atom ::= var_qual -- Qualified variable name
|
|
|
| var_unqual -- Unqualified variable name
|
|
|
| '(' expr ')' -- Parenthesised expression
|
|
|
| atom dot_var -- TDNR invocation
|
|
|
| atom '.' var_unqual -- TDNR invocation
|
|
|
```
|
|
|
|
|
|
The part after the dot a simple unqualified name (not a qualified name).
|
|
|
For example, `a`, `a.f` and `a.f.g` are atoms.
|
|
|
|
|
|
- The dynamic semantics of `a.f` is simply reverse application `(f a)`.
|
|
|
- The dynamic semantcs of `a.f` is simply reverse application `(f a)`.
|
|
|
|
|
|
- Unlike normal unqualified variable occurrences, it is legal for there to be many `f`'s
|
|
|
in scope. To resolve which one is intended, find the type of `a`, and the type of all
|
... | ... | @@ -134,9 +144,11 @@ The basic design is simple: |
|
|
That's it.
|
|
|
|
|
|
|
|
|
|
|
|
## Syntax
|
|
|
|
|
|
|
|
|
|
|
|
TDNR is driven by a new syntactic form "`a.f`" which, for lack of a
|
|
|
better term, we call a "TDNR invocation". That is what makes the
|
|
|
type-directed name resolution spring into action. Otherwise it is
|
... | ... | @@ -144,24 +156,28 @@ inactive. You are free to write "`(f a)`" instead, but then you must |
|
|
use Haskell's module system to disambiguate which `f` you mean.
|
|
|
|
|
|
|
|
|
|
|
|
I have deliberately used dot "`.`" for this feature, for several reasons:
|
|
|
|
|
|
|
|
|
- It's standard practice, and that counts for a lot.
|
|
|
- Selecting a field from a record is a particularly convenient special case,
|
|
|
and the standard notation for that is "r.f".
|
|
|
- TDNR is doing the same job as Haskell's existing qualified names,
|
|
|
so it makes sense to use the sane notation.
|
|
|
so it makes sense to use the sane notatoin.
|
|
|
|
|
|
|
|
|
Of course, there is a problem: Haskell already uses dot for (a)
|
|
|
qualified names, and (b) the function composition operator.
|
|
|
|
|
|
|
|
|
|
|
|
Concerning function composition, TDNR uses the same solution as for
|
|
|
qualified names: if the dot is surrounded by spaces it means function
|
|
|
composition; qualified names and TDNR are both written without spaces.
|
|
|
|
|
|
|
|
|
|
|
|
The overlap with qualified names is a bit more tiresome. A qualified
|
|
|
name for a non-data-constructor (such as "M.a") always finishes with a
|
|
|
lower-case name, or operator. So a subsequent "`.f`" must be a use of
|
... | ... | @@ -173,6 +189,7 @@ must use parentheses, thus "©.f" or "(M.C).f". But it's rare to |
|
|
want this.
|
|
|
|
|
|
|
|
|
|
|
|
Another choice I have made is that you can say `(h 3).f`, i.e. a TNDR
|
|
|
application with a function call to the left. In Java, people often
|
|
|
write strings of dotted things, like "x.f(3).g(v,w).h". In Haskell
|
... | ... | @@ -180,86 +197,25 @@ with TDNR this would be rendered "((x.f 3).g v w).h". I can't say I like |
|
|
all those stacked-up parentheses, but I can't see a better way to do it.
|
|
|
|
|
|
|
|
|
|
|
|
Does TDNR bind more or less tightly than function application. That is,
|
|
|
does "f x.g" mean "(f x).g" or "f (x.g)"? If TDNR is regarded as a kind of
|
|
|
extension of qualified names, it ought to be the latter.
|
|
|
|
|
|
## Stacking operations
|
|
|
|
|
|
|
|
|
Another possibility for syntax is this:
|
|
|
|
|
|
```wiki
|
|
|
expr ::= atom
|
|
|
| expr atom -- Application
|
|
|
| expr dot_occ -- TDNR invocation
|
|
|
| ...
|
|
|
|
|
|
atom ::= var -- Unchanged
|
|
|
| '(' expr ')'
|
|
|
| ...
|
|
|
|
|
|
-- dot_occ is a lexeme of form ".v"
|
|
|
```
|
|
|
|
|
|
|
|
|
Here the form ".v" is a lexical token. The two forms `(f x)` and `(x .f)` are application and TDNR invocation respectively. Both forms associate to the left, so `x .f 3 .g 7 5` means `(((((x .f) 3) .g ) 7) 5)`. This means that you can stack up successive TDNR invocations without parens:
|
|
|
|
|
|
```wiki
|
|
|
m .lookup key
|
|
|
.snd
|
|
|
.reverse
|
|
|
```
|
|
|
|
|
|
|
|
|
which means `reverse(snd (lookup m key))`. And that in turn means the same as
|
|
|
|
|
|
```wiki
|
|
|
reverse . snd . (\m -> lookup m key) $ m
|
|
|
```
|
|
|
## Discussion
|
|
|
|
|
|
|
|
|
There is something odd about this, however. We'd really like to write
|
|
|
|
|
|
```wiki
|
|
|
x .reverse
|
|
|
.filter isEven
|
|
|
.map double
|
|
|
```
|
|
|
|
|
|
|
|
|
but that doesn't work because the list is `filter`'s *last* argument, not its first. Maybe you should be able to write
|
|
|
|
|
|
```wiki
|
|
|
x .reverse .(filter isEven) .(map double)
|
|
|
```
|
|
|
|
|
|
|
|
|
(Of course the spaces can be omitted). And that would be fine provided we allowed this:
|
|
|
|
|
|
```wiki
|
|
|
expr ::= atom | expr atom
|
|
|
| expr dot_occ
|
|
|
| expr '.(' var expr1 .. exprn ')
|
|
|
...
|
|
|
```
|
|
|
|
|
|
|
|
|
where `f .(g x)` means `(g x f)`. The oddness here is that the TDNR invocation can "look inside" the `.(..)` to see the function at the head. (And it had better BE a function, too.)
|
|
|
|
|
|
|
|
|
I think this is probably worth it, although it's a little odd. To me, the ability to "stack up" postfix operations is rather important, and the fact that it doesn't fit nicely is the biggest shortcoming of this whole proposal. Can anyone improve it?
|
|
|
|
|
|
## Discussion and other design choices
|
|
|
|
|
|
### Works with any function
|
|
|
|
|
|
|
|
|
|
|
|
Notice that, although record field names work great with TDNR, nothing
|
|
|
requires the "x" in "a.x" to be a record field name. The function "x"
|
|
|
can be any old function whose type looks like "*t1 → t2*". So the
|
|
|
first argument of a curried function is special. For example,
|
|
|
consider a finite map module with signatures like this:
|
|
|
consider a finite map module with signaturs like this:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
data Map k v
|
... | ... | @@ -271,6 +227,7 @@ elems :: Map k v -> [v] |
|
|
|
|
|
Then, if `m :: Map k v`, we can write
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
m.lookup :: k -> Maybe v
|
|
|
m.insert :: k -> v -> Map k v
|
... | ... | @@ -281,13 +238,16 @@ Then, if `m :: Map k v`, we can write |
|
|
This is good if there are several sorts of maps or association lists
|
|
|
floating around, each with their own `lookup` and `insert` functions.
|
|
|
|
|
|
|
|
|
### Details of name resolution
|
|
|
|
|
|
|
|
|
|
|
|
It's very important that whether or not a program typechecks should be
|
|
|
independent of the order in which the typechecker traverses the
|
|
|
program. Consider
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
f x = (x.reset, x::Button)
|
|
|
```
|
... | ... | @@ -302,16 +262,20 @@ answer, in order to defer the choice. Then the constraint can be |
|
|
solve at the end.
|
|
|
|
|
|
|
|
|
|
|
|
In a sense, this is like type-class constraints, except that these
|
|
|
special constraints are never quantified over; they serve only to
|
|
|
defer name resolution until the type of x is known.
|
|
|
|
|
|
|
|
|
### Multiple records with the same field
|
|
|
|
|
|
|
|
|
|
|
|
So far we have assumed Haskell 98 rules, apart from the TDNR
|
|
|
extension. So it is still illegal to say
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
data S = MkS { x,y :: Int }
|
|
|
data T = MkT { x,y :: Float }
|
... | ... | @@ -322,14 +286,16 @@ because that would define two top-level functions, both called `x`. |
|
|
(And similarly `y`.) But we probably really want to allow this. The
|
|
|
problem is that you could then *only* refer to `x` and `y` using
|
|
|
TDNR, and I rather dislike that; I would prefer TDNR to be a convenient
|
|
|
abbreviation for something one could write more verbosely. If you
|
|
|
abbrevation for something one could write more verbosely. If you
|
|
|
like, what is their "original name"?
|
|
|
|
|
|
|
|
|
|
|
|
I can't see any decent way to be able to directly name "the `x` from `S`".
|
|
|
But it's not untenable. If you wanted to pass S's selector x to `map`,
|
|
|
say, you'd have to eta-expand:
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
map (\s::S. s.x) ss
|
|
|
```
|
... | ... | @@ -337,8 +303,10 @@ say, you'd have to eta-expand: |
|
|
### Multiple values with the same name
|
|
|
|
|
|
|
|
|
|
|
|
Do we want to allow this?
|
|
|
|
|
|
|
|
|
```wiki
|
|
|
get :: S -> Int
|
|
|
get s = s.x + s.y
|
... | ... | @@ -351,123 +319,26 @@ Do we want to allow this? |
|
|
That is, do we want to allow two top-level functions, both called
|
|
|
`get`? Arguably yes: if we allow both S and T to be defined in the
|
|
|
same module, even though they have the same field names,
|
|
|
it'd make sense to allow their accessor functions to
|
|
|
it'd make sene to allow their accessor functions to
|
|
|
defined too.
|
|
|
|
|
|
|
|
|
|
|
|
But, again I could only refer to them using TDNR and, worse, there is
|
|
|
nothing in general to force them to have types that are amenable to
|
|
|
TDNR. This latter is different to record fields (whose types are
|
|
|
inherently suitable). So I propose to allow a module to contain
|
|
|
muliple types with the same field; but not multiple functions with the
|
|
|
muliple types with the same field; but not muliple functions with the
|
|
|
same name.
|
|
|
|
|
|
### Top-level disambiguation only
|
|
|
|
|
|
|
|
|
Consider this
|
|
|
## Record syntax
|
|
|
|
|
|
```wiki
|
|
|
data R = MkR { x,y :: Int }
|
|
|
|
|
|
f1 :: R -> Int -> R
|
|
|
f1 r x = r { x = x }
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
Function `f1` is already allowed in Haskell 98; in the update `r {x=x}` the first `x`
|
|
|
can only be a field name, while the second binds as usual to the local
|
|
|
variable that shadows the top-level name. This is good because it's convenient
|
|
|
to use similar names for field names as for local variables. Indeed GHC's support
|
|
|
for punning makes this even more attractive.
|
|
|
|
|
|
|
|
|
But now look at `f2`:
|
|
|
|
|
|
```wiki
|
|
|
f2 :: R -> Int
|
|
|
f2 r x = r.x + x
|
|
|
```
|
|
|
|
|
|
|
|
|
Arguably the same story should apply. Just as the update notation tells which is
|
|
|
a field name, the dot notation does the same. So perhaps in TDNR, the `r.x` should
|
|
|
choose among **top-level** bindings for `x`, ignoring nested ones.
|
|
|
|
|
|
### Qualified import
|
|
|
|
|
|
|
|
|
Consider this
|
|
|
|
|
|
```wiki
|
|
|
module R where
|
|
|
data R = MkR { x,y::Int }
|
|
|
|
|
|
module Foo where
|
|
|
import qualified R -- Note qualified
|
|
|
f1, f2 :: R -> Int
|
|
|
f1 r = R.y r
|
|
|
f2 r = r.x
|
|
|
```
|
|
|
|
|
|
|
|
|
Here the `R.y` in `f1` is an ordinary qualified name. But should
|
|
|
`r.x` in `f2` be allowed? Remember plain `x` isn't in scope; only `R.x` is. But
|
|
|
we don't want to have to force you to write `r.R.x`, even if we could parse
|
|
|
it! Maybe TDNR should choose among in-scope x's regardless of whether they are
|
|
|
qualified or not.
|
|
|
|
|
|
|
|
|
NB: GHC 6.12's existing record field disambiguation makes exactly this choice already.
|
|
|
See the [ user manual 7.3.14](http://www.haskell.org/ghc/dist/current/docs/html/users_guide/syntax-extns.html#disambiguate-fields).
|
|
|
|
|
|
### Record syntax
|
|
|
|
|
|
|
|
|
TDNR can apply to record construction and pattern-matching, as indeed
|
|
|
already happens in GHC with `-XDisambiguteRecordFields`. For example,
|
|
|
you could say
|
|
|
|
|
|
```wiki
|
|
|
a = R { x = 3 }
|
|
|
f (R {x = xval}) = xval
|
|
|
```
|
|
|
|
|
|
|
|
|
even if there were many x's in scope.
|
|
|
|
|
|
|
|
|
However I propose *not* to apply it to record update. Specifically
|
|
|
I propose not to take the expression
|
|
|
|
|
|
```wiki
|
|
|
r { x=3 }
|
|
|
```
|
|
|
|
|
|
|
|
|
and try to figure out which of many in-scope x's you mean by looking at the type of 'r'.
|
|
|
That's be possible, but it gets complicated: what about `(f v) {x=3}`? Record update
|
|
|
is already rather complicated to typecheck. I do not want to make it worse.
|
|
|
|
|
|
### Section-style selection
|
|
|
|
|
|
|
|
|
We have the option to allow '(.x)' as a valid expression, with its meaning given by the translation
|
|
|
|
|
|
```wiki
|
|
|
(.x) === (\f -> f.x)
|
|
|
```
|
|
|
|
|
|
|
|
|
This allows things like
|
|
|
|
|
|
```wiki
|
|
|
map (.x) rs
|
|
|
```
|
|
|
|
|
|
|
|
|
and means that `(.x) f` is equivalent to `f.x`. The syntax and meaning is consistent with right-section, although it is not really a right-section.
|
|
|
already happens in GHC with `-XDisambiguteRecordFields`. However I
|
|
|
propose *not* to apply it to record update, a construct that is already
|
|
|
rather complicated to typecheck. I do not want to make it worse.
|
|
|
|
|
|
|
|
|
What about `(.x.y)`? Does that expand to `(\f -> f.x.y)`? |