Regression in newName

changed weight to 5

added Tbug TemplateHaskell Trac import labels

Aha. This is a consequence of fixing another bug #5037 (closed)! (See also Note [Binders in Template Haskell] in source:compiler/hsSyn/Convert.lhs.) Consider this:

x = 4
y = $(do { x <- newName "x"; return (LamE (VarP x) (VarE (mkName "x"))) })

So the binder is made with newName, but the occurrence is made with mkName (the "dynamic binding" bit of TH). Clearly, I think, the occurrence should should bind to the lambda, not to the top-level x. So the defns should end up thus:

-- Like this
   x = 4
   y = \x -> x

-- NOT like this
   x = 4
   y = \x' -> x

Similarly, what do you think should happen here:

$( do { x <- newName "x"; return (ValD (VarP x) (LitE 3)) } )
v = x

Current the newName expands to something that captures the x below the splice:

-- We get this
  x = 3
  v = x

-- NOT this
  x' = 3
  v = x

Remember too that the newName stuff is what happens behind the scences with a quotation like [d| x = 3 |].

For your data type example, suppose we had

-- declare 'data E = D' with a fresh name 'D'
$(do
    nm <- newName "D"
    return $ [DataD [] (mkName "E") [] [NormalC nm []] []] )

f :: E
f = D

Wouldn't you expect the D in the defn of F to bind to the D declared by the spliced data type decl? Just as if you'd written

data E = D
f :: E
f = D

Remember, again, that quotation uses newName for binders, and you'd definitely expect this to expand to the above code:

$( [d| data E = D |] )
f :: E
f = D

In short, it seems to me that the "multiple definitions" thing is right. But I accept that:

There is therefore no way to splice in a data type decl that is guaranteed not to clash with any existing one. That's bad. (In the case of terms, lambdas etc simply shadow existing defns, so the problem doesn't arise.) Maybe we need a way to make a fresh name that can be referred to only by knowing the name, and not by mkName or by open Haskell code?
The "internal error" message from the type checker is terrible.

Interesting. I've been using newName as you said at the end of your comment: to make a fresh name which can't possibly be captured anywhere except by use of the produced Name, and as far as I can tell, this is exactly how newName has behaved until after ghc-7.0.3. So:

I would definitely like some function which produces genuinely uncapturable names.
Given that newName has produced genuinely uncapturable names up to and including ghc-7.0.3, I would suggest keeping it this way. If the "partial capturing" behaviour of ghc HEAD's newName is really necessary (I don't think it is; see below), then I would at least suggest implementing it under a new function name, to avoid changing the semantics of newName.

Examples

As I said, however, I'm not convinced that a construct supporting "partial capturing" is actually necessary. On to the specifics of your examples! I'll talk about the examples without quotations [| ... |] first, and then I'll talk about quotations.

x = 4
y = $(do { x <- newName "x"; return (LamE (VarP x) (VarE (mkName "x"))) })

I would have said we produce

x = 4
y = \x' -> x

If you want to produce what you suggested, why not just write

x = 4
y = $(return (LamE (VarP (mkName "x")) (VarE (mkName "x"))) })

Similarly, with

$( do { x <- newName "x"; return (ValD (VarP x) (LitE 3)) } )
v = x

I would expect

x' = 3
v = x

Otherwise, why not use mkName? Similarly, I'd expect

-- declare 'data E = D' with a fresh name 'D'
$(do
    nm <- newName "D"
    return $ [DataD [] (mkName "E") [] [NormalC nm []] []] )

f :: E
f = D

to produce:

data E = D'
f = D

But I definitely agree with you on your examples involving quotations. So, for instance, this:

$( [d| data E = D |] )
f :: E
f = D

should compile successfully (and indeed it does in ghc-7.0.3), as should the example in #5037 (closed).

I think the way to achieve the correct behaviour for quotations is simply to use mkName in the desugarer, rather than using newName.

My suggestion

In summary, I think we should:

revert the recent changes to newName
make the desugarer use mkName instead of newName

Let me clarify what I mean when I say "use mkName" in the desugarer. Given this:

import Prelude(map)
g = [| map (\x -> x) |]

the name map should be bound to Prelude.map, whereas both occurrences of the name x should be created using mkName. That is, the quotation should desugar to:

g = AppE (VarE ''map) (LamE [VarP (mkName "x")] (VarE (mkName "x")))

Here's the rule. When encountering a name in a quotation:

if the binding for that name is outside the quotation (like map above), use a NameL or NameG as appropriate
if the binding for that name is inside the quotation, use mkName (i.e. a NameS)
if a binding for that name can't be found, produce an error (this is what ghc currently does)

Note that what I suggest should leave almost all Template Haskell programs with the same semantics, and allow a few more (such as the one in #5037 (closed)) to compile. Even quotations relying on shadowing, such as

f = $( [| \x -> (\x -> x) |] )

should behave the same as before: the only difference is that the shadowing is resolved after splicing rather than during desugaring of the quotation.

Hmm, that doesn't seem to work. Here's an example I didn't think of before:

f = $( [| \x -> $( LamE [VarP (mkName "x")] (VarE 'x) ) |] )

In this case, we would like to produce:

f = \x' -> \x -> x'

But with my suggestion, this wouldn't be possible: under my suggestion, the only possible desugaring of the 'x would be mkName "x", which is not what we want. I have to think some more about this.

Indeed. And the classic example from every meta-programming paper: look at gen in Section 2 of the original TH paper. We write things like

f v = ... [| \x -> ...$(f [| x |])... |] ...

and the 'x' passed to f (insde the quote) must be indissolubly linked to the 'x' created by the lambda. Note that the invocation of f inside the quote may itself make a new \x and so on (this happens in gen). These lambdas must create fresh names; hence newName.

Simon

mentioned in issue #5379 (closed)

I see. Incidentally, when playing around with this, I discovered a related bug, which I've reported: #5379 (closed).

So it seems like we need three ways of making names:

mkName, as it currently behaves
newName, as it behaved in ghc-7.0.3 and before: there is absolutely no way of names it produces being accidentally captured
newName, as it behaves in ghc HEAD.

As I said in my previous comment, would it at least be possible to give the third option a different name, to avoid changing the semantics of newName?

Reiner

changed milestone to %7.2.1

changed weight to 10

This also affects 7.2. Do we know how many packages/users are affected by this?

Trac metadata

Trac field	Value
Priority	normal → highest

I'm quite reluctant to introduce a third option on the spur of the moment. The intended semantics of newName has not changed; it simply had a bug before. The fact that quotation syntax is syntactic sugar for newName was, for example, stated explicitly in the TH2 note (section 4.2). The distinction between newName and mkName is already subtle enough, without adding a third variant to explain.

Also, the "totally fresh" semantics you want has narrow usefulness, because it can never be referred to anywhere outside that immediate quotation. You data constructor D, for example, could not be referred to from anywhere else (and could not be exported from the module) under the semantics you propose.

Really the only significant shortcoming of the status quo is that you can't make a data type that is local to a single splice and is totally invisible outside it. (Actually you can get close by using a long name or a random number, but those are really only approximations.)

Before doing the Real Work it would take (eg adding a new constructor to TH.Syntax.Name, I'd need to be convinced that it was really all worth it.

Does anyone else have an opinion? I'm adding Kathleen to the cc in case it rings a bell with her.

I should point out another (perhaps more common) need for the "totally fresh" semantics. When I write a quasiquoter which antiquotes to Haskell, for instance for interpolated strings (see http://hackage.haskell.org/package/interpolatedstring-perl6-0.8.1), consider the following:

f x = [qc|The value of x is {x}|]

(here antiquotation is delimited by {}s.) In this case, I want the antiquoted x to refer to f's parameter. My quasiquotation will evaluate to a syntax tree as follows:

f x = $( ... (VarE (mkName "x")) ... )

What happens if I want to bind a variable somewhere in the syntax tree? I might for instance want to produce the following:

f x = let theString = "The value of x is "
      in theString ++ show x

In this case the name theString should be totally fresh: it's an implementation detail of the quasiquoter, and we don't want its binding to capture the variable x. So in this case we want:

f x = 
  $(do
      theString <- newName "theString"  -- old semantics are needed here
      return $ LetE [ValD (VarP theString) (NormalB (LitE "The value of x is")) []]
                    (InfixE (VarE theString) '(++) (VarE (mkName "x")))
  )

Cheers, Reiner

I think that's the wrong approach. It places a global burden on every binding enclosing the occurrence of "x".

But the problem is not with those bindings; rather, it's with the occurrence of x. The quasiquoter should not use mkName to generate a string-name for x -- as you yourself pointed out in #4429 (closed). Rather it should use the proper Name of x! Just like 'x in Haskell code

So how can a quasiquoter get hold of the Name for x? That's a whole separate question, which we have open in #4429 (closed). If we had the lookupValue described there, the quasiquoter could use it to look x up.

Now, I grant that #4429 (closed) is languishing unloved, but I think that's the right fix because it adds a simple-to-understand facility that may be independently useful, rather than making a regrettably-complex situation more complex still.

So how do quasiquoters handle shadowing? For instance, what should we do when we encounter the following:

f x = [some_quasi| ... (\x -> x)...|]

This should become

f x = ... (\x' -> x') ...

But I don't see how this would work if we used lookupName on all the occurrences of x in the quasiquote.

Put in a different way, what would happen if I wrote this:

f x = 
 $(do
     x' <- lookupName "x"
     return $ LamE [VarP x'] (VarE x')
  )

Quasiquotation, by definition, puts all the control in the hands of the author of the quasiquoter. All TH can hope to do is to make facilities available that let them do what they want. lookupName is such a facility. I can't answer your first example without knowing precisely what the quasiquoter is trying to do.

For your second example, you are doing a Bad Thing. It's precisely the same Bad Thing as this:

f x = return (LamE (VarP 'x) (VarE 'x))

Here the 'x is the Name of the x that is f's argument. You should not put such a Name in a binding position. (I forget whether TH will emit an error message if you do so, but it Jolly Well Should!)

Okay, consider the interpolated-string quasiquoter from before. Given:

f x = [qc|The value is { (x, let x = 3 in x) } |]

I would like to produce:

f x = let theString = "The value is "
      in theString ++ show (x, let x' = 3 in x')

It seems that I need to use lookupName for the x in the fst of the tuple (to make sure that it doesn't get captured by the theString binding, for example), and I need to use mkName for the occurrences of x in the snd of the tuple.

This is possible to do, certainly, but it means my quasiquoter has keep in mind the scoping rules of Haskell. Or is there a simpler way?

Yes, of course the quasiquoter needs to understand the scoping rules of the language it is parsing. In some weird language it might be that in (x, let x=3 in x) the first x is bound by the let x=3. It depends on the language!

Now, if you want to embed chunks of Haskell, you need a Haskell parser. I'm not sure what Kathleen uses, maybe haskell-src-exts or something. Then you need to convert that parser's output to TH syntax, and when you do that then, yes, you need to take care with scopes. That is how you express the semantics of the quasiquoted language, which just happens to be Haskell in this case.

Ok, makes sense. Sounds good.

Trac field	Value
Version	7.3
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Template Haskell
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture

Regression in newName

Child items ...

Activity

Examples

My suggestion