
Template Haskell is unhygienic. Can we change it?

This is just a discussion ticket for now; following through on it would need a GHC proposal.


Consider

{-# LANGUAGE TemplateHaskell #-}

module Lib where

f :: a -> b -> a
f y = \($([p| x |])) -> y

This compiles and type-checks. Now suppose I alpha-convert f, renaming y to x:

{-# LANGUAGE TemplateHaskell #-}

module Lib where

f :: a -> b -> a
f x = \($([p| x |])) -> x

... and get a type error, because the x pattern quote has captured x, which is of type b now.

This capturing of variables renders Template Haskell unhygienic. Macro hygiene is quite like referential transparency for macro systems/splices: If I can't rename y consistently to x without altering program semantics, then the macro system is not hygienic.

The Scheme, Rust and Lean communities have put enormous effort into their macro systems, and all of them are (mostly) hygienic.

Why is Template Haskell unhygienic? The reason is not that quotes may refer to names bound outside; it rather is that splices may produce binding constructs using names that capture local variables.

Lack of hygiene is problematic:

  1. In general, the splice $([p| x |]) might not capture x in such an obvious way through a pattern quote; we could have $(varP (mkName "x")) for example, where the x is just a String, one that could just as well be read from another file through IO.
  2. The result is completely unpredictable (i.e. dynamic) scoping behavior! We cannot do name resolution unless we run all the splices first. One consequence of this is the dreaded stage restriction: Since a custom splice function foo cannot be renamed until all the splices have been run in the module defining foo, we cannot use foo in the same module in which it is defined. Related: #21051.
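The first point can be reproduced in a single module. The following is a minimal sketch, assuming a GHC with TemplateHaskell enabled; the function name captured is hypothetical and only serves the illustration. The binder is built from a plain String, so nothing in the source text of the function reveals that the lambda rebinds x.

```haskell
{-# LANGUAGE TemplateHaskell #-}

module Main where

import Language.Haskell.TH (mkName, varP)

-- Hypothetical example function (not from the ticket).
-- The pattern splice produces a binder named "x" via mkName, which is
-- resolved by ordinary scoping rules at the splice site. The x in the
-- lambda body therefore refers to the spliced binder (the second
-- argument), not to the parameter x of captured.
captured :: Int -> Int -> Int
captured x = \($(varP (mkName "x"))) -> x

main :: IO ()
main = print (captured 1 2)
```

Renaming the parameter of captured to y changes which argument is returned, which is exactly the instability under alpha-conversion described above.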

So a net gain of this proposal would be that we would no longer need the stage restriction. (At least that is what I hope.)

This is what we need to fix in order to make TH hygienic: Whenever a binding construct is spliced in, we must rename its binders to fresh names, and do the same to their use sites. For example, f x = $([| \x -> x |]) must expand to f x1 = \x2 -> x2, whereas f x = \($([p| x |])) -> x must expand to f x1 = \x2 -> x1. (All this must be stable under splice evaluation, of course.)
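Within a single splice, newName already gives a flavour of this renaming: a binder created with newName cannot be captured by a later binder of the same textual name. The sketch below follows the example in the template-haskell documentation of newName; keepFirst, nm1 and nm2 are hypothetical names for illustration, and this is not the mechanism the proposal would use, only an analogy for "fresh binder plus consistently renamed use site".

```haskell
{-# LANGUAGE TemplateHaskell #-}

module Main where

import Language.Haskell.TH (lamE, mkName, newName, varE, varP)

-- Hypothetical example function (not from the ticket).
-- Both binders are spelled "x", but nm1 is fresh (newName) while nm2 is
-- a capturable name (mkName). The occurrence varE nm1 keeps pointing at
-- the outer binder, so the generated code behaves like \x0 -> \x -> x0.
keepFirst :: Int -> Int -> Int
keepFirst =
  $( do nm1 <- newName "x"
        let nm2 = mkName "x"
        lamE [varP nm1] (lamE [varP nm2] (varE nm1)) )

main :: IO ()
main = print (keepFirst 42 23)
```

Note that the converse does not hold today: a newName binder can still capture an occurrence built with mkName, so this only covers one direction of the renaming sketched above.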

Understanding TH hygiene in terms of Scheme macro hygiene

To better understand what hygiene implies for Template Haskell, it is important to understand how TH quotes and splices relate to macros in Scheme.

Scheme macros

Scheme offers the ability to define hygienic macros with the help of define-syntax-rule:

(define-syntax-rule (my-lambda x e)
  (lambda (x) e))

(define y 23)
((my-lambda x (* 2 x)) 42) ; 84
((my-lambda x (* 2 y)) 42) ; 46

Such a macro definition implicitly desugars to a function taking x and e as quotes and returning a quote that "splices in" x and e (the Lisp community says "expands"; "splice" refers to expanding lists of quotes); something like

(define (my-lambda-impl x e)
  `(lambda (,x) ,e)) ; `_ is a quasiquote like [|_|]; ,_ expands inside a quasiquote, like $_

(define y 23)
((eval (my-lambda-impl `x `(* 2 x))) 42) ; 84
((eval (my-lambda-impl `x `(* 2 y))) 42) ; 46

Correspondence to TH splices

We can define the analog of my-lambda-impl (a "splice function" in TH lingo; one might perhaps call it a syntax transformer), but TH currently does not offer an analog of the syntactic macro sugar my-lambda:

myLambdaImpl :: Q Pat -> Q Exp -> Q Exp
myLambdaImpl p e = [| \($p) -> $e |]

y = 23
main = do
  print $ $(myLambdaImpl [p| x |] [| 2 * x |]) 42 -- 84
  print $ $(myLambdaImpl [p| x |] [| 2 * y |]) 42 -- 46

(This doesn't compile as is because of the stage restriction; put myLambdaImpl in a different module.)

Whether or not we should offer syntactic sugar so that we can write just myLambda x (2 * y) is an interesting question best discussed in another proposal. The point here is that this connection allows us to understand macro hygiene in Scheme in terms of Template Haskell. The above examples are simple enough. Furthermore, this is still Haskell, which means we may inline the definition of myLambdaImpl into its use site:

main = print $ $([| \($([p| x |])) -> $([| 2 * x |]) |]) 42 -- 84

and this continues to work without observable change. This is an important property that we should strive to preserve! Of course, the "reverse" is also true; any "macro" that evaluates to some splice expression must behave identically.

The last point brings us to the original puzzle: In Scheme, it is possible to write the following macro:

(define-syntax-rule (const e)
  (lambda (x) e))

(define x 42)
((const x) 23) ; 42

Hygiene demands that the use of const here must return 42, not 23, because the use site must be stable under α-renaming (that's the definition of hygiene). Yet, the naïve macro expansion would look like ((lambda (x) x) 23), thus shadowing the definition of x and returning 23.

The corresponding TH splice (i.e. the result after inlining code) is:

x = 42
main = print $ $([| \x -> $([| x |]) |]) 23 -- 23????

Alas, as you can see, the generated code currently evaluates to 23 instead of 42. This is the hygiene bug that is worth fixing!

It is also the kind of intrusion into scope-checking that makes the stage restriction necessary.

Edited by Sebastian Graf