Proposal: Relaxed Dependency Analysis

Ticket	#65
Dependencies	none
Related	#103: MonomorphicPatternBindings
	#80: MonomorphismRestriction

Compiler support

GHC	full
nhc98	full
Hugs	full
UHC	full
JHC	full
LHC	full

Summary

Haskell 98 specifies that type inference be performed in dependency order to increase polymorphism. However, most Haskell implementations use a more liberal rule (proposed by Mark Jones).

Description

In Haskell 98, a group of bindings is sorted into strongly-connected components, and then type-checked in dependency order (H98 s4.5.1). As each dependency group is type-checked, any binders of the group that have an explicit type signature are put in the type environment with the specified polymorphic type, and all others are monomorphic until the group is generalized (H98 s4.5.2).

Consider

data BalancedTree a = Zero a | Succ (BalancedTree (a,a))

zig :: BalancedTree a -> a
zig (Zero a) = a
zig (Succ t) = fst (zag t)

zag (Zero a) = a
zag (Succ t) = snd (zig t)

As with many operations on non-regular (or nested) types, zig and zag need to be polymorphic in the element type. In Haskell 98, the bindings of the two functions are interdependent, and thus constitute a single binding group. When type inference is performed on this group, zig may be used at different types, because it has a user-supplied polymorphic signature. However, zag may not, and the example is rejected, unless we add an explicit type signature for zag.

Mark Jones suggested that the dependency analysis should ignore references to variables that have an explicit type signature, and most compilers already implement this. Under this rule zag no longer depends on zig (although zig still depends on zag), so zag forms a dependency group of its own and we can infer the type

zag :: BalancedTree a -> a

and then go on to successfully check the type signature of zig.

With this rule, dependency groups are smaller and more programs type-check.
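As a quick check of the example above, the following is a minimal sketch of a complete program, assuming a compiler that implements the relaxed rule (GHC does, according to the support table). The value sample and the main function are illustrative additions and are not part of the proposal.

```haskell
data BalancedTree a = Zero a | Succ (BalancedTree (a, a))

zig :: BalancedTree a -> a
zig (Zero a) = a
zig (Succ t) = fst (zag t)

-- No signature: under the relaxed rule zag is its own dependency group,
-- so its type BalancedTree a -> a is inferred and generalized first.
zag (Zero a) = a
zag (Succ t) = snd (zig t)

-- Illustrative value (an assumption, not from the proposal): a tree of depth two.
sample :: BalancedTree Int
sample = Succ (Succ (Zero ((1, 2), (3, 4))))

main :: IO ()
main = do
  print (zig sample)  -- prints 3
  print (zag sample)  -- prints 2
```

Under the Haskell 98 rule the same program is rejected unless zag is also given the signature zag :: BalancedTree a -> a.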

References

Dependency Analysis in the Haskell 98 Report
Typing Haskell in Haskell, Mark Jones, Haskell Workshop 1999.

Report Delta

Replace the body of section 4.5.1 Dependency analysis:

In general the static semantics are given by the normal Hindley-Milner inference rules. A dependency analysis transformation is first performed to increase polymorphism. Two variables bound by value declarations are in the same declaration group if either

they are bound by the same pattern binding, or

their bindings are mutually recursive (perhaps via some other declarations that are also part of the group).

Application of the following rules causes each let or where construct (including the where defining the top level bindings in a module) to bind only the variables of a single declaration group, thus capturing the required dependency analysis: (A similar transformation is described in Peyton Jones' book [10].)

The order of declarations in where/let constructs is irrelevant.

let {d₁; d₂} in e = let {d₁} in (let {d₂} in e)

(when no identifier bound in d₂ appears free in d₁)

with:

In general the static semantics are given by applying the normal Hindley-Milner inference rules. In order to increase polymorphism, these rules are applied to groups of bindings identified by a dependency analysis.

A binding b₁ depends on a binding b₂ in the same list of declarations if either

b₁ contains a free identifier that has no type signature and is bound by b₂, or

b₁ depends on a binding that depends on b₂.

A declaration group is a minimal set of mutually dependent bindings. Hindley-Milner type inference is applied to each declaration group in dependency order. The order of declarations in where/let constructs is irrelevant.

Notes:

The new text also tightens up the original wording, which did not say that the declarations had to be in the same list, and which defined declaration group but not dependency.
Defining dependencies between bindings is a little simpler than defining dependencies between variables.
The dependency analysis transformation formerly listed in this section is no longer always possible: in the example above, zag forms its own declaration group yet still mentions zig, so the two bindings cannot be split into nested lets.
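The revised definition of dependency can be sketched as a strongly-connected-components computation. This is only an illustration, not how any particular compiler implements it: the Binding record, the declGroups function and its Bool switch are hypothetical names, and the sketch assumes the containers package (for Data.Graph), which ships with GHC.

```haskell
import Data.Graph (flattenSCC, stronglyConnComp)
import Data.List (sort)

-- Hypothetical, simplified view of one binding in a declaration list.
data Binding = Binding
  { boundVars :: [String]  -- variables bound by this binding
  , freeVars  :: [String]  -- free variables of its right-hand side
  }

-- Declaration groups in dependency order (dependencies first).
-- relaxed = True ignores references to variables listed in sigs,
-- i.e. variables that have an explicit type signature.
declGroups :: Bool -> [String] -> [Binding] -> [[String]]
declGroups relaxed sigs bs =
  [ sort (concatMap boundVars (flattenSCC scc)) | scc <- stronglyConnComp nodes ]
  where
    indexed  = zip [0 :: Int ..] bs
    -- indices of the bindings in this list that bind variable v
    owner v  = [ i | (i, b) <- indexed, v `elem` boundVars b ]
    -- does a reference to v create a dependency?
    counts v = not (relaxed && v `elem` sigs)
    nodes    = [ (b, i, concat [ owner v | v <- freeVars b, counts v ])
               | (i, b) <- indexed ]

-- The zig/zag example: only zig has a type signature.
zigZag :: [Binding]
zigZag = [ Binding ["zig"] ["fst", "zag"], Binding ["zag"] ["snd", "zig"] ]

main :: IO ()
main = do
  print (declGroups False ["zig"] zigZag)  -- Haskell 98 rule: [["zag","zig"]]
  print (declGroups True  ["zig"] zigZag)  -- relaxed rule:    [["zag"],["zig"]]
```

Using bindings rather than variables as graph nodes mirrors the note above: a pattern binding that binds several variables is a single node, so its variables always land in the same group.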

Replace the first paragraph of section 4.5.2 Generalization:

The Hindley-Milner type system assigns types to a let-expression in two stages. First, the right-hand side of the declaration is typed, giving a type with no universal quantification. Second, all type variables that occur in this type are universally quantified unless they are associated with bound variables in the type environment; this is called generalization. Finally, the body of the let-expression is typed.

with:

The Hindley-Milner type system assigns types to a let-expression in two stages:

The declaration groups are considered in dependency order. For each group, a type with no universal quantification is inferred for each variable bound in the group. Then, all type variables that occur in these types are universally quantified unless they are associated with bound variables in the type environment; this is called generalization.

Finally, the body of the let-expression is typed.
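A small let-expression may help to illustrate how groups are typed and generalized in dependency order. The names example, f and g are illustrative only and do not come from the Report.

```haskell
example :: (Bool, Char)
example =
  let g = (f True, f 'a')  -- uses f at type Bool and at type Char
      f x = x              -- separate declaration group, typed and generalized first
  in g

main :: IO ()
main = print example  -- prints (True,'a')
```

Although g is written first, it depends on f, so the group containing f is considered first and f is generalized to a polymorphic type before g is typed. If f and g were in the same declaration group, f would be monomorphic while g was being checked and the two uses at different types would be rejected.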