exps.verb 58.5 KB
Newer Older
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1
%
Simon Peyton Jones's avatar
Simon Peyton Jones committed
2
% $Header: /home/cvs/root/haskell-report/report/exps.verb,v 1.20 2003/01/13 13:08:55 simonpj Exp $
Simon Peyton Jones's avatar
Simon Peyton Jones committed
3
4
5
6
7
8
9
%
%*section 3
%**<title>The Haskell 98 Report: Expressions</title>
%**~header
\section{Expressions}\index{expression}
\label{expressions}

Simon Peyton Jones's avatar
Simon Peyton Jones committed
10
In this chapter, we describe the syntax and informal semantics of
Simon Peyton Jones's avatar
Simon Peyton Jones committed
11
\Haskell{} {\em expressions}, including their translations into the
Simon Peyton Jones's avatar
Simon Peyton Jones committed
12
13
\Haskell{} kernel, where appropriate.  Except in the case of @let@
expressions, these translations preserve both the static and dynamic
Simon Peyton Jones's avatar
Simon Peyton Jones committed
14
15
16
17
18
19
20
semantics.  Free variables and constructors used in these translations
always refer to entities defined by the @Prelude@.  For example,
``@concatMap@'' used in the translation of list comprehensions
(Section~\ref{list-comprehensions}) means the @concatMap@ defined by
the @Prelude@, regardless of whether or not the identifier ``@concatMap@'' is in
scope where the list comprehension is used, and (if it is in scope)
what it is bound to.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
21

Simon Peyton Jones's avatar
Simon Peyton Jones committed
22
@@@
Simon Marlow's avatar
Simon Marlow committed
23
24
exp     ->  \hprime{infixexp @::@ [context @=>@] type}   & (\tr{expression type signature})
        |   \hprime{infixexp}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
25

Simon Marlow's avatar
Simon Marlow committed
26
27
28
29
30
\hprime{infixexp} -> \hprime{lexp qop infixexp} & (\tr{infix operator application})
            | \hprime{@-@ infixexp}             & (\tr{prefix negation})
            | \hprime{lexp}

\hprime{lexp}    ->  @\@ apat_1 ... apat_n @->@ exp	& (\tr{lambda abstraction}, n>=1)
Simon Peyton Jones's avatar
Simon Peyton Jones committed
31
	|   @let@ decls @in@ exp	        & ({\tr{let expression}})
Simon Marlow's avatar
Simon Marlow committed
32
	|   @if@ exp \hprime{[@;@]} @then@ exp \hprime{[@;@]} @else@ exp & (\tr{conditional})
Simon Peyton Jones's avatar
Simon Peyton Jones committed
33
34
35
36
37
38
39
40
41
42
43
44
45
	|   @case@ exp @of@ @{@ alts  @}@	& (\tr{case expression})
	|   @do@ @{@ stmts @}@        	        & (\tr{do expression})
	|   fexp
fexp	->  [fexp] aexp				& (\tr{function application})

aexp 	->  qvar				& (\tr{variable})
	|   gcon				& (\tr{general constructor})
	|   literal				
	|   @(@ exp @)@			      & (\tr{parenthesized expression})
	|   @(@ exp_1 @,@ ... @,@ exp_k @)@	& (\tr{tuple}, k>=2)
	|   @[@ exp_1 @,@ ... @,@ exp_k @]@	& (\tr{list}, k>=1)
	|   @[@ exp_1 [@,@ exp_2] @..@ [exp_3] @]@ & (\tr{arithmetic sequence})
	|   @[@ exp @|@ qual_1 @,@ ... @,@ qual_n @]@	& (\tr{list comprehension}, n>=1)
Simon Marlow's avatar
Simon Marlow committed
46
47
	|   @(@ \hprime{infixexp qop} @)@        & (\tr{left section})
        |   @(@ \hprime{qop_{\langle@-@\rangle} infixexp} @)@        & (\tr{right section})
Simon Peyton Jones's avatar
Simon Peyton Jones committed
48
        |   qcon @{@ fbind_1 @,@ ... @,@ fbind_n @}@ & (\tr{labeled construction}, n>=0)
Simon Peyton Jones's avatar
Simon Peyton Jones committed
49
        |   aexp_{\langle{}qcon\rangle{}} @{@ fbind_1 @,@ ... @,@ fbind_n @}@ & (\tr{labeled update}, n >= 1)
Simon Peyton Jones's avatar
Simon Peyton Jones committed
50
51
52

@@@
\indexsyn{exp}%
Simon Marlow's avatar
Simon Marlow committed
53
54
\indexsyn{infixexp}%
\indexsyn{lexp}%
Simon Peyton Jones's avatar
Simon Peyton Jones committed
55
56
57
\indexsyn{aexp}%
\indexsyn{fexp}%

Simon Peyton Jones's avatar
Simon Peyton Jones committed
58
%	Removed Aug 2001: more misleading than helpful. SLPJ
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
% As an aid to understanding this grammar,
% Table~\ref{syntax-precedences} shows the relative precedence of
% expressions, patterns and definitions, plus an extended associativity.
% "-" indicates that the item is non-associative.
% 
% \begin{table}[tb]
% \[
% \centerline{
% \begin{tabular}{|l|c|}\hline
% Item						&	Associativity \\ 
% \hline
% 						&			\\
% simple terms, parenthesized terms               &	--	       	\\
% irrefutable patterns (@~@)                      &       --		\\
% as-patterns ({\tt @@})                          &       right		\\
% function application				&	left		\\
% @do@, @if@, @let@, lambda(@\@), @case@ (leftwards)&	right		\\
% @case@ (rightwards)				&	right		\\
% 						&			\\
% infix operators, prec. 9			&	as defined	\\
% \ldots						&	\ldots		\\
% infix operators, prec. 0			&	as defined	\\
% 						&			\\
% function types (@->@)				&	right		\\
% contexts (@=>@)					&	--		\\
% type constraints (@::@)				&	--		\\
% @do@, @if@, @let@, lambda(@\@) (rightwards)	&	right		\\
% sequences (@..@)				&	--		\\
% generators (@<-@)				&	--		\\
% grouping (@,@)					&	n-ary		\\
% guards (@|@)					&	--		\\
% case alternatives (@->@)			&	--		\\
% definitions (@=@)				&	--		\\
% separation (@;@)				&	n-ary		\\ 
% \hline
% \end{tabular}
% }
% \]
% %**<div align=center> <h4>Table 1</h4> </div>
% \ecaption{Precedence of expressions, patterns, definitions (highest to lowest)}
% \label{syntax-precedences}
% \end{table}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
101
102
103
104
105

Expressions involving infix operators are disambiguated by the
operator's fixity (see Section~\ref{fixity}).  Consecutive
unparenthesized operators with the same precedence must both be either
left or right associative to avoid a syntax error.
Simon Marlow's avatar
Simon Marlow committed
106
107
108
109
110
111
112
113
Given an unparenthesized expression ``"x qop^{(a,i)} y qop^{(b,j)} z"'' 
\hprime{(where "qop^{(a,i)}" means an operator with associativity "a" and
precedence "i")}, parentheses must be added around either ``"x
qop^{(a,i)} y"'' or ``"y qop^{(b,j)} z"'' when "i=j" unless "a=b={\rm
  l}" or "a=b={\rm r}".

An example algorithm for resolving expressions involving infix
operators is given in Section~\ref{fixity-resolution}.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
114
115
116

Negation\index{negation} is the only prefix operator in
\Haskell{}; it has the same precedence as the infix @-@ operator
117
118
119
120
121
122
defined in the Prelude (see Section~\ref{fixity}, Figure~\ref{prelude-fixities}).

The grammar is ambiguous regarding the extent of lambda abstractions,
let expressions, and conditionals.  The ambiguity is resolved by the
meta-rule that each of these constructs extends as far to the right as
possible.  
Simon Peyton Jones's avatar
Simon Peyton Jones committed
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143

% 	I can't make head or tail of this para, so
%	I'm just deleting it.  SLPJ Dec 98
% The separation of function arrows from case alternatives solves
% the ambiguity that otherwise arises when an unparenthesized
% function type is used in an expression, such as the guard in a case
% expression.

Sample parses are shown below.
\[\bt{|l|l|}%%b
\hline
This                           	    & Parses as                             \\
\hline
@f x + g y@                         & @(f x) + (g y)@                       \\
@- f x + y@		            & @(- (f x)) + y@		            \\
@let { ... } in x + y@		    & @let { ... } in (x + y)@		    \\
@z + let { ... } in x + y@	    & @z + (let { ... } in (x + y))@	    \\
@f x y :: Int@                      & @(f x y) :: Int@                      \\
@\ x -> a+b :: Int@                 & @\ x -> ((a+b) :: Int@)               \\
\hline\et\]

Simon Marlow's avatar
Simon Marlow committed
144
145
146
For the sake of clarity, the rest of this section will assume that
expressions involving infix operators have been resolved according to
the fixities of the operators.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
147
148
149

\subsection{Errors}
\label{basic-errors}\index{error}
150
Errors during expression evaluation, denoted by "\bot"\index{"\bot"} (``bottom''\index{bottom}),
151
are indistinguishable by a Haskell program from non-termination.  Since \Haskell{} is a
152
non-strict language, all \Haskell{} types include "\bot".  That is, a value
Simon Peyton Jones's avatar
Simon Peyton Jones committed
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
of any type may be bound to a computation that, when demanded, results
in an error.  When evaluated, errors cause immediate program
termination and cannot be caught by the user.  The Prelude provides
two functions to directly 
cause such errors:
\bprog
@
error     :: String -> a
undefined :: a
@
\eprog
\indextt{error}
\indextt{undefined}
A call to @error@ terminates execution of
the program and returns an appropriate error indication to the
operating system.  It should also display the string in some
system-dependent manner.  When @undefined@ is used, the error message
is created by the compiler.

Translations of \Haskell{} expressions use @error@ and @undefined@ to
explicitly indicate where execution time errors may occur.  The actual
program behavior when an error occurs is up to the implementation.
The messages passed to the @error@ function in these translations are
only suggestions; implementations may choose to display more or less
information when an error occurs.

\subsection{Variables, Constructors, Operators, and Literals}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
180
\label{vars-and-lits}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
%
@@@
aexp 	->  qvar				& (\tr{variable})
	|   gcon				& (\tr{general constructor})
	|   literal				
@@@
\indexsyn{var}%
\indexsyn{con}%
\indexsyn{varop}%
\indexsyn{conop}%
\indexsyn{op}%
@@@
gcon    ->  @()@
        |   @[]@
        |   @(,@\{@,@\}@)@
        |   qcon

var	->  varid | @(@ varsym @)@		& (\tr{variable})
qvar	->  qvarid | @(@ qvarsym @)@		& (\tr{qualified variable})
con	->  conid | @(@ consym @)@		& (\tr{constructor})
qcon	->  qconid | @(@ gconsym @)@		& (\tr{qualified constructor})
varop	->  varsym | \bkqB varid \bkqA		& (\tr{variable operator})
qvarop	->  qvarsym | \bkqB qvarid \bkqA	& (\tr{qualified variable operator})
conop	->  consym | \bkqB conid \bkqA		& (\tr{constructor operator})
qconop	->  gconsym | \bkqB qconid \bkqA	& (\tr{qualified constructor operator})
op	->  varop | conop 			& (\tr{operator})
qop     ->  qvarop | qconop			& (\tr{qualified operator})
gconsym ->  @:@ | qconsym
@@@
\indexsyn{gcon}%
\indexsyn{var}%
\indexsyn{qvar}%
\indexsyn{con}%
\indexsyn{qcon}%
\indexsyn{varop}%
\indexsyn{qvarop}%
\indexsyn{conop}%
\indexsyn{qconop}%
\indexsyn{qop}%
\indexsyn{gconsym}%

Simon Peyton Jones's avatar
Simon Peyton Jones committed
222
\Haskell{} provides special syntax to support infix notation.
Simon Peyton Jones's avatar
Simon    
Simon Peyton Jones committed
223
An {\em operator} is a function that can be applied using infix 
Simon Peyton Jones's avatar
Simon Peyton Jones committed
224
syntax (Section~\ref{operators}), or partially applied using a
Simon Peyton Jones's avatar
Simon    
Simon Peyton Jones committed
225
{\em section} (Section~\ref{sections}).
Simon Peyton Jones's avatar
Simon Peyton Jones committed
226

Simon Peyton Jones's avatar
Simon    
Simon Peyton Jones committed
227
An {\em operator} is either an {\em operator symbol}, such as @+@ or @$$@,
Simon Peyton Jones's avatar
Simon Peyton Jones committed
228
229
230
231
or is an ordinary identifier enclosed in grave accents (backquotes), such
as \bkqB@op@\bkqA.  For example, instead of writing the prefix application
@op x y@, one can write the infix application \mbox{@x@ \bkqB@op@\bkqA@ y@}.
If no fixity\index{fixity}
Simon Peyton Jones's avatar
Simon    
Simon Peyton Jones committed
232
declaration is given for \bkqB@op@\bkqA{} then it defaults
Simon Peyton Jones's avatar
Simon Peyton Jones committed
233
234
235
to highest precedence and left associativity
(see Section~\ref{fixity}).

Simon Peyton Jones's avatar
Simon Peyton Jones committed
236
237
238
Dually, an operator symbol can be converted to an ordinary identifier
by enclosing it in parentheses.  For example, @(+) x y@ is equivalent
to @x + y@, and @foldr (*) 1 xs@ is equivalent to @foldr (\x y -> x*y) 1 xs@.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
239

Simon Peyton Jones's avatar
Simon Peyton Jones committed
240
241
242
243
244
245
246
247
248
249
250
%	This para is covered by Section 2.4 and 5.5.1
% A qualified name may only be used to refer to a variable or
% constructor imported from another module (see Section~\ref{import}), or
% defined at the top level, 
% but not in the definition of a new variable or constructor.  Thus
% \bprog
% let F.x = 1 in F.x   -- invalid
% \eprog
% incorrectly uses a qualifier in the definition of @x@, regardless of
% the module containing this definition.  Qualification does not affect
% the nature of an operator: @F.+@ is an infix operator just as @+@ is.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288

Special syntax is used to name some constructors for some of the
built-in types, as found
in the production for "gcon" and "literal".  These are described
in Section~\ref{basic-types}.

\index{number!translation of literals}
An integer literal represents the
application of the function @fromInteger@\indextt{fromInteger} to the
appropriate value of type 
@Integer@.  Similarly, a floating point literal stands for an application of
@fromRational@\indextt{fromRational} to a value of type @Rational@ (that is, 
@Ratio Integer@).

\outline{
\paragraph*{Translation:}
The integer literal "i" is equivalent to @fromInteger@ "i",
where @fromInteger@ is a method in class @Num@ (see Section
\ref{numeric-literals}).\indexclass{Num}


The floating point literal "f" is equivalent to @fromRational@
("n" @Ratio.%@ "d"), where @fromRational@ is a method in class @Fractional@
and @Ratio.%@ constructs a rational from two integers, as defined in
the @Ratio@ library.\indexclass{Fractional}
The integers "n" and "d" are chosen so that "n/d = f".
}


\subsection{Curried Applications and Lambda Abstractions}
\label{applications}
\label{lambda-abstractions}
\index{lambda abstraction}
\index{application}
%\index{function application|see{application}}
%
@@@
fexp	->  [fexp] aexp				& (\tr{function application})
289
lexp	->  @\@ apat_1 ... apat_n @->@ exp	& (\tr{lambda abstraction}, n>=1)
Simon Peyton Jones's avatar
Simon Peyton Jones committed
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
@@@
\indexsyn{exp}%
\indexsyn{fexp}%

\noindent
{\em Function application}\index{application} is written 
"e_1 e_2".  Application associates to the left, so the
parentheses may be omitted in @(f x) y@.  Because "e_1" could
be a data constructor, partial applications of data constructors are
allowed. 

{\em Lambda abstractions} are written 
"@\@ p_1 ... p_n @->@ e", where the "p_i" are {\em patterns}.
An expression such as @\x:xs->x@ is syntactically incorrect;
it may legally be written as @\(x:xs)->x@.

The set of patterns must be {\em linear}\index{linearity}%
\index{linear pattern}---no variable may appear more than once in the set.

Simon Marlow's avatar
Simon Marlow committed
309
\outline{\ifhtml{}{\small}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
\paragraph*{Translation:}
The following identity holds:
\begin{center}
\bt{lcl}%
\struthack{17pt}%
"@\@ p_1 ... p_n @->@ e"
	 & "=" &
	"@\@ x_1 ... x_n @-> case (@x_1@,@ ...@,@ x_n@) of (@p_1@,@ ...@,@ p_n@) ->@ e"
\et
\end{center}
where the "x_i" are new identifiers.
}
Given this translation combined with the semantics of case
expressions and pattern matching described in
Section~\ref{case-semantics}, if the
pattern fails to match, then the result is "\bot".

             
\subsection{Operator Applications}
\index{operator application}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
330
%\index{operator application|hseealso{application}}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
331
332
333
\label{operators}
%
@@@
334
335
336
\hprime{infixexp} ->  \hprime{lexp qop infixexp}
	|   @-@ \hprime{infixexp}		& (\tr{prefix negation})
        |   \hprime{lexp}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
qop     ->  qvarop | qconop			& (\tr{qualified operator})
@@@
\indexsyn{exp}%
\indexsyn{qop}%

\noindent
The form "e_1 qop e_2" is the infix application of binary
operator\index{operator} "qop" to expressions "e_1" and "e_2".  

The special
form "@-@e" denotes prefix negation\index{negation}, the only
prefix operator in \Haskell{}, and is 
syntax for "@negate @(e)".\indextt{negate}  The binary @-@ operator
does not necessarily refer 
to the definition of @-@ in the Prelude; it may be rebound 
by the module system.  However, unary @-@ will always refer to the
@negate@ function defined in the Prelude.  There is no link between
the local meaning of the @-@ operator and unary negation.

Prefix negation has the same precedence as the infix operator @-@
defined in the Prelude (see
358
Table~\ref{prelude-fixities}).  Because @e1-e2@ parses as an
Simon Peyton Jones's avatar
Simon Peyton Jones committed
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
infix application of the binary operator @-@, one must write @e1(-e2)@ for
the alternative parsing.  Similarly, @(-)@ is syntax for 
@(\ x y -> x-y)@, as with any infix operator, and does not denote 
@(\ x -> -x)@---one must use @negate@ for that.

\outline{
\paragraph*{Translation:}
The following identities hold:
\begin{center}
\bt{lcl}%
\struthack{17pt}%
"e_1 op e_2" & "=" & "@(@op@)@ e_1 e_2" \\
"@-@e" & "=" & "@negate@ (e)"
\et
\end{center}
}

\subsection{Sections}
\index{section}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
378
%\index{section|hseealso{operator application}}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
379
380
381
\label{sections}
%
@@@
382
aexp 	->   @(@ \hprime{infixexp qop} @)@        & (\tr{left section})
Simon Marlow's avatar
Simon Marlow committed
383
        |   @(@ \hprime{qop_{\langle@-@\rangle} infixexp} @)@        & (\tr{right section})
Simon Peyton Jones's avatar
Simon Peyton Jones committed
384
@@@
Simon Marlow's avatar
Simon Marlow committed
385
386
\indexsyn{infixexp}%
\indexsyn{qop}%
Simon Peyton Jones's avatar
Simon Peyton Jones committed
387
388
389
390
391
392
393
394
395
396
397
398
399

\noindent
{\em Sections} are written as "@(@ op e @)@" or "@(@ e op @)@", where
"op" is a binary operator and "e" is an expression.  Sections are a
convenient syntax for partial application of binary operators.

Syntactic precedence rules apply to sections as follows.
"@(@op~e@)@" is legal if and only if "@(x@~op~e@)@" parses 
in the same way as "@(x@~op~@(@e@))@";
and similarly for  "@(@e~op@)@".
For example, @(*a+b)@ is syntactically invalid, but @(+a*b)@ and
@(*(a+b))@ are valid.  Because @(+)@ is left associative, @(a+b+)@ is syntactically correct,
but @(+a+b)@ is not; the latter may legally be written as @(+(a+b))@.
400
401
402
As another example, the expression
\bprog
@
Simon Peyton Jones's avatar
Simon Peyton Jones committed
403
  (let n = 10 in n +)
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
@
\eprog
is invalid because, by the let/lambda meta-rule (Section~\ref{expressions}),
the expression
\bprog
@
  (let n = 10 in n + x)
@
\eprog
parses as
\bprog
@
  (let n = 10 in (n + x))
@
\eprog
rather than
\bprog
@
  ((let n = 10 in n) + x)
@
\eprog
% This observation makes it easier to implement the let/lambda meta-rule
% (Section~\ref{expressions}) because once the operator has been seen it is clear that any
% legal parse must include the operator in the body of the @let@.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
428
429
430
431
432
433
434
435
436

Because @-@ is treated specially in the grammar,
"@(-@ exp@)@" is not a section, but an application of prefix
negation,\index{negation} as
described in the preceding section.  However, there is a @subtract@
function defined in the Prelude such that
"@(subtract@ exp@)@" is equivalent to the disallowed section.
The expression "@(+ (-@ exp@))@" can serve the same purpose.

437

Simon Peyton Jones's avatar
Simon Peyton Jones committed
438
% Changed to allow postfix operators.  That is, in (op x), we no
Simon Peyton Jones's avatar
Simon    
Simon Peyton Jones committed
439
% longer add a \x -> which would require op to be binary instead
Simon Peyton Jones's avatar
Simon Peyton Jones committed
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
% of unary.

\outline{
\paragraph*{Translation:}
The following identities hold:
\begin{center}
\bt{lcl}%
\struthack{17pt}%
"@(@op e@)@" & "=" & "@\@ x @->@ x op e" \\
"@(@e op@)@" & "=" & "@\@ x @->@ e op x"
\et
\end{center}
where "op" is a binary operator, "e" is an expression, and "x" is a variable
that does not occur free in "e".
}

\subsection{Conditionals}
\label{conditionals}\index{conditional expression}
%
@@@
460
lexp ->  @if@ exp \hprime{[@;@]} @then@ exp \hprime{[@;@]} @else@ exp
Simon Peyton Jones's avatar
Simon Peyton Jones committed
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
@@@
\indexsyn{exp}%

%\indextt{if ... then ... else ...}
A {\em conditional expression}
\index{conditional expression}
has the form 
"@if@ e_1 @then@ e_2 @else@ e_3" and returns the value of "e_2" if the
value of "e_1" is @True@, "e_3" if "e_1" is @False@, and "\bot"
otherwise.

\outline{
\paragraph*{Translation:}
The following identity holds:
\begin{center}
\bt{lcl}%
\struthack{17pt}%
"@if@ e_1 @then@ e_2 @else@ e_3"  & "=" & "@case@ e_1 @of { True ->@ e_2 @; False ->@ e_3 @}@"
\et
\end{center}
where @True@ and @False@ are the two nullary constructors from the 
type @Bool@, as defined in the Prelude.  The type of "e_1" must be @Bool@;
"e_2" and "e_3" must have the same type, which is also the type of the
entire conditional expression.
}

\subsection{Lists}
\label{lists}
%
@@@
491
infixexp ->  exp_1 qop exp_2
Simon Peyton Jones's avatar
Simon Peyton Jones committed
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
aexp	->  @[@ exp_1 @,@ ... @,@ exp_k @]@	& (k>=1)
	|   gcon
gcon	-> @[]@
	| qcon
qcon	-> @(@ gconsym @)@
qop     -> qconop
qconop  -> gconsym
gconsym -> @:@
@@@
\indexsyn{aexp}%

{\em Lists}\index{list} are written "@[@e_1@,@ ...@,@ e_k@]@", where
"k>=1".  The list constructor is @:@, and the empty list is denoted @[]@.
Standard operations on
lists are given in the Prelude (see Section~\ref{basic-lists}, and
Simon Peyton Jones's avatar
Simon Peyton Jones committed
507
Chapter~\ref{stdprelude} notably Section~\ref{preludelist}).
Simon Peyton Jones's avatar
Simon Peyton Jones committed
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524

\outline{
\paragraph*{Translation:}  
The following identity holds:
\begin{center}
\bt{lcl}%
\struthack{17pt}%
"@[@e_1@,@ ...@,@ e_k@]@"  & "=" & "e_1 @: (@e_2 @: (@ ... @(@e_k @: [])))@"
\et
\end{center}
where @:@ and @[]@ are constructors for lists, as defined in
the Prelude (see Section~\ref{basic-lists}).  The types
of "e_1" through "e_k" must all be the same (call it "t\/"), and the
type of the overall expression is @[@"t"@]@ (see Section~\ref{type-syntax}).
}
The constructor ``@:@'' is reserved solely for list construction; like
@[]@, it is considered part of the language syntax, and cannot be hidden or redefined.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
525
It is a right-associative operator, with precedence level 5 (Section~\ref{fixity}).
Simon Peyton Jones's avatar
Simon Peyton Jones committed
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542

\subsection{Tuples}
\label{tuples}
%
@@@
aexp	->  @(@ exp_1 @,@ ... @,@ exp_k @)@	& (k>=2)
	| qcon
qcon -> @(,@\{@,@\}@)@

@@@
\indexsyn{aexp}%

{\em Tuples}\index{tuple} are written "@(@e_1@,@ ...@,@ e_k@)@", and may be
of arbitrary length "k>=2".  The constructor for an "n"-tuple is denoted by
@(,@\ldots@,)@, where there are "n-1" commas.  Thus @(a,b,c)@ and
@(,,) a b c@ denote the same value.
Standard operations on tuples are given
Simon Peyton Jones's avatar
Simon Peyton Jones committed
543
in the Prelude (see Section~\ref{basic-tuples} and Chapter~\ref{stdprelude}).
Simon Peyton Jones's avatar
Simon Peyton Jones committed
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568

\outline{
\paragraph*{Translation:}  
"@(@e_1@,@ ...@,@ e_k@)@" for "k\geq2" is an instance of a "k"-tuple as
defined in the Prelude, and requires no translation.  If
"t_1" through "t_k" are the types of "e_1" through "e_k",
respectively, then the type of the resulting tuple is 
"@(@t_1@,@ ...@,@ t_k@)@" (see Section~\ref{type-syntax}).
}

\subsection{Unit Expressions and Parenthesized Expressions}
\label{unit-expression}
\index{unit expression}
%
@@@
aexp	->  gcon
        |   @(@ exp @)@
gcon	-> @()@
@@@
\indexsyn{aexp}%

\noindent
The form "@(@e@)@" is simply a {\em parenthesized expression}, and is
equivalent to "e".  The {\em unit expression} @()@ has type
@()@\index{trivial type} (see
569
570
571
Section~\ref{type-syntax}).  It is the only member of that type apart
from $\bot$, and can
be thought of as the ``nullary tuple'' (see Section~\ref{basic-trivial}).
Simon Peyton Jones's avatar
Simon Peyton Jones committed
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
\nopagebreak[4]

\outline{
\paragraph{Translation:}  
"@(@e@)@" is equivalent to "e".
}

\subsection{Arithmetic Sequences}
\label{arithmetic-sequences}
%
@@@
aexp	->  @[@ exp_1 [@,@ exp_2] @..@ [exp_3] @]@	
@@@
\indexsyn{aexp}%

\noindent


The {\em arithmetic sequence}\index{arithmetic sequence}
"@[@e_1@,@ e_2 @..@ e_3@]@" denotes a list of values of
type "t", where each of the "e_i" has type "t", and "t" is an
instance of class @Enum@.

\outline{
\paragraph{Translation:}
Arithmetic sequences satisfy these identities:
\begin{center}
\begin{tabular}{lcl}%
\struthack{17pt}%
@[ @"e_1"@.. ]@		& "=" 
                        & @enumFrom@ "e_1" \\
@[ @"e_1"@,@"e_2"@.. ]@	& "=" 
                        & @enumFromThen@ "e_1" "e_2" \\
@[ @"e_1"@..@"e_3"@ ]@	& "=" 
                        & @enumFromTo@ "e_1" "e_3" \\
@[ @"e_1"@,@"e_2"@..@"e_3"@ ]@ 
                        & "=" 
                        & @enumFromThenTo@ "e_1" "e_2" "e_3"
\end{tabular}
\end{center}
where @enumFrom@, @enumFromThen@, @enumFromTo@, and @enumFromThenTo@
are class methods in the class @Enum@ as defined in the Prelude
614
(see Figure~\ref{standard-classes}).
Simon Peyton Jones's avatar
Simon Peyton Jones committed
615
616
617
}

The semantics of arithmetic sequences therefore depends entirely
Simon Peyton Jones's avatar
Simon Peyton Jones committed
618
on the instance declaration for the type "t".  
Simon Peyton Jones's avatar
Simon Peyton Jones committed
619
See Section~\ref{enum-class} for more details of which @Prelude@
620
types are in @Enum@ and their semantics.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
621

622
623
624
625
626
627
628
\subsection{List Comprehensions}
\index{list comprehension}
\index{let expression!in list comprehensions}
\label{list-comprehensions}
%
@@@
aexp	-> @[@ exp @|@ qual_1 @,@ ... @,@ qual_n @]@	& (\tr{list comprehension}, n>=1)
629
630
631
qual    -> pat @<-@ exp         & (\tr{generator})
         | @let@ decls          & (\tr{local declaration})
         | exp                  & (\tr{boolean guard})
632
633
@@@
\indexsyn{aexp}
634
\indexsyn{qual}
635
636

\noindent
637
638
639
640
641
642
643
644
A {\em list comprehension} has the form "@[@ e @|@ q_1@,@ ...@,@ q_n @]@,
n>=1," where the "q_i" qualifiers\index{qualifier} are either
\begin{itemize}
\item {\em generators}\index{generator} of the form "p @<-@ e", where
"p" is a
pattern (see Section~\ref{pattern-matching}) of type "t" and "e" is an
expression of type "@[@t@]@"
\item {\em local bindings} that provide new definitions for use in
Simon Marlow's avatar
Simon Marlow committed
645
646
647
the generated expression "e" or subsequent boolean guards and generators
\item {\em boolean guards},\index{boolean guard} which are arbitrary expressions of
type @Bool@.
648
\end{itemize}
649

650
651
652
653
654
655
Such a list comprehension returns the list of elements
produced by evaluating "e" in the successive environments
created by the nested, depth-first evaluation of the generators in the
qualifier list.  Binding of variables occurs according to the normal
pattern matching rules (see Section~\ref{pattern-matching}), and if a
match fails then that element of the list is simply skipped over.  Thus:\nopagebreak[4]
Simon Peyton Jones's avatar
Simon Peyton Jones committed
656
657
658
659
660
661
\bprog
@
[ x |  xs   <- [ [(1,2),(3,4)], [(5,4),(3,2)] ], 
      (3,x) <- xs ]
@
\eprog
662
yields the list @[4,2]@.  If a qualifier is a boolen guard, it must evaluate
Simon Peyton Jones's avatar
Simon Peyton Jones committed
663
664
665
666
667
668
669
670
671
672
673
674
675
to @True@ for the previous pattern match to succeed.  
As usual, bindings in list comprehensions can shadow those in outer
scopes; for example:
\[\ba{lll}
@[ x | x <- x, x <- x ]@ & = & @[ z | y <- x, z <- y]@ \\
\ea\]
\outline{
\paragraph{Translation:}
List comprehensions satisfy these identities, which may be
used as a translation into the kernel:
\begin{center}
\bt{lcl}%
\struthack{17pt}%
Simon Peyton Jones's avatar
Simon Peyton Jones committed
676
677
"@[ @ e@ | True ]@" & = & "@[@e@]@" \\
"@[ @ e@ | @q@ ]@" & = & "@[@~ e~@|@~q@, True ]@" \\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
678
679
680
681
682
"@[ @ e@ | @b@,@~ Q ~@]@" & = &
	"@if@~b~@then@~@[ @ e@ | @Q@ ]@~@else []@" \\
"@[ @ e@ | @p @<-@ l@,@~ Q@ ]@" & = &
	"@let ok@~p~@=@~@[ @ e@ | @Q@ ]@" \\
&&       @    ok _ = []@ \\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
683
&&	"@in concatMap ok@~ l" \\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
684
685
686
687
"@[ @ e@ | let@~decls@,@~ Q@ ]@" & = &
	"@let@~decls~@in@~@[ @ e@ | @Q@ ]@"
\et
\end{center}
688
689
690
691
692
where "e" ranges over expressions, "p" over
patterns, "l" over list-valued expressions, "b" over
boolean expressions, "decls" over declaration lists, 
"q" over qualifiers, and "Q" over sequences of qualifiers.  "@ok@" is a fresh variable.
The function @concatMap@, and boolean value @True@, are defined in the Prelude.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
693
694
695
696
697
698
699
700
701
702
703
704
705
706
}

As indicated by the translation of list comprehensions, variables
bound by @let@ have fully polymorphic types while those defined by
@<-@ are lambda bound and are thus monomorphic (see Section
\ref{monomorphism}).

\subsection{Let Expressions}
\index{let expression}
\label{let-expressions}
%
% Including this syntax blurb does REALLY bad things to page breaking
% in the 1.[12] report (sigh); ToDo: hope it goes away.
@@@
707
lexp	->  @let@ decls @in@ exp
Simon Peyton Jones's avatar
Simon Peyton Jones committed
708
709
710
711
712
713
714
715
716
717
718
719
@@@
\indexsyn{exp}
\index{declaration!within a {\tt let} expression}

\noindent
{\em Let expressions} have the general form
"@let {@ d_1 @;@ ...  @;@ d_n @} in@ e",
and introduce a
nested, lexically-scoped, 
mutually-recursive list of declarations (@let@ is often called @letrec@ in
other languages).  The scope of the declarations is the expression "e"
and the right hand side of the declarations.  Declarations are
Simon Peyton Jones's avatar
Simon Peyton Jones committed
720
described in Chapter~\ref{declarations}.  Pattern bindings are matched
Simon Peyton Jones's avatar
Simon Peyton Jones committed
721
722
723
lazily; an implicit @~@ makes these patterns
irrefutable.\index{irrefutable pattern}
For example, 
Simon Marlow's avatar
Simon Marlow committed
724
725
726
727
728

\bt{l}
@let (x,y) = undefined in @"e"
\et

Simon Peyton Jones's avatar
Simon Peyton Jones committed
729
730
does not cause an execution-time error until @x@ or @y@ is evaluated.

Simon Marlow's avatar
Simon Marlow committed
731
\outline{\ifhtml{}{\small}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
732
733
734
735
736
737
738
739
740
741
742
\paragraph*{Translation:} The dynamic semantics of the expression 
"@let {@ d_1 @;@ ...  @;@ d_n @} in@ e_0" are captured by this
translation: After removing all type signatures, each
declaration "d_i" is translated into an equation of the form 
"p_i @=@ e_i", where "p_i" and "e_i" are patterns and expressions
respectively, using the translation in
Section~\ref{function-bindings}.  Once done, these identities
hold, which may be used as a translation into the kernel:
\begin{center}
\bt{lcl}%
\struthack{17pt}%
Simon Peyton Jones's avatar
Simon Peyton Jones committed
743
@let {@"p_1"@=@"e_1"@; @ ... @; @"p_n"@=@"e_n"@} in@ "e_0"
744
      &=& @let (~@"p_1"@,@ ... @,~@"p_n"@) = (@"e_1"@,@ ... @,@"e_n"@) in@ "e_0" \\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
@let @"p"@ = @"e_1" @ in @ "e_0"
	&=& @case @"e_1"@ of ~@"p"@ -> @"e_0"	\\
	& & {\rm where no variable in "p" appears free in "e_1"} \\
@let @"p"@ = @"e_1" @ in @ "e_0"
      &=& @let @"p"@ = fix ( \ ~@"p"@ -> @"e_1"@) in@ "e_0"
\et
\end{center}
where @fix@ is the least fixpoint operator.  Note the use of the
irrefutable patterns "@~@p".  This translation
does not preserve the static semantics because the use of @case@
precludes a fully polymorphic typing of the bound variables.
%\folks{Needs work -- KH}
% This same semantics applies to the top-level of
%a program that has been translated into a @let@ expression,
% as described at the beginning of Section~\ref{modules}.
The static semantics of the bindings in a @let@ expression
are described in 
Section~\ref{pattern-bindings}.
}

\subsection{Case Expressions}
\label{case}
%
@@@
769
lexp	->  @case@ exp @of@ @{@ alts @}@
Simon Peyton Jones's avatar
Simon Peyton Jones committed
770
alts	->  alt_1 @;@ ... @;@ alt_n 		& (n>=1)
Simon Peyton Jones's avatar
Simon Peyton Jones committed
771
772
773
774
alt	->  pat @->@ exp [@where@ decls]
	|   pat gdpat [@where@ decls]
	|					& (empty alternative)

Simon Marlow's avatar
Simon Marlow committed
775
776
gdpat   ->  \hprime{guards} @->@ exp [ gdpat ]
\hprime{guards}	->  \hprime{@|@ guard_1, ..., guard_n}             & \hprime{(n>=1)}
777
\hprime{guard}	-> \hprime{pat @<-@ infixexp} 	& (\hprime{\tr{pattern guard}})
Simon Marlow's avatar
Simon Marlow committed
778
         | \hprime{@let@ decls}		& (\hprime{\tr{local declaration}})
779
         | infixexp		& (\tr{boolean guard})
Simon Peyton Jones's avatar
Simon Peyton Jones committed
780
781
782
783
784
@@@
\indexsyn{exp}%
\indexsyn{alts}%
\indexsyn{alt}%
\indexsyn{gdpat}%
785
786
787
\indexsyn{guards}%
\indexsyn{guard}%

Simon Peyton Jones's avatar
Simon Peyton Jones committed
788
789
790
791
792
793
794

A {\em case expression}\index{case expression} has the general form
\[
"@case@ e @of { @p_1 match_1 @;@ ... @;@ p_n  match_n @}@"
\]
where each "match_i" is of the general form
\[\ba{lll}
Simon Marlow's avatar
Simon Marlow committed
795
 & "@|@ \hprime{gs_{i1}}"   & "@->@ e_{i1}" \\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
796
 & "..." \\
Simon Marlow's avatar
Simon Marlow committed
797
 & "@|@ \hprime{gs_{im_i}}" & "@->@ e_{im_i}" \\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
798
799
 & \multicolumn{2}{l}{"@where@ decls_i"}
\ea\]
800
(Notice that in the syntax rule for "guards", the ``@|@'' is a 
801
terminal symbol, not the syntactic metasymbol for alternation.)
Simon Peyton Jones's avatar
Simon Peyton Jones committed
802
Each alternative "p_i match_i" consists of a 
803
804
pattern\index{pattern} "p_i" and its matches, "match_i".
Each match in turn
805
806
807
808
809
consists of a sequence of pairs of guards\index{guard}
"gs_{ij}" and bodies "e_{ij}" (expressions), followed by
optional bindings ("decls_i") that scope over all of the guards and
expressions of the alternative.

Simon Marlow's avatar
Simon Marlow committed
810
\begin{haskellprime}
811
812
813
814
815
816
817
\index{Pattern Guards}
\index{guards}
A {\em guard}\index{guard} has one of the following forms:
\begin{itemize}
\item {\em pattern guards}\index{pattern guard} are of the form "p @<-@ e", where
"p" is a 
pattern (see Section~\ref{pattern-matching}) of type "t" and "e" is an
818
expression type "t"\footnote{Note that the syntax of a pattern guard is the same as that of a generator in a list comprehension. 
819
The contextual difference is that, in a list comprehension, a pattern of type "t" goes with an expression of type "@[@t@]@".}.
820
They succeed if the expression "e" matches the pattern "p", and introduce the bindings of the pattern to the environment.
Simon Marlow's avatar
Simon Marlow committed
821
\item {\em local bindings} are of the form "@let@ decls".  They always succeed, and they introduce the names defined in "decls" to the environment.
822
823
824
\item {\em boolean guards}\index{boolean guard} are arbitrary expressions of
type @Bool@.  They succeed if the expression evaluates to @True@, and they do not introduce new names to the environment.  A boolean guard, "g", is semantically equivalent to the pattern guard "@True <- @g".
\end{itemize}
Simon Marlow's avatar
Simon Marlow committed
825
\end{haskellprime}
826
827
828


An alternative of the form
Simon Peyton Jones's avatar
Simon Peyton Jones committed
829
830
831
832
833
834
835
836
837
838
839
840
841
842
\[
"pat @->@ exp @where@ decls"
\]
is treated as shorthand for:
\[\ba{lll}
 & "pat @| True@"   & "@->@ exp" \\
 & \multicolumn{2}{l}{"@where@ decls"}
\ea\]

A case expression must have at least one alternative and each alternative must
have at least one body.  Each body must have the same type, and the
type of the whole expression is that type.

A case expression is evaluated by pattern matching the expression "e"
Simon Peyton Jones's avatar
Simon Peyton Jones committed
843
against the individual alternatives.  The alternatives are tried
844
845
846
sequentially, from top to bottom.  If "e" matches the pattern of an
alternative, then the guarded expressions for that alternative are
tried sequentially from top to bottom in the environment of the case
847
expression extended first by the bindings created during the matching
848
849
850
of the pattern, and then by the "decls_i" in the @where@ clause
associated with that alternative.

Simon Marlow's avatar
Simon Marlow committed
851
\begin{haskellprime}
852
853
854
855
856
857
858
859
For each guarded expression, the comma-separated guards are tried
sequentially from left to right.  If all of them succeed, then the
corresponding expression is evaluated in the environment extended with
the bindings introduced by the guards.  That is, the bindings that are
introduced by a guard (either by using a let clause or a pattern
guard) are in scope in the following guards and the corresponding
expression.  If any of the guards fail, then this guarded expression
fails and the next guarded expression is tried.
Simon Marlow's avatar
Simon Marlow committed
860
\end{haskellprime}
861
862
863
864

If none of the guarded expressions for a given alternative succeed,
then matching continues with the next alternative.  If no alternative
succeeds, then the result is "\bot".  Pattern matching is described in
865
866
Section~\ref{pattern-matching}, with the formal semantics of case
expressions in Section~\ref{case-semantics}.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
867

Simon Peyton Jones's avatar
Simon Peyton Jones committed
868
{\em A note about parsing.} The expression
869
870
871
872
873
874
875
876
877
878
879
880
881
882
\bprog
@
  case x of { (a,_) | let b = not a in b :: Bool -> a }
@
\eprog
is tricky to parse correctly.  It has a single unambiguous parse, namely
\bprog
@
  case x of { (a,_) | (let b = not a in b :: Bool) -> a }
@
\eprog
However, the phrase "@Bool -> a@" is syntactically valid as a type, and
parsers with limited lookahead may incorrectly commit to this choice, and hence
reject the program.  Programmers are advised, therefore, to avoid guards that
883
end with a type signature --- indeed that is why a "guard" contains 
884
an "infixexp" not an "exp".
885

Simon Peyton Jones's avatar
Simon Peyton Jones committed
886
887
888
889
890
891
892
\subsection{Do Expressions}
\index{do expression}
\label{do-expressions}
\index{let expression!in do expressions}
\index{monad}
%
@@@
893
lexp -> @do@ @{@ stmts @}@             & (\tr{do expression})
Simon Peyton Jones's avatar
Simon Peyton Jones committed
894
stmts -> stmt_1 ... stmt_n exp [@;@]  &  (n>=0)
895
896
897
stmt -> exp @;@
      | pat @<-@ exp @;@
      | @let@ decls @;@
Simon Peyton Jones's avatar
Simon    
Simon Peyton Jones committed
898
      | @;@			& (empty statement)
Simon Peyton Jones's avatar
Simon Peyton Jones committed
899
@@@
Simon Peyton Jones's avatar
Simon Peyton Jones committed
900
901
902
903
904
% The semicolons are done differently than for decls
%  Reason: to do it the same way would mean:
%	stmts -> stmt1 ; ... ; stmtn ; exp [;]
% Now, what happens if n=0?  Is there a ';' before the exp or not?
% Putting the ';' on the end of the stmt makes that clear.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
\indexsyn{exp}%
\indexsyn{stmt}%
\indexsyn{stmts}%

A {\em do expression} provides a more conventional syntax for monadic programming.
It allows an expression such as 
\bprog
@
  putStr "x: "    >> 
  getLine         >>= \l ->
  return (words l)
@
\eprog
to be written in a more traditional way as:
\bprog
@
  do putStr "x: "
     l <- getLine
     return (words l)
@
\eprog
\outline{
\paragraph*{Translation:} 
Do expressions satisfy these identities, which may be
used as a translation into the kernel, after eliminating empty "stmts":
\begin{center}
\bt{lcl}%
\struthack{17pt}%
@do {@"e"@}@                       &=& "e"\\
@do {@"e"@;@"stmts"@}@             &=& "e" @>> do {@"stmts"@}@ \\
@do {@"p"@ <- @"e"@; @"stmts"@}@   &=& @let ok @"p"@ = do {@"stmts"@}@\\
                                   & & @    ok _ = fail "..."@\\
                                   & & @  in @"e"@ >>= ok@ \\
@do {let@ "decls"@; @"stmts"@}@  &=& @let@ "decls" @in do {@"stmts"@}@\\
\et
\end{center}
The ellipsis @"..."@ stands for a compiler-generated error message,
passed to @fail@, preferably giving some indication of the location
of the pattern-match failure;
the functions @>>@, @>>=@, and @fail@ are operations in the class @Monad@,
as defined in the Prelude\indexclass{Monad}; and @ok@ is a fresh
identifier. 
}
As indicated by the translation of @do@, variables bound by @let@ have
fully polymorphic types while those defined by @<-@ are lambda bound
and are thus monomorphic.

\subsection{Datatypes with Field Labels}
\label{field-ops}
\index{data declaration@@{\tt data} declaration}
\index{label}
\index{field label|see{label}}
957
958
A datatype declaration may optionally define field labels
(see Section~\ref{datatype-decls}).
Simon Peyton Jones's avatar
Simon Peyton Jones committed
959
960
961
962
963
964
These field labels can be used to 
construct, select from, and update fields in a manner
that is independent of the overall structure of the datatype.

Different datatypes cannot share common field labels in the same scope.
A field label can be used at most once in a constructor.
965
Within a datatype, however, a field label can be used in more
Simon Peyton Jones's avatar
Simon Peyton Jones committed
966
than one constructor provided the field has the same typing in all
967
constructors. To illustrate the last point, consider:
Simon Peyton Jones's avatar
Simon Peyton Jones committed
968
\bprog
969
970
971
972
@
  data S = S1 { x :: Int } | S2 { x :: Int }   -- OK
  data T = T1 { y :: Int } | T2 { y :: Bool }  -- BAD
@
Simon Peyton Jones's avatar
Simon Peyton Jones committed
973
\eprog
974
975
976
Here @S@ is legal but @T@ is not, because @y@ is given 
inconsistent typings in the latter.

Simon Peyton Jones's avatar
Simon Peyton Jones committed
977
978
979
980
981
982
\subsubsection{Field Selection}
@@@
aexp ->     qvar
@@@
\index{field label!selection}

983
984
Field labels are used as selector functions.  When used as a variable,
a field label serves as a function that extracts the field from an
Simon Peyton Jones's avatar
Simon Peyton Jones committed
985
986
987
object.  Selectors are top level bindings and so they
may be shadowed by local variables but cannot conflict with 
other top level bindings of the same name.  This shadowing only
988
989
affects selector functions; in record construction (Section~\ref{record-construction}) 
and update (Section~\ref{record-update}), field labels
Simon Peyton Jones's avatar
Simon Peyton Jones committed
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
cannot be confused with ordinary variables. 

\outline{
\paragraph*{Translation:} 
A field label "f" introduces a selector function defined as:
\begin{center}
\bt{lcl}
\struthack{17pt}%
"f"@ x@ &=&@case x of {@ "C_1 p_{11} \ldots p_{1k}" @ -> @ "e_1" @;@ 
 "\ldots" @;@ "C_n p_{n1} \ldots p_{nk}" @ -> @ "e_n" @}@\\
\et
\end{center}
where "C_1 \ldots C_n" are all the constructors of the datatype containing a
field labeled with "f", "p_{ij}" is @y@ when "f" labels the "j"th
component of "C_i" or @_@ otherwise, and "e_i" is @y@ when some field in
"C_i" has a label of "f" or @undefined@ otherwise.
}
\subsubsection{Construction Using Field Labels}
1008
\label{record-construction}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
\index{field label!construction}
@@@
aexp ->  qcon @{@ fbind_1 @,@ ... @,@ fbind_n @}@ & (\tr{labeled construction}, n>=0)
fbind   ->  qvar @=@ exp
@@@
\indexsyn{fbind}%

A constructor with labeled fields may be used to construct a value 
in which the components are specified by name rather than by position.
Unlike the braces used in declaration lists, these are not subject to
layout; the @{@ and @}@ characters must be explicit.  (This is also
true of field updates and field patterns.)
1021
Construction using field labels is subject to the following constraints:
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1022
1023
1024
\begin{itemize}
\item Only field labels declared with the specified constructor may be
mentioned. 
1025
\item A field label may not be mentioned more than once.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1026
1027
1028
1029
1030
\item Fields not mentioned are initialized to $\bot$.
\item A compile-time error occurs when any strict fields (fields
whose declared types are prefixed by @!@) are omitted during
construction.  Strict fields are discussed in Section~\ref{strictness-flags}.
\end{itemize}
1031
1032
1033
1034
The expression @F {}@, where @F@ is a data constructor, is legal 
{\em whether or not @F@ was declared with record syntax} (provided @F@ has no strict
fields --- see the third bullet above); 
it denotes "@F@ \bot_1 ... \bot_n", where "n" is the arity of @F@.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049

\outline{
\paragraph*{Translation:} 
In the binding "f" @=@ "v", the field "f" labels "v".
\begin{center}
\bt{lcl}
\struthack{17pt}%
"C" @{@ "bs" @}@ &=& "C (pick^C_1 bs @undefined@) \ldots (pick^C_k bs @undefined@)"\\
\et
\end{center}
where "k" is the arity of "C".

The auxiliary function "pick^C_i bs d" is defined as follows:
\begin{quote}
    If the "i"th component of a constructor "C" has the
1050
    field label "f", and if "f=v" appears in the binding list
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1051
1052
1053
1054
    "bs", then "pick^C_i bs d" is "v".  Otherwise, "pick^C_i bs d" is
    the default value "d".
\end{quote}
}
1055

Simon Peyton Jones's avatar
Simon Peyton Jones committed
1056
\subsubsection{Updates Using Field Labels}
1057
\label{record-update}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1058
1059
1060
1061
1062
\index{field label!update}
@@@
aexp ->  aexp_{\langle{}qcon\rangle{}} @{@ fbind_1 @,@ ... @,@ fbind_n @}@ & (\tr{labeled update}, n>=1)
@@@

1063
Values belonging to a datatype with field labels may be
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
non-destructively updated.  This creates a new value in which the
specified field values replace those in the existing value.  
Updates are restricted in the following ways:
\begin{itemize}
\item All labels must be taken from the same datatype.
\item At least one constructor must define all of the labels
mentioned in the update.
\item No label may be mentioned more than once.
\item An execution error occurs when the value being updated does
not contain all of the specified labels.
\end{itemize}
\outline{
\paragraph*{Translation:} 
Using the prior definition of "pick",
\begin{center}
\bt{lcl}
\struthack{17pt}%
"e" @{@ "bs" @}@ &=& @case@ "e" @of@\\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1082
&&@        @"C_1 v_1 ... v_{k_1}" @->@ "C_1 (pick^{C_1}_1 bs v_1) ... (pick^{C_1}_{k_1} bs v_{k_1})"\\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1083
&&@            @ ... \\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1084
&&@        @"C_j v_1 ... v_{k_j}" @->@ "C_j (pick^{C_j}_1 bs v_1) ... (pick^{C_j}_{k_j} bs v_{k_j})"\\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1085
1086
1087
1088
&&@        _ -> error "Update error"@\\
\et
\end{center}
where "\{C_1,...,C_j\}" is the set of constructors containing all labels
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1089
in "bs", and "k_i" is the arity of "C_i".
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
}
Here are some examples using labeled fields:
\bprog
@
data T    = C1 {f1,f2 :: Int}
          | C2 {f1 :: Int,
                f3,f4 :: Char}
@
\eprog
\[\bt{|l|l|}%%b
\hline
Expression                       	    & Translation                       \\
\hline
@C1 {f1 = 3}@                       & @C1 3 undefined@          \\
@C2 {f1 = 1, f4 = 'A', f3 = 'B'}@   & @C2 1 'B' 'A'@            \\
@x {f1 = 1}@                 & @case x of C1 _ f2    -> C1 1 f2@ \\
                             & @          C2 _ f3 f4 -> C2 1 f3 f4@   \\
\hline\et\]
1108
The field @f1@ is common to both constructors in T.  This
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1109
1110
1111
1112
example translates expressions using constructors in field-label
notation into equivalent expressions using the same constructors
without field labels. 
A compile-time error will result if no single constructor
1113
defines the set of field labels used in an update, such as
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
@x {f2 = 1, f3 = 'x'}@. 

\subsection{Expression Type-Signatures}
\index{expression type-signature}
\label{expression-type-sigs}
%
@@@
exp ->  exp @::@ [context @=>@] type
@@@
\indexsyn{exp}
\indextt{::}

\nopagebreak[4]
{\em Expression type-signatures} have the form "e @::@ t", where "e"
is an expression and "t" is a type (Section~\ref{type-syntax}); they
are used to type an expression explicitly
and may be used to resolve ambiguous typings due to overloading (see
Section~\ref{default-decls}).  The value of the expression is just that of
"exp".  As with normal type signatures (see
Section~\ref{type-signatures}), the declared type may be more specific than 
the principal type derivable from "exp", but it is an error to give
a type that is more general than, or not comparable to, the
principal type.
\outline{
\paragraph*{Translation:} 
\begin{center}
\bt{lcl}
\struthack{17pt}%
"e @::@ t" & = & "@let {@ v @::@ t@; @ v @=@ e @} in @v"
\et
\end{center}
}


\subsection{Pattern Matching}
\index{pattern-matching}
\label{pattern-matching}
\label{patterns}

{\em Patterns} appear in lambda abstractions, function definitions, pattern
bindings, list comprehensions, do expressions, and case expressions.
However, the 
first five of these ultimately translate into case expressions, so
defining the semantics of pattern matching for case expressions is sufficient.
%it suffices to restrict the definition of the semantics of
%pattern-matching to case expressions.

\subsubsection{Patterns}
\label{pattern-definitions}

Patterns\index{pattern} have this syntax:
@@@
Simon Marlow's avatar
Simon Marlow committed
1166
1167
1168
1169
1170
pat     -> \hprime{lpat qconop pat} & (\tr{infix constructor})
        | \hprime{lpat}

\hprime{lpat} ->  apat
        | \hprime{@-@ (integer | float)} & (\tr{negative literal})
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
	|   gcon apat_1 ... apat_k		& (\tr{arity} gcon = k, k>=1)

apat	->  var [{\tt @@} apat]			& (\tr{as pattern})
	|   gcon				& (\tr{arity} gcon = 0) 
        |   qcon @{@ fpat_1 @,@ ... @,@ fpat_k @}@ & (\tr{labeled pattern}, k>=0)
	|   literal
	|   @_@					& (\tr{wildcard})
	|   @(@ pat @)@				& (\tr{parenthesized pattern})
	|   @(@ pat_1 @,@ ... @,@ pat_k @)@	& (\tr{tuple pattern}, k>=2)
	|   @[@ pat_1 @,@ ... @,@ pat_k @]@	& (\tr{list pattern}, k>=1) 
	|   @~@ apat				& (\tr{irrefutable pattern})

fpat    ->  qvar @=@ pat
@@@
\indexsyn{pat}%
Simon Marlow's avatar
Simon Marlow committed
1186
\indexsyn{lpat}%
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1187
1188
1189
1190
1191
1192
1193
1194
1195
\indexsyn{apat}%
\indexsyn{fpats}%
\indexsyn{fpat}%

The arity of a constructor must match the number of
sub-patterns associated with it; one cannot match against a
partially-applied constructor.

All patterns must be {\em linear}\index{linearity}
1196
1197
\index{linear pattern}---no variable may appear more than once.  For
example, this definition is illegal:
1198
\bprog
1199
@
1200
f (x,x) = x	-- ILLEGAL; x used twice in pattern
1201
@
1202
\eprog
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1203
1204
1205
1206
1207
1208
Patterns of the form "var"{\tt @@}"pat" are called {\em as-patterns},
\index{as-pattern ({\tt {\char'100}})}
and allow one to use "var"
as a name for the value being matched by "pat".  For example,\nopagebreak[4]
\bprog
@
Simon Marlow's avatar
Simon Marlow committed
1209
case e of { xs@(x:rest) -> if x==0 then rest else xs }
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
@
\eprog
is equivalent to:
\bprog
@
let { xs = e } in
  case xs of { (x:rest) -> if x==0 then rest else xs }
@
\eprog

Patterns of the form @_@ are {\em
wildcards}\index{wildcard pattern ({\tt {\char'137}})} and are useful when some part of a pattern
is not referenced on the right-hand-side.  It is as if an
identifier not used elsewhere were put in its place.  For example,
\bprog
@
case e of { [x,_,_]  ->  if x==0 then True else False }
@
\eprog
is equivalent to:
\bprog
@
case e of { [x,y,z]  ->  if x==0 then True else False }
@
\eprog
% where @y@ and @z@ are identifiers not used elsewhere.

%old:
%This translation is also 
%assumed prior to the semantics given below.

\subsubsection{Informal Semantics of Pattern Matching}

Patterns are matched against values.  Attempting to match a pattern
can have one of three results: it may {\em fail\/}; it may {\em
succeed}, returning a binding for each variable in the pattern; or it
may {\em diverge} (i.e.~return "\bot").  Pattern matching proceeds
1247
from left to right, and outside to inside, according to the following rules:
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1248
\begin{enumerate}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1249
1250
\item Matching the pattern "var" 
against a value "v" always succeeds and binds "var" to "v".
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1251

Simon Peyton Jones's avatar
Simon Peyton Jones committed
1252
1253
1254
1255
1256
1257
\item
Matching the pattern "@~@apat" against a value "v" always succeeds.
The free variables in "apat" are bound to the appropriate values if matching
"apat" against "v" would otherwise succeed, and to "\bot" if matching
"apat" against "v" fails or diverges.  (Binding does {\em
not} imply evaluation.)
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1258

1259
1260
Operationally, this means that no matching is done on a
"@~@apat" pattern until one of the variables in "apat" is used.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1261
1262
1263
1264
At that point the entire pattern is matched against the value, and if
the match fails or diverges, so does the overall computation.

\item
1265
Matching the wildcard pattern @_@ against any value always succeeds,
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1266
and no binding is done.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1267

Simon Peyton Jones's avatar
Simon Peyton Jones committed
1268
1269
1270
1271
\item
Matching the pattern "con pat" against a value, where "con" is a
constructor defined by @newtype@, depends on the value:
\begin{itemize}
1272
1273
\item If the value is of the form "con v", then "pat" is matched against "v".
\item If the value is "\bot", then "pat" is matched against "\bot".
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1274
1275
1276
\end{itemize}
That is, constructors associated with
@newtype@ serve only to change the type of a value.\index{newtype declaration@@{\tt newtype} declaration}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1277

Simon Peyton Jones's avatar
Simon Peyton Jones committed
1278
1279
1280
1281
1282
\item
Matching the pattern "con pat_1 \ldots pat_n" against a value, where "con" is a
constructor defined by @data@, depends on the value:
\begin{itemize}
\item If the value is of the form "con v_1 \ldots v_n", 
1283
sub-patterns are matched left-to-right against the components of the data value;
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1284
1285
1286
1287
if all matches succeed, the overall match
succeeds; the first to fail or diverge causes the overall match to
fail or diverge, respectively.  

Simon Peyton Jones's avatar
Simon Peyton Jones committed
1288
1289
1290
1291
1292
1293
\item If the value is of the form "con' v_1 \ldots v_m", where "con" is a different 
constructor to "con'", the match fails.

\item If the value is "\bot", the match diverges.
\end{itemize}

1294
1295
1296
1297
1298
1299
1300
\item 
Matching against a constructor using labeled fields is the same as
matching ordinary constructor patterns except that the fields are
matched in the order they are named in the field list.  All fields
listed must be declared by the constructor; fields may not be named
more than once.  Fields not named by the pattern are ignored (matched
against @_@).
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1301

1302
\item Matching a numeric, character, or string literal pattern "k" against a value "v"
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1303
\index{literal pattern}
1304
1305
succeeds if "v ~@==@ ~k", where @==@
is overloaded based on the type of the pattern.  The match diverges if
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1306
1307
this test diverges.

1308
1309
1310
1311
The interpretation of numeric literals is exactly as described in Section~\ref{vars-and-lits};
that is, the overloaded function @fromInteger@ or @fromRational@ is 
applied to an @Integer@ or @Rational@ literal (resp)
to convert it to the appropriate type.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1312

Simon Peyton Jones's avatar
Simon Peyton Jones committed
1313
\item
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1314
Matching an as-pattern "var"{\tt @@}"apat" against a value "v" is
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1315
\index{as-pattern ({\tt {\char'100}})}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1316
1317
the result of matching "apat" against "v", augmented with the binding of
"var" to "v".  If the match of "apat" against "v" fails or diverges,
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1318
1319
1320
1321
1322
then so does the overall match.
\end{enumerate}

Aside from the obvious static type constraints (for
example, it is a static error to match a character against a
1323
1324
1325
boolean), the following static class constraints hold: 
\begin{itemize}
\item An integer
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1326
1327
1328
literal pattern
\index{integer literal pattern}
can only be matched against a value in the class
1329
1330
@Num@.
\item A floating literal pattern
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1331
1332
1333
\index{floating literal pattern}
can only be matched against a value
in the class @Fractional@.
1334
\end{itemize}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1335

1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
It is sometimes helpful to distinguish two kinds of
patterns.  Matching an {\em irrefutable pattern}
\index{irrefutable pattern}
is non-strict: the pattern matches even if the value to be matched is "\bot".
Matching a {\em refutable} pattern is strict: if the value to be matched
\index{refutable pattern}
is "\bot" the match diverges.
The irrefutable patterns are as follows:
a variable, a wildcard, "N apat" where "N" is a constructor
defined by @newtype@ and "apat" is irrefutable (see
Section~\ref{datatype-renaming}), 
\index{newtype declaration@@{\tt newtype} declaration} 
"var"{\tt @@}"apat" where "apat" is irrefutable,
or of the form "@~@apat" (whether or not "apat" is irrefutable).
All other patterns are {\em refutable}.

Simon Peyton Jones's avatar
Simon Peyton Jones committed
1352
1353
1354
1355
1356
1357
1358
1359
1360
Here are some examples:
\begin{enumerate}
\item If the pattern @['a','b']@ is matched against "@['x',@\bot@]@", then @'a'@
"fails" to match against @'x'@, and the result is a failed match.  But
if @['a','b']@ is matched against "@[@\bot@,'x']@", then attempting to match
@'a'@ against "\bot" causes the match to "diverge".

\item These examples demonstrate refutable vs.~irrefutable
matching:
Simon Marlow's avatar
Simon Marlow committed
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380

\begin{tabular}{l}
@(\ ~(x,y) -> 0) @$\bot$@    @$\Rightarrow$@    0@\\
@(\  (x,y) -> 0) @$\bot$@    @$\Rightarrow$@    @$\bot$
\end{tabular}

\begin{tabular}{l}
@(\ ~[x] -> 0) []    @$\Rightarrow$@    0@\\
@(\ ~[x] -> x) []    @$\Rightarrow$@    @$\bot$
\end{tabular}

\begin{tabular}{l}
@(\ ~[x,~(a,b)] -> x) [(0,1),@$\bot$@]    @$\Rightarrow$@    (0,1)@\\
@(\ ~[x, (a,b)] -> x) [(0,1),@$\bot$@]    @$\Rightarrow$@    @$\bot$
\end{tabular}

\begin{tabular}{l}
@(\  (x:xs) -> x:x:xs) @$\bot$@   @$\Rightarrow$@   @$\bot$\\
@(\ ~(x:xs) -> x:x:xs) @$\bot$@   @$\Rightarrow$@   @$\bot$@:@$\bot$@:@$\bot$
\end{tabular}
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391

\item 
Consider the following declarations:
\bprog
@
  newtype N = N Bool
  data    D = D !Bool
@
\eprog
These examples illustrate the difference in pattern matching
between types defined by @data@ and @newtype@:
Simon Marlow's avatar
Simon Marlow committed
1392
1393
1394
1395
1396
1397
1398

\begin{tabular}{l}
@(\  (N True) -> True) @"\bot"@     @"\Rightarrow"@    @"\bot"\\
@(\  (D True) -> True) @"\bot"@     @"\Rightarrow"@    @"\bot"\\
@(\ ~(D True) -> True) @"\bot"@     @"\Rightarrow"@    True@
\end{tabular}

1399
1400
Additional examples may be found in Section~\ref{datatype-renaming}.

Simon Peyton Jones's avatar
Simon Peyton Jones committed
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
\end{enumerate}
\nopagebreak[4]
% \bprog
% @@
% (\ t -> let (x,y) = t in 0) @@"\bot"@@    @@"\Rightarrow"@@    0
% (\ (x,y) -> 0) @@"\bot"@@    @@"\Rightarrow"@@    @@"\bot"@@
% @@
% \eprog
% \bprog
% @@
% (\ l -> let [x] = l in 0) []    @@"\Rightarrow"@@    0
% (\ l -> let [x] = l in x) []    @@"\Rightarrow"@@    @@"\bot"@@
% @@
% \eprog
% \bprog
% @@
% (\ l -> let [x,t] = l; (a,b) = t in x) [(0,1),@@"\bot"@@]    @@"\Rightarrow"@@    (0,1)
% (\ l -> let [x, (a,b)] = l in x) [(0,1),@@"\bot"@@]    @@"\Rightarrow"@@    @@"\bot"@@
% @@
% \eprog
% \bprog
% @@
% (\  (x:xs) -> x:x:xs) @@"\bot"@@   @@"\Rightarrow"@@   @@"\bot"@@
% (\ l -> let (x:xs) = l in x:x:xs) @@"\bot"@@   @@"\Rightarrow"@@   @@"\bot"@@:@@"\bot"@@:@@"\bot"@@
% @@
% \eprogNoSkip
% \end{enumerate}

1429
1430
Top level patterns in case expressions and the set of top level
patterns in function or pattern bindings may have zero or more
1431
1432
associated {\em guards}\index{guards}.  See
Section~\ref{case} for the syntax and semantics of guards.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1433

1434
The guard semantics have an influence on the
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1435
1436
1437
1438
1439
1440
strictness characteristics of a function or case expression.  In
particular, an otherwise irrefutable pattern
\index{irrefutable pattern}
may be evaluated because of a guard.  For example, in
\bprog
@
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1441
1442
f :: (Int,Int,Int) -> [Int] -> Int
f ~(x,y,z) [a] | (a == y) = 1
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1443
1444
1445
1446
1447
1448
1449
1450
@
\eprog
% \bprog
% @@
% f t [a] | a==y = 1 
%           where (x,y,z) = t
% @@
% \eprog
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1451
both @a@ and @y@ will be evaluated by @==@ in the guard.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1452

1453
1454
1455
1456
1457
1458
1459
1460

\subsubsection{Formal Semantics of Pattern Matching}
\label{case-semantics}

The semantics of all pattern matching constructs other than @case@
expressions are defined by giving identities that relate those
constructs to @case@ expressions.  The semantics of
@case@ expressions themselves are in turn given as a series of
1461
identities, in Figures~\ref{simple-case-expr-1}--\ref{simple-case-expr-3}. 
1462
1463
1464
1465
Any implementation should behave so that these identities hold; it is 
not expected that it will use them directly, since that 
would generate rather inefficient code.
  
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1466
\begin{figure}[tb]
1467
\outlinec{\small
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1468
1469
\begin{tabular}{@@{}cl}
(a)&@case @$e$@ of { @$alts$@ } @$=$@ (\@$v$@ -> case @$v$@ of { @$alts$@ }) @"e"\\
1470
&{\rm where $v$ is a new variable}\\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1471
1472
1473
(b)&@case @$v$@ of { @$p_1\ \ match_1$@;  @$\ldots{}$@ ; @$p_n\ \ match_n$@ }@\\
   &$=$@  case @$v$@ of { @$p_1\ \ match_1$@ ;@\\
   &@                _  -> @$\ldots{}$@ case @$v$@ of {@\\
1474
   &@                           @$p_n\ \ match_n$\ @;@\\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1475
1476
   &@                           _  -> error "No match" }@$\ldots$@}@\\
   &@ @{\rm where each $match_i$ has the form:}\\
1477
   &@  | @$gs_{i,1}$  @ -> @$e_{i,1}$@ ; @$\ldots{}$@ ; | @$gs_{i,m_i}$@ -> @$e_{i,m_i}$@ where { @$decls_i$@ }@\\[4pt]
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1478
%\\
1479
1480
(c)&@case @$v$@ of { @$p$@ | @$gs_1$@ -> @$e_1$@ ; @$\ldots{}$\\
   &\hspace*{4pt}@             | @$gs_n$@ -> @$e_n$@ where { @$decls$@ }@\\
1481
   &\hspace*{2pt}@            _     -> @$e'$@ }@\\
1482
1483
1484
   &$=$@ case @$e'$@ of { @$y$@ ->@\\
   &@   case @$v$@ of {@\\
   &@     @$p$@ -> let { @$decls$@ } in@\\
Simon Marlow's avatar
Simon Marlow committed
1485
1486
1487
1488
1489
   &\hprime{@          case () of {@}\\
   &\hprime{@            () | @$gs_1$@ -> @$e_1$@;@}\\
   &\hprime{@            _ -> @$\ldots$@ case () of {@}\\
   &\hprime{@                       () | @$gs_n$@ -> @$e_n$@;@}\\
   &\hprime{@                       _  -> @$y$@ } @$\ldots$@ }@}\\
1490
1491
   &@     _ -> @$y$@ }}@\\
   &{\rm where $y$ is a new variable}\\[4pt]
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1492
1493
%\\
(d)&@case @$v$@ of { ~@$p$@ -> @$e$@; _ -> @$e'$@ }@\\
1494
&$=$@ (\@$x_1$ $\ldots$ $x_n$ @->@ $e$ @) (case @$v$@ of { @$p$@->@ 
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1495
$x_1$@ })@ $\ldots$ @(case @$v$@ of { @$p$@ -> @$x_n$@})@\\
1496
&{\rm where $x_1, \ldots, x_n$ are all the variables in $p\/$}\\[4pt]
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1497
1498
1499
1500
1501
1502
1503
%\\
(e)&@case @$v$@ of { @$x${\tt @@}$p$@ -> @$e$@; _ -> @$e'$@ }@\\
&$=$@  case @$v$@ of { @$p$@ -> ( \ @$x$@ -> @$e$@ ) @$v$@ ; _ -> @$e'$@ }@\\[4pt]
%\\
(f)&@case @$v$@ of { _ -> @$e$@; _ -> @$e'$@ } @$=$@ @$e$\\[4pt]
\end{tabular}
}
1504
%**<div align=center> <h4>Figure 3.1</h4> </div>
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1505
1506
1507
1508
1509
\ecaption{Semantics of Case Expressions, Part 1}
\label{simple-case-expr-1}
\end{figure}

\begin{figure}[tb]
1510
\outlinec{\small
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1511
1512
\begin{tabular}{@@{}cl}
(g)&@case @$v$@ of { @$K\ p_1 \ldots p_n$@ -> @$e$@; _ -> @$e'$@ }@\\
1513
&$=$@ case @$v$@ of {@\\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1514
&@     @$K\ x_1 \ldots x_n$@ -> case @$x_1$@ of {@\\
1515
1516
1517
&@                    @$p_1$@ -> @$\ldots$@ case @$x_n$@ of { @$p_n$@ -> @$e$@ ; _ -> @$e'$@ } @$\ldots$\\
&@                    _  -> @$e'$@ }@\\
&@     _ -> @$e'$@ }@\\[2pt]
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1518
1519
&{\rm at least one of $p_1, \ldots, p_n$ is not a variable; $x_1, \ldots, x_n$ are new variables}\\[4pt]
%\\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1520
(h)&@case @$v$@ of { @$k$@ -> @$e$@; _ -> @$e'$@ } @$=$@ if (@$v$@==@$k$@) then @$e$@ else @$e'$ \\
1521
&{\rm where $k$ is a numeric, character, or string literal} \\[4pt]
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
%\\
(i)&@case @$v$@ of { @$x$@ -> @$e$@; _ -> @$e'$@ } @$=$@ case @$v$@ of { @$x$@ -> @$e$@ }@\\[4pt]
%\\
(j)&@case @$v$@ of { @$x$@ -> @$e$@ } @$=$@ ( \ @$x$@ -> @$e$@ ) @$v$\\[4pt]
%\\
(k)&@case @$N$ $v$@ of { @$N$@ @$p$@ -> @$e$@; _ -> @$e'$@ }@\\
&$=$@ case @$v$@ of { @$p$@ -> @$e$@; _ -> @$e'$@ }@\\
&{\rm where $N$ is a @newtype@ constructor}\\[4pt]
(l)&@case @$\bot$@ of { @$N$@ @$p$@ -> @$e$@; _ -> @$e'$@ } @$=$@ case @$\bot$@ of { @$p$@ -> @$e$@ }@\\
&{\rm where $N$ is a @newtype@ constructor}\\[4pt]
(m)& @case @ $v$ @ of { @ $K$ @ {@ $f_1$ @ = @ $p_1$ @ , @ $f_2$ @ = @
$p_2$ @ , @ $\ldots$ @} -> @ $e$ @; _ -> @ $e'$ @ }@\\
&$=$ @ case @$e'$@ of {@\\
&@   @$y$@ ->@\\
&@    case @ $v$ @ of { @\\
&@     @ $K$ @ { @ $f_1$ @ = @ $p_1$ @ } ->@\\
&@            case @ $v$ @ of {@ $K$ @ {@ $f_2$ @ = @ $p_2$ @ , @
$\ldots$ @ } -> @ $e$ @; _ -> @ $y$ @ };@\\
&@            _ -> @ $y$ @ }}@\\
&{\rm where $f_1$, $f_2$, $\ldots$ are fields of constructor $K$; $y$
is a new variable}\\[4pt]
(n)&@case @ $v$ @ of { @ $K$ @ {@ $f$ @ = @ $p$ @} -> @ $e$ @; _ -> @
$e'$ @ }@ \\
&$=$@ case @ $v$ @ of {@\\
&   @    @ $K$ $p_1\ \ldots \  p_n$ @ -> @ $e$ @; _ -> @ $e'$ @ }@\\
&{\rm where $p_i$ is $p$ if $f$ labels the $i$th component of $K$,
@_@ otherwise}\\
(o)&@case @ $v$ @ of { @ $K$ @ {} -> @ $e$ @; _ -> @
$e'$ @ }@ \\
&$=$@ case @ $v$ @ of {@\\
&   @    @ $K$ @_@ $\ldots$ @_ -> @ $e$ @; _ -> @ $e'$ @ }@\\
(p)&@case (@$K'$@ @$e_1$@ @$\ldots$@ @$e_m$@) of { @$K$@ @$x_1$@ @$\ldots$@ @$x_n$@ -> @$e$@; _ -> @$e'$@ } @$=$@ @$e'$\\
&{\rm where $K$ and $K'$ are distinct @data@ constructors of arity $n$ and $m$, respectively}\\[4pt]
%\\
(q)&@case (@$K$@ @$e_1$@ @$\ldots$@ @$e_n$@) of { @$K$@ @$x_1$@ @$\ldots$@ @$x_n$@ -> @$e$@; _ -> @$e'$@ }@\\
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1557
&$=$@ (\@$x_1~\ldots~x_n$@ -> @$e$@) @$e_1~\ldots~e_n$\\
1558
&{\rm where $K$ is a @data@ constructor of arity $n$}\\[4pt]
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1559
1560

(r)&@case@~$\bot$~@of { @$K$@ @$x_1$@ @$\ldots$@ @$x_n$@ -> @$e$@; _ -> @$e'$@ }@ ~$=$~ $\bot$ \\
1561
&{\rm where $K$ is a @data@ constructor of arity $n$}\\[4pt]
1562
1563
1564
1565
1566
1567
\end{tabular}
}
%**<div align=center> <h4>Figure 3.2</h4> </div>
\ecaption{Semantics of Case Expressions, Part 2}
\label{simple-case-expr-2}
\end{figure}
1568

1569
1570
\begin{figure}[tb]
\outlinec{\small
Simon Marlow's avatar
Simon Marlow committed
1571
\begin{haskellprime}
1572
\begin{tabular}{@@{}cl}
1573
(s)&@case () of { () | @$g_1$@, @$\ldots$@, @$g_n$@ -> @$e$@; _ -> @$e'$@ }@\\
1574
   &$=$@ case () of {@\\
1575
1576
   &@    () | @$g_1$@ -> @\ldots@ case () of {@\\
   &@                   () | @$g_n$@ -> @$e$@;@\\
1577
1578
   &@                   _ -> @$e'$@ } @\ldots\\
   &@    _ -> @$e'$@ }@\\
1579
1580
   &{\rm where $y$ is a new variable}\\[4pt]

1581
1582
(t)&@case () of { () | @$p$@ <- @$e_0$@ -> @$e$@; _ -> @$e'$@ }@\\
   &$=$@ case @$e_0$@ of { @$p$@ -> @$e$@; _ -> @$e'$@ }@\\
1583

1584
(u)&@case () of { () | let @$decls$@ -> @$e$@; _ -> @$e'$@ }@\\
1585
1586
   &$=$@ let @$decls$@ in @$e$\\[4pt]

1587
1588
(v)&@case () of { () | @$e_0$@ -> @$e$@; _ -> @$e'$@ }@\\
   &$=$@ if @$e_0$@ then @$e$@ else @$e'$\\[4pt]
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1589
\end{tabular}
Simon Marlow's avatar
Simon Marlow committed
1590
\end{haskellprime}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1591
}
1592
%**<div align=center> <h4>Figure 3.3</h4> </div>
Simon Marlow's avatar
Simon Marlow committed
1593
\ecaption{\hprime{Semantics of Case Expressions, Part 3}}
1594
\label{simple-case-expr-3}
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1595
1596
\end{figure}

1597
1598
1599
1600



In Figures~\ref{simple-case-expr-1}--\ref{simple-case-expr-3}:
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1601
"e", "e'" and "e_i" are expressions; 
1602
"g_i" and "gs_i" are guards and sequences of guards respecively;
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1603
1604
1605
"p" and "p_i" are patterns; 
"v", "x", and "x_i" are variables; 
"K" and "K'" are algebraic datatype (@data@) constructors (including
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1606
tuple constructors);  and "N" is a @newtype@ constructor\index{newtype declaration@@{\tt newtype} declaration}.
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
% For clarity, several rules are expressed using
% @let@ (used only in a non-recursive way); their usual purpose is to
% prevent name capture
% (e.g., in rule~(c)).  The rules may be re-expressed entirely with
% @cases@ by applying this identity:
% \[
% "@let @x@ = @y@ in @e@ @ =@  case @y@ of { @x@ -> @e@ }@"
% \]

%Using all but the last two identities (rules~(n) and~(o)) in Figure~\ref{simple-case-expr-2}
%in a left-to-right manner yields a translation into a
%subset of general @case@ expressions called {\em simple case expressions}.%
%\index{simple case expression}
Rule~(b) matches a general source-language
@case@ expression, regardless of whether it actually includes
1622
guards---if no guards are written, then @True@ is substituted for the guards "gs_{i,j}"
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
in the "match_i" forms.
Subsequent identities manipulate the resulting @case@ expression into simpler
and simpler forms.
%The semantics of simple @case@ expressions is 
%given by the last two identities ((n) and~(o)).

Rule~(h) in Figure~\ref{simple-case-expr-2} involves the
overloaded operator @==@; it is this rule that defines the
meaning of pattern matching against overloaded constants.
\index{pattern-matching!overloaded constant}

%% When used as a translation, the identities in
%% Figure~\ref{simple-case-expr} will generate a very inefficient
%% program.  This can be fixed by using further @case@ or @let@ 
%% expressions, but doing so 
%% would clutter the identities, which are intended only to convey the semantics.

1640
These identities all preserve the static semantics.  Rules~(d), (e), (j), and~(q)
Simon Peyton Jones's avatar
Simon Peyton Jones committed
1641
1642
1643
1644
1645
1646
1647
1648
1649
use a lambda rather than a @let@; this indicates that variables bound
by @case@ are monomorphically typed (Section~\ref{type-semantics}).
\index{monomorphic type variable}

%**~footer

% Local Variables: 
% mode: latex
% End: