exps.verb 58.4 KB
 Simon Peyton Jones committed Mar 28, 2001 1 %  Simon Peyton Jones committed Jan 13, 2003 2 % $Header: /home/cvs/root/haskell-report/report/exps.verb,v 1.20 2003/01/13 13:08:55 simonpj Exp$  Simon Peyton Jones committed Mar 28, 2001 3 4 5 6 7 8 9 % %*section 3 %**The Haskell 98 Report: Expressions %**~header \section{Expressions}\index{expression} \label{expressions}  Simon Peyton Jones committed Dec 10, 2002 10 In this chapter, we describe the syntax and informal semantics of  Simon Peyton Jones committed Sep 24, 2001 11 \Haskell{} {\em expressions}, including their translations into the  Simon Peyton Jones committed Mar 28, 2001 12 13 \Haskell{} kernel, where appropriate. Except in the case of @let@ expressions, these translations preserve both the static and dynamic  Simon Peyton Jones committed Sep 24, 2001 14 15 16 17 18 19 20 semantics. Free variables and constructors used in these translations always refer to entities defined by the @Prelude@. For example, @concatMap@'' used in the translation of list comprehensions (Section~\ref{list-comprehensions}) means the @concatMap@ defined by the @Prelude@, regardless of whether or not the identifier @concatMap@'' is in scope where the list comprehension is used, and (if it is in scope) what it is bound to.  Simon Peyton Jones committed Aug 14, 2001 21   Simon Peyton Jones committed Mar 28, 2001 22 @@@  Simon Marlow committed Apr 28, 2010 23 24 exp -> \hprime{infixexp @::@ [context @=>@] type} & (\tr{expression type signature}) | \hprime{infixexp}  Simon Peyton Jones committed Mar 28, 2001 25   Simon Marlow committed Apr 28, 2010 26 27 28 29 30 \hprime{infixexp} -> \hprime{lexp qop infixexp} & (\tr{infix operator application}) | \hprime{@-@ infixexp} & (\tr{prefix negation}) | \hprime{lexp} \hprime{lexp} -> @\@ apat_1 ... apat_n @->@ exp & (\tr{lambda abstraction}, n>=1)  Simon Peyton Jones committed Mar 28, 2001 31  | @let@ decls @in@ exp & ({\tr{let expression}})  Simon Marlow committed Apr 28, 2010 32  | @if@ exp \hprime{[@;@]} @then@ exp \hprime{[@;@]} @else@ exp & (\tr{conditional})  Simon Peyton Jones committed Mar 28, 2001 33 34 35 36 37 38 39 40 41 42 43 44 45  | @case@ exp @of@ @{@ alts @}@ & (\tr{case expression}) | @do@ @{@ stmts @}@ & (\tr{do expression}) | fexp fexp -> [fexp] aexp & (\tr{function application}) aexp -> qvar & (\tr{variable}) | gcon & (\tr{general constructor}) | literal | @(@ exp @)@ & (\tr{parenthesized expression}) | @(@ exp_1 @,@ ... @,@ exp_k @)@ & (\tr{tuple}, k>=2) | @[@ exp_1 @,@ ... @,@ exp_k @]@ & (\tr{list}, k>=1) | @[@ exp_1 [@,@ exp_2] @..@ [exp_3] @]@ & (\tr{arithmetic sequence}) | @[@ exp @|@ qual_1 @,@ ... @,@ qual_n @]@ & (\tr{list comprehension}, n>=1)  Simon Marlow committed Apr 28, 2010 46 47  | @(@ \hprime{infixexp qop} @)@ & (\tr{left section}) | @(@ \hprime{qop_{\langle@-@\rangle} infixexp} @)@ & (\tr{right section})  Simon Peyton Jones committed Mar 28, 2001 48  | qcon @{@ fbind_1 @,@ ... @,@ fbind_n @}@ & (\tr{labeled construction}, n>=0)  Simon Peyton Jones committed Nov 01, 2001 49  | aexp_{\langle{}qcon\rangle{}} @{@ fbind_1 @,@ ... @,@ fbind_n @}@ & (\tr{labeled update}, n >= 1)  Simon Peyton Jones committed Mar 28, 2001 50 51 52  @@@ \indexsyn{exp}%  Simon Marlow committed Apr 28, 2010 53 54 \indexsyn{infixexp}% \indexsyn{lexp}%  Simon Peyton Jones committed Mar 28, 2001 55 56 57 \indexsyn{aexp}% \indexsyn{fexp}%  Simon Peyton Jones committed Aug 23, 2001 58 % Removed Aug 2001: more misleading than helpful. SLPJ  Simon Peyton Jones committed Aug 20, 2001 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 % As an aid to understanding this grammar, % Table~\ref{syntax-precedences} shows the relative precedence of % expressions, patterns and definitions, plus an extended associativity. % "-" indicates that the item is non-associative. % % \begin{table}[tb] % $% \centerline{ % \begin{tabular}{|l|c|}\hline % Item & Associativity \\ % \hline % & \\ % simple terms, parenthesized terms & -- \\ % irrefutable patterns (@~@) & -- \\ % as-patterns ({\tt @@}) & right \\ % function application & left \\ % @do@, @if@, @let@, lambda(@\@), @case@ (leftwards)& right \\ % @case@ (rightwards) & right \\ % & \\ % infix operators, prec. 9 & as defined \\ % \ldots & \ldots \\ % infix operators, prec. 0 & as defined \\ % & \\ % function types (@->@) & right \\ % contexts (@=>@) & -- \\ % type constraints (@::@) & -- \\ % @do@, @if@, @let@, lambda(@\@) (rightwards) & right \\ % sequences (@..@) & -- \\ % generators (@<-@) & -- \\ % grouping (@,@) & n-ary \\ % guards (@|@) & -- \\ % case alternatives (@->@) & -- \\ % definitions (@=@) & -- \\ % separation (@;@) & n-ary \\ % \hline % \end{tabular} % } %$ % %**
% \ecaption{Precedence of expressions, patterns, definitions (highest to lowest)} % \label{syntax-precedences} % \end{table}  Simon Peyton Jones committed Mar 28, 2001 101 102 103 104 105  Expressions involving infix operators are disambiguated by the operator's fixity (see Section~\ref{fixity}). Consecutive unparenthesized operators with the same precedence must both be either left or right associative to avoid a syntax error.  Simon Marlow committed Apr 28, 2010 106 107 108 109 110 111 112 113 Given an unparenthesized expression "x qop^{(a,i)} y qop^{(b,j)} z"'' \hprime{(where "qop^{(a,i)}" means an operator with associativity "a" and precedence "i")}, parentheses must be added around either "x qop^{(a,i)} y"'' or "y qop^{(b,j)} z"'' when "i=j" unless "a=b={\rm l}" or "a=b={\rm r}". An example algorithm for resolving expressions involving infix operators is given in Section~\ref{fixity-resolution}.  Simon Peyton Jones committed Mar 28, 2001 114 115 116  Negation\index{negation} is the only prefix operator in \Haskell{}; it has the same precedence as the infix @-@ operator  Simon Peyton Jones committed Dec 02, 2002 117 118 119 120 121 122 defined in the Prelude (see Section~\ref{fixity}, Figure~\ref{prelude-fixities}). The grammar is ambiguous regarding the extent of lambda abstractions, let expressions, and conditionals. The ambiguity is resolved by the meta-rule that each of these constructs extends as far to the right as possible.  Simon Peyton Jones committed Mar 28, 2001 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143  % I can't make head or tail of this para, so % I'm just deleting it. SLPJ Dec 98 % The separation of function arrows from case alternatives solves % the ambiguity that otherwise arises when an unparenthesized % function type is used in an expression, such as the guard in a case % expression. Sample parses are shown below. $\bt{|l|l|}%%b \hline This & Parses as \\ \hline @f x + g y@ & @(f x) + (g y)@ \\ @- f x + y@ & @(- (f x)) + y@ \\ @let { ... } in x + y@ & @let { ... } in (x + y)@ \\ @z + let { ... } in x + y@ & @z + (let { ... } in (x + y))@ \\ @f x y :: Int@ & @(f x y) :: Int@ \\ @\ x -> a+b :: Int@ & @\ x -> ((a+b) :: Int@) \\ \hline\et$  Simon Marlow committed Apr 28, 2010 144 145 146 For the sake of clarity, the rest of this section will assume that expressions involving infix operators have been resolved according to the fixities of the operators.  Simon Peyton Jones committed Mar 28, 2001 147 148 149 150  \subsection{Errors} \label{basic-errors}\index{error} Errors during expression evaluation, denoted by "\bot"\index{"\bot"},  Simon Peyton Jones committed May 29, 2001 151 are indistinguishable by a Haskell program from non-termination. Since \Haskell{} is a  Simon Peyton Jones committed Dec 02, 2002 152 non-strict language, all \Haskell{} types include "\bot". That is, a value  Simon Peyton Jones committed Mar 28, 2001 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 of any type may be bound to a computation that, when demanded, results in an error. When evaluated, errors cause immediate program termination and cannot be caught by the user. The Prelude provides two functions to directly cause such errors: \bprog @ error :: String -> a undefined :: a @ \eprog \indextt{error} \indextt{undefined} A call to @error@ terminates execution of the program and returns an appropriate error indication to the operating system. It should also display the string in some system-dependent manner. When @undefined@ is used, the error message is created by the compiler. Translations of \Haskell{} expressions use @error@ and @undefined@ to explicitly indicate where execution time errors may occur. The actual program behavior when an error occurs is up to the implementation. The messages passed to the @error@ function in these translations are only suggestions; implementations may choose to display more or less information when an error occurs. \subsection{Variables, Constructors, Operators, and Literals}  Simon Peyton Jones committed Jan 29, 2002 180 \label{vars-and-lits}  Simon Peyton Jones committed Mar 28, 2001 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 % @@@ aexp -> qvar & (\tr{variable}) | gcon & (\tr{general constructor}) | literal @@@ \indexsyn{var}% \indexsyn{con}% \indexsyn{varop}% \indexsyn{conop}% \indexsyn{op}% @@@ gcon -> @()@ | @[]@ | @(,@\{@,@\}@)@ | qcon var -> varid | @(@ varsym @)@ & (\tr{variable}) qvar -> qvarid | @(@ qvarsym @)@ & (\tr{qualified variable}) con -> conid | @(@ consym @)@ & (\tr{constructor}) qcon -> qconid | @(@ gconsym @)@ & (\tr{qualified constructor}) varop -> varsym | \bkqB varid \bkqA & (\tr{variable operator}) qvarop -> qvarsym | \bkqB qvarid \bkqA & (\tr{qualified variable operator}) conop -> consym | \bkqB conid \bkqA & (\tr{constructor operator}) qconop -> gconsym | \bkqB qconid \bkqA & (\tr{qualified constructor operator}) op -> varop | conop & (\tr{operator}) qop -> qvarop | qconop & (\tr{qualified operator}) gconsym -> @:@ | qconsym @@@ \indexsyn{gcon}% \indexsyn{var}% \indexsyn{qvar}% \indexsyn{con}% \indexsyn{qcon}% \indexsyn{varop}% \indexsyn{qvarop}% \indexsyn{conop}% \indexsyn{qconop}% \indexsyn{qop}% \indexsyn{gconsym}%  Simon Peyton Jones committed Sep 10, 2001 222 \Haskell{} provides special syntax to support infix notation.  Simon Peyton Jones committed Sep 11, 2001 223 An {\em operator} is a function that can be applied using infix  Simon Peyton Jones committed Sep 10, 2001 224 syntax (Section~\ref{operators}), or partially applied using a  Simon Peyton Jones committed Sep 11, 2001 225 {\em section} (Section~\ref{sections}).  Simon Peyton Jones committed Sep 10, 2001 226   Simon Peyton Jones committed Sep 11, 2001 227 An {\em operator} is either an {\em operator symbol}, such as @+@ or @@,  Simon Peyton Jones committed Sep 10, 2001 228 229 230 231 or is an ordinary identifier enclosed in grave accents (backquotes), such as \bkqB@op@\bkqA. For example, instead of writing the prefix application @op x y@, one can write the infix application \mbox{@x@ \bkqB@op@\bkqA@ y@}. If no fixity\index{fixity}  Simon Peyton Jones committed Sep 11, 2001 232 declaration is given for \bkqB@op@\bkqA{} then it defaults  Simon Peyton Jones committed Mar 28, 2001 233 234 235 to highest precedence and left associativity (see Section~\ref{fixity}).  Simon Peyton Jones committed Sep 10, 2001 236 237 238 Dually, an operator symbol can be converted to an ordinary identifier by enclosing it in parentheses. For example, @(+) x y@ is equivalent to @x + y@, and @foldr (*) 1 xs@ is equivalent to @foldr (\x y -> x*y) 1 xs@.  Simon Peyton Jones committed Mar 28, 2001 239   Simon Peyton Jones committed Aug 23, 2001 240 241 242 243 244 245 246 247 248 249 250 % This para is covered by Section 2.4 and 5.5.1 % A qualified name may only be used to refer to a variable or % constructor imported from another module (see Section~\ref{import}), or % defined at the top level, % but not in the definition of a new variable or constructor. Thus % \bprog % let F.x = 1 in F.x -- invalid % \eprog % incorrectly uses a qualifier in the definition of @x@, regardless of % the module containing this definition. Qualification does not affect % the nature of an operator: @F.+@ is an infix operator just as @+@ is.  Simon Peyton Jones committed Mar 28, 2001 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288  Special syntax is used to name some constructors for some of the built-in types, as found in the production for "gcon" and "literal". These are described in Section~\ref{basic-types}. \index{number!translation of literals} An integer literal represents the application of the function @fromInteger@\indextt{fromInteger} to the appropriate value of type @Integer@. Similarly, a floating point literal stands for an application of @fromRational@\indextt{fromRational} to a value of type @Rational@ (that is, @Ratio Integer@). \outline{ \paragraph*{Translation:} The integer literal "i" is equivalent to @fromInteger@ "i", where @fromInteger@ is a method in class @Num@ (see Section \ref{numeric-literals}).\indexclass{Num} The floating point literal "f" is equivalent to @fromRational@ ("n" @Ratio.%@ "d"), where @fromRational@ is a method in class @Fractional@ and @Ratio.%@ constructs a rational from two integers, as defined in the @Ratio@ library.\indexclass{Fractional} The integers "n" and "d" are chosen so that "n/d = f". } \subsection{Curried Applications and Lambda Abstractions} \label{applications} \label{lambda-abstractions} \index{lambda abstraction} \index{application} %\index{function application|see{application}} % @@@ fexp -> [fexp] aexp & (\tr{function application})  Simon Peyton Jones committed Aug 20, 2001 289 exp -> @\@ apat_1 ... apat_n @->@ exp & (\tr{lambda abstraction}, n>=1)  Simon Peyton Jones committed Mar 28, 2001 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 @@@ \indexsyn{exp}% \indexsyn{fexp}% \noindent {\em Function application}\index{application} is written "e_1 e_2". Application associates to the left, so the parentheses may be omitted in @(f x) y@. Because "e_1" could be a data constructor, partial applications of data constructors are allowed. {\em Lambda abstractions} are written "@\@ p_1 ... p_n @->@ e", where the "p_i" are {\em patterns}. An expression such as @\x:xs->x@ is syntactically incorrect; it may legally be written as @\(x:xs)->x@. The set of patterns must be {\em linear}\index{linearity}% \index{linear pattern}---no variable may appear more than once in the set.  Simon Peyton Jones committed Dec 02, 2002 309 \outline{\small  Simon Peyton Jones committed Mar 28, 2001 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 \paragraph*{Translation:} The following identity holds: \begin{center} \bt{lcl}% \struthack{17pt}% "@\@ p_1 ... p_n @->@ e" & "=" & "@\@ x_1 ... x_n @-> case (@x_1@,@ ...@,@ x_n@) of (@p_1@,@ ...@,@ p_n@) ->@ e" \et \end{center} where the "x_i" are new identifiers. } Given this translation combined with the semantics of case expressions and pattern matching described in Section~\ref{case-semantics}, if the pattern fails to match, then the result is "\bot". \subsection{Operator Applications} \index{operator application}  Simon Peyton Jones committed Aug 31, 2001 330 %\index{operator application|hseealso{application}}  Simon Peyton Jones committed Mar 28, 2001 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 \label{operators} % @@@ exp -> exp_1 qop exp_2 | @-@ exp & (\tr{prefix negation}) qop -> qvarop | qconop & (\tr{qualified operator}) @@@ \indexsyn{exp}% \indexsyn{qop}% \noindent The form "e_1 qop e_2" is the infix application of binary operator\index{operator} "qop" to expressions "e_1" and "e_2". The special form "@-@e" denotes prefix negation\index{negation}, the only prefix operator in \Haskell{}, and is syntax for "@negate @(e)".\indextt{negate} The binary @-@ operator does not necessarily refer to the definition of @-@ in the Prelude; it may be rebound by the module system. However, unary @-@ will always refer to the @negate@ function defined in the Prelude. There is no link between the local meaning of the @-@ operator and unary negation. Prefix negation has the same precedence as the infix operator @-@ defined in the Prelude (see  Simon Marlow committed Apr 29, 2010 357 Table~\ref{prelude-fixities}). Because @e1-e2@ parses as an  Simon Peyton Jones committed Mar 28, 2001 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 infix application of the binary operator @-@, one must write @e1(-e2)@ for the alternative parsing. Similarly, @(-)@ is syntax for @(\ x y -> x-y)@, as with any infix operator, and does not denote @(\ x -> -x)@---one must use @negate@ for that. \outline{ \paragraph*{Translation:} The following identities hold: \begin{center} \bt{lcl}% \struthack{17pt}% "e_1 op e_2" & "=" & "@(@op@)@ e_1 e_2" \\ "@-@e" & "=" & "@negate@ (e)" \et \end{center} } \subsection{Sections} \index{section}  Simon Peyton Jones committed Aug 31, 2001 377 %\index{section|hseealso{operator application}}  Simon Peyton Jones committed Mar 28, 2001 378 379 380 \label{sections} % @@@  Simon Marlow committed Apr 28, 2010 381 382  | @(@ \hprime{infixexp qop} @)@ & (\tr{left section}) | @(@ \hprime{qop_{\langle@-@\rangle} infixexp} @)@ & (\tr{right section})  Simon Peyton Jones committed Mar 28, 2001 383 @@@  Simon Marlow committed Apr 28, 2010 384 385 \indexsyn{infixexp}% \indexsyn{qop}%  Simon Peyton Jones committed Mar 28, 2001 386 387 388 389 390 391 392 393 394 395 396 397 398  \noindent {\em Sections} are written as "@(@ op e @)@" or "@(@ e op @)@", where "op" is a binary operator and "e" is an expression. Sections are a convenient syntax for partial application of binary operators. Syntactic precedence rules apply to sections as follows. "@(@op~e@)@" is legal if and only if "@(x@~op~e@)@" parses in the same way as "@(x@~op~@(@e@))@"; and similarly for "@(@e~op@)@". For example, @(*a+b)@ is syntactically invalid, but @(+a*b)@ and @(*(a+b))@ are valid. Because @(+)@ is left associative, @(a+b+)@ is syntactically correct, but @(+a+b)@ is not; the latter may legally be written as @(+(a+b))@.  Simon Peyton Jones committed Dec 02, 2002 399 400 401 As another example, the expression \bprog @  Simon Peyton Jones committed Dec 10, 2002 402  (let n = 10 in n +)  Simon Peyton Jones committed Dec 02, 2002 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 @ \eprog is invalid because, by the let/lambda meta-rule (Section~\ref{expressions}), the expression \bprog @ (let n = 10 in n + x) @ \eprog parses as \bprog @ (let n = 10 in (n + x)) @ \eprog rather than \bprog @ ((let n = 10 in n) + x) @ \eprog % This observation makes it easier to implement the let/lambda meta-rule % (Section~\ref{expressions}) because once the operator has been seen it is clear that any % legal parse must include the operator in the body of the @let@.  Simon Peyton Jones committed Mar 28, 2001 427 428 429 430 431 432 433 434 435  Because @-@ is treated specially in the grammar, "@(-@ exp@)@" is not a section, but an application of prefix negation,\index{negation} as described in the preceding section. However, there is a @subtract@ function defined in the Prelude such that "@(subtract@ exp@)@" is equivalent to the disallowed section. The expression "@(+ (-@ exp@))@" can serve the same purpose.  Simon Peyton Jones committed Dec 02, 2002 436   Simon Peyton Jones committed Mar 28, 2001 437 % Changed to allow postfix operators. That is, in (op x), we no  Simon Peyton Jones committed Jun 18, 2001 438 % longer add a \x -> which would require op to be binary instead  Simon Peyton Jones committed Mar 28, 2001 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 % of unary. \outline{ \paragraph*{Translation:} The following identities hold: \begin{center} \bt{lcl}% \struthack{17pt}% "@(@op e@)@" & "=" & "@\@ x @->@ x op e" \\ "@(@e op@)@" & "=" & "@\@ x @->@ e op x" \et \end{center} where "op" is a binary operator, "e" is an expression, and "x" is a variable that does not occur free in "e". } \subsection{Conditionals} \label{conditionals}\index{conditional expression} % @@@  Simon Marlow committed Mar 20, 2008 459 exp -> @if@ exp \hprime{[@;@]} @then@ exp \hprime{[@;@]} @else@ exp  Simon Peyton Jones committed Mar 28, 2001 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 @@@ \indexsyn{exp}% %\indextt{if ... then ... else ...} A {\em conditional expression} \index{conditional expression} has the form "@if@ e_1 @then@ e_2 @else@ e_3" and returns the value of "e_2" if the value of "e_1" is @True@, "e_3" if "e_1" is @False@, and "\bot" otherwise. \outline{ \paragraph*{Translation:} The following identity holds: \begin{center} \bt{lcl}% \struthack{17pt}% "@if@ e_1 @then@ e_2 @else@ e_3" & "=" & "@case@ e_1 @of { True ->@ e_2 @; False ->@ e_3 @}@" \et \end{center} where @True@ and @False@ are the two nullary constructors from the type @Bool@, as defined in the Prelude. The type of "e_1" must be @Bool@; "e_2" and "e_3" must have the same type, which is also the type of the entire conditional expression. } \subsection{Lists} \label{lists} % @@@ exp -> exp_1 qop exp_2 aexp -> @[@ exp_1 @,@ ... @,@ exp_k @]@ & (k>=1) | gcon gcon -> @[]@ | qcon qcon -> @(@ gconsym @)@ qop -> qconop qconop -> gconsym gconsym -> @:@ @@@ \indexsyn{aexp}% {\em Lists}\index{list} are written "@[@e_1@,@ ...@,@ e_k@]@", where "k>=1". The list constructor is @:@, and the empty list is denoted @[]@. Standard operations on lists are given in the Prelude (see Section~\ref{basic-lists}, and  Simon Peyton Jones committed Dec 10, 2002 506 Chapter~\ref{stdprelude} notably Section~\ref{preludelist}).  Simon Peyton Jones committed Mar 28, 2001 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523  \outline{ \paragraph*{Translation:} The following identity holds: \begin{center} \bt{lcl}% \struthack{17pt}% "@[@e_1@,@ ...@,@ e_k@]@" & "=" & "e_1 @: (@e_2 @: (@ ... @(@e_k @: [])))@" \et \end{center} where @:@ and @[]@ are constructors for lists, as defined in the Prelude (see Section~\ref{basic-lists}). The types of "e_1" through "e_k" must all be the same (call it "t\/"), and the type of the overall expression is @[@"t"@]@ (see Section~\ref{type-syntax}). } The constructor @:@'' is reserved solely for list construction; like @[]@, it is considered part of the language syntax, and cannot be hidden or redefined.  Simon Peyton Jones committed Sep 24, 2001 524 It is a right-associative operator, with precedence level 5 (Section~\ref{fixity}).  Simon Peyton Jones committed Mar 28, 2001 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541  \subsection{Tuples} \label{tuples} % @@@ aexp -> @(@ exp_1 @,@ ... @,@ exp_k @)@ & (k>=2) | qcon qcon -> @(,@\{@,@\}@)@ @@@ \indexsyn{aexp}% {\em Tuples}\index{tuple} are written "@(@e_1@,@ ...@,@ e_k@)@", and may be of arbitrary length "k>=2". The constructor for an "n"-tuple is denoted by @(,@\ldots@,)@, where there are "n-1" commas. Thus @(a,b,c)@ and @(,,) a b c@ denote the same value. Standard operations on tuples are given  Simon Peyton Jones committed Dec 10, 2002 542 in the Prelude (see Section~\ref{basic-tuples} and Chapter~\ref{stdprelude}).  Simon Peyton Jones committed Mar 28, 2001 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567  \outline{ \paragraph*{Translation:} "@(@e_1@,@ ...@,@ e_k@)@" for "k\geq2" is an instance of a "k"-tuple as defined in the Prelude, and requires no translation. If "t_1" through "t_k" are the types of "e_1" through "e_k", respectively, then the type of the resulting tuple is "@(@t_1@,@ ...@,@ t_k@)@" (see Section~\ref{type-syntax}). } \subsection{Unit Expressions and Parenthesized Expressions} \label{unit-expression} \index{unit expression} % @@@ aexp -> gcon | @(@ exp @)@ gcon -> @()@ @@@ \indexsyn{aexp}% \noindent The form "@(@e@)@" is simply a {\em parenthesized expression}, and is equivalent to "e". The {\em unit expression} @()@ has type @()@\index{trivial type} (see  Simon Peyton Jones committed Aug 20, 2001 568 569 570 Section~\ref{type-syntax}). It is the only member of that type apart from $\bot$, and can be thought of as the nullary tuple'' (see Section~\ref{basic-trivial}).  Simon Peyton Jones committed Mar 28, 2001 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 \nopagebreak[4] \outline{ \paragraph{Translation:} "@(@e@)@" is equivalent to "e". } \subsection{Arithmetic Sequences} \label{arithmetic-sequences} % @@@ aexp -> @[@ exp_1 [@,@ exp_2] @..@ [exp_3] @]@ @@@ \indexsyn{aexp}% \noindent The {\em arithmetic sequence}\index{arithmetic sequence} "@[@e_1@,@ e_2 @..@ e_3@]@" denotes a list of values of type "t", where each of the "e_i" has type "t", and "t" is an instance of class @Enum@. \outline{ \paragraph{Translation:} Arithmetic sequences satisfy these identities: \begin{center} \begin{tabular}{lcl}% \struthack{17pt}% @[ @"e_1"@.. ]@ & "=" & @enumFrom@ "e_1" \\ @[ @"e_1"@,@"e_2"@.. ]@ & "=" & @enumFromThen@ "e_1" "e_2" \\ @[ @"e_1"@..@"e_3"@ ]@ & "=" & @enumFromTo@ "e_1" "e_3" \\ @[ @"e_1"@,@"e_2"@..@"e_3"@ ]@ & "=" & @enumFromThenTo@ "e_1" "e_2" "e_3" \end{tabular} \end{center} where @enumFrom@, @enumFromThen@, @enumFromTo@, and @enumFromThenTo@ are class methods in the class @Enum@ as defined in the Prelude  Simon Marlow committed Apr 29, 2010 613 (see Figure~\ref{standard-classes}).  Simon Peyton Jones committed Mar 28, 2001 614 615 616 } The semantics of arithmetic sequences therefore depends entirely  Simon Peyton Jones committed Nov 02, 2001 617 on the instance declaration for the type "t".  Simon Peyton Jones committed Dec 21, 2001 618 See Section~\ref{enum-class} for more details of which @Prelude@  Simon Peyton Jones committed Dec 02, 2002 619 types are in @Enum@ and their semantics.  Simon Peyton Jones committed Mar 28, 2001 620   Isaac Potoczny-Jones committed Jan 08, 2007 621 622 623 624 625 626 627 \subsection{List Comprehensions} \index{list comprehension} \index{let expression!in list comprehensions} \label{list-comprehensions} % @@@ aexp -> @[@ exp @|@ qual_1 @,@ ... @,@ qual_n @]@ & (\tr{list comprehension}, n>=1)  Isaac Potoczny-Jones committed Jan 12, 2007 628 629 630 qual -> pat @<-@ exp & (\tr{generator}) | @let@ decls & (\tr{local declaration}) | exp & (\tr{boolean guard})  Isaac Potoczny-Jones committed Jan 08, 2007 631 632 @@@ \indexsyn{aexp}  Isaac Potoczny-Jones committed Jan 12, 2007 633 \indexsyn{qual}  Isaac Potoczny-Jones committed Jan 08, 2007 634 635  \noindent  Isaac Potoczny-Jones committed Jan 12, 2007 636 637 638 639 640 641 642 643 644 645 646 647 A {\em list comprehension} has the form "@[@ e @|@ q_1@,@ ...@,@ q_n @]@, n>=1," where the "q_i" qualifiers\index{qualifier} are either \begin{itemize} \item {\em generators}\index{generator} of the form "p @<-@ e", where "p" is a pattern (see Section~\ref{pattern-matching}) of type "t" and "e" is an expression of type "@[@t@]@" \item {\em boolean guards},\index{boolean guard} which are arbitrary expressions of type @Bool@ \item {\em local bindings} that provide new definitions for use in the generated expression "e" or subsequent boolean guards and generators. \end{itemize}  Isaac Potoczny-Jones committed Jan 08, 2007 648   Isaac Potoczny-Jones committed Jan 12, 2007 649 650 651 652 653 654 Such a list comprehension returns the list of elements produced by evaluating "e" in the successive environments created by the nested, depth-first evaluation of the generators in the qualifier list. Binding of variables occurs according to the normal pattern matching rules (see Section~\ref{pattern-matching}), and if a match fails then that element of the list is simply skipped over. Thus:\nopagebreak[4]  Simon Peyton Jones committed Mar 28, 2001 655 656 657 658 659 660 \bprog @ [ x | xs <- [ [(1,2),(3,4)], [(5,4),(3,2)] ], (3,x) <- xs ] @ \eprog  Isaac Potoczny-Jones committed Jan 12, 2007 661 yields the list @[4,2]@. If a qualifier is a boolen guard, it must evaluate  Simon Peyton Jones committed Mar 28, 2001 662 663 664 665 666 667 668 669 670 671 672 673 674 to @True@ for the previous pattern match to succeed. As usual, bindings in list comprehensions can shadow those in outer scopes; for example: $\ba{lll} @[ x | x <- x, x <- x ]@ & = & @[ z | y <- x, z <- y]@ \\ \ea$ \outline{ \paragraph{Translation:} List comprehensions satisfy these identities, which may be used as a translation into the kernel: \begin{center} \bt{lcl}% \struthack{17pt}%  Simon Peyton Jones committed Aug 14, 2001 675 676 "@[ @ e@ | True ]@" & = & "@[@e@]@" \\ "@[ @ e@ | @q@ ]@" & = & "@[@~ e~@|@~q@, True ]@" \\  Simon Peyton Jones committed Mar 28, 2001 677 678 679 680 681 "@[ @ e@ | @b@,@~ Q ~@]@" & = & "@if@~b~@then@~@[ @ e@ | @Q@ ]@~@else []@" \\ "@[ @ e@ | @p @<-@ l@,@~ Q@ ]@" & = & "@let ok@~p~@=@~@[ @ e@ | @Q@ ]@" \\ && @ ok _ = []@ \\  Simon Peyton Jones committed Aug 14, 2001 682 && "@in concatMap ok@~ l" \\  Simon Peyton Jones committed Mar 28, 2001 683 684 685 686 "@[ @ e@ | let@~decls@,@~ Q@ ]@" & = & "@let@~decls~@in@~@[ @ e@ | @Q@ ]@" \et \end{center}  Simon Peyton Jones committed May 29, 2001 687 688 689 690 691 where "e" ranges over expressions, "p" over patterns, "l" over list-valued expressions, "b" over boolean expressions, "decls" over declaration lists, "q" over qualifiers, and "Q" over sequences of qualifiers. "@ok@" is a fresh variable. The function @concatMap@, and boolean value @True@, are defined in the Prelude.  Simon Peyton Jones committed Mar 28, 2001 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 } As indicated by the translation of list comprehensions, variables bound by @let@ have fully polymorphic types while those defined by @<-@ are lambda bound and are thus monomorphic (see Section \ref{monomorphism}). \subsection{Let Expressions} \index{let expression} \label{let-expressions} % % Including this syntax blurb does REALLY bad things to page breaking % in the 1.[12] report (sigh); ToDo: hope it goes away. @@@ exp -> @let@ decls @in@ exp @@@ \indexsyn{exp} \index{declaration!within a {\tt let} expression} \noindent {\em Let expressions} have the general form "@let {@ d_1 @;@ ... @;@ d_n @} in@ e", and introduce a nested, lexically-scoped, mutually-recursive list of declarations (@let@ is often called @letrec@ in other languages). The scope of the declarations is the expression "e" and the right hand side of the declarations. Declarations are  Simon Peyton Jones committed Dec 10, 2002 719 described in Chapter~\ref{declarations}. Pattern bindings are matched  Simon Peyton Jones committed Mar 28, 2001 720 721 722 723 724 725 726 727 728 729 lazily; an implicit @~@ makes these patterns irrefutable.\index{irrefutable pattern} For example, \bprog @ let (x,y) = undefined in @"e"@ @ \eprog does not cause an execution-time error until @x@ or @y@ is evaluated.  Simon Peyton Jones committed Dec 02, 2002 730 \outline{\small  Simon Peyton Jones committed Mar 28, 2001 731 732 733 734 735 736 737 738 739 740 741 \paragraph*{Translation:} The dynamic semantics of the expression "@let {@ d_1 @;@ ... @;@ d_n @} in@ e_0" are captured by this translation: After removing all type signatures, each declaration "d_i" is translated into an equation of the form "p_i @=@ e_i", where "p_i" and "e_i" are patterns and expressions respectively, using the translation in Section~\ref{function-bindings}. Once done, these identities hold, which may be used as a translation into the kernel: \begin{center} \bt{lcl}% \struthack{17pt}%  Simon Peyton Jones committed Dec 21, 2001 742 @let {@"p_1"@=@"e_1"@; @ ... @; @"p_n"@=@"e_n"@} in@ "e_0"  Simon Peyton Jones committed May 29, 2001 743  &=& @let (~@"p_1"@,@ ... @,~@"p_n"@) = (@"e_1"@,@ ... @,@"e_n"@) in@ "e_0" \\  Simon Peyton Jones committed Mar 28, 2001 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 @let @"p"@ = @"e_1" @ in @ "e_0" &=& @case @"e_1"@ of ~@"p"@ -> @"e_0" \\ & & {\rm where no variable in "p" appears free in "e_1"} \\ @let @"p"@ = @"e_1" @ in @ "e_0" &=& @let @"p"@ = fix ( \ ~@"p"@ -> @"e_1"@) in@ "e_0" \et \end{center} where @fix@ is the least fixpoint operator. Note the use of the irrefutable patterns "@~@p". This translation does not preserve the static semantics because the use of @case@ precludes a fully polymorphic typing of the bound variables. %\folks{Needs work -- KH} % This same semantics applies to the top-level of %a program that has been translated into a @let@ expression, % as described at the beginning of Section~\ref{modules}. The static semantics of the bindings in a @let@ expression are described in Section~\ref{pattern-bindings}. } \subsection{Case Expressions} \label{case} % @@@ exp -> @case@ exp @of@ @{@ alts @}@  Simon Peyton Jones committed Jan 13, 2003 769 alts -> alt_1 @;@ ... @;@ alt_n & (n>=1)  Simon Peyton Jones committed Mar 28, 2001 770 771 772 773 alt -> pat @->@ exp [@where@ decls] | pat gdpat [@where@ decls] | & (empty alternative)  Simon Marlow committed Mar 20, 2008 774 775 gdpat -> \hprime{guards} @->@ exp [ gdpat ] \hprime{guards} -> \hprime{@|@ guard_1, ..., guard_n} & \hprime{(n>=1)}  Simon Marlow committed Apr 28, 2010 776 \hprime{guard} -> \hprime{pat @<-@ infixexp} & (\hprime{\tr{pattern guard}})  Simon Marlow committed Mar 20, 2008 777  | \hprime{@let@ decls} & (\hprime{\tr{local declaration}})  Simon Marlow committed Apr 28, 2010 778  | infixexp & (\tr{boolean guard})  Simon Peyton Jones committed Mar 28, 2001 779 780 781 782 783 @@@ \indexsyn{exp}% \indexsyn{alts}% \indexsyn{alt}% \indexsyn{gdpat}%  Isaac Potoczny-Jones committed Jan 12, 2007 784 785 786 \indexsyn{guards}% \indexsyn{guard}%  Simon Peyton Jones committed Mar 28, 2001 787 788 789 790 791 792 793  A {\em case expression}\index{case expression} has the general form $"@case@ e @of { @p_1 match_1 @;@ ... @;@ p_n match_n @}@"$ where each "match_i" is of the general form $\ba{lll}  Simon Marlow committed Mar 20, 2008 794  & "@|@ \hprime{gs_{i1}}" & "@->@ e_{i1}" \\  Simon Peyton Jones committed Mar 28, 2001 795  & "..." \\  Simon Marlow committed Mar 20, 2008 796  & "@|@ \hprime{gs_{im_i}}" & "@->@ e_{im_i}" \\  Simon Peyton Jones committed Mar 28, 2001 797 798  & \multicolumn{2}{l}{"@where@ decls_i"} \ea$  Isaac Potoczny-Jones committed Jan 12, 2007 799 (Notice that in the syntax rule for "guards", the @|@'' is a  Simon Peyton Jones committed Aug 20, 2001 800 terminal symbol, not the syntactic metasymbol for alternation.)  Simon Peyton Jones committed Mar 28, 2001 801 Each alternative "p_i match_i" consists of a  Simon Peyton Jones committed Aug 20, 2001 802 803 pattern\index{pattern} "p_i" and its matches, "match_i". Each match in turn  Isaac Potoczny-Jones committed Jan 12, 2007 804 805 806 807 808 consists of a sequence of pairs of guards\index{guard} "gs_{ij}" and bodies "e_{ij}" (expressions), followed by optional bindings ("decls_i") that scope over all of the guards and expressions of the alternative.  Simon Marlow committed Mar 20, 2008 809 \begin{haskellprime}  Isaac Potoczny-Jones committed Jan 12, 2007 810 811 812 813 814 815 816 \index{Pattern Guards} \index{guards} A {\em guard}\index{guard} has one of the following forms: \begin{itemize} \item {\em pattern guards}\index{pattern guard} are of the form "p @<-@ e", where "p" is a pattern (see Section~\ref{pattern-matching}) of type "t" and "e" is an  ravi@bluespec.com committed Jan 19, 2007 817 expression type "t"\footnote{Note that the syntax of a pattern guard is the same as that of a generator in a list comprehension.  Simon Marlow committed Apr 28, 2010 818 The contextual difference is that, in a list comprehension, a pattern of type "t" goes with an expression of type "@[@t@]@".}.  ravi@bluespec.com committed Jan 19, 2007 819 They succeed if the expression "e" matches the pattern "p", and introduce the bindings of the pattern to the environment.  Simon Marlow committed Apr 28, 2010 820 \item {\em local bindings} are of the form "@let @decls". They always succeed, and they introduce the names defined in "decls" to the environment.  Isaac Potoczny-Jones committed Jan 12, 2007 821 822 823 \item {\em boolean guards}\index{boolean guard} are arbitrary expressions of type @Bool@. They succeed if the expression evaluates to @True@, and they do not introduce new names to the environment. A boolean guard, "g", is semantically equivalent to the pattern guard "@True <- @g". \end{itemize}  Simon Marlow committed Mar 20, 2008 824 \end{haskellprime}  Isaac Potoczny-Jones committed Jan 12, 2007 825 826 827  An alternative of the form  Simon Peyton Jones committed Mar 28, 2001 828 829 830 831 832 833 834 835 836 837 838 839 840 841 $"pat @->@ exp @where@ decls"$ is treated as shorthand for: $\ba{lll} & "pat @| True@" & "@->@ exp" \\ & \multicolumn{2}{l}{"@where@ decls"} \ea$ A case expression must have at least one alternative and each alternative must have at least one body. Each body must have the same type, and the type of the whole expression is that type. A case expression is evaluated by pattern matching the expression "e"  Simon Peyton Jones committed Aug 23, 2001 842 against the individual alternatives. The alternatives are tried  Isaac Potoczny-Jones committed Jan 12, 2007 843 844 845 sequentially, from top to bottom. If "e" matches the pattern of an alternative, then the guarded expressions for that alternative are tried sequentially from top to bottom in the environment of the case  Isaac Potoczny-Jones committed Jan 08, 2007 846 expression extended first by the bindings created during the matching  Isaac Potoczny-Jones committed Jan 12, 2007 847 848 849 of the pattern, and then by the "decls_i" in the @where@ clause associated with that alternative.  Simon Marlow committed Mar 20, 2008 850 \begin{haskellprime}  Isaac Potoczny-Jones committed Jan 12, 2007 851 852 853 854 855 856 857 858 For each guarded expression, the comma-separated guards are tried sequentially from left to right. If all of them succeed, then the corresponding expression is evaluated in the environment extended with the bindings introduced by the guards. That is, the bindings that are introduced by a guard (either by using a let clause or a pattern guard) are in scope in the following guards and the corresponding expression. If any of the guards fail, then this guarded expression fails and the next guarded expression is tried.  Simon Marlow committed Mar 20, 2008 859 \end{haskellprime}  Isaac Potoczny-Jones committed Jan 12, 2007 860 861 862 863  If none of the guarded expressions for a given alternative succeed, then matching continues with the next alternative. If no alternative succeeds, then the result is "\bot". Pattern matching is described in  Isaac Potoczny-Jones committed Jan 08, 2007 864 865 Section~\ref{pattern-matching}, with the formal semantics of case expressions in Section~\ref{case-semantics}.  Simon Peyton Jones committed Mar 28, 2001 866   Simon Peyton Jones committed Jan 13, 2003 867 {\em A note about parsing.} The expression  Simon Peyton Jones committed Dec 02, 2002 868 869 870 871 872 873 874 875 876 877 878 879 880 881 \bprog @ case x of { (a,_) | let b = not a in b :: Bool -> a } @ \eprog is tricky to parse correctly. It has a single unambiguous parse, namely \bprog @ case x of { (a,_) | (let b = not a in b :: Bool) -> a } @ \eprog However, the phrase "@Bool -> a@" is syntactically valid as a type, and parsers with limited lookahead may incorrectly commit to this choice, and hence reject the program. Programmers are advised, therefore, to avoid guards that  Isaac Potoczny-Jones committed Jan 12, 2007 882 end with a type signature --- indeed that is why a "guard" contains  Simon Marlow committed Apr 28, 2010 883 an "infixexp" not an "exp".  Simon Peyton Jones committed Dec 02, 2002 884   Simon Peyton Jones committed Mar 28, 2001 885 886 887 888 889 890 891 892 \subsection{Do Expressions} \index{do expression} \label{do-expressions} \index{let expression!in do expressions} \index{monad} % @@@ exp -> @do@ @{@ stmts @}@ & (\tr{do expression})  Simon Peyton Jones committed Dec 21, 2001 893 stmts -> stmt_1 ... stmt_n exp [@;@] & (n>=0)  Simon Peyton Jones committed May 29, 2001 894 895 896 stmt -> exp @;@ | pat @<-@ exp @;@ | @let@ decls @;@  Simon Peyton Jones committed Jun 18, 2001 897  | @;@ & (empty statement)  Simon Peyton Jones committed Mar 28, 2001 898 @@@  Simon Peyton Jones committed Dec 21, 2001 899 900 901 902 903 % The semicolons are done differently than for decls % Reason: to do it the same way would mean: % stmts -> stmt1 ; ... ; stmtn ; exp [;] % Now, what happens if n=0? Is there a ';' before the exp or not? % Putting the ';' on the end of the stmt makes that clear.  Simon Peyton Jones committed Mar 28, 2001 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 \indexsyn{exp}% \indexsyn{stmt}% \indexsyn{stmts}% A {\em do expression} provides a more conventional syntax for monadic programming. It allows an expression such as \bprog @ putStr "x: " >> getLine >>= \l -> return (words l) @ \eprog to be written in a more traditional way as: \bprog @ do putStr "x: " l <- getLine return (words l) @ \eprog \outline{ \paragraph*{Translation:} Do expressions satisfy these identities, which may be used as a translation into the kernel, after eliminating empty "stmts": \begin{center} \bt{lcl}% \struthack{17pt}% @do {@"e"@}@ &=& "e"\\ @do {@"e"@;@"stmts"@}@ &=& "e" @>> do {@"stmts"@}@ \\ @do {@"p"@ <- @"e"@; @"stmts"@}@ &=& @let ok @"p"@ = do {@"stmts"@}@\\ & & @ ok _ = fail "..."@\\ & & @ in @"e"@ >>= ok@ \\ @do {let@ "decls"@; @"stmts"@}@ &=& @let@ "decls" @in do {@"stmts"@}@\\ \et \end{center} The ellipsis @"..."@ stands for a compiler-generated error message, passed to @fail@, preferably giving some indication of the location of the pattern-match failure; the functions @>>@, @>>=@, and @fail@ are operations in the class @Monad@, as defined in the Prelude\indexclass{Monad}; and @ok@ is a fresh identifier. } As indicated by the translation of @do@, variables bound by @let@ have fully polymorphic types while those defined by @<-@ are lambda bound and are thus monomorphic. \subsection{Datatypes with Field Labels} \label{field-ops} \index{data declaration@@{\tt data} declaration} \index{label} \index{field label|see{label}}  Simon Peyton Jones committed Dec 02, 2002 956 957 A datatype declaration may optionally define field labels (see Section~\ref{datatype-decls}).  Simon Peyton Jones committed Mar 28, 2001 958 959 960 961 962 963 These field labels can be used to construct, select from, and update fields in a manner that is independent of the overall structure of the datatype. Different datatypes cannot share common field labels in the same scope. A field label can be used at most once in a constructor.  Simon Peyton Jones committed May 29, 2001 964 Within a datatype, however, a field label can be used in more  Simon Peyton Jones committed Mar 28, 2001 965 than one constructor provided the field has the same typing in all  Simon Peyton Jones committed Aug 20, 2001 966 constructors. To illustrate the last point, consider:  Simon Peyton Jones committed Dec 21, 2001 967 \bprog  Simon Peyton Jones committed Aug 20, 2001 968 969 970 971 @ data S = S1 { x :: Int } | S2 { x :: Int } -- OK data T = T1 { y :: Int } | T2 { y :: Bool } -- BAD @  Simon Peyton Jones committed Dec 21, 2001 972 \eprog  Simon Peyton Jones committed Aug 20, 2001 973 974 975 Here @S@ is legal but @T@ is not, because @y@ is given inconsistent typings in the latter.  Simon Peyton Jones committed Mar 28, 2001 976 977 978 979 980 981 \subsubsection{Field Selection} @@@ aexp -> qvar @@@ \index{field label!selection}  Simon Peyton Jones committed May 29, 2001 982 983 Field labels are used as selector functions. When used as a variable, a field label serves as a function that extracts the field from an  Simon Peyton Jones committed Mar 28, 2001 984 985 986 object. Selectors are top level bindings and so they may be shadowed by local variables but cannot conflict with other top level bindings of the same name. This shadowing only  Simon Peyton Jones committed May 29, 2001 987 988 affects selector functions; in record construction (Section~\ref{record-construction}) and update (Section~\ref{record-update}), field labels  Simon Peyton Jones committed Mar 28, 2001 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 cannot be confused with ordinary variables. \outline{ \paragraph*{Translation:} A field label "f" introduces a selector function defined as: \begin{center} \bt{lcl} \struthack{17pt}% "f"@ x@ &=&@case x of {@ "C_1 p_{11} \ldots p_{1k}" @ -> @ "e_1" @;@ "\ldots" @;@ "C_n p_{n1} \ldots p_{nk}" @ -> @ "e_n" @}@\\ \et \end{center} where "C_1 \ldots C_n" are all the constructors of the datatype containing a field labeled with "f", "p_{ij}" is @y@ when "f" labels the "j"th component of "C_i" or @_@ otherwise, and "e_i" is @y@ when some field in "C_i" has a label of "f" or @undefined@ otherwise. } \subsubsection{Construction Using Field Labels}  Simon Peyton Jones committed May 29, 2001 1007 \label{record-construction}  Simon Peyton Jones committed Mar 28, 2001 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 \index{field label!construction} @@@ aexp -> qcon @{@ fbind_1 @,@ ... @,@ fbind_n @}@ & (\tr{labeled construction}, n>=0) fbind -> qvar @=@ exp @@@ \indexsyn{fbind}% A constructor with labeled fields may be used to construct a value in which the components are specified by name rather than by position. Unlike the braces used in declaration lists, these are not subject to layout; the @{@ and @}@ characters must be explicit. (This is also true of field updates and field patterns.)  Simon Peyton Jones committed May 29, 2001 1020 Construction using field labels is subject to the following constraints:  Simon Peyton Jones committed Mar 28, 2001 1021 1022 1023 \begin{itemize} \item Only field labels declared with the specified constructor may be mentioned.  Simon Peyton Jones committed May 29, 2001 1024 \item A field label may not be mentioned more than once.  Simon Peyton Jones committed Mar 28, 2001 1025 1026 1027 1028 1029 \item Fields not mentioned are initialized to $\bot$. \item A compile-time error occurs when any strict fields (fields whose declared types are prefixed by @!@) are omitted during construction. Strict fields are discussed in Section~\ref{strictness-flags}. \end{itemize}  Simon Peyton Jones committed Dec 02, 2002 1030 1031 1032 1033 The expression @F {}@, where @F@ is a data constructor, is legal {\em whether or not @F@ was declared with record syntax} (provided @F@ has no strict fields --- see the third bullet above); it denotes "@F@ \bot_1 ... \bot_n", where "n" is the arity of @F@.  Simon Peyton Jones committed Mar 28, 2001 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048  \outline{ \paragraph*{Translation:} In the binding "f" @=@ "v", the field "f" labels "v". \begin{center} \bt{lcl} \struthack{17pt}% "C" @{@ "bs" @}@ &=& "C (pick^C_1 bs @undefined@) \ldots (pick^C_k bs @undefined@)"\\ \et \end{center} where "k" is the arity of "C". The auxiliary function "pick^C_i bs d" is defined as follows: \begin{quote} If the "i"th component of a constructor "C" has the  Simon Peyton Jones committed May 29, 2001 1049  field label "f", and if "f=v" appears in the binding list  Simon Peyton Jones committed Mar 28, 2001 1050 1051 1052 1053  "bs", then "pick^C_i bs d" is "v". Otherwise, "pick^C_i bs d" is the default value "d". \end{quote} }  Simon Peyton Jones committed Dec 02, 2002 1054   Simon Peyton Jones committed Mar 28, 2001 1055 \subsubsection{Updates Using Field Labels}  Simon Peyton Jones committed May 29, 2001 1056 \label{record-update}  Simon Peyton Jones committed Mar 28, 2001 1057 1058 1059 1060 1061 \index{field label!update} @@@ aexp -> aexp_{\langle{}qcon\rangle{}} @{@ fbind_1 @,@ ... @,@ fbind_n @}@ & (\tr{labeled update}, n>=1) @@@  Simon Peyton Jones committed May 29, 2001 1062 Values belonging to a datatype with field labels may be  Simon Peyton Jones committed Mar 28, 2001 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 non-destructively updated. This creates a new value in which the specified field values replace those in the existing value. Updates are restricted in the following ways: \begin{itemize} \item All labels must be taken from the same datatype. \item At least one constructor must define all of the labels mentioned in the update. \item No label may be mentioned more than once. \item An execution error occurs when the value being updated does not contain all of the specified labels. \end{itemize} \outline{ \paragraph*{Translation:} Using the prior definition of "pick", \begin{center} \bt{lcl} \struthack{17pt}% "e" @{@ "bs" @}@ &=& @case@ "e" @of@\\  Simon Peyton Jones committed Aug 23, 2001 1081 &&@ @"C_1 v_1 ... v_{k_1}" @->@ "C_1 (pick^{C_1}_1 bs v_1) ... (pick^{C_1}_{k_1} bs v_{k_1})"\\  Simon Peyton Jones committed Mar 28, 2001 1082 &&@ @ ... \\  Simon Peyton Jones committed Aug 23, 2001 1083 &&@ @"C_j v_1 ... v_{k_j}" @->@ "C_j (pick^{C_j}_1 bs v_1) ... (pick^{C_j}_{k_j} bs v_{k_j})"\\  Simon Peyton Jones committed Mar 28, 2001 1084 1085 1086 1087 &&@ _ -> error "Update error"@\\ \et \end{center} where "\{C_1,...,C_j\}" is the set of constructors containing all labels  Simon Peyton Jones committed Aug 23, 2001 1088 in "bs", and "k_i" is the arity of "C_i".  Simon Peyton Jones committed Mar 28, 2001 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 } Here are some examples using labeled fields: \bprog @ data T = C1 {f1,f2 :: Int} | C2 {f1 :: Int, f3,f4 :: Char} @ \eprog $\bt{|l|l|}%%b \hline Expression & Translation \\ \hline @C1 {f1 = 3}@ & @C1 3 undefined@ \\ @C2 {f1 = 1, f4 = 'A', f3 = 'B'}@ & @C2 1 'B' 'A'@ \\ @x {f1 = 1}@ & @case x of C1 _ f2 -> C1 1 f2@ \\ & @ C2 _ f3 f4 -> C2 1 f3 f4@ \\ \hline\et$  Simon Peyton Jones committed Aug 20, 2001 1107 The field @f1@ is common to both constructors in T. This  Simon Peyton Jones committed Mar 28, 2001 1108 1109 1110 1111 example translates expressions using constructors in field-label notation into equivalent expressions using the same constructors without field labels. A compile-time error will result if no single constructor  Simon Peyton Jones committed May 29, 2001 1112 defines the set of field labels used in an update, such as  Simon Peyton Jones committed Mar 28, 2001 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 @x {f2 = 1, f3 = 'x'}@. \subsection{Expression Type-Signatures} \index{expression type-signature} \label{expression-type-sigs} % @@@ exp -> exp @::@ [context @=>@] type @@@ \indexsyn{exp} \indextt{::} \nopagebreak[4] {\em Expression type-signatures} have the form "e @::@ t", where "e" is an expression and "t" is a type (Section~\ref{type-syntax}); they are used to type an expression explicitly and may be used to resolve ambiguous typings due to overloading (see Section~\ref{default-decls}). The value of the expression is just that of "exp". As with normal type signatures (see Section~\ref{type-signatures}), the declared type may be more specific than the principal type derivable from "exp", but it is an error to give a type that is more general than, or not comparable to, the principal type. \outline{ \paragraph*{Translation:} \begin{center} \bt{lcl} \struthack{17pt}% "e @::@ t" & = & "@let {@ v @::@ t@; @ v @=@ e @} in @v" \et \end{center} } \subsection{Pattern Matching} \index{pattern-matching} \label{pattern-matching} \label{patterns} {\em Patterns} appear in lambda abstractions, function definitions, pattern bindings, list comprehensions, do expressions, and case expressions. However, the first five of these ultimately translate into case expressions, so defining the semantics of pattern matching for case expressions is sufficient. %it suffices to restrict the definition of the semantics of %pattern-matching to case expressions. \subsubsection{Patterns} \label{pattern-definitions} Patterns\index{pattern} have this syntax: @@@  Simon Marlow committed Apr 28, 2010 1165 1166 1167 1168 1169 1170 pat -> \hprime{lpat qconop pat} & (\tr{infix constructor}) | \hprime{@-@ (integer | float)} & (\tr{negative literal}) | \hprime{lpat} \hprime{lpat} -> apat | \hprime{@-@ (integer | float)} & (\tr{negative literal})  Simon Peyton Jones committed Mar 28, 2001 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185  | gcon apat_1 ... apat_k & (\tr{arity} gcon = k, k>=1) apat -> var [{\tt @@} apat] & (\tr{as pattern}) | gcon & (\tr{arity} gcon = 0) | qcon @{@ fpat_1 @,@ ... @,@ fpat_k @}@ & (\tr{labeled pattern}, k>=0) | literal | @_@ & (\tr{wildcard}) | @(@ pat @)@ & (\tr{parenthesized pattern}) | @(@ pat_1 @,@ ... @,@ pat_k @)@ & (\tr{tuple pattern}, k>=2) | @[@ pat_1 @,@ ... @,@ pat_k @]@ & (\tr{list pattern}, k>=1) | @~@ apat & (\tr{irrefutable pattern}) fpat -> qvar @=@ pat @@@ \indexsyn{pat}%  Simon Marlow committed Apr 28, 2010 1186 \indexsyn{lpat}%  Simon Peyton Jones committed Mar 28, 2001 1187 1188 1189 1190 1191 1192 1193 1194 1195 \indexsyn{apat}% \indexsyn{fpats}% \indexsyn{fpat}% The arity of a constructor must match the number of sub-patterns associated with it; one cannot match against a partially-applied constructor. All patterns must be {\em linear}\index{linearity}  Simon Peyton Jones committed Aug 20, 2001 1196 1197 \index{linear pattern}---no variable may appear more than once. For example, this definition is illegal:  Simon Marlow committed Apr 29, 2010 1198 \bprog  Simon Peyton Jones committed Aug 20, 2001 1199 @  Simon Marlow committed Apr 29, 2010 1200 f (x,x) = x -- ILLEGAL; x used twice in pattern  Simon Peyton Jones committed Aug 20, 2001 1201 @  Simon Marlow committed Apr 29, 2010 1202 \eprog  Simon Peyton Jones committed Mar 28, 2001 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 Patterns of the form "var"{\tt @@}"pat" are called {\em as-patterns}, \index{as-pattern ({\tt {\char'100}})} and allow one to use "var" as a name for the value being matched by "pat". For example,\nopagebreak[4] \bprog @ case e of { xs@@(x:rest) -> if x==0 then rest else xs } @ \eprog is equivalent to: \bprog @ let { xs = e } in case xs of { (x:rest) -> if x==0 then rest else xs } @ \eprog Patterns of the form @_@ are {\em wildcards}\index{wildcard pattern ({\tt {\char'137}})} and are useful when some part of a pattern is not referenced on the right-hand-side. It is as if an identifier not used elsewhere were put in its place. For example, \bprog @ case e of { [x,_,_] -> if x==0 then True else False } @ \eprog is equivalent to: \bprog @ case e of { [x,y,z] -> if x==0 then True else False } @ \eprog % where @y@ and @z@ are identifiers not used elsewhere. %old: %This translation is also %assumed prior to the semantics given below. \subsubsection{Informal Semantics of Pattern Matching} Patterns are matched against values. Attempting to match a pattern can have one of three results: it may {\em fail\/}; it may {\em succeed}, returning a binding for each variable in the pattern; or it may {\em diverge} (i.e.~return "\bot"). Pattern matching proceeds  Simon Peyton Jones committed Aug 20, 2001 1247 from left to right, and outside to inside, according to the following rules:  Simon Peyton Jones committed Mar 28, 2001 1248 \begin{enumerate}  Simon Peyton Jones committed Jan 29, 2002 1249 1250 \item Matching the pattern "var" against a value "v" always succeeds and binds "var" to "v".  Simon Peyton Jones committed Mar 28, 2001 1251   Simon Peyton Jones committed Jan 29, 2002 1252 1253 1254 1255 1256 1257 \item Matching the pattern "@~@apat" against a value "v" always succeeds. The free variables in "apat" are bound to the appropriate values if matching "apat" against "v" would otherwise succeed, and to "\bot" if matching "apat" against "v" fails or diverges. (Binding does {\em not} imply evaluation.)  Simon Peyton Jones committed Mar 28, 2001 1258   Simon Peyton Jones committed Dec 02, 2002 1259 1260 Operationally, this means that no matching is done on a "@~@apat" pattern until one of the variables in "apat" is used.  Simon Peyton Jones committed Mar 28, 2001 1261 1262 1263 1264 At that point the entire pattern is matched against the value, and if the match fails or diverges, so does the overall computation. \item  Simon Peyton Jones committed Dec 02, 2002 1265 Matching the wildcard pattern @_@ against any value always succeeds,  Simon Peyton Jones committed Jan 29, 2002 1266 and no binding is done.  Simon Peyton Jones committed Mar 28, 2001 1267   Simon Peyton Jones committed Jan 29, 2002 1268 1269 1270 1271 \item Matching the pattern "con pat" against a value, where "con" is a constructor defined by @newtype@, depends on the value: \begin{itemize}  Simon Peyton Jones committed Dec 02, 2002 1272 1273 \item If the value is of the form "con v", then "pat" is matched against "v". \item If the value is "\bot", then "pat" is matched against "\bot".  Simon Peyton Jones committed Jan 29, 2002 1274 1275 1276 \end{itemize} That is, constructors associated with @newtype@ serve only to change the type of a value.\index{newtype declaration@@{\tt newtype} declaration}  Simon Peyton Jones committed Mar 28, 2001 1277   Simon Peyton Jones committed Jan 29, 2002 1278 1279 1280 1281 1282 \item Matching the pattern "con pat_1 \ldots pat_n" against a value, where "con" is a constructor defined by @data@, depends on the value: \begin{itemize} \item If the value is of the form "con v_1 \ldots v_n",  Simon Peyton Jones committed Dec 02, 2002 1283 sub-patterns are matched left-to-right against the components of the data value;  Simon Peyton Jones committed Mar 28, 2001 1284 1285 1286 1287 if all matches succeed, the overall match succeeds; the first to fail or diverge causes the overall match to fail or diverge, respectively.  Simon Peyton Jones committed Jan 29, 2002 1288 1289 1290 1291 1292 1293 \item If the value is of the form "con' v_1 \ldots v_m", where "con" is a different constructor to "con'", the match fails. \item If the value is "\bot", the match diverges. \end{itemize}  Simon Peyton Jones committed Dec 02, 2002 1294 1295 1296 1297 1298 1299 1300 \item Matching against a constructor using labeled fields is the same as matching ordinary constructor patterns except that the fields are matched in the order they are named in the field list. All fields listed must be declared by the constructor; fields may not be named more than once. Fields not named by the pattern are ignored (matched against @_@).  Simon Peyton Jones committed Mar 28, 2001 1301   Simon Peyton Jones committed Dec 02, 2002 1302 \item Matching a numeric, character, or string literal pattern "k" against a value "v"  Simon Peyton Jones committed Jan 29, 2002 1303 \index{literal pattern}  Simon Peyton Jones committed Dec 02, 2002 1304 1305 succeeds if "v ~@==@ ~k", where @==@ is overloaded based on the type of the pattern. The match diverges if  Simon Peyton Jones committed Jan 29, 2002 1306 1307 this test diverges.  Simon Peyton Jones committed Dec 02, 2002 1308 1309 1310 1311 The interpretation of numeric literals is exactly as described in Section~\ref{vars-and-lits}; that is, the overloaded function @fromInteger@ or @fromRational@ is applied to an @Integer@ or @Rational@ literal (resp) to convert it to the appropriate type.  Simon Peyton Jones committed Jan 29, 2002 1312   Simon Peyton Jones committed Mar 28, 2001 1313 \item  Simon Peyton Jones committed Jan 29, 2002 1314 Matching an as-pattern "var"{\tt @@}"apat" against a value "v" is  Simon Peyton Jones committed Mar 28, 2001 1315 \index{as-pattern ({\tt {\char'100}})}  Simon Peyton Jones committed Jan 29, 2002 1316 1317 the result of matching "apat" against "v", augmented with the binding of "var" to "v". If the match of "apat" against "v" fails or diverges,  Simon Peyton Jones committed Mar 28, 2001 1318 1319 1320 1321 1322 then so does the overall match. \end{enumerate} Aside from the obvious static type constraints (for example, it is a static error to match a character against a  Simon Peyton Jones committed Dec 02, 2002 1323 1324 1325 boolean), the following static class constraints hold: \begin{itemize} \item An integer  Simon Peyton Jones committed Mar 28, 2001 1326 1327 1328 literal pattern \index{integer literal pattern} can only be matched against a value in the class  Simon Peyton Jones committed Dec 02, 2002 1329 1330 @Num@. \item A floating literal pattern  Simon Peyton Jones committed Mar 28, 2001 1331 1332 1333 \index{floating literal pattern} can only be matched against a value in the class @Fractional@.  Simon Peyton Jones committed Dec 02, 2002 1334 \end{itemize}  Simon Peyton Jones committed Mar 28, 2001 1335   Simon Peyton Jones committed Dec 02, 2002 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 It is sometimes helpful to distinguish two kinds of patterns. Matching an {\em irrefutable pattern} \index{irrefutable pattern} is non-strict: the pattern matches even if the value to be matched is "\bot". Matching a {\em refutable} pattern is strict: if the value to be matched \index{refutable pattern} is "\bot" the match diverges. The irrefutable patterns are as follows: a variable, a wildcard, "N apat" where "N" is a constructor defined by @newtype@ and "apat" is irrefutable (see Section~\ref{datatype-renaming}), \index{newtype declaration@@{\tt newtype} declaration} "var"{\tt @@}"apat" where "apat" is irrefutable, or of the form "@~@apat" (whether or not "apat" is irrefutable). All other patterns are {\em refutable}.  Simon Peyton Jones committed Mar 28, 2001 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 Here are some examples: \begin{enumerate} \item If the pattern @['a','b']@ is matched against "@['x',@\bot@]@", then @'a'@ "fails" to match against @'x'@, and the result is a failed match. But if @['a','b']@ is matched against "@[@\bot@,'x']@", then attempting to match @'a'@ against "\bot" causes the match to "diverge". \item These examples demonstrate refutable vs.~irrefutable matching: \bprog @ (\ ~(x,y) -> 0) @"\bot"@ @"\Rightarrow"@ 0 (\ (x,y) -> 0) @"\bot"@ @"\Rightarrow"@ @"\bot"@ @ \eprog \bprog @ (\ ~[x] -> 0) [] @"\Rightarrow"@ 0 (\ ~[x] -> x) [] @"\Rightarrow"@ @"\bot"@ @ \eprog \bprog @ (\ ~[x,~(a,b)] -> x) [(0,1),@"\bot"@] @"\Rightarrow"@ (0,1) (\ ~[x, (a,b)] -> x) [(0,1),@"\bot"@] @"\Rightarrow"@ @"\bot"@ @ \eprog \bprog @ (\ (x:xs) -> x:x:xs) @"\bot"@ @"\Rightarrow"@ @"\bot"@ (\ ~(x:xs) -> x:x:xs) @"\bot"@ @"\Rightarrow"@ @"\bot"@:@"\bot"@:@"\bot"@ @ \eprogNoSkip  Simon Peyton Jones committed Dec 02, 2002 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404  \item Consider the following declarations: \bprog @ newtype N = N Bool data D = D !Bool @ \eprog These examples illustrate the difference in pattern matching between types defined by @data@ and @newtype@: \bprog @ (\ (N True) -> True) @"\bot"@ @"\Rightarrow"@ @"\bot"@ (\ (D True) -> True) @"\bot"@ @"\Rightarrow"@ @"\bot"@ (\ ~(D True) -> True) @"\bot"@ @"\Rightarrow"@ True @ \eprog Additional examples may be found in Section~\ref{datatype-renaming}.  Simon Peyton Jones committed Mar 28, 2001 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 \end{enumerate} \nopagebreak[4] % \bprog % @@ % (\ t -> let (x,y) = t in 0) @@"\bot"@@ @@"\Rightarrow"@@ 0 % (\ (x,y) -> 0) @@"\bot"@@ @@"\Rightarrow"@@ @@"\bot"@@ % @@ % \eprog % \bprog % @@ % (\ l -> let [x] = l in 0) [] @@"\Rightarrow"@@ 0 % (\ l -> let [x] = l in x) [] @@"\Rightarrow"@@ @@"\bot"@@ % @@ % \eprog % \bprog % @@ % (\ l -> let [x,t] = l; (a,b) = t in x) [(0,1),@@"\bot"@@] @@"\Rightarrow"@@ (0,1) % (\ l -> let [x, (a,b)] = l in x) [(0,1),@@"\bot"@@] @@"\Rightarrow"@@ @@"\bot"@@ % @@ % \eprog % \bprog % @@ % (\ (x:xs) -> x:x:xs) @@"\bot"@@ @@"\Rightarrow"@@ @@"\bot"@@ % (\ l -> let (x:xs) = l in x:x:xs) @@"\bot"@@ @@"\Rightarrow"@@ @@"\bot"@@:@@"\bot"@@:@@"\bot"@@ % @@ % \eprogNoSkip % \end{enumerate}  Isaac Potoczny-Jones committed Jan 08, 2007 1433 1434 1435 1436 Top level patterns in case expressions and the set of top level patterns in function or pattern bindings may have zero or more associated {\em qualifiers}\index{qualifier}. See Section~\ref{qualifiers-in-patterns}.  Simon Peyton Jones committed Mar 28, 2001 1437 1438 1439 1440 1441 1442 1443 1444  The guard semantics have an obvious influence on the strictness characteristics of a function or case expression. In particular, an otherwise irrefutable pattern \index{irrefutable pattern} may be evaluated because of a guard. For example, in \bprog @  Simon Peyton Jones committed Aug 23, 2001 1445 1446 f :: (Int,Int,Int) -> [Int] -> Int f ~(x,y,z) [a] | (a == y) = 1  Simon Peyton Jones committed Mar 28, 2001 1447 1448 1449 1450 1451 1452 1453 1454 @ \eprog % \bprog % @@ % f t [a] | a==y = 1 % where (x,y,z) = t % @@ % \eprog  Simon Peyton Jones committed Dec 21, 2001 1455 both @a@ and @y@ will be evaluated by @==@ in the guard.  Simon Peyton Jones committed Mar 28, 2001 1456   Simon Peyton Jones committed Aug 20, 2001 1457 1458 1459 1460 1461 1462 1463 1464  \subsubsection{Formal Semantics of Pattern Matching} \label{case-semantics} The semantics of all pattern matching constructs other than @case@ expressions are defined by giving identities that relate those constructs to @case@ expressions. The semantics of @case@ expressions themselves are in turn given as a series of  1465 identities, in Figures~\ref{simple-case-expr-1}--\ref{simple-case-expr-3}.  Simon Peyton Jones committed Aug 20, 2001 1466 1467 1468 1469 Any implementation should behave so that these identities hold; it is not expected that it will use them directly, since that would generate rather inefficient code.  Simon Peyton Jones committed Mar 28, 2001 1470 \begin{figure}[tb]  Simon Peyton Jones committed Dec 02, 2002 1471 \outlinec{\small  Simon Peyton Jones committed Mar 28, 2001 1472 1473 \begin{tabular}{@@{}cl} (a)&@case @$e$@ of { @$alts$@ } @$=$@ (\@$v$@ -> case @$v$@ of { @$alts$@ }) @"e"\\  Simon Peyton Jones committed May 29, 2001 1474 &{\rm where $v$ is a new variable}\\  Simon Peyton Jones committed Mar 28, 2001 1475 1476 1477 (b)&@case @$v$@ of { @$p_1\ \ match_1$@; @$\ldots{}$@ ; @$p_n\ \ match_n$@ }@\\ &$=$@ case @$v$@ of { @$p_1\ \ match_1$@ ;@\\ &@ _ -> @$\ldots{}$@ case @$v$@ of {@\\  Simon Peyton Jones committed May 29, 2001 1478  &@ @$p_n\ \ match_n$\ @;@\\  Simon Peyton Jones committed Mar 28, 2001 1479 1480  &@ _ -> error "No match" }@$\ldots$@}@\\ &@ @{\rm where each $match_i$ has the form:}\\  Iavor S. Diatchki committed Jan 13, 2007 1481  &@ | @$gs_{i,1}$ @ -> @$e_{i,1}$@ ; @$\ldots{}$@ ; | @$gs_{i,m_i}$@ -> @$e_{i,m_i}$@ where { @$decls_i$@ }@\$4pt]  Simon Peyton Jones committed Mar 28, 2001 1482 %\\  Iavor S. Diatchki committed Jan 13, 2007 1483 1484 (c)&@case @v@ of { @p@ | @gs_1@ -> @e_1@ ; @\ldots{}\\ &\hspace*{4pt}@ | @gs_n@ -> @e_n@ where { @decls@ }@\\  Simon Peyton Jones committed Aug 20, 2001 1485  &\hspace*{2pt}@ _ -> @e'@ }@\\  Iavor S. Diatchki committed Jan 13, 2007 1486 1487 1488  &=@ case @e'@ of { @y@ ->@\\ &@ case @v@ of {@\\ &@ @p@ -> let { @decls@ } in@\\  Simon Marlow committed Mar 20, 2008 1489 1490 1491 1492 1493  &\hprime{@ case () of {@}\\ &\hprime{@ () | @gs_1@ -> @e_1@;@}\\ &\hprime{@ _ -> @\ldots@ case () of {@}\\ &\hprime{@ () | @gs_n@ -> @e_n@;@}\\ &\hprime{@ _ -> @y@ } @\ldots@ }@}\\  Iavor S. Diatchki committed Jan 13, 2007 1494 1495  &@ _ -> @y@ }}@\\ &{\rm where y is a new variable}\\[4pt]  Simon Peyton Jones committed Mar 28, 2001 1496 1497 %\\ (d)&@case @v@ of { ~@p@ -> @e@; _ -> @e'@ }@\\  Simon Peyton Jones committed Dec 02, 2002 1498 &=@ (\@x_1 \ldots x_n @->@ e @) (case @v@ of { @p@->@  Simon Peyton Jones committed Mar 28, 2001 1499 x_1@ })@ \ldots @(case @v@ of { @p@ -> @x_n@})@\\  Simon Peyton Jones committed Dec 02, 2002 1500 &{\rm where x_1, \ldots, x_n are all the variables in p\/}\\[4pt]  Simon Peyton Jones committed Mar 28, 2001 1501 1502 1503 1504 1505 1506 1507 %\\ (e)&@case @v@ of { @x{\tt @@}p@ -> @e@; _ -> @e'@ }@\\ &=@ case @v@ of { @p@ -> ( \ @x@ -> @e@ ) @v@ ; _ -> @e'@ }@\\[4pt] %\\ (f)&@case @v@ of { _ -> @e@; _ -> @e'@ } @=@ @e\\[4pt] \end{tabular} }  1508 %** Figure 3.1  Simon Peyton Jones committed Mar 28, 2001 1509 1510 1511 1512 1513 \ecaption{Semantics of Case Expressions, Part 1} \label{simple-case-expr-1} \end{figure} \begin{figure}[tb]  Simon Peyton Jones committed Dec 02, 2002 1514 \outlinec{\small  Simon Peyton Jones committed Mar 28, 2001 1515 1516 \begin{tabular}{@@{}cl} (g)&@case @v@ of { @K\ p_1 \ldots p_n@ -> @e@; _ -> @e'@ }@\\  Simon Marlow committed Mar 20, 2008 1517 &=\hprime{@ case @e'@ of { @y@ ->@}\\  1518 &@ case @v@ of {@\\  Simon Peyton Jones committed Mar 28, 2001 1519 &@ @K\ x_1 \ldots x_n@ -> case @x_1@ of {@\\  1520 &@ @p_1@ -> @\ldots@ case @x_n@ of { @p_n@ -> @e@ ; _ -> @y@ } @\ldots\\  Simon Marlow committed Mar 20, 2008 1521 1522 &\hprime{@ _ -> @y@ }@}\\ &\hprime{@ _ -> @y@ }}@}\\[2pt]  Simon Peyton Jones committed Mar 28, 2001 1523 1524 &{\rm at least one of p_1, \ldots, p_n is not a variable; x_1, \ldots, x_n are new variables}\\[4pt] %\\  Simon Peyton Jones committed Nov 01, 2001 1525 (h)&@case @v@ of { @k@ -> @e@; _ -> @e'@ } @=@ if (@v@==@k@) then @e@ else @e' \\  Simon Peyton Jones committed Dec 02, 2002 1526 &{\rm where k is a numeric, character, or string literal.} \\[4pt]  Simon Peyton Jones committed Mar 28, 2001 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 %\\ (i)&@case @v@ of { @x@ -> @e@; _ -> @e'@ } @=@ case @v@ of { @x@ -> @e@ }@\\[4pt] %\\ (j)&@case @v@ of { @x@ -> @e@ } @=@ ( \ @x@ -> @e@ ) @v\\[4pt] %\\ (k)&@case @N v@ of { @N@ @p@ -> @e@; _ -> @e'@ }@\\ &=@ case @v@ of { @p@ -> @e@; _ -> @e'@ }@\\ &{\rm where N is a @newtype@ constructor}\\[4pt] (l)&@case @\bot@ of { @N@ @p@ -> @e@; _ -> @e'@ } @=@ case @\bot@ of { @p@ -> @e@ }@\\ &{\rm where N is a @newtype@ constructor}\\[4pt] (m)& @case @ v @ of { @ K @ {@ f_1 @ = @ p_1 @ , @ f_2 @ = @ p_2 @ , @ \ldots @} -> @ e @; _ -> @ e' @ }@\\ &= @ case @e'@ of {@\\ &@ @y@ ->@\\ &@ case @ v @ of { @\\ &@ @ K @ { @ f_1 @ = @ p_1 @ } ->@\\ &@ case @ v @ of {@ K @ {@ f_2 @ = @ p_2 @ , @ \ldots @ } -> @ e @; _ -> @ y @ };@\\ &@ _ -> @ y @ }}@\\ &{\rm where f_1, f_2, \ldots are fields of constructor K; y is a new variable}\\[4pt] (n)&@case @ v @ of { @ K @ {@ f @ = @ p @} -> @ e @; _ -> @ e' @ }@ \\ &=@ case @ v @ of {@\\ & @ @ K p_1\ \ldots \ p_n @ -> @ e @; _ -> @ e' @ }@\\ &{\rm where p_i is p if f labels the ith component of K, @_@ otherwise}\\ (o)&@case @ v @ of { @ K @ {} -> @ e @; _ -> @ e' @ }@ \\ &=@ case @ v @ of {@\\ & @ @ K @_@ \ldots @_ -> @ e @; _ -> @ e' @ }@\\ (p)&@case (@K'@ @e_1@ @\ldots@ @e_m@) of { @K@ @x_1@ @\ldots@ @x_n@ -> @e@; _ -> @e'@ } @=@ @e'\\ &{\rm where K and K' are distinct @data@ constructors of arity n and m, respectively}\\[4pt] %\\ (q)&@case (@K@ @e_1@ @\ldots@ @e_n@) of { @K@ @x_1@ @\ldots@ @x_n@ -> @e@; _ -> @e'@ }@\\  Simon Peyton Jones committed Dec 10, 2002 1562 &=@ (\@x_1~\ldots~x_n@ -> @e@) @e_1~\ldots~e_n\\  Simon Peyton Jones committed Dec 02, 2002 1563 &{\rm where K is a @data@ constructor of arity n}\\[4pt]  Simon Peyton Jones committed Jan 29, 2002 1564 1565  (r)&@case@~\bot~@of { @K@ @x_1@ @\ldots@ @x_n@ -> @e@; _ -> @e'@ }@ ~=~ \bot \\  Simon Peyton Jones committed Dec 02, 2002 1566 &{\rm where K is a @data@ constructor of arity n}\\[4pt]  1567 1568 1569 1570 1571 1572 \end{tabular} } %** Figure 3.2 \ecaption{Semantics of Case Expressions, Part 2} \label{simple-case-expr-2} \end{figure}  Iavor S. Diatchki committed Jan 13, 2007 1573   1574 1575 \begin{figure}[tb] \outlinec{\small  Simon Marlow committed Mar 20, 2008 1576 \begin{haskellprime}  1577 \begin{tabular}{@@{}cl}  Simon Marlow committed Apr 28, 2010 1578 (s)&@case () of { () | @g_1@, @\ldots@, @g_n@ -> @e@; _ -> @e'@ }@\\  Iavor S. Diatchki committed Jan 13, 2007 1579 1580 1581 1582 1583 1584 1585 1586  &=@ case @e'@ of { @y@ ->@\\ &@ case () of {@\\ &@ () | @g_1@ -> @\ldots@ case () of {@\\ &@ () | @g_n@ -> @e@;@\\ &@ _ -> @y@ } @\ldots\\ &@ _ -> @y@ }}@\\ &{\rm where y is a new variable}\\[4pt]  Simon Marlow committed Apr 28, 2010 1587 1588 (t)&@case () of { () | @p@ <- @e_0@ -> @e@; _ -> @e'@ }@\\ &=@ case @e_0@ of { @p@ -> @e@; _ -> @e'@ }@\\  Iavor S. Diatchki committed Jan 13, 2007 1589   Simon Marlow committed Apr 28, 2010 1590 (u)&@case () of { () | let @decls@ -> @e@; _ -> @e'@ }@\\  Iavor S. Diatchki committed Jan 13, 2007 1591 1592  &=@ let @decls@ in @e\\[4pt]  Simon Marlow committed Apr 28, 2010 1593 1594 (v)&@case () of { () | @e_0@ -> @e@; _ -> @e'@ }@\\ &=@ if @e_0@ then @e@ else @e'\\[4pt]  Simon Peyton Jones committed Mar 28, 2001 1595 \end{tabular}  Simon Marlow committed Mar 20, 2008 1596 \end{haskellprime}  Simon Peyton Jones committed Mar 28, 2001 1597 }  1598 %** Figure 3.3  Simon Marlow committed Mar 20, 2008 1599 \ecaption{\hprime{Semantics of Case Expressions, Part 3}}  1600 \label{simple-case-expr-3}  Simon Peyton Jones committed Mar 28, 2001 1601 1602 \end{figure}  1603 1604 1605 1606  In Figures~\ref{simple-case-expr-1}--\ref{simple-case-expr-3}:  Simon Peyton Jones committed Mar 28, 2001 1607 "e", "e'" and "e_i" are expressions;  Iavor S. Diatchki committed Jan 13, 2007 1608 "g_i" and "gs_i" are guards and sequences of guards respecively;  Simon Peyton Jones committed Mar 28, 2001 1609 1610 1611 "p" and "p_i" are patterns; "v", "x", and "x_i" are variables; "K" and "K'" are algebraic datatype (@data@) constructors (including  Simon Peyton Jones committed Nov 01, 2001 1612 tuple constructors); and "N" is a @newtype@ constructor\index{newtype declaration@@{\tt newtype} declaration}.  Simon Peyton Jones committed Mar 28, 2001 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 % For clarity, several rules are expressed using % @let@ (used only in a non-recursive way); their usual purpose is to % prevent name capture % (e.g., in rule~(c)). The rules may be re-expressed entirely with % @cases@ by applying this identity: % \[ % "@let @x@ = @y@ in @e@ @ =@ case @y@ of { @x@ -> @e@ }@" %$ %Using all but the last two identities (rules~(n) and~(o)) in Figure~\ref{simple-case-expr-2} %in a left-to-right manner yields a translation into a %subset of general @case@ expressions called {\em simple case expressions}.% %\index{simple case expression} Rule~(b) matches a general source-language @case@ expression, regardless of whether it actually includes  Iavor S. Diatchki committed Jan 13, 2007 1628 guards---if no guards are written, then @True@ is substituted for the guards "gs_{i,j}"  Simon Peyton Jones committed Mar 28, 2001 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 in the "match_i" forms. Subsequent identities manipulate the resulting @case@ expression into simpler and simpler forms. %The semantics of simple @case@ expressions is %given by the last two identities ((n) and~(o)). Rule~(h) in Figure~\ref{simple-case-expr-2} involves the overloaded operator @==@; it is this rule that defines the meaning of pattern matching against overloaded constants. \index{pattern-matching!overloaded constant} %% When used as a translation, the identities in %% Figure~\ref{simple-case-expr} will generate a very inefficient %% program. This can be fixed by using further @case@ or @let@ %% expressions, but doing so %% would clutter the identities, which are intended only to convey the semantics.  Simon Marlow committed Apr 28, 2010 1646 These identities all preserve the static semantics. Rules~(d), (e), (j), and~(q)  Simon Peyton Jones committed Mar 28, 2001 1647 1648 1649 1650 1651 1652 1653 1654 1655 use a lambda rather than a @let@; this indicates that variables bound by @case@ are monomorphically typed (Section~\ref{type-semantics}). \index{monomorphic type variable} %**~footer % Local Variables: % mode: latex % End: