% 
% $Id: glasgow_exts.vsgml,v 1.10 1999/05/04 08:31:52 sof Exp $
%
% GHC Language Extensions.
%

As with all known Haskell systems, GHC implements some extensions to
the language.  To use them, you'll need to give a @-fglasgow-exts@%
<nidx>-fglasgow-exts option</nidx> option.

Virtually all of the Glasgow extensions serve to give you access to
the underlying facilities with which we implement Haskell.  Thus, you
can get at the Raw Iron, if you are willing to write some non-standard
code at a more primitive level.  You need not be ``stuck'' on
performance because of the implementation costs of Haskell's
``high-level'' features---you can always code ``under'' them.  In an
extreme case, you can write all your time-critical code in C, and then
just glue it together with Haskell!

Executive summary of our extensions:

<descrip>

<tag>Unboxed types and primitive operations:</tag> 

You can get right down to the raw machine types and operations;
included in this are ``primitive arrays'' (direct access to Big Wads
of Bytes).  Please see Section <ref name="Unboxed types"
id="glasgow-unboxed"> and following.

<tag>Multi-parameter type classes:</tag>

GHC's type system supports extended type classes with multiple
parameters.  Please see Section <ref name="Multi-parameter type
classes" id="multi-param-type-classes">.

<tag>Local universal quantification:</tag>

GHC's type system supports explicit universal quantification in
constructor fields and function arguments.  This is useful for things
like defining @runST@ from the state-thread world.  See Section <ref
name="Local universal quantification" id="universal-quantification">.

<tag>Existential quantification in data types:</tag>

Some or all of the type variables in a datatype declaration may be
<em>existentially quantified</em>.  More details in Section <ref
name="Existential Quantification" id="existential-quantification">.

<tag>Scoped type variables:</tag>

Scoped type variables enable the programmer to supply type signatures
for some nested declarations, where this would not be legal in Haskell
98.  Details in Section <ref name="Scoped Type Variables"
id="scoped-type-variables">.

<tag>Calling out to C:</tag> 

Just what it sounds like.  We provide <em>lots</em> of rope that you
can dangle around your neck.  Please see Section <ref name="Calling~C
directly from Haskell" id="glasgow-ccalls">.

<tag>Pragmas:</tag>

Pragmas are special instructions to the compiler placed in the source
file.  The pragmas GHC supports are described in Section <ref
name="Pragmas" id="pragmas">.

</descrip>

Before you get too carried away working at the lowest level (e.g.,
sloshing @MutableByteArray#@s around your program), you may wish to
check if there are system libraries that provide a ``Haskellised
veneer'' over the features you want.  See Section <ref name="GHC
Prelude and libraries" id="ghc-prelude">.

%************************************************************************
%*                                                                      *
<sect1>Unboxed types
<label id="glasgow-unboxed">
<p>
<nidx>Unboxed types (Glasgow extension)</nidx>
%*                                                                      *
%************************************************************************

These types correspond to the ``raw machine'' types you would use in
C: @Int#@ (long int), @Double#@ (double), @Addr#@ (void *), etc.  The
<em>primitive operations</em> (PrimOps) on these types are what you
might expect; e.g., @(+#)@ is addition on @Int#@s, and is the
machine-addition that we all know and love---usually one instruction.
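
As a rough illustration of the paragraph above, here is a minimal
sketch of boxing and unboxing around @(+#)@.  It assumes a recent GHC,
where the @MagicHash@ language pragma and the @GHC.Exts@ module take
the place of the flag and library modules described in this document;
the function names @plusInt@ and @squarePlus@ are made up for the
example.

<tscreen><verb>
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), Int#, (*#), (+#))

-- Unbox both arguments, add them with the primitive (+#),
-- and re-box the result.
plusInt :: Int -> Int -> Int
plusInt (I# x) (I# y) = I# (x +# y)

-- Arithmetic written directly on raw machine integers: a*a + b.
squarePlus :: Int# -> Int# -> Int#
squarePlus a b = (a *# a) +# b
</verb></tscreen>

Loaded into an interactive session, @plusInt 2 3@ evaluates to @5@.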

There are some restrictions on the use of unboxed types, the main one
being that you can't pass an unboxed value to a polymorphic function
or store one in a polymorphic data type.  This rules out things like
@[Int#]@ (i.e., lists of unboxed integers).  The reason for this
restriction is that polymorphic arguments and constructor fields are
assumed to be pointers: if an unboxed integer is stored in one of
these, the garbage collector would attempt to follow it, leading to
unpredictable space leaks.  Or a @seq@ operation on the polymorphic
component may attempt to dereference the pointer, with disastrous
results.  Even worse, the unboxed value might be larger than a pointer
(@Double#@ for instance).

Nevertheless, a numerically-intensive program using unboxed types can
go a <em>lot</em> faster than its ``standard'' counterpart---we saw a
threefold speedup on one example.
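
To give a flavour of what such numerically-intensive code looks like,
the sketch below keeps a loop counter and an accumulator entirely in
unboxed form.  It is written against a recent GHC (the @MagicHash@
pragma and the @GHC.Exts@ module are assumptions relative to this
document), @sumSquares@ is an invented name, and no speedup is claimed
for it: an optimising compiler may well produce the same code from the
ordinary boxed version.

<tscreen><verb>
{-# LANGUAGE MagicHash #-}

import GHC.Exts
  ( Double (D#), Double#, Int (I#), Int#
  , (*##), (+##), (-#), (<=#), int2Double#, isTrue# )

-- Sum of the squares 1^2 + 2^2 + ... + n^2, computed with an
-- unboxed counter and an unboxed accumulator.
sumSquares :: Int -> Double
sumSquares (I# n) = D# (go n 0.0##)
  where
    go :: Int# -> Double# -> Double#
    go i acc
      | isTrue# (i <=# 0#) = acc
      | otherwise =
          go (i -# 1#) (acc +## (int2Double# i *## int2Double# i))
</verb></tscreen>

Note that @go@ could not be given a polymorphic type, nor could its
intermediate results be collected in a list, for exactly the reasons
given above.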

Please see Section <ref name="The module PrelGHC: really primitive
stuff" id="ghc-libs-ghc"> for the details of unboxed types and the
operations on them.

%************************************************************************
%*                                                                      *
<sect1>Primitive state-transformer monad
<label id="glasgow-ST-monad">
<p>
<nidx>state transformers (Glasgow extensions)</nidx>
<nidx>ST monad (Glasgow extension)</nidx>
%*                                                                      *
%************************************************************************

This monad underlies our implementation of arrays, mutable and
immutable, and our implementation of I/O, including ``C calls''.

The @ST@ library, which provides access to the @ST@ monad, is a
GHC/Hugs extension library and is described in the separate <htmlurl
name="GHC/Hugs Extension Libraries" url="libs.html"> document.
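
For readers who have not met the monad before, here is a small sketch
of the style of programming it supports: a mutable variable is
updated in place, yet the function as a whole is pure.  The module
names @Control.Monad.ST@ and @Data.STRef@ are those of a recent GHC
and are an assumption here; the library described in the separate
document may spell them differently.  @sumST@ is an invented name.

<tscreen><verb>
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sum a list by updating one mutable cell in place.  The cell
-- cannot escape from runST, so sumST is an ordinary pure function.
sumST :: [Int] -> Int
sumST xs = runST (do
    ref <- newSTRef 0
    mapM_ (\x -> modifySTRef' ref (+ x)) xs
    readSTRef ref)
</verb></tscreen>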

%************************************************************************
%*                                                                      *
<sect1>Primitive arrays, mutable and otherwise
<label id="glasgow-prim-arrays">
<p>
<nidx>primitive arrays (Glasgow extension)</nidx>
<nidx>arrays, primitive (Glasgow extension)</nidx>
%*                                                                      *
%************************************************************************

GHC knows about quite a few flavours of Large Swathes of Bytes.

First, GHC distinguishes between primitive arrays of (boxed) Haskell
objects (type @Array# obj@) and primitive arrays of bytes (type
@ByteArray#@).

Second, it distinguishes between...
<descrip>
<tag>Immutable:</tag>
Arrays that do not change (as with ``standard'' Haskell arrays); you
can only read from them.  Obviously, they do not need the care and
attention of the state-transformer monad.

<tag>Mutable:</tag>
Arrays that may be changed or ``mutated.''  All the operations on them
live within the state-transformer monad and the updates happen
<em>in-place</em>.

<tag>``Static'' (in C land):</tag>
A C routine may pass an @Addr#@ pointer back into Haskell land.  There
are then primitive operations with which you may merrily grab values
over in C land, by indexing off the ``static'' pointer.

<tag>``Stable'' pointers:</tag>
If, for some reason, you wish to hand a Haskell pointer (i.e.,
<em>not</em> an unboxed value) to a C routine, you first make the
pointer ``stable,'' so that the garbage collector won't forget that it
exists.  That is, GHC provides a safe way to pass Haskell pointers to
C.

Please see Section <ref name="Subverting automatic unboxing with
``stable pointers''" id="glasgow-stablePtrs"> for more details.

<tag>``Foreign objects'':</tag>
A ``foreign object'' is a safe way to pass an external object (a
C-allocated pointer, say) to Haskell and have Haskell do the Right
Thing when it no longer references the object.  So, for example, C
could pass a large bitmap over to Haskell and say ``please free this
memory when you're done with it.'' 

Please see Section <ref name="Pointing outside the Haskell heap"
id="glasgow-foreignObjs"> for more details.

</descrip>
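
The mutable/immutable distinction above can be sketched with the
higher-level array libraries rather than the primitives themselves.
The example assumes the @Data.Array.ST@ and @Data.Array.Unboxed@
modules of a recent GHC (they are not named in this document):
a mutable byte-array-backed array is filled in place inside the
state-transformer monad and then handed back as an immutable one.
@histogram@ is an invented name.

<tscreen><verb>
import Data.Array.ST (newArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, elems)

-- Count how often each value in the range 0..n-1 occurs in xs.
-- Out-of-range values are ignored.
histogram :: Int -> [Int] -> UArray Int Int
histogram n xs = runSTUArray (do
    arr <- newArray (0, n - 1) 0
    mapM_ (bump arr) [x | x <- xs, x >= 0, x < n]
    return arr)
  where
    -- Read a slot and write it back incremented, in place.
    bump arr x = do
      old <- readArray arr x
      writeArray arr x (old + 1)
</verb></tscreen>

Once @runSTUArray@ returns, the result can only be read, for instance
with @elems@.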

The libraries section gives more details on all these ``primitive
array'' types and the operations on them; see Section <ref name="The GHC
Prelude and Libraries" id="ghc-prelude">.  Some of these extensions
are also supported by Hugs, and the supporting libraries are described
in the <htmlurl name="GHC/Hugs Extension Libraries" url="libs.html">
document.

%************************************************************************
%*                                                                      *
<sect1>Calling~C directly from Haskell
<label id="glasgow-ccalls">
<p>
<nidx>C calls (Glasgow extension)</nidx>
<nidx>_ccall_ (Glasgow extension)</nidx>
<nidx>_casm_ (Glasgow extension)</nidx>
%*                                                                      *
%************************************************************************

GOOD ADVICE: Because this stuff is not Entirely Stable as far as names
and things go, you would be well-advised to keep your C-callery
corraled in a few modules, rather than sprinkled all over your code.
It will then be quite easy to update later on.

%************************************************************************
%*                                                                      *
<sect2>@_ccall_@ and @_casm_@: an introduction
<label id="ccall-intro">
<p>
%*                                                                      *
%************************************************************************

The simplest way to use a simple C function

<tscreen><verb>
double fooC( FILE *in, char c, int i, double d, unsigned int u )
</verb></tscreen>

is to provide a Haskell wrapper:

<tscreen><verb>
fooH :: Char -> Int -> Double -> Word -> IO Double
fooH c i d w = _ccall_ fooC (``stdin''::Addr) c i d w
</verb></tscreen>

The function @fooH@ will unbox all of its arguments, call the C
function @fooC@ and box the result.
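
For comparison only, here is how the same kind of wrapper might look
with the @foreign import@ declarations of the later, standardised
Foreign Function Interface.  This is not the @_ccall_@ mechanism
described in this section, and it assumes a recent GHC; @c_cos@ and
@cosH@ are invented names, and the C function called is the ordinary
@cos@ from @math.h@.

<tscreen><verb>
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble (..))

-- Bind the C function cos; the argument and the result are
-- marshalled as C doubles.
foreign import ccall unsafe "math.h cos"
  c_cos :: CDouble -> IO CDouble

-- Haskell wrapper: convert the argument, call C, convert the result.
cosH :: Double -> IO Double
cosH x = fmap realToFrac (c_cos (realToFrac x))
</verb></tscreen>

The explicit type signature on the import plays the part that the
signature of @fooH@ plays above: it tells the compiler how to unbox
the arguments and box the result.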

One of the annoyances about @_ccall_@s is when the C types don't quite
match the Haskell compiler's ideas.  For this, the @_casm_@ variant
may be just the ticket (NB: <em>no chance</em> of such code going
through a native-code generator):

<tscreen><verb>
import Addr
import CString

oldGetEnv name
  = _casm_ ``%r = getenv((char *) %0);'' name >>= \ litstring ->
    return (
        if (litstring == nullAddr) then
            Left ("Fail:oldGetEnv:"++name)
        else
            Right (unpackCString litstring)
    )
</verb></tscreen>

The first literal-literal argument to a @_casm_@ is like a @printf@
format: @%r@ is replaced with the ``result,'' @%0@--@%n-1@ are
replaced with the 1st--nth arguments.  As you can see above, it is an
easy way to do simple C~casting.  Everything said about @_ccall_@ goes
for @_casm_@ as well.

The use of @_casm_@ in your code does pose a problem to the compiler
when it comes to generating an interface file for a freshly compiled
module. Included in an interface file is the unfolding (if any) of a
declaration. However, if a declaration's unfolding happens to contain
a @_casm_@, its unfolding will <em/not/ be emitted into the interface
file even if it qualifies by all the other criteria. The reason why
the compiler prevents this from happening is that unfolding @_casm_@s
into an interface file unduly constrains how code that imports your
module has to be compiled. If an imported declaration is unfolded and
it contains a @_casm_@, you now have to be using a compiler backend
capable of dealing with it (i.e., the C compiler backend). If you are
using the C compiler backend, the unfolded @_casm_@ may still cause you
problems since the C code snippet it contains may mention CPP symbols
that were in scope when compiling the original module but are not when
compiling the importing module.

If you're willing to put up with the drawbacks of doing cross-module
inlining of C code (GHC - A Better C Compiler :-), the option
@-funfold-casms-in-hi-file@ will turn off the default behaviour.
<nidx>-funfold-casms-in-hi-file option</nidx>

%************************************************************************
%*                                                                      *
<sect2>Literal-literals
<label id="glasgow-literal-literals">
<p>
<nidx>Literal-literals</nidx>
%*                                                                      *
%************************************************************************

The literal-literal argument to @_casm_@ can be made use of separately
from the @_casm_@ construct itself. Indeed, we've already used it:

<tscreen><verb>
fooH :: Char -> Int -> Double -> Word -> IO Double
fooH c i d w = _ccall_ fooC (``stdin''::Addr) c i d w
</verb></tscreen>

The first argument that's passed to @fooC@ is given as a literal-literal, 
that is, a literal chunk of C code that will be inserted into the generated
@.hc@ code at the right place.

A literal-literal is restricted to having a type that's an instance of
the @CCallable@ class, see <ref name="CCallable" id="ccall-gotchas">
for more information.

Notice that literal-literals are by their very nature unfriendly to
native code generators, so exercise judgement about whether or not to
make use of them in your code.

%************************************************************************
%*                                                                      *
<sect2>Using function headers
<label id="glasgow-foreign-headers">
<p>
<nidx>C calls, function headers</nidx>
%*                                                                      *
%************************************************************************

When generating C (using the @-fvia-C@ directive), one can assist the
C compiler in detecting type errors by using the @-#include@ directive
to provide @.h@ files containing function headers.

For example,

<tscreen><verb>
typedef unsigned long *StgForeignObj;
typedef long StgInt;

void          initialiseEFS (StgInt size);
StgInt        terminateEFS (void);
StgForeignObj emptyEFS(void);
StgForeignObj updateEFS (StgForeignObj a, StgInt i, StgInt x);
StgInt        lookupEFS (StgForeignObj a, StgInt i);
</verb></tscreen>

You can find appropriate definitions for @StgInt@, @StgForeignObj@,
etc using @gcc@ on your architecture by consulting
@ghc/includes/StgTypes.h@.  The following table summarises the
relationship between Haskell types and C types.

<tabular ca="ll">
<bf>C type name</bf>      | <bf>Haskell Type</bf> @@
@@
@StgChar@          | @Char#@ @@               
@StgInt@           | @Int#@ @@                
@StgWord@          | @Word#@ @@               
@StgAddr@          | @Addr#@ @@               
@StgFloat@         | @Float#@ @@              
@StgDouble@        | @Double#@ @@             

@StgArray@         | @Array#@ @@              
@StgByteArray@     | @ByteArray#@ @@          
@StgArray@         | @MutableArray#@ @@       
@StgByteArray@     | @MutableByteArray#@ @@   

@StgStablePtr@     | @StablePtr#@ @@
@StgForeignObj@    | @ForeignObj#@
</tabular>

Note that this approach is only <em>essential</em> for returning
@float@s (or if @sizeof(int) != sizeof(int *)@ on your
architecture) but is a Good Thing for anyone who cares about writing
solid code.  You're crazy not to do it.

%************************************************************************
%*                                                                      *
<sect2>Subverting automatic unboxing with ``stable pointers''
<label id="glasgow-stablePtrs">
<p>
<nidx>stable pointers (Glasgow extension)</nidx>
%*                                                                      *
%************************************************************************

The arguments of a @_ccall_@ are automatically unboxed before the
call.  There are two reasons why this is usually the Right Thing to
do:

<itemize>
<item>
C is a strict language: it would be excessively tedious to pass
unevaluated arguments and require the C programmer to force their
evaluation before using them.

<item> Boxed values are stored on the Haskell heap and may be moved
within the heap if a garbage collection occurs---that is, pointers
to boxed objects are not <em>stable</em>.
</itemize>

It is possible to subvert the unboxing process by creating a ``stable
pointer'' to a value and passing the stable pointer instead.  For
example, to pass/return an integer lazily to C functions @storeC@ and
@fetchC@, one might write:

<tscreen><verb>
storeH :: Int -> IO ()
storeH x = makeStablePtr x              >>= \ stable_x ->
           _ccall_ storeC stable_x

fetchH :: IO Int
fetchH = _ccall_ fetchC                 >>= \ stable_x ->
           deRefStablePtr stable_x      >>= \ x ->
           freeStablePtr stable_x       >>
           return x
</verb></tscreen>

The garbage collector will refrain from throwing a stable pointer away
until you explicitly call one of the following from C or Haskell.

<tscreen><verb>
void freeStablePointer( StgStablePtr stablePtrToToss )
freeStablePtr :: StablePtr a -> IO ()
</verb></tscreen>

As with the use of @free@ in C programs, GREAT CARE SHOULD BE
EXERCISED to ensure these functions are called at the right time: too
early and you get dangling references (and, if you're lucky, an error
message from the runtime system); too late and you get space leaks.
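
The create/dereference/free life cycle can be tried out without any C
code at all.  The sketch below assumes the @Foreign.StablePtr@ module
of a recent GHC, in which the creation function is called
@newStablePtr@ rather than the @makeStablePtr@ used above;
@roundTrip@ is an invented name.

<tscreen><verb>
import Foreign.StablePtr (deRefStablePtr, freeStablePtr, newStablePtr)

-- Make a stable pointer to a (possibly unevaluated) value, read the
-- value back through it, and release the pointer exactly once.
roundTrip :: [Int] -> IO Int
roundTrip xs = do
  sp <- newStablePtr (sum xs) -- the sum is not forced here
  v <- deRefStablePtr sp
  freeStablePtr sp            -- after this, sp must not be used again
  return v
</verb></tscreen>

In a real program the stable pointer would be handed to C between
the first and second steps, and the call to @freeStablePtr@ is where
the care described above is needed.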

And to force evaluation of the argument within @fooC@, one would
call one of the following C functions (according to type of argument).

<tscreen><verb>
void     performIO  ( StgStablePtr stableIndex /* StablePtr s (IO ()) */ );
StgInt   enterInt   ( StgStablePtr stableIndex /* StablePtr s Int */ );
StgFloat enterFloat ( StgStablePtr stableIndex /* StablePtr s Float */ );
</verb></tscreen>

<nidx>performIO</nidx>
<nidx>enterInt</nidx>
<nidx>enterFloat</nidx>

Nota Bene: @_ccall_GC_@<nidx>_ccall_GC_</nidx> must be used if any of
these functions are used.

%************************************************************************
%*                                                                      *
<sect2>Foreign objects: pointing outside the Haskell heap
<label id="glasgow-foreignObjs">
<p>
<nidx>foreign objects (Glasgow extension)</nidx>
%*                                                                      *
%************************************************************************

There are two types that @ghc@ programs can use to reference
(heap-allocated) objects outside the Haskell world: @Addr@ and
@ForeignObj@.

If you use @Addr@, it is up to you, the programmer, to arrange
allocation and deallocation of the objects.

If you use @ForeignObj@, @ghc@'s garbage collector will call upon the
user-supplied <em>finaliser</em> function to free the object when the
Haskell world can no longer access the object.  (An object is
associated with a finaliser function when the abstract
Haskell type @ForeignObj@ is created.)  The finaliser function is
expressed in C, and is passed the object as its argument:

<tscreen><verb>
void foreignFinaliser ( StgForeignObj fo )
</verb></tscreen>

when the Haskell world can no longer access the object.  Since
@ForeignObj@s only get released when a garbage collection occurs, we
provide ways of triggering a garbage collection from within C and from
within Haskell.

<tscreen><verb>
void GarbageCollect()
performGC :: IO ()
</verb></tscreen>

More information on the programmers' interface to @ForeignObj@ can be
found in the library documentation.
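
As a sketch of the idea only: in a recent GHC the role of
@ForeignObj@ is played by the @ForeignPtr@ type of the
@Foreign.ForeignPtr@ module, which is an assumption relative to this
document.  The example allocates a C object with @malloc@ and
attaches C's @free@ as its finaliser, so that the memory is released
some time after Haskell drops its last reference.  @newCell@ and
@readCell@ are invented names.

<tscreen><verb>
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, malloc)
import Foreign.Storable (peek, poke)

-- Allocate one C int outside the Haskell heap, initialise it, and
-- register free() as the finaliser for it.
newCell :: CInt -> IO (ForeignPtr CInt)
newCell v = do
  p <- malloc
  poke p v
  newForeignPtr finalizerFree p

-- Read the C int; withForeignPtr keeps the object alive meanwhile.
readCell :: ForeignPtr CInt -> IO CInt
readCell fp = withForeignPtr fp peek
</verb></tscreen>

No explicit deallocation appears anywhere, which is the whole point;
the price is that the moment of release is up to the garbage
collector.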

%************************************************************************
%*                                                                      *
<sect2>Avoiding monads
<label id="glasgow-avoiding-monads">
<p>
<nidx>C calls to `pure C'</nidx>
<nidx>unsafePerformIO</nidx>
%*                                                                      *
%************************************************************************

The @_ccall_@ construct is part of the @IO@ monad because 9 out of 10
uses will be to call imperative functions with side effects such as
@printf@.  Use of the monad ensures that these operations happen in a
predictable order in spite of laziness and compiler optimisations.

To avoid having to be in the monad to call a C function, it is
possible to use @unsafePerformIO@, which is available from the
@IOExts@ module.  There are three situations where one might like to
call a C function from outside the IO world:

<itemize>
<item>
Calling a function with no side-effects:
<tscreen><verb>
atan2d :: Double -> Double -> Double
atan2d y x = unsafePerformIO (_ccall_ atan2d y x)

sincosd :: Double -> (Double, Double)
sincosd x = unsafePerformIO $ do
        da <- newDoubleArray (0, 1)
        _casm_ ``sincosd( %0, &((double *)%1)[0], &((double *)%1)[1] );'' x da
        s <- readDoubleArray da 0
        c <- readDoubleArray da 1
        return (s, c)
</verb></tscreen>

<item> Calling a set of functions which have side-effects but which can
be used in a purely functional manner.

For example, an imperative implementation of a purely functional
lookup-table might be accessed using the following functions.

<tscreen><verb>
empty  :: EFS x
update :: EFS x -> Int -> x -> EFS x
lookup :: EFS a -> Int -> a

empty = unsafePerformIO (_ccall_ emptyEFS)

update a i x = unsafePerformIO $
        makeStablePtr x         >>= \ stable_x ->
        _ccall_ updateEFS a i stable_x

lookup a i = unsafePerformIO $
        _ccall_ lookupEFS a i   >>= \ stable_x ->
        deRefStablePtr stable_x
</verb></tscreen>

You will almost always want to use @ForeignObj@s with this.

<item> Calling a side-effecting function even though the results will
be unpredictable.  For example the @trace@ function is defined by:

<tscreen><verb>
trace :: String -> a -> a
trace string expr
  = unsafePerformIO (
	((_ccall_ PreTraceHook sTDERR{-msg-}):: IO ())  >>
	fputs sTDERR string			        >>
	((_ccall_ PostTraceHook sTDERR{-msg-}):: IO ()) >>
	return expr )
  where
    sTDERR = (``stderr'' :: Addr)
</verb></tscreen>

(This kind of use is not highly recommended --- it is only really
useful in debugging code.)
</itemize>
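
The first situation in the list above (a C function with no side
effects) can be sketched in the later, standardised FFI notation as
follows.  This assumes a recent GHC and its @System.IO.Unsafe@ and
@Foreign.C.Types@ modules, none of which is described in this
document; @c_atan2@ is an invented name, and the C function is the
ordinary @atan2@ from @math.h@.

<tscreen><verb>
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble (..))
import System.IO.Unsafe (unsafePerformIO)

-- The C call is imported in the IO monad, as a _ccall_ would be.
foreign import ccall unsafe "math.h atan2"
  c_atan2 :: CDouble -> CDouble -> IO CDouble

-- atan2 has no side effects, so it is reasonable to hide the monad.
atan2d :: Double -> Double -> Double
atan2d y x =
  unsafePerformIO
    (fmap realToFrac (c_atan2 (realToFrac y) (realToFrac x)))
{-# NOINLINE atan2d #-}
</verb></tscreen>

The burden of proof is on the programmer: nothing checks that the C
function really is free of side effects.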

%************************************************************************
%*                                                                      *
<sect2>C-calling ``gotchas'' checklist
<label id="ccall-gotchas">
<p>
<nidx>C call dangers</nidx>
<nidx>CCallable</nidx>
<nidx>CReturnable</nidx>
%*                                                                      *
%************************************************************************

And some advice, too.

<itemize>
<item> For modules that use @_ccall_@s, etc., compile with
@-fvia-C@.<nidx>-fvia-C option</nidx> You don't have to, but you should.

Also, use the @-#include "prototypes.h"@ flag (hack) to inform the C
compiler of the fully-prototyped types of all the C functions you
call.  (Section <ref name="Using function headers"
id="glasgow-foreign-headers"> says more about this...)

This scheme is the <em>only</em> way that you will get <em>any</em>
typechecking of your @_ccall_@s.  (It shouldn't be that way, but...).
GHC will pass the flag @-Wimplicit@ to gcc so that you'll get warnings
if any @_ccall_@ed functions have no prototypes.

<item>
Try to avoid @_ccall_@s to C~functions that take @float@
arguments or return @float@ results.  Reason: if you do, you will
become entangled in (ANSI?) C's rules for when arguments/results are
promoted to @doubles@.  It's a nightmare and just not worth it.
Use @doubles@ if possible.

If you do use @floats@, check and re-check that the right thing is
happening.  Perhaps compile with @-keep-hc-file-too@ and look at
the intermediate C (@.hc@ file).

<item> The compiler uses two non-standard type-classes when
type-checking the arguments and results of @_ccall_@: the arguments
(respectively result) of @_ccall_@ must be instances of the class
@CCallable@ (respectively @CReturnable@).  Both classes may be
imported from the module @CCall@, but this should only be
necessary if you want to define a new instance.  (Neither class
defines any methods --- their only function is to keep the
type-checker happy.)

The type checker must be able to figure out just which of the
C-callable/returnable types is being used.  If it can't, you have to
add type signatures. For example,

<tscreen><verb>
f x = _ccall_ foo x
</verb></tscreen>

is not good enough, because the compiler can't work out what type @x@
is, nor what type the @_ccall_@ returns.  You have to write, say:

<tscreen><verb>
f :: Int -> IO Double
f x = _ccall_ foo x
</verb></tscreen>

This table summarises the standard instances of these classes.

% ToDo: check this table against implementation!

<tabular ca="llll">
<bf>Type</bf>       |<bf>CCallable</bf>|<bf>CReturnable</bf> | <bf>Which is probably...</bf> @@

@Char@              | Yes  | Yes   | @unsigned char@ @@
@Int@               | Yes  | Yes   | @long int@ @@
@Word@              | Yes  | Yes   | @unsigned long int@ @@
@Addr@              | Yes  | Yes   | @void *@ @@
@Float@             | Yes  | Yes   | @float@ @@
@Double@            | Yes  | Yes   | @double@ @@
@()@                | No   | Yes   | @void@ @@
@[Char]@            | Yes  | No    | @char *@ (null-terminated) @@
                                      
@Array@             | Yes  | No    | @unsigned long *@ @@
@ByteArray@         | Yes  | No    | @unsigned long *@ @@
@MutableArray@      | Yes  | No    | @unsigned long *@ @@
@MutableByteArray@  | Yes  | No    | @unsigned long *@ @@
                      		       
@State@             | Yes  | Yes   | nothing!@@
                      		       
@StablePtr@         | Yes  | Yes   | @unsigned long *@ @@
@ForeignObj@        | Yes  | Yes   | see later @@
</tabular>

Actually, the @Word@ type is defined as being the same size as a
pointer on the target architecture, which is <em>probably</em>
@unsigned long int@.  

The brave and careful programmer can add their own instances of these
classes for the following types:

<itemize>
<item>
A <em>boxed-primitive</em> type may be made an instance of both
@CCallable@ and @CReturnable@.  

A boxed primitive type is any data type with a
single unary constructor with a single primitive argument.  For
example, the following are all boxed primitive types:

<tscreen><verb>
Int
Double
data XDisplay = XDisplay Addr#
data EFS a = EFS# ForeignObj#
</verb></tscreen>

<tscreen><verb>
instance CCallable   (EFS a)
instance CReturnable (EFS a)
</verb></tscreen>

<item> Any datatype with a single nullary constructor may be made an
instance of @CReturnable@.  For example:

<tscreen><verb>
data MyVoid = MyVoid
instance CReturnable MyVoid
</verb></tscreen>

<item> As at version 2.09, @String@ (i.e., @[Char]@) is still
not a @CReturnable@ type.

Also, the now-builtin type @PackedString@ is neither
@CCallable@ nor @CReturnable@.  (But there are functions in
the PackedString interface to let you get at the necessary bits...)
</itemize>

<item> The code-generator will complain if you attempt to use @%r@ in
a @_casm_@ whose result type is @IO ()@; or if you don't use @%r@
<em>precisely</em> once for any other result type.  These messages are
supposed to be helpful and catch bugs---please tell us if they wreck
your life.

<item> If you call out to C code which may trigger the Haskell garbage
collector or create new threads (examples of this later...), then you
must use the @_ccall_GC_@<nidx>_ccall_GC_ primitive</nidx> or
@_casm_GC_@<nidx>_casm_GC_ primitive</nidx> variant of C-calls.  (This
does not work with the native code generator - use @-fvia-C@.) This
stuff is hairy with a capital H!  </itemize>

<sect1> Multi-parameter type classes
<label id="multi-param-type-classes">
<p>

This section documents GHC's implementation of multi-parameter type
classes.  There's lots of background in the paper <url name="Type
classes: exploring the design space"
url="http://www.dcs.gla.ac.uk/~simonpj/multi.ps.gz"> (Simon Peyton
Jones, Mark Jones, Erik Meijer).
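
For orientation, here is a small sketch of what a multi-parameter
class looks like.  It assumes a recent GHC, where the extension is
switched on with the @MultiParamTypeClasses@ pragma rather than the
flag described in this document; the class @Convert@ and the function
@widen@ are invented for the example.

<tscreen><verb>
{-# LANGUAGE MultiParamTypeClasses #-}

-- A class relating two types: values of type a can be converted
-- to values of type b.
class Convert a b where
  convert :: a -> b

instance Convert Int Integer where
  convert = toInteger

instance Convert Bool Int where
  convert b = if b then 1 else 0

-- The signature fixes both class parameters, so the compiler
-- can pick the first instance.
widen :: Int -> Integer
widen = convert
</verb></tscreen>

A use such as @convert True@ on its own is ambiguous, because nothing
determines the second parameter; a type annotation such as
@(convert True :: Int)@ resolves it.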

I'd like to thank people who reported shortcomings in the GHC 3.02
implementation.  Our default decisions were all conservative ones, and
the experience of these heroic pioneers has given useful concrete
examples to support several generalisations.  (These appear below as
design choices not implemented in 3.02.)

I've discussed these notes with Mark Jones, and I believe that Hugs
will migrate towards the same design choices as I outline here.
Thanks to him, and to many others who have offered very useful
feedback.

<sect2>Types
<p>

There are the following restrictions on the form of a qualified 
type:

<tscreen><verb>
  forall tv1..tvn. (c1, ...,cn) => type
</verb></tscreen>

(Here, I write the "foralls" explicitly, although the Haskell source
language omits them; in Haskell 1.4, all the free type variables of an
explicit source-language type signature are universally quantified,
except for the class type variables in a class declaration.  However,
in GHC, you can give the foralls if you want.  See Section <ref
name="Explicit universal quantification"
id="universal-quantification">).

<enum>

<item> <bf>Each universally quantified type variable 
@tvi@ must be mentioned (i.e. appear free) in @type@</bf>.

The reason for this is that a value with a type that does not obey
this restriction could not be used without introducing
ambiguity. Here, for example, is an illegal type:

<tscreen><verb>
  forall a. Eq a => Int
</verb></tscreen>

When a value with this type was used, the constraint <tt>Eq tv</tt>
would be introduced where <tt>tv</tt> is a fresh type variable, and
(in the dictionary-translation implementation) the value would be
applied to a dictionary for <tt>Eq tv</tt>.  The difficulty is that we
can never know which instance of <tt>Eq</tt> to use because we never
get any more information about <tt>tv</tt>.

<item> <bf>Every constraint @ci@ must mention at least one of the
universally quantified type variables @tvi@</bf>.

For example, this type is OK because <tt>C a b</tt> mentions the
universally quantified type variable <tt>a</tt>:

<tscreen><verb>
  forall a. C a b => burble
</verb></tscreen>

The next type is illegal because the constraint <tt>Eq b</tt> does not
mention <tt>a</tt>:

<tscreen><verb>
  forall a. Eq b => burble
</verb></tscreen>

The reason for this restriction is milder than the other one.  The
excluded types are never useful or necessary (because the offending
context doesn't need to be witnessed at this point; it can be floated
out).  Furthermore, floating them out increases sharing. Lastly,
excluding them is a conservative choice; it leaves a patch of
territory free in case we need it later.

</enum>

These restrictions apply to all types, whether declared in a type signature
or inferred.

Unlike Haskell 1.4, constraints in types do <bf>not</bf> have to be of
the form <em>(class type-variables)</em>.  Thus, these type signatures
are perfectly OK:

<tscreen><verb>
  f :: Eq (m a) => [m a] -> [m a]
  g :: Eq [a] => ...
</verb></tscreen>

This choice recovers principal types, a property that Haskell 1.4 does not have.
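
As a small sketch of these relaxed constraint forms, the module below
uses one constraint on a type application and one on a list type.  The
names <tt>dedup</tt> and <tt>sameRows</tt> are made up for
illustration.  The <tt>LANGUAGE</tt> pragma is an assumption about
later GHC releases; with the compiler described here, the
@-fglasgow-exts@ flag does that job.

<tscreen><verb>
{-# LANGUAGE FlexibleContexts #-}
module Main where

-- Constraint on a type application (m a), not on a bare type variable.
dedup :: Eq (m a) => [m a] -> [m a]
dedup []     = []
dedup (x:xs) = x : dedup (filter (/= x) xs)

-- Constraint on a type constructor applied to a variable.
sameRows :: Eq [a] => [[a]] -> Bool
sameRows []     = True
sameRows (r:rs) = all (== r) rs

main :: IO ()
main = do
  print (dedup [Just 1, Nothing, Just (1 :: Int)])
  print (sameRows ["ab", "ab", "ab"])
</verb></tscreen>

Here <tt>dedup</tt> is used at <tt>m = Maybe</tt>, so the constraint
that has to be solved at the call site is <tt>Eq (Maybe Int)</tt>.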

<sect2>Class declarations
<p>

<enum>

<item> <bf>Multi-parameter type classes are permitted</bf>. For example:

<tscreen><verb>
  class Collection c a where
    union :: c a -> c a -> c a
    ...etc..
</verb></tscreen>


<item> <bf>The class hierarchy must be acyclic</bf>.  However, the definition
of "acyclic" involves only the superclass relationships.  For example,
this is OK:

<tscreen><verb>
  class C a where { 
    op :: D b => a -> b -> b
  }

  class C a => D a where { ... }
</verb></tscreen>

Here, <tt>C</tt> is a superclass of <tt>D</tt>, but it's OK for a
class operation <tt>op</tt> of <tt>C</tt> to mention <tt>D</tt>.  (It
would not be OK for <tt>D</tt> to be a superclass of <tt>C</tt>.)

<item> <bf>There are no restrictions on the context in a class declaration
(which introduces superclasses), except that the class hierarchy must
be acyclic</bf>.  So these class declarations are OK:

<tscreen><verb>
  class Functor (m k) => FiniteMap m k where
    ...

  class (Monad m, Monad (t m)) => Transform t m where
    lift :: m a -> (t m) a
</verb></tscreen>

<item> <bf>In the signature of a class operation, every constraint
must mention at least one type variable that is not a class type
variable</bf>.

Thus:

<tscreen><verb>
  class Collection c a where
    mapC :: Collection c b => (a->b) -> c a -> c b
</verb></tscreen>

is OK because the constraint <tt>(Collection c b)</tt> mentions
<tt>b</tt>, even though it also mentions the class variable
<tt>c</tt>.  On the other hand:

<tscreen><verb>
  class C a where
    op :: Eq a => (a,b) -> (a,b)
</verb></tscreen>

is not OK because the constraint <tt>(Eq a)</tt> mentions only the class
type variable <tt>a</tt>, and not <tt>b</tt>.  However, any such
example is easily fixed by moving the offending context up to the
superclass context:

<tscreen><verb>
  class Eq a => C a where
    op :: (a,b) -> (a,b)
</verb></tscreen>

A yet more relaxed rule would allow the context of a class-op signature
to mention only class type variables.  However, that conflicts with
Rule 1(b) for types above.

<item> <bf>The type of each class operation must mention <em/all/ of
the class type variables</bf>.  For example:

<tscreen><verb>
  class Coll s a where
    empty  :: s
    insert :: s -> a -> s
</verb></tscreen>

is not OK, because the type of <tt>empty</tt> doesn't mention
<tt>a</tt>.  This rule is a consequence of Rule 1(a), above, for
types, and has the same motivation.

Sometimes, offending class declarations exhibit misunderstandings.  For
example, <tt>Coll</tt> might be rewritten

<tscreen><verb>
  class Coll s a where
    empty  :: s a
    insert :: s a -> a -> s a
</verb></tscreen>

which makes the connection between the type of a collection of
<tt>a</tt>'s (namely <tt>(s a)</tt>) and the element type <tt>a</tt>.
Occasionally this really doesn't work, in which case you can split the
class like this:

<tscreen><verb>
  class CollE s where
    empty  :: s

  class CollE s => Coll s a where
    insert :: s -> a -> s
</verb></tscreen>

</enum>
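
To see the split-class idiom from the last rule working end to end,
here is a minimal sketch with one instance.  The type
<tt>IntColl</tt> and its duplicate-free behaviour are invented for
the example.  The <tt>LANGUAGE</tt> pragma is an assumption about
later GHC releases; @-fglasgow-exts@ plays that role for the compiler
described here.

<tscreen><verb>
{-# LANGUAGE MultiParamTypeClasses #-}
module Main where

-- Hypothetical collection type: a list of Ints without duplicates.
newtype IntColl = IntColl [Int] deriving (Eq, Show)

-- empty mentions only s, so it lives in a one-parameter superclass.
class CollE s where
  empty :: s

-- insert mentions both class variables, as the last rule requires.
class CollE s => Coll s a where
  insert :: s -> a -> s

instance CollE IntColl where
  empty = IntColl []

instance Coll IntColl Int where
  insert (IntColl xs) x
    | elem x xs = IntColl xs
    | otherwise = IntColl (x : xs)

main :: IO ()
main = print (insert (insert (insert empty (3 :: Int)) (5 :: Int)) (3 :: Int) :: IntColl)
</verb></tscreen>

The annotations on the literals and on the result are needed in this
sketch: they are what lets the type checker choose the instance
<tt>Coll IntColl Int</tt>.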

<sect2>Instance declarations
<p>

<enum>

<item> <bf>Instance declarations may not overlap</bf>.  The two instance
declarations

<tscreen><verb>
  instance context1 => C type1 where ...
  instance context2 => C type2 where ...
</verb></tscreen>

"overlap" if @type1@ and @type2@ unify.

However, if you give the command line option
@-fallow-overlapping-instances@<nidx>-fallow-overlapping-instances
option</nidx> then two overlapping instance declarations are permitted
iff

<itemize>
<item> EITHER @type1@ and @type2@ do not unify
<item> OR @type2@ is a substitution instance of @type1@
		(but not identical to @type1@)
<item> OR vice versa
</itemize>

Notice that these rules

<itemize>
<item> make it clear which instance decl to use
	   (pick the most specific one that matches)

<item> do not mention the contexts @context1@, @context2@
	    Reason: you can pick which instance decl
	    "matches" based on the type.
</itemize>

Regrettably, GHC doesn't guarantee to detect overlapping instance
declarations if they appear in different modules.  GHC can "see" the
instance declarations in the transitive closure of all the modules
imported by the one being compiled, so it can "see" all instance decls
when it is compiling <tt>Main</tt>.  However, it currently chooses not
to look at ones that can't possibly be of use in the module currently
being compiled, in the interests of efficiency.  (Perhaps we should
change that decision, at least for <tt>Main</tt>.)

<item> <bf>There are no restrictions on the types in an instance
<em/head/, except that at least one of them must not be a type variable</bf>.
The instance "head" is the bit after the "=>" in an instance decl. For
example, these are OK:

<tscreen><verb>
  instance C Int a where ...

  instance D (Int, Int) where ...

  instance E [[a]] where ...
</verb></tscreen>

Note that instance heads <bf>may</bf> contain repeated type variables.
For example, this is OK:

<tscreen><verb>
  instance Stateful (ST s) (MutVar s) where ...
</verb></tscreen>

The "at least one not a type variable" restriction is to ensure that
context reduction terminates: each reduction step removes one type
constructor.  For example, the following would make the type checker
loop if it wasn't excluded:

<tscreen><verb>
  instance C a => C a where ...
</verb></tscreen>

There are two situations in which the rule is a bit of a pain. First,
if one allows overlapping instance declarations then it's quite
convenient to have a "default instance" declaration that applies if
something more specific does not:

<tscreen><verb>
  instance C a where
    op = ... -- Default
</verb></tscreen>

Second, sometimes you might want to use the following to get the
effect of a "class synonym":

<tscreen><verb>
  class (C1 a, C2 a, C3 a) => C a where { }

  instance (C1 a, C2 a, C3 a) => C a where { }
</verb></tscreen>

This allows you to write shorter signatures:

<tscreen><verb>
  f :: C a => ...
</verb></tscreen>

instead of

<tscreen><verb>
  f :: (C1 a, C2 a, C3 a) => ...
</verb></tscreen>

I'm on the lookout for a simple rule that preserves decidability while
allowing these idioms.  The experimental flag
@-fallow-undecidable-instances@<nidx>-fallow-undecidable-instances
option</nidx> lifts this restriction, allowing all the types in an
instance head to be type variables.

<item> <bf>Unlike Haskell 1.4, instance heads may use type
synonyms</bf>.  As always, using a type synonym is just shorthand for
writing the RHS of the type synonym definition.  For example:

<tscreen><verb>
  type Point = (Int,Int) 
  instance C Point   where ...
  instance C [Point] where ...
</verb></tscreen>

is legal.  However, if you added

<tscreen><verb>
  instance C (Int,Int) where ...
</verb></tscreen>

as well, then the compiler will complain about the overlapping
(actually, identical) instance declarations.  As always, type synonyms
must be fully applied.  You cannot, for example, write:

<tscreen><verb>
  type P a = [[a]]
  instance Monad P where ...
</verb></tscreen>

This design decision is independent of all the others, and easily
reversed, but it makes sense to me.

<item><bf>The types in an instance-declaration <em/context/ must all
be type variables</bf>. Thus

<tscreen><verb>
  instance C a b => Eq (a,b) where ...
</verb></tscreen>

is OK, but

<tscreen><verb>
  instance C Int b => Foo b where ...
</verb></tscreen>

is not OK.  Again, the intent here is to make sure that context
reduction terminates.

Voluminous correspondence on the Haskell mailing list has convinced me
that it's worth experimenting with a more liberal rule.  If you use
the flag <tt>-fallow-undecidable-instances</tt> you can use arbitrary
types in an instance context.  Termination is ensured by having a
fixed-depth recursion stack.  If you exceed the stack depth you get a
sort of backtrace, and the opportunity to increase the stack depth
with <tt>-fcontext-stack</tt><em/N/.

</enum>
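
The following sketch puts two of the instance-head rules together: a
type synonym in the head, and a head built from nested, non-variable
types.  The class <tt>Describe</tt> is hypothetical.  The
<tt>LANGUAGE</tt> pragma is an assumption about later GHC releases;
with this compiler @-fglasgow-exts@ is what permits such heads.

<tscreen><verb>
{-# LANGUAGE FlexibleInstances #-}
module Main where

class Describe a where
  describe :: a -> String

type Point = (Int, Int)

-- Same as writing: instance Describe (Int, Int)
instance Describe Point where
  describe (x, y) = "point " ++ show x ++ "," ++ show y

-- A nested, non-variable type in the head; it does not overlap with Point.
instance Describe [Point] where
  describe ps = show (length ps) ++ " points"

main :: IO ()
main = do
  putStrLn (describe ((1, 2) :: Point))
  putStrLn (describe [(1, 2), (3, 4) :: Point])
</verb></tscreen>

Neither head unifies with the other, so the no-overlap rule is
satisfied without any extra flag.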

% -----------------------------------------------------------------------------
<sect1>Explicit universal quantification
<label id="universal-quantification">
<p>

GHC now allows you to write explicitly quantified types.  GHC's
syntax for this now agrees with Hugs's, namely:

<tscreen><verb>
	forall a b. (Ord a, Eq  b) => a -> b -> a
</verb></tscreen>

The context is, of course, optional.  You can't use <tt>forall</tt> as
a type variable any more!

Haskell type signatures are implicitly quantified.  The <tt>forall</tt>
allows us to say exactly what this means.  For example:

<tscreen><verb>
	g :: b -> b
</verb></tscreen>

means this:

<tscreen><verb>
	g :: forall b. (b -> b)
</verb></tscreen>

The two are treated identically.

<sect2>Universally-quantified data type fields
<label id="univ">
<p>

In a <tt>data</tt> or <tt>newtype</tt> declaration one can quantify
the types of the constructor arguments.  Here are several examples:

<tscreen><verb>
data T a = T1 (forall b. b -> b -> b) a

data MonadT m = MkMonad { return :: forall a. a -> m a,
			  bind   :: forall a b. m a -> (a -> m b) -> m b
		        }

newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
</verb></tscreen>

The constructors now have so-called <em/rank 2/ polymorphic
types, in which there is a for-all in the argument types:

<tscreen><verb>
T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
MkMonad :: forall m. (forall a. a -> m a)
		  -> (forall a b. m a -> (a -> m b) -> m b)
		  -> MonadT m
MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
</verb></tscreen>

Notice that you don't need to use a <tt>forall</tt> if there's an
explicit context.  For example in the first argument of the
constructor <tt>MkSwizzle</tt>, an implicit "<tt>forall a.</tt>" is
prefixed to the argument type.  The implicit <tt>forall</tt>
quantifies all type variables that are not already in scope, and are
mentioned in the type quantified over.

As for type signatures, implicit quantification happens for non-overloaded
types too.  So if you write this:
<tscreen><verb>
  data T a = MkT (Either a b) (b -> b)
</verb></tscreen>
it's just as if you had written this:
<tscreen><verb>
  data T a = MkT (forall b. Either a b) (forall b. b -> b)
</verb></tscreen>
That is, since the type variable <tt>b</tt> isn't in scope, it's
implicitly universally quantified.  (Arguably, it would be better
to <em>require</em> explicit quantification on constructor arguments
where that is what is wanted.  Feedback welcomed.)

<sect2> Construction 
<p>

You construct values of types <tt>T, MonadT, Swizzle</tt> by applying
the constructor to suitable values, just as usual.  For example,

<tscreen><verb>
(T1 (\x y -> x) 3) :: T Int

(MkSwizzle sort)    :: Swizzle
(MkSwizzle reverse) :: Swizzle

(let r x = Just x
     b m k = case m of
		Just y -> k y
		Nothing -> Nothing
  in
  MkMonad r b) :: MonadT Maybe
</verb></tscreen>

The type of the argument can, as usual, be more general than the type
required, as <tt>(MkSwizzle reverse)</tt> shows.  (<tt>reverse</tt>
does not need the <tt>Ord</tt> constraint.)

<sect2>Pattern matching
<p>

When you use pattern matching, the bound variables may now have
polymorphic types.  For example:

<tscreen><verb>
	f :: T a -> a -> (a, Char)
	f (T1 f k) x = (f k x, f 'c' 'd')

	g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
	g (MkSwizzle s) xs f = s (map f (s xs))

	h :: MonadT m -> [m a] -> m [a]
	h m [] = return m []
	h m (x:xs) = bind m x 		$ \y ->
		      bind m (h m xs)	$ \ys ->
		      return m (y:ys)
</verb></tscreen>

In the function <tt>h</tt> we use the record selectors <tt>return</tt>
and <tt>bind</tt> to extract the polymorphic bind and return functions
from the <tt>MonadT</tt> data structure, rather than using pattern
matching.

You cannot pattern-match against an argument that is polymorphic.
For example:
<tscreen><verb>
	newtype TIM s a = TIM (ST s (Maybe a))

	runTIM :: (forall s. TIM s a) -> Maybe a
	runTIM (TIM m) = runST m
</verb></tscreen>

Here the pattern-match fails, because you can't pattern-match against
an argument of type <tt>(forall s. TIM s a)</tt>.  Instead you 
must bind the variable and pattern match in the right hand side:
<tscreen><verb>
	runTIM :: (forall s. TIM s a) -> Maybe a
	runTIM tm = case tm of { TIM m -> runST m }
</verb></tscreen>
The <tt>tm</tt> on the right hand side is (invisibly) instantiated, like
any polymorphic value at its occurrence site, and now you can pattern-match
against it.
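
Pulling construction and pattern matching together, here is a
self-contained sketch using the <tt>T</tt> and <tt>Swizzle</tt> types
from above.  The function names <tt>useT</tt> and
<tt>swizzleBoth</tt> are invented.  Two things are assumptions about
later GHC releases rather than anything documented here: the
<tt>LANGUAGE</tt> pragma (standing in for @-fglasgow-exts@), and the
explicit <tt>forall a.</tt> in <tt>MkSwizzle</tt>, which those
releases are assumed to insist on.

<tscreen><verb>
{-# LANGUAGE RankNTypes #-}
module Main where

data T a = T1 (forall b. b -> b -> b) a

newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

-- k is bound at a polymorphic type, so it can be used at a and at Char.
useT :: T a -> a -> (a, Char)
useT (T1 k v) x = (k v x, k 'c' 'd')

-- s is used at two different element types, a and b.
swizzleBoth :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
swizzleBoth (MkSwizzle s) xs f = s (map f (s xs))

main :: IO ()
main = do
  print (useT (T1 (\x _ -> x) (3 :: Int)) 4)
  print (swizzleBoth (MkSwizzle reverse) [1, 2, 3 :: Int] show)
</verb></tscreen>

If <tt>k</tt> or <tt>s</tt> were monomorphic, each function body
would be rejected, because both use the bound variable at two
different types.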

<sect2>The partial-application restriction
<p>

There is really only one way in which data structures with polymorphic
components might surprise you: you must not partially apply them.
For example, this is illegal:

<tscreen><verb>
	map MkSwizzle [sort, reverse]
</verb></tscreen>

The restriction is this: <em>every subexpression of the program must
have a type that has no for-alls, except that in a function
application (f e1 ... en) the partial applications are not subject to
this rule</em>.  The restriction makes type inference feasible.

In the illegal example, the sub-expression <tt>MkSwizzle</tt> has the
polymorphic type <tt>(Ord b => [b] -> [b]) -> Swizzle</tt> and is not
a sub-expression of an enclosing application.  On the other hand, this
expression is OK:

<tscreen><verb>
	map (T1 (\a b -> a)) [1,2,3]
</verb></tscreen>

even though it involves a partial application of <tt>T1</tt>, because
the sub-expression <tt>T1 (\a b -> a)</tt> has type <tt>Int -> T
Int</tt>.

<sect2>Type signatures
<label id="sigs">
<p>

Once you have data constructors with universally-quantified fields, or
constants such as <tt>runST</tt> that have rank-2 types, it isn't long
before you discover that you need more!  Consider:

<tscreen><verb>
  mkTs f x y = [T1 f x, T1 f y]
</verb></tscreen>

<tt>mkTs</tt> is a function that constructs some values of type
<tt>T</tt>, using some pieces passed to it.  The trouble is that since
<tt>f</tt> is a function argument, Haskell assumes that it is
monomorphic, so we'll get a type error when applying <tt>T1</tt> to
it.  This is a rather silly example, but the problem really bites in
practice.  Lots of people trip over the fact that you can't make
"wrapper functions" for <tt>runST</tt> for exactly the same reason.
In short, it is impossible to build abstractions around functions with
rank-2 types.

The solution is fairly clear.  We provide the ability to give a rank-2
type signature for <em>ordinary</em> functions (not only data
constructors), thus:

<tscreen><verb>
  mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
  mkTs f x y = [T1 f x, T1 f y]
</verb></tscreen>

This type signature tells the compiler to attribute <tt>f</tt> with
the polymorphic type <tt>(forall b. b -> b -> b)</tt> when type
checking the body of <tt>mkTs</tt>, so now the application of
<tt>T1</tt> is fine.

There are two restrictions:

<itemize>
<item> You can only define a rank 2 type, specified by the following
grammar:

<tscreen><verb>
   rank2type ::= [forall tyvars .] [context =>] funty
   funty     ::= ([forall tyvars .] [context =>] ty) -> funty
               | ty
   ty        ::= ...current Haskell monotype syntax...
</verb></tscreen>

Informally, the universal quantification must all be right at the beginning, 
or at the top level of a function argument.

<item> There is a restriction on the definition of a function whose
type signature is a rank-2 type: the polymorphic arguments must be
matched on the left hand side of the "<tt>=</tt>" sign.  You can't
define <tt>mkTs</tt> like this:

<tscreen><verb>
  mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
  mkTs = \ f x y -> [T1 f x, T1 f y]
</verb></tscreen>


The same partial-application rule applies to ordinary functions with
rank-2 types as applied to data constructors.  

</itemize>
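
As a runnable sketch of a rank-2 signature on an ordinary function,
the module below defines <tt>mkTs</tt> with its polymorphic argument
matched on the left hand side, and then takes the results apart
again.  The helper <tt>choose</tt> is hypothetical.  The
<tt>LANGUAGE</tt> pragma is an assumption about later GHC releases,
standing in for @-fglasgow-exts@.

<tscreen><verb>
{-# LANGUAGE RankNTypes #-}
module Main where

data T a = T1 (forall b. b -> b -> b) a

-- The rank-2 signature makes f polymorphic inside the body.
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
mkTs f x y = [T1 f x, T1 f y]

-- Hypothetical helper: apply the stored function to the stored value.
choose :: T a -> a -> a
choose (T1 f v) other = f v other

main :: IO ()
main = print (map (\t -> choose t 0) (mkTs (\p _ -> p) 10 (20 :: Int)))
</verb></tscreen>

Note that <tt>mkTs</tt> is always fully applied here, in keeping with
the partial-application rule.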

% -----------------------------------------------------------------------------
<sect1>Existentially quantified data constructors
<label id="existential-quantification">
<p>

The idea of using existential quantification in data type declarations
was suggested by Laufer (I believe, though doubtless someone will
correct me), and implemented in Hope+. It's been in Lennart
Augustsson's <tt>hbc</tt> Haskell compiler for several years, and
proved very useful.  Here's the idea.  Consider the declaration:

<tscreen><verb>
  data Foo = forall a. MkFoo a (a -> Bool)
	   | Nil
</verb></tscreen>

The data type <tt>Foo</tt> has two constructors with types:

<tscreen><verb>
  MkFoo :: forall a. a -> (a -> Bool) -> Foo
  Nil   :: Foo
</verb></tscreen>

Notice that the type variable <tt>a</tt> in the type of <tt>MkFoo</tt>
does not appear in the data type itself, which is plain <tt>Foo</tt>.
For example, the following expression is fine:

<tscreen><verb>
  [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
</verb></tscreen>

Here, <tt>(MkFoo 3 even)</tt> packages an integer with a function
<tt>even</tt> that maps an integer to <tt>Bool</tt>; and <tt>MkFoo 'c'
isUpper</tt> packages a character with a compatible function.  These
two things are each of type <tt>Foo</tt> and can be put in a list.

What can we do with a value of type <tt>Foo</tt>?  In particular,
what happens when we pattern-match on <tt>MkFoo</tt>?

<tscreen><verb>
  f (MkFoo val fn) = ???
</verb></tscreen>

Since all we know about <tt>val</tt> and <tt>fn</tt> is that they
are compatible, the only (useful) thing we can do with them is to
apply <tt>fn</tt> to <tt>val</tt> to get a boolean.  For example:

<tscreen><verb>
  f :: Foo -> Bool
  f (MkFoo val fn) = fn val
</verb></tscreen>

What this allows us to do is to package heterogeneous values
together with a bunch of functions that manipulate them, and then treat
that collection of packages in a uniform manner.  You can express
quite a bit of object-oriented-like programming this way.
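
Here is the <tt>Foo</tt> example as a complete module, as a minimal
sketch.  The function name <tt>check</tt> is invented, and the
character test <tt>(== 'c')</tt> replaces <tt>isUpper</tt> only so
that the sketch needs no imports.  The <tt>LANGUAGE</tt> pragma is an
assumption about later GHC releases, standing in for
@-fglasgow-exts@.

<tscreen><verb>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

-- The only useful thing to do with the two fields is to combine them.
check :: Foo -> Bool
check (MkFoo val fn) = fn val
check Nil            = False

main :: IO ()
main = print (map check [MkFoo (3 :: Int) even, MkFoo 'c' (== 'c'), Nil])
</verb></tscreen>

The list holds an <tt>Int</tt> package and a <tt>Char</tt> package
side by side, yet <tt>check</tt> treats them uniformly.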

<sect2>Why existential?
<label id="existential">
<p>

What has this to do with <em>existential</em> quantification?
Simply that <tt>MkFoo</tt> has the (nearly) isomorphic type

<tscreen><verb>
  MkFoo :: (exists a . (a, a -> Bool)) -> Foo
</verb></tscreen>

But Haskell programmers can safely think of the ordinary
<em>universally</em> quantified type given above, thereby avoiding
adding a new existential quantification construct.

<sect2>Type classes
<p>

An easy extension (implemented in <tt>hbc</tt>) is to allow 
arbitrary contexts before the constructor.  For example:

<tscreen><verb>
  data Baz = forall a. Eq a => Baz1 a a
	   | forall b. Show b => Baz2 b (b -> b)
</verb></tscreen>

The two constructors have the types you'd expect:

<tscreen><verb>
  Baz1 :: forall a. Eq a => a -> a -> Baz
  Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
</verb></tscreen>

But when pattern matching on <tt>Baz1</tt> the matched values can be compared
for equality, and when pattern matching on <tt>Baz2</tt> the first matched
value can be converted to a string (as well as applying the function to it).  
So this program is legal:

<tscreen><verb>
  f :: Baz -> String
  f (Baz1 p q) | p == q    = "Yes"
	       | otherwise = "No"
  f (Baz2 v fn)            = show (fn v)
</verb></tscreen>

Operationally, in a dictionary-passing implementation, the
constructors <tt>Baz1</tt> and <tt>Baz2</tt> must store the
dictionaries for <tt>Eq</tt> and <tt>Show</tt> respectively, and
extract them on pattern matching.

Notice the way that the syntax fits smoothly with that used for
universal quantification earlier.
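
The <tt>Baz</tt> example can be exercised with the sketch below.  The
function is named <tt>render</tt> here (an invented name) to avoid
confusion with the other <tt>f</tt>s in this section, and the
<tt>LANGUAGE</tt> pragma is an assumption about later GHC releases,
standing in for @-fglasgow-exts@.

<tscreen><verb>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

-- Matching on a constructor makes its stored dictionary available.
render :: Baz -> String
render (Baz1 p q) | p == q    = "Yes"
                  | otherwise = "No"
render (Baz2 v fn)            = show (fn v)

main :: IO ()
main = mapM_ (putStrLn . render)
             [Baz1 'x' 'x', Baz1 True False, Baz2 (20 :: Int) (+ 1)]
</verb></tscreen>

Each element of the list carries its own <tt>Eq</tt> or <tt>Show</tt>
dictionary, which is why <tt>render</tt> needs no class constraint in
its signature.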

<sect2>Restrictions
<p>

There are several restrictions on the ways in which existentially-quantified
constructors can be used.

<itemize>

<item> When pattern matching, each pattern match introduces a new,
distinct, type for each existential type variable.  These types cannot
be unified with any other type, nor can they escape from the scope of
the pattern match.  For example, these fragments are incorrect:

<tscreen><verb>
  f1 (MkFoo a f) = a
</verb></tscreen>

Here, the type bound by <tt>MkFoo</tt> "escapes", because <tt>a</tt>
is the result of <tt>f1</tt>.  One way to see why this is wrong is to
ask what type <tt>f1</tt> has:

<tscreen><verb>
  f1 :: Foo -> a             -- Weird!
</verb></tscreen>

What is this "<tt>a</tt>" in the result type? Clearly we don't mean
this:

<tscreen><verb>
  f1 :: forall a. Foo -> a   -- Wrong!
</verb></tscreen>

The original program is just plain wrong.  Here's another sort of error:

<tscreen><verb>
  f2 (Baz1 a b) (Baz1 p q) = a==q
</verb></tscreen>

It's ok to say <tt>a==b</tt> or <tt>p==q</tt>, but
<tt>a==q</tt> is wrong because it equates the two distinct types arising
from the two <tt>Baz1</tt> constructors.


<item>You can't pattern-match on an existentially quantified
constructor in a <tt>let</tt> or <tt>where</tt> group of
bindings. So this is illegal:

<tscreen><verb>
  f3 x = a==b where { Baz1 a b = x }
</verb></tscreen>

You can only pattern-match
on an existentially-quantified constructor in a <tt>case</tt> expression or
in the patterns of a function definition.

The reason for this restriction is really an implementation one.
Type-checking binding groups is already a nightmare without
existentials complicating the picture.  Also an existential pattern
binding at the top level of a module doesn't make sense, because it's
not clear how to prevent the existentially-quantified type "escaping".
So for now, there's a simple-to-state restriction.  We'll see how
annoying it is.  

<item>You can't use existential quantification for <tt>newtype</tt> 
declarations.  So this is illegal:

<tscreen><verb>
  newtype T = forall a. Ord a => MkT a
</verb></tscreen>

Reason: a value of type <tt>T</tt> must be represented as a pair
of a dictionary for <tt>Ord t</tt> and a value of type <tt>t</tt>.
That contradicts the idea that <tt>newtype</tt> should have no 
concrete representation.  You can get just the same efficiency and effect
by using <tt>data</tt> instead of <tt>newtype</tt>.  If there is no
overloading involved, then there is more of a case for allowing
an existentially-quantified <tt>newtype</tt>, because the <tt>data</tt>
version does carry an implementation cost,
but single-field existentially quantified constructors aren't much
use.  So the simple restriction (no existential stuff on <tt>newtype</tt>)
stands, unless there are convincing reasons to change it.
</itemize>
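
For the second restriction, the permitted alternative is a
<tt>case</tt> expression.  The sketch below rewrites the rejected
<tt>f3</tt> in that style.  The <tt>LANGUAGE</tt> pragma is an
assumption about later GHC releases, standing in for
@-fglasgow-exts@.

<tscreen><verb>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

-- Rejected form, kept as a comment:  f3 x = a==b where { Baz1 a b = x }
f3 :: Baz -> Bool
f3 x = case x of
  Baz1 a b -> a == b     -- the result type Bool does not mention the hidden type
  Baz2 _ _ -> False

main :: IO ()
main = print (map f3 [Baz1 (1 :: Int) 1, Baz1 'p' 'q', Baz2 "s" id])
</verb></tscreen>

Because the comparison happens inside the <tt>case</tt> alternative,
the existentially bound type never escapes its scope.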

<sect1> <idx/Assertions/ 
<label id="sec:assertions">
<p>

If you want to make use of assertions in your standard Haskell code, you
could define a function like the following:

<tscreen><verb>
assert :: Bool -> a -> a
assert False x = error "assertion failed!"
assert _     x = x
</verb></tscreen>

which works, but gives you back a less than useful error message --
an assertion failed, but which and where?

One way out is to define an extended <tt/assert/ function which also
takes a descriptive string to include in the error message and
perhaps combine this with the use of a pre-processor which inserts
the source location where <tt/assert/ was used.
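
A hand-rolled version of that idea might look like the sketch below.
The name <tt>assertMsg</tt> and the message text are invented, and
the description has to be supplied by the caller because no
pre-processor is involved.

<tscreen><verb>
module Main where

-- Hypothetical extended assert: the caller supplies the description by hand.
assertMsg :: String -> Bool -> a -> a
assertMsg msg False _ = error ("assertion failed: " ++ msg)
assertMsg _   True  x = x

kelvinToC :: Double -> Double
kelvinToC k = assertMsg "kelvinToC: negative temperature" (k >= 0.0) (k - 273.15)

main :: IO ()
main = print (kelvinToC 273.15)
</verb></tscreen>

The drawback is plain to see: every call site has to repeat its own
location or description, and nothing keeps that text up to date.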

GHC offers a helping hand here, doing all of this for you. For every
use of <tt/assert/ in the user's source:

<tscreen><verb>
kelvinToC :: Double -> Double
kelvinToC k = assert (k &gt;= 0.0) (k-273.15)
</verb></tscreen>

GHC will rewrite this to also include the source location where the
assertion was made:

<tscreen><verb>
assert pred val ==> assertError "Main.hs|15" pred val
</verb></tscreen>

The rewrite is only performed by the compiler when it spots
applications of <tt>Exception.assert</tt>, so you can still define and
use your own versions of <tt/assert/, should you so wish. If not,
import <tt/Exception/ to make use of <tt/assert/ in your code.

To have the compiler ignore uses of assert, use the compiler option
@-fignore-asserts@. <nidx>-fignore-asserts option</nidx> That is,
expressions of the form @assert pred e@ will be rewritten to @e@.

Assertion failures can be caught, see the documentation for the
Hugs/GHC Exception library for information on how.

% -----------------------------------------------------------------------------
<sect1>Scoped Type Variables
<label id="scoped-type-variables">
<p>

A <em/pattern type signature/ can introduce a <em/scoped type
variable/.  For example

<tscreen><verb>
f (xs::[a]) = ys ++ ys
	   where
	      ys :: [a]
	      ys = reverse xs 
</verb></tscreen>

The pattern @(xs::[a])@ includes a type signature for @xs@.
This brings the type variable @a@ into scope; it scopes over
all the patterns and right hand sides for this equation for @f@.
In particular, it is in scope at the type signature for @ys@.

At ordinary type signatures, such as that for @ys@, any type variables
mentioned in the type signature <em/that are not in scope/ are
implicitly universally quantified.  (If there are no type variables in
scope, all type variables mentioned in the signature are universally
quantified, which is just as in Haskell 98.)  In this case, since @a@
is in scope, it is not universally quantified, so the type of @ys@ is
the same as that of @xs@.  In Haskell 98 it is not possible to declare
a type for @ys@; a major benefit of scoped type variables is that
it becomes possible to do so.

Scoped type variables are implemented in both GHC and Hugs.  Where the
implementations differ from the specification below, those differences
are noted.

So much for the basic idea.  Here are the details.

<sect2>Scope and implicit quantification
<p>

<itemize>
<item> All the type variables mentioned in the patterns for a single 
function definition equation, that are not already in scope,
are brought into scope by the patterns.  We describe this set as
the <em/type variables bound by the equation/.

<item> The type variables thus brought into scope may be mentioned
in ordinary type signatures or pattern type signatures anywhere within
their scope.

<item> In ordinary type signatures, any type variable mentioned in the
signature that is in scope is <em/not/ universally quantified.

<item> Ordinary type signatures do not bring any new type variables
into scope (except in the type signature itself!). So this is illegal:

<tscreen><verb>
  f :: a -> a
  f x = x::a
</verb></tscreen>

It's illegal because @a@ is not in scope in the body of @f@,
so the ordinary signature @x::a@ is equivalent to @x::forall a.a@;
and that is an incorrect typing.

<item> There is no implicit universal quantification on pattern type
signatures, nor may one write an explicit @forall@ type in a pattern
type signature.  The pattern type signature is a monotype.

<item> 
The type variables in the head of a @class@ or @instance@ declaration
scope over the methods defined in the @where@ part.  For example:

<tscreen><verb>
  class C a where
    op :: [a] -> a

    op xs = let ys::[a]
		ys = reverse xs
	    in
	    head ys
</verb></tscreen>

(Not implemented in Hugs yet, Dec 98).
</itemize>
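
To make the first few rules concrete, here is a small sketch.  The
function @pairReversed@ is made up for this illustration.  The
@LANGUAGE@ line is an assumption that only matters for compilers newer
than the ones described here, which split @-fglasgow-exts@ into
individual extensions; older compilers should simply ignore it.

<tscreen><verb>
{-# LANGUAGE ScopedTypeVariables #-}  -- assumption: newer GHCs only

-- The separate signature uses p and q; the patterns bind a and b.
pairReversed :: [p] -> [q] -> [(p, q)]
pairReversed (xs::[a]) (ys::[b]) = zip rxs rys
  where
    rxs :: [a]          -- the a bound by the pattern (xs::[a])
    rxs = reverse xs
    rys :: [b]          -- the b bound by the pattern (ys::[b])
    rys = reverse ys
</verb></tscreen>

Here @a@ and @b@ are the type variables bound by the equation.  Neither
signature in the @where@ clause quantifies over them, so @rxs@ has the
same type as @xs@ and @rys@ the same type as @ys@.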

<sect2>Polymorphism
<p>

<itemize>
<item> Pattern type signatures are completely orthogonal to ordinary, separate
type signatures.  The two can be used independently or together.  There is
no scoping associated with the names of the type variables in a separate type signature.

<tscreen><verb>
   f :: [a] -> [a]
   f (xs::[b]) = reverse xs
</verb></tscreen>

<item> The function must be polymorphic in the type variables
bound by all its equations.  Operationally, the type variables bound
by one equation must not:

<itemize>
<item> Be unified with a type (such as @Int@, or @[a]@).
<item> Be unified with a type variable free in the environment.
<item> Be unified with each other.  (They may unify with the type variables 
bound by another equation for the same function, of course.)
</itemize>

For example, the following all fail to type check:

<tscreen><verb>
  f (x::a) (y::b) = [x,y]	-- a unifies with b

  g (x::a) = x + 1::Int		-- a unifies with Int

  h x = let k (y::a) = [x,y]	-- a is free in the
	in k x			-- environment

  k (x::a) True    = ...	-- a unifies with Int
  k (x::Int) False = ...

  w :: [b] -> [b]
  w (x::a) = x			-- a unifies with [b]
</verb></tscreen>

<item> The pattern-bound type variable may, however, be constrained
by the context of the principal type, thus:

<tscreen><verb>
  f (x::a) (y::a) = x+y*2
</verb></tscreen>

gets the inferred type: @forall a. Num a => a -> a -> a@.
</itemize>

<sect2>Result type signatures
<p>

<itemize>
<item> The result type of a function can be given a signature,
thus:

<tscreen><verb>
  f (x::a) :: [a] = [x,x,x]
</verb></tscreen>

The final @":: [a]"@ after all the patterns gives a signature to the
result type.  Sometimes this is the only way of naming the type variable
you want:

<tscreen><verb>
  f :: Int -> [a] -> [a]
  f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
			in \xs -> map g (reverse xs `zip` xs)
</verb></tscreen>

</itemize>

Result type signatures are not yet implemented in Hugs.

<sect2>Pattern signatures on other constructs
<p>

<itemize>
<item> A pattern type signature can be on an arbitrary sub-pattern, not
just on a variable:

<tscreen><verb>
  f ((x,y)::(a,b)) = (y,x) :: (b,a)
</verb></tscreen>

<item> Pattern type signatures, including the result part, can be used
in lambda abstractions:

<tscreen><verb>
  (\ (x::a, y) :: a -> x)
</verb></tscreen>

Type variables bound by these patterns must be polymorphic in
the sense defined above.
For example:

<tscreen><verb>
  f1 (x::c) = f1 x	-- ok
  f2 = \(x::c) -> f2 x	-- not ok
</verb></tscreen>

Here, @f1@ is OK, but @f2@ is not, because @c@ gets unified
with a type variable free in the environment, in this
case, the type of @f2@, which is in the environment when
the lambda abstraction is checked.

<item> Pattern type signatures, including the result part, can be used
in @case@ expressions:

<tscreen><verb>
  case e of { (x::a, y) :: a -> x } 
</verb></tscreen>

The pattern-bound type variables must, as usual, 
be polymorphic in the following sense: each case alternative,
considered as a lambda abstraction, must be polymorphic.
Thus this is OK:

<tscreen><verb>
  case (True,False) of { (x::a, y) -> x }
</verb></tscreen>

Even though the context is that of a pair of booleans, 
the alternative itself is polymorphic.  Of course, it is
also OK to say:

<tscreen><verb>
  case (True,False) of { (x::Bool, y) -> x }
</verb></tscreen>

<item>
To avoid ambiguity, the type after the ``@::@'' in a result
pattern signature on a lambda or @case@ must be atomic (i.e. a single
token or a parenthesised type of some sort).  To see why, 
consider how one would parse this:

<tscreen><verb>
  \ x :: a -> b -> x
</verb></tscreen>

<item> Pattern type signatures that bind new type variables
may not be used in pattern bindings at all.
So this is illegal:

<tscreen><verb>
  f x = let (y, z::a) = x in ...
</verb></tscreen>

But these are OK, because they do not bind fresh type variables:

<tscreen><verb>
  f1 x            = let (y, z::Int) = x in ...
  f2 (x::(Int,a)) = let (y, z::a)   = x in ...
</verb></tscreen>

However, a single variable is considered a degenerate function binding,
rather than a degenerate pattern binding, so this is permitted, even
though it binds a type variable:

<tscreen><verb>
  f :: (b->b) = \(x::b) -> x
</verb></tscreen>

</itemize>
Such degenerate function bindings do not fall under the monomorphism
restriction.  Thus:

<tscreen><verb>
  g :: a -> a -> Bool = \x y -> x==y
</verb></tscreen>

Here @g@ has type @forall a. Eq a => a -> a -> Bool@, just as if
@g@ had a separate type signature.  Lacking a type signature, @g@
would get a monomorphic type.

<sect2>Existentials
<p>

<itemize>
<item> Pattern type signatures can bind existential type variables.
For example:

<tscreen><verb>
  data T = forall a. MkT [a]

  f :: T -> T
  f (MkT [t::a]) = MkT t3
		 where
		   t3::[a] = [t,t,t]
</verb></tscreen>

</itemize>

%-----------------------------------------------------------------------------
<sect1>Pragmas
<label id="pragmas">
<p>

GHC supports several pragmas, or instructions to the compiler placed
in the source code.  Pragmas don't affect the meaning of the program,
but they might affect the efficiency of the generated code.

<sect2>INLINE pragma
<label id="inline-pragma">
<nidx>INLINE pragma</nidx>
<nidx>pragma, INLINE</nidx>
<p>

GHC (with @-O@, as always) tries to inline (or ``unfold'')
functions/values that are ``small enough,'' thus avoiding the call
overhead and possibly exposing other more-wonderful optimisations.

You will probably see these unfoldings (in Core syntax) in your
interface files.

Normally, if GHC decides a function is ``too expensive'' to inline, it
will not do so, nor will it export that unfolding for other modules to
use.

The sledgehammer you can bring to bear is the
@INLINE@<nidx>INLINE pragma</nidx> pragma, used thus:
<tscreen><verb>
key_function :: Int -> String -> (Bool, Double) 

#ifdef __GLASGOW_HASKELL__
{-# INLINE key_function #-}
#endif
</verb></tscreen>
(You don't need to do the C pre-processor carry-on unless you're going
to stick the code through HBC---it doesn't like @INLINE@ pragmas.)

The major effect of an @INLINE@ pragma is to declare a function's
``cost'' to be very low.  The normal unfolding machinery will then be
very keen to inline it.

An @INLINE@ pragma for a function can be put anywhere its type
signature could be put.

@INLINE@ pragmas are a particularly good idea for the
@then@/@return@ (or @bind@/@unit@) functions in a monad.
For example, in GHC's own @UniqueSupply@ monad code, we have:
<tscreen><verb>
#ifdef __GLASGOW_HASKELL__
{-# INLINE thenUs #-}
{-# INLINE returnUs #-}
#endif
</verb></tscreen>
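
To show the shape of such code, here is a self-contained sketch.  It
is not GHC's actual @UniqueSupply@ code: the type @Us@, its
representation as a function on a counter, and the helpers
@getUnique@ and @labelPair@ are all invented for this illustration.

<tscreen><verb>
-- Hypothetical supply monad: a function from the next free
-- number to a result paired with the updated counter.
type Us a = Int -> (a, Int)

thenUs :: Us a -> (a -> Us b) -> Us b
thenUs m k = \n -> case m n of
                     (r, n') -> k r n'

returnUs :: a -> Us a
returnUs r = \n -> (r, n)

{-# INLINE thenUs #-}
{-# INLINE returnUs #-}

-- Hand out the next number and bump the counter.
getUnique :: Us Int
getUnique = \n -> (n, n + 1)

-- Tag two values with fresh numbers.
labelPair :: a -> b -> Us ((Int, a), (Int, b))
labelPair x y =
  getUnique `thenUs` \i ->
  getUnique `thenUs` \j ->
  returnUs ((i, x), (j, y))
</verb></tscreen>

Every use of the monad goes through @thenUs@ and @returnUs@, so
inlining them lets the compiler see through the plumbing at each call
site in functions such as @labelPair@.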

<sect2>NOINLINE pragma
<label id="noinline-pragma">
<p>
<nidx>NOINLINE pragma</nidx>
<nidx>pragma, NOINLINE</nidx>

The @NOINLINE@ pragma does exactly what you'd expect: it stops the
named function from being inlined by the compiler.  You shouldn't ever
need to do this, unless you're very cautious about code size.
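
The pragma is written just like @INLINE@.  A minimal sketch, in which
the function @renderReport@ is a made-up stand-in for something bulky
that is called from many places:

<tscreen><verb>
-- Hypothetical function with many call sites; we would rather
-- have one shared copy than a copy at every call.
renderReport :: [(String, Int)] -> String
renderReport rows = unlines (map line rows)
  where
    line (name, n) = name ++ ": " ++ show n

{-# NOINLINE renderReport #-}
</verb></tscreen>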

<sect2>SPECIALIZE pragma
<label id="specialize-pragma">
<p>
<nidx>SPECIALIZE pragma</nidx>
<nidx>pragma, SPECIALIZE</nidx>
<nidx>overloading, death to</nidx>

(UK spelling also accepted.)  For key overloaded functions, you can
create extra versions (NB: more code space) specialised to particular
types.  Suppose you have an overloaded function:

<tscreen><verb>
hammeredLookup :: Ord key => [(key, value)] -> key -> value
</verb></tscreen>

If it is heavily used on lists with @Widget@ keys, you could
specialise it as follows:
<tscreen><verb>
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
</verb></tscreen>
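
Put together as one fragment, the example might read as follows.  Only
the two signatures come from the text above; the @Widget@ type and the
body of @hammeredLookup@ are assumptions made up for this sketch.

<tscreen><verb>
-- Hypothetical key type; any instance of Ord would do.
newtype Widget = Widget Int deriving (Eq, Ord)

hammeredLookup :: Ord key => [(key, value)] -> key -> value
hammeredLookup ((k, v) : rest) key
  | key == k  = v
  | otherwise = hammeredLookup rest key
hammeredLookup [] _ = error "hammeredLookup: key not found"

{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
</verb></tscreen>

Callers need not change anything: they go on calling
@hammeredLookup@, and it is up to the compiler to pick the specialised
copy where the key type is @Widget@.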

To get very fancy, you can also specify a named function to use for
the specialised value, by adding @= blah@, as in:
<tscreen><verb>
{-# SPECIALIZE hammeredLookup :: ...as before... = blah #-}
</verb></tscreen>
It's <em>Your Responsibility</em> to make sure that @blah@ really
behaves as a specialised version of @hammeredLookup@!!!

NOTE: the @=blah@ feature isn't implemented in GHC 4.xx.

An example in which the @= blah@ form will Win Big:
<tscreen><verb>
toDouble :: Real a => a -> Double
toDouble = fromRational . toRational

{-# SPECIALIZE toDouble :: Int -> Double = i2d #-}
i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
</verb></tscreen>
The @i2d@ function is virtually one machine instruction; the
default conversion---via an intermediate @Rational@---is obscenely
expensive by comparison.

By using the US spelling, your @SPECIALIZE@ pragma will work with
HBC, too.  Note that HBC doesn't support the @= blah@ form.

A @SPECIALIZE@ pragma for a function can be put anywhere its type
signature could be put.

<sect2>SPECIALIZE instance pragma
<label id="specialize-instance-pragma">
<p>
<nidx>SPECIALIZE pragma</nidx>
<nidx>overloading, death to</nidx>
Same idea, except for instance declarations.  For example:
<tscreen><verb>
instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }

{-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
</verb></tscreen>
Compatible with HBC, by the way.

<sect2>LINE pragma
<label id="line-pragma">
<p>
<nidx>LINE pragma</nidx>
<nidx>pragma, LINE</nidx>

This pragma is similar to C's @#line@ directive, and is mainly for use in
automatically generated Haskell code.  It lets you specify the line
number and filename of the original code; for example

<tscreen><verb>
{-# LINE 42 "Foo.vhs" #-}
</verb></tscreen>

if you'd generated the current file from something called @Foo.vhs@
and this line corresponds to line 42 in the original.  GHC will adjust
its error messages to refer to the line/file named in the @LINE@
pragma.
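
As a sketch of what a generator's output might look like (the function
@double@ and the idea that it came from line 42 of @Foo.vhs@ are
assumptions for illustration only):

<tscreen><verb>
-- Hypothetical preprocessor output, generated from Foo.vhs.
{-# LINE 42 "Foo.vhs" #-}
double :: Int -> Int
double x = x + x
</verb></tscreen>

The line following the pragma is treated as line 42 of @Foo.vhs@, so
an error in the signature of @double@ would be reported there rather
than at its position in the generated file.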