Specialise.hs 94.5 KB
Newer Older
Austin Seipp's avatar
Austin Seipp committed
1 2 3
{-
(c) The GRASP/AQUA Project, Glasgow University, 1993-1998

4
\section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
Austin Seipp's avatar
Austin Seipp committed
5
-}
6

7
{-# LANGUAGE CPP #-}
8
module Specialise ( specProgram, specUnfolding ) where
9

10
#include "HsVersions.h"
11

12
import Id
13 14
import TcType hiding( substTy )
import Type   hiding( substTy, extendTvSubstList )
15
import Module( Module, HasModule(..) )
16
import Coercion( Coercion )
17
import CoreMonad
18
import qualified CoreSubst
19
import CoreUnfold
20 21
import VarSet
import VarEnv
22
import CoreSyn
23
import Rules
24
import CoreUtils        ( exprIsTrivial, applyTypeToArgs, mkCast )
25
import CoreFVs          ( exprFreeVars, exprsFreeVars, idFreeVars, exprsFreeIdsList )
26
import UniqSupply
27
import Name
28
import MkId             ( voidArgId, voidPrimId )
29 30
import Maybes           ( catMaybes, isJust )
import BasicTypes
31
import HscTypes
32
import Bag
33
import DynFlags
Ian Lynagh's avatar
Ian Lynagh committed
34
import Util
35
import Outputable
36
import FastString
37
import State
niteria's avatar
niteria committed
38
import UniqDFM
niteria's avatar
niteria committed
39
import TrieMap
40

41
import Control.Monad
quchen's avatar
quchen committed
42 43 44
#if __GLASGOW_HASKELL__ > 710
import qualified Control.Monad.Fail as MonadFail
#endif
45

Austin Seipp's avatar
Austin Seipp committed
46 47 48
{-
************************************************************************
*                                                                      *
49
\subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
Austin Seipp's avatar
Austin Seipp committed
50 51
*                                                                      *
************************************************************************
52 53

These notes describe how we implement specialisation to eliminate
sof's avatar
sof committed
54
overloading.
55

sof's avatar
sof committed
56
The specialisation pass works on Core
57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76
syntax, complete with all the explicit dictionary application,
abstraction and construction as added by the type checker.  The
existing type checker remains largely as it is.

One important thought: the {\em types} passed to an overloaded
function, and the {\em dictionaries} passed are mutually redundant.
If the same function is applied to the same type(s) then it is sure to
be applied to the same dictionary(s)---or rather to the same {\em
values}.  (The arguments might look different but they will evaluate
to the same value.)

Second important thought: we know that we can make progress by
treating dictionary arguments as static and worth specialising on.  So
we can do without binding-time analysis, and instead specialise on
dictionary arguments and no others.

The basic idea
~~~~~~~~~~~~~~
Suppose we have

77 78
        let f = <f_rhs>
        in <body>
79

80
and suppose f is overloaded.
81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97

STEP 1: CALL-INSTANCE COLLECTION

We traverse <body>, accumulating all applications of f to types and
dictionaries.

(Might there be partial applications, to just some of its types and
dictionaries?  In principle yes, but in practice the type checker only
builds applications of f to all its types and dictionaries, so partial
applications could only arise as a result of transformation, and even
then I think it's unlikely.  In any case, we simply don't accumulate such
partial applications.)


STEP 2: EQUIVALENCES

So now we have a collection of calls to f:
98 99 100
        f t1 t2 d1 d2
        f t3 t4 d3 d4
        ...
101 102 103 104 105 106 107 108 109 110 111
Notice that f may take several type arguments.  To avoid ambiguity, we
say that f is called at type t1/t2 and t3/t4.

We take equivalence classes using equality of the *types* (ignoring
the dictionary args, which as mentioned previously are redundant).

STEP 3: SPECIALISATION

For each equivalence class, choose a representative (f t1 t2 d1 d2),
and create a local instance of f, defined thus:

112
        f@t1/t2 = <f_rhs> t1 t2 d1 d2
113

sof's avatar
sof committed
114 115 116 117 118 119
f_rhs presumably has some big lambdas and dictionary lambdas, so lots
of simplification will now result.  However we don't actually *do* that
simplification.  Rather, we leave it for the simplifier to do.  If we
*did* do it, though, we'd get more call instances from the specialised
RHS.  We can work out what they are by instantiating the call-instance
set from f's RHS with the types t1, t2.
120 121 122 123 124 125 126 127 128 129

Add this new id to f's IdInfo, to record that f has a specialised version.

Before doing any of this, check that f's IdInfo doesn't already
tell us about an existing instance of f at the required type/s.
(This might happen if specialisation was applied more than once, or
it might arise from user SPECIALIZE pragmas.)

Recursion
~~~~~~~~~
130
Wait a minute!  What if f is recursive?  Then we can't just plug in
131 132 133 134 135
its right-hand side, can we?

But it's ok.  The type checker *always* creates non-recursive definitions
for overloaded recursive functions.  For example:

136
        f x = f (x+x)           -- Yes I know its silly
137 138 139

becomes

140 141 142 143 144
        f a (d::Num a) = let p = +.sel a d
                         in
                         letrec fl (y::a) = fl (p y y)
                         in
                         fl
145

sof's avatar
sof committed
146 147
We still have recusion for non-overloaded functions which we
speciailise, but the recursive call should get specialised to the
148 149 150 151 152 153 154 155 156 157 158 159 160 161 162
same recursive version.


Polymorphism 1
~~~~~~~~~~~~~~

All this is crystal clear when the function is applied to *constant
types*; that is, types which have no type variables inside.  But what if
it is applied to non-constant types?  Suppose we find a call of f at type
t1/t2.  There are two possibilities:

(a) The free type variables of t1, t2 are in scope at the definition point
of f.  In this case there's no problem, we proceed just as before.  A common
example is as follows.  Here's the Haskell:

163 164
        g y = let f x = x+x
              in f y + f y
165 166 167

After typechecking we have

168 169
        g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
                                in +.sel a d (f a d y) (f a d y)
170 171 172 173

Notice that the call to f is at type type "a"; a non-constant type.
Both calls to f are at the same type, so we can specialise to give:

174 175
        g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
                                in +.sel a d (f@a y) (f@a y)
176 177 178 179 180


(b) The other case is when the type variables in the instance types
are *not* in scope at the definition point of f.  The example we are
working with above is a good case.  There are two instances of (+.sel a d),
181
but "a" is not in scope at the definition of +.sel.  Can we do anything?
182 183 184
Yes, we can "common them up", a sort of limited common sub-expression deal.
This would give:

185 186 187
        g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
                                    f@a (x::a) = +.sel@a x x
                                in +.sel@a (f@a y) (f@a y)
188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205

This can save work, and can't be spotted by the type checker, because
the two instances of +.sel weren't originally at the same type.

Further notes on (b)

* There are quite a few variations here.  For example, the defn of
  +.sel could be floated ouside the \y, to attempt to gain laziness.
  It certainly mustn't be floated outside the \d because the d has to
  be in scope too.

* We don't want to inline f_rhs in this case, because
that will duplicate code.  Just commoning up the call is the point.

* Nothing gets added to +.sel's IdInfo.

* Don't bother unless the equivalence class has more than one item!

206
Not clear whether this is all worth it.  It is of course OK to
207 208 209 210 211 212
simply discard call-instances when passing a big lambda.

Polymorphism 2 -- Overloading
~~~~~~~~~~~~~~
Consider a function whose most general type is

213
        f :: forall a b. Ord a => [a] -> b -> b
214 215 216 217 218 219 220 221 222

There is really no point in making a version of g at Int/Int and another
at Int/Bool, because it's only instancing the type variable "a" which
buys us any efficiency. Since g is completely polymorphic in b there
ain't much point in making separate versions of g for the different
b types.

That suggests that we should identify which of g's type variables
are constrained (like "a") and which are unconstrained (like "b").
223
Then when taking equivalence classes in STEP 2, we ignore the type args
224 225 226
corresponding to unconstrained type variable.  In STEP 3 we make
polymorphic versions.  Thus:

227
        f@t1/ = /\b -> <f_rhs> t1 b d1 d2
228

sof's avatar
sof committed
229
We do this.
230 231


sof's avatar
sof committed
232 233 234
Dictionary floating
~~~~~~~~~~~~~~~~~~~
Consider this
235

236 237 238
        f a (d::Num a) = let g = ...
                         in
                         ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
239

sof's avatar
sof committed
240 241 242 243 244
Here, g is only called at one type, but the dictionary isn't in scope at the
definition point for g.  Usually the type checker would build a
definition for d1 which enclosed g, but the transformation system
might have moved d1's defn inward.  Solution: float dictionary bindings
outwards along with call instances.
245 246 247

Consider

248 249 250 251
        f x = let g p q = p==q
                  h r s = (r+s, g r s)
              in
              h x x
252 253 254 255


Before specialisation, leaving out type abstractions we have

256 257 258 259 260 261 262
        f df x = let g :: Eq a => a -> a -> Bool
                     g dg p q = == dg p q
                     h :: Num a => a -> a -> (a, Bool)
                     h dh r s = let deq = eqFromNum dh
                                in (+ dh r s, g deq r s)
              in
              h df x x
263 264 265

After specialising h we get a specialised version of h, like this:

266 267
                    h' r s = let deq = eqFromNum df
                             in (+ df r s, g deq r s)
268 269

But we can't naively make an instance for g from this, because deq is not in scope
270
at the defn of g.  Instead, we have to float out the (new) defn of deq
271 272 273 274 275 276 277 278
to widen its scope.  Notice that this floating can't be done in advance -- it only
shows up when specialisation is done.

User SPECIALIZE pragmas
~~~~~~~~~~~~~~~~~~~~~~~
Specialisation pragmas can be digested by the type checker, and implemented
by adding extra definitions along with that of f, in the same way as before

279
        f@t1/t2 = <f_rhs> t1 t2 d1 d2
280 281 282 283

Indeed the pragmas *have* to be dealt with by the type checker, because
only it knows how to build the dictionaries d1 and d2!  For example

284 285
        g :: Ord a => [a] -> [a]
        {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
286 287 288 289 290 291 292 293 294 295 296

Here, the specialised version of g is an application of g's rhs to the
Ord dictionary for (Tree Int), which only the type checker can conjure
up.  There might not even *be* one, if (Tree Int) is not an instance of
Ord!  (All the other specialision has suitable dictionaries to hand
from actual calls.)

Problem.  The type checker doesn't have to hand a convenient <f_rhs>, because
it is buried in a complex (as-yet-un-desugared) binding group.
Maybe we should say

297
        f@t1/t2 = f* t1 t2 d1 d2
298 299 300 301 302 303 304 305 306

where f* is the Id f with an IdInfo which says "inline me regardless!".
Indeed all the specialisation could be done in this way.
That in turn means that the simplifier has to be prepared to inline absolutely
any in-scope let-bound thing.


Again, the pragma should permit polymorphism in unconstrained variables:

307 308
        h :: Ord a => [a] -> b -> b
        {-# SPECIALIZE h :: [Int] -> b -> b #-}
309 310 311

We *insist* that all overloaded type variables are specialised to ground types,
(and hence there can be no context inside a SPECIALIZE pragma).
312
We *permit* unconstrained type variables to be specialised to
313 314
        - a ground type
        - or left as a polymorphic type variable
315 316
but nothing in between.  So

317
        {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
318

319 320 321 322 323 324 325 326
is *illegal*.  (It can be handled, but it adds complication, and gains the
programmer nothing.)


SPECIALISING INSTANCE DECLARATIONS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider

327 328 329
        instance Foo a => Foo [a] where
                ...
        {-# SPECIALIZE instance Foo [Int] #-}
330 331 332 333

The original instance decl creates a dictionary-function
definition:

334
        dfun.Foo.List :: forall a. Foo a -> Foo [a]
335 336 337 338

The SPECIALIZE pragma just makes a specialised copy, just as for
ordinary function definitions:

339 340
        dfun.Foo.List@Int :: Foo [Int]
        dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
341 342 343 344 345 346 347 348 349 350 351 352 353 354

The information about what instance of the dfun exist gets added to
the dfun's IdInfo in the same way as a user-defined function too.


Automatic instance decl specialisation?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Can instance decls be specialised automatically?  It's tricky.
We could collect call-instance information for each dfun, but
then when we specialised their bodies we'd get new call-instances
for ordinary functions; and when we specialised their bodies, we might get
new call-instances of the dfuns, and so on.  This all arises because of
the unrestricted mutual recursion between instance decls and value decls.

sof's avatar
sof committed
355 356 357
Still, there's no actual problem; it just means that we may not do all
the specialisation we could theoretically do.

358 359 360 361 362
Furthermore, instance decls are usually exported and used non-locally,
so we'll want to compile enough to get those specialisations done.

Lastly, there's no such thing as a local instance decl, so we can
survive solely by spitting out *usage* information, and then reading that
363
back in as a pragma when next compiling the file.  So for now,
364 365 366 367 368 369 370 371 372 373 374 375 376
we only specialise instance decls in response to pragmas.


SPITTING OUT USAGE INFORMATION
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To spit out usage information we need to traverse the code collecting
call-instance information for all imported (non-prelude?) functions
and data types. Then we equivalence-class it and spit it out.

This is done at the top-level when all the call instances which escape
must be for imported functions and data types.

sof's avatar
sof committed
377 378
*** Not currently done ***

379 380 381 382 383

Partial specialisation by pragmas
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
What about partial specialisation:

384 385
        k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
        {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
386 387 388

or even

389
        {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
390 391 392

Seems quite reasonable.  Similar things could be done with instance decls:

393 394 395 396
        instance (Foo a, Foo b) => Foo (a,b) where
                ...
        {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
        {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
397 398 399 400

Ho hum.  Things are complex enough without this.  I pass.


Gabor Greif's avatar
Gabor Greif committed
401
Requirements for the simplifier
402 403 404 405 406 407
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The simplifier has to be able to take advantage of the specialisation.

* When the simplifier finds an application of a polymorphic f, it looks in
f's IdInfo in case there is a suitable instance to call instead.  This converts

408
        f t1 t2 d1 d2   ===>   f_t1_t2
409 410 411 412 413 414

Note that the dictionaries get eaten up too!

* Dictionary selection operations on constant dictionaries must be
  short-circuited:

415
        +.sel Int d     ===>  +Int
416 417 418 419 420 421 422 423 424 425 426

The obvious way to do this is in the same way as other specialised
calls: +.sel has inside it some IdInfo which tells that if it's applied
to the type Int then it should eat a dictionary and transform to +Int.

In short, dictionary selectors need IdInfo inside them for constant
methods.

* Exactly the same applies if a superclass dictionary is being
  extracted:

427
        Eq.sel Int d   ===>   dEqInt
428 429 430 431 432

* Something similar applies to dictionary construction too.  Suppose
dfun.Eq.List is the function taking a dictionary for (Eq a) to
one for (Eq [a]).  Then we want

433
        dfun.Eq.List Int d      ===> dEq.List_Int
434 435 436 437 438 439 440

Where does the Eq [Int] dictionary come from?  It is built in
response to a SPECIALIZE pragma on the Eq [a] instance decl.

In short, dfun Ids need IdInfo with a specialisation for each
constant instance of their instance declaration.

sof's avatar
sof committed
441 442
All this uses a single mechanism: the SpecEnv inside an Id

443 444 445 446

What does the specialisation IdInfo look like?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

sof's avatar
sof committed
447 448
The SpecEnv of an Id maps a list of types (the template) to an expression

449
        [Type]  |->  Expr
450

451
For example, if f has this RuleInfo:
452

453
        [Int, a]  ->  \d:Ord Int. f' a
454

sof's avatar
sof committed
455
it means that we can replace the call
456

457
        f Int t  ===>  (\d. f' t)
sof's avatar
sof committed
458 459 460

This chucks one dictionary away and proceeds with the
specialised version of f, namely f'.
461 462 463 464 465 466 467


What can't be done this way?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There is no way, post-typechecker, to get a dictionary for (say)
Eq a from a dictionary for Eq [a].  So if we find

468
        ==.sel [t] d
469

470
we can't transform to
471

472
        eqList (==.sel t d')
473

474
where
475
        eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
476 477 478 479 480 481 482 483 484

Of course, we currently have no way to automatically derive
eqList, nor to connect it to the Eq [a] instance decl, but you
can imagine that it might somehow be possible.  Taking advantage
of this is permanently ruled out.

Still, this is no great hardship, because we intend to eliminate
overloading altogether anyway!

sof's avatar
sof committed
485 486 487 488
A note about non-tyvar dictionaries
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Some Ids have types like

489
        forall a,b,c. Eq a -> Ord [a] -> tau
sof's avatar
sof committed
490 491 492 493

This seems curious at first, because we usually only have dictionary
args whose types are of the form (C a) where a is a type variable.
But this doesn't hold for the functions arising from instance decls,
Krzysztof Gogolewski's avatar
Typos  
Krzysztof Gogolewski committed
494
which sometimes get arguments with types of form (C (T a)) for some
sof's avatar
sof committed
495 496 497 498
type constructor T.

Should we specialise wrt this compound-type dictionary?  We used to say
"no", saying:
499 500 501 502 503 504
        "This is a heuristic judgement, as indeed is the fact that we
        specialise wrt only dictionaries.  We choose *not* to specialise
        wrt compound dictionaries because at the moment the only place
        they show up is in instance decls, where they are simply plugged
        into a returned dictionary.  So nothing is gained by specialising
        wrt them."
sof's avatar
sof committed
505 506

But it is simpler and more uniform to specialise wrt these dicts too;
507
and in future GHC is likely to support full fledged type signatures
sof's avatar
sof committed
508
like
509
        f :: Eq [(a,b)] => ...
sof's avatar
sof committed
510

511

Austin Seipp's avatar
Austin Seipp committed
512 513
************************************************************************
*                                                                      *
514
\subsubsection{The new specialiser}
Austin Seipp's avatar
Austin Seipp committed
515 516
*                                                                      *
************************************************************************
517 518

Our basic game plan is this.  For let(rec) bound function
519
        f :: (C a, D c) => (a,b,c,d) -> Bool
520

521
* Find any specialised calls of f, (f ts ds), where
522 523 524 525 526
  ts are the type arguments t1 .. t4, and
  ds are the dictionary arguments d1 .. d2.

* Add a new definition for f1 (say):

527
        f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
528 529 530 531 532

  Note that we abstract over the unconstrained type arguments.

* Add the mapping

533
        [t1,b,t3,d]  |->  \d1 d2 -> f1 b d
534 535

  to the specialisations of f.  This will be used by the
536 537
  simplifier to replace calls
                (f t1 t2 t3 t4) da db
538
  by
539
                (\d1 d1 -> f1 t2 t4) da db
540 541 542 543 544 545 546 547 548 549 550 551

  All the stuff about how many dictionaries to discard, and what types
  to apply the specialised function to, are handled by the fact that the
  SpecEnv contains a template for the result of the specialisation.

We don't build *partial* specialisations for f.  For example:

  f :: Eq a => a -> a -> Bool
  {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}

Here, little is gained by making a specialised copy of f.
There's a distinct danger that the specialised version would
552
first build a dictionary for (Eq b, Eq c), and then select the (==)
553 554 555 556 557
method from it!  Even if it didn't, not a great deal is saved.

We do, however, generate polymorphic, but not overloaded, specialisations:

  f :: Eq a => [a] -> b -> b -> b
Austin Seipp's avatar
Austin Seipp committed
558
  ... SPECIALISE f :: [Int] -> b -> b -> b ...
559

560
Hence, the invariant is this:
561

562
        *** no specialised version is overloaded ***
563 564


Austin Seipp's avatar
Austin Seipp committed
565 566
************************************************************************
*                                                                      *
567
\subsubsection{The exported function}
Austin Seipp's avatar
Austin Seipp committed
568 569 570
*                                                                      *
************************************************************************
-}
571

572
-- | Specialise calls to type-class overloaded functions occuring in a program.
Joachim Breitner's avatar
Joachim Breitner committed
573
specProgram :: ModGuts -> CoreM ModGuts
574 575 576 577
specProgram guts@(ModGuts { mg_module = this_mod
                          , mg_rules = local_rules
                          , mg_binds = binds })
  = do { dflags <- getDynFlags
578

579
             -- Specialise the bindings of this module
580
       ; (binds', uds) <- runSpecM dflags this_mod (go binds)
581

582
             -- Specialise imported functions
583 584
       ; hpt_rules <- getRuleBase
       ; let rule_base = extendRuleBaseList hpt_rules local_rules
585
       ; (new_rules, spec_binds) <- specImports dflags this_mod top_env emptyVarSet
586
                                                [] rule_base (ud_calls uds)
587 588 589 590 591 592 593 594 595

             -- Don't forget to wrap the specialized bindings with bindings
             -- for the needed dictionaries.
             -- See Note [Wrap bindings returned by specImports]
       ; let spec_binds' = wrapDictBinds (ud_binds uds) spec_binds

       ; let final_binds
               | null spec_binds' = binds'
               | otherwise        = Rec (flattenBinds spec_binds') : binds'
596
                   -- Note [Glom the bindings if imported functions are specialised]
597 598 599

       ; return (guts { mg_binds = final_binds
                      , mg_rules = new_rules ++ local_rules }) }
600
  where
601 602 603 604 605
        -- We need to start with a Subst that knows all the things
        -- that are in scope, so that the substitution engine doesn't
        -- accidentally re-use a unique that's already in use
        -- Easiest thing is to do it all at once, as if all the top-level
        -- decls were mutually recursive
606 607 608
    top_env = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
                              bindersOfBinds binds
                 , se_interesting = emptyVarSet }
609

610 611
    go []           = return ([], emptyUDs)
    go (bind:binds) = do (binds', uds) <- go binds
612
                         (bind', uds') <- specBind top_env bind uds
613
                         return (bind' ++ binds', uds')
614

615 616 617 618 619 620 621 622 623 624 625 626
{-
Note [Wrap bindings returned by specImports]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'specImports' returns a set of specialized bindings. However, these are lacking
necessary floated dictionary bindings, which are returned by
UsageDetails(ud_binds). These dictionaries need to be brought into scope with
'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
for instance, the 'specImports' call in 'specProgram'.


Note [Disabling cross-module specialisation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
627
Since GHC 7.10 we have performed specialisation of INLINABLE bindings living
628 629 630 631 632 633 634 635 636
in modules outside of the current module. This can sometimes uncover user code
which explodes in size when aggressively optimized. The
-fno-cross-module-specialise option was introduced to allow users to being
bitten by such instances to revert to the pre-7.10 behavior.

See Trac #10491
-}

-- | Specialise a set of calls to imported bindings
637
specImports :: DynFlags
638
            -> Module
639
            -> SpecEnv          -- Passed in so that all top-level Ids are in scope
640
            -> VarSet           -- Don't specialise these ones
641
                                -- See Note [Avoiding recursive specialisation]
642
            -> [Id]             -- Stack of imported functions being specialised
643 644
            -> RuleBase         -- Rules from this module and the home package
                                -- (but not external packages, which can change)
645
            -> CallDetails      -- Calls for imported things, and floating bindings
646
            -> CoreM ( [CoreRule]   -- New rules
647 648
                     , [CoreBind] ) -- Specialised bindings
                                    -- See Note [Wrapping bindings returned by specImports]
649
specImports dflags this_mod top_env done callers rule_base cds
650 651 652 653 654
  -- See Note [Disabling cross-module specialisation]
  | not $ gopt Opt_CrossModuleSpecialise dflags =
    return ([], [])

  | otherwise =
niteria's avatar
niteria committed
655
    do { let import_calls = dVarEnvElts cds
656
       ; (rules, spec_binds) <- go rule_base import_calls
657
       ; return (rules, spec_binds) }
658
  where
659
    go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
660
    go _ [] = return ([], [])
niteria's avatar
niteria committed
661
    go rb (cis@(CIS fn _calls_for_fn) : other_calls)
662 663
      = do { (rules1, spec_binds1) <- specImport dflags this_mod top_env
                                                 done callers rb fn $
niteria's avatar
niteria committed
664
                                      ciSetToList cis
665 666 667
           ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
           ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }

668
specImport :: DynFlags
669
           -> Module
670
           -> SpecEnv               -- Passed in so that all top-level Ids are in scope
671
           -> VarSet                -- Don't specialise these
672
                                    -- See Note [Avoiding recursive specialisation]
673
           -> [Id]                  -- Stack of imported functions being specialised
674 675
           -> RuleBase              -- Rules from this module
           -> Id -> [CallInfo]      -- Imported function and calls for it
676 677
           -> CoreM ( [CoreRule]    -- New rules
                    , [CoreBind] )  -- Specialised bindings
678
specImport dflags this_mod top_env done callers rb fn calls_for_fn
679 680
  | fn `elemVarSet` done
  = return ([], [])     -- No warning.  This actually happens all the time
Gabor Greif's avatar
typos  
Gabor Greif committed
681
                        -- when specialising a recursive function, because
682 683
                        -- the RHS of the specialised function contains a recursive
                        -- call to the original function
684

685 686 687
  | null calls_for_fn   -- We filtered out all the calls in deleteCallsMentioning
  = return ([], [])

688 689
  | wantSpecImport dflags unfolding
  , Just rhs <- maybeUnfoldingTemplate unfolding
690
  = do {     -- Get rules from the external package state
691 692
             -- We keep doing this in case we "page-fault in"
             -- more rules as we go along
693
       ; hsc_env <- getHscEnv
694
       ; eps <- liftIO $ hscEPS hsc_env
695
       ; vis_orphs <- getVisibleOrphanMods
696
       ; let full_rb = unionRuleBase rb (eps_rule_base eps)
697
             rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
698

699 700 701
       ; (rules1, spec_pairs, uds) <- -- pprTrace "specImport1" (vcat [ppr fn, ppr calls_for_fn, ppr rhs]) $
                                      runSpecM dflags this_mod $
              specCalls (Just this_mod) top_env rules_for_fn calls_for_fn fn rhs
702
       ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
703 704
             -- After the rules kick in we may get recursion, but
             -- we rely on a global GlomBinds to sort that out later
705
             -- See Note [Glom the bindings if imported functions are specialised]
706 707

              -- Now specialise any cascaded calls
708
       ; (rules2, spec_binds2) <- -- pprTrace "specImport 2" (ppr fn $$ ppr rules1 $$ ppr spec_binds1) $
709 710 711 712 713
                                  specImports dflags this_mod top_env
                                              (extendVarSet done fn)
                                              (fn:callers)
                                              (extendRuleBaseList rb rules1)
                                              (ud_calls uds)
714 715 716 717 718 719

             -- Don't forget to wrap the specialized bindings with bindings
             -- for the needed dictionaries
             -- See Note [Wrap bindings returned by specImports]
       ; let final_binds = wrapDictBinds (ud_binds uds)
                                         (spec_binds2 ++ spec_binds1)
720

721
       ; return (rules2 ++ rules1, final_binds) }
722

723
  |  warnMissingSpecs dflags callers
724 725
  = do { warnMsg (vcat [ hang (text "Could not specialise imported function" <+> quotes (ppr fn))
                            2 (vcat [ text "when specialising" <+> quotes (ppr caller)
726
                                    | caller <- callers])
727
                      , ifPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
728
                      , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ])
729 730
       ; return ([], []) }

731
  | otherwise
732
  = return ([], [])
733 734
  where
    unfolding = realIdUnfolding fn   -- We want to see the unfolding even for loop breakers
735

736 737 738 739
warnMissingSpecs :: DynFlags -> [Id] -> Bool
-- See Note [Warning about missed specialisations]
warnMissingSpecs dflags callers
  | wopt Opt_WarnAllMissedSpecs dflags = True
740
  | not (wopt Opt_WarnMissedSpecs dflags) = False
741 742 743 744 745
  | null callers                       = False
  | otherwise                          = all has_inline_prag callers
  where
    has_inline_prag id = isAnyInlinePragma (idInlinePragma id)

746
wantSpecImport :: DynFlags -> Unfolding -> Bool
747
-- See Note [Specialise imported INLINABLE things]
748 749
wantSpecImport dflags unf
 = case unf of
750
     NoUnfolding      -> False
751
     BootUnfolding    -> False
752 753 754 755 756
     OtherCon {}      -> False
     DFunUnfolding {} -> True
     CoreUnfolding { uf_src = src, uf_guidance = _guidance }
       | gopt Opt_SpecialiseAggressively dflags -> True
       | isStableSource src -> True
757
               -- Specialise even INLINE things; it hasn't inlined yet,
758 759
               -- so perhaps it never will.  Moreover it may have calls
               -- inside it that we want to specialise
760
       | otherwise -> False    -- Stable, not INLINE, hence INLINABLE
761

762 763 764
{- Note [Warning about missed specialisations]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Suppose
765
 * In module Lib, you carefully mark a function 'foo' INLINABLE
766 767 768
 * Import Lib(foo) into another module M
 * Call 'foo' at some specialised type in M
Then you jolly well expect it to be specialised in M.  But what if
Gabor Greif's avatar
Gabor Greif committed
769
'foo' calls another function 'Lib.bar'.  Then you'd like 'bar' to be
770
specialised too.  But if 'bar' is not marked INLINABLE it may well
771 772 773 774 775 776 777 778
not be specialised.  The warning Opt_WarnMissedSpecs warns about this.

It's more noisy to warning about a missed specialisation opportunity
for /every/ overloaded imported function, but sometimes useful. That
is what Opt_WarnAllMissedSpecs does.

ToDo: warn about missed opportunities for local functions.

779 780
Note [Specialise imported INLINABLE things]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
781 782 783 784 785 786 787 788 789 790 791
What imported functions do we specialise?  The basic set is
 * DFuns and things with INLINABLE pragmas.
but with -fspecialise-aggressively we add
 * Anything with an unfolding template

Trac #8874 has a good example of why we want to auto-specialise DFuns.

We have the -fspecialise-aggressively flag (usually off), because we
risk lots of orphan modules from over-vigorous specialisation.
However it's not a big deal: anything non-recursive with an
unfolding-template will probably have been inlined already.
792 793 794

Note [Glom the bindings if imported functions are specialised]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
795
Suppose we have an imported, *recursive*, INLINABLE function
796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815
   f :: Eq a => a -> a
   f = /\a \d x. ...(f a d)...
In the module being compiled we have
   g x = f (x::Int)
Now we'll make a specialised function
   f_spec :: Int -> Int
   f_spec = \x -> ...(f Int dInt)...
   {-# RULE  f Int _ = f_spec #-}
   g = \x. f Int dInt x
Note that f_spec doesn't look recursive
After rewriting with the RULE, we get
   f_spec = \x -> ...(f_spec)...
BUT since f_spec was non-recursive before it'll *stay* non-recursive.
The occurrence analyser never turns a NonRec into a Rec.  So we must
make sure that f_spec is recursive.  Easiest thing is to make all
the specialisations for imported bindings recursive.


Note [Avoiding recursive specialisation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
816 817 818 819 820
When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
'f's RHS.  So we want to specialise g,h.  But we don't want to
specialise f any more!  It's possible that f's RHS might have a
recursive yet-more-specialised call, so we'd diverge in that case.
And if the call is to the same type, one specialisation is enough.
821
Avoiding this recursive specialisation loop is the reason for the
822 823
'done' VarSet passed to specImports and specImport.

Austin Seipp's avatar
Austin Seipp committed
824 825
************************************************************************
*                                                                      *
826
\subsubsection{@specExpr@: the main function}
Austin Seipp's avatar
Austin Seipp committed
827 828 829
*                                                                      *
************************************************************************
-}
830

Austin Seipp's avatar
Austin Seipp committed
831
data SpecEnv
832 833 834 835 836 837
  = SE { se_subst :: CoreSubst.Subst
             -- We carry a substitution down:
             -- a) we must clone any binding that might float outwards,
             --    to avoid name clashes
             -- b) we carry a type substitution to use when analysing
             --    the RHS of specialised bindings (no type-let!)
838

839

Austin Seipp's avatar
Austin Seipp committed
840
       , se_interesting :: VarSet
841 842 843 844 845 846 847 848 849
             -- Dict Ids that we know something about
             -- and hence may be worth specialising against
             -- See Note [Interesting dictionary arguments]
     }

specVar :: SpecEnv -> Id -> CoreExpr
specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v

specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
850

851 852 853 854 855 856 857
---------------- First the easy cases --------------------
specExpr env (Type ty)     = return (Type     (substTy env ty), emptyUDs)
specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
specExpr env (Var v)       = return (specVar env v, emptyUDs)
specExpr _   (Lit lit)     = return (Lit lit,       emptyUDs)
specExpr env (Cast e co)
  = do { (e', uds) <- specExpr env e
858
       ; return ((mkCast e' (substCo env co)), uds) }
859 860 861
specExpr env (Tick tickish body)
  = do { (body', uds) <- specExpr env body
       ; return (Tick (specTickish env tickish) body', uds) }
862 863

---------------- Applications might generate a call instance --------------------
864
specExpr env expr@(App {})
865
  = go expr []
866
  where
867
    go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
868 869
                               (fun', uds_app) <- go fun (arg':args)
                               return (App fun' arg', uds_arg `plusUDs` uds_app)
870

871 872
    go (Var f)       args = case specVar env f of
                                Var f' -> return (Var f', mkCallUDs env f' args)
873
                                e'     -> return (e', emptyUDs) -- I don't expect this!
874
    go other         _    = specExpr env other
875 876

---------------- Lambda/case require dumping of usage details --------------------
877 878
specExpr env e@(Lam _ _) = do
    (body', uds) <- specExpr env' body
879
    let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
880
    return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
881
  where
882
    (bndrs, body) = collectBinders e
883
    (env', bndrs') = substBndrs env bndrs
884 885
        -- More efficient to collect a group of binders together all at once
        -- and we don't want to split a lambda group with dumped bindings
886

887 888
specExpr env (Case scrut case_bndr ty alts)
  = do { (scrut', scrut_uds) <- specExpr env scrut
889
       ; (scrut'', case_bndr', alts', alts_uds)
890 891
             <- specCase env scrut' case_bndr alts
       ; return (Case scrut'' case_bndr' (substTy env ty) alts'
892
                , scrut_uds `plusUDs` alts_uds) }
893 894

---------------- Finally, let is the interesting case --------------------
895 896 897
specExpr env (Let bind body)
  = do { -- Clone binders
         (rhs_env, body_env, bind') <- cloneBindSM env bind
898

899 900
         -- Deal with the body
       ; (body', body_uds) <- specExpr body_env body
901

902
        -- Deal with the bindings
903
      ; (binds', uds) <- specBind rhs_env bind' body_uds
904 905

        -- All done
906
      ; return (foldr Let body' binds', uds) }
907

908 909 910
specTickish :: SpecEnv -> Tickish Id -> Tickish Id
specTickish env (Breakpoint ix ids)
  = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
911 912 913
  -- drop vars from the list if they have a non-variable substitution.
  -- should never happen, but it's harmless to drop them anyway.
specTickish _ other_tickish = other_tickish
914

915
specCase :: SpecEnv
916
         -> CoreExpr            -- Scrutinee, already done
917
         -> Id -> [CoreAlt]
918 919 920
         -> SpecM ( CoreExpr    -- New scrutinee
                  , Id
                  , [CoreAlt]
921
                  , UsageDetails)
922
specCase env scrut' case_bndr [(con, args, rhs)]
923
  | isDictId case_bndr           -- See Note [Floating dictionaries out of cases]
924
  , interestingDict env scrut'
925
  , not (isDeadBinder case_bndr && null sc_args')
926
  = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
927 928 929 930 931

       ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
                              [(con, args', Var sc_arg')]
                       | sc_arg' <- sc_args' ]

932
             -- Extend the substitution for RHS to map the *original* binders
Austin Seipp's avatar
Austin Seipp committed
933
             -- to their floated verions.
934
             mb_sc_flts :: [Maybe DictId]
935
             mb_sc_flts = map (lookupVarEnv clone_env) args'
936 937
             clone_env  = zipVarEnv sc_args' sc_args_flt
             subst_prs  = (case_bndr, Var case_bndr_flt)
938
                        : [ (arg, Var sc_flt)
939
                          | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]