      Retain InlinePragInfo on wrappers · c55bcbde
      For some reason, when doing the worker/wrapper split, we transferred the
      InlinePragInfo from the original function, but expunged it from the wrapper.
      This meant, for example, that a NOINLINE function would have its wrapper
      inlined, which isn't sensible.
      For a change, fixing a bug involves only deleting code!
      Remove NOINLINE strictness hack · 302265d5
      The strictness analyser used to have a HACK which ensured that NOINLINE things
      were not strictness-analysed.  The reason was unsafePerformIO. Left to itself,
      the strictness analyser would discover this strictness for unsafePerformIO:
      	unsafePerformIO:  C(U(AV))
      But then consider this sub-expression
      	unsafePerformIO (\s -> let r = f x in 
      			       case writeIORef v r s of (# s1, _ #) ->
      			       (# s1, r #)
      The strictness analyser will now find that r is sure to be eval'd,
      and may then hoist it out.  This makes tests/lib/should_run/memo002 deadlock.
      Solving this by making all NOINLINE things have no strictness info is overkill.
      In particular, it's overkill for runST, which is perfectly respectable.
      	f x = runST (return x)
      This should be strict in x.
      So the new plan is to define unsafePerformIO using the 'lazy' combinator:
      	unsafePerformIO (IO m) = lazy (case m realWorld# of (# _, r #) -> r)
      Remember, 'lazy' is a wired-in identity-function Id, of type a->a, which is 
      magically NON-STRICT, and is inlined after strictness analysis.  So
      unsafePerformIO will look non-strict, and that's what we want.
      Now we don't need the hack in the strictness analyser.
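A minimal sketch of the definition above, for readers who want to try it. The name sketchUnsafePerformIO is hypothetical, and the sketch assumes that GHC.Exts exports lazy and realWorld# and that GHC.IO exports the IO constructor, as in recent versions of base. It is not the library's actual definition of unsafePerformIO.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import GHC.Exts (lazy, realWorld#)
import GHC.IO (IO (..))

-- Hypothetical stand-in for unsafePerformIO, mirroring the definition
-- quoted in the commit message.  Wrapping the body in 'lazy' hides its
-- strictness from the strictness analyser.
{-# NOINLINE sketchUnsafePerformIO #-}
sketchUnsafePerformIO :: IO a -> a
sketchUnsafePerformIO (IO m) = lazy (case m realWorld# of (# _, r #) -> r)

main :: IO ()
main = print (sketchUnsafePerformIO (return (6 * 7 :: Int)))
```

The NOINLINE pragma here is a precaution of the sketch, not something the commit message prescribes.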
      Reorganisation of the source tree · 0065d5ab
      Most of the other users of the fptools build system have migrated to
      Cabal, and with the move to darcs we can now flatten the source tree
      without losing history, so here goes.
      The main change is that the ghc/ subdir is gone, and most of what it
      contained is now at the top level.  The build system now makes no
      pretense at being multi-project, it is just the GHC build system.
      No doubt this will break many things, and there will be a period of
      instability while we fix the dependencies.  A straightforward build
      should work, but I haven't yet fixed binary/source distributions.
      Changes to the Building Guide will follow, too.
      [project @ 2005-03-18 13:37:27 by simonmar] · d1c1b7d0
      Flags cleanup.
      Basically the purpose of this commit is to move more of the compiler's
      global state into DynFlags, which is moving in the direction we need
      to go for the GHC API which can have multiple active sessions
      supported by a single GHC instance.
      $ grep 'global_var' */*hs | wc -l
      $ grep 'global_var' */*hs | wc -l
      Well, it's an improvement.  Most of what's left won't really affect
      our ability to host multiple sessions.
      Lots of static flags have become dynamic flags (yay!).  Notably lots
      of flags that we used to think of as "driver" flags, like -I and -L,
      are now dynamic.  The most notable static flags left behind are the
      "way" flags, eg. -prof.  It would be nice to fix this, but it isn't
      On the way, lots of cleanup has happened.  Everything related to
      static and dynamic flags lives in StaticFlags and DynFlags
      respectively, and they share a common command-line parser library in
      CmdLineParser.  The flags related to modes (--make, --interactive
      etc.) are now private to the front end: in fact private to Main
      itself, for now.
      [project @ 2004-12-22 12:06:13 by simonpj] · d7c402a3
           New Core invariant: keep case alternatives in sorted order
      We now keep the alternatives of a Case in the Core language in sorted
      order.  Sorted, that is,
      	by constructor tag	for DataAlt
      	by literal		for LitAlt
      The main reason is that it makes matching and equality testing more robust.
      But in fact some lines of code vanished from SimplUtils.mkAlts.
      WARNING: no change to interface file formats, but you'll need to recompile
      your libraries so that they generate interface files that respect the
      [project @ 2004-09-30 10:35:15 by simonpj] · 23f40f0e
      	Add Generalised Algebraic Data Types
      This rather big commit adds support for GADTs.  For example,
          data Term a where
            Lit :: Int -> Term Int
            App :: Term (a->b) -> Term a -> Term b
            If  :: Term Bool -> Term a -> Term a -> Term a
          eval :: Term a -> a
          eval (Lit i) = i
          eval (App a b) = eval a (eval b)
          eval (If p q r) | eval p    = eval q
                          | otherwise = eval r
      Lots and lots of related changes throughout the compiler to make
      this fit nicely.
      One important change, only loosely related to GADTs, is that skolem
      constants in the typechecker are genuinely immutable and constant, so
      we often get better error messages from the type checker.  See
      There's a new module types/Unify.lhs, which has purely-functional
      unification and matching for Type. This is used both in the typechecker
      (for type refinement of GADTs) and in Core Lint (also for type refinement).
      [project @ 2003-10-09 11:58:39 by simonpj] · 98688c6e
      		GHC heart/lung transplant
      This major commit changes the way that GHC deals with importing
      types and functions defined in other modules, during renaming and
      typechecking.  On the way I've changed or cleaned up numerous other
      things, including many that I probably fail to mention here.
      Major benefit: GHC should suck in many fewer interface files when
      compiling (esp with -O).  (You can see this with -ddump-rn-stats.)
      It's also some 1500 lines of code shorter than before.
      **	So expect bugs!  I can do a 3-stage bootstrap, and run
      **	the test suite, but you may be doing stuff I haven't tested.
      ** 	Don't update if you are relying on a working HEAD.
      In particular, (a) External Core and (b) GHCi are very little tested.
      	But please, please DO test this version!
      		Big things
      Interface files, version control, and importing declarations
      * There is a totally new data type for stuff that lives in interface files:
      	Original names			IfaceType.IfaceExtName
      	Types				IfaceType.IfaceType
      	Declarations (type,class,id)	IfaceSyn.IfaceDecl
      	Unfoldings			IfaceSyn.IfaceExpr
        (Previously we used HsSyn for type/class decls, and UfExpr for unfoldings.)
        The new data types are in iface/IfaceType and iface/IfaceSyn.  They are
        all instances of Binary, so they can be written into interface files.
        Previous engronkulation concerning the binary instance of RdrName has
        gone away -- RdrName is not an instance of Binary any more.  Nor does
        Binary.lhs need to know about the ``current module'' which it used to,
        which made it specialised to GHC.
        A good feature of this is that the type checker for source code doesn't
        need to worry about the possibility that we might be typechecking interface
        file stuff.  Nor does it need to do renaming; we can typecheck direct from
        IfaceSyn, saving a whole pass (module TcIface)
      * Stuff from interface files is sucked in *lazily*, rather than being eagerly
        sucked in by the renamer. Instead, we use unsafeInterleaveIO to capture
        a thunk for the unfolding of an imported function (say).  If that unfolding
        is ever pulled on, TcIface will scramble over the unfolding, which may
        in turn pull in the interface files of things mentioned in the unfolding.
        The External Package State is held in a mutable variable so that it
        can be side-effected by this lazy-sucking-in process (which may happen
        way later, e.g. when the simplifier runs).   In effect, the EPS is a kind
        of lazy memo table, filled in as we suck things in.  Or you could think
        of it as a global symbol table, populated on demand.
      * This lazy sucking is very cool, but it can lead to truly awful bugs. The
        intent is that updates to the symbol table happen atomically, but very bad
        things happen if you read the variable for the table, and then force a
        thunk which updates the table.  Updates can get lost that way. I regret
        this subtlety.
        One example of the way it showed up is that the top level of TidyPgm
        (which updates the global name cache) has to be much more disciplined about
        those updates, since TidyPgm may itself force thunks which allocate new
      * Version numbering in interface files has changed completely, fixing
        one major bug with ghc --make.  Previously, the version of A.f changed
        only if A.f's type and unfolding were textually different.  That missed
        changes to things that A.f's unfolding mentions; which was fixed by
        eagerly sucking in all of those things, and listing them in the module's
        usage list.  But that didn't work with --make, because they might have
        been already sucked in.
        Now, A.f's version changes if anything reachable from A.f (via interface
        files) changes.  A module with unchanged source code needs recompiling
        only if the versions of any of its free variables change. [This isn't
        quite right for dictionary functions and rules, which aren't mentioned
        explicitly in the source.  There are extensive comments in module MkIface,
        where all version-handling stuff is done.]
      * We don't need equality on HsDecls any more (because they aren't used in
        interface files).  Instead we have a specialised equality for IfaceSyn
        (eqIfDecl etc), which uses IfaceEq instead of Bool as its result type.
        See notes in IfaceSyn.
      * The horrid bit of the renamer that tried to predict what instance decls
        would be needed has gone entirely.  Instead, the type checker simply
        sucks in whatever instance decls it needs, when it needs them.  Easy!
        Similarly, no need for 'implicitModuleFVs' and 'implicitTemplateHaskellFVs'
        etc.  Hooray!
      Types and type checking
      * Kind-checking of types is far far tidier (new module TcHsType replaces
        the badly-named TcMonoType).  Strangely, this was one of my
        original goals, because the kind check for types is the Right Place to
        do type splicing, but it just didn't fit there before.
      * There's a new representation for newtypes in TypeRep.lhs.  Previously
        they were represented using "SourceTypes" which was a funny compromise.
        Now they have their own constructor in the Type datatype.  SourceType
        has turned back into PredType, which is what it used to be.
      * Instance decl overlap checking done lazily.  Consider
      	instance C Int b
      	instance C a Int
        These were rejected before as overlapping, because when seeking
        (C Int Int) one couldn't tell which to use.  But there's no problem when
        seeking (C Bool Int); it can only be the second.
        So instead of checking for overlap when adding a new instance declaration,
        we check for overlap when looking up an Inst.  If we find more than one
        matching instance, we see if any of the candidates dominates the others
        (in the sense of being a substitution instance of all the others);
        and only if not do we report an error.
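The lazy overlap check can be tried out with a small sketch. The class C is taken from the example above, but the method name describe and the result strings are made up for illustration. With a compiler that checks overlap at lookup time, both instance declarations are accepted and each call below has exactly one matching instance.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
module Main where

-- 'describe' is a hypothetical method, added only so that instance
-- selection can be observed at run time.
class C a b where
  describe :: a -> b -> String

instance C Int b where
  describe _ _ = "C Int b"

instance C a Int where
  describe _ _ = "C a Int"

main :: IO ()
main = do
  -- Seeking (C Bool Int): only the second instance matches.
  putStrLn (describe True (1 :: Int))
  -- Seeking (C Int Char): only the first instance matches.
  putStrLn (describe (1 :: Int) 'x')
```

A use at (C Int Int) would match both instances, and neither is a substitution instance of the other, so that is the case where an error is still reported.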
      	     Medium things
      * The TcRn monad is generalised a bit further.  It's now based on utils/IOEnv.lhs,
        the IO monad with an environment.  The desugarer uses the monad too,
        so that anything it needs can get faulted in nicely.
      * Reduce the number of wired-in things; in particular Word and Integer
        are no longer wired in.  The latter required HsLit.HsInteger to get a
        Type argument.  The 'derivable type classes' data types (:+:, :*: etc)
        are not wired in any more either (see stuff about derivable type classes
      * The PersistentCompilerState is now held in a mutable variable
        in the HscEnv.  Previously (a) it was passed to and then returned by
        many top-level functions, which was painful; (b) it was invariably
        accompanied by the HscEnv.  This change tidies up top-level plumbing
        without changing anything important.
      * Derivable type classes are treated much more like 'deriving' clauses.
        Previously, the Ids for the to/from functions lived inside the TyCon,
        but now the TyCon simply records their existence (with a simple boolean).
        Anyone who wants to use them must look them up in the environment.
        This in turn makes it easy to generate the to/from functions (done
        in types/Generics) using HsSyn (like TcGenDeriv for ordinary derivings)
        instead of CoreSyn, which in turn means that (a) we don't have to figure
        out all the type arguments etc; and (b) it'll be type-checked for us.
        Generally, the task of generating the code has become easier, which is
        good for Manuel, who wants to make it more sophisticated.
      * A Name now says what its "parent" is. For example, the parent of a data
        constructor is its type constructor; the parent of a class op is its
        class.  This relationship corresponds exactly to the Avail data type;
        there may be other places we can exploit it.  (I made the change so that
        version comparison in interface files would be a bit easier; but in
        fact it tidied up other things here and there (see calls to
        Name.nameParent).)  For example, the declaration pool, of declarations
        read from interface files, but not yet used, is now keyed only by the 'main'
        name of the declaration, not the subordinate names.
      * New types OccEnv and OccSet, with the usual operations.
        OccNames can be efficiently compared, because they have uniques, thanks
        to the hashing implementation of FastStrings.
      * The GlobalRdrEnv is now keyed by OccName rather than RdrName.  Not only
        does this halve the size of the env (because we don't need both qualified
        and unqualified versions in the env), but it's also more efficient because
        we can use a UniqFM instead of a FiniteMap.
        Consequential changes to Provenance, which has moved to RdrName.
      * External Core remains a bit of a hack, as it was before, done with a mixture
        of HsDecls (so that recursiveness and argument variance are still inferred),
        and IfaceExprs (for value declarations).  It's not thoroughly tested.
      	     Minor things
      * DataCon fields dcWorkId, dcWrapId combined into a single field
        dcIds, that is explicit about whether the data con is a newtype or not.
        MkId.mkDataConWorkId and mkDataConWrapId are similarly combined into
      * Choosing the boxing strategy is done for *source* type decls only, and
        hence is now in TcTyDecls, not DataCon.
      * WiredIn names are distinguished by their n_sort field, not by their location,
        which was rather strange
      * Define Maybes.mapCatMaybes :: (a -> Maybe b) -> [a] -> [b]
        and use it here and there
      * Much better pretty-printing of interface files (--show-iface)
      Many, many other small things.
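For reference, a function with the type given above for Maybes.mapCatMaybes can be written in one line. This is a sketch from the type signature alone, not the definition in Maybes. It behaves like Data.Maybe.mapMaybe from the standard library, and halveEven is a made-up helper for the demonstration.

```haskell
module Main where

import Data.Maybe (mapMaybe)

-- Sketch: apply f to every element and keep only the Just results.
mapCatMaybes :: (a -> Maybe b) -> [a] -> [b]
mapCatMaybes f xs = [y | Just y <- map f xs]

-- Hypothetical helper used only for the demonstration below.
halveEven :: Int -> Maybe Int
halveEven n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

main :: IO ()
main = do
  print (mapCatMaybes halveEven [1 .. 10])
  -- Agrees with the standard library function on this input.
  print (mapCatMaybes halveEven [1 .. 10] == mapMaybe halveEven [1 .. 10])
```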
      	     File changes
      * New iface/ subdirectory
      * Much of RnEnv has moved to iface/IfaceEnv
      * MkIface and BinIface have moved from main/ to iface/
      * types/Variance has been absorbed into typecheck/TcTyDecls
      * RnHiFiles and RnIfaces have vanished entirely.  Their
        work is done by iface/LoadIface
      * hsSyn/HsCore has gone, replaced by iface/IfaceSyn
      * typecheck/TcIfaceSig has gone, replaced by iface/TcIface
      * typecheck/TcMonoType has been renamed to typecheck/TcHsType
      * basicTypes/Var.hi-boot and basicTypes/Generics.hi-boot have gone altogether
      [project @ 2002-06-18 13:58:22 by simonpj] · 80e39963
      	    Rehash the handling of SeqOp
      See the comments in the commentary (Cunning Prelude Code).
      * Expunge SeqOp altogether
      * Add GHC.Base.lazy :: a -> a
        to GHC.Base
      * Add GHC.Base.lazy
        to basicTypes/MkId.  The idea is that this defn will over-ride
        the info from GHC.Base.hi, thereby hiding strictness and
      * Make stranal/WorkWrap do a "manual inlining" for GHC.Base.lazy
        This happens nicely after the strictness analyser has run.
      * Expunge the SeqOp/ParOp magic in CorePrep
      * Expunge the RULE for seq in PrelRules
      * Change the defns of pseq/par in GHC.Conc to:
      	{-# INLINE pseq  #-}
      	pseq :: a -> b -> b
      	pseq  x y = x `seq` lazy y
      	{-# INLINE par  #-}
      	par :: a -> b -> b
      	par  x y = case (par# x) of { _ -> lazy y }
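The pseq definition above can be reproduced outside the library as a sketch. The name sketchPseq is hypothetical, and the code assumes lazy is importable from GHC.Exts, as it is in recent versions of base.

```haskell
module Main where

import GHC.Exts (lazy)

-- Hypothetical copy of the pseq definition quoted above: evaluate x,
-- then return y, with 'lazy' hiding from the strictness analyser the
-- fact that y is returned.
{-# INLINE sketchPseq #-}
sketchPseq :: a -> b -> b
sketchPseq x y = x `seq` lazy y

main :: IO ()
main = print (sketchPseq (sum [1 .. 10 :: Int]) "done")
```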
      [project @ 2002-04-11 12:03:29 by simonpj] · a7b95beb
      	Mainly derived Read
      This commit is a tangle of several things that somehow got wound up
      together, I'm afraid.
      The main course
      Replace the derived-Read machinery with Koen's cunning new parser
      combinator library.   The result should be
      	* much smaller code sizes from derived Read
      	* faster execution of derived Read
      WARNING: I have not thoroughly tested this stuff; I'd be glad if you did!
      	 All the hard work is done, but there may be a few nits.
      The Read class gets two new methods, not exposed
      in the H98 interface of course:
        class Read a where
          readsPrec    :: Int -> ReadS a
          readList     :: ReadS [a]
          readPrec     :: ReadPrec a		-- NEW
          readListPrec :: ReadPrec [a]	-- NEW
      There are the following new libraries:
        Text.ParserCombinators.ReadP		Koen's combinator parser
        Text.ParserCombinators.ReadPrec	Ditto, but with precedences
        Text.Read.Lex				An emasculated lexical analyser
      					that provides the functionality
      					of H98 'lex'
      TcGenDeriv is changed to generate code that uses the new libraries.
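To give a feel for the new readPrec method, here is a hand-written Read instance in roughly the style that derived code might use. It is a sketch, not the output of TcGenDeriv. The type Wrap is made up, and the imports assume the module layout of current base, where Text.Read exports Read(..), Lexeme(..), lexP, parens and readMaybe.

```haskell
module Main where

import Text.ParserCombinators.ReadPrec (prec, step)
import Text.Read (Lexeme (Ident), Read (..), lexP, parens, readMaybe)

-- Hypothetical example type.
newtype Wrap = Wrap Int deriving (Eq, Show)

instance Read Wrap where
  -- Accept optional surrounding parentheses, require application
  -- precedence, then parse the constructor name and its argument.
  readPrec = parens $ prec 10 $ do
    Ident "Wrap" <- lexP
    n <- step readPrec
    return (Wrap n)

main :: IO ()
main = do
  print (readMaybe "Wrap 3" :: Maybe Wrap)
  print (readMaybe "(Wrap 3)" :: Maybe Wrap)
  print (readMaybe "Warp 3" :: Maybe Wrap)
```

The first two calls parse successfully and the misspelt third one yields Nothing.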
      The built-in instances of Read (List, Maybe, tuples, etc) use the new
      Other stuff
      1. Some fixes to the plumbing of external-core generation. Sigbjorn
      did most of the work earlier, but this commit completes the renaming and
      typechecking plumbing.
      2. Runtime error-generation functions, such as GHC.Err.recSelErr,
      GHC.Err.recUpdErr, etc, now take an Addr#, pointing to a UTF8-encoded
      C string, instead of a Haskell string.  This makes the *calls* to these
      functions easier to generate, and smaller too, which is a good thing.
      In particular, it means that MkId.mkRecordSelectorId doesn't need to
      be passed "unpackCStringId", which was GRUESOME; and that in turn means
      that tcTypeAndClassDecls doesn't need to be passed unf_env, which is
      a very worthwhile cleanup.   Win/win situation.
      3.  GHC now faithfully translates do-notation using ">>" for statements
      with no binding, just as the report says.  While I was there I tidied
      up HsDo to take a list of Ids instead of 3 (but now 4) separate Ids.
      Saves a bit of code here and there.  Also introduced Inst.newMethodFromName
      to package a common idiom.
      [project @ 2002-04-04 13:15:18 by simonpj] · c44e1c41
      	A glorious improvement to CPR analysis
      Working on the CPR paper, I finally figured out how to
      do a decent job of taking account of strictness analysis when doing
      CPR analysis.
      There are two places we do that:
      1.  Usually, on a letrec for a *thunk* we discard any CPR info from
      the RHS.  We can't worker/wrapper a thunk.  BUT, if the let is
      	used strictly
      we don't need to discard the CPR info, because the thunk-splitting
      transform (WorkWrap.splitThunk) works.  This idea isn't new in this
      2. Arguments to strict functions.  Consider
        fac n m = if n==0 then m
      		    else fac (n-1) (m*n)
      Does it have the CPR property?  Apparently not, because it returns the
      accumulating parameter, m.  But the strictness analyser will
      discover that fac is strict in m, so it will be passed unboxed to
      the worker for fac.  More concretely, here is the worker/wrapper
      split that will result from strictness analysis alone:
        fac n m = case n of MkInt n' ->
      	    case m of MkInt m' ->
      	    facw n' m'
        facw n' m' = if n' ==# 0#
      	       then I# m'
      	       else facw (n' -# 1#) (m' *# n')
      Now facw clearly does have the CPR property!  We can take advantage
      of this by giving a demanded lambda the CPR property.
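As a hand-written illustration of where this leads, the sketch below shows fac once the worker takes its arguments unboxed and also returns its result unboxed. That is the shape one would expect when strictness and CPR information are combined, but it is an assumption about the result, not compiler output. It uses the MagicHash primitives from GHC.Exts of a modern GHC, where (==#) returns an Int# and needs isTrue#.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), Int#, isTrue#, (*#), (-#), (==#))

-- Wrapper: unbox both arguments, call the worker, rebox the result.
fac :: Int -> Int -> Int
fac (I# n) (I# m) = I# (facw n m)

-- Worker: everything is unboxed, including the returned accumulator.
facw :: Int# -> Int# -> Int#
facw n m
  | isTrue# (n ==# 0#) = m
  | otherwise          = facw (n -# 1#) (m *# n)

main :: IO ()
main = print (fac 5 1)  -- 120
```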
      To make this work nicely, I've made NewDemandInfo into Maybe Demand
      rather than simply Demand, so that we can tell when we are on the
      first iteration.  Lots of comments about this in Note [CPR-AND-STRICTNESS].
      I don't know how much all this buys us, but it is simple and elegant.
      [project @ 2001-11-19 14:23:52 by simonpj] · d8af6b8c
      	Yet another cut at the DmdAnal domains
      This version of the domain for demand analysis was developed
      in discussion with Peter Sestoft, so I think it might at last
      be more or less right!
      Our idea is mentally to separate
      	strictness analysis
      	absence and boxity analysis
      Then we combine them back into a single domain.  The latter
      is all you see in the compiler (the Demand type, as before)
      but we understand it better now.
      [project @ 2001-10-25 02:13:10 by sof] · 9e933350
      - Pet peeve removal / code tidyup, replaced various sub-optimal
        uses of 'length' with something a bit better, i.e., replaced
        the following patterns
         *  length as `cmpOp` length bs
         *  length as `cmpOp` val   -- incl. uses where val == 1 and val == 0
         *  {take,drop,splitAt} (length as) bs
         *  length [ () | pat <- as ]
        with uses of misc Util functions.
        I'd be surprised if there's a noticeable reduction in running
        times as a result of these changes, but every little bit helps.
        [ The changes have been tested wrt testsuite/ - I'm seeing a couple
          of unexpected breakages coming from CorePrep, but I'm currently
          assuming that these are due to other recent changes. ]
      - compMan/CompManager.lhs: restored 4.08 compilability + some code
      None of these changes are HEADworthy.
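The kind of helper that replaces the 'length' patterns above can be sketched as follows. The names compareLength and lengthIs are assumptions chosen for illustration, and the commit message does not say which Util functions were actually used. The point is that neither function traverses more of a list than it has to.

```haskell
module Main where

-- Compare the lengths of two lists, stopping at the end of the
-- shorter one instead of computing both lengths in full.
compareLength :: [a] -> [b] -> Ordering
compareLength []       []       = EQ
compareLength []       (_ : _)  = LT
compareLength (_ : _)  []       = GT
compareLength (_ : xs) (_ : ys) = compareLength xs ys

-- Test whether a list has exactly n elements, inspecting at most
-- n + 1 of them.
lengthIs :: [a] -> Int -> Bool
lengthIs xs n
  | n < 0     = False
  | otherwise = go xs n
  where
    go []       k = k == 0
    go (_ : ys) k = k > 0 && go ys (k - 1)

main :: IO ()
main = do
  print (compareLength "ab" [1 :: Int ..])  -- terminates on an infinite list
  print (lengthIs [1 :: Int ..] 3)          -- likewise
  print (lengthIs "abc" 3)
```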
      [project @ 2001-10-24 08:33:25 by simonpj] · 566075c3
      	Implement thunk splitting
      This is a rather nice transformation that I found when
      optimising some nofib programs.
      Suppose x is used strictly (never mind whether it has the CPR
      	x* = x-rhs
            in body
      splitThunk transforms like this:
      	x* = case x-rhs of { I# a -> I# a }
            in body
      Now the simplifier will transform to
            case x-rhs of
      	I# a ->	let x* = I# a
      	        in body
      which is what we want. Now suppose x-rhs is itself a case:
      	x-rhs = case e of { T -> I# a; F -> I# b }
      The join point will abstract over a, rather than over x (which is
      what would have happened before) which is fine.
      Notice that x certainly has the CPR property now!
      In fact, splitThunk uses the function argument w/w splitting
      function, so that if x's demand is deeper (say U(U(L,L),L))
      then the splitting will go deeper too.
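The before and after shapes of thunk splitting can be written out by hand in source Haskell. This is only an illustration of the transformation described above, not what splitThunk emits. The function expensive and the names before and after are made up.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#))

-- Hypothetical stand-in for x-rhs.
expensive :: Int -> Int
expensive n = sum [1 .. n]

-- Before: x is a thunk, even though the body uses it strictly.
before :: Int -> Int
before n = let x = expensive n in x + x

-- After: the right-hand side is evaluated first, and x is rebuilt
-- from the unboxed payload, so it is a constructor application
-- rather than a thunk.
after :: Int -> Int
after n = case expensive n of
  I# a -> let x = I# a in x + x

main :: IO ()
main = print (before 10, after 10)  -- (110,110)
```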
      ** On the way, I tidied up some of the code in WwLib.
      [project @ 2001-09-26 15:12:33 by simonpj] · e0d750be
      		Simon's big commit
      This commit, which I don't think I can sensibly do piecemeal, consists
      of the things I've been doing recently, mainly directed at making
      Manuel, George, and Marcin happier with RULES.
      Reorganise the simplifier
      1. The simplifier's environment is now an explicit parameter.  This
      makes it a bit easier to figure out where it is going.
      2. Constructor arguments can now be arbitrary expressions, except
      when the application is the RHS of a let(rec).  This makes it much
      easier to match rules like
      	    "foo"  f (h x, g y) = f' x y
      In the simplifier, it's Simplify.mkAtomicArgs that ANF-ises a
      constructor application where necessary.  In the occurrence analyser,
      there's a new piece of context info (OccEncl) to say whether a
      constructor app is in a place where it should be in ANF.  (Unless
      it knows this it'll give occurrence info which will inline the
      argument back into the constructor app.)
      3. I'm experimenting with doing the "float-past big lambda" transformation
      in the full laziness pass, rather than mixed in with the simplifier (was
      4.  Arrange that
      	case (coerce (S,T) (x,y)) of ...
      will simplify.  Previously it didn't.
      A local change to CoreUtils.exprIsConApp_maybe.
      5. Do a better job in CoreUtils.exprEtaExpandArity when there's an
      error function in one branch.
      Phase numbers, RULES, and INLINE pragmas
      1.  Phase numbers decrease from N towards zero (instead of increasing).
      This makes it easier to add new earlier phases, which is what users want
      to do.
      2.  RULES get their own phase number, N, and are disabled in phases before N.
      e.g. 	{-# RULES "foo" [2] forall x y.  f (x,y) = f' x y #-}
      Note the [2], which says "only active in phase 2 and later".
      3.  INLINE and NOINLINE pragmas have a phase number too.  This is now treated
      in just the same way as the phase number on RULE; that is, the Id is not inlined
      in phases earlier than N.  In phase N and later the Id *may* be inlined, and
      here is where INLINE and NOINLINE differ: INLINE makes the RHS look small, so
      as soon as it *may* be inlined it probably *will* be inlined.
      The syntax of the phase number on an INLINE/NOINLINE pragma has changed to be
      like the RULES case (i.e. in square brackets).  This should also make sure
      you examine all such phase numbers; many will need to change now the numbering
      is reversed.
      Inlining Ids is no longer affected at all by whether the Id appears on the
      LHS of a rule.  Now it's up to the programmer to put a suitable INLINE/NOINLINE
      pragma to stop it being inlined too early.
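A small sketch of how the square-bracket phase syntax fits together, using the rule from the example above. The definitions of f and f' are made up. The NOINLINE [1] pragma is the programmer's way of keeping f from being inlined before the rule, active from phase 2, has had a chance to fire.

```haskell
module Main where

-- Do not inline f before phase 1, so that the rule below, which is
-- active from phase 2 onwards, can still see calls to f.
{-# NOINLINE [1] f #-}
f :: (Int, Int) -> Int
f (x, y) = x + y

-- Hypothetical curried variant that the rule rewrites to.
f' :: Int -> Int -> Int
f' x y = x + y

{-# RULES "foo" [2] forall x y.  f (x, y) = f' x y #-}

main :: IO ()
main = print (f (1, 2))
```

The program prints the same result whether or not the rule fires, since rules are only applied when optimisation is on.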
      Implementation notes:
      *  A new data type, BasicTypes.Activation says when a rule or inline pragma
      is active.   Functions isAlwaysActive, isNeverActive, isActive, do the
      obvious thing (all in BasicTypes).
      * Slight change in the SimplifierSwitch data type, which led to a lot of
      simplifier-specific code moving from CmdLineOpts to SimplMonad; a Good Thing.
      * The InlinePragma in the IdInfo of an Id is now simply an Activation saying
      when the Id can be inlined.  (It used to be a rather bizarre pair of a
      Bool and a (Maybe Phase), so this is much much easier to understand.)
      * The simplifier has a "mode" environment switch, replacing the old
      black list.  Unfortunately the data type decl has to be in
      CmdLineOpts, because it's an argument to the CoreDoSimplify switch
          data SimplifierMode = SimplGently | SimplPhase Int
      Here "gently" means "no rules, no inlining".   All the crucial
      inlining decisions are now collected together in SimplMonad
      (preInlineUnconditionally, postInlineUnconditionally, activeInline,
      1.  Only dictionary *functions* are made INLINE, not dictionaries that
      have no parameters.  (This inline-dictionary-function thing is Marcin's
      idea and I'm still not sure whether it's a good idea.  But it's definitely
      a Bad Idea when there are no arguments.)
      2.  Be prepared to specialise an INLINE function: an easy fix in
      But there is still a problem, which is that the INLINE wins
      at the call site, so we don't use the specialised version anyway.
      I'm still unsure whether it makes sense to SPECIALISE something
      you want to INLINE.
      Random smaller things
      * builtinRules (there was only one, but may be more) in PrelRules are now
        incorporated.   They were being ignored before...
      * OrdList.foldOL -->  OrdList.foldrOL, OrdList.foldlOL
      * Some tidying up of the tidyOpenTyVar, tidyTyVar functions.  I've
        forgotten exactly what!
      [project @ 2001-09-07 12:42:46 by simonpj] · d3f61314
      	Fix the demand analyser
      A spiffy new domain for demands, and definitions for lub/both
      which are actually monotonic.   Quite a bit of related jiggling
      One of the original motivations was to do with functions like:
      	sum n []     = n
      	sum n (x:xs) = sum (n+x) xs
      Even though n is returned boxed from the first case, we don't want
      to get strictness
      	S(L)V -> T
      because that means we pass the box for n, and that is TERRIBLE.
      So the new version errs on the side of unboxing, more like the forwards
      analyser, and only passes the box if it is *definitely* needed, rather
      than if it *may* be needed.
      [project @ 2001-08-20 16:50:13 by simonpj] · 0c190d9e
      	Make NOINLINE zap the strictness info
      Make a NOINLINE pragma zap strictness information.
      Reasons given in the WorkWrap comment:
      	-- Furthermore, zap the strictness info in the Id.  Why?  Because
      	-- the NOINLINE says "don't expose any of the inner workings at the call
      	-- site" and the strictness is certainly an inner working.
      	-- More concretely, the demand analyser discovers the following strictness
      	-- for unsafePerformIO:  C(U(AV))
      	-- But then consider
      	--	unsafePerformIO (\s -> let r = f x in
      	--			       case writeIORef v r s of (# s1, _ #) ->
      	--			       (# s1, r #)
      	-- The strictness analyser will find that the binding for r is strict,
      	-- (because of uPIO's strictness sig), and so it'll evaluate it before
      	-- doing the writeIORef.  This actually makes tests/lib/should_run/memo002
      	-- get a deadlock!
      	-- Solution: don't expose the strictness of unsafePerformIO.
      This fixes the memo002 deadlock.
      [project @ 2001-07-23 10:54:46 by simonpj] · f6cd95ff
      	Switch to the new demand analyser
      This commit makes the new demand analyser the main beast,
      with the old strictness analyser as a backup.  When
      DEBUG is on, the old strictness analyser is run too, and the
      results compared.
      WARNING: this isn't thoroughly tested yet, so expect glitches.
      Delay updating for a few days if the HEAD is mission critical
      for you.
      But do try it out.  I'm away for 2.5 weeks from Thursday, so
      it would be good to shake out any glaring bugs before then.
      [project @ 2001-06-25 14:36:04 by simonpj] · a5ded1f8
      Import wibbles
      [project @ 2001-06-25 08:09:57 by simonpj] · d069cec2
      	Squash newtypes
      This commit squashes newtypes and their coerces, from the typechecker
      onwards.  The original idea was that the coerces would not get in the
      way of optimising transformations, but despite much effort they continue
      to do so.   There's no very good reason to retain newtype information
      beyond the typechecker, so now we don't.
      Main points:
      * The post-typechecker suite of Type-manipulating functions is in
      types/Type.lhs, as before.   But now there's a new suite in types/TcType.lhs.
      The difference is that in the former, newtypes are transparent, while in
      the latter they are opaque.  The typechecker should only import TcType,
      not Type.
      * The operations in TcType are all non-monadic, and most of them start with
      "tc" (e.g. tcSplitTyConApp).  All the monadic operations (used exclusively
      by the typechecker) are in a new module, typecheck/TcMType.lhs
      * I've grouped newtypes with predicate types, thus:
      	data Type = TyVarTy Tyvar | ....
      		  | SourceTy SourceType
      	data SourceType = NType TyCon [Type]
      			| ClassP Class [Type]
      			| IParam Type
      [SourceType was called PredType.]  This is a little weird in some ways,
      because NTypes can't occur in qualified types.   However, the idea is that
      a SourceType is a type that is opaque to the type checker, but transparent
      to the rest of the compiler, and newtypes fit that as do implicit parameters
      and dictionaries.
      * Recursive newtypes still retain their coerces, exactly as before. If
      they were transparent we'd get a recursive type, and that would make
      various bits of the compiler diverge (e.g. things which do type comparison).
      * I've removed types/Unify.lhs (non-monadic type unifier and matcher),
      merging it into TcType.
      Ditto typecheck/TcUnify.lhs (monadic unifier), merging it into TcMType.
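      The "opaque to the type checker, transparent to the rest of the
      compiler" idea can be seen at source level in a present-day GHC.
      The sketch below is only an illustration: Data.Coerce arrived many
      years after this commit and is not the mechanism described here, and
      the Age type is hypothetical.

```haskell
import Data.Coerce (coerce)

-- Hypothetical newtype, used only for illustration.
newtype Age = Age Int
  deriving (Show, Eq)

-- To the type checker, Age and Int are distinct types, so the
-- constructor must be unwrapped and rewrapped explicitly.
birthday :: Age -> Age
birthday (Age n) = Age (n + 1)

-- After type checking the two types share one representation, so a
-- whole list can be converted without traversing it.
agesToInts :: [Age] -> [Int]
agesToInts = coerce

main :: IO ()
main = do
  print (birthday (Age 41))
  print (agesToInts [Age 1, Age 2, Age 3])
```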
      [project @ 2001-03-23 10:44:08 by simonpj] · 86d20be7
      	Remove a redundant test in WorkWrap
      We were making a worker/wrapper for an INLINE
      function when it wasn't necessary, and that's a Bad Idea.
      As the comment with WorkWrap.tryWW now says:
      	-- It's very important to refrain from w/w-ing an INLINE function
      	-- If we do so by mistake we transform
      	--	f = __inline (\x -> E)
      	-- into
      	--	f = __inline (\x -> case x of (a,b) -> fw E)
      	--	fw = \ab -> (__inline (\x -> E)) (a,b)
      	-- and the original __inline now vanishes, so E is no longer
      	-- inside its __inline wrapper.  Death!  Disaster!
      There was one case when we did w/w it (to do with coercions),
      but it turned out to be a vestige, as the OUT OF DATE NOTE says.
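      For readers unfamiliar with the split itself, here is a hand-written
      source-level sketch of what worker/wrapper produces for an ordinary
      (non-INLINE) function.  The names sumPair and sumPairWorker are
      made up for illustration; GHC generates the equivalent in Core.

```haskell
-- Wrapper: tiny, intended to be inlined at call sites.  It only
-- takes the pair apart and hands the components to the worker.
sumPair :: (Int, Int) -> Int
sumPair p = case p of (a, b) -> sumPairWorker a b
{-# INLINE sumPair #-}

-- Worker: holds the original body, now over unpacked arguments.
sumPairWorker :: Int -> Int -> Int
sumPairWorker a b = a * a + b
{-# NOINLINE sumPairWorker #-}

main :: IO ()
main = print (sumPair (3, 4))
```

      Note that the body ends up in the worker.  If the original function
      carried a user INLINE, that is exactly the part the programmer asked
      to have inlined, which is why the split must be avoided in that case.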
      [project @ 2001-03-08 12:07:38 by simonpj] · 51a571c0
      	A major hygiene pass
      1. The main change here is to
      	Move what was the "IdFlavour" out of IdInfo,
      	and into the varDetails field of a Var
         It was a mess before, because the flavour was a permanent attribute
         of an Id, whereas the rest of the IdInfo was ephemeral.  It's
         all much tidier now.
         Main places to look:
      	   Var.lhs	Defn of VarDetails
      	   IdInfo.lhs	Defn of GlobalIdDetails
         The main remaining infelicity is that SpecPragmaIds are right down
         in Var.lhs, which seems unduly built-in for such an ephemeral thing.
         But that is no worse than before.
      2. Tidy up the HscMain story a little.  Move mkModDetails from MkIface
         into CoreTidy (where it belongs more nicely)
         This was partly forced by (1) above, because I didn't want to make
         DictFun Ids into a separate kind of Id (which is how it was before).
         Not having them separate means we have to keep a list of them right
         through, rather than pull them out of the bindings at the end.
      3. Add NameEnv as a separate module (to join NameSet).
      4. Remove unnecessary {-# SOURCE #-} imports from FieldLabel.
      [project @ 2000-12-06 13:03:28 by simonmar] · d3645411
      Re-engineer the transition from Core to STG syntax.  Main changes in
      this commit:
        - a new pass, CoreSat, handles saturation of constructors and PrimOps,
          and puts the syntax into STG-like normal form (applications to atoms
          only, etc), modulo type applications and Notes.
        - CoreToStg is now done at the same time as StgVarInfo.  Most of the
          contents of StgVarInfo.lhs have been copied into CoreToStg.lhs and
          some simplifications made.
      less major changes:
        - globalisation of names for the purposes of object splitting is
          now done by the C code generator (which is the Right Place in
          principle, but it was a bit fiddly).
        - CoreTidy now does cloning of local binders and collection of arity
          info.  The IdInfo from CoreTidy is now *almost* the final IdInfo we
          put in the interface file, except for CafInfo.  I'm going to move
          the CafInfo collection into CoreTidy in due course too.
        - and some other minor tidyups while I was in cluster-bomb commit mode.
      [project @ 2000-11-10 15:12:50 by simonpj] · f23ba2b2
      1.	Outputable.PprStyle now carries a bit more information
      	In particular, the printing style tells whether to print
      	a name in unqualified form.  This used to be embedded in
      	a Name, but since Names now outlive a single compilation unit,
      	that's no longer appropriate.
      	So now the print-unqualified predicate is passed in the printing
      	style, not embedded in the Name.
      2.	I tidied up HscMain a little.  Many of the showPass messages
      	have migrated into the respective pass drivers
      [project @ 2000-09-14 13:46:39 by simonpj] · cae34044
      	Simon's tuning changes: early Sept 2000
      Library changes
      * Eta expand PrelShow.showLitChar.  It's impossible to compile this well,
        and it makes a big difference to some programs (e.g. gen_regexps)
      * Make PrelList.concat into a good producer (in the foldr/build sense)
      Flag changes
      * Add -ddump-hi-diffs to print out changes in interface files.  Useful
        when watching what the compiler is doing
      * Add -funfolding-update-in-place to enable the experimental optimisation
        that makes the inliner a bit keener to inline if it's in the RHS of
        a thunk that might be updated in place.  Sometimes this is a bad idea
        (one example is in spectral/sphere; see notes in nofib/Simon-nofib-notes)
      Tuning things
      * Fix a bug in SetLevels.lvlMFE.  (change ctxt_lvl to dest_level)
        I don't think this has any performance effect, but it saves making
        a redundant let-binding that is later eliminated.
      * Desugar.dsProgram and DsForeign
        Glom together all the bindings into a single Rec.  Previously the
        bindings generated by 'foreign' declarations were not glommed together, but
        this led to an infelicity (i.e. poorer code than necessary) in the modules
        that actually declare Float and Double (explained a bit more in Desugar.dsProgram)
      * OccurAnal.shortMeOut and IdInfo.shortableIdInfo
        Don't do the occurrence analyser's shorting out stuff for things which
        have rules.  Comments near IdInfo.shortableIdInfo.
        This is deeply boring, and mainly to do with making rules work well.
        Maybe rules should have phases attached too....
      * CprAnalyse.addIdCprInfo
        Be a bit more willing to add CPR information to thunks;
        in particular, if the strictness analyser has just discovered that this
        is a strict let, then the let-to-case transform will happen, and CPR is fine.
        This made a big difference to PrelBase.modInt, which had something like
      	modInt = \ x -> let r = ... -> I# v in
      			...body strict in r...
        r's RHS isn't a value yet; but modInt returns r in various branches, so
        if r doesn't have the CPR property then neither does modInt
      * MkId.mkDataConWrapId
        Arrange that vanilla constructors, like (:) and I#, get unfoldings that are
        just a simple variable $w:, $wI#.  This ensures they'll be inlined even into
        rules etc, which makes matching a bit more reliable.  The downside is that in
        situations like (map (:) xs), we'll end up with (map (\y ys. $w: y ys) xs).
        Which is tiresome but it doesn't happen much.
      * SaAbsInt.findStrictness
        Deal with the case where a thing with no arguments is bottom.  This is Good.
        E.g.   module M where { foo = error "help" }
        Suppose we have in another module
      	case M.foo of ...
        Then we'd like to do the case-of-error transform, without inlining foo.
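      The CprAnalyse.addIdCprInfo item above depends on what the CPR
      property buys: a function that always returns an explicitly built
      I# box can be split so that the worker returns the unboxed value.
      A hand-written sketch, assuming a modern GHC with MagicHash; the
      function names are hypothetical and this is not PrelBase.modInt.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, (*#), (+#))

-- Wrapper: unboxes the argument, calls the worker, reboxes the result.
squarePlusOne :: Int -> Int
squarePlusOne (I# x) = I# (squarePlusOneWorker x)

-- Worker: computes on raw Int# values, so no box is allocated for
-- the result inside the worker.
squarePlusOneWorker :: Int# -> Int#
squarePlusOneWorker x = (x *# x) +# 1#

main :: IO ()
main = print (squarePlusOne 6)
```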
      Tidying up things
      * Reorganised Simplify.completeBinding (again).
      * Removed the is_bot field in CoreUnfolding (is_cheap is true if is_bot is!)
        This is just a tidy up
      * HsDecls and others
        Remove the NewCon constructor from ConDecl.  It just added code, and nothing else.
        And it led to a bug in MkIface, which thought that a newtype decl was always changing!
      * IdInfo and many others
        Remove all vestiges of UpdateInfo (hasn't been used for years)
      [project @ 2000-09-07 16:32:23 by simonpj] · 4e6d5798
      A list of simplifier-related stuff, triggered
      	by looking at GHC's performance.
      	I don't guarantee that this lot will lead to
      	a uniform improvement over 4.08, but it should
      	be a bit better.  More work probably required.
      * Make the simplifier's Stop continuation record whether the expression being
        simplified is the RHS of a thunk, or (say) the body of a lambda or case RHS.
        In the thunk case we want to be a bit keener about inlining if the type of
        the thunk is amenable to update in place.
      * Fix interestingArg, which was being too liberal, and hence doing
        too much inlining.
      * Extended CoreUtils.exprIsCheap to make two more things cheap:
          - 	case (coerce x) of ...
          -   let x = y +# z
        This makes a bit more eta expansion happen.  It was provoked by
        a program of Marcin's.
      * MkIface.ifaceBinds.   Make sure that we emit rules for things
        (like class operations) that don't get a top-level binding in the
        interface file.  Previously such rules were silently forgotten.
      * Move transformRhs to *after* simplification, which makes it a
        little easier to do, and means that the arity it computes is
        readily available to completeBinding.  This gets much better
      * Do coerce splitting in completeBinding. This gets good code for
      	newtype CInt = CInt Int
      	test:: CInt -> Int
      	test x = case x of
      	      	   1 -> 2
      	      	   2 -> 4
      	      	   3 -> 8
      	      	   4 -> 16
      	      	   _ -> 0
      * Modify the meaning of "arity" so that during compilation it means
        "if you apply this function to fewer args, it will do virtually
        no work".   So, for example
      	f = coerce t (\x -> e)
        has arity at least 1.  When a function is exported, its arity becomes
        the number of exposed, top-level lambdas, which is subtly different.
        But that's ok.
        I removed CoreUtils.exprArity altogether: it looked only at the exposed
        lambdas.  Instead, we use exprEtaExpandArity exclusively.
        All of this makes I/O programs work much better.
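      The CInt fragment in the coerce-splitting item is not a complete
      program: literal patterns need Eq and Num instances.  A runnable
      version, assuming a modern GHC where those instances can be derived
      with GeneralizedNewtypeDeriving (the deriving clause is an addition,
      not part of the original snippet):

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

newtype CInt = CInt Int
  deriving (Eq, Num)  -- required so that 1, 2, ... can be used as patterns

test :: CInt -> Int
test x = case x of
  1 -> 2
  2 -> 4
  3 -> 8
  4 -> 16
  _ -> 0

main :: IO ()
main = print (map test [1, 2, 3, 4, 5])
```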
      [project @ 2000-06-09 07:32:31 by simonpj] · 3c1b89ab
      In my commit of 24 May I got this boolean condition
      back to front:
          tryWW non_rec fn_id rhs
            | not (isNeverInlinePrag inline_prag)
            =  -- Don't split things that will never be inlined
      The 'not' is obviously wrong!  As a result virtually nothing
      is being worker-wrapper'd
      How this has survived for more than two weeks beats me.
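      The effect of the inverted guard can be reproduced in miniature.
      Everything below is a hypothetical stand-in (the real tryWW has more
      cases and works on Core bindings); it only shows why the stray 'not'
      switches off the split for nearly every function.

```haskell
-- Hypothetical stand-in for GHC's inline pragma information.
data InlinePrag = NeverInline | NoPragma
  deriving (Eq, Show)

data Decision = LeaveAlone | Split
  deriving (Eq, Show)

isNeverInlinePrag :: InlinePrag -> Bool
isNeverInlinePrag = (== NeverInline)

-- Buggy guard: everything that is NOT marked never-inline is left alone.
tryWWBuggy :: InlinePrag -> Decision
tryWWBuggy prag
  | not (isNeverInlinePrag prag) = LeaveAlone
  | otherwise                    = Split

-- Guard without the 'not': only never-inline things are left alone.
tryWWFixed :: InlinePrag -> Decision
tryWWFixed prag
  | isNeverInlinePrag prag = LeaveAlone
  | otherwise              = Split

main :: IO ()
main = do
  print (map tryWWBuggy [NoPragma, NeverInline])
  print (map tryWWFixed [NoPragma, NeverInline])
```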
      [project @ 2000-05-24 15:47:13 by simonpj] · 95929be0
      MERGE 4.07
      * This fix cures the weird 'ifaceBinds' error that
        Sven and George tripped over.  It was quite obscure!
        Basically, there was a top level binding
      	f = x
        lying around, which CoreToStg didn't like.  Why hadn't
        it been substituted away?  Because it had a NOINLINE
        pragma.  Why did it have a NOINLINE pragma?  Because
        it's an always-diverging function, so we never want to
        inline it.
      [project @ 2000-03-30 16:23:56 by simonpj] · b822aa0e
      * Remove the unnecessary CPR parameter to mkUnfolding and friends
      * Make sure that even trivial wrappers have a __inline__
        (this was causing lots of 'substWorker' DEBUG messages)
      * Nuke demand info when the unfolding is a value
        (see notes with IdInfo.setUnfoldingInfo)
      * Add an update-in-place test to the 'interesting context'
        predicate in SimplUtils.
      [project @ 2000-03-23 17:45:17 by simonpj] · 111cee3f
      This utterly gigantic commit is what I've been up to in background
      mode in the last couple of months.  Originally the main goal
      was to get rid of Con (saturated constant applications)
      in the CoreExpr type, but one thing led to another, and I kept
      postponing actually committing.   Sorry.
      	Simon, 23 March 2000
      I've tested it pretty thoroughly, but doubtless things will break.
      Here are the highlights
      * Con is gone; the CoreExpr type is simpler
      * NoRepLits have gone
      * Better usage info in interface files => less recompilation
      * Result type signatures work
      * CCall primop is tidied up
      * Constant folding now done by Rules
      * Lots of hackery in the simplifier
      * Improvements in CPR and strictness analysis
      Many bug fixes including
      * Sergey's DoCon compiles OK; no loop in the strictness analyser
      * Volker Wysk's programs don't crash the CPR analyser
      I have not done much on measuring compilation times and binary sizes;
      they could have got worse.  I think performance has got significantly
      better, though, in most cases.
      Removing the Con form of Core expressions
      The big thing is that
        For every constructor C there are now *two* Ids:
      	C is the constructor's *wrapper*. It evaluates and unboxes arguments
      	before calling $wC.  It has a perfectly ordinary top-level defn
      	in the module defining the data type.
      	$wC is the constructor's *worker*.  It is like a primop that simply
      	allocates and builds the constructor value.  Its arguments are the
      	actual representation arguments of the constructor.
      	Its type may be different to C, because:
      		- useless dict args are dropped
      		- strict args may be flattened
        For every primop P there is *one* Id, its (curried) Id
        Neither the constructor worker Id nor the primop Id has a definition anywhere.
        Instead they are saturated during the core-to-STG pass, and the code generator
        generates code for them directly. The STG language still has saturated
        primops and constructor applications.
      * The Const type disappears, along with Const.lhs.  The literal part
        of Const.lhs reappears as Literal.lhs.  Much tidying up in here,
        to bring all the range checking into this one module.
      * I got rid of NoRep literals entirely.  They just seem to be too much trouble.
      * Because Con's don't exist any more, the funny C { args } syntax
        disappears from interface files.
      * Result type signatures now work
      	f :: Int -> Int = \x -> x
      	-- The Int->Int is the type of f
      	g x y :: Int = x+y
      	-- The Int is the type of the result of (g x y)
      Recompilation checking and make
      * The .hi file for a module is not touched if it doesn't change.  (It used to
        be touched regardless, forcing a chain of recompilations.)  The penalty for this
        is that we record exported things just as if they were mentioned in the body of
        the module.  And the penalty for that is that we may recompile a module when
        the only things that have changed are the things it is passing on without using.
        But it seems like a good trade.
      * -recomp is on by default
      Foreign declarations
      * If you say
      	foreign export zoo :: Int -> IO Int
        then you get a C procedure called 'zoo', not 'zzoo' as before.
        I've also added a check that complains if you export (or import) a C
        procedure whose name isn't legal C.
      Code generation and labels
      * Now that constructor workers and wrappers have distinct names, there's
        no need to have a Foo_static_closure and a Foo_closure for constructor Foo.
        I nuked the entire StaticClosure story.  This has effects in some of
        the RTS headers (i.e. s/static_closure/closure/g)
      Rules, constant folding
      * Constant folding becomes just another rewrite rule, attached to the Id for the
        PrimOp.   To achieve this, there's a new form of Rule, a BuiltinRule (see CoreSyn.lhs).
        The prelude rules are in prelude/PrelRules.lhs, while simplCore/ConFold.lhs has gone.
      * Appending of constant strings now works, using fold/build fusion, plus
        the rewrite rule
      	unpack "foo" c (unpack "baz" c n)  =  unpack "foobaz" c n
        Implemented in PrelRules.lhs
      * The CCall primop is tidied up quite a bit.  There is now a data type CCall,
        defined in PrimOp, that packages up the info needed for a particular CCall.
        There is a new Id for each new ccall, with a big "occurrence name"
      	{__ccall "foo" gc Int# -> Int#}
        In interface files, this is parsed as a single Id, which is what it is, really.
      * There were numerous places where the host compiler's
        minInt/maxInt was being used as the target machine's minInt/maxInt.
        I nuked all of these; everything is localised to inIntRange and inWordRange,
        in Literal.lhs
      * Desugaring record updates was broken: it didn't generate correct matches when
        used with records with fancy unboxing etc.  It now uses matchWrapper.
      * Significant tidying up in codeGen/SMRep.lhs
      * Add __word, __word64, __int64 terminals to signal the obvious types
        in interface files.  Add the ability to print word values in hex into
        C code.
      * PrimOp.lhs is no longer part of a loop.  Remove PrimOp.hi-boot*
      * isProductTyCon no longer returns False for recursive products, nor
        for unboxed products; you have to test for these separately.
        There's no reason not to do CPR for recursive product types, for example.
        Ditto splitProductType_maybe.
      * New -fno-case-of-case flag for the simplifier.  We use this in the first run
        of the simplifier, where it helps to stop messing up expressions that
        the (subsequent) full laziness pass would otherwise find and float out.
        It's much more effective than previous half-baked hacks in inlining.
        Actually, it turned out that there were three places in Simplify.lhs that
        needed to know about this flag.
      * Make the float-in pass push duplicatable bindings into the branches of
        a case expression, in the hope that we never have to allocate them.
        (see FloatIn.sepBindsByDropPoint)
      * Arrange that top-level bottoming Ids get a NOINLINE pragma
        This reduced gratuitous inlining of error messages.
        But arrange that such things still get w/w'd.
      * Arrange that a strict argument position is regarded as an 'interesting'
        context, so that if we see
      	foldr k z (g x)
        then we'll be inclined to inline g; this can expose a build.
      * There was a missing case in CoreUtils.exprEtaExpandArity that meant
        we were missing some obvious cases for eta expansion
        Also improve the code when handling applications.
      * Make record selectors (identifiable by their IdFlavour) into "cheap" operations.
      	  [The change is a 2-liner in CoreUtils.exprIsCheap]
        This means that record selection may be inlined into function bodies, which
        greatly improves the arities of overloaded functions.
      * Make a cleaner job of inlining "lone variables".  There was some distributed
        cunning, but I've centralised it all now in SimplUtils.analyseCont, which
        analyses the context of a call to decide whether it is "interesting".
      * Don't specialise very small functions in Specialise.specDefn
        It's better to inline it.  Rather like the worker/wrapper case.
      * Be just a little more aggressive when floating out of let rhss.
        See comments with Simplify.wantToExpose
        A small change with an occasional big effect.
      * Make the inline-size computation think that
      	case x of I# x -> ...
        is *free*.
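      The remark above that inlining g in (foldr k z (g x)) "can expose
      a build" refers to foldr/build fusion.  A small sketch, assuming a
      modern GHC where build is exported from GHC.Exts; upTo and total
      are hypothetical names.

```haskell
import GHC.Exts (build)

-- A "good producer": the list is expressed through build, abstracting
-- over the cons and nil that will eventually consume it.
upTo :: Int -> [Int]
upTo n = build (\cons nil ->
  let go i
        | i > n     = nil
        | otherwise = i `cons` go (i + 1)
  in go 1)
{-# INLINE upTo #-}

-- Once upTo is inlined here, the foldr sits directly on a build and
-- the intermediate list can be fused away by the foldr/build rule.
total :: Int -> Int
total n = foldr (+) 0 (upTo n)

main :: IO ()
main = print (total 10)
```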
      CPR analysis
      * Fix what was essentially a bug in CPR analysis.  Consider
      	letrec f x = let g y = let ... in f e1
      		     in if ... then (a,b) else g x
        g has the CPR property if f does; so when generating the final annotated
        RHS for f, we must use an envt in which f is bound to its final abstract
        value.  This wasn't happening.  Instead, f was given the CPR tag but g
        wasn't; but of course the w/w pass gives rotten results in that case!!
        (Because f's CPR-ness relied on g's.)
        On the way I tidied up the code in CprAnalyse.  It's quite a bit shorter.
        The fact that some data constructors return a constructed product shows
        up in their CPR info (MkId.mkDataConId) not in CprAnalyse.lhs
      Strictness analysis and worker/wrapper
      * BIG THING: pass in the demand to StrictAnal.saExpr.  This affects situations
      	f (let x = e1 in (x,x))
        where f turns out to have strictness u(SS), say.  In this case we can
        mark x as demanded, and use a case expression for it.
        The situation before is that we didn't "know" that there is the u(SS)
        demand on the argument, so we simply computed that the body of the let
        expression is lazy in x, and marked x as lazily-demanded.  Then even after
        f was w/w'd we got
      	let x = e1 in case (x,x) of (a,b) -> $wf a b
        and hence
      	let x = e1 in $wf x x
        I found a much more complicated situation in spectral/sphere/Main.shade,
        which improved quite a bit with this change.
      * Moved the StrictnessInfo type from IdInfo to Demand.  It's the logical
        place for it, and helps avoid module loops
      * Do worker/wrapper for coerces even if the arity is zero.  Thus:
      	stdout = coerce Handle (..blurg..)
      	wibble = (...blurg...)
      	stdout = coerce Handle wibble
        This is good because I found places where we were saying
      	case coerce t stdout of { MVar a ->
      	case coerce t stdout of { MVar b ->
        and the redundant case wasn't getting eliminated because of the coerce.
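      The benefit of passing the demand into the analysis of
      f (let x = e1 in (x,x)) can be sketched by hand.  Below, callerLazy
      is what the code looks like when x is treated as lazily demanded,
      and callerStrict is roughly what becomes possible once x is known
      to be demanded.  All names are hypothetical and bang patterns stand
      in for the analyser's strictness information.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Wrapper and worker for a function that is strict in both components.
f :: (Int, Int) -> Int
f (a, b) = wf a b

wf :: Int -> Int -> Int
wf !a !b = a + b

expensive :: Int -> Int
expensive n = sum [1 .. n]

-- x is bound lazily: a thunk for x and a pair are built, then taken apart.
callerLazy :: Int -> Int
callerLazy n = f (let x = expensive n in (x, x))

-- x is known to be demanded: evaluate it with a case and call the
-- worker directly, with no thunk and no pair.
callerStrict :: Int -> Int
callerStrict n = case expensive n of
  !x -> wf x x

main :: IO ()
main = print (callerLazy 10, callerStrict 10)
```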
  38. 01 Nov, 1999 1 commit
      [project @ 1999-11-01 17:09:54 by simonpj] · 30b5ebe4
      A regrettably-gigantic commit that puts in place what Simon PJ
      has been up to for the last month or so, on and off.
      The basic idea was to restore unfoldings to *occurrences* of
      variables without introducing a space leak.  I wanted to make
      sure things improved relative to 4.04, and that proved depressingly
      hard.  On the way I discovered several quite serious bugs in the
      Here's a summary of what's gone on.
      * No commas between for-alls in RULES.  This makes the for-alls have
        the same syntax as in types.
      * Arrange that simplConArgs works in one less pass than before.
        This exposed a bug: a bogus call to completeBeta.
      * Add a top-level flag in CoreUnfolding, used in callSiteInline
      * Extend w/w to use etaExpandArity, so it does eta/coerce expansion
      * Implement inline phases.   The meaning of the inline pragmas is
        described in CoreUnfold.lhs.  You can say things like
      	{-# INLINE 2 build #-}
        to mean "inline build in phase 2"
      * Don't float anything out of an INLINE.
        Don't float things to top level unless they also escape a value lambda.
      	[see comments with SetLevels.lvlMFE]
        Without at least one of these changes, I found that
      	{-# INLINE concat #-}
      	concat = __inline (/\a -> foldr (++) [])
        was getting floated to
      	concat = __inline( /\a -> lvl a )
      	lvl = ...inlined version of foldr...
        Subsequently I found that not floating constants out of an INLINE
        gave really bad code like
      	__inline (let x = e in \y -> ...)
        so I now let things float out of INLINE
      * Implement the "reverse-mapping" idea for CSE; actually it turned out to be easier
        to implement it in SetLevels, and may benefit full laziness too.
      * It's a good idea to inline inRange. Consider
      	index (l,h) i = case inRange (l,h) i of
      		  	  True ->  l+i
      			  False -> error
        inRange itself isn't strict in h, but if it's inlined then 'index'
        *does* become strict in h.  Interesting!
      * Big change to the way unfoldings and occurrence info is propagated in the simplifier
        The plan is described in Subst.lhs with the Subst type
        Occurrence info is now in a separate IdInfo field from user pragmas
      * I found that
      	(coerce T (coerce S (\x.e))) y
        didn't simplify in one round. First we get to
      	(\x.e) y
        and only then do the beta. Solution: cancel the coerces in the continuation
      * Amazingly, CoreUnfold wasn't counting the cost of a function application.
      * Disable rules in initial simplifier run.  Otherwise full laziness
        doesn't get a chance to lift out a MFE before a rule (e.g. fusion)
        zaps it.  queens is a case in point
      * Improve float-out stuff significantly.  The big change is that if we have
      	\x -> ... /\a -> ...let p = ..a.. in let q = ...p...
        where p's rhs doesn't mention x, we abstract a from p, so that we can get p past x.
        (We did that before.)  But we also substitute (p a) for p in q, and then
        we can do the same thing for q.  (We didn't do that, so q got stuck.)
        This is much better.  It involves doing a substitution "as we go" in SetLevels,