Commit 99d4e5b4 authored by Simon Peyton Jones's avatar Simon Peyton Jones

Implement cardinality analysis

This major patch implements the cardinality analysis described
in our paper "Higher order cardinality analysis". It is joint
work with Ilya Sergey and Dimitrios Vytiniotis.

The basic is augment the absence-analysis part of the demand
analyser so that it can tell when something is used
	 never
	 at most once
 	 some other way

The "at most once" information is used
    a) to enable transformations, and
       in particular to identify one-shot lambdas
    b) to allow updates on thunks to be omitted.

There are two new flags, mainly there so you can do performance
comparisons:
    -fkill-absence   stops GHC doing absence analysis at all
    -fkill-one-shot  stops GHC spotting one-shot lambdas
                     and single-entry thunks

The big changes are:

* The Demand type is substantially refactored.  In particular
  the UseDmd is factored as follows
      data UseDmd
        = UCall Count UseDmd
        | UProd [MaybeUsed]
        | UHead
        | Used

      data MaybeUsed = Abs | Use Count UseDmd

      data Count = One | Many

  Notice that UCall recurses straight to UseDmd, whereas
  UProd goes via MaybeUsed.

  The "Count" embodies the "at most once" or "many" idea.

* The demand analyser itself was refactored a lot

* The previously ad-hoc stuff in the occurrence analyser for foldr and
  build goes away entirely.  Before if we had build (\cn -> ...x... )
  then the "\cn" was hackily made one-shot (by spotting 'build' as
  special.  That's essential to allow x to be inlined.  Now the
  occurrence analyser propagates info gotten from 'build's stricness
  signature (so build isn't special); and that strictness sig is
  in turn derived entirely automatically.  Much nicer!

* The ticky stuff is improved to count single-entry thunks separately.

One shortcoming is that there is no DEBUG way to spot if an
allegedly-single-entry thunk is acually entered more than once.  It
would not be hard to generate a bit of code to check for this, and it
would be reassuring.  But it's fiddly and I have not done it.

Despite all this fuss, the performance numbers are rather under-whelming.
See the paper for more discussion.

       nucleic2          -0.8%    -10.9%      0.10      0.10     +0.0%
         sphere          -0.7%     -1.5%      0.08      0.08     +0.0%
--------------------------------------------------------------------------------
            Min          -4.7%    -10.9%     -9.3%     -9.3%    -50.0%
            Max          -0.4%     +0.5%     +2.2%     +2.3%     +7.4%
 Geometric Mean          -0.8%     -0.2%     -1.3%     -1.3%     -1.8%

I don't quite know how much credence to place in the runtime changes,
but movement seems generally in the right direction.
parent da4ff650
This diff is collapsed.
......@@ -321,10 +321,10 @@ mkDictSelId dflags no_unf name clas
strict_sig = mkStrictSig (mkTopDmdType [arg_dmd] topRes)
arg_dmd | new_tycon = evalDmd
| otherwise = mkProdDmd [ if the_arg_id == id then evalDmd else absDmd
| otherwise = mkManyUsedDmd $
mkProdDmd [ if the_arg_id == id then evalDmd else absDmd
| id <- arg_ids ]
tycon = classTyCon clas
new_tycon = isNewTyCon tycon
[data_con] = tyConDataCons tycon
......
......@@ -58,7 +58,8 @@ module OccName (
-- ** Derived 'OccName's
isDerivedOccName,
mkDataConWrapperOcc, mkWorkerOcc, mkDefaultMethodOcc, mkGenDefMethodOcc,
mkDataConWrapperOcc, mkWorkerOcc, mkDefaultMethodOcc,
mkGenDefMethodOcc,
mkDerivedTyConOcc, mkNewTyCoOcc, mkClassOpAuxOcc,
mkCon2TagOcc, mkTag2ConOcc, mkMaxTagOcc,
mkClassDataConOcc, mkDictOcc, mkIPOcc,
......@@ -574,8 +575,8 @@ isDerivedOccName occ =
\end{code}
\begin{code}
mkDataConWrapperOcc, mkWorkerOcc, mkDefaultMethodOcc, mkGenDefMethodOcc,
mkDerivedTyConOcc, mkClassDataConOcc, mkDictOcc,
mkDataConWrapperOcc, mkWorkerOcc, mkDefaultMethodOcc,
mkGenDefMethodOcc, mkDerivedTyConOcc, mkClassDataConOcc, mkDictOcc,
mkIPOcc, mkSpecOcc, mkForeignExportOcc, mkGenOcc1, mkGenOcc2,
mkGenD, mkGenR, mkGen1R, mkGenRCo,
mkDataTOcc, mkDataCOcc, mkDataConWorkerOcc, mkNewTyCoOcc,
......
......@@ -1005,8 +1005,7 @@ stmtMacros = listToUFM [
tickyAllocPAP goods slop ),
( fsLit "TICK_ALLOC_UP_THK", \[goods,slop] ->
tickyAllocThunk goods slop ),
( fsLit "UPD_BH_UPDATABLE", \[reg] -> emitBlackHoleCode False reg ),
( fsLit "UPD_BH_SINGLE_ENTRY", \[reg] -> emitBlackHoleCode True reg )
( fsLit "UPD_BH_UPDATABLE", \[reg] -> emitBlackHoleCode reg )
]
emitPushUpdateFrame :: CmmExpr -> CmmExpr -> FCode ()
......
......@@ -130,8 +130,7 @@ cgBind (StgNonRec name rhs)
= do { (info, fcode) <- cgRhs name rhs
; addBindC (cg_id info) info
; init <- fcode
; emit init
}
; emit init }
-- init cannot be used in body, so slightly better to sink it eagerly
cgBind (StgRec pairs)
......@@ -209,9 +208,34 @@ cgRhs id (StgRhsCon cc con args)
buildDynCon id True cc con args
cgRhs name (StgRhsClosure cc bi fvs upd_flag _srt args body)
| null fvs -- See Note [Nested constant closures]
= do { (info, fcode) <- cgTopRhsClosure Recursive name cc bi upd_flag args body
; return (info, fcode >> return mkNop) }
| otherwise
= do dflags <- getDynFlags
mkRhsClosure dflags name cc bi (nonVoidIds fvs) upd_flag args body
{- Note [Nested constant closures]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If we have
f x = let funny = not True
in ...
then 'funny' is a nested closure (compiled with cgRhs) that has no free vars.
This does not happen often, because let-floating takes them all to top
level; but it CAN happen. (Reason: let-floating may make a function f smaller
so it can be inlined, so now (f True) may generate a local no-fv closure.
This actually happened during bootsrapping GHC itself, with f=mkRdrFunBind
in TcGenDeriv.)
If we have one of these things, AND they allocate, the heap check will
refer to the static funny_closure; but there isn't one! (Why does the
heap check refer to the static closure? Becuase nodeMustPointToIt is
False, which is fair enough.)
Simple solution: compile the RHS as if it was top level. Then
everything works. A minor benefit is eliminating the allocation code
too. -}
------------------------------------------------------------------------
-- Non-constructor right hand sides
------------------------------------------------------------------------
......@@ -547,8 +571,9 @@ thunkCode cl_info fv_details _cc node arity body
; entryHeapCheck cl_info node' arity [] $ do
{ -- Overwrite with black hole if necessary
-- but *after* the heap-overflow check
; tickyEnterThunk cl_info
; when (blackHoleOnEntry cl_info && node_points)
(blackHoleIt cl_info node)
(blackHoleIt node)
-- Push update frame
; setupUpdate cl_info node $
......@@ -568,14 +593,14 @@ thunkCode cl_info fv_details _cc node arity body
-- Update and black-hole wrappers
------------------------------------------------------------------------
blackHoleIt :: ClosureInfo -> LocalReg -> FCode ()
blackHoleIt :: LocalReg -> FCode ()
-- Only called for closures with no args
-- Node points to the closure
blackHoleIt closure_info node
= emitBlackHoleCode (closureSingleEntry closure_info) (CmmReg (CmmLocal node))
blackHoleIt node_reg
= emitBlackHoleCode (CmmReg (CmmLocal node_reg))
emitBlackHoleCode :: Bool -> CmmExpr -> FCode ()
emitBlackHoleCode is_single_entry node = do
emitBlackHoleCode :: CmmExpr -> FCode ()
emitBlackHoleCode node = do
dflags <- getDynFlags
-- Eager blackholing is normally disabled, but can be turned on with
......@@ -603,7 +628,6 @@ emitBlackHoleCode is_single_entry node = do
-- work with profiling.
when eager_blackholing $ do
tickyBlackHole (not is_single_entry)
emitStore (cmmOffsetW dflags node (fixedHdrSize dflags))
(CmmReg (CmmGlobal CurrentTSO))
emitPrimCall [] MO_WriteBarrier []
......@@ -614,7 +638,7 @@ setupUpdate :: ClosureInfo -> LocalReg -> FCode () -> FCode ()
-- so that the cost centre in the original closure can still be
-- extracted by a subsequent enterCostCentre
setupUpdate closure_info node body
| closureReEntrant closure_info
| not (lfUpdatable (closureLFInfo closure_info))
= body
| not (isStaticClosure closure_info)
......
......@@ -133,7 +133,7 @@ import TyCon
import Data.Maybe
import qualified Data.Char
import Control.Monad ( when )
import Control.Monad ( unless, when )
-----------------------------------------------------------------------------
--
......@@ -238,13 +238,22 @@ tickyEnterDynCon = ifTicky $ bumpTickyCounter (fsLit "ENT_DYN_CON_ctr")
tickyEnterStaticCon = ifTicky $ bumpTickyCounter (fsLit "ENT_STATIC_CON_ctr")
tickyEnterViaNode = ifTicky $ bumpTickyCounter (fsLit "ENT_VIA_NODE_ctr")
tickyEnterThunk :: FCode ()
tickyEnterThunk = ifTicky $ do
bumpTickyCounter (fsLit "ENT_DYN_THK_ctr")
ifTickyDynThunk $ do
ticky_ctr_lbl <- getTickyCtrLabel
registerTickyCtrAtEntryDyn ticky_ctr_lbl
bumpTickyEntryCount ticky_ctr_lbl
tickyEnterThunk :: ClosureInfo -> FCode ()
tickyEnterThunk cl_info
= ifTicky $ do
{ bumpTickyCounter ctr
; unless static $ do
ticky_ctr_lbl <- getTickyCtrLabel
registerTickyCtrAtEntryDyn ticky_ctr_lbl
bumpTickyEntryCount ticky_ctr_lbl }
where
updatable = closureSingleEntry cl_info
static = isStaticClosure cl_info
ctr | static = if updatable then fsLit "ENT_STATIC_THK_SINGLE_ctr"
else fsLit "ENT_STATIC_THK_MANY_ctr"
| otherwise = if updatable then fsLit "ENT_DYN_THK_SINGLE_ctr"
else fsLit "ENT_DYN_THK_MANY_ctr"
tickyEnterStdThunk :: FCode ()
tickyEnterStdThunk = tickyEnterThunk
......
......@@ -345,12 +345,13 @@ cpeBind :: TopLevelFlag -> CorePrepEnv -> CoreBind
-> UniqSM (CorePrepEnv, Floats)
cpeBind top_lvl env (NonRec bndr rhs)
= do { (_, bndr1) <- cpCloneBndr env bndr
; let is_strict = isStrictDmd (idDemandInfo bndr)
; let dmd = idDemandInfo bndr
is_unlifted = isUnLiftedType (idType bndr)
; (floats, bndr2, rhs2) <- cpePair top_lvl NonRecursive
(is_strict || is_unlifted)
dmd
is_unlifted
env bndr1 rhs
; let new_float = mkFloat is_strict is_unlifted bndr2 rhs2
; let new_float = mkFloat dmd is_unlifted bndr2 rhs2
-- We want bndr'' in the envt, because it records
-- the evaluated-ness of the binder
......@@ -360,7 +361,7 @@ cpeBind top_lvl env (NonRec bndr rhs)
cpeBind top_lvl env (Rec pairs)
= do { let (bndrs,rhss) = unzip pairs
; (env', bndrs1) <- cpCloneBndrs env (map fst pairs)
; stuff <- zipWithM (cpePair top_lvl Recursive False env') bndrs1 rhss
; stuff <- zipWithM (cpePair top_lvl Recursive topDmd False env') bndrs1 rhss
; let (floats_s, bndrs2, rhss2) = unzip3 stuff
all_pairs = foldrOL add_float (bndrs2 `zip` rhss2)
......@@ -375,11 +376,11 @@ cpeBind top_lvl env (Rec pairs)
add_float b _ = pprPanic "cpeBind" (ppr b)
---------------
cpePair :: TopLevelFlag -> RecFlag -> RhsDemand
cpePair :: TopLevelFlag -> RecFlag -> Demand -> Bool
-> CorePrepEnv -> Id -> CoreExpr
-> UniqSM (Floats, Id, CpeRhs)
-- Used for all bindings
cpePair top_lvl is_rec is_strict_or_unlifted env bndr rhs
cpePair top_lvl is_rec dmd is_unlifted env bndr rhs
= do { (floats1, rhs1) <- cpeRhsE env rhs
-- See if we are allowed to float this stuff out of the RHS
......@@ -392,7 +393,7 @@ cpePair top_lvl is_rec is_strict_or_unlifted env bndr rhs
else WARN(True, text "CorePrep: silly extra arguments:" <+> ppr bndr)
-- Note [Silly extra arguments]
(do { v <- newVar (idType bndr)
; let float = mkFloat False False v rhs2
; let float = mkFloat topDmd False v rhs2
; return ( addFloat floats2 float
, cpeEtaExpand arity (Var v)) })
......@@ -406,6 +407,8 @@ cpePair top_lvl is_rec is_strict_or_unlifted env bndr rhs
; return (floats3, bndr', rhs') }
where
is_strict_or_unlifted = (isStrictDmd dmd) || is_unlifted
platform = targetPlatform (cpe_dynFlags env)
arity = idArity bndr -- We must match this arity
......@@ -650,9 +653,8 @@ cpeApp env expr
[] -> (topDmd, [])
(arg_ty, res_ty) = expectJust "cpeBody:collect_args" $
splitFunTy_maybe fun_ty
is_strict = isStrictDmd ss1
; (fs, arg') <- cpeArg env is_strict arg arg_ty
; (fs, arg') <- cpeArg env ss1 arg arg_ty
; return (App fun' arg', hd, res_ty, fs `appendFloats` floats, ss_rest) }
collect_args (Var v) depth
......@@ -682,8 +684,8 @@ cpeApp env expr
-- N-variable fun, better let-bind it
collect_args fun depth
= do { (fun_floats, fun') <- cpeArg env True fun ty
-- The True says that it's sure to be evaluated,
= do { (fun_floats, fun') <- cpeArg env evalDmd fun ty
-- The evalDmd says that it's sure to be evaluated,
-- so we'll end up case-binding it
; return (fun', (fun', depth), ty, fun_floats, []) }
where
......@@ -694,9 +696,9 @@ cpeApp env expr
-- ---------------------------------------------------------------------------
-- This is where we arrange that a non-trivial argument is let-bound
cpeArg :: CorePrepEnv -> RhsDemand
cpeArg :: CorePrepEnv -> Demand
-> CoreArg -> Type -> UniqSM (Floats, CpeTriv)
cpeArg env is_strict arg arg_ty
cpeArg env dmd arg arg_ty
= do { (floats1, arg1) <- cpeRhsE env arg -- arg1 can be a lambda
; (floats2, arg2) <- if want_float floats1 arg1
then return (floats1, arg1)
......@@ -710,11 +712,12 @@ cpeArg env is_strict arg arg_ty
else do
{ v <- newVar arg_ty
; let arg3 = cpeEtaExpand (exprArity arg2) arg2
arg_float = mkFloat is_strict is_unlifted v arg3
arg_float = mkFloat dmd is_unlifted v arg3
; return (addFloat floats2 arg_float, varToCoreExpr v) } }
where
is_unlifted = isUnLiftedType arg_ty
want_float = wantFloatNested NonRecursive (is_strict || is_unlifted)
is_strict = isStrictDmd dmd
want_float = wantFloatNested NonRecursive (is_strict || is_unlifted)
\end{code}
Note [Floating unlifted arguments]
......@@ -909,20 +912,16 @@ tryEtaReducePrep _ _ = Nothing
\end{code}
-- -----------------------------------------------------------------------------
-- Demands
-- -----------------------------------------------------------------------------
\begin{code}
type RhsDemand = Bool -- True => used strictly; hence not top-level, non-recursive
\end{code}
%************************************************************************
%* *
Floats
%* *
%************************************************************************
Note [Pin demand info on floats]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We pin demand info on floated lets so that we can see the one-shot thunks.
\begin{code}
data FloatingBind
= FloatLet CoreBind -- Rhs of bindings are CpeRhss
......@@ -957,12 +956,16 @@ data OkToSpec
-- ok-to-speculate unlifted bindings
| NotOkToSpec -- Some not-ok-to-speculate unlifted bindings
mkFloat :: Bool -> Bool -> Id -> CpeRhs -> FloatingBind
mkFloat is_strict is_unlifted bndr rhs
mkFloat :: Demand -> Bool -> Id -> CpeRhs -> FloatingBind
mkFloat dmd is_unlifted bndr rhs
| use_case = FloatCase bndr rhs (exprOkForSpeculation rhs)
| otherwise = FloatLet (NonRec bndr rhs)
| is_hnf = FloatLet (NonRec bndr rhs)
| otherwise = FloatLet (NonRec (setIdDemandInfo bndr dmd) rhs)
-- See Note [Pin demand info on floats]
where
use_case = is_unlifted || is_strict && not (exprIsHNF rhs)
is_hnf = exprIsHNF rhs
is_strict = isStrictDmd dmd
use_case = is_unlifted || is_strict && not is_hnf
-- Don't make a case for a value binding,
-- even if it's strict. Otherwise we get
-- case (\x -> e) of ...!
......
......@@ -273,6 +273,8 @@ data GeneralFlag
-- optimisation opts
| Opt_Strictness
| Opt_KillAbsence
| Opt_KillOneShot
| Opt_FullLaziness
| Opt_FloatIn
| Opt_Specialise
......@@ -2534,7 +2536,9 @@ fFlags = [
( "hpc", Opt_Hpc, nop ),
( "pre-inlining", Opt_SimplPreInlining, nop ),
( "flat-cache", Opt_FlatCache, nop ),
( "use-rpaths", Opt_RPath, nop )
( "use-rpaths", Opt_RPath, nop ),
( "kill-absence", Opt_KillAbsence, nop),
( "kill-one-shot", Opt_KillOneShot, nop)
]
-- | These @-f\<blah\>@ flags can all be reversed with @-fno-\<blah\>@
......
......@@ -31,10 +31,9 @@ import Coercion
import VarSet
import VarEnv
import Var
import Demand ( argOneShots, argsOneShots )
import Maybes ( orElse )
import Digraph ( SCC(..), stronglyConnCompFromEdgedVerticesR )
import PrelNames ( buildIdKey, foldrIdKey, runSTRepIdKey, augmentIdKey )
import Unique
import UniqFM
import Util
......@@ -138,7 +137,7 @@ occAnalBind env _ imp_rules_edges (NonRec binder rhs) body_usage
= (body_usage' +++ rhs_usage4, [NonRec tagged_binder rhs'])
where
(body_usage', tagged_binder) = tagBinder body_usage binder
(rhs_usage1, rhs') = occAnalRhs env (Just tagged_binder) rhs
(rhs_usage1, rhs') = occAnalNonRecRhs env tagged_binder rhs
rhs_usage2 = addIdOccs rhs_usage1 (idUnfoldingVars binder)
rhs_usage3 = addIdOccs rhs_usage2 (idRuleVars binder)
-- See Note [Rules are extra RHSs] and Note [Rule dependency info]
......@@ -665,7 +664,7 @@ makeNode env imp_rules_edges bndr_set (bndr, rhs)
-- Constructing the edges for the main Rec computation
-- See Note [Forming Rec groups]
(rhs_usage1, rhs') = occAnalRhs env Nothing rhs
(rhs_usage1, rhs') = occAnalRecRhs env rhs
rhs_usage2 = addIdOccs rhs_usage1 all_rule_fvs -- Note [Rules are extra RHSs]
-- Note [Rule dependency info]
rhs_usage3 = case mb_unf_fvs of
......@@ -1065,28 +1064,36 @@ ToDo: try using the occurrence info for the inline'd binder.
\begin{code}
occAnalRhs :: OccEnv
-> Maybe Id -> CoreExpr -- Binder and rhs
-- Just b => non-rec, and alrady tagged with occurrence info
-- Nothing => Rec, no occ info
occAnalRecRhs :: OccEnv -> CoreExpr -- Rhs
-> (UsageDetails, CoreExpr)
-- Returned usage details covers only the RHS,
-- and *not* the RULE or INLINE template for the Id
occAnalRhs env mb_bndr rhs
= occAnal ctxt rhs
occAnalRecRhs env rhs = occAnal (rhsCtxt env) rhs
occAnalNonRecRhs :: OccEnv
-> Id -> CoreExpr -- Binder and rhs
-- Binder is already tagged with occurrence info
-> (UsageDetails, CoreExpr)
-- Returned usage details covers only the RHS,
-- and *not* the RULE or INLINE template for the Id
occAnalNonRecRhs env bndr rhs
= occAnal rhs_env rhs
where
-- See Note [Use one-shot info]
env1 = env { occ_one_shots = argOneShots dmd }
-- See Note [Cascading inlines]
ctxt = case mb_bndr of
Just b | certainly_inline b -> env
_other -> rhsCtxt env
rhs_env | certainly_inline = env1
| otherwise = rhsCtxt env1
certainly_inline bndr -- See Note [Cascading inlines]
certainly_inline -- See Note [Cascading inlines]
= case idOccInfo bndr of
OneOcc in_lam one_br _ -> not in_lam && one_br && active && not_stable
_ -> False
where
active = isAlwaysActive (idInlineActivation bndr)
not_stable = not (isStableUnfolding (idUnfolding bndr))
dmd = idDemandInfo bndr
active = isAlwaysActive (idInlineActivation bndr)
not_stable = not (isStableUnfolding (idUnfolding bndr))
addIdOccs :: UsageDetails -> VarSet -> UsageDetails
addIdOccs usage id_set = foldVarSet add usage id_set
......@@ -1223,24 +1230,13 @@ occAnal env expr@(Lam _ _)
(final_usage, tagged_binders) = tagLamBinders body_usage binders'
-- Use binders' to put one-shot info on the lambdas
-- URGH! Sept 99: we don't seem to be able to use binders' here, because
-- we get linear-typed things in the resulting program that we can't handle yet.
-- (e.g. PrelShow) TODO
really_final_usage = if linear then
final_usage
else
mapVarEnv markInsideLam final_usage
really_final_usage | linear = final_usage
| otherwise = mapVarEnv markInsideLam final_usage
in
(really_final_usage,
mkLams tagged_binders body') }
(really_final_usage, mkLams tagged_binders body') }
where
env_body = vanillaCtxt env
-- Body is (no longer) an RhsContext
(binders, body) = collectBinders expr
binders' = oneShotGroup env binders
linear = all is_one_shot binders'
is_one_shot b = isId b && isOneShotBndr b
(binders, body) = collectBinders expr
(env_body, binders', linear) = oneShotGroup env binders
occAnal env (Case scrut bndr ty alts)
= case occ_anal_scrut scrut alts of { (scrut_usage, scrut') ->
......@@ -1282,12 +1278,20 @@ occAnal env (Let bind body)
case occAnalBind env env emptyVarEnv bind body_usage of { (final_usage, new_binds) ->
(final_usage, mkLets new_binds body') }}
occAnalArgs :: OccEnv -> [CoreExpr] -> (UsageDetails, [CoreExpr])
occAnalArgs env args
= case mapAndUnzip (occAnal arg_env) args of { (arg_uds_s, args') ->
(foldr (+++) emptyDetails arg_uds_s, args')}
where
arg_env = vanillaCtxt env
occAnalArgs :: OccEnv -> [CoreExpr] -> [OneShots] -> (UsageDetails, [CoreExpr])
occAnalArgs _ [] _
= (emptyDetails, [])
occAnalArgs env (arg:args) one_shots
| isTypeArg arg
= case occAnalArgs env args one_shots of { (uds, args') ->
(uds, arg:args') }
| otherwise
= case argCtxt env one_shots of { (arg_env, one_shots') ->
case occAnal arg_env arg of { (uds1, arg') ->
case occAnalArgs env args one_shots' of { (uds2, args') ->
(uds1 +++ uds2, arg':args') }}}
\end{code}
Applications are dealt with specially because we want
......@@ -1324,27 +1328,23 @@ occAnalApp env (Var fun, args)
in
(fun_uds +++ final_args_uds, mkApps (Var fun) args') }
where
fun_uniq = idUnique fun
fun_uds = mkOneOcc env fun (valArgCount args > 0)
is_exp = isExpandableApp fun (valArgCount args)
-- See Note [CONLIKE pragma] in BasicTypes
-- The definition of is_exp should match that in
-- Simplify.prepareRhs
-- Hack for build, fold, runST
args_stuff | fun_uniq == buildIdKey = appSpecial env 2 [True,True] args
| fun_uniq == augmentIdKey = appSpecial env 2 [True,True] args
| fun_uniq == foldrIdKey = appSpecial env 3 [False,True] args
| fun_uniq == runSTRepIdKey = appSpecial env 2 [True] args
one_shots = argsOneShots (idStrictness fun) (valArgCount args)
-- See Note [Use one-shot info]
args_stuff = occAnalArgs env args one_shots
-- (foldr k z xs) may call k many times, but it never
-- shares a partial application of k; hence [False,True]
-- This means we can optimise
-- foldr (\x -> let v = ...x... in \y -> ...v...) z xs
-- by floating in the v
| otherwise = occAnalArgs env args
occAnalApp env (fun, args)
= case occAnal (addAppCtxt env args) fun of { (fun_uds, fun') ->
-- The addAppCtxt is a bit cunning. One iteration of the simplifier
......@@ -1354,11 +1354,8 @@ occAnalApp env (fun, args)
-- thing much like a let. We do this by pushing some True items
-- onto the context stack.
case occAnalArgs env args of { (args_uds, args') ->
let
final_uds = fun_uds +++ args_uds
in
(final_uds, mkApps fun' args') }}
case occAnalArgs env args [] of { (args_uds, args') ->
(fun_uds +++ args_uds, mkApps fun' args') }}
markManyIf :: Bool -- If this is true
......@@ -1366,29 +1363,23 @@ markManyIf :: Bool -- If this is true
-> UsageDetails
markManyIf True uds = mapVarEnv markMany uds
markManyIf False uds = uds
\end{code}
appSpecial :: OccEnv
-> Int -> CtxtTy -- Argument number, and context to use for it
-> [CoreExpr]
-> (UsageDetails, [CoreExpr])
appSpecial env n ctxt args
= go n args
where
arg_env = vanillaCtxt env
go _ [] = (emptyDetails, []) -- Too few args
go 1 (arg:args) -- The magic arg
= case occAnal (setCtxtTy arg_env ctxt) arg of { (arg_uds, arg') ->
case occAnalArgs env args of { (args_uds, args') ->
(arg_uds +++ args_uds, arg':args') }}
Note [Use one-shot information]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The occurrrence analyser propagates one-shot-lambda information in two situation
* Applications: eg build (\cn -> blah)
Propagate one-shot info from the strictness signature of 'build' to
the \cn
go n (arg:args)
= case occAnal arg_env arg of { (arg_uds, arg') ->
case go (n-1) args of { (args_uds, args') ->
(arg_uds +++ args_uds, arg':args') }}
\end{code}
* Let-bindings: eg let f = \c. let ... in \n -> blah
in (build f, build f)
Propagate one-shot info from the demanand-info on 'f' to the
lambdas in its RHS (which may not be syntactically at the top)
Some of this is done by the demand analyser, but this way it happens
much earlier, taking advantage of the strictness signature of
imported functions.
Note [Binders in case alternatives]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......@@ -1441,10 +1432,10 @@ wrapProxy _ _ _ body_usg body
\begin{code}
data OccEnv
= OccEnv { occ_encl :: !OccEncl -- Enclosing context information
, occ_ctxt :: !CtxtTy -- Tells about linearity
, occ_gbl_scrut :: GlobalScruts
, occ_rule_act :: Activation -> Bool -- Which rules are active
= OccEnv { occ_encl :: !OccEncl -- Enclosing context information
, occ_one_shots :: !OneShots -- Tells about linearity
, occ_gbl_scrut :: GlobalScruts
, occ_rule_act :: Activation -> Bool -- Which rules are active
-- See Note [Finding rule RHS free vars]
, occ_binder_swap :: !Bool -- enable the binder_swap
-- See CorePrep Note [Dead code in CorePrep]
......@@ -1471,7 +1462,7 @@ instance Outputable OccEncl where
ppr OccRhs = ptext (sLit "occRhs")
ppr OccVanilla = ptext (sLit "occVanilla")
type CtxtTy = [Bool]
type OneShots = [Bool]
-- [] No info
--
-- True:ctxt Analysing a function-valued expression that will be
......@@ -1479,51 +1470,66 @@ type CtxtTy = [Bool]
--
-- False:ctxt Analysing a function-valued expression that may
-- be applied many times; but when it is,
-- the CtxtTy inside applies
-- the OneShots inside applies
initOccEnv :: (Activation -> Bool) -> OccEnv
initOccEnv active_rule
= OccEnv { occ_encl = OccVanilla
, occ_ctxt = []
= OccEnv { occ_encl = OccVanilla
, occ_one_shots = []
, occ_gbl_scrut = emptyVarSet -- PE emptyVarEnv emptyVarSet
, occ_rule_act = active_rule
, occ_rule_act = active_rule
, occ_binder_swap = True }
vanillaCtxt :: OccEnv -> OccEnv
vanillaCtxt env = env { occ_encl = OccVanilla, occ_ctxt = [] }
vanillaCtxt env = env { occ_encl = OccVanilla, occ_one_shots = [] }
rhsCtxt :: OccEnv -> OccEnv
rhsCtxt env = env { occ_encl = OccRhs, occ_ctxt = [] }
rhsCtxt env = env { occ_encl = OccRhs, occ_one_shots = [] }
setCtxtTy :: OccEnv -> CtxtTy -> OccEnv
setCtxtTy env ctxt = env { occ_ctxt = ctxt }
argCtxt :: OccEnv -> [OneShots] -> (OccEnv, [OneShots])
argCtxt env []
= (env { occ_encl = OccVanilla, occ_one_shots = [] }, [])
argCtxt env (one_shots:one_shots_s)
= (env { occ_encl = OccVanilla, occ_one_shots = one_shots }, one_shots_s)
isRhsEnv :: OccEnv -> Bool
isRhsEnv (OccEnv { occ_encl = OccRhs }) = True
isRhsEnv (OccEnv { occ_encl = OccVanilla }) = False
oneShotGroup :: OccEnv -> [CoreBndr] -> [CoreBndr]
oneShotGroup :: OccEnv -> [CoreBndr]
-> ( OccEnv
, [CoreBndr]
, Bool ) -- True <=> all binders are one-shot
-- The result binders have one-shot-ness set that they might not have had originally.
-- This happens in (build (\cn -> e)). Here the occurrence analyser
-- linearity context knows that c,n are one-shot, and it records that fact in
-- the binder. This is useful to guide subsequent float-in/float-out tranformations
oneShotGroup (OccEnv { occ_ctxt = ctxt }) bndrs
= go ctxt bndrs []
oneShotGroup env@(OccEnv { occ_one_shots = ctxt }) bndrs
= go ctxt bndrs [] True
where
go _ [] rev_bndrs = reverse rev_bndrs
go (lin_ctxt:ctxt) (bndr:bndrs) rev_bndrs
| isId bndr = go ctxt bndrs (bndr':rev_bndrs)
where
bndr' | lin_ctxt = setOneShotLambda bndr
| otherwise = bndr
go ctxt (bndr:bndrs) rev_bndrs = go ctxt bndrs (bndr:rev_bndrs)
go ctxt [] rev_bndrs linear
= ( env { occ_one_shots = ctxt, occ_encl = OccVanilla }
, reverse rev_bndrs
, linear )
go ctxt (bndr:bndrs) rev_bndrs lin_acc
| isId bndr