Commit 30c39066 authored by simonpj@microsoft.com's avatar simonpj@microsoft.com
Browse files

Improve inlining for INLINE non-functions

	
(No need to merge to 6.8, but no harm if a subsequent patch needs it.)

The proximate cause for this patch is to improve the inlining for INLINE
things that are not functions; this came up in the NDP project.  See
Note [Lone variables] in CoreUnfold.

This caused some refactoring that actually made things simpler.  In 
particular, more of the inlining logic has moved from SimplUtils to 
CoreUnfold, where it belongs.
parent a8984a87
......@@ -34,7 +34,8 @@ module CoreUnfold (
couldBeSmallEnoughToInline,
certainlyWillInline, smallEnoughToInline,
callSiteInline
callSiteInline, CallContInfo(..)
) where
#include "HsVersions.h"
......@@ -504,12 +505,23 @@ StrictAnal.addStrictnessInfoToTopId
callSiteInline :: DynFlags
-> Bool -- True <=> the Id can be inlined
-> Id -- The Id
-> Bool -- True if there are are no arguments at all (incl type args)
-> [Bool] -- One for each value arg; True if it is interesting
-> Bool -- True <=> continuation is interesting
-> CallContInfo -- True <=> continuation is interesting
-> Maybe CoreExpr -- Unfolding, if any
callSiteInline dflags active_inline id arg_infos interesting_cont
data CallContInfo = BoringCont
| InterestingCont -- Somewhat interesting
| CaseCont -- Very interesting; the argument of a case
-- that decomposes its scrutinee
instance Outputable CallContInfo where
ppr BoringCont = ptext SLIT("BoringCont")
ppr InterestingCont = ptext SLIT("InterestingCont")
ppr CaseCont = ptext SLIT("CaseCont")
callSiteInline dflags active_inline id lone_variable arg_infos cont_info
= case idUnfolding id of {
NoUnfolding -> Nothing ;
OtherCon cs -> Nothing ;
......@@ -529,9 +541,7 @@ callSiteInline dflags active_inline id arg_infos interesting_cont
n_val_args = length arg_infos
yes_or_no
| not active_inline = False
| otherwise = is_cheap && consider_safe False
yes_or_no = active_inline && is_cheap && consider_safe
-- We consider even the once-in-one-branch
-- occurrences, because they won't all have been
-- caught by preInlineUnconditionally. In particular,
......@@ -540,14 +550,13 @@ callSiteInline dflags active_inline id arg_infos interesting_cont
-- pre-inline will not have inlined it for fear of
-- invalidating the occurrence info in the rhs.
consider_safe once
consider_safe
-- consider_safe decides whether it's a good idea to
-- inline something, given that there's no
-- work-duplication issue (the caller checks that).
= case guidance of
UnfoldNever -> False
UnfoldIfGoodArgs n_vals_wanted arg_discounts size res_discount
| enough_args && size <= (n_vals_wanted + 1)
-- Inline unconditionally if there no size increase
-- Size of call is n_vals_wanted (+1 for the function)
......@@ -557,43 +566,46 @@ callSiteInline dflags active_inline id arg_infos interesting_cont
-> some_benefit && small_enough
where
some_benefit = or arg_infos || really_interesting_cont ||
(not is_top && ({- once || -} (n_vals_wanted > 0 && enough_args)))
-- [was (once && not in_lam)]
-- If it occurs more than once, there must be
-- something interesting about some argument, or the
-- result context, to make it worth inlining
--
-- If a function has a nested defn we also record
-- some-benefit, on the grounds that we are often able
-- to eliminate the binding, and hence the allocation,
-- for the function altogether; this is good for join
-- points. But this only makes sense for *functions*;
-- inlining a constructor doesn't help allocation
-- unless the result is scrutinised. UNLESS the
-- constructor occurs just once, albeit possibly in
-- multiple case branches. Then inlining it doesn't
-- increase allocation, but it does increase the
-- chance that the constructor won't be allocated at
-- all in the branches that don't use it.
enough_args = n_val_args >= n_vals_wanted
really_interesting_cont | n_val_args < n_vals_wanted = False -- Too few args
| n_val_args == n_vals_wanted = interesting_cont
| otherwise = True -- Extra args
enough_args = n_val_args >= n_vals_wanted
some_benefit = or arg_infos || really_interesting_cont
-- There must be something interesting
-- about some argument, or the result
-- context, to make it worth inlining
really_interesting_cont
| n_val_args < n_vals_wanted = False -- Too few args
| n_val_args == n_vals_wanted = interesting_saturated_call
| otherwise = True -- Extra args
-- really_interesting_cont tells if the result of the
-- call is in an interesting context.
interesting_saturated_call
= case cont_info of
BoringCont -> not is_top && n_vals_wanted > 0 -- Note [Nested functions]
CaseCont -> not lone_variable || not is_value -- Note [Lone variables]
InterestingCont -> n_vals_wanted > 0
small_enough = (size - discount) <= opt_UF_UseThreshold
discount = computeDiscount n_vals_wanted arg_discounts res_discount
arg_infos really_interesting_cont
discount = computeDiscount n_vals_wanted arg_discounts
res_discount' arg_infos
res_discount' = case cont_info of
BoringCont -> 0
CaseCont -> res_discount
InterestingCont -> 4 `min` res_discount
-- res_discount can be very large when a function returns
-- construtors; but we only want to invoke that large discount
-- when there's a case continuation.
-- Otherwise we, rather arbitrarily, threshold it. Yuk.
-- But we want to aovid inlining large functions that return
-- constructors into contexts that are simply "interesting"
in
if dopt Opt_D_dump_inlinings dflags then
pprTrace "Considering inlining"
(ppr id <+> vcat [text "active:" <+> ppr active_inline,
text "arg infos" <+> ppr arg_infos,
text "interesting continuation" <+> ppr interesting_cont,
text "interesting continuation" <+> ppr cont_info,
text "is value:" <+> ppr is_value,
text "is cheap:" <+> ppr is_cheap,
text "guidance" <+> ppr guidance,
......@@ -602,9 +614,78 @@ callSiteInline dflags active_inline id arg_infos interesting_cont
else
result
}
\end{code}
Note [Nested functions]
~~~~~~~~~~~~~~~~~~~~~~~
If a function has a nested defn we also record some-benefit, on the
grounds that we are often able to eliminate the binding, and hence the
allocation, for the function altogether; this is good for join points.
But this only makes sense for *functions*; inlining a constructor
doesn't help allocation unless the result is scrutinised. UNLESS the
constructor occurs just once, albeit possibly in multiple case
branches. Then inlining it doesn't increase allocation, but it does
increase the chance that the constructor won't be allocated at all in
the branches that don't use it.
Note [Lone variables]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The "lone-variable" case is important. I spent ages messing about
with unsatisfactory varaints, but this is nice. The idea is that if a
variable appears all alone
as an arg of lazy fn, or rhs Stop
as scrutinee of a case Select
as arg of a strict fn ArgOf
AND
it is bound to a value
then we should not inline it (unless there is some other reason,
e.g. is is the sole occurrence). That is what is happening at
the use of 'lone_variable' in 'interesting_saturated_call'.
Why? At least in the case-scrutinee situation, turning
let x = (a,b) in case x of y -> ...
into
let x = (a,b) in case (a,b) of y -> ...
and thence to
let x = (a,b) in let y = (a,b) in ...
is bad if the binding for x will remain.
Another example: I discovered that strings
were getting inlined straight back into applications of 'error'
because the latter is strict.
s = "foo"
f = \x -> ...(error s)...
Fundamentally such contexts should not encourage inlining because the
context can ``see'' the unfolding of the variable (e.g. case or a
RULE) so there's no gain. If the thing is bound to a value.
However, watch out:
* Consider this:
foo = _inline_ (\n. [n])
bar = _inline_ (foo 20)
baz = \n. case bar of { (m:_) -> m + n }
Here we really want to inline 'bar' so that we can inline 'foo'
and the whole thing unravels as it should obviously do. This is
important: in the NDP project, 'bar' generates a closure data
structure rather than a list.
* Even a type application or coercion isn't a lone variable.
Consider
case $fMonadST @ RealWorld of { :DMonad a b c -> c }
We had better inline that sucker! The case won't see through it.
For now, I'm treating treating a variable applied to types
in a *lazy* context "lone". The motivating example was
f = /\a. \x. BIG
g = /\a. \y. h (f a)
There's no advantage in inlining f here, and perhaps
a significant disadvantage. Hence some_val_args in the Stop case
computeDiscount :: Int -> [Int] -> Int -> [Bool] -> Bool -> Int
computeDiscount n_vals_wanted arg_discounts res_discount arg_infos result_used
\begin{code}
computeDiscount :: Int -> [Int] -> Int -> [Bool] -> Int
computeDiscount n_vals_wanted arg_discounts result_discount arg_infos
-- We multiple the raw discounts (args_discount and result_discount)
-- ty opt_UnfoldingKeenessFactor because the former have to do with
-- *size* whereas the discounts imply that there's some extra
......@@ -626,8 +707,4 @@ computeDiscount n_vals_wanted arg_discounts res_discount arg_infos result_used
mk_arg_discount discount is_evald | is_evald = discount
| otherwise = 0
-- Don't give a result discount unless there are enough args
result_discount | result_used = res_discount -- Over-applied, or case scrut
| otherwise = 0
\end{code}
......@@ -621,7 +621,7 @@ Can we eta-expand f? Only if we see that f has arity 1, and then we
take advantage of the 'state hack' on the result of
(f y) :: State# -> (State#, Int) to expand the arity one more.
There is a disadvantage though. Making the arity visible in the RHA
There is a disadvantage though. Making the arity visible in the RHS
allows us to eta-reduce
f = \x -> f x
to
......
......@@ -302,62 +302,25 @@ applies when x is bound to a lambda expression. Hence
contIsInteresting looks for case expressions with just a single
default case.
\begin{code}
interestingCallContext :: Bool -- False <=> no args at all
-> Bool -- False <=> no value args
-> SimplCont -> Bool
-- The "lone-variable" case is important. I spent ages
-- messing about with unsatisfactory varaints, but this is nice.
-- The idea is that if a variable appear all alone
-- as an arg of lazy fn, or rhs Stop
-- as scrutinee of a case Select
-- as arg of a strict fn ArgOf
-- then we should not inline it (unless there is some other reason,
-- e.g. is is the sole occurrence). We achieve this by making
-- interestingCallContext return False for a lone variable.
--
-- Why? At least in the case-scrutinee situation, turning
-- let x = (a,b) in case x of y -> ...
-- into
-- let x = (a,b) in case (a,b) of y -> ...
-- and thence to
-- let x = (a,b) in let y = (a,b) in ...
-- is bad if the binding for x will remain.
--
-- Another example: I discovered that strings
-- were getting inlined straight back into applications of 'error'
-- because the latter is strict.
-- s = "foo"
-- f = \x -> ...(error s)...
-- Fundamentally such contexts should not ecourage inlining because
-- the context can ``see'' the unfolding of the variable (e.g. case or a RULE)
-- so there's no gain.
--
-- However, even a type application or coercion isn't a lone variable.
-- Consider
-- case $fMonadST @ RealWorld of { :DMonad a b c -> c }
-- We had better inline that sucker! The case won't see through it.
--
-- For now, I'm treating treating a variable applied to types
-- in a *lazy* context "lone". The motivating example was
-- f = /\a. \x. BIG
-- g = /\a. \y. h (f a)
-- There's no advantage in inlining f here, and perhaps
-- a significant disadvantage. Hence some_val_args in the Stop case
interestingCallContext some_args some_val_args cont
interestingCallContext :: SimplCont -> CallContInfo
interestingCallContext cont
= interesting cont
where
interesting (Select {}) = some_args
interesting (ApplyTo {}) = True -- Can happen if we have (coerce t (f x)) y
interesting (Select _ bndr _ _ _)
| isDeadBinder bndr = CaseCont
| otherwise = InterestingCont
interesting (ApplyTo {}) = InterestingCont
-- Can happen if we have (coerce t (f x)) y
-- Perhaps True is a bit over-keen, but I've
-- seen (coerce f) x, where f has an INLINE prag,
-- So we have to give some motivaiton for inlining it
interesting (StrictArg {}) = some_val_args
interesting (StrictBind {}) = some_val_args -- ??
interesting (Stop ty _ interesting) = some_val_args && interesting
interesting (CoerceIt _ cont) = interesting cont
-- So we have to give some motivation for inlining it
interesting (StrictArg {}) = InterestingCont
interesting (StrictBind {}) = InterestingCont
interesting (Stop ty _ yes) = if yes then InterestingCont else BoringCont
interesting (CoerceIt _ cont) = interesting cont
-- If this call is the arg of a strict function, the context
-- is a bit interesting. If we inline here, we may get useful
-- evaluation information to avoid repeated evals: e.g.
......@@ -418,7 +381,9 @@ interestingArgContext :: Id -> SimplCont -> Bool
-- where g has rules, then we *do* want to inline f, in case it
-- exposes a rule that might fire. Similarly, if the context is
-- h (g (f x x))
-- where h has rules, then we do want to inline f.
-- where h has rules, then we do want to inline f; hence the
-- call_cont argument to interestingArgContext
--
-- The interesting_arg_ctxt flag makes this happen; if it's
-- set, the inliner gets just enough keener to inline f
-- regardless of how boring f's arguments are, if it's marked INLINE
......@@ -426,8 +391,8 @@ interestingArgContext :: Id -> SimplCont -> Bool
-- The alternative would be to *always* inline an INLINE function,
-- regardless of how boring its context is; but that seems overkill
-- For example, it'd mean that wrapper functions were always inlined
interestingArgContext fn cont
= idHasRules fn || go cont
interestingArgContext fn call_cont
= idHasRules fn || go call_cont
where
go (Select {}) = False
go (ApplyTo {}) = False
......
......@@ -1027,12 +1027,10 @@ completeCall env var cont
------------- Next try inlining ----------------
{ let arg_infos = [interestingArg arg | arg <- args, isValArg arg]
n_val_args = length arg_infos
interesting_cont = interestingCallContext (notNull args)
(notNull arg_infos)
call_cont
interesting_cont = interestingCallContext call_cont
active_inline = activeInline env var
maybe_inline = callSiteInline dflags active_inline
var arg_infos interesting_cont
maybe_inline = callSiteInline dflags active_inline var
(null args) arg_infos interesting_cont
; case maybe_inline of {
Just unfolding -- There is an inlining!
-> do { tick (UnfoldingDone var)
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment