Commit 980372f3 authored by Simon Peyton Jones's avatar Simon Peyton Jones
Browse files

Re-do the "function application discount" (fixes Trac #6048)

* Undoes Max's very aggressive function-inlining change
  (see comments with Trac #6048)

* Resticts function application discount to functions
  that occur just once in the body. It was the multiple
  occurrences that led to the exponential behavour in
  Trac #6048.

See Note [Function application discount] in CoreUnfold.

Module binary sizes are down 2% on average, which is good.
Allocations wobble about a bit, but only on a few benchmarks
and not by much, so it seems a price worth paying to avoid
exponential behaviour!

                         Allocs
            Min           -1.2%
            Max           +2.8%
 Geometric Mean           +0.0%
parent 2112f43c
......@@ -163,7 +163,7 @@ mkUnfolding src top_lvl is_bottoming expr
, not (exprIsTrivial expr)
= NoUnfolding -- See Note [Do not inline top-level bottoming functions]
| otherwise
= CoreUnfolding { uf_tmpl = occurAnalyseExpr expr,
= CoreUnfolding { uf_tmpl = occ_anald_expr,
uf_src = src,
uf_arity = arity,
uf_is_top = top_lvl,
......@@ -173,7 +173,8 @@ mkUnfolding src top_lvl is_bottoming expr
uf_is_work_free = exprIsWorkFree expr,
uf_guidance = guidance }
where
(arity, guidance) = calcUnfoldingGuidance expr
occ_anald_expr = occurAnalyseExpr expr
(arity, guidance) = calcUnfoldingGuidance occ_anald_expr
-- Sometimes during simplification, there's a large let-bound thing
-- which has been substituted, and so is now dead; so 'expr' contains
-- two copies of the thing while the occurrence-analysed expression doesn't
......@@ -501,80 +502,6 @@ sizeExpr bOMB_OUT_SIZE top_args expr
d2 -- Ignore d1
\end{code}
Note [Function application discount]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I noticed that the output of the supercompiler generates a lot of code
with this form:
"""
module Inlining where
h1 k = k undefined undefined undefined
undefined undefined undefined
undefined undefined undefined
undefined undefined undefined
undefined undefined undefined
undefined undefined undefined
a = h1 (\x _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
b = h1 (\_ x _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
c = h1 (\_ _ x _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
d = h1 (\_ _ _ x _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
e = h1 (\_ _ _ _ x _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
f = h1 (\_ _ _ _ _ x _ _ _ _ _ _ _ _ _ _ _ _ -> x)
g = h1 (\_ _ _ _ _ _ x _ _ _ _ _ _ _ _ _ _ _ -> x)
h = h1 (\_ _ _ _ _ _ _ x _ _ _ _ _ _ _ _ _ _ -> x)
i = h1 (\_ _ _ _ _ _ _ _ x _ _ _ _ _ _ _ _ _ -> x)
j = h1 (\_ _ _ _ _ _ _ _ _ x _ _ _ _ _ _ _ _ -> x)
k = h1 (\_ _ _ _ _ _ _ _ _ _ x _ _ _ _ _ _ _ -> x)
l = h1 (\_ _ _ _ _ _ _ _ _ _ _ x _ _ _ _ _ _ -> x)
m = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ x _ _ _ _ _ -> x)
n = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ x _ _ _ _ -> x)
o = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ x _ _ _ -> x)
p = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ x _ _ -> x)
q = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ x _ -> x)
r = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ x -> x)
"""
With GHC head the applications of h1 are not inlined, which hurts the
quality of the generated code a bit. I was wondering why h1 wasn't
getting inlined into each of "a" to "i" - after all, it has a manifest
lambda argument.
It turns out that the code in CoreUnfold gives a fixed discount of
opt_UF_FunAppDiscount to a function argument such as "k" if it applied
to any arguments. This is enough to ensure that h1 is inlined if the number
of arguments applied to k is below a certain limit, but if many arguments are
applied to k then the fixed discount can't overcome the size of the
chain of apps, and h1 is never inlined.
My proposed solution is to change CoreUnfold.funSize so that longer
chains of arguments being applied to a lambda-bound function give a
bigger discount. The motivation for this is that we would *generally*
expect that the lambda at the callsite has enough lambdas such that
all of the applications within the body can be beta-reduced away. This
change might lead to over eager inlining in cases like this, though:
{{{
h1 k = k x y z
{-# NOINLINE g #-}
g = ...
main = ... h1 (\x -> g x) ...
}}}
In this case we aren't able to beta-reduce away all of the
applications in the body of h1 because the lambda at the call site
only binds 1 argument, not the 3 allowed by the type. I don't expect
this case to be particularly common, however.
I chose the bonus to be (size - 20) so that application to 1 arg got
same bonus as the old fixed bonus (i.e. opt_UF_FunAppDiscount, which is 60).
If you have the bonus being (size - 40) then $fMonad[]_$c>>= with interesting
2nd arg doesn't inline in cryptarithm2 so we lose some deforestation, and
overall binary size hardly falls.
\begin{code}
-- | Finds a nominal size of a string literal.
......@@ -615,23 +542,29 @@ funSize top_args fun n_val_args
where
some_val_args = n_val_args > 0
-- See Note [Function application discount]
arg_discount | some_val_args && fun `elem` top_args
= unitBag (fun, opt_UF_FunAppDiscount + (size - 20))
| otherwise = emptyBag
-- If the function is an argument and is applied
-- to some values, give it an arg-discount
-- See Note [Function application discount]
res_discount | idArity fun > n_val_args = opt_UF_FunAppDiscount + (size - 20)
| otherwise = 0
-- If the function is partially applied, show a result discount
size | some_val_args = 10 * (1 + n_val_args)
| otherwise = 0
-- The 1+ is for the function itself
-- Add 1 for each non-trivial arg;
-- the allocation cost, as in let(rec)
-- DISCOUNTS
-- See Note [Function application discounts]
arg_discount | some_val_args && one_call fun top_args
= unitBag (fun, opt_UF_FunAppDiscount)
| otherwise = emptyBag
-- If the function is an argument and is applied
-- to some values, give it an arg-discount
res_discount | idArity fun > n_val_args = opt_UF_FunAppDiscount
| otherwise = 0
-- If the function is partially applied, show a result discount
one_call _ [] = False
one_call fun (arg:args) | fun==arg = case idOccInfo arg of
OneOcc _ one_branch _ -> one_branch
_ -> False
| otherwise = one_call fun args
conSize :: DataCon -> Int -> ExprSize
conSize dc n_val_args
......@@ -648,6 +581,21 @@ conSize dc n_val_args
-- [SDM, 25/5/11]
\end{code}
Note [Function application discount]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We want a discount if the function is applied. A good example is
monadic combinators with continuation arguments, where inlining is
quite important.
But we don't want a big discount when a function is called many times
(see the detailed comments with Trac #6048) because if the function is
big it won't be inlined at its many call sites and no benefit results.
Indeed, we can get exponentially big inlinings this way; that is what
Trac #6048 is about.
So, we only give a function-application discount when the function appears
textually once, albeit possibly inside a lambda.
Note [Literal integer size]
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Literal integers *can* be big (mkInteger [...coefficients...]), but
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment