Commit b9b1f999 authored by Tobias Dammers's avatar Tobias Dammers 🦈 Committed by Krzysztof Gogolewski

Honor INLINE on 0-arity bindings (#15578)

Summary:
Fix test for #15578

By allowing 0-arity values to be inlined, we end up changing boringness
annotations, and this gets reflected in the Core output for this
particular test.

Add Notes for #15578

Test Plan: ./validate

Reviewers: simonpj, bgamari

Reviewed By: simonpj

Subscribers: rwbarton, carter

GHC Trac Issues: #15578

Differential Revision: https://phabricator.haskell.org/D5137
parent 900c47f8
......@@ -159,7 +159,10 @@ mkInlineUnfoldingWithArity arity expr
guide = UnfWhen { ug_arity = arity
, ug_unsat_ok = needSaturated
, ug_boring_ok = boring_ok }
boring_ok = inlineBoringOk expr'
-- See Note [INLINE pragmas and boring contexts] as to why we need to look
-- at the arity here.
boring_ok | arity == 0 = True
| otherwise = inlineBoringOk expr'
mkInlinableUnfolding :: DynFlags -> CoreExpr -> Unfolding
mkInlinableUnfolding dflags expr
......@@ -236,6 +239,72 @@ specUnfolding to specialise its unfolding. Some important points:
we keep it (so the specialised thing too will always inline)
if a stable unfolding has UnfoldingGuidance of UnfIfGoodArgs
(which arises from INLINABLE), we discard it
Note [Honour INLINE on 0-ary bindings]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider
x = <expensive>
{-# INLINE x #-}
f y = ...x...
The semantics of an INLINE pragma is
inline x at every call site, provided it is saturated;
that is, applied to at least as many arguments as appear
on the LHS of the Haskell source definition.
(This soure-code-derived arity is stored in the `ug_arity` field of
the `UnfoldingGuidance`.)
In the example, x's ug_arity is 0, so we should inline it at every use
site. It's rare to have such an INLINE pragma (usually INLINE Is on
functions), but it's occasionally very important (Trac #15578, #15519).
In #15519 we had something like
x = case (g a b) of I# r -> T r
{-# INLINE x #-}
f y = ...(h x)....
where h is strict. So we got
f y = ...(case g a b of I# r -> h (T r))...
and that in turn allowed SpecConstr to ramp up performance.
How do we deliver on this? By adjusting the ug_boring_ok
flag in mkInlineUnfoldingWithArity; see
Note [INLINE pragmas and boring contexts]
NB: there is a real risk that full laziness will float it right back
out again. Consider again
x = factorial 200
{-# INLINE x #-}
f y = ...x...
After inlining we get
f y = ...(factorial 200)...
but it's entirely possible that full laziness will do
lvl23 = factorial 200
f y = ...lvl23...
That's a problem for another day.
Note [INLINE pragmas and boring contexts]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
An INLINE pragma uses mkInlineUnfoldingWithArity to build the
unfolding. That sets the ug_boring_ok flag to False if the function
is not tiny (inlineBorkingOK), so that even INLINE functions are not
inlined in an utterly boring context. E.g.
\x y. Just (f y x)
Nothing is gained by inlining f here, even if it has an INLINE
pragma.
But for 0-ary bindings, we want to inline regardless; see
Note [Honour INLINE on 0-ary bindings].
I'm a bit worried that it's possible for the same kind of problem
to arise for non-0-ary functions too, but let's wait and see.
-}
mkCoreUnfolding :: UnfoldingSource -> Bool -> CoreExpr
......@@ -1449,6 +1518,8 @@ This kind of thing can occur if you have
foo = let x = e in (x,x)
which Roman did.
-}
computeDiscount :: DynFlags -> [Int] -> Int -> [ArgSummary] -> CallCtxt
......
......@@ -3420,14 +3420,24 @@ simplStableUnfolding env top_lvl mb_cont id unf rhs_ty
Just cont -> simplJoinRhs unf_env id expr cont
Nothing -> simplExprC unf_env expr (mkBoringStop rhs_ty)
; case guide of
UnfWhen { ug_arity = arity, ug_unsat_ok = sat_ok } -- Happens for INLINE things
-> let guide' = UnfWhen { ug_arity = arity, ug_unsat_ok = sat_ok
, ug_boring_ok = inlineBoringOk expr' }
UnfWhen { ug_arity = arity
, ug_unsat_ok = sat_ok
, ug_boring_ok = boring_ok
}
-- Happens for INLINE things
-> let guide' =
UnfWhen { ug_arity = arity
, ug_unsat_ok = sat_ok
, ug_boring_ok =
boring_ok || inlineBoringOk expr'
}
-- Refresh the boring-ok flag, in case expr'
-- has got small. This happens, notably in the inlinings
-- for dfuns for single-method classes; see
-- Note [Single-method classes] in TcInstDcls.
-- A test case is Trac #4138
-- But retain a previous boring_ok of True; e.g. see
-- the way it is set in calcUnfoldingGuidanceWithArity
in return (mkCoreUnfolding src is_top_lvl expr' guide')
-- See Note [Top-level flag on inline rules] in CoreUnfold
......
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE Strict #-}
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE DeriveGeneric #-}
module Main where
import qualified Data.Set as Set
import qualified Data.Text as Text
import Data.Set (Set)
import Data.Text (Text)
import System.IO (BufferMode (NoBuffering), hSetBuffering, stdout)
import GHC.Generics (Generic)
import Control.DeepSeq (force, NFData)
import Control.Exception (evaluate)
--------------------------------
-- === Running benchmarks === --
--------------------------------
iters :: Int
iters = 100000000
src1 :: Text
src1 = Text.replicate iters "tttt"
data Grammar a
= Tokens !(Set a) !(a -> Bool)
| Many !(Grammar a)
| X !(Grammar a)
instance Ord a => Semigroup (Grammar a) where
Tokens s f <> Tokens s' g = Tokens (s <> s') $ \c -> f c || g c
{-# INLINE (<>) #-}
token :: Eq a => a -> Grammar a
token = \a -> Tokens (Set.singleton a) (a ==)
{-# INLINE token #-}
many :: Grammar a -> Grammar a
many = Many
{-# INLINE many #-}
data Result
= Success Text Text
| Fail
deriving (Show, Generic)
instance NFData Result
runTokenParser :: Grammar Char -> Text -> Result
runTokenParser = \grammar stream -> case grammar of
Tokens _ tst -> let
head = Text.head stream
in if tst head
then Success (Text.tail stream) (Text.singleton head)
else Fail
Many (Tokens _ tst) -> let
(!consumed, !rest) = Text.span tst stream
in Success rest consumed
X !grammar -> runTokenParser grammar stream
testGrammar1 :: Grammar Char
testGrammar1 = let
s1 = token 't'
in many s1
{-# INLINE testGrammar1 #-}
test3 :: Text -> Result
test3 src =
runTokenParser testGrammar1 src
{-# NOINLINE test3 #-}
main :: IO ()
main = do
srcx <- evaluate $ force src1
evaluate $ force $ test3 srcx
pure ()
......@@ -604,3 +604,12 @@ test('T15426',
only_ways(['normal'])],
compile_and_run,
['-O2'])
test('T15578',
[stats_num_field('bytes allocated',
[ (wordsize(64), 800041456, 5) ]),
# 2018-09-07 800041456 Improvements from #15578
# initial 42400041456
only_ways(['normal'])],
compile_and_run,
['-O2'])
......@@ -26,7 +26,7 @@ fun1 [InlPrag=NOINLINE] :: Foo -> ()
Str=<S,1*U>,
Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
WorkFree=True, Expandable=True,
Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=True)
Tmpl= \ (x [Occ=Once] :: Foo) ->
case x of { __DEFAULT -> GHC.Tuple.() }}]
fun1 = \ (x :: Foo) -> case x of { __DEFAULT -> GHC.Tuple.() }
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment