Commit 9be18ea4 authored by Simon Peyton Jones's avatar Simon Peyton Jones

Fix a nasty bug in exprIsExpandable

This bug has been lurking for ages: Trac #13155

The important semantic change is to ensure that exprIsExpandable
returns False for primop calls.  Previously exprIsExpandable used
exprIsCheap' which always used primOpIsCheap.

I took the opportunity to combine the code for exprIsCheap' (two
variants: exprIsCheap and exprIsExpandable) with that for
exprIsWorkFree.  Result is simpler, tighter, easier to understand.
And correct (at least wrt this bug)!
parent bf1e1f32
...@@ -498,9 +498,9 @@ getBotArity _ = Nothing ...@@ -498,9 +498,9 @@ getBotArity _ = Nothing
mk_cheap_fn :: DynFlags -> CheapAppFun -> CheapFun mk_cheap_fn :: DynFlags -> CheapAppFun -> CheapFun
mk_cheap_fn dflags cheap_app mk_cheap_fn dflags cheap_app
| not (gopt Opt_DictsCheap dflags) | not (gopt Opt_DictsCheap dflags)
= \e _ -> exprIsCheap' cheap_app e = \e _ -> exprIsOk cheap_app e
| otherwise | otherwise
= \e mb_ty -> exprIsCheap' cheap_app e = \e mb_ty -> exprIsOk cheap_app e
|| case mb_ty of || case mb_ty of
Nothing -> False Nothing -> False
Just ty -> isDictLikeTy ty Just ty -> isDictLikeTy ty
......
...@@ -25,7 +25,7 @@ module CoreUtils ( ...@@ -25,7 +25,7 @@ module CoreUtils (
exprType, coreAltType, coreAltsType, isExprLevPoly, exprType, coreAltType, coreAltsType, isExprLevPoly,
exprIsDupable, exprIsTrivial, getIdFromTrivialExpr, exprIsBottom, exprIsDupable, exprIsTrivial, getIdFromTrivialExpr, exprIsBottom,
getIdFromTrivialExpr_maybe, getIdFromTrivialExpr_maybe,
exprIsCheap, exprIsExpandable, exprIsCheap', CheapAppFun, exprIsCheap, exprIsExpandable, exprIsOk, CheapAppFun,
exprIsHNF, exprOkForSpeculation, exprOkForSideEffects, exprIsWorkFree, exprIsHNF, exprOkForSpeculation, exprOkForSideEffects, exprIsWorkFree,
exprIsBig, exprIsConLike, exprIsBig, exprIsConLike,
rhsIsStatic, isCheapApp, isExpandableApp, rhsIsStatic, isCheapApp, isExpandableApp,
...@@ -78,6 +78,7 @@ import DynFlags ...@@ -78,6 +78,7 @@ import DynFlags
import FastString import FastString
import Maybes import Maybes
import ListSetOps ( minusList ) import ListSetOps ( minusList )
import BasicTypes ( Arity )
import Platform import Platform
import Util import Util
import Pair import Pair
...@@ -1015,29 +1016,7 @@ heap-allocates noFactor's argument. At the moment (May 12) we are just ...@@ -1015,29 +1016,7 @@ heap-allocates noFactor's argument. At the moment (May 12) we are just
going to put up with this, because the previous more aggressive inlining going to put up with this, because the previous more aggressive inlining
(which treated 'noFactor' as work-free) was duplicating primops, which (which treated 'noFactor' as work-free) was duplicating primops, which
in turn was making inner loops of array calculations runs slow (#5623) in turn was making inner loops of array calculations runs slow (#5623)
-}
exprIsWorkFree :: CoreExpr -> Bool
-- See Note [exprIsWorkFree]
exprIsWorkFree e = go 0 e
where -- n is the number of value arguments
go _ (Lit {}) = True
go _ (Type {}) = True
go _ (Coercion {}) = True
go n (Cast e _) = go n e
go n (Case scrut _ _ alts) = foldl (&&) (exprIsWorkFree scrut)
[ go n rhs | (_,_,rhs) <- alts ]
-- See Note [Case expressions are work-free]
go _ (Let {}) = False
go n (Var v) = isCheapApp v n
go n (Tick t e) | tickishCounts t = False
| otherwise = go n e
go n (Lam x e) | isRuntimeVar x = n==0 || go (n-1) e
| otherwise = go n e
go n (App f e) | isRuntimeArg e = exprIsWorkFree e && go (n+1) f
| otherwise = go n f
{-
Note [Case expressions are work-free] Note [Case expressions are work-free]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Are case-expressions work-free? Consider Are case-expressions work-free? Consider
...@@ -1049,6 +1028,8 @@ that increased allocation slightly. It's a fairly small effect, and at ...@@ -1049,6 +1028,8 @@ that increased allocation slightly. It's a fairly small effect, and at
the moment we go for the slightly more aggressive version which treats the moment we go for the slightly more aggressive version which treats
(case x of ....) as work-free if the alternatives are. (case x of ....) as work-free if the alternatives are.
Moreover it improves arities of overloaded functions where
there is only dictionary selection (no construction) involved
Note [exprIsCheap] See also Note [Interaction of exprIsCheap and lone variables] Note [exprIsCheap] See also Note [Interaction of exprIsCheap and lone variables]
~~~~~~~~~~~~~~~~~~ in CoreUnfold.hs ~~~~~~~~~~~~~~~~~~ in CoreUnfold.hs
...@@ -1086,137 +1067,166 @@ Note that exprIsHNF does not imply exprIsCheap. Eg ...@@ -1086,137 +1067,166 @@ Note that exprIsHNF does not imply exprIsCheap. Eg
let x = fac 20 in Just x let x = fac 20 in Just x
This responds True to exprIsHNF (you can discard a seq), but This responds True to exprIsHNF (you can discard a seq), but
False to exprIsCheap. False to exprIsCheap.
Note [exprIsExpandable]
~~~~~~~~~~~~~~~~~~~~~~~
An expression is "expandable" if we are willing to dupicate it, if doing
so might make a RULE or case-of-constructor fire. Mainly this means
data-constructor applications, but it's a bit more generous than exprIsCheap
because it is true of "CONLIKE" Ids: see Note [CONLIKE pragma] in BasicTypes.
It is used to set the uf_expandable field of an Unfolding, and that
in turn is used
* In RULE matching
* In exprIsConApp_maybe, exprIsLiteral_maybe, exprIsLambda_maybe
But take care: exprIsExpandable should /not/ be true of primops. I
found this in test T5623a:
let q = /\a. Ptr a (a +# b)
in case q @ Float of Ptr v -> ...q...
q's inlining should not be expandable, else exprIsConApp_maybe will
say that (q @ Float) expands to (Ptr a (a +# b)), and that will
duplicate the (a +# b) primop, which we should not do lightly.
(It's quite hard to trigger this bug, but T13155 does so for GHC 8.0.)
Note [Arguments in exprIsOk]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
What predicate should we apply to the argument of an application? We
used to say "exprIsTrivial arg" due to concerns about duplicating
nested constructor applications, but see #4978. The principle here is
that
let x = a +# b in c *# x
should behave equivalently to
c *# (a +# b)
Since lets with cheap RHSs are accepted, so should paps with cheap arguments
-} -}
--------------------
exprIsCheap :: CoreExpr -> Bool exprIsCheap :: CoreExpr -> Bool
exprIsCheap = exprIsCheap' isCheapApp exprIsCheap = exprIsOk isCheapApp
exprIsExpandable :: CoreExpr -> Bool exprIsExpandable :: CoreExpr -> Bool -- See Note [exprIsExpandable]
exprIsExpandable = exprIsCheap' isExpandableApp -- See Note [CONLIKE pragma] in BasicTypes exprIsExpandable = exprIsOk isExpandableApp
exprIsCheap' :: CheapAppFun -> CoreExpr -> Bool exprIsWorkFree :: CoreExpr -> Bool -- See Note [exprIsWorkFree]
exprIsCheap' _ (Lit _) = True exprIsWorkFree = exprIsOk isWorkFreeApp
exprIsCheap' _ (Type _) = True
exprIsCheap' _ (Coercion _) = True --------------------
exprIsCheap' _ (Var _) = True exprIsOk :: CheapAppFun -> CoreExpr -> Bool
exprIsCheap' good_app (Cast e _) = exprIsCheap' good_app e exprIsOk ok_app e
exprIsCheap' good_app (Lam x e) = isRuntimeVar x = ok e
|| exprIsCheap' good_app e
exprIsCheap' good_app (Case e _ _ alts) = exprIsCheap' good_app e &&
and [exprIsCheap' good_app rhs | (_,_,rhs) <- alts]
-- Experimentally, treat (case x of ...) as cheap
-- (and case __coerce x etc.)
-- This improves arities of overloaded functions where
-- there is only dictionary selection (no construction) involved
exprIsCheap' good_app (Tick t e)
| tickishCounts t = False
| otherwise = exprIsCheap' good_app e
-- never duplicate counting ticks. If we get this wrong, then
-- HPC's entry counts will be off (check test in
-- libraries/hpc/tests/raytrace)
exprIsCheap' good_app (Let (NonRec _ b) e)
= exprIsCheap' good_app b && exprIsCheap' good_app e
exprIsCheap' good_app (Let (Rec prs) e)
= all (exprIsCheap' good_app . snd) prs && exprIsCheap' good_app e
exprIsCheap' good_app other_expr -- Applications and variables
= go other_expr []
where where
-- Accumulate value arguments, then decide ok e = go 0 e
go (Cast e _) val_args = go e val_args
go (App f a) val_args | isRuntimeArg a = go f (a:val_args)
| otherwise = go f val_args
go (Var _) [] = True
-- Just a type application of a variable
-- (f t1 t2 t3) counts as WHNF
-- This case is probably handeld by the good_app case
-- below, which should have a case for n=0, but putting
-- it here too is belt and braces; and it's such a common
-- case that checking for null directly seems like a
-- good plan
go (Var f) args
| good_app f (length args) -- Typically holds of data constructor applications
= go_pap args -- E.g. good_app = isCheapApp below
| otherwise -- n is the number of value arguments
= case idDetails f of go n (Var v) = ok_app v n
RecSelId {} -> go_sel args go _ (Lit {}) = True
ClassOpId {} -> go_sel args go _ (Type {}) = True
PrimOpId op -> go_primop op args go _ (Coercion {}) = True
_ | isBottomingId f -> True go n (Cast e _) = go n e
| otherwise -> False go n (Case scrut _ _ alts) = foldl (&&) (ok scrut)
-- Application of a function which [ go n rhs | (_,_,rhs) <- alts ]
-- always gives bottom; we treat this as cheap go n (Tick t e) | tickishCounts t = False
-- because it certainly doesn't need to be shared! | otherwise = go n e
go n (Lam x e) | isRuntimeVar x = n==0 || go (n-1) e
go (Tick t e) args | otherwise = go n e
| not (tickishCounts t) -- don't duplicate counting ticks, see above go n (App f e) | isRuntimeArg e = go (n+1) f && ok e
= go e args | otherwise = go n f
go _ (Let {}) = False
go _ _ = False
-- Case: see Note [Case expressions are work-free]
-------------- -- App: see Note [Arugments in exprIsOk]
go_pap args = all (exprIsCheap' good_app) args -- Let: the old exprIsCheap worked through lets
-- Used to be "all exprIsTrivial args" due to concerns about
-- duplicating nested constructor applications, but see #4978.
-- The principle here is that
-- let x = a +# b in c *# x
-- should behave equivalently to
-- c *# (a +# b)
-- Since lets with cheap RHSs are accepted,
-- so should paps with cheap arguments
--------------
go_primop op args = primOpIsCheap op && all (exprIsCheap' good_app) args
-- In principle we should worry about primops
-- that return a type variable, since the result
-- might be applied to something, but I'm not going
-- to bother to check the number of args
--------------
go_sel [arg] = exprIsCheap' good_app arg -- I'm experimenting with making record selection
go_sel _ = False -- look cheap, so we will substitute it inside a
-- lambda. Particularly for dictionary field selection.
-- BUT: Take care with (sel d x)! The (sel d) might be cheap, but
-- there's no guarantee that (sel d x) will be too. Hence (n_val_args == 1)
------------------------------------- -------------------------------------
type CheapAppFun = Id -> Int -> Bool type CheapAppFun = Id -> Arity -> Bool
-- Is an application of this function to n *value* args -- Is an application of this function to n *value* args
-- always cheap, assuming the arguments are cheap? -- always cheap, assuming the arguments are cheap?
-- Mainly true of partial applications, data constructors, -- True mainly of data constructors, partial applications;
-- and of course true if the number of args is zero -- but with minor variations:
-- isWorkFreeApp
-- isCheapApp
-- isExpandableApp
-- NB: isCheapApp and isExpandableApp are called from outside
-- this module, so don't be tempted to move the notRedex
-- stuff into the call site in exprIsOk, and remove it
-- from the CheapAppFun implementations
notRedex :: CheapAppFun
notRedex fn n_val_args
= n_val_args == 0 -- No value args
|| n_val_args < idArity fn -- Partial application
|| isBottomingId fn -- OK to duplicate calls to bottom;
-- it certainly doesn't need to be shared!
isWorkFreeApp :: CheapAppFun
isWorkFreeApp fn n_val_args
| notRedex fn n_val_args
= True
| otherwise
= case idDetails fn of
DataConWorkId {} -> True
_ -> False
isCheapApp :: CheapAppFun isCheapApp :: CheapAppFun
isCheapApp fn n_val_args isCheapApp fn n_val_args
= isDataConWorkId fn | notRedex fn n_val_args
|| n_val_args == 0 = True
|| n_val_args < idArity fn | otherwise
= case idDetails fn of
DataConWorkId {} -> True
RecSelId {} -> n_val_args == 1 -- See Note [Record selection]
ClassOpId {} -> n_val_args == 1
PrimOpId op -> primOpIsCheap op
_ -> False
-- In principle we should worry about primops
-- that return a type variable, since the result
-- might be applied to something, but I'm not going
-- to bother to check the number of args
isExpandableApp :: CheapAppFun isExpandableApp :: CheapAppFun
isExpandableApp fn n_val_args isExpandableApp fn n_val_args
= isConLikeId fn | notRedex fn n_val_args
|| n_val_args < idArity fn = True
|| go n_val_args (idType fn) | isConLikeId fn
= True
| otherwise
= case idDetails fn of
DataConWorkId {} -> True
RecSelId {} -> n_val_args == 1 -- See Note [Record selection]
ClassOpId {} -> n_val_args == 1
PrimOpId {} -> False
_ -> all_pred_args n_val_args (idType fn)
where where
-- See if all the arguments are PredTys (implicit params or classes) -- See if all the arguments are PredTys (implicit params or classes)
-- If so we'll regard it as expandable; see Note [Expandable overloadings] -- If so we'll regard it as expandable; see Note [Expandable overloadings]
-- This incidentally picks up the (n_val_args = 0) case all_pred_args n_val_args ty
go 0 _ = True | n_val_args == 0
go n_val_args ty = True
| Just (bndr, ty) <- splitPiTy_maybe ty | Just (bndr, ty) <- splitPiTy_maybe ty
= caseBinder bndr = caseBinder bndr
(\_tv -> go n_val_args ty) (\_tv -> all_pred_args n_val_args ty)
(\bndr_ty -> isPredTy bndr_ty && go (n_val_args-1) ty) (\bndr_ty -> isPredTy bndr_ty && all_pred_args (n_val_args-1) ty)
| otherwise | otherwise
= False = False
{- {- Note [Record selection]
~~~~~~~~~~~~~~~~~~~~~~~~~~
I'm experimenting with making record selection
look cheap, so we will substitute it inside a
lambda. Particularly for dictionary field selection.
BUT: Take care with (sel d x)! The (sel d) might be cheap, but
there's no guarantee that (sel d x) will be too. Hence (n_val_args == 1)
Note [Expandable overloadings] Note [Expandable overloadings]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Suppose the user wrote this Suppose the user wrote this
......
...@@ -113,6 +113,10 @@ T5327: ...@@ -113,6 +113,10 @@ T5327:
T5623: T5623:
$(RM) -f T5623.hi T5623.o $(RM) -f T5623.hi T5623.o
'$(TEST_HC)' $(TEST_HC_OPTS) -c T5623.hs -O -ddump-prep | grep -c "plusAddr#" '$(TEST_HC)' $(TEST_HC_OPTS) -c T5623.hs -O -ddump-prep | grep -c "plusAddr#"
T13155:
$(RM) -f T13155.hi T13155.o
'$(TEST_HC)' $(TEST_HC_OPTS) -c T13155.hs -O -ddump-prep | grep -c "plusAddr#"
# There should be only one plusAddr#!
.PHONY: T4138 .PHONY: T4138
T4138: T4138:
......
{-# LANGUAGE MagicHash, UnboxedTuples #-}
{-# OPTIONS_GHC -funfolding-use-threshold=10 #-}
module T13155 where
import GHC.Ptr
import GHC.Prim
import GHC.Exts
foo :: Ptr Float -> State# RealWorld -> (# State# RealWorld, Float #)
foo p s = case q :: Ptr Float of { Ptr a1 ->
case readFloatOffAddr# a1 0# s of { (# s1, f1 #) ->
case q :: Ptr Float of { Ptr a2 ->
case readFloatOffAddr# a2 1# s of { (# s2, f2 #) ->
(# s2, F# (plusFloat# f1 f2) #) }}}}
where
q :: Ptr a -- Polymorphic
q = p `plusPtr` 4
...@@ -139,6 +139,7 @@ test('simpl021', ...@@ -139,6 +139,7 @@ test('simpl021',
test('T5327', normal, run_command, ['$MAKE -s --no-print-directory T5327']) test('T5327', normal, run_command, ['$MAKE -s --no-print-directory T5327'])
test('T5615', normal, run_command, ['$MAKE -s --no-print-directory T5615']) test('T5615', normal, run_command, ['$MAKE -s --no-print-directory T5615'])
test('T5623', normal, run_command, ['$MAKE -s --no-print-directory T5623']) test('T5623', normal, run_command, ['$MAKE -s --no-print-directory T5623'])
test('T13155', normal, run_command, ['$MAKE -s --no-print-directory T13155'])
test('T5658b', test('T5658b',
normal, normal,
run_command, run_command,
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment