Commit 6d49d5be authored by Simon Peyton Jones, committed by Marge Bot

Implement cast worker/wrapper properly

The cast worker/wrapper transformation transforms
   x = e |> co
into
   y = e
   x = y |> co

This is done by the simplifier, but we were being
careless about transferring IdInfo from x to y,
and about what to do if x is a NOINLINE function.
This resulted in a series of bugs:
     #17673, #18093, #18078.
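
For concreteness, here is a minimal sketch of the shape that triggers
the transform (the Age/toAge names are invented for illustration, and
the Core shown is only approximate):

   newtype Age = MkAge Int

   {-# NOINLINE toAge #-}
   toAge :: Int -> Age
   toAge n = MkAge (n + 1)

   -- In Core the RHS is roughly    toAge = (\n -> n + 1) |> co
   -- where co :: (Int -> Int) ~ (Int -> Age).  The transform gives
   --    toAge_work = \n -> n + 1
   --    toAge      = toAge_work |> co
   -- and the IdInfo on toAge (arity, strictness, the NOINLINE pragma)
   -- must be transferred to the right binder.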

This patch fixes all that:

* Main change is in GHC.Core.Opt.Simplify, and
  the new prepareBinding function, which does this
  cast worker/wrapper transform.
  See Note [Cast worker/wrappers].

* There is quite a bit of refactoring around
  prepareRhs, makeTrivial etc.  It's nicer now.

* Some wrappers from strictness and cast w/w, notably those for
  a function with a NOINLINE, should inline very late. There
  wasn't really a mechanism for that, which was an existing bug
  really; so I invented a new finalPhase = Phase (-1).  It's used
  for all simplifier runs after the user-visible phases 2, 1, 0 have
  run.  (No new runs of the simplifier are introduced thereby.)

  See new Note [Compiler phases] in GHC.Types.Basic;
  the main changes are in GHC.Core.Opt.Driver
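
  To recap the user-visible numbering that the final phase sits after,
  here is a hedged sketch ('leaf' and 'wrapper' are invented names; the
  pragmas are ordinary user syntax):

    leaf :: Int -> Int
    leaf x = x + 1
    {-# NOINLINE leaf #-}         -- the user asks for no inlining at all

    wrapper :: Int -> Int
    wrapper = leaf
    {-# NOINLINE [0] wrapper #-}  -- inactive in phases 2 and 1,
                                  -- active from phase 0 onwards
    -- A compiler-generated wrapper for 'leaf' can now be given an
    -- activation that fires only in the final phase, i.e. after the
    -- user-visible phases 2, 1, 0 have all run.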

* Doing this made me trip over two places where the AnonArgFlag on a
  FunTy was being lost, so we could end up with (Num a -> ty)
  rather than (Num a => ty)
    - In coercionLKind/coercionRKind
    - In contHoleType in the Simplifier

  I fixed the former by defining mkFunctionType and using it in
  coercionLKind/RKind.
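
  Roughly, mkFunctionType picks the arrow from the argument type (a
  sketch, modelled on the mkLamType code in this patch; see the
  GHC.Core.Type hunk below):

    mkFunctionType :: Type -> Type -> Type
    mkFunctionType arg_ty res_ty
      | isPredTy arg_ty = mkInvisFunTy arg_ty res_ty  -- builds (arg_ty => res_ty)
      | otherwise       = mkVisFunTy   arg_ty res_ty  -- builds (arg_ty -> res_ty)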

  I could have done the same for the latter, but the information
  is almost to hand.  So I fixed the latter by
    - adding sc_hole_ty to ApplyToVal (like ApplyToTy),
    - adding as_hole_ty to ValArg (like TyArg)
    - adding sc_fun_ty to StrictArg
  Turned out I could then remove ai_type from ArgInfo.  This is
  just moving the deck chairs around, but it worked out nicely.

  See the new Note [AnonArgFlag] in GHC.Types.Var
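
  Concretely, the ApplyToVal case of contHoleType goes from recomputing
  the hole type to returning the recorded one, AnonArgFlag intact
  (excerpted from the Simplify.Utils change below):

    -- before: rebuild the type from the argument expression
    contHoleType (ApplyToVal { sc_arg = e, sc_env = se, sc_dup = dup, sc_cont = k })
      = mkVisFunTy (perhapsSubstTy dup se (exprType e)) (contHoleType k)

    -- after: just return the recorded function type
    contHoleType (ApplyToVal { sc_hole_ty = ty }) = ty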

* When looking at the 'arity decrease' thing (#18093) I discovered
  that stable unfoldings had a much lower arity than the actual
  optimised function.  That's what led to the arity-decrease
  message.  Simple solution: eta-expand.

  It's described in Note [Eta-expand stable unfoldings]
  in GHC.Core.Opt.Simplify
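
  In pictures (a sketch; f and g are placeholders):

    -- stable unfolding as written:   f = \x -> g x       -- arity 1
    -- optimised function:            f = \x y -> g x y   -- arity 2
    -- Eta-expanding the stable unfolding to the function's arity removes
    -- the apparent arity decrease when the unfolding replaces the body.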

* I also discovered that unsafeCoerce wasn't being inlined if
  the context was boring.  So (\x. f (unsafeCoerce x)) would
  create a thunk -- yikes!  I fixed that by making inlineBoringOK
  a bit cleverer: see Note [Inline unsafeCoerce] in GHC.Core.Unfold.
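
  A minimal example of the boring-context shape (invented names; it
  compiles but is only illustrative):

    import Unsafe.Coerce (unsafeCoerce)

    newtype Key = MkKey Int

    withKey :: (Key -> r) -> Int -> r
    withKey f x = f (unsafeCoerce x)
    -- If unsafeCoerce does not inline here, the argument (unsafeCoerce x)
    -- is not trivial, so it is bound to a thunk before the call.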

  I also found that unsafeCoerceName was unused, so I removed it.

I made a test case for #18078, and a very similar one for #17673.

The net effect of all this on nofib is very modest, but positive:

--------------------------------------------------------------------------------
        Program           Size    Allocs   Runtime   Elapsed  TotalMem
--------------------------------------------------------------------------------
           anna          -0.4%     -0.1%     -3.1%     -3.1%      0.0%
 fannkuch-redux          -0.4%     -0.3%     -0.1%     -0.1%      0.0%
       maillist          -0.4%     -0.1%     -7.8%     -1.0%    -14.3%
      primetest          -0.4%    -15.6%     -7.1%     -6.6%      0.0%
--------------------------------------------------------------------------------
            Min          -0.9%    -15.6%    -13.3%    -14.2%    -14.3%
            Max          -0.3%      0.0%    +12.1%    +12.4%      0.0%
 Geometric Mean          -0.4%     -0.2%     -2.3%     -2.2%     -0.1%

All following metric decreases are compile-time allocation decreases
between -1% and -3%:

Metric Decrease:
  T5631
  T13701
  T14697
  T15164
parent 9454511b
......@@ -472,7 +472,6 @@ basicKnownKeyNames
, unsafeEqualityTyConName
, unsafeReflDataConName
, unsafeCoercePrimName
, unsafeCoerceName
]
genericTyConNames :: [Name]
......@@ -1333,12 +1332,11 @@ typeErrorShowTypeDataConName =
-- Unsafe coercion proofs
unsafeEqualityProofName, unsafeEqualityTyConName, unsafeCoercePrimName,
unsafeCoerceName, unsafeReflDataConName :: Name
unsafeReflDataConName :: Name
unsafeEqualityProofName = varQual uNSAFE_COERCE (fsLit "unsafeEqualityProof") unsafeEqualityProofIdKey
unsafeEqualityTyConName = tcQual uNSAFE_COERCE (fsLit "UnsafeEquality") unsafeEqualityTyConKey
unsafeReflDataConName = dcQual uNSAFE_COERCE (fsLit "UnsafeRefl") unsafeReflDataConKey
unsafeCoercePrimName = varQual uNSAFE_COERCE (fsLit "unsafeCoerce#") unsafeCoercePrimIdKey
unsafeCoerceName = varQual uNSAFE_COERCE (fsLit "unsafeCoerce") unsafeCoerceIdKey
-- Dynamic
toDynName :: Name
......@@ -2417,10 +2415,9 @@ naturalSDataConKey = mkPreludeMiscIdUnique 568
wordToNaturalIdKey = mkPreludeMiscIdUnique 569
-- Unsafe coercion proofs
unsafeEqualityProofIdKey, unsafeCoercePrimIdKey, unsafeCoerceIdKey :: Unique
unsafeEqualityProofIdKey, unsafeCoercePrimIdKey :: Unique
unsafeEqualityProofIdKey = mkPreludeMiscIdUnique 570
unsafeCoercePrimIdKey = mkPreludeMiscIdUnique 571
unsafeCoerceIdKey = mkPreludeMiscIdUnique 572
{-
************************************************************************
......
......@@ -2188,7 +2188,7 @@ coercionLKind co
go (TyConAppCo _ tc cos) = mkTyConApp tc (map go cos)
go (AppCo co1 co2) = mkAppTy (go co1) (go co2)
go (ForAllCo tv1 _ co1) = mkTyCoInvForAllTy tv1 (go co1)
go (FunCo _ co1 co2) = mkVisFunTy (go co1) (go co2)
go (FunCo _ co1 co2) = mkFunctionType (go co1) (go co2)
  • This just came across my radar (for boring reasons), but why do this? Coercion kinds should be used only when doing core lint or core transformations, when the distinction between -> and => is immaterial. Yet this new mkFunctionType does more work than the old mkVisFunTy. Maybe it's distracting to see Eq a -> a -> Bool in output. But if that's the problem, then I would consider a pre-pass en route to the pretty-printer that twiddles arrow shapes when necessary. (This can be done before converting to IfaceType so that we have kind information easily available. Maybe it can even be done right in toIfaceType.)

  • See Note [AnonArgFlag].

    The specialiser uses the -> vs => distinction.

    I think we can do two things

    • Maintain AnonArgFlag consistently
    • Not have it at all

    But having it sometimes right and sometimes wrong seems... well.. wrong.

go (CoVarCo cv) = coVarLType cv
go (HoleCo h) = coVarLType (coHoleCoVar h)
go (UnivCo _ _ ty1 _) = ty1
......@@ -2245,7 +2245,7 @@ coercionRKind co
go (AppCo co1 co2) = mkAppTy (go co1) (go co2)
go (CoVarCo cv) = coVarRType cv
go (HoleCo h) = coVarRType (coHoleCoVar h)
go (FunCo _ co1 co2) = mkVisFunTy (go co1) (go co2)
go (FunCo _ co1 co2) = mkFunctionType (go co1) (go co2)
go (UnivCo _ _ _ ty2) = ty2
go (SymCo co) = coercionLKind co
go (TransCo _ co2) = go co2
......
......@@ -404,7 +404,7 @@ delayInlining top_lvl bndr
-- These rules are probably auto-generated specialisations,
-- since Ids with manual rules usually have manually-inserted
-- delayed inlining anyway
= bndr `setInlineActivation` activeAfterInitial
= bndr `setInlineActivation` activateAfterInitial
| otherwise
= bndr
......
......@@ -331,8 +331,8 @@ cprAnalBind top_lvl env id rhs
cprExpandUnfolding_maybe :: Id -> Maybe CoreExpr
cprExpandUnfolding_maybe id = do
guard (idArity id == 0)
-- There are only phase 0 Simplifier runs after CPR analysis
guard (isActiveIn 0 (idInlineActivation id))
-- There are only FinalPhase Simplifier runs after CPR analysis
guard (activeInFinalPhase (idInlineActivation id))
expandUnfolding_maybe (idUnfolding id)
{- Note [Arity trimming for CPR signatures]
......
......@@ -37,7 +37,7 @@ import GHC.Core.Opt.FloatOut ( floatOutwards )
import GHC.Core.FamInstEnv
import GHC.Types.Id
import GHC.Utils.Error ( withTiming, withTimingD, DumpFormat (..) )
import GHC.Types.Basic ( CompilerPhase(..), isDefaultInlinePragma, defaultInlinePragma )
import GHC.Types.Basic
import GHC.Types.Var.Set
import GHC.Types.Var.Env
import GHC.Core.Opt.LiberateCase ( liberateCase )
......@@ -141,8 +141,10 @@ getCoreToDo dflags
maybe_rule_check phase = runMaybe rule_check (CoreDoRuleCheck phase)
maybe_strictness_before phase
= runWhen (phase `elem` strictnessBefore dflags) CoreDoDemand
maybe_strictness_before (Phase phase)
| phase `elem` strictnessBefore dflags = CoreDoDemand
maybe_strictness_before _
= CoreDoNothing
base_mode = SimplMode { sm_phase = panic "base_mode"
, sm_names = []
......@@ -152,20 +154,20 @@ getCoreToDo dflags
, sm_inline = True
, sm_case_case = True }
simpl_phase phase names iter
simpl_phase phase name iter
= CoreDoPasses
$ [ maybe_strictness_before phase
, CoreDoSimplify iter
(base_mode { sm_phase = Phase phase
, sm_names = names })
(base_mode { sm_phase = phase
, sm_names = [name] })
, maybe_rule_check (Phase phase) ]
, maybe_rule_check phase ]
simpl_phases = CoreDoPasses [ simpl_phase phase ["main"] max_iter
| phase <- [phases, phases-1 .. 1] ]
-- Run GHC's internal simplification phase, after all rules have run.
-- See Note [Compiler phases] in GHC.Types.Basic
simplify name = simpl_phase FinalPhase name max_iter
-- initial simplify: mk specialiser happy: minimum effort please
-- initial simplify: mk specialiser happy: minimum effort please
simpl_gently = CoreDoSimplify max_iter
(base_mode { sm_phase = InitialPhase
, sm_names = ["Gentle"]
......@@ -182,7 +184,7 @@ getCoreToDo dflags
demand_analyser = (CoreDoPasses (
dmd_cpr_ww ++
[simpl_phase 0 ["post-worker-wrapper"] max_iter]
[simplify "post-worker-wrapper"]
))
-- Static forms are moved to the top level with the FloatOut pass.
......@@ -203,7 +205,7 @@ getCoreToDo dflags
if opt_level == 0 then
[ static_ptrs_float_outwards,
CoreDoSimplify max_iter
(base_mode { sm_phase = Phase 0
(base_mode { sm_phase = FinalPhase
, sm_names = ["Non-opt simplification"] })
]
......@@ -251,8 +253,10 @@ getCoreToDo dflags
-- GHC.Iface.Tidy.StaticPtrTable.
static_ptrs_float_outwards,
simpl_phases,
-- Run the simplifier phases 2,1,0 to allow rewrite rules to fire
CoreDoPasses [ simpl_phase (Phase phase) "main" max_iter
| phase <- [phases, phases-1 .. 1] ],
simpl_phase (Phase 0) "main" (max max_iter 3),
-- Phase 0: allow all Ids to be inlined now
-- This gets foldr inlined before strictness analysis
......@@ -263,7 +267,6 @@ getCoreToDo dflags
-- ==> let k = BIG in letrec go = \xs -> ...(k x).... in go xs
-- ==> let k = BIG in letrec go = \xs -> ...(BIG x).... in go xs
-- Don't stop now!
simpl_phase 0 ["main"] (max max_iter 3),
runWhen do_float_in CoreDoFloatInwards,
-- Run float-inwards immediately before the strictness analyser
......@@ -274,9 +277,10 @@ getCoreToDo dflags
runWhen call_arity $ CoreDoPasses
[ CoreDoCallArity
, simpl_phase 0 ["post-call-arity"] max_iter
, simplify "post-call-arity"
],
-- Strictness analysis
runWhen strictness demand_analyser,
runWhen exitification CoreDoExitify,
......@@ -302,24 +306,24 @@ getCoreToDo dflags
runWhen do_float_in CoreDoFloatInwards,
maybe_rule_check (Phase 0),
maybe_rule_check FinalPhase,
-- Case-liberation for -O2. This should be after
-- strictness analysis and the simplification which follows it.
runWhen liberate_case (CoreDoPasses [
CoreLiberateCase,
simpl_phase 0 ["post-liberate-case"] max_iter
simplify "post-liberate-case"
]), -- Run the simplifier after LiberateCase to vastly
-- reduce the possibility of shadowing
-- Reason: see Note [Shadowing] in GHC.Core.Opt.SpecConstr
runWhen spec_constr CoreDoSpecConstr,
maybe_rule_check (Phase 0),
maybe_rule_check FinalPhase,
runWhen late_specialise
(CoreDoPasses [ CoreDoSpecialising
, simpl_phase 0 ["post-late-spec"] max_iter]),
, simplify "post-late-spec"]),
-- LiberateCase can yield new CSE opportunities because it peels
-- off one layer of a recursive function (concretely, I saw this
......@@ -328,11 +332,10 @@ getCoreToDo dflags
runWhen ((liberate_case || spec_constr) && cse) CoreCSE,
-- Final clean-up simplification:
simpl_phase 0 ["final"] max_iter,
simplify "final",
runWhen late_dmd_anal $ CoreDoPasses (
dmd_cpr_ww ++
[simpl_phase 0 ["post-late-ww"] max_iter]
dmd_cpr_ww ++ [simplify "post-late-ww"]
),
-- Final run of the demand_analyser, ensures that one-shot thunks are
......@@ -342,7 +345,7 @@ getCoreToDo dflags
-- can become /exponentially/ more expensive. See #11731, #12996.
runWhen (strictness || late_dmd_anal) CoreDoDemand,
maybe_rule_check (Phase 0)
maybe_rule_check FinalPhase
]
-- Remove 'CoreDoNothing' and flatten 'CoreDoPasses' for clarity.
......
......@@ -118,7 +118,9 @@ data SimplCont
SimplCont
| ApplyToVal -- (ApplyToVal arg K)[e] = K[ e arg ]
{ sc_dup :: DupFlag -- See Note [DupFlag invariants]
{ sc_dup :: DupFlag -- See Note [DupFlag invariants]
, sc_hole_ty :: OutType -- Type of the function, presumably (forall a. blah)
-- See Note [The hole type in ApplyToTy/Val]
, sc_arg :: InExpr -- The argument,
, sc_env :: StaticEnv -- see Note [StaticEnv invariant]
, sc_cont :: SimplCont }
......@@ -126,7 +128,7 @@ data SimplCont
| ApplyToTy -- (ApplyToTy ty K)[e] = K[ e ty ]
{ sc_arg_ty :: OutType -- Argument type
, sc_hole_ty :: OutType -- Type of the function, presumably (forall a. blah)
-- See Note [The hole type in ApplyToTy]
-- See Note [The hole type in ApplyToTy/Val]
, sc_cont :: SimplCont }
| Select -- (Select alts K)[e] = K[ case e of alts ]
......@@ -151,6 +153,9 @@ data SimplCont
, sc_fun :: ArgInfo -- Specifies f, e1..en, Whether f has rules, etc
-- plus strictness flags for *further* args
, sc_cci :: CallCtxt -- Whether *this* argument position is interesting
, sc_fun_ty :: OutType -- Type of the function (f e1 .. en),
-- presumably (arg_ty -> res_ty)
-- where res_ty is expected by sc_cont
, sc_cont :: SimplCont }
| TickIt -- (TickIt t K)[e] = K[ tick t e ]
......@@ -254,8 +259,6 @@ data ArgInfo
ai_fun :: OutId, -- The function
ai_args :: [ArgSpec], -- ...applied to these args (which are in *reverse* order)
ai_type :: OutType, -- Type of (f a1 ... an)
ai_rules :: FunRules, -- Rules for this function
ai_encl :: Bool, -- Flag saying whether this function
......@@ -271,37 +274,36 @@ data ArgInfo
}
data ArgSpec
= ValArg OutExpr -- Apply to this (coercion or value); c.f. ApplyToVal
= ValArg { as_arg :: OutExpr -- Apply to this (coercion or value); c.f. ApplyToVal
, as_hole_ty :: OutType } -- Type of the function (presumably t1 -> t2)
| TyArg { as_arg_ty :: OutType -- Apply to this type; c.f. ApplyToTy
, as_hole_ty :: OutType } -- Type of the function (presumably forall a. blah)
| CastBy OutCoercion -- Cast by this; c.f. CastIt
instance Outputable ArgSpec where
ppr (ValArg e) = text "ValArg" <+> ppr e
ppr (ValArg { as_arg = arg }) = text "ValArg" <+> ppr arg
ppr (TyArg { as_arg_ty = ty }) = text "TyArg" <+> ppr ty
ppr (CastBy c) = text "CastBy" <+> ppr c
addValArgTo :: ArgInfo -> OutExpr -> ArgInfo
addValArgTo ai arg = ai { ai_args = ValArg arg : ai_args ai
, ai_type = applyTypeToArg (ai_type ai) arg
, ai_rules = decRules (ai_rules ai) }
addValArgTo :: ArgInfo -> OutExpr -> OutType -> ArgInfo
addValArgTo ai arg hole_ty = ai { ai_args = arg_spec : ai_args ai
, ai_rules = decRules (ai_rules ai) }
where
arg_spec = ValArg { as_arg = arg, as_hole_ty = hole_ty }
addTyArgTo :: ArgInfo -> OutType -> ArgInfo
addTyArgTo ai arg_ty = ai { ai_args = arg_spec : ai_args ai
, ai_type = piResultTy poly_fun_ty arg_ty
, ai_rules = decRules (ai_rules ai) }
addTyArgTo :: ArgInfo -> OutType -> OutType -> ArgInfo
addTyArgTo ai arg_ty hole_ty = ai { ai_args = arg_spec : ai_args ai
, ai_rules = decRules (ai_rules ai) }
where
poly_fun_ty = ai_type ai
arg_spec = TyArg { as_arg_ty = arg_ty, as_hole_ty = poly_fun_ty }
arg_spec = TyArg { as_arg_ty = arg_ty, as_hole_ty = hole_ty }
addCastTo :: ArgInfo -> OutCoercion -> ArgInfo
addCastTo ai co = ai { ai_args = CastBy co : ai_args ai
, ai_type = coercionRKind co }
addCastTo ai co = ai { ai_args = CastBy co : ai_args ai }
argInfoAppArgs :: [ArgSpec] -> [OutExpr]
argInfoAppArgs [] = []
argInfoAppArgs (CastBy {} : _) = [] -- Stop at a cast
argInfoAppArgs (ValArg e : as) = e : argInfoAppArgs as
argInfoAppArgs (ValArg { as_arg = arg } : as) = arg : argInfoAppArgs as
argInfoAppArgs (TyArg { as_arg_ty = ty } : as) = Type ty : argInfoAppArgs as
pushSimplifiedArgs :: SimplEnv -> [ArgSpec] -> SimplCont -> SimplCont
......@@ -310,7 +312,9 @@ pushSimplifiedArgs env (arg : args) k
= case arg of
TyArg { as_arg_ty = arg_ty, as_hole_ty = hole_ty }
-> ApplyToTy { sc_arg_ty = arg_ty, sc_hole_ty = hole_ty, sc_cont = rest }
ValArg e -> ApplyToVal { sc_arg = e, sc_env = env, sc_dup = Simplified, sc_cont = rest }
ValArg { as_arg = arg, as_hole_ty = hole_ty }
-> ApplyToVal { sc_arg = arg, sc_env = env, sc_dup = Simplified
, sc_hole_ty = hole_ty, sc_cont = rest }
CastBy c -> CastIt c rest
where
rest = pushSimplifiedArgs env args k
......@@ -323,7 +327,7 @@ argInfoExpr fun rev_args
= go rev_args
where
go [] = Var fun
go (ValArg a : as) = go as `App` a
go (ValArg { as_arg = arg } : as) = go as `App` arg
go (TyArg { as_arg_ty = ty } : as) = go as `App` Type ty
go (CastBy co : as) = mkCast (go as) co
......@@ -409,11 +413,9 @@ contHoleType (TickIt _ k) = contHoleType k
contHoleType (CastIt co _) = coercionLKind co
contHoleType (StrictBind { sc_bndr = b, sc_dup = dup, sc_env = se })
= perhapsSubstTy dup se (idType b)
contHoleType (StrictArg { sc_fun = ai }) = funArgTy (ai_type ai)
contHoleType (ApplyToTy { sc_hole_ty = ty }) = ty -- See Note [The hole type in ApplyToTy]
contHoleType (ApplyToVal { sc_arg = e, sc_env = se, sc_dup = dup, sc_cont = k })
= mkVisFunTy (perhapsSubstTy dup se (exprType e))
(contHoleType k)
contHoleType (StrictArg { sc_fun_ty = ty }) = funArgTy ty
contHoleType (ApplyToTy { sc_hole_ty = ty }) = ty -- See Note [The hole type in ApplyToTy/Val]
contHoleType (ApplyToVal { sc_hole_ty = ty }) = ty -- See Note [The hole type in ApplyToTy/Val]
contHoleType (Select { sc_dup = d, sc_bndr = b, sc_env = se })
= perhapsSubstTy d se (idType b)
......@@ -458,13 +460,13 @@ mkArgInfo :: SimplEnv
mkArgInfo env fun rules n_val_args call_cont
| n_val_args < idArity fun -- Note [Unsaturated functions]
= ArgInfo { ai_fun = fun, ai_args = [], ai_type = fun_ty
= ArgInfo { ai_fun = fun, ai_args = []
, ai_rules = fun_rules
, ai_encl = False
, ai_strs = vanilla_stricts
, ai_discs = vanilla_discounts }
| otherwise
= ArgInfo { ai_fun = fun, ai_args = [], ai_type = fun_ty
= ArgInfo { ai_fun = fun, ai_args = []
, ai_rules = fun_rules
, ai_encl = interestingArgContext rules call_cont
, ai_strs = arg_stricts
......@@ -1091,7 +1093,7 @@ seems to be to do a callSiteInline based on the fact that there is
something interesting about the call site (it's strict). Hmm. That
seems a bit fragile.
Conclusion: inline top level things gaily until Phase 0 (the last
Conclusion: inline top level things gaily until FinalPhase (the last
phase), at which point don't.
Note [pre/postInlineUnconditionally in gentle mode]
......@@ -1214,23 +1216,21 @@ preInlineUnconditionally env top_lvl bndr rhs rhs_env
-- not ticks. Counting ticks cannot be duplicated, and non-counting
-- ticks around a Lam will disappear anyway.
early_phase = case sm_phase mode of
Phase 0 -> False
_ -> True
-- If we don't have this early_phase test, consider
-- x = length [1,2,3]
-- The full laziness pass carefully floats all the cons cells to
-- top level, and preInlineUnconditionally floats them all back in.
-- Result is (a) static allocation replaced by dynamic allocation
-- (b) many simplifier iterations because this tickles
-- a related problem; only one inlining per pass
--
-- On the other hand, I have seen cases where top-level fusion is
-- lost if we don't inline top level thing (e.g. string constants)
-- Hence the test for phase zero (which is the phase for all the final
-- simplifications). Until phase zero we take no special notice of
-- top level things, but then we become more leery about inlining
-- them.
early_phase = sm_phase mode /= FinalPhase
-- If we don't have this early_phase test, consider
-- x = length [1,2,3]
-- The full laziness pass carefully floats all the cons cells to
-- top level, and preInlineUnconditionally floats them all back in.
-- Result is (a) static allocation replaced by dynamic allocation
-- (b) many simplifier iterations because this tickles
-- a related problem; only one inlining per pass
--
-- On the other hand, I have seen cases where top-level fusion is
-- lost if we don't inline top level thing (e.g. string constants)
-- Hence the test for phase zero (which is the phase for all the final
-- simplifications). Until phase zero we take no special notice of
-- top level things, but then we become more leery about inlining
-- them.
{-
************************************************************************
......@@ -1549,7 +1549,7 @@ tryEtaExpandRhs mode bndr rhs
return (new_arity, is_bot, new_rhs) }
where
try_expand
| exprIsTrivial rhs
| exprIsTrivial rhs -- See Note [Do not eta-expand trivial expressions]
= return (exprArity rhs, False, rhs)
| sm_eta_expand mode -- Provided eta-expansion is on
......@@ -1593,9 +1593,17 @@ because then 'genMap' will inline, and it really shouldn't: at least
as far as the programmer is concerned, it's not applied to two
arguments!
Note [Do not eta-expand trivial expressions]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Do not eta-expand a trivial RHS like
f = g
If we eta expand do
f = \x. g x
we'll just eta-reduce again, and so on; so the
simplifier never terminates.
Note [Do not eta-expand join points]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Similarly to CPR (see Note [Don't w/w join points for CPR] in
GHC.Core.Opt.WorkWrap), a join point stands well to gain from its outer binding's
eta-expansion, and eta-expanding a join point is fraught with issues like how to
......
......@@ -1760,8 +1760,8 @@ Note [Transfer activation]
In which phase should the specialise-constructor rules be active?
Originally I made them always-active, but Manuel found that this
defeated some clever user-written rules. Then I made them active only
in Phase 0; after all, currently, the specConstr transformation is
only run after the simplifier has reached Phase 0, but that meant
in FinalPhase; after all, currently, the specConstr transformation is
only run after the simplifier has reached FinalPhase, but that meant
that specialisations didn't fire inside wrappers; see test
simplCore/should_compile/spec-inline.
......
......@@ -245,8 +245,8 @@ NOINLINE pragma to the worker.
(See #13143 for a real-world example.)
It is crucial that we do this for *all* NOINLINE functions. #10069
demonstrates what happens when we promise to w/w a (NOINLINE) leaf function, but
fail to deliver:
demonstrates what happens when we promise to w/w a (NOINLINE) leaf
function, but fail to deliver:
data C = C Int# Int#
......@@ -421,19 +421,27 @@ When should the wrapper inlining be active?
In module Bar we want to give specialisations a chance to fire
before inlining f's wrapper.
Historical note: At one stage I tried making the wrapper inlining
always-active, and that had a very bad effect on nofib/imaginary/x2n1;
a wrapper was inlined before the specialisation fired.
Reminder: Note [Don't w/w INLINE things], so we don't need to worry
about INLINE things here.
Conclusion:
- If the user said NOINLINE[n], respect that
- If the user said NOINLINE, inline the wrapper as late as
poss (phase 0). This is a compromise driven by (2) above
- If the user said NOINLINE, inline the wrapper only after
phase 0, the last user-visible phase. That means that all
rules will have had a chance to fire.
What phase is after phase 0? Answer: FinalPhase, that's the reason it
exists. NB: Similar to InitialPhase, users can't write INLINE[Final] f;
it's syntactically illegal.
- Otherwise inline wrapper in phase 2. That allows the
'gentle' simplification pass to apply specialisation rules
Historical note: At one stage I tried making the wrapper inlining
always-active, and that had a very bad effect on nofib/imaginary/x2n1;
a wrapper was inlined before the specialisation fired.
Note [Wrapper NoUserInline]
~~~~~~~~~~~~~~~~~~~~~~~~~~~
......@@ -575,8 +583,8 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
work_uniq <- getUniqueM
let work_rhs = work_fn rhs
work_act = case fn_inline_spec of -- See Note [Worker activation]
NoInline -> fn_act
_ -> wrap_act
NoInline -> inl_act fn_inl_prag
_ -> inl_act wrap_prag
work_prag = InlinePragma { inl_src = SourceText "{-# INLINE"
, inl_inline = fn_inline_spec
......@@ -626,19 +634,7 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
| otherwise = topDmd
wrap_rhs = wrap_fn work_id
wrap_act = case fn_act of -- See Note [Wrapper activation]
ActiveAfter {} -> fn_act
NeverActive -> activeDuringFinal
_ -> activeAfterInitial
wrap_prag = InlinePragma { inl_src = SourceText "{-# INLINE"
, inl_inline = NoUserInline
, inl_sat = Nothing
, inl_act = wrap_act
, inl_rule = rule_match_info }
-- inl_act: see Note [Wrapper activation]
-- inl_inline: see Note [Wrapper NoUserInline]
-- inl_rule: RuleMatchInfo is (and must be) unaffected
wrap_prag = mkStrWrapperInlinePrag fn_inl_prag
wrap_id = fn_id `setIdUnfolding` mkWwInlineRule dflags wrap_rhs arity
`setInlinePragma` wrap_prag
`setIdOccInfo` noOccInfo
......@@ -655,8 +651,6 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
rhs_fvs = exprFreeVars rhs
fn_inl_prag = inlinePragInfo fn_info
fn_inline_spec = inl_inline fn_inl_prag
fn_act = inl_act fn_inl_prag
rule_match_info = inlinePragmaRuleMatchInfo fn_inl_prag
fn_unfolding = unfoldingInfo fn_info
arity = arityInfo fn_info
-- The arity is set by the simplifier using exprEtaExpandArity
......@@ -674,6 +668,25 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
| otherwise = topCpr
mkStrWrapperInlinePrag :: InlinePragma -> InlinePragma
mkStrWrapperInlinePrag (InlinePragma { inl_act = act, inl_rule = rule_info })
= InlinePragma { inl_src = SourceText "{-# INLINE"
, inl_inline = NoUserInline -- See Note [Wrapper NoUserInline]
, inl_sat = Nothing
, inl_act = wrap_act
, inl_rule = rule_info } -- RuleMatchInfo is (and must be) unaffected
where
wrap_act = case act of -- See Note [Wrapper activation]
NeverActive -> activateDuringFinal
FinalActive -> act
ActiveAfter {} -> act
ActiveBefore {} -> activateAfterInitial
AlwaysActive -> activateAfterInitial
-- For the last two cases, see (4) in Note [Wrapper activation]
-- NB: the (ActiveBefore n) isn't quite right. We really want
-- it to be active *after* Initial but *before* n. We don't have
-- a way to say that, alas.
{-
Note [Demand on the worker]
~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
......@@ -1367,8 +1367,7 @@ pushCoTyArg co ty
| otherwise
= Nothing
where
tyL = coercionLKind co
tyR = coercionRKind co
Pair tyL tyR = coercionKind co
-- co :: tyL ~ tyR
-- tyL = forall (a1 :: k1). ty1
-- tyR = forall (a2 :: k2). ty2
......
......@@ -51,7 +51,7 @@ module GHC.Core.Type (
splitPiTy_maybe, splitPiTy, splitPiTys,
mkTyConBindersPreferAnon,
mkPiTy, mkPiTys,
mkLamType, mkLamTypes,
mkLamType, mkLamTypes, mkFunctionType,
piResultTy, piResultTys,
applyTysX, dropForAlls,
mkFamilyTyConApp,
......@@ -256,7 +256,7 @@ import {-# SOURCE #-} GHC.Core.Coercion
, mkTyConAppCo, mkAppCo, mkCoVarCo, mkAxiomRuleCo
, mkForAllCo, mkFunCo, mkAxiomInstCo, mkUnivCo
, mkSymCo, mkTransCo, mkNthCo, mkLRCo, mkInstCo
, mkKindCo, mkSubCo, mkFunCo, mkAxiomInstCo
, mkKindCo, mkSubCo
, decomposePiCos, coercionKind, coercionLKind
, coercionRKind, coercionType
, isReflexiveCo, seqCo )
......@@ -1517,6 +1517,8 @@ mkLamType :: Var -> Type -> Type
mkLamTypes :: [Var] -> Type -> Type
-- ^ 'mkLamType' for multiple type or value arguments
mkLamTypes vs ty = foldr mkLamType ty vs
mkLamType v body_ty
| isTyVar v
= ForAllTy (Bndr v Inferred) body_ty
......@@ -1525,43 +1527,19 @@ mkLamType v body_ty
, v `elemVarSet` tyCoVarsOfType body_ty
= ForAllTy (Bndr v Required) body_ty
| isPredTy arg_ty -- See Note [mkLamType: dictionary arguments]
= mkInvisFunTy arg_ty body_ty
| otherwise
= mkVisFunTy arg_ty body_ty
where
arg_ty = varType v
mkLamTypes vs ty = foldr mkLamType ty vs
= mkFunctionType (varType v) body_ty
{- Note [mkLamType: dictionary arguments]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If we have (\ (d :: Ord a). blah), we want to give it type
(Ord a => blah_ty)
with a fat arrow; that is, using mkInvisFunTy, not mkVisFunTy.
Why? After all, we are in Core, where (=>) and (->) behave the same.
Yes, but the /specialiser/ does treat dictionary arguments specially.