Commit 6d49d5be authored by Simon Peyton Jones's avatar Simon Peyton Jones Committed by Marge Bot
Browse files

Implement cast worker/wrapper properly

The cast worker/wrapper transformation transforms
   x = e |> co
into
   y = e
   x = y |> co

This is done by the simplifier, but we were being
careless about transferring IdInfo from x to y,
and about what to do if x is a NOINLNE function.
This resulted in a series of bugs:
     #17673, #18093, #18078.

This patch fixes all that:

* Main change is in GHC.Core.Opt.Simplify, and
  the new prepareBinding function, which does this
  cast worker/wrapper transform.
  See Note [Cast worker/wrappers].

* There is quite a bit of refactoring around
  prepareRhs, makeTrivial etc.  It's nicer now.

* Some wrappers from strictness and cast w/w, notably those for
  a function with a NOINLINE, should inline very late. There
  wasn't really a mechanism for that, which was an existing bug
  really; so I invented a new finalPhase = Phase (-1).  It's used
  for all simplifier runs after the user-visible phase 2,1,0 have
  run.  (No new runs of the simplifier are introduced thereby.)

  See new Note [Compiler phases] in GHC.Types.Basic;
  the main changes are in GHC.Core.Opt.Driver

* Doing this made me trip over two places where the AnonArgFlag on a
  FunTy was being lost so we could end up with (Num a -> ty)
  rather than (Num a => ty)
    - In coercionLKind/coercionRKind
    - In contHoleType in the Simplifier

  I fixed the former by defining mkFunctionType and using it in
  coercionLKind/RKind.

  I could have done the same for the latter, but the information
  is almost to hand.  So I fixed the latter by
    - adding sc_hole_ty to ApplyToVal (like ApplyToTy),
    - adding as_hole_ty to ValArg (like TyArg)
    - adding sc_fun_ty to StrictArg
  Turned out I could then remove ai_type from ArgInfo.  This is
  just moving the deck chairs around, but it worked out nicely.

  See the new Note [AnonArgFlag] in GHC.Types.Var

* When looking at the 'arity decrease' thing (#18093) I discovered
  that stable unfoldings had a much lower arity than the actual
  optimised function.  That's what led to the arity-decrease
  message.  Simple solution: eta-expand.

  It's described in Note [Eta-expand stable unfoldings]
  in GHC.Core.Opt.Simplify

* I also discovered that unsafeCoerce wasn't being inlined if
  the context was boring.  So (\x. f (unsafeCoerce x)) would
  create a thunk -- yikes!  I fixed that by making inlineBoringOK
  a bit cleverer: see Note [Inline unsafeCoerce] in GHC.Core.Unfold.

  I also found that unsafeCoerceName was unused, so I removed it.

I made a test case for #18078, and a very similar one for #17673.

The net effect of all this on nofib is very modest, but positive:

--------------------------------------------------------------------------------
        Program           Size    Allocs   Runtime   Elapsed  TotalMem
--------------------------------------------------------------------------------
           anna          -0.4%     -0.1%     -3.1%     -3.1%      0.0%
 fannkuch-redux          -0.4%     -0.3%     -0.1%     -0.1%      0.0%
       maillist          -0.4%     -0.1%     -7.8%     -1.0%    -14.3%
      primetest          -0.4%    -15.6%     -7.1%     -6.6%      0.0%
--------------------------------------------------------------------------------
            Min          -0.9%    -15.6%    -13.3%    -14.2%    -14.3%
            Max          -0.3%      0.0%    +12.1%    +12.4%      0.0%
 Geometric Mean          -0.4%     -0.2%     -2.3%     -2.2%     -0.1%

All following metric decreases are compile-time allocation decreases
between -1% and -3%:

Metric Decrease:
  T5631
  T13701
  T14697
  T15164
parent 9454511b
......@@ -472,7 +472,6 @@ basicKnownKeyNames
, unsafeEqualityTyConName
, unsafeReflDataConName
, unsafeCoercePrimName
, unsafeCoerceName
]
genericTyConNames :: [Name]
......@@ -1333,12 +1332,11 @@ typeErrorShowTypeDataConName =
-- Unsafe coercion proofs
unsafeEqualityProofName, unsafeEqualityTyConName, unsafeCoercePrimName,
unsafeCoerceName, unsafeReflDataConName :: Name
unsafeReflDataConName :: Name
unsafeEqualityProofName = varQual uNSAFE_COERCE (fsLit "unsafeEqualityProof") unsafeEqualityProofIdKey
unsafeEqualityTyConName = tcQual uNSAFE_COERCE (fsLit "UnsafeEquality") unsafeEqualityTyConKey
unsafeReflDataConName = dcQual uNSAFE_COERCE (fsLit "UnsafeRefl") unsafeReflDataConKey
unsafeCoercePrimName = varQual uNSAFE_COERCE (fsLit "unsafeCoerce#") unsafeCoercePrimIdKey
unsafeCoerceName = varQual uNSAFE_COERCE (fsLit "unsafeCoerce") unsafeCoerceIdKey
-- Dynamic
toDynName :: Name
......@@ -2417,10 +2415,9 @@ naturalSDataConKey = mkPreludeMiscIdUnique 568
wordToNaturalIdKey = mkPreludeMiscIdUnique 569
-- Unsafe coercion proofs
unsafeEqualityProofIdKey, unsafeCoercePrimIdKey, unsafeCoerceIdKey :: Unique
unsafeEqualityProofIdKey, unsafeCoercePrimIdKey :: Unique
unsafeEqualityProofIdKey = mkPreludeMiscIdUnique 570
unsafeCoercePrimIdKey = mkPreludeMiscIdUnique 571
unsafeCoerceIdKey = mkPreludeMiscIdUnique 572
{-
************************************************************************
......
......@@ -2188,7 +2188,7 @@ coercionLKind co
go (TyConAppCo _ tc cos) = mkTyConApp tc (map go cos)
go (AppCo co1 co2) = mkAppTy (go co1) (go co2)
go (ForAllCo tv1 _ co1) = mkTyCoInvForAllTy tv1 (go co1)
go (FunCo _ co1 co2) = mkVisFunTy (go co1) (go co2)
go (FunCo _ co1 co2) = mkFunctionType (go co1) (go co2)
go (CoVarCo cv) = coVarLType cv
go (HoleCo h) = coVarLType (coHoleCoVar h)
go (UnivCo _ _ ty1 _) = ty1
......@@ -2245,7 +2245,7 @@ coercionRKind co
go (AppCo co1 co2) = mkAppTy (go co1) (go co2)
go (CoVarCo cv) = coVarRType cv
go (HoleCo h) = coVarRType (coHoleCoVar h)
go (FunCo _ co1 co2) = mkVisFunTy (go co1) (go co2)
go (FunCo _ co1 co2) = mkFunctionType (go co1) (go co2)
go (UnivCo _ _ _ ty2) = ty2
go (SymCo co) = coercionLKind co
go (TransCo _ co2) = go co2
......
......@@ -404,7 +404,7 @@ delayInlining top_lvl bndr
-- These rules are probably auto-generated specialisations,
-- since Ids with manual rules usually have manually-inserted
-- delayed inlining anyway
= bndr `setInlineActivation` activeAfterInitial
= bndr `setInlineActivation` activateAfterInitial
| otherwise
= bndr
......
......@@ -331,8 +331,8 @@ cprAnalBind top_lvl env id rhs
cprExpandUnfolding_maybe :: Id -> Maybe CoreExpr
cprExpandUnfolding_maybe id = do
guard (idArity id == 0)
-- There are only phase 0 Simplifier runs after CPR analysis
guard (isActiveIn 0 (idInlineActivation id))
-- There are only FinalPhase Simplifier runs after CPR analysis
guard (activeInFinalPhase (idInlineActivation id))
expandUnfolding_maybe (idUnfolding id)
{- Note [Arity trimming for CPR signatures]
......
......@@ -37,7 +37,7 @@ import GHC.Core.Opt.FloatOut ( floatOutwards )
import GHC.Core.FamInstEnv
import GHC.Types.Id
import GHC.Utils.Error ( withTiming, withTimingD, DumpFormat (..) )
import GHC.Types.Basic ( CompilerPhase(..), isDefaultInlinePragma, defaultInlinePragma )
import GHC.Types.Basic
import GHC.Types.Var.Set
import GHC.Types.Var.Env
import GHC.Core.Opt.LiberateCase ( liberateCase )
......@@ -141,8 +141,10 @@ getCoreToDo dflags
maybe_rule_check phase = runMaybe rule_check (CoreDoRuleCheck phase)
maybe_strictness_before phase
= runWhen (phase `elem` strictnessBefore dflags) CoreDoDemand
maybe_strictness_before (Phase phase)
| phase `elem` strictnessBefore dflags = CoreDoDemand
maybe_strictness_before _
= CoreDoNothing
base_mode = SimplMode { sm_phase = panic "base_mode"
, sm_names = []
......@@ -152,20 +154,20 @@ getCoreToDo dflags
, sm_inline = True
, sm_case_case = True }
simpl_phase phase names iter
simpl_phase phase name iter
= CoreDoPasses
$ [ maybe_strictness_before phase
, CoreDoSimplify iter
(base_mode { sm_phase = Phase phase
, sm_names = names })
(base_mode { sm_phase = phase
, sm_names = [name] })
, maybe_rule_check (Phase phase) ]
, maybe_rule_check phase ]
simpl_phases = CoreDoPasses [ simpl_phase phase ["main"] max_iter
| phase <- [phases, phases-1 .. 1] ]
-- Run GHC's internal simplification phase, after all rules have run.
-- See Note [Compiler phases] in GHC.Types.Basic
simplify name = simpl_phase FinalPhase name max_iter
-- initial simplify: mk specialiser happy: minimum effort please
-- initial simplify: mk specialiser happy: minimum effort please
simpl_gently = CoreDoSimplify max_iter
(base_mode { sm_phase = InitialPhase
, sm_names = ["Gentle"]
......@@ -182,7 +184,7 @@ getCoreToDo dflags
demand_analyser = (CoreDoPasses (
dmd_cpr_ww ++
[simpl_phase 0 ["post-worker-wrapper"] max_iter]
[simplify "post-worker-wrapper"]
))
-- Static forms are moved to the top level with the FloatOut pass.
......@@ -203,7 +205,7 @@ getCoreToDo dflags
if opt_level == 0 then
[ static_ptrs_float_outwards,
CoreDoSimplify max_iter
(base_mode { sm_phase = Phase 0
(base_mode { sm_phase = FinalPhase
, sm_names = ["Non-opt simplification"] })
]
......@@ -251,8 +253,10 @@ getCoreToDo dflags
-- GHC.Iface.Tidy.StaticPtrTable.
static_ptrs_float_outwards,
simpl_phases,
-- Run the simplier phases 2,1,0 to allow rewrite rules to fire
CoreDoPasses [ simpl_phase (Phase phase) "main" max_iter
| phase <- [phases, phases-1 .. 1] ],
simpl_phase (Phase 0) "main" (max max_iter 3),
-- Phase 0: allow all Ids to be inlined now
-- This gets foldr inlined before strictness analysis
......@@ -263,7 +267,6 @@ getCoreToDo dflags
-- ==> let k = BIG in letrec go = \xs -> ...(k x).... in go xs
-- ==> let k = BIG in letrec go = \xs -> ...(BIG x).... in go xs
-- Don't stop now!
simpl_phase 0 ["main"] (max max_iter 3),
runWhen do_float_in CoreDoFloatInwards,
-- Run float-inwards immediately before the strictness analyser
......@@ -274,9 +277,10 @@ getCoreToDo dflags
runWhen call_arity $ CoreDoPasses
[ CoreDoCallArity
, simpl_phase 0 ["post-call-arity"] max_iter
, simplify "post-call-arity"
],
-- Strictness analysis
runWhen strictness demand_analyser,
runWhen exitification CoreDoExitify,
......@@ -302,24 +306,24 @@ getCoreToDo dflags
runWhen do_float_in CoreDoFloatInwards,
maybe_rule_check (Phase 0),
maybe_rule_check FinalPhase,
-- Case-liberation for -O2. This should be after
-- strictness analysis and the simplification which follows it.
runWhen liberate_case (CoreDoPasses [
CoreLiberateCase,
simpl_phase 0 ["post-liberate-case"] max_iter
simplify "post-liberate-case"
]), -- Run the simplifier after LiberateCase to vastly
-- reduce the possibility of shadowing
-- Reason: see Note [Shadowing] in GHC.Core.Opt.SpecConstr
runWhen spec_constr CoreDoSpecConstr,
maybe_rule_check (Phase 0),
maybe_rule_check FinalPhase,
runWhen late_specialise
(CoreDoPasses [ CoreDoSpecialising
, simpl_phase 0 ["post-late-spec"] max_iter]),
, simplify "post-late-spec"]),
-- LiberateCase can yield new CSE opportunities because it peels
-- off one layer of a recursive function (concretely, I saw this
......@@ -328,11 +332,10 @@ getCoreToDo dflags
runWhen ((liberate_case || spec_constr) && cse) CoreCSE,
-- Final clean-up simplification:
simpl_phase 0 ["final"] max_iter,
simplify "final",
runWhen late_dmd_anal $ CoreDoPasses (
dmd_cpr_ww ++
[simpl_phase 0 ["post-late-ww"] max_iter]
dmd_cpr_ww ++ [simplify "post-late-ww"]
),
-- Final run of the demand_analyser, ensures that one-shot thunks are
......@@ -342,7 +345,7 @@ getCoreToDo dflags
-- can become /exponentially/ more expensive. See #11731, #12996.
runWhen (strictness || late_dmd_anal) CoreDoDemand,
maybe_rule_check (Phase 0)
maybe_rule_check FinalPhase
]
-- Remove 'CoreDoNothing' and flatten 'CoreDoPasses' for clarity.
......
This diff is collapsed.
......@@ -118,7 +118,9 @@ data SimplCont
SimplCont
| ApplyToVal -- (ApplyToVal arg K)[e] = K[ e arg ]
{ sc_dup :: DupFlag -- See Note [DupFlag invariants]
{ sc_dup :: DupFlag -- See Note [DupFlag invariants]
, sc_hole_ty :: OutType -- Type of the function, presumably (forall a. blah)
-- See Note [The hole type in ApplyToTy/Val]
, sc_arg :: InExpr -- The argument,
, sc_env :: StaticEnv -- see Note [StaticEnv invariant]
, sc_cont :: SimplCont }
......@@ -126,7 +128,7 @@ data SimplCont
| ApplyToTy -- (ApplyToTy ty K)[e] = K[ e ty ]
{ sc_arg_ty :: OutType -- Argument type
, sc_hole_ty :: OutType -- Type of the function, presumably (forall a. blah)
-- See Note [The hole type in ApplyToTy]
-- See Note [The hole type in ApplyToTy/Val]
, sc_cont :: SimplCont }
| Select -- (Select alts K)[e] = K[ case e of alts ]
......@@ -151,6 +153,9 @@ data SimplCont
, sc_fun :: ArgInfo -- Specifies f, e1..en, Whether f has rules, etc
-- plus strictness flags for *further* args
, sc_cci :: CallCtxt -- Whether *this* argument position is interesting
, sc_fun_ty :: OutType -- Type of the function (f e1 .. en),
-- presumably (arg_ty -> res_ty)
-- where res_ty is expected by sc_cont
, sc_cont :: SimplCont }
| TickIt -- (TickIt t K)[e] = K[ tick t e ]
......@@ -254,8 +259,6 @@ data ArgInfo
ai_fun :: OutId, -- The function
ai_args :: [ArgSpec], -- ...applied to these args (which are in *reverse* order)
ai_type :: OutType, -- Type of (f a1 ... an)
ai_rules :: FunRules, -- Rules for this function
ai_encl :: Bool, -- Flag saying whether this function
......@@ -271,37 +274,36 @@ data ArgInfo
}
data ArgSpec
= ValArg OutExpr -- Apply to this (coercion or value); c.f. ApplyToVal
= ValArg { as_arg :: OutExpr -- Apply to this (coercion or value); c.f. ApplyToVal
, as_hole_ty :: OutType } -- Type of the function (presumably t1 -> t2)
| TyArg { as_arg_ty :: OutType -- Apply to this type; c.f. ApplyToTy
, as_hole_ty :: OutType } -- Type of the function (presumably forall a. blah)
| CastBy OutCoercion -- Cast by this; c.f. CastIt
instance Outputable ArgSpec where
ppr (ValArg e) = text "ValArg" <+> ppr e
ppr (ValArg { as_arg = arg }) = text "ValArg" <+> ppr arg
ppr (TyArg { as_arg_ty = ty }) = text "TyArg" <+> ppr ty
ppr (CastBy c) = text "CastBy" <+> ppr c
addValArgTo :: ArgInfo -> OutExpr -> ArgInfo
addValArgTo ai arg = ai { ai_args = ValArg arg : ai_args ai
, ai_type = applyTypeToArg (ai_type ai) arg
, ai_rules = decRules (ai_rules ai) }
addValArgTo :: ArgInfo -> OutExpr -> OutType -> ArgInfo
addValArgTo ai arg hole_ty = ai { ai_args = arg_spec : ai_args ai
, ai_rules = decRules (ai_rules ai) }
where
arg_spec = ValArg { as_arg = arg, as_hole_ty = hole_ty }
addTyArgTo :: ArgInfo -> OutType -> ArgInfo
addTyArgTo ai arg_ty = ai { ai_args = arg_spec : ai_args ai
, ai_type = piResultTy poly_fun_ty arg_ty
, ai_rules = decRules (ai_rules ai) }
addTyArgTo :: ArgInfo -> OutType -> OutType -> ArgInfo
addTyArgTo ai arg_ty hole_ty = ai { ai_args = arg_spec : ai_args ai
, ai_rules = decRules (ai_rules ai) }
where
poly_fun_ty = ai_type ai
arg_spec = TyArg { as_arg_ty = arg_ty, as_hole_ty = poly_fun_ty }
arg_spec = TyArg { as_arg_ty = arg_ty, as_hole_ty = hole_ty }
addCastTo :: ArgInfo -> OutCoercion -> ArgInfo
addCastTo ai co = ai { ai_args = CastBy co : ai_args ai
, ai_type = coercionRKind co }
addCastTo ai co = ai { ai_args = CastBy co : ai_args ai }
argInfoAppArgs :: [ArgSpec] -> [OutExpr]
argInfoAppArgs [] = []
argInfoAppArgs (CastBy {} : _) = [] -- Stop at a cast
argInfoAppArgs (ValArg e : as) = e : argInfoAppArgs as
argInfoAppArgs (ValArg { as_arg = arg } : as) = arg : argInfoAppArgs as
argInfoAppArgs (TyArg { as_arg_ty = ty } : as) = Type ty : argInfoAppArgs as
pushSimplifiedArgs :: SimplEnv -> [ArgSpec] -> SimplCont -> SimplCont
......@@ -310,7 +312,9 @@ pushSimplifiedArgs env (arg : args) k
= case arg of
TyArg { as_arg_ty = arg_ty, as_hole_ty = hole_ty }
-> ApplyToTy { sc_arg_ty = arg_ty, sc_hole_ty = hole_ty, sc_cont = rest }
ValArg e -> ApplyToVal { sc_arg = e, sc_env = env, sc_dup = Simplified, sc_cont = rest }
ValArg { as_arg = arg, as_hole_ty = hole_ty }
-> ApplyToVal { sc_arg = arg, sc_env = env, sc_dup = Simplified
, sc_hole_ty = hole_ty, sc_cont = rest }
CastBy c -> CastIt c rest
where
rest = pushSimplifiedArgs env args k
......@@ -323,7 +327,7 @@ argInfoExpr fun rev_args
= go rev_args
where
go [] = Var fun
go (ValArg a : as) = go as `App` a
go (ValArg { as_arg = arg } : as) = go as `App` arg
go (TyArg { as_arg_ty = ty } : as) = go as `App` Type ty
go (CastBy co : as) = mkCast (go as) co
......@@ -409,11 +413,9 @@ contHoleType (TickIt _ k) = contHoleType k
contHoleType (CastIt co _) = coercionLKind co
contHoleType (StrictBind { sc_bndr = b, sc_dup = dup, sc_env = se })
= perhapsSubstTy dup se (idType b)
contHoleType (StrictArg { sc_fun = ai }) = funArgTy (ai_type ai)
contHoleType (ApplyToTy { sc_hole_ty = ty }) = ty -- See Note [The hole type in ApplyToTy]
contHoleType (ApplyToVal { sc_arg = e, sc_env = se, sc_dup = dup, sc_cont = k })
= mkVisFunTy (perhapsSubstTy dup se (exprType e))
(contHoleType k)
contHoleType (StrictArg { sc_fun_ty = ty }) = funArgTy ty
contHoleType (ApplyToTy { sc_hole_ty = ty }) = ty -- See Note [The hole type in ApplyToTy/Val]
contHoleType (ApplyToVal { sc_hole_ty = ty }) = ty -- See Note [The hole type in ApplyToTy/Val]
contHoleType (Select { sc_dup = d, sc_bndr = b, sc_env = se })
= perhapsSubstTy d se (idType b)
......@@ -458,13 +460,13 @@ mkArgInfo :: SimplEnv
mkArgInfo env fun rules n_val_args call_cont
| n_val_args < idArity fun -- Note [Unsaturated functions]
= ArgInfo { ai_fun = fun, ai_args = [], ai_type = fun_ty
= ArgInfo { ai_fun = fun, ai_args = []
, ai_rules = fun_rules
, ai_encl = False
, ai_strs = vanilla_stricts
, ai_discs = vanilla_discounts }
| otherwise
= ArgInfo { ai_fun = fun, ai_args = [], ai_type = fun_ty
= ArgInfo { ai_fun = fun, ai_args = []
, ai_rules = fun_rules
, ai_encl = interestingArgContext rules call_cont
, ai_strs = arg_stricts
......@@ -1091,7 +1093,7 @@ seems to be to do a callSiteInline based on the fact that there is
something interesting about the call site (it's strict). Hmm. That
seems a bit fragile.
Conclusion: inline top level things gaily until Phase 0 (the last
Conclusion: inline top level things gaily until FinalPhase (the last
phase), at which point don't.
Note [pre/postInlineUnconditionally in gentle mode]
......@@ -1214,23 +1216,21 @@ preInlineUnconditionally env top_lvl bndr rhs rhs_env
-- not ticks. Counting ticks cannot be duplicated, and non-counting
-- ticks around a Lam will disappear anyway.
early_phase = case sm_phase mode of
Phase 0 -> False
_ -> True
-- If we don't have this early_phase test, consider
-- x = length [1,2,3]
-- The full laziness pass carefully floats all the cons cells to
-- top level, and preInlineUnconditionally floats them all back in.
-- Result is (a) static allocation replaced by dynamic allocation
-- (b) many simplifier iterations because this tickles
-- a related problem; only one inlining per pass
--
-- On the other hand, I have seen cases where top-level fusion is
-- lost if we don't inline top level thing (e.g. string constants)
-- Hence the test for phase zero (which is the phase for all the final
-- simplifications). Until phase zero we take no special notice of
-- top level things, but then we become more leery about inlining
-- them.
early_phase = sm_phase mode /= FinalPhase
-- If we don't have this early_phase test, consider
-- x = length [1,2,3]
-- The full laziness pass carefully floats all the cons cells to
-- top level, and preInlineUnconditionally floats them all back in.
-- Result is (a) static allocation replaced by dynamic allocation
-- (b) many simplifier iterations because this tickles
-- a related problem; only one inlining per pass
--
-- On the other hand, I have seen cases where top-level fusion is
-- lost if we don't inline top level thing (e.g. string constants)
-- Hence the test for phase zero (which is the phase for all the final
-- simplifications). Until phase zero we take no special notice of
-- top level things, but then we become more leery about inlining
-- them.
{-
************************************************************************
......@@ -1549,7 +1549,7 @@ tryEtaExpandRhs mode bndr rhs
return (new_arity, is_bot, new_rhs) }
where
try_expand
| exprIsTrivial rhs
| exprIsTrivial rhs -- See Note [Do not eta-expand trivial expressions]
= return (exprArity rhs, False, rhs)
| sm_eta_expand mode -- Provided eta-expansion is on
......@@ -1593,9 +1593,17 @@ because then 'genMap' will inline, and it really shouldn't: at least
as far as the programmer is concerned, it's not applied to two
arguments!
Note [Do not eta-expand trivial expressions]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Do not eta-expand a trivial RHS like
f = g
If we eta expand do
f = \x. g x
we'll just eta-reduce again, and so on; so the
simplifier never terminates.
Note [Do not eta-expand join points]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Similarly to CPR (see Note [Don't w/w join points for CPR] in
GHC.Core.Opt.WorkWrap), a join point stands well to gain from its outer binding's
eta-expansion, and eta-expanding a join point is fraught with issues like how to
......
......@@ -1760,8 +1760,8 @@ Note [Transfer activation]
In which phase should the specialise-constructor rules be active?
Originally I made them always-active, but Manuel found that this
defeated some clever user-written rules. Then I made them active only
in Phase 0; after all, currently, the specConstr transformation is
only run after the simplifier has reached Phase 0, but that meant
in FinalPhase; after all, currently, the specConstr transformation is
only run after the simplifier has reached FinalPhase, but that meant
that specialisations didn't fire inside wrappers; see test
simplCore/should_compile/spec-inline.
......
......@@ -245,8 +245,8 @@ NOINLINE pragma to the worker.
(See #13143 for a real-world example.)
It is crucial that we do this for *all* NOINLINE functions. #10069
demonstrates what happens when we promise to w/w a (NOINLINE) leaf function, but
fail to deliver:
demonstrates what happens when we promise to w/w a (NOINLINE) leaf
function, but fail to deliver:
data C = C Int# Int#
......@@ -421,19 +421,27 @@ When should the wrapper inlining be active?
In module Bar we want to give specialisations a chance to fire
before inlining f's wrapper.
Historical note: At one stage I tried making the wrapper inlining
always-active, and that had a very bad effect on nofib/imaginary/x2n1;
a wrapper was inlined before the specialisation fired.
Reminder: Note [Don't w/w INLINE things], so we don't need to worry
about INLINE things here.
Conclusion:
- If the user said NOINLINE[n], respect that
- If the user said NOINLINE, inline the wrapper as late as
poss (phase 0). This is a compromise driven by (2) above
- If the user said NOINLINE, inline the wrapper only after
phase 0, the last user-visible phase. That means that all
rules will have had a chance to fire.
What phase is after phase 0? Answer: FinalPhase, that's the reason it
exists. NB: Similar to InitialPhase, users can't write INLINE[Final] f;
it's syntactically illegal.
- Otherwise inline wrapper in phase 2. That allows the
'gentle' simplification pass to apply specialisation rules
Historical note: At one stage I tried making the wrapper inlining
always-active, and that had a very bad effect on nofib/imaginary/x2n1;
a wrapper was inlined before the specialisation fired.
Note [Wrapper NoUserInline]
~~~~~~~~~~~~~~~~~~~~~~~~~~~
......@@ -575,8 +583,8 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
work_uniq <- getUniqueM
let work_rhs = work_fn rhs
work_act = case fn_inline_spec of -- See Note [Worker activation]
NoInline -> fn_act
_ -> wrap_act
NoInline -> inl_act fn_inl_prag
_ -> inl_act wrap_prag
work_prag = InlinePragma { inl_src = SourceText "{-# INLINE"
, inl_inline = fn_inline_spec
......@@ -626,19 +634,7 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
| otherwise = topDmd
wrap_rhs = wrap_fn work_id
wrap_act = case fn_act of -- See Note [Wrapper activation]
ActiveAfter {} -> fn_act
NeverActive -> activeDuringFinal
_ -> activeAfterInitial
wrap_prag = InlinePragma { inl_src = SourceText "{-# INLINE"
, inl_inline = NoUserInline
, inl_sat = Nothing
, inl_act = wrap_act
, inl_rule = rule_match_info }
-- inl_act: see Note [Wrapper activation]
-- inl_inline: see Note [Wrapper NoUserInline]
-- inl_rule: RuleMatchInfo is (and must be) unaffected
wrap_prag = mkStrWrapperInlinePrag fn_inl_prag
wrap_id = fn_id `setIdUnfolding` mkWwInlineRule dflags wrap_rhs arity
`setInlinePragma` wrap_prag
`setIdOccInfo` noOccInfo
......@@ -655,8 +651,6 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
rhs_fvs = exprFreeVars rhs
fn_inl_prag = inlinePragInfo fn_info
fn_inline_spec = inl_inline fn_inl_prag
fn_act = inl_act fn_inl_prag
rule_match_info = inlinePragmaRuleMatchInfo fn_inl_prag
fn_unfolding = unfoldingInfo fn_info
arity = arityInfo fn_info
-- The arity is set by the simplifier using exprEtaExpandArity
......@@ -674,6 +668,25 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
| otherwise = topCpr
mkStrWrapperInlinePrag :: InlinePragma -> InlinePragma
mkStrWrapperInlinePrag (InlinePragma { inl_act = act, inl_rule = rule_info })
= InlinePragma { inl_src = SourceText "{-# INLINE"
, inl_inline = NoUserInline -- See Note [Wrapper NoUserInline]
, inl_sat = Nothing
, inl_act = wrap_act
, inl_rule = rule_info } -- RuleMatchInfo is (and must be) unaffected
where
wrap_act = case act of -- See Note [Wrapper activation]
NeverActive -> activateDuringFinal
FinalActive -> act
ActiveAfter {} -> act
ActiveBefore {} -> activateAfterInitial
AlwaysActive -> activateAfterInitial
-- For the last two cases, see (4) in Note [Wrapper activation]
-- NB: the (ActiveBefore n) isn't quite right. We really want
-- it to be active *after* Initial but *before* n. We don't have
-- a way to say that, alas.
{-
Note [Demand on the worker]
~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
......@@ -1367,8 +1367,7 @@ pushCoTyArg co ty
| otherwise
= Nothing
where
tyL = coercionLKind co
tyR = coercionRKind co
Pair tyL tyR = coercionKind co
-- co :: tyL ~ tyR
-- tyL = forall (a1 :: k1). ty1
-- tyR = forall (a2 :: k2). ty2
......
......@@ -51,7 +51,7 @@ module GHC.Core.Type (
splitPiTy_maybe, splitPiTy, splitPiTys,
mkTyConBindersPreferAnon,
mkPiTy, mkPiTys,
mkLamType, mkLamTypes,
mkLamType, mkLamTypes, mkFunctionType,
piResultTy, piResultTys,
applyTysX, dropForAlls,
mkFamilyTyConApp,
......@@ -256,7 +256,7 @@ import {-# SOURCE #-} GHC.Core.Coercion
, mkTyConAppCo, mkAppCo, mkCoVarCo, mkAxiomRuleCo
, mkForAllCo, mkFunCo, mkAxiomInstCo, mkUnivCo
, mkSymCo, mkTransCo, mkNthCo, mkLRCo, mkInstCo
, mkKindCo, mkSubCo, mkFunCo, mkAxiomInstCo
, mkKindCo, mkSubCo
, decomposePiCos, coercionKind, coercionLKind
, coercionRKind, coercionType
, isReflexiveCo, seqCo )
......@@ -1517,6 +1517,8 @@ mkLamType :: Var -> Type -> Type
mkLamTypes :: [Var] -> Type -> Type
-- ^ 'mkLamType' for multiple type or value arguments
mkLamTypes vs ty = foldr mkLamType ty vs
mkLamType v body_ty
| isTyVar v
= ForAllTy (Bndr v Inferred) body_ty
......@@ -1525,43 +1527,19 @@ mkLamType v body_ty
, v `elemVarSet` tyCoVarsOfType body_ty
= ForAllTy (Bndr v Required) body_ty
| isPredTy arg_ty -- See Note [mkLamType: dictionary arguments]
= mkInvisFunTy arg_ty body_ty
| otherwise