Commit 4edcef78 authored by Joachim Breitner's avatar Joachim Breitner Committed by Ben Gamari

CSE: Walk past join point lambdas (#15002)

As the CSE transformation traverses the syntax tree, it needs to go past
the lambdas of a join point, and only look for CSE opportunities inside,
as a join point’s lambdas must be preserved. Simple fix; comes with a
Note and a test case.

Thanks to Ryan Scott for an excellently minimized test case, and for
bisecting GHC.

Differential Revision: https://phabricator.haskell.org/D4572

(cherry picked from commit ae0cff0a)
parent 8147fa32
...@@ -17,7 +17,7 @@ import Var ( Var ) ...@@ -17,7 +17,7 @@ import Var ( Var )
import VarEnv ( elemInScopeSet, mkInScopeSet ) import VarEnv ( elemInScopeSet, mkInScopeSet )
import Id ( Id, idType, idInlineActivation, isDeadBinder import Id ( Id, idType, idInlineActivation, isDeadBinder
, zapIdOccInfo, zapIdUsageInfo, idInlinePragma , zapIdOccInfo, zapIdUsageInfo, idInlinePragma
, isJoinId ) , isJoinId, isJoinId_maybe )
import CoreUtils ( mkAltExpr, eqExpr import CoreUtils ( mkAltExpr, eqExpr
, exprIsTickedString , exprIsTickedString
, stripTicksE, stripTicksT, mkTicks ) , stripTicksE, stripTicksT, mkTicks )
...@@ -274,7 +274,28 @@ compiling ppHtml in Haddock.Backends.Xhtml). ...@@ -274,7 +274,28 @@ compiling ppHtml in Haddock.Backends.Xhtml).
We could try and be careful by tracking which join points are still valid at We could try and be careful by tracking which join points are still valid at
each subexpression, but since join points aren't allocated or shared, there's each subexpression, but since join points aren't allocated or shared, there's
less to gain by trying to CSE them. less to gain by trying to CSE them. (#13219)
Note [Don’t tryForCSE the RHS of a Join Point]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Another way how CSE for joint points is tricky is
let join foo x = (x, 42)
join bar x = (x, 42)
in … jump foo 1 … jump bar 2 …
naively, CSE would turn this into
let join foo x = (x, 42)
join bar = foo
in … jump foo 1 … jump bar 2 …
but now bar is a join point that claims arity one, but its right-hand side
is not a lambda, breaking the join-point invariant (this was #15002).
Therefore, `cse_bind` will zoom past the lambdas of a join point (using
`collectNBinders`) and resume searching for CSE opportunities only in the body
of the join point.
Note [CSE for recursive bindings] Note [CSE for recursive bindings]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...@@ -353,6 +374,13 @@ cse_bind toplevel env (in_id, in_rhs) out_id ...@@ -353,6 +374,13 @@ cse_bind toplevel env (in_id, in_rhs) out_id
-- See Note [Take care with literal strings] -- See Note [Take care with literal strings]
= (env', (out_id, in_rhs)) = (env', (out_id, in_rhs))
| Just arity <- isJoinId_maybe in_id
-- See Note [Don’t tryForCSE the RHS of a Join Point]
= let (params, in_body) = collectNBinders arity in_rhs
(env', params') = addBinders env params
out_body = tryForCSE env' in_body
in (env, (out_id, mkLams params' out_body))
| otherwise | otherwise
= (env', (out_id', out_rhs)) = (env', (out_id', out_rhs))
where where
...@@ -392,6 +420,8 @@ addBinding env in_id out_id rhs' ...@@ -392,6 +420,8 @@ addBinding env in_id out_id rhs'
Var {} -> True Var {} -> True
_ -> False _ -> False
-- | Given a binder `let x = e`, this function
-- determines whether we should add `e -> x` to the cs_map
noCSE :: InId -> Bool noCSE :: InId -> Bool
noCSE id = not (isAlwaysActive (idInlineActivation id)) && noCSE id = not (isAlwaysActive (idInlineActivation id)) &&
not (noUserInlineSpec (inlinePragmaSpec (idInlinePragma id))) not (noUserInlineSpec (inlinePragmaSpec (idInlinePragma id)))
......
module T15002 where
import Control.Concurrent.MVar (MVar, modifyMVar_, putMVar)
import Data.Foldable (for_)
broadcastThen :: Either [MVar a] a -> MVar (Either [MVar a] a) -> a -> IO ()
broadcastThen finalState mv x =
modifyMVar_ mv $ \mx -> do
case mx of
Left ls -> do for_ ls (`putMVar` x)
return finalState
Right _ -> return finalState
...@@ -295,3 +295,4 @@ test('T14152a', [ only_ways(['optasm']), check_errmsg(r'dead code') ], compile, ...@@ -295,3 +295,4 @@ test('T14152a', [ only_ways(['optasm']), check_errmsg(r'dead code') ], compile,
test('T13990', normal, compile, ['-dcore-lint -O']) test('T13990', normal, compile, ['-dcore-lint -O'])
test('T14650', normal, compile, ['-O2']) test('T14650', normal, compile, ['-O2'])
test('T14959', normal, compile, ['-O']) test('T14959', normal, compile, ['-O'])
test('T15002', [ req_profiling ], compile, ['-O -fprof-auto -prof'])
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment