More efficient implementation plan for primops with continuation arguments
Note: this also relates to keepAlive# and touch#. See #21708 (closed) and !8597 (closed)
Original idea from #14375 (closed): some primops that take continuation arguments currently force us to allocate the continuations, even though we sometimes enter one of those continuations immediately. One example is
    maskAsyncExceptions# :: (State# RealWorld -> (# State# RealWorld, a #))
                         -> (State# RealWorld -> (# State# RealWorld, a #))
which is implemented as
    stg_maskAsyncExceptionszh /* explicit stack */
    {
        /* Args: R1 :: IO a */
        STK_CHK_P_LL (WDS(1)/* worst case */, stg_maskAsyncExceptionszh, R1);

        if ((TO_W_(StgTSO_flags(CurrentTSO)) & TSO_BLOCKEX) == 0) {
            /* avoid growing the stack unnecessarily */
            if (Sp(0) == stg_maskAsyncExceptionszh_ret_info) {
                Sp_adj(1);
            } else {
                Sp_adj(-1);
                Sp(0) = stg_unmaskAsyncExceptionszh_ret_info;
            }
        } else {
            if ((TO_W_(StgTSO_flags(CurrentTSO)) & TSO_INTERRUPTIBLE) == 0) {
                Sp_adj(-1);
                Sp(0) = stg_maskUninterruptiblezh_ret_info;
            }
        }

        StgTSO_flags(CurrentTSO) = %lobits32(
            TO_W_(StgTSO_flags(CurrentTSO)) | TSO_BLOCKEX | TSO_INTERRUPTIBLE);

        TICK_UNKNOWN_CALL();
        TICK_SLOW_CALL_fast_v();
        jump stg_ap_v_fast [R1];
    }
Here we want maskAsyncExceptions# c to compile to code that does not allocate
a closure for c.
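To make the allocation concrete, here is a minimal sketch of how the primop is reached from ordinary Haskell code. The wrapper name maskedIO is hypothetical (it is not part of base); the point is only that the argument c is a normal function closure, which is then entered through stg_ap_v_fast as in the Cmm above.

```haskell
{-# LANGUAGE MagicHash #-}

module Main where

import Control.Exception (MaskingState (..), getMaskingState)
import GHC.Exts (maskAsyncExceptions#)
import GHC.IO (IO (..))

-- Hypothetical wrapper, for illustration only.
-- The argument c is passed to the primop as a closure. At a call site such
-- as maskedIO (IO (\s -> ...)), that lambda generally has to be
-- heap-allocated first, even though the primop enters it straight away.
maskedIO :: IO a -> IO a
maskedIO (IO c) = IO (maskAsyncExceptions# c)

main :: IO ()
main = do
  outside <- getMaskingState
  inside <- maskedIO getMaskingState
  -- When run as the main thread of a program this prints
  -- (Unmasked,MaskedInterruptible)
  print (outside, inside)
```

The masking state observed inside the callback corresponds to the TSO_BLOCKEX | TSO_INTERRUPTIBLE flags set by the Cmm code.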
There are some ideas in ticket:14375#comment:144211, but they were later superseded by ticket:14375#comment:144294, which suggests:
- Making continuation arguments explicit in STG, so that a maskAsyncExceptions# call would now look like this in STG: maskAsyncExceptions# (\s. e).
- Then, in the code generator, generating the stack frame push (plus StgTSO updates) and then emitting code for e directly. (I don't understand whether the bind s:=s2 part in the comment is necessary.)
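The difference between the current scheme and the proposed one can be sketched with a toy model. Everything below is hypothetical: the types Arg, Expr and Instr and the function codeGen are invented for illustration and do not correspond to GHC's StgSyn or its code generator. The sketch also glosses over the fact that the real frame push is conditional on the current masking state.

```haskell
module Main where

-- Toy model only: none of these types exist in GHC.

-- | How the continuation is handed to the primop.
data Arg
  = ClosureArg String   -- a variable bound to a heap-allocated closure
  | ContArg String Expr -- an explicit continuation (\s. e) in the syntax
  deriving (Eq, Show)

data Expr
  = MaskPrimop Arg      -- maskAsyncExceptions# applied to its argument
  | Opaque String       -- stands in for arbitrary other code
  deriving (Eq, Show)

data Instr
  = PushReturnFrame     -- push the frame that restores the masking state
  | UpdateTsoFlags      -- set the masking bits in the TSO flags
  | EnterClosure String -- unknown call into an allocated closure
  | Code String         -- code emitted in place
  deriving (Eq, Show)

codeGen :: Expr -> [Instr]
codeGen (Opaque name) = [Code name]
-- Current scheme: the continuation is a closure that must be entered.
codeGen (MaskPrimop (ClosureArg f)) =
  [PushReturnFrame, UpdateTsoFlags, EnterClosure f]
-- Proposed scheme: the body of the continuation is emitted directly.
codeGen (MaskPrimop (ContArg _s body)) =
  [PushReturnFrame, UpdateTsoFlags] ++ codeGen body

main :: IO ()
main = do
  print (codeGen (MaskPrimop (ClosureArg "c")))
  print (codeGen (MaskPrimop (ContArg "s" (Opaque "e"))))
```

In the second case no closure is mentioned at all, which is the allocation saving the ticket is after.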
Some primops take more than one callback, e.g.
    catch# :: (State# RealWorld -> (# State# RealWorld, a #) )
           -> (b -> State# RealWorld -> (# State# RealWorld, a #) )
           -> State# RealWorld
           -> (# State# RealWorld, a #)
which is implemented as
    stg_catchzh ( P_ io,      /* :: IO a */
                  P_ handler  /* :: Exception -> IO a */ )
    {
        W_ exceptions_blocked;

        STK_CHK_GEN();

        exceptions_blocked =
            TO_W_(StgTSO_flags(CurrentTSO)) & (TSO_BLOCKEX | TSO_INTERRUPTIBLE);
        TICK_CATCHF_PUSHED();

        /* Apply R1 to the realworld token */
        TICK_UNKNOWN_CALL();
        TICK_SLOW_CALL_fast_v();

        jump stg_ap_v_fast
            (CATCH_FRAME_FIELDS(,,stg_catch_frame_info, CCCS, 0,
                                exceptions_blocked, handler))
            (io);
    }
For this example ticket:14375#comment:144294 suggests using join points for the
handler argument (and using the non-allocating callback scheme for the io
argument), but I suggest we focus on a more efficient implementation of
callbacks that are immediately entered, and worry about the join point stuff in
another ticket.
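The two callback roles of catch# can be seen in a small usage sketch. The wrapper name catchSome is hypothetical, and instantiating b to SomeException is an assumption made for this example. The io argument is entered immediately by the primop, while the handler is only stored in the catch frame and is entered only if an exception is raised.

```haskell
{-# LANGUAGE MagicHash #-}

module Main where

import Control.Exception (ErrorCall (..), SomeException, throwIO)
import GHC.Exts (catch#)
import GHC.IO (IO (..))

-- Hypothetical wrapper, for illustration only.
-- io:      entered right away (the callback this ticket wants to avoid
--          allocating)
-- handler: saved in the catch frame and entered only on an exception
catchSome :: IO a -> (SomeException -> IO a) -> IO a
catchSome (IO io) handler =
  IO (catch# io (\e s -> case handler e of IO h -> h s))

main :: IO ()
main = do
  r <- catchSome (throwIO (ErrorCall "boom")) (\_ -> pure "handled")
  putStrLn r -- prints "handled"
```

Because the handler outlives the call to the primop, it cannot be treated the same way as io, which is why the ticket defers it to the separate join point discussion.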
Trac metadata
| Trac field | Value |
|---|---|
| Version | 8.6.3 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | bgamari, simonmar, simonpj |
| Operating system | |
| Architecture | |