More efficient implementation plan for primops with continuation arguments

Note: this also relates to keepAlive# and touch#. See #21708 (closed) and !8597 (closed)

Original idea from #14375 (closed): for some primops that take continuation arguments, we currently have to allocate closures for the continuations, even though the primop sometimes enters one of them immediately. One example is

maskAsyncExceptions# :: (State# RealWorld -> (# State# RealWorld, a #))
                     -> (State# RealWorld -> (# State# RealWorld, a #))

which is implemented as

stg_maskAsyncExceptionszh /* explicit stack */
{
    /* Args: R1 :: IO a */
    STK_CHK_P_LL (WDS(1)/* worst case */, stg_maskAsyncExceptionszh, R1);

    if ((TO_W_(StgTSO_flags(CurrentTSO)) & TSO_BLOCKEX) == 0) {
        /* avoid growing the stack unnecessarily */
        if (Sp(0) == stg_maskAsyncExceptionszh_ret_info) {
            Sp_adj(1);
        } else {
            Sp_adj(-1);
            Sp(0) = stg_unmaskAsyncExceptionszh_ret_info;
        }
    } else {
        if ((TO_W_(StgTSO_flags(CurrentTSO)) & TSO_INTERRUPTIBLE) == 0) {
            Sp_adj(-1);
            Sp(0) = stg_maskUninterruptiblezh_ret_info;
        }
    }

    StgTSO_flags(CurrentTSO) = %lobits32(
        TO_W_(StgTSO_flags(CurrentTSO)) | TSO_BLOCKEX | TSO_INTERRUPTIBLE);

    TICK_UNKNOWN_CALL();
    TICK_SLOW_CALL_fast_v();
    jump stg_ap_v_fast [R1];
}

Here we want better compilation for maskAsyncExceptions# c, one that does not allocate a closure for c.
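For context, here is a minimal Haskell sketch of how the primop is reached from ordinary code. The names maskedIO and currentlyMasked are hypothetical and exist only for this illustration. The point is that the argument io is passed as a closure value, which is the allocation this ticket would like to avoid when the continuation is known at the call site.

```haskell
{-# LANGUAGE MagicHash #-}

import Control.Exception (MaskingState (..), getMaskingState)
import GHC.Exts (maskAsyncExceptions#)
import GHC.IO (IO (..))

-- Hypothetical wrapper: run an IO action with async exceptions masked.
-- The continuation `io` reaches the primop as an ordinary closure.
maskedIO :: IO a -> IO a
maskedIO (IO io) = IO (maskAsyncExceptions# io)

-- Observe the masking state from inside the continuation.
currentlyMasked :: IO MaskingState
currentlyMasked = maskedIO getMaskingState
```

In GHCi, `currentlyMasked >>= print` should report MaskedInterruptible, matching the TSO_BLOCKEX | TSO_INTERRUPTIBLE flags set by the Cmm code above.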

There are some ideas in ticket:14375#comment:144211, but they were later superseded by ticket:14375#comment:144294, which suggests:

  • Making continuation arguments explicit in STG, so that a maskAsyncExceptions# call would look like this in STG: maskAsyncExceptions# (\s. e).

  • Then, in the code generator, emitting the stack frame push (plus the StgTSO updates) and then emitting the code for e directly. (I don't understand whether the bind s:=s2 part in the comment is necessary.)
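The two steps above can be pictured with a toy model. This is only a sketch: Expr, Instr and codegen are hypothetical names invented here, not GHC's STG or Cmm data types. It also leaves out the runtime check on the current TSO flags that stg_maskAsyncExceptionszh performs, and it simply ignores the state-token binder.

```haskell
-- Toy model only: these types are hypothetical, not GHC's STG or Cmm.
data Expr
  = Call String [String]   -- f x1 .. xn
  | MaskAsync String Expr  -- maskAsyncExceptions# (\s. e)
  deriving (Eq, Show)

data Instr
  = PushFrame String          -- push a return frame with this info pointer
  | OrTsoFlags [String]       -- set these flags on the current TSO
  | TailCall String [String]  -- jump to a function with arguments
  deriving (Eq, Show)

-- The continuation body is compiled inline after the frame push,
-- so no closure is built for it.
codegen :: Expr -> [Instr]
codegen (Call f args) = [TailCall f args]
codegen (MaskAsync _s body) =
  [ PushFrame "stg_unmaskAsyncExceptionszh_ret_info"
  , OrTsoFlags ["TSO_BLOCKEX", "TSO_INTERRUPTIBLE"]
  ]
    ++ codegen body
```

For example, `codegen (MaskAsync "s" (Call "f" ["s"]))` yields the frame push, the flag update, and then the code for the body, with no instruction that allocates a closure for the lambda.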

Some primops take more than one callback, e.g.

catch# :: (State# RealWorld -> (# State# RealWorld, a #) )
       -> (b -> State# RealWorld -> (# State# RealWorld, a #) )
       -> State# RealWorld
       -> (# State# RealWorld, a #)

which is implemented as

stg_catchzh ( P_ io,      /* :: IO a */
              P_ handler  /* :: Exception -> IO a */ )
{
    W_ exceptions_blocked;

    STK_CHK_GEN();

    exceptions_blocked =
        TO_W_(StgTSO_flags(CurrentTSO)) & (TSO_BLOCKEX | TSO_INTERRUPTIBLE);
    TICK_CATCHF_PUSHED();

    /* Apply R1 to the realworld token */
    TICK_UNKNOWN_CALL();
    TICK_SLOW_CALL_fast_v();

    jump stg_ap_v_fast
        (CATCH_FRAME_FIELDS(,,stg_catch_frame_info, CCCS, 0,
                            exceptions_blocked, handler))
        (io);
}
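To make the two callbacks concrete, here is a minimal Haskell sketch of a wrapper around catch#. The names catchAny and recovered are hypothetical, and the exception type is fixed to SomeException on the assumption that this is what is thrown at runtime. Both io and the handler lambda reach the primop as closures; only io is entered immediately, which is why the two arguments are discussed separately below.

```haskell
{-# LANGUAGE MagicHash #-}

import Control.Exception (ErrorCall (..), SomeException, throwIO)
import GHC.Exts (catch#)
import GHC.IO (IO (..))

-- Hypothetical wrapper: run `io`, and on any exception run `handler`.
-- `io` is entered immediately by stg_catchzh; the handler is stored in
-- the catch frame and entered only if an exception is raised.
catchAny :: IO a -> (SomeException -> IO a) -> IO a
catchAny (IO io) handler =
  IO (catch# io (\e s -> case handler e of IO h -> h s))

-- Small demonstration: the action throws, so the handler's result is used.
recovered :: IO Int
recovered = catchAny (throwIO (ErrorCall "boom")) (\_ -> return 42)
```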

For this example, ticket:14375#comment:144294 suggests using join points for the handler argument (and the non-allocating callback scheme for the io argument). I suggest we focus on a more efficient implementation of callbacks that are immediately entered, and leave the join point work to another ticket.

Trac metadata

Trac field              Value
Version                 8.6.3
Type                    Task
TypeOfFailure           OtherFailure
Priority                normal
Resolution              Unresolved
Component               Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC                      bgamari, simonmar, simonpj
Operating system
Architecture
Edited by Simon Peyton Jones