
Introduce a standard thunk for allocating top-level strings

Ömer Sinan Ağacan requested to merge wip/osa1/std_string_thunks into master

Currently, for a top-level closure of the form

hey = unpackCString# x

we generate code like this:

Main.hey_entry() //  [R1]
         { info_tbls: [(c2T4,
                        label: Main.hey_info
                        rep: HeapRep static { Thunk }
                        srt: Nothing)]
           stack_info: arg_space: 8 updfr_space: Just 8
         }
     {offset
       c2T4: // global
           _rqm::P64 = R1;
           if ((Sp + 8) - 24 < SpLim) (likely: False) goto c2T5; else goto c2T6;
       c2T5: // global
           R1 = _rqm::P64;
           call (stg_gc_enter_1)(R1) args: 8, res: 0, upd: 8;
       c2T6: // global
           (_c2T1::I64) = call "ccall" arg hints:  [PtrHint,
                                                    PtrHint]  result hints:  [PtrHint] newCAF(BaseReg, _rqm::P64);
           if (_c2T1::I64 == 0) goto c2T3; else goto c2T2;
       c2T3: // global
           call (I64[_rqm::P64])() args: 8, res: 0, upd: 8;
       c2T2: // global
           I64[Sp - 16] = stg_bh_upd_frame_info;
           I64[Sp - 8] = _c2T1::I64;
           R2 = hey1_r2Gg_bytes;
           Sp = Sp - 16;
           call GHC.CString.unpackCString#_info(R2) args: 24, res: 0, upd: 24;
     }
 }

This code is generated for every string literal. The only difference between top-level closures like this one is the argument holding the address of the string's bytes (hey1_r2Gg_bytes in the code above).
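For anyone who wants to reproduce a dump like the one above, a minimal module along the following lines should be enough. This is a sketch: the module layout and the literal "hey" are illustrative, and it assumes GHC.CString (from ghc-prim) is importable, which is the case when compiling with plain ghc. Ordinary String literals desugar to the same unpackCString# call; it is spelled out here only to make the shape visible.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main, hey) where

import GHC.CString (unpackCString#)

-- A top-level string closure of the shape discussed above:
-- unpackCString# applied to the address of a NUL-terminated literal.
hey :: String
hey = unpackCString# "hey"#

main :: IO ()
main = putStrLn hey
```

Compiling with something like `ghc -ddump-cmm Main.hs` should print the Cmm for hey; the exact labels and uniques will differ between compiler versions and runs.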

With this patch we introduce a standard thunk in the RTS, called stg_MK_STRING_info, that does what unpackCString# x does, except that it reads the address of the bytes from the closure's payload. Using it, the closure above becomes:

section ""data" . Main.hey_closure" {
    Main.hey_closure:
        const stg_MK_STRING_info;
        const hey1_r1Gg_bytes;
        const 0;
        const 0;
}

This is much smaller: the closure is just four words of static data, and no per-literal entry code or info table is generated.
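The idea can be modelled in plain Haskell as follows. This is only a rough sketch, not the RTS implementation: StringThunk and enterStringThunk are hypothetical names introduced for illustration, and the model leaves out everything the real entry code does around CAF registration and updating. What it shows is that each literal contributes only its bytes address, while the unpacking code exists once and is shared.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.CString (unpackCString#)
import GHC.Exts (Addr#)

-- Hypothetical model of a static string closure: the only per-literal
-- data is the address of the bytes, standing in for the payload.
data StringThunk = StringThunk Addr#

-- One shared piece of "entry code" for all such closures, playing the
-- role the description gives to stg_MK_STRING_info.
enterStringThunk :: StringThunk -> String
enterStringThunk (StringThunk bytes) = unpackCString# bytes

-- Two literals: two payloads, but no new code per literal.
hey, bye :: StringThunk
hey = StringThunk "hey"#
bye = StringThunk "bye"#

main :: IO ()
main = mapM_ (putStrLn . enterStringThunk) [hey, bye]
```

Adding a third literal to this model adds one more StringThunk value and nothing else, which mirrors the size argument made above.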

Fixes #16014 (closed).

Edited by Ben Gamari
