GHC should generate specialized libffi at link time in wasm backend
Currently, the wasm backend uses https://gitlab.haskell.org/ghc/libffi-wasm. The libffi.a linked into each Haskell program is compiled from auto-generated C code, and contains function/data sections for every possible C function signature within a certain arity range (see the blog post linked from the readme for details). This is sufficient to kick-start initial wasm support, but comes with a huge code size overhead for certain use cases:
- ffi_call adds ~100KB to linked wasm. It's used by the bytecode interpreter and likely reachable if code uses GHC API.
- ffi_alloc_prep_closure adds ~4.8MB to linked wasm. It's used by dynamic C exports.
Although post-link optimization by wasm-opt can mitigate the code size increase, it's still far from satisfactory, and I don't think static analysis can identify unused paths in libffi-wasm's generated code. We can do much better here.
Without going into too much detail about either libffi or wasm, my proposed solution is roughly:
- We still link with libffi.a, and there is no need to change the libffi adjustor logic. libffi.a still provides the same API, except it does not contain any generated code that covers each possible C function signature.
- libffi-wasm has a global state at runtime: a mapping from encoded C function signatures to their underlying handler functions.
- ffi_call or ffi_alloc_prep_closure will simply look up that mapping, and redirect the call to the handler function.
- At link time, we already know all possible C function signatures that we need to handle. So there will be some auto-generated C stub code that contains the handler logic and registers the handler into that mapping via a ctor.
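To make the shape of this concrete, here is a minimal sketch of the registry idea in plain C. Everything in it is hypothetical and only for illustration: the names (register_sig_handler, lookup_sig_handler, sketch_call), the handler type, the fixed-size table, and the string encoding "iii" for a signature taking two ints and returning an int are all assumptions, not libffi-wasm's actual API or encoding. The constructor attribute is the GCC/Clang extension, standing in for the "via ctor" registration mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type: performs a call for one fixed C signature,
   given the target function, a return slot, and an array of argument pointers. */
typedef void (*sig_handler)(void (*fn)(void), void *ret, void **args);

#define MAX_HANDLERS 64

/* Global runtime state: encoded signature -> handler.
   Zero-initialized static storage, so it is usable from constructors. */
static struct {
  const char *sig;
  sig_handler handler;
} handler_table[MAX_HANDLERS];
static size_t handler_count = 0;

/* Register a handler; registering the same signature twice is a no-op. */
void register_sig_handler(const char *sig, sig_handler h) {
  for (size_t i = 0; i < handler_count; i++)
    if (strcmp(handler_table[i].sig, sig) == 0)
      return;
  assert(handler_count < MAX_HANDLERS);
  handler_table[handler_count].sig = sig;
  handler_table[handler_count].handler = h;
  handler_count++;
}

/* Return the handler for an encoded signature, or NULL if none is registered. */
sig_handler lookup_sig_handler(const char *sig) {
  for (size_t i = 0; i < handler_count; i++)
    if (strcmp(handler_table[i].sig, sig) == 0)
      return handler_table[i].handler;
  return NULL;
}

/* Stand-in for the lookup-and-redirect step: returns 0 on success,
   -1 if the signature has no registered handler. */
int sketch_call(const char *sig, void (*fn)(void), void *ret, void **args) {
  sig_handler h = lookup_sig_handler(sig);
  if (h == NULL)
    return -1;
  h(fn, ret, args);
  return 0;
}

/* What an auto-generated stub for "int (int, int)" might look like:
   cast the target to its real type, unpack the arguments, store the result. */
static void handle_iii(void (*fn)(void), void *ret, void **args) {
  int (*typed)(int, int) = (int (*)(int, int))fn;
  *(int *)ret = typed(*(int *)args[0], *(int *)args[1]);
}

/* Runs before main on GCC/Clang and registers the handler above. */
__attribute__((constructor)) static void register_iii(void) {
  register_sig_handler("iii", handle_iii);
}

/* Example target function with the "iii" signature. */
int sample_add(int a, int b) { return a + b; }
```

With this in place, calling sketch_call with "iii", a pointer to sample_add, and pointers to two int arguments dispatches through handle_iii, while an unregistered signature fails the lookup instead of pulling in code for every possible signature. A real implementation would presumably key the table on something more compact than a string and size it from what the linker actually sees.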
The only remaining issue is how to collect the C function signatures that libffi-wasm needs to support at GHC link time. This sounds like something that can be handled in interface files, but maybe we don't even need that. Whenever we desugar a dynamic C export, we add the handler logic we need to its stub code, and that's it.