GHC should generate specialized libffi at link time in wasm backend
Currently, the wasm backend uses https://gitlab.haskell.org/ghc/libffi-wasm. The libffi.a linked into each Haskell program is compiled from auto-generated C code, and contains function/data sections for every possible C function signature within a certain arity range (see the blog post linked from the readme for details). This is sufficient to kick-start initial wasm support, but comes with a huge overhead in code size for certain use cases:
- ffi_call adds ~100KB to linked wasm. It's used by the bytecode interpreter and likely reachable if code uses GHC API.
- ffi_alloc_prep_closure adds ~4.8MB to linked wasm. It's used by dynamic C exports.
Although post-link optimization by wasm-opt can mitigate the code size increase, it's still far from satisfactory, and I don't think static analysis can identify unused paths in libffi-wasm's generated code. We can do much better here.
Without going into too much detail about either libffi or wasm, my proposed solution is roughly:
- We still link with libffi.a, and there is no need to change the libffi adjustor logic. libffi.a still provides the same API, except it does not contain any generated code that covers each possible C function signature.
- libffi-wasm has a global state at runtime: a mapping from encoded C function signatures to their underlying handler functions.
- ffi_call or ffi_alloc_prep_closure will simply look up that mapping, and redirect the call to the handler function.
- At link time, we already know all possible C function signatures that we need to handle. So there will be some auto-generated C stub code that contains the handler logic and registers the handler into that mapping via ctor.
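To make the registry idea concrete, here is a minimal sketch in C of what the mapping, the ctor-based registration and the lookup could look like. Everything in it is hypothetical and for illustration only: the names (`register_sig_handler`, `lookup_sig_handler`, `call_with_sig`), the string encoding `"i(ii)"` for `int(int, int)`, and the fixed-size table are assumptions, not libffi-wasm's actual API or encoding. `call_with_sig` merely stands in for the lookup-and-redirect step that ffi_call would perform, and `__attribute__((constructor))` is the GNU extension that Clang also supports.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type, loosely mirroring the shape of ffi_call:
   target function, slot for the return value, pointers to the arguments. */
typedef void (*sig_handler)(void (*fn)(void), void *rvalue, void **avalue);

#define MAX_HANDLERS 64 /* arbitrary capacity chosen for this sketch */

struct sig_entry {
  const char *sig;     /* encoded signature; the encoding is made up */
  sig_handler handler; /* code that knows how to call this signature */
};

/* The global runtime state: signature -> handler. */
static struct sig_entry registry[MAX_HANDLERS];
static size_t registry_len;

void register_sig_handler(const char *sig, sig_handler handler) {
  assert(registry_len < MAX_HANDLERS);
  registry[registry_len].sig = sig;
  registry[registry_len].handler = handler;
  registry_len++;
}

sig_handler lookup_sig_handler(const char *sig) {
  for (size_t i = 0; i < registry_len; i++)
    if (strcmp(registry[i].sig, sig) == 0)
      return registry[i].handler;
  return NULL;
}

/* Stand-in for the dispatch step: look up the handler and redirect to it.
   Returns 0 on success, -1 if no handler was registered for the signature. */
int call_with_sig(const char *sig, void (*fn)(void), void *rvalue,
                  void **avalue) {
  sig_handler h = lookup_sig_handler(sig);
  if (h == NULL)
    return -1;
  h(fn, rvalue, avalue);
  return 0;
}

/* What an auto-generated stub for int(int, int) might contain. */
static void handle_i_ii(void (*fn)(void), void *rvalue, void **avalue) {
  int (*f)(int, int) = (int (*)(int, int))fn; /* cast back to the real type */
  *(int *)rvalue = f(*(int *)avalue[0], *(int *)avalue[1]);
}

/* The ctor runs before main and registers the handler above. */
__attribute__((constructor)) static void register_i_ii(void) {
  register_sig_handler("i(ii)", handle_i_ii);
}

/* Example target function with the registered signature. */
int add_ints(int a, int b) { return a + b; }
```

With this in place, `call_with_sig("i(ii)", (void (*)(void))add_ints, &result, args)` reaches `add_ints` through the registered handler, while a signature that no stub registered fails the lookup instead of relying on pre-generated code. Only the signatures a program actually needs end up in the linked binary.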
The only remaining issue is how to collect, at GHC link time, the C function signatures that libffi-wasm needs to support. This sounds like something that can be handled in interface files, but maybe we don't even need that: whenever we desugar a dynamic C export, we add the handler logic we need to its stub code, and that's it.