Draft: Avoid linear search when linking bytecode against native symbols
This MR is intended to address #23415 (closed). In some cases, this patch can speed up bytecode linking on macOS quite dramatically: on one project I’ve tested it on, it improves -fprefer-byte-code
build times by over 20%.
The strategy taken in this patch is conceptually quite simple. As noted in #23415 (comment 517665), the RTS linker currently iterates through every loaded shared object when searching for a native symbol, and in projects with lots of dependencies, this can be very slow. This patch avoids the need for linear search in most cases by doing some additional bookkeeping:
-
We extend the RTS API so that loading a shared library returns a handle to the library. This handle can be passed to a new RTS function,
lookupSymbolInDLL
, which looks up a symbol in a given library (rather than searching all loaded libraries). -
We maintain a mapping from loaded unit ids to the shared library handles containing their symbols.
-
When the bytecode linker looks up a Haskell symbol, it consults the mapping to see if we have a known shared library for the unit it comes from. If we do, we use the
lookupSymbolInDLL
fast path.
This approach seems to work rather well. However, the patch is currently draft quality, and many things need to be changed before it can be considered mergeable:
-
The patch currently changes the signature of addDLL
in a backwards-incompatible way. It is currently unclear to me whether this is considered a public API and what the compatibility constraints are.- This is no longer true, the signature was changed back in a follow up commit.
-
lookupSymbolInDLL
is not yet implemented on Windows. -
On the Haskell side of things, lookupSymbol
maintains a symbol cache in-fexternal-interpreter
mode, solookupSymbolInDLL
should likely be modified to use it as well. -
Currently, the patch falls back to linear search if lookupSymbolInDLL
fails, but this seems unlikely to be helpful, and it may be better to just fail fast. -
Several things deserve Note
s. -
There are no tests, and I am not currently sure how to write good test cases for this.