rts/linker: Improve efficiency of proddable blocks structure
Previously the RTS linker used a naive linear linked-list implementation of an interval set in its "proddable blocks sanity check resulting in extremely poor asymptotic performance. This is particularly problematic when loading objects built with split sections. Here I rewrite this structure to instead use a sorted dense array with binary-search lookup.
In addition, I found that redundant calls to LoadLibraryEx
ended up contributing significantly to the overall runtime. To avoid this I introduce a HashTable
serving to cache to results of addDLL_PEi386
calls.
Finally, I fix #26052 (closed), a subtle mis-use of break
resulting in further redundant calls to LoadLibraryEx
.
In sum, these changes took the runtime of a trivial GHCi invocation (ghc --interactive -e 42
) on Windows from over 2.5 seconds to under 250 milliseconds.
Fixes #26009 (closed).