GHC should expose a mode for object merging

Problem

Function-section-splitting (e.g. GHC's -split-sections flag) allows the linker to reduce the size of statically-linked executables: each function is placed in its own section, so unused sections can be dropped in the final link.

However, this carries with it a cost: loading such an object file (e.g. using GHC's RTS linker) requires a great deal of work as each section must be separately mapped. This can result in significant object loading times in GHCi.

To reduce this cost, @simonmar introduced the notion of a "GHCi library": an object produced from the constituent module objects of a library by merging all text sections into a single section. This significantly lowers the cost of loading.

However, currently the logic for producing these objects is repeated throughout the ecosystem:

  • the Hadrian and make build systems both contain some logic
  • GHC has GHC.Driver.Pipeline.Execute.joinObjectFiles
  • Cabal has Distribution.Simple.Program.Ld.combineObjectFiles

This repetition seems unnecessary and a potential source of bugs. In the case of Cabal the logic is actually wrong, resulting in #15524 and #20682.

Solution

I suggest that GHC grow a new mode for the production of GHCi objects. The mode, e.g. ghc --join-objects, would be invoked with a list of objects and a --output flag, and would invoke ld with the appropriate flags (e.g. -r and possibly a linker script). Hadrian, make, and Cabal could all use this mode, eliminating the repetition of object-joining logic.
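As a rough illustration of what such a mode would do internally, here is a minimal sketch in Haskell. It assumes only a relocatable link via ld's -r and -o flags; the names joinArgs and joinObjects are hypothetical and are not the existing GHC or Cabal functions. It also omits the linker script that may be needed to actually merge the text sections.

```haskell
import System.Process (callProcess)

-- | Build the argument list for a relocatable ("partial") link.
-- Hypothetical helper: only @-r@ and @-o@ are passed here; a real
-- implementation would likely add a linker script and
-- platform-specific flags.
joinArgs :: FilePath -> [FilePath] -> [String]
joinArgs output objects = ["-r", "-o", output] ++ objects

-- | Run the given linker to join @objects@ into the single object
-- file @output@. The linker path is a parameter because which @ld@
-- to use depends on the toolchain (an assumption of this sketch).
joinObjects :: FilePath -> FilePath -> [FilePath] -> IO ()
joinObjects ld output objects = callProcess ld (joinArgs output objects)
```

For instance, joinObjects "ld" "HSfoo.o" ["A.o", "B.o"] would run ld -r -o HSfoo.o A.o B.o, where the file names are purely illustrative. Keeping the argument construction separate from the process call is what would let Hadrian, make, and Cabal share one definition of the flags.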

Edited by Ben Gamari