Extensible Interfaces Format
Use cases for extensible interfaces can be roughly categorised by when they occur relative to the core pipeline, since the current .hi interface files are written at the end of the core pipeline. This gives five categories against which to compare the current implementation and possible alternatives:
- Pre-compilation: source analysis tools, build tools
- Pre-core-phase: GHC type-checking (#17843), etc, and plugins
- Core-phase: GHC core pipeline (usable by e.g. Plutus) and plugins
- Post-core-phase: GHC STG (usable by e.g. GHCJS), etc, and plugins
- Post-compilation: IDE analysis, alternate backends, build tools
The current implementation of .hi interface files follows the format:
- Header
- ModIface - a summary of the module, which is the result of the core pipeline
- Extensible fields header
- Extensible fields
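As a rough illustration of this layout, the four parts can be modelled as a plain Haskell record. This is only a sketch: `InterfaceFile`, `mkInterface`, `addField` and `fieldsHeader` are hypothetical names, not GHC's real types or API, and treating the extensible fields header as an index of field names is an assumption made for the example.

```haskell
module Main where

import qualified Data.ByteString.Char8 as BS

-- All names below are hypothetical stand-ins, not GHC's real types.
type FieldName = String

data InterfaceFile = InterfaceFile
  { ifHeader   :: BS.ByteString                 -- file header
  , ifModIface :: BS.ByteString                 -- serialised module summary
  , ifFields   :: [(FieldName, BS.ByteString)]  -- extensible fields, in file order
  } deriving (Eq, Show)

-- Assumption: the "extensible fields header" acts as an index of field names.
fieldsHeader :: InterfaceFile -> [FieldName]
fieldsHeader = map fst . ifFields

-- An interface value can only be built once the ModIface payload exists.
mkInterface :: BS.ByteString -> BS.ByteString -> InterfaceFile
mkInterface hdr modIface = InterfaceFile hdr modIface []

-- Adding a field drops any earlier field of the same name, then appends.
addField :: FieldName -> BS.ByteString -> InterfaceFile -> InterfaceFile
addField name payload iface =
  iface { ifFields = filter ((/= name) . fst) (ifFields iface) ++ [(name, payload)] }

main :: IO ()
main = do
  let iface = addField "parsed-ast" (BS.pack "ast-bytes")
            $ mkInterface (BS.pack "header") (BS.pack "modiface")
  print (fieldsHeader iface)
```

In this model `mkInterface` needs the ModIface payload as an argument, so no field can be attached before the core pipeline has produced one.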
The current implementation, which appends extensible data to the standard .hi files, supports each of these categories to a varying degree:
1 (Pre-compilation): No support - an interface file can't exist until the ModIface exists at the end of the core pipeline, so there's nowhere for data from this category to be written to.
2 (Pre-core-phase): Low support - data from this category can be carried through the compiler execution, but if the use case requires loading extensible data from other modules, or pausing before the core pipeline is complete, there is again no interface file to store the data in. However, use cases that consume early-phase data later on, such as writing a parsed AST for use after compilation, are possible in the current system.
3 (Core-phase): Full support - any data output during the core phase by either GHC or plugins can be written immediately, together with the standard ModIface that's written at the end of the core stage.
4, 5 (Post-core-phase, Post-compilation): Decent support - data from these categories requires the interface file to be parsed and rewritten, but this is otherwise unproblematic.
If we instead use an open format, where the ModIface is treated as a regular extensible field, and therefore don't rely on the existence of that ModIface to write an interface file, we're able to fully support all of these categories. Options for this format include real directories (i.e. the .hi file is now a directory), or an archive such as zip or a custom format. Of these, real directories would require Cabal changes, but otherwise offer a few advantages over single-file archives:
- The filesystem automatically tracks file update times
- Third-party tools only need file IO and the ability to serialise/deserialise the fields they are interested in
- Appending fields is simply writing a new file, without having to parse and rewrite the archive
- By adding support for .hi directories to Cabal and other build tools once, we retain the advantages of extensible interfaces, because the build tools only need to know about the container itself, not the fields inside it.
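A minimal sketch of the directory option, assuming each field is stored as one file named after the field inside the .hi directory. The helpers `writeField`, `readField` and `listFields`, the field names and the one-file-per-field layout are all hypothetical choices for illustration, not an existing GHC or Cabal interface. Only the standard `directory`, `filepath` and `bytestring` packages are used.

```haskell
module Main where

import qualified Data.ByteString.Char8 as BS
import Data.List (sort)
import System.Directory
  ( createDirectoryIfMissing, doesFileExist, getTemporaryDirectory
  , listDirectory, removeDirectoryRecursive )
import System.FilePath ((</>))

-- Hypothetical: a field is identified by its file name inside the directory.
type FieldName = String

-- Appending a field is just writing a new file; nothing else is parsed or rewritten.
writeField :: FilePath -> FieldName -> BS.ByteString -> IO ()
writeField hiDir name payload = do
  createDirectoryIfMissing True hiDir
  BS.writeFile (hiDir </> name) payload

-- A tool reads only the field it cares about; Nothing if the field is absent.
readField :: FilePath -> FieldName -> IO (Maybe BS.ByteString)
readField hiDir name = do
  let path = hiDir </> name
  exists <- doesFileExist path
  if exists then Just <$> BS.readFile path else pure Nothing

-- The set of fields is the directory listing, sorted for stable output.
listFields :: FilePath -> IO [FieldName]
listFields hiDir = sort <$> listDirectory hiDir

main :: IO ()
main = do
  tmp <- getTemporaryDirectory
  let root  = tmp </> "extensible-hi-demo"
      hiDir = root </> "Example.hi"
  -- A pre-compilation tool can write before any ModIface exists.
  writeField hiDir "source-analysis" (BS.pack "lint results")
  -- Later, the ModIface is written as just another field.
  writeField hiDir "modiface" (BS.pack "module summary")
  listFields hiDir >>= print
  readField hiDir "source-analysis" >>= print
  removeDirectoryRecursive root
```

Because `writeField` never reads the existing contents, a pre-compilation tool and the compiler can each add their fields without knowing about the other's, which is the property the single-file format lacks.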