Cross-compilers should be stage2 compilers
Currently, the cross-compilers we distribute are stage1 compilers. I think they should be stage2 compilers.
Background
- stage1 ghc uses a stage0 ABI and is linked with stage0's rts (like any other program built by stage0).
  It means that stage1 can't load code objects it has produced (different ABIs).
- stage1 may be unable to read the package databases containing the packages used to build itself: the package db format may have changed between stage0 and stage1.
  It means that even if we were to build some units with stage0, we can't assume that stage1 could use them because stage0's ghc-pkg may install them using a db format that stage1 doesn't understand.
stage1 is a limited GHC compiler but it isn't an issue as long as it is only used to build a stage2 compiler from scratch. ghc, ghc-pkg and their dependencies are assumed to be buildable by these limited stage1 compilers: they don't use compiler plugins, don't use template-haskell, don't rely on ghci or on introspection in any way.
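The ABI argument above can be made concrete with a toy model. This is only a sketch, not GHC code: the types `Stage` and `SourceVersion` and the functions `builtBy`, `executableAbi`, `producedAbi` and `canLoadOwnObjects` are hypothetical names introduced for illustration, and it assumes that stage0 is a released compiler whose own executable already matches the ABI it generates.

```haskell
-- Toy model (not GHC code): why stage1 cannot load the objects it produces.
data Stage = Stage0 | Stage1 | Stage2
  deriving (Eq, Show)

-- Version of the compiler sources, which determines the ABI of generated code.
data SourceVersion = Old | New
  deriving (Eq, Show)

-- stage0 is the bootstrap compiler; stage1 and stage2 share the new sources.
sourceVersion :: Stage -> SourceVersion
sourceVersion Stage0 = Old
sourceVersion Stage1 = New
sourceVersion Stage2 = New

-- Which compiler built each stage's executable.
-- Assumption: stage0 is a released compiler, built by a compiler with the same ABI.
builtBy :: Stage -> Stage
builtBy Stage0 = Stage0
builtBy Stage1 = Stage0
builtBy Stage2 = Stage1

-- ABI of the compiler executable itself (and of the rts it is linked with).
executableAbi :: Stage -> SourceVersion
executableAbi = sourceVersion . builtBy

-- ABI of the code objects the compiler generates.
producedAbi :: Stage -> SourceVersion
producedAbi = sourceVersion

-- A compiler can load its own output only if both ABIs agree.
canLoadOwnObjects :: Stage -> Bool
canLoadOwnObjects s = executableAbi s == producedAbi s
```

In this model `canLoadOwnObjects Stage1` is `False` while `canLoadOwnObjects Stage2` is `True`, which mirrors the limitation described above: only from stage2 onwards does the compiler executable share an ABI with the code it generates.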
Issues
- Stage1 can't support compiler plugins. To fix #14335 we need cross-compilers to be stage2 compilers.
- Build system complexity and build time: cross-compilers aim to be as fully-featured as regular compilers. As a consequence, depending on whether we build a cross-compiler or not, we want to build its companion tools or not (unlit, haddock, iserv, etc.). See #18253.
- Build system hacks: currently in Hadrian we have a hack to support iserv-proxy when stage0 is a kind of stage1: we detect when stage0's version == stage1's version (see bootCross in hadrian/src/Settings/Packages.hs). We could remove this hack.
- Interpreter protocol: ghc-heap can be used for dynamic heap inspection (e.g. in GHCi) and iserv's protocol can return Closure descriptions obtained via ghc-heap. But ghc-heap depends on the rts headers via .hsc files: which rts? Those protocol messages should be disabled in a stage1 compiler, as stage1 produces code objects for the newer rts while its iserv programs are linked against stage0's rts. We should probably split the protocol into two parts: interpreter (TH, etc.) and introspection (heap, interactive debug, etc.), as we will want to disable the introspection part in GHCJS and Asterius, which won't support ghc-heap.
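The suggested protocol split could look roughly like the following. This is a minimal sketch under stated assumptions, not the actual iserv/GHCi message type: every constructor and function name here (`InterpreterMsg`, `IntrospectionMsg`, `Capabilities`, `accepts`, and so on) is hypothetical and chosen only to illustrate the separation.

```haskell
-- Hypothetical sketch of a two-part protocol; not the real iserv message type.

-- Messages needed to run code on behalf of the compiler (TH, etc.).
data InterpreterMsg
  = LoadObject FilePath
  | RunSplice String
  deriving (Eq, Show)

-- Messages that require ghc-heap or similar introspection support.
data IntrospectionMsg
  = GetClosure Int
  | InspectHeap
  deriving (Eq, Show)

data Message
  = Interp InterpreterMsg
  | Introspect IntrospectionMsg
  deriving (Eq, Show)

-- What a given interpreter backend declares it can do.
newtype Capabilities = Capabilities { supportsIntrospection :: Bool }

-- Interpreter messages are always accepted; introspection messages only
-- when the backend supports them.
accepts :: Capabilities -> Message -> Bool
accepts _    (Interp _)     = True
accepts caps (Introspect _) = supportsIntrospection caps
```

With such a split, a backend without ghc-heap support (the GHCJS or Asterius case mentioned above) would advertise `Capabilities False` and still serve every `Interp` message, while `Introspect` messages are rejected up front instead of failing against a mismatched rts.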
Proposed rules
- stage1 should never be a cross-compiler (stage2 may be a cross-compiler, stage3 may be a cross-compiled non-cross compiler)
- stage1's only constraint should be that it can build stage2 ghc
- stage1 shouldn't depend on anything related to introspection: ghc-heap, rts, etc.
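The rules above could be checked mechanically. The following is a hedged sketch only: `StagePlan`, `Target`, `introspectionPackages` and `violations` are hypothetical names that do not exist in Hadrian, and the list of introspection packages simply repeats the two examples given in the rules.

```haskell
-- Hypothetical checker for the proposed rules; none of these names exist in Hadrian.
data Target = Native | Cross
  deriving (Eq, Show)

data StagePlan = StagePlan
  { stageNumber  :: Int       -- 1 for stage1, 2 for stage2, ...
  , target       :: Target    -- whether this stage is a cross-compiler
  , dependencies :: [String]  -- package names the stage's ghc depends on
  } deriving Show

-- Assumption: only the two examples named in the rules; "etc." is left open.
introspectionPackages :: [String]
introspectionPackages = ["ghc-heap", "rts"]

-- Report every proposed rule that a plan breaks. Only stage1 is constrained.
violations :: StagePlan -> [String]
violations plan
  | stageNumber plan /= 1 = []
  | otherwise =
      [ "stage1 must not be a cross-compiler" | target plan == Cross ]
      ++ [ "stage1 must not depend on " ++ d
         | d <- dependencies plan, d `elem` introspectionPackages ]
```

For example, `violations (StagePlan 1 Cross ["ghc-heap", "base"])` reports both the cross-compiler rule and the ghc-heap dependency, whereas any stage2 plan yields an empty list because the rules place no restriction on it.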
Advantages
- sanity: simpler build system.
- faster compilation times: stage0 only builds the minimum required to bootstrap (no haddock, no ghc-heap, etc.)
- fewer packages need to be buildable with GHC versions N-2, N-1 and N. E.g. ghc-heap and ghci won't have to be, as they won't be dependencies of stage1's ghc.