# Cross Compiling GHC
|
|
|
|
|
|
|
|
|
As of this moment (GHC 6.12) GHC does not support cross-compilation. There are reasons that we would like it to:
|
|
|
|
|
|
- [TakeoffGW](http://takeoffgw.sourceforge.net/) is a distribution of Unix tools for Windows, built by cross-compiling on a Linux machine. They would like to be able to build and distribute GHC this way. It might be useful for us to be able to cross-compile a Windows GHC from Linux too.
|
|
|
|
|
|
- We could build a 64-bit GHC on OS X, by cross-compiling using the 32-bit version.
|
|
|
|
|
|
- We could port to Win64 ([\#1884](https://gitlab.haskell.org//ghc/ghc/issues/1884)) by cross-compiling using a 32-bit Windows GHC.
|
|
|
|
|
|
- Other porting tasks might be easier, given a suitable cross-compilation toolchain.
|
|
|
As of this moment (GHC 7.0) GHC does not support cross-compilation. This page gathers information about it and plans for supporting it.
|
|
|
|
|
|
## General Problem
|
|
|
|
|
|
By way of example, let's suppose we have an x86/Linux platform and we want to cross-compile to PPC64/OSX. Then our build is going to look like this:
|
|
|
|
|
|
<table><tr><th>**Compiler**</th><th>**Runs on**</th><th>**Generates code for**</th></tr>
<tr><th> Stage 0 </th><th> x86/Linux </th><th> x86/Linux </th></tr>
<tr><th> Stage 1 </th><th> x86/Linux </th><th> PPC64/OSX </th></tr>
<tr><th> Stage 2 </th><th> PPC64/OSX </th><th> PPC64/OSX </th></tr></table>
|
|
|
Where stage 0 is the bootstrap compiler (the one you specify using `--with-ghc` when configuring the build), and stages 1 and 2 are the compilers being built.

The most general case is a developer on build platform (B), wishing to build a GHC that runs on host platform (H) and produces code that runs on target platform (T). But we need not handle such a general case. There are two common cases:

- Building a **cross-compiler**: create a compiler that runs on one platform, but targets another. Examples are building a GHC that:
  - runs on Mac OS X, but targets iOS
  - runs on x86_64 Linux, but targets i386
  - runs on some existing GHC-supported platform, but targets a smaller embedded platform

- **Cross-building** a normal compiler: build, on one platform, a compiler that runs on and targets another. Examples:
  - [TakeoffGW](http://takeoffgw.sourceforge.net/) is a distribution of Unix tools for Windows, built by cross-compiling on a Linux machine. They would like to be able to build and distribute GHC this way. It might be useful for us to be able to cross-compile a Windows GHC from Linux too.
  - We could build a 64-bit GHC on OS X, by cross-compiling using the 32-bit version.
  - We could port to Win64 ([\#1884](https://gitlab.haskell.org//ghc/ghc/issues/1884)) by cross-compiling using a 32-bit Windows GHC.
  - Other porting tasks might be easier, given a suitable cross-compilation toolchain.
|
|
|
|
|
|
Now some general nomenclature:
|
|
|
|
|
|
- **Build platform**: the platform on which the software is being built
|
|
|
- **Host platform**: the platform on which the software will run
|
|
|
- **Target platform**: for a compiler, the platform on which the generated code will run
|
|
|
From the developer's perspective, these can be categorized as:
|
|
|
|
|
|
- **Cross-compiler**: B = H, and H ≠ T.
|
|
|
- **Cross-building**: B ≠ H, and H = T.
|
|
|
|
|
|
The build, host and target platforms correspond to CPP symbols that are defined when compiling both Haskell and C code:
|
|
|
|
|
|
- *xxx*`_BUILD_ARCH`, *xxx*`_BUILD_OS`: the build platform
|
|
|
- *xxx*`_HOST_ARCH`, *xxx*`_HOST_OS`: the host platform
|
|
|
- *xxx*`_TARGET_ARCH`, *xxx*`_TARGET_OS`: the target platform
|
|
|
It seems reasonable to limit ourselves to the two cases above: building a cross-compiler and cross-building.
|
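As an illustration of the two cases (B = H with H ≠ T, and B ≠ H with H = T), here is a minimal Python sketch that classifies a build/host/target triple. The `Platforms` type, the `classify` function, the platform strings and the extra `"native"` and `"unsupported"` labels are assumptions made for this example; they are not part of the GHC build system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platforms:
    """Build (B), host (H) and target (T) platforms of one configuration."""
    build: str
    host: str
    target: str


def classify(p: Platforms) -> str:
    """Name the case that a B/H/T triple falls into."""
    if p.build == p.host and p.host == p.target:
        return "native"          # ordinary build, nothing is cross (assumed label)
    if p.build == p.host:
        return "cross-compiler"  # B = H, and H != T
    if p.host == p.target:
        return "cross-building"  # B != H, and H = T
    return "unsupported"         # fully general case, not handled (assumed label)


# Hypothetical platform names, loosely based on the examples in the text.
print(classify(Platforms("x86_64-linux", "x86_64-linux", "i386-linux")))
print(classify(Platforms("x86-linux", "ppc64-osx", "ppc64-osx")))
```

The first call describes a compiler that runs where it is built but targets another platform; the second describes a compiler built on one platform that both runs on and targets another.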
|
|
|
|
|
## Meshing with GHC's 2-Stage Build
|
|
|
|
|
|
The important thing to realise about the 2-stage bootstrap is that each stage has a different notion of build/host/target: these CPP symbols will map to different things when compiling stage 1 and stage 2. Furthermore the RTS and libraries also have a notion of build and host (but not target: they don't generate code).
|
|
|
|
|
|
The GHC build, in general, is a two-stage process, involving three GHC compilers and two sets of build libraries:

- **Stage 0**: the GHC that is already on the build system (the one you specify using `--with-ghc` when configuring the build). It comes with a set of built libs, and could be older than the version of GHC being built.
- **libs boot**: libs that the version of GHC being built relies on, but that are either absent or too old in the older versions of GHC that might be used as Stage 0. These libs are built with the Stage 0 GHC, and linked into the Stage 1 GHC being built.
- **Stage 1**: the first GHC built, compiled by the Stage 0 GHC, and linked with both the libs from that GHC installation and the boot libs.
- **libs install**: libs that will accompany the final compiler, built by the Stage 1 GHC.
- **Stage 2**: the final GHC built, compiled by the Stage 1 GHC, and linked with only the install libs.

The overall build has a build/host/target, supplied on the `configure` command line:

> `$ ./configure --build=`*build*` --host=`*host*` --target=`*target*
|
|
|
|
|
|
**The host and target specified on the configure command line refer to the stage 2 compiler.** Specifically, here is how we map the platforms from the configure command line onto the platforms used by the different stages, and the RTS and libraries:

<table><tr><th></th><th>**Overall build**</th><th>**Stage 1**</th><th>**Stage 2**</th><th>**libs-host**</th><th>**libs-target**</th></tr>
<tr><th>**Build platform**</th><th>*build*</th><th>*build*</th><th>*build*</th><th>*build*</th><th>*build*</th></tr>
<tr><th>**Host platform**</th><th>*host*</th><th>*build*</th><th>*host*</th><th>*host*</th><th>*target*</th></tr>
<tr><th>**Target platform**</th><th>*target*</th><th>*host*</th><th>*target*</th><th> --- </th><th> --- </th></tr></table>

Where **libs-host** refers to the libraries and RTS that we are building to link with the stage 2 compiler, and **libs-target** refers to the libraries and RTS that will be linked with binaries built by the stage 2 compiler to run on the target platform.

In summary, in terms of the stages and libraries of the build:

<table><tr><th></th><th>**Stage 0**</th><th>**libs boot**</th><th>**Stage 1**</th><th>**libs install**</th><th>**Stage 2**</th></tr>
<tr><th>**built on**</th><th> --- </th><th>*build*</th><th>*build*</th><th>*build*</th><th>*build*</th></tr>
<tr><th>**runs on**</th><th>*build*</th><th>*build*</th><th>*build*</th><th>*host*</th><th>*host*</th></tr>
<tr><th>**targets**</th><th>*build*</th><th> --- </th><th>*host*</th><th> --- </th><th>*target*</th></tr></table>
|
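The per-stage summary can also be written down as a small lookup. This is a sketch only: it assumes the summary rows read "built on", "runs on" and "targets", and the function name `stage_platforms` and the use of `None` for the "---" entries are choices made for this example.

```python
from typing import Dict, Optional, Tuple

# (built on, runs on, targets); None stands for a '---' entry.
Row = Tuple[Optional[str], Optional[str], Optional[str]]


def stage_platforms(build: str, host: str, target: str) -> Dict[str, Row]:
    """Map each product of the build to the platforms it involves."""
    return {
        "Stage 0":      (None,  build, build),   # pre-existing bootstrap compiler
        "libs boot":    (build, build, None),    # libraries do not generate code
        "Stage 1":      (build, build, host),
        "libs install": (build, host,  None),
        "Stage 2":      (build, host,  target),
    }


# The x86/Linux to PPC64/OSX example from the General Problem section.
for name, row in stage_platforms("x86/Linux", "PPC64/OSX", "PPC64/OSX").items():
    print(name, row)
```

With *host* equal to *target*, as in the example, the Stage 1 row comes out as "runs on x86/Linux, targets PPC64/OSX", matching the first table on this page.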
|
|
|
|
|
|
|
|
In the special case where we are using cross compilation to bootstrap a new platform, as in the above example, we have *host* == *target*:

<table><tr><th></th><th>**Overall build**</th><th>**Stage 1**</th><th>**Stage 2**</th><th>**libs-host**</th></tr>
<tr><th>**Build platform**</th><th>*build*</th><th>*build*</th><th>*build*</th><th>*build*</th></tr>
<tr><th>**Host platform**</th><th>*target*</th><th>*build*</th><th>*target*</th><th>*target*</th></tr>
<tr><th>**Target platform**</th><th>*target*</th><th>*target*</th><th>*target*</th><th></th></tr></table>

Note that with *host* == *target*, **libs-host** == **libs-target**, so we only need to build the RTS and libraries once (fortunately, because the GHC build system only supports building them once).

Because of the way the compiler is configured (same configuration for Stage 1 and Stage 2), and the way the install libraries are built (built with Stage 1, but shipped with Stage 2), you can see that both Stage 1 and Stage 2 compilers must target the same architecture. Furthermore, the build only allows for probing for the properties (word size, library function availability, etc...) of one platform. This means that the build isn't as general as one might expect, and what we really have is:

<table><tr><th></th><th>**Stage 0**</th><th>**libs boot**</th><th>**Stage 1**</th><th>**libs install**</th><th>**Stage 2**</th></tr>
<tr><th>**built on**</th><th> --- </th><th>*build*</th><th>*build*</th><th>*build*</th><th>*build*</th></tr>
<tr><th>**runs on**</th><th>*build*</th><th>*build*</th><th>*build*</th><th>*target*</th><th>*target*</th></tr>
<tr><th>**targets**</th><th>*build*</th><th> --- </th><th>*target*</th><th> --- </th><th>*target*</th></tr></table>
|
|
|
But this works out just fine for the two use cases we've identified:
|
|
|
|
|
|
- **Cross-compiler**: a Stage 1 compiler & libs install
|
|
|
- **Cross-building**: a Stage 2 compiler & libs install
|
|
|
|
|
|
Suppose we wanted to build a cross-compiler to run on the current platform. Then we could configure with *build* == *host*, but *target* is different:
|
|
|
|
|
|
<table><tr><th></th><th>**Overall build**</th><th>**Stage 1**</th><th>**Stage 2**</th><th>**libs-host**</th><th>**libs-target**</th></tr>
<tr><th>**Build platform**</th><th>*build*</th><th>*build*</th><th>*build*</th><th>*build*</th><th>*build*</th></tr>
<tr><th>**Host platform**</th><th>*build*</th><th>*build*</th><th>*build*</th><th>*build*</th><th>*target*</th></tr>
<tr><th>**Target platform**</th><th>*target*</th><th>*build*</th><th>*target*</th><th></th><th></th></tr></table>
|
|
|
Note in this configuration that we need both **libs-host** and **libs-target**, so currently the GHC build system does not support building this kind of cross-compiler. Fortunately, most of the things you would want to do with this kind of cross-compiler are supported by the first kind; the only caveat is that you can't *install* a cross-compiler that way.

## Plan

Here is how it should work:

```wiki
$ ./configure --build=<here> --host=<there> --target=<there>
```

Note that we're cross-compiling from the *build* machine to the *host* machine. The *target* machine is the same as the *host*: the GHC that we're trying to create will generate binaries for *host*. That is:

- stage 1: runs on `build`, compiles for `host`
- stage 2: runs on `host`, compiles for `host`

No doubt we'll also need to specify some additional configuration parameters to tell the build system where to find our cross-compilation tools. Perhaps something like

```wiki
--with-host-cc=...
--with-host-as=...
--with-host-ld=...
--with-host-ar=...
--with-host-strip=...
```

The build plan becomes:

- **Cross-compiler**
  - Developer configures with B = H, and H ≠ T:

    > `$ ./configure --target=`*other-platform*

  - Build through Stage 1 and libs install
  - Package Stage 1 GHC and libs install as the desired cross-compiler

- **Cross-build**
  - Developer configures with B ≠ H, and H = T:

    > `$ ./configure --host=`*other-platform*` --target=`*other-platform*

  - Internally, set H to B, so that we have B = H, and H ≠ T as required
  - Build through libs install and Stage 2
  - Package Stage 2 GHC and libs install as the desired cross-built compiler

Thus, as far as the mechanics of the build are concerned, the two use cases are actually handled the same once the B/H/T variables are normalized. The only real difference is when to stop (before or after Stage 2), and which compiler gets bundled as the installed compiler (Stage 1 or Stage 2).

From here on out, this page assumes that the B/H/T variables have been normalized. That is, B = H, and H ≠ T.
|
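The normalization step of the build plan can be sketched as follows. This is a hedged illustration in Python, not the build system's actual logic: the function name `plan`, the dictionary keys and the platform strings are all hypothetical, and configurations outside the two use cases are simply rejected.

```python
def plan(build: str, host: str, target: str) -> dict:
    """Normalize B/H/T and pick where to stop and which compiler to package."""
    if build == host and host != target:
        # Cross-compiler: already in normalized form (B = H, H != T).
        return {"build": build, "host": host, "target": target,
                "stop_after": "libs install", "package": "Stage 1"}
    if build != host and host == target:
        # Cross-build: internally set H to B, so that B = H and H != T.
        return {"build": build, "host": build, "target": target,
                "stop_after": "Stage 2", "package": "Stage 2"}
    # Anything else is outside the two use cases this page considers.
    raise ValueError("not one of the two supported cross configurations")


print(plan("x86_64-linux", "x86_64-linux", "arm-linux"))
print(plan("x86_64-linux", "arm-linux", "arm-linux"))
```

Both calls end up with the same normalized platforms; they differ only in the `stop_after` and `package` entries, which is the point made in the text.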
|
|
|
|
|
## Things that probably need fixing

- The configure script doesn't let you specify different `build`, `host`, and `target` right now.
- The build system has no distinction between the gcc used to compile from build-\>build and build-\>host.
- We can't build anything with stage2 when cross-compiling, e.g. Haddock and DPH must be disabled.

---

*This is a collection of information from Mark Lentczner's work on cross-compilation. Once things are settled, this information should be merged with the above.*

## General Problem

The most general case is a user on build platform B, wishing to build a GHC that runs on host platform H and produces code that runs on target platform T. But we need not handle such a general case: it seems reasonable to limit ourselves to the case where (from the user's perspective) B = H, and H ≠ T. That is, the user thinks "I want to build a GHC on my current system, that runs on my current system, and which produces code for some other, target system".

## High Level Approach

In this scenario, we have two choices:

1) Build a stage1 compiler that runs on and compiles for B/H, then use that to build a stage2 compiler that runs on B/H and compiles for T. In this case, the libraries and rts built by the stage1 compiler will be incompatible with the stage2 compiler, and so the libraries and rts for distribution would have to be built again.

--or--

2) Build a stage1 compiler that runs on B/H and compiles for T. Then the rts and libraries built by the stage1 compiler are compatible with it, and together (stage1 compiler & libs/rts built by it) form the cross-compiler.

Approach 2 is more in line with the rest of the build system. Further, if the user's stage0 GHC is the same as the tree they are building in, it is arguable that the extra compiler build of option 1 is redundant (since the stage1 build in that option should be identical to the stage0 they started with!).

## Tool-chains

These kinds of builds need two tool-chains: one that runs on B/H and compiles for B/H, the "host-tool-chain" or HT, and one that runs on B/H but compiles for T, the "cross-tool-chain" or XT. The tool-chains include many programs: gcc, ld, nm, ar, as, ranlib, strip, and even ghc! The stage0 GHC is effectively part of the HT, and the stage1 we are building is going to become part of the XT. A tool-chain also includes a raft of information about the tools: does ar need ranlib, which extra ld flags need to be passed, etc.

Even in a non-cross build, the current build system takes some care to achieve a limited form of tool-chain separation. In particular, when using the stage0 GHC, the build should be using the tool-chain that that compiler is designed to work with, which may not be the tool-chain specified on the ./configure command line. This is only partially fulfilled. For example, while the build uses the stage0 GHC to compile C sources, so that the stage0-compatible gcc will be used, the build also uses various other tools ferreted out by ./configure (ar and ranlib for example).

## Autoconf

Autoconf offers only limited support for cross compiling. While it professes to know about three platforms (build, host, and target), it knows only about one tool-chain. It uses that tool-chain to determine two classes of information: information about how to use the tool-chain itself, and information about the target of that tool-chain. Hence, in the cross-compilation case, it makes sense for ./configure to be told about XT.

Autoconf's concept and variable `$cross_compiling` only gets set if B ≠ H. This is correct from the standpoint of compiling a simple program (for which T is irrelevant). In our normalized version of B/H/T, B = H, so the logic of autoconf needs to be amended *(patch pending)*.

This leaves us with the issue of how to tell it about the parts of HT it can't infer from the stage0 compiler. We need a new set of variables describing how to compile, link and run things on the host, which need to be different when cross compiling. There needs to be some way to pass those on the configure line. Perhaps something like:

```wiki
--with-host-cc=...
--with-host-as=...
--with-host-ld=...
--with-host-ar=...
--with-host-strip=...
```
|
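To make the HT/XT split concrete, here is a minimal Python sketch of two tool-chain records and a rule for choosing between them. Everything in it is an assumption for illustration: the `ToolChain` fields, the tool names, the `COMPILED_FOR_TARGET` set and the idea that the proposed `--with-host-*` values would populate HT are not taken from the GHC build system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolChain:
    """One tool-chain: the programs, plus facts probed about them."""
    name: str
    ghc: str
    cc: str
    ld: str
    ar: str
    ar_needs_ranlib: bool


# Illustrative values only. HT would be filled from the stage0 GHC and the
# proposed --with-host-* options; XT from the usual ./configure probes.
HT = ToolChain("HT", ghc="ghc-stage0", cc="gcc", ld="ld", ar="ar",
               ar_needs_ranlib=False)
XT = ToolChain("XT", ghc="ghc-stage1", cc="cross-gcc", ld="cross-ld",
               ar="cross-ar", ar_needs_ranlib=True)

# Components whose code must run on T, with B/H/T normalized (B = H, H != T).
COMPILED_FOR_TARGET = {"libs install", "Stage 2"}


def toolchain_for(component: str) -> ToolChain:
    """Pick the tool-chain that has to compile the given component."""
    return XT if component in COMPILED_FOR_TARGET else HT


for component in ["libs boot", "Stage 1", "libs install", "Stage 2"]:
    print(component, "->", toolchain_for(component).name)
```

Keeping the probed facts (such as `ar_needs_ranlib`) inside each record reflects the point that such probes belong to a tool-chain rather than to the build as a whole.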
|
|
|
|
|
|
|
|
A tricky aspect is that some properties of the tool-chain are probed by Autoconf ("is cc gcc?", "does ar need ranlib?"). These probes technically should be performed for each tool-chain. *(partial patch pending)*

./configure, cabal configure, and hsc2hs all want to run things built for T. If the XT contains an emulator, then this is possible. Two approaches need to be supported here:

1. Autoconf can now discern many values without running code, and configure.ac / aclocal.m4 scripts can be changed to avoid running code in many cases. (For example in libraries/base I rewrote things to use AC_COMPUTE_INT rather than AC_RUN_IFELSE to find the sizes of htypes.) *(patch pending)*

2. Plumb the need to call the emulator to run in the right places. An alternative is to use an alternate linker command that inserts the emulator into those build executables (but this is tricky as you don't want to use that link when building for the real target...)
|
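Both approaches can be sketched briefly in Python. The first function shows the general idea behind computing an integer without running target code: only yes/no questions of the form "is the value ≤ n?" are asked, each of which a cross compiler could answer with a compile-only check. This is written in the spirit of AC_COMPUTE_INT and is not a transcription of Autoconf's implementation; here the compile check is simulated by a plain Python predicate. The second function shows the emulator plumbing. All names, including the emulator name, are hypothetical.

```python
from typing import Callable, List, Optional


def compute_int(at_most: Callable[[int], bool]) -> int:
    """Find a non-negative constant v using only the test 'v <= n'."""
    lo, hi = 0, 1
    # Grow the upper bound until the value is known to fit below it.
    while not at_most(hi):
        lo = hi + 1
        hi = 2 * hi + 1
    # Bisect, keeping the invariant lo <= v <= hi.
    while lo < hi:
        mid = (lo + hi) // 2
        if at_most(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


def run_command(exe: str, emulator: Optional[str] = None) -> List[str]:
    """Build the argv used to run a program that was compiled for T."""
    return ([emulator] if emulator else []) + [exe]


# Stand-in for a compile-only check of the size of some type on T.
hidden_size = 8
print(compute_int(lambda n: hidden_size <= n))
print(run_command("./conftest", emulator="some-emulator"))
```

The number of checks grows only logarithmically with the value, which is what makes the compile-only approach practical for things like type sizes.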
|
|
|
|
|
## Make Files

The overall build sequencing needs to recognize the cross-compilation configuration, and adjust build targets and final packaging to match.

There are a few other places where the make system needs to get fixed up to use the correct tool-chain at the right time *(partial patches pending)*.

Besides the CPP symbols for the build, host and target platforms (*xxx*`_BUILD_ARCH`, *xxx*`_HOST_OS`, and so on), there are also similar Make variables. These need to be normalized into something more rational: at present their usage is somewhat sloppy, since in most builds all three platforms are the same.
|
|
|
|
|
|
---
|
|
|
|
|
|
## Status
|
|
|
|
|
|
### March 2011: Mark Lentczner

I actually have much of the above working. At this point I can build, link and run a stage1 cross-compiler. I have plumbed two tool-chains through the top-level ./configure and make, and have gotten through configuring several libraries with the in-place cabal. I have patches almost ready to go for the items above marked *pending*, and will submit them soon, if this whole approach agrees with everyone.

In general, the problems have all been in plumbing the concepts of XT vs. HT around the build system. I've been able to fudge it for most of the components that use autoconf, though there are places where my workaround is forced. I'm currently having trouble getting things right for "cabal configure" to work on the libs, as configured for the dist-install build phase. I'm worried that this might be hopeless without diving into cabal and teaching it about this kind of situation.

A wholly different approach might be to instead mirror the two-tree style that porting uses. This, however, will still run into similar issues, since the "target" tree still wouldn't really be running on the target.