Hadrian build artifacts are sensitive to build location
Build artifacts contain absolute paths to other build artifacts. This makes the artifacts non-relocatable. In particular, cloud builds can suffer mostly cache misses, fail, or produce stale build artifacts. I noticed this when attempting a cloud build (after !1436 (closed)).
Recreation
You can grep the build directory for the absolute build dir path:
./boot && ./configure && ./hadrian/build.sh -j
grep -rn _build -e "$(pwd)"
Here is a breakdown of what I see:
- Makefiles
  - I think we insert some paths based on Makefile.in files
- .dependencies
  - Paths to ghcversion.h are absolute.
- _build/stage1/libraries/process/build/c/cbits/runProcess*
- _build/stage1/{utils, libraries}/*/build/{config.status, config.log}
- _build/stage1/{utils, libraries}/*/setup-config
- _build/stage1/libffi/build/.../
  - libffi.la
  - Makefile
  - Makefile.in
  - libffi.pc
  - libffi.lai
- _build/stage1/rts/build/c/*o
- _build/stage1/lib/x86_64-linux-ghc-8.9.0.20190717/libHSrts-1.0_*-ghc8.9.0.20190717.so
- _build/stage1/lib/x86_64-linux-ghc-8.9.0.20190717/rts-1.0/libHSrts-1.0_*.a
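A breakdown like the one above can be produced semi-automatically. The following is only a rough sketch: the helper name summarize_abs_paths is made up for illustration and nothing in it is specific to Hadrian. It counts, per file name, how many files under a directory mention a given absolute path.

```shell
# Hypothetical helper: summarize which files under a build directory
# mention an absolute root path, grouped by file name.
# Usage: summarize_abs_paths <build-dir> <absolute-root>
summarize_abs_paths() {
    build_dir=$1
    root=$2
    # -r: recurse, -l: print matching file names only,
    # -a: also search binary files (.o, .so, .a),
    # -F: treat the root as a fixed string rather than a regex.
    grep -rlaF -e "$root" "$build_dir" 2>/dev/null |
        sed 's|.*/||' |   # keep only the file name, drop the directory
        sort |
        uniq -c |         # count files sharing the same name
        sort -rn |        # most frequent names first
        sed 's/^ *//'     # strip the padding that uniq -c adds
}
```

Run from the source root, it could be invoked as: summarize_abs_paths _build "$(pwd)"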
You can also try a cloud build from a different directory:
# Initial build to fill cache.
./boot && ./configure && ./hadrian/build.sh -j --share=../_cache
# Create a new worktree.
git worktree add ../ghc2
cd ../ghc2
git submodule update --init
# Do the same build again but from a different worktree directory.
# Assuming you have the changes from !1436, this will mostly cache miss (if you
# fix this issue it should only take 2-3 mins).
./boot && ./configure && ./hadrian/build.sh -j --share=../_cache
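To check whether a given text artifact differs between the two worktrees only by its embedded build location, one option is to replace each worktree's root with a placeholder before comparing. This is a minimal sketch, not something Hadrian does: the helpers normalize_root and same_modulo_root and the placeholder <ROOT> are invented for illustration, and the approach is only meant for text files of modest size (for example generated Makefiles), not for object files or libraries.

```shell
# Hypothetical helper: copy stdin to stdout, replacing every literal
# occurrence of the given root path with the placeholder <ROOT>.
normalize_root() {
    ROOT_TO_STRIP=$1 awk '
    BEGIN { root = ENVIRON["ROOT_TO_STRIP"] }
    {
        line = $0
        out = ""
        # index() does a plain substring search, so no regex escaping is needed.
        while (root != "" && (i = index(line, root)) > 0) {
            out = out substr(line, 1, i - 1) "<ROOT>"
            line = substr(line, i + length(root))
        }
        print out line
    }'
}

# Hypothetical helper: succeed if two files are identical once each
# worktree root has been replaced by the placeholder.
# Usage: same_modulo_root <file-a> <root-a> <file-b> <root-b>
same_modulo_root() {
    a=$(normalize_root "$2" < "$1")
    b=$(normalize_root "$4" < "$3")
    [ "$a" = "$b" ]
}
```

For a file at the same relative path f in both worktrees, same_modulo_root ../ghc/"$f" "$(cd ../ghc && pwd)" "$f" "$(pwd)" would succeed when the build location is the only difference.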
Building from a second worktree is part of a workflow for developing in different worktrees while taking advantage of cloud builds. Suppose you start from the same commit and begin work on 2 separate issues. In this case, consider freezing stage 1 to some base commit:
# Checkout base commit.
cd ~/ghc
git checkout 24fd2e189b
# Build 1 (25m08s): base commit in worktree ~/ghc.
# This caches build artifacts for the base commit.
./boot && ./configure && ./hadrian/build.sh -j --share=../_cache
# Make some changes (I do a revert for reproducibility).
git revert d7c6c4717c
# Build 2 (1m41s): modified worktree ~/ghc.
# Freeze stage 1 to avoid rebuilding all of stage 2.
./hadrian/build.sh -j --share=../_cache --freeze1
# Create a new work tree from the base commit.
git worktree add ~/ghc2 24fd2e189b
cd ~/ghc2
git submodule update --init
# Build 3 (xxxxx): base commit in worktree ~/ghc2.
# This will allow us to freeze stage 1 to the base commit in the next build.
# This makes good use of the cached build in build 1, so should be fast.
./boot && ./configure && ./hadrian/build.sh -j --share=../_cache
# Make some changes (I do a revert for reproducibility).
git revert 348cc8ebf1
# Build 4 (xxxxxx): modified worktree ~/ghc2.
# Freeze stage 1 to avoid rebuilding all of stage 2.
./hadrian/build.sh -j --share=../_cache --freeze1
Notes
To make cache hits insensitive to build location, we should generally replace absolute paths with relative paths. The source of the absolute path to the ghc source directory is the configure script. It generates hadrian/cfg/system.config, which contains a ghc-source-path = ... entry. This value is used in Hadrian as topDirectory.
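To illustrate the idea of replacing absolute paths with relative ones, here is a small sketch. It is not how Hadrian computes paths: the helpers config_value and relative_to_root are hypothetical names, and the only thing taken from the generated file is the "key = value" line shape mentioned above.

```shell
# Hypothetical helper: print the value of the first "key = value" line
# for the given key. The key is used as a regex, so it should not
# contain regex metacharacters.
# Usage: config_value <key> <file>
config_value() {
    sed -n "s/^$1[[:space:]]*=[[:space:]]*//p" "$2" | head -n 1
}

# Hypothetical helper: print a path relative to root when it lies under
# root, and print it unchanged otherwise.
# Usage: relative_to_root <path> <root>
relative_to_root() {
    path=$1
    root=${2%/}   # tolerate a trailing slash on the root
    case $path in
        "$root")   printf '.\n' ;;
        "$root"/*) printf '%s\n' "${path#"$root"/}" ;;
        *)         printf '%s\n' "$path" ;;
    esac
}
```

For example, root=$(config_value ghc-source-path hadrian/cfg/system.config) followed by relative_to_root "$root/_build/stage1" "$root" prints _build/stage1, whereas a path outside the source tree is passed through untouched.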
One of the issues is the config files generated by cabal (e.g. setup-config). Investigation is needed, but this may not be an issue if these config files are only ever viewed through an oracle (is this already true?) and if the oracle's return value excludes the absolute path.