Changes

David Eichmann · e329f08b
--- a/Developing-Hadrian.md
+++ b/Developing-Hadrian.md
@@ -200,6 +200,13 @@ Now when we build `A.hs` we must make sure to need the CPP includes i.e. buildin

 So what is the problem here? The rule for `["A.o", "A.hi"]` is fine, it tracks the correct dependencies, but what about the `".dependencies"` rule? Running `ghc -M -include-cpp-deps A.hs ...` reads `A.hs` then traverses all the `#include`ed files, hence the dependencies of `.dependencies` are `A.hs`, `B.h`, and `C.h` (this indeed can't be reduced; any of those files can change the output of `.dependencies`). Well, this is awkward! The dependencies of `.dependencies` are exactly the dependencies that we are trying to discover in the first place! If we now include `D.h` from `C.h` and recompile, `A.hs` will recompile (it depends on `C.h` which has changed), but `.dependencies` will not be recalculated, so `["A.o", "A.hi"]` will *not* track the new `D.h` dependency. Now change `D.h` and rebuild: `A.hs` (i.e. the `["A.o", "A.hi"]` rule) will not be rebuilt even though a dependency has changed.

+A possible solution to this is to use `needed` in the `.dependencies` rule to declare the dependencies *after* running ghc's dependency generation. This has a couple of issues:
+
+* ghc also outputs dependencies on the Haskell interface files of the modules that this module imports. Unlike the CPP includes, these are *not* indicating inputs, so we must distinguish them. This could be done by inspecting file extensions, though that is not foolproof as CPP includes have no restriction on file extension.
+* This works fine if the CPP dependencies are source files in the source tree, but doesn't work if one of the dependencies needs to be generated by the build system: we would need to `need` them a priori to ensure they exist and so can be read to discover any further dependencies.
+
+Currently this is an unresolved issue in Hadrian, though how likely this issue is to surface as a bug in incremental/cloud builds is hard to say.
+
 ## Haskell object files and .hi inputs

 Consider creating a rule for a file X.o that compiles X.hs with ghc. We use `ghc -M` which returns `X.o : X.hs X.hi Y.hi`. The `Y.hi` is there because module `X` imports module `Y`. We conclude that `direct inputs = indicating inputs = { X.hs, X.hi, Y.hi }`. So we implement the rule like this:
@@ -217,7 +224,7 @@ This seems correct, but running the rule with `--lint-fsatrace` complains that t
 = change({ Z.hi })               -> change({ X.hs, X.hi, Y.hi })
 ```

-With some insight from an experienced ghc developer we see that all .hi files contain "a list of the fingerprints of everything it used when it last compiled the file" (from the [user guide](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/separate_compilation.html?highlight=fingerprint#the-recompilation-checker)). Making the practical assumption that the fingerprint's hashing function is injective, we know that `change({ transitive .hi files }) -> change({ X.hi }) -> change({ X.hs, X.hi, Y.hi })` and hence it is safe to not `need` transitive .hi files. In we `need` the direct .hi files reported by `ghc -M` and `trackAllow` all other `.hi` files. We apply the same logic for .hi-boot files too.
+With some insight from an experienced ghc developer we see that all .hi files contain "a list of the fingerprints of everything it used when it last compiled the file" (from the [user guide](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/separate_compilation.html?highlight=fingerprint#the-recompilation-checker)). Making the practical assumption that the fingerprint's hashing function is injective, we know that `change({ transitive .hi files }) -> change({ X.hi }) -> change({ X.hs, X.hi, Y.hi })` and hence it is safe to not `need` transitive .hi files. We `need` the direct .hi files reported by `ghc -M` and `trackAllow` all other `.hi` files. We apply the same logic for .hi-boot files too.

 ## Libffi