--make driver should "make" use of graph more
Problem
I have found a few issues in the --make driver code, which all relate to the graph in some ways
- Compilation of backpack indefinite package is broken with -j
The reason behind this is that the InstantiationNode
's compilation does not wait on any other nodes.
Nor does the compilation of ModuleNodes
wait on InstantiationNode
.
-
parUpsweep
contains a logic for handling theext_loop_deps
, which is not explicitly handled in the original graph, nor in theupsweep
. -
parUpsweep
recomputes the depsThe graph created by
moduleGraphNodes
is correct, but doing topological sort removes all the dependency informationFor single threaded
upsweep
it is not an issue (as long as-fkeep-going
is not used*) , but theparUpsweep
needs to figure out all the dependencies again.
Solution
I think the right solution here is that upsweep
and parUpweep
should work on an input graph, and there should not be a need to do any extra patching over the graph.
Fixing this properly requires these two
- The
parUpweep
specifically should not compute the deps info, it should be an input computed from the graph. - The graph should take into consideration the
ext_loop_deps
Here is the summary of a solution I have in mind
-
Make the initial graph, just like currently being done by
moduleGraphNodes
-
Do topological sort and filter out all cyclic modules
-
For each ACyclic SCC
a. Create another graph
b. Determine all the module loops
c. For each module loop, fix the edges of external modules which directly depend on any module in the loop. (ie change the head of the edge to the loop closer.)
-
Use the updated graph to construct a suitable datastructure which contains the set of MVars the
parUpweep
needs to wait on. (eg.[BuildModule, [MVar ...]]
)
Open question: In case of multiple connected module loops, what should be the behavior of typecheckLoop
Example
Below is concrete example of how these modifications would happen
(omitting the InstantiationNode
, since it behaves similar to a normal HsSrc
node with respect to this algo)
- Initial graph created by direct source imports
graph LR;
A.hs-boot --> B.hs;
A.hs-boot --> D.hs;
B.hs --> E.hs-boot;
B.hs --> A.hs;
C.hs-boot --> A.hs;
A.hs --> C.hs;
A.hs --> F.hs;
E.hs-boot --> G.hs;
G.hs --> H.hs;
H.hs --> E.hs;
G.hs --> I.hs;
J.hs-boot --> J.hs;
J.hs-boot --> K.hs;
3a. Determine module loops
Number of module loops == Number of HsBoot modules
Loops could be represented by a NonEmpty style list, where head is the loop closer
A :| [B]
C :| [A]
E :| [G, H]
J :| []
As far as I could determine the ordering of the processing of loops does not matter here
I. Process A.hs-boot
graph LR;
A.hs-boot --> B.hs; %% [isLoopMember]
A.hs --> D.hs; %% (modified by I)
A.hs --> E.hs-boot; %% (modified by I)
B.hs --> A.hs;
C.hs-boot --> A.hs;
A.hs --> C.hs;
A.hs --> F.hs;
E.hs-boot --> G.hs;
G.hs --> H.hs;
H.hs --> E.hs;
G.hs --> I.hs;
J.hs-boot --> J.hs;
J.hs-boot --> K.hs;
II. Process C.hs-boot
graph LR;
A.hs-boot --> B.hs; %% [isLoopMember]
C.hs --> D.hs; %% (modified by I, II)
C.hs --> E.hs-boot; %% (modified by I, II)
B.hs --> A.hs;
C.hs-boot --> A.hs; %% [isLoopMember]
A.hs --> C.hs;
C.hs --> F.hs; %% (modified by II)
E.hs-boot --> G.hs;
G.hs --> H.hs;
H.hs --> E.hs;
G.hs --> I.hs;
J.hs-boot --> J.hs;
J.hs-boot --> K.hs;
III. Process E.hs-boot
graph LR;
A.hs-boot --> B.hs; %% [isLoopMember]
C.hs --> D.hs; %% (modified by I, II)
C.hs --> E.hs-boot; %% (modified by I, II)
B.hs --> A.hs;
C.hs-boot --> A.hs;
A.hs --> C.hs;
C.hs --> F.hs; %% (modified by II)
E.hs-boot --> G.hs;
G.hs --> H.hs;
H.hs --> E.hs;
E.hs --> I.hs;
J.hs-boot --> J.hs;
J.hs-boot --> K.hs;
IV. Process J.hs-boot
graph LR;
A.hs-boot --> B.hs; %% [isLoopMember]
C.hs --> D.hs; %% (modified by I, II)
C.hs --> E.hs-boot; %% (modified by I, II)
B.hs --> A.hs;
C.hs-boot --> A.hs;
A.hs --> C.hs;
C.hs --> F.hs; %% (modified by II)
E.hs-boot --> G.hs;
G.hs --> H.hs;
H.hs --> E.hs;
E.hs --> I.hs;
J.hs-boot --> J.hs;
J.hs --> K.hs; %% (modified by IV)
Problematic case
graph TB;
subgraph one
L1.hs-boot --> L1_1.hs;
L1_1.hs --> L1.hs;
end
subgraph three
M.hs
end
subgraph two
L2.hs-boot --> L2_1.hs;
L2_1.hs --> L2.hs;
end
L1_1.hs --> M.hs;
L2_1.hs ==> L1.hs;
M.hs --> L2.hs;
In this case the L1 depends on L2_1 ( highlighted edge), but it should not be changed to L2. (It is ok for M to depend on L1, as shown by dotted)
graph TB;
subgraph one
L1.hs-boot --> L1_1.hs;
L1_1.hs --> L1.hs;
end
subgraph three
M.hs
end
subgraph two
L2.hs-boot --> L2_1.hs;
L2_1.hs --> L2.hs;
end
L1_1.hs --> M.hs;
L1.hs -..-> M.hs;
L2_1.hs ==> L1.hs;
M.hs --> L2.hs;
Also this somehow magically works on master branch, as the L1 is considered part of the L2 loop!
comp_graph_loops
[[L2, L2 {-# SOURCE #-}, L2_1, L1, L2],
[L1, L1 {-# SOURCE #-}, L1_1, L1]]