--make driver should "make" use of graph more

Problem

I have found a few issues in the --make driver code, which all relate to the graph in some ways

Compilation of backpack indefinite package is broken with -j

The reason behind this is that the InstantiationNode's compilation does not wait on any other nodes. Nor does the compilation of ModuleNodes wait on InstantiationNode.

parUpsweep contains a logic for handling the ext_loop_deps, which is not explicitly handled in the original graph, nor in the upsweep.
parUpsweep recomputes the deps

The graph created by moduleGraphNodes is correct, but doing topological sort removes all the dependency information

For single threaded upsweep it is not an issue (as long as -fkeep-going is not used*) , but the parUpsweep needs to figure out all the dependencies again.

Solution

I think the right solution here is that upsweep and parUpweep should work on an input graph, and there should not be a need to do any extra patching over the graph.

Fixing this properly requires these two

The parUpweep specifically should not compute the deps info, it should be an input computed from the graph.
The graph should take into consideration the ext_loop_deps

Here is the summary of a solution I have in mind

Make the initial graph, just like currently being done by moduleGraphNodes
Do topological sort and filter out all cyclic modules
For each ACyclic SCC

a. Create another graph

b. Determine all the module loops

c. For each module loop, fix the edges of external modules which directly depend on any module in the loop. (ie change the head of the edge to the loop closer.)
Use the updated graph to construct a suitable datastructure which contains the set of MVars the parUpweep needs to wait on. (eg. [BuildModule, [MVar ...]])

Open question: In case of multiple connected module loops, what should be the behavior of typecheckLoop

Example

Below is concrete example of how these modifications would happen (omitting the InstantiationNode, since it behaves similar to a normal HsSrc node with respect to this algo)

Initial graph created by direct source imports

graph LR;
  A.hs-boot --> B.hs;
  A.hs-boot --> D.hs;
  B.hs --> E.hs-boot;
  B.hs --> A.hs;
  C.hs-boot --> A.hs;
  A.hs --> C.hs;
  A.hs --> F.hs;

  E.hs-boot --> G.hs;
  G.hs --> H.hs;
  H.hs --> E.hs;
  G.hs --> I.hs;

  J.hs-boot --> J.hs;
  J.hs-boot --> K.hs;

3a. Determine module loops

Number of module loops == Number of HsBoot modules

Loops could be represented by a NonEmpty style list, where head is the loop closer

A :| [B]
C :| [A]
E :| [G, H]
J :| []

As far as I could determine the ordering of the processing of loops does not matter here

I. Process A.hs-boot

graph LR;
  A.hs-boot --> B.hs; %% [isLoopMember]
  A.hs --> D.hs; %% (modified by I)
  A.hs --> E.hs-boot; %% (modified by I)
  B.hs --> A.hs;
  C.hs-boot --> A.hs;
  A.hs --> C.hs;
  A.hs --> F.hs;

  E.hs-boot --> G.hs;
  G.hs --> H.hs;
  H.hs --> E.hs;
  G.hs --> I.hs;

  J.hs-boot --> J.hs;
  J.hs-boot --> K.hs;

II. Process C.hs-boot

graph LR;
  A.hs-boot --> B.hs; %% [isLoopMember]
  C.hs --> D.hs; %% (modified by I, II)
  C.hs --> E.hs-boot; %% (modified by I, II)
  B.hs --> A.hs;
  C.hs-boot --> A.hs; %% [isLoopMember]
  A.hs --> C.hs;
  C.hs --> F.hs; %% (modified by II)

  E.hs-boot --> G.hs;
  G.hs --> H.hs;
  H.hs --> E.hs;
  G.hs --> I.hs;

  J.hs-boot --> J.hs;
  J.hs-boot --> K.hs;

III. Process E.hs-boot

graph LR;
  A.hs-boot --> B.hs; %% [isLoopMember]
  C.hs --> D.hs; %% (modified by I, II)
  C.hs --> E.hs-boot; %% (modified by I, II)
  B.hs --> A.hs;
  C.hs-boot --> A.hs;
  A.hs --> C.hs;
  C.hs --> F.hs; %% (modified by II)

  E.hs-boot --> G.hs;
  G.hs --> H.hs;
  H.hs --> E.hs;
  E.hs --> I.hs;

  J.hs-boot --> J.hs;
  J.hs-boot --> K.hs;

IV. Process J.hs-boot

graph LR;
  A.hs-boot --> B.hs; %% [isLoopMember]
  C.hs --> D.hs; %% (modified by I, II)
  C.hs --> E.hs-boot; %% (modified by I, II)
  B.hs --> A.hs;
  C.hs-boot --> A.hs;
  A.hs --> C.hs;
  C.hs --> F.hs; %% (modified by II)

  E.hs-boot --> G.hs;
  G.hs --> H.hs;
  H.hs --> E.hs;
  E.hs --> I.hs;

  J.hs-boot --> J.hs;
  J.hs --> K.hs; %% (modified by IV)

Problematic case

graph TB;
  subgraph one
  L1.hs-boot --> L1_1.hs;
  L1_1.hs --> L1.hs;
  end
  subgraph three
  M.hs
  end
  subgraph two
  L2.hs-boot --> L2_1.hs;
  L2_1.hs --> L2.hs;
  end

  L1_1.hs --> M.hs;  
  L2_1.hs ==> L1.hs;  
  M.hs --> L2.hs;

In this case the L1 depends on L2_1 ( highlighted edge), but it should not be changed to L2. (It is ok for M to depend on L1, as shown by dotted)

graph TB;
  subgraph one
  L1.hs-boot --> L1_1.hs;
  L1_1.hs --> L1.hs;
  end
  subgraph three
  M.hs
  end
  subgraph two
  L2.hs-boot --> L2_1.hs;
  L2_1.hs --> L2.hs;
  end

  L1_1.hs --> M.hs;  
  L1.hs -..-> M.hs;  
  L2_1.hs ==> L1.hs;  
  M.hs --> L2.hs;

Also this somehow magically works on master branch, as the L1 is considered part of the L2 loop!

comp_graph_loops
  [[L2, L2 {-# SOURCE #-}, L2_1, L1, L2],
   [L1, L1 {-# SOURCE #-}, L1_1, L1]]

Edited Jun 25, 2021 by Divam Narula

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information