Skip to content

Driver rework pt3: The upsweep

Matthew Pickering requested to merge wip/driver-rework-pt3-clean into master

This patch specifies and simplifies the module cycle compilation in upsweep. How things work are described in the Note [Upsweep]

Note [Upsweep]


Upsweep takes a 'ModuleGraph' as input, computes a build plan and then executes
the plan in order to compile the project.

The first step is computing the build plan from a 'ModuleGraph'.

The output of this step is a `[BuildPlan]`, which is a topologically sorted plan for
how to build all the modules.

```
data BuildPlan = SingleModule ModuleGraphNode  -- A simple, single module all alone but *might* have an hs-boot file which isn't part of a cycle
               | ResolvedCycle [ModuleGraphNode]   -- A resolved cycle, linearised by hs-boot files
               | UnresolvedCycle [ModuleGraphNode] -- An actual cycle, which wasn't resolved by hs-boot files
```

The plan is computed in two steps:

Step 1: Topologically sort the module graph without hs-boot files. This returns a [SCC ModuleGraphNode] which contains
        cycles.
Step 2: For each cycle, topologically sort the modules in the cycle *with* the relevant hs-boot files. This should
        result in an acyclic build plan if the hs-boot files are sufficient to resolve the cycle.

The `[BuildPlan]` is then interpreted by the `interpretBuildPlan` function.

* `SingleModule nodes` are compiled normally by either the upsweep_inst or upsweep_mod functions.
* `ResolvedCycles` need to compiled "together" so that the information which ends up in
  the interface files at the end is accurate (and doesn't contain temporary information from
  the hs-boot files.)
  - During the initial compilation, a `KnotVars` is created which stores an IORef TypeEnv for
    each module of the loop. These IORefs are gradually updated as the loop completes and provide
    the required laziness to typecheck the module loop.
  - At the end of typechecking, all the interface files are typechecked again in
    the retypecheck loop. This time, the knot-tying is done by the normal laziness
    based tying, so the environment is run without the KnotVars.
* UnresolvedCycles are indicative of a proper cycle, unresolved by hs-boot files
  and are reported as an error to the user.

The main trickiness of `interpretBuildPlan` is deciding which version of a dependency
is visible from each module. For modules which are not in a cycle, there is just
one version of a module, so that is always used. For modules in a cycle, there are two versions of
'HomeModInfo'.

1. Internal to loop: The version created whilst compiling the loop by upsweep_mod.
2. External to loop: The knot-tied version created by typecheckLoop.

Whilst compiling a module inside the loop, we need to use the (1). For a module which
is outside of the loop which depends on something from in the loop, the (2) version
is used.

As the plan is interpreted, which version of a HomeModInfo is visible is updated
by updating a map held in a state monad. So after a loop has finished being compiled,
the visible module is the one created by typecheckLoop and the internal version is not
used again.

This plan also ensures the most important invariant to do with module loops:

> If you depend on anything within a module loop, before you can use the dependency,
  the whole loop has to finish compiling.

The end result of `interpretBuildPlan` is a `[MakeAction]`, which are pairs
of `IO a` actions and a `MVar (Maybe a)`, somewhere to put the result of running
the action. This list is topologically sorted, so can be run in order to compute
the whole graph.

As well as this `interpretBuildPlan` also outputs an `IO [Maybe (Maybe HomeModInfo)]` which
can be queried at the end to get the result of all modules at the end, with their proper
visibility. For example, if any module in a loop fails then all modules in that loop will
report as failed because the visible node at the end will be the result of retypechecking
those modules together.

Along the way we also fix a number of other bugs in the driver:

* Unify upsweep and parUpsweep.
* Fix #19937 (static points, ghci and -j)
* Adds lots of module loop tests due to Divam.

Also related to #20030

Co-authored-by: Divam Narula's avatarDivam Narula <dfordivam@gmail.com>

-------------------------
Metric Decrease:
    T10370
-------------------------
Edited by Matthew Pickering

Merge request reports