Skip to content
  • Rodrigo Mesquita's avatar
    7575709b
    Improve reachability queries on ModuleGraph · 7575709b
    Rodrigo Mesquita authored and Marge Bot's avatar Marge Bot committed
    Introduces `ReachabilityIndex`, an index constructed from a
    `GHC.Data.Graph.Directed` `Graph` that supports fast reachability
    queries (in $O(1)$). This abstract data structure is exposed from
    `GHC.Data.Graph.Directed.Reachability`.
    
    This index is constructed from the module graph nodes and cached in
    `ModuleGraph`, enabling efficient reachability queries on the module
    graph. Previously, we'd construct a Map of Set of ModuleGraph nodes
    which used a lot of memory (`O(n^2)` in the number of nodes) and cache
    that in the `ModuleGraph`. By using the reachability index we get rid of
    this space leak in the module graph -- even though the index is still
    quadratic in the number of modules, it is much, much more space
    efficient due to its representation using an IntMap of IntSet as opposed
    to the transitive closure we previously cached.
    
    In a memory profile of MultiLayerModules with 100x100 modules, memory
    usage improved from 6GB residency to 2.8GB, out of which roughly 1.8GB
    are caused by a second space leak related to ModuleGraph. On the same
    program, it brings compile time from 7.5s to 5.5s.
    
    Note how we simplify `checkHomeUnitsClosed` in terms of
    `isReachableMany` and by avoiding constructing a second graph with the
    full transitive closure -- it suffices to answer the reachability query
    on the full graph without collapsing the transitive closure completely
    into nodes.
    
    Unfortunately, solving this leak means we have to do a little bit more
    work since we can no longer cache the result of turning vertex indices
    into nodes. This results in a slight regression in MultiLayerModulesTH_Make,
    but results in large performance and memory wins when compiling large
    amounts of modules.
    
    -------------------------
    Metric Decrease:
        mhu-perf
    Metric Increase:
        MultiLayerModulesTH_Make
    -------------------------
    7575709b
    Improve reachability queries on ModuleGraph
    Rodrigo Mesquita authored and Marge Bot's avatar Marge Bot committed
    Introduces `ReachabilityIndex`, an index constructed from a
    `GHC.Data.Graph.Directed` `Graph` that supports fast reachability
    queries (in $O(1)$). This abstract data structure is exposed from
    `GHC.Data.Graph.Directed.Reachability`.
    
    This index is constructed from the module graph nodes and cached in
    `ModuleGraph`, enabling efficient reachability queries on the module
    graph. Previously, we'd construct a Map of Set of ModuleGraph nodes
    which used a lot of memory (`O(n^2)` in the number of nodes) and cache
    that in the `ModuleGraph`. By using the reachability index we get rid of
    this space leak in the module graph -- even though the index is still
    quadratic in the number of modules, it is much, much more space
    efficient due to its representation using an IntMap of IntSet as opposed
    to the transitive closure we previously cached.
    
    In a memory profile of MultiLayerModules with 100x100 modules, memory
    usage improved from 6GB residency to 2.8GB, out of which roughly 1.8GB
    are caused by a second space leak related to ModuleGraph. On the same
    program, it brings compile time from 7.5s to 5.5s.
    
    Note how we simplify `checkHomeUnitsClosed` in terms of
    `isReachableMany` and by avoiding constructing a second graph with the
    full transitive closure -- it suffices to answer the reachability query
    on the full graph without collapsing the transitive closure completely
    into nodes.
    
    Unfortunately, solving this leak means we have to do a little bit more
    work since we can no longer cache the result of turning vertex indices
    into nodes. This results in a slight regression in MultiLayerModulesTH_Make,
    but results in large performance and memory wins when compiling large
    amounts of modules.
    
    -------------------------
    Metric Decrease:
        mhu-perf
    Metric Increase:
        MultiLayerModulesTH_Make
    -------------------------
Loading