  • Turn on -n4m with -A16m or greater · 85e81a85
    Simon Marlow authored and Ben Gamari committed
    Nursery chunks help reduce the cost of GC when capabilities are unevenly
    loaded, by ensuring that we use more of the available nursery.
    
    The rationale for enabling this at -A16m is that any negative effects
    due to loss of cache locality are less likely to be an issue at -A16m
    and above.  It's a conservative guess.  If we had a lot of benchmark
    data we could probably do better.
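    
    As an illustration of the rule described above, here is a minimal
    Python sketch of the defaulting logic.  It is not the RTS code (the
    RTS is written in C): the name `effective_chunk_size` and the use of
    `None` for "no -n given" are hypothetical, and it assumes that an
    explicit -n from the user always takes precedence and that the
    default below -A16m is no chunking.
    
    ```python
    MIB = 1024 * 1024
    
    
    def effective_chunk_size(alloc_area_bytes, chunk_bytes=None):
        """Return the nursery chunk size (-n) in bytes; 0 means no chunking.
    
        alloc_area_bytes models the -A setting, chunk_bytes models -n.
        """
        if chunk_bytes is not None:
            # Assumption: an explicit -n setting is never overridden.
            return chunk_bytes
        if alloc_area_bytes >= 16 * MIB:
            # The rule from this commit: -A16m or greater turns on -n4m.
            return 4 * MIB
        # Assumption: below the threshold, chunking stays off.
        return 0
    ```
    
    Under these assumptions a program run with -A32m and no -n gets 4m
    chunks, while one run with -A1m keeps the unchunked nursery.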
    
    Results for nofib/parallel at -N4 -A32m with and without -n4m:
    
    ```
    ------------------------------------------------------------------------
            Program           Size    Allocs   Runtime   Elapsed  TotalMem
    ------------------------------------------------------------------------
       blackscholes           0.0%     -9.5%     -9.0%    -15.0%     -2.2%
              coins           0.0%     -4.7%     -3.6%     -0.6%    -13.6%
             mandel           0.0%     -0.3%     +7.7%    +13.1%     +0.1%
            matmult           0.0%     +1.5%    +10.0%     +7.7%     +0.1%
              nbody           0.0%     -4.1%     -2.9%     0.085      0.0%
             parfib           0.0%     -1.4%     +1.0%     +1.5%     +0.2%
            partree           0.0%     -0.3%     +0.8%     +2.9%     -0.8%
               prsa           0.0%     -0.5%     -2.1%     -7.6%      0.0%
             queens           0.0%     -3.2%     -1.4%     +2.2%     +1.3%
                ray           0.0%     -5.6%    -14.5%     -7.6%     +0.8%
           sumeuler           0.0%     -0.4%     +2.4%     +1.1%      0.0%
    ------------------------------------------------------------------------
                Min           0.0%     -9.5%    -14.5%    -15.0%    -13.6%
                Max           0.0%     +1.5%    +10.0%    +13.1%     +1.3%
     Geometric Mean          +0.0%     -2.6%     -1.3%     -0.5%     -1.4%
    ```
    
    Not conclusive, but slightly better.  This matters a lot more when you
    have more cores.
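    
    For reading the "Geometric Mean" row: the sketch below assumes the
    summary is the geometric mean of the per-program ratios
    (1 + change/100), converted back to a percentage.  The function name
    is hypothetical, and the input is the Allocs column copied from the
    table above.
    
    ```python
    import math
    
    
    def geometric_mean_change(percent_changes):
        """Geometric mean of percentage changes, returned as a percentage."""
        # Turn each change into a ratio, e.g. -9.5% becomes 0.905.
        ratios = [1 + p / 100 for p in percent_changes]
        # Average in log space, which is the geometric mean of the ratios.
        log_mean = sum(math.log(r) for r in ratios) / len(ratios)
        return (math.exp(log_mean) - 1) * 100
    
    
    # Allocs column, blackscholes through sumeuler.
    allocs = [-9.5, -4.7, -0.3, 1.5, -4.1, -1.4, -0.3, -0.5, -3.2, -5.6, -0.4]
    print(f"{geometric_mean_change(allocs):+.1f}%")  # prints -2.6%
    ```
    
    The result agrees with the -2.6% reported for Allocs.  The nbody
    Elapsed entry is not a percentage, so that column cannot be fed to
    this function as-is.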
    
    Test Plan: validate, nofib/parallel
    
    Reviewers: niteria, ezyang, nh2, trofi, austin, erikd, bgamari
    
    Reviewed By: bgamari
    
    Subscribers: thomie
    
    Differential Revision: https://phabricator.haskell.org/D2581
    
    GHC Trac Issues: #9221