    Don't eta-expand PAPs (fixes Trac #9020) · 79e46aea
    Simon Peyton Jones authored
    See Note [Do not eta-expand PAPs] in SimplUtils.  This has a tremendously
    good effect on compile times for some simple benchmarks.
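
    As a rough source-level illustration of what "eta-expanding a PAP" means
    (the Note in SimplUtils is the authoritative description; the names add3,
    addTen and addTenEta are hypothetical and do not come from GHC or from
    this patch):

```haskell
-- Hypothetical example, not taken from GHC or the patch.
add3 :: Int -> Int -> Int -> Int
add3 x y z = x + y + z

-- A partial application (PAP): add3 has arity 3 but receives one argument.
addTen :: Int -> Int -> Int
addTen = add3 10

-- A source-level picture of the eta-expanded form: same meaning, but a
-- larger expression for the compiler to carry around.
addTenEta :: Int -> Int -> Int
addTenEta = \y z -> add3 10 y z
```

    Both definitions compute the same function; the sketch only shows why
    expanding a PAP adds code without an obvious benefit.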
    
    The test is now where it belongs, in perf/compiler/T9020 (instead of simpl015).
    
    I did a nofib run and got essentially zero change except for cacheprof which
    gets 4% more allocation.  I investigated.  Turns out that we have
    
        instance PP Reg where
           pp ppm ST_0 = "%st"
           pp ppm ST_1 = "%st(1)"
           pp ppm ST_2 = "%st(2)"
           pp ppm ST_3 = "%st(3)"
           pp ppm ST_4 = "%st(4)"
           pp ppm ST_5 = "%st(5)"
           pp ppm ST_6 = "%st(6)"
           pp ppm ST_7 = "%st(7)"
           pp ppm r    = "%" ++ map toLower (show r)
    
    That (map toLower (show r)) does a lot of map/toLower calls.  But if we
    inline show we get something like
    
           pp ppm ST_0 = "%st"
           pp ppm ST_1 = "%st(1)"
           pp ppm ST_2 = "%st(2)"
           pp ppm ST_3 = "%st(3)"
           pp ppm ST_4 = "%st(4)"
           pp ppm ST_5 = "%st(5)"
           pp ppm ST_6 = "%st(6)"
           pp ppm ST_7 = "%st(7)"
           pp ppm EAX  = "%" ++ map toLower (show EAX)
           pp ppm EBX  = "%" ++ map toLower (show EBX)
           ...etc...
    
    and all those map/toLower calls can now be floated to top level.
    This gives a 4% decrease in allocation.  But it depends on inlining
    a pretty big 'show' function.
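
    A minimal sketch of the two shapes, assuming a cut-down Reg type with only
    four constructors and omitting the ppm argument.  The names ppGeneric,
    ppSplit, eaxName and ebxName are hypothetical, and the top-level constants
    are written by hand here to stand in for what the optimiser would float
    out:

```haskell
import Data.Char (toLower)

-- Hypothetical cut-down register type; the real cacheprof type is larger.
data Reg = ST_0 | ST_1 | EAX | EBX
  deriving (Show, Eq, Enum, Bounded)

-- Original shape: the catch-all equation runs map toLower on every call.
ppGeneric :: Reg -> String
ppGeneric ST_0 = "%st"
ppGeneric ST_1 = "%st(1)"
ppGeneric r    = "%" ++ map toLower (show r)

-- After splitting on the constructor, each right-hand side mentions no
-- function argument, so it can live at top level and be built at most once.
eaxName, ebxName :: String
eaxName = "%" ++ map toLower (show EAX)
ebxName = "%" ++ map toLower (show EBX)

ppSplit :: Reg -> String
ppSplit ST_0 = "%st"
ppSplit ST_1 = "%st(1)"
ppSplit EAX  = eaxName
ppSplit EBX  = ebxName
```

    The two functions agree on every constructor; the difference is only in
    how often the lower-cased string is rebuilt.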
    
    With this new patch we get slightly better eta-expansion, which makes
    a function look slightly bigger, which just stops it being inlined.
    The previous behaviour was luck, so I'm not going to worry about
    losing it.
    
    I've added some notes to nofib/Simon-nofib-notes