    When acquiring a spinlock, yieldThread() every 1000 spins (#3553, #3758) · 65ac2f4c
    Simon Marlow authored
    This helps when the thread holding the lock has been descheduled,
    which is the main cause of the "last-core slowdown" problem.  With
    this patch, I get much better results with -N8 on an 8-core box,
    although some benchmarks are still worse than with 7 cores.

    I also added a yieldThread() into the any_work() loop of the parallel
    GC when it has no work to do. Oddly, this seems to improve performance
    on the parallel GC benchmarks even when all the cores are busy.
    Perhaps it is due to reducing contention on the memory bus.
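
The spin-then-yield idea from the commit title can be sketched as below. This is a minimal illustration, not the RTS code: `spin_lock_t`, `spin_lock_init`, `spin_lock_acquire`, `spin_lock_release`, and `SPIN_COUNT` are hypothetical names, C11 atomics are assumed, and POSIX `sched_yield()` stands in for `yieldThread()`.

```c
#include <stdatomic.h>
#include <sched.h>

/* Assumed constant: mirrors the "every 1000 spins" in the commit title. */
#define SPIN_COUNT 1000

/* Hypothetical lock type for illustration only. */
typedef struct {
    atomic_flag held;
} spin_lock_t;

static inline void spin_lock_init(spin_lock_t *l)
{
    atomic_flag_clear(&l->held);
}

static inline void spin_lock_acquire(spin_lock_t *l)
{
    for (;;) {
        /* Busy-wait for a bounded number of attempts. */
        for (unsigned i = 0; i < SPIN_COUNT; i++) {
            /* test_and_set returns the previous value: false means we got the lock. */
            if (!atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
                return;
            }
        }
        /* Still contended after SPIN_COUNT attempts: the holder may have been
         * descheduled, so give up the CPU instead of burning the time slice. */
        sched_yield();
    }
}

static inline void spin_lock_release(spin_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The bounded inner loop keeps the uncontended and briefly contended paths cheap, while the yield gives a descheduled lock holder a chance to run when every core is occupied by a spinning thread.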
Constants.h 10.6 KB