    e.g. use +RTS -g2 -RTS for 2 threads.  Only major GCs are parallelised;
    minor GCs are still sequential. Don't use more threads than you
    have CPUs.
    It works most of the time, although you won't see much speedup yet.
    Tuning and more work on stability are still required.
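
    A minimal sketch for trying the option out. It builds a large live
    data structure so that major GCs have real work to do. The module
    name GcDemo and the function buildTable are hypothetical, and the
    build and run lines in the comments assume the program is linked
    with the threaded RTS (-threaded), which is not stated above.

    ```haskell
    -- GcDemo.hs (hypothetical file name)
    --
    -- Assumed build and run lines for the release described here:
    --   ghc --make -O2 -threaded GcDemo.hs
    --   ./GcDemo +RTS -g2 -sstderr -RTS
    -- -sstderr prints GC statistics to stderr, so runs with and
    -- without -g2 can be compared.
    module Main (main) where

    import Data.List (foldl')
    import qualified Data.Map as Map

    -- Build a map with n entries. The whole map stays live until the
    -- end of main, so each major GC has a large heap to traverse.
    buildTable :: Int -> Map.Map Int Int
    buildTable n = foldl' insertKey Map.empty [1 .. n]
      where
        insertKey acc k = Map.insert k (k * 2) acc

    main :: IO ()
    main = do
      let table = buildTable 1000000
      print (Map.size table)
      -- Walk every element so the map is fully evaluated and retained.
      print (sum (Map.elems table))
    ```

    Comparing the GC times reported by -sstderr for runs with and
    without -g2 gives a rough idea of whether the parallel collector
    helps for a given workload.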
