rts: correct stats when running with +RTS -qn1
Despite the documented care taken in this code, several bugs remained. This patch fixes them.
When running with -qn1 and a SYNC_GC_PAR is requested, we will have
n_gc_threads == n_capabilities && n_gc_idle_threads == (n_gc_threads - 1)
In this case we now:
- Don't increment par_collections
- Don't increment par_balanced_copied
- Don't emit debug traces for idle threads
- Take the fast path in scavenge_until_all_done, wakeup_gc_threads, and shutdown_gc_threads.
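
As a rough illustration of the stats part of this change, here is a minimal sketch in C. It is not the RTS code: `GcStats`, `is_par_gc`, and `record_collection` are hypothetical names invented for this example, and the invariant that at least one GC thread is never idle is an assumption. Only the field names mirror the counters mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant GC state; not the real RTS structs. */
typedef struct {
    uint32_t n_gc_threads;        /* threads taking part in this GC sync */
    uint32_t n_gc_idle_threads;   /* of those, threads that do no GC work */
    uint64_t par_collections;     /* count of genuinely parallel collections */
    uint64_t par_balanced_copied; /* balance statistic for parallel GCs */
} GcStats;

/* A collection is only parallel in practice if more than one thread works.
 * With -qn1: n_gc_idle_threads == n_gc_threads - 1, so this returns false. */
static bool is_par_gc(const GcStats *s)
{
    /* Assumed invariants for this sketch: at least one thread, never all idle. */
    assert(s->n_gc_threads >= 1);
    assert(s->n_gc_idle_threads < s->n_gc_threads);
    return s->n_gc_threads - s->n_gc_idle_threads > 1;
}

/* Update the parallel-GC counters only when the GC was actually parallel. */
static void record_collection(GcStats *s, uint64_t balanced_copied)
{
    if (!is_par_gc(s)) {
        return; /* single working thread: leave the parallel stats untouched */
    }
    s->par_collections++;
    s->par_balanced_copied += balanced_copied;
}
```

With four capabilities and three idle threads (the -qn1 case), `record_collection` leaves both counters unchanged. With no idle threads it increments them as before.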
Some ASSERTs have also been tightened.
Fixes #19685 (closed)
This is morally a follow-up to !4704 (closed), which has been backported as far back as 8.10, so this could also be backported to 8.10, 9.0, and 9.2. I will prepare a patch for those branches if you think it worthwhile.