Fix #2783: detect black-hole loops properly
At some point we regressed on detecting simple black-hole loops. This happened due to the introduction of duplicate-work detection for parallelism: a black-hole loop looks very much like duplicate work, except it's duplicate work being performed by the very same thread. So we have to detect and handle this case.
Showing with 38 additions and 20 deletions