NonMoving: Eliminate integer division in nonmovingBlockCount
Perf showed that the this single div was capturing up to 10% of samples in nonmovingMark. However, the overwhelming majority of cases is looking at small block sizes. These cases we can easily compute explicitly, allowing the compiler to turn the division into a significantly more efficient division-by-constant. While the increase in source code looks scary, this all optimises down to very nice looking assembler. At this point the only remaining hotspots in nonmovingBlockCount are due to memory access.
Showing with 20 additions and 4 deletions