Reorder tests in quot to improve code

[SLPJ: capturing an email thread in a ticket]

Krasimir Angelov found that this function:

a `quot` b
    | b == 0                     = divZeroError
    | a == minBound && b == (-1) = overflowError
    | otherwise                  =  a `quotInt` b

is expanded to:

a `quot` b =
   if b == 0
     then divZeroError
     else if a == minBound
              then if b == (-1)
                       then overflowError
                       else a `quotInt` b
              else a `quotInt` b

Then the compiler sees that b is a constant and computes that b == 0 is False and b == (-1) is also False so it could eliminate two If statements. The result is:

a `quot` b =
   if a == minBound
     then a `quotInt` b
     else a `quotInt` b

and this is exactly what we get. I bet that if the original function was:

a `quot` b
    | b == 0                     = divZeroError
    | b == (-1) && a == minBound = overflowError   -- Note the changed order here
    | otherwise                  =  a `quotInt` b

then we would get what we want. I think that it is much more often to have division where the divisor is known so we will get the best code in this case.

Shortly afterwards, Bertram Felgenhauer tried it out. It works, and it has the desired effect. It's not always a win though; the nofib prime sieve benchmark suffers.

For a patch, see

http://int-e.home.tlink.de/haskell/ghc-libraries-base-tune-division.patch

(SLPJ: I've attached it to the ticket too)

Nofib results extract:

------------------------------------------------------------------------
        Program           Size    Allocs   Runtime   Elapsed
------------------------------------------------------------------------
           fish          -0.7%     -5.3%      0.05      0.04
         primes          -0.0%    +28.5%    +25.6%    +25.5%
   wheel-sieve2          -0.0%     -0.3%    -17.9%    -18.6%
------------------------------------------------------------------------
            Min          -0.9%     -5.3%    -17.9%    -18.6%
            Max          +0.1%    +28.5%    +25.6%    +25.5%
 Geometric Mean          -0.2%     +0.2%     -0.0%     +0.2%

'primes' is an outlier - the other slowdowns are below 3%

What happens in 'primes' is that 'mod' no longer gets inlined; apparently it now looks bigger to the compiler than before.

Using -funfolding-use-threshold=10 brings the benchmark back to its original speed, despite the extra comparison before doing the division.

In 'wheel-sieve2', the first different optimization choice I see is again a 'mod' that is no longer inlined; this leads to a change in other inlining choices that result in a speedup for reasons that I have not investigated.

Trac metadata

Trac field	Value
Version	6.10.1
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	libraries/base
Test case
Differential revisions
BlockedBy
Related
Blocking
CC	bertram.felgenhauer@googlemail.com; kr.angelov@gmail.com
Operating system
Architecture

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information

Admin message

Reorder tests in quot to improve code