lcm :: Word -> Word -> Word is not specialised
GHC.Real
defines gcd
with two specialised versions gcdInt'
and gcdWord'
and rules
{-# RULES
"gcd/Int->Int->Int" gcd = gcdInt'
"gcd/Word->Word->Word" gcd = gcdWord'
#-}
It also defines lcm x y = abs ((x
quot (gcd x y)) * y)
, but specialises it only for Int
:
{-# SPECIALISE lcm :: Int -> Int -> Int #-}
So lcm :: Int -> Int -> Int
will be compiled to a nice and fast gcdInt'
, but lcm :: Word -> Word -> Word
will not benefit from the existence of gcdWord'
. This leads to a huge performance gap, about 8x.
Here is a test program:
module Main where
import Data.Time.Clock
main :: IO ()
main = do
t0 <- getCurrentTime
print $ maximum $ [ lcm x y | x <- [1..1000 :: Int], y <- [1..1000 :: Int] ]
t1 <- getCurrentTime
putStrLn "lcm :: Int -> Int -> Int"
print $ diffUTCTime t1 t0
t0 <- getCurrentTime
print $ maximum $ [ lcm x y | x <- [1..1000 :: Word], y <- [1..1000 :: Word] ]
t1 <- getCurrentTime
putStrLn "lcm :: Word -> Word -> Word"
print $ diffUTCTime t1 t0
t0 <- getCurrentTime
print $ maximum $ [ lcmWord x y | x <- [1..1000 :: Word], y <- [1..1000 :: Word] ]
t1 <- getCurrentTime
putStrLn "lcmWord :: Word -> Word -> Word"
print $ diffUTCTime t1 t0
-- Similar to GHC.Real.lcm, but specialized for Word
lcmWord :: Word -> Word -> Word
lcmWord _ 0 = 0
lcmWord 0 _ = 0
lcmWord x y = abs ((x `quot` (gcd x y)) * y)
On my PC the output (ghc -O2
) is:
999000
lcm :: Int -> Int -> Int
0.086963s
999000
lcm :: Word -> Word -> Word
0.591168s
999000
lcmWord :: Word -> Word -> Word
0.077644s
My proposal is to add a SPECIALIZE pragma to GHC.Real
:
{-# SPECIALISE lcm :: Word -> Word -> Word #-}