GHC issues: https://gitlab.haskell.org/ghc/ghc/-/issues/2269
Word type to Double or Float conversions are slower than Int conversions
Reported by dons. Last updated 2022-01-18T13:24:01Z.

We have int2Double\# and int2Float\# primitives, but no equivalent ones for Word
types. We may need word2Double\# too, for Words\* to be fully first-class performance-wise.
This means we have to do extra tests in the Integral instances for Word types,
since 'fromIntegral' goes through 'toInteger':
```
toInteger (W# x#)
    | i# >=# 0#  = smallInteger i#
    | otherwise  = wordToInteger x#
    where i# = word2Int# x#
```
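The sign test is needed because `word2Int#` only reinterprets the bits, so a Word in the upper half of its range comes out as a negative Int. A minimal sketch of that effect in ordinary Haskell follows; `reinterpretAsInt` is a hypothetical name used here for illustration, not something from the GHC sources.

```haskell
import Data.Word (Word)

-- Hypothetical helper: the boxed analogue of word2Int#.
-- fromIntegral between Word and Int keeps the bit pattern and wraps.
reinterpretAsInt :: Word -> Int
reinterpretAsInt = fromIntegral

main :: IO ()
main = do
  -- Small values survive the reinterpretation unchanged.
  print (reinterpretAsInt 42 >= 0)
  -- Values above maxBound :: Int turn negative, so toInteger
  -- cannot take the cheap smallInteger path for them.
  print (reinterpretAsInt maxBound < 0)
  print (toInteger (maxBound :: Word) > toInteger (maxBound :: Int))
```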
Now, for some types, we work around this with rewrite rules:
```
"fromIntegral/Int->Word" fromIntegral = \(I# x#) -> W# (int2Word# x#)
"fromIntegral/Word->Int" fromIntegral = \(W# x#) -> I# (word2Int# x#)
"fromIntegral/Word->Word" fromIntegral = id :: Word -> Word
```
and so on for other Word/Int types. And all is fine.
The problem comes up for Float and Double. For Int, we can write:
```
"fromIntegral/Int->Float" fromIntegral = int2Float
"fromIntegral/Int->Double" fromIntegral = int2Double
int2Float :: Int -> Float
int2Float (I# x) = F# (int2Float# x)
int2Double :: Int -> Double
int2Double (I# x) = D# (int2Double# x)
```
But we can't write these rules for Word types, because there is no
word2Float\# or word2Double\# to rewrite to.
The result is a slowdown on Word conversions. Consider this program:
```
main = print . sumU
     . mapU (fromIntegral :: Int -> Double)
     $ enumFromToU 0 100000000
```
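For readers without the array-fusion combinators at hand, the same computation can be written as an explicit strict loop, which is roughly the shape the fused pipeline compiles down to. This is a sketch only: `sumAsDouble` is a hypothetical name, and the element type is fixed to Word here so that the slow conversion path is the one being exercised.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Word (Word)

-- Hypothetical stand-in for sumU . mapU fromIntegral $ enumFromToU lo hi.
-- Assumes hi < maxBound, otherwise the counter would wrap around.
sumAsDouble :: Word -> Word -> Double
sumAsDouble lo hi = go 0 lo
  where
    go :: Double -> Word -> Double
    go !acc !i
      | i > hi    = acc
      | otherwise = go (acc + fromIntegral i) (i + 1)

main :: IO ()
main = print (sumAsDouble 0 1000)
```

Changing the two `Word` annotations to `Int` gives the fast variant for comparison.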
When the source type of the conversion is Int, we get this nice code:
```
$wfold :: Double# -> Int# -> Double#
$wfold =
  \ (ww_s18k :: Double#) (ww1_s18o :: Int#) ->
    case ># ww1_s18o 100000000 of wild_a14T {
      False ->
        $wfold
          (+## ww_s18k (int2Double# ww1_s18o)) (+# ww1_s18o 1);
      True -> ww_s18k
    }
```
But for Word types, we get:
```
$wfold :: Double# -> Word# -> Double#
$wfold =
  \ (ww_s1gN :: Double#) (ww1_s1gR :: Word#) ->
    case gtWord# ww1_s1gR __word 100000000 of wild_a1do {
      False ->
        case case >=# (word2Int# ww1_s1gR) 0 of wild1_a1cS {
               False ->
                 case word2Integer# ww1_s1gR of wild11_a1d9 { (# s_a1db, d_a1dc #) ->
                 case {__ccall __encodeDouble Int#
                                              -> ByteArray#
                                              -> Int#
                                              -> State# RealWorld
                                              -> (# State# RealWorld, Double# #)}_a1bT
                        s_a1db d_a1dc 0 realWorld#
                 of wild12_a1bX { (# ds1_a1bZ, ds2_a1c0 #) ->
                 ds2_a1c0
                 }
                 };
               True -> int2Double# (word2Int# ww1_s1gR)
             }
        of wild1_a1bM { __DEFAULT ->
        $wfold
          (+## ww_s1gN wild1_a1bM) (plusWord# ww1_s1gR __word 1)
        };
      True -> ww_s1gN
    }
```
This is to be expected, and the running time goes from (Int version):
```
$ time ./henning
5.00000000067109e17
./henning 1.53s user 0.00s system 99% cpu 1.534 total
```
To (Word version):
```
$ time ./henning
5.00000000067109e17
./henning 4.57s user 0.00s system 99% cpu 4.571 total
```
So roughly a 3x slowdown: not too bad, but still, the principle of least surprise
says Word and Int should behave the same.
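Without a dedicated primop, one possible user-level workaround is to split the Word into two halves, each of which fits in a non-negative Int, and combine the two Int-to-Double conversions. The sketch below is an illustration only, not what GHC does; `wordToDoubleViaInt` is a hypothetical name. Both halves convert exactly and the final addition rounds once, so the result should agree with the conversion via Integer.

```haskell
import Data.Bits (finiteBitSize, shiftL, shiftR, (.&.))
import Data.Word (Word)

-- Hypothetical helper: Word -> Double using only Int -> Double conversions.
wordToDoubleViaInt :: Word -> Double
wordToDoubleViaInt w = hiD * scale + loD
  where
    half  = finiteBitSize w `div` 2               -- 32 on a 64-bit platform
    mask  = (1 `shiftL` half) - 1 :: Word         -- selects the low half
    hiD   = fromIntegral (fromIntegral (w `shiftR` half) :: Int) :: Double
    loD   = fromIntegral (fromIntegral (w .&. mask) :: Int) :: Double
    scale = 2 ^ half :: Double                    -- exact power of two

main :: IO ()
main = do
  let samples = [0, 1, 100000000, maxBound `div` 2, maxBound `div` 2 + 1, maxBound]
  -- Compare against the library conversion on a few boundary values.
  print (all (\w -> wordToDoubleViaInt w == fromIntegral w) samples)
```

This avoids the Integer round trip but still costs two conversions, a multiply and an add per element, so it does not remove the case for a primop.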
Should we have a word2Double\# primop?
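If such a primop were available, the wrapper and rule could mirror the Int versions quoted earlier. The sketch below assumes a `word2Double#` of type `Word# -> Double#` exported from `GHC.Exts`, which the ticket says was missing in 6.8.2 but which, to my knowledge, more recent GHC releases provide. The boxed wrapper name `word2Double` is chosen here by analogy with `int2Double`.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Double (D#), Word (W#), word2Double#)

-- Assumed wrapper, by analogy with int2Double in the ticket:
-- unbox the Word, convert with the primop, rebox as a Double.
word2Double :: Word -> Double
word2Double (W# x) = D# (word2Double# x)

-- The matching rewrite rule would then follow the Int one:
--   "fromIntegral/Word->Double" fromIntegral = word2Double

main :: IO ()
main = print (word2Double 100000000 == fromIntegral (100000000 :: Word))
```

With a rule of that shape in place, the Word loop should be able to reduce to a single conversion per element, as the Int loop does.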
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 6.8.2 |
| Type | FeatureRequest |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | Unknown |
| Architecture | Unknown |
</details>
Milestone: 8.0.1. Assignee: dons.