Skip to content

Different floating point results with -msse2 on 32bit Linux

I noticed strange behaviour of Conversions test on 32bit machines. Here's a slightly reduced version of that test:

{-# LANGUAGE BangPatterns #-}

-- | Tests that conversions between various primitive types (e.g.
-- Word, Double, etc) doesn't allocate.
module Main (main) where

import Data.Word

-- Repeatedly convert Words to Doubles
loop :: Word -> Float
loop n = go 0 0.0
  where
    go i !acc | i < n = go (i+1) (acc + fromIntegral i)
              | otherwise = acc

main :: IO ()
main = do
    print (loop 1000000 :: Float)

This test is expected to produce:

[t-jastol@cam-05-unx : /5playpen/t-jastol/ghc-validate] inplace/bin/ghc-stage2 -O -fforce-recomp testsuite/tests/perf/should_run/Conversions.hs
[1 of 1] Compiling Main             ( testsuite/tests/perf/should_run/Conversions.hs, testsuite/tests/perf/should_run/Conversions.o )
Linking testsuite/tests/perf/should_run/Conversions ...
[t-jastol@cam-05-unx : /5playpen/t-jastol/ghc-validate] ./testsuite/tests/perf/should_run/Conversions
4.9994036e11

My optimization of self-recursive tail loops (d61c3ac1) change results of that program to be slightly different:

4.999995e11

I believe my optimisation shouldn't affect this, but clearly it does.

Now here's the interesting part: if I generate SSE assembly by adding -msse2 flag, then I get expected results:

4.9994036e11

I'm not sure if we should get different results for different instruction sets. I'm putting it up as a ticket so perhaps we could investigate this more one day if we consider it relevant.

Edited by Jan Stolarek
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information