
GHC should produce FMA instructions even if `-mfma` is not set on AArch64

Motivation

Currently, GHC emits FMA instructions only when `-mfma` is set. This flag is necessary on x86 because not every x86 CPU supports FMA; only newer ones (Intel Haswell, AMD Piledriver, or later) do.

However, AArch64 has had FMA instructions from the beginning, so the flag is redundant there.

Here is an example to test the effect of -mfma:

$ cat TwoProdFMA.hs
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
module TwoProdFMA where
import GHC.Exts

twoProductFloat# :: Float# -> Float# -> (# Float#, Float# #)
twoProductFloat# x y = let !r = x `timesFloat#` y
                       in (# r, fmsubFloat# x y r #)
$ ghc -S -fforce-recomp -O2 TwoProdFMA.hs
$ grep -E "fn?m[as]" TwoProdFMA.s
	bl _fmaf
$ ghc -S -fforce-recomp -O2 -mfma TwoProdFMA.hs
$ grep -E "fn?m[as]" TwoProdFMA.s
	fnmsub s9, s8, s9, s31

Tested with GHC 9.8.1 and GHC 9.9.20240105 (90ea574e) on AArch64 macOS.
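
For readers who want to run the primop rather than only inspect the assembly, here is a minimal sketch of a boxed wrapper around the same computation. The name `twoProductFloat`, the `Main` module, and the input values are illustrative assumptions and not part of the report above; it assumes GHC 9.8 or later, where `fmsubFloat#` is exported from `GHC.Exts`. The second component should be the rounding error of the product, and the numeric result should not depend on whether the fused operation is compiled to a call to `fmaf` or to an inline instruction.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Float (F#), fmsubFloat#, timesFloat#)

-- Hypothetical boxed wrapper around the computation in TwoProdFMA.hs.
-- Returns the rounded product r and the error term x*y - r,
-- where the error term is computed with a single fused operation.
twoProductFloat :: Float -> Float -> (Float, Float)
twoProductFloat (F# x) (F# y) =
  case x `timesFloat#` y of
    r -> (F# r, F# (fmsubFloat# x y r))

main :: IO ()
main = do
  -- Illustrative inputs whose exact product needs more than 24 bits.
  let x = 1 + 2 ^^ (-12 :: Int) :: Float
      y = x
      (r, e) = twoProductFloat x y
  print (r, e)
  -- The product of two Floats is exact in Double, so this checks
  -- that r + e reproduces x * y without rounding error.
  let exact = realToFrac x * realToFrac y :: Double
  print (realToFrac r + realToFrac e == exact)
```

Compiling this once with and once without `-mfma` and comparing the output is a quick way to confirm that only the generated code changes, not the result.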

Proposal

On AArch64, GHC should emit FMA instructions regardless of whether `-mfma` is set. I believe the same applies to PowerPC, but I don't have an environment to test it.
