`-mavx` should imply `-msse4.2` (or, implications between x86 CPU feature flags)
Summary
If a program is compiled with `-mavx`, the CPU running it must also support SSE4.2, so GHC can emit the `popcnt` instruction.
More generally, GHC should have better knowledge of the implications between CPU features. For reference, LLVM encodes these implications in https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86.td
Maybe `SseVersion` in `compiler/GHC/Platform.hs` should include the AVX family.
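To make the idea concrete, here is a minimal standalone sketch of what "knowing the implications" could look like. It is not GHC's actual code: the `Feature` type, `implies`, `closure`, and `canUsePopcnt` are hypothetical names invented for illustration. The implication chain follows the SSE/AVX ordering in LLVM's X86.td, and gating `popcnt` on SSE4.2 mirrors the behaviour seen with `-msse4.2` in the reproduction in this report.

```haskell
module Main (main) where

-- Hypothetical feature type for illustration; GHC's real representation
-- (e.g. SseVersion in compiler/GHC/Platform.hs) is different.
data Feature = SSE2 | SSE3 | SSSE3 | SSE41 | SSE42 | AVX | AVX2
  deriving (Eq, Show)

-- Direct implications only: each feature lists the features it
-- immediately requires.
implies :: Feature -> [Feature]
implies AVX2  = [AVX]
implies AVX   = [SSE42]
implies SSE42 = [SSE41]
implies SSE41 = [SSSE3]
implies SSSE3 = [SSE3]
implies SSE3  = [SSE2]
implies SSE2  = []

-- Transitive closure of the requested features under 'implies'.
closure :: [Feature] -> [Feature]
closure = go []
  where
    go seen [] = seen
    go seen (f : fs)
      | f `elem` seen = go seen fs
      | otherwise     = go (f : seen) (implies f ++ fs)

-- Assumption: popcnt is gated on SSE4.2, as with -msse4.2 in this report.
canUsePopcnt :: [Feature] -> Bool
canUsePopcnt requested = SSE42 `elem` closure requested

main :: IO ()
main = do
  print (canUsePopcnt [AVX])   -- True
  print (canUsePopcnt [SSE2])  -- False
```

With a closure like this, a single `-mavx` on the command line would be enough to enable every code path that currently checks for SSE4.2 or lower.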
Steps to reproduce
$ cat popcounttest.hs
import Data.Bits
{-# NOINLINE foo #-}
foo :: Int -> Int
foo x = 1 + popCount x
main = print (foo 42)
$ ghc -S -O2 -fforce-recomp popcounttest.hs
$ grep -i popcnt popcounttest.s
call hs_popcnt64
$ ghc -S -O2 -fforce-recomp -msse4.2 popcounttest.hs
$ grep -i popcnt popcounttest.s
popcnt %r14,%rax
$ ghc -S -O2 -fforce-recomp -mavx popcounttest.hs
$ grep -i popcnt popcounttest.s
call hs_popcnt64
Expected behavior
GHC should emit `popcnt` when compiling with `-mavx`.
Environment
- GHC version used: 9.10.1 and 9.11.20240614 (ce76bf78)
- Operating System: Ubuntu 22.04
- System Architecture: x86_64