Can we have more SIMD primops, corresponding to the untapped AVX etc. instructions?
GHC.Prim contains a fair number of vectorised primops, which libraries can use to generate fast code for, e.g., sums of floating-point vectors.
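As a minimal sketch of what is already possible, the following adds two groups of four `Float`s with one vector addition, using the `FloatX4#` primops re-exported from `GHC.Exts`. The helper name `addQuad` is made up for illustration. On the GHC version this ticket was filed against, SIMD primops only compile with the LLVM backend (`-fllvm`); newer compilers may behave differently.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import GHC.Exts

type Quad = (Float, Float, Float, Float)

-- | Hypothetical helper: pack both arguments into FloatX4# vectors,
-- add them with a single vector primop, and unpack the result.
addQuad :: Quad -> Quad -> Quad
addQuad (F# a0, F# a1, F# a2, F# a3) (F# b0, F# b1, F# b2, F# b3) =
  case unpackFloatX4# (plusFloatX4# (packFloatX4# (# a0, a1, a2, a3 #))
                                    (packFloatX4# (# b0, b1, b2, b3 #))) of
    (# c0, c1, c2, c3 #) -> (F# c0, F# c1, F# c2, F# c3)

main :: IO ()
main = print (addQuad (1, 2, 3, 4) (10, 20, 30, 40))
```

A real library would work on unboxed arrays rather than tuples; the tuple interface here only keeps the sketch short.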
However, several vector instructions that modern processors support are missing from it. In particular, I would like to be able to use the per-lane variable shift operations VPSLLVD...VPSRAVD, and at some point perhaps the VPMAXSQ...VPMINUQ maximum/minimum operations.
It would be great if corresponding primops could be added. Otherwise, I would like to know: where is this stuff even defined? GHC.Prim as such seems to be merely an automatically generated dummy module, mostly for Haddock.
(On the other hand, I find it also a bit strange that there are primops for integer division, which is apparently not supported by SSE/AVX at all!)
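For reference, the lane-wise behaviour of the requested shift instructions can be modelled in plain scalar Haskell. This is only a sketch of the semantics, not a proposed primop interface: the function names are hypothetical, and lists stand in for vector registers.

```haskell
module Main (main) where

import Data.Bits (shiftL, shiftR)
import Data.Int (Int32)
import Data.Word (Word32)

-- | Scalar model of VPSLLVD: each 32-bit lane is shifted left by the
-- count in the matching lane; counts above 31 give 0.
shiftLeftLanes :: [Word32] -> [Word32] -> [Word32]
shiftLeftLanes = zipWith lane
  where
    lane x n
      | n > 31    = 0
      | otherwise = x `shiftL` fromIntegral n

-- | Scalar model of VPSRAVD: arithmetic right shift per lane; counts
-- above 31 fill the lane with its sign bit, which equals shifting by 31.
shiftRightArithLanes :: [Int32] -> [Word32] -> [Int32]
shiftRightArithLanes = zipWith lane
  where
    lane x n = x `shiftR` fromIntegral (min n 31)

main :: IO ()
main = do
  print (shiftLeftLanes [1, 1, 1, 1] [0, 1, 31, 32])          -- [1,2,2147483648,0]
  print (shiftRightArithLanes [-8, 8, -1, 16] [1, 1, 40, 2])  -- [-4,4,-1,4]
```

A primop version would take and return vector types such as `Int32X4#` instead of lists, with one count per lane.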
Trac metadata
Trac field | Value |
---|---|
Version | 8.0.1 |
Type | FeatureRequest |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler (LLVM) |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture | |