Can we have more SIMD primops, corresponding to the untapped AVX etc. instructions?
GHC.Prim contains quite a few vectorised primops, which libraries can use to generate fast code for, e.g., sums of floating-point vectors.
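To illustrate the kind of primop that already exists, here is a minimal sketch that adds four `Float`s lane-wise with `plusFloatX4#`. The wrapper name `addFloat4` is made up for this example. It assumes a GHC whose backend supports the SIMD primops; on GHC 8.0.1 that means compiling with `-fllvm`.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts

-- | Lane-wise addition of two groups of four Floats.
-- 'addFloat4' is a hypothetical helper; the primops it wraps
-- (packFloatX4#, plusFloatX4#, unpackFloatX4#) are exported by GHC.Exts.
-- Assumption: compiled with a backend that supports SIMD (e.g. -fllvm).
addFloat4
  :: (Float, Float, Float, Float)
  -> (Float, Float, Float, Float)
  -> (Float, Float, Float, Float)
addFloat4 (F# a0, F# a1, F# a2, F# a3) (F# b0, F# b1, F# b2, F# b3) =
  let va = packFloatX4# (# a0, a1, a2, a3 #)  -- pack scalars into one vector
      vb = packFloatX4# (# b0, b1, b2, b3 #)
  in case unpackFloatX4# (plusFloatX4# va vb) of  -- one vector add
       (# c0, c1, c2, c3 #) -> (F# c0, F# c1, F# c2, F# c3)
```

A library would normally keep the data in `FloatX4#` form across a whole loop rather than packing and unpacking per call; the tuple interface here only keeps the sketch small.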
However, several operations that modern processors can vectorise are missing from GHC.Prim. In particular, I would like to be able to use the VPSLLVD...VPSRAVD per-lane variable shift operations (AVX2), and at some point perhaps the VPMAXSQ...VPMINUQ maximum/minimum operations (AVX-512).
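To be concrete about the semantics I have in mind, here is a scalar sketch of what VPSLLVD and VPSRAVD compute per 32-bit lane. The names `modelVPSLLVD` and `modelVPSRAVD` are made up for illustration and lists stand in for vector registers; this is a reference model, not a proposal for the primop names or types.

```haskell
import Data.Bits (shiftL, shiftR)
import Data.Int (Int32)
import Data.Word (Word32)

-- | Scalar model of VPSLLVD (hypothetical name): each 32-bit lane is
-- shifted left by the count in the matching lane of the second operand.
-- Counts above 31 produce 0.
modelVPSLLVD :: [Word32] -> [Word32] -> [Word32]
modelVPSLLVD = zipWith lane
  where
    lane x n
      | n > 31    = 0
      | otherwise = x `shiftL` fromIntegral n

-- | Scalar model of VPSRAVD (hypothetical name): arithmetic right shift
-- per lane. Counts above 31 fill the lane with the sign bit, which is
-- the same result as shifting by 31.
modelVPSRAVD :: [Int32] -> [Word32] -> [Int32]
modelVPSRAVD = zipWith lane
  where
    lane x n = x `shiftR` fromIntegral (min n 31)
```

The point is that the shift count differs per lane, which the existing vector primops cannot express without unpacking to scalars.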
It would be great if corresponding primops could be added. Otherwise, I would like to know: where are the primops actually defined? GHC.Prim itself seems to be merely an automatically generated dummy module, mostly for Haddock.
(On the other hand, I also find it a bit strange that there are vector primops for integer division, which SSE/AVX apparently do not support at all!)
Trac metadata
| Trac field | Value |
|---|---|
| Version | 8.0.1 |
| Type | FeatureRequest |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (LLVM) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |