... | ... | @@ -464,6 +464,131 @@ Note that these constants are of type Int since top level values of type Int\# a |
|
|
|
|
|
The native-sized vector types are distinct types from the explicit-sized vector types, not type aliases for the corresponding explicit-sized vector. This is to support and encourage portable code.
|
|
|
|
|
|
## Vector operations
|
|
|
|
|
|
|
|
|
The following operations on vectors will be supported. They will need to be implemented at the Haskell/Core primop layer and the Cmm MachOp layer, with optional support in the code generators.
|
|
|
|
|
|
|
|
|
Extracting and inserting vector elements:
|
|
|
|
|
|
```wiki
extractInt<w>Vec<m>#  :: Int<w>Vec<m>#  -> Int# -> Int#
extractWord<w>Vec<m># :: Word<w>Vec<m># -> Int# -> Word#
extractFloatVec<m>#   :: FloatVec<m>#   -> Int# -> Float#
extractDoubleVec<m>#  :: DoubleVec<m>#  -> Int# -> Double#
```
|
|
|
|
|
|
```wiki
insertInt<w>Vec<m>#  :: Int<w>Vec<m>#  -> Int# -> Int#    -> Int<w>Vec<m>#
insertWord<w>Vec<m># :: Word<w>Vec<m># -> Int# -> Word#   -> Word<w>Vec<m>#
insertFloatVec<m>#   :: FloatVec<m>#   -> Int# -> Float#  -> FloatVec<m>#
insertDoubleVec<m>#  :: DoubleVec<m>#  -> Int# -> Double# -> DoubleVec<m>#
```
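
To make the intended semantics concrete, here is a minimal sketch that models one instance of the extract and insert operations in ordinary boxed Haskell. The names `Int32Vec4`, `extractInt32Vec4` and `insertInt32Vec4` are hypothetical stand-ins, not the proposed primops; zero-based lane indices and the behaviour on out-of-range indices are assumptions of the model.

```haskell
import Data.Int (Int32)

-- Hypothetical boxed stand-in for Int32Vec4#: a list holding four lanes.
newtype Int32Vec4 = Int32Vec4 [Int32] deriving (Eq, Show)

-- Model of extractInt32Vec4#: read the lane at a zero-based index.
-- (The proposed primop returns Int#; the model keeps the lane type.)
-- An out-of-range index is a runtime error in this model.
extractInt32Vec4 :: Int32Vec4 -> Int -> Int32
extractInt32Vec4 (Int32Vec4 xs) i = xs !! i

-- Model of insertInt32Vec4#: return a copy with exactly one lane replaced.
-- An out-of-range index leaves the vector unchanged in this model.
insertInt32Vec4 :: Int32Vec4 -> Int -> Int32 -> Int32Vec4
insertInt32Vec4 (Int32Vec4 xs) i x =
  Int32Vec4 [if j == i then x else y | (j, y) <- zip [0 ..] xs]
```

Insert is modelled as a pure function returning a new vector, which matches the value-to-value shape of the signatures above.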
|
|
|
|
|
|
|
|
|
Vector shuffle:
|
|
|
|
|
|
```wiki
shuffleInt<w>Vec<m>ToVec<m'># :: Int<w>Vec<m># -> Int32Vec<m'># -> Int<w>Vec<m'>#
```
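
One plausible reading of the shuffle signature is that the second argument is a vector of lane indices, and that lane `k` of the result is the source lane selected by index `k`. The sketch below models that reading with lists; the function name `shuffle` and the zero-based indexing are assumptions made for illustration.

```haskell
import Data.Int (Int32)

-- Model of a shuffle: the result has one lane per index, so the result
-- width m' is the length of the index vector, not of the source vector.
-- An index outside the source vector is a runtime error in this model.
shuffle :: [a] -> [Int32] -> [a]
shuffle src idxs = [src !! fromIntegral i | i <- idxs]
```

Under this reading, `shuffle [1, 2, 3, 4] [3, 2, 1, 0]` reverses a four-lane vector, and a two-element index vector narrows the result to two lanes.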
|
|
|
|
|
|
|
|
|
For the fixed-size vectors (as opposed to the native-size ones) we may also want to add pack/unpack functions like:
|
|
|
|
|
|
```wiki
unpackInt<w>Vec4# :: Int<w>Vec4# -> (# Int#, Int#, Int#, Int# #)
packInt<w>Vec4#   :: (# Int#, Int#, Int#, Int# #) -> Int<w>Vec4#
```
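
As a rough illustration, pack and unpack can be modelled as converting between a four-lane vector and a tuple of its lanes. The type `IntVec4` and both function names are hypothetical, and ordinary boxed tuples stand in for the unboxed tuples in the signatures above.

```haskell
-- Hypothetical boxed stand-in for Int<w>Vec4#.
data IntVec4 = IntVec4 Int Int Int Int deriving (Eq, Show)

-- Model of unpackInt<w>Vec4#: expose all four lanes at once.
unpackIntVec4 :: IntVec4 -> (Int, Int, Int, Int)
unpackIntVec4 (IntVec4 a b c d) = (a, b, c, d)

-- Model of packInt<w>Vec4#: build a vector from four scalars.
packIntVec4 :: (Int, Int, Int, Int) -> IntVec4
packIntVec4 (a, b, c, d) = IntVec4 a b c d
```

The two functions are inverses of each other, which is the property a real implementation would be expected to preserve.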
|
|
|
|
|
|
|
|
|
In the following, `<t>` ranges over `Int<w>`, `Word<w>`, `Float`, `Double`.
|
|
|
|
|
|
|
|
|
Arithmetic operations:
|
|
|
|
|
|
```wiki
plus<t>Vec<m>#, minus<t>Vec<m>#,
times<t>Vec<m>#, quot<t>Vec<m>#, rem<t>Vec<m># :: <t>Vec<m># -> <t>Vec<m># -> <t>Vec<m>#

negate<t>Vec<m># :: <t>Vec<m># -> <t>Vec<m>#
```
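
The arithmetic operations are presumably lane-wise: each result lane depends only on the corresponding lanes of the arguments. A minimal list-based sketch of that assumption follows; the function names are illustrative and are not the proposed primops.

```haskell
-- Lane-wise arithmetic over vectors modelled as lists. Fixed-width
-- integer lane types wrap on overflow, as machine arithmetic would.
plusVec, minusVec, timesVec :: Num a => [a] -> [a] -> [a]
plusVec  = zipWith (+)
minusVec = zipWith (-)
timesVec = zipWith (*)

-- quot and rem are only meaningful for the integral lane types.
quotVec, remVec :: Integral a => [a] -> [a] -> [a]
quotVec = zipWith quot
remVec  = zipWith rem

negateVec :: Num a => [a] -> [a]
negateVec = map negate
```

The model puts `quot` and `rem` under an `Integral` constraint, so they do not apply to floating-point lanes.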
|
|
|
|
|
|
|
|
|
Logic operations:
|
|
|
|
|
|
```wiki
andInt<w>Vec<m>#, orInt<w>Vec<m>#, xorInt<w>Vec<m># :: Int<w>Vec<m># -> Int<w>Vec<m># -> Int<w>Vec<m>#
andWord<w>Vec<m>#, orWord<w>Vec<m>#, xorWord<w>Vec<m># :: Word<w>Vec<m># -> Word<w>Vec<m># -> Word<w>Vec<m>#

notInt<w>Vec<m># :: Int<w>Vec<m># -> Int<w>Vec<m>#
notWord<w>Vec<m># :: Word<w>Vec<m># -> Word<w>Vec<m>#

shiftLInt<w>Vec<m>#, shiftRAInt<w>Vec<m># :: Int<w>Vec<m># -> Word# -> Int<w>Vec<m>#
shiftLWord<w>Vec<m>#, shiftRLWord<w>Vec<m># :: Word<w>Vec<m># -> Word# -> Word<w>Vec<m>#
```
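
The sketch below models the logic operations with `Data.Bits`, assuming they act lane-wise and that the single `Word#` shift count applies to every lane. It also shows why the signed and unsigned right shifts are named differently: on `Int8` lanes `shiftR` is arithmetic (sign-preserving), while on `Word8` lanes it is logical (zero-filling). All function names here are illustrative.

```haskell
import Data.Bits (Bits, complement, shiftR, xor, (.&.), (.|.))
import Data.Int (Int8)
import Data.Word (Word8)

-- Lane-wise bitwise operations on vectors modelled as lists.
andVec, orVec, xorVec :: Bits a => [a] -> [a] -> [a]
andVec = zipWith (.&.)
orVec  = zipWith (.|.)
xorVec = zipWith xor

notVec :: Bits a => [a] -> [a]
notVec = map complement

-- Arithmetic right shift: the sign bit is replicated into vacated bits.
shiftRAInt8Vec :: [Int8] -> Word -> [Int8]
shiftRAInt8Vec v n = map (`shiftR` fromIntegral n) v

-- Logical right shift: vacated bits are filled with zeros.
shiftRLWord8Vec :: [Word8] -> Word -> [Word8]
shiftRLWord8Vec v n = map (`shiftR` fromIntegral n) v
```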
|
|
|
|
|
|
|
|
|
Comparison:
|
|
|
|
|
|
```wiki
cmp<eq,ne,gt,ge,lt,le>Int<w>Vec<m># :: Int<w>Vec<m># -> Int<w>Vec<m># -> Word<w>Vec<m>#
cmp<eq,ne,gt,ge,lt,le>Word<w>Vec<m># :: Word<w>Vec<m># -> Word<w>Vec<m># -> Word<w>Vec<m>#
```
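
The text does not say how a comparison result is encoded in the `Word<w>Vec<m>#` it returns. A common convention on SIMD hardware is a per-lane mask: all bits set where the comparison holds, all bits clear where it does not. The sketch below assumes that convention; the encoding and the function names are assumptions, not part of the proposal.

```haskell
import Data.Int (Int32)
import Data.Word (Word32)

-- Lane-wise comparison producing a mask vector (assumed encoding:
-- maxBound, i.e. all ones, for True and 0 for False).
cmpWith :: (Int32 -> Int32 -> Bool) -> [Int32] -> [Int32] -> [Word32]
cmpWith p = zipWith (\x y -> if p x y then maxBound else 0)

cmpEqVec, cmpNeVec, cmpGtVec, cmpGeVec, cmpLtVec, cmpLeVec
  :: [Int32] -> [Int32] -> [Word32]
cmpEqVec = cmpWith (==)
cmpNeVec = cmpWith (/=)
cmpGtVec = cmpWith (>)
cmpGeVec = cmpWith (>=)
cmpLtVec = cmpWith (<)
cmpLeVec = cmpWith (<=)
```

A mask in this form can be fed straight into the lane-wise `and`/`or` operations to select between two vectors without branching.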
|
|
|
|
|
|
|
|
|
Note that LLVM does not yet support the comparison operations.
|
|
|
|
|
|
TODO
|
|
|
|
|
|
- conversion sign/width operations, e.g. Word \<-\> Int, Word8 \<-\> Word16 etc.
- conversion fp operations, e.g. Float \<-\> Int
|
|
|
|
|
|
|
|
|
Should also consider:
|
|
|
|
|
|
- vector constants, at least at Cmm level
- replicating a scalar to a vector
- AVX also supports a bunch of interesting things:
  - permute, shuffle, "blend", masked moves
  - min, max within a vector
  - average
  - horizontal add/sub
  - shift whole vector left/right by n bytes
  - gather (but not scatter) of 32- and 64-bit int and fp from memory (base + vector of offsets)
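
Two of the items above, replicating a scalar and gather, have simple reference semantics that can be sketched with lists. In this sketch memory is modelled as a list indexed by element (not byte) offsets, and the names `replicateVec` and `gatherVec` are hypothetical.

```haskell
-- Broadcast one scalar into every lane of an m-lane vector.
replicateVec :: Int -> a -> [a]
replicateVec m x = replicate m x

-- Gather: lane k of the result is the memory element at base + offsets[k].
-- An address outside the modelled memory is a runtime error here.
gatherVec :: [a] -> Int -> [Int] -> [a]
gatherVec mem base offsets = [mem !! (base + o) | o <- offsets]
```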
|
|
|
|
|
|
### Int/Word size wrinkle
|
|
|
|
|
|
|
|
|
Note that there is a wrinkle with the 32 and 64 bit int and word types. For example, the types for the extract functions should be:
|
|
|
|
|
|
```wiki
extractInt32Vec<m>#  :: Int32Vec<m>#  -> Int# -> INT32
extractInt64Vec<m>#  :: Int64Vec<m>#  -> Int# -> INT64
extractWord32Vec<m># :: Word32Vec<m># -> Int# -> WORD32
extractWord64Vec<m># :: Word64Vec<m># -> Int# -> WORD64
```
|
|
|
|
|
|
|
|
|
where `INT32`, `INT64`, `WORD32` and `WORD64` are CPP macros that expand in an architecture-dependent way to the types Int\#/Int64\# and Word\#/Word64\#.
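
The architecture dependence can be written down as a small lookup. The sketch assumes only 32-bit and 64-bit targets, and assumes that the 64-bit scalars need the dedicated Int64\#/Word64\# types only where the native word is narrower than 64 bits. The `Arch` type and the `expand` function are illustrative, not an existing API.

```haskell
-- Target word size; only 32-bit and 64-bit targets are modelled.
data Arch = Arch32 | Arch64 deriving (Eq, Show)

-- What each macro would expand to on a given target (assumed mapping).
expand :: Arch -> String -> String
expand _      "INT32"  = "Int#"     -- 32 bits always fit in a native Int#
expand _      "WORD32" = "Word#"
expand Arch32 "INT64"  = "Int64#"   -- too wide for a 32-bit register
expand Arch32 "WORD64" = "Word64#"
expand Arch64 "INT64"  = "Int#"
expand Arch64 "WORD64" = "Word#"
expand _      other    = other      -- anything else is left untouched
```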
|
|
|
|
|
|
|
|
|
To describe this in the primop definition we might want something like:
|
|
|
|
|
|
```wiki
primop ExtractWordVecOp <w,m,t> "extractWord<w>Vec<m>#" GenPrimOp
  Word<w>Vec<m># -> Int# -> <t>
  with <w, m, t> in <8, 2,Word#>,<8, 4,Word#>,<8, 8,Word#>,<8, 16,Word#>,<8, 32,Word#>,
                    <16,2,Word#>,<16,4,Word#>,<16,8,Word#>,<16,16,Word#>,
                    <32,2,WORD32>,<32,4,WORD32>,<32,8,WORD32>,
                    <64,2,WORD64>,<64,4,WORD64>,
                    <"",2,WORD>,<"",4,WORD>
```
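
The explicit-width rows of the instance table follow a regular pattern, so they could be generated rather than written out by hand. The sketch below reproduces them under one inferred assumption: the table appears to list every power-of-two lane count from 2 to 32 whose total vector width is at most 256 bits. The names `instances` and `primopName` are hypothetical, and the native-size rows (`<"",2,WORD>`, `<"",4,WORD>`) are left out.

```haskell
-- (element width in bits, lane count, scalar result type) for each
-- explicit-width instance of the extractWord primop family.
instances :: [(Int, Int, String)]
instances =
  [ (w, m, scalarTy w)
  | w <- [8, 16, 32, 64]
  , m <- [2, 4, 8, 16, 32]
  , w * m <= 256            -- inferred cap: at most a 256-bit vector
  ]
  where
    scalarTy :: Int -> String
    scalarTy 32 = "WORD32"  -- arch-dependent macro
    scalarTy 64 = "WORD64"  -- arch-dependent macro
    scalarTy _  = "Word#"   -- 8 and 16 bit lanes widen to Word#

-- The primop name for one instance, e.g. "extractWord16Vec8#".
primopName :: (Int, Int, String) -> String
primopName (w, m, _) = "extractWord" ++ show w ++ "Vec" ++ show m ++ "#"
```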
|
|
|
|
|
|
|
|
|
To iron out this wrinkle we would need the whole family of primitive types (Int8\#, Int16\#, Int32\# etc.), whereas currently only the native register-sized Int\# type is provided, plus a primitive Int64\# type on 32-bit systems.
|
|
|
|
|
|
## Data Parallel Haskell layer
|
|
|
|
|
|
|
... | ... | |