Add primops for MMX parallel subtract and parallel add intrinsics

Parallel subtract: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=MMX&text=_m_psub&expand=5805

Parallel add: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=MMX&text=_m_padd&expand=5805

These do byte-wise, short-wise, and word-wise subtraction and addition within a word64. There is also a saturated version of each, which clamps each result at the maximum or minimum representable value instead of wrapping on overflow/underflow.
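To make the difference between the wrapping and saturated forms concrete, here is a minimal reference sketch in C for the byte-wise case. It is a plain per-lane loop, not the proposed primop and not the MMX instruction itself. The function names `lanes_add_u8_wrap` and `lanes_add_u8_sat` are hypothetical, and the saturated version assumes unsigned lanes (MMX also has signed saturating forms, which would clamp to a different range).

```c
#include <stdint.h>

/* Hypothetical reference: add eight unsigned 8-bit lanes packed in a
   64-bit word, with each lane wrapping around modulo 256. */
uint64_t lanes_add_u8_wrap(uint64_t x, uint64_t y) {
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t a = (uint8_t)(x >> (8 * i));
        uint8_t b = (uint8_t)(y >> (8 * i));
        uint8_t s = (uint8_t)(a + b);          /* truncation gives the wrap */
        r |= (uint64_t)s << (8 * i);
    }
    return r;
}

/* Hypothetical reference: same lanes, but each lane is clamped to 0xFF
   instead of wrapping (unsigned saturation assumed). */
uint64_t lanes_add_u8_sat(uint64_t x, uint64_t y) {
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        unsigned a = (unsigned)((x >> (8 * i)) & 0xFF);
        unsigned b = (unsigned)((y >> (8 * i)) & 0xFF);
        unsigned s = a + b;
        if (s > 0xFF) s = 0xFF;                /* clamp at the lane maximum */
        r |= (uint64_t)s << (8 * i);
    }
    return r;
}
```

For example, adding `0x01FF` and `0x0101` gives `0x0200` with wrapping (the low lane overflows to `0x00` without carrying into its neighbour) and `0x02FF` with saturation.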

These instructions are significantly faster than broadword programming, which in turn is faster than word-at-a-time processing.

On systems that don't have these instructions the behaviour can be emulated with broadword programming.
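As an illustration of what such a fallback might look like, here is a sketch in C of the well-known broadword technique for wrapping byte-wise addition and subtraction on a 64-bit word. The names `bw_add_u8` and `bw_sub_u8` are hypothetical and this is not necessarily the emulation that would be used for the primops. The idea is to compute the low 7 bits of every lane with the top bits masked so no carry or borrow can cross a lane boundary, then patch in each lane's top bit with an XOR.

```c
#include <stdint.h>

/* Mask selecting the top bit of each of the eight byte lanes. */
#define BW_H 0x8080808080808080ULL

/* Hypothetical broadword emulation of wrapping byte-wise add.
   Adding only the low 7 bits of each lane cannot carry into the next
   lane; the top bit of each lane is then x7 ^ y7 ^ carry-in. */
uint64_t bw_add_u8(uint64_t x, uint64_t y) {
    uint64_t low = (x & ~BW_H) + (y & ~BW_H);
    return low ^ ((x ^ y) & BW_H);
}

/* Hypothetical broadword emulation of wrapping byte-wise subtract.
   Forcing the top bit of each lane of x to 1 and of y to 0 guarantees
   no borrow leaves a lane; the top bit is then fixed up with XOR. */
uint64_t bw_sub_u8(uint64_t x, uint64_t y) {
    uint64_t low = (x | BW_H) - (y & ~BW_H);
    return low ^ ((x ^ ~y) & BW_H);
}
```

For example, `bw_sub_u8(0x0102, 0x0003)` yields `0x01FF`: the low lane wraps from `0x02 - 0x03` to `0xFF` without borrowing from the lane above. The same pattern extends to 16-bit and 32-bit lanes by changing the mask; the saturated variants need additional steps not shown here.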

Details to follow.
