Specialize the min/max functions for GHC-provided instances.
In particular, for Int/Word/Float/Double we should aim to provide branchless code where reasonable.
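To make "branchless" concrete for the Int case, here is a minimal sketch using the comparison primops from GHC.Exts, which return 0# or 1# rather than a Bool. The names branchlessMinInt and branchlessMaxInt are made up for illustration and are not part of base or ghc-prim. Whether the backend really emits branch-free machine code for this is an assumption that would need to be checked against the generated assembly.

```haskell
{-# LANGUAGE MagicHash #-}
module BranchlessInt (branchlessMinInt, branchlessMaxInt) where

import GHC.Exts (Int (I#), andI#, negateInt#, xorI#, (<#), (>#))

-- Hypothetical helper (not in base): pick x when x < y, else y.
-- (x <# y) is 1# or 0#, so negating it gives an all-ones or all-zeros mask.
branchlessMinInt :: Int -> Int -> Int
branchlessMinInt (I# x) (I# y) =
  let mask = negateInt# (x <# y)   -- -1# if x < y, otherwise 0#
  in I# (y `xorI#` ((x `xorI#` y) `andI#` mask))

-- Hypothetical helper (not in base): pick x when x > y, else y.
branchlessMaxInt :: Int -> Int -> Int
branchlessMaxInt (I# x) (I# y) =
  let mask = negateInt# (x ># y)   -- -1# if x > y, otherwise 0#
  in I# (y `xorI#` ((x `xorI#` y) `andI#` mask))
```

The same masking trick should carry over to Word with the corresponding Word# primops. Float and Double are different, since there the win would come from a dedicated instruction rather than bit manipulation.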
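Especially when using SSE we should just use a single
    minss
(or minsd for Double) instead of a compare-and-branch sequence like
    ucomisd %xmm1,%xmm0
    jp .Lc5vv
    jbe .Lc5vw
which is worse in all kinds of ways: more instructions, and branches that can be mispredicted.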
If someone feels like tackling this I'm happy to help with code review, answering questions, etc.
I won't get around to doing it myself any time soon.
As a starting point the instance declarations are in ghc-prim/GHC/Classes.hs.
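For orientation, here is a rough sketch of what "specializing" an instance means at the source level. When an Ord instance does not define min and max, it gets the class defaults, which are written in terms of (<=). The wrapper type D and the helpers minDouble/maxDouble below are hypothetical stand-ins: in a real patch the helpers would presumably be backed by whatever new primitive gets added, whereas here they just repeat the default logic so that the example compiles.

```haskell
module SpecializedOrd (D (..), minDouble, maxDouble) where

-- Hypothetical wrapper type, used only for illustration; not a GHC type.
newtype D = D Double
  deriving (Eq, Show)

-- Placeholders (assumption): stand-ins for a future primitive-backed version.
-- They deliberately mirror the Ord class defaults for min and max.
minDouble, maxDouble :: Double -> Double -> Double
minDouble x y = if x <= y then x else y
maxDouble x y = if x <= y then y else x

instance Ord D where
  compare (D x) (D y) = compare x y
  D x <= D y = x <= y
  -- The point of the sketch: min and max are given explicitly in the
  -- instance instead of being inherited from the class defaults.
  min (D x) (D y) = D (minDouble x y)
  max (D x) (D y) = D (maxDouble x y)
```

Any replacement for the placeholders would have to keep the results of the current definitions for NaN and signed zeros, so that is worth checking before swapping in a hardware min/max instruction.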
I would look into solving this by adding a new MachOp/PrimOp, but maybe there are even better ways.