nativeGen/AArch64: Fix sign extension in MulMayOflo
Previously the 32-bit implementations of MulMayOflo used a nonsensical sign-extension mode. Rewrite these to reflect what gcc 11 produces. Also rework the 16- and 8-bit cases similarly.
This now passes the MulMayOflo tests in ghc/test-primops in all four widths.
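
For reference, the behaviour the narrow-width cases are expected to have can be sketched in C as below. This is only an illustrative model, not the code emitted by the native code generator: the helper names are hypothetical, and it assumes the usual reading of MulMayOflo as "non-zero if the signed product does not fit in the operand width". Both operands are sign-extended to a wider type, multiplied exactly, and the product is checked against the narrow signed range, which is equivalent to comparing the wide product with the sign extension of its own low bits.

```c
#include <stdint.h>

/* Hypothetical reference helpers, not GHC code.
 * Each returns 1 if the signed product overflows the operand width,
 * and 0 otherwise. */

static int mul_may_oflo_i32(int32_t a, int32_t b) {
    /* Sign-extend both operands to 64 bits so the product is exact. */
    int64_t wide = (int64_t)a * (int64_t)b;
    /* Overflow iff the exact product is outside the int32_t range. */
    return wide < INT32_MIN || wide > INT32_MAX;
}

static int mul_may_oflo_i16(int16_t a, int16_t b) {
    /* A 32-bit product of two 16-bit values cannot itself overflow. */
    int32_t wide = (int32_t)a * (int32_t)b;
    return wide < INT16_MIN || wide > INT16_MAX;
}

static int mul_may_oflo_i8(int8_t a, int8_t b) {
    /* Same scheme: widen, multiply exactly, range-check. */
    int32_t wide = (int32_t)a * (int32_t)b;
    return wide < INT8_MIN || wide > INT8_MAX;
}
```

For example, under these assumptions mul_may_oflo_i32(65536, 65536) reports an overflow, while mul_may_oflo_i8(-16, 8) does not, since -128 still fits in 8 bits.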
Fixes #23721.