Use wasm bulk-memory opcodes in wasm NCG
Currently, the wasm NCG lowers MO_Memcpy/MO_Memset/MO_Memmove to calls to memcpy/memset/memmove in libc. The bulk-memory feature set provides single wasm opcodes for these operations (memory.copy covers both memcpy and memmove, and memory.fill covers memset), so the wasm NCG should emit those opcodes directly and avoid the function call overhead.
Note that we enable bulk-memory by default in our builds, but it should still be possible to do a wasi-sdk/wasm32-wasi-ghc build without bulk-memory support. Therefore, the wasm NCG should also detect the presence of -mbulk-memory in the C compiler flags, and fall back to generating libc function calls if the flag is absent.
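The proposed behaviour can be sketched as a small standalone Haskell program. This is only an illustration, not the actual NCG code: `MemOp`, `Lowering`, `hasBulkMemory` and `lowerMemOp` are hypothetical names, and the real implementation would work on GHC's own Cmm and wasm assembly types. The sketch also assumes that an explicit `-mno-bulk-memory` should be honoured and that the last of the two flags wins, which this issue does not specify.

```haskell
-- | The three Cmm memory primitives named in this issue (hypothetical type).
data MemOp = Memcpy | Memset | Memmove
  deriving (Eq, Show)

-- | Hypothetical lowering result: a single wasm opcode or a libc call.
data Lowering
  = WasmOpcode String
  | LibcCall String
  deriving (Eq, Show)

-- | Scan the C compiler flags for bulk-memory support.
-- Assumption: the last of -mbulk-memory / -mno-bulk-memory wins,
-- and the feature is treated as absent when neither flag appears.
hasBulkMemory :: [String] -> Bool
hasBulkMemory = foldl step False
  where
    step _   "-mbulk-memory"    = True
    step _   "-mno-bulk-memory" = False
    step acc _                  = acc

-- | Pick a lowering for a memory primitive.
-- memory.copy permits overlapping ranges, so it serves memmove as well.
lowerMemOp :: Bool -> MemOp -> Lowering
lowerMemOp True  Memcpy  = WasmOpcode "memory.copy"
lowerMemOp True  Memmove = WasmOpcode "memory.copy"
lowerMemOp True  Memset  = WasmOpcode "memory.fill"
lowerMemOp False Memcpy  = LibcCall "memcpy"
lowerMemOp False Memmove = LibcCall "memmove"
lowerMemOp False Memset  = LibcCall "memset"
```

For example, `lowerMemOp (hasBulkMemory ["-O2", "-mbulk-memory"]) Memmove` selects `memory.copy`, while the same call with an empty flag list selects the libc `memmove` call. Only explicit flags are inspected here; whether the toolchain enables bulk-memory implicitly is outside the scope of this sketch.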