From e5d0c19d1de843bd0b895910ee25770620a0b5b8 Mon Sep 17 00:00:00 2001 From: Cheng Shao <terrorjack@type.dance> Date: Wed, 16 Oct 2024 01:34:25 +0200 Subject: [PATCH] rts: fix pointer overflow undefined behavior in bytecode interpreter This patch fixes an unnoticed undefined behavior in the bytecode interpreter. It can be caught by building `rts/Interpreter.c` with `-fsanitize=pointer-overflow`, the warning message is something like: ``` rts/Interpreter.c:1369:13: runtime error: addition of unsigned offset to 0x004200197660 overflowed to 0x004200197658 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior rts/Interpreter.c:1369:13 rts/Interpreter.c:1265:13: runtime error: addition of unsigned offset to 0x004200197660 overflowed to 0x004200197658 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior rts/Interpreter.c:1265:13 rts/Interpreter.c:1645:13: runtime error: addition of unsigned offset to 0x0042000b22f8 overflowed to 0x0042000b22f0 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior rts/Interpreter.c:1645:13 ``` Whenever we do something like `SpW(-1)`, the negative argument is implicitly converted to an unsigned integer type and causes pointer arithmetic overflow. It happens to be harmless for most targets since overflowing would wrap the result to desired value, but it's still coincidental and undefined behavior. Furthermore, it causes real damage to the wasm backend, given clang-20 will emit invalid wasm code that crashes at run-time for this kind of C code! (see https://github.com/llvm/llvm-project/issues/108770) The fix here is adding some explicit casts to ensure we always use the signed `ptrdiff_t` type as right hand operand of pointer arithmetic. (cherry picked from commit 5bcfefd5bb73c18a9bad63d1813968832b696f9a) (cherry picked from commit 990d79a01bf8ed124c5475e49f856897a91992af) (cherry picked from commit 213428e4112fdd9da441d4080c7f689a4a6e8b2b) --- rts/Interpreter.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/rts/Interpreter.c b/rts/Interpreter.c index 020bc5fe0d7..d9bfb5f39a4 100644 --- a/rts/Interpreter.c +++ b/rts/Interpreter.c @@ -158,11 +158,11 @@ tag functions as tag inference currently doesn't rely on those being properly ta cap->r.rRet = (retcode); \ return cap; -#define Sp_plusB(n) ((void *)(((StgWord8*)Sp) + (n))) -#define Sp_minusB(n) ((void *)(((StgWord8*)Sp) - (n))) +#define Sp_plusB(n) ((void *)((StgWord8*)Sp + (ptrdiff_t)(n))) +#define Sp_minusB(n) ((void *)((StgWord8*)Sp - (ptrdiff_t)(n))) -#define Sp_plusW(n) (Sp_plusB((n) * sizeof(W_))) -#define Sp_minusW(n) (Sp_minusB((n) * sizeof(W_))) +#define Sp_plusW(n) (Sp_plusB((ptrdiff_t)(n) * (ptrdiff_t)sizeof(W_))) +#define Sp_minusW(n) (Sp_minusB((ptrdiff_t)(n) * (ptrdiff_t)sizeof(W_))) #define Sp_addB(n) (Sp = Sp_plusB(n)) #define Sp_subB(n) (Sp = Sp_minusB(n)) -- GitLab