- Feb 06, 2025
-
-
Cheng Shao authored
(cherry picked from commit 42826a89) (cherry picked from commit 3ea15740) (cherry picked from commit 30d4f0ce) (cherry picked from commit d5d19e5e)
-
Cheng Shao authored
This commit fixes a conflicting StgRun definition when building dynamic ways of the RTS for wasm in unregisterised mode. (cherry picked from commit bef94bde) (cherry picked from commit 6f844296) (cherry picked from commit e7806b12)
-
Cheng Shao authored
This commit wraps the predefined GlobalRegs in Wasm.S under a CPP guard to prevent them from being built in PIC mode. When building dynamic ways of the RTS, the wasm globals that represent STG GlobalRegs will be created and supplied by dyld.mjs. The current wasm dylink convention doesn't properly support exporting relocatable wasm globals at all; any wasm global exported by a .so is assumed to be a GOT.mem entry. (cherry picked from commit 98a32ec5) (cherry picked from commit cc67ef51) (cherry picked from commit 49d340a6)
-
Cheng Shao authored
This commit drops interpretBCO support from non-dynamic RTS ways on wasm. The bytecode interpreter is only useful when the RTS linker also works, and on wasm it only works for dynamic ways anyway. An additional benefit of dropping interpretBCO is a reduction in the code size of linked wasm modules, especially since interpretBCO references ffi_call, which is a large auto-generated function in libffi-wasm and is unused by most user applications. (cherry picked from commit 90a35c41) (cherry picked from commit 7a16cf7c) (cherry picked from commit 5fcee705)
-
Cheng Shao authored
This patch changes the mblock size to the page size on wasm. It allows us to simplify our wasi-libc fork, and makes it much easier to test third-party libc allocators like emmalloc/mimalloc and to experiment with the threaded RTS in wasm. (cherry picked from commit 558353f4) (cherry picked from commit f3ea9fb8) (cherry picked from commit f747805f)
-
Cheng Shao authored
This patch removes pre-C11 legacy code paths related to the INLINE_HEADER/STATIC_INLINE/EXTERN_INLINE macros, ensures EXTERN_INLINE is treated as static inline in most cases (fixes #24945), and also corrects the comments accordingly. (cherry picked from commit 35a64220) (cherry picked from commit 241f401d) (cherry picked from commit 562d9ad7)
-
Cheng Shao authored
This patch fixes I/O manager compilation errors for the win32 target, discovered when cross-compiling to win32 using a recent clang:
```
rts/win32/ThrIOManager.c:117:7: error:
 error: call to undeclared function 'is_io_mng_native_p'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
 117 | if (is_io_mng_native_p ()) {
 | ^
 |
117 | if (is_io_mng_native_p ()) {
 | ^
1 error generated.
`x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1)

rts/fs.c:143:28: error:
 error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
 143 | int setErrNoFromWin32Error () {
 | ^
 | void
 |
143 | int setErrNoFromWin32Error () {
 | ^
1 error generated.
`x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1)

rts/win32/ConsoleHandler.c:227:9: error:
 error: call to undeclared function 'interruptIOManagerEvent'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
 227 | interruptIOManagerEvent ();
 | ^
 |
227 | interruptIOManagerEvent ();
 | ^
rts/win32/ConsoleHandler.c:227:9: error:
 note: did you mean 'getIOManagerEvent'?
 |
227 | interruptIOManagerEvent ();
 | ^
rts/include/rts/IOInterface.h:27:10: error:
 note: 'getIOManagerEvent' declared here
 27 | void * getIOManagerEvent (void);
 | ^
 |
27 | void * getIOManagerEvent (void);
 | ^
1 error generated.
`x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1)

rts/win32/ConsoleHandler.c:196:9: error:
 error: call to undeclared function 'setThreadLabel'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
 196 | setThreadLabel(cap, t, "signal handler thread");
 | ^
 |
196 | setThreadLabel(cap, t, "signal handler thread");
 | ^
rts/win32/ConsoleHandler.c:196:9: error:
 note: did you mean 'postThreadLabel'?
 |
196 | setThreadLabel(cap, t, "signal handler thread");
 | ^
rts/eventlog/EventLog.h:118:6: error:
 note: 'postThreadLabel' declared here
 118 | void postThreadLabel(Capability *cap,
 | ^
 |
118 | void postThreadLabel(Capability *cap,
 | ^
1 error generated.
`x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1)
```
(cherry picked from commit 710665bd) (cherry picked from commit 9499ed96) (cherry picked from commit e19bea4a)
-
Cheng Shao authored
This commit adds an assertion to Bdescr() to check that the pointer is indeed heap allocated. This is useful to rule out RTS bugs that attempt to access the non-existent block descriptor of a static closure, #24492 being one such example. (cherry picked from commit d19441d7) (cherry picked from commit 8af2c3fd)
-
Cheng Shao authored
This commit exposes HeapAlloc.h as a public header. The intention is to expose HEAP_ALLOCED/HEAP_ALLOCED_GC, so they can be used in assertions in other public headers, and they may also be useful for user code. (cherry picked from commit dedcf102) (cherry picked from commit 8d38b450)
-
Cheng Shao authored
This commit removes the redundant logic of initializing each Capability's rCCCS to CCS_SYSTEM in initProfiling(). Before initProfiling() is called during RTS startup, each Capability's rCCCS has already been assigned CCS_SYSTEM when they're first initialized. (cherry picked from commit a7569495) (cherry picked from commit 3dace6e1)
-
Cheng Shao authored
This commit cleans up how we include the xxhash.h header so that it only defines XXH_INLINE_ALL, which is sufficient to inline the xxHash functions without symbol collisions. (cherry picked from commit ee01de7d) (cherry picked from commit 2c52e59b)
-
Cheng Shao authored
This commit enables the XXH3_64bits hash to be used on all 64-bit platforms. Previously it was only enabled on x86_64, so platforms like aarch64 silently fell back to using XXH32, which degrades the quality of the hashing function. (cherry picked from commit 4a97bdb8) (cherry picked from commit 8ed990f5)
-
Cheng Shao authored
(cherry picked from commit b19ec331) (cherry picked from commit 7c0a6f59)
-
Cheng Shao authored
We used to have a MIN_UPD_SIZE macro that described the minimum reserved size for thunks, so that a thunk can be overwritten in place with an indirection or blackhole. However, this macro has not actually been defined or used anywhere for a long time; StgThunkHeader already reserves a padding word for this purpose. Hence this patch drops stale mentions of MIN_UPD_SIZE. (cherry picked from commit c1e3719c) (cherry picked from commit 8d9efe4d)
-
Cheng Shao authored
(cherry picked from commit a4785b33) (cherry picked from commit 45935f5d)
-
Cheng Shao authored
See added comment for details. Closes #24423. (cherry picked from commit 20f80b77)
-
Cheng Shao authored
This patch fixes the STG_FIELD_OFFSET macro definition by using __builtin_offsetof, which is what gcc/clang use to implement offsetof in standard C. The previous definition, which used a NULL pointer, involves subtle undefined behavior in C and is thus reported by UndefinedBehaviorSanitizer as well:
```
rts/Capability.h:243:58: runtime error: member access within null pointer of type 'Capability' (aka 'struct Capability_')
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior rts/Capability.h:243:58
```
(cherry picked from commit 83341bbc)
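The difference between the two macro styles can be sketched as follows. This is only an illustration under stated assumptions: `struct Demo`, `FIELD_OFFSET` and `demo_count_offset` are hypothetical names, not the RTS definitions, and the old-style macro is shown in a comment purely for contrast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct standing in for an RTS structure such as Capability. */
struct Demo {
    char tag;
    long payload;
    int  count;
};

/* Old style, for contrast only: it forms a member access through a null
 * pointer, which is undefined behavior and is flagged by UBSan.
 *   #define FIELD_OFFSET_OLD(s, f) ((size_t)&(((s *)0)->f))
 */

/* Builtin provided by gcc and clang; no object is ever accessed. */
#define FIELD_OFFSET(s, f) __builtin_offsetof(s, f)

/* Returns the byte offset of the `count` field within struct Demo. */
size_t demo_count_offset(void) {
    size_t off = FIELD_OFFSET(struct Demo, count);
    /* The builtin must agree with the standard offsetof macro. */
    assert(off == offsetof(struct Demo, count));
    return off;
}
```

Since `__builtin_offsetof` is a compiler extension, this sketch assumes gcc or clang; portable code outside the RTS would normally use `offsetof` from `<stddef.h>` directly.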
-
Cheng Shao authored
The wasm backend didn't properly make use of all Cmm global registers due to #24347. Now that it is fixed, this patch re-enables full register mapping for wasm32, and we can now generate smaller & faster wasm modules that don't always spill arguments onto the stack. Fixes #22460 #24152. (cherry picked from commit 0cda2b8b) (cherry picked from commit f1f5068b398b1effb837add38ecc5303dc9a381f) (cherry picked from commit 1e695750)
-
Cheng Shao authored
This patch does a few things:
- Always build 64-bit atomic ops in rts/ghc-prim, even on 32-bit platforms
- Remove legacy "64bit" cabal flag of rts package
- Fix hs_xchg64 function prototype for 32-bit platforms
- Fix AtomicFetch test for wasm32

(cherry picked from commit 87095f6a)
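As a rough illustration of what 64-bit atomic operations look like in C, here is a minimal sketch using the gcc/clang `__atomic` builtins. The function names `demo_xchg64` and `demo_fetch_add64` are hypothetical and their signatures are not claimed to match `hs_xchg64` or any other RTS function.

```c
#include <assert.h>
#include <stdint.h>

/* Atomically store `val` into *p and return the previous contents.
 * Hypothetical helper; not the RTS prototype. */
uint64_t demo_xchg64(volatile uint64_t *p, uint64_t val) {
    assert(p);
    return __atomic_exchange_n(p, val, __ATOMIC_SEQ_CST);
}

/* Atomically add `delta` to *p and return the value held before the add. */
uint64_t demo_fetch_add64(volatile uint64_t *p, uint64_t delta) {
    assert(p);
    return __atomic_fetch_add(p, delta, __ATOMIC_SEQ_CST);
}
```

On 32-bit targets without native 64-bit atomic instructions, these builtins may be lowered to calls into a support library such as libatomic, so linking requirements can differ by platform.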
-
Cheng Shao authored
The MBLOCK_SHIFT macro must be the single source of truth for defining the mblock size, and changing it should only affect performance, not correctness. This patch makes it truly possible to reconfigure mblock size, at least on 32-bit targets, by fixing places which implicitly relied on the previous MBLOCK_SHIFT constant. Fixes #22901. (cherry picked from commit 1928c7f3)
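The "single source of truth" idea can be sketched as below: every derived quantity is computed from the shift, so changing the shift never leaves a stale hard-coded constant behind. The `DEMO_` names and the value 20 are assumptions made for this example and are not the RTS definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the only tunable is the shift. */
#define DEMO_MBLOCK_SHIFT 20
#define DEMO_MBLOCK_SIZE  ((uintptr_t)1 << DEMO_MBLOCK_SHIFT)
#define DEMO_MBLOCK_MASK  (DEMO_MBLOCK_SIZE - 1)

/* Round an address down to the start of the mblock containing it. */
uintptr_t demo_mblock_round_down(uintptr_t addr) {
    return addr & ~DEMO_MBLOCK_MASK;
}

/* Round an address up to the next mblock boundary (identity if aligned). */
uintptr_t demo_mblock_round_up(uintptr_t addr) {
    /* Guard against wrap-around near the top of the address space. */
    assert(addr <= UINTPTR_MAX - DEMO_MBLOCK_MASK);
    return (addr + DEMO_MBLOCK_MASK) & ~DEMO_MBLOCK_MASK;
}
```

Code that instead writes a literal such as `0xfffff` or `1024 * 1024` would silently break when the shift changes, which is the kind of implicit dependency the commit message describes removing.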
-
- Jan 31, 2025
-
-
Luite Stegeman authored
This makes it easier for packages to use GHC 9.6.7 because the symbols exported by the rts are the same as in 9.6.6. This means that we define the symbol in the ghc library instead, where we have to pessimistically assume a threaded rts.
-
- Jan 20, 2025
- Jan 16, 2025
-
-
In #22010 we established that Int was not always sufficient to store all the uniques we generate during compilation on 32-bit platforms. This commit addresses that problem by using Word64 instead of Int for uniques.

The core of the change is in GHC.Core.Types.Unique and GHC.Core.Types.Unique.Supply. However, the representation of uniques is used in many other places, so those needed changes too. Additionally, the RTS has been extended with an atomic_inc64 operation.

One major change from this commit is the introduction of the Word64Set and Word64Map data types. These are adapted versions of IntSet and IntMap from the containers package. These are planned to be upstreamed in the future.

As a natural consequence of these changes, the compiler will be a bit slower and take more space on 32-bit platforms. Our CI tests indicate around a 5% residency increase.

Metric Increase: CoOpt_Read CoOpt_Singletons LargeRecord ManyAlternatives ManyConstructors MultiComponentModules MultiComponentModulesRecomp MultiLayerModulesTH_OneShot RecordUpdPerf T10421 T10547 T12150 T12227 T12234 T12425 T12707 T13035 T13056 T13253 T13253-spj T13379 T13386 T13719 T14683 T14697 T14766 T15164 T15703 T16577 T16875 T17516 T18140 T18223 T18282 T18304 T18698a T18698b T18923 T1969 T19695 T20049 T21839c T3064 T3294 T4801 T5030 T5321FD T5321Fun T5631 T5642 T5837 T6048 T783 T8095 T9020 T9198 T9233 T9630 T9675 T9872a T9872b T9872b_defer T9872c T9872d T9961 TcPlugin_RewritePerf UniqLoop WWRec hard_hole_fits MultiLayerModulesTH_Make T21839r mhu-perf

Metric Decrease: MultiLayerModulesTH_Make

(cherry picked from commit 9edcb1fb)
-
Previously the maximum number of capabilities supported by the RTS was statically capped at 256. However, this bound is uncomfortably low given the size of today's machines. While supporting unbounded, fully-dynamic adjustment would be nice, it is complex, and so instead we do something simpler: probe the logical core count at RTS startup and use this as the static bound for the rest of our execution. This should avoid users running into the capability limit on large machines while avoiding wasting memory on a large capabilities array for most users and keeping complexity at bay. Addresses #25560. (cherry picked from commit 71f050b7)
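The "probe once at startup, then treat as a fixed bound" approach can be sketched as follows. This is a simplified illustration, not the RTS code: `demo_max_capabilities`, `demo_init_capability_bound` and `demo_capability_count_ok` are hypothetical names, and the sketch assumes a Unix-like system where `sysconf(_SC_NPROCESSORS_ONLN)` is available.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical bound; set once at startup and never changed afterwards. */
static uint32_t demo_max_capabilities = 0;

/* Probe the number of online logical cores; fall back to 1 if the
 * query fails or reports a non-positive value. */
void demo_init_capability_bound(void) {
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    demo_max_capabilities = (n > 0) ? (uint32_t)n : 1;
}

/* Returns 1 if `requested` capabilities fit within the bound, else 0.
 * Rejecting out-of-range requests up front avoids indexing past the
 * end of a capabilities array sized from the bound. */
int demo_capability_count_ok(uint32_t requested) {
    assert(demo_max_capabilities > 0); /* init must have run first */
    return requested >= 1 && requested <= demo_max_capabilities;
}
```

The trade-off in this design is that the bound cannot grow if cores are added after startup, in exchange for a fixed-size array and no reallocation logic.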
-
Just a stylistic change. (cherry picked from commit d488470b)
-
Addresses #25560. (cherry picked from commit 06265655)
-
The foreign imports of `enabled_capabilities` and `getNumberOfProcessors` were declared as `CInt` whereas they are defined as `uint32_t`. (cherry picked from commit e10d31ad)
-
It was noticed in #25560 that this would previously be allowed, resulting in a segfault. I will add a proper exception in `base` in a future commit. (cherry picked from commit f08a72eb)
-
(cherry picked from commit 20912f5b)
-
Previously the structure of `mmapInRegion` concealed a subtle bug concerning handling of `mmap` returning mappings below the beginning of the desired region. Specifically, we would reset `p = result + bytes` and then again reset `p = region->start` before looping around for another iteration. This resulted in an infinite loop on FreeBSD. Fixes #25492. (cherry picked from commit 292ed74e)
-
(cherry picked from commit 639f0149)
-
This patch fixes a previously unnoticed instance of undefined behavior in the bytecode interpreter. It can be caught by building `rts/Interpreter.c` with `-fsanitize=pointer-overflow`; the warning message is something like:
```
rts/Interpreter.c:1369:13: runtime error: addition of unsigned offset to 0x004200197660 overflowed to 0x004200197658
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior rts/Interpreter.c:1369:13
rts/Interpreter.c:1265:13: runtime error: addition of unsigned offset to 0x004200197660 overflowed to 0x004200197658
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior rts/Interpreter.c:1265:13
rts/Interpreter.c:1645:13: runtime error: addition of unsigned offset to 0x0042000b22f8 overflowed to 0x0042000b22f0
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior rts/Interpreter.c:1645:13
```
Whenever we do something like `SpW(-1)`, the negative argument is implicitly converted to an unsigned integer type and causes pointer arithmetic overflow. It happens to be harmless for most targets, since the overflow wraps the result around to the desired value, but that is coincidental and it is still undefined behavior. Furthermore, it causes real damage to the wasm backend, given that clang-20 will emit invalid wasm code that crashes at run-time for this kind of C code! (see https://github.com/llvm/llvm-project/issues/108770) The fix here is to add explicit casts to ensure we always use the signed `ptrdiff_t` type as the right-hand operand of pointer arithmetic. (cherry picked from commit 5bcfefd5)
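One way such an implicit conversion can arise, and how a `ptrdiff_t` cast avoids it, is sketched below. The names `word_t`, `WORD_AT` and `word_below` are hypothetical, and the assumption that the offset is scaled by `sizeof` is made for illustration only; the actual macros in `rts/Interpreter.c` may be shaped differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word_t; /* hypothetical stand-in for the RTS word type */

/* Problematic shape, for contrast only: (n) * sizeof(word_t) has the
 * unsigned type size_t, so n = -1 turns into a huge byte offset and the
 * pointer addition overflows.
 *   #define WORD_AT_BAD(sp, n) \
 *       (*(word_t *)((uint8_t *)(sp) + (n) * sizeof(word_t)))
 */

/* Fixed shape: both factors are cast to the signed ptrdiff_t, so a
 * negative word index yields a negative byte offset. */
#define WORD_AT(sp, n) \
    (*(word_t *)((uint8_t *)(sp) + (ptrdiff_t)(n) * (ptrdiff_t)sizeof(word_t)))

/* Reads the word immediately below sp; sp must not point at the first
 * element of its array. */
word_t word_below(word_t *sp) {
    assert(sp != NULL);
    return WORD_AT(sp, -1);
}
```

With the signed cast the arithmetic stays within the bounds of the underlying array, so the sanitizer has nothing to report and the compiler is not given licence to miscompile the access.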
-
This patch is part of the patches upstreamed from haskell.nix. See https://github.com/input-output-hk/haskell.nix/pull/1960 for the original report/patch. (cherry picked from commit c749bdfd)
-
Part of the patches upstreamed from haskell.nix. (cherry picked from commit bfe4b3d3)
-
Remove unjustified +8 offset that leads to memory corruption (cf discussion in #24432). (cherry picked from commit c34fef56)
-
(cherry picked from commit 52d66984)
-
Use the M32 allocator to avoid fragmentation when allocating ELF sections. We already did this when NEED_PLT was undefined. Failing to do this led to relocations that were impossible to fulfil (#24432). (cherry picked from commit 5104ee61)
-
This patch fixes an error message in checkClosure() when the closure has already been evacuated. The previous logic was meant to print the evacuated closure's type in the error message, but it was completely wrong, given that info was not really an info table but a tagged pointer to the closure's new address. (cherry picked from commit 0d3bc2fa)
-
Fixes #24487 (cherry picked from commit 23c3e624)
-