- May 24, 2024
- May 23, 2024
-
-
Detect ELF v2 on PowerPC 64-bit systems. Check for `_CALL_ELF` preprocessor macro. Fixes #21191
-
- May 22, 2024
-
-
Previously the entry code of the `stg_orig_thunk` frame failed to account for the size of the profiling header as it hard-coded the frame size. Fix this. Fixes #24809.
-
-
- May 17, 2024
-
-
This patch fixes I/O manager compilation errors for win32 target discovered when cross-compiling to win32 using recent clang: ``` rts/win32/ThrIOManager.c:117:7: error: error: call to undeclared function 'is_io_mng_native_p'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 117 | if (is_io_mng_native_p ()) { | ^ | 117 | if (is_io_mng_native_p ()) { | ^ 1 error generated. `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1) rts/fs.c:143:28: error: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes] 143 | int setErrNoFromWin32Error () { | ^ | void | 143 | int setErrNoFromWin32Error () { | ^ 1 error generated. `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1) rts/win32/ConsoleHandler.c:227:9: error: error: call to undeclared function 'interruptIOManagerEvent'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 227 | interruptIOManagerEvent (); | ^ | 227 | interruptIOManagerEvent (); | ^ rts/win32/ConsoleHandler.c:227:9: error: note: did you mean 'getIOManagerEvent'? | 227 | interruptIOManagerEvent (); | ^ rts/include/rts/IOInterface.h:27:10: error: note: 'getIOManagerEvent' declared here 27 | void * getIOManagerEvent (void); | ^ | 27 | void * getIOManagerEvent (void); | ^ 1 error generated. `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1) rts/win32/ConsoleHandler.c:196:9: error: error: call to undeclared function 'setThreadLabel'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 196 | setThreadLabel(cap, t, "signal handler thread"); | ^ | 196 | setThreadLabel(cap, t, "signal handler thread"); | ^ rts/win32/ConsoleHandler.c:196:9: error: note: did you mean 'postThreadLabel'? | 196 | setThreadLabel(cap, t, "signal handler thread"); | ^ rts/eventlog/EventLog.h:118:6: error: note: 'postThreadLabel' declared here 118 | void postThreadLabel(Capability *cap, | ^ | 118 | void postThreadLabel(Capability *cap, | ^ 1 error generated. `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1) ```
-
Non-moving segments are 8 blocks long and need to be aligned. Previously we serviced allocations by grabbing 15 blocks, finding an aligned 8 block group in it and returning the rest. This proved to lead to high levels of fragmentation as a de-allocating a segment caused an 8 block gap to form, and this could not be reused for allocation. This patch introduces a segment allocator based around using entire megablocks to service segment allocations in bulk. When there are no free segments, we grab an entire megablock and fill it with aligned segments. As the megablock is free, we can easily guarantee alignment. Any unused segments are placed on a free list. It only makes sense to free segments in bulk when all of the segments in a megablock are freeable. After sweeping, we grab the free list, sort it, and find all groups of segments where they cover the megablock and free them. This introduces a period of time when free segments are not available to the mutator, but the risk that this would lead to excessive allocation is low. Right after sweep, we should have an abundance of partially full segments, and this pruning step is relatively quick. In implementing this we drop the logic that kept NONMOVING_MAX_FREE segments on the free list. We also introduce an eventlog event to log the amount of pruned/retained free segments. See Note [Segment allocation strategy] Resolves #24150 ------------------------- Metric Decrease: T13253 T19695 -------------------------
-
This commit fixes a small an oversight in !12148: the prefetch logic in non-moving GC may trap in debug RTS because it calls Bdescr() for mark_closure which may be a static one. It's fine in non-debug RTS because even invalid bdescr addresses are prefetched, they will not cause segfaults, so this commit implements the most straightforward fix: don't prefetch mark_closure bdescr when assertions are enabled.
-
- May 10, 2024
-
-
Instead of encoding the closure type as decimal string we now simply represent it as an integer, eliminating the need for `Read` in `GHC.Internal.InfoProv.Types.peekInfoProv`. Closes #24504. ------------------------- Metric Decrease: T24602_perf_size size_hello_artifact -------------------------
-
- May 02, 2024
-
-
When a breakpoint is inlined, its context may change (e.g. tyvars in scope). We must take this into account and not used the breakpoint tick index as its sole identifier. Each instance of a breakpoint (even with the same tick index) now gets a different "info" index. We also need to distinguish modules: - tick module: module with the break array (tick counters, status, etc.) - info module: module having the CgBreakInfo (info at occurrence site)
-
* Don't lock tvars when performing non-committal validation. * If we encounter a locked tvar don't consider it a failure. This means in-flight validation will only fail if committing at the moment of validation is *guaranteed* to fail. This prevents in-flight validation from failing spuriously if it happens in parallel on multiple threads or parallel to thread comitting.
-
The STM code had a coarse grained locking mode guarded by #defines that was unused. This commit removes the code.
-
- Apr 21, 2024
-
-
Serge S. Gulin authored
These errors were fixed just by introducing stubbed functions with throw for further implementation.
-
Serge S. Gulin authored
code copied from GHCJS (fixes #24602) I've just copied some old pieces of GHCJS from publicly available sources (See https://github.com/Taneb/shims/blob/a6dd0202dcdb86ad63201495b8b5d9763483eb35/src/io.js#L607). Also I didn't put details to h$fds. I took minimal and left only its object initialization: `var h$fds = {};`
-
Serge S. Gulin authored
-
Serge S. Gulin authored
You may noted that I've also changed term of ``` , global "h$vt_double" ||= toJExpr IntV ``` See "IntV" and ``` WaitReadOp -> \[] [fd] -> pure $ PRPrimCall $ returnS (app "h$waidRead" [fd]) ``` See "h$waidRead"
-
- Apr 17, 2024
-
-
While the RTS does attempt to mask signals, it may be that a foreign library unmasks them. This previously caused benign warnings which we now ignore. See #24610.
-
- Apr 12, 2024
-
-
It is sometimes more useful to know how much bigger or smaller the nursery got when it is resized. In particular I am trying to investigate situations where we end up with fragmentation due to the nursery (#24577)
-
Fixes #23680.
-
Fixes #24487
-
- Apr 10, 2024
-
-
Rewrite the implementation of `addDLL` as a wrapper around the more principled `loadNativeObj` rts linker function. The latter should be preferred while the former is preserved for backwards compatibility. `loadNativeObj` was previously only available on ELF platforms, so this commit further refactors the rts linker to transform loadNativeObj_ELF into loadNativeObj_POSIX, which is available in ELF and MachO platforms. The refactor made it possible to remove the `dl_mutex` mutex in favour of always using `linker_mutex` (rather than a combination of both). Lastly, we implement `loadNativeObj` for Windows too.
-
See the primary Note [Looking up symbols in the relevant objects] for a more in-depth explanation. When dynamically loading a Haskell symbol (typical when running a splice or GHCi expression), before this commit we would search for the symbol in all dynamic libraries that were loaded. However, this could be very inefficient when too many packages are loaded (which can happen if there are many package dependencies) because the time to lookup the would be linear in the number of packages loaded. This commit drastically improves symbol loading performance by introducing a mapping from units to the handles of corresponding loaded dlls. These handles are returned by dlopen when we load a dll, and can then be used to look up in a specific dynamic library. Looking up a given Name is now much more precise because we can get lookup its unit in the mapping and lookup the symbol solely in the handles of the dynamic libraries loaded for that unit. In one measurement, the wait time before the expression was executed went from +-38 seconds down to +-2s. This commit also includes Note [Symbols may not be found in pkgs_loaded], explaining the fallback to the old behaviour in case no dll can be found in the unit mapping for a given Name. Fixes #23415 Co-authored-by:
Rodrigo Mesquita <(@alt-romes)>
-
Fixes a memory leak in rts/linker/PEi386.c
-
- Apr 03, 2024
-
-
Some GCC versions don't know about some warnings, and they complain that we're ignoring unknown warnings. So we try to ignore the warning based on the GCC version.
-
Previously it was unclear that they did not work because the code path was shared with other I/O managers (in particular select()). Following the code carefully shows that what actually happens is that the calling thread would block forever: the thread will be put into the blocked queue, but no other action is scheduled that will ever result in it getting unblocked. It's better to just fail loudly in case anyone accidentally calls it, also it's less confusing code.
-
Document the extra +RTS --info output in the user guide
-
Using the new tracer class. Note: The unconditional definition of showIOManager should be compatible with the debugTrace change in 7c7d1f66. Co-authored-by:
Pi Delport <pi@well-typed.com>
-
It's just to make sure an exception CAF is a GC root.
-
Provide an opaque (forward) definition in Capability.h (since the cap contains a *CapIOManager) and then only provide a full definition in a new file IOManagerInternals.h. This new file is only supposed to be included by the IOManager implementation, not by its users. So that means IOManager.c and individual I/O manager implementations. The posix/Signals.c still needs direct access, but that should be eliminated. Anything that needs direct access either needs to be clearly part of an I/O manager (e.g. the sleect() one) or go via a proper API.
-
We need to select the I/O manager to use during startup before the per-cap I/O manager initialisation.
-
Used in setNumCapabilities. It only does anything for MIO on Posix. Previously it always invoked Haskell code, but that code only did anything on non-Windows (and non-JS), and only threaded. That currently effectively means the MIO I/O manager on Posix. So now it only invokes it for the MIO Posix case.
-
When the GC scavenges a TSO it needs to scavenge the tso->blocked_info but the blocked_info is a big union and what lives there depends on the two->why_blocked, which for I/O-related reasons is something that in principle is the responsibility of the I/O manager and not the GC. So the right thing to do is for the GC to ask the I/O manager to sscavenge the blocked_info if it encounters any I/O-related why_blocked reasons. So we add scavengeTSOIOManager in IOManager.{h,c} with the usual style. Now as it happens, right now, there is no special scavenging to do, so the implementation of scavengeTSOIOManager is a fancy no-op. That's because the select I/O manager uses only the fd and target members, which are not GC pointers, and the win32-legacy I/O manager _ought_ to be using GC-managed heap objects for the StgAsyncIOResult but it is actually usingthe C heap, so again no GC pointers. If the win32-legacy were doing this more sensibly, then scavengeTSOIOManager would be the right place to do the GC magic. Future I/O managers will need GC heap objects in the tso->blocked_info and will make use of this functionality.
-
Use the standard #include {Begin,End}Private.h style rather than RTS_PRIVATE on individual decls. And conditionally build the code for the select I/O manager based on the new CPP IOMGR_ENABLED_SELECT rather than on THREADED_RTS.
-
These are now just called from IOManager.c and are the per-I/O manager backend impls (whereas previously awaitEvent was the entry point). Follow the new naming convention in the IOManager.{h,c} of awaitCompletedTimeoutsOrIO, with the I/O manager's name as a suffix: so awaitCompletedTimeoutsOrIO{Select,Win32}.
-
and have the scheduler use it. Previously the scheduler calls awaitEvent directly, and awaitEvent is implemented directly in the RTS I/O managers (select, win32). This relies on the old scheme where there's a single active I/O manager for each platform and RTS way. We want to move that to go via an API in IOManager.{h,c} which can then call out to the active I/O manager. Also take the opportunity to split awaitEvent into two. The existing awaitEvent has a bool wait parameter, to say if the call should be blocking or non-blocking. We split this into two separate functions: pollCompletedTimeoutsOrIO and awaitCompletedTimeoutsOrIO. We split them for a few reasons: they have different post-conditions (specifically the await version is supposed to guarantee that there are threads runnable when it completes). Secondly, it is also anticipated that in future I/O managers the implementations of the two cases will be simpler if they are separated.
-
rather than directly operating on the IO manager's data structures. Specifically, when thowing an async exception to a thread that is blocked waiting for I/O or waiting for a timer, then we want to cancel that I/O waiting or cancel the timer. Currently this is done directly in removeFromQueues() in RaiseAsync.c. We want it to go via proper APIs both for modularity but also to let us support multiple I/O managers. So add sync{IO,Delay}Cancel, which is the cancellation for the corresponding sync{IO,Delay}. The implementations of these use the usual "switch (iomgr_type)" style.
-
It makes sense now for it to be separate from the scheduler class of tracers. Enabled with +RTS -Do. Document the -Do debug flag in the user guide.
-
We have lots of functions with conditional implementations for different I/O managers. Some functions, for some I/O managers, naturally have implementations that do nothing or barf. When only one such I/O manager is enabled then the whole function implementation will have an implementation that does nothing or barfs. This then results in warnings from gcc that parameters are unused, or that the function should be marked with attribute noreturn (since barf does not return). The USED_IF_THREADS trick for fine-grained warning supression is fine for just two cases, but an equivalent here would need USED_IF_THE_ONLY_ENABLED_IOMGR_IS_X_OR_Y which would have combinitorial blowup. So we take a coarse grained approach and simply disable these two warnings for the whole file. So we use a GCC pragma, with its handy push/pop support: #pragma GCC diagnostic push #pragma GCC diagnostic ignored "-Wsuggest-attribute=noreturn" #pragma GCC diagnostic ignored "-Wunused-parameter" ... #pragma GCC diagnostic pop
-
The implementation is eventually going to need to use more private things, which will drag in unwanted includes into IOManager.h, so it's better to move the impl out of the header file and into the .c file, at the slight cost of it no longer being inline. At the same time, change to the "switch (iomgr_type)" style.
-
No longer defined in IOManager.h, just a private function in IOManager.c. Since it is no longer called from cmm code, just from syncDelay. It ought to get moved further into the select() I/O manager impl, rather than living in IOManager.c. On the other hand appendToIOBlockedQueue is still called from cmm code in the win32-legacy I/O manager primops async{Read,Write}#, and it is also used by the select() I/O manager. Update the CPP and comments to reflect this.
-
Moves it into the IOManager.c where we can follow the new pattern of switching on the selected I/O manager. Uses a new IOManager API: syncDelay, following the naming convention of sync* for thread-synchronous I/O & timer/delay operations. As part of porting from cmm to C, we maintain the rule that the why_blocked gets accessed using load acquire and store release atomic memory operations. There was one exception to this rule: in the delay# primop cmm code on posix (not win32), the why_blocked was being updated using a store relaxed, not a store release. I've no idea why. In this convesion I'm playing it safe here and using store release consistently.
-